Reorganizator
A Simple Yet Effective Alternative to Desktop (Re)Organization
Introduction
Before web-based search, there was little or no way to competently navigate the internet. There were millions of virtually inaccessible bits of information scattered across the web on servers, hiding in plain sight. Then Google introduced a seemingly simple web application that did the impossible: it found you what you were looking for, all of the time. Since then, search has become the most important web-based feature ever and the biggest business on the web, using a text-based advertising revenue model to pull in billions for search giants such as Google and Yahoo.
A search engine works by constantly scouring the web using robots - automated "web crawlers" - that traverse link after link after link, and build an index of every page they come across. The results of these web crawls are used by an algorithm to rank each page based on pages that link to it, and their respective ranks.[1] It works very well.
In this system, the actual physical location of the results is irrelevant. The first two results for any given query may well contain links to pages that (literally) reside on web servers on different sides of the earth. For the user running the query, though, they are right next to one another. They may as well be running on a single server sitting in the living room.
A user conducting a web-based search might not know about the material the query will return before it is returned. More importantly, they don't need to. In this sense, it is a purely contextual search, with the best contextual match hopefully returned first. This is analogous to walking into a department store with a general idea of what to purchase, and browsing until the closest match is found.
Desktop search, on the other hand, is different. A user executing a desktop search is more likely than not looking for something they know they already have. Instead of having a general idea, the user is querying for a specific result. Contextual clues are only secondary at this point, and may in fact be a hindrance in the search, leading to "false positives." A desktop search is like walking into a messy bedroom and looking for a particular CD. It's not something that may or may not exist, and not something that can exist in many places in different ways. There are no implicit contextual clues that make it the best result. It is explicitly the desired object; no other results matter.
A Working Case
Let's examine a possible approach to the aforementioned bedroom CD search, one which will maximize our search effort by combining the computer's most efficient attributes with our own problem-solving abilities. We will implement an example of "cognitive divide and conquer".
I walk into a bedroom with clothes, books, CDs, DVDs, magazines, and miscellaneous objects strewn about, all clustered in small aggregations here and there. I'm looking for a Radiohead CD (for the sake of this example) – I can't quite remember the name, but if I saw it I'd know it. I haven't listened to it in a good long while, nor do I recall the last time I saw it.
Luckily for me, I have a magic way to reorganize the entire room using certain criteria, simply by stating the criteria aloud.
"Type."
No sooner have I completed the command than everything is grouped into discrete aggregations by type throughout the room. I walk over to the stack of CDs, a stack that now contains every CD in the room.
"Name."
Quickly the pile of CDs is broken up into alphabetical groups.
"Date."
Finally, I walk over to the 'R' group, noting that the stack is sorted in reverse chronological order. I tap the stack and the order reverses. Voila! It's the CD I'm looking for.
Imagine you could really do that with your physical belongings. Wouldn't life be grand? Well, in the digital world, life is grand. Everything that appears to have been magical is an implementation of a file explorer application that allows for this type of dynamic, on-the-fly reorganizing.
In the bedroom example above, the bedroom is analogous to a directory in a file system. Within this directory are all the subdirectories and files that make up every piece of content in the room. In a perfect world, or more aptly, on a perfect hard drive, everything would already be organized into perfectly discrete directories and subdirectories. More likely in reality, though, is an ad hoc organizational structure exhibiting a certain degree of randomness and corruptness. A messy bedroom.
The beauty of the method applied above is that the underlying organizational structure is completely unimportant, and also untouched. Every artifact in the domain will be reorganized on the fly in a meaningful way.
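The three spoken commands can be modeled in a few lines of Java. This is only a sketch of the idea, not the sample application's code: the `BedroomSearch` and `Item` names, and the use of a year as the "date", are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical model of the bedroom example; none of these names come from the sample application.
final class BedroomSearch {
    private BedroomSearch() {}

    /** One object in the room: its type ("CD", "Book", ...), its name, and a year standing in for its date. */
    static final class Item {
        final String type;
        final String name;
        final int year;

        Item(String type, String name, int year) {
            this.type = type;
            this.name = name;
            this.year = year;
        }
    }

    /** "Type." then "Name.": walk to the pile holding one type of item whose name starts with the given letter. */
    static List<Item> pile(List<Item> room, String type, char initial) {
        char wanted = Character.toUpperCase(initial);
        return room.stream()
                .filter(item -> item.type.equals(type))
                .filter(item -> Character.toUpperCase(item.name.charAt(0)) == wanted)
                .collect(Collectors.toList());
    }

    /** "Date.": order a copy of the pile by year; flipping newestFirst models tapping the stack. */
    static List<Item> byDate(List<Item> pile, boolean newestFirst) {
        Comparator<Item> oldestFirst = Comparator.comparingInt(item -> item.year);
        List<Item> sorted = new ArrayList<>(pile);
        sorted.sort(newestFirst ? oldestFirst.reversed() : oldestFirst);
        return sorted;
    }
}
```

Note that both methods return new lists and never modify the room itself, which mirrors the point above: the underlying organization stays untouched.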
How It Works
Figures 1 and 2 are screen shots of a small sample application created to demonstrate this idea. The implementation was done in Java.
A hard drive is a tree structure consisting of directories and the individual files within them. Like all tree structures, there is a root and there are leaves. A leaf is defined as a node on the tree without any nodes below it. The only concerns in this process are leaves, since each leaf represents a file that could be the search target.
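Gathering the leaves might look like the following sketch, which walks a directory tree with the standard `java.nio.file.Files.walk` call and keeps only regular files. The `LeafCollector` class name is hypothetical, and treating only regular files as leaves (so that empty directories are skipped) is an assumption, not necessarily what the sample application does.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; not taken from the sample application.
final class LeafCollector {
    private LeafCollector() {}

    /** Returns every regular file below root, regardless of how deeply it is nested. */
    static List<Path> collectLeaves(Path root) throws IOException {
        // Files.walk visits the whole subtree lazily; the stream must be closed when done.
        try (Stream<Path> walk = Files.walk(root)) {
            return walk.filter(Files::isRegularFile) // directories are of no concern
                       .sorted()
                       .collect(Collectors.toList());
        }
    }
}
```

Calling `LeafCollector.collectLeaves(Paths.get("some/folder"))` would then yield the flat list of candidate files, where the folder path is a placeholder.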
Directories are of no concern, so they are eliminated initially in favor of a single bin that contains every file. From this bin, virtual subdirectories are created that will serve to aggregate the files in meaningful ways. These subdirectories are analogous to the piles in the earlier bedroom example. For each virtual subdirectory, repeat the process, creating further virtual subdirectories using different organizational criteria – thus rebuilding a new tree structure in a meaningful way to work best for the specific semantic elements of the target.
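The bin-and-regroup process described above can be sketched as a small in-memory tree. The `VirtualDirectory` class and the `byExtension` and `byFirstLetter` criteria below are hypothetical names chosen for illustration; the real application may structure this differently.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical sketch of a virtual directory: a named view over files that never moves anything on disk.
final class VirtualDirectory {
    final String name;
    final List<Path> files;                                            // every file visible at this level
    final Map<String, VirtualDirectory> children = new TreeMap<>();    // virtual subdirectories, sorted by label

    VirtualDirectory(String name, List<Path> files) {
        this.name = name;
        this.files = new ArrayList<>(files);
    }

    /** Rebuilds the virtual subdirectories of this node by grouping its files on the given criterion. */
    void reorganize(Function<Path, String> criterion) {
        children.clear();
        Map<String, List<Path>> groups = new TreeMap<>();
        for (Path file : files) {
            groups.computeIfAbsent(criterion.apply(file), label -> new ArrayList<>()).add(file);
        }
        groups.forEach((label, members) -> children.put(label, new VirtualDirectory(label, members)));
    }

    /** Criterion: group by lower-cased file extension. */
    static String byExtension(Path file) {
        String fileName = file.getFileName().toString();
        int dot = fileName.lastIndexOf('.');
        boolean hasExtension = dot > 0 && dot < fileName.length() - 1;
        return hasExtension ? fileName.substring(dot + 1).toLowerCase(Locale.ROOT) : "(no extension)";
    }

    /** Criterion: group by the upper-cased first letter of the file name. */
    static String byFirstLetter(Path file) {
        return file.getFileName().toString().substring(0, 1).toUpperCase(Locale.ROOT);
    }
}
```

Because each child is itself a `VirtualDirectory`, calling `reorganize` on a child with a different criterion repeats the process one level down, which is all the recursion the approach needs.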
Figure 1
In Figure 2, notice a bunch of subdirectories under a root directory called Reorganizator. Not one of these directories actually exists on the hard drive itself. A directory full of files and other subdirectories full of files was opened by the application, under the static virtual root Reorganizator. Using right-click options, this mess of files was first reorganized by file type, creating the virtual directories Documents, Text Files, PDFs, and so on. Then Documents was further reorganized by date last modified, creating the virtual directories 1 day ago, 3 weeks ago, and so on. Finally, the More than 10 weeks ago directory was reorganized alphabetically, creating a virtual subdirectory for each letter that begins the name of a file it contains.
Figure 2
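The date criterion behind labels such as 1 day ago, 3 weeks ago, and More than 10 weeks ago could be computed roughly as follows. The exact thresholds, the "Today" label, and the `AgeBuckets` name are assumptions; the article only shows a few of the resulting directory names.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical labeling rule; the sample application's actual cutoffs are not documented.
final class AgeBuckets {
    private AgeBuckets() {}

    /** Maps a last-modified time to a human-readable age label relative to now. */
    static String label(Instant lastModified, Instant now) {
        // Clamp to zero so files with timestamps in the future still land in "Today".
        long days = Math.max(0, Duration.between(lastModified, now).toDays());
        if (days == 0) {
            return "Today";
        }
        if (days < 7) {
            return days + (days == 1 ? " day ago" : " days ago");
        }
        long weeks = days / 7;
        if (weeks <= 10) {
            return weeks + (weeks == 1 ? " week ago" : " weeks ago");
        }
        return "More than 10 weeks ago";
    }
}
```

For a real file, the first argument could come from `Files.getLastModifiedTime(path).toInstant()`, and the resulting label can serve as the name of the virtual directory the file is placed in.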
The application otherwise behaves very similarly to any other mainstream file explorer, such as Windows Explorer or Nautilus. Folder contents are shown in the list view pane on the right, and each file can be opened by double-clicking it. The user can open any file in any virtual directory, control the directory structure in a meaningful and easy-to-use manner, and open, close and delete files without ever actually needing to know where the files exist in the underlying static directory structure.
Conclusion
Although this sample application is limited in its functionality, a full-blown version could offer the user not only the ability to reorganize files on the fly, but also the ability to dynamically modify the criteria, naming conventions, and general behavior via an intuitive interface designed to present the user with every option available to the system as potential organizational criteria.
Desktop search has a special property that lends itself to active user interaction rather than passive user search. That property is that the user often has a better idea of the desired result than the engine does. The user is also able to easily make semantic links that are very difficult for a computer algorithm.
For example, I may be looking for a magazine somewhere in my room that contains an article I'd like to reread; I'm just not quite sure which magazine it's in. I have a feeling, though, that I'll know it when I see it. Some part of the cover will be reminiscent, maybe a particular color on the bottom right corner, something. In this case, I know exactly what I'm looking for, although not exactly how to look for it. This is where the active approach pulls ahead, maximizing the search by allowing the human mind to do what it does far better than a computer: think. I can easily decide that the best thing to do is put all the magazines together and let my intellect determine what to do next. I know it's been about a month since I read that article, so let's see what magazines I read a month ago. And so on…
This is one simple attempt to find a middle ground between the creative and keen inference abilities of the human mind, and the speed and autonomous power of a machine.
References