Fixing, or killing, the filesystem
The filesystem sucks. It’s too complicated for the “average user,” and it’s too complicated for me: I have crap all over my computer and not a clue where, or what, much of it is.
Actually, I think it turns out that I don’t use it, or need it, as much as I think I do. Let’s see, here are the apps in my Dock. Which ones use the filesystem as part of their “user experience”?
- Finder: Duh; but this would go if the filesystem didn’t exist, obviously.
- Calendar/iCal: Nope.
- Mail: Nope.
- MarsEdit: Nope.
- Reeder: Nope.
- Tweetbot: Nope.
- Chrome: Nope — except for somewhere to store downloads; but we’ll see about that.
- Messages: Nope.
- Textual: Nope — except for somewhere to store logs; but I don’t mind using some kind of built-in log viewer for that, as long as it has decent search.
- Transmit: Mmm, yes.
- iTunes: Nope.
- Capo: Mmm. Not really. This is actually the first “document-based” app in the list.
- Sketch: Not to any extent greater than Capo does; except, exporting SVG/EPS/etc is an important feature.
- BBEdit: Kinda, maybe. Perhaps I wouldn’t mind only using its “project” functionality, which I currently don’t use at all. (I use “instaprojects” as a kind of super disk browser.) But it still needs to be able to open any “file” on disk.
- Terminal: Yes! The filesystem is one of the most important parts of UNIX — but this needn’t be sacrificed in taking the filesystem away as a user-facing interface aspect.
- System Preferences: Nope.
Where I’ve said “Nope,” I’m referring to applications which could work just as well with some kind of futuristic soupy object database as their storage; I don’t care how much they use the filesystem under-the-hood in their current implementation.
Important properties
What are the important properties of a data-management interface that could replace the filesystem?
- Organisation of items into categories somehow. Right now we do this with “folders”; but Gmail uses “labels” instead. An equivalent dichotomy on the web is the difference between “categories” and “tags” on blog posts: a post can have any number of tags but, in theory, only one category. Messages in Gmail can have any number of labels, but files on computers can only be in one folder at once. (Symlinks &c. aside, but they’re just confusing.)
- A document created or opened in one application should be able to be opened by other applications. If I make an SVG document in Sketch, I need to be able to upload it with Transmit, or go edit it in Inkscape instead, or whatever.
- Documents and applications are fundamentally separate: the thing you browse documents with shouldn’t also be the thing you browse applications with.
- Search is the primary interface. On the Web, there’s no way to look at everything at once: you have to look through a search engine. But “search” is a broad term: looking at all the documents which have a certain tag is a kind of search, not just “type search terms / see results”. Apple’s search interfaces all suck, but they’ve been getting better. Google is not the right model for this kind of search; but neither are crazy interfaces like this one. The search has to be fast, too.
- It has to be accessible in a moderately-sensible way through the Terminal. This might not matter to the “average user”, but it sure matters to me.
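To make the folders-versus-labels point concrete, here is a rough sketch in Python of a store where an item can carry any number of labels, and where "show me everything with these labels" is just a kind of search. The `LabelStore` class and its method names are made up for illustration; nothing here is a real API.

```python
from collections import defaultdict


class LabelStore:
    """Hypothetical in-memory store: items carry any number of labels."""

    def __init__(self):
        self._by_label = defaultdict(set)  # label -> ids of items carrying it
        self._labels_of = {}               # item id -> set of its labels

    def add(self, item_id, labels):
        """Add an item, or replace the labels of an existing one."""
        for old in self._labels_of.get(item_id, set()):
            self._by_label[old].discard(item_id)
        self._labels_of[item_id] = set(labels)
        for label in labels:
            self._by_label[label].add(item_id)

    def with_labels(self, *labels):
        """Return the ids of items carrying every one of the given labels."""
        if not labels:
            return set(self._labels_of)
        sets = [self._by_label.get(label, set()) for label in labels]
        return set.intersection(*sets)


store = LabelStore()
store.add("logo.svg", ["design", "work"])
store.add("invoice", ["work"])
print(store.with_labels("work", "design"))
```

A folder hierarchy is the special case where every item has exactly one label; dropping that restriction is the whole difference.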
Beyond this, there are some more technical concerns:
- Get rid of file-name extensions. They were a mistake, and they’ve propagated through computing history like nothing else. Store file types as MIME types or UTIs or whatever. And it should be easy to change the “type” of a file as stored on disk, because otherwise you get incompatibilities and other fun stuff. One nice thing about using MIME types is that you can store stuff like the charset of text documents in parameters, and you can use them as the types for HTTP uploads, too. (Extensions will still creep in, of course, through things like downloads from the Web and the potential need to add them before uploading files to a server.)
- A ‘document’ should be able to have multiple ‘aspects’ to it. On the old Mac these were the data fork and the resource fork, but now we usually use ‘bundles’ which can be stored in tarballs etc. and sent across the lowest-common-denominator internet protocols without losing the resource data.
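As a sketch of the MIME-type idea, assuming the type is stored as a single string alongside the document: Python's standard `email.message.Message` can already pull such a string apart into a base type and its parameters. The `Document` class, its `media_type` field and the `retype` method are hypothetical names of mine, not part of any existing system.

```python
from dataclasses import dataclass
from email.message import Message


@dataclass
class Document:
    """Hypothetical document record: the type is metadata, not a file-name suffix."""
    name: str
    media_type: str  # e.g. "text/plain; charset=utf-8"

    def _header(self):
        # Reuse the stdlib's Content-Type parsing rather than writing our own.
        msg = Message()
        msg["Content-Type"] = self.media_type
        return msg

    @property
    def base_type(self):
        """The bare type, without parameters, e.g. "text/plain"."""
        return self._header().get_content_type()

    def param(self, key):
        """A parameter such as "charset", or None if it is not set."""
        return self._header().get_param(key)

    def retype(self, new_media_type):
        """Changing the stored type is a metadata edit; no renaming involved."""
        self.media_type = new_media_type


notes = Document("notes", "text/plain; charset=utf-8")
print(notes.base_type, notes.param("charset"))
notes.retype("text/markdown; charset=utf-8")
print(notes.base_type)
```

The same string could be sent as the `Content-Type` of an HTTP upload without any translation step.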
Filesystems vs. databases
The filesystem is really a database, and it can be used as a lightweight, fairly reliable one: it is well-deployed and, as a consequence, highly robust. But it wasn't designed to be used as a subset of SQLite, say. Any reforms to the filesystem would probably scale it up to be a more considerable database. Files having "aspects" would have to be handled in a relational way. Increased complexity means increased fragility, but starting from scratch also allows us to take into account modern requirements.
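One way to picture "aspects" handled relationally, as a minimal sketch only: one table of documents and one table of named aspects pointing back at them, here using Python's built-in `sqlite3` module. The table names, column names and helper functions are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (
        id         INTEGER PRIMARY KEY,
        title      TEXT NOT NULL,
        media_type TEXT NOT NULL
    );
    -- Each document has any number of named aspects (data, thumbnail, ...).
    CREATE TABLE aspects (
        document_id INTEGER NOT NULL REFERENCES documents(id),
        name        TEXT NOT NULL,
        data        BLOB NOT NULL,
        PRIMARY KEY (document_id, name)
    );
""")


def add_document(title, media_type, aspects):
    """Insert a document and its aspects; return the new document id."""
    cur = conn.execute(
        "INSERT INTO documents (title, media_type) VALUES (?, ?)",
        (title, media_type),
    )
    doc_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO aspects (document_id, name, data) VALUES (?, ?, ?)",
        [(doc_id, name, data) for name, data in aspects.items()],
    )
    return doc_id


def aspect_names(doc_id):
    """List the names of a document's aspects, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM aspects WHERE document_id = ? ORDER BY name",
        (doc_id,),
    )
    return [row[0] for row in rows]


logo = add_document("Logo", "image/svg+xml",
                    {"data": b"<svg/>", "thumbnail": b"\x89PNG"})
print(aspect_names(logo))
```

This is roughly what a data fork and a resource fork look like once they stop being special cases.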
What would a filesystem look like if you scaled it up from a cdb or Redis base, using CouchDB or other NoSQL techniques?
This gives me another idea: saved searches, somehow, in the way that CouchDB does them with views. When you make a new file or make a change to a file, it’s asynchronously indexed in each of your saved searches, if it matches them.
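Here is a toy version of that idea in Python, loosely in the spirit of CouchDB views rather than a copy of how CouchDB actually implements them. Writes only queue a document for indexing, and a separate pass (which a real system would presumably run in the background) updates every saved search. The `Store` class and all of its method names are hypothetical.

```python
class Store:
    """Hypothetical document store with incrementally indexed saved searches."""

    def __init__(self):
        self.docs = {}      # doc id -> metadata dict
        self.searches = {}  # search name -> predicate over metadata
        self.results = {}   # search name -> ids of matching documents
        self._pending = []  # doc ids changed since the last indexing pass

    def save_search(self, name, predicate):
        """Register a search and build its initial result set."""
        self.searches[name] = predicate
        self.results[name] = {
            doc_id for doc_id, meta in self.docs.items() if predicate(meta)
        }

    def put(self, doc_id, meta):
        """Create or update a document; indexing is deferred."""
        self.docs[doc_id] = meta
        self._pending.append(doc_id)

    def index_pending(self):
        """Re-check only the changed documents against each saved search."""
        for doc_id in self._pending:
            meta = self.docs[doc_id]
            for name, predicate in self.searches.items():
                if predicate(meta):
                    self.results[name].add(doc_id)
                else:
                    self.results[name].discard(doc_id)
        self._pending.clear()


store = Store()
store.save_search("svgs", lambda meta: meta.get("type") == "image/svg+xml")
store.put("logo", {"type": "image/svg+xml"})
store.index_pending()
print(store.results["svgs"])
```

The point of the design is that opening a saved search is a lookup, not a scan: the cost is paid a little at a time, whenever something changes.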
Summary
Some files need to be grouped to work with UNIX tools. This might be a largely tradition-led way of doing things, but there are still some obvious conveniences, and the design has lasted. But most files don't need to be grouped in any filesystem-like way. They can be stored with metadata and essentially "lost" to the user from a hierarchical perspective.
An experiment you can try at home: don't use the filesystem. Some laptops are predicated on this idea, of course, but their idea is that you use a walled cloud instead. But what if you don't save files at all? Is that possible? Experimental evidence suggests that it isn't, but that power users probably archive more than they need to because of the library problem. A library holds many books, but only a few are consulted. Since it's impossible to know in advance which books will be consulted, a library has to hold many books. Similarly, if you don't know which of your files you're going to need to look at in future, you had better save all of them. Because disk space is cheap, there's no reason why you shouldn't do so, except that it creates a bit of clutter. But then, as William Loughborough said, the clutter is inherent to the organism!