Ultimately, do object storage plays displace file systems or are they absorbed?

April 25, 2012

NAB kept me totally away from all the interesting online discussions last week. It’s too late to respond to @JoinToigo’s tweet (we’d call this Figs after Easter in Dutch), but I thought I’d share my thoughts in a bit more than 140 characters.

The short answer is no … but a better answer is very much *yes*.

The first file systems were not designed with the thought of petabytes of data. I don’t know what the exact projections were back then, but gigabytes must have sounded pretty sci-fi. Bytes and kilobytes were a lot more common. We didn’t think that we’d soon all be creating tens if not hundreds of multi-megabyte files per day.

File systems have of course evolved a lot and some have become so popular you could actually say they have a fan base (I’d need to do research on ZFS fan clubs). It is clear that the file system has played a very important role in the evolution of the computer industry. In my list of features that helped to make computers a commodity, the file system would probably be in the top three (with the windows-style GUI and the mouse). The file system enables the use of directories, which have been the most important tool to keep our data organized.

But as Robin Harris says, “entropy refers to the inherent tendency for any organized system to disorder”. With the amounts of data we are dealing with today, we have to put a lot of energy into keeping our data organized. We have come to a point where our directories are not that organized anymore because we simply have too much data. That doesn’t matter all that much, though, since there are so many applications out there (and a lot more coming) that can do the organizing for us.

Take Google Docs for example. Docs lets you star and share your documents, and organize them in collections. And no matter how you organize your stuff, Docs will find it for you. Docs has a great search function (it’s Google after all) that is a lot more powerful than the search in Windows Explorer or OS X’s Finder (although Spotlight is actually pretty good). Picasa and iTunes are just two more examples of applications that help us keep our data organized with hardly any role for the file system. Eventually the applications will make the file system obsolete. Many of the applications we are using today are cloud based and run on object storage with no file system involved; the application just masks the lack of a file system.

For businesses the situation is the same but different. Applications in the cloud are increasingly popular, so a lot of business data is already stored in a public or private object store. But a lot of business applications simply need a file system interface. For now, that is. If the current data growth continues, a lot of file systems will hit their scalability limits. This is where object storage will play a very important role, as object storage platforms (at least the good ones) have been designed to scale out big.

One interesting example is the media and entertainment industry. If there is one industry where data is big, it’s there: think of 4K and soon 8K movies. Movies have become multi-petabyte projects (tens of petabytes). Companies in this industry understand they need more efficient storage and tape is no longer an option. All major studios are running object storage projects right now (mostly with file systems on top). This frees them from worrying about “how many files fit into a directory before it slows down”, “how many directories can I have” and “how deep can my file system tree become”, especially as it relates to access performance.

So, expect object storage systems to become more and more popular. For as long as needed, object storage will be implemented with some file system gateway on top, but eventually, when the applications are ready, we will see fewer and fewer file systems. It just makes more sense to have the application talk directly to the storage. REST makes it all very simple. And fast. And economically feasible.
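To give an idea of what “talking directly to the storage” can look like, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the tiny in-memory server stands in for a real object store, and the names `ObjectHandler`, `start_store`, `put_object` and `get_object` are hypothetical, not any vendor’s API. The point is only that an object is stored and fetched by key with plain HTTP PUT and GET, and that a key such as `movies/reel-01.mov` is just a string, not a directory path.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for an object store: a flat map from key to bytes.
OBJECTS = {}


class ObjectHandler(BaseHTTPRequestHandler):
    """Toy REST endpoint: PUT stores an object, GET returns it."""

    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        OBJECTS[self.path] = self.rfile.read(length)
        self.send_response(201)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        body = OBJECTS.get(self.path)
        if body is None:
            self.send_response(404)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def start_store():
    """Start the toy store on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), ObjectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Ignore any proxy settings so requests go straight to the local demo server.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def put_object(base_url, key, data):
    """Application side: store bytes under a key with a single HTTP PUT."""
    request = urllib.request.Request(f"{base_url}/{key}", data=data, method="PUT")
    with _opener.open(request) as response:
        return response.status


def get_object(base_url, key):
    """Application side: fetch the bytes for a key with a single HTTP GET."""
    with _opener.open(f"{base_url}/{key}") as response:
        return response.read()


if __name__ == "__main__":
    store = start_store()
    base = f"http://127.0.0.1:{store.server_address[1]}"
    put_object(base, "movies/reel-01.mov", b"frames")
    print(get_object(base, "movies/reel-01.mov"))
    store.shutdown()
```

A real object store adds authentication, metadata and replication on top of this, but the application-facing part stays about this small: no mounts, no directory tree, just keys and HTTP verbs.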

And now, anticipating the next question: shouldn’t there be some standard REST API? I used to strongly believe so. But while doing research for this piece, I stumbled across Wikipedia’s list of file systems. Compared with that list, hardly a dozen object storage REST APIs on the market is not all that bad in my opinion. Still, I believe object storage vendors all agree standardization is good. It’s just a matter of waiting to see which API will eventually become the most popular with the applications that use it.