Storage is probably one of the most outdated technologies that is still essential to everything we do.  As everything else in the tech universe gets faster, more parallel, and cheaper, storage just gets bigger and cheaper.  The exception to this is SSD.  I long for the day when this new type of storage is viable enough to completely replace existing hard drives. Anyway, back to the point.

As a company deploying a private cloud to run our services, we treat capacity as a secondary concern.  We don’t need 50PB of storage space to host virtual machine images (well, not yet anyway).  A few TB gets us a good number of machines.  What we care about is performance.

From personal experience, I’ve found that we run out of IO performance long before we run out of space when it comes to hosting virtual machine images.  This is a huge problem for cloud architectures because the storage layer needs to be global and shared to take full advantage of the model.  We need a single storage pool that not only has the capacity to hold all our virtual machine instances, but most importantly, has the performance to run them all without negatively affecting any other virtual machine.

This means two things: 1) it must allow parallel access to the file system, and 2) it must be scalable!  When we add nodes to the pool, it must not only scale in capacity, but also in IO performance.

When I talk to storage vendors, they still don’t get it.  They are still thinking of storage in terms of size instead of performance.  They want to know how many files we need to store or how many TB/PB of space we require.  The questions these vendors should be asking are: how many machines do we need to host off their storage platform, and what IO performance do we expect?  I could easily fit 20 virtual machine images on the 2TB NAS they want to sell, but the disks/controllers just aren’t up to the task of running them (a 2TB NAS will choke and die long before the 20 machine mark, unless I’m willing to pay a ridiculous amount of money for it).  The problem is that performance has taken a back seat to capacity, and that needs to stop.
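To make the capacity-versus-performance gap concrete, here is a rough back-of-the-envelope sketch.  Every number in it (image size, array IOPS, per-machine IO demand) is a made-up assumption for illustration, not a measurement from our environment, and the vm_limits helper is just a name I picked for this example.

```python
def vm_limits(capacity_gb, total_iops, image_gb, iops_per_vm):
    """Return (VMs that fit by space, VMs that fit by IO, the binding limit)."""
    by_capacity = capacity_gb // image_gb
    by_iops = total_iops // iops_per_vm
    return by_capacity, by_iops, min(by_capacity, by_iops)


# Hypothetical figures, for illustration only:
by_cap, by_io, usable = vm_limits(
    capacity_gb=2000,  # a 2TB NAS
    total_iops=600,    # assumed aggregate random IOPS of the array
    image_gb=100,      # assumed size of one virtual machine image
    iops_per_vm=50,    # assumed steady IO demand of one running machine
)
print(f"fits by capacity: {by_cap}, fits by IO: {by_io}, usable: {usable}")
```

With these assumed inputs the box has room for 20 images but only enough IO for 12 running machines, so IO is the limit that actually matters.  Plug in your own numbers; the point is to ask both questions, not just the capacity one.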

To this end, I’ve been taking a long hard look at GlusterFS.  This open source file system, built to remove the IO bottleneck from supercomputing clusters, seems to be a very appropriate solution to the cloud storage issue.  It is a parallel, distributed file system that scales in both capacity and performance.

We will be deploying GlusterFS to store and host virtual machine images in our production environment this year.  I’ll let you know how it goes.
