
Storage is probably one of the most outdated technologies that is still essential to everything we do.  As everything else in the tech universe gets faster, more parallel, and cheaper, storage just gets bigger and cheaper.  The exception to this is SSD.  I long for the day when this new type of storage is viable enough to completely replace all existing hard drives... anyway, back to the point.

For a company like ours that is deploying a private cloud to run its services, capacity is really a secondary concern.  We don’t need 50PB of storage space to host virtual machine images (well, not yet anyway).  A few TB gets us a good number of machines.  What we care about is performance.

From personal experience, I’ve found that we run out of IO performance long before we run out of space when it comes to hosting virtual machine images.  This is a huge problem for cloud architectures because the storage layer needs to be global and shared to take full advantage of the model.  We need a single storage pool that not only has the capacity to hold all our virtual machine instances, but most importantly, has the performance to run them all without negatively affecting any other virtual machine.

This means two things: 1) it must allow parallel access to the file system, and 2) it must be scalable!  When we add nodes to the pool, it must not only scale in capacity, but also in IO performance.

When I talk to storage vendors, they still don’t get it.  They are still thinking of storage in terms of size instead of performance.  They want to know how many files we need to store or how many TB/PB of space we require.  The questions these vendors should be asking are: how many machines do we need to host on their storage platform, and what IO performance do we expect?  In terms of capacity, I could easily host 20 virtual machine images on the 2TB NAS they want to sell, but the disks/controllers just aren’t up to that task (a 2TB NAS will choke and die long before the 20 machine mark, unless I’m willing to pay a ridiculous amount of money for it).  The problem is that performance has taken a back seat to capacity, and that needs to stop.
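To put rough numbers on that, here is a back-of-the-envelope sketch in Python.  The figures are made up for illustration (four disks at roughly 80 IOPS each, 100GB images, 50 IOPS of demand per virtual machine); they are assumptions, not measurements from any real NAS.

```python
def hosting_limits(pool_capacity_gb, pool_iops, image_gb, iops_per_vm):
    """Return how many VMs fit by capacity, how many by IOPS, and which runs out first."""
    by_capacity = pool_capacity_gb // image_gb
    by_iops = pool_iops // iops_per_vm
    # On a tie, IO performance is reported as the limit.
    bottleneck = "capacity" if by_capacity < by_iops else "IO performance"
    return by_capacity, by_iops, bottleneck


# Hypothetical figures, for illustration only.
NAS_CAPACITY_GB = 2000  # the 2TB NAS
NAS_IOPS = 4 * 80       # assumed: four disks at roughly 80 IOPS each
IMAGE_GB = 100          # assumed image size per virtual machine
IOPS_PER_VM = 50        # assumed steady IO demand per virtual machine

by_capacity, by_iops, bottleneck = hosting_limits(
    NAS_CAPACITY_GB, NAS_IOPS, IMAGE_GB, IOPS_PER_VM
)
print(f"fits by capacity: {by_capacity} VMs")
print(f"fits by IOPS: {by_iops} VMs")
print(f"runs out first: {bottleneck}")
```

With these assumed numbers the box has room for 20 images but only enough IO for 6 machines, which is the gap a capacity-only sizing conversation never surfaces.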

To this end, I’ve been taking a long hard look at GlusterFS.  This open source file system, built to remove the IO bottleneck from supercomputing clusters, seems to be a very appropriate solution to the cloud storage issue.  It is a parallel, distributed file system that scales in both capacity and performance.

We will be deploying GlusterFS to store and host virtual machine images in our production environment this year.  I’ll let you know how it goes.

TCP Window Scaling

Recently, a small (but still far too large) number of our users have been having issues connecting to our websites.  This has been a very frustrating problem, as there was no pattern of location, browser, OS, ISP, or any other factor normally related to connection issues.

In the end, it turned out to be a problem with TCP window scaling.  If you don’t know what this is, don’t worry, it is very technical and I’m not going to go into details here ;)  Basically, this setting is turned on by default in all modern Linux/Unix kernels and makes your internet connection faster (when it works).  Unfortunately, there is equipment out there on the internet that does not handle TCP window scaling correctly, and if you are unlucky enough to have it between your computer and the website you are trying to connect to, you will experience intermittent issues accessing the site.

Now, this is all very well documented, and googling it will turn up a wealth of information about how to turn off TCP window scaling on your computer so you don’t have these problems anymore.  But what about the servers hosting these websites?  We can’t tell all our users to turn off TCP window scaling on their computers.  Shouldn’t there be something we can do on our end to prevent this problem from happening?  As it turns out, there is: turn off TCP window scaling and TCP timestamps on all our public-facing equipment.  Below are the commands to do that on Linux (RedHat flavors):

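# Run as root.  These take effect immediately but do not survive a reboot;
# add the same keys to /etc/sysctl.conf to make them permanent.
sysctl -w net.ipv4.tcp_window_scaling=0
sysctl -w net.ipv4.tcp_timestamps=0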

Turning off TCP timestamps is the part that is missing from the online information I found, and it is absolutely essential for fixing this issue on the server side (it’s not necessary on the client side).
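If you want to confirm that the change actually took, here is a small Python sketch that reads the two values back from /proc/sys/net/ipv4, which is where Linux exposes them.  The function names are my own for illustration, and it assumes a Linux host; anywhere else it simply reports the values as unreadable.

```python
from pathlib import Path

# The two kernel settings discussed in this post.
SETTINGS = ("tcp_window_scaling", "tcp_timestamps")


def read_tcp_settings(base="/proc/sys/net/ipv4"):
    """Return {setting: int}, with None for any value that cannot be read."""
    values = {}
    for name in SETTINGS:
        try:
            values[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None  # not Linux, or the file is not readable
    return values


def not_disabled(values):
    """List the settings that are not confirmed to be 0."""
    return [name for name, value in values.items() if value != 0]


if __name__ == "__main__":
    current = read_tcp_settings()
    for name, value in current.items():
        print(f"net.ipv4.{name} = {value}")
    leftover = not_disabled(current)
    if leftover:
        print(f"still enabled or unreadable: {leftover}")
    else:
        print("both settings are disabled")
```

Running it on each public-facing box after a reboot is a cheap way to catch a machine where the settings were only changed at runtime.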

Recently, I’ve been reading about Event Driven Architecture (EDA).  This is really exciting stuff and I’m convinced that it will be the future of the data center.

Combine this with virtualization and configuration management tools (like Chef) and EDA provides the mechanism for an intelligent architecture that is automated and flexible.  Imagine an infrastructure that can not only alert you when a machine fails, but also understand what the failure means and trigger the actions necessary to fix it.  The problem could be fixed automatically before the notification email is delivered to your inbox!  This is the first stepping stone to true artificial intelligence at the infrastructure level.  Just as event driven programming transformed software applications, EDA will transform the data center!

I’ve incorporated EDA into my vision for our infrastructure and determined the tools necessary to start building its foundation.  The first thing you need to build an EDA is a message bus that is accessible across the entire infrastructure.  RabbitMQ seems to be a great fit for this part of the EDA model.  It is a redundant, fault-tolerant, high-performance message queue.  It is built around the AMQP messaging protocol and is ideal for the system-wide messaging infrastructure that my vision requires.
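To make the idea concrete, here is a minimal in-process sketch of the publish/subscribe pattern in Python.  It is not RabbitMQ or AMQP; a plain dictionary stands in for the message bus, and the event name, host, and handlers are all hypothetical.

```python
from collections import defaultdict


class EventBus:
    """A toy, in-process stand-in for an infrastructure-wide message bus."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        """Register a handler to run whenever event_type is published."""
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        """Run every handler for event_type and collect what each one did."""
        return [handler(payload) for handler in self._handlers.get(event_type, [])]


# Hypothetical handlers: one tries to fix the problem, one tells a human.
def restart_service(event):
    return f"restarting {event['service']} on {event['host']}"


def notify_admin(event):
    return f"emailing admin about {event['host']}"


bus = EventBus()
bus.subscribe("service.down", restart_service)
bus.subscribe("service.down", notify_admin)

actions = bus.publish("service.down", {"host": "web01", "service": "httpd"})
for action in actions:
    print(action)
```

The point of the pattern is that the monitoring code publishing "service.down" knows nothing about the handlers; the fix and the notification are added or removed without touching it.  A real message bus adds the parts this sketch leaves out: delivery across machines, persistence, and redundancy.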

Once the message queue (or message bus) is in place, we can proceed to the next step in implementing our EDA infrastructure.  Stay tuned!

Last week I announced the launch of the Stocu.com closed beta and promised to post my invite link.

Well, promise kept!  Below is my invite link to the Stocu.com closed beta.  Just click it to sign up.

Hurry, only the first 8 people to sign up with this link will get in! (after that, I’m all out of invites)

Yesterday afternoon we launched Stocu.com in closed beta.

This is a new project that we (Sazze, Inc.) are incubating, and it has been in the works for a couple of months.

This is the first major new project to fully leverage the frameworks and infrastructure platforms that we have been developing over the past 2 years.  It was a complete success!  Normally, a project like this would have taken months to develop, but we went from concept to closed beta in 3 weeks!

For those of you who would like to know what Stocu.com is all about, here’s an official description from our copy experts:

Stocu.com started out as an idea for a “stock picking game for fame,” and quickly grew into the platform for a social network where users are able to predict where stocks will close in either a day or a week, and gain market insight from comments and predictions made by their fellow users.

If that sounds interesting to you, there are two options:

  1. Head over to Stocu.com and request an invite
  2. Keep checking out my blog (I will post an invite code later this week)

I hinted about it in the last post, and now it is here!

We are giving away a brand new 2010 Ford Focus!  It’s official, we are insane……insane about deals! :D

Anyway, go here to find out the rules and enter the giveaway:

Win a Brand New Car from dealspl.us!

dealspl.us has a brand new look!

We’ve been working hard for a while now on a complete redesign of dealspl.us and today is the magic day!  Head on over there and check it out.  We want to know what you think, so feel free to comment here, or, better yet, comment on dealspl.us – Looking Better than Ever

Also, there is going to be a really awesome giveaway in celebration of this redesign, so stay tuned…..

dealspl.us on your phone

So, it’s been about a year since my last post... shame on me.

I could give the standard excuses about being too busy, etc, etc, but I won’t.  Instead, I’m going to dust this blog off and breathe some more life into it with this announcement:

dealspl.us has launched a version of its site specifically for the mobile phone!  To check it out, point your mobile phone browser to: m.dealspl.us

This site is specifically tailored for the mobile phone screen (touch screens in particular).  You can view all the great deals and coupons on dealspl.us quickly and easily on your phone when you’re out and about.

We’ve even had reports from some of our users that retailers have accepted coupons shown on a phone in place of paper coupons!  This is really exciting as we’re always trying to improve the way you can save money and make better purchasing decisions.

So check it out.  You just might save some money ;)

Flast

I’m excited to announce the official release of Flast! (my very first open source project!)

For those of you who don’t know (everyone), Flast is an open source framework for PHP version 5.3.  It is focused on performance (fast) and removing restrictions on developers (flexible).

You can check out Flast here: http://sourceforge.net/projects/flast

Your feedback is much appreciated.

Here’s a little history about why I decided to create Flast…..

While working on DealsPlus and Sazze, we evaluated many PHP frameworks, but decided to create our own because they were all too slow and/or forced us to use a particular coding methodology (i.e. MVC).  After successfully creating a very useful framework that actually improves performance, I wanted to give back to the open source community by using my experience (and the awesome new features in PHP 5.3) to create a framework that gives the developer complete control over performance and functionality.

Right now, Flast is in a very early pre-Alpha phase, but it should be useful by the time a production-ready PHP 5.3 is released.

PHP Memcached Manager

We’ve been using memcached on both our sites for a while now to help alleviate database load and speed things up in general.

However, we’ve been lacking a good web-based manager to see the cache status and manually clear the cache.  (I’ve been doing this via telnet on the command line and have been too busy to write my own script...)

Today, I stumbled across this gem: http://livebookmark.net/journal/2008/05/21/memcachephp-stats-like-apcphp/

It’s a simple GUI for memcached that is written in PHP and was exactly what I was looking for!
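
For reference, the telnet routine boils down to two commands in memcached’s text protocol: stats and flush_all.  Below is a rough Python sketch of the same thing over a raw socket.  The function names are my own, it assumes memcached is listening on the default port 11211, and it is meant as an illustration rather than a replacement for a real manager.

```python
import socket


def parse_stats(raw):
    """Turn a raw 'STAT name value' response into a dict of strings."""
    stats = {}
    for line in raw.decode("ascii", "replace").splitlines():
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host="127.0.0.1", port=11211, timeout=2.0):
    """Send 'stats' and read until the terminating END line."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"stats\r\n")
        data = b""
        while not data.endswith(b"END\r\n"):
            chunk = conn.recv(4096)
            if not chunk:
                break  # server closed the connection early
            data += chunk
    return parse_stats(data)


def flush_all(host="127.0.0.1", port=11211, timeout=2.0):
    """Send 'flush_all' (invalidates every cached item); True if the server says OK."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"flush_all\r\n")
        return conn.recv(64).strip() == b"OK"


# Example usage (needs a running memcached, so it is left commented out):
# stats = fetch_stats()
# print(stats.get("curr_items"), stats.get("get_hits"), stats.get("get_misses"))
```

Keeping the parsing separate from the socket code makes it easy to check against a captured response without a live server.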
