
Jun 19, 2009

Clever solutions: Microsoft Azure and its fabric controller

I’m not the biggest fan of Microsoft Hyper-V, which I assume is the hypervisor used in their Azure solution. As illustrated earlier, Hyper-V has its problems, especially with live migration and iSCSI support. But Azure has some clever solutions of its own.

I found a video of a presentation held at the Professional Developers Conference (PDC) 2008, where Erick Smith and Chuck Lenzmeier talked about how Microsoft Azure works. In their presentation, they talk about the fabric controller. Now, the fabric controller is an interesting piece of software. It maintains a view of the entire data center and is capable of controlling and managing it to satisfy SLAs.

First, the fabric controller is capable of allocating resources to a service and determining how the roles of the service are connected. It also handles fault tolerance and system updates. Now, fault tolerance and redundancy are pretty interesting:

VMware vCenter has no way of knowing the physical layout of a data center. It does not know and it does not care. Now, consider the following: you have two racks of computers, with 8 machines and one switch in each rack. That gives you 16 machines in your cluster. A virtual machine could be placed on any of these machines, while either the Fault Tolerance (FT) or High Availability (HA) module takes care of redundancy in case a node fails. Now, what happens if the virtual machine and its clone both sit on machines connected to the same switch, and that switch dies? Easy, the service is down. Solution: add redundant network paths.
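To make the failure scenario concrete, here is a minimal Python sketch of the two-rack example. The host and switch names, the `HOST_SWITCH` map and both helper functions are made up for illustration; this is not how vCenter or Azure model their topology.

```python
# Hypothetical topology for the example: 16 hosts, 8 behind each switch.
HOST_SWITCH = {
    f"host{i:02d}": ("switch1" if i <= 8 else "switch2")
    for i in range(1, 17)
}


def shared_switch(primary_host, clone_host, host_switch=HOST_SWITCH):
    """Return the switch both hosts depend on, or None if they differ."""
    a = host_switch[primary_host]
    b = host_switch[clone_host]
    return a if a == b else None


def survives_switch_failure(primary_host, clone_host, failed_switch,
                            host_switch=HOST_SWITCH):
    """True if at least one copy is still reachable after the switch dies."""
    return any(
        host_switch[host] != failed_switch
        for host in (primary_host, clone_host)
    )
```

With the virtual machine on `host01` and its clone on `host05`, both copies depend on `switch1`, so losing that switch takes the service down. Put the clone on `host09` instead and one copy survives.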

A better solution to this problem is to make sure that the virtual machine and its clone run at physically separate locations: one on a machine in rack 1, the other on a machine in rack 2. Now, this is exactly what the fabric controller is capable of. It maintains a view of the physical infrastructure so that a single point of failure cannot take out every copy of a service sitting behind it. It knows that a switch failure could take out both the master and its clone, so it places them in different domains (for example, rack 1 is one domain, rack 2 is another). This information is also used when updates are applied.
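The placement idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the fabric controller's actual algorithm: the function name `place_replicas`, the rack and host names, and the use of a per-host VM count as the load metric are all hypothetical.

```python
def place_replicas(hosts_by_domain, replicas, load=None):
    """Pick one host per replica, each in a different domain.

    hosts_by_domain: dict mapping a domain (e.g. a rack) to its host names.
    replicas: number of copies of the virtual machine to place.
    load: optional dict of host -> VMs already running (assumed metric).
    Returns a dict mapping each chosen domain to the chosen host.
    """
    load = load or {}
    candidates = []
    for domain, hosts in hosts_by_domain.items():
        if not hosts:
            continue  # a domain without hosts cannot take a replica
        # Least-loaded host in this domain; ties are broken by name.
        best = min(hosts, key=lambda h: (load.get(h, 0), h))
        candidates.append((load.get(best, 0), domain, best))

    if replicas > len(candidates):
        raise ValueError("not enough domains to separate all replicas")

    # Use the domains whose best host is emptiest.
    candidates.sort()
    return {domain: host for _, domain, host in candidates[:replicas]}


racks = {"rack1": ["r1h1", "r1h2"], "rack2": ["r2h1", "r2h2"]}
placement = place_replicas(racks, 2, load={"r1h1": 3, "r2h1": 1})
```

Here `placement` puts one copy in each rack, so a single switch failure can only reach one of them. Asking for three replicas with only two racks raises an error instead of silently doubling up in one domain.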

Now, that’s how you remove most of the hassle of operating a data center. With VMware vSphere, you still have to build a lot of redundant paths to keep virtual machines and services redundant. You also need some knowledge of the physical infrastructure when building your virtual data center and placing virtual machines on it. vSphere will select which hosts to place the virtual machines on if you have enabled Distributed Resource Scheduler (DRS), but it cannot handle failures beyond the physical hosts running the virtual machines.

The fabric controller also handles traffic management, load balancing and the like. I have to say, the fabric controller is a pretty nice feature :-)

Watch this video for more on Microsoft Azure and how they build their data center!
