Storage is often overlooked when planning for high availability. As we have discussed in the last several posts, your server workloads should be highly available so your business can continue to run smoothly. Moving toward a virtualized infrastructure provides flexibility and resiliency in your operations. In an ideal world, all of this runs on top of shared storage: a storage area network (SAN). That SAN may be iSCSI or Fibre Channel, or perhaps something newer in the future. Because the storage is “shared,” every workload depends on it, so it must be as redundant and highly available as is reasonable.

One of the most common designs for a highly available SAN is multiple head units with shared storage behind them. Think of this as Head A + Head B in an active/active or active/passive failover configuration, with both heads connected to storage shelves full of disks. That design has served the industry well for decades, but the details matter: we want redundant connections from each head to the shelves, reasonable RAID levels on the various disk pools, and failover that is near-instant and as non-disruptive as possible.
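To make the Head A + Head B idea concrete, here is a minimal sketch of the decision an active/passive pair has to make. It is an illustration only, not how any particular SAN product works: the `Head` class, the `pick_active` function, and the shelf names are all hypothetical, and a real controller also has to handle cache state, fencing, and timing.

```python
from dataclasses import dataclass, field


@dataclass
class Head:
    """A hypothetical SAN head unit (controller)."""
    name: str
    healthy: bool = True
    # Names of the shelves this head currently has a working path to.
    shelf_paths: set = field(default_factory=set)


def pick_active(heads, shelves, current=None):
    """Return the head that should serve I/O, or None if no head can.

    A head is usable only if it is healthy AND can reach every shelf.
    The current active head is kept whenever it is still usable, so a
    failover happens only when it is actually needed.
    """
    def usable(head):
        return head.healthy and shelves <= head.shelf_paths

    if current is not None and usable(current):
        return current
    for head in heads:
        if usable(head):
            return head
    return None


# Both heads are cabled to both shelves (assumed layout).
shelves = {"shelf-1", "shelf-2"}
head_a = Head("A", shelf_paths=set(shelves))
head_b = Head("B", shelf_paths=set(shelves))

active = pick_active([head_a, head_b], shelves)
head_a.healthy = False  # simulate Head A failing
active_after_failure = pick_active([head_a, head_b], shelves, current=active)
```

In this sketch, Head A serves I/O until it fails, and Head B takes over afterward. Note that Head B can only take over because it has its own paths to every shelf, which is why the connections to the shelves need to be redundant too.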

Not all SANs are equal. Hardware is hardware: you have servers, CPUs, memory, HBAs and disks, both spinny and flash/SSD based. Software is where it’s at. The software controls failover in the event of a failure. The software provides appropriate caching optimization for the server workloads. The software provides rich feature sets like thin provisioning.

Knowing what we know about how bad things will happen, wouldn’t it be great if this software could also facilitate better high availability? For example: why would you put all your storage in a single rack (or set of racks) in one server room? Shouldn’t you consider a design built around a “stretched” highly available SAN? Instead of the typical dual-node/shared-storage SAN, what if you could build a resilient, highly available SAN that spans closets, floors, or buildings on a well-connected campus? You can. With any design decision you must weigh the tension between cost, flexibility and high availability. If you “stretched” a SAN across two different geographic locations, you would be buying additional (at minimum, double) disk shelves and disks. But think about what you would have then: a SAN that can withstand bad things that might happen to a closet, floor, or building. You could also move server workloads between closets, floors, or buildings with ease using the migration tools in your hypervisor, and geographically separate your core clustered server workloads to achieve application-level high availability in case of disaster. And there you would have the core building block of a good disaster recovery plan (which we will discuss later).
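The cost-versus-availability trade-off above can be sketched in a few lines. This is a rough illustration under stated assumptions: every volume is fully mirrored at each site, RAID and spare overhead are ignored, and the function names, site names, and the 20 TB figure are made up for the example.

```python
def raw_capacity_needed(usable_tb, copies):
    """Raw disk capacity to buy when every volume is kept at `copies` sites.

    Assumption: each site holds a full mirror, and RAID/spare overhead
    is ignored, so the result is a lower bound.
    """
    return usable_tb * copies


def data_available(replica_sites, failed_site):
    """True if at least one full copy survives the loss of `failed_site`."""
    return any(site != failed_site for site in replica_sites)


# Traditional SAN: all storage in one server room.
single = ["Building 1"]
# Stretched SAN: a full copy in each of two buildings.
stretched = ["Building 1", "Building 2"]

single_survives = data_available(single, "Building 1")
stretched_survives = data_available(stretched, "Building 1")
single_raw_tb = raw_capacity_needed(20, len(single))
stretched_raw_tb = raw_capacity_needed(20, len(stretched))
```

Under these assumptions, the stretched design needs twice the raw capacity for the same usable space, and in exchange the data remains available when an entire building is lost.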

As we dig deeper into achieving high availability, the talk will become more technical. If you don’t have someone on your payroll who speaks this language, it’s time to get someone. Ask us how we can help implement the hardware and software that will provide your business with a highly available environment. Downtime kills productivity. Contact us today!