Tag Archives: storage

Some Reality for us Infrastructure Peeps or Apps are cool too

Don’t you just love double titles?

For many years I have been an infrastructure guy. I really liked how the cables, processors, memory and blinking lights worked. Applications were often the necessary evil, tolerated so that I could play with cool technology. During my own journey toward learning about the cloud, it has become increasingly important to consider the function of the application. Six-years-ago me would totally punch me in the face right now. Traitor. :)

1 – Don’t get your App messed up in my resource buckets of awesomeness

 

So the reality check to the infrastructure geek in me is this: the application teams really think of what you do as the network. That is why, when anything is ever wrong, it is always “the network’s” fault. What we love to do is getting abstracted away more and more. I still contend that it is very important and very hard to do. Whether you are building reference architectures or deploying a converged infrastructure appliance, almost no one but us cares. They just want the data to do their jobs. So while we have really great discussions about speeds and feeds, the guy in the picture below just wants the app. From the hypervisor down we need to design with the application in mind, or we risk becoming like that goth dude locked in the server room on The IT Crowd.

 

2 – Honey badger don’t care about FCoE

My next post will get into what I have been researching regarding what is out there and hopefully help us (infra. peeps) understand our App/Dev brothers better.

You are probably an Infrastructure person if:

  1. You read this blog.
  2. You work mainly with virtualization.
  3. You are a storage admin.
  4. You are a network admin.
  5. You like to make fun of DBAs.

 

Extents vs Storage DRS

I was meeting with a customer today and had to stop for a second when they said they were using 10 TB datastores in vSphere 4.1.

At first I wondered: maybe NFS? No, they are an all-block shop. Oh wait, yeah: extents. They were using LUNs of 2 TB minus 512 bytes to create a giant datastore. I asked why. The answer was simple: “so we only manage one datastore.”

I responded with, “Well, check out Storage DRS in vSphere 5!” It gives you that one point to manage and automatic placement across multiple datastores. Additionally, you can actually find which VM lives where, and use Storage Maintenance Mode to do storage-related maintenance. Right now they are locked into using extents. If they change their datastores into a datastore cluster, they gain flexibility without losing the ease of management.

I wanted to use the opportunity to list some of my thoughts on extents with VMware.

  1. Extents do not equal bad. Just have the right reason to use them, and running out of space is not one.
  2. If you lose one extent you don’t lose everything, unless that one is the first extent.
  3. VMware places blocks on extents in some sort of even fashion. It is not spill and fill. While it is not really load balancing, you don’t hammer just one LUN at a time.

An extent with a datastore is like a stack of luns. Don’t knock out the bottom block!

 

Some points about Storage DRS.

  1. Storage DRS places VMDKs based on IO and space metrics.
  2. Storage DRS and SRM 5 don’t play nice, last time I checked (2/13/12).
  3. Combine Storage DRS with Storage Policy and you have a really easy way to place and manage VMs on the storage. Just set the policy and check whether it is compliant.

A Storage DRS cluster is multiple datastores appearing as one.

Some links on the topics:

Some more information from VMware on Extents
More on Storage DRS (SDRS)

In conclusion, SDRS may be removing some of the last reasons to use an extent (getting multiple-LUN performance with a single point of management). Add to that the ability to have datastores of up to 64 TB with VMFS, and using extents will become even rarer than before. Unless you have another reason? Post it in the comments!

Storage Caching vs Tiering Part 2

Recently I had the privilege of being a Tech Field Day Delegate. Tech Field Day is organized by Gestalt IT. If you want more detail on Tech Field Day visit right here. In the interest of full disclosure, the vendors we visit sponsor the event. The delegates are under no obligation to review the sponsoring companies, favorably or otherwise.

After jumping in with a post last week on tierless caching, I wanted to share my thoughts on a second Tech Field Day vendor. Avere gave a very interesting and technical presentation. I appreciated being engaged at an engineering level rather than with a marketing pitch.

Avere tiers everything. It is essentially a scale-out NAS solution (they called it an FXT appliance) that can front-end any existing NFS storage. Someone else described it to me as file acceleration. The Avere NAS stores data internally on a cluster of NAS units. The “paranoia meter” lets you set how often the mass storage device is updated. If you need more availability or speed, you add Avere devices. If you need more disk space, you add to your mass storage. In their benchmarking tests they basically used some drives connected to a CentOS machine running NFS, front-ended by Avere’s NAS units. They were able to get the required IOPS at a fraction of the cost of NetApp or EMC.

The Avere Systems blog provides some good questions on Tiering.

The really good part of the presentation was how they write between the tiers. Everything is optimized for the particular type of media: SSD, SAS or SATA.
When I asked about NetApp’s statements about tiering (funny, they were on the same day), Ron Bianchini responded that “when you sell hammers, everything is a nail.” I believe him.

So how do we move past all the marketing speak and get down to the truth when it comes to caching and tiering? I am leaning toward thinking of any location where data lives for any period of time as a tier. I think a cache is a tier. A really fast cache for reads and writes is for sure a tier. Different kinds of disks are tiers. So I would say everyone has tiers. The value comes in when the storage vendor innovates and automates the movement and management of that data.

My questions/comments about Avere.

1. Slick technology. I would like to see it work in the enterprise over time. People might be scared because it is not one of the “big names”.
2. With the team having come from Spinnaker, is the plan to go long term with Avere, or to build something to be purchased by a big guy?
3. I would like to see how the methods used by the Avere FXT appliance can be applied to block storage. There are plenty of slow, inexpensive iSCSI products that would benefit from a device like this on the front end.

Storage Caching vs Tiering Part 1

Recently I had the privilege of being a Tech Field Day Delegate. Tech Field Day is organized by Gestalt IT. If you want more detail on Tech Field Day visit right here. In the interest of full disclosure, the vendors we visit sponsor the event. The delegates are under no obligation to review the sponsoring companies, favorably or otherwise.

The first place hosting the delegates was NetApp. I have worked with several different storage vendors, but I must admit I had never experienced NetApp in any way before, except for Storage vMotioning virtual machines from an old NetApp (I don’t even know the model) to a new SAN.

Among the 4 hours of slide shows I learned a ton. One great topic is Storage Caching vs Tiering. Some of the delegates have already blogged about the sessions here and here.

So I am going to give my super quick summary of caching as I understood it from the NetApp session, followed by a post about tiering as I learned it from one of our subsequent sessions, from Avere.

1. Caching is superior to Tiering because Tiering requires too much management.
2. Caching outperforms tiering.
3. Tiering drives cost up.

The NetApp method is to use really quick flash memory to speed up the performance of the SAN. Their software attempts to predict what data will be read and keeps that data available in the cache. This “front-ends” a giant pool of SATA drives. The cache cards provide the performance, and the SATA drives provide a single large pool to manage. With a simplified management model and just one type of big disk, the cost is driven down.

My Take Away in Tierless-Caching

This is a solution that has a place and would work well for many situations. It is not the only solution. All in all the presentation was very good. The comparisons against tiering, though, were really set up against a “straw man”. A multi-device tiered solution requiring manual management of all the different storage tiers is of course a really hard solution. It could cost more to obtain and could be more expensive to manage. I asked about fully virtual automated tiering solutions, the kind that manage your “tiers” as one big pool. These solutions would seem to solve the problem of managing tiers of disks, keeping the cost down. The question was somewhat deflected because these solutions move data on a schedule. “How can I know when to move my data up to the top tier?” was the question posed by NetApp. Of course this is not exactly how a fully automated tiering SAN works, but it is a valid concern.

My Questions for the Smartguys:

1. How can the NetApp caching software choices be better/worse than software that makes tiering decisions from companies that have done this for several years?
2. If tiering is so bad, why does Compellent’s stock continue to rise in anticipation of an acquisition from someone big?
3. Would I really want to pay NetApp sized money to send my backups to a NetApp pool of SATA disks? Would I be better off with a more affordable SATA solution for Backup to Disk even if I have to spend slightly more time managing the device?

B.Y.O.P – The Alternative Vblock

In college I would often be invited to a get-together where the invitation included the letters BYOB: Bring Your Own Beer. Sometimes a cookout would be BYOM: Bring Your Own Meat (or meat alternative for the vegetarians). So today I want to leverage this to push my new acronym, B.Y.O.P.: Bring Your Own Pod. Lately I have been seeing people talk about Vblocks. If I can venture a succinct definition, a Vblock is a pre-configured set of Cisco, EMC and VMware products, tested by super smart people, approved by these people to work together, and then supported by these organizations as a single entity. Your reseller/solutions provider really should already be doing this very thing for you. You may choose to buy just the network piece, or the hypervisor, but your partner should be able to verify that a solution works from end to end and provide unified support.

So You can’t call it BYOPCVCEP

Why not Vblock? This might get me blacklisted by the Elders of the vDiva council, but VCE doesn’t exist to make your life in the datacenter easier; they exist to sell you more VMware, Cisco and EMC. Vblock for sure simplifies your buying experience. I believe they are all great products and may very well do just what you need. Without competition, though, the only winner is VCE. Do not be forced into a box by the giant vendors. Find someone who can help determine your end goal, provide a vendor-neutral analysis of the building blocks needed to achieve it, and then provide the correct vendors and unified support to Build Your Own Pod.

So What is the Alternative Vblock

Originally I was going to draw up a sweet solution of 3par, Xsigo and Dell R610s and say, “Hey everyone! This is some cool stuff. Try to quiet the overwhelmingly loud voice calling from VCE and give this Alternative Vblock a try.” As I thought more and more about it, I decided that doing so would be contrary to my main point. I would rather provide the discussion points, and some possible products among others, that can be used to Build Your Own Pod. I am a firm believer in getting what is right for your datacenter needs. So here are a few links to help begin the discussion.

Xsigo and Pod – Jon Toor
3par and iBlocks – Marc Farley

Adaptive Queuing in ESX

While troubleshooting another issue a week or two ago I came across this VMware knowledge base article. Having spent most of my time with other brands of arrays in the past, I thought this was a pretty cool solution versus just increasing the queue depth of the HBA. I would recommend setting this on your 3par BEFORE you get QFULL problems. Additionally, NetApp has an implementation of this as well.

Be sure to read the note at the bottom especially:

If hosts running operating systems other than ESX are connected to array ports that are being accessed by ESX hosts, while the latter are configured to use the adaptive algorithm, make sure those operating systems use an adaptive queue depth algorithm as well or isolate them on different ports on the storage array.

I do need to dig deeper into how this affects performance as the queue begins to fill; I am not sure whether one method is better than another. Is this the new direction that many storage vendors will follow?

Until then, the best advice is to do what your storage vendor recommends, especially if they say it is critical.

Here is a quick run through for you.

In the vSphere Client


Select the ESX host, go to the Configuration tab and click on Advanced Settings under Software.

In the Advanced Settings


Select the option for Disk and scroll down to the QFullSampleSize and QFullThreshold.
Change the values to the 3par recommended values:
QFullSampleSize = 32
QFullThreshold = 4
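
To get a feel for what these two values control, here is a toy Python model of the throttling behavior as I understand it from the knowledge base article: once the count of QUEUE FULL or BUSY responses reaches QFullSampleSize the LUN queue depth is cut in half, and once the count of good responses reaches QFullThreshold the depth creeps back up by one. This is only a sketch, not the VMkernel code. The starting depth of 32, the way the counters reset, and the `AdaptiveQueue` class itself are all my assumptions for illustration.

```python
class AdaptiveQueue:
    """Toy model of LUN queue depth throttling (NOT the real VMkernel logic)."""

    def __init__(self, max_depth=32, sample_size=32, threshold=4):
        self.max_depth = max_depth       # assumed starting LUN queue depth
        self.depth = max_depth
        self.sample_size = sample_size   # stands in for QFullSampleSize
        self.threshold = threshold       # stands in for QFullThreshold
        self.full_count = 0
        self.good_count = 0

    def on_status(self, queue_full):
        """Feed one I/O completion status and return the current depth."""
        if queue_full:
            self.full_count += 1
            self.good_count = 0          # assumption: congestion resets recovery
            if self.full_count >= self.sample_size:
                self.depth = max(1, self.depth // 2)   # throttle: halve the depth
                self.full_count = 0
        else:
            self.good_count += 1
            if self.good_count >= self.threshold:
                # recover slowly: one slot at a time, never above the maximum
                self.depth = min(self.max_depth, self.depth + 1)
                self.good_count = 0
        return self.depth


queue = AdaptiveQueue()
for _ in range(32):                      # 32 QUEUE FULL responses in a row
    queue.on_status(queue_full=True)
print(queue.depth)                       # 16

for _ in range(8):                       # 8 good responses afterwards
    queue.on_status(queue_full=False)
print(queue.depth)                       # 18
```

The point of the sketch is the asymmetry: the depth drops quickly under congestion and recovers slowly, which is why other operating systems sharing the same array ports need to back off as well.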

iSCSI Connections on EqualLogic PS Series

EqualLogic PS Series Design Considerations

VMware vSphere introduces support for multipathing for iSCSI. EqualLogic released a recommended configuration for using MPIO with iSCSI. I have a few observations after working with MPIO and iSCSI. The main lesson is to know the capabilities of the storage before you go trying to see how many paths you can have with active IO.

  1. EqualLogic defines a host connection as 1 iSCSI path to a volume. At VMware Partner Exchange 2010 I was told by a Dell guy, “Yeah, gotta read those release notes!”
  2. EqualLogic limits the number of host connections to 128 per pool or 256 per group in the 4000 series (see the table below for the full breakdown) and to 512 per pool or 2048 per group in the 6000 series arrays.
  3. The EqualLogic MPIO recommendation mentioned above can consume many connections with just a few vSphere hosts.

I was under the false impression that by “hosts” we were talking about physical connections to the array, especially since the datasheet says “Hosts Accessing PS series Group”. It actually means iSCSI connections to a volume. Therefore, if you have 1 host with 128 volumes, each connected via a single iSCSI path, you are already at your limit (on the PS4000).

An example of how fast vSphere iSCSI MPIO (Round Robin) can consume available connections can be seen in this scenario: five vSphere hosts with 2 network cards each on the iSCSI network. If we follow the whitepaper above we will create 4 vmkernel ports per host. Each vmkernel port creates an additional connection per volume. Therefore, if we have 10 volumes of 300 GB for datastores, we already have 5 x 4 x 10 = 200 iSCSI connections to our EqualLogic array. That is really no problem for the 6000 series, but the 4000 will start to drop connections. And I have not even added the connections created by the vStorage API/VCB-capable backup server. So here is a formula*:

N – number of hosts

V – number of vmkernel ports

T – number of targeted volumes

B – number of connections from the backup server

C – number of connections

(N * V * T) + B = C

EqualLogic PS Series Array      Connections (pool/group)
4000E                           128/256
4000X                           128/256
4000XV                          128/256
6000E                           512/2048
6000S                           512/2048
6000X                           512/2048
6000XV                          512/2048
6010, 6500, 6510 Series         512/2048
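
If you want to play with the numbers, the formula drops easily into a few lines of Python. This is just a sketch: the function name and the dictionary are mine, the limits are the per-pool values from the table, and the scenario is the five-host example from earlier in the post (with no backup server connections counted).

```python
# Per-pool iSCSI connection limits, taken from the table in this post.
POOL_LIMIT = {"PS4000": 128, "PS6000": 512}


def iscsi_connections(hosts, vmkernel_ports, volumes, backup_connections=0):
    """Estimate connections with the formula (N * V * T) + B = C."""
    return hosts * vmkernel_ports * volumes + backup_connections


# Scenario from the post: 5 hosts, 4 vmkernel ports each, 10 datastore volumes.
connections = iscsi_connections(hosts=5, vmkernel_ports=4, volumes=10)
print(f"estimated connections: {connections}")   # estimated connections: 200

for model, limit in POOL_LIMIT.items():
    status = "over" if connections > limit else "within"
    print(f"{model}: {status} the per-pool limit of {limit}")
```

Treat the result as a floor rather than an exact count, since the real number of connections can come out higher.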

Use multiple pools within the group in order to avoid dropped iSCSI connections and provide scalability. The trade-off is that this reduces the number of spindles you are hitting with your IO. Taking care to know the capacity of the array will help avoid big problems down the road.

*I have seen the connections actually be higher, and I can only figure this is because of the way EqualLogic does iSCSI redirection.

New VMware KB – zeroedthick or eagerzeroedthick

Because of the performance hit while zeroing that is mentioned in the Thin Provisioning Performance white paper, this article in the VMware knowledge base could be of some good use.

I would suggest using eagerzeroedthick for any high IO tier 1 type of Virtual Machine. This can be done when creating the VMDK from the GUI by selecting the “Support Clustering Features such as Fault Tolerance” check box.

So go out and check your VMDK’s.

Storage Design and VDI

Recently I have spent time re-thinking certain configuration scenarios and asking myself, “Why?” If there is something I do day to day during installs, is it still valid when it comes to vSphere? Will it still be valid in future versions?
Lately I have questioned how I deploy LUNs/volumes/datastores. I usually deploy multiple moderately sized datastores. In my opinion this was always the best way to fit MOST situations. I also create datastores based on need afterward. So I will create some general-use datastores, then add a bigger or smaller store based on performance/storage needs. After all the research I have done and the questions I have asked on Twitter*, I still think this is a good plan in most situations.
I went over a VMworld.com session TA3220 – VMware vStorage VMFS-3 Architectural Advances since ESX 3.0 and read this paper:
http://www.vmware.com/resources/techresources/1059
I also went over some blog posts at Yellow-Bricks.com and Virtualgeek.

An idea occurred to me when it comes to using extents in VMFS, SCSI reservations/locks, and VDI “boot storms”. First, some things I picked up.
1. Extents are not “spill and fill”. VMFS places VM files across all the LUNs. It is not quite what I would call load balancing, since it does not take IO load into account when placing files. So in situations where all the VMs have similar loads this won’t be a problem.
2. Only the first LUN in a VMFS span gets locked by “storage and VMFS Administrative tasks” (Scalable Storage Performance, pg 9). I am not sure if this implies all locks.

Booting hundreds of VMs for VMware View will cause locking, and even though vSphere is much better about how quickly this process completes, there is still an impact. So I am beginning to think of a disk layout to ease administration for VDI, and possibly lay the groundwork for improved performance. Here is my theory:

Create four LUNs of 200 GB each. Use VMFS extents to group them together, resulting in an 800 GB datastore with 4 disk queues and only 1 LUN that locks during administrative tasks.

Give this datastore to VMware View and let it have at it. Since the IO load for each VM is mostly the same, and really at its highest during boot, other tasks performed on the LUN after the initial boot storm will have even less impact. So we can let desktops get destroyed and rebuilt/cloned all day while only locking that first LUN. This part I still need to confirm in the lab.

What I have seen in the lab is that with same-sized clones the data on disk was spread pretty evenly across the LUNs.

Any other ideas? Please leave a comment. Maybe I am way off base.

*(thanks to @lamw @jasonboche and @sakacc for discussing or answering my tweets)

Using iSCSI to get some big ole disk in a Virtual Machine

First, I have lived in the South too long, because I said “Big ole disk” and couldn’t think of a more appropriate phrase. Now someone rescue me if I start to tell you to “mash” the power button on your server or SAN. I kid.

I am sure everyone out there has used this before, but I like to document these things just in case someone else needs help.

A coworker and I were installing a vSphere environment last week to support some new software for a customer. The software vendor required approximately 30 x 146GB drives in a RAID 5 to store images. You would never guess that the software vendor happens to sell SANs too! I exaggerate; it actually called for 3TB of usable space.

So my thought was that to get past the 2TB limit of VMFS we would need to use the MS iSCSI initiator inside the VM. Then my coworker thought we could enable MPIO using two virtual NICs with vmxnet3. We tied each vmxnet3 NIC to a separate port group and assigned one of the 2 physical NICs to each port group. Additionally, vmxnet3 lets you enable jumbo frames, and the physical NICs were already set to MTU 9000 because this was on the software iSCSI vSwitch. So we were able to get multiple paths from the VM to the network and have jumbo frames all the way through.

Next we presented the iSCSI volume of 3TB to the Windows machines. Of course at first it sees it as a couple of smaller volumes. Convert the disk to GPT and align to 64k, then format with NTFS. Just like that a 3TB disk inside a Virtual Machine.
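
One way to see why the GPT conversion is needed is to run the numbers. An MBR partition table stores sector counts in 32-bit fields, so with 512-byte sectors it tops out at 2 TiB. The quick Python sketch below assumes 512-byte sectors and treats the 3TB volume as 3 TiB for simplicity; the `needs_gpt` helper is just a name I made up for illustration.

```python
SECTOR_BYTES = 512           # classic sector size, assumed for this sketch
MBR_MAX_SECTORS = 2 ** 32    # MBR keeps sector counts in 32-bit fields
TIB = 2 ** 40


def needs_gpt(disk_bytes, sector_bytes=SECTOR_BYTES):
    """Return True when the disk is too large for MBR to address fully."""
    return disk_bytes > MBR_MAX_SECTORS * sector_bytes


mbr_limit = MBR_MAX_SECTORS * SECTOR_BYTES
print(mbr_limit // TIB)              # 2  (the MBR ceiling in TiB)
print(needs_gpt(3 * TIB))            # True: the 3TB volume needs GPT
print(needs_gpt(2 * TIB - 512))      # False: a 2TB minus 512B disk still fits
```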

iSCSI MPIO

Now we saw IOMETER push better sequential IO than an RDM that was set up for Round Robin, but not quite as good in the random IO department as an RDM.

The main gain here is getting a disk bigger than 2TB minus 512B. That is useful for the scan/image servers that store tons of files for a long time.

To sum up and make it clear.

1. Use the Microsoft iSCSI initiator and MPIO inside the VM when you need more than 2TB on a disk.

2. Use 2 port groups and bind them to separate physical NICs to let MPIO actually work over 2 NICs.

3. With vSphere, use the VMXNET3 driver for the network adapter to get jumbo frames; the E1000 driver does not support this.