Wednesday, March 25, 2015

Dual VIC or not to Dual VIC - The case against redundant UCS VICs

I answer lots of questions internally via email, and I hate that the answers are trapped behind walls. In the spirit of doing something a little different, and to lob them over the walls, I'm just going to start posting them. 

I'll call this #RealTalk - I'll redact anything that could point back to the real customer - but if you're on Twitter I'll leave your information intact. 

This first one seems to come up often: the case for redundant VICs on Cisco UCS. The main reason I'm against it? If your application is so damn important, why is it running on a single host?! It also adds complexity, and I'm against complexity. 

Anyway, here's the raw email .....
---------------------------------------
Hey Scott, 
Thanks, this helps a lot. The old team is almost 100%. <redacted> left for EMC, <redacted> is a PSS for another team and @Vallard bailed to the cloud team. Backfills are being hired and we should have a full team by the end of the month. It's like having a whole new team. I don’t think we will make our numbers. The K12’s are tough. 
  

From: "Scott Hanson (scohanso)" <scohanso@cisco.com>
Date: Tuesday, March 24, 2015 at 7:03 PM
To: <redacted>
Subject: Re: VMware Host

I've seen customers do it when less than lives are on the line. 

I generally talk customers out of doing it - if the app is that important, it's not running on a single host anyway; there's some type of application cluster, OS cluster, or VMware Fault Tolerance involved. 

Also, the failure rate of anything without moving parts rounds to zero. In the past I've pointed people to Chris Atkinson's blog. He's an admin for Travelport, and at the time he wrote this blog they had 1900 servers and 1 VIC failure. Not sure if he still blogs, but there's still a Google cache version - http://webcache.googleusercontent.com/search?q=cache:J3W4MCHxlj8J:www.chrisatkinson.com/%3Fp%3D10&hl=en&gl=us&strip=1
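
To put that number in perspective, here is a quick back-of-the-envelope sketch in Python. It uses only the two figures quoted above (1900 servers, 1 VIC failure). The observation window isn't stated, and it assumes one VIC per server and that two VICs in one blade would fail independently at that same rate. Those are assumptions for illustration, not something the blog claims.

```python
# Figures quoted from the blog: 1900 servers, 1 VIC failure.
servers = 1900
vic_failures = 1

# Fraction of servers that saw a VIC failure over the (unstated) period.
# ASSUMPTION: one VIC per server.
single_vic_failure_fraction = vic_failures / servers

# ASSUMPTION: a second VIC in the same blade fails independently at the same
# rate, so losing both is the product of the two individual fractions.
both_vic_failure_fraction = single_vic_failure_fraction ** 2

print(f"one VIC fails:   {single_vic_failure_fraction:.4%}")
print(f"both VICs fail:  {both_vic_failure_fraction:.6%}")
```

Roughly one blade in 1900 lost its VIC, which is why the single-VIC risk already "rounds to zero" for most workloads. The second adapter shaves off a risk that is already tiny.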

For those reasons, I think it's overkill. 

However, in this case, maybe it's not. Not being able to book a trip is different from a trauma unit going down. 

Ultimately their call, but I'd lay it out and let them decide. 

If they do it, make sure they understand placement policies, so they don't accidentally place the redundant vNICs on the same adapter - http://virtuallymikebrown.com/2014/10/29/real-world-cisco-ucs-adapter-placement/

How's the old team doing? You guys making your numbers and getting some $$$ .... It's actually a little light over here, hopefully picks up at the end. 

Scott Hanson - @CiscoServerGeek
Consulting Systems Engineer
US Enterprise - Data Center


On Mar 24, 2015, at 6:37 <redacted> wrote:

Hey Scott, 
Hope you are doing well. Your thoughts on the question below? Should they get the 1380 VIC for redundancy? My gut feeling tells me yes for best practice, since it is a single point of failure and these guys are a Tier 1 trauma center. Any reason why we should not add another VIC for redundancy? Will the cluster be sufficient for redundancy? 

From: <redacted>
Date: Tuesday, March 24, 2015 at 4:22 PM
To: <redacted>
Subject: VMware Host

<REDACTED>

 

If you were going to build an ESX cluster with UCS B200 M4 blades and were going to put Tier-1 applications on this cluster, would you rely solely on the built-in 1340 VICs? Or would you purchase the 1380 VICs as well in order to create VIC redundancy in VMware as shown below?

 

As in:

 

1340 VIC       1380 VIC       Purpose
vmnic0         vmnic3         Mgmt
vmnic1         vmnic4         vMotion
vmnic2         vmnic5         Production VMs

 

This way if a VIC failed you would still be up and connected to Ethernet and Storage.

 

Or do you think this is “overkill”?
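
If a customer does go dual VIC, the placement point from my reply is the part that bites people. Here's a minimal Python sketch of the sanity check I have in mind: every purpose should have uplinks on both adapters, and no vmnic should be claimed twice. The dictionary format and the `find_problems` helper are hypothetical, made up for illustration. It also assumes vmnic0-2 land on the 1340 and vmnic3-5 on the 1380, which you'd need to verify against the actual service profile.

```python
# Hypothetical model of the proposed layout: purpose -> [(vmnic, adapter), ...].
# ASSUMPTION: vmnic0-2 sit on the VIC 1340 and vmnic3-5 on the VIC 1380.
proposed = {
    "Mgmt":           [("vmnic0", "VIC 1340"), ("vmnic3", "VIC 1380")],
    "vMotion":        [("vmnic1", "VIC 1340"), ("vmnic4", "VIC 1380")],
    "Production VMs": [("vmnic2", "VIC 1340"), ("vmnic5", "VIC 1380")],
}


def find_problems(layout):
    """Return a list of problems; an empty list means the layout looks sane."""
    problems = []
    owner = {}  # vmnic -> first purpose that claimed it
    for purpose, uplinks in layout.items():
        # Redundancy only helps if the uplinks span more than one adapter.
        adapters = {adapter for _, adapter in uplinks}
        if len(adapters) < 2:
            names = ", ".join(sorted(adapters))
            problems.append(f"{purpose}: every uplink is on one adapter ({names})")
        # A vmnic listed under two purposes is usually a typo in the plan.
        for vmnic, _ in uplinks:
            if vmnic in owner:
                problems.append(
                    f"{vmnic} is assigned to both {owner[vmnic]} and {purpose}"
                )
            else:
                owner[vmnic] = purpose
    return problems


# A misplaced layout: both Production uplinks ended up on the same adapter.
misplaced = dict(proposed)
misplaced["Production VMs"] = [("vmnic2", "VIC 1340"), ("vmnic5", "VIC 1340")]

print(find_problems(proposed))
for problem in find_problems(misplaced):
    print(problem)
```

This only checks a plan on paper. It doesn't talk to UCS Manager or vCenter, so the real placement still has to be confirmed there.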