LACP, Yeah You Know Me

Traditionally when configuring vCenter and ESXi, most customers go the default networking route with Management, vMotion, and vSAN Port Groups sitting on a vSphere Distributed Switch (vDS).  Usually, I see customers configure for active/active or active/standby.  When using active/standby I usually see Management and vMotion with Uplink1 active and Uplink2 standby and then Uplink2 active and Uplink1 standby for vSAN traffic. 

Active/active with the default teaming settings doesn’t spread traffic evenly across both NICs, so it is not a great load balancing technique.  The active/standby deployment discussed above usually works well, but the vSAN link tends to carry significantly more traffic than the Management/vMotion link.

What if I told you that there was a better way?  Something that would allow for using all the available bandwidth while providing redundancy in case of failure.  Well Hello Nur…errr…LACP.

First, I think it is critical to understand the acronyms and terminology:

LACP – Link Aggregation Control Protocol – Allows for the bundling of multiple ports into one single combined interface that allows for load balancing and redundancy.  An example would be 2x25G ports being bundled together to form a 50G pipe.  This is the overall technology, but different vendors use different naming for how they do this.

  • LAG = Port-Channel (PC) – One vendor uses Link Aggregation Groups (LAGs) and one uses Port-Channels.  These are the same thing.  This allows for multiple ports on the same switch to be bundled together to form the higher bandwidth interface.

  • MLAG = vPC – Multi-Chassis Link Aggregation (MLAG) and Virtual Port Channel (vPC) are both doing the same thing.  They allow you to use one or more ports from two separate switches and combine them to get the higher bandwidth interface.  This allows for increased redundancy, where losing a switch will only take down one of the two bonded interfaces.
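
One detail worth keeping in mind about the “50G pipe”: a LAG spreads traffic per flow, so each individual conversation rides a single member link and the combined bandwidth only shows up across many flows.  Here is a minimal Python sketch of that idea.  The CRC32-of-IP-pair hash and the sample addresses are assumptions for illustration only; the real hash is whatever load balancing mode your switch and vDS are set to.

```python
import zlib

def pick_link(src_ip: str, dst_ip: str, links: list[str]) -> str:
    """Map a flow to one LAG member with a deterministic hash (illustrative only)."""
    key = f"{src_ip}->{dst_ip}".encode()
    return links[zlib.crc32(key) % len(links)]

# Hypothetical LAG members and flows, not taken from a real environment.
links = ["LAG1-0", "LAG1-1"]
flows = [
    ("10.0.0.11", "10.0.0.12"),
    ("10.0.0.11", "10.0.0.13"),
    ("10.0.0.12", "10.0.0.14"),
]

for src, dst in flows:
    # The same (src, dst) pair always lands on the same member link.
    print(f"{src} -> {dst}: {pick_link(src, dst, links)}")
```

Because the mapping is deterministic, a single large transfer between two hosts will not exceed the speed of one member link.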

Now that we understand the acronyms, let’s get this configured.  We are going to begin where many customers are today, with their production system up and running on a vSphere Distributed Switch (vDS).  Teaming and failover for the Management and vMotion port groups is configured with Uplink 1 active and Uplink 2 in standby, while the vSAN port group is configured for Uplink 2 active and Uplink 1 standby.

The first change we need to make is to change the failover order for all the port groups to an active/standby model with Uplink 1 active and Uplink 2 standby.  The easiest way to do this is to right-click on your vDS, choose “Distributed Port Group” and then click “Manage Distributed Port Groups”.

Select “Teaming and failover” and then click “NEXT”


Select all your port groups and click “NEXT”.


Make sure that Uplink 1 is active, and Uplink 2 is standby then click “NEXT”.  Review your changes and click “FINISH”.

My lab has two 25G cables running between two Cisco 93180 switches that are configured with vPC; ports 47/48 on each switch are bundled together to form a 50G link between the switches.  You must have vPC between the switches set up and configured if you are going to use MLAG/vPC.

For more information on Cisco vPC, please see Cisco’s “Configuring vPCs” documentation.

Using the command “sh port-channel summary” we can see that the vPC link (port-channel 161) is active and communicating properly between the switches. 

Now that we have verified the two switches talking to each other, we are going to look at the host networking to the switches.  Each ESXi host has two 25G connections, one running to each switch.  In my lab the hosts are using ports 1-8 on each switch.  Notice, the only difference is my description; I like to show that there are two ports tied to this LAG.

On Switch A we are going to create the port-channels, but we are not going to add eth1/1-8 to them yet because that would break communication.  These switchports are cabled to the hypervisors’ vmnic associated with Uplink 1 (currently Active on our vDS port groups and currently handling all traffic in and out of the host).  You would run the same commands on both switches, changing the description if desired.

Interface po201
Description esx01-LAG1-p1
vpc 201

Interface eth1/1
switchport mode active
mtu 9216
Description esx01-LAG1-p1

Interface eth1/2
switchport mode active
mtu 9216
Description esx02-LAG1-p1

Interface po202
Description esx02-LAG1-p1
vpc 202

Interface eth1/3
switchport mode active
mtu 9216
Description esx03-LAG1-p1

Interface po203
Description esx03-LAG1-p1
vpc 203

Interface eth1/4
switchport mode active
mtu 9216
Description esx04-LAG1-p1

Interface po204
Description esx04-LAG1-p1
vpc 204

Interface eth1/5
switchport mode active
mtu 9216
Description esx05-LAG1-p1

Interface po205
Description esx05-LAG1-p1
vpc 205

Interface eth1/6
switchport mode active
mtu 9216
Description esx06-LAG1-p1

Interface po206
Description esx06-LAG1-p1
vpc 206

Interface eth1/7
switchport mode active
mtu 9216
Description esx07-LAG1-p1

Interface po207
Description esx07-LAG1-p1
vpc 207

Interface eth1/8
switchport mode active
mtu 9216
Description esx08-LAG1-p1

Interface po208
Description esx08-LAG1-p1
vpc 208
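
Since the eight stanzas only differ by host number, they can be generated instead of typed.  This is a rough Python sketch under a few assumptions: the helper name, the `base_id` of 200, and the `switchport_mode` parameter are mine, and the default of `trunk` is an assumption for hosts that carry several VLANs, so set it to whatever mode your ports actually need.

```python
def host_stanzas(host_index: int, switchport_mode: str = "trunk", base_id: int = 200) -> str:
    """Build the port-channel and ethernet stanzas for one ESXi host.

    switchport_mode and base_id are assumptions for this sketch; adjust both
    to match your own switch configuration.
    """
    po_id = base_id + host_index
    description = f"esx{host_index:02d}-LAG1-p1"
    lines = [
        f"Interface po{po_id}",
        f"Description {description}",
        f"vpc {po_id}",
        "",
        f"Interface eth1/{host_index}",
        f"switchport mode {switchport_mode}",
        "mtu 9216",
        f"Description {description}",
    ]
    return "\n".join(lines)

# Hosts 1-8 map to eth1/1-8 and po201-po208, as in my lab.
config = "\n\n".join(host_stanzas(i) for i in range(1, 9))
print(config)
```

Review the generated text before pasting it into a switch; it is only as correct as the numbering scheme you feed it.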

Running “sh port-channel summary” shows that I created all my port channels, but none contain member ports yet.

Only on switch B, where these switchports are cabled to the hypervisors’ vmnic associated with Uplink 2 (currently Standby on our vDS port groups), we are now going to add the ethernet interfaces to the port channel groups.

SWITCH B ONLY

Interface eth1/1
channel-group 201 mode active

Interface eth1/2
channel-group 202 mode active

Interface eth1/3
channel-group 203 mode active

Interface eth1/4
channel-group 204 mode active

Interface eth1/5
channel-group 205 mode active

Interface eth1/6
channel-group 206 mode active

Interface eth1/7
channel-group 207 mode active

Interface eth1/8
channel-group 208 mode active

Running “sh port-channel summary” on switch B now shows member ports in each port channel, but they are in a suspended state.  That is expected at this point because the hosts are not yet running LACP on those uplinks.  In vCenter, we are now going to move Uplink 2 to the LAG on the vDS.

Now we are going to create the LACP group.  Click on your vDS, then click “Configure” and then click LACP from the settings pane.  You will see the “New Link Aggregation Group” window.   Change the name as needed and just take the defaults for the rest and then click “OK”.

Click “MIGRATING NETWORK TRAFFIC TO LAGS”, then click “MANAGE DISTRIBUTED PORT GROUPS”.

Select “Teaming and failover” and click “NEXT”.

Select your port groups then click “NEXT”.

Move Uplink 2 to unused and LAG1 to standby and click “NEXT”.  You will get a notification that using standalone uplinks and a standby LAG should only be temporary while migrating to the LAG.  Click “OK”.

Now we are going to migrate our Uplink 2 to LAG1.  Click “MIGRATING NETWORK TRAFFIC TO LAGS” then click “ADD AND MANAGE HOSTS”.

Click “Manage host networking” then click “NEXT”.


Select your hosts and then click “NEXT”

Select vmnic3 (your vmnics might be different) uplink and change it to “LAG1-1” then click “NEXT”.

We don’t need to make changes to our VMkernel adapters because they already live on the vDS and this LAG group is part of that.  Click “NEXT”.

We are not migrating any VM networking so click “NEXT” then click “FINISH” on the Ready to complete screen.

We currently have vmnic2 on Uplink 1 of our vDS using the standard configuration, and vmnic3 is attached to LAG1-1.  Next, we need to move vmnic2 from Uplink 1 to the MLAG.  Click “ADD AND MANAGE HOSTS”.

Click “Manage host networking” then click “NEXT”.

Select your hosts and then click “NEXT”

Select vmnic2 Uplink 1 and change it to “LAG1-0” then click “NEXT”.

We don’t need to make changes to our VMkernel adapters because they already live on the vDS and this LAG group is part of that.  Click “NEXT”.

We are not migrating any VM networking so click “NEXT” then click “FINISH” on the Ready to complete screen.

We now have both Uplinks on the MLAG.  Now we need to configure our ports on switch A to be part of the port-channel groups.  SSH into switch A and run the following. 

SWITCH A ONLY

Interface eth1/1
channel-group 201 mode active

Interface eth1/2
channel-group 202 mode active

Interface eth1/3
channel-group 203 mode active

Interface eth1/4
channel-group 204 mode active

Interface eth1/5
channel-group 205 mode active

Interface eth1/6
channel-group 206 mode active

Interface eth1/7
channel-group 207 mode active

Interface eth1/8
channel-group 208 mode active

Running “sh port-channel summary” we now see both sides of our vPC with the status of “P” which is “Up in port-channel”.
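
With eight port channels on two switches, it can be easier to check the member flags with a few lines of Python than by eye.  The sketch below only looks at member tokens such as `Eth1/1(P)`; the two sample strings are hand-written for illustration, not captured from my switches, and the function names are my own.

```python
import re

# Matches member-port tokens such as "Eth1/1(P)" or "Eth1/1(s)".
MEMBER = re.compile(r"(Eth\d+/\d+)\((\w)\)")

def member_states(summary_text: str) -> dict[str, str]:
    """Return {port: flag} for every member token found in the text."""
    return {port: flag for port, flag in MEMBER.findall(summary_text)}

def all_up(summary_text: str) -> bool:
    """True only if at least one member was found and every member shows 'P'."""
    states = member_states(summary_text)
    return bool(states) and all(flag == "P" for flag in states.values())

# Hand-written samples (assumptions): 's' = suspended, 'P' = up in port-channel.
sample_before = "Eth1/1(s) Eth1/2(s)"
sample_after = "Eth1/1(P) Eth1/2(P)"

print(all_up(sample_before))
print(all_up(sample_after))
```

Paste the real command output in place of the samples and run it once per switch.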

We have one final step and that is to make the MLAG the only active connection for teaming and failover.  Click “MANAGE DISTRIBUTED PORT GROUPS”.

Select “Teaming and failover” and click “NEXT”.

Select your port groups then click “NEXT”.

Move both Uplink 1 and Uplink 2 to Unused and move LAG1 to Active and click “NEXT” then “FINISH”.

For those with short attention spans: TL;DR

Steps:

  1. Make sure that the vPC peer link is configured based on best practice from your switch vendor.
  2. Change vDS port groups to use Uplink 1 active and Uplink 2 standby.
  3. Configure port-channels on both switch A and switch B.
  4. Configure ethernet ports on switch A and switch B for ESXi hosts.
  5. Add only switch B ethernet ports to port-channel groups.
  6. Create LAG in vCenter.
  7. Edit port groups and make LAG1 standby and make Uplink 2 unused.
  8. Move vmnic3 from Uplink 2 to LAG1-1 and validate port-channel configuration.
  9. Move vmnic2 from Uplink 1 to LAG1-0.
  10. Add switch A ethernet ports to port-channel groups.
  11. Move LAG1 to active and Uplink 1 and Uplink 2 to unused for all port groups.
  12. Profit.


LSI MegaRAID SAS 3108 – Cisco 12G SAS Raid – VSAN JBOD

The other day I decided to switch out the disk that I was using for VSAN caching. I was doing some testing with an NVMe drive, but now had to back it down to a SAS SSD. The disk I used had been used previously in a different system, so it had a foreign configuration that I had to remove.

  1. Best practice would be to work on one host at a time. Put the first host in maintenance mode, and choose to “Ensure Availability”.
  2. After the host is in maintenance mode, click on your cluster then click “Configure” tab, and then click “Disk Management“.
  3. Click on the disk group that you want to remove and then click the “Remove the disk group” button.
  4. You will get another data migration question. I choose “Ensure data accessibility from other hosts“. Click “Yes“.
  5. Wait for the disk group to be removed from the host. When complete, reboot your host. When prompted during the boot process, press “Ctrl-R” to get to the raid configuration menu.
  6. Press “Ctrl-P” or “Ctrl-N” to switch pages. One of the pages should show your disks and the slots they are in. We have a problem here. The only option for my 400GB SSD is to erase the disk because it has the state of “Foreign“.
  7. Switch pages to the “Virtual Drive Management” page and then on the Cisco 12G SAS Modular Raid press “F2“. This will give a menu; select “Foreign Config” and then “Clear“.
  8. This will clear out your configuration so please make sure that you have thought things through. If you are OK with the possibility of data loss, click “Yes“.
  9. Now we are getting somewhere. The disk now shows UG (Unconfigured Good).
  10. Highlight the disk and then press “F2“. From the menu click “Make JBOD“.
  11. DATA ON DISKS WILL BE DELETED so make sure you want to do this. Click “Yes” to proceed.
  12. All looks good. Escape out and exit the application. Reboot your host when done.

  13. My new 400GB disk shows up in VMware now.

  14. Now click on your cluster, then click the “Configure” tab and then click “Disk Management“. Click on the host that you removed the disk from earlier and then choose “Add disk group” button. Choose your cache disk and your capacity disks and you are ready to go. Take your host out of maintenance mode and repeat steps on each host.

VSAN on Cisco C240-M3 with LSI MegaRAID SAS 9271-i

In the past I have configured an LSI MegaRAID SAS 3108 – Cisco 12G SAS Raid controller with 1GB FBWC module. When I set that up, I just passed through control to VMware. The MegaRAID SAS 9271-i is different; here is how I set it up. I used VMware KB2111266 for reference on configuration settings.

When booting and the controller information comes up, press CTRL-H.

  1. Click “Start”.
  2. I already have Virtual Drive 0 configured for my ESXi OS. Virtual Drive 1 has my 400GB disk I am using for VSAN caching. I have four unconfigured disks that I want to use for my capacity tier.
    Click “Configuration Wizard”.
  3. Click “Add Configuration” radio button and then click “Next“.
  4. Click “Manual Configuration” radio button and then click “Next“.
  5. Now we see the four unconfigured drives on the left side. Click the first one, then click “Add to Array“.
  6. Click on the “Accept DG” button. Repeat Steps 5 and 6 until all of your disks are in their own disk group then click “Next“.
  7. In the left pane click the “Add to SPAN” button.
  8. The Disk Group appears in the right window under Span. Click “Next“.
  9. Depending on if the disk is an HDD or SSD, your settings will change. In my example I configured for HDD. When finished changing settings, click “Accept” and then “Next“.

  10. You will receive an alert about the possibility of slower performance with Write Through. Click “Yes“.
  11. You now have to click “Back” and repeat steps 7-10 until all of your drives have been added.
  12. Once all of your drives have been added, click “Accept“.
  13. Click “Yes” to save the configuration.
  14. Acknowledge that you know that data will be lost on the new virtual drives. Click “Yes”.
  15. You will now see all of your drives under Virtual Drives. Click the “Home” button.
  16. Click “Exit”.
  17. Click “Yes”.
  18. Power cycle your server.
  19. Success!! vCenter shows my drives under storage devices. I can now add these disks to VSAN.

 

Storage Policy Based Management Wins

Hey everyone, it’s my first VSAN post! For the past few months I have been building out VSAN on a few test environments. One of them is what I call a Frankenstein cluster. It consists of four Cisco C series hosts with four SSDs for caching and 16 SSDs for capacity, set up for an all flash VSAN. In order to do dedupe and compression you have to have all flash. I am not going into performance discussions right now, but instead want to talk about Storage Policy Based Management or SPBM. Last week I had someone ask me where to set a disk to Thick/Thin within the web client. Notice that Type says “As defined in the VM storage Policy”.

Here is the default policy assigned to this VM. Notice that there is nothing in my rules that would define thick/thin or anything between.

I bet you are now thinking, “What does the Fat Client say”. Well, I am glad you asked. I have a VM on the VSAN datastore with a 40GB Thick Provision Eager Zeroed HD1.

What does that look like for storage usage? I will show both from fat and web client. I think there is a bug in the web client that I will discuss shortly. I bet you are asking why it is showing 80GB Used storage. Remember my storage policy? It is set for raid 1, which will mirror the data. Keep that in mind if you will be using raid 1 with VSAN. The numbers all seem to jibe.


Hey, let’s add a second disk. Let’s make it 100GB Thick Eager Zero. After adding the disk I went into Windows and onlined/initialized the disk. After creating the disk in Windows…these were the results. This is where I think there is a VMware bug. If you look at the second image, the storage usage in the web client at the VM level never changes. Comparing some numbers: before, the Used total was 591.73GB and this increased to 794.68GB used. This is a change of 203GB. Free space went from 6.04TB to 5.84TB, which is a change of 200GB. The numbers look like what we would expect.

Now let’s have some fun! Time to change the Default VSAN Storage Policy. Go to Home → VM Storage Policies. Highlight the policy you want to change and then click the little pencil icon to edit. Click the Add Rule drop down at the bottom and choose “Object space reservation (%)”. I chose the default of 0. This means that any disks that have the default storage policy assigned to them will essentially be thin provisioned. Space will not be consumed until something is actually added to the drive. I chose to apply to all VMs now (I only have one). This might take some time if you have a lot of VMs that will change.


You should now be back at the Storage Policy Screen. I want to make sure that policy applied. I clicked on my default storage policy. Once in the policy, I clicked the “Check Compliance” button. On the right side I see “Compliant 1”. Just to make sure this applied to all disks on the VM (you can have separate storage policies apply to different disks) I went back to my VM → Manage → Policies. Notice all disks are compliant.


What does this all mean for my space? Let’s break it down. Used total was 794.68GB and now is 514.68GB. This is a change of exactly 280GB! Free space went from 5.84TB free to 6.11TB. This is a change of 270GB. Look at the used space in the datastore. Notice also that the VM now shows provisioned storage of 144GB and used of 36.43GB.
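
The 280GB lines up with a simple back-of-the-envelope model, sketched here in Python.  It assumes raid 1 keeps two copies of each object and that the reservation is the only thing consuming space up front; the function name and the `reservation_pct` parameter are mine, and real VSAN accounting has more moving parts than this.

```python
def raid1_reserved_gb(disk_sizes_gb, reservation_pct, copies=2):
    """Space reserved up front for the disks under a mirrored policy (simplified model)."""
    return sum(disk_sizes_gb) * copies * reservation_pct / 100

disks_gb = [40, 100]  # the two virtual disks from this post

thick = raid1_reserved_gb(disks_gb, reservation_pct=100)  # disks as first created
thin = raid1_reserved_gb(disks_gb, reservation_pct=0)     # after the policy change
print("expected space handed back (GB):", thick - thin)

# Compare with the datastore numbers quoted above.
used_before_gb, used_after_gb = 794.68, 514.68
print("observed change (GB):", round(used_before_gb - used_after_gb, 2))
```

The same model explains the earlier screenshots: a 40GB thick disk mirrored twice shows 80GB, and adding the 100GB disk adds roughly 200GB.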


Now for the interesting part. Let’s look at the fat client. Notice that the disks still show that they are Thick Provision Eager Zeroed, but because of the storage policy, they really are not.

In conclusion, the storage policy wins…even though the Fat Client doesn’t seem to know that. Please let me know if you have any questions or want me to test anything else with this scenario.

Creating Cisco UCS port channels and then assigning a VLAN

In my new position I am learning a lot about UCS. Today I had to create a port channel for both A and B fabrics and then assign a VLAN to both.

  1. Log into Cisco UCS Manager.
  2. Click on the LAN tab and then Plus sign next to Fabric A.
  3. Right click Port-Channels and select Create Port Channel.
  4. Give the port channel an ID and a Name.
  5. Select which ports are going to be used in this port channel group. Make sure you hit the >> button to move them over to be in the port channel. Click Finish.
  6. Repeat steps 2-5 for Fabric B.
  7. Click LAN in the left navigation window and then at the bottom click on LAN Uplinks Manager.
  8. Click the VLAN tab and then click the VLAN Manager tab. In the left navigation pane, select the port channel that you created earlier. In my case I am using port channel 23. In the right window, check the box next to each VLAN that you want to be part of this port channel. Click the Add to VLAN/VLAN Group box at the bottom of the screen.

    That should be it. You have now created a port channel and assigned a VLAN to it!

VMK0 MAC Change

I ran into an issue the other day in a UCS blade system where vmk0 had the same MAC address on all of the hosts. See VMware KB https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1031111
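
If you want to check a whole cluster for this, a few lines of Python will flag any MAC that shows up on more than one host.  This is just a sketch: the host names and MAC addresses are made up, and it assumes you have already collected each host’s vmk0 MAC yourself (for example from the output of esxcfg-vmknic -l on each host).

```python
from collections import defaultdict

def duplicate_macs(vmk0_by_host: dict[str, str]) -> dict[str, list[str]]:
    """Group hosts by vmk0 MAC and return only MACs shared by more than one host."""
    groups = defaultdict(list)
    for host, mac in vmk0_by_host.items():
        groups[mac.lower()].append(host)  # compare case-insensitively
    return {mac: sorted(hosts) for mac, hosts in groups.items() if len(hosts) > 1}

# Hypothetical inventory for illustration only.
inventory = {
    "esx01": "00:50:56:AA:BB:01",
    "esx02": "00:50:56:aa:bb:01",
    "esx03": "00:50:56:aa:bb:03",
}

for mac, hosts in duplicate_macs(inventory).items():
    print(f"{mac} is shared by: {', '.join(hosts)}")
```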

I had to remove the vmk0 port group (removes connectivity) and then I had to add this back through a KVM connection to the host. This is how I did it. Make sure you save the IP Information before making any changes.

VMK0 MAC Before

  1. Open an iLO or KVM session to your host. Under troubleshooting choose the “Enable ESXi Shell” option. Once enabled, press Alt-F1 on the keyboard and you should be at a login prompt. Log in with root and your root password.
  2. I am going to use my port group name of “ESXi_MGMT” as an example. Yours might be different. Type esxcfg-vmknic -d -p ESXi_MGMT . This will remove vmk0.
  3. Type esxcfg-vmknic -a -i <management IP> -n <netmask> -p ESXi_MGMT. This will add the vmkernel back.

VMK0 MAC After Change

One important thing to note. Notice that Management traffic is no longer enabled on my vmk0 connection. You must edit this connection and check the box for management traffic. VMware will automatically move management to another vmk so make sure you go through and remove it from that vmkernel.

Storage vMotion Folder Rename

I ran into a project the other day where the VM names in vCenter did not match up with the Windows hostname of the VM.  The VMware administrator was fixing this by shutting down the original machine and then cloning it.  The problem is that he would then have to do some configuration, and the name on the datastore would still be wrong: because a folder with that VM name already exists, a _1 gets appended to the name.  The easiest way to change the name on your datastore is to do a Storage vMotion.  In my example I created a VM named “Original” and then changed the name to “NewName”.  I will show what happened along the way.

What does the original datastore look like?


Now I renamed the VM from “Original” to “NewName”.  Notice that on the datastore the folder and files still use the “Original” name.


Time to do a Storage vMotion.  For my example I am using a powered off VM for simplicity.  The full C# VIC will not let you do a live Storage vMotion of a VM; however, you can use the Web VIC to accomplish this.


Now the “NewName” VM is on a new host.  Looking at the Datastore, we see both the folder and file names have all been changed.

VMworld 2015 Schedule Builder is now live!

Time to sign up for the Ask the Expert vBloggers session first! #VMworld 2015 Schedule Builder is now live!

Search from over 400 unique sessions, sign up early for your favorites, and view customized recommendations based on your attendee profile! Are you ready for any?


VMware Advocacy

Who’s ready for the VMworld Party Already…

Looks like VMware is heading back to AT&T Park this year.

Join the VMworld community at AT&T Park, home of the San Francisco Giants, for an evening of fun, feasting, and fantastic entertainment! VMworld will transform the park with exciting carnival rides and interactive midway games with the stunning backdrop of the San Francisco Bay. The party starts Wednesday, September 2 at 7:30PM.


VMware Advocacy

VMworld 2015 Content Catalog is now live!

What sessions will you attend? The #VMworld 2015 Content Catalog is now live!

The Content Catalog allows prospective VMworld attendees access to the VMworld agenda, with the ability to peruse breakouts and note sessions of interest. You can search and filter to your heart’s content—by track, category, session format, industry, role, technical level, speaker name, location (US or Europe), and keyword search. You cannot schedule sessions in the catalog.


VMware Advocacy