LACP, Yeah You Know Me

Traditionally when configuring vCenter and ESXi, most customers go the default networking route with Management, vMotion, and vSAN port groups sitting on a vSphere Distributed Switch (vDS).  Usually, I see customers configure for active/active or active/standby.  When using active/standby, I usually see Management and vMotion with Uplink 1 active and Uplink 2 standby, and then Uplink 2 active and Uplink 1 standby for vSAN traffic. 

Configuring for active/active doesn’t really use both NICs equally and is not a great load-balancing technique.  The active/standby deployment described above usually works well, but I usually see the vSAN link carrying significantly more traffic than the Management/vMotion link. 

What if I told you that there was a better way?  Something that lets you use all the available bandwidth while still providing redundancy in case of a failure.  Well Hello Nur…errr…LACP.

First, I think it is critical to understand the acronyms and terminology:

LACP – Link Aggregation Control Protocol – Allows multiple ports to be bundled into a single combined interface that provides load balancing and redundancy.  An example would be 2x25G ports being bundled together to form a 50G pipe.  This is the overall technology, but different vendors use different naming for how they do this.

  • LAG = Port-Channel (PC) – One vendor uses Link Aggregation Groups (LAGs) and one uses Port-Channels.  These are the same thing.  This allows multiple ports on the same switch to be bundled together to form the higher-bandwidth interface. 

  • MLAG = vPC – Multi-Chassis Link Aggregation (MLAG) and Virtual Port Channel (vPC) are both doing the same thing.  They allow you to take ports from two separate switches and combine them together to get the higher-bandwidth interface.  This provides increased redundancy, where losing a switch will only take down one of the two bonded interfaces. 

Now that we understand the acronyms, let’s get this configured.  We are going to begin where many customers are today, with their production system up and running on a vSphere Distributed Switch (vDS).  Teaming and failover for the Management and vMotion port groups is configured with Uplink 1 active and Uplink 2 in standby, while the vSAN port group is configured for Uplink 2 active and Uplink 1 standby. 

The first change we need to make is to change the failover order for all the port groups to an active/standby model with Uplink 1 active and Uplink 2 standby.  The easiest way to do this is to right-click on your vDS, choose “Distributed Port Group” and then click “Manage Distributed Port Groups”.

Select “Teaming and failover” and then click “NEXT”


Select all your port groups and click “NEXT”.


Make sure that Uplink 1 is active and Uplink 2 is standby, then click “NEXT”.  Review your changes and click “FINISH”.

My lab has two 25G cables running between two Cisco 93180 switches that are configured for vPC; ports 47/48 on each switch are bundled together to form a 50G peer link between the switches.  You must have the vPC peer link between the switches set up and configured if you are going to use MLAG/vPC. 

For information on configuring vPC on Cisco switches, please see Configuring vPCs.
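
For reference, a minimal vPC peer setup on each Nexus switch looks something like the sketch below. The port-channel number (161) and ports 47/48 match my lab; the vPC domain ID and the peer-keepalive addresses are placeholders you would swap for your own, and you should follow the Cisco guide above for the full best-practice configuration.

feature lacp
feature vpc

vpc domain 161
  ! heartbeat between the two switches, usually over the mgmt0 network
  peer-keepalive destination <peer-mgmt-ip> source <local-mgmt-ip>

interface port-channel161
  ! the 2x25G inter-switch bundle on ports 47/48 becomes the vPC peer link
  switchport mode trunk
  vpc peer-link

interface ethernet 1/47-48
  switchport mode trunk
  channel-group 161 mode active

Once both switches are configured, “show vpc brief” is a quick way to confirm that the peer link and peer keepalive are up.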

Using the command “sh port-channel summary” we can see that the vPC link (port-channel 161) is active and communicating properly between the switches. 

Now that we have verified that the two switches are talking to each other, we are going to look at the host networking to the switches.  Each ESXi host has two 25G connections, one running to each switch.  In my lab the hosts are using ports 1-8 on each switch.  In the configuration below, notice that the only difference between hosts is the description; I like the description to show that there are two ports tied to each LAG.

On switch A we are going to create the port-channels, but we are not going to add eth1/1-8 to them yet because that would break communication.  These switchports are cabled to the hypervisors’ vmnics associated with Uplink 1 (currently active on our vDS port groups and currently handling all traffic in and out of the hosts).  You would run the same commands on both switches, changing the descriptions if desired.

Interface po201
Description esx01-LAG1-p1
vpc 201

Interface eth1/1
switchport mode trunk
mtu 9216
Description esx01-LAG1-p1

Interface po202
Description esx02-LAG1-p1
vpc 202

Interface eth1/2
switchport mode trunk
mtu 9216
Description esx02-LAG1-p1

Interface po203
Description esx03-LAG1-p1
vpc 203

Interface eth1/3
switchport mode trunk
mtu 9216
Description esx03-LAG1-p1

Interface po204
Description esx04-LAG1-p1
vpc 204

Interface eth1/4
switchport mode trunk
mtu 9216
Description esx04-LAG1-p1

Interface po205
Description esx05-LAG1-p1
vpc 205

Interface eth1/5
switchport mode trunk
mtu 9216
Description esx05-LAG1-p1

Interface po206
Description esx06-LAG1-p1
vpc 206

Interface eth1/6
switchport mode trunk
mtu 9216
Description esx06-LAG1-p1

Interface po207
Description esx07-LAG1-p1
vpc 207

Interface eth1/7
switchport mode trunk
mtu 9216
Description esx07-LAG1-p1

Interface po208
Description esx08-LAG1-p1
vpc 208

Interface eth1/8
switchport mode trunk
mtu 9216
Description esx08-LAG1-p1

Running “sh port-channel summary” shows that I created all my port channels, but none contain member ports yet.

On switch B only, where these switchports are cabled to the hypervisors’ vmnics associated with Uplink 2 (currently standby on our vDS port groups), we are now going to add the Ethernet interfaces to the port-channel groups. 

SWITCH B ONLY

Interface eth1/1
channel-group 201 mode active

Interface eth1/2
channel-group 202 mode active

Interface eth1/3
channel-group 203 mode active

Interface eth1/4
channel-group 204 mode active

Interface eth1/5
channel-group 205 mode active

Interface eth1/6
channel-group 206 mode active

Interface eth1/7
channel-group 207 mode active

Interface eth1/8
channel-group 208 mode active

Running “sh port-channel summary” now shows that we have ports connected to this vPC on switch B, but they are in a suspended state.  In vCenter, we are now going to create a LAG on the vDS and move Uplink 2 to it.

Now we are going to create the LACP group.  Click on your vDS, then click “Configure” and then click “LACP” in the settings pane.  You will see the “New Link Aggregation Group” window.  Change the name as needed, take the defaults for the rest, and then click “OK”.

Click “MIGRATING NETWORK TRAFFIC TO LAGS”, then click “MANAGE DISTRIBUTED PORT GROUPS”.

Select “Teaming and failover” and click “NEXT”.

Select your port groups then click “NEXT”.

Move Uplink 2 to unused and LAG1 to standby and click “NEXT”.  You will get a notification that using standalone uplinks and a standby LAG should only be temporary while migrating to the LAG.  Click “OK”.

Now we are going to migrate our Uplink 2 to LAG1.  Click “MIGRATING NETWORK TRAFFIC TO LAGS” then click “ADD AND MANAGE HOSTS”.

Click “Manage host networking” then click “NEXT”.


Select your hosts and then click “NEXT”.

Select the vmnic3 uplink (your vmnics might be different) and change it to “LAG1-1”, then click “NEXT”.

We don’t need to make changes to our VMkernel adapters because they already live on the vDS and this LAG group is part of that.  Click “NEXT”.

We are not migrating any VM networking so click “NEXT” then click “FINISH” on the Ready to complete screen.

We currently have vmnic2 on Uplink 1 of our vDS using the standard configuration, and vmnic3 attached to LAG1.  Next, we need to move vmnic2 Uplink 1 to the MLAG.  Click “ADD AND MANAGE HOSTS”.

Click “Manage host networking” then click “NEXT”.

Select your hosts and then click “NEXT”.

Select vmnic2 Uplink 1 and change it to “LAG1-0”, then click “NEXT”.

We don’t need to make changes to our VMkernel adapters because they already live on the vDS and this LAG group is part of that.  Click “NEXT”.

We are not migrating any VM networking so click “NEXT” then click “FINISH” on the Ready to complete screen.

We now have both Uplinks on the MLAG.  Now we need to configure our ports on switch A to be part of the port-channel groups.  SSH into switch A and run the following. 

SWITCH A ONLY

Interface eth1/1
channel-group 201 mode active

Interface eth1/2
channel-group 202 mode active

Interface eth1/3
channel-group 203 mode active

Interface eth1/4
channel-group 204 mode active

Interface eth1/5
channel-group 205 mode active

Interface eth1/6
channel-group 206 mode active

Interface eth1/7
channel-group 207 mode active

Interface eth1/8
channel-group 208 mode active

Running “sh port-channel summary”, we now see both sides of our vPC with a status of “P”, which means “Up in port-channel”.
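
You can also sanity-check the LACP negotiation from the ESXi side. From an SSH session to any of the hosts, the following should list the LAG, its load-balancing mode, and the state of each member vmnic:

esxcli network vswitch dvs vmware lacp config get
esxcli network vswitch dvs vmware lacp status get

If a vmnic shows up as not bundled, double-check that its switch port was actually added to the right channel-group.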

We have one final step and that is to make the MLAG the only active connection for teaming and failover.  Click “MANAGE DISTRIBUTED PORT GROUPS”.

Select “Teaming and failover” and click “NEXT”.

Select your port groups then click “NEXT”.

Move both Uplink 1 and Uplink 2 to Unused and move LAG1 to Active and click “NEXT” then “FINISH”.

For those with short attention spans: the TL;DR.

Steps:

  1. Make sure that the vPC peer link is configured based on best practice from your switch vendor.
  2. Change vDS port groups to use Uplink 1 active and Uplink 2 standby.
  3. Configure port-channels on both switch A and switch B.
  4. Configure ethernet ports on switch A and switch B for ESXi hosts.
  5. Add only switch B ethernet ports to port-channel groups.
  6. Create LAG in vCenter.
  7. Edit port groups and make LAG1 standby and make Uplink 2 unused.
  8. Move vmnic3 from Uplink 2 to LAG1-1 and validate port-channel configuration.
  9. Move vmnic2 from Uplink 1 to LAG1-0.
  10. Add switch A ethernet ports to port-channel groups.
  11. Move LAG1 to active and Uplink 1 and Uplink 2 to unused for all port groups.
  12. Profit.


Deploying vRealize Suite in VCF 4.x with VLAN-Backed Networks

When deploying VMware Cloud Foundation (VCF) I can’t recommend enough that you deploy with BGP/AVNs. This will make your life easier in the future when deploying the vRealize Suite as well as making deployment and administration easier for Tanzu. What happens though if you can’t get your network team to support BGP? This is where VLAN-Backed networks come in.

First, we should start downloading the bundles for the vRealize Suite. Within SDDC Manager, go to Lifecycle Management–>Bundle Management and download all of the vRealize bundles. Now that we have the bundles downloading, let’s move on to the vRealize Suite tab.

Usually the first indication that BGP/AVNs were not deployed comes from the vRealize deployment screen. Notice that the “Deploy” button is greyed out at the bottom, with a message saying that the deployment isn’t available because there is no “X-Region Application Virtual Network”.

No problem, we are going to follow VMware KB 80864 to create two edge nodes and add our VLAN-Backed networks to SDDC Manager. When looking at the KB you will notice three attachments. The first thing we want to do is open the validated design PDF.

First we are going to need some networks created and configured. Part of the workflow will deploy a Tier-0 gateway where the external uplinks are added. If you are not planning on using the Tier-0 gateway for other use cases (Tanzu), then the VLAN IDs and IP addresses you enter for the uplink networks do not need to exist in your environment. In my experience it is better to create all of these VLANs and subnets in case you want to use them later. The uplink networks don’t have to be /24s; you should be able to use something smaller like a /27 or /28. Every edge node will use two IPs for the overlay network. The edge overlay VLAN/subnet needs to be able to talk to the host overlay VLAN/subnet defined when you deployed VCF.

Next we have the networks that are going to be used by the vRealize Suite. Those two networks are cross-region and region-specific. The cross-region network is used for vRSLCM, vROPs, vRA, and Workspace ONE. The region-specific network is used for Log Insight, the region-specific Workspace ONE, and the remote collectors for vROPs.

Now that we have all of our network information, we need to copy the JSON example from pages 11/12 into a text editor like Notepad++, or copy from the code below. This code is what we are going to use to deploy our edge VMs. Make sure DNS entries are created for both edge nodes and the edge cluster VIP. The management IPs should be on the same network as the SDDC Manager and vCenter deployed by VCF. Make the necessary changes using the networks discussed previously. In the next step we will get the cluster ID.

{
 "edgeClusterName":"sfo-m01-ec01",
 "edgeClusterType":"NSX-T",
 "edgeRootPassword":"edge_root_password",
 "edgeAdminPassword":"edge_admin_password",
 "edgeAuditPassword":"edge_audit_password",
 "edgeFormFactor":"MEDIUM",
 "tier0ServicesHighAvailability":"ACTIVE_ACTIVE",
 "mtu":9000,
 "tier0RoutingType":"STATIC",
 "tier0Name": "sfo-m01-ec01-t0-gw01",
 "tier1Name": "sfo-m01-ec01-t1-gw01",
 "edgeClusterProfileType": "CUSTOM",
 "edgeClusterProfileSpec":
   { "bfdAllowedHop": 255,
      "bfdDeclareDeadMultiple": 3,
      "bfdProbeInterval": 1000,
      "edgeClusterProfileName": "sfo-m01-ecp01",
      "standbyRelocationThreshold": 30
 },
 "edgeNodeSpecs":[
 {
 "edgeNodeName":"sfo-m01-en01.sfo.rainpole.io",
 "managementIP":"172.16.11.69/24",
 "managementGateway":"172.16.11.253",
 "edgeTepGateway":"172.16.19.253",
 "edgeTep1IP":"172.16.19.2/24",
 "edgeTep2IP":"172.16.19.3/24",
 "edgeTepVlan":"1619",
 "clusterId":"<!REPLACE WITH sfo-m01-cl01 CLUSTER ID !>",
 "interRackCluster": "false",
 "uplinkNetwork":[
      {
      "uplinkVlan":1617,
      "uplinkInterfaceIP":"172.16.17.2/24"
      },
      {
      "uplinkVlan":1618,
      "uplinkInterfaceIP":"172.16.18.2/24"
 }
 ]
 },
 {
 "edgeNodeName":"sfo-m01-en02.sfo.rainpole.io",
 "managementIP":"172.16.11.70/24",
 "managementGateway":"172.16.11.253",
 "edgeTepGateway":"172.16.19.253",
 "edgeTep1IP":"172.16.19.4/24",
 "edgeTep2IP":"172.16.19.5/24",
 "edgeTepVlan":"1619",
 "clusterId":"<!REPLACE WITH sfo-m01-cl01 CLUSTER ID !>",

 "interRackCluster": "false",
 "uplinkNetwork":[
 {
      "uplinkVlan":1617,
      "uplinkInterfaceIP":"172.16.17.3/24"
 },
 {
      "uplinkVlan":1618,
      "uplinkInterfaceIP":"172.16.18.3/24"
 }
 ]
 }
 ]
}

Within SDDC Manager, on the navigation menu select “Developer Center” then click “API Explorer“. Expand “APIs for managing Clusters“. Click “GET /v1/clusters“, and click “Execute“. Copy the cluster ID into the JSON we were working on where it says to replace.
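
If you prefer the command line over the API Explorer, the same lookup can be done with curl from any machine that can reach SDDC Manager. This is just a sketch based on the VCF 4.x public API; the FQDN, credentials, and the jq filtering are examples, so adjust them for your environment.

# request an API access token from SDDC Manager
TOKEN=$(curl -sk -X POST https://sddc-manager.vcf.sddc.lab/v1/tokens \
  -H "Content-Type: application/json" \
  -d '{"username": "administrator@vsphere.local", "password": "REPLACE_ME"}' | jq -r '.accessToken')

# list the clusters and note the id of the management cluster
curl -sk -H "Authorization: Bearer $TOKEN" \
  https://sddc-manager.vcf.sddc.lab/v1/clusters | jq '.elements[] | {name, id}'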

Now expand “APIs for managing NSX-T Edge Clusters“. Click “POST /v1/edge-clusters/validations“. Copy the contents of the JSON file we created and paste them into the “Value” text box, then click “Execute“.

After executing, copy the “ID of the validation“.

Now that we have the validation ID we are going to see if the validation was successful. Expand “APIs for managing NSX-T Edge Clusters” and click “GET /v1/edge-clusters/validations/{id}“. We want to verify that the validation shows “SUCCEEDED“.

Great! Now we are ready to deploy. Expand “APIs for managing NSX-T Edge Clusters” and click “POST /v1/edge-clusters“. In the Value text box paste the validated JSON file contents and click “Execute“. We now see the edge nodes deploying and can follow the workflow in the Tasks pane from within SDDC Manager.
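
The validation and deployment calls can be scripted the same way if you would rather not paste JSON into the browser. Again, this is only a sketch; it reuses the token from the earlier curl example and assumes the spec was saved as edge-cluster.json.

# submit the edge cluster spec for validation and note the returned validation id
curl -sk -X POST https://sddc-manager.vcf.sddc.lab/v1/edge-clusters/validations \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d @edge-cluster.json | jq '.id'

# poll the validation until it shows SUCCEEDED
curl -sk -H "Authorization: Bearer $TOKEN" \
  https://sddc-manager.vcf.sddc.lab/v1/edge-clusters/validations/<validation-id> | jq .

# once validated, kick off the actual deployment
curl -sk -X POST https://sddc-manager.vcf.sddc.lab/v1/edge-clusters \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d @edge-cluster.json | jq .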

While the edge nodes deploy, we are going to create a transport zone and apply it to our hosts; we just have to wait for the edge node creation task to finish before we can attach it to the edge nodes and create the new segments. In a web browser, log into the NSX-T Manager for the management domain. Once logged in, navigate to “System–>Fabric–>Transport Zones“. Click “+Add Zone” and create a new transport zone for the vRealize Suite.

Now that we have our transport zone we need to add it to both our hosts and edge nodes. Navigate to “System–>Fabric–>Nodes“. Drop down the “Managed by” menu and select your management vCenter. Click on “Host Transport Nodes” if not already selected and then click each individual server, then from the “Actions” drop down select “Manage Transport Zones“. In the “Transport Zone” drop down select the transport zone we created earlier and then click “Add“.

Next, make sure that the edge node creation completed successfully. Once we see that successful task, we need to go to “System–>Fabric–>Nodes“. From the “Managed by” menu drop down select your management vCenter. Click “Edge Transport Nodes“. Check the box for both edge nodes and then from the “Actions” menu select “Manage Transport Zones“. From the “Transport Zone” drop down select the new transport zone we created and click “Add“.

We only have one thing left to do within NSX Manager. We need to create the segments that will be used by SDDC Manager for the vRealize Suite. Navigate to “Networking–>Segments“. We are going to create two new segments. Click “Add Segment” and put the appropriate information in for the cross region network and then click “Save“. When prompted to continue configuring the segment click “No“.

Click “Add Segment” and put the appropriate information in for the region specific network and then click “Save“. When prompted to continue configuring the segment click “No“.

We are in the homestretch! The final piece of the puzzle is telling SDDC Manager about these new networks. Download the config.ini file and the avn-ingestion-v2 script (the other two attachments from the KB) to your computer. Make the appropriate changes to the config.ini (see the example below).

[REGION_A_AVN_SECTION]
name=REPLACE_WITH_FRIENDLY_NAME_FOR_vRLI_NETWORK
subnet=REPLACE_WITH_SUBNET_FOR_vRLI_NETWORK
subnetMask=REPLACE_WITH_SUBNET_MASK_FOR_vRLI_NETWORK
gateway=REPLACE_WITH_GATEWAY_FOR_vRLI_NETWORK
mtu=REPLACE_WITH_MTU_FOR_vRLI_NETWORK
portGroupName=REPLACE_WITH_VCENTER_PORTGROUP_FOR_vRLI_NETWORK
domainName=REPLACE_WITH_DNS_DOMAIN_FOR_vRLI_NETWORK
vlanId=REPLACE_WITH_VLAN_ID_FOR_vRLI_NETWORK
[REGION_X_AVN_SECTION]
name=REPLACE_WITH_FRIENDLY_NAME_FOR_vRSLCM_vROPs_vRA_NETWORK
subnet=REPLACE_WITH_SUBNET_FOR_vRSLCM_vROPs_vRA_NETWORK
subnetMask=REPLACE_WITH_SUBNET_MASK_FOR_vRSLCM_vROPs_vRA_NETWORK
gateway=REPLACE_WITH_GATEWAY_FOR_vRSLCM_vROPs_vRA_NETWORK
mtu=REPLACE_WITH_MTU_FOR_vRSLCM_vROPs_vRA_NETWORK
portGroupName=REPLACE_WITH_VCENTER_PORTGROUP_FOR_vRSLCM_vROPs_vRA_NETWORK
domainName=REPLACE_WITH_DNS_DOMAIN_FOR_vRSLCM_vROPs_vRA_NETWORK
vlanId=REPLACE_WITH_VLAN_ID_FOR_vRSLCM_vROPs_vRA_NETWORK

Here is my completed config.ini:

[REGION_A_AVN_SECTION]
name=areg-seg-1631
subnet=192.168.100.128
subnetMask=255.255.255.192
gateway=192.168.100.129
mtu=9000
portGroupName=areg-seg-1631
domainName=corp.com
vlanId=1631

[REGION_X_AVN_SECTION]
name=xreg-seg-1632
subnet=192.168.100.192
subnetMask=255.255.255.192
gateway=192.168.100.193
mtu=9000
portGroupName=xreg-seg-1632
domainName=corp.com
vlanId=1632

Using a transfer utility (I used WinSCP), transfer both the config.ini and the avn-ingestion-v2 file to the SDDC Manager. I placed mine in /tmp. Next, SSH into your SDDC Manager, log in as “vcf”, and then type su and enter the root password to elevate to root. Change directory to /tmp and then run the following to change ownership and permissions:

chmod 777 config.ini
chmod 777 avn-ingestion-v2.py
chown root:root config.ini
chown root:root avn-ingestion-v2.py

From our PuTTY session we will ingest the config.ini file into SDDC Manager. Use the following to accomplish this:

python avn-ingestion-v2.py --config config.ini

#Other Options:
# --dryrun (will validate the config.ini but won't commit the changes). 
# --erase (will clean up the AVN data in SDDC Manager)

Next, we need to tell the domain manager which edge cluster to use by editing its properties file. Note that the file location changed in 4.2.1.

#4.1/4.2
vi /opt/vmware/vcf/domainmanager/config/application-prod.properties

#4.2.1
vi /etc/vmware/vcf/domainmanager/application-prod.properties

Add the following line replacing “sfo-m01-ec01” with your edge cluster name and then save.

override.edge.cluster.name=sfo-m01-ec01

The last thing to do is restart the domainmanager service.

systemctl restart domainmanager
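
You can confirm the service came back up before heading back to the UI:

systemctl status domainmanager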

Success!! We are ready to deploy the vRealize Suite!

VCF 4.x Offline Bundle Transfer

I have recently had to do offline bundle transfers to bring updates into dark sites that could not pull the updates down automatically through SDDC Manager. One thing to note is that I am doing these steps on a Windows box, and some of the commands might change slightly on Linux.

First, download the “Bundle Transfer Utility & Skip Level Upgrade Tool” from my.vmware.com. This tool can be found under the VMware Cloud Foundation section. Once downloaded, extract the files into a folder on the computer that will be used to download the updates. In my example I will be using c:\offlinebundle. After extracting you should see a bin, conf, and lib folder. You will also need to make sure you have both a Windows transfer utility such as WinSCP and an SSH client such as PuTTY.

4.2 ONLY!!!

In 4.2 there is a manifest file that must be downloaded from VMware and then uploaded to the SDDC Manager before moving to the next steps.

From your Windows machine, open an administrative command prompt and run the following to download the 4.2 manifest file. Note that you will have to change the username and password to your my.vmware.com credentials.

cd c:\offlinebundle\bin
lcm-bundle-transfer-util --download -manifestDownload --outputDirectory c:\offlinebundle --depotUser user@vmware.com -depotUserPassword userpass

This creates a file called “lcmManifestv1.json”.

Next we use WinSCP to transfer the lcmManifestv1.json to the SDDC Manager. I put all of the files in /home/vcf. When logging into the SDDC Manager you will be using the account “vcf” and whatever password you configured for that account during deployment.

Once transferred, right-click on the lcmManifestv1.json file and go to Properties. Change the Permissions section to an octal value of 7777. The other way you could do that is from the SDDC Manager using PuTTY with the following command:

chmod 7777 /home/vcf/lcmManifestv1.json

Once transferred to SDDC Manager, we need to ingest this manifest file into the manager. Using PuTTY, log into the SDDC Manager with the username “vcf”. Once logged in, do the following (the FQDN will need to be updated with yours):

cd /opt/vmware/vcf/lcm/lcm-tools/bin
./lcm-bundle-transfer-util --update --sourceManifestDirectory /home/vcf/ --sddcMgrFqdn sddc-manager.vcf.sddc.lab --sddcMgrUser administrator@vsphere.local

All 4.x Versions

If you are running 4.0 or 4.1, this is where you want to begin your offline bundle journey.

PuTTY into your SDDC Manager VM if you have not already, then run:

cd /opt/vmware/vcf/lcm/lcm-tools/bin
./lcm-bundle-transfer-util --generateMarker

These marker files will be created in /home/vcf. Using WinSCP, move the files from SDDC Manager to your Windows c:\offlinebundle directory.

From your Windows admin prompt we are now going to download the bundles:

cd c:\offlinebundle\bin
lcm-bundle-transfer-util -download -outputDirectory c:\offlinebundle -depotUser user@vmware.com -markerFile c:\offlinebundle\markerFile -markerMd5File c:\offlinebundle\markerFile.md5

Notice all of the bundles available. If you only want a specific product version, you would add -p (version) to the above command. I just selected all 18 by pressing “y”.

The bundles will be downloaded first to a temp directory under c:\offlinebundle and then will eventually be in the c:\offlinebundle\bundles folder.

Next, using WinSCP, transfer the entire c:\offlinebundle folder up to SDDC Manager into the /nfs/vmware/vcf/nfs-mount/ directory. When complete it should look like this:

We need to change the permissions on this folder. You can either right-click on /nfs/vmware/vcf/nfs-mount/offlinebundle, go to Properties, and change the octal value to 7777, or from PuTTY:

cd /nfs/vmware/vcf/nfs-mount
chmod -R 7777 offlinebundle/

The final step is to ingest the bundles into SDDC Manager. We do that with the following:

cd /opt/vmware/vcf/lcm/lcm-tools/bin
./lcm-bundle-transfer-util -upload -bundleDirectory /nfs/vmware/vcf/nfs-mount/offlinebundle/

If you now log into the SDDC Manager GUI you will see the bundles start to be ingested. Once complete, you should be able to update your environment as needed.

VCF Lab Constructor (VLC), used differently.

It has been a long time since I posted anything new; I will be looking to get back to posting consistently in 2021. I have been using VLC for almost a year now and it has come a long way since its humble beginnings. I have deployed many different VMware Cloud Foundation (VCF) environments using this tool, but I also use it to quickly deploy virtual hosts for other testing. For my example today I will be deploying 3 hosts, adding them to a cluster, and then turning on HCI Mesh (Datastore Sharing) to use the storage of my physical vSAN cluster.

First, you will need to sign up and download VLC from http://tiny.cc/getVLCBits; the build I am using is for VCF 4.1. After signing up you will also get information about joining the #VLC-Support Slack channel. If you have any issues with VLC, this is a great place to get quick answers. You will also need to download whatever ESXi version you will be using from myvmware.com. In my example I will be using 7.0.1 (16850804). After downloading VLC, unzip it to your C:\ drive. Follow the install guide in the zip file to create the vDS and port group used for VLC.

Using your favorite editor, edit the “add_3_hosts.json” file. Change the names as you see fit for each host. You can increase or decrease the CPU, memory, and disks being added to these VMs. Set the management IP information as well. I have included the code for my installation. Once complete, save this file.

{
    "genVM": [
      {
        "name": "esxi-1",
        "cpus": 4,
        "mem": 16,
        "disks": "10,10,50,50,50",
        "mgmtip": "192.168.1.60",
        "subnetmask": "255.255.255.0",
        "ipgw": "192.168.1.1"
      },
      {
        "name": "esxi-2",
        "cpus": 4,
        "mem": 16,
        "disks": "10,10,50,50,50",
        "mgmtip": "192.168.1.61",
        "subnetmask": "255.255.255.0",
        "ipgw": "192.168.1.1"
      },
      {
        "name": "esxi-3",
        "cpus": 4,
        "mem": 16,
        "disks": "10,10,50,50,50",
        "mgmtip": "192.168.1.62",
        "subnetmask": "255.255.255.0",
        "ipgw": "192.168.1.1"
      }
    ]
}

Next, right-click on the PowerShell script “VLCGui” and then select “Run with PowerShell”.

The Lab Constructor GUI will appear. Choose the “Expansion Pack!” option.

Input your main VLAN ID, then click on the “Addtl Hosts JSON” box and select the “add_3_hosts.json” we edited earlier. Click on the ESXi ISO location and choose the ISO that you should have downloaded earlier…you didn’t skip ahead, did you? Input your password as well as NTP, DNS, and domain information. On the right side of the window input your vCenter credentials and hit Connect. Once connected it will show you which clusters, networks, and datastores are supported. The cluster I wanted to use (lab-cl1) was not showing up; this was because I had vSphere HA enabled.

Once I turned off HA on the cluster, it appeared for me to select. I chose my VLC network as well and my physical vSAN datastore “vsanDatastore”. My VLC network is configured for trunking and has Promiscuous mode, MAC address changes, and Forged transmits all set to “Accept”. Click “Validate” and then click “Construct”.

You will see PowerShell start to deploy the ESXi hosts. You can monitor vCenter until complete; total time to build 3 hosts was just under 10 minutes.

Now create a new cluster and add these three hosts to the cluster. When completed you will have 3 hosts in a new cluster that are all in maintenance mode.

We now enable vSAN on the cluster by right-clicking on the cluster and choosing Settings–>vSAN–>Services–>Configure. I went with the default options and did not choose any disks to be consumed for vSAN, so my vSAN datastore shows 0 capacity and 0 free space. We will use Quickstart to configure the hosts further. If I enable vSAN and then try to use Datastore Sharing, it won’t let me configure it because the required vSAN networking is not configured yet.

Click on your cluster–>Configure–>Quickstart. In the Quickstart configuration view you should see 3 boxes; in the box on the right, click the “Configure” button. We first configure our distributed switch. I already had one created that I wanted to use, so I selected that for my vSAN network, added a new port group name, and then chose the two adapters I wanted to use.

Next we configure the information for the VMkernel adapters. I have a VLAN (30) that I use for all of my vSAN traffic, and then I add the static IPs I want to use from that subnet. Use the Autofill option…it will save you time.

I did not make any changes on the Advanced Options, I did not claim any disks for vSAN, and I did not configure a proxy. Click “Next” until you get to the review screen. If satisfied with your choices, click “Finish”.
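
Before moving on, you can spot-check any of the three nested hosts over SSH to confirm the vSAN vmkernel tagging and cluster membership that Quickstart just configured:

# lists the vmkernel interface(s) tagged for vSAN traffic
esxcli vsan network list

# shows whether the host has joined the vSAN cluster
esxcli vsan cluster get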

Once the changes from Quickstart are complete, click on your cluster, then “Configure”, and then Datastore Sharing. Notice I still show a vsanDatastore (3) but it has no space. Click “MOUNT REMOTE DATASTORE”.

I chose “vsanDatastore”, which is the physical storage for this cluster; all of the other datastores you see here are virtual. Click “Next” and notice that this time our compatibility check is all green because the vSAN networking is configured. Click “Finish”.

Now that we have mounted our datastore, let’s create a new VM on it. I just selected all of the defaults, but you could use a template to test with if you already had one deployed.

Let’s power up the VM. We now have a VM deployed in our HCI Mesh cluster using the vSAN datastore from my lab-cl1 cluster.

This is just one example of some quick testing I did because VLC helped me to deploy my ESXi hosts quickly. I hope you found it helpful.

LSI MegaRAID SAS 3108 – Cisco 12G SAS Raid – VSAN JBOD

The other day I decided to switch out the disk that I was using for VSAN caching. I was doing some testing with an NVMe drive, but now had to back it down to a SAS SSD. The disk I swapped in had been used previously in a different system, so it had a foreign configuration that I had to remove.

  1. Best practice would be to work on one host at a time. Put the first host in maintenance mode, and choose to “Ensure Availability”.
  2. After the host is in maintenance mode, click on your cluster then click “Configure” tab, and then click “Disk Management“.
  3. Click on the disk group that you want to remove and then click the “Remove the disk group” button.
  4. You will get another data migration question. I chose “Ensure data accessibility from other hosts“. Click “Yes“.
  5. Wait for the disk group to be removed from the host. When complete, reboot your host. When prompted during the boot process, press “Ctrl-R” to get to the raid configuration menu.
  6. Press “Ctrl-P” or “Ctrl-N” to switch pages. One of the pages should show your disks and the slots they are in. We have a problem here. The only option for my 400GB SSD is to erase the disk because it has the state of “Foreign“.
  7. Switch pages to the “Virtual Drive Management” page and then on the Cisco 12G SAS Modular Raid press “F2“. This will give a menu; select “Foreign Config” and then “Clear“.
  8. This will clear out your configuration so please make sure that you have thought things through. If you are OK with the possibility of data loss, click “Yes“.
  9. Now we are getting somewhere. The disk now shows UG (Unconfigured Good).
  10. Highlight the disk and then press “F2“. From the menu click “Make JBOD“.
  11. DATA ON DISKS WILL BE DELETED so make sure you want to do this. Click “Yes” to proceed.
  12. All looks good. Escape out and exit the application. Reboot your host when done.

  13. My new 400GB disk shows up in VMware now.

  14. Now click on your cluster, then click the “Configure” tab and then click “Disk Management“. Click on the host that you removed the disk from earlier and then click the “Add disk group” button. Choose your cache disk and your capacity disks and you are ready to go. Take your host out of maintenance mode and repeat the steps on each host.

VSAN on Cisco C240-M3 with LSI MegaRAID SAS 9271-i

In the past I have configured an LSI MegaRAID SAS 3108 – Cisco 12G SAS RAID controller with a 1GB FBWC module. When I set that up, I just passed through control to VMware. The MegaRAID SAS 9271-i is different; here is how I set it up. I used VMware KB2111266 for reference on configuration settings.

When the controller information comes up during boot, press Ctrl-H.

  1. Click “Start”.
  2. I already have Virtual Drive 0 configured for my ESXi OS. Virtual Drive 1 has my 400GB disk I am using for VSAN caching. I have four unconfigured disks that I want to use for my capacity tier.
    Click “Configuration Wizard”.
  3. Click “Add Configuration” radio button and then click “Next“.
  4. Click “Manual Configuration” radio button and then click “Next“.
  5. Now we see the four unconfigured drives on the left side. Click the first one, then click “Add to Array“.
  6. Click on the “Accept DG” button. Repeat Steps 5 and 6 until all of your disks are in their own disk group then click “Next“.
  7. In the left pane click the “Add to SPAN” button.
  8. The Disk Group appears in the right window under Span. Click “Next“.
  9. Depending on if the disk is an HDD or SSD, your settings will change. In my example I configured for HDD. When finished changing settings, click “Accept” and then “Next“.

  10. You will receive an alert about the possibility of slower performance with Write Through. Click “Yes“.
  11. You now have to click “Back” and repeat steps 7-10 until all of your drives have been added.
  12. Once all of your drives have been added, click “Accept“.
  13. Click “Yes” to save the configuration.
  14. Acknowledge that you know that data will be lost on the new virtual drives. Click “Yes”.
  15. You will now see all of your drives under Virtual Drives. Click the “Home” button.
  16. Click “Exit”.
  17. Click “Yes”.
  18. Power cycle your server.
  19. Success!! vCenter shows my drives under storage devices. I can now add these disks to VSAN.


Storage Policy Based Management Wins

Hey everyone, it’s my first VSAN post! For the past few months I have been building out VSAN on a few test environments. One of them is what I call a Frankenstein cluster. It consists of four Cisco C-series hosts with four SSDs for caching and 16 SSDs for capacity, set up as an all-flash VSAN. In order to do dedupe and compression you have to have all flash. I am not going into performance discussions right now, but instead want to talk about Storage Policy Based Management, or SPBM. Last week I had someone ask me where to set a disk to thick/thin within the web client. Notice that Type says “As defined in the VM storage policy”.

Here is the default policy assigned to this VM. Notice that there is nothing in my rules that would define thick/thin or anything in between.

I bet you are now thinking, “What does the fat client say?” Well, I am glad you asked. I have a VM on the VSAN datastore with a 40GB Thick Provision Eager Zeroed Hard disk 1.

What does that look like for storage usage? I will show both the fat client and the web client. I think there is a bug in the web client that I will discuss shortly. I bet you are asking why it is showing 80GB of used storage. Remember my storage policy? It is set for RAID 1, which mirrors the data. Keep that in mind if you will be using RAID 1 with VSAN. The numbers all seem to jibe.


Hey, let’s add a second disk. Let’s make it 100GB Thick Provision Eager Zeroed. After adding the disk I went into Windows and onlined/initialized the disk. After creating the disk in Windows…these were the results. This is where I think there is a VMware bug. If you look at the second image, the storage usage in the web client at the VM level never changes. Comparing some numbers: the used total was 591.73GB before and increased to 794.68GB, a change of about 203GB. Free space went from 6.04TB to 5.84TB, a change of about 200GB. The numbers look like what we would expect.

Now let’s have some fun! Time to change the Default VSAN Storage Policy. Go to Home–>VM Storage Policies. Highlight the policy you want to change and then click the little pencil icon to edit. Click the Add Rule drop-down at the bottom and choose “Object space reservation (%)”. I chose the default of 0. This means that any disks that have the default storage policy assigned to them will essentially be thin provisioned. Space will not be consumed until something is actually added to the drive. I chose to apply to all VMs now (I only have one). This might take some time if you have a lot of VMs that will change.


You should now be back at the Storage Policy screen. I want to make sure that the policy applied. I clicked on my default storage policy. Once in the policy, I clicked the “Check Compliance” button. On the right side I see “Compliant 1”. Just to make sure this applied to all disks on the VM (you can have separate storage policies apply to different disks), I went back to my VM–>Manage–>Policies. Notice all disks are compliant.


What does this all mean for my space? Let’s break it down. The used total was 794.68GB and is now 514.68GB, a change of exactly 280GB! Free space went from 5.84TB to 6.11TB, a change of about 270GB. Look at the used space in the datastore. Notice also that the VM now shows provisioned storage of 144GB and used of 36.43GB.


Now for the interesting part. Let’s look at the fat client. Notice that the disks still show that they are Thick Provision Eager Zeroed, but because of the storage policy, they really are not.

In conclusion, the storage policy wins…even though the fat client doesn’t seem to know that. Please let me know if you have any questions or want me to test anything else with this scenario.

Creating Cisco UCS port channels and then assigning a VLAN

In my new position I am learning a lot about UCS. Today I had to create a port channel for both A and B fabrics and then assign a VLAN to both.

  1. Log into Cisco UCS Manager.
  2. Click on the LAN tab and then the plus sign next to Fabric A.
  3. Right click Port-Channels and select Create Port Channel.
  4. Give the port channel an ID and a Name.
  5. Select which ports are going to be used in this port channel group. Make sure you hit the >> button to move them over to be in the port channel. Click Finish.
  6. Repeat steps 1-5 on Fabric B
  7. Click LAN in the left navigation window and then at the bottom click on LAN Uplinks Manager.
  8. Click the VLAN tab and then click the VLAN Manager tab. In the left navigation pane, select the port channel that you created earlier. In my case I am using port channel 23. In the right window, check the box for each VLAN you want to be part of this port channel. Click the Add to VLAN/VLAN Group button at the bottom of the screen.

    That should be it. You have now created a port channel and assigned a VLAN to it!

VMK0 MAC Change

I ran into an issue the other day in a UCS blade system where all of the vmk0 interfaces had the same MAC address. See VMware KB https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1031111

I had to remove vmk0 (which removes connectivity) and then add it back through a KVM connection to the host. This is how I did it. Make sure you save the IP information before making any changes.

VMK0 MAC Before

  1. Open ILO or KVM session to your host. Under troubleshooting choose the “Enable ESXi Shell” option. Once enabled do an alt-F1 on the keyboard and you should be at a login prompt. Login with root and your root password.
  2. I am going to use my port group name of “ESXi_MGMT” as an example. Yours might be different. Type esxcfg-vmknic -d -p ESXi_MGMT to remove vmk0.
  3. Type esxcfg-vmknic -a -i <management IP> -n <netmask> -p ESXi_MGMT to add the vmkernel back.

VMK0 MAC After Change

One important thing to note: management traffic is no longer enabled on my vmk0 connection. You must edit this connection and check the box for management traffic. VMware will automatically move management to another vmk, so make sure you go through and remove it from that vmkernel.
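
If you are still in the ESXi Shell (or SSH back in), you can handle that tagging from the command line as well. Something like the following, where vmk1 is just an example of wherever management ended up:

# re-enable management traffic on vmk0
esxcli network ip interface tag add -i vmk0 -t Management

# remove the management tag from the vmkernel it was moved to (vmk1 is an example)
esxcli network ip interface tag remove -i vmk1 -t Management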

Storage vMotion Folder Rename


I ran into a project the other day where the VM names in vCenter did not match the Windows hostnames of the VMs.  The VMware administrator was fixing this by shutting down the original machine and then cloning it.  The problem is that he then had to do additional configuration, and on the datastore the name would still be wrong, because vCenter appends a _1 to the folder name when that VM name already exists.  The easiest way to change the name on your datastore is to do a Storage vMotion.  In my example I created a VM named “Original” and then changed the name to “NewName”.  I will show what happened along the way.

What does the original datastore look like?


Now I renamed the VM from “Original” to “NewName”.  Notice that on the datastore the folder and files still use the “Original” name.


Time to do a Storage vMotion.  For my example I am using a powered-off VM for simplicity.  The full C# client will not let you do a live Storage vMotion of a VM; however, you can use the web client to accomplish this.


Now the “NewName” VM is on a new datastore.  Looking at the datastore, we see both the folder and file names have all been changed.