
First, my apologies if this has already been properly documented somewhere, but I didn't find any easy and straightforward procedure, so here is what I have implemented at several Service Providers; it works like a charm every time.

Before you perform anything, verify that you have the following files from the SSL Certificate Provider(s):

  • Root Certificate (These are usually in the form of .crt)
  • Any Intermediate Certificates (These are also in the form of .crt)
  • Private Key (This is in the form of .key)

Now, here are the detailed steps:

  1. cat root.crt Intermediate.crt > certificate-chain.txt
  2. openssl pkcs12 -export -inkey star.company.com.key -in certificate-chain.txt -out certificates.pkcs12

HTTP Certificate generation

  1. keytool -importkeystore -srckeystore certificates.pkcs12 -srcstoretype PKCS12 -destkeystore certificates.ks -deststoretype JCEKS -deststorepass <PASSWORD> -srcalias 1 -destalias http -srcstorepass <PASSWORD>
  2. keytool -list -keystore certificates.ks -storetype JCEKS -storepass <PASSWORD>

Console Proxy Certificate generation

  1. keytool -importkeystore -srckeystore certificates.pkcs12 -srcstoretype PKCS12 -destkeystore certificates.ks -deststoretype JCEKS -deststorepass <PASSWORD> -srcalias 1 -destalias consoleproxy -srcstorepass <PASSWORD>
  2. keytool -list -keystore certificates.ks -storetype JCEKS -storepass <PASSWORD>

Certificate permissions

  1. copy the certificates.ks to /opt/vmware/vcloud-director/jre/bin
  2. chown vcloud:vcloud /opt/vmware/vcloud-director/jre/bin/certificates.ks

Import procedure

  1. Stop the vCloud Director service, by typing “service vmware-vcd stop”
  2. Run /opt/vmware/vcloud-director/bin/configure
  3. When prompted for the certificate, point to /opt/vmware/vcloud-director/jre/bin/certificates.ks
  4. Enter the <PASSWORD> and confirm the <PASSWORD>
  5. When prompted to start the cell, press “y” and hit enter

Verification

  1. Verify that the cell has been restarted properly, by typing “tail -f /opt/vmware/vcloud-director/logs/cell.log”
  2. Verify that the Application Initialization is 100% completed
  3. Open the browser, log into vCloud Director, accept the new certificate, and you are back in business

Additional Cells

Now, recreate the certificates on the additional cells and re-run the “configure” command as outlined in the sections above.


Now that we understand the various layers of vCloud Director, let me shed some light on what Network Pools are and how they can back Organization and vApp Networks:

Network Pools

We have learned from the previous post that Organization and vApp Networks are backed by Network Pools. Now let me explain what these Network Pools are, what the various types of Network Pools are, and what their functionality is.

Network Pools are collections of undifferentiated, isolated Layer 2 networks that can be used to create Organization and vApp Networks on demand. They are available to both Providers and Consumers, but must be created beforehand by the Providers.

Currently, there are three types of network pools from which the Organization and vApp networks are created by VMware vCloud Director, as follows:

  1. vSphere Port group-backed
  2. VLAN-backed
  3. vCD Network Isolation-backed (vCD-NI)

All three types can be used within the same instance of VMware vCloud Director; however, the requirements and use cases can differ. Now, let us dive into each of them.

vSphere Port group-backed:

In vSphere Port group-backed, the Provider is responsible for creating pre-provisioned port groups coming off of vNetwork Standard, vNetwork Distributed or Cisco Nexus 1000v Virtual Switches in vSphere; they can be created either manually or through orchestration. The Provider then maps the port groups to the Network Pool, so they can be used by the Organizations to create vApp and Organization Networks whenever needed. vSphere Port group-backed network pools provide network isolation using IEEE 802.1Q VLANs with the standard frame format.

For the creation of this network pool, you will have to pre-provision the port groups in vSphere, specify the vCenter Server on which the pre-provisioned port groups exist, add the port groups that will be used by this network pool, and provide a name for it.

This is the only Network Pool type that supports all three kinds of vSphere networking: vNetwork Standard, vNetwork Distributed and Cisco Nexus 1000v port groups. The vSphere Port group-backed Network Pool should be used in scenarios where there is a requirement for Cisco Nexus 1000v Virtual Switches, or where Enterprise Plus licensing is not available and you are forced to use vNetwork Standard Switches.

Now, let us look at some of the pros and cons of this network pool:

  • Pros
    • It will allow the utilization of existing features such as QoS (quality of service), ACLs (Access Control Lists) and Security
      • Note: QoS and ACLs apply for N1K
    • It provides better control over the visibility and monitoring of the port groups.
    • When Enterprise Plus licensing is not available and you would like to use vNetwork Standard Switches, this is the only option available.
    • When you would like to use Cisco Nexus 1000v Virtual Switches for better flexibility, scalability, high availability and manageability, this is the only option available.
    • No need to make any changes to the network MTU size, which is 1500 bytes by default.
  • Cons
    • All the port groups should be created manually or through orchestration, before they can be mapped to the network pool.
    • Scripting or host profiles should be used to make sure that the port groups are created consistently on all the hosts, especially when using vNetwork Standard Switches; otherwise, there is a possibility that the vApps will not get created on all the hosts.
    • Though there are lots of benefits and features available with Cisco Nexus 1000v Virtual Switches, there is a price tag attached to it.
    • For port group isolation, it relies on VLAN Layer 2 isolation.

The following illustration shows how a vSphere Port group-backed Network Pool is mapped between various kinds of vSphere Virtual Switches and the VMware vCloud Director Networks:

VLAN-backed:

In VLAN-backed, the Provider is responsible for creating a physical network with a range of VLANs and trunking them to all the ESX/ESXi hosts. The Provider then maps the vNetwork Distributed Switch, along with the range of VLANs, to the VLAN-backed Network Pool, so it can be used by the Organizations to create vApp and Organization Networks whenever needed.

Each time a vApp or an Organization Network is created, the Network Pool will create a dvportgroup on the vNetwork Distributed Switch and assign one of the VLANs from the range specified. Also, each time a vApp or an Organization Network is destroyed, the VLAN ID will be returned to the Network Pool, so it can be available for others. Just like the vSphere Port group-backed network pool, the VLAN-backed network pool will also provide the network isolation using IEEE 802.1Q VLANs with standard frame format.
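The allocate-on-create / return-on-destroy behavior described above can be sketched as a simple pool. This is a hypothetical Python model, not vCD code; the class and method names are invented for illustration:

```python
# Hypothetical sketch of a VLAN-backed network pool: a VLAN ID is handed
# out when an Org/vApp network is created and returned when it is destroyed.
class VlanBackedPool:
    def __init__(self, vlan_range):
        self.available = list(vlan_range)   # VLANs trunked to the hosts
        self.in_use = {}                    # network name -> VLAN ID

    def create_network(self, name):
        if not self.available:
            raise RuntimeError("network pool has run out of VLANs")
        vlan = self.available.pop(0)        # vCD would also create a dvportgroup here
        self.in_use[name] = vlan
        return vlan

    def destroy_network(self, name):
        # The VLAN ID goes back to the pool so it is available for others.
        self.available.append(self.in_use.pop(name))

pool = VlanBackedPool(range(100, 104))      # VLANs 100-103
v = pool.create_network("org-net-1")        # grabs VLAN 100
pool.destroy_network("org-net-1")           # VLAN 100 is available again
```

The same shape applies to vCD-NI pools, with network IDs in place of VLAN IDs.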

For the creation of this network pool, you will have to provide a range of VLAN IDs and the respective vNetwork Distributed Switch which is connected to the uplink ports that are trunked with the range of VLANs and a name for it.

vNetwork Distributed Switch is the only vSphere networking option supported by the VLAN-backed Network Pool at this time. The VLAN-backed Network Pool should be used in scenarios where there is a requirement for the most secure isolation, or to provide optional VPN/MPLS or other special Consumer requirements, as long as doing so doesn't take up a lot of VLANs.

Now, let us look at some of the pros and cons of this network pool:

  • Pros
    • It will provide the most securely isolated networks.
    • It will provide better network performance, as there is no encapsulation overhead.
    • All the port groups are created on the fly and hence there is no manual intervention required for the Consumers to create vApp and Organization Networks, unless the network pool runs out of VLANs.
    • No need to make any changes to the network MTU size, which is 1500 bytes by default.
  • Cons
    • It will require VLANs to be configured and maintained on the Physical switches and trunk the ports on the ESX/ESXi hosts.
    • It will require a wide range of VLANs, depending on the number of vApp and Organization Networks required for the environment, and such a wide range of VLANs may not always be available.
    • For port group isolation, it relies on VLAN Layer 2 isolation.

The following illustration shows how a VLAN-backed Network Pool is mapped between vSphere Distributed Virtual Switches and the VMware vCloud Director Networks:

vCD Network Isolation-backed:

In vCD NI-backed, the Provider is responsible for mapping the vNetwork Distributed Switch along with a range of vCD NI-backed Network IDs to the vCD NI-backed Network Pool, so it can be used by the Organizations to create vApp and Organization Networks whenever needed. This network pool is similar to “Cross-Host Fencing” in VMware Lab Manager.

The vCD-NI backed Network Pool adds 24 bytes of encapsulation to each Ethernet frame, bringing the frame size up to 1524 bytes; this is done to isolate each of the vCD NI-backed networks. The encapsulation contains the source and destination MAC addresses of the ESX Servers where the VM endpoints reside, as well as the vCD NI-backed Network ID. The receiving ESX Server strips the vCD-NI headers to expose the original VM source and destination MAC addresses, and the packet is delivered to the destination VM. When both the Guest Operating Systems and the underlying physical network infrastructure are configured with the standard MTU size of 1500 bytes, the vCD-NI protocol will fragment frames, which results in a performance penalty. Hence, to avoid fragmented frames, it is recommended to increase the MTU size by 24 bytes on the physical network infrastructure and on the vCD NI-backed Network Pool, but leave the Guest Operating Systems that obtained their networks from this pool intact.
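The MTU arithmetic above is easy to check. Here is a minimal sketch; the 24-byte overhead and 1500-byte default come from the text, while the helper function itself is hypothetical:

```python
# vCD-NI wraps each frame in a MAC-in-MAC encapsulation that adds 24 bytes.
ENCAP_OVERHEAD = 24
STANDARD_MTU = 1500

def fragments(guest_mtu, physical_mtu):
    """True if the encapsulated frame no longer fits on the physical network."""
    return guest_mtu + ENCAP_OVERHEAD > physical_mtu

# Defaults everywhere: 1500 + 24 = 1524 > 1500, so frames fragment.
assert fragments(STANDARD_MTU, STANDARD_MTU)
# Recommended fix: raise the physical/pool MTU to 1524, leave guests at 1500.
assert not fragments(STANDARD_MTU, STANDARD_MTU + ENCAP_OVERHEAD)
```

Note that lowering the guest MTU to 1476 would also avoid fragmentation, but the recommendation in the text is to raise the physical MTU instead and leave the guests intact.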

Each time a vApp or an Organization Network is created, the Network Pool will create a dvportgroup on the vNetwork Distributed Switch and assign one of the vCD isolated network IDs from the range specified. Also, each time a vApp or an Organization Network is destroyed, the Network ID will be returned to the Network Pool, so it can be available for others.

For the creation of this network pool, you will have to provide a range of vCD isolated network IDs, the VLAN ID of the Transport VLAN that will carry all the encapsulated traffic, the respective vNetwork Distributed Switch, and a name for it. After the network pool is created, change the value of the Network Pool MTU to 1524.

vNetwork Distributed Switch is the only vSphere networking option supported by the vCD NI-backed Network Pool at this time. The vCD NI-backed Network Pool should be used in scenarios where there is no requirement for routed networks, when only a limited number of VLANs are available, or when the management of VLANs is problematic and highly secure isolation of vApp and Organization Networks is not critical.

Now, let us look at some of the pros and cons of this network pool:

  • Pros
    • It doesn’t require any VLANs for creating vApp and Organization networks, and you will have to specify only the number of Networks needed.
    • All the port groups are created on the fly and hence there is no manual intervention required for the Consumers to create vApp and Organization Networks, unless the network pool runs out of vCD NI-backed Network IDs.
    • VLAN isolation is not required for Layer 2 isolation.
  • Cons
    • It is not as secure as using VLANs; thus there is a need for an isolated “Transport” VLAN.
    • It has a small performance overhead due to the MAC-in-MAC encapsulation for the overlay network.
    • There is administrative overhead in increasing the MTU size to 1524 across the entire physical network infrastructure.
    • It cannot be used for routed networks, as it only supports Layer 2 adjacency.

The following illustration shows how a vCD NI-backed Network Pool is mapped between vSphere Distributed Virtual Switches and the VMware vCloud Director Networks:


Networking is one of the most complicated topics in VMware vCloud Director, and it is critical to understand its ins and outs, as it touches every Virtual Machine, vApp and Organization of your deployment. In this chapter I will introduce you to the various layers of VMware vCloud Director Networking, their abstraction from the vSphere layer, their functionality, their interaction with each other, and the various use cases that can be applied.

Firstly, I would like to explain how vSphere networking is designed around VMware vNetwork Standard Switches, VMware vNetwork Distributed Switches and Cisco Nexus 1000v Virtual Switches and vmnics. All of these vSphere networking resources are abstracted from the hardware resources such as Physical Switches and Network Interface Cards on vSphere hosts.

VMware vCloud Director is an abstraction of the vSphere layer, and the same applies to networking. Here, the vCloud layer abstracts the networking resources from the vSwitches/Port groups and/or dvSwitches/dvPort groups of the vSphere layer.

Here is an illustration of how the various networking abstractions are done:

vCloud Network Layers

The three layers of networking available in VMware vCloud Director are:

  1. External Networks
  2. Organization Networks
  3. vApp Networks

Cloud is all about providing and consuming: Providers, such as Cloud Computing Service Providers or Enterprises, sell resources to Consumers, such as IT Organizations or internal divisions of an Enterprise (for instance, the Finance department).

Similarly, in the case of vCloud networking, External and Organization Networks are created and managed by Providers, whereas Consumers can use those resources through the vApp Networks that they can create either manually or automatically.

Now, let me explain each of the layers and their functionalities:

External Networks:

External Networks, also known as “Provided Networks”, are always created by the Providers, and they provide external connectivity to VMware vCloud Director, i.e., they are the doors of the vCloud to the outside world. Typically they are created by mapping a Port group or dvPort group coming off of a vNetwork Standard, vNetwork Distributed or Cisco Nexus 1000V Virtual Switch at the VMware vSphere layer.

Here are some of the typical use cases for External Networks:

  • Internet Access
  • Provider supplied network endpoints such as:
    • IP based storage
    • Online or offline backup services
    • Backhauled networking services for consumers such as:
      • VPN access to a private cloud
      • MPLS termination

The following illustration shows how an External Network can be used as a gateway to VMware vCloud Director for providing various services mentioned above:

While providing External Networks such as the Internet, the Providers will typically supply public IP addresses to the consumers for both inbound and outbound access. While it is possible to create one large External Network and provide it to all the consumers, it is quite challenging to create and maintain the public IP addresses in one big IP range. Hence, it is recommended to create multiple External Networks, at least one per Organization, so the public IP address range can be kept separate for each consumer and maintained easily, while keeping the multi-tenancy intact.

Organization Networks:

Organization Networks are also created by the Providers and are contained within the Organizations, where Organizations are the logical constructs of consumers. Their main purpose is to allow multiple vApps to communicate with each other and to provide connectivity for the vApps to the external world by connecting to the External Networks. In other words, Organization Networks bridge the vApps and the External Networks.

Organization Networks are provisioned from a set of pre-configured network resources called Network Pools, which typically maps a Port group or dvPort group coming off of a vNetwork Standard or vNetwork Distributed or Cisco Nexus 1000V Virtual Switch at the VMware vSphere layer. I will cover the Network Pools in my next post.

The Organization Networks can be connected to the External Networks in three different ways:

  • Public or Direct Connectivity: An Organization Network is bridged directly to an External Network, where the deployed vApps are directly connected to the External Network.
  • Private or External NAT/Routed Connectivity: An Organization Network is NAT/routed to an External Network, where the deployed vApps are connected to the External Network via a vShield Edge that provides firewall and/or NAT functionality for security.
  • Private or Isolated or Internal Connectivity: This is very similar to the Private or External NAT/Routed connectivity, except that the Organization Network is not connected to the External Network and is completely isolated within the Organization.

Now, here are some of the typical use cases for the Organization Networks:

  • Consumers that need access to their backhauled networking services via a trusted External Network can be direct connected to External Network
  • Consumers that need access to the Internet via a non-trusted External Network can be NAT/Routed connected to the External Network
  • Consumers that do not need any access to the public networks can use a Private or Isolated or Internal connected Organization Network that is contained within itself.

The following illustration shows how an Organization Network will act as a bridge between vApps and External Networks:

vApp Networks:

vApp Networks are created by the Consumers and are contained within the vApps, where a vApp is a logical entity comprising one or more virtual machines. The main purpose of the vApp Networks is to allow multiple Virtual Machines in a vApp to communicate with each other.

vApp Networks are also provisioned from a set of pre-configured network resources called Network Pools, which typically maps a Port group or dvPort group coming off of a vNetwork Standard or vNetwork Distributed or Cisco Nexus 1000V Virtual Switch at the VMware vSphere layer. I will cover the Network Pools in my next post.

The vApp Networks can be connected to the Organization Networks in three different ways:

  • Direct Connectivity: A vApp Network is bridged directly to an Organization Network, where the deployed VMs are directly connected to the Organization Network.
  • Fenced Connectivity: A vApp Network is NAT/routed to an Organization Network, where the deployed VMs are connected to the Organization Network via a vShield Edge that provides firewall and/or NAT functionality for security.
  • Isolated Connectivity: A vApp Network is completely isolated from the other vApps and the Organization Network. This is similar to Isolated Organization Network except that this is isolated only between the VMs in the vApp.

Now, here are some of the typical use cases for the vApp Networks:

  • Consumers that need to communicate to the VMs in other vApps within the same Organization and with the same security requirements can be direct connected to the Organization Network.
  • Consumers that need to communicate to the VMs in other vApps within the same Organization, but with different security requirements can be NAT/Routed connected to the Organization Network. For instance, Production vApps and DMZ vApps within the same Organization need to communicate to each other but through a firewall.
  • Consumers that do not need to communicate to the VMs in other vApps can be isolated from the Organization Network.

The following illustration shows how a vApp Network can be either isolated or connected to the Organization Network:


Before we get to know VMware vCloud Director, let us quickly see what Cloud Computing is. So, what is cloud computing? That's the new buzzword in the market, right? Well, does it really mean anything to us? Some say that it is a new way of saying what we have been offering as a Virtualization service; is that right? When somebody asks me the same question I just reply with a face and say “You talkin' to me?” – no, I am just kidding. Here is how “I” see it:

Let us break cloud computing into two different words, define them, combine them, and then see what they mean together. We have always used “cloud” for some kind of network, such as the Internet or a WAN or VPN, that is out there. And we all know that computing is some kind of hardware and/or software that can be used to access, process or manage information. Now, if we combine them, we are really talking about accessing, processing and managing information over some kind of private or public network. There are several other names in the market that revolve around cloud computing. For instance, when cloud computing is used for delivering Application services, it can be called “Software as a Service (SaaS)”; when it is used for delivering Platform services, it can be called “Platform as a Service (PaaS)”; and when used for delivering Infrastructure services, it can be called “Infrastructure as a Service (IaaS)”.

VMware vCloud Director is an “Infrastructure as a Service” solution that can pool the VMware vSphere resources in your existing datacenter and deliver them on a catalog basis without the end-users knowing the complexities of the Infrastructure behind it. It is elastic in nature and provides consumption based pricing and can be accessed over the Internet using standard protocols. Now, think of it as a layer sitting above the VMware vSphere layer to transparently provide resources to the end-users just as shown below:

As you can see from the bottom up, historically VMware vSphere components abstracted the physical resources into virtual resources, and now VMware vCloud Director components abstract the VMware vSphere virtual resources into “pure virtual” resources. When I say pure virtual resources, I am referring to the virtual computing, networking and storage resources, to be more specific. With that said, let us see the various VMware vCloud Director components that make this happen:

  1. VMware vCloud Director Cells: These are multiple VMware vCloud Director software instances installed on Red Hat Enterprise Linux; they are stateless peers that use a single VMware vCloud Director database but can scale horizontally. Multiple cells provide redundancy and load balancing when used with an external load balancer. Every cell has several roles such as UI, vCloud API, Console Proxy, VMware Remote Console (VMRC), Image Transfer and so on; however, Console Proxy and VMRC are the configurable and critical components, where the Console Proxy provides self-service VMware vCloud Director portal access to the administrators and end-users, and the VMRC provides Virtual Machine Remote Console access to both administrators and end-users.
  2. VMware vCloud Director Database: This is an Oracle database that stores all the VMware vCloud Director information. Care should be taken to design the database with redundancy and high availability. Currently, Oracle 10g Enterprise Server and above is the only database type that is supported.
  3. VMware vShield Manager and Edge Components: VMware vShield Manager is used to manage all the vShield service VMs such as vShield Edges that will be created on the fly whenever fencing, NATing and other services are used within the VMware vCloud Director environment.
  4. VMware vCenter Chargeback: vCenter Chargeback provides software metering for the VMware vCloud Director environment that can be used to bill the end-users. It runs on an Apache Tomcat server instance and provides built-in load balancing when used with multiple vCenter Chargeback Servers. Chargeback also contains data collectors, including one for vCloud Director and one for the vShield components, that are responsible for collecting the information specific to the multi-tenant VMware vCloud Director environment.
  5. And of course VMware vSphere Components: VMware vCloud Director is sitting on top of VMware vSphere layer and works with vCenter Server and ESX/ESXi Hosts to provide private and public computing resources.

 


This topic covers all the aspects of HA, laid out in the form of questions and answers, so you understand the concepts and can use them when preparing for any of the certifications.

Basic HA questions

Q. Where are the primary nodes placed and how many of them?  And what do they do?
A. The first 5 hosts will be designated as primary nodes. The primary nodes maintain/replicate cluster state and initiate failover actions.

Q. When will the primary nodes possibly change?
A. If a primary node is disconnected or removed from a cluster, put in maintenance mode, or whenever the cluster is reconfigured for HA.

Q. How does the communication occur between various nodes?
A. Primary nodes send heartbeats to both primary and secondary nodes, whereas secondary nodes send heartbeats only to the primary nodes; this happens every 1 second.

Q. Is it necessary to change the default heartbeat interval of 1 second at das.failuredetectioninterval?
A. There isn’t any reason that I found to change the default heartbeat interval.

Q. Where can we find which hosts are primary nodes?
A. cat /var/log/vmware/aam/aam_config_util_listprimaries.log, or /opt/vmware/aam/bin/cli → AAM> ln. Also verify /var/log/vmware/aam/aam_config_util_addnode.log to see all the steps for adding a host to an HA cluster.

Q. Can the primary nodes be set manually using command line? If so is that supported?
A. /opt/vmware/aam/bin/cli – AAM> promotenode <nodename> or demotenode <nodename> and it is not supported.

Q. How many Active primary or fail-over coordinators exist and what is the main function of it?
A. There is only one Active primary node. It is responsible for restarting the VMs on primary and secondary nodes; when two hosts fail, it restarts the VMs of the first failed host and then the second. It decides where to restart VMs, keeps track of failed attempts, and determines when it is appropriate to keep trying to restart the VMs.

Q. Which primary node becomes the “active primary or fail-over coordinator”? Is there a criteria used in the selection?
A. By default, the first primary node becomes the fail-over coordinator, however after that the others are selected on a random basis.

Q. What happens when all the primary nodes go down?
A. There must be at least one primary node at all times for HA to work; if there is none, no HA-initiated restart of VMs will take place. This is the reason why you can only configure up to 4 host failures in HA.

Q. What is Host Monitoring Status?
A. After you create a cluster, enable Host Monitoring Status so that VMware HA can monitor the heartbeats sent by the VMware HA agent on each host in the cluster. It enables VMs to be restarted on another host if a host failure occurs. It is also required for the FT recovery process to work properly. Disabling it will disable VMware HA.

Failure detection and Host network isolation

Q. When does a host declare itself as isolated?
A. If a host stops receiving heartbeats from all other hosts in the cluster for more than 12 seconds, it attempts to ping its isolation address and if that also fails, it declares itself as isolated from the network.

Q. When does other hosts in the cluster treat the isolated host as failed?
A. When the isolated host’s network connection is not restored for 15 seconds or longer, then the other hosts in the cluster treat the isolated host as failed and attempt to failover its VMs.
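The two timers above (12 seconds before a host checks whether it is isolated, 15 seconds before its peers treat it as failed) can be laid out as a small timeline sketch. This is a hypothetical Python helper, with the threshold values taken from the answers above:

```python
# Hypothetical timeline of HA failure detection. Times are seconds since
# the host last received a heartbeat; thresholds are the defaults above.
ISOLATION_CHECK = 12   # host pings its isolation address at this point
DECLARED_FAILED = 15   # das.failuredetectiontime: peers act at this point

def host_state(seconds_without_heartbeats, isolation_ping_ok):
    if seconds_without_heartbeats < ISOLATION_CHECK:
        return "connected"                  # still within tolerance
    if isolation_ping_ok:
        return "heartbeat network problem"  # isolation address reachable
    if seconds_without_heartbeats < DECLARED_FAILED:
        return "isolated"                   # host declares itself isolated
    return "failed"                         # peers fail over its VMs

assert host_state(5, False) == "connected"
assert host_state(13, False) == "isolated"
assert host_state(16, False) == "failed"
```

The gap between the two thresholds is why a host first runs its isolation response (next section) before the rest of the cluster attempts to restart its VMs.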

Q. What is isolation response?
A. It is the action that HA takes when the heartbeat network is isolated.

Q. What are different isolation response possibilities?
A. 3 of them: “Power off”, “Shut down” and “Leave powered on”; as of vSphere, the default is “Shut down”.

Q. When to use shutdown / power off / leave powered on options?
A. Shutdown option: VMs that have not shut down will take longer to fail over while the shutdown completes; VMs that have not shut down within 300 seconds (5 minutes) are powered off, and this 5-minute value can be changed at das.isolationshutdowntimeout (in seconds).

Q. What are the different pros/cons with the 3 isolation response options?

Q. What is the default amount of retries to restart the VMs? And is it necessary to change this value at das.maxvmrestartcount?

Q. Is it necessary to change the default failure detection time value of 15 to something higher at das.failuredetectiontime? And if so why?

HA & DRS

Q. Why HA might not be able to fail over the VMs? What could be the different causes? How to get away from it?

Q. Can the DRS, DPM and VM-Host Affinity rules play any role in the above scenario?

HA Admission Control

Q. What is Admission Control?
A. It is to ensure that sufficient resources are available in a cluster to provide failover protection and to ensure that VM resource reservations are respected.

Q. How to enable and disable Admission Control?
A. Enable: “Do not power on VMs that violate availability constraints” → Disable: “Power on VMs that violate availability constraints” – no warnings are presented and the cluster doesn't turn red.

Q. What are different types of Admission Control?
A. Host – the host has sufficient resources to satisfy the reservations of all VMs running on it; Resource Pool – sufficient resources to satisfy the reservations, shares and limits of all VMs associated with it; VMware HA – sufficient resources in the cluster are reserved for VM recovery in the event of host failure. VMware HA is the only type that can be disabled; the others cannot be. The recommendation is not to disable it; you might want to disable it only during some maintenance or testing.

Q. How many host failures can a cluster tolerate admission control policy?
A. The default is 1 and the maximum is 4.

1. Number of Hosts that can fail

Q. How does VMware HA perform admission control with the “number of host failures” admission control policy?
A. Calculate the slot size (a logical representation of CPU/Memory) → determine the number of slots each host in the cluster can hold → determine the Current Failover Capacity of the cluster (that is, the number of hosts that can fail and still leave enough slots to satisfy all of the powered-on VMs in the cluster) → determine whether the Current Failover Capacity is less than the Configured Failover Capacity (provided by the user); if it is less, admission control disallows the operation. This policy avoids resource fragmentation by defining a slot as the maximum virtual machine reservation. This policy tolerates up to 4 host failures. In a heterogeneous cluster, this policy can be too conservative, as it only considers the largest VM reservations when defining the slot size and assumes the largest hosts fail when computing the Current Failover Capacity. When FT is used, the secondary VM is assigned a slot.

Q. How is the slot size calculated?
A. CPU → take the CPU reservation of every powered-on VM in the cluster and select the largest value; if no VM has a reservation, 256 MHz is used (changeable via das.vmcpuminmhz). Memory → take the memory reservation plus memory overhead of each powered-on VM and select the largest value; there is no default value for memory.

Q. How is the Current Failover Capacity calculated?
A. Take each host's CPU and memory contained in the host's root resource pool (not the physical resources of the host), counting only hosts that are connected (not those in maintenance mode, in standby, or with VMware HA errors) → the maximum number of slots each host can support = resource amount / slot size, rounded down, computed separately for CPU and memory; the smaller of the two results is the number of slots the host can support. The Current Failover Capacity is then the number of hosts that can fail and still leave enough slots to satisfy the requirements of all powered-on virtual machines.
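The slot-size and failover-capacity calculations above can be sketched as follows. This is a simplified illustration under the stated rules, not VMware's implementation; the cluster and VM figures are hypothetical:

```python
def slot_size(vms, min_cpu_mhz=256):
    """Slot size: the largest CPU reservation (at least 256 MHz, the
    das.vmcpuminmhz default) and the largest memory reservation plus
    overhead across all powered-on VMs."""
    cpu = max([vm["cpu_res_mhz"] for vm in vms] + [min_cpu_mhz])
    mem = max(vm["mem_res_mb"] + vm["mem_overhead_mb"] for vm in vms)
    return cpu, mem

def current_failover_capacity(hosts, vms):
    """Number of (largest) hosts that can fail while the remaining
    slots still cover every powered-on VM."""
    cpu_slot, mem_slot = slot_size(vms)
    # Slots per host: resources / slot size, rounded down; the smaller
    # of the CPU and memory results wins.
    slots = sorted(min(h["cpu_mhz"] // cpu_slot, h["mem_mb"] // mem_slot)
                   for h in hosts)
    failures = 0
    # Assume the largest remaining host fails, as long as the VMs still fit.
    while slots and sum(slots[:-1]) >= len(vms):
        slots.pop()
        failures += 1
    return failures

# Hypothetical cluster: 4 hosts worth 16 slots each, 48 powered-on VMs.
hosts = [{"cpu_mhz": 4096, "mem_mb": 8192}] * 4
vms = [{"cpu_res_mhz": 256, "mem_res_mb": 384, "mem_overhead_mb": 128}] * 48
print(current_failover_capacity(hosts, vms))  # → 1
```

With 64 such VMs the capacity drops to 0, which is exactly the N+1 design trap described below: the three surviving hosts only hold 48 slots.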

Q. How does this affect the design?
A. In an N+1 design, large reservations distort the slot size calculation, so there may not be enough slots left for one host to fail, and admission control will fail when a host actually does. Example: 4 hosts in an N+1 cluster, each worth 16 slots, give 64 slots in the cluster; if 64 VMs are running and one host fails, there is no room to fail them all over onto the remaining 3 hosts with only 48 slots. To avoid distorting the slot size calculation, you can set an upper bound on the CPU and memory components of the slot size with the das.slotcpuinmhz and das.slotmeminmb attributes.

2. Percentage of resources that can fail

Q. How does VMware HA perform admission control under the "Percentage of Cluster Resources Reserved" policy?
A. (1) Calculate the total resource requirements of all powered-on VMs in the cluster → (2) calculate the total host resources available for VMs → (3) calculate the Current CPU Failover Capacity and Current Memory Failover Capacity for the cluster → (4) if either the Current CPU Failover Capacity or the Current Memory Failover Capacity is less than the Configured Failover Capacity (provided by the user), admission control disallows the operation. Again, reservations are used (defaults of 0 MB and 256 MHz when no user-specified values exist). This policy does not address resource fragmentation, tolerates up to 50% of resources reserved for failover, and is not affected by a heterogeneous cluster. When FT is used, the secondary VM's resource usage is accounted for.

Q. How is the Current Failover Capacity determined?
A. (1) Sum the CPU reservations of the powered-on VMs (default 256 MHz) and sum their memory reservations (default 0 MB) → (2) add up the hosts' CPU and memory resources (root resource pool, not physical resources) → Current CPU Failover Capacity = ((2) – (1)) / (2), and likewise for memory.
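The percentage-based calculation is much simpler and can be sketched like this (a simplified illustration with hypothetical cluster numbers):

```python
def percentage_failover_capacity(hosts, vms):
    """Current CPU / Memory Failover Capacity:
    (total host resources - total VM reservations) / total host resources."""
    cpu_res = sum(max(vm["cpu_res_mhz"], 256) for vm in vms)  # default 256 MHz
    mem_res = sum(vm["mem_res_mb"] for vm in vms)             # default 0 MB
    cpu_total = sum(h["cpu_mhz"] for h in hosts)              # root resource pool
    mem_total = sum(h["mem_mb"] for h in hosts)
    return (cpu_total - cpu_res) / cpu_total, (mem_total - mem_res) / mem_total

# Hypothetical 3-host cluster with ten reserved VMs.
hosts = [{"cpu_mhz": 9000, "mem_mb": 24000}] * 3
vms = [{"cpu_res_mhz": 500, "mem_res_mb": 1024}] * 10
cpu_cap, mem_cap = percentage_failover_capacity(hosts, vms)
# Admission control disallows an operation if either value drops below
# the configured failover percentage (e.g. 0.25 for 25%).
```

Because only cluster-wide totals are compared, nothing guarantees that any single host has room for a given failed-over VM, which is the resource-fragmentation caveat noted above.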

3. Specify a failover host for admission control policy

Q. How does this work?
A. HA fails the VMs over to a specific host; if that host is unavailable or lacks sufficient resources, the VMs are restarted on other hosts in the cluster. Status: Green – ready with no VMs on it; Yellow – ready with VMs running on it; Red – in maintenance mode or with HA errors. Resources are not fragmented under this policy because a single host is reserved for failover, and only a single failover host is allowed. A heterogeneous cluster does not affect this policy.

Q. How do you choose an admission control policy?
A. It depends on factors such as (1) avoiding resource fragmentation, (2) flexibility of failover resource reservation and (3) heterogeneity of the cluster – see the notes on each of these under the three admission control policies above.

Q. What are the requirements of a VMware HA cluster?
A. All hosts must be licensed; at least 2 hosts in the cluster; unique host names; static IP addresses (or DHCP reservations); all hosts must access the same management networks (at least one management network in common, and best practice is two) – on ESX this is the Service Console network, on ESXi the Management Network (VMkernel network checkbox); all hosts should have access to the same VM networks and datastores; VMs should be on shared storage, not local; VMware Tools must be installed for VM Monitoring to work; all hosts must be configured with DNS, and if hosts are added by IP address, enable reverse DNS lookup (the IP address should resolve to the short host name); HA does not support IPv6; each host name should be 26 characters or less (including domain name and dots).

Q. Does the VM Startup and Shutdown feature affect HA or FT?
A. Yes. It is disabled by default, and it is recommended not to enable it manually, as it can interfere with the actions of cluster features such as HA and FT.

Virtual Machine Options

Q. What are the Virtual Machine Options?
A. (1) VM restart priority (Disabled, Low, Medium (the default) and High) – the relative order in which VMs are restarted after a host failure; they are restarted sequentially, high first, then medium, then low, until all VMs are restarted or no more cluster resources are available. Example: in a multi-tier application, set the database to high, the application tier to medium and the web server to low. Disable restart priority for VMs that are redundant across other hosts (such as multiple Domain Controllers, DNS servers and so on).
(2) Host isolation response – determines what happens when a host in a VMware HA cluster loses its management network connections but continues to run. This can be customized per VM.

VM and Application Monitoring

Q. How is VM Monitoring performed?
A. If regular heartbeats from the VMware Tools process are not received within the failure interval, the VM Monitoring service checks the VM's disk and network I/O stats over the preceding 120 seconds (to avoid unnecessary resets); if there has been no I/O, the VM is reset. The default 120 seconds can be changed via das.iostatsinterval.

Q. How is Application Monitoring performed?
A. Either use an application that supports VMware application monitoring or obtain the appropriate SDK and use it to setup customized heartbeats for the application you want to monitor. After which, if the heartbeats are not received from the Application, the VM will be restarted.

Q. What kinds of sensitivity are available?
A. High sensitivity – a more rapid conclusion that a failure has occurred (failure interval: 30 seconds, reset period: 1 hour); Low sensitivity – a longer service interruption between an actual failure and the VM being reset (failure interval: 120 seconds, reset period: 7 days); Medium – failure interval: 60 seconds, reset period: 24 hours. Within any one reset period, a VM will be reset at most 3 times.
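The interaction between the failure interval, the I/O-stats check and the three-resets-per-period limit can be sketched as follows (an illustrative model only; the class and method names are hypothetical, not VMware's code):

```python
import time
from collections import deque

class VMMonitorPolicy:
    """Sketch of the VM Monitoring reset policy described above."""
    MAX_RESETS = 3  # at most 3 resets within one reset period

    def __init__(self, failure_interval_s=60, reset_period_s=24 * 3600):
        self.failure_interval = failure_interval_s  # "Medium" sensitivity defaults
        self.reset_period = reset_period_s
        self.reset_times = deque()

    def should_reset(self, secs_since_heartbeat, had_recent_io, now=None):
        now = time.time() if now is None else now
        # No reset while heartbeats are timely, or while the VM still
        # shows disk/network I/O (the anti-false-positive check).
        if secs_since_heartbeat < self.failure_interval or had_recent_io:
            return False
        # Forget resets that have fallen outside the current reset period.
        while self.reset_times and now - self.reset_times[0] > self.reset_period:
            self.reset_times.popleft()
        if len(self.reset_times) >= self.MAX_RESETS:
            return False  # reset budget exhausted for this period
        self.reset_times.append(now)
        return True
```

For example, a VM that misses heartbeats but is still doing I/O is left alone, and a crashing VM is reset at most three times per 24-hour window under the medium defaults.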

Advanced Attributes

Q. What are the various advanced attributes?
A. das.isolationaddressX – X = 1–10 isolation addresses; typically one per management network is enough – HA must be re-enabled after changing.
das.usedefaultisolationaddress – whether to use the default isolation address (the management network gateway) – HA must be re-enabled.
das.failuredetectiontime – 15 seconds by default – HA must be re-enabled.
das.failuredetectioninterval – 1 second by default – HA must be re-enabled.
das.isolationshutdowntimeout – 300 seconds by default; applies only to the "Shut down VM" isolation response – HA must be re-enabled.
das.slotmeminmb – upper bound on the memory slot size.
das.slotcpuinmhz – upper bound on the CPU slot size.
das.vmmemoryinmb – default memory resource value assigned to a VM whose memory reservation is not specified or zero (Host Failures policy only).
das.vmcpuinmhz – default CPU resource value assigned to a VM; the default is 256 MHz.
das.iostatsinterval – changes the default I/O stats interval for VM Monitoring sensitivity; the default is 120 seconds, 0 disables the check, and any larger value enables it.

HA Best Practices

Q. What are the best practices for HA performance?
A. Configure networking with redundancy; set alarms to monitor cluster changes and notify administrators; monitor cluster validity – the cluster becomes invalid (red) if the current failover capacity is smaller than the configured failover capacity (it can also become overcommitted (yellow)), although DRS behavior is not affected when a cluster is red because of a VMware HA issue; check the operational status of the cluster on the Summary tab / cluster operational status screen.

Networking Best Practices for HA

Q. What are the best practices for HA during network configuration and maintenance?
A. When making changes to the network architecture, suspend the Host Monitoring feature to avoid heartbeat interruptions; when adding port groups or removing vSwitches, suspend Host Monitoring and also place the host in maintenance mode.

Q. Which networks are used for HA communications?
A. On ESX, HA communications travel over networks designated as service console networks; VMkernel networks are not used by these hosts for HA communications. On ESXi, HA communications travel by default over VMkernel networks, except those marked for use with VMotion; on ESXi 4.0 and later, explicitly enable the Management Network checkbox for VMware HA to use that network.

Q. How are cluster-wide networks considered?
A. The first node added to the cluster dictates the networks that all subsequent hosts must also have in order to be allowed into the cluster. A host with fewer or additional networks will fail to join.

Q. How is the network isolation address considered?
A. Even with multiple management networks, only one default gateway is specified by default, so use das.isolationaddressX to add isolation addresses for the additional networks. It is also recommended to raise das.failuredetectiontime to 20000 milliseconds (20 seconds – though we set it to 30 seconds, as mentioned above), because a node isolated from the network needs time to release its VMs' VMFS locks when the host isolation response is to fail the VMs over (rather than leave them powered on). This must happen before the other nodes declare the node failed, so that they can power on the VMs without an error that the VMs are still locked by the isolated node.

Q. Are any changes required on the physical switches?
A. Enable PortFast on the physical switch ports, as this prevents a host from incorrectly determining that the network is isolated while a lengthy spanning tree recalculation runs.

Q. What other networking considerations are required?
A. Incoming ports: TCP/UDP 8042–8045; outgoing: TCP/UDP 2050–2250. Port group names and network labels should be consistent across hosts. Provide NIC redundancy through NIC teaming or multiple management networks, and avoid multiple hops between servers in a cluster to prevent delays in heartbeat packets. With 2 NICs and 2 physical switches, two independent paths for sending and receiving heartbeats give the cluster more resiliency.

Decisions to make when configuring HA: HA Summary

Q. What should you look at when designing HA in your organization?
A. Verify prerequisites → design a proper network configuration → primary node placement → Host Monitoring status → isolation response decision (Shut down, Power off or Leave powered on) → HA together with DRS/DPM → HA admission control (Host Failures Cluster Tolerates policy, Percentage of Cluster Resources policy and Specify a Failover Host policy) → Virtual Machine Options → VM Monitoring
