My muse and inspiration

Although no longer this young and small, my granddaughter has shown an interest in technology from a very early age. Seeing her interest and thrill at new discoveries inspires me each day. Thank you Maisie, love you loads xxx


VMware vSphere, MS Clusters and vMotion resulting in cluster service failures

Quick guide here, from experiences we’ve gained.

When running an active/passive MS Failover Cluster (MSFoC) in a VMware environment, you need to be aware of how the cluster behaves when a vMotion event happens on the active node. Because of the momentary interruption in networking and activity on the active node during the final cutover of the VM to the new host, the passive node can often see this as a failure and try to take over. As the active node is not actually down, both nodes end up trying to run the services, and ‘split-brain’ detection then shuts the service down on both nodes.

Our way around this issue is to set DRS for these cluster nodes to manual, making sure we’re aware they are there. In the event of vMotion being needed (maintenance generally, or for manual load balancing), we ensure the MSFoC service is shut down on the passive node first. That way we can move whatever we like as we need to without triggering a false takeover.

Bear this in mind when running MSFoC in VMware. We see the impact more often with shared-disk clusters, but have also seen it on non-shared-disk clusters.

Cisco UCS and VLAN conflicts

If you ever find yourself with conflicting VLANs as a result of a UCS code upgrade, don’t panic. If, like us, you use NFS, and the NFS VLAN conflicts with a VSAN FCoE VLAN, your NFS traffic will likely be impacted. Once we had resolved the conflict by moving the VSAN FCoE VLAN to an unused one, we still had no network traffic on the VLAN we use for NFS. We had to change the VLAN number for that VLAN to something else, commit the change, then change it back. This forced a refresh of the VLAN config and restored connectivity.

VMware vCloud Director 5.1 VXLAN install issues

Here is an interesting issue I hit today. Installing VMware vCloud Director 5.1 with vShield Manager 5.1.2a was giving me problems creating the VXLAN backend. Each time it would fail to install the VIB with “vib-module for agent not installed on host …. (vshield-vxlan-service)”. After lots of searching, I raised an SR with VMware. Then, just after raising the SR, I found my answer, thanks to this post:

http://browse.feedreader.com/c/vClouds/289627048

Snipped here in case it vanishes from feedreader….

<snip>

When I tried to enable VXLAN in my vCloud Director setup I got the following error: “VIB module for agent is not installed on host vShield-VXLAN-Service”

This is solved by installing the VXLAN VIB which is available as a download on your vShield Manager: https://vsm-ip/bin/vdn/vibs/5.1/vxlan.zip on all your ESXi hosts in your VCD Cluster.

I used vCenter Update Manager to update all my ESXi hosts. No reboot is required.

After that, log in to your vCloud Networking and Security appliance (previously known as vShield Manager), navigate to Networks, Network Virtualization and click on Preparation. Click Resolve to prepare your hosts.

</snip>

Hope this note helps someone.

Rescan SCSI bus in Linux

Useful if you’ve just added a disk to a Linux server and want to rescan to be able to use it:

echo "- - -" > /sys/class/scsi_host/host0/scan
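
The command above only touches host0, and a new disk may hang off a different SCSI host. As a minimal sketch, the same write can be looped over every host entry. The function name rescan_scsi_hosts and its optional directory argument are made up for illustration (the argument just lets you point it at a scratch directory for a dry run); the three dashes are the wildcards for channel, target and LUN.

```shell
#!/bin/sh
# Sketch: rescan every SCSI host rather than just host0.
# rescan_scsi_hosts is a hypothetical helper name, not a standard tool.
# The optional first argument overrides the sysfs directory (for dry runs).
rescan_scsi_hosts() {
    base="${1:-/sys/class/scsi_host}"
    for scan_file in "$base"/host*/scan; do
        # Skip the unexpanded glob when no host entries exist.
        [ -e "$scan_file" ] || continue
        # "- - -" means: all channels, all targets, all LUNs.
        echo "- - -" > "$scan_file"
        echo "rescanned ${scan_file%/scan}"
    done
}
```

Run it as root with no arguments (rescan_scsi_hosts) to use the real sysfs path, then check lsblk or dmesg for the new device.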

vCenter custom alarms

I had a requirement today to create a custom alarm for vCenter 5.0, one that would alert me to a failed virtual disk consolidation event. This has been causing me a few issues and I want to know about it before the client does.

For anyone trying to do this, I found this page that tells all:
http://vmice.wordpress.com/2013/01/10/custom-alarms-for-events-in-vcenter-5-x/

Long and the short of it, you create a new alarm for an Event, and for the triggering event, you use the API definition. In this case, it was “com.vmware.vc.VmDiskFailedToConsolidateEvent”.

FYI, Veeam provide a nice breakdown of these API definitions:
http://www.veeam.com/support/vcEvents.html

Now, any time I get one of these events, my team will get notified and we can take action to rectify it.

Cloud – What are we trying to do here?

The term ‘Cloud’ is still evolving. We are all trying to come to terms with the implications around this simple word, what it really means to our customers, to our businesses and to us as individuals. Frequently I hear clients say they want to ‘get to the cloud’ without knowing what they actually mean to achieve with this statement.

So, let’s break it down.

Challenges

Businesses face many challenges, some of them financial, others political, that all affect where best to spend their budget to make the most impact. Publicly listed businesses have shareholders and investment analysts to placate. They need to be seen to be doing the right thing (returning on investment) while still growing their business and maintaining their image. They need to invest shrewdly, often sticking to old faithful ‘safe’ options rather than newer technologies that have never been used inside their own company. They can be risk-averse, but this comes with its own risks too, holding them back and making them spend money on maintaining the status quo. They have legacy systems that teams of people support, that they pay vendors large sums of money to maintain. These are not going away overnight. Do they have the funds to re-architect the legacy systems onto a new platform? Do they really want to, or even need to?

Other businesses seem to go the other way. They jump on every bandwagon that comes along, being ‘early adopters’ of each bleeding-edge technology before they even know what business problem they are trying to solve with it, or if it can even solve a problem they may ever experience.

All of this leads to confusion about what ‘cloud’ can be to them and how it can assist, or hinder, their business processes.

A march into the clouds

Here is a definition of ‘cloud’ that I prefer:

Cloud is a concept, one that encompasses

  • process
  • automation
  • orchestration
  • compliance
  • federation
  • cost reduction
  • self service
  • flexibility
  • complexity reduction

which all come together to provide a service that the business can consume at a rate and cost that suits their speed of business.

(Note here that I say ‘business’ – there is no mention of IT yet. Business should lead the adoption of IT, not adjust their business processes to fit around the latest greatest product the IT department have found.)

The outcome should be that the business can move quickly and decisively, without the delays inherently seen with delivering supporting services. These services don’t need to be IT-related, they could be around human resources or facilities management. However, most revolve around an IT-based service at some point, so we’ll keep on track here.

I’m off to the cloud – what do I do next?

Deciding to start on the journey to cloud is just one small step forward. After all, everyone is doing it so it must be the right thing, right?

Stop, take a step back. Look at your business, at the processes and procedures that are the bedrock that holds it together. What business requirements are you trying to address with your cloud initiative? What problems are you trying to solve? Just moving things into the cloud isn’t an answer to a business need, it is the ‘bandwagon’ mentality. Evaluate what you need to solve. Is there a true need to embrace a full cloud initiative, or are you not even at the first step towards virtualisation in your own environment? What does the business think about your cloud move – are they happy that you are moving them to a cloud? What SLA/OLA requirements do you have around the services that you want to migrate? Will a cloud strategy still address them so that your business can carry on, or are you going to impact them, causing costs and business impact? Can a service provider, whether that is your own internal IT department or an external provider, deliver the levels of service and availability that your business (not your IT department) truly demands?

If you are at this stage, engage a professional to assist you. Choose an independent, someone with a track record, someone who is not aligned to a vendor/provider/technology, someone who has experience of your line of business. Run yourselves a workshop, involve your line of business leaders. Make sure the decisions are being made by the right people, those who depend on the service you are going to change, those that use it every day. They will tell you their true business requirements. Use these as the basis for your discovery process, making sure to keep everything documented, the reasons why a certain business needs the SLA/OLA they ask for. Ensure they understand how each step they move towards higher availability also moves them a step up the cost ladder.

This is just a bit of a brain dump from me tonight as I sit here in my hotel room contemplating my navel (it has lint in it, where did that come from??). I’ll add more to this in the coming days/weeks.

Zerto – VM-level replication moves to the hypervisor

I saw a great product today, one that has been around for a while now but is getting more coverage and uptake.

Winner of Best of Show at VMworld 2011, Zerto is a product that integrates with vSphere environments. It provides VM-level replication and protection, with RPO compliance options.

  • By moving the replication away from the storage layer, you remove a lot of the previous restrictions for data protection.
  • No longer do you need to worry about which datastore a VM is on in order for it to be replicated and protected.
  • You don’t have to replicate all VMs on a datastore to the same destination.
  • You can leverage cloud-based providers as the target for your replication, giving you an off-site DR/BCP capability.
  • Zerto includes the ability to replicate between vCloud and non-vCloud vSphere environments.
  • CDP is included as part of the offering, allowing you to recover to a granular point-in-time.
  • Disk writes are copied in memory at the hypervisor layer, removing the storage overhead of reading from the disk to copy changed blocks.

I should get this in the lab soon, so watch this space for more updates. With the use of the Zerto API, we should be able to integrate this as a ‘value add’ service that a client can easily order through the Cisco IAC portal.

Tidal workflow for random password generation

Thought I would share this workflow to generate a random 12-character password, which can then be used for OS customisation, etc.

http://dl.dropbox.com/u/62607087/Telemorphix_Utilities.tap

Feel free to use as you need to 🙂
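
Outside Tidal, the same idea can be sketched in a few lines of shell. This is an illustration only, not the contents of the workflow above: the function name generate_password is made up, and restricting the password to letters and digits is an assumption (the workflow may well use a different character set).

```shell
#!/bin/sh
# Sketch: generate a random password from /dev/urandom.
# generate_password is a hypothetical name; the A-Za-z0-9 character set
# is an assumption, not taken from the Tidal workflow.
generate_password() {
    length="${1:-12}"
    # tr drops every byte that is not a letter or digit;
    # head keeps the first $length characters that survive.
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "$length"
    # Finish with a newline so the output is easy to capture or print.
    echo
}
```

Calling generate_password with no argument gives 12 characters; pass a number (for example generate_password 20) for a different length.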

UCS 2.0 and Port Channels

I came across this interesting article today around the new capabilities of UCS 2.0. It’s a shame, though, that it seems to indicate you need the new 6248 fabric interconnect, and 2208 fabric extenders in the chassis, to take advantage of the “Port Channel to the blade” capabilities it talks about. Does this really require the next gen 40GE hardware, or can it be used with existing UCS 1.x hardware too, simply by upgrading the firmware? Answers on a postcard, people!

ThinkAheadIT Blog