Let’s talk about configuring and deploying VMware Cloud Disaster Recovery #VCDR #vExpert


So, let’s talk about VCDR (VMware Cloud Disaster Recovery). The technology came to VMware through the Datrium acquisition, and I have some customers who are in the process of looking into it and deploying it.

If you have ever used SRM, you will feel very at home with VCDR, as you will see as I explain it in more detail.

I am a big fan of DR in general. I used to work for a BC/DR provider, and I lost count of the number of times people never tested their plans and were then surprised when they didn’t work as hoped. I always used to say: you should be happy they failed now; imagine if this had been real and you couldn’t recover! As they say with backups, they are only as good as the last time you tested them.

I used to be a vSphere admin as well, and I spent a lot of time using SRM with vSphere Replication (VR) and array-based replication (ABR) back in the vSphere 5.5/6.5 days.

Now that I work in Customer Success, it’s my job to help customers use things like VCDR properly and get the most out of it, so I have been spending some solid time deploying VCDR in my home lab and in a test VMConAWS SDDC.

VCDR works differently from, let’s say, SRM:

Basic VCDR Overview

You get a Cloud File System (CFS) deployed with AWS S3, and all your VM snapshots are copied securely into it and stored encrypted.

You have 2 main recovery options:

  • On demand = You get the CFS, and when you want to run a test or recover, an SDDC is spun up especially for that purpose. As you can imagine, this has its pros and cons, the main con being a longer RTO.
  • Pilot Light = You have at least a 3 node VMConAWS SDDC (2 node will be supported soon) sat ready to go. You can do all your testing in it and get all your reports for compliance etc. whenever you like, and since the SDDC is already deployed, you don’t have to wait for one to be spun up in a DR event; you can get right to it.

VCDR uses the VMware APIs for Data Protection (VADP), so it’s just like the majority of backup vendors on the market. Nothing new there: if you are already taking snapshots for your backups, you should have zero issues getting VCDR running!

Regardless of the option you pick, the CFS and the recovery SDDC will reside in the same AWS AZ. This helps with speed, as the data is much more local to the SDDC.

Now, when you protect a VMConAWS SDDC, it must be in a different AZ or Region from your CFS. If the AZ hosting the SDDC were to go down, you would be pretty stuck, since your DR would be in the same AZ!

VCDR knows this and forces you to pick a different one, as in the example shown below:

SDDC Options

My CFS is in usw2-az2, so I can’t protect any of the VMC SDDCs in that AZ. As you can see, only the top one is available to be used as a protected site.
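To make the rule concrete, here is a minimal sketch of the same filtering in Python. It is only an illustration of the logic, not how VCDR implements it: the `Sddc` class, the function name and the inventory are all made up, and I am assuming that comparing AZ IDs is enough to decide eligibility.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sddc:
    """Hypothetical stand-in for an SDDC record (not a VCDR API object)."""
    name: str
    availability_zone: str  # AZ ID, e.g. "usw2-az2"


def eligible_protected_sddcs(cfs_az: str, sddcs: list[Sddc]) -> list[Sddc]:
    """Return the SDDCs that sit in a different AZ from the CFS."""
    return [s for s in sddcs if s.availability_zone != cfs_az]


# Made-up inventory for illustration only.
inventory = [
    Sddc("sddc-a", "usw2-az1"),
    Sddc("sddc-b", "usw2-az2"),
    Sddc("sddc-c", "usw2-az2"),
]

# With the CFS in usw2-az2, only sddc-a can be a protected site.
for sddc in eligible_protected_sddcs("usw2-az2", inventory):
    print(sddc.name)  # prints "sddc-a"
```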

Regarding the CFS:

  • 1 CFS per Recovery SDDC
  • 1 Recovery SDDC per CFS

Now that we have got that basic intro out of the way, let’s talk about getting it up and running to protect your on-prem environment.

Deploying the Connector Appliance

Regardless of whether you are protecting an on-prem environment or SDDC, you will need to deploy a connector, and that is made easy for you:

Protecting an on-prem environment

Once that is done you will have to download the connector and deploy it into the vCenter.

Downloading the connector appliance

Just like any other OVA-style deployment, you deploy it into your vCenter:

So that is pretty straightforward, but there are a few key points that you should remember:

  • You don’t provide any IP details; this config will be done when you power on the VM.
  • Do not name the Connector VM using the normal naming conventions of your protected site. You can protect VMs by naming convention, and that would pick up the Connector VM and just cause issues.
  • If you are deploying the connector into VMConAWS to protect an SDDC, you MUST have one connector per cluster. This is due to the way VCDR/VADP works within VMConAWS.
  • On-prem there is a minimum requirement of 1 connector, but it is recommended to deploy at least 2, so that if one is down the other takes over the load.
  • You need 1 connector for every 500 VMs managed by the protected vCenter (regardless of whether those VMs are protected by VCDR).
  • If all connectors are down when a snapshot is due to be taken, that snapshot will be missed, and the schedule will resume when the connectors are back online.
  • Connector software updates are automatic and pushed directly to the connectors. The connector software is made up of Docker containers, so these are restarted and the VM itself does not need to be rebooted.
  • Connectors are stateless and can be reinstalled at any time without losing any existing backup data.
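The sizing points above boil down to a bit of arithmetic, so here is a rough sketch of it in Python. The function name and parameters are my own, and two things are assumptions on my part rather than anything from the official docs: that the rules combine by taking the largest number, and that on-prem you treat the recommended 2 as your floor rather than the supported minimum of 1.

```python
import math

VMS_PER_CONNECTOR = 500  # 1 connector per 500 VMs managed by the vCenter


def connectors_needed(total_vms: int, clusters: int = 1, on_prem: bool = True) -> int:
    """Estimate connectors for one protected vCenter (illustrative only).

    total_vms counts every VM the vCenter manages, protected or not.
    """
    if total_vms < 0 or clusters < 1:
        raise ValueError("total_vms must be >= 0 and clusters >= 1")

    # Always at least one connector, then one more per extra 500 VMs.
    by_vm_count = max(1, math.ceil(total_vms / VMS_PER_CONNECTOR))

    if on_prem:
        # Assumption: use the recommended 2 as the floor for redundancy.
        return max(by_vm_count, 2)

    # VMConAWS: one connector per cluster is mandatory.
    return max(by_vm_count, clusters)


print(connectors_needed(1200))                            # 3
print(connectors_needed(300))                             # 2
print(connectors_needed(300, clusters=3, on_prem=False))  # 3
```

Treat the result as a starting point for a conversation, not a sizing guarantee.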

The Connector VM has 8 vCPUs (reserved), 12 GB of RAM (reserved) and 100 GB of disk space. In my lab, I downsized it to 4 vCPUs so it could run on my Intel NUC, and it worked perfectly fine. THIS OF COURSE IS TOTALLY UNSUPPORTED.

Once you have powered it on you will get a login screen. Give it a bit of time after the login screen comes up before you log in, otherwise you will come across this:

Slow down there, have a brew and wait a few mins

You use the default login details of:

admin/vmware#1

You will then make your way through the rest of the configuration process:

configuring the connector using the VM console

The label should be the same as the name you have given the VM within vCenter.
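Since the connector’s VM name and label go hand in hand, it is worth checking the name against the naming conventions you plan to protect on before you commit to it. Below is a small sketch of that check. I am assuming glob-style wildcards via Python’s `fnmatch` as a stand-in for however VCDR matches names, and the patterns and VM names are invented for the example.

```python
from fnmatch import fnmatchcase


def matching_patterns(vm_name: str, patterns: list[str]) -> list[str]:
    """Return every pattern that would also select vm_name (case-insensitive)."""
    name = vm_name.lower()
    return [p for p in patterns if fnmatchcase(name, p.lower())]


# Hypothetical naming conventions used to select protected VMs.
protection_patterns = ["prod-*", "*-sql-*"]

# A connector named like a production VM gets caught by the pattern.
print(matching_patterns("prod-vcdr-connector-01", protection_patterns))  # ['prod-*']

# A name outside the convention is left alone.
print(matching_patterns("vcdr-connector-01", protection_patterns))       # []
```

An empty list means the connector name is safe to use with those patterns.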

The Connector VM previously needed admin rights in the vCenter. The VCDR team have now released a Python script that you can use to create a user with just the specific permissions it needs, and it can be found here:

https://docs.vmware.com/en/VMware-Cloud-Disaster-Recovery/services/vmware-cloud-disaster-recovery/GUID-1417AA78-74A1-4121-9CB7-15E95D12549C.html

Before this script came about, admin rights were the only way to go, and that is what I have based my testing on. One key thing to be careful of: although the script sets up all the permissions, the account it creates will not work for failback.

Pairing it up is easy; you do it in the cloud console:

Pairing to the on-prem VC

Key things I came across in my testing:

If you want to use a custom account, it can be a vsphere.local or domain-joined account, but based on my testing you need to do 2 things:

  • The account must be part of the vsphere.local\administrators group
  • The account must also have direct local admin rights to the vCenter
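If you like to sanity-check things before you click Pair, the two requirements can be written down as a simple pre-flight check. This is purely a sketch of the logic from my testing: the `Account` class is a hypothetical record you would fill in yourself, and nothing here talks to vCenter or VCDR.

```python
from dataclasses import dataclass, field

REQUIRED_GROUP = "vsphere.local\\administrators"


@dataclass
class Account:
    """Hypothetical description of the account you plan to pair with."""
    name: str
    groups: set[str] = field(default_factory=set)
    has_vcenter_admin_rights: bool = False


def pairing_blockers(account: Account) -> list[str]:
    """List the reasons this account would fail to pair, per my testing."""
    problems = []
    if REQUIRED_GROUP not in {g.lower() for g in account.groups}:
        problems.append("not a member of " + REQUIRED_GROUP)
    if not account.has_vcenter_admin_rights:
        problems.append("no direct admin rights to the vCenter")
    return problems


# Example: in the group, but missing the direct admin rights.
svc = Account("svc-vcdr", groups={"vsphere.local\\Administrators"})
print(pairing_blockers(svc))
```

An empty list means both criteria are met.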

If these criteria are not met, you will not be able to pair the vCenter with VCDR, and you will get these errors:

Error 1
Error 2 is a bit more obvious

Once that has all been done, you should be good to go!

The next blog post will be on actually configuring up the Protection Groups and Recovery Plans!

