A different take on the SRM upgrade order

Upgrading a vSphere environment with SRM can be tricky. If you follow the official best practice for upgrading vSphere and SRM, you might find SRM out of commission because the last part of the upgrade process, the part where you upgrade the DR site, fails.

The official recommended way to upgrade SRM

The official recommended way to upgrade SRM-connected sites is to upgrade all components in the primary sites first and then upgrade the components in the DR site. This means that you upgrade the vCenter components followed by the SRM components in Primary Site A, then the vCenter and SRM components in Primary Site B, and then the vCenter and SRM components in Primary Site C. As a last step you upgrade the vCenter and SRM components in the DR Site.
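To make the sequence concrete, here is a minimal Python sketch that lists the steps in the site-by-site order described above. The function name and the tuple layout are only illustrative assumptions; nothing here calls a vCenter or SRM API.

```python
def official_upgrade_order(primary_sites, dr_site):
    """Site by site: vCenter then SRM in each primary site, DR site last."""
    steps = []
    for site in [*primary_sites, dr_site]:
        steps.append((site, "vCenter"))
        steps.append((site, "SRM"))
    return steps


plan = official_upgrade_order(
    ["Primary Site A", "Primary Site B", "Primary Site C"], "DR Site"
)
for number, (site, component) in enumerate(plan, start=1):
    print(f"{number}. Upgrade {component} in {site}")
```

With three primary sites this gives eight steps, and both DR Site steps come at the very end.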


Now imagine that the vCenter upgrade or the SRM upgrade in the DR Site fails. As a consequence, you might have to roll back the vCenter and SRM servers in all sites, which costs time. In the hopefully rare case where you don’t have proper backups or snapshots from before the upgrade, you will also fail to meet your DR SLAs until the problem in the DR site has been solved. To me this never looked like the best order in which to perform such an upgrade.

So here is my different take on the upgrade order.

You first upgrade all vCenter servers [UPGRADE 1]. The advantage is that when a problem occurs during a vCenter server upgrade, you will notice it before you start upgrading any of the SRM servers. This allows you to roll back all vCenter servers before you have touched a single SRM server. You might wonder why this matters. In my view (and note that this is my own opinion), if problems occur during an upgrade, the worst of them will happen during a vCenter upgrade.

The sooner you realize there is a problem with a vCenter upgrade, the sooner you can solve it. If the problem with that single vCenter server persists and you cannot fix it immediately, you can still roll back all the vCenter servers to the previous version without having to worry about rolling back the SRM servers.

You will also know which vCenter server you have to concentrate on to get the problem fixed. If it is the DR vCenter server, you can work on the fix during the scheduled maintenance window and roll the vCenter server back for office hours until the problem is solved.

Even if the upgrade of the vCenter servers is successful, a problem can still occur during the SRM upgrade [UPGRADE 2]. But at least you will know that the problem does not lie with any of the vCenter servers, and you only have to troubleshoot the SRM instances.
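The difference between the two orders can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are hypothetical, the site names are the ones used in this post, and the script merely counts steps rather than talking to vCenter or SRM. It counts how many SRM servers have already been upgraded at the moment the DR vCenter upgrade fails.

```python
SITES = ["Primary Site A", "Primary Site B", "Primary Site C", "DR Site"]


def site_by_site_order(sites):
    """Official order: vCenter then SRM per site, DR site last."""
    return [(site, component) for site in sites for component in ("vCenter", "SRM")]


def vcenter_first_order(sites):
    """Alternative order: all vCenter servers [UPGRADE 1], then all SRM servers [UPGRADE 2]."""
    return [(site, "vCenter") for site in sites] + [(site, "SRM") for site in sites]


def srm_upgraded_before(steps, failed_step):
    """Count the SRM upgrades already completed when failed_step fails."""
    completed = steps[: steps.index(failed_step)]
    return sum(1 for _, component in completed if component == "SRM")


failure = ("DR Site", "vCenter")
print(srm_upgraded_before(site_by_site_order(SITES), failure))   # prints 3
print(srm_upgraded_before(vcenter_first_order(SITES), failure))  # prints 0
```

In the site-by-site order, three SRM servers are already upgraded and may need a rollback when the DR vCenter upgrade fails. In the vCenter-first order, no SRM server has been touched yet.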







