NSX-T Error – Failed to uninstall the software on host. MPA not working. Host is disconnected


I was starting over in my NSX-T 2.4 Datacenter Ravello lab when this error suddenly showed up. I reckon I was too hasty in removing something (I don’t quite remember what), so the uninstall process would not complete and failed with: Failed to uninstall the software on host. MPA not working. Host is disconnected.

No amount of reconfiguring NSX on the host/cluster would help.

CLI to the rescue. While the process to uninstall NSX-T from a host is documented in this VMware KB, it doesn’t quite tell us how to connect to a host’s NSX-T CLI. So here’s the process, elaborated a bit more than the KB article:

  1. PuTTY to the NSX Management Cluster’s IP/DNS name
  2. Log in with the admin account
  3. Get the NSX Management Cluster’s thumbprint with: get certificate api thumbprint
  4. Copy the thumbprint
  5. PuTTY to the host with the error, using root credentials
  6. Punch in nsxcli to get into the host’s NSX-T CLI; note how the prompt changes
  7. Run the command: detach management-plane <name of Manager> username admin password <password> thumbprint <thumbprint>
  8. Wait 5 or so seconds, after which the host should report: Node successfully unregistered.
  9. Exit the NSX-T host CLI with the exit command and run the command: vsipioctl clearallfilters -Override
  10. Next, stop the netCP service with: /etc/init.d/netcpad stop
  11. Finally, go back into the host’s NSX CLI (with nsxcli) and run the final command: del nsx
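
The detach command in step 7 is long and easy to mistype, so here is a minimal Python sketch that assembles it from the copied values before you paste it into nsxcli. The function name build_detach_command and the sample address 192.0.2.10 are hypothetical, and the check assumes the API thumbprint is a SHA-256 digest written as 64 hexadecimal characters. It only builds a string and does not talk to NSX.

```python
import re


def build_detach_command(manager: str, password: str, thumbprint: str,
                         username: str = "admin") -> str:
    """Return the nsxcli detach command from step 7 as one string.

    Hypothetical helper: it only formats text for pasting into nsxcli.
    """
    manager = manager.strip()
    thumbprint = thumbprint.strip()
    if not manager:
        raise ValueError("manager address is empty")
    # Assumption: the thumbprint is a SHA-256 digest (64 hex characters).
    # A stray space or a truncated copy/paste is caught here.
    if not re.fullmatch(r"[0-9a-fA-F]{64}", thumbprint):
        raise ValueError("thumbprint should be 64 hexadecimal characters")
    return (
        f"detach management-plane {manager} "
        f"username {username} password {password} "
        f"thumbprint {thumbprint}"
    )


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_detach_command("192.0.2.10", "<password>", "ab" * 32))
```

If you run a multi-node management cluster, one commenter below reports an “Invalid Thumbprint” error when using the VIP, so it may be safer to pass the address of an individual manager node.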

Going back to the NSX GUI, hit the refresh button to see the host(s) back to a pristine state. Happy days.


13 Comments

  1. I’ve run into this issue, but nsxcli wasn’t available anymore on the host.
    To solve this:

    – System > Fabric > Nodes > Host Transport Nodes
    – select host
    – Remove NSX
    – ONLY Select “Force Delete”

  2. I had the same issue, but nsxcli was already gone from the ESXi host. To fix this I had to select the host where this was occurring, then choose Remove NSX and select only the Force Delete option. With both options selected, the issue kept popping up.

  3. This worked perfectly on a 2.5 deployment. One small tweak was needed, at least in my case: with a multinode management cluster, trying to unregister using the VIP address gave me an “Invalid Thumbprint” error. I needed to point directly at one of the nodes.
