Zerto, my Notes and Thoughts

Earlier this year, I was involved in a POC for Zerto Virtual Replication in a VMware environment and took notes on the things that I liked and found useful about the product and the way it works.

I wanted to share here some of the things I have learned, for my own reference and for anyone out there who is starting with the product and may find these notes useful. A disclaimer, though: things you read here may have been misinterpreted or misunderstood by me, so you should research and use Zerto's Technical Documentation if you plan on implementing it in your production environment.

Overall, I really like Zerto; it is intuitive and simple to use, yet a very powerful and complete application that allows you to protect virtual machines with an RPO of seconds and gives you very convenient features.

Let's review some basic acronyms and components you need to be familiar with:
Zerto Virtual Manager or ZVM: The central management interface, installed on a Windows server; it allows you to manage all the DR tasks related to your source and target sites. You need one ZVM per vCenter Server.
Virtual Replication Appliance or VRA: The appliance deployed to each of the hosts in the cluster where the VMs you intend to protect reside, as well as to the target hosts. These appliances manage the actual replication of data from the source to the target site. VRAs run Debian Linux as their operating system.
Virtual Protection Group or VPG: A grouping of servers that replicate with the same parameters or settings; often used to group servers of the same application stack so they can be tested and recovered together. This matters because when you fail over to a Checkpoint, all the servers in the group are consistent.

From my personal point of view and own experience I will list some of the features and cool options in no specific order.

The installation process: It cannot be simpler. Installing Zerto is a straightforward process; you need one Windows Server to install the software and link it to its dedicated vCenter Server. The software requires a minimum of 4GB of free space. The installation wizard offers two options. The “Custom installation” lets you select a specific account to run the Zerto Virtual Manager service and choose between an external or embedded database. The “Express installation” uses the embedded database and runs the service as Local System. Regardless of the option, you need to enter the FQDN of your vCenter, an account with permissions and a Site Name. From the installation wizard you can also choose to participate in the Online Services and Zerto Mobile Application, which gives you access to Zerto Analytics, a great new tool that keeps expanding. At the end of the wizard, communication with vCenter and the credentials are validated; if there are any issues, a warning is displayed. Installation completes within 5 minutes.

Logging in for the first time: You access Zerto from a browser on port 9669 (https://DR-vCenter.kolkes.com:9669/zvm).
You need to provide a license key when you first log in, so you either enter the key manually or pair with another site that is already licensed and running.
Its HTML5 interface is clean and very responsive. You see multiple tabs where you configure different things, and one thing I found useful in this product is that you can access and initiate many tasks from various places in the UI.
On a brand new installation there will be pop-up messages that will guide you through finalizing the setup and things you need to do in order to start protecting your VMs.
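If the login page does not load, a quick check that the ZVM port answers can save some guessing. The snippet below is only a minimal sketch: the helper names `zvm_url` and `port_is_open` are my own, and the only facts taken from these notes are port 9669 and the `/zvm` path. It tests plain TCP reachability and says nothing about whether the ZVM service itself is healthy.

```python
import socket

ZVM_PORT = 9669  # port mentioned in these notes


def zvm_url(host, port=ZVM_PORT):
    """Build the browser URL for the ZVM interface (hypothetical helper)."""
    return f"https://{host}:{port}/zvm"


def port_is_open(host, port=ZVM_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling `port_is_open("DR-vCenter.kolkes.com")` from your workstation would tell you whether a firewall is in the way before you start troubleshooting the browser.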

Main Zerto management screen

The tabs in the UI are intuitive but here is a quick summary of them:

The Dashboard tab shows information about the status and performance of all the different VPGs and whether they are meeting the SLA you configured. It also displays information about the paired site, alerts and recent events.
VPGs tab is where you create and manage the Virtual Protection Groups for your site and can also see status and more detailed information including priority, actual RPO, performance etc.
VMs tab shows you all the VMs that are protected by the different VPGs and their information.
The Sites tab is where you can list and manage the paired sites, see connection status, performance and provisioned storage details; you can also see the quantity of VPGs and VMs per site from here.
Under the Setup tab, you will find the list of the hosts in your vCenter's cluster(s). It is in this section that VRAs are deployed to your ESXi hosts. It also conveniently lists datastores available with their capacity, usage and even a separate column to list the space used by DR, meaning for VRAs, replication, etc; anything being consumed by Zerto.
Offsite Backup is not a feature I have used with Zerto but as the name implies, it allows you to manage backup settings. 
The Monitoring tab will show you alerts, events and running or recent tasks. 
Finally, the Reports tab allows you to generate various types of reports, including but not limited to performance, usage and overall protection over time.

The first step to get Zerto working after installation is to deploy at least one VRA to one of your hosts, in both the protected and recovery sites. The deployment is quick and simple, but here is where I think Zerto has room for improvement: this task should allow you to deploy multiple VRAs at once, to all the hosts in a cluster, using an IP pool/range. The second suggestion I would make is to be able to select a datastore cluster. Maybe these are feature requests that their engineers can work on or are already looking into.
I know VRAs can be deployed in bulk with a PowerShell script, but I bet many customers would appreciate being able to do it from the GUI.
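To illustrate what I mean by an IP pool/range, here is a small sketch of the planning step only. It does not call Zerto or vCenter at all; the function name `plan_vra_deployment` and the dictionary layout are assumptions of mine, and the output would still have to be fed to whatever script actually performs the deployment. The VRA name follows the Z-VRA-hostname convention Zerto uses.

```python
from ipaddress import ip_address, ip_interface


def plan_vra_deployment(hosts, first_ip, netmask, gateway):
    """Assign consecutive IPs, starting at first_ip, to one VRA per host.

    Hypothetical planning helper: it only builds a list of settings.
    """
    network = ip_interface(f"{first_ip}/{netmask}").network
    start = ip_address(first_ip)
    plan = []
    for offset, host in enumerate(hosts):
        address = start + offset
        # Stop if the range walks out of the subnet given by the netmask.
        if address not in network:
            raise ValueError(f"IP range exhausted before host {host}")
        plan.append({
            "host": host,
            "vra_name": f"Z-VRA-{host}",
            "ip": str(address),
            "netmask": netmask,
            "gateway": gateway,
        })
    return plan
```

For example, `plan_vra_deployment(["esx01", "esx02"], "10.0.0.50", "255.255.255.0", "10.0.0.1")` pairs esx01 with 10.0.0.50 and esx02 with 10.0.0.51. The host names and addresses are made up.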

Here is the interface to deploy a VRA (one at a time):

VRA deployment fields
The VRA defaults to 3GB of RAM and 1 vCPU, but only RAM can be increased before deploying the appliance. 
Compression performance can be improved by adding more CPU to the VRA; I don’t know why that is not possible during deployment the same way RAM can be increased.
The VRA is automatically named Z-VRA-hostname; this name should not be modified in vCenter. One important thing to note here is that in order for the VRA deployment to work, SSH is enabled on the host; in environments where Strict Lockdown Mode is enabled, the deployment of the VRA will fail.

VPG creation: It is very intuitive to create your Virtual Protection Groups; you give the group a name and select the VMs from the pre-populated list. Note that only VMs that reside on hosts with an active VRA will be listed. It is best practice to have VRAs on all hosts in your cluster, but it is common to deploy only a few during a POC, so if you are testing the product, make sure the VMs you intend to replicate are pinned to the hosts with VRAs deployed. If you add multiple machines to a VPG, you can define a boot order for them.
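The "only VMs on hosts with an active VRA are listed" behaviour is easy to picture as a simple filter. This is just a sketch of the idea with made-up inventory data; the function `protectable_vms` is hypothetical and is not how Zerto implements it.

```python
def protectable_vms(vm_to_host, hosts_with_vra):
    """Return the VMs whose current host has an active VRA, sorted by name."""
    active = set(hosts_with_vra)
    return sorted(vm for vm, host in vm_to_host.items() if host in active)


# Made-up POC inventory: only esx01 has a VRA deployed.
inventory = {"app01": "esx01", "db01": "esx02", "web01": "esx01"}
candidates = protectable_vms(inventory, ["esx01"])
```

In this example db01 would not show up in the list until it is moved to esx01 or a VRA is deployed to esx02.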
Here are some of the important settings to configure during VPG creation:
Journal History in days or hours: this option lets you select how long you want to keep images of your replicated VMs within the VPG on the DR site. Here you also set the default Journal datastore and its hard limits and thresholds; these options can be broken down per VM within the group if you need to.
Target RPO alert generates an alert if the RPO is greater than the value you set; this setting is individual per VPG and it defaults to 5 minutes.
WAN compression checkbox is selected by default and it allows for compression of data before it is transmitted over the wire.
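The Target RPO alert boils down to comparing the actual RPO of each VPG with its own threshold. Below is a minimal sketch of that comparison; the `VpgStatus` class and the sample values are my own invention, and only the 5 minute default comes from the product.

```python
from dataclasses import dataclass

DEFAULT_TARGET_RPO_SECONDS = 5 * 60  # the 5 minute default


@dataclass
class VpgStatus:
    """Hypothetical snapshot of one VPG: name, actual RPO and its target."""
    name: str
    actual_rpo_seconds: int
    target_rpo_seconds: int = DEFAULT_TARGET_RPO_SECONDS


def vpgs_breaching_rpo(statuses):
    """Return the names of VPGs whose actual RPO exceeds their own target."""
    return [s.name for s in statuses
            if s.actual_rpo_seconds > s.target_rpo_seconds]


# Made-up sample: "db" has a stricter, per-VPG target of 60 seconds.
sample = [VpgStatus("erp", 8), VpgStatus("web", 420), VpgStatus("db", 90, 60)]
breaching = vpgs_breaching_rpo(sample)
```

Because the target is set per VPG, a group at 90 seconds can be in alert while another at the same RPO is perfectly fine.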

For Storage selection, again, you can be very granular and select different datastores and disk formats for individual machines or their individual VMDKs. To use preseeded volumes, select the VM and click Edit Selected; a dialog pops up where you can select the datastore and the pre-existing file on the target site. If you have a Preseeded VM with multiple disks, you need to complete these steps for each volume separately. A Temp Data disk is replicated once at the initial sync but is not synchronized after that.

Thin provisioning does not save you space on the replication itself; if checked, thin disks are used when the VM is failed over and running on the DR site, either as a Test or Live failover.

Under the Recovery tab, you select the networks for Live and Test Failovers, the destination folder and an option to enter Pre and Post recovery scripts if you have any to run.

The NIC tab is where you set the IP settings for each network adapter and can break down network assignment for individual NICs for both Test and Live Failovers or Moves. You can select DHCP, Static or no IP configuration. If you need to create a new MAC address during the failover, editing the VM allows you to do that; this is also where the Preferred and Alternate DNS servers and DNS suffixes can be entered.

To use the Zerto Backup option, a Repository needs to exist. This container can be a Network Share or a local folder on the ZVM node. A Repository is configured under the main Setup tab; if you plan on using a folder on the Zerto Virtual Manager server, it can be an existing folder, or a new one can be created right from the Repository creation section in the ZVM console.

Here are some screenshots from setting up a Backup repository on the ZVM:

You have some basic options for backups, such as retention period and job scheduling.

The Summary tab shows you all the settings and total provisioned space you will need for the VPG; you click DONE to create the VPG. Zerto will initiate the replication of the systems you included in the group.

Upgrading Zerto and its components: Both the ZVM and the VRAs can easily be upgraded in place with minimal downtime. I upgraded the ZVM from 4.5 to 5.0 and it was painless; then I proceeded to upgrade the VRAs, which was also straightforward, although judging by the tasks in vCenter it looked more like a replacement than an in-place upgrade. Later on I learned that the OS volume of the appliance is basically replaced. One thing I noticed about Zerto is that it generates many tasks in vCenter; this doesn't mean anything bad, it's just something very noticeable.

Other notes and things to know:
  • A single ZVM can manage incoming and outgoing replications of multiple VPGs. Same goes for the VRAs.
  • File level recovery is very smooth and easy. It is one of the tasks that is granular to a single volume of a VM within a VPG; other recovery tasks, such as a Live or Test failover, affect and involve ALL the VMs in the VPG at once.
  • The MOVE option is great; it allows you to relocate the machines in a VPG to the target site with the option to reprotect. Many people use this to simply migrate workloads from one site to another: think of a datacenter move or a planned relocation of servers in preparation for an event such as a storm.
  • In the new Zerto Virtual Replication version, a VM can belong to multiple VPGs that are replicated to separate sites (think one-to-many), but a single virtual machine cannot belong to more than 3 VPGs.
  • Zerto can create thin or thick lazy zeroed disks for the replicated server. If you need a disk to be Eager Zeroed, pre-create it as Eager Zeroed at the target and use the Preseeded Volume option when configuring the machine's replication.
  • If the source ZVM is down, either during upgrade or a crash, no Checkpoints are created but replication continues between the VRAs. If the target ZVM is down, no checkpoints can be committed to the replica disks. 
  • With the Clone feature you can create a copy of the replicated machine and keep it independent of everything else, it won't be linked or attached to Zerto, nor will it have anything to do with replication, unless of course you later protect it.
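The 3 VPG limit per VM mentioned in the list above is the kind of rule that is easy to check before you start designing a one-to-many layout. Here is a minimal sketch with made-up VPG names; `vms_over_vpg_limit` is a hypothetical helper, not something provided by Zerto.

```python
from collections import Counter

MAX_VPGS_PER_VM = 3  # limit mentioned in these notes


def vms_over_vpg_limit(vpgs, limit=MAX_VPGS_PER_VM):
    """Return, sorted, the VMs that appear in more than `limit` VPGs.

    `vpgs` maps a VPG name to the list of VM names it protects.
    """
    counts = Counter(vm for members in vpgs.values() for vm in set(members))
    return sorted(vm for vm, n in counts.items() if n > limit)


# Made-up layout: vm1 is protected four times, vm2 twice.
layout = {
    "to-site-a": ["vm1", "vm2"],
    "to-site-b": ["vm1"],
    "to-site-c": ["vm1"],
    "to-site-d": ["vm1", "vm2"],
}
offenders = vms_over_vpg_limit(layout)
```

Here only vm1 would be flagged, since it is the only machine in more than three groups.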

I am going to leave it at this for now. There are many other features and options to talk about; maybe I will do a second part that covers how Zerto integrates with AWS and Azure and how service providers use it for multi-tenancy with ZORGs and ZCM. I will also need more space and time to write about the Analytics offering included in the Enterprise Cloud Edition, Commit Policies, my ideas for improvements, things I dislike, such as some of the built-in email alerting, and some gotchas I have already experienced after rebuilding a vCenter.
As always, any comments or questions are welcomed and encouraged.

1 comment:

  1. Hey Jorge. Great write-up about a great product. We're currently doing a PoC with Zerto as well together with one of our customers. They will (if they end up buying it) protect about 3500 VMs running on about 190 ESXi hosts in two main sites.
    I would agree with you that selecting a datastore-cluster for the VRA would be nice, and I know they are already working on that and it will be in the next release (at least that was what I was told by one of Zerto's engineers).
    Overall I really like Zerto. It's very easy and intuitive, so I will try to push this product with our customers whenever I see a use-case where the customer could benefit from the product.
    One major use-case is the threat of encryption malware. If your servers are protected by Zerto you can do a failover (5 second interval in the first 15 minutes, 5 minutes intervals after that) and have your server up and running again in no time (and you don't have to pay for the decryption).