
Nutanix Erasure Coding (EC-X) – How it works

Nutanix Erasure Coding (EC-X) is one of the storage reduction technologies available in the Nutanix Acropolis Operating System (AOS), the operating system that runs the Nutanix cluster. It takes one or two data copies and creates a parity that can be used to recreate the data if required.

I have had quite a few questions recently about how EC-X works, so I decided to put together an official Nutanix paper about it. You can download it via Nutanix Erasure Coding.

Because software is constantly improved, the technical paper and this blog post explain how EC-X works as of today, meaning AOS releases up to 5.10.9, 5.11.2, and 5.16.

This blog post provides a summary of the paper and an example. If you need more information about how it works, more examples, recommendations, and so on, download the Nutanix Erasure Coding doc.

When you configure the Nutanix Distributed Storage Fabric (DSF) with a replication factor of 2 or 3, the Nutanix cluster maintains two or three exact copies of the same data on different nodes to ensure data availability. The actual logical capacity available depends on the replication factor you choose. When you use replication factor 2 (also called fault tolerance 1), you have approximately 50 percent of the capacity available. When you use replication factor 3 (also called fault tolerance 2), you have approximately 33 percent of the capacity available.
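
The 50 and 33 percent figures follow directly from storing every block two or three times. A minimal sketch of that arithmetic, assuming an idealized cluster with no other overhead (the function name usable_fraction is made up for illustration and is not a Nutanix API):

```python
def usable_fraction(replication_factor: int) -> float:
    """Fraction of raw capacity left for unique data when every
    block is stored replication_factor times (idealized)."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return 1 / replication_factor


for rf in (2, 3):
    print(f"RF{rf}: about {usable_fraction(rf):.0%} of raw capacity is usable")
```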

Nutanix DSF works with different logical constructs:

  • Slice: 32 KB (8 KB is a subregion of a slice that you can address).
  • Extent: 1 MB (for deduplicated data, an extent is 16 KB).
  • Extent group: Between 1 and 4 MB.
  • Container: Logical grouping construct where VMs are placed and EC-X is enabled.

EC-X works at the extent group layer, meaning it uses 1–4 MB data sets when performing its XOR calculations.
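
To make the XOR idea concrete, here is a minimal sketch of single-parity XOR over equally sized blocks. It is a generic illustration, not Nutanix's implementation: the helper xor_blocks and the tiny byte strings standing in for 1–4 MB extent groups are assumptions, and the sketch covers only strips with one parity block.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Tiny stand-ins for two extent groups (hypothetical content).
egroup_a = b"vDisk1-egroup1"
egroup_b = b"vDisk2-egroup1"

# The parity block is the XOR of all data blocks in the strip.
parity = xor_blocks([egroup_a, egroup_b])

# If egroup_a is lost, XOR the surviving data with the parity to rebuild it.
rebuilt_a = xor_blocks([egroup_b, parity])
assert rebuilt_a == egroup_a
```

Because XOR is its own inverse, any single missing block in the strip can be rebuilt from the remaining blocks plus the parity.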

The table below presents the available strip sizes.

 
# of Nutanix Nodes | Replication Factor 2 EC-X Strip Size (Data / Parity) | Replication Factor 3 EC-X Strip Size (Data / Parity)
4                  | 2/1                                                  | N/A
5                  | 3/1                                                  | N/A
6                  | 4/1                                                  | 2/2
7                  | 4/1                                                  | 3/2
8                  | 4/1                                                  | 4/2
9                  | 4/1                                                  | 4/2

Note: # of Nutanix Nodes includes at least one node used for failover with replication factor 2 and at least two nodes used for failover with replication factor 3.
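
The strip sizes translate into a storage overhead of (data + parity) / data. The sketch below copies the strip sizes from the table and compares the resulting overhead with plain replication. It assumes, for illustration only, that all data ends up in full strips, which a real cluster will not necessarily achieve; STRIP_SIZES and storage_overhead are made-up names.

```python
# (data, parity) strip sizes from the table; None means EC-X is not available.
STRIP_SIZES = {
    4: {2: (2, 1), 3: None},
    5: {2: (3, 1), 3: None},
    6: {2: (4, 1), 3: (2, 2)},
    7: {2: (4, 1), 3: (3, 2)},
    8: {2: (4, 1), 3: (4, 2)},
    9: {2: (4, 1), 3: (4, 2)},
}


def storage_overhead(data: int, parity: int) -> float:
    """Raw capacity consumed per unit of unique data in a full strip."""
    return (data + parity) / data


for nodes, by_rf in STRIP_SIZES.items():
    for rf, strip in by_rf.items():
        if strip is None:
            print(f"{nodes} nodes, RF{rf}: EC-X not available")
            continue
        data, parity = strip
        print(
            f"{nodes} nodes, RF{rf}: strip {data}/{parity} -> "
            f"{storage_overhead(data, parity):.2f}x raw per unit of data "
            f"(plain replication: {rf:.2f}x)"
        )
```

For example, a 4/1 strip consumes 1.25 units of raw capacity per unit of data, compared with 2 units under replication factor 2.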

Example: 4-Node Nutanix Cluster, Replication Factor 2

This scenario includes the following items:

  • Four Nutanix nodes
  • Two vDisks
    • The CVM on node1 owns vDisk1.
    • The CVM on node3 owns vDisk2.
  • Three extent groups per vDisk
  • Two identical extent group copies

Before EC-X

Before the EC-X parity calculation, the DSF writes each extent group twice to the Nutanix cluster on different nodes:

  • vDisk1 egroup1 goes on nodes 1 and 4.
  • vDisk1 egroup2 goes on nodes 1 and 2.
  • vDisk1 egroup3 goes on nodes 1 and 3.
  • vDisk2 egroup1 goes on nodes 2 and 3.
  • vDisk2 egroup2 goes on nodes 3 and 4.
  • vDisk2 egroup3 goes on nodes 1 and 3.

[Figure: ../images/TN-2002-Erasure-Coding_image2.png]

After EC-X

After the EC-X parity calculation, we get the following results:

  • vDisk1 egroup1 goes on node1 (data) and node2 (parity).
  • vDisk1 egroup2 goes on node1 (data) and node4 (parity).
  • vDisk1 egroup3 goes on node1 (data) and node4 (parity).
  • vDisk2 egroup1 goes on node3 (data) and node2 (parity).
  • vDisk2 egroup2 goes on node3 (data) and node4 (parity).
  • vDisk2 egroup3 goes on node3 (data) and node4 (parity).

[Figure: ../images/TN-2002-Erasure-Coding_image3.png]
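
The example can also be checked with a few lines of code. The sketch below encodes the before and after placements listed above and counts the extent-group-sized units stored in each case. One assumption is mine rather than the post's: each 2/1 strip pairs the same-numbered extent groups of the two vDisks, which is consistent with the parity nodes listed but is not stated explicitly.

```python
# Before EC-X: (vdisk, egroup) -> the two nodes that each hold a full copy.
before = {
    ("vDisk1", 1): (1, 4),
    ("vDisk1", 2): (1, 2),
    ("vDisk1", 3): (1, 3),
    ("vDisk2", 1): (2, 3),
    ("vDisk2", 2): (3, 4),
    ("vDisk2", 3): (1, 3),
}

# After EC-X: (vdisk, egroup) -> (node keeping the data, node holding the parity).
after = {
    ("vDisk1", 1): (1, 2),
    ("vDisk1", 2): (1, 4),
    ("vDisk1", 3): (1, 4),
    ("vDisk2", 1): (3, 2),
    ("vDisk2", 2): (3, 4),
    ("vDisk2", 3): (3, 4),
}

# ASSUMPTION: a strip is formed by the same-numbered egroups of both vDisks.
strips: dict[int, list[tuple[str, int, int]]] = {}
for (vdisk, egroup), (data_node, parity_node) in after.items():
    # The surviving data copy stays on a node that already held a copy.
    assert data_node in before[(vdisk, egroup)]
    strips.setdefault(egroup, []).append((vdisk, data_node, parity_node))

for egroup, members in strips.items():
    data_nodes = {m[1] for m in members}
    parity_nodes = {m[2] for m in members}
    assert len(parity_nodes) == 1             # one shared parity per strip
    assert len(data_nodes) == len(members)    # each data block on its own node
    assert data_nodes.isdisjoint(parity_nodes)

units_before = sum(len(nodes) for nodes in before.values())
units_after = len(after) + len(strips)  # data blocks plus one parity per strip
print(f"extent-group units stored: {units_before} before, {units_after} after")
```

Under that pairing assumption, every strip keeps its two data blocks and its parity on three different nodes, so losing any single node still leaves enough information to rebuild the missing block.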