So far in this vSAN series we have covered what vSAN is and how it works. vSAN is software-defined storage in which virtual machines are stored as objects. It is driven by storage policies: you can create multiple policies based on your requirements and assign them per virtual machine or per VMDK.
vSAN also provides deduplication and compression, features found in many storage solutions. Deduplication removes redundant data blocks, whereas compression removes additional redundant data within each data block. These techniques work together to reduce the amount of space required to store the data.
Deduplication and compression is a single cluster-wide setting that is disabled by default and can be enabled with a few clicks. It is not possible to enable only deduplication; you must enable both at the same time. When you enable deduplication and compression on a vSAN cluster, redundant data within a particular disk group is reduced to a single copy.
vSAN uses the SHA-1 hashing algorithm for deduplication, works on a fixed 4K block size, and performs deduplication within each disk group. In other words, redundant copies of a block within the same disk group are reduced to one copy, but redundant blocks across multiple disk groups are not deduplicated.
“Cold” data in the cache tier that is ready to be destaged is moved to memory, where it is deduplicated and compressed, and is then written to the capacity tier. Deduplication is performed while the data is being destaged from the cache tier to the capacity tier. It is done neither on the cache tier nor on the capacity tier, which is why it is called “Nearline De-Duplication”.
Using the hashing algorithm, vSAN creates a fingerprint for every data block. The hashing algorithm makes it extremely unlikely that two different blocks produce the same hash, so each unique block can be identified by its fingerprint. For every new incoming data block, a hash is created and compared with the existing hashes. If the same hash entry is found, there is no need to store the new block; instead, vSAN adds a reference to the existing entry. If it does not already exist, a new hash entry is created and the block is stored.
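The fingerprint lookup described above can be sketched in a few lines of Python. This is only a toy model and not vSAN's actual implementation: the `DiskGroupDedupStore` class, its dictionaries, and the `split_into_blocks` helper are hypothetical names made up for illustration. Only the SHA-1 hash, the 4K block size, and the per-disk-group scope come from the text.

```python
import hashlib

BLOCK_SIZE = 4096  # 4K fixed block size, as described in the post


class DiskGroupDedupStore:
    """Toy model of one disk group's deduplication domain (hypothetical)."""

    def __init__(self):
        self.blocks = {}    # fingerprint -> stored block data
        self.refcount = {}  # fingerprint -> number of logical references

    def write(self, block: bytes) -> str:
        """Store a block only if its fingerprint has not been seen before."""
        fingerprint = hashlib.sha1(block).hexdigest()
        if fingerprint in self.blocks:
            # Duplicate: keep the single stored copy, add a reference
            self.refcount[fingerprint] += 1
        else:
            # New data: create the hash entry and store the block
            self.blocks[fingerprint] = block
            self.refcount[fingerprint] = 1
        return fingerprint


def split_into_blocks(data: bytes, size: int = BLOCK_SIZE):
    """Cut a byte string into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


# Two independent disk groups: each has its own hash table
group_a = DiskGroupDedupStore()
group_b = DiskGroupDedupStore()

# Three identical blocks plus one different block land in group A
data = b"A" * BLOCK_SIZE * 3 + b"B" * BLOCK_SIZE
for block in split_into_blocks(data):
    group_a.write(block)

# The same "A" block written to group B is stored again,
# because deduplication does not span disk groups
group_b.write(b"A" * BLOCK_SIZE)

print("unique blocks in group A:", len(group_a.blocks))
print("unique blocks in group B:", len(group_b.blocks))
```

Four logical blocks in group A end up as two stored blocks, while group B keeps its own copy of a block that group A already holds.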
vSAN uses the LZ4 compression mechanism. Compression is applied after deduplication has occurred, just before the data is written to the capacity tier. Considering the additional compute resource and allocation map overhead of compression, vSAN will only store compressed data if a 4K block can be reduced to 2K or less. Otherwise, the block is written uncompressed to avoid the use of additional resources.
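The "compress only if 4K shrinks to 2K or less" rule can be illustrated with a small sketch. Note the assumption: Python's standard library has no LZ4 module, so `zlib` is used here purely as a stand-in compressor. The `prepare_block` function and the constant names are hypothetical, and the compression ratios will differ from what LZ4 would produce.

```python
import os
import zlib

BLOCK_SIZE = 4096          # 4K block, as in the post
COMPRESS_THRESHOLD = 2048  # keep the compressed form only at 2K or less


def prepare_block(block: bytes):
    """Return (stored_compressed, payload) for one block.

    zlib stands in for LZ4 here; this is an assumption made so the
    example runs with the standard library only.
    """
    compressed = zlib.compress(block)
    if len(compressed) <= COMPRESS_THRESHOLD:
        return True, compressed
    # Not enough savings: write the block uncompressed
    return False, block


# Highly repetitive data compresses well below 2K
repetitive = b"vSAN" * (BLOCK_SIZE // 4)
stored_compressed, payload = prepare_block(repetitive)

# Random data barely compresses, so it is kept as-is
random_like = os.urandom(BLOCK_SIZE)
stored_compressed_random, payload_random = prepare_block(random_like)

print("repetitive block compressed:", stored_compressed, len(payload), "bytes")
print("random block compressed:", stored_compressed_random, len(payload_random), "bytes")
```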
Enabling deduplication and compression consumes a small amount of capacity for metadata, such as hash, translation, and allocation maps. The space consumed by this metadata is relative to the size of the vSAN datastore and is typically around 5% of the total capacity. Note that the user interface displays the percentage of used capacity, not total capacity (used and free space).
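As a rough worked example of that overhead, the arithmetic below applies the "around 5%" figure to a raw capacity. The function names and the 10,000 GB cluster size are made up for illustration, and the real overhead depends on the datastore, so treat the result as an estimate only.

```python
OVERHEAD_PERCENT = 5  # "around 5%" from the post; the real figure varies


def estimate_metadata_overhead_gb(raw_capacity_gb: int,
                                  overhead_percent: int = OVERHEAD_PERCENT) -> float:
    """Estimate capacity consumed by hash, translation, and allocation maps."""
    return raw_capacity_gb * overhead_percent / 100


def estimate_remaining_gb(raw_capacity_gb: int) -> float:
    """Raw capacity left after the metadata estimate (ignores all other overheads)."""
    return raw_capacity_gb - estimate_metadata_overhead_gb(raw_capacity_gb)


raw_gb = 10_000  # hypothetical cluster raw capacity
overhead_gb = estimate_metadata_overhead_gb(raw_gb)
remaining_gb = estimate_remaining_gb(raw_gb)

print(f"metadata overhead: {overhead_gb} GB, remaining: {remaining_gb} GB")
```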
The processes of deduplication and compression on any storage platform incur overhead and potentially impact performance in terms of latency and maximum IOPS. vSAN is no exception. However, considering deduplication and compression are only supported in all-flash vSAN configurations, these effects are predictable in the majority of use cases. The extreme performance and low latency of flash devices easily outweigh the additional resource requirements of deduplication and compression.
When you enable or disable deduplication and compression, vSAN performs a rolling reformat of every disk group on every host. Depending on the data stored on the vSAN datastore, this process might take a long time. Do not perform these operations frequently. If you plan to disable deduplication and compression, you must first verify that enough physical capacity is available to place your data.
Things to remember before enabling Deduplication and Compression:
- Deduplication and compression are available only on all-flash disk groups.
- On-disk format version 3.0 or later is required to support deduplication and compression.
- You must have an Advanced or Enterprise license to enable deduplication and compression on a cluster.
- When you enable deduplication and compression on a vSAN cluster, all disk groups participate in data reduction through deduplication and compression.
- vSAN can eliminate duplicate data blocks within each disk group, but not across disk groups.
- Capacity overhead for deduplication and compression is approximately five percent of total raw capacity.
- Policies must have either 0 percent or 100 percent object space reservations. Policies with 100 percent object space reservations are always honored, but can make deduplication and compression less efficient.
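The checklist above can also be expressed as a simple pre-check. This is a sketch over hand-entered values, not a call into any vSAN or vSphere API: the `ClusterFacts` class, its field names, and the sample values (including the "Standard" license string) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterFacts:
    """Hypothetical, hand-filled description of a cluster (not a vSAN API object)."""
    all_flash: bool
    on_disk_format: float
    license_edition: str
    # Object space reservation of each storage policy, in percent
    policy_reservations: list = field(default_factory=list)


def dedup_readiness_issues(cluster: ClusterFacts) -> list:
    """Return the checklist items that the cluster does not satisfy."""
    issues = []
    if not cluster.all_flash:
        issues.append("disk groups are not all-flash")
    if cluster.on_disk_format < 3.0:
        issues.append("on-disk format is older than 3.0")
    if cluster.license_edition.lower() not in ("advanced", "enterprise"):
        issues.append("license is not Advanced or Enterprise")
    for reservation in cluster.policy_reservations:
        if reservation not in (0, 100):
            issues.append(f"policy reserves {reservation}% (must be 0 or 100)")
    return issues


ready = ClusterFacts(True, 3.0, "Enterprise", [0, 100])
not_ready = ClusterFacts(False, 2.0, "Standard", [50])

print(dedup_readiness_issues(ready))      # no issues expected
print(dedup_readiness_issues(not_ready))  # one entry per failed item
```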
In a future post we will see how to enable deduplication and compression on vSAN.
That’s it for today, friends. I hope you liked reading this post. If you think anything should be added or removed, feel free to write to us in the comments. If you found it useful, feel free to share it on social media to help others and spread knowledge.
If you have a question about anything, feel free to write in our comments section and we will do our best to provide a solution as soon as possible.