Integrating LVM with Hadoop & providing Elasticity to Data Node Storage
What is Hadoop?
Hadoop is an open-source framework that allows us to store and process big data in a distributed environment across clusters of computers. It provides massive storage for any kind of data, enormous processing power, and the ability to handle virtually limitless concurrent tasks or jobs.
Logical Volume Management:
LVM is a tool for logical volume management, which includes allocating disks, striping, mirroring, and resizing logical volumes. With LVM, a hard drive or set of hard drives is allocated to one or more physical volumes. LVM physical volumes can be placed on other block devices which might span two or more disks. The physical volumes are combined into a volume group, which is then divided into logical volumes.
So, let’s get started…
First, we have to create a Hadoop Cluster. The steps for the same are as follows:-
Step 1:- First, we have to configure the Name Node.
- Install Java and Hadoop on the Name Node system, then configure the hdfs-site.xml file
- Configure the core-site.xml file
- Format the Name Node and start the service (a sample configuration is sketched below)
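A minimal sketch of what these two files might look like on the Name Node, assuming Hadoop 1.x (which matches the hadoop dfsadmin and hadoop-daemon.sh commands used later); the metadata directory /nn and port 9001 are illustrative placeholders, not taken from the original setup:
hdfs-site.xml:
<configuration>
    <property>
        <!-- local directory where the Name Node keeps its metadata -->
        <name>dfs.name.dir</name>
        <value>/nn</value>
    </property>
</configuration>
core-site.xml:
<configuration>
    <property>
        <!-- HDFS endpoint; 0.0.0.0 makes the Name Node listen on all interfaces -->
        <name>fs.default.name</name>
        <value>hdfs://0.0.0.0:9001</value>
    </property>
</configuration>
# format the metadata directory once, then start the Name Node
hadoop namenode -format
hadoop-daemon.sh start namenode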
Step 2:- Next, we have to configure the Data Node.
- Install Java and Hadoop on the Data Node system, then configure the hdfs-site.xml file
- Configure the core-site.xml file
- Start the service of the Data Node (a sample configuration is sketched below)
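A similar sketch for the Data Node; the storage directory /dn1 and the Name Node IP 192.168.1.10 are illustrative placeholders:
hdfs-site.xml:
<configuration>
    <property>
        <!-- local directory the Data Node contributes to HDFS -->
        <name>dfs.data.dir</name>
        <value>/dn1</value>
    </property>
</configuration>
core-site.xml:
<configuration>
    <property>
        <!-- must point at the Name Node's IP and port -->
        <name>fs.default.name</name>
        <value>hdfs://192.168.1.10:9001</value>
    </property>
</configuration>
hadoop-daemon.sh start datanode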
We can see that the Data Node is contributing its entire 20GB of storage to the cluster by using the following command on the Name Node:-
hadoop dfsadmin -report
Step 3:- Next, create two volumes and attach them to the Data Node system.
Now, check whether the new disks are attached by using the following command:-
fdisk -l
Here, we can see that two volumes of 10GB and 9GB are attached.
Step 4:- Now, create Physical Volumes from both the attached volumes. Use the following commands for the same:-
pvcreate /dev/sdb /dev/sdc
pvdisplay
Step 5:- Now, create a Volume Group from both the physical volumes. Use the following commands for the same:-
vgcreate V_Group /dev/sdb /dev/sdc
vgdisplay V_Group
Here, we can see that a Volume Group of 18.99 GiB is created.
Step 6:- Next, create a Logical Volume of 5GB from the above volume group. The command for the same is as follows:-
lvcreate --size 5G --name lv1 V_Group
- Now, format the Logical Volume
mkfs.ext4 /dev/V_Group/lv1
- Create a directory to mount this Logical Volume
mkdir /mydb
- Mount the above Logical Volume to this directory
mount /dev/V_Group/lv1 /mydb
- Now, check whether the logical volume is mounted by using the following command:-
df -h
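Optionally, to make this mount persist across reboots, an entry can be added to /etc/fstab (a sketch, using the same device and mount point as above):
/dev/V_Group/lv1    /mydb    ext4    defaults    0    0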
Step 7:- Next, add this directory to the hdfs-site.xml file of the Data Node.
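Assuming Hadoop 1.x as above, the updated hdfs-site.xml on the Data Node would point dfs.data.dir at the LVM-backed directory (on Hadoop 2+ the property is named dfs.datanode.data.dir):
<configuration>
    <property>
        <!-- HDFS storage now lives on the LVM mount -->
        <name>dfs.data.dir</name>
        <value>/mydb</value>
    </property>
</configuration>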
- Start the Service of Data Node by using the following command:-
hadoop-daemon.sh start datanode
- Check the report of the Hadoop cluster on the Name Node by using the following command:-
hadoop dfsadmin -report
Here, we can see that the Data Node is now contributing only 5GB (the size of the mounted logical volume) to the cluster.
Step 8:- Next, increase the logical volume size. Here, I am increasing it by 2GB. The command for the same is as follows:-
lvextend --size +2G /dev/V_Group/lv1
- Now, we have to resize the filesystem so it can use the extra 2GB. No reformatting is needed; resize2fs grows the filesystem in place while the data stays intact.
resize2fs /dev/V_Group/lv1
lvdisplay V_Group/lv1
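As a side note, the extend and the filesystem resize can be combined into a single step with lvextend's -r (--resizefs) flag, which invokes the filesystem resize tool automatically:
lvextend --size +2G -r /dev/V_Group/lv1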
- Now, check the Hadoop cluster report on the Name Node
Here, we can see that we have extended the logical volume successfully and the Data Node now contributes 7GB.
- In the same way, we can also reduce the LV size. Note that reducing is riskier than extending: the filesystem must be shrunk before the logical volume, otherwise data will be lost (a safer sequence is sketched after the command below).
lvreduce -L 3G /dev/V_Group/lv1
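A safer sequence (a sketch that stops the Data Node, shrinks the filesystem first, then the LV):
# stop the Data Node so nothing writes to /mydb while we shrink
hadoop-daemon.sh stop datanode
umount /mydb
# the filesystem must pass a forced check before it can be shrunk
e2fsck -f /dev/V_Group/lv1
# shrink the filesystem to 3GB first...
resize2fs /dev/V_Group/lv1 3G
# ...then reduce the logical volume to match
lvreduce -L 3G /dev/V_Group/lv1
mount /dev/V_Group/lv1 /mydb
hadoop-daemon.sh start datanode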
- Now, check the size of logical volume
lvdisplay V_Group/lv1
In this way, we reduced the LV size from 7GB to 3GB.
Hence, we can integrate LVM with a Hadoop cluster to take advantage of elasticity in the Data Node's storage.
Thank you :)