Integrating LVM with Hadoop & providing Elasticity to Data Node Storage
What is Hadoop?
Hadoop is an open-source framework that allows us to store and process big data in a distributed environment across clusters of computers. It provides massive storage for any kind of data, enormous processing power, and the ability to handle virtually limitless concurrent tasks or jobs.
Logical Volume Management:
LVM is a tool for logical volume management, which includes allocating disks, striping, mirroring, and resizing logical volumes. With LVM, a hard drive or set of hard drives is allocated to one or more physical volumes. LVM physical volumes can be placed on other block devices which might span two or more disks. The physical volumes are combined into a volume group, which is then divided into logical volumes.
So, let’s get started…
First, we have to create a Hadoop Cluster. The steps for the same are as follows:-
Step 1:- First, we have to configure the Name Node.
- Install Java and Hadoop on the Name Node system, then configure the hdfs-site.xml file
- Configure the core-site.xml file
- Format the Name Node and start the service (a sample configuration is sketched below)
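A minimal sketch of what these two files might look like on the Name Node, assuming Hadoop 1.x (which matches the hadoop dfsadmin and hadoop-daemon.sh commands used later); the metadata directory /nn and port 9001 are illustrative placeholders, not taken from the original setup:
hdfs-site.xml:
<configuration>
    <property>
        <!-- local directory where the Name Node keeps its metadata -->
        <name>dfs.name.dir</name>
        <value>/nn</value>
    </property>
</configuration>
core-site.xml:
<configuration>
    <property>
        <!-- HDFS endpoint; 0.0.0.0 makes the Name Node listen on all interfaces -->
        <name>fs.default.name</name>
        <value>hdfs://0.0.0.0:9001</value>
    </property>
</configuration>
# format the metadata directory once, then start the Name Node
hadoop namenode -format
hadoop-daemon.sh start namenode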
Step 2:- Next, we have to configure the Data Node.
- Install Java and Hadoop on the Data Node system, then configure the hdfs-site.xml file
- Configure the core-site.xml file
- Start the service of the Data Node (a sample configuration is sketched below)
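A similar sketch for the Data Node; the storage directory /dn1 and the Name Node IP 192.168.1.10 are illustrative placeholders:
hdfs-site.xml:
<configuration>
    <property>
        <!-- local directory the Data Node contributes to HDFS -->
        <name>dfs.data.dir</name>
        <value>/dn1</value>
    </property>
</configuration>
core-site.xml:
<configuration>
    <property>
        <!-- must point at the Name Node's IP and port -->
        <name>fs.default.name</name>
        <value>hdfs://192.168.1.10:9001</value>
    </property>
</configuration>
hadoop-daemon.sh start datanode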
We can see that the Data Node is contributing its entire 20GB of storage to the cluster by using the following command on the Name Node:-
hadoop dfsadmin -report
Step 3:- Next, create two volumes and attach them to the Data Node system.
Now, check whether the new disks are attached by using the following command:-
fdisk -l
Here, we can see that two volumes of 10GB and 9GB are attached.
Step 4:- Now, create Physical Volumes from both the attached volumes. Use the following commands for the same:-
pvcreate /dev/sdb /dev/sdc
pvdisplay
Step 5:- Now, create a Volume Group from both the physical volumes. Use the following commands for the same:-
vgcreate V_Group /dev/sdb /dev/sdc
vgdisplay V_Group
Here, we can see that a Volume Group of 18.99 GiB is created.
Step 6:- Next, create a Logical Volume of 5GB from the above volume group. The command for the same is as follows:-
lvcreate --size 5G --name lv1 V_Group
- Now, format the Logical Volume
mkfs.ext4 /dev/V_Group/lv1
- Create a directory to mount this Logical Volume
mkdir /mydb
- Mount the above Logical Volume to this directory
mount /dev/V_Group/lv1 /mydb
- Now, check whether the logical volume is mounted by using the following command:-
df -h
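Optionally, to make this mount persist across reboots, an entry can be added to /etc/fstab (a sketch, using the same device and mount point as above):
/dev/V_Group/lv1    /mydb    ext4    defaults    0    0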
Step 7:- Next, add this directory to the hdfs-site.xml file of the Data Node.
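Assuming Hadoop 1.x as above, the updated hdfs-site.xml on the Data Node would point dfs.data.dir at the LVM-backed directory (on Hadoop 2+ the property is named dfs.datanode.data.dir):
<configuration>
    <property>
        <!-- HDFS storage now lives on the LVM mount -->
        <name>dfs.data.dir</name>
        <value>/mydb</value>
    </property>
</configuration>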
- Start the Service of Data Node by using the following command:-
hadoop-daemon.sh start datanode
- Check the report of the Hadoop cluster on the Name Node by using the following command:-
hadoop dfsadmin -report
Here, we can see that the Data Node is now contributing only 5GB (the size of the mounted logical volume) to the cluster.
Step 8:- Next, increase the logical volume size. Here, I am increasing it by 2GB. The command for the same is as follows:-
lvextend --size +2G /dev/V_Group/lv1
- Now, we have to resize the filesystem so it can use the extra 2GB. No reformatting is needed; resize2fs grows the filesystem in place while the data stays intact.
resize2fs /dev/V_Group/lv1
lvdisplay V_Group/lv1
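As a side note, the extend and the filesystem resize can be combined into a single step with lvextend's -r (--resizefs) flag, which invokes the filesystem resize tool automatically:
lvextend --size +2G -r /dev/V_Group/lv1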
- Now, check the Hadoop cluster report on the Name Node
Here, we can see that we have extended the logical volume successfully and the Data Node now contributes 7GB.
- In the same way, we can also reduce the LV size. Note that reducing is riskier than extending: the filesystem must be shrunk before the logical volume, otherwise data will be lost (a safer sequence is sketched after the command below).
lvreduce -L 3G /dev/V_Group/lv1
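A safer sequence (a sketch that stops the Data Node, shrinks the filesystem first, then the LV):
# stop the Data Node so nothing writes to /mydb while we shrink
hadoop-daemon.sh stop datanode
umount /mydb
# the filesystem must pass a forced check before it can be shrunk
e2fsck -f /dev/V_Group/lv1
# shrink the filesystem to 3GB first...
resize2fs /dev/V_Group/lv1 3G
# ...then reduce the logical volume to match
lvreduce -L 3G /dev/V_Group/lv1
mount /dev/V_Group/lv1 /mydb
hadoop-daemon.sh start datanode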
- Now, check the size of logical volume
lvdisplay V_Group/lv1
In this way, we reduced the LV size from 7GB to 3GB.
Hence, we can integrate LVM with a Hadoop cluster to take advantage of elasticity in the Data Node's storage.
Thank you :)