Hadoop integration with LVM

Abhishek Prasad Kesare
4 min read · May 3, 2021

Hadoop stores its data in an HDFS cluster. Using LVM, we can increase a data node's storage on the fly.

🌀 Today's Agenda
🔅 Integrating LVM with Hadoop and providing elasticity to DataNode storage
🔅 Increasing or decreasing the size of a static partition in Linux
🔅 Automating LVM partitioning using a Python script

Let's set up the Hadoop distributed file storage cluster quickly. I have already published an article on that, in which the Hadoop cluster is set up using an Ansible script so that the process is fast and automated.

What is LVM?

LVM stands for Logical Volume Management. It is less a single technology than a concept for partitioning hard disks. Suppose we have a 1 TB HDD and have partitioned and formatted only 100 GB of it. The remaining portion of the disk stays unused, and if a sudden requirement comes up for more than 100 GB of storage, we cannot store the data. Using LVM, we can resize our storage on the fly. Attaching LVM storage to the Hadoop data node therefore gives our storage elasticity: it can grow with the data, up to the capacity of the underlying disks. That is today's practical.

To make partitioning easy, I have created a Python script that helps you create the LVM setup without knowing the commands.

I will now explain the script along with the commands used in it, so that you become more familiar with them.

Step1: Create PV

Physical volumes (PV) are the base "block" that you need in order to manipulate a disk using Logical Volume Manager (LVM). A physical volume is any physical storage device, such as a Hard Disk Drive (HDD), Solid State Drive (SSD), or partition, that has been initialized as a physical volume with LVM.

# pvcreate <nameofpv>

In the command above, the PV name is taken as input from the user. It is the path of the disk or partition to initialize, for example /dev/sdb.
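As a rough illustration of how a Python script might wrap this step, here is a minimal sketch. The helper names `pvcreate_command` and `run` and the device `/dev/sdb` are assumptions made for this example; they are not taken from the author's script. The sketch defaults to a dry run, so nothing on the system changes unless you opt in.

```python
import shlex
import subprocess


def pvcreate_command(device):
    """Build the argument list that initializes `device` as an LVM physical volume."""
    if not device.startswith("/dev/"):
        raise ValueError(f"expected a block device path such as /dev/sdb, got {device!r}")
    return ["pvcreate", device]


def run(command, dry_run=True):
    """Print the command; execute it only when dry_run is False (needs root)."""
    print("+", shlex.join(command))
    if not dry_run:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # /dev/sdb is a placeholder; check your own disks with `lsblk` first.
    run(pvcreate_command("/dev/sdb"))
```

Building the command as a list rather than a single string avoids shell quoting problems when the device name comes from user input.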

Step2: Create VG

A volume group (VG) is the central unit of the Logical Volume Manager (LVM) architecture. It is what we create when we combine one or more physical volumes into a single storage structure, equal to the storage capacity of the combined physical devices.

# vgcreate <vgname> <pvname>
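Since a volume group can pool several physical volumes, the command accepts more than one PV after the group name. The sketch below shows one way to assemble it in Python; the function name `vgcreate_command`, the group name `hadoop_vg`, and the device paths are hypothetical.

```python
import shlex


def vgcreate_command(vg_name, pv_devices):
    """Build `vgcreate <vgname> <pv> [<pv> ...]` as an argument list."""
    pv_devices = list(pv_devices)
    if not vg_name:
        raise ValueError("volume group name must not be empty")
    if not pv_devices:
        raise ValueError("a volume group needs at least one physical volume")
    return ["vgcreate", vg_name, *pv_devices]


if __name__ == "__main__":
    # Placeholder names: two PVs pooled into one volume group.
    command = vgcreate_command("hadoop_vg", ["/dev/sdb", "/dev/sdc"])
    print(shlex.join(command))  # vgcreate hadoop_vg /dev/sdb /dev/sdc
```

After running the real command as root, `vgdisplay <vgname>` should report the combined size of the pooled devices.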

Step3: Create LV and format it

A Logical Volume is the conceptual equivalent of a disk partition in a non-LVM system. Logical volumes are block devices that are created from the physical extents present in the same volume group. You can use the command lvcreate to create a logical volume in an existing volume group.

# lvcreate -n <name> -L <size> <vgname>

As you can see above, we have created the LV with the given name and size, so the LVM storage for our data node is ready. You don't need to go to the command line and memorize these commands: the script offers numbered options, and with a few simple inputs it creates the LVM setup for you. Note that an LV must be formatted with a filesystem before it can be mounted.

Most of the time we already have an LVM setup and just need to expand or shrink it, so the script covers that as well.
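Creating the LV and putting a filesystem on it can be sketched as a two-command sequence. This example assumes ext4 (via `mkfs.ext4`) because `resize2fs`, used later in the article, works on ext filesystems. The function name and the sample names `hadoop_vg` and `datanode_lv` are illustrative only.

```python
import shlex


def create_lv_commands(vg_name, lv_name, size):
    """Return the commands that create an LV and format it as ext4.

    `size` uses LVM's own notation, for example "10G".
    """
    # LVM exposes each logical volume as /dev/<vgname>/<lvname>.
    lv_path = f"/dev/{vg_name}/{lv_name}"
    return [
        ["lvcreate", "-n", lv_name, "-L", size, vg_name],
        ["mkfs.ext4", lv_path],  # assumption: ext4 is the chosen filesystem
    ]


if __name__ == "__main__":
    for command in create_lv_commands("hadoop_vg", "datanode_lv", "10G"):
        print(shlex.join(command))
```

The commands are only printed here; running them for real requires root and destroys any data already on the LV.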

The commands used to extend the LV:

lvextend -L +<sizetobeextended> /dev/<vgname>/<lvname> # extend the LV
resize2fs /dev/<vgname>/<lvname> # grow the ext filesystem to fill the extended space
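The order matters when growing storage: the logical volume is enlarged first, then the filesystem is resized to fill it. A minimal Python sketch of that sequence follows, with hypothetical names (`extend_commands`, `hadoop_vg`, `datanode_lv`) and assuming an ext2/ext3/ext4 filesystem on the LV.

```python
import shlex


def extend_commands(vg_name, lv_name, extra_size):
    """Return the commands that grow an LV by `extra_size` (e.g. "5G")
    and then grow the ext filesystem on it to match."""
    lv_path = f"/dev/{vg_name}/{lv_name}"
    return [
        # The leading "+" means "add this much" rather than "set to this size".
        ["lvextend", "-L", f"+{extra_size}", lv_path],
        # Without a size argument, resize2fs grows the filesystem to fill the LV.
        ["resize2fs", lv_path],
    ]


if __name__ == "__main__":
    for command in extend_commands("hadoop_vg", "datanode_lv", "5G"):
        print(shlex.join(command))
```

Growing an ext4 filesystem this way can be done while it is mounted, which is what makes the data node's storage elastic without downtime.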

Now depending upon the requirement you can increase or decrease the space for the data node.
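Decreasing the space needs more care than increasing it, because the filesystem has to be shrunk before the logical volume underneath it. The sketch below lists one commonly used order for an ext filesystem. It is an assumption about how a shrink could be scripted, not the author's code, and the names are placeholders. Back up the data and stop the DataNode service before trying anything like this.

```python
import shlex


def shrink_commands(vg_name, lv_name, new_size, mount_point):
    """Return the commands that shrink an ext filesystem and its LV to `new_size`.

    The filesystem is reduced first; reducing the LV first would cut off
    blocks the filesystem still uses.
    """
    lv_path = f"/dev/{vg_name}/{lv_name}"
    return [
        ["umount", mount_point],                # ext filesystems cannot shrink while mounted
        ["e2fsck", "-f", lv_path],              # resize2fs expects a freshly checked filesystem
        ["resize2fs", lv_path, new_size],       # shrink the filesystem to the target size
        ["lvreduce", "-L", new_size, lv_path],  # then shrink the LV to the same size
        ["mount", lv_path, mount_point],
    ]


if __name__ == "__main__":
    for command in shrink_commands("hadoop_vg", "datanode_lv", "5G", "/datanode"):
        print(shlex.join(command))
```

Note that `lvreduce` asks for confirmation when run interactively, which is a useful last check before space is removed.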

Note: After setting up the Hadoop cluster, we have to mount the LV on the Hadoop data node folder using the mount command. Only then will this script work.

mount /dev/<vgname>/<lvname> /datanode
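The `mount` command takes the device first and the mount point second, and the device to mount is the logical volume rather than the raw disk. As a small sketch, the helper below builds that command, and a second helper reads the capacity of a mounted path so you can confirm a resize took effect. Both function names and the sample LVM names are hypothetical; `/datanode` is the folder used in this article.

```python
import shlex
import shutil


def mount_command(vg_name, lv_name, mount_point):
    """Build `mount <device> <mount point>` for a logical volume."""
    return ["mount", f"/dev/{vg_name}/{lv_name}", mount_point]


def capacity_gib(path):
    """Return the total size, in GiB, of the filesystem that holds `path`."""
    return shutil.disk_usage(path).total / 2**30


if __name__ == "__main__":
    print(shlex.join(mount_command("hadoop_vg", "datanode_lv", "/datanode")))
    # "." is used so the example runs anywhere; on a data node, pass the
    # mounted folder instead, before and after extending the LV.
    print(f"capacity: {capacity_gib('.'):.1f} GiB")
```

On the Hadoop side, `hdfs dfsadmin -report` on the name node should show the data node's configured capacity changing accordingly.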

Thank you for reading. See you in the next one!

You will find the complete code on GitHub.

You can connect with me on LinkedIn.
