Thursday, June 7, 2018

Logical Volume Manager LVM

To LVM or not to LVM?

The Logical Volume Manager, aka LVM, looks at first like a simple storage management tool. However, it differs from other disk management tools (e.g. parted, fdisk) because it operates in a layer between the actual storage hardware and the filesystem. So the concept becomes volume management rather than disk management.

With LVM you can do magic: you can extend a logical volume by adding one or more hard disks without rebooting and without interrupting any service on the running operating system. You can add more disk space literally "on the fly". Imagine a production server, or a VM, suddenly running out of space for some reason (logs, database data, fileserver storage etc.). You need to save the day while keeping the system running, and LVM can do the job. In addition, LVM lets you take snapshots of logical volumes, which is very useful for backups, and it can also be used to create software RAID on your storage.

All of the above sounds amazing indeed, but beware: there are some pitfalls you ought to consider before getting involved in this sorcery.

LVM, as we mentioned before, is an extra layer between your physical partition and the actual filesystem. That extra layer requires some extra kernel resources and costs a little performance. The tricky part is that LVM increases complexity, and that can make data recovery much harder, sometimes impossible. For example, imagine that you lose a hard disk that holds parts of two different logical volumes, each with its own filesystem mounted on a different folder: both filesystems are damaged at once. Yes, that will cause a great mess.

LVM Anatomy

As we can see from the image below, the whole architecture consists of 3 layers:

1. The Physical Volume layer.
 Simply the physical partition with added LVM metadata.

2. The Volume Group layer.
 The pool of physical volumes, whose space can be allocated to the Logical Volume layer.

3. The Logical Volume layer.
 Here we have the logical partitions. The Linux operating system perceives a logical volume as a normal block device, just like a disk partition, but it is in fact a virtual disk (not to be confused with a virtual machine's hard drive) that can span one or more hard drives in the lower layers.

LVM hands on

And now, having acquired a basic understanding of the architecture, we're ready to play with the spells (commands) that create LVM magic.

So, taking the above diagram as an example, let's assume we have a system with 2 hard drives, each divided into two partitions.

disk1: /dev/sda1 and /dev/sda2
disk2: /dev/sdb1 and /dev/sdb2

Let's start from the bottom and work up by creating the physical volumes. Give:

# pvcreate /dev/sda1 /dev/sda2 /dev/sdb1 /dev/sdb2

You can confirm the new physical volumes by giving:

# pvdisplay

which shows detailed information about each one.

Proceeding to the upper layer enter the command:

# vgcreate volgroup /dev/sda1 /dev/sda2 /dev/sdb1 /dev/sdb2

to create the volume group.
And again, to confirm the results, enter:

# vgs

to examine the volume group created.

Now comes the main course, where we create the logical volumes. Let's assume the volume group holds 500 GB in total:

# lvcreate -L 120G -n lvhome volgroup

creates a 120 GB logical volume from the volgroup pool.

# lvcreate -L 380G -n lvstorage volgroup

creates a second one from the remaining 380 GB. Alternatively, give:

# lvcreate -l 100%FREE -n lvstorage volgroup

which is more accurate, because it simply uses all the remaining free space of the volume group to create the logical volume.
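A rough illustration of why an explicit size can miss the target: LVM allocates space in physical extents, and a size given with -L is rounded to whole extents. The arithmetic below is only a sketch. It assumes the default extent size of 4 MiB, which may differ on your volume group (vgdisplay reports it as "PE Size"), and the helper name is made up for this example.

```shell
# Assumed default physical extent size in MiB (check "PE Size" in vgdisplay).
pe_size_mib=4

# Number of extents needed for a volume of the given size in GiB.
extents_for_gib() {
    echo $(( $1 * 1024 / pe_size_mib ))
}

lvhome_extents=$(extents_for_gib 120)
echo "lvhome needs $lvhome_extents extents"
```

Since a volume group rarely offers exactly the nominal capacity of its disks (some space goes to LVM metadata), asking for 100%FREE avoids having to work out how many extents are really left.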

Again, give:

# lvs

to check the newly created logical volumes.

The LVM layer is ready; the only thing left now is to create the filesystems on top.

# mkfs.ext4 /dev/volgroup/lvhome


# mkfs.ext4 /dev/volgroup/lvstorage

Finally, we need to mount those volumes on the desired folders:

# mount /dev/volgroup/lvhome /home
# mount /dev/volgroup/lvstorage /storage

and don't forget to put them in /etc/fstab so that the mounts survive a reboot.
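A sketch of what such an fstab entry could look like. The helper below is hypothetical and only prints the line so that it can be reviewed first. The ext4 type matches the mkfs.ext4 step above, while the "defaults 0 2" fields are a common choice and an assumption here.

```shell
# Print one /etc/fstab line for an ext4 logical volume.
# Fields: device, mount point, type, options, dump, fsck order.
fstab_line() {
    printf '%s\t%s\text4\tdefaults\t0\t2\n' "$1" "$2"
}

fstab_line /dev/volgroup/lvstorage /storage
```

Once the output looks right, it can be appended as root with "fstab_line /dev/volgroup/lvstorage /storage >> /etc/fstab", and running "mount -a" afterwards reports mistakes before the next reboot does.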

LVM Magic

As mentioned before, a very strong advantage of LVM is that you can add more disk space without interrupting the system. This goes as follows:
Let's assume we need to add another 500 GB hard drive (/dev/sdc) to expand the /storage folder, which lives on the lvstorage logical volume.

After creating the partition /dev/sdc1, you also need to create the corresponding physical volume:

# pvcreate /dev/sdc1

and then add the physical volume to the volume group:

# vgextend volgroup /dev/sdc1

and continue by extending the lvstorage logical volume onto it:

# lvextend /dev/volgroup/lvstorage /dev/sdc1

Finally, we need to extend the filesystem of the logical volume so that it picks up the additional space:

# resize2fs /dev/volgroup/lvstorage

and give:

# df -h

to confirm the changes.
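The whole sequence can be collected into a small shell function, sketched below. The function name and argument order are made up for this example. It assumes an ext4 filesystem (hence resize2fs) and stops at the first step that fails, so a later step never runs on a half-finished setup.

```shell
# Grow an ext4 logical volume onto a new partition.
# Usage: extend_lv <partition> <volume-group> <logical-volume-path>
extend_lv() {
    part=$1
    vg=$2
    lv=$3
    # 1. Turn the partition into a physical volume.
    pvcreate "$part" || return 1
    # 2. Add the physical volume to the volume group.
    vgextend "$vg" "$part" || return 1
    # 3. Grow the logical volume onto the new physical volume.
    lvextend "$lv" "$part" || return 1
    # 4. Grow the ext4 filesystem to fill the logical volume.
    resize2fs "$lv"
}

# Example (as root): extend_lv /dev/sdc1 volgroup /dev/volgroup/lvstorage
```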

Now you're a bit wiser and can decide whether you need LVM or not, and whether it's worth the effort in the end. I'm curious about your opinion on this, until then...

May the source be with you!

Wednesday, April 11, 2018


If you're involved in Linux and web stuff, you may have heard about Nginx at some point. Nginx is a "state of the art" platform. It differs from your common web server because it can also be used as a reverse proxy, load balancer, mail proxy or even for video streaming.

In this article we will examine the set-up and configuration of Nginx, starting with a simple web server and then scaling up to a reverse proxy and load balancer.

So let's start the installation. But first, if you use a CentOS box like me, you have to make sure you have the "epel" repository installed. It's a very useful extra repository created for enterprise Linux, which contains plenty of extra software, including Nginx.

To obtain and install that repo just give:

# yum install epel-release-latest-7.noarch.rpm

Now we're ready for Nginx. On my CentOS server, to install it I just give the command:

# yum install nginx

Now if you navigate to /etc/nginx you can see nginx.conf, which is the main configuration file.

Nginx as a web server

We can start with the case in which Nginx is used as a simple web server. The basic configuration inside nginx.conf is the following:

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    # server stanza configuration section
}

The http stanza contains some default information about logging, and the server block, which goes as follows:

    server {
        listen       80 default_server;
        root         /usr/share/nginx/html;

        error_page 404 /404.html;
            location = /40x.html {
        }

        error_page 500 502 503 504 /50x.html;
            location = /50x.html {
        }
    }

Here the "listen" directive defines the listening port of the web server and "root" the root HTML directory. Finally, the "error_page" directives, together with their "location" blocks, define the default error pages displayed when a request ends in an HTTP error.

Nginx as a reverse proxy

Now we want Nginx to handle all incoming HTTP requests and distribute them among the servers in the internal network. So in the main nginx.conf, inside the server stanza, we add the following:


location /uri/path/ {
    proxy_pass http://mywebserver.local;
}

The "server_name" directive is essential if you have multiple server blocks with different server names. If it is defined, Nginx matches the Host header of each request against it and processes the request according to the configuration of the matching server block.
The "location" directive checks the request URI and forwards all matching requests to the address specified by the "proxy_pass" directive. In place of mywebserver.local you can also put an IP address and port.

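A location block that points at an address and port could be generated as in the sketch below. The helper name and the address 192.168.1.10:8080 are made up for illustration. The output is plain Nginx configuration, meant to be placed inside the server stanza.

```shell
# Print a location block that proxies a URI path to a backend.
# Usage: proxy_location <uri-path> <host-or-ip[:port]>
proxy_location() {
    cat <<EOF
location $1 {
    proxy_pass http://$2;
}
EOF
}

# Hypothetical backend address and port.
proxy_location /uri/path/ 192.168.1.10:8080
```
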
Nginx as a load balancer

As mentioned above, Nginx can be a very effective load balancer, offering several different load balancing algorithms (round robin by default). So to set up a simple load balancer, in nginx.conf we must go under the http stanza and give the following:

    upstream mywebsite {

All the magic here is done by the upstream directive, which defines the group of upstream servers among which the traffic is distributed. Those servers are listed inside it, each defined by a server directive. By default Nginx uses the round robin algorithm, but you can change that by adding, inside the upstream block,

least_conn;

for least-connected load balancing or

ip_hash;

for IP hash load balancing.
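Putting the pieces together, a complete upstream definition might look like the sketch below. The backend names web1.local and web2.local are placeholders, not taken from a real setup, and the file is written to a temporary location for review rather than into /etc/nginx.

```shell
# Write an example load-balancer config to a scratch file.
conf="${TMPDIR:-/tmp}/mywebsite-upstream.conf"

cat > "$conf" <<'EOF'
upstream mywebsite {
    least_conn;                 # omit this line for round robin
    server web1.local;          # placeholder backend
    server web2.local;          # placeholder backend
}

server {
    listen 80;
    location / {
        proxy_pass http://mywebsite;   # send requests to the upstream group
    }
}
EOF

echo "wrote $conf"
```

Both blocks belong inside the http stanza. After the placeholder names are replaced with real backends, the result can be checked with nginx -t before it goes live.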

Nginx SSL 

It is essential to use HTTPS on your server: plain HTTP is unencrypted and insecure, and browsers are steadily moving away from it. You can count on Nginx to handle all the SSL work, whether it acts as a web server or as a proxy. To do this, under the server stanza of the main configuration you need to add the following lines:

listen   443 ssl;

ssl_certificate    /etc/nginx/conf/
ssl_certificate_key    /etc/nginx/conf/mywebsite.key;

Now the "listen" directive is on port 443 with the "ssl" parameter, which enables SSL for that port (older configurations used a separate "ssl on;" line, which recent Nginx releases no longer accept). Then we simply declare the paths of the SSL certificate bundle and the SSL key.

Nginx management and control.

After every configuration change you have to restart the nginx service for the change to be applied. To do this simply give:

# systemctl restart nginx

But beware: you have to be very sure that your configuration is correct, otherwise the server will fail to start and your website or websites will be down. To avoid this, you have the option to test your configuration before the restart by giving:

# nginx -t

You can also apply your configuration changes without restarting by giving:

# nginx -s reload
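The test and the reload can be chained so that a broken configuration never reaches the running server. The function below is a minimal sketch. Its name is made up for this example, and it uses only the two nginx invocations shown above.

```shell
# Reload Nginx only if the configuration passes the syntax test.
safe_reload() {
    if nginx -t; then
        nginx -s reload
    else
        echo "configuration test failed, not reloading" >&2
        return 1
    fi
}

# Usage (as root): safe_reload
```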

And don't forget to make sure that Nginx runs on system startup:

# systemctl enable nginx

So this is enough info for a good start; plenty of additional information is available in the official Nginx documentation.


Friday, January 5, 2018

Network Tools

Computing co-exists with networking. Thus, to operate a Linux system you’ll very often find yourself involved with network operations. Those operations may be between your system and the outside world (whether it is a LAN or the Internet), but they may also take place inside your own kernel network stack.

One of my favorite packages ever is the net-tools package. It is a set of very useful tools for configuring your network resources and gathering information about them.
So let’s start by installing the package. I’ll use my CentOS 7 server for the demonstration:

          # yum -y install net-tools

Now let’s find and inspect the package to see what we got:

          #rpm -qa | grep net-tools

which gives the exact version of the package (net-tools-2.0-0.22.20131004git.el7.x86_64).

To list the files it contains we give:

          #rpm -ql net-tools-2.0-0.22.20131004git.el7.x86_64

Here we get a long file list with man pages, language files, services etc., but we will focus on some of the binaries in that list.

My favorite here is netstat. This command operates like a radar for your system, monitoring every single incoming and outgoing network connection. So let’s play with it by giving:

          # netstat -an

By examining the output, we spot two sections. The first section, “Active Internet connections (servers and established)”, lists the connections in and out of the machine:
Proto  Recv-Q  Send-Q  Local Address  Foreign Address  State
tcp    0       0                      *                LISTEN

Proto is the protocol type, which can be tcp or udp. Recv-Q and Send-Q are the counts of bytes queued to be received or sent, respectively, on this particular socket. Local Address is the address of our machine and Foreign Address is the address of the remote connected machine. In this example the foreign address is zero because the socket is in listening mode, which you can check in the last column, “State”, which displays the TCP state of the socket at the time you ran the command. The local address can be the machine’s single IP address or one of its multiple IP addresses.
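To get a quick overview of the State column, the TCP lines can be counted per state. The function below is a small sketch. Its name is made up for this example, and it assumes the six-column layout described above, where the state is the sixth field.

```shell
# Count TCP sockets per state from "netstat -ant" style input on stdin.
count_states() {
    awk '$1 ~ /^tcp/ { n[$6]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage: netstat -ant | count_states
```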

The second section of the output has the pattern:
Proto  RefCnt  Flags    Type    State      I-Node  Path
unix   2       [ ACC ]  STREAM  LISTENING  17930   /var/run/lsm/ipc/sim

Here the Proto column is always unix, which represents a UNIX domain socket. This kind of socket is used only for communication between processes on the same machine and not for external networking. RefCnt is the number of processes attached to the socket. The “Flags” column shows socket flags (ACC means the socket is waiting to accept connections), “Type” states whether the socket is a stream or a datagram socket, “State” is the current state of the socket, the next column is the inode number of the socket, and “Path” is the path name the socket is bound to.

Arp is a tool to get information about the ARP table of the machine. Just for the record, ARP stands for Address Resolution Protocol, and it basically maps an IP address to a physical MAC address. By running arp we get the following structure:
Address  HWtype  HWaddress          Flags Mask  Iface
gateway  ether   d1:68:0a:4a:f2:da  C           enp1s0

Here we can see this mapping: the MAC address (HWaddress) of the gateway, reached through our Ethernet (HWtype) interface enp1s0 (Iface).

Ifconfig is an interface manipulation tool. With it you can change the IP settings (address, netmask, broadcast etc.), enable or disable the interface, enter promiscuous mode or add an alias.
So let's give:

          #ifconfig virbr0-nic

virbr0-nic: flags=4098<BROADCAST,MULTICAST>  mtu 1500
        ether 52:54:00:0f:48:4d  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

“virbr0-nic” is the virtual bridge interface of my KVM hypervisor. Here we can see the type of the interface, the MAC address and some statistics about packet transmission.

Iptunnel is a tool to create tunnels for IPv4 packet encapsulation. Its use is a bit complicated and I hope I can cover it in a future article.

Route is a tool to examine and manipulate your machine's routing table. Giving:

          # route

we have the following output:
Destination  Gateway  Genmask  Flags  Metric  Ref  Use  Iface
default      gateway           UG     100     0    0    enp1s0
                               U      100     0    0    enp1s0

This is the kernel routing table, which shows the network path a packet follows to reach its destination. The first line is the default route, the route a packet follows when no more specific route matches. By analyzing the columns of the routing table we can get information about each route:
Destination is the host or network address the packet is destined for, Gateway is the node packets are sent through to reach that destination, Genmask is the netmask of the destination network, the Flags column gives information about the state or type of the route (U means the route is up, G that it uses a gateway), Metric is the distance to the target (usually counted in hops), Ref is the number of references to this route, Use is the count of lookups for the route and Iface is the network interface that packets for this route are sent through.
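As a small worked example of reading these columns, the sketch below picks the default route out of the table. It assumes the numeric output of route -n, where the default destination appears as 0.0.0.0, and the function name is made up for this example.

```shell
# Print the gateway and interface of the default route,
# reading "route -n" style output from stdin.
default_route() {
    awk '$1 == "0.0.0.0" { print $2, $8; exit }'
}

# Usage: route -n | default_route
```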

Last, of course, I can’t leave traceroute and dig out of the article, although they’re not in the net-tools package.
If we traceroute a host, we get a numbered list of hosts, which are simply the hops the packet passes through in order to reach its final destination.
Dig is a very powerful tool which gives detailed DNS information about an internet address.
Bonus command:

           #dig +short

which gives us our external IP address

Of course there are many other network commands and tools, but the commands mentioned above make a very good toolset that will help you to explore your network surroundings and troubleshoot possible anomalies.