Linux Internals: Storage – DJ Ware

Introduction

Linux has robust services for handling and managing devices, including storage devices

But how does Linux do it? How does it represent these devices and make them usable to us?

What is Block Storage

Block storage is another name for what Linux calls a block device

Block devices are hardware designed to store and retrieve data at relatively high speeds (I say relatively because it always seems it is never fast enough)

The most common type of storage device in years past was the spinning hard disk drive. Otherwise known as Spinning Rust.

Today those older iron-oxide-based devices are being replaced with solid state devices: SSDs, flash memory sticks, SD cards, etc.

But why does Linux call them block devices? It's because the kernel interfaces with the hardware by reading and writing in chunks, or fixed-size blocks.
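To make the idea of fixed-size blocks concrete, here is a minimal Python sketch of which blocks a read would touch. The 4096-byte block size and the function name blocks_touched are assumptions made for illustration; real devices report their own block sizes.

```python
def blocks_touched(offset, length, block_size=4096):
    """Return the range of block numbers needed to satisfy a byte-range read."""
    if length <= 0:
        return range(0)
    first = offset // block_size                 # block holding the first byte
    last = (offset + length - 1) // block_size   # block holding the last byte
    return range(first, last + 1)


# Reading 10 bytes that straddle a block boundary still costs two whole blocks.
print(list(blocks_touched(offset=4090, length=10)))  # prints [0, 1]
```

Even a tiny read is rounded out to whole blocks, which is why the kernel caches blocks it has already fetched.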

So block devices are devices whose filesystems we can mount anywhere we wish in our file tree. Once they are mounted we can read and write data seamlessly.

What are Disk Partitions

Disk partitions are a way of dividing up the space on a disk drive so we can 1) install different filesystems on them, or 2) mount them in different places in the Linux file tree.

So you can think of a partition as a way to segment our data storage so the pieces can be used for different purposes (say, to boot the computer, or to be my home directory, etc.)

In some systems you do not need a partition, but in others you do. It is generally recommended to partition your drives so you have more flexibility down the road if you need to change things.

MBR and GPT

Partitioning a disk boils down to choosing one of two formats: the Master Boot Record (MBR) or the GUID Partition Table (GPT). GUID stands for Globally Unique Identifier.

MBR is the older partition type and it can only support disk drives with 2TB or less of storage (with 512-byte sectors). MBR also has a maximum of 4 primary partitions, although there is a trick you can use to make additional “logical partitions” inside an extended partition.

GPT fixes both problems of MBR. It can handle disk sizes up to 9,400,000,000 TB. That should be just enough to hold my music collection

GPT can have up to 128 partitions by default
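The size limits come down to simple arithmetic. This sketch assumes the traditional 512-byte sector, and that MBR stores sector addresses in 32 bits while GPT uses 64 bits; the function name max_disk_bytes is made up for illustration.

```python
SECTOR_BYTES = 512  # assumption: traditional 512-byte sectors


def max_disk_bytes(address_bits, sector_bytes=SECTOR_BYTES):
    """Largest disk addressable with the given number of sector-address bits."""
    return (2 ** address_bits) * sector_bytes


mbr_limit = max_disk_bytes(32)   # MBR: 32-bit sector addresses
gpt_limit = max_disk_bytes(64)   # GPT: 64-bit sector addresses

print(f"MBR: {mbr_limit / 10**12:.1f} TB")
print(f"GPT: {gpt_limit / 10**12:,.0f} TB")
```

That works out to roughly 2.2 TB for MBR and about 9.4 billion TB for GPT. Drives with larger sectors would shift both limits upward.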

So your first choice should be easy, since GPT is generally preferred unless you are running some weird operating system or firmware that prevents you from using GPT.

Formatting and Filesystems

Once we partition the drive the next step is to format it, but before we can do that, we need to choose a filesystem for it

File System Choices

Ext4 – The most popular format to use on Linux. It is the fourth version of the extended file system. Ext4 is journaled and is highly tuned for operating system workloads.

XFS – comes from the mighty Silicon Graphics computers. It has been enhanced and adopted, and drives most of the high performance server workloads in Linux. It is the default file system for Red Hat Enterprise Linux. XFS also uses journaling, but it journals metadata only. While this can lead to faster performance, it can also lead to loss or corruption of recently written file data in the event of an abrupt power failure

Btrfs – is a modern, feature-rich copy-on-write filesystem. Its architecture allows for some volume management functionality including snapshots, cloning, volumes, etc. Btrfs still has a few known problems, particularly with full disks, and there is some debate about its suitability for production workloads, but nonetheless it has become the default filesystem for Fedora Workstation

ZFS – is a copy-on-write file system and volume manager with a robust and mature feature set, including snapshotting, cloning, and organizing volumes into RAID-like arrays (they are better than RAID, IMHO). ZFS has a controversial history because of the license chosen to release it into the open source community. Ubuntu now ships a kernel and distro which allow its installation as a root filesystem, and Debian includes the source code for ZFS in its repositories.

And yes, there are others, but we would be here all day listing them. Go to Wikipedia if you want a complete list

How Linux Manages Storage Devices

Unix pioneered the concept of “everything is a file”. Linux carries on this tradition and also includes hardware like storage devices, which are represented on the filesystem as files. The first disk drive on a Linux system might look like this: /dev/sda

The first partition on /dev/sda would be /dev/sda1

/dev/sda is a device node, a special file which points back to the kernel's own name for the hardware
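Because a device shows up as a file, ordinary file calls can inspect it. Here is a minimal Python sketch that asks whether a path refers to a block device; the helper name is_block_device is an assumption for illustration, and /dev/sda is only an example that may not exist on your machine.

```python
import os
import stat


def is_block_device(path):
    """Return True if path exists and refers to a block device."""
    try:
        mode = os.stat(path).st_mode   # os.stat follows symbolic links
    except OSError:
        return False                   # missing or inaccessible path
    return stat.S_ISBLK(mode)


for candidate in ("/dev/sda", "/dev/null"):
    print(candidate, is_block_device(candidate))
```

/dev/null is included as a contrast: it is also a device file, but a character device rather than a block device.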

There are special links stored under /dev/disk as well, which point back to those device names. You may see one or more of the following:

/dev/disk/by-partlabel – uses user-defined label names for a partition (GPT)

/dev/disk/by-partuuid – uses UUIDs for partitions (GPT)

/dev/disk/by-label – uses user-defined filesystem label names for a disk or partition

/dev/disk/by-uuid – uses filesystem UUIDs, generated at the time the disk is formatted and unique among all devices on the system

/dev/disk/by-id – uses links generated from hardware serial numbers

/dev/disk/by-path – like by-id, but the links are constructed from the system's interpretation of the hardware path used to access the device

Usually by-label and by-id are the best choices to use, but you will find most distros will use by-uuid for the ones they create during install
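To see how these links map back to kernel names on your own system, a small Python sketch like the following can help. The function name resolve_links is made up for illustration, and the by-uuid directory may be absent on some machines (containers, for example), in which case nothing is printed.

```python
import os


def resolve_links(directory):
    """Map each symbolic link in directory to the path it finally points at."""
    links = {}
    if not os.path.isdir(directory):
        return links
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if os.path.islink(full):
            links[name] = os.path.realpath(full)
    return links


for name, target in resolve_links("/dev/disk/by-uuid").items():
    print(f"{name} -> {target}")
```

Pointing the same function at /dev/disk/by-label or /dev/disk/by-id shows the other naming schemes.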

Mounting Block Devices

The /dev/ file is used to communicate with the Linux kernel, but there is more to the story than that

To mount a file system you have to pick or create a place on the system file tree to mount your device. UNIX used to follow a convention which has either been forgotten or lost by Linux, but here it is

What conventions you come up with are all up to you. Just don’t choose a location that is already in use or mounted. Also avoid using /mnt and /media, as these are generally reserved for temporary mounts and removable media like USB sticks and CD/DVD drives

In any case, pick a scheme and be consistent
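Before settling on a mount point, it can be worth checking that it is not already in use. The following Python sketch is one way to do that; the function name mount_point_problems and the path /srv/data are hypothetical choices for illustration.

```python
import os


def mount_point_problems(path):
    """Return a list of reasons why path is a poor choice of mount point."""
    problems = []
    if not os.path.isdir(path):
        problems.append("not an existing directory")
        return problems
    if os.path.ismount(path):
        problems.append("something is already mounted here")
    try:
        if os.listdir(path):
            problems.append("directory is not empty; its contents would be hidden")
    except PermissionError:
        problems.append("cannot read directory")
    return problems


# /srv/data is a hypothetical mount point chosen for illustration.
print(mount_point_problems("/srv/data"))
```

An empty list means the directory exists, is empty, and has nothing mounted on it yet.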

Making Mounts Permanent with /etc/fstab

The fstab file is used to hold definitions about block devices, filesystems, mount points and mounting options
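Each fstab entry is a single line of whitespace-separated fields: device, mount point, filesystem type, options, dump flag and fsck pass number. Here is a minimal Python sketch of a parser for that layout. The sample entries and the names FstabEntry and parse_fstab are made up for illustration; the last two fields are treated as 0 when missing.

```python
from collections import namedtuple

FstabEntry = namedtuple(
    "FstabEntry", "device mount_point fstype options dump fsck_pass"
)


def parse_fstab(text):
    """Parse fstab-formatted text into a list of FstabEntry tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue                      # not a complete entry, ignore it
        dump = int(fields[4]) if len(fields) > 4 else 0
        fsck_pass = int(fields[5]) if len(fields) > 5 else 0
        entries.append(FstabEntry(fields[0], fields[1], fields[2],
                                  fields[3].split(","), dump, fsck_pass))
    return entries


# Hypothetical sample content, not taken from a real system.
SAMPLE = """\
# <device>   <mount point>  <type>  <options>         <dump>  <pass>
LABEL=home   /home          ext4    defaults,noatime  0       2
/dev/sda1    /boot          ext4    defaults          0       2
"""

for entry in parse_fstab(SAMPLE):
    print(entry.mount_point, entry.fstype, entry.options)
```

To inspect a real system, the same function could be handed the text of /etc/fstab.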

Usually a filesystem not defined in fstab will not be automatically mounted, although there are other ways to do this, such as defining systemd mount units (see systemd.mount), but that is not common

See the man page for fstab for additional information

More Complex Storage Management

RAID

You can also group devices together to form larger logical disk structures with varying degrees of redundancy and performance

RAID 0: is a striping method which spreads data more or less evenly between the disks in the set. Reads are spread between the drives in the set, and writes are striped across the drives in the set.

A single drive failure will result in a total loss of data
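Striping can be pictured as dealing chunks out to the drives like cards. This Python sketch assumes a plain round-robin layout; real implementations also have a configurable chunk size and other details, and the function name raid0_locate is made up for illustration.

```python
def raid0_locate(chunk_index, disk_count):
    """Map a logical chunk number to (disk number, chunk position on that disk)."""
    if disk_count < 1:
        raise ValueError("need at least one disk")
    return chunk_index % disk_count, chunk_index // disk_count


# Six consecutive chunks on a three-disk set land on disks 0, 1, 2, 0, 1, 2.
for chunk in range(6):
    disk, position = raid0_locate(chunk, disk_count=3)
    print(f"chunk {chunk} -> disk {disk}, position {position}")
```

Since every drive holds a slice of nearly every large file, losing any one drive leaves holes everywhere.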

RAID 1: is drive mirroring. Anything written to a RAID 1 array is written to multiple disk drives. You can lose drives on either side of the mirror, and the remaining drives can be used to reconstruct the RAID. However, a two-way RAID 1 mirror cuts the total disk capacity in half

RAID 5: uses distributed parity to provide data redundancy for single drive failures. The RAID can be rebuilt from the remaining drives. Data capacity is reduced by one disk drive
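Parity of this kind is conventionally computed with XOR, which is what makes a rebuild possible. The Python sketch below shows the idea on three made-up data blocks; it ignores how parity is rotated across the drives, and the helper name xor_blocks is an assumption for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"disk", b"zero", b"raid"]   # made-up data blocks, one per drive
parity = xor_blocks(data)            # what the parity block would hold

# Pretend the second drive died: rebuild its block from the survivors and parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # prints True
```

With two drives missing there is not enough information left to solve for both, which is the gap RAID 6 fills.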

RAID 6: offers double distributed parity to provide data redundancy for up to 2 disk failures. Again the RAID can be rebuilt from the remaining drives; however, this type of RAID takes longer to recover and the total capacity is reduced by 2 disk drives

RAID 10: is a combination of RAID 0 and RAID 1, so it offers some redundancy while providing good performance. However, it will take 1/2 of the total capacity of the combined disk drives
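The capacity trade-offs above can be summarized in a few lines of Python. This is a sketch under simplifying assumptions: all disks are the same size, RAID 1 mirrors every disk in the set, RAID 10 is built from mirrored pairs, and the minimum disk counts are the commonly quoted ones. The function name usable_capacity is made up for illustration.

```python
def usable_capacity(level, disk_count, disk_size_tb):
    """Usable capacity in TB for an array of equal-size disks."""
    minimum = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}   # assumed minimum disk counts
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if disk_count < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} disks")
    if level == 10 and disk_count % 2:
        raise ValueError("RAID 10 (as modelled here) needs an even number of disks")
    if level == 0:
        return disk_count * disk_size_tb          # no redundancy at all
    if level == 1:
        return disk_size_tb                       # every disk holds the same data
    if level == 5:
        return (disk_count - 1) * disk_size_tb    # one disk's worth of parity
    if level == 6:
        return (disk_count - 2) * disk_size_tb    # two disks' worth of parity
    return disk_count * disk_size_tb / 2          # RAID 10: mirrored pairs


for level in (0, 5, 6, 10):
    print(f"RAID {level}: {usable_capacity(level, 4, 2)} TB usable from four 2 TB disks")
```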

LVM

LVM is Logical Volume Management. It is a system which abstracts logical devices from the physical characteristics of the underlying storage devices. LVM allows you to create groups of physical devices and manage them as if they were one single block of space.

Create a partition using fdisk
Create Physical Volumes from a disk partition
Create a Volume Group from Physical Volume(s)
Create a Logical Volume from a Volume Group
Format the Logical Volume using a filesystem type (btrfs, XFS, EXT4, etc.)
Mount the filesystem
Manage the Logical Volume (extending, snapshotting, etc)
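The middle steps of this workflow map onto the standard LVM command line tools. The Python sketch below only builds the command lines and runs nothing, since the real commands need root and will destroy data on the target partition. Every device, volume group and volume name shown is a hypothetical example, the partitioning step is left out because it is usually interactive, and the function name lvm_plan is made up for illustration.

```python
def lvm_plan(partition, vg_name, lv_name, size, mount_point, fstype="ext4"):
    """Return the commands for the workflow as argument lists (nothing is run)."""
    lv_path = f"/dev/{vg_name}/{lv_name}"
    return [
        ["pvcreate", partition],                            # physical volume
        ["vgcreate", vg_name, partition],                   # volume group
        ["lvcreate", "-n", lv_name, "-L", size, vg_name],   # logical volume
        [f"mkfs.{fstype}", lv_path],                        # format it
        ["mount", lv_path, mount_point],                    # mount it
    ]


# All names below are hypothetical examples.
for command in lvm_plan("/dev/sdb1", "vg_data", "lv_home", "10G", "/home"):
    print(" ".join(command))
```

Keeping the plan as data makes it easy to review before anything touches a disk.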

LVM can do things regular partitions and filesystems simply cannot do. For instance you can expand logical volumes, create volumes that span multiple physical disk drives, take live snapshots of volumes and move volumes to different physical disk drives.

You can also combine LVM with RAID to provide additional flexibility on top of RAID arrays