Understanding Virtual Memory and File Systems in Linux

INTRODUCTION

Linux is a robust and flexible operating system known for its efficient memory management and versatile file system architecture. Two core components that play a critical role in its performance and reliability are virtual memory and the file system. Virtual memory allows Linux to use disk space as an extension of RAM, enabling smoother multitasking and better memory utilization. Meanwhile, the Linux file system organizes and manages data on storage devices using structures like inodes, superblocks, and data blocks. Understanding how these systems work is essential for anyone looking to gain deeper insight into Linux internals and system administration.

What is virtual memory?

Linux supports virtual memory, that is, using a disk as an extension of RAM so that the effective size of usable memory grows correspondingly. The kernel will write the contents of a currently unused block of memory to the hard disk so that the memory can be used for another purpose. When the original contents are needed again, they are read back into memory. This is all made completely transparent to the user; programs running under Linux only see the larger amount of memory available and don’t notice that parts of them reside on the disk from time to time. Of course, reading and writing the hard disk is slower (on the order of a thousand times slower) than using real memory, so the programs don’t run as fast. The part of the hard disk that is used as virtual memory is called the swap space.
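To see how much RAM and swap space a running system has, one option is to read /proc/meminfo. The sketch below is only an illustration, assuming the usual "Name: value kB" layout of that file. The helper names parse_meminfo and swap_summary are made up for this example.

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to their numeric values (KiB for sized fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_summary(fields):
    """Summarise RAM and swap sizes from a parsed meminfo dict (hypothetical helper)."""
    total = fields.get("SwapTotal", 0)
    free = fields.get("SwapFree", 0)
    return {
        "ram_kib": fields.get("MemTotal", 0),
        "swap_kib": total,
        "swap_used_kib": total - free,
    }


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            print(swap_summary(parse_meminfo(f.read())))
    except OSError:
        print("/proc/meminfo is not available on this system")
```

A non-zero swap_used_kib means that some pages currently live in swap space rather than in RAM.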

Linux can use either a normal file in the file system or a separate partition for swap space. A swap partition is faster, but it is easier to change the size of a swap file (there’s no need to repartition the whole hard disk, and possibly install everything from scratch). When you know how much swap space you need, you should go for a swap partition, but if you are uncertain, you can use a swap file first, use the system for a while so that you can get a feel for how much swap you need, and then make a swap partition when you’re confident about its size.

You should also know that Linux allows one to use several swap partitions and/or swap files at the same time. This means that if you only occasionally need an unusual amount of swap space, you can set up an extra swap file at such times, instead of keeping the whole amount allocated all the time.
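The swap areas currently in use are listed in /proc/swaps, one per line, with a type of either "partition" or "file". The following is a minimal sketch that parses that listing, assuming the usual five-column layout (Filename, Type, Size, Used, Priority). The function name parse_swaps is hypothetical.

```python
def parse_swaps(text):
    """Parse the contents of /proc/swaps into a list of dicts (sizes in KiB)."""
    areas = []
    for line in text.splitlines()[1:]:  # the first line is the column header
        parts = line.split()
        if len(parts) != 5:
            continue  # ignore lines that do not match the expected layout
        name, kind, size, used, priority = parts
        areas.append({
            "name": name,
            "type": kind,
            "size_kib": int(size),
            "used_kib": int(used),
            "priority": int(priority),
        })
    return areas


if __name__ == "__main__":
    try:
        with open("/proc/swaps") as f:
            for area in parse_swaps(f.read()):
                print(area)
    except OSError:
        print("/proc/swaps is not available on this system")
```

On a system with both a swap partition and an extra swap file, the list would contain one entry for each.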

A note on operating system terminology: computer science usually distinguishes between swapping (writing a whole process out to swap space) and paging (writing only fixed-size parts, usually a few kilobytes, at a time). Paging is usually more efficient, and that’s what Linux does, but traditional Linux terminology talks about swapping anyway.
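To make the "fixed-size parts" concrete, the sketch below asks Python for the page size of the current machine and works out how many pages a given amount of memory occupies. The pages_needed helper is made up for this example, and the 10,000-byte figure is arbitrary.

```python
import mmap


def pages_needed(nbytes, page_size):
    """Number of whole pages required to hold nbytes (rounds up)."""
    return -(-nbytes // page_size)


if __name__ == "__main__":
    page_size = mmap.PAGESIZE  # page size reported for this system
    print("page size:", page_size, "bytes")
    print("pages for 10000 bytes:", pages_needed(10000, page_size))
```

With 4096-byte pages, 10,000 bytes occupy three pages, so paging can move that data in or out one page at a time instead of writing out the whole process.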

File System:

What are file systems?

A file system is the methods and data structures that an operating system uses to keep track of files on a disk or partition; that is, the way the files are organized on the disk. The word is also used to refer to a partition or disk that is used to store the files or the type of the file system. Thus, one might say “I have two file systems” meaning one has two partitions on which one stores files, or that one is using the “extended file system”, meaning the type of the file system.

The difference between a disk or partition and the file system it contains is important. A few programs (including, reasonably enough, programs that create file systems) operate directly on the raw sectors of a disk or partition; if there is an existing file system there, it will be destroyed or seriously corrupted. Most programs operate on a file system, and therefore won’t work on a partition that doesn’t contain one (or that contains one of the wrong type).

Before a partition or disk can be used as a file system, it needs to be initialized, and the bookkeeping data structures need to be written to the disk. This process is called making a file system.

Most Linux file system types have a similar general structure, although the exact details vary quite a bit. The central concepts are superblock, inode, data block, directory block, and indirection block. The superblock contains information about the file system as a whole, such as its size (the exact information here depends on the file system). An inode contains all information about a file, except its name. The name is stored in the directory, together with the number of the inode: a directory entry consists of a filename and the number of the inode which represents the file. The inode contains the numbers of several data blocks, which are used to store the data in the file. There is space only for a few data block numbers in the inode, however, and if more are needed, more space for pointers to the data blocks is allocated dynamically. These dynamically allocated blocks are indirect blocks; the name indicates that in order to find the data block, one has to find its number in the indirect block first.
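Because the name lives in the directory and not in the inode, two directory entries can point at the same inode. The sketch below illustrates this with a hard link, assuming a file system that supports hard links in the temporary directory. The function name hard_link_demo and the file names are made up for this example.

```python
import os
import tempfile


def hard_link_demo():
    """Create one file with two names and report what stat() says about them."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "first.txt")
        second = os.path.join(d, "second.txt")
        with open(first, "w") as f:
            f.write("hello\n")
        os.link(first, second)  # add a second directory entry for the same inode
        a = os.stat(first)
        b = os.stat(second)
        return {
            "same_inode": a.st_ino == b.st_ino,  # both names map to one inode number
            "link_count": a.st_nlink,            # names currently pointing at the inode
            "names": sorted(os.listdir(d)),      # the directory holds the two names
        }


if __name__ == "__main__":
    print(hard_link_demo())
```

Both names report the same inode number, and the inode's link count records how many directory entries refer to it.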

Linux file systems usually allow one to create a hole in a file (this is done by seeking past the end of the data with the lseek() system call and then writing; check the manual page), which means that the file system just pretends that at a particular place in the file there are just zero bytes, but no actual disk sectors are reserved for that place in the file (this means that the file will use a bit less disk space). This happens especially often for small binaries, Linux shared libraries, some databases, and a few other special cases. (Holes are implemented by storing a special value as the address of the data block in the indirect block or inode. This special address means that no data block is allocated for that part of the file; ergo, there is a hole in the file.)
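A hole can be created from Python through os.lseek, which wraps the lseek() system call. The following is a minimal sketch, assuming a file system that supports holes. The function name make_sparse_file and the 1 MiB hole size are arbitrary choices for this example. The allocated size is computed from st_blocks, which on Linux counts 512-byte units.

```python
import os
import tempfile


def make_sparse_file(path, hole_bytes):
    """Write one byte after a hole; return (apparent size, allocated bytes)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.lseek(fd, hole_bytes, os.SEEK_SET)  # jump past the end without writing
        os.write(fd, b"x")                     # the skipped range becomes a hole
    finally:
        os.close(fd)
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        size, allocated = make_sparse_file(os.path.join(d, "sparse.bin"), 1024 * 1024)
        print("apparent size:", size, "bytes")
        print("allocated on disk:", allocated, "bytes")
```

On a file system that supports holes, the allocated figure is typically far smaller than the apparent size, and reading from the hole returns zero bytes.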
