ext2 partition. File systems ext2, ext3, XFS, ReiserFS, NTFS. File inodes

A file system is a scheme that determines how data is organized, stored and named on the storage media of IT equipment (for example, the rewritable flash memory cards used in portable electronic devices such as digital cameras and mobile phones) and of computers. It defines the format of the content and the physical storage of information, which is usually grouped into files. A specific file system determines the maximum length of a file (or folder) name, the maximum possible file and partition size, and the set of file attributes. Some file systems provide service capabilities, such as access control or file encryption.

File system tasks

The main functions of any file system are aimed at solving the following tasks:

file naming;

software interface for working with files for applications;

mapping the logical model of the file system onto the physical organization of the data storage;
ensuring file system resilience to power failures and to hardware and software errors.

In multi-user systems, another task appears: protecting the files of one user from unauthorized access by another user, as well as supporting shared work with files. For example, when one user opens a file, the same file is temporarily available to the others in read-only mode.

A file system is the basic structure a computer uses to organize information on its hard drive. When you install a new hard drive, it must be partitioned and formatted with a specific file system, after which you can store data and programs on it. On Windows there are three possible file systems: NTFS, FAT32 and the rarely used legacy FAT system (also known as FAT16).

NTFS is the preferred file system for this version of Windows. It has many advantages over the earlier FAT32 system; some of them are listed below.

The ability to automatically recover from some disk errors (FAT32 does not have this ability).
Improved support for large hard drives.
Higher degree of security. You can use permissions and encryption to deny user access to certain files.

The FAT32 file system and the rarely used FAT system were used in previous versions of Windows, including Windows 95, Windows 98 and Windows Millennium Edition. FAT32 does not provide the level of security offered by NTFS, so if your computer has a partition or volume formatted as FAT32, the files on that partition are visible to anyone who has access to the computer. FAT32 also has size limitations. In this version of Windows, it is not possible to create a FAT32 partition larger than 32 GB. In addition, a FAT32 partition cannot contain a file larger than 4 GB.

The main reason for using FAT32 is a multiple operating system configuration, in which the computer can run either Windows 95, Windows 98 or Windows Millennium Edition, or this version of Windows. To create such a configuration, you must install the previous version of the operating system on a partition formatted as FAT32 or FAT and make it the primary partition (the primary partition may contain the operating system). Other partitions accessed from previous versions of Windows must also be formatted as FAT32. Earlier versions of Windows can access NTFS partitions or volumes only over the network; NTFS partitions on the local computer will be inaccessible to them.

FAT – advantages:

Requires little RAM to work effectively.
Fast work with small and medium-sized directories.
The disk makes, on average, fewer head movements (compared to NTFS).
Effective work on slow disks.

FAT – cons:

Catastrophic loss of performance with increasing fragmentation, especially for large disks (FAT32 only).
Difficulties with random access to large (say, 10% or more of the disk size) files.
Very slow work with directories containing a large number of files.

NTFS - advantages:

File fragmentation has virtually no consequences for the file system itself - the performance of a fragmented system is degraded only in terms of access to the file data itself.
The complexity of the directory structure and the number of files in one directory also does not pose any special obstacles to performance.
Quick access to an arbitrary fragment of a file (for example, editing large .wav files).
Very fast access to small files (a few hundred bytes) - the entire file is in the same place as the system data (MFT record).

NTFS - cons:

Significant system memory requirements (64 MB is the absolute minimum, more is better).
Slow disks and controllers without Bus Mastering greatly reduce the performance of NTFS.
Working with medium-sized directories is difficult because they are almost always fragmented.
A disk that operates for a long time at 80% - 90% full will show extremely low performance.

The following file systems are considered "native" for Linux (that is, those on which it can be installed and from which it can boot): ext2fs, ext3fs, ext4fs, ReiserFS, XFS, JFS. They are usually offered as a choice when installing the vast majority of distributions. Of course, there are ways to install Linux on FAT/VFAT/FAT32 file systems, but this is only for those ladies and gentlemen with a taste for perversions, and I won't talk about them.

The main criteria when choosing a file system are usually reliability and performance. In some cases, you also have to take into account the compatibility factor - in this case, it means the ability of other operating systems to access a particular file system.
I'll start the review with ReiserFS, because the reason for writing this note was the question: what should be considered small files? After all, it is well known that efficient handling of small files is the strong point of this file system.

So, small files are files smaller than a logical block of the file system, which in Linux is in most cases four kilobytes, although it can be set at formatting time within certain limits (depending on the specific FS). There are countless such small files in any Unix-like OS. A typical example is the files that make up the FreeBSD ports tree, Gentoo Portage, and similar port-like systems.
In most file systems, each such mini-file has both its own inode (an information node containing meta information about the file) and a data block, which both consumes disk space and reduces the performance of file operations. In particular, this is precisely the reason for the catastrophic sluggishness of the FreeBSD file system (both the old one, UFS, and the new one, UFS2) when working with its own ports system.
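To make the space cost concrete, here is a minimal sketch in Python. It assumes a 4 KB logical block and allocation in whole blocks; the helper name allocated_bytes and the list of file sizes are made up for illustration, and inode space is not counted.

```python
def allocated_bytes(file_size: int, block_size: int = 4096) -> int:
    """Disk space taken by the data of one file when space is handed out
    in whole logical blocks (the inode itself is not counted)."""
    if file_size == 0:
        return 0
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size


# Hypothetical sizes of a few small files, in bytes.
sizes = [120, 300, 1500, 4096, 5000]

payload = sum(sizes)                              # bytes of real data
on_disk = sum(allocated_bytes(s) for s in sizes)  # bytes actually reserved
slack = on_disk - payload                         # bytes lost in partly filled blocks

print(payload, on_disk, slack)
```

With these sample sizes more than half of the reserved space is slack, which is the effect that packing small files next to their metadata is meant to avoid.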

In the ReiserFS file system, in such cases, separate blocks are not allocated for data - it manages to push the file data directly into the inode area. Due to this, disk space is saved and performance increases - literally several times compared to all other FS.
This handling of small files in ReiserFS has given rise to the legend of its unreliability. Indeed, when the file system is destroyed (that is, when its service areas are damaged), data stored together with its inodes disappears along with them, and irrevocably. In file systems where inodes and data blocks are always kept apart, the data can theoretically be restored. For ext2/ext3 there are even tools that allow you to do this.

However, like any legend, this one only gives the impression of authenticity. First, permanent data loss applies only to very small files. There are practically no such files among user data, and all the others can easily be restored from the distribution.
Secondly, when speaking about the possibility of recovering data from blocks that have lost their connection to their inodes, it was not by chance that I used the word "theoretically". In practice this activity is extremely labor-intensive and does not give a guaranteed result. Anyone who has had to do this will agree that one would only attempt it out of complete despair. And this applies to all Linux file systems. So this aspect can be neglected when choosing a file system.

In terms of overall performance, ReiserFS is definitely faster than all other journaled FS, and in some respects it is superior to ext2. A speed comparison of some common file operations can be found here.
But with ReiserFS the compatibility situation is somewhat worse. Access to it from Windows is, as far as I know, impossible. Some operating systems of the BSD family (DragonFlyBSD, FreeBSD) support this file system, but only in read-only mode. There is even a non-zero probability that an arbitrary Linux LiveCD from previous years lacks ReiserFS support.

And here it's time to remember ext3fs. Its advantage is not at all in greater reliability: that is as much a legend as the instability of ReiserFS. I have heard no less about ext3fs crashes than about similar incidents with ReiserFS. I myself could not destroy either one. I only managed that with ext2, and even then a very long time ago, in the days of kernel 2.2 (or even 2.0).

No, the main advantage of ext3fs is its compatibility: it is guaranteed to be readable by any Linux system. For example, when restoring from some ancient LiveCD that happens to be at hand, a situation that is not so improbable in practice; I have been in it myself. Again, most BSD systems understand ext3fs without trouble (albeit without journaling). For Windows there are also, as far as I know, all kinds of drivers and plug-ins for common file managers (such as Total Commander) that provide access to partitions with ext2fs/ext3fs.

In terms of performance, ext3fs leaves a mixed impression. Firstly, its performance depends heavily on the journaling mode, of which there are three: full data journaling, partial journaling, and journaling of metadata only. In each mode it shows different performance on different types of file operations. However, in no case is the performance record-breaking.

If performance comes first, then ext2fs has no competition. In that case, however, you will have to put up with having no journaling at all and, consequently, with lengthy file system checks after any unclean shutdown. With the capacity of modern disks this can take a very long time...

The following can be said about XFS. In terms of compatibility, everything that was written about ReiserFS applies to it; moreover, for some time it was not supported by the standard Linux kernel. In terms of performance, XFS does not shine either, performing overall at approximately the same level as ext3fs. And file deletion is depressingly slow.
According to my observations, using XFS is justified when working not just with large files but with very large ones, which in practice means only DVD images and video files.

Let me return to the question of reliability. An ordinary power cut during normal user work is, as a rule, tolerated painlessly by all journaled file systems (and none of them ensures the safety of user operations not yet written to disk: rescuing the drowning remains the business of the drowning themselves). True, for any file system it is possible to construct a situation in which turning off the power leads to more or less serious damage. However, in real life such situations are unlikely to occur. And you can eliminate them completely by purchasing an uninterruptible power supply, which will give more confidence in the safety of data than any type of file system. In any case, the only guarantee that damaged data can be restored is regular backups...

I think the information presented above is enough for an informed choice. My personal choice for the past few years has been ReiserFS. Occasionally, on systems where it is justified to move everything possible outside the root partition, it makes sense to use ext3fs for the root file system and ReiserFS for everything else.

If a separate partition is set aside for the /boot directory (which the developers of the GRUB bootloader recommend), no file system other than ext2fs is justified for it; journaling makes no sense here. Finally, if you create a separate partition for all kinds of multimedia materials, then you can think about XFS.

If we approach the explanation more methodically

ext - in the early days of Linux, ext2 (extended file system version 2) was dominant. From 2002 it was replaced by ext3, which is largely compatible with ext2 but also supports journaling and, with kernel version 2.6 and higher, ACLs. The maximum file size is 2 TB, the maximum file system size is 8 TB. At the end of 2008, the release of ext4 was officially announced; it is backward compatible with ext3, but many functions are implemented more efficiently than before. In addition, the maximum file system size is 1 EB (1,048,576 TB), and you can expect this to be sufficient for some time.

reiser - the system was named after its founder, Hans Reiser, and was the first journaling file system to be included in the Linux kernel. In SUSE it was even considered the standard for some time. The main advantages of reiser compared to ext3 are higher speed and more efficient placement of small files (and in a file system, as a rule, most files are small). Over time, however, the development of reiserfs stopped. Version 4 was announced long ago but is still not ready, and support for version 3 has ceased.

xfs - the xfs file system was originally developed for SGI workstations running the IRIX operating system. Xfs is especially good for working with large files and is particularly well suited to streaming video. The system supports quotas and extended attributes (ACLs).
jfs

jfs - the abbreviation JFS stands for "Journaled File System". It was originally developed by IBM and then adapted for Linux. Jfs never enjoyed much recognition on Linux and currently leads a miserable existence, inferior to other file systems.
btrfs

btrfs - if the leading kernel developers have their way, the btrfs file system has a bright future in Linux. This system was developed from scratch at Oracle. It includes support for device-mapper and RAID. Btrfs is most similar to the ZFS system developed by Sun. Its most interesting features include on-the-fly file system checks and SSD support (solid state drives are disks based on flash memory). Unfortunately, work on btrfs will not be completed in the foreseeable future. Fedora, starting from version 11, provides the ability to install btrfs, but I recommend it only to file system developers!
There is no "fastest" or "best" file system: the assessment depends on what you intend to use the system for. For beginner Linux users working on a local computer, ext3 is recommended, and for server administrators, ext4. Of course, ext4 is faster than ext3, but at the same time data reliability in ext4 is much worse: you may well lose information if the system suddenly turns off.

If you have installed a second UNIX-like operating system on your computer, then the following file systems will be useful to you when exchanging data (from one OS to another).

sysv - used in SCO, Xenix and Coherent OS.

ufs - used in FreeBSD, NetBSD, NextStep and SunOS. Linux can only read information from such file systems, but cannot make changes to the data. To access BSD partitions, you will additionally need the BSD disklabel extension. A similar extension exists for SunOS partition tables.

zfs - a relatively new system, developed by Sun for Solaris. Because the ZFS code is not GPL compliant, it cannot be integrated into the Linux kernel. For this reason, Linux supports this file system only indirectly, through FUSE.
Windows, Mac OS X

The following file systems will be useful when exchanging information with MS DOS, Windows, OS/2 and Macintosh.

vfat - used in Windows 9x/ME. Linux can read information from such partitions and make changes to it. vfat system drivers allow you to work with older MS DOS file systems (8 + 3 characters).

ntfs - the system is used in all modern versions of Windows, from NT onward. Linux can read and modify its files.

hfs and hfsplus - these file systems are used in Apple computers. Linux can read and modify their files.

Data CDs and DVDs typically use their own file systems.

iso9660 - the file system for CD-ROMs is described in the ISO 9660 standard, which allows only short file names. Long names are supported differently on different operating systems, using a variety of extensions that are incompatible with each other. Linux supports both the Rock Ridge extension, which is common in UNIX, and the Joliet extension, developed by Microsoft.

udf - this format (universal disk format) appeared and developed as a successor to ISO 9660.

Network file systems

File systems do not have to be on a local disk: they can also be connected to a computer over a network. The Linux kernel supports various network file systems, of which the following are the most commonly used.

smbfs/cifs - help connect Windows or Samba network directories to a directory tree.

nfs is the most important network file system in UNIX.

coda - this system is very similar to NFS. It has many additional features, but it is not very common.

ncpfs - runs on the NetWare Core Protocol; it is used by Novell NetWare.

Virtual file systems

Linux has several file systems that are not designed to store data on the hard drive (or other storage media), but only to exchange information between the kernel and user programs.
devpts - this file system provides access to pseudo-terminals (abbreviated as PTY) via /dev/pts/* according to the UNIX-98 specification. (Pseudo-terminals emulate a serial interface. On UNIX/Linux systems, such interfaces are used by terminal emulators such as xterm. Traditionally, devices such as /dev/ttypn are used. In contrast, the UNIX-98 specification defines new devices. More detailed information is given in the Text-Terminal-HOWTO.)
proc and sysfs - the proc file system is used to display service information related to kernel and process management. In addition to this, the sysfs file system exposes the relationships between the kernel and the hardware. The two file systems are mounted at /proc and /sys respectively.
tmpfs - this system is based on System V shared memory. It is usually mounted at /dev/shm and allows efficient exchange of information between two programs. On some distributions (such as Ubuntu), the /var/run and /var/lock directories are also created using the tmpfs file system. The files in these directories are used by some network daemons to store process identification numbers as well as file access information. Thanks to tmpfs, this data is now kept in RAM. The method guarantees high speed, and also that after the computer is turned off, no files remain in the /var/run or /var/lock directories.

usbfs - the usbfs file system, starting with kernel version 2.6, provides information about connected USB devices. It is usually integrated into the proc file system.

Other file systems

auto - in fact, there is no file system under that name. However, the word auto can be used in /etc/fstab or with the mount command to specify the file system. In this case, Linux will try to recognize the file system on its own. This method works with most major file systems.
autofs, autofs4

autofs, autofs4 are also not file systems, but kernel extensions that automatically execute the mount command for selected file systems. If a file system is not used for some time, the umount command is automatically run on it. This method is convenient primarily in cases where only a few of many NFS directories are actively used at the same time.

To perform such operations, the /etc/init.d/autofs script automatically starts the automount program at system startup. It is configured using the /etc/auto.master file. The corresponding programs are installed automatically, for example, in Red Hat and Fedora. In any case, autofs is activated only after /etc/auto.master or /etc/auto.misc has been configured.
cramfs and squashfs

cramfs and squashfs - the Cram and Squash file systems are read-only. They are used to pack as many compressed files as possible into flash memory or ROM (read-only memory).

fuse - FUSE stands for Filesystem in Userspace and allows filesystem drivers to be developed and used outside the kernel. Therefore, FUSE is always used with an external file system driver. FUSE works, in particular, with the NTFS driver ntfs-3g.

gfs and ocfs - Global File System and Cluster File System from Oracle (Oracle Cluster File System) allow you to build giant network file systems that can be accessed in parallel by many computers at the same time.

jffs and yaffs - Journaling Flash File System and Yet Another Flash File System are specifically optimized to work with solid state drives and flash media. Using special algorithms, they try to use all memory cells evenly (wear leveling technology) to avoid premature system failure.
loop

loop - used to work with pseudo devices. A loopback device is an adapter that can access a regular file as a block device. Thanks to it, you can place any file system in any file, and then connect it to the directory tree using mount. The kernel function responsible for this - pseudo-device support - is implemented in the loop module.

Pseudo-devices have a variety of uses. In particular, they can be used when creating initial RAM disks for GRUB or LILO, when implementing encrypted file systems, or when testing ISO images for CDs.

Storage media file systems

File systems
ISO 9660
Joliet – an ISO 9660 file system extension.
Rock Ridge (RRIP, IEEE P1282) – an ISO 9660 file system extension designed to store file attributes used in POSIX operating systems.
Amiga Rock Ridge Extensions
El Torito
Apple ISO9660 Extensions
HFS, HFS+
Universal Disk Format (UDF) – a specification of an operating-system-independent file system format for storing files on optical media. UDF is an implementation of the ISO/IEC 13346 standard.
Mount Rainier

Let's look at the logical structure of the ext2fs file system. Physically, a hard disk is divided into sectors of 512 bytes. The first sector of a disk partition in any file system is considered the boot area. On the primary partition, this area contains the boot record, a piece of code that initiates the loading of the operating system at startup. This area is not used on other partitions. The remaining sectors are combined into logical blocks of 1, 2 or 4 kilobytes. A logical block is the smallest addressable piece of data: the data of each file occupies a whole number of blocks. Blocks, in turn, are combined into groups of blocks. Groups of blocks and blocks within a group are numbered sequentially, starting from 1.

The data structures used when working with the ext2fs file system are described in the /usr/include/linux/ext2_fs.h header file.

The superblock serves as the starting point of the file system and stores all information about it. It is 1024 bytes in size and is located at an offset of 1024 bytes from the beginning of the file system. It is duplicated in each group of blocks, which allows you to quickly restore it after failures. The superblock determines the size of the file system, the maximum number of files in the partition, the amount of free space, and contains information about where to look for unallocated areas. When the OS starts, the superblock is read into memory, and all changes to the file system are first reflected in the copy of the superblock held in memory and are written to disk only periodically. This improves system performance because many users and processes are constantly updating files. On the other hand, when the system is stopped, the superblock must be written to disk, which is why the computer must not be shut down by simply cutting the power. Otherwise, at the next boot, the information recorded in the superblock will not correspond to the real state of the file system.

After the superblock comes a description (descriptor) of the group of blocks. The information stored in it allows you to find the block and inode bitmaps, as well as the inode table.

A block bitmap is a structure in which each bit indicates whether the corresponding block is allocated to a file. A value of 1 indicates that the block is in use. This map is used to search for free blocks when space has to be allocated for a file.

The inode bitmap performs a similar function for the inode table: it shows which inodes are in use.

Each file has one and only one inode (inode, i-node, information node), which is identified by its serial number - the file index. The inode stores the file's metadata. Among them are all the attributes of the file except its name, and a pointer to the file data.

For a regular file or directory, this pointer is an array of 15 block addresses. The first 12 addresses in this array are direct links to the block numbers in which the file data is stored. If the data does not fit into 12 blocks, then the indirect addressing mechanism is activated. The next address in this array is an indirect link, that is, the address of a block that stores a list of addresses of the next blocks with data from this file.

How many blocks of data can be addressed this way? A block address takes 4 bytes, and a block, as already mentioned, takes 1, 2 or 4 kilobytes. This means that a single indirect block can address from 256 to 1024 data blocks.

What if the file is even longer? The next address in the pointer array points to the double indirect addressing block (double indirect block). This block contains a list of block addresses, which, in turn, contain lists of addresses of the next data blocks.

And finally, the last address in the pointer array specifies the address of the triple indirect block, that is, a block with a list of block addresses that are double indirect blocks.
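The arithmetic of the pointer scheme can be checked with a short Python sketch. It counts only what the 15-element address array can reach, assuming 4-byte block addresses as stated above; the function names are made up for illustration, and the real maximum file size is also limited by other fields of the file system, which are not modeled here.

```python
ADDRESS_SIZE = 4     # bytes per block address
DIRECT_POINTERS = 12


def addresses_per_block(block_size: int) -> int:
    """How many block addresses fit into one indirect block."""
    return block_size // ADDRESS_SIZE


def addressable_blocks(block_size: int) -> int:
    """Data blocks reachable through the direct, indirect, double indirect
    and triple indirect addresses of a single inode."""
    n = addresses_per_block(block_size)
    return DIRECT_POINTERS + n + n ** 2 + n ** 3


for size in (1024, 2048, 4096):
    print(size, addresses_per_block(size), addressable_blocks(size))
```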

It remains unclear where the file name is if it is not among the file data or its metadata. In UNIX-like systems, a file name is an attribute not of the file itself, but of the file system, understood as a logical directory structure. The file name is stored only in the directory to which the file is assigned, and nowhere else. Interesting consequences follow from this.

First, a single inode can correspond to any number of names in different directories, and all of them are equally real. The number of names (hard links) is counted in the inode. This is the number you can see with the ls -l command.

Secondly, deleting a file simply means removing its entry from the directory data and decrementing the link count by 1.

Thirdly, you can only match a name to an inode number within the same file system, which is why you cannot create a hard link to another file system (a symbolic one is possible, it has a different storage mechanism).
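These consequences are easy to observe from user space. The following Python sketch uses only the standard os and tempfile modules; the file names a.txt and b.txt are arbitrary, and it assumes the temporary directory lives on a file system that supports hard links.

```python
import os
import tempfile


def link_demo():
    """Create a file, give it a second name, delete the first name, and
    report what happens to the inode number, link count and data."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")
        b = os.path.join(d, "b.txt")
        with open(a, "w") as f:
            f.write("data")

        os.link(a, b)  # second directory entry for the same inode
        same_inode = os.stat(a).st_ino == os.stat(b).st_ino
        links_after_link = os.stat(a).st_nlink

        os.remove(a)   # removes one name and decrements the link count
        links_after_remove = os.stat(b).st_nlink
        with open(b) as f:
            content = f.read()  # the data is still reachable via the other name

    return same_inode, links_after_link, links_after_remove, content


print(link_demo())
```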

The directory itself is assigned to its parent directory in the same way. The root directory is always written to inode number 2 (number 1 is reserved for the list of addresses of bad blocks). Each directory stores a link to itself and to its parent directory: these are the pseudo-subdirectories "." and "..".

Thus, the number of links to a directory is equal to the number of its subdirectories plus two.

The directory data is a linked list with variable length entries and looks something like this:

Directory structure in ext2fs
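As a rough illustration of such variable-length entries, here is a Python sketch that walks one directory block. It assumes the newer entry layout with a one-byte name length and a one-byte file type (inode number, record length, name length, file type, then the name); the function name parse_dir_block is made up, and older ext2 revisions use a two-byte name length instead.

```python
import struct

ENTRY_HEADER = "<IHBB"  # inode (4), rec_len (2), name_len (1), file_type (1)
HEADER_SIZE = struct.calcsize(ENTRY_HEADER)  # 8 bytes


def parse_dir_block(block: bytes):
    """Return (inode, name, file_type) for every used entry in one block."""
    entries = []
    offset = 0
    while offset + HEADER_SIZE <= len(block):
        inode, rec_len, name_len, file_type = struct.unpack_from(
            ENTRY_HEADER, block, offset)
        if rec_len < HEADER_SIZE:
            break  # corrupt record length, stop rather than loop forever
        if inode != 0:  # inode 0 marks an unused entry
            start = offset + HEADER_SIZE
            name = block[start:start + name_len].decode("utf-8", "replace")
            entries.append((inode, name, file_type))
        offset += rec_len  # rec_len is the link to the next entry
    return entries
```

The record length, not the name length, is what chains the entries together, so the last entry in a block can simply have a record length that reaches the end of the block.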

What about the files of physical devices? They can be located in the same directories as regular files: nothing in the directory data indicates whether a name belongs to a file on disk or to a device. The difference is at the inode level. While the inode of a regular file points to the disk blocks where its data is stored, the inode of a device file contains a pointer into the list of device drivers in the kernel, namely to the element of the list that corresponds to the major device number:

Difference between regular file and device file
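The difference is visible through the ordinary stat interface. Below is a small Python sketch, assuming a Unix-like system; the function name describe is made up, and /dev/null is used only as a device file that is normally present.

```python
import os
import stat


def describe(path: str):
    """Return ('char' | 'block', major, minor) for a device file,
    or ('other', None, None) for anything else."""
    st = os.stat(path)
    if stat.S_ISCHR(st.st_mode):
        return ("char", os.major(st.st_rdev), os.minor(st.st_rdev))
    if stat.S_ISBLK(st.st_mode):
        return ("block", os.major(st.st_rdev), os.minor(st.st_rdev))
    return ("other", None, None)


print(describe("/dev/null"))  # a character device
print(describe("."))          # a directory, so no device numbers
```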

ext2fs file system properties:

The maximum file system size is 4 TB.

The maximum file size is 2 GB.

The maximum file name length is 255 characters.

The minimum block size is 1024 bytes.

The number of allocated inodes is 1 per 4096 bytes of the partition.
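The last property translates directly into the number of files a partition can hold. A minimal Python sketch, assuming the ratio of one inode per 4096 bytes given above (the function name is made up, and the ratio can be changed when the file system is created):

```python
def default_inode_count(partition_bytes: int, bytes_per_inode: int = 4096) -> int:
    """Number of inodes created for a partition at the given ratio."""
    return partition_bytes // bytes_per_inode


# A hypothetical 1 GiB partition.
print(default_inode_count(1024 ** 3))
```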

VLADIMIR MESHKOV

ext2 file system architecture

The article discusses the logical structure of ext2, the file system of the Linux operating system.

Main components of ext2 file system

As in any UNIX file system, the ext2 file system includes the following components:

  • blocks and groups of blocks;
  • information node;
  • superblock.

Blocks and block groups

The entire disk partition space is divided into fixed-size blocks, multiples of the sector size - 1024, 2048 and 4096 bytes. The block size is specified when creating a file system on a hard disk partition. A smaller block size saves hard disk space, but also limits the maximum file system size. All blocks have serial numbers. In order to reduce fragmentation and the number of movements of hard disk heads when reading large amounts of data, blocks are combined into groups.

Information node

The basic concept of a file system is the information node, or inode. This is a special structure that contains information about the attributes and physical location of the file. The attributes of a file are its type (regular file, directory, etc.), access rights to it, owner ID, size, creation time. Physical location information is a sequence of absolute block numbers containing file data.

Superblock

Superblock is the main element of the ext2 file system. It contains the following file system information (the list is incomplete):

  • the total number of blocks and inodes in the file system;
  • number of free blocks and inodes in the file system;
  • file system block size;
  • number of blocks and inodes in the group;
  • inode size;
  • file system identifier;
  • number of the first data block.

The number of the first data block is, in other words, the number of the block containing the superblock. It is always 0 if the file system block size is greater than 1024 bytes, and 1 if the block size is 1024 bytes.

The integrity of the superblock directly determines whether the file system is usable. The operating system creates several backup copies of the superblock so that it can be restored in case of damage. The master copy is located at an offset of 1024 bytes from the beginning of the partition on which the file system was created (the first 1024 bytes are reserved for the operating system loader).
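A minimal Python sketch of reading the master copy is shown below. It decodes only a few leading fields, assuming the little-endian layout of the superblock structure in the Linux ext2 header (eleven 32-bit counters at the start and the 16-bit magic number 0xEF53 at byte 56); the function names and the returned dictionary keys are made up for illustration.

```python
import struct

SUPERBLOCK_OFFSET = 1024  # bytes from the start of the partition
SUPERBLOCK_SIZE = 1024
EXT2_MAGIC = 0xEF53


def parse_superblock(raw: bytes) -> dict:
    """Decode a few leading fields of an ext2 superblock."""
    (inodes, blocks, _reserved, free_blocks, free_inodes,
     first_data_block, log_block_size, _log_frag_size,
     blocks_per_group, _frags_per_group,
     inodes_per_group) = struct.unpack_from("<11I", raw, 0)
    (magic,) = struct.unpack_from("<H", raw, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("ext2 magic number not found")
    return {
        "inodes": inodes,
        "blocks": blocks,
        "free_blocks": free_blocks,
        "free_inodes": free_inodes,
        "first_data_block": first_data_block,
        "block_size": 1024 << log_block_size,  # stored as a power of two
        "blocks_per_group": blocks_per_group,
        "inodes_per_group": inodes_per_group,
    }


def read_superblock(path: str) -> dict:
    """Read the master superblock from a partition or image file."""
    with open(path, "rb") as f:
        f.seek(SUPERBLOCK_OFFSET)
        return parse_superblock(f.read(SUPERBLOCK_SIZE))
```

Calling read_superblock on a partition device usually requires root privileges; an image file created for testing works just as well.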

Early versions of the ext2 file system created a copy of the superblock at the beginning of every block group. This wasted a lot of disk space, so the number of superblock backups was later reduced: they are now placed only in block groups 0 and 1 and in groups whose numbers are powers of 3, 5 and 7 (3, 5, 7, 9, 25, 27, 49 and so on).
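The reduced placement can be expressed as a simple predicate. The Python sketch below assumes the sparse-superblock rule as used by current ext2 tools, where backups live in groups 0 and 1 and in groups whose numbers are powers of 3, 5 or 7; the function names are made up for illustration.

```python
def is_power_of(n: int, base: int) -> bool:
    """True if n equals base**k for some integer k >= 0."""
    while n > 1 and n % base == 0:
        n //= base
    return n == 1


def has_superblock_copy(group: int) -> bool:
    """True if the block group holds the superblock or one of its backups
    (assumption: sparse-superblock placement)."""
    if group in (0, 1):
        return True
    return any(is_power_of(group, base) for base in (3, 5, 7))


print([g for g in range(50) if has_superblock_copy(g)])
```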

Block Group Format

A generalized block diagram of the ext2 file system is shown in Fig. 1.

Almost all groups of blocks have the same format. In each group, in addition to information blocks, information about the occupancy of the group's blocks and inodes is stored in the form of bitmaps. Block group 0 also includes a superblock and a group descriptor table, which we will look at below.

The block occupancy bitmap is usually located in the first block of a group. If there is a backup superblock in the group, the bitmap is located in the second block of the group. The bitmap size is one block. Each bit of this map represents the state of the block. If the bit is set (1), then the block is busy; if it is reset (0), the block is free. The first block of the group corresponds to the zero bit of the map, the second block to the first bit, etc.
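Testing a bit in such a map takes one line of code. The Python sketch below assumes that bit 0 is the least significant bit of the first byte, which is how the ext2 bitmaps are laid out on disk as far as I know; the function names are made up for illustration.

```python
def block_in_use(bitmap: bytes, index: int) -> bool:
    """True if bit number `index` of the bitmap is set (block is busy)."""
    return bool((bitmap[index // 8] >> (index % 8)) & 1)


def first_free_block(bitmap: bytes, blocks_in_group: int):
    """Index of the first free block in the group, or None if it is full."""
    for i in range(blocks_in_group):
        if not block_in_use(bitmap, i):
            return i
    return None


def max_blocks_per_group(block_size: int) -> int:
    """A one-block bitmap has 8 bits per byte, one bit per block."""
    return 8 * block_size
```

Because the bitmap occupies exactly one block, the block size also caps the group size: 8192 blocks per group for 1 KB blocks and 32768 for 4 KB blocks.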

Inodes located within the same group are collected in a table. In a group's inode occupancy bitmap, each bit characterizes the state of an element in the group's inode table.

Each block group is described using a block group descriptor. A group descriptor is a structure that contains information about the addresses of the block occupancy bitmap, inode occupancy bitmap, and inode table of the corresponding group. All group descriptors are collected in a group descriptor table, which is stored in block group 0. As with the superblock, the operating system creates backup copies of the group descriptor table.

File reading algorithm

Each inode, like a block, has a sequence number that is unique within the file system and contains information about only one file. Thus, to gain access to the contents of a file, you need to know the serial number of the corresponding inode.

As mentioned above, information about the physical location of the file is contained in the inode. This information is a sequence of 32-bit block numbers containing the file data (Figure 1). The first 12 numbers are direct links to information blocks (direct blocks number). The 13th number is an indirect link (indirect blocks number). It contains the address of the block in which the addresses of information blocks are stored. The 14th number is a double indirect link (double blocks number), the 15th number is a triple indirect link (triple blocks number).
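To make the addressing scheme concrete, here is a small C sketch that decides which kind of link covers the n-th block of a file (counting from 0). It assumes that one block holds block_size / 4 block numbers, since each number is 32 bits; the enum and function names are hypothetical.

```c
#include <stdint.h>

enum blk_level {
    BLK_DIRECT,       /* one of the first 12 numbers in the inode */
    BLK_INDIRECT,     /* reached through the 13th number */
    BLK_DOUBLE,       /* reached through the 14th number */
    BLK_TRIPLE,       /* reached through the 15th number */
    BLK_OUT_OF_RANGE  /* beyond what the scheme can address */
};

/* Classify the n-th block of a file for a given file system block size. */
enum blk_level classify_block(uint64_t n, uint32_t block_size)
{
    uint64_t per_block = block_size / 4; /* 32-bit numbers per block */

    if (n < 12)
        return BLK_DIRECT;
    n -= 12;
    if (n < per_block)
        return BLK_INDIRECT;
    n -= per_block;
    if (n < per_block * per_block)
        return BLK_DOUBLE;
    n -= per_block * per_block;
    if (n < per_block * per_block * per_block)
        return BLK_TRIPLE;
    return BLK_OUT_OF_RANGE;
}
```

With a 1024-byte block, for example, blocks 0-11 are direct, blocks 12-267 go through the indirect link, and block 268 is the first one that needs the double indirect link.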

The file name is not included in the inode; the correspondence between file names and inode numbers is performed through directories.

Directories

Files on UNIX and POSIX systems are stored in a tree-like hierarchical file system. The file system root is the root directory, indicated by the "/" symbol. Each intermediate node in the file system tree is a directory. The terminal vertices of a file system tree are either empty directories or files. The absolute pathname of a file consists of the names of all directories leading to the specified file, starting from the root directory. Thus, the pathname /home/test.file means that the file test.file is located in the home directory, which, in turn, is located in the root directory “/”.

A directory, like a file, is described using an inode. The contents of a directory are an array of entries, each containing information about a file that is "inside" the current directory.

A directory entry has the following format:

  • file inode serial number;
  • record length in bytes;
  • file name;
  • file name length.

The search for a file's inode number always starts from the root directory. For example, to obtain the inode number of a file located in the root directory, the operating system must obtain the contents of the root directory, find an entry in it with the name of this file, and extract the inode number of the file from this entry.
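The lookup in a directory block can be sketched as follows. This is an illustration under stated assumptions, not the kernel's implementation: each entry is assumed to start with an 8-byte header (inode number, record length, name length and one extra type byte that the list above does not mention), followed by the name, and the host is assumed to be little-endian like the on-disk format. The names dirent_hdr and find_inode are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed fixed header of a directory entry; the name bytes follow it. */
struct dirent_hdr {
    uint32_t inode;     /* inode number, 0 means "unused entry" */
    uint16_t rec_len;   /* length of the whole record in bytes */
    uint8_t  name_len;  /* length of the name in bytes */
    uint8_t  file_type; /* type byte (assumption, not used here) */
};

/* Walk the records in one directory block and return the inode number
 * of the entry called `name`, or 0 if there is no such entry. */
uint32_t find_inode(const uint8_t *block, size_t block_size, const char *name)
{
    size_t want = strlen(name);
    size_t off = 0;

    while (off + sizeof(struct dirent_hdr) <= block_size) {
        struct dirent_hdr h;
        memcpy(&h, block + off, sizeof h);
        if (h.rec_len < sizeof h || off + h.rec_len > block_size)
            break; /* malformed record, stop scanning */
        if (h.inode != 0 && h.name_len == want &&
            off + sizeof h + want <= block_size &&
            memcmp(block + off + sizeof h, name, want) == 0)
            return h.inode;
        off += h.rec_len; /* the record length leads to the next entry */
    }
    return 0;
}
```

Comparing the stored name length with the length of the requested name before calling memcmp avoids false matches on a common prefix, such as "hom" against "home".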

The first few inode numbers are reserved by the file system; their list is contained in the header file:

/*
 * Special inode numbers
 */
#define EXT2_BAD_INO 1 /* Bad blocks inode */
#define EXT2_ROOT_INO 2 /* Root inode */
#define EXT2_ACL_IDX_INO 3 /* ACL inode */
#define EXT2_ACL_DATA_INO 4 /* ACL inode */
#define EXT2_BOOT_LOADER_INO 5 /* Boot loader inode */
#define EXT2_UNDEL_DIR_INO 6 /* Undelete directory inode */

Inode number 2 (the root inode) is reserved for the root directory. This inode is in block group 0 and occupies the second position in the inode table of this group. The number of the first unreserved inode is stored in the superblock.

Having determined the inode number of a file, the kernel calculates the number of the group in which this inode is located and its position in the group's inode table. By reading from this inode position, the operating system obtains complete information about the file, including the addresses of the blocks in which the contents of the file are stored.

The number of the block group in which the inode is located is calculated by the formula:

group = (inode_num - 1) / inodes_per_group

Where:

  • group – the required block group number;
  • inode_num – serial number of the inode defining the file;
  • inodes_per_group – the number of inodes in the group (this information is located in the superblock).

The inode position in the group inode table is determined by the formula:

index = (inode_num - 1) % inodes_per_group

where index is the inode position in the table.
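The two formulas can be put together in a short C sketch. The struct and function names are illustrative; the byte-offset helper additionally assumes that the inode table is a contiguous array of fixed-size records starting at a known block.

```c
#include <stdint.h>

struct inode_loc {
    uint32_t group; /* block group that holds the inode */
    uint32_t index; /* position inside that group's inode table */
};

/* Apply the two formulas above; inode numbers start at 1. */
struct inode_loc locate_inode(uint32_t inode_num, uint32_t inodes_per_group)
{
    struct inode_loc loc;
    loc.group = (inode_num - 1) / inodes_per_group;
    loc.index = (inode_num - 1) % inodes_per_group;
    return loc;
}

/* Byte offset of the inode from the start of the partition, given the
 * block number of the group's inode table (taken from its descriptor). */
uint64_t inode_offset(uint32_t inode_table_block, uint32_t block_size,
                      uint32_t index, uint32_t inode_size)
{
    return (uint64_t)inode_table_block * block_size +
           (uint64_t)index * inode_size;
}
```

For the root inode (number 2) this gives group 0 and index 1, that is, the second position in the inode table of group 0.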

Let's look at an example of getting the contents of the file test.file located in the root directory. To read the /test.file file you need to:

  • find an entry about this file in the array of entries in the root directory;
  • extract the serial number of the inode of the file, calculate the number of the group in which this inode is located;
  • extract the address of the group's inode table from the descriptor of this group;
  • calculate the inode position in this table;
  • read file inode;
  • extract the addresses of information blocks from the inode and read the information contained in these blocks.

Figure 2 shows in detail the steps to read the /test.file file.

    Steps 1-6 – reading the root directory:

  1. The group descriptor table is read from block group 0.
  2. The descriptor of block group 0 is retrieved from the group descriptor table, and the address of the group 0 inode table is read from it.
  3. The inode table is read from block group 0.
  4. The inode number of the root directory is fixed and equal to 2, so the second element is read from the inode table of group 0; it contains the address of the block with the contents of the root directory. Let's assume that this block is located in block group A.
  5. The block containing the root directory entries is read from block group A.
  6. A search is made for an entry named "test.file". If such an entry is found, the inode number of the file "test.file" is extracted from it.

    Having determined the inode number, you can access the information blocks of the file (steps 7-11):

  7. The number of the group in which the given inode is located and its position in the group inode table are calculated (assume the group number is B and the table position is X).
  8. The descriptor of block group B is retrieved from the group descriptor table, and the address of the inode table of this block group is read from it.
  9. The inode table is read from block group B.
  10. The inode located at position X is read from the inode table of block group B.
  11. The address of the block with the contents of the file /test.file is extracted from this inode, and the information is read from the block at that address.

Software implementation of the file reading algorithm

Initial data: there is a hard disk partition on which the ext2 file system has been created. This partition corresponds to the device file /dev/hda3. A subdirectory home has been created in the root directory of the partition, and in it there is a file test.file with the following content:

Would citrus live in the thickets of the south?

Yes, but a fake copy!

1234567890-=

Don’t think anything bad, this is not nonsense, but a test exercise from the training course for telegraph operators in the communications troops of the former USSR!

Attention! One important point should be taken into account. The created file will not be immediately written to disk, but will first go to the disk buffer. An attempt to immediately obtain the contents of a file using the above algorithm will lead to nothing, since information about this file is not physically available on the disk. It is necessary to “force” the system to write the disk buffer to disk. The easiest way to do this is to perform a reboot operation. Therefore, after the file is created, reboot the system.

Our task is to use the device file /dev/hda3 to read the file /home/test.file by accessing its information blocks directly.

Let's consider the software implementation of the module that performs this operation.

Header files:

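#define _LARGEFILE64_SOURCE /* makes pread64() available */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <linux/ext2_fs.h> /* ext2 structures and constants (kernel header of that era assumed) */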

The header file defines structural types that describe the main components of the ext2 file system - superblock, block group descriptor, information node, directory entry.

Let's briefly look at the fields that are included in each of these structures:

  1. Superblock structure struct ext2_super_block:
    • __u32 s_inodes_count – total number of inodes in the file system;
    • __u32 s_blocks_count – the total number of blocks in the file system;
    • __u32 s_free_blocks_count – number of free blocks;
    • __u32 s_free_inodes_count – number of free inodes;
    • __u32 s_first_data_block – number of the first data block (number of the block in which the superblock is located);
    • __u32 s_log_block_size – this value is used to calculate the block size. The block size is determined by the formula: block size = 1024 << s_log_block_size;
    • __u32 s_blocks_per_group – number of blocks in the group;
    • __u32 s_inodes_per_group – number of inodes in the group;
    • __u16 s_magic – ext2 file system identifier (signature 0xEF53);
    • __u16 s_inode_size – size of the information node (inode);
    • __u32 s_first_ino – number of the first unreserved inode.
  2. Block group descriptor structure struct ext2_group_desc:
    • __u32 bg_block_bitmap – address of the group's block occupancy bitmap;
    • __u32 bg_inode_bitmap – address of the group's inode occupancy bitmap;
    • __u32 bg_inode_table – address of the group inode table.
  3. Information node structure struct ext2_inode:
    • __u16 i_mode – file type and access rights. The file type is determined by bits 12-15 of this field:
      • 0xA000 – symbolic link;
      • 0x8000 – regular file;
      • 0x6000 – block device file;
      • 0x4000 – directory;
      • 0x2000 – character device file;
      • 0x1000 – FIFO channel.
    • __u32 i_size – size in bytes;
    • __u32 i_atime – time of last access to the file;
    • __u32 i_ctime – time of the last inode change (often loosely called the creation time);
    • __u32 i_mtime – time of last modification;
    • __u32 i_blocks – the number of blocks occupied by the file;
    • __u32 i_block[EXT2_N_BLOCKS] – addresses of information blocks (including all indirect links).

    The value of EXT2_N_BLOCKS is defined in the file:

    /* Constants relative to the data blocks */
    #define EXT2_NDIR_BLOCKS 12
    #define EXT2_IND_BLOCK EXT2_NDIR_BLOCKS
    #define EXT2_DIND_BLOCK (EXT2_IND_BLOCK + 1)
    #define EXT2_TIND_BLOCK (EXT2_DIND_BLOCK + 1)
    #define EXT2_N_BLOCKS (EXT2_TIND_BLOCK + 1)

  4. Directory entry structure struct ext2_dir_entry_2:

    #define EXT2_NAME_LEN 255

    • __u32 inode – file inode number;
    • __u16 rec_len – directory entry length;
    • __u8 name_len – file name length;
    • char name[EXT2_NAME_LEN] – file name.

Let's define the name of the partition on which the file system was created, along with the global structures and variables.

#define PART_NAME "/dev/hda3"

struct ext2_super_block sb;

/* buffer for storing the group descriptor table (one block, at most 4096 bytes assumed) */
unsigned char buff_grp[4096];

unsigned char buff[4096]; /* information buffer (one block, at most 4096 bytes assumed) */

int indev; /* device file descriptor */

int BLKSIZE; /* file system block size */

Let's define several functions that we need for work:

Superblock read function:

void read_sb()
{
    memset(&sb, 0, sizeof(sb));

We move 1024 bytes from the beginning of the partition and read the superblock into the structure struct ext2_super_block sb:

    if (lseek(indev, 1024, SEEK_SET) < 0) {
        perror("lseek");
        exit(-1);
    }
    if (read(indev, (char *)&sb, sizeof(sb)) < 0) {
        perror("read");
        exit(-1);
    }

Checking the file system identifier:

    if (sb.s_magic != EXT2_SUPER_MAGIC) {
        printf("Unknown file system type!\n");
        exit(-1);
    }

The EXT2_SUPER_MAGIC value is defined in the header file.

We display information about the file system that is located in the superblock:

    printf("\nSuperblock info\n-----------\n");
    printf("Inodes count - %u\n", sb.s_inodes_count);
    printf("Blocks count - %u\n", sb.s_blocks_count);
    printf("Block size - %u\n", 1024 << sb.s_log_block_size);
    printf("First inode - %u\n", sb.s_first_ino);
    printf("Magic - 0x%X\n", sb.s_magic);
    printf("Inode size - %u\n", sb.s_inode_size);
    printf("Inodes per group - %u\n", sb.s_inodes_per_group);
    printf("Blocks per group - %u\n", sb.s_blocks_per_group);
    printf("First data block - %u\n\n", sb.s_first_data_block);
    return;
}

Group descriptor table reading function:

void read_gdt()
{

Calculate the file system block size:

    BLKSIZE = 1024 << sb.s_log_block_size;

The group descriptor table is located in the block immediately after the first data block (that is, right after the superblock).

Reading the table:

    if (lseek(indev, (off_t)(sb.s_first_data_block + 1) * BLKSIZE, SEEK_SET) < 0) {
        perror("lseek");
        exit(-1);
    }
    if (read(indev, buff_grp, BLKSIZE) < 0) {
        perror("read");
        exit(-1);
    }
    return;
}

Function for getting the contents of an inode by its number:

void get_inode(int inode_num, struct ext2_inode *in)
{

The input parameters of the function are the inode sequence number and the struct ext2_inode structure.

    struct ext2_group_desc gd;
    __u64 group, index, pos;

We calculate the number of the block group in which the inode with the serial number inode_num is located:

    group = (inode_num - 1) / sb.s_inodes_per_group;

From the group descriptor table, extract the group descriptor and copy it into the struct ext2_group_desc gd structure:

    memset((void *)&gd, 0, sizeof(gd));
    memcpy((void *)&gd, buff_grp + (group * (sizeof(gd))), sizeof(gd));

We calculate the position of the inode with the serial number inode_num in the inode table of the group and read this inode into the structure struct ext2_inode:

    index = (inode_num - 1) % sb.s_inodes_per_group;
    pos = ((__u64)gd.bg_inode_table) * BLKSIZE + (index * sb.s_inode_size);
    /* read only the part described by struct ext2_inode: the on-disk
       inode (s_inode_size bytes) may be larger than the structure */
    pread64(indev, in, sizeof(*in), pos);
    return;
}

Data block reading function:

void read_iblock(struct ext2_inode *in, int blk_num)
{
    __u64 pos;

The input parameters of the function are the inode structure and the block number (meaning the index in the sequence of block addresses stored in the inode).

We calculate the offset to the information block on the partition and read this block into the global buff buffer:

    pos = ((__u64)in->i_block[blk_num]) * BLKSIZE;
    pread64(indev, buff, BLKSIZE, pos);
    return;
}

Function to get the contents of the root directory:

void get_root_dentry()
{
    struct ext2_inode in;

The inode number of the root directory is known, so we get the inode of the root directory and read the directory's contents into the buff buffer:

    get_inode(EXT2_ROOT_INO, &in);
    read_iblock(&in, 0);

The buff buffer will contain the contents of the root directory.

    return;
}

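Function for getting the inode number by file name:

int get_i_num(char *name)
{

The input parameter of the function is the file name. The return value is the inode number of the file, or 0 if no entry with this name was found.

    int rec_len = 0;
    size_t n;
    struct ext2_dir_entry_2 dent;

The buff buffer contains an array of directory entries. To determine the inode number of a file, you need to find an entry in this array with the name of this file:

    while (rec_len < BLKSIZE) {
        /* never copy past the end of the block held in buff */
        n = BLKSIZE - rec_len;
        if (n > sizeof(dent)) n = sizeof(dent);
        memcpy((void *)&dent, (buff + rec_len), n);
        if (dent.rec_len == 0) break; /* damaged entry, stop the search */
        /* compare lengths first so that a name prefix does not match */
        if (dent.inode != 0 && dent.name_len == strlen(name) &&
            !memcmp(dent.name, name, dent.name_len))
            return dent.inode;
        rec_len += dent.rec_len;
    }
    return 0;
}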

Now let's write the main function:

int main()
{

Variables and structures:

    struct ext2_inode in;
    /* absolute pathname of the file */
    const char *full_path = "/home/test.file";
    char buff1[EXT2_NAME_LEN + 1];
    static int i = 1;
    int n, i_num, outf, type;

The first character in an absolute pathname of a file must be a forward slash (/). Let's check this:

    if (full_path[0] != '/') {
        fprintf(stderr, "the path must start with a slash\n");
        exit(-1);
    }

Open the device file, read the superblock and the group descriptor table:

    indev = open(PART_NAME, O_RDONLY);
    if (indev < 0) {
        perror("open");
        exit(-1);
    }
    read_sb();
    read_gdt();

We get the contents of the root directory:

    get_root_dentry();

Now the buff buffer contains all the entries in the root directory (you can save them in a separate file if you want). With these entries available, we can get to the contents of test.file using the file reading algorithm described above. To do this, we set up a loop. In the body of the loop, we parse the absolute pathname of the file and extract its elements: the subdirectories (we have only one, home) and the name of the file we are looking for (test.file). For each element, we determine the inode number, read this inode and then get the contents of block zero (from the sequence of block addresses stored in the inode):

    while (1) {
        memset(buff1, 0, sizeof(buff1));
        for (n = 0; n < EXT2_NAME_LEN; n++, i++) {
            buff1[n] = full_path[i];
            if ((buff1[n] == '/') || (buff1[n] == '\0')) {
                i++;
                break;
            }
        }
        buff1[n] = '\0';

For each element of the absolute pathname of the file, we determine the inode number, read this inode into memory and then get the contents of block zero:

        i_num = get_i_num(buff1);
        if (i_num == 0) {
            fprintf(stderr, "%s: entry not found\n", buff1);
            exit(-1);
        }
        get_inode(i_num, &in);
        read_iblock(&in, 0);

Let's display information about the file (name, inode number, file size and type):

        printf("Inode number - %u\n", i_num);
        printf("File name - %s\n", buff1);
        printf("File size - %u\n", in.i_size);

The file type is determined by the highest four bits of the i_mode field of the struct ext2_inode structure:

        type = ((in.i_mode & 0xF000) >> 12);
        printf("Type - %d ", type);
        switch (type) {
        case 0x04:
            printf("(directory)\n");
            break;
        case 0x08:
            printf("(regular file)\n");
            break;
        case 0x06:
            printf("(block device file)\n");
            break;
        case 0x02:
            printf("(character device file)\n");
            break;
        default:
            printf("(unknown type)\n");
            break;
        }

Checking the file type. If this is a regular file, we interrupt the loop:

        if (type == 0x08) {

The buff buffer will contain information read from the information blocks of the /home/test.file file. Let's write this information to a file:

            /* only block zero was read, so write at most BLKSIZE bytes */
            n = (in.i_size < (__u32)BLKSIZE) ? (int)in.i_size : BLKSIZE;
            outf = open("out", O_CREAT | O_RDWR, 0600);
            write(outf, buff, n);
            close(outf);
            break;
        }
    }

We leave:

    close(indev);
    return 0;
}

This concludes our consideration of the logical structure of the ext2 file system.

ext2 (also called ext2fs), the Second Extended File System, is a file system for the Linux kernel. Its creator and developer is Rémy Card, who designed it to replace the previous version, ext.

In terms of speed and performance, this file system can serve as a benchmark. This is evidenced by the results of file system performance tests. For example, in Dell Tech Center's sequential read and write speed tests, the ext2 file system outperforms ext3 and is second only to the more modern ext4 in read speed.

The main disadvantage of ext2 is that it is not a journaling file system. However, this drawback was eliminated in the next file system - ext3.

ext2 is used on flash cards and solid-state drives (SSDs) because the lack of journaling is an advantage when working with drives with write cycle limits.

History of ext2 creation

In its early period of rapid development, Linux used the Minix file system. It was quite stable, but it was a 16-bit design. As a result, there was a strict limit of 64 MB per partition. In addition, the maximum length of a file name was limited to 14 characters.

Together these limitations led to the development of the "extended file system" (hence the name Extended File System). It was meant to solve these two key problems of Minix. The new file system was unveiled in April 1992. This was ext, which raised the file size limit to 2 gigabytes and set the file name length limit at 255 characters.

However, despite the success of the new file system, quite a few problems remained unsolved. For example, there was no support for separate access, and there were no time stamps for data modification. The need to solve these problems motivated the creation of the next version of the extended file system, ext2 (Second Extended File System). ext2 was developed in January 1993 and also implemented POSIX-compliant ACLs and extended file attributes.

ext2 logical organization

The ext2 directory hierarchy graph is represented as a network. This is due to the fact that one file can be included in several directories at once.

All file types have symbolic names. Hierarchically organized file systems typically use three types of names: simple, compound, and relative. Same thing in ext2. In the case of a simple name, the limitation is that its length should not exceed 255 characters, in addition, the name should not contain the NULL character and a slash.

As for the NULL character, the restrictions are related to the representation of strings in the C language; in the case of the slash character, it all lies in the fact that it is used as a separating character between directories.

The full name is a chain of simple symbolic names of all directories through which the path passes from root to this file. In ext2, a file can belong to several directories, which means that it can have several full names (one file - several full names). But one way or another, the full name determines the file.

ext2 attributes:

  • file type and access rights,
  • owner, access group,
  • information on permitted operations,
  • creation time, last access time, last modification time and deletion time,
  • current file size,
  • file specification:
    • regular file
    • directory,
    • byte-oriented device file,
    • block-oriented device file,
    • named pipe,
    • symbolic link,
  • number of occupied blocks,
  • other

File attributes are stored in special tables rather than in directories, as is usually the case in simple file systems. As a result, a directory entry has a very simple structure consisting of two parts: the inode number and the name.

Physical organization ext2

Disk partition structure

The following can be distinguished as part of ext2:

  • blocks and groups of blocks;
  • inode;
  • superblock.

The entire disk partition space is divided into fixed-size blocks, the blocks being a multiple of the sector size (1024, 2048, 4096 or 8192 bytes). The block size is specified when creating a file system on a disk partition. All blocks are assigned serial numbers. To reduce fragmentation and the number of movements of hard disk heads when reading large amounts of data, blocks are combined into groups.

The basic concept of a file system is the index descriptor, or inode (information node). This is a special structure containing information about the attributes and physical location of a file. Inodes are combined into a table located at the beginning of each block group. The superblock is the main element of the ext2 file system. It contains general information about the file system. The superblock is located 1024 bytes from the beginning of the partition. The integrity of the superblock determines whether the file system is usable. The OS creates several backup copies of the superblock in case the partition is damaged. The block after the superblock holds the global descriptor table: an array that describes the block groups and contains general information about all of them.

Block group

All blocks of the ext2 partition are divided into groups. A separate entry is created for each group in the global descriptor table. This entry stores the group's basic parameters: the block numbers of the block bitmap, the inode bitmap and the inode table, the number of free blocks in the group, and the number of inodes containing directories.

The block bitmap is a structure in which each bit indicates whether the corresponding block is allocated to a file. If the bit is 1, the block is busy. The inode bitmap performs a similar function: it shows which inodes are busy and which are not. The Linux kernel tries to distribute the inodes of directories evenly across groups, and to place the inodes of files in the same group as the parent directory. All remaining space, which appears in the table as data, is allocated for storing files.

Data addressing system

The data addressing system is one of the most important components of the file system. It is what makes it possible to locate the required file among the many free and occupied blocks on the disk.

ext2 uses the following file block addressing scheme. To store the file address, 15 fields are allocated, each of which consists of 4 bytes. If the file fits into 12 blocks, then the numbers of the corresponding clusters are listed in the first twelve fields of the address. If the file size exceeds 12 blocks, then the next field contains the address of the cluster in which the numbers of the next blocks of the file can be located. Thus, the thirteenth field is used for indirect addressing.

With a maximum block size of 4096 bytes, the cluster corresponding to the 13th field can contain up to 1024 numbers of the following file blocks. If the file size exceeds 12+1024 blocks, then the 14th field is used, which contains the address of a cluster containing 1024 cluster numbers, each of which refers to 1024 blocks of the file. Here double indirect addressing is already used. And if the file includes more than 12+1024+1048576 blocks, then the last 15th field is applied for triple indirect addressing.

This addressing system allows you to have files larger than 2 TB with a maximum block size of 4096 bytes.
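The arithmetic behind this scheme can be checked with a short C sketch. It counts only what the 15 address fields can reach, assuming block_size / 4 block numbers per block; any other limits that may apply to file size are not modelled, and the function names are hypothetical.

```c
#include <stdint.h>

/* Number of file blocks reachable through the 12 direct fields plus the
 * single, double and triple indirect fields. */
uint64_t max_addressable_blocks(uint32_t block_size)
{
    uint64_t p = block_size / 4; /* 4-byte block numbers per block */
    return 12 + p + p * p + p * p * p;
}

/* The same limit expressed in bytes. */
uint64_t max_addressable_bytes(uint32_t block_size)
{
    return max_addressable_blocks(block_size) * block_size;
}
```

For a 4096-byte block this gives 12 + 1024 + 1048576 + 1073741824 = 1074791436 blocks, or 4402345721856 bytes, which is slightly more than 4 TiB by the addressing arithmetic alone.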

As with any UNIX file system, ext2 includes the following components:

− blocks and groups of blocks;

− index descriptor;

− superblock.

The entire disk partition space is divided into fixed-size blocks, multiples of the sector size: 1024, 2048, 4096 or 8192 bytes. The block size is specified when the file system is created on the disk partition. A smaller block size saves hard disk space, but also limits the maximum file system size. All blocks have serial numbers. To reduce fragmentation and hard disk head movement when reading large amounts of data, blocks are combined into block groups.

The basic concept of a file system is the index descriptor, or inode (information node). This is a special structure that contains information about the attributes and physical location of the file. The inodes are organized into a table, which is located at the beginning of each block group.

Figure 10 - Generalized block diagram of ext2 FS

Superblock is the main element of the ext2 file system. It contains general information about the file system:

the total number of blocks and inodes in the file system,

number of free blocks and inodes in the file system,

file system block size,

number of blocks and inodes in a block group,

inode size,

file system identifier.

The superblock is located 1024 bytes from the beginning of the partition. The integrity of the superblock directly affects the operability of the file system. The operating system creates several backup copies of the superblock in case the partition is damaged. The block after the superblock holds the global descriptor table: a description of the block groups in the form of an array containing general information about all block groups in the partition.

All blocks of the ext2 partition are divided into groups of blocks. For each group, a separate entry is created in the global descriptor table, which stores the main parameters:

block number of the block bitmap,

block number of the inode bitmap,

block number of the inode table,

number of free blocks in the group,

number of inodes containing directories.

A block bitmap is a structure in which each bit indicates whether the corresponding block is allocated to a file. If the bit is 1, then the block is busy. A similar function is performed by the inode bitmap, which shows which inodes are occupied and which are not. The Linux kernel, using the number of inodes containing directories, tries to evenly distribute the inodes of directories into groups, and tries to move the inodes of files, if possible, into the group with the parent directory. All remaining space, designated as data in the table, is allocated for storing files.

ext3, or ext3fs, is a journaling file system used in operating systems based on the Linux kernel. It is based on the ext2 file system.

The main difference from ext2 is that ext3 is journaling, that is, it provides for recording some data that allows you to restore the file system in case of computer failures.

The standard provides three journaling modes:

writeback: only file system metadata, that is, information about its changes, is written to the journal. This mode cannot guarantee data integrity, but it already significantly reduces checking time compared to ext2;

ordered: the same as writeback, but data is guaranteed to be written to the file before information about changes to this file is written to the journal. This slightly reduces performance and also cannot guarantee data integrity (although it increases the likelihood that data survives when it is appended to the end of an existing file);

journal: full journaling of both file system metadata and user data. This is the slowest but also the safest mode; it can guarantee data integrity when the journal is stored on a separate partition (or better yet, on a separate hard drive).

The ext3 file system can support files up to 1 TB in size. With the Linux kernel 2.4, the file system size is limited by the maximum block device size, which is 2 terabytes. In Linux 2.6 (for 32-bit processors), the maximum block device size is 16 TB, however ext3 only supports up to 4 TB.

ext4 is a file system based on ext3 and compatible with it (forward and backward). It differs from ext3 in support of extents, groups of adjacent physical blocks managed as a single whole; increased speed of integrity checking and a number of other improvements.

New features of ext4 (compared to ext3):

Using extents. In the ext3 file system, data addressing was performed in the traditional way, block by block. This addressing method becomes less efficient as file sizes grow. Extents allow you to address a large number (up to 128 MB) of sequential blocks with one descriptor. Up to 4 pointers to extents can be placed directly in the inode, which is sufficient for small to medium-sized files.

48-bit block numbers. With a 4 KB block size, this allows up to one exabyte to be addressed (2^48 * 4 KB = 2^50 * 1 KB = 2^60 B = 1 EB).

Allocation of blocks in groups (multiblock allocation). The file system stores not only information about the location of free blocks, but also the number of free blocks following each other. When allocating space, the file system finds a fragment into which data can be written without fragmentation. This reduces the level of fragmentation of the file system as a whole.

Delayed allocation of blocks. Allocation of blocks to store file data occurs immediately before physical writing to disk (for example, when calling sync), and not when calling write. As a result, block allocation operations can be done not one at a time, but in groups, which in turn minimizes fragmentation and speeds up the process of block allocation. On the other hand, it increases the risk of data loss in the event of a sudden power failure.

The limit of 32,000 subdirectories per directory has been removed.

Reserving inodes when creating a directory (directory inodes reservation). When a directory is created, several inodes are reserved for it. When files are later created in this directory, the reserved inodes are used first; if none are left, the normal allocation procedure is followed.

Inode size. The default inode size has been increased from 128 to 256 bytes, which made the features listed below possible.

Nanosecond timestamps. Times stored in the inode are more precise. The range of stored times has also been extended: the previous upper limit was January 19, 2038 (UTC), and it is now April 25, 2514.
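The 2038 limit can be checked with a short calculation, assuming the old timestamp is a signed 32-bit count of seconds since January 1, 1970 (UTC). In UTC the last representable second falls on January 19, 2038; in time zones west of UTC it is still the evening of January 18.

```python
from datetime import datetime, timezone

# Largest value of a signed 32-bit seconds counter.
MAX_INT32 = 2**31 - 1

last_second = datetime.fromtimestamp(MAX_INT32, tz=timezone.utc)
print(last_second.isoformat())   # 2038-01-19T03:14:07+00:00
```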

Inode version. The inode now holds a version number that is incremented each time the file's inode changes.

Storing extended attributes in the inode (EA in inode). Keeping extended attributes such as ACLs and SELinux attributes directly in the inode can improve performance. Attributes that do not fit in the inode are stored in a separate 4 KB block.

Journal checksumming. Checksums are computed for journal transactions. This makes it easier to find and (sometimes) correct errors when checking file system integrity after a failure.
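The principle can be sketched as follows. This is a toy model, not the ext4 journal format: the record layout and function names are hypothetical, and CRC32 from Python's zlib module is used here only as a stand-in checksum.

```python
import zlib
from typing import List


def transaction_checksum(blocks: List[bytes]) -> int:
    """CRC32 over all blocks of one transaction, in order."""
    crc = 0
    for block in blocks:
        crc = zlib.crc32(block, crc)
    return crc


def commit(blocks: List[bytes]) -> dict:
    """Build a (hypothetical) journal record: the blocks plus a commit checksum."""
    return {"blocks": list(blocks), "checksum": transaction_checksum(blocks)}


def is_intact(record: dict) -> bool:
    """During recovery, replay a transaction only if its checksum still matches."""
    return transaction_checksum(record["blocks"]) == record["checksum"]


record = commit([b"inode table update", b"block bitmap update"])
intact_before = is_intact(record)

# Simulate a block that was only partially written before the power failed.
record["blocks"][1] = b"block bitmap upd"
intact_after = is_intact(record)

print(intact_before, intact_after)
```

A transaction whose checksum no longer matches is treated as incomplete and is not replayed, so a half-written journal entry does not corrupt the file system further.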

Persistent preallocation. Previously, to be sure of having space in the file system, an application had to fill that space with zeros. In ext4, an application can reserve a set of blocks for writing without spending time on initialization. If the application reads such blocks before writing them, the file system knows they are uninitialized and returns zeros instead of the old on-disk contents, so deleted data cannot be read without authorization.
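From an application's point of view, preallocation might look like the sketch below, which uses Python's os.posix_fallocate on a temporary file. It assumes a POSIX system; whether the space is reserved without writing zeros depends on the underlying file system, and the 1 MB size is arbitrary.

```python
import os
import tempfile

SIZE = 1024 * 1024  # reserve 1 MB (arbitrary size for the example)

with tempfile.NamedTemporaryFile() as tmp:
    fd = tmp.fileno()
    # Ask the file system to reserve SIZE bytes starting at offset 0.
    os.posix_fallocate(fd, 0, SIZE)
    reserved_size = os.fstat(fd).st_size
    # Read from the reserved range that the application has never written.
    first_bytes = os.pread(fd, 16, 0)

print(reserved_size)
print(first_bytes == b"\x00" * 16)
```

The application never wrote any zeros itself, yet the file has its full size and the unwritten range reads back as zeros rather than as whatever the disk blocks held before.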

Defragmentation without unmounting (online defragmentation).

Uninitialized block groups. This speeds up file system checking. Blocks marked as unused are checked in groups, and a detailed check is performed only if the group-level check shows damage inside.

Lecture 12.

Topic: Directory systems

The link between the file management system and the set of files is the file directory. The simplest form of directory system has a single directory that contains all the files. The directory holds information about the files, including their attributes, location, and ownership. Users access files by symbolic names. However, human memory limits the number of objects a user can refer to by name. A hierarchical organization of the namespace extends these limits considerably, which is why directory systems are hierarchical. The graph that describes the directory hierarchy can be a tree or a network. Directories form a tree if a file may belong to only one directory (Fig. 7.11), and a network if a file may belong to several directories.
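On a POSIX system, a file can be made to belong to several directories with a hard link. The sketch below assumes such a system and uses made-up directory and file names; it creates one file, adds a second directory entry for it in another directory, and confirms that both names refer to the same file.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    dir_a = os.path.join(root, "a")
    dir_b = os.path.join(root, "b")
    os.mkdir(dir_a)
    os.mkdir(dir_b)

    original = os.path.join(dir_a, "report.txt")
    with open(original, "w") as f:
        f.write("shared contents")

    # Add a second directory entry (hard link) for the same file in another directory.
    alias = os.path.join(dir_b, "report-link.txt")
    os.link(original, alias)

    same_file = os.path.samefile(original, alias)
    link_count = os.stat(original).st_nlink

print(same_file, link_count)   # True 2
```

Because the file is now reachable through two directories, the directory graph is no longer a pure tree.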

For example, in MS-DOS and Windows directories form a tree structure, while in UNIX they form a network structure. In general, a computing system may have several disk devices; even a PC usually has several: a floppy drive, a hard drive, and a CD-ROM (DVD) drive. How should file storage be organized in this case?

Fig. Directory systems

The first solution is to place an autonomous file system on each device, that is, the files on a device are described by a directory tree that is not connected in any way to the directory trees on other devices. In this case, to identify a file uniquely, the user must specify the logical device identifier along with the compound symbolic file name. Examples of such autonomous file systems are MS-DOS and Windows 95/98/Me/XP.

Another solution is to organize file storage so that the user can combine file systems located on different devices into a single file system described by a single directory tree. This operation is called mounting.

In UNIX, mounting works as follows. One of the available logical disk devices is designated as the system disk. Suppose there are two file systems located on different logical disks, one of which is the system disk (Fig. 7.12).

The file system on the system disk is called the root file system. To link the two file hierarchies, an existing directory in the root file system is chosen; in this example it is the directory loc. Once the mount is complete, the chosen directory loc becomes the root directory of the second file system, and through it the mounted file system is attached as a subtree of the overall tree.
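How a path in the combined tree is routed to the right file system can be sketched with a mount table and a longest-prefix match. Everything in this example is hypothetical: the file system names, the function, and the assumption that loc sits directly under the root as /loc.

```python
from typing import Dict, Tuple


def resolve(mounts: Dict[str, str], path: str) -> Tuple[str, str]:
    """Return (file system name, path inside that file system) for an absolute path.

    `mounts` maps mount points to file system names; "/" must be present.
    The longest mount point that is a prefix of `path` wins.
    """
    best = "/"
    for mount_point in mounts:
        prefix = mount_point.rstrip("/") + "/"
        if (path == mount_point or path.startswith(prefix)) and len(mount_point) > len(best):
            best = mount_point
    inner = path[len(best.rstrip("/")):] or "/"
    return mounts[best], inner


# Hypothetical layout: the second file system is mounted on the directory /loc.
mounts = {"/": "root_fs", "/loc": "second_fs"}

print(resolve(mounts, "/bin/ls"))           # ('root_fs', '/bin/ls')
print(resolve(mounts, "/loc/docs/a.txt"))   # ('second_fs', '/docs/a.txt')
print(resolve(mounts, "/loc"))              # ('second_fs', '/')
```

The user sees a single tree, while every path under /loc is served by the second file system with /loc acting as its root directory.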

Fig. Mounting

File attributes

The concept of a file includes not only the data it stores and its name, but also information describing the properties of the file. This information constitutes the file's attributes. The list of attributes may vary between operating systems. An example of possible attributes is shown below.

Attribute               Meaning
File type               Regular, directory, special, etc.
File owner              Current owner
File creator            ID of the user who created the file
Password                Password for access to the file
Time                    Time of creation, last access, and last modification
Current file size       Number of bytes in the file
Maximum size            Number of bytes to which the file size can grow
Read-only flag          0 – read/write, 1 – read only
"Hidden" flag           0 – normal, 1 – do not show in directory listings
"System" flag           0 – normal, 1 – system
"Archive" flag          0 – archived, 1 – archiving required
ASCII/binary flag       0 – ASCII, 1 – binary
Random access flag      0 – sequential access only, 1 – random access
"Temporary" flag        0 – normal, 1 – delete when the process ends
Key position            Offset of the key within a record
Key length              Number of bytes in the key field

The user can access attributes using the facilities the file system provides for this purpose. Typically, the value of any attribute can be read, but only some can be changed.
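On a POSIX system, for instance, attributes can be read with a stat call and some of them changed with calls such as chmod. The sketch below assumes such a system and uses Python's os and stat modules on a temporary file; the dictionary keys are chosen only for illustration.

```python
import os
import stat
import tempfile

with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()
    info = os.stat(tmp.name)

    # Read several attributes of the file.
    attributes = {
        "type": "regular" if stat.S_ISREG(info.st_mode) else "other",
        "owner_uid": info.st_uid,
        "size_bytes": info.st_size,
        "modified": info.st_mtime,
        "permissions": stat.filemode(info.st_mode),
    }

    # Only some attributes can be changed, for example the permission bits.
    os.chmod(tmp.name, stat.S_IRUSR)  # make the file read-only for its owner
    read_only = stat.filemode(os.stat(tmp.name).st_mode)

print(attributes["type"], attributes["size_bytes"])   # regular 5
print(read_only)                                      # -r--------
```

The file size and type are reported by the system and cannot be set directly, while the permission bits play the role of a read-only flag that the owner may change.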

File attribute values can be stored in directories, as is done, for example, in MS-DOS (Fig. 7.7). Another option is to place attributes in special tables, in which case the directories contain references to these tables.

Fig. 7. Attributes of MS-DOS files