Btrfs file system

November 3, 2013


A filesystem is something we cannot do without when storing data. Of course, the average user does not care about the filesystem as long as the computer works. More experienced users know that something like it exists, often under the name FAT or NTFS, and the really experienced ones know other filesystems as well. An insider knows that the filesystem is a very important part of the operating system: since it is used to store and read data, any problem or advantage it has greatly affects the functionality of the entire system.

Different file systems

Anyone deciding which disk to choose should also decide which file system to choose, but unfortunately most people don't. The FAT (FAT32) file system is very outdated. It is also unsuitable for removable media such as SD cards and USB flash drives, but unfortunately it is still used there thanks to Windows. With the advent of the NT platform, Microsoft introduced the NTFS file system, which became widespread with the release of Windows XP. This was a decent improvement over FAT, but sadly, to this day, Windows users are unable to take advantage of its features. Even NTFS is now obsolete and does not have the same capabilities as the new Linux filesystems. I say Linux, but many of them were primarily developed for various Unix systems and have been ported to Linux as well. Probably the most used filesystem on Linux is ext (currently version 4). ReiserFS is also widely used, although it is probably on the wane. Other filesystems are, for example, XFS or the JFS that I use. These filesystems are modern, powerful and stable, and were designed so that the user will not run into their limits in the future. Because mobile devices and flash drives work differently from regular drives, other new filesystems have been created specifically for flash memory. As the main example I have experience with, I will cite JFFS.

Novelties in filesystems

But it's pointless to stand still. Sun (now Oracle) brought another new filesystem idea into real deployment with ZFS. However, its use on Linux is problematic for licensing reasons, and no similar filesystem was available for Linux or any other OS. That is why the Btrfs filesystem was created. The capabilities of Btrfs are broad: it supports snapshots and uses copy-on-write (COW). What intrigued me, however, is that it is optimized for both traditional spinning disks and flash drives, that RAID is implemented directly in the filesystem (so far only RAID0 and RAID1) and, most interesting for me, that it computes checksums over individual blocks of data.

Disk quality

A little digression about disks

Lately, the quality of disks has been going downhill and my confidence in them is slim.
As capacity increases, and data density with it, so does the potential for problems with the disk. Disks have also always suffered from hidden data degradation. Nothing is permanent, everything changes with... The disks themselves compute checksums, so they are able to correct the data and then report the problem. However, the repair is not always successful, in which case the file in question may be irreversibly damaged. The disk firmware performs these checks and records the information, which is exposed through S.M.A.R.T. But S.M.A.R.T. is not always reliable. I have even come across information that some drives made by Western Digital handle these problems covertly: if the drive fails to repair a block, it returns an empty block of data and does not inform the user of the problem at all. There may be other problems with the data that are not caused by the drive itself. The fault may lie in another part of the computer hardware, such as an IDE/SATA controller or a bad cable, or in a software error.

I'm currently running into the problem that an integrity check of a RAID1 array showed that some blocks don't match, but it's basically impossible to tell which blocks they are and which file they belong to. With RAID1 on Btrfs, which computes checksums, this works much better. When a faulty block of data is read, the fault is detected, the block is read from the second disk, and the faulty block is fixed on the first disk :-). And this is my motivation for starting to use Btrfs.
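The read-and-repair behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the kernel implements it: the function names (checksum, read_with_repair) and the in-memory "disks" are made up for this example, and zlib.crc32 is used as a stand-in for the checksum that Btrfs actually uses.

```python
import zlib


def checksum(block: bytes) -> int:
    # Stand-in checksum (plain CRC32 from zlib), used here only for illustration.
    return zlib.crc32(block)


def read_with_repair(mirrors, index, expected_csum):
    """Return a valid copy of block `index` and the list of repaired mirrors.

    `mirrors` is a list of hypothetical in-memory disks (lists of bytes).
    """
    good = None
    for device in mirrors:
        if checksum(device[index]) == expected_csum:
            good = device[index]
            break
    if good is None:
        raise IOError(f"block {index}: no mirror holds a valid copy")

    repaired = []
    for dev_no, device in enumerate(mirrors):
        if checksum(device[index]) != expected_csum:
            device[index] = good  # overwrite the faulty copy with the good one
            repaired.append(dev_no)
    return good, repaired


# Two mirrored "disks" with the same three blocks and their stored checksums.
blocks = [b"alpha", b"beta", b"gamma"]
csums = [checksum(b) for b in blocks]
disk_a = list(blocks)
disk_b = list(blocks)

disk_a[1] = b"\x00" * 4  # simulate silent corruption on the first disk

data, repaired = read_with_repair([disk_a, disk_b], 1, csums[1])
# data == b"beta", repaired == [0], and disk_a[1] holds the good data again
```

Without the stored checksum, the two mirrors would simply disagree and there would be no way to tell which copy is the correct one.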

Test Btrfs

A simple test of Btrfs in RAID1, checked against real data.
Performed on Mageia3 distribution: Btrfs v0.20-rc1, kernel 3.8.13.4-desktop-1.mga3.

Create a Btrfs filesystem with RAID1 on two partitions of different disks:
mkfs.btrfs -m raid1 -d raid1 /dev/sdb2 /dev/sdc2 -L btrfs2
Mount the filesystem:
mount /dev/sdc2 /mnt/btrfs2/
Copy the data:
cp Mageia-4-alpha3-i586-DVD.iso /mnt/btrfs2/
Check the RAID status, disk usage and other information:
btrfs filesystem df /mnt/btrfs2
btrfs filesystem show
With this command, I corrupted the data on one disk:
dd if=/dev/zero of=/dev/sdb2 bs=1MB count=1000
Then read the data back:
cat /mnt/btrfs2/Mageia-4-alpha3-i586-DVD.iso-copy >/dev/null
Then in the kernel log (/var/log/kernel/info.log on Mageia, /var/log/kern.log on Debian) we can see messages about checksums not matching, together with the location of each block, followed by messages about the corrections and which sectors were affected:
Oct 25 15:28:49 localhost kernel: btrfs csum failed ino 260 off 671612928 csum 2566472073 private 2778140509
Oct 25 15:28:49 localhost kernel: btrfs csum failed ino 260 off 671617024 csum 2566472073 private 2800729912
Oct 25 15:28:49 localhost kernel: btrfs csum failed ino 260 off 671621120 csum 2566472073 private 1522128662
Oct 25 15:28:49 localhost kernel: btrfs csum failed ino 260 off 671674368 csum 2566472073 private 2448968283
Oct 25 15:28:49 localhost kernel: btrfs csum failed ino 260 off 671752192 csum 2566472073 private 1296282567
Oct 25 15:28:49 localhost kernel: btrfs csum failed ino 260 off 671756288 csum 2566472073 private 2828806260
Oct 25 15:28:49 localhost kernel: btrfs csum failed ino 260 off 671760384 csum 2566472073 private 1593117388
Oct 25 15:28:49 localhost kernel: btrfs csum failed ino 260 off 671764480 csum 2566472073 private 4136347329
Oct 25 15:28:49 localhost kernel: btrfs csum failed ino 260 off 671817728 csum 2566472073 private 2889709515
Oct 25 15:28:49 localhost kernel: btrfs csum failed ino 260 off 671821824 csum 2566472073 private 3334484093
Oct 25 15:28:49 localhost kernel: btrfs bad tree block start 0 18904940544
Oct 25 15:28:53 localhost kernel: btrfs read error corrected: ino 260 off 674168832 (dev /dev/sdb2 sector 1849600)
Oct 25 15:28:53 localhost kernel: btrfs read error corrected: ino 260 off 674172928 (dev /dev/sdb2 sector 1849608)
Oct 25 15:28:53 localhost kernel: btrfs read error corrected: ino 260 off 679321600 (dev /dev/sdb2 sector 1859664)
Oct 25 15:28:53 localhost kernel: btrfs read error corrected: ino 260 off 671481856 (dev /dev/sdb2 sector 1844352)
Oct 25 15:28:53 localhost kernel: btrfs read error corrected: ino 260 off 679243776 (dev /dev/sdb2 sector 1859512)
Oct 25 15:28:53 localhost kernel: btrfs read error corrected: ino 260 off 671485952 (dev /dev/sdb2 sector 1844360)
Oct 25 15:28:53 localhost kernel: btrfs read error corrected: ino 260 off 679247872 (dev /dev/sdb2 sector 1859520)
Oct 25 15:28:53 localhost kernel: btrfs read error corrected: ino 260 off 679747584 (dev /dev/sdb2 sector 1860496)
Oct 25 15:28:53 localhost kernel: btrfs read error corrected: ino 260 off 671490048 (dev /dev/sdb2 sector 1844368)
Oct 25 15:28:53 localhost kernel: btrfs read error corrected: ino 260 off 679751680 (dev /dev/sdb2 sector 1860504)
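With many such lines in the log, a short script can summarize them. The following is a minimal sketch that assumes the exact message format shown above (it differs between kernel versions); the function name summarize and the regular expressions are my own illustration, not part of any Btrfs tool.

```python
import re
from collections import Counter

# Patterns written against the log lines shown above; other kernels word them differently.
CSUM_FAILED = re.compile(r"btrfs csum failed ino (\d+) off (\d+)")
CORRECTED = re.compile(
    r"btrfs read error corrected: ino (\d+) off (\d+) \(dev (\S+) sector (\d+)\)"
)


def summarize(lines):
    """Count checksum failures per inode and corrected reads per (device, inode)."""
    failed = Counter()
    corrected = Counter()
    for line in lines:
        match = CSUM_FAILED.search(line)
        if match:
            failed[int(match.group(1))] += 1
            continue
        match = CORRECTED.search(line)
        if match:
            corrected[(match.group(3), int(match.group(1)))] += 1
    return failed, corrected


sample = [
    "Oct 25 15:28:49 localhost kernel: btrfs csum failed ino 260 off 671612928 csum 2566472073 private 2778140509",
    "Oct 25 15:28:49 localhost kernel: btrfs bad tree block start 0 18904940544",
    "Oct 25 15:28:53 localhost kernel: btrfs read error corrected: ino 260 off 674168832 (dev /dev/sdb2 sector 1849600)",
]

failed, corrected = summarize(sample)
for inode, count in failed.items():
    print(f"inode {inode}: {count} checksum failure(s)")
for (device, inode), count in corrected.items():
    print(f"{device}, inode {inode}: {count} corrected read(s)")
```

To run it on a real log, pass an open file instead of the sample list, for example summarize(open("/var/log/kern.log")).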

Deploying Btrfs

So I decided to deploy Btrfs on the backup disk and on the home partition. Not on the data partition, because I have RAID5 there, which Btrfs does not yet support directly, although it is already in development. Current version in Debian Wheezy: kernel 3.2.0-4-686-pae, Btrfs v0.19.

I also plan to use Btrfs capabilities for backups, both on the home partition and the backup disk.
Reporting used and free space is complicated with Btrfs. It is a shame that, even with RAID1, the listing does not show the size of the whole filesystem correctly: it shows the sum of the devices, as if it were RAID0.
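The difference between the reported sum and what RAID1 can really hold is easy to estimate by hand. The sketch below is a rule of thumb only, assuming every block is stored exactly twice on two different devices and ignoring metadata and allocation granularity; the function names are illustrative.

```python
def raw_total(device_sizes):
    """Sum of all device sizes, i.e. the figure a RAID0-style listing would show."""
    return sum(device_sizes)


def raid1_usable(device_sizes):
    """Estimated usable space when every block has two copies on different devices."""
    if len(device_sizes) < 2:
        raise ValueError("RAID1 needs at least two devices")
    total = sum(device_sizes)
    largest = max(device_sizes)
    # Half of the raw space at best; the largest device can only be
    # mirrored against the combined space of all the other devices.
    return min(total // 2, total - largest)


sizes_gb = [500, 500]  # two equally sized partitions, sizes in GB
print(raw_total(sizes_gb), raid1_usable(sizes_gb))  # prints: 1000 500
```

So with two 500 GB partitions the listing shows about 1000 GB, while only about 500 GB of data will fit.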

In case of a problem, the btrfs restore utility (shipped as btrfs-restore in older versions of btrfs-progs) can be used to recover the data.

After my first successful experience with Btrfs, I have been deploying it wherever possible since 2014: on system partitions, data partitions and backup disks. Thanks to Btrfs, it is very easy and efficient to back up both system and data in no time.

Additional links with information:
abclinux - Btrfs
Oracle's official Btrfs page
beginners-guide-to-btrfs

OpenAlt 2017 - BTRFS talk

A recording of my talk on Btrfs and backup:

Training

To take your IT department further, feel free to contact me to request BTRFS file system training.

Video using BTRFS

Live stream about the Btrfs filesystem, filmed with an Eken H9R:

RAID1 and blocks

In the basic implementation of RAID1 in Btrfs, a block is always stored on two disks. This can differ from traditional implementations, where a RAID1 array of N disks stores every block on all N disks. With Btrfs and the basic RAID1 profile, a block is always on two disks out of N.
If a block is to be stored on more disks, there are newer RAID profiles: RAID1C3, RAID1C4, etc.
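The "two disks out of N" rule can be illustrated with a toy allocator. This is a simplified model under my own assumptions, not the real Btrfs allocator: space is handed out in equal chunks, each chunk is placed on the `copies` devices that currently have the most free space, and the names place_chunk and usable_chunks are hypothetical.

```python
def place_chunk(free, copies=2, chunk=1):
    """Reserve one chunk on the `copies` devices with the most free space.

    `free` is a list of free space per device and is modified in place.
    Returns the chosen device indices, or None if the chunk does not fit.
    """
    by_free_space = sorted(range(len(free)), key=lambda i: free[i], reverse=True)
    chosen = by_free_space[:copies]
    if len(chosen) < copies or any(free[i] < chunk for i in chosen):
        return None
    for i in chosen:
        free[i] -= chunk
    return sorted(chosen)


def usable_chunks(sizes, copies=2):
    """Count how many chunks of data fit before allocation fails."""
    free = list(sizes)
    count = 0
    while place_chunk(free, copies) is not None:
        count += 1
    return count


three_disks = [4, 4, 4]  # three devices, four chunks of space each
print(usable_chunks(three_disks, copies=2))  # two copies of each chunk (RAID1)
print(usable_chunks(three_disks, copies=3))  # three copies of each chunk (RAID1C3)
```

In this model, three equal disks hold six chunks of data with two copies, but only four chunks with three copies, which is also what a traditional three-way mirror would hold.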

Notes

With Windows Server 2012, Microsoft introduced the ReFS filesystem, which is more modern and should be able to do things like snapshots. However, I am a Linux and Linux server expert, so I don't rate the ReFS file system because I have no experience with it.

Synology uses Btrfs as the default filesystem for its NAS devices, but it handles RAID with mdadm.

Red Hat has removed Btrfs support from the kernel of its Linux distribution.

phoronix.com: Linux 5.5 SSD RAID 0/1/5/6/10 Benchmarks Of Btrfs / EXT4 / F2FS / XFS

An interesting backup tool built on the Btrfs file system is btrfs-sxbackup, and here is a fork of btrfs-sxbackup.

The Rockstor project uses the BTRFS filesystem. Rockstor is software that turns a PC into a NAS. It will definitely be an interesting alternative to Synology, QNAP or FreeNAS. FreeNAS uses ZFS.

Where to next?

If you want to learn more about the Btrfs file system, read my how-to series on the Btrfs file system.
If you want to learn how to use Btrfs effectively, you can do so at Btrfs filesystem training.
For consultation and personalized service, feel free to contact me by email.
