The default should be the device's native blocksize, but some devices misreport it. You also lose performance if you use a larger blocksize than necessary.
If we can, I'd like to get a quirks list in place, but there have been higher priorities.
Does each of the other filesystems have its own quirks list? That seems suboptimal. Oh, I guess it's because it lives in each filesystem's userspace mkfs tool, not the kernel.
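For context, the block layer just exposes whatever the device reports. A minimal sketch for checking both values via sysfs — the sysfs root is parameterized here purely so the function can be exercised against a mock tree:

```shell
# Print the logical and physical block size a device reports, reading
# from a sysfs root passed as $1 (defaults to /sys). Device name in $2,
# e.g. "sda". A misreporting device shows up as 512/512 here even when
# the real physical sector size is 4096.
blocksizes() {
    local sysroot=${1:-/sys} dev=$2
    echo "logical:  $(cat "$sysroot/block/$dev/queue/logical_block_size")"
    echo "physical: $(cat "$sysroot/block/$dev/queue/physical_block_size")"
}
# Usage: blocksizes /sys sda
```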
I didn't realize that Linux Unplugged was still going; I haven't followed anything in the Jupiter Broadcasting sphere in almost a decade.
I'll queue this up to listen though I'm always afraid because I think Chris has gone pretty deep into conservative politics and that has the tendency of really pissing me off (though he's not as bad as Lunduke).
The tiered drive setup is more fully featured if you're interested in combining SSDs and HDDs. Unlike other cache drive setups like ZFS's L2ARC/SLOG and even bcache, the SSDs in this setup are still usable space. Otherwise I wouldn't want to use a 1TB SSD as a cache in front of a 4TB HDD, for example.
Another use case would be on a handheld like Steam Deck: internal SSD tiered with a microSD for unused games I keep installed 'just in case'.
I would even want it on my home NAS. Instead of a separate root SSD and all my files on an HDD ZFS array (maybe with smaller cache SSDs), combine it all into a single bcachefs filesystem. Maybe with a subvolume root that stays pinned to the SSD(s). And also being able to use the full capacity of different-sized drives, unlike ZFS, because it's a home NAS built on the cheap. No erasure coding yet though, so I'm in no rush to migrate my home NAS.
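A sketch of what that kind of tiered setup looks like at format time, based on the target options described in the bcachefs documentation — device names and labels here are illustrative, and the exact flags may vary by version:

```shell
# One SSD as the foreground/promote tier, one HDD as background storage.
# In bcachefs format, per-device options (like --label) precede the
# device they apply to; writes land on the ssd group and migrate to hdd
# in the background, but the SSD still counts as usable space.
bcachefs format \
    --label=ssd.ssd1 /dev/nvme0n1 \
    --label=hdd.hdd1 /dev/sda \
    --foreground_target=ssd \
    --promote_target=ssd \
    --background_target=hdd
mount -t bcachefs /dev/nvme0n1:/dev/sda /mnt
```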
One reason is to use the Linux page cache instead of dedicating RAM to ZFS, given how expensive memory is now. I am very happy with MGLRU and won't miss ZFS's ARC.
Actually listened to the podcast before. Happy that everything with the kernel situation kinda seemed to work out for you.
You kinda talked about EC already, but is there an ETA for resilvering?
You were talking about Valve helping in a big way. Was this monetary or development work? If development work, I'd be interested, because I know you're doing mainly correctness and features right now, but a while ago on the Phoronix forum you were talking about low-hanging fruit for performance work. Was that something of interest to Valve, and/or is it something being done right now to make bcachefs a good fs for gaming (whatever that means...)?
I'm hoping to have erasure coding done sometime in the first half of next year (knock on wood).
While reconcile was getting done we got a detailed outline of where EC resilvering is going to plug into it, so it's not looking like a huge amount of work anymore - and there have been people testing EC and reporting the occasional bug; it's been looking pretty solid.
We did some performance testing not too long ago, and it looked like we were in better shape than I thought. I'm still more interested in tracking down performance bugs than shaving cycles and going for raw IOPS.
And the userbase isn't complaining about performance at all, aside from the odd thing like accounting reads being slow (just fixed a couple of issues there) or the lack of defrag.
After debugging and stabilization, it's going to be more about usability, fleshing out missing features, more integration work (there's some systemd integration that needs to happen in the mount path), and telemetry/introspection improvements (I want all the data I can get for stabilization, and JSON reporting would be good for lots of things).
So, if you're asking if you can help, that's a decent list to start from :)
yes, it will. But we do want to communicate properly with systemd and let the user know what's going on if mount has to take a while because of some sort of recovery (instead of timing out), and various other things.
related, plymouth integration to let users know when their machine is booting up if a drive or the filesystem is unhealthy
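Until deeper integration lands, the timeout side can be worked around from the other direction with systemd's standard fstab mount options — a sketch, with the device and timeout purely illustrative:

```shell
# /etc/fstab fragment: give systemd a longer job timeout for this mount
# so a long fsck/recovery pass isn't killed mid-run.
# x-systemd.mount-timeout is a standard systemd.mount(5) option.
#
# UUID=<your-fs-uuid>  /data  bcachefs  defaults,x-systemd.mount-timeout=30min  0 0
```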
I love the idea of bcachefs: it gives a lot of the features of btrfs but includes encryption, which means no LUKS song and dance. But having played around with it on my laptop and Raspberry Pi(s) as the root filesystem, it just can't be trusted at the moment. I can't remember the exact problem, but I ran into bugs jumping to a new kernel version where bcachefs stopped working, and having to downgrade - but then the format had changed (I think I caused this), and I was just in a completely broken state. I really wanted to figure it out, but contemplating it after the fact, I just don't want to deal with those kinds of headaches for now.
I want to be able to use it in a way that I can rely on it for, say, the next 10 or 20 years, but it just isn't in that state. I can only feel comfortable using it on data or systems that I am not invested in.
We've been cranking through bugs fast; there are still bug reports coming in, but the severity and frequency have declined drastically while the userbase has gone up. Polling the userbase, it's been stabilizing fast.
But we won't really know we're there until we're there, so the main thing I can say is: if you report a bug like that, it'll get looked at fast; the debugging tools are top notch.
I had discussed this on the OFTC IRC channel at the time, which looks to have been around 2025-06-09; the last issue seemed to have been nasty. I think while you said it was fixed, I couldn't un-eff my specific situation, and I think I gave up.
I am grateful that you make accessing support quite easy by being available on the #bcache IRC channel with a lot of community support as well, but it is sometimes hard to fix these issues -- in my case I was usually in a VGA console without network access, so I couldn't simply export information/logs/diagnostics to show you without pulling out my phone camera etc. and that becomes a bit tedious in itself. It is partly my fault for using bcachefs for the root filesystem, with encryption etc. but I also knew what I was in for and I wanted to help provide the feedback and experience needed to help out.
It is just that, after a while, I felt like I kept running into issue after issue, and I kind of just gave up. I do run bcachefs on a secondary drive that is used for storage purposes, and it has been great. But yeah, I think running it as the root fs is just a scary proposition, especially if you don't want to put in the hard yards to diagnose and fix issues as they come and stay on top of them. I used Arch, so at the time I was getting the latest version of bcachefs and upgrading constantly.
That does happen sometimes. But look on the bright side: a lot of people are getting crash courses in low-level systems debugging, and those are skills that are not as common as they used to be - but they're still important.
If you look at the field of filesystem developers and kernel developers, we don't have nearly as many young people getting involved and learning this stuff as we used to, and that's a problem. We need a pipeline of people building deep expertise, and if even a tiny fraction of the people getting involved with the bcachefs community start developing an interest and learning this stuff, that's a success.
Six months ago was also still a very hectic time for debugging and stabilization, it's definitely gotten better.
What I've heard is that Kent is very proactive in listening to any and all bug reports to chase down root causes of issues like yours. I'm sure that any information you send his way to try to reproduce the issue would be helpful.
Do Arch and NixOS count? We're in the core package repositories for them, and have packages available for a list of others.
We're not aiming to be in GUI installers yet, that'll be sometime after taking the experimental label off. We're still going slow and steady; I don't think about doing things that will bring in more users until incoming bug reports are dead quiet (or as close to it as they ever get), and the userbase has been going up plenty fast all on its own by the activity I see.
So, sometime next year we'll be working on distro stuff again. Dunno when, I expect another spike in new users and bug reports when I take the experimental label off.
now that 6.18 is the new LTS kernel, will I have a good experience with bcachefs if I stay on that LTS kernel instead of tracking newer stable kernel versions?
I currently run NixOS with ZFS-on-root, and because ZFS is also out-of-tree, the "stable" ZFS version in nixpkgs isn't always compatible with the most recent stable kernel. To keep things simple I tend to just stick with the LTS kernels.
previously when I've tried to experiment with bcachefs on NixOS I ran into a catch-22 where I needed to upgrade to a newer kernel to get bcachefs support but doing so wouldn't be compatible with ZFS.
Forgive a bit of ignorance on this as it might be a dumb question, but now that bcachefs is a kernel module and not part of the kernel directly, is it still realistic for people to run bcachefs as their root filesystem? Do you know anyone doing this?
Distros generally build everything they can as modules these days, including filesystems. No reason not to; we've had initramfs since forever, and you can't build in everything that anyone might need to boot their machine.
As long as the testing pipelines are in place to make sure the DKMS module builds on every distro configuration (a good chunk of that is still manual, but there's a project to improve the test infrastructure) - in practice, no one will notice.
I wouldn't have noticed the DKMS switch on my NixOS laptop if I didn't know it was happening.
bcachefs was always a module. You don't want it in your kernel if you are not using it. The difference is that it used to ship in the mainline source tree, so the module came prebuilt and was already on your drive.
If you build bcachefs as a module yourself (via DKMS or directly), it works the same as if you got it from your distro.
If you use bcachefs as root, the danger is booting with a kernel that lacks the module.
I hate that bcachefs is not in the kernel, and my primary distro does not use DKMS. But, if you can get a module built, there is no loss of functionality or performance.
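For anyone running bcachefs as root out of tree, a few sanity checks before rebooting into a new kernel can avoid the missing-module trap. This is a sketch assuming a DKMS build on a dracut-based distro; adjust for mkinitcpio/initramfs-tools:

```shell
# Is the module built and findable for the running kernel?
modinfo -n bcachefs
# Is DKMS tracking it across installed kernels?
dkms status bcachefs
# Did it actually land in the initramfs? (dracut systems)
lsinitrd | grep bcachefs
# If not, force-include it so a bcachefs root can mount at boot:
echo 'add_drivers+=" bcachefs "' > /etc/dracut.conf.d/bcachefs.conf
dracut --force
```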
You can do that now via the data_allowed option.
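A minimal sketch of what that looks like at format time, assuming the per-device data_allowed option as described in the bcachefs docs (device names and labels are illustrative):

```shell
# Dedicate the NVMe device to journal + btree (i.e. metadata and the
# write-ahead log), keeping user data on the HDDs. Per-device options
# precede the device they apply to.
bcachefs format \
    --data_allowed=journal,btree --label=meta /dev/nvme0n1 \
    --data_allowed=user --label=hdd.hdd1 /dev/sda \
    --data_allowed=user --label=hdd.hdd2 /dev/sdb
```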
ZFS did a bunch of stuff right, it's just a much older design; pre-extents, and based on the original Unix filesystem design - filesystem as a database was still unproven at the time.
They were just working incrementally, which for the amount of new features ZFS already had was the smart decision at the time.
I was hoping to use bcachefs to have one pool with subvolumes for root (encrypted by tpm), and for the home folders (also encrypted but with different keys, for example for systemd-homed use).
Any chance for different encryption keys per subvolume?
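For reference, encryption today is whole-filesystem: one key, set at format time. A sketch of the current flow (device path illustrative):

```shell
bcachefs format --encrypted /dev/sdb   # prompts for a passphrase
bcachefs unlock /dev/sdb               # loads the key into the kernel keyring
mount -t bcachefs /dev/sdb /mnt
```

Per-subvolume keys would presumably need key material tracked per subvolume rather than in the superblock, which is why it's a feature request rather than an option flag today.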
I'm a happy bcachefs user. Haven't had any issues on a simple mirrored array, which I've been running since before it was in (and out of) the kernel. It's the best filesystem in 2025. Thank you for all your work.
What is the status of scrub?
Are there any technical barriers to implementing it, or is it just prioritization at this point?
FWIW I think there are probably a lot of sysadmin types who would move over to bcachefs if scrub was implemented. I know there are other cooler features like RS and send/receive, but those probably aren't blocking many from switching over.
The Linux kernel has well-defined internal interfaces for character streams, block devices, block-erase devices (mtd), and extent devices (LVM).
Has it been considered to have an official (but not exposed to userspace) "btree device" interface?
The idea being that you could write composable wrappers for btree devices the way you can write composable wrappers for block devices (dmsetup, etc). And have a common interface for these kinds of devices -- the kernel has at least two large and well-developed btree-on-a-block-device implementations (bcache/bcachefs and btrfs). Both of these implementations have been criticized as being quite monolithic and not as unixy ("many small sharp tools") as LVM/dmsetup are.
I'd love to have configurable tiered storage with delayed migration. To let the spinning rust drives stay off in deep sleep for days, unless the frontend caches don't have the data.
Still changing the on disk format as required, but we're at the point now where the end user impact should be negligible - and we aren't doing big changes.
Just after reconcile, I landed a patch series to automatically run recovery passes in the background if they (and all dependents) can be run online; this allows the 1.33 upgrade to run in the background.
And with DKMS, users aren't having to run old versions (forcing a downgrade) if they have to boot into an old kernel. That was a big support issue in the past, users would have to run old unsupported versions because of other kernel bugs (amdgpu being the most common offender).
Unfortunately there don't seem to be any questions and answers about why bcachefs isn't in the kernel. It was, but now it isn't. There was some hemming and hawing about "testing"?
Kent wasn't willing to play ball with how kernel dev gets done because of uh, personality differences, so he was booted from the mainline kernel. Linus's house, Linus's rules.
Also, the matter was discussed here in detail when the news broke a couple months ago, so yeah, focusing more on the technical merits is much more interesting IMO.
Why is 512 the default and, if 4096 is better, why is this not the default instead?
Happy to answer questions about all things bcachefs or what-have-you.
just please no more questions about whether or not bcachefs will be in the kernel, I've been asked that enough :)
I listened to the podcast; it was interesting.
Gonna throw out some questions you may or may not have gotten.
Are special devices like metadata or write-ahead log devices on the roadmap? Or distributed raid / other exotic raid types?
It would be interesting to hear your thoughts on these.
What do you think zfs got right with this and what did they get wrong?
Sorry. Not a question, just a feature request.
kinda makes me want to rebuild part of my homelab with bcachefs
This is hacker news, not drama queen news :)
Same thing.