FreeBSD, Compact Flash, ZFS, and minimum root partition size

The day I booted a FreeBSD system off Compact Flash I was hooked. CF is an extremely robust storage medium with no moving parts. CF cards have emerged completely intact from washing machines, clothes dryers, and impacts that would destroy any spinning disk. After setting up a system to boot from CF, I am confident that henceforth and forevermore, that system will have a functional boot disk.

I’ve stuck CF cards and USB thumb drives into servers in our data centers, our server room at the office, and my server closet. The practice has served me quite well but that is not to say that CF is perfect. Write speed is slow. There is a finite number of write cycles each block can endure. Some CF cards claim DMA support but don’t support it well enough to be useful. Some server boards do not include internal IDE or USB ports. But everywhere else, we use CF.

Because of CF write limits, I always mount the root partition read-only. Files on the / partition are not frequently altered so this rarely causes any inconvenience. We recently built a 6.7 terabyte storage array at work using an HP 320S chassis, a pile of disks, and ZFS. ZFS volumes aren’t bootable in FreeBSD but we had already installed a USB thumb drive as the boot partition.

After working with ZFS, I decided that gmirror was no longer sufficient for my personal file server. It needed ZFS, which meant upgrading to FreeBSD 7. This server has been running off a 256MB CF card for years. The CF card is so old it was actually made in the USA! While upgrading to 7.0 I ran into a snag: the FreeBSD kernel (and modules) now use over 100MB. That means 256MB is no longer enough space to hold both the new kernel and the old one.
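The space problem is simple arithmetic, and a small sketch makes it concrete. Only the 256MB card size and the roughly 100MB kernel size come from the post; the base-system figure below is an assumed placeholder, and the function name is made up for illustration.

```shell
#!/bin/sh
# Sizes in MB. CARD_MB and KERNEL_MB come from the post;
# BASE_MB is an assumed placeholder for everything else on /.
CARD_MB=256
KERNEL_MB=100
BASE_MB=80

# Print "yes" if two kernels plus the base system fit on the card, else "no".
two_kernels_fit() {
    card=$1
    kernel=$2
    base=$3
    needed=$((kernel * 2 + base))
    if [ "$needed" -le "$card" ]; then
        echo yes
    else
        echo no
    fi
}

RESULT=$(two_kernels_fit "$CARD_MB" "$KERNEL_MB" "$BASE_MB")
echo "new and old kernel both fit on a ${CARD_MB}MB card: $RESULT"
```

On a live system, `du -sm /boot/kernel /boot/kernel.old` and `df -m /` give the real numbers to plug in.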

18 Comments

  1. We’re trying to build a storage array with MSA70s, using Sun’s JBOD SAS controller.
    If it works, we can build an EVA-killer for a fraction of the cost.
    Sun’s T5220 can accommodate 6 of these SAS controllers, and each controller can address 126 disks….

  2. We split the disks between two RAIDZ2 pools with 5+1 in each pool. We joined both those pools into one large ZFS pool.

    Performance is mixed. Under some workloads (random writes, for example) it solidly trounced every other system we have benchmarked, including a variety of RAID cards in various configs (0, 5, 6, 1+0) and other software RAID solutions including gmirror and gmirror+gstripe. In other workloads, it was significantly slower than this same 320S configured with 5 software RAID-1 mirrors (geom_mirror) striped in a RAID-0 array.

    We had two separate needs we were testing for: a great big disk for storage and a high performance file system for MySQL. The result is that we use ZFS for file-storage applications and, for MySQL, a combination of hardware RAID-1 mirrors striped into RAID-0 via gstripe.
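    In ZFS terms, the layout described above is usually expressed as a single pool containing two RAIDZ2 groups, which ZFS then stripes across. The following is a minimal dry-run sketch of that idea, not the exact command used on this array: the pool name and the device names are hypothetical, and the six-disk group size is an assumption based on the "5+1" description.

```shell
#!/bin/sh
# Hypothetical names: adjust the pool and devices to match the real hardware.
POOL="tank"
VDEV_A="da0 da1 da2 da3 da4 da5"
VDEV_B="da6 da7 da8 da9 da10 da11"

# One pool, two raidz2 groups; ZFS stripes data across the groups.
ZPOOL_CMD="zpool create $POOL raidz2 $VDEV_A raidz2 $VDEV_B"

# Dry run: print the command instead of executing it.
echo "$ZPOOL_CMD"
```

    Printing the command first gives a chance to check the device list before anything is written to the disks.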

  3. James

    Thanks Matt.

    My setup – FreeBSD on thumb drive (no swap file) – only the 1GB memory that the server ships with, 4 disks – 250GB in DL320s. I was getting around 300 – 1000 IOPS in sequential workload with large files – works out to be between 50 – 80 MB/s. (RAIDZ1 – Striped)

    Currently I am presenting each drive as a RAID 0 volume to the OS (Smart Array): da0/da1/da2/da3. I have tested replacement and rebuild, and this works well. The only thing I have noticed is that it takes the defined space of the volume, so replacing a 250GB disk with a 400GB one only showed 250GB. I might have to install the HP CLI management package, delete the volume, and create a new one, since ZFS can dynamically grow onto larger disks.

    Currently my interests are:
    – Performance. (What are the limitations of the DL320s backend?)
    – The DL320s has two internal thumb drive slots. Is it possible.
    – Issues related to lack of swap file. (I believe zfs caches within Kernel space, and this shouldn’t be swapping)
    – Failure (Disk / Smart Array).

    Hope you can spare some thoughts…

    Thanks for your time again.

  4. Vincent

    Hi

    There are some nice CF-based solutions to run the Asterisk PBX software (e.g. AstLinux, Askozia), but they’re limited in what they can do, and I’d like to move to a real distro so I’m free to install any software I want.

    As a non-FreeBSD expert, how hard would it be to add a RAMFS, and make /var/log/ and /tmp point to it, so that the CF doesn’t wear out too fast?

    Thanks.

  5. Very easy. Simply mount your CF card as / and set the read only attribute. FreeBSD will automatically create a Memory File System for your /tmp and /var. It’ll even pre-populate all the directories in /var for you. Here’s the relevant line from my /etc/fstab:

    /dev/ad0s1a / ufs ro,noatime 1 1

    The ro means readonly, and noatime tells the OS not to update the access time on the file (which is rarely used).

    You can control the size of the MFS FreeBSD creates by adding settings in /etc/rc.conf. Peruse through /etc/defaults/rc.conf to see all the options available.
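    As a rough illustration of those settings, here is an /etc/rc.conf fragment using the memory file system knobs found in /etc/defaults/rc.conf. The sizes are illustrative assumptions, not recommendations, so check the defaults file on your release before copying them.

```shell
# /etc/rc.conf fragment (sizes are illustrative assumptions)
tmpmfs="YES"        # back /tmp with a memory file system
tmpsize="32m"       # size of the /tmp memory disk
varmfs="YES"        # back /var with a memory file system
varsize="64m"       # size of the /var memory disk
populate_var="YES"  # rebuild the standard /var directory tree at boot
```

    Setting these to "YES" forces the behavior instead of relying on the automatic detection that kicks in when / is read-only.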

  6. Nate

    I’m having a heck of a time setting up Linux or FreeBSD on a compact flash. Do you have any step by step instructions for the new FreeBSD user?

    Thanks in advance

    Nate

  7. For FreeBSD, it’s quite easy. Just insert your FreeBSD boot media (CD-ROM, netboot, etc) and your Compact Flash. When the FreeBSD installer prompts you for the disk to install onto, select the CF. If you have a smaller CF card (<1G) then do a minimum install.

  8. Tom

    I read about an /etc/rc.diskless2 file (old solid-state doc on freebsd.org/doc/en/articles) which would allow the creation of add’l dir’s and chmod’s. But it seems that 7.0 doesn’t support this feature anymore. How would you create a folder for a webserver’s log file (lighttpd) and set access rights for a ‘www’ user under /var/log? Also, how would you sym link /var/db/pkg to a CF (i.e., non-MD) filesystem so the pkg db doesn’t get trashed on power down? What replaced rc.diskless2 in 7.0?

  9. I circumvented this issue by installing a little shell script named syncvar that runs at startup/shutdown. At boot time, it copies the contents of /var.disk to the /var MFS. Make sure to tweak your rc scripts so that it runs after the /var is mounted and before any daemons that need stuff in /var. At shutdown time, it remounts your /var partition rw and copies the contents of your /var to flash.

    Make sure to install a statically compiled rsync in /bin. Syncvar is available on my Soekris page: http://www.tnpi.net/wiki/Soekris_Firewall
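    For readers who want to see the shape of the idea before downloading anything, here is a minimal sketch of the two steps. It is not the syncvar script itself: the function and variable names are made up, it assumes /var.disk lives on the root file system, and it uses `cp -Rp` as a dependency-free stand-in for rsync.

```shell
#!/bin/sh
# Hypothetical sketch of the boot/shutdown copy steps, not the real syncvar.
VAR_LIVE="${VAR_LIVE:-/var}"        # memory-backed /var
VAR_DISK="${VAR_DISK:-/var.disk}"   # persistent copy on flash
FLASH_MOUNT="${FLASH_MOUNT:-/}"     # assumption: /var.disk is on the root fs
SYNCVAR_REMOUNT="${SYNCVAR_REMOUNT:-no}"  # set to "yes" on a real system

# Boot: copy the saved tree from flash into the memory file system.
syncvar_start() {
    cp -Rp "$VAR_DISK"/. "$VAR_LIVE"/
}

# Shutdown: make flash writable, save the live tree, go back to read-only.
syncvar_stop() {
    if [ "$SYNCVAR_REMOUNT" = "yes" ]; then
        mount -u -o rw "$FLASH_MOUNT"
    fi
    cp -Rp "$VAR_LIVE"/. "$VAR_DISK"/
    if [ "$SYNCVAR_REMOUNT" = "yes" ]; then
        mount -u -o ro "$FLASH_MOUNT"
    fi
}
```

    Unlike `rsync -a --delete`, the `cp` stand-in never removes files from the flash copy that were deleted from the live /var, which is one reason to prefer rsync in practice.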

  10. Matt, thanks for the reply. Your wiki was helpful. BTW, I’m working w/ the Soekris platform as well. It looks like your recommendation is to mount the CF as ro. However, there seem to be a number of people running embedded systems w/ rw CF drives. Any recommendation on the filesystem that would work best for a writable CF partition? My application involves a low-volume mysql db, so I was thinking about mounting the root as ro and creating a rw /data partition for the db and b/u files. I would use an mfs for logs, etc.

  11. matt simerson

    Yes, I prefer to run with CF read only. That preference is more because I like to have my / partition mounted r/o. When the root partition is read only, it’s pretty difficult to end up in a state where the system won’t boot up and become remotely accessible. With a system that has no KVM, that’s a nice little benefit. I don’t just do that with my Soekris systems. I have a system with 24 1TB disks in it that boots off of a read only CF card. It also makes it a bit more difficult for a less experienced sysadmin to make changes that cause problems.

    I’ve put RDBMS on flash based media, but that was much newer SSD storage. It’s nice because you get really fast read transactions. With CF, your writes will be terribly slow. That may or may not be a problem for your workload.

  12. Tom

    Matt,
    Thanks again for syncvar, the example rc.d script. I just wanted to share some of the tweaks that I used to get it to run w/FreeBSD 7.0-RELEASE:
    1. command="/usr/local/bin/rsync" # Std location when rsync installed from ports
    2. # KEYWORD: shutdown # Red’d by rc.shutdown as per “man rc(8)”

    -tm

  13. gm

    anir, you don’t need jffs2. At least in the contexts we are talking, CF or USB flash memory i.e. SD, MMC. All of these memory cards contain a controller which handles the flash memory on-board. Jffs2 is only for situations with no controller where the kernel has direct access to the flash memory.
