[OpenBSD]

[FAQ Index] [To Section 13 - Multimedia] [To Section 15 - Packages and Ports]

14 - Disk Setup


Table of Contents


14.1 - Using OpenBSD's disklabel(8)

What is disklabel(8)?

First, be sure to read the disklabel(8) man page.

The details of setting up disks in OpenBSD varies somewhat between platforms. For i386, amd64, macppc, zaurus, and armish, disk setup is done in two stages. First, the OpenBSD slice of the hard disk is defined using fdisk(8), then that slice is subdivided into OpenBSD partitions using disklabel(8).

All OpenBSD platforms, however, use disklabel(8) as the primary way to manage OpenBSD partitions. Platforms that also use fdisk(8) place all the disklabel(8) partitions in a single fdisk partition.

Labels hold certain information about your disk, like your drive geometry and information about the filesystems on the disk. They also contain information about your disk itself, such as rotational speed, interleave, etc., which is there for historic reasons, and is often incorrect. Don't worry about this. The disklabel is then used by the bootstrap program to access the drive and to know where filesystems are contained on the drive. You can read more in-depth information about disklabel in the disklabel(5) man page.

On some platforms, disklabel helps overcome architecture limitations on disk partitioning. For example, on i386, you can have 4 primary partitions, but with disklabel(8), you use one of these 'primary' partitions to store *all* of your OpenBSD partitions (for example, 'swap', '/', '/usr', '/var', etc.), and you still have 3 more partitions available for other OSs.

disklabel(8) during OpenBSD's install

One of the major parts of OpenBSD's install is your initial creation of labels. During the install you use disklabel(8) to create your separate partitions. As part of the install process, you can can define your mount points from within disklabel(8), but you can change these later in the install or post-install, as well.

There is not one "right" way to label a disk, but there are many wrong ways. Before attempting to label your disk, see this discussion on partitioning and partition sizing.

For an example of using disklabel(8) during install, see the Setting up disks part of the Installation Guide.

Using disklabel(8) after install

After install, one of the most common reasons to use disklabel(8) is to look at how your disk is laid out. The following command will show you the current disklabel, without modifying it:

# disklabel wd0 <-- Or whatever disk device you'd like to view # Inside MBR partition 3: type A6 start 63 size 29880837 # /dev/rwd0c: type: ESDI disk: ESDI/IDE disk label: Maxtor 51536H2 flags: bytes/sector: 512 sectors/track: 63 tracks/cylinder: 16 sectors/cylinder: 1008 cylinders: 16383 total sectors: 29888820 rpm: 3600 interleave: 1 trackskew: 0 cylinderskew: 0 headswitch: 0 # microseconds track-to-track seek: 0 # microseconds drivedata: 0 16 partitions: # size offset fstype [fsize bsize cpg] a: 614817 63 4.2BSD 2048 16384 328 # Cyl 0*- 609 b: 409248 614880 swap # Cyl 610 - 1015 c: 29888820 0 unused 0 0 # Cyl 0 - 29651* d: 6291936 1024128 4.2BSD 2048 16384 328 # Cyl 1016 - 7257 e: 409248 7316064 4.2BSD 2048 16384 328 # Cyl 7258 - 7663 f: 1024128 9822960 4.2BSD 2048 16384 328 # Cyl 9745 - 10760 h: 2097648 7725312 4.2BSD 2048 16384 328 # Cyl 7664 - 9744

Note how this disk has only part of its disk space allocated at this time. Disklabel offers two different modes for editing the disklabel, a built-in command-driven editor (this is how you installed OpenBSD originally), and a full editor, such as vi(1). You may find the command-driven editor "easier", as it guides you through all the steps and provides help upon request, but the full-screen editor has definite use, too.

Let's add a partition to the above system.

Warning: Any time you are fiddling with your disklabel, you are putting all the data on your disk at risk. Make sure your data is backed up before editing an existing disklabel!

We will use the built-in command-driven editor, which is invoked using the "-E" option to disklabel(8).

# disklabel -E wd0 ... > a k offset: [10847088] size: [19033812] 2g Rounding to nearest cylinder: 4194288 FS type: [4.2BSD] > p m device: /dev/rwd0c type: ESDI disk: ESDI/IDE disk label: Maxtor 51536H2 bytes/sector: 512 sectors/track: 63 tracks/cylinder: 16 sectors/cylinder: 1008 cylinders: 16383 total bytes: 14594.2M free bytes: 7245.9M rpm: 3600 16 partitions: # size offset fstype [fsize bsize cpg] a: 300.2M 0.0M 4.2BSD 2048 16384 328 # Cyl 0*- 609 b: 199.8M 300.2M swap # Cyl 610 - 1015 c: 14594.2M 0.0M unused 0 0 # Cyl 0 - 29651* d: 3072.2M 500.1M 4.2BSD 2048 16384 328 # Cyl 1016 - 7257 e: 199.8M 3572.3M 4.2BSD 2048 16384 328 # Cyl 7258 - 7663 f: 500.1M 4796.4M 4.2BSD 2048 16384 328 # Cyl 9745 - 10760 h: 1024.2M 3772.1M 4.2BSD 2048 16384 328 # Cyl 7664 - 9744 k: 2048.0M 5296.4M 4.2BSD 2048 16384 16 # Cyl 10761 - 14921 > q Write new label?: [y]
In this case, disklabel(8) was kind enough to calculate a good starting offset for the partition. In many cases, it will be able to do this, but if you have "holes" in the disklabel (i.e., you deleted a partition, or you just like making your life miserable) you may need to sit down with a paper and pencil to calculate the proper offset. Note that while disklabel(8) does some sanity checking, it is very possible to do things very wrong here. Be careful, understand the meaning of the numbers you are entering.

On most OpenBSD platforms, there are sixteen disklabel partitions available, labeled "a" through "p". (some "specialty" systems may have only eight). Every disklabel should have a 'c' partition, with an "fstype" of "unused" that covers the entire physical drive. If your disklabel is not like this, it must be fixed, the "D" option (below) can help. Never try to use the "c" partition for anything other than accessing the raw sectors of the disk, do not attempt to create a file system on "c". On the boot device, "a" is reserved for the root partition, and "b" is the swap partition, but only the boot device makes these distinctions. Other devices may use all fifteen partitions other than "c" for file systems.

Disklabel tricks and tips

14.2 - Using fdisk(8)

Be sure to check the fdisk(8) man page.

fdisk(8) is used on some platforms (i386, amd64, macppc, zaurus and armish) to create a partition recognized by the system boot ROM, into which the OpenBSD disklabel partitions can be placed. Other platforms do not need or use fdisk(8). fdisk(8) can also be used for manipulations of the Master Boot Record (MBR), which can impact all operating systems on a computer. Unlike the fdisk-like programs on some other operating systems, OpenBSD's fdisk assumes you know what you want to do, and for the most part, it will let you do what you need to do, making it a powerful tool to have on hand. It will also let you do things you shouldn't or didn't intend to do, so it must be used with care.

Normally, only one OpenBSD fdisk partition will be placed on a disk. That partition will be subdivided by disklabel into OpenBSD filesystem partitions.

To just view your partition table using fdisk, use:

# fdisk sd0

Which will give an output similar to this:

Disk: sd0 geometry: 553/255/63 [8883945 Sectors] Offset: 0 Signature: 0xAA55 Starting Ending LBA Info: #: id C H S - C H S [ start: size ] ------------------------------------------------------------------------ *0: A6 3 0 1 - 552 254 63 [ 48195: 8835750 ] OpenBSD 1: 12 0 1 1 - 2 254 63 [ 63: 48132 ] Compaq Diag. 2: 00 0 0 0 - 0 0 0 [ 0: 0 ] unused 3: 00 0 0 0 - 0 0 0 [ 0: 0 ] unused

In this example we are viewing the fdisk output of the first SCSI drive. We can see the OpenBSD partition (A6) and its size. The * tells us that the OpenBSD partition is a bootable partition.

In the previous example we just viewed our information. What if we want to edit our partition table? Well, to do so we must use the -e flag. This will bring up a command line prompt to interact with fdisk.

# fdisk -e wd0 Enter 'help' for information fdisk: 1> help help Command help list manual Show entire OpenBSD man page for fdisk reinit Re-initialize loaded MBR (to defaults) setpid Set the identifier of a given table entry disk Edit current drive stats edit Edit given table entry flag Flag given table entry as bootable update Update machine code in loaded MBR select Select extended partition table entry MBR swap Swap two partition entries print Print loaded MBR partition table write Write loaded MBR to disk exit Exit edit of current MBR, without saving changes quit Quit edit of current MBR, saving current changes abort Abort program without saving current changes fdisk: 1>

Here is an overview of the commands you can use when you choose the -e flag.

fdisk tricks and tips

14.3 - Adding extra disks in OpenBSD

Well once you get your disk installed PROPERLY you need to use fdisk(8) (i386 only) and disklabel(8) to set up your disk in OpenBSD.

For i386 folks, start with fdisk. Other architectures can ignore this. In the below example we're adding a third SCSI drive to the system.

# fdisk -i sd2
This will initialize the disk's "real" partition table for exclusive use by OpenBSD. Next you need to create a disklabel for it. This will seem confusing.
# disklabel -e sd2 (screen goes blank, your $EDITOR comes up) type: SCSI ...bla... sectors/track: 63 total sectors: 6185088 ...bla... 16 partitions: # size offset fstype [fsize bsize cpg] c: 6185088 0 unused 0 0 # (Cyl. 0 - 6135) d: 1405080 63 4.2BSD 1024 8192 16 # (Cyl. 0*- 1393*) e: 4779945 1405143 4.2BSD 1024 8192 16 # (Cyl. 1393*- 6135)
First, ignore the 'c' partition, it's always there and is for programs like disklabel to function! Fstype for OpenBSD is 4.2BSD. Total sectors is the total size of the disk. Say this is a 3 gigabyte disk. Three gigabytes in disk manufacturer terms is 3000 megabytes. So divide 6185088/3000 (use bc(1)). You get 2061. So, to make up partition sizes for a, d, e, f, g, ... just multiply X*2061 to get X megabytes of space on that partition. The offset for your first new partition should be the same as the "sectors/track" reported earlier in disklabel's output. For us it is 63. The offset for each partition afterwards should be a combination of the size of each partition and the offset of each partition (Except the 'c' partition, since it has no play into this equation.)

Or, if you just want one partition on the disk, say you will use the whole thing for web storage or a home directory or something, just take the total size of the disk and subtract the sectors per track from it. 6185088-63 = 6185025. Your partition is

d: 6185025 63 4.2BSD 1024 8192 16
If all this seems needlessly complex, you can just use disklabel -E to get the same partitioning mode that you got on your install disk! There, you can just use "96M" to specify "96 megabytes". (Or, if you have a disk big enough, 96G for 96 gigs!) Unfortunately, the -E mode uses the BIOS disk geometry, not the real disk geometry, and often times the two are not the same. To get around this limitation, type 'g d' for 'geometry disk'. (Other options are 'g b' for 'geometry bios' and 'g u' for geometry user, or simply, what the label said before disklabel made any changes.)

That was a lot. But you are not finished. Finally, you need to create the filesystem on that disk using newfs(8).

# newfs sd2a

Or whatever your disk was named as per OpenBSD's disk numbering scheme. (Look at the output from dmesg(8) to see what your disk was named by OpenBSD.)

Now figure out where you are going to mount this new partition you just created. Say you want to put it on /u. First, make the directory /u. Then, mount it.

# mount /dev/sd2a /u

Finally, add it to /etc/fstab(5).

/dev/sd2a /u ffs rw 1 1

What if you need to migrate an existing directory like /usr/local? You should mount the new drive in /mnt and use cpio -pdum to copy /usr/local to the /mnt directory. Edit the /etc/fstab(5) file to show that the /usr/local partition is now /dev/sd2a (your freshly formatted partition.) Example:

/dev/sd2a /usr/local ffs rw 1 1

Reboot into single user mode with boot -s, move the existing /usr/local to /usr/local-backup (or delete it if you feel lucky) and create an empty directory /usr/local. Then reboot the system, and voila, the files are there!

14.4 - How to swap to a file

(Note: if you are looking to swap to a file because you are getting "virtual memory exhausted" errors, you should try raising the per-process limits first with csh's unlimit(1), or sh's ulimit(1).)

Swapping to a file doesn't require a custom built kernel, although that can still be done, this faq will show you how to add swap space both ways.

Swapping to a file.

Swapping to a file is the easiest and quickest way to get extra swap space setup. The file must not reside on a filesystem which has SoftUpdates enabled (they are disabled by default). To start out, you can see how much swap you currently have and how much you are using with the swapctl(8) utility. You can do this by using the command:

$ swapctl -l Device 512-blocks Used Avail Capacity Priority swap_device 65520 8 65512 0% 0

This shows the devices currently being used for swapping and their current statistics. In the above example there is only one device named "swap_device". This is the predefined area on disk that is used for swapping. (Shows up as partition b when viewing disklabels) As you can also see in the above example, that device isn't getting much use at the moment. But for the purposes of this document, we will act as if an extra 32M is needed.

The first step to setting up a file as a swap device is to create the file. It's best to do this with the dd(1) utility. Here is an example of creating the file /var/swap that is 32M large.

$ sudo dd if=/dev/zero of=/var/swap bs=1k count=32768 32768+0 records in 32768+0 records out 33554432 bytes transferred in 20 secs (1677721 bytes/sec)

Once this has been done, we can turn on swapping to that device. Use the following command to turn on swapping to this device

$ sudo chmod 600 /var/swap $ sudo swapctl -a /var/swap

Now we need to check to see if it has been correctly added to the list of our swap devices.

$ swapctl -l Device 512-blocks Used Avail Capacity Priority swap_device 65520 8 65512 0% 0 /var/swap 65536 0 65536 0% 0 Total 131056 8 131048 0%

Now that the file is setup and swapping is being done, you need to add a line to your /etc/fstab file so that this file is configured on the next boot time also. If this line is not added, your won't have this swap device configured.

$ cat /etc/fstab /dev/wd0a / ffs rw 1 1 /var/swap /var/swap swap sw 0 0

Swapping via a vnode device

This is a more permanent solution to adding more swap space. To swap to a file permanently, first make a kernel with vnd0c as swap. If you have wd0a as root filesystem, wd0b is the previous swap, use this line in the kernel configuration file (refer to compiling a new kernel if in doubt):

config bsd root on wd0a swap on wd0b and vnd0c dumps on wd0b

After this is done, the file which will be used for swapping needs to be created. You should do this by using the same command as in the above examples.

$ sudo dd if=/dev/zero of=/var/swap bs=1k count=32768 32768+0 records in 32768+0 records out 33554432 bytes transferred in 20 secs (1677721 bytes/sec)

Now your file is in place, you need to add the file to your /etc/fstab. Here is a sample line to boot with this device as swap on boot.

$ cat /etc/fstab /dev/wd0a / ffs rw 1 1 /dev/vnd0c none swap sw 0 0

At this point your computer needs to be rebooted so that the kernel changes can take place. Once this has been done it's time to configure the device as swap. To do this you will use vnconfig(8).

$ sudo vnconfig -c -v vnd0 /var/swap vnd0: 33554432 bytes on /var/swap

Now for the last step, turning on swapping to that device. We will do this just like in the above examples, using swapctl(8). Then we will check to see if it was correctly added to our list of swap devices.

$ sudo swapctl -a /dev/vnd0c $ swapctl -l Device 512-blocks Used Avail Capacity Priority swap_device 65520 8 65512 0% 0 /dev/vnd0c 65536 0 65536 0% 0 Total 131056 8 131048 0%

14.5 - Soft Updates

Soft Updates is based on an idea proposed by Greg Ganger and Yale Patt and developed for FreeBSD by Kirk McKusick. SoftUpdates imposes a partial ordering on the buffer cache operations which permits the requirement for synchronous writing of directory entries to be removed from the FFS code. Thus, a large performance increase is seen in disk writing performance.

Enabling soft updates must be done with a mount-time option. When mounting a partition with the mount(8) utility, you can specify that you wish to have soft updates enabled on that partition. Below is a sample /etc/fstab(5) entry that has one partition sd0a that we wish to have mounted with soft updates.

/dev/sd0a / ffs rw,softdep 1 1

Note to sparc users: Do not enable soft updates on sun4 or sun4c machines. These architectures support only a very limited amount of kernel memory and cannot use this feature. However, sun4m machines are fine.

14.6 - How does OpenBSD/i386 boot?

The boot process for OpenBSD/i386 is not trivial, and understanding how it works can be useful to troubleshoot a problem when things don't work. There are four key pieces to the boot process:
  1. Master Boot Record (MBR): The Master Boot Record is the first physical sector (512 bytes) on the disk. It contains the primary partition table and a small program to load the Partition Boot Record (PBR). Note that in some environments, the term "MBR" is used to refer to only the code portion of this first block on the disk, rather than the whole first block (including the partition table). It is critical to understand the meaning of "initialize the MBR" -- in the terminology of OpenBSD, it would involve rewriting the entire MBR sector, not just the code, as it might on some systems. You will rarely want to do this. Instead, use fdisk(8)'s "-u" command line option ("fdisk -u wd0").

    While OpenBSD includes an MBR, you are not obliged to use it, as virtually any MBR can boot OpenBSD. The MBR is manipulated by the fdisk(8) program, which is used both to edit the partition table, and also to install the MBR code on the disk.

    OpenBSD's MBR announces itself with the message:

    Using drive 0, partition 3.
    showing the disk and partition it is about to load the PBR from. In addition to the obvious, it also shows a trailing period ("."), which indicates this machine is capable of using LBA translation to boot. If the machine were incapable of using LBA translation, the above period would have have been replaced with a semicolon (";"), indicating CHS translation:
    Using Drive 0, Partition 3;
    Note that the trailing period or semicolon can be used as an indicator of the "new" OpenBSD MBR, introduced with OpenBSD 3.5.
  2. Partition Boot Record (PBR): The Partition Boot Record, also called the PBR or biosboot(8) (after the name of the file that holds the code) is the first physical sector of the OpenBSD partition of the disk. The PBR is the "first-stage boot loader" for OpenBSD. It is loaded by the MBR code, and has the task of loading the OpenBSD second-stage boot loader, boot(8). Like the MBR, the PBR is a very tiny section of code and data, only 512 bytes, total. That's not enough to have a fully filesystem-aware application, so rather than having the PBR locate /boot on the disk, the BIOS-accessible location of /boot is physically coded into the PBR at installation time.

    The PBR is installed by installboot, which is further described later in this document. The PBR announces itself with the message:

    Loading...
    printing a dot for every file system block it attempts to load. Again, the PBR shows if it is using LBA or CHS to load, if it has to use CHS translation, it displays a message with a semicolon:
    Loading;...
    The older (pre v3.5) biosboot(8) showed the message "reading boot...".
  3. Second Stage Boot Loader, /boot: /boot is loaded by the PBR, and has the task of accessing the OpenBSD file system through the machine's BIOS, and locating and loading the actual kernel. boot(8) also passes various options and information to the kernel.

    boot(8) is an interactive program. After it loads, it attempts to locate and read /etc/boot.conf, if it exists (which it does not on a default install), and processes any commands in it. Unless instructed otherwise by /etc/boot.conf, it then gives the user a prompt:

    probing: pc0 com0 com1 apm mem[636k 190M a20=on] disk: fd0 hd0+ >> OpenBSD/i386 BOOT 2.10 boot>
    It gives the user (by default) five seconds to start giving it other tasks, but if none are given before the timeout, it starts its default behavior: loading the kernel, bsd, from the root partition of the first hard drive. The second-stage boot loader probes (examines) your system hardware, through the BIOS (as the OpenBSD kernel is not loaded). Above, you can see a few things it looked for and found: The '+' character after the "hd0" indicates that the BIOS has told /boot that this disk can be accessed via LBA. When doing a first-time install, you will sometimes see a '*' after a hard disk -- this indicates a disk that does not seem to have a valid OpenBSD disk label on it.
  4. Kernel: /bsd: This is the goal of the boot process, to have the OpenBSD kernel loaded into RAM and properly running. Once the kernel has loaded, OpenBSD accesses the hardware directly, no longer through the BIOS.
So, the very start of the boot process could look like this:
Using drive 0, partition 3. <- MBR Loading.... <- PBR probing: pc0 com0 com1 apm mem[636k 190M a20=on] <- /boot disk: fd0 hd0+ >> OpenBSD/i386 BOOT 2.10 boot> booting hd0a:/bsd 4464500+838332 [58+204240+181750]=0x56cfd0 entry point at 0x100120 [ using 386464 bytes of bsd ELF symbol table ] Copyright (c) 1982, 1986, 1989, 1991, 1993 <- Kernel The Regents of the University of California. All rights reserved. Copyright (c) 1995-2007 OpenBSD. All rights reserved. http://www.OpenBSD.org OpenBSD 4.2 (GENERIC) #375: Tue Aug 28 10:38:44 MDT 2007 deraadt@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC ...

What can go wrong

As the PBR is very small, its range of error messages is pretty limited, and somewhat cryptic. Most likely messages are: Other error messages are detailed in the biosboot(8) manual page For more information on the i386 boot process, see

14.7 - What are the issues regarding large drives with OpenBSD?

OpenBSD supports an individual file system of up to 231-1, or 2,147,483,647 sectors, and as each sector is 512 bytes, that's a tiny amount less than 1T.

There is also a 1T limit on the size of the physical disk, although under *some* circumstances, that may not cause you problems up to 2T, although this is not guaranteed.

Of course, the ability of file system and the abilities of particular hardware are two different things. A new 250G IDE hard disk may have issues on older (pre >137G standards) interfaces, and some very old SCSI adapters have been seen to have problems with more modern drives, and some older BIOSs will hang when they encounter a modern sized hard disk. You must respect the abilities of your hardware, of course.

Partition size and location limitations

Unfortunately, the full ability of the OS isn't available until AFTER the OS has been loaded into memory. The boot process has to utilize (and is thus limited by) the system's boot ROM.

For this reason, the entire /bsd file (the kernel) must be located on the disk within the boot ROM addressable area. This means that on some older i386 systems, the root partition must be completely within the first 504M, but newer computers may have limits of 2G, 8G, 32G, 128G or more. It is worth noting that many relatively new computers which support larger than 128G drives actually have BIOS limitations of booting only from within the first 128G. You can use these systems with large drives, but your root partition must be within the first 128G.

Note that it is possible to install a 40G drive on an old 486 and load OpenBSD on it as one huge partition, and think you have successfully violated the above rule. However, it might come back to haunt you in a most unpleasant way:

Why? Because when you copied "over" the new /bsd file, it didn't overwrite the old one, it got relocated to a new location on the disk, probably outside the 504M range the BIOS supported. The boot loader was unable to fetch the file /bsd, and the system hung.

To get OpenBSD to boot, the boot loaders (biosboot(8) and /boot in the case of i386) and the kernel (/bsd) must be within the boot ROM's supported range, and within their own abilities. To play it safe, the rule is simple:

the entire root partition must be within the computer's BIOS (or boot ROM) addressable space.

Some non-i386 users think they are immune to this, however most platforms have some kind of boot ROM limitation on disk size. Finding out for sure what the limit is, however, can be difficult.

This is another good reason to partition your hard disk, rather than using one large partition.

fsck(8) time and memory requirements

Another consideration with large file systems is the time and memory required to fsck(8) the file system after a crash or power interruption. One should not put a 120G file system on a system with 32M of RAM and expect it to successfully fsck(8) after a crash. A rough guideline is the system should have at least 1M of available memory for every 1G of disk space to successfully fsck the disk. Swap can be used here, but at a very significant performance penalty, so severe that it is usually unacceptable, except in special cases. The time required to fsck the drive may become a problem as the file system size expands, but you only have to fsck the disk space that is actually allocated to mounted filesystems. Don't forget that if you have multiple disks on the system, they could all end up being fsck(8)ed after a crash at the same time, so they could require more RAM than a single disk.

14.8 - Installing Bootblocks - i386/amd64 specific

Modern versions of OpenBSD (3.5 and later) have a very robust boot loader that is much more indifferent to drive geometries than the older boot loader was, however, they are sensitive to where the file /boot resides on the disk. If you do something that causes boot(8) to be moved to a new place on the disk (actually, a new inode), you will "break" your system, preventing it from booting properly. To fix your boot block so that you can boot normally, just put a boot floppy in your drive (or use a bootable CD-ROM) and at the boot prompt, type "b hd0a:/bsd" to force it to boot from the first hard disk (and not the floppy). Your machine should come up normally. You now need to reinstall the first-stage boot loader (biosboot(8)) based on the position of the /boot file, using the installboot(8) program.

Our example will assume your boot disk is sd0 (but for IDE it would be wd0, etc..):

# cd /usr/mdec; ./installboot /boot biosboot sd0

If a newer version of bootblocks are required, you will need to compile these yourself. To do so simply:

# cd /sys/arch/i386/stand/ # make && make install # cd /usr/mdec; cp ./boot /boot # ./installboot /boot biosboot sd0 (or whatever device your hard disk is)

14.9 - Preparing for disaster: Backing up and Restoring from tape

Introduction:

If you plan on running what might be called a production server, it is advisable to have some form of backup in the event one of your fixed disk drives fails.

This information will assist you in using the standard dump(8)/restore(8) utilities provided with OpenBSD. A more advanced backup utility called "Amanda" is also available through packages for backing up multiple servers to one tape drive. In most environments dump(8)/restore(8) is enough. However, if you have a need to backup multiple machines, Amanda might be worth investigating.

The device examples in this document are for a configuration that uses both SCSI disks and tape. In a production environment, SCSI disks are recommended over IDE due to the way in which they handle bad blocks. That is not to say this information is useless if you are using an IDE disk or other type of tape drive, your device names will simply differ slightly. For example sd0a would be wd0a in an IDE based system.

Backing up to tape:

Backing up to tape requires knowledge of where your file systems are mounted. You can determine how your filesystems are mounted using the mount(8) command at your shell prompt. You should get output similar to this:

# mount /dev/sd0a on / type ffs (local) /dev/sd0h on /usr type ffs (local)

In this example, the root (/) filesystem resides physically on sd0a which indicates SCSI fixed disk 0, partition a. The /usr filesystem resides on sd0h, which indicates SCSI fixed disk 0, partition h.

Another example of a more advanced mount table might be:

# mount /dev/sd0a on / type ffs (local) /dev/sd0d on /var type ffs (local) /dev/sd0e on /home type ffs (local) /dev/sd0h on /usr type ffs (local)

In this more advanced example, the root (/) filesystem resides physically on sd0a. The /var filesystem resides on sd0d, the /home filesystem on sd0e and finally /usr on sd0h.

To backup your machine you will need to feed dump the name of each fixed disk partition. Here is an example of the commands needed to backup the simpler mount table listed above:

# /sbin/dump -0au -f /dev/nrst0 /dev/rsd0a # /sbin/dump -0au -f /dev/nrst0 /dev/rsd0h # mt -f /dev/rst0 rewind

For the more advanced mount table example, you would use something similar to:

# /sbin/dump -0au -f /dev/nrst0 /dev/rsd0a # /sbin/dump -0au -f /dev/nrst0 /dev/rsd0d # /sbin/dump -0au -f /dev/nrst0 /dev/rsd0e # /sbin/dump -0au -f /dev/nrst0 /dev/rsd0h # mt -f /dev/rst0 rewind

You can review the dump(8) man page to learn exactly what each command line switch does. Here is a brief description of the parameters used above:

Finally which partition to backup (/dev/rsd0a, etc)

The mt(1) command is used at the end to rewind the drive. Review the mt man page for more options (such as eject).

If you are unsure of your tape device name, use dmesg to locate it. An example tape drive entry in dmesg might appear similar to:

st0 at scsibus0 targ 5 lun 0: <ARCHIVE, Python 28388-XXX, 5.28>

You may have noticed that when backing up, the tape drive is accessed as device name "nrst0" instead of the "st0" name that is seen in dmesg. When you access st0 as nrst0 you are accessing the same physical tape drive but telling the drive to not rewind at the end of the job and access the device in raw mode. To back up multiple file systems to a single tape, be sure you use the non-rewind device, if you use a rewind device (rst0) to back up multiple file systems, you'll end up overwriting the prior filesystem with the next one dump tries to write to tape. You can find a more elaborate description of various tape drive devices in the dump man page.

If you wanted to write a small script called "backup", it might look something like this:

echo " Starting Full Backup..." /sbin/dump -0au -f /dev/nrst0 /dev/rsd0a /sbin/dump -0au -f /dev/nrst0 /dev/rsd0d /sbin/dump -0au -f /dev/nrst0 /dev/rsd0e /sbin/dump -0au -f /dev/nrst0 /dev/rsd0h echo echo -n " Rewinding Drive, Please wait..." mt -f /dev/rst0 rewind echo "Done." echo

If scheduled nightly backups are desired, cron(8) could be used to launch your backup script automatically.

It will also be helpful to document (on a scrap of paper) how large each file system needs to be. You can use "df -h" to determine how much space each partition is currently using. This will be handy when the drive fails and you need to recreate your partition table on the new drive.

Restoring your data will also help reduce fragmentation. To ensure you get all files, the best way of backing up is rebooting your system in single user mode. File systems do not need to be mounted to be backed up. Don't forget to mount root (/) r/w after rebooting in single user mode or your dump will fail when trying to write out dumpdates. Enter "bsd -s" at the boot> prompt for single user mode.

Viewing the contents of a dump tape:

After you've backed up your file systems for the first time, it would be a good idea to briefly test your tape and be sure the data on it is as you expect it should be.

You can use the following example to review a catalog of files on a dump tape:

# /sbin/restore -tvs 1 -f /dev/rst0

This will cause a list of files that exist on the 1st partition of the dump tape to be listed. Following along from the above examples, 1 would be your root (/) file system.

To see what resides on the 2nd tape partition and send the output to a file, you would use a command similar to:

# /sbin/restore -tvs 2 -f /dev/rst0 > /home/me/list.txt

If you have a mount table like the simple one, 2 would be /usr, if yours is a more advanced mount table 2 might be /var or another fs. The sequence number matches the order in which the file systems are written to tape.

Restoring from tape:

The example scenario listed below would be useful if your fixed drive has failed completely. In the event you want to restore a single file from tape, review the restore man page and pay attention to the interactive mode instructions.

If you have prepared properly, replacing a disk and restoring your data from tape can be a very quick process. The standard OpenBSD install/boot floppy already contains the required restore utility as well as the binaries required to partition and make your new drive bootable. In most cases, this floppy and your most recent dump tape is all you'll need to get back up and running.

After physically replacing the failed disk drive, the basic steps to restore your data are as follows:

14.10 - Mounting disk images in OpenBSD

To mount a disk image (ISO images, disk images created with dd, etc) in OpenBSD you must configure a vnd(4) device. For example, if you have an ISO image located at /tmp/ISO.image, you would take the following steps to mount the image.

# vnconfig svnd0 /tmp/ISO.image # mount -t cd9660 /dev/svnd0c /mnt

Notice that since this is an ISO-9660 image, as used by CDs and DVDs, you must specify type of cd9660 when mounting it. This is true, no matter what type, e.g. you must use type ext2fs when mounting Linux disk images.

To unmount the image use the following commands.

# umount /mnt # vnconfig -u svnd0

For more information, refer to the vnconfig(8) man page.

14.11 - Help! I'm getting errors with IDE DMA!

DMA IDE transfers, supported by pciide(4) are unreliable with many combinations of hardware. Until recently, most "mainstream" operating systems that claimed to support DMA transfers with IDE drives did not ship with that feature active by default due to unreliable hardware. Now many of these same machines are being used for OpenBSD.

OpenBSD is aggressive and attempts to use the highest DMA Mode it can configure. This will cause corruption of data transfers in some configurations because of buggy motherboard chipsets, buggy drives, and/or noise on the cables. Luckily, Ultra-DMA modes protect data transfers with a CRC to detect corruption. When the Ultra-DMA CRC fails, OpenBSD will print an error message and try the operation again.

wd2a: aborted command, interface CRC error reading fsbn 64 of 64-79 (wd2 bn 127; cn 0 tn 2 sn 1), retrying

After failing a couple times, OpenBSD will downgrade to a slower (hopefully more reliable) Ultra-DMA mode. If Ultra-DMA mode 0 is hit, then the drive downgrades to PIO mode.

UDMA errors are often caused by low quality or damaged cables. Cable problems should usually be the first suspect if you get many DMA errors or unexpectedly low DMA performance. It is also a bad idea to put the CD-ROM on the same channel with a hard disk.

If replacing cables does not resolve the problem and OpenBSD does not successfully downgrade, or the process causes your machine to lock hard, or causes excessive messages on the console and in the logs, you may wish to force the system to use a lower level of DMA or UDMA by default. This can be done by using UKC or config(8) to change the flags on the wd(4) device.

14.13 - RAID options for OpenBSD

RAID (Redundant Array of Inexpensive Disks) gives an opportunity to use multiple drives to give better performance, capacity and/or redundancy than one can get out of a single drive alone. While a full discussion of the benefits and risks of RAID are outside the scope of this article, there are a couple points that are important to make here: If this is new information to you, this is not a good starting point for your exploration of RAID.

Software Options

OpenBSD includes RAIDframe, a software RAID solution. Documentation for it can be found in the following places:

The root partition can be directly mirrored by OpenBSD using the "Autoconfiguration" option of RAIDframe.

OpenBSD 3.7-stable and later also includes mirroring as a feature of the ccd(4) driver. This system is built into the GENERIC kernel and is in the bsd.rd kernel of some platforms (amd64, hppa, hppa64, i386), so it can be much easier to use, though it has some limitations regarding rebuilding the array. See:

Hardware Options

Many OpenBSD platforms include support for various hardware RAID products. The options vary by platform, see the appropriate hardware support page (listed here).

Another option available for many platforms is one of the many products which make multiple drives act as a single IDE or SCSI drive, and are then plugged into a standard IDE or SCSI adapter. These devices can work on virtually any hardware platform that supports either SCSI or IDE.

Some manufacturers of these products:

(Note: these are just products that OpenBSD users have reported using -- this is not any kind of endorsement, nor is it an exhaustive list.)

Non-Options

An often asked question on the mail lists is "Are the low-cost IDE or SATA RAID controllers (such as those using Highpoint, Promise or Adaptec HostRAID chips) supported?". The answer is "No". These cards and chips are not true hardware RAID controllers, but rather BIOS-assisted boot of a software RAID. As OpenBSD already supports software RAID in a hardware-independent way, there isn't much desire among the OpenBSD developers to implement special support for these cards.

Almost all on-board SATA or IDE "RAID" controllers are this software-based style, and will typically work fine as a SATA or IDE controller using the standard IDE driver (pciide(4)), but are not going to work as a hardware RAID system on OpenBSD.

14.14 - Why does df(1) tell me I have over 100% of my disk used?

People are sometimes surprised to find they have negative available disk space, or more than 100% of a filesystem in use, as shown by df(1).

When a filesystem is created with newfs(8), some of the available space is held in reserve from normal users. This provides a margin of error when you accidently fill the disk, and helps keep disk fragmentation to a minimum. Default for this is 5% of the disk capacity, so if the root user has been carelessly filling the disk, you may see up to 105% of the available capacity in use.

If the 5% value is not appropriate for you, you can change it with the tunefs(8) command.

14.15 - Recovering partitions after deleting the disklabel

If you have a damaged partition table, there are various things you can attempt to do to recover it.

Firstly, panic. You usually do so anyways, so you might as well get it over with. Just don't do anything stupid. Panic away from your machine. Then relax, and see if the steps below won't help you out.

A copy of the disklabel for each disk is saved in /var/backups as part of the daily system maintenance. Assuming you still have the var partition, you can simply read the output, and put it back into disklabel.

In the event that you can no longer see that partition, there are two options. Fix enough of the disc so you can see it, or fix enough of the disc so that you can get your data off. Depending on what happened, one or other of those may be preferable (with dying discs you want the data first, with sloppy fingers you can just have the label)

The first tool you need is scan_ffs(8) (note the underscore, it isn't called "scanffs"). scan_ffs(8) will look through a disc, and try and find partitions and also tell you what information it finds about them. You can use this information to recreate the disklabel. If you just want /var back, you can recreate the partition for /var, and then recover the backed up label and add the rest from that.

disklabel(8) will update both the kernel's understanding of the disklabel, and then attempt to write the label to disk. Therefore, even if area of the disk containing the disklabel is unreadable, you will be able to mount(8) it until the next reboot.

14.16 - Can I access data on filesystems other than FFS?

Yes. Other supported filesystems include: ext2 (Linux), ISO9660 and UDF (CD-ROM, DVD media), FAT (MS-DOS and Windows), NFS, NTFS (Windows), AmigaDOS. Some of them have limited, for instance read-only, support. Note that FreeBSD's UFS2 filesystem is not supported.

We will give a general overview on how to use one of these filesystems under OpenBSD. To be able to use a filesystem, it must be mounted. For details and mount options, please consult the mount(8) manual page, and that of the mount command for the filesystem you will be mounting, e.g. mount_msdos, mount_ext2fs, ...

First, you must know on which device your filesystem is located. This can be simply your first hard disk, wd0 or sd0, but it may be less obvious. All recognized and configured devices on your system are mentioned in the output of the dmesg(1) command: a device name, followed by a one-line description of the device. For example, my first CD-ROM drive is recognized as follows:

cd0 at scsibus0 targ 0 lun 0: <COMPAQ, DVD-ROM LTD163, GQH3> SCSI0 5/cdrom removable

For a much shorter list of available disks, you can use sysctl(8). The command

# sysctl hw.disknames
will show all disks currently known to your system, for example:
hw.disknames=cd0,cd1,wd0,fd0,cd2

At this point, it is time to find out which partitions are on the device, and in which partition the desired filesystem resides. Therefore, we examine the device using disklabel(8). The disklabel contains a list of partitions, with a maximum number of 16. Partition c always indicates the entire device. Partitions a-b and d-p are used by OpenBSD. Partitions i-p may be automatically allocated to file systems of other operating systems. In this case, I'll be viewing the disklabel of my hard disk, which contains a number of different filesystems.

NOTE: OpenBSD was installed after the other operating systems on this system, and during the install a disklabel containing partitions for the native as well as the foreign filesystems was installed on the disk. However, if you install foreign filesystems after the OpenBSD disklabel was already installed on the disk, you need to add or modify them manually afterwards. This will be explained in this subsection.

# disklabel wd0 # using MBR partition 2: type A6 off 20338290 (0x1365672) size 29318625 (0x1bf5de1) # /dev/rwd0c: type: ESDI disk: ESDI/IDE disk label: ST340016A flags: bytes/sector: 512 sectors/track: 63 tracks/cylinder: 16 sectors/cylinder: 1008 cylinders: 16383 total sectors: 78165360 rpm: 3600 interleave: 1 trackskew: 0 cylinderskew: 0 headswitch: 0 # microseconds track-to-track seek: 0 # microseconds drivedata: 0 16 partitions: # size offset fstype [fsize bsize cpg] a: 408366 20338290 4.2BSD 2048 16384 16 # Cyl 20176*- 20581 b: 1638000 20746656 swap # Cyl 20582 - 22206 c: 78165360 0 unused 0 0 # Cyl 0 - 77544 d: 4194288 22384656 4.2BSD 2048 16384 16 # Cyl 22207 - 26367 e: 409248 26578944 4.2BSD 2048 16384 16 # Cyl 26368 - 26773 f: 10486224 26988192 4.2BSD 2048 16384 16 # Cyl 26774 - 37176 g: 12182499 37474416 4.2BSD 2048 16384 16 # Cyl 37177 - 49262* i: 64197 63 unknown # Cyl 0*- 63* j: 20274030 64260 unknown # Cyl 63*- 20176* k: 1975932 49656978 MSDOS # Cyl 49262*- 51223* l: 3919797 51632973 unknown # Cyl 51223*- 55111* m: 2939832 55552833 ext2fs # Cyl 55111*- 58028* n: 5879727 58492728 ext2fs # Cyl 58028*- 63861* o: 13783707 64372518 ext2fs # Cyl 63861*- 77535*

As can be seen in the above output, the OpenBSD partitions are listed first. Next to them are a number of ext2 partitions and one MSDOS partition, as well as a few 'unknown' partitions. On i386 and amd64 systems, you can usually find out more about those using the fdisk(8) utility. For the curious reader: partition i is a maintenance partition created by the vendor, partition j is a NTFS partition and partition l is a Linux swap partition.

Once you have determined which partition it is you want to use, you can move to the final step: mounting the filesystem contained in it. Most filesystems are supported in the GENERIC kernel: just have a look at the kernel configuration file, located in the /usr/src/sys/arch/<arch>/conf directory. However, some are not, e.g. the NTFS support is experimental and therefore not included in GENERIC. If you want to use one of the filesystems not supported in GENERIC, you will need to build a custom kernel.

When you have gathered the information needed as mentioned above, it is time to mount the filesystem. Let's assume a directory /mnt/otherfs exists, which we will use as a mount point where we will mount the desired filesystem. In this example, we will mount the ext2 filesystem in partition m:

# mount -t ext2fs /dev/wd0m /mnt/otherfs

If you plan to use this filesystem regularly, you may save yourself some time by inserting a line for it in /etc/fstab, for example something like:

/dev/wd0m /mnt/otherfs ext2fs rw,noauto,nodev,nosuid 0 0
Notice the 0 values in the fifth and sixth field. This means we do not require the filesystem to be dumped, and checked using fsck. Generally, those are things you want to have handled by the native operating system associated with the filesystem.

14.16.1 - The partitions are not in my disklabel! What should I do?

If you install foreign filesystems on your system (often the result of adding a new operating system) after you have already installed OpenBSD, a disklabel will already be present, and it will not be updated automatically to contain the new foreign filesystem partitions. If you wish to use them, you need to add or modify these partitions manually using disklabel(8).

As an example, I have modified one of my existing ext2 partitions: using Linux's fdisk program, I've reduced the size of the 'o' partition (see disklabel output above) to 1G. We will be able to recognize it easily by its starting position (offset: 64372518) and size (13783707). Note that these values are sector numbers, and that using sector numbers (not megabytes or any other measure) is the most exact and safest way of reading this information.

Before the change, the partition looked like this using OpenBSD's fdisk(8) utility (leaving only relevant output):

# fdisk wd0 . . . Offset: 64372455 Signature: 0xAA55 Starting Ending LBA Info: #: id C H S - C H S [ start: size ] ------------------------------------------------------------------------ 0: 83 4007 1 1 - 4864 254 63 [ 64372518: 13783707 ] Linux files* . . .
As you can see, the starting position and size are exactly those reported by disklabel(8) earlier. (Dont' be confused by the value indicated by "Offset": it is referring to the starting position of the extended partition in which the ext2 partition is contained.)

After changing the partition's size from Linux, it looks like this:

# fdisk wd0 . . . Offset: 64372455 Signature: 0xAA55 Starting Ending LBA Info: #: id C H S - C H S [ start: size ] ------------------------------------------------------------------------ 0: 83 4007 1 1 - 4137 254 63 [ 64372518: 2104452 ] Linux files* . . .
Now this needs to be changed using disklabel(8). For instance, you can issue disklabel -e wd0, which will invoke an editor specified by the EDITOR environment variable (default is vi). Within the editor, change the last line of the disklabel to match the new size:
o: 2104452 64372518 ext2fs
Save the disklabel to disk when finished. Now that the disklabel is up to date again, you should be able to mount your partitions as described above.

You can follow a very similar procedure to add new partitions.

14.17 - Can I use a flash memory device with OpenBSD?

Normally, the memory device should be recognized upon plugging it into a port of your machine. Shortly after inserting it, a number of messages are written to the console by the kernel. For instance, when I plug in my USB flash memory device, I see the following on my console:
umass0 at uhub1 port 1 configuration 1 interface 0 umass0: LEXR PLUG DRIVE LEXR PLUG DRIVE, rev 1.10/0.01, addr 2 umass0: using SCSI over Bulk-Only scsibus2 at umass0: 2 targets sd0 at scsibus2 targ 1 lun 0: <LEXAR, DIGITAL FILM, /W1.> SCSI2 0/direct removable sd0: 123MB, 123 cyl, 64 head, 32 sec, 512 bytes/sec, 251904 sec total
These lines indicate that the umass(4) (USB mass storage) driver has been attached to the memory device, and that it is using the SCSI system. The last two lines are the most important ones: they are saying to which device node the memory device has been attached, and what the total amount of storage space is. If you somehow missed these lines, you can still see them afterwards with the dmesg(1) command. The reported CHS geometry is a rather fictitious one, as the flash memory is being treated like any regular SCSI disk.

We will discuss two scenarios below.

The device is new/empty and you want to use it with OpenBSD only

You will need to initialize a disklabel onto the device, and create at least one partition. Please read Using OpenBSD's disklabel and the disklabel(8) manual page for details about this.

In this example I created just one partition a in which I will place a FFS filesystem:

# newfs sd0a Warning: inode blocks/cyl group (125) >= data blocks (62) in last cylinder group. This implies 1984 sector(s) cannot be allocated. /dev/rsd0a: 249856 sectors in 122 cylinders of 64 tracks, 32 sectors 122.0MB in 1 cyl groups (122 c/g, 122.00MB/g, 15488 i/g) super-block backups (for fsck -b #) at: 32,
Let's mount the filesystem we created in the a partition on /mnt/flashmem. Create the mount point first if it does not exist.
# mkdir /mnt/flashmem # mount /dev/sd0a /mnt/flashmem

You received the memory device from someone with whom you want to exchange data

There is a considerable chance the other person is not using OpenBSD, so there may be a foreign filesystem on the memory device. Therefore, we will first need to find out which partitions are on the device, as described in FAQ 14 - Foreign Filesystems.

# disklabel sd0 # /dev/rsd0c: type: SCSI disk: SCSI disk label: DIGITAL FILM flags: bytes/sector: 512 sectors/track: 32 tracks/cylinder: 64 sectors/cylinder: 2048 cylinders: 123 total sectors: 251904 rpm: 3600 interleave: 1 trackskew: 0 cylinderskew: 0 headswitch: 0 # microseconds track-to-track seek: 0 # microseconds drivedata: 0 16 partitions: # size offset fstype [fsize bsize cpg] c: 251904 0 unused 0 0 # Cyl 0 - 122 i: 250592 32 MSDOS # Cyl 0*- 122*
As can be seen in the disklabel output above, there is only one partition i, containing a FAT filesystem created on a Windows machine. As usual, the c partition indicates the entire device.

Let's now mount the filesystem in the i partition on /mnt/flashmem.

# mount -t msdos /dev/sd0i /mnt/flashmem
Now you can start using it just like any other disk.

WARNING: You should always unmount the filesystem before unplugging the memory device. If you don't, the filesystem may be left in an inconsistent state, which may result in data corruption.

Upon detaching the memory device from your machine, you will again see the kernel write messages about this to the console:

umass0: at uhub1 port 1 (addr 2) disconnected sd0 detached scsibus2 detached umass0 detached

14.18 - Optimizing disk performance

Disk performance is a significant factor in the overall speed of your computer. It becomes increasingly important when your computer is hosting a multi-user environment (users of all kinds, from those who log-in interactively to those who see you as a file-server or a web-server). Data storage constantly needs attention, especially when your partitions run out of space or when your disks fail. OpenBSD has several options to increase the speed of your disk operations and provide fault tolerance.

14.18.1 - CCD

The first option is the use of ccd(4), the Concatenated Disk Driver. This allows you to join several partitions into one virtual disk (and thus, you can make several disks look like one disk). This concept is similar to that of LVM (logical volume management), which is found in many commercial Unix flavors.

If you are running GENERIC, ccd is already enabled (in /usr/src/sys/conf/GENERIC). If you have customized your kernel, you may need to return it to your kernel configuration. Either way, a line such as this should be in your configuration file:

pseudo-device ccd 4 # concatenated disk devices

The above example gives you up to 4 ccd devices (virtual disks). Now you need to figure out which partitions on your real disks you want to dedicate to ccd. Use disklabel to mark these partitions as type 'ccd'. On some architectures, disklabel may not allow you to do this. In this case, mark them as 'ffs'.

If you are using ccd to gain performance by striping, note that you will not get optimum performance unless you use the same model of disks with the same disklabel settings.

Edit /etc/ccd.conf to look something like this: (for more information on configuring ccd, look at ccdconfig(8))

# Configuration file for concatenated disk devices # # ccd ileave flags component devices ccd0 16 none /dev/sd2e /dev/sd3e

To make your changes take effect, run

# ccdconfig -C

As long as /etc/ccd.conf exists, ccd will automatically configure itself upon boot. Now, you have a new disk, ccd0, a combination of /dev/sd2e and /dev/sd3e. Just use disklabel on it like you normally would to make the partition or partitions you want to use. Again, don't use the 'c' partition as an actual partition that you put stuff on. Make sure your usable partitions are at least one cylinder off from the beginning of the disk.

14.18.2 - RAID

Another solution is raid(4), which will have you use raidctl(8) to control your raid devices. OpenBSD's RAID is based upon Greg Oster's NetBSD port of the CMU RAIDframe software. OpenBSD has support for RAID levels of 0, 1, 4, and 5.

With raid, as with ccd, support must be in the KERNEL. Unlike ccd, support for RAID is not found in GENERIC, so it must be compiled into your kernel (RAID support adds some 500K to the size of an i386 kernel).

pseudo-device raid 4 # RAIDframe disk device

Read the raid(4) and raidctl(8) man pages to get full details. There are many options and possible configurations available, and a detailed explanation is beyond the scope of this document.

14.18.3 - Soft updates

Another tool that can be used to speed up your system is softupdates. One of the slowest operations in the traditional BSD file system is updating metainfo (which happens, among other times, when you create or delete files and directories). Softupdates attempts to update metainfo in RAM instead of writing to the hard disk each and every single metainfo update. Another effect of this is that the metainfo on disk should always be complete, although not always up to date. So, a system crash should not require fsck(8) upon boot up, but simply a background version of fsck that makes changes to the metainfo in RAM (a la softupdates). This means rebooting a server is much faster, as you don't have to wait for fsck! (OpenBSD does not have this feature yet.) You can read more about softupdates in the Softupdates FAQ entry.

14.18.4 - Size of the namei() cache

The name-to-inode translation (a.k.a., namei()) cache controls the speed of pathname to inode(5) translation. A reasonable way to derive a value for the cache, should a large number of namei() cache misses be noticed with a tool such as systat(1), is to examine the system's current computed value with sysctl(8), (which calls this parameter "kern.maxvnodes") and to increase this value until either the namei() cache hit rate improves or it is determined that the system does not benefit substantially from an increase in the size of the namei() cache. After the value has been determined, you can set it at system startup time with sysctl.conf(5).

14.19 - Why aren't we using async mounts?

Question: "I simply do "mount -u -o async /" which makes one package I use (which insists on touching a few hundred things from time to time) usable. Why is async mounting frowned upon and not on by default (as it is in some other unixen)? Isn't it a much simpler, and therefore, a safer way of improving performance in some applications?"

Answer: "Async mounts are indeed faster than sync mounts, but they are also less safe. What happens in case of a power failure? Or a hardware problem? The quest for speed should not sacrifice the reliability and the stability of the system. Check the man page for mount(8)." async All I/O to the file system should be done asynchronously. This is a dangerous flag to set since it does not guaran- tee to keep a consistent file system structure on the disk. You should not use this flag unless you are pre- pared to recreate the file system should your system crash. The most common use of this flag is to speed up restore(8) where it can give a factor of two speed in- crease.

On the other hand, when you are dealing with temp data that you can recreate from scratch after a crash, you can gain speed by using a separate partition for that data only, mounted async. Again, do this only if you don't mind the loss of all the data in the partition when something goes wrong. For this reason, mfs(8) partitions are mounted asynchronously, as they will get wiped and recreated on a reboot anyway.

[FAQ Index] [To Section 13 - Multimedia] [To Section 15 - Packages and Ports]


[back] www@openbsd.org
$OpenBSD: faq14.html,v 1.167 2007/11/01 02:11:01 nick Exp $