Debian bootable/partitionable RAID1 MICRO-HOWTO.





1. Intro

With current 2.6 kernels, it's possible to create a RAID1 mirror device of 2 complete disks, and to partition that RAID device.

At Cistron, we are using SuperMicro 1U servers with 2x80 GB SATA disks. This MICRO-HOWTO describes the setup we use at Cistron; if your equipment is different, mentally substitute your own setup when reading this MICRO-HOWTO.

Requirements:

- A recent 2.6 kernel with RAID1 and tmpfs support compiled in.
- mdadm.
- sysvinit_2.85-12 / initscripts_2.85-12 (or later).
- The mdp-makedev init script.
- A Knoppix CD (or a similar rescue/live CD).

The reason you need the sysvinit_2.85-12 / initscripts_2.85-12 packages is that the checkroot script in that version (and later versions) makes sure that the right device is checked at boot. Since the major number of the partitionable RAID device is dynamic, it is not known in advance, so you can't just create /dev/md/d0p1 and use it: the device node might be incorrect. The checkroot script checks the validity of that device node and, if it isn't valid, creates a temporary device in /dev/shm/root to run fsck on. The mdp-makedev script will create the correct device nodes later on, once the root file system has been checked and remounted read-write.
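
For illustration, the logic boils down to something like the following sketch (hypothetical and simplified, NOT the actual checkroot code; it only shows the idea, using the mountpoint utility that ships with sysvinit):

  #! /bin/sh
  # The device actually mounted on / right now, as "major:minor".
  # The mdp major number is dynamic, so it's only known at boot time.
  rootdev=$(mountpoint -d /)
  major=${rootdev%:*}
  minor=${rootdev#*:}

  # If the node from /etc/fstab is missing or has the wrong numbers,
  # create a temporary node on tmpfs (hence the tmpfs requirement)
  # and run fsck on that instead.
  mknod /dev/shm/root b $major $minor
  fsck -a /dev/shm/root
  rm -f /dev/shm/root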

2. Configuration.

With a SuperMicro 1U server with the Intel ICH5 chipset and Phoenix/AwardBIOS you need to change, in the menu "Advanced" -> "Advanced Chipset Control", the setting of "On-Chip Serial ATA" to "Combined Mode" to be able to install Debian.

Now install Debian onto the first disk. You can install woody (stable) and then upgrade to unstable, or you can try the latest debian-installer to install sid/sarge (testing/unstable) directly. When partitioning the disk, make sure that at the end of the disk you leave the last, say, 1MB of space unpartitioned (that's where the RAID superblock goes).
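
To check that the space is really there, you can compare the end of the last partition with the total size of the disk, for example (sizes are in 512-byte sectors):

  # blockdev --getsize /dev/sda
  # fdisk -l -u /dev/sda

The last partition should end a couple of thousand sectors (roughly 1MB) before the end of the disk.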

After this you need to download, unpack, configure, compile and install a recent 2.6 kernel. You need to make sure that RAID1 is compiled into the kernel, not as a module. Also make sure tmpfs/shmfs support is compiled in.
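
In the kernel .config this comes down to roughly the following (exact symbol names may differ slightly between 2.6 versions):

  CONFIG_MD=y
  CONFIG_BLK_DEV_MD=y
  CONFIG_MD_RAID1=y
  CONFIG_TMPFS=y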

At Cistron, on machines with SATA disks we prefer to use the LIBATA SCSI/SATA driver layer. This means enabling the ATA option in the SCSI driver menu and selecting the right driver. Then you need to edit /etc/fstab to change hda* to sda*, edit /etc/lilo.conf to set root to /dev/sda* instead of /dev/hda* (don't change boot= yet) and reboot. Now you should be running from /dev/sda*. Finally edit /etc/lilo.conf to adjust boot= as well now that you're running from sda*.
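
As an example of that first pass (assuming a single ext3 root partition; adjust to your own layout):

  # /etc/fstab: change the device name, keep the rest of the line
  /dev/sda1   /    ext3    defaults,errors=remount-ro   0   1

  # /etc/lilo.conf: change root=, leave boot=/dev/hda alone for now
  root=/dev/sda1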

When using the SuperMicro with the Intel ICH5 chipset you need to go into the BIOS menu "Advanced" -> "Advanced Chipset Control" and set the "On Chip Serial ATA" option to "Enhanced Mode" or the kernel might not see the second SATA disk if the first SATA disk is missing or defective.

Now it's time to install the mdp-makedev init script into /etc/init.d. Make sure it's executable and install the needed symlinks using update-rc.d:

  # update-rc.d mdp-makedev start 26 S .

It's a good idea to reboot with this setup once to see if things are working up to this point. After a reboot you should see a few devices appear in the /dev/md directory, at the very least /dev/md/d0.
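
The mdp-makedev script itself is not reproduced here, but the idea behind it is simple enough to sketch (hypothetical, illustrative only): look up the dynamic major number that the kernel allocated for partitionable md ("mdp") in /proc/devices and create matching device nodes.

  #! /bin/sh
  # Hypothetical sketch of an mdp-makedev style script.
  major=$(awk '$2 == "mdp" { print $1 }' /proc/devices)
  [ -n "$major" ] || exit 0

  # Whole-disk device plus a handful of partition nodes.
  mkdir -p /dev/md
  mknod /dev/md/d0 b $major 0 2>/dev/null
  i=1
  while [ $i -le 15 ]
  do
      mknod /dev/md/d0p$i b $major $i 2>/dev/null
      i=$(($i + 1))
  done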

Now it's time to change the LILO config:

# /etc/lilo.conf

lba32
 
boot=/dev/sda
root=/dev/sda1

image=/boot/vmlinuz-2.6.5
        label=2.6.5
        append="md=d0,/dev/sda,/dev/sdb root=/dev/md_d0p1"

What we're doing here is setting up the new RAID1 array explicitly from the kernel command line, and telling the kernel that the root filesystem is the first partition on that RAID1 array. We're NOT using the standard LILO "root=" way of setting the root partition, because LILO doesn't understand the RAID device. The root= setting on the kernel command line takes precedence over the value LILO passes.

It's probably a good idea to add an extra image= section without the RAID1 settings, so you can fall back on that; see the example below. Now run "lilo".
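
For example, a fallback stanza could look like this; it has no append line (so no md= setting) and its own root= line, so it boots straight from the first disk:

image=/boot/vmlinuz-2.6.5
        label=fallback
        root=/dev/sda1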

Now you need to edit /etc/fstab and replace the entries like /dev/sda1, /dev/sda2 etc with their MD equivalents: /dev/md/d0p1, /dev/md/d0p2.
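
For example (assuming the root filesystem on the first partition and swap on the second; adjust to your own layout):

  /dev/md/d0p1    /       ext3    defaults,errors=remount-ro   0   1
  /dev/md/d0p2    none    swap    sw                           0   0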

Now you need to boot from the Knoppix CD. This is necessary because the RAID superblock has to be written to /dev/sda, which is in use while the system is running from that disk.

Booted from the Knoppix CD, we activate RAID1 using the mdadm command:

  # mdadm --zero-superblock /dev/sdb
  # mdadm --create /dev/md_d0 --auto=mdp --level=1 --raid-devices=2 /dev/sda missing

Remove the Knoppix CD, reboot, and if all is well, you'll be running from the RAID device instead of just the first disk.

Now we need to adjust the LILO configuration once more to be able to run LILO on the RAID array instead of one of the underlying devices.

Because LILO thinks it knows how to handle RAID devices, while in fact it doesn't handle this setup, we need to trick LILO. First we create a /dev/Md -> /dev/md symlink:

   # ln -sf /dev/md /dev/Md

Now you need to run fdisk to see at what sector on the physical disk the first partition starts (x -> expert mode, p -> print, q -> quit):

  # fdisk /dev/sda
Command (m for help): x
 
Expert command (m for help): p
 
Disk /dev/sda: 255 heads, 63 sectors, 9964 cylinders
 
Nr AF  Hd Sec  Cyl  Hd Sec  Cyl     Start      Size ID
 1 80   1   1    0 254  63  123         63    1991997 83
[...]
Expert command (m for help): q

The number under "Start" is the starting sector of the first partition.

Now edit /etc/lilo.conf:

lba32
 
disk=/dev/Md/d0
  bios=0x80
  sectors=63
  heads=255
  cylinders=1024
  partition=/dev/md/d0p1
    start=63
boot=/dev/Md/d0
root=/dev/md/d0p1
 
#boot=/dev/sda
#root=/dev/sda1
 
We're explicitly telling LILO about the layout of the disk type /dev/Md/d0, which is unknown to LILO. The sectors/heads/cylinders values don't really matter and can be kept as above; the important thing is the "start" value under "partition=". That's the number we got from fdisk.

Note that we're still using the root=/dev/md_d0p1 value on the kernel command line. LILO now understands root=/dev/md/d0p1, but the kernel doesn't understand the "/dev/md/d0p1" format; right now it only understands /dev/md_d0p1.

Now run lilo and this will install the LILO bootblock, maps etc on the RAID device. Reboot to make sure it all works.

Everything should be OK right now, except that the second (/dev/sdb) disk is not active in the RAID array yet. We can activate it using the mdadm command:

  # mdadm /dev/md/d0 --add /dev/sdb

Running cat /proc/mdstat will show you that the RAID1 array is being synced - making /dev/sdb a member of the array. With 80 GB disks, it will take about half an hour.
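
You can also follow the state and progress of the rebuild with mdadm itself, for example:

  # mdadm --detail /dev/md/d0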


Version 20040414 - © 2004 Cistron Broadband B.V. - Miquel van Smoorenburg