Adding a new drive via the serial console using dump/restore
It was just over three weeks ago that "Serial Consoles can be very useful" hit the airwaves. Now it is time to
use this valuable asset in battle. As I write this, I am in Ottawa.
There is a machine in Wellington which contains a dying disk.
Some members of the NZ FreeBSD User Group (NZFUG) have donated a new SCSI disk for the cvsup server and have
installed it for me. Unfortunately, the box does not come back as the old drive is failing the fsck…
Connecting to the serial console
Luckily, this box has a serial console. That means I can access it from my other working box in New Zealand. These
two boxes sit beside each other in the rack. They have two serial cables running from one to the other. On the
working box, I did this [NOTE: after getting connected with tip, you may have to press ENTER to see the prompt]:
# tip com1
connected
# mount
/dev/ad0s1a on / (ufs, local, read-only)
From here I have complete control over the other system. I can reboot, examine the output of dmesg, reconfigure the
system, and get it back up and running.
The disk is there… so what’s the problem?
I can confirm that the new disk is being recognized by the system:
Mounting root from ufs:/dev/ad0s1a
da1 at aha1 bus 0 target 3 lun 0
da1: <COMPAQ ST19171W 9A10> Fixed Direct Access SCSI-2 device
da1: 10.000MB/s transfers (10.000MHz, offset 8)
da1: 8678MB (17773500 512 byte sectors: 64H 32S/T 8678C)
da0 at aha1 bus 0 target 0 lun 0
da0: <SEAGATE ST32550N 0019> Fixed Direct Access SCSI-2 device
da0: 10.000MB/s transfers (10.000MHz, offset 8)
da0: 2047MB (4194058 512 byte sectors: 64H 32S/T 2047C)
da1 is the new drive, so, yes, it’s there… waiting.
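To pick the disks out of a long boot log, the probe messages can be filtered down to one line per drive. This is a minimal sketch, assuming the messages look like the ones above; the function name list_da_disks is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list SCSI direct-access disks and their sizes
# from dmesg-style text supplied on standard input.
list_da_disks() {
    # Match size lines such as "da1: 8678MB (17773500 512 byte sectors: ...)"
    # and skip the transfer-rate lines ("da1: 10.000MB/s transfers ...").
    awk '/^da[0-9]+: [0-9]+MB / { sub(":", "", $1); print $1, $2 }'
}

# Usage on the live system (assumes dmesg still holds the probe messages):
#   dmesg | list_da_disks
```

Fed the probe lines shown above, this prints one line for da1 and one for da0, each with its size in MB.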
I decided I would reboot the box so I (and you!) could see what happens during the boot process. Then I might be able
to more closely identify the problem.
# reboot
Waiting (max 60 seconds) for system process `vnlru' to stop...stopped
Waiting (max 60 seconds) for system process `bufdaemon' to stop...stopped
Waiting (max 60 seconds) for system process `syncer' to stop...stopped
syncing disks...
done
Uptime: 7h39m15s
Rebooting...
Console: serial port
BIOS drive A: is disk0
BIOS drive C: is disk1
BIOS drive D: is disk2
BIOS drive E: is disk3
BIOS 639kB/64512kB available memory
FreeBSD/i386 bootstrap loader, Revision 0.8
(storm@cvsup.example.org, Fri Aug 2 20:21:34 NZST 2002)
Loading /boot/defaults/loader.conf
/kernel text=0x2e3775 data=0x3b4a0+0x345f4 syms=[0x4+0x3e450+0x4+0x44e15]
-
Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [kernel]...
Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 4.6-STABLE #0: Tue Aug 13 13:31:45 NZST 2002
storm@cvsup.example.org:/usr/home/ncvs/usr-obj/usr/src/sys/ZEKE
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium/P54C (132.96-MHz 586-class CPU)
Origin = "GenuineIntel" Id = 0x52c Stepping = 12
Features=0x1bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8>
real memory = 67108864 (65536K bytes)
avail memory = 60407808 (58992K bytes)
Preloaded elf kernel "kernel" at 0xc04d9000.
Intel Pentium detected, installing workaround for F00F bug
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
isab0: <Intel 82371SB PCI to ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX3 ATA controller> port 0xf000-0xf00f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
fxp0: <Intel Pro 10/100B/100+ Ethernet> port 0x6100-0x611f mem 0xe1000000-0xe10fffff,0xe1100000-0xe1100fff irq 15 at device 18.0 on pci0
fxp0: Ethernet address 00:a0:c9:16:d5:60, 10Mbps
pci0: <Cirrus Logic GD5434 SVGA controller> at 19.0
orm0: <Option ROMs> at iomem 0xc0000-0xc7fff,0xc8000-0xcbfff on isa0
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x30 on isa0
sio0: type 16550A, console
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/15 bytes threshold
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
aha1: <Adaptec 1542/aha-1535> at port 0x330-0x333 iomem 0xcc000-0xcffff irq 10 drq 5 on isa0
aha1: AHA-1542CP FW Rev. F.0 (ID=46) SCSI Host Adapter, SCSI ID 7, 16 CCBs
IPsec: Initialized Security Association Processing.
ad0: 1221MB <Seagate Technology 1275MB - ST31276A> [2482/16/63] at ata0-master WDMA2
Waiting 15 seconds for SCSI devices to settle
Mounting root from ufs:/dev/ad0s1a
da1 at aha1 bus 0 target 3 lun 0
da1: <COMPAQ ST19171W 9A10> Fixed Direct Access SCSI-2 device
da1: 10.000MB/s transfers (10.000MHz, offset 8)
da1: 8678MB (17773500 512 byte sectors: 64H 32S/T 8678C)
da0 at aha1 bus 0 target 0 lun 0
da0: <SEAGATE ST32550N 0019> Fixed Direct Access SCSI-2 device
da0: 10.000MB/s transfers (10.000MHz, offset 8)
da0: 2047MB (4194058 512 byte sectors: 64H 32S/T 2047C)
swapon: adding /dev/ad0s1b as swap device
Automatic boot in progress...
/dev/ad0s1a: FILESYSTEM CLEAN; SKIPPING CHECKS
/dev/ad0s1a: clean, 13022 free (358 frags, 1583 blocks, 0.7% fragmentation)
/dev/ad0s1f: FILESYSTEM CLEAN; SKIPPING CHECKS
/dev/ad0s1f: clean, 177535 free (9511 frags, 21003 blocks, 0.9% fragmentation)
/dev/ad0s1e: FILESYSTEM CLEAN; SKIPPING CHECKS
/dev/ad0s1e: clean, 9187 free (259 frags, 1116 blocks, 1.3% fragmentation)
/dev/da0s1e: DIRECTORY CORRUPTED I=240080 OWNER=cvsupin MODE=40775
/dev/da0s1e: SIZE=512 MTIME=Oct 8 06:18 2002
/dev/da0s1e: DIR=?
/dev/da0s1e: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
THE FOLLOWING FILE SYSTEM HAD AN UNEXPECTED INCONSISTENCY:
/dev/da0s1e (/home/ncvs)
Automatic file system check failed . . . help!
Enter full pathname of shell or RETURN for /bin/sh:
#
Ahh, the problem is with /dev/da0s1e, specifically /home/ncvs.
The solution is right there. Run fsck.
I happen to know that this part of the disk is not critical to the basic running of the system. In fact,
/home/ncvs is just the cvs portion of the system, and its data can be easily
recovered via a cvsup.
The fsck
Here’s the fsck which I ran:
# fsck -y /dev/da0s1e
** /dev/da0s1e
** Last Mounted on /usr/home/ncvs
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
DIRECTORY CORRUPTED I=240080 OWNER=cvsupin MODE=40775
SIZE=512 MTIME=Oct 8 06:18 2002
DIR=?
SALVAGE? yes
MISSING '.' I=240080 OWNER=cvsupin MODE=40775
SIZE=512 MTIME=Oct 8 06:18 2002
DIR=?
FIX? yes
MISSING '..' I=240080 OWNER=cvsupin MODE=40775
SIZE=512 MTIME=Oct 8 06:18 2002
DIR=/FreeBSD/src/libexec/bootpd/tools/bootpef
FIX? yes
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
UNREF FILE I=239745 OWNER=cvsupin MODE=100444
SIZE=3982 MTIME=Oct 7 07:32 2002
RECONNECT? yes
NO lost+found DIRECTORY
CREATE? yes
UNREF FILE I=239746 OWNER=cvsupin MODE=100444
SIZE=5762 MTIME=Oct 7 07:32 2002
RECONNECT? yes
UNREF FILE I=239747 OWNER=cvsupin MODE=100444
SIZE=11813 MTIME=Oct 7 07:32 2002
RECONNECT? yes
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? yes
SUMMARY INFORMATION BAD
SALVAGE? yes
159791 files, 1801866 used, 230056 free (11080 frags, 27372 blocks, 0.5% fragmentation)
***** FILE SYSTEM MARKED CLEAN *****
***** FILE SYSTEM WAS MODIFIED *****
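Answering yes to every question was acceptable here because the data is only a mirror. On a filesystem that matters more, a read-only pass with -n first shows the damage without writing to the disk. The following is a minimal sketch of that idea, not what was run above; the helper name check_then_repair and the FSCK override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: preview filesystem damage before allowing repairs.
# FSCK may be overridden (an assumption of this sketch) to use another binary.
FSCK=${FSCK:-fsck}

check_then_repair() {
    dev=$1
    # -n answers "no" to every question, so nothing is written to the disk.
    if "$FSCK" -n "$dev"; then
        echo "$dev: clean, nothing to repair"
        return 0
    fi
    echo "$dev: problems found, rerunning with -y"
    # -y answers "yes" to every question, as in the run shown above.
    "$FSCK" -y "$dev"
}

# Usage (as root, with the filesystem unmounted):
#   check_then_repair /dev/da0s1e
```

In practice it may be wiser to stop after the -n pass and read its output before deciding whether -y is safe.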
Then I did another reboot, just to ensure that everything was fine… and because I thought it was so amazing
to be able to watch a remote machine reboot!
This time the system came back properly, with all disks mounted, and I was presented with a login prompt.
Then I ssh’d into the box and started using that for the next step.
Adding the new disk to the system
The FreeBSD Handbook has a good section on adding a new drive. I followed their instructions and ran
/stand/sysinstall to configure the new disk. Then I modified /etc/fstab to add the new drive into
the list of automatically mounted disks. NOTE: after following those instructions, your new disk will
be mounted. You can view the details like this:
$ mount
/dev/ad0s1a on / (ufs, local)
/dev/ad0s1f on /usr (ufs, local)
/dev/ad0s1e on /var (ufs, local)
procfs on /proc (procfs, local)
/dev/da0s1e on /usr/home/ncvs (ufs, local)
/dev/da1s1e on /home/bigdisk (ufs, local)
My new disk is the last line, /dev/da1s1e. Then I added a line to /etc/fstab:
# Device        Mountpoint      FStype  Options         Dump    Pass#
/dev/ad0s1b     none            swap    sw              0       0
/dev/ad0s1a     /               ufs     rw              1       1
/dev/ad0s1f     /usr            ufs     rw              2       2
/dev/ad0s1e     /var            ufs     rw              2       2
/dev/acd0c      /cdrom          cd9660  ro,noauto       0       0
proc            /proc           procfs  rw              0       0
/dev/da0s1e     /home/ncvs      ufs     rw              2       2
/dev/da1s1e     /home/bigdisk   ufs     rw              2       2
I again rebooted the box. I’m slightly anal that way. It is definitely not compulsory to reboot the box
under such circumstances. However, by doing so, I will test that my changes to /etc/fstab are correct,
thereby ensuring that a future reboot [long after I’ve forgotten about this incident]
will mount the new drive.
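A quick sanity check on the file itself can catch a malformed entry before any reboot. This is a minimal sketch, assuming the six-column layout used in the fstab above; the function name check_fstab_fields is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report any fstab entry that does not have the six
# fields (device, mount point, type, options, dump, pass) used above.
check_fstab_fields() {
    awk 'NF > 0 && $1 !~ /^#/ && NF != 6 {
             printf "line %d: %d fields: %s\n", NR, NF, $0
             bad = 1
         }
         END { exit bad }' "$1"
}

# Usage:
#   check_fstab_fields /etc/fstab && echo "fstab looks well-formed"
```

This only checks the shape of each line. It does not prove that the devices exist or that the filesystems will mount, which is what the reboot confirms.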
Copying the disk over
The next step involved copying the data from the old disk to the new disk. I have already written about
disk copying (see "Swapping boot drives around"). If you look at the feedback for that
article, you’ll find a message from Henry which shows how to use dump/restore to copy the data. Here’s the
output from that, which I am sure you will find fascinating reading. (see also the note at the end of this section)
[root@cvsup:/home/bigdisk] # dump -0b 512 -f - /home/ncvs | buffer -S 2048K -p 75 | restore -rb 512 -f -
DUMP: Date of this level 0 dump: Tue Oct 15 03:37:26 2002
DUMP: Date of last level 0 dump: the epoch
DUMP: Dumping /dev/da0s1e (/home/ncvs) to standard output
DUMP: mapping (Pass I) [regular files]
DUMP: mapping (Pass II) [directories]
DUMP: estimated 1934456 tape blocks.
DUMP: dumping (Pass III) [directories]
DUMP: 0.03% done, finished in 2:05
DUMP: dumping (Pass IV) [regular files]
DUMP: 2.99% done, finished in 17:16
DUMP: 6.93% done, finished in 8:16
DUMP: 11.38% done, finished in 5:27
DUMP: 14.87% done, finished in 4:28
DUMP: 19.51% done, finished in 3:34
DUMP: 23.34% done, finished in 3:07
DUMP: 28.27% done, finished in 2:37
DUMP: 32.40% done, finished in 2:19
DUMP: 36.76% done, finished in 2:04
DUMP: 40.31% done, finished in 1:54
DUMP: 43.17% done, finished in 1:48
DUMP: 45.50% done, finished in 1:44
DUMP: 47.67% done, finished in 1:41
./FreeBSD/ports/french/linux-netscape6/scripts/configure,v: not found on tape
DUMP: 50.05% done, finished in 1:37
./FreeBSD/ports/graphics/libdvbpsi/distinfo,v: not found on tape
DUMP: 52.14% done, finished in 1:33
./FreeBSD/src/contrib/groff/src/devices/grotty/tty.cc,v: not found on tape
./FreeBSD/ports/german/linux-netscape6/scripts/configure,v: not found on tape
DUMP: 54.76% done, finished in 1:28
./FreeBSD/ports/www/linux-netscape6/pkg-message,v: not found on tape
./FreeBSD/ports/java/jikes/Makefile,v: not found on tape
DUMP: 57.28% done, finished in 1:23
./FreeBSD/src/usr.bin/calendar/calendars/de_DE.ISO8859-1/calendar.geschichte,v: not found on tape
DUMP: 59.53% done, finished in 1:19
DUMP: 61.30% done, finished in 1:17
./FreeBSD/ports/www/squid/pkg-plist,v: not found on tape
DUMP: 63.07% done, finished in 1:14
./FreeBSD/data/relnotes/CURRENT/installation/alpha/upgrading.html: not found on tape
DUMP: 65.22% done, finished in 1:10
./FreeBSD/data/doc/en_US.ISO8859-1/articles/pam/pam-terms.html: not found on tape
DUMP: 67.62% done, finished in 1:05
./FreeBSD/data/relnotes/CURRENT/installation/pc98/upgrading.html: not found on tape
./FreeBSD/data/relnotes/CURRENT/installation/i386/upgrading.html: not found on tape
DUMP: 69.90% done, finished in 1:01
./FreeBSD/data/relnotes/4-STABLE/errata/x28.html: not found on tape
DUMP: 72.60% done, finished in 0:55
./FreeBSD/data/relnotes/CURRENT/readme/x96.html: not found on tape
./FreeBSD/data/relnotes/CURRENT/relnotes/sparc64/x4229.html: not found on tape
./FreeBSD/ports/shells/zsh-devel/pkg-plist,v: not found on tape
DUMP: 74.96% done, finished in 0:50
./FreeBSD/data/doc/en_US.ISO8859-1/books/corp-net-guide/x998.html: not found on tape
./FreeBSD/src/sys/kern/vfs_subr.c,v: not found on tape
./FreeBSD/data/ru/releases/index.html: not found on tape
DUMP: 77.79% done, finished in 0:45
./FreeBSD/ports/www/squid24/Makefile,v: not found on tape
./FreeBSD/ports/japanese/netscape7/scripts/configure,v: not found on tape
./FreeBSD/data/ru/ports/zope.html: not found on tape
DUMP: 81.28% done, finished in 0:37
./FreeBSD/data/relnotes/CURRENT/relnotes/alpha/x4229.html: not found on tape
./FreeBSD/data/relnotes/CURRENT/relnotes/i386/x4229.html: not found on tape
DUMP: 84.70% done, finished in 0:30
./FreeBSD/src/sys/pc98/conf/GENERIC.hints,v: not found on tape
./FreeBSD/data/relnotes/CURRENT/errata/x29.html: not found on tape
./FreeBSD/data/search/index-site.html: not found on tape
DUMP: 89.12% done, finished in 0:21
./FreeBSD/data/relnotes/CURRENT/hardware/i386/x26.html: not found on tape
./FreeBSD/data/doc/en_US.ISO8859-1/articles/new-users/x737.html: not found on tape
./FreeBSD/src/sys/dev/wi/if_wireg.h,v: not found on tape
DUMP: 93.35% done, finished in 0:12
./FreeBSD/data/relnotes/CURRENT/hardware/alpha/x15.html: not found on tape
DUMP: 97.64% done, finished in 0:04
./FreeBSD/src/sys/fs/unionfs/union_vnops.c,v: not found on tape
./FreeBSD/data/doc/ja/articles/dialup-firewall/x91.html: not found on tape
DUMP: DUMP: 1934143 tape blocks
DUMP: finished in 11145 seconds, throughput 173 KBytes/sec
DUMP: DUMP IS DONE
1933824K
[root@cvsup:/home/bigdisk] #
I’m not really worried about those “not found on tape” messages. This is a mirror of the real data.
Any missing information will be “found” during the next cvsup.
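If you would rather confirm later that the next cvsup really did restore those files, the paths can be pulled out of a saved copy of the output. This is a minimal sketch, assuming the messages from the pipeline were captured to a file; the file name copy.log and the function name missing_on_tape are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the paths that restore reported as
# "not found on tape" in a saved log of the dump/restore run.
missing_on_tape() {
    sed -n 's/^\(.*\): not found on tape$/\1/p' "$1"
}

# Usage, run from the top of the restored tree (copy.log is an assumed name):
#   missing_on_tape copy.log | while read -r f; do
#       [ -e "$f" ] || echo "still missing: $f"
#   done
```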
This note added 2 April 2004: I tried this today under 4.9-RELEASE when attempting to copy a drive.
It failed. Instead of the mount point (/home/ncvs), the device (/dev/da0s1e) needed to be used.
Even then, I was unable to get it to work.
[root@temp:/dst] # dump -0b 512 -f - /dev/ad2s1a | restore -rb 512 -f -
DUMP: Date of this level 0 dump: Thu Apr 2 17:23:47 2004
DUMP: Date of last level 0 dump: the epoch
DUMP: Dumping /dev/ad2s1a to standard output
DUMP: mapping (Pass I) [regular files]
DUMP: mapping (Pass II) [directories]
DUMP: estimated 47111 tape blocks.
DUMP: dumping (Pass III) [directories]
DUMP: dumping (Pass IV) [regular files]
DUMP: 1.09% done, finished in 0:00
DUMP: Broken pipe
DUMP: The ENTIRE dump is aborted.
Segmentation fault (core dumped)
[root@temp:/dst] #
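When dump/restore will not cooperate, a tar pipe is one possible fallback. It is a different technique and not what was used in this article. It copies at the file level, so it may not carry over everything a dump would. This is a minimal sketch; the function name copy_tree is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: copy a directory tree with a tar pipe
# (a file-level copy, not dump/restore).
copy_tree() {
    src=$1
    dst=$2
    # -C changes directory before archiving or extracting;
    # -p on extraction preserves permissions (run as root to keep ownership).
    tar -cf - -C "$src" . | tar -xpf - -C "$dst"
}

# Usage (the destination directory must already exist):
#   copy_tree /home/ncvs /home/bigdisk
```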
Swapping out the old for the new
I wanted to change the mount points around so that the new disk was used instead of the old disk.
This was pretty easy. I just unmounted the old data and mounted the new.
# umount /dev/da0s1e
# umount /dev/da1s1e
# mount /dev/da1s1e /usr/home/ncvs
In the above, I unmount both the old and the new drives. Then I mount the new drive where the old drive used to be.
Scroll back up to the section on mount for more detail.
I also modified /etc/fstab to reflect these new mount points.
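The fstab change amounts to dropping the old device's entry and giving its mount point to the new device. This is a minimal sketch that writes the result to standard output for review, not an in-place edit. It assumes the old entry appears before the new one, as in the fstab shown earlier; the function name swap_fstab_device is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print a candidate fstab in which the new device
# takes over the old device's mount point and the old entry is dropped.
# Assumes the old device's line appears before the new device's line.
swap_fstab_device() {
    old=$1
    new=$2
    file=$3
    awk -v old="$old" -v new="$new" '
        $1 == old { mp = $2; next }        # remember the mount point, drop the line
        $1 == new && mp != "" { $2 = mp }  # re-point the new device
        { print }
    ' "$file"
}

# Usage: review the result before copying it over /etc/fstab.
#   swap_fstab_device /dev/da0s1e /dev/da1s1e /etc/fstab > /tmp/fstab.new
```

The rewritten line comes out with single spaces between fields, since awk rebuilds it. That is harmless to fstab but does lose the column alignment.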
But WAIT! There’s more!
Yes, there’s more. There’s more data to be added to this disk. We need to cvsup from the
master repository in order to get things up to date. This will not happen until about
midnight New Zealand time (which is 7 AM here in Ottawa). By the time I get up on Tuesday
morning, the repo should be fine.
did you have the new disk installed live, or was the box brought down?
I’ve heard you can use camcontrol for live maintenance, but have never tested to see if it actually works as touted…
The box was brought down by NZFUG members who lived in Wellington.
camcontrol works fine. I’ve used it to replace dead portions of
vinum volumes, and to routinely attach/remove my orb drive.