I got home from work tonight and zfs had reported that one of the disks in my main pool was dying.
No great problem: insert a disk I had spare, and run zpool replace like so:
ganesh:~# zpool replace -f export scsi-SATA_WDC_WD10EARS-00_WD-WMAV50933036 scsi-SATA_ST31000528AS_9VP16X03
and now it’s just a matter of waiting until the data has “resilvered” and I can remove the old dying drive.
ganesh:~# zpool status export -v
  pool: export
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Nov 28 19:25:49 2011
        75.1G scanned out of 2.29T at 88.3M/s, 7h17m to go
        18.0G resilvered, 3.21% done
config:

        NAME                                                    STATE     READ WRITE CKSUM
        export                                                  ONLINE       0     0     0
          raidz1-0                                              ONLINE       0     0     0
            scsi-SATA_WDC_WD10EACS-00_WD-WCASJ2114122           ONLINE       0     0     0
            scsi-SATA_WDC_WD10EACS-00_WD-WCASJ2195141           ONLINE       0     0     0
            scsi-SATA_WDC_WD10EARS-00_WD-WMAV50817803           ONLINE       0     0     0
            replacing-3                                         ONLINE       0     0     0
              scsi-SATA_WDC_WD10EARS-00_WD-WMAV50933036         ONLINE       0     0     0
              scsi-SATA_ST31000528AS_9VP16X03                   ONLINE       0     0     0  (resilvering)
        logs
          scsi-SATA_Patriot_Torqx_278BF0715010800025492-part6   ONLINE       0     0     0
        cache
          scsi-SATA_Patriot_Torqx_278BF0715010800025492-part7   ONLINE       0     0     0

errors: No known data errors
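If you just want the progress figure without reading the whole status block, something like this works (a quick sketch: the grep pattern simply picks the “3.21% done” figure out of the status text, so it depends on your zpool version printing that same wording):

```shell
# extract the "% done" figure from `zpool status` output
resilver_pct() {
  grep -o '[0-9.]*% done' | head -n1
}

# demonstrated on the line from the status output above;
# on the live system you'd pipe `zpool status export` in instead
echo '18.0G resilvered, 3.21% done' | resilver_pct   # -> 3.21% done
```

Wrapped in `watch -n60`, that makes a cheap progress meter.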
(I use /dev/disk/by-id device names so I don’t have to care about the kernel detecting the disks and disk controllers in a different order on each reboot. It also makes it easier to identify which disk is having a problem, as I can squint at the front of the drives in my Lian Li 4-in-3 hotswap bays and read the serial number sticker.)
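Going the other way is occasionally useful too: resolving a by-id name back to the kernel device node to see which sdX a given serial is today. A small sketch (not from the original post; the directory is a parameter only so the function can be exercised against a throwaway directory as well as the real /dev/disk/by-id):

```shell
# resolve a by-id symlink to its target device node
dev_for_id() {
  # $1 = directory of symlinks (normally /dev/disk/by-id), $2 = disk id
  readlink -f "$1/$2"
}

# usage on a real system, with a device name from the pool above:
#   dev_for_id /dev/disk/by-id scsi-SATA_ST31000528AS_9VP16X03
```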
OK, yeah, sure…it’s not that much different from replacing a dying drive in an mdadm or hardware RAID array, but one nice advantage of ZFS over RAID is that it only resyncs the data in use, not the empty or unused blocks. Given that these are 1TB drives in a raidz vdev, and the pool is 63% used, that means syncing only about 630GB rather than the full 1TB.
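The zpool status numbers above check out on the back of an envelope, too. Taking the 2.29T total, 75.1G already scanned and the 88.3M/s rate, the remaining time works out within a minute or so of the 7h17m zpool quoted:

```shell
# recompute the resilver ETA from the figures in the status output above
eta=$(awk 'BEGIN {
  total_g = 2.29 * 1024   # pool data in GiB (2.29T from zpool status)
  done_g  = 75.1          # GiB scanned so far
  rate_m  = 88.3          # scan rate in MiB/s
  secs = (total_g - done_g) * 1024 / rate_m
  printf "%dh%02dm", secs / 3600, (secs % 3600) / 60
}')
echo "ETA: $eta"          # -> ETA: 7h18m, a minute off zpool's own 7h17m
```

The small discrepancy is just rounding in the figures zpool prints.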
PS: in the time it’s taken to write this, it’s up to 10.11% done. 88MB/s reading data from the existing raidz drives while writing to a single drive isn’t too bad at all.
Are you using zfs-fuse or something else?
I’m using LLNL’s ZFS for Linux
I’m compiling local packages for Debian from the daily releases at the Ubuntu zfsonlinux PPA.
There’s only one minor change to the debian/control file required to get them to build for Debian instead of Ubuntu: edit the Depends line for zfs-initramfs so that it depends on grub rather than zfs-grub.
The zfs-grub package doesn’t exist in Debian…Debian’s grub package contains the zfs.mod and zfsinfo.mod modules anyway. I’m not booting from a ZFS pool (I still have an ext4 /boot and an XFS /), so I don’t know whether it’s functionally the same as Ubuntu’s zfs-grub package.
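That one-line edit is easy enough to script rather than do by hand. A sketch, demonstrated here on a sample of the Depends line rather than a real source tree (the sample contents are made up for illustration; on the real tree you’d run the same sed over debian/control and review the diff before building):

```shell
# stand-in for the zfs-initramfs Depends line in debian/control
printf 'Depends: zfs-grub, busybox-initramfs\n' > control.sample

# swap the Ubuntu-only zfs-grub dependency for Debian's plain grub
sed -i 's/zfs-grub/grub/' control.sample

cat control.sample    # -> Depends: grub, busybox-initramfs
# on the real tree: sed -i 's/zfs-grub/grub/' debian/control
```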
Update: The resilver operation finished successfully last night, just before 3am.