Recovering after overwriting the start of one disk in a mirrored setup (the total-fail edition)

I completely forgot that my work machine was running RAID1 and ran dd from the very start of sdb.
I've never made a mistake like this before (^^;.

root@redmine:~# dd if=/dev/zero of=/dev/sdb
^C61019+0 records in
61019+0 records out
31241728 bytes (31 MB) copied, 1.72095 s, 18.2 MB/s

I stopped it in a panic, but it had already overwritten 31 MB.
Naturally, the partition table is gone too.

root@redmine:~# fdisk -l /dev/sdb

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table

Checking the situation

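I'll have to recover this...
I set up both the partitions and the RAID with the Squeeze installer, so I don't even remember what the layout was (^^;.
First, check the partition table on the surviving disk.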

root@redmine:~# fdisk -l /dev/sda

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0000f5f2

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1       60055   482384896   fd  Linux raid autodetect
/dev/sda2           60055       60802     5998593    5  Extended
/dev/sda5           60055       60802     5998592   fd  Linux raid autodetect

sda1 is the rootfs.
sda5 is probably swap.
Both are RAID1 arrays under md.
I haven't done any md operations in years, so I've forgotten nearly all of it...

root@redmine:~# cat /proc/mdstat 
Personalities : [raid1] 
md1 : active (auto-read-only) raid1 sda5[0] sdb5[1]
      5998528 blocks [2/2] [UU]
      
md0 : active raid1 sda1[0] sdb1[1]
      482384832 blocks [2/2] [UU]
      
unused devices: <none>

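md believes the arrays are intact.
All I did was overwrite a drive that is still running, and sda takes priority, so as long as I don't power off, the machine should keep working without trouble.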
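
Reading /proc/mdstat by eye is easy to get wrong when you are flustered, so here is a minimal sketch that boils it down to one line per array. The function name `md_summary` is my own invention, and the sample input is just the transcript above pasted into a temporary file; on a real machine it would read /proc/mdstat directly.

```shell
#!/bin/sh
# Summarize each md array in an mdstat-format file as "name status",
# where status is the text inside the last [..] on the "blocks" line
# (for example UU when both mirrors are up, U_ when one is missing).
md_summary() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        name != "" && /blocks/ {
            status = $NF
            gsub(/\[|\]/, "", status)
            print name, status
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}

# Sample input copied from the transcript (a real run would omit the argument).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Personalities : [raid1]
md1 : active (auto-read-only) raid1 sda5[0] sdb5[1]
      5998528 blocks [2/2] [UU]

md0 : active raid1 sda1[0] sdb1[1]
      482384832 blocks [2/2] [UU]

unused devices: <none>
EOF
md_summary "$sample"
rm -f "$sample"
```

Note that a status of UU only says md has not noticed anything; it does not prove the two halves still hold the same data.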

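Repair strategy

Now to work out a repair strategy.
First, since the machine is still running, restore the partition table.
The swap area was partly overwritten, so it's probably a lost cause. Release that one to degrade the array, wipe it with dd, and let md rebuild it.
The rootfs is still alive, so I'm naively hoping that restoring the partition table will be enough to fix it.
After all, it's 500GB, and a rebuild alone would take a mind-numbingly long time...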

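Restoring the partition table

This part is done by hand.
Luckily both drives are the same model from the same manufacturer, so I copy sda's MBR and write it over sdb's.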

root@redmine:~# dd if=/dev/hda of=mbr.img bs=512 count=1
dd: opening `/dev/hda': No such file or directory
root@redmine:~# dd if=/dev/sda of=mbr.img bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 7.4122e-05 s, 6.9 MB/s
root@redmine:~# dd if=mbr.img of=/dev/sdb bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 7.3887e-05 s, 6.9 MB/s
root@redmine:~# fdisk -l /dev/sdb

Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0000f5f2

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       60055   482384896   fd  Linux raid autodetect
/dev/sdb2           60055       60802     5998593    5  Extended
/dev/sdb5           60055       60802     5998592   fd  Linux raid autodetect

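And with that, it's back.
sdb5 shows up again as well, because the table entry for a logical partition lives at the start of the extended partition, far beyond the 31 MB that got zeroed.

The Disk identifier is the same, though?

But now the Disk identifier is identical on both disks. Is that actually used for anything?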
http://www.linuxquestions.org/questions/linux-general-1/what-is-disk-identifier-740408/

A Disk Identifier (or Disk Signature) applies to an entire hard disk drive (not a single partition). A Disk Identifier/Disk Signature is a 4-byte (longword) number that is randomly generated when the Master Boot Record/Partition Table is first created and stored. The Disk Identifier is stored at byte offset 1B8 (hex) through 1BB (hex) in the MBR disk sector. Windows Vista uses the Disk Signature to locate boot devices so changing it can prevent Vista from booting. So far as I know Grub and Linux don't use the Disk Identifier.

Looks like Linux doesn't use it.
That thread also includes a C sample for rewriting the Disk identifier.
But using fdisk is easier.
I'll change it just to be safe.

root@redmine:~# fdisk /dev/sdb

WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
         switch off the mode (command 'c') and change display units to
         sectors (command 'u').

Command (m for help): x

Expert command (m for help): m
Command action
   b   move beginning of data in a partition
   c   change number of cylinders
   d   print the raw data in the partition table
   e   list extended partitions
   f   fix partition order
   g   create an IRIX (SGI) partition table
   h   change number of heads
   i   change the disk identifier
   m   print this menu
   p   print the partition table
   q   quit without saving changes
   r   return to main menu
   s   change number of sectors/track
   v   verify the partition table
   w   write table to disk and exit

Expert command (m for help): i
New disk identifier (current 0x0000f5f2): 0x0000f5f3
Disk identifier: 0x0000f5f3

Expert command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.
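
To double-check what identifier an MBR image such as mbr.img actually carries, the four bytes at offset 0x1B8 (440) mentioned in the quote above can be read directly. This is only a sketch: `disk_id` is a name I made up, and the image here is a throwaway file built to hold 0x0000f5f2 rather than a copy of a real disk.

```shell
#!/bin/sh
# Print the disk identifier stored in an MBR image as 0xXXXXXXXX.
# The four bytes at offset 440 (0x1B8) are little-endian, so they are
# printed in reverse order.
disk_id() {
    od -An -tx1 -j440 -N4 "$1" |
        awk '{ printf "0x%s%s%s%s\n", $4, $3, $2, $1 }'
}

# Build a throwaway 512-byte image and store the bytes f2 f5 00 00 at offset 440.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null
printf '\362\365\000\000' | dd of="$img" bs=1 seek=440 conv=notrunc 2>/dev/null
disk_id "$img"
rm -f "$img"
```

On the real machine the same function could be pointed at mbr.img, or at a fresh 512-byte copy of sdb taken after the fdisk change.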

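Checking the UUIDs

Each partition has its own UUID.
These days drives are identified by them, which makes things a hassle.
I assumed these had been overwritten along with everything else and so ended up identical.
Since this is a mirror, though, the UUIDs are supposed to match anyway: what blkid reports for a linux_raid_member is the array's UUID from the md superblock.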
http://www.linuxhomenetworking.com/wiki/index.php/Quick_HOWTO_:_Ch26_:_Linux_Software_RAID

All three partitions were given the UUID label 77b695c4:32e5dd46:63dd7d16:17696e09 when the mdadm command created device /dev/md0. The mdadm.conf file makes sure this mapping is remembered when you reboot.

Confirm that they match.

root@redmine:~# blkid /dev/sda*
/dev/sda1: UUID="4203a1e6-c935-ba5a-0090-cc13c02a0c21" TYPE="linux_raid_member" 
/dev/sda5: UUID="7d4576d3-ca18-c730-5edd-8ffeb6891536" TYPE="linux_raid_member" 
root@redmine:~# blkid /dev/sdb*
/dev/sdb1: UUID="4203a1e6-c935-ba5a-0090-cc13c02a0c21" TYPE="linux_raid_member" 
/dev/sdb5: UUID="7d4576d3-ca18-c730-5edd-8ffeb6891536" TYPE="linux_raid_member" 

Also check that they match the UUIDs in the RAID configuration.

root@redmine:~# cat /etc/mdadm/mdadm.conf | grep UUID
ARRAY /dev/md0 UUID=4203a1e6:c935ba5a:0090cc13:c02a0c21
ARRAY /dev/md1 UUID=7d4576d3:ca18c730:5edd8ffe:b6891536

No problem.
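
blkid prints the UUID with hyphens in 8-4-4-4-12 groups while mdadm.conf uses four colon-separated groups, so comparing them by eye is error-prone. A minimal sketch of a comparison that ignores the punctuation follows; `norm_uuid` and `same_uuid` are hypothetical helper names, and the values are the ones from the transcripts above.

```shell
#!/bin/sh
# Strip ':' and '-' and lowercase the hex digits so both notations compare equal.
norm_uuid() {
    printf '%s\n' "$1" | tr -d ':-' | tr 'A-F' 'a-f'
}

# Succeed when two UUID strings are the same apart from punctuation and case.
same_uuid() {
    [ "$(norm_uuid "$1")" = "$(norm_uuid "$2")" ]
}

blkid_uuid="4203a1e6-c935-ba5a-0090-cc13c02a0c21"    # from blkid /dev/sdb1
conf_uuid="4203a1e6:c935ba5a:0090cc13:c02a0c21"      # from mdadm.conf, md0

if same_uuid "$blkid_uuid" "$conf_uuid"; then
    echo "md0: UUIDs match"
else
    echo "md0: UUIDs differ"
fi
```
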
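If the two drives are not the same model, it's worth keeping a backup of the fdisk -l output.
Really, the partition table itself should be backed up (^^;.
If they differ, create new partitions with fdisk, taking the boundaries from the "End / heads / sectors" values in sda's fdisk output.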
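
As a sanity check on those numbers, the cylinder size follows from the geometry that fdisk printed, and a partition's block count can be tested against it. This is a rough sketch using only figures from the fdisk output above; the function names are mine, and it relies on fdisk's "Blocks" column being in 1 KiB units.

```shell
#!/bin/sh
# Bytes per cylinder for a given geometry: heads * sectors/track * bytes/sector.
cyl_bytes() {
    echo $(( $1 * $2 * $3 ))
}

# Bytes left over when a partition of $1 one-KiB blocks is divided into
# cylinders of $2 bytes; 0 means the size is a whole number of cylinders.
cyl_remainder() {
    echo $(( $1 * 1024 % $2 ))
}

cyl=$(cyl_bytes 255 63 512)
echo "cylinder size: $cyl bytes"
echo "sda1 remainder: $(cyl_remainder 482384896 "$cyl") bytes"
```

The first line agrees with fdisk's "Units = cylinders of 16065 * 512 = 8225280 bytes". The remainder for sda1 is not zero, which suggests this layout does not sit on cylinder boundaries, so when recreating partitions by hand it is probably safer to read the boundaries in sector units (fdisk -l -u) than to rely on cylinder numbers.
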
If you're using an ext-family filesystem, the UUID can be set like this.

# tune2fs -U [UUID] /dev/[targetDevice]

Rebuilding swap, and a painful mistake...

Remove sdb1, the swap, from the RAID array.

root@redmine:~# mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md0

Ouch, a painful mistake.
The swap is md1 on sdb5... orz
I just kicked out the big volume... orz

root@redmine:~# cat /proc/mdstat 
Personalities : [raid1] 
md1 : active (auto-read-only) raid1 sda5[0] sdb5[1]
      5998528 blocks [2/2] [UU]
      
md0 : active raid1 sda1[0] sdb1[2](F)
      482384832 blocks [2/1] [U_]
      
unused devices: <none>

Today is just one of those days...
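
A mix-up like this could be caught by asking /proc/mdstat which array a partition belongs to before running --fail on it. The following is a minimal sketch; `md_array_of` is a hypothetical helper, and the sample data is the mdstat output from earlier in this post.

```shell
#!/bin/sh
# Print the md array that contains partition $1 (name without /dev/, e.g. sdb5),
# reading an mdstat-format file ($2, default /proc/mdstat).
# Exits non-zero when the partition is not a member of any array.
md_array_of() {
    awk -v part="$1" '
        /^md[0-9]+ :/ {
            for (i = 3; i <= NF; i++) {
                if ($i !~ /\[/) continue      # skip "active", "raid1", ...
                member = $i
                sub(/\[.*/, "", member)       # strip "[1]" or "[2](F)"
                if (member == part) { print $1; found = 1 }
            }
        }
        END { exit(found ? 0 : 1) }
    ' "${2:-/proc/mdstat}"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
md1 : active (auto-read-only) raid1 sda5[0] sdb5[1]
      5998528 blocks [2/2] [UU]
md0 : active raid1 sda1[0] sdb1[1]
      482384832 blocks [2/2] [UU]
EOF
echo "sdb1 is in $(md_array_of sdb1 "$sample")"
echo "sdb5 is in $(md_array_of sdb5 "$sample")"
rm -f "$sample"
```

Which array holds swap would still have to be checked separately, for example with swapon -s or /proc/swaps.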

Change of strategy: rebuild everything... orz

Nothing for it, so I'll have everything rebuilt.
So ordinary it's boring...

root@redmine:~# mdadm --manage /dev/md1 --fail /dev/sdb5
mdadm: set /dev/sdb5 faulty in /dev/md1
root@redmine:~# cat /proc/mdstat 
Personalities : [raid1] 
md1 : active raid1 sda5[0] sdb5[2](F)
      5998528 blocks [2/1] [U_]
      
md0 : active raid1 sda1[0] sdb1[2](F)
      482384832 blocks [2/1] [U_]
      
unused devices: <none>

Rebuild.

root@redmine:~# mdadm /dev/md0 -a /dev/sdb1
mdadm: Cannot open /dev/sdb1: Device or resource busy

Huh?
Was it like this?

root@redmine:~# mdadm --manage /dev/md0 -a /dev/sdb1
mdadm: Cannot open /dev/sdb1: Device or resource busy

Hmm?

root@redmine:~# mdadm --add /dev/md0 /dev/sdb1
mdadm: Cannot open /dev/sdb1: Device or resource busy

Ah, going by vague memory like this, I'm going to break something...
Truly a bad day.

Why won't the rebuild start???

Check the RAID state.

root@redmine:~# mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Fri May 13 12:56:38 2011
     Raid Level : raid1
     Array Size : 482384832 (460.04 GiB 493.96 GB)
  Used Dev Size : 482384832 (460.04 GiB 493.96 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu May 26 15:01:24 2011
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0

           UUID : 4203a1e6:c935ba5a:0090cc13:c02a0c21
         Events : 0.76

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       0        0        1      removed

       2       8       17        -      faulty spare   /dev/sdb1

As expected.
So why can't sdb1 rejoin?

Oh, it's still listed as a faulty spare.
That's because I never removed it.
Of course it's busy, then.
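
In other words, a member that has been marked faulty has to be removed from the array before it can be added back. A small dry-run sketch of that order is below; `readd_plan` is a made-up helper that only prints the mdadm commands (long-option form) and does not run them.

```shell
#!/bin/sh
# Print, without executing, the mdadm commands that cycle a member
# out of an array and back in: fail, then remove, then add.
readd_plan() {
    array=$1
    member=$2
    echo "mdadm --manage $array --fail $member"
    echo "mdadm --manage $array --remove $member"
    echo "mdadm --manage $array --add $member"
}

readd_plan /dev/md0 /dev/sdb1
readd_plan /dev/md1 /dev/sdb5
```

Printing the plan first gives one last chance to confirm that the array and the partition really belong together.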

root@redmine:~# mdadm /dev/md0 -r /dev/sdb1
mdadm: hot removed /dev/sdb1 from /dev/md0
root@redmine:~# mdadm /dev/md1 -r /dev/sdb5
mdadm: hot removed /dev/sdb5 from /dev/md1
root@redmine:~# cat /proc/mdstat 
Personalities : [raid1] 
md1 : active raid1 sda5[0]
      5998528 blocks [2/1] [U_]
      
md0 : active raid1 sda1[0]
      482384832 blocks [2/1] [U_]
      
unused devices: <none>

Rebuild.

root@redmine:~# mdadm /dev/md1 -a /dev/sdb5
mdadm: re-added /dev/sdb5
root@redmine:~# mdadm /dev/md0 -a /dev/sdb1
mdadm: re-added /dev/sdb1
root@redmine:~# cat /proc/mdstat 
Personalities : [raid1] 
md1 : active raid1 sdb5[2] sda5[0]
      5998528 blocks [2/1] [U_]
      [====>................]  recovery = 24.9% (1495616/5998528) finish=2.0min speed=36634K/sec
      
md0 : active raid1 sdb1[2] sda1[0]
      482384832 blocks [2/1] [U_]
      	resync=DELAYED
      
unused devices: <none>

Now all that's left is to wait and pray no errors show up...
For this 500GB setup, the estimated time shown is about 100 minutes.
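
The finish estimate that mdstat prints is essentially remaining blocks divided by current speed, which is easy to reproduce. A rough sketch with the figures from the transcript above follows; `eta_minutes` is my own helper name and it rounds down to whole minutes.

```shell
#!/bin/sh
# Estimated minutes to finish a rebuild.
# $1 = remaining size in KiB, $2 = current speed in KiB per second.
eta_minutes() {
    echo $(( $1 / ($2 * 60) ))
}

# md1: 5998528 blocks in total, 1495616 done, 36634K/sec
echo "md1: about $(eta_minutes $((5998528 - 1495616)) 36634) min"

# md0, if it were to run at the same speed as md1
echo "md0: about $(eta_minutes 482384832 36634) min at md1's speed"
```

The md1 figure agrees with the finish=2.0min shown in the transcript. At that same speed md0 would need well over three hours, so an estimate of about 100 minutes implies md0 was syncing noticeably faster than md1 was at that moment.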

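Will it boot?

There's still the question of what happened to the GRUB boot code.
How was GRUB2 supposed to be set up on an md RAID1 again?
Looking at /boot/grub/grub.cfg, it seems to recognize md properly.
GRUB2 is impressive.
I overwrote the whole MBR anyway, so this is probably fine.
I'll install it to both disks just in case.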

root@redmine:~# grub-install /dev/sda
Installation finished. No error reported.
root@redmine:~# grub-install /dev/sdb
Installation finished. No error reported.
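
For a quick look at whether a disk's first sector plausibly holds boot code, the 0x55AA signature at offset 510 can be checked in a 512-byte copy of the MBR. This is only a sketch: `mbr_check` is a hypothetical helper, the search for the literal string "GRUB" assumes GRUB's boot code embeds that text, and the image below is fabricated for the demonstration rather than read from sdb.

```shell
#!/bin/sh
# Report whether a 512-byte MBR image ends with the 55 aa boot signature
# and whether it contains the literal string "GRUB" (an assumption about
# how GRUB's first-stage code identifies itself).
mbr_check() {
    sig=$(od -An -tx1 -j510 -N2 "$1" | tr -d ' \n')
    if [ "$sig" != "55aa" ]; then
        echo "no boot signature"
        return 1
    fi
    if LC_ALL=C grep -q GRUB "$1"; then
        echo "boot signature ok, GRUB string found"
    else
        echo "boot signature ok, no GRUB string"
    fi
}

# Fabricated image: zeros, the text "GRUB " at an arbitrary offset, 55 aa at 510.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null
printf 'GRUB ' | dd of="$img" bs=1 seek=384 conv=notrunc 2>/dev/null
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null
mbr_check "$img"
rm -f "$img"
```

On the real machine the image would come from something like dd if=/dev/sdb of=sdb-mbr.img bs=512 count=1. Passing this check does not prove the disk boots; only an actual reboot does.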

And so the rebuild finished without trouble.

root@redmine:~# cat /proc/mdstat 
Personalities : [raid1] 
md1 : active raid1 sdb5[1] sda5[0]
      5998528 blocks [2/2] [UU]
      
md0 : active raid1 sdb1[1] sda1[0]
      482384832 blocks [2/2] [UU]
      
unused devices: <none>

It's still running fine after a reboot.