Re: mapping disk sectors to files
From: Eyal Lebedinsky <hidden>
Date: 2014-03-14 08:07:40
raid1 means that the two disks hold the same data at the same
offset, so dd from the member device should confirm the address
and trigger an error. Your calculation looks correct to me (IANA
Expert).
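A minimal sketch of how the same arithmetic could be carried one step further to the md device, using the figures quoted below (partition start 935936, Data Offset 2048 sectors, 4096-byte ext4 blocks). It assumes the filesystem sits directly on md0 with nothing such as LVM in between; the function names are hypothetical.

```python
# Assumptions taken from the quoted post: 512-byte logical sectors,
# sda3 starts at sector 935936, md v1.2 Data Offset is 2048 sectors,
# and ext4 on md0 uses 4096-byte blocks with first block 0.
SECTORS_PER_BLOCK = 4096 // 512
PART_START = 935936
DATA_OFFSET = 2048


def md_sector(disk_sector):
    """Sector offset inside the md array for a whole-disk sector number."""
    return disk_sector - PART_START - DATA_OFFSET


def fs_block(disk_sector):
    """Filesystem block number on the md device (assumes ext4 directly on md0)."""
    return md_sector(disk_sector) // SECTORS_PER_BLOCK


if __name__ == "__main__":
    sector = 1147023664  # first end_request error in the log
    print(md_sector(sector), fs_block(sector))
```

For the first logged sector this gives md sector 1146085680, which is 48 sectors above the "block 1146085632" in the matching md/raid1 message, so the logged value may be the start of the failed request rather than the bad sector itself. The resulting filesystem block is the kind of number debugfs `icheck` takes, with `ncheck` then mapping the inode to a path.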
I do not understand why you "can't run badblocks" on the individual
device. Only reads through the md device get "cover"ed by the other mirror.
BTW, I would expect the periodic smartd long test to report the bad
blocks earlier, and the periodic md 'check' to fix such problems.
HTH,
Eyal
On 14/03/14 08:23, Thorsten von Eicken wrote:
I've just had a disk in a raid1 mirror set die and that exposed some bad
blocks on the remaining drive -- ooops! I'd now like to map the bad
blocks to files so I can restore the affected files from backups, but I
can't figure out the mapping. What I've done:
- While I had only one drive, I ran badblocks on the md device, which
gave me a list of bad blocks. I verified that they were indeed bad with
dd, used debugfs to map them to files, verified that the files gave a
read error, and wrote zeroes to the blocks to get the drive to reallocate
them. Done!
- I rebuilt the mirror set with a fresh drive, but that exposed two more
bad blocks, ouch!
- Now I can't run badblocks anymore easily because the second drive will
"cover" for the first one as far as I understand
- I've tried converting sectors to blocks, subtracting partition offset
and data offset, but it's just not working, i.e. I can't get dd to hit
the error when I try
This is the set of bad blocks I'm trying to deal with:
Mar 13 01:12:24 h kernel: [522839.210723] end_request: I/O error, dev sda, sector 1147023664
Mar 13 01:12:24 h kernel: [522839.211618] ata1: EH complete
Mar 13 01:12:24 h kernel: [522839.211635] md/raid1:md0: sda: unrecoverable I/O read error for block 1146085632
Mar 13 01:11:49 h kernel: [522804.763467] end_request: I/O error, dev sda, sector 1147020360
Mar 13 01:11:49 h kernel: [522804.765146] ata1: EH complete
Mar 13 01:11:49 h kernel: [522804.765180] md/raid1:md0: sda: unrecoverable I/O read error for block 1146082304
Mar 13 01:11:30 h kernel: [522785.944248] end_request: I/O error, dev sda, sector 1147017056
Mar 13 01:11:30 h kernel: [522785.945926] ata1: EH complete
Mar 13 01:11:30 h kernel: [522785.945961] md/raid1:md0: sda: unrecoverable I/O read error for block 1146078976
Mar 13 01:09:23 h kernel: [522658.983066] end_request: I/O error, dev sda, sector 1129050144
Mar 13 01:09:23 h kernel: [522658.984750] ata1: EH complete
Mar 13 01:09:23 h kernel: [522658.984781] md/raid1:md0: sda: unrecoverable I/O read error for block 1128112128
Mar 13 01:06:44 h kernel: [522499.760134] end_request: I/O error, dev sda, sector 1098724456
Mar 13 01:06:44 h kernel: [522499.761829] ata1: EH complete
Mar 13 01:06:44 h kernel: [522499.761869] md/raid1:md0: sda: unrecoverable I/O read error for block 1097786368
The GPT is:
Disk /dev/sda: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): FD188278-C58A-41C7-9943-AD5E94EDF1F8
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 2048-sector boundaries
Total free space is 4061 sectors (2.0 MiB)
Number  Start (sector)  End (sector)  Size       Code  Name
   1            2048        409600    199.0 MiB  EF00  EFI System
   2          411648        935935    256.0 MiB  0700  Microsoft basic data
   3          935936    3907029134    1.8 TiB    FD00  Linux RAID
The md device says:
# mdadm --examine /dev/sda3
/dev/sda3:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 1f3118f3:ee5a644b:d8d3df40:8decaa30
Name : h2:0
Creation Time : Sun Mar 18 00:37:55 2012
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 3906091151 (1862.57 GiB 1999.92 GB)
Array Size : 1953045575 (1862.57 GiB 1999.92 GB)
Used Dev Size : 3906091150 (1862.57 GiB 1999.92 GB)
Data Offset : 2048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 7a87b471:48f6c853:0a5dd5ad:15fa79a4
Update Time : Thu Mar 13 19:43:14 2014
Checksum : 1522576c - correct
Events : 8076588
Device Role : Active device 1
Array State : AA ('A' == active, '.' == missing)
I'm using an ext4 filesystem and dumpe2fs says:
First block: 0
Block size: 4096
If I take the bad sector number from the first error message
(1147023664) subtract the partition start (935936) and then divide by 8
(4096 byte blocks) I can get dd to trigger the bad block:
# dd if=/dev/sda3 of=/dev/null bs=4096 count=10 iflag=direct skip=143260966
dd: reading `/dev/sda3': Input/output error
0+0 records in
0+0 records out
0 bytes (0 B) copied, 15.7246 s, 0.0 kB/s
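The arithmetic behind that skip value can be sketched as follows, applied to all five sectors from the kernel log. This is only an illustration under the figures above (512-byte sectors, sda3 starting at sector 935936, bs=4096); the helper name is hypothetical.

```python
# Assumptions from the GPT listing and the dd command above.
SECTOR_SIZE = 512
BLOCK_SIZE = 4096
PART_START = 935936  # first sector of sda3

# Whole-disk sector numbers from the end_request log lines.
BAD_SECTORS = [1147023664, 1147020360, 1147017056, 1129050144, 1098724456]


def partition_block(disk_sector):
    """4096-byte block index inside /dev/sda3, usable as dd skip= with bs=4096."""
    relative_sector = disk_sector - PART_START
    return relative_sector // (BLOCK_SIZE // SECTOR_SIZE)


if __name__ == "__main__":
    for sector in BAD_SECTORS:
        print(sector, partition_block(sector))
```

The first sector yields 143260966, matching the skip value in the dd command.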
But I need the block number in the md device so I can map it back to a
file. I've tried various calculations and used dd on the md device but I
see no error in syslog indicating hitting a bad block. Does someone know
how to do the mapping and/or how to fix the situation?
Thanks!
--
Eyal Lebedinsky (eyal@eyal.emu.id.au)