I have built a backup machine with 3x750GB disks. They run in RAID 5
managed by the Linux kernel, set up using Webmin.
Two disks sit on a SATA controller on the motherboard (sda and sdb). sdc and
sdd sit on a small 2-port controller on a PCI card.
But sdd is complaining bitterly:
1 Time(s): [707822.076108] ata2: soft resetting port
1 Time(s): [707822.235979] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
1 Time(s): [707822.306313] ata2.00: configured for UDMA/33
1 Time(s): [707822.306324] ata2: EH complete
1 Time(s): [707822.327749] sd 2:0:0:0: [sdd] 1465149168 512-byte hardware sectors (750156 MB)
1 Time(s): [707822.328614] sd 2:0:0:0: [sdd] Write Protect is off
1 Time(s): [707822.347198] sd 2:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
1 Time(s): [707822.904826] res ff/ff:ff:ff:ff:ff/ff:ff:ff:ff:ff/ff Emask 0x2 (HSM violation)
1 Time(s): [707828.260653] ata2: port is slow to respond, please be patient (Status 0xff)
1 Time(s): [707832.936529] ata2: device not ready (errno=-16), forcing hardreset
This repeats many times every day.
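To put a number on "many times", the kernel messages can be tallied per ATA port. The following is only a minimal sketch: the function name `count_ata_events` and the three event labels are made up for illustration, and it assumes each kernel message is on a single, unwrapped line as in the excerpt above.

```python
import re
from collections import Counter

# Matches kernel lines such as "[707822.076108] ata2: soft resetting port".
# An optional device suffix (".00") after the port name is ignored.
ATA_LINE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(ata\d+)(?:\.\d+)?:\s+(.*)")

def count_ata_events(log_text):
    """Count soft resets, hard resets and slow-port warnings per ATA port."""
    counts = Counter()
    for line in log_text.splitlines():
        m = ATA_LINE.search(line)
        if not m:
            continue  # not an ataN message (e.g. an "sd 2:0:0:0" line)
        port, message = m.group(2), m.group(3)
        if "soft resetting" in message:
            counts[(port, "soft_reset")] += 1
        elif "hardreset" in message or "hard resetting" in message:
            counts[(port, "hard_reset")] += 1
        elif "slow to respond" in message:
            counts[(port, "slow")] += 1
    return counts

# Three lines taken from the log excerpt above.
sample = """\
1 Time(s): [707822.076108] ata2: soft resetting port
1 Time(s): [707828.260653] ata2: port is slow to respond, please be patient (Status 0xff)
1 Time(s): [707832.936529] ata2: device not ready (errno=-16), forcing hardreset
"""
events = count_ata_events(sample)
for (port, kind), n in sorted(events.items()):
    print(port, kind, n)
```

Feeding it a full day of log output instead of `sample` would give a per-port count that can be compared between the two controllers.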
udev reports the following about disk sdd:
UDEV [1200077861.934476] add /block/sdd/sdd1 (block)
UDEV_LOG=3
ACTION=add
DEVPATH=/block/sdd/sdd1
SUBSYSTEM=block
SEQNUM=1848
MINOR=49
MAJOR=8
PHYSDEVPATH=/devices/pci0000:00/0000:00:13.1/0000:04:06.0/host2/target2:0:0/2:0:0:0
PHYSDEVBUS=scsi
PHYSDEVDRIVER=sd
UDEVD_EVENT=1
DEVTYPE=partition
ID_VENDOR=ATA
ID_MODEL=WDC_WD7500AAKS-0
ID_REVISION=30.0
ID_SERIAL=1ATA_WDC_WD7500AAKS-00RBA0_WD-WCAPT0411498
ID_SERIAL_SHORT=ATA_WDC_WD7500AAKS-00RBA0_WD-WCAPT0411498
ID_TYPE=disk
ID_BUS=scsi
ID_ATA_COMPAT=WDC_WD7500AAKS-00RBA0_WD-WCAPT0411498
ID_PATH=pci-0000:04:06.0-scsi-2:0:0:0
ID_FS_USAGE=raid
ID_FS_TYPE=linux_raid_member
ID_FS_VERSION=0.90.0
ID_FS_UUID=46368934:d91d3ce6:8e117197:9a1c538f
ID_FS_UUID_ENC=46368934:d91d3ce6:8e117197:9a1c538f
ID_FS_LABEL=
ID_FS_LABEL_ENC=
ID_FS_LABEL_SAFE=
DEVNAME=/dev/sdd1
DEVLINKS=/dev/disk/by-id/scsi-1ATA_WDC_WD7500AAKS-00RBA0_WD-WCAPT0411498-part1 /dev/disk/by-id/ata-WDC_WD7500AAKS-00RBA0_WD-WCAPT0411498-part1 /dev/disk/by-path/pci-0000:04:06.0-scsi-2:0:0:0-part1
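The udev dump above is a list of KEY=VALUE pairs, so it is easy to pull out which PCI device a disk hangs off. A rough sketch follows; the helper names `parse_udev_env` and `pci_address` are hypothetical, and the assumed `ID_PATH` layout (`pci-<address>-scsi-...`) is simply the one seen in this dump.

```python
import re

def parse_udev_env(text):
    """Turn KEY=VALUE lines from a udev event dump into a dict."""
    env = {}
    for line in text.splitlines():
        m = re.match(r"([A-Z0-9_]+)=(.*)", line)
        if m:  # the "UDEV [...] add ..." header line has no KEY= and is skipped
            env[m.group(1)] = m.group(2)
    return env

def pci_address(env):
    """Extract the PCI address from ID_PATH, or None if it is not present."""
    m = re.search(r"pci-([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])",
                  env.get("ID_PATH", ""))
    return m.group(1) if m else None

# A few lines copied from the udev output above.
sample = """\
UDEV [1200077861.934476] add /block/sdd/sdd1 (block)
ACTION=add
DEVPATH=/block/sdd/sdd1
ID_PATH=pci-0000:04:06.0-scsi-2:0:0:0
DEVNAME=/dev/sdd1
"""
env = parse_udev_env(sample)
print(env["DEVNAME"], "is attached via PCI device", pci_address(env))
```

Running the same extraction on the events for the other disks would show which of them share a controller with sdd.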
Another disk of the same type is installed as "sdb", and it gives no
problems.
The SMART status is as follows:
smartctl -a /dev/sdd
smartctl version 5.37 [i686-pc-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF INFORMATION SECTION ===
Device Model: WDC WD7500AAKS-00RBA0
Serial Number: WD-WCAPT0411498
Firmware Version: 30.04G30
User Capacity: 750.156.374.016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 7
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Mon Jan 21 13:13:13 2008 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (15960) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 198) minutes.
Conveyance self-test routine
recommended polling time:        (   6) minutes.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always   -           0
  3 Spin_Up_Time            0x0003   186   182   021    Pre-fail  Always   -           7658
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always   -           19
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always   -           0
  7 Seek_Error_Rate         0x000e   200   200   051    Old_age   Always   -           0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always   -           350
 10 Spin_Retry_Count        0x0012   100   253   051    Old_age   Always   -           0
 11 Calibration_Retry_Count 0x0012   100   253   051    Old_age   Always   -           0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always   -           18
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always   -           7
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always   -           19
194 Temperature_Celsius     0x0022   121   098   000    Old_age   Always   -           31
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always   -           0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always   -           0
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline  -           0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           8901
200 Multi_Zone_Error_Rate   0x0008   200   200   051    Old_age   Offline  -           0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%        40         -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
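With this many attributes it is easy to overlook a counter that is not zero. A small sketch that scans the attribute table for error-type counters with a nonzero RAW_VALUE is given here; the keyword list and the function name `nonzero_error_counters` are assumptions chosen for illustration, not anything smartctl itself defines, and each table row is assumed to be on one line.

```python
import re

# One attribute row: ID, name, flag, value, worst, thresh, type, updated,
# when_failed, raw value.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

# Hypothetical choice of name fragments that mark an attribute as an error counter.
ERROR_KEYWORDS = ("Error", "Reallocated", "Pending", "Uncorrectable", "Retry")

def nonzero_error_counters(table_text):
    """Return {attribute_name: raw_value} for error counters with raw > 0."""
    hits = {}
    for line in table_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header line or anything that is not an attribute row
        name, raw = m.group(2), int(m.group(10))
        if raw > 0 and any(k in name for k in ERROR_KEYWORDS):
            hits[name] = raw
    return hits

# Four rows copied from the smartctl output above.
sample = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
  9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 350
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 8901
200 Multi_Zone_Error_Rate 0x0008 200 200 051 Old_age Offline - 0
"""
print(nonzero_error_counters(sample))
```

Applied to the full table above, the only error-type counter with a nonzero raw value is UDMA_CRC_Error_Count (8901).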
Any ideas?
Regards, NKJensen