GreenC bot (talk | contribs) Rescued 1 archive link; reformat 1 link. Wayback Medic 2.5 per Category:All articles with dead external links - pass 3
(20 intermediate revisions by 12 users not shown)
Line 7:
If the drive itself is inherently reliable but has some bad sectors, then TLER and similar features prevent a disk from being unnecessarily marked as 'failed' by limiting the time spent on correcting detected errors before advising the array controller of a failed operation. The array controller can then handle the data recovery for the limited amount involved, rather than marking the entire drive as faulty.
==Typical defaults==
Effectively, TLER and similar features limit how long the drive spends on its own error handling, so that hardware RAID controllers and software RAID implementations can handle the error instead.
Line 36:
|}
=== ZFS ===
The [[ZFS|ZFS filesystem]] was
===RAID controllers===
Disconnect timeout values for different hardware [[Disk array controller|RAID controllers]] may vary between vendors; thus, TLER should trigger before the controller times out the drive. For example, 3ware 9650SE uses 20 seconds as the timeout,<ref>{{cite web|url=http://kb.lsi.com/KnowledgebaseArticle15639.aspx|archiveurl=https://web.archive.org/web/20120203053819/http://kb.lsi.com/KnowledgebaseArticle15639.aspx|title=User Guide for 9650SE 9690SA from 9.5.2 Complete Codeset|archivedate=3 February 2012|work=lsi.com|accessdate=10 June 2015}}</ref> while for the LSI Logic used in IBM x-series it is 10 seconds.<ref>Available in BIOS Raid Config Utility > Advanced Device Properties</ref>
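The requirement that TLER trigger before the controller's disconnect timeout can be checked with simple arithmetic. The following Python sketch uses the two controller timeouts quoted above and a 7-second TLER value (the value set by {{Mono|TLER-ON.BAT}}); the function and dictionary names are hypothetical and only illustrate the comparison.

```python
# Controller disconnect timeouts in seconds, as quoted in the paragraph above.
CONTROLLER_TIMEOUTS = {
    "3ware 9650SE": 20,
    "LSI Logic (IBM x-series)": 10,
}


def tler_margin(tler_seconds, controller_timeout_seconds):
    """Seconds left between the drive reporting an error and the controller timing out."""
    return controller_timeout_seconds - tler_seconds


if __name__ == "__main__":
    tler = 7  # seconds; illustrative drive setting
    for name, timeout in CONTROLLER_TIMEOUTS.items():
        margin = tler_margin(tler, timeout)
        status = "ok" if margin > 0 else "drive may be dropped"
        print(f"{name}: margin {margin} s ({status})")
```

A positive margin means the drive reports the failed operation before the controller gives up on it; a zero or negative margin suggests the controller may disconnect the drive first.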
Widely available [[Intel Rapid Storage Technology|Intel Matrix RAID / Intel Rapid Storage Technology]], embedded in [[Intel]] server motherboards and modern desktop motherboards, is a pseudo-hardware controller, not a true hardware RAID controller.
===Software RAID===
== Changing ERC ==
To use a custom timeout value, the {{Mono|WDTLER.EXE}} utility can be run directly with the <code>-r# -w#</code> parameters to specify the Time Limit value in seconds.
===ATA-8 standard===
The 2006 ATA-8 standard defines an SCT {{tt|Error Recovery Control}} command.<ref>[https://www.singlix.org/trdos/8086/archive/specs/D1699r3e-ATA8_ACS.pdf ATA/ATAPI Command Set (ATA8-ACS) ]</ref> For hard drives that implement this interface, the {{Mono|smartctl}} utility (part of the [[smartmontools]] package) can be used to change the error-recovery timeout via {{code|-l scterc}}.<ref name=greg>{{Cite web |author=Richard Gregory |url=http://abatis.org.uk/projects/erc/ |title=Author's description of the original patch to smartctl that implemented that feature |access-date=2013-02-15 |archive-url=https://web.archive.org/web/20130910034510/http://cgi.csc.liv.ac.uk:80/~greg/projects/erc/ |archive-date=2013-09-10 |url-status=live }}</ref> In 2018, ACS-4 added the ability for the setting to persist across reboots; this is now supported by smartctl.<ref>{{cite web |title=#1427 (Add support for SCT Error Recovery Timer features added in ACS-4) – smartmontools |url=https://www.smartmontools.org/ticket/1427 |website=www.smartmontools.org}}</ref>
On Windows, the HDAT2 program is available in addition to smartctl (which is cross-platform).<ref name=greg/>
Controlling the TLER behavior through the {{Mono|smartctl}} utility may not work on all hard disk drives, because some manufacturers have removed support for the ERC parameter from their desktop drives,<ref>{{cite web|url=http://www.spinics.net/lists/raid/msg38964.html|title=Re: md RAID with enterprise-class SATA or SAS drives|work=spinics.net}}</ref><ref>{{cite web|url=http://knowledge.seagate.com/articles/en_US/FAQ/203991en|title=Seagate FAQ: What is Error Recovery Control?|work=seagate.com}}</ref> purportedly to force sales of their more expensive RAID/enterprise models.{{Citation needed|date=April 2016}}
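As a rough illustration of the {{code|-l scterc}} option, the following Python sketch builds the corresponding {{Mono|smartctl}} command line. It assumes that smartctl expects the read and write limits in tenths of a second (so 7 seconds becomes 70); the function name and the device path {{Mono|/dev/sda}} are hypothetical, and the command is only constructed, not executed.

```python
def scterc_command(device, read_seconds, write_seconds):
    """Build a smartctl argument list that sets SCT ERC read/write limits."""
    if read_seconds < 0 or write_seconds < 0:
        raise ValueError("time limits must be non-negative")
    # Assumption: smartctl takes the limits in units of 100 ms (deciseconds).
    read_ds = round(read_seconds * 10)
    write_ds = round(write_seconds * 10)
    return ["smartctl", "-l", f"scterc,{read_ds},{write_ds}", device]


if __name__ == "__main__":
    # Hypothetical device path; prints the command rather than running it.
    print(" ".join(scterc_command("/dev/sda", 7, 7)))
```

On a drive that does not implement the ERC parameter, such a command would be expected to fail or report the feature as unsupported, as described above.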
==
SBC-4 describes a RECOVERY TIME LIMIT field in the Read-Write Error Recovery mode page used to define how the drive performs error recovery.<ref>{{cite web |title=INCITS 506-202x - Information technology - SCSI Block Commands - 4 (SBC-4) draft revision 22 |url=https://standards.incits.org/apps/group_public/download.php/124286/livelink |access-date=22 May 2023 |date=15 September 2020}}</ref> The sdparm program can change this setting with {{code|1=--set=RTL}}.<ref>{{man|8|sdparm|Linux}}</ref>
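A minimal Python sketch of how the {{code|1=--set=RTL}} option might be assembled is shown below. It assumes that the RECOVERY TIME LIMIT field is a 16-bit value expressed in milliseconds; the function name and the device path are hypothetical, and the command is only built, not run.

```python
def rtl_command(device, limit_seconds):
    """Build an sdparm argument list that sets the RECOVERY TIME LIMIT field."""
    # Assumption: the field is a 16-bit value in milliseconds.
    limit_ms = round(limit_seconds * 1000)
    if not 0 <= limit_ms <= 0xFFFF:
        raise ValueError("limit does not fit the assumed 16-bit millisecond field")
    return ["sdparm", f"--set=RTL={limit_ms}", device]


if __name__ == "__main__":
    # Hypothetical device path; prints the command rather than running it.
    print(" ".join(rtl_command("/dev/sg0", 7)))
```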
=== Vendor utilities ===
==== Western Digital ====
A {{Mono|WDTLER.EXE}} utility allows the enabling or disabling of the TLER parameter on Western Digital hard drives. This utility is written for [[DOS]]. The utility works on and makes changes to all compatible Western Digital hard disk drives connected to the computer. The change survives power-cycling. Western Digital used to mention the tool in an FAQ.<ref name=customer-service>{{cite web |title=TLER / CCTL / ERC thread |url=https://hardforum.com/threads/tler-cctl-erc-thread.1562128/ |website=[H]ard{{!}}Forum |date=16 November 2010}}</ref>
The utility comes with three batch files, {{Mono|TLERSCAN.BAT}} to get the current state of the TLER setting on all the hard drives, {{Mono|TLER-ON.BAT}} to enable TLER, and {{Mono|TLER-OFF.BAT}} to disable TLER. The included {{Mono|TLER-ON.BAT}} will set the Read & Write TLER time to seven seconds.
Western Digital claims that using the {{Mono|WDTLER.EXE}} utility on newer drives can damage the firmware and make the disk unusable. The utility is no longer available from Western Digital, and new drives will not be able to have the TLER setting changed. RE disks are only suitable for RAID arrays and Caviar disks are only suitable for non-RAID use. The utility still works for older drives{{which|date=May 2023}}<!-- what is the cutoff? -->.
==== Hitachi ====
Hitachi customer service stated in 2009 that there is a Feature Tool for changing ERC (referred to as CCTL).<ref name=customer-service/>
==
Seagate provides the {{Mono|openSeaChest}} utility, which can query and change many firmware settings, including TLER. If <code>smartctl -l scterc,x,y</code> cannot be used to set the TLER, the relevant commands are <code>openSeaChest_Configure -d /dev/sg0 --sctReadTimer</code> and <code>openSeaChest_Configure -d /dev/sg0 --sctWriteTimer</code>.
Linux [[mdadm]] simply holds and lets the drive complete its recovery – however, the default command timeout for the SCSI Disk layer (/sys/block/sd?/device/timeout) is 30 seconds,<ref>{{cite web|url=https://github.com/torvalds/linux/blob/master/drivers/scsi/sd.h#LC11|title=linux/sd.h at master · torvalds/linux · GitHub|work=GitHub}}</ref> after which it will attempt to reset the drive, and if that fails, put the drive offline.<ref>{{cite web|url=https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/scsi/scsi_eh.txt|title=kernel/git/torvalds/linux.git – Linux kernel source tree|work=kernel.org}}</ref>
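The interaction between the drive's recovery time and the SCSI Disk layer timeout can be sketched in Python as follows. The sysfs path is the one given above; the function names and the <code>sysfs_root</code> parameter are hypothetical additions for illustration, and the comparison assumes that the drive's error-recovery limit should be shorter than the kernel's command timeout for the drive to avoid being reset.

```python
from pathlib import Path


def read_scsi_timeout(device_name, sysfs_root="/sys"):
    """Return the SCSI command timeout in seconds for a device such as 'sda'."""
    path = Path(sysfs_root) / "block" / device_name / "device" / "timeout"
    return int(path.read_text().strip())


def erc_fits_within_timeout(erc_seconds, scsi_timeout_seconds):
    """True if the drive gives up on recovery before the kernel's timeout expires."""
    return erc_seconds < scsi_timeout_seconds


if __name__ == "__main__":
    timeout = read_scsi_timeout("sda")  # hypothetical device name
    print(timeout, erc_fits_within_timeout(7, timeout))
```

With the default 30-second timeout, a 7-second limit fits comfortably, whereas a drive that keeps retrying for longer than 30 seconds would run into the reset behaviour described above.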
==References==
Line 68 ⟶ 77:
==External links==
* [https://raid.wiki.kernel.org/index.php/Timeout_Mismatch Linux Raid wiki: Timeout Mismatch]
* [https://archive.today/20130121054825/http://wdc.custhelp.com/app/answers/detail/a_id/1397/p/227,283/session/L3RpbWUvMTMyMTQzOTc4NS9zaWQvdVhvYmpmSms%3D Western Digital FAQ answer ID 1397: Difference between Desktop edition and RAID (Enterprise) edition drives]
* [http://www.wdc.com/wdproducts/library/other/2579-001098.pdf Time-Limited Error Recovery (TLER) Information Sheet], Western Digital, January 2013
* [https://web.archive.org/web/20071103042201/http://www.samsung.com/global/business/hdd/learningresource/whitepapers/LearningResource_CCTL.html Samsung CCTL]