commit 240630e61870e62e39a97225048f9945848fa5f5 upstream.
There have been several reports of LPM related hard freezes about once
a day on multiple Lenovo 50 series models. Strange enough these reports
where not disk model specific as LPM issues usually are and some users
with the exact same disk + laptop where seeing them while other users
where not seeing these issues.
It turns out that enabling LPM triggers a firmware bug somewhere, which
has been fixed in later BIOS versions.
This commit adds a new ahci_broken_lpm() function and a new ATA_FLAG_NO_LPM
for dealing with this.
The ahci_broken_lpm() function contains DMI match info for the 4 models
which are known to be affected by this and the DMI BIOS date field for
known good BIOS versions. If the BIOS date is older then the one in the
table LPM will be disabled and a warning will be printed.
Note the BIOS dates are for known good versions, some older versions may
work too, but we don't know for sure, the table is using dates from BIOS
versions for which users have confirmed that upgrading to that version
makes the problem go away.
Unfortunately I've been unable to get hold of the reporter who reported
that BIOS version 2.35 fixed the problems on the W541 for him. I've been
able to verify the DMI_SYS_VENDOR and DMI_PRODUCT_VERSION from an older
dmidecode, but I don't know the exact BIOS date as reported in the DMI.
Lenovo keeps a changelog with dates in their release notes, but the
dates there are the release dates not the build dates which are in DMI.
So I've chosen to set the date to which we compare to one day past the
release date of the 2.34 BIOS. I plan to fix this with a follow up
commit once I've the necessary info.
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2cfce3a86b64b53f0a70e92a6a659c720c319b45 upstream.
Commit 184add2ca23c ("libata: Apply NOLPM quirk for SanDisk
SD7UB3Q*G1001 SSDs") disabled LPM for SanDisk SD7UB3Q*G1001 SSDs.
This has lead to several reports of users of that SSD where LPM
was working fine and who know have a significantly increased idle
power consumption on their laptops.
Likely there is another problem on the T450s from the original
reporter which gets exposed by the uncore reaching deeper sleep
states (higher PC-states) due to LPM being enabled. The problem as
reported, a hardfreeze about once a day, already did not sound like
it would be caused by LPM and the reports of the SSD working fine
confirm this. The original reporter is ok with dropping the quirk.
A X250 user has reported the same hard freeze problem and for him
the problem went away after unrelated updates, I suspect some GPU
driver stack changes fixed things.
TL;DR: The original reporters problem were triggered by LPM but not
an LPM issue, so drop the quirk for the SSD in question.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1583207
Cc: stable@vger.kernel.org
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Lorenzo Dalrio <lorenzo.dalrio@gmail.com>
Reported-by: Lorenzo Dalrio <lorenzo.dalrio@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: "Richard W.M. Jones" <rjones@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 136d769e0b3475d71350aa3648a116a6ee7a8f6c upstream.
While whitelisting Micron M500DC drives, the tweaked blacklist entry
enabled queued TRIM from M500IT variants also. But these do not support
queued TRIM. And while using those SSDs with the latest kernel we have
seen errors and even the partition table getting corrupted.
Some part from the dmesg:
[ 6.727384] ata1.00: ATA-9: Micron_M500IT_MTFDDAK060MBD, MU01, max UDMA/133
[ 6.727390] ata1.00: 117231408 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[ 6.741026] ata1.00: supports DRM functions and may not be fully accessible
[ 6.759887] ata1.00: configured for UDMA/133
[ 6.762256] scsi 0:0:0:0: Direct-Access ATA Micron_M500IT_MT MU01 PQ: 0 ANSI: 5
and then for the error:
[ 120.860334] ata1.00: exception Emask 0x1 SAct 0x7ffc0007 SErr 0x0 action 0x6 frozen
[ 120.860338] ata1.00: irq_stat 0x40000008
[ 120.860342] ata1.00: failed command: SEND FPDMA QUEUED
[ 120.860351] ata1.00: cmd 64/01:00:00:00:00/00:00:00:00:00/a0 tag 0 ncq dma 512 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x5 (timeout)
[ 120.860353] ata1.00: status: { DRDY }
[ 120.860543] ata1: hard resetting link
[ 121.166128] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 121.166376] ata1.00: supports DRM functions and may not be fully accessible
[ 121.186238] ata1.00: supports DRM functions and may not be fully accessible
[ 121.204445] ata1.00: configured for UDMA/133
[ 121.204454] ata1.00: device reported invalid CHS sector 0
[ 121.204541] sd 0:0:0:0: [sda] tag#18 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
[ 121.204546] sd 0:0:0:0: [sda] tag#18 Sense Key : 0x5 [current]
[ 121.204550] sd 0:0:0:0: [sda] tag#18 ASC=0x21 ASCQ=0x4
[ 121.204555] sd 0:0:0:0: [sda] tag#18 CDB: opcode=0x93 93 08 00 00 00 00 00 04 28 80 00 00 00 30 00 00
[ 121.204559] print_req_error: I/O error, dev sda, sector 272512
After few reboots with these errors, and the SSD is corrupted.
After blacklisting it, the errors are not seen and the SSD does not get
corrupted any more.
Fixes: 243918be63 ("libata: Do not blacklist Micron M500DC")
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 322579dcc865b94b47345ad1b6002ad167f85405 upstream.
Sandisk SSDs SD7SN6S256G and SD8SN8U256G are regularly locking up
regularly under sustained moderate load with NCQ enabled. Blacklist
for now.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 184add2ca23ce5edcac0ab9c3b9be13f91e7b567 upstream.
Richard Jones has reported that using med_power_with_dipm on a T450s
with a Sandisk SD7UB3Q256G1001 SSD (firmware version X2180501) is
causing the machine to hang.
Switching the LPM to max_performance fixes this, so it seems that
this Sandisk SSD does not handle LPM well.
Note in the past there have been bug-reports about the following
Sandisk models not working with min_power, so we may need to extend
the quirk list in the future: name - firmware
Sandisk SD6SB2M512G1022I - X210400
Sandisk SD6PP4M-256G-1006 - A200906
Cc: stable@vger.kernel.org
Cc: Richard W.M. Jones <rjones@redhat.com>
Reported-and-tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d418ff56b8f2d2b296daafa8da151fe27689b757 upstream.
When commit 9c7be59fc519af ("libata: Apply NOLPM quirk to Crucial MX100
512GB SSDs") was added it inherited the ATA_HORKAGE_NO_NCQ_TRIM quirk
from the existing "Crucial_CT*MX100*" entry, but that entry sets model_rev
to "MU01", where as the entry adding the NOLPM quirk sets it to NULL.
This means that after this commit we no apply the NO_NCQ_TRIM quirk to
all "Crucial_CT512MX100*" SSDs even if they have the fixed "MU02"
firmware. This commit splits the "Crucial_CT512MX100*" quirk into 2
quirks, one for the "MU01" firmware and one for all other firmware
versions, so that we once again only apply the NO_NCQ_TRIM quirk to the
"MU01" firmware version.
Fixes: 9c7be59fc519af ("libata: Apply NOLPM quirk to ... MX100 512GB SSDs")
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3bf7b5d6d017c27e0d3b160aafb35a8e7cfeda1f upstream.
Commit b17e5729a630 ("libata: disable LPM for Crucial BX100 SSD 500GB
drive"), introduced a ATA_HORKAGE_NOLPM quirk for Crucial BX100 500GB SSDs
but limited this to the MU02 firmware version, according to:
http://www.crucial.com/usa/en/support-ssd-firmware
MU02 is the last version, so there are no newer possibly fixed versions
and if the MU02 version has broken LPM then the MU01 almost certainly
also has broken LPM, so this commit changes the quirk to apply to all
firmware versions.
Fixes: b17e5729a630 ("libata: disable LPM for Crucial BX100 SSD 500GB...")
Cc: stable@vger.kernel.org
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 62ac3f7305470e3f52f159de448bc1a771717e88 upstream.
There have been reports of the Crucial M500 480GB model not working
with LPM set to min_power / med_power_with_dipm level.
It has not been tested with medium_power, but that typically has no
measurable power-savings.
Note the reporters Crucial_CT480M500SSD3 has a firmware version of MU03
and there is a MU05 update available, but that update does not mention any
LPM fixes in its changelog, so the quirk matches all firmware versions.
In my experience the LPM problems with (older) Crucial SSDs seem to be
limited to higher capacity versions of the SSDs (different firmware?),
so this commit adds a NOLPM quirk for the 480 and 960GB versions of the
M500, to avoid LPM causing issues with these SSDs.
Cc: stable@vger.kernel.org
Reported-and-tested-by: Martin Steigerwald <martin@lichtvoll.de>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ca6bfcb2f6d9deab3924bf901e73622a94900473 upstream.
Samsung explicitly states that queued TRIM is supported for Linux with
860 PRO and 860 EVO.
Make the previous blacklist to cover only 840 and 850 series.
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b17e5729a630d8326a48ec34ef02e6b4464a6aef upstream.
After Laptop Mode Tools starts to use min_power for LPM, a user found
out Crucial BX100 SSD can't get mounted.
Crucial BX100 SSD 500GB drive don't work well with min_power. This also
happens to med_power_with_dipm.
So let's disable LPM for Crucial BX100 SSD 500GB drive.
BugLink: https://bugs.launchpad.net/bugs/1726930
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c7be59fc519af9081c46c48f06f2b8fadf55ad8 upstream.
Various people have reported the Crucial MX100 512GB model not working
with LPM set to min_power. I've now received a report that it also does
not work with the new med_power_with_dipm level.
It does work with medium_power, but that has no measurable power-savings
and given the amount of people being bitten by the other levels not
working, this commit just disables LPM altogether.
Note all reporters of this have either the 512GB model (max capacity), or
are not specifying their SSD's size. So for now this quirk assumes this is
a problem with the 512GB model only.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=89261
Buglink: https://github.com/linrunner/TLP/issues/84
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9173e5e80729c8434b8d27531527c5245f4a5594 upstream.
syzkaller hit a WARN() in ata_qc_issue() when writing to /dev/sg0. This
happened because it issued a READ_6 command with no data buffer.
Just remove the WARN(), as it doesn't appear indicate a kernel bug. The
expected behavior is to fail the command, which the code does.
Here's a reproducer that works in QEMU when /dev/sg0 refers to a disk of
the default type ("82371SB PIIX3 IDE"):
#include <fcntl.h>
#include <unistd.h>
int main()
{
char buf[42] = { [36] = 0x8 /* READ_6 */ };
write(open("/dev/sg0", O_RDWR), buf, sizeof(buf));
}
Fixes: f92a26365a ("libata: change ATA_QCFLAG_DMAMAP semantics")
Reported-by: syzbot+f7b556d1766502a69d85071d2ff08bd87be53d0f@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org> # v2.6.25+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit db5ff909798ef0099004ad50a0ff5fde92426fd1 upstream.
LITEON EP1 has the same timeout issues as CX1 series devices.
Revert max_sectors to the value of 1024.
Fixes: e0edc8c54646 ("libata: apply MAX_SEC_1024 to all CX1-JB*-HP devices")
Signed-off-by: Xinyu Lin <xinyu0123@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e0edc8c546463f268d41d064d855bcff994c52fa upstream.
Marko reports that CX1-JB512-HP shows the same timeout issues as
CX1-JB256-HP. Let's apply MAX_SEC_128 to all devices in the series.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Marko Koski-Vähälä <marko@koski-vahala.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull libata updates from Tejun Heo:
"Nothing interesting. A couple device specific minor updates and a
kernel doc change"
* 'for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ata: pata_arasam_cf: Use devm_clk_get
libata: fix libata-core.c kernel-doc warning
ata: sata_rcar: Remove obsolete sata-r8a779* platform_device_id entries
The Crucial M500 is known to have issues with queued TRIM commands, the
factory recertified SSDs use a different model number naming convention
which causes them to get ignored by the blacklist.
The new naming convention boils down to: s/Crucial_/FC/
Signed-off-by: Guillermo A. Amaral <g@maral.me>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Fix kernel-doc warning in libata-core.c:
Warning(..//drivers/ata/libata-core.c:4763): No description found for parameter 'tag'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
This reverts commit fe7173c206.
As implemented, ACS-4 sense reporting for ATA devices bypasses error
diagnosis and handling in libata degrading EH behavior significantly.
Revert the related changes for now.
ATA_ID_COMMAND_SET_3/4 constants are not reverted as they're used by
later changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Hannes Reinecke <hare@suse.de>
Cc: stable@vger.kernel.org #v4.1+
This reverts commit a1524f226a.
As implemented, ACS-4 sense reporting for ATA devices bypasses error
diagnosis and handling in libata degrading EH behavior significantly.
Revert the related changes for now.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Hannes Reinecke <hare@suse.de>
Cc: stable@vger.kernel.org #v4.1+
A new Micron drive was just announced, once again recycling the first
part of the model string. Add an underscore to the M510/M550 pattern to
avoid picking up the new DC drive.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
I have a ST4000DM000 disk. If Linux is booted while the disk is spun down,
the command that sets transfer mode causes the disk to spin up. The
spin-up takes longer than the default 5s timeout, so the command fails and
timeout is reported.
Fix this by increasing the timeout to 15s, which is enough for the disk to
spin up.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Since no longer limiting max_sectors to BLK_DEF_MAX_SECTORS (commit 34b48db66e),
data corruption may occur on ST380013AS drive configured on 82801JI (ICH10 Family)
SATA controller. This patch will allow the driver to limit max_sectors as before
# cat /sys/block/sdb/queue/max_sectors_kb
512
I was able to double the max_sectors_kb value up to 16384 on linux-4.2.0-rc2
before seeing corruption, but seems safer to use previous limit. Without this
patch max_sectors_kb will be 32767.
tj: Minor comment update.
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: David Milburn <dmilburn@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org # v3.19 and later
Fixes: 34b48db66e ("block: remove artifical max_hw_sectors cap")
This device loses blocks, often the partition table area, on trim.
Disable TRIM.
http://pcengines.ch/msata16a.htm
Signed-off-by: Arne Fitzenreiter <arne_f@ipfire.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Enabling AA on HP 250GB SATA disk VB0250EAVER causes errors:
[ 3.788362] ata3.00: failed to enable AA (error_mask=0x1)
[ 3.789243] ata3.00: failed to enable AA (error_mask=0x1)
Add the ATA_HORKAGE_BROKEN_FPDMA_AA for this specific harddisk.
tj: Collected FPDMA_AA entries and updated comment.
Signed-off-by: Aleksei Mamlin <mamlinav@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Pull libata updates from Tejun Heo:
- a number of libata core changes to better support NCQ TRIM.
- ahci now supports MSI-X in single IRQ mode to support a new
controller which doesn't implement MSI or INTX.
- ahci now supports edge-triggered IRQ mode to support a new controller
which for some odd reason did edge-triggered IRQ.
- the usual controller support additions and changes.
* 'for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (27 commits)
libata: Do not blacklist Micron M500DC
ata: ahci_mvebu: add suspend/resume support
ahci, msix: Fix build error for !PCI_MSI
ahci: Add support for Cavium's ThunderX host controller
ahci: Add generic MSI-X support for single interrupts to SATA PCI driver
libata: finally use __initconst in ata_parse_force_one()
drivers: ata: add support for Ceva sata host controller
devicetree:bindings: add devicetree bindings for ceva ahci
ahci: added support for Freescale AHCI sata
ahci: Store irq number in struct ahci_host_priv
ahci: Move interrupt enablement code to a separate function
Doc: libata: Fix spelling typo found in libata.xml
ata:sata_nv - Change 1 to true for bool type variable.
ata: add Broadcom AHCI SATA3 driver for STB chips
Documentation: devicetree: add Broadcom SATA binding
libata: Fix regression when the NCQ Send and Receive log page is absent
ata: hpt366: fix constant cast warning
ata: ahci_xgene: potential NULL dereference in probe
ata: ahci_xgene: Add AHCI Support for 2nd HW version of APM X-Gene SoC AHCI SATA Host controller.
libahci: Add support to handle HOST_IRQ_STAT as edge trigger latch.
...
Queued TRIM got disabled on Micron M500DC drives thanks to the
"Micron_M500*" pattern we had in place to accommodate the previous
generation of this drive family. Tweak the blacklist entry slightly so
we only disable queued TRIM for the non-DC variants of M500 drives.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Just six days after this FIXME was added seven years ago, Sam Ravnborg
added the missing feature (37c514e3df "Add missing init section
definitions"), though it ended up being called __initconst.
Let's use it; better late than never.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
This patch fix a spelling typo found in libata.xml.
It is because libata.xml is generated from comments
in source, I have to fix it in libata-core.c
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
We have started seeing SSD firmware updates introduce support for queued
TRIM. Sadly, in most cases this support is completely untested and can
lead to either errors or data corruption.
Add two libata force flags that can be used to either enable or disable
queued TRIM support.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
The queued TRIM problems appear to be generic to Samsung's firmware and
not tied to a particular model. A recent update to the 840 EVO firmware
introduced the same issue as we saw on 850 Pro.
Blacklist queued TRIM on all 800-series drives while we work this issue
with Samsung.
Reported-by: Günter Waller <g.wal@web.de>
Reported-by: Sven Köhler <sven.koehler@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
When the LPM policy is set to ATA_LPM_MAX_POWER, the device might
generate a spurious PHY event that cuases errors on the link.
Ignore this event if it occured within 10s after the policy change.
The timeout was chosen observing that on a Dell XPS13 9333 these
spurious events can occur up to roughly 6s after the policy change.
Link: http://lkml.kernel.org/g/3352987.ugV1Ipy7Z5@xps13
Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
This is a preparation commit that will allow to add other criteria
according to which PHY events should be dropped.
Signed-off-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Pull libata updates from Tejun Heo:
- Hannes's patchset implements support for better error reporting
introduced by the new ATA command spec.
- the deperecated pci_ dma API usages have been replaced by dma_ ones.
- a bunch of hardware specific updates and some cleanups.
* 'for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ata: remove deprecated use of pci api
ahci: st: st_configure_oob must be called after IP is clocked.
ahci: st: Update the ahci_st DT documentation
ahci: st: Update the DT example for how to obtain the PHY.
sata_dwc_460ex: indent an if statement
libata: Add tracepoints
libata-eh: Set 'information' field for autosense
libata: Implement support for sense data reporting
libata: Implement NCQ autosense
libata: use status bit definitions in ata_dump_status()
ide,ata: Rename ATA_IDX to ATA_SENSE
libata: whitespace fixes in ata_to_sense_error()
libata: whitespace cleanup in ata_get_cmd_descript()
libata: use READ_LOG_DMA_EXT
libata: remove ATA_FLAG_LOWTAG
sata_dwc_460ex: re-use hsdev->dev instead of dwc_dev
sata_dwc_460ex: move to generic DMA driver
sata_dwc_460ex: join messages back
sata: xgene: add ACPI support for APM X-Gene SATA ports
ata: sata_mv: add proper definitions for LP_PHY_CTL register values
Blacklist queued TRIM on this drive for now.
Reported-by: Stefan Keller <linux-list@zahlenfresser.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
CC: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Micron has released an updated firmware (MU02) for M510/M550/MX100
drives to fix the issues with queued TRIM. Queued TRIM remains broken on
M500 but is working fine on later drives such as M600 and MX200.
Tweak our blacklist to reflect the above.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=71371
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Add some tracepoints for ata_qc_issue, ata_qc_complete, and
ata_eh_link_autopsy.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
If NCQ autosense or the sense data reporting feature is enabled
the LBA of the offending command should be stored in the sense
data 'information' field.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
ACS-4 defines a sense data reporting feature set.
This patch implements support for it.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
SAS controller has its own tag allocation, which doesn't directly match to ATA
tag, so SAS and SATA have different code path for ata tags. Originally we use
port->scsi_host (98bd4be1) to destinguish SAS controller, but libsas set
->scsi_host too, so we can't use it for the destinguish, we add a new flag for
this purpose.
Without this patch, the following oops can happen because scsi-mq uses
a host-wide tag map shared among all devices with some integer tag
values >= ATA_MAX_QUEUE. These unexpectedly high tag values cause
__ata_qc_from_tag() to return NULL, which is then dereferenced in
ata_qc_new_init().
BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
IP: [<ffffffff804fd46e>] ata_qc_new_init+0x3e/0x120
PGD 32adf0067 PUD 32adf1067 PMD 0
Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi igb
i2c_algo_bit ptp pps_core pm80xx libsas scsi_transport_sas sg coretemp
eeprom w83795 i2c_i801
CPU: 4 PID: 1450 Comm: cydiskbench Not tainted 4.0.0-rc3 #1
Hardware name: Supermicro X8DTH-i/6/iF/6F/X8DTH, BIOS 2.1b 05/04/12
task: ffff8800ba86d500 ti: ffff88032a064000 task.ti: ffff88032a064000
RIP: 0010:[<ffffffff804fd46e>] [<ffffffff804fd46e>] ata_qc_new_init+0x3e/0x120
RSP: 0018:ffff88032a067858 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff8800ba0d2230 RCX: 000000000000002a
RDX: ffffffff80505ae0 RSI: 0000000000000020 RDI: ffff8800ba0d2230
RBP: ffff88032a067868 R08: 0000000000000201 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800ba0d0000
R13: ffff8800ba0d2230 R14: ffffffff80505ae0 R15: ffff8800ba0d0000
FS: 0000000041223950(0063) GS:ffff88033e480000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000058 CR3: 000000032a0a3000 CR4: 00000000000006e0
Stack:
ffff880329eee758 ffff880329eee758 ffff88032a0678a8 ffffffff80502dad
ffff8800ba167978 ffff880329eee758 ffff88032bf9c520 ffff8800ba167978
ffff88032bf9c520 ffff88032bf9a290 ffff88032a0678b8 ffffffff80506909
Call Trace:
[<ffffffff80502dad>] ata_scsi_translate+0x3d/0x1b0
[<ffffffff80506909>] ata_sas_queuecmd+0x149/0x2a0
[<ffffffffa0046650>] sas_queuecommand+0xa0/0x1f0 [libsas]
[<ffffffff804ea544>] scsi_dispatch_cmd+0xd4/0x1a0
[<ffffffff804eb50f>] scsi_queue_rq+0x66f/0x7f0
[<ffffffff803e5098>] __blk_mq_run_hw_queue+0x208/0x3f0
[<ffffffff803e54b8>] blk_mq_run_hw_queue+0x88/0xc0
[<ffffffff803e5c74>] blk_mq_insert_request+0xc4/0x130
[<ffffffff803e0b63>] blk_execute_rq_nowait+0x73/0x160
[<ffffffffa0023fca>] sg_common_write+0x3da/0x720 [sg]
[<ffffffffa0025100>] sg_new_write+0x250/0x360 [sg]
[<ffffffffa0025feb>] sg_write+0x13b/0x450 [sg]
[<ffffffff8032ec91>] vfs_write+0xd1/0x1b0
[<ffffffff8032ee54>] SyS_write+0x54/0xc0
[<ffffffff80689932>] system_call_fastpath+0x12/0x17
tj: updated description.
Fixes: 12cb5ce101 ("libata: use blk taging")
Reported-and-tested-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Pull block driver changes from Jens Axboe:
"This contains:
- The 4k/partition fixes for brd from Boaz/Matthew.
- A few xen front/back block fixes from David Vrabel and Roger Pau
Monne.
- Floppy changes from Takashi, cleaning the device file creation.
- Switching libata to use the new blk-mq tagging policy, removing
code (and a suboptimal implementation) from libata. This will
throw you a merge conflict, since a bug in the original libata
tagging code was fixed since this code was branched. Trivial.
From Shaohua.
- Conversion of loop to blk-mq, from Ming Lei.
- Cleanup of the io_schedule() handling in bsg from Peter Zijlstra.
He claims it improves on unreadable code, which will cost him a
beer.
- Maintainer update or NDB, now handled by Markus Pargmann.
- NVMe:
- Optimization from me that avoids a kmalloc/kfree per IO for
smaller (<= 8KB) IO. This cuts about 1% of high IOPS CPU
overhead.
- Removal of (now) dead RCU code, a relic from before NVMe was
converted to blk-mq"
* 'for-3.20/drivers' of git://git.kernel.dk/linux-block:
xen-blkback: default to X86_32 ABI on x86
xen-blkfront: fix accounting of reqs when migrating
xen-blkback,xen-blkfront: add myself as maintainer
block: Simplify bsg complete all
floppy: Avoid manual call of device_create_file()
NVMe: avoid kmalloc/kfree for smaller IO
MAINTAINERS: Update NBD maintainer
libata: make sata_sil24 use fifo tag allocator
libata: move sas ata tag allocation to libata-scsi.c
libata: use blk taging
NVMe: within nvme_free_queues(), delete RCU sychro/deferred free
null_blk: suppress invalid partition info
brd: Request from fdisk 4k alignment
brd: Fix all partitions BUGs
axonram: Fix bug in direct_access
loop: add blk-mq.h include
block: loop: don't handle REQ_FUA explicitly
block: loop: introduce lo_discard() and lo_req_flush()
block: loop: say goodby to bio
block: loop: improve performance via blk-mq
09c32aaa36 ("ahci_xgene: Fix the dma state machine lockup for the
ATA_CMD_SMART PIO mode command.") missed 3.19 release. Fold it into
for-3.20.
Signed-off-by: Tejun Heo <tj@kernel.org>
Basically move the sas ata tag allocation to libata-scsi.c to make it clear
these staffs are just for sas.
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
libata uses its own tag management which is duplication and the
implementation is poor. And if we switch to blk-mq, tag is build-in.
It's time to switch to generic taging.
The SAS driver has its own tag management, and looks we can't directly
map the host controler tag to SATA tag. So I just bypassed the SAS case.
I changed the code/variable name for the tag management of libata to
make it self contained. Only sas will use it. Later if libsas implements
its tag management, the tag management code in libata can be deleted
easily.
Cc: Jens Axboe <axboe@fb.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
Ronny reports: https://bugzilla.kernel.org/show_bug.cgi?id=87101
"Since commit 8a4aeec8d "libata/ahci: accommodate tag ordered
controllers" the access to the harddisk on the first SATA-port is
failing on its first access. The access to the harddisk on the
second port is working normal.
When reverting the above commit, access to both harddisks is working
fine again."
Maintain tag ordered submission as the default, but allow sata_sil24 to
continue with the old behavior.
Cc: <stable@vger.kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Reported-by: Ronny Hegewald <Ronny.Hegewald@online.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Remove the function ata_do_simple_cmd() that is not used anywhere.
This was partially found by using a static code analysis program called cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: Tejun Heo <tj@kernel.org>
As defined, the DRAT (Deterministic Read After Trim) and RZAT (Return
Zero After Trim) flags in the ATA Command Set are unreliable in the
sense that they only define what happens if the device successfully
executed the DSM TRIM command. TRIM is only advisory, however, and the
device is free to silently ignore all or parts of the request.
In practice this renders the DRAT and RZAT flags completely useless and
because the results are unpredictable we decided to disable discard in
MD for 3.18 to avoid the risk of data corruption.
Hardware vendors in the real world obviously need better guarantees than
what the standards bodies provide. Unfortuntely those guarantees are
encoded in product requirements documents rather than somewhere we can
key off of them programatically. So we are compelled to disabling
discard_zeroes_data for all devices unless we explicitly have data to
support whitelisting them.
This patch whitelists SSDs from a few of the main vendors. None of the
whitelists are based on written guarantees. They are purely based on
empirical evidence collected from internal and external users that have
tested or qualified these drives in RAID deployments.
The whitelist is only meant as a starting point and is by no means
comprehensive:
- All intel SSD models except for 510
- Micron M5?0/M600
- Samsung SSDs
- Seagate SSDs
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Add new ATA device type for ZAC devices.
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Pull libata update from Tejun Heo:
"AHCI is getting per-port irq handling and locks for better
scalability. The gain is not huge but measureable with multiple high
iops devices connected to the same host; however, the value of
threaded IRQ handling seems negligible for AHCI and it likely will
revert to non-threaded handling soon.
Another noteworthy change is George Spelvin's "libata: Un-break ATA
blacklist". During 3.17 devel cycle, the libata blacklist glob
matching got generalized and rewritten; unfortunately, the patch
forgot to swap arguments to match the new match function and ended up
breaking blacklist matching completely. It got noticed only a couple
days ago so it couldn't make for-3.17-fixes either. :(
Other than the above two, nothing too interesting - the usual cleanup
churns and device-specific changes"
* 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (22 commits)
pata_serverworks: disable 64-KB DMA transfers on Broadcom OSB4 IDE Controller
libata: Un-break ATA blacklist
AHCI: Do not acquire ata_host::lock from single IRQ handler
AHCI: Optimize single IRQ interrupt processing
AHCI: Do not read HOST_IRQ_STAT reg in multi-MSI mode
AHCI: Make few function names more descriptive
AHCI: Move host activation code into ahci_host_activate()
AHCI: Move ahci_host_activate() function to libahci.c
AHCI: Pass SCSI host template as arg to ahci_host_activate()
ata: pata_imx: Use the SIMPLE_DEV_PM_OPS() macro
AHCI: Cleanup checking of multiple MSIs/SLM modes
libata-sff: Fix controllers with no ctl port
ahci_xgene: Fix the error print invalid resource for APM X-Gene SoC AHCI SATA Host Controller driver.
libata: change ata_<foo>_printk routines to return void
ata: qcom: Add device tree bindings information
ahci-platform: Bump max number of clocks to 5
ahci: ahci_p5wdh_workaround - constify DMI table
libahci_platform: Staticize ahci_platform_<en/dis>able_phys()
pata_platform: Remove useless irq_flags field
pata_of_platform: Remove "electra-ide" quirk
...
lib/glob.c provides a new glob_match() function, with arguments in
(pattern, string) order. It replaced a private function with arguments
in (string, pattern) order, but I didn't swap the call site...
The result was the entire ATA blacklist was effectively disabled.
The lesson for today is "I f***ed up *how* badly *how* many months ago?",
er, I mean "Nobody Tests RC Kernels On Legacy Hardware".
This was not a subtle break, but it made it through an entire RC
cycle unreported, presumably because all the people doing testing
have full-featured hardware.
(FWIW, the reason for the argument swap was because fnmatch() does it that
way, and for a while implementing a full fnmatch() was being considered.)
Fixes: 428ac5fc05 (libata: Use glob_match from lib/glob.c)
Reported-by: Steven Honeyman <stevenhoneyman@gmail.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=71371#c21
Signed-off-by: George Spelvin <linux@horizon.com>
Cc: <stable@vger.kernel.org> # 3.17
Tested-by: Steven Honeyman <stevenhoneyman@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>