[ Upstream commit 864449eea7c600596e305ffdc4a6a846414b222c ]
The firmware event workqueue should not be marked as WQ_MEM_RECLAIM
as it's doesn't need to make forward progress under memory pressure.
In the current state it will result in a deadlock if the device had been
forcefully removed.
Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f49d4aed1315a7b766d855f1367142e682b0cc87 ]
1. In IO path, setting of "ATA command pending" flag early before device
removal, invalid device handle etc., checks causes any new commands
to be always returned with SAM_STAT_BUSY and when the driver removes
the drive the SML issues SYNC Cache command and that command is
always returned with SAM_STAT_BUSY and thus making SYNC Cache command
to requeued.
2. If the driver gets an ATA PT command for a SATA drive then the driver
set "ATA command pending" flag in device specific data structure not
to allow any further commands until the ATA PT command is completed.
However, after setting the flag if the driver decides to return the
command back to upper layers without actually issuing to the firmware
(i.e., returns from qcmd failure return paths) then the corresponding
flag is not cleared and this prevents the driver from sending any new
commands to the drive.
This patch fixes above two issues by setting of "ATA command pending"
flag after checking for whether device deleted, invalid device handle,
device busy with task management. And by setting "ATA command pending"
flag to false in all of the qcmd failure return paths after setting the
flag.
Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2ce9a3645299ba1752873d333d73f67620f4550b ]
Whenever an I/O for a RAID volume fails with IOCStatus
MPI2_IOCSTATUS_SCSI_IOC_TERMINATED and SCSIStatus equal to
(MPI2_SCSI_STATE_TERMINATED | MPI2_SCSI_STATE_NO_SCSI_STATUS) then
return the I/O to SCSI midlayer with "DID_RESET" (i.e. retry the IO
infinite times) set in the host byte.
Previously, the driver was completing the I/O with "DID_SOFT_ERROR"
which causes the I/O to be quickly retried. However, firmware needed
more time and hence I/Os were failing.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f2e767bb5d6ee0d988cb7d4e54b0b21175802b6b upstream.
The firmware or device, possibly under a heavy I/O load, can return on a
partial unaligned boundary. Scsi-ml expects these requests to be
completed on an alignment boundary. Scsi-ml blindly requeues the I/O
without checking the alignment boundary of the I/O request for the
remaining bytes. This leads to errors, since devices cannot perform
non-aligned read/write operations.
This patch fixes the issue in the driver. It aligns unaligned
completions of FS requests, by truncating them to the nearest alignment
boundary.
[mkp: simplified if statement]
Reported-by: Mauricio Faria De Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ffb58456589443ca572221fabbdef3db8483a779 upstream.
mpt3sas has a firmware failure where it can only handle one pass through
ATA command at a time. If another comes in, contrary to the SAT
standard, it will hang until the first one completes (causing long
commands like secure erase to timeout). The original fix was to block
the device when an ATA command came in, but this caused a regression
with
commit 669f044170d8933c3d66d231b69ea97cb8447338
Author: Bart Van Assche <bart.vanassche@sandisk.com>
Date: Tue Nov 22 16:17:13 2016 -0800
scsi: srp_transport: Move queuecommand() wait code to SCSI core
So fix the original fix of the secure erase timeout by properly
returning SAM_STAT_BUSY like the SAT recommends. The original patch
also had a concurrency problem since scsih_qcmd is lockless at that
point (this is fixed by using atomic bitops to set and test the flag).
[mkp: addressed feedback wrt. test_bit and fixed whitespace]
Fixes: 18f6084a989ba1b (mpt3sas: Fix secure erase premature termination)
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reported-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Joe Korty <joe.korty@ccur.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ffdadd68af5a397b8a52289ab39d62e1acb39e63 upstream.
MPI2 controllers sometimes got lost (i.e. disappear from
/sys/bus/pci/devices) if ASMP is enabled.
Signed-off-by: Slava Kardakov <ojab@ojab.ru>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=60644
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7ff723ad0f87feba43dda45fdae71206063dd7d4 upstream.
While issuing any ATA passthrough command to firmware the driver will
block the device. But it will unblock the device only if the I/O
completes through the ISR path. If a controller reset occurs before
command completion the device will remain in blocked state.
Make sure we unblock the device following a controller reset if an ATA
passthrough command was queued.
[mkp: clarified patch description]
Fixes: ac6c2a93bd07 ("mpt3sas: Fix for SATA drive in blocked state, after diag reset")
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 18f6084a989ba1b38702f9af37a2e4049a924be6 upstream.
This is a work around for a bug with LSI Fusion MPT SAS2 when perfoming
secure erase. Due to the very long time the operation takes, commands
issued during the erase will time out and will trigger execution of the
abort hook. Even though the abort hook is called for the specific
command which timed out, this leads to entire device halt
(scsi_state terminated) and premature termination of the secure erase.
Set device state to busy while ATA passthrough commands are in progress.
[mkp: hand applied to 4.9/scsi-fixes, tweaked patch description]
Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Cc: <linux-scsi@vger.kernel.org>
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Chaitra P B <chaitra.basappa@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6d3a56ed098566bc83d6c2afa74b4199c12ea074 upstream.
While merging mpt3sas & mpt2sas code, we added the is_warpdrive check
condition on the wrong line
---------------------------------------------------------------------------
scsih_target_alloc(struct scsi_target *starget)
sas_target_priv_data->handle = raid_device->handle;
sas_target_priv_data->sas_address = raid_device->wwid;
sas_target_priv_data->flags |= MPT_TARGET_FLAGS_VOLUME;
- raid_device->starget = starget;
+ sas_target_priv_data->raid_device = raid_device;
+ if (ioc->is_warpdrive)
+ raid_device->starget = starget;
}
spin_unlock_irqrestore(&ioc->raid_device_lock, flags);
return 0;
------------------------------------------------------------------------------
That check should be for the line sas_target_priv_data->raid_device =
raid_device;
Due to above hunk, we are not initializing raid_device's starget for
raid volumes, and so during raid disk deletion driver is not calling
scsi_remove_target() API as driver observes starget field of
raid_device's structure as NULL.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Fixes: 7786ab6aff ("mpt3sas: Ported WarpDrive product SSS6200 support")
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d667f72b2a20bbac72bec0ab11467fc70bb0f1f upstream.
In _scsih_io_done() we test if the ioc->logging_level does _not_ have
the MPT_DEBUG_REPLY bit set and if it hasn't we print the debug
messages. This unfortunately is the wrong way around.
Note, the actual bug is older than af0094115 but this commit removed the
CONFIG_SCSI_MPT3SAS_LOGGING Kconfig option which hid the bug.
Fixes: af0094115 'mpt2sas, mpt3sas: Remove SCSI_MPTXSAS_LOGGING entry from Kconfig'
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 03d1fb3a65783979f23bd58b5a0387e6992d9e26 ]
Track msix of each IO and use the same msix for issuing abort to timed
out IO. With this driver will process IO's reply first followed by TM.
Signed-off-by: Suganath prabu Subramani <suganath-prabu.subramani@avagotech.com>
Signed-off-by: Chaitra P B <chaitra.basappa@avagotech.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Before enabling MPI2_SCSIIO_CONTROL_TLR_ON flag in MPI SCSI IO request
message, check whether TLR is enabled on the drive using
'sas_is_tlr_enabled' API.
Actually in the driver code, driver is using below API's
1. sas_enable_tlr() - to enable the TLR
2. sas_disable_tlr() - to disable the TLR
3. sas_is_tlr_enabled() - to check whether TLR is enabled or not.
but in scsih_qcmd() we have missed to use sas_is_tlr_enabled() API,
instead we checking for TLR bit from flag field of driver's 'struct
MPT3SAS_DEVIC' structure. which is corrected with this patch.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Modified the mpt3sas driver to have a single driver module which
supports both SAS 2.0 & SAS 3.0 HBA devices.
* Added SAS 2.0 HBA device IDs to the mpt3sas_pci_table pci table.
* Created two separate SCSI host templates for SAS2 and SAS3 HBAs so
that, during the driver load time driver can use corresponding host
template(based the pci device ID) while registering a scsi host
adapter instance for that pci device.
* Registered two IOCTL devices, mpt2ctl is for SAS2 HBAs & mpt3ctl for
SAS3 HBAs. Also updated the code to make sure that mpt2ctl device
processes only those ioctl cmds issued for the SAS2 HBAs and mpt3ctl
device processes only those ioctl cmds issued for the SAS3 HBAs.
* Added separate indexing for SAS2 and SAS3 HBAs.
* Replaced compile time check 'MPT2SAS_SCSI' to run time check
'hba_mpi_version_belonged' whereever needed.
* Aliased this merged driver to mpt2sas using MODULE_ALIAS.
* Moved global varaible 'driver_name' to per adapter instance variable.
* Created two raid function template and used corresponding raid
function templates based on the run time check
'hba_mpi_version_belonged'.
* Moved mpt2sas_warpdrive.c file from mpt2sas to mpt3sas folder and
renamed it as mpt3sas_warpdrive.c.
* Also renamed the functions in mpt3sas_warpdrive.c file to follow
current driver function name convention.
* Updated the Makefile to build mpt3sas_warpdrive.o file for these
WarpDrive-specific functions.
* Also in function mpt3sas_setup_direct_io(), used sector_div() API
instead of division operator (which gives compilation errors on 32 bit
machines).
* Removed mpt2sas files, mpt2sas directory & mpt3sas_module.c file.
* Added module parameter 'hbas_to_enumerate' which permits using this
merged driver as a legacy mpt2sas driver or as a legacy mpt3sas
driver.
Here are the available options for this module parameter:
0 - Merged driver which enumerates both SAS 2.0 & SAS 3.0 HBAs
1 - Acts as legacy mpt2sas driver, which enumerates only SAS 2.0 HBAs
2 - Acts as legacy mpt3sas driver, which enumerates only SAS 3.0 HBAs
* Removed mpt2sas entries from SCSI's Kconfig and Makefile files.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
setpci reset on nytro warpdrive card along with sysfs access and cli
ioctl access resulted in kernel oops
1. pci_access_mutex lock added to provide synchronization between IOCTL,
sysfs, PCI resource handling path
2. gioc_lock spinlock to protect list operations over multiple
controllers
This patch is ported from commit 6229b414b3 ("mpt2sas: setpci reset
kernel oops fix").
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The fw_event_work struct is concurrently referenced at shutdown. Add a
refcount to protect it and refactor the code to use it.
Additionally, refactor _scsih_fw_event_cleanup_queue() such that it no
longer iterates over the list without holding the lock since
_firmware_event_work() concurrently deletes items from the list.
This patch is ported from commit 008549f6e8 ("mpt2sas: Refcount
fw_events and fix unsafe list usage"). These changes are also required
for mpt3sas.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
sas_device objects can be referenced concurrently throughout the driver.
We need a way to make sure threads can't delete them out from under each
other. This patch adds the refcount and refactors the code to use it.
Additionally, we cannot iterate over the sas_device_list without holding
the lock or we risk corrupting random memory if items are added or
deleted as we iterate. This patch refactors _scsih_probe_sas() to use
the sas_device_list in a safe way.
This patch is ported from the following mpt2sas driver commit
d224fe0d60 ("mpt2sas: Refcount sas_device objects and fix unsafe list
usage").
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Ported the following list of WarpDrive-specific patches:
1. commit 0bdccdb0a0 ("mpt2sas: WarpDrive
New product SSS6200 support added")
2. commit 82a4525812 ("mpt2sas: WarpDrive
Infinite command retries due to wrong scsi command entry in MPI
message")
3. commit ba96bd0b1d ("mpt2sas: Support
for greater than 2TB capacity WarpDrive")
4. commit 4da7af9494 ("mpt2sas: Do not
retry a timed out direct IO for Warpdrive")
5. commit daeaa9df92 ("mpt2sas: Avoid type
casting for direct I/O commands").
Also set the mpt2_ioctl_iocinfo adapter_type to:
1. MPT3_IOCTL_INTERFACE_SAS3 for Gen3 HBAs
2. MPT2_IOCTL_INTERFACE_SAS2_SSS6200 for Warp Drive
3. MPT2_IOCTL_INTERFACE_SAS2 for other Gen2 HBAs
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
1. Do not enable MSI-X vectors for SAS2008 B0 controllers
2. Enable a single MSI-X vector for the following controller:
a. SAS2004
b. SAS2008
c. SAS2008_1
d. SAS2008_2
e. SAS2008_3
f. SAS2116_1
g. SAS2116_2
3. Enable Combined Reply Post Queue Support (i.e. 96 MSI-X vectors)
for Gen3 Invader/Fury C0 and above revision HBAs
4. Enable Combined Reply Post Queue Support (i.e. 96 MSI-X vectors)
for all Intruder and Cutlass HBAs
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Avoid sending PHYDISK_HIDDEN RAID action requests to SAS2 controllers
since they don't support it.
Also enable fast_path only for SAS3 HBAs.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently there is a logging level option provided for each of our
drivers in the kernel configuration utility. Users can enable this
option to get more verbose information. By default it is enabled.
Only when this option is enabled will the functions which display the
required information get compiled in.
As we are merging the both drivers we can no longer provide this
configuration option. Remove the SCSI_MPTXSAS_LOGGING entry from Kconfig
and unconditionally enable logging (by removing the #ifdef
CONFIG_SCSI_MPT3SAS_LOGGING preprocessor check conditions) so that all
functions which are defined to display more verbose information get
compiled in.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
1. Use 'hba_mpi_version_belonged' IOC varable to uniquely identify each
individual generation driver functionality at runtime.
2. Declare global variable 'driver_name' and use this variable while
reserving PCI regions and while allocating the IRQs.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Created a mpt3sas_module.c file for mpt3sas driver where it can register
SAS3 HBA devices with PCI, SML, IOCTL subsystems. Also removed the
corresponding interfaces from mpt3sas_scsih.c file.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
1. Added mpt2sas driver related macros in mpt3sas header files
2. Made scsi host's, raid class', pci's, ioctl's callback functions
global so that both drivers can use them.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Issue: When the disks are getting discovered and assigned device
handles by the kernel, a device block followed by an unblock
(due to broadcast primitives) issued by the driver is
interspersed by the kernel changing the state of the device.
Therefore the unblock by the driver results in a no operation
within the kernel API.
To fix this one, the below patch checks the return of the unblock API
and performs a block followed by an unblock to unfreeze the block
layer's I/O queue. Sufficient checks and prints are also added in the
driver to identify this condition caused by the kernel.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Created a thread using alloc_ordered_workqueue() API in order to process
the works from firmware Work-queue sequentially instead of
create_singlethread_workqueue() API.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Joe Lawrence <joe.lawrence@stratus.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
For any SCSI command, if the driver receives
IOC status = SCSI_IOC_TERMINATED and log info = 0x32010081 then
that command will be completed with DID_RESET host status.
The definition of this log info value is
"Virtual IO has failed and has to be retried".
Firmware will provide this log info value with IOC Status
"SCSI_IOC_TERMINATED", whenever a drive (with is a part of a volume)
is pulled and pushed back within some minimal delay.
With this log info value, firmware informs the driver to retry the
failed IO command infinite times, so to provide some time for the
firmware to discover the reinserted drive successfully instated of
just retrying failed command for five times(doesn't giving enough
time for firmware to complete the drive discovery) and failing the
IO permanently even though drive came back successfully.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
This Patch will provide more details of the devices such as slot number,
enclosure logical id, enclosure level & connector name in the following
scenarios,
- When end device is added in the topology,
- When the end device is removed from the setup,
- When the SCSI mid layer issues TASK ABORT/ DEVICE RESET/ TARGET RESET during
error handling,
- When any command to the device fails with Sense key Hardware error or Medium
error or Unit Attention,
- When firmware returns device error or device not ready status for the end
device,
- When a Predicted fault is detected on an end device.
This information can be used by the user to identify the location of the
desired drive in the topology.
Driver will get these information by reading the sas device page0.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
During hot-plugging of a disk(having a flaky link), the disk addition
stops and any further disk addition or removal doesn't happen on that
controller.
This is because, when driver receives DELAY_NOT_RESPONDING event for a disk
while it is undergoing addition at the SCSI Transport layer, the driver
would block the I/O to that disk resulting in a deadlock. i.e the disk
addition work couldn't be completed at the SCSI Transport Layer as it
can't send any I/Os (such as Inquiry, Report LUNs etc) to the disk as
I/Os are blocked to this drive. Also any subsequent device removal
(TARGET_NOT_RESPONDING) or link update(RC_PHY_CHANGED) event couldn't be
processed as they are in the queue to get processed after disk addition
event.
Description of Change:
Don't block the drive when drive addition is under the control of SML.
So that SML won't be blocked of issuing the device dicovery commands
(such as Inquiry, Report LUNs etc).
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Copyright, Trademark & Confidentiality legal statements throughout the
source code changed from LSI to Avago.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
When a flaky disk is there in a topology then during driver load,
discovery related I/O times out; which results in SCSI error recovery
initiating host reset and then the controller won't see any disk.
In this patch, The driver would return FAILED status to the host reset
initiated due to discovery related I/O timeout if ioc->is_driver_loading
is set. This flag would be set until we exit out of scsih_scan_finished().
i.e.
During device discovery if one of the disk is flaky
(which responds to some discovery commands and doesn't respond to some)
the driver wouldn't perform host reset for discovery related I/O timeout.
Instead it would return Failure for the host reset resulting in the
flaky disk getting removed by the SCSI Mid layer,
so other disks would be added correctly.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
This patch will log a message when driver receives "Temperature Threshold
exceeded" event from any temperature sensor.
The message will look similar to like:
mpt3sas0: Temperature Threshold flags a b c d exceeded for Sensor: x !!!
mpt3sas0: Current Temp In Celsius: y
where a b c d are threshold flags 0 1 2 3
Change_set:
1. Get the number of sensor count of this IOC by reading IO Unit page 8 at
driver initialization time.
2. Also unmask the Temperature Threshold Event at driver initialization
time
3. Whenever a MPI2_EVENT_TEMP_THRESHOLD event is received from the
firmware, then print the sensor number, the maximum threshold number it
has exceed and the current temperature of this sensor.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Since we got rid of ordered tag support in 2010 the prime use case of
switching on and off ordered tags has been obsolete. The other function
of enabling/disabling tagging entirely has only been correctly implemented
by the 53c700 driver and isn't generally useful.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Merge two functions, and remove overly verbose debugging output that pokes
into mid-layer internal structures.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Drop the now unused reason argument from the ->change_queue_depth method.
Also add a return value to scsi_adjust_queue_depth, and rename it to
scsi_change_queue_depth now that it can be used as the default
->change_queue_depth implementation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
All drivers use the implementation for ramping the queue up and down, so
instead of overloading the change_queue_depth method call the
implementation diretly if the driver opts into it by setting the
track_queue_depth flag in the host template.
Note that a few drivers validated the new queue depth in their
change_queue_depth method, but as we never go over the queue depth
set during slave_configure or the sysfs file this isn't nessecary
and can safely be removed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Venkatesh Srinivas <venkateshs@google.com>
Remove the tagged argument from scsi_adjust_queue_depth, and just let it
handle the queue depth. For most drivers those two are fairly separate,
given that most modern drivers don't care about the SCSI "tagged" status
of a command at all, and many old drivers allow queuing of multiple
untagged commands in the driver.
Instead we start out with the ->simple_tags flag set before calling
->slave_configure, which is how all drivers actually looking at
->simple_tags except for one worke anyway. The one other case looks
broken, but I've kept the behavior as-is for now.
Except for that we only change ->simple_tags from the ->change_queue_type,
and when rejecting a tag message in a single driver, so keeping this
churn out of scsi_adjust_queue_depth is a clear win.
Now that the usage of scsi_adjust_queue_depth is more obvious we can
also remove all the trivial instances in ->slave_alloc or ->slave_configure
that just set it to the cmd_per_lun default.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove the ordered_tags field, we haven't been issuing ordered tags based
on it since the big barrier rework in 2010.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Most drivers use exactly the same implementation, so provide it as a
library function.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
In _scsih_probe, propagate the return value from scsi_add_host.
In mpt3sas, avoid calling list_del twice if that returns an
error, which causes list_del corruption warnings if an error
is returned.
Tested with blk-mq and scsi-mq patches to properly cleanup
from and propagate blk_mq_init_rq_map errors.
Signed-off-by: Robert Elliott <elliott@hp.com>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Copyright in driver sources is updated for year the 2014.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Added code to send an SEP message that turns off the Predictive
Failure LED when a drive is removed (if Predictive Failure LED was turned on).
Added a new flag 'pfa_led_on' per device that tracks the status of Predictive
Failure LED. When the drive is removed, this flag is checked and
sends an SEP message to turn off the respective Predictive Failure LED.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
We should prefer `struct pci_device_id` over `DEFINE_PCI_DEVICE_TABLE` to
meet kernel coding style guidelines. This issue was reported by checkpatch.
A simplified version of the semantic patch that makes this change is as
follows (http://coccinelle.lip6.fr/):
// <smpl>
@@
identifier i;
declarer name DEFINE_PCI_DEVICE_TABLE;
initializer z;
@@
- DEFINE_PCI_DEVICE_TABLE(i)
+ const struct pci_device_id i[]
= z;
// </smpl>
[bhelgaas: add semantic patch]
Signed-off-by: Benoit Taine <benoit.taine@lip6.fr>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In _scsih_probe, delay the call to scsi_add_host until the host has been
fully set up.
Otherwise, the default .can_queue value of 1 causes scsi-mq to set the block
layer request queue size to its minimum size, resulting in awful performance.
In _scsih_probe error handling, call mpt3sas_base_detach rather than
scsi_remove_host to properly clean up in reverse order.
In _scsih_remove, call scsi_remove_host earlier to clean up in reverse order.
Signed-off-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@avagotech.com>
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tack the firmware reply event_data payload to the end of its
corresponding struct fw_event_work allocation. This matches the
convention in the mptfusion driver and simplifies the code.
This avoids the following smatch warning:
drivers/scsi/mpt3sas/mpt3sas_scsih.c:2519
mpt3sas_send_trigger_data_event() warn: possible memory leak of
'fw_event'
Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
In _scsih_{slave,target}_alloc, an incorrect structure type is passed
to sizeof() when allocating storage for hostdata. Luckily larger
structure types were used, so at least the wrong sizes were safe:
struct scsi_device (1784 bytes) > struct MPT3SAS_DEVICE (24 bytes)
struct scsi_target (760 bytes) > struct MPT3SAS_TARGET (32 bytes)
This fixes the following smatch warnings:
drivers/scsi/mpt3sas/mpt3sas_scsih.c:1166 _scsih_target_alloc()
warn: struct type mismatch 'MPT3SAS_TARGET vs scsi_target'
drivers/scsi/mpt3sas/mpt3sas_scsih.c:1280 _scsih_slave_alloc()
warn: struct type mismatch 'MPT3SAS_DEVICE vs scsi_device'
Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Now that we're using 64-bit LUNs internally we need to increase
the size of max_luns to 64 bits, too.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Removing the host_lock from the I/O submission path gives a huge
scalability improvement.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Reviewed-by: Praveen Krishnamoorthy <Praveen.krishnamoorthy@lsi.com>
Acked-by: Sreekanth Reddy <Sreekanth.reddy@lsi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
The mpt3sas_scsih_issue_tm() function does not use the 'serial_number'
argument passed to it. Removing it removes the last vestiges of the
scsi_cmnd's serial_number field from this driver.
Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Reviewed-by: Nicholas Bellinger <nab@linux-iscsi.org>
Reviewed-by: Praveen Krishnamoorthy <Praveen.krishnamoorthy@lsi.com>
Acked-by: Sreekanth Reddy <Sreekanth.reddy@lsi.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
If mpt3sas_base_map_resources takes an early error path then its
counterpart, mpt3sas_base_free_resources needs to be careful about
cleaning up:
1 - _base_mask_interrupts and _base_make_ioc_ready require memory
mapped I/O registers, make sure that this is true.
2 - _base_free_irq iterates over the adapter's reply_queue_list, so
move this list head initialization out of _base_enable_msix to
_scsih_probe so this will always be safe.
3 - check that the controller PCI device and its BARs have been
enabled before disabling them.
Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
When Async scanning mode is enabled and device scanning is in progress,
devices should not be removed. But in actuality, devices are removed but
their transport layer entries are not removed. This causes error to add
the same device to the transport layer after host reset or diagnostic
reset.
So, in this patch, modified the code in such a way that device is not removed
when Async scanning mode is enabled and device scanning is in progress.
Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>