Commit graph

70 commits

Author SHA1 Message Date
Srinivasarao P
f9cff13b5d Merge android-4.4.135 (c9d74f2) into msm-4.4
* refs/heads/tmp-c9d74f2
  Linux 4.4.135
  Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU"
  Revert "vti4: Don't override MTU passed on link creation via IFLA_MTU"
  Linux 4.4.134
  s390/ftrace: use expoline for indirect branches
  kdb: make "mdr" command repeat
  Bluetooth: btusb: Add device ID for RTL8822BE
  ASoC: samsung: i2s: Ensure the RCLK rate is properly determined
  regulator: of: Add a missing 'of_node_put()' in an error handling path of 'of_regulator_match()'
  scsi: lpfc: Fix frequency of Release WQE CQEs
  scsi: lpfc: Fix soft lockup in lpfc worker thread during LIP testing
  scsi: lpfc: Fix issue_lip if link is disabled
  netlabel: If PF_INET6, check sk_buff ip header version
  selftests/net: fixes psock_fanout eBPF test case
  perf report: Fix memory corruption in --branch-history mode --branch-history
  perf tests: Use arch__compare_symbol_names to compare symbols
  x86/apic: Set up through-local-APIC mode on the boot CPU if 'noapic' specified
  drm/rockchip: Respect page offset for PRIME mmap calls
  MIPS: Octeon: Fix logging messages with spurious periods after newlines
  audit: return on memory error to avoid null pointer dereference
  crypto: sunxi-ss - Add MODULE_ALIAS to sun4i-ss
  clk: samsung: exynos3250: Fix PLL rates
  clk: samsung: exynos5250: Fix PLL rates
  clk: samsung: exynos5433: Fix PLL rates
  clk: samsung: exynos5260: Fix PLL rates
  clk: samsung: s3c2410: Fix PLL rates
  media: cx25821: prevent out-of-bounds read on array card
  udf: Provide saner default for invalid uid / gid
  PCI: Add function 1 DMA alias quirk for Marvell 88SE9220
  serial: arc_uart: Fix out-of-bounds access through DT alias
  serial: fsl_lpuart: Fix out-of-bounds access through DT alias
  serial: imx: Fix out-of-bounds access through serial port index
  serial: mxs-auart: Fix out-of-bounds access through serial port index
  serial: samsung: Fix out-of-bounds access through serial port index
  serial: xuartps: Fix out-of-bounds access through DT alias
  rtc: tx4939: avoid unintended sign extension on a 24 bit shift
  staging: rtl8192u: return -ENOMEM on failed allocation of priv->oldaddr
  hwrng: stm32 - add reset during probe
  enic: enable rq before updating rq descriptors
  clk: rockchip: Prevent calculating mmc phase if clock rate is zero
  media: em28xx: USB bulk packet size fix
  dmaengine: pl330: fix a race condition in case of threaded irqs
  media: s3c-camif: fix out-of-bounds array access
  media: cx23885: Set subdev host data to clk_freq pointer
  media: cx23885: Override 888 ImpactVCBe crystal frequency
  ALSA: vmaster: Propagate slave error
  x86/devicetree: Fix device IRQ settings in DT
  x86/devicetree: Initialize device tree before using it
  usb: gadget: composite: fix incorrect handling of OS desc requests
  usb: gadget: udc: change comparison to bitshift when dealing with a mask
  gfs2: Fix fallocate chunk size
  cdrom: do not call check_disk_change() inside cdrom_open()
  hwmon: (pmbus/adm1275) Accept negative page register values
  hwmon: (pmbus/max8688) Accept negative page register values
  perf/core: Fix perf_output_read_group()
  ASoC: topology: create TLV data for dapm widgets
  powerpc: Add missing prototype for arch_irq_work_raise()
  usb: gadget: ffs: Execute copy_to_user() with USER_DS set
  usb: gadget: ffs: Let setup() return USB_GADGET_DELAYED_STATUS
  usb: dwc2: Fix interval type issue
  ipmi_ssif: Fix kernel panic at msg_done_handler
  PCI: Restore config space on runtime resume despite being unbound
  MIPS: ath79: Fix AR724X_PLL_REG_PCIE_CONFIG offset
  xhci: zero usb device slot_id member when disabling and freeing a xhci slot
  KVM: lapic: stop advertising DIRECTED_EOI when in-kernel IOAPIC is in use
  i2c: mv64xxx: Apply errata delay only in standard mode
  ACPICA: acpi: acpica: fix acpi operand cache leak in nseval.c
  ACPICA: Events: add a return on failure from acpi_hw_register_read
  bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set
  zorro: Set up z->dev.dma_mask for the DMA API
  clk: Don't show the incorrect clock phase
  cpufreq: cppc_cpufreq: Fix cppc_cpufreq_init() failure path
  usb: dwc3: Update DWC_usb31 GTXFIFOSIZ reg fields
  arm: dts: socfpga: fix GIC PPI warning
  virtio-net: Fix operstate for virtio when no VIRTIO_NET_F_STATUS
  ima: Fallback to the builtin hash algorithm
  ima: Fix Kconfig to select TPM 2.0 CRB interface
  ath10k: Fix kernel panic while using worker (ath10k_sta_rc_update_wk)
  net/mlx5: Protect from command bit overflow
  selftests: Print the test we're running to /dev/kmsg
  tools/thermal: tmon: fix for segfault
  powerpc/perf: Fix kernel address leak via sampling registers
  powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer
  rtc: hctosys: Ensure system time doesn't overflow time_t
  hwmon: (nct6775) Fix writing pwmX_mode
  parisc/pci: Switch LBA PCI bus from Hard Fail to Soft Fail mode
  m68k: set dma and coherent masks for platform FEC ethernets
  powerpc/mpic: Check if cpu_possible() in mpic_physmask()
  ACPI: acpi_pad: Fix memory leak in power saving threads
  xen/acpi: off by one in read_acpi_id()
  btrfs: fix lockdep splat in btrfs_alloc_subvolume_writers
  Btrfs: fix copy_items() return value when logging an inode
  btrfs: tests/qgroup: Fix wrong tree backref level
  Bluetooth: btusb: Add USB ID 7392:a611 for Edimax EW-7611ULB
  net: bgmac: Fix endian access in bgmac_dma_tx_ring_free()
  rtc: snvs: Fix usage of snvs_rtc_enable
  sparc64: Make atomic_xchg() an inline function rather than a macro.
  fscache: Fix hanging wait on page discarded by writeback
  KVM: VMX: raise internal error for exception during invalid protected mode state
  sched/rt: Fix rq->clock_update_flags < RQCF_ACT_SKIP warning
  ocfs2/dlm: don't handle migrate lockres if already in shutdown
  btrfs: Fix possible softlock on single core machines
  Btrfs: fix NULL pointer dereference in log_dir_items
  Btrfs: bail out on error during replay_dir_deletes
  mm: fix races between address_space dereference and free in page_evicatable
  mm/ksm: fix interaction with THP
  dp83640: Ensure against premature access to PHY registers after reset
  scsi: aacraid: Insure command thread is not recursively stopped
  cpufreq: CPPC: Initialize shared perf capabilities of CPUs
  Force log to disk before reading the AGF during a fstrim
  sr: get/drop reference to device in revalidate and check_events
  swap: divide-by-zero when zero length swap file on ssd
  fs/proc/proc_sysctl.c: fix potential page fault while unregistering sysctl table
  x86/pgtable: Don't set huge PUD/PMD on non-leaf entries
  sh: fix debug trap failure to process signals before return to user
  net: mvneta: fix enable of all initialized RXQs
  net: Fix untag for vlan packets without ethernet header
  mm/kmemleak.c: wait for scan completion before disabling free
  llc: properly handle dev_queue_xmit() return value
  net-usb: add qmi_wwan if on lte modem wistron neweb d18q1
  net/usb/qmi_wwan.c: Add USB id for lt4120 modem
  net: qmi_wwan: add BroadMobi BM806U 2020:2033
  ARM: 8748/1: mm: Define vdso_start, vdso_end as array
  batman-adv: fix packet loss for broadcasted DHCP packets to a server
  batman-adv: fix multicast-via-unicast transmission with AP isolation
  selftests: ftrace: Add a testcase for probepoint
  selftests: ftrace: Add a testcase for string type with kprobe_event
  selftests: ftrace: Add probe event argument syntax testcase
  mm/mempolicy.c: avoid use uninitialized preferred_node
  RDMA/ucma: Correct option size check using optlen
  perf/cgroup: Fix child event counting bug
  vti4: Don't override MTU passed on link creation via IFLA_MTU
  vti4: Don't count header length twice on tunnel setup
  batman-adv: fix header size check in batadv_dbg_arp()
  net: Fix vlan untag for bridge and vlan_dev with reorder_hdr off
  sunvnet: does not support GSO for sctp
  ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu
  workqueue: use put_device() instead of kfree()
  bnxt_en: Check valid VNIC ID in bnxt_hwrm_vnic_set_tpa().
  netfilter: ebtables: fix erroneous reject of last rule
  USB: OHCI: Fix NULL dereference in HCDs using HCD_LOCAL_MEM
  xen: xenbus: use put_device() instead of kfree()
  fbdev: Fixing arbitrary kernel leak in case FBIOGETCMAP_SPARC in sbusfb_ioctl_helper().
  scsi: sd: Keep disk read-only when re-reading partition
  scsi: mpt3sas: Do not mark fw_event workqueue as WQ_MEM_RECLAIM
  usb: musb: call pm_runtime_{get,put}_sync before reading vbus registers
  e1000e: allocate ring descriptors with dma_zalloc_coherent
  e1000e: Fix check_for_link return value with autoneg off
  watchdog: f71808e_wdt: Fix magic close handling
  KVM: PPC: Book3S HV: Fix VRMA initialization with 2MB or 1GB memory backing
  selftests/powerpc: Skip the subpage_prot tests if the syscall is unavailable
  Btrfs: send, fix issuing write op when processing hole in no data mode
  xen/pirq: fix error path cleanup when binding MSIs
  net/tcp/illinois: replace broken algorithm reference link
  gianfar: Fix Rx byte accounting for ndev stats
  sit: fix IFLA_MTU ignored on NEWLINK
  bcache: fix kcrashes with fio in RAID5 backend dev
  dmaengine: rcar-dmac: fix max_chunk_size for R-Car Gen3
  virtio-gpu: fix ioctl and expose the fixed status to userspace.
  r8152: fix tx packets accounting
  clocksource/drivers/fsl_ftm_timer: Fix error return checking
  nvme-pci: Fix nvme queue cleanup if IRQ setup fails
  netfilter: ebtables: convert BUG_ONs to WARN_ONs
  batman-adv: invalidate checksum on fragment reassembly
  batman-adv: fix packet checksum in receive path
  md/raid1: fix NULL pointer dereference
  media: dmxdev: fix error code for invalid ioctls
  x86/topology: Update the 'cpu cores' field in /proc/cpuinfo correctly across CPU hotplug operations
  locking/xchg/alpha: Fix xchg() and cmpxchg() memory ordering bugs
  regulatory: add NUL to request alpha2
  smsc75xx: fix smsc75xx_set_features()
  ARM: OMAP: Fix dmtimer init for omap1
  s390/cio: clear timer when terminating driver I/O
  s390/cio: fix return code after missing interrupt
  powerpc/bpf/jit: Fix 32-bit JIT for seccomp_data access
  kernel/relay.c: limit kmalloc size to KMALLOC_MAX_SIZE
  md: raid5: avoid string overflow warning
  locking/xchg/alpha: Add unconditional memory barrier to cmpxchg()
  usb: musb: fix enumeration after resume
  drm/exynos: fix comparison to bitshift when dealing with a mask
  md raid10: fix NULL deference in handle_write_completed()
  mac80211: round IEEE80211_TX_STATUS_HEADROOM up to multiple of 4
  NFC: llcp: Limit size of SDP URI
  ARM: OMAP1: clock: Fix debugfs_create_*() usage
  ARM: OMAP3: Fix prm wake interrupt for resume
  ARM: OMAP2+: timer: fix a kmemleak caused in omap_get_timer_dt
  scsi: qla4xxx: skip error recovery in case of register disconnect.
  scsi: aacraid: fix shutdown crash when init fails
  scsi: storvsc: Increase cmd_per_lun for higher speed devices
  selftests: memfd: add config fragment for fuse
  usb: dwc2: Fix dwc2_hsotg_core_init_disconnected()
  usb: gadget: fsl_udc_core: fix ep valid checks
  usb: gadget: f_uac2: fix bFirstInterface in composite gadget
  ARC: Fix malformed ARC_EMUL_UNALIGNED default
  scsi: qla2xxx: Avoid triggering undefined behavior in qla2x00_mbx_completion()
  scsi: mptfusion: Add bounds check in mptctl_hp_targetinfo()
  scsi: sym53c8xx_2: iterator underflow in sym_getsync()
  scsi: bnx2fc: Fix check in SCSI completion handler for timed out request
  scsi: ufs: Enable quirk to ignore sending WRITE_SAME command
  irqchip/gic-v3: Change pr_debug message to pr_devel
  locking/qspinlock: Ensure node->count is updated before initialising node
  tools/libbpf: handle issues with bpf ELF objects containing .eh_frames
  bcache: return attach error when no cache set exist
  bcache: fix for data collapse after re-attaching an attached device
  bcache: fix for allocator and register thread race
  bcache: properly set task state in bch_writeback_thread()
  cifs: silence compiler warnings showing up with gcc-8.0.0
  proc: fix /proc/*/map_files lookup
  arm64: spinlock: Fix theoretical trylock() A-B-A with LSE atomics
  RDS: IB: Fix null pointer issue
  xen/grant-table: Use put_page instead of free_page
  xen-netfront: Fix race between device setup and open
  MIPS: TXx9: use IS_BUILTIN() for CONFIG_LEDS_CLASS
  bpf: fix selftests/bpf test_kmod.sh failure when CONFIG_BPF_JIT_ALWAYS_ON=y
  ACPI: processor_perflib: Do not send _PPC change notification if not ready
  firmware: dmi_scan: Fix handling of empty DMI strings
  x86/power: Fix swsusp_arch_resume prototype
  IB/ipoib: Fix for potential no-carrier state
  mm: pin address_space before dereferencing it while isolating an LRU page
  asm-generic: provide generic_pmdp_establish()
  mm/mempolicy: add nodes_empty check in SYSC_migrate_pages
  mm/mempolicy: fix the check of nodemask from user
  ocfs2: return error when we attempt to access a dirty bh in jbd2
  ocfs2/acl: use 'ip_xattr_sem' to protect getting extended attribute
  ocfs2: return -EROFS to mount.ocfs2 if inode block is invalid
  ntb_transport: Fix bug with max_mw_size parameter
  RDMA/mlx5: Avoid memory leak in case of XRCD dealloc failure
  powerpc/numa: Ensure nodes initialized for hotplug
  powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes
  jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path
  HID: roccat: prevent an out of bounds read in kovaplus_profile_activated()
  scsi: fas216: fix sense buffer initialization
  Btrfs: fix scrub to repair raid6 corruption
  btrfs: Fix out of bounds access in btrfs_search_slot
  Btrfs: set plug for fsync
  ipmi/powernv: Fix error return code in ipmi_powernv_probe()
  mac80211_hwsim: fix possible memory leak in hwsim_new_radio_nl()
  kconfig: Fix expr_free() E_NOT leak
  kconfig: Fix automatic menu creation mem leak
  kconfig: Don't leak main menus during parsing
  watchdog: sp5100_tco: Fix watchdog disable bit
  nfs: Do not convert nfs_idmap_cache_timeout to jiffies
  dm thin: fix documentation relative to low water mark threshold
  tools lib traceevent: Fix get_field_str() for dynamic strings
  perf callchain: Fix attr.sample_max_stack setting
  tools lib traceevent: Simplify pointer print logic and fix %pF
  PCI: Add function 1 DMA alias quirk for Marvell 9128
  tracing/hrtimer: Fix tracing bugs by taking all clock bases and modes into account
  kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl
  ASoC: au1x: Fix timeout tests in au1xac97c_ac97_read()
  ALSA: hda - Use IS_REACHABLE() for dependency on input
  NFSv4: always set NFS_LOCK_LOST when a lock is lost.
  firewire-ohci: work around oversized DMA reads on JMicron controllers
  do d_instantiate/unlock_new_inode combinations safely
  xfs: remove racy hasattr check from attr ops
  kernel/signal.c: avoid undefined behaviour in kill_something_info
  kernel/sys.c: fix potential Spectre v1 issue
  kasan: fix memory hotplug during boot
  ipc/shm: fix shmat() nil address after round-down when remapping
  Revert "ipc/shm: Fix shmat mmap nil-page protection"
  xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent
  libata: blacklist Micron 500IT SSD with MU01 firmware
  libata: Blacklist some Sandisk SSDs for NCQ
  mmc: sdhci-iproc: fix 32bit writes for TRANSFER_MODE register
  ALSA: timer: Fix pause event notification
  aio: fix io_destroy(2) vs. lookup_ioctx() race
  affs_lookup(): close a race with affs_remove_link()
  KVM: Fix spelling mistake: "cop_unsuable" -> "cop_unusable"
  MIPS: Fix ptrace(2) PTRACE_PEEKUSR and PTRACE_POKEUSR accesses to o32 FGRs
  MIPS: ptrace: Expose FIR register through FP regset
  UPSTREAM: sched/fair: Consider RT/IRQ pressure in capacity_spare_wake

Conflicts:
	drivers/media/dvb-core/dmxdev.c
	drivers/scsi/sd.c
	drivers/scsi/ufs/ufshcd.c
	drivers/usb/gadget/function/f_fs.c
	fs/ecryptfs/inode.c

Change-Id: I15751ed8c82ec65ba7eedcb0d385b9f803c333f7
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
2018-06-27 14:42:55 +05:30
Tang Junhui
3d0dca33a2 bcache: fix kcrashes with fio in RAID5 backend dev
[ Upstream commit 60eb34ec5526e264c2bbaea4f7512d714d791caf ]

Kernel crashed when run fio in a RAID5 backend bcache device, the call
trace is bellow:
[  440.012034] kernel BUG at block/blk-ioc.c:146!
[  440.012696] invalid opcode: 0000 [#1] SMP NOPTI
[  440.026537] CPU: 2 PID: 2205 Comm: md127_raid5 Not tainted 4.15.0 #8
[  440.027441] Hardware name: HP ProLiant MicroServer Gen8, BIOS J06 07/16
/2015
[  440.028615] RIP: 0010:put_io_context+0x8b/0x90
[  440.029246] RSP: 0018:ffffa8c882b43af8 EFLAGS: 00010246
[  440.029990] RAX: 0000000000000000 RBX: ffffa8c88294fca0 RCX: 0000000000
0f4240
[  440.031006] RDX: 0000000000000004 RSI: 0000000000000286 RDI: ffffa8c882
94fca0
[  440.032030] RBP: ffffa8c882b43b10 R08: 0000000000000003 R09: ffff949cb8
0c1700
[  440.033206] R10: 0000000000000104 R11: 000000000000b71c R12: 00000000000
01000
[  440.034222] R13: 0000000000000000 R14: ffff949cad84db70 R15: ffff949cb11
bd1e0
[  440.035239] FS:  0000000000000000(0000) GS:ffff949cba280000(0000) knlGS:
0000000000000000
[  440.060190] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  440.084967] CR2: 00007ff0493ef000 CR3: 00000002f1e0a002 CR4: 00000000001
606e0
[  440.110498] Call Trace:
[  440.135443]  bio_disassociate_task+0x1b/0x60
[  440.160355]  bio_free+0x1b/0x60
[  440.184666]  bio_put+0x23/0x30
[  440.208272]  search_free+0x23/0x40 [bcache]
[  440.231448]  cached_dev_write_complete+0x31/0x70 [bcache]
[  440.254468]  closure_put+0xb6/0xd0 [bcache]
[  440.277087]  request_endio+0x30/0x40 [bcache]
[  440.298703]  bio_endio+0xa1/0x120
[  440.319644]  handle_stripe+0x418/0x2270 [raid456]
[  440.340614]  ? load_balance+0x17b/0x9c0
[  440.360506]  handle_active_stripes.isra.58+0x387/0x5a0 [raid456]
[  440.380675]  ? __release_stripe+0x15/0x20 [raid456]
[  440.400132]  raid5d+0x3ed/0x5d0 [raid456]
[  440.419193]  ? schedule+0x36/0x80
[  440.437932]  ? schedule_timeout+0x1d2/0x2f0
[  440.456136]  md_thread+0x122/0x150
[  440.473687]  ? wait_woken+0x80/0x80
[  440.491411]  kthread+0x102/0x140
[  440.508636]  ? find_pers+0x70/0x70
[  440.524927]  ? kthread_associate_blkcg+0xa0/0xa0
[  440.541791]  ret_from_fork+0x35/0x40
[  440.558020] Code: c2 48 00 5b 41 5c 41 5d 5d c3 48 89 c6 4c 89 e7 e8 bb c2
48 00 48 8b 3d bc 36 4b 01 48 89 de e8 7c f7 e0 ff 5b 41 5c 41 5d 5d c3 <0f> 0b
0f 1f 00 0f 1f 44 00 00 55 48 8d 47 b8 48 89 e5 41 57 41
[  440.610020] RIP: put_io_context+0x8b/0x90 RSP: ffffa8c882b43af8
[  440.628575] ---[ end trace a1fd79d85643a73e ]--

All the crash issue happened when a bypass IO coming, in such scenario
s->iop.bio is pointed to the s->orig_bio. In search_free(), it finishes the
s->orig_bio by calling bio_complete(), and after that, s->iop.bio became
invalid, then kernel would crash when calling bio_put(). Maybe its upper
layer's faulty, since bio should not be freed before we calling bio_put(),
but we'd better calling bio_put() first before calling bio_complete() to
notify upper layer ending this bio.

This patch moves bio_complete() under bio_put() to avoid kernel crash.

[mlyle: fixed commit subject for character limits]

Reported-by: Matthias Ferdinand <bcache@mfedv.net>
Tested-by: Matthias Ferdinand <bcache@mfedv.net>
Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-05-30 07:49:02 +02:00
Srinivasarao P
2d309c994d Merge android-4.4.107 (79f138a) into msm-4.4
* refs/heads/tmp-79f138a
  Linux 4.4.107
  ath9k: fix tx99 potential info leak
  IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop
  RDMA/cma: Avoid triggering undefined behavior
  macvlan: Only deliver one copy of the frame to the macvlan interface
  udf: Avoid overflow when session starts at large offset
  scsi: bfa: integer overflow in debugfs
  scsi: sd: change allow_restart to bool in sysfs interface
  scsi: sd: change manage_start_stop to bool in sysfs interface
  vt6655: Fix a possible sleep-in-atomic bug in vt6655_suspend
  scsi: scsi_devinfo: Add REPORTLUN2 to EMC SYMMETRIX blacklist entry
  raid5: Set R5_Expanded on parity devices as well as data.
  pinctrl: adi2: Fix Kconfig build problem
  usb: musb: da8xx: fix babble condition handling
  tty fix oops when rmmod 8250
  powerpc/perf/hv-24x7: Fix incorrect comparison in memord
  scsi: hpsa: destroy sas transport properties before scsi_host
  scsi: hpsa: cleanup sas_phy structures in sysfs when unloading
  PCI: Detach driver before procfs & sysfs teardown on device remove
  xfs: fix incorrect extent state in xfs_bmap_add_extent_unwritten_real
  xfs: fix log block underflow during recovery cycle verification
  l2tp: cleanup l2tp_tunnel_delete calls
  bcache: fix wrong cache_misses statistics
  bcache: explicitly destroy mutex while exiting
  GFS2: Take inode off order_write list when setting jdata flag
  thermal/drivers/step_wise: Fix temperature regulation misbehavior
  ppp: Destroy the mutex when cleanup
  clk: tegra: Fix cclk_lp divisor register
  clk: imx6: refine hdmi_isfr's parent to make HDMI work on i.MX6 SoCs w/o VPU
  clk: mediatek: add the option for determining PLL source clock
  mm: Handle 0 flags in _calc_vm_trans() macro
  crypto: tcrypt - fix buffer lengths in test_aead_speed()
  arm-ccn: perf: Prevent module unload while PMU is in use
  target/file: Do not return error for UNMAP if length is zero
  target:fix condition return in core_pr_dump_initiator_port()
  iscsi-target: fix memory leak in lio_target_tiqn_addtpg()
  target/iscsi: Fix a race condition in iscsit_add_reject_from_cmd()
  powerpc/ipic: Fix status get and status clear
  powerpc/opal: Fix EBUSY bug in acquiring tokens
  netfilter: ipvs: Fix inappropriate output of procfs
  powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo
  PCI/PME: Handle invalid data when reading Root Status
  dmaengine: ti-dma-crossbar: Correct am335x/am43xx mux value type
  rtc: pcf8563: fix output clock rate
  video: fbdev: au1200fb: Return an error code if a memory allocation fails
  video: fbdev: au1200fb: Release some resources if a memory allocation fails
  video: udlfb: Fix read EDID timeout
  fbdev: controlfb: Add missing modes to fix out of bounds access
  sfc: don't warn on successful change of MAC
  target: fix race during implicit transition work flushes
  target: fix ALUA transition timeout handling
  target: Use system workqueue for ALUA transitions
  btrfs: add missing memset while reading compressed inline extents
  NFSv4.1 respect server's max size in CREATE_SESSION
  efi/esrt: Cleanup bad memory map log messages
  perf symbols: Fix symbols__fixup_end heuristic for corner cases
  net/mlx4_core: Avoid delays during VF driver device shutdown
  afs: Fix afs_kill_pages()
  afs: Fix page leak in afs_write_begin()
  afs: Populate and use client modification time
  afs: Fix the maths in afs_fs_store_data()
  afs: Prevent callback expiry timer overflow
  afs: Migrate vlocation fields to 64-bit
  afs: Flush outstanding writes when an fd is closed
  afs: Adjust mode bits processing
  afs: Populate group ID from vnode status
  afs: Fix missing put_page()
  drm/radeon: reinstate oland workaround for sclk
  mmc: mediatek: Fixed bug where clock frequency could be set wrong
  sched/deadline: Use deadline instead of period when calculating overflow
  sched/deadline: Throttle a constrained deadline task activated after the deadline
  sched/deadline: Make sure the replenishment timer fires in the next period
  drm/radeon/si: add dpm quirk for Oland
  fjes: Fix wrong netdevice feature flags
  scsi: hpsa: limit outstanding rescans
  scsi: hpsa: update check for logical volume status
  openrisc: fix issue handling 8 byte get_user calls
  intel_th: pci: Add Gemini Lake support
  mlxsw: reg: Fix SPVMLR max record count
  mlxsw: reg: Fix SPVM max record count
  net: Resend IGMP memberships upon peer notification.
  dmaengine: Fix array index out of bounds warning in __get_unmap_pool()
  net: wimax/i2400m: fix NULL-deref at probe
  writeback: fix memory leak in wb_queue_work()
  netfilter: bridge: honor frag_max_size when refragmenting
  drm/omap: fix dmabuf mmap for dma_alloc'ed buffers
  Input: i8042 - add TUXEDO BU1406 (N24_25BU) to the nomux list
  NFSD: fix nfsd_reset_versions for NFSv4.
  NFSD: fix nfsd_minorversion(.., NFSD_AVAIL)
  net: bcmgenet: Power up the internal PHY before probing the MII
  net: bcmgenet: power down internal phy if open or resume fails
  net: bcmgenet: reserved phy revisions must be checked first
  net: bcmgenet: correct MIB access of UniMAC RUNT counters
  net: bcmgenet: correct the RBUF_OVFL_CNT and RBUF_ERR_CNT MIB values
  net: initialize msg.msg_flags in recvfrom
  userfaultfd: selftest: vm: allow to build in vm/ directory
  userfaultfd: shmem: __do_fault requires VM_FAULT_NOPAGE
  md-cluster: free md_cluster_info if node leave cluster
  usb: phy: isp1301: Add OF device ID table
  mac80211: Fix addition of mesh configuration element
  KEYS: add missing permission check for request_key() destination
  ext4: fix crash when a directory's i_size is too small
  ext4: fix fdatasync(2) after fallocate(2) operation
  dmaengine: dmatest: move callback wait queue to thread context
  sched/rt: Do not pull from current CPU if only one CPU to pull
  xhci: Don't add a virt_dev to the devs array before it's fully allocated
  Bluetooth: btusb: driver to enable the usb-wakeup feature
  ceph: drop negative child dentries before try pruning inode's alias
  usbip: fix stub_send_ret_submit() vulnerability to null transfer_buffer
  USB: core: prevent malicious bNumInterfaces overflow
  USB: uas and storage: Add US_FL_BROKEN_FUA for another JMicron JMS567 ID
  tracing: Allocate mask_str buffer dynamically
  autofs: fix careless error in recent commit
  crypto: salsa20 - fix blkcipher_walk API usage
  crypto: hmac - require that the underlying hash algorithm is unkeyed
  UPSTREAM: arm64: setup: introduce kaslr_offset()
  UPSTREAM: kcov: fix comparison callback signature
  UPSTREAM: kcov: support comparison operands collection
  UPSTREAM: kcov: remove pointless current != NULL check
  UPSTREAM: kcov: support compat processes
  UPSTREAM: kcov: simplify interrupt check
  UPSTREAM: kcov: make kcov work properly with KASLR enabled
  UPSTREAM: kcov: add more missing includes
  UPSTREAM: kcov: add missing #include <linux/sched.h>
  UPSTREAM: kcov: properly check if we are in an interrupt
  UPSTREAM: kcov: don't profile branches in kcov
  UPSTREAM: kcov: don't trace the code coverage code
  BACKPORT: kernel: add kcov code coverage

Conflicts:
	Makefile
	mm/kasan/Makefile
	scripts/Makefile.lib

Change-Id: Ic19953706ea2e700621b0ba94d1c90bbffa4f471
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
2018-01-18 12:49:58 +05:30
Srinivasarao P
33260fbfb3 Merge android-4.4.105 (8a53962) into msm-4.4
* refs/heads/tmp-8a53962
  Linux 4.4.105
  xen-netfront: avoid crashing on resume after a failure in talk_to_netback()
  usb: host: fix incorrect updating of offset
  USB: usbfs: Filter flags passed in from user space
  USB: devio: Prevent integer overflow in proc_do_submiturb()
  USB: Increase usbfs transfer limit
  USB: core: Add type-specific length check of BOS descriptors
  usb: ch9: Add size macro for SSP dev cap descriptor
  usb: Add USB 3.1 Precision time measurement capability descriptor support
  usb: xhci: fix panic in xhci_free_virt_devices_depth_first
  usb: hub: Cycle HUB power when initialization fails
  Revert "ocfs2: should wait dio before inode lock in ocfs2_setattr()"
  net: fec: fix multicast filtering hardware setup
  xen-netfront: Improve error handling during initialization
  mm: avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
  tcp: correct memory barrier usage in tcp_check_space()
  dmaengine: pl330: fix double lock
  tipc: fix cleanup at module unload
  net: sctp: fix array overrun read on sctp_timer_tbl
  drm/exynos/decon5433: set STANDALONE_UPDATE_F on output enablement
  NFSv4: Fix client recovery when server reboots multiple times
  KVM: arm/arm64: Fix occasional warning from the timer work function
  nfs: Don't take a reference on fl->fl_file for LOCK operation
  ravb: Remove Rx overflow log messages
  net/appletalk: Fix kernel memory disclosure
  vti6: fix device register to report IFLA_INFO_KIND
  ARM: OMAP1: DMA: Correct the number of logical channels
  net: systemport: Pad packet before inserting TSB
  net: systemport: Utilize skb_put_padto()
  kprobes/x86: Disable preemption in ftrace-based jprobes
  perf test attr: Fix ignored test case result
  sysrq : fix Show Regs call trace on ARM
  EDAC, sb_edac: Fix missing break in switch
  x86/entry: Use SYSCALL_DEFINE() macros for sys_modify_ldt()
  serial: 8250: Preserve DLD[7:4] for PORT_XR17V35X
  usb: phy: tahvo: fix error handling in tahvo_usb_probe()
  spi: sh-msiof: Fix DMA transfer size check
  serial: 8250_fintek: Fix rs485 disablement on invalid ioctl()
  selftests/x86/ldt_get: Add a few additional tests for limits
  s390/pci: do not require AIS facility
  ima: fix hash algorithm initialization
  USB: serial: option: add Quectel BG96 id
  s390/runtime instrumentation: simplify task exit handling
  serial: 8250_pci: Add Amazon PCI serial device ID
  usb: quirks: Add no-lpm quirk for KY-688 USB 3.1 Type-C Hub
  uas: Always apply US_FL_NO_ATA_1X quirk to Seagate devices
  bcache: recover data from backing when data is clean
  bcache: only permit to recovery read error when cache device is clean
  ANDROID: initramfs: call free_initrd() when skipping init

Conflicts:
	drivers/usb/core/config.c
	include/linux/usb.h
	include/uapi/linux/usb/ch9.h

Change-Id: Ibada5100be12f3a1389461f7738ee2ecb0d427af
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
2018-01-08 10:02:41 +05:30
tang.junhui
230c4ba404 bcache: fix wrong cache_misses statistics
[ Upstream commit c157313791a999646901b3e3c6888514ebc36d62 ]

Currently, Cache missed IOs are identified by s->cache_miss, but actually,
there are many situations that missed IOs are not assigned a value for
s->cache_miss in cached_dev_cache_miss(), for example, a bypassed IO
(s->iop.bypass = 1), or the cache_bio allocate failed. In these situations,
it will go to out_put or out_submit, and s->cache_miss is null, which leads
bch_mark_cache_accounting() to treat this IO as a hit IO.

[ML: applied by 3-way merge]

Signed-off-by: tang.junhui <tang.junhui@zte.com.cn>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Reviewed-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-20 10:04:59 +01:00
Rui Hua
3f7477e644 bcache: recover data from backing when data is clean
commit e393aa2446150536929140739f09c6ecbcbea7f0 upstream.

When we send a read request and hit the clean data in cache device, there
is a situation called cache read race in bcache(see the commit in the tail
of cache_look_up(), the following explaination just copy from there):
The bucket we're reading from might be reused while our bio is in flight,
and we could then end up reading the wrong data. We guard against this
by checking (in bch_cache_read_endio()) if the pointer is stale again;
if so, we treat it as an error (s->iop.error = -EINTR) and reread from
the backing device (but we don't pass that error up anywhere)

It should be noted that cache read race happened under normal
circumstances, not the circumstance when SSD failed, it was counted
and shown in  /sys/fs/bcache/XXX/internal/cache_read_races.

Without this patch, when we use writeback mode, we will never reread from
the backing device when cache read race happened, until the whole cache
device is clean, because the condition
(s->recoverable && (dc && !atomic_read(&dc->has_dirty))) is false in
cached_dev_read_error(). In this situation, the s->iop.error(= -EINTR)
will be passed up, at last, user will receive -EINTR when it's bio end,
this is not suitable, and wield to up-application.

In this patch, we use s->read_dirty_data to judge whether the read
request hit dirty data in cache device, it is safe to reread data from
the backing device when the read request hit clean data. This can not
only handle cache read race, but also recover data when failed read
request from cache device.

[edited by mlyle to fix up whitespace, commit log title, comment
spelling]

Fixes: d59b23795933 ("bcache: only permit to recovery read error when cache device is clean")
Signed-off-by: Hua Rui <huarui.dev@gmail.com>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Reviewed-by: Coly Li <colyli@suse.de>
Signed-off-by: Michael Lyle <mlyle@lyle.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-09 18:42:37 +01:00
Coly Li
f80f34d8ba bcache: only permit to recovery read error when cache device is clean
commit d59b23795933678c9638fd20c942d2b4f3cd6185 upstream.

When bcache does read I/Os, for example in writeback or writethrough mode,
if a read request on cache device is failed, bcache will try to recovery
the request by reading from cached device. If the data on cached device is
not synced with cache device, then requester will get a stale data.

For critical storage system like database, providing stale data from
recovery may result an application level data corruption, which is
unacceptible.

With this patch, for a failed read request in writeback or writethrough
mode, recovery a recoverable read request only happens when cache device
is clean. That is to say, all data on cached device is up to update.

For other cache modes in bcache, read request will never hit
cached_dev_read_error(), they don't need this patch.

Please note, because cache mode can be switched arbitrarily in run time, a
writethrough mode might be switched from a writeback mode. Therefore
checking dc->has_data in writethrough mode still makes sense.

Changelog:
V4: Fix parens error pointed by Michael Lyle.
v3: By response from Kent Oversteet, he thinks recovering stale data is a
    bug to fix, and option to permit it is unnecessary. So this version
    the sysfs file is removed.
v2: rename sysfs entry from allow_stale_data_on_failure  to
    allow_stale_data_on_failure, and fix the confusing commit log.
v1: initial patch posted.

[small change to patch comment spelling by mlyle]

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Michael Lyle <mlyle@lyle.org>
Reported-by: Arne Wolf <awolf@lenovo.com>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Nix <nix@esperi.org.uk>
Cc: Kai Krakow <hurikhan77@gmail.com>
Cc: Eric Wheeler <bcache@lists.ewheeler.net>
Cc: Junhui Tang <tang.junhui@zte.com.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-09 18:42:37 +01:00
Jan Kara
1c6dd64534 block: Use pointer to backing_dev_info from request_queue
We will want to have struct backing_dev_info allocated separately from
struct request_queue. As the first step add pointer to backing_dev_info
to request_queue and convert all users touching it. No functional
changes in this patch.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
Change-Id: I77fbb181de7e39c83fbfba8cfb128d6ace161f31
Git-repo: git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
Git-commit: 97419acd22a0bacc52dbc34d5bbc96d315e48acb
[riteshh@codeaurora.org: resolved merge conflicts]
Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
2017-10-18 12:05:43 +05:30
Tang Junhui
0471f58e18 bcache: do not subtract sectors_to_gc for bypassed IO
commit 69daf03adef5f7bc13e0ac86b4b8007df1767aab upstream.

Since bypassed IOs use no bucket, so do not subtract sectors_to_gc to
trigger gc thread.

Signed-off-by: tang.junhui <tang.junhui@zte.com.cn>
Acked-by: Coly Li <colyli@suse.de>
Reviewed-by: Eric Wheeler <bcache@linux.ewheeler.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-27 11:00:16 +02:00
Kent Overstreet
6f26f0ba24 bcache: Make gc wakeup sane, remove set_task_state()
commit be628be09563f8f6e81929efbd7cf3f45c344416 upstream.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-23 17:43:10 +01:00
Jens Axboe
dece16353e block: change ->make_request_fn() and users to return a queue cookie
No functional changes in this patch, but it prepares us for returning
a more useful cookie related to the IO that was queued up.

Signed-off-by: Jens Axboe <axboe@fb.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
2015-11-07 10:40:46 -07:00
Kent Overstreet
749b61dab3 bcache: remove driver private bio splitting code
The bcache driver has always accepted arbitrarily large bios and split
them internally.  Now that every driver must accept arbitrarily large
bios this code isn't nessecary anymore.

Cc: linux-bcache@vger.kernel.org
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
[dpark: add more description in commit message]
Signed-off-by: Dongsu Park <dpark@posteo.net>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-08-13 12:31:40 -06:00
Christoph Hellwig
4246a0b63b block: add a bi_error field to struct bio
Currently we have two different ways to signal an I/O error on a BIO:

 (1) by clearing the BIO_UPTODATE flag
 (2) by returning a Linux errno value to the bi_end_io callback

The first one has the drawback of only communicating a single possible
error (-EIO), and the second one has the drawback of not beeing persistent
when bios are queued up, and are not passed along from child to parent
bio in the ever more popular chaining scenario.  Having both mechanisms
available has the additional drawback of utterly confusing driver authors
and introducing bugs where various I/O submitters only deal with one of
them, and the others have to add boilerplate code to deal with both kinds
of error returns.

So add a new bi_error field to store an errno value directly in struct
bio and remove the existing mechanisms to clean all this up.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-07-29 08:55:15 -06:00
Jens Axboe
77b5a08427 bcache: don't embed 'return' statements in closure macros
This is horribly confusing, it breaks the flow of the code without
it being apparent in the caller.

Signed-off-by: Jens Axboe <axboe@fb.com>
Acked-by: Christoph Hellwig <hch@lst.de>
2015-07-11 09:57:32 -06:00
Tejun Heo
66114cad64 writeback: separate out include/linux/backing-dev-defs.h
With the planned cgroup writeback support, backing-dev related
declarations will be more widely used across block and cgroup;
unfortunately, including backing-dev.h from include/linux/blkdev.h
makes cyclic include dependency quite likely.

This patch separates out backing-dev-defs.h which only has the
essential definitions and updates blkdev.h to include it.  c files
which need access to more backing-dev details now include
backing-dev.h directly.  This takes backing-dev.h off the common
include dependency chain making it a lot easier to use it across block
and cgroup.

v2: fs/fat build failure fixed.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-06-02 08:33:34 -06:00
Jens Axboe
dac56212e8 bio: skip atomic inc/dec of ->bi_cnt for most use cases
Struct bio has a reference count that controls when it can be freed.
Most uses cases is allocating the bio, which then returns with a
single reference to it, doing IO, and then dropping that single
reference. We can remove this atomic_dec_and_test() in the completion
path, if nobody else is holding a reference to the bio.

If someone does call bio_get() on the bio, then we flag the bio as
now having valid count and that we must properly honor the reference
count when it's being put.

Tested-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-05-05 13:32:49 -06:00
Gu Zheng
aae4933da9 md/bcache: use generic io stats accounting functions to simplify io stat accounting
Use generic io stats accounting help functions (generic_{start,end}_io_acct)
to simplify io stat accounting.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Acked-by: Kent Overstreet <kmo@datera.io>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-11-24 08:05:12 -07:00
Slava Pestov
60ae81eee8 bcache: bcache_write tracepoint was crashing
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-08-04 15:23:03 -07:00
Kent Overstreet
3f5e0a34da bcache: Kill dead cgroup code
This hasn't been used or even enabled in ages.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-03-18 12:22:35 -07:00
Nicholas Swenson
da415a096f bcache: Fix moving_gc deadlocking with a foreground write
Deadlock happened because a foreground write slept, waiting for a bucket
to be allocated. Normally the gc would mark buckets available for invalidation.
But the moving_gc was stuck waiting for outstanding writes to complete.
These writes used the bcache_wq, the same queue foreground writes used.

This fix gives moving_gc its own work queue, so it was still finish moving
even if foreground writes are stuck waiting for allocation. It also makes
work queue a parameter to the data_insert path, so moving_gc can use its
workqueue for writes.

Signed-off-by: Nicholas Swenson <nks@daterainc.com>
Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-03-18 12:22:33 -07:00
Kent Overstreet
1b4eaf3d38 bcache: Fix flash_dev_cache_miss() for real this time
The code was using sectors to count the number of sectors it was zeroing... but
then it passed it to bio_advance()... after it had been set to 0. Amusing...

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-02-25 18:41:11 -08:00
Jens Axboe
96d2e8b5e2 Merge branch 'bcache-for-3.14' of git://evilpiepirate.org/~kent/linux-bcache into for-linus 2014-01-30 12:57:55 -07:00
Linus Torvalds
53d8ab29f8 Merge branch 'for-3.14/drivers' of git://git.kernel.dk/linux-block
Pull block IO driver changes from Jens Axboe:

 - bcache update from Kent Overstreet.

 - two bcache fixes from Nicholas Swenson.

 - cciss pci init error fix from Andrew.

 - underflow fix in the parallel IDE pg_write code from Dan Carpenter.
   I'm sure the 1 (or 0) users of that are now happy.

 - two PCI related fixes for sx8 from Jingoo Han.

 - floppy init fix for first block read from Jiri Kosina.

 - pktcdvd error return miss fix from Julia Lawall.

 - removal of IRQF_SHARED from the SEGA Dreamcast CD-ROM code from
   Michael Opdenacker.

 - comment typo fix for the loop driver from Olaf Hering.

 - potential oops fix for null_blk from Raghavendra K T.

 - two fixes from Sam Bradshaw (Micron) for the mtip32xx driver, fixing
   an OOM problem and a problem with handling security locked conditions

* 'for-3.14/drivers' of git://git.kernel.dk/linux-block: (47 commits)
  mg_disk: Spelling s/finised/finished/
  null_blk: Null pointer deference problem in alloc_page_buffers
  mtip32xx: Correctly handle security locked condition
  mtip32xx: Make SGL container per-command to eliminate high order dma allocation
  drivers/block/loop.c: fix comment typo in loop_config_discard
  drivers/block/cciss.c:cciss_init_one(): use proper errnos
  drivers/block/paride/pg.c: underflow bug in pg_write()
  drivers/block/sx8.c: remove unnecessary pci_set_drvdata()
  drivers/block/sx8.c: use module_pci_driver()
  floppy: bail out in open() if drive is not responding to block0 read
  bcache: Fix auxiliary search trees for key size > cacheline size
  bcache: Don't return -EINTR when insert finished
  bcache: Improve bucket_prio() calculation
  bcache: Add bch_bkey_equal_header()
  bcache: update bch_bkey_try_merge
  bcache: Move insert_fixup() to btree_keys_ops
  bcache: Convert sorting to btree_keys
  bcache: Convert debug code to btree_keys
  bcache: Convert btree_iter to struct btree_keys
  bcache: Refactor bset_tree sysfs stats
  ...
2014-01-30 11:40:10 -08:00
Linus Torvalds
f568849eda Merge branch 'for-3.14/core' of git://git.kernel.dk/linux-block
Pull core block IO changes from Jens Axboe:
 "The major piece in here is the immutable bio_ve series from Kent, the
  rest is fairly minor.  It was supposed to go in last round, but
  various issues pushed it to this release instead.  The pull request
  contains:

   - Various smaller blk-mq fixes from different folks.  Nothing major
     here, just minor fixes and cleanups.

   - Fix for a memory leak in the error path in the block ioctl code
     from Christian Engelmayer.

   - Header export fix from CaiZhiyong.

   - Finally the immutable biovec changes from Kent Overstreet.  This
     enables some nice future work on making arbitrarily sized bios
     possible, and splitting more efficient.  Related fixes to immutable
     bio_vecs:

        - dm-cache immutable fixup from Mike Snitzer.
        - btrfs immutable fixup from Muthu Kumar.

  - bio-integrity fix from Nic Bellinger, which is also going to stable"

* 'for-3.14/core' of git://git.kernel.dk/linux-block: (44 commits)
  xtensa: fixup simdisk driver to work with immutable bio_vecs
  block/blk-mq-cpu.c: use hotcpu_notifier()
  blk-mq: for_each_* macro correctness
  block: Fix memory leak in rw_copy_check_uvector() handling
  bio-integrity: Fix bio_integrity_verify segment start bug
  block: remove unrelated header files and export symbol
  blk-mq: uses page->list incorrectly
  blk-mq: use __smp_call_function_single directly
  btrfs: fix missing increment of bi_remaining
  Revert "block: Warn and free bio if bi_end_io is not set"
  block: Warn and free bio if bi_end_io is not set
  blk-mq: fix initializing request's start time
  block: blk-mq: don't export blk_mq_free_queue()
  block: blk-mq: make blk_sync_queue support mq
  block: blk-mq: support draining mq queue
  dm cache: increment bi_remaining when bi_end_io is restored
  block: fixup for generic bio chaining
  block: Really silence spurious compiler warnings
  block: Silence spurious compiler warnings
  block: Kill bio_pair_split()
  ...
2014-01-30 11:19:05 -08:00
Nicholas Swenson
e3b4825b85 bcache: bugfix - gc thread now gets woken when cache is full
Signed-off-by: Nicholas Swenson <nks@daterainc.com>
2014-01-29 13:06:42 -08:00
Hugh Dickins
b3ff8a2f95 cgroup: remove stray references to css_id
Trivial: remove the few stray references to css_id, which itself
was removed in v3.13's 2ff2a7d03b "cgroup: kill css_id".

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2014-01-13 10:48:18 -05:00
Kent Overstreet
085d2a3dd4 bcache: Make bch_keylist_realloc() take u64s, not nptrs
Getting away from KEY_PTRS and moving toward KEY_U64s - and getting rid of magic
2s

Also - split out the part that checks against journal entry size so as to avoid
a dependancy on struct cache_set in bset.c

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-01-08 13:05:11 -08:00
Kent Overstreet
a5ae4300c1 bcache: Zero less memory
Another minor performance optimization

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-01-08 13:05:08 -08:00
Kent Overstreet
d56d000a1f bcache: Don't touch bucket gen for dirty ptrs
Unnecessary since a bucket that has dirty pointers pointing to it can
never be invalidated - and skipping it is a measurable performance
boost, since the bucket gen will usually be a cache miss.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2014-01-08 13:05:07 -08:00
Kent Overstreet
20d0189b10 block: Introduce new bio_split()
The new bio_split() can split arbitrary bios - it's not restricted to
single page bios, like the old bio_split() (previously renamed to
bio_pair_split()). It also has different semantics - it doesn't allocate
a struct bio_pair, leaving it up to the caller to handle completions.

Then convert the existing bio_pair_split() users to the new bio_split()
- and also nvme, which was open coding bio splitting.

(We have to take that BUG_ON() out of bio_integrity_trim() because this
bio_split() needs to use it, and there's no reason it has to be used on
bios marked as cloned; BIO_CLONED doesn't seem to have clearly
documented semantics anyways.)

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Neil Brown <neilb@suse.de>
2013-11-23 22:33:57 -08:00
Kent Overstreet
59d276fe02 block: Add bio_clone_fast()
bio_clone() just got more expensive - however, most users of bio_clone()
don't actually need to modify the biovec. If they aren't modifying the
biovec, and they can guarantee that the original bio isn't freed before
the clone (also true in most cases), we can just point the clone at the
original bio's biovec.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-23 22:33:54 -08:00
Kent Overstreet
7988613b0e block: Convert bio_for_each_segment() to bvec_iter
More prep work for immutable biovecs - with immutable bvecs drivers
won't be able to use the biovec directly, they'll need to use helpers
that take into account bio->bi_iter.bi_bvec_done.

This updates callers for the new usage without changing the
implementation yet.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Paul Clements <Paul.Clements@steeleye.com>
Cc: Jim Paris <jim@jtan.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com>
Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com>
Cc: support@lsi.com
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Quoc-Son Anh <quoc-sonx.anh@intel.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jan Kara <jack@suse.cz>
Cc: linux-m68k@lists.linux-m68k.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: drbd-user@lists.linbit.com
Cc: nbd-general@lists.sourceforge.net
Cc: cbe-oss-dev@lists.ozlabs.org
Cc: xen-devel@lists.xensource.com
Cc: virtualization@lists.linux-foundation.org
Cc: linux-raid@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: DL-MPTFusionLinux@lsi.com
Cc: linux-scsi@vger.kernel.org
Cc: devel@driverdev.osuosl.org
Cc: linux-fsdevel@vger.kernel.org
Cc: cluster-devel@redhat.com
Cc: linux-mm@kvack.org
Acked-by: Geoff Levand <geoff@infradead.org>
2013-11-23 22:33:49 -08:00
Kent Overstreet
4f024f3797 block: Abstract out bvec iterator
Immutable biovecs are going to require an explicit iterator. To
implement immutable bvecs, a later patch is going to add a bi_bvec_done
member to this struct; for now, this patch effectively just renames
things.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Matthew Wilcox <willy@linux.intel.com>
Cc: Geoff Levand <geoff@infradead.org>
Cc: Yehuda Sadeh <yehuda@inktank.com>
Cc: Sage Weil <sage@inktank.com>
Cc: Alex Elder <elder@inktank.com>
Cc: ceph-devel@vger.kernel.org
Cc: Joshua Morris <josh.h.morris@us.ibm.com>
Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: dm-devel@redhat.com
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: linux390@de.ibm.com
Cc: Boaz Harrosh <bharrosh@panasas.com>
Cc: Benny Halevy <bhalevy@tonian.com>
Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Dave Kleikamp <shaggy@kernel.org>
Cc: Joern Engel <joern@logfs.org>
Cc: Prasad Joshi <prasadjoshi.linux@gmail.com>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Ben Myers <bpm@sgi.com>
Cc: xfs@oss.sgi.com
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Guo Chao <yan@linux.vnet.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Jerome Marchand <jmarchand@redhat.com>
Cc: Joe Perches <joe@perches.com>
Cc: Peng Tao <tao.peng@emc.com>
Cc: Andy Adamson <andros@netapp.com>
Cc: fanchaoting <fanchaoting@cn.fujitsu.com>
Cc: Jie Liu <jeff.liu@oracle.com>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Pankaj Kumar <pankaj.km@samsung.com>
Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: Mel Gorman <mgorman@suse.de>6
2013-11-23 22:33:47 -08:00
Kent Overstreet
ed9c47bebe bcache: Kill unaligned bvec hack
Bcache has a hack to avoid cloning the biovec if it's all full pages -
but with immutable biovecs coming this won't be necessary anymore.

For now, we remove the special case and always clone the bvec array so
that the immutable biovec patches are simpler.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-23 22:33:47 -08:00
Kent Overstreet
5ceaaad704 bcache: Bypass torture test
More testing ftw! Also, now verify mode doesn't break if you read dirty
data.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:43 -08:00
Kent Overstreet
c4d951ddb6 bcache: Fix sysfs splat on shutdown with flash only devs
Whoops.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:41 -08:00
Kent Overstreet
8aee122071 bcache: Kill sequential_merge option
It never really made sense to expose this, so just kill it.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:39 -08:00
Kent Overstreet
81ab4190ac bcache: Pull on disk data structures out into a separate header
Now, the on disk data structures are in a header that can be exported to
userspace - and having them all centralized is nice too.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:33 -08:00
Kent Overstreet
2599b53b7b bcache: Move sector allocator to alloc.c
Just reorganizing things a bit.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:32 -08:00
Kent Overstreet
220bb38c21 bcache: Break up struct search
With all the recent refactoring around struct btree op struct search has
gotten rather large.

But we can now easily break it up in a different way - we break out
struct btree_insert_op which is for inserting data into the cache, and
that's now what the copying gc code uses - struct search is now specific
to request.c

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:32 -08:00
Kent Overstreet
cc7b881921 bcache: Convert bch_btree_insert() to bch_btree_map_leaf_nodes()
Last of the btree_map() conversions. Main visible effect is
bch_btree_insert() is no longer taking a struct btree_op as an argument
anymore - there's no fancy state machine stuff going on, it's just a
normal function.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:31 -08:00
Kent Overstreet
6054c6d4da bcache: Don't use op->insert_collision
When we convert bch_btree_insert() to bch_btree_map_leaf_nodes(), we
won't be passing struct btree_op to bch_btree_insert() anymore - so we
need a different way of returning whether there was a collision (really,
a replace collision).

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:30 -08:00
Kent Overstreet
1b207d80d5 bcache: Kill op->replace
This is prep work for converting bch_btree_insert to
bch_btree_map_leaf_nodes() - we have to convert all its arguments to
actual arguments. Bunch of churn, but should be straightforward.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:29 -08:00
Kent Overstreet
b54d6934da bcache: Kill op->cl
This isn't used for waiting asynchronously anymore - so this is a fairly
trivial refactoring.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:09 -08:00
Kent Overstreet
c18536a72d bcache: Prune struct btree_op
Eventual goal is for struct btree_op to contain only what is necessary
for traversing the btree.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:08 -08:00
Kent Overstreet
cc23196631 bcache: Clean up cache_lookup_fn
There was some looping in submit_partial_cache_hit() and
submit_partial_cache_hit() that isn't needed anymore - originally, we
wouldn't necessarily process the full hit or miss all at once because
when splitting the bio, we took into account the restrictions of the
device we were sending it to.

But, device bio size restrictions are now handled elsewhere, with a
wrapper around generic_make_request() - so that looping has been
unnecessary for awhile now and we can now do quite a bit of cleanup.

And if we trim the key we're reading from to match the subset we're
actually reading, we don't have to explicitly calculate bi_sector
anymore. Neat.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:08 -08:00
Kent Overstreet
2c1953e201 bcache: Convert bch_btree_read_async() to bch_btree_map_keys()
This is a fairly straightforward conversion, mostly reshuffling -
op->lookup_done goes away, replaced by MAP_DONE/MAP_CONTINUE. And the
code for handling cache hits and misses wasn't really btree code, so it
gets moved to request.c.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:07 -08:00
Kent Overstreet
df8e89701f bcache: Move some stuff to btree.c
With the new btree_map() functions, we don't need to export the stuff
needed for traversing the btree anymore.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:07 -08:00
Kent Overstreet
72a44517f3 bcache: Convert gc to a kthread
We needed a dedicated rescuer workqueue for gc anyways... and gc was
conceptually a dedicated thread, just one that wasn't running all the
time. Switch it to a dedicated thread to make the code a bit more
straightforward.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:04 -08:00
Kent Overstreet
35fcd848d7 bcache: Convert bucket_wait to wait_queue_head_t
At one point we did do fancy asynchronous waiting stuff with
bucket_wait, but that's all gone (and bucket_wait is used a lot less
than it used to be). So use the standard primitives.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-10 21:56:04 -08:00