299 commits

Author SHA1 Message Date
Olav Haugan
e33c24bfec sched: add cpu isolation support
This adds cpu isolation APIs to the scheduler to isolate and unisolate
CPUs. Isolating and unisolating a CPU can be used in place of hotplug.
Isolating and unisolating a CPU is faster than hotplug and can thus be
used to optimize the performance and power of multi-core CPUs.

Isolating works by migrating non-pinned IRQs and tasks to other CPUs and
marking the CPU as not available to the scheduler and load balancer.
Pinned tasks and IRQs are still allowed to run but it is expected that
this would be minimal.

Unisolation works by just marking the CPU available for scheduler and
load balancer.

Change-Id: I0bbddb56238c2958c5987877c5bfc3e79afa67cc
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
2016-09-24 10:55:17 -07:00
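
The isolation flow described in e33c24bfec can be sketched in user-space C as follows. This is only an illustration of the idea, not the kernel implementation: the `struct task` record, the `isolated_mask` bitmask, and the function names `isolate_cpu()`, `unisolate_cpu()`, and `pick_available_cpu()` are all hypothetical, and IRQ migration is left out.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical task record: the CPU it runs on and whether it is pinned. */
struct task {
    int cpu;
    bool pinned;
};

static unsigned int isolated_mask; /* bit n set => CPU n is isolated */

static bool cpu_isolated(int cpu)
{
    return (isolated_mask & (1u << cpu)) != 0;
}

/* Pick the lowest-numbered CPU that is not isolated, or -1 if none. */
static int pick_available_cpu(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (!cpu_isolated(cpu))
            return cpu;
    return -1;
}

/* Isolate: mark the CPU unavailable, then move non-pinned tasks away. */
static int isolate_cpu(struct task *tasks, int ntasks, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    isolated_mask |= 1u << cpu;

    int dest = pick_available_cpu();
    if (dest < 0) {
        /* Assumption of this sketch: never isolate the last CPU. */
        isolated_mask &= ~(1u << cpu);
        return -1;
    }
    for (int i = 0; i < ntasks; i++)
        if (tasks[i].cpu == cpu && !tasks[i].pinned)
            tasks[i].cpu = dest; /* pinned tasks stay where they are */
    return 0;
}

/* Unisolate: just make the CPU available again; nothing is migrated. */
static void unisolate_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    isolated_mask &= ~(1u << cpu);
}
```

In this sketch, unisolation is a single bit clear, which mirrors the commit's point that unisolating only marks the CPU as available again.
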
Syed Rameez Mustafa
1389927146 sched: Move data structures under CONFIG_SCHED_HMP
Frequency-demand conversion data structures are only used under
CONFIG_SCHED_HMP. Move them out of sched.h into hmp.c, where they
actually belong after the recent refactor.

Change-Id: I3c3eebca86062f11b80af93ba3716695eb787376
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-09-09 15:35:38 -07:00
Pavankumar Kondeti
2552980f79 sched: handle frequency alert notifications better
The load reporting during frequency alert notifications is broken under
load aggregation. When aggregation is enabled, the total group busy
time is accounted towards the maximum busy CPU of a frequency domain.
If this CPU has a notification pending, its group busy time alone is
accounted and the other CPUs' group busy time is completely ignored.
Similarly, if any CPU other than the maximum busy CPU has a pending
notification, its group busy time is accounted twice.

Maintain the frequency alert notification flag per frequency domain.
When the notification is pending, don't clip the load to 100% @ fcur
for any of the CPUs in the frequency domain.

Change-Id: Iebc7d74d6fafa20430fa1c7d80f34a6ab198832d
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2016-08-22 14:06:36 -07:00
Pavankumar Kondeti
5ddfbfec06 sched: inherit the group id from the group leader
When sysctl_sched_enable_thread_grouping is set to 1, any new tasks
created are put in the same group as their group leader.

Change-Id: If1837dd7c8120c8b097cfffa1dc52eb4781f1641
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2016-08-22 14:06:35 -07:00
Syed Rameez Mustafa
67e0df6e33 sched: Move notify_migration() under CONFIG_SCHED_HMP
notify_migration() is an HMP-specific function that relies on all
of its contents to be stubbed out for !CONFIG_SCHED_HMP. However,
it still maintains calls to rcu_read_lock/unlock(). In the !HMP
case these calls are simply redundant. Move the function under
CONFIG_SCHED_HMP and add a stub when the config is not defined so
that there is no overhead.

Change-Id: Iad914f31b629e81e403b0e89796b2b0f1d081695
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-08-22 14:06:33 -07:00
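
The stub pattern used in 67e0df6e33 is the usual config-guarded one. A minimal sketch follows, with a made-up macro `SCHED_HMP_SKETCH` standing in for CONFIG_SCHED_HMP and a made-up function `notify_migration_sketch()` standing in for the real one; the counter is only there so the effect is observable.

```c
#include <assert.h>

static int migrations_notified; /* illustrative side effect only */

#ifdef SCHED_HMP_SKETCH
/* Real implementation, built only when the feature is configured in. */
static inline void notify_migration_sketch(int src_cpu, int dest_cpu)
{
    assert(src_cpu >= 0 && dest_cpu >= 0);
    if (src_cpu != dest_cpu)
        migrations_notified++;
}
#else
/* Empty stub: callers compile unchanged and the call costs nothing. */
static inline void notify_migration_sketch(int src_cpu, int dest_cpu)
{
    (void)src_cpu;
    (void)dest_cpu;
}
#endif
```

With the macro undefined, the stub contains no locking or other work, which is the overhead the commit removes for the !HMP case.
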
Syed Rameez Mustafa
9095a09ab1 sched: Move most HMP specific code to a separate file.
Most code pertaining to CONFIG_SCHED_HMP has been moved to a separate
file "hmp.c" in order to facilitate kernel upgrades. Fewer changes in
the original scheduler files means fewer conflicts. Some parts of code,
however, could not be moved to the separate file either because of
dependencies with other non-HMP code or because the changes are specific
only to the scheduling classes where the code resides.

Change-Id: Ib067ac75e5a494008dcb3c67586b622c1b3962ce
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-08-22 14:06:33 -07:00
Syed Rameez Mustafa
b01a93838d sched: Fix compile issues for !CONFIG_SCHED_HMP
Fix compile issues observed when CONFIG_SCHED_HMP is not turned on.
There are still targets that may want that config option turned off.

Change-Id: I29e69356da8d003d13d8cd3927a0b166cc1ef95e
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-08-22 14:06:31 -07:00
Syed Rameez Mustafa
62f2600ce9 sched: Remove all existence of CONFIG_SCHED_FREQ_INPUT
CONFIG_SCHED_FREQ_INPUT was created to keep parts of the scheduler
dealing with frequency separate from other parts of the scheduler
that deal with task placement. However, over time the two features
have become intricately linked whereby SCHED_FREQ_INPUT cannot be
turned on without having SCHED_HMP turned on as well. Given this
complex inter-dependency and the fact that all old, existing and
future targets use both config options, remove this unnecessary
feature separation. It will aid in making kernel upgrades a lot
simpler and faster.

Change-Id: Ia20e40d8a088d50909cc28f5be758fa3e9a4af6f
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-08-22 11:37:22 -07:00
Syed Rameez Mustafa
e2b9b4a395 sched: Move CPU cstate tracking under CONFIG_SCHED_HMP
While tracking C-states makes sense under CONFIG_SMP as well, cstate
information is currently unused under CONFIG_SMP. Move it under
CONFIG_SCHED_HMP for now since that is the only place it is relevant
at the moment.

Change-Id: Ifc5812cfe14ebf2b4d447100dcd87f02ab29ff7a
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-08-22 11:33:55 -07:00
Syed Rameez Mustafa
e978394406 sched: Remove unused PELT extensions for HMP scheduling
PELT extensions for HMP have never been used since the early days
of the HMP scheduler. Furthermore, changes to PELT itself in newer
kernel versions render some of the code redundant or incorrect. These
extensions have not been tested for a long time and are practically
dead code. Remove it so that future upgrades become easier.

Change-Id: I029f327406ca00b2370c93134158b61dda3b81e3
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-08-22 11:32:57 -07:00
Syed Rameez Mustafa
ef1e55638d sched: Remove unused migration notifier code.
Migration notifiers were created to help the CPU-boost driver manage
CPU frequencies when tasks migrate from one CPU to another. Over time
with the evolution of scheduler guided frequency, the scheduler now
directly manages load when tasks migrate. Consequently the CPU-boost
driver no longer makes use of this information. Remove unused code
pertaining to this feature.

Change-Id: I3529e4356e15e342a5fcfbcf3654396752a1d7cd
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-08-22 11:32:19 -07:00
Trilok Soni
f145f41478 Merge remote-tracking branch 'msm-4.4/tmp-2bf7955' into msm-4.4
* msm-4.4/tmp-2bf7955:
  Linux 4.4.8
  Revert "usb: hub: do not clear BOS field during reset device"
  usbvision: fix crash on detecting device with invalid configuration
  staging: android: ion: Set the length of the DMA sg entries in buffer
  Revert "PCI, x86: Implement pcibios_alloc_irq() and pcibios_free_irq()"
  Revert "PCI: Add helpers to manage pci_dev->irq and pci_dev->irq_managed"
  Revert "x86/PCI: Don't alloc pcibios-irq when MSI is enabled"
  HID: usbhid: fix inconsistent reset/resume/reset-resume behavior
  HID: wacom: fix Bamboo ONE oops
  ALSA: usb-audio: Skip volume controls triggers hangup on Dell USB Dock
  ALSA: usb-audio: Add a quirk for Plantronics BT300
  ALSA: usb-audio: Add a sample rate quirk for Phoenix Audio TMX320
  ALSA: hda/realtek - Enable the ALC292 dock fixup on the Thinkpad T460s
  ALSA: hda - fix front mic problem for a HP desktop
  ALSA: hda - Fix headset support and noise on HP EliteBook 755 G2
  ALSA: hda - Fixup speaker pass-through control for nid 0x14 on ALC225
  mmc: sdhci-pci: Add support and PCI IDs for more Broxton host controllers
  perf: Cure event->pending_disable race
  perf: Do not double free
  arm64: replace read_lock to rcu lock in call_step_hook
  Btrfs: fix file/data loss caused by fsync after rename and new inode
  iommu: Don't overwrite domain pointer when there is no default_domain
  ext4: ignore quota mount options if the quota feature is enabled
  ext4: add lockdep annotations for i_data_sem
  btrfs: fix crash/invalid memory access on fsync when using overlayfs
  nfs: use file_dentry()
  fs: add file_dentry()
  sd: Fix excessive capacity printing on devices with blocks bigger than 512 bytes
  iio: gyro: bmg160: fix endianness when reading axes
  iio: gyro: bmg160: fix buffer read values
  iio: accel: bmc150: fix endianness when reading axes
  iio: st_magn: always define ST_MAGN_TRIGGER_SET_STATE
  usb: renesas_usbhs: fix to avoid using a disabled ep in usbhsg_queue_done()
  usb: renesas_usbhs: disable TX IRQ before starting TX DMAC transfer
  usb: renesas_usbhs: avoid NULL pointer derefernce in usbhsf_pkt_handler()
  mac80211: fix txq queue related crashes
  mac80211: fix unnecessary frame drops in mesh fwding
  mac80211: fix ibss scan parameters
  mac80211: avoid excessive stack usage in sta_info
  mac80211: properly deal with station hashtable insert errors
  virtio: virtio 1.0 cs04 spec compliance for reset
  rbd: use GFP_NOIO consistently for request allocations
  pcmcia: db1xxx_ss: fix last irq_to_gpio user
  v4l: vsp1: Set the SRU CTRL0 register when starting the stream
  coda: fix error path in case of missing pdata on non-DT platform
  au0828: Fix dev_state handling
  au0828: fix au0828_v4l2_close() dev_state race condition
  pinctrl: freescale: imx: fix bogus check of of_iomap() return value
  pinctrl: nomadik: fix pull debug print inversion
  pinctrl: sunxi: Fix A33 external interrupts not working
  pinctrl: sh-pfc: only use dummy states for non-DT platforms
  pinctrl: pistachio: fix mfio84-89 function description and pinmux.
  MIPS: Fix MSA ld unaligned failure cases
  KVM: x86: reduce default value of halt_poll_ns parameter
  KVM: x86: Inject pending interrupt even if pending nmi exist
  cdc-acm: fix NULL pointer reference
  USB: uas: Add a new NO_REPORT_LUNS quirk
  USB: uas: Limit qdepth at the scsi-host level
  mpls: find_outdev: check for err ptr in addition to NULL check
  ipv6: Count in extension headers in skb->network_header
  ip6_tunnel: set rtnl_link_ops before calling register_netdevice
  ipv6: l2tp: fix a potential issue in l2tp_ip6_recv
  ipv4: l2tp: fix a potential issue in l2tp_ip_recv
  tuntap: restore default qdisc
  tun, bpf: fix suspicious RCU usage in tun_{attach, detach}_filter
  rtnl: fix msg size calculation in if_nlmsg_size()
  bridge: Allow set bridge ageing time when switchdev disabled
  ipv6: udp: fix UDP_MIB_IGNOREDMULTI updates
  qmi_wwan: add "D-Link DWM-221 B1" device id
  xfrm: Fix crash observed during device unregistration and decryption
  ppp: take reference on channels netns
  ipv4: initialize flowi4_flags before calling fib_lookup()
  ipv4: fix broadcast packets reception
  bonding: fix bond_get_stats()
  net: bcmgenet: fix dma api length mismatch
  qlge: Fix receive packets drop.
  tcp/dccp: remove obsolete WARN_ON() in icmp handlers
  ppp: ensure file->private_data can't be overridden
  ath9k: fix buffer overrun for ar9287
  farsync: fix off-by-one bug in fst_add_one
  mlx4: add missing braces in verify_qp_parameters
  net: Fix use after free in the recvmmsg exit path
  ipv4: Don't do expensive useless work during inetdev destroy.
  bridge: allow zero ageing time
  rocker: set FDB cleanup timer according to lowest ageing time
  mlxsw: spectrum: Check requested ageing time is valid
  macvtap: always pass ethernet header in linear
  qlcnic: Fix mailbox completion handling during spurious interrupt
  qlcnic: Remove unnecessary usage of atomic_t
  sh_eth: advance 'rxdesc' later in sh_eth_ring_format()
  sh_eth: fix NULL pointer dereference in sh_eth_ring_format()
  bpf: avoid copying junk bytes in bpf_get_current_comm()
  packet: validate variable length ll headers
  ax25: add link layer header validation function
  net: validate variable length ll headers
  ppp: release rtnl mutex when interface creation fails
  tcp: fix tcpi_segs_in after connection establishment
  udp6: fix UDP/IPv6 encap resubmit path
  usbnet: cleanup after bind() in probe()
  cdc_ncm: toggle altsetting to force reset before setup
  vxlan: fix missing options_len update on RX with collect metadata
  ipv6: re-enable fragment header matching in ipv6_find_hdr
  qmi_wwan: add Sierra Wireless EM74xx device ID
  tipc: Revert "tipc: use existing sk_write_queue for outgoing packet chain"
  mld, igmp: Fix reserved tailroom calculation
  sctp: lack the check for ports in sctp_v6_cmp_addr
  net: fix bridge multicast packet checksum validation
  net: qca_spi: clear IFF_TX_SKB_SHARING
  net: qca_spi: Don't clear IFF_BROADCAST
  net: vrf: Remove direct access to skb->data
  net: jme: fix suspend/resume on JMC260
  ipv4: only create late gso-skb if skb is already set up with CHECKSUM_PARTIAL
  tunnel: Clear IPCB(skb)->opt before dst_link_failure called
  tcp: convert cached rtt from usec to jiffies when feeding initial rto
  xen/events: Mask a moving irq
  drm/amdgpu/gmc: use proper register for vram type on Fiji
  drm/amdgpu/gmc: move vram type fetching into sw_init
  drm/radeon: add a dpm quirk for all R7 370 parts
  drm/radeon: add another R7 370 quirk
  drm/radeon: add a dpm quirk for sapphire Dual-X R7 370 2G D5
  drm/udl: Use unlocked gem unreferencing
  drm/dp: move hw_mutex up the call stack
  arm64: opcodes.h: Add arm big-endian config options before including arm header
  compiler-gcc: disable -ftracer for __noclone functions
  libnvdimm, pfn: fix uuid validation
  libnvdimm: fix smart data retrieval
  powerpc/mm: Fixup preempt underflow with huge pages
  mm: fix invalid node in alloc_migrate_target()
  ALSA: hda - Apply fix for white noise on Asus N550JV, too
  ALSA: hda - Fix white noise on Asus N750JV headphone
  ALSA: hda - Asus N750JV external subwoofer fixup
  ALSA: timer: Use mod_timer() for rearming the system timer
  parisc: Unbreak handling exceptions from kernel modules
  parisc: Fix kernel crash with reversed copy_from_user()
  parisc: Avoid function pointers for kernel exception routines
  PKCS#7: pkcs7_validate_trust(): initialize the _trusted output argument
  hwmon: (max1111) Return -ENODEV from max1111_read_channel if not instantiated
  Linux 4.4.7
  perf/x86/intel: Fix PEBS data source interpretation on Nehalem/Westmere
  perf/x86/intel: Use PAGE_SIZE for PEBS buffer size on Core2
  perf/x86/intel: Fix PEBS warning by only restoring active PMU in pmi
  perf/x86/pebs: Add workaround for broken OVFL status on HSW+
  sched/cputime: Fix steal time accounting vs. CPU hotplug
  scsi_common: do not clobber fixed sense information
  PM / sleep: Clear pm_suspend_global_flags upon hibernate
  intel_idle: prevent SKL-H boot failure when C8+C9+C10 enabled
  mtd: onenand: fix deadlock in onenand_block_markbad
  mm/page_alloc: prevent merging between isolated and other pageblocks
  ocfs2/dlm: fix BUG in dlm_move_lockres_to_recovery_list
  ocfs2/dlm: fix race between convert and recovery
  Input: ati_remote2 - fix crashes on detecting device with invalid descriptor
  Input: ims-pcu - sanity check against missing interfaces
  Input: synaptics - handle spurious release of trackstick buttons, again
  writeback, cgroup: fix use of the wrong bdi_writeback which mismatches the inode
  writeback, cgroup: fix premature wb_put() in locked_inode_to_wb_and_lock_list()
  ACPI / PM: Runtime resume devices when waking from hibernate
  ARM: dts: at91: sama5d4 Xplained: don't disable hsmci regulator
  ARM: dts: at91: sama5d3 Xplained: don't disable hsmci regulator
  nfsd: fix deadlock secinfo+readdir compound
  nfsd4: fix bad bounds checking
  iser-target: Rework connection termination
  iser-target: Separate flows for np listeners and connections cma events
  iser-target: Add new state ISER_CONN_BOUND to isert_conn
  iser-target: Fix identification of login rx descriptor type
  target: Fix target_release_cmd_kref shutdown comp leak
  clk: bcm2835: Fix setting of PLL divider clock rates
  clk: rockchip: add hclk_cpubus to the list of rk3188 critical clocks
  clk: rockchip: rk3368: fix hdmi_cec gate-register
  clk: rockchip: rk3368: fix parents of video encoder/decoder
  clk: rockchip: rk3368: fix cpuclk core dividers
  clk: rockchip: rk3368: fix cpuclk mux bit of big cpu-cluster
  mmc: sdhci: Fix override of timeout clk wrt max_busy_timeout
  mmc: sdhci: fix data timeout (part 2)
  mmc: sdhci: fix data timeout (part 1)
  mmc: mmc_spi: Add Card Detect comments and fix CD GPIO case
  mmc: block: fix ABI regression of mmc_blk_ioctl
  ideapad-laptop: Add ideapad Y700 (15) to the no_hw_rfkill DMI list
  MAINTAINERS: Update mailing list and web page for hwmon subsystem
  kbuild/mkspec: fix grub2 installkernel issue
  scripts/kconfig: allow building with make 3.80 again
  scripts/coccinelle: modernize &
  bitops: Do not default to __clear_bit() for __clear_bit_unlock()
  tracing: Fix trace_printk() to print when not using bprintk()
  tracing: Fix crash from reading trace_pipe with sendfile
  tracing: Have preempt(irqs)off trace preempt disabled functions
  IB/ipoib: fix for rare multicast join race condition
  drm/amdgpu: include the right version of gmc header files for iceland
  drm/amdgpu: disable runtime pm on PX laptops without dGPU power control
  drm/radeon: Don't drop DP 2.7 Ghz link setup on some cards.
  drm/radeon: disable runtime pm on PX laptops without dGPU power control
  iwlwifi: mvm: Fix paging memory leak
  ipr: Fix regression when loading firmware
  ipr: Fix out-of-bounds null overwrite
  rapidio/rionet: fix deadlock on SMP
  fs/coredump: prevent fsuid=0 dumps into user-controlled directories
  fuse: Add reference counting for fuse_io_priv
  fuse: do not use iocb after it may have been freed
  md: multipath: don't hardcopy bio in .make_request path
  md/raid5: preserve STRIPE_PREREAD_ACTIVE in break_stripe_batch_list
  raid10: include bio_end_io_list in nr_queued to prevent freeze_array hang
  RAID5: revert e9e4c377e2 to fix a livelock
  RAID5: check_reshape() shouldn't call mddev_suspend
  md/raid5: Compare apples to apples (or sectors to sectors)
  raid1: include bio_end_io_list in nr_queued to prevent freeze_array hang
  xfs: fix two memory leaks in xfs_attr_list.c error paths
  quota: Fix possible GPF due to uninitialised pointers
  ARC: bitops: Remove non relevant comments
  ARC: [BE] readl()/writel() to work in Big Endian CPU configuration
  xtensa: clear all DBREAKC registers on start
  xtensa: fix preemption in {clear,copy}_user_highpage
  xtensa: ISS: don't hang if stdin EOF is reached
  splice: handle zero nr_pages in splice_to_pipe()
  vfs: show_vfsstat: do not ignore errors from show_devname method
  of: alloc anywhere from memblock if range not specified
  net: mvneta: enable change MAC address when interface is up
  cgroup: ignore css_sets associated with dead cgroups during migration
  Bluetooth: Fix potential buffer overflow with Add Advertising
  Bluetooth: Add new AR3012 ID 0489:e095
  watchdog: rc32434_wdt: fix ioctl error handling
  watchdog: don't run proc_watchdog_update if new value is same as old
  ia64: define ioremap_uc()
  mm: memcontrol: reclaim and OOM kill when shrinking memory.max below usage
  mm: memcontrol: reclaim when shrinking memory.high below usage
  bcache: fix cache_set_flush() NULL pointer dereference on OOM
  bcache: fix race of writeback thread starting before complete initialization
  bcache: cleaned up error handling around register_cache()
  IB/srpt: Simplify srpt_handle_tsk_mgmt()
  brd: Fix discard request processing
  jbd2: fix FS corruption possibility in jbd2_journal_destroy() on umount path
  tools/hv: Use include/uapi with __EXPORTED_HEADERS__
  ALSA: hda - Fix unconditional GPIO toggle via automute
  ALSA: hda - fix the mic mute button and led problem for a Lenovo AIO
  ALSA: hda - Don't handle ELD notify from invalid port
  ALSA: intel8x0: Add clock quirk entry for AD1981B on IBM ThinkPad X41.
  ALSA: pcm: Avoid "BUG:" string for warnings again
  ALSA: hda - Apply reboot D3 fix for CX20724 codec, too
  mtip32xx: Cleanup queued requests after surprise removal
  mtip32xx: Implement timeout handler
  mtip32xx: Handle FTL rebuild failure state during device initialization
  mtip32xx: Handle safe removal during IO
  mtip32xx: Fix for rmmod crash when drive is in FTL rebuild
  mtip32xx: Print exact time when an internal command is interrupted
  mtip32xx: Remove unwanted code from taskfile error handler
  mtip32xx: Fix broken service thread handling
  mtip32xx: Avoid issuing standby immediate cmd during FTL rebuild
  media: v4l2-compat-ioctl32: fix missing length copy in put_v4l2_buffer32
  coda: fix first encoded frame payload
  bttv: Width must be a multiple of 16 when capturing planar formats
  adv7511: TX_EDID_PRESENT is still 1 after a disconnect
  saa7134: Fix bytesperline not being set correctly for planar formats
  8250: use callbacks to access UART_DLL/UART_DLM
  net: irda: Fix use-after-free in irtty_open()
  tty: Fix GPF in flush_to_ldisc(), part 2
  staging: comedi: ni_mio_common: fix the ni_write[blw]() functions
  staging: android: ion_test: fix check of platform_device_register_simple() error code
  staging: comedi: ni_tiocmd: change mistaken use of start_src for start_arg
  HID: fix hid_ignore_special_drivers module parameter
  HID: multitouch: force retrieving of Win8 signature blob
  HID: i2c-hid: fix OOB write in i2c_hid_set_or_send_report()
  HID: logitech: fix Dual Action gamepad support
  tpm: fix the cleanup of struct tpm_chip
  tpm_eventlog.c: fix binary_bios_measurements
  tpm_crb: tpm2_shutdown() must be called before tpm_chip_unregister()
  tpm: fix the rollback in tpm_chip_register()
  mei: bus: check if the device is enabled before data transfer
  X.509: Fix leap year handling again
  crypto: marvell/cesa - forward devm_ioremap_resource() error code
  crypto: ux500 - fix checks of error code returned by devm_ioremap_resource()
  crypto: atmel - fix checks of error code returned by devm_ioremap_resource()
  crypto: keywrap - memzero the correct memory
  crypto: ccp - memset request context to zero during import
  crypto: ccp - Don't assume export/import areas are aligned
  crypto: ccp - Limit the amount of information exported
  crypto: ccp - Add hash state import and export support
  Bluetooth: btusb: Add a new AR3012 ID 13d3:3472
  Bluetooth: btusb: Add a new AR3012 ID 04ca:3014
  Bluetooth: btusb: Add new AR3012 ID 13d3:3395
  ALSA: usb-audio: Fix double-free in error paths after snd_usb_add_audio_stream() call
  ALSA: usb-audio: Minor code cleanup in create_fixed_stream_quirk()
  ALSA: usb-audio: add Microsoft HD-5001 to quirks
  ALSA: usb-audio: Add sanity checks for endpoint accesses
  ALSA: usb-audio: Fix NULL dereference in create_fixed_stream_quirk()
  Input: powermate - fix oops with malicious USB descriptors
  pwc: Add USB id for Philips Spc880nc webcam
  USB: option: add "D-Link DWM-221 B1" device id
  USB: serial: ftdi_sio: Add support for ICP DAS I-756xU devices
  USB: serial: cp210x: Adding GE Healthcare Device ID
  USB: cypress_m8: add endpoint sanity check
  USB: digi_acceleport: do sanity checking for the number of ports
  USB: mct_u232: add sanity checking in probe
  USB: usb_driver_claim_interface: add sanity checking
  USB: iowarrior: fix oops with malicious USB descriptors
  USB: cdc-acm: more sanity checking
  USB: uas: Reduce can_queue to MAX_CMNDS
  usb: hub: fix a typo in hub_port_init() leading to wrong logic
  usb: retry reset if a device times out
  dm: fix rq_end_stats() NULL pointer in dm_requeue_original_request()
  dm cache: make sure every metadata function checks fail_io
  dm thin metadata: don't issue prefetches if a transaction abort has failed
  dm: fix excessive dm-mq context switching
  dm snapshot: disallow the COW and origin devices from being identical
  libnvdimm: Fix security issue with DSM IOCTL.
  aic7xxx: Fix queue depth handling
  be2iscsi: set the boot_kset pointer to NULL in case of failure
  scsi: storvsc: fix SRB_STATUS_ABORTED handling
  sd: Fix discard granularity when LBPRZ=1
  aacraid: Set correct msix count for EEH recovery
  aacraid: Fix memory leak in aac_fib_map_free
  aacraid: Fix RRQ overload
  sg: fix dxferp in from_to case
  x86/mm: TLB_REMOTE_SEND_IPI should count pages
  x86/iopl: Fix iopl capability check on Xen PV
  x86/iopl/64: Properly context-switch IOPL on Xen PV
  x86/apic: Fix suspicious RCU usage in smp_trace_call_function_interrupt()
  x86/irq: Cure live lock in fixup_irqs()
  PCI: ACPI: IA64: fix IO port generic range check
  PCI: Disable IO/MEM decoding for devices with non-compliant BARs
  pinctrl-bcm2835: Fix cut-and-paste error in "pull" parsing
  s390/pci: enforce fmb page boundary rule
  s390/cpumf: add missing lpp magic initialization
  s390: fix floating pointer register corruption (again)
  EDAC, amd64_edac: Shift wrapping issue in f1x_get_norm_dct_addr()
  EDAC/sb_edac: Fix computation of channel address
  sched/preempt, sh: kmap_coherent relies on disabled preemption
  sched/cputime: Fix steal_account_process_tick() to always return jiffies
  Thermal: Ignore invalid trip points
  perf tools: Fix python extension build
  perf tools: Fix checking asprintf return value
  perf tools: Dont stop PMU parsing on alias parse error
  perf/core: Fix perf_sched_count derailment
  KVM: VMX: fix nested vpid for old KVM guests
  KVM: VMX: avoid guest hang on invalid invvpid instruction
  KVM: VMX: avoid guest hang on invalid invept instruction
  KVM: fix spin_lock_init order on x86
  KVM: i8254: change PIT discard tick policy
  KVM: x86: fix missed hardware breakpoints
  x86/PCI: Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs
  perf/x86/intel: Add definition for PT PMI bit
  x86/entry/compat: Keep TS_COMPAT set during signal delivery
  x86/microcode: Untangle from BLK_DEV_INITRD
  x86/microcode/intel: Make early loader look for builtin microcode too
  mmc: sh_mmcif: Correct TX DMA channel allocation
  mmc: sh_mmcif: rework dma channel handling
  ASoC: samsung: pass DMA channels as pointers
  regulator: core: Fix nested locking of supplies
  regulator: core: avoid unused variable warning
  s390/cpumf: Fix lpp detection
  cpufreq: dt: No need to allocate resources anymore
  cpufreq: dt: No need to fetch voltage-tolerance
  cpufreq: dt: Use dev_pm_opp_set_rate() to switch frequency
  cpufreq: dt: Reuse dev_pm_opp_get_max_transition_latency()
  cpufreq: dt: Unsupported OPPs are already disabled
  cpufreq: dt: Pass regulator name to the OPP core
  cpufreq: dt: OPP layers handles clock-latency for V1 bindings as well
  cpufreq: dt: Rename 'need_update' to 'opp_v1'
  cpufreq: dt: Convert few pr_debug/err() calls to dev_dbg/err()
  cpufreq-dt: fix handling regulator_get_voltage() result
  cpufreq-dt: Supply power coefficient when registering cooling devices
  PM / OPP: Rename structures for clarity
  PM / OPP: Fix incorrect comments
  PM / OPP: Initialize regulator pointer to an error value
  PM / OPP: Initialize u_volt_min/max to a valid value
  PM / OPP: Fix NULL pointer dereference crash when disabling OPPs
  PM / OPP: Add dev_pm_opp_set_rate()
  PM / OPP: Manage device clk
  PM / OPP: Parse clock-latency and voltage-tolerance for v1 bindings
  PM / OPP: Introduce dev_pm_opp_get_max_transition_latency()
  PM / OPP: Introduce dev_pm_opp_get_max_volt_latency()
  PM / OPP: Disable OPPs that aren't supported by the regulator
  PM / OPP: get/put regulators from OPP core
  cpufreq: cpufreq-dt: avoid uninitialized variable warnings:
  PM / OPP: Use snprintf() instead of sprintf()
  PM / OPP: Set cpu_dev->id in cpumask first
  PM / OPP: Fix parsing of opp-microvolt and opp-microamp properties
  PM / OPP: Parse 'opp-<prop>-<name>' bindings
  PM / OPP: Parse 'opp-supported-hw' binding
  PM / OPP: Add missing doc comments
  PM / OPP: Rename OPP nodes as opp@<opp-hz>
  PM / OPP: Remove 'operating-points-names' binding
  PM / OPP: Add {opp-microvolt|opp-microamp}-<name> binding
  PM / OPP: Add "opp-supported-hw" binding
  PM / OPP: Add debugfs support
  arm64: vdso: Mark vDSO code as read-only

Conflicts:
	drivers/staging/android/ion/ion.c
	mm/page_alloc.c

CRs-Fixed: 1010239
Change-Id: Id59539cad642885e1e41340cebae4159ba1f7eaf
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
2016-07-22 16:45:32 -07:00
Joonwoo Park
c876c09f58 sched: kill unnecessary divisions on fast path
The max_possible_efficiency and CPU's efficiency are fixed values which
are determined at cluster allocation time.  Avoid division on the fast
path by using a precomputed scale factor.

Also update_cpu_busy_time() doesn't need to know how many full windows
have elapsed.  Thus replace unneeded division with simple comparison.

Change-Id: I2be1aad3fb9b895e4f0917d05bd8eade985bbccf
Suggested-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-06-21 15:11:21 -07:00
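
The two optimizations in c876c09f58 can be illustrated with a short sketch. It assumes a fixed-point scale factor in units of 1/1024. `struct cluster`, `cluster_init()`, `scale_exec_time()`, and `full_window_elapsed()` are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

#define SCALE_SHIFT 10 /* fixed point: 1024 represents 1.0 */

struct cluster {
    unsigned int efficiency;
    unsigned int exec_scale_factor; /* efficiency / max, in 1/1024 units */
};

/* Slow path, run once at cluster allocation: the only division. */
static void cluster_init(struct cluster *c, unsigned int efficiency,
                         unsigned int max_possible_efficiency)
{
    assert(max_possible_efficiency > 0);
    c->efficiency = efficiency;
    c->exec_scale_factor =
        (efficiency << SCALE_SHIFT) / max_possible_efficiency;
}

/* Fast path: multiply and shift instead of dividing. */
static uint64_t scale_exec_time(const struct cluster *c, uint64_t delta_ns)
{
    return (delta_ns * c->exec_scale_factor) >> SCALE_SHIFT;
}

/* Only "has a full window elapsed?" is needed, so compare, don't divide. */
static int full_window_elapsed(uint64_t delta_ns, uint64_t window_ns)
{
    return delta_ns >= window_ns;
}
```

For example, with an efficiency of 1024 and a maximum of 2048, the precomputed factor is 512, so 1000 ns of execution scales to 500 ns without any division on the fast path.
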
Joonwoo Park
c07e88c80f sched: remove unused parameter cpu from cpu_cycles_to_freq()
The function parameter cpu isn't used anymore by cpu_cycles_to_freq().
So remove it.

Change-Id: Ide19321206dacb88fedca97e1b689d740f872866
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-06-21 15:10:22 -07:00
Joonwoo Park
96818d6f1d sched: fix potential deflated frequency estimation during IRQ handling
The CPU cycle counter is stalled between the idle task's mark_start and
IRQ handler entry.  It is therefore inappropriate to include that
duration in the sample period when we do frequency estimation.

Fix this by replenishing the idle task's CPU cycle counter upon IRQ
entry and using irqtime as the time delta.

Change-Id: I274d5047a50565cfaaa2fb821ece21c8cf4c991d
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-06-09 15:08:01 -07:00
Joonwoo Park
54c0b0001b sched: preserve CPU cycle counter in rq
Preserve the cycle counter in rq in preparation for the fix to wait
time accounting while the CPU is idle.

Change-Id: I469263c90e12f39bb36bde5ed26298b7c1c77597
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-06-09 15:07:35 -07:00
Joonwoo Park
eedf0821f6 sched: Remove the sched heavy task frequency guidance feature
This has always been an unused feature given its limitation of adding
phantom load to the system. Since there are no immediate plans of
using this and the fact that it adds unnecessary complications to
the new load fixup mechanism, remove this feature for now. It can
be revisited later in light of the new mechanism.

Change-Id: Ie9501a898d0f423338293a8dde6bc56f493f1e75
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-06-03 14:47:39 -07:00
Srivatsa Vaddagiri
e6aae1c3e0 sched: Aggregate for frequency
Related threads in a group could execute on different CPUs and hence
present a split-demand picture to cpufreq governor. IOW the governor
fails to see the net cpu demand of all related threads in a given
window if the threads' execution were to be split across CPUs. That
could result in sub-optimal frequency chosen in comparison to the
ideal frequency at which the aggregate work (taken up by related
threads) needs to be run.

This patch aggregates cpu execution stats in a window for all related
threads in a group. This helps present cpu busy time to governor as if
all related threads were part of the same thread and thus help select
the right frequency required by related threads. This aggregation
is done per-cluster.

Change-Id: I71e6047620066323721c6d542034ddd4b2950e7f
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: Fixed notify_migration() to hold rcu read
 lock as this version of Linux doesn't hold p->pi_lock when the
 function gets called while keeping use of rcu_access_pointer() since
 we never dereference return value.]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-05-26 15:28:59 -07:00
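
The aggregation in e6aae1c3e0 can be pictured with the sketch below. It follows the description in 2552980f79 that the total group busy time is accounted towards the busiest CPU of the domain. `struct cpu_stats` and the three helper functions are hypothetical and greatly simplified compared with the scheduler's window statistics.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU window stats. */
struct cpu_stats {
    int cluster;
    uint64_t busy_ns;       /* busy time of tasks outside any group */
    uint64_t group_busy_ns; /* busy time of related (grouped) threads */
};

/* Sum the group busy time of every CPU in one cluster. */
static uint64_t cluster_group_busy(const struct cpu_stats *s, int n, int cluster)
{
    uint64_t total = 0;
    for (int i = 0; i < n; i++)
        if (s[i].cluster == cluster)
            total += s[i].group_busy_ns;
    return total;
}

/* Index of the cluster's busiest CPU by non-group busy time, or -1. */
static int busiest_cpu(const struct cpu_stats *s, int n, int cluster)
{
    int best = -1;
    for (int i = 0; i < n; i++)
        if (s[i].cluster == cluster &&
            (best < 0 || s[i].busy_ns > s[best].busy_ns))
            best = i;
    return best;
}

/* Load shown to the governor: the busiest CPU carries the aggregate. */
static uint64_t reported_load(const struct cpu_stats *s, int n, int cpu)
{
    assert(cpu >= 0 && cpu < n);
    uint64_t load = s[cpu].busy_ns;
    if (busiest_cpu(s, n, s[cpu].cluster) == cpu)
        load += cluster_group_busy(s, n, s[cpu].cluster);
    return load;
}
```

The governor then sees the related threads' work as if it had all run on one CPU of the cluster, instead of a split-demand picture.
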
Joonwoo Park
d9ff0d77af sched: simplify CPU frequency estimation and cycle counter API
Most CPUs increase the cycle counter by one every cycle, which makes
frequency = cycles / time_delta correct.  Therefore it's reasonable
to get rid of the current cpu_cycle_max_scale_factor and ask the cycle
counter read callback function to return a scaled counter value in
cases where the cycle counter doesn't increase every cycle.

Thus multiply the CPU cycle counter delta by NSEC_PER_SEC / HZ_PER_KHZ,
as we calculate frequency in kHz, and remove cpu_cycle_max_scale_factor.
This allows us to simplify frequency estimation and the cycle counter API.
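
As an illustration only, the arithmetic described above could be
sketched as follows. The helper name estimate_freq_khz() and its
signature are hypothetical and not taken from the patch; the constants
mirror the kernel's NSEC_PER_SEC and HZ_PER_KHZ values. The sketch
assumes the counter advances by one per CPU cycle.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define HZ_PER_KHZ   1000ULL

/*
 * Hypothetical helper: estimate the CPU frequency in kHz from a cycle
 * counter delta and the elapsed time in nanoseconds.
 * cycles / ns is GHz, so scaling by NSEC_PER_SEC / HZ_PER_KHZ gives kHz.
 * Large deltas could overflow the multiplication; real code would need
 * to bound the inputs.
 */
static uint64_t estimate_freq_khz(uint64_t cycles_delta, uint64_t time_delta_ns)
{
	if (time_delta_ns == 0)
		return 0; /* no elapsed time, no estimate */

	return cycles_delta * (NSEC_PER_SEC / HZ_PER_KHZ) / time_delta_ns;
}
```

For example, 1,500,000 cycles observed over 1 ms (1,000,000 ns)
corresponds to 1.5 GHz, i.e. 1,500,000 kHz.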

Change-Id: Ie7a628d4bc77c9b6c769f6099ce8d75740262a14
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-05-20 19:23:47 -07:00
Joonwoo Park
55b8e041e6 sched: take into account of limited CPU min and max frequencies
The actual min and max frequencies of a CPU can be limited by hardware
components that the governor is not aware of.  Provide an API for such
components to notify the scheduler, so that it knows the accurate
frequency boundaries currently in effect, which helps it make better
task placement decisions.

CRs-fixed: 1006303
Change-Id: I608f5fa8b0baff8d9e998731dcddec59c9073d20
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-04-27 19:13:06 -07:00
Joonwoo Park
35f1d99e0a sched: add support for CPU frequency estimation with cycle counter
At present the scheduler calculates a task's demand from the task's
execution time weighted by CPU frequency.  The CPU frequency is given by
the governor's CPU frequency transition notification.  Such a
notification may not be available.

Provide an API for the CPU clock driver to register callback functions
so that the scheduler can access the CPU's cycle counter to estimate the
CPU's frequency without notification.  At this point the scheduler
assumes the cycle counter always increases, even when the cluster is
idle, which might not be true.  This will be fixed by a subsequent
change for more accurate I/O wait time accounting.

CRs-fixed: 1006303
Change-Id: I93b187efd7bc225db80da0184683694f5ab99738
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-04-27 19:13:05 -07:00
Thomas Gleixner
2a8225ef46 sched/cputime: Fix steal time accounting vs. CPU hotplug
commit e9532e69b8d1d1284e8ecf8d2586de34aec61244 upstream.

On CPU hotplug the steal time accounting can keep a stale rq->prev_steal_time
value over CPU down and up. So after the CPU comes up again the delta
calculation in steal_account_process_tick() wreckages itself due to the
unsigned math:

	 u64 steal = paravirt_steal_clock(smp_processor_id());

	 steal -= this_rq()->prev_steal_time;

So if steal is smaller than rq->prev_steal_time we end up with an insane large
value which then gets added to rq->prev_steal_time, resulting in a permanent
wreckage of the accounting. As a consequence the per CPU stats in /proc/stat
become stale.

Nice trick to tell the world how idle the system is (100%) while the CPU is
100% busy running tasks. Though we prefer realistic numbers.

None of the accounting values which use a previous value to account for
fractions is reset at CPU hotplug time. update_rq_clock_task() has a sanity
check for prev_irq_time and prev_steal_time_rq, but that sanity check solely
deals with clock warps and limits the /proc/stat visible wreckage. The
prev_time values are still wrong.

Solution is simple: Reset rq->prev_*_time when the CPU is plugged in again.
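
A minimal user-space sketch of the failure and the fix, for
illustration only. struct rq_sketch, steal_delta() and
reset_on_cpu_online() are hypothetical names, and the sketch assumes
the steal clock restarts from a smaller value after the CPU comes back
up and that resetting means zeroing the saved value:

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU runqueue field discussed above. */
struct rq_sketch {
	uint64_t prev_steal_time;
};

/*
 * Mirrors the unsigned delta calculation: if steal_clock is smaller than
 * the stale prev_steal_time, the subtraction wraps to a huge value.
 */
static uint64_t steal_delta(struct rq_sketch *rq, uint64_t steal_clock)
{
	uint64_t delta = steal_clock - rq->prev_steal_time;

	rq->prev_steal_time += delta;
	return delta;
}

/* The fix in sketch form: forget the stale value when the CPU is plugged in. */
static void reset_on_cpu_online(struct rq_sketch *rq)
{
	rq->prev_steal_time = 0;
}
```

With a stale prev_steal_time of 500 and a fresh steal clock of 100, the
delta wraps around to nearly 2^64; after the reset the same reading
yields a delta of 100.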

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: commit 095c0aa83e "sched: adjust scheduler cpu power for stolen time"
Fixes: commit aa48380851 "sched: Remove irq time from available CPU power"
Fixes: commit e6e6685acc "KVM guest: Steal time accounting"
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603041539490.3686@nanos
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12 09:09:05 -07:00
Olav Haugan
b29f9a7a84 sched/core: Add protection against null-pointer dereference
p->grp is being accessed outside of lock which can cause null-pointer
dereference. Fix this and also add rcu critical section around access
of this data structure.

CRs-fixed: 985379
Change-Id: Ic82de6ae2821845d704f0ec18046cc6a24f98e39
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
[joonwoop@codeaurora.org: fixed conflict in init_new_task_load().]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 21:25:21 -07:00
Pavankumar Kondeti
6d742ce87b sched: Add separate load tracking histogram to predict loads
Current window based load tracking only saves history for five
windows. A historically heavy task's heavy load will be completely
forgotten after five windows of light load. Even before the five
windows expire, a heavy task that wakes up on the same CPU it used to
run on won't trigger any frequency change until the end of the window.
It would starve for the entire window. It also adds one "small" load
window to history because it's accumulating load at a low frequency,
further reducing the tracked load for this heavy task.

Ideally, scheduler should be able to identify such tasks and notify
governor to increase frequency immediately after it wakes up.

Add a histogram for each task to track a much longer load history. A
prediction will be made based on runtime of previous or current
window, histogram data and load tracked in recent windows. Predictions
for all tasks that are currently running or runnable on a CPU are
aggregated and reported to CPUFreq governor in sched_get_cpus_busy().

sched_get_cpus_busy() now returns predicted busy time in addition
to previous window busy time and new task busy time, scaled to
the CPU maximum possible frequency.

Tunables:

- /proc/sys/kernel/sched_gov_alert_freq (KHz)

This tunable can be used to further filter the notifications.
Frequency alert notification is sent only when the predicted
load exceeds previous window load by sched_gov_alert_freq converted to
load.

Change-Id: If29098cd2c5499163ceaff18668639db76ee8504
Suggested-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
[joonwoop@codeaurora.org: fixed merge conflicts around __migrate_task()
 and removed changes for CONFIG_SCHED_QHMP.]
2016-03-23 21:25:17 -07:00
Junjie Wu
efa673322f sched: Provide a wake up API without sending freq notifications
Each time a task wakes up, scheduler evaluates its load and notifies
governor if the resulting frequency of destination CPU is larger than
a threshold. However, some governors wake up a separate task that
handles frequency change, which again calls wake_up_process().

This is dangerous because if the task being woken up meets the
threshold and ends up being moved around, there is a potential for
endless recursive notifications.

Introduce a new API for waking up a task without triggering
frequency notification.

Change-Id: I24261af81b7dc410c7fb01eaa90920b8d66fbd2a
Signed-off-by: Junjie Wu <junjiew@codeaurora.org>
2016-03-23 21:25:17 -07:00
Pavankumar Kondeti
6003b006be sched: Provide a facility to restrict RT tasks to lower power cluster
The current CPU selection algorithm for RT tasks looks for the
least loaded CPU in all clusters. Stop the search at the lowest
possible power cluster based on "sched_restrict_cluster_spill"
sysctl tunable.

Change-Id: I34fdaefea56e0d1b7e7178d800f1bb86aa0ec01c
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2016-03-23 21:25:15 -07:00
Pavankumar Kondeti
8cd1d7ef16 sched: Take cluster's minimum power into account for optimizing sbc()
The select_best_cpu() algorithm iterates over all the clusters and
selects the most power efficient CPU that satisfies the task needs.
During the search, skip the next cluster if its minimum power cost
is higher than the power cost of an eligible CPU found in the previous
cluster.

In a b.L system, if the BIG cluster minimum power cost is higher than
the maximum power cost of the little cluster, this optimization avoids
searching the BIG cluster if an eligible CPU is found in the little
cluster.
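
The pruning described above could look roughly like the following
sketch. It is not the actual select_best_cpu() code: the struct
layouts, the eligible flag and pick_cpu() are hypothetical, and the
clusters are assumed to be visited in the order they are stored.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical, simplified views of a CPU and a cluster. */
struct cpu_sketch {
	int id;
	unsigned int power_cost;
	int eligible;              /* non-zero if the CPU satisfies the task */
};

struct cluster_sketch {
	unsigned int min_power_cost; /* lowest power cost any CPU here can have */
	const struct cpu_sketch *cpus;
	size_t nr_cpus;
};

/* Returns the id of the cheapest eligible CPU, or -1 if none is eligible. */
static int pick_cpu(const struct cluster_sketch *clusters, size_t nr_clusters)
{
	unsigned int best_cost = UINT_MAX;
	int best_cpu = -1;

	for (size_t i = 0; i < nr_clusters; i++) {
		const struct cluster_sketch *c = &clusters[i];

		/* Skip a cluster that cannot beat the CPU already found. */
		if (best_cpu >= 0 && c->min_power_cost > best_cost)
			continue;

		for (size_t j = 0; j < c->nr_cpus; j++) {
			const struct cpu_sketch *cpu = &c->cpus[j];

			if (cpu->eligible && cpu->power_cost < best_cost) {
				best_cost = cpu->power_cost;
				best_cpu = cpu->id;
			}
		}
	}
	return best_cpu;
}
```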

Change-Id: I5e3755f107edb6c72180edbec2a658be931c276d
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
2016-03-23 21:25:14 -07:00
Pavankumar Kondeti
6418f213ab sched: Revise the inter cluster load balance restrictions
The frequency based inter cluster load balance restrictions are not
reliable as frequency does not provide a good estimate of the CPU's
current load. Replace them with the spill_load and spill_nr_run
based checks.

The higher capacity cluster is restricted from pulling the tasks from
the lower capacity cluster unless all of the lower capacity CPUs are
above spill. This behavior can be controlled by a sysctl tunable and
it is disabled by default (i.e. no load balance restrictions).

Change-Id: I45c09c8adcb61a8a7d4e08beadf2f97f1805fb42
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
[joonwoop@codeaurora.org: fixed merge conflicts due to omitted changes
 for CONFIG_SCHED_QHMP.]
2016-03-23 21:25:13 -07:00
Srivatsa Vaddagiri
3004236139 sched: colocate related threads
Provide userspace interface for tasks to be grouped together as
"related" threads. For example, all threads involved in updating
display buffer could be tagged as related.

Scheduler will attempt to provide special treatment for group of
related threads such as:

1) Colocation of related threads in same "preferred" cluster
2) Aggregation of demand towards determination of cluster frequency

This patch extends scheduler to provide best-effort colocation support
for a group of related threads.

Change-Id: Ic2cd769faf5da4d03a8f3cb0ada6224d0101a5f5
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
[joonwoop@codeaurora.org: fixed minor merge conflicts.  removed ifdefry
 for CONFIG_SCHED_QHMP.]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>

Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 21:25:12 -07:00
Srivatsa Vaddagiri
df6bfcaf70 sched: Update fair and rt placement logic to use scheduler clusters
Make use of clusters in the fair and rt scheduling classes. This is
needed as the freq domain mask can no longer be used to do correct
task placement. The freq domain mask was being used to demarcate
clusters.

Change-Id: I57f74147c7006f22d6760256926c10fd0bf50cbd
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: fixed merge conflicts due to omitted changes
 for CONFIG_SCHED_QHMP.]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 21:25:11 -07:00
Srivatsa Vaddagiri
cb1bb6a8f4 sched: Introduce the concept CPU clusters in the scheduler
A cluster is a set of CPUs sharing some power controls and an L2 cache.
This patch builds a list of clusters at bootup which are sorted by
their max_power_cost. Many cluster-shared attributes like cur_freq,
max_freq etc are needlessly maintained in per-cpu 'struct rq' currently.
Consolidate them in a cluster structure.

Change-Id: I0567672ad5fb67d211d9336181ceb53b9f6023af
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
[joonwoop@codeaurora.org: fixed minor conflict in
 arch/arm64/kernel/topology.c. fixed conflict due to omitted changes for
 CONFIG_SCHED_QHMP.]
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 21:25:10 -07:00
Joonwoo Park
9df619ba91 sched: fix compile failure where !CONFIG_SCHED_HMP
Fix compile failure when HMP scheduler isn't selected.

Change-Id: I411fa3501a4c4ac280c037a1698aa3b7278d440f
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:44 -07:00
Syed Rameez Mustafa
280b866848 sched: Optimize scheduler trace events to reduce trace buffer usage
Scheduler ftrace events currently generate a lot of data when turned
on. The excessive log messages often end up overflowing trace buffers
for long use cases or crowding out other events. Optimize scheduler
events so that the log spew is less and more manageable. To that end
change the variable type for some event fields; introduce variants
of sched_cpu_load that can be turned on/off for separate code paths
and remove unused fields from various events.

Change-Id: I2b313542b39ad5e09a01ad1303b5dfe2c4883b8a
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: fixed conflict in rt.c due to
 CONFIG_SCHED_QHMP.]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:38 -07:00
Syed Rameez Mustafa
c00814c023 sched: Notify cpufreq governor early about potential big tasks
Tasks that are on the runqueue continuously for a certain amount of time
have the potential to be big tasks at the end of the window in which they
are runnable. In such scenarios ramping the CPU frequency early can
boost performance rather than waiting till the end of a window for the
governor to query load. Notify the governor early at every tick when a
task has been observed to execute beyond some percentage of the tick
period.

The threshold beyond which a task is eligible for early detection can be
changed via the tunable sched_early_detection_duration. The feature itself
is enabled only when scheduler boost is in effect.

Change-Id: I528b72bbc79a55b4593d1b8ab45450411c6d70f3
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: fixed conflict in scheduler_tick() in
 kernel/sched/core.c.  fixed minor conflicts in include/linux/sched.h,
 include/linux/sched/sysctl.h and kernel/sysctl.c due to
 CONFIG_SCHED_QHMP.]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:34 -07:00
Joonwoo Park
b2e60dbe08 sched: avoid unnecessary multiplication and division
Avoid unnecessary multiplication and division when load scaling factor
is 1024.
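
A sketch of the kind of shortcut described, for illustration. The
function name scale_load() and the exact scaling formula are
assumptions, not the patch's code; 1024 is treated as the neutral
scaling factor.

```c
#include <stdint.h>

/*
 * Hypothetical helper: scale a load value by a factor expressed in
 * 1/1024 units. A factor of 1024 means "no scaling", so the
 * multiplication and division can be skipped entirely.
 */
static inline uint64_t scale_load(uint64_t load, unsigned int load_scale_factor)
{
	if (load_scale_factor == 1024)
		return load;

	return load * load_scale_factor / 1024;
}
```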

Change-Id: If3cb63a77feaf49cc69ddec7f41cc3c1cabbfc5a
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:31 -07:00
Joonwoo Park
91a8710235 sched: precompute required frequency for CPU load
At present, in order to estimate the power cost of CPU load, the HMP
scheduler converts CPU load to the corresponding frequency on the fly,
which can be avoided.

Optimize and reduce the execution time of select_best_cpu() by
precomputing the CPU load to frequency conversion.  This optimization
reduces the execution time of select_best_cpu() by about 20% on average.

Change-Id: I385c57f2ea9a50883b76ba6ca3deb673b827217f
[joonwoop@codeaurora.org: fixed minor conflict in kernel/sched/sched.h.
 stripped out codes for CONFIG_SCHED_QHMP.]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:31 -07:00
Joonwoo Park
383ae6b29e sched: clean up fixup_hmp_sched_stats()
The commit 392edf4969d20 ("sched: avoid stale cumulative_runnable_avg
HMP statistics") introduced the callback function fixup_hmp_sched_stats()
so update_history() can avoid a decrement and increment pair on the HMP
stat.  However, the commit also made the fixup function do an obscure
p->ravg.demand update, which isn't the cleanest way.

Revise the function fixup_hmp_sched_stats() so the caller can update
p->ravg.demand directly.

Change-Id: Id54667d306495d2109c26362813f80f08a1385ad
[joonwoop@codeaurora.org: stripped out CONFIG_SCHED_QHMP.]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:30 -07:00
Joonwoo Park
446beddcd4 sched: account new task load so that governor can apply different policy
Account the amount of load contributed by new tasks within the CPU load
so that the governor can apply a different policy when the CPU is loaded
by new tasks.

To be able to distinguish new task load, a new tunable
sched_new_task_windows is also introduced.  The tunable defines tasks as
new when they have been active for fewer than the configured number of
windows.

Change-Id: I2e2e62e4103882f7362154b792ab978b181b9f59
Suggested-by: Saravana Kannan <skannan@codeaurora.org>
[joonwoop@codeaurora.org: omitted changes for
 drivers/cpufreq/cpufreq_interactive.c.  cpufreq changes needs to be
 applied separately later.  fixed conflict in include/linux/sched.h and
 include/linux/sched/sysctl.h.  omitted changes for qhmp_core.c]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:29 -07:00
Olav Haugan
03a683a55c sched: Add tunables for static cpu and cluster cost
Add per-cpu tunable to set the extra cost to use a CPU that is idle.
Add the same for a cluster.

Change-Id: I4aa53f3c42c963df7abc7480980f747f0413d389
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
[joonwoop@codeaurora.org: omitted changes for qhmp*.[c,h]  stripped out
 CONFIG_SCHED_QHMP in drivers/base/cpu.c and include/linux/sched.h]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:27 -07:00
Olav Haugan
4996dafe68 sched/core: Add API to set cluster d-state
Add new API to the scheduler to allow low power mode driver to inform
the scheduler about the d-state of a cluster. This can be leveraged by
the scheduler to make an informed decision about the cost of placing a task
on a cluster.

Change-Id: If0fe0fdba7acad1c2eb73654ebccfdb421225e62
Signed-off-by: Olav Haugan <ohaugan@codeaurora.org>
[joonwoop@codeaurora.org: omitted fixes for qhmp_core.c and qhmp_core.h]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:26 -07:00
Joonwoo Park
b4627e0104 sched: take into account of governor's frequency max load
At present the HMP scheduler packs tasks onto a busy CPU until the CPU's
load is 100%, to avoid waking up idle CPUs as much as possible.  Such
aggressive packing leads to unintended CPU frequency raises, as the
governor raises the busy CPU's frequency when its load exceeds the
configured frequency max load, which can be less than 100%.

Take the governor's frequency max load into account and pack tasks
only when the CPU's projected load is less than max load, to avoid
unnecessary frequency raises.
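
The packing condition could be sketched as below. This is an
illustration under stated assumptions: can_pack(), its parameters and
the percentage representation of the governor's max load are all
hypothetical and do not reflect the patch's actual interface.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: allow packing a task onto a busy CPU only if the
 * projected load (current CPU load plus the task's load) stays below the
 * governor's frequency max load, given as a percentage of capacity.
 */
static bool can_pack(uint64_t cpu_load, uint64_t task_load,
		     unsigned int max_load_pct, uint64_t capacity)
{
	uint64_t projected = cpu_load + task_load;

	return projected * 100 < capacity * max_load_pct;
}
```

With a max load of 90%, a CPU at 60% can take a 20% task but not a 35%
one; under the old 100% limit the 35% task would have been packed too.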

Change-Id: I4447e5e0c2fa5214ae7a9128f04fd7585ed0dcac
[joonwoop@codeaurora.org: fixed minor conflict in kernel/sched/sched.h]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:25 -07:00
Syed Rameez Mustafa
ca42a1bec8 sched: add frequency zone awareness to the load balancer
Add zone awareness to the load balancer. Remove all earlier restrictions
that the load balancer had for inter cluster kicks and migration.

Change-Id: I12ad3d0c2d2e9bb498f49a231810f2ad418b061f
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: fixed minor conflict in nohz_kick_needed() due
 to its return type change.]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:21 -07:00
Syed Rameez Mustafa
87fe20de7e sched: Update the wakeup placement logic for fair and rt tasks
For the fair sched class, update the select_best_cpu() policy to do
power based placement. The hope is to minimize the voltage at which
the CPU runs.

While RT tasks already do power based placement, their placement
preference has to now take into account the power cost of all tasks
on a given CPU. Also remove the check for sched_boost since
sched_boost no longer intends to elevate all tasks to the highest
capacity cluster.

Change-Id: Ic6a7625c97d567254d93b94cec3174a91727cb87
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:02:20 -07:00
Syed Rameez Mustafa
d590f25153 sched: remove the notion of small tasks and small task packing
Task packing will now be determined solely on the basis of the
power cost of task placement. All tasks are eligible for packing.
Remove the notion of "small" tasks from the scheduler.

Change-Id: I72d52d04b2677c6a8d0bc6aa7d50ff0f1a4f5ebb
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:02:19 -07:00
Joonwoo Park
b5a9a7b1c7 sched: avoid stale cumulative_runnable_avg HMP statistics
When a new window starts for a task and the task is on a rq, the
scheduler momentarily decreases the rq's cumulative_runnable_avg,
re-accounts the task's demand and increases the rq's
cumulative_runnable_avg by the newly accounted demand.  Therefore there
is a short period during which the rq's cumulative_runnable_avg is less
than what it's supposed to be.  Meanwhile, there is a chance that
another CPU is searching for the best CPU to place a task on and makes a
suboptimal decision based on the momentarily stale
cumulative_runnable_avg.

Fix this by adding or subtracting the delta between the task's old and
new demand instead of decrementing and incrementing the task's entire
load.
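
In sketch form, the delta-based update might look like this. The
struct and function names are hypothetical and only illustrate the
idea of a single adjustment that never exposes an intermediate,
too-low value:

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-rq HMP statistics. */
struct hmp_stats_sketch {
	uint64_t cumulative_runnable_avg;
};

/*
 * Apply only the difference between the task's old and new demand,
 * instead of subtracting the old demand and adding the new one in two
 * separate steps.
 */
static void fixup_cumulative_runnable_avg(struct hmp_stats_sketch *stats,
					  uint32_t old_demand,
					  uint32_t new_demand)
{
	int64_t delta = (int64_t)new_demand - (int64_t)old_demand;

	/* Unsigned wrap-around makes adding a negative delta a subtraction. */
	stats->cumulative_runnable_avg += (uint64_t)delta;
}
```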

Change-Id: I3c9329961e6f96e269fa13359e7d1c39c4973ff2
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:16 -07:00
Syed Rameez Mustafa
d109fbbf71 sched: Add load based placement for RT tasks
Currently RT tasks prefer to go to the lowest power CPU in the
system. This can end up causing contention on the lowest power
CPU. Instead ensure that RT tasks end up on the lowest power
cluster and the least loaded CPU within that cluster.

Change-Id: I363b3d43236924962c67d2fb5d3d2d09800cd994
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
2016-03-23 20:02:15 -07:00
Joonwoo Park
a509c84de7 sched: inline function scale_load_to_cpu()
Inline relatively small and frequently used function scale_load_to_cpu().

CRs-fixed: 849655
Change-Id: Id5f60595c394959d78e6da4cc4c18c338fec285b
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:14 -07:00
Syed Rameez Mustafa
7ebc066cdb sched: Optimize select_best_cpu() to reduce execution time
select_best_cpu() is a crucial wakeup routine that determines the
time taken by the scheduler to wake up a task. Optimize this routine
to get higher performance. The following changes have been made as
part of the optimization listed in order of how they built on top of
one another:

* Several routines called by select_best_cpu() recalculate task load
  and CPU load even though these are already known quantities. For
  example mostly_idle_cpu_sync() calculates CPU load; task_will_fit()
  calculates task load before spill_threshold_crossed() recalculates
  both. Remove these redundant calculations by moving the task load
  and CPU load computations to the select_best_cpu() 'for' loop and
  passing to any functions that need the information.

* Rewrite best_small_task_cpu() to avoid the existing two pass
  approach. The two pass approach was only in place to find the
  minimum power cluster for small task placement. This information
  can easily be established by looking at runqueue capacities. The
  cluster that does not have the highest capacity constitutes the
  minimum power cluster. A special CPU mask called the mpc_mask is
  required to safeguard against undue side effects on SMP systems. Also
  terminate the function early if the previous CPU is found to be
  mostly_idle.

* Reorganize code to ensure that no unnecessary computations or
  variable assignments are done. For example there is no need to
  compute CPU load if that information does not end up getting used
  in any iteration of the 'for' loop.

* The tick logic for EA migrations unnecessarily checks for the power
  of all CPUs only for skip_cpu() to throw away the result later.
  Ensure that for EA we only check CPUs within the same cluster
  and avoid running select_best_cpu() whenever possible.

CRs-fixed: 849655
Change-Id: I4e722912fcf3fe4e365a826d4d92a4dd45c05ef3
Signed-off-by: Syed Rameez Mustafa <rameezmustafa@codeaurora.org>
[joonwoop@codeaurora.org: fixed cpufreq_notifier_policy() to set mpc_mask.
 added a comment about prerequisite of lower_power_cpu_available().
 s/struct rq * rq/struct rq *rq/. s/TASK_NICE/task_nice/]
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:10 -07:00
Srivatsa Vaddagiri
eca78aaf84 sched: report loads greater than 100% only during load alert notifications
The busy time of CPUs is adjusted during task migrations. This can
result in reporting the load greater than 100% to the governor and
causes direct jumps to the higher frequencies during the intra cluster
migrations. Hence clip the load to 100% during the load reporting at
the end of the window. The load is not clipped for load alert notifications
which allows ramping up the frequency faster for inter cluster migrations
and heavy task wakeup scenarios.
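
The clipping rule can be illustrated with a short sketch. The function
reported_load() and its parameters are hypothetical; 100% load is
modelled here as busy time equal to the window size.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: busy time reported to the governor. At the end of
 * a window the value is clipped to the window size (100% load); during a
 * load alert notification it is passed through unclipped.
 */
static uint64_t reported_load(uint64_t busy_time, uint64_t window_size,
			      bool load_alert)
{
	if (!load_alert && busy_time > window_size)
		return window_size;

	return busy_time;
}
```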

Change-Id: I7347260aa476287ecfc706d4dd0877f4b75a1089
Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org>
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:07 -07:00
Joonwoo Park
e61ddbb14c Revert "sched: Use only partial wait time as task demand"
This reverts commit 0e2092e47488 ("sched: Use only partial wait time as
task demand") as it causes performance regression.

Change-Id: I3917858be98530807c479fc31eb76c0f22b4ea89
Signed-off-by: Joonwoo Park <joonwoop@codeaurora.org>
2016-03-23 20:02:03 -07:00