* v4.4-16.09-android-tmp: unsafe_[get|put]_user: change interface to use a error target label usercopy: remove page-spanning test for now usercopy: fix overlap check for kernel text mm/slub: support left redzone Linux 4.4.21 lib/mpi: mpi_write_sgl(): fix skipping of leading zero limbs regulator: anatop: allow regulator to be in bypass mode hwrng: exynos - Disable runtime PM on probe failure cpufreq: Fix GOV_LIMITS handling for the userspace governor metag: Fix atomic_*_return inline asm constraints scsi: fix upper bounds check of sense key in scsi_sense_key_string() ALSA: timer: fix NULL pointer dereference on memory allocation failure ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE ALSA: timer: fix NULL pointer dereference in read()/ioctl() race ALSA: hda - Enable subwoofer on Dell Inspiron 7559 ALSA: hda - Add headset mic quirk for Dell Inspiron 5468 ALSA: rawmidi: Fix possible deadlock with virmidi registration ALSA: fireworks: accessing to user space outside spinlock ALSA: firewire-tascam: accessing to user space outside spinlock ALSA: usb-audio: Add sample rate inquiry quirk for B850V3 CP2114 crypto: caam - fix IV loading for authenc (giv)decryption uprobes: Fix the memcg accounting x86/apic: Do not init irq remapping if ioapic is disabled vhost/scsi: fix reuse of &vq->iov[out] in response bcache: RESERVE_PRIO is too small by one when prio_buckets() is a power of two. ubifs: Fix assertion in layout_in_gaps() ovl: fix workdir creation ovl: listxattr: use strnlen() ovl: remove posix_acl_default from workdir ovl: don't copy up opaqueness wrappers for ->i_mutex access lustre: remove unused declaration timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING timekeeping: Cap array access in timekeeping_debug xfs: fix superblock inprogress check ASoC: atmel_ssc_dai: Don't unconditionally reset SSC on stream startup drm/msm: fix use of copy_from_user() while holding spinlock drm: Reject page_flip for !DRIVER_MODESET drm/radeon: fix radeon_move_blit on 32bit systems s390/sclp_ctl: fix potential information leak with /dev/sclp rds: fix an infoleak in rds_inc_info_copy powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 nvme: Call pci_disable_device on the error path. 
cgroup: reduce read locked section of cgroup_threadgroup_rwsem during fork block: make sure a big bio is split into at most 256 bvecs block: Fix race triggered by blk_set_queue_dying() ext4: avoid modifying checksum fields directly during checksum verification ext4: avoid deadlock when expanding inode size ext4: properly align shifted xattrs when expanding inodes ext4: fix xattr shifting when expanding inodes part 2 ext4: fix xattr shifting when expanding inodes ext4: validate that metadata blocks do not overlap superblock net: Use ns_capable_noaudit() when determining net sysctl permissions kernel: Add noaudit variant of ns_capable() KEYS: Fix ASN.1 indefinite length object parsing drivers:hv: Lock access to hyperv_mmio resource tree cxlflash: Move to exponential back-off when cmd_room is not available netfilter: x_tables: check for size overflow drm/amdgpu/cz: enable/disable vce dpm even if vce pg is disabled cred: Reject inodes with invalid ids in set_create_file_as() fs: Check for invalid i_uid in may_follow_link() IB/IPoIB: Do not set skb truesize since using one linearskb udp: properly support MSG_PEEK with truncated buffers crypto: nx-842 - Mask XERS0 bit in return value cxlflash: Fix to avoid virtual LUN failover failure cxlflash: Fix to escalate LINK_RESET also on port 1 tipc: fix nl compat regression for link statistics tipc: fix an infoleak in tipc_nl_compat_link_dump netfilter: x_tables: check for size overflow Bluetooth: Add support for Intel Bluetooth device 8265 [8087:0a2b] drm/i915: Check VBT for port presence in addition to the strap on VLV/CHV drm/i915: Only ignore eDP ports that are connected Input: xpad - move pending clear to the correct location net: thunderx: Fix link status reporting x86/hyperv: Avoid reporting bogus NMI status for Gen2 instances crypto: vmx - IV size failing on skcipher API tda10071: Fix dependency to REGMAP_I2C crypto: vmx - Fix ABI detection crypto: vmx - comply with ABIs that specify vrsave as reserved. HID: core: prevent out-of-bound readings lpfc: Fix DMA faults observed upon plugging loopback connector block: fix blk_rq_get_max_sectors for driver private requests irqchip/gicv3-its: numa: Enable workaround for Cavium thunderx erratum 23144 clocksource: Allow unregistering the watchdog btrfs: Continue write in case of can_not_nocow blk-mq: End unstarted requests on dying queue cxlflash: Fix to resolve dead-lock during EEH recovery drm/radeon/mst: fix regression in lane/link handling. ecryptfs: fix handling of directory opening ALSA: hda: add AMD Polaris-10/11 AZ PCI IDs with proper driver caps drm: Balance error path for GEM handle allocation ntp: Fix ADJ_SETOFFSET being used w/ ADJ_NANO time: Verify time values in adjtimex ADJ_SETOFFSET to avoid overflow Input: xpad - correctly handle concurrent LED and FF requests net: thunderx: Fix receive packet stats net: thunderx: Fix for multiqset not configured upon interface toggle perf/x86/cqm: Fix CQM memory leak and notifier leak perf/x86/cqm: Fix CQM handling of grouping events into a cache_group s390/crypto: provide correct file mode at device register. 
proc: revert /proc/<pid>/maps [stack:TID] annotation intel_idle: Support for Intel Xeon Phi Processor x200 Product Family cxlflash: Fix to avoid unnecessary scan with internal LUNs Drivers: hv: vmbus: don't manipulate with clocksources on crash Drivers: hv: vmbus: avoid scheduling in interrupt context in vmbus_initiate_unload() Drivers: hv: vmbus: avoid infinite loop in init_vp_index() arcmsr: fixes not release allocated resource arcmsr: fixed getting wrong configuration data s390/pci_dma: fix DMA table corruption with > 4 TB main memory net/mlx5e: Don't modify CQ before it was created net/mlx5e: Don't try to modify CQ moderation if it is not supported mmc: sdhci: Do not BUG on invalid vdd UVC: Add support for R200 depth camera sched/numa: Fix use-after-free bug in the task_numa_compare ALSA: hda - add codec support for Kabylake display audio codec drm/i915: Fix hpd live status bits for g4x tipc: fix nullptr crash during subscription cancel arm64: Add workaround for Cavium erratum 27456 net: thunderx: Fix for Qset error due to CQ full drm/radeon: fix dp link rate selection (v2) drm/amdgpu: fix dp link rate selection (v2) qla2xxx: Use ATIO type to send correct tmr response mmc: sdhci: 64-bit DMA actually has 4-byte alignment drm/atomic: Do not unset crtc when an encoder is stolen drm/i915/skl: Add missing SKL ids drm/i915/bxt: update list of PCIIDs hrtimer: Catch illegal clockids i40e/i40evf: Fix RSS rx-flow-hash configuration through ethtool mpt3sas: Fix for Asynchronous completion of timedout IO and task abort of timedout IO. mpt3sas: A correction in unmap_resources net: cavium: liquidio: fix check for in progress flag arm64: KVM: Configure TCR_EL2.PS at runtime irqchip/gic-v3: Make sure read from ICC_IAR1_EL1 is visible on redestributor pwm: lpc32xx: fix and simplify duty cycle and period calculations pwm: lpc32xx: correct number of PWM channels from 2 to 1 pwm: fsl-ftm: Fix clock enable/disable when using PM megaraid_sas: Add an i/o barrier megaraid_sas: Fix SMAP issue megaraid_sas: Do not allow PCI access during OCR s390/cio: update measurement characteristics s390/cio: ensure consistent measurement state s390/cio: fix measurement characteristics memleak qeth: initialize net_device with carrier off lpfc: Fix external loopback failure. lpfc: Fix mbox reuse in PLOGI completion lpfc: Fix RDP Speed reporting. lpfc: Fix crash in fcp command completion path. lpfc: Fix driver crash when module parameter lpfc_fcp_io_channel set to 16 lpfc: Fix RegLogin failed error seen on Lancer FC during port bounce lpfc: Fix the FLOGI discovery logic to comply with T11 standards lpfc: Fix FCF Infinite loop in lpfc_sli4_fcf_rr_next_index_get. 
cxl: Enable PCI device ID for future IBM CXL adapter cxl: fix build for GCC 4.6.x cxlflash: Enable device id for future IBM CXL adapter cxlflash: Resolve oops in wait_port_offline cxlflash: Fix to resolve cmd leak after host reset cxl: Fix DSI misses when the context owning task exits cxl: Fix possible idr warning when contexts are released Drivers: hv: vmbus: fix rescind-offer handling for device without a driver Drivers: hv: vmbus: serialize process_chn_event() and vmbus_close_internal() Drivers: hv: vss: run only on supported host versions drivers/hv: cleanup synic msrs if vmbus connect failed Drivers: hv: util: catch allocation errors tools: hv: report ENOSPC errors in hv_fcopy_daemon Drivers: hv: utils: run polling callback always in interrupt context Drivers: hv: util: Increase the timeout for util services lightnvm: fix missing grown bad block type lightnvm: fix locking and mempool in rrpc_lun_gc lightnvm: unlock rq and free ppa_list on submission fail lightnvm: add check after mempool allocation lightnvm: fix incorrect nr_free_blocks stat lightnvm: fix bio submission issue cxlflash: a couple off by one bugs fm10k: Cleanup exception handling for mailbox interrupt fm10k: Cleanup MSI-X interrupts in case of failure fm10k: reinitialize queuing scheme after calling init_hw fm10k: always check init_hw for errors fm10k: reset max_queues on init_hw_vf failure fm10k: Fix handling of NAPI budget when multiple queues are enabled per vector fm10k: Correct MTU for jumbo frames fm10k: do not assume VF always has 1 queue clk: xgene: Fix divider with non-zero shift value e1000e: fix division by zero on jumbo MTUs e1000: fix data race between tx_ring->next_to_clean ixgbe: Fix handling of NAPI budget when multiple queues are enabled per vector igb: fix NULL derefs due to skipped SR-IOV enabling igb: use the correct i210 register for EEMNGCTL igb: don't unmap NULL hw_addr i40e: Fix Rx hash reported to the stack by our driver i40e: clean whole mac filter list i40evf: check rings before freeing resources i40e: don't add zero MAC filter i40e: properly delete VF MAC filters i40e: Fix memory leaks, sideband filter programming i40e: fix: do not sleep in netdev_ops i40e/i40evf: Fix RS bit update in Tx path and disable force WB workaround i40evf: handle many MAC filters correctly i40e: Workaround fix for mss < 256 issue UPSTREAM: audit: fix a double fetch in audit_log_single_execve_arg() UPSTREAM: ARM: 8494/1: mm: Enable PXN when running non-LPAE kernel on LPAE processor FIXUP: sched/tune: update accouting before CPU capacity FIXUP: sched/tune: add fixes missing from a previous patch arm: Fix #if/#ifdef typo in topology.c arm: Fix build error "conflicting types for 'scale_cpu_capacity'" sched/walt: use do_div instead of division operator DEBUG: cpufreq: fix cpu_capacity tracing build for non-smp systems sched/walt: include missing header for arm_timer_read_counter() cpufreq: Kconfig: Fixup incorrect selection by CPU_FREQ_DEFAULT_GOV_SCHED sched/fair: Avoid redundant idle_cpu() call in update_sg_lb_stats() FIXUP: sched: scheduler-driven cpu frequency selection sched/rt: Add Kconfig option to enable panicking for RT throttling sched/rt: print RT tasks when RT throttling is activated UPSTREAM: sched: Fix a race between __kthread_bind() and sched_setaffinity() sched/fair: Favor higher cpus only for boosted tasks vmstat: make vmstat_updater deferrable again and shut down on idle sched/fair: call OPP update when going idle after migration sched/cpufreq_sched: fix thermal capping events sched/fair: Picking cpus 
with low OPPs for tasks that prefer idle CPUs FIXUP: sched/tune: do initialization as a postcore_initicall DEBUG: sched: add tracepoint for RD overutilized sched/tune: Introducing a new schedtune attribute prefer_idle sched: use util instead of capacity to select busy cpu arch_timer: add error handling when the MPM global timer is cleared FIXUP: sched: Fix double-release of spinlock in move_queued_task FIXUP: sched/fair: Fix hang during suspend in sched_group_energy FIXUP: sched: fix SchedFreq integration for both PELT and WALT sched: EAS: Avoid causing spikes to max-freq unnecessarily FIXUP: sched: fix set_cfs_cpu_capacity when WALT is in use sched/walt: Accounting for number of irqs pending on each core sched: Introduce Window Assisted Load Tracking (WALT) sched/tune: fix PB and PC cuts indexes definition sched/fair: optimize idle cpu selection for boosted tasks FIXUP: sched/tune: fix accounting for runnable tasks sched/tune: use a single initialisation function sched/{fair,tune}: simplify fair.c code FIXUP: sched/tune: fix payoff calculation for boost region sched/tune: Add support for negative boost values FIX: sched/tune: move schedtune_nornalize_energy into fair.c FIX: sched/tune: update usage of boosted task utilisation on CPU selection sched/fair: add tunable to set initial task load sched/fair: add tunable to force selection at cpu granularity sched: EAS: take cstate into account when selecting idle core sched/cpufreq_sched: Consolidated update FIXUP: sched: fix build for non-SMP target DEBUG: sched/tune: add tracepoint on P-E space filtering DEBUG: sched/tune: add tracepoint for energy_diff() values DEBUG: sched/tune: add tracepoint for task boost signal arm: topology: Define TC2 energy and provide it to the scheduler CHROMIUM: sched: update the average of nr_running DEBUG: schedtune: add tracepoint for schedtune_tasks_update() values DEBUG: schedtune: add tracepoint for CPU boost signal DEBUG: schedtune: add tracepoint for SchedTune configuration update DEBUG: sched: add energy procfs interface DEBUG: sched,cpufreq: add cpu_capacity change tracepoint DEBUG: sched: add tracepoint for CPU load/util signals DEBUG: sched: add tracepoint for task load/util signals DEBUG: sched: add tracepoint for cpu/freq scale invariance sched/fair: filter energy_diff() based on energy_payoff value sched/tune: add support to compute normalized energy sched/fair: keep track of energy/capacity variations sched/fair: add boosted task utilization sched/{fair,tune}: track RUNNABLE tasks impact on per CPU boost value sched/tune: compute and keep track of per CPU boost value sched/tune: add initial support for CGroups based boosting sched/fair: add boosted CPU usage sched/fair: add function to convert boost value into "margin" sched/tune: add sysctl interface to define a boost value sched/tune: add detailed documentation fixup! sched/fair: jump to max OPP when crossing UP threshold fixup! 
sched: scheduler-driven cpu frequency selection sched: rt scheduler sets capacity requirement sched: deadline: use deadline bandwidth in scale_rt_capacity sched: remove call of sched_avg_update from sched_rt_avg_update sched/cpufreq_sched: add trace events sched/fair: jump to max OPP when crossing UP threshold sched/fair: cpufreq_sched triggers for load balancing sched/{core,fair}: trigger OPP change request on fork() sched/fair: add triggers for OPP change requests sched: scheduler-driven cpu frequency selection cpufreq: introduce cpufreq_driver_is_slow sched: Consider misfit tasks when load-balancing sched: Add group_misfit_task load-balance type sched: Add per-cpu max capacity to sched_group_capacity sched: Do eas idle balance regardless of the rq avg idle value arm64: Enable max freq invariant scheduler load-tracking and capacity support arm: Enable max freq invariant scheduler load-tracking and capacity support sched: Update max cpu capacity in case of max frequency constraints cpufreq: Max freq invariant scheduler load-tracking and cpu capacity support arm64, topology: Updates to use DT bindings for EAS costing data sched: Support for extracting EAS energy costs from DT Documentation: DT bindings for energy model cost data required by EAS sched: Disable energy-unfriendly nohz kicks sched: Consider a not over-utilized energy-aware system as balanced sched: Energy-aware wake-up task placement sched: Determine the current sched_group idle-state sched, cpuidle: Track cpuidle state index in the scheduler sched: Add over-utilization/tipping point indicator sched: Estimate energy impact of scheduling decisions sched: Extend sched_group_energy to test load-balancing decisions sched: Calculate energy consumption of sched_group sched: Highest energy aware balancing sched_domain level pointer sched: Relocated cpu_util() and change return type sched: Compute cpu capacity available at current frequency arm64: Cpu invariant scheduler load-tracking and capacity support arm: Cpu invariant scheduler load-tracking and capacity support sched: Introduce SD_SHARE_CAP_STATES sched_domain flag sched: Initialize energy data structures sched: Introduce energy data structures sched: Make energy awareness a sched feature sched: Documentation for scheduler energy cost model sched: Prevent unnecessary active balance of single task in sched group sched: Enable idle balance to pull single task towards cpu with higher capacity sched: Consider spare cpu capacity at task wake-up sched: Add cpu capacity awareness to wakeup balancing sched: Store system-wide maximum cpu capacity in root domain arm: Update arch_scale_cpu_capacity() to reflect change to define arm64: Enable frequency invariant scheduler load-tracking support arm: Enable frequency invariant scheduler load-tracking support cpufreq: Frequency invariant scheduler load-tracking support sched/fair: Fix new task's load avg removed from source CPU in wake_up_new_task() FROMLIST: pstore: drop pmsg bounce buffer UPSTREAM: usercopy: remove page-spanning test for now UPSTREAM: usercopy: force check_object_size() inline BACKPORT: usercopy: fold builtin_const check into inline function UPSTREAM: x86/uaccess: force copy_*_user() to be inlined UPSTREAM: HID: core: prevent out-of-bound readings Android: Fix build breakages. 
UPSTREAM: tty: Prevent ldisc drivers from re-using stale tty fields UPSTREAM: netfilter: nfnetlink: correctly validate length of batch messages cpuset: Make cpusets restore on hotplug UPSTREAM: mm/slub: support left redzone UPSTREAM: Make the hardened user-copy code depend on having a hardened allocator Android: MMC/UFS IO Latency Histograms. UPSTREAM: usercopy: fix overlap check for kernel text UPSTREAM: usercopy: avoid potentially undefined behavior in pointer math UPSTREAM: unsafe_[get|put]_user: change interface to use a error target label BACKPORT: arm64: mm: fix location of _etext BACKPORT: ARM: 8583/1: mm: fix location of _etext BACKPORT: Don't show empty tag stats for unprivileged uids UPSTREAM: tcp: fix use after free in tcp_xmit_retransmit_queue() ANDROID: base-cfg: drop SECCOMP_FILTER config UPSTREAM: [media] xc2028: unlock on error in xc2028_set_config() UPSTREAM: [media] xc2028: avoid use after free ANDROID: base-cfg: enable SECCOMP config ANDROID: rcu_sync: Export rcu_sync_lockdep_assert RFC: FROMLIST: cgroup: reduce read locked section of cgroup_threadgroup_rwsem during fork RFC: FROMLIST: cgroup: avoid synchronize_sched() in __cgroup_procs_write() RFC: FROMLIST: locking/percpu-rwsem: Optimize readers and reduce global impact net: ipv6: Fix ping to link-local addresses. ipv6: fix endianness error in icmpv6_err ANDROID: dm: android-verity: Allow android-verity to be compiled as an independent module backporting: a brief introduce of backported feautures on 4.4 Linux 4.4.20 sysfs: correctly handle read offset on PREALLOC attrs hwmon: (iio_hwmon) fix memory leak in name attribute ALSA: line6: Fix POD sysfs attributes segfault ALSA: line6: Give up on the lock while URBs are released. ALSA: line6: Remove double line6_pcm_release() after failed acquire. 
ACPI / SRAT: fix SRAT parsing order with both LAPIC and X2APIC present ACPI / sysfs: fix error code in get_status() ACPI / drivers: replace acpi_probe_lock spinlock with mutex ACPI / drivers: fix typo in ACPI_DECLARE_PROBE_ENTRY macro staging: comedi: ni_mio_common: fix wrong insn_write handler staging: comedi: ni_mio_common: fix AO inttrig backwards compatibility staging: comedi: comedi_test: fix timer race conditions staging: comedi: daqboard2000: bug fix board type matching code USB: serial: option: add WeTelecom 0x6802 and 0x6803 products USB: serial: option: add WeTelecom WM-D200 USB: serial: mos7840: fix non-atomic allocation in write path USB: serial: mos7720: fix non-atomic allocation in write path USB: fix typo in wMaxPacketSize validation usb: chipidea: udc: don't touch DP when controller is in host mode USB: avoid left shift by -1 dmaengine: usb-dmac: check CHCR.DE bit in usb_dmac_isr_channel() crypto: qat - fix aes-xts key sizes crypto: nx - off by one bug in nx_of_update_msc() Input: i8042 - set up shared ps2_cmd_mutex for AUX ports Input: i8042 - break load dependency between atkbd/psmouse and i8042 Input: tegra-kbc - fix inverted reset logic btrfs: properly track when rescan worker is running btrfs: waiting on qgroup rescan should not always be interruptible fs/seq_file: fix out-of-bounds read gpio: Fix OF build problem on UM usb: renesas_usbhs: gadget: fix return value check in usbhs_mod_gadget_probe() megaraid_sas: Fix probing cards without io port mpt3sas: Fix resume on WarpDrive flash cards cdc-acm: fix wrong pipe type on rx interrupt xfers i2c: cros-ec-tunnel: Fix usage of cros_ec_cmd_xfer() mfd: cros_ec: Add cros_ec_cmd_xfer_status() helper aacraid: Check size values after double-fetch from user ARC: Elide redundant setup of DMA callbacks ARC: Call trace_hardirqs_on() before enabling irqs ARC: use correct offset in pt_regs for saving/restoring user mode r25 ARC: build: Better way to detect ISA compatible toolchain drm/i915: fix aliasing_ppgtt leak drm/amdgpu: record error code when ring test failed drm/amd/amdgpu: sdma resume fail during S4 on CI drm/amdgpu: skip TV/CV in display parsing drm/amdgpu: avoid a possible array overflow drm/amdgpu: fix amdgpu_move_blit on 32bit systems drm/amdgpu: Change GART offset to 64-bit iio: fix sched WARNING "do not call blocking ops when !TASK_RUNNING" sched/nohz: Fix affine unpinned timers mess sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression of: fix reference counting in of_graph_get_endpoint_by_regs arm64: dts: rockchip: add reset saradc node for rk3368 SoCs mac80211: fix purging multicast PS buffer queue s390/dasd: fix hanging device after clear subchannel EDAC: Increment correct counter in edac_inc_ue_error() pinctrl/amd: Remove the default de-bounce time iommu/arm-smmu: Don't BUG() if we find aborting STEs with disable_bypass iommu/arm-smmu: Fix CMDQ error handling iommu/dma: Don't put uninitialised IOVA domains xhci: Make sure xhci handles USB_SPEED_SUPER_PLUS devices. USB: serial: ftdi_sio: add PIDs for Ivium Technologies devices USB: serial: ftdi_sio: add device ID for WICED USB UART dev board USB: serial: option: add support for Telit LE920A4 USB: serial: option: add D-Link DWM-156/A3 USB: serial: fix memleak in driver-registration error path xhci: don't dereference a xhci member after removing xhci usb: xhci: Fix panic if disconnect xhci: always handle "Command Ring Stopped" events usb/gadget: fix gadgetfs aio support. 
usb: gadget: fsl_qe_udc: off by one in setup_received_handle() USB: validate wMaxPacketValue entries in endpoint descriptors usb: renesas_usbhs: Use dmac only if the pipe type is bulk usb: renesas_usbhs: clear the BRDYSTS in usbhsg_ep_enable() USB: hub: change the locking in hub_activate USB: hub: fix up early-exit pathway in hub_activate usb: hub: Fix unbalanced reference count/memory leak/deadlocks usb: define USB_SPEED_SUPER_PLUS speed for SuperSpeedPlus USB3.1 devices usb: dwc3: gadget: increment request->actual once usb: dwc3: pci: add Intel Kabylake PCI ID usb: misc: usbtest: add fix for driver hang usb: ehci: change order of register cleanup during shutdown crypto: caam - defer aead_set_sh_desc in case of zero authsize crypto: caam - fix echainiv(authenc) encrypt shared descriptor crypto: caam - fix non-hmac hashes genirq/msi: Make sure PCI MSIs are activated early genirq/msi: Remove unused MSI_FLAG_IDENTITY_MAP um: Don't discard .text.exit section ACPI / CPPC: Prevent cpc_desc_ptr points to the invalid data ACPI: CPPC: Return error if _CPC is invalid on a CPU mmc: sdhci-acpi: Reduce Baytrail eMMC/SD/SDIO hangs PCI: Limit config space size for Netronome NFP4000 PCI: Add Netronome NFP4000 PF device ID PCI: Limit config space size for Netronome NFP6000 family PCI: Add Netronome vendor and device IDs PCI: Support PCIe devices with short cfg_size NVMe: Don't unmap controller registers on reset ALSA: hda - Manage power well properly for resume libnvdimm, nd_blk: mask off reserved status bits perf intel-pt: Fix occasional decoding errors when tracing system-wide vfio/pci: Fix NULL pointer oops in error interrupt setup handling virtio: fix memory leak in virtqueue_add() parisc: Fix order of EREFUSED define in errno.h arm64: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO ALSA: usb-audio: Add quirk for ELP HD USB Camera ALSA: usb-audio: Add a sample rate quirk for Creative Live! 
Cam Socialize HD (VF0610) powerpc/eeh: eeh_pci_enable(): fix checking of post-request state SUNRPC: allow for upcalls for same uid but different gss service SUNRPC: Handle EADDRNOTAVAIL on connection failures tools/testing/nvdimm: fix SIGTERM vs hotplug crash uprobes/x86: Fix RIP-relative handling of EVEX-encoded instructions x86/mm: Disable preemption during CR3 read+write hugetlb: fix nr_pmds accounting with shared page tables mm: SLUB hardened usercopy support mm: SLAB hardened usercopy support s390/uaccess: Enable hardened usercopy sparc/uaccess: Enable hardened usercopy powerpc/uaccess: Enable hardened usercopy ia64/uaccess: Enable hardened usercopy arm64/uaccess: Enable hardened usercopy ARM: uaccess: Enable hardened usercopy x86/uaccess: Enable hardened usercopy x86: remove more uaccess_32.h complexity x86: remove pointless uaccess_32.h complexity x86: fix SMAP in 32-bit environments Use the new batched user accesses in generic user string handling Add 'unsafe' user access functions for batched accesses x86: reorganize SMAP handling in user space accesses mm: Hardened usercopy mm: Implement stack frame object validation mm: Add is_migrate_cma_page Linux 4.4.19 Documentation/module-signing.txt: Note need for version info if reusing a key module: Invalidate signatures on force-loaded modules dm flakey: error READ bios during the down_interval rtc: s3c: Add s3c_rtc_{enable/disable}_clk in s3c_rtc_setfreq() lpfc: fix oops in lpfc_sli4_scmd_to_wqidx_distr() from lpfc_send_taskmgmt() ACPI / EC: Work around method reentrancy limit in ACPICA for _Qxx x86/platform/intel_mid_pci: Rework IRQ0 workaround PCI: Mark Atheros AR9485 and QCA9882 to avoid bus reset MIPS: hpet: Increase HPET_MIN_PROG_DELTA and decrease HPET_MIN_CYCLES MIPS: Don't register r4k sched clock when CPUFREQ enabled MIPS: mm: Fix definition of R6 cache instruction SUNRPC: Don't allocate a full sockaddr_storage for tracing Input: elan_i2c - properly wake up touchpad on ASUS laptops target: Fix ordered task CHECK_CONDITION early exception handling target: Fix max_unmap_lba_count calc overflow target: Fix race between iscsi-target connection shutdown + ABORT_TASK target: Fix missing complete during ABORT_TASK + CMD_T_FABRIC_STOP target: Fix ordered task target_setup_cmd_from_cdb exception hang iscsi-target: Fix panic when adding second TCP connection to iSCSI session ubi: Fix race condition between ubi device creation and udev ubi: Fix early logging ubi: Make volume resize power cut aware of: fix memory leak related to safe_name() IB/mlx4: Fix memory leak if QP creation failed IB/mlx4: Fix error flow when sending mads under SRIOV IB/mlx4: Fix the SQ size of an RC QP IB/IWPM: Fix a potential skb leak IB/IPoIB: Don't update neigh validity for unresolved entries IB/SA: Use correct free function IB/mlx5: Return PORT_ERR in Active to Initializing tranisition IB/mlx5: Fix post send fence logic IB/mlx5: Fix entries check in mlx5_ib_resize_cq IB/mlx5: Fix returned values of query QP IB/mlx5: Fix entries checks in mlx5_ib_create_cq IB/mlx5: Fix MODIFY_QP command input structure ALSA: hda - Fix headset mic detection problem for two dell machines ALSA: hda: add AMD Bonaire AZ PCI ID with proper driver caps ALSA: hda/realtek - Can't adjust speaker's volume on a Dell AIO ALSA: hda: Fix krealloc() with __GFP_ZERO usage mm/hugetlb: avoid soft lockup in set_max_huge_pages() mtd: nand: fix bug writing 1 byte less than page size block: fix bdi vs gendisk lifetime mismatch block: add missing group association in bio-cloning functions metag: Fix 
__cmpxchg_u32 asm constraint for CMP ftrace/recordmcount: Work around for addition of metag magic but not relocations balloon: check the number of available pages in leak balloon drm/i915/dp: Revert "drm/i915/dp: fall back to 18 bpp when sink capability is unknown" drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB drm/edid: Add 6 bpc quirk for display AEO model 0. drm: Restore double clflush on the last partial cacheline drm/nouveau/fbcon: fix font width not divisible by 8 drm/nouveau/gr/nv3x: fix instobj write offsets in gr setup drm/nouveau: check for supported chipset before booting fbdev off the hw drm/radeon: support backlight control for UNIPHY3 drm/radeon: fix firmware info version checks drm/radeon: Poll for both connect/disconnect on analog connectors drm/radeon: add a delay after ATPX dGPU power off drm/amdgpu/gmc7: add missing mullins case drm/amdgpu: fix firmware info version checks drm/amdgpu: Disable RPM helpers while reprobing connectors on resume drm/amdgpu: support backlight control for UNIPHY3 drm/amdgpu: Poll for both connect/disconnect on analog connectors drm/amdgpu: add a delay after ATPX dGPU power off w1:omap_hdq: fix regression netlabel: add address family checks to netlbl_{sock,req}_delattr() ARM: dts: sunxi: Add a startup delay for fixed regulator enabled phys audit: fix a double fetch in audit_log_single_execve_arg() iommu/amd: Update Alias-DTE in update_device_table() iommu/amd: Init unity mappings only for dma_ops domains iommu/amd: Handle IOMMU_DOMAIN_DMA in ops->domain_free call-back iommu/vt-d: Return error code in domain_context_mapping_one() iommu/exynos: Suppress unbinding to prevent system failure drm/i915: Don't complain about lack of ACPI video bios nfsd: don't return an unhashed lock stateid after taking mutex nfsd: Fix race between FREE_STATEID and LOCK nfs: don't create zero-length requests MIPS: KVM: Propagate kseg0/mapped tlb fault errors MIPS: KVM: Fix gfn range check in kseg0 tlb faults MIPS: KVM: Add missing gfn range check MIPS: KVM: Fix mapped fault broken commpage handling random: add interrupt callback to VMBus IRQ handler random: print a warning for the first ten uninitialized random users random: initialize the non-blocking pool via add_hwgenerator_randomness() CIFS: Fix a possible invalid memory access in smb2_query_symlink() cifs: fix crash due to race in hmac(md5) handling cifs: Check for existing directory when opening file with O_CREAT fs/cifs: make share unaccessible at root level mountable jbd2: make journal y2038 safe ARC: mm: don't loose PTE_SPECIAL in pte_modify() remoteproc: Fix potential race condition in rproc_add ovl: disallow overlayfs as upperdir HID: uhid: fix timeout when probe races with IO EDAC: Correct channel count limit Bluetooth: Fix l2cap_sock_setsockopt() with optname BT_RCVMTU spi: pxa2xx: Clear all RFT bits in reset_sccr1() on Intel Quark i2c: efm32: fix a failure path in efm32_i2c_probe() s5p-mfc: Add release callback for memory region devs s5p-mfc: Set device name for reserved memory region devs hp-wmi: Fix wifi cannot be hard-unblocked dm: set DMF_SUSPENDED* _before_ clearing DMF_NOFLUSH_SUSPENDING sur40: fix occasional oopses on device close sur40: lower poll interval to fix occasional FPS drops to ~56 FPS Fix RC5 decoding with Fintek CIR chipset vb2: core: Skip planes array verification if pb is NULL videobuf2-v4l2: Verify planes array in buffer dequeueing media: dvb_ringbuffer: Add memory barriers media: usbtv: prevent access to free'd resources mfd: qcom_rpm: Parametrize also ack 
selector size mfd: qcom_rpm: Fix offset error for msm8660 intel_pstate: Fix MSR_CONFIG_TDP_x addressing in core_get_max_pstate() s390/cio: allow to reset channel measurement block KVM: nVMX: Fix memory corruption when using VMCS shadowing KVM: VMX: handle PML full VMEXIT that occurs during event delivery KVM: MTRR: fix kvm_mtrr_check_gfn_range_consistency page fault KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures arm64: mm: avoid fdt_check_header() before the FDT is fully mapped arm64: dts: rockchip: fixes the gic400 2nd region size for rk3368 pinctrl: cherryview: prevent concurrent access to GPIO controllers Bluetooth: hci_intel: Fix null gpio desc pointer dereference gpio: intel-mid: Remove potentially harmful code gpio: pca953x: Fix NBANK calculation for PCA9536 tty/serial: atmel: fix RS485 half duplex with DMA serial: samsung: Fix ERR pointer dereference on deferred probe tty: serial: msm: Don't read off end of tx fifo arm64: Fix incorrect per-cpu usage for boot CPU arm64: debug: unmask PSTATE.D earlier arm64: kernel: Save and restore UAO and addr_limit on exception entry USB: usbfs: fix potential infoleak in devio usb: renesas_usbhs: fix NULL pointer dereference in xfer_work() USB: serial: option: add support for Telit LE910 PID 0x1206 usb: dwc3: fix for the isoc transfer EP_BUSY flag usb: quirks: Add no-lpm quirk for Elan usb: renesas_usbhs: protect the CFIFOSEL setting in usbhsg_ep_enable() usb: f_fs: off by one bug in _ffs_func_bind() usb: gadget: avoid exposing kernel stack UPSTREAM: usb: gadget: configfs: add mutex lock before unregister gadget ANDROID: dm-verity: adopt changes made to dm callbacks UPSTREAM: ecryptfs: fix handling of directory opening ANDROID: net: core: fix UID-based routing ANDROID: net: fib: remove duplicate assignment FROMLIST: proc: Fix timerslack_ns CAP_SYS_NICE check when adjusting self ANDROID: dm verity fec: pack the fec_header structure ANDROID: dm: android-verity: Verify header before fetching table ANDROID: dm: allow adb disable-verity only in userdebug ANDROID: dm: mount as linear target if eng build ANDROID: dm: use default verity public key ANDROID: dm: fix signature verification flag ANDROID: dm: use name_to_dev_t ANDROID: dm: rename dm-linear methods for dm-android-verity ANDROID: dm: Minor cleanup ANDROID: dm: Mounting root as linear device when verity disabled ANDROID: dm-android-verity: Rebase on top of 4.1 ANDROID: dm: Add android verity target ANDROID: dm: fix dm_substitute_devices() ANDROID: dm: Rebase on top of 4.1 CHROMIUM: dm: boot time specification of dm= Implement memory_state_time, used by qcom,cpubw Revert "panic: Add board ID to panic output" usb: gadget: f_accessory: remove duplicate endpoint alloc BACKPORT: brcmfmac: defer DPC processing during probe FROMLIST: proc: Add LSM hook checks to /proc/<tid>/timerslack_ns FROMLIST: proc: Relax /proc/<tid>/timerslack_ns capability requirements UPSTREAM: ppp: defer netns reference release for ppp channel cpuset: Add allow_attach hook for cpusets on android. 
UPSTREAM: KEYS: Fix ASN.1 indefinite length object parsing ANDROID: sdcardfs: fix itnull.cocci warnings android-recommended.cfg: enable fstack-protector-strong Linux 4.4.18 mm: memcontrol: fix memcg id ref counter on swap charge move mm: memcontrol: fix swap counter leak on swapout from offline cgroup mm: memcontrol: fix cgroup creation failure after many small jobs ext4: fix reference counting bug on block allocation error ext4: short-cut orphan cleanup on error ext4: validate s_reserved_gdt_blocks on mount ext4: don't call ext4_should_journal_data() on the journal inode ext4: fix deadlock during page writeback ext4: check for extents that wrap around crypto: scatterwalk - Fix test in scatterwalk_done crypto: gcm - Filter out async ghash if necessary fs/dcache.c: avoid soft-lockup in dput() fuse: fix wrong assignment of ->flags in fuse_send_init() fuse: fuse_flush must check mapping->flags for errors fuse: fsync() did not return IO errors sysv, ipc: fix security-layer leaking block: fix use-after-free in seq file x86/syscalls/64: Add compat_sys_keyctl for 32-bit userspace drm/i915: Pretend cursor is always on for ILK-style WM calculations (v2) x86/mm/pat: Fix BUG_ON() in mmap_mem() on QEMU/i386 x86/pat: Document the PAT initialization sequence x86/xen, pat: Remove PAT table init code from Xen x86/mtrr: Fix PAT init handling when MTRR is disabled x86/mtrr: Fix Xorg crashes in Qemu sessions x86/mm/pat: Replace cpu_has_pat with boot_cpu_has() x86/mm/pat: Add pat_disable() interface x86/mm/pat: Add support of non-default PAT MSR setting devpts: clean up interface to pty drivers random: strengthen input validation for RNDADDTOENTCNT apparmor: fix ref count leak when profile sha1 hash is read Revert "s390/kdump: Clear subchannel ID to signal non-CCW/SCSI IPL" KEYS: 64-bit MIPS needs to use compat_sys_keyctl for 32-bit userspace arm: oabi compat: add missing access checks cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind i2c: i801: Allow ACPI SystemIO OpRegion to conflict with PCI BAR x86/mm/32: Enable full randomization on i386 and X86_32 HID: sony: do not bail out when the sixaxis refuses the output report PNP: Add Broadwell to Intel MCH size workaround PNP: Add Haswell-ULT to Intel MCH size workaround scsi: ignore errors from scsi_dh_add_device() ipath: Restrict use of the write() interface tcp: consider recv buf for the initial window scale qed: Fix setting/clearing bit in completion bitmap net/irda: fix NULL pointer dereference on memory allocation failure net: bgmac: Fix infinite loop in bgmac_dma_tx_add() bonding: set carrier off for devices created through netlink ipv4: reject RTNH_F_DEAD and RTNH_F_LINKDOWN from user space tcp: enable per-socket rate limiting of all 'challenge acks' tcp: make challenge acks less predictable arm64: relocatable: suppress R_AARCH64_ABS64 relocations in vmlinux arm64: vmlinux.lds: make __rela_offset and __dynsym_offset ABSOLUTE Linux 4.4.17 vfs: fix deadlock in file_remove_privs() on overlayfs intel_th: Fix a deadlock in modprobing intel_th: pci: Add Kaby Lake PCH-H support net: mvneta: set real interrupt per packet for tx_done libceph: apply new_state before new_up_client on incrementals libata: LITE-ON CX1-JB256-HP needs lower max_sectors i2c: mux: reg: wrong condition checked for of_address_to_resource return value posix_cpu_timer: Exit early when process has been reaped media: fix airspy usb probe error path ipr: Clear interrupt on croc/crocodile when running with LSI SCSI: fix new bug in scsi_dev_info_list string matching RDS: fix 
rds_tcp_init() error path can: fix oops caused by wrong rtnl dellink usage can: fix handling of unmodifiable configuration options fix can: c_can: Update D_CAN TX and RX functions to 32 bit - fix Altera Cyclone access can: at91_can: RX queue could get stuck at high bus load perf/x86: fix PEBS issues on Intel Atom/Core2 ovl: handle ATTR_KILL* sched/fair: Fix effective_load() to consistently use smoothed load mmc: block: fix packed command header endianness block: fix use-after-free in sys_ioprio_get() qeth: delete napi struct when removing a qeth device platform/chrome: cros_ec_dev - double fetch bug in ioctl clk: rockchip: initialize flags of clk_init_data in mmc-phase clock spi: sun4i: fix FIFO limit spi: sunxi: fix transfer timeout namespace: update event counter when umounting a deleted dentry 9p: use file_dentry() ext4: verify extent header depth ecryptfs: don't allow mmap when the lower fs doesn't support it Revert "ecryptfs: forbid opening files without mmap handler" locks: use file_inode() power_supply: power_supply_read_temp only if use_cnt > 0 cgroup: set css->id to -1 during init pinctrl: imx: Do not treat a PIN without MUX register as an error pinctrl: single: Fix missing flush of posted write for a wakeirq pvclock: Add CPU barriers to get correct version value Input: tsc200x - report proper input_dev name Input: xpad - validate USB endpoint count during probe Input: wacom_w8001 - w8001_MAX_LENGTH should be 13 Input: xpad - fix oops when attaching an unknown Xbox One gamepad Input: elantech - add more IC body types to the list Input: vmmouse - remove port reservation ALSA: timer: Fix leak in events via snd_timer_user_tinterrupt ALSA: timer: Fix leak in events via snd_timer_user_ccallback ALSA: timer: Fix leak in SNDRV_TIMER_IOCTL_PARAMS xenbus: don't bail early from xenbus_dev_request_and_reply() xenbus: don't BUG() on user mode induced condition xen/pciback: Fix conf_space read/write overlap check. ARC: unwind: ensure that .debug_frame is generated (vs. 
.eh_frame) arc: unwind: warn only once if DW2_UNWIND is disabled kernel/sysrq, watchdog, sched/core: Reset watchdog on all CPUs while processing sysrq-w pps: do not crash when failed to register vmlinux.lds: account for destructor sections mm, meminit: ensure node is online before checking whether pages are uninitialised mm, meminit: always return a valid node from early_pfn_to_nid mm, compaction: prevent VM_BUG_ON when terminating freeing scanner fs/nilfs2: fix potential underflow in call to crc32_le mm, compaction: abort free scanner if split fails mm, sl[au]b: add __GFP_ATOMIC to the GFP reclaim mask dmaengine: at_xdmac: double FIFO flush needed to compute residue dmaengine: at_xdmac: fix residue corruption dmaengine: at_xdmac: align descriptors on 64 bits x86/quirks: Add early quirk to reset Apple AirPort card x86/quirks: Reintroduce scanning of secondary buses x86/quirks: Apply nvidia_bugs quirk only on root bus USB: OHCI: Don't mark EDs as ED_OPER if scheduling fails Conflicts: arch/arm/kernel/topology.c arch/arm64/include/asm/arch_gicv3.h arch/arm64/kernel/topology.c block/bio.c drivers/cpufreq/Kconfig drivers/md/Makefile drivers/media/dvb-core/dvb_ringbuffer.c drivers/media/tuners/tuner-xc2028.c drivers/misc/Kconfig drivers/misc/Makefile drivers/mmc/core/host.c drivers/scsi/ufs/ufshcd.c drivers/scsi/ufs/ufshcd.h drivers/usb/dwc3/gadget.c drivers/usb/gadget/configfs.c fs/ecryptfs/file.c include/linux/mmc/core.h include/linux/mmc/host.h include/linux/mmzone.h include/linux/sched.h include/linux/sched/sysctl.h include/trace/events/power.h include/trace/events/sched.h init/Kconfig kernel/cpuset.c kernel/exit.c kernel/sched/Makefile kernel/sched/core.c kernel/sched/cputime.c kernel/sched/fair.c kernel/sched/features.h kernel/sched/rt.c kernel/sched/sched.h kernel/sched/stop_task.c kernel/sched/tune.c lib/Kconfig.debug mm/Makefile mm/vmstat.c Change-Id: I243a43231ca56a6362076fa6301827e1b0493be5 Signed-off-by: Runmin Wang <runminw@codeaurora.org>
/*
 *  linux/arch/arm/mm/mmu.c
 *
 *  Copyright (C) 1995-2005 Russell King
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/errno.h>
#include <linux/init.h>
#include <linux/mman.h>
#include <linux/nodemask.h>
#include <linux/memblock.h>
#include <linux/fs.h>
#include <linux/vmalloc.h>
#include <linux/sizes.h>

#include <asm/cp15.h>
#include <asm/cputype.h>
#include <asm/sections.h>
#include <asm/cachetype.h>
#include <asm/fixmap.h>
#include <asm/sections.h>
#include <asm/setup.h>
#include <asm/smp_plat.h>
#include <asm/tlb.h>
#include <asm/highmem.h>
#include <asm/system_info.h>
#include <asm/traps.h>
#include <asm/procinfo.h>
#include <asm/memory.h>

#include <asm/mach/arch.h>
#include <asm/mach/map.h>
#include <asm/mach/pci.h>
#include <asm/fixmap.h>

#include "fault.h"
#include "mm.h"
#include "tcm.h"

/*
 * empty_zero_page is a special page that is used for
 * zero-initialized data and COW.
 */
struct page *empty_zero_page;
EXPORT_SYMBOL(empty_zero_page);

/*
 * The pmd table for the upper-most set of pages.
 */
pmd_t *top_pmd;

pmdval_t user_pmd_table = _PAGE_USER_TABLE;

#define CPOLICY_UNCACHED	0
#define CPOLICY_BUFFERED	1
#define CPOLICY_WRITETHROUGH	2
#define CPOLICY_WRITEBACK	3
#define CPOLICY_WRITEALLOC	4

static unsigned int cachepolicy __initdata = CPOLICY_WRITEBACK;
static unsigned int ecc_mask __initdata = 0;
pgprot_t pgprot_user;
pgprot_t pgprot_kernel;
pgprot_t pgprot_hyp_device;
pgprot_t pgprot_s2;
pgprot_t pgprot_s2_device;

EXPORT_SYMBOL(pgprot_user);
EXPORT_SYMBOL(pgprot_kernel);

struct cachepolicy {
	const char	policy[16];
	unsigned int	cr_mask;
	pmdval_t	pmd;
	pteval_t	pte;
	pteval_t	pte_s2;
};

#ifdef CONFIG_ARM_LPAE
#define s2_policy(policy)	policy
#else
#define s2_policy(policy)	0
#endif
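
/*
 * Stage-2 ("s2") attributes describe how guest memory is mapped under the
 * virtualization extensions, which require LPAE; without LPAE the stage-2
 * fields are unused, so s2_policy() collapses them to 0.
 */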

static struct cachepolicy cache_policies[] __initdata = {
	{
		.policy		= "uncached",
		.cr_mask	= CR_W|CR_C,
		.pmd		= PMD_SECT_UNCACHED,
		.pte		= L_PTE_MT_UNCACHED,
		.pte_s2		= s2_policy(L_PTE_S2_MT_UNCACHED),
	}, {
		.policy		= "buffered",
		.cr_mask	= CR_C,
		.pmd		= PMD_SECT_BUFFERED,
		.pte		= L_PTE_MT_BUFFERABLE,
		.pte_s2		= s2_policy(L_PTE_S2_MT_UNCACHED),
	}, {
		.policy		= "writethrough",
		.cr_mask	= 0,
		.pmd		= PMD_SECT_WT,
		.pte		= L_PTE_MT_WRITETHROUGH,
		.pte_s2		= s2_policy(L_PTE_S2_MT_WRITETHROUGH),
	}, {
		.policy		= "writeback",
		.cr_mask	= 0,
		.pmd		= PMD_SECT_WB,
		.pte		= L_PTE_MT_WRITEBACK,
		.pte_s2		= s2_policy(L_PTE_S2_MT_WRITEBACK),
	}, {
		.policy		= "writealloc",
		.cr_mask	= 0,
		.pmd		= PMD_SECT_WBWA,
		.pte		= L_PTE_MT_WRITEALLOC,
		.pte_s2		= s2_policy(L_PTE_S2_MT_WRITEBACK),
	}
};

#ifdef CONFIG_CPU_CP15
static unsigned long initial_pmd_value __initdata = 0;

/*
 * Initialise the cache_policy variable with the initial state specified
 * via the "pmd" value.  This is used to ensure that on ARMv6 and later,
 * the C code sets the page tables up with the same policy as the head
 * assembly code, which avoids an illegal state where the TLBs can get
 * confused.  See comments in early_cachepolicy() for more information.
 */
void __init init_default_cache_policy(unsigned long pmd)
{
	int i;

	initial_pmd_value = pmd;

	pmd &= PMD_SECT_TEX(1) | PMD_SECT_BUFFERABLE | PMD_SECT_CACHEABLE;

	for (i = 0; i < ARRAY_SIZE(cache_policies); i++)
		if (cache_policies[i].pmd == pmd) {
			cachepolicy = i;
			break;
		}

	if (i == ARRAY_SIZE(cache_policies))
		pr_err("ERROR: could not find cache policy\n");
}

/*
 * These are useful for identifying cache coherency problems by allowing
 * the cache or the cache and writebuffer to be turned off.  (Note: the
 * write buffer should not be on and the cache off).
 */
static int __init early_cachepolicy(char *p)
{
	int i, selected = -1;

	for (i = 0; i < ARRAY_SIZE(cache_policies); i++) {
		int len = strlen(cache_policies[i].policy);

		if (memcmp(p, cache_policies[i].policy, len) == 0) {
			selected = i;
			break;
		}
	}

	if (selected == -1)
		pr_err("ERROR: unknown or unsupported cache policy\n");

	/*
	 * This restriction is partly to do with the way we boot; it is
	 * unpredictable to have memory mapped using two different sets of
	 * memory attributes (shared, type, and cache attribs).  We can not
	 * change these attributes once the initial assembly has setup the
	 * page tables.
	 */
	if (cpu_architecture() >= CPU_ARCH_ARMv6 && selected != cachepolicy) {
		pr_warn("Only cachepolicy=%s supported on ARMv6 and later\n",
			cache_policies[cachepolicy].policy);
		return 0;
	}

	if (selected != cachepolicy) {
		unsigned long cr = __clear_cr(cache_policies[selected].cr_mask);
		cachepolicy = selected;
		flush_cache_all();
		set_cr(cr);
	}
	return 0;
}
early_param("cachepolicy", early_cachepolicy);

static int __init early_nocache(char *__unused)
{
	char *p = "buffered";
	pr_warn("nocache is deprecated; use cachepolicy=%s\n", p);
	early_cachepolicy(p);
	return 0;
}
early_param("nocache", early_nocache);

static int __init early_nowrite(char *__unused)
{
	char *p = "uncached";
	pr_warn("nowb is deprecated; use cachepolicy=%s\n", p);
	early_cachepolicy(p);
	return 0;
}
early_param("nowb", early_nowrite);

#ifndef CONFIG_ARM_LPAE
static int __init early_ecc(char *p)
{
	if (memcmp(p, "on", 2) == 0)
		ecc_mask = PMD_PROTECTION;
	else if (memcmp(p, "off", 3) == 0)
		ecc_mask = 0;
	return 0;
}
early_param("ecc", early_ecc);
#endif

#else /* ifdef CONFIG_CPU_CP15 */

static int __init early_cachepolicy(char *p)
{
	pr_warn("cachepolicy kernel parameter not supported without cp15\n");
	return 0;
}
early_param("cachepolicy", early_cachepolicy);

static int __init noalign_setup(char *__unused)
{
	pr_warn("noalign kernel parameter not supported without cp15\n");
	return 1;
}
__setup("noalign", noalign_setup);

#endif	/* ifdef CONFIG_CPU_CP15 / else */

#define PROT_PTE_DEVICE		L_PTE_PRESENT|L_PTE_YOUNG|L_PTE_DIRTY|L_PTE_XN
#define PROT_PTE_S2_DEVICE	PROT_PTE_DEVICE
#define PROT_SECT_DEVICE	PMD_TYPE_SECT|PMD_SECT_AP_WRITE

static struct mem_type mem_types[] = {
	[MT_DEVICE] = {		  /* Strongly ordered / ARMv6 shared device */
		.prot_pte	= PROT_PTE_DEVICE | L_PTE_MT_DEV_SHARED |
				  L_PTE_SHARED,
		.prot_pte_s2	= s2_policy(PROT_PTE_S2_DEVICE) |
				  s2_policy(L_PTE_S2_MT_DEV_SHARED) |
				  L_PTE_SHARED,
		.prot_l1	= PMD_TYPE_TABLE,
		.prot_sect	= PROT_SECT_DEVICE | PMD_SECT_S,
		.domain		= DOMAIN_IO,
	},
	[MT_DEVICE_NONSHARED] = { /* ARMv6 non-shared device */
		.prot_pte	= PROT_PTE_DEVICE | L_PTE_MT_DEV_NONSHARED,
		.prot_l1	= PMD_TYPE_TABLE,
		.prot_sect	= PROT_SECT_DEVICE,
		.domain		= DOMAIN_IO,
	},
	[MT_DEVICE_CACHED] = {	  /* ioremap_cached */
		.prot_pte	= PROT_PTE_DEVICE | L_PTE_MT_DEV_CACHED,
		.prot_l1	= PMD_TYPE_TABLE,
		.prot_sect	= PROT_SECT_DEVICE | PMD_SECT_WB,
		.domain		= DOMAIN_IO,
	},
	[MT_DEVICE_WC] = {	  /* ioremap_wc */
		.prot_pte	= PROT_PTE_DEVICE | L_PTE_MT_DEV_WC,
		.prot_l1	= PMD_TYPE_TABLE,
		.prot_sect	= PROT_SECT_DEVICE,
		.domain		= DOMAIN_IO,
	},
	[MT_UNCACHED] = {
		.prot_pte	= PROT_PTE_DEVICE,
		.prot_l1	= PMD_TYPE_TABLE,
		.prot_sect	= PMD_TYPE_SECT | PMD_SECT_XN,
		.domain		= DOMAIN_IO,
	},
	[MT_CACHECLEAN] = {
		.prot_sect	= PMD_TYPE_SECT | PMD_SECT_XN,
		.domain		= DOMAIN_KERNEL,
	},
#ifndef CONFIG_ARM_LPAE
	[MT_MINICLEAN] = {
		.prot_sect	= PMD_TYPE_SECT | PMD_SECT_XN | PMD_SECT_MINICACHE,
		.domain		= DOMAIN_KERNEL,
	},
#endif
	[MT_LOW_VECTORS] = {
		.prot_pte	= L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
				  L_PTE_RDONLY,
		.prot_l1	= PMD_TYPE_TABLE,
		.domain		= DOMAIN_VECTORS,
	},
	[MT_HIGH_VECTORS] = {
		.prot_pte	= L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
				  L_PTE_USER | L_PTE_RDONLY,
		.prot_l1	= PMD_TYPE_TABLE,
		.domain		= DOMAIN_VECTORS,
	},
	[MT_MEMORY_RWX] = {
		.prot_pte	= L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY,
		.prot_l1	= PMD_TYPE_TABLE,
		.prot_sect	= PMD_TYPE_SECT | PMD_SECT_AP_WRITE,
		.domain		= DOMAIN_KERNEL,
	},
	[MT_MEMORY_RW] = {
		.prot_pte	= L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
				  L_PTE_XN,
		.prot_l1	= PMD_TYPE_TABLE,
		.prot_sect	= PMD_TYPE_SECT | PMD_SECT_AP_WRITE,
		.domain		= DOMAIN_KERNEL,
	},
	[MT_ROM] = {
		.prot_sect	= PMD_TYPE_SECT,
		.domain		= DOMAIN_KERNEL,
	},
	[MT_MEMORY_RWX_NONCACHED] = {
		.prot_pte	= L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
				  L_PTE_MT_BUFFERABLE,
		.prot_l1	= PMD_TYPE_TABLE,
		.prot_sect	= PMD_TYPE_SECT | PMD_SECT_AP_WRITE,
		.domain		= DOMAIN_KERNEL,
	},
	[MT_MEMORY_RW_DTCM] = {
		.prot_pte	= L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
				  L_PTE_XN,
		.prot_l1	= PMD_TYPE_TABLE,
		.prot_sect	= PMD_TYPE_SECT | PMD_SECT_XN,
		.domain		= DOMAIN_KERNEL,
	},
	[MT_MEMORY_RWX_ITCM] = {
		.prot_pte	= L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY,
		.prot_l1	= PMD_TYPE_TABLE,
		.domain		= DOMAIN_KERNEL,
	},
	[MT_MEMORY_RW_SO] = {
		.prot_pte	= L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
				  L_PTE_MT_UNCACHED | L_PTE_XN,
		.prot_l1	= PMD_TYPE_TABLE,
		.prot_sect	= PMD_TYPE_SECT | PMD_SECT_AP_WRITE | PMD_SECT_S |
				  PMD_SECT_UNCACHED | PMD_SECT_XN,
		.domain		= DOMAIN_KERNEL,
	},
	[MT_MEMORY_DMA_READY] = {
		.prot_pte	= L_PTE_PRESENT | L_PTE_YOUNG | L_PTE_DIRTY |
				  L_PTE_XN,
		.prot_l1	= PMD_TYPE_TABLE,
		.domain		= DOMAIN_KERNEL,
	},
};

const struct mem_type *get_mem_type(unsigned int type)
{
	return type < ARRAY_SIZE(mem_types) ? &mem_types[type] : NULL;
}
EXPORT_SYMBOL(get_mem_type);
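
/*
 * Illustrative use (not part of this file): mapping code can look up the
 * protections for a given mapping class by index, e.g.:
 *
 *	const struct mem_type *mt = get_mem_type(MT_DEVICE);
 *
 *	if (mt)
 *		prot = __pgprot(mt->prot_pte);
 */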

static pte_t *(*pte_offset_fixmap)(pmd_t *dir, unsigned long addr);

static pte_t bm_pte[PTRS_PER_PTE + PTE_HWTABLE_PTRS]
	__aligned(PTE_HWTABLE_OFF + PTE_HWTABLE_SIZE) __initdata;

static pte_t * __init pte_offset_early_fixmap(pmd_t *dir, unsigned long addr)
{
	return &bm_pte[pte_index(addr)];
}

static pte_t *pte_offset_late_fixmap(pmd_t *dir, unsigned long addr)
{
	return pte_offset_kernel(dir, addr);
}

static inline pmd_t * __init fixmap_pmd(unsigned long addr)
{
	pgd_t *pgd = pgd_offset_k(addr);
	pud_t *pud = pud_offset(pgd, addr);
	pmd_t *pmd = pmd_offset(pud, addr);

	return pmd;
}

void __init early_fixmap_init(void)
{
	pmd_t *pmd;

	/*
	 * The early fixmap range spans multiple pmds, for which
	 * we are not prepared:
	 */
	BUILD_BUG_ON((__fix_to_virt(__end_of_permanent_fixed_addresses) >> PMD_SHIFT)
		     != FIXADDR_TOP >> PMD_SHIFT);

	pmd = fixmap_pmd(FIXADDR_TOP);
	pmd_populate_kernel(&init_mm, pmd, bm_pte);

	pte_offset_fixmap = pte_offset_early_fixmap;
}

/*
 * To avoid TLB flush broadcasts, this uses local_flush_tlb_kernel_range().
 * As a result, this can only be called with preemption disabled, as under
 * stop_machine().
 */
void __set_fixmap(enum fixed_addresses idx, phys_addr_t phys, pgprot_t prot)
{
	unsigned long vaddr = __fix_to_virt(idx);
	pte_t *pte = pte_offset_fixmap(pmd_off_k(vaddr), vaddr);

	/* Make sure fixmap region does not exceed available allocation. */
	BUILD_BUG_ON(FIXADDR_START + (__end_of_fixed_addresses * PAGE_SIZE) >
		     FIXADDR_END);
	BUG_ON(idx >= __end_of_fixed_addresses);

	if (pgprot_val(prot))
		set_pte_at(NULL, vaddr, pte,
			pfn_pte(phys >> PAGE_SHIFT, prot));
	else
		pte_clear(NULL, vaddr, pte);
	local_flush_tlb_kernel_range(vaddr, vaddr + PAGE_SIZE);
}
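
/*
 * Illustrative usage pattern (not part of this file): a caller such as the
 * kernel text patching code maps a physical page at a fixmap slot, accesses
 * it through the fixed virtual address, then tears the mapping down again,
 * all with preemption disabled as noted above:
 *
 *	__set_fixmap(idx, page_to_phys(page), PAGE_KERNEL);
 *	memcpy((void *)__fix_to_virt(idx), src, len);
 *	__set_fixmap(idx, 0, __pgprot(0));
 */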
|
|
|
|
/*
 * Adjust the PMD section entries according to the CPU in use.
 */
static void __init build_mem_type_table(void)
{
	struct cachepolicy *cp;
	unsigned int cr = get_cr();
	pteval_t user_pgprot, kern_pgprot, vecs_pgprot;
	pteval_t hyp_device_pgprot, s2_pgprot, s2_device_pgprot;
	int cpu_arch = cpu_architecture();
	int i;

	if (cpu_arch < CPU_ARCH_ARMv6) {
#if defined(CONFIG_CPU_DCACHE_DISABLE)
		if (cachepolicy > CPOLICY_BUFFERED)
			cachepolicy = CPOLICY_BUFFERED;
#elif defined(CONFIG_CPU_DCACHE_WRITETHROUGH)
		if (cachepolicy > CPOLICY_WRITETHROUGH)
			cachepolicy = CPOLICY_WRITETHROUGH;
#endif
	}
	if (cpu_arch < CPU_ARCH_ARMv5) {
		if (cachepolicy >= CPOLICY_WRITEALLOC)
			cachepolicy = CPOLICY_WRITEBACK;
		ecc_mask = 0;
	}

	if (is_smp()) {
		if (cachepolicy != CPOLICY_WRITEALLOC) {
			pr_warn("Forcing write-allocate cache policy for SMP\n");
			cachepolicy = CPOLICY_WRITEALLOC;
		}
		if (!(initial_pmd_value & PMD_SECT_S)) {
			pr_warn("Forcing shared mappings for SMP\n");
			initial_pmd_value |= PMD_SECT_S;
		}
	}

	/*
	 * Strip out features not present on earlier architectures.
	 * Pre-ARMv5 CPUs don't have TEX bits. Pre-ARMv6 CPUs or those
	 * without extended page tables don't have the 'Shared' bit.
	 */
	if (cpu_arch < CPU_ARCH_ARMv5)
		for (i = 0; i < ARRAY_SIZE(mem_types); i++)
			mem_types[i].prot_sect &= ~PMD_SECT_TEX(7);
	if ((cpu_arch < CPU_ARCH_ARMv6 || !(cr & CR_XP)) && !cpu_is_xsc3())
		for (i = 0; i < ARRAY_SIZE(mem_types); i++)
			mem_types[i].prot_sect &= ~PMD_SECT_S;

	/*
	 * ARMv5 and lower, bit 4 must be set for page tables (was: cache
	 * "update-able on write" bit on ARM610). However, Xscale and
	 * Xscale3 require this bit to be cleared.
	 */
	if (cpu_is_xscale() || cpu_is_xsc3()) {
		for (i = 0; i < ARRAY_SIZE(mem_types); i++) {
			mem_types[i].prot_sect &= ~PMD_BIT4;
			mem_types[i].prot_l1 &= ~PMD_BIT4;
		}
	} else if (cpu_arch < CPU_ARCH_ARMv6) {
		for (i = 0; i < ARRAY_SIZE(mem_types); i++) {
			if (mem_types[i].prot_l1)
				mem_types[i].prot_l1 |= PMD_BIT4;
			if (mem_types[i].prot_sect)
				mem_types[i].prot_sect |= PMD_BIT4;
		}
	}

	/*
	 * Mark the device areas according to the CPU/architecture.
	 */
	if (cpu_is_xsc3() || (cpu_arch >= CPU_ARCH_ARMv6 && (cr & CR_XP))) {
		if (!cpu_is_xsc3()) {
			/*
			 * Mark device regions on ARMv6+ as execute-never
			 * to prevent speculative instruction fetches.
			 */
			mem_types[MT_DEVICE].prot_sect |= PMD_SECT_XN;
			mem_types[MT_DEVICE_NONSHARED].prot_sect |= PMD_SECT_XN;
			mem_types[MT_DEVICE_CACHED].prot_sect |= PMD_SECT_XN;
			mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_XN;

			/* Also setup NX memory mapping */
			mem_types[MT_MEMORY_RW].prot_sect |= PMD_SECT_XN;
		}
		if (cpu_arch >= CPU_ARCH_ARMv7 && (cr & CR_TRE)) {
			/*
			 * For ARMv7 with TEX remapping,
			 * - shared device is SXCB=1100
			 * - nonshared device is SXCB=0100
			 * - write combine device mem is SXCB=0001
			 * (Uncached Normal memory)
			 */
			mem_types[MT_DEVICE].prot_sect |= PMD_SECT_TEX(1);
			mem_types[MT_DEVICE_NONSHARED].prot_sect |= PMD_SECT_TEX(1);
			mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_BUFFERABLE;
		} else if (cpu_is_xsc3()) {
			/*
			 * For Xscale3,
			 * - shared device is TEXCB=00101
			 * - nonshared device is TEXCB=01000
			 * - write combine device mem is TEXCB=00100
			 * (Inner/Outer Uncacheable in xsc3 parlance)
			 */
			mem_types[MT_DEVICE].prot_sect |= PMD_SECT_TEX(1) | PMD_SECT_BUFFERED;
			mem_types[MT_DEVICE_NONSHARED].prot_sect |= PMD_SECT_TEX(2);
			mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_TEX(1);
		} else {
			/*
			 * For ARMv6 and ARMv7 without TEX remapping,
			 * - shared device is TEXCB=00001
			 * - nonshared device is TEXCB=01000
			 * - write combine device mem is TEXCB=00100
			 * (Uncached Normal in ARMv6 parlance).
			 */
			mem_types[MT_DEVICE].prot_sect |= PMD_SECT_BUFFERED;
			mem_types[MT_DEVICE_NONSHARED].prot_sect |= PMD_SECT_TEX(2);
			mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_TEX(1);
		}
	} else {
		/*
		 * On others, write combining is "Uncached/Buffered"
		 */
		mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_BUFFERABLE;
	}

	/*
	 * Now deal with the memory-type mappings
	 */
	cp = &cache_policies[cachepolicy];
	vecs_pgprot = kern_pgprot = user_pgprot = cp->pte;
	s2_pgprot = cp->pte_s2;
	hyp_device_pgprot = mem_types[MT_DEVICE].prot_pte;
	s2_device_pgprot = mem_types[MT_DEVICE].prot_pte_s2;

#ifndef CONFIG_ARM_LPAE
	/*
	 * We don't use domains on ARMv6 (since this causes problems with
	 * v6/v7 kernels), so we must use a separate memory type for user
	 * r/o, kernel r/w to map the vectors page.
	 */
	if (cpu_arch == CPU_ARCH_ARMv6)
		vecs_pgprot |= L_PTE_MT_VECTORS;

	/*
	 * Check whether the PXN bit is supported in the
	 * Short-descriptor translation table format descriptors.
	 */
	if (cpu_arch == CPU_ARCH_ARMv7 &&
	    (read_cpuid_ext(CPUID_EXT_MMFR0) & 0xF) >= 4) {
		user_pmd_table |= PMD_PXNTABLE;
	}
#endif

	/*
	 * ARMv6 and above have extended page tables.
	 */
	if (cpu_arch >= CPU_ARCH_ARMv6 && (cr & CR_XP)) {
#ifndef CONFIG_ARM_LPAE
		/*
		 * Mark cache clean areas and XIP ROM read only
		 * from SVC mode and no access from userspace.
		 */
		mem_types[MT_ROM].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
		mem_types[MT_MINICLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_APX|PMD_SECT_AP_WRITE;
#endif

		/*
		 * If the initial page tables were created with the S bit
		 * set, then we need to do the same here for the same
		 * reasons given in early_cachepolicy().
		 */
		if (initial_pmd_value & PMD_SECT_S) {
			user_pgprot |= L_PTE_SHARED;
			kern_pgprot |= L_PTE_SHARED;
			vecs_pgprot |= L_PTE_SHARED;
			s2_pgprot |= L_PTE_SHARED;
			mem_types[MT_DEVICE_WC].prot_sect |= PMD_SECT_S;
			mem_types[MT_DEVICE_WC].prot_pte |= L_PTE_SHARED;
			mem_types[MT_DEVICE_CACHED].prot_sect |= PMD_SECT_S;
			mem_types[MT_DEVICE_CACHED].prot_pte |= L_PTE_SHARED;
			mem_types[MT_MEMORY_RWX].prot_sect |= PMD_SECT_S;
			mem_types[MT_MEMORY_RWX].prot_pte |= L_PTE_SHARED;
			mem_types[MT_MEMORY_RW].prot_sect |= PMD_SECT_S;
			mem_types[MT_MEMORY_RW].prot_pte |= L_PTE_SHARED;
			mem_types[MT_MEMORY_DMA_READY].prot_pte |= L_PTE_SHARED;
			mem_types[MT_MEMORY_RWX_NONCACHED].prot_sect |= PMD_SECT_S;
			mem_types[MT_MEMORY_RWX_NONCACHED].prot_pte |= L_PTE_SHARED;
		}
	}

	/*
	 * Non-cacheable Normal - intended for memory areas that must
	 * not cause dirty cache line writebacks when used
	 */
	if (cpu_arch >= CPU_ARCH_ARMv6) {
		if (cpu_arch >= CPU_ARCH_ARMv7 && (cr & CR_TRE)) {
			/* Non-cacheable Normal is XCB = 001 */
			mem_types[MT_MEMORY_RWX_NONCACHED].prot_sect |=
				PMD_SECT_BUFFERED;
		} else {
			/* For both ARMv6 and non-TEX-remapping ARMv7 */
			mem_types[MT_MEMORY_RWX_NONCACHED].prot_sect |=
				PMD_SECT_TEX(1);
		}
	} else {
		mem_types[MT_MEMORY_RWX_NONCACHED].prot_sect |= PMD_SECT_BUFFERABLE;
	}

#ifdef CONFIG_ARM_LPAE
	/*
	 * Do not generate access flag faults for the kernel mappings.
	 */
	for (i = 0; i < ARRAY_SIZE(mem_types); i++) {
		mem_types[i].prot_pte |= PTE_EXT_AF;
		if (mem_types[i].prot_sect)
			mem_types[i].prot_sect |= PMD_SECT_AF;
	}
	kern_pgprot |= PTE_EXT_AF;
	vecs_pgprot |= PTE_EXT_AF;

	/*
	 * Set PXN for user mappings
	 */
	user_pgprot |= PTE_EXT_PXN;
#endif

	for (i = 0; i < 16; i++) {
		pteval_t v = pgprot_val(protection_map[i]);
		protection_map[i] = __pgprot(v | user_pgprot);
	}

	mem_types[MT_LOW_VECTORS].prot_pte |= vecs_pgprot;
	mem_types[MT_HIGH_VECTORS].prot_pte |= vecs_pgprot;

	pgprot_user = __pgprot(L_PTE_PRESENT | L_PTE_YOUNG | user_pgprot);
	pgprot_kernel = __pgprot(L_PTE_PRESENT | L_PTE_YOUNG |
				 L_PTE_DIRTY | kern_pgprot);
	pgprot_s2 = __pgprot(L_PTE_PRESENT | L_PTE_YOUNG | s2_pgprot);
	pgprot_s2_device = __pgprot(s2_device_pgprot);
	pgprot_hyp_device = __pgprot(hyp_device_pgprot);

	mem_types[MT_LOW_VECTORS].prot_l1 |= ecc_mask;
	mem_types[MT_HIGH_VECTORS].prot_l1 |= ecc_mask;
	mem_types[MT_MEMORY_RWX].prot_sect |= ecc_mask | cp->pmd;
	mem_types[MT_MEMORY_RWX].prot_pte |= kern_pgprot;
	mem_types[MT_MEMORY_RW].prot_sect |= ecc_mask | cp->pmd;
	mem_types[MT_MEMORY_RW].prot_pte |= kern_pgprot;
	mem_types[MT_MEMORY_DMA_READY].prot_pte |= kern_pgprot;
	mem_types[MT_MEMORY_RWX_NONCACHED].prot_sect |= ecc_mask;
	mem_types[MT_ROM].prot_sect |= cp->pmd;

	switch (cp->pmd) {
	case PMD_SECT_WT:
		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_WT;
		break;
	case PMD_SECT_WB:
	case PMD_SECT_WBWA:
		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_WB;
		break;
	}
	pr_info("Memory policy: %sData cache %s\n",
		ecc_mask ? "ECC enabled, " : "", cp->policy);

	for (i = 0; i < ARRAY_SIZE(mem_types); i++) {
		struct mem_type *t = &mem_types[i];
		if (t->prot_l1)
			t->prot_l1 |= PMD_DOMAIN(t->domain);
		if (t->prot_sect)
			t->prot_sect |= PMD_DOMAIN(t->domain);
	}
}

#ifdef CONFIG_ARM_DMA_MEM_BUFFERABLE
pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
			      unsigned long size, pgprot_t vma_prot)
{
	if (!pfn_valid(pfn))
		return pgprot_noncached(vma_prot);
	else if (file->f_flags & O_SYNC)
		return pgprot_writecombine(vma_prot);
	return vma_prot;
}
EXPORT_SYMBOL(phys_mem_access_prot);
#endif

#define vectors_base() (vectors_high() ? 0xffff0000 : 0)

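/*
 * Boot-time allocators: hand out zeroed memory straight from memblock.
 * These are __init only and are used before the core page allocator is up.
 */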
static void __init *early_alloc_aligned(unsigned long sz, unsigned long align)
{
	void *ptr = __va(memblock_alloc(sz, align));
	memset(ptr, 0, sz);
	return ptr;
}

static void __init *early_alloc(unsigned long sz)
{
	return early_alloc_aligned(sz, sz);
}

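/*
 * Return the kernel PTE for @addr, allocating a fresh PTE table from the
 * early allocator if the pmd is still empty.  The allocation covers both
 * the Linux and the hardware copies of the table (PTE_HWTABLE_OFF/SIZE).
 */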
static pte_t * __init early_pte_alloc(pmd_t *pmd, unsigned long addr, unsigned long prot)
{
	if (pmd_none(*pmd)) {
		pte_t *pte = early_alloc(PTE_HWTABLE_OFF + PTE_HWTABLE_SIZE);
		__pmd_populate(pmd, __pa(pte), prot);
	}
	BUG_ON(pmd_bad(*pmd));
	return pte_offset_kernel(pmd, addr);
}

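/*
 * Map [addr, end) one page at a time with the protections of @type,
 * allocating the PTE table on demand.
 */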
static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
				  unsigned long end, unsigned long pfn,
				  const struct mem_type *type)
{
	pte_t *pte = early_pte_alloc(pmd, addr, type->prot_l1);
	do {
		set_pte_ext(pte, pfn_pte(pfn, __pgprot(type->prot_pte)), 0);
		pfn++;
	} while (pte++, addr += PAGE_SIZE, addr != end);
}

static void __init __map_init_section(pmd_t *pmd, unsigned long addr,
				      unsigned long end, phys_addr_t phys,
				      const struct mem_type *type)
{
	pmd_t *p = pmd;

#ifndef CONFIG_ARM_LPAE
	/*
	 * In classic MMU format, puds and pmds are folded in to
	 * the pgds. pmd_offset gives the PGD entry. PGDs refer to a
	 * group of L1 entries making up one logical pointer to
	 * an L2 table (2MB), whereas PMDs refer to the individual
	 * L1 entries (1MB). Hence increment to get the correct
	 * offset for odd 1MB sections.
	 * (See arch/arm/include/asm/pgtable-2level.h)
	 */
	if (addr & SECTION_SIZE)
		pmd++;
#endif
	do {
		*pmd = __pmd(phys | type->prot_sect);
		phys += SECTION_SIZE;
	} while (pmd++, addr += SECTION_SIZE, addr != end);

	flush_pmd_entry(p);
}

static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
				  unsigned long end, phys_addr_t phys,
				  const struct mem_type *type)
{
	pmd_t *pmd = pmd_offset(pud, addr);
	unsigned long next;

	do {
		/*
		 * With LPAE, we must loop over to map
		 * all the pmds for the given range.
		 */
		next = pmd_addr_end(addr, end);

		/*
		 * Try a section mapping - addr, next and phys must all be
		 * aligned to a section boundary.
		 */
		if (type->prot_sect &&
		    ((addr | next | phys) & ~SECTION_MASK) == 0) {
			__map_init_section(pmd, addr, next, phys, type);
		} else {
			alloc_init_pte(pmd, addr, next,
				       __phys_to_pfn(phys), type);
		}

		phys += next - addr;

	} while (pmd++, addr = next, addr != end);
}

static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
				  unsigned long end, phys_addr_t phys,
				  const struct mem_type *type)
{
	pud_t *pud = pud_offset(pgd, addr);
	unsigned long next;

	do {
		next = pud_addr_end(addr, end);
		alloc_init_pmd(pud, addr, next, phys, type);
		phys += next - addr;
	} while (pud++, addr = next, addr != end);
}

#ifndef CONFIG_ARM_LPAE
static void __init create_36bit_mapping(struct map_desc *md,
					const struct mem_type *type)
{
	unsigned long addr, length, end;
	phys_addr_t phys;
	pgd_t *pgd;

	addr = md->virtual;
	phys = __pfn_to_phys(md->pfn);
	length = PAGE_ALIGN(md->length);

	if (!(cpu_architecture() >= CPU_ARCH_ARMv6 || cpu_is_xsc3())) {
		pr_err("MM: CPU does not support supersection mapping for 0x%08llx at 0x%08lx\n",
		       (long long)__pfn_to_phys((u64)md->pfn), addr);
		return;
	}

	/* N.B. ARMv6 supersections are only defined to work with domain 0.
	 * Since domain assignments can in fact be arbitrary, the
	 * 'domain == 0' check below is required to ensure that ARMv6
	 * supersections are only allocated for domain 0 regardless
	 * of the actual domain assignments in use.
	 */
	if (type->domain) {
		pr_err("MM: invalid domain in supersection mapping for 0x%08llx at 0x%08lx\n",
		       (long long)__pfn_to_phys((u64)md->pfn), addr);
		return;
	}

	if ((addr | length | __pfn_to_phys(md->pfn)) & ~SUPERSECTION_MASK) {
		pr_err("MM: cannot create mapping for 0x%08llx at 0x%08lx invalid alignment\n",
		       (long long)__pfn_to_phys((u64)md->pfn), addr);
		return;
	}

	/*
	 * Shift bits [35:32] of address into bits [23:20] of PMD
	 * (See ARMv6 spec).
	 */
	phys |= (((md->pfn >> (32 - PAGE_SHIFT)) & 0xF) << 20);

	pgd = pgd_offset_k(addr);
	end = addr + length;
	do {
		pud_t *pud = pud_offset(pgd, addr);
		pmd_t *pmd = pmd_offset(pud, addr);
		int i;

		for (i = 0; i < 16; i++)
			*pmd++ = __pmd(phys | type->prot_sect | PMD_SECT_SUPER);

		addr += SUPERSECTION_SIZE;
		phys += SUPERSECTION_SIZE;
		pgd += SUPERSECTION_SIZE >> PGDIR_SHIFT;
	} while (addr != end);
}
#endif /* !CONFIG_ARM_LPAE */

/*
 * Create the page directory entries and any necessary
 * page tables for the mapping specified by `md'. We
 * are able to cope here with varying sizes and address
 * offsets, and we take full advantage of sections and
 * supersections.
 */
static void __init create_mapping(struct map_desc *md)
{
	unsigned long addr, length, end;
	phys_addr_t phys;
	const struct mem_type *type;
	pgd_t *pgd;

	if (md->virtual != vectors_base() && md->virtual < TASK_SIZE) {
		pr_warn("BUG: not creating mapping for 0x%08llx at 0x%08lx in user region\n",
			(long long)__pfn_to_phys((u64)md->pfn), md->virtual);
		return;
	}

	if ((md->type == MT_DEVICE || md->type == MT_ROM) &&
	    md->virtual >= PAGE_OFFSET && md->virtual < FIXADDR_START &&
	    (md->virtual < VMALLOC_START || md->virtual >= VMALLOC_END)) {
		pr_warn("BUG: mapping for 0x%08llx at 0x%08lx out of vmalloc space\n",
			(long long)__pfn_to_phys((u64)md->pfn), md->virtual);
	}

	type = &mem_types[md->type];

#ifndef CONFIG_ARM_LPAE
	/*
	 * Catch 36-bit addresses
	 */
	if (md->pfn >= 0x100000) {
		create_36bit_mapping(md, type);
		return;
	}
#endif

	addr = md->virtual & PAGE_MASK;
	phys = __pfn_to_phys(md->pfn);
	length = PAGE_ALIGN(md->length + (md->virtual & ~PAGE_MASK));

	if (type->prot_l1 == 0 && ((addr | phys | length) & ~SECTION_MASK)) {
		pr_warn("BUG: map for 0x%08llx at 0x%08lx can not be mapped using pages, ignoring.\n",
			(long long)__pfn_to_phys(md->pfn), addr);
		return;
	}

	pgd = pgd_offset_k(addr);
	end = addr + length;
	do {
		unsigned long next = pgd_addr_end(addr, end);

		alloc_init_pud(pgd, addr, next, phys, type);

		phys += next - addr;
		addr = next;
	} while (pgd++, addr != end);
}

/*
 * Create the architecture specific mappings
 */
void __init iotable_init(struct map_desc *io_desc, int nr)
{
	struct map_desc *md;
	struct vm_struct *vm;
	struct static_vm *svm;

	if (!nr)
		return;

	svm = early_alloc_aligned(sizeof(*svm) * nr, __alignof__(*svm));

	for (md = io_desc; nr; md++, nr--) {
		create_mapping(md);

		vm = &svm->vm;
		vm->addr = (void *)(md->virtual & PAGE_MASK);
		vm->size = PAGE_ALIGN(md->length + (md->virtual & ~PAGE_MASK));
		vm->phys_addr = __pfn_to_phys(md->pfn);
		vm->flags = VM_IOREMAP | VM_ARM_STATIC_MAPPING;
		vm->flags |= VM_ARM_MTYPE(md->type);
		vm->caller = iotable_init;
		add_static_vm_early(svm++);
	}
}

void __init vm_reserve_area_early(unsigned long addr, unsigned long size,
				  void *caller)
{
	struct vm_struct *vm;
	struct static_vm *svm;

	svm = early_alloc_aligned(sizeof(*svm), __alignof__(*svm));

	vm = &svm->vm;
	vm->addr = (void *)addr;
	vm->size = size;
	vm->flags = VM_IOREMAP | VM_ARM_EMPTY_MAPPING;
	vm->caller = caller;
	add_static_vm_early(svm);
}

#ifndef CONFIG_ARM_LPAE

/*
 * The Linux PMD is made of two consecutive section entries covering 2MB
 * (see definition in include/asm/pgtable-2level.h). However a call to
 * create_mapping() may optimize static mappings by using individual
 * 1MB section mappings. This leaves the actual PMD potentially half
 * initialized if the top or bottom section entry isn't used, leaving it
 * open to problems if a subsequent ioremap() or vmalloc() tries to use
 * the virtual space left free by that unused section entry.
 *
 * Let's avoid the issue by inserting dummy vm entries covering the unused
 * PMD halves once the static mappings are in place.
 */

static void __init pmd_empty_section_gap(unsigned long addr)
{
	vm_reserve_area_early(addr, SECTION_SIZE, pmd_empty_section_gap);
}

static void __init fill_pmd_gaps(void)
{
	struct static_vm *svm;
	struct vm_struct *vm;
	unsigned long addr, next = 0;
	pmd_t *pmd;

	list_for_each_entry(svm, &static_vmlist, list) {
		vm = &svm->vm;
		addr = (unsigned long)vm->addr;
		if (addr < next)
			continue;

		/*
		 * Check if this vm starts on an odd section boundary.
		 * If so and the first section entry for this PMD is free
		 * then we block the corresponding virtual address.
		 */
		if ((addr & ~PMD_MASK) == SECTION_SIZE) {
			pmd = pmd_off_k(addr);
			if (pmd_none(*pmd))
				pmd_empty_section_gap(addr & PMD_MASK);
		}

		/*
		 * Then check if this vm ends on an odd section boundary.
		 * If so and the second section entry for this PMD is empty
		 * then we block the corresponding virtual address.
		 */
		addr += vm->size;
		if ((addr & ~PMD_MASK) == SECTION_SIZE) {
			pmd = pmd_off_k(addr) + 1;
			if (pmd_none(*pmd))
				pmd_empty_section_gap(addr);
		}

		/* no need to look at any vm entry until we hit the next PMD */
		next = (addr + PMD_SIZE - 1) & PMD_MASK;
	}
}

#else
#define fill_pmd_gaps() do { } while (0)
#endif

#if defined(CONFIG_PCI) && !defined(CONFIG_NEED_MACH_IO_H)
static void __init pci_reserve_io(void)
{
	struct static_vm *svm;

	svm = find_static_vm_vaddr((void *)PCI_IO_VIRT_BASE);
	if (svm)
		return;

	vm_reserve_area_early(PCI_IO_VIRT_BASE, SZ_2M, pci_reserve_io);
}
#else
#define pci_reserve_io() do { } while (0)
#endif

#ifdef CONFIG_DEBUG_LL
void __init debug_ll_io_init(void)
{
	struct map_desc map;

	debug_ll_addr(&map.pfn, &map.virtual);
	if (!map.pfn || !map.virtual)
		return;
	map.pfn = __phys_to_pfn(map.pfn);
	map.virtual &= PAGE_MASK;
	map.length = PAGE_SIZE;
	map.type = MT_DEVICE;
	iotable_init(&map, 1);
}
#endif

static void * __initdata vmalloc_min =
	(void *)(VMALLOC_END - (240 << 20) - VMALLOC_OFFSET);

/*
 * vmalloc=size forces the vmalloc area to be exactly 'size'
 * bytes. This can be used to increase (or decrease) the vmalloc
 * area - the default is 240m.
 */
static int __init early_vmalloc(char *arg)
{
	unsigned long vmalloc_reserve = memparse(arg, NULL);

	if (vmalloc_reserve < SZ_16M) {
		vmalloc_reserve = SZ_16M;
		pr_warn("vmalloc area too small, limiting to %luMB\n",
			vmalloc_reserve >> 20);
	}

	if (vmalloc_reserve > VMALLOC_END - (PAGE_OFFSET + SZ_32M)) {
		vmalloc_reserve = VMALLOC_END - (PAGE_OFFSET + SZ_32M);
		pr_warn("vmalloc area is too big, limiting to %luMB\n",
			vmalloc_reserve >> 20);
	}

	vmalloc_min = (void *)(VMALLOC_END - vmalloc_reserve);
	return 0;
}
early_param("vmalloc", early_vmalloc);

phys_addr_t arm_lowmem_limit __initdata = 0;

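/*
 * Walk the memblock regions, clamp lowmem at the bottom of the vmalloc
 * area (vmalloc_min), decide which regions must become highmem or be
 * dropped, and work out the highest pmd-aligned address that memblock
 * may hand out during early boot.
 */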
void __init sanity_check_meminfo(void)
{
	phys_addr_t memblock_limit = 0;
	int highmem = 0;
	phys_addr_t vmalloc_limit = __pa(vmalloc_min - 1) + 1;
	struct memblock_region *reg;
	bool should_use_highmem = false;

#ifdef CONFIG_ENABLE_VMALLOC_SAVING
	struct memblock_region *prev_reg = NULL;

	for_each_memblock(memory, reg) {
		if (prev_reg == NULL) {
			prev_reg = reg;
			continue;
		}
		vmalloc_limit += reg->base - (prev_reg->base + prev_reg->size);
		prev_reg = reg;
	}
#endif

	for_each_memblock(memory, reg) {
		phys_addr_t block_start = reg->base;
		phys_addr_t block_end = reg->base + reg->size;
		phys_addr_t size_limit = reg->size;

		if (reg->base >= vmalloc_limit)
			highmem = 1;
		else
			size_limit = vmalloc_limit - reg->base;

		if (!IS_ENABLED(CONFIG_HIGHMEM) || cache_is_vipt_aliasing()) {

			if (highmem) {
				pr_notice("Ignoring RAM at %pa-%pa (!CONFIG_HIGHMEM)\n",
					  &block_start, &block_end);
				memblock_remove(reg->base, reg->size);
				should_use_highmem = true;
				continue;
			}

			if (reg->size > size_limit) {
				phys_addr_t overlap_size = reg->size - size_limit;

				pr_notice("Truncating RAM at %pa-%pa to -%pa",
					  &block_start, &block_end, &vmalloc_limit);
				memblock_remove(vmalloc_limit, overlap_size);
				block_end = vmalloc_limit;
				should_use_highmem = true;
			}
		}

		if (!highmem) {
			if (block_end > arm_lowmem_limit) {
				if (reg->size > size_limit)
					arm_lowmem_limit = vmalloc_limit;
				else
					arm_lowmem_limit = block_end;
			}

			/*
			 * Find the first non-pmd-aligned page, and point
			 * memblock_limit at it. This relies on rounding the
			 * limit down to be pmd-aligned, which happens at the
			 * end of this function.
			 *
			 * With this algorithm, the start or end of almost any
			 * bank can be non-pmd-aligned. The only exception is
			 * that the start of the bank 0 must be section-
			 * aligned, since otherwise memory would need to be
			 * allocated when mapping the start of bank 0, which
			 * occurs before any free memory is mapped.
			 */
			if (!memblock_limit) {
				if (!IS_ALIGNED(block_start, PMD_SIZE))
					memblock_limit = block_start;
				else if (!IS_ALIGNED(block_end, PMD_SIZE))
					memblock_limit = arm_lowmem_limit;
			}

		}
	}

	if (should_use_highmem)
		pr_notice("Consider using a HIGHMEM enabled kernel.\n");

	high_memory = __va(arm_lowmem_limit - 1) + 1;

	/*
	 * Round the memblock limit down to a pmd size. This
	 * helps to ensure that we will allocate memory from the
	 * last full pmd, which should be mapped.
	 */
	if (memblock_limit)
		memblock_limit = round_down(memblock_limit, PMD_SIZE);
	if (!memblock_limit)
		memblock_limit = arm_lowmem_limit;

	memblock_set_current_limit(memblock_limit);
}

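/*
 * Tear down boot-time mappings we do not want to keep: everything below
 * the kernel image, and everything between the end of the first lowmem
 * block and the start of the vmalloc area.
 */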
static inline void prepare_page_table(void)
{
	unsigned long addr;
	phys_addr_t end;

	/*
	 * Clear out all the mappings below the kernel image.
	 */
	for (addr = 0; addr < MODULES_VADDR; addr += PMD_SIZE)
		pmd_clear(pmd_off_k(addr));

#ifdef CONFIG_XIP_KERNEL
	/* The XIP kernel is mapped in the module area -- skip over it */
	addr = ((unsigned long)_etext + PMD_SIZE - 1) & PMD_MASK;
#endif
	for ( ; addr < PAGE_OFFSET; addr += PMD_SIZE)
		pmd_clear(pmd_off_k(addr));

	/*
	 * Find the end of the first block of lowmem.
	 */
	end = memblock.memory.regions[0].base + memblock.memory.regions[0].size;
	if (end >= arm_lowmem_limit)
		end = arm_lowmem_limit;

	/*
	 * Clear out all the kernel space mappings, except for the first
	 * memory bank, up to the vmalloc region.
	 */
	for (addr = __phys_to_virt(end);
	     addr < VMALLOC_START; addr += PMD_SIZE)
		pmd_clear(pmd_off_k(addr));
}

#ifdef CONFIG_ARM_LPAE
/* the first page is reserved for pgd */
#define SWAPPER_PG_DIR_SIZE	(PAGE_SIZE + \
				 PTRS_PER_PGD * PTRS_PER_PMD * sizeof(pmd_t))
#else
#define SWAPPER_PG_DIR_SIZE	(PTRS_PER_PGD * sizeof(pgd_t))
#endif

/*
 * Reserve the special regions of memory
 */
void __init arm_mm_memblock_reserve(void)
{
	/*
	 * Reserve the page tables. These are already in use,
	 * and can only be in node 0.
	 */
	memblock_reserve(__pa(swapper_pg_dir), SWAPPER_PG_DIR_SIZE);

#ifdef CONFIG_SA1111
	/*
	 * Because of the SA1111 DMA bug, we want to preserve our
	 * precious DMA-able memory...
	 */
	memblock_reserve(PHYS_OFFSET, __pa(swapper_pg_dir) - PHYS_OFFSET);
#endif
}

/*
 * Set up the device mappings. Since we clear out the page tables for all
 * mappings above VMALLOC_START, except early fixmap, we might remove debug
 * device mappings. This means earlycon can be used to debug this function.
 * Any other function or debugging method which may touch any device _will_
 * crash the kernel.
 */
static void __init devicemaps_init(const struct machine_desc *mdesc)
{
	struct map_desc map;
	unsigned long addr;
	void *vectors;

	/*
	 * Allocate the vector page early.
	 */
	vectors = early_alloc(PAGE_SIZE * 2);

	early_trap_init(vectors);

	/*
	 * Clear page table except top pmd used by early fixmaps
	 */
	for (addr = VMALLOC_START; addr < (FIXADDR_TOP & PMD_MASK); addr += PMD_SIZE)
		pmd_clear(pmd_off_k(addr));

	/*
	 * Map the kernel if it is XIP.
	 * It is always first in the module area.
	 */
#ifdef CONFIG_XIP_KERNEL
	map.pfn = __phys_to_pfn(CONFIG_XIP_PHYS_ADDR & SECTION_MASK);
	map.virtual = MODULES_VADDR;
	map.length = ((unsigned long)_etext - map.virtual + ~SECTION_MASK) & SECTION_MASK;
	map.type = MT_ROM;
	create_mapping(&map);
#endif

	/*
	 * Map the cache flushing regions.
	 */
#ifdef FLUSH_BASE
	map.pfn = __phys_to_pfn(FLUSH_BASE_PHYS);
	map.virtual = FLUSH_BASE;
	map.length = SZ_1M;
	map.type = MT_CACHECLEAN;
	create_mapping(&map);
#endif
#ifdef FLUSH_BASE_MINICACHE
	map.pfn = __phys_to_pfn(FLUSH_BASE_PHYS + SZ_1M);
	map.virtual = FLUSH_BASE_MINICACHE;
	map.length = SZ_1M;
	map.type = MT_MINICLEAN;
	create_mapping(&map);
#endif

	/*
	 * Create a mapping for the machine vectors at the high-vectors
	 * location (0xffff0000). If we aren't using high-vectors, also
	 * create a mapping at the low-vectors virtual address.
	 */
	map.pfn = __phys_to_pfn(virt_to_phys(vectors));
	map.virtual = 0xffff0000;
	map.length = PAGE_SIZE;
#ifdef CONFIG_KUSER_HELPERS
	map.type = MT_HIGH_VECTORS;
#else
	map.type = MT_LOW_VECTORS;
#endif
	create_mapping(&map);

	if (!vectors_high()) {
		map.virtual = 0;
		map.length = PAGE_SIZE * 2;
		map.type = MT_LOW_VECTORS;
		create_mapping(&map);
	}

	/* Now create a kernel read-only mapping */
	map.pfn += 1;
	map.virtual = 0xffff0000 + PAGE_SIZE;
	map.length = PAGE_SIZE;
	map.type = MT_LOW_VECTORS;
	create_mapping(&map);

	/*
	 * Ask the machine support to map in the statically mapped devices.
	 */
	if (mdesc->map_io)
		mdesc->map_io();
	else
		debug_ll_io_init();
	fill_pmd_gaps();

	/* Reserve fixed i/o space in VMALLOC region */
	pci_reserve_io();

	/*
	 * Finally flush the caches and tlb to ensure that we're in a
	 * consistent state wrt the writebuffer. This also ensures that
	 * any write-allocated cache lines in the vector page are written
	 * back. After this point, we can start to touch devices again.
	 */
	local_flush_tlb_all();
	flush_cache_all();

	/* Enable asynchronous aborts */
	early_abt_enable();
}

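/*
 * Pre-allocate the PTE tables for the pkmap window (when HIGHMEM is
 * enabled) and for the fixmap range, so later users find them in place.
 */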
static void __init kmap_init(void)
{
#ifdef CONFIG_HIGHMEM
	pkmap_page_table = early_pte_alloc(pmd_off_k(PKMAP_BASE),
		PKMAP_BASE, _PAGE_KERNEL_TABLE);
#endif

	early_pte_alloc(pmd_off_k(FIXADDR_START), FIXADDR_START,
			_PAGE_KERNEL_TABLE);
}

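/*
 * Map all lowmem memblock regions.  The section-rounded range covering the
 * kernel image is mapped MT_MEMORY_RWX; everything else gets the
 * non-executable MT_MEMORY_RW type.  A static_vm entry is then recorded
 * for each mapped region.
 */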
static void __init map_lowmem(void)
{
	struct memblock_region *reg;
	phys_addr_t kernel_x_start = round_down(__pa(_stext), SECTION_SIZE);
	phys_addr_t kernel_x_end = round_up(__pa(__init_end), SECTION_SIZE);
	struct static_vm *svm;
	phys_addr_t start;
	phys_addr_t end;
	unsigned long vaddr;
	unsigned long pfn;
	unsigned long length;
	unsigned int type;
	int nr = 0;

	/* Map all the lowmem memory banks. */
	for_each_memblock(memory, reg) {
		struct map_desc map;
		start = reg->base;
		end = start + reg->size;
		nr++;

		if (end > arm_lowmem_limit)
			end = arm_lowmem_limit;
		if (start >= end)
			break;

		if (end < kernel_x_start) {
			map.pfn = __phys_to_pfn(start);
			map.virtual = __phys_to_virt(start);
			map.length = end - start;
			map.type = MT_MEMORY_RWX;

			create_mapping(&map);
		} else if (start >= kernel_x_end) {
			map.pfn = __phys_to_pfn(start);
			map.virtual = __phys_to_virt(start);
			map.length = end - start;
			map.type = MT_MEMORY_RW;

			create_mapping(&map);
		} else {
			/* This better cover the entire kernel */
			if (start < kernel_x_start) {
				map.pfn = __phys_to_pfn(start);
				map.virtual = __phys_to_virt(start);
				map.length = kernel_x_start - start;
				map.type = MT_MEMORY_RW;

				create_mapping(&map);
			}

			map.pfn = __phys_to_pfn(kernel_x_start);
			map.virtual = __phys_to_virt(kernel_x_start);
			map.length = kernel_x_end - kernel_x_start;
			map.type = MT_MEMORY_RWX;

			create_mapping(&map);

			if (kernel_x_end < end) {
				map.pfn = __phys_to_pfn(kernel_x_end);
				map.virtual = __phys_to_virt(kernel_x_end);
				map.length = end - kernel_x_end;
				map.type = MT_MEMORY_RW;

				create_mapping(&map);
			}
		}
	}
	svm = early_alloc_aligned(sizeof(*svm) * nr, __alignof__(*svm));

	for_each_memblock(memory, reg) {
		struct vm_struct *vm;

		start = reg->base;
		end = start + reg->size;

		if (end > arm_lowmem_limit)
			end = arm_lowmem_limit;
		if (start >= end)
			break;

		vm = &svm->vm;
		pfn = __phys_to_pfn(start);
		vaddr = __phys_to_virt(start);
		length = end - start;
		type = MT_MEMORY_RW;

		vm->addr = (void *)(vaddr & PAGE_MASK);
		vm->size = PAGE_ALIGN(length + (vaddr & ~PAGE_MASK));
		vm->phys_addr = __pfn_to_phys(pfn);
		vm->flags = VM_LOWMEM;
		vm->flags |= VM_ARM_MTYPE(type);
		vm->caller = map_lowmem;
		add_static_vm_early(svm++);
		mark_vmalloc_reserved_area(vm->addr, vm->size);
	}
}

#ifdef CONFIG_ARM_PV_FIXUP
extern unsigned long __atags_pointer;
typedef void pgtables_remap(long long offset, unsigned long pgd, void *bdata);
pgtables_remap lpae_pgtables_remap_asm;

/*
 * early_paging_init() recreates boot time page table setup, allowing machines
 * to switch over to a high (>4G) address space on LPAE systems
 */
void __init early_paging_init(const struct machine_desc *mdesc)
{
	pgtables_remap *lpae_pgtables_remap;
	unsigned long pa_pgd;
	unsigned int cr, ttbcr;
	long long offset;
	void *boot_data;

	if (!mdesc->pv_fixup)
		return;

	offset = mdesc->pv_fixup();
	if (offset == 0)
		return;

	/*
	 * Get the address of the remap function in the 1:1 identity
	 * mapping setup by the early page table assembly code. We
	 * must get this prior to the pv update. The following barrier
	 * ensures that this is complete before we fixup any P:V offsets.
	 */
	lpae_pgtables_remap = (pgtables_remap *)(unsigned long)__pa(lpae_pgtables_remap_asm);
	pa_pgd = __pa(swapper_pg_dir);
	boot_data = __va(__atags_pointer);
	barrier();

	pr_info("Switching physical address space to 0x%08llx\n",
		(u64)PHYS_OFFSET + offset);

	/* Re-set the phys pfn offset, and the pv offset */
	__pv_offset += offset;
	__pv_phys_pfn_offset += PFN_DOWN(offset);

	/* Run the patch stub to update the constants */
	fixup_pv_table(&__pv_table_begin,
		(&__pv_table_end - &__pv_table_begin) << 2);

	/*
	 * We are changing not only the virtual to physical mapping, but also
	 * the physical addresses used to access memory. We need to flush
	 * all levels of cache in the system with caching disabled to
	 * ensure that all data is written back, and nothing is prefetched
	 * into the caches. We also need to prevent the TLB walkers
	 * allocating into the caches too. Note that this is ARMv7 LPAE
	 * specific.
	 */
	cr = get_cr();
	set_cr(cr & ~(CR_I | CR_C));
	asm("mrc p15, 0, %0, c2, c0, 2" : "=r" (ttbcr));
	asm volatile("mcr p15, 0, %0, c2, c0, 2"
		: : "r" (ttbcr & ~(3 << 8 | 3 << 10)));
	flush_cache_all();

	/*
	 * Fixup the page tables - this must be in the idmap region as
	 * we need to disable the MMU to do this safely, and hence it
	 * needs to be assembly. It's fairly simple, as we're using the
	 * temporary tables setup by the initial assembly code.
	 */
	lpae_pgtables_remap(offset, pa_pgd, boot_data);

	/* Re-enable the caches and cacheable TLB walks */
	asm volatile("mcr p15, 0, %0, c2, c0, 2" : : "r" (ttbcr));
	set_cr(cr);
}

#else

void __init early_paging_init(const struct machine_desc *mdesc)
{
	long long offset;

	if (!mdesc->pv_fixup)
		return;

	offset = mdesc->pv_fixup();
	if (offset == 0)
		return;

	pr_crit("Physical address space modification is only to support Keystone2.\n");
	pr_crit("Please enable ARM_LPAE and ARM_PATCH_PHYS_VIRT support to use this\n");
	pr_crit("feature. Your kernel may crash now, have a good day.\n");
	add_taint(TAINT_CPU_OUT_OF_SPEC, LOCKDEP_STILL_OK);
}

#endif

#ifdef CONFIG_FORCE_PAGES
/*
 * Remap a PMD into pages.
 * We split a single pmd here; none of this two-pmd nonsense.
 */
static noinline void __init split_pmd(pmd_t *pmd, unsigned long addr,
				unsigned long end, unsigned long pfn,
				const struct mem_type *type)
{
	pte_t *pte, *start_pte;
	pmd_t *base_pmd;

	base_pmd = pmd_offset(
			pud_offset(pgd_offset(&init_mm, addr), addr), addr);

	if (pmd_none(*base_pmd) || pmd_bad(*base_pmd)) {
		start_pte = early_alloc(PTE_HWTABLE_OFF + PTE_HWTABLE_SIZE);
#ifndef CONFIG_ARM_LPAE
		/*
		 * Following is needed when new pte is allocated for pmd[1]
		 * cases, which may happen when base (start) address falls
		 * under pmd[1].
		 */
		if (addr & SECTION_SIZE)
			start_pte += pte_index(addr);
#endif
	} else {
		start_pte = pte_offset_kernel(base_pmd, addr);
	}

	pte = start_pte;

	do {
		set_pte_ext(pte, pfn_pte(pfn, type->prot_pte), 0);
		pfn++;
	} while (pte++, addr += PAGE_SIZE, addr != end);

	*pmd = __pmd((__pa(start_pte) + PTE_HWTABLE_OFF) | type->prot_l1);
	mb(); /* let pmd be programmed */
	flush_pmd_entry(pmd);
	flush_tlb_all();
}

/*
 * It's significantly easier to remap as pages later, after all memory is
 * mapped.  Everything is sections, so all we have to do is split them.
 */
static void __init remap_pages(void)
{
	struct memblock_region *reg;

	for_each_memblock(memory, reg) {
		phys_addr_t phys_start = reg->base;
		phys_addr_t phys_end = reg->base + reg->size;
		unsigned long addr = (unsigned long)__va(phys_start);
		unsigned long end = (unsigned long)__va(phys_end);
		pmd_t *pmd = NULL;
		unsigned long next;
		unsigned long pfn = __phys_to_pfn(phys_start);
		bool fixup = false;
		unsigned long saved_start = addr;

		if (phys_start > arm_lowmem_limit)
			break;
		if (phys_end > arm_lowmem_limit)
			end = (unsigned long)__va(arm_lowmem_limit);
		if (phys_start >= phys_end)
			break;

		pmd = pmd_offset(
			pud_offset(pgd_offset(&init_mm, addr), addr), addr);

#ifndef CONFIG_ARM_LPAE
		if (addr & SECTION_SIZE) {
			fixup = true;
			pmd_empty_section_gap((addr - SECTION_SIZE) & PMD_MASK);
			pmd++;
		}

		if (end & SECTION_SIZE)
			pmd_empty_section_gap(end);
#endif

		do {
			next = addr + SECTION_SIZE;

			if (pmd_none(*pmd) || pmd_bad(*pmd))
				split_pmd(pmd, addr, next, pfn,
					&mem_types[MT_MEMORY_RWX]);
			pmd++;
			pfn += SECTION_SIZE >> PAGE_SHIFT;

		} while (addr = next, addr < end);

		if (fixup) {
			/*
			 * Put a faulting page table here to avoid detecting no
			 * pmd when accessing an odd section boundary. This
			 * needs to be faulting to help catch errors and avoid
			 * speculation.
			 */
			pmd = pmd_off_k(saved_start);
			pmd[0] = pmd[1] & ~1;
		}
	}
}
#else
static void __init remap_pages(void)
{

}
#endif

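/*
 * Switch the fixmap over from bm_pte to the real page tables, and re-create
 * any permanent device fixmap entries with create_mapping() so they survive
 * the switch.
 */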
static void __init early_fixmap_shutdown(void)
{
	int i;
	unsigned long va = fix_to_virt(__end_of_permanent_fixed_addresses - 1);

	pte_offset_fixmap = pte_offset_late_fixmap;
	pmd_clear(fixmap_pmd(va));
	local_flush_tlb_kernel_page(va);

	for (i = 0; i < __end_of_permanent_fixed_addresses; i++) {
		pte_t *pte;
		struct map_desc map;

		map.virtual = fix_to_virt(i);
		pte = pte_offset_early_fixmap(pmd_off_k(map.virtual), map.virtual);

		/* Only i/o device mappings are supported ATM */
		if (pte_none(*pte) ||
		    (pte_val(*pte) & L_PTE_MT_MASK) != L_PTE_MT_DEV_SHARED)
			continue;

		map.pfn = pte_pfn(*pte);
		map.type = MT_DEVICE;
		map.length = PAGE_SIZE;

		create_mapping(&map);
	}
}

/*
 * paging_init() sets up the page tables, initialises the zone memory
 * maps, and sets up the zero page, bad page and bad page tables.
 */
void __init paging_init(const struct machine_desc *mdesc)
{
	void *zero_page;

	build_mem_type_table();
	prepare_page_table();
	map_lowmem();
	memblock_set_current_limit(arm_lowmem_limit);
	dma_contiguous_remap();
	early_fixmap_shutdown();
	remap_pages();
	devicemaps_init(mdesc);
	kmap_init();
	tcm_init();

	top_pmd = pmd_off_k(0xffff0000);

	/* allocate the zero page. */
	zero_page = early_alloc(PAGE_SIZE);

	bootmem_init();

	empty_zero_page = virt_to_page(zero_page);
	__flush_dcache_page(NULL, empty_zero_page);
}