Commit graph

1154 commits

Author SHA1 Message Date
James Hogan
89dea03c09 EDAC, octeon: Fix an uninitialized variable warning
commit 544e92581a2ac44607d7cc602c6b54d18656f56d upstream.

Fix an uninitialized variable warning in the Octeon EDAC driver, as seen
in MIPS cavium_octeon_defconfig builds since v4.14 with Codescape GNU
Tools 2016.05-03:

  drivers/edac/octeon_edac-lmc.c In function ‘octeon_lmc_edac_poll_o2’:
  drivers/edac/octeon_edac-lmc.c:87:24: warning: ‘((long unsigned int*)&int_reg)[1]’ may \
    be used uninitialized in this function [-Wmaybe-uninitialized]
    if (int_reg.s.sec_err || int_reg.s.ded_err) {
                        ^
Iinitialise the whole int_reg variable to zero before the conditional
assignments in the error injection case.

Signed-off-by: James Hogan <jhogan@kernel.org>
Acked-by: David Daney <david.daney@cavium.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linux-mips@linux-mips.org
Fixes: 1bc021e815 ("EDAC: Octeon: Add error injection support")
Link: http://lkml.kernel.org/r/20171113161206.20990-1-james.hogan@mips.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-16 20:09:47 +01:00
Jérémy Lefaure
8f72d29e70 EDAC, i5000, i5400: Fix definition of NRECMEMB register
[ Upstream commit a8c8261425649da58bdf08221570e5335ad33a31 ]

In the i5000 and i5400 drivers, the NRECMEMB register is defined as a
16-bit value, which results in wrong shifts in the code, as reported by
sparse.

In the datasheets ([1], section 3.9.22.20 and [2], section 3.9.22.21),
this register is a 32-bit register. A u32 value for the register fixes
the wrong shifts warnings and matches the datasheet.

Also fix the mask to access to the CAS bits [27:16] in the i5000 driver.

[1]: https://www.intel.com/content/dam/doc/datasheet/5000p-5000v-5000z-chipset-memory-controller-hub-datasheet.pdf
[2]: https://www.intel.se/content/dam/doc/datasheet/5400-chipset-memory-controller-hub-datasheet.pdf

Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170629005729.8478-1-jeremy.lefaure@lse.epita.fr
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-16 10:33:54 +01:00
Jérémy Lefaure
222de157cc EDAC, i5000, i5400: Fix use of MTR_DRAM_WIDTH macro
[ Upstream commit e61555c29c28a4a3b6ba6207f4a0883ee236004d ]

The MTR_DRAM_WIDTH macro returns the data width. It is sometimes used
as if it returned a boolean true if the width if 8. Fix the tests where
MTR_DRAM_WIDTH is misused.

Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170309011809.8340-1-jeremy.lefaure@lse.epita.fr
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-16 10:33:54 +01:00
Gustavo A. R. Silva
c86fa9ed3a EDAC, sb_edac: Fix missing break in switch
[ Upstream commit a8e9b186f153a44690ad0363a56716e7077ad28c ]

Add missing break statement in order to prevent the code from falling
through.

Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20171016174029.GA19757@embeddedor.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-09 18:42:40 +01:00
Emmanouil Maroudas
07adb640aa EDAC: Increment correct counter in edac_inc_ue_error()
commit 993f88f1cc7f0879047ff353e824e5cc8f10adfc upstream.

Fix typo in edac_inc_ue_error() to increment ue_noinfo_count instead of
ce_noinfo_count.

Signed-off-by: Emmanouil Maroudas <emmanouil.maroudas@gmail.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Fixes: 4275be6355 ("edac: Change internal representation to work with layers")
Link: http://lkml.kernel.org/r/1461425580-5898-1-git-send-email-emmanouil.maroudas@gmail.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-07 08:32:41 +02:00
Borislav Petkov
02808fd9e7 EDAC: Correct channel count limit
commit bba142957e04c400440d2df83c1b3b2dfc42e220 upstream.

c44696fff0 ("EDAC: Remove arbitrary limit on number of channels")
lifted the arbitrary limit on memory controller channels in EDAC.
However, the dynamic channel attributes dynamic_csrow_dimm_attr and
dynamic_csrow_ce_count_attr remained 6.

This wasn't a problem except channels 6 and 7 weren't visible in sysfs
on machines with more than 6 channels after the conversion to static
attr groups with

  2c1946b6d6 ("EDAC: Use static attribute groups for managing sysfs entries")

 [ without that, we're exploding in edac_create_sysfs_mci_device()
   because we're dereferencing out of the bounds of the
   dynamic_csrow_dimm_attr array. ]

Add attributes for channels 6 and 7 along with a guard for the
future, should more channels be required and/or to sanity check for
misconfigured machines.

We still need to check against the number of channels present on the MC
first, as Thor reported.

Signed-off-by: Borislav Petkov <bp@suse.de>
Reported-by: Hironobu Ishii <ishii.hironobu@jp.fujitsu.com>
Tested-by: Thor Thayer <tthayer@opensource.altera.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-20 18:09:19 +02:00
Tony Luck
7bf506067c EDAC, sb_edac: Fix rank lookup on Broadwell
commit c7103f650a11328f28b9fa1c95027db331b7774b upstream.

Broadwell made a small change to the rank target register moving the
target rank ID field up from bits 16:19 to bits 20:23.

Also found that the offset field grew by one bit in the IVY_BRIDGE to
HASWELL transition, so fix the RIR_OFFSET() macro too.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/2943fb819b1f7e396681165db9c12bb3df0e0b16.1464735623.git.tony.luck@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-07-27 09:47:27 -07:00
Tony Luck
4d32650fcd EDAC: i7core, sb_edac: Don't return NOTIFY_BAD from mce_decoder callback
commit c4fc1956fa31003bfbe4f597e359d751568e2954 upstream.

Both of these drivers can return NOTIFY_BAD, but this terminates
processing other callbacks that were registered later on the chain.
Since the driver did nothing to log the error it seems wrong to prevent
other interested parties from seeing it. E.g. neither of them had even
bothered to check the type of the error to see if it was a memory error
before the return NOTIFY_BAD.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/72937355dd92318d2630979666063f8a2853495b.1461864507.git.tony.luck@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04 14:48:48 -07:00
Tony Luck
5582eb00f5 x86 EDAC, sb_edac.c: Repair damage introduced when "fixing" channel address
commit ff15e95c82768d589957dbb17d7eb7dba7904659 upstream.

In commit:

  eb1af3b71f9d ("Fix computation of channel address")

I switched the "sck_way" variable from holding the log2 value read
from the h/w to instead be the actual number. Unfortunately it
is needed in log2 form when used to shift the address.

Tested-by: Patrick Geary <patrickg@supermicro.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac@vger.kernel.org
Fixes: eb1af3b71f9d ("Fix computation of channel address")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-05-04 14:48:42 -07:00
Dan Carpenter
608377369d EDAC, amd64_edac: Shift wrapping issue in f1x_get_norm_dct_addr()
commit 6f3508f61c814ee852c199988a62bd954c50dfc1 upstream.

dct_sel_base_off is declared as a u64 but we're only using the lower 32
bits because of a shift wrapping bug. This can possibly truncate the
upper 16 bits of DctSelBaseOffset[47:26], causing us to misdecode the CS
row.

Fixes: c8e518d567 ('amd64_edac: Sanitize f10_get_base_addr_offset')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20160120095451.GB19898@mwanda
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12 09:08:36 -07:00
Tony Luck
dff87fa52d EDAC/sb_edac: Fix computation of channel address
commit eb1af3b71f9d83e45f2fd2fd649356e98e1c582c upstream.

Large memory Haswell-EX systems with multiple DIMMs per channel were
sometimes reporting the wrong DIMM.

Found three problems:

 1) Debug printouts for socket and channel interleave were not interpreting
    the register fields correctly. The socket interleave field is a 2^X
    value (0=1, 1=2, 2=4, 3=8). The channel interleave is X+1 (0=1, 1=2,
    2=3. 3=4).

 2) Actual use of the socket interleave value didn't interpret as 2^X

 3) Conversion of address to channel address was complicated, and wrong.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-04-12 09:08:36 -07:00
Borislav Petkov
b49777c656 EDAC, mc_sysfs: Fix freeing bus' name
commit 12e26969b32c79018165d52caff3762135614aa1 upstream.

I get the splat below when modprobing/rmmoding EDAC drivers. It happens
because bus->name is invalid after bus_unregister() has run. The Code: section
below corresponds to:

  .loc 1 1108 0
  movq    672(%rbx), %rax # mci_1(D)->bus, mci_1(D)->bus
  .loc 1 1109 0
  popq    %rbx    #

  .loc 1 1108 0
  movq    (%rax), %rdi    # _7->name,
  jmp     kfree   #

and %rax has some funky stuff 2030203020312030 which looks a lot like
something walked over it.

Fix that by saving the name ptr before doing stuff to string it points to.

  general protection fault: 0000 [#1] SMP
  Modules linked in: ...
  CPU: 4 PID: 10318 Comm: modprobe Tainted: G          I EN  3.12.51-11-default+ #48
  Hardware name: HP ProLiant DL380 G7, BIOS P67 05/05/2011
  task: ffff880311320280 ti: ffff88030da3e000 task.ti: ffff88030da3e000
  RIP: 0010:[<ffffffffa019da92>]  [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core]
  RSP: 0018:ffff88030da3fe28  EFLAGS: 00010292
  RAX: 2030203020312030 RBX: ffff880311b4e000 RCX: 000000000000095c
  RDX: 0000000000000001 RSI: ffff880327bb9600 RDI: 0000000000000286
  RBP: ffff880311b4e750 R08: 0000000000000000 R09: ffffffff81296110
  R10: 0000000000000400 R11: 0000000000000000 R12: ffff88030ba1ac68
  R13: 0000000000000001 R14: 00000000011b02f0 R15: 0000000000000000
  FS:  00007fc9bf8f5700(0000) GS:ffff8801a7c40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000000000403c90 CR3: 000000019ebdf000 CR4: 00000000000007e0
  Stack:
  Call Trace:
    i7core_unregister_mci.isra.9
    i7core_remove
    pci_device_remove
    __device_release_driver
    driver_detach
    bus_remove_driver
    pci_unregister_driver
    i7core_exit
    SyS_delete_module
    system_call_fastpath
    0x7fc9bf426536
  Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 48 89 fb e8 52 2a 1f e1 48 8b bb a0 02 00 00 e8 46 59 1f e1 48 8b 83 a0 02 00 00 5b <48> 8b 38 e9 26 9a fe e0 66 0f 1f 44 00 00 66 66 66 66 90 48 8b
  RIP  [<ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core]
   RSP <ffff88030da3fe28>

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Fixes: 7a623c0390 ("edac: rewrite the sysfs code to use struct device")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-03 15:07:17 -08:00
Borislav Petkov
de46e65403 EDAC: Robustify workqueues destruction
commit fcd5c4dd8201595d4c598c9cca5e54760277d687 upstream.

EDAC workqueue destruction is really fragile. We cancel delayed work
but if it is still running and requeues itself, we still go ahead and
destroy the workqueue and the queued work explodes when workqueue core
attempts to run it.

Make the destruction more robust by switching op_state to offline so
that requeuing stops. Cancel any pending work *synchronously* too.

  EDAC i7core: Driver loaded.
  general protection fault: 0000 [#1] SMP
  CPU 12
  Modules linked in:
  Supported: Yes
  Pid: 0, comm: kworker/0:1 Tainted: G          IE   3.0.101-0-default #1 HP ProLiant DL380 G7
  RIP: 0010:[<ffffffff8107dcd7>]  [<ffffffff8107dcd7>] __queue_work+0x17/0x3f0
  < ... regs ...>
  Process kworker/0:1 (pid: 0, threadinfo ffff88019def6000, task ffff88019def4600)
  Stack:
   ...
  Call Trace:
   call_timer_fn
   run_timer_softirq
   __do_softirq
   call_softirq
   do_softirq
   irq_exit
   smp_apic_timer_interrupt
   apic_timer_interrupt
   intel_idle
   cpuidle_idle_call
   cpu_idle
  Code: ...
  RIP  __queue_work
   RSP <...>

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-03-03 15:07:17 -08:00
Linus Torvalds
9cf5c095b6 asm-generic cleanups
The asm-generic changes for 4.4 are mostly a series from Christoph Hellwig
 to clean up various abuses of headers in there. The patch to rename the
 io-64-nonatomic-*.h headers caused some conflicts with new users, so I
 added a workaround that we can remove in the next merge window.
 
 The only other patch is a warning fix from Marek Vasut
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIVAwUAVjzaf2CrR//JCVInAQImmhAA20fZ91sUlnA5skKNPT1phhF6Z7UF2Sx5
 nPKcHQD3HA3lT1OKfPBYvCo+loYflvXFLaQThVylVcnE/8ecAEMtft4nnGW2nXvh
 sZqHIZ8fszTB53cynAZKTjdobD1wu33Rq7XRzg0ugn1mdxFkOzCHW/xDRvWRR5TL
 rdQjzzgvn2PNlqFfHlh6cZ5ykShM36AIKs3WGA0H0Y/aYsE9GmDOAUp41q1mLXnA
 4lKQaIxoeOa+kmlsUB0wEHUecWWWJH4GAP+CtdKzTX9v12bGNhmiKUMCETG78BT3
 uL8irSqaViNwSAS9tBxSpqvmVUsa5aCA5M3MYiO+fH9ifd7wbR65g/wq39D3Pc01
 KnZ3BNVRW5XSA3c86pr8vbg/HOynUXK8TN0lzt6rEk8bjoPBNEDy5YWzy0t6reVe
 wX65F+ver8upjOKe9yl2Jsg+5Kcmy79GyYjLUY3TU2mZ+dIdScy/jIWatXe/OTKZ
 iB4Ctc4MDe9GDECmlPOWf98AXqsBUuKQiWKCN/OPxLtFOeWBvi4IzvFuO8QvnL9p
 jZcRDmIlIWAcDX/2wMnLjV+Hqi3EeReIrYznxTGnO7HHVInF555GP51vFaG5k+SN
 smJQAB0/sostmC1OCCqBKq5b6/li95/No7+0v0SUhJJ5o76AR5CcNsnolXesw1fu
 vTUkB/I66Hk=
 =dQKG
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic cleanups from Arnd Bergmann:
 "The asm-generic changes for 4.4 are mostly a series from Christoph
  Hellwig to clean up various abuses of headers in there.  The patch to
  rename the io-64-nonatomic-*.h headers caused some conflicts with new
  users, so I added a workaround that we can remove in the next merge
  window.

  The only other patch is a warning fix from Marek Vasut"

* tag 'asm-generic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic: temporarily add back asm-generic/io-64-nonatomic*.h
  asm-generic: cmpxchg: avoid warnings from macro-ized cmpxchg() implementations
  gpio-mxc: stop including <asm-generic/bug>
  n_tracesink: stop including <asm-generic/bug>
  n_tracerouter: stop including <asm-generic/bug>
  mlx5: stop including <asm-generic/kmap_types.h>
  hifn_795x: stop including <asm-generic/kmap_types.h>
  drbd: stop including <asm-generic/kmap_types.h>
  move count_zeroes.h out of asm-generic
  move io-64-nonatomic*.h out of asm-generic
2015-11-06 14:22:15 -08:00
Linus Torvalds
b831ef2cad Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS changes from Ingo Molnar:
 "The main system reliability related changes were from x86, but also
  some generic RAS changes:

   - AMD MCE error injection subsystem enhancements.  (Aravind
     Gopalakrishnan)

   - Fix MCE and CPU hotplug interaction bug.  (Ashok Raj)

   - kcrash bootup robustness fix.  (Baoquan He)

   - kcrash cleanups.  (Borislav Petkov)

   - x86 microcode driver rework: simplify it by unmodularizing it and
     other cleanups.  (Borislav Petkov)"

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  x86/mce: Add a default case to the switch in __mcheck_cpu_ancient_init()
  x86/mce: Add a Scalable MCA vendor flags bit
  MAINTAINERS: Unify the microcode driver section
  x86/microcode/intel: Move #ifdef DEBUG inside the function
  x86/microcode/amd: Remove maintainers from comments
  x86/microcode: Remove modularization leftovers
  x86/microcode: Merge the early microcode loader
  x86/microcode: Unmodularize the microcode driver
  x86/mce: Fix thermal throttling reporting after kexec
  kexec/crash: Say which char is the unrecognized
  x86/setup/crash: Check memblock_reserve() retval
  x86/setup/crash: Cleanup some more
  x86/setup/crash: Remove alignment variable
  x86/setup: Cleanup crashkernel reservation functions
  x86/amd_nb, EDAC: Rename amd_get_node_id()
  x86/setup: Do not reserve crashkernel high memory if low reservation failed
  x86/microcode/amd: Do not overwrite final patch levels
  x86/microcode/amd: Extract current patch level read to a function
  x86/ras/mce_amd_inj: Inject bank 4 errors on the NBC
  x86/ras/mce_amd_inj: Trigger deferred and thresholding errors interrupts
  ...
2015-11-03 17:51:33 -08:00
Tan Xiaojun
990995bad1 EDAC: Fix PAGES_TO_MiB macro misuse
The PAGES_TO_MiB macro is used for unit conversion but the
trace_mc_event() tracepoint expects a page address. Fix that.

Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1445341538-24271-1-git-send-email-tanxiaojun@huawei.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-10-22 22:57:30 +02:00
Aravind Gopalakrishnan
1a6775c1a2 x86/amd_nb, EDAC: Rename amd_get_node_id()
This function doesn't give us the "Node ID" as the function name
suggests. Rather, it receives a PCI device as argument, checks
the available F3 PCI device IDs in the system and returns the
index of the matching Bus/Device IDs.

Rename it to amd_pci_dev_to_node_id().

No functional change is introduced.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1445246268-26285-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-10-21 11:10:55 +02:00
Dinh Nguyen
941fd2e709 EDAC, altera: SoCFPGA EDAC should not look for ECC_CORR_EN
The bootloader may or may not enable the ECC_CORR_EN bit. By
not enabling ECC_CORR_EN, when error happens, it is the user's
responsibility to perform a full SDRAM scrub.

Remove the check for ECC_CORR_EN.

Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Thor Thayer <tthayer@opensource.altera.com>
Link: http://lkml.kernel.org/r/1444864456-21778-1-git-send-email-dinguyen@opensource.altera.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-10-15 11:57:23 +02:00
Christoph Hellwig
2f8e2c8777 move io-64-nonatomic*.h out of asm-generic
These are not implementations of default architecture code but helpers
for drivers. Move them to the place they belong to.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Darren Hart <dvhart@linux.intel.com>
Acked-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2015-10-15 00:21:07 +02:00
Tan Xiaojun
30f84a891b EDAC: Use edac_debugfs_remove_recursive()
debugfs_remove() is used to remove a file or a directory from the
debugfs filesystem, but mci->debugfs might not empty.

This can be triggered by the following sequence:

1) Enable CONFIG_EDAC_DEBUG
2) insmod an EDAC module (like i3000_edac or similar)
3) rmmod this module
4) we can see files remaining under <debugfs_mountpoint>/edac/ like
   "fake_inject", for example.

Removing edac_core then, causes a NULL pointer dereference.

Reported-by: Yun Wu (Abel) <wuyun.wu@huawei.com>
Signed-off-by: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1444787364-104353-1-git-send-email-tanxiaojun@huawei.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-10-14 18:50:32 +02:00
Luis de Bethencourt
90b3b37383 EDAC, ppc4xx_edac: Fix module autoload for OF platform driver
This platform driver has an OF device ID table but the OF module alias
information is not created so module autoloading won't work.

Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20150917114619.GA13145@goodgumbo.baconseed.org
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-10-03 12:19:42 +02:00
Aravind Gopalakrishnan
1a8bc7707e EDAC, amd64_edac: Update copyright and remove changelog
Git provides us all the changelogs anyway. So trim the comments section
here. Update the copyrights info while at it.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1443440593-2316-3-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-29 13:41:04 +02:00
Aravind Gopalakrishnan
da92110dfd EDAC, amd64_edac: Extend scrub rate support to F15hM60h
The scrub rate control register has moved to function 2 in PCI config
space and is at a different offset on family 0x15, models 0x60 and
later. The minimum recommended scrub rate has also changed. (Refer to
D18F2x1c9_dct[1:0][DramScrub] in Fam15hM60h BKDG).

Adjust set_scrub_rate() and get_scrub_rate() functions to accommodate
this.

Tested on F15hM60h, Fam15h, models 00h-0fh and Fam10h systems.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1443440593-2316-2-git-send-email-Aravind.Gopalakrishnan@amd.com
[ Cleanup conditionals. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-29 13:25:33 +02:00
Toshi Kani
d0c9c93019 EDAC: Don't allow empty DIMM labels
Updating dimm_label to an empty string does not make much sense. Change
the sysfs dimm_label store operation to fail a request when an input
string is empty.

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: elliott@hpe.com
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1443124767.25474.172.camel@hpe.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-28 16:39:05 +02:00
Toshi Kani
438470b84c EDAC: Fix sysfs dimm_label store operation
Sysfs "dimm_label" and "chX_dimm_label" nodes have the following issues
in their store operation:

 1) A newline-terminated input string causes redundant newlines:

  # echo "test" > /sys/bus/mc0/devices/dimm0/dimm_label
  # cat  /sys/bus/mc0/devices/dimm0/dimm_label
  test

  #  od -bc /sys/bus/mc0/devices/dimm0/dimm_label
  0000000 164 145 163 164 012 012
            t   e   s   t  \n  \n
  0000006

 2) The original label string (31 characters) cannot be stored due to
    an improper size check:

  # echo "CPU_SrcID#0_Ha#0_Chan#0_DIMM#0" > /sys/bus/mc0/devices/dimm0/dimm_label
  # cat /sys/bus/mc0/devices/dimm0/dimm_label

  # od -bc /sys/bus/mc0/devices/dimm0/dimm_label
   0000000 012 012
            \n  \n
   0000002

 3) An input string longer than the buffer size results a wrong label
    info as it allows a retry with the remaining string:

  # echo "CPU_SrcID#0_Ha#0_Chan#0_DIMM#0_TEST" > /sys/bus/mc0/devices/dimm0/dimm_label
  # cat  /sys/bus/mc0/devices/dimm0/dimm_label
  _TEST

Fix these issues by making the following changes:
 1) Replace a newline character at the end by setting a null. It also
    assures that the string is null-terminated in the label buffer.
 2) Check the label buffer size with 'sizeof(dimm->label)'.
 3) Fail a request if its string exceeds the label buffer size.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Robert Elliott <elliott@hpe.com>
Link: http://lkml.kernel.org/r/1443121564.25474.160.camel@hpe.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-25 19:45:59 +02:00
Toshi Kani
1ea62c59c8 EDAC: Fix sysfs dimm_label show operation
After

  7d375bffa5 ("sb_edac: Fix support for systems with two home agents per socket")

sysfs "dimm_label" and "chX_dimm_label" show their label string without a
newline "\n" at the end.

  [root@orange ~]# cat /sys/bus/mc0/devices/dimm0/dimm_label
  CPU_SrcID#0_Ha#0_Chan#0_DIMM#0[root@orange ~]#

  [root@orange ~]# cat /sys/devices/system/edac/mc/mc0/csrow0/ch0_dimm_label
  CPU_SrcID#0_Ha#0_Chan#0_DIMM#0[root@orange ~]#

The label strings now have 31 characters, which are the same as
EDAC_MC_LABEL_LEN. Since the snprintf()s in channel_dimm_label_show()
and dimmdev_label_show() limit the whole length by EDAC_MC_LABEL_LEN,
the newline in the format "%s\n" is ignored.

  [root@orange ~]# od -bc /sys/bus/mc0/devices/dimm0/dimm_label
  0000000 103 120 125 137 123 162 143 111 104 043 060 137 110 141 043 060
            C   P   U   _   S   r   c   I   D   #   0   _   H   a   #   0
  0000020 137 103 150 141 156 043 060 137 104 111 115 115 043 060 000
            _   C   h   a   n   #   0   _   D   I   M   M   #   0  \0
  0000037

Fix it by using 'sizeof(dimm->label) + 1' as the whole length in the
snprintf()s in channel_dimm_label_show() and dimmdev_label_show().

Reported-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Link: http://lkml.kernel.org/r/1442933883-21587-2-git-send-email-toshi.kani@hpe.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-25 19:19:21 +02:00
Loc Ho
f864b79ba2 EDAC, xgene: Add SoC support
Add support for the SoC component.

Signed-off-by: Loc Ho <lho@apm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: devicetree@vger.kernel.org
Cc: ijc+devicetree@hellion.org.uk
Cc: jcm@redhat.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mark.rutland@arm.com
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: patches@apm.com
Cc: robh+dt@kernel.org
Link: http://lkml.kernel.org/r/1443055261-8613-4-git-send-email-lho@apm.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-25 15:41:46 +02:00
Loc Ho
9bc1c0c0ec EDAC, xgene: Fix possible sprintf() overflow issue
Replace sprintf() with snprintf() to avoid possible string array
overflow.

Signed-off-by: Loc Ho <lho@apm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: devicetree@vger.kernel.org
Cc: ijc+devicetree@hellion.org.uk
Cc: jcm@redhat.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mark.rutland@arm.com
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: patches@apm.com
Cc: robh+dt@kernel.org
Link: http://lkml.kernel.org/r/1443116287-11752-1-git-send-email-lho@apm.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-25 15:39:52 +02:00
Loc Ho
9347473c7d EDAC, xgene: Add L3 support
Add EDAC support for the L3 component.

Signed-off-by: Loc Ho <lho@apm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: devicetree@vger.kernel.org
Cc: ijc+devicetree@hellion.org.uk
Cc: jcm@redhat.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: mark.rutland@arm.com
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: patches@apm.com
Cc: robh+dt@kernel.org
Link: http://lkml.kernel.org/r/1443055261-8613-3-git-send-email-lho@apm.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-25 15:36:31 +02:00
Seth Jennings
2900ea6096 EDAC, sb_edac: Fix TAD presence check for sbridge_mci_bind_devs()
In commit

  7d375bffa5 ("sb_edac: Fix support for systems with two home agents per socket")

NUM_CHANNELS was changed to 8 and the channel space was renumerated to
handle EN, EP, and EX configurations.

The *_mci_bind_devs() functions - except for sbridge_mci_bind_devs() -
got a new device presence check in the form of saw_chan_mask. However,
sbridge_mci_bind_devs() still uses the NUM_CHANNELS for loop.

With the increase in NUM_CHANNELS, this loop fails at index 4 since
SB only has 4 TADs.  This results in the following error on SB machines:

  EDAC sbridge: Some needed devices are missing
  EDAC sbridge: Couldn't find mci handler
  EDAC sbridge: Couldn't find mci handle

This patch adapts the saw_chan_mask logic for sbridge_mci_bind_devs() as
well.

After this patch:

  EDAC MC0: Giving out device to module sbridge_edac.c controller Sandy Bridge Socket#0: DEV 0000:3f:0e.0 (POLLED)
  EDAC MC1: Giving out device to module sbridge_edac.c controller Sandy Bridge Socket#1: DEV 0000:7f:0e.0 (POLLED)

Signed-off-by: Seth Jennings <sjenning@redhat.com>
Acked-by: Aristeu Rozanski <aris@redhat.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # v4.2
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1438798561-10180-1-git-send-email-sjenning@redhat.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-24 20:40:50 +02:00
Aravind Gopalakrishnan
58a9c251c9 EDAC, ghes_edac: Remove redundant memory_type array
We already have edac_mem_types[] that enumerates the different kinds of
memory. So, use that and remove the redundant memory_type[] array here.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1442436811-23382-2-git-send-email-Aravind.Gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-23 16:59:25 +02:00
Borislav Petkov
09bd1b4f81 EDAC, xgene: Convert to debugfs wrappers
Drop CONFIG_EDAC_DEBUG ifdeffery too, while at it.

Tested-by: Loc Ho <lho@apm.com>
Cc: linux-edac@vger.kernel.org
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-23 11:37:45 +02:00
Borislav Petkov
52019e406c EDAC, i5100: Convert to debugfs wrappers
This driver creates its debugfs hierarchy under the toplevel debugfs dir
- see i5100_init() - so make it use edac_debugfs_create_dir_at( , NULL)
because we're not breaking userspace. Oh well.

Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-22 18:21:12 +02:00
Borislav Petkov
bba3b31e44 EDAC, altera: Convert to debugfs wrappers
Use the EDAC-specific wrappers. Drop CONFIG_EDAC_DEBUG ifdeffery.

Cc: Thor Thayer <tthayer@opensource.altera.com>
Cc: <linux-edac@vger.kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-22 18:20:56 +02:00
Borislav Petkov
4397bcb4fa EDAC: Add debugfs wrappers
Later patches will convert EDAC users to those.

Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-22 18:10:22 +02:00
Borislav Petkov
7ac8bf9bc9 EDAC: Carve out debugfs functionality
... into a separate compilation unit and drop a couple of
CONFIG_EDAC_DEBUG ifdefferies. Rename edac_create_debug_nodes() to
edac_create_debugfs_nodes(), while at it.

No functionality change.

Cc: <linux-edac@vger.kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-09-22 12:29:46 +02:00
Linus Torvalds
d9b44fe30f edac updates for v4.3-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV8giFAAoJEAhfPr2O5OEVYxsP/imxjTa1utbeYToA+oqut9Yw
 HbMB6yjscfZF/CwP/rB/T5jNTTww6pvavBntw29NewTmPIfREuvYMvvOtCkyKdQf
 yEVeME0cvJAiSdtiqeWoAMJYm3VMDKPh0p/cncvhcpip0ORU3prV9XEqNo1byvEU
 S4laHNfxgDbTHptVhvaT7Li/yN3KyenlaT61K3dYn0LtbPf6uxSsvqX+ExuZSix7
 ZWS+wQrzX35hQHvL7Ax1iZv15uTfrC2kD2bwZrgetvl6wVdwBKx9vbWosaZ+XffV
 gv4dxJG5zNV2nFPfFUxqQZ++OcieuqP1yeWtwbuBPx3qibDKTb/IJYX8VvHAAc8n
 EgaDwFsYH6YnLsxNL58+W9PYigVyc2VS2Nfqz9uRuXmuryoyDbCpcgVmZYdmgYKn
 c1Rc4DzOqvOWrv5jdtSxYmdmAMqgf0LjlMzUmZ7shJYWprsqjgwqxevPXmH9xb7s
 myBpM604KofiWGPGqVNVYIdt8AldXc1QNnyCxGHcAAdIzEOn+7WEdeWqVQ665Vya
 x2Cp9h1IURPtcniw7Dx/0nqBu64Y6OMGz9W6H0SxBUWc2cC6QZZ4k9zOg2krtFkt
 c/S6RCYdIoxIcwTPfhyoBfCAHueSfUaJ8sIAWMrQLFUCgALN2f1WslIYEJVGcPOT
 UeQzSo1pe9wyZu5hQzji
 =DrsG
 -----END PGP SIGNATURE-----

Merge tag 'edac/v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac

Pull edac updates from Mauro Carvalho Chehab:
 "Two EDAC fixes for Intel systems (Haswell and Ivy Bridge)"

* tag 'edac/v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac:
  sb_edac: correctly fetch DIMM width on Ivy Bridge and Haswell
  sb_edac: look harder for DDRIO on Haswell systems
2015-09-11 16:21:12 -07:00
Aristeu Rozanski
12f0721c5a sb_edac: correctly fetch DIMM width on Ivy Bridge and Haswell
dimm_dev_type has been incorrectly determined in sb_edac. This patch fixes it
for Ivy Bridge and Haswell only since nothing like exists for Sandy Bridge.
We tested this patch in multiple systems matching the results with the
installed memory modules.

Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-09-08 20:33:48 -03:00
Aristeu Rozanski
7179385afe sb_edac: look harder for DDRIO on Haswell systems
In case the memory banks are populated so the first channel isn't used, the
DDRIO PCI device won't be visible and it won't be possible to determine the
memory type.

Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
2015-09-08 20:32:13 -03:00
Linus Torvalds
2d678b68e8 Minor stuff: AMD MCE decoding correction and xgene_edac cleanup.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV5aADAAoJEBLB8Bhh3lVKHM8P/RLgWShqZ0CwXz+2v9e8wZxr
 Uwo3xhx0jqUTefBNZqMGgJKq7DexNXWa/40vHydPO4S7yNRka4aftev01OVr2tVG
 3v/ggqxV3SRYkTyhQhjzs39MDeb6+jZlMPz17DmM75iry1og9iT213jPIk6aAt9G
 OoIGgWWT6rtsDa55+Yat/15qc4ajE5mmJnwej/Ybr/B3qI4/Zm3XGVqopbg2axrw
 CtmcK7+3OJ2mvbBTGCoNAL+xtcCRXsJvch8+eo1PhJDjA1lY37lCdZa2TaD0RvsU
 /YFDotPNYf09RJW+f8FvMkj4P3/Upwo0w97e1nhqLRYemmezdSBd+iU3KWHCdWkf
 wj1XFtdpDEDuDzpo2zX5KOA/KgB7ozcRKsm9NzcQHNnbt5UCXmHIWuK3+3ED/RdM
 Pbm4SrLzFX0Fp/08/Vd55CSZ0bP8+bduP5su8yBfX+oUuNY3cvNO2e7zmSRaifON
 CI+hpjfjpaDOmydK4/RbtPvB3YkcykeB7Jk/gyDTD2ZKjxoQw8YMH5+7Xy/cMQYA
 aKnnbE0O14+fnp7JrZJ/ER2FlcSOMiWBhgPJ2wi+0BqcJZhpwvhkEMmClsMtrQss
 u7rX9B5+GqeAQ145X+67CK3VwckyD6ZO6rwP7q6VEiknmkMAO9ML4DFUhCO7YeHx
 8s5Cpy94v2rPext0O2Uw
 =23M/
 -----END PGP SIGNATURE-----

Merge tag 'edac_for_4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC fixes from Borislav Petkov:
 "Two minor fixlets this time: AMD MCE decoding correction and
  xgene_edac cleanup"

* tag 'edac_for_4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC, mce_amd: Don't emit 'CE' for Deferred error
  EDAC, xgene: Drop owner assignment from platform_driver
2015-09-01 18:34:22 -07:00
Linus Torvalds
3959df1dfb Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Ingo Molnar:
 "MCE handling updates, but also some generic drivers/edac/ changes to
  better organize the Kconfig space"

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ras: Move AMD MCE injector to arch/x86/ras/
  x86/mce: Add a wrapper around mce_log() for injection
  x86/mce: Rename rcu_dereference_check_mce() to mce_log_get_idx_check()
  RAS: Add a menuconfig option with descriptive text
  x86/mce: Reenable CMCI banks when swiching back to interrupt mode
  x86/mce: Clear Local MCE opt-in before kexec
  x86/mce: Remove unused function declarations
  x86/mce: Kill drain_mcelog_buffer()
  x86/mce: Avoid potential deadlock due to printk() in MCE context
  x86/mce: Remove the MCE ring for Action Optional errors
  x86/mce: Don't use percpu workqueues
  x86/mce: Provide a lockless memory pool to save error records
  x86/mce: Reuse one of the u16 padding fields in 'struct mce'
2015-08-31 20:20:30 -07:00
Borislav Petkov
6c36dfe949 x86/ras: Move AMD MCE injector to arch/x86/ras/
This is an x86-specific module and would benefit from being
closer to the arch code. Move it there. Update copyright while
at it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/1439396985-12812-14-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:54 +02:00
Borislav Petkov
eef4dfa0cb x86/mce: Kill drain_mcelog_buffer()
This used to flush out MCEs logged during early boot and which
were in the MCA registers from a previous system run. No need
for that now, since we've moved to a genpool.

Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1439396985-12812-7-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:52 +02:00
Chen, Gong
fd4cf79fcc x86/mce: Remove the MCE ring for Action Optional errors
Use unified genpool to save Action Optional error events and put
Action Optional error handling in the same notification chain as
MCE error decoding.

Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
[ Fold in subsequent patch from Boris for early boot logging. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
[ Correct a lot. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1439396985-12812-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-13 10:12:51 +02:00
Michael Walle
5c16179b55 EDAC, ppc4xx: Access mci->csrows array elements properly
The commit

  de3910eb79 ("edac: change the mem allocation scheme to
		 make Documentation/kobject.txt happy")

changed the memory allocation for the csrows member. But ppc4xx_edac was
forgotten in the patch. Fix it.

Signed-off-by: Michael Walle <michael@walle.cc>
Cc: <stable@vger.kernel.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Link: http://lkml.kernel.org/r/1437469253-8611-1-git-send-email-michael@walle.cc
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-08-13 06:02:19 +02:00
Aravind Gopalakrishnan
99e1dfb7d2 EDAC, mce_amd: Don't emit 'CE' for Deferred error
Currently, when decoding an MCE, we display 'CE' for a Deferred error, like
this:

[Hardware Error]: CPU:0 (15:2:0) MC4_STATUS[Over|CE|MiscV|-|AddrV|Deferred|-|UECC]: 0xdc04b00095080813

When the 'UC' bit in the MCx_STATUS register is clear, the error status
is either a Corrected error or Deferred error as determined by the
'Deferred' bit. So do not print 'CE' on a deferred error.

Refer to AMD Error Scope Hierarchy table in a newer BKDG (example:
49125_15h_Models_30h-3Fh_BKDG.pdf, section "RAS Features").

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/1436788382-6463-1-git-send-email-aravind.gopalakrishnan@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-07-14 06:32:53 +02:00
Krzysztof Kozlowski
ca12bb14fe EDAC, xgene: Drop owner assignment from platform_driver
platform_driver does not need to set an owner because
platform_driver_register() will set it.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Loc Ho <lho@apm.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Link: http://lkml.kernel.org/r/1436507243-11159-1-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-07-10 14:16:24 +02:00
Linus Torvalds
b53343fc6c A build fix for octeon_edac from Aaro Koskinen.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVlPv5AAoJEBLB8Bhh3lVKlxIP/35j4O/i4+xgw2CRQvAflJgz
 haUZsguT7QRXqbVmQdAMqyCrz77jnIIFIJEcoqpvpcHSVuNblbjBS6xc7u1UAfmF
 h7M+XIFZuIX7mp6svVJujN+VgP1Obgsx860bG335HIJZZzPfOfs31pisxUddXNpC
 deuP4opLlKcykioFF4GXhoLzDBoc2PLfPGczNRzw9HUisN2F+SK4NEa7iehp8RBX
 UCQTtwsLFuBzi4AHqAfrNKgDiEQ+DZ529sVh5bqAk/U8VDqUulKqIjLh0G75tYm2
 SauVE2woOn4WqUTG8CJBjCTauL+BIc2LTcPh6qIzWxsgZnyeHCve/1QBBOMi/yk2
 FAJYg2VDhPqj0OyUqC7JUXicB3ZKviWYTZ3knv9mIF8m98N8K4kAwGPuUkTc3Z2P
 fBBusvzb+6/OTN2WyQaLx8DjG/+Ha/o0AEeWE6S4A9+urCrJIx+Dkelt3dngdrVe
 kbHyvKmTPy13U1zEo5jmMZkjf7xNfotpckTIwCz3l00Z5AwPPVjC5KylhPvlwM7L
 BpftYJsPRJrIK0zkuU3Yfcrhy8U2QcQAV14IMEW1r5pZwpLH+EfoIHAbh8PWJc5C
 toporRXFc2R7MT1+0DA4faNzfwXXBLbT5BYRzNfTZPClAeSqKE0OPj4Ytv+JWyO5
 kx3HK/kWiSH2SxQ1Sid1
 =7C57
 -----END PGP SIGNATURE-----

Merge tag 'edac_urgent_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp

Pull EDAC fix from Borislav Petkov:
 "A build fix for octeon_edac from Aaro Koskinen"

* tag 'edac_urgent_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
  EDAC, octeon: Fix broken build due to model helper renames
2015-07-03 12:10:12 -07:00
Aaro Koskinen
75a15a7864 EDAC, octeon: Fix broken build due to model helper renames
Commit

  debe6a623d ("MIPS: OCTEON: Update octeon-model.h code for new SoCs.")

renamed some SoC model helper functions, but forgot to update the EDAC
drivers resulting in build failures. Fix that.

Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Acked-by: David Daney <david.daney@cavium.com>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: linux-mips@linux-mips.org
Link: http://lkml.kernel.org/r/1435747132-10954-1-git-send-email-aaro.koskinen@nokia.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2015-07-02 10:46:28 +02:00
Linus Torvalds
8d7804a2f0 Driver core patches for 4.2-rc1
Here is the driver core / firmware changes for 4.2-rc1.
 
 A number of small changes all over the place in the driver core, and in
 the firmware subsystem.  Nothing really major, full details in the
 shortlog.  Some of it is a bit of churn, given that the platform driver
 probing changes was found to not work well, so they were reverted.
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlWNoCQACgkQMUfUDdst+ym4JACdFrrXoMt2pb8nl5gMidGyM9/D
 jg8AnRgdW8ArDA/xOarULd/X43eA3J3C
 =Al2B
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the driver core / firmware changes for 4.2-rc1.

  A number of small changes all over the place in the driver core, and
  in the firmware subsystem.  Nothing really major, full details in the
  shortlog.  Some of it is a bit of churn, given that the platform
  driver probing changes was found to not work well, so they were
  reverted.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'driver-core-4.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (31 commits)
  Revert "base/platform: Only insert MEM and IO resources"
  Revert "base/platform: Continue on insert_resource() error"
  Revert "of/platform: Use platform_device interface"
  Revert "base/platform: Remove code duplication"
  firmware: add missing kfree for work on async call
  fs: sysfs: don't pass count == 0 to bin file readers
  base:dd - Fix for typo in comment to function driver_deferred_probe_trigger().
  base/platform: Remove code duplication
  of/platform: Use platform_device interface
  base/platform: Continue on insert_resource() error
  base/platform: Only insert MEM and IO resources
  firmware: use const for remaining firmware names
  firmware: fix possible use after free on name on asynchronous request
  firmware: check for file truncation on direct firmware loading
  firmware: fix __getname() missing failure check
  drivers: of/base: move of_init to driver_init
  drivers/base: cacheinfo: fix annoying typo when DT nodes are absent
  sysfs: disambiguate between "error code" and "failure" in comments
  driver-core: fix build for !CONFIG_MODULES
  driver-core: make __device_attach() static
  ...
2015-06-26 15:07:37 -07:00