Jia He [Tue, 28 Aug 2018 04:53:26 +0000 (12:53 +0800)]
irqchip/gic-v3-its: Cap lpi_id_bits to reduce memory footprint
Commit cd6c530221cc ("irqchip/gic-v3-its: Use full range of LPIs"), removes
the cap for lpi_id_bits, which causes the following warning to trigger on a
QDF2400 server:
In its_alloc_lpi_tables(), lpi_id_bits is 24 in QDF2400. The allocation in
allocate_prop_table() tries therefore to allocate 16M (order 12 if
pagesize=4k), which triggers the warning.
As said by MarcL
Capping lpi_id_bits at 16 (which is what we had before) is plenty,
will save a some memory, and gives some margin before we need to push
it up again.
Bring the upper limit of lpi_id_bits back to prevent
Fixes: cd6c530221cc ("irqchip/gic-v3-its: Use full range of LPIs") Suggested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Jia He <jia.he@hxt-semitech.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Olof Johansson <olof@lixom.net> Cc: Jason Cooper <jason@lakedaemon.net> Cc: linux-arm-kernel@lists.infradead.org Link: https://lkml.kernel.org/r/1535432006-2304-1-git-send-email-jia.he@hxt-semitech.com
Merge tag 'apparmor-pr-2018-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor fix from John Johansen:
"A fix for an issue syzbot discovered last week:
- Fix for bad debug check when converting secids to secctx"
* tag 'apparmor-pr-2018-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: fix bad debug check in apparmor_secid_to_secctx()
Merge tag 'trace-v4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"This fixes two annoying bugs:
- The first one is a side effect caused by using SRCU for rcuidle
tracepoints. It seems that the perf was depending on the rcuidle
tracepoints to make RCU watch when it wasn't.
The real fix will be to have perf use SRCU instead of depending on
RCU watching, but that can't be done until SRCU is safe to use in
NMI context (Paul's working on that).
- The second bug fix is for a bug that's been periodically making my
tests fail randomly for some time. I haven't had time to track it
down, but finally have. It has to do with stressing NMIs (via perf)
while enabling or disabling ftrace function handling with lockdep
enabled.
If an interrupt happens and just as it returns, it sets lockdep
back to "interrupts enabled" but before it returns an NMI is
triggered, and if this happens while printk_nmi_enter has a
breakpoint attached to it (because ftrace is converting it to or
from nop to call fentry), the breakpoint trap also calls into
lockdep, and since returning from the NMI to a interrupt handler,
interrupts were disabled when the NMI went off, lockdep keeps its
state as interrupts disabled when it returns back from the
interrupt handler where interrupts are enabled.
This causes lockdep_assert_irqs_enabled() to trigger a false
positive"
* tag 'trace-v4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
printk/tracing: Do not trace printk_nmi_enter()
tracing: Add back in rcu_irq_enter/exit_irqson() for rcuidle tracepoints
Merge tag 'for-4.19-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix for improper fsync after hardlink
- fix for a corruption during file deduplication
- use after free fixes
- RCU warning fix
- fix for buffered write to nodatacow file
* tag 'for-4.19-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: Fix suspicious RCU usage warning in btrfs_debug_in_rcu
btrfs: use after free in btrfs_quota_enable
btrfs: btrfs_shrink_device should call commit transaction at the end
btrfs: fix qgroup_free wrong num_bytes in btrfs_subvolume_reserve_metadata
Btrfs: fix data corruption when deduplicating between different files
Btrfs: sync log after logging new name
Btrfs: fix unexpected failure of nocow buffered writes after snapshotting when low on space
------------[ cut here ]------------
IRQs not enabled as expected
WARNING: CPU: 3 PID: 0 at kernel/time/tick-sched.c:982 tick_nohz_idle_enter+0x44/0x8c
Modules linked in: ip6t_REJECT nf_reject_ipv6 ip6table_filter ip6_tables ipv6
CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.19.0-rc2-test+ #2
Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
EIP: tick_nohz_idle_enter+0x44/0x8c
Code: ec 05 00 00 00 75 26 83 b8 c0 05 00 00 00 75 1d 80 3d d0 36 3e c1 00
75 14 68 94 63 12 c1 c6 05 d0 36 3e c1 01 e8 04 ee f8 ff <0f> 0b 58 fa bb a0
e5 66 c1 e8 25 0f 04 00 64 03 1d 28 31 52 c1 8b
EAX: 0000001c EBX: f26e7f8c ECX: 00000006 EDX: 00000007
ESI: f26dd1c0 EDI: 00000000 EBP: f26e7f40 ESP: f26e7f38
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010296
CR0: 80050033 CR2: 0813c6b0 CR3: 2f342000 CR4: 001406f0
Call Trace:
do_idle+0x33/0x202
cpu_startup_entry+0x61/0x63
start_secondary+0x18e/0x1ed
startup_32_smp+0x164/0x168
irq event stamp: 18773830
hardirqs last enabled at (18773829): [<c040150c>] trace_hardirqs_on_thunk+0xc/0x10
hardirqs last disabled at (18773830): [<c040151c>] trace_hardirqs_off_thunk+0xc/0x10
softirqs last enabled at (18773824): [<c0ddaa6f>] __do_softirq+0x25f/0x2bf
softirqs last disabled at (18773767): [<c0416bbe>] call_on_stack+0x45/0x4b
---[ end trace b7c64aa79e17954a ]---
After a bit of debugging, I found what was happening. This would trigger
when performing "perf" with a high NMI interrupt rate, while enabling and
disabling function tracer. Ftrace uses breakpoints to convert the nops at
the start of functions to calls to the function trampolines. The breakpoint
traps disable interrupts and this makes calls into lockdep via the
trace_hardirqs_off_thunk in the entry.S code. What happens is the following:
do_idle {
[interrupts enabled]
<interrupt> [interrupts disabled]
TRACE_IRQS_OFF [lockdep says irqs off]
[...]
TRACE_IRQS_IRET
test if pt_regs say return to interrupts enabled [yes]
TRACE_IRQS_ON [lockdep says irqs are on]
<nmi>
nmi_enter() {
printk_nmi_enter() [traced by ftrace]
[ hit ftrace breakpoint ]
<breakpoint exception>
TRACE_IRQS_OFF [lockdep says irqs off]
[...]
TRACE_IRQS_IRET [return from breakpoint]
test if pt_regs say interrupts enabled [no]
[iret back to interrupt]
[iret back to code]
tick_nohz_idle_enter() {
lockdep_assert_irqs_enabled() [lockdep say no!]
Although interrupts are indeed enabled, lockdep thinks it is not, and since
we now do asserts via lockdep, it gives a false warning. The issue here is
that printk_nmi_enter() is called before lockdep_off(), which disables
lockdep (for this reason) in NMIs. By simply not allowing ftrace to see
printk_nmi_enter() (via notrace annotation) we keep lockdep from getting
confused.
Cc: stable@vger.kernel.org Fixes: aca0962437868 ("printk/nmi: generic solution for safe printk in NMI") Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Merge tag 'gpio-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"Some GPIO fixes. The ACPI stuff is probably the most annoying for
users that get fixed this time.
- Atomic contexts, cansleep* calls and such fastpath/slopwpath
things.
- Defer ACPI event handler registration to late_initcall() so IRQs do
not fire in our face before other drivers have a chance to register
handlers.
- Race condition if a consumer requests a GPIO after
gpiochip_add_data_with_key() but before of_gpiochip_add()
- Probe errorpath in the dwapb driver"
* tag 'gpio-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: Fix crash due to registration race
gpio: dwapb: Fix error handling in dwapb_gpio_probe()
gpiolib-acpi: Register GpioInt ACPI event handlers from a late_initcall
gpiolib: acpi: Switch to cansleep version of GPIO library call
gpio: adp5588: Fix sleep-in-atomic-context bug
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"A set of very minor fixes and a couple of reverts to fix a major
problem (the attempt to change the busy count causes a hang when
attempting to change the drive cache type)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: aacraid: fix a signedness bug
Revert "scsi: core: avoid host-wide host_busy counter for scsi_mq"
Revert "scsi: core: fix scsi_host_queue_ready"
scsi: libata: Add missing newline at end of file
scsi: target: iscsi: cxgbit: use pr_debug() instead of pr_info()
scsi: hpsa: limit transfer length to 1MB, not 512kB
scsi: lpfc: Correct MDS diag and nvmet configuration
scsi: lpfc: Default fdmi_on to on
scsi: csiostor: fix incorrect port capabilities
scsi: csiostor: add a check for NULL pointer after kmalloc()
scsi: documentation: add scsi_mod.use_blk_mq to scsi-parameters
scsi: core: Update SCSI_MQ_DEFAULT help text to match default
Merge tag 'nds32-for-linus-4.19-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux
Pull nds32 updates from Greentime Hu:
"Contained in here are the bug fixes, building error fixes and ftrace
support for nds32"
* tag 'nds32-for-linus-4.19-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux:
nds32: linker script: GCOV kernel may refers data in __exit
nds32: fix build error because of wrong semicolon
nds32: Fix a kernel panic issue because of wrong frame pointer access.
nds32: Only print one page of stack when die to prevent printing too much information.
nds32: Add macro definition for offset of lp register on stack
nds32: Remove the deprecated ABI implementation
nds32/stack: Get real return address by using ftrace_graph_ret_addr
nds32/ftrace: Support dynamic function graph tracer
nds32/ftrace: Support dynamic function tracer
nds32/ftrace: Add RECORD_MCOUNT support
nds32/ftrace: Support static function graph tracer
nds32/ftrace: Support static function tracer
nds32: Extract the checking and getting pointer to a macro
nds32: Clean up the coding style
nds32: Fix get_user/put_user macro expand pointer problem
nds32: Fix empty call trace
nds32: add NULL entry to the end of_device_id array
nds32: fix logic for module
tracing: Add back in rcu_irq_enter/exit_irqson() for rcuidle tracepoints
Borislav reported the following splat:
=============================
WARNING: suspicious RCU usage
4.19.0-rc1+ #1 Not tainted
-----------------------------
./include/linux/rcupdate.h:631 rcu_read_lock() used illegally while idle!
other info that might help us debug this:
RCU used illegally from idle CPU!
rcu_scheduler_active = 2, debug_locks = 1
RCU used illegally from extended quiescent state!
1 lock held by swapper/0/0:
#0: 000000004557ee0e (rcu_read_lock){....}, at: perf_event_output_forward+0x0/0x130
This is due to the tracepoints moving to SRCU usage which does not require
RCU to be "watching". But perf uses these tracepoints with RCU and expects
it to be. Hence, we still need to add in the rcu_irq_enter/exit_irqson()
calls for "rcuidle" tracepoints. This is a temporary fix until we have SRCU
working in NMI context, and then perf can be converted to use that instead
of normal RCU.
Link: http://lkml.kernel.org/r/20180904162611.6a120068@gandalf.local.home Cc: x86-ml <x86@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Reported-by: Borislav Petkov <bp@alien8.de> Tested-by: Borislav Petkov <bp@alien8.de> Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Fixes: 0d8d748d710bc ("tracepoint: Make rcuidle tracepoint callers use SRCU") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Greentime Hu [Tue, 4 Sep 2018 06:25:57 +0000 (14:25 +0800)]
nds32: linker script: GCOV kernel may refers data in __exit
This patch is used to fix nds32 allmodconfig/allyesconfig build error
because GCOV kernel embeds counters in the kernel for each line
and a part of that embed in __exit text. So we need to keep the
EXIT_TEXT and EXIT_DATA if CONFIG_GCOV_KERNEL=y.
Link: https://lkml.org/lkml/2018/9/1/125 Signed-off-by: Greentime Hu <greentime@andestech.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
nilfs2: convert to SPDX license tags
drivers/dax/device.c: convert variable to vm_fault_t type
lib/Kconfig.debug: fix three typos in help text
checkpatch: add __ro_after_init to known $Attribute
mm: fix BUG_ON() in vmf_insert_pfn_pud() from VM_MIXEDMAP removal
uapi/linux/keyctl.h: don't use C++ reserved keyword as a struct member name
memory_hotplug: fix kernel_panic on offline page processing
checkpatch: add optional static const to blank line declarations test
ipc/shm: properly return EIDRM in shm_lock()
mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.
mm/util.c: improve kvfree() kerneldoc
tools/vm/page-types.c: fix "defined but not used" warning
tools/vm/slabinfo.c: fix sign-compare warning
kmemleak: always register debugfs file
mm: respect arch_dup_mmap() return value
mm, oom: fix missing tlb_finish_mmu() in __oom_reap_task_mm().
mm: memcontrol: print proper OOM header when no eligible victim left
Dave Jiang [Tue, 4 Sep 2018 22:46:16 +0000 (15:46 -0700)]
mm: fix BUG_ON() in vmf_insert_pfn_pud() from VM_MIXEDMAP removal
It looks like I missed the PUD path when doing VM_MIXEDMAP removal.
This can be triggered by:
1. Boot with memmap=4G!8G
2. build ndctl with destructive flag on
3. make TESTS=device-dax check
[ +0.000675] kernel BUG at mm/huge_memory.c:824!
Applying the same change that was applied to vmf_insert_pfn_pmd() in the
original patch.
Link: http://lkml.kernel.org/r/153565957352.35524.1005746906902065126.stgit@djiang5-desk3.ch.intel.com Fixes: 7555efba5f4 ("dax: remove VM_MIXEDMAP for fsdax and device dax") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reported-by: Vishal Verma <vishal.l.verma@intel.com> Tested-by: Vishal Verma <vishal.l.verma@intel.com> Acked-by: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Tue, 4 Sep 2018 22:46:13 +0000 (15:46 -0700)]
uapi/linux/keyctl.h: don't use C++ reserved keyword as a struct member name
Since this header is in "include/uapi/linux/", apparently people want to
use it in userspace programs -- even in C++ ones. However, the header
uses a C++ reserved keyword ("private"), so change that to "dh_private"
instead to allow the header file to be used in C++ userspace.
Fixes https://bugzilla.kernel.org/show_bug.cgi?id=191051 Link: http://lkml.kernel.org/r/0db6c314-1ef4-9bfa-1baa-7214dd2ee061@infradead.org Fixes: 23a7fe99323c ("KEYS: Add KEYCTL_DH_COMPUTE command") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Mat Martineau <mathew.j.martineau@linux.intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
That VM_BUG_ON was triggered by the page poisoning introduced in
mm/sparse.c with the git commit ce83b9d4f007 ("mm/memory_hotplug:
optimize memory hotplug").
With the same commit the new 'nid' field has been added to the struct
memory_block in order to store and later on derive the node id for
offline pages (instead of accessing struct page which might be
uninitialized). But one reference to nid in show_valid_zones() function
has been overlooked. Fixed with current commit. Also, nr_pages will
not be used any more after test_pages_in_a_zone() call, do not update
it.
Joe Perches [Tue, 4 Sep 2018 22:46:06 +0000 (15:46 -0700)]
checkpatch: add optional static const to blank line declarations test
Using a static const struct definition as part of a series of
declarations produces a false positive "Missing a blank line after
declarations" for code like:
WARNING: Missing a blank line after declarations
#710: FILE: drivers/gpu/drm/tidss/tidss_scale_coefs.c:137:
+ int inc;
+ static const struct {
When getting rid of the general ipc_lock(), this was missed furthermore,
making the comment around the ipc object validity check bogus. Under
EIDRM conditions, callers will in turn not see the error and continue
with the operation.
mm/hugetlb: filter out hugetlb pages if HUGEPAGE migration is not supported.
When scanning for movable pages, filter out Hugetlb pages if hugepage
migration is not supported. Without this we hit infinte loop in
__offline_pages() where we do
pfn = scan_movable_pages(start_pfn, end_pfn);
if (pfn) { /* We have movable pages */
ret = do_migrate_range(pfn, end_pfn);
goto repeat;
}
Fix this by checking hugepage_migration_supported both in
has_unmovable_pages which is the primary backoff mechanism for page
offlining and for consistency reasons also into scan_movable_pages
because it doesn't make any sense to return a pfn to non-migrateable
huge page.
This issue was revealed by, but not caused by e53f8d291813 ("mm,
memory_hotplug: do not fail offlining too early").
Link: http://lkml.kernel.org/r/20180824063314.21981-1-aneesh.kumar@linux.ibm.com Fixes: e53f8d291813 ("mm, memory_hotplug: do not fail offlining too early") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Reported-by: Haren Myneni <haren@linux.vnet.ibm.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Tue, 4 Sep 2018 22:45:55 +0000 (15:45 -0700)]
mm/util.c: improve kvfree() kerneldoc
Scooped from an email from Matthew.
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
slabinfo.c:854:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (s->object_size < min_objsize)
^
due to the mismatch of signed/unsigned comparison. ->object_size and
->slab_size are never expected to be negative, so let's define them as
unsigned int.
If kmemleak built in to the kernel, but is disabled by default, the
debugfs file is never registered. Because of this, it is not possible
to find out if the kernel is built with kmemleak support by checking for
the presence of this file. To allow this, always register the file.
After this patch, if the file doesn't exist, kmemleak is not available
in the kernel. If writing "scan" or any other value than "clear" to
this file results in EBUSY, then kmemleak is available but is disabled
by default and can be activated via the kernel command line.
Catalin: "that's also consistent with a late disabling of kmemleak when
the debugfs entry sticks around."
Link: http://lkml.kernel.org/r/20180824131220.19176-1-vincent.whitchurch@axis.com Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nadav Amit [Tue, 4 Sep 2018 22:45:41 +0000 (15:45 -0700)]
mm: respect arch_dup_mmap() return value
Commit 932233a6114f ("include/linux/sched/mm.h: uninline mmdrop_async(),
etc") ignored the return value of arch_dup_mmap(). As a result, on x86,
a failure to duplicate the LDT (e.g. due to memory allocation error)
would leave the duplicated memory mapping in an inconsistent state.
Fix by using the return value, as it was before the change.
Link: http://lkml.kernel.org/r/20180823051229.211856-1-namit@vmware.com Fixes: 932233a6114fa ("include/linux/sched/mm.h: uninline mmdrop_async(), etc") Signed-off-by: Nadav Amit <namit@vmware.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm, oom: fix missing tlb_finish_mmu() in __oom_reap_task_mm().
Commit f4db9358e5c1 ("mm, oom: distinguish blockable mode for mmu
notifiers") has added an ability to skip over vmas with blockable mmu
notifiers. This however didn't call tlb_finish_mmu as it should.
As a result inc_tlb_flush_pending has been called without its pairing
dec_tlb_flush_pending and all callers mm_tlb_flush_pending would flush
even though this is not really needed. This alone is not harmful and it
seems there shouldn't be any such callers for oom victims at all but
there is no real reason to skip tlb_finish_mmu on early skip either so
call it.
[mhocko@suse.com: new changelog] Link: http://lkml.kernel.org/r/b752d1d5-81ad-7a35-2394-7870641be51c@i-love.sakura.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Tue, 4 Sep 2018 22:45:34 +0000 (15:45 -0700)]
mm: memcontrol: print proper OOM header when no eligible victim left
When the memcg OOM killer runs out of killable tasks, it currently
prints a WARN with no further OOM context. This has caused some user
confusion.
Warnings indicate a kernel problem. In a reported case, however, the
situation was triggered by a nonsensical memcg configuration (hard limit
set to 0). But without any VM context this wasn't obvious from the
report, and it took some back and forth on the mailing list to identify
what is actually a trivial issue.
Handle this OOM condition like we handle it in the global OOM killer:
dump the full OOM context and tell the user we ran out of tasks.
This way the user can identify misconfigurations easily by themselves
and rectify the problem - without having to go through the hassle of
running into an obscure but unsettling warning, finding the appropriate
kernel mailing list and waiting for a kernel developer to remote-analyze
that the memcg configuration caused this.
If users cannot make sense of why the OOM killer was triggered or why it
failed, they will still report it to the mailing list, we know that from
experience. So in case there is an actual kernel bug causing this,
kernel developers will very likely hear about it.
Link: http://lkml.kernel.org/r/20180821160406.22578-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1) Must perform TXQ teardown before unregistering interfaces in
mac80211, from Toke Høiland-Jørgensen.
2) Don't allow creating mac80211_hwsim with less than one channel, from
Johannes Berg.
3) Division by zero in cfg80211, fix from Johannes Berg.
4) Fix endian issue in tipc, from Haiqing Bai.
5) BPF sockmap use-after-free fixes from Daniel Borkmann.
6) Spectre-v1 in mac80211_hwsim, from Jinbum Park.
7) Missing rhashtable_walk_exit() in tipc, from Cong Wang.
8) Revert kvzalloc() conversion of AF_PACKET, it breaks mmap() when
kvzalloc() tries to use kmalloc() pages. From Eric Dumazet.
9) Fix deadlock in hv_netvsc, from Dexuan Cui.
10) Do not restart timewait timer on RST, from Florian Westphal.
11) Fix double lwstate refcount grab in ipv6, from Alexey Kodanev.
12) Unsolicit report count handling is off-by-one, fix from Hangbin Liu.
13) Sleep-in-atomic in cadence driver, from Jia-Ju Bai.
14) Respect ttl-inherit in ip6 tunnel driver, from Hangbin Liu.
15) Use-after-free in act_ife, fix from Cong Wang.
16) Missing hold to meta module in act_ife, from Vlad Buslov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (91 commits)
net: phy: sfp: Handle unimplemented hwmon limits and alarms
net: sched: action_ife: take reference to meta module
act_ife: fix a potential use-after-free
net/mlx5: Fix SQ offset in QPs with small RQ
tipc: correct spelling errors for tipc_topsrv_queue_evt() comments
tipc: correct spelling errors for struct tipc_bc_base's comment
bnxt_en: Do not adjust max_cp_rings by the ones used by RDMA.
bnxt_en: Clean up unused functions.
bnxt_en: Fix firmware signaled resource change logic in open.
sctp: not traverse asoc trans list if non-ipv6 trans exists for ipv6_flowlabel
sctp: fix invalid reference to the index variable of the iterator
net/ibm/emac: wrong emac_calc_base call was used by typo
net: sched: null actions array pointer before releasing action
vhost: fix VHOST_GET_BACKEND_FEATURES ioctl request definition
r8169: add support for NCube 8168 network card
ip6_tunnel: respect ttl inherit for ip6tnl
mac80211: shorten the IBSS debug messages
mac80211: don't Tx a deauth frame if the AP forbade Tx
mac80211: Fix station bandwidth setting after channel switch
mac80211: fix a race between restart and CSA flows
...
Andrew Lunn [Tue, 4 Sep 2018 02:23:56 +0000 (04:23 +0200)]
net: phy: sfp: Handle unimplemented hwmon limits and alarms
Not all SFPs implement the registers containing sensor limits and
alarms. Luckily, there is a bit indicating if they are implemented or
not. Add checking for this bit, when deciding if the hwmon attributes
should be visible.
Fixes: d7f1b042f3e0 ("net: phy: sfp: Add HWMON support for module sensors") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
net: sched: action_ife: take reference to meta module
Recent refactoring of add_metainfo() caused use_all_metadata() to add
metainfo to ife action metalist without taking reference to module. This
causes warning in module_put called from ife action cleanup function.
Implement add_metainfo_and_get_ops() function that returns with reference
to module taken if metainfo was added successfully, and call it from
use_all_metadata(), instead of calling __add_metainfo() directly.
Fixes: bfd20cd67b75 ("act_ife: fix a potential deadlock") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Mon, 3 Sep 2018 18:08:15 +0000 (11:08 -0700)]
act_ife: fix a potential use-after-free
Immediately after module_put(), user could delete this
module, so e->ops could be already freed before we call
e->ops->release().
Fix this by moving module_put() after ops->release().
Fixes: b2e88d736799 ("introduce IFE action") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Correct the formula for calculating the RQ page remainder,
which should be in byte granularity. The result will be
non-zero only for RQs smaller than PAGE_SIZE, as an RQ size
is a power of 2.
Divide this by the SQ stride (MLX5_SEND_WQE_BB) to get the
SQ offset in strides granularity.
Fixes: 441f46f08e9c ("net/mlx5: Fix QP fragmented buffer allocation") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Greentime Hu [Thu, 23 Aug 2018 07:05:46 +0000 (15:05 +0800)]
nds32: Fix a kernel panic issue because of wrong frame pointer access.
It can make sure that trace_hardirqs_off/trace_hardirqs_on can get a correct
return address by frame pointer through __builtin_return_address() in this fix.
Zong Li [Wed, 15 Aug 2018 03:05:40 +0000 (11:05 +0800)]
nds32/stack: Get real return address by using ftrace_graph_ret_addr
Function graph tracer has modified the return address to
'return_to_handler' on stack, and provide the 'ftrace_graph_ret_addr' to
get the real return address.
Signed-off-by: Zong Li <zong@andestech.com> Acked-by: Greentime Hu <greentime@andestech.com> Signed-off-by: Greentime Hu <greentime@andestech.com>
Zong Li [Wed, 15 Aug 2018 03:00:08 +0000 (11:00 +0800)]
nds32/ftrace: Support dynamic function tracer
This patch contains the implementation of dynamic function tracer.
The mcount call is composed of three instructions, so there are three
nop for enough placeholder.
Signed-off-by: Zong Li <zong@andestech.com> Acked-by: Greentime Hu <greentime@andestech.com> Signed-off-by: Greentime Hu <greentime@andestech.com>
Zong Li [Wed, 15 Aug 2018 02:45:59 +0000 (10:45 +0800)]
nds32/ftrace: Support static function tracer
This patch support the static function tracer. On nds32 ABI, we need to
always push return address to stack for __builtin_return_address can
work correctly, otherwise, it will get the wrong value of $lp at leaf
function.
Signed-off-by: Zong Li <zong@andestech.com> Acked-by: Greentime Hu <greentime@andestech.com> Signed-off-by: Greentime Hu <greentime@andestech.com>
Zong Li [Mon, 13 Aug 2018 07:08:52 +0000 (15:08 +0800)]
nds32: Clean up the coding style
1. Adjust indentation.
2. Unify argument name of each macro.
3. Add space after comma in parameters list.
4. Add space after 'if' keyword.
5. Replace space by tab.
6. Change asm volatile to __asm__ __volatile__
Signed-off-by: Zong Li <zong@andestech.com> Acked-by: Greentime Hu <greentime@andestech.com> Signed-off-by: Greentime Hu <greentime@andestech.com>
David S. Miller [Tue, 4 Sep 2018 05:12:02 +0000 (22:12 -0700)]
Merge tag 'mac80211-for-davem-2018-09-03' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Here are quite a large number of fixes, notably:
* various A-MSDU building fixes (currently only affects mt76)
* syzkaller & spectre fixes in hwsim
* TXQ vs. teardown fix that was causing crashes
* embed WMM info in reg rule, bad code here had been causing crashes
* one compilation issue with fix from Arnd (rfkill-gpio includes)
* fixes for a race and bad data during/after channel switch
* nl80211: a validation fix, attribute type & unit fixes
along with other small fixes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
tipc: correct spelling errors for struct tipc_bc_base's comment
Trivial fix for two spelling mistakes.
Signed-off-by: Zhenbo Gao <zhenbo.gao@windriver.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 3 Sep 2018 08:23:19 +0000 (04:23 -0400)]
bnxt_en: Do not adjust max_cp_rings by the ones used by RDMA.
Currently, the driver adjusts the bp->hw_resc.max_cp_rings by the number
of MSIX vectors used by RDMA. There is one code path in open that needs
to check the true max_cp_rings including any used by RDMA. This code
is now checking for the reduced max_cp_rings which will fail when the
number of cp rings is very small.
To fix this in a clean way, we don't adjust max_cp_rings anymore.
Instead, we add a helper bnxt_get_max_func_cp_rings_for_en() to get the
reduced max_cp_rings when appropriate.
Fixes: 1a1114f3f11f ("bnxt_en: Add ULP calls to stop and restart IRQs.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Mon, 3 Sep 2018 08:23:17 +0000 (04:23 -0400)]
bnxt_en: Fix firmware signaled resource change logic in open.
When the driver detects that resources have changed during open, it
should reset the rx and tx rings to 0. This will properly setup the
init sequence to initialize the default rings again. We also need
to signal the RDMA driver to stop and clear its interrupts. We then
call the RoCE driver to restart if a new set of default rings is
successfully reserved.
Fixes: 2546c32ae38a ("bnxt_en: Notify firmware about IF state changes.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 3 Sep 2018 07:47:11 +0000 (15:47 +0800)]
sctp: not traverse asoc trans list if non-ipv6 trans exists for ipv6_flowlabel
When users set params.spp_address and get a trans, ipv6_flowlabel flag
should be applied into this trans. But even if this one is not an ipv6
trans, it should not go to apply it into all other transes of the asoc
but simply ignore it.
Fixes: d96ab52c5009 ("sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Mon, 3 Sep 2018 07:47:10 +0000 (15:47 +0800)]
sctp: fix invalid reference to the index variable of the iterator
Now in sctp_apply_peer_addr_params(), if SPP_IPV6_FLOWLABEL flag is set
and trans is NULL, it would use trans as the index variable to traverse
transport_addr_list, then trans is set as the last transport of it.
Later, if SPP_DSCP flag is set, it would enter into the wrong branch as
trans is actually an invalid reference.
So fix it by using a new index variable to traverse transport_addr_list
for both SPP_DSCP and SPP_IPV6_FLOWLABEL flags process.
Fixes: d96ab52c5009 ("sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparams") Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Mikhaylov [Mon, 3 Sep 2018 07:26:28 +0000 (10:26 +0300)]
net/ibm/emac: wrong emac_calc_base call was used by typo
__emac_calc_base_mr1 was used instead of __emac4_calc_base_mr1
by copy-paste mistake for emac4syn.
Fixes: 3f668a3a6b7476e863e4d3ada42a49f14caa21b2 ("net/ibm/emac: add 8192 rx/tx fifo size") Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: sched: null actions array pointer before releasing action
Currently, tcf_action_delete() nulls actions array pointer after putting
and deleting it. However, if tcf_idr_delete_index() returns an error,
pointer to action is not set to null. That results it being released second
time in error handling code of tca_action_gd().
Kasan error:
[ 807.367755] ==================================================================
[ 807.375844] BUG: KASAN: use-after-free in tc_setup_cb_call+0x14e/0x250
[ 807.382763] Read of size 8 at addr ffff88033e636000 by task tc/2732
[ 807.958240] Memory state around the buggy address:
[ 807.963405] ffff88033e635f00: fc fc fc fc fb fb fb fb fb fb fb fc fc fc fc fb
[ 807.971288] ffff88033e635f80: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
[ 807.979166] >ffff88033e636000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 807.994882] ^
[ 807.998477] ffff88033e636080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 808.006352] ffff88033e636100: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
[ 808.014230] ==================================================================
[ 808.022108] Disabling lock debugging due to kernel taint
Fixes: 22efde7643ff ("net_sched: improve and refactor tcf_action_put_many()") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The _IOC_READ flag fits this ioctl request more because this request
actually only writes to, but doesn't read from userspace.
See NOTEs in include/uapi/asm-generic/ioctl.h for more information.
Fixes: e376e1d85adf ("vhost: switch to use new message format") Signed-off-by: Gleb Fotengauer-Malinovskiy <glebfm@altlinux.org> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Fri, 31 Aug 2018 08:52:01 +0000 (16:52 +0800)]
ip6_tunnel: respect ttl inherit for ip6tnl
man ip-tunnel ttl section says:
0 is a special value meaning that packets inherit the TTL value.
IPv4 tunnel respect this in ip_tunnel_xmit(), but IPv6 tunnel has not
implement it yet. To make IPv6 behave consistently with IP tunnel,
add ipv6 tunnel inherit support.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Johansen [Sat, 1 Sep 2018 08:57:52 +0000 (01:57 -0700)]
apparmor: fix bad debug check in apparmor_secid_to_secctx()
apparmor_secid_to_secctx() has a bad debug statement tripping on a
condition handle by the code. When kconfig SECURITY_APPARMOR_DEBUG is
enabled the debug WARN_ON will trip when **secdata is NULL resulting
in the following trace.
------------[ cut here ]------------
AppArmor WARN apparmor_secid_to_secctx: ((!secdata)):
WARNING: CPU: 0 PID: 14826 at security/apparmor/secid.c:82 apparmor_secid_to_secctx+0x2b5/0x2f0 security/apparmor/secid.c:82
Kernel panic - not syncing: panic_on_warn set ...
CC: <stable@vger.kernel.org> #4.18 Fixes: cea91f5cd34f ("apparmor: add support for mapping secids and using secctxes") Reported-by: syzbot+21016130b0580a9de3b5@syzkaller.appspotmail.com Signed-off-by: John Johansen <john.johansen@canonical.com>
mac80211: don't Tx a deauth frame if the AP forbade Tx
If the driver fails to properly prepare for the channel
switch, mac80211 will disconnect. If the CSA IE had mode
set to 1, it means that the clients are not allowed to send
any Tx on the current channel, and that includes the
deauthentication frame.
Make sure that we don't send the deauthentication frame in
this case.
In iwlwifi, this caused a failure to flush queues since the
firmware already closed the queues after having parsed the
CSA IE. Then mac80211 would wait until the deauthentication
frame would go out (drv_flush(drop=false)) and that would
never happen.
Ilan Peer [Fri, 31 Aug 2018 08:31:10 +0000 (11:31 +0300)]
mac80211: Fix station bandwidth setting after channel switch
When performing a channel switch flow for a managed interface, the
flow did not update the bandwidth of the AP station and the rate
scale algorithm. In case of a channel width downgrade, this would
result with the rate scale algorithm using a bandwidth that does not
match the interface channel configuration.
Fix this by updating the AP station bandwidth and rate scaling algorithm
before the actual channel change in case of a bandwidth downgrade, or
after the actual channel change in case of a bandwidth upgrade.
Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
mac80211: fix a race between restart and CSA flows
We hit a problem with iwlwifi that was caused by a bug in
mac80211. A bug in iwlwifi caused the firwmare to crash in
certain cases in channel switch. Because of that bug,
drv_pre_channel_switch would fail and trigger the restart
flow.
Now we had the hw restart worker which runs on the system's
workqueue and the csa_connection_drop_work worker that runs
on mac80211's workqueue that can run together. This is
obviously problematic since the restart work wants to
reconfigure the connection, while the csa_connection_drop_work
worker does the exact opposite: it tries to disconnect.
Fix this by cancelling the csa_connection_drop_work worker
in the restart worker.
Note that this can sound racy: we could have:
driver iface_work CSA_work restart_work
+++++++++++++++++++++++++++++++++++++++++++++
|
<--drv_cs ---|
<FW CRASH!>
-CS FAILED-->
| |
| cancel_work(CSA)
schedule |
CSA work |
| |
Race between those 2
But this is not possible because we flush the workqueue
in the restart worker before we cancel the CSA worker.
That would be bullet proof if we could guarantee that
we schedule the CSA worker only from the iface_work
which runs on the workqueue (and not on the system's
workqueue), but unfortunately we do have an instance
in which we schedule the CSA work outside the context
of the workqueue (ieee80211_chswitch_done).
Note also that we should probably cancel other workers
like beacon_connection_loss_work and possibly others
for different types of interfaces, at the very least,
IBSS should suffer from the exact same problem, but for
now, do the minimum to fix the actual bug that was actually
experienced and reproduced.
Dreyfuss, Haim [Fri, 31 Aug 2018 08:31:04 +0000 (11:31 +0300)]
mac80211: fix WMM TXOP calculation
In commit 9236c4523e5b ("mac80211: limit wmm params to comply
with ETSI requirements"), we have limited the WMM parameters to
comply with 802.11 and ETSI standard. Mistakenly the TXOP value
was caluclated wrong. Fix it by taking the minimum between
802.11 to ETSI to make sure we are not violating both.
Fixes: ef1bbba10c89 ("mac80211: limit wmm params to comply with ETSI requirements") Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Dan Carpenter [Fri, 31 Aug 2018 08:10:55 +0000 (11:10 +0300)]
cfg80211: fix a type issue in ieee80211_chandef_to_operating_class()
The "chandef->center_freq1" variable is a u32 but "freq" is a u16 so we
are truncating away the high bits. I noticed this bug because in commit 9cf0a0b4b64a ("cfg80211: Add support for 60GHz band channels 5 and 6")
we made "freq <= 56160 + 2160 * 6" a valid requency when before it was
only "freq <= 56160 + 2160 * 4" that was valid. It introduces a static
checker warning:
Lorenzo Bianconi [Thu, 30 Aug 2018 23:04:13 +0000 (01:04 +0200)]
mac80211: fix an off-by-one issue in A-MSDU max_subframe computation
Initialize 'n' to 2 in order to take into account also the first
packet in the estimation of max_subframe limit for a given A-MSDU
since frag_tail pointer is NULL when ieee80211_amsdu_aggregate
routine analyzes the second frame.
Fixes: 997eb7fcb902 ("mac80211: add A-MSDU tx support") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Merge tag 'dma-mapping-4.19-2' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:
"A few fixes for the fallout of being a little more pedantic about dma
masks"
* tag 'dma-mapping-4.19-2' of git://git.infradead.org/users/hch/dma-mapping:
of/platform: initialise AMBA default DMA masks
sparc: set a default 32-bit dma mask for OF devices
kernel/dma/direct: take DMA offset into account in dma_direct_supported
/usr/include/linux/rds.h:156:18: error: field ‘laddr’ has incomplete type
struct in6_addr laddr;
^~~~~
/usr/include/linux/rds.h:157:18: error: field ‘faddr’ has incomplete type
struct in6_addr faddr;
^~~~~
/usr/include/linux/rds.h:178:18: error: field ‘laddr’ has incomplete type
struct in6_addr laddr;
^~~~~
/usr/include/linux/rds.h:179:18: error: field ‘faddr’ has incomplete type
struct in6_addr faddr;
^~~~~
/usr/include/linux/rds.h:198:18: error: field ‘bound_addr’ has incomplete type
struct in6_addr bound_addr;
^~~~~~~~~~
/usr/include/linux/rds.h:199:18: error: field ‘connected_addr’ has incomplete type
struct in6_addr connected_addr;
^~~~~~~~~~~~~~
/usr/include/linux/rds.h:219:18: error: field ‘local_addr’ has incomplete type
struct in6_addr local_addr;
^~~~~~~~~~
/usr/include/linux/rds.h:221:18: error: field ‘peer_addr’ has incomplete type
struct in6_addr peer_addr;
^~~~~~~~~
/usr/include/linux/rds.h:245:18: error: field ‘src_addr’ has incomplete type
struct in6_addr src_addr;
^~~~~~~~
/usr/include/linux/rds.h:246:18: error: field ‘dst_addr’ has incomplete type
struct in6_addr dst_addr;
^~~~~~~~
Fixes: ad50ada6fc6d ("rds: Extend RDS API for IPv6 support") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix one remaining buggy offset override in sockmap's bpf_msg_pull_data()
when linearizing multiple scatterlist elements, from Tushar.
2) Fix BPF sockmap's misuse of ULP when a collision with another ULP is
found on map update where it would release existing ULP. syzbot found and
triggered this couple of times now, fix from John.
3) Add missing xskmap type to bpftool so it will properly show the type
on map dump, from Prashant.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Thu, 30 Aug 2018 21:15:43 +0000 (14:15 -0700)]
net/ipv6: Only update MTU metric if it set
Jan reported a regression after an update to 4.18.5. In this case ipv6
default route is setup by systemd-networkd based on data from an RA. The
RA contains an MTU of 1492 which is used when the route is first inserted
but then systemd-networkd pushes down updates to the default route
without the mtu set.
Prior to the change to fib6_info, metrics such as MTU were held in the
dst_entry and rt6i_pmtu in rt6_info contained an update to the mtu if
any. ip6_mtu would look at rt6i_pmtu first and use it if set. If not,
the value from the metrics is used if it is set and finally falling
back to the idev value.
After the fib6_info change metrics are contained in the fib6_info struct
and there is no equivalent to rt6i_pmtu. To maintain consistency with
the old behavior the new code should only reset the MTU in the metrics
if the route update has it set.
Fixes: b61a880ad6ab ("net/ipv6: move metrics from dst to rt6_info") Reported-by: Jan Janssen <medhefgo@web.de> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Lindgren [Wed, 29 Aug 2018 15:00:24 +0000 (08:00 -0700)]
net: ethernet: cpsw-phy-sel: prefer phandle for phy sel
The cpsw-phy-sel device is not a child of the cpsw interconnect target
module. It lives in the system control module.
Let's fix this issue by trying to use cpsw-phy-sel phandle first if it
exists and if not fall back to current usage of trying to find the
cpsw-phy-sel child. That way the phy sel driver can be a child of the
system control module where it belongs in the device tree.
Without this fix, we cannot have a proper interconnect target module
hierarchy in device tree for things like genpd.
Note that deferred probe is mostly not supported by cpsw and this patch
does not attempt to fix that. In case deferred probe support is needed,
this could be added to cpsw_slave_open() and phy_connect() so they start
handling and returning errors.
For documenting it, looks like the cpsw-phy-sel is used for all cpsw device
tree nodes. It's missing the related binding documentation, so let's also
update the binding documentation accordingly.
Cc: devicetree@vger.kernel.org Cc: Andrew Lunn <andrew@lunn.ch> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Lindgren [Wed, 29 Aug 2018 15:00:23 +0000 (08:00 -0700)]
dt-bindings: net: cpsw: Document cpsw-phy-sel usage but prefer phandle
The current cpsw usage for cpsw-phy-sel is undocumented but is used for
all the boards using cpsw. And cpsw-phy-sel is not really a child of
the cpsw device, it lives in the system control module instead.
Let's document the existing usage, and improve it a bit where we prefer
to use a phandle instead of a child device for it. That way we can
properly describe the hardware in dts files for things like genpd.
Cc: devicetree@vger.kernel.org Cc: Andrew Lunn <andrew@lunn.ch> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Wed, 29 Aug 2018 10:06:10 +0000 (18:06 +0800)]
igmp: fix incorrect unsolicit report count after link down and up
After link down and up, i.e. when call ip_mc_up(), we doesn't init
im->unsolicit_count. So after igmp_timer_expire(), we will not start
timer again and only send one unsolicit report at last.
Fix it by initializing im->unsolicit_count in igmp_group_added(), so
we can respect igmp robustness value.
Fixes: cb3bac8948c0c ("igmp: do not remove igmp souce list info when set link down") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Wed, 29 Aug 2018 10:06:08 +0000 (18:06 +0800)]
igmp: fix incorrect unsolicit report count when join group
We should not start timer if im->unsolicit_count equal to 0 after decrease.
Or we will send one more unsolicit report message. i.e. 3 instead of 2 by
default.
Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Fri, 31 Aug 2018 04:25:02 +0000 (21:25 -0700)]
bpf: avoid misuse of psock when TCP_ULP_BPF collides with another ULP
Currently we check sk_user_data is non NULL to determine if the sk
exists in a map. However, this is not sufficient to ensure the psock
or the ULP ops are not in use by another user, such as kcm or TLS. To
avoid this when adding a sock to a map also verify it is of the
correct ULP type. Additionally, when releasing a psock verify that
it is the TCP_ULP_BPF type before releasing the ULP. The error case
where we abort an update due to ULP collision can cause this error
path.
For example,
__sock_map_ctx_update_elem()
[...]
err = tcp_set_ulp_id(sock, TCP_ULP_BPF) <- collides with TLS
if (err) <- so err out here
goto out_free
[...]
out_free:
smap_release_sock() <- calling tcp_cleanup_ulp releases the
TLS ULP incorrectly.
Fixes: ceebdc27c8d6 ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tushar Dave [Fri, 31 Aug 2018 21:45:16 +0000 (23:45 +0200)]
bpf: Fix bpf_msg_pull_data()
Helper bpf_msg_pull_data() mistakenly reuses variable 'offset' while
linearizing multiple scatterlist elements. Variable 'offset' is used
to find first starting scatterlist element
i.e. msg->data = sg_virt(&sg[first_sg]) + start - offset"
Use different variable name while linearizing multiple scatterlist
elements so that value contained in variable 'offset' won't get
overwritten.
Fixes: b20b48b5b56e ("bpf: sk_msg program helper bpf_sk_msg_pull_data") Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"First batch of fixes post-merge window:
- A handful of devicetree changes for i.MX2{3,8} to change over to
new panel bindings. The platforms were moved from legacy
framebuffers to DRM and some development board panels hadn't yet
been converted.
- OMAP fixes related to ti-sysc driver conversion fallout, fixing
some register offsets, no_console_suspend fixes, etc.
- Droid4 changes to fix flaky eMMC probing and vibrator DTS mismerge.
- Fixed 0755->0644 permissions on a newly added file.
- Defconfig changes to make ARM Versatile more useful with QEMU
(helps testing).
- Enable defconfig options for new TI SoC platform that was merged
this window (AM6)"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: defconfig: Enable TI's AM6 SoC platform
ARM: defconfig: Update the ARM Versatile defconfig
ARM: dts: omap4-droid4: Fix emmc errors seen on some devices
ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts
ARM: imx_v6_v7_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G
ARM: mxs_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G
ARM: dts: imx23-evk: Convert to the new display bindings
ARM: dts: imx23-evk: Move regulators outside simple-bus
ARM: dts: imx28-evk: Convert to the new display bindings
ARM: dts: imx28-evk: Move regulators outside simple-bus
Revert "ARM: dts: imx7d: Invert legacy PCI irq mapping"
arm: dts: am4372: setup rtc as system-power-controller
ARM: dts: omap4-droid4: fix vibrations on Droid 4
bus: ti-sysc: Fix no_console_suspend handling
bus: ti-sysc: Fix module register ioremap for larger offsets
ARM: OMAP2+: Fix module address for modules using mpu_rt_idx
ARM: OMAP2+: Fix null hwmod for ti-sysc debug
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"Speculation:
- Make the microcode check more robust
- Make the L1TF memory limit depend on the internal cache physical
address space and not on the CPUID advertised physical address
space, which might be significantly smaller. This avoids disabling
L1TF on machines which utilize the full physical address space.
- Fix the GDT mapping for EFI calls on 32bit PTI
- Fix the MCE nospec implementation to prevent #GP
Fixes and robustness:
- Use the proper operand order for LSL in the VDSO
- Prevent NMI uaccess race against CR3 switching
- Add a lockdep check to verify that text_mutex is held in
text_poke() functions
- Repair the fallout of giving native_restore_fl() a prototype
- Prevent kernel memory dumps based on usermode RIP
- Wipe KASAN shadow stack before rewinding the stack to prevent false
positives
- Move the AMS GOTO enforcement to the actual build stage to allow
user API header extraction without a compiler
- Fix a section mismatch introduced by the on demand VDSO mapping
change
Miscellaneous:
- Trivial typo, GCC quirk removal and CC_SET/OUT() cleanups"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/pti: Fix section mismatch warning/error
x86/vdso: Fix lsl operand order
x86/mce: Fix set_mce_nospec() to avoid #GP fault
x86/efi: Load fixmap GDT in efi_call_phys_epilog()
x86/nmi: Fix NMI uaccess race against CR3 switching
x86: Allow generating user-space headers without a compiler
x86/dumpstack: Don't dump kernel memory based on usermode RIP
x86/asm: Use CC_SET()/CC_OUT() in __gen_sigismember()
x86/alternatives: Lockdep-enforce text_mutex in text_poke*()
x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()
x86/irqflags: Mark native_restore_fl extern inline
x86/build: Remove jump label quirk for GCC older than 4.5.2
x86/Kconfig: Fix trivial typo
x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+
x86/spectre: Add missing family 6 check to microcode check
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:
"A small set of updates for core code:
- Prevent tracing in functions which are called from trace patching
via stop_machine() to prevent executing half patched function trace
entries.
- Remove old GCC workarounds
- Remove pointless includes of notifier.h"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Remove workaround for unreachable warnings from old GCC
notifier: Remove notifier header file wherever not used
watchdog: Mark watchdog touch functions as notrace
Randy Dunlap [Sun, 2 Sep 2018 04:01:28 +0000 (21:01 -0700)]
x86/pti: Fix section mismatch warning/error
Fix the section mismatch warning in arch/x86/mm/pti.c:
WARNING: vmlinux.o(.text+0x6972a): Section mismatch in reference from the function pti_clone_pgtable() to the function .init.text:pti_user_pagetable_walk_pte()
The function pti_clone_pgtable() references
the function __init pti_user_pagetable_walk_pte().
This is often because pti_clone_pgtable lacks a __init
annotation or the annotation of pti_user_pagetable_walk_pte is wrong.
FATAL: modpost: Section mismatches detected.
Fixes: 6c0d709dbc06 ("x86/pti: Map the vsyscall page if needed") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/43a6d6a3-d69d-5eda-da09-0b1c88215a2a@infradead.org
Linus Walleij [Fri, 31 Aug 2018 14:13:07 +0000 (16:13 +0200)]
of/platform: initialise AMBA default DMA masks
This addresses a v4.19-rc1 regression in the PL111 DRM driver in
drivers/gpu/pl111/*
The driver uses the CMA KMS helpers and will thus at some point call
down to dma_alloc_attrs() to allocate a chunk of contigous DMA memory
for the framebuffer.
It appears that in v4.18, it was OK that this (and other DMA mastering
AMBA devices) left dev->coherent_dma_mask blank (zero).
In v4.19-rc1 the WARN_ON_ONCE(dev && !dev->coherent_dma_mask) in
dma_alloc_attrs() in include/linux/dma-mapping.h is triggered. The
allocation later fails when get_coherent_dma_mask() is called from
__dma_alloc() and __dma_alloc() returns NULL:
drm-clcd-pl111 dev:20: coherent DMA mask is unset
drm-clcd-pl111 dev:20: [drm:drm_fb_helper_fbdev_setup] *ERROR*
Failed to set fbdev configuration
It turns out that in commit 41a449c3d058 ("OF: Don't set default
coherent DMA mask") the OF core stops setting the default DMA mask on
new devices, especially those lines of the patch:
- if (!dev->coherent_dma_mask)
- dev->coherent_dma_mask = DMA_BIT_MASK(32);
Robin Murphy solved a similar problem in 92019f02f3ca ("of/platform:
Initialise default DMA masks") by simply assigning dev.coherent_dma_mask
and the dev.dma_mask to point to the same when creating devices from the
device tree, and introducing the same code into the code path creating
AMBA/PrimeCell devices solved my problem, graphics now come up.
The code simply assumes that the device can access all of the system
memory by setting the coherent DMA mask to 0xffffffff when creating a
device from the device tree, which is crude, but seems to be what kernel
v4.18 assumed.
The AMBA PrimeCells do not differ between coherent and streaming DMA so
we can just assign the same to any DMA mask.
Possibly drivers should augment their coherent DMA mask in accordance
with "dma-ranges" from the device tree if more finegranular masking is
needed.
Reported-by: Russell King <linux@armlinux.org.uk> Fixes: 41a449c3d058 ("OF: Don't set default coherent DMA mask") Cc: Russell King <linux@armlinux.org.uk> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
sparc: set a default 32-bit dma mask for OF devices
This keeps the historic default behavior for devices without a DMA mask,
but removes the warning about a lacking DMA mask for doing DMA without
a mask.
Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Guenter Roeck <linux@roeck-us.net>
Olof Johansson [Sun, 2 Sep 2018 01:22:19 +0000 (18:22 -0700)]
Merge tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Fixes for omap variants against v4.19-rc1
These are mostly fixes related to using ti-sysc interconnect target module
driver for accessing right register offsets for sgx and cpsw and for
no_console_suspend regression.
There is also a droid4 emmc fix where emmc may not get detected for some
models, and vibrator dts mismerge fix.
And we have a file permission fix for am335x-osd3358-sm-red.dts that
just got added. And we must tag RTC as system-power-controller for
am437x for PMIC to shut down during poweroff.
* tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: omap4-droid4: Fix emmc errors seen on some devices
ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts
arm: dts: am4372: setup rtc as system-power-controller
ARM: dts: omap4-droid4: fix vibrations on Droid 4
bus: ti-sysc: Fix no_console_suspend handling
bus: ti-sysc: Fix module register ioremap for larger offsets
ARM: OMAP2+: Fix module address for modules using mpu_rt_idx
ARM: OMAP2+: Fix null hwmod for ti-sysc debug
Alexey Kodanev [Thu, 30 Aug 2018 16:11:24 +0000 (19:11 +0300)]
ipv6: don't get lwtstate twice in ip6_rt_copy_init()
Commit 7b1a9b47a4bd ("net/ipv6: Put lwtstate when destroying fib6_info")
partially fixed the kmemleak [1], lwtstate can be copied from fib6_info,
with ip6_rt_copy_init(), and it should be done only once there.
rt->dst.lwtstate is set by ip6_rt_init_dst(), at the start of the function
ip6_rt_copy_init(), so there is no need to get it again at the end.
With this patch, lwtstate also isn't copied from RTF_REJECT routes.
Fixes: 1351050cefaa ("net/ipv6: Defer initialization of dst to data path") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Samuel Neves [Sat, 1 Sep 2018 20:14:52 +0000 (21:14 +0100)]
x86/vdso: Fix lsl operand order
In the __getcpu function, lsl is using the wrong target and destination
registers. Luckily, the compiler tends to choose %eax for both variables,
so it has been working so far.
Fixes: 8ef556d5c260 ("x86/vdso: Use RDPID in preference to LSL when available") Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180901201452.27828-1-sneves@dei.uc.pt
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Two small fixes, one for the x86 Stoney SoC to get a more accurate clk
frequency and the other to fix a bad allocation in the Nuvoton NPCM7XX
driver"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: x86: Set default parent to 48Mhz
clk: npcm7xx: fix memory allocation