The BPF verifier conflict was some minor contextual issue.
The TUN conflict was less trivial. Cong Wang fixed a memory leak of
tfile->tx_array in 'net'. This is an skb_array. But meanwhile in
net-next tun changed tfile->tx_arry into tfile->tx_ring which is a
ptr_ring.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 19 Jan 2018 20:57:03 +0000 (15:57 -0500)]
Merge branch 'dsa-mv88e6xxx-ATU-VTU-irq-fixes'
Andrew Lunn says:
====================
ATU and VTU irq fixes
Further testing and code review found two sets of bugs.
Core review found a cut/paste error in the irq setup code.
A board which does not have an interrupt line from the switch to the
SoC, and experiancing an EPROBE_DEFER throw a splat when the ATU irq
was freed but never registered.
v2: Fix typ0 chip->chip->vtu_prob_irq to chip->vtu_prob_irq
0-day compile testing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Thu, 18 Jan 2018 16:42:50 +0000 (17:42 +0100)]
net: dsa: mv88e6xxx: Free ATU/VTU irq only when there is chip irq
We only register the ATU and VTU irq when we have a chip level IRQ.
In the error path, we should only attempt to remove the ATU and VTU
irq if we also have a chip level IRQ.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 19 Jan 2018 20:52:52 +0000 (15:52 -0500)]
Merge branch 'net-sched-cls-add-extack-support'
Alexander Aring says:
====================
net: sched: cls: add extack support
this patch adds extack support for TC classifier subsystem. The first
patch fixes some code style issues for this patch series pointed out
by checkpatch. The other patches until the last one prepares extack
handling for the TC classifier subsystem and handle generic extack
errors.
The last patch is an example for u32 classifier to add extack support
inside the callbacks delete and change. There exists a init callback as
well, but most classifier implementation run a kalloc() once to allocate
something. Not necessary _yet_ to add extack support now.
- Alex
Cc: David Ahern <dsahern@gmail.com>
changes since v3:
- fix accidentally move of config option mismatch message in PATCH 2/8
correct one is 4/8, detected by kbuildbot (Thank you)
- Removed patch "net: sched: cls: add extack support for tc_setup_cb_call"
PATCH 7/8 in version v2 as suggested by Jakub Kicinski (Thank you)
- changed NL_SET_ERR_MSG to NL_SET_ERR_MSG_MOD as suggested by Jakub Kicinski
in u32 cls (Thank You)
- Removed text from cover letter that I was waiting for Jiri's Patches as
detected by Jamal Hadi Salim (Thank you).
changes since v2:
- rebased on Jiri's patches (Thank you)
- several spelling fixes pointed out by Cong Wang (Thank you)
- several spelling fixes pointed out by David Ahern (Thank you)
- use David Ahern recommendation if config option is mismatch, but
combine it with Cong Wang recommendation to put config name into it
(Thank you)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Aring [Thu, 18 Jan 2018 16:20:55 +0000 (11:20 -0500)]
net: sched: cls_u32: add extack support
This patch adds extack support for the u32 classifier as example for
delete and init callback.
Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Aring [Thu, 18 Jan 2018 16:20:54 +0000 (11:20 -0500)]
net: sched: cls: add extack support for tcf_change_indev
This patch adds extack handling for the tcf_change_indev function which
is common used by TC classifier implementations.
Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Aring [Thu, 18 Jan 2018 16:20:53 +0000 (11:20 -0500)]
net: sched: cls: add extack support for delete callback
This patch adds extack support for classifier delete callback api. This
prepares to handle extack support inside each specific classifier
implementation.
Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Aring [Thu, 18 Jan 2018 16:20:52 +0000 (11:20 -0500)]
net: sched: cls: add extack support for tcf_exts_validate
The tcf_exts_validate function calls the act api change callback. For
preparing extack support for act api, this patch adds the extack as
parameter for this function which is common used in cls implementations.
Furthermore the tcf_exts_validate will call action init callback which
prepares the TC action subsystem for extack support.
Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Aring [Thu, 18 Jan 2018 16:20:51 +0000 (11:20 -0500)]
net: sched: cls: add extack support for change callback
This patch adds extack support for classifier change callback api. This
prepares to handle extack support inside each specific classifier
implementation.
Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Aring [Thu, 18 Jan 2018 16:20:50 +0000 (11:20 -0500)]
net: sched: cls_api: handle generic cls errors
This patch adds extack support for generic cls handling. The extack
will be set deeper to each called function which is not part of netdev
core api.
Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Alexander Aring <aring@mojatatu.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Thu, 18 Jan 2018 11:55:23 +0000 (12:55 +0100)]
mlxsw: spectrum: Upper-bound supported FW version
During initialization the driver checks whether the flashed FW image
suits its requirements by checking that it's sufficiently new.
However, there's only a weak backward compatibility scheme that is
actually guaranteed by the FW, so driver must also upper bound the
version to prevent compatibility issues between current driver and some
possible future fw.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
nfp: devlink, capabilities extensions and updates
This series starts with an improvement to the usability of the device
memory accessors (CPP transactions). Next few patches are devoted to
fixing the devlink locking. After recent patches for mlxsw the locking
scheme of devlink ops has to be reworked. Following patches improve
NFP code dealing with "representors", and expands the error message
printed when driver has no support for loaded FW.
Second part of the series is focused on vNIC capabilities read from
vNIC control memory (often referred to as "BAR0" for historical reasons).
TLV capability format is established and immediately made use of. The
next patches rework parsing of features for control vNIC which allows
apps to mask out features they don't want enabled.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 18 Jan 2018 02:51:06 +0000 (18:51 -0800)]
nfp: bpf: disable all ctrl vNIC capabilities
BPF firmware currently exposes IRQ moderation capability.
The driver will make use of it by default, inserting 50 usec
delay to every control message exchange. This cuts the number
of messages per second we can exchange by almost half.
None of the other capabilities make much sense for BPF control
vNIC, either. Disable them all.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 18 Jan 2018 02:51:05 +0000 (18:51 -0800)]
nfp: allow apps to disable ctrl vNIC capabilities
Most vNIC capabilities are netdev related. It makes no sense
to initialize them and waste FW resources. Some are even
counter-productive, like IRQ moderation, which will slow
down exchange of control messages.
Add to nfp_app a mask of enabled control vNIC capabilities
for apps to use. Make flower and BPF enable all capabilities
for now. No functional changes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 18 Jan 2018 02:51:04 +0000 (18:51 -0800)]
nfp: split reading capabilities out of nfp_net_init()
nfp_net_init() is a little long and we are about to add more
code to reading capabilties. Move the capability reading,
parsing and validating out. Only actual initialization
will stay in nfp_net_init().
No functional changes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 18 Jan 2018 02:51:03 +0000 (18:51 -0800)]
nfp: read mailbox address from TLV caps
Allow specifying alternative vNIC mailbox location in TLV caps.
This way we can size the mailbox to the needs and not necessarily
waste 512B of ctrl memory space.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 18 Jan 2018 02:51:02 +0000 (18:51 -0800)]
nfp: read ME frequency from vNIC ctrl memory
PCIe island clock frequency is used when converting coalescing
parameters from usecs to NFP timestamps. Most chips don't run
at 1200MHz, allow FW to provide us with the real frequency.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 18 Jan 2018 02:51:01 +0000 (18:51 -0800)]
nfp: add TLV capabilities to the BAR
NFP is entirely programmable, including the PCI data interface.
Using a fixed control BAR layout certainly makes implementations
easier, but require careful considerations when space is allocated.
Once BAR area is allocated to one feature nothing else can use it.
Allocating space statically also requires it to be sized upfront,
which leads to either unnecessary limitation or wastage.
We currently have a 32bit capability word defined which tells drivers
which application FW features are supported. Most of the bits
are exhausted. The same bits are also reused for enabling specific
features. Bulk of capabilities don't have a need for an enable bit,
however, leading to confusion and wastage.
TLVs seems like a better fit for expressing capabilities of applications
running on programmable hardware.
This patch leaves the front of the BAR as is, and declares a TLV
capability start at offset 0x58. Most of the space up to 0x0d90
is already allocated, but the used space can be wrapped with RESERVED
TLVs. E.g.:
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 18 Jan 2018 02:51:00 +0000 (18:51 -0800)]
nfp: improve app not found message
When driver app matching loaded FW is not found users are faced with:
nfp: failed to find app with ID 0x%02x
This message does not properly explain that matching driver code is
either not built into the driver or the driver is too old.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 18 Jan 2018 02:50:59 +0000 (18:50 -0800)]
nfp: protect each repr pointer individually with RCU
Representors are grouped in sets by type. Currently the whole
sets are under RCU protection, but individual representor pointers
are not. This causes some inconveniences when representors have
to be destroyed, because we have to allocate new sets to remove
any representors. Protect the individual pointers with RCU.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 18 Jan 2018 02:50:58 +0000 (18:50 -0800)]
nfp: add nfp_reprs_get_locked() helper
The write side of repr tables is always done under pf->lock.
Add a helper to dereference repr table pointers under protection
of that lock.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 18 Jan 2018 02:50:57 +0000 (18:50 -0800)]
nfp: register devlink after app is created
Devlink used to have two global locks: devlink lock and port lock,
our lock ordering looked like this:
devlink lock -> driver's pf->lock -> devlink port lock
After recent changes port lock was replaced with per-instance
lock. Unfortunately, new per-instance lock is taken on most
operations now. This means we can only grab the pf->lock from
the port split/unsplit ops. Lock ordering looks like this:
Since we can't take pf->lock from most devlink ops, make sure
nfp_apps are prepared to service them as soon as devlink is
registered. Locking the pf must be pushed down after
nfp_app_init() callback.
The init order looks like this:
nfp_app_init
devlink_register
nfp_app_start
netdev/port_register
As soon as app_init is done nfp_apps must be ready to service
devlink-related callbacks. apps can only register their own
devlink objects from nfp_app_start.
Fixes: 0da085c52e82 ("devlink: Add per devlink instance lock") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 18 Jan 2018 02:50:56 +0000 (18:50 -0800)]
nfp: release global resources only on the remove path
NFP app is currently shut down as soon as all the vNICs are gone.
This means we can't depend on the app existing throughout the
lifetime of the device. Free the app only from PCI remove path.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 18 Jan 2018 02:50:55 +0000 (18:50 -0800)]
nfp: core: make scalar CPP helpers fail on short accesses
Currently the helpers for accessing 4 or 8 byte values over
the CPP bus return the length of IO on success. If the IO
was short caller has to deal with error handling. The short
IO for 4/8B values is completely impractical. Make the
helpers return an error if full access was not possible.
Fix the few places which are actually dealing with errors
correctly, most call sites already only deal with negative
return codes.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 19 Jan 2018 20:39:31 +0000 (15:39 -0500)]
Merge branch 'tcp-min-rtt'
Yuchung Cheng says:
====================
tcp: do not use RTT from delayed ACKs for min-RTT
This patch set prevents TCP sender from using RTT samples from
(suspected) delayed ACKs as the minimum RTT, to avoid unbounded
over-estimation of the network path delay. This issue is common
when a connection has extended periods of one packet chit-chat
beyond the min RTT filter window. The first patch does that for TCP
general min RTT estimation. The second patch addresses specifically
the BBR congestion control's min RTT filter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuchung Cheng [Wed, 17 Jan 2018 20:11:01 +0000 (12:11 -0800)]
tcp: avoid min RTT bloat by skipping RTT from delayed-ACK in BBR
A persistent connection may send tiny amount of data (e.g. health-check)
for a long period of time. BBR's windowed min RTT filter may only see
RTT samples from delayed ACKs causing BBR to grossly over-estimate
the path delay depending how much the ACK was delayed at the receiver.
This patch skips RTT samples that are likely coming from delayed ACKs. Note
that it is possible the sender never obtains a valid measure to set the
min RTT. In this case BBR will continue to set cwnd to initial window
which seems fine because the connection is thin stream.
Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuchung Cheng [Wed, 17 Jan 2018 20:11:00 +0000 (12:11 -0800)]
tcp: avoid min-RTT overestimation from delayed ACKs
This patch avoids having TCP sender or congestion control
overestimate the min RTT by orders of magnitude. This happens when
all the samples in the windowed filter are one-packet transfer
like small request and health-check like chit-chat, which is farily
common for applications using persistent connections. This patch
tries to conservatively labels and skip RTT samples obtained from
this type of workload.
Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Maloy [Wed, 17 Jan 2018 15:42:46 +0000 (16:42 +0100)]
tipc: fix race between poll() and setsockopt()
Letting tipc_poll() dereference a socket's pointer to struct tipc_group
entails a race risk, as the group item may be deleted in a concurrent
tipc_sk_join() or tipc_sk_leave() thread.
We now move the 'open' flag in struct tipc_group to struct tipc_sock,
and let the former retain only a pointer to the moved field. This will
eliminate the race risk.
Reported-by: syzbot+799dafde0286795858ac@syzkaller.appspotmail.com Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Lorenzo Bianconi [Wed, 17 Jan 2018 10:41:20 +0000 (11:41 +0100)]
l2tp: remove switch block in l2tp_nl_cmd_session_create()
Remove the switch block in l2tp_nl_cmd_session_create() that
checks pseudowire-specific parameters since just L2TP_PWTYPE_ETH and
L2TP_PWTYPE_PPP are currently supported and no actual checks are
performed. Moreover the L2TP_PWTYPE_IP/default case presents a harmless
issue in error handling (break instead of goto out_tunnel)
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Wed, 17 Jan 2018 07:27:34 +0000 (12:57 +0530)]
cxgb4: IPv6 filter takes 2 tids
on T6, IPv6 filter would occupy 2 tids instead of 4.
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
l2tp: set l2specific_len based on l2specific_type
Do not rely on l2specific_len value provided by userspace but set sublayer
length according to l2specific_type.
Mark L2TP_ATTR_L2SPEC_LEN attribute as not used
Changes since v2:
- drop the patch related to a fix in the switch default case in
l2tp_nl_cmd_session_create()
- use L2SPECTYPE_NONE as default case in l2tp_get_l2specific_len()
Changes since v1:
- remove l2specific_len parameter
- add sanity check on l2specific_type provided by userspace
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Lorenzo Bianconi [Tue, 16 Jan 2018 22:01:57 +0000 (23:01 +0100)]
l2tp: mark L2TP_ATTR_L2SPEC_LEN as not used
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Remove l2specific_len configuration parameter since now L2-Specific
Sublayer length is computed according to l2specific_type provided by
userspace.
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Lorenzo Bianconi [Tue, 16 Jan 2018 22:01:55 +0000 (23:01 +0100)]
l2tp: remove l2specific_len dependency in l2tp_core
Remove l2specific_len dependency while building l2tpv3 header or
parsing the received frame since default L2-Specific Sublayer is
always four bytes long and we don't need to rely on a user supplied
value.
Moreover in l2tp netlink code there are no sanity checks to
enforce the relation between l2specific_len and l2specific_type,
so sending a malformed netlink message is possible to set
l2specific_type to L2TP_L2SPECTYPE_DEFAULT (or even
L2TP_L2SPECTYPE_NONE) and set l2specific_len to a value greater than
4 leaking memory on the wire and sending corrupted frames.
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Lorenzo Bianconi [Tue, 16 Jan 2018 22:01:54 +0000 (23:01 +0100)]
l2tp: double-check l2specific_type provided by userspace
Add sanity check on l2specific_type provided by userspace in
l2tp_nl_cmd_session_create() since just L2TP_L2SPECTYPE_DEFAULT and
L2TP_L2SPECTYPE_NONE are currently supported.
Moreover explicitly set l2specific_type to L2TP_L2SPECTYPE_DEFAULT
only if the userspace does not provide a value for it
Reviewed-by: Guillaume Nault <g.nault@alphalink.fr> Tested-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
cxgb4: reduce memory footprint for collecting firmware dump
Firmware dump can be large (upto 2 GB). In low memory conditions,
ethtool fails to allocate such large memory. So, use zlib deflate
to compress collected firmware dump.
Patch 1 updates collection logic to use compression.
Rahul Lakkireddy [Wed, 17 Jan 2018 07:23:46 +0000 (12:53 +0530)]
cxgb4: update dump collection logic to use compression
Update firmware dump collection logic to use compression when available.
Let collection logic attempt to do compression, instead of returning out
of memory early.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Vishal Kulkarni <vishal@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Talat Batheesh [Wed, 17 Jan 2018 21:13:16 +0000 (23:13 +0200)]
net/dim: Fix fixpoint divide exception in net_dim_stats_compare
Helmut reported a bug about devision by zero while
running traffic and doing physical cable pull test.
When the cable unplugged the ppms become zero, so when
dividing the current ppms by the previous ppms in the
next dim iteration there is devision by zero.
This patch prevent this division for both ppms and epms.
Fixes: 06d22b806446 ("net/mlx5e: Added BW check for DIM decision mechanism") Fixes: 2ea2b496e21f ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux") Reported-by: Helmut Grauer <helmut.grauer@de.ibm.com> Signed-off-by: Talat Batheesh <talatb@mellanox.com> Signed-off-by: Tal Gilboa <talgi@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 19 Jan 2018 19:38:19 +0000 (11:38 -0800)]
Merge tag 'trace-v4.15-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Two more small fixes
- The conversion of enums into their actual numbers to display in the
event format file had an off-by-one bug, that could cause an enum
not to be converted, and break user space parsing tools.
- A fix to a previous fix to bring back the context recursion checks.
The interrupt case checks for NMI, IRQ and softirq, but the softirq
returned the same number regardless if it was set or not, although
the logic would force it to be set if it were hit"
* tag 'trace-v4.15-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix converting enum's from the map in trace_event_eval_update()
ring-buffer: Fix duplicate results in mapping context to bits in recursive lock
Wei Yongjun [Wed, 17 Jan 2018 03:27:42 +0000 (03:27 +0000)]
devlink: Make some functions static
Fixes the following sparse warnings:
net/core/devlink.c:2297:25: warning:
symbol 'devlink_resource_find' was not declared. Should it be static?
net/core/devlink.c:2322:6: warning:
symbol 'devlink_resource_validate_children' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Wed, 17 Jan 2018 03:27:33 +0000 (03:27 +0000)]
mlxsw: spectrum: Make function mlxsw_sp_kvdl_part_occ() static
Fixes the following sparse warning:
drivers/net/ethernet/mellanox/mlxsw/spectrum_kvdl.c:289:5: warning:
symbol 'mlxsw_sp_kvdl_part_occ' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Zhu Yanjun [Wed, 17 Jan 2018 02:59:41 +0000 (21:59 -0500)]
forcedeth: remove unused variable
The variable miistat is not used. So it is removed.
CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 19 Jan 2018 19:30:06 +0000 (11:30 -0800)]
Merge branch 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Two bugfixes for the I2C core: Lixing Wang fixed a refcounting problem
with DT nodes. Jeremy Compostella fixed a buffer overflow possibility
when using a 'don't use' ioctl interface directly"
* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: core-smbus: prevent stack corruption on read I2C_BLOCK_DATA
i2c: core: decrease reference count of device node in i2c_unregister_device
Linus Torvalds [Fri, 19 Jan 2018 19:25:17 +0000 (11:25 -0800)]
Merge branch 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
"cgroup.threads should be delegatable (ie. a container should be able
to write to it from inside) but was missing the flag.
The change is very low risk"
* 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: make cgroup.threads delegatable
Linus Torvalds [Fri, 19 Jan 2018 19:23:39 +0000 (11:23 -0800)]
Merge branch 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fixlet from Tejun Heo:
"One patch to add touch_nmi_watchdog() while dumping workqueue debug
messages to avoid triggering the lockup detector spuriously.
The change is very low risk"
* 'for-4.15-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: avoid hard lockups in show_workqueue_state()
Linus Torvalds [Fri, 19 Jan 2018 19:21:31 +0000 (11:21 -0800)]
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Arnd Bergmann:
"We have various small DT fixes, and one important regression fix:
The recent device tree bugfixes that were intended to address issues
that 'dtc' started warning about in 4.15 fixed various USB PHY device
nodes, but it turns out that we had code that depended on those nodes
being incorrect and the probe failing with a particular error code.
With the workaround we can also deal with correct device nodes.
The DT fixes include:
- Allwinner A10 and A20 had the display pipeline set up incorrectly
(introduced in v4.15)
- The Altera PMU lacked an interrupt-parent (never worked)
- Pin muxing on the Openblocks A7 (never worked)
- Clocks might get set up wrong on Armada 7K/8K (4.15 regression)
We now have additional device tree patches to address all the
remaining warnings introduced in 4.15, but decided to queue them for
4.16 instead, to avoid risking another regression like the USB PHY
thing mentioned above.
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
phy: work around 'phys' references to usb-nop-xceiv devices
ARM: sunxi_defconfig: Enable CMA
arm64: dts: socfpga: add missing interrupt-parent
ARM: dts: sun[47]i: Fix display backend 1 output to TCON0 remote endpoint
ARM64: dts: marvell: armada-cp110: Fix clock resources for various node
ARM: dts: da850-lcdk: Remove leading 0x and 0s from unit address
ARM: dts: kirkwood: fix pin-muxing of MPP7 on OpenBlocks A7
Linus Torvalds [Fri, 19 Jan 2018 19:19:11 +0000 (11:19 -0800)]
Merge tag 'powerpc-4.15-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"More than we'd like after rc8, but nothing very alarming either, just
tying up loose ends before the release:
Since we changed powernv to use cpufreq_get() from show_cpuinfo(), we
see warnings with PREEMPT enabled. But the preempt_disable() in
show_cpuinfo() doesn't actually prevent CPU hotplug as it suggests, so
remove it.
Two updates to the recently merged RFI flush code. Wire up the generic
sysfs file to report the status, and add a debugfs file to allow
enabling/disabling it at runtime.
Two updates to xmon, one to add the RFI flush related fields to the
paca dump, and another to not use hashed pointers in the paca dump.
And one minor fix to add a missing include of linux/types.h in
asm/hvcall.h, not seen to break the build in upstream, but correct
anyway.
Thanks to: Benjamin Herrenschmidt, Michal Suchanek, Nicholas Piggin"
* tag 'powerpc-4.15-8' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/pseries: include linux/types.h in asm/hvcall.h
powerpc/64s: Allow control of RFI flush via debugfs
powerpc/64s: Wire up cpu_show_meltdown()
powerpc: Don't preempt_disable() in show_cpuinfo()
powerpc/xmon: Don't print hashed pointers in paca dump
powerpc/xmon: Add RFI flush related fields to paca dump
Linus Torvalds [Fri, 19 Jan 2018 19:16:01 +0000 (11:16 -0800)]
Merge tag 'drm-fixes-for-v4.15-rc9' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Nouveau, i915, vmwgfx and sun4i regression fixes.
The i915 change fixes a display corruption problem introduced in 4.15,
the nouveau changes are for regressions in 4.15, one of the vmwgfx
fixes goes back a little further, the other is a 4.15 regression fix,
the 3 sun4i changes fix blank HDMI output on those devices"
* tag 'drm-fixes-for-v4.15-rc9' of git://people.freedesktop.org/~airlied/linux:
drm/nouveau/mmu/mcp77: fix regressions in stolen memory handling
drm/nouveau/bar/gk20a: Avoid bar teardown during init
drm/nouveau/drm/nouveau: Pass the proper arguments to nvif_object_map_handle()
drm/vmwgfx: fix memory corruption with legacy/sou connectors
drm/vmwgfx: Fix a boot time warning
drm/i915: Fix deadlock in i830_disable_pipe()
drm/i915: Redo plane sanitation during readout
drm/i915: Add .get_hw_state() method for planes
drm/sun4i: hdmi: Add missing rate halving check in sun4i_tmds_determine_rate
drm/sun4i: hdmi: Fix incorrect assignment in sun4i_tmds_determine_rate
drm/sun4i: hdmi: Check for unset best_parent in sun4i_tmds_determine_rate
Arnd Bergmann [Tue, 16 Jan 2018 16:34:00 +0000 (17:34 +0100)]
caif: reduce stack size with KASAN
When CONFIG_KASAN is set, we can use relatively large amounts of kernel
stack space:
net/caif/cfctrl.c:555:1: warning: the frame size of 1600 bytes is larger than 1280 bytes [-Wframe-larger-than=]
This adds convenience wrappers around cfpkt_extr_head(), which is responsible
for most of the stack growth. With those wrapper functions, gcc apparently
starts reusing the stack slots for each instance, thus avoiding the
problem.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Matthew Wilcox [Thu, 18 Jan 2018 21:52:17 +0000 (13:52 -0800)]
ia64: Rewrite atomic_add and atomic_sub
Force __builtin_constant_p to evaluate whether the argument to atomic_add
& atomic_sub is constant in the front-end before optimisations which
can lead GCC to output a call to __bad_increment_for_ia64_fetch_and_add().
See GCC bugzilla 83653.
Signed-off-by: Jakub Jelinek <jakub@redhat.com> Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexey Dobriyan [Fri, 19 Jan 2018 00:34:05 +0000 (16:34 -0800)]
proc: fix coredump vs read /proc/*/stat race
do_task_stat() accesses IP and SP of a task without bumping reference
count of a stack (which became an entity with independent lifetime at
some point).
however this falls foul of the grep -v, which matches lines containing
".text" and ends up removing all branch instructions from the dump.
This patch resolves both issues by using the .inst directive for 4-byte
quantities on arm64 and stripping the resulting binaries (as is done on
arm already) to remove the mapping symbols.
Link: http://lkml.kernel.org/r/1506596147-23630-1-git-send-email-will.deacon@arm.com Signed-off-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This warning is shown because we are calling drain_all_pages() in
init_early_allocated_pages(), but mm_percpu_wq is not up yet, it is being
set up later on in kernel_init_freeable() -> init_mm_internals().
Link: http://lkml.kernel.org/r/20180109153921.GA13070@techadventures.net Signed-off-by: Oscar Salvador <osalvador@techadventures.net> Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@suse.com> Cc: Ayush Mittal <ayush.m@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Fri, 19 Jan 2018 00:33:50 +0000 (16:33 -0800)]
mm/memory.c: release locked page in do_swap_page()
James reported a bug in swap paging-in from his testing. It is that
do_swap_page doesn't release locked page so system hang-up happens due
to a deadlock on PG_locked.
It was introduced by 86551ac92432 ("mm, swap: skip swapcache for swapin
of synchronous device") because I missed swap cache hit places to update
swapcache variable to work well with other logics against swapcache in
do_swap_page.
1) Fix BPF divides by zero, from Eric Dumazet and Alexei Starovoitov.
2) Reject stores into bpf context via st and xadd, from Daniel
Borkmann.
3) Fix a memory leak in TUN, from Cong Wang.
4) Disable RX aggregation on a specific troublesome configuration of
r8152 in a Dell TB16b dock.
5) Fix sw_ctx leak in tls, from Sabrina Dubroca.
6) Fix program replacement in cls_bpf, from Daniel Borkmann.
7) Fix uninitialized station_info structures in cfg80211, from Johannes
Berg.
8) Fix miscalculation of transport header offset field in flow
dissector, from Eric Dumazet.
9) Fix LPM tree leak on failure in mlxsw driver, from Ido Schimmel.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits)
ibmvnic: Fix IPv6 packet descriptors
ibmvnic: Fix IP offload control buffer
ipv6: don't let tb6_root node share routes with other node
ip6_gre: init dev->mtu and dev->hard_header_len correctly
mlxsw: spectrum_router: Free LPM tree upon failure
flow_dissector: properly cap thoff field
fm10k: mark PM functions as __maybe_unused
cfg80211: fix station info handling bugs
netlink: reset extack earlier in netlink_rcv_skb
can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once
can: af_can: can_rcv(): replace WARN_ONCE by pr_warn_once
bpf: mark dst unknown on inconsistent {s, u}bounds adjustments
bpf: fix cls_bpf on filter replace
Net: ethernet: ti: netcp: Fix inbound ping crash if MTU size is greater than 1500
tls: reset crypto_info when do_tls_setsockopt_tx fails
tls: return -EBUSY if crypto_info is already set
tls: fix sw_ctx leak
net/tls: Only attach to sockets in ESTABLISHED state
net: fs_enet: do not call phy_stop() in interrupts
r8152: disable RX aggregation on Dell TB16 dock
...
Arnd Bergmann [Fri, 12 Jan 2018 10:12:05 +0000 (11:12 +0100)]
phy: work around 'phys' references to usb-nop-xceiv devices
Stefan Wahren reports a problem with a warning fix that was merged
for v4.15: we had lots of device nodes with a 'phys' property pointing
to a device node that is not compliant with the binding documented in
Documentation/devicetree/bindings/phy/phy-bindings.txt
This generally works because USB HCD drivers that support both the generic
phy subsystem and the older usb-phy subsystem ignore most errors from
phy_get() and related calls and then use the usb-phy driver instead.
However, it turns out that making the usb-nop-xceiv device compatible with
the generic-phy binding changes the phy_get() return code from -EINVAL to
-EPROBE_DEFER, and the dwc2 usb controller driver for bcm2835 now returns
-EPROBE_DEFER from its probe function rather than ignoring the failure,
breaking all USB support on raspberry-pi when CONFIG_GENERIC_PHY is
enabled. The same code is used in the dwc3 driver and the usb_add_hcd()
function, so a reasonable assumption would be that many other platforms
are affected as well.
I have reviewed all the related patches and concluded that "usb-nop-xceiv"
is the only USB phy that is affected by the change, and since it is by far
the most commonly referenced phy, all the other USB phy drivers appear
to be used in ways that are are either safe in DT (they don't use the
'phys' property), or in the driver (they already ignore -EPROBE_DEFER
from generic-phy when usb-phy is available).
To work around the problem, this adds a special case to _of_phy_get()
so we ignore any PHY node that is compatible with "usb-nop-xceiv",
as we know that this can never load no matter how much we defer. In the
future, we might implement a generic-phy driver for "usb-nop-xceiv"
and then remove this workaround.
Since we generally want older kernels to also want to work with the
fixed devicetree files, it would be good to backport the patch into
stable kernels as well (3.13+ are possibly affected), even though they
don't contain any of the patches that may have caused regressions.
Fixes: 99b72defc9b2 ARM: dts: bcm283x: Fix DTC warnings about missing phy-cells Fixes: 24aae8b1b8fb arm: dts: nspire: Add missing #phy-cells to usb-nop-xceiv Fixes: 8a175759d824 arm: dts: marvell: Add missing #phy-cells to usb-nop-xceiv Fixes: fbbd6fbdb8ca ARM: dts: omap: Add missing #phy-cells to usb-nop-xceiv Fixes: d745d5f277bf ARM: dts: imx51-zii-rdu1: Add missing #phy-cells to usb-nop-xceiv Fixes: 915fbe59cbf2 ARM: dts: imx: Add missing #phy-cells to usb-nop-xceiv Link: https://marc.info/?l=linux-usb&m=151518314314753&w=2 Link: https://patchwork.kernel.org/patch/10158145/ Cc: stable@vger.kernel.org Cc: Felipe Balbi <balbi@kernel.org> Cc: Eric Anholt <eric@anholt.net> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Acked-by: Rob Herring <robh@kernel.org> Tested-by: Hans Verkuil <hans.verkuil@cisco.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Maxime Ripard [Fri, 19 Jan 2018 13:32:08 +0000 (14:32 +0100)]
ARM: sunxi_defconfig: Enable CMA
The DRM driver most notably, but also out of tree drivers (for now) like
the VPU or GPU drivers, are quite big consumers of large, contiguous memory
buffers. However, the sunxi_defconfig doesn't enable CMA in order to
mitigate that, which makes them almost unusable.
Enable it to make sure it somewhat works.
Cc: <stable@vger.kernel.org> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Thu, 18 Jan 2018 13:04:42 +0000 (14:04 +0100)]
can: m_can: mark runtime-PM handlers as __maybe_unused
Building without CONFIG_PM results in a harmless warning:
drivers/net/can/m_can/m_can.c:1763:12: error: 'm_can_runtime_resume' defined but not used [-Werror=unused-function]
drivers/net/can/m_can/m_can.c:1752:12: error: 'm_can_runtime_suspend' defined but not used [-Werror=unused-function]
Marking the functions as __maybe_unused lets the compiler
silently drop them instead.
Dave Airlie [Fri, 19 Jan 2018 02:40:07 +0000 (12:40 +1000)]
Merge tag 'drm-intel-fixes-2018-01-18' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
Display corruption regression bugfix with both a prep patch and a
follow-up fix
* tag 'drm-intel-fixes-2018-01-18' of git://anongit.freedesktop.org/drm/drm-intel:
drm/i915: Fix deadlock in i830_disable_pipe()
drm/i915: Redo plane sanitation during readout
drm/i915: Add .get_hw_state() method for planes
Thomas Falcon [Fri, 19 Jan 2018 01:05:01 +0000 (19:05 -0600)]
ibmvnic: Fix IP offload control buffer
Set some missing fields in the IP control offload buffer. This buffer is
used to enable checksum and TCP segmentation offload in the VNIC server.
The buffer length field and the checksum offloading bits were not set
properly, so fix that here.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 19 Jan 2018 02:16:13 +0000 (21:16 -0500)]
Merge tag 'linux-can-fixes-for-4.15-20180118' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2018-01-18
====================
this is a pull reqeust of two patches for net/master:
The syzkaller project triggered two WARN_ONCE() in the af_can code from
userspace and we decided to replace it by a pr_warn_once().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Wang [Thu, 18 Jan 2018 18:40:03 +0000 (10:40 -0800)]
ipv6: don't let tb6_root node share routes with other node
After commit ec11cf06eaa8, if we add a route to the subtree of tb6_root
which does not have any route attached to it yet, the current code will
let tb6_root and the node in the subtree share the same route.
This could cause problem cause tb6_root has RTN_INFO flag marked and the
tree repair and clean up code will not work properly.
This commit makes sure tb6_root->leaf points back to null_entry instead
of sharing route with other node.
It fixes the following syzkaller reported issue:
BUG: KASAN: use-after-free in ipv6_prefix_equal include/net/ipv6.h:540 [inline]
BUG: KASAN: use-after-free in fib6_add_1+0x165f/0x1790 net/ipv6/ip6_fib.c:618
Read of size 8 at addr ffff8801bc043498 by task syz-executor5/19819
Fixes: ec11cf06eaa8 ("ipv6: remove null_entry before adding default route") Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Airlie [Fri, 19 Jan 2018 02:12:31 +0000 (12:12 +1000)]
Merge branch 'linux-4.15' of git://github.com/skeggsb/linux into drm-fixes
Thought I'd try my luck getting one more in:
- Two fixes for Tegra (one is to common code, but our userspace doesn't hit it).
- One for NV5x-class MCPs
* 'linux-4.15' of git://github.com/skeggsb/linux:
drm/nouveau/mmu/mcp77: fix regressions in stolen memory handling
drm/nouveau/bar/gk20a: Avoid bar teardown during init
drm/nouveau/drm/nouveau: Pass the proper arguments to nvif_object_map_handle()
Andrew Morton [Fri, 19 Jan 2018 00:30:49 +0000 (16:30 -0800)]
net/sched/sch_prio.c: work around gcc-4.4.4 union initializer issues
gcc-4.4.4 has problems witn anon union initializers. Work around this.
net/sched/sch_prio.c: In function 'prio_dump_offload':
net/sched/sch_prio.c:260: error: unknown field 'stats' specified in initializer
net/sched/sch_prio.c:260: warning: initialization makes integer from pointer without a cast
net/sched/sch_prio.c:261: error: unknown field 'stats' specified in initializer
net/sched/sch_prio.c:261: warning: initialization makes integer from pointer without a cast
Fixes: 63a7228f5d5b00 ("net: sch: prio: Add offload ability to PRIO qdisc") Cc: Nogah Frankel <nogahf@mellanox.com> Cc: Yuval Mintz <yuvalm@mellanox.com> Cc: Jiri Pirko <jiri@mellanox.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Kodanev [Thu, 18 Jan 2018 17:51:12 +0000 (20:51 +0300)]
ip6_gre: init dev->mtu and dev->hard_header_len correctly
Commit 2e1818f28675 ("gre6: Cleanup GREv6 transmit path,
call common GRE functions") moved dev->mtu initialization
from ip6gre_tunnel_setup() to ip6gre_tunnel_init(), as a
result, the previously set values, before ndo_init(), are
reset in the following cases:
* rtnl_create_link() can update dev->mtu from IFLA_MTU
parameter.
* ip6gre_tnl_link_config() is invoked before ndo_init() in
netlink and ioctl setup, so ndo_init() can reset MTU
adjustments with the lower device MTU as well, dev->mtu
and dev->hard_header_len.
Not applicable for ip6gretap because it has one more call
to ip6gre_tnl_link_config(tunnel, 1) in ip6gre_tap_init().
Fix the first case by updating dev->mtu with 'tb[IFLA_MTU]'
parameter if a user sets it manually on a device creation,
and fix the second one by moving ip6gre_tnl_link_config()
call after register_netdevice().
Fixes: 2e1818f28675 ("gre6: Cleanup GREv6 transmit path, call common GRE functions") Fixes: 8fe2bab3267f ("ip6_gre: Fix MTU setting") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Thu, 18 Jan 2018 14:42:10 +0000 (15:42 +0100)]
mlxsw: spectrum_router: Free LPM tree upon failure
When a new LPM tree is created, we try to replace the trees in the
existing virtual routers with it. If we fail, the tree needs to be
freed.
Currently, this does not happen in the unlikely case where we fail to
bind the tree to the first virtual router, since its reference count
never transitions from 1 to 0.
Fix that by taking a reference before binding the tree.
Fixes: ba9238dbb5e6 ("mlxsw: spectrum_router: Use one LPM tree for all virtual routers") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Hunter [Thu, 4 Jan 2018 11:29:09 +0000 (11:29 +0000)]
drm/nouveau/bar/gk20a: Avoid bar teardown during init
Commit 40b0602c4e7f ("drm/nouveau/bar: implement bar1 teardown")
introduced add a teardown helper function for BAR1. During
initialisation of the Nouveau, initially all the teardown helpers are
called once, before calling their init counterparts. For gk20a, after
the BAR1 teardown function is called, the device is hanging during the
initialisation of the FB sub-device. At this point it is unclear why
this is happening and this is still under investigation. However, this
change is preventing Tegra124 devices from booting when Nouveau is
enabled. To allow Tegra124 to boot, remove the teardown helper for
gk20a.
This is based upon a previous patch by Guillaume Tucker but limits
the workaround to only gk20a GPUs.
Fixes: 40b0602c4e7f ("drm/nouveau/bar: implement bar1 teardown") Reported-by: Guillaume Tucker <guillaume.tucker@collabora.com> Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Thierry Reding [Thu, 18 Jan 2018 21:24:12 +0000 (07:24 +1000)]
drm/nouveau/drm/nouveau: Pass the proper arguments to nvif_object_map_handle()
This is obviously wrong in the current code. Make sure to record the
correct size of the arguments and pass the actual arguments to the
nvif_object_map_handle() function.
Suggested-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Eric Dumazet [Wed, 17 Jan 2018 22:21:13 +0000 (14:21 -0800)]
flow_dissector: properly cap thoff field
syzbot reported yet another crash [1] that is caused by
insufficient validation of DODGY packets.
Two bugs are happening here to trigger the crash.
1) Flow dissection leaves with incorrect thoff field.
2) skb_probe_transport_header() sets transport header to this invalid
thoff, even if pointing after skb valid data.
3) qdisc_pkt_len_init() reads out-of-bound data because it
trusts tcp_hdrlen(skb)
Possible fixes :
- Full flow dissector validation before injecting bad DODGY packets in
the stack.
This approach was attempted here : https://patchwork.ozlabs.org/patch/
861874/
- Have more robust functions in the core.
This might be needed anyway for stable versions.
Fixes: 723a88d344d3 ("net: __skb_flow_dissect() must cap its return value") Fixes: 9292fd64207b ("flow_dissector: Jump to exit code in __skb_flow_dissect") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The trailing semicolon is an empty statement that does no operation.
Removing it since it doesn't do anything.
Signed-off-by: Luis de Bethencourt <luisbg@kernel.org> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
tracing: Fix converting enum's from the map in trace_event_eval_update()
Since enums do not get converted by the TRACE_EVENT macro into their values,
the event format displaces the enum name and not the value. This breaks
tools like perf and trace-cmd that need to interpret the raw binary data. To
solve this, an enum map was created to convert these enums into their actual
numbers on boot up. This is done by TRACE_EVENTS() adding a
TRACE_DEFINE_ENUM() macro.
Some enums were not being converted. This was caused by an optization that
had a bug in it.
All calls get checked against this enum map to see if it should be converted
or not, and it compares the call's system to the system that the enum map
was created under. If they match, then they call is processed.
To cut down on the number of iterations needed to find the maps with a
matching system, since calls and maps are grouped by system, when a match is
made, the index into the map array is saved, so that the next call, if it
belongs to the same system as the previous call, could start right at that
array index and not have to scan all the previous arrays.
The problem was, the saved index was used as the variable to know if this is
a call in a new system or not. If the index was zero, it was assumed that
the call is in a new system and would keep incrementing the saved index
until it found a matching system. The issue arises when the first matching
system was at index zero. The next map, if it belonged to the same system,
would then think it was the first match and increment the index to one. If
the next call belong to the same system, it would begin its search of the
maps off by one, and miss the first enum that should be converted. This left
a single enum not converted properly.
Also add a comment to describe exactly what that index was for. It took me a
bit too long to figure out what I was thinking when debugging this issue.
Link: http://lkml.kernel.org/r/717BE572-2070-4C1E-9902-9F2E0FEDA4F8@oracle.com Cc: stable@vger.kernel.org Fixes: 7efcee96281f8 ("tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values") Reported-by: Chuck Lever <chuck.lever@oracle.com> Teste-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Arnd Bergmann [Wed, 17 Jan 2018 15:57:32 +0000 (07:57 -0800)]
fm10k: mark PM functions as __maybe_unused
A cleanup of the PM code left an incorrect #ifdef in place, leading
to a harmless build warning:
drivers/net/ethernet/intel/fm10k/fm10k_pci.c:2502:12: error: 'fm10k_suspend' defined but not used [-Werror=unused-function]
drivers/net/ethernet/intel/fm10k/fm10k_pci.c:2475:12: error: 'fm10k_resume' defined but not used [-Werror=unused-function]
It's easier to use __maybe_unused attributes here, since you
can't pick the wrong one.
Fixes: 965688895beb ("fm10k: use generic PM hooks instead of legacy PCIe power hooks") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ring-buffer: Fix duplicate results in mapping context to bits in recursive lock
In bringing back the context checks, the code checks first if its normal
(non-interrupt) context, and then for NMI then IRQ then softirq. The final
check is redundant. Since the if branch is only hit if the context is one of
NMI, IRQ, or SOFTIRQ, if it's not NMI or IRQ there's no reason to check if
it is SOFTIRQ. The current code returns the same result even if its not a
SOFTIRQ. Which is confusing.
pc & SOFTIRQ_OFFSET ? 2 : RB_CTX_SOFTIRQ
Is redundant as RB_CTX_SOFTIRQ *is* 2!
Fixes: 5e83840c6c71 ("ring-buffer: Bring back context level recursive checks") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Johannes Berg [Tue, 16 Jan 2018 22:20:22 +0000 (23:20 +0100)]
cfg80211: fix station info handling bugs
Fix two places where the structure isn't initialized to zero,
and thus can't be filled properly by the driver.
Fixes: f773229e40fc ("cfg80211: Accept multiple RSSI thresholds for CQM") Fixes: bd2ab2d28674 ("cfg80211: implement IWRATE") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Thu, 18 Jan 2018 06:48:03 +0000 (14:48 +0800)]
netlink: reset extack earlier in netlink_rcv_skb
Move up the extack reset/initialization in netlink_rcv_skb, so that
those 'goto ack' will not skip it. Otherwise, later on netlink_ack
may use the uninitialized extack and cause kernel crash.
Fixes: d273ba272ff4 ("netlink: extack needs to be reset each time through loop") Reported-by: syzbot+03bee3680a37466775e7@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nick Desaulniers [Thu, 18 Jan 2018 19:36:41 +0000 (11:36 -0800)]
Input: synaptics-rmi4 - prevent UAF reported by KASAN
KASAN found a UAF due to dangling pointer. As the report below says,
rmi_f11_attention() accesses drvdata->attn_data.data, which was freed in
rmi_irq_fn.
[ 311.424369] Memory state around the buggy address:
[ 311.424373] ffff88041fd60f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 311.424377] ffff88041fd61000: fb fb fb fb fb fb fb fb fc fc fc fc fb fb fb fb
[ 311.424381] >ffff88041fd61080: fb fb fb fb fc fc fc fc fb fb fb fb fb fb fb fb
[ 311.424384] ^
[ 311.424387] ffff88041fd61100: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
[ 311.424391] ffff88041fd61180: fb fb fb fb fb fb fb fb fc fc fc fc fb fb fb fb
Cc: stable@vger.kernel.org Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Linus Torvalds [Thu, 18 Jan 2018 18:54:52 +0000 (10:54 -0800)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull two NVMe fixes from Jens Axboe:
"Two important fixes for the sgl support for nvme that is new in this
release"
* 'for-linus' of git://git.kernel.dk/linux-block:
nvme-pci: take sglist coalescing in dma_map_sg into account
nvme-pci: check segement valid for SGL use
Jiri Pirko [Thu, 18 Jan 2018 15:14:49 +0000 (16:14 +0100)]
net: sched: silence uninitialized parent variable warning in tc_dump_tfilter
When tcm->tcm_ifindex == TCM_IFINDEX_MAGIC_BLOCK, parent is still passed
down but the value is never used. Compiler does not recognize it and
issues a warning. Silence it down initializing parent to 0.
Fixes: 1edc1568f711 ("net: sched: use block index as a handle instead of qdisc when block is shared") Reported-by: David Miller <davem@davemloft.net> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix a divide by zero due to wrong if (src_reg == 0) check in
64-bit mode. Properly handle this in interpreter and mask it
also generically in verifier to guard against similar checks
in JITs, from Eric and Alexei.
2) Fix a bug in arm64 JIT when tail calls are involved and progs
have different stack sizes, from Daniel.
3) Reject stores into BPF context that are not expected BPF_STX |
BPF_MEM variant, from Daniel.
4) Mark dst reg as unknown on {s,u}bounds adjustments when the
src reg has derived bounds from dead branches, from Daniel.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
can: af_can: canfd_rcv(): replace WARN_ONCE by pr_warn_once
If an invalid CANFD frame is received, from a driver or from a tun
interface, a Kernel warning is generated.
This patch replaces the WARN_ONCE by a simple pr_warn_once, so that a
kernel, bootet with panic_on_warn, does not panic. A printk seems to be
more appropriate here.
Reported-by: syzbot+e3b775f40babeff6e68b@syzkaller.appspotmail.com Suggested-by: Dmitry Vyukov <dvyukov@google.com> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>