netvsc: save pointer to parent netvsc_device in channel table
Keep a back pointer in the per-channel data structure to
avoid any possible RCU-related issues when NAPI poll is
called but netvsc_device is in RCU limbo.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
netvsc: need rcu_dereference when accessing internal device info
The netvsc_device structure should be accessed by rcu_dereference
in the send path. Change arguments to netvsc_send() to make
this easier to do correctly.
Remove no longer needed hv_device_to_netvsc_device.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The rndis_filter_device_add function is called both in
probe context and RTNL context, and creates the netvsc_device
inner structure. It is easier to get the RTNL lock annotation
correct if it returns the object directly, rather than implicitly
by updating network device private data.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
netvsc: change logic for change mtu and set_queues
Use device detach/attach to ensure that no packets are handed
to device during state changes. Call rndis_filter_open/close
directly as part of later VF related changes.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes the error unwind logic for incorrect number of queues.
If netif_set_real_num_XX_queues failed then rndis_filter_device_add
would have been called twice. Since input arguments are already
range-checked, this is only a hypothetical problem, not possible
in actual code.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
If two MTU changes occur within less than the update interval (2 seconds),
then the netvsc network device may get stuck with no carrier.
The netvsc driver debounces link status events, which is fine
for unsolicited updates, but this blocks getting the update after
the down/up from MTU reinitialization.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 19 Jul 2017 23:45:16 +0000 (16:45 -0700)]
Merge branch 'dev_close-void'
Stephen Hemminger says:
====================
net: make dev_close void
Noticed while working on other changes. Why is dev_close()
returning int, it should be void. Should also change
ndo_close to be void, but that requires more work and someone
with more coccinelle foo (smpl) than me.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
liquidio: lio_main: remove unnecessary static in setup_io_queues()
Remove unnecessary static on local variables cpu_id_modulus and cpu_id.
Such variables are initialized before being used, on every execution
path throughout the function. The static has no benefit, and removing
it reduces the object file size.
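A minimal sketch of the pattern, not the liquidio code: the function names and the modulus arithmetic below are hypothetical. When a local is assigned on every path before it is read, static storage changes nothing observable; it only reserves space in the object file.

```c
#include <assert.h>

/* Hypothetical helper: a 'static' local that is always assigned before use. */
int pick_cpu_static(int queue, int num_cpus)
{
	static int cpu_id;	/* storage persists for the life of the program */

	assert(num_cpus > 0);
	cpu_id = queue % num_cpus;	/* assigned before every read */
	return cpu_id;
}

/* Same logic with an automatic local: no persistent storage is reserved. */
int pick_cpu_auto(int queue, int num_cpus)
{
	int cpu_id;

	assert(num_cpus > 0);
	cpu_id = queue % num_cpus;
	return cpu_id;
}
```

Both functions return the same value for the same arguments; the automatic version is also safe to call from several threads at once, which the static version is not.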
This issue was detected using Coccinelle and the following semantic patch:
@bad exists@
position p;
identifier x;
type T;
@@
static T x@p;
...
x = <+...x...+>
@@
identifier x;
expression e;
type T;
position p != bad.p;
@@
-static
 T x@p;
 ... when != x
     when strict
?x = e;
In the following log you can see a significant difference in the object
file size. Also, there is a significant difference in the bss segment.
This log is the output of the size command, before and after the code
change:
before:
text data bss dec hex filename
78689 15272 27808 121769 1dba9 drivers/net/ethernet/cavium/liquidio/lio_main.o
after:
text data bss dec hex filename
78667 15128 27680 121475 1da83 drivers/net/ethernet/cavium/liquidio/lio_main.o
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
liquidio: lio_vf_main: remove unnecessary static in setup_io_queues()
Remove unnecessary static on local variables cpu_id_modulus and cpu_id.
Such variables are initialized before being used, on every execution
path throughout the function. The static has no benefit, and removing
it reduces the object file size.
This issue was detected using Coccinelle and the following semantic patch:
@bad exists@
position p;
identifier x;
type T;
@@
static T x@p;
...
x = <+...x...+>
@@
identifier x;
expression e;
type T;
position p != bad.p;
@@
-static
 T x@p;
 ... when != x
     when strict
?x = e;
In the following log you can see a significant difference in the object
file size. Also, there is a significant difference in the bss segment.
This log is the output of the size command, before and after the code
change:
before:
text data bss dec hex filename
55656 10680 576 66912 10560 drivers/net/ethernet/cavium/liquidio/lio_vf_main.o
after:
text data bss dec hex filename
55796 10536 448 66780 104dc drivers/net/ethernet/cavium/liquidio/lio_vf_main.o
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethernet: mediatek: remove useless code in mtk_poll_tx()
Remove useless local variable _condition_ and the related code.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
qlcnic: remove unnecessary static in qlcnic_dump_fw()
Remove unnecessary static on local variable fw_dump_ops.
This variable is initialized before being used, on every
execution path throughout the function. The static has no
benefit, and removing it reduces the object file size.
This issue was detected using Coccinelle and the following semantic patch:
@bad exists@
position p;
identifier x;
type T;
@@
static T x@p;
...
x = <+...x...+>
@@
identifier x;
expression e;
type T;
position p != bad.p;
@@
-static
 T x@p;
 ... when != x
     when strict
?x = e;
In the following log you can see a difference in the object file size.
This log is the output of the size command, before and after the code
change:
before:
text data bss dec hex filename
19032 2136 64 21232 52f0 drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.o
after:
text data bss dec hex filename
19020 2048 0 21068 524c drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.o
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Remove useless local variables last_read_point and last_txw_point and
the related code.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: David S. Miller <davem@davemloft.net>
wireless: airo: remove unnecessary static in writerids()
Remove unnecessary static on local function pointer _writer_.
This pointer is initialized before being used, on every
execution path throughout the function. The static has no
benefit, and removing it reduces the object file size.
This issue was detected using Coccinelle and the following semantic patch:
@bad exists@
position p;
identifier x;
type T;
@@
static T x@p;
...
x = <+...x...+>
@@
identifier x;
expression e;
type T;
position p != bad.p;
@@
-static
 T x@p;
 ... when != x
     when strict
?x = e;
In the following log you can see a significant difference in the object
file size. This log is the output of the size command, before and after
the code change:
before:
text data bss dec hex filename
113797 19152 1216 134165 20c15 drivers/net/wireless/cisco/airo.o
after:
text data bss dec hex filename
113881 19096 1152 134129 20bf1 drivers/net/wireless/cisco/airo.o
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the definition of PGV_FROM_VMALLOC from af_packet.c.
The PGV_FROM_VMALLOC definition was already removed by
commit 62633ad24fc2 ("net: cleanup unused macros in net directory"),
and its usage was removed even before by commit 6474f5100eed
("af_packet: remove pgv.flags"); but it was added back by mistake later on,
in commit caf94a261123 ("af-packet: TPACKET_V3 flexible buffer implementation").
Signed-off-by: Rami Rosen <rami.rosen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Tue, 18 Jul 2017 15:49:26 +0000 (18:49 +0300)]
ISDN: eicon: switch to use native bitmaps
Two arrays are clearly bit maps, so, make that explicit by converting to
bitmap API and remove custom helpers.
Note that sig_ind() uses an out-of-bounds bit, apparently to protect against
potential bitmap_empty() checks for the same bitmap.
This patch removes that since:
1) that didn't guarantee atomicity anyway;
2) the first operation inside the for-loop sets a bit in the bitmap
(which effectively makes it non-empty);
3) group_optimization() doesn't utilize possible emptiness of the bitmap
in question.
Thus, if protection is needed, it should be implemented properly.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adjusts the timeout formula to schedule the TCP loss probe
(TLP). The previous formula uses 2*SRTT or 1.5*RTT + DelayACKMax if
only one packet is in flight. It keeps a lower bound of 10 msec which
is too large for short RTT connections (e.g. within a data-center).
The new formula is 2*RTT + (inflight == 1 ? 200ms : 2ticks), which
performs better for short and fast connections.
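The new formula can be written out as a small function. This is a sketch under stated assumptions, not the kernel implementation: the function name, the microsecond units, and passing the tick length as a parameter are all hypothetical; only the formula itself comes from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define DELACK_MAX_US	200000u	/* the 200ms term from the formula */

/*
 * Sketch of: timeout = 2*RTT + (inflight == 1 ? 200ms : 2 ticks).
 * All values are in microseconds; tick_us is the caller's tick length.
 */
uint32_t tlp_timeout_us(uint32_t rtt_us, unsigned int inflight, uint32_t tick_us)
{
	uint32_t timeout;

	assert(inflight > 0);
	timeout = 2 * rtt_us;
	timeout += (inflight == 1) ? DELACK_MAX_US : 2 * tick_us;
	return timeout;
}
```

With a 100 usec RTT, four packets in flight, and an assumed 1 msec tick, this gives 2.2 msec, well under the previous 10 msec lower bound.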
Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
openvswitch: Optimize operations for OvS flow_stats.
When calling flow_free() to free a flow, we call cpumask_next() many
times (once per CPU in cpu_possible_mask, e.g. 128 by default). That
consumes CPU time if flow_free() is called frequently, for example
when all packets are sent to userspace via upcall and OvS sends them
back via netlink to ovs_packet_cmd_execute(), which calls flow_free().
The test topology is shown below. VM01 sends TCP packets to VM02,
and OvS forwards the packets. When testing, we use perf to report the
system performance.
VM01 --- OvS-VM --- VM02
Without this patch, perf top shows flow_free() at
3.02% CPU usage.
With this patch, the TCP throughput between VMs (without the Megaflow
Cache and Microflow Cache) rises from 1.18 Gb/s to 1.30 Gb/s
(roughly a 10% performance improvement).
This patch adds a cpumask struct: cpu_used_mask stores the IDs of the
CPUs that the flow has used. We then check flow_stats only on the CPUs
actually used; it is unnecessary to check every possible CPU when
getting, clearing, and updating the flow_stats. Adding cpu_used_mask to
the sw_flow struct doesn't increase the number of cache lines.
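The idea can be illustrated in userspace. This is a hedged sketch, not the Open vSwitch code: the struct, the function names, and the 64-CPU limit (chosen so a single uint64_t can stand in for a cpumask) are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS 64	/* hypothetical limit so one uint64_t can act as the mask */

struct flow_stats_sketch {
	uint64_t packets[MAX_CPUS];	/* per-CPU packet counters */
	uint64_t cpu_used_mask;		/* bit n set once CPU n touched the flow */
};

/* Record a packet on 'cpu' and remember that this CPU has stats. */
void flow_stats_update(struct flow_stats_sketch *fs, unsigned int cpu)
{
	assert(cpu < MAX_CPUS);
	fs->packets[cpu]++;
	fs->cpu_used_mask |= UINT64_C(1) << cpu;
}

/*
 * Sum the counters, reading them only for CPUs whose bit is set.
 * Returns the number of CPUs whose counters were read.
 */
unsigned int flow_stats_get(const struct flow_stats_sketch *fs, uint64_t *total)
{
	uint64_t mask = fs->cpu_used_mask;
	unsigned int visited = 0;

	*total = 0;
	for (unsigned int cpu = 0; mask != 0; cpu++, mask >>= 1) {
		if (mask & 1) {
			*total += fs->packets[cpu];
			visited++;
		}
	}
	return visited;
}
```

A flow touched by two CPUs has two counters read instead of one per possible CPU, and the loop ends as soon as no set bits remain.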
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
openvswitch: Optimize updating for OvS flow_stats.
In ovs_flow_stats_update(), we only use the node
variable to allocate the flow_stats struct. But this is not the
common case, so it is unnecessary to call numa_node_id()
every time. This patch is not a bugfix, but there may be
a small performance increase.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 19 Jul 2017 20:24:47 +0000 (13:24 -0700)]
Merge branch 'liquidio-lowmem-fixes'
Rick Farrington says:
====================
liquidio: avoid vm low memory crashes
This patchset addresses issues brought about by low memory conditions
in a VM. These conditions were not seen when the driver was exercised
normally. Rather, they were brought about through manual fault injection.
They are being included in the interest of hardening the driver against
unforeseen circumstances.
1. Fix GPF in octeon_init_droq(); zero the allocated block 'recv_buf_list'.
This prevents a GPF trying to access an invalid 'recv_buf_list[i]' entry
in octeon_droq_destroy_ring_buffers() if init didn't alloc all entries.
2. Don't dereference a NULL ptr in octeon_droq_destroy_ring_buffers().
3. For defensive programming, zero the allocated block 'oct->droq' in
octeon_setup_output_queues() and 'oct->instr_queue' in
octeon_setup_instr_queues().
change log:
V1 -> V2:
1. Corrected syntax in 'Subject' lines; no functional or code changes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Rick Farrington [Tue, 18 Jul 2017 00:51:37 +0000 (17:51 -0700)]
liquidio: lowmem: init allocated memory to 0
For defensive programming, zero the allocated block 'oct->droq[0]' in
octeon_setup_output_queues() and 'oct->instr_queue[0]' in
octeon_setup_instr_queues().
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com> Signed-off-by: Satanand Burla <satananda.burla@cavium.com> Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rick Farrington [Tue, 18 Jul 2017 00:51:10 +0000 (17:51 -0700)]
liquidio: lowmem: do not dereference null ptr
Don't dereference a NULL ptr in octeon_droq_destroy_ring_buffers().
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com> Signed-off-by: Satanand Burla <satananda.burla@cavium.com> Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rick Farrington [Tue, 18 Jul 2017 00:50:47 +0000 (17:50 -0700)]
liquidio: lowmem: init allocated memory to 0
Fix GPF in octeon_init_droq(); zero the allocated block 'recv_buf_list'.
This prevents a GPF trying to access an invalid 'recv_buf_list[i]' entry
in octeon_droq_destroy_ring_buffers() if init didn't alloc all entries.
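A userspace sketch of why zeroing matters, with hypothetical names and sizes; it is not the liquidio code. If the pointer table starts out zeroed, a teardown routine can tell populated entries from untouched ones after a partial init. The sketch relies on an all-zero-bits pointer comparing equal to NULL, which holds on common platforms.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Allocate a zeroed table of 'n' buffer pointers and fill the first
 * 'ok' entries; the remainder stay NULL, as after a failed init.
 */
void **ring_init(size_t n, size_t ok)
{
	void **list;

	assert(ok <= n);
	list = calloc(n, sizeof(*list));	/* zeroed, unlike malloc() */
	if (!list)
		return NULL;
	for (size_t i = 0; i < ok; i++) {
		list[i] = malloc(64);
		if (!list[i])
			break;
	}
	return list;
}

/* Free every populated entry, skipping NULL ones; returns how many were freed. */
size_t ring_destroy(void **list, size_t n)
{
	size_t freed = 0;

	if (!list)
		return 0;
	for (size_t i = 0; i < n; i++) {
		if (!list[i])
			continue;	/* never initialized: nothing to release */
		free(list[i]);
		freed++;
	}
	free(list);
	return freed;
}
```

Had the table come from an allocator that does not zero memory, the untouched entries would hold arbitrary values and the NULL check in the teardown loop would not protect against them.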
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com> Signed-off-by: Satanand Burla <satananda.burla@cavium.com> Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rick Farrington [Mon, 17 Jul 2017 20:33:14 +0000 (13:33 -0700)]
liquidio: support new firmware statistic fw_err_pki
Added support for new firmware statistic 'fw_err_pki'.
Signed-off-by: Rick Farrington <ricardo.farrington@cavium.com> Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 18 Jul 2017 19:04:57 +0000 (12:04 -0700)]
Merge branch 'net-attribute_group-const'
Arvind Yadav says:
====================
constify net attribute_group structures.
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/sysfs.h> work with const
attribute_group. So mark the non-const structs as const.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.
File size before:
text data bss dec hex filename
28720 985 12 29717 7415 net/.../cxgb3/cxgb3_main.o
File size After adding 'const':
text data bss dec hex filename
28848 857 12 29717 7415 net/.../cxgb3/cxgb3_main.o
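The 128 bytes that move from data to text in the size output above are consistent with an object moving into read-only data, which size counts under text. A minimal sketch of the pattern follows; the struct and function names are hypothetical stand-ins, not the kernel's sysfs definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the sysfs types; not the kernel definitions. */
struct attr_sketch {
	const char *name;
};

struct attr_group_sketch {
	const char *name;
	const struct attr_sketch *const *attrs;	/* NULL-terminated */
};

static const struct attr_sketch attr_a = { .name = "a" };
static const struct attr_sketch attr_b = { .name = "b" };
static const struct attr_sketch *const demo_attrs[] = { &attr_a, &attr_b, NULL };

/* 'const' lets the toolchain place this object in read-only data. */
static const struct attr_group_sketch demo_group = {
	.name = "demo",
	.attrs = demo_attrs,
};

/* Consumers only read the group, so a const pointer is sufficient. */
size_t attr_group_count(const struct attr_group_sketch *grp)
{
	size_t n = 0;

	assert(grp && grp->attrs);
	while (grp->attrs[n])
		n++;
	return n;
}

const struct attr_group_sketch *demo_group_get(void)
{
	return &demo_group;
}
```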
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/netdevice.h> work
with const attribute_group. So mark the non-const structs as const.
File size before:
text data bss dec hex filename
4512 1472 0 5984 1760 drivers/net/bonding/bond_sysfs.o
File size After adding 'const':
text data bss dec hex filename
4576 1408 0 5984 1760 drivers/net/bonding/bond_sysfs.o
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/netdevice.h> work
with const attribute_group. So mark the non-const structs as const.
File size before:
text data bss dec hex filename
3409 948 28 4385 1121 drivers/net/arcnet/com20020-pci.o
File size After adding 'const':
text data bss dec hex filename
3473 884 28 4385 1121 drivers/net/arcnet/com20020-pci.o
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/sysfs.h> work
with const attribute_group. So mark the non-const structs as const.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/netdevice.h> work
with const attribute_group. So mark the non-const structs as const.
File size before:
text data bss dec hex filename
11800 368 0 12168 2f88 drivers/net/can/janz-ican3.o
File size After adding 'const':
text data bss dec hex filename
11864 304 0 12168 2f88 drivers/net/can/janz-ican3.o
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/netdevice.h> work
with const attribute_group. So mark the non-const structs as const.
File size before:
text data bss dec hex filename
6164 304 0 6468 1944 drivers/net/can/at91_can.o
File size After adding 'const':
text data bss dec hex filename
6228 240 0 6468 1944 drivers/net/can/at91_can.o
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
attribute_group are not supposed to change at runtime. All functions
working with attribute_group provided by <linux/netdevice.h> work
with const attribute_group. So mark the non-const structs as const.
File size before:
text data bss dec hex filename
13275 928 1 14204 377c drivers/net/usb/cdc_ncm.o
File size After adding 'const':
text data bss dec hex filename
13339 864 1 14204 377c drivers/net/usb/cdc_ncm.o
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
mlxsw: Preparations for IPv6 UC router
Ido says:
The purpose of this set is to prepare the driver for the introduction of
IPv6 FIB offload. It's mainly composed of small and non-functional
changes that either add the IPv6 equivalent of existing IPv4 code or
are aimed at making the introduction of IPv6-specific code easier.
The first five patches enable IPv6 forwarding in the device and allow us
to configure router interfaces (RIFs) based on inet6addr notifications.
The next six patches add support for programming IPv6 neighbours into
the device's table as well as dumping their activity and updating the
kernel accordingly.
The last 11 patches extend current infrastructure to allow us to program
IPv6 routes, set catch-all IPv6 trap in case of abort and make the code
more receptive towards upcoming changes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_router: Update prefix count for IPv6
The number of possible prefix lengths for IPv6 is 129 and not 128.
Fixes following warning from UBSAN when /128 routes are offloaded:
UBSAN: Undefined behaviour in
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:2510:27 index 128 is out
of range for type 'long unsigned int [128]'
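The off-by-one is easy to reproduce in a sketch: prefix lengths run from /0 to /128 inclusive, so an array indexed by prefix length needs 129 slots. The struct and function names below are hypothetical, not the driver's.

```c
#include <assert.h>

#define IPV6_ADDR_BITS		128
/* Prefix lengths run from /0 to /128 inclusive: one more than the bit count. */
#define IPV6_PREFIX_COUNT	(IPV6_ADDR_BITS + 1)

struct prefix_usage_sketch {
	unsigned long ref[IPV6_PREFIX_COUNT];	/* one counter per prefix length */
};

/* Count a route with the given prefix length; /128 must be a valid index. */
void prefix_ref_inc(struct prefix_usage_sketch *pu, unsigned int prefix_len)
{
	assert(prefix_len < IPV6_PREFIX_COUNT);
	pu->ref[prefix_len]++;
}
```

With only 128 slots, a /128 host route would index one past the end of the array, which matches the UBSAN report above.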
Fixes: 638a7b12eb3e ("mlxsw: spectrum_router: Implement private fib") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_router: Mark IPv4 specific function accordingly
The functions to create and destroy a nexthop group are IPv4 specific
and should be renamed accordingly, so that they won't be confused with
the IPv6 specific functions in follow-up patches.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When we fail to insert a route we invoke the abort mechanism which
flushes all the tables and inserts a default route in each, so that all
packets incoming to the router will be trapped to the CPU.
Upon abort, add an IPv6 default route to the IPv6 tables.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_router: Allow IPv6 routes to be programmed
Take advantage of previous patch and allow the RALUE register to be
called with IPv6 routes.
In order to re-use as much code as possible between IPv4 and IPv6, only
the lowest-level function that actually does the register packing is
demuxed based on the passed protocol.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_router: Make FIB node retrieval family agnostic
A FIB node is an entity which stores routes sharing the same prefix and
length. The data structure itself is already family agnostic, but we
make some of its operations agnostic as well and thus re-use them for
IPv6 offload.
Instead of passing an IPv4-specific structure to fib4_node_get(), pass
general routing parameters and rename the function accordingly.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_router: Don't assume neighbour type
Thankfully, the neighbour subsystem is agnostic to the upper protocol
and used by both IPv4 and IPv6. By removing assumptions regarding the
neighbour type we can thus re-use much of the neighbour-related code for
both IPv4 and IPv6.
For each nexthop, store its gateway IP and for nexthop group store the
neighbour table used by its nexthops.
Use this information throughout the code and remove assumption about the
neighbour type.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_router: Set activity interval according to both neighbour tables
The neighbours' activity is currently dumped according to the ARP
table's DELAY_PROBE time, but with the introduction of IPv6 offload we
should set the interval according to the minimum between the ARP and
ndisc tables.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_router: Reflect IPv6 neighbours to the device
As with IPv4, listen to NEIGH_UPDATE events from the ndisc table and
program relevant neighbours to the device's neighbour table.
Note that neighbours with a link-local IP address aren't programmed, as
packets with a link-local destination IP are trapped after LPM lookup
and never reach the neighbour table.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_router: Configure RIFs based on IPv6 addresses
When a netdev is configured with an IP address a router interface (RIF)
should be configured for it in the device. Allow configuration of RIFs
based on IPv6 address notifications as well as IPv4.
Note that the RIF exists as long as an IP address is configured on the
netdev, regardless of the address family.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_router: Flood unregistered multicast packets to router
Up until now we only flooded broadcast packets to the router when an L3
interface was configured on top of a bridge. However, IPv6 Neighbour
Discovery packets are trapped to the CPU inside the router and these can
be sent with a multicast address.
Flood unregistered multicast packets to the router port, so that
relevant packets could be trapped there.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Before we can start using IPv6, we need to trap certain control packets
to the CPU. Among others, these include Neighbour Discovery, DHCP and
neighbour misses.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 18 Jul 2017 18:13:42 +0000 (11:13 -0700)]
Merge branch 'xfrm-remove-flow-cache'
Florian Westphal says:
====================
xfrm: remove flow cache
After RCU-ification of ipsec packet path there are no major scalability
issues anymore without flow cache.
We still incur a performance hit, which comes mostly from the extra xfrm
dst allocation/freeing.
The last patch in the series adds a simple percpu cache to avoid the
extra allocation if a packet matched the same policies as last one.
The main concern with this is that we will see performance drops,
especially with large numbers of policies/SAs.
However, during hallway discussions at nfws 2017 it seemed the issues
with flow caching outweigh the removal downsides, and that it
might be best to just 'remove it' and see where the practical issues
(if any) will appear.
It should now be possible to also remove the genid member in the policies
as we don't hold bundles for prolonged time anymore, but I think
this change is controversial (and intrusive) enough as-is, so defer
that to a later point in time.
Changes since last rfc:
- fix build failures due to implicit interrupt.h includes
- rework last patch (pcpu cache):
* avoid xchg()
* check policies for walk.dead = 1 instead of more costly bundle_ok().
* flush pcpu bundles when sa/policies get removed, to allow module
references to go away (suggested by Ilan Tayari)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Retain the last used xfrm_dst in a pcpu cache.
On the next request, reuse this dst if the policies are the same.
The cache will not help with strict RR workloads as there is no hit.
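A one-slot cache of this kind can be modeled in a few lines. This is a sketch with hypothetical names, using a plain integer where the real code compares policies. It also reads "strict RR" as an access pattern in which consecutive lookups never repeat the previous policy, which is an assumption on our part.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-slot cache: remembers the policy of the last lookup. */
struct last_dst_cache {
	int valid;
	unsigned int policy_id;	/* stands in for "the policies" compared on reuse */
	unsigned long hits;
	unsigned long misses;
};

/* Returns 1 on a hit (same policy as last time), 0 on a miss that refills the slot. */
int cache_lookup(struct last_dst_cache *c, unsigned int policy_id)
{
	assert(c != NULL);
	if (c->valid && c->policy_id == policy_id) {
		c->hits++;
		return 1;
	}
	c->valid = 1;
	c->policy_id = policy_id;
	c->misses++;
	return 0;
}

/* Feed a sequence of policy ids through the cache and return the hit count. */
unsigned long cache_run(struct last_dst_cache *c, const unsigned int *ids, size_t n)
{
	for (size_t i = 0; i < n; i++)
		cache_lookup(c, ids[i]);
	return c->hits;
}
```

Alternating between two policies never hits, because every lookup evicts the entry the next one would need, while bursts that repeat the same policy hit on every lookup after the first.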
The cache packet-path part is reasonably small. The notifier part is
needed so we do not add long hangs when a device is dismantled but some
pcpu xdst still holds a reference. There are also calls to the flush
operation when userspace deletes SAs so modules can be removed.
We need to run the dst_release on the correct cpu to avoid races with
packet path. This is done by adding a work_struct for each cpu and then
doing the actual test/release on each affected cpu via schedule_work_on().
Test results using 4 network namespaces and null encryption:
Once we remove flow cache, we don't have a flow cache limit anymore.
We must not allow (virtually) unlimited allocations of xfrm dst entries.
Revert back to the old xfrm dst gc limits.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
These drivers use tasklets or IRQ APIs, but don't include interrupt.h.
Once flow cache is removed the implicit interrupt.h inclusion goes away
which will break the build.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch series removes the remaining capabilities as well as the
flags bitmap in the info structures. Most of them are turned into ops,
or new info members.
There is no mv88e6xxx_cap enum or bitmap flags anymore, only
mv88e6xxx_info and mv88e6xxx_ops structures.
While reviewing and documenting the related G2 registers, fix a few
inconsistencies: 88E6185 has no interrupt in G2 and 88E6390 has a POT.
Apart from these two adjustments, there are no functional changes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 17 Jul 2017 17:03:46 +0000 (13:03 -0400)]
net: dsa: mv88e6xxx: add a multi_chip info flag
Instead of relying on a bitmap flag, add a new multi_chip info flag to
describe the presence of the indirect SMI access through the two device
registers 0x0 and 0x1.
All remaining capabilities and flags are now unused. Remove the
mv88e6xxx_cap enum and the info flags bitmaps.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 17 Jul 2017 17:03:45 +0000 (13:03 -0400)]
net: dsa: mv88e6xxx: add Energy Detect ops
The 88E6352 family supports Energy Detect and has one bit for Sense and
one bit to periodically transmit NLP (Energy Detect+TM). The 88E6390
family adds another bit to distinguish Auto or SW wake-up. Chips
supporting EEE all have an EEE Enabled bit in the Port Status Register.
This patch adds new ops for the PHY Energy Detect accesses.
This also allows us to get rid of the MV88E6XXX_FLAG_EEE flag.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 17 Jul 2017 17:03:44 +0000 (13:03 -0400)]
net: dsa: mv88e6xxx: add a global2_addr info flag
Similarly to global1_addr, add a global2_addr member in the info
structure to describe the presence of the Global 2 Registers.
This allows us to get rid of the MV88E6XXX_FLAG_GLOBAL2 flag.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 17 Jul 2017 17:03:43 +0000 (13:03 -0400)]
net: dsa: mv88e6xxx: add POT operation
Add a pot_clear operation to clear the Priority Override Table and wrap
its call into a mv88e6xxx_pot_setup helper.
This allows us to get rid of the MV88E6XXX_FLAG_G2_POT flag.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
The 88E6390 family clears the Priority Override Table the same way as 88E6352, thus add MV88E6XXX_FLAG_G2_POT to MV88E6XXX_FLAGS_FAMILY_6390.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 17 Jul 2017 17:03:41 +0000 (13:03 -0400)]
net: dsa: mv88e6xxx: distinguish Global 2 Rsvd2CPU
The 88E6185 family only has one 16-bit register to mark the 16 802.1D
reserved multicast addresses in the range of 01:80:C2:00:00:0x as MGMT.
The 88E6352 family also has one 16-bit register to mark the 16 GARP
reserved multicast addresses in the range of 01:80:C2:00:00:2x as MGMT.
Split the existing mv88e6095 prefixed mgmt_rsvd2cpu operation into two
distinct mv88e6185 and mv88e6352 prefixed operations, and wrap its call
into a mv88e6xxx_rsvd2cpu_setup helper.
This allows us to also get rid of the MV88E6XXX_CAP_G2_MGMT_EN_* flags.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
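To make the two address ranges concrete, here is a small sketch of how a 16-bit enable mask could be matched against a destination address. It assumes that bit n of the mask corresponds to the address whose last nibble is n; that mapping, and the `rsvd_addr_bit` and `rsvd_addr_is_mgmt` names, are assumptions for illustration and are not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * If addr is 01:80:C2:00:00:XY with (XY & 0xf0) == base, store the low
 * nibble in *bit and return true. Use base 0x00 for the 802.1D range
 * (01:80:C2:00:00:0x) and base 0x20 for the GARP range (01:80:C2:00:00:2x).
 */
static bool rsvd_addr_bit(const uint8_t addr[6], uint8_t base,
			  unsigned int *bit)
{
	static const uint8_t prefix[5] = { 0x01, 0x80, 0xc2, 0x00, 0x00 };
	int i;

	for (i = 0; i < 5; i++)
		if (addr[i] != prefix[i])
			return false;

	if ((addr[5] & 0xf0) != base)
		return false;

	*bit = addr[5] & 0x0f;
	return true;
}

/* True if the 16-bit enable mask marks addr as management traffic. */
static bool rsvd_addr_is_mgmt(const uint8_t addr[6], uint8_t base,
			      uint16_t mask)
{
	unsigned int bit;

	return rsvd_addr_bit(addr, base, &bit) && (mask & (1u << bit)) != 0;
}
```

Each range has exactly 16 addresses, which is why a single 16-bit register per range is enough to select them individually.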
Vivien Didelot [Mon, 17 Jul 2017 17:03:40 +0000 (13:03 -0400)]
net: dsa: mv88e6xxx: add number of Global 2 IRQs
Similarly to g1_irqs, add a g2_irqs member to the info structure to
indicate the presence of the Global 2 Interrupt Source and Mask
registers.
At the same time, provide helpers and document the registers since they
differ a bit between 88E6352 and 88E6390 families.
This allows us to get rid of the MV88E6XXX_FLAG_G2_INT flag.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
The 88E6185 family has no Global 2 Interrupt Source or Mask registers.
Remove the MV88E6XXX_FLAG_G2_INT from MV88E6XXX_FLAGS_FAMILY_6185.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 17 Jul 2017 17:03:38 +0000 (13:03 -0400)]
net: dsa: mv88e6xxx: remove unused capabilities
Remove the forgotten capabilities and related flags from previous
cleanups.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
MV88E6XXX_FAMILY_6321 is undefined; the 88E6321 belongs to the 88E6320
family. Fix this.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 17 Jul 2017 17:03:36 +0000 (13:03 -0400)]
net: dsa: mv88e6xxx: remove LED control register
We don't support LED control yet, remove its register definition.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Mon, 17 Jul 2017 17:03:35 +0000 (13:03 -0400)]
net: dsa: mv88e6xxx: remove unneeded dsa header
phy.c does not need to include the DSA public header. Remove it.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
John Fastabend [Tue, 18 Jul 2017 04:56:48 +0000 (21:56 -0700)]
net: fix build error in devmap helper calls
The initial patches missed the case where CONFIG_BPF_SYSCALL is not set.
Fixes: 85d93df2d36f ("xdp: Add batching support to redirect map") Fixes: e3e70468e656 ("bpf: add bpf_redirect_map helper routine") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
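A common way to handle a missing config option in a header is to provide inline stubs when the option is off. The sketch below shows that general pattern only; `EXAMPLE_CONFIG_FEATURE`, `example_map`, `example_map_lookup`, and `example_map_flush` are hypothetical names, and this is not claimed to be the exact change made by the commit.

```c
#include <stddef.h>

struct example_map;

#ifdef EXAMPLE_CONFIG_FEATURE
/* Real implementations live in a file built only when the option is set. */
void *example_map_lookup(struct example_map *map, unsigned int key);
void example_map_flush(struct example_map *map);
#else
/* Stubs keep callers compiling and linking when the option is off. */
static inline void *example_map_lookup(struct example_map *map,
				       unsigned int key)
{
	(void)map;
	(void)key;
	return NULL;
}

static inline void example_map_flush(struct example_map *map)
{
	(void)map;
}
#endif
```

Callers can then use the helpers without their own `#ifdef` blocks; with the option disabled, lookups simply return NULL.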
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
pci_device_id structures are not supposed to change at runtime. All
functions working with pci_device_id provided by <linux/pci.h> take a
const pci_device_id, so mark the non-const structs as const.
File size before:
   text    data     bss     dec     hex filename
   5113     384       0    5497    1579 drivers/net/ethernet/ec_bhf.o
File size after adding 'const':
   text    data     bss     dec     hex filename
   5177     320       0    5497    1579 drivers/net/ethernet/ec_bhf.o
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
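The effect of the change can be sketched outside the kernel with a stand-in ID table. Everything here (`example_device_id`, `example_ids`, `example_match`, and the ID values) is hypothetical; the point is only that a table which is never written, and whose consumers take a const pointer, can itself be declared const. In the figures above, 64 bytes move from the data column to the text column while the total stays at 5497.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device ID entry. */
struct example_device_id {
	uint16_t vendor;
	uint16_t device;
};

/* Declared const: the table is only ever read. */
static const struct example_device_id example_ids[] = {
	{ 0x1234, 0x0001 },
	{ 0x1234, 0x0002 },
	{ 0, 0 },	/* terminator */
};

/* The consumer takes a const pointer, so a const table can be passed. */
static const struct example_device_id *
example_match(const struct example_device_id *ids,
	      uint16_t vendor, uint16_t device)
{
	for (; ids->vendor != 0; ids++)
		if (ids->vendor == vendor && ids->device == device)
			return ids;

	return NULL;
}
```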
pci_device_id structures are not supposed to change at runtime. All
functions working with pci_device_id provided by <linux/pci.h> take a
const pci_device_id, so mark the non-const structs as const.
File size before:
   text    data     bss     dec     hex filename
    791     336       0    1127     467 net/ethernet/cadence/macb_pci.o
File size after adding 'const':
   text    data     bss     dec     hex filename
    855     272       0    1127     467 net/ethernet/cadence/macb_pci.o
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: Revert "net: add function to allocate sk_buff head without data area"
It was added for netlink mmap tx; there are no callers in the tree.
The commit also added a check for skb->head != NULL in the kfree_skb
path; remove that too -- all skbs ought to have skb->head set.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>