]> git.baikalelectronics.ru Git - kernel.git/commit
cpumask: make cpumask_next() out-of-line
authorAlexey Dobriyan <adobriyan@gmail.com>
Fri, 8 Sep 2017 23:17:15 +0000 (16:17 -0700)
committerLinus Torvalds <torvalds@linux-foundation.org>
Sat, 9 Sep 2017 01:26:51 +0000 (18:26 -0700)
commit786e77b8c379189272f005998065e59a7db8b88d
tree57173374d87f88a280e80a6b7b8ee3eb237d2c0b
parent40ea28f58c2640594485a68cf240abf3474facb5
cpumask: make cpumask_next() out-of-line

Every for_each_XXX_cpu() invocation calls cpumask_next() which is an
inline function:

static inline unsigned int cpumask_next(int n, const struct cpumask *srcp)
{
        /* -1 is a legal arg here. */
        if (n != -1)
                cpumask_check(n);
        return find_next_bit(cpumask_bits(srcp), nr_cpumask_bits, n + 1);
}

However!

find_next_bit() is regular out-of-line function which means "nr_cpu_ids"
load and increment happen at the caller resulting in a lot of bloat

x86_64 defconfig:
add/remove: 3/0 grow/shrink: 8/373 up/down: 155/-5668 (-5513)
x86_64 allyesconfig-ish:
add/remove: 3/1 grow/shrink: 57/634 up/down: 3515/-28177 (-24662) !!!

Some archs redefine find_next_bit() but it is OK:

m68k inline but SMP is not supported
arm out-of-line
unicore32 out-of-line

Function call will happen anyway, so move load and increment into callee.

Link: http://lkml.kernel.org/r/20170824230010.GA1593@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
include/linux/cpumask.h
lib/cpumask.c