]> git.baikalelectronics.ru Git - kernel.git/commitdiff
mm, memcg: do not high throttle allocators based on wraparound
authorJakub Kicinski <kuba@kernel.org>
Fri, 10 Apr 2020 21:32:19 +0000 (14:32 -0700)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Fri, 17 Apr 2020 08:50:17 +0000 (10:50 +0200)
commit b8134b2b6e412eb7fe38a320a4bbb0d4a27ef495 upstream.

If a cgroup violates its memory.high constraints, we may end up unduly
penalising it.  For example, for the following hierarchy:

  A:   max high, 20 usage
  A/B: 9 high, 10 usage
  A/C: max high, 10 usage

We would end up doing the following calculation below when calculating
high delay for A/B:

  A/B: 10 - 9 = 1...
  A:   20 - PAGE_COUNTER_MAX = 21, so set max_overage to 21.

This gets worse with higher disparities in usage in the parent.

I have no idea how this disappeared from the final version of the patch,
but it is certainly Not Good(tm).  This wasn't obvious in testing because,
for a simple cgroup hierarchy with only one child, the result is usually
roughly the same.  It's only in more complex hierarchies that things go
really awry (although still, the effects are limited to a maximum of 2
seconds in schedule_timeout_killable at a maximum).

[chris@chrisdown.name: changelog]
Fixes: 0798a9a9b57d ("mm, memcg: throttle allocators based on ancestral memory.high")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Chris Down <chris@chrisdown.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org> [5.4.x]
Link: http://lkml.kernel.org/r/20200331152424.GA1019937@chrisdown.name
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mm/memcontrol.c

index 5d0575d633d2e000797f8a70b7b389a623c90b60..8159000781be59a0cc83f93ce9695c1a1d1004b5 100644 (file)
@@ -2441,6 +2441,9 @@ static unsigned long calculate_high_delay(struct mem_cgroup *memcg,
                usage = page_counter_read(&memcg->memory);
                high = READ_ONCE(memcg->high);
 
+               if (usage <= high)
+                       continue;
+
                /*
                 * Prevent division by 0 in overage calculation by acting as if
                 * it was a threshold of 1 page