]> git.baikalelectronics.ru Git - kernel.git/commit
thp: change CoW semantics for anon-THP
authorKirill A. Shutemov <kirill.shutemov@linux.intel.com>
Wed, 3 Jun 2020 23:00:27 +0000 (16:00 -0700)
committerLinus Torvalds <torvalds@linux-foundation.org>
Thu, 4 Jun 2020 03:09:46 +0000 (20:09 -0700)
commit821c577da574d00fa0e73f02b20aa2de72746a3d
tree8e21dd0cbeb98e82685c3c36bfc3c251a24bcf66
parent8a9bbf73fce542aff6c49f3cbf496a41ad830df5
thp: change CoW semantics for anon-THP

Currently we have different copy-on-write semantics for anon- and
file-THP.  For anon-THP we try to allocate huge page on the write fault,
but on file-THP we split PMD and allocate 4k page.

Arguably, file-THP semantics is more desirable: we don't necessary want to
unshare full PMD range from the parent on the first access.  This is the
primary reason THP is unusable for some workloads, like Redis.

The original THP refcounting didn't allow to have PTE-mapped compound
pages, so we had no options, but to allocate huge page on CoW (with
fallback to 512 4k pages).

The current refcounting doesn't have such limitations and we can cut a lot
of complex code out of fault path.

khugepaged is now able to recover THP from such ranges if the
configuration allows.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Acked-by: Yang Shi <yang.shi@linux.alibaba.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Link: http://lkml.kernel.org/r/20200416160026.16538-8-kirill.shutemov@linux.intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/huge_memory.c