Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 249be851 authored by Linus Torvalds's avatar Linus Torvalds
Browse files

Merge branch 'akpm' (patches from Andrew)

Merge yet more updates from Andrew Morton:
 "The rest of MM and a kernel-wide procfs cleanup.

  Summary of the more significant patches:

   - Patch series "mm/memory_hotplug: Factor out memory block
     devicehandling", v3. David Hildenbrand.

     Some spring-cleaning of the memory hotplug code, notably in
     drivers/base/memory.c

   - "mm: thp: fix false negative of shmem vma's THP eligibility". Yang
     Shi.

     Fix /proc/pid/smaps output for THP pages used in shmem.

   - "resource: fix locking in find_next_iomem_res()" + 1. Nadav Amit.

     Bugfix and speedup for kernel/resource.c

   - Patch series "mm: Further memory block device cleanups", David
     Hildenbrand.

     More spring-cleaning of the memory hotplug code.

   - Patch series "mm: Sub-section memory hotplug support". Dan
     Williams.

     Generalise the memory hotplug code so that pmem can use it more
     completely. Then remove the hacks from the libnvdimm code which
     were there to work around the memory-hotplug code's constraints.

   - "proc/sysctl: add shared variables for range check", Matteo Croce.

     We have about 250 instances of

          int zero;
          ...
                  .extra1 = &zero,

     in the tree. This is a tree-wide sweep to make all those private
     "zero"s and "one"s use global variables.

     Alas, it isn't practical to make those two global integers const"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (38 commits)
  proc/sysctl: add shared variables for range check
  mm: migrate: remove unused mode argument
  mm/sparsemem: cleanup 'section number' data types
  libnvdimm/pfn: stop padding pmem namespaces to section alignment
  libnvdimm/pfn: fix fsdax-mode namespace info-block zero-fields
  mm/devm_memremap_pages: enable sub-section remap
  mm: document ZONE_DEVICE memory-model implications
  mm/sparsemem: support sub-section hotplug
  mm/sparsemem: prepare for sub-section ranges
  mm: kill is_dev_zone() helper
  mm/hotplug: kill is_dev_zone() usage in __remove_pages()
  mm/sparsemem: convert kmalloc_section_memmap() to populate_section_memmap()
  mm/hotplug: prepare shrink_{zone, pgdat}_span for sub-section removal
  mm/sparsemem: add helpers track active portions of a section at boot
  mm/sparsemem: introduce a SECTION_IS_EARLY flag
  mm/sparsemem: introduce struct mem_section_usage
  drivers/base/memory.c: get rid of find_memory_block_hinted()
  mm/memory_hotplug: move and simplify walk_memory_blocks()
  mm/memory_hotplug: rename walk_memory_range() and pass start+size instead of pfns
  mm: make register_mem_sect_under_node() static
  ...
parents 3bfe1fc4 eec4844f
Loading
Loading
Loading
Loading
+2 −2
Original line number Diff line number Diff line
@@ -486,8 +486,8 @@ replaced by copy-on-write) part of the underlying shmem object out on swap.
"SwapPss" shows proportional swap share of this mapping. Unlike "Swap", this
does not take into account swapped out page of underlying shmem objects.
"Locked" indicates whether the mapping is locked in memory or not.
"THPeligible" indicates whether the mapping is eligible for THP pages - 1 if
true, 0 otherwise.
"THPeligible" indicates whether the mapping is eligible for allocating THP
pages - 1 if true, 0 otherwise. It just shows the current status.

"VmFlags" field deserves a separate description. This member represents the kernel
flags associated with the particular virtual memory area in two letter encoded
+40 −0
Original line number Diff line number Diff line
@@ -181,3 +181,43 @@ that is eventually passed to vmemmap_populate() through a long chain
of function calls. The vmemmap_populate() implementation may use the
`vmem_altmap` along with :c:func:`altmap_alloc_block_buf` helper to
allocate memory map on the persistent memory device.

ZONE_DEVICE
===========
The `ZONE_DEVICE` facility builds upon `SPARSEMEM_VMEMMAP` to offer
`struct page` `mem_map` services for device driver identified physical
address ranges. The "device" aspect of `ZONE_DEVICE` relates to the fact
that the page objects for these address ranges are never marked online,
and that a reference must be taken against the device, not just the page
to keep the memory pinned for active use. `ZONE_DEVICE`, via
:c:func:`devm_memremap_pages`, performs just enough memory hotplug to
turn on :c:func:`pfn_to_page`, :c:func:`page_to_pfn`, and
:c:func:`get_user_pages` service for the given range of pfns. Since the
page reference count never drops below 1 the page is never tracked as
free memory and the page's `struct list_head lru` space is repurposed
for back referencing to the host device / driver that mapped the memory.

While `SPARSEMEM` presents memory as a collection of sections,
optionally collected into memory blocks, `ZONE_DEVICE` users have a need
for smaller granularity of populating the `mem_map`. Given that
`ZONE_DEVICE` memory is never marked online it is subsequently never
subject to its memory ranges being exposed through the sysfs memory
hotplug api on memory block boundaries. The implementation relies on
this lack of user-api constraint to allow sub-section sized memory
ranges to be specified to :c:func:`arch_add_memory`, the top-half of
memory hotplug. Sub-section support allows for 2MB as the cross-arch
common alignment granularity for :c:func:`devm_memremap_pages`.

The users of `ZONE_DEVICE` are:

* pmem: Map platform persistent memory to be used as a direct-I/O target
  via DAX mappings.

* hmm: Extend `ZONE_DEVICE` with `->page_fault()` and `->page_free()`
  event callbacks to allow a device-driver to coordinate memory management
  events related to device-memory, typically GPU memory. See
  Documentation/vm/hmm.rst.

* p2pdma: Create `struct page` objects to allow peer devices in a
  PCI/-E topology to coordinate direct-DMA operations between themselves,
  i.e. bypass host memory.
+17 −0
Original line number Diff line number Diff line
@@ -1074,4 +1074,21 @@ int arch_add_memory(int nid, u64 start, u64 size,
	return __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT,
			   restrictions);
}
void arch_remove_memory(int nid, u64 start, u64 size,
			struct vmem_altmap *altmap)
{
	unsigned long start_pfn = start >> PAGE_SHIFT;
	unsigned long nr_pages = size >> PAGE_SHIFT;
	struct zone *zone;

	/*
	 * FIXME: Cleanup page tables (also in arch_add_memory() in case
	 * adding fails). Until then, this function should only be used
	 * during memory hotplug (adding memory), not for memory
	 * unplug. ARCH_ENABLE_MEMORY_HOTREMOVE must not be
	 * unlocked yet.
	 */
	zone = page_zone(pfn_to_page(start_pfn));
	__remove_pages(zone, start_pfn, nr_pages, altmap);
}
#endif
+0 −2
Original line number Diff line number Diff line
@@ -681,7 +681,6 @@ int arch_add_memory(int nid, u64 start, u64 size,
	return ret;
}

#ifdef CONFIG_MEMORY_HOTREMOVE
void arch_remove_memory(int nid, u64 start, u64 size,
			struct vmem_altmap *altmap)
{
@@ -693,4 +692,3 @@ void arch_remove_memory(int nid, u64 start, u64 size,
	__remove_pages(zone, start_pfn, nr_pages, altmap);
}
#endif
#endif
+0 −2
Original line number Diff line number Diff line
@@ -125,7 +125,6 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
	return __add_pages(nid, start_pfn, nr_pages, restrictions);
}

#ifdef CONFIG_MEMORY_HOTREMOVE
void __ref arch_remove_memory(int nid, u64 start, u64 size,
			     struct vmem_altmap *altmap)
{
@@ -151,7 +150,6 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size,
		pr_warn("Hash collision while resizing HPT\n");
}
#endif
#endif /* CONFIG_MEMORY_HOTPLUG */

#ifndef CONFIG_NEED_MULTIPLE_NODES
void __init mem_topology_setup(void)
Loading