Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit f6e858a0 authored by Linus Torvalds's avatar Linus Torvalds
Browse files

Merge branch 'akpm' (Andrew's patch-bomb)

Merge misc VM changes from Andrew Morton:
 "The rest of most-of-MM.  The other MM bits await a slab merge.

  This patch includes the addition of a huge zero_page.  Not a
  performance boost but it an save large amounts of physical memory in
  some situations.

  Also a bunch of Fujitsu engineers are working on memory hotplug.
  Which, as it turns out, was badly broken.  About half of their patches
  are included here; the remainder are 3.8 material."

However, this merge disables CONFIG_MOVABLE_NODE, which was totally
broken.  We don't add new features with "default y", nor do we add
Kconfig questions that are incomprehensible to most people without any
help text.  Does the feature even make sense without compaction or
memory hotplug?

* akpm: (54 commits)
  mm/bootmem.c: remove unused wrapper function reserve_bootmem_generic()
  mm/memory.c: remove unused code from do_wp_page()
  asm-generic, mm: pgtable: consolidate zero page helpers
  mm/hugetlb.c: fix warning on freeing hwpoisoned hugepage
  hwpoison, hugetlbfs: fix RSS-counter warning
  hwpoison, hugetlbfs: fix "bad pmd" warning in unmapping hwpoisoned hugepage
  mm: protect against concurrent vma expansion
  memcg: do not check for mm in __mem_cgroup_count_vm_event
  tmpfs: support SEEK_DATA and SEEK_HOLE (reprise)
  mm: provide more accurate estimation of pages occupied by memmap
  fs/buffer.c: remove redundant initialization in alloc_page_buffers()
  fs/buffer.c: do not inline exported function
  writeback: fix a typo in comment
  mm: introduce new field "managed_pages" to struct zone
  mm, oom: remove statically defined arch functions of same name
  mm, oom: remove redundant sleep in pagefault oom handler
  mm, oom: cleanup pagefault oom handler
  memory_hotplug: allow online/offline memory to result movable node
  numa: add CONFIG_MOVABLE_NODE for movable-dedicated node
  mm, memcg: avoid unnecessary function call when memcg is disabled
  ...
parents 193c0d68 98870901
Loading
Loading
Loading
Loading
+1 −1
Original line number Diff line number Diff line
@@ -218,7 +218,7 @@ and name space for cpusets, with a minimum of additional kernel code.
The cpus and mems files in the root (top_cpuset) cpuset are
read-only.  The cpus file automatically tracks the value of
cpu_online_mask using a CPU hotplug notifier, and the mems file
automatically tracks the value of node_states[N_HIGH_MEMORY]--i.e.,
automatically tracks the value of node_states[N_MEMORY]--i.e.,
nodes with memory--using the cpuset_track_online_nodes() hook.


+4 −1
Original line number Diff line number Diff line
@@ -390,6 +390,7 @@ struct memory_notify {
       unsigned long start_pfn;
       unsigned long nr_pages;
       int status_change_nid_normal;
       int status_change_nid_high;
       int status_change_nid;
}

@@ -397,7 +398,9 @@ start_pfn is start_pfn of online/offline memory.
nr_pages is # of pages of online/offline memory.
status_change_nid_normal is set node id when N_NORMAL_MEMORY of nodemask
is (will be) set/clear, if this is -1, then nodemask status is not changed.
status_change_nid is set node id when N_HIGH_MEMORY of nodemask is (will be)
status_change_nid_high is set node id when N_HIGH_MEMORY of nodemask
is (will be) set/clear, if this is -1, then nodemask status is not changed.
status_change_nid is set node id when N_MEMORY of nodemask is (will be)
set/clear. It means a new(memoryless) node gets new memory by online and a
node loses all memory. If this is -1, then nodemask status is not changed.
If status_changed_nid* >= 0, callback should create/discard structures for the
+17 −2
Original line number Diff line number Diff line
@@ -116,6 +116,13 @@ echo always >/sys/kernel/mm/transparent_hugepage/defrag
echo madvise >/sys/kernel/mm/transparent_hugepage/defrag
echo never >/sys/kernel/mm/transparent_hugepage/defrag

By default kernel tries to use huge zero page on read page fault.
It's possible to disable huge zero page by writing 0 or enable it
back by writing 1:

echo 0 >/sys/kernel/mm/transparent_hugepage/khugepaged/use_zero_page
echo 1 >/sys/kernel/mm/transparent_hugepage/khugepaged/use_zero_page

khugepaged will be automatically started when
transparent_hugepage/enabled is set to "always" or "madvise, and it'll
be automatically shutdown if it's set to "never".
@@ -197,6 +204,14 @@ thp_split is incremented every time a huge page is split into base
	pages. This can happen for a variety of reasons but a common
	reason is that a huge page is old and is being reclaimed.

thp_zero_page_alloc is incremented every time a huge zero page is
	successfully allocated. It includes allocations which where
	dropped due race with other allocation. Note, it doesn't count
	every map of the huge zero page, only its allocation.

thp_zero_page_alloc_failed is incremented if kernel fails to allocate
	huge zero page and falls back to using small pages.

As the system ages, allocating huge pages may be expensive as the
system uses memory compaction to copy data around memory to free a
huge page for use. There are some counters in /proc/vmstat to help
@@ -276,7 +291,7 @@ unaffected. libhugetlbfs will also work fine as usual.
== Graceful fallback ==

Code walking pagetables but unware about huge pmds can simply call
split_huge_page_pmd(mm, pmd) where the pmd is the one returned by
split_huge_page_pmd(vma, addr, pmd) where the pmd is the one returned by
pmd_offset. It's trivial to make the code transparent hugepage aware
by just grepping for "pmd_offset" and adding split_huge_page_pmd where
missing after pmd_offset returns the pmd. Thanks to the graceful
@@ -299,7 +314,7 @@ diff --git a/mm/mremap.c b/mm/mremap.c
		return NULL;

	pmd = pmd_offset(pud, addr);
+	split_huge_page_pmd(mm, pmd);
+	split_huge_page_pmd(vma, addr, pmd);
	if (pmd_none_or_clear_bad(pmd))
		return NULL;

+1 −10
Original line number Diff line number Diff line
@@ -76,16 +76,7 @@ extern unsigned long zero_page_mask;

#define ZERO_PAGE(vaddr) \
	(virt_to_page((void *)(empty_zero_page + (((unsigned long)(vaddr)) & zero_page_mask))))

#define is_zero_pfn is_zero_pfn
static inline int is_zero_pfn(unsigned long pfn)
{
	extern unsigned long zero_pfn;
	unsigned long offset_from_zero_pfn = pfn - zero_pfn;
	return offset_from_zero_pfn <= (zero_page_mask >> PAGE_SHIFT);
}

#define my_zero_pfn(addr)	page_to_pfn(ZERO_PAGE(addr))
#define __HAVE_COLOR_ZERO_PAGE

extern void paging_init(void);

+12 −15
Original line number Diff line number Diff line
@@ -113,19 +113,6 @@ static int store_updates_sp(struct pt_regs *regs)
#define MM_FAULT_CONTINUE	-1
#define MM_FAULT_ERR(sig)	(sig)

static int out_of_memory(struct pt_regs *regs)
{
	/*
	 * We ran out of memory, or some other thing happened to us that made
	 * us unable to handle the page fault gracefully.
	 */
	up_read(&current->mm->mmap_sem);
	if (!user_mode(regs))
		return MM_FAULT_ERR(SIGKILL);
	pagefault_out_of_memory();
	return MM_FAULT_RETURN;
}

static int do_sigbus(struct pt_regs *regs, unsigned long address)
{
	siginfo_t info;
@@ -169,8 +156,18 @@ static int mm_fault_error(struct pt_regs *regs, unsigned long addr, int fault)
		return MM_FAULT_CONTINUE;

	/* Out of memory */
	if (fault & VM_FAULT_OOM)
		return out_of_memory(regs);
	if (fault & VM_FAULT_OOM) {
		up_read(&current->mm->mmap_sem);

		/*
		 * We ran out of memory, or some other thing happened to us that
		 * made us unable to handle the page fault gracefully.
		 */
		if (!user_mode(regs))
			return MM_FAULT_ERR(SIGKILL);
		pagefault_out_of_memory();
		return MM_FAULT_RETURN;
	}

	/* Bus error. x86 handles HWPOISON here, we'll add this if/when
	 * we support the feature in HW
Loading