
Commit a8aed3e0 authored by Andrea Arcangeli, committed by Ingo Molnar

x86/mm/pageattr: Prevent PSE and GLOBAL leftovers to confuse pmd/pte_present and pmd_huge



Without this patch, any kernel code that reads kernel memory
mapped by non-present kernel ptes/pmds (as set by pageattr.c)
will crash.

With this kernel code:

static struct page *crash_page;
static unsigned long *crash_address;
[..]
	crash_page = alloc_pages(GFP_KERNEL, 9);
	crash_address = page_address(crash_page);
	if (set_memory_np((unsigned long)crash_address, 1))
		printk("set_memory_np failure\n");
[..]

The kernel will crash if one then tries to read the memory at
the non-present address from inside the "crash" tool.

crash> p crash_address
crash_address = $8 = (long unsigned int *) 0xffff88023c000000
crash> rd 0xffff88023c000000
[ *lockup* ]

The lockup happens because _PAGE_GLOBAL and _PAGE_PROTNONE
share the same bit, and pageattr leaves _PAGE_GLOBAL set on a
kernel pte, which is then mistaken for _PAGE_PROTNONE (so
pte_present returns true by mistake, and the kernel fault
handler then gets confused and loops).
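
The aliasing described above can be reproduced in a small userspace
sketch. It assumes simplified copies of the flag bits (bit 0 for
PRESENT, bit 8 for GLOBAL/PROTNONE) and a simplified model of
pte_present; the SK_ macros and sk_ functions are hypothetical names
made up for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative copies of the x86 flag bits; not the kernel headers. */
#define SK_PAGE_PRESENT  (UINT64_C(1) << 0)
#define SK_PAGE_GLOBAL   (UINT64_C(1) << 8)
#define SK_PAGE_PROTNONE SK_PAGE_GLOBAL	/* same bit, as described above */

/* Simplified model of pte_present(): PRESENT or PROTNONE both count. */
int sk_pte_present(uint64_t flags)
{
	return (flags & (SK_PAGE_PRESENT | SK_PAGE_PROTNONE)) != 0;
}

/* Model of the old behaviour: only PRESENT is cleared, GLOBAL is left over. */
uint64_t sk_pte_clear_present_only(uint64_t flags)
{
	return flags & ~SK_PAGE_PRESENT;
}

/* Model of the patched behaviour: GLOBAL is cleared together with PRESENT. */
uint64_t sk_pte_clear_present_and_global(uint64_t flags)
{
	return flags & ~(SK_PAGE_PRESENT | SK_PAGE_GLOBAL);
}

/* Walks through both cases for a typical global kernel mapping. */
void sk_pte_demo(void)
{
	uint64_t kernel_pte = SK_PAGE_PRESENT | SK_PAGE_GLOBAL;

	/* Leftover GLOBAL aliases PROTNONE: the pte still looks present. */
	assert(sk_pte_present(sk_pte_clear_present_only(kernel_pte)));

	/* With GLOBAL cleared too, the pte is correctly seen as non-present. */
	assert(!sk_pte_present(sk_pte_clear_present_and_global(kernel_pte)));
}
```

Calling sk_pte_demo() exercises both cases; the first assertion models
the state that led to the lockup, the second the state after the fix.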

With THP the same can happen after we taught pmd_present to
check _PAGE_PROTNONE and _PAGE_PSE in commit
027ef6c8 ("mm: thp: fix pmd_present for
split_huge_page and PROT_NONE with THP").  THP has the same
problem with _PAGE_GLOBAL as the 4k pages, but it also has a
problem with _PAGE_PSE, which must be cleared too.
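
For the THP case, a similar userspace sketch shows why a leftover
_PAGE_PSE is a problem as well. It assumes simplified models of
pmd_present (checking PRESENT, PROTNONE and PSE, as described above)
and of pmd_huge (checking PSE); the SK_ macros and sk_ functions are
hypothetical and only mirror the bit layout.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative copies of the x86 flag bits; not the kernel headers. */
#define SK_PAGE_PRESENT  (UINT64_C(1) << 0)
#define SK_PAGE_PSE      (UINT64_C(1) << 7)
#define SK_PAGE_GLOBAL   (UINT64_C(1) << 8)
#define SK_PAGE_PROTNONE SK_PAGE_GLOBAL	/* same bit as GLOBAL */

/* Simplified model of pmd_present(): PRESENT, PROTNONE or PSE count. */
int sk_pmd_present(uint64_t flags)
{
	return (flags & (SK_PAGE_PRESENT | SK_PAGE_PROTNONE | SK_PAGE_PSE)) != 0;
}

/* Simplified model of pmd_huge(): only PSE is checked. */
int sk_pmd_huge(uint64_t flags)
{
	return (flags & SK_PAGE_PSE) != 0;
}

/* Compares a pmd with a leftover PSE bit against a fully cleaned one. */
void sk_pmd_demo(void)
{
	uint64_t huge = SK_PAGE_PRESENT | SK_PAGE_PSE | SK_PAGE_GLOBAL;
	uint64_t leftover_pse = huge & ~(SK_PAGE_PRESENT | SK_PAGE_GLOBAL);
	uint64_t clean = huge & ~(SK_PAGE_PRESENT | SK_PAGE_PSE | SK_PAGE_GLOBAL);

	/* Clearing GLOBAL alone is not enough: PSE keeps the pmd "present". */
	assert(sk_pmd_present(leftover_pse));
	assert(sk_pmd_huge(leftover_pse));

	/* Only with PSE and GLOBAL both cleared is the pmd non-present. */
	assert(!sk_pmd_present(clean));
	assert(!sk_pmd_huge(clean));
}
```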

After the patch is applied, copy_user correctly returns -EFAULT
and no longer locks up.

crash> p crash_address
crash_address = $9 = (long unsigned int *) 0xffff88023c000000
crash> rd 0xffff88023c000000
rd: read error: kernel virtual address: ffff88023c000000  type:
"64-bit KVADDR"
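
The rule the patch applies in each code path (set the dependent flags
only when PRESENT is set, clear them otherwise) can be sketched as one
helper. This is a userspace illustration only: the patch itself
open-codes the check at each site, and sk_fix_follow_present and the
SK_ macros are hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative copies of the x86 flag bits; not the kernel headers. */
#define SK_PAGE_PRESENT  (UINT64_C(1) << 0)
#define SK_PAGE_PSE      (UINT64_C(1) << 7)
#define SK_PAGE_GLOBAL   (UINT64_C(1) << 8)

/*
 * Make the bits in 'dependent' follow the PRESENT bit of 'prot':
 * set them when PRESENT is set, clear them when it is not.
 */
uint64_t sk_fix_follow_present(uint64_t prot, uint64_t dependent)
{
	if (prot & SK_PAGE_PRESENT)
		return prot | dependent;
	return prot & ~dependent;
}

/* Applies the rule to a large-page entry and to a 4k entry. */
void sk_fix_demo(void)
{
	uint64_t large_np = SK_PAGE_PSE | SK_PAGE_GLOBAL;	/* PRESENT already cleared */
	uint64_t small_p = SK_PAGE_PRESENT;

	/* Large page path: PSE and GLOBAL both follow PRESENT. */
	assert(sk_fix_follow_present(large_np, SK_PAGE_PSE | SK_PAGE_GLOBAL) == 0);

	/* 4k path: only GLOBAL follows PRESENT. */
	assert(sk_fix_follow_present(small_p, SK_PAGE_GLOBAL) ==
	       (SK_PAGE_PRESENT | SK_PAGE_GLOBAL));
}
```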

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Shaohua Li <shaohua.li@intel.com>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
parent 954f8571
+47 −3
@@ -444,6 +444,19 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 	pgprot_val(req_prot) &= ~pgprot_val(cpa->mask_clr);
 	pgprot_val(req_prot) |= pgprot_val(cpa->mask_set);
 
+	/*
+	 * Set the PSE and GLOBAL flags only if the PRESENT flag is
+	 * set otherwise pmd_present/pmd_huge will return true even on
+	 * a non present pmd. The canon_pgprot will clear _PAGE_GLOBAL
+	 * for the ancient hardware that doesn't support it.
+	 */
+	if (pgprot_val(new_prot) & _PAGE_PRESENT)
+		pgprot_val(new_prot) |= _PAGE_PSE | _PAGE_GLOBAL;
+	else
+		pgprot_val(new_prot) &= ~(_PAGE_PSE | _PAGE_GLOBAL);
+
+	new_prot = canon_pgprot(new_prot);
+
 	/*
 	 * old_pte points to the large page base address. So we need
 	 * to add the offset of the virtual address:
@@ -489,7 +502,7 @@ try_preserve_large_page(pte_t *kpte, unsigned long address,
 		 * The address is aligned and the number of pages
 		 * covers the full page.
 		 */
-		new_pte = pfn_pte(pte_pfn(old_pte), canon_pgprot(new_prot));
+		new_pte = pfn_pte(pte_pfn(old_pte), new_prot);
 		__set_pmd_pte(kpte, address, new_pte);
 		cpa->flags |= CPA_FLUSHTLB;
 		do_split = 0;
@@ -540,16 +553,35 @@ static int split_large_page(pte_t *kpte, unsigned long address)
 #ifdef CONFIG_X86_64
 	if (level == PG_LEVEL_1G) {
 		pfninc = PMD_PAGE_SIZE >> PAGE_SHIFT;
-		pgprot_val(ref_prot) |= _PAGE_PSE;
+		/*
+		 * Set the PSE flags only if the PRESENT flag is set
+		 * otherwise pmd_present/pmd_huge will return true
+		 * even on a non present pmd.
+		 */
+		if (pgprot_val(ref_prot) & _PAGE_PRESENT)
+			pgprot_val(ref_prot) |= _PAGE_PSE;
+		else
+			pgprot_val(ref_prot) &= ~_PAGE_PSE;
 	}
 #endif
 
+	/*
+	 * Set the GLOBAL flags only if the PRESENT flag is set
+	 * otherwise pmd/pte_present will return true even on a non
+	 * present pmd/pte. The canon_pgprot will clear _PAGE_GLOBAL
+	 * for the ancient hardware that doesn't support it.
+	 */
+	if (pgprot_val(ref_prot) & _PAGE_PRESENT)
+		pgprot_val(ref_prot) |= _PAGE_GLOBAL;
+	else
+		pgprot_val(ref_prot) &= ~_PAGE_GLOBAL;
+
 	/*
 	 * Get the target pfn from the original entry:
 	 */
 	pfn = pte_pfn(*kpte);
 	for (i = 0; i < PTRS_PER_PTE; i++, pfn += pfninc)
-		set_pte(&pbase[i], pfn_pte(pfn, ref_prot));
+		set_pte(&pbase[i], pfn_pte(pfn, canon_pgprot(ref_prot)));
 
 	if (address >= (unsigned long)__va(0) &&
 		address < (unsigned long)__va(max_low_pfn_mapped << PAGE_SHIFT))
@@ -659,6 +691,18 @@ repeat:
 
 		new_prot = static_protections(new_prot, address, pfn);
 
+		/*
+		 * Set the GLOBAL flags only if the PRESENT flag is
+		 * set otherwise pte_present will return true even on
+		 * a non present pte. The canon_pgprot will clear
+		 * _PAGE_GLOBAL for the ancient hardware that doesn't
+		 * support it.
+		 */
+		if (pgprot_val(new_prot) & _PAGE_PRESENT)
+			pgprot_val(new_prot) |= _PAGE_GLOBAL;
+		else
+			pgprot_val(new_prot) &= ~_PAGE_GLOBAL;
+
 		/*
 		 * We need to keep the pfn from the existing PTE,
 		 * after all we're only going to change it's attributes