Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 8b283f35 authored by Joonsoo Kim's avatar Joonsoo Kim Committed by Vinayak Menon
Browse files

mm/page_alloc: fix incorrect isolation behavior by rechecking migratetype



There are two paths to reach core free function of buddy allocator,
__free_one_page(), one is free_one_page()->__free_one_page() and the
other is free_hot_cold_page()->free_pcppages_bulk()->__free_one_page().
Each paths has race condition causing serious problems. At first, this
patch is focused on first type of freepath. And then, following patch
will solve the problem in second type of freepath.

In the first type of freepath, we got migratetype of freeing page without
holding the zone lock, so it could be racy. There are two cases of this
race.

1. pages are added to isolate buddy list after restoring orignal
migratetype

CPU1                                   CPU2

get migratetype => return MIGRATE_ISOLATE
call free_one_page() with MIGRATE_ISOLATE

				grab the zone lock
				unisolate pageblock
				release the zone lock

grab the zone lock
call __free_one_page() with MIGRATE_ISOLATE
freepage go into isolate buddy list,
although pageblock is already unisolated

This may cause two problems. One is that we can't use this page anymore
until next isolation attempt of this pageblock, because freepage is on
isolate buddy list. The other is that freepage accouting could be wrong
due to merging between different buddy list. Freepages on isolate buddy
list aren't counted as freepage, but ones on normal buddy list are counted
as freepage. If merge happens, buddy freepage on normal buddy list is
inevitably moved to isolate buddy list without any consideration of
freepage accouting so it could be incorrect.

2. pages are added to normal buddy list while pageblock is isolated.
It is similar with above case.

This also may cause two problems. One is that we can't keep these
freepages from being allocated. Although this pageblock is isolated,
freepage would be added to normal buddy list so that it could be
allocated without any restriction. And the other problem is same as
case 1, that it, incorrect freepage accouting.

This race condition would be prevented by checking migratetype again
with holding the zone lock. Because it is somewhat heavy operation
and it isn't needed in common case, we want to avoid rechecking as much
as possible. So this patch introduce new variable, nr_isolate_pageblock
in struct zone to check if there is isolated pageblock.
With this, we can avoid to re-check migratetype in common case and do
it only if there is isolated pageblock or migratetype is MIGRATE_ISOLATE.
This solve above mentioned problems.

Cc: <stable@vger.kernel.org>
Acked-by: default avatarMinchan Kim <minchan@kernel.org>
Acked-by: default avatarMichal Nazarewicz <mina86@mina86.com>
Acked-by: default avatarVlastimil Babka <vbabka@suse.cz>
Signed-off-by: default avatarJoonsoo Kim <iamjoonsoo.kim@lge.com>
Patch-mainline: linux-kernel @ 10/31/2014 16:25:27
[vinmenon@codeaurora.org: get_pfnblock_migratetype replaced by
get_pageblock_migratetype plus fixed merge conflicts]
Signed-off-by: default avatarVinayak Menon <vinmenon@codeaurora.org>
CRs-fixed: 720761
Change-Id: I8b4a47b8986900f9e0f5364692b5914d19bf189c
parent e1b54da6
Loading
Loading
Loading
Loading
+10 −0
Original line number Diff line number Diff line
@@ -370,6 +370,16 @@ struct zone {
	unsigned long		compact_cached_free_pfn;
	unsigned long		compact_cached_migrate_pfn;
#endif

#ifdef CONFIG_MEMORY_ISOLATION
	/*
	 * Number of isolated pageblock. It is used to solve incorrect
	 * freepage counting problem due to racy retrieving migratetype
	 * of pageblock. Protected by zone->lock.
	 */
	unsigned long           nr_isolate_pageblock;
#endif

#ifdef CONFIG_MEMORY_HOTPLUG
	/* see spanned/present_pages for more description */
	seqlock_t		span_seqlock;
+8 −0
Original line number Diff line number Diff line
@@ -2,6 +2,10 @@
#define __LINUX_PAGEISOLATION_H

#ifdef CONFIG_MEMORY_ISOLATION
static inline bool has_isolate_pageblock(struct zone *zone)
{
	return zone->nr_isolate_pageblock;
}
static inline bool is_migrate_isolate_page(struct page *page)
{
	return get_pageblock_migratetype(page) == MIGRATE_ISOLATE;
@@ -11,6 +15,10 @@ static inline bool is_migrate_isolate(int migratetype)
	return migratetype == MIGRATE_ISOLATE;
}
#else
static inline bool has_isolate_pageblock(struct zone *zone)
{
	return false;
}
static inline bool is_migrate_isolate_page(struct page *page)
{
	return false;
+9 −2
Original line number Diff line number Diff line
@@ -718,9 +718,16 @@ static void free_one_page(struct zone *zone, struct page *page, int order,
	spin_lock(&zone->lock);
	zone->pages_scanned = 0;

	__free_one_page(page, zone, order, migratetype);
	if (unlikely(!is_migrate_isolate(migratetype)))
	if (unlikely(has_isolate_pageblock(zone) ||
		is_migrate_isolate(migratetype))) {
		migratetype = get_pageblock_migratetype(page);
		if (is_migrate_isolate(migratetype))
			goto skip_counting;
	}
	__mod_zone_freepage_state(zone, 1 << order, migratetype);

skip_counting:
	__free_one_page(page, zone, order, migratetype);
	spin_unlock(&zone->lock);
}

+2 −0
Original line number Diff line number Diff line
@@ -59,6 +59,7 @@ out:
		int migratetype = get_pageblock_migratetype(page);

		set_pageblock_migratetype(page, MIGRATE_ISOLATE);
		zone->nr_isolate_pageblock++;
		nr_pages = move_freepages_block(zone, page, MIGRATE_ISOLATE);

		__mod_zone_freepage_state(zone, -nr_pages, migratetype);
@@ -82,6 +83,7 @@ void unset_migratetype_isolate(struct page *page, unsigned migratetype)
	nr_pages = move_freepages_block(zone, page, migratetype);
	__mod_zone_freepage_state(zone, nr_pages, migratetype);
	set_pageblock_migratetype(page, migratetype);
	zone->nr_isolate_pageblock--;
out:
	spin_unlock_irqrestore(&zone->lock, flags);
}