Commit ac5b1481 authored by Prakash Surya, committed by Greg Kroah-Hartman

staging: lustre: osc: Track and limit "unstable" pages



This change adds a global counter to track the number of "unstable"
pages held by a given client, along with per file system counters. An
"unstable" page is defined as a page which has been sent to the server
as part of a bulk request, but is uncommitted to stable storage.

Beyond simply being tracked, unstable pages now also count towards the
maximum number of "pinned" pages on the system at any given time. Thus,
a client is now bounded in the number of dirty and unstable pages it
can pin in memory. Previously, only dirty pages were accounted for in
this limit.
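
The accounting described above can be sketched in user-space C. This is only an illustration under stated assumptions, not the code from this commit: the names `dirty_pages`, `unstable_pages`, `max_pinned_pages` and all helper functions are hypothetical stand-ins (the diff itself only shows that `obd_unstable_pages` is declared next to `obd_dirty_pages` and `obd_max_dirty_pages`), and the check-then-increment is not race-free.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical client-wide counters; illustrative names only. */
static atomic_long dirty_pages;
static atomic_long unstable_pages;
static long max_pinned_pages = 1024;	/* assumed limit, in pages */

/* Both dirty and unstable pages count against the same limit. */
static bool may_pin_one_more(void)
{
	long pinned = atomic_load(&dirty_pages) + atomic_load(&unstable_pages);

	return pinned < max_pinned_pages;
}

/* Try to dirty one page; refuse if the pinned-page limit is reached.
 * Sketch only: the check and the increment are not one atomic step. */
static bool page_dirtied(void)
{
	assert(max_pinned_pages > 0);
	if (!may_pin_one_more())
		return false;
	atomic_fetch_add(&dirty_pages, 1);
	return true;
}

/* The page went out in a bulk request: it stops being dirty but stays
 * pinned as "unstable" until the server commits it. */
static void page_sent(void)
{
	atomic_fetch_sub(&dirty_pages, 1);
	atomic_fetch_add(&unstable_pages, 1);
}

/* The server committed the page to stable storage: no longer pinned. */
static void page_committed(void)
{
	atomic_fetch_sub(&unstable_pages, 1);
}
```

In this sketch a page that has been sent but not committed keeps occupying a slot, so new pages cannot be dirtied until a commit arrives. That is the behavioral difference from counting dirty pages alone.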

In addition to tracking the number of unstable pages in Lustre, the
kernel's NR_UNSTABLE_NFS page counter is also incremented and
decremented, allowing easy monitoring via the "NFS_Unstable:" field in
/proc/meminfo. The kernel also uses this counter internally to limit
the total amount of unstable pages on the system.
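
For monitoring from a program rather than by eye, the "NFS_Unstable:" field can be extracted from /proc/meminfo text. The following is a minimal sketch; the function names `parse_nfs_unstable_kb` and `read_nfs_unstable_kb` are hypothetical helpers written for this illustration, and it assumes the usual "Name:   value kB" line layout of /proc/meminfo.

```c
#include <stdio.h>
#include <string.h>

/* Return the NFS_Unstable value in kB found in meminfo-formatted text,
 * or -1 if the field is absent or malformed. Hypothetical helper. */
static long parse_nfs_unstable_kb(const char *meminfo)
{
	const char *key = "NFS_Unstable:";
	const char *line = meminfo;

	while (line && *line) {
		if (strncmp(line, key, strlen(key)) == 0) {
			long kb;

			/* %ld skips the padding spaces before the number */
			if (sscanf(line + strlen(key), "%ld", &kb) == 1)
				return kb;
			return -1;
		}
		/* advance to the start of the next line, if any */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Read the field from a file such as /proc/meminfo, line by line. */
static long read_nfs_unstable_kb(const char *path)
{
	char line[256];
	long kb = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		kb = parse_nfs_unstable_kb(line);
		if (kb >= 0)
			break;
	}
	fclose(f);
	return kb;
}
```

Calling `read_nfs_unstable_kb("/proc/meminfo")` returns the current value in kB, or -1 on a system where the field is not reported.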

The motivation for this change is twofold. First, the client must not
allow itself to disconnect from an OST while still holding unstable
pages. Otherwise, these unstable pages can get lost due to an OST
failure, and replay is not possible due to the disconnect via unmount.
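
The diff below adds a wait queue that is used at unmount time and signaled on commit. The idea of blocking a disconnect until the unstable count reaches zero can be sketched in user space with POSIX threads. This is an analogy under stated assumptions, not the kernel code: `struct unstable_tracker` and its functions are hypothetical, and a mutex with a condition variable stands in for the kernel's atomic counter and wait queue.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical user-space analogue of a per-mount unstable counter
 * paired with a wait queue. */
struct unstable_tracker {
	pthread_mutex_t	lock;
	pthread_cond_t	zero;	/* signaled when nr drops to zero */
	long		nr;	/* pages sent but not yet committed */
};

static void tracker_init(struct unstable_tracker *t)
{
	pthread_mutex_init(&t->lock, NULL);
	pthread_cond_init(&t->zero, NULL);
	t->nr = 0;
}

/* A page was sent in a bulk request and is now unstable. */
static void tracker_inc(struct unstable_tracker *t)
{
	pthread_mutex_lock(&t->lock);
	t->nr++;
	pthread_mutex_unlock(&t->lock);
}

/* The server committed a page; wake waiters once none remain. */
static void tracker_dec(struct unstable_tracker *t)
{
	pthread_mutex_lock(&t->lock);
	assert(t->nr > 0);
	if (--t->nr == 0)
		pthread_cond_broadcast(&t->zero);
	pthread_mutex_unlock(&t->lock);
}

/* Called before disconnecting: block until no unstable pages remain. */
static void tracker_wait_zero(struct unstable_tracker *t)
{
	pthread_mutex_lock(&t->lock);
	while (t->nr != 0)
		pthread_cond_wait(&t->zero, &t->lock);
	pthread_mutex_unlock(&t->lock);
}
```

An unmount path in this model would call `tracker_wait_zero()` before tearing down the connection, while commit callbacks running in another thread call `tracker_dec()`.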

Secondly, the client needs a mechanism to prevent it from allocating too
much of its available RAM to unreclaimable pages pinned by the ptlrpc
layer. If this case occurs, out of memory events can trigger as a side
effect, which we need to avoid.

The current number of unstable pages accounted for on a per file system
granularity is exported by the unstable_stats proc file, contained under
each file system's llite namespace. An example of retrieving this
information is below:

	$ lctl get_param llite.*.unstable_stats

Signed-off-by: Prakash Surya <surya1@llnl.gov>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2139
Reviewed-on: http://review.whamcloud.com/6284
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
parent 7bbe9f83
+10 −0
@@ -2351,6 +2351,16 @@ struct cl_client_cache {
	 * Lock to protect ccc_lru list
	 */
	spinlock_t		ccc_lru_lock;
+	/**
+	 * # of unstable pages for this mount point
+	 */
+	atomic_t		ccc_unstable_nr;
+	/**
+	 * Waitq for awaiting unstable pages to reach zero.
+	 * Used at umounting time and signaled on BRW commit
+	 */
+        wait_queue_head_t	ccc_unstable_waitq;
+
};

/** @} cl_page */
+3 −1
@@ -1327,7 +1327,9 @@ struct ptlrpc_request {
		/* allow the req to be sent if the import is in recovery
		 * status
		 */
-		rq_allow_replay:1;
+		rq_allow_replay:1,
+		/* bulk request, sent to server, but uncommitted */
+		rq_unstable:1;

	unsigned int rq_nr_resend;

+1 −1
@@ -477,7 +477,7 @@ struct lov_obd {
	struct dentry		*lov_pool_debugfs_entry;
	enum lustre_sec_part    lov_sp_me;

-	/* Cached LRU pages from upper layer */
+	/* Cached LRU and unstable data from upper layer */
	void		       *lov_cache;

	struct rw_semaphore     lov_notify_lock;
+1 −0
@@ -58,6 +58,7 @@ extern int at_early_margin;
extern int at_extra;
extern unsigned int obd_sync_filter;
extern unsigned int obd_max_dirty_pages;
+extern atomic_t obd_unstable_pages;
extern atomic_t obd_dirty_pages;
extern atomic_t obd_dirty_transit_pages;
extern char obd_jobid_var[];
+6 −0
@@ -491,6 +491,12 @@ struct ll_sb_info {

	struct lprocfs_stats     *ll_stats; /* lprocfs stats counter */

+	/*
+	 * Used to track "unstable" pages on a client, and maintain a
+	 * LRU list of clean pages. An "unstable" page is defined as
+	 * any page which is sent to a server as part of a bulk request,
+	 * but is uncommitted to stable storage.
+	 */
	struct cl_client_cache    ll_cache;

	struct lprocfs_stats     *ll_ra_stats;