
Commit 382c2fe9 authored by Rik van Riel, committed by Ingo Molnar

sched, time: Remove non-power-of-two divides from __acct_update_integrals()



When running a microbenchmark calling an invalid syscall number
in a loop, on a nohz_full CPU, we spend a full 9% of our CPU
time in __acct_update_integrals().

This function converts cputime_t to jiffies, to a timeval, only to
convert the timeval back to microseconds before discarding it.
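The round trip described above can be sketched in userspace C. This is a minimal illustration, not the kernel code: it assumes cputime is already in nanoseconds and HZ is 1000, and `struct timeval_sketch`, `old_delta_usec()`, `old_skips()` and `new_skips()` are hypothetical names made up for this example.

```c
#include <stdint.h>

/* Assumed values for illustration; the kernel's HZ is config-dependent. */
#define HZ           1000ULL
#define NSEC_PER_SEC 1000000000ULL
#define USEC_PER_SEC 1000000ULL
#define TICK_NSEC    (NSEC_PER_SEC / HZ)

/* Hypothetical stand-in for struct timeval. */
struct timeval_sketch {
	uint64_t tv_sec;
	uint64_t tv_usec;
};

/* Old path: nanoseconds -> jiffies -> timeval -> microseconds. */
uint64_t old_delta_usec(uint64_t dtime_ns)
{
	struct timeval_sketch value;
	uint64_t jiffies = dtime_ns / TICK_NSEC;	/* first divide */

	value.tv_sec = jiffies / HZ;			/* second divide */
	value.tv_usec = (jiffies % HZ) * (USEC_PER_SEC / HZ);

	/* The timeval is flattened right back into one number. */
	return value.tv_sec * USEC_PER_SEC + value.tv_usec;
}

/* Old early-exit test: nothing to account if less than one jiffy elapsed. */
int old_skips(uint64_t dtime_ns)
{
	return old_delta_usec(dtime_ns) == 0;
}

/* New early-exit test: same condition, expressed without any division. */
int new_skips(uint64_t dtime_ns)
{
	return dtime_ns < TICK_NSEC;
}
```

Under these assumptions both tests skip the update for exactly the same inputs, which is why the `delta == 0` check can become `delta < TICK_NSEC` once the delta is kept in nanoseconds.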

This patch leaves __acct_update_integrals() functionally equivalent,
but speeds things up by about 12%, with 10 million calls to an
invalid syscall number dropping from 3.7 to 3.25 seconds.
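As a quick check of the quoted figures, using only the numbers given above (10 million calls, 3.7 s before, 3.25 s after):

```latex
\[
\frac{3.7 - 3.25}{3.7} \approx 0.12,
\qquad
\frac{3.7\,\mathrm{s}}{10^{7}} = 370\,\mathrm{ns}
\;\longrightarrow\;
\frac{3.25\,\mathrm{s}}{10^{7}} = 325\,\mathrm{ns}
\]
```

That is a saving of roughly 45 ns per call.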

Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: clark@redhat.com
Cc: eric.dumazet@gmail.com
Cc: fweisbec@gmail.com
Cc: luto@amacapital.net
Link: http://lkml.kernel.org/r/1455152907-18495-2-git-send-email-riel@redhat.com


Signed-off-by: Ingo Molnar <mingo@kernel.org>
parent c3a990dc
+16 −10
@@ -93,9 +93,11 @@ void xacct_add_tsk(struct taskstats *stats, struct task_struct *p)
 {
 	struct mm_struct *mm;
 
-	/* convert pages-usec to Mbyte-usec */
-	stats->coremem = p->acct_rss_mem1 * PAGE_SIZE / MB;
-	stats->virtmem = p->acct_vm_mem1 * PAGE_SIZE / MB;
+	/* convert pages-nsec/1024 to Mbyte-usec, see __acct_update_integrals */
+	stats->coremem = p->acct_rss_mem1 * PAGE_SIZE;
+	do_div(stats->coremem, 1000 * KB);
+	stats->virtmem = p->acct_vm_mem1 * PAGE_SIZE;
+	do_div(stats->virtmem, 1000 * KB);
 	mm = get_task_mm(p);
 	if (mm) {
 		/* adjust to KB unit */
@@ -125,22 +127,26 @@ static void __acct_update_integrals(struct task_struct *tsk,
 {
 	if (likely(tsk->mm)) {
 		cputime_t time, dtime;
-		struct timeval value;
 		unsigned long flags;
 		u64 delta;
 
 		local_irq_save(flags);
 		time = stime + utime;
 		dtime = time - tsk->acct_timexpd;
-		jiffies_to_timeval(cputime_to_jiffies(dtime), &value);
-		delta = value.tv_sec;
-		delta = delta * USEC_PER_SEC + value.tv_usec;
+		/* Avoid division: cputime_t is often in nanoseconds already. */
+		delta = cputime_to_nsecs(dtime);
 
-		if (delta == 0)
+		if (delta < TICK_NSEC)
 			goto out;
+
 		tsk->acct_timexpd = time;
-		tsk->acct_rss_mem1 += delta * get_mm_rss(tsk->mm);
-		tsk->acct_vm_mem1 += delta * tsk->mm->total_vm;
+		/*
+		 * Divide by 1024 to avoid overflow, and to avoid division.
+		 * The final unit reported to userspace is Mbyte-usecs,
+		 * the rest of the math is done in xacct_add_tsk.
+		 */
+		tsk->acct_rss_mem1 += delta * get_mm_rss(tsk->mm) >> 10;
+		tsk->acct_vm_mem1 += delta * tsk->mm->total_vm >> 10;
 	out:
 		local_irq_restore(flags);
 	}
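
The unit bookkeeping split between the two hunks can be checked with a small userspace sketch. It is an illustration under stated assumptions, not kernel code: a 4 KiB page size is assumed, and `PAGE_SIZE_SKETCH`, `accumulate()`, `to_mbyte_usec()` and `reference_mbyte_usec()` are hypothetical names. The accumulator holds pages × nsec / 1024; multiplying by the page size and dividing by 1000 × 1024 then yields Mbyte-usec, the same unit the old pages-usec accumulator produced after dividing by MB.

```c
#include <stdint.h>

#define KB               1024ULL
#define MB               (1024ULL * KB)
#define PAGE_SIZE_SKETCH 4096ULL	/* assumed 4 KiB pages */

/* New accumulation step: pages * nsec, shifted right by 10 (i.e. / 1024). */
uint64_t accumulate(uint64_t acc, uint64_t delta_ns, uint64_t pages)
{
	return acc + ((delta_ns * pages) >> 10);
}

/* New reporting step: (pages * nsec / 1024) * bytes-per-page / (1000 * 1024). */
uint64_t to_mbyte_usec(uint64_t acc)
{
	return acc * PAGE_SIZE_SKETCH / (1000 * KB);
}

/* Old formula for comparison: pages * usec * bytes-per-page / MB. */
uint64_t reference_mbyte_usec(uint64_t delta_usec, uint64_t pages)
{
	return delta_usec * pages * PAGE_SIZE_SKETCH / MB;
}
```

For example, 256 pages (1 Mbyte with 4 KiB pages) resident for 1,000,000 ns accumulate to 250,000, which `to_mbyte_usec()` reports as 1000 Mbyte-usec, matching `reference_mbyte_usec(1000, 256)`. The only division left in the hot path is the shift; the division by a non-power-of-two happens once, at reporting time.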