
Commit ca1136c9 authored by Shaohua Li, committed by Jens Axboe

blktrace: export cgroup info in trace



Currently blktrace isn't cgroup aware. blktrace prints the task name of
the current context, but that task isn't always in the cgroup the BIO
comes from, so the task name can't be used to find the IO cgroup. For
example, writeback BIOs always come from the flusher thread, but the
BIOs belong to different blk cgroups. A request can be requeued and
dispatched from a completely different task. MD/DM are further examples.

This patch tries to fix the gap. We print out cgroup fhandle info in
blktrace. Userspace can use the open_by_handle_at() syscall to find the
cgroup by fhandle. Alternatively, userspace can use the
name_to_handle_at() syscall to find the fhandle for a cgroup and use a
BPF program to filter blktrace output for that specific cgroup.
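
As a rough illustration of the name_to_handle_at() direction described
above, the sketch below asks the kernel for the file handle of a cgroup
directory. The helper name cgroup_dir_to_handle() and the idea of passing
a directory under the cgroup mount point are assumptions made for this
example; how the handle bytes correspond to the identifier written into
the trace is not shown in this excerpt.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of this patch): return a malloc()ed
 * file handle for the directory `cgroup_dir`, or NULL on failure.
 * `*mount_id` receives the mount ID reported by the kernel.
 */
struct file_handle *cgroup_dir_to_handle(const char *cgroup_dir, int *mount_id)
{
	struct file_handle *fh;

	assert(cgroup_dir != NULL && mount_id != NULL);

	/* Reserve room for the largest handle the kernel may return. */
	fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
	if (fh == NULL)
		return NULL;
	fh->handle_bytes = MAX_HANDLE_SZ;

	/* On success the kernel fills handle_bytes, handle_type, f_handle. */
	if (name_to_handle_at(AT_FDCWD, cgroup_dir, fh, mount_id, 0) != 0) {
		free(fh);
		return NULL;
	}
	return fh;
}
```

The caller owns the returned buffer and releases it with free(). The
opposite direction, open_by_handle_at(), additionally requires the
CAP_DAC_READ_SEARCH capability.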

We add a new 'blk_cgroup' trace option for the blk tracer. It is off by
default, so applications that don't know about the new option aren't
affected. When it's on, we output fhandle info right after blk_io_trace,
with an extra bit set in the event action. So from an application's
point of view, blktrace with the option enabled outputs new actions.

I haven't changed the blk trace event yet, since I'm not sure whether
changing the trace event output is an ABI issue. If not, I'll do it later.

Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
parent 121508df
+3 −0
@@ -52,6 +52,7 @@ enum blktrace_act {
 	__BLK_TA_REMAP,			/* bio was remapped */
 	__BLK_TA_ABORT,			/* request aborted */
 	__BLK_TA_DRV_DATA,		/* driver-specific binary data */
+	__BLK_TA_CGROUP = 1 << 8,	/* from a cgroup*/
 };

 /*
@@ -61,6 +62,7 @@ enum blktrace_notify {
 	__BLK_TN_PROCESS = 0,		/* establish pid/name mapping */
 	__BLK_TN_TIMESTAMP,		/* include system clock */
 	__BLK_TN_MESSAGE,		/* Character string message */
+	__BLK_TN_CGROUP = __BLK_TA_CGROUP, /* from a cgroup */
 };


@@ -107,6 +109,7 @@ struct blk_io_trace {
 	__u32 cpu;		/* on what cpu did it happen */
 	__u16 error;		/* completion error */
 	__u16 pdu_len;		/* length of data after this trace */
+	/* cgroup id will be stored here if exists */
 };

 /*
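
To illustrate the layout hinted at by the new comment in struct
blk_io_trace, the sketch below reads the identifier that follows the
header when the cgroup bit is set. Several things here are assumptions,
because the producing code is in the collapsed file: the identifier is
taken to be 8 bytes in host byte order, the fields before `cpu` are
copied from the upstream header rather than from this hunk, and the
names trace_header and read_cgroup_id() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct blk_io_trace (fields before cpu are assumed). */
struct trace_header {
	uint32_t magic;
	uint32_t sequence;
	uint64_t time;
	uint64_t sector;
	uint32_t bytes;
	uint32_t action;
	uint32_t pid;
	uint32_t device;
	uint32_t cpu;		/* on what cpu did it happen */
	uint16_t error;		/* completion error */
	uint16_t pdu_len;	/* length of data after this trace */
};

#define TA_CGROUP_BIT	(1u << 8)	/* __BLK_TA_CGROUP from this commit */
#define CGROUP_ID_LEN	8u		/* assumption: 8-byte identifier */

/*
 * Inspect one record starting at `rec` with `len` readable bytes.
 * Returns 1 and fills *id_out if the cgroup bit is set, 0 if it is not,
 * and -1 if the buffer is too short.
 */
int read_cgroup_id(const unsigned char *rec, size_t len, uint64_t *id_out)
{
	struct trace_header hdr;

	assert(rec != NULL && id_out != NULL);
	if (len < sizeof(hdr))
		return -1;
	memcpy(&hdr, rec, sizeof(hdr));	/* copy to avoid unaligned access */
	if (!(hdr.action & TA_CGROUP_BIT))
		return 0;
	if (len - sizeof(hdr) < CGROUP_ID_LEN)
		return -1;
	memcpy(id_out, rec + sizeof(hdr), CGROUP_ID_LEN);
	return 1;
}
```

A reader of the binary trace stream would call this once per record and
only then interpret any remaining payload bytes.
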
+158 −73

File changed.

Preview size limit exceeded, changes collapsed.