Merge libata upstream (which includes C/H/S support) include irq-pio branch. (0fbbbf2b)

Documentation/Changes

+10 −0

Original line number	Diff line number	Diff line
		@@ -237,6 +237,12 @@ udev
		udev is a userspace application for populating /dev dynamically with
		only entries for devices actually present. udev replaces devfs.

		FUSE
		----

		Needs libfuse 2.4.0 or later. Absolute minimum is 2.3.0 but mount
		options 'direct_io' and 'kernel_cache' won't work.

		Networking
		==========

		@@ -390,6 +396,10 @@ udev
		----
		o <http://www.kernel.org/pub/linux/utils/kernel/hotplug/udev.html>

		FUSE
		----
		o <http://sourceforge.net/projects/fuse>

		Networking
		**********

Documentation/DocBook/libata.tmpl

+356 −0

		@@ -415,6 +415,362 @@ and other resources, etc.
		</sect1>
		</chapter>

		<chapter id="libataEH">
		<title>Error handling</title>

		<para>
		This chapter describes how errors are handled under libata.
		Readers are advised to read SCSI EH
		(Documentation/scsi/scsi_eh.txt) and ATA exceptions doc first.
		</para>

		<sect1><title>Origins of commands</title>
		<para>
		In libata, a command is represented with struct ata_queued_cmd
		or qc. qc's are preallocated during port initialization and
		repetitively used for command executions. Currently only one
		qc is allocated per port, but the yet-to-be-merged NCQ branch
		allocates one for each tag and maps each qc to NCQ tag 1-to-1.
		</para>
		<para>
		libata commands can originate from two sources - libata itself
		and SCSI midlayer. libata internal commands are used for
		initialization and error handling. All normal blk requests
		and commands for SCSI emulation are passed as SCSI commands
		through queuecommand callback of SCSI host template.
		</para>
		</sect1>

		<sect1><title>How commands are issued</title>

		<variablelist>

		<varlistentry><term>Internal commands</term>
		<listitem>
		<para>
		First, qc is allocated and initialized using
		ata_qc_new_init(). Although ata_qc_new_init() doesn't
		implement any wait or retry mechanism when qc is not
		available, internal commands are currently issued only during
		initialization and error recovery, so no other command is
		active and allocation is guaranteed to succeed.
		</para>
		<para>
		Once allocated, the qc's taskfile is initialized for the command
		to be executed. qc currently has two mechanisms to notify
		completion. One is via qc->complete_fn() callback and the
		other is completion qc->waiting. qc->complete_fn() callback
		is the asynchronous path used by normal SCSI translated
		commands and qc->waiting is the synchronous (issuer sleeps in
		process context) path used by internal commands.
		</para>
		<para>
		Once initialization is complete, host_set lock is acquired
		and the qc is issued.
		</para>
		</listitem>
		</varlistentry>

		<varlistentry><term>SCSI commands</term>
		<listitem>
		<para>
		All libata drivers use ata_scsi_queuecmd() as
		hostt->queuecommand callback. scmds can either be simulated
		or translated. No qc is involved in processing a simulated
		scmd. The result is computed right away and the scmd is
		completed.
		</para>
		<para>
		For a translated scmd, ata_qc_new_init() is invoked to
		allocate a qc and the scmd is translated into the qc. SCSI
		midlayer's completion notification function pointer is stored
		into qc->scsidone.
		</para>
		<para>
		qc->complete_fn() callback is used for completion
		notification. ATA commands use ata_scsi_qc_complete() while
		ATAPI commands use atapi_qc_complete(). Both functions end up
		calling qc->scsidone to notify upper layer when the qc is
		finished. After translation is completed, the qc is issued
		with ata_qc_issue().
		</para>
		<para>
		Note that SCSI midlayer invokes hostt->queuecommand while
		holding host_set lock, so all above occur while holding
		host_set lock.
		</para>
		</listitem>
		</varlistentry>

		</variablelist>
		</sect1>

		<sect1><title>How commands are processed</title>
		<para>
		Depending on which protocol and which controller are used,
		commands are processed differently. For the purpose of
		discussion, a controller which uses taskfile interface and all
		standard callbacks is assumed.
		</para>
		<para>
		Currently 6 ATA command protocols are used. They can be
		sorted into the following four categories according to how
		they are processed.
		</para>

		<variablelist>
		<varlistentry><term>ATA NO DATA or DMA</term>
		<listitem>
		<para>
		ATA_PROT_NODATA and ATA_PROT_DMA fall into this category.
		These types of commands don't require any software
		intervention once issued. Device will raise interrupt on
		completion.
		</para>
		</listitem>
		</varlistentry>

		<varlistentry><term>ATA PIO</term>
		<listitem>
		<para>
		ATA_PROT_PIO is in this category. libata currently
		implements PIO with polling. ATA_NIEN bit is set to turn
		off interrupt and pio_task on ata_wq performs polling and
		IO.
		</para>
		</listitem>
		</varlistentry>

		<varlistentry><term>ATAPI NODATA or DMA</term>
		<listitem>
		<para>
		ATA_PROT_ATAPI_NODATA and ATA_PROT_ATAPI_DMA are in this
		category. packet_task is used to poll BSY bit after
		issuing PACKET command. Once BSY is turned off by the
		device, packet_task transfers CDB and hands off processing
		to interrupt handler.
		</para>
		</listitem>
		</varlistentry>

		<varlistentry><term>ATAPI PIO</term>
		<listitem>
		<para>
		ATA_PROT_ATAPI is in this category. ATA_NIEN bit is set
		and, as in ATAPI NODATA or DMA, packet_task submits cdb.
		However, after submitting cdb, further processing (data
		transfer) is handed off to pio_task.
		</para>
		</listitem>
		</varlistentry>
		</variablelist>
		</sect1>
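The grouping above can be summarised as a lookup. This is a minimal sketch, not libata code: the ATA_PROT_* names come from the text, but the enum values, the proc_category type and classify_protocol() are hypothetical and exist only for illustration.

```c
/* Illustrative only: names mirror the text; values and helpers are assumed. */
enum ata_prot_sketch {
	ATA_PROT_NODATA,
	ATA_PROT_DMA,
	ATA_PROT_PIO,
	ATA_PROT_ATAPI_NODATA,
	ATA_PROT_ATAPI_DMA,
	ATA_PROT_ATAPI,
};

enum proc_category {
	CAT_ATA_NODATA_OR_DMA,   /* no software intervention, interrupt on completion */
	CAT_ATA_PIO,             /* ATA_NIEN set, pio_task polls and does the IO */
	CAT_ATAPI_NODATA_OR_DMA, /* packet_task sends the CDB, then interrupt handler */
	CAT_ATAPI_PIO,           /* packet_task sends the CDB, then pio_task */
};

/* Map each of the six protocols onto one of the four processing categories. */
enum proc_category classify_protocol(enum ata_prot_sketch prot)
{
	switch (prot) {
	case ATA_PROT_NODATA:
	case ATA_PROT_DMA:
		return CAT_ATA_NODATA_OR_DMA;
	case ATA_PROT_PIO:
		return CAT_ATA_PIO;
	case ATA_PROT_ATAPI_NODATA:
	case ATA_PROT_ATAPI_DMA:
		return CAT_ATAPI_NODATA_OR_DMA;
	case ATA_PROT_ATAPI:
		return CAT_ATAPI_PIO;
	}
	return CAT_ATA_NODATA_OR_DMA; /* not reached for the six values above */
}
```

For example, classify_protocol(ATA_PROT_ATAPI) yields CAT_ATAPI_PIO, the only category in which both packet_task and pio_task take part.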

		<sect1><title>How commands are completed</title>
		<para>
		Once issued, all qc's are either completed with
		ata_qc_complete() or time out. For commands which are handled
		by interrupts, ata_host_intr() invokes ata_qc_complete(), and,
		for PIO tasks, pio_task invokes ata_qc_complete(). In error
		cases, packet_task may also complete commands.
		</para>
		<para>
		ata_qc_complete() does the following.
		</para>

		<orderedlist>

		<listitem>
		<para>
		DMA memory is unmapped.
		</para>
		</listitem>

		<listitem>
		<para>
		ATA_QCFLAG_ACTIVE is cleared from qc->flags.
		</para>
		</listitem>

		<listitem>
		<para>
		qc->complete_fn() callback is invoked. If the return value of
		the callback is not zero, completion is short-circuited and
		ata_qc_complete() returns.
		</para>
		</listitem>

		<listitem>
		<para>
		__ata_qc_complete() is called, which does
		<orderedlist>

		<listitem>
		<para>
		qc->flags is cleared to zero.
		</para>
		</listitem>

		<listitem>
		<para>
		ap->active_tag and qc->tag are poisoned.
		</para>
		</listitem>

		<listitem>
		<para>
		qc->waiting is cleared and completed (in that order).
		</para>
		</listitem>

		<listitem>
		<para>
		qc is deallocated by clearing appropriate bit in ap->qactive.
		</para>
		</listitem>

		</orderedlist>
		</para>
		</listitem>

		</orderedlist>

		<para>
		So, it basically notifies upper layer and deallocates qc. One
		exception is short-circuit path in #3 which is used by
		atapi_qc_complete().
		</para>
		<para>
		For all non-ATAPI commands, whether it fails or not, almost
		the same code path is taken and very little error handling
		takes place. A qc is completed with success status if it
		succeeded, with failed status otherwise.
		</para>
		<para>
		However, failed ATAPI commands require more handling as
		REQUEST SENSE is needed to acquire sense data. If an ATAPI
		command fails, ata_qc_complete() is invoked with error status,
		which in turn invokes atapi_qc_complete() via
		qc->complete_fn() callback.
		</para>
		<para>
		This makes atapi_qc_complete() set scmd->result to
		SAM_STAT_CHECK_CONDITION, complete the scmd and return 1. As
		the sense data is empty but scmd->result is CHECK CONDITION,
		SCSI midlayer will invoke EH for the scmd, and returning 1
		makes ata_qc_complete() return without deallocating the qc.
		This leads us to ata_scsi_error() with partially completed qc.
		</para>

		</sect1>
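The completion steps and the short-circuit path described above can be modelled in a few lines. This is a userspace sketch under stated assumptions, not the libata implementation: struct qc_sketch, the sketch_* functions and the bit value of ATA_QCFLAG_ACTIVE are all hypothetical, and the tag poisoning and qc->waiting handling are left out for brevity.

```c
/* Userspace model of the completion steps; not the libata implementation. */
#define ATA_QCFLAG_ACTIVE 1u /* assumed bit value */

struct qc_sketch {
	unsigned int flags;
	int dma_mapped;
	int allocated; /* stands in for the qc's bit in ap->qactive */
	int (*complete_fn)(struct qc_sketch *qc);
};

/* Plays the role of __ata_qc_complete(): wipe the flags and free the qc. */
void sketch_qc_free(struct qc_sketch *qc)
{
	qc->flags = 0;
	qc->allocated = 0;
}

/* Plays the role of ata_qc_complete(); returns 1 if short-circuited. */
int sketch_qc_complete(struct qc_sketch *qc)
{
	qc->dma_mapped = 0;              /* step 1: unmap DMA memory */
	qc->flags &= ~ATA_QCFLAG_ACTIVE; /* step 2: clear the active flag */
	if (qc->complete_fn && qc->complete_fn(qc) != 0)
		return 1;                /* step 3: callback keeps the qc for EH */
	sketch_qc_free(qc);              /* step 4: deallocate the qc */
	return 0;
}

/* Callback for a command that needs no further handling. */
int sketch_ok_complete(struct qc_sketch *qc)
{
	(void)qc;
	return 0;
}

/* Callback modelling a failed ATAPI command: ask the caller to keep the qc. */
int sketch_atapi_failed_complete(struct qc_sketch *qc)
{
	(void)qc;
	return 1;
}
```

With sketch_atapi_failed_complete() installed, sketch_qc_complete() returns 1 and the qc stays allocated, which corresponds to the partially completed qc that EH later frees explicitly.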

		<sect1><title>ata_scsi_error()</title>
		<para>
		ata_scsi_error() is the current hostt->eh_strategy_handler()
		for libata. As discussed above, this will be entered in two
		cases - timeout and ATAPI error completion. This function
		calls low level libata driver's eng_timeout() callback, the
		standard callback for which is ata_eng_timeout(). It checks
		if a qc is active and calls ata_qc_timeout() on the qc if so.
		Actual error handling occurs in ata_qc_timeout().
		</para>
		<para>
		If EH is invoked for timeout, ata_qc_timeout() stops BMDMA and
		completes the qc. Note that as we're currently in EH, we
		cannot call scsi_done. As described in SCSI EH doc, a
		recovered scmd should be either retried with
		scsi_queue_insert() or finished with scsi_finish_command().
		Here, we override qc->scsidone with scsi_finish_command() and
		call ata_qc_complete().
		</para>
		<para>
		If EH is invoked due to a failed ATAPI qc, the qc here is
		completed but not deallocated. The purpose of this
		half-completion is to use the qc as a placeholder to make EH
		code reach this place. This is a bit hackish, but it works.
		</para>
		<para>
		Once control reaches here, the qc is deallocated by invoking
		__ata_qc_complete() explicitly. Then, internal qc for REQUEST
		SENSE is issued. Once sense data is acquired, scmd is
		finished by directly invoking scsi_finish_command() on the
		scmd. Note that as we already have completed and deallocated
		the qc which was associated with the scmd, we don't need
		to/cannot call ata_qc_complete() again.
		</para>

		</sect1>

		<sect1><title>Problems with the current EH</title>

		<itemizedlist>

		<listitem>
		<para>
		Error representation is too crude. Currently any and all
		error conditions are represented with ATA STATUS and ERROR
		registers. Errors which aren't ATA device errors are treated
		as ATA device errors by setting ATA_ERR bit. Better error
		descriptor which can properly represent ATA and other
		errors/exceptions is needed.
		</para>
		</listitem>

		<listitem>
		<para>
		When handling timeouts, no action is taken to make the device
		forget about the timed-out command and become ready for new
		commands.
		</para>
		</listitem>

		<listitem>
		<para>
		EH handling via ata_scsi_error() is not properly protected
		from usual command processing. On EH entrance, the device is
		not in quiescent state. Timed out commands may succeed or
		fail any time. pio_task and packet_task may still be running.
		</para>
		</listitem>

		<listitem>
		<para>
		Too weak error recovery. Devices / controllers causing HSM
		mismatch errors and other errors quite often require reset to
		return to known state. Also, advanced error handling is
		necessary to support features like NCQ and hotplug.
		</para>
		</listitem>

		<listitem>
		<para>
		ATA errors are directly handled in the interrupt handler and
		PIO errors in pio_task. This is problematic for advanced
		error handling for the following reasons.
		</para>
		<para>
		First, advanced error handling often requires context and
		internal qc execution.
		</para>
		<para>
		Second, even a simple failure (say, CRC error) needs
		information gathering and could trigger complex error handling
		(say, resetting and reconfiguring). Having multiple code
		paths to gather information, enter EH and trigger actions
		makes life painful.
		</para>
		<para>
		Third, scattered EH code makes implementing low level drivers
		difficult. Low level drivers override libata callbacks. If
		EH is scattered over several places, each affected callback
		should perform its part of error handling. This can be error
		prone and painful.
		</para>
		</listitem>

		</itemizedlist>
		</sect1>
		</chapter>

		<chapter id="libataExt">
		<title>libata Library</title>
		!Edrivers/scsi/libata-core.c

Documentation/SubmittingPatches

+69 −1

		@@ -301,8 +301,68 @@ now, but you can do this to mark internal company procedures or just
		point out some special detail about the sign-off.


		12) The canonical patch format

		12) More references for submitting patches
		The canonical patch subject line is:

		Subject: [PATCH 001/123] [<area>:] <explanation>

		The canonical patch message body contains the following:

		- A "from" line specifying the patch author.

		- An empty line.

		- The body of the explanation, which will be copied to the
		permanent changelog to describe this patch.

		- The "Signed-off-by:" lines, described above, which will
		also go in the changelog.

		- A marker line containing simply "---".

		- Any additional comments not suitable for the changelog.

		- The actual patch (diff output).

		The Subject line format makes it very easy to sort the emails
		alphabetically by subject line - pretty much any email reader will
		support that - because the sequence number is zero-padded,
		the numerical and alphabetic sort is the same.
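The effect of zero-padding can be shown with a small formatter. This is a sketch only: format_patch_subject() and digit_count() are hypothetical helpers, and padding the sequence number to the width of the total is an assumption drawn from the "001/123" example.

```c
#include <stdio.h>

/* Count decimal digits so the sequence number can be padded to match the total. */
static int digit_count(int n)
{
	int digits = 1;

	while (n >= 10) {
		n /= 10;
		digits++;
	}
	return digits;
}

/*
 * Write "Subject: [PATCH 001/123] area: explanation" into buf.
 * Returns what snprintf() returns: the length that was (or would be) written.
 */
int format_patch_subject(char *buf, size_t size, int seq, int total,
			 const char *area, const char *explanation)
{
	return snprintf(buf, size, "Subject: [PATCH %0*d/%d] %s: %s",
			digit_count(total), seq, total, area, explanation);
}
```

Patches 2 and 10 of 123 come out as "002/123" and "010/123", so a plain alphabetic sort places them in numeric order; unpadded, "10/123" would sort before "2/123".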

		See further details on how to phrase the "<explanation>" in the
		"Subject:" line in Andrew Morton's "The perfect patch", referenced
		below.

		The "from" line must be the very first line in the message body,
		and has the form:

		From: Original Author <author@example.com>

		The "from" line specifies who will be credited as the author of the
		patch in the permanent changelog. If the "from" line is missing,
		then the "From:" line from the email header will be used to determine
		the patch author in the changelog.

		The explanation body will be committed to the permanent source
		changelog, so should make sense to a competent reader who has long
		since forgotten the immediate details of the discussion that might
		have led to this patch.

		The "---" marker line serves the essential purpose of marking for patch
		handling tools where the changelog message ends.

		One good use for the additional comments after the "---" marker is for
		a diffstat, to show what files have changed, and the number of inserted
		and deleted lines per file. A diffstat is especially useful on bigger
		patches. Other comments relevant only to the moment or the maintainer,
		not suitable for the permanent changelog, should also go here.

		See more details on the proper patch format in the following
		references.


		13) More references for submitting patches

		Andrew Morton, "The perfect patch" (tpp).
		<http://www.zip.com.au/~akpm/linux/patches/stuff/tpp.txt>
		@@ -310,6 +370,14 @@ Andrew Morton, "The perfect patch" (tpp).
		Jeff Garzik, "Linux kernel patch submission format."
		<http://linux.yyz.us/patch-format.html>

		Greg KH, "How to piss off a kernel subsystem maintainer"
		<http://www.kroah.com/log/2005/03/31/>

		Kernel Documentation/CodingStyle
		<http://sosdg.org/~coywolf/lxr/source/Documentation/CodingStyle>

		Linus Torvalds's mail on the canonical patch format:
		<http://lkml.org/lkml/2005/4/7/183>


		-----------------------------------

Documentation/keys.txt

+55 −19

		@@ -195,8 +195,8 @@ KEY ACCESS PERMISSIONS
		======================

		Keys have an owner user ID, a group access ID, and a permissions mask. The mask
		has up to eight bits each for user, group and other access. Only five of each
		set of eight bits are defined. These permissions granted are:
		has up to eight bits each for possessor, user, group and other access. Only
		five of each set of eight bits are defined. These permissions granted are:

		(*) View

		@@ -242,15 +242,15 @@ about the status of the key service:
		this way:

		SERIAL FLAGS USAGE EXPY PERM UID GID TYPE DESCRIPTION: SUMMARY
		00000001 I----- 39 perm 1f0000 0 0 keyring _uid_ses.0: 1/4
		00000002 I----- 2 perm 1f0000 0 0 keyring _uid.0: empty
		00000007 I----- 1 perm 1f0000 0 0 keyring _pid.1: empty
		0000018d I----- 1 perm 1f0000 0 0 keyring _pid.412: empty
		000004d2 I--Q-- 1 perm 1f0000 32 -1 keyring _uid.32: 1/4
		000004d3 I--Q-- 3 perm 1f0000 32 -1 keyring _uid_ses.32: empty
		00000892 I--QU- 1 perm 1f0000 0 0 user metal:copper: 0
		00000893 I--Q-N 1 35s 1f0000 0 0 user metal:silver: 0
		00000894 I--Q-- 1 10h 1f0000 0 0 user metal:gold: 0
		00000001 I----- 39 perm 1f1f0000 0 0 keyring _uid_ses.0: 1/4
		00000002 I----- 2 perm 1f1f0000 0 0 keyring _uid.0: empty
		00000007 I----- 1 perm 1f1f0000 0 0 keyring _pid.1: empty
		0000018d I----- 1 perm 1f1f0000 0 0 keyring _pid.412: empty
		000004d2 I--Q-- 1 perm 1f1f0000 32 -1 keyring _uid.32: 1/4
		000004d3 I--Q-- 3 perm 1f1f0000 32 -1 keyring _uid_ses.32: empty
		00000892 I--QU- 1 perm 1f000000 0 0 user metal:copper: 0
		00000893 I--Q-N 1 35s 1f1f0000 0 0 user metal:silver: 0
		00000894 I--Q-- 1 10h 001f0000 0 0 user metal:gold: 0
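The PERM column packs the four 8-bit permission sets into one 32-bit value. A minimal sketch of unpacking it follows. It assumes the bytes are ordered possessor, user, group, other from most to least significant, which is the order in which the text lists them; that layout, the struct and the function name are assumptions made for illustration.

```c
#include <stdint.h>

/* One 8-bit permission set per class, as described in the text. */
struct key_perm_sketch {
	uint8_t possessor;
	uint8_t user;
	uint8_t group;
	uint8_t other;
};

/* Assumes possessor is the most significant byte and other the least. */
struct key_perm_sketch split_key_perm(uint32_t perm)
{
	struct key_perm_sketch p;

	p.possessor = (perm >> 24) & 0xff;
	p.user      = (perm >> 16) & 0xff;
	p.group     = (perm >> 8) & 0xff;
	p.other     = perm & 0xff;
	return p;
}
```

Under that assumption, 1f1f0000 grants all five defined permissions to the possessor and the user and none to group or other, while 001f0000 grants them to the user only.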

		The flags are:

		@@ -637,6 +637,34 @@ call, and the key released upon close. How to deal with conflicting keys due to
		two different users opening the same file is left to the filesystem author to
		solve.

		Note that there are two different types of pointers to keys that may be
		encountered:

		(*) struct key *

		This simply points to the key structure itself. Key structures will be at
		least four-byte aligned.

		(*) key_ref_t

		This is equivalent to a struct key *, but the least significant bit is set
		if the caller "possesses" the key. By "possession" it is meant that the
		calling process has a searchable link to the key from one of its
		keyrings. There are three functions for dealing with these:

		key_ref_t make_key_ref(const struct key *key,
		unsigned long possession);

		struct key *key_ref_to_ptr(const key_ref_t key_ref);

		unsigned long is_key_possessed(const key_ref_t key_ref);

		The first function constructs a key reference from a key pointer and
		possession information (which must be 0 or 1 and not any other value).

		The second function retrieves the key pointer from a reference and the
		third retrieves the possession flag.
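The three helpers can be pictured as simple bit operations on the pointer value. This is a userspace sketch, not the kernel's definitions: the function names and signatures match the text, but the bodies, the integer typedef for key_ref_t and the stand-in struct key are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the real structure; only its alignment matters here. */
struct key {
	int serial;
};

/* Assumption: model the reference as an integer wide enough for a pointer. */
typedef uintptr_t key_ref_t;

/* Fold the possession flag into the (always clear) least significant bit. */
key_ref_t make_key_ref(const struct key *key, unsigned long possession)
{
	assert(possession <= 1);           /* must be 0 or 1, nothing else */
	assert(((uintptr_t)key & 1) == 0); /* relies on the key's alignment */
	return (uintptr_t)key | possession;
}

/* Strip the flag bit to recover the plain key pointer. */
struct key *key_ref_to_ptr(const key_ref_t key_ref)
{
	return (struct key *)(key_ref & ~(uintptr_t)1);
}

/* Report whether the possession bit is set. */
unsigned long is_key_possessed(const key_ref_t key_ref)
{
	return key_ref & 1;
}
```

The scheme works because key structures are at least four-byte aligned, so the lowest bit of a valid key pointer is always zero and is free to carry the flag.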

		When accessing a key's payload contents, certain precautions must be taken to
		prevent access vs modification races. See the section "Notes on accessing
		payload contents" for more information.
		@@ -665,7 +693,11 @@ payload contents" for more information.

		void key_put(struct key *key);

		This can be called from interrupt context. If CONFIG_KEYS is not set then
		Or:

		void key_ref_put(key_ref_t key_ref);

		These can be called from interrupt context. If CONFIG_KEYS is not set then
		the argument will not be parsed.


		@@ -689,13 +721,17 @@ payload contents" for more information.

		(*) If a keyring was found in the search, this can be further searched by:

		struct key *keyring_search(struct key *keyring,
		key_ref_t keyring_search(key_ref_t keyring_ref,
		const struct key_type *type,
		const char *description)

		This searches the keyring tree specified for a matching key. Error ENOKEY
		is returned upon failure. If successful, the returned key will need to be
		released.
		is returned upon failure (use IS_ERR/PTR_ERR to determine). If successful,
		the returned key will need to be released.

		The possession attribute from the keyring reference is used to control
		access through the permissions mask and is propagated to the returned key
		reference pointer if successful.
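The IS_ERR/PTR_ERR check mentioned above relies on encoding a negative error code in the returned pointer value. The following userspace sketch illustrates that convention only: the sketch_* functions are hypothetical stand-ins rather than the kernel helpers, and both the 4095 bound and the numeric value used for ENOKEY are assumptions.

```c
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095 /* assumed upper bound on error codes */
#define SKETCH_ENOKEY    126  /* assumed numeric value, for illustration */

/* Encode a negative error code as a pointer value at the top of the address space. */
void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True when the pointer value falls in the range reserved for error codes. */
int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the negative error code from an error pointer. */
long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}
```

A caller of a search function written this way would test the result with sketch_is_err() before using it, and only read the error code with sketch_ptr_err() when that test is true.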


		(*) To check the validity of a key, this function can be called:
		@@ -732,7 +768,7 @@ More complex payload contents must be allocated and a pointer to them set in
		key->payload.data. One of the following ways must be selected to access the
		data:

		(1) Unmodifyable key type.
		(1) Unmodifiable key type.

		If the key type does not have a modify method, then the key's payload can
		be accessed without any form of locking, provided that it's known to be

MAINTAINERS

+28 −5

		@@ -604,6 +604,15 @@ P: H. Peter Anvin
		M: hpa@zytor.com
		S: Maintained

		CPUSETS
		P: Paul Jackson
		P: Simon Derr
		M: pj@sgi.com
		M: simon.derr@bull.net
		L: linux-kernel@vger.kernel.org
		W: http://www.bullopensource.org/cpuset/
		S: Supported

		CRAMFS FILESYSTEM
		W: http://sourceforge.net/projects/cramfs/
		S: Orphan
		@@ -1159,11 +1168,6 @@ L: linux1394-devel@lists.sourceforge.net
		W: http://www.linux1394.org/
		S: Orphan

		IEEE 1394 SBP2
		L: linux1394-devel@lists.sourceforge.net
		W: http://www.linux1394.org/
		S: Orphan

		IEEE 1394 SUBSYSTEM
		P: Ben Collins
		M: bcollins@debian.org
		@@ -1198,6 +1202,15 @@ L: linux1394-devel@lists.sourceforge.net
		W: http://www.linux1394.org/
		S: Maintained

		IEEE 1394 SBP2
		P: Ben Collins
		M: bcollins@debian.org
		P: Stefan Richter
		M: stefanr@s5r6.in-berlin.de
		L: linux1394-devel@lists.sourceforge.net
		W: http://www.linux1394.org/
		S: Maintained

		IMS TWINTURBO FRAMEBUFFER DRIVER
		P: Paul Mundt
		M: lethal@chaoticdreams.org
		@@ -1734,8 +1747,11 @@ S: Maintained
		IPVS
		P: Wensong Zhang
		M: wensong@linux-vs.org
		P: Simon Horman
		M: horms@verge.net.au
		P: Julian Anastasov
		M: ja@ssi.bg
		L: netdev@vger.kernel.org
		S: Maintained

		NFS CLIENT
		@@ -1906,6 +1922,13 @@ M: joern@wh.fh-wedel.de
		L: linux-mtd@lists.infradead.org
		S: Maintained

		PKTCDVD DRIVER
		P: Peter Osterlund
		M: petero2@telia.com
		L: linux-kernel@vger.kernel.org
		L: packet-writing@suse.com
		S: Maintained

		POSIX CLOCKS and TIMERS
		P: George Anzinger
		M: george@mvista.com