diff --git a/.gitignore b/.gitignore index e1d5c17c12c22663d9df80925e4b7872bd53d6aa..9eb4b77114994877b247c1d165f3920f680cc7e6 100644 --- a/.gitignore +++ b/.gitignore @@ -20,6 +20,7 @@ # Top-level generic files # tags +TAGS vmlinux* System.map Module.symvers diff --git a/CREDITS b/CREDITS index cc3453a55fb948f5173ff82715c304d18dc1212d..8218e790f43d33d2c754df8f149b56ef7c19b816 100644 --- a/CREDITS +++ b/CREDITS @@ -34,6 +34,7 @@ E: magrawal@nortelnetworks.com D: Basic Interphase 5575 driver with UBR and ABR support. S: 75 Donald St, Apt 42 S: Weymouth, MA 02188 +S: USA N: Dave Airlie E: airlied@linux.ie @@ -44,7 +45,7 @@ S: Longford, Ireland S: Sydney, Australia N: Tigran A. Aivazian -E: tigran@veritas.com +E: tigran@aivazian.fsnet.co.uk W: http://www.moses.uklinux.net/patches D: BFS filesystem D: Intel IA32 CPU microcode update support @@ -202,6 +203,7 @@ S: MS42 S: Hewlett-Packard S: 3404 E Harmony Rd S: Fort Collins, CO 80525 +S: USA N: Arindam Banerji E: axb@cse.nd.edu @@ -444,6 +446,7 @@ E: rbradetich@uswest.net D: Linux/PA-RISC hacker S: 1200 Goldenrod Dr. S: Nampa, Idaho 83686 +S: USA N: Derrick J. Brashear E: shadow@dementia.org @@ -633,6 +636,7 @@ E: scole@lanl.gov E: elenstev@mesatop.com D: Various build fixes and kernel documentation. S: Los Alamos, New Mexico +S: USA N: Hamish Coleman E: hamish@zot.apana.org.au @@ -951,6 +955,12 @@ S: Brevia 1043 S: S-114 79 Stockholm S: Sweden +N: Pekka Enberg +E: penberg@cs.helsinki.fi +W: http://www.cs.helsinki.fi/u/penberg/ +D: Various kernel hacks, fixes, and cleanups. +S: Finland + N: David Engebretsen E: engebret@us.ibm.com D: Linux port to 64-bit PowerPC architecture @@ -1620,7 +1630,8 @@ D: fbdev hacking N: Jesper Juhl E: jesper.juhl@gmail.com -D: Various fixes, cleanups and minor features. +D: Various fixes, cleanups and minor features all over the tree. +D: Wrote initial version of the hdaps driver (since passed on to others). S: Lemnosvej 1, 3.tv S: 2300 Copenhagen S. S: Denmark @@ -1797,6 +1808,14 @@ S: Kruislaan 419 S: 1098 VA Amsterdam S: The Netherlands +N: Jiri Kosina +E: jikos@jikos.cz +E: jkosina@suse.cz +D: Generic HID layer - original code split, fixes +D: Various ACPI fixes, keeping correct battery state through suspend +D: various lockdep annotations, autofs and other random bugfixes +S: Prague, Czech Republic + N: Gene Kozin E: 74604.152@compuserve.com W: http://www.sangoma.com @@ -2002,6 +2021,7 @@ W: http://www.mathematik.uni-stuttgart.de/~floeff D: Busmaster driver for HP 10/100 Mbit Network Adapters S: University of Stuttgart, Germany and S: Ecole Nationale Superieure des Telecommunications, Paris +S: France N: Jamie Lokier E: jamie@shareable.org @@ -2171,6 +2191,7 @@ S: Hewlett Packard S: MS 42 S: 3404 E. Harmony Road S: Fort Collins, CO 80528 +S: USA N: Torben Mathiasen E: torben.mathiasen@compaq.com @@ -2227,6 +2248,12 @@ D: tc: HFSC scheduler S: Freiburg S: Germany +N: Paul E. McKenney +E: paulmck@us.ibm.com +W: http://www.rdrop.com/users/paulmck/ +D: RCU and variants +D: rcutorture module + N: Mike McLagan E: mike.mclagan@linux.org W: http://www.invlogic.com/~mmclagan @@ -2477,7 +2504,8 @@ S: Derbyshire DE4 3RL S: United Kingdom N: Ian S. Nelson -E: ian.nelson@echostar.com +E: nelsonis@earthlink.net +P: 1024D/00D3D983 3EFD 7B86 B888 D7E2 29B6 9E97 576F 1B97 00D3 D983 D: Minor mmap and ide hacks S: 1370 Atlantis Ave. S: Lafayette CO, 80026 @@ -2578,6 +2606,9 @@ S: Ucitelska 1576 S: Prague 8 S: 182 00 Czech Republic +N: Rick Payne +D: RFC2385 Support for TCP + N: Barak A. Pearlmutter E: bap@cs.unm.edu W: http://www.cs.unm.edu/~bap/ @@ -2967,6 +2998,10 @@ S: 69 rue Dunois S: 75013 Paris S: France +N: Dipankar Sarma +E: dipankar@in.ibm.com +D: RCU + N: Hannu Savolainen E: hannu@opensound.com D: Maintainer of the sound drivers until 2.1.x days. @@ -3279,6 +3314,12 @@ S: 3 Ballow Crescent S: MacGregor A.C.T 2615 S: Australia +N: Josh Triplett +E: josh@freedesktop.org +P: 1024D/D0FE7AFB B24A 65C9 1D71 2AC2 DE87 CA26 189B 9946 D0FE 7AFB +D: rcutorture maintainer +D: lock annotations, finding and fixing lock bugs + N: Winfried Trümper E: winni@xpilot.org W: http://www.shop.de/~winni/ @@ -3481,14 +3522,12 @@ D: The Linux Support Team Erlangen N: David Weinehall E: tao@acc.umu.se +P: 1024D/DC47CA16 7ACE 0FB0 7A74 F994 9B36 E1D1 D14E 8526 DC47 CA16 W: http://www.acc.umu.se/~tao/ -W: http://www.acc.umu.se/~mcalinux/ +D: v2.0 kernel maintainer D: Fixes for the NE/2-driver D: Miscellaneous MCA-support D: Cleanup of the Config-files -S: Axtorpsvagen 40:20 -S: S-903 37 UMEA -S: Sweden N: Matt Welsh E: mdw@metalab.unc.edu @@ -3548,11 +3587,11 @@ S: Fargo, North Dakota 58122 S: USA N: Steven Whitehouse -E: SteveW@ACM.org +E: steve@chygwyn.com W: http://www.chygwyn.com/~steve -D: Linux DECnet project: http://www.sucs.swan.ac.uk/~rohan/DECnet/index.html +D: Linux DECnet project D: Minor debugging of other networking protocols. -D: Misc bug fixes and filesystem development +D: Misc bug fixes and GFS2 filesystem development N: Hans-Joachim Widmaier E: hjw@zvw.de @@ -3650,7 +3689,7 @@ S: Portland, OR S: USA N: Michal Wronski -E: Michal.Wronski@motorola.com +E: michal.wronski@gmail.com D: POSIX message queues fs (with K. Benedyczak) S: Krakow S: Poland diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX index 02457ec9c94fe27ec74dc943186283137774e232..f08ca953573392ec710a27db0f7f87f14ad54527 100644 --- a/Documentation/00-INDEX +++ b/Documentation/00-INDEX @@ -104,8 +104,6 @@ firmware_class/ - request_firmware() hotplug interface info. floppy.txt - notes and driver options for the floppy disk driver. -ftape.txt - - notes about the floppy tape device driver. hayes-esp.txt - info on using the Hayes ESP serial driver. highuid.txt diff --git a/Documentation/ABI/testing/debugfs-pktcdvd b/Documentation/ABI/testing/debugfs-pktcdvd new file mode 100644 index 0000000000000000000000000000000000000000..03dbd883cc41599211cb1c462fc126352112bb3a --- /dev/null +++ b/Documentation/ABI/testing/debugfs-pktcdvd @@ -0,0 +1,20 @@ +What: /debug/pktcdvd/pktcdvd[0-7] +Date: Oct. 2006 +KernelVersion: 2.6.19 +Contact: Thomas Maier +Description: + +debugfs interface +----------------- + +The pktcdvd module (packet writing driver) creates +these files in debugfs: + +/debug/pktcdvd/pktcdvd[0-7]/ + info (0444) Lots of human readable driver + statistics and infos. Multiple lines! + +Example: +------- + +cat /debug/pktcdvd/pktcdvd0/info diff --git a/Documentation/ABI/testing/sysfs-class-pktcdvd b/Documentation/ABI/testing/sysfs-class-pktcdvd new file mode 100644 index 0000000000000000000000000000000000000000..c4c55edc9a5c9ca7f5eaa6441049360ad5813017 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-class-pktcdvd @@ -0,0 +1,72 @@ +What: /sys/class/pktcdvd/ +Date: Oct. 2006 +KernelVersion: 2.6.19 +Contact: Thomas Maier +Description: + +sysfs interface +--------------- + +The pktcdvd module (packet writing driver) creates +these files in the sysfs: +( is in format major:minor ) + +/sys/class/pktcdvd/ + add (0200) Write a block device id (major:minor) + to create a new pktcdvd device and map + it to the block device. + + remove (0200) Write the pktcdvd device id (major:minor) + to it to remove the pktcdvd device. + + device_map (0444) Shows the device mapping in format: + pktcdvd[0-7] + +/sys/class/pktcdvd/pktcdvd[0-7]/ + dev (0444) Device id + uevent (0200) To send an uevent. + +/sys/class/pktcdvd/pktcdvd[0-7]/stat/ + packets_started (0444) Number of started packets. + packets_finished (0444) Number of finished packets. + + kb_written (0444) kBytes written. + kb_read (0444) kBytes read. + kb_read_gather (0444) kBytes read to fill write packets. + + reset (0200) Write any value to it to reset + pktcdvd device statistic values, like + bytes read/written. + +/sys/class/pktcdvd/pktcdvd[0-7]/write_queue/ + size (0444) Contains the size of the bio write + queue. + + congestion_off (0644) If bio write queue size is below + this mark, accept new bio requests + from the block layer. + + congestion_on (0644) If bio write queue size is higher + as this mark, do no longer accept + bio write requests from the block + layer and wait till the pktcdvd + device has processed enough bio's + so that bio write queue size is + below congestion off mark. + A value of <= 0 disables congestion + control. + + +Example: +-------- +To use the pktcdvd sysfs interface directly, you can do: + +# create a new pktcdvd device mapped to /dev/hdc +echo "22:0" >/sys/class/pktcdvd/add +cat /sys/class/pktcdvd/device_map +# assuming device pktcdvd0 was created, look at stat's +cat /sys/class/pktcdvd/pktcdvd0/stat/kb_written +# print the device id of the mapped block device +fgrep pktcdvd0 /sys/class/pktcdvd/device_map +# remove device, using pktcdvd0 device id 253:0 +echo "253:0" >/sys/class/pktcdvd/remove diff --git a/Documentation/ABI/testing/sysfs-power b/Documentation/ABI/testing/sysfs-power index d882f8093871386f03ab63c6cc2b820adb2610e6..dcff4d0623add0e7708c642d0cfe210b0c1f48ab 100644 --- a/Documentation/ABI/testing/sysfs-power +++ b/Documentation/ABI/testing/sysfs-power @@ -21,7 +21,7 @@ Description: these states. What: /sys/power/disk -Date: August 2006 +Date: September 2006 Contact: Rafael J. Wysocki Description: The /sys/power/disk file controls the operating mode of the @@ -39,6 +39,19 @@ Description: 'reboot' - the memory image will be saved by the kernel and the system will be rebooted. + Additionally, /sys/power/disk can be used to turn on one of the + two testing modes of the suspend-to-disk mechanism: 'testproc' + or 'test'. If the suspend-to-disk mechanism is in the + 'testproc' mode, writing 'disk' to /sys/power/state will cause + the kernel to disable nonboot CPUs and freeze tasks, wait for 5 + seconds, unfreeze tasks and enable nonboot CPUs. If it is in + the 'test' mode, writing 'disk' to /sys/power/state will cause + the kernel to disable nonboot CPUs and freeze tasks, shrink + memory, suspend devices, wait for 5 seconds, resume devices, + unfreeze tasks and enable nonboot CPUs. Then, we are able to + look in the log messages and work out, for example, which code + is being slow and which device drivers are misbehaving. + The suspend-to-disk method may be chosen by writing to this file one of the accepted strings: @@ -46,6 +59,8 @@ Description: 'platform' 'shutdown' 'reboot' + 'testproc' + 'test' It will only change to 'firmware' or 'platform' if the system supports that. diff --git a/Documentation/Changes b/Documentation/Changes index abee7f58c1ed6b2c374aa0483c06f86a5a260512..73a8617f1861754198c18c5e7285da95358994fe 100644 --- a/Documentation/Changes +++ b/Documentation/Changes @@ -201,7 +201,7 @@ udev ---- udev is a userspace application for populating /dev dynamically with only entries for devices actually present. udev replaces the basic -functionality of devfs, while allowing persistant device naming for +functionality of devfs, while allowing persistent device naming for devices. FUSE diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle index 6d2412ec91edb21ebab0c13e581951bbe352b610..0ad6dcb5d45ffd0f9f7e5ae1d64044afa51b1b8b 100644 --- a/Documentation/CodingStyle +++ b/Documentation/CodingStyle @@ -35,12 +35,37 @@ In short, 8-char indents make things easier to read, and have the added benefit of warning you when you're nesting your functions too deep. Heed that warning. +The preferred way to ease multiple indentation levels in a switch statement is +to align the "switch" and its subordinate "case" labels in the same column +instead of "double-indenting" the "case" labels. E.g.: + + switch (suffix) { + case 'G': + case 'g': + mem <<= 30; + break; + case 'M': + case 'm': + mem <<= 20; + break; + case 'K': + case 'k': + mem <<= 10; + /* fall through */ + default: + break; + } + + Don't put multiple statements on a single line unless you have something to hide: if (condition) do_this; do_something_everytime; +Don't put multiple assignments on a single line either. Kernel coding style +is super simple. Avoid tricky expressions. + Outside of comments, documentation and except in Kconfig, spaces are never used for indentation, and the above example is deliberately broken. @@ -69,7 +94,7 @@ void fun(int a, int b, int c) next_statement; } - Chapter 3: Placing Braces + Chapter 3: Placing Braces and Spaces The other issue that always comes up in C styling is the placement of braces. Unlike the indent size, there are few technical reasons to @@ -81,6 +106,20 @@ brace last on the line, and put the closing brace first, thusly: we do y } +This applies to all non-function statement blocks (if, switch, for, +while, do). E.g.: + + switch (action) { + case KOBJ_ADD: + return "add"; + case KOBJ_REMOVE: + return "remove"; + case KOBJ_CHANGE: + return "change"; + default: + return NULL; + } + However, there is one special case, namely functions: they have the opening brace at the beginning of the next line, thus: @@ -121,6 +160,49 @@ supply of new-lines on your screen is not a renewable resource (think 25-line terminal screens here), you have more empty lines to put comments on. + 3.1: Spaces + +Linux kernel style for use of spaces depends (mostly) on +function-versus-keyword usage. Use a space after (most) keywords. The +notable exceptions are sizeof, typeof, alignof, and __attribute__, which look +somewhat like functions (and are usually used with parentheses in Linux, +although they are not required in the language, as in: "sizeof info" after +"struct fileinfo info;" is declared). + +So use a space after these keywords: + if, switch, case, for, do, while +but not with sizeof, typeof, alignof, or __attribute__. E.g., + s = sizeof(struct file); + +Do not add spaces around (inside) parenthesized expressions. This example is +*bad*: + + s = sizeof( struct file ); + +When declaring pointer data or a function that returns a pointer type, the +preferred use of '*' is adjacent to the data name or function name and not +adjacent to the type name. Examples: + + char *linux_banner; + unsigned long long memparse(char *ptr, char **retptr); + char *match_strdup(substring_t *s); + +Use one space around (on each side of) most binary and ternary operators, +such as any of these: + + = + - < > * / % | & ^ <= >= == != ? : + +but no space after unary operators: + & * + - ~ ! sizeof typeof alignof __attribute__ defined + +no space before the postfix increment & decrement unary operators: + ++ -- + +no space after the prefix increment & decrement unary operators: + ++ -- + +and no space around the '.' and "->" structure member operators. + Chapter 4: Naming @@ -152,7 +234,7 @@ variable that is used to hold a temporary value. If you are afraid to mix up your local variable names, you have another problem, which is called the function-growth-hormone-imbalance syndrome. -See next chapter. +See chapter 6 (Functions). Chapter 5: Typedefs @@ -258,6 +340,20 @@ generally easily keep track of about 7 different things, anything more and it gets confused. You know you're brilliant, but maybe you'd like to understand what you did 2 weeks from now. +In source files, separate functions with one blank line. If the function is +exported, the EXPORT* macro for it should follow immediately after the closing +function brace line. E.g.: + +int system_is_up(void) +{ + return system_state == SYSTEM_RUNNING; +} +EXPORT_SYMBOL(system_is_up); + +In function prototypes, include parameter names with their data types. +Although this is not required by the C language, it is preferred in Linux +because it is a simple way to add valuable information for the reader. + Chapter 7: Centralized exiting of functions @@ -306,16 +402,36 @@ time to explain badly written code. Generally, you want your comments to tell WHAT your code does, not HOW. Also, try to avoid putting comments inside a function body: if the function is so complex that you need to separately comment parts of it, -you should probably go back to chapter 5 for a while. You can make +you should probably go back to chapter 6 for a while. You can make small comments to note or warn about something particularly clever (or ugly), but try to avoid excess. Instead, put the comments at the head of the function, telling people what it does, and possibly WHY it does it. -When commenting the kernel API functions, please use the kerneldoc format. +When commenting the kernel API functions, please use the kernel-doc format. See the files Documentation/kernel-doc-nano-HOWTO.txt and scripts/kernel-doc for details. +Linux style for comments is the C89 "/* ... */" style. +Don't use C99-style "// ..." comments. + +The preferred style for long (multi-line) comments is: + + /* + * This is the preferred style for multi-line + * comments in the Linux kernel source code. + * Please use it consistently. + * + * Description: A column of asterisks on the left side, + * with beginning and ending almost-blank lines. + */ + +It's also important to comment data, whether they are basic types or derived +types. To this end, use just one data declaration per line (no commas for +multiple data declarations). This leaves you room for a small comment on each +item, explaining its use. + + Chapter 9: You've made a mess of it That's OK, we all do. You've probably been told by your long-time Unix @@ -532,6 +648,40 @@ appears outweighs the potential value of the hint that tells gcc to do something it would have done anyway. + Chapter 16: Function return values and names + +Functions can return values of many different kinds, and one of the +most common is a value indicating whether the function succeeded or +failed. Such a value can be represented as an error-code integer +(-Exxx = failure, 0 = success) or a "succeeded" boolean (0 = failure, +non-zero = success). + +Mixing up these two sorts of representations is a fertile source of +difficult-to-find bugs. If the C language included a strong distinction +between integers and booleans then the compiler would find these mistakes +for us... but it doesn't. To help prevent such bugs, always follow this +convention: + + If the name of a function is an action or an imperative command, + the function should return an error-code integer. If the name + is a predicate, the function should return a "succeeded" boolean. + +For example, "add work" is a command, and the add_work() function returns 0 +for success or -EBUSY for failure. In the same way, "PCI device present" is +a predicate, and the pci_dev_present() function returns 1 if it succeeds in +finding a matching device or 0 if it doesn't. + +All EXPORTed functions must respect this convention, and so should all +public functions. Private (static) functions need not, but it is +recommended that they do. + +Functions whose return value is the actual result of a computation, rather +than an indication of whether the computation succeeded, are not subject to +this rule. Generally they indicate failure by returning some out-of-range +result. Typical examples would be functions that return pointers; they use +NULL or the ERR_PTR mechanism to report failure. + + Appendix I: References @@ -557,4 +707,4 @@ Kernel CodingStyle, by greg@kroah.com at OLS 2002: http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/ -- -Last updated on 30 April 2006. +Last updated on 2006-December-06. diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt index 2ffb0d62f0fe3ed8156586505b7faf1efe1c33ce..805db4b2cba65b65bd966f054c71ca13427f9287 100644 --- a/Documentation/DMA-API.txt +++ b/Documentation/DMA-API.txt @@ -77,7 +77,7 @@ To get this part of the dma_ API, you must #include Many drivers need lots of small dma-coherent memory regions for DMA descriptors or I/O buffers. Rather than allocating in units of a page or more using dma_alloc_coherent(), you can use DMA pools. These work -much like a kmem_cache_t, except that they use the dma-coherent allocator +much like a struct kmem_cache, except that they use the dma-coherent allocator not __get_free_pages(). Also, they understand common hardware constraints for alignment, like queue heads needing to be aligned on N byte boundaries. @@ -94,7 +94,7 @@ The pool create() routines initialize a pool of dma-coherent buffers for use with a given device. It must be called in a context which can sleep. -The "name" is for diagnostics (like a kmem_cache_t name); dev and size +The "name" is for diagnostics (like a struct kmem_cache name); dev and size are like what you'd pass to dma_alloc_coherent(). The device's hardware alignment requirement for this type of data is "align" (which is expressed in bytes, and must be a power of two). If your device has no boundary @@ -431,10 +431,10 @@ be identical to those passed in (and returned by dma_alloc_noncoherent()). int -dma_is_consistent(dma_addr_t dma_handle) +dma_is_consistent(struct device *dev, dma_addr_t dma_handle) -returns true if the memory pointed to by the dma_handle is actually -consistent. +returns true if the device dev is performing consistent DMA on the memory +area pointed to by the dma_handle. int dma_get_cache_alignment(void) @@ -459,7 +459,7 @@ anything like this. You must also be extra careful about accessing memory you intend to sync partially. void -dma_cache_sync(void *vaddr, size_t size, +dma_cache_sync(struct device *dev, void *vaddr, size_t size, enum dma_data_direction direction) Do a partial sync of memory that was allocated by @@ -489,7 +489,7 @@ size is the size of the area (must be multiples of PAGE_SIZE). flags can be or'd together and are DMA_MEMORY_MAP - request that the memory returned from -dma_alloc_coherent() be directly writeable. +dma_alloc_coherent() be directly writable. DMA_MEMORY_IO - request that the memory returned from dma_alloc_coherent() be addressable using read/write/memcpy_toio etc. diff --git a/Documentation/DMA-ISA-LPC.txt b/Documentation/DMA-ISA-LPC.txt index 705f6be92bdbf934a1171f291ac8b3b631d670c3..e767805b4182826acc049e29a183ec5109c4a39e 100644 --- a/Documentation/DMA-ISA-LPC.txt +++ b/Documentation/DMA-ISA-LPC.txt @@ -110,7 +110,7 @@ lock. Once the DMA transfer is finished (or timed out) you should disable the channel again. You should also check get_dma_residue() to make -sure that all data has been transfered. +sure that all data has been transferred. Example: diff --git a/Documentation/DMA-mapping.txt b/Documentation/DMA-mapping.txt index 63392c9132b4d08bf32f3fab82cc0c14c39e15b1..028614cdd0624291f630d3cdda899cf02335ed2b 100644 --- a/Documentation/DMA-mapping.txt +++ b/Documentation/DMA-mapping.txt @@ -107,7 +107,7 @@ The query is performed via a call to pci_set_dma_mask(): int pci_set_dma_mask(struct pci_dev *pdev, u64 device_mask); -The query for consistent allocations is performed via a a call to +The query for consistent allocations is performed via a call to pci_set_consistent_dma_mask(): int pci_set_consistent_dma_mask(struct pci_dev *pdev, u64 device_mask); @@ -117,7 +117,7 @@ device_mask is a bit mask describing which bits of a PCI address your device supports. It returns zero if your card can perform DMA properly on the machine given the address mask you provided. -If it returns non-zero, your device can not perform DMA properly on +If it returns non-zero, your device cannot perform DMA properly on this platform, and attempting to do so will result in undefined behavior. You must either use a different mask, or not use DMA. diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile index 66e1cf733571ccc4122dcc745bf9c713ee6559b6..36526a1e76d753e83bc752acf584e93d5db2f0b5 100644 --- a/Documentation/DocBook/Makefile +++ b/Documentation/DocBook/Makefile @@ -9,7 +9,7 @@ DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \ kernel-hacking.xml kernel-locking.xml deviceiobook.xml \ procfs-guide.xml writing_usb_driver.xml \ - kernel-api.xml journal-api.xml lsm.xml usb.xml \ + kernel-api.xml filesystems.xml lsm.xml usb.xml \ gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \ genericirq.xml @@ -190,9 +190,13 @@ quiet_cmd_fig2png = FIG2PNG $@ ### # Help targets as used by the top-level makefile dochelp: - @echo ' Linux kernel internal documentation in different formats:' - @echo ' xmldocs (XML DocBook), psdocs (Postscript), pdfdocs (PDF)' - @echo ' htmldocs (HTML), mandocs (man pages, use installmandocs to install)' + @echo ' Linux kernel internal documentation in different formats:' + @echo ' htmldocs - HTML' + @echo ' installmandocs - install man pages generated by mandocs' + @echo ' mandocs - man pages' + @echo ' pdfdocs - PDF' + @echo ' psdocs - Postscript' + @echo ' xmldocs - XML DocBook' ### # Temporary files left by various tools diff --git a/Documentation/DocBook/journal-api.tmpl b/Documentation/DocBook/filesystems.tmpl similarity index 78% rename from Documentation/DocBook/journal-api.tmpl rename to Documentation/DocBook/filesystems.tmpl index 2077f9a28c191bdba263271e649f015485052926..39fa2aba7f9b141d13912d04994ebc4335499000 100644 --- a/Documentation/DocBook/journal-api.tmpl +++ b/Documentation/DocBook/filesystems.tmpl @@ -2,39 +2,11 @@ - + - The Linux Journalling API - - - Roger - Gammans - -
- rgammans@computer-surgery.co.uk -
-
-
-
- - - - Stephen - Tweedie - -
- sct@redhat.com -
-
-
-
+ Linux Filesystems API - - 2002 - Roger Gammans - - - + This documentation is free software; you can redistribute it and/or modify it under the terms of the GNU General Public @@ -42,21 +14,21 @@ version 2 of the License, or (at your option) any later version. - + This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. - + You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA - + For more details see the file COPYING in the source distribution of Linux. @@ -66,17 +38,113 @@ - + + The Linux VFS + The Filesystem types +!Iinclude/linux/fs.h + + The Directory Cache +!Efs/dcache.c +!Iinclude/linux/dcache.h + + Inode Handling +!Efs/inode.c +!Efs/bad_inode.c + + Registration and Superblocks +!Efs/super.c + + File Locks +!Efs/locks.c +!Ifs/locks.c + + Other Functions +!Efs/mpage.c +!Efs/namei.c +!Efs/buffer.c +!Efs/bio.c +!Efs/seq_file.c +!Efs/filesystems.c +!Efs/fs-writeback.c +!Efs/block_dev.c + + + + + The proc filesystem + + sysctl interface +!Ekernel/sysctl.c + + + proc filesystem interface +!Ifs/proc/base.c + + + + + The Filesystem for Exporting Kernel Objects +!Efs/sysfs/file.c +!Efs/sysfs/symlink.c +!Efs/sysfs/bin.c + + + + The debugfs filesystem + + debugfs interface +!Efs/debugfs/inode.c +!Efs/debugfs/file.c + + + + + + The Linux Journalling API + + + + Roger + Gammans + +
+ rgammans@computer-surgery.co.uk +
+
+
+
+ + + + Stephen + Tweedie + +
+ sct@redhat.com +
+
+
+
+ + + 2002 + Roger Gammans + +
+ + The Linux Journalling API + + Overview - + Details -The journalling layer is easy to use. You need to +The journalling layer is easy to use. You need to first of all create a journal_t data structure. There are two calls to do this dependent on how you decide to allocate the physical -media on which the journal resides. The journal_init_inode() call +media on which the journal resides. The journal_init_inode() call is for journals stored in filesystem inodes, or the journal_init_dev() -call can be use for journal stored on a raw device (in a continuous range +call can be use for journal stored on a raw device (in a continuous range of blocks). A journal_t is a typedef for a struct pointer, so when you are finally finished make sure you call journal_destroy() on it to free up any used kernel memory. @@ -91,27 +159,26 @@ need to call journal_create(). Most of the time however your journal file will already have been created, but before you load it you must call journal_wipe() to empty the journal file. -Hang on, you say , what if the filesystem wasn't cleanly umount()'d . Well, it is the +Hang on, you say , what if the filesystem wasn't cleanly umount()'d . Well, it is the job of the client file system to detect this and skip the call to journal_wipe(). In either case the next call should be to journal_load() which prepares the -journal file for use. Note that journal_wipe(..,0) calls journal_skip_recovery() +journal file for use. Note that journal_wipe(..,0) calls journal_skip_recovery() for you if it detects any outstanding transactions in the journal and similarly journal_load() will call journal_recover() if necessary. I would advise reading fs/ext3/super.c for examples on this stage. -[RGG: Why is the journal_wipe() call necessary - doesn't this needlessly -complicate the API. Or isn't a good idea for the journal layer to hide +[RGG: Why is the journal_wipe() call necessary - doesn't this needlessly +complicate the API. Or isn't a good idea for the journal layer to hide dirty mounts from the client fs] -Now you can go ahead and start modifying the underlying +Now you can go ahead and start modifying the underlying filesystem. Almost. - You still need to actually journal your filesystem changes, this @@ -138,10 +205,10 @@ individual buffers (blocks). Before you start to modify a buffer you need to call journal_get_{create,write,undo}_access() as appropriate, this allows the journalling layer to copy the unmodified data if it needs to. After all the buffer may be part of a previously uncommitted -transaction. +transaction. At this point you are at last ready to modify a buffer, and once you are have done so you need to call journal_dirty_{meta,}data(). -Or if you've asked for access to a buffer you now know is now longer +Or if you've asked for access to a buffer you now know is now longer required to be pushed back on the device you can call journal_forget() in much the same way as you might have used bforget() in the past. @@ -156,7 +223,6 @@ Then at umount time , in your put_super() (2.4) or write_super() (2.5) you can then call journal_destroy() to clean up your in-core journal object. - Unfortunately there a couple of ways the journal layer can cause a deadlock. The first thing to note is that each task can only have @@ -164,19 +230,19 @@ a single outstanding transaction at any one time, remember nothing commits until the outermost journal_stop(). This means you must complete the transaction at the end of each file/inode/address etc. operation you perform, so that the journalling system isn't re-entered -on another journal. Since transactions can't be nested/batched +on another journal. Since transactions can't be nested/batched across differing journals, and another filesystem other than yours (say ext3) may be modified in a later syscall. -The second case to bear in mind is that journal_start() can -block if there isn't enough space in the journal for your transaction +The second case to bear in mind is that journal_start() can +block if there isn't enough space in the journal for your transaction (based on the passed nblocks param) - when it blocks it merely(!) needs to -wait for transactions to complete and be committed from other tasks, -so essentially we are waiting for journal_stop(). So to avoid +wait for transactions to complete and be committed from other tasks, +so essentially we are waiting for journal_stop(). So to avoid deadlocks you must treat journal_start/stop() as if they -were semaphores and include them in your semaphore ordering rules to prevent +were semaphores and include them in your semaphore ordering rules to prevent deadlocks. Note that journal_extend() has similar blocking behaviour to journal_start() so you can deadlock here just as easily as on journal_start(). @@ -184,7 +250,7 @@ journal_start() so you can deadlock here just as easily as on journal_start(). Try to reserve the right number of blocks the first time. ;-). This will be the maximum number of blocks you are going to touch in this transaction. -I advise having a look at at least ext3_jbd.h to see the basis on which +I advise having a look at at least ext3_jbd.h to see the basis on which ext3 uses to make these decisions. @@ -193,13 +259,13 @@ Another wriggle to watch out for is your on-disk block allocation strategy. why? Because, if you undo a delete, you need to ensure you haven't reused any of the freed blocks in a later transaction. One simple way of doing this is make sure any blocks you allocate only have checkpointed transactions -listed against them. Ext3 does this in ext3_test_allocatable(). +listed against them. Ext3 does this in ext3_test_allocatable().
Lock is also providing through journal_{un,}lock_updates(), ext3 uses this when it wants a window with a clean and stable fs for a moment. -eg. +eg. @@ -230,19 +296,19 @@ extend it like this:- struct journal_callback for_jbd; // Stuff for myfs allocated together. myfs_inode* i_commited; - + } -this would be useful if you needed to know when data was committed to a +this would be useful if you needed to know when data was committed to a particular inode. - + - -Summary + + Summary Using the journal is a matter of wrapping the different context changes, being each mount, each modification (transaction) and each changed buffer @@ -260,15 +326,15 @@ an example. if (clean) journal_wipe(); journal_load(); - foreach(transaction) { /*transactions must be + foreach(transaction) { /*transactions must be completed before - a syscall returns to + a syscall returns to userspace*/ handle_t * xct=journal_start(my_jnrl); foreach(bh) { journal_get_{create,write,undo}_access(xact,bh); - if ( myfs_modify(bh) ) { /* returns true + if ( myfs_modify(bh) ) { /* returns true if makes changes */ journal_dirty_{meta,}data(xact,bh); } else { @@ -279,55 +345,57 @@ an example. } journal_destroy(my_jrnl); - + - + - + Data Types - + The journalling layer uses typedefs to 'hide' the concrete definitions of the structures used. As a client of the JBD layer you can just rely on the using the pointer as a magic cookie of some sort. - + Obviously the hiding is not enforced as this is 'C'. - - Structures + + Structures !Iinclude/linux/jbd.h - - + + - + Functions - + The functions here are split into two groups those that affect a journal as a whole, and those which are used to manage transactions - - Journal Level + + Journal Level !Efs/jbd/journal.c !Ifs/jbd/recovery.c - - Transasction Level -!Efs/jbd/transaction.c - - - + + Transasction Level +!Efs/jbd/transaction.c + + + See also - + - Journaling the Linux ext2fs Filesystem,LinuxExpo 98, Stephen Tweedie + Journaling the Linux ext2fs Filesystem, LinuxExpo 98, Stephen Tweedie - - - + + + - Ext3 Journalling FileSystem , OLS 2000, Dr. Stephen Tweedie + Ext3 Journalling FileSystem, OLS 2000, Dr. Stephen Tweedie - - + + + +
diff --git a/Documentation/DocBook/kernel-api.tmpl b/Documentation/DocBook/kernel-api.tmpl index f8fe882e33dccfc50f6e5add6ca89a53389d649f..3fa0c4b4541e065760a3c119e88d314ba3b6ee91 100644 --- a/Documentation/DocBook/kernel-api.tmpl +++ b/Documentation/DocBook/kernel-api.tmpl @@ -158,6 +158,7 @@ X!Ilib/string.c !Emm/filemap.c !Emm/memory.c !Emm/vmalloc.c +!Imm/page_alloc.c !Emm/mempool.c !Emm/page-writeback.c !Emm/truncate.c @@ -181,56 +182,19 @@ X!Ilib/string.c - - The proc filesystem - - sysctl interface -!Ekernel/sysctl.c - - - proc filesystem interface -!Ifs/proc/base.c - - + + relay interface support - - The debugfs filesystem - - debugfs interface -!Efs/debugfs/inode.c -!Efs/debugfs/file.c - - + + Relay interface support + is designed to provide an efficient mechanism for tools and + facilities to relay large amounts of data from kernel space to + user space. + - - The Linux VFS - The Filesystem types -!Iinclude/linux/fs.h - - The Directory Cache -!Efs/dcache.c -!Iinclude/linux/dcache.h - - Inode Handling -!Efs/inode.c -!Efs/bad_inode.c - - Registration and Superblocks -!Efs/super.c - - File Locks -!Efs/locks.c -!Ifs/locks.c - - Other Functions -!Efs/mpage.c -!Efs/namei.c -!Efs/buffer.c -!Efs/bio.c -!Efs/seq_file.c -!Efs/filesystems.c -!Efs/fs-writeback.c -!Efs/block_dev.c + relay interface +!Ekernel/relay.c +!Ikernel/relay.c @@ -302,8 +266,13 @@ X!Ekernel/module.c !Ekernel/irq/manage.c + DMA Channels +!Ekernel/dma.c + + Resources Management !Ikernel/resource.c +!Ekernel/resource.c MTRR Handling @@ -349,13 +318,6 @@ X!Earch/i386/kernel/mca.c - - The Filesystem for Exporting Kernel Objects -!Efs/sysfs/file.c -!Efs/sysfs/symlink.c -!Efs/sysfs/bin.c - - Security Framework !Esecurity/security.c @@ -386,6 +348,7 @@ X!Iinclude/linux/device.h --> !Edrivers/base/driver.c !Edrivers/base/core.c +!Edrivers/base/class.c !Edrivers/base/firmware_class.c !Edrivers/base/transport_class.c !Edrivers/base/dmapool.c @@ -437,6 +400,11 @@ X!Edrivers/pnp/system.c !Eblock/ll_rw_blk.c + + Char devices +!Efs/char_dev.c + + Miscellaneous Devices !Edrivers/char/misc.c @@ -450,9 +418,35 @@ X!Edrivers/pnp/system.c !Idrivers/parport/daisy.c - - Video4Linux -!Edrivers/media/video/videodev.c + + Message-based devices + Fusion message devices +!Edrivers/message/fusion/mptbase.c +!Idrivers/message/fusion/mptbase.c +!Edrivers/message/fusion/mptscsih.c +!Idrivers/message/fusion/mptscsih.c +!Idrivers/message/fusion/mptctl.c +!Idrivers/message/fusion/mptspi.c +!Idrivers/message/fusion/mptfc.c +!Idrivers/message/fusion/mptlan.c + + I2O message devices +!Iinclude/linux/i2o.h +!Idrivers/message/i2o/core.h +!Edrivers/message/i2o/iop.c +!Idrivers/message/i2o/iop.c +!Idrivers/message/i2o/config-osm.c +!Edrivers/message/i2o/exec-osm.c +!Idrivers/message/i2o/exec-osm.c +!Idrivers/message/i2o/bus-osm.c +!Edrivers/message/i2o/device.c +!Idrivers/message/i2o/device.c +!Idrivers/message/i2o/driver.c +!Idrivers/message/i2o/pci.c +!Idrivers/message/i2o/i2o_block.c +!Idrivers/message/i2o/i2o_scsi.c +!Idrivers/message/i2o/i2o_proc.c + @@ -565,4 +559,12 @@ X!Idrivers/video/console/fonts.c --> + + + Input Subsystem +!Iinclude/linux/input.h +!Edrivers/input/input.c +!Edrivers/input/ff-core.c +!Edrivers/input/ff-memless.c +
diff --git a/Documentation/DocBook/libata.tmpl b/Documentation/DocBook/libata.tmpl index 065e8dc23e3adb4bd3d5d659ddf2a80741261b3a..07a635590b36e8ff8c8a61086adb34e9d907bd65 100644 --- a/Documentation/DocBook/libata.tmpl +++ b/Documentation/DocBook/libata.tmpl @@ -14,7 +14,7 @@ - 2003-2005 + 2003-2006 Jeff Garzik @@ -1400,7 +1400,7 @@ and other resources, etc. When it's known that HBA is in ready state but ATA/ATAPI - device in in unknown state, reset only device. + device is in unknown state, reset only device. diff --git a/Documentation/DocBook/usb.tmpl b/Documentation/DocBook/usb.tmpl index 320af25de3a276fd5b77aa830b08549c2c523793..143e5ff7deb8c4fad66789c02750749251a2acea 100644 --- a/Documentation/DocBook/usb.tmpl +++ b/Documentation/DocBook/usb.tmpl @@ -43,59 +43,52 @@ A Universal Serial Bus (USB) is used to connect a host, such as a PC or workstation, to a number of peripheral - devices. USB uses a tree structure, with the host at the + devices. USB uses a tree structure, with the host as the root (the system's master), hubs as interior nodes, and - peripheral devices as leaves (and slaves). + peripherals as leaves (and slaves). Modern PCs support several such trees of USB devices, usually one USB 2.0 tree (480 Mbit/sec each) with a few USB 1.1 trees (12 Mbit/sec each) that are used when you connect a USB 1.1 device directly to the machine's "root hub". - That master/slave asymmetry was designed in part for - ease of use. It is not physically possible to assemble - (legal) USB cables incorrectly: all upstream "to-the-host" - connectors are the rectangular type, matching the sockets on - root hubs, and the downstream type are the squarish type - (or they are built in to the peripheral). - Software doesn't need to deal with distributed autoconfiguration - since the pre-designated master node manages all that. - At the electrical level, bus protocol overhead is reduced by - eliminating arbitration and moving scheduling into host software. + That master/slave asymmetry was designed-in for a number of + reasons, one being ease of use. It is not physically possible to + assemble (legal) USB cables incorrectly: all upstream "to the host" + connectors are the rectangular type (matching the sockets on + root hubs), and all downstream connectors are the squarish type + (or they are built into the peripheral). + Also, the host software doesn't need to deal with distributed + auto-configuration since the pre-designated master node manages all that. + And finally, at the electrical level, bus protocol overhead is reduced by + eliminating arbitration and moving scheduling into the host software. - USB 1.0 was announced in January 1996, and was revised + USB 1.0 was announced in January 1996 and was revised as USB 1.1 (with improvements in hub specification and support for interrupt-out transfers) in September 1998. - USB 2.0 was released in April 2000, including high speed - transfers and transaction translating hubs (used for USB 1.1 + USB 2.0 was released in April 2000, adding high-speed + transfers and transaction-translating hubs (used for USB 1.1 and 1.0 backward compatibility). - USB support was added to Linux early in the 2.2 kernel series - shortly before the 2.3 development forked off. Updates - from 2.3 were regularly folded back into 2.2 releases, bringing - new features such as /sbin/hotplug support, - more drivers, and more robustness. - The 2.5 kernel series continued such improvements, and also - worked on USB 2.0 support, - higher performance, - better consistency between host controller drivers, - API simplification (to make bugs less likely), - and providing internal "kerneldoc" documentation. + Kernel developers added USB support to Linux early in the 2.2 kernel + series, shortly before 2.3 development forked. Updates from 2.3 were + regularly folded back into 2.2 releases, which improved reliability and + brought /sbin/hotplug support as well more drivers. + Such improvements were continued in the 2.5 kernel series, where they added + USB 2.0 support, improved performance, and made the host controller drivers + (HCDs) more consistent. They also simplified the API (to make bugs less + likely) and added internal "kerneldoc" documentation. Linux can run inside USB devices as well as on the hosts that control the devices. - Because the Linux 2.x USB support evolved to support mass market - platforms such as Apple Macintosh or PC-compatible systems, - it didn't address design concerns for those types of USB systems. - So it can't be used inside mass-market PDAs, or other peripherals. - USB device drivers running inside those Linux peripherals + But USB device drivers running inside those peripherals don't do the same things as the ones running inside hosts, - and so they've been given a different name: - they're called gadget drivers. - This document does not present gadget drivers. + so they've been given a different name: + gadget drivers. + This document does not cover gadget drivers. @@ -103,17 +96,14 @@ USB Host-Side API Model - Within the kernel, - host-side drivers for USB devices talk to the "usbcore" APIs. - There are two types of public "usbcore" APIs, targetted at two different - layers of USB driver. Those are - general purpose drivers, exposed through - driver frameworks such as block, character, or network devices; - and drivers that are part of the core, - which are involved in managing a USB bus. - Such core drivers include the hub driver, - which manages trees of USB devices, and several different kinds - of host controller driver (HCD), + Host-side drivers for USB devices talk to the "usbcore" APIs. + There are two. One is intended for + general-purpose drivers (exposed through + driver frameworks), and the other is for drivers that are + part of the core. + Such core drivers include the hub driver + (which manages trees of USB devices) and several different kinds + of host controller drivers, which control individual busses. @@ -122,21 +112,21 @@ - USB supports four kinds of data transfer - (control, bulk, interrupt, and isochronous). Two transfer - types use bandwidth as it's available (control and bulk), - while the other two types of transfer (interrupt and isochronous) + USB supports four kinds of data transfers + (control, bulk, interrupt, and isochronous). Two of them (control + and bulk) use bandwidth as it's available, + while the other two (interrupt and isochronous) are scheduled to provide guaranteed bandwidth. The device description model includes one or more "configurations" per device, only one of which is active at a time. - Devices that are capable of high speed operation must also support - full speed configurations, along with a way to ask about the - "other speed" configurations that might be used. + Devices that are capable of high-speed operation must also support + full-speed configurations, along with a way to ask about the + "other speed" configurations which might be used. - Configurations have one or more "interface", each + Configurations have one or more "interfaces", each of which may have "alternate settings". Interfaces may be standardized by USB "Class" specifications, or may be specific to a vendor or device. @@ -162,7 +152,7 @@ The Linux USB API supports synchronous calls for - control and bulk messaging. + control and bulk messages. It also supports asynchnous calls for all kinds of data transfer, using request structures called "URBs" (USB Request Blocks). @@ -324,8 +314,7 @@ usbdevfs although it wasn't solving what devfs was. Every USB device will appear in usbfs, regardless of whether or - not it has a kernel driver; but only devices with kernel drivers - show up in devfs. + not it has a kernel driver. @@ -463,14 +452,25 @@ file in your Linux kernel sources. - Otherwise the main use for this file from programs - is to poll() it to get notifications of usb devices - as they're plugged or unplugged. - To see what changed, you'd need to read the file and - compare "before" and "after" contents, scan the filesystem, - or see its hotplug event. + This file, in combination with the poll() system call, can + also be used to detect when devices are added or removed: +int fd; +struct pollfd pfd; + +fd = open("/proc/bus/usb/devices", O_RDONLY); +pfd = { fd, POLLIN, 0 }; +for (;;) { + /* The first time through, this call will return immediately. */ + poll(&pfd, 1, -1); + + /* To see what's changed, compare the file's previous and current + contents or scan the filesystem. (Scanning is more precise.) */ +} + Note that this behavior is intended to be used for informational + and debug purposes. It would be more appropriate to use programs + such as udev or HAL to initialize a device or start a user-mode + helper program, for instance. - @@ -740,7 +740,7 @@ usbdev_ioctl (int fd, int ifno, unsigned request, void *param) Synchronous I/O Support Synchronous requests involve the kernel blocking - until until the user mode request completes, either by + until the user mode request completes, either by finishing successfully or by reporting an error. In most cases this is the simplest way to use usbfs, although as noted above it does prevent performing I/O diff --git a/Documentation/DocBook/writing_usb_driver.tmpl b/Documentation/DocBook/writing_usb_driver.tmpl index 008a341234d0ca48848b216038fc0f900d235b0e..d4188d4ff5356902b93e9f19d10a78b82df3c813 100644 --- a/Documentation/DocBook/writing_usb_driver.tmpl +++ b/Documentation/DocBook/writing_usb_driver.tmpl @@ -224,13 +224,8 @@ static int skel_probe(struct usb_interface *interface, Conversely, when the device is removed from the USB bus, the disconnect function is called with the device pointer. The driver needs to clean any private data that has been allocated at this time and to shut down any - pending urbs that are in the USB system. The driver also unregisters - itself from the devfs subsystem with the call: + pending urbs that are in the USB system. - -/* remove our devfs node */ -devfs_unregister(skel->devfs); - Now that the device is plugged into the system and the driver is bound to the device, any of the functions in the file_operations structure that @@ -350,8 +345,7 @@ static inline void skel_delete (struct usb_skel *dev) usb_buffer_free (dev->udev, dev->bulk_out_size, dev->bulk_out_buffer, dev->write_urb->transfer_dma); - if (dev->write_urb != NULL) - usb_free_urb (dev->write_urb); + usb_free_urb (dev->write_urb); kfree (dev); } diff --git a/Documentation/HOWTO b/Documentation/HOWTO index 1d6560413cc593e001080759ce867c070956aeb7..8d51c148f72117b707792a65c9c5ad025364a03a 100644 --- a/Documentation/HOWTO +++ b/Documentation/HOWTO @@ -375,6 +375,46 @@ of information is needed by the kernel developers to help track down the problem. +Managing bug reports +-------------------- + +One of the best ways to put into practice your hacking skills is by fixing +bugs reported by other people. Not only you will help to make the kernel +more stable, you'll learn to fix real world problems and you will improve +your skills, and other developers will be aware of your presence. Fixing +bugs is one of the best ways to earn merit amongst the developers, because +not many people like wasting time fixing other people's bugs. + +To work in the already reported bug reports, go to http://bugzilla.kernel.org. +If you want to be advised of the future bug reports, you can subscribe to the +bugme-new mailing list (only new bug reports are mailed here) or to the +bugme-janitor mailing list (every change in the bugzilla is mailed here) + + http://lists.osdl.org/mailman/listinfo/bugme-new + http://lists.osdl.org/mailman/listinfo/bugme-janitors + + + +Managing bug reports +-------------------- + +One of the best ways to put into practice your hacking skills is by fixing +bugs reported by other people. Not only you will help to make the kernel +more stable, you'll learn to fix real world problems and you will improve +your skills, and other developers will be aware of your presence. Fixing +bugs is one of the best ways to get merits among other developers, because +not many people like wasting time fixing other people's bugs. + +To work in the already reported bug reports, go to http://bugzilla.kernel.org. +If you want to be advised of the future bug reports, you can subscribe to the +bugme-new mailing list (only new bug reports are mailed here) or to the +bugme-janitor mailing list (every change in the bugzilla is mailed here) + + http://lists.osdl.org/mailman/listinfo/bugme-new + http://lists.osdl.org/mailman/listinfo/bugme-janitors + + + Mailing lists ------------- diff --git a/Documentation/IPMI.txt b/Documentation/IPMI.txt index 0256805b548f8c43906c07b4e05ab1f76ae1f972..24dc3fcf15948e8a9ed93043cf46c0e81f929c36 100644 --- a/Documentation/IPMI.txt +++ b/Documentation/IPMI.txt @@ -326,9 +326,12 @@ for events, they will all receive all events that come in. For receiving commands, you have to individually register commands you want to receive. Call ipmi_register_for_cmd() and supply the netfn -and command name for each command you want to receive. Only one user -may be registered for each netfn/cmd, but different users may register -for different commands. +and command name for each command you want to receive. You also +specify a bitmask of the channels you want to receive the command from +(or use IPMI_CHAN_ALL for all channels if you don't care). Only one +user may be registered for each netfn/cmd/channel, but different users +may register for different commands, or the same command if the +channel bitmasks do not overlap. From userland, equivalent IOCTLs are provided to do these functions. @@ -361,6 +364,8 @@ You can change this at module load time (for a module) with: regspacings=,,... regsizes=,,... regshifts=,,... slave_addrs=,,... + force_kipmid=,,... + unload_when_empty=[0|1] Each of these except si_trydefaults is a list, the first item for the first interface, second item for the second interface, etc. @@ -406,7 +411,18 @@ The slave_addrs specifies the IPMI address of the local BMC. This is usually 0x20 and the driver defaults to that, but in case it's not, it can be specified when the driver starts up. -When compiled into the kernel, the addresses can be specified on the +The force_ipmid parameter forcefully enables (if set to 1) or disables +(if set to 0) the kernel IPMI daemon. Normally this is auto-detected +by the driver, but systems with broken interrupts might need an enable, +or users that don't want the daemon (don't need the performance, don't +want the CPU hit) can disable it. + +If unload_when_empty is set to 1, the driver will be unloaded if it +doesn't find any interfaces or all the interfaces fail to work. The +default is one. Setting to 0 is useful with the hotmod, but is +obviously only useful for modules. + +When compiled into the kernel, the parameters can be specified on the kernel command line as: ipmi_si.type=,... @@ -416,6 +432,7 @@ kernel command line as: ipmi_si.regsizes=,,... ipmi_si.regshifts=,,... ipmi_si.slave_addrs=,,... + ipmi_si.force_kipmid=,,... It works the same as the module parameters of the same names. @@ -430,6 +447,25 @@ have high-res timers enabled in the kernel and you don't have interrupts enabled, the driver will run VERY slowly. Don't blame me, these interfaces suck. +The driver supports a hot add and remove of interfaces. This way, +interfaces can be added or removed after the kernel is up and running. +This is done using /sys/modules/ipmi_si/hotmod, which is a write-only +parameter. You write a string to this interface. The string has the +format: + [:op2[:op3...]] +The "op"s are: + add|remove,kcs|bt|smic,mem|i/o,
[,[,[,...]]] +You can specify more than one interface on the line. The "opt"s are: + rsp= + rsi= + rsh= + irq= + ipmb= +and these have the same meanings as discussed above. Note that you +can also use this on the kernel command line for a more compact format +for specifying an interface. Note that when removing an interface, +only the first three parameters (si type, address type, and address) +are used for the comparison. Any options are ignored for removing. The SMBus Driver ---------------- @@ -457,12 +493,12 @@ BMCs specified on the smb_addr line will be detected. Setting smb_dbg_probe to 1 will enable debugging of the probing and detection process for BMCs on the SMBusses. -Discovering the IPMI compilant BMC on the SMBus can cause devices +Discovering the IPMI compliant BMC on the SMBus can cause devices on the I2C bus to fail. The SMBus driver writes a "Get Device ID" IPMI message as a block write to the I2C bus and waits for a response. This action can be detrimental to some I2C devices. It is highly recommended that the known I2c address be given to the SMBus driver in the smb_addr -parameter. The default adrress range will not be used when a smb_addr +parameter. The default address range will not be used when a smb_addr parameter is provided. When compiled into the kernel, the addresses can be specified on the @@ -491,7 +527,10 @@ used to control it: modprobe ipmi_watchdog timeout= pretimeout= action= preaction= preop= start_now=x - nowayout=x + nowayout=x ifnum_to_use=n + +ifnum_to_use specifies which interface the watchdog timer should use. +The default is -1, which means to pick the first one registered. The timeout is the number of seconds to the action, and the pretimeout is the amount of seconds before the reset that the pre-timeout panic will @@ -613,5 +652,9 @@ command line. The parameter is also available via the proc filesystem in /proc/sys/dev/ipmi/poweroff_powercycle. Note that if the system does not support power cycling, it will always do the power off. +The "ifnum_to_use" parameter specifies which interface the poweroff +code should use. The default is -1, which means to pick the first one +registered. + Note that if you have ACPI enabled, the system will prefer using ACPI to power off. diff --git a/Documentation/MSI-HOWTO.txt b/Documentation/MSI-HOWTO.txt index 3ec6c720b0166b4a76c576391414c795898d47b5..d389388c733e6c718f87de37a885132af6519930 100644 --- a/Documentation/MSI-HOWTO.txt +++ b/Documentation/MSI-HOWTO.txt @@ -219,7 +219,7 @@ into the field vector of each element contained in a second argument. Note that the pre-assigned IOAPIC dev->irq is valid only if the device operates in PIN-IRQ assertion mode. In MSI-X mode, any attempt at using dev->irq by the device driver to request for interrupt service -may result unpredictabe behavior. +may result in unpredictable behavior. For each MSI-X vector granted, a device driver is responsible for calling other functions like request_irq(), enable_irq(), etc. to enable @@ -267,7 +267,7 @@ y = The number of MSI capable devices populated in the system. vector reserved to avoid the case where some MSI-X capable drivers may attempt to claim all available vector resources. -z = The number of MSI-X capable devices pupulated in the system. +z = The number of MSI-X capable devices populated in the system. This policy ensures that maximum (x - y) is distributed evenly among MSI-X capable devices. @@ -470,7 +470,68 @@ LOC: 324553 325068 ERR: 0 MIS: 0 -6. FAQ +6. MSI quirks + +Several PCI chipsets or devices are known to not support MSI. +The PCI stack provides 3 possible levels of MSI disabling: +* on a single device +* on all devices behind a specific bridge +* globally + +6.1. Disabling MSI on a single device + +Under some circumstances, it might be required to disable MSI on a +single device, It may be achived by either not calling pci_enable_msi() +or all, or setting the pci_dev->no_msi flag before (most of the time +in a quirk). + +6.2. Disabling MSI below a bridge + +The vast majority of MSI quirks are required by PCI bridges not +being able to route MSI between busses. In this case, MSI have to be +disabled on all devices behind this bridge. It is achieves by setting +the PCI_BUS_FLAGS_NO_MSI flag in the pci_bus->bus_flags of the bridge +subordinate bus. There is no need to set the same flag on bridges that +are below the broken brigde. When pci_enable_msi() is called to enable +MSI on a device, pci_msi_supported() takes care of checking the NO_MSI +flag in all parent busses of the device. + +Some bridges actually support dynamic MSI support enabling/disabling +by changing some bits in their PCI configuration space (especially +the Hypertransport chipsets such as the nVidia nForce and Serverworks +HT2000). It may then be required to update the NO_MSI flag on the +corresponding devices in the sysfs hierarchy. To enable MSI support +on device "0000:00:0e", do: + + echo 1 > /sys/bus/pci/devices/0000:00:0e/msi_bus + +To disable MSI support, echo 0 instead of 1. Note that it should be +used with caution since changing this value might break interrupts. + +6.3. Disabling MSI globally + +Some extreme cases may require to disable MSI globally on the system. +For now, the only known case is a Serverworks PCI-X chipsets (MSI are +not supported on several busses that are not all connected to the +chipset in the Linux PCI hierarchy). In the vast majority of other +cases, disabling only behind a specific bridge is enough. + +For debugging purpose, the user may also pass pci=nomsi on the kernel +command-line to explicitly disable MSI globally. But, once the appro- +priate quirks are added to the kernel, this option should not be +required anymore. + +6.4. Finding why MSI cannot be enabled on a device + +Assuming that MSI are not enabled on a device, you should look at +dmesg to find messages that quirks may output when disabling MSI +on some devices, some bridges or even globally. +Then, lspci -t gives the list of bridges above a device. Reading +/sys/bus/pci/devices/0000:00:0e/msi_bus will tell you whether MSI +are enabled (1) or disabled (0). In 0 is found in a single bridge +msi_bus file above the device, MSI cannot be enabled. + +7. FAQ Q1. Are there any limitations on using the MSI? diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt index 1d50cf0c905ef28d62d6e5eda83c6415721504a6..f4dffadbcb00af44b40b99dd55649731b57cb21d 100644 --- a/Documentation/RCU/checklist.txt +++ b/Documentation/RCU/checklist.txt @@ -221,3 +221,41 @@ over a rather long period of time, but improvements are always welcome! disable irq on a given acquisition of that lock will result in deadlock as soon as the RCU callback happens to interrupt that acquisition's critical section. + +13. SRCU (srcu_read_lock(), srcu_read_unlock(), and synchronize_srcu()) + may only be invoked from process context. Unlike other forms of + RCU, it -is- permissible to block in an SRCU read-side critical + section (demarked by srcu_read_lock() and srcu_read_unlock()), + hence the "SRCU": "sleepable RCU". Please note that if you + don't need to sleep in read-side critical sections, you should + be using RCU rather than SRCU, because RCU is almost always + faster and easier to use than is SRCU. + + Also unlike other forms of RCU, explicit initialization + and cleanup is required via init_srcu_struct() and + cleanup_srcu_struct(). These are passed a "struct srcu_struct" + that defines the scope of a given SRCU domain. Once initialized, + the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock() + and synchronize_srcu(). A given synchronize_srcu() waits only + for SRCU read-side critical sections governed by srcu_read_lock() + and srcu_read_unlock() calls that have been passd the same + srcu_struct. This property is what makes sleeping read-side + critical sections tolerable -- a given subsystem delays only + its own updates, not those of other subsystems using SRCU. + Therefore, SRCU is less prone to OOM the system than RCU would + be if RCU's read-side critical sections were permitted to + sleep. + + The ability to sleep in read-side critical sections does not + come for free. First, corresponding srcu_read_lock() and + srcu_read_unlock() calls must be passed the same srcu_struct. + Second, grace-period-detection overhead is amortized only + over those updates sharing a given srcu_struct, rather than + being globally amortized as they are for other forms of RCU. + Therefore, SRCU should be used in preference to rw_semaphore + only in extremely read-intensive situations, or in situations + requiring SRCU's read-side deadlock immunity or low read-side + realtime latency. + + Note that, rcu_assign_pointer() and rcu_dereference() relate to + SRCU just as they do to other forms of RCU. diff --git a/Documentation/RCU/rcu.txt b/Documentation/RCU/rcu.txt index 02e27bf1d36554d35038bdb6c962c2f091618d0a..f84407cba8160a47ea57e46a640a54efafb905c5 100644 --- a/Documentation/RCU/rcu.txt +++ b/Documentation/RCU/rcu.txt @@ -45,7 +45,8 @@ o How can I see where RCU is currently used in the Linux kernel? Search for "rcu_read_lock", "rcu_read_unlock", "call_rcu", "rcu_read_lock_bh", "rcu_read_unlock_bh", "call_rcu_bh", - "synchronize_rcu", and "synchronize_net". + "srcu_read_lock", "srcu_read_unlock", "synchronize_rcu", + "synchronize_net", and "synchronize_srcu". o What guidelines should I follow when writing code that uses RCU? diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt index a4948591607d0e1d7b653ce1f66c08e4831cbad1..25a3c3f7d378f344676652aec19fad62853fc6b1 100644 --- a/Documentation/RCU/torture.txt +++ b/Documentation/RCU/torture.txt @@ -28,6 +28,15 @@ nreaders This is the number of RCU reading threads supported. To properly exercise RCU implementations with preemptible read-side critical sections. +nfakewriters This is the number of RCU fake writer threads to run. Fake + writer threads repeatedly use the synchronous "wait for + current readers" function of the interface selected by + torture_type, with a delay between calls to allow for various + different numbers of writers running in parallel. + nfakewriters defaults to 4, which provides enough parallelism + to trigger special cases caused by multiple writers, such as + the synchronize_srcu() early return optimization. + stat_interval The number of seconds between output of torture statistics (via printk()). Regardless of the interval, statistics are printed when the module is unloaded. @@ -44,9 +53,12 @@ test_no_idle_hz Whether or not to test the ability of RCU to operate in a kernel that disables the scheduling-clock interrupt to idle CPUs. Boolean parameter, "1" to test, "0" otherwise. -torture_type The type of RCU to test: "rcu" for the rcu_read_lock() - API, "rcu_bh" for the rcu_read_lock_bh() API, and "srcu" - for the "srcu_read_lock()" API. +torture_type The type of RCU to test: "rcu" for the rcu_read_lock() API, + "rcu_sync" for rcu_read_lock() with synchronous reclamation, + "rcu_bh" for the rcu_read_lock_bh() API, "rcu_bh_sync" for + rcu_read_lock_bh() with synchronous reclamation, "srcu" for + the "srcu_read_lock()" API, and "sched" for the use of + preempt_disable() together with synchronize_sched(). verbose Enable debug printk()s. Default is disabled. @@ -118,6 +130,21 @@ o "Free-Block Circulation": Shows the number of torture structures as it is only incremented if a torture structure's counter somehow gets incremented farther than it should. +Different implementations of RCU can provide implementation-specific +additional information. For example, SRCU provides the following: + + srcu-torture: rtc: f8cf46a8 ver: 355 tfle: 0 rta: 356 rtaf: 0 rtf: 346 rtmbe: 0 + srcu-torture: Reader Pipe: 559738 939 0 0 0 0 0 0 0 0 0 + srcu-torture: Reader Batch: 560434 243 0 0 0 0 0 0 0 0 + srcu-torture: Free-Block Circulation: 355 354 353 352 351 350 349 348 347 346 0 + srcu-torture: per-CPU(idx=1): 0(0,1) 1(0,1) 2(0,0) 3(0,1) + +The first four lines are similar to those for RCU. The last line shows +the per-CPU counter state. The numbers in parentheses are the values +of the "old" and "current" counters for the corresponding CPU. The +"idx" value maps the "old" and "current" values to the underlying array, +and is useful for debugging. + USAGE diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt index 318df44259b37f6556cbe5b6b4e9e12d4a26259a..e0d6d99b8f9bb0dc17f647dbd8256aad1282dd2d 100644 --- a/Documentation/RCU/whatisRCU.txt +++ b/Documentation/RCU/whatisRCU.txt @@ -582,7 +582,7 @@ The rcu_read_lock() and rcu_read_unlock() primitive read-acquire and release a global reader-writer lock. The synchronize_rcu() primitive write-acquires this same lock, then immediately releases it. This means that once synchronize_rcu() exits, all RCU read-side -critical sections that were in progress before synchonize_rcu() was +critical sections that were in progress before synchronize_rcu() was called are guaranteed to have completed -- there is no way that synchronize_rcu() would have been able to write-acquire the lock otherwise. @@ -750,7 +750,7 @@ Or, for those who prefer a side-by-side listing: Either way, the differences are quite small. Read-side locking moves to rcu_read_lock() and rcu_read_unlock, update-side locking moves from -from a reader-writer lock to a simple spinlock, and a synchronize_rcu() +a reader-writer lock to a simple spinlock, and a synchronize_rcu() precedes the kfree(). However, there is one potential catch: the read-side and update-side @@ -778,6 +778,8 @@ Markers for RCU read-side critical sections: rcu_read_unlock rcu_read_lock_bh rcu_read_unlock_bh + srcu_read_lock + srcu_read_unlock RCU pointer/list traversal: @@ -804,6 +806,7 @@ RCU grace period: synchronize_net synchronize_sched synchronize_rcu + synchronize_srcu call_rcu call_rcu_bh diff --git a/Documentation/SubmitChecklist b/Documentation/SubmitChecklist index a10bfb6ecd9fa4e55c9beecd2e588c9a0359ec71..2270efa101530f8f1de58e951d9c7b4a1f6969d0 100644 --- a/Documentation/SubmitChecklist +++ b/Documentation/SubmitChecklist @@ -61,3 +61,14 @@ kernel patches. Documentation/kernel-parameters.txt. 18: All new module parameters are documented with MODULE_PARM_DESC() + +19: All new userspace interfaces are documented in Documentation/ABI/. + See Documentation/ABI/README for more information. + +20: Check that it all passes `make headers_check'. + +21: Has been checked with injection of at least slab and page-allocation + fauilures. See Documentation/fault-injection/. + + If the new code is substantial, addition of subsystem-specific fault + injection might be appropriate. diff --git a/Documentation/SubmittingDrivers b/Documentation/SubmittingDrivers index 6bd30fdd0786b9a7c64aa4c34f5a732b4c7096e0..58bead05eabb057fb777bdd5fe7a56b44f35b0e0 100644 --- a/Documentation/SubmittingDrivers +++ b/Documentation/SubmittingDrivers @@ -59,11 +59,11 @@ Copyright: The copyright owner must agree to use of GPL. are the same person/entity. If not, the name of the person/entity authorizing use of GPL should be listed in case it's necessary to verify the will of - the copright owner. + the copyright owner. Interfaces: If your driver uses existing interfaces and behaves like other drivers in the same class it will be much more likely - to be accepted than if it invents gratuitous new ones. + to be accepted than if it invents gratuitous new ones. If you need to implement a common API over Linux and NT drivers do it in userspace. @@ -88,7 +88,7 @@ Clarity: It helps if anyone can see how to fix the driver. It helps it will go in the bitbucket. Control: In general if there is active maintainance of a driver by - the author then patches will be redirected to them unless + the author then patches will be redirected to them unless they are totally obvious and without need of checking. If you want to be the contact and update point for the driver it is a good idea to state this in the comments, @@ -100,7 +100,7 @@ What Criteria Do Not Determine Acceptance Vendor: Being the hardware vendor and maintaining the driver is often a good thing. If there is a stable working driver from other people already in the tree don't expect 'we are the - vendor' to get your driver chosen. Ideally work with the + vendor' to get your driver chosen. Ideally work with the existing driver author to build a single perfect driver. Author: It doesn't matter if a large Linux company wrote the driver, @@ -116,17 +116,13 @@ Linux kernel master tree: ftp.??.kernel.org:/pub/linux/kernel/... ?? == your country code, such as "us", "uk", "fr", etc. -Linux kernel mailing list: +Linux kernel mailing list: linux-kernel@vger.kernel.org [mail majordomo@vger.kernel.org to subscribe] Linux Device Drivers, Third Edition (covers 2.6.10): http://lwn.net/Kernel/LDD3/ (free version) -Kernel traffic: - Weekly summary of kernel list activity (much easier to read) - http://www.kerneltraffic.org/kernel-traffic/ - LWN.net: Weekly summary of kernel development activity - http://lwn.net/ 2.6 API changes: @@ -145,11 +141,8 @@ KernelNewbies: Linux USB project: http://www.linux-usb.org/ -How to NOT write kernel driver by arjanv@redhat.com - http://people.redhat.com/arjanv/olspaper.pdf +How to NOT write kernel driver by Arjan van de Ven: + http://www.fenrus.org/how-to-not-write-a-device-driver-paper.pdf Kernel Janitor: http://janitor.kernelnewbies.org/ - --- -Last updated on 17 Nov 2005. diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches index d42ab4c9e893b787d0918cc9f5a540a785a11aee..302d148c2e18f0e0fe565bc9c9c1a2f4b232d346 100644 --- a/Documentation/SubmittingPatches +++ b/Documentation/SubmittingPatches @@ -173,15 +173,15 @@ For small patches you may want to CC the Trivial Patch Monkey trivial@kernel.org managed by Adrian Bunk; which collects "trivial" patches. Trivial patches must qualify for one of the following rules: Spelling fixes in documentation - Spelling fixes which could break grep(1). + Spelling fixes which could break grep(1) Warning fixes (cluttering with useless warnings is bad) Compilation fixes (only if they are actually correct) Runtime fixes (only if they actually fix things) - Removing use of deprecated functions/macros (eg. check_region). + Removing use of deprecated functions/macros (eg. check_region) Contact detail and documentation fixes Non-portable code replaced by portable code (even in arch-specific, since people copy, as long as it's trivial) - Any fix by the author/maintainer of the file. (ie. patch monkey + Any fix by the author/maintainer of the file (ie. patch monkey in re-transmission mode) URL: @@ -209,6 +209,19 @@ Exception: If your mailer is mangling patches then someone may ask you to re-send them using MIME. +WARNING: Some mailers like Mozilla send your messages with +---- message header ---- +Content-Type: text/plain; charset=us-ascii; format=flowed +---- message header ---- +The problem is that "format=flowed" makes some of the mailers +on receiving side to replace TABs with spaces and do similar +changes. Thus the patches from you can look corrupted. + +To fix this just make your mozilla defaults/pref/mailnews.js file to look like: +pref("mailnews.send_plaintext_flowed", false); // RFC 2646======= +pref("mailnews.display.disable_format_flowed_support", true); + + 7) E-mail size. @@ -245,13 +258,13 @@ updated change. It is quite common for Linus to "drop" your patch without comment. That's the nature of the system. If he drops your patch, it could be due to -* Your patch did not apply cleanly to the latest kernel version +* Your patch did not apply cleanly to the latest kernel version. * Your patch was not sufficiently discussed on linux-kernel. -* A style issue (see section 2), -* An e-mail formatting issue (re-read this section) -* A technical problem with your change -* He gets tons of e-mail, and yours got lost in the shuffle -* You are being annoying (See Figure 1) +* A style issue (see section 2). +* An e-mail formatting issue (re-read this section). +* A technical problem with your change. +* He gets tons of e-mail, and yours got lost in the shuffle. +* You are being annoying. When in doubt, solicit comments on linux-kernel mailing list. @@ -476,10 +489,10 @@ SECTION 3 - REFERENCES Andrew Morton, "The perfect patch" (tpp). -Jeff Garzik, "Linux kernel patch submission format." +Jeff Garzik, "Linux kernel patch submission format". -Greg Kroah-Hartman "How to piss off a kernel subsystem maintainer". +Greg Kroah-Hartman, "How to piss off a kernel subsystem maintainer". @@ -488,9 +501,9 @@ Greg Kroah-Hartman "How to piss off a kernel subsystem maintainer". NO!!!! No more huge patch bombs to linux-kernel@vger.kernel.org people! -Kernel Documentation/CodingStyle +Kernel Documentation/CodingStyle: -Linus Torvald's mail on the canonical patch format: +Linus Torvalds's mail on the canonical patch format: -- diff --git a/Documentation/accounting/getdelays.c b/Documentation/accounting/getdelays.c index 795ca3911cc58f7fd043a6e51b95df8e9bff449f..e9126e794ed7c04c19c42fb318dd36295d8806da 100644 --- a/Documentation/accounting/getdelays.c +++ b/Documentation/accounting/getdelays.c @@ -7,6 +7,8 @@ * Copyright (C) Balbir Singh, IBM Corp. 2006 * Copyright (c) Jay Lan, SGI. 2006 * + * Compile with + * gcc -I/usr/src/linux/include getdelays.c -o getdelays */ #include @@ -35,13 +37,20 @@ #define NLA_DATA(na) ((void *)((char*)(na) + NLA_HDRLEN)) #define NLA_PAYLOAD(len) (len - NLA_HDRLEN) -#define err(code, fmt, arg...) do { printf(fmt, ##arg); exit(code); } while (0) -int done = 0; -int rcvbufsz=0; - - char name[100]; -int dbg=0, print_delays=0; +#define err(code, fmt, arg...) \ + do { \ + fprintf(stderr, fmt, ##arg); \ + exit(code); \ + } while (0) + +int done; +int rcvbufsz; +char name[100]; +int dbg; +int print_delays; +int print_io_accounting; __u64 stime, utime; + #define PRINTF(fmt, arg...) { \ if (dbg) { \ printf(fmt, ##arg); \ @@ -49,7 +58,7 @@ __u64 stime, utime; } /* Maximum size of response requested or message sent */ -#define MAX_MSG_SIZE 256 +#define MAX_MSG_SIZE 1024 /* Maximum number of cpus expected to be specified in a cpumask */ #define MAX_CPUS 32 /* Maximum length of pathname to log file */ @@ -78,8 +87,9 @@ static int create_nl_socket(int protocol) if (rcvbufsz) if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbufsz, sizeof(rcvbufsz)) < 0) { - printf("Unable to set socket rcv buf size to %d\n", - rcvbufsz); + fprintf(stderr, "Unable to set socket rcv buf size " + "to %d\n", + rcvbufsz); return -1; } @@ -186,6 +196,15 @@ void print_delayacct(struct taskstats *t) "count", "delay total", t->swapin_count, t->swapin_delay_total); } +void print_ioacct(struct taskstats *t) +{ + printf("%s: read=%llu, write=%llu, cancelled_write=%llu\n", + t->ac_comm, + (unsigned long long)t->read_bytes, + (unsigned long long)t->write_bytes, + (unsigned long long)t->cancelled_write_bytes); +} + int main(int argc, char *argv[]) { int c, rc, rep_len, aggr_len, len2, cmd_type; @@ -208,7 +227,7 @@ int main(int argc, char *argv[]) struct msgtemplate msg; while (1) { - c = getopt(argc, argv, "dw:r:m:t:p:v:l"); + c = getopt(argc, argv, "diw:r:m:t:p:v:l"); if (c < 0) break; @@ -217,6 +236,10 @@ int main(int argc, char *argv[]) printf("print delayacct stats ON\n"); print_delays = 1; break; + case 'i': + printf("printing IO accounting\n"); + print_io_accounting = 1; + break; case 'w': strncpy(logfile, optarg, MAX_FILENAME); printf("write to file %s\n", logfile); @@ -238,14 +261,12 @@ int main(int argc, char *argv[]) if (!tid) err(1, "Invalid tgid\n"); cmd_type = TASKSTATS_CMD_ATTR_TGID; - print_delays = 1; break; case 'p': tid = atoi(optarg); if (!tid) err(1, "Invalid pid\n"); cmd_type = TASKSTATS_CMD_ATTR_PID; - print_delays = 1; break; case 'v': printf("debug on\n"); @@ -277,7 +298,7 @@ int main(int argc, char *argv[]) mypid = getpid(); id = get_family_id(nl_sd); if (!id) { - printf("Error getting family id, errno %d", errno); + fprintf(stderr, "Error getting family id, errno %d\n", errno); goto err; } PRINTF("family id %d\n", id); @@ -285,10 +306,10 @@ int main(int argc, char *argv[]) if (maskset) { rc = send_cmd(nl_sd, id, mypid, TASKSTATS_CMD_GET, TASKSTATS_CMD_ATTR_REGISTER_CPUMASK, - &cpumask, sizeof(cpumask)); + &cpumask, strlen(cpumask) + 1); PRINTF("Sent register cpumask, retval %d\n", rc); if (rc < 0) { - printf("error sending register cpumask\n"); + fprintf(stderr, "error sending register cpumask\n"); goto err; } } @@ -298,7 +319,7 @@ int main(int argc, char *argv[]) cmd_type, &tid, sizeof(__u32)); PRINTF("Sent pid/tgid, retval %d\n", rc); if (rc < 0) { - printf("error sending tid/tgid cmd\n"); + fprintf(stderr, "error sending tid/tgid cmd\n"); goto done; } } @@ -310,12 +331,15 @@ int main(int argc, char *argv[]) PRINTF("received %d bytes\n", rep_len); if (rep_len < 0) { - printf("nonfatal reply error: errno %d\n", errno); + fprintf(stderr, "nonfatal reply error: errno %d\n", + errno); continue; } if (msg.n.nlmsg_type == NLMSG_ERROR || !NLMSG_OK((&msg.n), rep_len)) { - printf("fatal reply error, errno %d\n", errno); + struct nlmsgerr *err = NLMSG_DATA(&msg); + fprintf(stderr, "fatal reply error, errno %d\n", + err->error); goto done; } @@ -355,6 +379,8 @@ int main(int argc, char *argv[]) count++; if (print_delays) print_delayacct((struct taskstats *) NLA_DATA(na)); + if (print_io_accounting) + print_ioacct((struct taskstats *) NLA_DATA(na)); if (fd) { if (write(fd, NLA_DATA(na), na->nla_len) < 0) { err(1,"write error\n"); @@ -364,7 +390,9 @@ int main(int argc, char *argv[]) goto done; break; default: - printf("Unknown nested nla_type %d\n", na->nla_type); + fprintf(stderr, "Unknown nested" + " nla_type %d\n", + na->nla_type); break; } len2 += NLA_ALIGN(na->nla_len); @@ -373,7 +401,8 @@ int main(int argc, char *argv[]) break; default: - printf("Unknown nla_type %d\n", na->nla_type); + fprintf(stderr, "Unknown nla_type %d\n", + na->nla_type); break; } na = (struct nlattr *) (GENLMSG_DATA(&msg) + len); @@ -383,7 +412,7 @@ done: if (maskset) { rc = send_cmd(nl_sd, id, mypid, TASKSTATS_CMD_GET, TASKSTATS_CMD_ATTR_DEREGISTER_CPUMASK, - &cpumask, sizeof(cpumask)); + &cpumask, strlen(cpumask) + 1); printf("Sent deregister mask, retval %d\n", rc); if (rc < 0) err(rc, "error sending deregister cpumask\n"); diff --git a/Documentation/accounting/taskstats-struct.txt b/Documentation/accounting/taskstats-struct.txt new file mode 100644 index 0000000000000000000000000000000000000000..661c797eaf79dc791997fa31a4b3823132735244 --- /dev/null +++ b/Documentation/accounting/taskstats-struct.txt @@ -0,0 +1,161 @@ +The struct taskstats +-------------------- + +This document contains an explanation of the struct taskstats fields. + +There are three different groups of fields in the struct taskstats: + +1) Common and basic accounting fields + If CONFIG_TASKSTATS is set, the taskstats inteface is enabled and + the common fields and basic accounting fields are collected for + delivery at do_exit() of a task. +2) Delay accounting fields + These fields are placed between + /* Delay accounting fields start */ + and + /* Delay accounting fields end */ + Their values are collected if CONFIG_TASK_DELAY_ACCT is set. +3) Extended accounting fields + These fields are placed between + /* Extended accounting fields start */ + and + /* Extended accounting fields end */ + Their values are collected if CONFIG_TASK_XACCT is set. + +Future extension should add fields to the end of the taskstats struct, and +should not change the relative position of each field within the struct. + + +struct taskstats { + +1) Common and basic accounting fields: + /* The version number of this struct. This field is always set to + * TAKSTATS_VERSION, which is defined in . + * Each time the struct is changed, the value should be incremented. + */ + __u16 version; + + /* The exit code of a task. */ + __u32 ac_exitcode; /* Exit status */ + + /* The accounting flags of a task as defined in + * Defined values are AFORK, ASU, ACOMPAT, ACORE, and AXSIG. + */ + __u8 ac_flag; /* Record flags */ + + /* The value of task_nice() of a task. */ + __u8 ac_nice; /* task_nice */ + + /* The name of the command that started this task. */ + char ac_comm[TS_COMM_LEN]; /* Command name */ + + /* The scheduling discipline as set in task->policy field. */ + __u8 ac_sched; /* Scheduling discipline */ + + __u8 ac_pad[3]; + __u32 ac_uid; /* User ID */ + __u32 ac_gid; /* Group ID */ + __u32 ac_pid; /* Process ID */ + __u32 ac_ppid; /* Parent process ID */ + + /* The time when a task begins, in [secs] since 1970. */ + __u32 ac_btime; /* Begin time [sec since 1970] */ + + /* The elapsed time of a task, in [usec]. */ + __u64 ac_etime; /* Elapsed time [usec] */ + + /* The user CPU time of a task, in [usec]. */ + __u64 ac_utime; /* User CPU time [usec] */ + + /* The system CPU time of a task, in [usec]. */ + __u64 ac_stime; /* System CPU time [usec] */ + + /* The minor page fault count of a task, as set in task->min_flt. */ + __u64 ac_minflt; /* Minor Page Fault Count */ + + /* The major page fault count of a task, as set in task->maj_flt. */ + __u64 ac_majflt; /* Major Page Fault Count */ + + +2) Delay accounting fields: + /* Delay accounting fields start + * + * All values, until the comment "Delay accounting fields end" are + * available only if delay accounting is enabled, even though the last + * few fields are not delays + * + * xxx_count is the number of delay values recorded + * xxx_delay_total is the corresponding cumulative delay in nanoseconds + * + * xxx_delay_total wraps around to zero on overflow + * xxx_count incremented regardless of overflow + */ + + /* Delay waiting for cpu, while runnable + * count, delay_total NOT updated atomically + */ + __u64 cpu_count; + __u64 cpu_delay_total; + + /* Following four fields atomically updated using task->delays->lock */ + + /* Delay waiting for synchronous block I/O to complete + * does not account for delays in I/O submission + */ + __u64 blkio_count; + __u64 blkio_delay_total; + + /* Delay waiting for page fault I/O (swap in only) */ + __u64 swapin_count; + __u64 swapin_delay_total; + + /* cpu "wall-clock" running time + * On some architectures, value will adjust for cpu time stolen + * from the kernel in involuntary waits due to virtualization. + * Value is cumulative, in nanoseconds, without a corresponding count + * and wraps around to zero silently on overflow + */ + __u64 cpu_run_real_total; + + /* cpu "virtual" running time + * Uses time intervals seen by the kernel i.e. no adjustment + * for kernel's involuntary waits due to virtualization. + * Value is cumulative, in nanoseconds, without a corresponding count + * and wraps around to zero silently on overflow + */ + __u64 cpu_run_virtual_total; + /* Delay accounting fields end */ + /* version 1 ends here */ + + +3) Extended accounting fields + /* Extended accounting fields start */ + + /* Accumulated RSS usage in duration of a task, in MBytes-usecs. + * The current rss usage is added to this counter every time + * a tick is charged to a task's system time. So, at the end we + * will have memory usage multiplied by system time. Thus an + * average usage per system time unit can be calculated. + */ + __u64 coremem; /* accumulated RSS usage in MB-usec */ + + /* Accumulated virtual memory usage in duration of a task. + * Same as acct_rss_mem1 above except that we keep track of VM usage. + */ + __u64 virtmem; /* accumulated VM usage in MB-usec */ + + /* High watermark of RSS usage in duration of a task, in KBytes. */ + __u64 hiwater_rss; /* High-watermark of RSS usage */ + + /* High watermark of VM usage in duration of a task, in KBytes. */ + __u64 hiwater_vm; /* High-water virtual memory usage */ + + /* The following four fields are I/O statistics of a task. */ + __u64 read_char; /* bytes read */ + __u64 write_char; /* bytes written */ + __u64 read_syscalls; /* read syscalls */ + __u64 write_syscalls; /* write syscalls */ + + /* Extended accounting fields end */ + +} diff --git a/Documentation/accounting/taskstats.txt b/Documentation/accounting/taskstats.txt index 92ebf29e9041cef2b68fc0c3c33698f8558ff085..ff06b738bb88065b28f6006c7a6381d5af7d57dd 100644 --- a/Documentation/accounting/taskstats.txt +++ b/Documentation/accounting/taskstats.txt @@ -96,9 +96,9 @@ a) TASKSTATS_TYPE_AGGR_PID/TGID : attribute containing no payload but indicates a pid/tgid will be followed by some stats. b) TASKSTATS_TYPE_PID/TGID: attribute whose payload is the pid/tgid whose stats -is being returned. +are being returned. -c) TASKSTATS_TYPE_STATS: attribute with a struct taskstsats as payload. The +c) TASKSTATS_TYPE_STATS: attribute with a struct taskstats as payload. The same structure is used for both per-pid and per-tgid stats. 3. New message sent by kernel whenever a task exits. The payload consists of a @@ -122,12 +122,12 @@ of atomicity). However, maintaining per-process, in addition to per-task stats, within the kernel has space and time overheads. To address this, the taskstats code -accumalates each exiting task's statistics into a process-wide data structure. -When the last task of a process exits, the process level data accumalated also +accumulates each exiting task's statistics into a process-wide data structure. +When the last task of a process exits, the process level data accumulated also gets sent to userspace (along with the per-task data). When a user queries to get per-tgid data, the sum of all other live threads in -the group is added up and added to the accumalated total for previously exited +the group is added up and added to the accumulated total for previously exited threads of the same thread group. Extending taskstats diff --git a/Documentation/aoe/todo.txt b/Documentation/aoe/todo.txt index 7fee1e1165bcf0a563176fd340be41e9d248e860..c09dfad4aed8f95be03a5088cdb7f824c8f3f774 100644 --- a/Documentation/aoe/todo.txt +++ b/Documentation/aoe/todo.txt @@ -7,7 +7,7 @@ not been observed, but it would be nice to eliminate any potential for deadlock under memory pressure. Because ATA over Ethernet is not fragmented by the kernel's IP code, -the destructore member of the struct sk_buff is available to the aoe +the destructor member of the struct sk_buff is available to the aoe driver. By using a mempool for allocating all but the first few sk_buffs, and by registering a destructor, we should be able to efficiently allocate sk_buffs without introducing any potential for diff --git a/Documentation/arm/SA1100/serial_UART b/Documentation/arm/SA1100/serial_UART index aea2e91ca0efe96fb1783e0a84b586a625308cda..a63966f1d08358414c339c5d6dbc7782be4131c8 100644 --- a/Documentation/arm/SA1100/serial_UART +++ b/Documentation/arm/SA1100/serial_UART @@ -24,8 +24,8 @@ The SA1100 serial port had its major/minor numbers officially assigned: > 7 = /dev/cusa2 Callout device for ttySA2 > -If you're not using devfs, you must create those inodes in /dev -on the root filesystem used by your SA1100-based device: +You must create those inodes in /dev on the root filesystem used +by your SA1100-based device: mknod ttySA0 c 204 5 mknod ttySA1 c 204 6 diff --git a/Documentation/arm/Samsung-S3C24XX/EB2410ITX.txt b/Documentation/arm/Samsung-S3C24XX/EB2410ITX.txt index 000e3d7a78b23cd27fc74ce0eccdec7681a6ef81..26422f0f9080386dd5c3a8986757b14eb3df5d1e 100644 --- a/Documentation/arm/Samsung-S3C24XX/EB2410ITX.txt +++ b/Documentation/arm/Samsung-S3C24XX/EB2410ITX.txt @@ -38,7 +38,7 @@ MTD --- The NAND and NOR support has been merged from the linux-mtd project. - Any prolbems, see http://www.linux-mtd.infradead.org/ for more + Any problems, see http://www.linux-mtd.infradead.org/ for more information or up-to-date versions of linux-mtd. diff --git a/Documentation/arm/Samsung-S3C24XX/GPIO.txt b/Documentation/arm/Samsung-S3C24XX/GPIO.txt index 0822764ec2708ecd826518ba5c21d6f34564e59e..8caea8c237eec0b9a033414c53ab2ea30b313800 100644 --- a/Documentation/arm/Samsung-S3C24XX/GPIO.txt +++ b/Documentation/arm/Samsung-S3C24XX/GPIO.txt @@ -24,7 +24,7 @@ Headers header include/asm-arm/arch-s3c2410/hardware.h which can be included by #include - A useful ammount of documentation can be found in the hardware + A useful amount of documentation can be found in the hardware header on how the GPIO functions (and others) work. Whilst a number of these functions do make some checks on what diff --git a/Documentation/arm/Samsung-S3C24XX/Overview.txt b/Documentation/arm/Samsung-S3C24XX/Overview.txt index 3e46d2a3115814a67e6990a9bba8e09e77f80854..dda7ecdde87bab1de56867759f2a3aaab246db3d 100644 --- a/Documentation/arm/Samsung-S3C24XX/Overview.txt +++ b/Documentation/arm/Samsung-S3C24XX/Overview.txt @@ -80,7 +80,7 @@ Machines Adding New Machines ------------------- - The archicture has been designed to support as many machines as can + The architecture has been designed to support as many machines as can be configured for it in one kernel build, and any future additions should keep this in mind before altering items outside of their own machine files. diff --git a/Documentation/arm/Samsung-S3C24XX/S3C2412.txt b/Documentation/arm/Samsung-S3C24XX/S3C2412.txt index cb82a7fc7901302d50958532e6fb0994980ab8a3..295d971a15ed49d1c7d60fb31affbdd1e5119a1c 100644 --- a/Documentation/arm/Samsung-S3C24XX/S3C2412.txt +++ b/Documentation/arm/Samsung-S3C24XX/S3C2412.txt @@ -80,7 +80,7 @@ RTC Watchdog -------- - The watchdog harware is the same as the S3C2410, and is supported by + The watchdog hardware is the same as the S3C2410, and is supported by the s3c2410_wdt driver. diff --git a/Documentation/block/as-iosched.txt b/Documentation/block/as-iosched.txt index 6f47332c883de7366d7d8f73fa5036d55136480c..a598fe10a2974f5757761df5ab8f7f98c5c5f84a 100644 --- a/Documentation/block/as-iosched.txt +++ b/Documentation/block/as-iosched.txt @@ -24,8 +24,10 @@ very similar behavior to the deadline IO scheduler. Selecting IO schedulers ----------------------- To choose IO schedulers at boot time, use the argument 'elevator=deadline'. -'noop' and 'as' (the default) are also available. IO schedulers are assigned -globally at boot time only presently. +'noop', 'as' and 'cfq' (the default) are also available. IO schedulers are +assigned globally at boot time only presently. It's also possible to change +the IO scheduler for a determined device on the fly, as described in +Documentation/block/switching-sched.txt. Anticipatory IO scheduler Policies @@ -99,8 +101,8 @@ contrast, many write requests may be dispatched to the disk controller at a time during a write batch. It is this characteristic that can make the anticipatory scheduler perform anomalously with controllers supporting TCQ, or with hardware striped RAID devices. Setting the antic_expire -queue paramter (see below) to zero disables this behavior, and the anticipatory -scheduler behaves essentially like the deadline scheduler. +queue parameter (see below) to zero disables this behavior, and the +anticipatory scheduler behaves essentially like the deadline scheduler. When read anticipation is enabled (antic_expire is not zero), reads are dispatched to the disk controller one at a time. diff --git a/Documentation/block/barrier.txt b/Documentation/block/barrier.txt index 03971518b22254781bce94bea5c749f0a2cd1e7f..a272c3db80940bd5821e8b2f28068e3fe5d5a65b 100644 --- a/Documentation/block/barrier.txt +++ b/Documentation/block/barrier.txt @@ -25,7 +25,7 @@ of the following three ways. i. For devices which have queue depth greater than 1 (TCQ devices) and support ordered tags, block layer can just issue the barrier as an ordered request and the lower level driver, controller and drive -itself are responsible for making sure that the ordering contraint is +itself are responsible for making sure that the ordering constraint is met. Most modern SCSI controllers/drives should support this. NOTE: SCSI ordered tag isn't currently used due to limitation in the @@ -42,7 +42,7 @@ iii. Devices which have queue depth of 1. This is a degenerate case of ii. Just keeping issue order suffices. Ancient SCSI controllers/drives and IDE drives are in this category. -2. Forced flushing to physcial medium +2. Forced flushing to physical medium Again, if you're not gonna do synchronization with disk drives (dang, it sounds even more appealing now!), the reason you use I/O barriers @@ -56,7 +56,7 @@ There are four cases, i. No write-back cache. Keeping requests ordered is enough. ii. Write-back cache but no flush operation. There's no way to -gurantee physical-medium commit order. This kind of devices can't to +guarantee physical-medium commit order. This kind of devices can't to I/O barriers. iii. Write-back cache and flush operation but no FUA (forced unit diff --git a/Documentation/block/biodoc.txt b/Documentation/block/biodoc.txt index f989a9e839b49abde7b12bfea3926aca9b3635a3..c6c9a9c10d7f88b5f894990e45878acadbb2ea64 100644 --- a/Documentation/block/biodoc.txt +++ b/Documentation/block/biodoc.txt @@ -135,7 +135,7 @@ Some new queue property settings: Sets two variables that limit the size of the request. - The request queue's max_sectors, which is a soft size in - in units of 512 byte sectors, and could be dynamically varied + units of 512 byte sectors, and could be dynamically varied by the core kernel. - The request queue's max_hw_sectors, which is a hard limit @@ -183,7 +183,7 @@ it, the pci dma mapping routines and associated data structures have now been modified to accomplish a direct page -> bus translation, without requiring a virtual address mapping (unlike the earlier scheme of virtual address -> bus translation). So this works uniformly for high-memory pages (which -do not have a correponding kernel virtual address space mapping) and +do not have a corresponding kernel virtual address space mapping) and low-memory pages. Note: Please refer to DMA-mapping.txt for a discussion on PCI high mem DMA @@ -391,7 +391,7 @@ forced such requests to be broken up into small chunks before being passed on to the generic block layer, only to be merged by the i/o scheduler when the underlying device was capable of handling the i/o in one shot. Also, using the buffer head as an i/o structure for i/os that didn't originate -from the buffer cache unecessarily added to the weight of the descriptors +from the buffer cache unnecessarily added to the weight of the descriptors which were generated for each such chunk. The following were some of the goals and expectations considered in the @@ -403,14 +403,14 @@ i. Should be appropriate as a descriptor for both raw and buffered i/o - for raw i/o. ii. Ability to represent high-memory buffers (which do not have a virtual address mapping in kernel address space). -iii.Ability to represent large i/os w/o unecessarily breaking them up (i.e +iii.Ability to represent large i/os w/o unnecessarily breaking them up (i.e greater than PAGE_SIZE chunks in one shot) iv. At the same time, ability to retain independent identity of i/os from different sources or i/o units requiring individual completion (e.g. for latency reasons) v. Ability to represent an i/o involving multiple physical memory segments (including non-page aligned page fragments, as specified via readv/writev) - without unecessarily breaking it up, if the underlying device is capable of + without unnecessarily breaking it up, if the underlying device is capable of handling it. vi. Preferably should be based on a memory descriptor structure that can be passed around different types of subsystems or layers, maybe even @@ -783,7 +783,7 @@ all the outstanding requests. There's a third helper to do that: blk_queue_invalidate_tags(request_queue_t *q) - Clear the internal block tag queue and readd all the pending requests + Clear the internal block tag queue and re-add all the pending requests to the request queue. The driver will receive them again on the next request_fn run, just like it did the first time it encountered them. @@ -890,7 +890,7 @@ Aside: Kvec i/o: - Ben LaHaise's aio code uses a slighly different structure instead + Ben LaHaise's aio code uses a slightly different structure instead of kiobufs, called a kvec_cb. This contains an array of tuples (very much like the networking code), together with a callback function and data pointer. This is embedded into a brw_cb structure when passed @@ -988,7 +988,7 @@ elevator_exit_fn Allocate and free any elevator specific storage for a queue. 4.2 Request flows seen by I/O schedulers -All requests seens by I/O schedulers strictly follow one of the following three +All requests seen by I/O schedulers strictly follow one of the following three flows. set_req_fn -> @@ -1013,7 +1013,7 @@ Characteristics: i. Binary tree AS and deadline i/o schedulers use red black binary trees for disk position sorting and searching, and a fifo linked list for time-based searching. This -gives good scalability and good availablility of information. Requests are +gives good scalability and good availability of information. Requests are almost always dispatched in disk sort order, so a cache is kept of the next request in sort order to prevent binary tree lookups. @@ -1203,6 +1203,6 @@ temporarily map a bio into the virtual address space. and Linus' comments - Jan 2001) 9.2 Discussions about kiobuf and bh design on lkml between sct, linus, alan et al - Feb-March 2001 (many of the initial thoughts that led to bio were -brought up in this discusion thread) +brought up in this discussion thread) 9.3 Discussions on mempool on lkml - Dec 2001. diff --git a/Documentation/block/deadline-iosched.txt b/Documentation/block/deadline-iosched.txt index c918b3a6022dd04c48f396d81c372c7d7538a682..be08ffd1e9b82a52504c05ad657ebb4bd895d80b 100644 --- a/Documentation/block/deadline-iosched.txt +++ b/Documentation/block/deadline-iosched.txt @@ -23,11 +23,11 @@ you can do so by typing: read_expire (in ms) ----------- -The goal of the deadline io scheduler is to attempt to guarentee a start +The goal of the deadline io scheduler is to attempt to guarantee a start service time for a request. As we focus mainly on read latencies, this is tunable. When a read request first enters the io scheduler, it is assigned a deadline that is the current time + the read_expire value in units of -miliseconds. +milliseconds. write_expire (in ms) diff --git a/Documentation/cciss.txt b/Documentation/cciss.txt index 9c629ffa0e58ffa5710843a5c389660d46036681..f74affe5c8297f58b4d36dfac566e7411899f2c5 100644 --- a/Documentation/cciss.txt +++ b/Documentation/cciss.txt @@ -80,7 +80,7 @@ the /proc filesystem entry which the "block" side of the driver creates as the SCSI core may not yet be initialized (because the driver is a block driver) and attempting to register it with the SCSI core in such a case would cause a hang. This is best done via an initialization script -(typically in /etc/init.d, but could vary depending on distibution). +(typically in /etc/init.d, but could vary depending on distribution). For example: for x in /proc/driver/cciss/cciss[0-9]* @@ -152,7 +152,7 @@ side during the SCSI error recovery process, the cciss driver only implements the first two of these actions, aborting the command, and resetting the device. Additionally, most tape drives will not oblige in aborting commands, and sometimes it appears they will not even -obey a reset coommand, though in most circumstances they will. In +obey a reset command, though in most circumstances they will. In the case that the command cannot be aborted and the device cannot be reset, the device will be set offline. diff --git a/Documentation/cdrom/packet-writing.txt b/Documentation/cdrom/packet-writing.txt index 3d44c561fe6d935c868bc2d6a4755452cb56c1a1..7715d2247c4de4cf01045a7f46ce9ce5bf36a7a4 100644 --- a/Documentation/cdrom/packet-writing.txt +++ b/Documentation/cdrom/packet-writing.txt @@ -90,6 +90,41 @@ Notes to create an ext2 filesystem on the disc. +Using the pktcdvd sysfs interface +--------------------------------- + +Since Linux 2.6.19, the pktcdvd module has a sysfs interface +and can be controlled by it. For example the "pktcdvd" tool uses +this interface. (see http://people.freenet.de/BalaGi#pktcdvd ) + +"pktcdvd" works similar to "pktsetup", e.g.: + + # pktcdvd -a dev_name /dev/hdc + # mkudffs /dev/pktcdvd/dev_name + # mount -t udf -o rw,noatime /dev/pktcdvd/dev_name /dvdram + # cp files /dvdram + # umount /dvdram + # pktcdvd -r dev_name + + +For a description of the sysfs interface look into the file: + + Documentation/ABI/testing/sysfs-block-pktcdvd + + +Using the pktcdvd debugfs interface +----------------------------------- + +To read pktcdvd device infos in human readable form, do: + + # cat /debug/pktcdvd/pktcdvd[0-7]/info + +For a description of the debugfs interface look into the file: + + Documentation/ABI/testing/debugfs-pktcdvd + + + Links ----- diff --git a/Documentation/computone.txt b/Documentation/computone.txt index b1cf59b84d9727f718b434f0c128511dace766df..5e2a0c76bfa0b61a697772496a281ed42b6f13c4 100644 --- a/Documentation/computone.txt +++ b/Documentation/computone.txt @@ -199,30 +199,6 @@ boxes this will leave gaps in the sequence of device names. ip2mkdev uses Linux tty naming conventions: ttyF0 - ttyF255 for normal devices, and cuf0 - cuf255 for callout devices. -If you are using devfs, existing devices are automatically created within -the devfs name space. Normal devices will be tts/F0 - tts/F255 and callout -devices will be cua/F0 - cua/F255. With devfs installed, ip2mkdev will -create symbolic links in /dev from the old conventional names to the newer -devfs names as follows: - - /dev/ip2ipl[n] -> /dev/ip2/ipl[n] n = 0 - 3 - /dev/ip2stat[n] -> /dev/ip2/stat[n] n = 0 - 3 - /dev/ttyF[n] -> /dev/tts/F[n] n = 0 - 255 - /dev/cuf[n] -> /dev/cua/F[n] n = 0 - 255 - -Only devices for existing ports and boards will be created. - -IMPORTANT NOTE: The naming convention used for devfs by this driver -was changed from 1.2.12 to 1.2.13. The old naming convention was to -use ttf/%d for the tty device and cuf/%d for the cua device. That -has been changed to conform to an agreed-upon standard of placing -all the tty devices under tts. The device names are now tts/F%d for -the tty device and cua/F%d for the cua devices. If you were using -the older devfs names, you must update for the newer convention. - -You do not need to run ip2mkdev if you are using devfs and only want to -use the devfs native device names. - 4. USING THE DRIVERS @@ -256,57 +232,15 @@ cut out and run as "ip2mkdev" to create the necessary device files. To use the ip2mkdev script, you must have procfs enabled and the proc file system mounted on /proc. -You do not need to run ip2mkdev if you are using devfs and only want to -use the devfs native device names. - - -6. DEVFS - -DEVFS is the DEVice File System available as an add on package for the -2.2.x kernels and available as a configuration option in 2.3.46 and higher. -Devfs allows for the automatic creation and management of device names -under control of the device drivers themselves. The Devfs namespace is -hierarchical and reduces the clutter present in the normal flat /dev -namespace. Devfs names and conventional device names may be intermixed. -A userspace daemon, devfsd, exists to allow for automatic creation and -management of symbolic links from the devfs name space to the conventional -names. More details on devfs can be found on the DEVFS home site at - or in the file kernel -documentation files, .../linux/Documentation/filesystems/devfs/README. - -If you are using devfs, existing devices are automatically created within -the devfs name space. Normal devices will be tts/F0 - tts/F255 and callout -devices will be cua/F0 - cua/F255. With devfs installed, ip2mkdev will -create symbolic links in /dev from the old conventional names to the newer -devfs names as follows: - - /dev/ip2ipl[n] -> /dev/ip2/ipl[n] n = 0 - 3 - /dev/ip2stat[n] -> /dev/ip2/stat[n] n = 0 - 3 - /dev/ttyF[n] -> /dev/tts/F[n] n = 0 - 255 - /dev/cuf[n] -> /dev/cua/F[n] n = 0 - 255 - -Only devices for existing ports and boards will be created. - -IMPORTANT NOTE: The naming convention used for devfs by this driver -was changed from 1.2.12 to 1.2.13. The old naming convention was to -use ttf/%d for the tty device and cuf/%d for the cua device. That -has been changed to conform to an agreed-upon standard of placing -all the tty devices under tts. The device names are now tts/F%d for -the tty device and cua/F%d for the cua devices. If you were using -the older devfs names, you must update for the newer convention. - -You do not need to run ip2mkdev if you are using devfs and only want to -use the devfs native device names. - -7. NOTES +6. NOTES This is a release version of the driver, but it is impossible to test it in all configurations of Linux. If there is any anomalous behaviour that does not match the standard serial port's behaviour please let us know. -8. ip2mkdev shell script +7. ip2mkdev shell script Previously, this script was simply attached here. It is now attached as a shar archive to make it easier to extract the script from the documentation. diff --git a/Documentation/cpu-freq/cpufreq-nforce2.txt b/Documentation/cpu-freq/cpufreq-nforce2.txt index 9188337d8f6b8f2b3ee852463c72494164dfb398..babce13150265f173fc32dfa72ce1b91cf234afd 100644 --- a/Documentation/cpu-freq/cpufreq-nforce2.txt +++ b/Documentation/cpu-freq/cpufreq-nforce2.txt @@ -1,7 +1,7 @@ -The cpufreq-nforce2 driver changes the FSB on nVidia nForce2 plattforms. +The cpufreq-nforce2 driver changes the FSB on nVidia nForce2 platforms. -This works better than on other plattforms, because the FSB of the CPU +This works better than on other platforms, because the FSB of the CPU can be controlled independently from the PCI/AGP clock. The module has two options: diff --git a/Documentation/cpu-freq/cpufreq-stats.txt b/Documentation/cpu-freq/cpufreq-stats.txt index 6a82948ff4bd59adb26b08e305339b4452f79edc..53d62c1e1c94f68531756d20c80e50c48aa718b3 100644 --- a/Documentation/cpu-freq/cpufreq-stats.txt +++ b/Documentation/cpu-freq/cpufreq-stats.txt @@ -1,5 +1,5 @@ - CPU frequency and voltage scaling statictics in the Linux(TM) kernel + CPU frequency and voltage scaling statistics in the Linux(TM) kernel L i n u x c p u f r e q - s t a t s d r i v e r @@ -18,8 +18,8 @@ Contents 1. Introduction cpufreq-stats is a driver that provices CPU frequency statistics for each CPU. -This statistics is provided in /sysfs as a bunch of read_only interfaces. This -interface (when configured) will appear in a seperate directory under cpufreq +These statistics are provided in /sysfs as a bunch of read_only interfaces. This +interface (when configured) will appear in a separate directory under cpufreq in /sysfs (/devices/system/cpu/cpuX/cpufreq/stats/) for each CPU. Various statistics will form read_only files under this directory. @@ -53,7 +53,7 @@ drwxr-xr-x 3 root root 0 May 14 15:58 .. This gives the amount of time spent in each of the frequencies supported by this CPU. The cat output will have "