
Commit d40d9d29 authored by Jeff Garzik

Merge branch 'master'

parents 96a71d52 70d9d825
+3 −3
@@ -239,9 +239,9 @@ X!Ilib/string.c
     <title>Network device support</title>
     <sect1><title>Driver Support</title>
!Enet/core/dev.c
     </sect1>
     <sect1><title>8390 Based Network Cards</title>
!Edrivers/net/8390.c
!Enet/ethernet/eth.c
!Einclude/linux/etherdevice.h
!Enet/core/wireless.c
     </sect1>
     <sect1><title>Synchronous PPP</title>
!Edrivers/net/wan/syncppp.c
+40 −1
@@ -81,7 +81,8 @@ Adding New Machines

  Any large scale modifications, or new drivers should be discussed
  on the ARM kernel mailing list (linux-arm-kernel) before being
  attempted.
  attempted. See http://www.arm.linux.org.uk/mailinglists/ for the
  mailing list information.


NAND
@@ -120,6 +121,43 @@ Clock Management
  various clock units


Platform Data
-------------

  Whenever a device has platform specific data that is specified
  on a per-machine basis, care should be taken to ensure the
  following:

    1) that default data is not left in the device to confuse the
       driver if a machine does not set it at startup

    2) the data should (if possible) be marked as __initdata,
       to ensure that the data is thrown away if the machine is
       not the one currently in use.

       The best way of doing this is to make a function that
       kmalloc()s an area of memory, copies the __initdata into it,
       and then sets the relevant device's platform data. Making
       the function __init takes care of ensuring it is discarded
       with the rest of the initialisation code:

       static __init void s3c24xx_xxx_set_platdata(struct xxx_data *pd)
       {
           struct s3c2410_xxx_mach_info *npd;

           npd = kmalloc(sizeof(struct s3c2410_xxx_mach_info), GFP_KERNEL);
           if (npd) {
               memcpy(npd, pd, sizeof(struct s3c2410_xxx_mach_info));
               s3c_device_xxx.dev.platform_data = npd;
           } else {
               printk(KERN_ERR "no memory for xxx platform data\n");
           }
       }

       Note, since the code is marked as __init, it should not be
       exported outside arch/arm/mach-s3c2410/, or exported to
       modules via EXPORT_SYMBOL() and related functions.

Port Contributors
-----------------

@@ -149,6 +187,7 @@ Document Changes
  06 Mar 2005 - BJD - Added Christer Weinigel
  08 Mar 2005 - BJD - Added LCVR to list of people, updated introduction
  08 Mar 2005 - BJD - Added section on adding machines
  09 Sep 2005 - BJD - Added section on platform data

Document Author
---------------
+38 −4
@@ -50,9 +50,14 @@ userspace utilities, etc.
Features
========

- This is a complete rewrite of the NTFS driver that used to be in the kernel.
  This new driver implements NTFS read support and is functionally equivalent
  to the old ntfs driver.
- This is a complete rewrite of the NTFS driver that used to be in the 2.4 and
  earlier kernels.  This new driver implements NTFS read support and is
  functionally equivalent to the old ntfs driver and it also implements limited
  write support.  The biggest limitation at present is that files/directories
  cannot be created or deleted.  See below for the list of write features that
  are so far supported.  Another limitation is that writing to compressed files
  is not implemented at all.  Also, neither read nor write access to encrypted
  files is so far implemented.
- The new driver has full support for sparse files on NTFS 3.x volumes which
  the old driver isn't happy with.
- The new driver supports execution of binaries due to mmap() now being
@@ -78,7 +83,20 @@ Features
- The new driver supports fsync(2), fdatasync(2), and msync(2).
- The new driver supports readv(2) and writev(2).
- The new driver supports access time updates (including mtime and ctime).

- The new driver supports truncate(2) and open(2) with O_TRUNC.  But at present
  only very limited support for highly fragmented files, i.e. ones which have
  their data attribute split across multiple extents, is included.  Another
  limitation is that at present truncate(2) will never create sparse files,
  since to mark a file sparse we need to modify the directory entry for the
  file and we do not implement directory modifications yet.
- The new driver supports write(2) which can both overwrite existing data and
  extend the file size so that you can write beyond the existing data.  Also,
  writing into sparse regions is supported and the holes are filled in with
  clusters.  But at present only limited support for highly fragmented files,
  i.e. ones which have their data attribute split across multiple extents, is
  included.  Another limitation is that write(2) will never create sparse
  files, since to mark a file sparse we need to modify the directory entry for
  the file and we do not implement directory modifications yet.
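
The write(2) and truncate(2) behaviour described above can be exercised from
userspace with ordinary POSIX calls. The following is a minimal sketch, not
part of the driver: the helper names (file_size, overwrite_then_extend,
resize) are hypothetical, nothing in it is NTFS-specific, and it assumes the
file descriptor refers to a regular (uncompressed, unencrypted) file.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the current size of the open file, or -1 on error. */
off_t file_size(int fd)
{
    struct stat st;

    return fstat(fd, &st) == 0 ? st.st_size : -1;
}

/*
 * Overwrite the first four bytes of the file, then write four more bytes
 * starting at the old end of file, which extends the file size.
 * Returns the new file size, or -1 on error.
 */
off_t overwrite_then_extend(int fd)
{
    static const char head[] = "HEAD";
    static const char tail[] = "TAIL";
    off_t old_size = file_size(fd);

    if (old_size < 0)
        return -1;
    /* Overwrite existing data at offset 0. */
    if (pwrite(fd, head, 4, 0) != 4)
        return -1;
    /* Write beyond the existing data, starting at the old end of file. */
    if (pwrite(fd, tail, 4, old_size) != 4)
        return -1;
    return file_size(fd);
}

/* Make the file smaller or larger with ftruncate(2); returns the new size. */
off_t resize(int fd, off_t new_size)
{
    if (ftruncate(fd, new_size) != 0)
        return -1;
    return file_size(fd);
}
```

On a volume mounted with this driver, a file grown through resize() would be
expected to be filled with real clusters rather than becoming sparse, given
the limitation noted above.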

Supported mount options
=======================
@@ -439,6 +457,22 @@ ChangeLog

Note, a technical ChangeLog aimed at kernel hackers is in fs/ntfs/ChangeLog.

2.1.25:
	- Write support is now extended with write(2) being able to both
	  overwrite existing file data and to extend files.  Also, if a write
	  to a sparse region occurs, write(2) will fill in the hole.  Note,
	  mmap(2) based writes still do not support writing into holes or
	  writing beyond the initialized size.
	- Write support now also includes truncate(2) and open(2) with
	  O_TRUNC, so files can be made both smaller and larger.
	- Note: Both write(2) and truncate(2)/open(2) with O_TRUNC still have
	  limitations in that they
	  - only provide limited support for highly fragmented files.
	  - only work on regular, i.e. uncompressed and unencrypted files.
	  - never create sparse files although this will change once directory
	    operations are implemented.
	- Lots of bug fixes and enhancements across the board.
2.1.24:
	- Support journals ($LogFile) which have been modified by chkdsk.  This
	  means users can boot into Windows after we marked the volume dirty.
+108 −34
@@ -19,15 +19,43 @@ Mount Options

When mounting an XFS filesystem, the following options are accepted.

  biosize=size
	Sets the preferred buffered I/O size (default size is 64K).
	"size" must be expressed as the logarithm (base2) of the
	desired I/O size.
	Valid values for this option are 14 through 16, inclusive
	(i.e. 16K, 32K, and 64K bytes).  On machines with a 4K
	pagesize, 13 (8K bytes) is also a valid size.
	The preferred buffered I/O size can also be altered on an
	individual file basis using the ioctl(2) system call.
  allocsize=size
	Sets the buffered I/O end-of-file preallocation size when
	doing delayed allocation writeout (default size is 64KiB).
	Valid values for this option are page size (typically 4KiB)
	through to 1GiB, inclusive, in power-of-2 increments.

  attr2/noattr2
	These options enable/disable an "opportunistic" improvement in
	the way inline extended attributes are stored on-disk (the
	default is disabled, for on-disk backward compatibility).
	When the new form is used for the first time (by setting or
	removing extended attributes) the on-disk superblock feature
	bit field will be updated to reflect this format being in use.

  barrier
	Enables the use of block layer write barriers for writes into
	the journal and unwritten extent conversion.  This allows for
	drive level write caching to be enabled, for devices that
	support write barriers.

  dmapi
	Enable the DMAPI (Data Management API) event callouts.
	Use with the "mtpt" option.

  grpid/bsdgroups and nogrpid/sysvgroups
	These options define what group ID a newly created file gets.
	When grpid is set, it takes the group ID of the directory in
	which it is created; otherwise (the default) it takes the fsgid
	of the current process, unless the directory has the setgid bit
	set, in which case it takes the gid from the parent directory,
	and also gets the setgid bit set if it is a directory itself.

  ihashsize=value
	Sets the number of hash buckets available for hashing the
	in-memory inodes of the specified mount point.  If a value
	of zero is used, the value selected by the default algorithm
	will be displayed in /proc/mounts.

  ikeep/noikeep
	When inode clusters are emptied of inodes, keep them around
@@ -35,12 +63,31 @@ When mounting an XFS filesystem, the following options are accepted.
	and is still the default for now.  Using the noikeep option,
	inode clusters are returned to the free space pool.

  inode64
	Indicates that XFS is allowed to create inodes at any location
	in the filesystem, including those which will result in inode
	numbers occupying more than 32 bits of significance.  This is
	provided for backwards compatibility, but causes problems for
	backup applications that cannot handle large inode numbers.

  largeio/nolargeio
	If "nolargeio" is specified, the optimal I/O reported in
	st_blksize by stat(2) will be as small as possible to allow user
	applications to avoid inefficient read/modify/write I/O.
	If "largeio" is specified, a filesystem that has a "swidth" specified
	will return the "swidth" value (in bytes) in st_blksize. If the
	filesystem does not have a "swidth" specified but does specify
	an "allocsize" then "allocsize" (in bytes) will be returned
	instead.
	If neither of these two options is specified, the filesystem
	will behave as if "nolargeio" was specified.

  logbufs=value
	Set the number of in-memory log buffers.  Valid numbers range
	from 2-8 inclusive.
	The default value is 8 buffers for filesystems with a
	blocksize of 64K, 4 buffers for filesystems with a blocksize
	of 32K, 3 buffers for filesystems with a blocksize of 16K
	blocksize of 64KiB, 4 buffers for filesystems with a blocksize
	of 32KiB, 3 buffers for filesystems with a blocksize of 16KiB
	and 2 buffers for all other configurations.  Increasing the
	number of buffers may increase performance on some workloads
	at the cost of the memory used for the additional log buffers
@@ -52,7 +99,7 @@ When mounting an XFS filesystem, the following options are accepted.
	Valid sizes for version 1 and version 2 logs are 16384 (16k) and
	32768 (32k).  Valid sizes for version 2 logs also include
	65536 (64k), 131072 (128k) and 262144 (256k).
	The default value for machines with more than 32MB of memory
	The default value for machines with more than 32MiB of memory
	is 32768, machines with less memory use 16384 by default.

  logdev=device and rtdev=device
@@ -62,6 +109,11 @@ When mounting an XFS filesystem, the following options are accepted.
	optional, and the log section can be separate from the data
	section or contained within it.

  mtpt=mountpoint
	Use with the "dmapi" option.  The value specified here will be
	included in the DMAPI mount event, and should be the path of
	the actual mountpoint that is used.

  noalign
	Data allocations will not be aligned at stripe unit boundaries.

@@ -91,13 +143,17 @@ When mounting an XFS filesystem, the following options are accepted.
	O_SYNC writes can be lost if the system crashes.
	If timestamp updates are critical, use the osyncisosync option.

  quota/usrquota/uqnoenforce
  uquota/usrquota/uqnoenforce/quota
	User disk quota accounting enabled, and limits (optionally)
	enforced.
	enforced.  Refer to xfs_quota(8) for further details.

  grpquota/gqnoenforce
  gquota/grpquota/gqnoenforce
	Group disk quota accounting enabled and limits (optionally)
	enforced.
	enforced.  Refer to xfs_quota(8) for further details.

  pquota/prjquota/pqnoenforce
	Project disk quota accounting enabled and limits (optionally)
	enforced.  Refer to xfs_quota(8) for further details.

  sunit=value and swidth=value
	Used to specify the stripe unit and width for a RAID device or
@@ -113,6 +169,12 @@ When mounting an XFS filesystem, the following options are accepted.
	The "swidth" option is required if the "sunit" option has been
	specified, and must be a multiple of the "sunit" value.

  swalloc
	Data allocations will be rounded up to stripe width boundaries
	when the current end of file is being extended and the file
	size is larger than the stripe width size.
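
The value rules stated for several of the mount options above can be written
down as small checks. This is a sketch under stated assumptions rather than
the XFS mount code: the function names are hypothetical, sizes are taken in
bytes, and treating a zero sunit or swidth as invalid is an assumption of
the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define KIB 1024ULL
#define GIB (1024ULL * 1024ULL * 1024ULL)

/* allocsize: a power of two from the page size up to 1GiB, inclusive. */
bool allocsize_is_valid(uint64_t size, uint64_t page_size)
{
    bool power_of_two = size != 0 && (size & (size - 1)) == 0;

    return power_of_two && size >= page_size && size <= GIB;
}

/* logbufs: documented default buffer count for a filesystem block size. */
int default_logbufs(uint64_t blocksize)
{
    if (blocksize == 64 * KIB)
        return 8;
    if (blocksize == 32 * KIB)
        return 4;
    if (blocksize == 16 * KIB)
        return 3;
    return 2; /* all other configurations */
}

/* logbufs: an explicit value must lie in the range 2-8 inclusive. */
bool logbufs_is_valid(int n)
{
    return n >= 2 && n <= 8;
}

/* sunit/swidth: swidth must be a multiple of sunit (zero is rejected here). */
bool stripe_is_valid(uint64_t sunit, uint64_t swidth)
{
    return sunit != 0 && swidth != 0 && swidth % sunit == 0;
}
```

For example, allocsize_is_valid(65536, 4096) accepts the 64KiB default on a
machine with 4KiB pages, while a value of 12288 is rejected because it is
not a power of two.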


sysctls
=======

@@ -172,17 +234,29 @@ The following sysctls are available for the XFS filesystem:
  	Controls whether unprivileged users can use chown to "give away"
	a file to another user.

  fs.xfs.inherit_sync		(Min: 0  Default: 1  Max 1)
  fs.xfs.inherit_sync		(Min: 0  Default: 1  Max: 1)
	Setting this to "1" will cause the "sync" flag set
	by the chattr(1) command on a directory to be
	by the xfs_io(8) chattr command on a directory to be
	inherited by files in that directory.

  fs.xfs.inherit_nodump		(Min: 0  Default: 1  Max 1)
  fs.xfs.inherit_nodump		(Min: 0  Default: 1  Max: 1)
	Setting this to "1" will cause the "nodump" flag set
	by the chattr(1) command on a directory to be
	by the xfs_io(8) chattr command on a directory to be
	inherited by files in that directory.

  fs.xfs.inherit_noatime	(Min: 0  Default: 1  Max 1)
  fs.xfs.inherit_noatime	(Min: 0  Default: 1  Max: 1)
	Setting this to "1" will cause the "noatime" flag set
	by the chattr(1) command on a directory to be
	by the xfs_io(8) chattr command on a directory to be
	inherited by files in that directory.

  fs.xfs.inherit_nosymlinks	(Min: 0  Default: 1  Max: 1)
	Setting this to "1" will cause the "nosymlinks" flag set
	by the xfs_io(8) chattr command on a directory to be
	inherited by files in that directory.

  fs.xfs.rotorstep		(Min: 1  Default: 1  Max: 256)
	In "inode32" allocation mode, this option determines how many
	files the allocator attempts to allocate in the same allocation
	group before moving to the next allocation group.  The intent
	is to control the rate at which the allocator moves between
	allocation groups when allocating extents for new files.
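
The effect of fs.xfs.rotorstep can be pictured with a simplified model. The
sketch below is an assumption made for illustration, not the XFS allocator
itself: it supposes that new files are numbered from zero and that the
allocator simply advances to the next allocation group after every
"rotorstep" files. The function name and parameters are hypothetical.

```c
/*
 * Simplified model (an assumption, not the XFS source): the Nth new file
 * is placed in allocation group (N / rotorstep) % agcount.
 */
unsigned int ag_for_new_file(unsigned int file_index,
                             unsigned int rotorstep,
                             unsigned int agcount)
{
    if (rotorstep == 0 || agcount == 0)
        return 0; /* guard against division by zero in the sketch */
    return (file_index / rotorstep) % agcount;
}
```

With the default rotorstep of 1, consecutive files land in consecutive
allocation groups; with a rotorstep of 3, files 0-2 share one group before
the model moves on to the next.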
+152 −47
S2IO Technologies XFrame 10 Gig adapter.
-------------------------------------------

I. Module loadable parameters.
When loaded as a module, the driver provides a host of Module loadable
parameters, so the device can be tuned as per the users needs.
A list of the Module params is given below.
(i)	ring_num: This can be used to program the number of
		 receive rings used in the driver.
(ii)	ring_len: This defines the number of descriptors each ring
		 can have. There can be a maximum of 8 rings.
(iii)	frame_len: This is an array of size 8. Using this we can
		 set the maximum size of the received frame that can
		 be steered into the corresponding receive ring.
(iv)	fifo_num: This defines the number of Tx FIFOs that are used in
		 the driver.
(v)	fifo_len: Each element defines the number of 
 		 Tx descriptors that can be associated with each 
		 corresponding FIFO. There are a maximum of 8 FIFOs.
(vi)	tx_prio: This is a bool, if module is loaded with a non-zero
		value for tx_prio multi FIFO scheme is activated.
(vii)	rx_prio: This is a bool, if module is loaded with a non-zero
		value for rx_prio multi RING scheme is activated.
(viii)	latency_timer: The value given against this param will be
		 loaded	into the latency timer register in PCI Config
		 space, else the register is left with its reset value.

II. Performance tuning.
 By changing a few sysctl parameters.
	Copy the following lines into a file and run the following command,
	"sysctl -p <file_name>"
### IPV4 specific settings
net.ipv4.tcp_timestamps = 0 # turns TCP timestamp support off, default 1, reduces CPU use
net.ipv4.tcp_sack = 0 # turn SACK support off, default on
# on systems with a VERY fast bus -> memory interface this is the big gainer
net.ipv4.tcp_rmem = 10000000 10000000 10000000 # sets min/default/max TCP read buffer, default 4096 87380 174760
net.ipv4.tcp_wmem = 10000000 10000000 10000000 # sets min/default/max TCP write buffer, default 4096 16384 131072
net.ipv4.tcp_mem = 10000000 10000000 10000000 # sets min/pressure/max TCP buffer space, default 31744 32256 32768
                                                                                
### CORE settings (mostly for socket and UDP effect)
net.core.rmem_max = 524287 # maximum receive socket buffer size, default 131071
net.core.wmem_max = 524287 # maximum send socket buffer size, default 131071
net.core.rmem_default = 524287 # default receive socket buffer size, default 65535
net.core.wmem_default = 524287 # default send socket buffer size, default 65535
net.core.optmem_max = 524287 # maximum amount of option memory buffers, default 10240
net.core.netdev_max_backlog = 300000 # number of unprocessed input packets before kernel starts dropping them, default 300
---End of performance tuning file---
Release notes for Neterion's (Formerly S2io) Xframe I/II PCI-X 10GbE driver.

Contents
=======
- 1.  Introduction
- 2.  Identifying the adapter/interface
- 3.  Features supported
- 4.  Command line parameters
- 5.  Performance suggestions
- 6.  Available Downloads
- 7.  Support


1.	Introduction:
This Linux driver supports Neterion's Xframe I PCI-X 1.0 and
Xframe II PCI-X 2.0 adapters. It supports several features 
such as jumbo frames, MSI/MSI-X, checksum offloads, TSO, UFO and so on.
See below for complete list of features.
All features are supported for both IPv4 and IPv6.

2.	Identifying the adapter/interface:
a. Insert the adapter(s) in your system.
b. Build and load driver 
# insmod s2io.ko
c. View log messages
# dmesg | tail -40
You will see messages similar to:
eth3: Neterion Xframe I 10GbE adapter (rev 3), Version 2.0.9.1, Intr type INTA
eth4: Neterion Xframe II 10GbE adapter (rev 2), Version 2.0.9.1, Intr type INTA
eth4: Device is on 64 bit 133MHz PCIX(M1) bus

The above messages identify the adapter type(Xframe I/II), adapter revision,
driver version, interface name(eth3, eth4), Interrupt type(INTA, MSI, MSI-X).
In case of Xframe II, the PCI/PCI-X bus width and frequency are displayed
as well.

To associate an interface with a physical adapter use "ethtool -p <ethX>".
The corresponding adapter's LED will blink multiple times.

3.	Features supported:
a. Jumbo frames. Xframe I/II supports MTU up to 9600 bytes,
modifiable using ifconfig command.

b. Offloads. Supports checksum offload(TCP/UDP/IP) on transmit
and receive, TSO.

c. Multi-buffer receive mode. Scattering of packet across multiple
buffers. Currently driver supports 2-buffer mode which yields
significant performance improvement on certain platforms(SGI Altix,
IBM xSeries).

d. MSI/MSI-X. Can be enabled on platforms which support this feature
(IA64, Xeon) resulting in noticeable performance improvement (up to 7%
on certain platforms).

e. NAPI. Compile-time option(CONFIG_S2IO_NAPI) for better Rx interrupt 
moderation.

f. Statistics. Comprehensive MAC-level and software statistics displayed
using "ethtool -S" option.

g. Multi-FIFO/Ring. Supports up to 8 transmit queues and receive rings, 
with multiple steering options.

4.  Command line parameters
a. tx_fifo_num
Number of transmit queues
Valid range: 1-8
Default: 1

b. rx_ring_num
Number of receive rings
Valid range: 1-8
Default: 1

c. tx_fifo_len
Size of each transmit queue
Valid range: Total length of all queues should not exceed 8192
Default: 4096

d. rx_ring_sz 
Size of each receive ring(in 4K blocks)
Valid range: Limited by memory on system
Default: 30 

e. intr_type
Specifies interrupt type. Possible values 1(INTA), 2(MSI), 3(MSI-X)
Valid range: 1-3
Default: 1 
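
The valid ranges listed above can be summarised as a single check. This is a
sketch for illustration only, not code from the s2io driver: the struct and
function names are hypothetical, and representing tx_fifo_len as one length
per transmit queue is an assumption made so that the "total length of all
queues" limit can be expressed. rx_ring_sz is left out because its limit
depends on system memory.

```c
#include <stdbool.h>

/* Hypothetical container for the documented load-time parameters. */
struct s2io_params_sketch {
    int tx_fifo_num;    /* number of transmit queues, 1-8, default 1 */
    int rx_ring_num;    /* number of receive rings, 1-8, default 1 */
    int tx_fifo_len[8]; /* length of each transmit queue */
    int intr_type;      /* 1 (INTA), 2 (MSI), 3 (MSI-X), default 1 */
};

/* Return true if the values fall inside the documented valid ranges. */
bool s2io_params_ok(const struct s2io_params_sketch *p)
{
    int total = 0;
    int i;

    if (p->tx_fifo_num < 1 || p->tx_fifo_num > 8)
        return false;
    if (p->rx_ring_num < 1 || p->rx_ring_num > 8)
        return false;
    if (p->intr_type < 1 || p->intr_type > 3)
        return false;

    /* The total length of all transmit queues should not exceed 8192. */
    for (i = 0; i < p->tx_fifo_num; i++) {
        if (p->tx_fifo_len[i] <= 0)
            return false;
        total += p->tx_fifo_len[i];
    }
    return total <= 8192;
}
```

Under this reading, a single queue of the default length 4096 passes, two
queues of 4096 sit exactly at the limit, and three such queues are rejected.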

5.  Performance suggestions
General:
a. Set MTU to maximum(9000 for switch setup, 9600 in back-to-back configuration)
b. Set TCP window size to optimal value.
For instance, for MTU=1500 a value of 210K has been observed to result in 
good performance.
# sysctl -w net.ipv4.tcp_rmem="210000 210000 210000"
# sysctl -w net.ipv4.tcp_wmem="210000 210000 210000"
For MTU=9000, TCP window size of 10 MB is recommended.
# sysctl -w net.ipv4.tcp_rmem="10000000 10000000 10000000"
# sysctl -w net.ipv4.tcp_wmem="10000000 10000000 10000000"

Transmit performance:
a. By default, the driver respects BIOS settings for PCI bus parameters. 
However, you may want to experiment with PCI bus parameters 
max-split-transactions(MOST) and MMRBC (use setpci command). 
A MOST value of 2 has been found optimal for Opterons and 3 for Itanium.  
It could be different for your hardware.  
Set MMRBC to 4K**.

For example, you can set
For Opteron
# setpci -d 17d5:* 62=1d
For Itanium
# setpci -d 17d5:* 62=3d

For detailed description of the PCI registers, please see Xframe User Guide.

b. Ensure Transmit Checksum offload is enabled. Use ethtool to set/verify this 
parameter.
c. Turn on TSO(using "ethtool -K")
# ethtool -K <ethX> tso on

Receive performance:
a. By default, the driver respects BIOS settings for PCI bus parameters. 
However, you may want to set PCI latency timer to 248.
# setpci -d 17d5:* LATENCY_TIMER=f8
For detailed description of the PCI registers, please see Xframe User Guide.
b. Use 2-buffer mode. This results in a large performance boost on
certain platforms (e.g. SGI Altix, IBM xSeries).
c. Ensure Receive Checksum offload is enabled. Use "ethtool -K ethX" command to 
set/verify this option.
d. Enable NAPI feature(in kernel configuration Device Drivers ---> Network 
device support --->  Ethernet (10000 Mbit) ---> S2IO 10Gbe Xframe NIC) to 
bring down CPU utilization.

** For AMD Opteron platforms with 8131 chipset, MMRBC=1 and MOST=1 are
recommended as safe parameters.
For more information, please review the AMD8131 errata at
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26310.pdf

6.  Available Downloads
The Neterion "s2io" driver in Red Hat and Suse 2.6-based distributions is
kept up to date. The latest "s2io" code (including support for 2.4 kernels)
is also available via the "Support" link on the Neterion site:
http://www.neterion.com.

For Xframe User Guide (Programming manual), visit ftp site ns1.s2io.com,
user: linuxdocs password: HALdocs

7. Support 
For further support please contact either your 10GbE Xframe NIC vendor (IBM, 
HP, SGI etc.) or click on the "Support" link on the Neterion site:  
http://www.neterion.com.