Commit 23f347ef authored by Linus Torvalds
Pull networking updates from David Miller:

 1) Fix inaccuracies in network driver interface documentation, from Ben
    Hutchings.

 2) Fix handling of negative offsets in BPF JITs, from Jan Seiffert.

 3) Compile warning, locking, and refcounting fixes in netfilter's
    xt_CT, from Pablo Neira Ayuso.

 4) phonet sendmsg needs to validate user length just like any other
    datagram protocol, fix from Sasha Levin.

 5) IPv6 multicast code uses wrong loop index, from RongQing Li.

 6) Link handling and firmware fixes in bnx2x driver from Yaniv Rosner
    and Yuval Mintz.

 7) mlx4 erroneously allocates 4 pages at a time, regardless of page
    size, fix from Thadeu Lima de Souza Cascardo.

 8) SCTP socket option wasn't extended in a backwards compatible way,
    fix from Thomas Graf.

 9) Add missing address change event emissions to bonding, from Shlomo
    Pongratz.

10) /proc/net/dev regressed because it uses a private offset to track
    where we are in the hash table, but this doesn't track the offset
    pullback that the seq_file code does, resulting in some entries being
    missed in large dumps.

    Fix from Eric Dumazet.

11) do_tcp_sendpage() unloads the send queue way too fast, because it
    invokes tcp_push() when it shouldn't.  Let the natural sequence
    generated by the splice paths, and the associated MSG_MORE
    settings, guide the tcp_push() calls.

    Otherwise what goes out of TCP is spaghetti and doesn't batch
    effectively into GSO/TSO clusters.

    From Eric Dumazet.

12) Once we put a SKB into either the netlink receiver's queue or a
    socket error queue, it can be consumed and freed up, therefore we
    cannot touch it after queueing it like that.

    Fixes from Eric Dumazet.

13) PPP has this annoying behavior in that for every transmit call it
    immediately stops the TX queue, then calls down into the next layer
    to transmit the PPP frame.

    But if that next layer can take it immediately, it just un-stops the
    TX queue right before returning from the transmit method.

    Besides being useless work, it makes several facilities unusable, in
    particular things like the equalizers.  Well-behaved devices should
    only stop the TX queue when they really are full, and in PPP's case
    when it gets backlogged to the downstream device.

    David Woodhouse therefore fixed PPP to not stop the TX queue until
    its downstream can't take data any more.

14) IFF_UNICAST_FLT got accidentally lost in some recent stmmac driver
    changes, re-add.  From Marc Kleine-Budde.

15) Fix link flaps in ixgbe, from Eric W. Multanen.

16) Descriptor writeback fixes in e1000e from Matthew Vick.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
  net: fix a race in sock_queue_err_skb()
  netlink: fix races after skb queueing
  doc, net: Update ndo_start_xmit return type and values
  doc, net: Remove instruction to set net_device::trans_start
  doc, net: Update netdev operation names
  doc, net: Update documentation of synchronisation for TX multiqueue
  doc, net: Remove obsolete reference to dev->poll
  ethtool: Remove exception to the requirement of holding RTNL lock
  MAINTAINERS: update for Marvell Ethernet drivers
  bonding: properly unset current_arp_slave on slave link up
  phonet: Check input from user before allocating
  tcp: tcp_sendpages() should call tcp_push() once
  ipv6: fix array index in ip6_mc_add_src()
  mlx4: allocate just enough pages instead of always 4 pages
  stmmac: re-add IFF_UNICAST_FLT for dwmac1000
  bnx2x: Clear MDC/MDIO warning message
  bnx2x: Fix BCM57711+BCM84823 link issue
  bnx2x: Clear BCM84833 LED after fan failure
  bnx2x: Fix BCM84833 PHY FW version presentation
  bnx2x: Fix link issue for BCM8727 boards.
  ...
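
Item 12 above (an SKB must not be touched once it has been queued) can be
illustrated with a small userspace sketch. This is not kernel code: struct
buf, struct rx_queue and every function below are hypothetical stand-ins,
and the "consumer" runs synchronously to model the worst case where the
receiver frees the buffer before the producer continues. The point is only
the ordering: snapshot what you need, then queue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an sk_buff; not the kernel structure. */
struct buf {
	size_t len;
	unsigned char *data;
};

/* Hypothetical receive queue that only counts the bytes it consumed. */
struct rx_queue {
	size_t bytes_consumed;
};

static struct buf *buf_alloc(size_t len)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->data = calloc(1, len ? len : 1);
	if (!b->data) {
		free(b);
		return NULL;
	}
	b->len = len;
	return b;
}

/* Models a consumer that dequeues and frees the buffer immediately. */
static void queue_and_consume(struct rx_queue *q, struct buf *b)
{
	q->bytes_consumed += b->len;
	free(b->data);
	free(b);	/* from here on the producer's pointer is dangling */
}

/* Safe pattern: read everything needed from the buffer before queueing. */
static size_t deliver(struct rx_queue *q, struct buf *b)
{
	size_t len;

	assert(q && b);
	len = b->len;		/* snapshot while we still own b */
	queue_and_consume(q, b);
	/* b must not be dereferenced after this point */
	return len;
}
```

Returning b->len after the call to queue_and_consume() instead of the
saved copy would be the use-after-free that this class of fix removes.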
parents 314489bd 110c4330
+15 −16
@@ -2,15 +2,15 @@ Document about softnet driver issues

Transmit path guidelines:

1) The hard_start_xmit method must never return '1' under any
   normal circumstances.  It is considered a hard error unless
1) The ndo_start_xmit method must not return NETDEV_TX_BUSY under
   any normal circumstances.  It is considered a hard error unless
   there is no way your device can tell ahead of time when it's
   transmit function will become busy.

   Instead it must maintain the queue properly.  For example,
   for a driver implementing scatter-gather this means:

	static int drv_hard_start_xmit(struct sk_buff *skb,
	static netdev_tx_t drv_hard_start_xmit(struct sk_buff *skb,
					       struct net_device *dev)
	{
		struct drv *dp = netdev_priv(dev);
@@ -23,7 +23,7 @@ Transmit path guidelines:
			unlock_tx(dp);
			printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n",
			       dev->name);
			return 1;
			return NETDEV_TX_BUSY;
		}

		... queue packet to card ...
@@ -35,6 +35,7 @@ Transmit path guidelines:
		...
		unlock_tx(dp);
		...
		return NETDEV_TX_OK;
	}

   And then at the end of your TX reclamation event handling:
@@ -58,15 +59,12 @@ Transmit path guidelines:
            TX_BUFFS_AVAIL(dp) > 0)
		netif_wake_queue(dp->dev);

2) Do not forget to update netdev->trans_start to jiffies after
   each new tx packet is given to the hardware.

3) A hard_start_xmit method must not modify the shared parts of a
2) An ndo_start_xmit method must not modify the shared parts of a
   cloned SKB.

4) Do not forget that once you return 0 from your hard_start_xmit
   method, it is your driver's responsibility to free up the SKB
   and in some finite amount of time.
3) Do not forget that once you return NETDEV_TX_OK from your
   ndo_start_xmit method, it is your driver's responsibility to free
   up the SKB and in some finite amount of time.

   For example, this means that it is not allowed for your TX
   mitigation scheme to let TX packets "hang out" in the TX
@@ -74,8 +72,9 @@ Transmit path guidelines:
   This error can deadlock sockets waiting for send buffer room
   to be freed up.

   If you return 1 from the hard_start_xmit method, you must not keep
   any reference to that SKB and you must not attempt to free it up.
   If you return NETDEV_TX_BUSY from the ndo_start_xmit method, you
   must not keep any reference to that SKB and you must not attempt
   to free it up.

Probing guidelines:

@@ -85,10 +84,10 @@ Probing guidelines:

Close/stop guidelines:

1) After the dev->stop routine has been called, the hardware must
1) After the ndo_stop routine has been called, the hardware must
   not receive or transmit any data.  All in flight packets must
   be aborted. If necessary, poll or wait for completion of 
   any reset commands.

2) The dev->stop routine will be called by unregister_netdevice
2) The ndo_stop routine will be called by unregister_netdevice
   if device is still UP.
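
The transmit path rules in the document above (stop the queue when the ring
fills, wake it during TX reclamation, and treat a busy return as a bug) can
be modelled in plain userspace C. The sketch below is an assumption-laden
toy: struct toy_dev, RING_SIZE, TX_OK/TX_BUSY and all function names are
hypothetical and merely stand in for the driver state, NETDEV_TX_OK,
NETDEV_TX_BUSY, netif_stop_queue() and netif_wake_queue().

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 4	/* hypothetical TX ring size */

/* Stand-ins for NETDEV_TX_OK and NETDEV_TX_BUSY. */
enum tx_status { TX_OK, TX_BUSY };

struct toy_dev {
	unsigned int in_flight;	/* descriptors handed to the "hardware" */
	bool queue_stopped;	/* models the stopped state of the TX queue */
};

static unsigned int tx_buffs_avail(const struct toy_dev *d)
{
	return RING_SIZE - d->in_flight;
}

/* Transmit: stop the queue as soon as the ring becomes full. */
static enum tx_status toy_start_xmit(struct toy_dev *d)
{
	if (tx_buffs_avail(d) == 0) {
		/* Ring full while the queue was awake: a driver bug. */
		d->queue_stopped = true;
		return TX_BUSY;
	}
	d->in_flight++;
	if (tx_buffs_avail(d) == 0)
		d->queue_stopped = true;
	return TX_OK;
}

/* TX reclamation: wake the queue once there is room again. */
static void toy_tx_reclaim(struct toy_dev *d, unsigned int done)
{
	assert(done <= d->in_flight);
	d->in_flight -= done;
	if (d->queue_stopped && tx_buffs_avail(d) > 0)
		d->queue_stopped = false;
}

/*
 * Models the stack: it never calls the transmit method while the queue
 * is stopped.  Here a stopped queue simply waits for one completion.
 * Returns how many times the transmit method reported TX_BUSY.
 */
static unsigned int toy_send_burst(struct toy_dev *d, unsigned int packets)
{
	unsigned int busy = 0;

	while (packets--) {
		if (d->queue_stopped)
			toy_tx_reclaim(d, 1);
		if (toy_start_xmit(d) == TX_BUSY)
			busy++;
	}
	return busy;
}
```

Because the queue is stopped exactly when the ring fills and woken only
when space returns, a burst larger than the ring never sees TX_BUSY in this
model, which is the behaviour the guidelines ask real drivers to provide.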
+2 −9
@@ -604,15 +604,8 @@ IP Variables:
ip_local_port_range - 2 INTEGERS
	Defines the local port range that is used by TCP and UDP to
	choose the local port. The first number is the first, the
	second the last local port number. Default value depends on
	amount of memory available on the system:
	> 128Mb 32768-61000
	< 128Mb 1024-4999 or even less.
	This number defines number of active connections, which this
	system can issue simultaneously to systems not supporting
	TCP extensions (timestamps). With tcp_tw_recycle enabled
	(i.e. by default) range 1024-4999 is enough to issue up to
	2000 connections per second to systems supporting timestamps.
	second the last local port number. The default values are
	32768 and 61000 respectively.

ip_local_reserved_ports - list of comma separated ranges
	Specify the ports which are reserved for known third-party
+12 −13
@@ -47,26 +47,25 @@ packets is preferred.

struct net_device synchronization rules
=======================================
dev->open:
ndo_open:
	Synchronization: rtnl_lock() semaphore.
	Context: process

dev->stop:
ndo_stop:
	Synchronization: rtnl_lock() semaphore.
	Context: process
	Note1: netif_running() is guaranteed false
	Note2: dev->poll() is guaranteed to be stopped
	Note: netif_running() is guaranteed false

dev->do_ioctl:
ndo_do_ioctl:
	Synchronization: rtnl_lock() semaphore.
	Context: process

dev->get_stats:
ndo_get_stats:
	Synchronization: dev_base_lock rwlock.
	Context: nominally process, but don't sleep inside an rwlock

dev->hard_start_xmit:
	Synchronization: netif_tx_lock spinlock.
ndo_start_xmit:
	Synchronization: __netif_tx_lock spinlock.

	When the driver sets NETIF_F_LLTX in dev->features this will be
	called without holding netif_tx_lock. In this case the driver
@@ -87,20 +86,20 @@ dev->hard_start_xmit:
	o NETDEV_TX_LOCKED Locking failed, please retry quickly.
	  Only valid when NETIF_F_LLTX is set.

dev->tx_timeout:
	Synchronization: netif_tx_lock spinlock.
ndo_tx_timeout:
	Synchronization: netif_tx_lock spinlock; all TX queues frozen.
	Context: BHs disabled
	Notes: netif_queue_stopped() is guaranteed true

dev->set_rx_mode:
	Synchronization: netif_tx_lock spinlock.
ndo_set_rx_mode:
	Synchronization: netif_addr_lock spinlock.
	Context: BHs disabled

struct napi_struct synchronization rules
========================================
napi->poll:
	Synchronization: NAPI_STATE_SCHED bit in napi->state.  Device
		driver's dev->close method will invoke napi_disable() on
		driver's ndo_stop method will invoke napi_disable() on
		all NAPI instances which will do a sleeping poll on the
		NAPI_STATE_SCHED napi->state bit, waiting for all pending
		NAPI activity to cease.
+7 −12
@@ -4309,6 +4309,13 @@ W: http://www.kernel.org/doc/man-pages
L:	linux-man@vger.kernel.org
S:	Maintained

MARVELL GIGABIT ETHERNET DRIVERS (skge/sky2)
M:	Mirko Lindner <mlindner@marvell.com>
M:	Stephen Hemminger <shemminger@vyatta.com>
L:	netdev@vger.kernel.org
S:	Maintained
F:	drivers/net/ethernet/marvell/sk*

MARVELL LIBERTAS WIRELESS DRIVER
M:	Dan Williams <dcbw@redhat.com>
L:	libertas-dev@lists.infradead.org
@@ -4339,12 +4346,6 @@ M: Nicolas Pitre <nico@fluxnic.net>
S:	Odd Fixes
F:	drivers/mmc/host/mvsdio.*

MARVELL YUKON / SYSKONNECT DRIVER
M:	Mirko Lindner <mlindner@syskonnect.de>
M:	Ralph Roesler <rroesler@syskonnect.de>
W:	http://www.syskonnect.com
S:	Supported

MATROX FRAMEBUFFER DRIVER
L:	linux-fbdev@vger.kernel.org
S:	Orphan
@@ -6116,12 +6117,6 @@ W: http://www.winischhofer.at/linuxsisusbvga.shtml
S:	Maintained
F:	drivers/usb/misc/sisusbvga/

SKGE, SKY2 10/100/1000 GIGABIT ETHERNET DRIVERS
M:	Stephen Hemminger <shemminger@vyatta.com>
L:	netdev@vger.kernel.org
S:	Maintained
F:	drivers/net/ethernet/marvell/sk*

SLAB ALLOCATOR
M:	Christoph Lameter <cl@linux-foundation.org>
M:	Pekka Enberg <penberg@kernel.org>
+91 −31
@@ -18,17 +18,17 @@
 * r9d : hlen = skb->len - skb->data_len
 */
#define SKBDATA	%r8

sk_load_word_ind:
	.globl	sk_load_word_ind

	add	%ebx,%esi	/* offset += X */
#	test    %esi,%esi	/* if (offset < 0) goto bpf_error; */
	js	bpf_error
#define SKF_MAX_NEG_OFF    $(-0x200000) /* SKF_LL_OFF from filter.h */

sk_load_word:
	.globl	sk_load_word

	test	%esi,%esi
	js	bpf_slow_path_word_neg

sk_load_word_positive_offset:
	.globl	sk_load_word_positive_offset

	mov	%r9d,%eax		# hlen
	sub	%esi,%eax		# hlen - offset
	cmp	$3,%eax
@@ -37,16 +37,15 @@ sk_load_word:
	bswap   %eax  			/* ntohl() */
	ret


sk_load_half_ind:
	.globl sk_load_half_ind

	add	%ebx,%esi	/* offset += X */
	js	bpf_error

sk_load_half:
	.globl	sk_load_half

	test	%esi,%esi
	js	bpf_slow_path_half_neg

sk_load_half_positive_offset:
	.globl	sk_load_half_positive_offset

	mov	%r9d,%eax
	sub	%esi,%eax		#	hlen - offset
	cmp	$1,%eax
@@ -55,14 +54,15 @@ sk_load_half:
	rol	$8,%ax			# ntohs()
	ret

sk_load_byte_ind:
	.globl sk_load_byte_ind
	add	%ebx,%esi	/* offset += X */
	js	bpf_error

sk_load_byte:
	.globl	sk_load_byte

	test	%esi,%esi
	js	bpf_slow_path_byte_neg

sk_load_byte_positive_offset:
	.globl	sk_load_byte_positive_offset

	cmp	%esi,%r9d   /* if (offset >= hlen) goto bpf_slow_path_byte */
	jle	bpf_slow_path_byte
	movzbl	(SKBDATA,%rsi),%eax
@@ -73,25 +73,21 @@ sk_load_byte:
 *
 * Implements BPF_S_LDX_B_MSH : ldxb  4*([offset]&0xf)
 * Must preserve A accumulator (%eax)
 * Inputs : %esi is the offset value, already known positive
 * Inputs : %esi is the offset value
 */
ENTRY(sk_load_byte_msh)
	CFI_STARTPROC
sk_load_byte_msh:
	.globl	sk_load_byte_msh
	test	%esi,%esi
	js	bpf_slow_path_byte_msh_neg

sk_load_byte_msh_positive_offset:
	.globl	sk_load_byte_msh_positive_offset
	cmp	%esi,%r9d      /* if (offset >= hlen) goto bpf_slow_path_byte_msh */
	jle	bpf_slow_path_byte_msh
	movzbl	(SKBDATA,%rsi),%ebx
	and	$15,%bl
	shl	$2,%bl
	ret
	CFI_ENDPROC
ENDPROC(sk_load_byte_msh)

bpf_error:
# force a return 0 from jit handler
	xor		%eax,%eax
	mov		-8(%rbp),%rbx
	leaveq
	ret

/* rsi contains offset and can be scratched */
#define bpf_slow_path_common(LEN)		\
@@ -138,3 +134,67 @@ bpf_slow_path_byte_msh:
	shl	$2,%al
	xchg	%eax,%ebx
	ret

#define sk_negative_common(SIZE)				\
	push	%rdi;	/* save skb */				\
	push	%r9;						\
	push	SKBDATA;					\
/* rsi already has offset */					\
	mov	$SIZE,%ecx;	/* size */			\
	call	bpf_internal_load_pointer_neg_helper;		\
	test	%rax,%rax;					\
	pop	SKBDATA;					\
	pop	%r9;						\
	pop	%rdi;						\
	jz	bpf_error


bpf_slow_path_word_neg:
	cmp	SKF_MAX_NEG_OFF, %esi	/* test range */
	jl	bpf_error	/* offset lower -> error  */
sk_load_word_negative_offset:
	.globl	sk_load_word_negative_offset
	sk_negative_common(4)
	mov	(%rax), %eax
	bswap	%eax
	ret

bpf_slow_path_half_neg:
	cmp	SKF_MAX_NEG_OFF, %esi
	jl	bpf_error
sk_load_half_negative_offset:
	.globl	sk_load_half_negative_offset
	sk_negative_common(2)
	mov	(%rax),%ax
	rol	$8,%ax
	movzwl	%ax,%eax
	ret

bpf_slow_path_byte_neg:
	cmp	SKF_MAX_NEG_OFF, %esi
	jl	bpf_error
sk_load_byte_negative_offset:
	.globl	sk_load_byte_negative_offset
	sk_negative_common(1)
	movzbl	(%rax), %eax
	ret

bpf_slow_path_byte_msh_neg:
	cmp	SKF_MAX_NEG_OFF, %esi
	jl	bpf_error
sk_load_byte_msh_negative_offset:
	.globl	sk_load_byte_msh_negative_offset
	xchg	%eax,%ebx /* dont lose A , X is about to be scratched */
	sk_negative_common(1)
	movzbl	(%rax),%eax
	and	$15,%al
	shl	$2,%al
	xchg	%eax,%ebx
	ret

bpf_error:
# force a return 0 from jit handler
	xor		%eax,%eax
	mov		-8(%rbp),%rbx
	leaveq
	ret
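
The offset dispatch that the assembly above performs for each load size can
be summarised in C. This is only a reading aid under stated assumptions: the
enum and classify_offset() are hypothetical names, and the one value taken
from the diff is SKF_MAX_NEG_OFF (-0x200000, which the diff's comment
identifies as SKF_LL_OFF from filter.h).

```c
#include <assert.h>
#include <stdint.h>

/* Value taken from the diff; its comment identifies it as SKF_LL_OFF. */
#define SKF_MAX_NEG_OFF (-0x200000)

/* Hypothetical labels for the three outcomes of the offset test. */
enum load_path {
	LOAD_POSITIVE,		/* falls through to sk_load_*_positive_offset */
	LOAD_NEGATIVE_HELPER,	/* bpf_slow_path_*_neg, then the neg helper */
	LOAD_ERROR		/* bpf_error: the filter returns 0 */
};

static enum load_path classify_offset(int32_t offset)
{
	if (offset >= 0)			/* test %esi,%esi ; js ... not taken */
		return LOAD_POSITIVE;
	if (offset < SKF_MAX_NEG_OFF)		/* cmp SKF_MAX_NEG_OFF,%esi ; jl bpf_error */
		return LOAD_ERROR;
	return LOAD_NEGATIVE_HELPER;
}
```

Note that reaching the negative path does not guarantee a successful load:
in the assembly, sk_negative_common still jumps to bpf_error when the helper
returns a NULL pointer.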