
Commit 9d2355ba authored by David S. Miller

Merge branch 'cmsg_timestamp'



Soheil Hassas Yeganeh says:

====================
add TX timestamping via cmsg

This patch series aims to enable TX timestamping via cmsg.

Currently, to occasionally sample TX timestamping on a socket,
applications need to call setsockopt twice: first for enabling
timestamps and then for disabling them. This is an unnecessary
overhead. With cmsg, in contrast, applications can sample TX
timestamps per sendmsg().

This patch series adds the code for processing SO_TIMESTAMPING
for cmsg's of the SOL_SOCKET level, and adds the glue code for
TCP, UDP, and RAW for both IPv4 and IPv6. This implementation
supports overriding timestamp generation flags (i.e.,
SOF_TIMESTAMPING_TX_*) but not timestamp reporting flags.
Applications must still enable timestamp reporting via
setsockopt to receive timestamps.

This series does not change existing timestamping behavior for
applications that are using socket options.

I will follow up with another patch to enable timestamping for
active TFO (client-side TCP Fast Open) and also setting packet
mark via cmsgs.

Thanks!

Changes in v2:
        - Replace u32 with __u32 in the documentation.

Changes in v3:
	- Fix the broken build for L2TP (due to changes
	  in IPv6).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
parents 833716e0 fd91e12f
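The per-call sampling described in the cover letter can be sketched from user space as follows. This is a minimal sketch, assuming a Linux system whose headers provide SO_TIMESTAMPING and linux/net_tstamp.h; the helper names enable_ts_reporting(), attach_tx_tstamp() and send_sampled() are hypothetical and do not come from the patch series. Reporting is enabled once via setsockopt (passing the address of val), and generation is then requested only on the sendmsg() calls that should be sampled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <linux/net_tstamp.h>	/* SOF_TIMESTAMPING_* */

/* Enable timestamp reporting once; no TX generation flag is set here. */
static int enable_ts_reporting(int fd)
{
	uint32_t val = SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_OPT_ID;

	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
}

/*
 * Hypothetical helper: place one SOL_SOCKET/SO_TIMESTAMPING cmsg carrying
 * tx_flags into the caller-supplied control buffer and attach it to msg.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int attach_tx_tstamp(struct msghdr *msg, void *ctrl, size_t ctrl_len,
			    uint32_t tx_flags)
{
	struct cmsghdr *cmsg;

	if (ctrl_len < CMSG_SPACE(sizeof(tx_flags)))
		return -1;

	memset(ctrl, 0, ctrl_len);
	msg->msg_control = ctrl;
	msg->msg_controllen = CMSG_SPACE(sizeof(tx_flags));

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SO_TIMESTAMPING;
	cmsg->cmsg_len = CMSG_LEN(sizeof(tx_flags));
	memcpy(CMSG_DATA(cmsg), &tx_flags, sizeof(tx_flags));
	return 0;
}

/* Send one buffer and request TX timestamps for this call only. */
static ssize_t send_sampled(int fd, const void *buf, size_t len)
{
	union {
		char buf[CMSG_SPACE(sizeof(uint32_t))];
		struct cmsghdr align;	/* forces cmsghdr alignment */
	} ctrl;
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };

	if (attach_tx_tstamp(&msg, ctrl.buf, sizeof(ctrl.buf),
			     SOF_TIMESTAMPING_TX_SCHED |
			     SOF_TIMESTAMPING_TX_SOFTWARE))
		return -1;
	return sendmsg(fd, &msg, 0);
}
```

A caller would invoke enable_ts_reporting() once after creating the socket, use plain send() for unsampled writes, and switch to send_sampled() for the writes it wants timestamped. Whether the kernel accepts the cmsg depends on it including this series.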
+45 −3
@@ -44,11 +44,17 @@ timeval of SO_TIMESTAMP (ms).
Supports multiple types of timestamp requests. As a result, this
socket option takes a bitmap of flags, not a boolean. In

-  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) val, &val);
+  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) val,
+                   sizeof(val));

val is an integer with any of the following bits set. Setting other
bit returns EINVAL and does not change the current state.

+The socket option configures timestamp generation for individual
+sk_buffs (1.3.1), timestamp reporting to the socket's error
+queue (1.3.2) and options (1.3.3). Timestamp generation can also
+be enabled for individual sendmsg calls using cmsg (1.3.4).
+

1.3.1 Timestamp Generation

@@ -71,13 +77,16 @@ SOF_TIMESTAMPING_RX_SOFTWARE:
  kernel receive stack.

SOF_TIMESTAMPING_TX_HARDWARE:
-  Request tx timestamps generated by the network adapter.
+  Request tx timestamps generated by the network adapter. This flag
+  can be enabled via both socket options and control messages.

SOF_TIMESTAMPING_TX_SOFTWARE:
  Request tx timestamps when data leaves the kernel. These timestamps
  are generated in the device driver as close as possible, but always
  prior to, passing the packet to the network interface. Hence, they
  require driver support and may not be available for all devices.
+  This flag can be enabled via both socket options and control messages.
+

SOF_TIMESTAMPING_TX_SCHED:
  Request tx timestamps prior to entering the packet scheduler. Kernel
@@ -90,7 +99,8 @@ SOF_TIMESTAMPING_TX_SCHED:
  machines with virtual devices where a transmitted packet travels
  through multiple devices and, hence, multiple packet schedulers,
  a timestamp is generated at each layer. This allows for fine
-  grained measurement of queuing delay.
+  grained measurement of queuing delay. This flag can be enabled
+  via both socket options and control messages.

SOF_TIMESTAMPING_TX_ACK:
  Request tx timestamps when all data in the send buffer has been
@@ -99,6 +109,7 @@ SOF_TIMESTAMPING_TX_ACK:
  over-report measurement, because the timestamp is generated when all
  data up to and including the buffer at send() was acknowledged: the
  cumulative acknowledgment. The mechanism ignores SACK and FACK.
+  This flag can be enabled via both socket options and control messages.


1.3.2 Timestamp Reporting
@@ -183,6 +194,37 @@ having access to the contents of the original packet, so cannot be
combined with SOF_TIMESTAMPING_OPT_TSONLY.


+1.3.4. Enabling timestamps via control messages
+
+In addition to socket options, timestamp generation can be requested
+per write via cmsg, only for SOF_TIMESTAMPING_TX_* (see Section 1.3.1).
+Using this feature, applications can sample timestamps per sendmsg()
+without paying the overhead of enabling and disabling timestamps via
+setsockopt:
+
+  struct msghdr *msg;
+  ...
+  cmsg			       = CMSG_FIRSTHDR(msg);
+  cmsg->cmsg_level	       = SOL_SOCKET;
+  cmsg->cmsg_type	       = SO_TIMESTAMPING;
+  cmsg->cmsg_len	       = CMSG_LEN(sizeof(__u32));
+  *((__u32 *) CMSG_DATA(cmsg)) = SOF_TIMESTAMPING_TX_SCHED |
+				 SOF_TIMESTAMPING_TX_SOFTWARE |
+				 SOF_TIMESTAMPING_TX_ACK;
+  err = sendmsg(fd, msg, 0);
+
+The SOF_TIMESTAMPING_TX_* flags set via cmsg will override
+the SOF_TIMESTAMPING_TX_* flags set via setsockopt.
+
+Moreover, applications must still enable timestamp reporting via
+setsockopt to receive timestamps:
+
+  __u32 val = SOF_TIMESTAMPING_SOFTWARE |
+	      SOF_TIMESTAMPING_OPT_ID /* or any other flag */;
+  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) val,
+                   sizeof(val));
+
+
1.4 Bytestream Timestamps

The SO_TIMESTAMPING interface supports timestamping of bytes in a
+2 −1
@@ -861,7 +861,8 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
		goto drop;

	if (skb->sk && sk_fullsock(skb->sk)) {
-		sock_tx_timestamp(skb->sk, &skb_shinfo(skb)->tx_flags);
+		sock_tx_timestamp(skb->sk, skb->sk->sk_tsflags,
+				  &skb_shinfo(skb)->tx_flags);
		sw_tx_timestamp(skb);
	}

+2 −1
@@ -56,6 +56,7 @@ static inline unsigned int ip_hdrlen(const struct sk_buff *skb)
}

struct ipcm_cookie {
+	struct sockcm_cookie	sockc;
	__be32			addr;
	int			oif;
	struct ip_options_rcu	*opt;
@@ -550,7 +551,7 @@ int ip_options_rcv_srr(struct sk_buff *skb);

void ipv4_pktinfo_prepare(const struct sock *sk, struct sk_buff *skb);
void ip_cmsg_recv_offset(struct msghdr *msg, struct sk_buff *skb, int offset);
-int ip_cmsg_send(struct net *net, struct msghdr *msg,
+int ip_cmsg_send(struct sock *sk, struct msghdr *msg,
		 struct ipcm_cookie *ipc, bool allow_ipv6);
int ip_setsockopt(struct sock *sk, int level, int optname, char __user *optval,
		  unsigned int optlen);
+4 −2
@@ -867,7 +867,8 @@ int ip6_append_data(struct sock *sk,
				int odd, struct sk_buff *skb),
		    void *from, int length, int transhdrlen, int hlimit,
		    int tclass, struct ipv6_txoptions *opt, struct flowi6 *fl6,
-		    struct rt6_info *rt, unsigned int flags, int dontfrag);
+		    struct rt6_info *rt, unsigned int flags, int dontfrag,
+		    const struct sockcm_cookie *sockc);

int ip6_push_pending_frames(struct sock *sk);

@@ -884,7 +885,8 @@ struct sk_buff *ip6_make_skb(struct sock *sk,
			     void *from, int length, int transhdrlen,
			     int hlimit, int tclass, struct ipv6_txoptions *opt,
			     struct flowi6 *fl6, struct rt6_info *rt,
-			     unsigned int flags, int dontfrag);
+			     unsigned int flags, int dontfrag,
+			     const struct sockcm_cookie *sockc);

static inline struct sk_buff *ip6_finish_skb(struct sock *sk)
{
+9 −4
@@ -1418,8 +1418,11 @@ void sk_send_sigurg(struct sock *sk);

struct sockcm_cookie {
	u32 mark;
+	u16 tsflags;
};

+int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg,
+		     struct sockcm_cookie *sockc);
int sock_cmsg_send(struct sock *sk, struct msghdr *msg,
		   struct sockcm_cookie *sockc);

@@ -2054,19 +2057,21 @@ static inline void sock_recv_ts_and_drops(struct msghdr *msg, struct sock *sk,
		sk->sk_stamp = skb->tstamp;
}

-void __sock_tx_timestamp(const struct sock *sk, __u8 *tx_flags);
+void __sock_tx_timestamp(__u16 tsflags, __u8 *tx_flags);

/**
 * sock_tx_timestamp - checks whether the outgoing packet is to be time stamped
 * @sk:		socket sending this packet
+ * @tsflags:	timestamping flags to use
 * @tx_flags:	completed with instructions for time stamping
 *
 * Note : callers should take care of initial *tx_flags value (usually 0)
 */
-static inline void sock_tx_timestamp(const struct sock *sk, __u8 *tx_flags)
+static inline void sock_tx_timestamp(const struct sock *sk, __u16 tsflags,
+				     __u8 *tx_flags)
{
-	if (unlikely(sk->sk_tsflags))
-		__sock_tx_timestamp(sk, tx_flags);
+	if (unlikely(tsflags))
+		__sock_tx_timestamp(tsflags, tx_flags);
	if (unlikely(sock_flag(sk, SOCK_WIFI_STATUS)))
		*tx_flags |= SKBTX_WIFI_STATUS;
}
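The effect of passing tsflags explicitly, instead of reading sk->sk_tsflags inside the helper, can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the M_* constants are hypothetical stand-ins with illustrative values (not the kernel's SOF_TIMESTAMPING_* or SKBTX_* values), and model_effective_tsflags() is an assumed rendering of the documented rule that TX flags set via cmsg override the TX flags set via setsockopt while reporting flags are left alone.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for SOF_TIMESTAMPING_* bits; values are illustrative. */
enum {
	M_TX_HARDWARE = 1 << 0,
	M_TX_SOFTWARE = 1 << 1,
	M_TX_SCHED    = 1 << 2,
	M_TX_ACK      = 1 << 3,
	M_REPORT_SW   = 1 << 4,	/* a reporting flag; not overridable per call */
	M_TX_MASK     = M_TX_HARDWARE | M_TX_SOFTWARE | M_TX_SCHED | M_TX_ACK,
};

/* Hypothetical stand-ins for the per-packet bits stored in tx_flags. */
enum {
	M_SKB_HW    = 1 << 0,
	M_SKB_SW    = 1 << 1,
	M_SKB_SCHED = 1 << 2,
	M_SKB_ACK   = 1 << 3,
};

/*
 * Assumed override rule: start from the socket-wide flags; if a cmsg is
 * present, replace only the TX generation bits with the cmsg's bits.
 */
static uint16_t model_effective_tsflags(uint16_t sk_tsflags, int have_cmsg,
					uint16_t cmsg_flags)
{
	if (!have_cmsg)
		return sk_tsflags;
	return (uint16_t)((sk_tsflags & ~M_TX_MASK) | (cmsg_flags & M_TX_MASK));
}

/* Model of the new helper shape: it sees only the flags it is handed. */
static void model_tx_timestamp(uint16_t tsflags, uint8_t *tx_flags)
{
	if (tsflags & M_TX_HARDWARE)
		*tx_flags |= M_SKB_HW;
	if (tsflags & M_TX_SOFTWARE)
		*tx_flags |= M_SKB_SW;
	if (tsflags & M_TX_SCHED)
		*tx_flags |= M_SKB_SCHED;
	if (tsflags & M_TX_ACK)
		*tx_flags |= M_SKB_ACK;
}
```

In this model a caller without a per-call cookie passes the socket's own flags, as the tun_net_xmit() hunk does with skb->sk->sk_tsflags, while a sendmsg() path would pass the flags carried in its sockcm_cookie.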