Commit cf1ef3f0 authored by Wei Wang, committed by David S. Miller

net/tcp_fastopen: Disable active side TFO in certain scenarios

Middlebox firewall issues can potentially cause the server's data to be
blackholed after a successful 3WHS using TFO. The following reports from
Apple describe the problem:
https://www.nanog.org/sites/default/files/Paasch_Network_Support.pdf
Slide 31 identifies an issue where the client ACK to the server's data
sent during a TFO'd handshake is dropped.
C ---> syn-data ---> S
C <--- syn/ack ----- S
S (accept & write)
C <---- data ------- S
C ----- ACK -> X     S
		[retry and timeout]

https://www.ietf.org/proceedings/94/slides/slides-94-tcpm-13.pdf
Slide 5 shows a similar situation in which the server's data gets
dropped after 3WHS.
C ---- syn-data ---> S
C <--- syn/ack ----- S
C ---- ack --------> S
S (accept & write)
C?  X <- data ------ S
		[retry and timeout]

This is the worst failure because the client cannot detect such behavior
and mitigate it (for example, by disabling TFO). Failing to proceed, the
application (e.g., an SSL library) may simply time out and retry with
TFO again, and the process repeats indefinitely.

The proposed solution is to disable active TFO globally under the
following circumstances:
1. a client-side TFO socket detects an out-of-order FIN
2. a client-side TFO socket receives an out-of-order RST

We disable active-side TFO globally for 1 hour at first. If the problem
happens again, we disable it for 2 hours, then 4 hours, 8 hours, and so
on. We reset the timeout to 1 hour once a client-side TFO socket that
was not opened on loopback has successfully received data segments from
the server. This condition is examined during close().
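
The doubling schedule described above can be sketched as a small
userspace model. This is only an illustration, not the kernel code: the
struct, the function names, and the cap of 2^6 on the multiplier are
assumptions introduced here, and time is passed in as plain seconds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these names are not kernel symbols. */
#define TFO_MODEL_MAX_SHIFT 6	/* assumed cap on doubling, not stated above */

struct tfo_backoff_model {
	unsigned int  initial_timeout_sec;	/* 0 turns the logic off */
	unsigned int  disable_times;		/* blackhole detections so far */
	unsigned long disable_stamp_sec;	/* time of the latest detection */
};

/* Record one more blackhole detection at time now_sec. */
static void tfo_model_disable(struct tfo_backoff_model *m,
			      unsigned long now_sec)
{
	m->disable_times++;
	m->disable_stamp_sec = now_sec;
}

/* Forget past detections; the next one starts from the initial timeout. */
static void tfo_model_reset(struct tfo_backoff_model *m)
{
	m->disable_times = 0;
}

/* Current disable period: initial, 2x, 4x, ... up to the assumed cap. */
static unsigned long tfo_model_timeout_sec(const struct tfo_backoff_model *m)
{
	unsigned int shift;

	if (m->disable_times == 0 || m->initial_timeout_sec == 0)
		return 0;
	shift = m->disable_times - 1;
	if (shift > TFO_MODEL_MAX_SHIFT)
		shift = TFO_MODEL_MAX_SHIFT;
	return (unsigned long)m->initial_timeout_sec << shift;
}

/* True while the latest disable period has not yet elapsed. */
static bool tfo_model_should_disable(const struct tfo_backoff_model *m,
				     unsigned long now_sec)
{
	assert(now_sec >= m->disable_stamp_sec);
	return now_sec - m->disable_stamp_sec < tfo_model_timeout_sec(m);
}
```

With an initial timeout of 3600 seconds, three detections in a row give
periods of 3600, 7200 and 14400 seconds, and a reset returns the model
to the initial value for the next detection.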

The rationale is that when such a firewall issue happens, the
application running on the client should eventually close the socket
because it cannot get the data it expects, or the application running
on the server should close the socket because it receives no response
from the client.
In both cases, an out-of-order FIN or RST will be received on the
client, since the firewall will not block frames that carry no data.
We disable active TFO globally because that helps when the middlebox is
very close to the client and most connections are likely to fail.
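
The close-time decision described above can be modeled roughly as
follows. This is a hedged userspace sketch, not the kernel
implementation: the struct, its fields, and the enum are hypothetical,
and treating "no data segments received" as a precondition for
disabling is an assumption made here to keep the two outcomes distinct.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes of the check performed when a socket closes. */
enum tfo_close_action {
	TFO_CLOSE_NOTHING,		/* leave the global state alone */
	TFO_CLOSE_DISABLE,		/* looks like a blackhole: disable active TFO */
	TFO_CLOSE_RESET_TIMEOUT		/* path works: return to the initial timeout */
};

/* Hypothetical per-socket summary; the field names are illustrative only. */
struct tfo_close_view {
	bool used_tfo;			/* client-side socket that used TFO */
	bool on_loopback;		/* socket was opened on loopback */
	unsigned int data_segs_in;	/* data segments received from the server */
	bool ooo_fin_or_rst;		/* an out-of-order FIN or RST was seen */
};

static enum tfo_close_action
tfo_close_decide(const struct tfo_close_view *v)
{
	assert(v);

	if (!v->used_tfo)
		return TFO_CLOSE_NOTHING;

	/* Assumption: only sockets that never received data can disable TFO. */
	if (v->data_segs_in == 0)
		return v->ooo_fin_or_rst ? TFO_CLOSE_DISABLE
					 : TFO_CLOSE_NOTHING;

	/* Data arrived; loopback sockets say nothing about middleboxes. */
	return v->on_loopback ? TFO_CLOSE_NOTHING : TFO_CLOSE_RESET_TIMEOUT;
}
```

Keeping the decision in one pure function makes the three cases easy to
exercise in isolation before wiring them to real socket state.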

Also, add a debug sysctl:
  tcp_fastopen_blackhole_timeout_sec:
    the initial timeout to use when a firewall blackhole issue happens.
    This can be set and read.
    Setting it to 0 turns off the active disable logic.
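
A minimal userspace sketch of reading this sysctl is shown below. The
helper names are hypothetical, and the path assumes the entry appears
under /proc/sys/net/ipv4 like the other entries of ipv4_table; on a
kernel without this patch the file is absent and the read helper
returns -1.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed location of the sysctl under procfs. */
#define TFO_BLACKHOLE_SYSCTL_PATH \
	"/proc/sys/net/ipv4/tcp_fastopen_blackhole_timeout_sec"

/* Parse the text form of the sysctl. Returns 0 on success, -1 on bad input. */
static int tfo_parse_timeout(const char *text, unsigned int *out)
{
	char *end;
	long val;

	assert(text && out);
	errno = 0;
	val = strtol(text, &end, 10);
	if (errno != 0 || end == text || val < 0 || val > INT_MAX)
		return -1;
	/* Allow trailing whitespace such as the newline procfs appends. */
	while (*end == ' ' || *end == '\t' || *end == '\n')
		end++;
	if (*end != '\0')
		return -1;
	*out = (unsigned int)val;
	return 0;
}

/* Read the value from path. Returns -1 if the file is missing or invalid. */
static int tfo_read_timeout(const char *path, unsigned int *out)
{
	char buf[32];
	int ret = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = tfo_parse_timeout(buf, out);
	fclose(f);
	return ret;
}
```

Calling tfo_read_timeout(TFO_BLACKHOLE_SYSCTL_PATH, &val) and finding
val equal to 0 would indicate that the active disable logic is turned
off. Writing a new value requires root and goes through the same file.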

Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
parent bc95cd8e
+8 −0
@@ -602,6 +602,14 @@ tcp_fastopen - INTEGER
	Note that that additional client or server features are only
	effective if the basic support (0x1 and 0x2) are enabled respectively.

tcp_fastopen_blackhole_timeout_sec - INTEGER
	Initial time period in second to disable Fastopen on active TCP sockets
	when a TFO firewall blackhole issue happens.
	This time period will grow exponentially when more blackhole issues
	get detected right after Fastopen is re-enabled and will reset to
	initial value when the blackhole issue goes away.
	By default, it is set to 1hr.

tcp_syn_retries - INTEGER
	Number of times initial SYNs for an active TCP connection attempt
	will be retransmitted. Should not be higher than 127. Default value
+1 −0
@@ -233,6 +233,7 @@ struct tcp_sock {
	u8	syn_data:1,	/* SYN includes data */
		syn_fastopen:1,	/* SYN includes Fast Open option */
		syn_fastopen_exp:1,/* SYN includes Fast Open exp. option */
		syn_fastopen_ch:1, /* Active TFO re-enabling probe */
		syn_data_acked:1,/* data in SYN is acked by SYN-ACK */
		save_syn:1,	/* Save headers of SYN packet */
		is_cwnd_limited:1;/* forward progress limited by snd_cwnd? */
+6 −0
@@ -1506,6 +1506,12 @@ struct tcp_fastopen_context {
	struct rcu_head		rcu;
};

extern unsigned int sysctl_tcp_fastopen_blackhole_timeout;
void tcp_fastopen_active_disable(void);
bool tcp_fastopen_active_should_disable(struct sock *sk);
void tcp_fastopen_active_disable_ofo_check(struct sock *sk);
void tcp_fastopen_active_timeout_reset(void);

/* Latencies incurred by various limits for a sender. They are
 * chronograph-like stats that are mutually exclusive.
 */
+21 −0
@@ -350,6 +350,19 @@ static int proc_udp_early_demux(struct ctl_table *table, int write,
	return ret;
}

static int proc_tfo_blackhole_detect_timeout(struct ctl_table *table,
					     int write,
					     void __user *buffer,
					     size_t *lenp, loff_t *ppos)
{
	int ret;

	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
	if (write && ret == 0)
		tcp_fastopen_active_timeout_reset();
	return ret;
}

static struct ctl_table ipv4_table[] = {
	{
		.procname	= "tcp_timestamps",
@@ -399,6 +412,14 @@ static struct ctl_table ipv4_table[] = {
		.maxlen		= ((TCP_FASTOPEN_KEY_LENGTH * 2) + 10),
		.proc_handler	= proc_tcp_fastopen_key,
	},
	{
		.procname	= "tcp_fastopen_blackhole_timeout_sec",
		.data		= &sysctl_tcp_fastopen_blackhole_timeout,
		.maxlen		= sizeof(int),
		.mode		= 0644,
		.proc_handler	= proc_tfo_blackhole_detect_timeout,
		.extra1		= &zero,
	},
	{
		.procname	= "tcp_abort_on_overflow",
		.data		= &sysctl_tcp_abort_on_overflow,
+1 −0
@@ -2296,6 +2296,7 @@ int tcp_disconnect(struct sock *sk, int flags)
	tcp_clear_xmit_timers(sk);
	__skb_queue_purge(&sk->sk_receive_queue);
	tcp_write_queue_purge(sk);
	tcp_fastopen_active_disable_ofo_check(sk);
	skb_rbtree_purge(&tp->out_of_order_queue);

	inet->inet_dport = 0;