Donate to e Foundation | Murena handsets with /e/OS | Own a part of Murena! Learn more

Commit 5185ad61 authored by David S. Miller's avatar David S. Miller
Browse files

Merge tag 'mlx5-updates-2017-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux



Saeed Mahameed says:

====================
mlx5-updates-2017-06-27 (Innova IPsec offload support)

This patchset adds support for Innova IPSec network interface card.

About Innova device:
--------------------
Innova is a network card with a ConnectX chip and an FPGA chip as a
 bump-on-the-wire.

               Internal
+----------+   Link       +-----------------+
|          +--------------+      FPGA       |  +------+
| ConnectX |              |  Shell          +--+ QSFP |
|          +--------------+    +-------+    |  | Port |
+----------+      I2C     |    |  SBU  |    |  +------+
                          |    +-------+    |
                          +--+----------+---+
                             |          |
                          +--+--+   +---+---+
                          | DDR |   | Flash |
                          +-----+   +-------+

The FPGA synthesized logic is loaded from dedicated flash storage and has
 access to its own dedicated DDR RAM.
The ConnectX chip firmware programs the FPGA by accessing its configuration
space over either the slow internal I2C link or the high-speed internal link.

The FPGA logic is divided into a "Shell" and a "Sandbox Unit" (SBU).
mlx5_core driver (with CONFIG_MLX5_FPGA) handles all shell functionality,
while other components may handle the various SBU functionalities.

The driver opens high-speed reliable communication channels with the shell and
the SBU over the internal link.
These channels may be used for high-bandwidth configuration or for SBU-specific
out-of-band data paths.

About Innova IPSec device:
--------------------------
Innova IPSec is a network card that allows offloading IPSec cryptography operations
from the host CPU to the NIC. It is an Innova card with an IPSec SBU.
The hardware keeps the database of IPSec Security Associations (SADB) in the FPGA's
DDR memory.

               Internal
+----------+   Link       +-----------------+
|          +--------------+      FPGA       |  +------+
| ConnectX |              |  Shell          +--+ QSFP |
|          +--------------+    +-------+    |  | Port |
+----------+ Internal I2C |    | IPSec |    |  +------+
                          |    |  SBU  |    |
                          |    +-------+    |
                          +--+----------+---+
                             |          |
                          +--+--+   +---+---+
                          | DDR |   |       |
                          |     |   | Flash |
                          |SADB |   |       |
                          +-----+   +-------+

Modes and ciphers:
Currently the following modes and ciphers are supported:
IPv4 and IPv6
ESP tunnel and transport modes
AES 128 and 256 bit encryption, with GCM authentication (RFC4106)

IV is generated using seqiv, in sync with Linux's geniv.

More modes and ciphers may be added later.

Notes:
In the future similar functionality will be included in a single-chip NIC.

About the driver:
-----------------
Patches 1-4 prepare some existing driver code for the new feature:
  * Add support for reserved GIDs in the hardware GID table
  * Allow multiple modules to enable hardware RoCE support independently
Patches 5-6 define structs and helper functions for QP work-queues.
Patches 7-11 add various FPGA-related features required for Innova.
IPSec.
Patch 12 adds abstraction layer for Mellanox IPSec-offload capable devices.
atches 13-16 add IPSec offload support to the mlx5 netdevice.

This driver services the new IPSec offload API introduced in commit
d77e38e6 ("xfrm: Add an IPsec hardware offloading API")

Configuration Path:
If Innova IPSec device is detected, the mlx5e netdevice gets the new
NETIF_F_HW_ESP feature and the xdo callbacks, indicating ESP offload
capabilities, and also the matching TX checksum and GSO features.

The driver configures offloaded Security Associations (SAs) by sending
an ADD_SA or DEL_SA message to the IPSec SBU, which updates the SADB in DDR.
These messages and their responses are sent over a high-speed channel.
Counters for ethtool are retrieved by the driver from the SBU.

Data path:
On receive path, the SBU decrypts ESP packets which match the offloaded SADB,
but keeps them encapsulated.
The SBU injects metadata (Mellanox owned ethertype) indicating that crypto-offload
has taken place, the SA with which it was done, and the authentication result.

The ConnectX chip performs RX checksum offload on the packet, and RSS using the
ESP SPI value.  The driver detects the special ethertype, and attaches a struct
secpath to the RX SKB, including flags to indicate that crypto offload took place,
the authentication result, and which xfrm_state was used for decryption, in the
olen and ovec members. The RX SKB may have useful CHECKSUM_COMPLETE. A separate
patchset will add support for that in the xfrm stack.

On transmit path, the stack encapsulates the packet but does not encrypt it, and
indicates in the SKB's secpath that crypto offload is to be performed and the SA
to use to do so.
The driver avoids performing crypto-offload for ESP fragments, and packets with
IP options, as the SBU cannot currently do that.  For eligible packets, the driver
prepends a special ethertype with metadata instructing the hardware to perform crypto offload.
The stack builds regular (non-GSO) SKBs so that they contain a placeholder for the ESP trailer.
The driver trims it off, because the SBU automatically appends the trailer for offloaded packets.
The ConnectX chip performs TX checksum offload on inner UDP or TCP packets,
and GSO for TCP packets (duplicating the prepended metadata).
The segmented packets then undergo encryption in the SBU before going on the wire.

Performance:
We measure single stream of TCP on Intel(R) Xeon(R) CPU E5-2643 v2 @3.50GHz
Using AES-NI with ESP GSO we get constant 4.1 Gbps.
Using crypto offload we get constant 18 Gbps.

Note that these numbers require CHECKSUM_COMPLETE support in XFRM, which we submit separately.

-  Ilan Tayari
====================

Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
parents 869684a7 164f16f7
Loading
Loading
Loading
Loading
+10 −0
Original line number Diff line number Diff line
@@ -8327,6 +8327,16 @@ Q: http://patchwork.ozlabs.org/project/netdev/list/
F:	drivers/net/ethernet/mellanox/mlx5/core/fpga/*
F:	include/linux/mlx5/mlx5_ifc_fpga.h

MELLANOX ETHERNET INNOVA IPSEC DRIVER
M:	Ilan Tayari <ilant@mellanox.com>
R:	Boris Pismenny <borisp@mellanox.com>
L:	netdev@vger.kernel.org
S:	Supported
W:	http://www.mellanox.com
Q:	http://patchwork.ozlabs.org/project/netdev/list/
F:	drivers/net/ethernet/mellanox/mlx5/core/en_ipsec/*
F:	drivers/net/ethernet/mellanox/mlx5/core/ipsec*

MELLANOX ETHERNET SWITCH DRIVERS
M:	Jiri Pirko <jiri@mellanox.com>
M:	Ido Schimmel <idosch@mellanox.com>
+53 −66
Original line number Diff line number Diff line
@@ -223,7 +223,7 @@ static int translate_eth_proto_oper(u32 eth_proto_oper, u8 *active_speed,
	return 0;
}

static void mlx5_query_port_roce(struct ib_device *device, u8 port_num,
static int mlx5_query_port_roce(struct ib_device *device, u8 port_num,
				struct ib_port_attr *props)
{
	struct mlx5_ib_dev *dev = to_mdev(device);
@@ -232,12 +232,14 @@ static void mlx5_query_port_roce(struct ib_device *device, u8 port_num,
	enum ib_mtu ndev_ib_mtu;
	u16 qkey_viol_cntr;
	u32 eth_prot_oper;
	int err;

	/* Possible bad flows are checked before filling out props so in case
	 * of an error it will still be zeroed out.
	 */
	if (mlx5_query_port_eth_proto_oper(mdev, &eth_prot_oper, port_num))
		return;
	err = mlx5_query_port_eth_proto_oper(mdev, &eth_prot_oper, port_num);
	if (err)
		return err;

	translate_eth_proto_oper(eth_prot_oper, &props->active_speed,
				 &props->active_width);
@@ -258,7 +260,7 @@ static void mlx5_query_port_roce(struct ib_device *device, u8 port_num,

	ndev = mlx5_ib_get_netdev(device, port_num);
	if (!ndev)
		return;
		return 0;

	if (mlx5_lag_is_active(dev->mdev)) {
		rcu_read_lock();
@@ -281,75 +283,49 @@ static void mlx5_query_port_roce(struct ib_device *device, u8 port_num,
	dev_put(ndev);

	props->active_mtu	= min(props->max_mtu, ndev_ib_mtu);
	return 0;
}

static void ib_gid_to_mlx5_roce_addr(const union ib_gid *gid,
				     const struct ib_gid_attr *attr,
				     void *mlx5_addr)
static int set_roce_addr(struct mlx5_ib_dev *dev, u8 port_num,
			 unsigned int index, const union ib_gid *gid,
			 const struct ib_gid_attr *attr)
{
#define MLX5_SET_RA(p, f, v) MLX5_SET(roce_addr_layout, p, f, v)
	char *mlx5_addr_l3_addr	= MLX5_ADDR_OF(roce_addr_layout, mlx5_addr,
					       source_l3_address);
	void *mlx5_addr_mac	= MLX5_ADDR_OF(roce_addr_layout, mlx5_addr,
					       source_mac_47_32);
	enum ib_gid_type gid_type = IB_GID_TYPE_IB;
	u8 roce_version = 0;
	u8 roce_l3_type = 0;
	bool vlan = false;
	u8 mac[ETH_ALEN];
	u16 vlan_id = 0;

	if (!gid)
		return;

	ether_addr_copy(mlx5_addr_mac, attr->ndev->dev_addr);
	if (gid) {
		gid_type = attr->gid_type;
		ether_addr_copy(mac, attr->ndev->dev_addr);

		if (is_vlan_dev(attr->ndev)) {
		MLX5_SET_RA(mlx5_addr, vlan_valid, 1);
		MLX5_SET_RA(mlx5_addr, vlan_id, vlan_dev_vlan_id(attr->ndev));
			vlan = true;
			vlan_id = vlan_dev_vlan_id(attr->ndev);
		}
	}

	switch (attr->gid_type) {
	switch (gid_type) {
	case IB_GID_TYPE_IB:
		MLX5_SET_RA(mlx5_addr, roce_version, MLX5_ROCE_VERSION_1);
		roce_version = MLX5_ROCE_VERSION_1;
		break;
	case IB_GID_TYPE_ROCE_UDP_ENCAP:
		MLX5_SET_RA(mlx5_addr, roce_version, MLX5_ROCE_VERSION_2);
		break;

	default:
		WARN_ON(true);
	}

	if (attr->gid_type != IB_GID_TYPE_IB) {
		roce_version = MLX5_ROCE_VERSION_2;
		if (ipv6_addr_v4mapped((void *)gid))
			MLX5_SET_RA(mlx5_addr, roce_l3_type,
				    MLX5_ROCE_L3_TYPE_IPV4);
			roce_l3_type = MLX5_ROCE_L3_TYPE_IPV4;
		else
			MLX5_SET_RA(mlx5_addr, roce_l3_type,
				    MLX5_ROCE_L3_TYPE_IPV6);
	}
			roce_l3_type = MLX5_ROCE_L3_TYPE_IPV6;
		break;

	if ((attr->gid_type == IB_GID_TYPE_IB) ||
	    !ipv6_addr_v4mapped((void *)gid))
		memcpy(mlx5_addr_l3_addr, gid, sizeof(*gid));
	else
		memcpy(&mlx5_addr_l3_addr[12], &gid->raw[12], 4);
	default:
		mlx5_ib_warn(dev, "Unexpected GID type %u\n", gid_type);
	}

static int set_roce_addr(struct ib_device *device, u8 port_num,
			 unsigned int index,
			 const union ib_gid *gid,
			 const struct ib_gid_attr *attr)
{
	struct mlx5_ib_dev *dev = to_mdev(device);
	u32  in[MLX5_ST_SZ_DW(set_roce_address_in)]  = {0};
	u32 out[MLX5_ST_SZ_DW(set_roce_address_out)] = {0};
	void *in_addr = MLX5_ADDR_OF(set_roce_address_in, in, roce_address);
	enum rdma_link_layer ll = mlx5_ib_port_link_layer(device, port_num);

	if (ll != IB_LINK_LAYER_ETHERNET)
		return -EINVAL;

	ib_gid_to_mlx5_roce_addr(gid, attr, in_addr);

	MLX5_SET(set_roce_address_in, in, roce_address_index, index);
	MLX5_SET(set_roce_address_in, in, opcode, MLX5_CMD_OP_SET_ROCE_ADDRESS);
	return mlx5_cmd_exec(dev->mdev, in, sizeof(in), out, sizeof(out));
	return mlx5_core_roce_gid_set(dev->mdev, index, roce_version,
				      roce_l3_type, gid->raw, mac, vlan,
				      vlan_id);
}

static int mlx5_ib_add_gid(struct ib_device *device, u8 port_num,
@@ -357,13 +333,13 @@ static int mlx5_ib_add_gid(struct ib_device *device, u8 port_num,
			   const struct ib_gid_attr *attr,
			   __always_unused void **context)
{
	return set_roce_addr(device, port_num, index, gid, attr);
	return set_roce_addr(to_mdev(device), port_num, index, gid, attr);
}

static int mlx5_ib_del_gid(struct ib_device *device, u8 port_num,
			   unsigned int index, __always_unused void **context)
{
	return set_roce_addr(device, port_num, index, NULL, NULL);
	return set_roce_addr(to_mdev(device), port_num, index, NULL, NULL);
}

__be16 mlx5_get_roce_udp_sport(struct mlx5_ib_dev *dev, u8 port_num,
@@ -978,20 +954,31 @@ static int mlx5_query_hca_port(struct ib_device *ibdev, u8 port,
int mlx5_ib_query_port(struct ib_device *ibdev, u8 port,
		       struct ib_port_attr *props)
{
	unsigned int count;
	int ret;

	switch (mlx5_get_vport_access_method(ibdev)) {
	case MLX5_VPORT_ACCESS_METHOD_MAD:
		return mlx5_query_mad_ifc_port(ibdev, port, props);
		ret = mlx5_query_mad_ifc_port(ibdev, port, props);
		break;

	case MLX5_VPORT_ACCESS_METHOD_HCA:
		return mlx5_query_hca_port(ibdev, port, props);
		ret = mlx5_query_hca_port(ibdev, port, props);
		break;

	case MLX5_VPORT_ACCESS_METHOD_NIC:
		mlx5_query_port_roce(ibdev, port, props);
		return 0;
		ret = mlx5_query_port_roce(ibdev, port, props);
		break;

	default:
		return -EINVAL;
		ret = -EINVAL;
	}

	if (!ret && props) {
		count = mlx5_core_reserved_gids_count(to_mdev(ibdev)->mdev);
		props->gid_tbl_len -= count;
	}
	return ret;
}

static int mlx5_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
+16 −0
Original line number Diff line number Diff line
@@ -11,9 +11,13 @@ config MLX5_CORE
	  Core driver for low level functionality of the ConnectX-4 and
	  Connect-IB cards by Mellanox Technologies.

config MLX5_ACCEL
	bool

config MLX5_FPGA
        bool "Mellanox Technologies Innova support"
        depends on MLX5_CORE
	select MLX5_ACCEL
        ---help---
          Build support for the Innova family of network cards by Mellanox
          Technologies. Innova network cards are comprised of a ConnectX chip
@@ -48,3 +52,15 @@ config MLX5_CORE_IPOIB
	default n
	---help---
	  MLX5 IPoIB offloads & acceleration support.

config MLX5_EN_IPSEC
	bool "IPSec XFRM cryptography-offload accelaration"
	depends on MLX5_ACCEL
	depends on MLX5_CORE_EN
	depends on XFRM_OFFLOAD
	depends on INET_ESP_OFFLOAD || INET6_ESP_OFFLOAD
	default n
	---help---
	  Build support for IPsec cryptography-offload accelaration in the NIC.
	  Note: Support for hardware with this capability needs to be selected
	  for this option to become available.
+8 −2
Original line number Diff line number Diff line
@@ -4,9 +4,12 @@ subdir-ccflags-y += -I$(src)
mlx5_core-y :=	main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
		health.o mcg.o cq.o srq.o alloc.o qp.o port.o mr.o pd.o \
		mad.o transobj.o vport.o sriov.o fs_cmd.o fs_core.o \
		fs_counters.o rl.o lag.o dev.o
		fs_counters.o rl.o lag.o dev.o lib/gid.o

mlx5_core-$(CONFIG_MLX5_FPGA) += fpga/cmd.o fpga/core.o
mlx5_core-$(CONFIG_MLX5_ACCEL) += accel/ipsec.o

mlx5_core-$(CONFIG_MLX5_FPGA) += fpga/cmd.o fpga/core.o fpga/conn.o fpga/sdk.o \
		fpga/ipsec.o

mlx5_core-$(CONFIG_MLX5_CORE_EN) += wq.o eswitch.o eswitch_offloads.o \
		en_main.o en_common.o en_fs.o en_ethtool.o en_tx.o \
@@ -16,3 +19,6 @@ mlx5_core-$(CONFIG_MLX5_CORE_EN) += wq.o eswitch.o eswitch_offloads.o \
mlx5_core-$(CONFIG_MLX5_CORE_EN_DCB) +=  en_dcbnl.o

mlx5_core-$(CONFIG_MLX5_CORE_IPOIB) += ipoib/ipoib.o ipoib/ethtool.o

mlx5_core-$(CONFIG_MLX5_EN_IPSEC) += en_accel/ipsec.o en_accel/ipsec_rxtx.o \
		en_accel/ipsec_stats.o
+78 −0
Original line number Diff line number Diff line
/*
 * Copyright (c) 2017 Mellanox Technologies. All rights reserved.
 *
 * This software is available to you under a choice of one of two
 * licenses.  You may choose to be licensed under the terms of the GNU
 * General Public License (GPL) Version 2, available from the file
 * COPYING in the main directory of this source tree, or the
 * OpenIB.org BSD license below:
 *
 *     Redistribution and use in source and binary forms, with or
 *     without modification, are permitted provided that the following
 *     conditions are met:
 *
 *      - Redistributions of source code must retain the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer.
 *
 *      - Redistributions in binary form must reproduce the above
 *        copyright notice, this list of conditions and the following
 *        disclaimer in the documentation and/or other materials
 *        provided with the distribution.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 *
 */

#include <linux/mlx5/device.h>

#include "accel/ipsec.h"
#include "mlx5_core.h"
#include "fpga/ipsec.h"

void *mlx5_accel_ipsec_sa_cmd_exec(struct mlx5_core_dev *mdev,
				   struct mlx5_accel_ipsec_sa *cmd)
{
	if (!MLX5_IPSEC_DEV(mdev))
		return ERR_PTR(-EOPNOTSUPP);

	return mlx5_fpga_ipsec_sa_cmd_exec(mdev, cmd);
}

int mlx5_accel_ipsec_sa_cmd_wait(void *ctx)
{
	return mlx5_fpga_ipsec_sa_cmd_wait(ctx);
}

u32 mlx5_accel_ipsec_device_caps(struct mlx5_core_dev *mdev)
{
	return mlx5_fpga_ipsec_device_caps(mdev);
}

unsigned int mlx5_accel_ipsec_counters_count(struct mlx5_core_dev *mdev)
{
	return mlx5_fpga_ipsec_counters_count(mdev);
}

int mlx5_accel_ipsec_counters_read(struct mlx5_core_dev *mdev, u64 *counters,
				   unsigned int count)
{
	return mlx5_fpga_ipsec_counters_read(mdev, counters, count);
}

int mlx5_accel_ipsec_init(struct mlx5_core_dev *mdev)
{
	return mlx5_fpga_ipsec_init(mdev);
}

void mlx5_accel_ipsec_cleanup(struct mlx5_core_dev *mdev)
{
	mlx5_fpga_ipsec_cleanup(mdev);
}
Loading