Documentation/admin-guide/kernel-parameters.txt (+13 −0)

@@ -2233,6 +2233,17 @@
                         memory contents and reserves bad memory
                         regions that are detected.
 
+        mem_encrypt=    [X86-64] AMD Secure Memory Encryption (SME) control
+                        Valid arguments: on, off
+                        Default (depends on kernel configuration option):
+                          on  (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y)
+                          off (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=n)
+                        mem_encrypt=on:         Activate SME
+                        mem_encrypt=off:        Do not activate SME
+
+                        Refer to Documentation/x86/amd-memory-encryption.txt
+                        for details on when memory encryption can be activated.
+
         mem_sleep_default=      [SUSPEND] Default system suspend mode:
                         s2idle  - Suspend-To-Idle
                         shallow - Power-On Suspend or equivalent (if supported)
@@ -2696,6 +2707,8 @@
         nopat           [X86] Disable PAT (page attribute table extension of
                         pagetables) support.
 
+        nopcid          [X86-64] Disable the PCID cpu feature.
+
         norandmaps      Don't use address space randomization.  Equivalent to
                         echo 0 > /proc/sys/kernel/randomize_va_space
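The interaction between the mem_encrypt= argument and its configuration-dependent default can be sketched in C. This is only an illustration, not the kernel's own parser: the function name sme_activation_requested() and the default_on parameter (standing in for CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT) are hypothetical, and the choice that the last valid occurrence wins is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative only: decide whether SME activation is *requested*.
 * `cmdline` is a kernel command line string; `default_on` stands in for
 * CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT. Both names are hypothetical.
 * A request alone does not activate SME; the BIOS must have enabled it.
 */
static bool sme_activation_requested(const char *cmdline, bool default_on)
{
    static const char key[] = "mem_encrypt=";
    const size_t key_len = sizeof(key) - 1;
    bool requested = default_on;
    const char *p = cmdline;

    assert(cmdline != NULL);

    /* Walk space-separated tokens; assumption: the last valid one wins. */
    while (*p) {
        size_t len;

        while (*p == ' ')
            p++;
        len = strcspn(p, " ");

        if (len > key_len && strncmp(p, key, key_len) == 0) {
            const char *val = p + key_len;
            size_t val_len = len - key_len;

            if (val_len == 2 && strncmp(val, "on", 2) == 0)
                requested = true;
            else if (val_len == 3 && strncmp(val, "off", 3) == 0)
                requested = false;
            /* Any other value is ignored and the previous state kept. */
        }
        p += len;
    }
    return requested;
}
```

With no mem_encrypt= token on the command line the function simply returns the configured default, which mirrors the "Default (depends on kernel configuration option)" wording of the entry.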
Documentation/networking/switchdev.txt (+2 −2)

@@ -228,7 +228,7 @@ Learning on the device port should be enabled, as well as learning_sync:
         bridge link set dev DEV learning on self
         bridge link set dev DEV learning_sync on self
 
-Learning_sync attribute enables syncing of the learned/forgotton FDB entry to
+Learning_sync attribute enables syncing of the learned/forgotten FDB entry to
 the bridge's FDB.  It's possible, but not optimal, to enable learning on the
 device port and on the bridge port, and disable learning_sync.
 
@@ -245,7 +245,7 @@ the responsibility of the port driver/device to age out these entries.  If the
 port device supports ageing, when the FDB entry expires, it will notify the
 driver which in turn will notify the bridge with SWITCHDEV_FDB_DEL.  If the
 device does not support ageing, the driver can simulate ageing using a
-garbage collection timer to monitor FBD entries.  Expired entries will be
+garbage collection timer to monitor FDB entries.  Expired entries will be
 notified to the bridge using SWITCHDEV_FDB_DEL.  See rocker driver for
 example of driver running ageing timer.
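The simulated ageing described in the second hunk can be pictured as one pass of a garbage collection timer over a driver-private table. The sketch below is a userspace illustration under stated assumptions: struct fdb_entry, fdb_age_out() and the notify callback are hypothetical names, not kernel or rocker driver APIs, and the callback merely stands in for sending SWITCHDEV_FDB_DEL to the bridge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver-side FDB entry; not a kernel structure. */
struct fdb_entry {
    uint8_t  mac[6];
    uint16_t vid;
    uint64_t last_seen;  /* tick of the last traffic seen for this entry */
    bool     in_use;
};

/* Stand-in for notifying the bridge with SWITCHDEV_FDB_DEL. */
typedef void (*fdb_del_notify_t)(const struct fdb_entry *entry, void *ctx);

/* Example callback: counts notifications in a size_t passed via ctx. */
static void count_notification(const struct fdb_entry *entry, void *ctx)
{
    (void)entry;
    (*(size_t *)ctx)++;
}

/*
 * One garbage collection pass: every in-use entry idle for at least
 * `ageing_time` ticks is reported and released. Assumes `now` never
 * runs behind `last_seen`. Returns the number of expired entries.
 */
static size_t fdb_age_out(struct fdb_entry *table, size_t n, uint64_t now,
                          uint64_t ageing_time, fdb_del_notify_t notify,
                          void *ctx)
{
    size_t expired = 0;

    assert(table != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        struct fdb_entry *e = &table[i];

        if (!e->in_use)
            continue;
        if (now - e->last_seen < ageing_time)
            continue;

        if (notify)
            notify(e, ctx);
        e->in_use = false;
        expired++;
    }
    return expired;
}
```

A real driver would run such a pass from a periodic timer and refresh last_seen whenever the device reports activity for the entry.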
Documentation/sysctl/net.txt (+36 −11)

@@ -35,9 +35,34 @@ Table : Subdirectories in /proc/sys/net
 bpf_jit_enable
 --------------
 
-This enables Berkeley Packet Filter Just in Time compiler.
-Currently supported on x86_64 architecture, bpf_jit provides a framework
-to speed packet filtering, the one used by tcpdump/libpcap for example.
+This enables the BPF Just in Time (JIT) compiler. BPF is a flexible
+and efficient infrastructure allowing to execute bytecode at various
+hook points. It is used in a number of Linux kernel subsystems such
+as networking (e.g. XDP, tc), tracing (e.g. kprobes, uprobes, tracepoints)
+and security (e.g. seccomp). LLVM has a BPF back end that can compile
+restricted C into a sequence of BPF instructions. After program load
+through bpf(2) and passing a verifier in the kernel, a JIT will then
+translate these BPF proglets into native CPU instructions. There are
+two flavors of JITs, the newer eBPF JIT currently supported on:
+  - x86_64
+  - arm64
+  - ppc64
+  - sparc64
+  - mips64
+  - s390x
+
+And the older cBPF JIT supported on the following archs:
+  - arm
+  - mips
+  - ppc
+  - sparc
+
+eBPF JITs are a superset of cBPF JITs, meaning the kernel will
+migrate cBPF instructions into eBPF instructions and then JIT
+compile them transparently. Older cBPF JITs can only translate
+tcpdump filters, seccomp rules, etc, but not mentioned eBPF
+programs loaded through bpf(2).
+
 Values :
         0 - disable the JIT (default value)
         1 - enable the JIT
@@ -46,9 +71,9 @@ Values :
 bpf_jit_harden
 --------------
 
-This enables hardening for the Berkeley Packet Filter Just in Time compiler.
-Supported are eBPF JIT backends. Enabling hardening trades off performance,
-but can mitigate JIT spraying.
+This enables hardening for the BPF JIT compiler. Supported are eBPF
+JIT backends. Enabling hardening trades off performance, but can
+mitigate JIT spraying.
 Values :
         0 - disable JIT hardening (default value)
         1 - enable JIT hardening for unprivileged users only
@@ -57,11 +82,11 @@ Values :
 bpf_jit_kallsyms
 ----------------
 
-When Berkeley Packet Filter Just in Time compiler is enabled, then compiled
-images are unknown addresses to the kernel, meaning they neither show up in
-traces nor in /proc/kallsyms. This enables export of these addresses, which
-can be used for debugging/tracing. If bpf_jit_harden is enabled, this feature
-is disabled.
+When BPF JIT compiler is enabled, then compiled images are unknown
+addresses to the kernel, meaning they neither show up in traces nor
+in /proc/kallsyms. This enables export of these addresses, which can
+be used for debugging/tracing. If bpf_jit_harden is enabled, this
+feature is disabled.
 Values :
         0 - disable JIT kallsyms export (default value)
         1 - enable JIT kallsyms export for privileged users only
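The three knobs documented above are plain integer files under /proc/sys/net/core/ (for example /proc/sys/net/core/bpf_jit_enable). A minimal sketch of reading one from userspace follows; the helper names parse_sysctl_int() and read_int_sysctl() are made up for this illustration and are not part of any library.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the textual content of an integer sysctl; 0 on success, -1 on error. */
static int parse_sysctl_int(const char *text, int *value)
{
    char *end;
    long v;

    assert(text != NULL && value != NULL);

    errno = 0;
    v = strtol(text, &end, 10);
    if (end == text || errno != 0 || v < INT_MIN || v > INT_MAX)
        return -1;

    *value = (int)v;
    return 0;
}

/* Read an integer sysctl such as /proc/sys/net/core/bpf_jit_enable. */
static int read_int_sysctl(const char *path, int *value)
{
    char buf[32];
    FILE *f;
    int ret = -1;

    assert(path != NULL && value != NULL);

    f = fopen(path, "r");
    if (!f)
        return -1;  /* missing file, or no permission to read it */

    if (fgets(buf, sizeof(buf), f))
        ret = parse_sysctl_int(buf, value);

    fclose(f);
    return ret;
}
```

A caller would compare the returned value against the "Values :" lists in the text, treating 0 as disabled for each of the three settings.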
Documentation/x86/amd-memory-encryption.txt (new file, 0 → 100644, +68 −0)

+Secure Memory Encryption (SME) is a feature found on AMD processors.
+
+SME provides the ability to mark individual pages of memory as encrypted using
+the standard x86 page tables.  A page that is marked encrypted will be
+automatically decrypted when read from DRAM and encrypted when written to
+DRAM.  SME can therefore be used to protect the contents of DRAM from physical
+attacks on the system.
+
+A page is encrypted when a page table entry has the encryption bit set (see
+below on how to determine its position).  The encryption bit can also be
+specified in the cr3 register, allowing the PGD table to be encrypted. Each
+successive level of page tables can also be encrypted by setting the encryption
+bit in the page table entry that points to the next table. This allows the full
+page table hierarchy to be encrypted. Note, this means that just because the
+encryption bit is set in cr3, doesn't imply the full hierarchy is encrypted.
+Each page table entry in the hierarchy needs to have the encryption bit set to
+achieve that. So, theoretically, you could have the encryption bit set in cr3
+so that the PGD is encrypted, but not set the encryption bit in the PGD entry
+for a PUD which results in the PUD pointed to by that entry to not be
+encrypted.
+
+Support for SME can be determined through the CPUID instruction. The CPUID
+function 0x8000001f reports information related to SME:
+
+        0x8000001f[eax]:
+                Bit[0] indicates support for SME
+        0x8000001f[ebx]:
+                Bits[5:0]  pagetable bit number used to activate memory
+                           encryption
+                Bits[11:6] reduction in physical address space, in bits, when
+                           memory encryption is enabled (this only affects
+                           system physical addresses, not guest physical
+                           addresses)
+
+If support for SME is present, MSR 0xc0010010 (MSR_K8_SYSCFG) can be used to
+determine if SME is enabled and/or to enable memory encryption:
+
+        0xc0010010:
+                Bit[23]   0 = memory encryption features are disabled
+                          1 = memory encryption features are enabled
+
+Linux relies on BIOS to set this bit if BIOS has determined that the reduction
+in the physical address space as a result of enabling memory encryption (see
+CPUID information above) will not conflict with the address space resource
+requirements for the system.  If this bit is not set upon Linux startup then
+Linux itself will not set it and memory encryption will not be possible.
+
+The state of SME in the Linux kernel can be documented as follows:
+        - Supported:
+          The CPU supports SME (determined through CPUID instruction).
+
+        - Enabled:
+          Supported and bit 23 of MSR_K8_SYSCFG is set.
+
+        - Active:
+          Supported, Enabled and the Linux kernel is actively applying
+          the encryption bit to page table entries (the SME mask in the
+          kernel is non-zero).
+
+SME can also be enabled and activated in the BIOS. If SME is enabled and
+activated in the BIOS, then all memory accesses will be encrypted and it will
+not be necessary to activate the Linux memory encryption support.  If the BIOS
+merely enables SME (sets bit 23 of the MSR_K8_SYSCFG), then Linux can activate
+memory encryption by default (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) or
+by supplying mem_encrypt=on on the kernel command line.  However, if BIOS does
+not enable SME, then Linux will not be able to activate memory encryption, even
+if configured to do so by default or the mem_encrypt=on command line parameter
+is specified.
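The CPUID fields and the Supported / Enabled / Active distinction described in this file can be expressed as a few pure functions. The sketch below works on register values passed in by the caller and does not execute CPUID or read the MSR itself; the struct, enum and function names are illustrative and are not the kernel's own identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SME_CPUID_LEAF      0x8000001fu   /* CPUID function reporting SME */
#define MSR_K8_SYSCFG       0xc0010010u   /* MSR holding the enable bit */
#define SYSCFG_MEM_ENCRYPT  (1ull << 23)  /* Bit[23] of MSR_K8_SYSCFG */

/* Decoded view of CPUID 0x8000001f; the struct itself is hypothetical. */
struct sme_cpuid_info {
    bool     supported;            /* eax Bit[0] */
    unsigned c_bit;                /* ebx Bits[5:0]: page table bit number */
    unsigned phys_addr_reduction;  /* ebx Bits[11:6]: lost address bits */
};

static struct sme_cpuid_info sme_decode_cpuid(uint32_t eax, uint32_t ebx)
{
    struct sme_cpuid_info info;

    info.supported = (eax & 1u) != 0;
    info.c_bit = ebx & 0x3fu;
    info.phys_addr_reduction = (ebx >> 6) & 0x3fu;
    return info;
}

/* Mask to OR into a page table entry to mark the page encrypted. */
static uint64_t sme_pte_mask(const struct sme_cpuid_info *info)
{
    assert(info != NULL);
    return info->supported ? (1ull << info->c_bit) : 0;
}

enum sme_state { SME_UNSUPPORTED, SME_SUPPORTED, SME_ENABLED, SME_ACTIVE };

/*
 * Classify the SME state from the CPUID eax value, the MSR_K8_SYSCFG
 * value and the mask the kernel is currently applying (0 if none).
 */
static enum sme_state sme_classify(uint32_t eax, uint64_t syscfg,
                                   uint64_t sme_mask)
{
    if (!(eax & 1u))
        return SME_UNSUPPORTED;
    if (!(syscfg & SYSCFG_MEM_ENCRYPT))
        return SME_SUPPORTED;
    if (sme_mask == 0)
        return SME_ENABLED;
    return SME_ACTIVE;
}
```

As a usage example, an ebx value of 0x16f decodes to encryption bit 47 with a physical address space reduction of 5 bits. Keeping the classification separate from the decoding makes it visible that "Active" additionally depends on what the kernel does, not only on hardware and BIOS state.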
Documentation/x86/protection-keys.txt (+3 −3)

@@ -34,7 +34,7 @@ with a key.  In this example WRPKRU is wrapped by a C function
 called pkey_set().
 
         int real_prot = PROT_READ|PROT_WRITE;
-        pkey = pkey_alloc(0, PKEY_DENY_WRITE);
+        pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
         ptr = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
         ret = pkey_mprotect(ptr, PAGE_SIZE, real_prot, pkey);
         ... application runs here
@@ -42,9 +42,9 @@ called pkey_set().
 Now, if the application needs to update the data at 'ptr', it can
 gain access, do the update, then remove its write access:
 
-        pkey_set(pkey, 0); // clear PKEY_DENY_WRITE
+        pkey_set(pkey, 0); // clear PKEY_DISABLE_WRITE
         *ptr = foo; // assign something
-        pkey_set(pkey, PKEY_DENY_WRITE); // set PKEY_DISABLE_WRITE again
+        pkey_set(pkey, PKEY_DISABLE_WRITE); // set PKEY_DISABLE_WRITE again
 
 Now when it frees the memory, it will also free the pkey since it
 is no longer in use:
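The document's pkey_set() wraps WRPKRU. What such a wrapper has to compute before writing the register can be sketched as a pure function: PKRU holds two bits per key, an access-disable bit followed by a write-disable bit, which is why the rights values are PKEY_DISABLE_ACCESS (0x1) and PKEY_DISABLE_WRITE (0x2). The helper name pkru_set_rights() is hypothetical, and the sketch deliberately stops short of executing RDPKRU or WRPKRU.

```c
#include <assert.h>
#include <stdint.h>

#define PKEY_DISABLE_ACCESS 0x1u
#define PKEY_DISABLE_WRITE  0x2u
#define PKRU_BITS_PER_PKEY  2
#define NR_PKEYS            16

/*
 * Return `pkru` with the two rights bits of `pkey` replaced by `rights`.
 * A wrapper like pkey_set() would read PKRU, call this, and write the
 * result back with WRPKRU; only the bit arithmetic is shown here.
 */
static uint32_t pkru_set_rights(uint32_t pkru, int pkey, uint32_t rights)
{
    unsigned shift;

    assert(pkey >= 0 && pkey < NR_PKEYS);
    assert((rights & ~(PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE)) == 0);

    shift = (unsigned)pkey * PKRU_BITS_PER_PKEY;
    pkru &= ~(0x3u << shift);   /* clear both bits for this key */
    pkru |= rights << shift;    /* install the requested restrictions */
    return pkru;
}
```

Passing rights of 0, as in the "clear PKEY_DISABLE_WRITE" step of the example, removes both restrictions for that key while leaving every other key's bits untouched.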