Linux kernel 6.7 includes the new commit 215199e3, "hardening: Provide Kconfig fragments for basic options".
Despite the warm fuzzies the commit message tries to impart, some of the specific options seem questionable to me; see below for details. However, I think they are all worth considering. I've only listed the recommendations that differ from what Slackware already has.
Code:
hardening: Provide Kconfig fragments for basic options
Inspired by Salvatore Mesoraca's earlier[1] efforts to provide some
in-tree guidance for kernel hardening Kconfig options, add a new fragment
named "hardening-basic.config" (along with some arch-specific fragments)
that enable a basic set of kernel hardening options that have the least
(or no) performance impact and remove a reasonable set of legacy APIs.
Using this fragment is as simple as running "make hardening.config".
More extreme fragments can be added[2] in the future to cover all the
recognized hardening options, and more per-architecture files can be
added too.
For now, document the fragments directly via comments. Perhaps .rst
documentation can be generated from them in the future (rather than the
other way around).
[1] https://lore.kernel.org/kernel-hardening/1536516257-30871-1-git-send-email-s.mesoraca16@gmail.com/
[2] https://github.com/KSPP/linux/issues/14
Cc: Salvatore Mesoraca <s.mesoraca16@gmail.com>
Cc: x86@kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kbuild@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
The following x86-specific config options are suggested but not currently enabled in Slackware:
vsyscall table for legacy applications: None CONFIG_LEGACY_VSYSCALL_NONE
Code:
There will be no vsyscall mapping at all. This will
eliminate any risk of ASLR bypass due to the vsyscall
fixed address mapping. Attempts to use the vsyscalls
will be reported to dmesg, so that either old or
malicious userspace programs can be identified.
Currently Slackware has CONFIG_LEGACY_VSYSCALL_XONLY instead of CONFIG_LEGACY_VSYSCALL_NONE.
Enable Intel DMA Remapping Devices by default CONFIG_INTEL_IOMMU_DEFAULT_ON
Code:
Selecting this option will enable a DMAR device at boot time if
one is found. If this option is not selected, DMAR support can
be enabled by passing intel_iommu=on to the kernel.
AMD IOMMU Version 2 driver CONFIG_AMD_IOMMU_V2
Code:
This option enables support for the AMD IOMMUv2 features of the IOMMU
hardware. Select this option if you want to use devices that support
the PCI PRI and PASID interface.
Currently Slackware has CONFIG_AMD_IOMMU_V2=m instead of =y.
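To check a config against the three x86 suggestions above, something like the following could work. It is only a minimal sketch: the WANTED table, the helper names and the SAMPLE excerpt are my own, and SAMPLE is a made-up excerpt written to mirror the state described in this post, not a copy of Slackware's actual config.

```python
import re

# Values the hardening fragment suggests for the x86 options discussed above.
WANTED = {
    "CONFIG_LEGACY_VSYSCALL_NONE": "y",
    "CONFIG_INTEL_IOMMU_DEFAULT_ON": "y",
    "CONFIG_AMD_IOMMU_V2": "y",
}

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map option name -> value; options marked 'is not set' map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def mismatches(options, wanted):
    """Return {name: (current, wanted)} for every option that differs.

    An option missing from the config entirely is treated as 'n'.
    """
    return {
        name: (options.get(name, "n"), want)
        for name, want in wanted.items()
        if options.get(name, "n") != want
    }


# Hypothetical excerpt, written to mirror the state described in this post.
SAMPLE = """\
CONFIG_LEGACY_VSYSCALL_XONLY=y
# CONFIG_LEGACY_VSYSCALL_NONE is not set
# CONFIG_INTEL_IOMMU_DEFAULT_ON is not set
CONFIG_AMD_IOMMU_V2=m
"""

if __name__ == "__main__":
    for name, (have, want) in mismatches(parse_config(SAMPLE), WANTED).items():
        print(f"{name}: have {have}, want {want}")
```

To run it against a real system, replace SAMPLE with the text of the config file for the running kernel (wherever your installation keeps it).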
The following generic config options are suggested but not currently enabled in Slackware:
Randomize slab caches for normal kmalloc CONFIG_RANDOM_KMALLOC_CACHES
Code:
A hardening feature that creates multiple copies of slab caches for
normal kmalloc allocation and makes kmalloc randomly pick one based
on code address, which makes the attackers more difficult to spray
vulnerable memory objects on the heap for the purpose of exploiting
memory vulnerabilities.
Currently the number of copies is set to 16, a reasonably large value
that effectively diverges the memory objects allocated for different
subsystems or modules into different caches, at the expense of a
limited degree of memory and CPU overhead that relates to hardware and
system workload.
Default state of kernel stack offset randomization CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT
Code:
Kernel stack offset randomization is controlled by kernel boot param
"randomize_kstack_offset=on/off", and this config chooses the default
boot state.
Note that CONFIG_RANDOMIZE_KSTACK_OFFSET is already set in Slackware, so the feature is built in; it is just not turned on by default at boot.
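Since the boot parameter overrides the config default, the state you should expect after boot depends on both. A rough sketch of that logic is below. The function name is mine, it only understands the "on" and "off" spellings from the help text, and it assumes that the last occurrence on the command line wins.

```python
def kstack_offset_randomized(cmdline, config_default):
    """Return the expected boot state of kernel stack offset randomization.

    cmdline        -- kernel command line as a single string
    config_default -- True if CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=y
    """
    state = config_default
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key != "randomize_kstack_offset" or not sep:
            continue
        # Only the two spellings from the help text are handled here;
        # any other value leaves the state unchanged (an assumption).
        if value == "on":
            state = True
        elif value == "off":
            state = False
    return state


if __name__ == "__main__":
    # Example command line; the root= value is just a placeholder.
    print(kstack_offset_randomized("root=/dev/sda1 ro", False))
    print(kstack_offset_randomized("root=/dev/sda1 randomize_kstack_offset=on", False))
```

On a running system the first argument could be the contents of /proc/cmdline. With the current Slackware config (default off), you would need randomize_kstack_offset=on on the command line to get the randomization.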
Undefined behaviour sanity checker CONFIG_UBSAN
Code:
This option enables the Undefined Behaviour sanity checker.
Compile-time instrumentation is used to detect various undefined
behaviours at runtime. For more details, see:
ubsan.rst
lib/ubsan.c
Abort on Sanitizer warnings (smaller kernel but less verbose) CONFIG_UBSAN_TRAP
Code:
Building kernels with Sanitizer features enabled tends to grow
the kernel size by around 5%, due to adding all the debugging
text on failure paths. To avoid this, Sanitizer instrumentation
can just issue a trap. This reduces the kernel size overhead but
turns all warnings (including potentially harmless conditions)
into full exceptions that abort the running kernel code
(regardless of context, locks held, etc), which may destabilize
the system. For some system builders this is an acceptable
trade-off.
Also note that selecting Y will cause your kernel to Oops
with an "illegal instruction" error with no further details
when a UBSAN violation occurs. (Except on arm64, which will
report which Sanitizer failed.) This may make it hard to
determine whether an Oops was caused by UBSAN or to figure
out the details of a UBSAN violation. It makes the kernel log
output less useful for bug reports.
This one seems like a big hammer. Save roughly 5% of kernel size at the expense of getting an Oops instead of a warning? On the other hand, if you hit undefined behavior, things are already bad...
Perform array index bounds checking CONFIG_UBSAN_BOUNDS
Code:
This option enables detection of directly indexed out of bounds
array accesses, where the array size is known at compile time.
Note that this does not protect array overflows via bad calls
to the {str,mem}*cpy() family of functions (that is addressed
by CONFIG_FORTIFY_SOURCE).
Enable instrumentation for the entire kernel CONFIG_UBSAN_SANITIZE_ALL
Code:
This option activates instrumentation for the entire kernel.
If you don't enable this option, you have to explicitly specify
UBSAN_SANITIZE := y for the files/directories you want to check for UB.
Enabling this option will get kernel image size increased
significantly.
Check integrity of linked list manipulation CONFIG_LIST_HARDENED
Code:
Minimal integrity checking in the linked-list manipulation routines
to catch memory corruptions that are not guaranteed to result in an
immediate access fault.
If unsure, say N.
Clearly kernel devs can't agree among themselves. The help text for this config suggests N, but the hardening guidelines say Y.
Enable heap memory zeroing on allocation by default CONFIG_INIT_ON_ALLOC_DEFAULT_ON
Code:
This has the effect of setting "init_on_alloc=1" on the kernel
command line. This can be disabled with "init_on_alloc=0".
When "init_on_alloc" is enabled, all page allocator and slab
allocator memory will be zeroed when allocated, eliminating
many kinds of "uninitialized heap memory" flaws, especially
heap content exposures. The performance impact varies by
workload, but most cases see <1% impact. Some synthetic
workloads have measured as high as 7%.
Clear Busmaster bit on PCI bridges during ExitBootServices() CONFIG_EFI_DISABLE_PCI_DMA
Code:
Disable the busmaster bit in the control register on all PCI bridges
while calling ExitBootServices() and passing control to the runtime
kernel. System firmware may configure the IOMMU to prevent malicious
PCI devices from being able to attack the OS via DMA. However, since
firmware can't guarantee that the OS is IOMMU-aware, it will tear
down IOMMU configuration when ExitBootServices() is called. This
leaves a window between where a hostile device could still cause
damage before Linux configures the IOMMU again.
If you say Y here, the EFI stub will clear the busmaster bit on all
PCI bridges before ExitBootServices() is called. This will prevent
any malicious PCI devices from being able to perform DMA until the
kernel reenables busmastering after configuring the IOMMU.
This option will cause failures with some poorly behaved hardware
and should not be enabled without testing. The kernel commandline
options "efi=disable_early_pci_dma" or "efi=no_disable_early_pci_dma"
may be used to override this option.
I'm surprised this is a suggested hardening default with that scary warning...
IOMMU default domain type: Translated - Strict CONFIG_IOMMU_DEFAULT_DMA_STRICT
Code:
Trusted devices use translation to restrict their access to only
DMA-mapped pages, with strict TLB invalidation on unmap. Equivalent
to passing "iommu.passthrough=0 iommu.strict=1" on the command line.
Untrusted devices always use this mode, with an additional layer of
bounce-buffering such that they cannot gain access to any unrelated
data within a mapped page.
Filter I/O access to /dev/mem CONFIG_IO_STRICT_DEVMEM
Code:
If this option is disabled, you allow userspace (root) access to all
io-memory regardless of whether a driver is actively using that
range. Accidental access to this is obviously disastrous, but
specific access can be used by people debugging kernel drivers.
If this option is switched on, the /dev/mem file only allows
userspace access to *idle* io-memory ranges (see /proc/iomem) This
may break traditional users of /dev/mem (dosemu, legacy X, etc...)
if the driver using a given range cannot be disabled.
If in doubt, say Y.
Note that CONFIG_STRICT_DEVMEM is already enabled in Slackware.
Automatically load TTY Line Disciplines CONFIG_LDISC_AUTOLOAD
Code:
Historically the kernel has always automatically loaded any
line discipline that is in a kernel module when a user asks
for it to be loaded with the TIOCSETD ioctl, or through other
means. This is not always the best thing to do on systems
where you know you will not be using some of the more
"ancient" line disciplines, so prevent the kernel from doing
this unless the request is coming from a process with the
CAP_SYS_MODULE permissions.
Say 'Y' here if you trust your userspace users to do the right
thing, or if you have only provided the line disciplines that
you know you will be using, or if you wish to continue to use
the traditional method of on-demand loading of these modules
by any user.
This functionality can be changed at runtime with the
dev.tty.ldisc_autoload sysctl, this configuration option will
only set the default value of this functionality.
Hardening options say to disable this option; it is currently set to Y in Slackware.
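Because the config option only sets the default for the dev.tty.ldisc_autoload sysctl, you can look at the live value without rebuilding anything. Here is a small sketch that reads it from /proc/sys. The helper names and the root argument are my own (the argument is only there so the function can be pointed at another directory for testing).

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Translate a dotted sysctl name into its path under /proc/sys."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the sysctl value as a stripped string, or None if unreadable."""
    try:
        return sysctl_path(name, root).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    value = read_sysctl("dev.tty.ldisc_autoload")
    if value is None:
        print("dev.tty.ldisc_autoload is not available on this system")
    else:
        print(f"dev.tty.ldisc_autoload = {value}")
```

As I understand it, a value of 1 means any user can trigger autoloading of a line discipline module, and 0 restricts it to processes with CAP_SYS_MODULE, so a stock Slackware kernel should report 1.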
/proc/kcore support CONFIG_PROC_KCORE
Code:
Provides a virtual ELF core file of the live kernel. This can
be read with gdb and other ELF tools. No modifications can be
made using this mechanism.
Hardening options say to disable this ("Dangerous; exposes kernel text image layout"); it is currently set to Y in Slackware.
Legacy (BSD) PTY support CONFIG_LEGACY_PTYS
Code:
A pseudo terminal (PTY) is a software device consisting of two
halves: a master and a slave. The slave device behaves identical to
a physical terminal; the master device is used by a process to
read data from and write data to the slave, thereby emulating a
terminal. Typical programs for the master side are telnet servers
and xterms.
Linux has traditionally used the BSD-like names /dev/ptyxx
for masters and /dev/ttyxx for slaves of pseudo
terminals. This scheme has a number of problems, including
security. This option enables these legacy devices; on most
systems, it is safe to say N.
Hardening options say to disable this ("Attack surface reduction: Use the modern PTY interface (devpts) only."); it is currently set to Y in Slackware.
Enable legacy drivers (DANGEROUS) CONFIG_DRM_LEGACY
Code:
Enable legacy DRI1 drivers. Those drivers expose unsafe and dangerous
APIs to user-space, which can be used to circumvent access
restrictions and other security measures. For backwards compatibility
those drivers are still available, but their use is highly
inadvisable and might harm your system.
You are recommended to use the safe modeset-only drivers instead, and
perform 3D emulation in user-space.
Unless you have strong reasons to go rogue, say "N".
Hardening options say to disable this ("Attack surface reduction: Use only modesetting video drivers."); it is currently set to Y in Slackware.
I've been running with this option disabled for some time (over a year) with no problem, using nVidia drivers.
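If you want to try just the four "disable" recommendations above (LDISC_AUTOLOAD, PROC_KCORE, LEGACY_PTYS, DRM_LEGACY) on top of an existing config, one way is to write them out as a small fragment of your own. The sketch below only renders the text; the DISABLE table and the function name are mine, and how you merge the result into your .config is up to you.

```python
# The four options the hardening fragment turns off, as discussed above.
DISABLE = {
    "CONFIG_LDISC_AUTOLOAD": "n",
    "CONFIG_PROC_KCORE": "n",
    "CONFIG_LEGACY_PTYS": "n",
    "CONFIG_DRM_LEGACY": "n",
}


def render_fragment(settings):
    """Render {option: value} as Kconfig fragment text.

    A value of 'n' is written in the '# CONFIG_FOO is not set' form;
    anything else is written as CONFIG_FOO=value.
    """
    lines = []
    for name, value in settings.items():
        if value == "n":
            lines.append(f"# {name} is not set")
        else:
            lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_fragment(DISABLE), end="")
```

The output can be saved to a file and combined with an existing .config, for example with the kernel tree's scripts/kconfig/merge_config.sh. Check the resulting .config afterwards, since Kconfig dependencies can override what a fragment asks for.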