Many security bugs can be fixed without a performance penalty, but according to reports, Intel processors have a hardware bug (details are still under embargo) that seems to affect all operating systems, including Windows, Linux, and Mac OS, and the fix may lead to significant performance hits for some tasks.
We know a bit more thanks to the Kernel Page Table Isolation (KPTI) patch for Linux, which enables the fix/workaround on processors flagged with X86_BUG_CPU_INSECURE. The fix used to be called KAISER, and an LWN article about “hiding the kernel from user space” explains the issue:
On contemporary 64-bit systems, the shared address space does not constrain the amount of virtual memory that can be addressed as it used to, but there is another problem that is related to security. An important technique for hardening the system is kernel address-space layout randomization (KASLR), which randomizes the placement of the kernel in the virtual address space at boot time. By denying an attacker the knowledge of where the kernel lives in memory, KASLR makes many types of attack considerably more difficult. As long as the actual location of the kernel does not leak to user space, attackers will be left groping in the dark.
The problem is that this information leaks in many ways….
More recently, a concerted effort has been made to close off the direct leaks from the kernel, but none of that will be of much benefit if the hardware itself reveals the kernel’s location. And that would appear to be exactly what is happening.
This paper from Daniel Gruss et al. [PDF] cites a number of hardware-based attacks on KASLR. They use techniques like exploiting timing differences in fault handling, observing the behavior of prefetch instructions, or forcing faults using the Intel TSX (transactional memory) instructions. There are rumors circulating that other such channels exist but have not yet been disclosed…
and the fix:
Fixing information leaks in the hardware is difficult and, in any case, deployed systems are likely to remain vulnerable. But there is a viable defense against these information leaks: making the kernel’s page tables entirely inaccessible to user space. In other words, it would seem that the practice of mapping the kernel into user space needs to end in the interest of hardening the system.
The paper linked above provided an implementation of separated address spaces for the x86-64 kernel; the authors called it “KAISER”, which evidently stands for “kernel address isolation to have side-channels efficiently removed”. This implementation was not suitable for inclusion into the mainline, but it was picked up and heavily modified by Dave Hansen.
So in short, Intel processors leak the kernel’s location, and the hole now has to be closed at the OS level since current hardware cannot be fixed, not even with a microcode update. In theory, having a fix in the operating system should be good enough, but there’s a caveat: a performance hit!
Most workloads that we have run show single-digit regressions. 5% is a good round number for what is typical. The worst we have seen is a roughly 30% regression on a loopback networking test that did a ton of syscalls and context switches.
and from The Register article linked above:
PostgreSQL SELECT 1 with the KPTI workaround for Intel CPU vulnerability https://t.co/N9gSvML2Fo
Best case: 17% slowdown
Worst case: 23%
— The Register (@TheRegister) January 2, 2018
So the PostgreSQL SELECT command is about 20% slower with the KPTI workaround, and I/O in general seems to be negatively impacted according to Phoronix benchmarks, especially with fast storage. Gaming performance, Linux kernel compilation, and H.264 encoding, among others, do not appear to be affected.
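Workloads that enter the kernel often are the ones that pay for the workaround, so a quick way to get a feel for it is to time a loop of cheap system calls against a loop that stays in user space. The following is only a rough sketch, not a proper benchmark: the function names are mine, and it assumes that os.getppid() issues a real system call on every invocation (true on Linux as far as I know). Run it on the same machine with and without the workaround and compare the syscall figure.

```python
import os
import time


def time_loop(fn, iterations):
    """Return the average wall-clock time per call of fn, in nanoseconds."""
    start = time.perf_counter_ns()
    for _ in range(iterations):
        fn()
    return (time.perf_counter_ns() - start) / iterations


def user_space_only():
    # Pure user-space work: this never crosses into the kernel,
    # so page table isolation should not change its cost.
    return 1 + 1


def cheap_syscall():
    # Assumption: os.getppid() results in one kernel entry/exit per call.
    # Each such transition is where the KPTI overhead would show up.
    return os.getppid()


if __name__ == "__main__":
    n = 1_000_000
    print(f"user space : {time_loop(user_space_only, n):8.1f} ns/call")
    print(f"system call: {time_loop(cheap_syscall, n):8.1f} ns/call")
```

The absolute numbers include Python interpreter overhead, so only the difference between two runs of the system call loop (patched vs. unpatched kernel) is meaningful.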
However, if you own an AMD system, you can do a victory dance since those processors are not affected, so the “fix” is disabled:
AMD processors are not subject to the types of attacks that the kernel page table isolation feature protects against. The AMD microarchitecture does not allow memory references, including speculative references, that access higher privileged data when running in a lesser privileged mode when that access would result in a page fault.
Disable page table isolation by default on AMD processors by not setting the X86_BUG_CPU_INSECURE feature, which controls whether X86_FEATURE_PTI is set.
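To see what a patched kernel decided for a given machine, one option is to look at /proc/cpuinfo. The sketch below is based on an assumption: that X86_BUG_CPU_INSECURE and X86_FEATURE_PTI show up as the lowercase words cpu_insecure (on the "bugs" line) and pti (on the "flags" line), following the usual naming convention. The exact strings may differ between kernel versions, and the helper names are hypothetical.

```python
def cpuinfo_field(cpuinfo_text, key):
    """Return the set of words listed under `key` for the first CPU entry."""
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == key:
            return set(value.split())
    return set()


def pti_status(cpuinfo_text):
    """Report whether the CPU is flagged as affected and whether PTI is on."""
    flags = cpuinfo_field(cpuinfo_text, "flags")
    bugs = cpuinfo_field(cpuinfo_text, "bugs")
    return {
        # Assumed string for X86_BUG_CPU_INSECURE; may vary by kernel version.
        "marked_insecure": "cpu_insecure" in bugs,
        # Assumed string for X86_FEATURE_PTI.
        "pti_enabled": "pti" in flags,
    }


if __name__ == "__main__":
    # Linux only: /proc/cpuinfo does not exist on other systems.
    with open("/proc/cpuinfo") as f:
        print(pti_status(f.read()))
```

If the assumption about the strings holds, an AMD system running a kernel with the patch quoted above should report False for both entries.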
The Intel bug will be fully revealed later this month after all major operating systems have been patched, and I’d assume Intel will fix the hardware in its processors too, so we’ll live in interesting times when people may want to check the CPU revision / stepping number before purchasing an Intel system.