Hacker News | hansendc's comments

"On x86-64, there are two CPU settings which control the kernel’s ability to access memory."

There are a couple more than two, even in 2021.

Memory Protection Keys come to mind, as do the NPT/EPT tables when virtualization is in play. SEV and SGX also have their own ways of preventing the kernel from writing to memory. The CPU also has range registers that protect certain special physical address ranges, like the TDX module's range. You can't write there either.

That's all that comes to mind at the moment. It's definitely a fun question!


A thought: does MPK actually control the kernel's ability to access memory? On Intel, I think if you try to read that memory, a page fault won't be thrown. With PKS, though, kernel reads will cause a page fault.

So can the kernel (ring 0) freely read/write memory encrypted with MPK? I think so, yes. Good luck with whatever happens next though, lol.


There are two versions of MPK. One is only applicable to userspace pages. The other is newer and can be applied to kernel space pages; last time I checked, this was only available on newer Xeon processors.

By the way, MPK memory is not encrypted. The key is just an identifier for the requestor. If the requestor's key doesn’t match the identifier assigned to the memory page, then an exception is raised.

Funnily enough, MPK isn’t new at all. It’s almost a reintroduction of a feature from Itanium.


Aw, so I was half right. I knew the newer one, which is PKS, will throw a page fault. Sorry, it’s been a while since I’ve done this stuff and we were mostly working with tz.

Here's an implementation that one of the OpenStreetMap applications uses:

https://josm.openstreetmap.de/browser/josm/trunk/src/org/ope...

It used to use a linear list of points, but that was VERY slow to draw, so I hacked this into the code base a few years ago.


The AVX disable is only when you use "gather_data_sampling=force". The default is to leave AVX alone and proclaim the system to be vulnerable.

From https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin... :

> Specifying "gather_data_sampling=force" will use the microcode mitigation when available or disable AVX on affected systems where the microcode hasn't been updated to include the mitigation.

Disclaimer: I work on Linux at Intel. I probably wrote or tweaked the documentation and changelogs that are confusing folks.
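
To see which of these states a given machine is in, one option is to read the kernel's vulnerability report from sysfs. This is only a minimal sketch: the path follows the usual /sys/devices/system/cpu/vulnerabilities/ layout, while `read_gds_status` and `classify` are hypothetical helper names, and the classifier matches only on the leading word of the status line rather than assuming exact strings.

```python
from pathlib import Path

# Usual sysfs location for the Gather Data Sampling report on Linux.
GDS_STATUS = Path("/sys/devices/system/cpu/vulnerabilities/gather_data_sampling")


def read_gds_status(path=GDS_STATUS):
    """Return the kernel's GDS status line, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        # Older kernels, non-Linux systems, or restricted sandboxes.
        return None


def classify(status):
    """Coarse classification of a status line (hypothetical helper)."""
    if status is None:
        return "unknown"
    if status.startswith("Mitigation"):
        return "mitigated"
    if status.startswith("Vulnerable"):
        return "vulnerable"
    if status.startswith("Not affected"):
        return "not affected"
    return "unknown"


if __name__ == "__main__":
    status = read_gds_status()
    print(status, "->", classify(status))
```

With the default setting, an affected system without updated microcode would be expected to land in the "vulnerable" bucket; with "gather_data_sampling=force" it should report a mitigation instead.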


Great, thanks for the clarification


Uh... Did I miss the patches that add a pre-zeroed page pool to Linux? Wouldn't be the first time I missed something like that getting added, but 6.3-rc5 definitely zeroes _some_ pages at allocation time, and I don't see any indication of it consulting a prezeroed page pool: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...


> there's no way[0][1] to express R^X, PROT_EXEC without PROT_READ is not possible.

I'll also add a [2]:

[2] There's no way to do it in the page tables. But, if you have Protection Keys for Userspace (PKU), you can get it ... kinda. You can have a PROT_READ|PROT_EXEC mapping, assign it a pkey, then set PKEY_DISABLE_ACCESS in the PKRU register for that key. In fact, if you have a PKU CPU and you do an unadorned mmap(PROT_EXEC), the kernel will allocate you a pkey and do this under the covers FOR you. Anyone who can execute WRPKRU can easily undo this protection, but it's better than nothing.
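
A toy model of why this works, and why WRPKRU undoes it. This is a sketch, not kernel or hardware code: the two constants use the same bit values as Linux's PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE, but `pkru_set` and `allowed` are hypothetical names, and the model only captures that PKRU holds two bits per key and that instruction fetches are not checked against it.

```python
# Same bit values as Linux's PKEY_DISABLE_* constants; the rest is a toy model.
PKEY_DISABLE_ACCESS = 0x1  # blocks data reads and writes
PKEY_DISABLE_WRITE = 0x2   # blocks data writes only


def pkru_set(pkru, pkey, rights):
    """Return a PKRU value with the two bits for `pkey` replaced by `rights`."""
    shift = 2 * pkey                      # two bits per key
    pkru &= ~(0x3 << shift)               # clear the old bits for this key
    return pkru | ((rights & 0x3) << shift)


def allowed(pkru, pkey, access, prot=("read", "exec")):
    """Model an access to a page tagged with `pkey`.

    `prot` stands in for the page-table permissions (PROT_READ|PROT_EXEC here).
    """
    if access not in prot:
        return False                      # the page tables already say no
    if access == "exec":
        return True                       # instruction fetches ignore PKRU
    bits = (pkru >> (2 * pkey)) & 0x3
    if bits & PKEY_DISABLE_ACCESS:
        return False
    if access == "write" and bits & PKEY_DISABLE_WRITE:
        return False
    return True


if __name__ == "__main__":
    pkru = pkru_set(0, pkey=1, rights=PKEY_DISABLE_ACCESS)
    print("read:", allowed(pkru, 1, "read"), "exec:", allowed(pkru, 1, "exec"))
    pkru = pkru_set(pkru, pkey=1, rights=0)   # what a stray WRPKRU could do
    print("read after reset:", allowed(pkru, 1, "read"))
```

The last two lines are the "kinda": the protection lives in a register that unprivileged code can rewrite, not in the page tables.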


kinda indeed.

As far as I can tell, Intel PKU was only on server CPUs (Xeons) until at least the 11th Gen (only later models?), and on AMD it arrived with Zen 3.

OpenBSD doesn't support protection keys, in any case.


There honestly isn't that much "tech" to speak of here. We were literally talking about "immutable" mappings last week in Linux land: https://lore.kernel.org/all/b4f0dca5-1d15-67f7-4600-9a0a91e9...

That said, this would be great to see in OpenBSD (or any other OS).


There's at least one extremely well documented example of a killer whale that played extensively with boats: https://en.wikipedia.org/wiki/Luna_(orca)

Granted, this was a lonely little fellow. But, he knew perfectly well what he was doing and repeatedly approached boats, despite the noise. He died after colliding with a tugboat prop.


They actually leave lots of evidence. A transient eating a seal is messy business and there are lots of seal bits and chunks left over. Eva Saulitis describes the aftermath in several cases in her book (https://www.penguinrandomhouse.com/books/219235/into-great-s...). IIRC, fishing the evidence out of the water is one of the primary ways they study killer whale diets.


From: https://arstechnica.com/gadgets/2020/10/in-a-first-researche...

"In a statement, Intel officials wrote: ... we do not rely on obfuscation of information behind red unlock as a security measure."

(BTW, I work on Linux at Intel, I'm not posting this in any official capacity)


> I work on Linux at Intel, I'm not posting this in any official capacity

Oh, great! Isn't there a way Intel could provide keys so we could get rid of IME, even if it means we won't be able to play DRM'ed content?


IIRC IME also handles a lot of core functionality like power regulation. Contrary to what many in this thread probably think, it does things you probably don't want removed.


The contention, from the user's standpoint, however, is the network stack, the potential to phone home, and the unrestricted access to the global machine state, combined with the fact that it is not documented or disclosed.

It's one thing to have that and be up front and open on it. Get secretive, and you're creating a massive source of unknown unknowns for everyone involved.

And like it or not, if you won't/can't be transparent about it, then one of the following holds:

  A) It'd take too long to document, which suggests there may be room for simplification.
  B) You're doing something that, if it saw the light of day, would cause outrage, likely because you shouldn't be doing it.
  C) You're holding back the state of the art for the sake of securing a revenue stream.

None of these inspires an excess of confidence/trust.


Interesting. So IME handles dynamic voltage and frequency scaling (DVFS) of cores then?


Thermal management, power gating, scheduling... It does a lot.


Doubt the NSA would allow it. The best option is RISC-V if you don't want Intel's ring -3 backdoor.


Actually, its primary design goal is to make address sanitizers faster. Right now, all the code that touches a sanitizer-tagged address must be recompiled to understand how to place and remove the tag. These address-bit-ignore approaches can (ideally) allow you to just modify the memory allocator to hand out tagged addresses. Those addresses can then be passed around to code that doesn't even know it's handling a tagged address. It doesn't need to be modified. You don't need to recompile the world. Even when the sanitizer is on, you also don't need to be constantly stripping tags out of pointers before dereferencing them.
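
A small model of that idea, as a sketch only. It assumes an 8-bit tag in the top byte of a 64-bit pointer purely for illustration (real features differ in which bits they ignore), and `TaggedHeap`, `hw_translate`, and `legacy_code` are hypothetical names standing in for the allocator, the CPU's address-bit-ignore step, and code that was never recompiled.

```python
TAG_SHIFT = 56                 # assumption: tag lives in the top byte
TAG_MASK = 0xFF << TAG_SHIFT
PTR_MASK = (1 << 64) - 1       # keep values within 64 bits


def tag_pointer(addr, tag):
    """What a tag-aware allocator does before handing out `addr`."""
    return (addr & ~TAG_MASK & PTR_MASK) | ((tag & 0xFF) << TAG_SHIFT)


def get_tag(ptr):
    """What the sanitizer inspects when it wants to check a pointer."""
    return (ptr >> TAG_SHIFT) & 0xFF


def hw_translate(ptr):
    """Model of the CPU ignoring the tag bits on a dereference."""
    return ptr & ~TAG_MASK & PTR_MASK


class TaggedHeap:
    """Toy memory whose loads drop the tag, as the hardware would."""

    def __init__(self):
        self.mem = {}
        self.next_addr = 0x1000
        self.next_tag = 1

    def malloc(self, value):
        addr = self.next_addr
        self.next_addr += 16
        self.mem[addr] = value
        ptr = tag_pointer(addr, self.next_tag)
        self.next_tag = self.next_tag % 255 + 1   # cycle through tags 1..255
        return ptr

    def load(self, ptr):
        return self.mem[hw_translate(ptr)]


def legacy_code(heap, ptr):
    """Stands in for unmodified code that knows nothing about tags."""
    return heap.load(ptr)


if __name__ == "__main__":
    heap = TaggedHeap()
    p = heap.malloc("hello")
    print(hex(p), get_tag(p), legacy_code(heap, p))
```

Only `malloc` knows about tags here; `legacy_code` dereferences the tagged pointer unchanged, which is the "don't recompile the world" property described above.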

