Why RISC-V doesn't yet support KVM

nullspace · on May 20, 2021

The last email from Greg K-H is a gem for many reasons.[1]

"""

And to be a bit more clear about this, having other subsystem maintainers drop their unwanted code on this subsystem, _without_ even asking me first is just not very nice. All of a sudden I am now responsible for this stuff, without me even being asked about it. Should I start throwing random drivers into the kvm subsystem for them to maintain because I don't want to? :)

If there's really no other way to do this, than to put it in staging, let's talk about it. But saying "this must go here" is not a conversation...

"""

I deeply cherish the fact this stuff is out in public. It helps more junior software devs learn the softer skills.

[1] https://lwn.net/ml/linux-kernel/YKTsyyVYsHVMQC+G@kroah.com/

jerrysievert · on May 21, 2021

it was always fun when greg would show up when I was working at the OSDL (he worked across the parking lot from us). always in black pants, a white t-shirt with a pocket (usually holding a pack of cigarettes), and with his hair tied back.

always so humble, and dropping a ton of knowledge and advice. he really is the nice guy that his image projects.

gregkh · on May 21, 2021

While I did drop by a lot, you aren't describing me, but rather another Linux kernel developer who worked at IBM in our group at the time who is by far more brilliant and smarter than I.

I've never smoked, and at that point in time, I did not have long hair anymore (cut it off when I turned 30 which was before OSDL was ever started.)

jerrysievert · on May 21, 2021

then I apologize for misdescribing you greg :) but the rest of the comment stands. sometimes things blur a little after 20 years.

sohkamyung · on May 21, 2021

> with his hair tied back

I find this hard to reconcile with GregKH's current public image. :-)

jerrysievert · on May 21, 2021

wow, I had not seen a picture of him since 2001, so 20 years? people change, and we were all young once!

jasonwatkinspdx · on May 21, 2021

We're only distantly acquainted but I just wanted to say I'm happy to see you around :)

jerrysievert · on May 21, 2021

why thank you! I'd say to look for me at bailey's on occasion, but alas those days are no more.

perhaps when level 3 opens up, I'll make an appearance every once in a while!

fartcannon · on May 20, 2021

Greg is great, but it makes me miss Linus.

makefile10 · on May 21, 2021

> I deeply cherish the fact this stuff is out in public. It helps more junior software devs learn the softer skills.

Agreed, it makes me very happy to see some sanity in open source. In other projects like CPython, expect a little Twitter witch hunt by unproductive project politicians if you object to excessive changes.

Because of CPython's cultural revolution, the hierarchy is:

  cabal
  newcomers
  productive long term developers

mikedilger · on May 20, 2021

Ratifying the H-extension is a serious commitment not easily patched later. Putting something in the kernel is a commitment that's much easier to patch later. The linux RISC-V development policy that the extension must be frozen first is the part that seems excessive and unnecessary to me. Put it in now, and modify it later if the H-extension changes.

Putting pressure on RISC-V to freeze the H-extension early would be IMHO a bad outcome if it's not frozen for a good reason. If it's not frozen for no known good reason and it's just vendors gaming the system then of course freezing it is a good idea, but that's quite an independent thing from linux.

snvzz · on May 21, 2021

Other ISAs do also go through revisions, often reinventing interfaces several times until they get it "right". Specific code needs to be written to support every revision. What matters is that hardware is out, and there should be a way to support this hardware.

Putting pressure on RISC-V would be Linux abusing its market position to interfere on RISC-V processes. This is particularly wrong to do, as RISC-V processes being open (unlike most other ISAs) shouldn't be seen as an invitation to discriminate against them by not allowing them the possibility of making incompatible changes.

FullyFunctional · on May 21, 2021

Reality is more nuanced than that. Some private extensions or unratified extensions should really be frowned upon because they risk fragmenting the RISC-V ecosystem. Examples include the completely incompatible version 0.7 of the vector spec and T-head's assorted private extensions.

The H extension is a little less clear cut as it doesn't affect user code (IIUC) and as such the damage is limited to OS kernelS (there's more than Linux out there).

However it's a slippery slope into fragmentation hell and I actually think the current policy is correct and the blame lies with the H committee.

EDIT: language

brucehoult · on May 21, 2021

RVV 0.7.1 is not completely incompatible with the current 1.0 draft. The overall structure of code is the same, most instructions and opcodes have not changed at all, especially unmasked operations.

I have demonstrated that at least memcpy() can be be binary compatible between 0.7.1 and the current draft. I will check others soon (once I get a Nezha board, hopefully before the end of the month). Many other things could be made compatible -- most simple code using 16 or 32 bit int or 32 bit FP -- by reverting a single small commit to the spec in June 2020 which tidied up the bit field format of the VTYPE CSR but did not introduce any functionality.

There are things that are incompatible for very good reasons, but there is much that is either compatible now or that is incompatible for inessential reasons.

Reverting the format change in the SEW and LMUL fields, and making 0 for the mask agnostic and tail agnostic bits match the (fixed) settings in 0.7.1 would go a long way towards increasing compatibility, at virtually zero cost.

pantalaimon · on May 21, 2021

The Kendryte K210 with it's pre-standard MMU is another example. Because of that it can only run nommu-Linux on mainline.

Taniwha · on May 21, 2021

But in essence putting stuff like this in the kernel is putting pressure on freezing the extension.

However it's a bit of a chicken and egg situation, you need serious testing before you can ratify and freeze but if you don't encourage people to code against the (moving) spec you wont get the testing you need.

As a CPU designer it's particularly tough, wanting those changes frozen asap so you can implement them, but not wanting a broken one

Ericson2314 · on May 20, 2021

The staging "workaround" actually strikes me as the exact correct approach. Linux merging some unstable feature to better prove the design of an unstable ISA extension seems like virtuous feedback loop of testing of designs before they are ratified.

That staging is for sketchy drivers but this is architecture-specific code strikes me as especially silly. Never mind CPU vs peripheral: both are hardware with interfaces outside Linux's control. The important distinction is internal vs external interfaces, and staging makes sense for me for dealing with all sketchy external interfaces alike.

marcan_42 · on May 21, 2021

That's not how it works. The Linux kernel is chock full of drivers for sketchy hardware. The staging tree is for sketchy code. If the feature is in shipping hardware and useful, Linux policy is to support it.

a1369209993 · on May 21, 2021

> The staging tree is for sketchy code.

If the code is likely to fail in weird ways for reasons that can't reliably be predicted or fixed ahead of time, I think it's fair to call it sketchy.

If the code uses instructions that might be defined out from under it, it's likely to fail in weird ways for reasons that can't reliably be predicted or fixed ahead of time.

No claim on whether it's a good idea (the "discourage bad hardware" angle would suggest not), but "staging is for sketchy drivers" at least seems consistent with including it.

kelnos · on May 21, 2021

> If the code uses instructions that might be defined out from under it

They won't, though, at least not in a way that matters. Linux exists to make hardware run. There is hardware built against the current draft version of the H extension. KVM should be made to work on that hardware. If the final ratified H extension ends up having incompatible changes to what's there now, then that's just another revision that the kernel can support, just like Linux can support different features in different x86 generations.

If the hardware manufacturers were being reckless and shipping an earnestly-worked-on pre-standard before it was ready, I would perhaps agree that they shouldn't get mainline support.

But it seems the H extension has been sitting in limbo for a year now, and it seems the reason is that a company that is having trouble getting hardware ready for shipping is trying to stall the process so other companies don't have a leg up. That's not a good reason to punish these other (more competent?) companies by forcing them to carry out-of-tree patches. And given the situation, it's likely that the eventually-ratified version will be the same as (or at least compatible with) the current draft, so the risk of taking the code now and having to maintain compatibility hacks later is pretty low.

colonwqbang · on May 21, 2021

The hardware that is already designed and built and sold will never change. It will always work the same. It needs to be supported as it exists today.

There will be always new HW designed which works differently to the old one. This goes for all types of HW. They never stop.

Ericson2314 · on May 21, 2021

That's a fair distinction.

I still like Linux "helping out" RISC-V standardization by separating the unratified extensions --- traditionally the hardware is fait accompli and Linux is the agile software playing catch up but with something like RISC-V the roles are somewhat reversed --- but I fully grant that's separate to the purpose of staging as you say.

marcan_42 · on May 21, 2021

This isn't just a RISC-V thing. There is significant bureaucracy in Linux to push back on vendors doing even dumber stuff with hardware in the future; that was a lot of the discussion around my patches to support the M1. We have to support the hardware, but do so in a way that does not encourage vendors to push things further (we know Apple doesn't care about any of this, but e.g. nobody wants Qualcomm or Broadcom, who do build platforms that intend to run Linux as a first-class citizen, to get any silly ideas).

But this is all stuff you work out by looking at the finer points of the implementation and doing it in a way that does not encourage further divergence; certainly throwing stuff in staging isn't how you do it.

Ultimately, Linux's goal is to support the hardware; that support may be done in such a way to discourage future "bad" hardware, but it still has to be there as proper support, if the features have merit. For example, we have a patch series in review right now to support KVM on the M1, because (surprise surprise) Apple did some nonstandard stuff there too. It's definitely not going in staging though :)

On the other hand, it's less likely that mainline Linux will quickly gain support for, say, things like Apple's proprietary security features (their pointer auth extensions, paging protection stuff, etc) unless there are similar mechanisms implemented in other vendors' CPUs, because those things provide only marginal benefit for end users of specific systems, and Linux has bigger security fish to fry first.

pabs3 · on May 21, 2021

Linux has other code for dealing with quirks in hardware, so why not RISC-V and KVM? Just implement the draft now and later add a way to use the draft code for hardware implementing the draft spec and final code for hardware implementing the final spec.

cratermoon · on May 21, 2021

The million dollar unanswered question here is: who is benefitting from the deadlock over KVM specifications becoming Frozen?

tytso · on May 21, 2021

The usual answer when this things happens in standards committees is those companies who aren't ready to ship product yet. If they can stall the official release of the standard, then it gives them time to catch up. Of course, other companies can release hardware on the not-quite-released standard. There were v.90 modems that were pre-standardization for at least 2-3 years. All this does is harm the consumer at that point, since it impacts interoperability; making companies wait to ship products that they have working because of one laggard is just not fair.

"There comes a time in every project when you must shoot the engineers and begin production.”

makomk · on May 21, 2021

Hmmm. I wonder if this is what happened to Bluetooth LE Audio and if so who the laggard is...

CGamesPlay · on May 21, 2021

I don’t understand what the point of the staging team is. It seems like it will always have these conflicts with all of the functional teams, and for what gain? Specifically why isn’t staging a subdirectory of each functional team?

jchw · on May 21, 2021

I hope they come to what seems to be the most reasonable conclusion (that the RISC-V patch acceptance policy should be adjusted.) The way I see it, Linux runs on and supports all kinds of stuff that is not well-specced and subject to change, and sometimes reverse-engineered. While the patch policy has good intentions, I think when silicon is shipping and emulators are emulating things it would make sense for Linux to support it. So let’s hope it works out.

The staging approach would sound reasonable on the surface, until you realize the Linux staging subsystem is more specific[1] than the name implies, and not just a dumping ground for stuff that can’t be merged for other reasons.

[1]: https://lwn.net/Articles/324279/

ineedasername · on May 21, 2021

Am I wrong to take away from this that there is a critically useful piece of software going unsupported because of bureaucratic policies and turf wars?

tux3 · on May 21, 2021

I wouldn't say unsupported, but it's out of tree, which is a small step up.

lostmsu · on May 24, 2021

So why is KVM feature for RISC-V not frozen yet?

webaholic · on May 21, 2021

"Company X" in the comments is SiFive, I am guessing?