
I’m not convinced that’ll bring about any significant change. Any power savings from switching from x86 to a RISC ISA come from simplifying the instruction decoder, and those seem to be about 15-20% if we compare the Ampere Altra to a comparable AMD chip. That’s not an order of magnitude.

On the other hand, on the order of 80% of a chip’s power is spent on out-of-order (OOO) execution. If you want an order-of-magnitude improvement in power efficiency, you need to dump superscalar/OOO in favor of smart compilers and VLIW. Cheap DSPs have been doing it for years, but compilers aren’t good enough yet for general-purpose processing.



Agree that OoO is the big cost. But we can also mitigate that without VLIW: SIMD/vector reduces the instruction count by ~5x, and energy by a similar factor.

And a portable API such as Highway also helps us move the same code from x86 to Arm or RISC-V with just a recompile :D
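
A rough way to see how a ~5x reduction in instruction count could arise is an Amdahl-style estimate. This is only a sketch: the vectorizable fraction (0.9) and lane count (8) below are hypothetical values chosen for illustration, not measurements, and the helper name is made up.

```python
def instruction_reduction(vector_fraction: float, lanes: int) -> float:
    """Factor by which the dynamic instruction count shrinks when
    `vector_fraction` of the scalar instructions are replaced by vector
    instructions that each process `lanes` elements.

    Assumes one vector instruction replaces exactly `lanes` scalar ones
    and ignores any extra shuffle/remainder instructions.
    """
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must be in [0, 1]")
    if lanes < 1:
        raise ValueError("lanes must be >= 1")
    remaining = (1.0 - vector_fraction) + vector_fraction / lanes
    return 1.0 / remaining


# Hypothetical inputs: 90% of instructions vectorize, 8 lanes per vector.
reduction = instruction_reduction(vector_fraction=0.9, lanes=8)
print(f"instruction count reduced by about {reduction:.1f}x")
```

The non-vectorized remainder caps the gain, so wider vectors alone give diminishing returns unless more of the code vectorizes.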


It's not clear that even a super smart compiler can do this. The best schedule depends on the latency of instructions. This is a problem because we can't know statically whether a particular memory load is in L1/L2/L3/DRAM/etc., as this can vary for different executions of the same load instruction.


According to [1], 88% of the speedup given by OOO processors is due to speculation, and reordering in the case of cache misses accounts for around 10% of the speedup. If OOO in general gives around a 50% speedup compared to in-order designs, reordering in the face of cache misses gives only around a 5% speedup. If you use a good static schedule with speculation, you'll get the bulk of the speedup. The rest can be recovered by increasing the clock rate by 5%, since you'll have gotten rid of so much silicon.
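
The arithmetic above can be checked with a short sketch. It assumes the contributions are additive shares of the total gain and that performance scales linearly with clock rate; both are simplifications, and the 50% figure is the comment's own rough estimate rather than a number from [1].

```python
# Inputs taken from the comment above.
ooo_gain = 0.50  # OOO speedup over in-order (rough estimate)
shares = {
    "speculation": 0.88,            # share of the OOO speedup, per [1]
    "cache_miss_reordering": 0.10,  # share of the OOO speedup, per [1]
}

# Split the total gain into per-mechanism gains (additive assumption).
parts = {name: ooo_gain * share for name, share in shares.items()}

# Relative performance with in-order = 1.0.
ooo_perf = 1.0 + ooo_gain
static_perf = 1.0 + parts["speculation"]  # static schedule + speculation

# Clock increase needed for the static design to match OOO,
# assuming performance is proportional to clock rate.
clock_increase = ooo_perf / static_perf - 1.0

print(f"reordering gain: {parts['cache_miss_reordering']:.0%}")
print(f"speculation gain: {parts['speculation']:.0%}")
print(f"clock increase to match OOO: {clock_increase:.1%}")
```

Under these assumptions the required clock increase comes out slightly below 5%, which is consistent with the estimate in the comment.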

[1] https://doi.org/10.1145/2451116.2451143


Nice paper!


> compilers aren’t good enough yet

So we need to go back to coding in assembler to save the planet? Sign me up!



