
"...self-modifying code, [is] sometimes appropriate to solve certain problems that have no other solution, or for improving efficiency, as in double-indirect addressing."

I have read that self-modifying code on the x86 architecture is pretty dangerous at the assembly level.

More broadly, this kind of comes back to all the issues in the "C is not a low level language" thread. Some level of assembler certainly gives the programmer the fullest access to the machine possible. But naive assembler from the 8086 - 80486 era is going to be rearranged in a lot of ways by a modern Pentium processor, and counting on in-order execution may be a mistake.

Edit: at the same time, a modern processor doesn't normally expose any level lower than assembler, and the default approach is to assume flat memory while staying aware of the pitfalls of the multiple caches involved.



x86 is particularly friendly to self-modifying code: it requires fewer fences and hints when code is modified than most ISAs. It's generally easier to implement self- and cross-modifying code on x86 than on other ISAs, like, say, ARM.

(I manage a team that writes self-modifying code for a living.)


Sounds interesting. What does your team do?


Among other things, just-in-time compilers for the WebKit JavaScript engine, JavaScriptCore.


It's quite widely used for dynamic library jump tables. When calling a function in a dynamically linked library, it calls a stub, which is initially a call to the lazy linker, but gets replaced with a call to the resolved function.

You may be remembering early x86 chips that didn't properly invalidate the instruction cache after a write. Modern chips are fully cache-coherent.


This is interesting. I've read documentation about dynamic linking and it describes this replacement process but I never truly understood the fact it was self-modifying code. Doesn't this imply the program's code is writable? I know that JIT compilers also emit code into writable and executable pages. Aren't there security implications?


There are. That’s why dynamic linking typically doesn’t use self-modifying code and JIT compilers take a number of precautions to prevent attackers from being able to execute arbitrary code.


I don't know much about x86, but on other cpus/architectures (PowerPC, MIPS, and possibly others) one might still need to fiddle with the instruction cache.


On modern IA-32 and AMD64 implementations the instruction cache participates in the cache-coherence protocol, so it's all done for you.


Yes, but the CPU still needs to: a) drain the write buffers into the cache, then b) flush the current instruction stream (in case it has already fetched and decoded instructions from the modified memory).

Doing this on every write (especially considering multiple possible virtual-to-physical mappings) is very expensive in terms of hardware. It's why some architectures (RISC-V, for example, with fence.i) have explicit instructions to trigger these things.


That’s not self modifying code, though; it’s just an indirect jump through a pointer.


I think the OP is saying that the instruction call <stub_function> gets rewritten as call <resolved_function>.


Generally that kind of thing goes through a GOT/PLT for security and performance reasons.


> I have read that self-modifying code on the x86 architecture is pretty dangerous at the assembly level.

Dangerous in what way?


Having written self-modifying x86 code: the most annoying thing is that instructions are not a fixed width. This means patching code requires that you be able to parse every instruction, or keep a lookup table of where each instruction starts, or just regenerate the entire code whole cloth. It can also cause problems if the self-modifying code has more than one thread, since you may suddenly need to update 1 to 15 bytes atomically.

Generally, it's easiest to do the last, or to pad with NOPs, or to do something like Windows' hot patch points for functions, where hot-patchable functions are preceded by 5 bytes of NOPs and always start with MOV EDI, EDI, which is pretty much a NOP but takes two bytes.

This allows one to replace the MOV EDI, EDI with a short jump back to the start of those 5 bytes, which are large enough to hold a long jump to any code. Windows went this route because multi-byte NOPs were not originally part of the spec, so if you used one-byte NOPs, not only would each NOP need to be executed, slowing down function calls, but in multi-threaded code you would have to suspend all threads to edit the code, since the CPU would be fetching one byte at a time, etc.


The original 8086 had a six-byte prefetch queue (the 8088 in the original IBM PC had a four-byte one). If you modify an instruction fewer than six bytes ahead, the change won't be seen by the CPU unless you issue a JMP (or CALL) instruction, which flushes the queue. It was not normally that big of an issue (just make sure you modify the instruction from far enough away). You can also use the difference between the 8086's six-byte queue and the 8088's four-byte queue to determine which CPU the program is executing on.

These days, I think you would need to 1) have code pages with write permissions, 2) possibly flush the instruction cache and 3) hope no other thread is using said routine. With today's security concerns, 1) is unlikely to be allowed, 2) possibly requires elevated privileges (I don't recall; I've only really done ring-3 level code on x86) and 3) is probably okay in a single-threaded program.


You can get around 1 by having the page be mapped multiple times with different permissions. Modern x86 processors don’t need a cache flush.


I remember using the prefetch queue as a way to catch debuggers single-stepping through my code. Normal execution never saw the rewrite because the old bytes were already in the queue, but if you rewrote a nearby instruction to be a jump-to-self, single-steppers, which refetch after every instruction, would fall victim to it.


When I was young I imagined that once I was an advanced programming/CS student I would learn advanced stuff like self-modifying code. Too bad real life is not that exciting...


You can get a job in compilers if you want. Message me via my profile if you don't know where to start.


If you’re interested, JIT compilers or malware reversing is a “real” place where this knowledge is useful.


Also don't forget binary translation, which is basically assembly-level JIT. Static binary translation is a thing too, but it has limitations.


> I have read that self-modifying code on the x86 architecture is pretty dangerous at the assembly level.

Why's that? I'm not aware of any issues specific to x86.


x86 is actually one of the easier platforms to have self-modifying code on, because you don’t have to flush the instruction cache.


Which is kind of interesting, considering how far removed the assembly you write is from what is actually executed. You'd think x86 would be even less likely to notice you'd changed the plan out from underneath it.



