From what I understand, it's transpiling the x86 code to ARM on the fly. I honestly would have thought that wasn't possible, but hearing that they're doing it, I'd call it a monumental effort but very feasible. The best part is that once they've gotten the CRT and the cdecl calling convention working, actual application support won't be far behind. The biggest challenge will likely be inserting memory barriers correctly: a spinlock implemented in x86 assembly is highly unlikely to work correctly unless the translator recognizes that specific structure and translates it as a whole.
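To make the spinlock concern concrete, here is a minimal sketch in C11 atomics. The names (`sketch_spinlock` and friends) are made up for illustration and are not taken from any real translator. On x86 the usual pattern is an `xchg` to take the lock and a plain `mov` store to release it; the plain store is only safe because x86 does not reorder it with earlier accesses. On ARM the same unlock needs explicit release semantics, and that is what a translator would have to recover from the raw instructions.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type for illustration only. */
typedef struct {
    atomic_flag held;
} sketch_spinlock;

static void sketch_lock_init(sketch_spinlock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_relaxed);
}

/* x86: typically `xchg`, which is implicitly locked and a full barrier.
 * ARM: needs at least acquire ordering so the critical section cannot
 * be hoisted above the point where the lock is taken. */
static bool sketch_try_lock(sketch_spinlock *l) {
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void sketch_lock(sketch_spinlock *l) {
    while (!sketch_try_lock(l)) {
        /* spin until the current holder releases the lock */
    }
}

/* x86: often just a plain store of 0, with no visible barrier at all.
 * ARM: a plain store is not enough; it must be a release store (or be
 * preceded by a barrier) so writes made inside the critical section
 * become visible before the lock appears free. */
static void sketch_unlock(sketch_spinlock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The point of the sketch is the unlock path: nothing in the x86 instruction stream marks that store as special, so a translator either treats every store conservatively or has to pattern-match the whole lock.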
I thought fat binaries didn't work like that: they include code for multiple instruction sets, with a header pointing to each section (68k, PPC, and x86).
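For reference, a rough sketch of what "a header pointing to the sections" looks like, assuming the Mach-O universal layout used on macOS: a big-endian magic of 0xCAFEBABE, a slice count, then one 20-byte entry per architecture. The older 68k/PPC fat binaries used a different mechanism, so this only covers the Mach-O case. The `sketch_` names are mine, not from any system header.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FAT_MAGIC 0xCAFEBABEu
#define SKETCH_CPU_X86_64 0x01000007u /* CPU_TYPE_X86 | 64-bit ABI flag */
#define SKETCH_CPU_ARM64  0x0100000Cu /* CPU_TYPE_ARM | 64-bit ABI flag */

/* One per-architecture entry; all fields are stored big-endian on disk. */
typedef struct {
    uint32_t cputype;
    uint32_t cpusubtype;
    uint32_t offset; /* file offset of this architecture's slice */
    uint32_t size;   /* size of the slice in bytes */
    uint32_t align;  /* alignment as a power of two */
} sketch_fat_arch;

static uint32_t sketch_read_be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Parses up to max_out entries into out. Returns the number parsed,
 * or -1 if the buffer is not a fat header or is truncated. */
static int sketch_parse_fat(const uint8_t *buf, size_t len,
                            sketch_fat_arch *out, int max_out) {
    if (len < 8 || sketch_read_be32(buf) != SKETCH_FAT_MAGIC) {
        return -1;
    }
    uint32_t n = sketch_read_be32(buf + 4);
    if (n > (len - 8) / 20) {
        return -1; /* header claims more entries than the buffer holds */
    }
    int count = 0;
    for (uint32_t i = 0; i < n && count < max_out; i++) {
        const uint8_t *e = buf + 8 + (size_t)i * 20;
        out[count].cputype    = sketch_read_be32(e);
        out[count].cpusubtype = sketch_read_be32(e + 4);
        out[count].offset     = sketch_read_be32(e + 8);
        out[count].size       = sketch_read_be32(e + 12);
        out[count].align      = sketch_read_be32(e + 16);
        count++;
    }
    return count;
}
```

The loader picks the entry whose `cputype` matches the host and maps that slice; no translation is involved, which is why this is a separate mechanism from Rosetta.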
Rosetta, to the best of my understanding, did something similar, but relied on some custom microcode support that isn't rooted in ARM instructions. Do you have a link that explains in a bit more depth how they did that?
Fat binaries contain both ARM and x86 code, but I was referring to Rosetta, which is used for x86-only binaries.
Rosetta translates x86 to ARM, both ahead of time (AOT) and just in time (JIT). It translates to normal ARM code; the only dependency on an Apple-specific ARM extension is that the M-series processors have a special mode that implements x86-like strong memory ordering (total store ordering, TSO). This means Rosetta does not have to figure out where to place memory barriers, which allows for much better performance.
So when running translated code, Apple Silicon is basically an ARM CPU with an x86 memory model.
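A small sketch of why the memory model matters so much, written with C11 atomics and made-up names (`mp_publish`, `mp_try_consume`). On x86 this message-passing pattern is often written with plain loads and stores and still works, because the hardware keeps stores in order and loads in order. On ARM's default model the flag store needs release ordering and the flag load needs acquire ordering. A translator looking at raw x86 cannot tell which plain accesses are the "flag" ones, so without a TSO mode it would have to treat essentially every access this way.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int mp_payload;       /* ordinary data written before the flag */
static atomic_int mp_ready;  /* flag that announces the data is ready */

/* Producer side. On x86 both writes can be plain `mov` stores.
 * On weakly ordered ARM the flag store must be a release store,
 * otherwise the flag may become visible before the payload. */
static void mp_publish(int value) {
    mp_payload = value;
    atomic_store_explicit(&mp_ready, 1, memory_order_release);
}

/* Consumer side. On x86 both reads can be plain `mov` loads.
 * On weakly ordered ARM the flag load must be an acquire load,
 * otherwise the payload read may be satisfied before the flag read. */
static bool mp_try_consume(int *out) {
    if (atomic_load_explicit(&mp_ready, memory_order_acquire) == 0) {
        return false;
    }
    *out = mp_payload;
    return true;
}
```

With the hardware running in its x86-like ordering mode, the translated code can use ordinary loads and stores for both functions, which is where the performance win described above comes from.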