This patch adds a dynamic recompiler for 32-bit PowerPC, based on
the existing dynrec framework. I've only tested it on a Wii, but there
should be no reason for it not to work on PowerPC-based Macs. As far as
performance goes, PCPBENCH gives 0.7fps with core=normal and 3.1fps
with core=dynamic. There are some other big-endian improvements that
get it up to 4.0fps, but I haven't included them here as they aren't
related to dynrec.
I haven't touched any of the autoconfigure scripts; config.h needs the
following settings:
The compiler needs to support gcc inline assembly (checked via
defined(__GNUC__)) for dcache flushing/icache invalidation. There
doesn't seem to be a portable way to achieve this, but these aren't
supervisor-level instructions, so they should be fine for any userspace
program to use.
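
For reference, the flush/invalidate step can be sketched roughly as
below. This is not the code from the patch: the function name
flush_code_range(), the 32-byte cache line size and the
__builtin___clear_cache fallback for non-PowerPC hosts are all
assumptions made for illustration. The PowerPC branch uses the usual
dcbst / sync / icbi / sync / isync sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache line size (hypothetical constant); real code should use
// the value that is correct for the target CPU.
static const std::uintptr_t kCacheLine = 32;

// Make freshly written code in [start, start + len) visible to
// instruction fetch. Returns the number of cache lines covered.
static std::size_t flush_code_range(void* start, std::size_t len) {
    assert(start != nullptr);
    if (len == 0) return 0;

    const std::uintptr_t mask  = ~(kCacheLine - 1);
    const std::uintptr_t begin = reinterpret_cast<std::uintptr_t>(start);
    const std::uintptr_t first = begin & mask;              // first line
    const std::uintptr_t last  = (begin + len - 1) & mask;  // last line
    const std::size_t lines =
        static_cast<std::size_t>((last - first) / kCacheLine) + 1;

#if defined(__GNUC__) && (defined(__powerpc__) || defined(__ppc__) || defined(__PPC__))
    // Write modified data cache lines back to memory.
    for (std::uintptr_t p = first; p <= last; p += kCacheLine)
        __asm__ __volatile__("dcbst 0,%0" : : "r"(p) : "memory");
    __asm__ __volatile__("sync" : : : "memory");
    // Discard stale instruction cache lines for the same range.
    for (std::uintptr_t p = first; p <= last; p += kCacheLine)
        __asm__ __volatile__("icbi 0,%0" : : "r"(p) : "memory");
    __asm__ __volatile__("sync\n\tisync" : : : "memory");
#elif defined(__GNUC__)
    // Fallback for other hosts (an assumption, not part of the patch).
    __builtin___clear_cache(static_cast<char*>(start),
                            static_cast<char*>(start) + len);
#endif
    return lines;
}
```

The idea would be to call something like this on each block of
generated code after it is written and before it is first executed.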
Some comments on the changes:
- I had to name the FPU_Rec struct so it could be forward-declared in
risc_ppc.h (having a dedicated register pointed to it helps FPU
heavy code).
- Removed some WORDS_BIGENDIAN guards in the self-modifying code
detection. They weren't needed, as the additions aren't meant to
overflow between bytes.
- Made dyn_run_code() get called before dyn_return(BR_Link1/BR_Link2)
and shuffled their locations a bit. The reason for this is that the
PPC dynrec generates its epilog once in gen_run_code() and then puts
a jump to it whenever gen_return_function() is called, rather than
emitting a full epilog every time. If dyn_return() were called before
dyn_run_code(), the address of the epilog would not yet be known.
- Added the missing cache_block_before_close()/cache_block_closing()
calls for those blocks.
- The dynrec decoder wasn't differentiating between little-endian (host)
memory access and regular memory access. I added new functions where
necessary (hopefully caught them all) and aliased them to the regular
functions when WORDS_BIGENDIAN is not defined.
- dyn_ret_near() was buggy: it tried to write a dword to &reg_ip, which
overran on big-endian.
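
To illustrate the little-endian vs. regular memory access split
mentioned above, here is a minimal sketch. The names host_read32(),
le_read32() and so on are hypothetical and are not the functions added
by the patch. It only shows the general pattern: explicit byte-order
handling when WORDS_BIGENDIAN is defined, and a plain alias to the
regular access otherwise.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Regular access: the bytes are interpreted in the host's native order.
static inline std::uint32_t host_read32(const std::uint8_t* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
static inline void host_write32(std::uint8_t* p, std::uint32_t v) {
    std::memcpy(p, &v, sizeof v);
}

#if defined(WORDS_BIGENDIAN)
// Big-endian host: assemble the little-endian value byte by byte.
static inline std::uint32_t le_read32(const std::uint8_t* p) {
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
static inline void le_write32(std::uint8_t* p, std::uint32_t v) {
    p[0] = static_cast<std::uint8_t>(v);
    p[1] = static_cast<std::uint8_t>(v >> 8);
    p[2] = static_cast<std::uint8_t>(v >> 16);
    p[3] = static_cast<std::uint8_t>(v >> 24);
}
#else
// Little-endian host: the little-endian variants are just aliases.
static inline std::uint32_t le_read32(const std::uint8_t* p) {
    return host_read32(p);
}
static inline void le_write32(std::uint8_t* p, std::uint32_t v) {
    host_write32(p, v);
}
#endif

// Usage example: little-endian data reads back as the same value on
// either kind of host, provided WORDS_BIGENDIAN matches the host.
static void le_access_demo() {
    std::uint8_t mem[4] = {0x78, 0x56, 0x34, 0x12};
    assert(le_read32(mem) == 0x12345678u);
    le_write32(mem, 0xAABBCCDDu);
    assert(mem[0] == 0xDD && mem[3] == 0xAA);
}
```

The same pattern applies to 16-bit accesses. The sketch assumes that
WORDS_BIGENDIAN is defined exactly when the host is big-endian.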