Examining the DeSmuME Emulator
Abstract
Emulation is the reproduction of the function of one computer on another computer. It is
important for maintaining access to legacy software, such as old video games. DeSmuME, a
Nintendo DS emulator, is used as a case study to understand how interpretive emulation (as
opposed to JIT compilation) works. The emulator was studied through code examination and
testing, with an emphasis on the CPU and MMU. This revealed the emulator’s architecture and how
its components work together to form a working product, and it shows the value of interpretive
emulation as a method of preserving old software.
Introduction
Emulation, at a base level, is running software built for one architecture on a different
architecture by simulating the original hardware. In effect, it makes an old platform available
on a standard computer. It is necessary to keep old software from going extinct. Currently,
emulation is most commonly used to run old games built for different machines on a standard
computer. As the hardware for these games becomes outdated, the risk of certain games
disappearing becomes higher and higher. Emulation also has many practical applications for
software engineering and testing. Breaking down and analyzing DeSmuME, a Nintendo DS emulator,
will allow us to understand how emulation works.
Literature Review
Emulation is crucial for preserving access to legacy software. Amelia Acker writes,
“Providing access to information through software emulation techniques will likely transform the
culture, practice, and access experiences to digital cultural heritage as well as best practices for
digital preservation professionals” [1]. Old software has become a cornerstone of our culture.
Millions of people across generations have deep emotional connections to old games. From the
old arcade classics to early console games to even more recent systems such as the Wii or
Nintendo DS, emulation provides an opportunity for all of these people to replay their childhood
favorites. These games have become iconic in our culture. Nearly everyone knows who Mario is,
and there have even been movies made based on the old video game. As access to software
continues to grow, so does the need for a way to preserve games. As time goes on, new software
will become old, and the volume of software keeps increasing rapidly. There is a pressing need
for a way to continuously and sustainably preserve video games. Spencer Bevis explains why:
“Because digital objects are vulnerable to loss in ways that traditional library documents are
not, they require innovative solutions to recover the object and its software environment” [2].
Plenty of games have already been (seemingly) permanently lost. As these games age and lose
their commercial value, there is little reason not to maintain a library of them, and emulation
provides a practical way to keep all of these games playable.
There are two main methods in active use for emulation: interpretive emulation and Just-in-Time
(JIT) compilation. Interpretive emulation is the more commonly used method. The literature on
interpretive emulation is not always consistent in its naming. Fabrice Bellard calls it micro
operations [5], but the emulation community more commonly refers to this method as interpretive
emulation. It involves decoding and executing each instruction, one at a time. It is fairly
straightforward to implement, very accurate, and easily portable. However, it is slow, and
lower-end hardware can sometimes struggle to run an emulator that uses this method [3]. Since
each instruction is processed individually, every emulated instruction costs several native
instructions for fetching, decoding, and dispatch. How much this matters depends on the native
system the emulator is running on, how well the emulator is built, and what it is emulating. For
example, a Nintendo DS emulator is much easier to run than a Wii emulator, as there is simply
less hardware to emulate. Dolphin, the most popular GameCube/Wii emulator, uses JIT instead.
Brook Heisler, when comparing interpretation to JIT, explained that a JIT compiler “can be run
once and emit a blob of machine code which executes an entire emulated function (or more) in one
sequence of instructions” [4]. Because the translated code is reused, far less time is spent in
a per-instruction decode loop, allowing the code to run significantly faster. This method is
highly complex and makes debugging significantly more difficult. However, it is necessary when
performance is a critical component of the emulator. This makes it a much better choice for
emulating a machine as demanding as the Wii.
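To make the fetch, decode, and execute cycle concrete, the following is a minimal sketch of an
interpretive loop. It assumes a hypothetical three-instruction toy machine (the names Op, Instr,
ToyCpu, and run are invented for illustration) and is not taken from any real emulator. Every
emulated instruction passes through the same loop and switch statement, which is the
per-instruction overhead that makes interpretation slower than JIT.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy instruction set, for illustration only (not ARM).
enum class Op : std::uint8_t { LoadImm, Add, Halt };

struct Instr {
    Op op;
    std::uint8_t dst;  // destination register index
    std::uint8_t src;  // source register index (used by Add)
    std::int32_t imm;  // immediate value (used by LoadImm)
};

struct ToyCpu {
    std::int32_t regs[4] = {0, 0, 0, 0};
    std::size_t pc = 0;          // index of the next instruction
    std::uint64_t executed = 0;  // number of instructions interpreted
};

// Interprets the program one instruction at a time until Halt
// or until the program counter runs off the end of the program.
void run(ToyCpu& cpu, const std::vector<Instr>& program) {
    while (cpu.pc < program.size()) {
        const Instr& in = program[cpu.pc++];  // fetch
        ++cpu.executed;
        switch (in.op) {                      // decode
            case Op::LoadImm:                 // execute
                cpu.regs[in.dst & 3] = in.imm;
                break;
            case Op::Add:
                cpu.regs[in.dst & 3] += cpu.regs[in.src & 3];
                break;
            case Op::Halt:
                return;
        }
    }
}
```

Running a four-instruction program that loads 5 and 7 into two registers, adds them, and halts
leaves 12 in the destination register after four trips through the loop.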
Methods
Two main methods were used to break down and analyze DeSmuME. The first method was examining the
code. The important parts of DeSmuME’s code are fairly well documented. However, there are many
large chunks of code with no comments at all, and it is hard to decipher exactly what they are
doing. Thankfully, the core components are commented and reasonably easy to understand with
proper context. The two components that were focused on were the ARM CPUs and the MMU. Another
important component of DeSmuME is the graphics. There are two engines that handle 2D graphics,
and a GPU that handles 3D graphics. However, this section is largely undocumented, is split into
multiple files, and contains tens of thousands of lines of code. In light of this (and
considering that graphics are not a core component of an operating system), it was omitted from
the project. The second method was experimental testing, using the developer mode of DeSmuME.
This version of the emulator gives direct access to graphical views of the components listed
above. Many tools are available to see what is happening in real time during emulation. These
include the disassembler (interpreter/CPU loop), registers, memory, and all of the components
involved in rasterization. Sound state views are also available, but sound was left out of scope
because the code that handles it is split across many different files and is largely
undocumented. The focus for this project was kept on the core components of an operating system,
which include the CPU and MMU. Having access to these components while testing made it easy to
see exactly what was going on during execution, which proved very helpful in understanding how
DeSmuME works as a whole. An attempt was also made to modify the code in certain spots to see
how performance was affected. However, this was largely unsuccessful, as the emulator would
either break completely or show no change. This ended up being a major limitation of the
project. The ultimate goal of these methods was to understand the architecture of the emulator
and how and why it works.
Results
DeSmuME is primarily written in C++. Some parts are in C, but the majority (including all of the
major components that were analyzed) is in C++. The first component analyzed was the CPU.
Interestingly, DeSmuME does have a file called “arm_jit.cpp”; however, this is an optional path
that is not enabled by default. The files that contain the interpretive method for CPU handling
are “arm_instructions.cpp” and “armcpu.cpp”. The dual-processor architecture of the Nintendo DS
is replicated, with two ARM cores, ARM9 and ARM7. The ARM9 processor mostly handles the game
loop, while the ARM7 processor handles audio and I/O. For both of these processors, an
interpretation loop is used that mimics how the real ARM cores fetch, decode, and execute
instructions. These three steps form the loop that processes each instruction, one by one. The
instruction is fetched from memory, decoded to determine the operation type (as seen in
“arm_instructions.cpp”), and then executed. The instructions are modular, meaning that each type
of instruction has its own handler function, which allows for easy debugging. Both processors
also implement the ARM processor modes, including User, System, and IRQ. The registers and
program counter are consistently maintained throughout. Since the hardware is being emulated in
software, there is no built-in hardware clock driving the two cores. Instead, the emulator
manually manages instruction timing by running the two processors in pseudo-parallel. It
simulates concurrency by switching between ARM9 and ARM7 execution so that one processor does
not jump ahead of the other. This is crucial for shared memory and interrupt handling. It is
likely most important for sound, ensuring that the correct sounds are played alongside the
graphics. The interpretive loop allows for a high level of accuracy, easy debugging, and easy
portability across different operating systems (Windows, macOS, Linux).
Graphical representations of the interpreters for each processor
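The two ideas described above, one handler per instruction type and pseudo-parallel execution of
the two cores, can be sketched as follows. This is a simplified illustration under stated
assumptions, not DeSmuME’s actual code: the names Core, Handler, kHandlers, step, and runBoth are
hypothetical, the 4-bit opcode is invented rather than real ARM encoding, and both cores execute
the same instruction word to keep the example short. The 2:1 tick cost loosely reflects the
ARM9 being clocked at roughly twice the speed of the ARM7.

```cpp
#include <array>
#include <cstdint>

// Illustrative sketch only: hypothetical names and a made-up 4-bit opcode.
struct Core {
    std::uint32_t regs[16] = {};
    std::uint64_t cycles = 0;    // emulated time consumed so far, in ticks
    std::uint64_t executed = 0;  // instructions executed so far
};

using Handler = void (*)(Core&, std::uint32_t);

void opNop(Core&, std::uint32_t) {}
void opInc(Core& c, std::uint32_t instr) { ++c.regs[instr & 0xF]; }

// One handler per instruction type, indexed by the top four bits of the word.
const std::array<Handler, 16> kHandlers = [] {
    std::array<Handler, 16> table{};
    table.fill(opNop);
    table[1] = opInc;
    return table;
}();

// Decodes and executes one instruction, then charges its cost to the core.
void step(Core& c, std::uint32_t instr, std::uint64_t ticksPerInstr) {
    kHandlers[instr >> 28](c, instr);
    c.cycles += ticksPerInstr;
    ++c.executed;
}

// Pseudo-parallel execution: always advance whichever core is behind in
// emulated time, so neither core runs far ahead of the other.
void runBoth(Core& arm9, Core& arm7, std::uint32_t instr, std::uint64_t untilTicks) {
    while (arm9.cycles < untilTicks || arm7.cycles < untilTicks) {
        if (arm9.cycles <= arm7.cycles) {
            step(arm9, instr, 1);  // faster core: 1 tick per instruction
        } else {
            step(arm7, instr, 2);  // slower core: 2 ticks per instruction
        }
    }
}
```

In this sketch the two cores never drift more than two ticks apart, which is the property that
keeps shared memory and interrupts consistent between them. A real emulator charges a different
cycle cost per instruction and typically runs each core for a small batch of cycles before
switching.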
The second component analyzed was how DeSmuME handles memory. The Nintendo DS has a fairly
complex memory system, which includes 4 MB of main RAM, 656 KB of VRAM, shared memory, and
access to cartridge memory across regions. DeSmuME has to replicate this system exactly in order
to function properly. Most of this is implemented in “MMU.cpp”. While the code itself is hard to
follow (it is roughly 6,000 lines long), the graphical representation of memory in the developer
mode of the emulator reveals how memory is handled. There is designated memory for the ARM9 and
ARM7 processors, as well as for the firmware and the ROM cartridge. Emulated memory is mapped
onto the memory of the native machine, allowing for reads and writes to specific addresses.
There is also memory for I/O, exposed as memory-mapped registers. The ARM9 processor has
registers for video engines A and B, DMA, IPC, and IRQ. The ARM7 processor has registers for
just IPC, IRQ, and DMA. All parts of the Nintendo DS memory system are replicated and built
through memory virtualization and mapping.
Graphical representation of memory
Graphical representations of registers for each processor
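The address-based mapping described in the memory discussion can be illustrated with a small
sketch. It assumes a hypothetical ToyBus class that backs two regions of the ARM9 address space
with host arrays: main RAM starting at 0x02000000 and I/O registers starting at 0x04000000. Only
those two regions are modeled, the I/O region is treated as plain storage with no side effects,
and none of this is DeSmuME’s actual MMU code.

```cpp
#include <cstdint>
#include <vector>

// Illustrative sketch only: a hypothetical bus that is far simpler than a
// real MMU. Each emulated region is backed by an ordinary host-side array.
class ToyBus {
public:
    ToyBus() : mainRam_(4u * 1024 * 1024), io_(0x1000) {}

    std::uint8_t read8(std::uint32_t addr) {
        std::uint8_t* p = lookup(addr);
        return p ? *p : 0;  // unmapped reads return 0 in this sketch
    }

    void write8(std::uint32_t addr, std::uint8_t value) {
        std::uint8_t* p = lookup(addr);
        if (p) *p = value;  // unmapped writes are ignored in this sketch
    }

private:
    // Translates an emulated address into a pointer into host memory.
    std::uint8_t* lookup(std::uint32_t addr) {
        switch (addr >> 24) {
            case 0x02:  // main RAM: 4 MB, repeated across the region
                return &mainRam_[addr & (mainRam_.size() - 1)];
            case 0x04: {  // I/O registers: only the first 4 KB is modeled
                std::uint32_t offset = addr & 0x00FFFFFFu;
                return offset < io_.size() ? &io_[offset] : nullptr;
            }
            default:
                return nullptr;  // region not modeled here
        }
    }

    std::vector<std::uint8_t> mainRam_;
    std::vector<std::uint8_t> io_;
};
```

A write to 0x02000010 can be read back from the same address or from 0x02400010, because the
4 MB array repeats across the region in this sketch. A real emulator additionally attaches
behavior to I/O addresses, so that writing a register triggers DMA, interrupts, or video
changes rather than just storing a byte.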
Testing and Experimentation
Testing was done by running Pokémon Platinum on the emulator and recording frame rate and
resource usage. This was done on a laptop with an AMD Ryzen 7 CPU, 32 GB of DDR4 RAM, and an
NVIDIA GeForce RTX 3060 GPU. An attempt was made to test on a second, borrowed laptop with an
Intel i5 CPU, 8 GB of RAM, and an Intel Iris GPU; however, DeSmuME would not compile on it in
the limited time available, so all testing was done on the first laptop. Four metrics were
analyzed: FPS, CPU usage, GPU usage, and memory usage. Two five-minute trials were conducted,
the first using rasterization to handle graphics and the second using OpenGL.
Graphics mode    Average FPS    CPU Usage    GPU Usage    Memory Usage
DirectDraw HW    60             ~7.8%        ~6.5%        28.3 MB
OpenGL           60             ~6.2%        ~15%         29.3 MB
These results were in line with what would be expected. DirectDraw is the name for the 2D
rasterization, which relies on the processors for graphics, whereas OpenGL uses the GPU. This is
reflected in the measurements: CPU usage was higher with DirectDraw (~7.8% versus ~6.2%), while
GPU usage was higher with OpenGL (~15% versus ~6.5%). A frame drop was never observed in either
case, which makes sense considering the specs of the computer that was used for testing.
Interestingly, the emulator uses a very consistent amount of memory. Other processes fluctuate,
at least a little, in how much memory they use, but DeSmuME’s usage stayed constant within each
trial. This is likely due to memory virtualization, with a fixed amount of host memory being set
aside up front for the emulated system.
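The write-up does not state how the frame-rate figures were computed. One simple way to turn
per-frame timings into an average FPS and a count of slow frames is sketched below. The
FrameStats struct, the summarize function, and the input format (one duration in milliseconds
per frame) are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

struct FrameStats {
    double averageFps;       // frames divided by total elapsed seconds
    std::size_t slowFrames;  // frames that took longer than the frame budget
};

// frameTimesMs holds the measured duration of each frame in milliseconds.
FrameStats summarize(const std::vector<double>& frameTimesMs, double targetFps = 60.0) {
    const double budgetMs = 1000.0 / targetFps;  // about 16.67 ms at 60 FPS
    double totalMs = 0.0;
    std::size_t slow = 0;
    for (double t : frameTimesMs) {
        totalMs += t;
        if (t > budgetMs) ++slow;
    }
    const double avg =
        totalMs > 0.0 ? 1000.0 * static_cast<double>(frameTimesMs.size()) / totalMs : 0.0;
    return {avg, slow};
}
```

At a 60 FPS target each frame has a budget of roughly 16.67 ms, so a trial with no dropped
frames corresponds to a slow-frame count of zero under this definition.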
Conclusion
Overall, analyzing the DeSmuME emulator provides valuable insight into how emulators work with
native operating systems. A dual-CPU interpretive loop is used to handle the instructions.
Timing is manually managed to keep the two CPUs synchronized. Memory is handled through
virtualization and mapping. Together, these allow for great portability, accuracy, consistency,
and ease of debugging, which are all advantages of the interpretive approach to emulation.
However, there were some limitations as well. Graphics and sound handling were omitted, testing
was only done on one machine, and attempts to modify the code for further testing were
unsuccessful. In the future, a more holistic analysis of the emulator could yield more in-depth
insights. Overall, emulators are a valuable tool for understanding how operating systems work
and for preserving them long term.
References
[1] A. Acker, “Software Emulation to Preserve Access to Legacy Born-Digital Cultural
Artifacts,” J. Assoc. Inf. Sci. Technol., vol. 72, no. 7, pp. 790–802, 2021. [Online]. Available:
https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24482
[2] S. Bevis, Emulation in the Archives: Preserving Digital Heritage through Software
Preservation, Master’s paper, Univ. of North Carolina at Chapel Hill, 2019. [Online]. Available:
https://cdr.lib.unc.edu/concern/masters_papers/x059cc73c
[3] M. Jantz, “Emulation Notes,” University of Tennessee, Knoxville. [Online]. Available:
https://web.eecs.utk.edu/~mrjantz/slides/teaching/runtime_systems/emulation_notes.pdf
[4] B. Heisler, “Experiments in NES JIT Compilation,” 2018. [Online]. Available:
https://bheisler.github.io/post/experiments-in-nes-jit-compilation/
[5] F. Bellard, “QEMU, a Fast and Portable Dynamic Translator,” in Proc. USENIX Annual
Technical Conference, FREENIX Track, 2005. [Online]. Available:
https://www.usenix.org/legacy/event/usenix05/tech/freenix/full_papers/bellard/bellard.pdf
[6] “DeSmuME Main Page,” DeSmuME Wiki. [Online]. Available:
https://wiki.desmume.org/index.php?title=Main_Page
[7] TASEmulators, “DeSmuME GitHub Repository.” [Online]. Available:
https://github.com/TASEmulators/desmume/tree/master
[8] “Just-in-Time Compilation in Emulator Design,” NIPES J. Sci. Technol. Res., vol. 5, no. 1,
pp. 283–287, 2023.