Overview

This aims to delve into the actual play.date console platform itself, including (but not limited to) the game build process, the onboard operating system, and the peripherals on board the console, the things with no existing documentation.

Note that any code marked as Example Code (such as, but not limited to the Game of Life C example) or stated to be provided in the SDK or constructed by Panic Inc. has sole IP ownership by Panic Inc. and belongs to them.

Previous Work

Most of the past work has been focused on the API, the overal structure of the packaged game files, and the soundtracks, not too much has been focused on platform introspection itself. At a high level a lot of the work is already laid out for us a la play.date reversing, but the things it ignores - the underlying runtime, the api for the game binaries themselves, and how the play.date is the play.date. That still needs to be uncovered.

Also i want a web browser on it lol.

Is any of this even allowed?

Yeah basically.

The big "don't"'s of the SDK license are:

You will not:
    - Modify, disassemble, or decompile any part of the SDK;
    - Distribute or transfer the SDK to others (other than the incorporation of distributable elements of the SDK in Your Developed Programs in accordance with the terms of this Agreement);
    - Modify, adapt, alter, translate, or incorporate into or with other software or create a derivative work of any part of the SDK, except as permitted herein, without express written permission from Panic;
    - Use the SDK to develop applications for other platforms or to develop another SDK, without express written permission from Panic.

The other fun one not in the initial bullets:

"[You will not] Use the SDK to create, develop, or use any program, software, or service which (1) contains any viruses, Trojan horses, worms, or other computer programming routines that are intended to damage, detrimentally interfere with, surreptitiously intercept, or expropriate any system, data, or personal information; (2) when used in the manner in which it is intended, violates any applicable law, statute, ordinance, or regulation (including without limitation the laws and regulations governing export control, unfair competition, anti-discrimination, false advertising, or data privacy); (3) infringes the intellectual property rights of third parties; or (4) interferes with the operability of other Panic or third-party programs or software; "

Addressing the don't's

  • No changes are being made to the SDK, theres nothing of interest in the SDK to disassemble or decompile (the only things we could would be pdutils and pdc).
  • The SDK is not being redistributed so othing to worry about
  • The SDK is not being modified
  • The SDK is not being used to develop for another platform or "develop another SDK"
  • not doing any of that illegal stuff

Game Binary Internals

This gives a brief overview of what an example project setup could look like, what the build process for the project is, and what the emitted binary structure is. This focuses on what builds with the C SDK look like, and does not address lua at all.

NOTE:

This uses the C_API/Examples/Life example from the SDK extensively.

A Game of Life

This is going to to a quick "once over" of the code they gave as example for Conway's Game of Life. Life is a classic nerd example and is fairly straghforward.

This is more concerned with the actual effect of the code layout as opposed to the code itself.

The initial file strucure of Life looks like:

$ tree
CMakeLists.txt
main.c
Makefile
Source/
  - pdxinfo

We can basically ignore the pdxinfo for now, and the Makefile as well. CMake is more powerful than Make and allows for cleaner builds imo and its easier to decipher wtf is actually going on so we'll look at CMakeLists.txt.

Note the lines:

if (TOOLCHAIN STREQUAL "armgcc")
	add_executable(${PLAYDATE_GAME_DEVICE} ${SDK}/C_API/buildsupport/setup.c main.c)
else()
	add_library(${PLAYDATE_GAME_NAME} SHARED main.c )
endif()

include(${SDK}/C_API/buildsupport/playdate_game.cmake)

As we're focusing on binaries for the actual device we don't actually care about the else branch.

Input files

The input compilation units to this game are going to be:

  • setup.c (provided by SDK)
  • main.c (from user code)

main.c

What main does in general is entirely up to the user, but as per the docs, must contain, at a minimum:

what is setup.c?

This is where a lot of the play.date magic happens, and is a file provided in the SDK. At a high level it is putting a more accessible name of their allocator in scope, and making wrappers around malloc(), free(), and realloc(). Then is arranging the sections in the compilation unit so that the entry to your defined code is always in the same place. This simplifies their API to all games and is a really cool API decision on their part.

So how do they do all of this?

With *code*:

PDEventHandler* PD_eventHandler __attribute__((section(".capi_handler"))) = &eventHandlerShim;

extern uint32_t bssStart asm("__bss_start__");
uint32_t* _bss_start __attribute__((section(".bss_start"))) = &bssStart;

extern uint32_t bssEnd asm("__bss_end__");
uint32_t* _bss_end __attribute__((section(".bss_end"))) = &bssEnd;

.capi_handler

bss okay cool thats standard C programming stuff (initialized 0 memory), but what is .capi_handler??? According to OpenAI's ChatGPT:

The .capi_handler section is a section of memory in a program that is reserved for storing data related to C++ exception handling. This data is used by the C++ runtime library to implement the C++ exception handling mechanism.

In a C++ program, when an exception is thrown, the exception object is passed to the C++ runtime library, which searches for an exception handler that is capable of handling the exception. The runtime library uses the data stored in the .capi_handler section to determine which exception handlers are available and how to invoke them.

The .capi_handler section is created by the linker when it links a C++ program. The linker generates the .capi_handler section based on information in the object files generated by the compiler.

Cool, so it sounds like they took advantage of the fact that the input code is only C, then threw a pointer to the event handler in a section that would otherwise be unpopulated.

But what is the thing they are putting in that section? eventHandlerShim() is not a user function.

Shim code:

int eventHandlerShim(PlaydateAPI* playdate, PDSystemEvent event, uint32_t arg)
{
	if ( event == kEventInit )
		pdrealloc = playdate->system->realloc;
	
	return eventHandler(playdate, event, arg);
}

The shim makes sure that the allocator pointer is headed to the right place before any user game code executes. This is a pretty nice extra layer of developer nicities for us in that there is one less thing to worry about (the pdrealloc is used in their helper malloc(), free(), realloc(), implementations).

.bss

Also remember that in the setup.c they make a point to put a pointer to the __bss_start__ and __bss_end__ addresses in their own sections, so now we are up to 3 custom sections going into the actual build step (in addition to the default .text, .data etc).

So how is this going to look when it actually builds?

Building

NOTE:

Building binaries using their device / arm SDK requires some dependencies. The only things i needed to install were: apt install arm-none-eabi-binutils, and snap install cmake since the apt version is wildly out of date.

Assuming all the dependencies are installed and everything is setup correctly you should be able to build the SDK's Life example and it should just work, note that you need to override the defult toolchain file for cmake, see the below commands for more.

We're going to run the build commands, then dissect what's going on under the hood from all the automagic stuffs we're given from Panic Inc. ( the automagic stuff is great btw <3 ).

Running the Build

Without any of the build output here are the types of commands needed to build an actual project, nothing more than your standard cmake build.

# run from within the `Life` project directory
$ mkdir build && cd build
# run the initial cmake build setup step, and override the toolchain
$ PLAYDATE_SDK_PATH=~/d/o/play.date/sdk-1.12.3 cmake .. -DCMAKE_TOOLCHAIN_FILE=~/d/o/play.date/sdk-1.12.3/C_API/buildsupport/arm.cmake
# build it, not sure if you need to provide the sdk path but i did anyways
$ PLAYDATE_SDK_PATH=~/d/o/play.date/sdk-1.12.3 make Life_DEVICE

Again, except with output

# prep the build
$ pwd
[..]/sdk-1.12.3/C_API/Examples/Life
$ mkdir build && cd build
$ PLAYDATE_SDK_PATH=~/d/o/play.date/sdk-1.12.3 cmake .. -DCMAKE_TOOLCHAIN_FILE=~/d/o/play.date/sdk-1.12.3/C_API/buildsupport/arm.cmake
-- arm.cmake loaded
-- arm.cmake loaded
-- The C compiler identification is GNU 8.3.1
-- The ASM compiler identification is GNU
-- Found assembler: /usr/bin/arm-none-eabi-gcc
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/arm-none-eabi-gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- SDK Path: /home/lockbox/d/o/play.date/sdk-1.12.3
-- Configuring done
-- Generating done
-- Build files have been written to: /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build

# do the build
$ PLAYDATE_SDK_PATH=~/d/o/play.date/sdk-1.12.3 make Life_DEVICE
/snap/cmake/1210/bin/cmake -S/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life -B/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build --check-build-system CMakeFiles/Makefile.cmake 0
make  -f CMakeFiles/Makefile2 Life_DEVICE
make[1]: Entering directory '/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build'
/snap/cmake/1210/bin/cmake -S/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life -B/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build --check-build-system CMakeFiles/Makefile.cmake 0
/snap/cmake/1210/bin/cmake -E cmake_progress_start /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build/CMakeFiles 3
make  -f CMakeFiles/Makefile2 CMakeFiles/Life_DEVICE.dir/all
make[2]: Entering directory '/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build'
make  -f CMakeFiles/Life_DEVICE.dir/build.make CMakeFiles/Life_DEVICE.dir/depend
make[3]: Entering directory '/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build'
cd /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build && /snap/cmake/1210/bin/cmake -E cmake_depends "Unix Makefiles" /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build/CMakeFiles/Life_DEVICE.dir/DependInfo.cmake --color=
make[3]: Leaving directory '/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build'
make  -f CMakeFiles/Life_DEVICE.dir/build.make CMakeFiles/Life_DEVICE.dir/build
make[3]: Entering directory '/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build'
[ 33%] Building C object CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj
/usr/bin/arm-none-eabi-gcc -DTARGET_EXTENSION=1 -DTARGET_PLAYDATE=1 -I/home/lockbox/d/o/play.date/sdk-1.12.3/C_API -Wall -Wno-unknown-pragmas -Wdouble-promotion -mthumb -mcpu=cortex-m7 -mfloat-abi=hard -mfpu=fpv5-sp-d16 -D__FPU_USED=1 -falign-functions=16 -fomit-frame-pointer -gdwarf-2 -fverbose-asm -ffunction-sections -fdata-sections -std=gnu11 -MD -MT CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj -MF CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj.d -o CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj -c /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c
[ 66%] Building C object CMakeFiles/Life_DEVICE.dir/main.c.obj
/usr/bin/arm-none-eabi-gcc -DTARGET_EXTENSION=1 -DTARGET_PLAYDATE=1 -I/home/lockbox/d/o/play.date/sdk-1.12.3/C_API -Wall -Wno-unknown-pragmas -Wdouble-promotion -mthumb -mcpu=cortex-m7 -mfloat-abi=hard -mfpu=fpv5-sp-d16 -D__FPU_USED=1 -falign-functions=16 -fomit-frame-pointer -gdwarf-2 -fverbose-asm -ffunction-sections -fdata-sections -std=gnu11 -MD -MT CMakeFiles/Life_DEVICE.dir/main.c.obj -MF CMakeFiles/Life_DEVICE.dir/main.c.obj.d -o CMakeFiles/Life_DEVICE.dir/main.c.obj -c /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/main.c
[100%] Linking C executable Life_DEVICE.elf
/snap/cmake/1210/bin/cmake -E cmake_link_script CMakeFiles/Life_DEVICE.dir/link.txt --verbose=1
/usr/bin/arm-none-eabi-gcc -mthumb -mcpu=cortex-m7 -mfloat-abi=hard -mfpu=fpv5-sp-d16 -D__FPU_USED=1 -T/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/link_map.ld -Wl,-Map=game.map,--cref,--gc-sections,--no-warn-mismatch "CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj" CMakeFiles/Life_DEVICE.dir/main.c.obj -o Life_DEVICE.elf
/usr/bin/arm-none-eabi-objcopy -Obinary Life_DEVICE.elf /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/Source/pdex.bin
cd /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life && /home/lockbox/d/o/play.date/sdk-1.12.3/bin/pdc -sdkpath /home/lockbox/d/o/play.date/sdk-1.12.3 Source Life.pdx
make[3]: Leaving directory '/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build'
[100%] Built target Life_DEVICE
make[2]: Leaving directory '/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build'
/snap/cmake/1210/bin/cmake -E cmake_progress_start /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build/CMakeFiles 0
make[1]: Leaving directory '/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/build'

Examining the internal CMake files

<todo - not really super important, understanding them just makes things less automagic>

Build Steps

Build setup.c + main.c

This step is only one (ridiculous) gcc invocation, note that for each step (setup.c and main.c, and any other compilation unit), the flags are the same unless manually changed:

# prettified
/usr/bin/arm-none-eabi-gcc \
    -DTARGET_EXTENSION=1 \
    -DTARGET_PLAYDATE=1 \
    -I/home/lockbox/d/o/play.date/sdk-1.12.3/C_API \
    -Wall \
    -Wno-unknown-pragmas \
    -Wdouble-promotion \
    -mthumb \
    -mcpu=cortex-m7 \
    -mfloat-abi=hard \
    -mfpu=fpv5-sp-d16 \
    -D__FPU_USED=1 \
    -falign-functions=16 \
    -fomit-frame-pointer \
    -gdwarf-2 \
    -fverbose-asm \
    -ffunction-sections \
    -fdata-sections \
    -std=gnu11 \
    -MD \
    -MT CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj \
    -MF CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj.d \
    -o CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj \
    -c /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c

Deciphering the gcc options they used

references:

  • https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
  • https://gcc.gnu.org/onlinedocs/gcc/Preprocessor-Options.html
  • gcc manpage
  • processor manual
  • https://stackoverflow.com/questions/4274804/query-on-ffunction-section-fdata-sections-options-of-gcc
    • I'm curious what decisions drove them to do this, their results is some jank form of PIC with a bunch of extra data they don't need?
OptionWhat it's doing
-DTARGET_EXTENSION=1Custom define for some of the CMake build system
-DTARGET_PLAYDATE=1Custom define for some of the CMake build system
-I/home/lockbox/d/o/play.date/sdk-1.12.3/C_APIAdd the C_API to the include path
-WallWarn against all the common stuff
-Wno-unknown-pragmasmanually disable unknown-pragmas since -Wall enables it
-Wdouble-promotionenable double-promotion warning
-mthumbemit thumb code
-mcpu=cortex-m7select cpu
-mfloat-abi=hardenable hard float support
-mfpu=fpv5-sp-d16select the target FPU
-D__FPU_USED=1Define intrinsic variable for FPU support
-falign-functions=16Align functions to 16 bytes, (as per cpu docs)
-fomit-frame-pointerOnly use frame pointers when necessary (generally non-small functions)
-gdwarf-2use dwarf2 debug info
-fverbose-asmadd extra "readability" syntax to generated asm, shouldnt be used? gross.
-ffunction-sectionsPlace each function in it's own section
-fdata-sectionsPlace each data item in it's own section
-std=gnu11Set compiler standard
-MD -MT CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.objpreprocessor arg
-MF CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj.dpreprocessor arg
-o CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.objoutput object file
-c /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.cinput file to compile

Link Step

Now the fun part - the link step.

/usr/bin/arm-none-eabi-gcc \
    -mthumb \
    -mcpu=cortex-m7 \
    -mfloat-abi=hard \
    -mfpu=fpv5-sp-d16 \
    -D__FPU_USED=1 \
    -T/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/link_map.ld \
        -Wl,-Map=game.map,--cref,--gc-sections,--no-warn-mismatch \
    "CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj" \
    CMakeFiles/Life_DEVICE.dir/main.c.obj \
    -o Life_DEVICE.elf

Many of the options are similar to the above build commands, the new flags however, are parameters to pass to the linker.

References:

  • https://sourceware.org/binutils/docs/ld.html
Linker OptionsEffect
-Tlink_map.lduse link_map.ld as the linker script
-Map=game.mapoutput map to file game.map
--crefprint symbol cross reference information to the map file
--gc-sectionslink-tme section garbage collection
--no-warn-mismatchignore errors from linking mismatched objects

Quote from ld documentation:

3.6.4.4 Input Section and Garbage Collection When link-time garbage collection is in use (‘--gc-sections’), it is often useful to mark sections that should not be eliminated. This is accomplished by surrounding an input section’s wildcard entry with KEEP(), as in KEEP(*(.init)) or KEEP(SORT_BY_NAME(*)(.ctors)).

What this is doing

The output symbol / section map for Life_DEVICE.elf is getting dumped to game.map, so now we can retroactively figure out what things are at x offset into memory etc.

Custom Linker script:

MEMORY
{
  EXTRAM (rwx): ORIGIN = 0x60000000, LENGTH = 8*1024K
}

GROUP(libgcc.a libc.a libm.a)

SECTIONS
{
	.capi :
	{
		KEEP(*(.capi_handler))
		KEEP(*(.bss_start))
		KEEP(*(.bss_end))
	} > EXTRAM

	.text :
	{
		*(.text)

		KEEP(*(.init))
		KEEP(*(.fini))

		/* .ctors */
		*crtbegin.o(.ctors)
		*crtbegin?.o(.ctors)
		*(EXCLUDE_FILE(*crtend?.o *crtend.o) .ctors)
		*(SORT(.ctors.*))
		*(.ctors)

		/* .dtors */
 		*crtbegin.o(.dtors)
 		*crtbegin?.o(.dtors)
 		*(EXCLUDE_FILE(*crtend?.o *crtend.o) .dtors)
 		*(SORT(.dtors.*))
 		*(.dtors)

		*(.rodata*)

		KEEP(*(.eh_frame*))

	} > EXTRAM

	.data :
	{
		__etext = .;

		__data_start__ = .;
		*(vtable)
		*(.data*)

		. = ALIGN(4);
		/* preinit data */
		PROVIDE_HIDDEN (__preinit_array_start = .);
		KEEP(*(.preinit_array))
		PROVIDE_HIDDEN (__preinit_array_end = .);

		. = ALIGN(4);
		/* init data */
		PROVIDE_HIDDEN (__init_array_start = .);
		KEEP(*(SORT(.init_array.*)))
		KEEP(*(.init_array))
		PROVIDE_HIDDEN (__init_array_end = .);

		. = ALIGN(4);
		/* finit data */
		PROVIDE_HIDDEN (__fini_array_start = .);
		KEEP(*(SORT(.fini_array.*)))
		KEEP(*(.fini_array))
		PROVIDE_HIDDEN (__fini_array_end = .);

		. = ALIGN(4);
		/* All data end */
		__data_end__ = .;

	} > EXTRAM

	.bss :
	{
		. = ALIGN(4);
		__bss_start__ = .;
		*(.bss*)
		*(COMMON)
		. = ALIGN(4);
		__bss_end__ = .;

	} > EXTRAM

  /DISCARD/ :
  {
		*(.ARM.exidx)
  }

}

The important parts of this linker script are that its making a rwx (Read Write Execute) section at addresss 0x60000000 and of size 8M. It then maps all the code onto it, and all data as well. Starting at offset 0x0 the sections, in order will be:

  • .capi
  • .text
  • .data
  • .bss

Parsing game.map

Important output from game.map:

Eventually you get to the part of game.map that is going to detail all the sections, symbols, addresses, sizes, and source compilation units. This is what is going to get generated into the elf, and is almost what is run on the device.

.capi           0x0000000060000000        0xc
 *(.capi_handler)
 .capi_handler  0x0000000060000000        0x4 CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj
                0x0000000060000000                PD_eventHandler
 *(.bss_start)
 .bss_start     0x0000000060000004        0x4 CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj
                0x0000000060000004                _bss_start
 *(.bss_end)
 .bss_end       0x0000000060000008        0x4 CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj
                0x0000000060000008                _bss_end

.text           0x000000006000000c      0x19c
 *(.text)
 .text          0x000000006000000c       0x88 /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crtbegin.o
 .text          0x0000000060000094        0xc /usr/lib/gcc/arm-none-eabi/8.3.1/../../../arm-none-eabi/lib/thumb/v7e-m+fp/hard/libc.a(lib_a-atexit.o)
                0x0000000060000094                atexit
 .text          0x00000000600000a0       0x34 /usr/lib/gcc/arm-none-eabi/8.3.1/../../../arm-none-eabi/lib/thumb/v7e-m+fp/hard/libc.a(lib_a-fini.o)
                0x00000000600000a0                __libc_fini_array
 .text          0x00000000600000d4       0x4c /usr/lib/gcc/arm-none-eabi/8.3.1/../../../arm-none-eabi/lib/thumb/v7e-m+fp/hard/libc.a(lib_a-rand.o)
                0x00000000600000d4                srand
                0x00000000600000e4                rand
 .text          0x0000000060000120       0x68 /usr/lib/gcc/arm-none-eabi/8.3.1/../../../arm-none-eabi/lib/thumb/v7e-m+fp/hard/libc.a(lib_a-__atexit.o)
                0x0000000060000120                __register_exitproc
 *(.init)
 .init          0x0000000060000188        0x4 /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crti.o
                0x0000000060000188                _init
 .init          0x000000006000018c        0x8 /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crtn.o
 *(.fini)
 .fini          0x0000000060000194        0x4 /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crti.o
                0x0000000060000194                _fini
 .fini          0x0000000060000198        0x8 /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crtn.o
 *crtbegin.o(.ctors)
 *crtbegin?.o(.ctors)
 *(EXCLUDE_FILE(*crtend.o *crtend?.o) .ctors)
 *(SORT_BY_NAME(.ctors.*))
 *(.ctors)
 *crtbegin.o(.dtors)
 *crtbegin?.o(.dtors)
 *(EXCLUDE_FILE(*crtend.o *crtend?.o) .dtors)
 *(SORT_BY_NAME(.dtors.*))
 *(.dtors)
 *(.rodata*)
 .rodata        0x00000000600001a0        0x4 /usr/lib/gcc/arm-none-eabi/8.3.1/../../../arm-none-eabi/lib/thumb/v7e-m+fp/hard/libc.a(lib_a-impure.o)
                0x00000000600001a0                _global_impure_ptr
 *(.eh_frame*)
 .eh_frame      0x00000000600001a4        0x0 /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crtbegin.o
 .eh_frame      0x00000000600001a4        0x4 /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crtend.o

.glue_7         0x00000000600001a8        0x0
 .glue_7        0x00000000600001a8        0x0 linker stubs

.glue_7t        0x00000000600001a8        0x0
 .glue_7t       0x00000000600001a8        0x0 linker stubs

.vfp11_veneer   0x00000000600001a8        0x0
 .vfp11_veneer  0x00000000600001a8        0x0 linker stubs

.v4_bx          0x00000000600001a8        0x0
 .v4_bx         0x00000000600001a8        0x0 linker stubs

.iplt           0x00000000600001a8        0x0
 .iplt          0x00000000600001a8        0x0 /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crtbegin.o

.text.eventHandlerShim
                0x00000000600001b0       0x3c
 .text.eventHandlerShim
                0x00000000600001b0       0x3c CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj
                0x00000000600001b0                eventHandlerShim

.text.ison      0x00000000600001f0       0x44
 .text.ison     0x00000000600001f0       0x44 CMakeFiles/Life_DEVICE.dir/main.c.obj

.text.val       0x0000000060000240       0x48
 .text.val      0x0000000060000240       0x48 CMakeFiles/Life_DEVICE.dir/main.c.obj

.text.rowsum    0x0000000060000290       0x98
 .text.rowsum   0x0000000060000290       0x98 CMakeFiles/Life_DEVICE.dir/main.c.obj

.text.middlerowsum
                0x0000000060000330       0x74
 .text.middlerowsum
                0x0000000060000330       0x74 CMakeFiles/Life_DEVICE.dir/main.c.obj

.text.doRow     0x00000000600003b0       0xb0
 .text.doRow    0x00000000600003b0       0xb0 CMakeFiles/Life_DEVICE.dir/main.c.obj

.text.randomize
                0x0000000060000460       0x64
 .text.randomize
                0x0000000060000460       0x64 CMakeFiles/Life_DEVICE.dir/main.c.obj

.text.update    0x00000000600004d0       0xdc
 .text.update   0x00000000600004d0       0xdc CMakeFiles/Life_DEVICE.dir/main.c.obj

.text.eventHandler
                0x00000000600005b0       0x54
 .text.eventHandler
                0x00000000600005b0       0x54 CMakeFiles/Life_DEVICE.dir/main.c.obj
                0x00000000600005b0                eventHandler

.text.startup   0x0000000060000604       0x14
 .text.startup  0x0000000060000604       0x14 /usr/lib/gcc/arm-none-eabi/8.3.1/../../../arm-none-eabi/lib/thumb/v7e-m+fp/hard/libc.a(lib_a-__call_atexit.o)

.rel.dyn        0x0000000060000618        0x0
 .rel.iplt      0x0000000060000618        0x0 /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crtbegin.o

.data           0x0000000060000618      0x43c
                0x0000000060000618                __etext = .
                0x0000000060000618                __data_start__ = .
 *(vtable)
 *(.data*)
 .data          0x0000000060000618      0x430 /usr/lib/gcc/arm-none-eabi/8.3.1/../../../arm-none-eabi/lib/thumb/v7e-m+fp/hard/libc.a(lib_a-impure.o)
                0x0000000060000618                _impure_ptr
                0x0000000060000a48                . = ALIGN (0x4)
                0x0000000060000a48                PROVIDE (__preinit_array_start = .)
 *(.preinit_array)
                0x0000000060000a48                PROVIDE (__preinit_array_end = .)
                0x0000000060000a48                . = ALIGN (0x4)
                0x0000000060000a48                PROVIDE (__init_array_start = .)
 *(SORT_BY_NAME(.init_array.*))
 .init_array.00000
                0x0000000060000a48        0x4 /usr/lib/gcc/arm-none-eabi/8.3.1/../../../arm-none-eabi/lib/thumb/v7e-m+fp/hard/libc.a(lib_a-__call_atexit.o)
 *(.init_array)
 .init_array    0x0000000060000a4c        0x4 /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crtbegin.o
                0x0000000060000a50                PROVIDE (__init_array_end = .)
                0x0000000060000a50                . = ALIGN (0x4)
                0x0000000060000a50                PROVIDE (__fini_array_start = .)
 *(SORT_BY_NAME(.fini_array.*))
 *(.fini_array)
 .fini_array    0x0000000060000a50        0x4 /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crtbegin.o
                0x0000000060000a54                PROVIDE (__fini_array_end = .)
                0x0000000060000a54                . = ALIGN (0x4)
                0x0000000060000a54                __data_end__ = .

.tm_clone_table
                0x0000000060000a54        0x0
 .tm_clone_table
                0x0000000060000a54        0x0 /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crtbegin.o
 .tm_clone_table
                0x0000000060000a54        0x0 /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crtend.o

.igot.plt       0x0000000060000a54        0x0
 .igot.plt      0x0000000060000a54        0x0 /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crtbegin.o

.bss            0x0000000060000a54       0x24
                0x0000000060000a54                . = ALIGN (0x4)
                0x0000000060000a54                __bss_start__ = .
 *(.bss*)
 .bss           0x0000000060000a54       0x1c /usr/lib/gcc/arm-none-eabi/8.3.1/thumb/v7e-m+fp/hard/crtbegin.o
 .bss.pdrealloc
                0x0000000060000a70        0x4 CMakeFiles/Life_DEVICE.dir/home/lockbox/d/o/play.date/sdk-1.12.3/C_API/buildsupport/setup.c.obj
 .bss.pd        0x0000000060000a74        0x4 CMakeFiles/Life_DEVICE.dir/main.c.obj
 *(COMMON)
                0x0000000060000a78                . = ALIGN (0x4)
                0x0000000060000a78                __bss_end__ = .

Tying back to the build steps

A lot of the contents in the generated map are due to the gcc build options that were passed in.

  • -ffunction-sections
    • section symbols with names like .text.evenHandlerShim appear because this switch gives each function it's own section
    • notice how symbols like atexit and __libc_fini_array etc. (functions from NOT the play.date build system) only belong to the section .text? They don't have this build switch enabled.
  • -fdata-sections
    • section symbols with names like .bss.pdrealloc and .bss.pd occur because of this switch (see above).

Take note of the .capi section, and how there are ONLY three members (there will always only be three due to link_map.ld above). This is the magic that creates a simple API for all games to run hosted in their runtime.

Packing Step

The packaging step is pretty simple, and uses a bunch of information from the link step.

# copy the section contents from `Life_DEVICE.elf` into a binary blob
/usr/bin/arm-none-eabi-objcopy \
    -Obinary \
    Life_DEVICE.elf \
    /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life/Source/pdex.bin

# Package the binary blob in a .pdx archive
cd /home/lockbox/d/o/play.date/sdk-1.12.3/C_API/Examples/Life && \
/home/lockbox/d/o/play.date/sdk-1.12.3/bin/pdc \
    -sdkpath /home/lockbox/d/o/play.date/sdk-1.12.3 \
    Source \
    Life.pdx

All objcopy is going to do in this step is take the section contents from the elf and dump them into a blob, this is effectively doing a memory map dump to disk. And because in the link step it mashed all the sections next to each other there is one contiguous blob of all our code + literal pools + data etc. No relocations necessary :D

In effect we get the information from game.map actualized into a .bin. The contents of our .bin file are repsented exactly as defined in game.map, the only difference being thatthe beginning of the .bin is (duh) going to be offset 0x0, but will be relocated to 0x60000000 on the actual device.

The pdc invocation when using C only projects basically just packages the bytes into the game archive far as I can tell.

Binary Structure

TLDR; the compiled binary looks like this (Note that address 0x0 is actually 0x60000000 when run on hardware):

layout.svg

What this means is that:

  • the first word in the binary ust be a pointer to our "event handler"
  • the second word points to the start of our program globals
  • the third word points to the end of program globals

Offhand I'm not sure the purpose of the runtime knowing the beginning + end of the program global data, unless they're extending the heap to use that space or something.

What we also learned from the game.map is that we're going to be shoving a ton of extra runtime stuff for the hard float random stuff into the binary we're producing (Life uses rand()). The source for all that included code can be found in the gcc source tree.

Game Binary

When the game is mapped into memory the runtime then calls the game's provided event handler with the init event. So execution flow looks something like:

- loadGame()
- initGame()
    - game.eventHandler(&self, kEventInit, 0)
        - eventHandlerShim(pd, evt, arg)
            - eventHandler(pd, evt, arg)

from setup.c we know that eventHandlerShim() is just:

int eventHandlerShim(PlaydateAPI* playdate, PDSystemEvent event, uint32_t arg)
{
	if ( event == kEventInit )
		pdrealloc = playdate->system->realloc;
	
	return eventHandler(playdate, event, arg);
}

Whih is a little disappointing that they add the extra function call overhead as its only going to be used once per game session. So this is an interesting overhead addition after they (Panic Inc.) have already shown that they are both:

  • willing to add extra sections to emitted game binary
  • willing to arrange sections and add extra pointers to game binary

Some alternatives that could reduce this (small, but passive and unavoidable) runtime overhead (ignoring the rest of the time spent in the play.date OS interrupts + runtime etc.) could be things like:

  • add a pointer to .bss.pdrealloc in .capi, so the runtime can overwite it itself
  • move the realloc function during the link step to change it's section, then overwrite it (though the previous idea would be considerably easier)

Either way, the game program is a self contained image that is not PIC (poisiton independant code), but based at 0x60000000, and is provided a pointer to the playdate api each "interrupt."

Game Console

We know that new games get delivered over Wi-Fi, and that there is an ESP8266 onboard (pg. 2), but there is also no interface to communicate with it via the play.date SDK, so we'll have to do a little digging and try to look at the underlying operating system.

To do that we're going to need to write some code to figure out where the OS is stored, where the heap is, where code objects are, which flash banks are being used, whch hardware is being used etc.

We do know that the ESP8266 is connected via UART, and we also know that there is a peripheral accelerometer onboard as well (and soon a stereo dock, though they redid the FCC filing for it in December 2022 so we probably have to wait a bit).

Finding Valid Memory

To determine valid memory address ranges in RAM, we can start with a couple of easy no-crash-no-risk methods before just yolo touching different regions to see if the play.date crashes or not

First we're going to try

  • Using the PlaydateAPI address to get valid regions
  • Using allocator memory to get a valid region
  • Get the stack memory
  • Look at function addresses to see where valid code space is
  • Walk the stack

Using the PlaydateAPI address

Using the pointer provided to our event handler we can mask off the lower bits and find which RAM bank the heap with the playdate object is in (or stack address / flash address if its not stored on the heap, TBD)

Using some quick n dirty C we can dump the address to the screen like below:

api_address.gif

C snippet:

static void log_addresses(PlaydateAPI *pd)
{
    char * api_base;
    char * pd_address;

    // create the statements
    pd->system->formatString(&api_base, "API Base Address: %p", (uint32_t)pd & 0xf0000000);
    pd->system->formatString(&pd_address, "PD API Address: %p", pd);    

    // write the statements
    write_line(pd, api_base);
    write_line(pd, pd_address);

    // free the statements
    FREE(api_base);
    FREE(pd_address);
}

What this tells us is that the pd api is allocated in SRAM (system SRAM starts at 0x20000000 as per docs section 2.3). Cool, now that we have some progress, we're going to dump some more addresses to see what we're working with.

Using more addresses

Now we're going to still use the pd api address, but now we're going to allocate something on the heap and use that address as well, and allocate something on the stack and use that address (Note that the stack is setup to be only 61800 bytes a la C_API/buildsupport/*.cmake and C_API/Examples/*/Makefile).

So, lets see what we got:

addresses

corresponding snippet:

static void log_addresses(PlaydateAPI *pd)
{
    char * api_base;
    char * pd_address;
    char * heap_obj;
    char * heap_base;
    char * stack_obj;
    char * stack_base;
    char * code_ptr;

    // create the statements
    pd->system->formatString(&api_base, "API Base Address: %p", (uint32_t)pd & 0xf0000000);
    pd->system->formatString(&pd_address, "PD API Address: %p", pd);
    pd->system->formatString(&heap_base, "Heap Base Address: %p", ((uint32_t)api_base & 0xf0000000));
    pd->system->formatString(&heap_obj, "Heap Address: %p", api_base);
    pd->system->formatString(&stack_base, "Stack Base Address: %p", ((uint32_t)&api_base & 0xf0000000));
    pd->system->formatString(&stack_obj, "Stack Address: %p", &api_base);
    pd->system->formatString(&code_ptr, "&formatString: %p", pd->system->formatString);

    // write the statements
    write_line(pd, api_base);
    write_line(pd, pd_address);
    write_line(pd, heap_base);
    write_line(pd, heap_obj);
    write_line(pd, stack_base);
    write_line(pd, stack_obj);
    write_line(pd, code_ptr);

    // free the statements
    FREE(api_base);
    FREE(pd_address);
    FREE(heap_base);
    FREE(heap_obj);
    FREE(stack_base);
    FREE(stack_obj);
    FREE(code_ptr);
}

Cool, so this looks like we have a (so far) deterministic stack address (0x2003807c) for the pd api handle, and that the heap objects we allocate during games are going to be allocated immediately after the game code (remember that the game is mapped to 0x60000000) in ram. Additionally this shows us that the flash region beng used for code (specifically the flash region that pd->system->formatString is in) is the flash bank that starts at 0x8000000.

Flash bank table for reference (section 3.3.1):

flash_blocks.png

So what we can glean from this table (and the fact that we can run code on it) is that we have permission to access the memory and execute it (haven't seen many things that allow execute only permissions), and that the entire flash region is from address 0x08000000 to 0x080FFFFF). So we can probably just dump all that code right now if we wanted to, BUT we still have a fun exercise comming up.

Walking the stack

This is a fun exercise of walking the stack, intuitively we will imagine the stack growing down, as the addresses decrease the more items you add. Note that because we are executing exclusively in Thumb2 mode the calling convention / ABI that will be used on the play.date is documented here.

You should probably skim this real quick before continuing.

Note this fun highlight from the docs:

fp is usually not used in Thumb state

And note the register names they use:

RegisterTPCS nameTPCS role
r0a1argument 1/scratch register/result
r1a2argument 2/scratch register/result
r2a3argument 3/scratch register/result
r3a4argument 4/scratch register/result
r4v1register variable
r5v2register variable
r6v3register variable
r7v4/wrregister variable/work register in function entry/exit
r8(v5)(ARM v5 register, no defined role in Thumb)
r9(v6)(ARM v6 register, no defined role in Thumb)
r10sl (v7)stack limit
r11fp (v8)frame pointer (usually not used in Thumb state)
r12(ip)(ARM ip register, no defined role in Thumb. May be used as a temporary register on Thumb function entry/exit.)
r13spstack pointer (full descending stack)
r14lrlink register
r15pcprogram counter

They double down on the no-frame-pointer-use when discussing control arrival:

At the instant when control arrives at the target function:

    - pc contains the address of an entry point to the target function.

    - lr contains the value to restore to pc on exit from the function
    (the return link value, see The stack backtrace data structure).

    - sp points at or above the current stack limit. If the limit is
    explicit, sp will point at least 256 bytes above it (see The 
    Stack).

    - If the function is built to use a frame pointer register, fp 
    contains 0 or points to the most recently created stack backtrace 
    structure (see The stack backtrace data structure). This is not 
    usual in Thumb state.

    - The space between sp and the stack limit must be readable and 
    writable memory which the called function can use as temporary 
    workspace, and overwrite with any values before the function 
    returns (see The Stack).

Note that argument passing is more or less standard for a computer architecture, (well at least nothing is too out of the ordinary as specified in the argument passing document, though there are many real-world cases where that is not honored, not too worried about that here though).

At the instant control arrives at the target function, the argument 
list is allocated as follows:

    - the first four argument words (or fewer if there are fewer than 
    four argument words remaining in the argument list) are in machine 
    registers a1-a4

    - the remainder of the argument list (if any) is in memory, at the 
    location addressed by sp and higher addressed words thereafter.

    - A floating-point value is treated as one, two, or three integer 
    values, as appropriate to its precision. The TPCS does not support 
    the passing or returning of floating-point values in ARM 
    floating-point registers.

And last but not least, control return

When the return link value for a function call is placed in the pc:

    - sp, fp, sl, v6, v5, and v1-v4 contain the same values as they 
    did at the instant of control arrival. If the function returns a 
    simple value of size one word or less, the value is contained in 
    a1.

    - If the function returns a simple value of size one word or less, 
    then the value must be in a1. A language implementation is not 
    obliged to consider all single-word values simple. See Non-simple 
    value return.

    - If the function returns a simple floating-point value, the value 
    is encoded in a1, {a1, a2}, or {a1, a2, a3}, depending on its 
    precision.

SO, time to cry. basically because by default thumb mode omit's the frame pointer theres not going to be frame's to jump back to. Instead of doing some fun tricks and heuristics to figure out what our frames are, we're just going to dump flash. Because I don't want to think that hard.

But we know that we can read the flash section, and we know that the onboard SRAM is used for the stack, and that the 0x60000000 region that our code gets mapped to is where our heap is.

Next: Dumping Flash

Connected peripherals

ESP chip

pd-OS

As I've been going through this exercise of looking at the play.date console I've been hoping there's some cool name for the underlying operating system. Their Playdate account management and game distribution system is called WOPR, and they made the legendary Untitled Goose Game. Unfortunately the OS is just called Playdate OS.

Anyways, lets take a look at it.

Dumping Flash

From Valid Memory we know that flash starts at 0x08000000 and goes until 0x080FFFFF. So lets write up a quick C stub to dump it over serial, and a python script to pickup the dump.

See pd-usb for details but the high level is that you just need to connect with baud rate 115200 to the play.date, the ID's (just in case you want custom udev rules for mitm serial or something similar, or its not under /dev/ttyACM0) are:

  • Vendor ID: 0x1331
  • Product ID: 0x5740

The general algorithm is very simple:

Note:

Flash has a default value of 0xffffffff, so I used shortcut logic to wait until 100 were read in a row, no need to waste time, though the baud rate is 115200 so at least it moves pretty quick in this case. Many times you need to do this it is prohibitively slow to read + dump flash though.

static void dump_flash(PlaydateAPI *pd)
{
    uint32_t addr;
    uint32_t value = 0;
    uint32_t default_value = 0; // number of consecutive ``0xffffffff``

    // pattern so python script to pickup the dump knows we're printing
    println("0x%08x, 0x%08x, dump_start:\r\n", FLASH_START, FLASH_END);

    // iterate over each word in flash, and dump it
    for (addr = FLASH_START; addr <= FLASH_END; addr += 0x4)
    {
        value = *((uint32_t *)addr);

        // check if ``0xffffffff``
        if (value == 0xffffffff)
        {
            default_value++;
            
            // if there are 100 ``0xffffffff`` in a row then quit,
            // assume that we hit the end of the flash memory used
            if (default_value >= 100)
                break;
        }
        else
        {
            default_value = 0;
        }

        println("0x%08x: 0x%08x\r\n", addr, value);
    }
}

A couple speed bumps along the way:

So I've been using the docs as given in the awesome playdate repo, which started crashing things a bunch when reading from flash 0x08000000 until 0x08050000 which then let me dump until the end of flash, so I went back through the FCC filing photos to see the actual MCU onboard.

The actual MCU is STM23f746ie, which really doesn't change anything wrt what docs to use. So for the moment I'm still stumped why it's crashing attempting to read flash below 0x08050000.

Then I found this,

Step 1: get the typedef for the System Flash Registers:

// github STM32CubeF7/Drivers/CMSIS/Device/ST/STM32F7xx/Include/stm32f7xx.h
typedef struct
{
  __IO uint32_t ACR;      /*!< FLASH access control register,     Address offset: 0x00 */
  __IO uint32_t KEYR;     /*!< FLASH key register,                Address offset: 0x04 */
  __IO uint32_t OPTKEYR;  /*!< FLASH option key register,         Address offset: 0x08 */
  __IO uint32_t SR;       /*!< FLASH status register,             Address offset: 0x0C */
  __IO uint32_t CR;       /*!< FLASH control register,            Address offset: 0x10 */
  __IO uint32_t OPTCR;    /*!< FLASH option control register ,    Address offset: 0x14 */
  __IO uint32_t OPTCR1;   /*!< FLASH option control register 1 ,  Address offset: 0x18 */
} FLASH_TypeDef;

Well so we're still crashing, heck. So they probably are using the Memory Protection Unit. At least we could dump user flash. Time to try to dump some more stuff and move on.

See if we can dump OTP memory, though it might also be protected. It starts at 0x1FF0F000 and is (16 * 64) bytes (16 banks):

#define OTP_START 0x1FF0F000UL
#define OTP_END OTP_START + (16 * 64)

static void dump_otp(PlaydateAPI *pd)
{
    uint32_t addr;
    uint32_t value = 0;

    println("0x%08x, 0x%08x, dump_start:\r\n", OTP_START, OTP_END);

    for (addr = OTP_START; addr <= OTP_END; addr += 0x4)
    {
        value = *((uint32_t *)addr);

        // print value
        println("%08x\r\n", value);
    }
}

F.

Still crashing, tho duh we're still not privileged.

Time to go about this a different way - what if their kernel doesn't do usermode pointer validation?

Lets try again:

static void write_kmem_to_file(PlaydateAPI *pd, uint32_t base, uint32_t size, const char *fname)
{
    SDFile *file = pd->file->open(fname, kFileWrite);

    // make sure we can actually write to a file
    if (file == NULL)
    {
        WRITE_FSTRING(pd, "%s was unable to be opened because:\n\t%s", fname, pd->file->geterr());
        return;   
    }

    // tell the kernel to write to the file
    if ((-1 == pd->file->write(file, (void*)base, size)) == 1)
    {
        WRITE_FSTRING(pd, "ERROR when writing to file:\n\t%s", pd->file->geterr());
        return;
    }

    // close the file
    pd->file->close(file);
}


static void dump_otp(PlaydateAPI *pd)
{
    WRITE_FSTRING(pd, "Attempting to read memory from %p to %p", OTP_START, OTP_END);
    write_kmem_to_file(pd, OTP_START, OTP_END - OTP_START, "otp_mem.bin");
}

Still crashing, though after looking back through the re notes I noticed

  1. they do use OTP (cool) and
  2. once upon a time they didn't validate the pointers (or something to that effect - most likely they MPU wasn't actually enabled).

Upon further attempts I'm assuming that any memory accessed by the kernel for the process is limited to the program stack / heap as the following did not work.

  • start of flash 0x08000000
  • known kernel addresses (from forcing crashes) 0x0803xxxx
  • known user space addresses (from pd api) 0x0805xxxx
  • system registers
  • OTP memory

So after perusing the forums a bit it's using (as i actually assumed right for once) FreeeRTOS. And now it looks like the new play.date versions are using (working) MPU builds, rude.

Oh well, got user mode dumps, time to RE a bit.

Next: Triaging User Flash Dump

skeletor.gif

Triaging User Mode Dump

Before pigeon-holing too hard on something in particular it's good to go over the goals of what you're doing and what a decent path forward might actually be. This is especially important for things like reversing an entire OS / bare metal system where you have interrupts firing all over the place, a ton of third party code, vendor SDK code, and the actual target code you care about.

SO, in the spirit of "seeing where we're at", we can take a step back and go over the goals again:

  1. Figure out how to talk to Wi-Fi
  2. Figure out the limits of being "unprivileged"
  3. Kernel mode dump (disclaimer, its not hard - but most likely unnecessary for Wi-Fi)
  4. Reverse all the things

So focusing on task 1, there are a few sub tasks that will probably lead to most of the others getting done in the process.

  • Label all the public SDK functions
  • Label FreeRTOS functions
  • If there are any in user space, find the memfault functions
  • Find each of the running threads
  • Find the thread that talks to the ESP wifi chip on board
  • Figure out which IPC method they use to control that thread

The high level workflow for the intial triage of user space is going to roughly follow:

  • look @ PlaydateSimulator.debug for type information we can maybe use ( or at least get a reference for their programming paradigms)
  • make a playdate game to dump addresses for all of the documented API's
    • im assuming that some scoreboard api has some Wi-Fi things under the hood will probably get us close
  • look at FreeRTOS source to figure out what some of the syscalls are, then work up to the pd api functions
  • if there's 0 progress:
    • we can reverse some of the season games to figure out what's going on (but that's cheating :p )
    • look for hardcoded AT commands, tried and true "grep flag" basically always works, but is suuuper anti-climactic

So without further ado, let's jump in

Baby's first binary dump

We'll use binary.ninja to reverse the code on the playdate, note that we're in thumb2 mode and our starting offset is not 0. So when we first open the binary dump we should see this:

first_open.png

Make sure to change the image base address to 0x08050000.

Also switch it to disassembly so things look clearer - binary ninja is not best suited for non standard targets, and all disassemblers are notoriously bad at arm / thumb / vfp, so we'll take it easy on it. Notice how all of the functions aren't named, and things are all over the place (duh its a binary dump). We have some work to do.

Step 1 - Applying debug types

The first step is going to be installing the dwarf debug info plugin that vector35 published, then ugh restarting binary ninja. But afterwards we can now import debug info from the shipped external PlaydateSimulator.debug file from the SDK.

./import_debug_info

And now all the types are applied we can export the types to a header file.

Turning the provided dwarf2 debug file into a header we get this. We can import the header into binary ninja with some extra flags [as documented below], and then we will have those as needed hen we find stuff. Note that we probably won't use those types much in user space but better safe than sorry. Also now that binary ninja literally just stabilized the header / type importing I figured I'd go ahead and give it a whirl.

Step 1.5 - Importing the SDK C Types

The more useful thing to import into binary ninja for looking at user space is going to be the C_API (specifically the pd_api.h header). Note that binary ninja uses clang type parser for their header importing, so now we can import the headers with the arguments from the compiler like so:

Import selection is here:

./import_headers_select.png

Then add the compiler options:

./import_headers_1.png

And it worked (this time), though it failed the first time I tried to, and reading the documentation shows that there are extra arguents necessary to add the system types, so you might need to add those. On my machine it looked like:

./import_headers_2.png

Now we can directly use the types directly (instead of hand-jamming them) for the next step:

Step 2 - Dumping the API addresses

The Playdate API is super nice and user friendly, and is half the reason I started writing all this stuff up. So now that we've imported all the Playdate C_API types and functions we need to dump the addresses from a real device so we can properly label them in the memory dump we're reversing in binary ninja.

So to do this we're going to dump all of the tables in the main PlaydateAPI struct (each of the struct members is a pointer to anotherb struct of function pointers), and afterwards we will use that to label the corresponding functions in binary ninja.

Corresponding code snippet:

/**
 * \brief macro to dump pointer tables, looks super fancy but all its really doing
 * is:
 *   - starting a json object for the table with the provided name, address, and size.
 *   - dumping each word as 4 zero padded bytes in a hex string as a part of the data
 *     array in the json object
 */
#define LOG_TABLE(__pd, __name, __sz, __ptr)                                                \
    do                                                                                      \
    {                                                                                       \
        WRITE_FCONSOLE(__pd,                                                                \
                       "{\"name\": \"%s\", \"address\": \"%p\", \"size\": %d, \"data\": [", \
                       __name,                                                              \
                       (void *)__ptr,                                                       \
                       __sz);                                                               \
        for (int i = 0; i < __sz / 4; i++)                                                  \
        {                                                                                   \
            WRITE_FCONSOLE(__pd, "\"%08x\",", *(((uint32_t *)__ptr) + i));                  \
        }                                                                                   \
        WRITE_FCONSOLE(__pd, "%s", "]}");                                                   \
    } while (0)

/**
 * \brief dumps a jsonl of symbols to the console
 */
static void log_addresses(PlaydateAPI *pd)
{
    LOG_TABLE(pd, "pd->system", sizeof(struct playdate_sys), pd->system);
    LOG_TABLE(pd, "pd->file", sizeof(struct playdate_file), pd->file);
    LOG_TABLE(pd, "pd->graphics", sizeof(struct playdate_graphics), pd->graphics);
    LOG_TABLE(pd, "pd->sprite", sizeof(struct playdate_sprite), pd->sprite);
    LOG_TABLE(pd, "pd->display", sizeof(struct playdate_display), pd->display);
    LOG_TABLE(pd, "pd->sound", sizeof(struct playdate_sound), pd->sound);
    LOG_TABLE(pd, "pd->lua", sizeof(struct playdate_lua), pd->lua);
    LOG_TABLE(pd, "pd->json", sizeof(struct playdate_json), pd->json);
    LOG_TABLE(pd, "pd->scoreboards", sizeof(struct playdate_scoreboards), pd->scoreboards);
    WRITE_FSTRING(pd, "Wrote tables to console", "");
}

This will output json for each pointer table, and give us most of the information we need without doing toooo much work in C. (initially I started dumping the name of each function as well but that started to be too much in C).

Also I started to get lazy, so we'll just look at the playdate simulator console and look at the output to line it up with the api headers and binary ninja.

Each function table is going to get dumped in order, so we can just walk down the list of functions in each struct and directly apply things in binry ninja.

Snapshot:

./dumping_api_addresses.png

Cool, we can go back to binary ninja and apply the playdate_sys struct type to the address listed, and hopefully that will just label all the functions for us like I've previously done in ghidra.

./pd_sys_type_fail.png

Excuse me tf. The endianness is wrong for the addresses :facepalm:. Joy. Well I don't know how to fix that (after digging around a bit I couldn't figure it out, not going to worry about it though since it doesn't really break anything and you can manually add cross-references - the answer is probably defining a custom data section that is big endian).

What we can do is manually create the references to all the functions, and manually define functions at the necessary addresses (note because we are in thumb mode when we create addresses at the odd addresses binary ninja will correctly create the function at the odd address - 1, see here for computer engineers getting carried away). Cool let's look at realloc and see if it worked:

./realloc_fail

Ummm, great another bug. Failing to disassemble. Fantastic. After repo-ing ghidra, objdump, capstone and ida, I'm thinking it's probably an issue with a hard float instructions (it always is lmao).

The hard float abi used for this mcu is FPv5. So looks like we're going to need to make some MR's (or PR's in github land). yay.

See Triaging Hard Float Disassembly for more.

In the mean time we can finish labelling all the functions and see where we stand.

Next: Searching for FreeRTOS code

Searching for FreeRTOS code

So starting off, we know a couple things, we have a memory dump, and we know that the processor is the STM32F746IE, which has internal flash range from 0x08000000 to 0x080FFFFF inclusive. We are able to dump flash starting at 0x08050000, and we know that the MPU is enabled. We are (currently) trying to figure out where the FreeRTOS code is stored in the flash dump.

Let's see where some default FreeRTOS builds for ARM Cortex M7 put their code. In the FreeRTOS source tree, the demo projects are in the path demo/<platform-info>, so we'll see if we can find a couple MPU + Cortex M7 projects and see if they have attached linker scripts.

Bingo.

What we can guesstimate off of these (and the fact that we can read half the flash space) is that the protected "kernel" code is in the first half, and that user mode is in the second half of flash. Looking further at the linker scripts (the first one linked mainly), we can see that they shove the syscalls at the beginning of the user flash region like so:

  /* The program code and other data into "FLASH" Rom type memory */
  .text :
  {
    /* Place the FreeRTOS System Calls first in the unprivileged region. */
    . = ALIGN(4);
    __syscalls_flash_start__ = .;
    *(freertos_system_calls)
    __syscalls_flash_end__ = .;
    . = ALIGN(4);
    *(.text)           /* .text sections (code) */
    *(.text*)          /* .text* sections (code) */
    *(.glue_7)         /* glue arm to thumb code */
    *(.glue_7t)        /* glue thumb to arm code */
    *(.eh_frame)

    KEEP (*(.init))
    KEEP (*(.fini))

    . = ALIGN(4);
    _etext = .;        /* define a global symbols at end of code */
  } >FLASH

So we can probably just make a sample project for ourselves and then use the generated .map listing files to see which syscalls are included and if any of the code looks similar.

Except, I'm lazy and I don't want to do it by hand. So we're going to attempt to make binary ninja signatures with sigkit and go hunting for syscalls.

First though, we're going to do some basic recon:

From the MPU Syscalls page:

When the MPU is used the standard (non MPU) API functions are mapped to equivalents that start "MPU_", the prototypes for which are defined in this header files. This will cause the application code to call the MPU_ version which wraps the non-MPU version with privilege promoting then demoting code, so the kernel code always runs will full privileges.

This sounds super jank lol, in an ideal secure world the user thread should not be getting promoted ever. Here's to minimal os security woop woooppp

grep == win

First I'm going to see if there are any SVC invocations so we know what we're dealing with. To do that (instead of a cool way that is better like a python script in binary ninja so we know what functions own the svc calls), we're going to disassemble all of userspace bin, then grep for all the syscall instructions, then trim down to the unique syscalls, and then count the lines.

arm-none-eabi-objdump -D -b binary -marm -Mforce-thumb --adjust-vma=0x08050000 0x8050000_0x80fffff_playdate.bin | grep svc | sed -e"s/;//p" | awk '{$1=$1};1' | cut -d ' ' -f 4 | sort | uniq | wc -l
$ 66

So our kernel surface area is 66 syscalls, by default FreeRTOS ships with a couple depending on the platform, and has a ton pre-implemented (but not assigned to numbers afaics) so theres a lot of things going on here, cool. I'm assuming that the pre-implemented syscalls are going to be lower numbers and any custom ones are descending starting from 0xff (255).

Reminder of our goals here - we know that only season games can use internet, and we can safely assume that there is some functionality on board to do that - the only question is going to be if that is exposed to games in userspace or if it is hidden behind a syscall or completely handled in kernel space.

And we need find a way to automatically signature a bunch of functions so we don't have to waste a bunch of time doing boring RE. So we're going to build an example FreeRTOS project with MPU enabled and all of the syscalls, then build signatures from that and attempt to locate any code in our userspace dump.

Building FreeRTOS

TODO

Example FreeRTOSConfig.h

Errata

Random things required to actually RE the playdate.

thar be dragons here

Triaging Hard Float Disassembly

Anytime anyone tells you "I had to look at [arbitrary compiler] source today", that's a bad sign. It's like if someone told you they pooped today, what is so wrong that there was a need to notify you that they pooped???

Well, I had to look at gcc source, and it's because disassembler's don't support ARM FPv5 (note that gdb libopcode seem's to albeit associates this FPU version with ARMv8 - but at least there's that).

What is ARM FPv5 and why does this matter

For the STM32F746IE MCU (our target processor), the Playdate uses FPv5 single-precision floating point support, therefore there are fancy instructions that aren't supported that ruin all our disassembly.

Hard vs Soft Float

They do the same math, except Hard float is implemented in Hard ware, and Soft float is implemented in Soft ware. TLDR; Hard float is orders of magnitude faster, at the cost of higher power consumption and a more expensive MCU etc.

FPv5 Specification

Architecture Reference Manual - chapter 6 is The Floating-point Instruction Set Encoding

Floating Point Instructions

WTF are they doing at ARM (and can I have some)

Primer: ARM mode vs. Thumb Mode

Primer: What are FPU, VFP, ASE, NEON, MPE, SVE, SME, MVE, and VPU?

simd + neon + ase + mpe + vfp(v1 -> v5) + vfp11 + mve can u just not, plz and ty.

Straight to the (gcc) source

"why not just use the docs + libopcodes if they already support it"

Because sister, I have seen the beyond and I'm sick of the docs being different from real life. That and an incorrect lifter is worse than an absent lifter.

So, I'm going to use the thing that emits the actual code to add FPv5 support.

For science.

And because I hate myself.

Overview of gcc source tree

Source tree

  • Top Level docs: link
  • What we care about: link
    • specifically we want to know how a gcc backend is created (ie ARM)
    • note that a frontend is something like the C programming language, while the backend is a platform / mcu architecture

Overview of a gcc backend

GCC Backend docs

Of note: >A directory machine under gcc/config, containing a machine description machine.md file (see Machine Descriptions), header files machine.h and machine-protos.h and a source file machine.c (see Target Description Macros and Functions), possibly a target Makefile fragment t-machine (see The Target Makefile Fragment), and maybe some other files. The names of these files may be changed from the defaults given by explicit specifications in config.gcc.

We're mostly concerned with how instructions are emitted and what they look like, so we want to look at the machine descriptions.

Overview of ARM gcc backend

How does GCC do ARM FPv5

Next: FPv5 meet World

FPv5 meet world

Opcode patterns from the docs

Going through the ARMv7M spec there are a few things we need to keep in mind:

Instruction alignment / byte ordering:

insn_alignment.png

The important bits are the "byte at address A + n"

When reading the instruction decodings from the manual:

fpu_register_formats.png

Rude "syntax flexibilities" to make things oh so much clearer:

syntax_flexiblities

FPU register encoding

fpu_register_encoding

Opcode patterns from gcc

FPv5 meet Binary Ninja

FPv5 meet Ghidra

FPv5 meet objdump