Darling Docs

This is the main documentation source for Darling.

It can be edited on GitHub.

Using Darling

Build Instructions

You must be running a 64-bit x86 Linux distribution. Darling cannot be used on a 32-bit x86 system, not even to run 32-bit applications.

Dependencies

Darling must be compiled with Clang, version 6 or newer. You can force a specific version of Clang (if it is installed on your system) by editing Toolchain.cmake.

A minimum of 4 GB of RAM is also required for building. Swap space can help avoid running out of memory, but is likely to slow the build down significantly.

Linux 5.0 or higher is required.
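
The architecture, kernel and memory requirements above can be checked before installing any dependencies. The following is a minimal sketch and not part of Darling: the helper names arch_ok and kernel_ok are made up for illustration, and the script only reads uname and /proc/meminfo.

```shell
#!/bin/sh
# Hypothetical pre-flight check; the helper names are illustrative only.

# arch_ok MACHINE: succeed only for 64-bit x86, as reported by `uname -m`.
arch_ok() {
    [ "$1" = "x86_64" ]
}

# kernel_ok RELEASE: succeed if the major version of a release string such as
# "5.15.0-91-generic" is at least 5 (that is, Linux 5.0 or higher).
kernel_ok() {
    major=${1%%.*}
    [ "$major" -ge 5 ] 2>/dev/null
}

if arch_ok "$(uname -m)"; then
    echo "architecture: ok"
else
    echo "architecture: $(uname -m) is not x86_64"
fi

if kernel_ok "$(uname -r)"; then
    echo "kernel: ok"
else
    echo "kernel: $(uname -r) is older than 5.0"
fi

# MemTotal is reported in kB; print it in MiB for comparison with the 4 GB guideline.
if [ -r /proc/meminfo ]; then
    awk '/^MemTotal:/ { printf "memory: %d MiB total\n", $2 / 1024 }' /proc/meminfo
fi
```

The script only reports its findings and never aborts, so it is safe to run on any machine.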

Debian 10/11

$ sudo apt install cmake clang-6.0 bison flex xz-utils libfuse-dev libudev-dev pkg-config \
libc6-dev-i386 libcap2-bin git git-lfs python2 libglu1-mesa-dev libcairo2-dev \
libgl1-mesa-dev libtiff5-dev libfreetype6-dev libxml2-dev libegl1-mesa-dev libfontconfig1-dev \
libbsd-dev libxrandr-dev libxcursor-dev libgif-dev libpulse-dev libavformat-dev libavcodec-dev \
libswresample-dev libdbus-1-dev libxkbfile-dev libssl-dev

Debian Testing

$ sudo apt install cmake clang-9 bison flex xz-utils libfuse-dev libudev-dev pkg-config \
libc6-dev-i386 libcap2-bin git git-lfs python2 libglu1-mesa-dev libcairo2-dev \
libgl1-mesa-dev libtiff5-dev libfreetype6-dev libxml2-dev libegl1-mesa-dev libfontconfig1-dev \
libbsd-dev libxrandr-dev libxcursor-dev libgif-dev libpulse-dev libavformat-dev libavcodec-dev \
libswresample-dev libdbus-1-dev libxkbfile-dev libssl-dev

Ubuntu 18.04/20.04

$ sudo apt install cmake clang bison flex libfuse-dev libudev-dev pkg-config libc6-dev-i386 \
gcc-multilib libcairo2-dev libgl1-mesa-dev libglu1-mesa-dev libtiff5-dev \
libfreetype6-dev git git-lfs libelf-dev libxml2-dev libegl1-mesa-dev libfontconfig1-dev libbsd-dev \
libxrandr-dev libxcursor-dev libgif-dev libavutil-dev libpulse-dev libavformat-dev libavcodec-dev \
libswresample-dev libdbus-1-dev libxkbfile-dev libssl-dev python2

Arch Linux & Manjaro

$ sudo pacman -S --needed make cmake clang flex bison icu fuse gcc-multilib \
lib32-gcc-libs pkg-config fontconfig cairo libtiff python2 mesa llvm libbsd libxkbfile \
libxcursor libxext libxkbcommon libxrandr ffmpeg git git-lfs

Fedora and CentOS

RPMFusion is required for FFmpeg.

$ sudo dnf install make cmake clang bison dbus-devel flex python2 glibc-devel.i686 fuse-devel \
systemd-devel elfutils-libelf-devel cairo-devel freetype-devel.{x86_64,i686} \
libjpeg-turbo-devel.{x86_64,i686} libtiff-devel.{x86_64,i686} fontconfig-devel.{x86_64,i686} \
libglvnd-devel.{x86_64,i686} mesa-libGL-devel.{x86_64,i686} mesa-libEGL-devel.{x86_64,i686} \
mesa-libGLU-devel.{x86_64,i686} libxml2-devel libbsd-devel git git-lfs libXcursor-devel libXrandr-devel giflib-devel \
ffmpeg-devel pulseaudio-libs-devel libxkbfile-devel openssl-devel llvm libcap-devel

OpenSUSE Tumbleweed

You will need to build Darling with only the 64-bit components. See Build Options for instructions.

$ sudo zypper install make cmake-full clang10 bison flex python-base glibc fuse-devel libsystemd0 \
libelf1 cairo-devel libfreetype6 libjpeg-turbo libfontconfig1 libglvnd Mesa-libGL-devel \
Mesa-libEGL-devel libGLU1 libxml2-tools libbsd-devel git git-lfs libXcursor-devel giflib-devel ffmpeg-4 \
ffmpeg-4-libavcodec-devel ffmpeg-4-libavformat-devel libpulse-devel pulseaudio-utils libxkbfile-devel \
openssl llvm libcap-progs libtiff-devel libjpeg8-devel libXrandr-devel dbus-1-devel glu-devel ffmpeg-4-libswresample-devel

Alpine Linux

Make sure to enable the community repository. Alpine also doesn't support 32-bit builds, so disable the 32-bit components as described in Build Options.

$ sudo apk add cmake clang bison flex xz fuse-dev pkgconfig libcap git git-lfs python2 python3 glu-dev \
cairo-dev mesa-dev tiff-dev freetype-dev libxml2-dev fontconfig-dev libbsd-dev libxrandr-dev libxcursor-dev \
giflib-dev pulseaudio-dev ffmpeg-dev dbus-dev libxkbfile-dev openssl-dev libexecinfo-dev make gcc g++ xdg-user-dirs

These are the minimum requirements for building and running Darling on Alpine. Of course, if you want to run GUI applications, you'll also need a desktop environment.

Fetch the Sources

Darling uses git-lfs. If needed, set it up by following the official instructions.

Darling makes extensive use of Git submodules, therefore you cannot use a plain git clone. Make a clone like this:

$ git clone --recursive https://github.com/darlinghq/darling.git

Attention: The source tree requires up to 5 GB of disk space!

Updating sources

If you have already cloned Darling and would like to get the latest changes, do this in the source root:

$ git lfs install
$ git pull
$ git submodule init
$ git submodule update

Build

The build system of Darling is CMake. Makefiles are generated by CMake by default.

Attention: The build may require up to 16 GB of disk space! The Darling installation itself then takes up to 1 GB.
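
Free disk space can be checked before starting. This is a minimal sketch, assuming the build directory will live on the same file system as the current directory; free_kb and has_free_gb are hypothetical helper names, and the 16 GB figure is the one quoted above.

```shell
#!/bin/sh
# free_kb DIR: print the space available on the file system holding DIR,
# in 1024-byte blocks (POSIX output format of df).
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_free_gb DIR GB: succeed if at least GB gigabytes are available under DIR.
has_free_gb() {
    [ "$(free_kb "$1")" -ge $(( $2 * 1024 * 1024 )) ]
}

# Checking the current directory is an assumption of this sketch.
if has_free_gb . 16; then
    echo "enough free space for the build"
else
    echo "less than 16 GB free in $(pwd)"
fi
```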

Building and Installing

Now let's build Darling:

# Move into the cloned sources
$ cd darling

# Make a build directory
$ mkdir build && cd build

# Configure the build
$ cmake ..

# Build and install Darling
$ make
$ sudo make install

Build Options

Doing non-full (a.k.a. shallow) builds

You will notice that it takes a long time to build Darling. Darling contains the software layer equivalent to an entire operating system, which means it contains a large amount of code. You can optionally disable some large and less vital parts of the build in order to get faster builds.

To do this, use the -DFULL_BUILD=OFF option when configuring Darling through CMake.

You may find that some things are missing, such as JavaScriptCore. Before creating an issue about a certain library or framework missing from Darling, verify that you are doing a full build by not using this option or setting it to ON.

Disabling 32-bit Libraries

Darling normally builds both 32-bit and 64-bit versions of all libraries, to enable 32-bit programs to run under Darling. However, this means Darling also requires 32-bit versions of certain native libraries. If you can't set up a multilib environment or you just want to build only the 64-bit components, use -DTARGET_i386=OFF during configuration to disable building the 32-bit components.

Parallel Builds

Another way to speed up the build is to run make with multiple jobs. For this, run make -j8 instead, where 8 is the number of concurrent jobs to run, which you can choose freely. In general, avoid running more jobs than twice the number of CPU cores of your machine.
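
The job count can be derived from the machine instead of being hard-coded. A small sketch, assuming nproc (GNU coreutils) is available; max_jobs is a hypothetical helper that applies the "twice the cores" upper bound mentioned above.

```shell
#!/bin/sh
# max_jobs CORES: print the upper bound suggested above, twice the number of cores.
max_jobs() {
    echo $(( $1 * 2 ))
}

# nproc prints the number of available processing units; fall back to 1 without it.
cores=$(nproc 2>/dev/null || echo 1)

# Print the command instead of running it, so the sketch is safe to try anywhere.
echo "suggested: make -j$cores (at most -j$(max_jobs "$cores"))"
```

Keep in mind that each job needs memory, so on a machine close to the 4 GB minimum a lower job count may be the better choice.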

"Unified" JavaScriptCore Builds

If you still want to build JavaScriptCore and have a bit of RAM to spare, JavaScriptCore also supports a build mode known as "unified builds". This build mode can cut JSC build times in half, at the expense of causing slightly higher RAM usage. This build mode can be enabled in Darling by adding -DJSC_UNIFIED_BUILD=ON when configuring the build.

Debug Builds

By default, CMake sets up a non-debug, non-release build. If you run LLDB and encounter messages indicating a lack of debug symbols, make sure you are doing a debug build. To do this, use -DCMAKE_BUILD_TYPE=Debug.

Unit tests

Darling has a limited number of unit tests. These are not currently built by default, but this can be enabled with -DENABLE_TESTS=1. These tests are then installed to /usr/libexec within your Darling container.

Additional, Non-standard Binaries

Darling tries to stick to a standard macOS installation as much as possible. However, if you would like to build and install some additional packages (such as GNU tar), you can add -DADDITIONAL_PACKAGES=ON.

Custom Installation Prefix

To install Darling in a custom directory, use the CMAKE_INSTALL_PREFIX CMake option. However, a Darling installation is NOT portable, because the installation prefix is hardcoded into the darling executable. This is intentional. If you do move your Darling installation, you will get this error message:

Cannot mount overlay: No such file or directory
Cannot open mnt namespace file: No such file or directory

If you wish to properly move your Darling installation, the only supported option is for you to uninstall your current Darling installation, and then rebuild Darling with a different installation prefix.
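
The options in this section can be combined in a single configure step. The sketch below only assembles and prints the cmake command line, assuming it is run from the build directory created earlier. The helper cmake_args and its keywords (shallow, 64bit, debug, tests, prefix=DIR) are hypothetical; the -D options are the ones described above.

```shell
#!/bin/sh
# cmake_args KEYWORD...: print a CMake argument list for the given keywords.
# Prefix paths containing spaces are not handled by this sketch.
cmake_args() {
    args=".."
    for opt in "$@"; do
        case $opt in
            shallow)  args="$args -DFULL_BUILD=OFF" ;;
            64bit)    args="$args -DTARGET_i386=OFF" ;;
            debug)    args="$args -DCMAKE_BUILD_TYPE=Debug" ;;
            tests)    args="$args -DENABLE_TESTS=1" ;;
            prefix=*) args="$args -DCMAKE_INSTALL_PREFIX=${opt#prefix=}" ;;
            *)        echo "unknown option: $opt" >&2; return 1 ;;
        esac
    done
    echo "$args"
}

# Print the resulting command rather than running it.
echo "cmake $(cmake_args shallow 64bit debug)"
```

Printing the command first makes it easy to review the chosen options before committing to a long build.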

Known Issues

BackBox

If your distribution is BackBox and you run into build issues, try the following commands:

$ sudo update-alternatives --install /usr/bin/clang clang /usr/bin/clang-6.0 600
$ sudo update-alternatives --install /usr/bin/clang++ clang++ /usr/bin/clang++-6.0 600

SELinux

On SELinux you may see the following error when starting Darling:

Cannot open mnt namespace file: No such file or directory

To work around this try this command: setsebool -P mmap_low_allowed 1.

File System Support

Darling uses overlayfs for implementing prefixes on top of the macOS-like root filesystem. While overlayfs is not very picky about the lower (read-only) filesystem (where your /usr lives), it has stricter requirements for the upper filesystem (your home directory, unless you override the DPREFIX environment variable).

To quote the kernel documentation:

The lower filesystem can be any filesystem supported by Linux and does not need to be writable. The lower filesystem can even be another overlayfs. The upper filesystem will normally be writable and if it is it must support the creation of trusted.* extended attributes, and must provide valid d_type in readdir responses, so NFS is not suitable.

In addition to NFS not being supported, ZFS and eCryptfs encrypted storage are also known not to work.

If you try to use an unsupported file system, this error will be printed:

Cannot mount overlay: Invalid argument
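
To find out in advance which file system would serve as the upper layer, you can query the directory that will hold the prefix. This is a sketch under assumptions: it relies on stat -f -c %T from GNU coreutils, the helper name fs_known_bad is made up, and the list covers only the file systems named above, so a type that is not listed is not guaranteed to work.

```shell
#!/bin/sh
# fs_known_bad TYPE: succeed if TYPE is one of the file systems named above
# as not working for the upper layer.
fs_known_bad() {
    case $1 in
        nfs*|zfs|ecryptfs) return 0 ;;
        *) return 1 ;;
    esac
}

# The upper file system is where the prefix lives: DPREFIX if set,
# otherwise the home directory.
dir=${DPREFIX:-${HOME:-/}}

# stat -f -c %T prints the file system type name; use "unknown" if it fails.
fstype=$(stat -f -c %T "$dir" 2>/dev/null || echo unknown)

if fs_known_bad "$fstype"; then
    echo "$dir is on $fstype, which is known not to work as the upper file system"
else
    echo "$dir is on $fstype, which is not on the known-bad list"
fi
```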

Building for the WSL

The Windows Subsystem for Linux (or WSL for short) allows you to run Linux programs on a Windows system.

Darling should work with no problems on WSL2, as this is essentially a standard virtual machine with a full Linux kernel. WSL1 is more complicated and is not currently working.

As Darling no longer uses a Linux Kernel Module, no special instructions are needed for either WSL1 or WSL2. Use the standard Linux build instructions.

Uninstall

This page applies only if Darling was built and installed manually as described in Build Instructions. If you installed Darling through a package manager, please remove the related packages using that package manager.

Uninstall commands

The following commands will completely remove Darling. Replace darling in the first command with the path to your local copy of the Darling source code.

$ cd darling
$ tools/uninstall

Darling shell

We plan to implement a nice and user-friendly GUI for Darling, but for now the primary way to use Darling and interact with it is via the Darling shell.

Basic usage

To get a shell inside the container, just run darling shell as a regular user. Behind the scenes, this command will start the container or connect to an already-running one and spawn a shell inside. It will also automatically initialize the prefix contents if needed.

Inside, you'll find an emulated macOS-like environment. macOS is Unix-like, so most familiar commands will work. For example, it may be interesting to run ls -l /, uname and sw_vers to explore the emulated system. Darling bundles many of the command-line tools that macOS ships, in the same ancient versions. The shell itself is Bash version 3.2.

The filesystem layout inside the container is similar to that of macOS, including the top-level /Applications, /Users and /System directories. The original Linux filesystem is visible as a separate partition that's mounted on /Volumes/SystemRoot. When running macOS programs under Darling, you'll likely want them to access files in your home folder; to make this convenient, there's a LinuxHome symlink in your Darling home folder that points to your Linux home folder, as seen from inside the container; additionally, standard directories such as Downloads in your Darling home folder are symlinked to the corresponding folders in your Linux home folder.

Running Linux Binaries

You can run normal Linux binaries inside the container, too. They won't make use of Darling's libraries and system call emulation and may not see the macOS-like environment:

$ darling shell
Darling [~]$ uname
Darwin
Darling [~]$ /Volumes/SystemRoot/bin/uname
Linux

Becoming root

Should you encounter an application that bails out because you are not root (typically because it needs write access outside your home directory), you can use the fake sudo command. It is fake, because it only makes getuid() and geteuid() system calls return 0, but grants you no extra privileges.

Examples

  • darling shell: Opens a Bash prompt.
  • darling shell /usr/local/bin/someapp arg: Executes /usr/local/bin/someapp with an argument. Note that the path is evaluated inside the Darling prefix. The command is started through the shell (uses sh -c).
  • darling ~/.darling/usr/local/bin/someapp arg: Equivalent to the previous example, except that it doesn't go through the shell, assuming that the prefix is ~/.darling.

Darling prefix

A Darling prefix is a container overlaid on top of a base macOS-like root file system located in $installation_prefix/libexec/darling. The default prefix location is ~/.darling and this can be controlled with the DPREFIX environment variable, very much like WINEPREFIX under Wine.

Note that in order to change the prefix location with DPREFIX, you should export this variable in the current shell before running Darling. Using it only when running Darling (e.g. DPREFIX=foo darling shell) will not work as expected.
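
The rule above can be written down as a short sketch. The helper prefix_path is hypothetical and simply mirrors the documented behaviour (DPREFIX if set, otherwise ~/.darling); treating an empty DPREFIX as unset is an assumption, and the prefix path is only an example.

```shell
#!/bin/sh
# prefix_path: print the prefix location according to the rule above:
# the value of DPREFIX if it is set (and non-empty), otherwise ~/.darling.
prefix_path() {
    echo "${DPREFIX:-$HOME/.darling}"
}

# Export the variable first, then run Darling commands from the same shell.
export DPREFIX="$HOME/prefixes/test"
echo "prefix in use: $(prefix_path)"

# From here on, Darling commands started in this shell use that prefix, e.g.:
#   darling shell
```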

The container uses overlayfs along with a user mount namespace to provide a different idea of / for macOS applications.

When you run an executable inside the prefix for the first time (after boot), launchd, the Darwin init process representing the container, is started. This init process keeps the root file system mounted.

Updating the prefix

Unlike Wine, Darling doesn't need to update the prefix whenever the Darling installation is updated. There is one caveat, though: since overlayfs caches the contents of underlying file system(s), you may need to terminate the container to see Darling's updated files:

$ darling shutdown

Note that this will terminate all processes running in the container.

Multiple simultaneously running prefixes

Darling supports having multiple prefixes running simultaneously. All darling commands will use either the default prefix or the prefix specified by DPREFIX, if this environment variable is set. This means, for example, that in order to shut down a particular prefix, you must set DPREFIX to the desired prefix (or unset it, for the default prefix) before running darling shutdown.
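
As an illustration, the following sketch wraps that procedure in a helper. Both shutdown_prefix and the DRY_RUN switch are hypothetical; DRY_RUN defaults to 1 here so that the script only prints what it would do. The function body runs in a subshell, so the DPREFIX change does not leak into the calling shell.

```shell
#!/bin/sh
# shutdown_prefix [DIR]: shut down the container of one prefix.
# With DIR, DPREFIX is exported for the call; without it, DPREFIX is unset
# so that the default prefix is targeted.
shutdown_prefix() (
    if [ -n "${1:-}" ]; then
        export DPREFIX="$1"
        label="$1"
    else
        unset DPREFIX
        label="default prefix"
    fi
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "would run 'darling shutdown' for $label"
    else
        darling shutdown
    fi
)

# The prefix path below is only an example.
shutdown_prefix "$HOME/prefixes/test"
shutdown_prefix
```

Set DRY_RUN=0 to actually run darling shutdown, which terminates all processes in the targeted container.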

Installing software

There are multiple ways to install software on macOS, and our aim is to make all of them work on Darling as well. However, there are currently a few limitations, mainly the lack of GUI support.

You might not even need to install it

Unlike Wine, Darling can run software that's installed on an existing macOS installation on the same computer. This is possible thanks to the way application bundles (.app-s) work on macOS and Darling.

To use an app that's already installed, you just need to locate the existing installation (e.g. /Volumes/SystemRoot/run/media/username/Macintosh HD/Applications/SomeApp.app) and run the app from there.

DMG files

Many apps for macOS are distributed as .dmg (disk image) files that contain the .app bundle inside. Under macOS, you would click the DMG to mount it and then drag the .app to your Applications folder to copy it there.

Under Darling, use hdiutil attach SomeApp.dmg to mount the DMG (the same command works on macOS too), and then copy the .app using cp:

Darling [~]$ hdiutil attach Downloads/SomeApp.dmg
/Volumes/SomeApp
Darling [~]$ cp -r /Volumes/SomeApp/SomeApp.app /Applications/

Archives

Some apps are distributed as archives instead of disk images. To install such an app, unpack the archive using the appropriate CLI tools and copy the .app to /Applications.
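
A minimal sketch of the unpacking step follows. The helper unpack_app is hypothetical and handles only tar and zip archives (the zip case assumes unzip is installed); the demonstration builds a stand-in SomeApp.app in a temporary directory so that it can be tried without a real download. Inside Darling, the destination would be /Applications.

```shell
#!/bin/sh
# unpack_app ARCHIVE DEST: unpack ARCHIVE into DEST, choosing the tool
# by file extension.
unpack_app() {
    mkdir -p "$2" || return 1
    case $1 in
        *.tar.gz|*.tgz) tar -xzf "$1" -C "$2" ;;
        *.tar)          tar -xf "$1" -C "$2" ;;
        *.zip)          unzip -q "$1" -d "$2" ;;
        *)              echo "unsupported archive: $1" >&2; return 1 ;;
    esac
}

# Self-contained demonstration with a stand-in bundle in a temporary directory.
work=$(mktemp -d)
mkdir -p "$work/src/SomeApp.app/Contents/MacOS"
tar -czf "$work/SomeApp.tar.gz" -C "$work/src" SomeApp.app

unpack_app "$work/SomeApp.tar.gz" "$work/Applications"
ls "$work/Applications"
```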

Mac App Store

Many apps are only available via Apple's Mac App Store. To install such an application in Darling, download the app from a real App Store (possibly running on another computer) and copy the .app over to Darling.

PKG files

Many apps use .pkg, the native package format of macOS, as their distribution format. It's not enough to simply copy the contents of a package somewhere; they are really meant to be installed and can run arbitrary scripts during installation.

Under macOS, you would use the graphical Installer.app or the command-line installer utility to install this kind of package. You can do the latter under Darling as well:

Darling [~]$ installer -pkg mc-4.8.7-0.pkg -target /

Unlike macOS, Darling also has the uninstaller command, which lets you easily uninstall packages.

Package managers

There are many third-party package managers for macOS, the most popular one being Homebrew. Ultimately, we want to make it possible to use all the existing package managers with Darling; however, some may not work well right now.

Command-line developer tools

To install command-line developer tools such as the C compiler (Clang) and LLDB, you can install Xcode using one of the methods mentioned above, and then run

Darling [~]$ xcode-select --switch /Applications/Xcode.app

Alternatively, you can download and install only command-line tools from Apple by running

Darling [~]$ xcode-select --install

Note that both Xcode and command-line tools are subject to Apple's EULA.

What to try

Here are some things you may want to try after installing Darling.

See if Darling can print the famous greeting:

Darling [~]$ echo Hello World
Hello World

It works!

Run uname

uname is a standard Unix command to get the name (and optionally the version) of the core OS. On Linux distributions, it prints "Linux":

$ uname
Linux

But Darling emulates a complete Darwin environment, so running uname results in "Darwin":

Darling [~]$ uname
Darwin

Run sw_vers

sw_vers (for "software version") is a Darwin command that prints the user-facing name, version and code name (such as "El Capitan") of the OS:

Darling [~]$ sw_vers
ProductName:    Mac OS X
ProductVersion: 10.14
BuildVersion:   Darling

Explore the file system

Explore the file system Darling presents to Darwin programs, e.g.:

Darling [~]$ ls -l /
...
Darling [~]$ ls /System/Library
Caches Frameworks OpenSSL ...
Darling [~]$ ls /usr/lib
...
Darling [~]$ ls -l /Volumes
...

Inspect the Mach-O binaries

Darling ships with tools like nm and otool that let you inspect Mach-O binaries, both the ones that make up Darling and any third-party ones:

Darling [~]$ nm /usr/lib/libobjc.A.dylib
...
Darling [~]$ otool -L /bin/bash
/bin/bash:
	/usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.0.0)

Explore process memory layout

While Darling emulates a complete Darwin system, it's still powered by Linux underneath. Sometimes, this may prove useful. For example, you can use Linux's /proc pseudo-filesystem to explore the running processes. Let's use cat to explore its own process memory layout:

Darling [~]$ cat /proc/self/maps
...
7ffff7ffb000-7ffff7ffd000 rwxp 00000000 fe:01 20482                      /home/user/.darling/bin/cat
7ffff7ffd000-7ffff7ffe000 rw-p 00002000 fe:01 20482                      /home/user/.darling/bin/cat
7ffff7ffe000-7ffff7fff000 r--p 00003000 fe:01 20482                      /home/user/.darling/bin/cat
7ffff7fff000-7ffff80f3000 rwxp 00182000 fe:01 60690                      /home/user/.darling/usr/lib/dyld
7ffff80f3000-7ffff80fc000 rw-p 00276000 fe:01 60690                      /home/user/.darling/usr/lib/dyld
7ffff80fc000-7ffff8136000 rw-p 00000000 00:00 0
7ffff8136000-7ffff81d6000 r--p 0027f000 fe:01 60690                      /home/user/.darling/usr/lib/dyld
7fffffdde000-7fffffdff000 rw-p 00000000 00:00 0                          [stack]
7fffffe00000-7fffffe01000 r--s 00000000 00:0e 8761                       anon_inode:[commpage]

Check out the mounts

Darling runs in a mount namespace that's separate from the host. You can use the host's native mount tool to inspect it:

Darling [~]$ /Volumes/SystemRoot/usr/bin/mount | column -t
/Volumes/SystemRoot/dev/sda3  on  /Volumes/SystemRoot  type  ext4     (rw,relatime,seclabel)
overlay                       on  /                    type  overlay  (rw,relatime,seclabel,lowerdir=/usr/local/libexec/darling,upperdir=/home/user/.darling,workdir=/home/user/.darling.workdir)
proc                          on  /proc                type  proc     (rw,relatime)
<...>

Notice that not only can you simply run a native ELF executable installed on the host, you can also pipe its output directly into a Darwin command (like column in this case).

Alternatively, you can read the same info from the /proc pseudo-filesystem:

Darling [~]$ column -t /proc/self/mounts
<...>

List running processes

Darling emulates the BSD sysctls that are needed for ps to work:

Darling [~]$ ps aux
USER   PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
user    32   0.0  0.4  4229972  13016   ??  ?     1Jan70   0:00.05 ps aux
user     5   0.0  0.5  4239500  15536   ??  ?     1Jan70   0:00.22 /bin/launchctl bootstrap -S System
user     6   0.0  0.4  4229916  11504   ??  ?     1Jan70   0:00.09 /usr/libexec/shellspawn
user     7   0.0  0.6  4565228  17308   ??  ?     1Jan70   0:00.14 /usr/sbin/syslogd
user     8   0.0  0.6  4407876  18936   ??  ?     1Jan70   0:00.15 /usr/sbin/notifyd
user    29   0.0  0.2  4229948   7584   ??  ?N    1Jan70   0:00.03 /usr/libexec/shellspawn
user    30   0.0  0.5  4231736  14268   ??  ?     1Jan70   0:00.11 /bin/bash
user     1   0.0  0.5  4256056  15484   ??  ?     1Jan70   0:00.25 launchd

Read the manual

Darling ships with many man pages you can read:

Darling [~]$ man dyld

Run a script

Like Darwin, Darling ships with a build of Python, Ruby and Perl. You can try running a script or exploring them interactively.

Darling [~]$ python
Python 2.7.10 (default, Sep  8 2018, 13:32:07) 
[GCC 4.2.1 Compatible Clang 6.0.1 (tags/RELEASE_601/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.platform
'darwin'

Trace a process

Use our xtrace tool to trace the emulated Darwin syscalls a process makes:

Darling [~]$ arch
i386
Darling [~]$ xtrace arch
<...>
[321] mach_timebase_info_trap() -> numer = 1, denom = 1
[321] issetugid() -> 0
[321] host_self_trap() -> port right 2563
[321] mach_msg_trap(0x7fffffdff270, MACH_SEND_MSG|MACH_RCV_MSG, 40, 320, port 1543, 0, port 0)
[321]         {remote = copy send 2563, local = make send-once 1543, id = 200}, 16 bytes of inline data
[321]         mach_host::host_info(copy send 2563, 1, 12)
[321] mach_msg_trap() -> KERN_SUCCESS
[321]         {local = move send-once 1543, id = 300}, 64 bytes of inline data
[321]         mach_host::host_info() -> [8, 8, 0, 0, 3104465855, 4160607681, 4160604077, 0, 4292867120, 4292867064, 4151031935, 3160657432], 12 
[321] _kernelrpc_mach_port_deallocate_trap(task 259, port name 2563) -> KERN_SUCCESS
[321] ioctl(0, 1074030202, 0x7fffffdff3d4) -> 0
[321] fstat64(1, 0x7fffffdfef80) -> 0
[321] ioctl(1, 1074030202, 0x7fffffdfefd4) -> 0
[321] write_nocancel(1, 0x7f823680a400, 5)i386
 -> 5
[321] exit(0)

Control running services

Use the launchctl tool to control launchd:

Darling [~]$ launchctl list
PID	Status	Label
323	-	0x7ffea3407da0.anonymous.launchctl
49	-	0x7ffea3406d50.anonymous.shellspawn
50	-	0x7ffea3406a20.anonymous.bash
39	-	0x7ffea3406350.anonymous.shellspawn
40	-	0x7ffea3405fd0.anonymous.bash
-	0	com.apple.periodic-monthly
-	0	com.apple.var-db-dslocal-backup
31	-	com.apple.aslmanager
-	0	com.apple.newsyslog
-	0	com.apple.periodic-daily
-	0	com.apple.securityd
19	-	com.apple.memberd
23	-	com.apple.notifyd
20	-	org.darlinghq.iokitd
-	0	com.apple.periodic-weekly
21	-	org.darlinghq.shellspawn
22	-	com.apple.syslogd
-	0	com.apple.launchctl.System

Darling [~]$ sudo launchctl bstree
System/
    A  org.darlinghq.iokitd
    A  com.apple.system.logger
    D  com.apple.activity_tracing.cache-delete
    D  com.apple.SecurityServer
    A  com.apple.aslmanager
    A  com.apple.system.notification_center
    A  com.apple.PowerManagement.control
    com.apple.xpc.system (XPC Singleton Domain)/

Read man launchctl for more information about the other commands launchctl has.

Fetch a webpage

See if networking works as it should:

Darling [~]$ curl http://example.org
<!doctype html>
<html>
<head>
    <title>Example Domain</title>
<...>

Try using sudo

Just like the real Mac OS X may, Darling allows you to get root privileges without having to enter any password, except in our case it's a feature:

Darling [~]$ whoami
user
Darling [~]$ sudo whoami
root

Of course, our sudo command only gives the program the impression it's running as root; in reality, it still runs with privileges of your user. Some programs explicitly check that they're running as root, so you can use our sudo to convince them to run.

Use a package manager

Download and install the Rudix Package Manager:

Note: Not currently working due to lack of TLS support.

Darling [~]$ curl -s https://raw.githubusercontent.com/rudix-mac/rpm/2015.10.20/rudix.py | sudo python - install rudix

Now you can install arbitrary packages using the rudix command:

Darling [~]$ sudo rudix install wget mc

Install Homebrew

macOS's de-facto package manager, Homebrew, installs and works under Darling (albeit with issues with certain formulas).

Darling [~]$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Now you can install packages just like you would on a real macOS installation:

Darling [~]$ brew install wget

Try running Midnight Commander

If you've installed Midnight Commander (mc package in Rudix), launch it to see if it runs smoothly:

Darling [~]$ mc

Manually install a package

You can also try installing a .pkg file manually using the installer command:

Darling [~]$ installer -pkg mc-4.8.7-0.pkg -target /

Unlike macOS, Darling also ships with an uninstaller command which you can use to easily uninstall packages.

Attach disk images

Darling ships with an implementation of hdiutil, a tool that allows you to attach and detach DMG disk images:

Darling [~]$ hdiutil attach Downloads/SomeApp.dmg
/Volumes/SomeApp
Darling [~]$ ls /Volumes/SomeApp
SomeApp.app <...>
Darling [~]$ cp -r /Volumes/SomeApp/SomeApp.app /Applications/
Darling [~]$ hdiutil detach /Volumes/SomeApp

Check out user defaults

macOS software uses the so-called "user defaults" system to store preferences. You can access it using the defaults tool:

Darling [~]$ defaults read

For more information about using defaults, read the manual:

Darling [~]$ man defaults

Run neofetch

Get the neofetch.sh script from its homepage and run it:

Darling [~]$ bash neofetch.sh
                    'c.          user@hostname
                 ,xNMM.          ------------------------
               .OMMMMo           OS: macOS Mojave 10.14 Darling x86_64
               OMMM0,            Kernel: 16.0.0
     .;loddo:' loolloddol;.      Uptime: 1 day, 23 hours, 25 mins
   cKMMMMMMMMMMNWMMMMMMMMMM0:    Shell: bash 3.2.57
 .KMMMMMMMMMMMMMMMMMMMMMMMWd.    DE: Aqua
 XMMMMMMMMMMMMMMMMMMMMMMMX.      WM: Quartz Compositor
;MMMMMMMMMMMMMMMMMMMMMMMM:       WM Theme: Blue (Print: Entry, AppleInterfaceStyle, Does Not Exist)
:MMMMMMMMMMMMMMMMMMMMMMMM:       Terminal: /dev/pts/1
.MMMMMMMMMMMMMMMMMMMMMMMMX.      CPU: GenuineIntel
 kMMMMMMMMMMMMMMMMMMMMMMMMWd.    Memory: 12622017MiB / 2004MiB
 .XMMMMMMMMMMMMMMMMMMMMMMMMMMk
  .XMMMMMMMMMMMMMMMMMMMMMMMMK.
    kMMMMMMMMMMMMMMMMMMMMMMd
     ;KMMMMMMMWXXWMMMMMMMk.
       .cooc,.    .,coo:.

neofetch exercises a lot of the Darwin internals, including BSD sysctl calls, Mach IPC and host info API, and the "user defaults" subsystem.

Compile and run a program

If you have the Xcode SDK installed, you can compile and run programs.

Darling [~]$ xcode-select --switch /Applications/Xcode.app

Now, build a "Hello World" C program using the Clang compiler:

Darling [~]$ cat > hello-world.c
#include <stdio.h>

int main() {
    puts("Hello World!");
}
Darling [~]$ clang hello-world.c -o hello-world

And run it:

Darling [~]$ ./hello-world
Hello World!

The whole compiler stack works, how cool is that! Now, let's try Swift:

Darling [~]$ cat > hi.swift
print("Hi!")
Darling [~]$ swiftc hi.swift
Darling [~]$ ./hi
Hi!

Try running apps

Darling has experimental support for graphical applications written using Cocoa. If you have a simple app installed, you can try running it:

Darling [~]$ /Applications/HelloWorld.app/Contents/MacOS/HelloWorld

Known working software

The following software has been tested to work with Darling:

  • Homebrew.
    • Terminal GNU Emacs 28.1, installed via brew
    • CMake 3.23.2, installed via brew
    • GNUPlot, installed via brew, when outputting to PNG files.
    • Python 3.9, installed via brew.
  • The Xcode command-line tools.
  • GNUPlot (http://www.gnuplot.info/), when outputting to PNG files.

Known partially working software

The following software has been tested and is known to work partially with Darling:

  • Classic Finder (https://classicmacfinder.com/).
    • Source code: https://bitbucket.org/bszyman/classic-finder-app/src/develop/
    • Works, but has the following bugs:
    • Menubar not visible anywhere.
    • Windows cannot be closed after they lose focus - the window decorations disappear.
    • Sometimes does not respond when trying to browse to system folders.
    • Runs very slowly.
  • dotnet (the command-line compiler).
    • Very minimal "hello world" applications compile and run.

Known non-functional software

The following software has been tested and is known not to work with Darling:

  • XQuartz 2.8.1:
    • Hangs, and causes client applications to hang when they try to use it.
  • MacPorts (though Homebrew does work).
  • Python 3.10 (From https://www.python.org) (darlingserver regression).

GUI applications will not work in Darling at this point in time, with very few exceptions. More specifically:

  • Most GUI toolkits, including:
    • Anything using the Python Tk/Tkinter toolkits.
    • Anything using the WxPython and WxWidgets toolkits.
    • Anything using the Xamarin/MAUI toolkits.
    • Mac Catalyst applications (there is no UIKit implementation for Darling yet).
  • Most GUI applications, including:
    • The Xcode GUI.
    • Logic.
    • Final Cut Pro.
    • Any Adobe Suite applications.
    • Any complex GUI application in general will not work at this point in time - only simple "Hello World" type GUIs will work.

Darling internals

Documentation of Darling to help new developers understand how Darling (and macOS) work.

Basics

Loader

Whereas Linux uses ELF as the format for applications, dynamic libraries and so on, macOS uses Mach-O.

mldr is responsible for loading Mach-O binaries: it loads the application's binary, then loads dyld and hands over control to it.

An additional trick is then needed to load ELF libraries into such processes.
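
The two formats are easy to tell apart by the first four bytes of a file. The sketch below is an illustration only and is unrelated to how mldr itself is implemented; binary_format is a hypothetical helper, and the byte sequences are the well-known ELF and Mach-O magic numbers as they appear on disk on x86.

```shell
#!/bin/sh
# binary_format FILE: classify FILE by its first four bytes.
binary_format() {
    # od prints the bytes as hex; tr removes the separating whitespace.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \t\n')
    case $magic in
        7f454c46) echo "ELF" ;;
        cffaedfe) echo "Mach-O 64-bit" ;;
        cefaedfe) echo "Mach-O 32-bit" ;;
        # The same bytes also begin Java class files.
        cafebabe) echo "Mach-O universal" ;;
        *)        echo "unknown" ;;
    esac
}

# Demonstration on two four-byte stand-in files written with octal escapes.
tmp=$(mktemp -d)
printf '\177ELF' > "$tmp/elf"
printf '\317\372\355\376' > "$tmp/macho"
binary_format "$tmp/elf"
binary_format "$tmp/macho"
rm -rf "$tmp"
```

Inside a Darling shell, the same check could be pointed at /bin/bash and at /Volumes/SystemRoot/bin/uname to compare a Darwin binary with a host binary.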

dyld

Dyld is Apple's dynamic linker. It examines what libraries are needed by the macOS application, loads them and performs other necessary tasks. After it is done, it jumps to the application's entry point.

Dyld is special in the sense that as the only "executable" on macOS, it does not (cannot) link to any libraries. As a consequence of this, it has to be statically linked to a subset of libSystem (counterpart of glibc on Linux).

System call emulation

System calls are the most critical interface of all user space software. They are the means of invoking the kernel's functionality. Without them, software would not be able to communicate with the user, access the file system or establish network connections.

Every kernel provides a very specific set of system calls, even if there is a similar set of libc APIs available on different systems. For example, the system call set greatly differs between Linux and FreeBSD.

macOS uses a kernel named XNU. XNU's system calls differ greatly from those of Linux, which is why XNU system call emulation is at the heart of Darling.

XNU system calls

Unlike other kernels, XNU has three distinct system call tables:

  1. BSD system calls, invoked via sysenter/syscall. These calls frequently have a direct counterpart on Linux, which makes them easiest to implement.
  2. Mach system calls, which use negative system call numbers. These are non-existent on Linux and are mostly implemented in darlingserver. These calls are built around Mach ports.
  3. Machine dependent system calls, invoked via int 0x82. There are only a few of them and they are mainly related to thread-local storage.

Introduction

Darling emulates system calls by providing a modified /usr/lib/system/libsystem_kernel.dylib library, the source code of which is located in src/kernel. However, some parts of the emulation are located in Darling's userspace kernel server (a.k.a. darlingserver, located in src/external/darlingserver).

This is why libsystem_kernel.dylib (and also dyld, which contains a static build of this library) can never be copied from macOS to Darling.

Emulating XNU system calls directly in Linux would have a few benefits, but isn't really workable. Unlike BSD kernels, Linux has no support for foreign system call emulation and having such an extensive and intrusive patchset merged into Linux would be too difficult. Requiring Darling's users to patch their kernels is out of the question as well.

In the past, Darling used a kernel module to implement many of the same things now handled by darlingserver. This approach had certain advantages (e.g. much simpler to manage Darling processes and typically faster than darlingserver), but it was found to be too unstable. Additionally, debugging kernel code is much harder than debugging userspace code, so it also presented development challenges.

Disadvantages of this approach

  • Inability to run a full copy of macOS under Darling (notwithstanding the legality of such endeavor), at least the files mentioned above need to be different.

  • Inability to run software making direct system calls. This includes some old UPX-packed executables. In the past, software written in Go also belonged in this group, but this is no longer the case. Note that Apple provides no support for making direct system calls (which is effectively very similar to distributing statically linked executables described in the article) and frequently changes the system call table, hence such software is bound to break over time.

Advantages of this approach

  • Significantly easier development.
  • No need to worry about having patches merged into Linux.
  • It is easier for users to get the newest code: they do not need to run the latest kernel to have the most up-to-date system call emulation.

Implementation

libsystem_kernel consists of two main components: Apple's open-source wrappers and our own syscall emulation. This setup makes our emulation model what happens on macOS as closely as possible. Apple uses wrappers in order to perform additional set up for syscalls as necessary and this also allows them to modify the kernel syscall interface between releases while keeping a stable libsystem_kernel interface.

When the wrappers are called, they perform any necessary set up and then invoke the syscall using a macro. This macro normally expands to a bit of assembly code that includes a syscall instruction, but for Darling, this macro is replaced with our own modified version that redirects the call over to one of our syscall lookup functions (according to the syscall type: BSD, Mach, or machdep). Our handler then looks in the appropriate table and invokes our emulated version of the syscall (or returns ENOSYS if we don't support it).
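The lookup step described above can be sketched as a plain table of function pointers. This is only an illustration under stated assumptions: the handler type, the two toy handlers, the call numbers and the name bsd_syscall_dispatch are all hypothetical, and returning ENOSYS negated is a choice made for this sketch rather than something taken from Darling's source.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type: every emulated call takes up to three
 * integer-sized arguments in this sketch. */
typedef long (*syscall_handler_t)(long a1, long a2, long a3);

/* Two toy "emulated" calls standing in for real implementations. */
static long emu_add(long a1, long a2, long a3) {
    (void)a3;
    return a1 + a2;
}

static long emu_getpid(long a1, long a2, long a3) {
    (void)a1; (void)a2; (void)a3;
    return 1234; /* fixed value so the example is deterministic */
}

/* Sparse table indexed by call number (numbers are arbitrary here);
 * slots that are not listed stay NULL. */
static const syscall_handler_t bsd_table[] = {
    [1] = emu_add,
    [4] = emu_getpid,
};

#define TABLE_LEN(t) (sizeof(t) / sizeof((t)[0]))

/* Look the number up and invoke the handler, or report ENOSYS
 * (negated in this sketch) when nothing is registered for it. */
long bsd_syscall_dispatch(int nr, long a1, long a2, long a3) {
    if (nr < 0 || (size_t)nr >= TABLE_LEN(bsd_table) || bsd_table[nr] == NULL)
        return -ENOSYS;
    return bsd_table[nr](a1, a2, a3);
}
```

With this table, bsd_syscall_dispatch(1, 2, 3, 0) reaches emu_add, while an unregistered number such as 2 or 99 falls through to the ENOSYS path.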

After that, the process becomes largely syscall-specific and documenting how we handle each and every one of them here would be a large task, and probably not that useful (after all, if you're looking to dive into the specifics of libsystem_kernel, it's probably best to look at the code). However, it is useful to document details common across multiple syscalls, so that is what follows now.

Making Linux syscalls

The majority of our emulated libsystem_kernel syscalls need to make Linux syscalls (and some need to make multiple syscalls). In order to make this easier, libsystem_kernel has a LINUX_SYSCALL macro. It can be used like so: LINUX_SYSCALL(linux_syscall_number, arguments...). As an example, this is how our chroot uses it:

long sys_chroot(const char* path) {
  // ... other code ...
  LINUX_SYSCALL(__NR_chroot, path);
  // ... more code ...
}

In addition, to avoid dependence on external headers, libsystem_kernel includes copies of Linux headers defining syscall numbers. All of these syscall numbers are prefixed with __NR_. The syscall numbers, as well as their availability, depend on the architecture. See this directory for the header we include.
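To get a feel for the calling pattern on an ordinary Linux machine, here is a rough stand-in for the macro. It is not Darling's LINUX_SYSCALL: the names SKETCH_LINUX_SYSCALL, sketch_fixup, sketch_path_exists and sketch_getpid are hypothetical, and unlike libsystem_kernel the sketch leans on libc's syscall(), converting its "-1 plus errno" convention back into a raw negative errno.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>        /* AT_FDCWD */
#include <sys/syscall.h>  /* SYS_* numbers, libc's names for __NR_* */
#include <unistd.h>       /* syscall(), F_OK */

/* libc's syscall() reports failure as -1 with errno set; turn that back
 * into the raw kernel convention of returning -errno directly. */
static long sketch_fixup(long ret) {
    return (ret == -1) ? -(long)errno : ret;
}

/* Hypothetical stand-in for LINUX_SYSCALL(number, arguments...).
 * ##__VA_ARGS__ is a GNU extension accepted by GCC and Clang. */
#define SKETCH_LINUX_SYSCALL(nr, ...) \
    sketch_fixup(syscall((nr), ##__VA_ARGS__))

/* Returns 0 if the path exists, otherwise a negative Linux errno. */
long sketch_path_exists(const char *path) {
    return SKETCH_LINUX_SYSCALL(SYS_faccessat, AT_FDCWD, path, F_OK);
}

/* A call without arguments uses the same macro. */
long sketch_getpid(void) {
    return SKETCH_LINUX_SYSCALL(SYS_getpid);
}
```

The negative value returned on failure is a Linux errno, so an emulated syscall built this way would still have to translate it before handing it to Darwin code.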

Making darlingserver calls

Many syscalls can be emulated using only Linux syscalls. However, there are some, most notably Mach syscalls, that require calls into our userspace kernel server.

darlingserver calls have wrappers in libsystem_kernel that handle all the RPC boilerplate for messaging the server. Prototypes for these wrappers can be included using #include <darlingserver/rpc.h>. A full list of all the supported RPC calls can be found here.

Each wrapper is of the form:

int dserver_rpc_CALL_NAME_HERE(int arg1, char arg2 /* and arg3, arg4, etc. */, int* return_val1, char* return_val2 /* and return_val3, return_val4, etc. */);

The arguments to the wrapper are the arguments (if any) to the RPC call followed by pointers to store the return values (if any) from the call. All return value pointers can be set to NULL if you don't want/care about that value.

File descriptors can also be sent and received like any other argument or return value; the wrappers take care of the necessary RPC boilerplate behind the scenes.

For example, this is the wrapper prototype for the uidgid call, which retrieves the UID and GID of the current process (as seen within the process) and optionally updates them (if either new_uid or new_gid is greater than -1):

int dserver_rpc_uidgid(int32_t new_uid, int32_t new_gid, int32_t* out_old_uid, int32_t* out_old_gid);

The following are all valid invocations of this wrapper:

int32_t old_uid;
int32_t old_gid;

// retrieve the old UID and set the new UID to 5 (without changing the GID)
dserver_rpc_uidgid(5, -1, &old_uid, NULL);

// retrieve the old UID and GID and don't change the UID or GID values
dserver_rpc_uidgid(-1, -1, &old_uid, &old_gid);

// set the new GID to 89 (without changing the UID)
dserver_rpc_uidgid(-1, 89, NULL, NULL);

For more details on darlingserver RPC, see the appropriate doc.

errno translation

Every syscall needs to return a status code, and Linux syscalls also return status codes. Therefore, many of our emulated syscalls that make Linux syscalls return the Linux syscall's return code as their own. However, since errno values differ between BSD and Linux, we need to translate these codes before returning them. That's what errno_linux_to_bsd is for: feed it a Linux errno and it'll give you the equivalent BSD errno. It will also automatically handle signs: if you give a positive code, it'll return a positive code, and vice versa for negative codes.
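A minimal sketch of such a translation, including the sign handling, might look as follows. The function name sketch_errno_linux_to_bsd and the enum names are hypothetical, and only a handful of codes are covered; the numeric values are the usual Linux (generic/x86) and Darwin ones. Codes that are not listed are passed through unchanged, which is only correct for those that happen to coincide, such as ENOENT.

```c
/* Linux errno values (generic/x86 numbering). */
enum {
    LINUX_ENOENT = 2, LINUX_EAGAIN = 11, LINUX_EDEADLK = 35,
    LINUX_ENAMETOOLONG = 36, LINUX_ENOSYS = 38,
    LINUX_ENOTEMPTY = 39, LINUX_ELOOP = 40
};

/* Darwin/BSD errno values for the same conditions. */
enum {
    BSD_EDEADLK = 11, BSD_EAGAIN = 35, BSD_ELOOP = 62,
    BSD_ENAMETOOLONG = 63, BSD_ENOTEMPTY = 66, BSD_ENOSYS = 78
};

/* Partial translation table; the sign of the input is preserved. */
int sketch_errno_linux_to_bsd(int err) {
    int negative = err < 0;
    int code = negative ? -err : err;

    switch (code) {
    case LINUX_EAGAIN:       code = BSD_EAGAIN;       break;
    case LINUX_EDEADLK:      code = BSD_EDEADLK;      break;
    case LINUX_ENAMETOOLONG: code = BSD_ENAMETOOLONG; break;
    case LINUX_ENOSYS:       code = BSD_ENOSYS;       break;
    case LINUX_ENOTEMPTY:    code = BSD_ENOTEMPTY;    break;
    case LINUX_ELOOP:        code = BSD_ELOOP;        break;
    default:                 break; /* e.g. ENOENT is 2 on both systems */
    }
    return negative ? -code : code;
}
```

Note how EAGAIN and EDEADLK effectively swap numbers between the two systems, which is exactly the kind of mismatch that makes returning an untranslated code dangerous.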

"Simple" implementations of certain libc functions

Libc provides some incredibly useful functions that we simply cannot use in libsystem_kernel, due to the fact that libsystem_kernel can't have any external dependencies. However, we would still like to have many of these available in libsystem_kernel. Solution? Implement our own simple versions of these functions.

The result is that everything in simple.h is available in libsystem_kernel. These are all designed to provide the necessary functionality from their libc counterparts. This means, for example, that most (but not all) format specifiers are supported by __simple_printf, mainly just because we don't need them all. Additionally, __simple_kprintf is libsystem_kernel-unique: it can be used to print to the Linux kernel log stream (using our LKM).
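To illustrate what "simple" means here, the following is a deliberately small formatter written for this document, not the code from simple.h. The name tiny_snprintf and its helpers are hypothetical. It supports only %s, %d, %u, %x and %%, and it needs nothing beyond compiler-provided headers, which is the property that matters in a library without external dependencies.

```c
#include <stdarg.h>
#include <stddef.h>

/* Store one character if there is room, always leaving space for '\0'. */
static void put_char(char *buf, size_t size, size_t *pos, char c) {
    if (*pos + 1 < size)
        buf[*pos] = c;
    (*pos)++;
}

/* Emit an unsigned value in the given base (10 or 16). */
static void put_unsigned(char *buf, size_t size, size_t *pos,
                         unsigned long value, unsigned base) {
    char digits[sizeof(unsigned long) * 8];
    size_t n = 0;
    do {
        digits[n++] = "0123456789abcdef"[value % base];
        value /= base;
    } while (value != 0);
    while (n > 0)
        put_char(buf, size, pos, digits[--n]);
}

/* Returns the length the complete output would have had. */
int tiny_snprintf(char *buf, size_t size, const char *fmt, ...) {
    va_list ap;
    size_t pos = 0;

    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') {
            put_char(buf, size, &pos, *fmt);
            continue;
        }
        fmt++;
        switch (*fmt) {
        case 's': {
            const char *s = va_arg(ap, const char *);
            if (s == NULL)
                s = "(null)";
            while (*s != '\0')
                put_char(buf, size, &pos, *s++);
            break;
        }
        case 'd': {
            int v = va_arg(ap, int);
            /* Compute the magnitude in unsigned arithmetic (safe for INT_MIN). */
            unsigned long mag = (v < 0) ? 0UL - (unsigned long)v
                                        : (unsigned long)v;
            if (v < 0)
                put_char(buf, size, &pos, '-');
            put_unsigned(buf, size, &pos, mag, 10);
            break;
        }
        case 'u':
            put_unsigned(buf, size, &pos, va_arg(ap, unsigned int), 10);
            break;
        case 'x':
            put_unsigned(buf, size, &pos, va_arg(ap, unsigned int), 16);
            break;
        case '%':
            put_char(buf, size, &pos, '%');
            break;
        case '\0':
            fmt--; /* lone trailing '%': let the loop see the terminator */
            break;
        default:   /* unsupported specifier: echo it verbatim */
            put_char(buf, size, &pos, '%');
            put_char(buf, size, &pos, *fmt);
            break;
        }
    }
    va_end(ap);

    if (size > 0)
        buf[pos < size ? pos : size - 1] = '\0';
    return (int)pos;
}
```

Leaving out widths, precision and floating point keeps the whole thing well under a hundred lines, which is the trade-off the paragraph above describes.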

vchroot handling

As part of Darling's containerization, Darling uses something called a virtual chroot, or "vchroot" for short. The idea behind this technique is that, since we control libsystem_kernel, through which all Darwin filesystem operations must go, we can rewrite paths in libsystem_kernel to make the root (i.e. /) point to our prefix (e.g. ~/.darling). The main advantage of this method compared to an actual chroot is that Linux libraries and programs used within the prefix are oblivious to this and see the normal Linux root. This means that Linux libraries can communicate with daemons and services exactly like they normally would, requiring no extra setup on our part.

The disadvantage of this approach is that it means a lot more work for our libsystem_kernel. We have to be careful to translate every single path we handle, both ones we receive as input (i.e. we must expand them to their full Linux paths) and ones we return (i.e. we must unexpand them back to their prefix-relative paths). Thankfully, there are two functions in libsystem_kernel that handle all the expansion and unexpansion to make it easier on syscalls: vchroot_expand and vchroot_unexpand.

vchroot_expand expects a pointer to an argument structure (for legacy reasons; this used to be an LKM call). This structure contains 3 members. On input, path is a character buffer containing the path to expand; on output, it contains the expanded path. flags specifies flags for the call. Currently, there is only a single flag: VCHROOT_FOLLOW tells the function to return the expanded symlink target path if the input path refers to a symlink. dfd specifies a descriptor pointing to a directory to use as the base from which to resolve relative input paths. If it is set to -100 (the value of AT_FDCWD on Linux), the process's current working directory is used. vchroot_expand returns 0 on success or a Linux errno (remember to translate this if returning it from a syscall!).

vchroot_unexpand does the opposite of vchroot_expand: it takes a fully expanded Linux path, and returns the unexpanded prefix-relative path. Like vchroot_expand, it expects a pointer to an argument structure, although it only has a single member: path. On input, path is a character buffer containing the path to unexpand. On output, path contains the unexpanded path. Like vchroot_expand, vchroot_unexpand returns 0 on success or a Linux errno.
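The core idea of expanding and unexpanding can be sketched with plain string handling. The functions below are hypothetical simplifications, not the real vchroot_expand and vchroot_unexpand: they take the prefix as an explicit argument, handle absolute paths only, and ignore symlinks, flags and dfd. Mapping paths outside the prefix to /Volumes/SystemRoot when unexpanding is an assumption based on the escape directory mentioned in the containerization section.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define SYSTEM_ROOT "/Volumes/SystemRoot"

/* Returns 1 if `path` equals `dir` or lies underneath it. */
static int is_under(const char *path, const char *dir) {
    size_t n = strlen(dir);
    return strncmp(path, dir, n) == 0 && (path[n] == '/' || path[n] == '\0');
}

/* Turn snprintf's result into "0 on success or a Linux errno". */
static int check_fit(int written, size_t size) {
    return (written < 0 || (size_t)written >= size) ? ENAMETOOLONG : 0;
}

/* Darwin-side path -> Linux-side path. */
int sketch_expand(const char *prefix, const char *path,
                  char *out, size_t size) {
    if (is_under(path, SYSTEM_ROOT)) {
        /* Escape hatch: strip the marker to reach the real Linux root. */
        const char *rest = path + strlen(SYSTEM_ROOT);
        return check_fit(snprintf(out, size, "%s", *rest ? rest : "/"), size);
    }
    /* Everything else lives inside the prefix. */
    return check_fit(snprintf(out, size, "%s%s", prefix, path), size);
}

/* Linux-side path -> Darwin-side path. */
int sketch_unexpand(const char *prefix, const char *path,
                    char *out, size_t size) {
    if (is_under(path, prefix)) {
        const char *rest = path + strlen(prefix);
        return check_fit(snprintf(out, size, "%s", *rest ? rest : "/"), size);
    }
    /* Outside the prefix: only reachable through the escape directory. */
    return check_fit(snprintf(out, size, "%s%s", SYSTEM_ROOT, path), size);
}
```

For a prefix of /home/u/.darling, expanding /usr/bin/true gives /home/u/.darling/usr/bin/true, and unexpanding that result gives back /usr/bin/true.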

Containerization

Darling supports use of multiple prefixes (virtual root directories), very much like Wine. Unlike Wine, Darling makes use of Linux's support for various user-controlled namespaces. This makes Darling's prefixes behave a lot more like Docker/LXC containers.

Implementation

The implementation fully resides in the darling binary, which performs several tasks:

  • Create a new mount namespace. Mounts created inside the namespace are automatically destroyed when the container is stopped.
  • Set up an overlayfs mount, which overlays Darling's readonly root tree (which is installed e.g. in /usr/local/libexec/darling) with the prefix's path. This means the prefix gets updated prefix contents for free (unlike in Wine), but the user can still manipulate prefix contents.
  • Activate "vchroot". That is what we call our virtual chroot implementation, which still allows applications to escape into the outside system via a special directory (/Volumes/SystemRoot).
  • Set up a new PID namespace. launchd is then started as the init process for the container.

More namespaces (e.g. UID or network) will be considered in the future.
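The namespace and overlayfs steps from the list above map onto a small number of Linux calls. The sketch below is not taken from the darling binary: the function names are hypothetical, the work directory is an assumption (overlayfs requires one next to the upper directory), and error handling is reduced to a minimum. unshare is invoked through syscall() only so that the example does not depend on feature-test macros.

```c
#include <linux/sched.h>  /* CLONE_NEWNS, CLONE_NEWPID */
#include <stdio.h>
#include <sys/mount.h>    /* mount() */
#include <sys/syscall.h>  /* SYS_unshare */
#include <unistd.h>       /* syscall() */

/* Build the overlayfs option string: read-only lower tree, writable upper
 * directory (the prefix) and the scratch directory overlayfs requires.
 * Returns 0 on success, -1 if the buffer is too small. */
int sketch_overlay_opts(char *out, size_t size, const char *lower,
                        const char *upper, const char *work) {
    int n = snprintf(out, size, "lowerdir=%s,upperdir=%s,workdir=%s",
                     lower, upper, work);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Enter new mount and PID namespaces, then mount the overlay at `target`.
 * Needs privileges (CAP_SYS_ADMIN); returns -1 with errno set on failure.
 * After unshare(CLONE_NEWPID) the caller itself keeps its PID; the next
 * child it forks becomes PID 1 of the new namespace. */
int sketch_enter_container(const char *opts, const char *target) {
    if (syscall(SYS_unshare, CLONE_NEWNS | CLONE_NEWPID) != 0)
        return -1;
    return mount("overlay", target, "overlay", 0, opts);
}
```

A real implementation would also have to keep its mounts from propagating back to the host namespace and then fork the process that becomes the container's init, which this sketch leaves out.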

Caveats

  • When you make changes to Darling's installation directory (e.g. /usr/local/libexec/darling), you must stop running containers (via darling shutdown) so that the changes take effect. This is a limitation of overlayfs.

Calling host system APIs

This article describes how Darling enables code compiled into Mach-O files to interact with APIs exported from ELF files on the host system. This is needed, for example, to access ALSA audio APIs.

Apple's dynamic loader (dyld) cannot load ELF files and extending it in this direction would be a rather extensive endeavor, also because Apple's linker (ld64) would need to be extended at the same time. This means some kind of bridge between the host platform's ELF loader and the "Mach-O world" has to be set up.

The ELF Bridge

mldr is responsible for loading Mach-O's. However, it is also responsible for providing the ELF bridge for the Mach-O world. Because it is an ELF itself, it has full access to a normal Linux ELF environment, complete with dynamic library loading and pthread functionality (which is necessary for Darling's threading implementation). How does mldr allow the Mach-O world to use this stuff? As part of loading a Mach-O binary, mldr populates a structure with the addresses of ELF functions we want to use. After populating this structure (struct elf_calls), it passes it to the Mach-O binary as part of the special applep environment array. Mach-O code can then call those functions at any time using the function pointers stored in the structure.

Wrappers

To enable easy linking, a concept of ELF wrappers was introduced, along with a tool named wrapgen. wrapgen parses ELF libraries, extracts the SONAME (name of library to be loaded) and a list of visible symbols exported from the library.

Now that we have the symbols, a neat trick is used to make them available to Mach-O applications. The Mach-O format supports so-called symbol resolvers, which are functions that return the address of the symbol they represent. dyld calls them and provides the result as the symbol's address to whoever needs the symbol.

Therefore, wrapgen produces C files such as this:

#include <elfcalls.h>
extern struct elf_calls* _elfcalls;

static void* lib_handle;
__attribute__((constructor)) static void initializer() {
    lib_handle = _elfcalls->dlopen_fatal("libasound.so.2");
}

__attribute__((destructor)) static void destructor() {
    _elfcalls->dlclose_fatal(lib_handle);
}

void* snd_pcm_open() {
    __asm__(".symbol_resolver _snd_pcm_open");
    return _elfcalls->dlsym_fatal(lib_handle, "snd_pcm_open");
}

The C file is then compiled into a Mach-O library, which transparently wraps an ELF library on the host system.

CMake integration

To make things very easy, there is a CMake function that automatically takes care of wrapping a host system's library.

Example:

include(wrap_elf)
include(darling_exe)

wrap_elf(asound libasound.so)

add_darling_executable(pcm_min pcm_min.c)
target_link_libraries(pcm_min system asound)

The wrap_elf() call creates a Mach-O library of given name, wrapping an ELF library of given name, and installs it into /usr/lib/native inside the prefix.

Threading

Thread implementation

Rationale

Darling uses Apple's original libpthread library. macOS applications running under Darling therefore use the same exact threading library as on macOS.

Apple's libpthread manages threads through a collection of bsdthread* system calls, which are implemented by Darling. This way Apple's libpthread could operate absolutely independently on Linux.

However, there is a huge catch. Darling could set up threads on its own and everything would work fine, but only as long as no calls were made into native Linux libraries (e.g. PulseAudio). The problem would become obvious as soon as a Linux library performed any thread-related operation on its own: it would crash immediately. This includes many pthread_* calls, as well as thread-local storage access.

Wrapping native libpthread

In Darling, libsystem_kernel uses libelfloader to handle certain bsdthread* system calls. libelfloader, in turn, uses native libpthread to start a thread. Once native libpthread sets up the thread, the control is handed over to Apple's libpthread.

Apple's libpthread needs control over the stack, so we cannot use the stack provided and managed by native libpthread. Therefore we quickly switch to a different stack once we get control from the native libpthread library.

Thread-local storage

Thread-local storage (TLS) is a critical feature enabling storage and retrieval of per-thread data.

In user applications (written in C/C++), TLS variables are typically accessed via pthread_getspecific() and pthread_setspecific() or via special attributes __thread or thread_local (since C++11). However, TLS is no less important even in applications that do not make use of this facility explicitly.

This is because TLS is a much needed functionality used by the threading library libpthread (otherwise pthread_self() would not work) and also in libc itself (errno is typically a per-thread value to avoid races).
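Both ways of using TLS from C mentioned above can be seen in a short pthread program. This is a generic illustration rather than Darling code; the names tls_demo, tls_view and worker are made up for the example.

```c
#include <pthread.h>
#include <stddef.h>

static __thread int tls_counter;  /* compiler-level TLS: one copy per thread */
static pthread_key_t id_key;      /* key for the pthread_*specific() API */
static int main_id = 1, worker_id = 2;

struct tls_view {
    int counter;  /* the calling thread's tls_counter */
    int id;       /* value found through pthread_getspecific() */
};

static void *worker(void *arg) {
    struct tls_view *view = arg;

    pthread_setspecific(id_key, &worker_id);
    for (int i = 0; i < 5; i++)
        tls_counter++;  /* touches only this thread's copy, which starts at 0 */

    view->counter = tls_counter;
    view->id = *(int *)pthread_getspecific(id_key);
    return NULL;
}

/* Runs one worker thread and records what each thread sees.
 * Returns 0 on success, -1 if a pthread call failed. */
int tls_demo(struct tls_view *worker_view, struct tls_view *main_view) {
    pthread_t tid;

    if (pthread_key_create(&id_key, NULL) != 0)
        return -1;
    pthread_setspecific(id_key, &main_id);
    tls_counter = 1;

    if (pthread_create(&tid, NULL, worker, worker_view) != 0)
        return -1;
    pthread_join(tid, NULL);

    /* The worker's writes did not affect the main thread's values. */
    main_view->counter = tls_counter;
    main_view->id = *(int *)pthread_getspecific(id_key);
    pthread_key_delete(id_key);
    return 0;
}
```

After tls_demo() returns, the worker's view holds a counter of 5 and the worker's ID, while the main thread still sees its own counter of 1 and its own ID.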

On 32/64-bit x86, old segment registers from the 16-bit times have found new use thanks to TLS.

TLS setup

  1. When a new thread is being set up, a block of memory is allocated by libpthread. This block of memory is usually what pthread_self() returns.
  2. libpthread asks the kernel to set up a new GDT entry that refers to this memory block.
  3. The entry number is then set into the respective segment register.

Linux specific

On Linux, there are two different system calls used to create GDT entries.

  • set_thread_area() is available on both x86 and x86-64.
  • arch_prctl() is only available on x86-64, but unlike set_thread_area() it also supports 64-bit base addresses.

macOS specific

A machine-dependent ("machdep") system call thread_fast_set_cthread_self() is used to create a new segment. Darling translates this call to one of the above Linux system calls.

TLS usage

The concept of memory segments is very simple. You can imagine that the segment refers to a certain area of your process's virtual memory. Any memory accesses you make via the segment register are relative to the start of that area and are limited by the area's size.

Imagine that you have set up FS to point to an integer array. Now you can set an element whose index is in EAX to value 42:

movl $42, %fs:(,%eax,4)

Registers

While x86 offers a total of six segment registers, only two of them (FS and GS) can be used for TLS (the others being CS, DS, ES and SS). This effectively limits the number of facilities managing TLS areas in a single process to two. One of them is typically used by the platform's native libc, whilst the other one is typically used by foreign libc's (courtesy of Darling and Wine).

The following table explains which registers are used by whom:

System                        TLS register on i386    TLS register on x86-64
Linux libc                    GS                      FS
Apple's libSystem             GS                      GS
Apple's libSystem (Darling)   FS                      GS

This is also why Wine for macOS can never run under Darling: it would always overwrite either the register used by Linux libc or Darling's libSystem.

macOS specifics

Mach ports

Mach ports are the IPC primitives under Mach. They are conceptually similar to Unix pipes, sockets or message queues: using ports, tasks (and the kernel) can send each other messages.

Drawing the analogy with pipes further,

  • A port is like a pipe. It is conceptually a message queue maintained by the kernel — this is where it differs from a pipe, which is an uninterpreted stream of raw bytes.

  • Tasks and the kernel itself can enqueue and dequeue messages to/from a port via a port right to that port. A port right is a handle to a port that allows either sending (enqueuing) or receiving (dequeuing) messages, a lot like a file descriptor connected to either the read or the write end of a pipe. There are the following kinds of port rights:

    • Receive right, which allows receiving messages sent to the port. Mach ports are MPSC (multiple-producer, single-consumer) queues, which means that there may only ever be one receive right for each port in the whole system (unlike with pipes, where multiple processes can all hold file descriptors to the read end of one pipe).
    • Send right, which allows sending messages to the port.
    • Send-once right, which allows sending one message to the port and then disappears.
    • Port set right, which denotes a port set rather than a single port. Dequeuing a message from a port set dequeues a message from one of the ports it contains. Port sets can be used to listen on several ports simultaneously, a lot like select/poll/epoll/kqueue in Unix.
    • Dead name, which is not an actual port right, but merely a placeholder. When a port is destroyed, all existing port rights to the port turn into dead names.
  • A port right name is a specific integer value a task uses to refer to a port right it holds, a lot like a file descriptor that a process uses to refer to an open file. Sending a port right name, the integer, to another task does not allow it to use the name to access the port right, because the name is only meaningful in the context of the port right namespace of the original task.

The above compares both port rights and port right names to file descriptors, because Unix doesn't differentiate between the handle to an open file and the small integer value aspects of file descriptors. Mach does, but even when talking about Mach, it's common to say "a port" or "a port right" when what is actually meant is a port right name that denotes the right to the port. In particular, the mach_port_t C type (aka unsigned int) is actually the type of port right names, not ports themselves (which are implemented in the kernel and have the real type of struct ipc_port).
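The distinction between a port and the names that tasks use for it can be made concrete with a toy model. This is plain C written for illustration only: it is neither the Mach API nor a description of how XNU stores ports, and every name in it (toy_port, toy_task, toy_insert_right, toy_lookup) is hypothetical.

```c
#include <stddef.h>

#define MAX_NAMES 8

/* Stands in for the kernel object; tasks never see it directly. */
struct toy_port {
    int id;
};

/* Each task owns a private table translating names to ports. */
struct toy_task {
    struct toy_port *rights[MAX_NAMES];
};

/* Insert a right into a task's namespace and return its task-local name,
 * or -1 if the table is full. Name 0 is kept unused, mirroring
 * MACH_PORT_NULL. */
int toy_insert_right(struct toy_task *task, struct toy_port *port) {
    for (int name = 1; name < MAX_NAMES; name++) {
        if (task->rights[name] == NULL) {
            task->rights[name] = port;
            return name;
        }
    }
    return -1;
}

/* Resolve a name in the context of one task; NULL if that task does not
 * hold a right under this name. */
struct toy_port *toy_lookup(const struct toy_task *task, int name) {
    if (name <= 0 || name >= MAX_NAMES)
        return NULL;
    return task->rights[name];
}
```

If one task already holds another right, the same port can end up under name 2 there and under name 1 in a second task; passing the bare integer 2 to the second task then resolves to nothing, or to an unrelated port, which is why rights have to be transferred by the kernel rather than copied as numbers.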

Low-level API

It's possible to use Mach ports directly by using the mach_msg() syscall (Mach trap; actually, mach_msg() is a user-space wrapper over the mach_msg_overwrite_trap() trap) and various mach_*() functions provided by the kernel (which are in fact MIG routines, see below).

Here's an example of sending a Mach message between processes:

// File: sender.c

#include <stdio.h>
#include <string.h>

#include <mach/mach.h>
#include <servers/bootstrap.h>

int main() {

    // Lookup the receiver port using the bootstrap server.
    mach_port_t port;
    kern_return_t kr = bootstrap_look_up(bootstrap_port, "org.darlinghq.example", &port);
    if (kr != KERN_SUCCESS) {
        printf("bootstrap_look_up() failed with code 0x%x\n", kr);
        return 1;
    }
    printf("bootstrap_look_up() returned port right name %d\n", port);


    // Construct our message.
    struct {
        mach_msg_header_t header;
        char some_text[10];
        int some_number;
    } message;

    message.header.msgh_bits = MACH_MSGH_BITS(MACH_MSG_TYPE_COPY_SEND, 0);
    message.header.msgh_remote_port = port;
    message.header.msgh_local_port = MACH_PORT_NULL;

    strncpy(message.some_text, "Hello", sizeof(message.some_text));
    message.some_number = 35;

    // Send the message.
    kr = mach_msg(
        &message.header,  // Same as (mach_msg_header_t *) &message.
        MACH_SEND_MSG,    // Options. We're sending a message.
        sizeof(message),  // Size of the message being sent.
        0,                // Size of the buffer for receiving.
        MACH_PORT_NULL,   // A port to receive a message on, if receiving.
        MACH_MSG_TIMEOUT_NONE,
        MACH_PORT_NULL    // Port for the kernel to send notifications about this message to.
    );
    if (kr != KERN_SUCCESS) {
        printf("mach_msg() failed with code 0x%x\n", kr);
        return 1;
    }
    printf("Sent a message\n");
}

// File: receiver.c

#include <stdio.h>

#include <mach/mach.h>
#include <servers/bootstrap.h>

int main() {

    // Create a new port.
    mach_port_t port;
    kern_return_t kr = mach_port_allocate(mach_task_self(), MACH_PORT_RIGHT_RECEIVE, &port);
    if (kr != KERN_SUCCESS) {
        printf("mach_port_allocate() failed with code 0x%x\n", kr);
        return 1;
    }
    printf("mach_port_allocate() created port right name %d\n", port);


    // Give us a send right to this port, in addition to the receive right.
    kr = mach_port_insert_right(mach_task_self(), port, port, MACH_MSG_TYPE_MAKE_SEND);
    if (kr != KERN_SUCCESS) {
        printf("mach_port_insert_right() failed with code 0x%x\n", kr);
        return 1;
    }
    printf("mach_port_insert_right() inserted a send right\n");


    // Send the send right to the bootstrap server, so that it can be looked up by other processes.
    kr = bootstrap_register(bootstrap_port, "org.darlinghq.example", port);
    if (kr != KERN_SUCCESS) {
        printf("bootstrap_register() failed with code 0x%x\n", kr);
        return 1;
    }
    printf("bootstrap_register()'ed our port\n");


    // Wait for a message.
    struct {
        mach_msg_header_t header;
        char some_text[10];
        int some_number;
        mach_msg_trailer_t trailer;
    } message;

    kr = mach_msg(
        &message.header,  // Same as (mach_msg_header_t *) &message.
        MACH_RCV_MSG,     // Options. We're receiving a message.
        0,                // Size of the message being sent, if sending.
        sizeof(message),  // Size of the buffer for receiving.
        port,             // The port to receive a message on.
        MACH_MSG_TIMEOUT_NONE,
        MACH_PORT_NULL    // Port for the kernel to send notifications about this message to.
    );
    if (kr != KERN_SUCCESS) {
        printf("mach_msg() failed with code 0x%x\n", kr);
        return 1;
    }
    printf("Got a message\n");

    message.some_text[9] = 0;
    printf("Text: %s, number: %d\n", message.some_text, message.some_number);
}

Darling [/tmp]$ ./receiver
mach_port_allocate() created port right name 2563
mach_port_insert_right() inserted a send right
bootstrap_register()'ed our port

# in another terminal:
Darling [/tmp]$ ./sender
bootstrap_look_up() returned port right name 2563
Sent a message

# back in the first terminal:
Got a message
Text: Hello, number: 35

As you can see, in this case the kernel decided to use the same number (2563) for port right names in both processes. This cannot be relied upon; in general, the kernel is free to pick any unused names for new ports.

Ports as capabilities

In addition to custom "inline" data as shown above, it's possible for a message to contain:

  • Out-of-line data that will be copied to a new virtual memory page in the receiving task. If possible (i.e. if the out-of-line data is page-aligned in the sending task), Mach will use copy-on-write techniques to pass the data to the receiving task without actually copying it.
  • Port rights that will be sent to the receiving task. This way, it's possible for a task to transfer its port rights to another task. This is how bootstrap_register() and bootstrap_look_up() work in the above example.

The only way for a task to get a port right for a port is to either create that port, or have some other task (or the kernel) send it the right. In this way, Mach ports are capabilities (as in capability-based security).

Among other things, that means that for Mach programs (including the kernel itself) which allow other tasks to ask it to perform operations by sending messages to a port the program listens on, it's idiomatic not to explicitly check if the sender has any specific "permissions" to perform this task; just having a send right to the port is considered enough of a permission. For example, you can manipulate any task (with calls like vm_write() and thread_create()) as long as you can get its task port; it doesn't matter if you're root or not and so on (and indeed, UIDs and other Unix security measures don't exist on the Mach level).

Where to get ports

It would make a lot of sense to make Mach ports inheritable across fork() and exec() like file descriptors are (and that is the way it works on the Hurd), but on Darwin, they're not. A task starts with a fresh port right namespace after either fork() or exec(), except for a few special ports that are inherited:

  • Task port (aka kernel port), a send right to a port whose receive right is held by the kernel. This port allows manipulating the task it refers to, including reading and writing its virtual memory, creating and otherwise manipulating its threads, and terminating (killing) the task (see task.defs, mach_vm.defs and vm_map.defs). Call mach_task_self() to get the name for this port for the caller task. This port is only inherited across exec(); a new task created with fork() gets a new task port (as a special case, a task also gets a new task port after exec()ing a suid binary). The Mach task_create() function that allows creating new tasks returns the task port of the new task to the caller; but it's unavailable (always returns KERN_FAILURE) on Darwin, so the only way to spawn a task and get its port is to perform the "port swap dance" while doing a fork().

  • Host port, a send right to another port whose receive right is held by the kernel. The host port allows getting information about the kernel and the host machine, such as the OS (kernel) version, number of processors and memory usage statistics (see mach_host.defs). Get it using mach_host_self(). There also exists a "privileged host control port" (host_priv_t) that allows privileged tasks (aka processes running as root) to control the host (see host_priv.defs). The official way to get it is by calling host_get_host_priv_port() passing the "regular" host port; in reality it returns either the same port name (if the task is privileged) or MACH_PORT_NULL (if it's not).

  • Task name port, an Apple extension, an unprivileged version of the task port. It references the task, but does not allow controlling it. The only thing that seems to be available through it is task_info().

  • Bootstrap port, a send right to the bootstrap server (on Darwin, this is launchd). The bootstrap server serves as a server registry, allowing other servers to export their ports under well-known reverse-DNS names (such as com.apple.system.notification_center), and other tasks to look up those ports by these names. The bootstrap server is the primary way to "connect" to another task, comparable to D-Bus under Linux. The bootstrap port for the current task is available in the bootstrap_port global variable. On the Hurd, the filesystem serves as the server registry, and they use the bootstrap port for passing context to translators instead.

  • Seatbelt port, task access port, debug control port, which are yet to be documented.

In addition to using the *_self() traps and other methods mentioned above, you can get all these ports by calling task_get_special_port(), passing in the task port (for the caller task or any other task) and an identifier of the port (TASK_BOOTSTRAP_PORT and so on). There also exist wrapper macros (task_get_bootstrap_port() and so on) that pass the right identifier automatically.

There also exists task_set_special_port() (and the wrapper macros) that allows you to change the special ports for a given task to any send right you provide. mach_task_self() and all the other APIs discussed above will, in fact, return these replaced ports rather than the real ports for the task, host and so on. This is a powerful mechanism that can be used, for example, to prevent a task from manipulating itself, to log and forward any messages it sends and receives (Hurd's rpctrace), or to make it believe it's running on a different version of the kernel or on another host. This is also how tasks get the bootstrap port: the first task (launchd) starts with a null bootstrap port, allocates a port and sets it as the bootstrap port for the tasks it spawns.

There also are two special sets of ports that tasks inherit:

  • Registered ports, up to 3 of them. They can be registered using mach_ports_register() and later looked up using mach_ports_lookup(). Apple's XPC uses these to pass down its "XPC bootstrap port".
  • Exception ports, which are the ports where the kernel should send info about exceptions happening during the task execution (for example, Unix's Segmentation Fault corresponds to EXC_BAD_ACCESS). This way, a task can handle its own exceptions or let another task handle its exceptions. This is how the Crash Reporter.app works on macOS, and this is also what LLDB uses. Under Darling, the Linux kernel delivers these kinds of events as Unix signals to the process that they happen in, then the process converts the received signals to Mach exceptions and sends them to the correct exception port (see sigexc.c).

As a Darwin extension, there are pid_for_task(), task_for_pid(), and task_name_for_pid() syscalls that allow converting between Mach task ports and Unix PIDs. Since they essentially circumvent the capability model (PIDs are just integers in the global namespace), they are crippled on iOS/macOS with UID checks, entitlements, SIP, etc. limiting their use. On Darling, they are unrestricted.

Similarly to task ports, threads have their corresponding thread ports, obtainable with mach_thread_self() and interposable with thread_set_special_port().

Higher-level APIs

Most programs don't construct Mach messages manually and don't call mach_msg() directly. Instead, they use higher-level APIs or tools that wrap Mach messaging.

MIG

MIG stands for Mach Interface Generator. It takes interface definitions like this example:

// File: window.defs

subsystem window 35000;

#include <mach/std_types.defs>

routine create_window(
    server: mach_port_t;
    out window: mach_port_t);

routine window_set_frame(
    window: mach_port_t;
    x: int;
    y: int;
    width: int;
    height: int);

And generates the C boilerplate for serializing and deserializing the calls and sending/handling the Mach messages. Here's a (much simplified) version of the client code that it generates:

// File: windowUser.c

/* Routine create_window */
kern_return_t create_window(mach_port_t server, mach_port_t *window) {

	typedef struct {
		mach_msg_header_t Head;
	} Request;

	typedef struct {
		mach_msg_header_t Head;
		/* start of the kernel processed data */
		mach_msg_body_t msgh_body;
		mach_msg_port_descriptor_t window;
		/* end of the kernel processed data */
		mach_msg_trailer_t trailer;
	} Reply;

	union {
		Request In;
		Reply Out;
	} Mess;

	Request *InP = &Mess.In;
	Reply *Out0P = &Mess.Out;

	mach_msg_return_t msg_result;

	InP->Head.msgh_bits = MACH_MSGH_BITS(MACH_MSG_TYPE_COPY_SEND, MACH_MSG_TYPE_MAKE_SEND_ONCE);
	/* msgh_size passed as argument */
	InP->Head.msgh_request_port = server;
	InP->Head.msgh_reply_port = mig_get_reply_port();
	InP->Head.msgh_id = 35000;
	InP->Head.msgh_reserved = 0;

	msg_result = mach_msg(
		&InP->Head,
		MACH_SEND_MSG|MACH_RCV_MSG|MACH_MSG_OPTION_NONE, 
		(mach_msg_size_t) sizeof(Request),
		(mach_msg_size_t) sizeof(Reply),
		InP->Head.msgh_reply_port,
		MACH_MSG_TIMEOUT_NONE,
		MACH_PORT_NULL
	);

	if (msg_result != MACH_MSG_SUCCESS) {
		return msg_result;
	}

	*window = Out0P->window.name;
	return KERN_SUCCESS;
}

/* Routine window_set_frame */
kern_return_t window_set_frame(mach_port_t window, int x, int y, int width, int height) {

	typedef struct {
		mach_msg_header_t Head;
		int x;
		int y;
		int width;
		int height;
	} Request;

	typedef struct {
		mach_msg_header_t Head;
		mach_msg_trailer_t trailer;
	} Reply;

	union {
		Request In;
		Reply Out;
	} Mess;

	Request *InP = &Mess.In;
	Reply *Out0P = &Mess.Out;

	mach_msg_return_t msg_result;

	InP->x = x;
	InP->y = y;
	InP->width = width;
	InP->height = height;
	InP->Head.msgh_bits = MACH_MSGH_BITS(MACH_MSG_TYPE_COPY_SEND, MACH_MSG_TYPE_MAKE_SEND_ONCE);
	/* msgh_size passed as argument */
	InP->Head.msgh_request_port = window;
	InP->Head.msgh_reply_port = mig_get_reply_port();
	InP->Head.msgh_id = 35001;
	InP->Head.msgh_reserved = 0;

	msg_result = mach_msg(
		&InP->Head,
		MACH_SEND_MSG|MACH_RCV_MSG|MACH_MSG_OPTION_NONE,
		(mach_msg_size_t) sizeof(Request),
		(mach_msg_size_t) sizeof(Reply), 
		InP->Head.msgh_reply_port,
		MACH_MSG_TIMEOUT_NONE,
		MACH_PORT_NULL
	);

	if (msg_result != MACH_MSG_SUCCESS) {
		return msg_result;
	}

	return KERN_SUCCESS;
}

To get the reply from the server, MIG includes a port right (called the reply port) in the message, and then performs a send on the server port and a receive on the reply port with a single mach_msg() call. The client keeps the receive right for the reply port, while the server gets sent a send-once right. This way, even though MIG reuses a single (per-thread) reply port for all the servers it talks to, servers can't impersonate each other.

And the corresponding server code, also simplified:

// File: windowServer.c

/* Routine create_window */
extern kern_return_t create_window(mach_port_t server, mach_port_t *window);

/* Routine create_window */
mig_internal novalue _Xcreate_window(mach_msg_header_t *InHeadP, mach_msg_header_t *OutHeadP) {

	typedef struct {
		mach_msg_header_t Head;
		mach_msg_trailer_t trailer;
	} Request;

	typedef struct {
		mach_msg_header_t Head;
		/* start of the kernel processed data */
		mach_msg_body_t msgh_body;
		mach_msg_port_descriptor_t window;
		/* end of the kernel processed data */
	} Reply;

	Request *In0P = (Request *) InHeadP;
	Reply *OutP = (Reply *) OutHeadP;


	kern_return_t RetCode;

	OutP->window.disposition = MACH_MSG_TYPE_COPY_SEND;
	OutP->window.pad1 = 0;
	OutP->window.pad2 = 0;
	OutP->window.type = MACH_MSG_PORT_DESCRIPTOR;

	RetCode = create_window(In0P->Head.msgh_request_port, &OutP->window.name);

	if (RetCode != KERN_SUCCESS) {
		MIG_RETURN_ERROR(OutP, RetCode);
	}

	OutP->Head.msgh_bits |= MACH_MSGH_BITS_COMPLEX;
	OutP->Head.msgh_size = (mach_msg_size_t) (sizeof(Reply));
	OutP->msgh_body.msgh_descriptor_count = 1;
}


/* Routine window_set_frame */
extern kern_return_t window_set_frame(mach_port_t window, int x, int y, int width, int height);

/* Routine window_set_frame */
mig_internal novalue _Xwindow_set_frame(mach_msg_header_t *InHeadP, mach_msg_header_t *OutHeadP) {

	typedef struct {
		mach_msg_header_t Head;
		int x;
		int y;
		int width;
		int height;
		mach_msg_trailer_t trailer;
	} Request;

	typedef struct {
		mach_msg_header_t Head;
	} Reply;

	Request *In0P = (Request *) InHeadP;
	Reply *OutP = (Reply *) OutHeadP;

	window_set_frame(In0P->Head.msgh_request_port, In0P->x, In0P->y, In0P->width, In0P->height);
}


/* Description of this subsystem, for use in direct RPC */
const struct window_subsystem {
	mach_msg_id_t	start;		/* Min routine number */
	mach_msg_id_t	end;		/* Max routine number + 1 */
	unsigned int	maxsize;	/* Max msg size */
	struct routine_descriptor	/* Array of routine descriptors */
		routine[2];
} window_subsystem = {
	35000,
	35002,
	(mach_msg_size_t) sizeof(union __ReplyUnion__window_subsystem),
	{
		_Xcreate_window,
		_Xwindow_set_frame
	}
};

boolean_t window_server(mach_msg_header_t *InHeadP, mach_msg_header_t *OutHeadP) {
	mig_routine_t routine;

	OutHeadP->msgh_bits = MACH_MSGH_BITS(MACH_MSGH_BITS_REPLY(InHeadP->msgh_bits), 0);
	OutHeadP->msgh_remote_port = InHeadP->msgh_reply_port;
	/* Minimal size: routine() will update it if different */
	OutHeadP->msgh_size = (mach_msg_size_t) sizeof(mig_reply_error_t);
	OutHeadP->msgh_local_port = MACH_PORT_NULL;
	OutHeadP->msgh_id = InHeadP->msgh_id + 100;
	OutHeadP->msgh_reserved = 0;

	if ((InHeadP->msgh_id > 35001) || (InHeadP->msgh_id < 35000)) {
		return FALSE;
	}
	routine = window_subsystem.routine[InHeadP->msgh_id - 35000];
	(*routine) (InHeadP, OutHeadP);
	return TRUE;
}

Client-side usage looks just like invoking the routines:

mach_port_t server_port = ...;
mach_port_t window;
kern_return_t kr;

kr = create_window(server_port, &window);

kr = window_set_frame(window, 50, 55, 200, 100);

And on the server side, you implement the corresponding functions and then call mach_msg_server():

kern_return_t create_window(mach_port_t server, mach_port_t *window) {
    // ...
}

kern_return_t window_set_frame(mach_port_t window, int x, int y, int width, int height) {
    // ...
}

int main() {
    mach_port_t server_port = ...;
    mach_msg_server(window_server, window_subsystem.maxsize, server_port, 0);
}

MIG supports a number of useful options and features. It's used extensively in Mach for RPC (remote procedure calls), including for communication between the kernel and userspace. Other than a few direct Mach traps such as mach_msg() itself, Mach kernel API functions (such as the ones for task and port manipulation) are in fact MIG routines.

Distributed Objects

Distributed Objects is a dynamic RPC platform, as opposed to MIG, which is fundamentally based on static code generation. Distributed objects allows you to transparently use Objective-C objects from other processes as if they were regular Objective-C objects.

// File: server.m

NSConnection *connection = [NSConnection defaultConnection];
[connection setRootObject: myAwesomeObject];
[connection registerName: @"org.darlinghq.example"];
[[NSRunLoop currentRunLoop] run];

// File: client.m

NSConnection *connection =
    [NSConnection connectionWithRegisteredName: @"org.darlinghq.example"
                                          host: nil];

MyAwesomeObject *proxy = [[connection rootProxy] retain];
[proxy someMethod];

There is no need to statically generate any code for this; it all works at runtime through the magic of the Objective-C message forwarding infrastructure. Methods you call may return other objects (or take them as arguments), and the Distributed Objects support code will automatically either send a copy of the object to the remote process (using NSCoding) or proxy methods called on those objects as well.

XPC

XPC is a newer IPC framework from Apple, tightly integrated with launchd. Its lower-level C API allows processes to exchange plist-like data via Mach messages. The higher-level Objective-C API (NSXPC*) exports a proxying interface similar to Distributed Objects. Unlike Distributed Objects, it's asynchronous, doesn't try to hide the possibility of connection errors, and only allows passing whitelisted types (to prevent certain kinds of attacks).

Apple's XPC is not open source. On Darling, the low-level XPC implementation (libxpc) is based on NextBSD libxpc. The high-level Cocoa APIs are not yet implemented.

Useful resources

Note that Apple's version of Mach as used in XNU/Darwin is subtly different than both OSF Mach and GNU Mach.

Mach Exceptions

This section documents how Unix signals and Mach exceptions interact within XNU.

Unlike XNU, Linux uses Unix signals for all purposes, i.e. for both typically hardware-generated (e.g. SIGSEGV) and software-generated (e.g. SIGINT) signals.

Hardware Exceptions

In XNU, hardware (CPU) exceptions are handled by the Mach part of the kernel (exception_triage() in osfmk/kern/exception.c). As a result of such an exception, a Mach exception is generated and delivered to the exception port of the given task.

By default - i.e. when the debugger hasn't replaced the task's exception ports - these Mach exceptions make their way into bsd/uxkern/ux_exception.c, which contains a kernel thread receiving and translating Mach exceptions into BSD signals (via ux_exception()). This translation takes into account hardware specifics, also by calling machine_exception() in bsd/dev/i386/unix_signal.c. Finally, the BSD signal is sent using threadsignal() in bsd/kern/kern_sig.c.

Software Signals

Software signal processing in XNU follows the usual pattern from BSD/Linux.

However, if ptrace(PT_ATTACHEXC) was called on the target process, the signal is also translated into a Mach exception via do_bsdexception(), the pending signal is removed and the process is set to a stopped state (SSTOP).

The debugger then has to call ptrace(PT_THUPDATE) to set the Unix signal number to be delivered to the target process (which could also be 0, i.e. no signal) and that also resumes the process (state SRUN).

Debugging under Darling

Debugging support in Darling makes use of what we call "cooperative debugging". It means the code in the debuggee is aware it's being debugged and actively assists the process. In Darling, this role is taken on mainly by sigexc.c in libsystem_kernel.dylib, so no application modifications are necessary.

macOS debuggers use a combination of BSD-like and Mach APIs to control and inspect the debuggee.

To emulate the macOS behavior, Darling makes use of POSIX real-time signals to invoke actions in the cooperative debugging code.

| Operation | macOS | Linux | Darling implementation |
|---|---|---|---|
| Attach to debuggee | ptrace(PT_ATTACHEXC). Causes the kernel to redirect all signals (aka exceptions) to the Mach "exception port" of the process. Only debuggee termination is notified via wait(). | ptrace(PTRACE_ATTACH). Signals sent to the debuggee and the debuggee termination event are received in the debugger via wait(). | Notify the LKM that we will be tracing the process. Send a RT signal to the debuggee to notify it of the situation. The debuggee sets up handlers to handle all signals and forward them to the exception port. |
| Examine registers | thread_get_state(X86_THREAD_STATE) | ptrace(PTRACE_GETREGS) | Upon receiving a signal, the debuggee reads its own register state and passes it to the kernel via thread_set_state(). |
| Pausing the debuggee | kill(SIGSTOP) | kill(SIGSTOP) or ptrace(PTRACE_INTERRUPT) | Send a RT signal to the debuggee that it should act as if SIGSTOP were sent to the process. We cannot send a real SIGSTOP, because then the debuggee couldn't provide/update register state to the debugger etc. |
| Change signal delivery | ptrace(PT_THUPDATE) | ptrace(PTRACE_rest) | Send a RT signal to the debuggee to inform it what it should do with the signal (ignore, pass it to the application etc.) |
| Set memory watchpoints | thread_set_state(X86_DEBUG_STATE) | ptrace(PTRACE_POKEUSER) | Implement the effects of PTRACE_POKEUSER in the LKM. |

Resources

Commpage

Commpage is a special memory structure that is always located at the same address in all processes. On macOS, this mapping is provided by the kernel. In the case of Darling, this functionality is supplemented by the kernel module.

Purpose

  • CPU information (number of cores, CPU capabilities etc.)
  • Current precise date and time (this information is not filled in by Darling, causing a fall back to a system call).
  • PFZ - preemption-free zone. Contains a piece of code which when run prevents the process from being preempted. This is used for a lock-free implementation of certain primitives (e.g. OSAtomicFifoEnqueue). (Not available under Darling.)

It is somewhat related to vDSO on Linux, except that vDSO behaves like a real library, while commpage is just a chunk of data.

The commpage is not documented anywhere, meaning it's not an API intended to be used by 3rd party software. It is however used in source code provided on opensource.apple.com. Darling provides a commpage for compatibility reasons.

Location

The address differs between 32-bit and 64-bit systems.

  • 32-bit systems: 0xffff0000
  • 64-bit systems: 0x7fffffe00000

Note that the 32-bit address is outside of permissible user space area on 32-bit Linux kernels. This is why Darling runs only under 64-bit kernels, which don't have this limitation.

Distributed Objects

Here's how the Distributed Objects are structured internally:

  • NSPort is a type that abstracts away a port -- something that can receive and send messages (NSPortMessage). The few commonly used concrete port classes are NSMachPort (Mach port), NSSocketPort (network socket, possibly talking to another host), and NSMessagePort (another wrapper over Mach ports). With some luck, it's possible to use a custom port class, too.

    NSPort is really supposed to be a Mach port (and that's what [NSPort port] creates), while other port types have kind of been retrofitted on top of the existing Mach port semantics. NSPort itself conforms to NSCoding, so you can "send" a port over another port (it does not support coders other than NSPortCoder).

    Some NSPort subclasses may not fully work unless you're using them for Distributed Objects (with an actual NSConnection *).

  • NSPortMessage roughly describes a Mach message. It has "send" and "receive" ports, a msgid, and an array of components. Individual components can be either data (NSData) or a port (NSPort), corresponding to MACH_MSG_OOL_DESCRIPTOR and MACH_MSG_PORT_DESCRIPTOR. Passing a port will only work with ports of the same type as the port you're sending this message through.

  • NSPortNameServer abstracts away a name server that you can use to map (string) names to ports. You can register your port for a name and look up other ports by name. NSMachBootstrapServer implements this interface on top of the Mach bootstrap server (launchd on Darwin).

  • NSPortCoder is an NSCoder that essentially serializes and deserializes data (and ports) to and from port messages, using the same NSCoding infrastructure a lot of types already implement. Unlike other coders (read: archivers), it supports encoding and decoding ports, though this is mostly useless as DO itself makes very little use of this.

    NSPortCoder itself is an abstract class. In older versions of OS X, it had a concrete subclass, NSConcretePortCoder, which only supported non-keyed (also called unkeyed) coding. In newer versions, NSConcretePortCoder itself is now an abstract class, and it has two subclasses, NSKeyedPortCoder and NSUnkeyedPortCoder.

    NSPortCoder subclasses send the replacementObjectForPortCoder: message to the objects they encode. The default implementation of that method replaces the object with an NSDistantObject (a Distributed Objects proxy), no matter whether the type conforms to NSCoding or not. Some DO-aware classes that wish to be encoded differently (for example, NSString) override that method, and usually do something like this:

    - (id) replacementObjectForPortCoder: (NSPortCoder *) portCoder {
        if ([portCoder isByref]) {
            return [super replacementObjectForPortCoder: portCoder];
        }
        return self;
    }
    
  • NSDistantObject is a proxy (NSProxy) that stands in for a remote object and forwards any messages over to the remote object. The same NSDistantObject type is returned by the default NSObject implementation of replacementObjectForPortCoder:, and this instance is what gets serialized and sent over the connection (and deserialized to an NSDistantObject on the other end).

  • NSConnection represents a connection (described by a pair of ports). NSConnection stores local and remote object maps, to intern created proxies (NSDistantObjects) when receiving or sending objects. NSConnection actually implements forwarding method invocations, and also serves as a port's delegate, handling received messages.

You would think that NSPort, NSPortMessage, NSPortNameServer, and NSPortCoder do not "know" they're being used for DO/RPC, and are generic enough to be used for regular communication directly. This is almost true, but DO specifics pop up in unexpected places.

Resources

  • https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/DistrObjects/DistrObjects.html
  • GNUstep implementation

darlingserver

darlingserver is Darling's userspace kernel server, much like wineserver for Wine. It implements some services that the XNU kernel would normally provide on macOS.

Because of the types of services it provides and the way it implements some of those services, darlingserver is very low-level and is a fairly complex beast. This can be a major hurdle for new developers and contributors. Therefore, the goal of this documentation is to explain as much of darlingserver's architecture and implementation as possible to make it easier to work on.

Basic Design and Architecture Decisions

Use Parts of XNU When Possible

The first and most important thing to understand about darlingserver's architecture is that it includes part of the XNU kernel as part of its sources. The reasoning behind this decision can be summed up as "why reinvent the wheel when you can just use it instead?"

XNU is open-source and implements many of the services darlingserver needs to provide, so we can just use it to implement them. In fact, we've actually already tried reinventing the wheel in the past with an older version of the LKM by implementing those services ourselves. It turned out to be much easier to just duct-tape the necessary XNU code, which was the approach used in the now-defunct LKM.

Avoid Wasting Resources

This sounds like a rather obvious decision, but with something as complex as darlingserver, this is certainly a conscious design choice that must be kept in mind for all other design and implementation decisions. While darlingserver is certainly not as optimized as it could be, it strives to avoid wasting resources like file descriptors, CPU time, threads, and memory, among other things.

For example, rather than create a real Linux thread for each thread the server manages, we use userspace threads called microthreads (a.k.a. fibers) and perform cooperative scheduling between them ourselves. Using real Linux threads would greatly simplify thread management within darlingserver, but Linux threads are much heavier resource-wise and creating one for each thread the server manages would waste too many resources unnecessarily.

Actual Design and Implementation

With the overarching design decisions in mind, it should be easier to understand why certain things are implemented the way they are. Feel free to read up on these topics in whatever order you like:

RPC

A key component of darlingserver is remote procedure calls, or RPC, between the server and its client processes.

darlingserver RPC is implemented with Unix datagram sockets. Each managed thread has its own socket to communicate with the server. This is done so that each thread has its own address and can communicate independently with the server.

An alternative implementation would be to use Unix seq-packet sockets, which would simplify some aspects of process and thread lifetime management. However, with seq-packet sockets, the server needs to open a separate descriptor for each connection, whereas it only needs one single socket with the datagram-based implementation. Thus, to reduce server-side descriptor usage, the datagram-based implementation was chosen.

Boilerplate/Wrapper Generation Script

Because Unix datagram IPC requires a lot of boilerplate code on both sides, a script (generate-rpc-wrappers.py) is used to automatically generate this boilerplate code for both the client and server. The client can then just invoke a wrapper for each call and let the wrapper worry about sending the call and receiving the reply. Likewise, the server can focus on implementing the call and just let the helper methods take care of creating the reply.

In addition to sending and receiving message content/data, the wrappers generated by this script automatically take care of passing file descriptors back and forth using SCM_RIGHTS ancillary data. Both the client and the server can easily send any number of FDs using the wrappers.

Client Side

On the client side, the script generates wrappers for each call that provide a simple call interface and take care of all the IPC plumbing behind the scenes. The basic structure of each wrapper (also explained in System Call Emulation) is as follows:

int dserver_rpc_CALL_NAME_HERE(int arg1, char arg2 /* and arg3, arg4, etc. */, int* return_val1, char* return_val2 /* and return_val3, return_val4, etc. */);

All return value pointers are optional (NULL can be passed if you don't care about the value). As for the actual return value of the wrapper (int), that is returned by all wrappers and it is actually an error code. If it is negative, it indicates an internal RPC error occurred (e.g. communication with the server was interrupted, the server died, the socket was closed, invalid reply, etc.). If it is positive, it is an error code from the server for that specific call (e.g. ESRCH if a certain process or thread wasn't found, ENOENT if a particular file wasn't found, etc.). As with most error codes, 0 indicates success.

For example, the following call entry in the RPC wrapper generator script:

('mldr_path', [
  ('buffer', 'char*', 'uint64_t'),
  ('buffer_size', 'uint64_t'),
], [
  ('length', 'uint64_t'),
]),

produces the following wrapper prototype:

int dserver_rpc_mldr_path(char* buffer, uint64_t buffer_size, uint64_t* out_length);

This function accepts a pointer to a buffer for the server to write the path to mldr into and the size of this buffer, and then it returns the actual length of the path (the full length, even if it was longer than the given buffer). The char* buffer argument is internally converted to a uint64_t for RPC so that the server can receive the full pointer no matter what pointer size the client process is using (e.g. 32-bit pointers). The server receives the pointer value and is in charge of writing to the buffer using the pointer. With the length return value, however, the value is sent as a serialized value in the RPC reply; the wrappers take care of this and the server does not receive the pointer value directly.

RPC Hooks

The code generated by the script is only half the story, however. The other half lies within the RPC hooks that the generated code requires in order to actually send and receive messages. The reason this is done this way is so that client-side RPC can be used in different environments: currently, this means it can be used in both libsystem_kernel (which operates in a Darwin environment) and mldr (which operates in a Linux environment).

The client RPC hooks provide functions for actually sending and receiving messages to and from the client socket, printing debug information in the case of an error, and retrieving process and architecture information, among other things. Additionally, they also provide environment-specific definitions for the wrapper code to use for things like msghdr, iovec, and cmsg structures and sizes as well as constants like SCM_RIGHTS and SOL_SOCKET.

Client RPC hooks are also responsible for handling S2C calls from the server. See the S2C Calls section for more information.

Interrupts/Signals

Most RPC calls are uninterruptible: once the call is started, the thread cannot be interrupted by a signal until the reply is received. This simplifies the code needed to perform the call and allows it to behave much like a regular function call. This is okay because most calls don't require the server to wait or sleep (locks notwithstanding). However, some calls include long periods of waiting where it would not be okay to wait uninterruptibly. For those calls, the ALLOW_INTERRUPTIONS flag in the RPC wrapper generator script indicates that, if any syscalls return EINTR (e.g. sendmsg and recvmsg), the call should be aborted and -EINTR should be returned.

Another interrupt-related thing that the RPC wrappers handle is out-of-order replies. However, the handling of this is only enabled for a specific pair of calls (interrupt_enter and interrupt_exit). See Interrupts for more information.

Server Side

The server side portion of darlingserver RPC is more complex than that of the client side. Unlike the wrapper code generated for the client side, the code generated for the server side can be very different depending on the flags set on the call entry in the script. There are three main types of calls the server can handle: Mach traps, BSD traps, and generic server calls.

Mach and BSD Traps

Calls marked as Mach traps with XNU_TRAP_CALL in the script produce server-side code that automatically calls the corresponding Mach trap from a duct-taped context. These calls do not return any values separately; all their return values are written directly into client pointers because that's how those calls behave in the XNU kernel. Calls marked as BSD traps behave similarly, except that they do return one value: a status code that's only valid when the call succeeds; this is because BSD traps return 2 status codes: one for success and one for failure.

Generic Server Calls

All other calls are treated as generic server calls that have handlers on the C++ side. The basic structure for each handler is:

void DarlingServer::Call::PascalCaseCallNameHere::processCall() {
  /* ... */
}

The generator script creates a _body member for each call that contains the message data structure from the client. For example, for the mldr_path call we saw earlier, the _body structure contains buffer (a uint64_t) and buffer_size (another uint64_t).

Each call class also has a _thread member that is a weak pointer to the thread that made the call. See Threads, processes, and microthreads for more details on the information available in the Thread class, including how to write to client memory directly (usually into pointers received from clients in call arguments).

Finally, each call class has a _sendReply method whose prototype depends on that call's specific interface. All _sendReply methods accept the call status code as their first argument, but additional arguments (if any) depend on the call's return values as specified in the generator script. This function must be called in order for the call's reply to be sent to the client; if it is never called, the client will be left waiting forever.

For example, for the mldr_path call, the _sendReply method accepts an additional length parameter (a uint64_t). This will be sent back to the client in the reply.

Conventions

Status Code Sign Indicates Origin

As noted earlier, negative status codes from RPC calls indicate internal RPC errors (usually fatal, but sometimes not, e.g. -EINTR) whereas positive status codes indicate error codes specific to each call. As always, 0 indicates success.

Strings/Buffer Arguments

String/buffer arguments should come in pairs of arguments, one for the string/buffer address and the other for the size/length. Additionally, for cases where the server is writing into the given string/buffer, you should add a return value that indicates how much data the server wanted to write—not how much data was actually written. This is done so that clients can retry the call with a bigger string/buffer if necessary.

S2C Calls

Most RPC calls originate from the client to the server. However, there is a small number of calls that the server makes to the client instead. These calls are used to access client resources that the server cannot access directly on Linux systems.

For example, on macOS, any process with the VM (virtual memory) port for another process is able to map, unmap, and protect memory (among other things) in that other process. However, on Linux, there is no way to perform these actions on another process' virtual memory. Therefore, whenever the server needs to perform these actions in a client process, it makes an S2C call to ask the client to perform those actions on itself on the server's behalf.

Request Methods

The server can request for a client to perform an S2C call by either sending it a special reply to an ongoing call or by sending it an S2C signal. Which method is chosen depends on the server-side state of the target thread: if it is currently processing a call, the special reply method is used; otherwise, the S2C signal method is used.

The special reply method is straightforward: the client is waiting for a reply from the server for a particular call that it made. When the client receives the special reply from the server, it knows to handle this differently and execute the S2C call using the information in the reply. It then sends back the result and continues waiting for a reply to the original call.

The S2C signal method sends a special POSIX real-time signal to the target thread. When it receives this signal, it knows to message the server to receive the details of the S2C call it needs to execute. Once it completes the call and sends back the result, it exits the signal handler and continues with whatever it was doing when it was interrupted.

An alternative method was considered where the client doesn't even need to be aware that an S2C call is occurring. This method uses ptrace to temporarily take over the target thread and have it execute whatever code the server needs, resuming normal execution once complete. The main problem with this method, however, is that there is no asynchronous ptrace API available; any implementation using it would require blocking an actual server thread (not microthread). As such, it was rejected in favor of the asynchronous methods described earlier.

Threads, Processes, and Microthreads

darlingserver needs to keep track of the threads and processes it manages because both darlingserver code and XNU code require this information. Additionally, we use microthreads to execute calls and run XNU code in the context of a particular thread.

Lifetime Management

When a thread or process is created, the first thing it does is create a new client socket and check in with the server to register itself. On the server side, a new thread/process is created and some initialization is performed to set up an XNU context for the thread/process.

Before a thread exits, it informs the server that it's checking out and the server performs some cleanup of the info it had on the thread. When a process exits, however, the server is instead notified by the system via a procfd that the server opens for the process when it registers a new process. When the server is informed that a process has died, it cleans up any server-side threads that the process might still have registered; this can happen, for example, when the program is killed by a signal and doesn't have a chance to inform the server that its threads are dying.

Detecting Forks and Execs

Detecting forks is quite simple: when a client forks, it has to replace its existing socket with a new one so it can have its own separate connection to the server instead of inheriting its parent's connection; in doing so, it has to check in with the server again, so darlingserver can find out that way.

Detecting execs is a little more complicated. When a client is going to perform an exec, it opens a close-on-exec pipe and gives it to the server so that it can wait for an event on this pipe. If the client successfully performs the exec, the pipe is automatically closed by Linux and the server will receive a hang-up event on the pipe. When it sees that the pipe is empty, it knows that the exec succeeded and the process has been replaced. However, if the exec fails, the pipe is not automatically closed by Linux. Instead, the client writes a single byte into the pipe and then closes it. When the server receives the hang-up event and sees the pipe is not empty, it knows the exec failed and the process has not been replaced.
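
The pipe protocol above can be sketched as follows. The function names are hypothetical and the sketch leaves out how the read end is handed to the server; it only shows the close-on-exec pipe, the failure marker byte, and how the reader tells the two outcomes apart after a hang-up.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Client side: create a pipe whose ends are closed automatically by Linux
 * if an exec succeeds. fds[0] is the read end, fds[1] the write end. */
int exec_pipe_create(int fds[2]) {
    assert(fds != NULL);
    return pipe2(fds, O_CLOEXEC);
}

/* Client side: call only after exec has returned, i.e. after it failed.
 * Writes a single marker byte and then closes the write end. */
void exec_pipe_report_failure(int write_fd) {
    const char marker = 1;
    ssize_t written = write(write_fd, &marker, 1);
    (void)written; /* nothing sensible to do if this fails in a sketch */
    close(write_fd);
}

/* Server side: call after the hang-up event on the read end.
 * Returns 1 if the exec succeeded (pipe empty), 0 if it failed
 * (marker byte present), or -1 on a read error. */
int exec_pipe_result(int read_fd) {
    char byte;
    ssize_t count;
    do {
        count = read(read_fd, &byte, 1);
    } while (count < 0 && errno == EINTR);
    if (count < 0) {
        return -1;
    }
    return count == 0 ? 1 : 0;
}
```

Closing the write end by hand has the same effect on the reader as a successful exec, which makes the two outcomes easy to try out without actually calling exec.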

Reading From and Writing To Process Memory

One thing that is often necessary when implementing RPC calls is the ability to read and write directly to/from process memory. Often, this is used to access memory at a pointer given as part of a call argument. This can be achieved using the readMemory and writeMemory functions available in Process instances. These methods make use of Linux's process_vm_readv and process_vm_writev to access the target process' memory.
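
As a rough illustration of the underlying Linux calls, here is a minimal C sketch. The helper names are hypothetical and are not the actual readMemory and writeMemory methods; they only show how process_vm_readv and process_vm_writev take one local and one remote iovec.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: copy `length` bytes from `remote_address` in process
 * `pid` into `buffer`. Returns the number of bytes copied, or -1 with errno
 * set (for example EPERM if the caller may not access the target). */
ssize_t read_process_memory(pid_t pid, const void *remote_address,
                            void *buffer, size_t length) {
    struct iovec local = { .iov_base = buffer, .iov_len = length };
    struct iovec remote = { .iov_base = (void *)remote_address, .iov_len = length };
    assert(buffer != NULL);
    return process_vm_readv(pid, &local, 1, &remote, 1, 0);
}

/* Hypothetical helper: copy `length` bytes from `buffer` to `remote_address`
 * in process `pid`. Same return convention as above. */
ssize_t write_process_memory(pid_t pid, void *remote_address,
                             const void *buffer, size_t length) {
    struct iovec local = { .iov_base = (void *)buffer, .iov_len = length };
    struct iovec remote = { .iov_base = remote_address, .iov_len = length };
    assert(buffer != NULL);
    return process_vm_writev(pid, &local, 1, &remote, 1, 0);
}
```

Both calls may transfer fewer bytes than requested, so a caller that needs the whole range should check the return value against the requested length.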

Identifiers

All Darling threads and processes have two identifiers: one that darlingserver sees (the one in its PID namespace) and one that other Darling processes see (the one within the Darling PID namespace). In darlingserver, the first type of ID is just called an ID (identifier); it is sometimes referred to as the global or Linux ID, as well. The second type of ID is called an NSID (namespaced identifier).

Kernel microthreads have NSIDs but they do not have IDs because they have no corresponding Linux thread/process. Additionally, kernel microthread NSIDs are always very high numbers to more easily identify them for debugging purposes. This distinction is not enough to identify a microthread as being a kernel microthread, however; rather, you can tell if a Thread is a kernel microthread if its ID is invalid (e.g. -1) but its NSID is valid.

Thread and Process Registries

darlingserver keeps track of all active processes and threads in two registries (processRegistry and threadRegistry). Each registry can be accessed via either an ID or an NSID.

The process registry does include the kernel process, as well. This is a pseudo-process that does not have an actual Darling process associated with it; it exists mainly for duct-taped XNU code to use and to own all the kernel microthreads. The thread registry also includes all kernel microthreads.

Microthreading

A major component of darlingserver is microthreading. This consists of executing multiple threads using cooperative scheduling within one or more Linux worker threads. As described in the section intro, this approach was chosen because it conserves resources. Currently, due to bugs in the multithreaded implementation, only a single worker thread is used.

Microthreads use setcontext, makecontext, and getcontext to perform cooperative multitasking. Whenever a microthread needs to wait for something, it suspends itself and returns to the worker thread runner; the worker thread runner function is considered the "top" of the thread (and you will see mentions of this in the microthreading code). Microthreads can also optionally suspend with a custom continuation point. By default, when a microthread suspends itself, it will continue executing where it left off once it resumes. However, with a custom continuation point, the old stack is discarded and the microthread will start executing from the continuation point once it resumes. This functionality is used extensively by duct-taped XNU code.

All the information for each microthread is contained within a Thread instance. When running in a microthread, the current Thread and Process instances can be accessed via the currentThread and currentProcess static methods on each respective class.

To further conserve resources, microthreads use a stack pool to limit the number of simultaneously allocated stacks. With this stack pool, we avoid wasting memory on stacks that are currently unused (only microthreads currently executing or suspended normally need stacks) and we also avoid the (slight) additional delay of allocating a brand new stack every time we need one.

Details on the Worker Thread Runner

The worker thread runner function itself is actually very simple: it just invokes the doWork method on the Thread instance it was given. This doWork method is where the core of the microthreading logic is.

The first thing this method does is check whether the microthread is even allowed to run: microthreads can be deferred, already running, terminating, or even dead; if the microthread is in any of these states, doWork simply returns.

The next thing this method does is set up some context for the microthread to run: it sets the _running flag for the Thread instance, sets the Thread instance as the current Thread (a thread-local variable is used to keep track of this for each worker thread), and notifies the duct-tape code that this Thread is going to start running.

Then, this method uses getcontext to initialize the ucontext for the microthread to switch back to when it wants to suspend; a thread-local variable is used for each worker to store this context. However, the catch with getcontext/setcontext compared to setjmp/longjmp is that they give no indication of whether the context is being executed for the first time or whether it has been re-executed (e.g. using setcontext). Therefore, an additional thread-local flag for each worker is used to keep track of this. When the method is first invoked, it sets this flag to false; whenever a microthread jumps back to the worker thread runner, it sets this flag to true so the worker thread runner can change its behavior accordingly.
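
The flag technique can be illustrated with a small, self-contained C sketch. All names here are hypothetical, the "microthread" is a trivial function on a static stack, and none of the bookkeeping done by the real doWork method is shown; the point is only how a flag distinguishes the first pass through getcontext from a later return via setcontext.

```c
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

/* Per-worker state: the context to return to and the "returning" flag. */
static _Thread_local ucontext_t top_context;
static _Thread_local volatile int returning_to_top;

/* State for one hypothetical microthread. */
static ucontext_t micro_context;
static char micro_stack[64 * 1024];
static volatile int micro_steps;

/* Body of the microthread: do some work, then jump back to the top. */
static void micro_body(void) {
    micro_steps++;
    returning_to_top = 1;       /* tell the runner this is a return, not a first pass */
    setcontext(&top_context);   /* does not return on success */
}

/* Runs the microthread once and returns the number of steps it has taken so
 * far, or -1 if a context call failed. */
int run_microthread_once(void) {
    returning_to_top = 0;
    if (getcontext(&top_context) != 0) {
        return -1;
    }
    /* Execution reaches this point twice: once right after getcontext and
     * once when micro_body calls setcontext. Only the flag tells them apart. */
    if (returning_to_top) {
        return micro_steps;
    }
    if (getcontext(&micro_context) != 0) {
        return -1;
    }
    micro_context.uc_stack.ss_sp = micro_stack;
    micro_context.uc_stack.ss_size = sizeof(micro_stack);
    micro_context.uc_link = NULL;
    makecontext(&micro_context, micro_body, 0);
    setcontext(&micro_context);
    return -1; /* only reached if setcontext itself failed */
}
```

The state that must survive the jump is kept in static or thread-local storage rather than in ordinary locals, since locals modified between getcontext and the later setcontext are not guaranteed to keep their values.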

That's why the next thing this method does is check whether the flag is set; if it is, then the microthread has already run and has returned back to the worker thread runner. In this case, the method performs some clean up: it unsets the _running flag on the Thread instance, unsets it as the current thread, and notifies the duct-tape code that the microthread has finished running. The method also checks whether the Thread has been marked for termination. If the microthread has instructed us to unlock a certain lock once it's fully suspended, this method also takes care of that now; this approach is used to ensure the microthread doesn't miss wake-ups from anything protected by that lock.

In the case where the "returning to thread top" flag is unset, this method examines the Thread to determine where it should start executing it. The first thing it checks is whether there's a pending interrupt_enter call. If so, it saves the current Thread state and creates a new one to begin processing the interrupt (see Interrupts/Signals for more info). Otherwise, it checks if the thread has a pending call. If it does, the old stack is discarded (which includes any suspended state) and a new stack is created to begin processing the incoming call. Finally, if the thread has no pending call and is suspended, this method resumes the thread from the appropriate point of execution: if it had a pending continuation callback, it resumes from the continuation callback; otherwise, it resumes from the point where it was suspended. In all cases, this branch ("returning to thread top" being unset) does not continue executing normally; it always switches to a new context to begin executing the thread.
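
The order of these checks can be summarized in a small decision function. This is a sketch with hypothetical types and names, not the actual Thread class; the final "nothing to run" case is an assumption added so the function is total, since the text does not describe it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the per-thread state consulted by the runner. */
typedef struct {
    bool pending_interrupt_enter; /* an interrupt_enter call is waiting */
    bool pending_call;            /* a regular call is waiting */
    bool suspended;               /* the microthread suspended itself earlier */
    bool has_continuation;        /* it suspended with a continuation callback */
} thread_state_t;

typedef enum {
    START_INTERRUPT,          /* save current state, start processing the interrupt */
    START_NEW_CALL,           /* discard the old stack, start the pending call */
    RESUME_FROM_CONTINUATION, /* resume at the continuation callback */
    RESUME_FROM_SUSPENSION,   /* resume where the microthread left off */
    NOTHING_TO_RUN            /* assumption: not described in the text */
} start_action_t;

/* Applies the checks in the order described above. */
start_action_t choose_start_action(const thread_state_t *state) {
    assert(state != NULL);
    if (state->pending_interrupt_enter) {
        return START_INTERRUPT;
    }
    if (state->pending_call) {
        return START_NEW_CALL;
    }
    if (state->suspended) {
        return state->has_continuation ? RESUME_FROM_CONTINUATION
                                       : RESUME_FROM_SUSPENSION;
    }
    return NOTHING_TO_RUN;
}
```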

Kernel Microthreads

Sometimes, it is necessary to run code within a microthread that does not have a managed thread associated with it; this is most necessary for duct-taped XNU code. That's why darlingserver also supports kernel microthreads. These are Thread instances that have the necessary state for running code (usually duct-taped code) but do not have a managed thread. This implies that some information on these threads is unavailable. For example, these threads cannot perform S2C calls, process interrupts, or process calls of their own. However, they can still be scheduled to run and can suspend and resume like normal threads.

In most cases, C++ code doesn't need to use kernel microthreads directly. Duct-taped XNU code is usually the one that wants to create kernel threads for things like deferred clean up, thread_calls, and timer_calls. C++ code usually uses Thread::kernelAsync and, less commonly, Thread::kernelSync. These static methods allow you to schedule a lambda to run within a kernel microthread at some point in the future. As the names imply, one schedules it asynchronously—schedule it to run and then return normally—and the other schedules it synchronously—schedule it to run and wait for it to finish executing before returning. Note that the synchronous method should never be called from another microthread (at least with the current implementation). This is because it will block the worker thread and prevent it from switching to any other microthread. For the same reason, it should never be called from the main (event loop) thread, as this thread needs to be constantly handling new events.

Duct-Taped Code

As described in the section intro, darlingserver uses XNU code to implement much of the functionality needed for server calls. However, because this code is kernel code and uses code from several other kernel subsystems that we can't include in darlingserver, we have to implement some support code that essentially duct-tapes the XNU code into something we can use.

Hooks

Because duct-taped code uses XNU kernel headers, including any Linux headers would quickly cause header conflicts and complications. Therefore, whenever our duct-taping code needs to use some Linux functionality, we do one of two things: either manually add declarations/prototypes (e.g. for simple things like mmap) or, more commonly, add hooks for duct-taped code to call into C++ code. Additionally, duct-taped code is C code, so if it needs to use some functionality that the C++ code implements, it must use hooks.

Currently, there are quite a lot of hooks for various things. The most common types of hooks are hooks for process- and thread-related functionality implemented in C++ code. There are also hooks for microthreading features like suspension, resumption, termination, and kernel microthread creation. Additionally, there are hooks for various other miscellaneous features like timer creation and logging.

Locks

Because duct-taped code runs within microthreads, it can't use normal locks. Instead, we implement microthread-aware locks that suspend the microthread when they need to sleep instead of blocking the worker thread running the microthread.

These duct-taped locks use a normal lock to protect their internal state. However, this normal lock is only held for short periods of time (only to manage the duct-taped lock state), so it doesn't significantly block the worker thread.

In order to lock a duct-taped lock, the internal state lock is first acquired. Once we have that lock, we check the duct-taped lock state to see if the lock is free. If it is, we mark it as locked by our current microthread, unlock the state lock, and then return. If it's not free, then we add ourselves to the lock wait queue so that whoever currently owns the lock can wake us up when they're done, and then we suspend our microthread. Note that the lock wait queue is just a simple queue of XNU thread structures; we cannot use XNU's waitq for this because waitqs themselves need to use locks.

When unlocking a duct-taped lock, we first acquire the internal state lock. Then we mark the lock as unlocked and check if anyone is waiting for the lock. If someone is waiting for the lock, we remove them from the queue and wake them up. Finally, we unlock the state lock and return.
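
The lock and unlock procedures can be sketched in C as follows. This is a simplified model with hypothetical names: microthreads are represented by integer IDs, the wait queue is a small fixed-size ring buffer, and suspension and wake-up are left to the caller (the acquire function reports that the caller was queued, and the release function reports whom to wake).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DT_MAX_WAITERS 16
#define DT_NO_THREAD (-1)

typedef struct {
    pthread_mutex_t state_lock;   /* normal lock, held only briefly */
    int owner;                    /* microthread holding the lock, or DT_NO_THREAD */
    int waiters[DT_MAX_WAITERS];  /* FIFO wait queue (ring buffer) */
    int head;                     /* index of the oldest waiter */
    int count;                    /* number of queued waiters */
} dt_lock_t;

void dt_lock_init(dt_lock_t *lock) {
    assert(lock != NULL);
    pthread_mutex_init(&lock->state_lock, NULL);
    lock->owner = DT_NO_THREAD;
    lock->head = 0;
    lock->count = 0;
}

/* Returns 1 if `self` now owns the lock. Returns 0 if `self` was added to the
 * wait queue; the caller would then suspend its microthread and try again
 * once it has been woken up. */
int dt_lock_acquire(dt_lock_t *lock, int self) {
    int acquired;
    pthread_mutex_lock(&lock->state_lock);
    if (lock->owner == DT_NO_THREAD) {
        lock->owner = self;
        acquired = 1;
    } else {
        assert(lock->count < DT_MAX_WAITERS);
        lock->waiters[(lock->head + lock->count) % DT_MAX_WAITERS] = self;
        lock->count++;
        acquired = 0;
    }
    pthread_mutex_unlock(&lock->state_lock);
    return acquired;
}

/* Marks the lock as free and returns the microthread that should be woken
 * up, or DT_NO_THREAD if nobody is waiting. */
int dt_lock_release(dt_lock_t *lock) {
    int to_wake = DT_NO_THREAD;
    pthread_mutex_lock(&lock->state_lock);
    lock->owner = DT_NO_THREAD;
    if (lock->count > 0) {
        to_wake = lock->waiters[lock->head];
        lock->head = (lock->head + 1) % DT_MAX_WAITERS;
        lock->count--;
    }
    pthread_mutex_unlock(&lock->state_lock);
    return to_wake;
}
```

For example, if microthread 1 holds the lock and microthreads 2 and 3 queue up behind it, releasing the lock reports 2 as the one to wake, and 2 can then acquire the lock on its next attempt.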

XNU uses different types of locks for different purposes (e.g. spin locks, mutexes, simple_locks, ulocks, and more). However, for our purposes, all locks are mutexes implemented as described above.

Timers

Timers are essential for duct-taped XNU code. Fortunately for us, XNU implements them by overlaying an architecture-independent layer onto a very thin architecture-dependent layer; all we have to implement is that architecture-dependent layer.

For this, all we have to do is set up a timerfd in the event loop and provide a hook for duct-tape code to arm/disarm the timer. When the timer fires, we notify the duct-tape code that it fired, which then notifies the XNU architecture-independent timer layer.

kqueue Channels

Most kqueue functionality is implemented within libkqueue, which runs within each Darling process. However, there are two filters that cannot be implemented this way: Mach port filters and process filters. These filters require information only available within darlingserver.

We can't just use server calls that sleep until the filters become active because this would completely ignore any other filters in libkqueue. Instead, we provide server calls to open additional sockets called kqueue channels (or kqchannels). Libkqueue can add these descriptors to its epoll context to be woken up when a message is available on the channels. The server writes a notification message to a kqchannel when an event is available; the client is woken up and then asks the server for the full event information.
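
The wake-up path can be sketched with a socket pair and epoll. The names below are hypothetical and the socket type (SOCK_SEQPACKET) is an assumption; the sketch only shows a notification written on one end making the other end readable in an epoll context, after which a real client would ask the server for the full event.

```c
#include <assert.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a channel: fds[0] is the "server" end, fds[1] the "client" end. */
int kqchannel_create(int fds[2]) {
    assert(fds != NULL);
    return socketpair(AF_UNIX, SOCK_SEQPACKET | SOCK_CLOEXEC, 0, fds);
}

/* Client side: add the channel to an existing epoll context. */
int kqchannel_watch(int epoll_fd, int client_fd) {
    struct epoll_event event;
    memset(&event, 0, sizeof(event));
    event.events = EPOLLIN;
    event.data.fd = client_fd;
    return epoll_ctl(epoll_fd, EPOLL_CTL_ADD, client_fd, &event);
}

/* Server side: tell the client that an event is available. */
int kqchannel_notify(int server_fd) {
    const char notification = 1;
    return send(server_fd, &notification, 1, 0) == 1 ? 0 : -1;
}

/* Client side: wait for any watched descriptor to become readable.
 * Returns the ready descriptor, or -1 on timeout or error. */
int kqchannel_wait(int epoll_fd, int timeout_ms) {
    struct epoll_event event;
    int ready = epoll_wait(epoll_fd, &event, 1, timeout_ms);
    return ready == 1 ? event.data.fd : -1;
}
```

Because the channel is an ordinary descriptor in the epoll set, it is polled alongside every other filter instead of blocking the client in a dedicated server call.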

Mach Port Filters

Mach port filters leverage the existing XNU code for Mach port kqueue filters, communicating back and forth between the duct-taped code and the C++ code.

When a Mach port kqchannel is created, it registers a duct-taped knote context for the Mach port. When the duct-taped Mach port filter code notifies us that an event is ready, the duct-taped kqchannel code notifies the C++ kqchannel code, which then notifies the client as described earlier. Likewise, when a client wants to read the full event details, the C++ code asks the duct-taped XNU code for event details.

Process Filters

Process filters hook into the target Process instance in darlingserver to have it notify us when certain events like forks, execs, and deaths occur.

A special case occurs when a process forks because there is an old, deprecated feature called NOTE_TRACK that allows clients to request that a knote be automatically created for new forks of the target process. Though newer versions of XNU have completely removed support for this feature, we would like to keep compatibility with this feature for older software; plus, it's not too hard to implement.

Interrupts/Signals

darlingserver needs to be involved in signal processing for two main reasons. The first is that there is some Mach functionality that relies on having access to signal information such as what signal occurred and the register state when it occurred so that we can pass this on to debuggers when they ask for it. The second reason is that signals can occur while a thread is waiting for a server call to complete. When this occurs, there is a possible race between the server replying to the thread's interrupted call and the thread making a new server call as part of signal processing/handling.

Informing the Server about Signals

When a thread is interrupted/signaled, it informs the server using the interrupt_enter server call (more on that in Handling Interrupted Calls). This method only informs the server that the thread has been interrupted and allows the client to wait for the server to sort things out on its end (in case it was in the middle of processing the client's interrupted call). A separate call, sigprocess, is required to inform the server about the signal info and register state.

On the server side, this call reads the signal information from the client and hands it over to some duct-taped XNU signal processing code. This code notifies anyone that's interested (e.g. debuggers) and allows them to manipulate the thread state if necessary. Once that's done, the server writes out the new thread state to the client and sends the reply to allow the client to continue. The client then processes the signal as appropriate (e.g. calling signal handlers installed by program code), and once it's done, it informs the server using interrupt_exit.

Handling Interrupted Calls

One issue with handling signals is that signal handlers need to make server calls of their own, but signals can occur at any time, even in the middle of another server call. If this is not handled correctly, a race condition is almost sure to occur, with the server sending a reply for the interrupted call at the same time that the client makes a new call in the signal handler.

That's why the interrupt_enter call also synchronizes the call state with the server. What this means is that it informs the server that the client received a signal and waits for the server to give it the green light to continue.

On the server side, if the server was processing a call for the client, the call is aborted and any reply it generates is saved to be sent once the client finishes processing the signal (i.e. when it makes the interrupt_exit call).

However, there's still another possible source for a race condition: what if the server had just finished processing the call and had already queued the reply (or had just finished sending it) when the client received a signal and made the interrupt_enter call? In this case, the client would immediately receive a reply, but for the wrong call (the one that was just interrupted). That's why another job of interrupt_enter on the client side is to handle unexpected replies and push them back to the server.

It allocates a buffer (on the stack) large enough to accommodate all possible replies and file descriptors. When it receives a reply it did not expect (i.e. a reply that is not for the interrupt_enter call), it sends it back to the server using a special push_reply message, which the server uses to save the message so it can send it once it receives an interrupt_exit call.

Event Loop

The main thread is in charge of the event loop for darlingserver. This event loop handles things like sending and receiving messages, detecting process deaths, and handling timers.

Sending and Receiving Messages

All messages are sent and received asynchronously. The MessageQueue class wraps up this functionality neatly and makes it thread-safe to push and pop messages to/from the queue. Messages are sent and received in batches using sendmmsg and recvmmsg.
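
Batching with sendmmsg and recvmmsg looks roughly like the following sketch. It is not the MessageQueue class: the function names, the fixed batch and message sizes, and the use of plain text payloads on a datagram socket are assumptions chosen to keep the example short.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#define BATCH_MAX 8
#define MESSAGE_MAX 64

/* Send up to BATCH_MAX text messages with a single system call.
 * Returns the number of messages sent, or -1 on error. */
int send_batch(int socket_fd, const char *const messages[], unsigned int count) {
    struct mmsghdr headers[BATCH_MAX];
    struct iovec iovecs[BATCH_MAX];
    assert(messages != NULL);
    if (count > BATCH_MAX) {
        count = BATCH_MAX;
    }
    memset(headers, 0, sizeof(headers));
    for (unsigned int i = 0; i < count; i++) {
        iovecs[i].iov_base = (void *)messages[i];
        iovecs[i].iov_len = strlen(messages[i]);
        headers[i].msg_hdr.msg_iov = &iovecs[i];
        headers[i].msg_hdr.msg_iovlen = 1;
    }
    return sendmmsg(socket_fd, headers, count, 0);
}

/* Receive as many queued messages as are available (up to BATCH_MAX) with a
 * single system call, without blocking. Each received message is
 * NUL-terminated in `buffers`. Returns the number received, or -1 on error. */
int receive_batch(int socket_fd, char buffers[BATCH_MAX][MESSAGE_MAX]) {
    struct mmsghdr headers[BATCH_MAX];
    struct iovec iovecs[BATCH_MAX];
    memset(headers, 0, sizeof(headers));
    for (int i = 0; i < BATCH_MAX; i++) {
        iovecs[i].iov_base = buffers[i];
        iovecs[i].iov_len = MESSAGE_MAX - 1; /* leave room for the terminator */
        headers[i].msg_hdr.msg_iov = &iovecs[i];
        headers[i].msg_hdr.msg_iovlen = 1;
    }
    int received = recvmmsg(socket_fd, headers, BATCH_MAX, MSG_DONTWAIT, NULL);
    for (int i = 0; i < received; i++) {
        buffers[i][headers[i].msg_len] = '\0';
    }
    return received;
}
```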

Timers

The event loop includes a timerfd for duct-taped code to arm and disarm as necessary. See the duct-tape timers section for more info.

Process Deaths

When a process is registered with the server, darlingserver opens a procfd for the process and adds it to its epoll context. Linux marks the procfd as readable once the target process dies.

Monitors

darlingserver also includes support for adding generic descriptors to the event loop via the Monitor class. Monitors allow you to add a descriptor to the event loop and get notified via a lambda that runs on the main (a.k.a. event loop) thread when the event occurs.

Contributing

If you are familiar with how GitHub Pull Requests work, you should feel right at home contributing to Darling.

Fork the repository

Locate the GitHub repository for the code you changed. The following command can help.

$ cd darling/src/external/less
$ git remote get-url origin
https://github.com/darlinghq/darling-less.git

If the URL uses the https scheme, you can paste it directly into your browser.

Once at the page for the repository you made changes to, click the Fork button. GitHub will then take you to the page for your fork of that repository. The last step here is to copy the URL for your fork. Use the Clone or download button to copy it.

Commit and push your changes

Create and check out a branch describing your changes. In this example, we will use reinvent-wheel. Next, add your fork as a remote.

$ git remote add my-fork git@github.com:user/darling-less.git

After this, push your commits to your fork.

$ git push -u my-fork reinvent-wheel

The -u my-fork part is only necessary when a branch has never been pushed to your fork before.

Submit a pull request

On the GitHub page of your fork, select the branch you just pushed from the branch dropdown menu (you may need to reload). Click the New pull request button. Give it a useful title and descriptive comment. After this, you can click create.

After this, your changes will be reviewed by a Darling Project member.

Debugging

We provide a build of LLDB that is known to work under Darling. It is built from vanilla sources, i.e. without any modifications.

That doesn't mean it works reliably. Fully supporting a debugger is a complex task, so LLDB is known to be buggy under Darling.

Troubleshooting LLDB

If you want to troubleshoot a problem with how LLDB runs under Darling, you need to make debugserver produce a log file.

If running debugserver separately, add -l targetfile. If using LLDB directly (which spawns debugserver as a subprocess automatically), pass an additional environment variable to LLDB:

LLDB_DEBUGSERVER_LOG_FILE=somelogfile.txt

External DebugServer

If you're having trouble using LLDB normally, you may get luckier by running the debugserver separately.

In one terminal, start the debugserver:

$ ./debugserver 127.0.0.1:12345 /bin/bash

In another terminal, connect to the server in LLDB:

$ ./lldb
(lldb) platform select remote-macosx
Platform: remote-macosx
Connected: no
(lldb) process connect connect://127.0.0.1:12345

Please note that environment variables may be missing by default, if used like this.

Debug with core dump

If you are unable to start your application through lldb, you can generate a core dump and load it through lldb. You will need to enable some options on the Linux side before you are able to generate a core dump. You will need to tell Linux that you want to generate a core dump, and that there is no size limit for the core dump.

sudo sysctl -w kernel.core_pattern=core_dump
ulimit -c unlimited

Option #1: Grab Core Dump From Current Working Directory

Note that the core dump will be stored into the current working directory that Linux (not Darling) is pointing to. So you should cd into the directory you want the core dump to be stored in before you execute darling shell. From there, you can execute the application.

cd /path/you/want/to/store/core/dump/in
darling shell
/path/to/application/executable

If everything was set up properly, you should find a file called core_dump. It will be located in the current working directory that Linux is pointing to.

Option #2: Using coredumpctl To Get The Core Dump.

If your distro uses systemd, you can use the coredumpctl utility as an alternative to grab the core dump.

To generate the core dump, run the application as normal (ex: darling shell /path/to/application/executable). Once the program crashes, you should see a (core dumped) message appear. For example:

Illegal instruction: 4 (core dumped)

After a core dump is generated, you will need to locate the core dump. coredumpctl list -r will give you a list of core dumps (where the newest entries are listed first).

$ coredumpctl list -r
TIME                            PID  UID  GID SIG     COREFILE  EXE
Sat 2022-08-13 23:43:20 PDT  812790 1000 1000 SIGILL  present   /usr/local/libexec/darling/usr/libexec/darling/mldr

Once you have identified the core dump that you want to save, you can use the coredumpctl dump command to write it to a file, like so:

# 812790 is the PID that you want to dump
coredumpctl dump -o core_dump 812790

For the time being, you will need to use the darling-coredump command to convert the ELF formatted core dump into a Mach-O core dump.

darling-coredump /path/to/core/dump/file

After the program has finished executing, you should see a darlingcore-core_dump file. This file will be in the same folder as the core_dump file. Once you have found the file, you can load it in lldb.

lldb --core /path/to/darlingcore/core/dump/file

Built-in debugging utilities

malloc

libmalloc (the default library that implements malloc(), free() and friends) supports many debug features that can be turned on via environment variables, among them:

  • MallocScribble (fill freed memory with 0x55)
  • MallocPreScribble (fill allocated memory with 0xaa)
  • MallocGuardEdges (guard edges of large allocations)

When this is not enough, you can use libgmalloc, which is (for the most part) a drop-in replacement for libmalloc. This is how you use it:

$ DYLD_INSERT_LIBRARIES=/usr/lib/libgmalloc.dylib DYLD_FORCE_FLAT_NAMESPACE=1 ./test

libgmalloc catches memory issues such as use-after-free and buffer overflows, and attempts to do this the moment they occur, rather than wait for them to mess up some internal state and manifest elsewhere. It does that by placing each allocation on its own memory page(s), and adding a guard page next to it (like MallocGuardEdges on steroids): by default, after the allocated buffer (to catch buffer overruns), or with MALLOC_PROTECT_BEFORE, before it (to catch buffer underruns). You'll likely want to try it both ways, with and without MALLOC_PROTECT_BEFORE. Another useful option is MALLOC_FILL_SPACE (similar to MallocPreScribble from above). See libgmalloc(3) for more details.

Objective-C and Cocoa

In Objective-C land, you can set NSZombieEnabled=YES in order to detect use-after-free of Objective-C objects. This replaces -[NSObject dealloc] with a different implementation that does not deallocate the memory, but instead turns the object into a "zombie". Upon receiving any message (which indicates a use-after-free), the zombie will log the message and abort the process.

Another useful option is NSObjCMessageLoggingEnabled=YES, which will instruct the Objective-C runtime to log all the sent messages to a file in /tmp.

In AppKit, you can set the NSShowAllViews default (e.g. with -NSShowAllViews 1 on the command line) to cause it to draw a colorful border around each view.

You can find more tips (not all of which work under Darling) here.

xtrace

You can use xtrace to trace Darwin syscalls a program makes, a lot like using strace on Linux:

$ xtrace vm_stat
[139] fstat64(1, 0x7fffffdfe340) -> 0
[139] host_self_trap() -> port right 2563
[139] mach_msg_trap(0x7fffffdfdfc0, MACH_SEND_MSG|MACH_RCV_MSG, 40, 1072, port 1543, 0, port 0)
[139]         {remote = copy send 2563, local = make send-once 1543, id = 219}, 16 bytes of inline data
[139]         mach_host::host_statistics64(copy send 2563, 4, 38)
[139]     mach_msg_trap() -> KERN_SUCCESS
[139]         {local = move send-once 1543, id = 319}, 168 bytes of inline data
[139]         mach_host::host_statistics64() -> [75212, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1024, 4159955152, 720014745, 503911965, 0, 4160737656, 2563, 4292863152, 4160733284, 4160733071, 0], 38 
[139] write_nocancel(1, 0x7fd456800600, 58)Mach Virtual Memory Statistics: (page size of 4096 bytes)
-> 58
[139] write_nocancel(1, 0x7fd456800600, 49)Pages free:                               75212.
-> 49
...

xtrace can trace both BSD syscalls and Mach traps. For mach_msg() in particular, xtrace additionally displays the message being sent or received, and tries to decode it as a MIG routine call or reply.

Note that xtrace only traces emulated Darwin syscalls, so any native Linux syscalls made (usually by native ELF libraries) will not be displayed, which means information and open file descriptors may appear to come from nowhere in those cases.

When Darling Is Not Able To Start Up Properly

In some situations, Darling is not able to access the bash shell. Normally, you should never run into this situation if you are building off of master. However, if you are making any major changes to the source code (ex: updating Apple's open-source code to a new release), it may cause core applications to break in unexpected ways.

If you ever run into this situation, here are some tricks that can help you find the root cause.

Logging When An Executable is Loading.

In src/kernel/emulation/linux/mach/lkm.c, you can add the following print statements to the mach_driver_init method, like so:

	if (applep != NULL)
	{
		__simple_printf("applep is not NULL\n");
		int i;
		for (i = 0; applep[i] != NULL; i++)
		{
			__simple_printf("applep[%d] = %s\n", i, applep[i]);
			if (strncmp(applep[i], "elf_calls=", 10) == 0)
			{
				uintptr_t table = (uintptr_t) __simple_atoi16(applep[i] + 10, NULL);
				_elfcalls = (struct elf_calls*) table;
				__simple_printf("_elfcalls = %d\n", _elfcalls);
			}
		}
	}

This will print out values stored in applep. One benefit of this is that you get to see which programs are being executed.

$ darling shell
Bootstrapping the container with launchd...
applep is not NULL
applep[0] = executable_path=/usr/libexec/darling/vchroot
applep[1] = kernfd=4
applep[2] = elf_calls=428390
_elfcalls = 4359056
applep is not NULL
applep[0] = executable_path=/sbin/launchd
applep[1] = kernfd=4
applep[2] = elf_calls=428390
_elfcalls = 4359056
applep is not NULL
applep[0] = executable_path=/usr/sbin/memberd
applep[1] = kernfd=4
applep[2] = elf_calls=428390
_elfcalls = 4359056
applep is not NULL
...

Just keep in mind that some scripts can break with this change.

Generating stubs

Darling has a stub generator that is capable of generating stubs for C and Objective-C frameworks and shared libraries. A computer running macOS is required to run the stub generator.

Preparations

You don't need to do this step if you already have a bin folder in your home directory, with the PATH variable pointing to it. If not, copy/paste the following commands into Terminal.

Create the bin folder if it doesn't exist:

$ mkdir ~/bin

If your PATH variable does not include the bin folder, you will need to add it.

# For bash
$ echo "export PATH=\"\$HOME/bin:\$PATH\"" >> ~/.bash_profile && source ~/.bash_profile
# For zsh
% echo "export PATH=\"\$HOME/bin:\$PATH\"" >> ~/.zshenv && source ~/.zshenv

Getting the stub generator

Copy/paste the following command into Terminal. It will download both darling-stub-gen and class-dump and place them in the bin folder.

$ curl https://raw.githubusercontent.com/darlinghq/darling/master/tools/darling-stub-gen -o ~/bin/darling-stub-gen && chmod +x ~/bin/darling-stub-gen && curl https://github.com/darlinghq/class-dump/releases/download/mojave/class-dump -L -o ~/bin/class-dump && chmod +x ~/bin/class-dump

Using the stub generator

To run the stub generator, structure your arguments like this:

$ darling-stub-gen /System/Library/Frameworks/DVDPlayback.framework/DVDPlayback DVDPlayback

The process is identical for dynamic libraries.

The above command will create a folder that can be placed in the src/frameworks/ directory of Darling's source tree. It is generated from the DVDPlayback framework. Note that the first argument points to the actual binary of the framework, not the root directory of the framework.
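Because the first argument must be the binary rather than the framework's root directory, it can help to derive both arguments from the root. The sketch below only prints the command. stub_gen_command is a hypothetical helper name, and it assumes the binary has the same name as the framework and sits directly inside the framework directory, as in the DVDPlayback example.

```sh
# Print the darling-stub-gen command for a framework root directory ($1).
stub_gen_command() {
    root=${1%/}                          # drop a trailing slash, if any
    name=$(basename "$root" .framework)  # DVDPlayback.framework -> DVDPlayback
    printf 'darling-stub-gen %s/%s %s\n' "$root" "$name" "$name"
}

stub_gen_command /System/Library/Frameworks/DVDPlayback.framework
# prints: darling-stub-gen /System/Library/Frameworks/DVDPlayback.framework/DVDPlayback DVDPlayback
```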

Applying the stubs to Darling

Once you have generated the stub folder for the framework, copy that folder into Darling's source tree under src/frameworks/.

Then change to the src/frameworks/include/ directory (also located inside Darling's source tree) and create a symbolic link. The link should point to the header folder inside the stub folder's include directory (ex: MyNewFolder/include/MyNewFolder).

Example:

$ cd src/frameworks/include/
$ ln -s ../MyNewFolder/include/MyNewFolder MyNewFolder

Finally, you will need to add the folder to the build. In src/frameworks/CMakeLists.txt, add the following line: add_subdirectory(MyNewFolder). Make sure you put it in alphabetical order.
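A relative symbolic link is easy to get wrong, so it is worth checking that it resolves before building. The sketch below reproduces the layout in a throwaway directory (the tree variable and the scratch layout are only for illustration) and verifies the link the same way you could in the real source tree.

```sh
# Work in a scratch copy of the layout so nothing real is touched.
# MyNewFolder stands in for the generated stub folder.
tree=$(mktemp -d)
mkdir -p "$tree/src/frameworks/include"
mkdir -p "$tree/src/frameworks/MyNewFolder/include/MyNewFolder"

# Create the symbolic link from inside src/frameworks/include/.
(cd "$tree/src/frameworks/include" && ln -s ../MyNewFolder/include/MyNewFolder MyNewFolder)

# -d follows the link, so this check fails for a dangling link.
if [ -d "$tree/src/frameworks/include/MyNewFolder" ]; then
    echo "link resolves"
else
    echo "link is dangling" >&2
fi
```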

Run a build and make sure your new code compiles. After that completes, you are ready to submit a pull request.

See Contributing for how to submit a pull request. This commit is an example of a stub for a framework that was added to Darling using the process described in this article. Most notable is what it does to src/CMakeLists.txt.

Known issues

  • The stub generator does not currently generate symbols for constants. Those must be manually added if a program needs them.
  • Generating stubs for platforms outside of x86 (macOS, iOS Simulator) is not supported.
  • TODO: Figure out how to generate stubs from a dyld_shared_cache file.

Generating Syscalls

While not common, there are situations where you might need to regenerate syscalls. Fortunately, Apple provides a convenient Perl script called create-syscalls.pl. The script is located in [XNU]/libsyscall/xcodescripts/.

Understanding create-syscalls.pl's Arguments

When you run the script without any arguments, it prints the following usage message:

Usage: create-syscalls.pl syscalls.master custom-directory platforms-directory platform-name out-directory

Unfortunately, the usage message does not give a lot of helpful information. Below is a breakdown of what the script is requesting.

  • syscalls.master - Path to the syscalls.master file. ex: [XNU]/bsd/kern/syscalls.master
  • custom-directory - Path to a directory that contains SYS.h, custom.s, and a set of files whose names begin with __. These files can be found in a folder called custom. ex: [XNU]/libsyscall/custom/
  • platforms-directory - Path to a directory that contains the syscall.map files. ex: [XNU]/libsyscall/Platforms/
  • platform-name - One of the platform names in the platforms directory. ex: MacOSX
  • out-directory - Path to the directory where the generated files are written. ex: ~/Desktop/generated_syscalls

In addition, you will need to set the ARCHS environment variable to the list of architectures that the tool should generate syscalls for.

export ARCHS="arm64 i386 x86_64"

Example

export ARCHS="arm64 i386 x86_64"
mkdir ~/Desktop/generated_syscalls
./create-syscalls.pl $XNU/bsd/kern/syscalls.master $XNU/libsyscall/custom/ $XNU/libsyscall/Platforms/ MacOSX ~/Desktop/generated_syscalls
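Since the script's usage message is terse, a quick check of the inputs before running it can save some guessing. The following is a minimal sketch; check_syscall_inputs is a hypothetical helper, not part of XNU or Darling, and it only checks the paths and the ARCHS variable described above.

```sh
# Report any missing input for create-syscalls.pl.
# $1 is the root of the XNU source tree.
check_syscall_inputs() {
    xnu=$1
    status=0
    for path in \
        "$xnu/bsd/kern/syscalls.master" \
        "$xnu/libsyscall/custom/SYS.h" \
        "$xnu/libsyscall/custom/custom.s" \
        "$xnu/libsyscall/Platforms" \
        "$xnu/libsyscall/xcodescripts/create-syscalls.pl"
    do
        if [ ! -e "$path" ]; then
            echo "missing: $path" >&2
            status=1
        fi
    done
    # The script also needs the list of architectures.
    if [ -z "${ARCHS:-}" ]; then
        echo "ARCHS is not set" >&2
        status=1
    fi
    return "$status"
}
```

Calling `check_syscall_inputs "$XNU"` before the example above returns a non-zero status and names whatever is missing.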

High priority stuff

The intention of this page is to serve as a location for pointing developers to areas where work is most needed.

CoreCrypto

CoreCrypto's source code is publicly available, but its license prevents us from using it for Darling, so it has to be reimplemented. Luckily, reimplementing it is not that difficult, and some work has already been done.

Cocotron

AppKit

More details to be added.

libxpc & launchd

  • Implement missing APIs
    • Most (if not all) of the missing APIs are private ones
  • Implement XPC domain support in launchd
    • This is required for proper XPC service support, since at the moment, only system-wide services (i.e. traditional launchd services) are supported

CoreAudio

  • Implement AudioFormat and ExtAudioConverter APIs.
  • Implement AUGraph and AudioQueue utility APIs.
  • Implement various Audio Units existing by default on macOS. This includes units providing audio mixing or effects.

CoreServices

  • Implement LaunchServices APIs for applications and file type mappings, backed by a database.
  • Implement UTI (Uniform Type Identifiers) API, also backed by a database.

Google Summer of Code

2019 tasks

  • Implement NSUserNotification over libnotify or directly over the D-Bus API.
  • Implement launch services & open(1).
  • Fix all warnings in CoreFoundation, Foundation, AppKit, and CoreGraphics.
  • Get TextEdit to compile and fully function.

Packaging

NOTE: This is not extensively tested, and may break

Debian

To package Darling for Debian-based systems, we provide the tools/debian/make-deb script.

All output files are stored in the parent directory of the source root because of a technical limitation of debuild.

Install Dependencies

$ sudo apt install devscripts equivs dpkg-dev debhelper

Building Binary Packages

Install Build Dependencies

$ sudo mk-build-deps -ir

Build

$ tools/debian/make-deb

Build Source Packages

Use this if you want to upload to a service like Launchpad.

$ tools/debian/make-deb --dsc

RPM

Build

  1. Install docker and docker-compose
  2. cd rpm
  3. Build the docker image: docker-compose build rpm
  4. Build the rpms: docker-compose run rpm (Can take over half an hour)
  5. Now you can run dnf install RPMS/x86_64/darling*.rpm
  6. If using SELinux, run setsebool -P mmap_low_allowed 1 to allow Darling low-level access

Build for other operating systems

By default, the package will be built for Fedora 30. To build for a different OS, simply use:

RPM_OS=fedora:31 docker-compose build rpm

Future improvements

Updating Sources

Darling contains many components built from open source Apple code, such as libdispatch, libc, Security, and many more.

Every so often, Apple releases updated versions of these components on their open source portal (typically shortly after a new OS update).

This section is a guide for any developers that find themselves needing to update Darling's copies of Apple's open source components.

General Steps

Each project is different, but for the most part, these are the steps you should follow. Note that these are general guidelines; most projects require additional steps before they'll build/run correctly!

1. Replace the current source with the updated source

The first step should be to download the updated source and replace the current source with it. You should delete all the files in the current source except the CMakeLists.txt and copy over the updated source. Most Apple sources don't contain a CMakeLists.txt of their own, but if one does, you should delete it or (preferably) rename it to something else (e.g. CMakeLists.apple.txt).
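The replacement described above can be sketched as a small shell function. replace_source is a hypothetical helper name. As an extra assumption not stated above, it also leaves a .git entry in place, since deleting it would detach a Git checkout.

```sh
# Replace the contents of the current source ($1) with the updated source ($2),
# keeping Darling's CMakeLists.txt.
replace_source() {
    current=$1
    updated=$2
    # Delete every top-level entry except CMakeLists.txt (and .git, by assumption).
    find "$current" -mindepth 1 -maxdepth 1 \
        ! -name CMakeLists.txt ! -name .git -exec rm -rf {} +
    # Copy the updated source over, including hidden files.
    for entry in "$updated"/* "$updated"/.[!.]*; do
        [ -e "$entry" ] || continue      # skip patterns that matched nothing
        name=$(basename "$entry")
        case $name in
            .git) ;;                     # never copy upstream Git metadata
            CMakeLists.txt)              # keep Apple's file under another name
                cp "$entry" "$current/CMakeLists.apple.txt" ;;
            *)  cp -R "$entry" "$current/" ;;
        esac
    done
}
```

A call would look like `replace_source src/external/libc ~/Downloads/Libc-1353.60.8`, where both paths are placeholders for wherever the current and the downloaded sources live.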

2. Create an initial update commit

You should now create a commit that includes all the changes (e.g. git add -A) with a message containing the name of the project and the version it was updated to (e.g. Libc-1353.60.8). This is done to clearly separate our changes from Apple's original code and it makes it easier to see this distinction in the Git history.

3. Update the CMakeLists.txt

The next step is to update the CMakeLists.txt for Darling to actually build it. Generally, the CMakeLists.txt won't need to be changed much, except for maybe adding or removing files that were added or removed in the new source. In case the new source does need more modifications in order to build, you can usually refer to the Xcode build information in the project (*.xcodeproj, *.xcconfig, etc.) to determine what flags or sources need to be included.

4. Review the Git history for the project

You should check the previous source files (usually with Git) to see if any Darling-specific modifications were made to the code. If so, review the modifications to see whether they're still necessary for the updated code. All modifications are normally marked as being Darling-specific somehow. See the next step for usual markers for Darling-specific changes.

5. Make source modifications if necessary

Most of Apple's code builds just fine, and when problems do arise, more often than not, a change in compiler flags or definitions in the CMakeLists.txt will resolve the problem. Nonetheless, there are cases where Darling-specific workarounds are required. In these cases, you should try to keep your modifications to a minimum and only use them as a last resort in case all other methods of trying to fix the problem fail (e.g. check if any files are missing; they might need to be generated; maybe there's a script that needs to be run).

If you make modifications to the code, mark your changes as Darling-specific somehow. This serves as a way to identify our changes and makes it easier to update projects in the future. The way to do so depends on what kind of source file you're modifying, but here's a list of ways for a couple of languages:

  • C/C++/Objective-C/Objective-C++/MIG

    #ifdef DARLING
    // Darling-specific modifications...
    #endif // DARLING
    

    or...

    #if defined(DARLING) && OTHER_CONDITIONS
    // Darling-specific modifications...
    #endif
    
  • Shell scripts

    ## DARLING
    # Darling-specific modifications...
    ## /DARLING
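
Consistent markers also make the review in step 4 easier, because the existing modifications can be found with a search. The following is a minimal sketch that looks for the opening markers shown above; find_darling_changes is a hypothetical helper name, and changes marked in some other way will not be found.

```sh
# List the lines that open a Darling-specific block in a source tree ($1).
# Matches "#ifdef DARLING", "#if defined(DARLING) ..." and "## DARLING",
# but not the closing "#endif // DARLING" or "## /DARLING" lines.
find_darling_changes() {
    grep -rnE '^[[:space:]]*(#[[:space:]]*if.*DARLING|## DARLING)' "$1"
}
```

Each match is printed as file:line:text, so the result doubles as a checklist of modifications to re-evaluate against the updated source.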
    

6. Build and test it

This might seem obvious, but this is the most important step. Before proposing any changes to Darling, make sure they at least build!

Most Darling subprojects don't have any real test suites, so "testing" here means making sure that the changes don't break something that was previously working. For now, that means testing out various applications and programs inside a Darling environment built with your changes.

While these kinds of checks will still be performed by the Darling team before accepting the changes, it is still highly recommended that you test your changes yourself and save everybody some time.

7. Commit your final changes

Finally, to propose your changes to be merged into Darling, commit your changes, preferably with a message that indicates that it contains Darling-specific modifications for the project and optionally what changes you made.

Additional Notes

As mentioned earlier, most projects require additional modifications and tweaks to work.

The following are links to more specific update requirement guides for subprojects that need them. Note that these document what has had to be done until now; the upstream sources could completely switch up their setup from one version to the next, but until now, project structures have been pretty stable. Nonetheless, these are still guidelines; whenever sources are updated, you need to make sure to review them and perform any additional steps as necessary (and if possible, please document them).

Additional Update Guidelines for libc

As a reminder, these are guidelines. The version you update to might make some of these steps obsolete, so make sure you understand what they do and why they're necessary.

Steps

1. Replace the current source with the updated source

Follow step 1 in "General Steps", with some extra exceptions. In addition to keeping the CMakeLists.txt, also keep:

  • empty.c - This is needed as a dummy source file for libraries that are only built for certain configurations.
  • weak_reference.h - This is needed to provide a no-op macro that sometimes gets overridden or undefined in certain files.
  • darling-scripts - This directory contains some scripts you'll need later on to generate some essential files.
  • locale/locale - Contains locale information.
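
One way to keep these entries is to copy them aside before deleting the old source and copy them back afterwards. The following is a minimal sketch; stash_kept and restore_kept are hypothetical helper names, and the paths in KEEP are assumed to be relative to the root of the libc source.

```sh
# Entries to preserve, relative to the libc source root (taken from the list above).
KEEP="CMakeLists.txt empty.c weak_reference.h darling-scripts locale/locale"

# Copy the kept entries from the source root ($1) into an empty stash directory ($2).
stash_kept() {
    for path in $KEEP; do
        if [ -e "$1/$path" ]; then
            mkdir -p "$2/$(dirname "$path")"
            cp -R "$1/$path" "$2/$path"
        fi
    done
}

# Copy everything in the stash ($1) back into the source root ($2).
restore_kept() {
    cp -R "$1/." "$2/"
}
```

The stash would be filled before the old source is removed and restored after the updated source has been copied in, so that the kept entries end up on top of the new files.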

2. Create an initial update commit

Follow step 2 in "General Steps".

3. Update the CMakeLists.txt

Follow step 3 in "General Steps".

4. Review the Git history for the project

Follow step 4 in "General Steps".

The next step explains some of the modifications in place when this document was last updated.

5. Make source modifications if necessary

Follow step 5 in "General Steps".

As of Libc-1353.60.8, the following files need Darling-specific modifications:

  • gen/FreeBSD/opendir.c - Modified to not define _filldir when building the legacy variant of the library

    Otherwise, it causes duplicate symbol errors for the i386 build of libc because the noinode64 variant already defines it exactly the same way and both the noinode64 and legacy variants are included in the i386 build.

  • gen/FreeBSD/telldir.c - Modified to not define _fixtelldir when building the legacy variant of the library

    Same reason as the modification in gen/FreeBSD/opendir.c.

  • string/FreeBSD/strerror.c - Modified to not define __errstr and strerror_r when building the legacy variant of the library

    Without this modification, both the legacy and regular variants define the exact same functions (same name and functionality), causing duplicate symbol errors for the i386 build.

Remember that, as explained in steps 4 and 5 of "General Steps", some of these modifications may not be necessary for the version you're updating to, and some additional modifications may be necessary.

6. Generate some additional files

libc requires some additional files that are normally generated by the Xcode build for Apple's code. However, since we're not using Xcode, we need to generate these files separately.

The scripts you need to run are:

  • darling-scripts/generate-derived.sh - Generates the feature headers in derived for various platforms
  • darling-scripts/replace-libc-comments.sh - Finds headers with private libc-only code, copies them to a private header directory, and removes the private code from the originals (a.k.a. the public headers)

7. Build and test it

Follow step 6 in "General Steps".

8. Commit your final changes

Follow step 7 in "General Steps".