Published 08/15/2023
Cross compiling Rust and C with LLVM
By Asher White
For desktops and servers, you basically have three operating system options: Linux, Windows or macOS. Linux dominates cloud computing and high-performance servers, Windows has a strong hold on enterprise and on-premises servers along with desktop and business use, and macOS is popular with developers, content creators and personal users. The healthy competition between these players pushes innovation and improvements. It’s also a huge pain. Building for different operating systems and architectures can be a frustrating and fragile process, but it’s necessary.
As a developer, you have to be prepared to ship software for every platform. That’s especially true for web developers: Linux dominates the server market, but most people don’t use Linux on their desktops. So what can you do? One solution is to use platform-agnostic runtimes like Node.js and Java. These abstract away a lot of the complexity of running across multiple OSs, but they come with a performance cost. For the places where you need native code (from a language like C, Go or Rust), you’ll need to cross-compile.
Most compiled languages nowadays use LLVM to actually generate machine code. Fortunately, LLVM is a native cross-compiler: every LLVM-based compiler can generate the actual machine code for any platform out-of-the-box. What isn’t so easy is getting that machine code to interface with an operating system other than the one you’re building the software on, especially if you depend on C code.
Some languages are easier to cross-compile than others. Go and Zig both excel, Rust is middle-of-the-road and C/C++ is a headache. The problem is, if you’re building a mid-size project or bigger, your dependencies are probably going to include some C/C++ code somewhere along the line, no matter which language you’re programming in. For example, I have a tool written in Rust that depends on libsqlite3, libwebp and libdeflate, which are all C libraries. So, if you have a larger project, you’ll need to cross-compile C.
Of the languages mentioned so far, only Zig helps you cross-compile C; the rest leave it up to you. This blog post will examine how to set up a cross-compilation toolchain to compile Rust and C code from aarch64 macOS to: x86_64 Linux GNU, x86_64 Linux MUSL, aarch64 Linux GNU, aarch64 Linux MUSL, i686 Linux GNU, armv7 Linux GNU HF, x86_64 Windows GNU, x86_64 Windows MSVC and aarch64 Windows MSVC. Plus, since you’re on macOS, you can compile for aarch64 macOS and x86_64 macOS without any setup. This post won’t go into setting up CMake toolchains, but the principles should be pretty much the same. We’ll start with the simplest setup: Linux.
Linux
Both macOS and Linux are Unix-based systems, so the build process and directory structure are similar for both. To start cross-compiling from macOS, you’ll need to install LLVM with brew install llvm. macOS does include an LLVM setup out-of-the-box, but it doesn’t provide all the tools you’ll need for cross-compiling. Because macOS has its own version of LLVM, Homebrew doesn’t install its copy globally; you need to find its path with brew --prefix llvm. All the binaries you’ll need will be in $(brew --prefix llvm)/bin.
If you’re building a static executable, you’re almost good to go. Just install the Rust standard library for your target (rustup target add aarch64-unknown-linux-musl or rustup target add x86_64-unknown-linux-musl) and then run CC_aarch64_unknown_linux_musl="/opt/homebrew/opt/llvm/bin/clang -target aarch64-unknown-linux-musl -fuse-ld=lld" RUSTFLAGS="-C linker=/opt/homebrew/opt/llvm/bin/clang -C link-arg=--target=aarch64-unknown-linux-musl -C link-arg=-fuse-ld=lld" cargo build --target aarch64-unknown-linux-musl. (Replace /opt/homebrew/opt/llvm with whatever brew --prefix llvm is on your system.)
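That one-liner is hard to read, so here’s the same build split into separate exports, with the LLVM path looked up via brew --prefix llvm instead of being hard-coded:
LLVM_BIN="$(brew --prefix llvm)/bin"
export CC_aarch64_unknown_linux_musl="$LLVM_BIN/clang -target aarch64-unknown-linux-musl -fuse-ld=lld"
export RUSTFLAGS="-C linker=$LLVM_BIN/clang -C link-arg=--target=aarch64-unknown-linux-musl -C link-arg=-fuse-ld=lld"
cargo build --target aarch64-unknown-linux-musl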
What does that build command do? It uses the RUSTFLAGS environment variable to set the linker to clang (not the system clang, but the one you installed with Homebrew), and then it tells clang to use lld as its linker. Clang is the LLVM C compiler, and LLD is the LLVM linker. Why not use LLD directly? Going through Clang means Rust can just list all the files to be linked and Clang will work out the right linker invocation to actually do it. If you don’t want to set the RUSTFLAGS environment variable every time you build, you can add the following code to your .cargo/config.toml:
[env]
CC_aarch64-unknown-linux-musl = "/opt/homebrew/opt/llvm/bin/clang -target aarch64-unknown-linux-musl -fuse-ld=lld"
CXX_aarch64-unknown-linux-musl = "/opt/homebrew/opt/llvm/bin/clang++ -target aarch64-unknown-linux-musl -fuse-ld=lld"
CC_x86_64-unknown-linux-musl = "/opt/homebrew/opt/llvm/bin/clang -target x86_64-unknown-linux-musl -fuse-ld=lld"
CXX_x86_64-unknown-linux-musl = "/opt/homebrew/opt/llvm/bin/clang++ -target x86_64-unknown-linux-musl -fuse-ld=lld"
[target.aarch64-unknown-linux-musl]
linker = "/opt/homebrew/opt/llvm/bin/clang"
rustflags = ["-C", "link-args=-target aarch64-unknown-linux-musl -fuse-ld=lld"]
[target.x86_64-unknown-linux-musl]
linker = "/opt/homebrew/opt/llvm/bin/clang"
rustflags = ["-C", "link-args=-target x86_64-unknown-linux-musl -fuse-ld=lld"]
What if you want to build a dynamic executable/library or compile for the gnu ABI? You have two options. The simplest is cargo zigbuild. This is a Cargo extension that uses Zig as its linker and C compiler. Zig includes its own C standard library, and it works in most cases. When it does, it’s super simple to use: just install cargo-zigbuild with brew install cargo-zigbuild and then run cargo zigbuild --target aarch64-unknown-linux-musl. This same zigbuild command will work for MUSL and GLIBC Linux along with MinGW Windows (x86_64-pc-windows-gnu, though Windows support might be more buggy). When it works, it’s the ideal solution for compiling from macOS to Linux.
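In practice, that looks something like this (swap in whichever targets you actually need):
brew install cargo-zigbuild
rustup target add aarch64-unknown-linux-musl x86_64-unknown-linux-gnu
# Zig provides the C compiler, linker and libc, so no sysroot is required
cargo zigbuild --target aarch64-unknown-linux-musl
cargo zigbuild --target x86_64-unknown-linux-gnu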
Unfortunately, for larger projects that use some more obscure C features, it simply doesn’t work. Zig collates all the headers and libraries for several different architectures to save space, but sometimes a header renames a symbol that the library you actually link against doesn’t export. So, if cargo zigbuild doesn’t work, you’ll need to use a sysroot.
A sysroot is basically a copy of the target operating system’s libraries and headers that your executable gets linked against. For Linux, the easiest way to get a sysroot is to use Docker. Docker lets you create minimal images of several different distros and then export them to a tarball that you can extract into a folder. This is what my Dockerfile looked like for a Debian system:
FROM debian:latest
RUN dpkg --add-architecture amd64
RUN dpkg --add-architecture i386
RUN dpkg --add-architecture armhf
RUN apt-get update
RUN apt-get install -y libc-dev libgcc-10-dev libc-dev:amd64 libgcc-10-dev:amd64 libc-dev:i386 libgcc-10-dev:i386 libc-dev:armhf libgcc-10-dev:armhf
RUN apt-get install -y symlinks
RUN symlinks -cr /lib /lib64 /usr/lib
Debian is a GLIBC system, so it’s dynamically linked by default. That means I can use this sysroot to make x86_64-unknown-linux-gnu, aarch64-unknown-linux-gnu, i686-unknown-linux-gnu and armv7-unknown-linux-gnueabihf binaries. Debian supports adding architectures other than the native one (arm64) with dpkg --add-architecture; if you are on an x86_64 system, just swap out amd64 for arm64. Then, I install the needed libraries for each architecture: libc-dev for the C headers and libgcc-10-dev for the error-handling functionality required by Rust. The last step is interesting, and it can cause some problems if you skip it. Inside the image, lots of libraries are symlinked so that they show up in all the places they need to. But the links are absolute, so once I extract the Docker image to my disk, they don’t work anymore. So, I install the symlinks package and run it to turn all absolute symlinks into relative ones.
Once you have that Dockerfile, you can run docker build . --tag linux-gnu -f linux-gnu.Dockerfile. Once that finishes building, you can docker create linux-gnu. It’ll output a container id, and you can docker export <container id> -o linux-gnu.tar. The last step is just to extract the tarball into a directory: tar -xf linux-gnu.tar -C linux-gnu. Now you have a Linux sysroot that will work for four different architectures!
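Put together, the whole sequence looks like this (the mkdir is there because tar won’t create the output directory for you):
docker build . --tag linux-gnu -f linux-gnu.Dockerfile
id=$(docker create linux-gnu)
docker export "$id" -o linux-gnu.tar
mkdir -p linux-gnu
tar -xf linux-gnu.tar -C linux-gnu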
How do you use this sysroot? You could do it all with environment variables, but it’s easier to save it in a .cargo/config.toml. You need to set the C and C++ compilers for each target arch (using the CC_<target_triple> and CXX_<target_triple> environment variables). You also need to set the Rust linker and some linker arguments:
[env]
CC_aarch64-unknown-linux-gnu = "/opt/homebrew/opt/llvm/bin/clang -target aarch64-unknown-linux-gnu --sysroot=/path/to/sysroots/linux-gnu -fuse-ld=lld"
CXX_aarch64-unknown-linux-gnu = "/opt/homebrew/opt/llvm/bin/clang++ -target aarch64-unknown-linux-gnu --sysroot=/path/to/sysroots/linux-gnu -fuse-ld=lld"
CC_x86_64-unknown-linux-gnu = "/opt/homebrew/opt/llvm/bin/clang -target x86_64-unknown-linux-gnu --sysroot=/path/to/sysroots/linux-gnu -fuse-ld=lld"
CXX_x86_64-unknown-linux-gnu = "/opt/homebrew/opt/llvm/bin/clang++ -target x86_64-unknown-linux-gnu --sysroot=/path/to/sysroots/linux-gnu -fuse-ld=lld"
CC_i686-unknown-linux-gnu = "/opt/homebrew/opt/llvm/bin/clang -target i686-unknown-linux-gnu --sysroot=/path/to/sysroots/linux-gnu -fuse-ld=lld"
CXX_i686-unknown-linux-gnu = "/opt/homebrew/opt/llvm/bin/clang++ -target i686-unknown-linux-gnu --sysroot=/path/to/sysroots/linux-gnu -fuse-ld=lld"
CC_armv7-unknown-linux-gnueabihf = "/opt/homebrew/opt/llvm/bin/clang -target armv7-unknown-linux-gnueabihf --sysroot=/path/to/sysroots/linux-gnu -fuse-ld=lld"
CXX_armv7-unknown-linux-gnueabihf = "/opt/homebrew/opt/llvm/bin/clang++ -target armv7-unknown-linux-gnueabihf --sysroot=/path/to/sysroots/linux-gnu -fuse-ld=lld"
[target.aarch64-unknown-linux-gnu]
linker = "/opt/homebrew/opt/llvm/bin/clang"
rustflags = ["-C", "link-args=-target aarch64-unknown-linux-gnu --sysroot=/path/to/sysroots/linux-gnu -fuse-ld=lld", "-C", "target-feature=-crt-static"]
[target.x86_64-unknown-linux-gnu]
linker = "/opt/homebrew/opt/llvm/bin/clang"
rustflags = ["-C", "link-args=-target x86_64-unknown-linux-gnu --sysroot=/path/to/sysroots/linux-gnu -fuse-ld=lld", "-C", "target-feature=-crt-static"]
[target.i686-unknown-linux-gnu]
linker = "/opt/homebrew/opt/llvm/bin/clang"
rustflags = ["-C", "link-args=-target i686-unknown-linux-gnu --sysroot=/path/to/sysroots/linux-gnu -fuse-ld=lld", "-C", "target-feature=-crt-static"]
[target.armv7-unknown-linux-gnueabihf]
linker = "/opt/homebrew/opt/llvm/bin/clang"
rustflags = ["-C", "link-args=-target armv7-unknown-linux-gnueabihf --sysroot=/path/to/sysroots/linux-gnu -fuse-ld=lld", "-C", "target-feature=-crt-static"]
Most of this is pretty straightforward, except for the -C target-feature=-crt-static. Before I added that, there were a lot of really weird errors, with undefined symbols in libpthread, libdl and other built-in libraries. After a bit of research I found out that it was because Rust by default tries to link to static libraries first, but if libc is linked dynamically, all the other built-in libraries need to be linked dynamically as well. To get Rust to do this I needed to disable a CPU feature, of all things. That’s what -C target-feature=-crt-static does: it disables (the leading ‘-’ is a minus sign) the ‘Static C RunTime’ (crt-static) feature.
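A quick way to confirm that a binary really ended up dynamically linked is to dump its dynamic section with llvm-readelf, which should be in the same Homebrew LLVM bin directory; a dynamically linked binary lists libc.so.6, libgcc_s.so.1 and friends as NEEDED entries (replace the placeholder <your-binary> with your actual binary name):
/opt/homebrew/opt/llvm/bin/llvm-readelf -d target/aarch64-unknown-linux-gnu/debug/<your-binary> | grep NEEDED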
If you keep getting weird undefined symbol errors from system libraries, make sure that there are no broken symlinks in your sysroot, because even with target-feature=-crt-static, Rust will fall back on static archives if there’s a problem with the dynamic libraries.
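This find invocation (it works with the BSD find that ships with macOS) lists any symlinks in the sysroot whose targets no longer resolve:
find /path/to/sysroots/linux-gnu -type l ! -exec test -e {} \; -print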
To compile for GLIBC Linux from macOS, that’s all you need to do. I haven’t personally tested this next part, but if you need to dynamically link to other libraries on the target system, just edit the Dockerfile to install them and point your pkg-config paths at the sysroot, and it should work.
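Since I haven’t tested it, treat this as a rough sketch: the pkg-config side would look something like the following, where the aarch64-linux-gnu directory is Debian’s multiarch layout (adjust it per architecture) and PKG_CONFIG_ALLOW_CROSS is what the pkg-config crate used by most Rust build scripts checks before it will agree to cross-compile.
export PKG_CONFIG_SYSROOT_DIR=/path/to/sysroots/linux-gnu
export PKG_CONFIG_LIBDIR=/path/to/sysroots/linux-gnu/usr/lib/aarch64-linux-gnu/pkgconfig:/path/to/sysroots/linux-gnu/usr/share/pkgconfig
export PKG_CONFIG_ALLOW_CROSS=1
cargo build --target aarch64-unknown-linux-gnu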
If you need a sysroot to compile for MUSL, it’s a very similar procedure: just start from the alpine Docker image instead of Debian and use apk instead of apt. You can skip the symlink-editing step, as Alpine seems to use relative symlinks by default. If in doubt about whether to use a sysroot for MUSL, you probably should if you have any C dependencies; that way the headers will match the libraries you link against.
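As an untested sketch, the Alpine version could look like this. Alpine images are single-architecture, so instead of dpkg --add-architecture you build the image once per target platform with --platform (assuming Docker on your machine can emulate the other architectures), and the musl-dev and gcc packages should pull in the libc headers and libgcc. The docker create / docker export / tar steps are then the same as for the Debian image.
cat > linux-musl.Dockerfile <<'EOF'
FROM alpine:latest
RUN apk add --no-cache musl-dev gcc
EOF
docker build . --tag linux-musl-arm64 -f linux-musl.Dockerfile --platform linux/arm64
docker build . --tag linux-musl-amd64 -f linux-musl.Dockerfile --platform linux/amd64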
Being able to cross-compile for Linux from macOS is extremely valuable for a web developer, as it lets you take advantage of the performance of native code without giving up too much flexibility. And, because macOS and Linux are both Unix-like, cross-compilation is relatively painless. But what about compiling for Windows?
Windows
While Windows has a different architecture than macOS and Linux, the good news is that it is still well-supported by LLVM and Rust. There’s a tool that works out-of-the-box most of the time: cargo-xwin. For larger projects, it might need a bit of configuration, but it should work. You can install it with cargo install cargo-xwin and use it like this: cargo xwin build --target x86_64-pc-windows-msvc. You still need to add the Rust target with rustup (use the msvc targets, not the gnu ones), but otherwise the Windows development files are downloaded and cached automatically.
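Putting those commands together:
cargo install cargo-xwin
rustup target add x86_64-pc-windows-msvc aarch64-pc-windows-msvc
# the first build downloads and caches the Windows CRT and SDK files
cargo xwin build --target x86_64-pc-windows-msvc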
If it works, that’s great, you don’t have anything else to worry about. (I’ve found it works well in most cases, even when you have C dependencies, just not every time.) If it doesn’t work, you’ll have to either use MinGW or configure cargo-xwin with environment variables.
MinGW is a toolchain that makes Windows behave more like a Unix system. It works well, and the setup is similar to setting up a Linux sysroot. The downside is that the binaries produced don’t always use native Windows features; they use the Unix-style ones instead, so they can be quite a bit bigger, and you might run into issues if you try to incorporate Windows-specific functionality. Also, at least for the moment, MinGW doesn’t support Windows on ARM.
To use MinGW, you need to install it with Homebrew: brew install mingw-w64. Then, you need to set up the sysroot. It will be found in $(brew --prefix mingw-w64)/toolchain-x86_64/x86_64-w64-mingw32. MinGW provides libraries for x86_64 and i686, but I was only able to get x86_64 to work; the 32-bit libraries didn’t seem to define some error-handling symbols. You’ll also need to add the GCC libraries to the library search path with -L$(brew --prefix mingw-w64)/toolchain-x86_64/lib/gcc/x86_64-w64-mingw32/12.2.0. On my computer, this is what my .cargo/config.toml looked like for MinGW:
[env]
CC_x86_64-pc-windows-gnu = "/opt/homebrew/opt/llvm/bin/clang -target x86_64-pc-windows-gnu --sysroot=/opt/homebrew/opt/mingw-w64/toolchain-x86_64/x86_64-w64-mingw32 -L/opt/homebrew/opt/mingw-w64/toolchain-x86_64/lib/gcc/x86_64-w64-mingw32/12.2.0 -fuse-ld=lld"
CXX_x86_64-pc-windows-gnu = "/opt/homebrew/opt/llvm/bin/clang++ -target x86_64-pc-windows-gnu --sysroot=/opt/homebrew/opt/mingw-w64/toolchain-x86_64/x86_64-w64-mingw32 -L/opt/homebrew/opt/mingw-w64/toolchain-x86_64/lib/gcc/x86_64-w64-mingw32/12.2.0 -fuse-ld=lld"
[target.x86_64-pc-windows-gnu]
linker = "/opt/homebrew/opt/llvm/bin/clang"
rustflags = ["-C", "link-args=-target x86_64-pc-windows-gnu --sysroot=/opt/homebrew/opt/mingw-w64/toolchain-x86_64/x86_64-w64-mingw32 -L/opt/homebrew/opt/mingw-w64/toolchain-x86_64/lib/gcc/x86_64-w64-mingw32/12.2.0 -fuse-ld=lld"]
MinGW is great for porting Unix-based code to Windows, but as mentioned earlier, it has some drawbacks. It’s better to use the MSVC ABI when you can.
If cargo xwin isn’t working for you, there are a few things you can do. For me, I had to turn on the -Wno-implicit-function-declaration flag, because some headers weren’t being included properly in some C code. And I had to add -Xclang -target-cpu -Xclang nehalem to be able to compile SIMD code. The catch is that cargo xwin uses clang-cl, a wrapper around Clang that matches MSVC’s CL.exe, but you can still pass Clang-specific options through to clang -cc1 with the -Xclang flag. To see which options are supported, you can run clang-cl -help (normal options) and clang -cc1 -help (-Xclang options). It’s complicated, but it works in the end.
The stickiest problem with using the MSVC ABI, though, is when dependencies include assembly code. The build system tries to package it as a library instead of an archive, and I couldn’t find a way to change that. Ultimately, I set it up so that Clang always builds an archive instead of a library, because when I’m building Rust, the C code is always a dependency. To do this, I set -fuse-ld=llvm-lib, as opposed to -fuse-ld=lld-link. The Rust linker still needs to be lld-link.
Something to be aware of when overriding CFLAGS when building with cargo xwin is that the .cargo/config.toml file is parsed after xwin is run, so the CFLAGS from .cargo/config.toml completely overwrite the CFLAGS set by xwin. That means you need to find out what options xwin would have set (check out its source code) and set them manually. Then, run the build with cargo xwin build --target <target> and everything should work. This is what my .cargo/config.toml looked like for building with xwin:
[env]
CFLAGS_x86_64-pc-windows-msvc = "/imsvc/Users/asherwhite/Library/Caches/cargo-xwin/xwin/crt/include /imsvc/Users/asherwhite/Library/Caches/cargo-xwin/xwin/sdk/include/ucrt /imsvc/Users/asherwhite/Library/Caches/cargo-xwin/xwin/sdk/include/um /imsvc/Users/asherwhite/Library/Caches/cargo-xwin/xwin/sdk/include/shared -fuse-ld=llvm-lib -Wno-implicit-function-declaration -Xclang -target-cpu -Xclang nehalem --target=x86_64-pc-windows-msvc"
CFLAGS_aarch64-pc-windows-msvc = "/imsvc/Users/asherwhite/Library/Caches/cargo-xwin/xwin/crt/include /imsvc/Users/asherwhite/Library/Caches/cargo-xwin/xwin/sdk/include/ucrt /imsvc/Users/asherwhite/Library/Caches/cargo-xwin/xwin/sdk/include/um /imsvc/Users/asherwhite/Library/Caches/cargo-xwin/xwin/sdk/include/shared -fuse-ld=llvm-lib -Wno-implicit-function-declaration --target=aarch64-pc-windows-msvc"
Cross-compilation can be a pain, but I hope this blog post helped smooth the process for you. When it’s set up properly, it lets you take advantage of multiple operating systems and the strengths of each one, while keeping maximum performance.