Making Sense of the RISC-V Toolchain

Intro

The RISC-V toolchain is a set of development tools (compiler, assembler, linker, debugger, and supporting libraries) that turn source code into binaries that run on RISC-V processors. It's the foundation for building anything targeting RISC-V hardware, from bare-metal firmware like OpenSBI to full applications running on an OS like Linux or FreeBSD.

The RISC-V GCC toolchain source is hosted on GitHub: https://github.com/riscv-collab/riscv-gnu-toolchain

Get the source

bash
git clone https://github.com/riscv-collab/riscv-gnu-toolchain
cd riscv-gnu-toolchain

Install Prerequisites

On Ubuntu or other Debian-based systems, install the following prerequisites:

bash
sudo apt-get install autoconf automake autotools-dev curl python3 python3-pip python3-tomli libmpc-dev libmpfr-dev libgmp-dev gawk build-essential bison flex texinfo gperf libtool patchutils bc zlib1g-dev libexpat-dev ninja-build git cmake libglib2.0-dev libslirp-dev libncurses-dev

On RHEL-based systems, install the following prerequisites:

bash
sudo dnf install autoconf automake python3 libmpc-devel mpfr-devel gmp-devel gawk bison flex texinfo patchutils gcc gcc-c++ zlib-devel expat-devel libslirp-devel ncurses-devel
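
If you script the setup, you need to know which of the two install commands applies. Here's a minimal sketch that picks one from the ID and ID_LIKE fields in /etc/os-release. The helper name pick_pkg_manager and the list of distro IDs it recognizes are my own assumptions, not something the toolchain repo ships.

```bash
#!/usr/bin/env bash
# Sketch only: pick_pkg_manager is a hypothetical helper, and the set of
# distro IDs below is an assumption that covers the two families above.
pick_pkg_manager() {
    # $1: space-separated ID and ID_LIKE values from /etc/os-release
    case " $1 " in
        *" debian "*|*" ubuntu "*) echo "apt-get" ;;
        *" rhel "*|*" fedora "*|*" centos "*) echo "dnf" ;;
        *) echo "unknown"; return 1 ;;
    esac
}

# Read ID and ID_LIKE when the file exists. Sourcing happens inside the
# command substitution, so the variables do not leak into this shell.
if [ -r /etc/os-release ]; then
    ids=$(. /etc/os-release; echo "${ID:-} ${ID_LIKE:-}")
    echo "package manager: $(pick_pkg_manager "$ids" || true)"
fi
```

Anything the helper doesn't recognize comes back as "unknown", which is the cue to install the packages by hand.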

Newlib vs Linux glibc

The RISC-V GCC toolchain allows you to target a few standard libraries, mainly glibc, newlib and musl.

When building for RISC-V you'll typically choose between newlib and glibc depending on your target. The newlib target is the go-to for embedded RISC-V systems running without an OS (think microcontrollers or custom bare-metal firmware). The glibc target pairs with Linux-based RISC-V systems where you have a full operating system underneath and need complete POSIX support.

To build the newlib (bare-metal ELF) toolchain:

bash
./configure --prefix=/opt/riscv/riscv-elf --enable-multilib
make -j"$(( $(nproc) - 1 ))"

To build the glibc (Linux) toolchain:

bash
./configure --prefix=/opt/riscv/riscv-linux --enable-multilib
make -j"$(( $(nproc) - 1 ))" linux
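
One caveat with "nproc minus one": on a single-core machine or VM it evaluates to zero, and GNU make rejects -j0. Here's a small sketch that clamps the job count to at least 1. The helper name build_jobs is hypothetical, and the fallback to 1 when nproc is missing is my own assumption.

```bash
#!/usr/bin/env bash
# Sketch only: build_jobs is a hypothetical helper.
# It returns one job per core minus one, but never less than 1.
build_jobs() {
    # Optional first argument overrides the detected core count.
    local cores="${1:-$(nproc 2>/dev/null || echo 1)}"
    local jobs=$(( cores - 1 ))
    if [ "$jobs" -lt 1 ]; then
        jobs=1
    fi
    echo "$jobs"
}

echo "would run: make -j$(build_jobs)"
```

In a real build you would replace the echo with the make invocation itself, for example make -j"$(build_jobs)" linux.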

Multilib

--enable-multilib tells the toolchain to build support for multiple target ABIs and architectures in a single installation. For RISC-V this means you get libraries and runtime support for different combinations of ISA (like rv32imac vs rv64imafdc) and calling convention (like ilp32 vs lp64d) without needing a separate toolchain for each one.

The default newlib and Linux toolchain builds support the following:

text
/opt/riscv/riscv-elf/bin/riscv64-unknown-elf-gcc --print-multi-lib
.;
rv32i/ilp32;@march=rv32i@mabi=ilp32
rv32im/ilp32;@march=rv32im@mabi=ilp32
rv32iac/ilp32;@march=rv32iac@mabi=ilp32
rv32imac/ilp32;@march=rv32imac@mabi=ilp32
rv32imafc/ilp32f;@march=rv32imafc@mabi=ilp32f
rv64imac/lp64;@march=rv64imac@mabi=lp64
rv64imafdc/lp64d;@march=rv64imafdc@mabi=lp64d

text
/opt/riscv/riscv-linux/bin/riscv64-unknown-linux-gnu-gcc --print-multi-lib
.;
lib32/ilp32;@march=rv32imac@mabi=ilp32
lib32/ilp32d;@march=rv32imafdc@mabi=ilp32d
lib64/lp64;@march=rv64imac@mabi=lp64
lib64/lp64d;@march=rv64imafdc@mabi=lp64d

Note: .; is the default variant
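
The --print-multi-lib format is compact: each line is a library directory, a semicolon, then the selecting options with @ standing in for " -". Here's a sketch that expands it into something easier to read. The helper name parse_multilib is hypothetical, and the sample input is copied from the listings above.

```bash
#!/usr/bin/env bash
# Sketch only: parse_multilib is a hypothetical helper. It reads
# --print-multi-lib output on stdin and prints "directory<TAB>flags".
parse_multilib() {
    local dir opts
    while IFS=';' read -r dir opts; do
        [ -n "$dir" ] || continue
        # "@march=rv32i@mabi=ilp32" becomes "-march=rv32i -mabi=ilp32"
        opts=$(printf '%s' "$opts" | sed 's/@/ -/g; s/^ //')
        if [ "$dir" = "." ]; then
            opts="(toolchain defaults)"
        fi
        printf '%s\t%s\n' "$dir" "$opts"
    done
}

parse_multilib <<'EOF'
.;
rv32i/ilp32;@march=rv32i@mabi=ilp32
rv64imafdc/lp64d;@march=rv64imafdc@mabi=lp64d
EOF
```

With a toolchain installed, you can pipe the real compiler output into the function instead of using the here-document.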

To see which -march and -mabi values the default variant uses, run the following:

text
/opt/riscv/riscv-linux/bin/riscv64-unknown-linux-gnu-gcc -Q --help=target | grep -E '  \-march= |  \-mabi= '

The default for both newlib and Linux targets:

-mabi=lp64d

-march=rv64imafdc_zicsr_zifencei_zmmul_zaamo_zalrsc_zca_zcd

Minimum needed to build Linux

Linux requires at minimum: rv64imac / lp64

Most Linux distros build with: rv64imafdc / lp64d

As seen in the previous section, the default -march value is rv64imafdc_zicsr_zifencei_zmmul_zaamo_zalrsc_zca_zcd, which names far more extensions than some systems, such as QEMU, may support.

When building BusyBox and Linux, you may need to set these options via CFLAGS:

CFLAGS="-march=rv64imac -mabi=lp64"
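
Before committing to a -march/-mabi pair in CFLAGS, it can help to ask the compiler which multilib directory it would pick for that pair. GCC's -print-multi-directory option reports this. The sketch below wraps it in a hypothetical helper, multilib_dir_for, and assumes the compiler lives under the /opt/riscv/riscv-linux prefix used elsewhere in this post.

```bash
#!/usr/bin/env bash
# Sketch only: multilib_dir_for is a hypothetical helper, and the
# compiler path below is an assumption based on the install prefix.
multilib_dir_for() {
    local cc="$1" march="$2" mabi="$3"
    # Print the multilib directory GCC selects for this flag pair.
    "$cc" -march="$march" -mabi="$mabi" -print-multi-directory
}

RISCV_CC=/opt/riscv/riscv-linux/bin/riscv64-unknown-linux-gnu-gcc
if [ -x "$RISCV_CC" ]; then
    multilib_dir_for "$RISCV_CC" rv64imac lp64
else
    echo "compiler not found at $RISCV_CC, skipping check"
fi
```

If the directory it prints isn't one you expected from the --print-multi-lib listing, the flags didn't match any shipped variant and the compiler fell back to another one.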

The Linux multilib list above has no rv32i, rv32im, or rv32iac variants because those can't run Linux. The toolchain maintainers didn't bother shipping library variants for ISA combinations that can never host a Linux system. Every entry starts at imac as the floor.

The bare-metal toolchain by contrast ships rv32i and rv32im because firmware doesn't care about Linux's requirements.

Abbreviations

march = what instructions the CPU supports

mabi = how data is laid out and passed

I = Base Integer ISA

M = Multiply/Divide

A = Atomics

F = Single-precision floating-point

D = Double-precision floating-point

C = Compressed Instructions
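
To make the letter soup concrete, here's a sketch that spells out an -march string one letter at a time using the table above. The helper names ext_name and explain_march are hypothetical, and the sketch deliberately ignores the underscore-separated multi-letter extensions (zicsr and friends).

```bash
#!/usr/bin/env bash
# Sketch only: ext_name and explain_march are hypothetical helpers.
# ext_name maps a single-letter extension to its name from the list above.
ext_name() {
    case "$1" in
        i) echo "Base Integer ISA" ;;
        m) echo "Multiply/Divide" ;;
        a) echo "Atomics" ;;
        f) echo "Single-precision floating-point" ;;
        d) echo "Double-precision floating-point" ;;
        c) echo "Compressed Instructions" ;;
        *) echo "(not in the list above)" ;;
    esac
}

explain_march() {
    local head="${1%%_*}"          # drop any _z... multi-letter extensions
    local letters="${head#rv??}"   # strip the "rv32" or "rv64" prefix
    local c
    echo "${head%"$letters"}"      # what remains is the base, e.g. rv64
    printf '%s\n' "$letters" | fold -w1 | while read -r c; do
        [ -n "$c" ] || continue
        echo "  $c = $(ext_name "$c")"
    done
}

explain_march rv64imafdc_zicsr_zifencei
```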

Different ABIs in plain English

ABI stands for Application Binary Interface. The key word is binary — it's not about source code, it's about what actually happens at the machine level when compiled code runs and when different pieces of compiled code talk to each other. If an API (Application Programming Interface) is a contract between pieces of source code, an ABI is a contract between pieces of compiled machine code. It answers the question: when two chunks of binary code need to interact, how exactly do they do it?

How LP64 became the de facto model: 64-Bit Programming Models: Why LP64?

i = int

l = long

p = pointer

The number = their size in bits

ilp32: (32-bit, no FPU)

Everything is 32-bit. Floating-point math is done in software (slow, but works on chips with no FPU).

ilp32f: (32-bit, single-precision hardware only)

float values go into FPU registers for speed, but double values are too wide for those registers, so they fall back to memory/software. Niche — you'd only use this on a chip that has an F extension (single-precision FPU) but not a D extension (double-precision).

ilp32d: (32-bit, full hardware float)

Both float and double go into FPU registers. Fast floating-point on a 32-bit chip with a full FPU.

lp64: (64-bit, no FPU)

Same soft-float approach as ilp32, but pointers and long are now 64-bit. Targets 64-bit RISC-V chips without (or ignoring) an FPU.

lp64f: (64-bit, single-precision hardware only)

Same niche trade-off as ilp32f, just on a 64-bit chip.

lp64d: (64-bit, full hardware float) (Most common)

The standard ABI for 64-bit Linux RISC-V. Full 64-bit addressing + full hardware double-precision floating point.
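
The naming rule is regular enough to decode mechanically. Here's a sketch that does so for the six ABIs described above. The helper name describe_abi is hypothetical, and it only handles those six names.

```bash
#!/usr/bin/env bash
# Sketch only: describe_abi is a hypothetical helper covering just the
# six ABI names discussed above.
describe_abi() {
    local abi="$1" widths fp
    # Prefix: which of int, long, and pointer share the stated width.
    case "$abi" in
        ilp32|ilp32f|ilp32d) widths="int=32 long=32 pointer=32" ;;
        lp64|lp64f|lp64d)    widths="int=32 long=64 pointer=64" ;;
        *) echo "unknown ABI: $abi" >&2; return 1 ;;
    esac
    # Suffix: which floating-point types are passed in FPU registers.
    case "$abi" in
        *f) fp="float in FPU registers" ;;
        *d) fp="float and double in FPU registers" ;;
        *)  fp="soft-float calling convention" ;;
    esac
    echo "$abi: $widths, $fp"
}

for abi in ilp32 ilp32f ilp32d lp64 lp64f lp64d; do
    describe_abi "$abi"
done
```

Note that int stays 32-bit under lp64, which is why the name lists only l and p.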

Script

Here's a bash script that will build both newlib and glibc toolchains with multilib enabled:

bash
#!/bin/bash -x

set -euo pipefail

TOPDIR=/opt/riscv
RISCV_NEWLIB=$TOPDIR/riscv-elf
RISCV_GLIBC=$TOPDIR/riscv-linux

## Setup install dirs
mkdir -p "${RISCV_NEWLIB}"
mkdir -p "${RISCV_GLIBC}"

## Clone riscv-gnu-toolchain
git clone https://github.com/riscv-collab/riscv-gnu-toolchain.git
cd riscv-gnu-toolchain/

## Build newlib
echo "Building Newlib (bare-metal) toolchain..."
mkdir -p build-newlib/
cd build-newlib/

../configure --prefix="${RISCV_NEWLIB}" --enable-multilib
make -j"$(( $(nproc) - 1 ))"
cd ../

## Build glibc
echo "Building glibc (Linux) toolchain..."
mkdir -p build-glibc/
cd build-glibc/

../configure --prefix="${RISCV_GLIBC}" --enable-multilib
make -j"$(( $(nproc) - 1 ))" linux
cd ../

## Verify that both compilers run
"${RISCV_NEWLIB}/bin/riscv64-unknown-elf-gcc" --version && \
"${RISCV_GLIBC}/bin/riscv64-unknown-linux-gnu-gcc" --version && \
echo "Toolchains built successfully!"

Also, a script to add the toolchains to the system PATH:

bash
export PATH=$PATH:/opt/riscv/riscv-elf/bin/
export PATH=$PATH:/opt/riscv/riscv-linux/bin/
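
If those export lines go into a shell startup file, every nested shell appends the same directories again. Here's a sketch of an append that checks first. The helper name path_append is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: path_append is a hypothetical helper that adds a
# directory to PATH unless it is already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

path_append /opt/riscv/riscv-elf/bin
path_append /opt/riscv/riscv-linux/bin
export PATH
```

The directories are written without a trailing slash so the comparison matches however many times the file is sourced.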