Performance: C++ vs Rust vs Go
Today, let’s run a benchmark and compare the performance of the same program written in C++, Rust, and Go. We do our best to isolate the language difference from other sources of noise. As with any benchmark, however, the result should be taken with a grain of salt: no single benchmark can truly compare the performance of different languages.
Program
The program we will compare is gunzip, which decompresses .gz files. There are several implementations, such as GNU gzip, zlib, and miniz (all written in C), flate2-rs written in Rust, and gzip in Go.
However, comparing independent implementations would measure their design differences as much as the languages themselves. We can only benchmark the languages fairly when one program is a direct port of another.
For that reason, we will choose the following three:
- gunzip, written in Rust
- cpp_gunzip, a direct port of the above written in C++
- go_gunzip, a direct port of the above written in Go
Minimizing noise
There is still one issue: the use of external libraries. Each program relies on a library for computing the CRC32 checksum, which takes a significant share of the decompression time. In particular, gunzip relies on the crc32fast crate, cpp_gunzip can link against either zlib or FastCrc32, and go_gunzip relies on Go's standard crc32 package. Luckily, all three support a multi-threaded option that computes the CRC32 checksum on a separate thread. Because Inflate (decompression) takes longer than the checksum, running the two in parallel makes the total runtime roughly proportional to the Inflate implementation alone, which all three programs share, and so minimizes the contribution of the CRC32 library.
Let’s run an experiment to verify this. We compile cpp_gunzip in two different ways: (1) using FastCrc32 and (2) using zlib for computing the CRC32 checksum. We then compare the runtime of the two builds in single-threaded and two-threaded modes and see how they differ.
# terminal in Linux
git clone https://github.com/TechHara/cpp_gunzip.git
cd cpp_gunzip
# compile with FastCrc32 vs zlib for CRC32 checksum
cmake -B fastcrc32 -DCMAKE_CXX_FLAGS=-O3 -DUSE_FAST_CRC32=ON . && make -j -C fastcrc32
cmake -B zlib -DCMAKE_CXX_FLAGS=-O3 -DUSE_FAST_CRC32=OFF . && make -j -C zlib
# download linux source code and compress as .gz file
curl -o- https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.8.7.tar.xz | xz -d | gzip > linux.tgz
# run with single-thread
time fastcrc32/gunzip < linux.tgz > linux.tar
time zlib/gunzip < linux.tgz > linux.tar
# run with multi-thread (-t option)
time fastcrc32/gunzip -t < linux.tgz > linux.tar
time zlib/gunzip -t < linux.tgz > linux.tar

On my x64 Ubuntu system, the single-threaded mode shows a large difference in runtime between the two CRC32 libraries. In multi-threaded mode, however, the two builds show no difference in runtime, as expected. Hence, this allows us to minimize the noise from the different CRC32 libraries used by the C++, Rust, and Go versions of the program we will benchmark.
Benchmark
Now, let’s run our benchmark and compare the performance of C++, Rust, and Go using the exact same implementation of .gz decompression. We already ran the C++ version, so let’s run the Rust and Go versions. Make sure to run them in multi-threaded mode to minimize the noise from the CRC32 checksum.
# clone the Rust version
git clone https://github.com/TechHara/gunzip.git
cd gunzip
# build
cargo build -r
# run in multi-threaded mode (-t)
time target/release/gunzip -t < ../linux.tgz > linux.tar
# clone the Go version
cd ..
git clone https://github.com/TechHara/go_gunzip.git
cd go_gunzip
# build
go build
# limit Go to 2 threads running Go code at once
export GOMAXPROCS=2
# run in multi-threaded mode (-t)
time ./gunzip -t < ../linux.tgz > linux.tar

Alright, on my x64 Ubuntu system, C++ and Rust run at almost the same speed, while Go takes about 2x longer. This is somewhat more favorable to Go than the roughly 4x gap that the Benchmarks Game reports.

As always, better performance does not mean a better language. When choosing a language, one also has to consider the application, development and maintenance time, and safety and security. The classic example is Python, which can be ~100x slower than C yet is one of the most popular programming languages.