Whichever language we write a program in, at the end of the day the executable is produced by translating it into machine instructions. In general, executing fewer instructions takes fewer CPU cycles, so the program runs faster. So I have been wondering how a simple program written in various languages compares in terms of the number of instructions, cache references, and branches.
Let’s start with the simplest program: one that reads an input text file and prints it out line by line. We will write it in C, C++, and Rust.
// stdio.cc
#include <cstdio>
#include <cstdlib>
#include <string>
#include <sys/types.h>  // ssize_t
/**
 * Copies input to output line by line
 * Usage: io [INPUT [OUTPUT]]
 * If INPUT/OUTPUT is omitted, stdin/stdout is assumed
 */
int main(int argc, const char** argv) {
  std::string input = "-";
  std::string output = "-";
  if (argc >= 2) input = argv[1];
  if (argc >= 3) output = argv[2];
  if (input == "-") input = "/dev/stdin";
  if (output == "-") output = "/dev/stdout";
  FILE* ifs = fopen(input.c_str(), "r");
  FILE* ofs = fopen(output.c_str(), "w");
  if (ifs == nullptr || ofs == nullptr) {
    perror("fopen");
    return 1;
  }
  char* line = nullptr;  // getline() allocates and grows this buffer
  size_t capacity = 0;
  ssize_t n;
  // getline() returns -1 at end of input or on error
  while ((n = getline(&line, &capacity, ifs)) > 0) {
    fwrite(line, sizeof(char), n, ofs);
  }
  free(line);
  fclose(ifs);
  fclose(ofs);
  return 0;
}
OK, this is technically C++, but it uses only stdio and stdlib from the C library (plus POSIX getline), so let’s just call it the C implementation.
// iostream.cc
#include <fstream>
#include <iostream>
#include <string>
/**
 * Copies input to output line by line
 * Usage: io [INPUT [OUTPUT]]
 * If INPUT/OUTPUT is omitted, stdin/stdout is assumed
 */
int main(int argc, const char** argv) {
  std::ios::sync_with_stdio(false);
  std::string input = "-";
  std::string output = "-";
  if (argc >= 2) input = argv[1];
  if (argc >= 3) output = argv[2];
  if (input == "-") input = "/dev/stdin";
  if (output == "-") output = "/dev/stdout";
  std::ifstream ifs{input};
  std::ofstream ofs{output};
  std::string line;
  while (std::getline(ifs, line)) {
    ofs << line << "\n";
  }
  return 0;
}
The C++ program above is almost identical to the C version, except that it uses iostream from the C++ standard library.
// rust_io.rs
use std::fs::File;
use std::io::{BufRead, BufReader, BufWriter, Write};
/**
 * Copies input to output line by line
 * Usage: io [INPUT [OUTPUT]]
 * If INPUT/OUTPUT is omitted, stdin/stdout is assumed
 */
fn main() -> std::io::Result<()> {
    let args: Vec<_> = std::env::args().collect();
    let input_file = if args.len() >= 2 { args[1].clone() } else { "/dev/stdin".to_owned() };
    let output_file = if args.len() >= 3 { args[2].clone() } else { "/dev/stdout".to_owned() };
    let ifs = BufReader::new(File::open(input_file)?);
    let mut ofs = BufWriter::new(File::create(output_file)?);
    for line in ifs.lines() {
        let line = line?;
        writeln!(ofs, "{}", line)?;
    }
    Ok(())
}
Finally, the Rust code.
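One detail worth keeping in mind when comparing the three programs: `BufRead::lines()` hands back a freshly allocated `String` for every line and strips the trailing newline, whereas the C version reuses a single `getline` buffer and the C++ version reuses one `std::string`. As a rough sketch only (it is not one of the programs compared here, and the helper name `copy_lines` is made up for illustration), a Rust variant that reuses one buffer through `read_line` might look like this:

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader, BufWriter, Write};

/// Copies `reader` to `writer` line by line, reusing a single `String` buffer.
/// Returns the number of lines copied.
/// NOTE: `copy_lines` is a hypothetical helper, not part of the original program.
fn copy_lines<R: BufRead, W: Write>(mut reader: R, mut writer: W) -> io::Result<u64> {
    let mut line = String::new();
    let mut count = 0;
    loop {
        // Keep the allocation, drop the previous contents.
        line.clear();
        // read_line appends to the buffer, keeps the trailing '\n',
        // and returns Ok(0) at end of input.
        if reader.read_line(&mut line)? == 0 {
            break;
        }
        writer.write_all(line.as_bytes())?;
        count += 1;
    }
    writer.flush()?;
    Ok(count)
}

fn main() -> io::Result<()> {
    let args: Vec<String> = std::env::args().collect();
    // Same defaults as the original program: stdin/stdout when omitted.
    let input = args.get(1).map(String::as_str).unwrap_or("/dev/stdin");
    let output = args.get(2).map(String::as_str).unwrap_or("/dev/stdout");
    let ifs = BufReader::new(File::open(input)?);
    let ofs = BufWriter::new(File::create(output)?);
    copy_lines(ifs, ofs)?;
    Ok(())
}
```

Because `read_line` keeps the newline, this variant writes the bytes back unchanged instead of appending a newline with `writeln!`, so a final line without a trailing newline is copied as-is. Whether it actually changes the counters is something to measure rather than assume.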
So, any guess as to which program will execute the fewest instructions? Cache references? Branches?
To be fair, I will use LLVM/Clang to compile the C and C++ programs, since Rust uses LLVM as its backend. On my arm64 machine, here are the results: