July 20, 2024
This post is about one part of a much larger embedded project that I'll talk more about soon. One aspect of this system is that I hope to program it entirely in Rust, with a basic operating system and portable code modules. The chip the OS runs on is RISC-V, so the OS (running in machine mode) provides system calls that the code modules (running in user mode) can execute.
Executing system calls on RISC-V is easy: in user mode, the ecall instruction causes an environment call that traps to a more privileged level of execution, in this case from user mode into machine mode. However, you (the programmer) have to create all the code that communicates functions, arguments, and return values. There are a lot of ways to do this, but I'm looking at a simple (and currently unfinished) approach that uses a 32-bit system call function ID and the RISC-V calling convention to pass arguments and return values.
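To make the register-passing idea concrete, here is a minimal sketch of what a two-argument call could look like from user mode. The register assignment (call ID in a7, arguments in a0 and a1, result returned in a0) is an assumption borrowed from common RISC-V conventions, not something this project has settled on, and raw_syscall2 and add_via_syscall are hypothetical names. On non-RISC-V hosts a stand-in is compiled instead, so the snippet still builds off-target.

```rust
/// Issue a two-argument system call (hypothetical helper, not the project's API).
///
/// Assumed convention: call ID in a7, arguments in a0/a1, return value in a0.
#[cfg(any(target_arch = "riscv32", target_arch = "riscv64"))]
pub unsafe fn raw_syscall2(id: u32, arg0: usize, arg1: usize) -> usize {
    let ret: usize;
    unsafe {
        core::arch::asm!(
            "ecall",
            in("a7") id as usize,
            inlateout("a0") arg0 => ret,
            in("a1") arg1,
            options(nostack),
        );
    }
    ret
}

/// Host stand-in so the example builds off-target. It pretends the OS
/// implements call 1 as "add the two arguments" (purely illustrative).
#[cfg(not(any(target_arch = "riscv32", target_arch = "riscv64")))]
pub unsafe fn raw_syscall2(id: u32, arg0: usize, arg1: usize) -> usize {
    match id {
        1 => arg0.wrapping_add(arg1),
        _ => usize::MAX, // unknown call ID
    }
}

/// The kind of thin wrapper a code module might call.
pub fn add_via_syscall(a: usize, b: usize) -> usize {
    unsafe { raw_syscall2(1, a, b) }
}
```

The OS side would read the same registers in its trap handler and dispatch on the ID; that half is not shown here.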
The ultimate goal is to make code modules easy to compile with a generic RISC-V Rust target. This requires that code modules have access to all the function call prototypes and IDs: something easy to do with a proc macro!
We'll move through the proc macro code step by step. The big picture is simple: it declares a function attribute (rakko_syscall) that takes one argument, id, then processes the signature of the function and generates simple JSON output containing the same info. This JSON can then be converted into basic Rust system call wrappers (functions that take arguments, place them in the correct registers, then execute ecall). The primary goal here is to take proc-macro-tagged system call functions compiled into the OS, convert them to JSON, then use that JSON to create wrappers for code modules.
With this code, we can turn this:
#[rakko_syscall(id=1)]
fn modtest1(c: u8, d: Box) -> Box {
into this:
{
    "name": "modtest1",
    "id": 1,
    "arguments": [
        {
            "name": "c",
            "ty": "u8"
        },
        {
            "name": "d",
            "ty": "Box"
        }
    ],
    "return_type": "Box"
}
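As a rough illustration of the last step (turning a record like this into a wrapper), here is a small sketch that renders wrapper source text from a description. SyscallDesc and render_wrapper are hypothetical names, and the emitted body is only a placeholder, since the real register setup and ecall are not part of this post.

```rust
/// Mirror of one JSON record (field names follow the JSON above).
pub struct SyscallDesc {
    pub name: String,
    pub id: u32,
    pub arguments: Vec<(String, String)>, // (name, ty) pairs
    pub return_type: String,
}

/// Render the source text of a wrapper function for one system call.
/// The body is a placeholder: register setup and `ecall` are not shown.
pub fn render_wrapper(desc: &SyscallDesc) -> String {
    let params: Vec<String> = desc
        .arguments
        .iter()
        .map(|(name, ty)| format!("{name}: {ty}"))
        .collect();
    // An empty return type means the function returns nothing.
    let ret = if desc.return_type.is_empty() {
        String::new()
    } else {
        format!(" -> {}", desc.return_type)
    };
    format!(
        "pub fn {}({}){} {{\n    // syscall id {}\n    unimplemented!()\n}}\n",
        desc.name,
        params.join(", "),
        ret,
        desc.id
    )
}
```

Generating text keeps the sketch dependency-free; a real generator could just as well emit tokens from a build script or another proc macro.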
One thing to note: I'm no proc macro expert! There is likely an easier way to do everything that this proc macro does.
Each code block below comes together to make a small lib.rs for a proc macro crate.
extern crate proc_macro;
use proc_macro::TokenStream;
use syn::{parse_macro_input, FnArg, ItemFn, LitInt, Pat, PathArguments, ReturnType, Type, GenericArgument, Expr, Lit};
use syn::spanned::Spanned;
use std::io::Write;
use std::fs::OpenOptions;
use json::{self, JsonValue, object};
Pretty normal proc macro crates, plus the json crate to generate JSON output.
#[derive(Default)]
struct RakkoSyscallArgument {
    name: String,
    ty: String,
}

// lets object! (and a Vec of arguments) convert each argument into JSON
impl Into<JsonValue> for RakkoSyscallArgument {
    fn into(self) -> JsonValue {
        let m = object! {
            name: self.name,
            ty: self.ty,
        };
        return m;
    }
}

#[derive(Default)]
struct RakkoSyscall {
    name: String,
    id: u32,
    arguments: Vec<RakkoSyscallArgument>,
    return_type: String
}
Each system call is represented by a RakkoSyscall struct. The name and return type are each a String, the syscall ID is a u32, and each argument is represented by a RakkoSyscallArgument, another simple struct that holds the argument name and type as strings. RakkoSyscallArgument also has a basic Into<JsonValue> implementation that creates a JSON object using the object! macro.
#[proc_macro_attribute]
pub fn rakko_syscall(attr: TokenStream, item: TokenStream) -> TokenStream {
    let mut this_syscall = RakkoSyscall::default();
    // process the attribute tag to get the id
    let id_parser = syn::meta::parser(|meta| {
        if meta.path.is_ident("id") {
            let w: LitInt = meta.value()?.parse()?;
            this_syscall.id = w.base10_parse()?;
            Ok(())
        } else {
            Err(meta.error("unsupported rakko_syscall property"))
        }
    });
    parse_macro_input!(attr with id_parser);
We start by creating a blank RakkoSyscall, then parse the "id" field. This is the only field that can be present and it can only be an integer literal. If it isn't, the error is simple:
error: expected integer literal
 --> src/main.rs:7:20
  |
7 | #[rakko_syscall(id="a")]
  |                    ^^^
Then we process the function spec. The name is easy, just one field. The arguments are more complicated, because we have to be able to process several kinds of argument type:
- Simple types: u8, i32, etc., but not String (because values are passed in registers, either as direct numbers or as pointers).
- Single-generic types such as Box<u8>. These are supported primarily for the Box type, to put data on the heap and pass a pointer.
- Double-generic types such as Box<Vec<u8>>, so that we can Box a more complex type than a struct or integer.
- Single-generic types of arrays such as Box<[u8; 2]>.
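The reason Box works for all of these is that a Box of a sized type is a single pointer, so it fits in one register no matter how large the value behind it is. The sketch below illustrates that with the standard library only. The helper names are hypothetical, and it assumes the OS and the code module share an address space, so that an address produced on one side is meaningful on the other.

```rust
use std::mem::size_of;

/// Turn a boxed value into a register-sized integer (the heap address).
pub fn box_to_register<T>(value: Box<T>) -> usize {
    Box::into_raw(value) as usize
}

/// Rebuild the box on the receiving side.
///
/// # Safety
/// `raw` must come from `box_to_register::<T>` and be used only once.
pub unsafe fn register_to_box<T>(raw: usize) -> Box<T> {
    unsafe { Box::from_raw(raw as *mut T) }
}

/// True when a value of type `T` fits in a single machine register.
pub fn fits_in_register<T>() -> bool {
    size_of::<T>() <= size_of::<usize>()
}
```

With these helpers, u8, Box<Vec<u8>> and Box<[u8; 2]> all pass the fits_in_register check, while a bare String does not, because it carries a pointer plus length and capacity.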
This is the largest block of code and it's honestly ugly, but it does work. It starts by parsing the function name, then iterates over the arguments and identifies the type of each one. See the comments for more specifics.
    let m = item.clone();
    let a = parse_macro_input!(m as ItemFn);
    // set the function name in the struct
    this_syscall.name = a.sig.ident.to_string();
    let mut arguments = Vec::new();
    for input in a.sig.inputs.into_iter() {
        let mut arg_struct = RakkoSyscallArgument::default();
        if let FnArg::Typed(arg) = input {
            if let Pat::Ident(pat_ident) = *arg.pat {
                // set the name of the argument
                arg_struct.name = pat_ident.ident.to_string();
            }
            if let Type::Path(type_path) = *arg.ty {
                let segments = type_path.path.segments;
                arg_struct.ty = segments[0].ident.to_string();
                if let PathArguments::AngleBracketed(angle_bracketed) = segments[0].clone().arguments {
                    // assemble a generic type
                    if let GenericArgument::Type(generic_arg_1) = &angle_bracketed.args[0] {
                        if let Type::Path(type_path_inner_1) = generic_arg_1 {
                            // traverse inward to the first set of angle brackets
                            let inner_segment_2 = type_path_inner_1.path.segments[0].clone();
                            arg_struct.ty = format!("{}<{}", arg_struct.ty, inner_segment_2.ident.to_string());
                            if let PathArguments::AngleBracketed(angle_bracketed_inner) = inner_segment_2.arguments {
                                // handle double-generic types if there's an inner generic
                                if let GenericArgument::Type(generic_arg_2) = &angle_bracketed_inner.args[0] {
                                    if let Type::Path(type_path_inner_2) = generic_arg_2 {
                                        // append the innermost type and close both brackets; format! extends the existing string
                                        arg_struct.ty = format!("{}<{}>>", arg_struct.ty, type_path_inner_2.path.segments[0].ident.to_string());
                                    }
                                }
                            } else {
                                // close the angle bracket for single-generic types
                                arg_struct.ty = format!("{}>", arg_struct.ty);
                            }
                        } else if let Type::Array(type_array) = generic_arg_1 {
                            // handle single-generic types of arrays
                            if let Type::Path(elem) = *type_array.elem.clone() {
                                let array_type = elem.path.segments[0].ident.to_string();
                                // get the LitInt that represents the array length
                                if let Expr::Lit(array_length) = &type_array.len {
                                    if let Lit::Int(array_length) = &array_length.lit {
                                        // NonZero rejects zero-length arrays; usize is assumed here because Rust array lengths are usize
                                        let array_length_int: std::num::NonZero<usize> = array_length.base10_parse().unwrap();
                                        arg_struct.ty = format!("{}<[{array_type}; {array_length_int}]>", arg_struct.ty);
                                    }
                                }
                            }
                        }
                    }
                }
            }
            arguments.push(arg_struct);
        }
    }
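The nesting above grows by one level for every extra layer of generics it supports. As a thought experiment only, the same string-building can be written recursively. The sketch below uses a toy Ty enum as a stand-in for the parsed type (hypothetical; the real macro works on syn's types, and this is not a drop-in replacement), just to show the shape of a depth-independent renderer.

```rust
/// Toy stand-in for the parts of a parsed type that matter here.
/// (Hypothetical; the real macro works on `syn::Type`.)
pub enum Ty {
    /// A named type with zero or more generic arguments, e.g. `Box<Vec<u8>>`.
    Path { name: String, args: Vec<Ty> },
    /// A fixed-size array, e.g. `[u8; 2]`.
    Array { elem: Box<Ty>, len: usize },
}

/// Render a type to the string form used in the JSON output, at any depth.
pub fn render(ty: &Ty) -> String {
    match ty {
        // plain type: just the name
        Ty::Path { name, args } if args.is_empty() => name.clone(),
        // generic type: name, then each rendered argument inside angle brackets
        Ty::Path { name, args } => {
            let inner: Vec<String> = args.iter().map(render).collect();
            format!("{}<{}>", name, inner.join(", "))
        }
        // array: rendered element type plus the length
        Ty::Array { elem, len } => format!("[{}; {}]", render(elem), len),
    }
}

/// Small constructor helper to keep examples readable.
pub fn path(name: &str, args: Vec<Ty>) -> Ty {
    Ty::Path { name: name.to_string(), args }
}
```

For example, rendering a Box path whose single argument is the array [u8; 2] gives the same "Box<[u8; 2]>" string that the array branch above builds by hand.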
Finally, we parse the return value. It's possible that we'll want more potential return types, but for now we only have support for non-generic types and single-generic types.
    if let ReturnType::Type(_, ret_type) = a.sig.output { // _ ignores the RArrow
        if let Type::Path(type_path) = *ret_type {
            let segments = type_path.path.segments;
            // grab the basic return type
            this_syscall.return_type = segments[0].ident.to_string();
            // check if we have a generic argument and process it plus the thing inside
            if let PathArguments::AngleBracketed(angle_bracketed) = segments[0].clone().arguments {
                if let GenericArgument::Type(generic_arg_1) = &angle_bracketed.args[0] {
                    if let Type::Path(type_path_inner_1) = generic_arg_1 {
                        let inner_segment_2 = type_path_inner_1.path.segments[0].clone();
                        // expand the type into generic
                        this_syscall.return_type = format!("{}<{}>", this_syscall.return_type, inner_segment_2.ident.to_string());
                    }
                }
            }
        // anything else is not permitted
        } else {
            // .into() converts the proc_macro2.TokenStream generated
            // by .into_compile_error() into a proc_macro.TokenStream
            return syn::Error::new(ret_type.span(), "syscall functions may only return single values!")
                .into_compile_error()
                .into();
        }
    }
This is the fun part. The json crate has a handy macro, object! (as shown above), that can take any values that convert into a JsonValue and turn them into nicely formatted JSON.
    let data = object! {
        name: this_syscall.name,
        id: this_syscall.id,
        arguments: arguments,
        return_type: this_syscall.return_type,
    };
Then we save it to a file in append mode and return the original item being processed:
    let mut file = OpenOptions::new()
        .append(true)
        .create(true)
        .open("functions.txt")
        .expect("can't open file!");
    write!(file, "{data}").unwrap();
    // hand the original function back to the compiler unchanged
    item
}
This saves functions.txt in the main directory of the crate that uses this proc macro. That makes sense: a proc macro is code that runs inside the compiler while the calling crate is being built, so the relative path is resolved wherever that build is running.
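One thing to keep in mind is that write! adds no separator, and append mode keeps whatever earlier builds wrote, so records land back to back in the file. If whatever reads the file later wants to split it easily, one possible variation is to write one record per line. This is a minimal sketch using only the standard library; append_record and read_records are hypothetical helpers, and it assumes each record is serialized onto a single line.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Append one already-serialized record to `path`, followed by a newline.
pub fn append_record(path: &Path, record: &str) -> io::Result<()> {
    let mut file = OpenOptions::new().append(true).create(true).open(path)?;
    writeln!(file, "{record}")
}

/// Read the file back as a list of non-empty lines, one record each.
pub fn read_records(path: &Path) -> io::Result<Vec<String>> {
    let text = fs::read_to_string(path)?;
    Ok(text
        .lines()
        .filter(|line| !line.is_empty())
        .map(str::to_string)
        .collect())
}
```

Deleting the file before a clean build would also avoid stale or duplicated records from previous compilations.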
That's all! Come back next time (I promise it'll be there!) for more cool stuff.