mht.wtf

Computer science, programming, and whatnot.

Building Zig structs at Compile Time

June 11, 2022

Let's talk about comptime in Zig. comptime is the feature that allows you to run code at compile time, and it is maybe Zig's biggest differentiator from other languages in the same space. Combined with having types as values, we get type specialization, generics, reflection, and even code generation.

For readers who are not familiar with Zig, here's a small example. We can make a Range type that is generic over the element type by writing a function called Range that takes a type (which is required to be compile time known), and produces a struct with two fields of that type. Returning the struct from the function is no problem; types are values after all. It looks like this:

fn Range(comptime t: type) type {
    return struct {
        from: t,
        to: t,
    };
}

We can use this new function, and the type it returns, like this[1]:

const std = @import("std");

test "range-create" {
    var a = Range(i32){ .from = 0, .to = 10 };
    // Print the bounds in half-open interval notation.
    std.debug.print("\n[{}, {})\n", .{ a.from, a.to });
}

which, when run, prints out the numbers we gave in half-open interval notation:

$ zig test comptime-struct/cs.zig --test-filter range
Test [0/1] test "range-create"...
[0, 10)
All 1 tests passed.

We can also add a method to the newly created type, for instance one that checks whether a value is in the range or not. The type of the parameter other in the contains method is the type t that we're given as the argument to Range, and it works just as expected.

fn Range(comptime t: type) type {
    return struct {
        from: t,
        to: t,

        pub fn contains(this: @This(), other: t) bool {
            return this.from <= other and other < this.to;
        }
    };
}

Here we're using the @This() builtin which gives us the type in which we currently are. We need this here since we don't have a name for the type yet, as we're still defining it. There's nothing special about the name this, but it is familiar from many other languages, and since the builtin is called @This it's a convenient name to give. The new method can be tested like so:

test "range-contains" {
    var r = Range(i32){ .from = 0, .to = 10 };
    try std.testing.expect(r.contains(5));
    try std.testing.expect(r.contains(0));
    try std.testing.expect(r.contains(9));
    try std.testing.expect(!r.contains(10));
    try std.testing.expect(!r.contains(-1));
}

which works:

$ zig test comptime-struct/cs.zig --test-filter range-contains
All 1 tests passed.
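
Since Range is an ordinary function, nothing ties it to i32. As a small sketch (the test name and the values are mine, not part of the original file), here it is instantiated with two other element types. It should work for any type that supports <= and <. The definition is repeated so the snippet compiles on its own.

```zig
const std = @import("std");

// Same definition as above, repeated so the snippet is self-contained.
fn Range(comptime t: type) type {
    return struct {
        from: t,
        to: t,

        pub fn contains(this: @This(), other: t) bool {
            return this.from <= other and other < this.to;
        }
    };
}

test "range-other-types" {
    // Each distinct type argument yields a distinct struct type.
    const letters = Range(u8){ .from = 'a', .to = 'z' };
    const unit = Range(f64){ .from = 0.0, .to = 1.0 };

    try std.testing.expect(letters.contains('m'));
    try std.testing.expect(!letters.contains('z'));
    try std.testing.expect(unit.contains(0.5));
    try std.testing.expect(!unit.contains(1.0));

    // Calling Range twice with the same argument gives the same type back.
    try std.testing.expect(Range(u8) == Range(u8));
    try std.testing.expect(Range(u8) != Range(f64));
}
```

The last two checks are the type specialization part: Range(u8) and Range(f64) are separate types, each with its own contains method.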

Building Structs

Usually in Zig, the way you define a struct is by assigning the value of a struct { .. } expression to a name, like this:

const MyString = struct {
    someNumber: i32,
    aBool: bool,
    yourString: []const u8,
};

We have just seen how to control the types of the struct fields programmatically (and, I stress, with completely regular Zig code!). What about the names? Or both? Is it possible to construct, at compile time, a new struct in which the names and types of all of the fields come from some other data?

The answer is yes! The key is the @Type builtin, which takes a std.builtin.TypeInfo[2] and reifies[3] the description of the type into a real type. Here's how it looks:

test "reify-empty" {
    const Type = @Type(.{
        .Struct = .{
            .layout = .Auto,
            .fields = &[_]std.builtin.TypeInfo.StructField{},
            .decls = &[_]std.builtin.TypeInfo.Declaration{},
            .is_tuple = false,
        },
    });
    try std.testing.expect(@sizeOf(Type) == 0);
}

This will create an empty struct, since we're instantiating the .Struct variant of the TypeInfo union with both .fields and .decls empty. So far this only seems to be a difficult way of writing const Type = struct {};, but this is just regular Zig code, and while we require that the value passed to @Type is compile time known, we don't require it to be one big literal like it is now. It can very well be the result of a complex computation, as long as it is compile time known.
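
To make the "result of a computation" point concrete before moving on to structs, here is a minimal sketch that computes the TypeInfo from a comptime argument instead of writing it as a literal. UintFitting is a hypothetical helper of mine, not something from the standard library, and it uses the 0.9.1 names (.Int, .signedness, .bits) like the rest of this post; newer compilers may spell these differently.

```zig
const std = @import("std");

// Hypothetical helper: the smallest unsigned integer type that can hold `max`.
fn UintFitting(comptime max: u64) type {
    // Count the bits needed to represent `max`.
    var bits: u16 = 0;
    var rest: u64 = max;
    while (rest != 0) : (rest >>= 1) {
        bits += 1;
    }
    // The value passed to @Type is computed above, not spelled out by hand.
    return @Type(.{
        .Int = .{
            .signedness = .unsigned,
            .bits = bits,
        },
    });
}

test "reify-computed-int" {
    try std.testing.expect(UintFitting(255) == u8);
    try std.testing.expect(UintFitting(256) == u9);
}
```

The loop runs entirely at compile time, because a function that returns a type can only be called at compile time.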

We can for instance write a function that takes an anonymous struct literal with names and types that should be the fields of a struct, and if the name starts with a ? it automatically makes the field optional. In code, calling our function

const Foo = MakeStruct(.{
    .{ "someNumber", i32 },
    .{ "?aBool", bool },
    .{ "?yourString", []const u8 },
});

should be the same as writing

const Foo = struct {
    someNumber: i32,
    aBool: ?bool,
    yourString: ?[]const u8,
};

One way of doing this is by building up a list of StructFields with the right names and types, making a TypeInfo struct with those fields, and passing it to @Type. The only thing we must do is to branch on whether the field name starts with a ?, and if so, remove the ? from the name and turn the given type into an optional type, T to ?T. Here is an example:

fn MakeStruct(comptime in: anytype) type {
    var fields: [in.len]std.builtin.TypeInfo.StructField = undefined;
    for (in) |t, i| {
        var fieldType: type = t[1];
        var fieldName: []const u8 = t[0][0..];
        if (fieldName[0] == '?') {
            fieldType = @Type(.{ .Optional = .{ .child = fieldType } });
            fieldName = fieldName[1..];
        }
        fields[i] = .{
            .name = fieldName,
            .field_type = fieldType,
            .default_value = null,
            .is_comptime = false,
            .alignment = 0,
        };
    }
    return @Type(.{
        .Struct = .{
            .layout = .Auto,
            .fields = fields[0..],
            .decls = &[_]std.builtin.TypeInfo.Declaration{},
            .is_tuple = false,
        },
    });
}

There's another thing to highlight here. We are declaring fields to be an array of length in.len, even though in is the argument of the function. This is fine since in is declared to be comptime known, and so of course we should be able to declare statically sized arrays of that length, and indeed, in Zig we can.
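
The same rule can be seen in isolation in the sketch below. iota is a hypothetical helper, not something from the post's source file; it uses its comptime parameter both in its return type and as the length of a local array.

```zig
const std = @import("std");

// Hypothetical helper: returns the array [0, 1, ..., n-1].
// `n` is compile time known, so it can appear in array types.
fn iota(comptime n: usize) [n]usize {
    var out: [n]usize = undefined;
    var i: usize = 0;
    while (i < n) : (i += 1) {
        out[i] = i;
    }
    return out;
}

test "comptime-length" {
    const xs = iota(4);
    try std.testing.expect(xs.len == 4);
    try std.testing.expect(xs[3] == 3);
    // The length is part of the type.
    try std.testing.expect(@TypeOf(xs) == [4]usize);
}
```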

We can see that we're getting what we expect by using the "inverse" builtin of @Type which is @typeInfo. @typeInfo takes a type and returns its std.builtin.TypeInfo, which we can operate on.

test "make-struct" {
    const Type = MakeStruct(.{
        .{ "someNumber", i32 },
        .{ "?aBool", bool },
        .{ "?yourString", []const u8 }, 
    });
    
    std.debug.print("\n", .{});
    inline for (@typeInfo(Type).Struct.fields) |f, i| {
        std.debug.print("field {} is {s} type is {s}\n", .{ i, f.name, f.field_type });
    }
}

Here we are just looping over the fields of the struct and printing out the names and types in order. The result is this:

$ zig test comptime-struct/cs.zig --test-filter make
Test [0/1] test "make-struct"...
field 0 is someNumber type is i32
field 1 is aBool type is ?bool
field 2 is yourString type is ?[]const u8
All 1 tests passed.

We have successfully moved the ? from the field names and over to the field types. Granted, this new way of making structs does not offer very much in terms of readability or functionality. Putting the ? in the name isn't any easier than having it in the type.
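
The data from @typeInfo is good for more than printing. As a sketch, the hypothetical helper countOptional below walks the fields of any struct type and counts the optional ones. It is shown on a hand-written struct so that it stands alone, and it uses the 0.9.1 names (.Struct, field_type, .Optional).

```zig
const std = @import("std");

// Hypothetical helper: the number of fields of `T` whose type is optional.
fn countOptional(comptime T: type) usize {
    var n: usize = 0;
    // `inline for` unrolls the loop, so each field is compile time known.
    inline for (@typeInfo(T).Struct.fields) |f| {
        if (@typeInfo(f.field_type) == .Optional) n += 1;
    }
    return n;
}

// The struct we expect MakeStruct to produce, written by hand.
const Foo = struct {
    someNumber: i32,
    aBool: ?bool,
    yourString: ?[]const u8,
};

test "count-optional" {
    try std.testing.expect(countOptional(Foo) == 2);
    try std.testing.expect(countOptional(struct { x: u8 }) == 0);
}
```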

So What?

Even though we are effectively generating code at compile time, there's no magic here: we're just writing regular Zig code. The data types we're making are from std.builtin, so they're tightly bound to the language, but there's no special syntax, and no second language to learn and remember. By simply filling in a std.builtin.TypeInfo we can construct new types at compile time.

Also, the input to our function was an anonymous struct literal, but this doesn't have to be the case. We could have taken a []const u8 with source code of a struct definition from another language like C++ or Rust, parsed it, and constructed the corresponding Zig type for the given definition. Parsing the other language would be the vast majority of the work, because as we've just seen, making the Zig struct is really easy.

Another idea is to have a compile-time readable configuration .ini file embedded in the source with @embedFile, and a function that reads in the file, finds the names and types of the values in the file, and collects it all into a struct. This struct would always be in perfect correspondence with the .ini file, and so there is no danger of the reference configuration file and the code diverging. There would be one definitive source of truth for the configuration values.
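
Here is a rough sketch of what that could look like, under several assumptions of mine: the file contents are inlined as a string instead of coming from @embedFile, every line has the form key=value, and the type inference rule (digits become i64, true and false become bool, everything else becomes a string) is made up for the example. ConfigFrom and inferType are hypothetical names. It is written against the 0.9.1 API like the rest of this post.

```zig
const std = @import("std");

// Stand-in for @embedFile("config.ini"); file name and contents are made up.
const ini =
    \\port=8080
    \\verbose=true
    \\name=demo
;

// Hypothetical rule: digits -> i64, true/false -> bool, anything else -> string.
fn inferType(comptime value: []const u8) type {
    if (value.len == 0) return []const u8;
    if (std.mem.eql(u8, value, "true") or std.mem.eql(u8, value, "false")) return bool;
    for (value) |c| {
        if (c < '0' or c > '9') return []const u8;
    }
    return i64;
}

// Hypothetical function: one struct field per `key=value` line in `text`.
fn ConfigFrom(comptime text: []const u8) type {
    // Upper bound on the number of fields: one per line.
    var fields: [std.mem.count(u8, text, "\n") + 1]std.builtin.TypeInfo.StructField = undefined;
    var n: usize = 0;
    var lines = std.mem.tokenize(u8, text, "\n");
    while (lines.next()) |line| {
        // Skip lines without a '='.
        const eq = std.mem.indexOfScalar(u8, line, '=') orelse continue;
        const T = inferType(line[eq + 1 ..]);
        fields[n] = .{
            .name = line[0..eq],
            .field_type = T,
            .default_value = null,
            .is_comptime = false,
            .alignment = @alignOf(T),
        };
        n += 1;
    }
    return @Type(.{
        .Struct = .{
            .layout = .Auto,
            .fields = fields[0..n],
            .decls = &[_]std.builtin.TypeInfo.Declaration{},
            .is_tuple = false,
        },
    });
}

const Config = ConfigFrom(ini);

test "config-from-ini" {
    const fields = @typeInfo(Config).Struct.fields;
    try std.testing.expect(fields.len == 3);
    try std.testing.expect(std.mem.eql(u8, fields[0].name, "port"));
    try std.testing.expect(fields[0].field_type == i64);
    try std.testing.expect(fields[1].field_type == bool);
    try std.testing.expect(fields[2].field_type == []const u8);
}
```

A real file would need sections, comments, and whitespace handling, and a larger input would probably require raising the compile-time branch limit with @setEvalBranchQuota. The sketch also only derives the field names and types; filling in the values from the file as defaults is left out.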

In most other compiled languages, this is very difficult to do without any external tools. One would most likely try to go the other way, and have the struct definition be the single source of truth, and output the default config file from that, either through a function that has to be kept up-to-date as fields are added and changed, or by a macro system, which is likely to be written in some DSL. If you wanted to have the configuration file as a plain text file, you would need to ensure that the file on disk is always consistent with the code; maybe you would want this to be a distinct step in the build process of the program.

Either way, the value proposition of Zig is clear: by simply allowing Zig code to be run at compile time[4] we get a powerful and easy to use metaprogramming system, without requiring anyone to learn a second language or use external tools.

Pointers, complaints, suggestions, and others can be sent to my public inbox (plain text emails only).

Thanks for reading.

Footnotes

  1. We could also have written var a: Range(i32) = .{ .from = 0, .to = 10 }; even though it might look funny that we have put a function call in the type specifier position of the expression, as this is usually reserved for type literals in other languages. Not so in Zig!

  2. This type is about to be renamed to just Type, but I'm running my code samples with the 0.9.1 compiler which is still using the old name.

  3. From Merriam-Webster: "Reify: to consider or represent (something abstract) as a material or concrete thing : to give definite content and form to (a concept or idea)"

  4. At comptime, the full Zig language is available, but there are some limitations. For instance, I/O is not allowed.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License