Row polymorphism is a kind of polymorphism of types that allows records/variants identified by their fields (rows), instead of their nominalness. I.e., you can select a type which includes the specific rows you want.
Implementing a similar mechanism of row polymorphism in C is not impossible. C provides a flexible and powerful macro system to generate some codes during compilation. With anonymous structures and unions, we can generate a type that has some specific fields, and access them by rows(’ values or references) instead of type.
In this write-up, we will implement a simple bump memory allocator to show this idea.
Bump allocator is a kind of memory allocator that uses a pointer to track the usage of memory linearly. In simplest case, you can’t actually “free” an object allocated by bump allocator — it only allows you to deallocate all bytes you allocated, without any mechanism for deallocating objects randomly. The following code shows a minimal implementation of the (static) bump allocator:
typedef struct {
uint8_t heap[ALLOCATOR_CAPACITY];
size_t usage;
} bump_t;
void *bump_alloc(bump_t *alloc, size_t size) {
if (size == 0) return NULL;
if (alloc->usage + size > ALLOCATOR_CAPACITY) return NULL;
void *ptr = alloc->heap + alloc->usage;
alloc->usage += size;
return ptr;
}
void bump_dealloc(bump_t *alloc) {
alloc->usage = 0;
}
We want stack allocation for less syscalls and its simplicity. But, you can easily see the limitation of this implementation here: the capacity of bump_t is fixed, and it cannot adapt all the cases. We want a more flexible interface. To achieve this, we use a macro to generate the structure with a user-specified capacity:
#define bump_t(size) struct { \
uint8_t heap[size]; \
size_t usage; \
}
Whenever user defined an allocator using this macro, it will generate a anonymous structure, with a fixed-size heap, which is completely determined during compilation.
Now, here is the problem: the structure generated is anonymous. We cannot access its fields directly, because we do not know its nominal type and cannot pass it to the function with any further type information beyond a void pointer. You definitely can use the macro with return value, which is a compile-extended feature of GCC, but I don’t want any non-portable code here. So, let’s dig further.
Rather than thinking about how to pass a complete, fully-typed structure to the functions, we can better take the pointers or values to its fields needed by them, and access them easily via the pointers or values. Here is the code:
void *bump__alloc(uint8_t *heap, size_t *usage, size_t cap, size_t size) {
if (!heap || !usage || cap == 0 || size == 0) return NULL;
if (*usage + size > cap) return NULL;
void *ptr = heap + *usage;
*usage += size;
return ptr;
}
And, a wrapper:
#define bump_alloc(alloc, size) \
bump__alloc((alloc).heap, &(alloc).usage, sizeof((alloc).heap), size)
After rewrote the allocation method, we could simply use this interface as the following example:
{ bump_t(2) alloc = {0};
uint8_t *seq = bump_alloc(alloc, 2);
} /* destructured automatically due to its lifetime */
Kind of convenient, right? Maybe there could be a bump__rows_t structure to record all of the fields included by a bump allocator to simplify the (internal) API and gain a better maintainability, but that’s out of the topic of this write-up.
Row polymorphism seems to be an advanced feature only owned by some specific powerful type system. But, in C, a weakly static-typed programming language, there is still a way to simulate this feature via some meta-programming using C macros, which allows you to create a more user-friendly, reasonable, and, the most important, type-safe API and reuse more code.
This trick won’t be only used in this tiny allocator. Here is an amazing project also used the similar idea, created by Tsoding, implementing a hash table.