Return Styles: Pseud0ch, Terminal, Valhalla, NES, Geocities, Blue Moon.

Pages: 1-

Lolli uses GNU/Emacs

Name: Anonymous 2021-07-02 14:38

...and massages the smelly RMS feet!

Name: Anonymous 2021-07-02 14:40

Yesterday I've seen Mr. Stallman in Boston. He was accompanied by a cute pre-teen girl. Is that his daughter? I was too shy to ask him myself.

Name: Anonymous 2021-07-02 14:44

Assuming the stack grows from top to bottom we can get a conservative approximation of the bottom of the stack by just taking the address of some local variable:

void* Cello_GC_Stack_Bot() {
void* p = NULL;
return &p;
}

But before we do this we need to ensure two things. First we want to make sure we flush all of the values in the registers onto the stack so that we don't miss a pointer hiding in a register, and secondly we want to make sure the Cello_GC_Stack_Bot function isn't inlined by the compiler. We can spill the registers into stack memory in a somewhat portable way with setjmp - which puts the registers into a jmp_buf variable. And we can ensure that the function is not inlined by only calling the marking function via a function pointer, who's value is decided using a volatile variable (volatile variables are immune to optimisations). Then we know that the Cello_GC_Stack_Bot function will return an address that will definitely cover the spilled registers and everything else on the stack above our call.

void Cello_GC_Mark_Prelude() {
jmp_buf env;
setjmp(env);

volatile int noinline = 1;
void (*mark_stack)(void) = noinline
? Cello_GC_Mark_Stack
: (var(*)(void))(NULL);

mark_stack();
}

Getting the top of the stack is a little more difficult, but assuming user programs start from main we can use a very cheeky macro to wrap it in a custom function that first registers the address of some local variable with the Cello GC, and then calls the user program:

int Cello_Main(int argc, char** argv);

#define main(...) \
main(int argc, char** argv) { \
var stk = NULL; \
Cello_GC_Init(&stk); \
return Cello_Main(argc, argv); \
}; \
int Cello_Main(int argc, char** argv)

Using these techniques we can get a safe approximate upper and lower bound to the area of stack memory that should contain all the relevant pointers to garbage collectable objects. Now all we need to do is scan this memory range and mark any pointers we find referenced.

void Cello_Mark(void) {

var top = Cello_GC_Stack_Top();
var bot = Cello_GC_Stack_Bot();

for (var p = top; p <= bot; p += sizeof(var)) {
Cello_GC_Mark_Item(*((var*)p));
}

}

But how can we tell if some block of memory is actually a pointer? We don't want to be following pointers recklessly or else we might cause a segfault. Now in general there is no way to distinguish between some memory that looks like a pointer, and an actual pointer itself - but there are a couple of heuristics that we can use to disregard lots of potential addresses.

First - pointers must be memory aligned - which means for 64-bit machines they can only be located every 8-byte boundary, and must only point to some value on an 8-byte boundary. This means the pointer value must be a multiple of the pointer size, and the address of the pointer must be a multiple of the pointer size. We can also keep track of the maximum and minimum pointer addresses we've allocated and quickly disregard anything outside of these bounds.

These simple measures will stop us following bad or invalid pointers in almost all cases, but we need to narrow it down even further because a bad memory access could crash the program. For this purpose a hash table is maintained which stores all the pointers which have been allocated by Cello. It can be used to quickly check if a pointer is to an allocated Cello object.

Here is what the function Cello_GC_Mark_Item roughly looks like. It does these brief checks and then on success does the actual marking and recursion.

void Cello_GC_Mark_Item(var a) {

if (a % sizeof(var) is 0
and a >= minptr
and a <= maxptr
and Cello_GC_Allocated(a)
and not Cello_GC_Marked(a)) {
Cello_GC_Mark(a);
Cello_GC_Recurse(a);
}

}

The recursion function is also simple. It either calls the mark function on the Cello object, or scans the memory at that location and tries to mark each segment of memory as if it were a pointer - just like the stack data.

static void Cello_GC_Recurse(struct GC* gc, var ptr) {

var type = type_of(ptr);

struct Mark* m = type_instance(type, Mark);
if (m and m->mark) {
m->mark(ptr, gc, (void(*)(var,void*))GC_Mark_Item);
return;
}

struct Size* s = type_instance(type, Size);
if (s and s->size) {
for (size_t i = 0; i < s->size(); i += sizeof(var)) {
var p = ((char*)ptr) + i;
GC_Mark_Item(gc, *((var*)p));
}
return;
}

}

This completes all that is required for the marking stage. The sweeping stage is equally simple. It just deletes all Cello Objects not marked! Again roughly it looks like this:

void Cello_Sweep(void) {

for (size_t i = 0; i < Cello_GC_NumItems(); i++) {
var ptr = Cello_GC_Item(i);
if (Cello_GC_Marked(ptr)) {
Cello_GC_Unmark(ptr);
} else {
Cello_GC_Delete(ptr);
}
}

}

Now when a user asks for a new allocation via new and memory usage has exceeded some bound we simply call Cello_Mark and Cello_Sweep to free any available memory.

As you probably suspected, in reality there is a little more going on, with various optimisations and tweaks to what is shown in the code snippits here, but the Cello Garbage Collector is still very simple, safe and portable - if a little slow!

Overall I hope I've shown that simple Garbage Collection for limited environments is not only possible in C - it is fairly simple to implement and understand.

Name: Anonymous 2021-07-02 14:59

>>2

SUGARFEET DADDY

Don't change these.
Name: Email:
Entire Thread Thread List