5 minute read

Daily dispatches from my 12 weeks at the Recurse Center in Summer 2023

I’m scared of pointers, okay?

Evil Pointer
Evil pointer

There, I said it!

I got into whole damned business by building websites and stuff – HTML, maybe some PHP, a little JavaScript, a relational database or two if I was lucky. Then more recently Python has been my daily driver. Python, okay? Understand? So now I’m looking at a bunch of C code filled with ampersands and asterisks, and I’m weighing the possibility that my computer might, I don’t know, melt or explode or something on compilation. I’m scared, I tell you!

Winkies

Ah but as young Paul Atreides once said, “Fear is the mind-killer.”

Alright Muad’Dib, time to face the fear and permit it to pass over me and through me and all that.

The Box of Pain

The concept of a register having a particular address where it stores some value is actually familiar thanks to my time with nand2tetris these past 6 weeks. Nevertheless, it strikes me as a great deal of power to wield, having this kind of direct access to computer memory and the sacred knowledge of where, precisely, the very stuff of software resides. So, to ease into the whole thing I created my own version of a simple swap function while following K&R.

Evil Pointer
Another evil pointer

Here’s something that will not work.

void swap(int a, int b)
{
    int tmp;

    tmp = a;
    a = b;
    b = tmp;
}

int main()
{
    int x, y;

    x = 3;
    y = 8;

    printf("Before swap: x = %d, y = %d\n", x, y);
    swap(x, y);
    printf("After swap:  x = %d, y = %d\n", x, y);

    return 0;
}
Before swap: x = 100, y = 200
After swap:  x = 100, y = 200

Here, I’ve created a swap() function that takes two integers, a and b, as arguments. After assigning the value at a to tmp, I assign the value of b to a and then tmp (the value formerly known as a) to b. Seems promising, but it doesn’t work, since calling swap(x, y) passes copies of the values, and not the values themselves, to the function. So tmp, a, and b indeed do their little dance together, but x and y are never affected.

As an alternative, we can pass not values to swap but the addresses where those values are stored, and from the body of swap() shuffle the things in these addresses around.

void swap(int *px, int *py)
{
    int tmp;
    tmp = *px;
    *px = *py;
    *py = tmp;
}

int main()
{
    int x, y;

    x = 3;
    y = 8;

    printf("Before swap: x = %d, y = %d\n", x, y);
    swap(&x, &y);
    printf("After swap:  x = %d, y = %d\n", x, y);

    return 0;
}
Before swap: x = 100, y = 200
After swap:  x = 200, y = 100

Here, &x is saying “the address of the register where the integer ‘x’ is.” Same with &y. This is called referencing, I suppose since we’re retrieving not the values assigned to x and y but their references.

Next, we’re passing these pointers to swap() as arguments *px and *py (‘p’ for pointer). *px and *py are just addresses that point to two places in memory.

In the body of the function, when we assign *px to a new integer variable called tmp, what we’re doing is saying “hey, get me the thing at this address *px and assign it to tmp, got it?” This is called dereferencing. We continue to deref stuff: “Hey, also, go ahead and assign the integer at address *py to address *px,” we say, “and while you’re at it do me a favor and put that integer tmp at address *py. Thanks!”

Now when we call swap(), we’re not manipulating copies of x and y; instead, we have direct access to their locations in memory, and we’re using that knowledge to move things from one box to another.

I must not fear

Evil Pointer
Another!

Okay, so, here’s a simple diagram of what is happening here (I think).

Here’s my RAM.

0 3 x
1 8 y
2
3
4

It has 5 (!) memory registers, with addresses ranging from 0 to 4. I’ve assigned the integer value 3 to x, which the compiler stuck at address @0, which presumably was free, and I’ve assigned the integer value 8 to y, which the compiler stuck at the next available address, which in this case is @1. Great!

Next order of business: let’s pass the addresses for x and y to swap() like so: swap(&x, &y). Once we’re inside swap, we’ll initialize a new integer, tmp, which the compiler conveniently puts at the next available place in RAM. (Note that there’s no value stored there yet.)

0 3 x
1 8 y
2 tmp
3
4

When we assigned *px to tmp, then, we’re saying we want to assign the value @0 (where *px is pointing) to wherever tmp is:

0 3 x
1 8 y
2 3 tmp
3
4

Then we assign *px = *py, setting the register @0 to whatever value is on the register @1.

0 3 8 x
1 8 y
2 3 tmp
3
4

Finally, we assign *py = tmp, which sets register @1 to whatever tmp is.

0 8 x
1 8 3 y
2 3 tmp
3
4

Hopefully this is mostly accurate.

Anyway, what’s cool is that we can actually print out the addresses if we want, so

int main()
{
    int x, y;

    x = 3;
    y = 8

    printf("@x = %p\n", &x);
    printf("@y = %p\n", &y);

    return 0;
}

will yield

@x = 0x7ff7bcc93918
@y = 0x7ff7bcc93914

And this is neat for another reason, too, since we can see that x and y are being stored 4 addresses away from one another – which makes total sense, since integer types take up, you guessed it, 4 bytes of space! So the compiler seems to be saying, “Cool, you’re declaring two integers here? Each needs four bytes of space, and I’m gonna store them right next to each other – at addresses 140,702,000,953,620 and 140,702,000,953,624 if you want to write it down.”

Other stuff

  • Finished implementing DNS in python with the Implement DNS in a Weekend crew. Spent a little time reimplementing it in C (which is what led me down this pointer hole in the first place), but, well, that might take a while. Definitely bumping into walls and learning a lot.
  • Paired on some C stuff, such as the above, and felt like I’m having one of those tiny steps/giant leaps situations
  • Group meeting and assembler code review for nand2tetris
  • Fun pairing sesh with someone doing wacky unthinkables in P5.js