5 minute read

Daily dispatches from my 12 weeks at the Recurse Center in Summer 2023

As much as I’m enjoying getting thrown into the deep end of C with Build Your Own Lisp, I’m finding it gratifying and delightful all around to be learning the fundamentals in an orderly fashion with Kernighan & Ritchie.

Histograms in the Terminal

Among my K&R activities of the weekend was creating several simply programs to tabluate and then display data received from stdin. Here’s one such program that I found particularly fun. It tallies up the input by word length and spits out a crude histogram:

> Alice was beginning to get very tired of sitting by her sister on the 
  bank, and of having nothing to do: once or twice she had peeped into 
  the book her sister was reading, but it had no pictures or conversations 
  in it, “and what is the use of a book,” thought Alice “without pictures 
  or conversations?”
Word length count, horizontal histogram:
 0 |
 1 | #
 2 | ##############
 3 | ################
 4 | #####
 5 | #####
 6 | #####
 7 | ###
 8 | ####
 9 | #
10 | #
11 |
12 |
13 | #
14 |


Word length count, vertical histogram:
16 |           #
15 |           #
14 |        #  #
13 |        #  #
12 |        #  #
11 |        #  #
10 |        #  #
 9 |        #  #
 8 |        #  #
 7 |        #  #
 6 |        #  #
 5 |        #  #  #  #  #
 4 |        #  #  #  #  #     #
 3 |        #  #  #  #  #  #  #
 2 |        #  #  #  #  #  #  #
 1 |     #  #  #  #  #  #  #  #  #  #        #
     ---------------------------------------------
      0  1  2  3  4  5  6  7  8  9 10 11 12 13 14  <-- word length

From a programming perspective, this is not particularly difficult, but it was satisfying to think through in a new language. If I were in Python, I would probably would have defaulted to using a dictionary with word lengths as keys and counts as their paired values. Here, however, I used an array to hold word-length counts, with each index serving as the placeholder for words of that length. That meant that displaying the horizontal historgram was a matter simply of looping through the array, and then looping through the count at that index and printing a hash each time.

The situation was slightly more complex for the vertical histogram, since it needed to be drawn from the top down, which meant that we also had to keep track of the maximum word-length count. In the case of the first paragraph of Alice’s Adventures in Wonderland, three-letter words occur most frequently at 16 times, so here our outer loop would count down from 16 to 0, and an inner loop would cycle through the word-length count array, printing a hash for each item with at least that many occurrences. There’s only one word-length that 16 or 15 occurrences, but by the time we get to 14 occurrences in the outer loop, we start getting a match for 2-letter words, as well. On and on it goes.

Tallying Characters

A related exercise was to tally up occurrences of each character. I decided to take a similar approach and used an array to hold my character values. In this case, the array was length 128 to account for all ASCII characters.

The same opening paragraph from Alice looks like this:

RAW COUNT
  ( 10):  2
  ( 32): 56
, ( 44):  4
: ( 58):  1
? ( 63):  1
A ( 65):  2
a ( 97): 13
b ( 98):  6
c ( 99):  8
d (100):  8
e (101): 25
f (102):  3
g (103):  8
h (104): 14
i (105): 23
k (107):  3
l (108):  2
n (110): 20
o (111): 24
p (112):  4
r (114): 14
s (115): 16
t (116): 26
u (117):  6
v (118):  4
w (119):  5
y (121):  2


HISTOGRAM
  ( 10) | ##
  ( 32) | ########################################################
, ( 44) | ####
: ( 58) | #
? ( 63) | #
A ( 65) | ##
a ( 97) | #############
b ( 98) | ######
c ( 99) | ########
d (100) | ########
e (101) | #########################
f (102) | ###
g (103) | ########
h (104) | ##############
i (105) | #######################
k (107) | ###
l (108) | ##
n (110) | ####################
o (111) | ########################
p (112) | ####
r (114) | ##############
s (115) | ################
t (116) | ##########################
u (117) | ######
v (118) | ####
w (119) | #####
y (121) | ##

Here we are essentially reading in text from stdin one character at a time and incrementing the value at the index number of that character’s integer equivalent. Displaying the results is just a matter of looping through that array and printing the value at each index number (I ignored items with 0 count to keep the output relatively brief).

What I like about this solution is that it allows us to account for the full (ASCII) spectrum of characters and also gives us some visibility into the integer equivalents of each character. That includes invisible characters like spaces (32 in decimal) and newlines (10 in decimal).

Again, pretty simple as software goes, but an exercise that is incredibly illuminating from the perspective of understanding C fundamentals and that offers some glimpses as to how typing is working under the hood.

Other things

Over the weekend

  • Hunkered down with C doing exercises from Kernighan & Ritchie book for a while, including the above
  • Did some LeetCode pairing
  • Started modeling for an ML project to establish benchmarks before delving into feature engineering

Today

  • Worked through a few more sections of CPL
  • BYOL is jumping in complexity, and jumping far ahead of where I’m at in CPL, so today’s group meeting was especially illuminating and productive when it came to developing better intuition for memory management and pointers in C.
  • Paired with a few nand2tetris folks on writing an assembly language program to multiply two numbers in memory.