What I learned from K&R

I just finished reading the book on programming in C, appropriately called The C Programming Language, by Brian Kernighan and Dennis Ritchie. It’s nicknamed “K&R” after the authors. Dennis Ritchie invented C and was a major designer of Unix. Brian Kernighan wrote cron and is the “k” in awk, among other awesome things.

The thing is, I’ve known C for a long-ass time – since before Microsoft released their first C++ compiler, which is when I learned C++ (from the book that came with the MSVC 1.0 box). So why did I bother to read K&R now?

First, it’s an inspiringly well-written book. Very few books start from the basics and lead you all the way through to the advanced stuff without skimping on the details, and this is one of them (others are Calculus by Michael Spivak and The TeXbook by Donald Knuth). I love books like that. They are both fun and efficient to read – beautiful. Second, the more I program, the more I appreciate deep mastery of a language. And finally, I am in the midst of designing a new programming language for fun.

Interesting stuff from K&R

I thought about turning this into an essay, but it’s more fun to read the way I took the notes: as a random jumble of interesting tidbits that caught my eye as a veteran C programmer reading K&R for the first time. If you’re in the same position – you know C but haven’t read K&R – consider this a summary just for you!

  • The point of a struct tag (as in struct tagname { int i; float f; };) is to effectively define a new type called struct tagname. I’ve always used typedefs, so I never really understood the point of the tags before.
  • The authors like 4-space indents (I like 2, and Linus Torvalds likes 8).
  • The authors are ok with brace-free one-line code blocks, such as a single line after an if statement. Every style guideline I’ve ever read (and yes, I’ve read a few) has forbidden this, so it’s interesting.
  • They like assignments and/or increments within conditional expressions (as in the line if ((fp = fopen(*++argv, "r")) == NULL) { on page 162). I prefer to have about one primary verb per line of code.
  • They sometimes declare function prototypes within function definitions. I didn’t even know that was legal.
  • They sometimes declare both variables and function prototypes in the same comma-delimited list.
  • The names for argc and argv come from “argument count” and “argument vector”.
  • The type void was originally not in the language.
  • They didn’t seem to feel very strongly against global variables. I’ve read in many places that global variables are basically forbidden, though I personally think they are not so bad if you can use them without polluting the global namespace, treating them as essentially local to a file or very small set of files.
  • There is a rationale behind the syntax for all pointer declarations (including for function pointers): they are designed so that if you replaced *myVar with just myVar, you’d get a declaration for the pointed-to object. And they chose this since, at usage time, when you dereference a pointer, you are talking about *myVar, so there is some consistency there (I wish someone had told me this idea a long time ago – it makes a lot more sense now!)
  • The authors sound as if they wished malloc could have returned a more specific type based on the input, which makes sense. I like that they care more about design honesty than salesmanship.
  • You can make bit-fields with a struct! As in, you can declare struct fields to be a certain number of bits in size. I didn’t know that. I guess most people prefer bitwise operators, though, because I don’t think I’ve ever seen that used.
  • In printf, you can provide variables for field-width and precision specifiers; for example printf("%*s %*.*f", 8, "hi", 5, 2, 3.14159) will print out "      hi  3.14" (the string is padded to width 8, and the number to width 5 with 2 decimal places).
  • The authors don’t mention buffer overflows when talking about things like sprintf, strcpy, or memcpy. Ruh-roh, be careful newbie coders!
  • File-handling functions like fputs or fprintf accept the FILE pointer in different positions – it would be easier to use if it were always the first parameter. They also include a sample of what’s behind a FILE pointer, which is somehow very satisfying to find out (it always felt like a sort of condescending mystery to me).
  • I still don’t know what the leading c in “calloc” stands for. (Some people say “clear”, but I haven’t seen anything definitive.)
  • In the example in section 8.6, they return a pointer to a static variable, which some folks consider to be bad practice.
  • Technically, you’re not allowed to compare pointers with operators like < or > unless they point into the same array (or one past the end); testing for equality is fine. I didn’t know it was so restrictive. I also noticed their sample implementation of malloc/free breaks this rule.
  • They really like one- or two-letter variable names. Check out the implementation of free on page 188. I like slightly more descriptive names, sort of in the Python culture (as opposed to ass-long names, as in the Java culture).
  • There is some insane stuff going on in setjmp.h. I had no idea.
  • Here’s an example of a declaration that’s fairly tricky to parse: void (*signal(int sig, void (*handler)(int)))(int). What? It’s a function that takes an int and a function pointer as inputs, and returns a function pointer as well.
  • There used to be a keyword called entry but it was dropped from the language. I wonder what it might have done (they don’t say).
  • Incremental assignment operators like += used to be written in the other order (and allowed spaces between them!) like x =+ 1, but that was changed due to the ambiguity of something like x=-2, which could mean either “assign -2 to x” or “subtract 2 from x”.

Good stuff.

A new language

I don’t have much to show yet for the new language. The primary goal of the language is to be faster than C but still allow for very nice code, even while working with large-scale programs. I have some specific ideas for the speed in mind.

Other features I’m interested in:

  • No classes; instead use scopes and interfaces together. A scope is like a struct in C, but can be smart about inheritance-like behavior, and an interface is like a pure virtual class in C++ or an interface in Java.
  • Built to give the coder maximal control, Lisp-style.
  • Post-compile optimization. Nuff said.
