Computer Science Education vs Computing Science Education, January 13, 2017

Wanting to learn from the masters, I read MIT's (in)famous Structure and Interpretation of Computer Programs. I have strong mixed feelings about what I got out of the book. In short, I think that SICP introduces a lot of important concepts in the science of computing, but at the same time, I think that it doesn't introduce enough fundamental concepts about how computing machines operate.

To give a few examples, there is one exercise that challenges the reader to understand how the Church Encoding can be used to describe numbers in a theoretical language that lacks numbers. In short, the Church Encoding uses higher-ordered functions to represent performing an operation a given number of times. I find this absolutely fascinating. It quickly became apparent to me that the Church Encoding can be used to derive the Von Neuman Ordinals, which represent numbers as sets. Another important concept introduced is the Y-Combinator function, a higher-ordered function that encapsulates the very concept of recursion. And oh yes, if you're not familiar with the Y-Combinator function, know that it is notoriously difficult to understand well. I'll admit that I, myself, have at most a very tenuous understanding of it.

If I were teaching a computer science curriculum myself, I would definitely want my students to be exposed to these concepts, but as a practical matter, they really don't make that much of a difference in the work on I do on a day to day basis. I think you could compare this to what Joel Spolsky, co-founder of StackOverflow, had to say about programming with pointers. Many modern programming languages provide managed memory with automatic garbage collection. This is arguably a very good thing because it automates programming tasks that can be tedious and extremely error-prone. It does come at a performance penalty, and for applications where performance is absolutely critical, non-managed languages can still be used. But what Spolsky had to say about managed languages is that students who learned in such languages were never exposed to the challenge of programming with pointers, and that he finds that programmers who know pointers are typically more skillful than those who don't, and that requiring students to learn about pointers is a good way of weeding out students who aren't fit to be programmers.

My primary criticism of SICP as an introduction to computer science is not so much what it teaches, but what it doesn't teach.

One of the first things taught to me when I took an introductory computer science course is how computers perform arithmetic, both integer and floating point. I learnt about relative advantages and disadvantages of different ways of representing numbers. I learnt about the perils of arithmetic overflow and dual representations of zero. SICP delves into a lot of numerical algorithms, but I can't recall that anywhere in the book does it say anything about how a computer executes algorithms. I think that learning from such a course would have driven me mad. I have always been fascinated with computers and what they allow us to do, and I was chomping at the bit to see how they really work!

Another important concept is recursion. It's another thing that Joel says that every programmer should be educated in, even if he never writes a single recursive function on the job. Oh boy, does SICP teach about recursion, even going so far as to introduce the Y-Combinator function, as I mentioned above. But once again, it doesn't tell the reader much about how recursion works in practice. In CS101, I was taught the concept of the call stack and how the computer uses the call stack to store data about the function currently being executed. We talked about how the call stack contains pointers back to where a function was invoked, and we learnt about the dangers of overflowing the stack.

If you forced me to make a choice that I would either understand the call stack or the Y-Combinator function for the rest of my life, but could never understand both, I would choose the call stack every time. It's a concept I use at work every day. I must add, tho, that SICP does introduce tail-recursion, which is a very useful concept which I do not recall my CS101 class included.

I'm not saying that SICP didn't contain a lot of the same material as the CS101 class I took. A lot of the teaching about data structures and algorithms was similar, but at it's heart, SICP struck me as a book about theoretical computing science as opposed to practical computer science.

I simply fear that there are students out there who are being put off from studying computer science by a curriculum that simply doesn't emphasize what they are truly fascinated by.