Blog of Stuff: Posts tagged 'Teaching'

Please don't teach trees as undirected graphs

2024-11-29T15:40:07Z

I think I’m not alone in suggesting that the tree, and perhaps the binary tree specifically, is the most foundational data structure in computer science.

This raises the question of how binary trees should be introduced, and I’d like to make the case that introducing binary trees as a special case of undirected graphs is not a good idea.

I can see reasons why you might think this is a good idea; both trees and graphs involve connecting circles with straight lines on the board. If you take the general notion of a graph, and you happen to arrange the lines and circles in a particular way, it’s a tree. That kind of suggests that graphs are a generalization of trees, and that you might want to introduce trees as a special case of these graphs.

Please do not do this.

Specifically, my claim is that this produces a complex definition of trees with many global constraints, a definition which makes it very hard to prove some simple things.

Let’s see why.

Trees as Undirected Graphs

To begin with, consider the definition of a graph, a set of nodes along with a symmetric binary relation on nodes that we call “edges”.

In order to construct the set of trees, we can restrict this definition in the following ways.

A tree is a graph with the following properties:

We designate one node as the root node,
it is connected, and
it has no cycles.

The last two of these are global constraints requiring complex checks, and verifying the third is genuinely quite complex, especially for a first-year student.
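To make the cost of those global checks concrete, here is a minimal sketch of a validity check. The representation (a set of node labels plus a set of two-element tuples, each undirected edge listed once) and the name is_undirected_tree are assumptions made for illustration. The sketch leans on the standard fact that a connected graph on n nodes has no cycles exactly when it has n - 1 edges, and even with that shortcut it has to walk the whole graph.

```python
from collections import deque


def is_undirected_tree(nodes, edges):
    """Return True if the undirected graph is connected and has no cycles.

    nodes: a set of hashable node labels (hypothetical representation)
    edges: a set of (a, b) tuples, each undirected edge listed once
    """
    if not nodes:
        return False  # assumption: this sketch does not count the empty graph

    # A connected graph on n nodes is acyclic exactly when it has n - 1 edges,
    # so the cycle check collapses into an edge count plus a connectivity check.
    if len(edges) != len(nodes) - 1:
        return False

    # Build adjacency sets from the symmetric edge relation.
    adjacent = {n: set() for n in nodes}
    for a, b in edges:
        adjacent[a].add(b)
        adjacent[b].add(a)

    # Breadth-first search from an arbitrary node; every node must be reached.
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacent[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == nodes
```

Note that nothing about a single node or edge tells you whether the answer is yes; the check is inherently about the whole structure.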

Suppose now that we want to identify the descendants of a node. This is where things start to get really unpleasant. The descendants of a node are the adjacent nodes that are not the parent node. But which is the parent node? The root node has no parent node. For other nodes, it is the adjacent node which is the first along the path to the root node. Again, this is not a simple process; if we provide a symmetric edge relation and a root node, determining the path to the root node (and thus the parent node (and thus, by elimination, the descendants)) requires potentially a full search of the tree. Yes, this can be cached. Yes, it doesn’t matter whether you do a depth-first or breadth-first search. It’s still a lot of work.
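As a rough sketch of what that search looks like, assuming the tree is handed to us as a dictionary of adjacency sets plus a designated root (the name children_of and the representation are hypothetical):

```python
def children_of(adjacent, root, target):
    """Return the children of target in an undirected tree rooted at root.

    adjacent: dict mapping each node to the set of its neighbors
              (a hypothetical encoding of the symmetric edge relation)
    """
    # Search outward from the root, recording each node's parent as we go.
    parent = {root: None}
    stack = [root]
    while stack:
        current = stack.pop()
        for neighbor in adjacent[current]:
            if neighbor not in parent:
                parent[neighbor] = current
                stack.append(neighbor)
    # The children are the neighbors that are not the parent.
    return {n for n in adjacent[target] if n != parent[target]}
```

This sketch searches the entire tree before answering a question about one node. It could stop early or cache the parent table, but the point stands: the answer is not available locally.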

But wait! Suppose we want a binary tree? Firstly, we must ensure that every node has at most two descendants, which is the same as ensuring that the root node has at most 2 neighbors, and every other node has at most 3 (its parent plus two children). That’s the easy part, though. In a binary tree we typically care whether a child is the left child or the right child; adding this to a graph representation requires labeling every node as either a left child or a right child, and then checking to make sure that no node has more than one left or right child.

Inductively Specified Trees

Okay, so what’s the alternative?

The alternative is the inductive specification of a binary tree: a binary tree is either an empty tree, or it is a node containing a left tree and a right tree.

This representation is essentially a tree by definition; no global checks are required. Asking whether a binary tree specified in this way is a binary tree is genuinely trivial: only binary trees can be specified using this schema.

You could argue that inductive specifications are hard for students, and I wouldn’t completely disagree, but I would also argue that the overwhelming simplicity of this definition of binary trees powerfully outweighs the challenge of becoming comfortable with inductive specifications.

Identifying the descendants of a node here is essentially trivial; an empty tree has no descendants, a non-empty tree lists its two descendants.
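For comparison, here is a minimal sketch of the inductive specification in Python, written in the data-definition comment style. Using None for the empty tree and the names Node, children, and size are choices made for this illustration only.

```python
from dataclasses import dataclass
from typing import Optional


# A BinaryTree is one of
# - None, representing the empty tree, or
# - Node(BinaryTree, BinaryTree)
@dataclass(frozen=True)
class Node:
    left: "Optional[Node]"
    right: "Optional[Node]"


def children(tree):
    """Return the immediate subtrees: none for the empty tree, two otherwise."""
    if tree is None:
        return []
    return [tree.left, tree.right]


def size(tree):
    """Count the nodes; the recursion follows the data definition line by line."""
    if tree is None:
        return 0
    return 1 + size(tree.left) + size(tree.right)
```

There is no validity check to write, and left versus right is settled by the field names. Functions over the tree have one case per line of the data definition.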

Freebie: this representation also sidesteps the irritating issues around identity which bedevil programmers everywhere.

What about directed graphs?

Trees as Directed Graphs

Directed graphs are considerably better than undirected graphs, but still way worse than the inductive specification. Specifically, you need a directed graph that is connected and has no directed cycles, and you also need an additional constraint that at most one node points to any given node (making the arbitrary choice that the edges point from parents to children). Checking these constraints still requires examining the whole graph. Identifying the child nodes is much nicer. Distinguishing the left and right children is still a serious problem.
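A sketch of the directed-graph check, under the assumption that edges are (parent, child) tuples; the function name and representation are hypothetical. It checks that exactly one node has no parent, that no node has two parents, and that everything is reachable from the root, which together rule out directed cycles.

```python
def is_directed_tree(nodes, edges):
    """Return True if the (parent, child) edges form a rooted tree over nodes."""
    if not nodes:
        return False  # assumption: this sketch does not count the empty graph

    parents = {n: [] for n in nodes}
    kids = {n: [] for n in nodes}
    for parent, child in edges:
        parents[child].append(parent)
        kids[parent].append(child)

    # Exactly one root, and at most one incoming edge per node.
    roots = [n for n in nodes if not parents[n]]
    if len(roots) != 1:
        return False
    if any(len(ps) > 1 for ps in parents.values()):
        return False

    # Every node must be reachable from the root; a cycle among non-root
    # nodes would be unreachable, so this also excludes directed cycles.
    seen = set()
    stack = [roots[0]]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(kids[current])
    return seen == nodes
```

Finding children is now a dictionary lookup, but the validity check is still a whole-graph traversal, and nothing here says which child is the left one.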

Done

To summarize: it’s reasonable to talk about the correspondence between these two representations. You might even want to try to prove them isomorphic. But please don’t introduce trees as a special case of graphs.

On the relationship between mathematical functions and program functions

2020-12-11T16:46:35Z

NOTE: this text is a brief overview intended for instructors of a CS course that we’re developing; it is not technical. It tries to make the case that our early courses should steer students toward understanding problems in terms of pure functions. If you have suggestions or feedback, I’d love to hear them/it.

This section of the course introduces functions, a crucial topic in the field of computer science AND in the field of math.

Programmers and mathematicians sometimes think about the term “function” somewhat differently. Furthermore, some people who are familiar with both fields assign different meanings to the word “function” in the two fields.

The definition of a function in the mathematical domain is fairly well specified, though of course things get a little fuzzy around the edges. We’re going to define functions as the arrows in the category Set, more or less (if that’s not helpful, ignore it). That is, a function has a specified domain and a specified codomain, and it maps every element of the domain to a particular element in the codomain. There’s no requirement that it map every element of the domain to a different element of the codomain (one-to-one) OR that there be some element of the domain that maps to any chosen element of the codomain (onto). This (I claim) is the standard notion of “function” in math.[*]

Programmers also use and love functions. Nearly every programming language has a notion of functions. Of course, they’re sometimes called “procedures” or even “paragraphs” (I believe that’s COBOL. Yikes.). In programming, functions are often thought of as being elements of abstraction that are designed to allow repetition. And so they are. But it turns out that they also, in most modern programming languages, can be thought of as mathematical functions. Well, some of them can.

For many functions, this is totally obvious. If I consider the function from numbers to numbers that we might write in math class as f(x) = 14x + 2, then I can write that as a function in most programming languages. (If you disagree with me, hold that thought.)
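In Python, for instance, that function might be written as follows (the name f is taken from the formula; everything else is just the obvious transcription):

```python
def f(x):
    """The function from numbers to numbers written in math class as f(x) = 14x + 2."""
    return 14 * x + 2
```

Calling f(1) gives 16 every time, which is exactly the table-of-inputs-and-outputs behavior that the mathematical definition describes.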

But… things aren’t always so clear. What about a function that doesn’t return at all? What about a function that takes input, or produces output? What about a function that mutates an external variable, or reads a value from a mutable variable? What about a function that signals an error? All of these present problems, some more substantial than others. None of these have a totally obvious mapping to mathematical functions.

There certainly are ways to fit these functions into mathematical models, but in general, the clearest lesson is that when there is a natural way to express a problem using functions that map directly to mathematical functions, we should. These are generally called “pure” or “purely functional” functions.

So, why should it matter whether our functions are pure? What benefits do we gain when we express functions in purely functional ways?

The clearest one is predictability, also known as debuggability and testability. When I write a pure function that maps the input “savage speeders” to 17, then I know that it will always map that string to 17; I don’t need to worry that it will work differently when the global foo counter is less than zero, or when it’s run in parallel, or on Wednesday, or when the value in memory location 0x3342a7 is less than the value in memory location 0x3342a8.

Put differently, pure functions allow me to reliably decompose problems into sub-pieces. When I’m debugging and testing, I don’t need to worry about setup and teardown to establish external conditions.

Another way to understand this is to dip into functions that use mutation. If we want to model these as mathematical functions, we need to understand that in addition to their stated inputs, they take additional hidden inputs. In the simplest case, this may be the value of a global counter. Things get much more complex when we allow mutation of data structures; now we need to worry about whether two values are the “same” value; that is, whether mutating one of them will change the other. Worse still, mutating certain values may affect the evaluation of other concurrent procedures.
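A small sketch of the “hidden input” idea, using a global counter as in the simplest case described above. The names counter, impure_label, and pure_label are invented for this illustration.

```python
counter = 0  # hidden state that the impure version reads and mutates


def impure_label(name):
    """Same argument, different result on each call: counter is a hidden input."""
    global counter
    counter += 1
    return f"{name}-{counter}"


def pure_label(name, count):
    """The hidden input made explicit: takes the count and returns the new count."""
    return f"{name}-{count + 1}", count + 1
```

Two calls to impure_label("a") return different strings, so a test of it has to establish the counter first. pure_label("a", 0) always returns ("a-1", 1), with no setup or teardown.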

For these reasons and others like them, pure functions are vastly easier to reason about, debug, and maintain. Over time, many of our programming domains and paradigms are migrating toward primarily-pure settings. Examples include the spread of the popular map-reduce frameworks, and the wild explosion of popularity in deep learning networks. In both cases, the purity spreads downward from a mathematical framework.

Note that it is not always the case that pure approaches are the most natural “first choice” for programmers, especially introductory programmers, for whom programs are often imagined as a “sequence of changes”; do this, then do this, then do this, then you’re done. In this model, the program is performing a series of mutations on a larger world. Helping introductory programmers move to a purer model is a challenge, but one with substantial payoff.

For this reason, this section focuses directly on pure functions, and invites students to conceive of programs using the models that they’ve been taught in elementary and secondary school, most particularly tables mapping inputs to outputs.

[*] The only reason I mention the category Set is to draw attention to the distinction between “codomain” and “range”; every function has a named codomain, regardless of whether its range covers it. For instance, the “times-two” function from Reals to Reals is a different function from the “times-two” function from integers to integers, and the “times-two” function from integers to reals is yet a third function.

knowing what's out there

2017-10-03T18:23:23Z

I’m teaching a class to first-year college students. I just had a quick catch-up session with some of the students that had no prior programming experience, and one of them asked a fantastic question: “How do you know what library functions are available?”

In a classroom setting, teachers can work to prevent this kind of question by ensuring that students have seen all of the functions that they will need, or at least that they’ve seen enough library functions to complete the assignment.

But what about when they’re trying to be creative, and do something that might or might not be possible?

Let’s take a concrete example: a student in a music programming question wants to reverse a sound. How can this be done?

First thing, they probably look in the documentation for the sound library. They scan the list of function names, and none of them looks like something that reverses sounds. Should they give up?

Answer: no.

So what should they do?

Probably, they should look for a more general function that could be used to accomplish this task. For a first year student, this is a deeply daunting task. That is, they have to look at a sequence of thirty or forty functions and for each one, quickly determine whether or not it could be used in a program to accomplish the desired task. Even worse, it may be the case that they need to assemble several of these functions. In this case, the student needs to look at a function and think, “Gee… this function could be part of a solution, if I can figure out a way to solve this new problem that it creates.” You can see that this can lead to an exponential exploration of problem space.

Instead, an experienced programmer will probably tacitly generalize the desired function. “There’s no function that reverses a sound; is there any kind of more general sound rearranging that’s possible?” The answer here is yes. Unfortunately, there are actually two different ways to do this. First, there’s a clip function that can cut a portion out of a sound, and there’s an append operation that can glue sounds together. Separately, there’s a rearrange function that allows a user to provide a function to be used in mapping the frames of an old sound onto a new one.

Which one to use? It turns out that the first solution has some serious computational issues associated with it. If I want to reverse a three-minute sound, I’ll need to create a list containing 8,640,000 separate sounds.¹ This will probably take a while. The second one is much faster. So now, even after finding a set of library functions that work, the student needs to backtrack and try a different one.
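To sketch the two approaches without pretending to reproduce the actual sound library: the functions below are hypothetical stand-ins that treat a sound as a plain Python list of frames. The names clip, append_sounds, and rearrange echo the operations described above, but their signatures here are assumptions. The 48,000 frames per second is inferred from the 8,640,000 figure for a three-minute sound.

```python
def clip(frames, start, end):
    """Stand-in: cut the frames in [start, end) out of a sound."""
    return frames[start:end]


def append_sounds(sounds):
    """Stand-in: glue a list of sounds together, in order."""
    result = []
    for sound in sounds:
        result.extend(sound)
    return result


def rearrange(frames, length, index_map):
    """Stand-in: new sound whose frame i is the old frame index_map(i)."""
    return [frames[index_map(i)] for i in range(length)]


def reverse_with_clip_and_append(frames):
    # One single-frame sound per frame: this is the list of millions of sounds.
    pieces = [clip(frames, i, i + 1) for i in range(len(frames))]
    return append_sounds(list(reversed(pieces)))


def reverse_with_rearrange(frames):
    n = len(frames)
    return rearrange(frames, n, lambda i: n - 1 - i)


# Three minutes at an assumed 48,000 frames per second.
FRAMES_IN_THREE_MINUTES = 3 * 60 * 48_000
```

Both versions produce the same reversed sound; the difference is that the first builds one intermediate sound per frame, and the second builds only the result.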

Ouch.

Fast Forward One Day

Okay, so now it’s Tuesday.

I’m working on the handin server for my upper-division Programming Languages class, and I notice a funny problem in the logs. To wit:

[33|2017-10-03T11:05:03] checking Program2 for (clements)
[33|2017-10-03T11:05:04] running 794KB (170MB 184MB)
[33|2017-10-03T11:05:09] running 70MB (239MB 255MB)
[33|2017-10-03T11:05:09] done testing. 0 tests failed.
[33|2017-10-03T11:05:13] running 54MB (224MB 238MB)
[33|2017-10-03T11:05:17] running 54MB (224MB 238MB)
[33|2017-10-03T11:05:21] running 54MB (224MB 238MB)
[33|2017-10-03T11:05:25] running 54MB (224MB 238MB)
[33|2017-10-03T11:05:29] running 54MB (224MB 238MB)
[33|2017-10-03T11:05:33] running 54MB (224MB 238MB)
[33|2017-10-03T11:05:36] running 54MB (224MB 238MB)
[33|2017-10-03T11:05:40] running 54MB (224MB 238MB)
[33|2017-10-03T11:05:44] running 54MB (224MB 238MB)
[33|2017-10-03T11:05:48] running 54MB (224MB 238MB)
[33|2017-10-03T11:05:51] session killed (timeout) while running tests

What we see here is that at 11:05:09, the submission passed all the tests. However, at 11:05:51, the server timed out and the submission was marked as a failure.

Why?

The TL;DR here is this: I didn’t know (enough) about a library function. But it took me an hour to figure it out.

Here’s the extended version: I took a look at my checker module, and it looked fine. Specifically, it ended by logging the number of failures, and then showing a dialog box to the user that indicated the number of failures. Nothing wrong there.

I spent about ten minutes writing a careful e-mail to the Racket Users mailing list, trying to figure out what’s wrong with the handin server. As I was carefully explaining why it couldn’t be my fault… I realized that it was. Specifically, my dialog box was a modal one, waiting for the user to click ‘OK’. If the user doesn’t click ‘OK’, the call to the message box function doesn’t return, and the checker module doesn’t finish, and as far as the handin server is concerned, the user’s program has failed to halt and should be removed. Ouch. Delete e-mail before sending.

Side note: if you don’t use the unbelievably effective technique of carefully formulating a bug report for a group of people that you respect and whose time you’re leery of wasting—you should. It works.

Sub-side note: unfortunately, if you’re in college, you probably don’t have the judgment or patience to be aware that you’re wasting other people’s time. Ah well.

So, how to solve this problem? My first try is to put the call to message in its own thread.

I’m not entirely surprised when this starts behaving very badly indeed; I get messages like

[1|2017-10-03T11:10:20] ERROR: upload not confirmed: hekok

and

[2|2017-10-03T11:11:13] ERROR: upload not confirmed: chc

What the heck??

After a quick search for the string ‘hekok’ in the source, I decide that that’s probably a different scary bug, and that I don’t have time to debug it.

Side note: It makes me sad to see what I think might be bugs that I decide not to pursue; it’s a chance to make the world slightly better that I’m deliberately passing up. Of course, it might take a long time to track them down, and in many cases, they turn out not to be bugs at all.

So, maybe I can make this message part of a final result. Time to read some docs: I read the docs for the check: form that defines the checker module, and all of the optional arguments. Nothing. Then, on a whim, I decide to read the docs for the message form.

Aha! It turns out that the message form can be passed the style final, in which case the message is used as the final message to the student, after the submission is complete.

This is exactly what I want, and I use it.

Problem solved.

It’s at about this moment that I realize that what I’ve been experiencing is almost exactly the same problem that my students are facing; I don’t know what’s in the libraries. In some ways, I’m even more dangerous than the students, because I know of ways to hack around the problem, and solve it the wrong way.

So: how are we supposed to know what libraries are available? In my mind, it’s still a major open question. Let’s see if I can get anyone at RacketCon interested.

Alternate Ending

There’s an alternate, depressing conclusion to my experience: it’s incredibly hard to get software right. Every piece of software is riddled with problems like this that don’t occur frequently, and are arguably not even “bugs”, per se, except that they obviously are, and fixing one takes about an hour of skilled programmer time. There’s not enough time in the day. It’s all going to fall apart.

Or maybe we’re going to give up on “engineering” our programs and fall back to “evolving” them; we accept an ecosystem of horribly buggy software and choose the best stuff and apply weird patches and lash them together with baling wire. Nasty, but it really works.

¹ : … or 100x the number of seconds in a day. Coincidence, sorry.

ontologies OF programs

2017-08-22T14:59:33Z

Reading Daniel Dennett’s “From Bacteria to Bach and Back” this morning, I came across an interesting section where he extends the notion of ontology—a “system of things that can be known”—to programs. Specifically, he writes about what kinds of things a GPS program might know about: latitudes, longitudes, etc.

I was struck by the connection to the “data definition” part of the design recipe. Specifically, would it help beginning programmers to think about “the kinds of data that their program ‘knows about’”? This personification of programs can be seen as anti-analytical, but it might help students a lot.

Perhaps I’ll try it out this fall and see how it goes.

Okay, that’s all.

restrictive or: notes from the dark side

2017-04-26T17:33:13Z

Okay, it’s week four of data structures in Python. In the past few days, I’ve read a lot of terrible code. Here’s a beautiful, horrible, example:

# An IntList is one of
# - None, or
# - Pair(int, IntList)
class Pair:
    def __init__(self, first, rest):
        self.first = first
        self.rest = rest

    # standard definitions of __eq__ and __repr__ ...

# A Position is one of
# - an int, representing a list index, or
# - None

# IntList int -> Position
# find the position of the sought element in the list, return None if not found.
def search(l, sought):
    if l == None:
        return None
    rest_result = search(l.rest, sought)
    if (l.first == sought or rest_result) != None:
        if l.first == sought:
            return 0
        else:
            return 1 + rest_result

This code works correctly. It searches a list to find the position of a given element. Notice anything interesting about it?

Take a look at the parentheses in the if line. How do you feel about this code now?

(Spoilers after the jump. Figure out why it works before clicking, and how to avoid this problem.)

Okay, that’s just awful. The intent was to write

if (l.first == sought) or (rest_result != None):

instead of

if (l.first == sought or rest_result) != None:

… but the student misparenthesized. The student’s code works because if l.first is equal to sought, the or evaluates to True, which is in fact not equal to None, so you wind up in the right place. Otherwise, the or winds up being the value of rest_result, which is then correctly compared to None. Note also that there’s no else case, meaning that the function secretly returns None in the fall-through, which is also correct.
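The explanation above can be checked directly. In this sketch the two conditions are pulled out into functions (the names misparenthesized and intended are invented here), with first_matches standing for l.first == sought:

```python
def misparenthesized(first_matches, rest_result):
    """The student's condition: compare the value of the whole `or` to None."""
    return (first_matches or rest_result) != None


def intended(first_matches, rest_result):
    """The condition the student meant to write."""
    return first_matches or (rest_result != None)


# Python's `or` returns one of its operands rather than a boolean.
value_when_found_later = False or 3        # the right operand, an int
value_when_not_found = False or None       # the right operand, None
```

The two conditions agree for every combination of a boolean and a Position, including the awkward case where rest_result is 0, because False or 0 evaluates to 0, and 0 is not None.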

I hope you agree with me that this code is abominable, and we’d like to prevent students from writing it.

What’s the fix? I think the obvious fix is to ensure that or only works on booleans. Most typed languages guarantee this in the type system; failing that, it can be enforced by a dynamic check in the implementation of or.
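As a rough illustration of what such a dynamic check would do, here is a hypothetical strict_or helper. It is not a Python feature, and unlike the built-in or it evaluates both arguments before checking them, so it loses short-circuiting.

```python
def strict_or(a, b):
    """A hypothetical `or` that accepts only booleans.

    Note: as an ordinary function it evaluates both arguments eagerly,
    unlike the built-in short-circuiting `or`.
    """
    if not isinstance(a, bool) or not isinstance(b, bool):
        raise TypeError("strict_or expects two booleans")
    return a or b
```

With this in place of or, the misparenthesized condition fails loudly the first time rest_result is None or an int, instead of quietly working.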

Does Racket have this problem? No, it does not. In the case of Racket, though, it’s because of the notion of “student languages,” an idea to which many people pay lip service but which few people actually carry out.

Anyway: types or language levels FTW. Or, more precisely:

Python FTL!

Not liking Python any better now

2017-03-12T06:33:25Z

It’s much closer to ‘go’ time now with Python, and I must say, getting to know Python better is not making me like it better. I know it’s widely used, but it really has many nasty bits, especially when I look toward using it for teaching. Here’s my old list:

Testing framework involves hideous boilerplate.
Testing framework has standard problems with floating-point numbers.
Scoping was clearly designed by someone who’d never taken (or failed to pay attention in) a programming languages course.
The vile ‘return’ appears everywhere.

But wait, now I have many more, and I’m a bit more shouty:

Oh dear lord, I’m going to have to force my students to implement their own equality method in order to get test-case-intensional checking. Awful. Discovering this was the moment when I switched from actually writing Python to writing Racket code that generates Python. Bleah.
Python’s timing mechanism involves a hideously unhygienic “pass me a string representing a program” mechanism. Totally dreadful. Worse than C macros.
Finally, I just finished reading Guido van Rossum’s piece on tail-calling, and I find his arguments not just unconvincing, not just wrong, but sort of deliberately insulting. His best point is his first: TRE (or TCO or just proper tail-calling) can reduce the utility of stack traces. However, the solution of translating this code to loops destroys the stack traces too! You can argue that you lose stack frames in those instances in which you make tail calls that are not representable as loops, and in that case I guess I’d point you to our work with continuation marks. His next point can be paraphrased as “If we give them nice things, they might come to depend on them.” Well, yes. His third point suggests to me that he’s tired of losing arguments with Scheme programmers. Fourth, and maybe this is the most persuasive, he points out that Python is a poorly designed language and that it’s not easy for a compiler to reliably determine whether a call is in tail position. Actually, it looks like he’s wrong even here; I read it more carefully, and he’s getting hung up on some extremely simple scoping issues. I’m really not impressed by GvR as a language designer.
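To make the first complaint in this list concrete: by default, two Python objects built from the same values are not equal, so a test that compares an expected structure to an actual one fails unless the class defines its own equality. A minimal sketch, with class names invented for the illustration:

```python
class PairNoEq:
    """A pair with no __eq__: instances compare by identity."""

    def __init__(self, first, rest):
        self.first = first
        self.rest = rest


class Pair:
    """The same pair with a hand-written structural equality method."""

    def __init__(self, first, rest):
        self.first = first
        self.rest = rest

    def __eq__(self, other):
        return (
            isinstance(other, Pair)
            and self.first == other.first
            and self.rest == other.rest
        )
```

PairNoEq(1, None) == PairNoEq(1, None) is False, while Pair(1, None) == Pair(1, None) is True. That extra method is what every student has to write before their first list test can pass.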

time spent on CPE430, Spring 2016

2016-06-08T16:31:05Z

Earlier this year, I was talking to Kurt Mammen, who’s teaching 357, and he mentioned that he’d surveyed his students to see how much time they were putting into the course. I think that’s an excellent idea, so I did it too.

Specifically, I conducted a quick end-of-course survey in CPE 430, asking students to estimate the number of weekly hours they spent on the class, outside of lab and lecture.

Here are some pictures of the results. For students that specified a range, I simply took the mean of the endpoints of the range as their response.

[Figure: density of responses]

Then, for those who will complain that a simple histogram is easier to read, a simple histogram of rounded-to-the-nearest-hour responses:

[Figure: histogram of responses]

Finally, in an attempt to squish the results into something more accurately describable as a parameterizable normal curve, I plotted the density of the natural log of the responses. Here it is:

[Figure: density of logs of responses]

Sure enough, it looks much more normal, with no fat tail to the right. This may just be data hacking, of course. For what it’s worth, the mean of this curve is 2.13, with a standard deviation of 0.49.
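For readers who want those log-scale numbers back in hours, here is a small conversion sketch. It assumes the logged quantity is weekly hours and uses only the two figures quoted above. Exponentiating the mean of the logs gives the geometric mean of the responses, not the arithmetic mean.

```python
import math

log_mean = 2.13  # mean of ln(hours), from the text
log_sd = 0.49    # standard deviation of ln(hours), from the text

# Geometric mean of the responses, in hours per week.
center_hours = math.exp(log_mean)

# One standard deviation either side, converted back to hours.
low_hours = math.exp(log_mean - log_sd)
high_hours = math.exp(log_mean + log_sd)
```

That works out to a center of roughly 8.4 hours per week, with the one-standard-deviation band running from roughly 5.2 to 13.7 hours. The band is lopsided in hours, as expected when the symmetric curve lives on the log scale.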

(All graphs generated with Racket.)

things I already dislike about Python

2016-05-16T19:55:56Z

I’m just getting started, but already Python is looking like a terrible teaching language, relative to Racket.

Testing framework involves hideous boilerplate.
Testing framework has standard problems with floating-point numbers.
Scoping was clearly designed by someone who’d never taken (or failed to pay attention in) a programming languages course.
The vile ‘return’ appears everywhere.
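The first two items can be seen in a few lines. This sketch uses the standard unittest module; the test class and method names are invented for the illustration.

```python
import unittest


class TestSum(unittest.TestCase):
    # The class, the self parameter, and the assert* methods are the
    # boilerplate wrapped around what is really a one-line check.

    def test_exact_comparison_is_wrong(self):
        # Binary floating point: 0.1 + 0.2 is not exactly 0.3.
        self.assertNotEqual(0.1 + 0.2, 0.3)

    def test_approximate_comparison(self):
        # The usual workaround: compare to within a tolerance.
        self.assertAlmostEqual(0.1 + 0.2, 0.3)

# Running the tests from a script usually adds one more piece of ceremony:
# a unittest.main() call under an `if __name__ == "__main__":` guard.
```

A beginner who writes the natural assertEqual(0.1 + 0.2, 0.3) gets a failing test and a lesson in floating-point representation that they did not ask for.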

Break it! Confrontational thinking in computer science

2015-06-14T22:09:51Z

So here I am grading another exam. This exam question asks students to imagine what would happen to an interpreter if environments were treated like stores. Then, it asks them to construct a program that would illustrate the difference.

They fail, completely.

(Okay, not completely.)

By and large, it’s pretty easy to characterize the basic failing: these students are unwilling to break the rules.

I feel like this idea has been hitting me over the head for a couple of months, now, so I’m going to write it down. Specifically, I see many areas of computer science where it’s important to engage in what I call “confrontational thinking.” Taking something that looks good, and poking holes in it to find out what’s wrong with it.

Is this what other people call “critical thinking”? I don’t think so. Associations are important, and I think that the term “critical thinking” now simply refers to a watery desire for students to somehow be more well-rounded and high-level thinkers. I’m thinking of something different.

For me, I think this skill is most closely affiliated with Math. Specifically, if you want to succeed in Math (and no, I don’t mean the ability to memorize and deploy the closed-form solution to quadratics), you need to be able to take a nice idea and bash it against the wall. Find counterexamples, think outside the box. Try to prove your professor wrong. All kinds of interesting things fall out when you disassemble the nice clean theorems that you’re given.

In programming, this may be even more important. In programming, after all, the artifacts are not theorems given to you by teachers (and proven by Euclid)—they’re programs you wrote yourself, and they’re probably jammed with bugs. If you can’t attack your program, and try to break it, you’re going to develop fragile artifacts that work only for the corner that your mind was stuck in.

In the classroom, this is how you learn. Students that patiently absorb all of the text on the board don’t appear (to me) to actually be learning anything; it’s those that are in your face—as long as they can clearly enunciate their questions—that are assembling knowledge.

I was reminded of this by watching one of Veritasium’s excellent youtube posts, this one on discovering the rule that generates a sequence of numbers. Excellent, I should say, except that I wanted to shout at the people in the video. Go take a look to see what I mean. These people are having a lot of trouble engaging in confrontational thinking.

Why do students have such trouble with this?

I think that one reason may be that we learn different kinds of things in different ways.

So, for instance, consider learning to walk. This is something we learn very early, long before we’re able to absorb complex instructions. We’re certainly partially hard-wired for this, but a big chunk of it is trial and error.

Here’s the point, though. Once we learn to walk, do we engage in confrontational thinking about the process? Heck no! Deliberately try to walk wrong, to see what happens? I’m pretty sure I can accurately guess what’s going to happen, and it’s probably going to involve bleeding and bandages.

For this kind of activity, confrontational thinking is not the best idea. Instead, we find the way that works, and we stick with it.

More generally, this is the way that we actually get things done in our lives. If every morning I decide to question the way that breakfast works—whether I can fry an egg without a pan, or whether the food needs to go in my mouth—I’m going to end up messy and frustrated.

For many of our students, I claim, programming is like this. After many painful episodes of trial and error, they’ve developed scars and bruises, and know of one narrow path that happens to work pretty well. Getting them to consider other paths is frightening and aversive. (Say, for instance, the parenthesized syntax of lisp-like languages.)

In fact, if you want to really go overboard, you can compare these two learning styles to the notions of conservatism and liberality. The conservative style gets the job done, using fewer resources and taking fewer risks. The liberal style involves self-doubt and failure, but is ultimately necessary in order to learn.

Hmm… when I put it that way, it sounds perfectly obvious.

Now, the only question is how to get the students to engage, and start questioning me.

Is teaching programming like teaching math?

2014-12-17T17:13:02Z

One of my children is in third grade. As part of a “back-to-school” night this year, I sat in a very small chair while a teacher explained to me the “Math Practices” identified as part of the new Common Core standards for math teaching.

Perhaps the small chair simply made me more receptive, taking me back to third grade myself, but as she ran down the list, I found myself thinking: “gosh, these are exactly the same skills that I want to impart to beginning programmers!”

Here’s the list of Math Practices, a.k.a. “Standards for Mathematical Practice”:

Make sense of problems and persevere in solving them.
Reason abstractly and quantitatively.
Construct viable arguments and critique the reasoning of others.
Model with Mathematics.
Use appropriate tools strategically.
Attend to precision.
Look for and make use of structure.
Look for and express regularity in repeated reasoning.

Holy Moley! Those are incredibly relevant in teaching programming. Furthermore, they sound like they were written by someone intimately familiar with the How To Design Programs or Bootstrap curricula. Indeed, in the remainder of my analysis, I’ll be referring specifically to the steps 1–4 of the design recipe proposed by HtDP (as, e.g., “step 2 of DR”).

Let’s take those apart, one by one:

Make sense of problems and persevere in solving them.

This is really two things, but they’re both incredibly important in programming. The first one puts the emphasis first on understanding the problem. Don’t charge ahead and try to solve the problem (write the program) before you have some understanding of the problem. This can be clearly expressed by writing a purpose statement (part of step 2 of DR), and by designing data (step 1 of DR).

The second part of this—perseverance—is incredibly important. It’s not directly a step in solving the problem, but it’s one of a family of meta-skills that largely determines whether a student succeeds in an introductory programming class (erm, citation needed).

Reason abstractly and quantitatively.

Bouncing back and forth between the abstract and the quantitative is one of the key skills in programming. Indeed, a program represents a mapping from problem to solution for a large set of problems; the transition from the concrete to the abstract is the raison d’etre of programming itself.

Want to multiply one pair of numbers? Use a calculator. Want to multiply seventy million pairs of numbers? Write a program.

However, humans work best—and learn best—when we’re thinking about concrete, quantitative problems. That’s why step 3 of the design recipe requires students to come up with concrete examples of problem inputs, and the corresponding results. Without these concrete examples, students quickly get lost in trying to tackle all possible inputs, without being able to focus on concrete, quantitative inputs.
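As a rough illustration of the examples-first habit described above, here is a minimal sketch in Python (not the teaching languages HtDP itself uses). The function name `fahrenheit_to_celsius` and the particular examples are hypothetical choices made for this sketch. The examples are written down as input/result pairs, the way a student would write them before filling in the body.

```python
# Purpose statement: convert a temperature from Fahrenheit to Celsius.

# Concrete examples, chosen before writing the body (hypothetical choices):
# each pair is (input in Fahrenheit, expected result in Celsius).
EXAMPLES = [
    (32, 0),      # freezing point of water
    (212, 100),   # boiling point of water
    (-40, -40),   # the point where the two scales agree
]


def fahrenheit_to_celsius(f):
    """Return the Celsius equivalent of a Fahrenheit temperature."""
    return (f - 32) * 5 / 9


def check_examples():
    """Return True when the function agrees with every recorded example."""
    return all(fahrenheit_to_celsius(f) == c for f, c in EXAMPLES)
```

Calling `check_examples()` turns the examples into a first test suite, so the concrete cases keep paying off after the function is written.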

Construct viable arguments and critique the reasoning of others.

One of the key skills that programmers require is that of understanding programs. Learning to program without learning to read others’ programs is like learning to talk without knowing how to listen. It’s true that students see small examples of programs in textbooks and in class, but it’s also vital for students to see other students’ programs; learning to understand these will help them to see what’s missing in their own programs ¹ ² ³ ⁴.

Also, programmers frequently engage in the activity of debugging. Okay, extremely frequently. The process of debugging is fundamentally one of regarding one’s own program as a third party. It’s clear to the programmer what they meant the program to do, but debugging it requires them to look at the program as if it were written by someone else, doing their best to peel off the lens of intention, and see what it actually does, not what they meant it to do. They must then construct viable arguments as to why the program performs as it does, and how to correct it.

Debugging is a truly vital programming skill that is often not well taught.

Model with Mathematics.

“Model with Mathematics” is, more or less, another name for programming.

It’s not clear whether this adds anything to the conversation, but it seems clear that there’s more or less one hundred percent overlap between this “practice” and that of programming.

Use appropriate tools strategically.

Writing a program consists entirely in applying various functions and language constructs to a set of inputs.

In the first few weeks, these tools consist almost entirely of basic mathematical and graphical functions: plus, times, rectangle, and the like.

Later, students can bring to bear their knowledge of the “tools” of program templates (step 4 in the design recipe) to organize programs that operate on more complex forms of data. This is still in the “hand-holding” phase of programming.

Still later, students will learn about more sophisticated programming “tools”—divide and conquer forms of generative recursion, standard iteration functions (map, filter, foldl), and optimization techniques such as memoization. These tools pop instantly to the mind of a seasoned programmer, just as a woodworker might immediately identify the rabbet plane that will create the desired shape without difficulty.
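For readers who have not met these later tools, here is a small sketch of each in Python. This is an illustration under assumptions, not the HtDP presentation: the data and names (`prices`, `fib`) are invented, and `functools.reduce` stands in for Racket’s `foldl` as the nearest Python equivalent of a left fold (the two pass the accumulator in different argument positions).

```python
from functools import lru_cache, reduce

prices = [12, 7, 30, 4]  # made-up sample data

# map: apply a function to every element.
doubled = list(map(lambda p: p * 2, prices))

# filter: keep only the elements that satisfy a predicate.
large = list(filter(lambda p: p >= 10, prices))

# left fold: combine the elements into one value, starting from 0.
total = reduce(lambda acc, p: acc + p, prices, 0)


# memoization: cache results so each fib(n) is computed only once.
@lru_cache(maxsize=None)
def fib(n):
    """Return the n-th Fibonacci number (fib(0) = 0, fib(1) = 1)."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

Here `doubled` is `[24, 14, 60, 8]`, `large` is `[12, 30]`, and `total` is `53`. Without the cache, the doubly recursive `fib` repeats an exponential amount of work. With it, each value is computed once.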

Attend to precision.

Precision arrives somewhat later for programmers than it does for students of math. In the first ten or twenty weeks of student programming, programs tend to be entirely right or catastrophically wrong, especially if they’re following the steps of the design recipe, and using data definitions supplied to them.

Later, though, precision takes on an increasing importance. Programming is largely algebraic—how to combine operators and language forms to build a program—and the “precision” that is most often missing is that of corner cases, and unexpected combinations of data. “Attending to precision” in these cases consists in developing test cases that carefully cover the space of possible inputs.

Look for and make use of structure.

Making use of structure is in some ways the fundamental job of a programmer. The programmer must—before even beginning to write the program—decide how to model the data of the problem as values in some programming language. This is step 1 of the design recipe, and for the first five or six weeks of programming, students can’t be expected to design their own data.

In the next five or six weeks, students develop the ability to choose simple structures to represent well-understood data.

Finally, students move on to tackling problems where there is no single best way to model the data. In these cases, the best model may depend on operational constraints, or data volume, or any number of other criteria. Indeed, the entire fields of databases and data science may be considered to be an expression of this practice.

Look for and express regularity in repeated reasoning.

Yeah, that’s abstraction. Important in programming.

Okay, so that’s it. I hope it’s clear at this point that

Teaching math is a lot like teaching programming.

For more, take a look at Felleisen & Krishnamurthi, “Why Computer Science Doesn’t Matter,” ⁵.

¹ Kulkarni, Chinmay, Steven P. Dow, and Scott R. Klemmer. “Early and repeated exposure to examples improves creative work.” Design Thinking Research. Springer International Publishing, 2014. 49–62. http://link.springer.com/chapter/10.1007/978-3-319-01303-9_4
² Politz, Joe Gibbs, Shriram Krishnamurthi, and Kathi Fisler. “In-flow peer-review of tests in test-first programming.” Proceedings of the tenth annual conference on International computing education research. ACM, 2014. http://dl.acm.org/citation.cfm?id=2632347
³ Hundhausen, Christopher D., Anukrati Agrawal, and Pawan Agarwal. “Talking about code: Integrating pedagogical code reviews into early computing courses.” ACM Transactions on Computing Education (TOCE) 13.3 (2013): 14. http://dl.acm.org/citation.cfm?id=2499951
⁴ Sondergaard, Harald. “Learning from and with peers: the different roles of student peer reviewing.” ACM SIGCSE Bulletin. Vol. 41. No. 3. ACM, 2009. http://dl.acm.org/citation.cfm?id=1562893
⁵ Felleisen, Matthias, and Shriram Krishnamurthi. “Viewpoint: Why computer science doesn’t matter.” Communications of the ACM 52.7 (2009): 37–40. http://dl.acm.org/citation.cfm?id=1538803

Too Elegant For September

2013-04-05T00:18:00Z

Being on sabbatical has given me a bit of experience with other systems and languages. Also, my kids are now old enough to “mess around” with programming. Learning from both of these, I’d like to hazard a bit of HtDP heresy: students should learn for i = 1 to 10 before they learn

(define (sum lon)
  (cond [(empty? lon) 0]
        [else (+ (first lon) (sum (rest lon)))]))

To many of you, this may seem obvious. I’m not writing to you. Or maybe you folks can just read along and nod sagely.

HtDP takes this small and very lovely thing—recursive traversals over inductively defined data—and shows how it covers a huge piece of real estate. Really, if students could just understand how to write this class of programs effectively, they would have a vastly easier time with much of the rest of their programming careers, to say nothing of the remainder of their undergraduate tenure. Throw a few twists in there—a bit of mutation for efficiency, some memoization, some dynamic programming—and you’re pretty much done with the programming part of your first four years.

The sad thing is that many, many students make it through an entire four-year curriculum without ever really figuring out how to write a simple recursive traversal of an inductively defined data structure. This makes professors sad.

Among the Very Simple applications of this nice idea is that of “indexes.” That is, the natural numbers can be regarded as an inductively defined set, where a natural number is either 0 or the successor of a natural number. This allows you to regard any kind of indexing loop as simply a special case of … a recursive traversal of an inductively defined data structure.
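To make that correspondence concrete, here is a hedged sketch in Python, with function names invented for the purpose. The first function follows the inductive definition of the naturals directly (0, or the successor of a natural). The second is the indexing loop it can be seen as generalizing.

```python
def sum_to_recursive(n):
    """Sum 0 + 1 + ... + n by recursion on the structure of n."""
    if n == 0:                           # base case: n is zero
        return 0
    return n + sum_to_recursive(n - 1)   # n is the successor of n - 1


def sum_to_loop(n):
    """Compute the same sum with an ordinary indexing loop."""
    total = 0
    for i in range(1, n + 1):            # i takes the values 1, 2, ..., n
        total += i
    return total
```

Both return 55 for `n = 10`. In Python specifically, the recursive form hits the interpreter’s default recursion limit for large `n`, which is a property of the language rather than of the idea.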

So here’s the problem: in September, you face a bunch of bright-eyed, enthusiastic, deeply forgiving first-year college students. And you give them the recursive traversal of the inductively defined data structure. A very small number of them get it, and they’re off to the races. The rest of them struggle, and struggle, and finally get their teammates to help them write the code, and really wish they’d taken some other class.

NB: the rest of this makes less sense… even to me. Not finished.

However, another big part of the problem is … well, monads are like burritos.

Let me take a step back.

The notion of repeated action is a visceral and easily-understood one. Here’s what I mean. “A human can multiply a pair of 32-bit integers in about a minute. A computer can multiply 32-bit integers at a rate of several billion per second, or about a hundred billion times as fast as a person.” That’s an easily-understood claim: we understand what it means to do the same thing a whole bunch of times really fast.

So, when I write

for i=[1..100] multiply_two_numbers();

It’s pretty easy to understand that I’m doing something one hundred times.