Sound Generation: A DrRacket Primer

All programs in this Primer run in the Intermediate Student language. Also, they require the rsound.plt package. To make everything work nicely:

Set the Language Level to Intermediate Student, and
begin each program with:
(require (planet "main.rkt" ("clements" "rsound.plt" 1 7)))
(require (planet "draw.rkt" ("clements" "rsound.plt" 1 7)))

Also, if you just skim through this text, you won’t learn much. You need to paste all of the example code in and try it. Whenever you run a program, make sure you take a second to think about what it’s going to do before you run it.

1 Simple Sine Waves

DrRacket has simple sine and cosine functions, called sin and cos. They use radians as units, so (cos (/ pi 2)) produces 0.0. Try it. Wait! Here’s what I got: 6.123233995736766e-17. What the heck is that? You read this as "approximately 6.123 times 10 to the minus seventeenth." Is that close to zero? Yes, it is. But it also shows us that when we’re testing these things, we’re going to have to accept "close enough" as a correct answer.

So how do we turn this idea into a sound? The first and most important thing is that we need to develop a function whose graph looks like this:

Let’s call it a-sine-tone.

In this picture, the "x" axis represents time. So we want a function that changes over time. Put differently, this function takes in a time, and returns an amplitude. Let’s follow the design recipe.

First, we need a purpose statement and a header:

  ; a-sine-tone : number -> number
  ; given a time, produces a single sine wave amplitude
  (define (a-sine-tone t)
    ...)

The next step in the design recipe is to produce a test case or two. The sine wave pictured above doesn’t have any units attached to it, so we can’t tell how fast it’s going up or down, or how big it is.

In order to use this function with rsound’s sound generation functions, the maximum amplitude should be 1.0, and the minimum amplitude should be -1.0. Also, the time will be represented as a frame number, as we’ve discussed. In order to change a frame number into a number of seconds, we need to divide by the sample rate. Let’s assume the standard sample rate of 44.1 kHz, that is, 44100 frames per second.

What frequency do we want the resulting sound to have? Well, let’s say 147 Hz, just for the heck of it.

Okay, so now we need a sine wave that completes 147 complete cycles every second, and has a maximum amplitude of 1.0. How can we write a test case for this?

. . . (Think about it for a second, please.)

Well, one way to do a fairly good job is to test the result at a number of locations; locations where we know the answer.

So, for instance, after a sine wave has completed a half or whole cycle (or any multiple thereof), the sine wave’s value should be zero. How long is this wave’s cycle? It’s 1/147 of a second. If we’re using a sample rate of 44.1 kHz, then we multiply by 44100 to translate 1/147 of a second into 300 frames. Ah! Now you see that maybe that choice of 147 Hz wasn’t so random, after all. So, after 150 frames (a half cycle) or any multiple thereof, the sine wave should be zero. We can use this to write a test case:

(check-within (a-sine-tone 0) 0.0 0.001)

What is this check-within form? Well, it’s a form that we can use to check that a number is "close enough" to another number. It accepts the tested form, the expected result, and a "tolerance" that indicates how close the numbers must be in order for the test case to pass.
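To make "close enough" concrete, here is a small sketch of the comparison for plain numbers. The helper name close-enough? is made up for this illustration; it is not part of the teaching language or of rsound, and check-within itself also handles structured data, which this sketch ignores.

```racket
; close-enough? : number number number -> boolean
; (hypothetical helper) is actual within tolerance of expected?
(define (close-enough? actual expected tolerance)
  (<= (abs (- actual expected)) tolerance))

; 6.123233995736766e-17 is well within 0.001 of zero
(check-expect (close-enough? (cos (/ pi 2)) 0.0 0.001) true)
; 0.5 is not
(check-expect (close-enough? 0.5 0.0 0.001) false)

; the built-in form, applied to the same numbers
(check-within (cos (/ pi 2)) 0.0 0.001)
```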

Is this enough? Not for me; let’s check that it returns to zero after, say, four and a half cycles:

(check-within (a-sine-tone 1350) 0.0 0.001)

Is this enough? Not for me; the test cases as they now are would be satisfied by a function that was uniformly zero, so we should have one test case where the expected answer isn’t zero. How about if we go 7/8 of the way through the cycle? In this case, the number should be coming back up toward zero, with a value of negative root 2 over 2:

(check-within (a-sine-tone (* 7/8 300)) (- (* 1/2 (sqrt 2))) 0.001)
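If you don’t trust the frame counts in these test cases, you can let DrRacket do the arithmetic. This is just a sketch; the name frames-per-cycle is one I made up for the check, and it works with exact fractions so check-expect is safe to use.

```racket
; frames-per-cycle : number number -> number
; how many frames one full cycle of the given frequency lasts
(define (frames-per-cycle sample-rate freq)
  (/ sample-rate freq))

(check-expect (frames-per-cycle 44100 147) 300)           ; one full cycle
(check-expect (* 1/2 (frames-per-cycle 44100 147)) 150)   ; half a cycle
(check-expect (* 9/2 (frames-per-cycle 44100 147)) 1350)  ; four and a half cycles
(check-expect (* 7/8 (frames-per-cycle 44100 147)) 525/2) ; 7/8 of a cycle, i.e. 262.5
```

Notice that the last one lands halfway between two frames. That’s fine for testing a function on numbers, even though a real sound only has whole-numbered frames.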

Okay, let’s go back and finish the body of the function. It turns out the argument to the sin function essentially just requires a bunch of conversions. To change from frames to seconds, we multiply by one over the sample rate: (* t 1/44100). Then, we need to multiply by rotations per second, the frequency: (* 147 (* t 1/44100)). Finally, we want the answer in radians, not in rotations, so we need to multiply by the number of radians in a full rotation, 2π: (* twopi (* 147 (* t 1/44100))). We can rewrite that to use just one multiply (and look nicer), and then take the sine of the result: (sin (* twopi 147 t 1/44100)). Here’s the final program:

  ; a-sine-tone : number -> number
  ; given a time, produces a single sine wave amplitude
  (define (a-sine-tone t)
     (sin (* twopi 147 t 1/44100)))

  (check-within (a-sine-tone 0) 0.0 0.001)
  (check-within (a-sine-tone 1350) 0.0 0.001)
  (check-within (a-sine-tone (* 7/8 300)) (- (* 1/2 (sqrt 2))) 0.001)

As an aside: did my test cases discover bugs in my program, when I was writing it for this tutorial? Yes, they did. I had forgotten to call sin in my program.
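If the chain of conversions went by too fast, here is one way to spell it out with a helper per step. Treat it as a sketch: the helper names are invented for this illustration, and I’ve written (* 2 pi) instead of twopi so that it runs without the rsound package.

```racket
; frames->seconds : number -> number
; assumes a sample rate of 44100 frames per second
(define (frames->seconds t)
  (* t 1/44100))

; seconds->cycles : number number -> number
; how many rotations a wave of the given frequency completes in s seconds
(define (seconds->cycles s freq)
  (* s freq))

; cycles->radians : number -> number
(define (cycles->radians c)
  (* c 2 pi))

; stepwise-sine-tone : number -> number
; given a frame number, produces the amplitude of a 147 Hz sine wave
(define (stepwise-sine-tone t)
  (sin (cycles->radians (seconds->cycles (frames->seconds t) 147))))

; the same three test cases as before
(check-within (stepwise-sine-tone 0) 0.0 0.001)
(check-within (stepwise-sine-tone 1350) 0.0 0.001)
(check-within (stepwise-sine-tone (* 7/8 300)) (- (* 1/2 (sqrt 2))) 0.001)
```

The one-line version is shorter, but this one makes it harder to forget a step (such as the call to sin).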

Okay, does this actually make a sound? Let’s try it, using fun->mono-rsound. This function accepts the duration of the sound (in frames), the sample rate, and the function that’s going to be used to generate the sound. That’s right, we pass a function to another function. Let’s call the result half-second-tone:

(define half-second-tone (fun->mono-rsound 22050 44100 a-sine-tone))

Let’s try playing it:

(rsound-play half-second-tone)

Whew! It worked.
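What does it mean to hand a function to another function? One rough way to picture it is that the sound generator calls our function once for every frame number, starting at zero. The sketch below imitates that with build-list and a plain list of numbers. It is only a mental model, not how rsound actually stores sounds, and the names sine-147 and first-samples are made up for this example.

```racket
; sine-147 : number -> number
; same formula as a-sine-tone, with twopi spelled out as (* 2 pi)
(define (sine-147 t)
  (sin (* 2 pi 147 t 1/44100)))

; first-samples : number -> (listof number)
; the amplitudes for frames 0, 1, ..., n-1
(define (first-samples n)
  (build-list n sine-147))

(check-expect (length (first-samples 300)) 300)
(check-within (first (first-samples 300)) 0.0 0.001)        ; frame 0
(check-within (list-ref (first-samples 300) 75) 1.0 0.001)  ; quarter cycle: the peak
(check-within (list-ref (first-samples 300) 150) 0.0 0.001) ; half cycle
```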

1.1 Exercises

Develop the function another-sine-tone, that produces a tone at a frequency of 294 Hz. Follow the design recipe, and create the new test cases before writing the function.
Using cond and/or modulo, develop the function zigzag, that produces a wave that looks like this:

2 Adding together sine waves

What if I want to add together more than one sine wave? Now I want to develop a function that produces the sum of the two sine waves. If the two frequencies are 147 and 492 Hz, the result should look like this:

In order to create this, we’re going to go through the same process that we did before: naming the function, then writing the purpose statement and header:

  ; a-two-sine-tone : number -> number
  ; given a time, produce the sum of two sine waves' amplitude
  (define (a-two-sine-tone t)
    ...)

Now, test cases. For this one, it might be easier just to get out the calculator. Since one of the sine waves has a zero at 1350 samples, we can use that to simplify one of our test cases:

  (check-within (a-two-sine-tone 0) 0.0 0.001)
  (check-within (a-two-sine-tone 1350)
                (sin (* twopi 492 1350 1/44100)) 0.001)

All right, let’s try writing the body. This time, instead of returning a single sine, we need to return the sum of two sines. So the body of the function could be: (+ (sin (* twopi 147 t 1/44100)) (sin (* twopi 492 t 1/44100))).

Try it. How does it sound?

Wait! That’s not right. Let’s take a look at the output of that function. Oh dear, here’s what I get:

I see: we violated the restriction that the result can’t go above 1.0. In order to fix this, let’s just multiply the result by 1/2. So now the body looks like this: (* 1/2 (+ (sin (* twopi 147 t 1/44100)) (sin (* twopi 492 t 1/44100)))). The expected value in the second test case needs the same factor of 1/2.
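You don’t need the picture to see the problem; a single frame is enough. The sketch below uses two made-up helper names, raw-sum and scaled-sum, and writes (* 2 pi) in place of twopi so it runs on its own. Frame 25 is just one convenient frame where both sines happen to be large at the same time.

```racket
; raw-sum : number -> number
; the two sines added together with no scaling
(define (raw-sum t)
  (+ (sin (* 2 pi 147 t 1/44100)) (sin (* 2 pi 492 t 1/44100))))

; scaled-sum : number -> number
; the same sum, multiplied by 1/2
(define (scaled-sum t)
  (* 1/2 (raw-sum t)))

; at frame 25 the 147 Hz sine is at 0.5 and the 492 Hz sine is near its peak
(check-within (raw-sum 25) 1.48 0.01)
(check-expect (> (raw-sum 25) 1.0) true)            ; too big for a sample
(check-expect (<= (abs (scaled-sum 25)) 1.0) true)  ; back in range
```

Since each sine stays between -1.0 and 1.0, their sum stays between -2.0 and 2.0, so half the sum can never leave the allowed range.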

Does this work? Yes! The final function looks like this:

  ; a-two-sine-tone : number -> number
  ; given a time, produce the sum of two sine waves' amplitude
  (define (a-two-sine-tone t)
    (* 1/2 (+ (sin (* twopi 147 t 1/44100)) (sin (* twopi 492 t 1/44100)))))

  (check-within (a-two-sine-tone 0) 0.0 0.001)
  (check-within (a-two-sine-tone 1350)
                (* 1/2 (sin (* twopi 492 1350 1/44100))) 0.001)

  (define half-second-two-tones (fun->mono-rsound 22050 44100 a-two-sine-tone))

3 Adding Another Parameter

That’s fine, but what if we don’t want our program to work for only one fixed frequency? Let’s take a baby step from the earlier function: we’ll keep the 147 Hz the same, but we want the other frequency to be provided by the user.

The natural way to do this is to add another argument to our function:

  ; a-pair-tone : number number -> number
  ; given a time *and a frequency*, produce the sum of two sine
  ; waves' amplitude
  (define (a-pair-tone t f)
    (* 1/2 (+ (sin (* twopi 147 t 1/44100)) (sin (* twopi f t 1/44100)))))

Note that we’ve replaced the 492 with f. We need to update our test cases; we can use our earlier ones, with some modifications:

  (check-within (a-pair-tone 0 0) 0.0 0.001)
  (check-within (a-pair-tone 1350 0) 0.0 0.001)
  (check-within (a-pair-tone (* 7/8 300) 0)
                (* 1/2 (- (* 1/2 (sqrt 2))))
                0.001)
  (check-within (a-pair-tone 1350 492)
                (* 1/2 (sin (* twopi 492 1350 1/44100)))
                0.001)

You should recognize all of these as earlier test cases, and they all pass.

Unfortunately, this leads to a problem: fun->mono-rsound wants a function with only one argument, not one with two. If we try calling fun->mono-rsound with a-pair-tone, we get an error.

(fun->mono-rsound 44100 44100 a-pair-tone)

yields:

fun->mono-rsound: expects type <function of one argument> as 3rd argument, given: (lambda (a1 a2) ...); other arguments were: 44100 44100

There are a number of ways to fix this problem. One is to use local. A simpler one is to use an rsound primitive called signal, that takes a function and turns it into a function of one argument, called a "signal". More specifically, it takes a function whose first parameter is a frame number, and a bunch of values to use as the other arguments to the function, and produces a signal usable with fun->mono-rsound or any other function that uses signals. Here’s an example, using our a-pair-tone function:

(fun->mono-rsound 22050 44100 (signal a-pair-tone 234))

In this case, the signal function takes in the function that we want to use as a signal (a-pair-tone), and all of the arguments other than the frame number. In this case, that’s just the frequency, 234. It produces a signal. If you like, you can try calling this signal directly.

Now, we can make a bunch of sounds that pair the frequency of 147 with another selected frequency:

  (rsound-play
   (rsound-append*
    (list (fun->mono-rsound 22050 44100 (signal a-pair-tone 178))
          (fun->mono-rsound 22050 44100 (signal a-pair-tone 340))
          (fun->mono-rsound 22050 44100 (signal a-pair-tone 222))
          (fun->mono-rsound 22050 44100 (signal a-pair-tone 412)))))

Once again: the signal function allows us to "bundle up" a function with multiple arguments so that it takes just one.
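For the curious, here is roughly what the other route, the one using local, might look like. This is a sketch of the idea of bundling up an argument, not a description of how signal is implemented. The names fix-frequency and sample-at are invented for this example, and (* 2 pi) stands in for twopi so that the code runs without rsound.

```racket
; a-pair-tone : number number -> number
; given a time and a frequency, produce the sum of two sine waves' amplitude
(define (a-pair-tone t f)
  (* 1/2 (+ (sin (* 2 pi 147 t 1/44100)) (sin (* 2 pi f t 1/44100)))))

; fix-frequency : number -> (number -> number)
; (hypothetical helper) produces a one-argument function of the frame number
; that always uses the given frequency
(define (fix-frequency f)
  (local [(define (tone t) (a-pair-tone t f))]
    tone))

; sample-at : (number -> number) number -> number
; (hypothetical helper) calls a one-argument function at frame t
(define (sample-at sig t)
  (sig t))

(check-within (sample-at (fix-frequency 0) 1350) 0.0 0.001)
(check-within (sample-at (fix-frequency 492) 1350)
              (a-pair-tone 1350 492)
              0.001)
```

The function that comes out of fix-frequency takes only a frame number, which is the shape that fun->mono-rsound asks for.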

Note that we can still test and develop the a-pair-tone function independently; that’s important!

Now that we’ve "broken the link" between the function definition and its use in fun->mono-rsound, it’s easy to generalize further. If we want to develop a new function that takes two frequencies and creates a pair of half-second tones using a-pair-tone, we could do that like this:

  ; pair-tone-seq : number number -> rsound
  ; given two frequencies, produce a pair of chords where each
  ; frequency is paired with one of frequency 147.
  (define (pair-tone-seq freq-a freq-b)
    (rsound-append*
     (list (fun->mono-rsound 22050 44100 (signal a-pair-tone freq-a))
           (fun->mono-rsound 22050 44100 (signal a-pair-tone
                                                 freq-b)))))

  (rsound-play (pair-tone-seq 178 340))

Note that this is the first function that we’ve written that produces an rsound. For these functions, we won’t require test cases. The key is to move as much as possible outside of these functions.

3.1 Exercises

Develop the any-sine-wave function, that accepts a frame t and a frequency f and a sample-rate sr and returns the sine of 2π*t*f*1/sr.
Use this function along with rsound-play and fun->mono-rsound and a signal, just as we did above, to produce a 2-second sound that plays a tone at 298 Hz.
Develop the harm2 function, that accepts a frame t and a frequency f and returns the sum of half the sine of 2π*t*f*1/44100 and one quarter of the sine of 2π*t*2f*1/44100. Put differently, it returns a sine wave at frequency f and a smaller sine wave that’s one octave up.
Use this function as you did the other one, to produce a sound of length 1.5 seconds that contains the frequency 178 Hz and the frequency 356 Hz.
Develop the function slow-slide, that accepts a frame t and a total number of frames limit and returns a number that follows a linear graph from 0.0 (at frame 0) up to 1.0 (at frame limit). Here’s a picture of this graph:
Use this function to make a sound. How does it sound?

4 Working With Existing Sounds

The rsound-ith/left and rsound-ith/right functions extract samples from an existing rsound. Using these functions, we can do things like looping sounds, playing them backward, squeezing and lengthening them, and so forth.

As you might expect, rsound-ith/left and rsound-ith/right extract elements from the left and right channels, respectively. So, for instance, if my-sound refers to a sound, then (rsound-ith/left my-sound 145) will return the sample at frame 145 of the left channel of the given sound.

In order to simplify all of the programs in this section, we will assume that all sounds are monophonic.

To try it out, let’s try to develop a function that returns the fifth second of a given input sound. (When does the fifth second begin? At four seconds into the sound. Think about it for a second. Or four.) We’ll assume that the given sound is at least five seconds long.

As we did before, we’re going to first develop a function that returns one sample of the desired sound, and then use fun->mono-rsound to create an rsound from it.

Our first function, plus-four-seconds, will accept a sample number t and an rsound, and return the sample that lies four seconds after t in the input sound. We’ll just ignore the right channel, for now, and we’ll assume a sample rate of 44100.

First, the contract and purpose statement and header:

  ; plus-four-seconds : number rsound -> number
  ; given a frame number and an rsound, return the four-seconds-later
  ; sample from the left channel of the input sound
  (define (plus-four-seconds t rsound)
    ...)

Next, we need a test case. This requires making an input rsound. How about a simple linear sound that increases from 0.0 to 1.0 over the course of five seconds? Hey, we already wrote that function! It’s called slow-slide. So, if we use plus-four-seconds on the result of a five-second slow-slide, we should get a function that increases from 4/5 up to 1.0.

  (define test-slide
    (fun->mono-rsound (* 44100 5) 44100
                      (signal slow-slide (* 44100 5))))
  (check-within (plus-four-seconds 0 test-slide) 0.8 0.001)
  (check-within (plus-four-seconds 22050 test-slide) 0.9 0.001)
  (check-within (plus-four-seconds 44099 test-slide) 1.0 0.001)

The body is now pretty straightforward; we just need to return the sample that comes from four seconds "in the future": (rsound-ith/left rsound (+ t (* 44100 4))).
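If the offset arithmetic seems slippery, it may help to try it on something tiny first. The sketch below swaps the rsound for a plain list and pretends the sample rate is 4 frames per second, so that "four seconds later" means 16 positions later. Everything here (tiny-rate, same, tiny-sound, plus-four-seconds/list) is made up for the illustration, and the samples are just their own frame numbers rather than real amplitudes.

```racket
; a pretend sample rate, small enough to count on your fingers
(define tiny-rate 4)

; same : number -> number
; each stand-in sample is simply its own frame number
(define (same i) i)

; five "seconds" of sound at the pretend rate: frames 0 through 19
(define tiny-sound (build-list (* tiny-rate 5) same))

; plus-four-seconds/list : number (listof number) -> number
; given a frame number and a list of samples, return the sample
; four seconds later
(define (plus-four-seconds/list t samples)
  (list-ref samples (+ t (* tiny-rate 4))))

(check-expect (plus-four-seconds/list 0 tiny-sound) 16) ; start of the fifth second
(check-expect (plus-four-seconds/list 3 tiny-sound) 19) ; last frame of the sound
```

The real function has the same shape, with rsound-ith/left in place of list-ref and 44100 in place of tiny-rate.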

Next, we want to develop a function that uses plus-four-seconds to generate a new rsound. We’ll call it fifth-second. Here’s the completed definition. Note that the test cases can be adapted from our previous set. I know I said that functions producing rsounds didn’t need test cases, but I just couldn’t help myself.

  ; fifth-second: rsound -> rsound
  ; given a sound of at least five seconds, return the fifth second
  (define (fifth-second rsound)
    (fun->mono-rsound 44100 44100 (signal plus-four-seconds rsound)))

  (define slide-5 (fifth-second test-slide))
  (check-within (rsound-ith/left slide-5 0) 0.8 0.001)
  (check-within (rsound-ith/left slide-5 22050) 0.9 0.001)
  (check-within (rsound-ith/left slide-5 44099) 1.0 0.001)

4.1 Exercises

In these exercises, you may continue to assume that the sounds are monophonic, and that you may ignore the right channel.

Generalize the plus-four-seconds function to plus-four/any-sr, that uses the sample-rate of the input sound (use rsound-sample-rate to find the sample rate of a sound) to determine the number of samples that represent four seconds.
Develop the my-rsound-clip function that behaves exactly like rsound-clip. Specifically, it accepts an rsound and two numbers representing "start" and "stop" frames, and returns the rsound containing the specified frames, with the same sample-rate as the input rsound. This will require developing two functions, just like every other example that produces an rsound. As before, use rsound-sample-rate to determine the sample-rate of the input rsound.
Develop the rsound-reverse function, that accepts an rsound and returns a new rsound whose frames are reversed; the resulting sound should play the original sound backward. Use the rsound-frames function to determine the length of the input sound.
Develop the rsound-half-speed function, that accepts an rsound and returns a new rsound that contains each sample of the input sound twice in a row. So if the original rsound’s left-channel samples were 0, 0.5, -0.3, the samples of the resulting rsound would be 0, 0, 0.5, 0.5, -0.3, -0.3. Use this function on a short piece of music. How does it sound?
