In my MAT 201B course, “Programming with Media Data,” I’ve developed an audio/visual representation of different molecular random walks, including a loose model of Brownian motion. The fluctuations in the spatial displacements are mapped to color (in terms of HSV) and pitch (in terms of frequency). The user can interact with the simulation, discretely or continuously varying the temperature to affect the magnitude of the displacements. I’ve created two other stochastic processes in addition to the Brownian motion and included a mode which cycles rapidly between the three random walks, essentially producing a ‘random walk between random walks.’ I also programmed a short piece which illustrates the various aspects of my project in action. You can see a video screen capture of it below.

The abc conjecture

Recently, the Japanese mathematician Shinichi Mochizuki announced that he had a proof of the abc conjecture. This claim was significant enough to receive plenty of major news coverage, including from the New York Times. His strategy was to use entirely novel self-developed techniques — so-called inter-universal Teichmüller theory. In fact, practically no one else in the mathematics community is conversant in these techniques, so it will take quite a long time before the proof is sufficiently refereed. If it turns out to be accurate, however, it will undoubtedly a huge achievement!

So what is the abc conjecture, anyway? Essentially, it lies somewhere in the intersection of additive and multiplicative number theory. Curiously, it’s a statement about one of the simplest equations possible:

\( a + b = c \)

for natural numbers \( a \), \(b\), and \(c\). Naturally, there are infinitely many solutions to this equation. Now, it’s obvious that this equation is additive, but we can give it a multiplicative flavor if we consider the various prime divisors of \(a\), \(b\), and \(c\). In particular, for this conjecture, we’re interested in repeated divisors — cases where \(a\), \(b\), and \(c\) are divisible by perfect powers.

It’s probably helpful to use a concrete example to illustrate some of the ideas here. So suppose \( a = 81, b = 128, c = 209. \) We see that indeed \( a + b = c. \) Now, \( a = 3^4, b = 2^7, \) but \( c = 11 \cdot 19. \) Thus, although \( a \) and \( b \) are divisible by large prime powers, it seems that \( c \) is not. Of course this is simply one example, but in general this sort of pattern appears to be true. Try it out for yourself and see if you can find any counterexamples!

In order to describe the abc conjecture precisely, it will help to introduce a little terminology. Let’s define the radical of a natural number \( n \), denoted \( \mathrm{rad}(n), \)  to be the product of the distinct prime factors of \( n. \) For instance,

\( \mathrm{rad}(12) = \mathrm{rad}(72) = 2 \cdot 3 = 6, \mathrm{rad}(2^7) = 2, \mathrm{rad}(1) = 1, \) etc.

One formulation of the abc conjecture tries to compare \( c \) to \( \mathrm{rad} (abc) \). We can think of this in the following way: to what power must we raise the radical of \( abc \) in order to obtain \( c \)? In other words, in the form of an equation, what exponent \( q \) satisfies

\( \displaystyle (\mathrm{rad} (abc))^{q} = c \)?

The more that \( c \) is highly divisible by primes, the larger value of \( q \) we will need. Solving for \( q \), we obtain

\( \displaystyle q = \frac{\log c}{\log (\mathrm{rad} (abc))} \).

For convenience, let’s call this quantity \( q \) the quality of the abc-solution. We claim that large values of \( q \) arise when \( a\), \(b\), and \(c\) are divisible by prime powers. It’s worth verifying this claim in order to get some intuition for how this quantity \( q \) is a useful construct. So suppose for the sake of argument that \(a\), \(b\), and \(c\) are all perfect powers. For instance, let \( a = x^n, b = y^n, c = z^n \) for some natural numbers \( x, y, z, n. \) Then we get the infamous Fermat equation \( x^n + y^n = z^n \). Of course, it’s now known that no solutions exist for \( n \ge 3 \), but for the sake of argument, let’s suppose solutions do in fact exist for arbitrarily large values of \( n \). What is the quality of such an abc-solution? Well,

\(\begin{align*} \log c &= \log (\max(a, b, c))\\
&\ge \frac{1}{3}(\log a + \log b + \log c)\\
&= \frac{1}{3}\log abc\\
&= \frac{1}{3}\log (x^ny^nz^n)\\
&= \frac{1}{3}\log(xyz)^n\\
&= \frac{n}{3}\log(xyz)\\
&\ge \frac{n}{3}\log(\mathrm{rad}(abc)).


\( \displaystyle q =\frac{\log c}{\log (\mathrm{rad} (abc))} \ge \frac{n}{3}. \)

If \( n \) can be arbitrarily large, then so can \( q. \) However, the abc conjecture establishes a restriction on the size of \( q \) as follows.
The abc conjecture: For any real number \(C \) greater than \(1\), all but a finite number of \(abc\)-solutions have a quality \(q\) less than or equal to \(C\).
There are several alternative formulations as well — you can check them out at Wikipedia or MathWorld — but this one, adapted from Barry Mazur’s piece Questions about Number, appealed to me most. Let’s hope that Mochizuki’s proof is successful and opens up new doors in number theory! For further reading and research, here is a list of the consequences of the abc conjecture.


Aliasing is a common problem in signal processing, both visually and aurally. If you’ve ever watched the spokes on a wheel turn in a movie, you may have wondered why they sometimes appear to go backward or even stand still. The reason for this has to do with how film works. Typically, the illusion of continuity is created by rapidly juxtaposing many still photographs, for instance, \( 24 \) per second. (This number is usually called the frame rate.) In other words, one photograph takes place in every \( \frac{1}{24} \) seconds. Let’s assume that we are photographing a car whose wheels turn at a rate of \( \frac{1}{4} \) revolution every \( \frac{1}{24} \) seconds. Then on film, the car will appear to be moving forward as normal. But suppose instead that the car is traveling three times faster — that is, its wheels are turning at a rate of \( \frac{3}{4} \) revolution every \( \frac{1}{24} \) seconds. When we observe the film, without sufficient visual context, it will actually appear as though the wheels are turning backward \( \frac{1}{4} \) revolution every \( \frac{1}{24} \) seconds. In other words, the car was moving so fast that the frame rate couldn’t keep up. In fact, even if the car had only been moving twice as fast — that is, at a rate of \( \frac{1}{2} \) revolution every \( \frac{1}{24} \) seconds — we wouldn’t be able to distinguish whether its motion was forward or backward on film.

Thus, if we want to accurately capture the motion of a wheel on film, we actually need to choose a frame rate at least twice the rate of the wheel. (This is the essence of the Nyquist-Shannon sampling theorem.) So a rate of \( 24 \) frames per second can accommodate any wheel spinning at a rate slower than \( 12 \) revolutions per second. If we take the average circumference of a tire to be about \( 67 \) inches, then \( 12 \) revolutions per second corresponds to \( 67 \) feet per second, or approximately \( 45 \) mph. Hence, any motion in excess of \( 45 \) mph may appear distorted by aliasing without a faster frame rate.

All these arguments apply to audio processing as well! In digital signal processing, a continuous sound is sampled by gathering rapid data about discrete instants of the sound, similar to taking rapid photographs of a visual event. The sampling rate (analogous to the frame rate) indicates just how much data are gathered in one second. For example, one common choice of sampling rate is \( 44100 \) samples per second. But if the sampling rate does not exceed twice the frequency of a signal, then the signal may be subject to aliasing. This results in low-frequency signals which were not present in the analog sound manifesting themselves in the digital version. For example, a tone with a frequency of \( 43660 \) Hz could be heard instead as a tone with frequency \( 44100 – 43660 = 440 \) Hz, a concert A.

While aliasing is usually a drawback of digital processing that engineers try to avoid, it sometimes has an interesting sound. Here’s a quick little track that demonstrates the concept:

The program is written so that the frequency slides perpetually higher; however, you will eventually start hearing a quasi-random scattering of different pitches due to aliasing.

Musical Cantor

Cantor’s ternary set, or more briefly, the Cantor set, is a well-known construction that illustrates some interesting topological properties of the real numbers. To describe it briefly, we define an iterative process as follows: Let \( C_0 \) be the interval from \( 0 \) to \( 1\). Then \( C_1 \) is obtained by removing the middle third from \( C_0 \), leaving the two intervals from \( 0 \) to \( \frac{1}{3} \) and \( \frac{2}{3} \) to  \( 1 \). The set \( C_2 \) is obtained by removing the two middle thirds from \( C_1 \). So \( C_2 \) will then contain four intervals, each of length \( \frac{1}{9}. \) In general, the set \( C_n \) will contain \( 2^{n} \) intervals, each of length \( \left(\frac{1}{3}\right)^{n} \). The Cantor set, denoted \( C \), is then defined as

\(C=\displaystyle\bigcap_{n=0}^{\infty} C_n. \)

The Cantor set satisfies many curious properties. For instance, in a certain sense, it has length zero. This is more or less because at stage \( n \), it has a total length of \( \left(\frac{2}{3}\right)^{n} \), which approaches \( 0 \) as \( n \) approaches infinity. However, from the perspective of cardinality, it can be shown that the Cantor set is uncountable. This is actually fairly simple to see, using the following argument. In order to be an element of the Cantor set, you must be a member of every set \( C_n \). Going from \( C_0 \) to \( C_1 \), you have to be either ‘left’ or ‘right’ of the middle third. Let’s say hypothetically that you are ‘left.’ Then going from \( C_1 \) to \( C_2 \), again, you must be either ‘left’ or ‘right’ of the middle third. It’s clear that this process repeats indefinitely. If we represent ‘left’ and ‘right’ by \( 0 \) and \( 1 \), then we see that elements of the Cantor set can be put into one-to-one correspondence with infinite binary sequences. Thus, by Cantor’s diagonal argument, the Cantor set must be uncountable. So although it has measure zero, it is an uncountable set!

One can also try to “hear” the Cantor set in the following way. Start with a particular frequency, let’s say, middle C. Define \( C_0 \) as sustaining that frequency for some duration, let’s say, one second. Then choose an interval of modulation, for instance, a tone. Then \( C_1 \) would be the triplet D-C-D, since we have modulated the outer thirds by a tone. We can proceed similarly to the general form of \( C_n \), and in principle, we could even define the full Cantor set \( C \), although it would be difficult to program. The following audio tracks illustrate a few of these ideas.

Here’s Cantor for \( n = 0, 1, 2, 3, 4: \)

Cantor with \( n = 6: \)

Inverted Cantor:

Cantor with the modulation factor as a perfect fourth:

A mash-up of many of these ideas:

I hope this post allows you to hear the Cantor set better!

(Note: Once again, I’ve coded all the musical examples in Gamma.

Rhythm and Frequency

If you happen to have a subscription to JSTOR, Karlheinz Stockhausen’s 1962 article “The Concept of Unity in Electronic Music” is a classic explanation of some of the concepts I’m about to describe. It turns out that some musical parameters, such as rhythm and frequency, are in fact manifestations of one underlying mathematical principle.

Consider a simple rhythm, for instance, a pulse that divides a second into three equal beats. We can view the speed of this rhythm as 3 Hz. If we speed this pulse up until it is several hundred Hz, we will no longer process it as a rhythm but instead as a pitch with a frequency, as the following audio example demonstrates:

More interestingly, suppose we take a polyrhythm. That is, we take a single pulse, let’s say, 1 Hz, and divide it evenly into two different parts. One of the simplest possibilities is a 3:2 pattern, which would initially start out as a 3 Hz : 2 Hz ratio. Again, if we speed this polyrhythm up until it is several hundred Hz, our brains begin to perceive the rhythm as frequency. In particular, we will hear a perfect fifth, since the ratio of the frequencies is 3:2. Try to determine in the middle of the following audio example where you stop perceiving the sounds as a rhythm and start perceiving them as frequency:

Here’s a similar example, but with a 4:3 ratio, so that the frequencies play a perfect fourth:

And the irresistible 6:5:4 ratio, which produces a major triad:

It may seem counterintuitive, but these examples demonstrate that rhythm and frequency are in a certain sense the same concept. It’s merely our perception of them that varies.

(Note: I’ve coded all the musical examples in Gamma.)

Finite ant crawling

Here’s another example of recursive probability in action on a different problem.

Suppose you are an ant crawling along the edges of an equilateral triangle \( ABC \). You start at point \( A \). Every minute, you choose uniformly at random one of the other two vertices and move to it. What is the probability that you return to where you started (point \( A \)) after exactly twelve minutes?

Well, okay, at first the question seems hard to grasp, so let’s try to approach it by first tackling simpler questions. What is the probability that you return to where you started after exactly one minute? Zero, since you can only make one move. Well that didn’t work! How about after two minutes? Now we’re getting somewhere, since you can do this via two different paths: one through point \( B \) and one through point \( C. \) There are four total paths you could take, so the probability in this case must be \( \frac{1}{2} \). Just out of curiosity/completeness, we could also ask ourselves what the probability of arriving at point \( B \) or point \( C \) is after two moves. By symmetry, these probabilities must be the same, and their sum, together with the probability of arriving at point \( A \) after two moves, must be \( 1. \) Hence, they must each have probability \( \frac{1}{4} \).

Believe it or not, we’re already in a position to solve the original problem! The key is to think recursively. With this in mind, let \( p_{k} \) denote the probability of returning back to where you started after \( 2k \) minutes. We already know that \( p_1 = \frac{1}{2}. \) For \( k \) greater than \( 1 \), we want to relate \( p_k \) to \( p_{k-1} \). In other words, once you’ve made \( 2k – 2 \) steps, what is the probability that after taking two more steps, you are back to where you started?

This depends on where you are after \( 2k – 2 \) steps. So we can just break it down into all the cases! You’re at point \( A \) with probability \( p_{k-1} \), and from here, the probability of taking two more steps and getting back to point \( A \) is simply \( \frac{1}{2} \) as we computed before. You’re at either of points \( B \) or \( C \) with a sum total of probability \( 1 – p_{k-1} \), and from each of those cases, the probability of taking two more steps and getting back to point \( A \) is \( \frac{1}{4} \), also as calculated before. So we are finally in a position to write down our recursive equation:

\( p_k = \frac{1}{2}p_{k-1} + \frac{1}{4}(1 – p_{k-1}) \)


\( p_k = \frac{1}{4}(p_{k-1} + 1). \)

Thus, turning the crank a few times, starting with \( p_1 = \frac{1}{2} \), we get \( p_2 = \frac{3}{8}, p_3 = \frac{11}{32}, p_4 = \frac{43}{128}, p_5 = \frac{171}{512}, \) and finally, \( p_6 = \frac{683}{2048}. \) (Note that our final answer is very close to \( \frac{1}{3} \), which makes intuitive sense.)

There are actually many other ways to approach the same problem (one could, for instance, create a transition matrix), but this one is very simple and clean. In fact, it’s not too hard from here to come up with an explicit formula for \( p_k \) and prove that it is correct via induction. (A good exercise!)

Infinite coin flipping

Suppose Alice and Bob decide to play the following simple game: they alternate turns flipping a coin, and whoever gets heads first immediately wins. (Alice gets the privilege of going first.) What are Alice’s chances of winning this game?

To make it more general, let’s suppose the coin is heads with probability \( x \), where \( 0 \le x \le 1 \). There are two approaches to solving this problem. Let’s look at the ‘brute force’ way first. Alice only gets to flip the coin on the first, third, fifth, seventh, … turns. Thus, her chances of winning the game must be equal to the infinite sum of her chances of winning on each of those respective turns. On the first turn, her chance of winning is \( x \). In order to win on turn three, all previous flips must have been tails. Thus, her chances of winning in this case are \( (1-x)^{2} x \). Similarly, on turn five, her chances of winning are \( (1-x)^{4} x \), and this suggests that on turn \( 2n + 1 \), her chances of winning are \( (1-x)^{2n} x \). Thus, her chances of winning the game in total are

\( \displaystyle \sum_{n=0}^{\infty} (1-x)^{2n} x. \)

This is a geometric series with first term \( x \) and ratio \( (1-x)^{2} \), so it is equal to \( \frac{x}{1 – (1-x)^{2}} = \frac{x}{2x – x^{2}} = \frac{1}{2 – x}. \) So for instance, if the coin is fair, \( x = \frac{1}{2} \) and Alice’s chances of winning are \( \frac{1}{2 – \frac{1}{2}} = \frac{2}{3}. \)

While this approach certainly was successful, let’s look at a slicker alternative. Starting over, let’s denote the probability that Alice wins the game by \( P \). Then of course the probability that Bob wins the game is \( 1 – P \). Instead of putting ourselves in Alice’s shoes, let’s pretend for a minute that we are Bob. Suppose Alice goes first and throws tails. Now, as Bob, we are pretty happy! It’s almost as though we get to go first now …

Well wait a minute, it is exactly as though we get to go first now! This observation actually tells us something: the original probability of Bob winning the game must be equal to the probability of Alice winning the game while first throwing tails. In equations, this means that \( (1 – P) = P(1 – x) \). Now we can simply solve for \( P \):

\( 1 = P(1-x) + P \)

\( 1 = P(2 – x) \)

So \( P = \frac{1}{2-x} \), just as before. (Notice that our answer makes intuitive sense: as \( x \searrow 0 \), Alice’s probability of winning approaches \( \frac{1}{2} \) from above, and as \( x \nearrow 1 \), Alice’s probability of winning approaches \( 1 \) from below.)

This approach to probability is essentially recursive, and we’ll see more of it in the next post.

Probability and number theory, part II

In this post, we will explore the following question: If we choose two integers uniformly at random, what is the probability that they are relatively prime?

We start with a preliminary result. The Riemann zeta function, \( \zeta(s) \), is ubiquitous in number theory. Although the most interesting version of it is the complex-valued one, today we only need the real version. It is defined as follows: for any real number \( s \) greater than \( 1, \)

\( \zeta(s) = \displaystyle\sum_{n=1}^{\infty} n^{-s} \).

The Euler product  for the zeta function is the following alternative expression:

\( \zeta(s) = \displaystyle\prod_{p} \frac{1}{1 – p^{-s}}, \)

where the product is taken over every prime \( p \). To see why this formula is true, we start with the right-hand side and use a geometric series expansion to obtain

\(\displaystyle\prod_{p}\frac{1}{1-p^{-s}}=\displaystyle\prod_{p}(1+p^{-s}+p^{-2s}+\cdots). \)

By the Fundamental Theorem of Arithmetic, every positive integer can be expressed as a unique product of primes, up to order. (We consider \( 1 \) to be the product of no primes; i.e., the empty product.) Thus, there is a one-to-one correspondence between the terms in the sum \( \sum_{n=1}^{\infty} n^{-s} \) and the terms in the expansion of the product \( \prod_{p}(1+p^{-s}+p^{-2s}+\cdots).\) Hence,

\( \displaystyle\prod_{p}\frac{1}{1-p^{-s}}=\displaystyle\sum_{n=1}^{\infty}n^{-s}=\zeta(s).\)

Now back to our original problem! Suppose we choose \( a, b \in \mathbb{Z} \) uniformly at random. Fix an arbitrary prime \( p \). The probability that both \( a \) and \( b \) are divisible by \( p \) is \( \left(\frac{1}{p}\right)\left(\frac{1}{p}\right) = p^{-2} \). Hence, the probability that they are not both divisible by \( p \) is \( 1 – p^{-2} \).

The condition that \( a \) and \( b \) are relatively prime is equivalent to the condition that, for every prime \( p \), \( a \) and \( b \) are not both divisible by \( p \). Furthermore, for distinct primes \( p \) and \( q \), the events that \( a \) and \( b \) are not both divisible by \( p \) and \( a \) and \( b \) are not both divisible by \( q \) are mutually independent. (In fact, more precisely, the events that \( a \) and \( b \) are not both divisible by \( m \) and \( a \) and \( b \) are not both divisible by \( n \), for relatively prime integers \( m \) and \( n \), are mutually independent.)

Thus, the probability that \( a \) and \( b \) are relatively prime is equal to

\( \displaystyle\prod_{p}(1-p^{-2}), \)

where the product is taken over every prime \( p \). But

\( \displaystyle\prod_{p}(1-p^{-2}) = \displaystyle\prod_{p}(\frac{1}{1-p^{-2}})^{-1}=(\displaystyle\prod_{p}\frac{1}{1-p^{-2}})^{-1} = \frac{1}{\zeta(2)}.\)

Hence the answer is \( \frac{1}{\zeta(2)}\approx 0.608. \, \spadesuit \)

The closed-form expression for \( \zeta(2) \) is actually well-known (it equals \( \frac{\pi^{2}}{6}) \), but that’s another story for another day. The result also generalizes to more than two integers, a nice exercise for the reader.

Probabilistic approach to the Euler totient function

Euler’s totient function, \( \varphi(n) \) is a well-known arithmetic function from number theory. For any \(n \in \mathbb{N} \), \( \varphi(n) \) is defined to be the number of integers \( k \) in the range \( 1 \leq k \leq n \) for which \( \gcd(n, k) = 1 \). There is a formula for computing \( \varphi(n) \):

\( \varphi(n) = n \displaystyle\prod_{p|n} (1 – \frac{1}{p}), \)

where the product is taken over all the distinct primes \( p \) dividing \( n \). This formula can be proved in a few ways. One rests on the fact that \( \varphi(n) \) is multiplicative. Another uses an elaborate form of inclusion-exclusion. The one I am presenting now is based on probability.

Suppose we fix a positive integer \( n \) and choose uniformly at random an integer \( k \) in the range  \( 1 \leq k \leq n \). What is the probability that \( \gcd(n, k) = 1 \)? We answer this question in two ways. On one hand, we know by definition that there are \( \varphi(n) \) successful outcomes and \( n \) total outcomes, so the probability is equal to \( \frac{\varphi(n)}{n} \). On the other hand, we proceed more constructively. In order for the integer \( k \) to be relatively prime to \( n \), it is necessary and sufficient for each prime divisor \( p \) of \( n \) that \( k \) must not be divisible by \( p \).

The probability that \( k \) is divisible by \( p \) is \( \frac{1}{p} \). Hence, the probability that \( k \) is not divisible by \( p \) is \( 1 – \frac{1}{p} \). Furthermore, for distinct primes \( p \) and \( q \), the events that \( k \) is not divisible by \( p \) and \( k \) is not divisible by \( q \) are mutually independent. (More precisely, the events that \( a \) and \( b \) are not both divisible by \( u \) and \( a \) and \( b \) are not both divisible by \( v \), for relatively prime integers \( u \) and \( v \), are mutually independent.)

Thus, the probability that \( k \) is relatively prime to \( n \) is

\( \displaystyle \prod_{p|n} \left(1 – \frac{1}{p}\right). \)

Equating our two results, we have

\( \displaystyle\frac{\varphi(n)}{n} = \displaystyle\prod_{p|n}(1-\frac{1}{p}).\)

In other words,

\( \varphi(n) = n \displaystyle\prod_{p|n}(1-\frac{1}{p}). \, \spadesuit \)

We’ll use a similar approach in the future to show that the probability of choosing uniformly at random a pair of relatively prime integers is \( \frac{1}{\zeta(2)}. \)

Everything you ever wanted to know about temperaments (but were afraid to ask)

Here’s a historical paper I wrote on different musical temperaments from Antiquity until the late Renaissance.

A History of Temperaments

Recommended reading for anyone who is confused about the differences between just, mean, and equal temperament.

Here are some supplementary audio samples:

Just major third:

Pythagorean major third (ditone):

Syntonic comma:

Just major third cycle:

The resulting comma (“diesis”):