The first one is made from a couple of GarageBand’s built-in keyboards and synths, and has a “theme song to a sci-fi show” vibe to it:

The second one is built in a few layers of repetition, and has a very earnest vibe to it. It’s just a recording or two of me playing the piano, so I was only using GarageBand for mixing:

Here’s a clip of one of my own unpremeditated improvisations. I still find myself resorting to little crutches here and there, but if I deliberately try to push the envelope, the results are usually surprising. Occasionally they are jarring, but you learn to roll with it! The rhythmic staccato portion around 2:45, for instance, definitely represents a departure from my comfort zone, but in a positive way.

The basic idea is to render one full scene up front (think of it as a down payment of time). It is typical to simulate the fluid velocity field on a regular grid with \( N \) voxels, so mathematically, each time-step can be identified with an abstract vector living in \( \mathbb{R}^{3N} \). We regard the scene as a sequence of such vectors and, using an idea from linear algebra called the singular value decomposition (SVD), discover a set of singular vectors (at most one per time-step). If there are \( r \) of these singular vectors, then we can identify their span with \( \mathbb{R}^r \), a low-dimensional subspace of \( \mathbb{R}^{3N} \); hence, the name. An analogy for this might be the motion of a human hand: if we discretize the hand into millions of voxels, then in principle there are many degrees of freedom from which a computer simulation could choose. However, we know that in practice there is a much smaller range of movements that are interesting or even physically possible. Theoretically, then, there is some smaller subspace in which we can perform our calculations without sacrificing any quality!
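As a toy illustration of this idea (not the actual simulation pipeline, and with made-up sizes), here is how the SVD of a snapshot matrix recovers a low-dimensional subspace in NumPy:

```python
import numpy as np

# Hypothetical toy sizes: a real simulation would have N in the millions.
N = 1000          # voxels; each snapshot is a velocity field in R^{3N}
T = 50            # number of time-steps (snapshots)

rng = np.random.default_rng(0)
# Construct snapshots that secretly live in a 5-dimensional subspace,
# mimicking the "human hand" intuition: few effective degrees of freedom.
basis = rng.standard_normal((3 * N, 5))
coeffs = rng.standard_normal((5, T))
snapshots = basis @ coeffs                      # shape (3N, T)

# Thin SVD of the snapshot matrix; columns of U are the singular vectors.
U, S, Vt = np.linalg.svd(snapshots, full_matrices=False)

# Keep only singular vectors with non-negligible singular values.
r = int(np.sum(S > 1e-10 * S[0]))
U_r = U[:, :r]   # orthonormal basis for the r-dimensional subspace

print(r)   # recovers the true subspace dimension, 5
```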

As it turns out, however, this technique has some serious hidden drawbacks, primarily related to the RAM limitations of typical computers (at least in the year 2016). In particular, for a reasonably high-detail grid, the subspace re-simulation can require upwards of 90 GB of RAM to run! (Most typical laptops have 4–16 GB of RAM as of this writing.) Although the aptly-named “Kraken” machine in my lab does indeed have the requisite 96 GB of RAM, for most people this is an immediate dealbreaker. Hence the need for a data compression technique.

In my paper, we use a type of lossy compression similar to the pre-2000 JPEG algorithm, which is based on the Discrete Cosine Transform (DCT). The basic premise of these types of compression is to re-express the data in terms of a different basis, ideally one in which the representation is sparse (meaning that many of the coefficients are zero or at least very close to zero). As a basic example, suppose you want to encode the vector of all 1s in \( \mathbb{R}^n \). If you stick to the standard ordered basis, each basis vector has a coefficient of 1, so in order to reconstruct the vector, you have to specify \( n \) numbers. Using the discrete cosine basis instead, however, the representation of the very same vector has every coefficient other than the first equal to 0, meaning only 1 number must be specified to reconstruct it exactly!
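That claim is easy to check numerically with SciPy’s orthonormal DCT (`scipy.fft.dct` with `norm='ortho'`; the vector length here is an arbitrary choice):

```python
import numpy as np
from scipy.fft import dct

n = 64
ones = np.ones(n)

# Orthonormal DCT-II of the all-ones vector.
coeffs = dct(ones, norm='ortho')

# Only the first (DC) coefficient is nonzero: it equals sqrt(n),
# and every other coefficient vanishes up to rounding error.
print(coeffs[0])                     # 8.0 for n = 64
print(np.max(np.abs(coeffs[1:])))    # ~machine epsilon
```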

It turned out that the results from the DCT-based compression were reasonably promising in terms of the compression ratio obtained; however, the extra overhead of decompressing and reconstructing the data initially proved severe, increasing the previous runtime by a whole order of magnitude. Fortunately, we were able to use some simple mathematical trickery to evaluate the reconstruction phase sparsely in the frequency domain before taking the inverse transform, exploiting the fact that the DCT is a unitary transform (and thus preserves inner products).
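The unitarity fact is also easy to verify numerically. This sketch (again using SciPy’s orthonormal DCT, not the code from the paper) confirms that inner products computed on DCT coefficients agree with those computed in the original domain, which is what makes working sparsely in the frequency domain legitimate:

```python
import numpy as np
from scipy.fft import dct

rng = np.random.default_rng(1)
x = rng.standard_normal(256)
y = rng.standard_normal(256)

X = dct(x, norm='ortho')
Y = dct(y, norm='ortho')

# Unitarity: inner products (hence norms) are identical in both domains,
# so linear algebra done on DCT coefficients matches the result of
# decompressing first and working in the original domain.
print(np.allclose(x @ y, X @ Y))   # True
```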

The project page (linked at the beginning and again here) contains the full paper as well as some images and video results, so please check it out for a more thorough treatment!

Currently, I am experimenting with sonifying certain aspects of the subspace re-simulation, mapping the singular values into frequencies, and turning trajectories through the subspace into morphologies of chords and spectra.

“In ‘Cantor’, the mathematical process of removing middle thirds to generate the Cantor ternary set (introduced by mathematician Georg Cantor in the 1870s) inspires a musical process across multiple timescales. Different levels and mappings are layered together to create a sort of Cantor mosaic.”

And the piece itself:

You can read about some of my earlier projects and explorations with the Cantor set here and here.

One concept that is occasionally used is the self-similarity matrix (SSM). Assume that we begin with a set of \( n \) feature vectors for which we would like to construct the corresponding SSM. By definition, this matrix is the \( n \times n \) matrix whose entry at position \( (i, j) \) is given by a ‘similarity’ (or dissimilarity, e.g., the Euclidean distance) between vectors \( v_i \) and \( v_j \). Here is a graphical example of an SSM of approximately \( 100 \) windows of a spectrogram:
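A minimal sketch of building such an SSM in NumPy, using Euclidean distance and random stand-in feature vectors:

```python
import numpy as np

def self_similarity_matrix(features):
    """Pairwise Euclidean distances between rows of `features` (n x d)."""
    diffs = features[:, None, :] - features[None, :, :]
    return np.linalg.norm(diffs, axis=-1)

rng = np.random.default_rng(0)
feats = rng.standard_normal((100, 12))    # e.g. 100 spectrogram windows
ssm = self_similarity_matrix(feats)

print(ssm.shape)                # (100, 100)
print(np.allclose(ssm, ssm.T))  # True: symmetric, with a zero diagonal
```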

This matrix may remind you of a related concept from probability. Given a set of states, we can define the stochastic matrix, which determines with what probability we will transition from one state to another. In other words, the entry at position \( (i, j) \) of the stochastic matrix is given by the probability of transitioning from state \( i \) to state \( j \). Here is an example of a stochastic matrix (note how the rows sum to \( 1 \)):

\( P=\begin{pmatrix} 0 & 0 & 1/2 & 0 & 1/2 \\ 0 & 0 & 1 & 0 & 0 \\ 1/4 & 1/4 & 0 & 1/4 & 1/4 \\ 0 & 0 & 1/2 & 0 & 1/2 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \)

Given an SSM, we can obtain a stochastic matrix by normalizing each row to sum to \( 1 \). A stochastic matrix can then drive an algorithmic process such as a Markov chain. The main question is determining how to map the different ‘states’ into audio. One approach, assuming the SSM has been computed from a spectrogram, is to map each window to a stochastic frequency cloud, where the various frequencies occur in proportion to the energy in the corresponding bin. Here’s a basic audio example:
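Here is a small sketch of that normalization plus a Markov-chain walk (with a random stand-in matrix; the function names are my own):

```python
import numpy as np

def to_stochastic(ssm):
    """Row-normalize a nonnegative matrix so each row sums to 1."""
    return ssm / ssm.sum(axis=1, keepdims=True)

def markov_walk(P, start, steps, rng):
    """Sample a state sequence from transition matrix P."""
    states = [start]
    for _ in range(steps):
        states.append(int(rng.choice(len(P), p=P[states[-1]])))
    return states

rng = np.random.default_rng(0)
ssm = rng.random((8, 8))     # stand-in for a real SSM
P = to_stochastic(ssm)
walk = markov_walk(P, start=0, steps=20, rng=rng)

print(np.allclose(P.sum(axis=1), 1.0))   # True: rows sum to 1
print(len(walk))                          # 21 states, including the start
```

Each visited state would then be rendered as audio, e.g. as the stochastic frequency cloud described above.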

The previous example focused mostly on frequencies, but we can use other typical MIR features to map to other musical parameters, such as amplitude envelopes. For instance, we can compute a windowed RMS of an audio sample to get an idea of where energy peaks are located in time. We may then use the RMS signal itself as an envelope, creating a sound whose contours follow the energy of the original sound through time. Here’s an example:
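In code, the windowed RMS might look like this (a NumPy sketch; the window and hop sizes are arbitrary choices):

```python
import numpy as np

def windowed_rms(signal, window, hop):
    """RMS energy of successive windows; usable directly as an envelope."""
    n = 1 + (len(signal) - window) // hop
    frames = np.lib.stride_tricks.sliding_window_view(signal, window)[::hop][:n]
    return np.sqrt(np.mean(frames ** 2, axis=1))

# Toy signal: a burst of energy in the middle of silence.
sr = 8000
t = np.linspace(0, 1, sr, endpoint=False)
sig = np.sin(2 * np.pi * 220 * t) * np.exp(-((t - 0.5) ** 2) / 0.01)

env = windowed_rms(sig, window=256, hop=128)
print(env.argmax())   # the peak lands near the middle of the signal
```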

Another musical parameter we can control is rhythm. By locating relative extrema (using a simple first-difference approach), we can also approximate onsets. The spacing between those onsets can then be mapped into rhythms. In my example, I randomly choose a spacing with weight corresponding to the intensity of the onset.
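The first-difference peak-picking step looks something like this (a sketch on a toy envelope, not my actual code):

```python
import numpy as np

def onset_peaks(env):
    """Indices of relative maxima of `env`, found where the first
    difference changes sign from positive to non-positive."""
    d = np.diff(env)
    return np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1

env = np.array([0.0, 0.2, 0.9, 0.3, 0.1, 0.5, 0.8, 0.2, 0.0])
peaks = onset_peaks(env)
spacings = np.diff(peaks)    # candidate rhythmic values
weights = env[peaks][1:]     # weight each spacing by its onset intensity

print(peaks)      # [2 6]
print(spacings)   # [4]
```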

Finally, there is the parameter of form. Since music typically evolves through time, it is aesthetically important to devise some system whereby the music will be changing. A purely algorithmic approach, based on self-similarity, is intriguing: starting with an initial SSM, we generate a few seconds of audio using the previously described techniques. That audio is then analyzed to generate a new SSM, which in turn drives new audio. Other data such as RMS can also be sequentially updated. The piece essentially composes itself forever. Of theoretical interest might be the question of whether it converges to some steady state (or perhaps ‘converges’ into a steady loop).

Here is an étude based on combining several of the aforementioned ideas:


The concept starts from considering the Cantor set probabilistically. The Cantor function is in fact the cumulative distribution function of a random variable which is uniformly distributed on the Cantor set. But we can also define the Cantor function directly. Consider the Cantor set as the set of all ternary expansions whose entries are all either \( 0 \) or \( 2 \). As mentioned previously, there is then a bijection between the Cantor set and the full unit interval, obtained by ‘pretending’ the \(2\)s are \(1\)s and interpreting the numbers as binary expansions. Hence, for an input \(x\) which belongs to the Cantor set, we define \(c(x)\) via this correspondence. We then further define the Cantor function on inputs \(x\) that do not belong to the Cantor set as follows: since \(x\) does not belong to the Cantor set, it must contain a \(1\) in its ternary expansion. By truncating \(x\) immediately after the first \(1\) in its expansion, ‘pretending’ any previous \(2\)s are \(1\)s, and then interpreting the result as a binary expansion as before, we obtain the appropriate output \(c(x)\).
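In code, that digit-by-digit definition looks something like this (a simplified Python sketch, rather than the SuperCollider UGen itself):

```python
def cantor(x, digits=40):
    """Cantor function c(x) on [0, 1] via the ternary-digit definition:
    truncate after the first ternary digit 1, map earlier 2s to binary 1s,
    and read the result as a binary expansion."""
    value, scale = 0.0, 0.5
    for _ in range(digits):
        x *= 3
        d = int(x)          # next ternary digit of x
        x -= d
        if d == 1:
            return value + scale       # truncate right after the first 1
        value += scale * (d // 2)      # a ternary 2 becomes a binary 1
        scale /= 2
    return value

print(cantor(0.5))    # 0.5 (0.111... in ternary, truncated at the first 1)
print(cantor(1/4))    # 1/3: ternary 0.0202... -> binary 0.0101...
```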

I’ve recently written the code for a Cantor function UGen in SuperCollider for use in digital audio. The UGen uses an approximation to the Cantor function based on a recursive sequence of functions which converge pointwise to the Cantor function. The user can control the depth of the recursion for more or less precision. Here are a few very primitive demonstrations. To illustrate the recursion of the function sequence that converges to the Cantor function, I’ve taken a sine oscillator and modulated its frequency from \(200\) to \(600\) Hz using different convergents of the Cantor function. Here they are, for \( n = 0, 1, 2, \text{and } 3\):

\( n = 0\): (equivalent to one straight line between \(200\) and \(600\); i.e., no plateaus)

\(n = 1\): (comprises three straight lines, the middle one being completely flat; i.e., one plateau)

\(n = 2\): (three plateaus connected by straight lines)

\(n = 3\): (seven plateaus connected by straight lines)

(In general, there will be \( 2^n - 1 \) plateaus.)
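The recursive sequence of approximating functions can be sketched as follows (a Python rendition of the recursion, not the UGen source): each level copies a half-scale version of the previous level onto the two outer thirds, with a flat plateau at \( \frac{1}{2} \) on the middle third.

```python
def cantor_approx(x, n):
    """n-th convergent c_n of the recursion converging pointwise to the
    Cantor function: c_0(x) = x; c_{n+1} is a half-scale copy of c_n on
    each outer third, with a plateau at 1/2 on the middle third."""
    if n == 0:
        return x
    if x < 1/3:
        return cantor_approx(3 * x, n - 1) / 2
    if x <= 2/3:
        return 0.5
    return 0.5 + cantor_approx(3 * x - 2, n - 1) / 2

# c_1 has one plateau; in general c_n has 2**n - 1 plateaus.
print(cantor_approx(0.5, 1))   # 0.5 (the middle plateau)
print(cantor_approx(0.2, 1))   # 0.3 (on the rising segment 3x/2)
```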

Of course, those examples aren’t especially interesting from a musical perspective. However, I found that the UGen was quite expressive when used at control rate to manipulate such parameters as envelopes and rhythms. Here is a recording of an improvisation I generated in SuperCollider, using almost exclusively the Cantor UGen to shape the sounds. (The actual audio rate UGen is a pulse oscillator, and the end result is put through a reverberator.)

If others are interested in trying out the Cantor function UGen, I will post the code shortly!


Movement I:

Movement II:

Movement III:

For my final project in pattern formation, I investigated a new form of Newton iteration by combining it with Möbius transformations (NB: there’s a very nice video by Doug Arnold and Jonathan Rogness that explains these intuitively). These transformations are functions \( f \colon \mathbb{C} \rightarrow \mathbb{C} \) given by \( f(z) = \frac{az + b}{cz + d} \), where \( a, b, c, d \in \mathbb{C} \) satisfy the condition \( ad - bc \neq 0 \). In Newton iteration, we approximate the roots of a polynomial \( p(z) \) by constructing the sequence \( (z_n) \) given by

\( \displaystyle z_{n+1} = z_n - \frac{p(z_n)}{p'(z_n)} \)

However, I considered the modified sequence

\( \displaystyle z_{n+1} = z_n - \frac{a\frac{p(z_n)}{p'(z_n)} + b}{c\frac{p(z_n)}{p'(z_n)} + d} \)

In other words, I took a Möbius transformation of the term \( \frac{p(z_n)}{p'(z_n)} \). By modulating the parameters \(a\), \(b\), \(c\), and \(d\), many different fractal patterns emerge. Here is a gallery of my results:
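A simplified sketch of the iteration in NumPy (here for \( p(z) = z^3 - 1 \), with the identity transformation \( a = 1, b = 0, c = 0, d = 1 \), which reduces to the ordinary Newton fractal; the polynomial and grid bounds are my own choices for illustration):

```python
import numpy as np

def mobius_newton_basins(a, b, c, d, roots, size=200, iters=30):
    """Iterate z -> z - (a*w + b)/(c*w + d), where w = p(z)/p'(z) for
    p(z) = z**3 - 1, and record which root each grid point ends nearest."""
    xs = np.linspace(-2, 2, size)
    Z = xs[None, :] + 1j * xs[:, None]
    for _ in range(iters):
        w = (Z**3 - 1) / (3 * Z**2)          # p/p' for p(z) = z^3 - 1
        Z = Z - (a * w + b) / (c * w + d)
    # Index of the nearest cube root of unity for each pixel.
    dists = np.abs(Z[..., None] - np.array(roots)[None, None, :])
    return dists.argmin(axis=-1)

roots = [1, np.exp(2j * np.pi / 3), np.exp(4j * np.pi / 3)]
basins = mobius_newton_basins(1, 0, 0, 1, roots)

print(basins.shape)             # (200, 200)
print(len(np.unique(basins)))   # all three basins of attraction appear
```

Modulating \(a\), \(b\), \(c\), and \(d\) away from the identity deforms these basins into the other fractal patterns in the gallery.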

I was especially interested in how to move continuously from fractal to fractal. Hence, I generated a video that demonstrates how several of these images relate to one another:

As another avenue of pattern generation, I also used the Mandelbrot set as input:

Here it is in video form, inverted:

Please also check out my fellow students’ final projects at our course website!

*Spoiler alert* The key position is after White plays 18.g3, so you can try to work it out for yourself if you like! Of course, knowing the critical moment ahead of time makes any tactic easier to spot, which is part of why finding moves like this can be so difficult!

```
[Event "5-minute pool"]
[Site "Internet Chess Club"]
[Date "2013.05.14"]
[Round "?"]
[White "Pavel"]
[Black "Aaron"]
[Result "0-1"]
[WhiteElo "2071"]
[BlackElo "2098"]

1.e4 e5 2.Nf3 Nc6 3.Bb5 Nf6 4.O-O Bc5
( 4...Nxe4 5.d4 Nd6 6.Bxc6 dxc6 7.dxe5 Nf5 8.Qxd8+ Kxd8 {is the classical Berlin.} )
5.c3 5...O-O
( 5...d6 $2 6.d4 Bb6 7.d5 a6 8.Ba4 {and White wins material.} )
6.d4 Bb6 7.Re1 7...d6 8.Bxc6 $6
{Since White does not win a pawn on e5 as a result of this trade due to tactical reasons, this move is premature.}
8...bxc6 9.dxe5 Ng4
{The point. With the double attack on e5 and f2 Black retains material equality.}
10.Be3
( 10.Rf1 $2 Ba6 )
10...Bxe3 11.fxe3 Nxe5 12.Nxe5 dxe5
{The position is approximately even, assuming White can find a good square for his knight.}
13.Na3
{This maneuver is a bit slow.}
13...Qh4
{Pressuring e4 and eying the kingside.}
14.Qf3 f5 $5
( 14...Be6 {was safer, but Black plays for more.} )
15.exf5 Bxf5 16.Qxc6
{White accepts the pawn. But now Black gains time to coordinate an attack on the king.}
16...Be4
( 16...Bd3 $2 17.Qd5+ )
17.Qe6+
( 17.Qc4+ Kh8 18.Qe2 {was a safer alternative.} )
17...Kh8 18.g3 $2
( 18.Rf1 {and White is alive and kicking. The point of 18.g3 is that 18...Qh3 is impossible. However...} )
18...Rf2 $1
{Once the initial shock wears off, this move is actually pretty simple to calculate. The myriad checkmating threats force White to capture either the rook or the queen (apart from trivial spite moves like Qc8+/Qe8+/Qg8+/Qh6). But both captures lead to forced mate!}
( 18...Rf2 19.gxh4
( 19.Kxf2 Qxh2+ 20.Kf1 Qg2# )
19...Rg2+ 20.Kf1
( 20.Kh1 Rg3# )
20...Rf8+ 21.Qf7 Rxf7# )
0-1
```

On an unrelated note, here are a few of my results from implementing a simulation of the Doppler effect. In our first example, a source is traveling at about Mach \( \frac{1}{10} \) horizontally from right to left across a distance of \(200\) meters. When it is directly in front of us, the distance between us and the source is \(10\) meters.

In the second example, the source travels twice as fast, twice as far, and we are twice as close. The shift is correspondingly more dramatic.

One of the more counterintuitive aspects of the Doppler effect is the situation in which the source is coming directly at us. We never really experience this in real life (unless we are actually getting run over by the source). In this situation, the source does not in fact slide continuously from a higher frequency to a lower frequency but rather it does a discrete jump. (This is because the Doppler effect depends on the projections of the velocity vectors of the source and the receiver onto the line connecting the two, so unless the source and receiver are traveling directly toward each other, these projections vary continuously. However, when the source and receiver travel directly toward each other, the projections are constant until they meet, at which point they discretely jump to a new value.) In the particular example I have implemented, the source is traveling at Mach \( \frac{1}{4} \). (Warning: this example might be somewhat uncomfortable to listen to!)

Notice that we heard what sounded like a major sixth! This is actually a pretty straightforward consequence of the mathematics behind the Doppler shift. The emitted frequency, \(f\), is distorted by a frequency factor of \( \frac{c}{c + v_s} \), where \(c\) is the speed of sound in the relevant medium and \(v_s\) is the velocity of the source approaching you, with the convention that it is negative as it approaches you and positive as it recedes. Hence, the frequency ratio between the approaching sound and the receding sound is given by \( \frac{c + |v_s|}{c - |v_s|} \). In particular, at Mach \( \frac{1}{4} \), this ratio is \(\frac{\frac{5}{4}}{\frac{3}{4}} = \frac{5}{3} \), which is a just major sixth. Other musical intervals are, of course, also easily obtainable — Mach \( \frac{1}{3} \) yields an octave, for instance. In general, the reader can verify with some basic algebra that in order to get the frequency ratio \(P:Q\), the source must travel at Mach \( \frac{P - Q}{P + Q} \).
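That last formula is easy to play with in code (a small sketch; the function name is my own):

```python
from fractions import Fraction

def mach_for_ratio(p, q):
    """Source Mach number giving an approach:recede frequency ratio p:q,
    from (1 + m)/(1 - m) = p/q  =>  m = (p - q)/(p + q)."""
    return Fraction(p - q, p + q)

print(mach_for_ratio(5, 3))   # 1/4: a just major sixth
print(mach_for_ratio(2, 1))   # 1/3: an octave
print(mach_for_ratio(3, 2))   # 1/5: a just perfect fifth
```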

You might have noticed the expression \( c - |v_s| \) in the denominator above and wondered about the situation in which we are traveling at Mach \(1\) or faster, but that’s another story for a future post.
