I’ve always enjoyed the idea of turning audio into graphics, ever since the days of Windows Media Player’s incredibly naff yet somewhat entertaining visualiser feature. A few years ago I discovered raster-noton, the great German record label which has been home to music from the likes of Robert Lippok, Aoki Takamasa and Ryoji Ikeda. More recently I came across this video for atom™’s track, ‘strom’ on raster’s YouTube channel.
This waveform oscilloscope-like visualisation is quite cool. It’s nothing hugely original, but it’s nice and clean looking.
I then clicked on a recommended video to find this unofficial video for one of my favourite tracks, Robert Lippok’s ‘Whitesuperstructure’.
If we want to take some sound and turn it into visuals, we need some way of dealing with each. Whilst we could probably go really low-level here, for working on the Web, the WebAudio and HTML Canvas APIs will do a very good job.
We’ve got a few options if we want to play audio, but let’s just go with a simple <audio> tag in an HTML document for now.
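Something along these lines should do; the file name and attributes here are just placeholders:

```html
<!-- The id matches what we'll refer to later on; the src is a placeholder. -->
<audio id="in" src="track.mp3" controls autoplay></audio>
```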
This should work straight away. When we load the page, we should hear the music begin to play.
Since we want a canvas for our output graphics, let’s add a <canvas> element to the document too.
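Again, something simple; the id and dimensions here are my own choices:

```html
<!-- 'out' is the id we'll refer to later; the size is arbitrary. -->
<canvas id="out" width="300" height="300"></canvas>
```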
We could achieve what we want simply by putting all of our logic in one place and intimately coupling our canvas and our audio with our code, but a much better practice is to follow the UNIX philosophy and write small, modular - and therefore reusable - components.
So let’s make a Visualiser. To define what a Visualiser object is, we will write a constructor function. A Visualiser will keep track of an audio source, and ‘visualise’ it out onto the canvas. It will use something I’ll call a render function to do the actual drawing. More on this later.
First we’ll call some methods (which we will define in a minute) to initialise stuff. Then we’ll start calling the custom render function. We’ll make a call to the standard window.requestAnimationFrame function and pass it an anonymous function as the callback. Within this anonymous function, we will call the custom render function, passing it a reference to the Visualiser instance that called it, and then make another call to window.requestAnimationFrame to render the next frame. This cycle will repeat for the lifetime of the Visualiser. Delegating the responsibility for deciding when to render frames like this is good because it means the browser can perform small optimisations, such as pausing the rendering loop while the containing tab is inactive.
There is one more property that we will set, appending an _ to its name to denote its privateness. The exact use of this particular property will be explained later…
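Putting that together, the constructor might look something like this; the argument and property names here are just one possible choice:

```js
// A sketch of the constructor; argument and property names are just one choice.
function Visualiser (audioElementId, canvasElementId, render) {
  this.audioElementId = audioElementId;
  this.canvasElementId = canvasElementId;
  this.render = render;

  // Initialisation methods, defined on the prototype below.
  this.prepareCanvas();
  this.prepareAudio();

  // Hand control of frame timing over to the browser.
  var visualiser = this;
  window.requestAnimationFrame(function frame () {
    visualiser.render(visualiser);
    window.requestAnimationFrame(frame);
  });

  // The extra 'private' property mentioned above; its use is explained later.
  this._lastAmplitude = 0;
}
```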
Now that we’ve got our basic constructor function for Visualiser objects, let’s implement those initialisation methods we made calls to.

We’ll add these methods to the prototype of the Visualiser constructor function. This means that every Visualiser instance will refer back to the same initialisation function, which is slightly more efficient than each instance containing its own copy of the function. We’ll probably only ever have one Visualiser, but who knows! Always make room for the future.
To prepare the canvas, we need to get it from the DOM, and then set its properties to the values we want. The Visualiser already knows the id of the canvas element that it’s using for output, so we can just use document.querySelector to retrieve a reference to it. We’ll also get the devicePixelRatio property on the window object and use it to scale the canvas so that it looks sharp on HiDPI displays; by default, an HTML canvas looks blurry on modern displays.
Apart from setting properties on the canvas and configuring the canvas’ styles, we’ll also get a CanvasRenderingContext2D that we can use to draw directly to the canvas. We’ll bind this as a property called renderingContext on the object so we can access it later.
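A sketch of that canvas setup might look like this, assuming the property names from the constructor above:

```js
// A sketch of the canvas preparation; sizing values are taken from the element itself.
Visualiser.prototype.prepareCanvas = function () {
  var canvas = document.querySelector('#' + this.canvasElementId);
  var pixelRatio = window.devicePixelRatio || 1;

  // Give the backing store extra pixels while keeping the element's CSS size
  // the same, so drawing stays sharp on HiDPI displays.
  var width = canvas.clientWidth;
  var height = canvas.clientHeight;
  canvas.width = width * pixelRatio;
  canvas.height = height * pixelRatio;
  canvas.style.width = width + 'px';
  canvas.style.height = height + 'px';

  this.renderingContext = canvas.getContext('2d');
  this.renderingContext.scale(pixelRatio, pixelRatio);
};
```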
Next, we need to initialise our audio within the Visualiser. The WebAudio API represents all the audio processing as a graph: we’ve got a bunch of nodes that can be connected to each other in various ways. It’s quite cool.

The first thing we need is an AudioContext instance. Everything happens within, or in conjunction with, this object. We create one using the AudioContext constructor on window and make it a property on the Visualiser, and we set another property to reference the audio element in the same way as we did with the canvas rendering context.
We’ve got a context, so now we need some actual nodes. We use a MediaElementAudioSourceNode to represent our audio source, and we create it by passing our existing <audio> element from earlier to the context’s createMediaElementSource method. This node will act as our ultimate audio source within the graph.
WebAudio allows us to do all sorts of fancy processing and even signal generation (with oscillators), but what we want to do is extract data from existing audio. The AnalyserNode is perfect for this. We can create one using another method on the AudioContext.
In our audio graph, our nodes need to be connected. Intuitively, our source node should be connected to the analyser node so that we can extract data, but we’ll also connect the output of the analyser to the destination property of the audio context. Connecting a node to the destination allows us to actually hear the audio as well as get data from it!
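Putting the audio setup together, it might look something like this; the fftSize value and property names are just one possible choice:

```js
// A sketch of the audio graph setup.
Visualiser.prototype.prepareAudio = function () {
  this.audioContext = new window.AudioContext();
  this.audioElement = document.querySelector('#' + this.audioElementId);

  // The <audio> element becomes the source node of our graph.
  var source = this.audioContext.createMediaElementSource(this.audioElement);

  // The analyser node is what we'll pull data out of.
  this.analyser = this.audioContext.createAnalyser();
  this.analyser.fftSize = 512;

  // source -> analyser -> destination, so we can both read data and hear it.
  source.connect(this.analyser);
  this.analyser.connect(this.audioContext.destination);
};
```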
When we’re visualising audio, probably the most important data value we want as input is the amplitude of the signal. Let’s add a method for getting that value.
We can use the getByteTimeDomainData method on the analyser node to retrieve an array of bytes, with each byte corresponding to the displacement of the waveform at a given time. The number of these measurements (and so the length of the window they cover) is determined by the fftSize property that we set earlier, but we actually don’t care about it that much because we’re not going to use the points individually. As long as they collectively represent a fairly short time period (a few milliseconds), it’s fine.
In order to get an amplitude that makes sense, we’ll find the maximum value of the data we have retrieved. This gives us the maximum displacement of the wave which is the amplitude within the period. To make the data we’re returning more usable, we’ll also scale it to between 0 and 1 using a little bit of maths.
The maths.max function is one that I wrote separately to make this easier. You should be able to implement that yourself fairly easily, or just use some other maths library.
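Roughly, the method could look like this; here the built-in Math.max stands in for the maths.max helper:

```js
// A sketch of getAmplitude; Math.max stands in for the separate maths.max helper.
Visualiser.prototype.getAmplitude = function () {
  var data = new Uint8Array(this.analyser.fftSize);
  this.analyser.getByteTimeDomainData(data);

  // The samples are unsigned bytes (0-255); take the largest and scale it
  // down to the range 0 to 1.
  return Math.max.apply(null, data) / 255;
};
```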
There is one problem with getting the amplitude like this and it’s that, for most audio sources, it tends to change very quickly. When you’re making a visualisation, rapid changes in a value will tend to make things jump and flicker and generally look nasty.
To fix this, let’s interpolate between the values! What we’ll do is store the amplitude when we measure it, and then next time we measure an amplitude, we’ll do a bit of linear interpolation between this new amplitude and the last one. That way, there won’t be as many sharp jumps in value.
We’ll extract this bit of logic into its own little private function called _smoothAmplitude. Given an amplitude, and a linear interpolation factor (the distance the new value is from the last, as a fraction of the difference), the function returns a new, smoother amplitude.
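As a sketch, using the lerp function shown a little further down:

```js
// Interpolate between the previously stored amplitude and the new one,
// then remember the result for next time.
Visualiser.prototype._smoothAmplitude = function (amplitude, factor) {
  var smoothed = lerp(this._lastAmplitude, amplitude, factor);
  this._lastAmplitude = smoothed;
  return smoothed;
};
```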
We can add a new method called getAmplitudeSmooth that makes use of this private smoothing function.
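Something like this; the interpolation factor of 0.5 is an arbitrary choice:

```js
// Run the raw amplitude through the smoothing helper.
Visualiser.prototype.getAmplitudeSmooth = function () {
  return this._smoothAmplitude(this.getAmplitude(), 0.5);
};
```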
There you go; smoother than a fresh jar of Skippy™.
And, if you really have to know - here’s the linear interpolation function. Should be fairly intuitive. I think of it like this: we want a point somewhere between a and b. The point should be some fraction, t, of the way between the two. So we start at a and add on a fraction t (so we need the product) of the distance between a and b (which is equal to b - a).
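In code, that comes out as something like:

```js
// Start at a and move a fraction t of the way towards b.
function lerp (a, b, t) {
  return a + t * (b - a);
}
```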
It’s the moment you’ve all been waiting for… Let’s visualise.
Our visualiser allows us to pass in a custom ‘render function’ that actually does the drawing. Let’s write a simple render function that, say, draws a square in the centre of the canvas that changes its size and colour according to the current amplitude.
The render function is automatically passed a reference to the Visualiser, and we can use this reference to get the rendering context. We can also call any of the methods we defined earlier on Visualiser.prototype to get the amplitude, or do whatever else.
We can make use of the same lerp function as before, this time using the amplitude to control some other property such as colour hue in degrees or width and height in pixels.
This function will be called when each frame is to be drawn.
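Here’s one possible render function along those lines; the 300 by 300 canvas size and the exact hue and size ranges are just my choices:

```js
// One possible render function: a centred square whose hue and size follow
// the current smoothed amplitude. Assumes a 300x300 canvas.
function drawSquare (visualiser) {
  var context = visualiser.renderingContext;
  var amplitude = visualiser.getAmplitudeSmooth();

  // Map the amplitude onto a hue in degrees and a side length in pixels.
  var hue = lerp(0, 360, amplitude);
  var side = lerp(20, 200, amplitude);

  context.clearRect(0, 0, 300, 300);
  context.fillStyle = 'hsl(' + hue + ', 80%, 50%)';
  context.fillRect(150 - side / 2, 150 - side / 2, side, side);
}
```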
Earlier on, we defined our <audio> and <canvas> elements to have id values of ‘in’ and ‘out’, respectively. Now we can pass these ids to the Visualiser constructor to tie these two things together and make a visualiser! Of course, we will also pass in our (sensibly named) render function.
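Wiring it all up might look like this, matching the constructor sketched earlier:

```js
// 'in' is the audio element's id, 'out' the canvas's; drawSquare is our render function.
var visualiser = new Visualiser('in', 'out', drawSquare);
```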
And here’s our final visualisation. This is a very simple example, but using the same basic framework, we can create arbitrarily complex graphics.
What we’ve set up so far works pretty nicely but, to make it even nicer, we could make some small additions.
- Non-linear interpolation (quadratic, cubic, quartic, polynomial)
- Retrieving frequency-domain data from the audio (using Fourier analysis). WebAudio will actually help you with this. Just use AnalyserNode.prototype.getByteFrequencyData or the equivalent float version, getFloatFrequencyData.
- Some extra helper classes/functions for drawing shapes according to the current audio data.
Anyway, enjoy. Here’s another visualisation made using the same Visualiser.