Monthly Archives: June 2010

Taking the swing out of songs

I recently read about a program called Swinger in a music blog. The post’s author, Paul Lamere, describes the program nicely:

The Swinger is a bit of python code that takes any song and makes it swing. It does this by taking each beat and time-stretching the first half of each beat while time-shrinking the second half. It has quite a magical effect.

The description is followed by several amusing audio samples of songs in “straight” rhythm turned into swing rhythm by the program.

For those unfamiliar with the musical terms, I’ll try to explain a little what the program does. Usually in music, each beat is divided into two notes of equal length. This is “straight” rhythm. You can hear it, for example, in the guitar intro for Every Breath You Take by The Police: all the notes have uniform length. On the other hand, notes can be swung, which means that the second note of the beat comes a little bit later. In other words, the first note of the beat is longer than the second. You can hear it, say, in Personal Jesus by Depeche Mode (not the Johnny Cash version).

Naturally, if you can take a normal song and add a swing to it, you can also do the reverse. Instead of stretching the first half of each beat and shrinking the second half, you shrink the first half and stretch the other. Swinger can already do this. You just need to give it the ratio between the two notes of the beat, and it will stretch and shrink the notes so they will have uniform length.

The song I decided to try and de-swing was Revolution 1 by the Beatles. Here is the result:

Some parts of it are de-swinged pretty well (the first verse, the backing vocals), other parts (e.g. the intro) sound pretty much the same to me as the original.

What’s the “swing ratio” of Revolution? The way I hear the song, each beat is divided into three equal parts, the first note taking 2 parts and the second taking the last one, meaning a ratio of 2:1. Swinger actually takes as a parameter the number x-0.5, where x is the part of the beat taken by the first note. In other words, the parameter is how far the first note extends into the second half of the beat, measured as a fraction of the whole beat. For a 2:1 ratio x is 2/3, so I tried using 0.17. The result didn’t sound very different from the original. I finally settled on 0.25, which corresponds to a ratio of 3:1.

You can notice that the effect is much less striking than the swinging effect shown in the blog. I also tried de-swinging another song (Terrapin by Syd Barrett) but it didn’t have much effect. It looks like de-swinging is harder than en-swinging. Why is that?

My theory is that the beat wasn’t detected 100% correctly in these cases. The algorithm’s result depends greatly on the quality of the beat detection. In fact, if you think about it, en-swinging and de-swinging are actually the same process applied with slightly different time shifts: in both cases, the algorithm alternately stretches and shrinks small chunks of the song. The only difference is in which part of the beat the stretching happens. If the beat isn’t detected correctly, the de-swinging process may make the song swing even more! I suspect that the algorithm has a harder time detecting the beats for songs with swing.
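The claim that a misdetected beat can make de-swinging add swing can be checked with a toy model. The sketch below is not Swinger’s code. It assumes, following the description earlier in the post, that de-swinging with parameter p compresses the first half of each detected beat and stretches the second half, with time measured in beats. The names warp, firstNoteFraction and gridOffset are made up for illustration.

```cpp
#include <cmath>

// Toy model (an assumption, not Swinger's actual algorithm): de-swinging
// with parameter p compresses the first half of every detected beat to
// (0.5 - p) beats and stretches the second half to (0.5 + p) beats.
// t is a time in beats, relative to the detected beat grid.
double warp(double t, double p) {
    double beat = std::floor(t);  // index of the detected beat containing t
    double pos = t - beat;        // position inside that beat, in [0, 1)
    double firstLen = 0.5 - p;    // new length of the first half
    double secondLen = 0.5 + p;   // new length of the second half
    if (pos < 0.5) {
        return beat + pos * (firstLen / 0.5);
    }
    return beat + firstLen + (pos - 0.5) * (secondLen / 0.5);
}

// Fraction of the beat taken by the first note after de-swinging.
// x:          fraction taken by the first note in the original song
// p:          de-swing parameter
// gridOffset: distance between the detected and the true beat grid, in beats
double firstNoteFraction(double x, double p, double gridOffset) {
    double downbeat = warp(0.0 - gridOffset, p);
    double secondNote = warp(x - gridOffset, p);
    return secondNote - downbeat;
}
```

Under these assumptions, a 2:1 swing (x = 2/3) de-swung with p = 0.25 on a correct grid gives a first-note fraction of 0.5, which is straight rhythm. With the grid off by half a beat (gridOffset = 0.5), the same parameters give about 0.83, a much heavier swing than the original. In this model p = 0.17 on a correct grid only moves the fraction to about 0.55, which might be related to why 0.25 sounded better, though that is speculation.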

Another explanation is that when the song is swinging, the rhythm is looser: the different musical parts (vocals, percussion) are less in sync with each other. Simple time-stretching therefore can’t “fix” the rhythm.

std::string is contiguous

You can safely assume that the memory buffer used by std::string is contiguous. Specifically, the address of the string’s first character can be used as the address for the whole string, just like a C-style char array:

std::string str = "foo";
strncpy(&str[0], "bar", 3); // str now contains "bar".

Why is this safe? The current C++ standard (C++03) doesn’t explicitly guarantee that the string is stored contiguously, but it is in all known implementations. Additionally, the next C++ standard (C++0x) will make this guarantee. So the above usage is valid on all present and future C++ implementations.
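One way to sanity-check this on a particular implementation is to compare each character’s address with pointer arithmetic from the first character. The helper name isContiguous is made up for illustration. Passing the check only shows how one implementation behaves for one string, so it is a sketch and not a substitute for the standard’s guarantee.

```cpp
#include <cstddef>
#include <string>

// Returns true if every character of s lives at (address of first char) + i.
// Hypothetical helper for illustration only.
bool isContiguous(const std::string& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (&s[i] != &s[0] + i) {
            return false;
        }
    }
    return true;  // also true for an empty string (nothing to compare)
}
```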

Why is this important? It’s common for functions, especially in the Windows API, to “return” strings by copying them into a buffer passed to the function. Since the memory buffer used by std::string is contiguous, you can safely pass it to such a function after resizing the string to the required size.

A typical usage for Windows API functions:

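// requires <windows.h>, <string> and <stdexcept>
// get required buffer size; with a NULL buffer the call is expected to fail
// with ERROR_BUFFER_OVERFLOW and store the required size in bufSize
DWORD bufSize = 0;
GetComputerNameA(NULL, &bufSize);
if (!bufSize || GetLastError() != ERROR_BUFFER_OVERFLOW) {
  throw std::runtime_error("GetComputerNameA failed");
}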
// bufSize now contains required size of buffer, including null terminator
std::string buf(bufSize, '\0');
if (!GetComputerNameA(&buf[0], &bufSize)) {
  throw std::runtime_error("GetComputerNameA failed");
}
// bufSize now contains actual size of data
buf.resize(bufSize);
// now use buf as a regular std::string

This is cumbersome but actually easier than plain C code, since you don’t have to manage the memory yourself.
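The same “ask for the size, then fill the buffer” pattern can be tried without the Windows API. The sketch below uses std::snprintf as a stand-in for a buffer-filling function, assuming a compiler that provides it (C99 or C++11 library). The function name formatKeyValue is made up for illustration.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

// Builds "label=value" by letting snprintf write directly into the
// std::string's buffer. Hypothetical helper for illustration.
std::string formatKeyValue(const char* label, int value) {
    // First call: no buffer, just ask how many characters are needed
    // (the result does not include the null terminator).
    int len = std::snprintf(NULL, 0, "%s=%d", label, value);
    if (len < 0) {
        throw std::runtime_error("snprintf failed");
    }
    // Allocate room for the characters plus the null terminator.
    std::string buf(static_cast<std::string::size_type>(len) + 1, '\0');
    // Second call: write into the string's contiguous buffer.
    std::snprintf(&buf[0], buf.size(), "%s=%d", label, value);
    // Drop the terminator so buf.size() matches the text length.
    buf.resize(static_cast<std::string::size_type>(len));
    return buf;
}
```

For example, formatKeyValue("x", 42) returns the string "x=42".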

Note that the expression &str[0] is valid only if str isn’t empty. Everything I’ve said also applies to std::wstring, the wide-character version of std::string.
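Both points can be folded into a small helper, sketched here. The name writableBuffer is made up for illustration. It returns a null pointer for an empty string instead of evaluating &str[0], and being a template it works for std::string and std::wstring alike.

```cpp
#include <cstddef>
#include <string>

// Returns a pointer to the string's writable character buffer, or NULL if
// the string is empty (so &s[0] is never evaluated on an empty string).
// Hypothetical helper for illustration.
template <typename String>
typename String::value_type* writableBuffer(String& s) {
    return s.empty() ? NULL : &s[0];
}
```

A caller would resize the string first and then pass writableBuffer(str) together with str.size() to the buffer-filling function.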

References: