Toward Live Drum Separation Using Probabilistic Spectral Clustering Based on the Itakura-Saito Divergence

Eric Battenberg, Victor Huang, and David Wessel

AES 45th Conference on Time-Frequency Processing in Audio

Helsinki, Finland, March 1-4, 2012.


This page contains supplementary information, data, plots, and audio to accompany the above paper, which was presented at the AES 45th Conference on Time-Frequency Processing in Audio, held in Helsinki, Finland, March 1-4, 2012.

Talk Slides

Coming soon:
[Audio Separation Examples]
[Appendix containing derivations of the EM updates]



Ten drum performances were recorded for use in evaluating the drum separation system.  A recording of each track is included below:

We plan to add audio examples so the separation quality can be subjectively evaluated.  For now, we present plots showing the drum-wise activations produced by the system.

For each track, three plots are included: the ground truth original MIDI data, along with separation matrices produced using two different sets of parameters.
KH is the number of head templates used for each drum, while KT is the number of tail templates.  The "Optimal" number of head templates varies per drum and is listed in the paper.
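
To make the role of the templates more concrete, here is a minimal sketch of how a single spectral frame could be compared against head and tail templates using the Itakura-Saito divergence named in the paper title. It is not the system's actual procedure, which is described in the paper. The template names, the four-bin spectra, and the nearest-template selection are all hypothetical and chosen only for illustration.

```python
import math

def itakura_saito(x, y, eps=1e-12):
    """Itakura-Saito divergence d_IS(x | y), summed over frequency bins.

    Per bin: x/y - log(x/y) - 1, which is zero when x == y.
    eps guards against division by zero; its value is an assumption.
    """
    total = 0.0
    for xi, yi in zip(x, y):
        ratio = (xi + eps) / (yi + eps)
        total += ratio - math.log(ratio) - 1.0
    return total

# Hypothetical power-spectrum templates (4 bins each), illustration only.
# A drum with KH=1 and KT=1 would have one head and one tail template.
templates = {
    "bass_head": [9.0, 3.0, 0.5, 0.1],
    "bass_tail": [4.0, 1.0, 0.2, 0.05],
    "snare_head": [1.0, 4.0, 6.0, 2.0],
}

# Hypothetical observed frame, close to the bass head template.
frame = [8.5, 2.8, 0.6, 0.1]

divergences = {name: itakura_saito(frame, t) for name, t in templates.items()}
closest = min(divergences, key=divergences.get)
print(closest)
```

With KT=0, the tail entries would simply be absent from such a template set, so the decaying part of a drum hit would have to be explained by some other template.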

Track 1
Ground Truth
Separation 1 (KH=Optimal and KT=1)
Separation 2 (KH=1 and KT=0)
Track 1 featured only three drums: bass, snare, and closed hi-hat. As the plots show, the separation without a tail template (Separation 2) adds many erroneous ride onsets.


Track 2
Ground Truth
Separation 1 (KH=Optimal and KT=1)
Separation 2 (KH=1 and KT=0)
With Track 2, both separations did quite well, correctly identifying ride onsets.



Track 3
Ground Truth
Separation 1 (KH=Optimal and KT=1)
Separation 2 (KH=1 and KT=0)
Track 3 featured an open hi-hat as opposed to a closed hi-hat. As explained in the paper, both separations added closed hi-hat onsets along with the correct open hi-hat onsets, but the separation without a tail template added erroneous ride onsets as well.


Track 4
Ground Truth
Separation 1 (KH=Optimal and KT=1)
Separation 2 (KH=1 and KT=0)
Track 4, similar to Track 2, featured mostly ride onsets in addition to bass and snare. As expected, both separations did reasonably well.


Track 5
Ground Truth
Separation 1 (KH=Optimal and KT=1)
Separation 2 (KH=1 and KT=0)


Track 6
Ground Truth
Separation 1 (KH=Optimal and KT=1)
Separation 2 (KH=1 and KT=0)


Track 7
Ground Truth
Separation 1 (KH=Optimal and KT=1)
Separation 2 (KH=1 and KT=0)


Track 8
Ground Truth
Separation 1 (KH=Optimal and KT=1)
Separation 2 (KH=1 and KT=0)


Track 9
Ground Truth
Separation 1 (KH=Optimal and KT=1)
Separation 2 (KH=1 and KT=0)


Track 10
Ground Truth
Separation 1 (KH=Optimal and KT=1)
Separation 2 (KH=1 and KT=0)