Groove2Groove

Groove2Groove (Grv2Grv) is an AI system for music accompaniment style transfer. Given two MIDI files – a content input and a style input – it generates a new accompaniment for the first file in the style of the second. For example, we can use it to transfer the style of "Fantastic Voyage" by Lakeside onto "Lithium" by Nirvana, as in this video. Check out our interactive demo for more examples.

Paper

The system is described in our paper:

Ondřej Cífka, Umut Şimşekli and Gaël Richard. "Groove2Groove: One-Shot Music Style Transfer with Supervision from Synthetic Data." IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28:2638–2650, 2020. doi: 10.1109/TASLP.2020.3019642.

Additional resources

We provide the following resources to supplement the paper:

Interactive demo
Source code of the system and the evaluation metrics
Configuration files with hyperparameter settings
Trained model checkpoints
Style interpolation (blending) examples
Dataset used for training and evaluation

Dataset

The Groove2Groove MIDI Dataset is a parallel corpus of synthetic MIDI accompaniments in almost 3,000 different styles. The accompaniments were created from chord charts using the commercial Band-in-a-Box accompaniment generation software as described in the paper. Each chord chart is rendered in at least two different styles, providing pairs of examples for supervised training.

The dataset is available from Zenodo. If you use the data for your research, please cite the paper.

The code used for automating the accompaniment generation is available on GitHub.

MIDI files

The midi directory contains one subdirectory for each part of the dataset:

train contains 5744 MIDI files in 2872 styles (exactly 2 files per style). Each file contains 252 measures (the Band-in-a-Box maximum) following a 2-measure count-in. (Note that 11 of these files are empty due to technical difficulties and are only included in the raw version of the data.)
val and test each contain 1200 files in 40 styles (exactly 30 files per style, 16 measures per file after the count-in). The sets of styles are disjoint from each other and from those in train.
itest is generated from the same chord charts as test, but in 40 styles from the training set.

Each style is actually one of two substyles (meant for the A and B sections of a song) of a Band-in-a-Box style. The two substyles are always in the same part of the dataset. More information about the styles can be found in the file styles.tsv. The chord charts used to generate these MIDI files are described below.

Each set of MIDI files is provided in two versions, each in its own subdirectory:

raw – the raw output of Band-in-a-Box.
fixed – non-empty files only, fixed so that each track has the correct program number.
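
Because fixed keeps only the non-empty files, comparing the two versions of a part should reveal which raw files were dropped. The following is a minimal sketch, not part of the dataset tooling. It assumes that the raw and fixed subdirectories sit directly inside each part's directory (e.g. midi/train/raw), that the MIDI files lie directly inside them, and that a file keeps the same name in both versions. The function name missing_from_fixed is hypothetical.

```python
from pathlib import Path


def missing_from_fixed(part_dir):
    """Return the names of MIDI files found in raw/ but not in fixed/.

    Assumed layout (not confirmed by the dataset description):
        <part_dir>/raw/*.mid
        <part_dir>/fixed/*.mid
    """
    part_dir = Path(part_dir)
    raw = {p.name for p in (part_dir / "raw").glob("*.mid")}
    fixed = {p.name for p in (part_dir / "fixed").glob("*.mid")}
    # Files that exist only in the raw version, in a stable order.
    return sorted(raw - fixed)
```

Under these assumptions, calling missing_from_fixed("midi/train") would be expected to list the empty training files mentioned above.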

The filenames have the form {chart_name}.{style}_{substyle}.mid. The charts_styles_substyles.tsv file lists the chord chart filenames along with the styles and substyles applied to each chord chart.
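
The filename pattern can be split back into its fields, which makes it possible to group files by chord chart. The sketch below is an illustration only, under two assumptions that the dataset description does not confirm: style names contain no dot, and substyle names contain no underscore. The helper names parse_midi_filename and training_pairs are hypothetical, and enumerating ordered pairs is just one possible way to build supervised examples, not necessarily the procedure used in the paper.

```python
from collections import defaultdict
from itertools import permutations


def parse_midi_filename(filename):
    """Split '{chart_name}.{style}_{substyle}.mid' into its three fields."""
    if not filename.endswith(".mid"):
        raise ValueError(f"not a .mid file: {filename!r}")
    stem = filename[: -len(".mid")]
    # Assumption: the style name contains no dot, so the last dot
    # separates the chart name from '{style}_{substyle}'.
    chart_name, sep, style_part = stem.rpartition(".")
    if not sep:
        raise ValueError(f"no chart/style separator in {filename!r}")
    # Assumption: the substyle contains no underscore, so the last
    # underscore separates the style from the substyle.
    style, sep, substyle = style_part.rpartition("_")
    if not sep:
        raise ValueError(f"no style/substyle separator in {filename!r}")
    return chart_name, style, substyle


def training_pairs(filenames):
    """Yield ordered (source, target) pairs of files sharing a chord chart."""
    by_chart = defaultdict(list)
    for name in filenames:
        chart_name, _, _ = parse_midi_filename(name)
        by_chart[chart_name].append(name)
    for names in by_chart.values():
        # Every ordered pair of distinct renderings of the same chart.
        yield from permutations(sorted(names), 2)
```

Splitting from the right keeps the parser tolerant of chart names that contain dots and style names that contain underscores.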

Chord charts

The charts directory is structured similarly to the midi directory and contains the corresponding chord charts. Each set of chord charts is provided in the ABC format (in the abc subdirectory) and the Band-in-a-Box (MGU) format (in the mgu subdirectory). The MGU files are all in the default “ZZJAZZ” style. To enable generation in the A and B substyles, we provide each MGU file in an A variant and a B variant, in which the entire chord chart is a single long A or B section, respectively.

The chord charts were generated using language models trained on the iRb corpus (see the paper for further details). Five-sixths of the chord charts are in major keys and the remaining sixth in minor keys (approximately following the distribution of keys in iRb).