Groove2Groove (Grv2Grv) is an AI system for music accompaniment style transfer. Given two MIDI files – a content input and a style input – it generates a new accompaniment for the first file in the style of the second one. For example, we can use it to transfer the style of Fantastic Voyage by Lakeside onto Lithium by Nirvana, as in this video. Check out our interactive demo for more examples.
The system is described in our paper:
Ondřej Cífka, Umut Şimşekli and Gaël Richard. "Groove2Groove: One-Shot Music Style Transfer with Supervision from Synthetic Data." IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28:2638–2650, 2020. doi: 10.1109/TASLP.2020.3019642.
Additional resources
We provide the following resources to supplement the paper:
- Interactive demo
- Source code of the system and the evaluation metrics
- Configuration files with hyperparameter settings
- Trained model checkpoints
- Style interpolation (blending) examples
- Dataset used for training and evaluation
Dataset
The Groove2Groove MIDI Dataset is a parallel corpus of synthetic MIDI accompaniments in almost 3,000 different styles. The accompaniments were created from chord charts using the commercial Band-in-a-Box accompaniment generation software as described in the paper. Each chord chart is rendered in at least two different styles, providing pairs of examples for supervised training.
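To illustrate how the parallel structure yields supervised examples, here is a minimal sketch that enumerates (source, target) pairs of renderings sharing a chord chart. The function name `same_chart_pairs`, the tuple layout and the example entries are hypothetical and not part of the dataset or the released code; the training procedure described in the paper may select its examples differently.

```python
from collections import defaultdict
from itertools import permutations


def same_chart_pairs(renderings):
    """Yield (source_path, target_path) for renderings of the same chord chart.

    `renderings` is an iterable of (chart, style, path) tuples (an assumed,
    illustrative layout). Only pairs in two different styles are produced.
    """
    by_chart = defaultdict(list)
    for chart, style, path in renderings:
        by_chart[chart].append((style, path))

    for items in by_chart.values():
        # Ordered pairs, so each rendering serves both as source and as target.
        for (style_a, path_a), (style_b, path_b) in permutations(items, 2):
            if style_a != style_b:
                yield path_a, path_b


# Hypothetical entries, for illustration only.
renderings = [
    ("chart_1", "style_x", "first.mid"),
    ("chart_1", "style_y", "second.mid"),
    ("chart_2", "style_x", "third.mid"),  # no partner, so it yields no pair
]
pairs = list(same_chart_pairs(renderings))
```

A chart rendered in only one style among the given entries contributes no pairs, which is why the guarantee of at least two styles per chart matters.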
The dataset is available from Zenodo. If you use the data for your research, please cite the paper.
The code used for automating the accompaniment generation is available on GitHub.
MIDI files
The midi directory contains one subdirectory for each part of the dataset:
- train contains 5744 MIDI files in 2872 styles (exactly 2 files per style). Each file contains 252 measures (the Band-in-a-Box maximum) following a 2-measure count-in. (Note that 11 of these files are empty due to technical difficulties and are only included in the raw version of the data.)
- val and test each contain 1200 files in 40 styles (exactly 30 files per style, 16 bars per file after the count-in). The sets of styles are disjoint from each other and from those in train.
- itest is generated from the same chord charts as test, but in 40 styles from the training set.
Each style is actually one of two substyles (meant for the A and B sections of a song) of a Band-in-a-Box style. The two substyles are always in the same part of the dataset. More information about the styles can be found in the file styles.tsv.
The chord charts used to generate these MIDI files are described below.
Each set of MIDI files is provided in two versions, each in its own subdirectory:
- raw – the raw output of Band-in-a-Box.
- fixed – non-empty files only, fixed so that each track has the correct program number.
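The layout described above can be navigated with a few lines of Python. This is a sketch under an assumption: that the version subdirectory sits directly inside the part subdirectory (midi/{part}/{version}/) and that the MIDI files lie directly inside it. The helper names `midi_directory` and `list_midi_files` are hypothetical.

```python
from pathlib import Path

PARTS = ("train", "val", "test", "itest")
VERSIONS = ("raw", "fixed")


def midi_directory(dataset_root, part, version="fixed"):
    """Build the path of one part/version directory (assumed layout)."""
    if part not in PARTS:
        raise ValueError(f"unknown part: {part!r}")
    if version not in VERSIONS:
        raise ValueError(f"unknown version: {version!r}")
    return Path(dataset_root) / "midi" / part / version


def list_midi_files(dataset_root, part, version="fixed"):
    """Return the .mid files of one part/version, sorted by path."""
    return sorted(midi_directory(dataset_root, part, version).glob("*.mid"))
```

Defaulting to the fixed version skips the empty files and gives correct program numbers; pass `version="raw"` to inspect the unmodified Band-in-a-Box output.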
The filenames have the form {chart_name}.{style}_{substyle}.mid. The charts_styles_substyles.tsv file lists the chord chart filenames along with the styles and substyles applied to each chord chart.
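The filename pattern can be split back into its three fields. The sketch below assumes that style and substyle names contain no dots and that the substyle contains no underscore; it splits from the right, so a chart name containing dots or a style name containing underscores would still parse. The names `Rendering` and `parse_midi_filename` and the example filename are hypothetical.

```python
from pathlib import PurePath
from typing import NamedTuple


class Rendering(NamedTuple):
    chart_name: str
    style: str
    substyle: str


def parse_midi_filename(filename):
    """Split '{chart_name}.{style}_{substyle}.mid' into its three fields."""
    name = PurePath(filename).name  # drop any leading directories
    stem, dot, extension = name.rpartition(".")
    if not dot or extension != "mid":
        raise ValueError(f"not a .mid filename: {filename!r}")

    # The last dot separates the chart name from '{style}_{substyle}'.
    chart_name, dot, style_part = stem.rpartition(".")
    # The last underscore separates the style from the substyle.
    style, underscore, substyle = style_part.rpartition("_")
    if not (dot and underscore and chart_name and style and substyle):
        raise ValueError(f"unexpected filename format: {filename!r}")
    return Rendering(chart_name, style, substyle)


# Hypothetical filename, for illustration only.
example = parse_midi_filename("some_chart.SOMESTYLE_a.mid")
```

For lookups that do not depend on filename conventions, the charts_styles_substyles.tsv file is the more reliable source.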
Chord charts
The charts directory is structured similarly to the midi directory and contains the corresponding chord charts. Each set of chord charts is provided in the ABC format (in the abc subdirectory) and the Band-in-a-Box (MGU) format (in the mgu subdirectory). The MGU files are all in the default “ZZJAZZ” style. To enable generation in the A and B substyles, we provide each MGU file in an A and B variant where the entire chord chart is just one long A or B section, respectively.
The chord charts were generated using language models trained on the iRb corpus (see the paper for further details). Of the chord charts, 5/6 are in major keys and the remaining 1/6 in minor keys (approximately following the distribution of keys in iRb).