Submission
Updates:
- [June 17, 2023] Submission is now closed
- [June 2, 2023] Added final instructions regarding the submission.
What evaluation data is provided?
Systems will be ranked based on their performance on the following evaluation data:
- CHiME-5 evaluation set

  The CHiME-5 evaluation set consists of short single-channel audio segments extracted from the CHiME-5 binaural recordings. Each segment contains noisy speech with up to three simultaneously-active speakers. Only the single-speaker segments of the `eval/1` subset will be used for the first evaluation stage using the DNS-MOS objective performance metrics. The audio samples of the `eval/listening_test` subset will be used for the second evaluation stage (listening test).

- Reverberant LibriCHiME-5 evaluation set

  The reverberant LibriCHiME-5 evaluation set contains single-channel noisy speech mixtures with up to three simultaneously-active speakers. Each mixture is labeled with the clean speech reference signal. This dataset will only be used for the first evaluation stage.
For additional information please refer to the Data and Rules sections.
What do participants need to submit?
Please make sure to follow the instructions below carefully.
Audio files
- Participants must submit the audio signals produced at the output of their speech enhancement system for the CHiME-5 evaluation set (`eval/1` and `eval/listening_test` subsets only) and for the reverberant LibriCHiME-5 evaluation set (`eval/1`, `eval/2`, and `eval/3` subsets). The UDASE task only focuses on single-microphone noise suppression, without addressing speech separation or dereverberation. The output signal thus corresponds to single-microphone and potentially multi-speaker speech with suppressed/attenuated background noise.

- The output signals should have the same number of samples as the noisy speech input signals, and they should be submitted as 16-bit PCM WAV files with a 16 kHz sampling rate.
- The loudness of the submitted audio signals for the `eval/1` and `eval/listening_test` subsets of the CHiME-5 dataset should be normalized to -30 LUFS. Please visit the baseline github repository (see the Baseline section) for instructions regarding how to perform this loudness normalization; a minimal sketch is also given below.
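For illustration only, the sketch below shows one way to normalize an enhanced signal to -30 LUFS and save it as a 16-bit PCM WAV file at 16 kHz, using the third-party `soundfile` and `pyloudnorm` packages. This is not the official baseline procedure, and the `normalize_and_save` helper and file paths are hypothetical; please follow the instructions in the baseline github repository for the submitted signals.

```python
# Illustrative sketch (not the official baseline procedure): normalize an
# enhanced signal to -30 LUFS and write it as a 16-bit PCM WAV file.
# Assumes the third-party packages `soundfile` and `pyloudnorm` are installed.
import soundfile as sf
import pyloudnorm as pyln

TARGET_LUFS = -30.0  # required loudness for the CHiME-5 eval/1 and eval/listening_test outputs

def normalize_and_save(input_path, output_path):
    audio, rate = sf.read(input_path)            # enhanced signal, expected at 16 kHz
    meter = pyln.Meter(rate)                     # ITU-R BS.1770 loudness meter
    loudness = meter.integrated_loudness(audio)  # measured integrated loudness (LUFS)
    normalized = pyln.normalize.loudness(audio, loudness, TARGET_LUFS)
    sf.write(output_path, normalized, rate, subtype="PCM_16")  # 16-bit PCM output

# Hypothetical usage on one enhanced file:
# normalize_and_save("enhanced/<mix ID>.wav", "audio/CHiME-5/eval/1/<mix ID>_output.wav")
```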
CSV files
- The participants are kindly asked to compute the objective performance metrics and submit two CSV files:

  - The first CSV file should contain the DNS-MOS scores for each example of the `eval/1` subset of the CHiME-5 evaluation set (3013 examples in total). It should be formatted as follows:

    | subset | input_file_name | output_file_name | SIG_MOS | BAK_MOS | OVR_MOS |
    |--------|-----------------|---------------------|---------|---------|---------|
    | eval/1 | <mix ID>.wav    | <mix ID>_output.wav | ⋮       | ⋮       | ⋮       |

    Participants are asked to normalize their signals to -30 LUFS before computing the DNS-MOS performance scores (see the baseline github repository for instructions regarding how to perform this loudness normalization). The motivation for this normalization is that DNS-MOS (especially the SIG and BAK scores) is very sensitive to the overall loudness of the input signal, which would make it difficult to compare different systems without a common normalization procedure.
  - The second CSV file should contain the SI-SDR scores for each example of the reverberant LibriCHiME-5 evaluation set (1952 examples in total). It should be formatted as follows:

    | subset                   | input_file_name  | output_file_name    | SI-SDR |
    |--------------------------|------------------|---------------------|--------|
    | eval/1, eval/2 or eval/3 | <mix ID>_mix.wav | <mix ID>_output.wav | ⋮      |
- Instructions and tools to compute the DNS-MOS and the SI-SDR metrics and to generate the CSV files are available in the baseline github repository (see the Baseline section). The provided numbers will be verified by the organizers by scoring the submitted audio signals using the scoring functions in the baseline github repository. It is therefore important that participants verify that their evaluation is reproducible with the provided tools. A minimal sketch of the SI-SDR definition and of the CSV layout is given at the end of this section.
- Optionally, participants may also submit a third CSV file containing the SI-SDR scores for each example of the following subsets of the LibriMix dataset:

  - `Libri2Mix/wav16k/max/test/mix_single` (3000 single-speaker examples);
  - `Libri2Mix/wav16k/max/test/mix_both` (3000 2-speaker examples);
  - `Libri3Mix/wav16k/max/test/mix_both` (3000 3-speaker examples).
  SI-SDR results on LibriMix will not be used to rank systems, as this would not be consistent with the purpose of the UDASE task. They will only be used to compare performance on the (close to) in-domain and out-of-domain datasets.
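For reference, the sketch below illustrates the standard scale-invariant SDR definition and one row of the SI-SDR CSV file described above. It is not the official scoring script: the submitted CSV files should be generated with the tools from the baseline github repository, and the clean reference and output paths used here are placeholders.

```python
# Illustrative sketch (not the official scoring script): compute SI-SDR for one
# example of the reverberant LibriCHiME-5 evaluation set and append a row to
# the results CSV file. Reference/output paths are placeholders.
import csv
import numpy as np
import soundfile as sf

def si_sdr(reference, estimate, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio (dB) between a clean reference and an estimate."""
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference   # scaled reference (target component of the estimate)
    noise = estimate - target    # residual not explained by the target
    return 10.0 * np.log10((np.sum(target**2) + eps) / (np.sum(noise**2) + eps))

reference, _ = sf.read("path/to/clean_reference.wav")  # placeholder: clean speech reference
estimate, _ = sf.read("path/to/<mix ID>_output.wav")   # placeholder: system output

# Columns: subset, input_file_name, output_file_name, SI-SDR
with open("csv/reverberant-LibriCHiME-5/results.csv", "a", newline="") as f:
    csv.writer(f).writerow(["eval/1", "<mix ID>_mix.wav", "<mix ID>_output.wav",
                            si_sdr(reference, estimate)])
```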
Technical report
- Each submission should include a technical report in the form of a two-page extended abstract (references can extend onto a third page). This technical report needs to be sufficiently complete for the organizers to judge whether the submitted system complies with the task rules. It should include:
- an abstract;
- an introduction;
- a description of the methodology and the experimental setup (including a description of the system, the model architecture, and any external/pre-existing tools/software that might have been used);
- a presentation and discussion of the results obtained following the official task rules, and potentially additional results;
- a conclusion and references.
- In addition to the final results obtained with the submitted system, participants are asked to provide in the technical report any information that could help understand and justify the choices made during the development of the system. This may for instance take the form of an ablation study. For systems that perform unsupervised adaptation from a fully-supervised model, participants are expected to show that the final model, after unsupervised adaptation on the unlabeled CHiME-5 data, performs better than the same model trained only on the labeled LibriMix data.
- See here for additional information (including a LaTeX author kit) and for submitting the technical report. In particular, please read the “CHiME-7 challenge papers” section carefully.
Naming and packaging of the submission
Please make sure to follow the instructions below carefully.
- The output audio signals and the CSV files should be placed in a directory with the following structure:

```
.
├── audio
│   ├── CHiME-5
│   │   └── eval
│   │       ├── 1
│   │       │   ├── <mix ID>_output.wav
│   │       │   ├── ...
│   │       └── listening_test
│   │           ├── <mix ID>_output.wav
│   │           ├── ...
│   └── reverberant-LibriCHiME-5
│       └── eval
│           ├── 1
│           │   ├── <mix ID>_output.wav
│           │   ├── ...
│           ├── 2
│           │   ├── <mix ID>_output.wav
│           │   ├── ...
│           └── 3
│               ├── <mix ID>_output.wav
│               ├── ...
└── csv
    ├── CHiME-5
    │   └── results.csv
    ├── LibriMix
    │   └── results.csv
    └── reverberant-LibriCHiME-5
        └── results.csv
```
where

  - `audio/CHiME-5/eval/1` contains 3013 files;
  - `audio/CHiME-5/eval/listening_test` contains 241 files;
  - `audio/reverberant-LibriCHiME-5/eval/1` contains 1394 files;
  - `audio/reverberant-LibriCHiME-5/eval/2` contains 494 files;
  - `audio/reverberant-LibriCHiME-5/eval/3` contains 64 files.
- Providing the `csv/LibriMix/results.csv` file in the submitted directory is optional.

- Output signals should be named using the following conventions:
  - For each noisy speech input signal named `<mix ID>.wav` in the CHiME-5 evaluation set, the corresponding output signal should be named `<mix ID>_output.wav`. For example, the output file associated with an input file with relative path `eval/1/<mix ID>.wav` in the CHiME-5 eval set should be placed at `audio/CHiME-5/eval/1/<mix ID>_output.wav` in the submitted directory.

  - For each noisy speech input signal named `<mix ID>_mix.wav` in the reverberant LibriCHiME-5 evaluation set, the corresponding output signal should be named `<mix ID>_output.wav`. For instance, the output file associated with an input file with relative path `eval/1/<mix ID>_mix.wav` in the reverberant LibriCHiME-5 eval set should be placed at `audio/reverberant-LibriCHiME-5/eval/1/<mix ID>_output.wav` in the submitted directory.
- The directory containing the output audio signals and the CSV files (`.` in the above directory tree structure) should be packaged using `zip`, `tar`, or any other standard packaging tool.

- An example submission zip file (including audio and CSV files) is available here.
- Before submission, please verify that your submission directory is valid using the `check_submission.py` script available in the baseline github repository (see the Baseline section). Your submission should be valid if you do not receive any error message when running `python check_submission.py path/to/the/submission/directory`. A quick file-count self-check that can be run beforehand is sketched below.
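As a complementary precaution, the sketch below counts the expected `*_output.wav` files per subset (using the file counts listed above) and then packages the submission directory as a zip archive with the Python standard library. It is only a rough self-check and does not replace the official `check_submission.py` script; the `check_and_pack` helper is hypothetical.

```python
# Rough self-check (not a replacement for check_submission.py): verify the
# expected number of output WAV files per subset, then zip the submission
# directory with the standard library.
import shutil
from pathlib import Path

EXPECTED = {
    "audio/CHiME-5/eval/1": 3013,
    "audio/CHiME-5/eval/listening_test": 241,
    "audio/reverberant-LibriCHiME-5/eval/1": 1394,
    "audio/reverberant-LibriCHiME-5/eval/2": 494,
    "audio/reverberant-LibriCHiME-5/eval/3": 64,
}

def check_and_pack(submission_dir):
    root = Path(submission_dir)
    for subdir, expected in EXPECTED.items():
        n_found = len(list((root / subdir).glob("*_output.wav")))
        status = "OK" if n_found == expected else "CHECK"
        print(f"{subdir}: {n_found}/{expected} files [{status}]")
    # Package the whole directory; zip is used here, but tar is also accepted.
    shutil.make_archive(str(root), "zip", root_dir=root)

check_and_pack("path/to/the/submission/directory")
```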
How to submit?
- Please first submit your technical report (extended abstract) following the instructions here. You will obtain a CMT paper ID, which is required to fill out the submission form below.
- The submission must be made using this Google form. You will be asked to provide:
- general information about the submission (corresponding authors, submission name, team profile, CMT paper ID, etc.);
- general information about the submitted system;
- the average performance scores of the submitted system;
- a link to download the packed directory containing the submitted audio signals and CSV files.
- Considering the provided evaluation dataset, the size of the packed directory should be about 700 MB. Participants can use any service such as Google Drive, WeTransfer, or similar to upload their results.
Submission deadline
See Important dates