Note that submission is now closed


  1. We will release the evaluation data on the date listed as "CHiME-7 DASR evaluation data release" in Important Dates.

  2. Please register in order to receive the evaluation data and also Mixer 6 data for training.
  3. Before the CHiME-7 DASR submission deadline (AoE) listed in Important Dates, you are required to upload a short system description paper (2 pages + references) via CMT. In parallel, participants must submit their JSON predictions for both the dev and eval sets using this Google form. Note that you can upload predictions from up to 3 systems for each track.
  4. Shortly afterwards we will confidentially communicate the final scores on the evaluation set to every participant.
    The full leaderboard and the winners will, however, be announced at the Workshop.
  5. Every participant should present their work at the CHiME-7 Interspeech Satellite Workshop in Dublin. Exceptions can be made if in-person participation is impossible, but we strongly encourage attending in person. See Important Dates for the exact date and arrange your trip accordingly.
  6. After the Workshop, participants are invited to submit a full-length paper (3 to 6 pages, including references), which will appear in the Workshop proceedings and will be indexed (see the next section for more info).

CHiME-2023 Interspeech Satellite Workshop

All participants are expected to present their work at the CHiME-2023 workshop, no matter the results. This challenge is a community-wide effort to tackle a very difficult problem, and we are very interested in discovering your approaches!
See the Workshop page for more details. We expect you to participate, so please plan ahead for potential travel issues. If participation is impossible we can make exceptions, but attendance is highly encouraged.

Participants are highly encouraged to extend their system description paper and submit a full paper (3 to 6 pages, including one page for references only) after the workshop (see Important Dates). The paper will be included in the ISCA archive and appear in the workshop proceedings. The proceedings will be registered (ISBN) and indexed by Thomson-Reuters and Elsevier.


Please register by filling in this Google Form.
Submit one form per team, electing one contact person.
Make sure you get the confirmation email.

When to Submit ? When will Evaluation data be available ?

See the Important Dates page for the CHiME-7 DASR submission deadline and for the date on which the evaluation data will be made available to participants.

What Evaluation Data will be provided ?

As explained in the Data page, the only truly blind evaluation data in this Task is the Mixer 6 Speech evaluation set. On 12 June it will be provided to participants who have requested the Mixer 6 Speech data via the procedure described in the Data page.
At the same time we will instruct participants on how to generate the additional evaluation audio files for the CHiME-6 and DiPCo partitions via the data generation scripts provided with the Baseline.

Again, for each dataset we will only provide audio recordings from the far-field microphones as well as the universal evaluation map (.uem) files. As explained in the Data page, the UEM indicates the start and stop of the portion of each recording on which your submission will be evaluated. The other parts will be ignored when computing all metrics.
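To make the role of the UEM concrete, here is a minimal sketch of how predicted segments could be restricted to the scored region. It assumes the common four-column UEM layout (`<session> <channel> <start> <end>`, times in seconds), which is not spelled out on this page; check the files you receive. The function names `parse_uem` and `clip_to_uem` are hypothetical helpers, not part of the baseline.

```python
from collections import defaultdict


def parse_uem(text):
    """Parse UEM text into {session_id: [(start, end), ...]}.

    Assumption: each non-empty line is '<session> <channel> <start> <end>'.
    """
    regions = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):  # skip blanks and comment lines
            continue
        session, _channel, start, end = line.split()[:4]
        regions[session].append((float(start), float(end)))
    return dict(regions)


def clip_to_uem(segment, regions):
    """Return the parts of a (start, end) segment that lie inside scored regions."""
    seg_start, seg_end = segment
    clipped = []
    for reg_start, reg_end in regions:
        start, end = max(seg_start, reg_start), min(seg_end, reg_end)
        if start < end:  # keep only non-empty overlaps
            clipped.append((start, end))
    return clipped


# Made-up UEM line for illustration only.
uem = parse_uem("S02 1 40.00 7000.00\n")
print(clip_to_uem((38.5, 43.82), uem["S02"]))
```

Anything a system outputs outside the UEM region is ignored by the metrics, so clipping is optional; the sketch is mainly useful for sanity-checking timestamps.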

Oracle diarization annotation will also be provided, in the form of a JSON file with the same formatting as specified in the Data page:

        "end_time": "43.82",
        "start_time": "40.60",
        "words": "placeholder",
        "speaker": "P05",
        "session_id": "S02"

Note, however, that the words field is set to “placeholder” for all utterances. This oracle diarization annotation can only be used when submitting to the acoustic robustness sub-track. It cannot be used for submissions to the main track (see the Rules page). For the main track you are also required to implement a diarization system (it may be implicit in the ASR model, as is done e.g. by Whisper).
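For the acoustic robustness sub-track, the oracle annotation can be turned into per-speaker segment lists before decoding. The following is a sketch only: it assumes the file is a JSON list of objects with the keys shown above, and `load_oracle_segments` is a hypothetical helper name.

```python
import json
from collections import defaultdict


def load_oracle_segments(json_text):
    """Group oracle segments as {(session_id, speaker): [(start, end), ...]}.

    Assumption: the JSON is a list of utterance objects with string timestamps.
    """
    segments = defaultdict(list)
    for utt in json.loads(json_text):
        key = (utt["session_id"], utt["speaker"])
        segments[key].append((float(utt["start_time"]), float(utt["end_time"])))
    for spans in segments.values():
        spans.sort()  # chronological order within each speaker
    return dict(segments)


# Illustrative input using the example entry from this page.
example = json.dumps([
    {"end_time": "43.82", "start_time": "40.60", "words": "placeholder",
     "speaker": "P05", "session_id": "S02"},
])
print(load_oracle_segments(example))
```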

What Participants need to Submit and How ?

By this Task's final submission deadline, participants should submit both a technical description paper (2 pages + references) and their predictions for the Task evaluation and development sets.

System Predictions

As explained briefly in the Main page, participants have to produce, for each session, a JSON annotation with the same formatting as the transcription_scoring annotation of the train or dev partitions, as outlined in the Data page. It therefore contains both diarization and transcription information.
Participants need to produce ONE JSON file for each of the three scenarios, named chime6.json, dipco.json and mixer6.json. These should be placed in the same folder, with one folder for dev and one for evaluation (6 JSONs in total):

├── dev
│   ├── chime6.json
│   ├── dipco.json
│   └── mixer6.json
└── eval
    ├── chime6.json
    ├── dipco.json
    └── mixer6.json
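A quick way to confirm that a submission folder matches this layout is sketched below. It only checks that the six files exist; `missing_files` and the folder name `my_submission` are hypothetical.

```python
from pathlib import Path

EXPECTED_SPLITS = ("dev", "eval")
EXPECTED_FILES = ("chime6.json", "dipco.json", "mixer6.json")


def missing_files(root):
    """Return the relative paths of expected JSON files that are absent under root."""
    root = Path(root)
    return [
        f"{split}/{name}"
        for split in EXPECTED_SPLITS
        for name in EXPECTED_FILES
        if not (root / split / name).is_file()
    ]


# An empty list means all six files are in place.
print(missing_files("my_submission"))
```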

These JSON files must contain predictions for each session of each scenario (CHiME-6, DiPCo and Mixer6) in the development/evaluation sets. The order of the utterances or sessions inside each JSON file does not matter. Again, each entry should have the following keys and formatting:

        "end_time": "43.82",
        "start_time": "40.60",
        "words": "my predicted format",
        "speaker": "P05",
        "session_id": "S02"

You are responsible for text normalization when submitting the predictions. We won’t normalize them for you! Remember that we will score against transcription_scoring! You are free to use the baseline normalization or another one.
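Before uploading, it can help to check that every entry carries the required keys. The sketch below is not the official scoring script. It assumes each file is a JSON list of utterance objects and that timestamps are numeric strings in seconds with the end after the start, as in the example above; `validate_predictions` is a hypothetical helper.

```python
REQUIRED_KEYS = {"end_time", "start_time", "words", "speaker", "session_id"}


def validate_predictions(utterances):
    """Return a list of human-readable problems found in the prediction entries."""
    problems = []
    for i, utt in enumerate(utterances):
        missing = REQUIRED_KEYS - utt.keys()
        if missing:
            problems.append(f"entry {i}: missing keys {sorted(missing)}")
            continue
        try:
            start, end = float(utt["start_time"]), float(utt["end_time"])
        except (TypeError, ValueError):
            problems.append(f"entry {i}: timestamps are not numeric")
            continue
        if end <= start:
            problems.append(f"entry {i}: end_time is not after start_time")
    return problems


entry = {"end_time": "43.82", "start_time": "40.60", "words": "my predicted format",
         "speaker": "P05", "session_id": "S02"}
print(validate_predictions([entry]))
```

An empty list means no problems were detected by these particular checks; it does not guarantee that the file will score correctly.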

You can also look at how the baseline system creates these JSON files: see stage 4 in asr1/ as well as the script used to score the submissions, local/. We encourage you to test the scoring of your system on the development set, for which you have the ground truth available. The scoring script is also described with the baseline system.

Technical Report

In addition, participants should submit a 2-page (+ references) paper describing the system they devised to tackle this challenge.

You should submit this short system description paper via the CHiME Workshop CMT, selecting the Challenge Task1: DASR Track.

One requirement for this paper is to detail as precisely as possible the datasets and pre-trained models used to develop the system, the data augmentation techniques used, and the amount of data in hours (or in words for text-based data). If on-the-fly data augmentation is used, this also has to be specified.

This technical paper will be used by the Organizers to assess whether the Task rules have been complied with.
If a submission does not comply with the rules, it will not be considered for the final ranking, but the team is still invited to present their work at the workshop and submit a full-length paper afterwards.

Full Team Submission Paper

Each participant is invited to submit a full-length paper after the workshop (4 to 6 pages, references included). Again, participants can use CMT for this full-length system paper as well.

This paper will be due in September, after the workshop; the exact date is given in the Important Dates page.
As mentioned above, this work will be included in the ISCA archive and appear in the workshop proceedings. The proceedings will be registered (ISBN) and indexed by Thomson-Reuters and Elsevier, so your work will be highly visible.

Where to Submit ?

You can submit your system output using this Google Form, in which you also have to specify the CMT paper ID of the short system description paper related to the submission.
Please ensure only one submission per team.

Each team should upload their transcriptions (in the format and folder structure specified above) using this Google Form.

Each participant can submit up to 3 different sets of transcriptions for each track (they can belong to different systems).
E.g., for the main track evaluation set, you can upload a zip of a folder containing three different sets of transcriptions, which should be distinguished by different names, e.g. main_track_<submission_tag>.

An example of two submitted systems (2 out of a maximum of 3) for the main track eval set could look as follows.
You can follow the same structure for the sub-track, and for both eval and dev sets.

├── main_track_sys1
│   ├── chime6.json
│   ├── dipco.json
│   └── mixer6.json
└── main_track_sys2
    ├── chime6.json
    ├── dipco.json
    └── mixer6.json
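Since the transcriptions are uploaded as a zip of such a folder, the packaging step can be sketched as follows. This is one possible way to build the archive with the Python standard library, not a required procedure; `zip_submission` and the example paths are hypothetical.

```python
import zipfile
from pathlib import Path


def zip_submission(root, archive_path):
    """Zip every JSON file under root, keeping paths relative to root."""
    root = Path(root)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*.json")):
            # Store e.g. 'main_track_sys1/chime6.json' rather than an absolute path.
            archive.write(path, path.relative_to(root).as_posix())
    return archive_path
```

For instance, `zip_submission("eval_main_track", "eval_main_track.zip")` would pack the two system folders shown above into a single archive, assuming they sit under a folder named `eval_main_track`.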