Our challenge follows the rules set by the CHiME-8 DASR track, along with some additional guidelines specific to the NOTSOFAR-1 challenge, detailed below.

Shared rules between NOTSOFAR-1 and CHiME-8 DASR

  • External data and pre-trained models specified on the Rules page may be used. We follow DASR’s unconstrained LM track rules with pre-trained LMs allowed.
  • Other datasets and models can be proposed until March 15th (see Additional Models).
  • Close-talk microphones cannot be used during inference.
  • Self-supervised adaptation on evaluation data is allowed, but you need to perform it independently for each session.
  • We do not require participants to open-source their systems, but doing so is highly encouraged.
  • We require each participant to submit a 2- to 6-page description paper (see Submission section).
  • Systems will be ranked by tcpWER (see Challenge Tracks), computed with a 5-second collar and a text normalization developed for this challenge. The normalization aims to improve compatibility across different ASR systems by removing filler words (‘hmm’, ‘uh’, ‘ah’, ‘eh’) and replacing numerals with their spelled-out forms. It is applied to both the hypothesis and the reference transcripts.
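
To illustrate the two normalization steps named in the last rule above, here is a minimal Python sketch. It is not the challenge's official normalizer: the function names `spell_number` and `normalize`, the lowercasing, the simple tokenization, and the number-spelling style (for example "twenty five" without a hyphen) are all assumptions made for illustration. Only the filler-word list and the idea of spelling out numerals come from the rules.

```python
import re

# Filler words listed in the rules.
FILLERS = {"hmm", "uh", "ah", "eh"}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def spell_number(n: int) -> str:
    """Spell out an integer in [0, 999999]. The wording style is an assumption."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, rest = divmod(n, 10)
        return TENS[tens] + ("" if rest == 0 else " " + ONES[rest])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = "" if rest == 0 else " " + spell_number(rest)
        return ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = "" if rest == 0 else " " + spell_number(rest)
    return spell_number(thousands) + " thousand" + tail


def normalize(text: str) -> str:
    """Hypothetical normalizer: drop filler words, spell out numerals."""
    # Assumption: lowercase and keep only letters, digits, and apostrophes.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    out = []
    for tok in tokens:
        if tok in FILLERS:
            continue  # remove filler words
        if tok.isdigit() and int(tok) < 1_000_000:
            out.append(spell_number(int(tok)))  # numeral -> words
        else:
            out.append(tok)  # larger numbers and mixed tokens are left as-is
    return " ".join(out)


# The same function is applied to both hypothesis and reference text.
hyp = normalize("Uh, we have 25 chairs, hmm.")
ref = normalize("We have twenty five chairs.")
```

In this sketch the hypothesis and reference above normalize to the same string, so a filler word or a digit-versus-word mismatch would not count as an error. The official scoring uses the challenge's own normalization, which may differ in every detail shown here.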

Rules exclusive to NOTSOFAR-1

  • During inference, algorithms may use information from a single device only (a single-device recording is also referred to as a ‘session’).
  • You are encouraged to incorporate a priori knowledge of the multi-channel array’s geometry, including during inference.
  • Solutions deemed practical and efficient will be highlighted in a separate leaderboard.
  • We work closely with the CHiME-8 DASR challenge, which offers a geometry-agnostic multi-channel track. Every geometry-agnostic system submitted to that track will automatically yield a result in our geometry-specific multi-channel track on the NOTSOFAR meeting dataset (see Scientific Goals).