Results

On this page, we present the results of all submitted systems compared with the baseline we published for the challenge. We would like to thank all participants for their entries, which have not only improved system performance but also introduced interesting methods for handling conversations on smart glasses.

As previously announced, we have grouped the systems into four categories based on their mean latency. For details on how the mean latency is computed, please refer to the Rules page. The categories are defined by latency thresholds of 150 ms, 350 ms, and 1000 ms. The last category, with latency over 1000 ms, includes non-streaming systems. The systems were evaluated using multi-talker word error rate (WER), computed separately for SELF (the wearer of the glasses) and OTHER (the conversation partner). The multi-talker WER also allows us to analyze speaker attribution rates separately; these are reported in the tables below as Self ATTR and Other ATTR.
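The category assignment described above can be sketched in a few lines of Python. This is only an illustration, not the organizers' evaluation code: the function name `latency_category`, the label strings, and the use of `None` to mark a non-streaming system are all hypothetical, and the sketch assumes each threshold is an inclusive upper bound (so exactly 150 ms falls in the first category, matching the "<=150ms" heading).

```python
from bisect import bisect_left
from typing import Optional

# Thresholds in milliseconds, as stated in the text.
# Assumption: each threshold is an inclusive upper bound for its category.
THRESHOLDS_MS = (150, 350, 1000)

# Hypothetical labels mirroring the four category headings on this page.
CATEGORIES = ("<=150 ms", "150-350 ms", "350-1000 ms", ">1000 ms")


def latency_category(mean_latency_ms: Optional[float]) -> str:
    """Map a system's mean latency to one of the four categories.

    Passing None marks a non-streaming system (an assumption of this
    sketch); such systems belong to the last category.
    """
    if mean_latency_ms is None:
        return CATEGORIES[-1]
    # bisect_left counts the thresholds strictly below the latency,
    # which is exactly the index of the category with inclusive bounds.
    return CATEGORIES[bisect_left(THRESHOLDS_MS, mean_latency_ms)]


if __name__ == "__main__":
    for latency in (120, 150, 200, 999.5, 1000, 1200, None):
        print(latency, "->", latency_category(latency))
```

If the official rules treat the boundaries as exclusive, replacing `bisect_left` with `bisect_right` would move a latency of exactly 150 ms into the second category.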

Overall results

First, we present the results of all submitted systems across all categories. Please note that non-streaming systems are shown in the scatter plot at a latency of 1500 ms to simplify visualization.

Rank | Team Name | System Name | Category | Overall WER [%] | Self WER [%] | Other WER [%] | Self ATTR [%] | Other ATTR [%] | Latency [ms]

Results per category

The results for each of the four latency categories are presented below.

Mean latency <= 150 ms

Rank | Team Name | System Name | Overall WER [%] | Self WER [%] | Other WER [%] | Self ATTR [%] | Other ATTR [%] | Latency [ms]

Mean latency 150 - 350 ms

Rank | Team Name | System Name | Overall WER [%] | Self WER [%] | Other WER [%] | Self ATTR [%] | Other ATTR [%] | Latency [ms]

Mean latency 350 - 1000 ms

Rank | Team Name | System Name | Overall WER [%] | Self WER [%] | Other WER [%] | Self ATTR [%] | Other ATTR [%] | Latency [ms]

Mean latency > 1000 ms (incl. non-streaming)

Rank | Team Name | System Name | Overall WER [%] | Self WER [%] | Other WER [%] | Self ATTR [%] | Other ATTR [%] | Latency [ms]