Posted on 2024-10-01, 12:11. Authored by Tejumade Afonja, Tobi Olatunji, Sewade Ogun, Naome A Etori, Abraham Owodunni, Moshood Yekini
Recent strides in automatic speech recognition (ASR) have accelerated its application in the medical domain, where performance on accented medical named entities (NE), such as drug names, diagnoses, and lab results, is largely unknown. We rigorously evaluate multiple ASR models on a clinical English dataset of 93 African accents. Our analysis reveals that although some models achieve low overall word error rates (WER), error rates on clinical entities are higher, potentially posing substantial risks to patient safety. To demonstrate this empirically, we extract clinical entities from transcripts, develop a novel algorithm to align ASR predictions with these entities, and compute medical NE recall, medical WER, and character error rate. Our results show that fine-tuning on accented clinical speech improves medical WER by a wide margin (25-34% relative), improving the models' practical applicability in healthcare environments.
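To make the metrics named in the abstract concrete, the following is a minimal Python sketch of WER, character error rate, and a naive entity recall. It is not the authors' alignment algorithm: the function names (`edit_distance`, `wer`, `cer`, `entity_recall`) are hypothetical, entity matching is a simple exact-match check assumed here for illustration, and the example sentence is made up rather than drawn from the dataset.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref: str, hyp: str) -> float:
    """Word error rate: word-level edit distance over reference length."""
    ref_words = ref.lower().split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp.lower().split()) / len(ref_words)


def cer(ref: str, hyp: str) -> float:
    """Character error rate: character-level edit distance over reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref.lower(), hyp.lower()) / len(ref)


def entity_recall(entities: Sequence[str], hyp: str) -> float:
    """Fraction of reference entities found verbatim in the hypothesis.

    Assumption: exact whole-word matching, a simplification of the
    entity alignment described in the abstract.
    """
    if not entities:
        raise ValueError("entities must be non-empty")
    padded = " " + " ".join(hyp.lower().split()) + " "
    found = sum(1 for e in entities if f" {e.lower()} " in padded)
    return found / len(entities)


if __name__ == "__main__":
    # Made-up example: one drug name is split by the recognizer.
    ref = "patient started on metformin and lisinopril"
    hyp = "patient started on met forming and lisinopril"
    print(f"WER: {wer(ref, hyp):.3f}")
    print(f"CER: {cer(ref, hyp):.3f}")
    print(f"Entity recall: {entity_recall(['metformin', 'lisinopril'], hyp):.2f}")
```

In this example only two of the six reference words are affected, yet half of the clinical entities are lost, which illustrates how a moderate overall WER can hide a much larger error rate on medical terms.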
Primary Research Area
Trustworthy Information Processing
Name of Conference
INTERSPEECH (ISCA)
Page Range
2315-2319
Publisher
International Speech Communication Association
Open Access Type
Green
BibTeX
@conference{Afonja:Olatunji:Ogun:Etori:Owodunni:Yekini:2024,
title = "Performant ASR Models for Medical Entities in Accented Speech",
author = "Afonja, Tejumade and Olatunji, Tobi and Ogun, Sewade and Etori, Naome A and Owodunni, Abraham and Yekini, Moshood",
year = 2024,
month = 9,
pages = "2315--2319",
publisher = "International Speech Communication Association",
doi = "10.21437/interspeech.2024-2261"
}