This week I am spotlighting research recently published by our 3M HIS team of artificial intelligence (AI) scientists and researchers. I will give a synopsis of the research conducted and the lessons learned. For a full explanation of each project, take a look at its overview blog or the published technical paper.
The published research falls into three broad categories: automatic speech recognition (ASR) and speech processing, conversation summarization and evaluation, and coding of clinical documents.
Speech processing: The current focus of speech processing is to support conversation transcript generation and virtual assistants.
- Low-resource, low-footprint wake word detection. Wake word detection is a common problem in virtual assistants. In this research, starting from a small amount of wake word data, an efficient wake word detection algorithm is built by distilling the phone recognition capabilities of a large ASR model, in effect compressing it into a small-footprint model that can easily run on mobile devices (a distillation sketch appears after this list).
- Are you dictating to me? Knowing that scribes will summarize the encounter, physicians frequently insert micro-dictations covering specific aspects of the visit, such as the physical exam or the assessment and plan. This research shows how a simple yet effective machine learning (ML) model, built on both audio and textual features, can reliably detect such dictation segments in conversation recordings (a toy classifier along these lines is sketched after this list).
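To make the distillation idea concrete, here is a minimal sketch of training a small student network to mimic a teacher's per-frame phone posteriors. The teacher below is a random stand-in for the large ASR acoustic model used in the research, and the dimensions, data and training loop are illustrative only.

```python
# Minimal distillation sketch: a tiny student learns the teacher's phone posteriors.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_PHONES, NUM_MELS, NUM_FRAMES = 40, 80, 100  # illustrative sizes

class TinyStudent(nn.Module):
    """Small enough to run comfortably on a mobile device."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(NUM_MELS, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, NUM_PHONES, kernel_size=3, padding=1),
        )

    def forward(self, feats):               # feats: (batch, mels, frames)
        return self.net(feats)              # per-frame phone logits

teacher = nn.Conv1d(NUM_MELS, NUM_PHONES, kernel_size=5, padding=2)  # stand-in teacher
student = TinyStudent()
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(5):                       # a few illustrative training steps
    feats = torch.randn(8, NUM_MELS, NUM_FRAMES)       # stand-in log-mel features
    with torch.no_grad():
        teacher_logprobs = F.log_softmax(teacher(feats), dim=1)
    student_logprobs = F.log_softmax(student(feats), dim=1)
    # KL divergence pulls the student's per-frame phone posteriors toward the teacher's.
    loss = F.kl_div(student_logprobs, teacher_logprobs,
                    log_target=True, reduction="batchmean")
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Once distilled, the student's per-frame phone posteriors can be scanned for the wake word's phone sequence, keeping the on-device footprint small.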
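For the dictation-detection item, here is a toy version of the idea: combine bag-of-words text features with a couple of audio-derived features and fit a lightweight classifier. The feature choices, labels and example utterances are purely illustrative and not those used in the paper.

```python
# Toy dictation-vs-conversation classifier combining text and audio features.
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

utterances = [
    "Physical exam period lungs clear to auscultation period",    # dictation-style
    "Assessment and plan colon start metformin comma follow up",  # dictation-style
    "So how have you been feeling since the last visit",          # conversational
    "Yeah the cough got a bit better after the weekend",          # conversational
]
# Hypothetical audio features per utterance: [speaking rate, mean pause length].
audio_features = np.array([[3.8, 0.9], [3.5, 1.1], [2.4, 0.3], [2.2, 0.2]])
labels = [1, 1, 0, 0]                       # 1 = dictation, 0 = conversation

vectorizer = TfidfVectorizer()
text_features = vectorizer.fit_transform(utterances)
features = hstack([text_features, csr_matrix(audio_features)])   # combine modalities

classifier = LogisticRegression().fit(features, labels)
print(classifier.predict(features))         # sanity check on the training examples
```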
Conversation summarization and evaluation: The goal here is to support scribes by efficiently creating clinical documents directly from doctor-patient conversations.
- Leveraging pre-trained language models. This research explores how to use large pre-trained language models effectively to create an abstractive history of present illness (HPI) section from doctor-patient conversations. It also shows the effectiveness of a two-stage summarization training framework: chunks of the conversation are summarized into partial summaries first, and the partial summaries are then summarized into a final report (sketched in code after this list). This work is one of the first to provide state-of-the-art results for this section.
- In-domain pre-training. Off-the-shelf pre-trained language models are not exposed to substantial clinical content. In this work, researchers show that additional pre-training on clinical content markedly improves the results obtained when summarizing clinical conversations.
- Extract and abstract doctor-patient conversations. Doctor-patient conversations can cover many different topics. In this successful experiment, conversation utterances are first categorized as pertaining to different sections/topics of a clinical note; a second stage then summarizes the utterances section by section (see the section-wise sketch after this list).
- Factuality scoring of summaries. Language models can be used to summarize doctor-patient conversations, but they can introduce hallucinated concepts. Dealing with this issue is one of the key challenges that must be overcome before these models can be safely deployed at scale. In this work, we describe existing methods for measuring the degree of factual inconsistency in machine-generated summaries, highlight some of their shortcomings, and provide techniques to address them (an NLI-based scoring sketch follows this list).
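The two-stage framework from the first summarization item can be sketched with an off-the-shelf summarizer standing in for the in-domain models described in the research; the model name, chunking heuristic and length limits below are assumptions made for illustration.

```python
# Minimal sketch of "summarize the chunks, then summarize the summaries."
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def chunk_utterances(utterances, max_chars=1500):
    """Group consecutive utterances into chunks small enough for the model."""
    chunks, current = [], ""
    for utt in utterances:
        if current and len(current) + len(utt) > max_chars:
            chunks.append(current)
            current = ""
        current += " " + utt
    if current:
        chunks.append(current)
    return chunks

def two_stage_summary(utterances):
    # Stage 1: partial summaries of each conversation chunk.
    chunks = chunk_utterances(utterances)
    partials = [summarizer(c, max_length=80, min_length=20)[0]["summary_text"]
                for c in chunks]
    # Stage 2: summarize the concatenated partial summaries into one HPI-style report.
    return summarizer(" ".join(partials), max_length=120, min_length=30)[0]["summary_text"]
```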
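The extract-and-abstract idea can be sketched in a similar way: first route utterances to note sections, then summarize each section. The keyword tagger here is only a placeholder for the trained utterance classifier described in the paper.

```python
# Minimal sketch of extract-then-abstract: tag utterances by section, then summarize.
from collections import defaultdict
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

SECTION_KEYWORDS = {  # hypothetical mapping, for illustration only
    "history of present illness": ["pain", "started", "weeks ago", "symptom"],
    "medications": ["taking", "dose", "prescription", "mg"],
    "assessment and plan": ["plan", "follow up", "refer", "order"],
}

def tag_section(utterance):
    text = utterance.lower()
    for section, keywords in SECTION_KEYWORDS.items():
        if any(k in text for k in keywords):
            return section
    return "history of present illness"     # default bucket

def sectionwise_summary(utterances):
    buckets = defaultdict(list)
    for utt in utterances:                   # extract: route utterances to sections
        buckets[tag_section(utt)].append(utt)
    note = {}
    for section, utts in buckets.items():    # abstract: summarize each section
        note[section] = summarizer(" ".join(utts),
                                   max_length=80, min_length=15)[0]["summary_text"]
    return note
```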
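Finally, one common family of factuality checks scores each summary sentence by whether the source conversation entails it, using a natural language inference (NLI) model. The sketch below uses a generic MNLI model and a simple threshold; it illustrates the general technique rather than the specific methods examined in the paper.

```python
# Minimal NLI-based factual-consistency check for generated summary sentences.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-large-mnli"            # generic NLI model, not the paper's
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()
ENTAIL_ID = model.config.label2id.get("ENTAILMENT", 2)

def entailment_score(source, summary_sentence):
    """Probability that the source conversation entails the summary sentence."""
    inputs = tokenizer(source, summary_sentence, return_tensors="pt",
                       truncation=True, max_length=512)
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)[0]
    return probs[ENTAIL_ID].item()

def factuality_report(source, summary_sentences, threshold=0.5):
    """Flag summary sentences that the conversation does not appear to support."""
    report = []
    for sentence in summary_sentences:
        score = entailment_score(source, sentence)
        flag = "supported" if score >= threshold else "possible hallucination"
        report.append((sentence, round(score, 3), flag))
    return report
```

Sentences falling below the threshold are flagged for review as possible hallucinations.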
Clinical coding: The 3M HIS coding research team’s focus is to assist coders and clinical documentation specialists with ML solutions that provide accurate codes for encounters.
- Effective convolutional attention network (ECAN). Assigning ICD-10 codes to patient encounters is a non-trivial task: this multi-label document classification problem must contend with more than 90,000 codes. In this pioneering work, researchers explore a novel architecture that provides an effective way to assign such codes (a rough sketch of a convolutional-attention classifier follows below).
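To give a feel for this family of models, here is a rough sketch of a convolutional network with per-label attention, in the spirit of the architecture described above. The dimensions, label count and pooling details are illustrative assumptions, not those of the published ECAN model.

```python
# Sketch of convolution over token embeddings plus per-label attention pooling.
import torch
import torch.nn as nn

class ConvLabelAttention(nn.Module):
    def __init__(self, vocab_size, num_labels, embed_dim=100, conv_dim=128, kernel_size=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.conv = nn.Conv1d(embed_dim, conv_dim, kernel_size, padding=kernel_size // 2)
        self.label_queries = nn.Parameter(torch.randn(num_labels, conv_dim))
        self.output = nn.Linear(conv_dim, num_labels)

    def forward(self, token_ids):                        # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)        # (batch, embed_dim, seq_len)
        h = torch.tanh(self.conv(x)).transpose(1, 2)     # (batch, seq_len, conv_dim)
        scores = self.label_queries @ h.transpose(1, 2)  # (batch, num_labels, seq_len)
        attn = scores.softmax(dim=-1)                    # one attention map per code
        context = attn @ h                               # (batch, num_labels, conv_dim)
        # One logit per label: dot each label's context vector with its own weights.
        logits = (context * self.output.weight).sum(-1) + self.output.bias
        return logits                                    # train with BCEWithLogitsLoss

# Toy usage: 4 documents, vocabulary of 5,000 tokens, 50 candidate codes.
model = ConvLabelAttention(vocab_size=5000, num_labels=50)
logits = model(torch.randint(1, 5000, (4, 200)))
probs = torch.sigmoid(logits)                            # independent probability per code
```

Because each code gets its own attention distribution over the document, the model can focus on different phrases when scoring different codes, which is what makes this family of architectures practical for very large code sets.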
Acknowledgement: Thank you to my AI colleagues who helped compile this summary of tech blogs: Longxiang Zhang, Thomas Schaaf, Hua Cheng, Jing Su, John Glover and Arindam Gosh.
I am always looking for feedback, and if you would like me to cover a story, please let me know! Leave me a comment below or ask a question on my blogger profile page.
“Juggy” Jagannathan, PhD, is an AI evangelist with four decades of experience in AI and computer science research.