Correcting and Improving Automatic Transcripts of Video Recordings at UICOM

  • Responsibilities: Instructional Design, Management
  • Target Audience: All stakeholders at University of Illinois College of Medicine (UICOM)
  • Tools Used: Echo360
  • Budget: N/A, internal project
  • Client: University of Illinois College of Medicine (UICOM)
  • Year: 2019-present

Overview

Around 2019, some of our international students at the University of Illinois College of Medicine (UICOM) met with me to talk about the automatic transcripts created for all of our video content in our video storage system, Echo360. Internally, we had turned on the transcripts feature in spring 2019 as a pilot to determine the accuracy of the built-in automatic speech recognition (ASR) tool in the software and figure out how to best provide students with additional ways to consume the video content.

We determined right away that the accuracy of these automatic transcripts was sometimes really poor. After discussions with Echo360 representatives, we found out that the ASR tool they used (which was Google AWS) did not have a built-in dictionary for medical, pharmacologic, and certain higher-level scientific terms. For example, if a faculty member said ‘hypoallergenic’, the system might mishear it and the transcript would show ‘hyper-allergenic’, which is incorrect. Sometimes the word was completely indecipherable.

Here is an example of how the system misheard the term ECG from a faculty member who was speaking: the transcript shows ‘eggs’. After listening to this part of the lecture, it was very clear that he said ECG, not eggs.

It is imperative that we correct these transcripts. Some students have learning disabilities or receive other accommodations. Some faculty have strong accents because English is not their first language, and we have students whose first language is also not English. We need to be able to provide the best education to all students.

Process

Initially, some of our medical students tried to correct the transcripts themselves and soon discovered how long that can take. We determined that a better option, until Echo360 had the appropriate dictionaries in its system, was to hire non-pre-med students to edit the transcripts. I worked with my colleague at UICOM, Dr. Elizabeth Balderas, to interview and hire up to five students to do this work.

Dr. Balderas and I developed a job title and description based on similar existing jobs in the university system. We then published the job via Handshake, a tool for students to find jobs on campus, as well as through other sources like LinkedIn. The title we came up with was Medical Lecture Transcriber. We were able to hire four non-pre-med students who had science backgrounds and one fourth-year medical student to edit transcripts.

I went through our Echo360 lecture recordings and determined which transcripts might need the most improvement. I was able to do this fairly easily because of my familiarity with the faculty and because past student evaluations gave some insight. We used Google Sheets to let the student workers select a specific recording to edit. Many of these student workers were studying various aspects of science, so an added value of editing the transcripts was the opportunity to learn something relevant to their own studies. We also warned the student workers that, depending on how much work a transcript needed, it could easily take 3-4+ hours to edit the transcript of a 1-hour recording.

Results and Takeaways

After running this process during spring term 2020, it became clear that the students were not keeping up with the editing as we wanted and needed. For some of them, the work took longer than expected (though we had told them it would take a long time) and it conflicted with their own student workload. At the end of the semester, we decided to discontinue the process because it was not as successful as we had envisioned. In mid-2021, Echo360 started using a different ASR vendor, which has a better medical, scientific, and pharmacologic dictionary.
