Analysis of Speech Recordings from a Head and Torso Simulator (HATS) with and without Face Coverings using both Spectrogram and Transcription Tools
Disciplines
Architectural Engineering | Engineering | Speech and Rhetorical Studies
Abstract (300 words maximum)
The COVID-19 pandemic led to the adoption of a wide assortment of face coverings around the globe. With such face coverings, however, speech sound levels and intelligibility are often sacrificed for the benefit of reduced transmissibility of the virus. This made communication in public locations more challenging, especially in classroom settings, where one needs to listen to a single talker over a long time period. Prior research at KSU measured and evaluated the sound levels and frequency bands of white noise played out of the artificial mouth (loudspeaker) of an acoustic head and torso simulator (HATS) at the instructor location in an empty 40-seat classroom. Sound level measurements and recordings were made at distances of 2 meters and 6.2 meters, simulating student listeners in the front row and back row, respectively. This procedure was repeated with several popular face coverings, including the standard N95, KN95, various cloth masks, and transparent plastic face shields. In addition, WAV recordings were made with and without masks using short phrases of pre-recorded American English female and male talkers. This directed study seeks a way to make use of the speech recordings made at a 2-meter distance from the HATS. Adobe Audition software was used to improve the playback quality of the speech recordings and to trim the modified files to 10 seconds of speech, all starting with the same spoken words. This software is also being used to create spectrograms to identify the frequencies that are attenuated or amplified when a particular mask is on. Adobe Premiere Pro software is used to transcribe these files. Documenting which words are transcribed correctly or incorrectly will provide a further indication of the effects of various face coverings on speech intelligibility.
Academic department under which the project should be listed
Mechanical Engineering
Primary Investigator (PI) Name
Richard Ruhala
Additional Faculty
Laura Ruhala, Mechanical Engineering, lruhala@kennesaw.edu
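The spectrogram comparison described in the abstract, identifying which frequencies a mask attenuates or amplifies, can be illustrated with a small numerical sketch. This is not the Adobe Audition workflow used in the study; it is a minimal, assumption-laden example that compares the power in a frequency band between an unmasked and a masked signal. The function names (`band_power`, `attenuation_db`, `tone_mix`), the band edges, and the synthetic two-tone test signals are all hypothetical and stand in for the real WAV recordings.

```python
import cmath
import math


def band_power(samples, sample_rate, f_lo, f_hi):
    """Sum the DFT power of `samples` over bins between f_lo and f_hi (Hz)."""
    n = len(samples)
    k_lo = math.ceil(f_lo * n / sample_rate)
    k_hi = math.floor(f_hi * n / sample_rate)
    total = 0.0
    for k in range(k_lo, k_hi + 1):
        # Naive DFT coefficient for bin k; fine for short clips, slow for long ones.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        total += abs(coeff) ** 2
    return total


def attenuation_db(unmasked, masked, sample_rate, f_lo, f_hi):
    """Band level difference in dB. Positive: masked is quieter; negative: louder.

    Assumes the masked signal has nonzero power in the band.
    """
    p_ref = band_power(unmasked, sample_rate, f_lo, f_hi)
    p_mask = band_power(masked, sample_rate, f_lo, f_hi)
    return 10.0 * math.log10(p_ref / p_mask)


def tone_mix(sample_rate, n, tones):
    """Synthetic stand-in for a recording: a sum of (frequency, amplitude) sines."""
    return [sum(a * math.sin(2 * math.pi * f * i / sample_rate) for f, a in tones)
            for i in range(n)]


# Hypothetical data: the "mask" halves the amplitude of the 3000 Hz component only.
FS, N = 8000, 400
unmasked = tone_mix(FS, N, [(1000, 1.0), (3000, 1.0)])
masked = tone_mix(FS, N, [(1000, 1.0), (3000, 0.5)])

low_band = attenuation_db(unmasked, masked, FS, 500, 1500)
high_band = attenuation_db(unmasked, masked, FS, 2500, 3500)
print(f"500-1500 Hz: {low_band:.2f} dB, 2500-3500 Hz: {high_band:.2f} dB")
```

With real recordings, the sample lists would come from the trimmed 10-second WAV files, and the comparison would typically be repeated over standard octave or third-octave bands rather than the two illustrative bands used here.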
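The final step of the abstract, documenting which words are transcribed correctly or incorrectly, could be quantified with a word error count. The sketch below is one possible scoring approach, not the study's documented method: it computes the word-level edit distance (substitutions, insertions, and deletions) between a reference phrase and a transcript. The function names and the example sentences are hypothetical; the actual phrases used in the recordings are not given in the abstract.

```python
import re


def words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def word_errors(reference, hypothesis):
    """Return (edit distance in words, number of reference words)."""
    ref, hyp = words(reference), words(hypothesis)
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j transcript words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # reference word deleted
                            curr[j - 1] + 1,      # extra word inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1], len(ref)


# Hypothetical reference phrase and a hypothetical transcript of a masked recording.
reference = "The birch canoe slid on the smooth planks."
transcript = "The bird canoe slid on smooth planks"

errors, total = word_errors(reference, transcript)
print(f"{errors} word errors out of {total} reference words "
      f"(WER = {errors / total:.2f})")
```

Comparing this error rate across face coverings, for the same talker and phrase, would give a simple numerical companion to the word-by-word documentation.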