Wednesday, April 10, 2024

SpeeChin smart necklace recognizes wearer’s silent speech commands

Speech recognition helps us do everything faster – whether it’s checking emails, creating a document, or playing our favorite songs. But those technologies require audible speech, which is not always appropriate in settings such as a business meeting or a quiet library.

Now, Cornell University’s Asst. Prof. Cheng Zhang and doctoral student Ruidong Zhang have developed SpeeChin, a silent-speech recognition (SSR) device that can identify silent commands using images of skin deformation in the neck and face captured by a neck-mounted infrared (IR) camera.

The experimental device is similar in appearance to NeckFace, a technology Cheng Zhang and his SciFi Lab team members unveiled last year. NeckFace can continuously track full facial expressions by using infrared cameras to capture images of the chin and face from beneath the neck.

Like NeckFace, the SpeeChin features an IR camera mounted on a 3D-printed necklace case, which is hung on a silver chain with the camera pointing up at the wearer’s chin. For increased stability, the developers designed a wing on each side, along with a coin that serves as a weight on its bottom.

The IR camera captures images of the neck and face from under the chin. These images are first pre-processed and then fed to an end-to-end deep convolutional recurrent neural network (CRNN) model, which infers the silent speech command being made.
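To make the convolutional-recurrent idea concrete, here is a minimal sketch of a CRNN forward pass in plain Python. It is an illustration only, not the researchers' model: the layer sizes, the random untrained weights, the 8x8 toy frames, and every function name (`conv2d_valid`, `frame_features`, `crnn_forward`, `init_params`, `predict`) are assumptions made for this example, and the pre-processing step is omitted because the article does not describe it. A convolutional stage summarizes each frame, a recurrent stage carries information across frames, and a final linear layer scores each command.

```python
import math
import random


def conv2d_valid(frame, kernel):
    """Slide one kernel over a 2D frame (no padding) and apply ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(frame) - kh + 1
    out_w = len(frame[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(frame[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out


def frame_features(frame, kernels):
    """Convolutional stage: one global-average-pooled value per kernel."""
    feats = []
    for kernel in kernels:
        fmap = conv2d_valid(frame, kernel)
        total = sum(sum(row) for row in fmap)
        feats.append(total / (len(fmap) * len(fmap[0])))
    return feats


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def crnn_forward(frames, params):
    """Return one logit per command for a sequence of 2D frames."""
    hidden = [0.0] * len(params["w_hh"])
    for frame in frames:
        x = frame_features(frame, params["kernels"])
        # Recurrent stage: simple tanh RNN update over per-frame features.
        pre = [a + b for a, b in zip(matvec(params["w_xh"], x),
                                     matvec(params["w_hh"], hidden))]
        hidden = [math.tanh(p) for p in pre]
    # Classification head applied to the final hidden state.
    return matvec(params["w_out"], hidden)


def init_params(n_kernels, n_hidden, n_commands, seed=0):
    """Random, untrained weights (hypothetical sizes, illustration only)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                for _ in range(rows)]

    return {
        "kernels": [mat(3, 3) for _ in range(n_kernels)],
        "w_xh": mat(n_hidden, n_kernels),
        "w_hh": mat(n_hidden, n_hidden),
        "w_out": mat(n_commands, n_hidden),
    }


def predict(frames, params):
    """Index of the highest-scoring command."""
    logits = crnn_forward(frames, params)
    return max(range(len(logits)), key=logits.__getitem__)


if __name__ == "__main__":
    rng = random.Random(1)
    # Five toy 8x8 "frames" standing in for pre-processed IR images.
    frames = [[[rng.random() for _ in range(8)] for _ in range(8)]
              for _ in range(5)]
    params = init_params(n_kernels=4, n_hidden=6, n_commands=54)
    print("predicted command index:", predict(frames, params))
```

Because the weights here are random, the predicted index is meaningless; a real system would train such a model on labeled recordings. The sketch only shows how a sequence of frames can be reduced to a single command prediction.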

In an initial experiment with 20 participants (10 for each language), SpeeChin recognized 54 silent speech commands in English and 44 simple Mandarin words or phrases, with average accuracies of 90.5% and 91.6%, respectively. However, the success rates dropped significantly when participants used the device while walking, due in part to the variation in walking styles (more versus less head movement, for example).

Once developed further, SpeeChin could be used in real-world applications. The technology could also be used by people who are unable to speak.