It takes more than accurate dictation
Text input has always been a challenge in extended reality. A traditional keyboard requires very little user effort, while the popular extended-reality approach, "mid-air typing", sounds like a fantasy but in practice tends to cause arm and hand fatigue. Voice input, on the other hand, requires little or no hand movement and may therefore be a better choice for text input.
Our approach is to develop voice input as the main text input solution in an extended reality environment. The solution is based on real-time speech recognition with fast response times and a low error rate. But accurate speech recognition alone is not enough to replace the traditional keyboard; it takes more to persuade users to adopt voice as their main way of entering text.
In our approach, suggestions play the main role in improving voice input. We will develop a prediction model that learns each user's input behavior and provides personalized suggestions alongside speech recognition. The method also requires an intuitive gesture for switching between suggestions seamlessly.
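To make the idea of a personalized suggestion model concrete, here is a minimal sketch assuming a simple per-user bigram frequency model; the class and method names are illustrative, not the actual model described above:

```python
from collections import defaultdict, Counter

class SuggestionModel:
    """Hypothetical sketch: count which word a user typically says after a
    given word, then offer the most frequent follow-ups as suggestions."""

    def __init__(self):
        # follow[prev] counts the words observed right after `prev`
        self.follow = defaultdict(Counter)

    def learn(self, utterance):
        """Update the per-user counts from one recognized utterance."""
        words = utterance.lower().split()
        for prev, nxt in zip(words, words[1:]):
            self.follow[prev][nxt] += 1

    def suggest(self, last_word, k=3):
        """Return up to k personalized follow-up suggestions."""
        return [w for w, _ in self.follow[last_word.lower()].most_common(k)]

model = SuggestionModel()
model.learn("see you at the office")
model.learn("see you at the gym")
model.learn("see you at the office tomorrow")
print(model.suggest("the"))  # → ['office', 'gym']
```

A production model would of course use a far richer predictor, but the interface is the same: the recognizer emits words, the model proposes ranked continuations, and a gesture picks one.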
Little input + suggestions = meaningful messages at minimal effort
Suggestions can also be used for user commands and for communication with virtual assistants.
Not everything can be easily or appropriately expressed by speech; emoji are a good example. Emoji have become a way to bring emotion into dreary messages, but how does one put an emoji into a text message using voice input? Saying "smiley face, smiley face" to a machine feels awkward. So our approach adds another layer: drawing. Sketch your favorite emoji in the air and it appears in your text message; most of the time a rough ":" and ")" is all it takes to get a smiley face. Mathematical expressions are another example of hard-to-speak content that can be entered easily by drawing.
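The ":" plus ")" example above can be sketched as a simple lookup: assuming an upstream sketch recognizer that turns mid-air strokes into characters, the recognized sequence maps to an emoji. The table and function names below are illustrative assumptions, not the actual recognizer:

```python
# Hypothetical mapping from recognized stroke sequences to emoji.
EMOJI_BY_STROKES = {
    (":", ")"): "🙂",  # a colon and a closing arc make a smiley face
    (":", "("): "🙁",
    ("<", "3"): "❤",
}

def strokes_to_emoji(strokes):
    """Replace a recognized stroke sequence with its emoji, if known;
    otherwise fall back to the literal characters."""
    return EMOJI_BY_STROKES.get(tuple(strokes), "".join(strokes))

print(strokes_to_emoji([":", ")"]))  # → 🙂
```

The same pattern extends naturally to mathematical symbols: the recognizer classifies a drawn shape, and a lookup inserts the corresponding character into the message.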
Author: Phuoc Trinh