Speech to text, or almost voice chat

Speech to text, or almost voice chat - oblector - 06-30-2022

This post describes a way to chat while playing by speaking instead of typing, without requiring the recipients to install any additional software. It is a kludge, so you may want to change the details.

First, install some speech-to-text software, such as nerd-dictation, and remember to download models for one or more languages.

As far as I know, Xonotic cannot execute external programs, for security reasons. So we'll use a script that runs nerd-dictation whenever certain files in the .xonotic directory are modified.

Consider the following script, which depends on inotify-tools. You may need to adapt it, as it presupposes some directory names and that your models are named model-en and model-pt.

Code:
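#!/bin/bash
# Watches trigger files in the Xonotic data directory and runs nerd-dictation
# whenever Xonotic modifies them with condump. Requires inotifywait.

set -eu

xondir="$HOME/.xonotic/data"
nerddir="$HOME/gits/nerd-dictation"   # adapt to where nerd-dictation lives
nerddic="$nerddir/nerd-dictation"
outfile="$xondir/voiceout"

# Print a diagnostic message to stderr.
diagn() {
  printf '%s\n' "$1" >&2
}

main() {
  cd "$xondir"
  while true ; do
    diagn "Waiting for start signal..."
    # Change the languages here; each name is a trigger file and a model suffix.
    file=$(inotifywait --format %w -e modify en pt)
    diagn "Received start signal: $file"
    : > "$file"   # discard the console dump written by condump
    {
      model="$nerddir/model-$file/"
      # Runs in the background until "end" is called below, then returns the text.
      res=$("$nerddic" begin --numbers-as-digits --vosk-model="$model" --output STDOUT | tr '\n' ' ')
      diagn "Writing text: $res"
      printf 'set nerddic_text "%s"\n' "$res" > "$outfile"
    } &
    diagn "Waiting for stop signal..."
    inotifywait -e modify voicestop
    diagn "Received stop signal"
    : > voicestop
    "$nerddic" end
  done
}

main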
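
Since the script presupposes several paths, a small preflight check may save some head-scratching. This is only a sketch: the function name check_setup is my own invention, and it assumes the same layout as the script (a nerd-dictation executable and model-LANG directories side by side).

```shell
#!/bin/bash
# Hypothetical helper, not part of nerd-dictation or Xonotic.
# check_setup NERD_DIR LANG...
# Prints one line per missing path and returns non-zero if any is missing.
check_setup() {
  local nerddir=$1; shift
  local status=0 lang
  if [ ! -x "$nerddir/nerd-dictation" ]; then
    printf 'missing executable: %s\n' "$nerddir/nerd-dictation"
    status=1
  fi
  for lang in "$@"; do
    if [ ! -d "$nerddir/model-$lang" ]; then
      printf 'missing model directory: %s\n' "$nerddir/model-$lang"
      status=1
    fi
  done
  return "$status"
}
```

With the paths from the script, it would be called as check_setup "$HOME/gits/nerd-dictation" en pt. Checking that inotifywait is installed can be done separately with command -v inotifywait.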

Inside the .xonotic/data/ directory create the empty files: voiceout, voicestop, en and pt.
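
Creating those files can be sketched as below. The function name create_signal_files is hypothetical; the file names are the ones used in this post, and the language list should match the one in the script.

```shell
#!/bin/bash
# Hypothetical helper: create the empty trigger/result files.
# create_signal_files DATA_DIR LANG...
create_signal_files() {
  local datadir=$1; shift
  local name
  mkdir -p "$datadir"
  for name in voiceout voicestop "$@"; do
    # ": >>" creates the file if it is absent and leaves an existing one untouched.
    : >> "$datadir/$name"
  done
}
```

For the setup described here, the call would be create_signal_files "$HOME/.xonotic/data" en pt.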

Add the following (or similar) to .xonotic/data/autoexec.cfg:

Code:
// Holds the recognized text; overwritten by "exec voiceout".
set nerddic_text "<uninitialized>"
// Current language, which is also the name of the trigger file.
set lang en
// condump modifies a watched file, which is what signals the script.
alias voiceask "condump $lang"
alias voicestop "condump voicestop"
alias voicerecstart "cprint Recording voice... ($lang); voiceask"
alias voiceexecres "exec voiceout"
alias voicerecend "voicestop; cprint Processing, wait how much I do not know"
// Load the result written by the script, then say or show it.
alias voicesayres_ "say ${nerddic_text}"
alias voicesayres "voiceexecres; voicesayres_"
alias voiceshowres_ "cprint ${nerddic_text}"
alias voiceshowres "voiceexecres; voiceshowres_"
alias voiceteamsayres_ "say_team ${nerddic_text}"
alias voiceteamsayres "voiceexecres; voiceteamsayres_"

Then start the bash script, and run Xonotic.

Note that the trigger files are modified with condump, which writes the entire console history to the file; this is a bit inefficient. Then again, the speech recognition itself demands far more power and time.

You can change the language with the Xonotic console command lang pt. You can also bind these commands to keys; the command voiceshowres shows the recognized text before you publish it.

You may want to change the code so that the text is sent automatically after recording; beware of race conditions.
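
One race that can be reduced on the script side is Xonotic reading voiceout while it is still being written. A minimal sketch, assuming the result file stays on one filesystem: write to a temporary file and rename it over voiceout. The function name write_result is hypothetical, and the list of stripped characters is my assumption about what could break the quoted string or be treated specially by the console.

```shell
#!/bin/bash
# Hypothetical helper: write 'set nerddic_text "TEXT"' to OUTFILE in one step.
# write_result OUTFILE TEXT
write_result() {
  local outfile=$1 text=$2 tmp
  # Turn newlines into spaces, then drop quotes, dollar signs, semicolons,
  # backslashes and carriage returns (an assumed list; adjust as needed).
  text=$(printf '%s' "$text" | tr '\n' ' ' | tr -d '"$;\\\r')
  # The temporary file sits next to the target, so mv is a plain rename
  # and a reader sees either the old content or the new, never a partial file.
  tmp=$(mktemp "$outfile.XXXXXX")
  printf 'set nerddic_text "%s"\n' "$text" > "$tmp"
  mv -f "$tmp" "$outfile"
}
```

In the script, the printf line that writes to "$outfile" could be swapped for write_result "$outfile" "$res". This does not tell Xonotic when the result is ready; it only avoids executing a half-written file.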

It seems to be slightly practical (though not much in frenetic games). I couldn't test it rigorously because I'm not a fluent English speaker, nor do I have a quality microphone.

Thanks for the attention. Suggestions? Also, I'm grateful for this marvelous game.