I began using OpenAI’s Whisper model to transcribe my voice, which is fantastic for ‘writing’ my thoughts in my diary without having to use my fingers.
I use it to recap my progress before taking a break. I just open a shell, type voice-note, record some audio, and let the model process it in the background while I go grab a snack. The transcribed voice note eventually shows up in my daily note file.
OpenAI’s Whisper speech recognition model works great in English and Spanish. It does not understand my terrible German. I run it on the CPU, but that doesn’t really matter because I don’t need real-time transcription. If a nice GPU falls into my hands, I’m sure it could handle real-time processing.
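Once the function below is sourced into your shell, usage is just the command name with an optional language argument (these invocations assume the defaults in the script that follows):

    voice-note            # record a note in English (the default); press Ctrl+C to stop recording
    voice-note Spanish    # record a note in Spanish instead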
Add this shell function to your .zshrc/.bashrc, etc.:
function voice-note() {
    # Language for Whisper; defaults to English. Use a name other than LANG
    # so we don't clobber the locale environment variable.
    local NOTE_LANG=${1:-English}
    local DATESTAMP=$(date +%Y-%m-%d)
    local TIMESTAMP=$(date +%Y-%m-%d-%H:%M:%S)
    local JOURNALDIR=~/journal/diary
    local AUDIOFILE=${JOURNALDIR}/${TIMESTAMP}.mp3
    local TEMPDIR=$(mktemp -d)
    local JOURNALFILE=${JOURNALDIR}/${DATESTAMP}.md

    # Install whisper:
    #   pip3 install git+https://github.com/openai/whisper.git
    # How to update whisper:
    #   pip3 install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
    # Install arecord (ships in the alsa-utils package) and lame:
    #   apt-get install alsa-utils lame

    mkdir -p "${JOURNALDIR}"

    # Record raw CD-quality audio from the default device until interrupted
    # (Ctrl+C), encoding it to MP3 on the fly.
    arecord -v -f cd -t raw | lame -r - "${AUDIOFILE}"

    # Transcribe the recording; the .txt lands in the temporary directory.
    whisper "${AUDIOFILE}" --output_dir "${TEMPDIR}" --output_format txt --model small --language "${NOTE_LANG}"

    # Append the transcript to today's journal file under a timestamped heading.
    printf '\n\n# Voice note %s\n' "${TIMESTAMP}" >> "${JOURNALFILE}"
    cat "${TEMPDIR}/${TIMESTAMP}.txt" >> "${JOURNALFILE}"

    rm -rf "${TEMPDIR}"
}
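When the transcription finishes, the note sits at the end of the day’s Markdown file. A quick way to check the result (the timestamp and transcript below are made-up placeholders, not real output):

    $ cat ~/journal/diary/2023-04-02.md
    ...
    # Voice note 2023-04-02-16:05:12
    Finished wiring up the voice notes, next I need to test the Spanish transcription and clean up the temp files.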