I began utilizing OpenAi’s whisper AI model to transcribe my voice, which is fantastic for ‘writing’ my thoughts in my diary without having to use my fingers.
I use it to recap progress before going for a break. I just open a shell, type voice-note record some audio and let the model process it in the background while I go grab a snack. The transcribed voice note appears eventually on my daily note file.
OpenAI’s voice recognition Whisper2 model works great in English and Spanish. It does not understand my terrible German. I run it on CPU but it does not really matter because I don’t have the need to do it in real time. If a nice GPU falls in my hand I’m sure it could do real time processing.
Add this shell script to your .zshrc/.bashrc etc.
function voice-note() {
local LANG=${1:-'English'}
local DATESTAMP=$(date +%Y-%m-%d);
local TIMESTAMP=$(date +%Y-%m-%d-%H:%M:%S);
local JOURNALDIR=~/journal/diary
local AUDIOFILE=${JOURNALDIR}/${TIMESTAMP}.mp3;
local TEMPDIR=$(mktemp -d)
local JOURNALFILE=${JOURNALDIR}/${DATESTAMP}.md;
# Install whisper
# pip3 install git+https://github.com/openai/whisper.git
# How to update whisper.
# pip3 install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
# Install arecord
# apt-get install arecord
arecord -v -f cd -t raw | lame -r - "${AUDIOFILE}";
whisper "${AUDIOFILE}" --output_dir /${TEMPDIR} --output_format txt --model small --language ${LANG};
echo "\n\n# Voice note ${TIMESTAMP}" >> ${JOURNALFILE}
cat "/${TEMPDIR}/${TIMESTAMP}.txt" >> ${JOURNALFILE}
}