Have you ever dreamed about controlling your PC with voice commands? Well, now you can (though only some specific actions)!
What do I need?
- A computer with Ubuntu (you can still do the same on other distributions, but this post won’t cover that).
- A microphone (a cheap one will do).
- Some application(s) which you want to control and which can be used with commands on the terminal.
Installation
Go over to my PPA and install packages julius and julius-voxforge from there.
Writing the command and control application
Follow the instructions in /usr/share/doc/julius-voxforge/examples/README to create your own grammar, then edit the command.py file to suit your needs (the simplest change is to edit the dictionary near line 60). Finally, run it with:
julius -quiet -input mic -C julian.jconf 2>/dev/null | ./command.py
Problems
I don’t really have much experience with Julius, but if you have problems with the instructions explained here leave a comment or ping me on IRC (RainCT@Freenode) and I’ll try to help you. But first look at the examples below to ensure that you’ve done everything right :).
More?
I’m currently working on improving those packages further and getting them into Ubuntu. I may also write another post in the future explaining how to create your own speech corpora and acoustic models, but I can’t promise anything.
Example on how to control Rhythmbox:
* example.voca:
% NS_B
<s>        sil

% NS_E
</s>       sil

% ID
DO         d uw
COMP       k ax m p

% COMMAND
PLAY       p l ey
NEXT       n eh k s t
PREV       p r iy v
SHOW       sh ow
UP         ah p
DOWN       d aw n
SILENCE    s ay l ax n s
* sample.grammar
S: NS_B ID COMMAND NS_E
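To see which utterances this one-rule grammar can accept, it may help to expand it by hand. The following is only an illustrative sketch in Python, not part of Julius or of the julius-voxforge package: the names VOCA, RULE and accepted_sentences are made up here, and the word lists are copied from the example.voca shown above with the phoneme columns left out.

```python
from itertools import product

# Word lists copied from example.voca (phonemes omitted).
# VOCA, RULE and accepted_sentences are hypothetical names for this sketch.
VOCA = {
    "NS_B": ["<s>"],
    "NS_E": ["</s>"],
    "ID": ["DO", "COMP"],
    "COMMAND": ["PLAY", "NEXT", "PREV", "SHOW", "UP", "DOWN", "SILENCE"],
}

# The single rule from sample.grammar: S: NS_B ID COMMAND NS_E
RULE = ["NS_B", "ID", "COMMAND", "NS_E"]


def accepted_sentences(rule, voca):
    """Return every word sequence the rule can produce, one string each."""
    choices = [voca[category] for category in rule]
    return [" ".join(words) for words in product(*choices)]


if __name__ == "__main__":
    for sentence in accepted_sentences(RULE, VOCA):
        print(sentence)
```

With two identifier words and seven command words, the rule yields fourteen possible sentences, each wrapped in the silence markers <s> and </s>.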
* command.py’s parse function
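import os

def parse(line):
    # Split the recognized sentence into lowercase words, e.g. ['do', 'play'].
    params = [param.lower() for param in line.split() if param]
    # Map each spoken command word to the shell command it triggers.
    commands = {
        'play': 'rhythmbox-client --play',
        'silence': 'rhythmbox-client --pause',
        'next': 'rhythmbox-client --next',
        'prev': 'rhythmbox-client --previous',
        'show': 'rhythmbox-client --notify',
        'up': 'rhythmbox-client --volume-up',
        'down': 'rhythmbox-client --volume-down',
    }
    # The command word is expected in second position, after the identifier ("do").
    if len(params) > 1 and params[1] in commands:
        os.popen(commands[params[1]])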
* Usage: (Action – Verbal command)
Play - DO PLAY
Pause - DO SILENCE (I didn't use "DO PAUSE" because it had a very high error rate)
Next song - DO NEXT
Previous song - DO PREV ("DO PREVIOUS" can't be used because VoxForge's acoustic models don't support some of its phonemes)
Show the name of the current song - DO SHOW
Increment Rhythmbox's volume - DO UP
Decrement Rhythmbox's volume - DO DOWN
Random tip:
You can have the computer answer your commands using either espeak "text to say" or, if you have Festival installed (which sounds more natural), festival -b '(SayText "text to say")'.
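If you want to trigger such a spoken reply from a Python script like command.py, something along these lines might work. This is only a sketch under a few assumptions: Python 3 (for shutil.which), and the helper names build_speak_command and speak are invented for this example. The only external commands it relies on are the two invocations from the tip above.

```python
import shutil
import subprocess


def build_speak_command(text, engine="espeak"):
    """Return the argument list that makes the chosen engine say `text`."""
    if engine == "espeak":
        return ["espeak", text]
    if engine == "festival":
        # Festival evaluates a Scheme expression, so backslashes and
        # double quotes inside the text have to be escaped.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        return ["festival", "-b", '(SayText "%s")' % escaped]
    raise ValueError("unknown engine: %r" % engine)


def speak(text, engine="espeak"):
    """Say `text` aloud; return False if the engine is not installed."""
    argv = build_speak_command(text, engine)
    if shutil.which(argv[0]) is None:
        return False
    subprocess.call(argv)
    return True


if __name__ == "__main__":
    speak("Playing music")
```

Passing the arguments as a list avoids going through a shell, so the text to say does not need any shell quoting.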
Happy hacking!





Cool!!! I’ll try it when I have time!
Wow. It’s cool to control music by voice. Unfortunately it doesn’t work when I play songs very loud :P
Do you know if there’s a project to make voice control easy for the user? Something to control apps with a simple GUI…
[@ Loffe]
Yeah, that’s true about the loud music, but I think it works quite well anyway. While I was writing the script I wondered whether it would recognize anything with music playing, but as you can see, up to a certain point it does :).
About GUI, I know of the two following ones, but I haven’t tried them yet:
– Simon (http://simon-listens.org/index.php?id=122&L=1), which uses Julius, but it doesn’t seem to build on Hardy.
– The panel applet gnome-voice-control (http://live.gnome.org/GnomeVoiceControl), which uses a different engine, Sphinx (which from my previous experience with it seems to require a pretty good microphone – using the cheap one I have it only recognizes like 1 word out of 50). It is in the repositories, but the version in Hardy doesn’t work (although I just spoke with the maintainer and he is working on fixing it now).
Just in case anyone else stumbles upon the error I got:
The .dict file was not built; I had to use "sudo mkdfa sample" to build the .dict and .dfa files. I don’t know why, because the directory I created was in my home folder, belonged to me, and had been set to chmod 777 -R by me.
Hi RainCT,
thank you for creating julius and julius-voxforge in intrepid, that will make my project for school much more simple! Will it be updated when the new audio speech-files in VoxForge are processed? (Now there are around 50 new speech submissions.)
VoxForge: Hey. I’m happy that it’s useful for you :). Drop me a mail once the new version is released and I’ll see if I can get it in, but I can’t promise anything as Feature Freeze is about to start (which means that no new features can enter Intrepid).
Hi RainCT,
thank you for your quick reply!
Is speech recognition accuracy really a feature? I thought that a feature is a new package with new options. It depends, I think, on the reviewer. Maybe it would be better not to release it now, but to wait a few months and then try to release it in intrepid-backports, as it might be a bigger improvement by then.
I don’t know when Ken comes back from holiday and will process the submissions, but I’ll drop you a mail when it is ready.
On the other hand, if it does not get into Intrepid, is it possible to simply overwrite the acoustic model in order to improve the accuracy?
Voxforge: Well, “new upstream versions” are considered features, unless they are “bugfix only”. But as VoxForge is not an application it may well be possible to get an exception. In regards to your last question, you can of course use any acoustic model you want (just write the path to that one you want instead of to /usr/share/julius-voxforge/acoustic/, in julian.jconf).
Thank you!
Hello, thanks to your work I was able to install Julius under Ubuntu. I hope that you continue your good work. Greetings
This is really cool. It’s almost like in Star Trek, where you tell the computer to do something (like “computer, turn on holodeck”) and magically the computer does it.
Thanks a lot for posting this, this was exactly what I was looking for for a project of mine that I had a few months ago. Thanks a lot!
I’ve waited so long for an application like this one. :) Easy to use, and easy to place new options. I configured it to open all my developing tabs on firefox in case of the ‘develop’ word. :D
I’m afraid your how-to is a bit too bare-bones for me. I may be an occasional hacker, but that’s mostly not (read: never) in Python. Could you explain exactly how you get the voice commands to work with Rhythmbox?
@sander:
In the example I run Julius and pipe its output (from stdout) into the Python application, which parses it line by line in real time. If a line is of the form «sentence1: <s> [...] </s>», it contains the recognized text, which the parse() function compares to the known commands. If a known command is found (e.g., «play»), I call rhythmbox-client (a command-line tool to control Rhythmbox) to perform the recognized action.
I hope this clarifies it for you; if it doesn’t, ask again :).
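For readers who prefer to see that flow as code, here is a rough, self-contained sketch. It is not the command.py shipped in the julius-voxforge package. The names SENTENCE_RE, COMMANDS, recognized_words, command_for and main are invented for this illustration, and the only things taken from the post are the «sentence1: <s> [...] </s>» line format and the rhythmbox-client commands.

```python
import re
import shlex
import subprocess
import sys

# Julius prints the recognized words on lines of this form:
#   sentence1: <s> DO PLAY </s>
SENTENCE_RE = re.compile(r"^sentence1: <s> (.*) </s>\s*$")

# Illustrative table: spoken word -> command line to run.
COMMANDS = {
    "play": "rhythmbox-client --play",
    "next": "rhythmbox-client --next",
}


def recognized_words(line):
    """Return the lowercased words of a result line, or None for other output."""
    match = SENTENCE_RE.match(line)
    if match is None:
        return None
    return match.group(1).lower().split()


def command_for(words, commands=COMMANDS):
    """Return the argument list for the first known word, or None."""
    for word in words:
        if word in commands:
            return shlex.split(commands[word])
    return None


def main(stream=sys.stdin, run=subprocess.call):
    """Read Julius output line by line and run the matching commands."""
    for line in stream:
        words = recognized_words(line)
        if words is None:
            continue  # status text, partial results, and so on
        argv = command_for(words)
        if argv is not None:
            run(argv)


if __name__ == "__main__":
    main()
```

Saved as an executable script, it would be used like command.py, at the receiving end of the julius pipe. Unlike the parse() function in the post, which looks only at the second word, this sketch acts on the first word it knows, so the identifier word ("do") is simply skipped. Adding an entry to COMMANDS is enough to make a new word trigger any other program, provided that word is also in your grammar.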
FYI, I’ve just uploaded julius-voxforge 0.1.1~daily20090611-0ubuntu2 which provides an updated version of the script (including support for Banshee).
Hi! I was surfing and found your blog post… nice! I love your blog. :) Cheers! Sandra. R.
I got the “sample” program to work fine. So then I had a go at the “controlapp” program. It says “Taking control of Rhythmbox media player.” I then try saying the commands and such but nothing changes. I even started up Rhythmbox and set it up so that if I did hit pause or play or anything like that I would know. But nothing happens. I checked the dictionary to make sure the words like play and pause were there and they are but nothing happens still.
I just figured out that I forgot to add these lines:
sed -i 's/sample\.dfa/mediaplayer\.dfa/' *.jconf
sed -i 's/sample\.dict/mediaplayer\.dict/' *.jconf
I hope this helps someone
Sir,
I am trying to develop a control application with voice recognition. After executing
julius -quiet -input mic -C julian.jconf | ./command.py
the process stays in a waiting state (the cursor keeps blinking). Sometimes it shows <<> and works fine,
but it does not work reliably.
Please help me.
Sometimes it gives this error:
Error: adin_oss: failed to read samples
Hi,
can you give some explanation of how you actually send the “word” that you get from Julius to the Python script?
Thank you
@ Roland:
You can see that in the example included in the “julius-voxforge” Ubuntu package. For your convenience, I’ll post a copy of the current Python code here:
You execute it like this: julius -quiet -input mic -C julian.jconf 2>/dev/null | ./command.py
This way Julius will run and write the recognized words (together with some other useless text) to the standard output. That text will be forwarded to the Python script which can then identify the relevant words and see whether they are recognized.
As you can see, this is just a quick hack. Theoretically it should be possible to use libjulius to access Julius directly from Python, but I’m not sure what the state of said library is at the moment and I haven’t found time to look into it yet.
[...] speech recognition engine) is already packaged since Intrepid and some time ago I wrote about how to use it to control applications (including instructions to setup some basic commands to control [...]
[...] in some months), but especially interesting is that it supports more phonemes now (remember the old days when words like “previous” or “computer” couldn’t be [...]
Hi, it works great. I’m using the command.py that comes with julius-voxforge and would like to use the script to execute other programs and scripts. Could you please provide a simpler, generic script template to work from? (I can’t code in Python; I’m currently learning Bash.) The packaged script is hard for me to dissect and use for anything other than Rhythmbox/Banshee. The command.py you have in your tutorial (not the VoxForge one) is more like what I am after, but I can’t seem to get it to work on its own. I guess it’s a code snippet and not the full script, but I want to be able to do something like:
recognized word from julius
commands:
word=command
word=command
word=command
word=command
etc
if recognised word = any of the words above
then
execute recognized command
with no fancy stuff to check whether Rhythmbox or Banshee are installed etc., just a bare-bones and very simple version that can be used to execute any command.
Also, is this possible to do in Bash? I would ideally like to learn one language at a time.
Any help appreciated - Thanks!
[...] Julius+Voxforge: seems to be very easy to install, but no GUI. [...]
Can you update this for julius 4.2.1?