Archive for the ‘Ubuntu-Planet’ Category

Writing a command and control application with voice recognition

Saturday, July 26th, 2008

Have you ever dreamed about controlling your PC with voice commands? Well, now you can (though only for some specific actions)!

What do I need?
- A computer with Ubuntu (you can still do the same on other distributions, but this post won’t cover that).
- A microphone (a cheap one will do).
- Some application(s) which you want to control and which can be used with commands on the terminal.

Installation
Go over to my PPA and install the packages julius and julius-voxforge from there.

Writing the command and control application
Follow the instructions in /usr/share/doc/julius-voxforge/examples/README to create your own grammar, then edit the command.py file to suit your needs (the simplest approach is to just edit the dictionary near line 60). Finally, run it with:

julius -quiet -input mic -C julian.jconf 2>/dev/null | ./command.py
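To give an idea of what happens on the receiving end of that pipe, here is a rough sketch of a script that reads Julius's output from standard input and pulls out the recognised words. It is not the command.py shipped in the package: the names extract_words and main are made up for illustration, and it assumes that Julius prints the final result on lines of the form "sentence1: <s> DO PLAY </s>", so check this against what your version actually prints.

```python
import sys


def extract_words(line):
    """Return the recognised words from one line of Julius output, or None.

    Assumption: final results look like 'sentence1: <s> DO PLAY </s>'.
    """
    prefix = "sentence1:"
    if not line.startswith(prefix):
        return None
    words = line[len(prefix):].split()
    # Drop the silence markers that surround the sentence.
    return [word for word in words if word not in ("<s>", "</s>")]


def main(stream=sys.stdin):
    # Read Julius's output line by line as it arrives through the pipe.
    for line in stream:
        words = extract_words(line.strip())
        if words:
            print("Heard: " + " ".join(words))
```

Calling main() at the bottom of the script would make it usable in place of ./command.py in the pipeline above, printing what was heard instead of running anything.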

Problems
I don’t really have much experience with Julius, but if you have problems with the instructions explained here, leave a comment or ping me on IRC (RainCT@Freenode) and I’ll try to help you. But first look at the examples below to make sure that you’ve done everything right :).

More?
I’m currently working on further improving those packages and getting them into Ubuntu. Also, I may write another post in the future explaining how to create your own speech corpora and acoustic models, but I can’t promise anything.

Example on how to control Rhythmbox:

. example.voca:
% NS_B
<s> sil

% NS_E
</s> sil

% ID
DO d uw
COMP k ax m p

% COMMAND
PLAY p l ey
NEXT n eh k s t
PREV p r iy v
SHOW sh ow
UP ah p
DOWN d aw n
SILENCE s ay l ax n s

. sample.grammar
S: NS_B ID COMMAND NS_E

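. command.py’s parse function
import os  # needed for os.popen below

def parse(line):
    # Lower-case the recognised words, e.g. "DO PLAY" -> ['do', 'play']
    params = [param.lower() for param in line.split() if param]
    # Map each spoken command to the shell command it triggers
    commands = {
        'play': 'rhythmbox-client --play',
        'silence': 'rhythmbox-client --pause',
        'next': 'rhythmbox-client --next',
        'prev': 'rhythmbox-client --previous',
        'show': 'rhythmbox-client --notify',
        'up': 'rhythmbox-client --volume-up',
        'down': 'rhythmbox-client --volume-down',
    }
    # params[0] is the ID word (DO or COMP), params[1] the command itself;
    # the length check avoids an IndexError on lines with fewer than two words
    if len(params) > 1 and params[1] in commands:
        os.popen(commands[params[1]])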

. Usage: (Action - Verbal command)
Play - DO PLAY
Pause - DO SILENCE (I didn’t use “DO PAUSE” because it had a very high recognition error rate)
Next song - DO NEXT
Previous song - DO PREV (“DO PREVIOUS” can’t be used because VoxForge’s acoustic models don’t support some of its phonemes)
Show the name of the current song - DO SHOW
Increase Rhythmbox’s volume - DO UP
Decrease Rhythmbox’s volume - DO DOWN

Random tip:
You can let the computer answer your commands using either espeak "text to say" or, if you have Festival (which sounds more natural) installed, festival -b '(SayText "text to say")'.
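If you want to do this from Python (for example from inside command.py), something along these lines should work. It is only a sketch: speech_command and say are names I made up, and the backslash escaping of quotes for Festival assumes that Festival's Scheme strings accept backslash escapes. The command is passed as a list so that the text never goes through a shell.

```python
import subprocess


def speech_command(text, engine="espeak"):
    """Build the argument list for speaking `text` with the chosen engine."""
    if engine == "espeak":
        return ["espeak", text]
    if engine == "festival":
        # Assumption: backslash escapes are understood inside the string.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        return ["festival", "-b", '(SayText "%s")' % escaped]
    raise ValueError("unknown engine: %r" % engine)


def say(text, engine="espeak"):
    # Runs the program directly; it has to be installed for this to work.
    subprocess.call(speech_command(text, engine))
```

For instance, say("Playing") could be called right after launching rhythmbox-client --play, and say("Playing", "festival") would do the same through Festival.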

Happy hacking!

XML Internationalization

Thursday, July 10th, 2008

Hey,

Okay, I know this is not a forum, but I wanted to ask for recommendations on the best way to handle internationalization of XML documents (of their content, of course, not the tags themselves). More details about why I’m asking this will come later :).

Thanks.

If you don’t want to waste 1 hour debugging…

Monday, June 16th, 2008

… call gtk.gdk.threads_init() when you want to work with threads in a PyGTK application.

Well, so that this post isn’t quite so short, I’ll also point to a post explaining threads in PyGTK which looks quite good (I didn’t look at it carefully because I already knew 99% of what it says… it’s a pity that because of that I didn’t notice the 1% I didn’t know, which would have saved me an hour of debugging :/). Also, note that if you write your own __init__ method in order to pass data to the thread, you have to call threading.Thread.__init__(self) somewhere in it.
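The second point can be shown without any GTK at all, so here is a small sketch using only the standard threading module. Worker and run_worker are made-up names for illustration. In a real PyGTK application you would additionally call gtk.gdk.threads_init() before entering gtk.main(); that part is left out here so the example runs without PyGTK installed.

```python
import threading


class Worker(threading.Thread):
    """Thread that receives its input data through __init__."""

    def __init__(self, numbers):
        # Without this call, start() raises a RuntimeError because the
        # thread object was never initialised.
        threading.Thread.__init__(self)
        self.numbers = numbers
        self.result = None

    def run(self):
        # Runs in the new thread once start() has been called.
        self.result = sum(self.numbers)


def run_worker(numbers):
    worker = Worker(numbers)
    worker.start()
    worker.join()  # wait for the thread to finish before reading the result
    return worker.result
```

Here run_worker([1, 2, 3]) starts the thread, waits for it and hands back the sum it computed.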

Internship

Sunday, June 1st, 2008

I’m thinking about doing an internship this summer, so if you know of a programming-related company in the Barcelona area that might be interested in taking someone on, I’d be grateful if you let me know :).

I’m a 17-year-old technological baccalaureate student with experience in XHTML, CSS, PHP, Python and Debian/Ubuntu packaging (plus a bit of JavaScript, bash, etc.), and the languages I know are Catalan, Spanish, English (mostly written) and spoken German. Ask for my CV if necessary.

Thanks!

Catalan Ubuntu Youth

Tuesday, May 20th, 2008

We have recently decided to create a sub-group of the Catalan LoCo Team just for young people, with the idea of focusing on introducing other young people to the Free Software world and creating a space where we can talk about the things that interest us, without having the entire community listening.

As a start, we have created a mailing list, and this Friday we will hold our first event in Sant Cugat: a dinner where we will get together and discuss the future goals of the group. Of course, all of you under the age of 30 who speak Catalan are invited to come and join us!