CakePHP: Using a model inside a component

Apparently this is discouraged for some mysterious reason, but if, like me, you want nice code where models are used from within components (so you don't have to clutter up your controllers or shell classes), here's how:

$this->ModelName = ClassRegistry::init('ModelName');

For example, you can place it like this at the start of your component’s definition:

public function __construct() {
    parent::__construct();
    $this->Message = ClassRegistry::init('Message');
}

Debious – A dubious Debian packaging GUI

Just a little (unfinished) concept mockup. Seeing that much of it still ends up as a "text box with syntax highlighting", it'd probably make sense to implement it as a gedit plugin.

Balsamiq source XML

My dot files (Tips and Tricks for Bash & co.)

.bashrc

# If not running interactively, don't do anything
[ -z "$PS1" ] && return

# don't put duplicate lines in the history and ignore same successive entries.
export HISTCONTROL=ignoreboth

# make the history longer
HISTFILESIZE=5000

# append to the history file, don't overwrite it
shopt -s histappend

# check the window size after each command and, if necessary,
# update the values of LINES and COLUMNS.
shopt -s checkwinsize

# make less more friendly for non-text input files, see lesspipe(1)
[ -x /usr/bin/lesspipe ] && eval "$(SHELL=/bin/sh lesspipe)"

case "$TERM" in
xterm*|rxvt*|screen)
    #PS1='\[\e[1;34m\][\u, \W]\$ \[\e[m\]'
    # http://live.gnome.org/Git/Tips, http://tldp.org/HOWTO/Bash-Prompt-HOWTO/x329.html
    PS1='\[\e[1;34m\][\u, \W$(__git_ps1 "(\[\e[1;30m\]%s\[\e[m\]\[\e[1;34m\])")]\$ \[\e[m\]'
    ;;
*)
    ;;
esac

# enable color support of ls and also add handy aliases
if [ -x /usr/bin/dircolors ]; then
    eval "`dircolors -b`"
    alias ls='ls --color=auto'
fi

shopt -s cdspell
shopt -s cmdhist

# enable programmable completion features
if [ -f /etc/bash_dyncompletion ]; then
    . /etc/bash_dyncompletion
elif [ -f /etc/bash_completion ]; then
    . /etc/bash_completion
fi

[ -f ~/.bash_aliases ] && . ~/.bash_aliases

export PATH=$PATH:/sbin:/usr/sbin:/home/rainct/bin:/home/rainct/.local/bin
export DEBFULLNAME="Siegfried-Angel Gevatter Pujals"
export DEBSIGN_KEYID="363DEAE3"
export DEB_MAINTAINER_MODE=1
export PBUILDFOLDER="/home/rainct/pbuilder"
export QUILT_PATCHES=debian/patches
export GREP_OPTIONS='--color=auto --exclude-dir=\.svn'
export EDITOR=nano

# This also needs an entry in ~/.devscripts:
# DEBUILD_PRESERVE_ENVVARS=DPKG_GENSYMBOLS_CHECK_LEVEL
export DPKG_GENSYMBOLS_CHECK_LEVEL=4


Language Identification and its state in Free Software

While working on a new feature for eSpeak GUI, I started looking into language identification. Forcing users to manually choose the text's language is a nuisance, so guessing it automatically — for example, by checking which system dictionary contains the most words from the text — would surely be beneficial.
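The dictionary-based guess mentioned above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name and the word-list layout are invented, and in practice the sets would be loaded from system word lists (e.g. the files that packages like wamerican or wcatalan install under /usr/share/dict/):

```python
def guess_language_by_dictionary(text, wordlists):
    """Return the language whose word list covers the most words of the text.

    `wordlists` maps a language code to a set of known words (lowercase).
    """
    # Strip basic punctuation and lowercase each token
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    words = [w for w in words if w]

    def coverage(lang):
        return sum(w in wordlists[lang] for w in words)

    # Pick the language with the highest coverage
    return max(wordlists, key=coverage)
```

This breaks down for short texts and closely related languages, which is exactly why the n-gram approach below is more robust.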

After a quick search I learned that it's even easier than that: it's possible to reliably determine the language based on statistical n-gram information. Ignoring the fact that I now officially hate Firefox, Chromium, OpenOffice.org and everyone else for not implementing this and leaving me to spend the day switching the spell-checker's language, I was left with the choice of how to use this in eSpeak GUI.

The first option I found was TextCat, which is also the only library I've found to be packaged for Debian. However, ignoring the fact that upstream isn't maintaining it any more (such a library shouldn't need too much maintenance, after all), the package declares incorrect dependencies (bug filed a month ago, no response yet) and the API is also pretty crappy (it requires a physical file indicating the location of the statistical models).

Unrelated to that, I've also found that the Catalan text samples it includes are incorrect, so the same may be true for other languages. I guess it'd make sense to work on a new (and fully Unicode) collection of language samples. I've thought of using something like the Universal Declaration of Human Rights, since that way all languages can have the same text, but being a legal text it may be biased by some words repeating too often.

Looking for other alternatives to the TextCat library I’ve only found the following:

  • TextCat (same name, different code): PHP-licensed, so incompatible with GPL projects.
  • Mguesser (part of mnogosearch-mysql): it’s a standalone executable and not a library.
  • SpamAssassin’s TextCat.pm: also a standalone executable, this time written in Perl. Apparently they were using a fork of TextCat (the original library, not the PHP licensed one) before that.

So it looks like I'll have to start by putting together a good collection of text samples from which to generate the statistical data. Then I have several options for how to actually use it. As I see it, these are my possibilities:

  1. Fixing libtextcat’s packaging and just using that.
  2. Taking it over as new upstream maintainer. Not my preferred option as I don’t really feel like maintaining a C library at this point.
  3. Trying to convince the maintainer of the new TextCat (with last commit January this year and a more sane API) to re-license it in a GPL-compatible way, packaging that and seeing how that one works (haven’t tried it out yet).
  4. Writing my own implementation in Python, maybe based upon this example or TextCat.pm.
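To give an idea of what option 4 would involve, here is a minimal sketch of the TextCat-style algorithm in Python. All names are my own, and the profiles here would normally be precomputed from large training samples; the core idea is to rank character n-grams by frequency and compare rankings with an "out of place" distance:

```python
from collections import Counter

def ngram_profile(text, n_max=3, top=300):
    """Build a ranked character n-gram profile of a text (TextCat-style)."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"_{word}_"  # mark word boundaries
        for n in range(1, n_max + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Keep only the most frequent n-grams, ordered by rank
    return [gram for gram, _ in counts.most_common(top)]

def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences between two profiles; lower is more similar."""
    penalty = len(lang_profile)  # maximum penalty for missing n-grams
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    return sum(abs(i - rank.get(gram, penalty))
               for i, gram in enumerate(doc_profile))

def identify(text, models):
    """Return the language whose profile is closest to the text's profile."""
    doc = ngram_profile(text)
    return min(models, key=lambda lang: out_of_place(doc, models[lang]))
```

With models trained on decent samples, even short inputs tend to classify correctly, since character n-grams capture a language's morphology rather than its vocabulary.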

Any other ideas, pointers to a library I may have missed, or offers to collaborate are very welcome. Please also note that my intention in writing this post is not only to rant about the lack of a well-maintained, ready-to-use library, but especially to raise awareness about language identification. I'd love to see this feature all around the desktop, just like (and in combination with) spell-checking, which is already omnipresent.

The Red Hat Way

I’ve just used one of the GUADEC USB sticks for the first time and found that, in addition to several PDF brochures and a bunch of wallpapers, it includes a Red Hat commercial. Excellent as always.

http://www.youtube.com/watch?v=ySyPIoyXJ-k

 