[Dbix-class] Advices needed on creating multilingual application

Mark Lawrence nomad at null.net
Wed Sep 19 11:58:24 GMT 2007


On Wed Sep 19, 2007 at 11:50:26AM +0300, mbit at ukr.net wrote:

> Let's assume an easy application. We have a blog. But there are
> three versions int three languages: en, de and ru for example. But
> when we don't have a translation for the en part, we show original
> part. This way first we add original post in de, and then if we
> translate that part we add translation. And when a user views en
> version of the site he sees translated part. Otherwize he
> sees de version. This way in all three languages we have complete
> website, just not everything is always translated, but we know what
> changed and was added in all languages.

The application might be 'easy', but the type of problem is not, otherwise
I expect we would have seen the major blogging/CMS platforms do this long
ago. Unfortunately I think the only ones who have made reasonable
progress are Plone and the Gengo plugin for wordpress.

Although it won't help with your Perl/DBIx::Class issues, looking at
their database schemas might give you some ideas. As you have probably
already found, apart from the (minor) complexity of determining which
record(s) to retrieve, the biggest issue is efficiency.

Here is how I do it (database-side only).

For every type of object that we wish to have translatable components there
are two tables. The first table holds all data that is non-translatable,
plus the original language versions of the translatable data. Eg:

    CREATE TABLE entry (
        id    INTEGER,
        ctime INTEGER,
        lang  VARCHAR(4),
        title VARCHAR(255),
        body  TEXT,
    )

The second table contains the translations of the translatable columns

    CREATE TABLE entry_t (
        id      INTEGER,
        transof INTEGER REFERENCES entry(id),
        ctime   INTEGER
        lang    VARCHAR(4),
        title   VARCHAR(255),
        body    TEXT,
    )

To retrieve the desired translation with fallback to the original
language this query is the most efficient I've found. 

  "find the last 5 blog entries, in Russian but showing their original
  language if no Russian translation available":

    SELECT
        t1.id,
        CASCADE(t2.lang, t1.lang)   AS lang,
        CASCADE(t2.title, t1.title) AS title,
        CASCADE(t2.body, t1.body)   AS body,
    FROM
        entry AS t1
    LEFT JOIN
        entry_t AS t2
    ON
        t1.id = t2.transof AND t2.lang = 'ru'
    ORDER BY
        t1.ctime
    LIMIT 5

There is another way to do this, where all the translations are in the
same table and using self joins, but I don't think it is as clear or
efficient as the above. Of course the above is the simplest case. I tend
to also want columns to specify whether the entry is published, whether
the translation is complete, whether to show the entry at all if there
is no translation for the selected language (ie hidden from other
language groups) and so on.


>     lang => {
>         data_type     => 'character varying',
>         default_value => undef,
>         size          => 2,
>         is_nullable   => 0

By the way, not all language codes are limited to two characters in
length.

> My question is if there is a better approach to do the same stuff. Because i
> faced a lot of difficulties in prefetching related object like creator where i
> had to do something like this:
> 
> prefetch => { i18n => [ qw/ creator / ] }

Oh, forgot about that. Something like this works.

    CREATE TABLE users (
        id    INTEGER,
        name  VARCHAR(255),
    )

    CREATE TABLE entry (
        id    INTEGER,
        ctime INTEGER,
        author INTEGER REFERENCES users(id),
        lang  VARCHAR(4),
        title VARCHAR(255),
        body  TEXT,
    )

    CREATE TABLE entry_t (
        id      INTEGER,
        transof INTEGER REFERENCES entry(id),
        ctime   INTEGER
        lang    VARCHAR(4),
        title   VARCHAR(255),
        body    TEXT,
    )

    SELECT
        t1.id,
        t3.name,
        CASCADE(t2.lang, t1.lang)   AS lang,
        CASCADE(t2.title, t1.title) AS title,
        CASCADE(t2.body, t1.body)   AS body,
    FROM
        entry AS t1
    LEFT JOIN
        entry_t AS t2
    ON
        t1.id = t2.transof AND t2.lang = 'ru'
    LEFT JOIN
        users AS t3
    ON
        t3.id = t1.author
    ORDER BY
        t1.ctime
    LIMIT 5


> Or worse when we have a related object that has a translation
> too... That's a big mess for me.

On the SQL side you can juse keep on adding CASCADEs and JOINs as many
times as you want. Unfortunately this doesn't really translate well to
pure Object(table)-based mappers, so I'll leave it to others on this list
to explain how you can do that with DBIC. (I couldn't find a nice way).

Hope that helps.

Mark.
-- 
Mark Lawrence



More information about the DBIx-Class mailing list