[Catalyst] advise on data processing in Cat/DBIC/Model
Matt S Trout
dbix-class at trout.me.uk
Mon Nov 26 19:04:24 GMT 2007
On Mon, Nov 26, 2007 at 04:33:02PM +0100, Rainer Clasen wrote:
> Hello,
>
> within my current project, some value is collected up to once a day:
>
> CREATE TABLE a_value (
> day date PRIMARY KEY,
> other_values integer NOT NULL,
> value integer,
> another_value integer
> );
>
> Data comes in a bit sporadically, so I cannot rely on each day having an
> entry. Actually, there may also be longer periods (weeks/months/??) without data.
>
> I'm currently a bit at a loss on how to "properly" cook up this data to
> easily display it in fixed time steps. I'm thinking of a list of *all*
> days/weeks/month/... in a certain timerange. Such a list would allow the
> view easy access to present the data (say as html table with one row per
> time step or as input for GD::Graph).
>
> This means there are basically two tasks:
> - aggregate the data for each time step: No-brainer with DBIx::Class.
> - get NULL entries for time steps without data: The interesting part.
>
> I can come up with the following solutions to generate the NULL entries:
>
> - use a SQL stored procedure or temp table with the start-dates of the
> desired time-steps, do an outer join and stuff this in a DBIC
> result_source as described in the DBIC cookbook under "arbitrary SQL".
>
> example query for ->name():
> SELECT
> d.id,
> steps AS day,
> d.value,
> COALESCE( d.other_value, $4 ) AS other_value
> FROM
> timeseries( $1, $2, $3) AS steps
> LEFT JOIN ( SELECT * FROM data WHERE other_value = $4 ) d
> ON ( d.day >= $2 AND d.day + $1 < $3 );
> $1 = time steps. eg. '1 day'
> $2 = start date. eg. '2007-11-1'
> $3 = end date. eg '2007-11-30'
> $4 = other_value to filter on.
> timeseries(step,start,end) = stored procedure that returns the
> start-dates of the time-steps within the specified time-range.
I tend to do -sort- of this.
Except that instead of using a function like timeseries() I'll create a
pivot table with a 'date' column that I prepopulate with all dates from
now to say 2020 (and make sure one of my cron jobs extends this when we
reach say 2019 or so). Then I put function indexes on the various DATE_PART
or equivalent expressions that I might use to pull the month, year etc.
That way I can query the pivot as "just another DBIC class" and everything
gets simpler.
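
A minimal sketch of the pivot table idea, for illustration only. It uses
Python's built-in sqlite3 module in place of PostgreSQL and DBIC so that it
runs as-is. The table name "calendar", the trimmed-down a_value table, the
sample rows and the helper function names are all assumptions, not taken from
the original schema. Function indexes are left out, and strftime() stands in
for DATE_PART.

```python
import sqlite3
from datetime import date, timedelta


def build_demo_db():
    """Create an in-memory database with a prepopulated calendar table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical pivot table: one row per day, no gaps.
        CREATE TABLE calendar (day TEXT PRIMARY KEY);
        -- Trimmed-down stand-in for the sporadic data table.
        CREATE TABLE a_value (
            day   TEXT PRIMARY KEY,
            value INTEGER
        );
        """
    )
    # Prepopulate the calendar with every day of Nov and Dec 2007 (61 days).
    start = date(2007, 11, 1)
    days = [((start + timedelta(days=i)).isoformat(),) for i in range(61)]
    conn.executemany("INSERT INTO calendar (day) VALUES (?)", days)
    # Made-up sample data with gaps between entries.
    conn.executemany(
        "INSERT INTO a_value (day, value) VALUES (?, ?)",
        [("2007-11-02", 10), ("2007-11-05", 7), ("2007-11-20", 3)],
    )
    return conn


def daily_series(conn, start, end):
    """One row per calendar day in [start, end); value is None on gaps."""
    return conn.execute(
        """
        SELECT c.day, v.value
        FROM calendar c
        LEFT JOIN a_value v ON v.day = c.day
        WHERE c.day >= ? AND c.day < ?
        ORDER BY c.day
        """,
        (start, end),
    ).fetchall()


def monthly_totals(conn, start, end):
    """One row per month in [start, end); total is None for empty months."""
    return conn.execute(
        """
        SELECT strftime('%Y-%m', c.day) AS month, SUM(v.value) AS total
        FROM calendar c
        LEFT JOIN a_value v ON v.day = c.day
        WHERE c.day >= ? AND c.day < ?
        GROUP BY month
        ORDER BY month
        """,
        (start, end),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in daily_series(conn, "2007-11-01", "2007-11-08"):
        print(row)
    print(monthly_totals(conn, "2007-11-01", "2008-01-01"))
```

Because the calendar is an ordinary table, the outer join always starts from
it, so every time step shows up in the result whether or not data exists for
it. The same queries should map onto a DBIC class for the calendar table with
a relationship to the data table joined as a LEFT JOIN.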
--
Matt S Trout Catalyst and DBIx::Class consulting and support -
Technical Director http://www.shadowcat.co.uk/catalyst/
Shadowcat Systems Ltd. Christmas fun in collectable card game form -
http://www.shadowcat.co.uk/resources/2007_trading/