[Catalyst] Pager and Cache

Mitch Jackson perimus at gmail.com
Wed May 21 20:17:32 BST 2008


>> For a simple common example, a page from a long list of users.  To use
>> $last_id_plus_1, I would first need a numeric id on my user table
>> (unlikely, as the username is already a unique identifier so adding an
>> autoincrement INT would be pointless
>
> been bitten by this many times. I add INTs as row ids no matter how unique
> i think any data will be. Because sometimes... well, you know how it goes.
> what if i have to change the username?

I've used this approach before, but it's definately not One-Size-Fits-All.
You make a good point.  It is possible to change a username in that
case, by issuing an update statement to each table.  This is not so
bad, as long as it is a rare occurence.

>> and break normalization
> why would that be ?
As I understand it: Creating a new unique identifier when you already
have one, especially for it's own sake, breaks normalization.  If
usernames must be unqiue there's your identifier.  Normalized doesn't
always mean most efficient.


> Moreover, if the username is your FK in any table, you're
> wasting tons of space and possibly memory if the server is set right.
It's true that an INT is more efficiently indexed than a CHAR by the
DBMS, so that looks more efficient on the surface.  However, this can
be introducing more inefficiency than is immediately obvious.

An example out of an application I maintain, reports are generated off
sales records.  These reports may be filtered and sorted by
salesperson(FK username) or other username fields, and the PHB may
want to see a table of rows including the salesperson, assigned
customer service agent, perhaps underwriting agent or sales manager
for each row.

In this case, if I were to use an INT instead of a username for these
fields, I would have to join the users table 4 times for each row to
display the usernames!  Now the slight efficiency gained using an INT
user_id has become an efficiency problem when working with reports
where the username need be known.

Here again is an example that breaks the $id_plus_1 paging approach.
If we sort by salesperson,date there is no usable $Last_PK_ID_+1 magic
value to find the right page for this query.   Or, if one can be
somehow engineered, it will be a lot of work proprietary to only this
one report, and the code will be more complex and harder to understand
and modify when the maintenance coder comes around.

/Mitchell K. Jackson



More information about the Catalyst mailing list