[Catalyst] C::P::PageCache patch for reducing duplicate processing
Marcello Romani
mromani at ottotecnica.com
Mon Jun 26 08:14:27 CEST 2006
Wade.Stuart at fallon.com wrote:
>
> catalyst-bounces at lists.rawmode.org wrote on 06/23/2006 11:04:07 AM:
>
>> Wade.Stuart at fallon.com wrote:
>>>
>>>
>>>
>>>> Wade.Stuart at fallon.com wrote:
>>>>>
>>>>>
>>>>>> Perrin Harkins wrote:
>>>>>>> Toby Corkindale wrote:
>>>>>>>> One of the aims of my patch is to avoid the expense of having
>>>>>>>> numerous processes produce the same code simultaneously. So on
>>>>>>>> the initial bunch of requests, I'll still try and have the code
>>>>>>>> delay all-but-one of them from building the page.
>>>>>>> Presumably this only happens when a new cached page suddenly
>>>>>>> becomes available and is instantly in high demand. It's not very
>>>>>>> frequent. In my opinion, that isn't worth the danger of messing
>>>>>>> with something as potentially troublesome as locks that block page
>>>>>>> generation, but I suppose no one is forced to use the locking.
>>>>>> Good point.
>>>>>> I'll try and implement the features so they can be enabled
>>>>>> separately.
>>>>> I will second the "I don't think it is worth it" case. 99% of the
>>>>> time caching is set at startup, and the only time the case you are
>>>>> coding for is hit is on the first page load, if a second request
>>>>> comes in for the same page before the page build from the first hit
>>>>> is done. That seems like such an outside case that I would be against
>>>>> all that extra locking and special-case code, even if it is an option.
>>>> Could this condition be triggered by the user hitting "Reload" or "Go"
>>>> many times while waiting for the page?
>>> Yes, my case statement was general on purpose. If a user or multiple
>>> users make multiple requests for the page, and the requests are the
>>> first ones that happen after the server is started, multiple builds
>>> would happen
>> not only when the server is (re)started, but also when the cached page
>> expires
>
> No, cache expire/rebuild is covered by the algorithm I submitted earlier --
> only one rebuild happens, during which other requests serve the old copy.
> After the one rebuild is done, the new copy is served for all requests
> until the next expire period starts the process over again. The scope of
> the potential problem is truly limited to:
>
> Multiple hits, in which the timing of those hits is quicker than the
> initial build of the page, to a page that has not been hit and cached
> since the server was restarted.
This lowers the need for code that takes care of the special case IMHO
(production services are not supposed to be restarted very often).
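If I understand it correctly, the "serve the old copy while one request
rebuilds" behaviour Wade describes could be sketched roughly like this.
It is not the actual C::P::PageCache code: build_page() and the
per-process %cache are made up for illustration, and a real multi-process
deployment would need a shared cache backend plus a shared "rebuilding"
flag instead of a plain hash.

    use strict;
    use warnings;

    my %cache;   # key => { body => ..., expires => ..., rebuilding => ... }

    sub build_page {
        my ($key) = @_;
        return "rendered body for $key at " . time;   # stand-in for the real build
    }

    sub fetch_page {
        my ($key, $ttl) = @_;
        my $now   = time;
        my $entry = $cache{$key};

        # Fresh copy in the cache: serve it directly.
        return $entry->{body} if $entry && $now < $entry->{expires};

        if ($entry && !$entry->{rebuilding}) {
            # Stale copy exists: let only this request rebuild; everyone
            # else keeps getting the old body until the rebuild is done.
            $entry->{rebuilding} = 1;
            my $body = build_page($key);
            $cache{$key} = { body => $body, expires => $now + $ttl };
            return $body;
        }

        # A rebuild is already in flight: keep serving the stale copy.
        return $entry->{body} if $entry;

        # Cold cache (e.g. right after a restart): nothing to fall back
        # on, so this request builds the page itself -- the only window
        # where several requests can end up building simultaneously.
        my $body = build_page($key);
        $cache{$key} = { body => $body, expires => $now + $ttl };
        return $body;
    }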
>
>>> until the first build is done and stored in cache. After the cache is
>>> populated there is no window of opportunity for the case to exist
>>> (given my non-blocking method is implemented). That seems to me like
>>> very minimal exposure and an acceptable "startup cost" for a site.
>> I agree.
>>
>>> Throwing in locking and all that baggage to avoid the outside case
>>> (or all the logic to allow the option of the locking) just seems to
>>> go against KISS.
>>>
>> I agree on this point too.
>>
>> I also think that the problem at hand could show up only on the most
>> requested pages of a heavy-traffic site.
>
> True, and only after a service restart.
>
>
>> Also, it would have a significant impact only on undersized hardware
>> IMHO, because it would cause a spike in memory and CPU utilization that
>> would cease as soon as the first copy of the page is produced, as you
>> said.
>> In other words, I think some benchmarks would be required to know how
>> much of a problem this... problem really is.
>
> I will not argue with benchmark results, but you can always just logically
> deduce the maximum vector too -- in this case it is small.
>
> In addition to these changes, I may also start looking into two more
> cache types for Cat.
>
> 1: A session-based cache, so that heavy, user-customized pages can be
> cached uniquely per session -- this would allow apps that have
> session-specific pages to use caching (which is not really possible with
> PageCache right now, as I read the code). I see this as a big win because
> currently you need to manage heavy (meaning expensive) customized pages
> to lower their build cost, since there is no cache available -- with a
> cache you can arrange to build such a page only once per 15 seconds or
> whatever, and afford a heavier page.
Seems clever to me.
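A per-session key could be as simple as folding the session id into the
cache key. A rough sketch (the key format and names below are made up,
not how PageCache builds its keys today):

    use strict;
    use warnings;
    use Digest::MD5 qw(md5_hex);

    # Give each user their own cached copy of an expensive customized page
    # by including the session id in the key.
    sub session_cache_key {
        my ($session_id, $path, $query) = @_;
        $query = '' unless defined $query;
        # md5 keeps the key short and safe for backends that limit key length
        return 'pagecache:' . md5_hex(join '|', $session_id, $path, $query);
    }

    # e.g. cache the page for 15 seconds per session, so a heavy customized
    # page is built at most once per user per 15-second window:
    my $key = session_cache_key('abc123', '/account/summary', 'tab=orders');
    # then: $cache->set($key, $body, 15) with whatever backend is in use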
>
> 2: A pagecache daemon system that rebuilds expired cache entries in a
> queue behind the scenes, inserting the results into the cache. This
> would prevent the blocking on the hit that comes right after the expire
> period, while the long-building page is built.
>
Or just issue a request for the expired page yourself, so that when the
first real client request for that page comes in, the page has already
been cached. (Just a thought :-)
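I mean something as simple as a cron script that hits the expensive URLs
a bit more often than the cache expire time, so the rebuild cost is
rarely paid by a real visitor (the URL list here is obviously made up):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::Simple qw(get);

    # Pages whose cache we want to keep warm; run this from cron slightly
    # more often than the PageCache expire time.
    my @hot_pages = (
        'http://localhost:3000/',
        'http://localhost:3000/news',
    );

    for my $url (@hot_pages) {
        my $body = get($url);
        warn "warm-up request for $url failed\n" unless defined $body;
    }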
>
> _______________________________________________
> List: Catalyst at lists.rawmode.org
> Listinfo: http://lists.rawmode.org/mailman/listinfo/catalyst
> Searchable archive: http://www.mail-archive.com/catalyst@lists.rawmode.org/
> Dev site: http://dev.catalyst.perl.org/
>
>
--
Marcello Romani
IT Manager
Ottotecnica s.r.l.
http://www.ottotecnica.com
More information about the Catalyst mailing list