[cgl_discussion] Re: [Cgl_proof_subgroup] FYI: timer scalability whitepaper url
george anzinger
george at mvista.com
Thu Sep 5 18:27:38 PDT 2002
Andy Pfiffer wrote:
>
> On Thu, 2002-09-05 at 15:42, george anzinger wrote:
> > Whoo, wait a nano second. Andy's paper analized what I call
> > the "cascade timers". High-res-timers require a new timer
> > list structure which I described somewhere (see below).
>
> Yes, I saw the nice graphs posted earlier.
>
> At
> > this time you can still "CONFIG" the cascade timer list, but
> > I consider that a warm fuzzy that will go away once we feel
> > "comfortable" with the new structure. The high-res option
> > REQUIRES the new list structure so to be able to enable the
> > "cascade timer" list you must turn off the high-res option.
>
> Well, for what it's worth, the head of the CGLE repository won't build
> with high-res configured out (yes, there is a bug filed).
Uh, have a number? Do we now have a new Bugzilla? Awk!!
>
> >From a scalability standpoint, I'm not too concerned with fundamental
> changes to the data structure, provided it doesn't make things worse.
>
> Currently, the algorithm for insertion, modification, and removal is of
> O(1) complexity. SMP lock contention appears to be correlated to both
> the length of a "cascade timer" list as well as the average wall-clock
> execution time of invoked callbacks (1 long-running callback could cause
> as much contention as 20,000 simultaneously firing very short
> callbacks).
Actually, I have been told, that if you take the cascade in
to account it is not really O(1). There was some discussion
on this on the high-res-timers discussion list.
Yes, it is these callbacks that give Real Time OS venders
nightmares. There is no real control...
> The cost for maintaining the "sorted" property of the data
> structure is spread out over time (roughly 1 workload-related burst of
> processing every N clock interrupts, and so on). If the distribution of
> expiration times is evenly distributed into the future, the "cascade
> timer" algorithm really isn't all that bad.
>
> On the other hand, if the system has been asked (directly or indirectly)
> to maintain, say, 30,000 timers with identical expiration times set for
> 10 weeks into the future, then 30,000 timers need to be carried forward
> until they fire. In this scenario, the current "cascade timer"
> algorithm as configured will cause no more than 4 (maybe fewer) cascade
> operations over a span of 10 weeks.
I think the real issue with such a load would be the expire
time when all the call backs need to be made. But this is a
problem not matter what list structure is used.
>
> I understand that from a real-time point of view, that "blurp" while the
> system relinks 30,000 data structures might be a Very Bad Thing. I
> couldn't find a good description of a real-life carrier-grade workload
> and what awful things it might do to the timer algorithm, so I'll just
> toss up my hands at this point and say I don't have much of evidence of
> it being a Very Bad Thing. ;^)
For what its worth, I did not move away from the cascade
list just because it is cascade. It was more related to
needing to keep an ordered list in any case (high-res has
entries between jiffies). It is also easier to throw more
memory at the new list than at the cascade list.
>
> Andy
--
George Anzinger george at mvista.com
High-res-timers:
http://sourceforge.net/projects/high-res-timers/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml
More information about the cgl_discussion
mailing list