[cgl_discussion] Re: [Cgl_proof_subgroup] FYI: timer scalability whitepaper url
george at mvista.com
Thu Sep 5 18:27:38 PDT 2002
Andy Pfiffer wrote:
> On Thu, 2002-09-05 at 15:42, george anzinger wrote:
> > Whoo, wait a nanosecond. Andy's paper analyzed what I call
> > the "cascade timers". High-res-timers require a new timer
> > list structure, which I described somewhere (see below).
> Yes, I saw the nice graphs posted earlier.
> > At this time you can still "CONFIG" the cascade timer list, but
> > I consider that a warm fuzzy that will go away once we feel
> > "comfortable" with the new structure. The high-res option
> > REQUIRES the new list structure, so to enable the
> > "cascade timer" list you must turn off the high-res option.
> Well, for what it's worth, the head of the CGLE repository won't build
> with high-res configured out (yes, there is a bug filed).
Uh, do you have a bug number? Do we now have a new Bugzilla? Awk!!
> From a scalability standpoint, I'm not too concerned with fundamental
> changes to the data structure, provided it doesn't make things worse.
> Currently, the algorithm for insertion, modification, and removal is of
> O(1) complexity. SMP lock contention appears to be correlated to both
> the length of a "cascade timer" list as well as the average wall-clock
> execution time of invoked callbacks (1 long-running callback could cause
> as much contention as 20,000 simultaneously firing very short ones).
Actually, I have been told that if you take the cascade into
account it is not really O(1). There was some discussion
of this on the high-res-timers discussion list.
Yes, it is these callbacks that give real-time OS vendors
nightmares. There is no real control...
> The cost for maintaining the "sorted" property of the data
> structure is spread out over time (roughly 1 workload-related burst of
> processing every N clock interrupts, and so on). If the distribution of
> expiration times is evenly distributed into the future, the "cascade
> timer" algorithm really isn't all that bad.
> On the other hand, if the system has been asked (directly or indirectly)
> to maintain, say, 30,000 timers with identical expiration times set for
> 10 weeks into the future, then 30,000 timers need to be carried forward
> until they fire. In this scenario, the current "cascade timer"
> algorithm as configured will cause no more than 4 (maybe fewer) cascade
> operations over a span of 10 weeks.
I think the real issue with such a load would be at the expiry
time, when all the callbacks need to be made. But this is a
problem no matter what list structure is used.
> I understand that from a real-time point of view, that "blurp" while the
> system relinks 30,000 data structures might be a Very Bad Thing. I
> couldn't find a good description of a real-life carrier-grade workload
> and what awful things it might do to the timer algorithm, so I'll just
> toss up my hands at this point and say I don't have much evidence of
> it being a Very Bad Thing. ;^)
For what it's worth, I did not move away from the cascade
list just because it cascades. It was more related to
needing to keep an ordered list in any case (high-res has
entries between jiffies). It is also easier to throw more
memory at the new list than at the cascade list.
George Anzinger george at mvista.com