[Soekris] sc1100 TSC bug and scx200_hrt

Jim Cromie jim.cromie at gmail.com
Sun Oct 8 02:24:12 UTC 2006


David Zelinsky wrote:
> Jim Cromie <jim.cromie at gmail.com> writes:
>
>   
>> hi Alexander, David, Ted,
>>
>> thanks for your help.
>> The fix is in 18-mmX (probably temporarily, until it's added to 19-rc1),
>> and has been forwarded to -stable for consideration/inclusion into 18.1
>>     
>
> Thanks, Jim, for finding and fixing this bug.  I installed the patch
> and it seems to work, as did the suggestion of passing mhz27=1 as
> argument to the scx200_hrt module.
>
>   
Actually, you found it (and reported it).  I had mhz27=1 in
/etc/modprobe.d/ and forgot about it, so I hadn't
tested the default in a while :-}

>> here's a brief explanation of things (in case it helps you isolate any
>> more bugs ;-)
>>     
>
> [-snip-]
>
>   
>> I don't at present have any current NTP numbers to share here, though
>> David posted these, which I re-send.
>> If you get better numbers with scx200_hrt, please post, along with uptime.
>> Also, send same for pit, tsc.  I suspect tsc's slowness might be
>> evident in the last 4 numbers below.
>>     
>
> The posted numbers were with scx200_hrt as clocksource.  Does TSC's
> slowness still affect it?
>
>   
no - as long as the tsc is detected as 'running slowly', it cannot foul 
timekeeping.
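
If you want to confirm which clocksource the kernel actually settled on, a
rough sketch along these lines should do.  It assumes the same sysfs
directory that appears later in this thread (clocksource0), plus the
available_clocksource file that sits next to current_clocksource.  The
function names and the base_dir parameter are made up for illustration.

```python
from pathlib import Path

# sysfs directory for the generic clocksource interface.
SYSFS_CLOCKSOURCE = "/sys/devices/system/clocksource/clocksource0"


def read_clocksources(base_dir=SYSFS_CLOCKSOURCE):
    """Return (current, available) clocksource names read from sysfs."""
    base = Path(base_dir)
    current = (base / "current_clocksource").read_text().strip()
    available = (base / "available_clocksource").read_text().split()
    return current, available


def tsc_in_use(base_dir=SYSFS_CLOCKSOURCE):
    """True if the kernel is currently keeping time from the TSC."""
    current, _ = read_clocksources(base_dir)
    return current == "tsc"
```

On an affected box, tsc_in_use() returning False is what you want to see.
base_dir is only a parameter so the functions can be tried against a
scratch directory.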

>> % ntpq -pcrv
>>      remote           refid      st t when poll reach   delay   offset  jitter
>> ==============================================================================
>> *portico.dedekin 198.6.255.249    2 u   64  128  377    1.369   -4.837   0.778
>>  LOCAL(0)        LOCAL(0)        13 l   25   64  377    0.000    0.000   0.004
>>     
>
>
> This was with scx200_hrt as clocksource without the new patch, but
> with the module called with mhz27=1, which made it run at normal speed
> rather than 27 times too fast.
>
>
>
> With the TSC clocksource, the jitter is so bad it can't keep ntp sync
> at all:
>
> % ntpq -pcrv
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  portico.dedekin 198.6.255.249    2 u   64  256  377   47.200  -454738 304606.
> *LOCAL(0)        LOCAL(0)        13 l   52   64  377    0.000    0.000   0.015
>
>   
That makes sense - it's consistent with what Ted Phelps saw, graphed, and
reported here (way back when).
With the inclusion of GTS, it became simple to fix (John Stultz did the
hard parts).

>
> With the PIT clocksource I got numbers like these:
>
> % ntpq -pcrv
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *portico.dedekin 198.6.255.249    2 u   12   64  377    1.297   58.347  15.520
>  LOCAL(0)        LOCAL(0)        13 l   10   64  377    0.000    0.000   0.015
>
>
>   
I suspect that PIT works better than this indicates - your poll is still
only 64 secs; a fully settled NTP would be polling at 1024 sec.
The PIT is expensive to read (several ISA bus cycles -
*very* slow compared to a single rdtsc instruction),
but I can't quite swallow that it could cause 15 ms of jitter.  ICBW, and
it doesn't matter - scx200_hrt works now ;-)
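
To compare the peer lines for the different clocksources without
eyeballing columns, something like the sketch below could pull poll,
offset and jitter out of one row of the `ntpq -p` billboard.  It assumes
the ten-column layout shown in this thread and raw ntpq output (no "> "
quote prefix).  The Peer type, the function names, and the "settled" test
(selected peer, poll at ntpd's default maximum of 1024 s) are my own
illustration, not anything ntpq provides.

```python
from typing import NamedTuple


class Peer(NamedTuple):
    tally: str     # first character of the row; '*' marks the sync peer
    remote: str
    poll: int      # seconds between polls
    delay: float   # milliseconds
    offset: float  # milliseconds
    jitter: float  # milliseconds


def parse_peer_line(line):
    """Parse one data row of the `ntpq -p` billboard."""
    tally, fields = line[0], line[1:].split()
    if len(fields) != 10:
        raise ValueError("expected 10 columns, got %d" % len(fields))
    remote, _refid, _st, _t, _when, poll, _reach, delay, offset, jitter = fields
    return Peer(tally, remote, int(poll),
                float(delay), float(offset), float(jitter))


def is_settled(peer, max_poll=1024):
    """Heuristic: synced to this peer and polling at the maximum interval."""
    return peer.tally == "*" and peer.poll >= max_poll
```

By that heuristic the PIT row above (poll 64) is not settled yet, which is
why I'd hold off judging its jitter.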

> Currently, with the newly patched scx200_hrt module, I see this:
>
> % uptime
>  20:35:33 up 21:37,  1 user,  load average: 0.03, 0.05, 0.01
>
> % cat /sys/devices/system/clocksource/clocksource0/current_clocksource 
> scx200_hrt 
>
> % ntpq -pcrv
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *portico.dedekin 198.6.255.249    2 u  251 1024  377    1.814    0.300   0.125
>  LOCAL(0)        LOCAL(0)        13 l   55   64  377    0.000    0.000   0.004
> assID=0 status=0664 leap_none, sync_ntp, 6 events, event_peer/strat_chg,
> version="ntpd 4.2.0a at 1:4.2.0a+stable-2-r Fri Aug 26 10:30:12 UTC 2005 (1)"?,
> processor="i586", system="Linux/2.6.18-soekris-patched-1", leap=00,
> stratum=3, precision=-18, rootdelay=36.268, rootdispersion=69.574,
> peer=58612, refid=192.168.0.10,
> reftime=c8ced0eb.a71a9fbe  Wed, Oct  4 2006 20:31:39.652, poll=10,
> clock=0xc8ced1e6.8ef5fd47, state=4, offset=0.300, frequency=153.532,
> noise=0.190, jitter=0.125, stability=1.048
>
>
>   

yay.  thanks.

One of these months, I'll get around to graphing those variables over a
day of operation for each clocksource.  This would presumably make the
characteristics crystal clear, but it's not a huge priority for me atm.
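
For anyone who wants to beat me to it, here is a minimal sketch of the
parsing half: it turns the name=value list printed by `ntpq -c rv` (as in
David's output above) into numbers that could be logged once a minute and
graphed later.  The function names and the choice of fields are
assumptions of mine; collecting the text (e.g. by running ntpq from cron)
and the plotting are left out.

```python
import re

# name=value pairs; a value is either a "quoted string" or runs up to the
# next comma or whitespace.
_VAR = re.compile(r'(\w+)=("[^"]*"|[^,\s]+)')


def parse_rv(text):
    """Return a dict of the name=value pairs found in `ntpq -c rv` output."""
    return {name: value.strip('"') for name, value in _VAR.findall(text)}


def timekeeping_sample(text,
                       fields=("offset", "frequency", "jitter", "stability")):
    """Pick out the variables worth graphing, converted to floats."""
    rv = parse_rv(text)
    return {name: float(rv[name]) for name in fields}
```

Appending one such sample per minute to a file, tagged with the current
clocksource, would give a day's worth of data per clocksource to plot.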

BTW, the fix is in 2.6.19-rc1, and will probably get into 2.6.18.1 when
it's released
(if I'm paying attention when it's opened).

