Log entry time 22:43:25 on July 26, 2000

Entry number 47579

keyword=re: high deadtimes (what to do)

This evening I was called and asked why the deadtimes
were repeatedly going to 70% even after `kcoda'. I looked
at the adaqs2 system using `top' and found that I/O wait was
high (typ. 80%) while the DAQ deadtime was 70%. When I killed
the MSS copying, the deadtime recovered to 23%. The next file
then started getting copied, and the deadtime remained stable
for several minutes. To gather more info, I suggest the shift
do the following when seeing high deadtimes. Run `top' (by
typing `top') and write down the system parameters such as
kernel usage, free swap, and I/O wait. Next, temporarily halt
the copying to MSS using the following commands:
`ps -ef | grep etcp' (this lists the PID of the copy program
`etcp'). The PID is the first number on the line, right after
the user name. Ignore the line for the `grep' command itself.
Let's say the PID is 1614. Next:
kill -STOP 1614 (this must be done from the `adaq' account)
See if the deadtime recovers. After a while, resume copying:
kill -CONT 1614 (again from the `adaq' account on adaqs2).
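The stop/resume steps above could be wrapped in a small script so
the shift does not have to copy the PID by hand. This is only a
sketch, not something installed on adaqs2. The function names
`etcp_pids' and `pause_and_resume' and the 120 second default are
made up for illustration, and it assumes the copy process shows up
as `etcp' in `ps -ef' and that you are logged in as `adaq'.

```shell
#!/bin/sh
# Sketch only: helper functions to pause and resume the MSS copy.
# Nothing here is an existing adaqs2 tool; names are hypothetical.

# Print the PID (2nd column of `ps -ef') of every process whose
# ps line contains "etcp". Writing the pattern as [e]tcp keeps
# grep from matching its own command line. Note this is a
# substring match, so check the output before acting on it.
etcp_pids() {
    ps -ef | grep '[e]tcp' | awk '{print $2}'
}

# Suspend a process, hold it stopped, then let it continue.
#   $1 = PID to pause
#   $2 = seconds to keep it stopped (default 120, an arbitrary choice)
pause_and_resume() {
    pid=$1
    hold=${2:-120}
    kill -STOP "$pid" || return 1
    echo "stopped $pid; watch the deadtime now"
    sleep "$hold"
    kill -CONT "$pid" || return 1
    echo "resumed $pid"
}
```

Possible usage: run `etcp_pids' to see the PID, say it prints 1614,
then `pause_and_resume 1614 180' and note the deadtime while the
copy is stopped. Unlike `kproc etcp', STOP/CONT should not truncate
the file being copied, since the process is only suspended.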
Another thing to try, but not too often, is `kproc etcp'.
It's a bad thing to do often (it leaves a truncated file which
I must recover by hand), but it gives some extra info about
how (and if) `etcp' causes deadtime. One speculation
is that during unstable beam the algorithm gets fooled but
things recover when it restarts afresh on the next file (??).
Another possibility is that `etcp' needs some tuning.

User name R. Michaels
