Running CODA on Spectrometers DAQ Systems

Robert Michaels, rom@jlab.org, Jefferson Lab Hall A, updated Sept 19, 2002

This file: hallaweb.jlab.org/equipment/daq/guide2.html

I. NORMAL DAQ OPERATIONS

This assumes that runcontrol is running. If not, see Cold Start in section III.

Use the "a-onl" account on adaql2.

The run coordinator should know the passwords, which are also posted on a sheet of paper on the wall near the whiteboard.

To start and stop runs, push the "Start Run" and "End Run" buttons. "Start Run" is the same as the sequence Prestart, Go.

To change configurations, use the "Run Type" button. If you've been running, you will first have to push the "Abort" button. Choose the configuration from the dialog box, then press "Download". The relevant configurations are:
- COINC -- Normal config for reading both spectrometers (singles and coinc trig)
- UNBUFF -- Same as COINC, but unbuffered (slower but safer)
- PEDRUN -- A pedestal run
- WITHROC14 -- Same as UNBUFF, but includes the ROC14 crate

II. FREQUENTLY ASKED QUESTIONS

Where is the data? To find run 1602, type "find_run 1602" on any adaq computer from one of the adaq accounts. Another command, "lastrun", prints where the latest run is. The data are on /adaql2/dataN (N = 1, 2, 3, ...).

What is the deadtime? The deadtime is displayed in the "datamon" window -- make sure you are using the one on the same computer as runcontrol. Datamon is probably already running near the runcontrol window. If it is not running, type "datamon".

Prescale factors? Go to ~a-onl/prescale and edit the file "prescale.dat".

Disk handling? There are several hundred Gbytes of disk for data taking. Automatic scripts choose the disk to write data to, copy the data to MSS, delete the data when it's safe, and avoid simultaneous read/write on a disk. Things to *NOT* do:
- Do *NOT* erase or move data files (even if you think it's junk).
- Do *NOT* try to change run numbers.
- Do *NOT* change names of run files.
- Do *NOT* try to put data by hand into MSS.
It takes a couple of hours for files to appear in MSS, and the files typically remain on the adaq disks for 2 or 3 days.

Pedestal runs: The purpose of a pedestal run is to obtain the pedestal files used by fastbus for pedestal suppression. If you've been running the normal spectrometer DAQ configuration, you'll need to press "Abort", then "Run Type", and select the PEDRUN configuration. Press "Download", then "Start Run". Run for about 5000 events, then press "End Run". You may check the pedestal files in ~a-onl/ped (ped1.dat and ped2.dat). See the README there. After the pedestal run, change back to the running configuration. NOTE: Do NOT try to use PEDRUN for anything other than pedestal determination -- it is confusing since the prescale factors are in a different file, the data goes to disk 6, etc. If you want a run with pedestal suppression turned off, see the README.

III. REBOOTING STUFF

HOW TO SHUT DOWN OR REBOOT WORKSTATIONS

Rarely, the workstations don't function properly and the simplest way out is to reboot. To reboot adaqs2 or adaqs3 (SunOS), log in as "adaq" and type "reboot". In a few minutes the workstation comes back alive. To shut down on SunOS, log in as "adaq" and type "shutdown". After several minutes the screen goes black; wait a minute more, then power off. For Linux, Ole usually keeps some instructions posted near the PC terminal. One may hit Ctrl-Alt-F1 to go to console mode, then Ctrl-Alt-Del.

QUICK RESETS

Problems with CODA 2.2 can usually be solved with a simple reset or with a Cold Start. If not, call Bob Michaels or Bodo Reitz. Do NOT waste an hour stuck on resets.

If a ROC seems to be hung up, you can reboot it by going to the "Components" workspace and typing "reboot" at the vxWorks prompt (-> reboot). Wait 2 minutes, then telnet back in to verify it's alive. The name of the ROC computer is normally written in the title of the xterm (hallasfi1, hallasfi2, etc). You need to know which subset of these computers is used for your configuration. For example, for e00007 we use:
- hallasfi1 (R-arm fastbus ROC1)
- hallasfi2 (R-arm fastbus ROC2)
- hallasfi3 (L-arm fastbus ROC3)
- hallasfi4 (L-arm fastbus ROC4)
- hallavme2 (R-arm Scaler Crate TS0)
- hallavme4 (L-arm Scaler Crate TS1)
- and perhaps hallavme1 (BPM/raster crate ROC14)

If the ROC seems really frozen, use the "Crate Resets" button in the magnet EPICS screen on hac. Some labelling in this GUI: roc14=bpm/raster, roc1=R-arm fastbus1 (this also boots roc2), roc3=L-arm fastbus1, roc4=L-arm fastbus2, TS0=R-arm Trig Super, TS1=L-arm Trig Super. NOTE: Both fastbus crates on an arm are reset with one button. To reset from the GUI, toggle the state of the button.
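The e00007 host list above can be kept as a small lookup table, which is handy when deciding which xterm to use for a given crate. This is only a sketch: the hostnames and crate descriptions come from this guide, but the table layout and the function names (describe_roc, hosts_for_arm) are hypothetical and not part of CODA or any Hall A script.

```python
# Hypothetical lookup table built from the e00007 example in this guide.
# Hostnames and descriptions are from the text; the layout is illustrative.
E00007_ROCS = {
    "hallasfi1": "R-arm fastbus ROC1",
    "hallasfi2": "R-arm fastbus ROC2",
    "hallasfi3": "L-arm fastbus ROC3",
    "hallasfi4": "L-arm fastbus ROC4",
    "hallavme2": "R-arm Scaler Crate TS0",
    "hallavme4": "L-arm Scaler Crate TS1",
    "hallavme1": "BPM/raster crate ROC14",  # used only in some configurations
}


def describe_roc(hostname):
    """Return the crate description for a ROC hostname, or None if unknown."""
    return E00007_ROCS.get(hostname)


def hosts_for_arm(arm):
    """Return the sorted hostnames whose description starts with 'R-arm' or 'L-arm'."""
    prefix = arm.upper() + "-arm"
    return sorted(h for h, d in E00007_ROCS.items() if d.startswith(prefix))
```

For example, hosts_for_arm("L") lists the three L-arm hosts to check after an L-arm crate reset. Other experiments may use a different subset of crates, so the table would need to be adjusted.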

If you reboot the ROC, or if something on the workstation is hung up, try pushing the "Reset" button in runcontrol. Then Configure, Download, and StartRun as usual.

If a "quick" reset doesn't work, try a Cold Start (see below).

COLD START

First kill all CODA processes on the workstation where CODA is running by typing "kcoda" from anywhere on the relevant account and computer. This script stops runcontrol, the event builder, the event recorder, and the runcontrol server, and cleans up the ET system. Now you can start everything again by typing "runcontrol". NOTE: For Linux-based DAQ, as we are using for HRS, one should start "rcServer" interactively before starting "runcontrol". This avoids some problems like "ER not connected". Just type "rcServer" in a window and leave it. Since upgrading to RedHat 7.3 it has also become necessary to run "coda_er" interactively (otherwise it sometimes fails to open a data file, and we don't know why). The instructions are printed by the "kcoda" script when it finishes.

Before downloading, it is a good idea to first make sure the fastbus and VME crates (i.e. the ROCs) are running. When resetting, one can frequently avoid rebooting the frontend crates and just restart runcontrol. But if you must, reset the ROCs according to the procedure described in the QUICK RESETS section above. An alternative and convenient way to reset (reboot) the ROCs is to go to the Components workspace and enter "reboot" at the vxWorks prompt (->) of each ROC. After a few minutes, telnet back in and verify they are up. Some of the crates take longer than others to reboot. Be patient. However, if the ROCs are really frozen, press the reset buttons (as explained above) instead.

To start the runcontrol GUI on the workstation, type "runcontrol". Then press the "Connect" button. After connecting, wait 10 seconds, then press "Run Type"; a dialog box pops up and you must choose the configuration you want, which is probably "COINC" for buffered mode or "UNBUFF" for unbuffered mode. Then press "Download" and wait about 30 seconds. Now you can press "Start Run" to start a run.

Reminders: For Linux-based DAQ, one must start "rcServer" by hand before starting "runcontrol". Also, when recovering from a DAQ crash, you must press the "Reset" button in the runcontrol GUI after configuring and before "Download". Lately we've needed to run coda_er interactively.
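The Cold Start sequence can be summarized as an ordered checklist. The sketch below only builds the list of steps as text; it does not drive CODA in any way. The function name cold_start_steps and its arguments are hypothetical. The position of the coda_er step is an assumption, since this guide only says to run it interactively; follow the instructions that "kcoda" prints.

```python
# Configurations named in section I of this guide.
KNOWN_CONFIGS = ("COINC", "UNBUFF", "PEDRUN", "WITHROC14")


def cold_start_steps(config="COINC", after_crash=False):
    """Return the Cold Start steps, in order, as a list of strings.

    config      -- configuration to choose in the "Run Type" dialog
    after_crash -- if True, include the "Reset" step needed after a DAQ crash
    """
    if config not in KNOWN_CONFIGS:
        raise ValueError("unknown configuration: %r" % (config,))
    steps = [
        'Type "kcoda" to kill all CODA processes',
        'Type "rcServer" in a window and leave it running',
        # Assumed position; kcoda prints the actual instructions.
        'Start "coda_er" interactively as instructed by kcoda',
        'Type "runcontrol" to start the GUI',
        'Press "Connect" and wait 10 seconds',
        'Press "Run Type" and choose %s' % config,
    ]
    if after_crash:
        # After a crash: reset after configuring and before download.
        steps.append('Press "Reset"')
    steps.append('Press "Download" and wait about 30 seconds')
    steps.append('Press "Start Run"')
    return steps


if __name__ == "__main__":
    for number, step in enumerate(cold_start_steps("UNBUFF", after_crash=True), 1):
        print("%d. %s" % (number, step))
```

Keeping the "Reset" step conditional mirrors the reminder above: it is only required when recovering from a DAQ crash.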

If you ever log out of the DAQ computer adaql2 or have rebooted it, here is how to restart the preferred setup. Log in as "a-onl". Start emacs in the background: emacs ~a-onl/prescale/prescale.dat & Also start datamon by typing "datamon". Next, in the "Components" workspace, log in to all the frontend computers by typing, e.g., "telnet hallasfi1", where the name of the computer (like hallasfi1) is in the title of the xterm window. The title of each xterm also gives the portserver port where you can connect via RS232 (portserver instructions are at hallaweb.jlab.org/equipment/daq/portserver.html). So, an example is an xterm with the title "ROC2--hallasfi2--hatsv3-port-8", meaning ROC2 has network name hallasfi2 and is on portserver hatsv3 at port 8. Note: If these xterm windows for components are not there, type "setupxterms" to bring them up.
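The xterm title convention can be decoded mechanically. The sketch below assumes every title follows the layout of the single example given here, "ROC2--hallasfi2--hatsv3-port-8"; the function name parse_xterm_title and the returned field names are hypothetical.

```python
import re

# Layout assumed from the one example in this guide:
#   <ROC name>--<network name>--<portserver>-port-<port number>
TITLE_RE = re.compile(
    r"^(?P<roc>[^-]+)--(?P<host>[^-]+)--(?P<server>[^-]+)-port-(?P<port>\d+)$"
)


def parse_xterm_title(title):
    """Split an xterm title into ROC name, host, portserver, and port."""
    match = TITLE_RE.match(title.strip())
    if match is None:
        raise ValueError("title does not follow the assumed layout: %r" % (title,))
    return {
        "roc": match.group("roc"),
        "host": match.group("host"),
        "portserver": match.group("server"),
        "port": int(match.group("port")),
    }


if __name__ == "__main__":
    info = parse_xterm_title("ROC2--hallasfi2--hatsv3-port-8")
    # The telnet target for this window is the host field.
    print("telnet %s" % info["host"])  # prints: telnet hallasfi2
```

A title that does not match raises ValueError rather than guessing, since a wrong host or port would send you to the wrong crate.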

IV. THINGS THAT GO WRONG WITH DAQ or COMPUTERS

Lots of deadtime -- Please see hallaweb.jlab.org/equipment/daq/dtime_faq.html

Computer/Network problems: Try calling Bob Michaels (pager 584-7410). You may also call the computer-center expert on call (cell phone 876-1794). For problems with the Linux boxes or ESPACE, ask Ole Hansen, though Bob can probably help too. For problems with slow controls or hac, ask Javier Gomez. Bodo Reitz and Dave Abbott are Bob's substitutes when he is out of town.

The Event Recorder (ER) or Event Builder (EB) may complain with something like "ER1 not responding" on the Linux version of CODA, and you get no events. This may be because you forgot that, before running "runcontrol", one must run "rcServer" interactively from the login shell of the relevant account on the relevant computer ("a-onl" on "adaql2").

No data file is opened? You'll notice, e.g., that datamon does not see the present file. This is a recent problem that started with the RedHat 7.3 upgrade. The problem has never been observed, however, when coda_er is run interactively. The instructions to do so are printed when you run "kcoda".

How to program the RAM of the ROCs: When fastbus loses power, it may be necessary to reload the VME RAM (it isn't as non-volatile as it should be). Up-to-date info is kept at ~adev/doc/vmeram.doc. Portserver instructions are at ~adaq/doc/portserver.doc and are also online.

If Bob is unavailable, some details about recovery are at jlabs1:/home/rom/doc/recovery.doc

More info: hallaweb.jlab.org/equipment/daq/guide.html and hallaweb.jlab.org/equipment/daq/daq_trig.html.

This page maintained by Robert Michaels rom@jlab.org