
    User name S. Sirca

    Log entry time 00:14:37 on January 21, 2001

    Entry number 55334

    keyword=HOWTO get out of a CODA crash

    At the end of run #1586, CODA failed to terminate the run, with ROC1,
    ROC2 and ROC14 apparently hanging. Tried to reboot the ROCs by hand and
    do the kcoda-runcontrol-configure-download sequence, but it did not work.
    Called Bob to fix the situation remotely. There is a well-defined procedure
    for bringing CODA back into a clean state:

    1) If the problem seems to be caused by a single ROC crashing, reboot that
    ROC. After the reboot, type `i' at the ROC prompt. The process called
    `coda_roc' should be in the PEND(ing) or READY state.
    Then try to `reset', `configure' and `download' the trigger and start a new run.
    In many cases, this is sufficient.

    2) If this does not help, reboot the relevant ROC, exit CODA, run `kcoda'
    and make sure that all related processes are dead. Also check for the
    file /tmp/et_sys_tst1 and delete it if it is there. Then start runcontrol,
    connect to the server, configure the trigger to `twospect', then `reset' and
    `configure' again, and you should be able to start the run.

    3) It can happen that the connection to several ROCs seems to be lost.
    In that case it can be a good idea to reboot ALL of them. In addition,
    if you were too eager in killing CODA-related processes, the
    component xterms may be gone as well. In this case, open an xterm in the
    `components' workspace and run `setupxterms'. This sets up 6 xterms for the ROCs.
    Close the ROC3 window, since ROC3 is not used in this experiment.
    Then log in to each of the five ROCs, using the names on the window title bars:

    telnet hallasfi1 / hallasfi2 / hallavme1 / halladaq1 / halladaq4

    When their prompts show up, type `i' to see if the `coda_roc' processes
    are in the PEND(ing) or READY state. If so, start runcontrol, connect to
    the server, configure the trigger, `reset' and `configure' again, and you
    should be able to start the run.
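
The mechanical checks in steps 1) to 3) can be sketched as a few shell helpers. This is only an illustration, not part of CODA: the function names are hypothetical, and the task-list parser assumes that the output of `i' has the task name in the first column and the task state in the fifth, which should be verified against the real ROC output before relying on it. The file path and host names are the ones given in this entry.

```shell
# Sketch only: the helper names below are hypothetical, not CODA commands.

# Steps 1 and 3: read saved task-list output (from typing i at a ROC
# prompt) on stdin and report the state of coda_roc.
# ASSUMPTION: task name is column 1, task state is column 5.
check_coda_roc() {
    awk '
        $1 == "coda_roc" {
            found = 1
            if ($5 == "PEND" || $5 == "READY") ok = 1
            print "coda_roc state: " $5
        }
        END {
            if (!found) print "coda_roc not found"
            exit (ok ? 0 : 1)
        }
    '
}

# Step 2: delete the leftover file if it exists.
# Run this only after kcoda, once all related processes are dead.
remove_stale_et_file() {
    if [ -e "$1" ]; then
        rm -f "$1" && echo "removed $1"
    else
        echo "no stale file at $1"
    fi
}

# Step 3: print the login command for each ROC in use
# (ROC3 is left out because it is not used in this experiment).
ROC_HOSTS="hallasfi1 hallasfi2 hallavme1 halladaq1 halladaq4"
print_roc_logins() {
    for host in $ROC_HOSTS; do
        echo "telnet $host"
    done
}
```

For example, `remove_stale_et_file /tmp/et_sys_tst1' covers the file check in step 2, and `check_coda_roc < roc1_tasks.txt' checks a task list pasted into a (hypothetical) file roc1_tasks.txt; its exit status is 0 only if coda_roc is in the PEND or READY state.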