Work and Cache Disks in Hall A

Robert Michaels, rom@jlab.org, Jefferson Lab Hall A, Feb 1, 2011

General information about JLab computers (e.g. what a work or cache disk is) is available from the computer center. This page covers the management of Hall A work and cache disks.

WORK DISK on JLab CUE

We have several terabytes of scratch work disk on JLab's Common Unix Environment (CUE). Space is allocated to each experiment based on need, with input from the form you filled out to request computer resources when you submitted your proposal.

Snapshot of existing disks.

To get to your work area, follow the link. For example, for e02013, type "cd /work/halla/e02013" on JLab computers such as ifarml3. If you type "ls" there you will see one or more disks, such as "disk1" and "disk2". Use these links for anything you do; do not use the mount points. To write there, you must be in the Unix group corresponding to your experiment (in this example, a-e02013). Ask me (rom@jlab.org) to be added to a group or for other questions.
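The group and write-permission check described above can be sketched in Python. This is only an illustration: the function names are made up for this example, and the path and group name are the e02013 values quoted in the text. Run it on a CUE machine where /work is mounted.

```python
import grp
import os


def current_group_names():
    """Return the names of all Unix groups of the current process."""
    names = []
    for gid in set(os.getgroups()) | {os.getegid()}:
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # Group id with no entry in the group database; skip it.
            pass
    return names


def in_group(group_name, group_names=None):
    """True if group_name is among group_names (default: our own groups)."""
    if group_names is None:
        group_names = current_group_names()
    return group_name in group_names


def can_write(path):
    """True if path is an existing directory that we may write to."""
    return os.path.isdir(path) and os.access(path, os.W_OK)


if __name__ == "__main__":
    experiment = "e02013"               # example experiment from the text
    group = "a-" + experiment           # group naming as in the example above
    work_dir = os.path.join("/work/halla", experiment)
    print("in group", group, ":", in_group(group))
    print("can write to", work_dir, ":", can_write(work_dir))
```

If the first line prints False, ask to be added to the group; if the directory does not exist on your machine, both checks simply report False.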

Disk cleanup

Periodically we do a massive cleanup of work disks (the last one was Feb 1, 2011).
Ancient history: in March 2009 we decommissioned old work disks. Expect this to happen periodically as well.

IMPORTANT: Work disks are scratch space and have no backups. Plan for the possibility that files on /work might be deleted or the disk might die. Keep important files in /home or in MSS.
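As one possible way to act on this advice, the sketch below copies small, hard-to-regenerate files (scripts, databases) from a work area to a directory under /home. The file patterns, the size limit, and the function name are assumptions chosen for illustration, not a Hall A convention. Copying to MSS is not shown because it goes through the computer center's own tools.

```python
import fnmatch
import os
import shutil


def backup_small_files(src_dir, dest_dir,
                       patterns=("*.C", "*.py", "*.db"),
                       max_bytes=10 * 1024 * 1024):
    """Copy files matching patterns and no larger than max_bytes.

    The directory layout under src_dir is preserved under dest_dir.
    Returns the sorted list of copied paths, relative to src_dir.
    """
    copied = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            if not any(fnmatch.fnmatch(name, p) for p in patterns):
                continue
            src = os.path.join(root, name)
            # Skip broken links, special files, and anything too large.
            if not os.path.isfile(src) or os.path.getsize(src) > max_bytes:
                continue
            rel = os.path.relpath(src, src_dir)
            dst = os.path.join(dest_dir, rel)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also keeps the timestamps
            copied.append(rel)
    return sorted(copied)
```

For example, backup_small_files("/work/halla/e02013/disk1/myname", os.path.expanduser("~/e02013_backup")) would copy matching files into your home directory; "myname" is a placeholder. Keep the size limit small, since /home has its own quota.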

ADAQ WORK DISKS

Thanks to Ole Hansen, we also have several TB of work disk on the adaq Linux cluster (adaql machines, l = linux). Disks are /adaqlN/workM where N = 1,2... and M = 1,2,3... The running experiment may keep stuff there, e.g. huge root files, etc. But these disks are not backed up. Also, a cleanup script might operate there if the disk fills up. At the end of the experiment, the adaq work disks are cleaned up (files erased) to make room for the next experiment.
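Since a cleanup script might run when a disk fills up, it can help to see how full the adaq work disks are. The sketch below finds directories that follow the /adaqlN/workM naming described above and reports the free fraction of each. The function names are hypothetical, and the root argument exists only so the code can be tried on a test directory tree.

```python
import os
import re
import shutil

_ADAQ_RE = re.compile(r"^adaql(\d+)$")   # matches adaql1, adaql2, ...
_WORK_RE = re.compile(r"^work(\d+)$")    # matches work1, work2, ...


def find_adaq_work_disks(root="/"):
    """Return paths of the form root/adaqlN/workM, sorted by (N, M)."""
    disks = []
    for top in os.listdir(root):
        m = _ADAQ_RE.match(top)
        top_path = os.path.join(root, top)
        if not m or not os.path.isdir(top_path):
            continue
        try:
            subdirs = os.listdir(top_path)
        except OSError:
            continue  # unreadable or unmounted; skip it
        for sub in subdirs:
            w = _WORK_RE.match(sub)
            sub_path = os.path.join(top_path, sub)
            if w and os.path.isdir(sub_path):
                disks.append((int(m.group(1)), int(w.group(1)), sub_path))
    return [path for _n, _m, path in sorted(disks)]


def free_fraction(path):
    """Fraction (0 to 1) of the filesystem holding path that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total if usage.total else 0.0


if __name__ == "__main__":
    for disk in find_adaq_work_disks():
        print("%s: %.0f%% free" % (disk, 100 * free_fraction(disk)))
```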

CACHE DISK

In addition to /work, there are two kinds of /cache disk. One kind is "hidden" from users and is used to feed the batch farm from data in MSS. Another kind is open to users and is the temporary repository for MSS data. The latter disks are /cache/mss/halla/EXPERIMENT where EXPERIMENT corresponds to different experiments. See the Computer Center Scicomp pages.

My response to "I want more disk!"

Here is my response to the common request "Please give me more disk immediately." A short technical justification via e-mail from the spokesperson is needed. But it might be impossible. The computer center buys large disk servers and partitions them by quotas. Some adjustment of quotas may be possible if some experiments are not using their quota. The procurement cycle takes several months and we have not had great budgets lately.
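A justification is easier to write with numbers in hand. The sketch below totals the bytes stored under a directory and compares the total with a quota. The quota is something you supply yourself, since no actual quota values are given here, and the function names are made up for this example. Note that it sums apparent file sizes, which can differ somewhat from the blocks actually allocated on disk.

```python
import os


def dir_usage_bytes(path):
    """Sum the sizes of all regular files under path (links are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or is unreadable; ignore it
    return total


def quota_fraction(used_bytes, quota_bytes):
    """Return used/quota as a fraction; quota_bytes must be positive."""
    if quota_bytes <= 0:
        raise ValueError("quota_bytes must be positive")
    return used_bytes / quota_bytes
```

For instance, quota_fraction(dir_usage_bytes("/work/halla/e02013/disk1"), quota) with your own quota in bytes gives the fraction in use. Walking a large work disk this way can take a long time.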

The argument that disk is cheap does not really work because: 1) if we bought very cheap disks the failure rate would be too high for the huge amount of disk we have; and 2) significant costs include the server that houses the disk, space, power, and especially sysadmin time. It ends up being about 10 times more expensive than the disk you're thinking about.

It is also possible to stage your huge output to MSS. Hall B does it that way, as it is impossible for them to store all output on /work.

How quotas are decided: An important input is the computer resource request form that the spokesperson filled out when submitting their proposal. And they almost always underestimate the needs at this stage. Otherwise it's my decision (which I don't enjoy), based on the size of the experiment, how much data, and how active they are in analysis.

Feel free to offer suggestions or better solutions. Note that some users have occasionally told me they want to go around me and discuss this problem with management. Answer: go ahead! I don't mind. If management wants to buy more disk or tell me to do something different, I will of course listen.
