Info About Work and Cache Disks in Hall A

Robert Michaels, rom@jlab.org, Jefferson Lab Hall A, updated April 2004


General information about JLab computers (e.g. what a work or cache disk is) is available from the Computer Center. This page covers the management of Hall A work and cache disks.

We have 5 Terabytes of scratch work disk on JLab's Common Unix Environment (CUE). It is divided into physical partitions: 16 of 240 GBytes and 2 of 480 GBytes. Typically the 240 GByte partitions are dedicated to "contemporary experiments"; the two larger ones are for everyone (including contemporary experiments). Contemporary experiments include the one running, the one or two upcoming, and recent previous experiments.

To get to your work area, contemporary or not, follow the link under /work/halla. For example, for e98108 you "cd /work/halla/e98108" on JLab computers. You can write there only if you are in the Unix group corresponding to your experiment, in this case a-e98108. Ask me to be added to a group, or with any other questions. Some experiments have multiple disk areas, like /work/halla/e97103/disk1 and /work/halla/e97103/disk2; usually one of these is the "dedicated" disk and the other points to a region on a big "shared" disk.
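The access rule above can be checked from a script. The following is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes that every experiment's group follows the a-<experiment> pattern seen in the a-e98108 example.

```python
import grp
import os


def user_group_names():
    """Return the names of the Unix groups of the current process."""
    names = set()
    for gid in set(os.getgroups()) | {os.getgid()}:
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            # This gid has no entry in the group database; skip it.
            pass
    return names


def expected_group(experiment):
    """Group name assuming the a-<experiment> pattern, e.g. a-e98108."""
    return "a-" + experiment


def can_write(experiment, base="/work/halla"):
    """True if the work area exists and the current user may write to it."""
    path = os.path.join(base, experiment)
    return os.path.isdir(path) and os.access(path, os.W_OK)


if __name__ == "__main__":
    exp = "e98108"
    print("need group:", expected_group(exp))
    print("in group:  ", expected_group(exp) in user_group_names())
    print("writable:  ", can_write(exp))
```

If "in group" is False, that is the point at which to ask to be added to the group.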

A "cleanup" script maintains free space on the work disks. In simple terms, the largest and oldest files are deleted nightly, if necessary, to keep some space free. For the dedicated disks, the spokespersons can elect to have me turn off the cleanup script, but it always runs on the big shared disks. The cleanup script is described in the appendix below.

IMPORTANT: Work disks have no backups. Keep on /work only things that you don't mind losing. Do not confuse /work, which is scratch space, with the central /home fileserver.

When an experiment loses its dedicated disk, its data are squeezed onto one of the big general-purpose disks, after first running "cleanup" with strict parameters to make space. Afterwards you can still follow the same link to get to your data.



Thanks to Ole Hansen, we also have several hundred GBytes of work disk on the adaq Linux cluster (adaql machines, l = linux). The disks are /adaqlN/workM where N = 1,2... and M = 1,2,3... The running experiment may keep files there, e.g. huge root files. But these disks are not backed up, and a cleanup script operates there too (see appendix). At the end of the experiment, the adaq work disks are cleaned up (files erased) to make room for the next experiment.


In addition to /work, there are two kinds of /cache disk. One kind is "hidden" from users and is used to feed the batch farm from data in MSS. The other kind is open to users and is the temporary repository for MSS data. We presently have 1 TByte of the latter kind, spread among /cache/halla/EXPERIMENT, where EXPERIMENT is the name of an experiment. See the Computer Center Scicomp pages.



APPENDIX -- Cleanup scripts

For WORK Disks on JLab CUE -- Every night the disks are checked. If a disk is less than 95% full, nothing happens. If it is more than 95% full, files will be deleted as follows: Initially, files greater than NB bytes are considered, then NB/(10**i), i=1,2,3...(loop) where NB is big. I.e., first we consider 10 GByte files, then 1 GByte, then 100 MByte, etc. At each level of filesize, files are sorted by last usage; oldest files are considered first. A file is deleted only if it has not been used within NDAYS (= 7). After each deletion the disk usage is checked; when it falls below 90% the deletion stops. This script can be disabled for the dedicated disks at the request of a spokesperson, but cannot be disabled for the general (shared) disks.
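The policy just described can be sketched as a pure function that only plans deletions and never touches a disk. This is an illustration under assumptions, not the actual cleanup script: the names FileInfo and plan_cleanup are hypothetical, "last usage" is taken to be a timestamp such as the access time, and the loop over file sizes is assumed to end once the size level drops below one byte.

```python
import time
from dataclasses import dataclass

DAY = 86400  # seconds


@dataclass
class FileInfo:
    path: str
    size: int          # bytes
    last_used: float   # seconds since the epoch (e.g. st_atime)


def plan_cleanup(files, capacity, now=None, start_pct=95.0, stop_pct=90.0,
                 ndays=7, nb=10 * 10**9):
    """Return the paths the described policy would delete, in order.

    files    : iterable of FileInfo for everything on the disk
    capacity : total disk size in bytes
    """
    now = time.time() if now is None else now
    remaining = list(files)
    used = sum(f.size for f in remaining)
    doomed = []
    if 100.0 * used / capacity <= start_pct:
        return doomed  # disk is not full enough; nothing happens

    level = nb
    while level >= 1:
        # Files above this size level, least recently used first.
        tier = sorted((f for f in remaining if f.size > level),
                      key=lambda f: f.last_used)
        for f in tier:
            if now - f.last_used < ndays * DAY:
                continue  # used within NDAYS: keep it
            doomed.append(f.path)
            remaining.remove(f)
            used -= f.size
            if 100.0 * used / capacity < stop_pct:
                return doomed  # enough space has been freed
        level //= 10  # next level: NB/10, NB/100, ...
    return doomed
```

Because size levels are visited from large to small, a big file that has been idle for NDAYS is chosen before a smaller file that has been idle for longer.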


For ADAQ WORK DISKS in counting room.
The following algorithm will maintain space on the ADAQ Linux scratch disks
/adaqlN/dataM (N=1,2..) (M=1,2..).   
Every night each disk is checked.  If less than 95% full, nothing happens.  
If greater than 95% full, files will be deleted as follows:  Initially,
files greater than NB bytes are considered, then NB/(10**i), i=1,2,3...(loop)
where NB is big.  I.e., first we consider 10 Gbyte files, then 1 Gbyte, then
100 Mbyte, etc.  At each level of filesize, files are sorted by last usage; 
oldest files considered first.  A file is deleted ONLY IF it has not been
used within NDAYS (= 14). After each deletion the disk usage is checked; 
when it falls below 80% the deletion stops.  To prevent infinite looping,
the script quits trying after a number of files are considered; e.g. if the
users "touch" all files within NDAYS, nothing gets deleted.  The sorting
by filesize tends to avoid removing source files. To further avoid this
(and to avoid deleting "data") a file is not deleted if it contains a
key word like ".dat" or ".cxx" etc.  At the end of an experiment
anything might be deleted to make room
for the next experiment.
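The two safeguards specific to the ADAQ disks (protected key words and giving up after a bounded number of files) can be sketched as follows. This is a hedged illustration only: the names are hypothetical, the key word is assumed to be matched against the file name, only ".dat" and ".cxx" are taken from the text (the full list is not given), and the cap of 1000 files is an invented placeholder.

```python
import time

DAY = 86400  # seconds

# Only ".dat" and ".cxx" are named in the text; the real list is longer.
PROTECTED_KEYWORDS = (".dat", ".cxx")


def is_protected(path, keywords=PROTECTED_KEYWORDS):
    """True if the file name contains any protected key word."""
    return any(k in path for k in keywords)


def deletable(files, now=None, ndays=14, max_considered=1000):
    """Yield paths that pass the ADAQ checks, least recently used first.

    files          : iterable of (path, last_used_epoch_seconds)
    max_considered : hypothetical cap, so the script quits instead of looping
    """
    now = time.time() if now is None else now
    considered = 0
    for path, last_used in sorted(files, key=lambda f: f[1]):
        if considered >= max_considered:
            break  # quit trying
        considered += 1
        if is_protected(path):
            continue  # looks like source code or data: never delete
        if now - last_used < ndays * DAY:
            continue  # used within NDAYS: keep it
        yield path
```

In a full script these candidates would be drawn per size level and deleted one at a time until usage falls below 80%.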
 

Usage on Assigned Hall A Work Disks
===================================
Snapshot on April 5, 2004.  Do not use the /w mount.
Instead use the links /work/halla...  
In some cases when you "cd /work/halla/experiment"
you will see links like "disk1", "disk2", "disk3", etc.
Typically "disk1" is a link to a dedicated disk and
the others are links to areas on shared disks.
The links may change over time as new experiments replace old.


  /w mount      Size (GBytes)   % used   experiment
  /w/work1601   480             95%      shared
  /w/work1602   240             43%      e99117
  /w/work1603   240             93%      e98108
  /w/work1604   240             42%      e97103
  /w/work1605   240             91%      e00102
  /w/work1606   240             71%      e99114
  /w/work1607   240             35%      e99007
  /w/work1701   480             86%      shared
  /w/work1702   240             31%      e94104
  /w/work1703   240             13%      gdh
  /w/work1704   240             72%      parity
  /w/work1705   240             92%      e01001
  /w/work1706   240             55%      e01020
  /w/work1707   240             84%      e00007
  /w/work2404   240              1%      e04012
  /w/work2603   240              1%      e00110
  /w/work2703   240              1%      e03106
  /w/work2704   240              1%      shared
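As a cross-check of the snapshot, the short sketch below totals the partition sizes and lists the disks above a given fill level. The numbers are copied from the table; the sizes are taken to be in GBytes, and the function names are hypothetical.

```python
# (mount, size in GBytes, percent used) from the April 5, 2004 snapshot.
SNAPSHOT = [
    ("work1601", 480, 95), ("work1602", 240, 43), ("work1603", 240, 93),
    ("work1604", 240, 42), ("work1605", 240, 91), ("work1606", 240, 71),
    ("work1607", 240, 35), ("work1701", 480, 86), ("work1702", 240, 31),
    ("work1703", 240, 13), ("work1704", 240, 72), ("work1705", 240, 92),
    ("work1706", 240, 55), ("work1707", 240, 84), ("work2404", 240, 1),
    ("work2603", 240, 1), ("work2703", 240, 1), ("work2704", 240, 1),
]


def total_size(rows):
    """Total capacity in GBytes."""
    return sum(size for _, size, _ in rows)


def used_size(rows):
    """Approximate used space in GBytes (the percentages are rounded)."""
    return sum(size * pct / 100.0 for _, size, pct in rows)


def over(rows, pct_limit):
    """Mounts whose usage is strictly above pct_limit percent."""
    return [name for name, _, pct in rows if pct > pct_limit]


if __name__ == "__main__":
    print("total GBytes:", total_size(SNAPSHOT))
    print("used GBytes: ", round(used_size(SNAPSHOT), 1))
    print("above 95%:   ", over(SNAPSHOT, 95))
    print("above 90%:   ", over(SNAPSHOT, 90))
```

With these values the total is 4800 GBytes, which matches the roughly 5 Terabytes quoted earlier, and no disk is strictly above 95%, although work1601 sits exactly at that level.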
  


Below is the table of CUE work disks assigned to Hall A experiments.

Always follow the link (shown in the middle column), because it could change to point to another mount point. To find out which Unix group you need, you can do the following, e.g. for e93049: "ls -la /work/halla | grep e93049" shows that it is a link to /w/work1601; then "ls -la /w/work1601 | grep e93049" shows that you must be in Unix group a-e93049 to write to this disk. Ask me to be assigned to the group.
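The same lookup can be done in one step from a script. This is a minimal sketch using only the Python standard library; the function name work_area_info is hypothetical.

```python
import grp
import os


def work_area_info(path):
    """Return (resolved_path, group_name) for a work-area link."""
    real = os.path.realpath(path)   # follow symbolic links to the mount
    gid = os.stat(real).st_gid      # numeric group owner of the target
    try:
        group = grp.getgrgid(gid).gr_name
    except KeyError:
        group = str(gid)            # no name is known for this gid
    return real, group


if __name__ == "__main__":
    link = "/work/halla/e93049"
    if os.path.exists(link):
        target, group = work_area_info(link)
        print(link, "->", target, "group:", group)
```

The group printed is the one you need to belong to in order to write there, provided the directory is group-writable.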

I remind you that a "dedicated" disk means the physical partition belongs to only that experiment, while a "shared" disk means the link points to an area on a partition which is shared by several other experiments.


Hall A CUE Work Disks
    EXPERIMENT      DISK (link)     SHARED / DEDICATED
    DVCS      /work/halla/dvcs/disk1     dedicated to e00110  
    DVCS      /work/halla/dvcs/disk2     dedicated to e03106  
    DVCS      /work/halla/dvcs/disk3     shared    
    e00007      /work/halla/e00007/disk1     dedicated  
    e00007      /work/halla/e00007/diskN   (N=2,3)   shared  
    e00102      /work/halla/e00102/disk1     dedicated  
    e00102      /work/halla/e00102/diskN   (N=2,3)   shared  
    e01012      /work/halla/e01012/diskN   (N=1,2)   shared  
    e01012      /work/halla/e01012/disk3     dedicated  
    e01001      /work/halla/e01001     dedicated  
    e01020      /work/halla/e01020     dedicated  
    e89003      /work/halla/e89003     shared  
    e89033      /work/halla/e89033     shared  
    e89044      /work/halla/e89044/diskN   (N=1,2)   shared  
    e91004      /work/halla/e91004     shared  
    e91026      /work/halla/e91026     shared  
    e93027      /work/halla/e93027     shared  
    e93049      /work/halla/e93049     shared  
    e93050      /work/halla/e93050     shared  
    e93108      /work/halla/e93108     shared  
    e94010      /work/halla/e94010/diskN   (N=1-3)   shared  
    e94104      /work/halla/e94104/diskN   (N=1-3)   shared  
    e94107      /work/halla/e94107     dedicated
    e95001      /work/halla/e95001     shared  
    e97103      /work/halla/e97103/disk1     dedicated  
    e97103      /work/halla/e97103/disk2     shared  
    e97108      /work/halla/e97108     shared  
    e97111      /work/halla/e97111     shared  
    e98108      /work/halla/e98108     dedicated  
    e99007      /work/halla/e99007     dedicated  
    e99114      /work/halla/e99114/disk1     dedicated  
    e99114      /work/halla/e99114/disk2     shared  
    e99114      /work/halla/e99114/disk3     shared  
    e99117      /work/halla/e99117     shared  
    gammap99      /work/halla/gammap99     shared  
    gdh      /work/halla/gdh     shared  
    gdh      /work/halla/gdh-2     dedicated  
    ndelta      /work/halla/ndelta/diskN (N=1-3)     shared  
    ndelta      /work/halla/ndelta-2     shared  
    HAPPEX (Hyd-I, II, He4)      /work/halla/parity/disk1     dedicated
    HAPPEX (Hyd-I, II, He4)      /work/halla/parity/diskN   (N=2-4)   shared
    HAPPEX (Hyd-I, II, He4)      /work/halla/parity/disk5     dedicated   (ready by Oct 2004)
    e01015 (SRC)      /work/halla/e01015/disk1     dedicated   (ready by Oct 2004)
    e01015 (SRC)      /work/halla/e01015/disk2     shared   (ready by Oct 2004)
    e02013      /work/halla/e02013/disk1     dedicated   (ready by Oct 2004)
    e02013      /work/halla/e02013/disk2     shared   (ready by Oct 2004)