Data handling scripts (to MSS and deletion)
Revision as of 17:35, 12 June 2019
On the ADAQ cluster, running from the adaq account, there are cron scripts which automatically copy the raw data to the MSS and, if necessary, delete files from the local data disks. This has been in place for 20+ years for the HRS DAQ, the Parity DAQ, and the Moller and Compton polarimeters. At times it has also been deployed for other DAQ systems, such as BDX and PEPPO. If you are a maintainer of a DAQ system, please contact me (Bob Michaels), because there is some coordination and there are rules to follow: for example, filenames must be unique and of limited size, and they should contain agreed-upon keywords such as the experiment name.
For most users, the main rule is this: DON'T INTERFERE! Please do not move, delete, or rename raw data files. Especially do not delete anything, even if you don't understand why! Thank you! Also, please do not put any files on a disk with the string "data" in its name. Those "data" disks are for raw data only. There are separate work disks and scratch disks for files that are not raw data, for example ROOT output files.
Briefly, the way the scripts work is as follows:
1. MSS copying: Every file on the data disks is checked to see whether it is already in the appropriate area of the MSS; if not, we put it there using "jput". There is roughly a 1% chance that a given jput fails. These failures are logged, and another script comes along and retries any failed copies. In effect we retry "jput" indefinitely, though in practice I don't think I've ever seen the same file fail twice. Various things can still go wrong, e.g. we logged that the file was copied but it actually was not, or it was copied but there is no duplicate. There are about 15 such rare failure modes. For each of these, an email is sent to Bob Michaels, and I check the log files and fix the problem by hand.
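The copy-then-retry policy above can be sketched as follows. The real scripts shell out to JLab's "jput"; here the put step is passed in as a parameter so the retry logic itself is visible. All function names, paths, and the failure-log representation are illustrative, not the actual script code.

```python
# Hedged sketch of the MSS copy-and-retry logic. "jput" is the real JLab
# mass-storage put command, but everything else here is an assumption.
import subprocess

def jput(local_path, mss_dir):
    """One attempt to copy a raw file into the MSS with jput."""
    return subprocess.run(["jput", local_path, mss_dir]).returncode == 0

def copy_all(files, put, failure_log):
    """First pass: try each file once, logging any failures for a later retry."""
    for f in files:
        if not put(f):
            failure_log.append(f)

def retry_failures(put, failure_log):
    """Later pass: keep retrying logged failures until the log is empty.
    In practice a single retry has always been enough."""
    while failure_log:
        f = failure_log.pop(0)
        if not put(f):
            failure_log.append(f)
```

The real scripts also cross-check the MSS afterward (name, byte count, duplicate copy), since a logged "success" is not trusted blindly; those checks are what trigger the emails for the rare failure modes.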
2. File deletion: A cleanup script first checks disk usage. If the disks are getting full, it starts to delete files, provided they are in the MSS, in duplicate, with the same name and same byte count as the file on the local disk that is to be deleted. We try not to delete files, so the aggressiveness of the cleanup script is tuned to how full the disk is: specifically, how long files may stay on disk is a function of the percentage of disk space available. If the disks get close to full, an email is sent to Bob Michaels.
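The deletion policy can be sketched like this: the retention window shrinks as the disk fills, and a file is only a deletion candidate if the MSS holds a copy with the same name and byte count. The thresholds and retention curve below are made-up illustrations of the idea, not the tuned values in the actual script.

```python
# Hedged sketch of the cleanup policy; all numbers are illustrative.
def retention_days(percent_full):
    """Cleanup gets more aggressive as the disk fills (thresholds invented)."""
    if percent_full < 70:
        return 60   # plenty of space: keep files around for two months
    if percent_full < 85:
        return 14
    if percent_full < 95:
        return 3
    return 0        # nearly full: delete anything that is safely archived

def safe_to_delete(name, size_bytes, age_days, percent_full, mss_catalog):
    """A file may be deleted only if it is older than the current retention
    window AND the MSS copy matches it in name and byte count.
    mss_catalog is an assumed {name: byte_count} view of the MSS area."""
    if age_days < retention_days(percent_full):
        return False
    return mss_catalog.get(name) == size_bytes
```

A size mismatch, a missing MSS entry, or a too-young file all keep the local copy, which matches the conservative "we try not to delete files" stance above.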