Doing ATHENA analysis on the Liverpool Farm 
  Introduction 
This is a collection of information on how to use the Liverpool farm for ATHENA analysis, for example with the LiverpoolAnalysis framework. Most of this I originally learned from Carl and Mike; I simply write down here what seems to work well. If you find incorrect information, please notify me or change the page.
  File Storage 
While it is possible to read (AOD) files from hepstore (i.e. /hepstore/store2/...), this is not recommended for large data sets.
Instead, it is best to store the data on the local mass storage system DPM, under
/dpm/ph.liv.ac.uk/home/atlas/atlasliverpooldisk/. This space is accessible after you have set up your grid environment, e.g. on hepgrid1 using
source /batchsoft/atlas/grid/setup.sh
 Currently there are 20 TB of storage with <2 TB free (check using the command 
dpm-getspacemd). Concurrent access by many hosts from the farm should scale well for this storage area.
You may want to read the computing pages on this subject as well! https://hep.ph.liv.ac.uk/twiki/bin/view/Computing/GridStorageGuide
One drawback is that "standard" Unix file handling commands (ls, rm, cp) will not work. Instead you have to use commands starting with rf or dpns-, e.g.
dpns-ls -l /dpm/ph.liv.ac.uk/home/atlas/atlasliverpooldisk/
rfdir /dpm/ph.liv.ac.uk/home/atlas/atlasliverpooldisk/
 Note that these commands do not accept wildcards ("*"); you have to use small scripts instead.
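As an illustration of such a script (this is only a sketch, not part of any official tooling; the example pattern is made up), a wildcard listing can be emulated by filtering the output of dpns-ls in Python:
import fnmatch
import subprocess

def dpm_glob(directory, pattern):
    # List a DPM directory with dpns-ls and keep the entries matching a shell-style pattern.
    out = subprocess.Popen(["dpns-ls", directory], stdout=subprocess.PIPE,
                           universal_newlines=True).communicate()[0]
    return [directory.rstrip("/") + "/" + name
            for name in out.split() if fnmatch.fnmatch(name, pattern)]

if __name__ == "__main__":
    base = "/dpm/ph.liv.ac.uk/home/atlas/atlasliverpooldisk/"
    for path in dpm_glob(base, "mc08.105802.*"):   # example pattern only
        print(path)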
  Downloading more Data 
As far as I understand, this space is managed by all users - everyone can put new files there or delete (old) files. So we all have to use the commands responsibly (and NOT delete somebody else's files by mistake, for example!). If you want to download new files, there are special options for the dq2-get command, which you can use from hepgrid1 (other hosts are disfavoured for large downloads!):
export dpmbase='/dpm/ph.liv.ac.uk/home/atlas/atlasliverpooldisk/'
dq2-get -k ATLASLIVERPOOLDISK -S srm://hepgrid11.ph.liv.ac.uk:8446/srm/managerv2?SFN=$dpmbase -p lcg mc08.105802.JF17_pythia_jet_filter.recon.AOD.e347_s462_r563/
Using a container identifier like mc08.105802.JF17_pythia_jet_filter.recon.AOD.e347_s462_r563/ will try to download ALL task IDs (tid) of the data set. To load a specific one, use e.g. mc08.105802.JF17_pythia_jet_filter.recon.AOD.e347_s462_r563_tid027563. Note also that you should not try to download very large data sets in one go; a few tens of GB per day should be fine. I have done ~2000 files in one session, which seems to work OK but should not be an everyday action. Often parts of the download fail and files end up with a "__DQ2-xxxx" extension or zero file size. One can use scripts to delete or re-download these.
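A sketch of such a cleanup is given below (the directory is a placeholder you have to adjust, and the dpns-ls -l field layout is assumed to follow the usual ls -l format); check the candidate list carefully before enabling the rfrm call, since deleted DPM files cannot be restored:
import subprocess

def failed_files(dpm_dir):
    # Return DPM paths that look like failed transfers: zero size or a leftover
    # "__DQ2-..." name. Assumes dpns-ls -l prints the size in field 5 and the name last.
    out = subprocess.Popen(["dpns-ls", "-l", dpm_dir], stdout=subprocess.PIPE,
                           universal_newlines=True).communicate()[0]
    bad = []
    for line in out.splitlines():
        fields = line.split()
        if len(fields) < 9 or not fields[4].isdigit():
            continue
        size, name = int(fields[4]), fields[8]
        if size == 0 or "__DQ2-" in name:
            bad.append(dpm_dir.rstrip("/") + "/" + name)
    return bad

if __name__ == "__main__":
    dpm_dir = "/dpm/ph.liv.ac.uk/home/atlas/atlasliverpooldisk/<your dataset>"  # adjust
    for path in failed_files(dpm_dir):
        print("candidate for deletion: " + path)
        # subprocess.call(["rfrm", path])  # only uncomment after checking the list!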
You should also keep in mind that, when trying to load additional files of a data set, the dq2-get command may transfer already available files again, thus stressing the grid without need.
Important: Do not try to move files around on the DPM with e.g. the rfcp command, as this does not set the ATLASLIVERPOOLDISK token properly. It is possible (but not very convenient) to use lcg-cp in this case:
lcg-cp -v --vo atlas -b -D srmv2 -S ATLASLIVERPOOLDISK  srm://hepgrid11.ph.liv.ac.uk:8446/srm/managerv2?SFN=$dpmbase/olddir/filename srm://hepgrid11.ph.liv.ac.uk:8446/srm/managerv2?SFN=$dpmbase/newdir/newfilename
  Using the files in your analysis 
The simplest way to load the files from your job options is to use a syntax like
svcMgr.EventSelector.InputCollections = [
    "rfio:/dpm/ph.liv.ac.uk/home/atlas/atlasliverpooldisk/mc08.106020.PythiaWenu_1Lepton.recon.AOD.e352_s462_r541/AOD.028292._04021.pool.root.1",
    "rfio:/dpm/ph.liv.ac.uk/home/atlas/atlasliverpooldisk/mc08.106020.PythiaWenu_1Lepton.recon.AOD.e352_s462_r541/AOD.028292._04022.pool.root.1",
    ...
]
The function getFileList from LiverpoolAnalysis and the script below already handle this (to some extent) automatically.
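For illustration only (this is not the actual getFileList, whose interface may differ), a helper that builds such a list from a DPM dataset directory could look like this:
from AthenaCommon.AppMgr import ServiceMgr as svcMgr
import subprocess

def rfioFileList(dpm_dir):
    # List the dataset directory on DPM and prefix every pool.root file with "rfio:".
    out = subprocess.Popen(["dpns-ls", dpm_dir], stdout=subprocess.PIPE,
                           universal_newlines=True).communicate()[0]
    return ["rfio:" + dpm_dir.rstrip("/") + "/" + name
            for name in out.split() if ".pool.root" in name]

svcMgr.EventSelector.InputCollections = rfioFileList(
    "/dpm/ph.liv.ac.uk/home/atlas/atlasliverpooldisk/mc08.106020.PythiaWenu_1Lepton.recon.AOD.e352_s462_r541")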
Note that using the rfio protocol directly may lead to very poor performance at times when the network load is high! See the instructions below on how to improve this using FileStager.
  Running Jobs on the Farm 
Jobs have to be submitted from the machine hepcluster, so log on there. There you can use the standard batch commands qsub (submit), qstat (check job status) and qdel (delete jobs). See also the computing pages on this subject.
To create job scripts to run over a large DPM data sample with automatic job splitting, you can use the following scripts, which I got from Carl and modified. 
You will have to adapt things (minimally): 
-  Batch2.py Basic script (modified for use of FileStager and the new topJobOptions, see below; original version Batch.py)
-  submit.sh An example of how to use the above Batch.py
You will need to set up your grid certificate for DPM access (see above or submit.sh).
A random collection of things to observe and know: 
-  While the file list is simply appended to your job options file and you do not need to do anything special with your standard file, some options (output file name, number of events) are set by Python variables (MyOutput, MyEvents in the above case); see the sketch after this list. Check my (new) top option file (older version), which is derived from the LiverpoolAnalysis Z example.
-  The above script creates new directories for storing the submission scripts and the output. These are deleted without warning if you rerun, so take care if you need the old files.
-  If you have large output files, they may not fit into your home area. One solution is to use the /scratch disk space for temporary storage. Be sure to move your important files to a different place later! My original solution was to store them on /hepstore disks, for which I used scp to copy the files to a computer with write access (farm machines have only read access). This works well if you have set up passwordless ssh login (see e.g. here or use google).
-  The above script is set up to use the medium queue, which allows 24h (CPU) time and has ~50 nodes/job slots. There is also a short queue with fewer nodes and max. 1h (CPU) time.
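As an illustration of the variable mechanism mentioned in the first item (the default values and the commented-out histogram stream name "AANT" are just examples, not necessarily what LiverpoolAnalysis uses), the topJobOptions can protect themselves against the variables being undefined like this:
from AthenaCommon.AppMgr import theApp

# Pick up the variables defined by the batch submission script; fall back to
# defaults so that the same options can also be run interactively.
if not 'MyEvents' in dir():
    MyEvents = -1                       # -1 = run over all events
if not 'MyOutput' in dir():
    MyOutput = "LivAnalysis_test.root"  # example default output file name

theApp.EvtMax = MyEvents
# ... and wherever the analysis defines its output file, use MyOutput, e.g.
# svcMgr.THistSvc.Output += ["AANT DATAFILE='%s' OPT='RECREATE'" % MyOutput]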
  Using FileStager to run Jobs on the Farm 
While the "direct" 
rfio access to the files will work, there may be a serious performance drop in situations, when the DPM system is loaded heavily. One way to improve things considerably then, is to use 
FileStager. This will automatically download the files to be analysed in the background, the job can access it directly from the local disk, and eventually the file will be removed. The gain in speed can be up to a factor of 10 or so. Thanks again to John and Carl for helping me to get this working. Below you'll find preliminary instructions.
The FileStager documentation can be found here, but you will probably not need it.
First, you should update FileStager to the latest version (as of now this is FileStager-00-00-34). After setting up ATHENA, do the following in your working directory:
cmt co -r FileStager-00-00-34 Database/FileStager
cd Database/FileStager/cmt
cmt config
source setup.sh
gmake
Then you need to configure things in your topJobOptions. I have extended the LiverpoolAnalysis example options LivZAnalysis/share/LivZBosonExample_topOptions.py with an option UseFileStager. You will also need the additional configuration routine LivTools/python/LivTools_FileStagerConfig.py. Note that I also modified the logic of how the input files are defined. This will also work with my Batch2.py mentioned above; use option -p. The new files are all in CVS.
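Schematically, the switch in the topJobOptions looks something like the sketch below. Note that the helper name configureFileStager and the hard-coded file list are only illustrative stand-ins and not the actual contents of LivTools_FileStagerConfig.py or the example options; check those files in CVS for the real configuration:
from AthenaCommon.AppMgr import ServiceMgr as svcMgr

UseFileStager = True

# Plain DPM paths of the input files (no protocol prefix); in practice this list
# is filled by the batch script or a dpns-ls based helper as sketched earlier.
dpmFiles = [
    "/dpm/ph.liv.ac.uk/home/atlas/atlasliverpooldisk/mc08.106020.PythiaWenu_1Lepton.recon.AOD.e352_s462_r541/AOD.028292._04021.pool.root.1",
]

if UseFileStager:
    # Hypothetical helper standing in for LivTools/python/LivTools_FileStagerConfig.py:
    # assumed to set up the FileStager machinery and return the input names it serves.
    from LivTools.LivTools_FileStagerConfig import configureFileStager
    svcMgr.EventSelector.InputCollections = configureFileStager(dpmFiles)
else:
    # Fall back to direct rfio access as before.
    svcMgr.EventSelector.InputCollections = ["rfio:" + f for f in dpmFiles]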
-- JanKretzschmar - 05 Jun 2009
-- JanKretzschmar - 02 Feb 2009