
Using Ganga to submit grid ntupling jobs

These instructions explain how to use ganga to submit grid ntupling jobs that run AODtoLiverpoolNtuple. I did this at CERN, but there is no obvious reason why it should not work at Liverpool, barring a few changes to the initial setup.

At the moment, the setup is rather complicated, and the submission script is pretty basic. In principle, the same thing could be done more simply by using a job template, or a script which takes arguments such as the DQ2 dataset to use as input.

Step 1: Set up the grid

For lxplus, see the ATLAS Workbook. You need to set up the grid, DQ2 and ganga. Ganga makes heavy use of a directory called ~/gangadir/. Before using ganga for the first time, you may want to get some scratch space and make ~/gangadir a soft link to a directory inside ~/scratch0/.

To get scratch space (extra space which is not backed up), email atlas-support@cernREMOVETHIS.ch and ask for it. Then, create your soft link:
mkdir ~/scratch0/gangadir
ln -s ~/scratch0/gangadir ~/gangadir
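
The same two commands can be scripted so that they are safe to rerun. This is only a sketch in standalone Python 3 (not something to run inside ganga): the function name link_gangadir is hypothetical, and it assumes the scratch area itself already exists.

```python
import os


def link_gangadir(scratch_dir, link_path):
    """Create scratch_dir/gangadir and make link_path a soft link to it."""
    target = os.path.join(scratch_dir, "gangadir")
    # Equivalent of: mkdir ~/scratch0/gangadir (no error if it already exists)
    os.makedirs(target, exist_ok=True)
    if os.path.islink(link_path):
        # A link is already in place, so leave it alone and report its target.
        return os.readlink(link_path)
    if os.path.exists(link_path):
        # A real directory or file is in the way; do not touch it.
        raise RuntimeError(link_path + " already exists and is not a link")
    # Equivalent of: ln -s ~/scratch0/gangadir ~/gangadir
    os.symlink(target, link_path)
    return target
```

Calling link_gangadir(os.path.expanduser("~/scratch0"), os.path.expanduser("~/gangadir")) reproduces the two commands above. If ganga has already created a real ~/gangadir, the function refuses to replace it rather than guessing what to do with the contents.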

Step 2: Avoiding CASTOR

CASTOR is basically dead, but ganga tries to write output there by default. One way around this (still under test!) is to use a different grid Storage Element. I've set up a directory at Manchester with the following command:
edg-gridftp-mkdir gsiftp://dcache01.tier2.hep.manchester.ac.uk/pnfs/tier2.hep.manchester.ac.uk/data/atlas/users/flowerdew
The bit after the :// is the srm directory, which I'll write as srm://<SE>/<dir>/ for the rest of the page.
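
To make the srm://<SE>/<dir>/ notation concrete, here is a small sketch that splits the gsiftp address into its Storage Element and directory parts and rebuilds it with the srm scheme. The function name gsiftp_to_srm is hypothetical, and the sketch simply follows the statement above that the part after the :// is reused unchanged; whether a particular SE wants a port number or a different path for SRM access is not something this page covers.

```python
from urllib.parse import urlsplit


def gsiftp_to_srm(gsiftp_url):
    """Return the srm:// form of a gsiftp:// directory URL."""
    parts = urlsplit(gsiftp_url)
    if parts.scheme != "gsiftp":
        raise ValueError("expected a gsiftp:// URL, got " + gsiftp_url)
    # parts.netloc is the Storage Element (<SE>), parts.path is /<dir>
    return "srm://" + parts.netloc + parts.path


url = ("gsiftp://dcache01.tier2.hep.manchester.ac.uk"
       "/pnfs/tier2.hep.manchester.ac.uk/data/atlas/users/flowerdew")
print(gsiftp_to_srm(url))
```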

Step 3: Prepare ganga

The first time you run ganga, it creates the file ~/.gangarc. This file needs a few edits, most of which just involve uncommenting a line. Line numbers are approximate:

  • Line 33 RUNTIME_PATH = GangaAtlas
  • Line 227 VirtualOrganisation = atlas
  • Line 275 LCGOutputLocation = srm://<SE>/<dir> NB: This doesn't seem to work for me, see below.
  • Line 280 ATLAS_SOFTWARE = /afs/cern.ch/project/gd/apps/atlas/slc3/software
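
Since the line numbers are only approximate, it can be easier to check by key name which of these settings are actually active. The sketch below does that on a piece of text. It assumes that commented-out lines start with # (or ;), it ignores which section a key sits in, and the function name uncommented_settings and the sample text are made up for illustration rather than copied from a real .gangarc.

```python
def uncommented_settings(text, keys):
    """Return {key: value} for each wanted key set on an uncommented line."""
    found = {}
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blanks, comments and section headers such as [LCG].
        if not stripped or stripped[0] in "#;[":
            continue
        key, sep, value = stripped.partition("=")
        if sep and key.strip() in keys:
            found[key.strip()] = value.strip()
    return found


WANTED = ["RUNTIME_PATH", "VirtualOrganisation",
          "LCGOutputLocation", "ATLAS_SOFTWARE"]

# Made-up excerpt for illustration, not a real .gangarc.
sample = """
RUNTIME_PATH = GangaAtlas
#VirtualOrganisation = dteam
VirtualOrganisation = atlas
#LCGOutputLocation = srm://<SE>/<dir>
ATLAS_SOFTWARE = /afs/cern.ch/project/gd/apps/atlas/slc3/software
"""

settings = uncommented_settings(sample, WANTED)
missing = [k for k in WANTED if k not in settings]
print("still commented out or absent:", missing)
```

To run it on the real file, read the text with open(os.path.expanduser("~/.gangarc")).read() after importing os. If the same key name appears in more than one section, this simple version keeps only the last one it sees.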

With the current version (4.2.X) of ganga, jobs fail systematically at certain grid nodes (for example, mars-ce2.mars.lesc.doc.ic.ac.uk) with the error "curl: command not found". This will be fixed, but for now lines such as ExcludedCEs = ic\.ac\.uk need to be added to the LCG section of .gangarc. I am still testing this, so please let me know if it works for you.
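
The escaped dots in that line suggest the value is read as a regular expression and matched against CE names, although this page does not confirm how ganga applies it. Under that assumption, the sketch below shows which hosts the pattern would catch. The second host name is made up for contrast.

```python
import re

# Pattern from the ExcludedCEs line above; the backslashes make each dot literal.
excluded = re.compile(r"ic\.ac\.uk")

# The first name is the failing node mentioned above; the second is a made-up host.
ces = ["mars-ce2.mars.lesc.doc.ic.ac.uk", "ce.example.org"]

for ce in ces:
    status = "excluded" if excluded.search(ce) else "allowed"
    print(ce, status)
```

Without the backslashes, each dot would match any character, so the pattern could also hit unrelated host names.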

Step 4: Customising the submission script

Now look at the attachments on this page. There are only three files, and this is all you need. They are:

  • SubmitGanga.py Delete the .txt from the filename. This is the script that is used to submit the job.
  • AODtoLiverpoolNtuple_JobOptions.py Again, delete the .txt. These are the jobOptions I use; alternatively, use the ones in CVS.
  • AODtoLiverpoolNtuple-00003.tar.gz This contains all the source code and is sent off to define the job. It is based on the output of the job.prepare() method in ganga but is not identical to it: it contains a small but necessary change to make it work at a remote site.

So, taking the SubmitGanga.py file, a few small changes have to be made:

  • Lines 16 and 18: Point these towards wherever you have the tarball and jobOptions, respectively
  • Line 25: Replace with the name of the DQ2 dataset you want to ntuple
  • Line 12: Change as appropriate, depending on the number and size of the AODs...
  • Line 34: Point this towards srm://<SE>/<dir>/ which you chose in step 2.
  • Line 37: This name must match the ntuple name given in the job options, or the output will not copy from the CE to the SE - if you are using the job options on this page, no change is needed. The name Ntuple.root does not need to change from job to job - when copying the ntuples from the grid, you have to specify a full local filename anyway, and so you can easily call the file whatever you like at that stage.
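
As mentioned at the top of the page, the same thing could be done with a script that takes these values as arguments instead of hard-coding them. The sketch below only covers collecting and checking the values. It is standalone Python 3, all option names are hypothetical, and it deliberately does not show how the values would be handed to ganga, because the contents of SubmitGanga.py are not reproduced on this page. The setting on line 12 is left out because its meaning is not spelled out above.

```python
import argparse


def parse_settings(argv):
    """Collect the per-job values that SubmitGanga.py currently hard-codes."""
    parser = argparse.ArgumentParser(description="ntupling job settings")
    parser.add_argument("--tarball", required=True,
                        help="path to the source tarball")
    parser.add_argument("--job-options", required=True,
                        help="path to the jobOptions file")
    parser.add_argument("--dataset", required=True,
                        help="DQ2 dataset to ntuple")
    parser.add_argument("--output-location", required=True,
                        help="srm://<SE>/<dir>/ chosen in step 2")
    parser.add_argument("--output-name", default="Ntuple.root",
                        help="must match the ntuple name in the job options")
    args = parser.parse_args(argv)
    # Catch an obviously wrong output location before anything is submitted.
    if not args.output_location.startswith("srm://"):
        parser.error("--output-location must start with srm://")
    return args


settings = parse_settings([
    "--tarball", "AODtoLiverpoolNtuple-00003.tar.gz",
    "--job-options", "AODtoLiverpoolNtuple_JobOptions.py",
    "--dataset", "some.dq2.dataset.name",  # placeholder, not a real dataset
    "--output-location", "srm://se.example.org/some/dir/",  # placeholder SE
])
print(settings.dataset, settings.output_name)
```

The output name defaults to Ntuple.root because, as noted above, it does not need to change from job to job.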

Step 5: Submitting!

From within ganga:
execfile('SubmitGanga.py')
and wait...

Keep tabs on your jobs by typing jobs, or jobs[n] for more detail, where n is the job number. For information on an individual subjob, type jobs[n].subjobs[m]... You get the idea.

With any luck, the job will work, and output data will be deposited in the directory you set up in steps 2 and 3. See this page for ways of seeing and copying this data back to Liverpool.

-- MikeFlowerdew - 22 Mar 2007
Topic attachments

  • AODtoLiverpoolNtuple-00003.tar.gz (1 MB, 22 Mar 2007 - 15:16, MikeFlowerdew): Tarball containing code
  • AODtoLiverpoolNtuple_JobOptions.py.txt (3 K, 22 Mar 2007 - 15:15, MikeFlowerdew): Job options for AODtoLiverpoolNtuple, with electron trigger
  • SubmitGanga.py.txt (1 K, 27 Mar 2007 - 07:37, MikeFlowerdew): Basic Ganga submission script
Topic revision: r4 - 28 Mar 2007, MikeFlowerdew