HEP Filestores

There are a number of file stores available. Some are general purpose and some are application specific. Most are available on all desktops and nodes via NFS and can be accessed via Samba on Windows.

User home directories

Each user has a home directory under /user stored on the machine hepuser. This area contains configuration settings for desktops and applications, emails and small amounts of code, graphics and documents. The user areas currently use quotas and should not be used for bulk storage of data and application suites. The default quota is currently 80GB. You can check your quota usage for all filesystems with
  • quota -s
If you are over quota you should receive an email at your HEP email address once per day, so please check your inbox regularly.
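
If the quota report shows that you are close to the limit, it can help to see which top-level items in your home directory take up the most space. The following is a minimal sketch, not a HEP-provided tool: the function name largest_items and its defaults are illustrative assumptions, and it uses only the standard find, du, sort and head utilities.

```shell
# Hypothetical helper (not a HEP-provided command): list the largest
# top-level entries in a directory, including hidden ones.
largest_items() {
    dir=${1:-$HOME}     # directory to inspect (defaults to your home directory)
    count=${2:-10}      # number of entries to show

    # du -sk prints one total per entry in KiB, so a plain numeric sort works.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + 2>/dev/null |
        sort -rn | head -n "$count"
}

# Example: show the five largest entries in your home directory.
#   largest_items "$HOME" 5
```

Sizes are printed in KiB, largest first, so the entries worth cleaning up or moving elsewhere appear at the top.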

On Linux your home directory should always be available on every HEP Linux system you can log in to. On Windows it can be accessed via Samba using the following share
  • \\hep.ph.liv.ac.uk\username
This can be mapped to a drive letter. Open a file explorer window. Under the Computer Tab, select Map a Network Drive. Enter the share name for the folder.

On OS X in a Finder Window click on the Menu entry Go>Connect to Server... and use the following address
  • smb://hep.ph.liv.ac.uk/username
The user home directories are backed up daily so if you accidentally delete or modify a file it can be rolled back. The backups for the last day can be accessed on gateway.ph.liv.ac.uk from the directory /backup.

"Bundle" Bulk Data Storage

The HEP research systems (batch and interactive nodes) have access to bulk data stores served by scalable, clustered storage systems. There are currently two stores, Scratch and Data.

Only standard POSIX file access is provided at present (i.e. no XROOTD).

Operations on many small files, or listing or deleting thousands of files, may perform poorly; this is a limitation of the clustered file system. We have tried to improve this where possible, but it is not a bug if such operations take longer than on e.g. hepstore (within reason: if basic operations take a very long time, let us know).

Similarly, many small writes can significantly degrade performance. Output streamed from jobs should be directed to a local disk (e.g. under $HEPTMP/) and then copied to scratch or data once the job has finished. Streaming data directly causes huge numbers of transactions on the storage cluster, hindering performance for your jobs and for others.
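
The staging pattern described above can be sketched as a small shell function. This is an illustration under stated assumptions rather than a site-provided script: the name stage_and_copy, the output file name job.out, and the fallback to /tmp when $HEPTMP is unset are all hypothetical choices.

```shell
# Hypothetical helper: run a command with its output streamed to local disk,
# then copy the finished file to the clustered store in a single pass.
stage_and_copy() {
    dest_dir=$1     # final location, e.g. a directory of yours under scratch or data
    shift           # the remaining arguments are the command to run

    # Private working directory on local disk ($HEPTMP if set; /tmp is an assumed fallback).
    work_dir=$(mktemp -d "${HEPTMP:-/tmp}/stage.XXXXXX") || return 1

    # The many small writes of a streaming job land on the local disk.
    status=0
    "$@" > "$work_dir/job.out" || status=$?

    # One bulk copy to the destination instead of many small transactions.
    mkdir -p "$dest_dir" && cp "$work_dir/job.out" "$dest_dir/job.out" || status=1

    rm -rf "$work_dir"
    return "$status"
}

# Example (the program name and destination are placeholders):
#   stage_and_copy "/bundle/scratch/30day/$USER/run1" ./my_analysis --events 1000
```

The function returns the exit status of the command, or 1 if the final copy fails, so a batch script can still detect a failed job.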

Data

Data is available on HEP desktops as well as core research systems.

The Bundle data area is intended to be used for longer term storage of bulk experiment data for high throughput data analysis. The system will perform more efficiently with large files. Small file operations (e.g. software development or compiling) should be performed elsewhere, e.g. in your user area.
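
Since the data area works best with large files, one common approach (a suggestion here, not a stated site policy) is to pack a directory of many small files into a single archive before storing it. The sketch below uses standard tar options; the function name pack_dir and the example paths are hypothetical.

```shell
# Hypothetical helper: pack a directory of small files into one compressed archive.
pack_dir() {
    src_dir=$1      # directory holding many small files
    archive=$2      # archive to create, e.g. under your experiment's data directory

    # -C changes into the parent directory so the archive stores relative paths only.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")"
}

# Example (paths are placeholders):
#   pack_dir "$HOME/small_outputs" /bundle/data/ATLAS/small_outputs.tar.gz
# List the contents later without unpacking:
#   tar -tzf /bundle/data/ATLAS/small_outputs.tar.gz
```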

Each experiment will have a fixed top level directory, e.g.
  • /bundle/data/ATLAS
which will have a single quota applied. This quota applies to that directory and any subdirectories; we cannot give separate quotas for users. To check the usage of the experiment data store use df, e.g. for ATLAS
  • df -h /bundle/data/ATLAS
Bundle data is stored on redundant storage arrays but is not backed up.
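
To check the usage of an experiment area from a script, the df output can be reduced to a single percentage. This is a minimal sketch that assumes, as the df example suggests, that df reports the experiment quota for the directory. The function names usage_percent and warn_if_full and the 90% threshold are illustrative choices.

```shell
# Hypothetical helper: print the "capacity" percentage that df reports for a path.
usage_percent() {
    # -P selects the POSIX output format: one line per filesystem, capacity in column 5.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: print a warning when a path is fuller than a given percentage.
warn_if_full() {
    path=$1
    limit=${2:-90}      # threshold in percent, chosen for illustration
    used=$(usage_percent "$path")
    [ -n "$used" ] || return 1
    if [ "$used" -gt "$limit" ]; then
        echo "WARNING: $path is ${used}% full"
    fi
}

# Example:
#   warn_if_full /bundle/data/ATLAS 90
```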

Experiments may request 1TB by default, and up to 10TB (space permitting). More than this can be made available if necessary and if capacity allows.

Experiments wishing to use more data storage can add their own storage servers (if they meet a minimum specification requirement) to the Bundle pool, and their quota will be increased by the capacity of their server. This also increases the aggregate performance of the whole system. Please contact the HEP admins if you wish to add substantial amounts of storage.

Total performance is expected to be much greater than /hepstore but lower than the HEP DPM service, and should scale upwards as more servers are added.

Scratch

The Bundle scratch area is intended to be used for short-term temporary storage of data and job inputs/outputs for up to 30 days. There are no specific quotas and all users have write access. Any files older than 30 days will be automatically deleted by the system.

The Bundle scratch area is only available on research nodes (interactive and batch nodes, not desktops) under
  • /bundle/scratch/30day
Bundle scratch data is not backed up, and may be subject to deletion at short notice. The scratch service is experimental and may be withdrawn.
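
Because files are removed automatically after 30 days, it can be useful to list those approaching the limit so they can be copied elsewhere in time. The sketch below is built on an assumption: it measures age by modification time, whereas the text above does not say which timestamp the clean-up uses. The function name list_expiring and the 25-day default are hypothetical.

```shell
# Hypothetical helper: list regular files not modified for more than a given
# number of days. ASSUMPTION: the automatic clean-up is based on modification
# time; check with the administrators if this matters for your data.
list_expiring() {
    dir=$1            # your directory under the scratch area
    days=${2:-25}     # age threshold in days (25 leaves some margin before 30)

    find "$dir" -type f -mtime +"$days" -print
}

# Example (the per-user subdirectory is a placeholder):
#   list_expiring "/bundle/scratch/30day/$USER" 25
```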

The Bundle scratch area currently has 20TB of storage available to users (there is some space reserved for emergencies).

Total performance is expected to be greater than that of the old /scratch area, which should be considered deprecated.

Hepstore general purpose file stores

There is one general purpose storage area for HEP users, hepstore. Users used to have a default quota of 300GB on the old hepstore; this has been increased to 500GB on the new hepstore. Increases to this can be requested where appropriate for the work being done and the nature of the data being stored.

Hepstore is for general purpose data. Large file experimental data should generally be stored on bundle, or on an appropriate dedicated area.

Hepstore is not backed up, and should not be used for storing the only copy of irreplaceable data.

The server is not configured to be used for intensive file processing; intensive operations or batch work should use scratch or local disks (e.g. under $HEPTMP/) for staging input and/or output as appropriate.

The mount point is located on Linux desktops under
  • /hepstore/
It is available from most desktops and front ends such as gateway (if you find a machine that doesn't have access, let the administrators know) and is for everyone to use. To access it from Windows use the following share:
  • \\hep.ph.liv.ac.uk\hepstore
This can be mapped to a drive letter. Open a file explorer window. Under the Computer Tab, select Map a Network Drive. Enter the share name for the folder.

On OS X in a Finder Window click on the Menu entry Go>Connect to Server... and use the following address
  • smb://hep.ph.liv.ac.uk/hepstore
Hepstore is located on fault-tolerant arrays, but this does not protect against accidental erasure or allow recovery from multiple disc failures.

Do not store critical data on these file stores. If you need to store large quantities of critical data that must be backed up, please discuss this with the administrators.
Topic revision: r25 - 02 May 2024, RobertFay