Next: File Naming Conventions Up: SE Usage to Store Previous: SE Usage to Store   Contents

Storing Data on SE

In a Grid context, data produced by a job is created on the volatile disk storage of the WN that is running the job. Small files are returned to the user through the OutputSandBox at the submitting UI node, but larger production files are stored on specialized nodes, the SEs (Storage Elements), which are shared grid-wide.

Usually every CE is associated with an SE node in the same network area (the CloseSE), so the natural destination for production files is the CloseSE. Before ending, each job should check the files it produced and store them on some SE, preferably the CloseSE, which minimizes transfer times and ensures, to a certain extent, network availability between the CE and the SE.
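The preference for the CloseSE described above can be sketched as a simple selection rule. This is only an illustration under stated assumptions, not middleware code: the function `choose_destination_se`, the `is_reachable` callback and the host names are all hypothetical, and a real job would obtain this information from the Grid middleware.

```python
from typing import Callable, Iterable, Optional


def choose_destination_se(
    close_se: Optional[str],
    other_ses: Iterable[str],
    is_reachable: Callable[[str], bool],
) -> Optional[str]:
    """Return the SE to store output on, trying the CloseSE first.

    Hypothetical helper: it only encodes the "prefer the CloseSE,
    otherwise fall back to another SE" policy.
    """
    candidates = ([close_se] if close_se else []) + list(other_ses)
    for se in candidates:
        if is_reachable(se):
            return se
    return None  # no SE is reachable: the caller must handle this


# Hypothetical host names, for illustration only.
reachable = {"se.site-a.example", "se.site-b.example"}
dest = choose_destination_se(
    "se.site-a.example",
    ["se.site-b.example"],
    lambda se: se in reachable,
)
print(dest)
```

If the CloseSE is not reachable, the same call falls back to the first reachable SE in `other_ses`, and it returns `None` when nothing is available, so the job can report the failure instead of silently losing its output.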

User data is stored at the SE mount point with special naming conventions, as explained in Section 6.2. The name of the stored data (a Unix file) is a PFN (Physical File Name). As there is no grid-wide GID/UID pair for file ownership, each file is given a unique identifier, the GUID. Users should also define an LFN (Logical File Name) for the file. All of this file information is stored in the RC (Replica Catalogue), which is accessed to retrieve and verify data.
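The relation between LFN, GUID and PFN can be pictured with a toy in-memory catalogue. This is a sketch for illustration only and does not reproduce the real RC interface: the class `ToyReplicaCatalogue`, its methods, the use of a random UUID as GUID, and the example names are all assumptions. In particular, the PFN string below is a made-up placeholder and does not follow the naming conventions of Section 6.2.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CatalogueEntry:
    guid: str                                       # unique identifier of the file
    pfns: List[str] = field(default_factory=list)   # one PFN per stored copy


class ToyReplicaCatalogue:
    """In-memory stand-in for the RC (hypothetical, not the real API)."""

    def __init__(self) -> None:
        self._by_lfn: Dict[str, CatalogueEntry] = {}

    def register(self, lfn: str, pfn: str) -> str:
        """Record a newly stored file under its LFN and return its GUID."""
        if lfn in self._by_lfn:
            raise ValueError(f"LFN already registered: {lfn}")
        guid = str(uuid.uuid4())  # assumption: a random UUID serves as GUID
        self._by_lfn[lfn] = CatalogueEntry(guid, [pfn])
        return guid

    def lookup(self, lfn: str) -> CatalogueEntry:
        """Return the GUID and PFNs recorded for an LFN."""
        return self._by_lfn[lfn]


rc = ToyReplicaCatalogue()
pfn = "se.site-a.example:/placeholder/path/run01.dat"  # made-up PFN
guid = rc.register("example-run01", pfn)
entry = rc.lookup("example-run01")
print(entry.guid == guid, entry.pfns)
```

The user works with the LFN, while the catalogue resolves it to the GUID and to the physical copies.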

To optimize data access, users can replicate files created on a WN and stored on the related CloseSE to any other SE. In this way, when a program needs data from the Grid, a replica of the file can be available on the CloseSE of the processing WN. I will not discuss here how replicas are triggered by the middleware, as this is an advanced option not covered in this report.
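The benefit of replication can be illustrated by the choice a program makes when several copies of a file exist. The sketch below is hypothetical and says nothing about how the middleware actually selects or triggers replicas: `pick_replica`, the `(SE host, path)` representation of a replica and the host names are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Simplification: a replica is described by (SE host, path on that SE).
Replica = Tuple[str, str]


def pick_replica(replicas: List[Replica], close_se: str) -> Optional[Replica]:
    """Prefer the copy held on the CloseSE; otherwise take any copy."""
    for replica in replicas:
        if replica[0] == close_se:
            return replica
    return replicas[0] if replicas else None


# Hypothetical hosts and paths, for illustration only.
copies = [
    ("se.site-a.example", "/placeholder/path/run01.dat"),
    ("se.site-b.example", "/placeholder/path/run01.dat"),
]
chosen = pick_replica(copies, close_se="se.site-b.example")
print(chosen)
```

A job running at the second site reads the local copy, while a job at a site without a replica falls back to the first copy listed and pays the cost of a remote transfer.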


luvisetto 2003-12-17