Data set (IBM mainframe)

This is an old revision of this page, as edited by 64.240.127.148 (talk) at 18:45, 13 October 2005. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

On IBM mainframe computers, the term data set (or dataset) refers to a file, typically stored on a direct-access storage device (DASD) or magnetic tape. Data sets are record-oriented.

Unlike files on UNIX systems, data sets are not unstructured streams of bytes. They are organized into logical records and blocks, and their overall organization is determined by the DSORG (data set organisation) parameter of the JCL that was used to allocate them.
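To illustrate the contrast with a byte stream, the following minimal Python sketch splits one block of fixed-length records into its logical records. The function name `deblock_fixed`, the 80-byte record length, and the sample data are assumptions made for illustration only. The sketch does not reflect how a mainframe access method is actually implemented.

```python
def deblock_fixed(block: bytes, lrecl: int) -> list[bytes]:
    """Split one physical block into fixed-length logical records."""
    if lrecl <= 0:
        raise ValueError("record length must be positive")
    if len(block) % lrecl != 0:
        raise ValueError("block length is not a multiple of the record length")
    return [block[i:i + lrecl] for i in range(0, len(block), lrecl)]


# Hypothetical block of three 80-byte records, encoded with Python's
# cp037 codec (one of the EBCDIC code pages) and padded with blanks.
block = b"".join(
    text.ljust(80).encode("cp037") for text in ("FIRST", "SECOND", "THIRD")
)

records = deblock_fixed(block, 80)
for record in records:
    # Each record is a separate unit; there is no newline delimiter.
    print(record.decode("cp037").rstrip())
```

A program reading such a data set receives whole records, one at a time, rather than scanning a byte stream for line terminators.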

For example, a PDS (Partitioned Data Set) is a data set containing multiple members, each of which holds a separate sub-data set, similar to a directory in other types of file system.

Since MVS/XA there is also the Partitioned Data Set Extended (PDSE, also written PDS/E).

A PDS/E is similar in structure to a PDS and is used to store the same types of data. However, a PDS/E has an improved directory structure that does not require directory blocks to be pre-allocated when the PDS/E is defined, so it cannot run out of directory blocks because too few were specified. A PDS/E also stores members in such a way that no compression is needed to reclaim dead space. A PDS/E can reside only on disk, because the directory structure is needed to access individual members. PDS/E files are also referred to as libraries.
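The two differences described above (a fixed-size directory, and dead space that must be reclaimed by compression) can be pictured with a toy model in Python. The class names `ToyPDS` and `ToyPDSE`, the notion of a "directory slot", and all method names are hypothetical simplifications. Real directory blocks hold several entries each, and the actual on-disk formats are considerably more involved.

```python
class ToyPDS:
    """Toy model of a PDS: fixed directory, replaced members leave dead space."""

    def __init__(self, directory_slots: int):
        self.directory_slots = directory_slots  # fixed at allocation time
        self.directory = {}                     # member name -> index in self.space
        self.space = []                         # member data in the order written

    def write(self, name: str, data: str) -> None:
        if name not in self.directory and len(self.directory) >= self.directory_slots:
            raise RuntimeError("directory full")
        if name in self.directory:
            # The old copy is not reused; it becomes dead space.
            self.space[self.directory[name]] = None
        self.space.append(data)
        self.directory[name] = len(self.space) - 1

    def dead_space(self) -> int:
        return sum(1 for slot in self.space if slot is None)

    def compress(self) -> None:
        """Rewrite live members contiguously, reclaiming dead space."""
        live = [(name, self.space[i]) for name, i in self.directory.items()]
        self.space = [data for _, data in live]
        self.directory = {name: i for i, (name, _) in enumerate(live)}


class ToyPDSE:
    """Toy model of a PDS/E: directory grows on demand, space is reused."""

    def __init__(self):
        self.members = {}  # no pre-allocated directory size

    def write(self, name: str, data: str) -> None:
        self.members[name] = data  # replacing a member leaves nothing behind

    def dead_space(self) -> int:
        return 0


pds = ToyPDS(directory_slots=2)
pds.write("A", "v1")
pds.write("A", "v2")        # replaces member A; the old copy is now dead space
pds.write("B", "v1")
print(pds.dead_space())     # dead space before compression
pds.compress()
print(pds.dead_space())     # dead space after compression

pdse = ToyPDSE()
for name in ("A", "A", "B", "C"):
    pdse.write(name, "data")
print(pdse.dead_space())
```

In this model, writing a third member name to `pds` raises an error because the directory is full, whereas `pdse` accepts any number of members.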