Data set (IBM mainframe): Difference between revisions

m WikiCleaner 0.98 - Repairing link to disambiguation page - You can help!
SmackBot (talk | contribs)
m remove Erik9bot category,outdated, tag and general fixes
Line 1:
{{Unreferenced|date=December 2009}}
{{Otheruses4|mainframe computer file|a general meaning in computing field|Data set}}
A '''data set''', or '''dataset''', is a [[computer file]] having a [[record-oriented file|record organization]]. The term pertains to the [[IBM]] [[mainframe computer|mainframe]] operating system line, starting with [[OS/360]], and is still used by its successors, including the current [[z/OS]]. Those systems have historically preferred this term to ''file''. A data set is typically stored on a [[direct access storage device]] (DASD) or on [[magnetic tape]].
 
Datasets are not unstructured streams of [[byte]]s, but rather are organized in various logical record and block structures determined by the <code>DSORG</code> (data set organization), <code>RECFM</code> (record format), and other parameters. These parameters are specified at the time of the data set allocation (creation), for example with the [[Job Control Language]] <code>DD</code> statements. Inside a job they are stored in the [[Data Control Block]] (DCB), which is a data structure used to access datasets, for example using [[access method]]s.
 
== Dataset organization ==
{{Mainframe I/O access methods}}
In OS/360, the DCB's <code>DSORG</code> parameter specifies how the dataset is organized. It may be physically sequential ("PS"), indexed sequential ("IS"), partitioned ("PO"), or direct access ("DA"). Datasets on tape may only be <code>DSORG=PS</code>. The choice of organization depends on how the data is to be accessed, and in particular, how it is to be updated.
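
The organization codes and the tape restriction described above can be summarized in a short Python sketch. The <code>Allocation</code> class, the <code>check_allocation</code> function and the device labels <code>"DASD"</code> and <code>"TAPE"</code> are hypothetical names chosen for this illustration; they are not part of any IBM interface.

```python
from dataclasses import dataclass

# DSORG codes as listed in the text above.
DSORG_NAMES = {
    "PS": "physically sequential",
    "IS": "indexed sequential",
    "PO": "partitioned",
    "DA": "direct access",
}


@dataclass(frozen=True)
class Allocation:
    """Hypothetical description of a data set allocation (not an IBM structure)."""
    dsorg: str   # one of the DSORG codes above
    device: str  # "DASD" or "TAPE"; labels assumed for this sketch


def check_allocation(alloc: Allocation) -> str:
    """Return the organization name, or raise ValueError for an invalid request."""
    if alloc.dsorg not in DSORG_NAMES:
        raise ValueError(f"unknown DSORG: {alloc.dsorg}")
    if alloc.device == "TAPE" and alloc.dsorg != "PS":
        raise ValueError("datasets on tape may only be DSORG=PS")
    return DSORG_NAMES[alloc.dsorg]
```

For instance, <code>check_allocation(Allocation("PO", "DASD"))</code> is accepted, while the same organization requested on tape is rejected.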
Line 10 ⟶ 11:
Programmers use various [[access method]]s (such as [[QSAM]] or [[VSAM]]) in programs that read and write data sets; the choice of access method depends on the organization of the data set.
 
== Record format (RECFM) ==
 
Regardless of organization, the physical structure of each record is essentially the same, and is uniform throughout the dataset. This structure is specified by the DCB <code>RECFM</code> parameter. <code>RECFM=F</code> means that the records are of fixed length, given by the <code>LRECL</code> parameter, and <code>RECFM=V</code> specifies variable-length records. When stored on media, V records are prefixed by a Record Descriptor Word (RDW) containing the integer length of the record in bytes. With <code>RECFM=FB</code> and <code>RECFM=VB</code>, multiple logical records are grouped together into a single [[Block (data storage)|physical block]] on tape or disk; FB and VB stand for <code>fixed-blocked</code> and <code>variable-blocked</code>, respectively. The <code>BLKSIZE</code> parameter specifies the maximum length of a block. <code>RECFM=FBS</code>, or <code>fixed-blocked standard</code>, can also be specified, meaning that all blocks except the last one are required to be the full <code>BLKSIZE</code> in length. <code>RECFM=VBS</code>, or <code>variable-blocked spanned</code>, means that a logical record can span two or more blocks, with flags in the RDW indicating whether a record segment is continued into the next block and/or was continued from the previous one.
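
As an illustration of variable-blocked records, the following Python sketch packs and unpacks one unspanned VB block. It assumes the conventional layout, which the text above does not spell out: each record starts with a 4-byte RDW whose first two bytes hold the big-endian record length including the RDW itself, and the block starts with a 4-byte Block Descriptor Word (BDW) of the same shape. The function names are hypothetical, and spanned (VBS) segments are not handled.

```python
import struct

DESCRIPTOR = ">HH"  # 2-byte big-endian length, then 2 bytes assumed zero (unspanned)


def build_vb_block(records):
    """Pack byte strings into one VB block: BDW, then RDW + data per record."""
    body = b"".join(struct.pack(DESCRIPTOR, len(r) + 4, 0) + r for r in records)
    return struct.pack(DESCRIPTOR, len(body) + 4, 0) + body


def split_vb_block(block):
    """Return the logical records held in one VB block."""
    block_len, _ = struct.unpack_from(DESCRIPTOR, block, 0)
    if block_len != len(block):
        raise ValueError("BDW length does not match block size")
    records = []
    pos = 4  # skip the BDW
    while pos < block_len:
        rec_len, _ = struct.unpack_from(DESCRIPTOR, block, pos)
        if rec_len < 4 or pos + rec_len > block_len:
            raise ValueError("invalid RDW length")
        records.append(block[pos + 4:pos + rec_len])  # data after the RDW
        pos += rec_len
    return records
```

Because every length is carried in a descriptor, <code>split_vb_block(build_vb_block(records))</code> returns the original records whatever byte values they contain.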
 
This mechanism eliminates the need for using any "delimiter" byte value to separate records. Thus data can be of any type, including binary integers, floating point, or characters, without introducing a false end-of-record condition. The data set is an abstraction of a collection of records, in contrast to files as unstructured streams of bytes.
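
The point about delimiters can be shown with a small Python sketch of fixed-length records. The record layout (a 32-bit integer followed by a 32-bit float, both big-endian, packed with Python's <code>struct</code> module) and the name <code>deblock_fixed</code> are assumptions made only for this example; they do not reproduce any particular mainframe data format.

```python
import struct

LRECL = 8  # fixed logical record length in bytes (example value)


def deblock_fixed(block, lrecl=LRECL):
    """Split a block of fixed-length records purely by position."""
    if len(block) % lrecl:
        raise ValueError("block length is not a multiple of LRECL")
    return [block[i:i + lrecl] for i in range(0, len(block), lrecl)]


# The integers 10, 13 and 2570 contain the bytes 0x0A and 0x0D, which a
# delimiter-based format would mistake for line endings.
values = [(10, 1.5), (13, -2.0), (2570, 0.25)]
block = b"".join(struct.pack(">if", n, x) for n, x in values)

records = deblock_fixed(block)
decoded = [struct.unpack(">if", r) for r in records]
```

Here <code>decoded</code> equals <code>values</code>, whereas splitting the same block on the newline byte would cut it in the wrong places.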
 
== Partitioned datasets ==
 
A '''PDS''' or '''Partitioned Data Set''' is a dataset containing multiple ''members'', each of which holds a separate sub-data set, similar to a [[directory (file systems)|directory]] in other types of [[file system]]s. This type of dataset is often used to hold executable programs (''load modules'') and source program libraries (especially Assembler macro definitions). A PDS is somewhat analogous to a [[ZIP (file format)|Zip]] file on [[microcomputer]]s, except that the files stored in a PDS are not compressed.
 
Line 30 ⟶ 29:
The PDS/E structure is similar to that of a PDS and is used to store the same types of data. However, PDS/E files have an improved directory structure which does not require pre-allocation of directory blocks when the PDS/E is defined (and therefore cannot run out of directory blocks because too few were specified). Also, a PDS/E automatically stores members in such a way that no compression operation is needed to reclaim "dead" space. PDS/E files can reside only on disk, because the directory structure is needed to access individual members.
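
The two PDS limitations mentioned above, a directory of fixed size and "dead" space that must be reclaimed by compression, can be illustrated with a toy in-memory model in Python. This is a sketch only: the class <code>ToyPDS</code> and its methods are hypothetical, the real on-disk format is not reproduced, and the model assumes that rewriting a member appends a new copy and leaves the old copy unused.

```python
class ToyPDS:
    """Toy model of a partitioned data set; not the real on-disk format."""

    def __init__(self, directory_slots):
        self.directory_slots = directory_slots  # fixed when the data set is created
        self.directory = {}                     # member name -> (offset, length)
        self.data = bytearray()                 # member contents, in write order

    def write_member(self, name, content):
        """Add or replace a member; a replaced copy becomes dead space."""
        if name not in self.directory and len(self.directory) >= self.directory_slots:
            raise RuntimeError("directory full")
        self.directory[name] = (len(self.data), len(content))
        self.data += content

    def read_member(self, name):
        offset, length = self.directory[name]
        return bytes(self.data[offset:offset + length])

    def dead_space(self):
        """Bytes in the data area that no directory entry refers to."""
        return len(self.data) - sum(length for _, length in self.directory.values())

    def compress(self):
        """Rewrite the data area so that only live members remain."""
        new_data = bytearray()
        new_directory = {}
        for name, (offset, length) in self.directory.items():
            new_directory[name] = (len(new_data), length)
            new_data += self.data[offset:offset + length]
        self.data, self.directory = new_data, new_directory
```

In this model, rewriting a member increases <code>dead_space()</code> until <code>compress()</code> is called, and a third member cannot be added to a <code>ToyPDS(2)</code>; a PDS/E, as described above, avoids both situations.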
 
== See also ==
* [[Volume table of contents]] (VTOC), a structure describing data sets stored on the disk
 
{{DEFAULTSORT:Data Set (Ibm Mainframe)}}
 
[[Category:Data management]]
[[Category:IBM Mainframe computer operating systems]]
[[Category:Computer file systems]]
[[Category:Files]]
 
[[ja:データセット (IBMメインフレーム)]]