Data set (IBM mainframe): Difference between revisions

Citation bot (talk | contribs)
Alter: pages. Formatted dashes. | Use this bot. Report bugs. | Suggested by AManWithNoPlan | Linked from User:AManWithNoPlan/sandbox4 | #UCB_webform_linked 210/1831
Rescuing 1 sources and tagging 0 as dead.) #IABot (v2.0.9.5
 
(14 intermediate revisions by 7 users not shown)
Line 1:
{{Short description|Type of computer file existing on IBM mainframe operating systems}}
{{about|mainframe computer files|data communications|modem}}
 
In the context of [[IBM]] [[mainframe computer]]s in the [[IBM System/360]] line and its successors, a '''data set''' (IBM preferred) or '''dataset''' is a [[computer file]] having a [[record-oriented file|record organization]]. Use of this term began with, e.g., [[DOS/360]] and [[OS/360]], and it is still used by their successors, including the current [[VSE (operating system)|VSE]] and [[z/OS]]. Documentation for these systems historically preferred this term rather than ''[[computer file|file]]''.
 
A data set is typically stored on a [[direct access storage device]] (DASD) or [[magnetic tape]],<ref>{{cite web
Line 27 ⟶ 28:
| pages = 138–139
| url = http://bitsavers.org/pdf/ibm/360/os/R21.7_Apr73/GC28-6704-4_OS_JCL_Aug76.pdf
| series = IBM Systems Reference Library
| publisher = IBM
}}
Line 44 ⟶ 45:
:QTAM message queue in application
;PO
:Partitioned Organization
;PS
:Physical Sequential
among others.
Data sets on tape may only be <code>DSORG=PS</code>. The choice of organization depends on how the data is to be accessed, and in particular, how it is to be updated.
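
As an illustration, the organization can be stated when a data set is allocated. The following DD statements are a minimal sketch only: the data set names, the <code>UNIT</code> name, and the <code>SPACE</code> and DCB values are hypothetical and installation-dependent.
<syntaxhighlight lang="jcl">
//* Sequential data set (hypothetical name and sizes)
//SEQDD    DD DSN=HLQ.SAMPLE.SEQ,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(5,1)),
//            DCB=(DSORG=PS,RECFM=FB,LRECL=80,BLKSIZE=800)
//* Partitioned data set: the third SPACE value requests
//* directory blocks
//PDSDD    DD DSN=HLQ.SAMPLE.PDS,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(5,1,10)),
//            DCB=(DSORG=PO,RECFM=FB,LRECL=80,BLKSIZE=800)
</syntaxhighlight>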
 
Programmers utilize various [[access method]]s (such as [[Queued Sequential Access Method|QSAM]] or [[VSAM]]) in programs for reading and writing data sets. Access method depends on the given data set organization.
 
==Record format (RECFM)==
Regardless of organization, the physical structure of each record is essentially the same, and is uniform throughout the data set. This is specified in the DCB <code>RECFM</code> parameter. <code>RECFM=F</code> means that the records are of fixed length, specified via the <code>LRECL</code> parameter. <code>RECFM=V</code> specifies a variable-length record. V records, when stored on media, are prefixed by a Record Descriptor Word (RDW) containing the integer length of the record in bytes and flag bits. With <code>RECFM=FB</code> and <code>RECFM=VB</code>, multiple logical records are grouped together into a single [[Block (data storage)|physical block]] on tape or DASD. FB and VB are <em>fixed-blocked</em> and <em>variable-blocked</em>, respectively. <code>RECFM=U</code> (undefined) is also variable length, but the length of the record is determined by the length of the block rather than by a control field.
 
The <code>BLKSIZE</code> parameter specifies the maximum length of the block. <code>RECFM=FBS</code><ref>{{cite web
Line 59 ⟶ 60:
|title=Example: Record format VBS
|website=[[IBM]]
|quote=Variable-length, blocked, spanned (VBS)}}</ref> can also be specified, meaning <em>fixed-blocked standard</em>, in which all blocks except the last are required to be of full <code>BLKSIZE</code> length. <code>RECFM=VBS</code>, or <em>variable-blocked spanned</em>, means a logical record can span two or more blocks, with flags in the RDW indicating whether a record segment is continued into the next block and/or was continued from the previous one.
 
This mechanism eliminates the need for using any "[[delimiter]]" byte value to separate records. Thus data can be of any type, including binary integers, floating-point, or characters, without introducing a false end-of-record condition. The data set is an abstraction of a collection of records, in contrast to files as unstructured streams of bytes.
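
The record formats described above might be requested in JCL as in the following sketch. The data set names, <code>UNIT</code> name, and sizes are hypothetical. The sketch assumes the usual conventions that the RDW occupies four bytes, that <code>LRECL</code> for variable-length records includes the RDW, and that each variable-length block begins with a four-byte block descriptor word.
<syntaxhighlight lang="jcl">
//* Fixed-blocked: ten 80-byte records per 800-byte block
//FBOUT    DD DSN=HLQ.SAMPLE.FIXED,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(5,1)),
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=800)
//* Variable-blocked: up to 80 data bytes plus 4-byte RDW per
//* record; block holds 4-byte descriptor plus ten such records
//VBOUT    DD DSN=HLQ.SAMPLE.VAR,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(5,1)),
//            DCB=(RECFM=VB,LRECL=84,BLKSIZE=844)
</syntaxhighlight>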
 
== Partitioned data set ==
Line 108 ⟶ 109:
|quote=... non-VSAM ...
|title=What is a generation data group? |website=IBM.com}}</ref> that are successive generations of historically-related data<ref name=G.sets>{{cite web |title=Generation data sets
|website=[[IBM]] |quote=successive, historically related, |url=https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.3.0/com.ibm.zos.v2r3.ieab500/iea3b5_Generation_data_sets_.htm}}</ref> stored on an IBM mainframe (running [[OS/360 and successors|OS/360 and its successors]] or [[DOS/360 and successors|DOS/360 and its successors]]).<ref name=VSE.VSAM>{{cite web |title=VSE/VSAM Commands |url=http://ftp.www.ibm.com/s390/zos/vse/pdf3/zvse31/doc/iesvoe10.pdf |access-date=2021-10-11 |archive-date=2022-01-31 |archive-url=https://web.archive.org/web/20220131235307/http://ftp.www.ibm.com/s390/zos/vse/pdf3/zvse31/doc/iesvoe10.pdf |url-status=dead }}</ref>
 
A GDG is usually cataloged.<ref name=G.sets/>
 
An individual member of the GDG collection is called a "''Generation Data Set''."<ref name=G.sets/><ref>"A generation data set is one of ...</ref> The latter may be identified by an absolute number, {{code|ACCTG.OURGDG(1234)}}, or a relative number: {{code|(-1)}} for the previous generation, {{code|(0)}} for the current one, and {{code|(+1)}} for the next generation.<ref>{{cite web
|url=http://mainframewizard.com/content/what-gdg |title=What is a GDG?}}</ref>
 
A GDG specifies how many generations of a data set are to be kept and at what age a generation will be deleted. Whenever a new generation is created, the system checks whether one or more obsolete generations are to be deleted.
 
The purpose of GDGs is to automate archival. In the command language [[Job Control Language|JCL]], the data set name given is generic: where <code>DSN</code> appears, the GDG name is followed by the generation number, where
 
(0) is the most recent version
 
(-1), (-2), ... are previous generations
 
(+1) is a new generation (see DD)
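
A minimal sketch of these relative references in a job step follows. The GDG name is the one used in the example above; the program name, <code>UNIT</code> name, and space and DCB values are hypothetical, and an installation may require further attributes when a new generation is created.
<syntaxhighlight lang="jcl">
//* MYPROG is a hypothetical application program
//STEP1    EXEC PGM=MYPROG
//* Read the most recent generation
//CURGEN   DD DSN=ACCTG.OURGDG(0),DISP=SHR
//* Create and catalog the next generation
//NEWGEN   DD DSN=ACCTG.OURGDG(+1),DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(5,1)),
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=800)
</syntaxhighlight>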
 
Another use of GDGs is to address all generations simultaneously within a JCL script without having to know the number of currently available generations. To do this, the parentheses and the generation number are omitted in the JCL when specifying the data set.
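
For instance, the following DD statement, a sketch reusing the GDG name from the example above with a hypothetical ddname, refers to every cataloged generation at once:
<syntaxhighlight lang="jcl">
//* No generation number: all generations of ACCTG.OURGDG are read
//ALLGENS  DD DSN=ACCTG.OURGDG,DISP=SHR
</syntaxhighlight>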
 
===GDG JCL & features===
Line 126 ⟶ 138:
| page = 269
| url = http://bitsavers.org/pdf/ibm/360/os/R21.7_Apr73/GC28-6586-15_OS_Utilities_Rel_21.7_Apr73.pdf
| series = IBM Systems Reference Library
| publisher = [[IBM]]
| access-date = May 19, 2022
}}
</ref> of the {{pslink|Support programs for OS/360 and successors|IEHPROGM}} utility or the {{code|DEFINE GENERATIONDATAGROUP}} statement<ref>{{cite manual
| title = OS/VS Access Method Services
| id = GC26-3836-1
Line 139 ⟶ 151:
| pages = 107–110
| url = http://bitsavers.org/pdf/ibm/370/OS_VS2/Release_2_1973/GC26-3836-1_OS_VS_Access_Method_Services_May1974.pdf
| series = Systems
| publisher = [[IBM]]
| access-date = May 19, 2022
Line 153 ⟶ 165:
|title=IDCAMS – Create and delete GDG base using JCL
|url=http://code.xmlgadgets.com/2011/05/16/idcams-create-and-delete-gdg-base/comment-page-1}}</ref>
 
====Example====
Creation of a standard GDG that keeps five generations, with a retention period of 35 days:
<syntaxhighlight lang="jcl">
//* Define a GDG base: limit of 5 generations, 35-day retention
//STEP1 EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
 DEFINE GDG (NAME('DB2.FULLCOPY.DSNDB04.TSTEST') -
             LIMIT(5) SCRATCH FOR(35))
/*
</syntaxhighlight>
 
Delete a standard GDG:
<syntaxhighlight lang="jcl">
//* Delete the GDG base even if it still has generations (FORCE)
//STEP3 EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
 DELETE DB2.FULLCOPY.DSNDB04.TSTEST GDG FORCE
/*
</syntaxhighlight>
 
==References==