IBM Parallel Sysplex: Difference between revisions

Revision as of 11:31, 17 August 2011 by Smtchahal (no edit summary). Latest revision as of 18:34, 28 August 2024 by InternetArchiveBot (edit summary: Rescuing 1 sources and tagging 0 as dead. #IABot v2.0.9.5).
(48 intermediate revisions by 32 users not shown)
{{Short description|Cluster of IBM mainframes}}

In computing, a '''Parallel Sysplex''' is a [[computer cluster|cluster]] of [[IBM mainframe]]s acting together as a [[single system image]] with [[z/OS]]. Used for disaster recovery, Parallel Sysplex combines data sharing and [[parallel computing]] to allow a cluster of up to 32 systems to share a workload for [[high performance computing|high performance]] and [[high availability]].

==Sysplex==
In 1990, [[IBM]] introduced the concept of a '''Systems Complex''', commonly called a '''Sysplex''', for its [[mainframe computer]]s with [[MVS]]/ESA SP V4.1. This allows authorized components in up to eight [[logical partition]]s (LPARs) to communicate and cooperate with each other using the [[IBM XCF|XCF]] protocol.

Components of a Sysplex include:
* A common time source to synchronize all member systems' clocks. This can be either a Sysplex Timer (Model 9037) or the Server Time Protocol (STP);
* [[Global Resource Serialization]] (GRS), which allows multiple systems to access the same resources concurrently, serializing where necessary to ensure exclusive access;
* Cross System Coupling Facility ([[IBM XCF|XCF]]), which allows systems to communicate [[peer-to-peer]];
* Couple Data Sets (CDS).

Users of a (base) Sysplex include:
* Console services – allowing one to merge multiple MCS consoles from the different members of the Sysplex, providing a single system image for Operations
* Automatic Restart Manager (ARM) – Policy to direct automatic restart of failed jobs or started tasks on the same system if it is available or on another LPAR in the Sysplex
* Sysplex Failure Manager (SFM) – Policy that specifies automated actions to take when certain failures occur, such as loss of a member of a Sysplex, or when reconfiguring systems
* [[Workload Manager]] (WLM) – Policy-based performance management of heterogeneous workloads across one or more z/OS images or even on AIX
* [[Global Resource Serialization]] (GRS) communication – allows use of XCF links instead of dedicated channels for GRS, and Dynamic RNLs
* Tivoli OPC – Hot standby support for the controller
* [[RACF]] (IBM's mainframe security software product) – Sysplex-wide RVARY and SETROPTS commands
* PDSE file sharing
* Multisystem VLFNOTE, SDUMP, SLIP, DAE
* [[Resource Measurement Facility]] (RMF) – Sysplex-wide reporting
* [[CICS]] – uses XCF to provide better performance and response time than using VTAM for transaction routing and function shipping
* zFS – Using XCF communication to access data across multiple LPARs

==Parallel Sysplex==
[[File:GDPS.svg|thumb|300px|Schematic representation of a Parallel Sysplex]]

The forerunner to Parallel Sysplex was '''Virtual Coupling''', a technique which allowed up to 12 [[IBM ESA/390]] systems to execute jobs in parallel.

IBM introduced<ref>{{cite web | title = S/390 Parallel Sysplex Overview | id = 194-080 | date = April 6, 1994 | url = https://www.ibm.com/common/ssi/ShowDoc.wss?docURL=/common/ssi/rep_ca/0/897/ENUS194-080/index.html | work = Announcement Letters | publisher = IBM }}</ref> the Parallel Sysplex with the addition of the 9674<ref>{{cite web | title = IBM S/390 Coupling Facility 9674 Model C01 | id = 194-082 | date = April 6, 1994 | url = https://www.ibm.com/common/ssi/ShowDoc.wss?docURL=/common/ssi/rep_ca/2/897/ENUS194-082/index.html | work = Announcement Letters | publisher = IBM }}</ref> [[Coupling Facility]] (CF), new S/390 models,<ref>{{cite web | title = S/390 Parallel Sysplex Offering | id = 194-081 | date = April 6, 1994 | url = https://www.ibm.com/common/ssi/ShowDoc.wss?docURL=/common/ssi/rep_ca/1/897/ENUS194-081/index.html | work = Announcement Letters | publisher = IBM }}</ref><ref>{{cite web | title = IBM ES/9000 Water-Cooled Processor Enhancements: New Ten-Way Processor, Parallel Sysplex Capability, and Additional Functions | id = 194-084 | date = April 6, 1994 | url = https://www.ibm.com/common/ssi/ShowDoc.wss?docURL=/common/ssi/rep_ca/4/897/ENUS194-084/index.html | work = Announcement Letters | publisher = IBM }}</ref><ref>{{cite web | title = IBM Enterprise System/9000 Air-Cooled Processors Enhanced with Additional Functions and Parallel Sysplex Capability | id = 194-085 | date = April 6, 1994 | url = https://www.ibm.com/common/ssi/ShowDoc.wss?docURL=/common/ssi/rep_ca/5/897/ENUS194-085/index.html | work = Announcement Letters | publisher = IBM }}</ref> upgrades to existing models, coupling links for high speed communication and MVS/ESA SP V5.1<ref>{{cite web | title = IBM MVS/ESA SP Version 5 Release 1 and OpenEdition Enhancements | id = 294-152 | date = April 6, 1994 | url = https://www.ibm.com/common/ssi/ShowDoc.wss?docURL=/common/ssi/rep_ca/2/897/ENUS294-152/index.html | work = Announcement Letters | publisher = IBM }}</ref> operating system support, in April 1994.<ref>{{cite book | title = System/390 Parallel Sysplex Performance | id = SG24-4356-03 | date = December 1998 | edition = Fourth | url = http://www.redbooks.ibm.com/redbooks/pdfs/sg244356.pdf | publisher = International Business Machines Corporation | access-date = 2007-09-17 | url-status = dead | archive-url = https://web.archive.org/web/20110518132944/http://www.redbooks.ibm.com/redbooks/pdfs/sg244356.pdf | archive-date = 2011-05-18 }}</ref>

The Coupling Facility (CF) may reside on a dedicated stand-alone server configured with processors that can run Coupling Facility control code (CFCC), as integral processors on the mainframes themselves configured as ICFs (Internal Coupling Facilities), or, less commonly, as normal LPARs. The CF contains Lock, List, and Cache structures to help with serialization, message passing, and buffer consistency between multiple LPARs.<ref>{{cite web | title = Coupling Facility Configuration Options | id = ZSW01971USEN | author = David Raften | date = November 2019 | publisher = IBM | work = Positioning paper | url = http://www.ibm.com/common/ssi/fcgi-bin/ssialias?infotype=SA&subtype=WH&attachment=ZSW01971USEN.PDF&appname=STGE_ZS_ZS_USEN&htmlfid=ZSW01971USEN }}</ref>

The primary goal of a Parallel Sysplex is to provide data sharing capabilities, allowing multiple databases direct read and write access to shared data.
This can provide the following benefits:
* Help remove single points of failure within the server, LPAR, or subsystems
* Application Availability
* Single System Image
* Dynamic Session Balancing
* Dynamic Transaction Routing
* Scalable capacity

Databases running on the System z server that can take advantage of this include:
* [[IBM Db2]]
* [[IBM Information Management System]] (IMS)
* [[Virtual Storage Access Method|VSAM]] (VSAM/RLS)
* IDMS
* Adabas
* DataCom
* Oracle

Other components can use the Coupling Facility to help with system management, performance, or reduced hardware requirements. Called “Resource Sharing”, uses include:
* Catalog – shared catalogs to improve performance by reducing I/O to a catalog data set on disk
* CICS – Using the CF to provide sharing and recovery capabilities for named counters, data tables, or transient data
* DFSMShsm – Workload balancing for data migration workload
* GRS Star – Reduced CPU usage and response time for data set allocation. Tape Switching uses the GRS structure to provide sharing of tape units between z/OS images.
* Dynamic CHPID Management (DCM), and I/O priority management
* JES2 Checkpoint – Provides improved access to a multisystem checkpoint
* Operlog / Logrec – Merged multisystem logs for system management
* RACF – shared data set to simplify security management across the Parallel Sysplex
* WebSphere MQ – Shared message queues for availability and flexibility
* WLM – provides support for Intelligent Resource Director (IRD), which extends the z/OS Workload Manager to help manage CPU and I/O resources across multiple LPARs within the Parallel Sysplex. Functions include LPAR CPU management, IRD, and multi-system enclave management for improved performance
* XCF Star – Reduced hardware requirements and simplified management of XCF communication paths

Major components of a Parallel Sysplex include:
* [[Coupling Facility]] (CF or ICF) hardware, allowing multiple processors to share, cache, update, and balance data access;
* Sysplex Timers or (more recently) Server Time Protocol to synchronize the clocks of all member systems;
* High speed, high quality, redundant cabling;
* Software ([[operating system]] services and, usually, [[middleware]] such as [[IBM Db2]]).

The Coupling Facility may be either a dedicated external system (a small mainframe, such as a [[System z9]] BC, specially configured with only coupling facility processors) or integral processors on the mainframes themselves configured as ICFs (Internal Coupling Facilities).<ref>{{cite web |url=https://www.pcmag.com/encyclopedia_term/0%2C2542%2Ct%3DCoupling+Facility%26i%3D40413%2C00.asp |title=Coupling Facility Definition |publisher=PC Magazine.com |access-date=April 13, 2009 |archive-date=December 2, 2008 |archive-url=https://web.archive.org/web/20081202161800/http://www.pcmag.com/encyclopedia_term/0%2C2542%2Ct%3DCoupling+Facility%26i%3D40413%2C00.asp |url-status=dead }}</ref> It is recommended that at least one external CF be used in a Parallel Sysplex.<ref>{{cite web |url=http://www-ti.informatik.uni-tuebingen.de/os390/sysplex/sysplex/couplfac.pdf |title=Coupling Facility |access-date=April 13, 2009 |archive-date=July 17, 2011 |archive-url=https://web.archive.org/web/20110717185607/http://www-ti.informatik.uni-tuebingen.de/os390/sysplex/sysplex/couplfac.pdf |url-status=dead }}</ref> It is also recommended that a Parallel Sysplex have at least two CFs and/or ICFs for redundancy, especially in a production data sharing environment.
Server Time Protocol (STP) replaced the Sysplex Timers beginning in 2005 for System z mainframe models z990 and newer.<ref>{{cite web |title=Migrate from a Sysplex Timer to STP |url=http://publib.boulder.ibm.com/infocenter/zos/v1r9/index.jsp?topic=/com.ibm.zos.r9.e0zm100/sttostp.htm |publisher=IBM |access-date=April 15, 2009 }}</ref> A Sysplex Timer is a physically separate piece of hardware from the mainframe,<ref>{{cite web |title=Sysplex Timer |url=http://www.symmetricom.com/resources/compliance-certifications/sysplex-timer/ |publisher=Symmetricom |access-date=April 15, 2009 }}</ref> whereas STP is an integral facility within the mainframe's microcode.<ref>{{cite web |title=IBM Server Time Protocol (STP) |url=http://www-03.ibm.com/systems/z/advantages/pso/stp.html |archive-url=https://web.archive.org/web/20080613095316/http://www-03.ibm.com/systems/z/advantages/pso/stp.html |url-status=dead |archive-date=June 13, 2008 |publisher=IBM |access-date=April 15, 2009 }}</ref> With STP and ICFs it is possible to construct a complete Parallel Sysplex installation with two connected mainframes. Moreover, a single mainframe can contain the internal equivalent of a complete physical Parallel Sysplex, useful for application testing and development purposes.<ref>{{cite web |url=http://www.zjournal.com/index.cfm?section=article&aid=308 |title=MVS Boot Camp: IBM Health Checker |first=John E. |last=Johnson |publisher=z/Journal |access-date=April 15, 2009 }}{{dead link|date=January 2018 |bot=InternetArchiveBot |fix-attempted=yes }}</ref>

The IBM Systems Journal dedicated a full issue to all the technology components.<ref>{{cite web |url=http://researchweb.watson.ibm.com/journal/sj36-2.html |title=IBM's System Journal on S/390 Parallel Sysplex Clusters |access-date=24 April 2017 |archive-date=9 March 2012 |archive-url=https://web.archive.org/web/20120309150534/http://researchweb.watson.ibm.com/journal/sj36-2.html |url-status=dead }}</ref>

==Server Time Protocol==
Maintaining accurate time is important in computer systems. For example, in a transaction-processing system the recovery process reconstructs the transaction data from log files. If time stamps are used for transaction-data logging, and the time stamps of two related transactions are transposed from the actual sequence, then the reconstruction of the transaction database may not match the state before the recovery process.

Server Time Protocol (STP) can be used to provide a single time source between multiple servers. Based on Network Time Protocol concepts, one of the System z servers is designated by the HMC as the primary time source (Stratum 1). It then sends timing signals to the Stratum 2 servers through use of coupling links. The Stratum 2 servers in turn send timing signals to the Stratum 3 servers. To provide availability, one of the servers can be designated as a backup time source, and a third server can be designated as an Arbiter to assist the Backup Time Server in determining if it should take the role of the Primary during exception conditions.

STP has been available on System z servers since 2005.
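
The time-stamp ordering problem described in this section can be illustrated with a minimal sketch. It is not taken from any IBM product: the system names <code>SYSA</code> and <code>SYSB</code>, the <code>LogRecord</code> type, and the <code>replay</code> and <code>make_log</code> helpers are hypothetical, and the recovery process is assumed to simply re-apply log records in time-stamp order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    timestamp: float  # time stamp written from the logging system's local clock
    system: str       # hypothetical system name, for illustration only
    key: str
    value: int


def replay(records):
    """Rebuild state by applying log records in time-stamp order."""
    state = {}
    for rec in sorted(records, key=lambda r: r.timestamp):
        state[rec.key] = rec.value
    return state


def make_log(skew_b):
    """True sequence: SYSA writes 100 at t=10.0, then SYSB writes 70 at t=10.2.

    skew_b is the error (in seconds) of SYSB's clock relative to SYSA's.
    """
    return [
        LogRecord(10.0, "SYSA", "balance", 100),
        LogRecord(10.2 + skew_b, "SYSB", "balance", 70),
    ]


# With a common time source the replay matches the true sequence.
synced = replay(make_log(skew_b=0.0))

# If SYSB's clock runs 0.5 s slow, its later update sorts first and is lost.
skewed = replay(make_log(skew_b=-0.5))

print(synced, skewed)
```

In this sketch the skewed log reconstructs a different final value than the synchronized one, which is the mismatch a single time source is meant to prevent.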
More information on STP is available in “Server Time Protocol Planning Guide”.<ref>{{cite manual | title = Server Time Protocol Planning Guide | id = SG24-7280-03 | date = June 2013 | edition = Fourth | work = Redbooks | publisher = International Business Machines Corporation | url = http://www.redbooks.ibm.com/redbooks/pdfs/sg247280.pdf }}</ref>

==Geographically Dispersed Parallel Sysplex==
{{redirect|GDPS|other uses|GDPS (disambiguation)}}
'''Geographically Dispersed Parallel Sysplex''' ('''GDPS''') is an extension of Parallel Sysplex to mainframes located, potentially, in different cities. GDPS includes single-site and multiple-site configurations:<ref>{{cite conference |first=Riaz |last=Ahmad |date=March 5, 2009 |title=GDPS 3.6 Update & Implementation |publisher=SHARE |location=Austin, TX |url=http://ew.share.org/proceedingmod/abstract.cfm?abstract_id=19145 |access-date=April 17, 2009 }}{{Dead link|date=January 2020 |bot=InternetArchiveBot |fix-attempted=yes }}</ref>
* GDPS HyperSwap Manager: This is based on synchronous [[Peer to Peer Remote Copy]] (PPRC) technology for use within a single data center. Data is copied from the primary storage device to a secondary storage device. In the event of a failure on the primary storage device, the system automatically makes the secondary storage device the primary, usually without disrupting running applications.
* GDPS Metro: This is based on synchronous data mirroring technology (PPRC) that can be used on mainframes {{convert|200|km|mi}} apart. In a two-system model, both sites can be administered as if they were one system. In the event of a failure of a system or storage device, recovery can occur automatically, with limited or no data loss.
* GDPS Global - XRC: This is based on asynchronous [[Extended Remote Copy]] (XRC) technology with no restrictions on distance. XRC copies data on storage devices between two sites such that only a few seconds of data may be lost in the event of a failure. If a failure does occur, a user must initiate the recovery process. Once initiated, the process is automatic in recovering from secondary storage devices and reconfiguring systems.
* GDPS Global - GM: This is based on asynchronous [[IBM Global Mirror]] technology with no restrictions on distance. It is designed for recovery from a total failure at one site. It will activate secondary storage devices and backup systems.
* GDPS Metro Global - GM: This is a configuration with more than two systems/sites, for purposes of disaster recovery. It is based on GDPS Metro together with GDPS Global - GM.
* GDPS Metro Global - XRC: This is a configuration with more than two systems/sites, for purposes of disaster recovery. It is based on GDPS Metro together with GDPS Global - XRC.
* GDPS Continuous Availability: This is a disaster recovery / continuous availability solution, based on two or more sites, separated by unlimited distances, running the same applications and having the same data to provide cross-site workload balancing. IBM Multi-site Workload Lifeline, through its monitoring and workload routing, plays an integral role in the GDPS Continuous Availability solution.

==See also==

==External links==
* [https://web.archive.org/web/20050721234105/http://www-1.ibm.com/servers/eserver/zseries/pso/ IBM Parallel Sysplex site]
* [https://web.archive.org/web/20050611083713/http://www-1.ibm.com/servers/eserver/zseries/gdps/ IBM GDPS page]

[[Category:IBM mainframe technology]]
[[Category:Cluster computing]]
[[Category:Parallel computing]]