Simple API for Grid Applications: Difference between revisions

Content deleted Content added
m typo
Edited disambig hatnote
 
(28 intermediate revisions by 24 users not shown)
Line 1:
{{Short description|Grid computing application programming interface}}
{{other uses2|Saga}}
{{Redirect|SAGA (computing)||Saga (disambiguation)#Computing}}
{{Infobox software
| logo = [[File:Logo saga.png|220px|SAGA C++/Python logo]]
| screenshot =
| developer = [[Center for Computation and Technology]] at [[Louisiana State University|LSU]], [http://radical.rutgers.edu/ RADICAL Group] at [[Rutgers University]], [http://www.in2p3.fr/ IN2P3 (France)], and [http://www.vu.nl/ Vrije Universiteit (Amsterdam, the Netherlands)]
|
| programming language = [[C++]], [[Python (programming language)|Python]], [[Java (programming language)|Java]]
| platform = [[Cross-platform]]
Line 17:
The SAGA specification for distributed computing originally consisted of a single document, GFD.90, which was released in 2009.
 
The SAGA [[API]] does not strive to replace [[Globus Toolkit|Globus]] or similar [[grid computing]] middleware systems. It targets not middleware developers but application developers with no background in grid computing. Such developers typically wish to devote their time to their own goals and minimize the time spent coding infrastructure functionality. The API insulates application developers from middleware.
 
The specification of services, and the protocols to interact with them, is out of the scope of SAGA. Rather, the API seeks to hide the details of any service infrastructures that may or may not be used to implement the functionality that the application developer needs. The API aligns, however, with all middleware standards within [[Open Grid Forum]] (OGF).<ref name="gdf90">{{Cite journal | author1=T. Goodale | author2=S. Jha | author3=H. Kaiser | author4=T. Kielmann | author5=P. Kleijer | author6=A. Merzky | author7=J. Shalf | author8=C. Smith | title=A Simple API for Grid Applications (SAGA) | journal=OGF Document Series 90 | url=http://www.ogf.org/documents/GFD.90.pdf}}</ref>
 
The SAGA API defines a mechanism to specify additional API ''packages'' which expand its scope. The SAGA Core API itself defines a number of packages: job management, file management, replica management, remote procedure calls, and streams. SAGA covers the most important and frequently used distributed functionality and is supported on major grid systems, including [[Extreme Science and Engineering Discovery Environment]] (XSEDE), EGI and FutureGrid. SAGA supports a wide range of distributed programming and coordination models and can be extended to support new and emerging middleware.<ref>SAGA: A Simple API for Grid applications, High-Level Application Programming on the Grid
Tom Goodale, Shantenu Jha, Hartmut Kaiser, Thilo Kielmann, Pascal Kleijer, Gregor von Laszewski, Craig Lee, Andre Merzky, Hrabri Rajic, John Shalf
Computational Methods in Science and Technology, vol. 12 # 1, 2006</ref><ref>Grid Interoperability at the Application Level Using SAGA
Line 28:
== Standardization ==
 
The SAGA API is standardised in the SAGA Working Group of the [[Open Grid Forum]].<ref>{{cite web | url=http://redmine.ogf.org/projects/saga-wg | title=Overview - SAGA WG - Open Grid Forum }}</ref> Based on a set of use cases
<ref>Shantenu Jha, Andre Merzky: "A Collection of Use Cases for a Simple API for Grid Applications", OGF Informational Document, [http://www.ogf.org/documents/GFD.70.pdf GFD.70 (pdf)]</ref>
,<ref>Shantenu Jha, Andre Merzky: "A Requirements Analysis for a Simple API for Grid Applications", OGF Informational Document, [http://www.ogf.org/documents/GFD.71.pdf GFD.71 (pdf)]</ref>
the SAGA Core API specification<ref name="gdf90"/> defines a set of general API principles (the 'SAGA Look and Feel') and a set of API packages which render commonly used Grid programming patterns (job management, file management and access, replica management, etc.). The SAGA Core specification also defines how additional API packages are to be defined, and how they relate to the Core API and to its 'Look and Feel'. Based on that, a number of API extensions have been defined, and are in various states of the standardisation process.<ref>Steve Fisher, Anthony Wilson, Arumugam Paventhan: "SAGA API Extension: Service Discovery API", OGF Recommendation Document, [http://www.ogf.org/documents/GFD.144.pdf GFD.144 (pdf)]</ref><ref>Andre Merzky: "SAGA API Extension: Advert API", OGF Recommendation Document, [http://www.ogf.org/documents/GFD.177.pdf GFD.177 (pdf)]</ref><ref>Andre Merzky: "SAGA API Extension: Message API", OGF Recommendation Document, [http://www.ogf.org/documents/GFD.178.pdf GFD.178 (pdf)]</ref><ref>Steve Fisher, Anthony Wilson: "SAGA API Extension: Information System Navigator API", OGF Recommendation Document, [http://www.ogf.org/documents/GFD.195.pdf GFD.195 (pdf)]</ref>
 
All SAGA specifications are defined in (a flavor of) [[Interface description language|IDL]], and thus object oriented, but language neutral. Different language bindings exist (Java, C++, Python), but are, at this point, not standardised. Nevertheless, different implementations of these language bindings have a relatively coherent API definition (in particular, the different Java implementations share the same abstract API interface classes).
 
The 'Look and Feel' part of the SAGA Core API specification covers the following areas:
Line 53 ⟶ 48:
[[File:SAGA Architecture.png|thumb|236px|The SAGA C++/Python architecture: a light-weight [[run-time system|runtime system]] dispatches API calls from the application to the [[middleware]] through a set of [[plug-in (computing)|plug-ins]] or ''adaptors''.]]
 
SAGA is designed as an [[Object-oriented programming|object oriented]] interface. It encapsulates related functionality in a set of objects that are grouped in functional [[namespace]]s, which are called ''packages'' in SAGA. The SAGA core implementation defines the following packages:<ref>{{Cite web | title=The SAGA C++ Reference API (Documentation) | url=http://static.saga.cct.lsu.edu/apidoc/cpp/latest/}}</ref>
 
* saga::advert - interface for [[Advert Service]] access
Line 64 ⟶ 59:
* saga::stream - interface for data stream client and servers
 
The overall architecture of SAGA follows the [[adaptor pattern]], a [[software design pattern]] which is used for translating one interface into another. In SAGA it translates the calls from the API packages to the interfaces of the underlying middleware. The SAGA run-time system uses [[late binding]] to decide at [[program lifecycle phase|run-time]] which [[plug-in (computing)|plug-in]] (''middleware adaptor'') to load and bind.<ref>{{Cite web | title=SAGA: How it works | website=www.vimeo.com | url=http://www.vimeo.com/20955440}}</ref>
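
The adaptor selection described above can be illustrated with a minimal Python sketch. It is not taken from any SAGA implementation: the class names (<code>JobAdaptor</code>, <code>ForkAdaptor</code>, <code>GramAdaptor</code>, <code>JobService</code>) and the <code>ADAPTORS</code> registry are hypothetical, and the adaptors only return strings instead of contacting real middleware. The sketch assumes that the adaptor is chosen from the URL scheme at the time the service object is created, which is one simple way to realise late binding.

<syntaxhighlight lang="python">
from urllib.parse import urlparse


class JobAdaptor:
    """Common interface that every (hypothetical) middleware adaptor implements."""

    def submit(self, executable: str) -> str:
        raise NotImplementedError


class ForkAdaptor(JobAdaptor):
    """Stand-in for an adaptor that would run jobs on the local machine."""

    def submit(self, executable: str) -> str:
        return f"fork: would run {executable} locally"


class GramAdaptor(JobAdaptor):
    """Stand-in for an adaptor that would talk to a Globus GRAM endpoint."""

    def __init__(self, host: str) -> None:
        self.host = host

    def submit(self, executable: str) -> str:
        return f"gram: would submit {executable} to {self.host}"


# Registry mapping URL schemes to adaptor factories (illustrative only).
ADAPTORS = {
    "fork": lambda url: ForkAdaptor(),
    "gram": lambda url: GramAdaptor(url.hostname),
}


class JobService:
    """API-level object; the adaptor is bound only once the URL is known."""

    def __init__(self, url: str) -> None:
        parsed = urlparse(url)
        factory = ADAPTORS.get(parsed.scheme)
        if factory is None:
            raise ValueError(f"no adaptor registered for scheme '{parsed.scheme}'")
        self._adaptor = factory(parsed)

    def run(self, executable: str) -> str:
        # The application calls the same method regardless of the backend.
        return self._adaptor.submit(executable)


if __name__ == "__main__":
    for target in ("fork://localhost", "gram://my.globus.host/jobmanager-pbs"):
        print(JobService(target).run("/bin/date"))
</syntaxhighlight>

In this sketch the application code only ever sees <code>JobService</code>; supporting another middleware amounts to registering one more entry in the registry, which mirrors the role the plug-ins play in the architecture described above.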
 
== Supported middleware ==
Line 79 ⟶ 74:
| [[Eucalyptus (computing)|Eucalyptus]] || saga-adaptors-aws || saga::job
|-
| [[Globus Toolkit|Globus]] [[Grid Resource Allocation Manager|GRAM]] (2 and 5) || saga-adaptors-globus || saga::job
|-
| Globus [[GridFTP]] || saga-adaptors-globus || saga::filesystem
Line 110 ⟶ 105:
== Implementations ==
 
Since the SAGA interface definitions are not bound to any specific programming language, several implementations of the SAGA standards exist in different programming languages. Apart from the implementation language, they differ from each other in their completeness in terms of standard coverage, as well as in their support for distributed middleware.
 
=== SAGA C++ ===
 
SAGA C++<ref>{{Cite web | title=SAGA C++ | url=https://saga-project.github.io/saga-cpp/}}</ref> was the first complete implementation of the SAGA Core specification, written in C++. The C++ implementation is currently not under active development.
 
=== RADICAL-SAGA (Python) ===
 
RADICAL-SAGA<ref>{{Cite web | url=https://radical-cybertools.github.io/#SAGAPython | title=RADICAL-SAGA}}</ref> is a light-weight Python package that implements parts of the OGF GFD.90<ref name="gdf90"/> interface specification. RADICAL-SAGA implements the most commonly used features of GFD.90 based upon extensive use-case analysis, and focuses on usability and simple deployment in real-world heterogeneous distributed computing environments and application scenarios. RADICAL-SAGA currently implements the job and the file management core APIs as well as the resource management API extension. It provides plug-ins for different distributed middleware systems and services, including support for the [[Portable Batch System|PBS]], [[Sun Grid Engine]], [[Secure Shell|SSH]], [[SSH File Transfer Protocol|SFTP]] and others. RADICAL-SAGA can be used to develop distributed applications and frameworks that run on distributed cyber-infrastructure including XSEDE,<ref>{{Cite web | url=https://www.xsede.org/ | title=Thank you for your interest in XSEDE}}</ref> LONI and FutureGrid,<ref>{{Cite web | archive-url=https://web.archive.org/web/20101125083552/http://futuregrid.org/ | title=FutureGrid | url=http://futuregrid.org/ | url-status=usurped | archive-date=2010-11-25}}</ref> as well as other clouds and local clusters.
 
=== JavaSAGA ===
 
JavaSAGA is a Java implementation of SAGA. The status of JavaSAGA remains uncertain.
<syntaxhighlight lang="cpp">
#include <iostream>
#include <saga/saga.hpp>

int main (int argc, char** argv)
{
  namespace sa  = saga::attributes;
  namespace sja = saga::job::attributes;

  try
  {
    saga::job::description jd;

    // Executable and files for standard output and standard error
    jd.set_attribute (sja::description_executable, "/home/user/hello-mpi");
    jd.set_attribute (sja::description_output, "/home/user/hello.out");
    jd.set_attribute (sja::description_error, "/home/user/hello.err");

    // Declare this as an MPI-style job
    jd.set_attribute (sja::description_spmd_variation, "mpi");

    // Name of the queue we want to use
    jd.set_attribute (sja::description_queue, "checkpt");

    // Number of processors to request
    jd.set_attribute (sja::description_number_of_processes, "32");

    // Create the job on the remote job service and start it
    saga::job::service js("gram://my.globus.host/jobmanager-pbs");
    saga::job::job j = js.create_job(jd);

    j.run();
  }
  catch (saga::exception const & e)
  {
    std::cerr << "SAGA exception caught: " << e.what() << std::endl;
    return 1;
  }

  return 0;
}
</syntaxhighlight>
 
=== jSAGA ===
 
jSAGA<ref>{{Cite web | url=http://grid.in2p3.fr/jsaga/ | title=jSAGA}}</ref> is another Java implementation of the SAGA Core specification. jSAGA is currently under active development.
 
=== DESHL ===
 
DESHL<ref>{{Cite web | archive-url=https://web.archive.org/web/20120608113439/http://www.deisa.eu/usersupport/user-documentation/deshl | url=http://www.deisa.eu/usersupport/user-documentation/deshl | title=DESHL | archive-date=2012-06-08}}</ref> (DEISA Services for Heterogeneous management Layer) provides functionality for submission and management of computational jobs within [[DEISA]]. DESHL is implemented as a set of command-line tools on top of a SAGA-inspired API implemented in Java. On the back-end, it interfaces with HiLA, a generic grid access client library, which is part of the [[UNICORE]] system.
 
== Examples ==
Line 137 ⟶ 167:
===Job submission===
 
A typical task in a distributed application is to submit a ''job'' to a local or remote [[distributed resource manager]]. SAGA provides a high-level API called the ''job package'' for this. The following two simple examples show how the SAGA job package API can be used to submit a [[Message Passing Interface]] (MPI) job to a remote Globus GRAM resource manager.
 
====C++====
<syntaxhighlight lang="cpp">
#include <saga/saga.hpp>
 
Line 176 ⟶ 206:
}
 
</syntaxhighlight>
 
====Python====
<syntaxhighlight lang="python">
#!/usr/bin/env python3

import sys
import time
import bliss.saga as saga

def main(jobno: int, session, jobservice) -> None:
    bfast_base_dir = saga.Url("sftp://india.futuregrid.org/N/u/oweidner/software/bfast/")

    try:
        workdir = "%s/tmp/run/%s" % (bfast_base_dir.path, str(int(time.time())))
Line 195 ⟶ 224:
 
        # Describe the job: resources, environment, executable and arguments
        jd = saga.job.Description()
        jd.wall_time_limit = 5  # wall-time in minutes
        jd.total_cpu_count = 1
        jd.environment = {"BFAST_DIR": bfast_base_dir.path}
        jd.working_directory = workdir
        jd.executable = "$BFAST_DIR/bin/bfast"
        jd.arguments = ["match", "-A 1",
                        "-r $BFAST_DIR/data/small/reads_5K/reads.10.fastq",
                        "-f $BFAST_DIR/data/small/reference/hg_2122.fa"]

        # Submit the job through the job service passed in by the caller
        myjob = jobservice.create_job(jd)
        myjob.run()

        print("Job #%s started with ID '%s' and working directory: '%s'"
              % (jobno, myjob.jobid, workdir))

        myjob.wait()

        print("Job #%s with ID '%s' finished (RC: %s). Output available in: '%s'"
              % (jobno, myjob.jobid, myjob.exitcode, workdir))

        basedir.close()

    except saga.Exception as ex:
        print(f"An error occurred during job execution: {ex}")
        sys.exit(-1)

if __name__ == "__main__":
    execution_host = saga.Url("pbs+ssh://india.futuregrid.org")

    ctx = saga.Context()
    ctx.type = saga.Context.SSH
    ctx.userid = "oweidner"  # like 'ssh username@host ...'
    ctx.userkey = "/Users/oweidner/.ssh/rsa_work"  # like 'ssh -i ...'

    session = saga.Session()
Line 233 ⟶ 261:
 
    js = saga.job.Service(execution_host, session)

    for i in range(0, 4):
        main(i, session, js)
</syntaxhighlight>
 
== Grants ==
The work related to the SAGA Project is funded by the following grants: NSF-CHE 1125332 (CDI),<ref>[https://archive.today/20121212073028/http://nsf.rutgers.edu/2011/12/title-cdi-type-ii-mapping-complex.html NSF-CHE 1125332 (CDI)]</ref> NSF-EPS 1003897 (LaSIGMA)<ref>[http://lasigma.loni.org/ NSF-EPS 1003897 (LaSIGMA)]</ref> and NSF-OCI 1007115 (ExTENCI).<ref>[https://www.nsf.gov/pubs/2006/nsf06599/nsf06599.htm NSF-OCI 1007115 (ExTENCI)]</ref> Previous grants include NSF-OCI 0710874 (HPCOPS), NIH grant number P20RR016456 and UK EPSRC grant number GR/D0766171/1 via OMII-UK.<ref>[https://web.archive.org/web/20040220065312/http://www.omii.ac.uk/ OMII-UK]</ref>
 
== External links ==
* [https://saga-project.github.com/bliss/ SAGA-Bliss - A Python implementation of SAGA]
* [http://grid.in2p3.fr/jsaga/ jSAGA - A Java implementation of SAGA]
* [https://saga-project.github.io/saga-cpp/ SAGA C++ - A C++ implementation of SAGA]
* [https://github.com/major-lab/saga-glib SAGA-GLib - A Vala implementation of SAGA for GLib]
* [https://saga-project.github.com/ SAGA PROJECT]
* [[POSIX]]