Bioinformatics workflow management system: Difference between revisions

Dexbot (talk | contribs)
m Bot: Deprecating Template:Cite doi and some minor fixes
m Journal cites: format journal names, using AWB (11852)
Line 2:
A '''bioinformatics workflow management system''' is a specialized form of [[workflow management system]] designed to compose and execute a series of computational or data manipulation steps, or a [[workflows|workflow]], that relate to [[bioinformatics]].
 
Many workflow systems exist. Some were developed more generally as [[scientific workflow system]]s for use by scientists from many disciplines, such as [[astronomy]] and [[earth science]]. All such systems are based on an abstract representation of how a computation proceeds in the form of a directed graph, where each node represents a task to be executed and each edge represents either data flow or an execution dependency between tasks. Each system typically provides a visual front-end that allows the user to build and modify complex applications with little or no programming expertise.<ref>{{Cite journal | last1 = Oinn | first1 = T. | last2 = Greenwood | first2 = M. | last3 = Addis | first3 = M. | last4 = Alpdemir | first4 = M. N. | last5 = Ferris | first5 = J. | last6 = Glover | first6 = K. | last7 = Goble | first7 = C. | authorlink7 = Carole Goble| last8 = Goderis | first8 = A. | last9 = Hull | first9 = D. | doi = 10.1002/cpe.993 | last10 = Marvin | first10 = D. | last11 = Li | first11 = P. | last12 = Lord | first12 = P. | last13 = Pocock | first13 = M. R. | last14 = Senger | first14 = M. | last15 = Stevens | first15 = R. | last16 = Wipat | first16 = A. | last17 = Wroe | first17 = C. | title = Taverna: Lessons in creating a workflow environment for the life sciences | journal = Concurrency and Computation: Practice and Experience | volume = 18 | issue = 10 | pages = 1067–1100 | year = 2006 | pmid = | pmc = }}</ref><ref>{{Cite journal | last1 = Yu | first1 = J. | last2 = Buyya | first2 = R. | doi = 10.1145/1084805.1084814 | title = A taxonomy of scientific workflow systems for grid computing | journal = ACM SIGMOD Record | volume = 34 | issue = 3 | pages = 44 | year = 2005 | pmid = | pmc = }}</ref><ref name="CIBEC 2008">{{Cite journal | last1 = Curcin | first1 = V. | last2 = Ghanem | first2 = M. | title = Scientific workflow systems - can one size fit all? | doi = 10.1109/CIBEC.2008.4786077 | pages = 1–9 | year = 2008 | pmid = | pmc = | journal=2008 Cairo International Biomedical Engineering Conference}}</ref>
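
The directed-graph model can be illustrated with a minimal Python sketch. It is not the implementation of any system named in this article: the function <code>run_workflow</code>, the task names, and the toy sequence-processing steps are all hypothetical and chosen only for illustration. It assumes Python 3.9 or later, whose standard library provides <code>graphlib.TopologicalSorter</code>, and it treats every edge as a data-flow edge that passes one task's output to the next.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(tasks, deps, inputs):
    """Run every task after all of the tasks it depends on.

    tasks:  task name -> function called with its dependencies' outputs
    deps:   task name -> list of upstream node names (the graph's edges)
    inputs: node name -> value supplied by the user (source nodes)
    """
    results = dict(inputs)
    # TopologicalSorter expects a mapping of node -> predecessors.
    graph = {name: deps.get(name, []) for name in tasks}
    for name in TopologicalSorter(graph).static_order():
        if name in results:
            continue  # source node: its value was supplied as an input
        args = [results[d] for d in deps.get(name, [])]
        results[name] = tasks[name](*args)
    return results


# Hypothetical toy tasks standing in for real analysis tools.
def clean(seq):
    return seq.strip().upper()


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gc_content(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)


def report(rc, gc):
    return f"{rc} GC={gc:.2f}"


tasks = {
    "clean": clean,
    "revcomp": reverse_complement,
    "gc": gc_content,
    "report": report,
}
deps = {
    "clean": ["raw"],
    "revcomp": ["clean"],
    "gc": ["clean"],
    "report": ["revcomp", "gc"],
}

results = run_workflow(tasks, deps, {"raw": " acgtgg\n"})
print(results["report"])
```

Here <code>revcomp</code> and <code>gc</code> both depend only on <code>clean</code>, so a real workflow system could schedule them in parallel. The sketch runs them one after another for simplicity.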
 
==Examples==
Line 96:
| first1 = M. A.
| title = Chipster: User-friendly analysis software for microarray and other high-throughput data
| journal = BMC Genomics
| volume = 12
| pages = 507
Line 150:
| doi = 10.1038/nmeth.1809
}}</ref>
* [[KNIME]] the Konstanz Information Miner<ref>{{Cite journal | doi = 10.1016/j.compbiolchem.2007.08.009| title = Workflow based framework for life science informatics| journal = Computational Biology and Chemistry| year = 2007| volume=31| pages=305–319}}</ref>
* [[OnlineHPC]] Online workflow designer based on [[Taverna workbench|Taverna]]
* SeqWare: Hadoop Oozie-based workflow system focused on genomics data analysis in cloud environments{{citation needed|date=October 2014}}.
* Tavaxy:<ref>{{Cite journal | last1 = Abouelhoda | first1 = M. | last2 = Issa | first2 = S. | last3 = Ghanem | first3 = M. | title = Tavaxy: Integrating Taverna and Galaxy workflows with cloud computing support | doi = 10.1186/1471-2105-13-77 | journal = BMC Bioinformatics | volume = 13 | pages = 77 | year = 2012 | pmid = 22559942| pmc =3583125 }}</ref> a cloud-based bioinformatics workflow system that integrates features from both Taverna and Galaxy for NGS data analysis.
* [[Taverna workbench]]:<ref>{{Cite journal | last1 = Oinn | first1 = T. | last2 = Addis | first2 = M. | last3 = Ferris | first3 = J. | last4 = Marvin | first4 = D. | last5 = Senger | first5 = M. | last6 = Greenwood | first6 = M. | last7 = Carver | first7 = T. | last8 = Glover | first8 = K. | last9 = Pocock | first9 = M. R. | last10 = Wipat | doi = 10.1093/bioinformatics/bth361 | first10 = A. | last11 = Li | first11 = P. | title = Taverna: A tool for the composition and enactment of bioinformatics workflows | journal = Bioinformatics | volume = 20 | issue = 17 | pages = 3045–3054 | year = 2004 | pmid = 15201187 | pmc = }}</ref><ref>{{Cite journal | last1 = Hull | first1 = D.| authorlink = | last2 = Wolstencroft | first2 = K. | last3 = Stevens | first3 = R. | authorlink3 = Robert David Stevens | last4 = Goble | first4 = C. A. | authorlink4 = Carole Goble | last5 = Pocock | first5 = M. R. | last6 = Li | first6 = P. | last7 = Oinn | first7 = T. | title = Taverna: A tool for building and running workflows of services | doi = 10.1093/nar/gkl320 | journal = [[Nucleic Acids Research]] | volume = 34 | issue = Web Server issue | pages = W729–W732 | year = 2006 | pmc = 1538887 | pmid = 16845108}} {{open access}}</ref> an early domain-independent system widely used in bioinformatics and other areas of [[e-Science]]
* [[VisTrails]]<ref>{{Cite journal | doi = 10.1109/VISUAL.2005.1532788 | title = VisTrails: enabling interactive multiple-view visualizations| year = 2005 | journal=VIS 05. IEEE Visualization, 2005.}}</ref>
 
==Comparisons between workflow systems==
With so many bioinformatics workflow systems to choose from, it is difficult to understand and compare their features. Little work has been done to evaluate and compare the systems from a bioinformatician's perspective, especially with respect to the data types they can handle, the built-in functionality they provide to the user, or their performance and usability. Examples of existing comparisons include:
 
* The paper "Scientific workflow systems - can one size fit all?",<ref name="CIBEC 2008"/> which provides a high-level framework for comparing workflow systems based on their control flow and data flow properties. The systems compared include [[Discovery Net]], [[Taverna workbench|Taverna]], Triana, [[Kepler scientific workflow system|Kepler]], as well as YAWL and [[Business Process Execution Language|BPEL]].
 
* The paper "Meta-workflows: pattern-based interoperability between Galaxy and Taverna",<ref>{{Cite book | last1 = Abouelhoda | first1 = M. | last2 = Alaa | first2 = S. | last3 = Ghanem | first3 = M. | doi = 10.1145/1833398.1833400 | chapter = Meta-workflows | title = Proceedings of the 1st International Workshop on Workflow Approaches to New Data-centric Science - Wands '10 | pages = 1 | year = 2010 | isbn = 9781450301886 | pmid = | pmc = }}</ref> which provides a more user-oriented comparison between [[Taverna workbench|Taverna]] and [[Galaxy (computational biology)|Galaxy]] in the context of enabling interoperability between both systems.