#REDIRECT [[Scientific workflow system]] {{R from merge}}
{{Redirect|IDBS||IDB (disambiguation){{!}}IDB}}
{{Example farm|date=February 2012}}
A '''bioinformatics workflow management system''' is a specialized form of [[workflow management system]] designed specifically to compose and execute a series of computational or data manipulation steps, or a [[workflows|workflow]], that relate to [[bioinformatics]].
Many workflow systems exist. Some have been developed more generally as [[scientific workflow system]]s for use by scientists in many disciplines, such as [[astronomy]] and [[earth science]]. All such systems are based on an abstract representation of how a computation proceeds in the form of a directed graph, where each node represents a task to be executed and edges represent either data flow or execution dependencies between tasks. Each system typically provides a visual front-end, allowing the user to build and modify complex applications with little or no programming expertise.<ref>{{Cite journal | last1 = Oinn | first1 = T. | last2 = Greenwood | first2 = M. | last3 = Addis | first3 = M. | last4 = Alpdemir | first4 = M. N. | last5 = Ferris | first5 = J. | last6 = Glover | first6 = K. | last7 = Goble | first7 = C. | author-link7 = Carole Goble| last8 = Goderis | first8 = A. | last9 = Hull | first9 = D. | doi = 10.1002/cpe.993 | last10 = Marvin | first10 = D. | last11 = Li | first11 = P. | last12 = Lord | first12 = P. | last13 = Pocock | first13 = M. R. | last14 = Senger | first14 = M. | last15 = Stevens | first15 = R. | last16 = Wipat | first16 = A. | last17 = Wroe | first17 = C. | title = Taverna: Lessons in creating a workflow environment for the life sciences | journal = Concurrency and Computation: Practice and Experience | volume = 18 | issue = 10 | pages = 1067–1100 | year = 2006 | s2cid = 10219281 | url = https://eprints.soton.ac.uk/260908/1/taverna-ccpe-reviewed.pdf }}</ref><ref>{{Cite journal | last1 = Yu | first1 = J. | last2 = Buyya | first2 = R. | doi = 10.1145/1084805.1084814 | title = A taxonomy of scientific workflow systems for grid computing | journal = ACM SIGMOD Record | volume = 34 | issue = 3 | pages = 44 | year = 2005 | citeseerx = 10.1.1.63.3176 | s2cid = 538714 }}</ref><ref name="CIBEC 2008">{{Cite book | last1 = Curcin | first1 = V. | last2 = Ghanem | first2 = M. 
| title = Scientific workflow systems - can one size fit all? | doi = 10.1109/CIBEC.2008.4786077 | pages = 1–9 | year = 2008 | journal=2008 Cairo International Biomedical Engineering Conference| isbn = 978-1-4244-2694-2 | s2cid = 1885579 }}</ref>
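The directed-graph representation described above can be illustrated with a minimal sketch. It is not taken from any particular workflow system: the function <code>run_workflow</code>, the task names and the sample sequences are all hypothetical, and the only library used is Python's standard <code>graphlib</code> module (Python 3.9 or later). Each node is a task, each edge is a dependency, and the data produced by a task flows to the tasks that depend on it.

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, dependencies):
    """Run each task once, after all of the tasks it depends on.

    tasks:        maps a task name to a function that receives a dict of
                  upstream results (keyed by upstream task name)
    dependencies: maps a task name to the set of task names it depends on
    """
    results = {}
    # static_order() yields the nodes so that dependencies come first;
    # it raises graphlib.CycleError if the graph is not acyclic.
    for name in TopologicalSorter(dependencies).static_order():
        upstream = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = tasks[name](upstream)
    return results


# Hypothetical three-step workflow: read sequences, filter them, count them.
tasks = {
    "read": lambda _: ["ACGT", "AC", "GGGTTT"],
    "filter": lambda up: [seq for seq in up["read"] if len(seq) >= 4],
    "count": lambda up: len(up["filter"]),
}
dependencies = {"read": set(), "filter": {"read"}, "count": {"filter"}}

results = run_workflow(tasks, dependencies)
print(results["count"])
```

Real systems add scheduling, parallel execution, provenance tracking and error recovery on top of this basic ordering; the sketch only shows how the graph determines the order in which tasks run.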
==Examples==
In alphabetical order, some examples of bioinformatics workflow management systems include:
* [[Anduril (workflow engine)|Anduril]]: for bioinformatics and image analysis<ref>{{Cite web|url=http://www.anduril.org|title=Anduril workflow website}}</ref><ref>{{Cite journal|last1=Ovaska|first1=Kristian|last2=Laakso|first2=Marko|last3=Haapa-Paananen|first3=Saija|last4=Louhimo|first4=Riku|last5=Chen|first5=Ping|last6=Aittomäki|first6=Viljami|last7=Valo|first7=Erkka|last8=Núñez-Fontarnau|first8=Javier|last9=Rantanen|first9=Ville|date=2010-09-07|title=Large-scale data integration framework provides a comprehensive view on glioblastoma multiforme|journal=Genome Medicine|volume=2|issue=9|pages=65|doi=10.1186/gm186|issn=1756-994X|pmc=3092116|pmid=20822536}}</ref>
* [[BioBIKE]]: a [[Web application|Web-based]], programmable, integrated biological knowledge base<ref>{{Cite journal
| last1 = Elhai | first1 = J.
| last2 = Taton | first2 = A.
| last3 = Massar | first3 = J.
| last4 = Myers | first4 = J. K.
| last5 = Travers | first5 = M.
| last6 = Casey | first6 = J.
| last7 = Slupesky | first7 = M.
| last8 = Shrager | first8 = J.
| doi = 10.1093/nar/gkp354
| title = BioBIKE: A Web-based, programmable, integrated biological knowledge base
| journal = Nucleic Acids Research
| volume = 37
| issue = Web Server issue
| pages = W28–W32
| year = 2009
| pmid = 19433511
| pmc =2703918
}}</ref>
*[[CLC bio]], a bioinformatics analysis and workflow management platform from [[Qiagen|QIAGEN Digital Insights]].
*[[Clone manager|Clone Manager]] from Sci-Ed.
*[[Cuneiform (programming language)|Cuneiform]]: A functional workflow language for large-scale data analysis<ref>{{Cite journal
| last1 = Brandt | first1 = Jörgen
| last2 = Bux | first2 = Marc N.
| last3 = Leser | first3 = Ulf
| title = Cuneiform: A functional language for large scale scientific data analysis
| journal = Proceedings of the Workshops of the EDBT/ICDT
| volume = 1330
| pages = 17–26
| year = 2015
| url = http://ceur-ws.org/Vol-1330/paper-03.pdf
}}</ref>
* [[Discovery Net]]: one of the earliest examples of a scientific workflow system, later commercialized as InforSense, which was then acquired by IDBS.{{citation needed|date=September 2016}}
*[[Galaxy (computational biology)|Galaxy]]: initially targeted at [[genomics]]<ref>{{Cite journal
| last1 = Goecks | first1 = J.
| last2 = Nekrutenko | first2 = A.
| last3 = Taylor | first3 = J.
| last4 = Galaxy Team | first4 = T.
| title = Galaxy: A comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences
| doi = 10.1186/gb-2010-11-8-r86
| journal = Genome Biology
| volume = 11
| issue = 8
| pages = R86
| year = 2010
| pmid = 20738864
| pmc =2945788
}}</ref>
* [[GenePattern]]: a scientific workflow system that provides access to hundreds of genomic analysis tools<ref>{{Cite journal
| pmid = 16642009
| year = 2006
| last1 = Reich
| first1 = Michael
| title = GenePattern 2.0
| journal = Nature Genetics
| volume = 38
| issue = 5
| pages = 500–501
| doi = 10.1038/ng0506-500
| s2cid = 5503897
|display-authors=etal}}</ref>
* [[GenPipes]]: a Python-based framework for developing and deploying multi-step genomic workflows, optimized for high-performance computing (HPC) clusters and the cloud<ref>{{Cite journal |author1=Mathieu Bourgey |author2=Rola Dali |author3=Robert Eveleigh |author4=Kuang Chung Chen |author5=Louis Letourneau |author6=Joel Fillon |author7=Marc Michaud |author8=Maxime Caron |author9=Johanna Sandoval |author10=Francois Lefebvre |author11=Gary Leveque |author12=Eloi Mercier |author13=David Bujold |author14=Pascale Marquis |author15=Patrick Tran Van |author16=David Anderson de Lima Morais |author17=Julien Tremblay |author18=Xiaojian Shao |author19=Edouard Henrion |author20=Emmanuel Gonzalez |author21=Pierre-Olivier Quirion |author22=Bryan Caron |author23=Guillaume Bourque |title=GenPipes: an open-source framework for distributed and scalable genomic analyses |url=https://academic.oup.com/gigascience/article/8/6/giz037/5513895 |journal=GigaScience |volume=8 |issue=6 |date=June 2019}}</ref>
* [[KNIME]]: the Konstanz Information Miner<ref>{{Cite journal | doi = 10.1016/j.compbiolchem.2007.08.009| pmid = 17931570| title = Workflow based framework for life science informatics| journal = Computational Biology and Chemistry| year = 2007| volume=31| issue = 5–6| pages=305–319| last1 = Tiwari| first1 = Abhishek| last2 = Sekhar| first2 = Arvind K.T.}}</ref>
* Nextflow
* [[OnlineHPC]]: an online workflow designer based on [[Taverna workbench|Taverna]]{{citation needed|date=September 2016}}
* Snakemake: a Python-driven system, modelled on [[Make (software)|Make]], for the automatic assembly and execution of workflows
* [[UGENE]]: provides a workflow management system that is installed on a local computer<ref>{{Cite journal
| pmid = 22368248
| year = 2012
| last1 = Okonechnikov
| first1 = K
| title = Unipro UGENE: A unified bioinformatics toolkit
| journal = Bioinformatics
| volume = 28
| issue = 8
| pages = 1166–7
| last2 = Golosova
| first2 = O
| last3 = Fursov
| first3 = M
| last4 = Ugene
| first4 = Team
| doi = 10.1093/bioinformatics/bts091
| doi-access = free
}}</ref>
* [[VisTrails]]<ref>{{Cite book | doi = 10.1109/VISUAL.2005.1532788 | title = VisTrails: enabling interactive multiple-view visualizations| year = 2005 | journal=VIS 05. IEEE Visualization, 2005.| pages = 135–142| last1 = Bavoil| first1 = L.| last2 = Callahan| first2 = S.P.| last3 = Crossno| first3 = P.J.| last4 = Freire| first4 = J.| last5 = Scheidegger| first5 = C.E.| last6 = Silva| first6 = C.T.| last7 = Vo| first7 = H.T.| isbn = 978-0-7803-9462-9}}</ref>
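Systems in the list that follow the [[Make (software)|Make]] model derive the steps to run from the files a user asks for, rather than from a hand-written sequence. The following is a minimal sketch of that idea only; it is not the API of Snakemake or of any other listed system. The function <code>plan</code>, the rule names and the file names are hypothetical, and, unlike Make, the sketch checks only whether a file exists, not whether it is out of date.

```python
def plan(target, rules, existing):
    """Return the rule names to run, in order, so that `target` gets produced.

    rules:    maps a rule name to (output file, list of input files)
    existing: set of files already present; no rule is run to produce these
    """
    # Index the rules by the file each one produces.
    producers = {output: name for name, (output, _) in rules.items()}
    steps = []

    def visit(path, pending):
        if path in existing:
            return  # already available, nothing to do
        if path not in producers:
            raise ValueError(f"no rule produces {path!r}")
        if path in pending:
            raise ValueError(f"cyclic dependency on {path!r}")
        name = producers[path]
        if name in steps:
            return  # rule already scheduled via another branch
        _, inputs = rules[name]
        for needed in inputs:
            visit(needed, pending | {path})
        steps.append(name)  # schedule only after all inputs are planned

    visit(target, frozenset())
    return steps


# Hypothetical rules: align reads to a genome, then call variants.
rules = {
    "align": ("aligned.bam", ["reads.fastq", "genome.fa"]),
    "call": ("variants.vcf", ["aligned.bam", "genome.fa"]),
}

full_run = plan("variants.vcf", rules, {"reads.fastq", "genome.fa"})
partial_run = plan("variants.vcf", rules, {"reads.fastq", "genome.fa", "aligned.bam"})
print(full_run, partial_run)
```

When the intermediate file is already present, the alignment rule is skipped; this is the behaviour that lets such systems resume an interrupted analysis without repeating completed steps.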
==Comparisons between workflow systems==
With a large number of bioinformatics workflow systems to choose from,<ref>{{cite web|url=https://s.apache.org/existing-workflow-systems|title=Existing Workflow systems|website=Common Workflow Language wiki|archive-url=https://web.archive.org/web/20191017094453/https://github.com/common-workflow-language/common-workflow-language/wiki/Existing-Workflow-systems|archive-date=2019-10-17|url-status=live|access-date=2019-10-17}}</ref> it is difficult to understand and compare their features. Little work has been done to evaluate and compare the systems from a bioinformatician's perspective, particularly with regard to the data types they can handle, the built-in functionality they provide to the user, and their performance or usability. Examples of existing comparisons include:
* The paper "Scientific workflow systems - can one size fit all?",<ref name="CIBEC 2008"/> which provides a high-level framework for comparing workflow systems based on their control flow and data flow properties. The systems compared include [[Discovery Net]], [[Taverna workbench|Taverna]], Triana and [[Kepler scientific workflow system|Kepler]], as well as YAWL and [[Business Process Execution Language|BPEL]].
* The paper "Meta-workflows: pattern-based interoperability between Galaxy and Taverna"<ref>
{{Cite book | last1 = Abouelhoda | first1 = M. | last2 = Alaa | first2 = S. | last3 = Ghanem | first3 = M. | doi = 10.1145/1833398.1833400 | chapter = Meta-workflows | title = Proceedings of the 1st International Workshop on Workflow Approaches to New Data-centric Science - Wands '10 | pages = 1 | year = 2010 | isbn = 9781450301886 | s2cid = 17343728 }}</ref>, which provides a more user-oriented comparison between [[Taverna workbench|Taverna]] and [[Galaxy (computational biology)|Galaxy]] in the context of enabling interoperability between the two systems.
* The infrastructure paper "Delivering ICT Infrastructure for Biomedical Research"<ref>
{{Citation
| last1 = Nyrönen | first1 = TH
| last2 = Laitinen | first2 = J
| title = Delivering ICT infrastructure for biomedical research
| pages = 37–44
| series = Proceedings of the WICSA/ECSA 2012 Companion Volume (WICSA/ECSA '12)
| year = 2012
| publisher = ACM
| doi = 10.1145/2361999.2362006
|display-authors=etal| isbn = 9781450315685
| s2cid = 18199745
}}
</ref> compares two workflow systems, [[Anduril (workflow engine)|Anduril]] and Chipster,<ref name=chipster>{{Cite journal
| pmid = 21999641
| pmc = 3215701
| year = 2011
| last1 = Kallio
| first1 = M. A.
| title = Chipster: User-friendly analysis software for microarray and other high-throughput data
| journal = BMC Genomics
| volume = 12
| pages = 507
| last2 = Tuimala
| first2 = J. T.
| last3 = Hupponen
| first3 = T
| last4 = Klemelä
| first4 = P
| last5 = Gentile
| first5 = M
| last6 = Scheinin
| first6 = I
| last7 = Koski
| first7 = M
| last8 = Käki
| first8 = J
| last9 = Korpelainen
| first9 = E. I.
| doi = 10.1186/1471-2164-12-507
}}</ref> in terms of infrastructure requirements in a cloud-delivery model.
* The paper "A review of bioinformatic pipeline frameworks"<ref>{{cite journal |last=Leipzig |first=J |date=2016 |title=A review of bioinformatic pipeline frameworks |journal=Briefings in Bioinformatics |volume=18 |issue=3 |pages=530–536 |doi=10.1093/bib/bbw020 |pmid=27013646 |pmc=5429012 |name-list-style=vanc }}
</ref> attempts to classify workflow management systems based on three dimensions: "using an implicit or explicit syntax, using a configuration, convention or class-based design paradigm and offering a command line or workbench interface".
==References==
{{Reflist}}
{{DEFAULTSORT:Bioinformatics workflow management systems}}
[[Category:Bioinformatics]]
[[Category:Bioinformatics software]]
[[Category:Workflow applications]]