Bioinformatics Open Source Conference: Difference between revisions

Added a section about the 2024 BOSC conference
m General fixes via AutoWikiBrowser, removed stub tag
 
(5 intermediate revisions by 3 users not shown)
Line 1:
 
{{Infobox recurring event
| name = Bioinformatics Open Source Conference
Line 14 ⟶ 13:
| next = BOSC 2023
| frequency = Annually
| location = [[Madison, Wisconsin|Madison]], United States (2022)
| years_active = {{age|2000|08|17}}
| first =
Line 29 ⟶ 28:
| footnotes =
}}
The '''Bioinformatics Open Source Conference''' ('''BOSC''') is an [[academic conference]] on [[Open source|open-source]] programming and other [[open science]] practices in [[bioinformatics]], organised by the [[Open Bioinformatics Foundation]]. The conference has been held annually since 2000 and is run as a two-day meeting either within the [[Intelligent Systems for Molecular Biology]] (ISMB) conference or as a joint conference with the [[Galaxy (computational biology)|Galaxy]] community.
 
== Program ==
Line 41 ⟶ 40:
BOSC 2016 was organized in [[Orlando, Florida]] from July 8–9 before the main ISMB conference.<ref>{{Cite web|url=https://www.open-bio.org/wiki/BOSC_2016|title=BOSC 2016 – Open Bioinformatics Foundation|last=|first=|date=|website=|publisher=Open Bio|access-date=}}</ref>
 
In 2018 and 2020, BOSC partnered with [[Galaxy (computational biology)|Galaxy]] to organize two joint conferences called ''GCCBOSC'' and ''Bioinformatics Community Conference'' (BCC) respectively.<ref>{{Cite web|url=https://www.open-bio.org/events/bosc-2021/about/#Past_BOSCs|title=About BOSC - Open Bioinformatics Foundation|accessdate=November 22, 2022}}</ref> The event in 2018 was held in [[Portland, Oregon]].<ref>{{Cite web|url=https://www.open-bio.org/2018/07/27/gccbosc-2018-post-meeting-report/|title=GCCBOSC 2018 - Open Bioinformatics Foundation|accessdate=November 22, 2022}}</ref> The BCC in 2020 took place online with two time schedules for eastern and western time zones.<ref>{{Cite web|url=https://bcc2020.github.io/|title=Bioinformatics Community Conference|accessdate=November 22, 2022}}</ref>
 
Since 2021, BOSC has again taken place within the ISMB conferences. In 2023, BOSC took place in [[Lyon]], France, from July 24–28 as part of the ISMB/[[European Conference on Computational Biology|ECCB]] conference.
 
== Conference Highlights ==
=== BOSC 2024 ===
 
The BOSC 2024 conference was part of the [https://www.iscb.org/ismb2024/home '''Intelligent Systems for Molecular Biology Conference of 2024''']. The 2024 event, which took place in '''Montreal, Canada''', also marked the 25th anniversary of the conference.<ref name="NCBIEvent2024">{{Cite web
| title = NCBI at ISMB 2024 & BOSC
| url = https://ncbiinsights.ncbi.nlm.nih.gov/event/ncbi-ismb-bosc-2024/
| website = NCBI Insights
| date = 21 June 2024
| access-date = 2025-04-24
}}</ref>
 
The conference was held in a hybrid setting, with around 200 people attending in person and many others viewing the presentations online. The conference covered a wide variety of topics, with the main theme focusing on approaches to using [[Artificial intelligence|artificial intelligence (AI)]] and [[Machine learning|machine learning (ML)]] in [[bioinformatics]].<ref name="BOSC25Years">{{Cite journal
| last1 = Harris
| first1 = Nomi L.
| last2 = Hokamp
| first2 = Karsten
| last3 = Maia
| first3 = Jessica
| last4 = Ménager
| first4 = Hervé
| last5 = Munoz-Torres
| first5 = Monica C.
| last6 = Sawant
| first6 = Swapnil
| last7 = Unni
| first7 = Deepak
| last8 = Williams
| first8 = Jason
| title = 25 Years of BOSC, the Bioinformatics Open Source Conference
| journal = F1000Research
| volume = 13
| year = 2024
| doi = 10.12688/f1000research.156426.1
| doi-broken-date = 1 July 2025
| doi-access = free
| url = https://f1000research.com/articles/13-1100
| access-date = 2025-04-23
}}</ref>
 
===Keynote Speakers===
The conference featured two keynote speakers.

One of them, Dr. Mélanie Courtot, gave a presentation titled '''"The Data Shows We Need Better Data"''' on day one of the conference. During her speech, she discussed some of the resources available for obtaining quality free data and open-source software for conducting research. In addition, she introduced the '''TRUE principles''' for preparing data for AI tools. '''TRUE''' is an acronym standing for '''Tracked''', '''Reasonable''', '''Understandable''', and '''Ethical'''.

Dr. Courtot explained that '''Tracked''' data for AI means that it should be known how the data was obtained, there should be evidence to support the claims of the data, and the authors who released the data should be properly credited. The final part of this principle is that the data should be computationally manageable.

The '''Reasonable''' component of the principle states that the data should be organized in a logical way so that new inferences and conclusions can be made from it.

The '''Understandable''' part dictates that the data should be able to be processed by open-source AI models. Some of the models she included in her presentation were [https://ai.meta.com/llama/ LLaMA] and [https://mistral.ai Mistral].

Finally, the '''Ethical''' principle emphasized that available data should promote diversity, equity, and inclusion, while maintaining the privacy of those the data may be linked to.

The second keynote speaker, '''Andrew Su''', presented on day two with a talk titled '''"Open Data, Knowledge Graphs, and Large Language Models"'''. This presentation discussed how, despite the usefulness of [[Large language model|large language models]] (LLMs) for retrieving data or answering specific questions, they are not always accurate and the responses they generate still need to be verified.

A solution he presented was [[Retrieval-augmented generation|Retrieval-Augmented Generation (RAG)]]. He explained this as a way to improve the accuracy of answers provided by LLMs by keeping the information they query well-organized.

Another topic in his presentation was tools that can be used to test the accuracy and rate the quality of answers obtained from LLMs.<ref name="BOSC25Years" />

===Other Presentations===
Other than the keynote speakers, there were a total of 36 talks and 23 posters selected to be presented at the conference over the course of multiple sessions. One of the sessions was '''Data Analysis'''. These presentations were about approaches to analyzing [[Biomedical data science|biomedical data]], different types of data that are freely available for use, and some of the research that has been done using these [[Open source|open-source]] tools and data. Another session was titled '''Open Data Session''', which included presentations about some of the freely available [[database]]s, [[open data portal]]s, and platforms that are being used by researchers around the world. The session '''Visualization''' included presentations about new additions to older [[Biological Databases|biological databases]].
The next session was titled '''“Standards and Frameworks for Open Science”'''. This session focused on how to create consistent, reusable, and long-lasting open-source software. The final session was called '''“Open Approaches to AI/ML”''', which was about how to use machine learning to solve biological problems.<ref name="BOSC25Years" />
 
==== Timeline ====
 
==== Day 1 ====
 
Other than the keynote speakers, there were a total of 36 talks and 23 posters selected to be presented at the conference. One of the sessions for day one was '''Data Analysis'''. These presentations were about open-source approaches to analyzing biomedical data, different types of data that are freely available for use, and some of the research that has been done using these open-source tools and data. Some of the presentations for this session included:
* '''"Gemma: Curation, Re-analysis and Dissemination of 18,000 Gene Expression Studies"''' by Paul Pavlidis
**[https://www.youtube.com/watch?v=vpqd5nt5Juc Recorded presentation]
* '''"ROC Picker: Propagating Statistical and Systematic Uncertainties in Biological Analysis"''' by Jeffery Roskes
**[https://www.youtube.com/watch?v=8tq_aBc1nh4 Recorded presentation]
* '''"Antimicrobial Resistance Prediction of Non-Tuberculosis Mycobacteria from Whole Genome Sequence Data"''' by Idowu Olawoye
**[https://www.youtube.com/watch?v=ya723kllWfo Recorded presentation]
 
The next session of day one was the '''Open Data Session''', which included presentations about some of the databases, data portals, and platforms that are being used by researchers around the world. Some of the presentations in this session were:
* '''"Creating an Open-source Data Platform"''' by Mitchell Shiell
**[https://www.youtube.com/watch?v=vv_gT6cwJPM Recorded presentation]
* '''"Going Viral: The Development of the VirusSeq Data Portal"''' by Justin Richardsson
**[https://www.youtube.com/watch?v=Qb9Kn75kkgA Recorded presentation]
* '''"intermine.bio2rdf.org: A QLever SPARQL Endpoint for InterMine Databases"''' by Francois Belleau
**[https://www.youtube.com/watch?v=YuCdruCgX_Y Recorded presentation]
 
The next session was '''Visualization''', which included presentations about new additions to older databases. Presentations in this session included:
* '''"Connecting Integrated Genome Browser to a Huge Genome Database Using Its Own API Solves One Problem and Creates Another"''' by Ann Loraine
**[https://www.youtube.com/watch?v=xT330tEGvJ8 Recorded presentation]
* '''"Collaborating Our Way to Optimal Integration Between Tripal 4 and JBrowse 2"''' by Carolyn T. Caron
**[https://www.youtube.com/watch?v=UDYQ6FlazZo Recorded presentation]
* '''"An Integrated Environment for Browsing 3-D Protein Structures and Multiple Sequence Alignment in JBrowse 2"''' by Colin Diesh
**[https://www.youtube.com/watch?v=EQmUowU6Y8A Recorded presentation]
 
The last session for day one was '''Developer Tools and Libraries''', displaying some of the open-source tools used for analyzing data. Some of the presentations in this session included:
* '''"Codefair: Make Biomedical Research FAIR Without Breaking a Sweat"''' by Bhavesh Patel
**[https://www.youtube.com/watch?v=8OBm0SsJw7s Recorded presentation]
* '''"An Open-source Ecosystem for Scalable and Computationally Efficient Nanopore Data Processing"''' by Avishai Weissberg
**[https://www.youtube.com/watch?v=VaSctMRQYxQ Recorded presentation]
* '''"Tattaki: Enhancing the Robustness of Bioinformatics Workflows with Simple, Tolerant File Format Detection"''' by Masaki Fuki
**[https://www.youtube.com/watch?v=7GGluYq7qD8 Recorded presentation]
 
====Day 2====
The first session of day two was '''“Standards and Frameworks for Open Science”'''. This session focused on how to create consistent, reusable, and long-lasting software. Presentations in this session included:
 
* '''"Enhancing Reproducibility in Immunogenetics: Leveraging Containerization Technology for Bioinformatics Workflows"''' by Rayo Suseno
**[https://www.youtube.com/watch?v=5k_32AYe-iw Recorded presentation]
* '''"Breaking the silo: composable bioinformatics through cross-disciplinary open standards"''' by Nezar Abdennur
**[https://www.youtube.com/watch?v=mzkE-O8Jrq0 Recorded presentation]
* '''"For long-term sustainable software in bioinformatics: a manifesto"''' by Luis Pedro Coelho
**[https://www.youtube.com/watch?v=u9h83qnCEsI Recorded presentation]
 
The next session was called '''“Open Approaches to AI/ML”''', which was about how to use machine learning to solve biological problems. Presentations in this session included:
* '''"Gene Set Summarization Using Large Language Models"''' by Marcin Joachimiak
**[https://www.youtube.com/watch?v=hRWbBkKqjakf Recorded presentation]
* '''"FAIR, modular and reproducible image-based ML workflows for biologists: a template and case study from imageomics"''' by Hilmar Lapp
**[https://www.youtube.com/watch?v=PUusHdapEss Recorded presentation]
* '''"Trust and Transparency in Reporting Machine Learning: The DOME-GigaScience Press Trial"''' by Chris Armit
**[https://www.youtube.com/watch?v=XDh9N4c68pA Recorded presentation]
 
====Open Panel Discussion====
[[File:BOSC2024panelists.gif|thumb|From left to right, Monica Munoz-Torres, Thomas Hervé Mboa Nkoudou, Mélanie Courtot, Lawrence Hunter, and Andrew Su during the “Open Source AI/ML: A Game Changer for Bioinformatics?” open panel discussion]]
The events of day two concluded with an open panel discussion titled “Open Source AI/ML: A Game Changer for Bioinformatics?”.
The researchers on the panel included Lawrence Hunter, Thomas Hervé Mboa Nkoudou, Mélanie Courtot, and Andrew Su. The moderator of the panel was Monica Munoz-Torres. This discussion explored the benefits and drawbacks of using Artificial Intelligence and Machine Learning in Bioinformatics research.<ref name="BOSC25Years" />
 
Once each of the panelists had explained their positions, the discussion was opened to the audience. After a long discussion, the panelists' stances were split: half thought the use of AI and ML in bioinformatics had been important and beneficial for the field, while the other half were still wary of its potential harms.<ref>{{cite web |title=BOSC 2024 |url=https://www.open-bio.org/events/bosc-2024/ |website=Open Bioinformatics Foundation |access-date=April 21, 2025}}</ref><ref>{{cite web |last=Courtot |first=Mélanie |title=BOSC keynote |url=https://courtotlab.genomeinformatics.org/2024/07/15/BOSC-keynote.html |website=Courtot Lab Genome Informatics |date=July 15, 2024 |access-date=April 21, 2025}}</ref><ref>{{cite web |title=BOSC 2024 Schedule |url=https://www.open-bio.org/events/bosc-2024/bosc-2024-schedule/ |website=Open Bioinformatics Foundation |access-date=April 21, 2025}}</ref><ref>{{cite journal |last=Harris |first=Nomi L. |last2=Hokamp |first2=Karsten |last3=Maia |first3=Jessica |last4=Ménager |first4=Hervé |last5=Munoz-Torres |first5=Monica C. |last6=Sawant |first6=Swapnil |last7=Unni |first7=Deepak |last8=Williams |first8=Jason |title=25 Years of BOSC, the Bioinformatics Open Source Conference [version 1; peer review: not peer reviewed] |journal=F1000Research |date=September 27, 2024 |volume=13 |pages=1100 |doi=10.12688/f1000research.156426.1 |url=https://f1000research.com/articles/13-1100 |access-date=April 21, 2025}}</ref>

==External links==
*[https://www.open-bio.org/events/bosc-2024/ BOSC 2024 conference page]
*[https://www.youtube.com/watch?v=jNgOhlDi-BI Full recorded BOSC 2024 open panel discussion]
*[https://www.open-bio.org/events/bosc-2024/bosc-2024-schedule/ Other BOSC 2024 presentations]
*[https://www.youtube.com/watch?v=T_0Yb9hldlw Dr. Mélanie Courtot BOSC 2024 keynote speech]
*[https://www.youtube.com/watch?v=-lgXRHY1vw4&t=2s Andrew Su BOSC 2024 keynote speech]
*[https://www.iscb.org/ismb2024/home ISMB 2024 conference page]
 
=== BOSC 2023 ===
 
The '''2023 Bioinformatics Open Source Conference (BOSC 2023)''' was held on '''July 24–25, 2023''', drawing over '''2,100 in-person attendees''' and approximately '''900 online viewers'''. About '''200 participants''' actively engaged in the event's sessions and activities.<ref>{{cite web |title=BOSC 2023, the 24th annual Bioinformatics Open Source Conference |url=https://pmc.ncbi.nlm.nih.gov/articles/PMC10704065/ |website=PubMed Central |publisher=National Center for Biotechnology Information|access-date=21 April 2025}}</ref>
 
=== Keynote Presentations ===
 
The keynote speakers were '''Sara El-Gebali''' and '''Joseph M. Yracheta'''.
 
* El-Gebali presented ''“A New Odyssey: Pioneering the Future of Scientific Progress Through Open Collaboration”''. Her talk explored navigating the realm of science through diverse alliances and institutions, with a focus on promoting open science through collaboration.
 
* Yracheta gave a talk titled ''“The Dissonance between Scientific Altruism & Capitalist Extraction: The Zero Trust and Federated Data Sovereignty Solution”'', offering insights from the American Indian perspective in the United States. He critiqued the lack of clarity and transparency in current Open Data policies, arguing they tend to prioritize funding and researcher data rights over individual privacy.
 
=== Open and Ethical Data Sharing Panel ===
[[File:BOSC2023 Open Data Panel.jpg|thumb|Panelists at the BOSC 2023 discussion on Open and Ethical Data Sharing: Verena Ras, Sara El-Gebali, Bastian Greshake Tzovaras, and Joseph M. Yracheta.]]
In addition to the keynotes, BOSC 2023 hosted a panel on '''Open and Ethical Data Sharing''', featuring keynote speakers El-Gebali and Yracheta along with '''Verena Ras''' and '''Bastian Greshake Tzovaras'''. The panel addressed the absence of a formal ethical code for bioinformaticians and emphasized the need for stronger advocacy in ethical data sharing practices.
 
=== Topical Sessions and Posters ===
[[File:BOSC2023 Poster.jpg|thumb|A participant presenting the poster BioThings Explorer: A query engine for a federated knowledge graph of biomedical APIs]]
BOSC also featured a topical session comprising '''53 talks''', with '''49 presenters displaying posters'''. Topics included, but were not limited to:
 
* Open Science and Reproducible Research
* Open Biomedical Data
* Citizen/Participatory Science
* Standards and Interoperability
* Data Science, Workflows, Data Access and Visualization
* Open Approaches to Translational Bioinformatics
* Developer Tools and Libraries
* Inclusion, Outreach and Training
<ref>{{cite web |title=BOSC 2023: Bioinformatics Open Source Conference |url=https://www.open-bio.org/events/bosc-2023/|website=BOSC 2023 |publisher=Open Bioinformatics Foundation|access-date=21 April 2025}}</ref>
 
=== BOSC 2022 ===
BOSC 2022 marked the first hybrid Bioinformatics Open Source Conference, offering both virtual and in-person attendance in Madison, Wisconsin. Approximately 1,000 participants attended in person, with an additional 800 joining virtually. The conference featured a panel discussion titled 'Building and Sustaining Inclusive Open Science Communities,' along with 28 talks and 46 posters covering various topics in bioinformatics. BOSC 2022 also included joint keynotes with the Education and Bio-Ontologies Communities of Special Interest (COSIs). Jason Williams presented 'Riding the Bicycle: Including All Scientists on a Path to Excellence,' and Melissa Haendel delivered 'The Open Data Highway: Turbo-Boosting Translational Traffic with Ontologies.'
<ref>{{cite web |last1=Harris |first1=Nomi |title=BOSC 2022: the first hybrid and 23rd annual Bioinformatics Open Source Conference |url=https://pubmed.ncbi.nlm.nih.gov/36128559/ |website=PubMed NCBI |publisher=U.S. National Library of Medicine |access-date=20 April 2025}}</ref>{{Use mdy dates|date=August 2017}}
[[File:Jason Williams moderating the panel on “Building and Sustaining Inclusive Open Science Communities”.jpg|thumb|Keynote speaker Jason Williams conducting BOSC 2022 panel discussion.]]
 
=== Past conferences ===
As of January 2024, 24 BOSC meetings have been held around the world; of those, 20 were purely in-person conferences, 2 were purely remote due to the [[COVID-19 pandemic]], and one was organized as a hybrid meeting.<ref>{{Cite web |last= |title=OBF » About BOSC » About BOSC |url=https://www.open-bio.org/events/bosc-2021/about/ |access-date=2022-11-23 |language=en-US}}</ref>
{| class="wikitable sortable"
|+
!Year
!Conference partner
!Location
!Keynote speakers
|-
|2023
|ISMB
|Lyon, France
|Joseph M. Yracheta, Sara El-Gebali
|-
|2022
|ISMB
|Hybrid: Madison, WI and online
|Jason Williams, [[Melissa Haendel]]
|-
|2021
|ISMB
|Online (would have been Lyon)
|Christie Bahlai, Lara Mangravite, Thomas Hervé Mboa Nkoudou
|-
|2020
|GCC
|Online (would have been Toronto)
|[[Lincoln Stein]], Abigail Cabunoc Mayes
|-
|2019
|ISMB
|Basel, Switzerland
|Nicola Mulder
|-
|2018
|GCC
|Portland, OR
|[[Fernando Pérez (software developer)|Fernando Pérez]], [[Tracy Teal]]
|-
|2017
|ISMB
|Prague, Czech Republic
|Mad Price Ball, [[Nick Loman]]
|-
|2016
|ISMB
|Orlando, FL
|[[Jennifer Gardy]], [[Steven Salzberg]]
|-
|2015
|ISMB
|Dublin, Ireland
|[[Ewan Birney]], Holly Bik
|-
|2014
|ISMB
|Boston, MA
|[[Philip Bourne]], Titus Brown
|-
|2013
|ISMB
|Berlin, Germany
|[[Sean Eddy]], [[Cameron Neylon]]
|-
|2012
|ISMB
|Long Beach, CA
|[[Jonathan Eisen]], [[Carole Goble]]
|-
|2011
|ISMB
|Vienna, Austria
|[[Lawrence Hunter]], Matt Wood
|-
|2010
|ISMB
|Boston, MA
|Guy Coates, [[Ross Gardler]]
|-
|2009
|ISMB
|Stockholm, Sweden
|Robert Hanmer, Alan Ruttenberg
|-
|2008
|ISMB
|Toronto, Canada
|[[Julian Lombardi]]
|-
|2007
|ISMB
|Vienna, Austria
|[[Carole Goble]]
|-
|2006
|ISMB
|Fortaleza, Brazil
|[[Amos Bairoch]], Alberto M.R. Davila
|-
|2005
|ISMB
|Detroit, MI
|Hilmar Lapp
|-
|2004
|ISMB
|Glasgow, Scotland
|[[Wolfgang Huber (scientist)|Wolfgang Huber]]
|-
|2003
|ISMB
|Brisbane, Australia
| -
|-
|2002
|ISMB
|Edmonton, Canada
|[[Ewan Birney]], [[Michael Eisen]], Winston Hide
|-
|2001
|ISMB
|Copenhagen, Denmark
|[[Steven Brenner]]
|-
|2000
|ISMB
|San Diego, CA
|[[Tim O'Reilly]], [[Lincoln Stein]]
|}
 
==References==
Line 312 ⟶ 115:
[[Category:Bioinformatics]]
[[Category:Biology conferences]]
 
 
{{bioinformatics-stub}}