Bioinformatics Open Source Conference
 
Since 2021, BOSC has been taking place within the ISMB conferences again. In 2023 BOSC took place in [[Lyon]], France between July 24-28 as part of the ISMB/[[European_Conference_on_Computational_Biology|ECCB]] conference.
 
== Conference Highlights ==
=== BOSC 2024 ===
 
The BOSC 2024 conference was part of the [https://www.iscb.org/ismb2024/home Intelligent Systems for Molecular Biology Conference of 2024]. The 2024 event, held in '''Montreal, Canada''', also marked the 25th anniversary of the conference.
 
The conference was held in a hybrid setting, with around 200 people attending in person and many others viewing the presentations online.
 
The conference covered a wide variety of topics, with the main theme focusing on approaches to using '''Artificial Intelligence''' and '''Machine Learning''' in '''Bioinformatics'''.
 
==== Event Highlights ====
 
The conference featured two keynote speakers.
 
One of them, Dr. Mélanie Courtot, gave a presentation titled '''"The Data Shows We Need Better Data"''' on day one of the conference. In her talk, she discussed some of the resources available for obtaining high-quality, freely available data and open-source software for conducting research. In addition, she introduced the '''TRUE principles''' for preparing data for AI tools. '''TRUE''' is an acronym standing for '''Tracked''', '''Reasonable''', '''Understandable''', and '''Ethical'''.
 
Dr. Courtot explained that the '''Tracked''' component means that it should be known how the data was obtained, that there should be evidence supporting the claims the data makes, and that the authors who released the data should be properly credited. The final part of this principle is that the data should be computationally manageable.
 
The '''Reasonable''' component of the principle states that the data should be organized in a logical way so that new inferences and conclusions can be made from it.
 
The '''Understandable''' component states that the data should be processable by open-source AI models. Some of the models she included in her presentation were [https://ai.meta.com/llama/ LLaMA] and [https://mistral.ai Mistral].
 
Finally, the '''Ethical''' principle emphasized that available data should promote diversity, equity, and inclusion, while maintaining the privacy of those the data may be linked to.
 
The second keynote speaker, '''Andrew Su''', presented on day two with a talk titled '''"Open Data, Knowledge Graphs, and Large Language Models"'''. He discussed how, despite the usefulness of large language models (LLMs) for retrieving data or answering specific questions, they are not always accurate and the responses they generate still need to be verified.
 
A solution he presented was [[Retrieval-augmented generation|Retrieval-Augmented Generation (RAG)]]. He explained this as a way to improve the accuracy of answers provided by LLMs by keeping the information they query well-organized.
 
Another topic in his presentation included tools that can be used to test the accuracy and rate the quality of answers obtained from LLMs.
 
==== Timeline ====
 
==== Day 1 ====
 
In addition to the keynotes, a total of 36 talks and 23 posters were selected for presentation at the conference. One of the sessions on day one was '''Data Analysis'''. Its presentations covered open-source approaches to analyzing biomedical data, different types of freely available data, and some of the research that has been done using these open-source tools and data. Some of the presentations in this session included:
* '''"Gemma: Curation, Re-analysis and Dissemination of 18,000 Gene Expression Studies"''' by Paul Pavlidis
**[https://www.youtube.com/watch?v=vpqd5nt5Juc Recorded presentation]
* '''"ROC Picker: Propagating Statistical and Systematic Uncertainties in Biological Analysis"''' by Jeffery Roskes
**[https://www.youtube.com/watch?v=8tq_aBc1nh4 Recorded presentation]
* '''"Antimicrobial Resistance Prediction of Non-Tuberculosis Mycobacteria from Whole Genome Sequence Data"''' by Idowu Olawoye
**[https://www.youtube.com/watch?v=ya723kllWfo Recorded presentation]
 
The next session of day one was the '''Open Data Session''', which included presentations about some of the databases, data portals, and platforms that are being used by researchers around the world. Some of the presentations in this session were:
* '''"Creating an Open-source Data Platform"''' by Mitchell Shiell
**[https://www.youtube.com/watch?v=vv_gT6cwJPM Recorded presentation]
* '''"Going Viral: The Development of the VirusSeq Data Portal"''' by Justin Richardsson
**[https://www.youtube.com/watch?v=Qb9Kn75kkgA Recorded presentation]
* '''"intermine.bio2rdf.org: A QLever SPARQL Endpoint for InterMine Databases"''' by Francois Belleau
**[https://www.youtube.com/watch?v=YuCdruCgX_Y Recorded presentation]
 
The next session was '''Visualization''', which included presentations about new additions to older databases. Presentations in this session included:
* '''"Connecting Integrated Genome Browser to a Huge Genome Database Using Its Own API Solves One Problem and Creates Another"''' by Ann Loraine
**[https://www.youtube.com/watch?v=xT330tEGvJ8 Recorded presentation]
* '''"Collaborating Our Way to Optimal Integration Between Tripal 4 and JBrowse 2"''' by Carolyn T. Caron
**[https://www.youtube.com/watch?v=UDYQ6FlazZo Recorded presentation]
* '''"An Integrated Environment for Browsing 3-D Protein Structures and Multiple Sequence Alignment in JBrowse 2"''' by Colin Diesh
**[https://www.youtube.com/watch?v=EQmUowU6Y8A Recorded presentation]
 
The last session of day one was '''Developer Tools and Libraries''', which showcased some of the open-source tools used for analyzing data. Some of the presentations in this session included:
* '''"Codefair: Make Biomedical Research FAIR Without Breaking a Sweat"''' by Bhavesh Patel
**[https://www.youtube.com/watch?v=8OBm0SsJw7s Recorded presentation]
* '''"An Open-source Ecosystem for Scalable and Computationally Efficient Nanopore Data Processing"''' by Avishai Weissberg
**[https://www.youtube.com/watch?v=VaSctMRQYxQ Recorded presentation]
* '''"Tattaki: Enhancing the Robustness of Bioinformatics Workflows with Simple, Tolerant File Format Detection"''' by Masaki Fuki
**[https://www.youtube.com/watch?v=7GGluYq7qD8 Recorded presentation]
 
====Day 2====
The first session of day two was '''"Standards and Frameworks for Open Science"'''. This session focused on how to create consistent, reusable, and long-lasting software. Presentations in this session included:
 
* '''"Enhancing Reproducibility in Immunogenetics: Leveraging Containerization Technology for Bioinformatics Workflows"''' by Rayo Suseno
**[https://www.youtube.com/watch?v=5k_32AYe-iw Recorded presentation]
* '''"Breaking the silo: composable bioinformatics through cross-disciplinary open standards"''' by Nezar Abdennur
**[https://www.youtube.com/watch?v=mzkE-O8Jrq0 Recorded presentation]
* '''"For long-term sustainable software in bioinformatics: a manifesto"''' by Luis Pedro Coelho
**[https://www.youtube.com/watch?v=u9h83qnCEsI Recorded presentation]
 
The next session, '''"Open Approaches to AI/ML"''', was about how to use machine learning to solve biological problems. Presentations in this session included:
* '''"Gene Set Summarization Using Large Language Models"''' by Marcin Joachimiak
**[https://www.youtube.com/watch?v=hRWbBkKqjakf Recorded presentation]
* '''"FAIR, modular and reproducible image-based ML workflows for biologists: a template and case study from imageomics"''' by Hilmar Lapp
**[https://www.youtube.com/watch?v=PUusHdapEss Recorded presentation]
* '''"Trust and Transparency in Reporting Machine Learning: The DOME-GigaScience Press Trial"''' by Chris Armit
**[https://www.youtube.com/watch?v=XDh9N4c68pA Recorded presentation]
 
====Open Panel Discussion====
 
The events of day two concluded with an open panel discussion titled '''“Open Source AI/ML: A Game Changer for Bioinformatics?”'''.
The panel consisted of Lawrence Hunter, Thomas Hervé Mboa Nkoudou, Mélanie Courtot, and Andrew Su, and was moderated by Monica Munoz-Torres. The discussion revolved around the potential gains and pitfalls of using AI and ML methods to conduct bioinformatics research.
 
Once each of the panelists had explained their position, the discussion was opened to the audience. After a long discussion, the panelists' stances were split: half thought that the use of AI and ML had been an important improvement for bioinformatics, while the other half remained wary of its potential harms.
<ref>{{cite web |title=BOSC 2024 |url=https://www.open-bio.org/events/bosc-2024/ |website=Open Bioinformatics Foundation |access-date=April 21, 2025}}</ref>
<ref>{{cite web |last=Courtot |first=Mélanie |title=BOSC keynote |url=https://courtotlab.genomeinformatics.org/2024/07/15/BOSC-keynote.html |website=Courtot Lab Genome Informatics |date=July 15, 2024 |access-date=April 21, 2025}}</ref>
<ref>{{cite web |title=BOSC 2024 Schedule |url=https://www.open-bio.org/events/bosc-2024/bosc-2024-schedule/ |website=Open Bioinformatics Foundation |access-date=April 21, 2025}}</ref>
<ref>{{cite journal |last1=Harris |first1=Nomi L. |last2=Hokamp |first2=Karsten |last3=Maia |first3=Jessica |last4=Ménager |first4=Hervé |last5=Munoz-Torres |first5=Monica C. |last6=Sawant |first6=Swapnil |last7=Unni |first7=Deepak |last8=Williams |first8=Jason |title=25 Years of BOSC, the Bioinformatics Open Source Conference [version 1; peer review: not peer reviewed] |journal=F1000Research |date=September 27, 2024 |volume=13 |pages=1100 |doi=10.12688/f1000research.156426.1 |doi-broken-date=April 22, 2025 |doi-access=free |url=https://f1000research.com/articles/13-1100 |access-date=April 21, 2025}}</ref>
 
=== BOSC 2023 ===
 
The '''2023 Bioinformatics Open Source Conference (BOSC 2023)''' was held on '''July 24–25, 2023''', drawing over '''2,100 in-person attendees''' and approximately '''900 online viewers'''. About '''200 participants''' actively engaged in the event's sessions and activities.<ref>{{cite journal |title=BOSC 2023, the 24th annual Bioinformatics Open Source Conference |date=2023 |pmc=10704065 |last1=Harris |first1=N. L. |last2=Fields |first2=C. J. |last3=Hokamp |first3=K. |last4=Just |first4=J. |last5=Khetani |first5=R. |last6=Maia |first6=J. |last7=Ménager |first7=H. |last8=Munoz-Torres |first8=M. C. |last9=Unni |first9=D. |last10=Williams |first10=J. |journal=F1000Research |volume=12 |page=1568 |doi=10.12688/f1000research.143015.1 |doi-access=free |pmid=38076297 }}</ref>
 
=== Keynote Presentations ===
 
The keynote speakers were '''Sara El-Gebali''' and '''Joseph M. Yracheta'''.
 
* El-Gebali presented ''“A New Odyssey: Pioneering the Future of Scientific Progress Through Open Collaboration”''. Her talk explored navigating the realm of science through diverse alliances and institutions, with a focus on promoting open science through collaboration.
 
* Yracheta gave a talk titled ''“The Dissonance between Scientific Altruism & Capitalist Extraction: The Zero Trust and Federated Data Sovereignty Solution”'', offering insights from the American Indian perspective in the United States. He critiqued the lack of clarity and transparency in current Open Data policies, arguing they tend to prioritize funding and researcher data rights over individual privacy.
 
=== Open and Ethical Data Sharing Panel ===
[[File:BOSC2023 Open Data Panel.jpg|thumb|Panelists at the BOSC 2023 discussion on Open and Ethical Data Sharing: Verena Ras, Sara El-Gebali, Bastian Greshake Tzovaras, and Joseph M. Yracheta.]]
In addition to the keynotes, BOSC 2023 hosted a panel on '''Open and Ethical Data Sharing''', featuring keynote speakers El-Gebali and Yracheta along with '''Verena Ras''' and '''Bastian Greshake Tzovaras'''. The panel addressed the absence of a formal ethical code for bioinformaticians and emphasized the need for stronger advocacy in ethical data sharing practices.
 
=== Topical Sessions and Posters ===
[[File:BOSC2023 Poster.jpg|thumb|A participant presenting the poster BioThings Explorer: A query engine for a federated knowledge graph of biomedical APIs]]
BOSC 2023 also featured topical sessions comprising '''53 talks''', along with '''49 poster presentations'''. Topics included, but were not limited to:
 
* Open Science and Reproducible Research
* Open Biomedical Data
* Citizen/Participatory Science
* Standards and Interoperability
* Data Science, Workflows, Data Access and Visualization
* Open Approaches to Translational Bioinformatics
* Developer Tools and Libraries
* Inclusion, Outreach and Training
<ref>{{cite web |title=BOSC 2023: Bioinformatics Open Source Conference |url=https://www.open-bio.org/events/bosc-2023/|website=BOSC 2023 |publisher=Open Bioinformatics Foundation|access-date=21 April 2025}}</ref>
 
=== BOSC 2022 ===
BOSC 2022 marked the first hybrid Bioinformatics Open Source Conference, offering both virtual and in-person attendance in Madison, Wisconsin. Approximately 1,000 participants attended in person, with an additional 800 joining virtually. The conference featured a panel discussion titled 'Building and Sustaining Inclusive Open Science Communities,' along with 28 talks and 46 posters covering various topics in bioinformatics. BOSC 2022 also included joint keynotes with the Education and Bio-Ontologies Communities of Special Interest (COSIs). Jason Williams presented 'Riding the Bicycle: Including All Scientists on a Path to Excellence,' and Melissa Haendel delivered 'The Open Data Highway: Turbo-Boosting Translational Traffic with Ontologies.'
<ref>{{cite journal |last1=Harris |first1=Nomi |title=BOSC 2022: the first hybrid and 23rd annual Bioinformatics Open Source Conference |journal=F1000Research |date=2022 |volume=11 |page=1034 |publisher=U.S. National Library of Medicine |doi=10.12688/f1000research.125043.1 |doi-access=free |pmid=36128559 |pmc=9468630 }}</ref>{{Use mdy dates|date=August 2017}}
[[File:Jason Williams moderating the panel on “Building and Sustaining Inclusive Open Science Communities”.jpg|thumb|Keynote speaker Jason Williams conducting BOSC 2022 panel discussion.]]
 
=== Past conferences ===
As of January 2024, 24 BOSC conferences have been held around the world. Of those, 20 were purely in-person conferences, 2 were purely remote due to the [[COVID-19 pandemic]], and one was organized as a hybrid meeting.<ref>{{Cite web |last= |title=OBF » About BOSC » About BOSC |url=https://www.open-bio.org/events/bosc-2021/about/ |access-date=2022-11-23 |language=en-US}}</ref>
{| class="wikitable sortable"
|+
!Year
!Conference partner
!Location
!Keynote speakers
|-
|2023
|ISMB
|Lyon, France
|Joseph M. Yracheta, Sara El-Gebali
|-
|2022
|ISMB
|Hybrid: Madison, WI and online
|Jason Williams, [[Melissa Haendel]]
|-
|2021
|ISMB
|Online (would have been Lyon)
|Christie Bahlai, Lara Mangravite, Thomas Hervé Mboa Nkoudou
|-
|2020
|GCC
|Online (would have been Toronto)
|[[Lincoln Stein]], Abigail Cabunoc Mayes
|-
|2019
|ISMB
|Basel, Switzerland
|Nicola Mulder
|-
|2018
|GCC
|Portland, OR
|[[Fernando Pérez (software developer)|Fernando Pérez]], [[Tracy Teal]]
|-
|2017
|ISMB
|Prague, Czech Republic
|Mad Price Ball, [[Nick Loman]]
|-
|2016
|ISMB
|Orlando, FL
|[[Jennifer Gardy]], [[Steven Salzberg]]
|-
|2015
|ISMB
|Dublin, Ireland
|[[Ewan Birney]], Holly Bik
|-
|2014
|ISMB
|Boston, MA
|[[Philip Bourne]], Titus Brown
|-
|2013
|ISMB
|Berlin, Germany
|[[Sean Eddy]], [[Cameron Neylon]]
|-
|2012
|ISMB
|Long Beach, CA
|[[Jonathan Eisen]], [[Carole Goble]]
|-
|2011
|ISMB
|Vienna, Austria
|[[Lawrence Hunter]], Matt Wood
|-
|2010
|ISMB
|Boston, MA
|Guy Coates, [[Ross Gardler]]
|-
|2009
|ISMB
|Stockholm, Sweden
|Robert Hanmer, Alan Ruttenberg
|-
|2008
|ISMB
|Toronto, Canada
|[[Julian Lombardi]]
|-
|2007
|ISMB
|Vienna, Austria
|[[Carole Goble]]
|-
|2006
|ISMB
|Fortaleza, Brazil
|[[Amos Bairoch]], Alberto M.R. Davila
|-
|2005
|ISMB
|Detroit, MI
|Hilmar Lapp
|-
|2004
|ISMB
|Glasgow, Scotland
|[[Wolfgang Huber (scientist)|Wolfgang Huber]]
|-
|2003
|ISMB
|Brisbane, Australia
| -
|-
|2002
|ISMB
|Edmonton, Canada
|[[Ewan Birney]], [[Michael Eisen]], Winston Hide
|-
|2001
|ISMB
|Copenhagen, Denmark
|[[Steven Brenner]]
|-
|2000
|ISMB
|San Diego, CA
|[[Tim O'Reilly]], [[Lincoln Stein]]
|}
 
==References==