Recap of the COVID-19 Biohackathon by the pangenome browser group
Several contributions helped bring the pangenome browser to the next level
In this time of global change, many of us have had to stop and think about why we do what we do. What are our efforts doing for the world, for those who need help most, for future generations? At Computomics, having a mission based around improving the global future has always been at our core, and as scientists we hold a firm belief that science in the right hands can do a lot of good. As our week of the COVID Pangenome Hackathon comes to a close, we are proud to say that we believe our efforts are making a difference, and we are grateful for the engagement and help of the global bioinformatics community.
As Josiah Seaman, from Kew Gardens London and core member of the Pantograph team, said, “The international collaboration on this project was really exhilarating. We had contributors from five continents. At the next Hackathon, we just need to recruit an Australian and an Antarctican. It's a clear image that we're all in this together and that our differences are superficial. At the core, we all love our families, we want good health for everyone, and a subsection of us love to code.”
What started off as a collaboration to create a pangenome browser for complex plant genomes has turned into a visualization tool for viral genomes and their diversity.
We followed up with Josiah Seaman to get his feedback on how the Hackathon went.
Computomics: Josiah, can you describe the COVID-19 Biohackathon setup?
Josiah Seaman: The idea was that volunteers would come from the larger hackathon community, describe their skills, and be assigned an issue to work on, often in teams. Then our core team was responsible for merging everyone's contributions together and clarifying the specification.
Computomics: So how many people signed up and contributed?
JS: We had well over 300 people sign up for various tasks and 20 individuals who we communicated with specifically for the pangenome browser.
Computomics: What are the major achievements from this week?
JS: We had several core features that we needed to make the tool usable and we have contributions on all of them. Rendering nucleotide-level information was a big goal. We also added the data infrastructure for zooming in and out on the pangenome. Erik Garrison (University of California, Santa Cruz) identified that our data analysis was 100x too slow and a team of Python experts literally sped up the process by 100x at the end of the hackathon. There were also many improvements to process, testing, and integration. Jerven Bolleman (Swiss Institute of Bioinformatics) and Toshiyuki Yokoyama (University of Tokyo) developed an ontology for storing our analysis in a knowledge graph, which will be useful for integrating annotations as well.
Computomics: What are the next steps?
JS: We still need to integrate everyone's contributions, which includes a second round of bug testing. We have multiple features staged, including knowledge graphs and zooming, but the front end to support them doesn't exist yet. Our long-term goal is annotations.
Computomics: Who do you want to have access to this? Who should be most excited that this is being created?
JS: Our target application is mid-term through this pandemic, when we're tracking the genetic diversity of the virus across the planet and ensuring that the vaccine will be effective in all populations. We can use the tools ourselves and make observations and recommendations, or epidemiologists can use it on their own private data. Our team leadership has access to policy makers in at least 5 different countries. We just need something informed to say.
Graph Genome applications are wide and varied. We've kept in mind that sequencing human immune systems could also allow for treatment recommendations. However, it looks like hospitals are far too overloaded to pay attention to the details of sequencing, even streamlined. One way this could work is making different recommendations at a state and national level: since most virus spread is local at this point, people within a nation are likely to have the same strain.
Computomics: Where should we look for more information?
JS: Go to GraphGenome.org and continue to participate. We will keep organizing participants and we will keep everyone updated with the most recent news.
Computomics: Thanks, Josiah, for working on this with us and for this interview!
We would like to thank ELIXIR and especially de.NBI Cloud - the ELIXIR Node in Germany - who provided a virtual machine with 28 cores, 64 GB RAM, and 1 TB storage to the COVID-19 Pangenome Visualization project of the COVID-19 Virtual BioHackathon 2020.
For more information please contact us.