BeerDeCoded: Exploring the beer metagenome

Originally published as an F1000 Blog article

Jonathan Sobel shares his scientific quest to understand beer at a molecular level

A citizen science project carried out by the Hackuarium Association investigated the metagenome of various beers. The results of this work were published as a Data Note on F1000Research. Jonathan Sobel explains the thinking behind this project and what the team found.

Beer is a fantastic beverage produced through a fermentation process and widely consumed around the world. It has inspired many generations of scientists and citizens. For instance, the famous Student's t-test was initially developed for the beer industry, and has now become a popular statistical test among biologists. Likewise, James Joule was a brewer who helped establish the first law of thermodynamics using his brewing equipment. Many of us enjoy a good beer with our friends, and like to discover new ones.

Beer, an inspiration to generations

At Hackuarium, we chose beer to discuss the challenges and opportunities of DNA sequencing with fellow citizen scientists. The knowledge that we are acquiring could be of interest for breweries and microbreweries to develop new products.

Beer microorganisms

We wanted to detect different microorganisms in the beer ecosystem. Comparing different beers allowed us to assess their similarities and their specificities, and to see whether there is a terroir (regional environmental factors that affect the properties of the final product) for beer, as there is for wine and cheese. We thought this would trigger a discussion with a wider audience on the ethical issues regarding DNA sequencing data, while showing what we could learn about the product.

Creative solutions

Hackuarium, the bio-hacker space founded in 2014 by Luc Henry, was of paramount importance in developing this project and bringing the team together. The team members, who met at the association, came from various backgrounds and included biologists, bioinformaticians, designers, a social community manager and a business developer.

By working in this association, we had greater freedom in choosing the direction of our project than we would have had in a company or in academia. We could work on virtually anything, provided that it was in agreement with our ethical charter and the law.

When we set out on the project we had no money, so we had to find creative ways to finance it. In 2015, a Kickstarter campaign led by Gianpaolo Rando provided us with the resources to generate a first dataset, which we are now publishing as a proof of concept in F1000Research.

The beer ecosystem

We produced a descriptive dataset of fungal diversity, profiled by sequencing the internal transcribed spacer (ITS) region, in 39 beers from 5 countries. Our preliminary analysis revealed that the beer ecosystem is richer than expected in terms of fungal species.

In all the beers in our dataset, we found several Saccharomyces species, as well as several species specific to one beer that could potentially represent its genetic identity, i.e., an internal fingerprint of authenticity.

Some of the species detected through ITS sequencing could simply be contaminants introduced during the beer-making process, which could affect the quality of the final product, but that remains to be investigated.
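
The distinction between species shared by all beers and species specific to a single beer can be illustrated with a few lines of Python. This is only a minimal sketch and not the analysis pipeline used for the Data Note: the beer names, the species labels and the function names `shared_species` and `specific_species` are all made up for illustration.

```python
# Toy data (entirely hypothetical): beer name -> set of fungal species detected.
detections = {
    "beer_A": {"Saccharomyces sp. 1", "Saccharomyces sp. 2", "species_X"},
    "beer_B": {"Saccharomyces sp. 1", "Saccharomyces sp. 2"},
    "beer_C": {"Saccharomyces sp. 1", "Saccharomyces sp. 2", "species_Y"},
}


def shared_species(detections):
    """Return the species that were detected in every beer."""
    return set.intersection(*detections.values())


def specific_species(detections):
    """Return, for each beer, the species detected in that beer and in no other."""
    result = {}
    for beer, species in detections.items():
        # Union of everything seen in all the other beers.
        others = set().union(
            *(s for b, s in detections.items() if b != beer)
        )
        result[beer] = species - others
    return result


if __name__ == "__main__":
    print("Shared by all beers:", sorted(shared_species(detections)))
    for beer, species in specific_species(detections).items():
        print(beer, "specific species:", sorted(species))
```

With real data, a species that appears in only one beer would still need to be checked against possible contamination before being treated as a fingerprint.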

Improving public understanding of sequencing

BeerDeCoded is an ambassador project about the DNA sequencing of a mass-consumption product. Many people have an idea of how beer is produced or have a favourite beer, so using a familiar product makes it easier to communicate about molecular biology and sequencing techniques.

We are currently working with Bérénice Batut from the Galaxy Training Network to train future generations of bioinformaticians with this dataset. We are also in discussion with other citizen science groups to export this project to other cities, and to generate and gather more beer-related data to expand our knowledge base.

Short Bio

Jonathan Sobel, PhD, pursued his master’s in a mucosal immunity lab at Lausanne University Hospital (CHUV), where he studied the effect of lipoxin A4 on endometriosis at the protein level. He then did his Ph.D. in the computational systems biology lab at EPFL, where he studied the interplay between metabolism and the circadian cycle in mouse liver using next-generation sequencing approaches. He joined the Regazzi lab at the University of Lausanne in 2017 as a post-doc in bioinformatics to study beta-cell maturation. Alongside his academic work, Jonathan was involved in Hackuarium, a bio-hacker space in Renens, where he worked on science popularization.

A social community manager in a BioHacker space

In a biohacker space such as Hackuarium, the main strength is the group of people. Their competencies and their motivation have a drastic impact on the success of their projects. The board of the association usually takes care of operational matters such as event planning, infrastructure improvements and the treasury. One central role is that of the social community manager. His purpose is to introduce new members and to find the human resources needed for each project. He follows the progress of projects, and he encourages members to publish their results and populate the wiki with their findings. He is also responsible for giving members physical and digital access to the community, and he helps to shape the external communication of the association. All these tasks are crucial for attracting new members and projects, and the role is of paramount importance for the sustainability of the community. It may be the first role to require a professional, and therefore a salary that sustains the motivation of the social community manager. If the community manager fails in his tasks, the image of the community as a whole suffers.

In an old project called “Quantitative Anthropology”, we performed a community analysis of our Slack team. This analysis was carried out during the first year of Hackuarium and showed the volume of exchanges between community members.

hackuarium_network_2015

This network, visualized with Cytoscape, shows the most active members as well as the central role of the community manager, who at that time was our friend Shalf. The nodes represent the people, and the purple edges show the volume of exchange.
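
For readers curious about how such a network can be built, here is a minimal sketch in Python. It is not the code used in the “Quantitative Anthropology” project: it assumes, for simplicity, that each message can be reduced to a hypothetical (sender, recipient) pair, and the member names and the functions `edge_weights` and `member_activity` are invented for the example.

```python
from collections import Counter

# Hypothetical message log: one (sender, recipient) pair per message.
messages = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("alice", "carol"),
    ("dave", "alice"),
    ("bob", "carol"),
]


def edge_weights(messages):
    """Count the messages exchanged between each pair of members.

    Edges are undirected: a message from A to B and one from B to A
    both add to the weight of the same edge.
    """
    weights = Counter()
    for sender, recipient in messages:
        if sender != recipient:  # ignore messages to oneself
            weights[frozenset((sender, recipient))] += 1
    return weights


def member_activity(weights):
    """Sum the edge weights attached to each member (weighted degree)."""
    activity = Counter()
    for pair, count in weights.items():
        for member in pair:
            activity[member] += count
    return activity


if __name__ == "__main__":
    weights = edge_weights(messages)
    activity = member_activity(weights)
    for member, volume in activity.most_common():
        print(member, volume)
```

The member with the highest total is the most central one in this simple sense. The edge weights could then be exported as a table of member pairs and counts for a network visualization tool.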

Depending on the size of the association, the treasury, and the amount of work, the board might include more professionals. But before reaching this stage, the association should reach a critical mass of about 200 members. Today, Hackuarium has about 110 members and around 15 active projects. Our ability to grow our community and our capacity to fill each role needed on the board will determine the sustainability of Hackuarium in the coming years.

BOSC session of the ECCB/ISMB 2017

As the lucky recipient of an Open Bioinformatics Foundation (OBF) travel grant, I had the chance to attend my first Bioinformatics Open Source Conference (BOSC) in Prague on 21 and 22 July 2017.

During the event, I discovered an amazing community of scientists and developers involved in the field of bioinformatics, with a strong open source mindset. “Sharing is caring” and I can tell these guys care a lot! The first day was quite technical, with several talks about projects developed during the codefest that took place just before the conference. I had the opportunity to discover the FAIR principles of open data (Findability, Accessibility, Interoperability, and Reusability), and several tools aimed at simplifying the sharing, visualization and production of bioinformatics workflows. These talks triggered my curiosity about the Common Workflow Language (CWL) and data standards, notably Rabix, GA4GH and Nextflow. Several talks presented very useful tools such as YAMP (Yet Another Metagenomic Pipeline!), MultiQC for next-generation sequencing quality control, and OpenMS 2.0 for mass-spectrometry data analysis. One important topic of the day was the reproducibility of bioinformatics pipelines, and several talks addressed this question with various approaches, such as containers (Docker, BioContainers), GNU Guix or package repositories such as Bioconda.

On the second day, I had the chance to present our BeerDeCoded project in the Citizen Science session of BOSC. I had the first slot in the morning, with an audience of nearly 250 attendees. The beer topic is kind of holy in a geek environment. I had the pleasure of sharing several important messages about science conducted outside of academia or industry, in a community laboratory space like Hackuarium. I put some emphasis on science communication between fields and beyond our institutional scientific community. As experts, we have the responsibility to make our knowledge and our research accessible to a wide audience, and this is exactly our goal with BeerDeCoded and Hackuarium. In addition, I was able to announce the official release of our first results, based on the metagenomic analysis of 39 beer samples, and to show our preliminary analysis. The BOSC community was really enthusiastic about the project, and attendees tweeted quite a lot about it.

In addition, I received some very interesting questions about the interest of breweries in BeerDeCoded, about the potential fear among companies that citizen scientists might decode their proprietary yeast strains, and about integrating sequencing data with GC/MS in order to study the small molecules present in the beers of our dataset. Moreover, this talk will potentially trigger new collaborations with Bérénice Batut from the Galaxy Training Network. Galaxy is an open source, web-based platform for data-intensive biomedical research. The Galaxy platform brings together a collection of bioinformatics tools and workflows that can be run without coding knowledge. It is widely used by biologists to analyze their next-generation sequencing data. BeerDeCoded will benefit from this collaboration through a specific instance of Galaxy that makes beer metagenomics accessible to anyone.

Later during this second day, I was impressed by several other talks. One of the greatest initiatives is the work of H3ABioNet, which aims to train bioinformaticians in Africa. I discovered the Journal of Open Source Software (JOSS), which facilitates the publication of bioinformatics software. Then, BioThings SDK and Wikidata presented their APIs and knowledge bases, which allow efficient retrieval of annotations of biological data. This day was also the occasion to discuss the sharing of human data (wearable, clinical, etc.) in order to improve precision medicine, and its ethical implications. In addition, second-hand data usage and the problem of re-digitizing data published in non-machine-readable formats were raised. Finally, Nick Loman gave the closing keynote presentation at BOSC. He gave a great talk about virus outbreak surveillance using the Oxford Nanopore MinION sequencing technology, with two examples, namely the Ebola outbreak in Africa in 2015 and the Zika virus in Brazil.

In summary, my first BOSC experience was very intense and highly interesting. I met great scientists and developers, and I learned about the newest open source software, libraries, APIs and practices in this field. I would like to thank the OBF committee once again for allowing me to join this great event and for giving me the opportunity to present Hackuarium and the BeerDeCoded project.