Chronobiology through the prism of Wikipedia

I am a computational biologist. As such, one of my duties is to navigate large data sets with my expertise as a compass. By exploring and mapping knowledge, I do what really matters to me as a scientist: I try to understand what we have accomplished and what is missing.

Back in 2016, I was working on my thesis in chronobiology. One of my concerns was to find the seminal works, important concepts, and leaders of my field. At that time, I started developing a small script to parse PubMed for circadian clock-related publications. I found about 15K papers, which I sorted by citation count. I did the same for authors and keywords. Thanks to this analysis, I discovered the exceptional work of Konopka and Benzer, who identified clock mutants in Drosophila in 1971, a paper with more than 2000 citations. I learned the names of Michael Rosbash, Jeffrey Hall and Michael Young before they received their Nobel Prize in 2017. I could also see that one of my mentors, Ueli Schibler, was one of the godfathers of the molecular clock field.
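A minimal sketch of this kind of ranking might look like the following. It is not the original script: the records are invented placeholders rather than real PubMed results, and the field names and the functions `rank_papers` and `rank_authors` are hypothetical. It also assumes citation counts have already been collected for each paper.

```python
from collections import Counter

# Placeholder records; titles, authors, and citation counts are invented.
papers = [
    {"title": "Paper A", "authors": ["Author X", "Author Y"], "citations": 120},
    {"title": "Paper B", "authors": ["Author X"], "citations": 45},
    {"title": "Paper C", "authors": ["Author Z"], "citations": 300},
]

def rank_papers(records):
    """Return the records sorted from most cited to least cited."""
    return sorted(records, key=lambda r: r["citations"], reverse=True)

def rank_authors(records):
    """Sum citations per author and return (author, total) pairs, highest first."""
    totals = Counter()
    for record in records:
        for author in record["authors"]:
            totals[author] += record["citations"]
    return totals.most_common()

top_papers = [p["title"] for p in rank_papers(papers)]
top_authors = rank_authors(papers)
```

The same counting approach could be applied to keywords by replacing the author list with a keyword list.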

When I came to Israel for my postdoc, I met Rona Aviram, a fellow chronobiologist at the Weizmann Institute, and Omer Benjakob, a philosopher of science and an expert in digital encyclopedias. They had published a case study examining the different versions of the main Wikipedia article on “circadian rhythm” to explore how Wikipedia reflects scientific development. Wikipedia, created in 2001, works on a consensus model. Its articles are continuously edited by the community, and the version history is accessible on the site. In their study, Rona and Omer detailed the dynamics of the collective editorial process. They discovered that a breakthrough takes a median of about 5 years to be inserted into Wikipedia. In addition, they investigated the work of Beth MacDonald, a prolific editor of the article. As revealed by her editing pattern on Wikipedia, she suffered from non-24-hour sleep-wake disorder.

When we met, we decided to examine Wikipedia to gain a broader view of our field. We first asked ourselves a couple of questions. How many Wikipedia articles cite primary circadian literature? What are the key concepts and the key players in circadian research represented in Wikipedia? Answering these questions allowed us to identify the knowledge gap between what we do on a daily basis and what people can learn about our field from Wikipedia. We found every Wikipedia article that cited any of the 15K PubMed circadian clock papers. We listed about 180 Wikipedia articles, which we classified into Concepts, People, Disease, Model, and Genes/molecules. We produced a network that visually represents the relationship between Wikipedia articles and PubMed papers, and a timeline of the creation of Wikipedia articles (beta versions).
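One way to sketch the matching step is to pull PubMed identifiers out of the citation templates in an article's wikitext and keep those that belong to the circadian paper set. This is an illustration under stated assumptions, not the pipeline we actually used: it assumes citations carry a `pmid=` parameter, as Wikipedia's citation templates commonly do, and the article titles, identifiers, and function names below are invented placeholders.

```python
import re

# Matches "|pmid=12345" or "| pmid = 12345" inside a citation template.
PMID_PATTERN = re.compile(r"\|\s*pmid\s*=\s*(\d+)", re.IGNORECASE)

def extract_pmids(wikitext):
    """Return the set of PubMed IDs (as strings) cited in a wikitext string."""
    return set(PMID_PATTERN.findall(wikitext))

def build_edges(articles, circadian_pmids):
    """Return (article title, PMID) pairs for citations found in the paper set."""
    edges = []
    for title, wikitext in articles.items():
        for pmid in sorted(extract_pmids(wikitext) & circadian_pmids):
            edges.append((title, pmid))
    return edges

# Placeholder wikitext and identifiers, invented for illustration.
articles = {
    "Article one": "{{cite journal |title=Example |pmid=111}} text {{cite journal | pmid = 222 }}",
    "Article two": "{{cite journal |pmid=333}}",
    "Article three": "No citations here.",
}
circadian_pmids = {"111", "222", "999"}

edges = build_edges(articles, circadian_pmids)
citing_articles = sorted({title for title, _ in edges})
```

The resulting list of pairs is an edge list for a two-layer network, with Wikipedia articles on one side and PubMed papers on the other.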

Our investigation of editorial patterns revealed that edits peak at a specific time of the year, with a substantial number of new editors. A deeper investigation of the editor profiles revealed that many of them were students at Washington University in St. Louis, taking a course on chronobiology taught by Erik Herzog. He used Wikipedia editing as an assignment for students, to enhance public access to important discoveries in chronobiology.

Project overview (woodstock.bio slide)

Our work provided a bird’s-eye view of how the field is represented on Wikipedia. It led us to develop a new interactive, navigable tool to explore the online encyclopedia and the related primary literature.

For future research, we want to further explore the content of each of these Wikipedia articles, hyperlinks, and editors to learn more about the growth of the encyclopedia. We hope to study the consensus process and potential controversies in chronobiology, and we plan to apply the same methodology in other case studies.

That’s why I decided to be a scientist. I wanted to be able to raise questions that would contribute to the development of knowledge. I wanted to understand how fundamental research becomes knowledge. In this sense, Wikipedia is a wonderful place to study how theories and concepts develop over time.

Jonathan Sobel

Woodstock.bio conference

Tel-Aviv 14/02/2020

Acknowledgments: Omer Benjakob, Rona Aviram, Hila Ratzabi, and Aline Jaccottet, for their comments and corrections.
