Open Publishing for the Big-Data Era

Scott Edmunds
GigaScience journal and BGI Hong Kong

Time: October 30, 12-2pm
Location: Social Sciences & Humanities (SSH) Room 1246

GigaScience is an open-access, open-data journal attempting to revolutionize the dissemination, organization and re-use of large-scale biological data. Drawing on the experience and data-handling infrastructure of BGI, the world's largest genomics organization, GigaScience links standard manuscript publication with an integrated database that hosts all associated data and provides data analysis tools and computing resources. In addition, GigaScience uses open-source platforms such as the popular Galaxy workflow management system to make publishing more transparent and open by making all of the supporting workflows and methods available, thereby promoting reproducibility, for which the authors are credited.

The GigaScience platform has already been involved in releasing data during the deadly 2011 German E. coli outbreak. That release aided a global crowdsourcing effort in which groups around the world, and even bloggers outside the usual academic environment, contributed analyses to an open-source, GitHub-based repository. Many still-unpublished genomes have been released to the global community, and workflows and software from a number of scientific papers have been archived and shared in as open, reproducible, transparent and usable a form as possible.

With data citation providing evidence of how data are used in the wider research community, GigaScience hopes to revolutionize the publication model. The aim is executable publications, in which users without a coding background or heavy computational infrastructure can reproduce and build upon data analyses, making the process more democratic.