Exploring Elasticsearch architectures with Oracle Cloud

The IEU GWAS Database

The MRC Integrative Epidemiology Unit (MRC IEU) at the University of Bristol hosts the IEU GWAS Database, one of the world’s largest open collections of Genome Wide Associate Study data. As of April 2019, the database contains over 250 billion genetic association records from more than 20,000 analysis of human traits.

The IEU GWAS database underpins the IEUs flagship MR-Base analytical platform (www.mrbase.org) which is used by people all over the world to carry out analyses that identify causal relationships between risk factors and diseases, and prioritize potential drug targets. The use of MR-Base by hundreds of unique users per week generates a high volume of queries to the IEU GWAS database (typically 1-4 million queries per week).

Objectives of the Oracle Cloud/MRC IEU collaboration

Both the IEU GWAS database and the MR-Base platform are open to the entire scientific community at no cost, supporting rapid knowledge discovery, and open and collaborative science. However, the scale of the database and the volume of use by academics, major pharma companies and others, create significant compute resource challenges for a non-profit organization. The primary objective of the collaboration was to explore ways to use Oracle Cloud Infrastructure to improve database performance and efficiency.


Collaboratively analysing thousands of phenotypic traits using UK Biobank data, Google APIs and HPC

Around 18 months ago we began developing a platform to provide a collaborative environment for running GWAS using the UKBiobank data.  To date we have completed 20,000 GWAS on automated phenotypes and 400 GWAS on curated phenotypes. These are being made available via the MR-Base platform with regular updates and an anticipated full release date of December 2018. (more…)