Custom Continuous Deployment to Uncover the Secrets in the Genome
Reading the genome to search for the cause of a disease has improved the lives of many children enrolled in clinical trials. However, converting research into clinical practice requires the ability to query large volumes of data and find the needle in the haystack efficiently. Traditional server- and database-based approaches hamper this because they are too expensive and cannot scale with accumulating medical information.
We therefore developed a serverless approach for exchanging human genomic information between organisations. The framework was architected to provide instantaneous analysis of non-local data on demand, with zero downtime costs and minimal running costs.
We used Terraform to define the infrastructure as code, enabling rapid iteration and version control at the architecture level. To maintain governance over infrastructure created in this way, we developed a custom Continuous Deployment service that builds and securely maintains each project, providing visibility and security over the entire organisation’s cloud infrastructure.
Outline/Structure of the Talk
We start by introducing the benefits of serverless, e.g. explaining why “once you go serverless you never go back”. We also point out the difficulties of adhering to good DevOps practices on a distributed and ever-evolving cloud architecture.
We then use GT-Scan, a genome research application, as an example and discuss the concrete problems we faced in deploying a serverless research platform in a flexible yet fully documented manner.
We introduce in detail the approach we have developed for maintaining the architecture in a GitHub repository and deploying the system to AWS in an automated and tractable manner.
We walk the audience through how this approach lets us automatically benchmark two architectures side-by-side (blue-yellow deployment).
We conclude by outlining how we expect this to evolve, with dedicated tools for serverless DevOps capable of maintaining and deploying code bases across different cloud providers, and what impact this has on data science as it becomes an architecture-sharing discipline.
The audience will gain a deep understanding of cutting-edge methodologies for deploying serverless architecture on AWS using Terraform and good DevOps practices. The audience might also be inspired by the scientific and medical applications enabled by our serverless “search engine of the genome”, as we paint a picture of how medical practice is being revolutionised through technology.
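To give a flavour of the side-by-side benchmarking mentioned in the outline, here is a minimal Python sketch. It is not the service presented in the talk: the function names (`benchmark`, `summarise`, `compare`) are hypothetical, the two deployments are represented by arbitrary callables, and the decision rule (lower median latency wins) is an assumption chosen for illustration.

```python
import time
from statistics import mean, median
from typing import Callable, Dict, List


def benchmark(call: Callable[[], object], runs: int = 5) -> List[float]:
    """Invoke `call` `runs` times and return the elapsed seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        call()  # hypothetical: would send a query to one deployment's endpoint
        timings.append(time.perf_counter() - start)
    return timings


def summarise(timings: List[float]) -> Dict[str, float]:
    """Reduce raw timings to a few summary statistics."""
    return {"mean": mean(timings), "median": median(timings), "max": max(timings)}


def compare(blue: List[float], yellow: List[float]) -> str:
    """Name the deployment with the lower median latency (assumed criterion)."""
    return "blue" if median(blue) <= median(yellow) else "yellow"
```

In practice the callables passed to `benchmark` would issue the same query against each of the two deployed architectures, and `compare` would be applied to the two resulting lists of timings.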
Prerequisites for Attendees
Knowledge of using AWS as a cloud services provider