Understanding the rules of life

Bioscience for an integrated understanding of health

Category: Standard Studentships

Mining cancer gene dependencies with machine learning algorithms to identify and validate novel cell cycle regulators.

Project No. 2229

Primary Supervisor

Dr Helfrid Hochegger – University of Sussex


Dr Eduardo Campillo-Funollet – University of Kent



The advent of CRISPR has enabled us to analyse how different cancer cells depend on different genes for their survival.

Large-scale gene dependency screens have delivered a dataset covering over 900 cell lines and 14,000 genes. This is a rich resource for data mining and machine learning to identify new functional associations and dependencies. Dr Hochegger and Dr Campillo-Funollet have been collaborating over the past year to identify new cell cycle regulators using correlation and cluster analysis of the gene dependency data in this dataset.
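As an illustration of what correlation and cluster analysis of gene dependency data can look like, here is a minimal Python sketch. It is not the supervisors' actual pipeline: the data are synthetic, the gene names (`GENE_A1` and so on) and the function names are hypothetical, and average-linkage hierarchical clustering on a `1 - Pearson correlation` distance is only one of several reasonable choices.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def simulate_dependency_matrix(n_cell_lines=60, seed=0):
    """Return synthetic dependency scores (cell lines x genes) and gene names.

    Two hidden "modules" each drive three genes, mimicking groups of genes
    whose knockout effects rise and fall together across cell lines.
    All values and names are made up for illustration.
    """
    rng = np.random.default_rng(seed)
    module_a = rng.normal(size=(n_cell_lines, 1))
    module_b = rng.normal(size=(n_cell_lines, 1))
    noise_a = 0.3 * rng.normal(size=(n_cell_lines, 3))
    noise_b = 0.3 * rng.normal(size=(n_cell_lines, 3))
    scores = np.hstack([module_a + noise_a, module_b + noise_b])
    genes = ["GENE_A1", "GENE_A2", "GENE_A3", "GENE_B1", "GENE_B2", "GENE_B3"]
    return scores, genes


def cluster_genes(scores, n_clusters=2):
    """Cluster genes by the similarity of their dependency profiles."""
    # Gene-by-gene Pearson correlation across cell lines.
    corr = np.corrcoef(scores, rowvar=False)
    # Turn correlation into a distance: strongly co-dependent genes are close.
    dist = 1.0 - corr
    dist = (dist + dist.T) / 2.0  # remove tiny floating-point asymmetries
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    tree = linkage(condensed, method="average")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return corr, labels


scores, genes = simulate_dependency_matrix()
corr, labels = cluster_genes(scores)
for gene, label in zip(genes, labels):
    print(f"{gene}: cluster {label}")
```

In this toy example the genes driven by the same hidden module end up in the same cluster. In a real analysis, an uncharacterised gene that clusters with known cell cycle genes would be a candidate regulator for experimental follow-up.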

This project aims to develop this approach further using supervised and unsupervised machine learning algorithms as well as deep learning approaches. In parallel, we aim to functionally validate already predicted cell cycle regulators in various cancer cell lines. The project will have two components: a theoretical part that will be co-supervised by Drs Hochegger and Campillo-Funollet, and an applied part involving high-throughput microscopy and dynamic analysis of the cell cycle.

The student will be exposed to both data science and cancer cell biology in an interdisciplinary environment. They will become fluent in state-of-the-art programming languages (Python, MATLAB), set up machine learning algorithms and develop new code to query cancer cell line and tumour genomics databases. In parallel, the student will learn to perform high-throughput gene depletion screens and will set up automated image segmentation and analysis workflows to detect cell cycle phenotypes. This will be done on an already identified set of approximately 30 novel cell cycle regulators, while further novel hits will be generated using more refined ML algorithms.
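To give a flavour of the automated image segmentation mentioned above, the following Python sketch segments bright "nuclei" in a synthetic image and measures their area and total intensity. It rests on several assumptions: the image is artificial, the fixed threshold and the function names are hypothetical, and a real workflow would likely use more robust segmentation. Total nuclear intensity is used here only as a rough stand-in for DNA content, which is one common readout of cell cycle stage.

```python
import numpy as np
from scipy import ndimage


def synthetic_nuclei_image(size=64):
    """Create an artificial image with two disc-shaped 'nuclei'.

    The second disc is larger and brighter, loosely mimicking a cell
    with more DNA. Positions, radii and intensities are invented.
    """
    image = np.zeros((size, size), dtype=float)
    rows, cols = np.mgrid[0:size, 0:size]
    # (centre row, centre column, radius, intensity)
    discs = [(16, 16, 6, 1.0), (44, 44, 8, 2.0)]
    for row, col, radius, intensity in discs:
        inside = (rows - row) ** 2 + (cols - col) ** 2 <= radius ** 2
        image[inside] = intensity
    return image


def segment_nuclei(image, threshold=0.5):
    """Threshold the image, label connected objects and measure each one."""
    mask = image > threshold
    labels, n_objects = ndimage.label(mask)
    flat_labels = labels.ravel()
    # Index 0 is the background, so it is dropped from both measurements.
    areas = np.bincount(flat_labels, minlength=n_objects + 1)[1:]
    totals = np.bincount(
        flat_labels, weights=image.ravel(), minlength=n_objects + 1
    )[1:]
    return labels, areas, totals


image = synthetic_nuclei_image()
labels, areas, totals = segment_nuclei(image)
for i, (area, total) in enumerate(zip(areas, totals), start=1):
    print(f"object {i}: area={area} px, total intensity={total:.1f}")
```

In a screen, per-object measurements like these would be collected for every well and compared between control and gene-depleted cells to flag changes in the cell cycle distribution.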

The interdisciplinary outlook of this project provides a good opportunity for the student to learn a broad range of lab-based and data science skills for a successful career as a cancer biologist.