Infectious agents, such as bacteria and viruses, are involved in the development of a variety of human cancers, including cervical, liver, stomach and bladder cancer. We hypothesise that other infectious agents are directly linked to cancer development but remain unidentified. We have developed an analytic pipeline (SEPATH) that identifies infectious agents in whole genome sequencing data. The pipeline is already working and is currently being improved. Through a competitive process we have been awarded access to apply SEPATH to the whole genome sequence data of 40,000 cancer genomes from Genomics England’s 100,000 Genomes Project, a large-scale initiative backed by the UK government to sequence samples from 100,000 patients, including 40,000 covering a range of cancer types. The aim of this PhD studentship is to run SEPATH on this data and search for new infectious agents associated with cancer across these cancer types. The student will aim to answer the following questions:
- Are any pathogens significantly over-represented in a particular disease compared to normal tissue and other disease types?
- Do any pathogens become more common as tumour grade increases, or are they associated with other clinical categories?
- Does the diversity of the microbiome/virome vary across clinical categories?
- Are there recurrent sequences that do not align to known organisms for a particular cancer type? This raises the intriguing possibility of discovering brand new bacteria that are linked to cancer.
These studies are anticipated to identify a "smoking gun" that will prompt further laboratory investigation of the mechanisms of disease induction.
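As a rough illustration of the first and third questions, the sketch below shows one way over-representation and diversity could be quantified. It is not part of SEPATH: the function names, the choice of a one-sided Fisher's exact test, and the Shannon index are all assumptions made for this example, and the counts are invented.

```python
from math import comb, log

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout: rows are tumour / normal samples,
    columns are pathogen detected / not detected.
    Returns P(X >= a) under the hypergeometric null.
    """
    row1 = a + b            # number of tumour samples
    col1 = a + c            # number of samples with the pathogen
    n = a + b + c + d       # all samples
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, col1)

def shannon_diversity(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)

# Invented counts, for illustration only:
# pathogen seen in 3 of 3 tumours and 0 of 3 normals.
p_value = fisher_one_sided(3, 0, 0, 3)
# Two taxa with equal read counts in one sample.
diversity = shannon_diversity([10, 10])
```

In a real analysis, testing many pathogens across many cancer types would also require multiple-testing correction, which this sketch leaves out.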
Closing Date: 24 March 2017
If only I were a grad student again :-)
Sounds like a really cool project: good luck filling it!
Out of curiosity, are there any publicly available details about your SEPATH pipeline? My google-fu is turning up donuts, and I would be interested to learn more about it.
Hi Steve, thanks for the interest. It's not published yet and we are still trialling a few different approaches, but it looks like a k-mer approach with Kraken or Metaphlan at its core is the way forward.
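For anyone unfamiliar with the idea, a k-mer approach can be sketched very roughly as below. This is a toy illustration only, not SEPATH and not how Kraken or Metaphlan are implemented: the function names, the tiny reference sequences, and the simple majority-vote rule are all assumptions made for the example.

```python
from collections import Counter

def kmers(seq, k):
    """Return all overlapping substrings of length k (empty if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def build_index(references, k):
    """Map each k-mer to the set of (hypothetical) taxa whose reference contains it."""
    index = {}
    for taxon, seq in references.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(taxon)
    return index

def classify(read, index, k):
    """Assign a read to the taxon with the most unambiguous k-mer hits, or None."""
    votes = Counter()
    for km in kmers(read, k):
        taxa = index.get(km)
        if taxa and len(taxa) == 1:   # only k-mers unique to one taxon vote
            votes[next(iter(taxa))] += 1
    if not votes:
        return None                   # no informative k-mers: leave unclassified
    return votes.most_common(1)[0][0]

# Invented toy references, far shorter than real genomes.
references = {"taxon_A": "ACGTACGTGG", "taxon_B": "TTGACCATGA"}
index = build_index(references, k=4)
label = classify("GTACGTG", index, k=4)
unknown = classify("CCCCCCC", index, k=4)
```

Reads that come back unclassified under a scheme like this are the kind of "does not align to known organisms" sequences the project description mentions.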