Machine learning is a powerful tool to study the effect of cancer on species and ecosystems

Authors: Antoine M. Dujon, Marion Vittecoq, Georgina Bramwell, Frédéric Thomas, Beata Ujvari



Cancer is an understudied but important process in wildlife. Cancerous cells are proposed to have had a significant effect on the evolution of metazoan species because of their negative effect on host fitness. However, knowledge of the impact of cancer on species and ecosystems is currently accumulating relatively slowly, as gaining it requires expertise in both ecology and oncology. The field can greatly benefit from automation to reduce the need for extensive manual labour and to analyse complex ecological datasets.

In this commentary, we examine how machine learning has been used to gain knowledge on oncogenic processes in wildlife. Using a landscape ecology approach, we explore spatial scales ranging from the size of a molecule up to whole ecosystems and detail, for each level, how machine learning has been used, or could be used, to obtain insights on cancer in wildlife populations and ecosystems.

We illustrate how machine learning is a powerful toolbox for conducting studies at the interface of ecology and oncology. We provide guidance for readers from both fields on how to implement machine learning tools in their research and identify directions to move the field forward using this promising technology. We demonstrate how applying machine learning to complex ecological datasets will (a) contribute to quantifying the effect of cancer at different life stages in wildlife; (b) allow the mining of long-term datasets to understand the spatiotemporal variability of cancer risk factors; and (c) contribute to mitigating cancer risk factors and to the conservation of endangered species.

With this study, we aim to facilitate the application of machine learning to wildlife species and to encourage discussion between scientists in the fields of oncology and ecology. We highlight the importance of international and multidisciplinary collaborations to collect high-quality datasets on which efficient machine learning algorithms can be trained.