More than a year ago, the University Medical Center Groningen (UMCG) and MIcompany started a cross-border collaboration in the field of asthma and allergy research. The goal of this research is to create medical breakthroughs by applying Artificial Intelligence (AI) to the massive amount of multi-level DNA data available. The results of this collaboration have exceeded all expectations. Professor Gerard Koppelman at the UMCG: “The robustness and prediction power of the developed models are unique in the field”. Founding partner Marnix Bügel PhD of MIcompany: “The unlimited richness of DNA data stretches our capabilities in the field of AI. Furthermore, it offers us the opportunity to attract the best talent in the market”. Based on the early successes of the collaboration, new research streams have been defined. These apply AI techniques such as Image Recognition and Bayesian Networks to single cell gene expression data and high-resolution lung images. To learn more, click here for a TED Talk-style video about why we started this research.
The researchers of the UMCG and the data scientists of MIcompany were able to build a unique AI model to predict allergic disease in young children. The model can predict allergy with great precision using only the modification level of a few DNA locations (the so-called CpG sites) as found in the nose. Click here for our AI Chalk Talk movie about how we used DNA methylation to diagnose asthma.
The prediction model was built using a massive dataset that includes DNA variation associated with allergic disease, personal factors, environmental factors, and blood and nasal DNA modification (methylation). The team will now test this allergy prediction model in two international studies to verify whether the results can be used worldwide. Due to the initial success of this cross-border collaboration, the team of researchers from UMCG and MIcompany has grown from 5 to more than 15 in a year’s time, and three additional research streams have been started.
Photo 1. The initiators of the collaboration: managing partner Marnix Bügel PhD and professor and lung pediatrician Gerard Koppelman
The human body functions as a complex network in which different biological processes work together and influence each other. In one of our research streams, we managed to identify the workings of part of this network through the use of Bayesian Network models. These models showed how someone’s genetic makeup can lead to specific DNA modification (methylation) within nasal cells and/or to gene expression through RNA, which carries the instructions for making proteins in the cell. Since we built separate networks for allergic and non-allergic patients, it was also possible to identify particular ‘pathways’ that are present in healthy individuals but altered in allergic patients.
Building such a network is no trivial task. Because our genetic architecture is so rich and complex, there are over 10^525995 possible network structures we can construct. To put that into perspective, there are an estimated 10^82 atoms in the observable universe. Through a successful collaboration with both biological experts at UMCG and Bayesian Network experts in Israel, we managed both to construct meaningful networks and to incorporate existing biological knowledge into them via so-called priors. Ilya Petoukhov – principal of MIcompany: “We are hopeful that the identification of key players in this network is an important step towards the development of new treatments”. Click here for our AI Chalk Talk movie about Bayesian Network research.
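To give a feel for why the number of possible network structures explodes, here is a minimal sketch. It assumes that each structure is a labeled directed acyclic graph (the standard structure of a Bayesian Network) and counts them with Robinson's recurrence. The source does not say how the 10^525995 figure was derived or how many variables were involved, so the function names and the variable counts below are illustrative assumptions, not the team's method.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled directed acyclic graphs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    total = 0
    for k in range(1, n + 1):
        # Inclusion-exclusion over the k nodes that have no incoming edge.
        sign = 1 if k % 2 == 1 else -1
        total += sign * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
    return total


def order_of_magnitude(value: int) -> int:
    """Exponent e such that value is roughly 10^e (digit count minus one)."""
    return len(str(value)) - 1


if __name__ == "__main__":
    # Hypothetical variable counts, chosen only to show the growth rate.
    for n in (3, 5, 10, 20, 50):
        exponent = order_of_magnitude(count_dags(n))
        print(f"{n:>3} variables: about 10^{exponent} possible structures")
```

With only 5 variables there are already 29,281 possible structures, which is why an exhaustive search is out of the question and prior biological knowledge is useful for narrowing it down.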
To really understand how asthma and allergies work, we need to understand how cells work. For that, a lot of research is conducted on the RNA expressed by cells. RNA consists of copies of snippets of our DNA code that carry the instructions for making proteins within the cell, and it therefore dictates how a cell works.
So far, RNA research has focused on understanding the expression of clumps of tissue, consisting of many different cell types. This leads to a ‘soup’ of cells in which it is easy to miss an allergy-specific cell type. Very recently, single cell RNA sequencing has been developed, with which we are able to measure the gene expression of thousands of individual lung cells per person at a time. Martijn Nawijn – associate professor at the UMCG: “This is revolutionary, because we can now study the biology behind asthma in unprecedented detail: to know for each cell what type of cell it is, what functions it performs and how it interacts with other cells”.
Although promising, this data presents major challenges, which we aim to overcome using AI techniques. One of these is called ‘sparsity’, meaning that the dataset contains many zeros. The other challenge is that the complex steps involved in measuring RNA can introduce so-called statistical and technical bias. Click here for our AI Chalk Talk movie about how we tackled these challenges in single cell data.
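As a small illustration of what sparsity means, the sketch below computes the fraction of zeros in a cells-by-genes count matrix. The matrix is a toy example with invented values, and the function name is hypothetical. It is not the team's data or pipeline.

```python
from typing import Sequence


def sparsity(counts: Sequence[Sequence[int]]) -> float:
    """Fraction of entries in a cells-by-genes count matrix that are zero."""
    total = sum(len(row) for row in counts)
    zeros = sum(1 for row in counts for value in row if value == 0)
    return zeros / total if total else 0.0


# Toy matrix: 4 cells (rows) by 5 genes (columns); invented values, not real data.
toy_counts = [
    [0, 3, 0, 0, 1],
    [0, 0, 0, 7, 0],
    [2, 0, 0, 0, 0],
    [0, 0, 5, 0, 0],
]

if __name__ == "__main__":
    print(f"Sparsity: {sparsity(toy_counts):.0%}")  # 15 of the 20 entries are zero
```

A zero in such a matrix can mean that the gene is truly not expressed in that cell, or that its RNA was simply not captured during measurement, which is part of what makes the data hard to model.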
The next step will be to combine research streams 1 and 2 to understand how individual cells communicate and collaborate within tissue (so-called co-regulation). To create this understanding, gene expression networks will be constructed based on single cell data.
One of the worst lung diseases is COPD (chronic obstructive pulmonary disease), which makes your airway walls so thick and prone to contraction that too little air is able to reach your lungs. Because many COPD patients get a lung transplant, there is a lot of visual information available on what a COPD lung looks like in various stages of the disease. These are no normal images, but high-definition images of 100k x 100k pixels, comparable in size to more than 1000 photos made by your average iPhone. These images hold the key to understanding COPD at the cellular level and to creating treatments for airways.
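The size comparison above can be checked with some simple arithmetic. The sketch below assumes an 8-megapixel phone camera, which is an assumption and not a figure from the source; a higher-resolution camera gives a lower count. It also shows how many tiles the image would split into for a hypothetical tile size, since very large images are commonly processed in smaller pieces. The source does not say whether the team works this way.

```python
from math import ceil

SLIDE_SIDE_PX = 100_000  # from the text: images of 100k x 100k pixels


def phone_photo_equivalents(camera_megapixels: float) -> float:
    """How many photos of the given resolution match the slide's pixel count."""
    total_pixels = SLIDE_SIDE_PX ** 2
    return total_pixels / (camera_megapixels * 1_000_000)


def tiles_needed(tile_side_px: int) -> int:
    """Number of square tiles needed to cover the whole slide."""
    per_side = ceil(SLIDE_SIDE_PX / tile_side_px)
    return per_side ** 2


if __name__ == "__main__":
    # 8 megapixels is an assumed camera resolution, not a figure from the source.
    print(f"Photo equivalents at 8 MP: {phone_photo_equivalents(8):.0f}")
    # 512 pixels is a hypothetical tile size chosen for illustration.
    print(f"Tiles of 512 x 512 pixels: {tiles_needed(512)}")
```

Under these assumptions, one lung image holds 10 billion pixels, the equivalent of 1,250 phone photos.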
An experienced pathologist can easily identify certain structures within the zoomed-out image, such as airways, blood vessels and alveoli. However, the zoomed-out image shows too little distinction between interior and exterior airway wall tissue, and between subjects with different degrees of airway deterioration, which limits our understanding of how COPD works. This is where AI comes in, as it can identify cell types and structures at the most granular, pixel level and then use that to identify and highlight cell structures.
“I’m hoping that Image Recognition will help me identify structural differences in airways in a way I could never spot visually, as well as help my students to interpret these lung images more easily.”
– Wim Timens,
Professor in pathology at the UMCG
Chronic diseases such as asthma, COPD and most allergies are among the most common diseases in the world. Asthma is the most common chronic disease in children, while allergies are the most prevalent chronic diseases in adults. Around 400,000 people die of asthma worldwide every year. Around 150 million Europeans suffer from a chronic allergic disease, and this number is rising rapidly. By 2025, 50% of all Europeans are expected to have a chronic allergic disease.
GRIAC (Groningen Research Institute for Asthma and COPD) is the respiratory research institute of the University Medical Center Groningen (UMCG) and part of its overarching research theme Healthy Ageing. GRIAC has three goals: to prevent disease from developing further by identifying risk factors; to minimize the effects of disease by optimizing diagnosis and treatment; and to improve the quality of life of patients. The ultimate goal is to develop treatments to cure these diseases.
MIcompany is an Artificial Intelligence (AI) company based in Tel Aviv and Amsterdam. From our offices, we drive AI transformations by building AI solutions and skills. Our team of more than 80 data scientists, AI engineers and software engineers serves industry-leading companies such as eBay, Booking.com, Heineken, KPN, LeasePlan, Aegon, and Shufersal, in more than 25 countries.