Rare diseases and Big Data, a challenge or an opportunity?

Author: Marc De San Pedro / 8 January 2019

One out of every seventeen people, some 6% of the population, will be affected by a rare disease at some point in their lives. That is equivalent to 30 million people in Europe as a whole. It is worth dedicating both time and resources to shortening the 7 to 10 years that patients with a rare disease spend waiting for a confirmed diagnosis.

Rare diseases are characterized by several unmet needs, and one of them, perhaps the most important, is the difficulty of making an accurate and timely diagnosis. A delayed diagnosis can have serious consequences for the patient's life, either altering it or shortening it, because detecting a disease quickly and precisely, and distinguishing it from other disorders, is essential for prescribing the correct treatment.

80% of rare diseases have a genetic component. Rare diseases are often chronic and potentially life-threatening; they can be due to a single gene, multifactorial, chromosomal or indeed non-genetic. 75% of rare diseases affect infants. Rare diseases also include childhood cancers and other well-known illnesses, such as cystic fibrosis and Huntington’s disease.

A key to progress: the use of big data

Big data techniques can be used to speed up the diagnosis and treatment of diseases that are difficult to identify, have symptoms that are not readily visible and affect small patient populations. Instead of relying on a single decision tree (algorithm) or a static lookup table, healthcare professionals can use big data to compare dynamic relationships at large scale, track patterns and predict rare diseases with significant accuracy.

Through disease modelling based on massive data samples, it is even possible to correlate certain diseases with generalized variables, thus taking the first step towards defining criteria for rare diseases. The more data used to build the models, the more complete they will be for predicting diseases in patients. With sufficient data, it is even possible to identify diseases with low incidence.
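
To make the point about low incidence concrete, here is a minimal sketch in Python. It assumes a simple binomial model in which every patient record independently has the same chance of belonging to someone with the disease; the function name and the incidence figure are illustrative assumptions, not values taken from any real registry.

```python
import math


def prob_at_least_k_cases(n_records: int, incidence: float, k: int = 1) -> float:
    """Probability that n_records independent records contain at least k cases.

    Uses the binomial model: P(X >= k) = 1 - sum_{i<k} C(n, i) p^i (1-p)^(n-i).
    Intended for small k; the incidence is a hypothetical input.
    """
    p_fewer_than_k = sum(
        math.comb(n_records, i) * incidence**i * (1 - incidence) ** (n_records - i)
        for i in range(k)
    )
    return 1.0 - p_fewer_than_k


# Hypothetical disease affecting 1 person in 50,000.
incidence = 1 / 50_000

# Chance that a small dataset contains even a single case.
small = prob_at_least_k_cases(10_000, incidence, k=1)

# Chance that a large dataset contains at least 10 cases to learn from.
large = prob_at_least_k_cases(1_000_000, incidence, k=10)

print(f"10,000 records, at least 1 case:      {small:.3f}")
print(f"1,000,000 records, at least 10 cases: {large:.3f}")
```

Under these assumptions, a dataset of ten thousand records will usually contain no cases at all, while a dataset of a million records will almost always contain enough cases for a model to start finding patterns.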

By applying machine learning over numerous iterations, big data techniques can be applied to multiple variables, and the results combined to calculate the probability that a patient has an illness. The result is, however, only an approximation of reality: an answer can never be perfectly certain, although more available data can increase the level of confidence in a particular diagnosis.
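
One simple way to picture how results from several variables can be combined into a single probability is Bayes' rule applied in log-odds form. The sketch below is only an illustration of that idea, not the method used by any particular diagnostic system. It assumes the findings are conditionally independent (the "naive Bayes" assumption), and the prior and likelihood ratios are hypothetical numbers chosen for the example.

```python
import math


def posterior_probability(prior: float, likelihood_ratios: list[float]) -> float:
    """Combine independent findings into a posterior probability of disease.

    prior: probability of the disease before looking at any finding (0 < prior < 1).
    likelihood_ratios: for each finding, how much more likely it is in patients
        with the disease than in patients without it (hypothetical values).
    Assumes the findings are conditionally independent given disease status.
    """
    # Work in log-odds so that each finding simply adds its contribution.
    log_odds = math.log(prior / (1 - prior))
    for ratio in likelihood_ratios:
        log_odds += math.log(ratio)
    # Convert log-odds back to a probability.
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical prior: the disease affects 1 person in 2,000.
prior = 1 / 2_000

# Hypothetical likelihood ratios for three findings (symptom, lab value, genetic marker).
findings = [20.0, 15.0, 8.0]

probability = posterior_probability(prior, findings)
print(f"Probability after combining the findings: {probability:.2f}")
```

Even with three strongly suggestive findings, the combined probability in this example stays far from certainty, which illustrates why the answer remains an approximation and why more data are needed to raise confidence.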
