Blog detail

Vitale Role of Data Science in Advancing Medicine

The healthcare industry is an in-demand field that provides opportunities to make a positive difference in the world; as a result, a career in healthcare is an attractive option for many job seekers. For those who want to be involved with healthcare but don’t want to work in a hospital or clinic, data science—which is fast becoming a major part of the healthcare industry—provides an excellent opportunity to contribute to the advancement of the field.

The Use of Data Science in Medicine is Booming

The healthcare industry has been one of the most prominent beneficiaries of the emergence of data science. Thanks to data scientists, medical diagnostics are becoming more efficient and accessible, medical treatment more personalized, and medical research more data-driven. Established pharmaceutical companies and research centers such as Merck and the Mayo Clinic are relying on data science projects to analyze massive amounts of data in the pursuit of new insights, as are some of the world’s largest tech firms. Google, NVidia, and other industry leaders have spent significant amounts of money to purchase multiple leading AI startups, and Google’s DeepMind Health AI service is being used by the UK National Health Service (NHS) to alert healthcare workers when in-hospital patients are at risk for acute kidney injuries.

For those who don’t want to work for a large tech or healthcare company, a career in data science also provides an ideal path into entrepreneurship. Healthcare AI startups raised over US $4.3 billion in venture capital between 2013 and 2018, more than any other industry. Even better, the amount of funding available has yet to peak; investment in healthcare AI is booming, and more venture capital was raised for healthcare AI in 2018 than in any previous year. The relationship between the medical and data science communities is so strongly developed that healthcare-oriented data science tools have their own central repositories. These repositories (called “Bioconductor”, “Biopython”, etc) are collections of coding tools devoted entirely to facilitating genomics-focused medical research. Their existence reflects the unique level of demand for medically-oriented data science; Bioconductor is the only major discipline-specific repository available for R, a data science-oriented programming language.

What Are Some Other Ways That Data Science Is Used in Medicine?

Research Recovery and Drug Discovery

Data science is driving improvements to the drug discovery process that are propelling the creation of new insights into how to combat disease and illness. By designing algorithms capable of ingesting large amounts of medical data, data scientists can aggregate data from existing and new studies in order to leverage a “big data” approach to research. For instance, a data science firm called Data2Discovery is using a natural language processing algorithm to collect and analyze data from hundreds of thousands of pages of medical texts. Traditional medical studies focus on validating a specific set of hypotheses, which means valuable insights from research studies may go unrecognized if they’re not relevant to a study’s hypothesis. Drug discovery projects maximize the value of individual research by aggregating its data into a larger framework which can be leveraged to discover previously unknown relationships between variables within the dataset. 

Data Science Enables Truly Personalised Medicine

Personalized medicine, also referred to as precision medicine, refers to a future in which every patient receives treatment that is customized to their unique biological characteristics. Data scientists working in this still-emerging field are developing tools to understand how a patient’s genetics affect their response to specific treatments, with the goal of enabling physicians to provide patients with personalized treatment programs that are tailored to their specific biology. The traditional model for developing new medicines has not changed significantly in decades: Potential new drugs are evaluated through a series of trials, which conclude with a study of their effects on a sample of candidate patients. While these trials may consider how demographic data affect treatment outcomes, they rarely assess data according to more detailed variables such as the billions of data points that comprise the human genome.

The lack of data points assessed in traditional clinical trials means that physicians lack the ability to tailor treatments to individual patients, and must instead prescribe based on general information such as age, height, and weight. As a result, many patients—such as those with mental illness—are forced to rely on trial and error to figure out what treatments work best for them. This process is inefficient, costly, and can cause stress to patients, potentially discouraging them from continuing to explore treatment options. Personalized medicine short-circuits this process by considering human physiology in detail, down to the genetic level. By comparing treatment outcomes in-context with detailed patient data, data scientists can discover how specific genetic proteins or other biological attributes affect the efficacy of a given treatment. This information provides physicians with the ability to determine which drug would work best for a specific patient, both in terms of its effectiveness and in terms of the likelihood that the patient will experience side effects from the treatment.