
Evaluation of Semantic Web Technologies for Storing Computable Definitions of Electronic Health Records Phenotyping Algorithms

February 23, 2026



📝 Original Info

Title: Evaluation of Semantic Web Technologies for Storing Computable Definitions of Electronic Health Records Phenotyping Algorithms
ArXiv ID: 1707.07673
Date: 2017-07-26
Authors: Václav Papež, Spiros Denaxas, Harry Hemingway

📝 Abstract

Electronic Health Records (EHR) are electronic data generated during or as a byproduct of routine patient care. Structured, semi-structured and unstructured EHR offer researchers unprecedented phenotypic breadth and depth and have the potential to accelerate the development of precision medicine approaches at scale. A main EHR use-case is defining phenotyping algorithms that identify disease status, onset and severity. Phenotyping algorithms utilize diagnoses, prescriptions, laboratory tests, symptoms and other elements in order to identify patients with or without a specific trait. No common standardized, structured, computable format exists for storing phenotyping algorithms. The majority of algorithms are stored as human-readable descriptive text documents, which makes their translation to code challenging due to their inherent complexity and hinders their sharing and re-use across the community. In this paper, we evaluate the two key Semantic Web Technologies, the Web Ontology Language and the Resource Description Framework, for enabling computable representations of EHR-driven phenotyping algorithms.


📄 Full Content

Václav Papež (1,2,*), MSc; Spiros Denaxas (1,2,*), PhD; Harry Hemingway (1,2), FRCP

(1) Institute of Health Informatics, University College London, London, UK
(2) Farr Institute of Health Informatics Research, University College London, London, UK

Introduction

Electronic Health Records (EHR) are structured, semi-structured and unstructured data that are generated during routine interactions of patients with primary care, hospital care and tertiary healthcare, or as a byproduct of those interactions for billing or administrative purposes [1]. Structured EHR are recorded using controlled clinical terminologies, while unstructured data include clinical text and narrative. Semi-structured EHR data often loosely follow a data specification (e.g. prescription events, medical imaging reports), but this varies greatly across information systems, clinical specialties and healthcare providers.

High-throughput genotyping and the increased availability of EHR data are giving scientists an unprecedented opportunity to exploit routinely generated clinical data to advance precision medicine at scale. EHR data can fundamentally alter the manner in which genetic association studies are performed and enable scientists to examine the association of genetic variants and traits in larger sample sizes and with greater phenotypic breadth [2].

A primary use-case of EHR data is the creation of phenotyping (or “case finding”) algorithms [3]: computational algorithms that identify patients who have (or have not) been diagnosed with a particular condition [4] (e.g. acute myocardial infarction, prostate cancer or anxiety) and, where applicable, the disease onset and severity. Phenotyping algorithms tend to use clinical information such as diagnoses, laboratory tests, symptoms, clinical examination findings, prescriptions, referrals and other EHR data elements. While the term phenotype is traditionally defined as the physical manifestation of a particular trait, in the context of EHR research, phenotypes are broadly (but not exclusively) defined as the presence or absence of a particular clinical condition. In EHR resources linked with genetic data, such as the Electronic Medical Records and Genomics (eMERGE) consortium [5], these phenotypes can enable large-scale genomic association studies, which have traditionally been limited to a small set of traits.

Phenotyping, however, is a challenging and time-consuming process, since the data have often been collected for care, auditing or administrative purposes and not for research. The contents of EHR data sources are an indirect representation of the true patient state, skewed by the underlying healthcare process (e.g. clinical guidelines, information systems, data standards) [6].
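As a minimal sketch of the kind of case-finding logic described above, the following Python fragment flags patients with a diagnosis of acute myocardial infarction and takes the earliest matching record as a proxy for onset. The class names, the `phenotype_ami` function and the single-prefix code list are assumptions made for illustration; they are not taken from the paper, and a real algorithm would typically combine several data sources and a curated, validated code list.

```python
from dataclasses import dataclass, field
from datetime import date

# Illustrative code list only (assumption): ICD-10 I21 is acute myocardial
# infarction; a real algorithm would use a curated, validated list.
AMI_DIAGNOSIS_PREFIXES = ("I21",)


@dataclass
class Diagnosis:
    code: str       # diagnosis code as recorded in the EHR
    recorded: date  # date on which the code was recorded


@dataclass
class Patient:
    patient_id: str
    diagnoses: list = field(default_factory=list)


def phenotype_ami(patient: Patient) -> dict:
    """Return case status and the earliest matching date (a proxy for onset)."""
    matches = [
        d.recorded
        for d in patient.diagnoses
        if d.code.startswith(AMI_DIAGNOSIS_PREFIXES)
    ]
    return {
        "patient_id": patient.patient_id,
        "is_case": bool(matches),
        "onset": min(matches) if matches else None,
    }


# Hypothetical patient record used only to exercise the function.
example = Patient(
    "p1",
    [
        Diagnosis("I21.0", date(2015, 3, 2)),
        Diagnosis("E11", date(2012, 1, 1)),
        Diagnosis("I21.4", date(2014, 6, 9)),
    ],
)
result = phenotype_ami(example)
```

Even in this toy form, choices such as which codes count and how onset is derived are buried in the code, which is the kind of detail that a free-text description can leave ambiguous.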
Defining and validating EHR phenotyping algorithms is challenging and time-consuming. Challenges are amplified by the lack of a common definition standard for algorithms, which makes sharing them across the scientific community problematic. Although phenotype components are structured and often annotated with controlled clinical terminology terms, phenotype definitions and their underlying logic are usually expressed as free text, which is not readily machine-readable. The translation from this narrative to the programming code used to identify and extract patients (e.g. implementing a phenotyping algorithm using Structured Query Language for use in a relational database management system) can be problematic due to potential ambiguities in the manner in which the algorithm was expressed, or in the potential ways of implementing it using local data. There is a clear and urgent need to develop and evaluate a computable, standards-driven format to facilitate the systematic creation, sharing and re-use of EH

…(Full text truncated)…
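To give a rough sense of what a computable definition could look like, the sketch below stores a phenotype as subject-predicate-object triples, the basic data model of the Resource Description Framework, and evaluates it against a patient's recorded codes. It uses plain Python tuples rather than an RDF library, and every `ex:` name is a hypothetical vocabulary term invented for this example; none of it is taken from the paper's own model.

```python
# Hypothetical vocabulary (assumption): the "ex:" names below are invented
# for illustration and do not come from the paper or a published ontology.
PHENOTYPE_TRIPLES = [
    ("ex:AMI", "rdf:type", "ex:Phenotype"),
    ("ex:AMI", "ex:hasCriterion", "ex:AMI_dx"),
    ("ex:AMI_dx", "ex:codingSystem", "ICD-10"),
    ("ex:AMI_dx", "ex:codePrefix", "I21"),
]


def objects(triples, subject, predicate):
    """Return all objects stored for a (subject, predicate) pair."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def code_prefixes(triples, phenotype):
    """Collect the code prefixes attached to every criterion of a phenotype."""
    prefixes = []
    for criterion in objects(triples, phenotype, "ex:hasCriterion"):
        prefixes.extend(objects(triples, criterion, "ex:codePrefix"))
    return prefixes


def is_case(triples, phenotype, patient_codes):
    """True if any recorded code starts with a prefix from the definition."""
    prefixes = tuple(code_prefixes(triples, phenotype))
    return any(code.startswith(prefixes) for code in patient_codes)
```

Because the definition is data rather than prose, the same triples could in principle be shared between sites and interpreted by different local implementations, for example `is_case(PHENOTYPE_TRIPLES, "ex:AMI", ["I21.9", "E11"])`.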


Reference

This content is AI-processed based on ArXiv data.
