iHi Data Platform
CMUH Big Data Center: From iHi Data Platform to Practical Healthcare Intelligence
The Big Data Center (BDC) of China Medical University Hospital (CMUH) established the Clinical Research Data Repository in 2016. BDC manages the largest phenome-genome-environmental data platform in Asia, encompassing 19 years of EMR and environmental exposure data from 3 million patients and genetic information from 230,000 patients. This forms a solid foundation for generating high-resolution clinical data. To make full use of this valuable medical big data, BDC developed a smart data platform, the iHi platform, in 2020 to ignite hyper-intelligent data applications. The iHi platform is the only data platform in Taiwan that combines clinical, genetic, and environmental data (Fig. 1).
The iHi platform provides clean, integrated, and de-identified data to clinical researchers through a cloud-based system. This data architecture not only makes the data ecosystem interoperable and sustainable but also solves the problems of cluttered data and low accessibility, and it creates a venue for unlimited artificial intelligence applications. Through the iHi platform services, we aim to expand multi-omics clinical data for education, research, and clinical or business applications. Ultimately, the insights inspired by the iHi platform feed back into clinical settings and improve medical quality and patient health (Fig. 2).
The features of the iHi platform
The iHi platform was designed as a patient-centered medical data ecosystem that provides clinical researchers with accessible, reliable, and diverse data. Nine AI/ML/data tools have been approved by the US FDA or the Taiwan FDA, which attests to the validity and quality of the iHi platform. The iHi platform features an innovative data structure (data Lego) and systematic data annotation workflows (data chip). Building on these characteristics, we established the iHi Genomics analytic platform to speed up clinical research discovery. Furthermore, the high quality and comprehensiveness of the iHi platform's data allow us to move toward internationalization.
Innovative data structure: Data Lego
We aim to build a full-spectrum big data ecosystem that not only integrates EMR, health insurance, genomic, and environmental data but also combines them with real-world data from patient-centered systems and multi-omics data such as microbiome and exosome data. Most importantly, these diverse and heterogeneous data must be linkable and traceable so that they remain sustainable and reusable. Therefore, we process all data through a standardized data management pipeline, which provides users with high-quality and protected clinical datasets. Furthermore, we modularize multi-omics and multi-dimensional datasets into data LEGO bricks, which are deep-cleaned and sorted by their characteristics. The iHi data platform, a data LEGO pool, stores diverse data sources, such as EHR, medical images, and examination reports. Researchers can select the LEGO bricks of interest, bring them into the iHi platform, and perform subsequent analyses directly (Fig. 3).
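The idea of linkable, traceable bricks can be illustrated with a minimal Python sketch. It is not the iHi implementation: the `Brick` class, the `assemble` function, the patient keys, and all field names and values are hypothetical and made up for illustration. The only assumption taken from the text is that each brick is a cleaned dataset that can be linked to the others through a shared, de-identified patient key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brick:
    """A hypothetical 'data LEGO brick': one cleaned dataset keyed by patient."""
    name: str
    source: str   # provenance label, kept so every field stays traceable
    rows: dict    # de-identified patient key -> dict of fields


def assemble(bricks):
    """Link the selected bricks on the shared patient key (inner join)."""
    if not bricks:
        return {}
    shared = set(bricks[0].rows)
    for brick in bricks[1:]:
        shared &= set(brick.rows)
    cohort = {}
    for key in sorted(shared):
        record = {}
        for brick in bricks:
            for field, value in brick.rows[key].items():
                # Prefixing with the brick name records where each field came from.
                record[f"{brick.name}.{field}"] = value
        cohort[key] = record
    return cohort


# Toy bricks with invented values.
emr = Brick("emr", "EMR", {"p001": {"egfr": 52}, "p002": {"egfr": 88}})
geno = Brick("geno", "genotyping", {"p001": {"rs_example": "AG"}, "p003": {"rs_example": "GG"}})
env = Brick("env", "environment", {"p001": {"pm25": 21.4}, "p002": {"pm25": 18.0}})

cohort = assemble([emr, geno, env])
print(cohort)
```

Only patients present in every selected brick end up in the assembled cohort, and each field name carries its brick of origin, which is one simple way to keep combined data traceable.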
Smart Data Augmented Annotation
All data provided in the iHi platform are processed through the standardized data management pipeline, which provides users with high-quality and protected clinical datasets. To perform systematic data cleaning, validation, and integration, we established a unique smart data chip fabrication process that controls the quality of each processing step.
Starting from the data sources, six chips (data source management, data architecture design, data polishing, standardization, refinement, and data validation and stacking) are used to produce qualified and certified datasets and, finally, to generate the smart data chip (Fig. 4).
Through this standard, pre-built smart data chip fabrication process, we can easily manage and trace each processing step in the iHi platform. In addition, we are the only platform in Taiwan that provides de-identified data certified under both ISO and CNS (Fig. 5).
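A staged process in which every step can be traced can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual fabrication process: the stage names follow the six chips listed above, but the stage functions (`identity`, `polish`, `standardize`, `validate`), the cleaning rules, and the sample records are hypothetical placeholders.

```python
def identity(records):
    """Placeholder for stages whose real logic is not described in the text."""
    return list(records)


def polish(records):
    """Toy cleaning rule: drop records with a missing value."""
    return [r for r in records if r.get("value") is not None]


def standardize(records):
    """Toy standardization rule: normalise unit labels to lower case."""
    return [{**r, "unit": r["unit"].lower()} for r in records]


def validate(records):
    """Toy validation rule: refuse to pass on negative values."""
    bad = [r for r in records if r["value"] < 0]
    if bad:
        raise ValueError(f"{len(bad)} record(s) failed validation")
    return list(records)


# Stage names follow the six chips named in the text; the functions are stand-ins.
STAGES = [
    ("data source management", identity),
    ("data architecture design", identity),
    ("data polishing", polish),
    ("standardization", standardize),
    ("refinement", identity),
    ("data validation and stacking", validate),
]


def run_pipeline(records, stages=STAGES):
    """Run every stage in order and keep an audit trail of row counts."""
    trail = []
    for name, step in stages:
        rows_in = len(records)
        records = step(records)
        trail.append((name, rows_in, len(records)))
    return records, trail


raw = [
    {"patient": "p001", "value": 1.2, "unit": "MG/DL"},
    {"patient": "p002", "value": None, "unit": "mg/dL"},
]
clean, trail = run_pipeline(raw)
for name, rows_in, rows_out in trail:
    print(f"{name}: {rows_in} -> {rows_out}")
```

Recording the row count before and after each stage is one simple form of traceability: if a dataset shrinks or fails, the trail shows which step was responsible.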
Applying this brand-new concept of flow production to data management cleans data deeply and supports high-quality AI solutions that fit into the clinical workflow. At the same time, high-performance AI can help extract important new data features, which enhances data diversity and further enriches the smart data ecosystem (Fig. 6).
In 2021, we launched the iHi Genomics Analytic Platform, an easy-to-use analytic platform for exploring data and extracting insights from datasets of interest. iHi Genomics provides disease cohort selection and genome-wide association study (GWAS) analysis within a few clicks. It can generate a full GWAS report, including the QC details and a Manhattan plot of the significant SNPs associated with the disease (Fig. 7).
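To make the Manhattan plot concrete, the sketch below shows how association results are typically turned into plot coordinates. It does not reflect how iHi Genomics is implemented: the function `manhattan_points`, the SNP identifiers, and the p-values are hypothetical, and the significance threshold of 5e-8 is the conventional genome-wide cutoff, assumed here because the text does not state which threshold the platform uses.

```python
import math

# Conventional genome-wide significance threshold (an assumption; the
# threshold used by the platform is not stated in the text).
GENOME_WIDE_P = 5e-8


def manhattan_points(results, threshold=GENOME_WIDE_P):
    """Convert (snp, chromosome, position, p-value) tuples into plot points.

    The y-axis of a Manhattan plot is -log10(p), so smaller p-values
    appear as taller points.
    """
    points = []
    for snp, chrom, pos, p in results:
        points.append({
            "snp": snp,
            "chrom": chrom,
            "pos": pos,
            "neg_log10_p": -math.log10(p),
            "significant": p < threshold,
        })
    # Order by chromosome, then position, as on the plot's x-axis.
    return sorted(points, key=lambda d: (d["chrom"], d["pos"]))


# Invented association results for illustration only.
toy_results = [
    ("snp_b", 2, 500, 1e-9),
    ("snp_a", 1, 100, 0.03),
    ("snp_c", 2, 120, 4e-6),
]
points = manhattan_points(toy_results)
hits = [d["snp"] for d in points if d["significant"]]
print(hits)
```

Each point's horizontal position comes from its chromosome and base-pair position, and the SNPs that clear the threshold are the ones a report would highlight as significant.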
Using virtual desktop infrastructure (VDI), users can remotely access de-identified data certified under ISO and CNS in a highly secure environment (Fig. 8).
With the full support of the CMUH main board, the BDC hosts data from more than 3 million patients, which can be linked to genetic and environmental data. These deep-cleaned, integrated multi-omics data provide both macro-level and micro-level resolution for discovering clinical insights. Based on iHi services, we have developed 21 patented AI/data solutions with US and Taiwan FDA pre-marketing approval and published more than 80 SCI papers (Fig. 9).
Building on the experience and performance of the iHi platform, we have collaborated with 19 international institutions in seven countries, including universities, medical centers, and national institutes of health (Fig. 10).
We were also invited by the American Society of Nephrology (ASN) to present research results on chronic kidney disease (CKD) derived from various medical data sources (Fig. 11).
The workflow and infrastructure of the iHi platform have been highly recognized by many leading experts worldwide (Fig. 12).
In 2022, we initiated several clinical and intelligent medical cooperative projects with the United Arab Emirates (UAE) (Fig. 13), the US, and Japan. Thanks to the well-designed data architecture of the iHi platform, we are able to build an innovation-friendly data ecosystem and connect with the whole world.
We look forward to further collaboration.