2050812 Machine Learning Based Semi-automatic Iterative Annotation Method of Similar Laboratory Test Item Combination Between Multiple Facilities


August 12, 25

Slide overview

20250812
MEDINFO2025 Taipei
OS28 - Language Model Meets Healthcare


Sharing slides related to medical informatics.


Text of each page
1.

OS28 - Language Models Meet Healthcare
11:00 - 12:30, Room: 201F, 2F (Location: Taipei International Convention Center)
Machine Learning Based Semi-automatic Iterative Annotation Method of Similar Laboratory Test Item Combination Between Multiple Facilities
Naoto KUME, Hiroshima University Hospital, JAPAN
Satoshi KATO, H.U. Group Research Institute G.K., JAPAN
Presented by: Naoto KUME
Date: 12 Aug
Location: Taipei, Taiwan

2.

MedInfo 2025
Machine Learning Based Semi-automatic Iterative Annotation Method of Similar Laboratory Test Item Combination Between Multiple Facilities
Naoto KUME *1, Satoshi KATO *2
*1 Hiroshima University Hospital, Japan
*2 H.U. Group Research Institute G.K.

3.

Background
◼ Building big data from multiple facilities always runs into the problem of item mapping.
◼ For laboratory test items in particular, master-based mapping does not guarantee sufficient quality.
◼ The mapping workload is also very high.
◼ Challenges with the standard codes for laboratory test items, such as LOINC and JLAC-10 (in Japan):
  ◼ They are not yet implemented in hospitals because mapping has never been anybody's job.
  ◼ The mapping result depends on which specialist does the job. → variation in mapping results
[Diagram: hospital masters M1 ×, M2 ×, M3 × and a standard master M; 3,000 items on a master per hospital; needs: RWD]
Lack of information makes it difficult, in practice, to create a mapping table that fits the need.

4.

Implementation of the Nationwide EHR “Sen-nen Karte”
◼ Previous work:
  ◼ Connecting 106 hospitals on a voluntary basis, since 2015
  ◼ Promoting legislation to establish a new act
    ◼ since 2015, act in 2018
    ◼ “Next Generation Medical Infrastructure Act”
  ◼ Establishing a new company to obtain government authorization
    ◼ since 2019
  ◼ Providing anonymized data to industry and academia for use in medicine and healthcare.

5.

Introduction
◼ Data-driven research is still a new trend in clinical studies, following the discussion of secondary use of clinical datasets.
◼ Secondary use of Electronic Health Records (EHR) requires building big data from multiple facilities.
◼ Challenges:
  ◼ Reducing the mapping burden
  ◼ The lack of interoperability of laboratory test item names and codes between facilities
  ◼ Sustaining the mapped state so that it keeps up with trends in test item usage (“Keep Active”)
◼ From the viewpoint of cost, including both burden and budget, perfect mapping is practically difficult.

6.

Purpose
◼ Provide a semi-automatic mapping method that reduces the mapping cost.
◼ Assure the quality of the mapping result: the result is certified by professionals.
◼ Mapped test items are preserved as mapping experience and accelerate the mapping of a new facility's items in the next cycle.

7.

Method
◼ Providing a semi-automatic iterative mapping method
◼ Machine learning (ML) provides item combination candidates before the annotated data is generated; the annotations are verified by a clinical laboratory technologist.
◼ The proposed model implements an item group dictionary, called a spelling variation dictionary, for each laboratory test item. The dictionary grows the classification model in each annotation cycle using the result data of a new facility.
[Diagram labels: Master (M) “HbA1c”; LIS, EMR; processing is not always the same; the EMR result is used for the prediction; result data “HbA1c+*” is transformed into numeric parameters]
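
The growth of the spelling variation dictionary across annotation cycles can be pictured with a small sketch. This is only an illustration under assumptions: the class `SpellingVariationDictionary`, the function `run_cycle`, and the sample spellings are hypothetical names and data, not the authors' implementation, and the ML step is replaced by a plain list of candidates.

```python
from collections import defaultdict


class SpellingVariationDictionary:
    """Hypothetical item group dictionary: one set of confirmed spellings per test item."""

    def __init__(self):
        self._variants = defaultdict(set)

    def register(self, item, spelling):
        """Store a spelling that a specialist confirmed for the given test item."""
        self._variants[item].add(spelling)

    def lookup(self, spelling):
        """Return the test items whose confirmed spellings include this one."""
        return sorted(item for item, known in self._variants.items() if spelling in known)

    def size(self, item):
        """Number of confirmed spellings stored for the item."""
        return len(self._variants.get(item, ()))


def run_cycle(dictionary, item, candidates, confirmed_by_specialist):
    """One annotation cycle: register only the candidates the specialist confirmed."""
    accepted = [name for name in candidates if name in confirmed_by_specialist]
    for name in accepted:
        dictionary.register(item, name)
    return accepted


# Two cycles with made-up candidate names; the second facility adds a new spelling.
dictionary = SpellingVariationDictionary()
run_cycle(dictionary, "HbA1c", ["HbA1c(NGSP)", "HbA1"], {"HbA1c(NGSP)"})
run_cycle(dictionary, "HbA1c", ["ヘモグロビンA1c", "HbA1c(NGSP)"], {"ヘモグロビンA1c", "HbA1c(NGSP)"})
```

After the second cycle the dictionary holds two confirmed spellings for “HbA1c”, while the rejected candidate is never stored.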

8.

Sample of the original test result input
… 31 coincident
The specialist annotates the ground truth (“answer”) for the first training and evaluation, not every time.

9.

The Hell of Japanese Expression Variation
Written characters have 6 variations: alphabet, hiragana, katakana, kanji, 2-byte, and 1-byte.
“HbA1c”, with the standard expression (NGSP, JDS)
Item names from 31 hospitals
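
Some of these variations can be folded mechanically before any learning takes place. The sketch below is an assumption about how such preprocessing might look, not a step the slides describe: the function names are hypothetical, and kanji spellings cannot be folded this way, so they would still need the dictionary.

```python
import unicodedata


def katakana_to_hiragana(text):
    """Shift katakana letters (U+30A1..U+30F6) onto their hiragana counterparts."""
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


def normalize_item_name(name):
    """Fold width, kana, and case variants of an item name into one comparable key."""
    # NFKC maps 2-byte (full-width) alphanumerics to 1-byte forms
    # and half-width katakana to full-width katakana.
    folded = unicodedata.normalize("NFKC", name)
    folded = katakana_to_hiragana(folded)
    return folded.casefold().replace(" ", "")


print(normalize_item_name("ＨｂＡ１ｃ"))
```

With this folding, full-width “ＨｂＡ１ｃ” and half-width “HbA1c” share one key, as do half-width and full-width katakana spellings of the same word.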

10.

Input data generation for machine learning (ML)
[Diagram]
◼ Input
  ◼ Meta data (text): item name (in-house), specimen name, unit, reference range
  ◼ Result (numeric): 1. All results in 2 years, 2. Split by Q1 to Q4, 3. Split by demographics
◼ Processing: similarity scoring; statistic values → numeric parameters
◼ ML
◼ Evaluation: Target / Not Target, compared with the Answer
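
A minimal sketch of how text metadata and numeric results might be turned into numeric parameters is given here. Everything in it is an assumption for illustration: the slides do not say which similarity score or which statistics are used, so `difflib` and quartile cut points stand in for them, and the split by demographics is omitted.

```python
import statistics
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Similarity score in [0, 1] between two item names (difflib ratio, case-insensitive)."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def numeric_profile(values):
    """Summary statistics of one item's numeric results."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "stdev": statistics.stdev(values),
        "q1": q1,
        "median": median,
        "q3": q3,
    }


def feature_vector(item_name, reference_name, values):
    """Concatenate the text similarity with the numeric statistics."""
    profile = numeric_profile(values)
    return [name_similarity(item_name, reference_name)] + [
        profile[key] for key in ("mean", "stdev", "q1", "median", "q3")
    ]


# Made-up HbA1c-like values (%), only to show the shape of the output.
features = feature_vector("HbA1c(NGSP)", "HbA1c", [5.2, 5.6, 6.0, 6.4, 6.8])
```

A vector like `features` is the kind of numeric input a classifier could consume, one row per in-house item.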

11.

Input and Output Process of ML
[Diagram; label: Manual task]

12.

Iterative Annotation Model as classification decision process
[Diagram; labels: Manual task; Input after second cycle]

13.

Definition: Productivity Improvement Ratio (PIR)
• A metric representing the ratio of machine learning work efficiency to manual work efficiency.
• Higher values indicate greater efficiency improvement through machine learning.
PIR = 0.2771 / 0.0004 = 714.9 (0.0004 is a rounded value)
This value shows 714.9 times greater efficiency than human-led mapping.
• TLC: Total number of laboratory test codes across all hospitals.
• NCT: Number of Correctly identified Target codes.
• NoD: Number of code candidates detected by machine learning.
• NTP: Number of True Positive.
4 items (27 - 23) are missing → give up
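
The PIR formula itself is not in the slide text. The sketch below assumes PIR = (NTP / NoD) / (NCT / TLC), a reading that is consistent with the abbreviations and reproduces the quoted 714.9. NCT = 27 and NTP = 23 are read from the "27 - 23" note, TLC = 69,654 is the total reported in the Result slide, and NoD = 83 is a hypothetical value chosen only because 23 / 83 rounds to 0.2771.

```python
def productivity_improvement_ratio(ntp, nod, nct, tlc):
    """PIR under the assumed reading: ML efficiency divided by manual efficiency."""
    ml_efficiency = ntp / nod      # true positives per candidate a reviewer has to check
    manual_efficiency = nct / tlc  # targets per code when every code is checked by hand
    return ml_efficiency / manual_efficiency


# NoD = 83 is an assumption; the other counts are taken from the slides as described above.
pir = productivity_improvement_ratio(ntp=23, nod=83, nct=27, tlc=69654)
print(round(pir, 1))  # 714.9
```

Under this reading the manual efficiency is 27 / 69,654 ≈ 0.00039, which the slide rounds to 0.0004.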

14.

Definition: Mean Success Rate (MSR)
• MSR measures the potential correctness of the ML model as a pre-screening tool for detecting target laboratory test items across multiple hospitals.
• Final validation requires ground truth annotated by domain experts.
TP = true positive, FP = false positive, FN = false negative

15.

Evaluation criteria
◼ MSR and PIR exhibit a trade-off, so the prediction cutoff threshold can be adjusted strategically to optimize performance according to project priorities.
  ◼ Increased MSR → more target codes detected (higher sensitivity)
  ◼ Decreased PIR → more false positives included, reducing the precision ratio
  ◼ Acceptable trade-off → higher manual review burden in exchange for fewer missed items
◼ Originally, this task has a ground truth.
  ◼ If the specialist checks all item combinations, part of the ground truth is obtained.
  ◼ The ROC curve is expected to be high.
https://sefiks.com/2020/12/10/a-gentle-introduction-to-roc-curve-and-auc/
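
The effect of moving the cutoff can be seen on toy data. This sketch uses sensitivity and precision as stand-ins for the trade-off, because the exact MSR formula is not in the slide text. The scores, labels, and function names are made up for illustration.

```python
def confusion_at_cutoff(scores, labels, cutoff):
    """Count TP, FP, FN when every item scoring >= cutoff is sent to manual review."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y)
    return tp, fp, fn


def sweep(scores, labels, cutoffs):
    """Return (cutoff, sensitivity, precision, items to review) for each cutoff."""
    rows = []
    for cutoff in cutoffs:
        tp, fp, fn = confusion_at_cutoff(scores, labels, cutoff)
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        rows.append((cutoff, sensitivity, precision, tp + fp))
    return rows


# Hypothetical prediction scores and ground truth (1 = target item).
scores = [0.95, 0.90, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
rows = sweep(scores, labels, [0.8, 0.5, 0.25])
```

On this toy data, lowering the cutoff from 0.8 to 0.25 raises sensitivity from 0.5 to 1.0 while the number of items to review grows from 2 to 6.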

16.

Result
◼ 11 target items were evaluated.
◼ PIR is significantly high. MSR is sufficiently high.
Total in-house codes (TLC): 69,654, mean = 3,258
Table 2. Performance Metrics for Laboratory Test Code Mapping
NHP = Number of Hospitals, NTP = Number of True Positive, NCT = Number of Correct Target, NoD = Number of Detection, S/N = Signal/Noise Ratio, MSR = Mean Success Rate, PIR = Productivity Improvement Ratio.

17.

Success Rate and Confidence Rate
• The machine learning performance metrics showed high accuracy across test items, with
  • sensitivity between 0.995 and 0.998,
  • specificity ranging from 0.810 to 0.996,
  • AUC values above 0.992,
  → demonstrating the robust predictive capability of the XGBoost model.
• Table 2 shows high performance of the proposed method across 11 laboratory test items.
• The model successfully mapped test codes with Mean Success Rate (MSR) ranging from 0.80 to 1.00.
• The high MSR suggests that the method performs consistently across different test types, indicating robust mapping capabilities.
• The average Productivity Improvement Ratio (PIR) of 1,178.6 (SD = 765.0) demonstrates substantial efficiency gains, with the large standard deviation reflecting variability in mapping complexity across different laboratory test items.
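
For reference, the three reported metrics can be computed from prediction scores as in the sketch below. It does not train or call XGBoost. The scores and labels are toy values, and the AUC is computed by the pairwise (rank-based) definition rather than by any particular library.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when scores >= cutoff are predicted as target."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores and ground truth (1 = target item).
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]
sens, spec = sensitivity_specificity(toy_scores, toy_labels, cutoff=0.5)
area = auc(toy_scores, toy_labels)
```

Sensitivity and specificity depend on the chosen cutoff, while the AUC summarizes ranking quality over all cutoffs.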

18.

Discussion
◼ Because a massive dataset is available through the business operator certified under Japan's Next Generation Medical Infrastructure Act, the proposed data-driven approach was realized using test result datasets from multiple hospitals.
◼ False positive (FP) performance should be investigated in future work to reduce visual inspection of the detected items.
◼ To ensure no items are overlooked, those not identified by the classification model must be carefully investigated.
◼ However, FPs are checked manually, so final mapping mistakes are avoided.
◼ When a missing item (“false negative”, FN) is found, MSR should
  ◼ Take a lower cutoff → reduce MSR → items that are still missing need to be registered in the dictionary
  ◼ A found FN is registered in the dictionary for the next cycle.
◼ New item mapping without ground truth?
◼ The learning curve is unknown: it is not clear how many facilities must be annotated to reach sufficient performance.

19.

Conclusion
◼ This study aims to provide a sustainable method for creating a common laboratory test master by reducing the mapping cost while ensuring the quality of the mapping.
◼ An iterative annotation method was proposed, and the performance of the forecast model was evaluated on 11 initially selected test items.
◼ As a result, from over 70% up to 98% of items were automatically found among the facility test items and mapped to a common test master.
◼ Therefore, the proposed method is expected to reduce the cost of mapping over a thousand test items, which are expected to be used for secondary use in general clinical studies.