Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records

Publication: Working paper › Preprint › Research

Standard

Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records. / Kaas-Hansen, Benjamin Skov; Placido, Davide; Rodrìguez, Cristina Leal; Thorsen-Meyer, Hans-Christian; Gentile, Simona; Nielsen, Anna Pors; Brunak, Søren; Jürgens, Gesche; Andersen, Stig Ejdrup.

Authorea, 2022.

Harvard

Kaas-Hansen, BS, Placido, D, Rodrìguez, CL, Thorsen-Meyer, H-C, Gentile, S, Nielsen, AP, Brunak, S, Jürgens, G & Andersen, SE 2022 'Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records' Authorea. https://doi.org/10.22541/au.164349930.05269919/v1

APA

Kaas-Hansen, B. S., Placido, D., Rodrìguez, C. L., Thorsen-Meyer, H.-C., Gentile, S., Nielsen, A. P., Brunak, S., Jürgens, G., & Andersen, S. E. (2022). Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records. Authorea. (Authorea Preprints). https://doi.org/10.22541/au.164349930.05269919/v1

Vancouver

Kaas-Hansen BS, Placido D, Rodrìguez CL, Thorsen-Meyer H-C, Gentile S, Nielsen AP et al. Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records. Authorea. 2022. https://doi.org/10.22541/au.164349930.05269919/v1

Author

Kaas-Hansen, Benjamin Skov ; Placido, Davide ; Rodrìguez, Cristina Leal ; Thorsen-Meyer, Hans-Christian ; Gentile, Simona ; Nielsen, Anna Pors ; Brunak, Søren ; Jürgens, Gesche ; Andersen, Stig Ejdrup. / Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records. Authorea, 2022. (Authorea Preprints).

Bibtex

@techreport{778827a5f57e460d9a90dd948234d69f,

title = "Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records",

abstract = "Aim: To create a drug safety signalling pipeline associating latent information in clinical free text with exposure profiles to highlight potential adverse drug reactions to single drugs and drug pairs. Methods: All inpatient visits of a 500,000-patient sample from two Danish regions, between 18 May 2008 and 30 June 2016. Tokens from clinical notes recorded within 48 hours of admission were operationalised with a fastText embedding. For each of the 10,720 single-drug and drug-pair exposures from doorstep medication profiles, we trained a feed-forward neural network predicting the risk of exposure using embedding vectors as inputs. Results: 2,905,251 inpatient visits comprised 13,740,564 doorstep drug prescriptions; the median number of prescriptions was 5 (IQR: 3-9) and in 1,184,340 (41%) admissions patients used ≥5 drugs concurrently. 10,788,259 clinical notes were included, with 179,441,739 tokens retained after pruning. Of 345 single-drug signals reviewed, 28 (8.1%) represented possibly undescribed relationships; 186 (54%) signals were clinically meaningful. 16 (14%) of the 115 drug-pair signals were possible interactions and 2 (1.7%) were known. Conclusion: We built a language-agnostic pipeline for mining associations between free-text information and medication exposure without manual curation, by predicting not the likely outcome of a range of exposures, but the likely exposures for outcomes of interest. Our approach may help overcome limitations of text mining methods relying on curated data in English and makes our method appealing in settings that must make sense of non-English free text for pharmacovigilance.",

author = "Kaas-Hansen, {Benjamin Skov} and Davide Placido and Rodr{\`i}guez, {Cristina Leal} and Hans-Christian Thorsen-Meyer and Simona Gentile and Nielsen, {Anna Pors} and S{\o}ren Brunak and Gesche J{\"u}rgens and Andersen, {Stig Ejdrup}",

year = "2022",

doi = "10.22541/au.164349930.05269919/v1",

language = "English",

series = "Authorea Preprints",

publisher = "Authorea",

type = "WorkingPaper",

institution = "Authorea",

}

RIS

TY - UNPB

T1 - Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records

AU - Kaas-Hansen, Benjamin Skov

AU - Placido, Davide

AU - Rodrìguez, Cristina Leal

AU - Thorsen-Meyer, Hans-Christian

AU - Gentile, Simona

AU - Nielsen, Anna Pors

AU - Brunak, Søren

AU - Jürgens, Gesche

AU - Andersen, Stig Ejdrup

PY - 2022

Y1 - 2022

N2 - Aim: To create a drug safety signalling pipeline associating latent information in clinical free text with exposure profiles to highlight potential adverse drug reactions to single drugs and drug pairs. Methods: All inpatient visits of a 500,000-patient sample from two Danish regions, between 18 May 2008 and 30 June 2016. Tokens from clinical notes recorded within 48 hours of admission were operationalised with a fastText embedding. For each of the 10,720 single-drug and drug-pair exposures from doorstep medication profiles, we trained a feed-forward neural network predicting the risk of exposure using embedding vectors as inputs. Results: 2,905,251 inpatient visits comprised 13,740,564 doorstep drug prescriptions; the median number of prescriptions was 5 (IQR: 3-9) and in 1,184,340 (41%) admissions patients used ≥5 drugs concurrently. 10,788,259 clinical notes were included, with 179,441,739 tokens retained after pruning. Of 345 single-drug signals reviewed, 28 (8.1%) represented possibly undescribed relationships; 186 (54%) signals were clinically meaningful. 16 (14%) of the 115 drug-pair signals were possible interactions and 2 (1.7%) were known. Conclusion: We built a language-agnostic pipeline for mining associations between free-text information and medication exposure without manual curation, by predicting not the likely outcome of a range of exposures, but the likely exposures for outcomes of interest. Our approach may help overcome limitations of text mining methods relying on curated data in English and makes our method appealing in settings that must make sense of non-English free text for pharmacovigilance.

AB - Aim: To create a drug safety signalling pipeline associating latent information in clinical free text with exposure profiles to highlight potential adverse drug reactions to single drugs and drug pairs. Methods: All inpatient visits of a 500,000-patient sample from two Danish regions, between 18 May 2008 and 30 June 2016. Tokens from clinical notes recorded within 48 hours of admission were operationalised with a fastText embedding. For each of the 10,720 single-drug and drug-pair exposures from doorstep medication profiles, we trained a feed-forward neural network predicting the risk of exposure using embedding vectors as inputs. Results: 2,905,251 inpatient visits comprised 13,740,564 doorstep drug prescriptions; the median number of prescriptions was 5 (IQR: 3-9) and in 1,184,340 (41%) admissions patients used ≥5 drugs concurrently. 10,788,259 clinical notes were included, with 179,441,739 tokens retained after pruning. Of 345 single-drug signals reviewed, 28 (8.1%) represented possibly undescribed relationships; 186 (54%) signals were clinically meaningful. 16 (14%) of the 115 drug-pair signals were possible interactions and 2 (1.7%) were known. Conclusion: We built a language-agnostic pipeline for mining associations between free-text information and medication exposure without manual curation, by predicting not the likely outcome of a range of exposures, but the likely exposures for outcomes of interest. Our approach may help overcome limitations of text mining methods relying on curated data in English and makes our method appealing in settings that must make sense of non-English free text for pharmacovigilance.

U2 - 10.22541/au.164349930.05269919/v1

DO - 10.22541/au.164349930.05269919/v1

M3 - Preprint

T3 - Authorea Preprints

BT - Language-agnostic pharmacovigilant text mining to elicit side effects from clinical notes and hospital medication records

PB - Authorea

ER -

ID: 291368764

Institut for Klinisk Medicin