NYMC Faculty Publications

Creating Rare Epilepsy Cohorts Using Keyword Search in Electronic Health Records


OBJECTIVE: Administrative codes to identify people with rare epilepsies in electronic health records are limited. The current study evaluated the use of keyword search as an alternative method for rare epilepsy cohort creation using electronic health records data.

METHODS: Data included clinical notes from encounters with International Classification of Diseases, Ninth Revision (ICD-9) codes for seizures, epilepsy, and/or convulsions during 2010-2014, across six health care systems in New York City. We identified cases with rare epilepsies by searching clinical notes for keywords associated with 33 rare epilepsies. We validated cases via manual chart review. We compared the performance of keyword search to manual chart review using positive predictive value (PPV), sensitivity, and F-score. We selected an initial combination of keywords using the highest F-scores.

RESULTS: Data included clinical notes from 77 924 cases with ICD-9 codes for seizures, epilepsy, and/or convulsions. The all-keyword search method identified 6095 candidates, and manual chart review confirmed that 2068 (34%) had a rare epilepsy. The initial combination method identified 1862 cases with a rare epilepsy, and this method performed as follows: PPV median = .64 (interquartile range [IQR] = .50-.81, range = .20-1.00), sensitivity median = .93 (IQR = .76-1.00, range = .10-1.00), and F-score median = .71 (IQR = .63-.85, range = .18-1.00). Using this method, we identified four cohorts of rare epilepsies with over 100 individuals, including infantile spasms, Lennox-Gastaut syndrome, Rett syndrome, and tuberous sclerosis complex. We identified over 50 individuals with two rare epilepsies that do not have specific ICD-10 codes for cohort creation (epilepsy with myoclonic atonic seizures, Sturge-Weber syndrome).

SIGNIFICANCE: Keyword search is an effective method for cohort creation. These findings can improve identification and surveillance of individuals with rare epilepsies and promote their referral to specialty clinics, clinical research, and support groups.
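
The keyword-search and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the study's implementation. The keyword lists, patient identifiers, and note texts are hypothetical, as are the function names. The abstract does not define its F-score, so the sketch assumes the F1 form (the harmonic mean of PPV and sensitivity).

```python
import re

# Hypothetical keyword lists; the abstract does not give the study's keywords.
KEYWORDS = {
    "Rett syndrome": ["rett syndrome", "rett's syndrome"],
    "Lennox-Gastaut syndrome": ["lennox-gastaut", "lennox gastaut"],
}


def compile_patterns(keywords):
    """Build one case-insensitive pattern per condition from its keyword list."""
    return {
        condition: re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
        for condition, terms in keywords.items()
    }


def find_candidates(notes, patterns):
    """Return {condition: set of patient ids} with at least one matching note.

    notes maps a patient id to a list of clinical note texts.
    """
    candidates = {condition: set() for condition in patterns}
    for patient_id, texts in notes.items():
        for condition, pattern in patterns.items():
            if any(pattern.search(text) for text in texts):
                candidates[condition].add(patient_id)
    return candidates


def evaluate(flagged, confirmed):
    """Compare keyword hits with chart-review-confirmed cases.

    Returns (PPV, sensitivity, F-score), where F-score is assumed to be F1.
    """
    true_positives = len(flagged & confirmed)
    ppv = true_positives / len(flagged) if flagged else 0.0
    sensitivity = true_positives / len(confirmed) if confirmed else 0.0
    total = ppv + sensitivity
    f_score = 2 * ppv * sensitivity / total if total else 0.0
    return ppv, sensitivity, f_score


# Toy notes, invented for illustration only.
notes = {
    "p1": ["History of Rett syndrome."],
    "p2": ["Rule out Lennox Gastaut; EEG pending."],
    "p3": ["Febrile seizure, resolved."],
    "p4": ["Family history of RETT SYNDROME in sibling."],
}

candidates = find_candidates(notes, compile_patterns(KEYWORDS))

# Suppose chart review confirms only p1 as a Rett syndrome case.
ppv, sensitivity, f_score = evaluate(candidates["Rett syndrome"], {"p1"})

# Share of all-keyword candidates confirmed on chart review, from the abstract.
confirmed_share = 2068 / 6095
```

In the toy data, the note for p4 mentions Rett syndrome only as family history, so plain keyword matching flags a patient who chart review would reject. False positives of this kind lower PPV while leaving sensitivity unchanged, which is why the sketch scores keyword hits against a chart-review reference set rather than treating them as confirmed cases.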