We report the diagnostic validity of a selection algorithm for identifying epilepsy cases. This was a retrospective validation study of International Classification of Diseases 10th Revision Australian Modification (ICD-10AM)-coded hospital records and pharmaceutical data, sampled from 300 consecutive potential epilepsy-coded cases and 300 randomly chosen cases without epilepsy from 3/7/2012 to 10/7/2013. Two epilepsy specialists independently validated the diagnosis of epilepsy. A multivariable logistic regression model was fitted to identify the optimum coding algorithm for epilepsy and was internally validated. One hundred fifty-eight of 300 (52.6%) epilepsy-coded records and 0 of 300 (0%) nonepilepsy records were confirmed to have epilepsy. The kappa for interrater agreement was 0.89 (95% CI=0.81-0.97). The model using epilepsy (G40), status epilepticus (G41), and ≥1 antiepileptic drug (AED) gave the highest positive predictive value, 81.4% (95% CI=73.1-87.9), with a specificity of 99.9% (95% CI=99.9-100.0). The area under the receiver operating characteristic curve was 0.90 (95% CI=0.88-0.93). Combining coded hospital records with pharmaceutical data considerably improved the precision of epilepsy case identification for data linkage designs and could allow efficient and reasonably accurate case ascertainment in epidemiological studies.
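
The validity measures named above all derive from a 2x2 table of algorithm result against specialist-confirmed diagnosis. The following is a minimal Python sketch of that arithmetic, not the authors' analysis code. The `TwoByTwo` class and the `crude` variable are hypothetical names introduced for illustration. The counts assume that "epilepsy-coded" is treated as test positive and specialist validation as the reference standard, so only the crude coded-versus-confirmed comparison is reproduced (158 confirmed of 300 coded, 0 confirmed of 300 non-coded). Because the two groups were sampled in fixed numbers, the sensitivity and specificity computed here describe the sample only and are not the model-based figures reported above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of algorithm result vs. reference diagnosis (hypothetical helper)."""
    tp: int  # flagged by the algorithm, epilepsy confirmed
    fp: int  # flagged by the algorithm, epilepsy not confirmed
    fn: int  # not flagged, epilepsy confirmed
    tn: int  # not flagged, epilepsy not confirmed

    def ppv(self) -> float:
        # Positive predictive value: confirmed cases among flagged records.
        return self.tp / (self.tp + self.fp)

    def sensitivity(self) -> float:
        # Confirmed cases that the algorithm flagged.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Non-cases that the algorithm left unflagged.
        return self.tn / (self.tn + self.fp)


# Assumption: crude classification by epilepsy coding alone, using the
# sample counts stated in the abstract (158/300 coded and 0/300 non-coded
# records confirmed by the specialists).
crude = TwoByTwo(tp=158, fp=300 - 158, fn=0, tn=300)

if __name__ == "__main__":
    print(f"PPV:         {crude.ppv():.3f}")
    print(f"Sensitivity: {crude.sensitivity():.3f}")
    print(f"Specificity: {crude.specificity():.3f}")
```

The crude PPV of just over one half makes plain why coding alone is a weak case definition and why an additional criterion, such as a dispensed AED, is needed to raise precision.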