Window Classifiers and Conditional Random Fields for Medical Report De-Identification.
IberLEF@SEPLN 2019, pp. 744-754. Citations: 1
Author(s)
Viviana Cotik, Franco M. Luque, and Juan Manuel Pérez
Abstract
Information extraction from medical reports is key to the timely discovery of findings and helps improve decisions about medical treatments and budgets. Developing information extraction methods requires that medical data be available. Since this data is extremely sensitive due to the presence of personal information, report de-identification is needed. We present two methods, a window classifier and an implementation of conditional random fields (CRF), to de-identify personal information in Spanish medical records provided by the MEDDOCAN challenge. CRF obtained the best results, with an F1-measure of 0.897 for named entity recognition with exact match (subtask 1), and 0.930 and 0.940 for inexact match (subtask 2, strict and merged respectively).
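
As a rough illustration of the window idea mentioned in the abstract, the sketch below builds features for each token from the tokens around it. It is not the authors' implementation: the function name `window_features`, the padding symbol, the window size of 2, the choice of features, and the example sentence are all assumptions made for this example. Feature dictionaries of this kind could be passed either to a per-token classifier or to a CRF toolkit.

```python
from typing import Dict, List

PAD = "<PAD>"  # hypothetical padding symbol for positions outside the sentence


def window_features(tokens: List[str], i: int, size: int = 2) -> Dict[str, object]:
    """Build features for token i from the tokens within `size` positions of it."""
    feats: Dict[str, object] = {}
    for offset in range(-size, size + 1):
        j = i + offset
        # Use the padding symbol when the window runs past either end.
        tok = tokens[j] if 0 <= j < len(tokens) else PAD
        feats[f"w[{offset}]"] = tok.lower()          # lowercased word form
        feats[f"is_title[{offset}]"] = tok.istitle() # capitalized like a name
        feats[f"is_digit[{offset}]"] = tok.isdigit() # all digits, e.g. a day or ID
    return feats


# Invented example sentence (not taken from the MEDDOCAN data).
tokens = ["Paciente", "María", "García", "ingresó", "el", "12", "de", "mayo"]
features = [window_features(tokens, i) for i in range(len(tokens))]
```

Each token ends up with one small dictionary describing itself and its neighbours, so a classifier can use cues such as a capitalized word following "Paciente" when deciding whether a token is part of a name.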