One model per entity: using hundreds of machine learning models to recognize and normalize biomedical names in text

RANLP 2017 · Victor Bellon, Raul Rodriguez-Esteban

We explored a new approach to named entity recognition based on hundreds of machine learning models, each trained to distinguish a single entity, and showed its application to gene name identification (GNI). The rationale for our approach, which we named "one model per entity" (OMPE), was that increasing the number of models would make the learning task easier for each individual model. Our training strategy leveraged freely available database annotations instead of manually annotated corpora. Although its performance in our proof of concept was disappointing, we believe there is enough room for improvement for such approaches to reach competitive performance while eliminating the cost of creating manually annotated training corpora.
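
To make the "one model per entity" idea concrete, here is a minimal sketch under stated assumptions, not the authors' implementation. It trains one small binary classifier per entity, using sentences annotated with that entity as positives and sentences annotated with other entities as negatives. The perceptron learner, the bag-of-words features, the toy `ANNOTATIONS` list, the identifier format, and the names `EntityModel`, `train_models`, and `identify` are all hypothetical and chosen only for illustration; the abstract does not specify the learner or the features.

```python
# Illustrative sketch only: learner, features, and data are assumptions.

# Hypothetical stand-in for database annotations: (text, entity identifier).
ANNOTATIONS = [
    ("tp53 mutations impair apoptosis in tumor cells", "GENE:TP53"),
    ("p53 loss disrupts apoptosis after dna damage", "GENE:TP53"),
    ("brca1 is required for homologous recombination repair", "GENE:BRCA1"),
    ("brca1 variants raise breast cancer risk", "GENE:BRCA1"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer, returned as a set (bag of words)."""
    return set(text.lower().split())


class EntityModel:
    """Binary perceptron answering: does this text mention my entity?"""

    def __init__(self, entity_id):
        self.entity_id = entity_id
        self.weights = {}
        self.bias = 0.0

    def score(self, tokens):
        return self.bias + sum(self.weights.get(t, 0.0) for t in tokens)

    def train(self, examples, epochs=10):
        # examples: list of (token set, label), with label +1 or -1.
        for _ in range(epochs):
            for tokens, label in examples:
                if label * self.score(tokens) <= 0:  # misclassified: update
                    for t in tokens:
                        self.weights[t] = self.weights.get(t, 0.0) + label
                    self.bias += label


def train_models(annotations):
    """Train one binary model per distinct entity in the annotations."""
    models = {}
    for entity_id in sorted({eid for _, eid in annotations}):
        examples = [
            (tokenize(text), 1 if eid == entity_id else -1)
            for text, eid in annotations
        ]
        model = EntityModel(entity_id)
        model.train(examples)
        models[entity_id] = model
    return models


def identify(models, text):
    """Return identifiers of every entity whose model scores the text > 0."""
    tokens = tokenize(text)
    return [eid for eid, m in sorted(models.items()) if m.score(tokens) > 0]


models = train_models(ANNOTATIONS)
print(identify(models, "p53 regulates apoptosis"))
print(identify(models, "brca1 supports recombination repair"))
```

Because each model only has to separate its own entity from everything else, recognition and normalization collapse into one step in this sketch: a positive score from a model directly yields that model's identifier. With hundreds of entities the same loop would simply produce hundreds of entries in `models`.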
