A novel method based on symbolic regression for interpretable semantic similarity measurement

Authors Jorge Martínez Gil
Jose M. Chaves-Gonzalez
Title A novel method based on symbolic regression for interpretable semantic similarity measurement
Type article
Journal Expert Systems with Applications
Volume 160
DOI 10.1016/j.eswa.2020.113663
Month December
Year 2020
SCCH ID# 20066
Abstract

Automatically measuring semantic similarity between textual expressions is a challenging problem: it consists of calculating the degree of likeness, according to human judgment, between two text fragments that have few or no features in common. In recent times, several machine learning methods have established a new state of the art in terms of accuracy, but little or no attention has been paid to their interpretability, i.e., the extent to which an end user can understand the cause of the output of these approaches. Although solutions based on symbolic regression already exist in the field of clustering, we propose here a new approach that is able to reach high levels of interpretability without sacrificing accuracy in the context of semantic textual similarity. After a complete empirical evaluation using several benchmark datasets, it is shown that our approach yields promising results in a wide range of scenarios.