A novel method based on symbolic regression for interpretable semantic similarity measurement
Jorge Martínez Gil
Jose M. Chaves-Gonzalez
|Title||A novel method based on symbolic regression for interpretable semantic similarity measurement|
|Journal||Expert Systems with Applications|
The problem of automatically measuring the degree of semantic similarity between textual expressions is a challenge that consists of calculating the degree of likeness between two text fragments that have none or few features in common according to human judgment. In recent times, several machine learning methods have been able to establish a new state-of-the-art regarding the accuracy, but none or little attention has been paid to their interpretability, i.e. the extent to which an end-user could be able to understand the cause of the output from these approaches. Although such solutions based on symbolic regression already exist in the field of clustering, we propose here a new approach which is being able to reach high levels of interpretability without sacrificing accuracy in the context of semantic textual similarity. After a complete empirical evaluation using several benchmark datasets, it is shown that our approach yields promising results in a wide range of scenarios.