Statistical analysis of ML-based paraphrase detectors with lexical similarity metrics

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Paraphrase detection has several important applications in natural language processing. Examples of such applications include language translation, text summarization, question answering, plagiarism detection, and online information retrieval. A number of metrics have been proposed in the literature to quantify the textual similarity between two sentences. However, the accuracy of utilizing each similarity metric alone in detecting paraphrases is very low. Though some machine learning (ML) techniques have been deployed for paraphrase detection, there is no known study that intensively benchmarks their performance on this problem under similar conditions. In this paper, we evaluate the utility of integrating five lexical similarity metrics with three standard machine learning paradigms to detect paraphrases. We apply statistical tests to compare and benchmark the relative significance of the adopted ML-based paraphrase detectors on different datasets.

Original language: English
Title of host publication: ICISA 2014 - 2014 5th International Conference on Information Science and Applications
Publisher: IEEE Computer Society
ISBN (Print): 9781479944439
State: Published - 2014

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
