Paraphrasing (computational linguistics): Difference between revisions

Mubassarj (talk | contribs)
Tag: Reverted
m Reverted edits by Mubassarj (talk): addition of unnecessary/inappropriate external links (HG) (3.4.10)
Line 11:
* finding pairings between such patterns that represent paraphrases, e.g. "{{mvar|X}} (injured/wounded) {{mvar|Y}} people, {{mvar|Z}} seriously" and "{{mvar|Y}} were (wounded/hurt) by {{mvar|X}}, among them {{mvar|Z}} were in serious condition"
 
This is achieved by first clustering similar sentences together using [[n-gram]] overlap. Recurring patterns are found within clusters by using multi-sequence alignment. Then the positions of argument words are determined by finding areas of high variability within each cluster, i.e. the regions between words shared by more than 50% of a cluster's sentences. Pairings between patterns are then found by comparing similar variable words between different corpora. Finally, new paraphrases can be generated by choosing a matching cluster for a source sentence, then substituting the source sentence's arguments into any number of patterns in the cluster.
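
The first steps of the procedure above can be illustrated with a minimal sketch. It is not the published method: it assumes whitespace tokenisation, a greedy single-pass clustering against each cluster's first sentence, and a simple word-frequency threshold in place of multi-sequence alignment. The function names (<code>cluster</code>, <code>backbone_words</code>, <code>to_pattern</code>), the overlap threshold, and the sample sentences are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def ngrams(tokens, n=1):
    """Return the set of word n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap(a, b, n=1):
    """Jaccard overlap between the n-gram sets of two token lists."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)

def cluster(sentences, threshold=0.3, n=1):
    """Greedy clustering: a sentence joins the first cluster whose
    first (seed) sentence overlaps with it by at least `threshold`."""
    clusters = []
    for sent in sentences:
        tokens = sent.lower().split()
        for c in clusters:
            if overlap(tokens, c[0], n) >= threshold:
                c.append(tokens)
                break
        else:
            clusters.append([tokens])
    return clusters

def backbone_words(cluster_tokens, share=0.5):
    """Words that occur in more than `share` of the cluster's sentences."""
    counts = Counter(w for tokens in cluster_tokens for w in set(tokens))
    return {w for w, c in counts.items() if c > share * len(cluster_tokens)}

def to_pattern(tokens, backbone):
    """Replace each maximal run of non-backbone words with a numbered slot."""
    pattern, slot, in_slot = [], 0, False
    for w in tokens:
        if w in backbone:
            pattern.append(w)
            in_slot = False
        elif not in_slot:
            slot += 1
            pattern.append(f"SLOT{slot}")
            in_slot = True
    return pattern

# Illustrative (made-up) input sentences.
sentences = [
    "a bomb injured 12 people in baghdad",
    "a blast injured 30 people in kabul",
    "a rocket injured 7 people in basra",
    "stocks rose sharply on monday",
]
clusters = cluster(sentences)
backbone = backbone_words(clusters[0])
print(to_pattern(clusters[0][0], backbone))
```

In this sketch the slots play the role of the argument positions {{mvar|X}}, {{mvar|Y}} and {{mvar|Z}}. Pairing patterns across corpora and generating new paraphrases by substituting arguments are not covered.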
 
=== Phrase-based Machine Translation ===