Name: Similarity, Version: 0.1
Description: An extension component that calculates a similarity score between two strings. A score of 0.0 means the two strings are completely dissimilar, and 1.0 means they are identical. Anything in between indicates how similar the two strings are.
All the blocks :
Instructions of extension :
feature : String
target : String
algorithm : property
### Algorithms
1. JaroSimilarity
2. JaroWinklerSimilarity
3. LevenshteinDistance
4. DiceCoefficient
Return score
### Algorithms (property)
In computer science and statistics, the Jaro–Winkler distance is a string metric measuring an edit distance between two sequences. It is a variant proposed in 1990 by William E. Winkler of the Jaro distance metric (1989, Matthew A. Jaro).
The Jaro–Winkler distance uses a prefix scale p which gives more favourable ratings to strings that match from the beginning for a set prefix length ℓ...
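To make the description above concrete, here is a minimal Python sketch of Jaro and Jaro–Winkler similarity. It is not the extension's source code. The function names, the default prefix scale `p = 0.1` and the prefix cap of 4 characters are assumptions taken from the commonly cited form of the metric, and the extension may differ in details.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Jaro similarity in [0.0, 1.0]; 1.0 means the strings are equal."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters count as matching only if they are within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break

    if matches == 0:
        return 0.0

    # Walk both strings in order and count matched characters that disagree.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (matches / len1
            + matches / len2
            + (matches - transpositions) / matches) / 3.0


def jaro_winkler_similarity(s1: str, s2: str,
                            p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix (p and max_prefix are assumed defaults)."""
    jaro = jaro_similarity(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return jaro + prefix * p * (1.0 - jaro)


# Usage with the pair of names from the demo in this post.
print(jaro_similarity("McMahons", "McDonald's"))
print(jaro_winkler_similarity("McMahons", "McDonald's"))
```

Because both names start with "Mc", the Jaro–Winkler score for this pair should come out at least as high as the plain Jaro score.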
In information theory, linguistics, and computer science, the Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. It is named after the Soviet mathematician Vladimir Levenshtein, who considered this distance in 1965.
Levenshtein distance may also be referred to as edit di...
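As an illustration of the edit-distance idea above, here is a small Python sketch. The distance function follows the standard dynamic-programming definition. The `levenshtein_similarity` helper, which divides by the longer string's length to get a 0.0 to 1.0 score, is only an assumption about how a distance could be turned into a score. The post does not say how the extension normalizes it.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a

    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from the first i characters of a to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Hypothetical normalization: 1.0 for equal strings, 0.0 for nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


print(levenshtein_distance("kitten", "sitting"))  # 3
```

Changing "kitten" to "sitting" takes two substitutions and one insertion, so the distance is 3.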
The Sørensen–Dice coefficient (see below for other names) is a statistic used to gauge the similarity of two samples. It was independently developed by the botanists Thorvald Sørensen and Lee Raymond Dice, who published in 1948 and 1945 respectively.
The index is known by several other names, especially Sørensen–Dice index, Sørensen index and Dice's coefficient. Other variations include the "similarity coefficient" or "index", such as Dice similarity coefficient (DSC). Common alternate spellings...
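For strings, the Dice coefficient is usually computed over character bigrams (pairs of adjacent characters). The Python sketch below assumes that approach and counts repeated bigrams as a multiset. Both choices are assumptions, since the post does not describe how the extension tokenizes its input.

```python
from collections import Counter


def bigrams(text: str) -> Counter:
    """Multiset of adjacent character pairs, e.g. "night" -> ni, ig, gh, ht."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def dice_coefficient(a: str, b: str) -> float:
    """2 * |shared bigrams| / (|bigrams of a| + |bigrams of b|)."""
    if a == b:
        return 1.0
    grams_a, grams_b = bigrams(a), bigrams(b)
    total = sum(grams_a.values()) + sum(grams_b.values())
    if total == 0:
        # Both strings are shorter than two characters and not equal.
        return 0.0
    shared = sum((grams_a & grams_b).values())  # multiset intersection
    return 2.0 * shared / total


print(dice_coefficient("night", "nacht"))  # 0.25
```

"night" and "nacht" have four bigrams each and share only "ht", which gives 2 * 1 / 8 = 0.25.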
Error return and source
DEMO BLOCKS :
In this demo I am trying to find the similarity between McMahons and McDonald’s using the Jaro algorithm.
I hope I explained the benefit of the extension
Download AIX :
com.aemo.similarity.aix (13.7 KB)
Download AIA :
similarity.aia (15.9 KB)
If you like my work, please support me
Shreyaa:
Nice extension
Ammaraldewani:
Amazing, thanks
Thanx @Shreyaa & @Ammaraldewani