string-similarity

Finds degree of similarity between two strings, based on Dice's Coefficient, which is mostly better than Levenshtein distance.

Table of Contents


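  - Usage
    - For Node.js
    - For browser apps
  - API
    - compareTwoStrings(string1, string2)
      - Arguments
      - Returns
      - Examples
    - findBestMatch(mainString, targetStrings)
      - Arguments
      - Returns
      - Examples
  - Release Notes
    - 2.0.0
    - 3.0.0
    - 3.0.1
    - 4.0.1
    - 4.0.2
    - 4.0.3
    - 4.0.4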

Usage


For Node.js


Install using:

``` sh
npm install string-similarity --save
```

In your code:

``` js
var stringSimilarity = require("string-similarity");

var similarity = stringSimilarity.compareTwoStrings("healed", "sealed");

var matches = stringSimilarity.findBestMatch("healed", [
  "edward",
  "sealed",
  "theatre",
]);
```

For browser apps


Include the package's UMD build with a `<script>` tag to get the latest version.

Or pin the script URL to a specific version (4.0.1 in this case).

This exposes a global variable called stringSimilarity which you can start using.

```
<script>
  stringSimilarity.compareTwoStrings('what!', 'who?');
</script>
```

(The package is exposed as a UMD build, so it can also be consumed through AMD or CommonJS loaders.)

API


The package contains two methods:

compareTwoStrings(string1, string2)


Returns a fraction between 0 and 1, which indicates the degree of similarity between the two strings. 0 indicates completely different strings, 1 indicates identical strings. The comparison is case-sensitive.
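Because the comparison is case-sensitive, callers who want a case-insensitive rating need to normalise both strings first. The sketch below shows one way to do that. `compareIgnoringCase` and `exactMatch` are hypothetical names introduced for illustration and are not part of the package; `exactMatch` is only a stand-in so the example runs on its own.

```js
// Hypothetical helper, not part of the package. `compare` is any function with
// the same (string1, string2) => number shape as compareTwoStrings.
function compareIgnoringCase(compare, string1, string2) {
  return compare(string1.toLowerCase(), string2.toLowerCase());
}

// Stand-in comparator used only to keep this example self-contained.
const exactMatch = (a, b) => (a === b ? 1 : 0);

console.log(exactMatch("Healed", "healed"));                      // 0
console.log(compareIgnoringCase(exactMatch, "Healed", "healed")); // 1
```

With the package loaded, `stringSimilarity.compareTwoStrings` could be passed in place of `exactMatch`.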

Arguments

1. string1 (string): The first string
2. string2 (string): The second string

Order does not make a difference.

Returns

(number): A fraction from 0 to 1, both inclusive. A higher number indicates more similarity.

Examples

``` js
stringSimilarity.compareTwoStrings("healed", "sealed");
// → 0.8

stringSimilarity.compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "For sale: table in very good condition, olive green in colour."
);
// → 0.6060606060606061

stringSimilarity.compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "For sale: green Subaru Impreza, 210,000 miles"
);
// → 0.2558139534883721

stringSimilarity.compareTwoStrings(
  "Olive-green table for sale, in extremely good condition.",
  "Wanted: mountain bike with at least 21 gears."
);
// → 0.1411764705882353
```
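
To show where a rating such as 0.8 for "healed" and "sealed" can come from, here is a minimal sketch of Dice's coefficient over character bigrams (pairs of adjacent characters). It is an illustration under stated assumptions, not the package's source: `diceSketch` is a hypothetical name, and the whitespace stripping is assumed from the 3.0.0 release note about disregarding spaces.

```js
// Illustrative sketch only: `diceSketch` is a hypothetical name, not the package's API.
function diceSketch(first, second) {
  // Assumption: whitespace is ignored before comparing.
  const a = first.replace(/\s+/g, "");
  const b = second.replace(/\s+/g, "");

  if (a === b) return 1;                      // identical (including both empty)
  if (a.length < 2 || b.length < 2) return 0; // too short to form a bigram

  // Count every bigram in the first string.
  const counts = new Map();
  for (let i = 0; i < a.length - 1; i++) {
    const bigram = a.substring(i, i + 2);
    counts.set(bigram, (counts.get(bigram) || 0) + 1);
  }

  // Walk the second string's bigrams, consuming one matching count each time.
  let shared = 0;
  for (let i = 0; i < b.length - 1; i++) {
    const bigram = b.substring(i, i + 2);
    const remaining = counts.get(bigram) || 0;
    if (remaining > 0) {
      counts.set(bigram, remaining - 1);
      shared++;
    }
  }

  // Dice's coefficient: 2 * shared / (bigrams in a + bigrams in b).
  return (2 * shared) / (a.length - 1 + (b.length - 1));
}

console.log(diceSketch("healed", "sealed")); // 0.8
```

"healed" and "sealed" each have five bigrams and share four of them (ea, al, le, ed), so the result is 2 × 4 / (5 + 5) = 0.8. Each string is scanned once, so the work grows linearly with input length.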

findBestMatch(mainString, targetStrings)


Compares mainString against each string in targetStrings.

Arguments

1. mainString (string): The string to match each target string against.
2. targetStrings (Array): Each string in this array will be matched against the main string.

Returns

(Object): An object with a ratings property, which gives a similarity rating for each target string, a bestMatch property, which specifies which target string was most similar to the main string, and a bestMatchIndex property, which specifies the index of the bestMatch in the targetStrings array.

Examples

``` js
stringSimilarity.findBestMatch('Olive-green table for sale, in extremely good condition.', [
  'For sale: green Subaru Impreza, 210,000 miles',
  'For sale: table in very good condition, olive green in colour.',
  'Wanted: mountain bike with at least 21 gears.'
]);
// →
{ ratings:
   [ { target: 'For sale: green Subaru Impreza, 210,000 miles',
       rating: 0.2558139534883721 },
     { target: 'For sale: table in very good condition, olive green in colour.',
       rating: 0.6060606060606061 },
     { target: 'Wanted: mountain bike with at least 21 gears.',
       rating: 0.1411764705882353 } ],
  bestMatch:
   { target: 'For sale: table in very good condition, olive green in colour.',
     rating: 0.6060606060606061 },
  bestMatchIndex: 1
}
```
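
The returned object can be understood as the result of rating every target and remembering the highest one. The sketch below builds the same shape from any pairwise comparator. It is a hedged illustration, not the package's implementation: `findBestMatchSketch` and `positionalOverlap` are hypothetical names, the tie-breaking rule (first highest rating wins) is an assumption, and `positionalOverlap` is a stand-in comparator used only so the example runs on its own.

```js
// Illustrative sketch only; names are hypothetical and not part of the package.
// `compare` is any function (a, b) => number in [0, 1], such as compareTwoStrings.
// Assumes `targetStrings` is a non-empty array of strings.
function findBestMatchSketch(compare, mainString, targetStrings) {
  const ratings = targetStrings.map((target) => ({
    target,
    rating: compare(mainString, target),
  }));

  // Track the index of the highest rating; the first one wins on ties (an assumption).
  let bestMatchIndex = 0;
  for (let i = 1; i < ratings.length; i++) {
    if (ratings[i].rating > ratings[bestMatchIndex].rating) bestMatchIndex = i;
  }

  return { ratings, bestMatch: ratings[bestMatchIndex], bestMatchIndex };
}

// Stand-in comparator for the demo: fraction of positions holding equal characters.
function positionalOverlap(a, b) {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 1;
  let same = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] === b[i]) same++;
  }
  return same / longest;
}

const result = findBestMatchSketch(positionalOverlap, "healed", [
  "edward",
  "sealed",
  "theatre",
]);
console.log(result.bestMatch.target); // sealed
console.log(result.bestMatchIndex);   // 1
```

A caller might also treat a best match whose rating falls below some cutoff as no match at all; the right cutoff depends on the application.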

Release Notes


2.0.0


- Removed production dependencies
- Updated to ES6 (this breaks backward-compatibility for pre-ES6 apps)

3.0.0


- Performance improvement for compareTwoStrings(..): now O(n) instead of O(n^2)
- The algorithm has been tweaked slightly to disregard spaces and word boundaries. This will change the rating values slightly but not enough to make a significant difference
- Added a bestMatchIndex to the results for findBestMatch(..) to point to the best match in the supplied targetStrings array

3.0.1


- Refactoring: removed unused functions; used substring instead of substr
- Updated dependencies

4.0.1


- Distributing as a UMD build to be used in browsers.

4.0.2


- Update dependencies to latest versions.

4.0.3


- Make compatible with IE and ES5. Also, update deps. (see PR56)

4.0.4


- Simplify some conditional statements. Also, update deps. (see PR50)
