Super Expressive


Super Expressive is a JavaScript library that allows you to build regular expressions in almost natural language - with no extra dependencies, and a lightweight code footprint (less than 4kb with minification + gzip!).


- Why
- API

  - .sticky
  - .unicode
  - .anyChar
  - .digit
  - .nonDigit
  - .word
  - .nonWord
  - .newline
  - .tab
  - .nullByte
  - .anyOf
  - .capture
  - .group
  - .end()
  - .optional
  - .char(c)

Why?


Regular expressions are a very powerful tool, but their terse and cryptic vocabulary can make them a challenge to construct and to communicate to others. Even developers who understand them well can have trouble reading their own back just a few months later! In addition, they can't be easily created and manipulated in a programmatic way, which closes off an entire avenue of dynamic text processing.

That's where Super Expressive comes in. It provides a programmatic and human readable way to create regular expressions. Its API uses the fluent builder pattern, and is completely immutable. It's built to be discoverable and predictable:

- properties and methods describe what they do in plain English
- order matters! quantifiers are specified before the thing they change, just like in English (e.g. SuperExpressive().exactly(5).digit)
- if you make a mistake, you'll know how to fix it. SuperExpressive will guide you towards a fix if your expression is invalid
- subexpressions can be used to create meaningful, reusable components
- includes an index.d.ts file for full TypeScript support
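
To make the "fluent, immutable, quantifier first" idea concrete, here is a minimal sketch of the pattern in plain JavaScript. `TinyBuilder` and everything inside it are hypothetical names invented for illustration; this is not how Super Expressive is implemented, only one simple way such an API could behave.

```javascript
// Hypothetical sketch only: TinyBuilder is NOT part of Super Expressive.
class TinyBuilder {
  constructor(parts = [], pending = null) {
    this.parts = parts;     // regex source fragments collected so far
    this.pending = pending; // quantifier waiting for the next element
    Object.freeze(this);    // instances cannot be reassigned after creation
  }

  // Append a fragment (with any pending quantifier) and return a NEW builder.
  add(fragment) {
    const quantified = this.pending ? fragment + this.pending : fragment;
    return new TinyBuilder([...this.parts, quantified], null);
  }

  // Quantifiers come first and are remembered until the next element.
  exactly(n) { return new TinyBuilder(this.parts, `{${n}}`); }
  get optional() { return new TinyBuilder(this.parts, '?'); }

  // Elements are exposed as properties, so chains read like a sentence.
  get digit() { return this.add('\\d'); }
  get word() { return this.add('\\w'); }

  toRegex() { return new RegExp(this.parts.join('')); }
}

const base = new TinyBuilder().exactly(5).digit;
const extended = base.optional.word; // base itself is left untouched

console.log(base.toRegex());     // /\d{5}/
console.log(extended.toRegex()); // /\d{5}\w?/
```

Because every step returns a fresh object, a partially built expression such as `base` can be reused as the starting point for several different patterns.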

SuperExpressive turns those complex and unwieldy regexes that appear in code reviews into something that can be read, understood, and properly reviewed by your peers - and maintained by anyone!

Installation and Usage


```
npm i super-expressive
```

```JavaScript
const SuperExpressive = require('super-expressive');

// Or as an ES6 module
import SuperExpressive from 'super-expressive';
```

Example


The following example recognises and captures the value of a 16-bit hexadecimal number like 0xC0D3.

```javascript
const SuperExpressive = require('super-expressive');

const myRegex = SuperExpressive()
  .startOfInput
  .optional.string('0x')
  .capture
    .exactly(4).anyOf
      .range('A', 'F')
      .range('a', 'f')
      .range('0', '9')
    .end()
  .end()
  .endOfInput
  .toRegex();

// Produces the following regular expression:
/^(?:0x)?([A-Fa-f0-9]{4})$/
```
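
As a quick check of what that expression accepts, the sketch below applies the regular expression stated above to a few inputs using only the built-in `RegExp` API. The regex literal is copied from the example output; the sample inputs and variable names are illustrative assumptions.

```javascript
// Regex literal copied from the example's stated output.
const hexRegex = /^(?:0x)?([A-Fa-f0-9]{4})$/;

// exec() returns an array whose index 1 holds the first capture group,
// or null when the input does not match.
const withPrefix = hexRegex.exec('0xC0D3');
const withoutPrefix = hexRegex.exec('beef');
const tooLong = hexRegex.exec('0xC0D3F');

console.log(withPrefix[1]);    // "C0D3"
console.log(withoutPrefix[1]); // "beef"
console.log(tooLong);          // null
```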

Playground


You can experiment with SuperExpressive in the Super Expressive Playground by @nartc. This is a great way to build a regex description, and test it against various inputs.


Ports


Super Expressive has been ported to the following languages:

PHP


https://github.com/bassim/super-expressive-php by @bassim

Ruby


https://github.com/hiy/super-expressive-ruby by @hiy

API


SuperExpressive()


SuperExpressive()

Creates an instance of SuperExpressive.

.allowMultipleMatches


Uses the g flag on the regular expression, which indicates that it should match multiple values when run on a string.

Example
```JavaScript
SuperExpressive()
  .allowMultipleMatches
  .string('hello')
  .toRegex();
// ->
/hello/g
```
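
The effect of the g flag can be seen with the standard `String.prototype.match` method. This is a small sketch using native regex literals; the sample text is an assumption made for illustration.

```javascript
const greeting = 'hello world, hello again';

// With g, match() returns every occurrence.
const allMatches = greeting.match(/hello/g);
// Without g, match() returns only the first occurrence (plus its index).
const firstOnly = greeting.match(/hello/);

console.log(allMatches);      // [ 'hello', 'hello' ]
console.log(firstOnly.index); // 0
```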

.lineByLine


Uses the m flag on the regular expression, which indicates that it should treat the .startOfInput and .endOfInput markers as the start and end of lines.

Example
```JavaScript
SuperExpressive()
  .lineByLine
  .string('^hello$')
  .toRegex();
// ->
/\^hello\$/m
```
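
To see what the m flag changes, the sketch below compares native regexes with and without it. It assumes that a pattern built from .startOfInput, .string('hello') and .endOfInput compiles to `/^hello$/`; the multi-line input is made up for illustration.

```javascript
const threeLines = 'say\nhello\nbye';

// Without m, ^ and $ only match at the start and end of the whole input.
const wholeInput = /^hello$/.test(threeLines);
// With m, ^ and $ also match at the start and end of each line.
const perLine = /^hello$/m.test(threeLines);

console.log(wholeInput); // false
console.log(perLine);    // true
```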

.caseInsensitive


Uses the i flag on the regular expression, which indicates that it should ignore the uppercase/lowercase distinction when matching.

Example
```JavaScript
SuperExpressive()
  .caseInsensitive
  .string('HELLO')
  .toRegex();
// ->
/HELLO/i
```

.sticky


Uses the y flag on the regular expression, which indicates that it should create a stateful regular expression that can be resumed from the last match.

Example
```JavaScript
SuperExpressive()
  .sticky
  .string('hello')
  .toRegex();
// ->
/hello/y
```
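
"Stateful" here refers to the standard `lastIndex` property of a sticky `RegExp`: each match must begin exactly where the previous one ended. A minimal sketch with a native regex literal and a made-up input:

```javascript
const stickyHello = /hello/y;
const input = 'hellohello world';

const first = stickyHello.test(input);      // matches at index 0
const afterFirst = stickyHello.lastIndex;   // resumes from 5
const second = stickyHello.test(input);     // matches at index 5
const afterSecond = stickyHello.lastIndex;  // resumes from 10
const third = stickyHello.test(input);      // ' world' is at index 10: no match
const afterThird = stickyHello.lastIndex;   // reset to 0 after a failure

console.log(first, afterFirst);   // true 5
console.log(second, afterSecond); // true 10
console.log(third, afterThird);   // false 0
```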

.unicode


Uses the u flag on the regular expression, which indicates that it should use full unicode matching.

Example
```JavaScript
SuperExpressive()
  .unicode
  .string('héllo')
  .toRegex();
// ->
/héllo/u
```
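
One visible consequence of the u flag is how characters outside the Basic Multilingual Plane are counted. The sketch below uses native regex literals and an emoji chosen for illustration; it is not taken from the library's own examples.

```javascript
// U+1F600 is stored as two UTF-16 code units in a JavaScript string.
const emoji = '\u{1F600}';

// Without u, "." matches a single code unit, so one "." cannot span the emoji.
const withoutU = /^.$/.test(emoji);
// With u, "." matches a whole code point.
const withU = /^.$/u.test(emoji);

console.log(emoji.length); // 2
console.log(withoutU);     // false
console.log(withU);        // true
```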

.singleLine


Uses the s flag on the regular expression, which indicates that the input should be treated as a single line, where the .startOfInput and .endOfInput markers explicitly mark the start and end of input, and .anyChar also matches newlines.

Example
```JavaScript
SuperExpressive()
  .singleLine
  .string('hello')
  .anyChar
  .string('world')
  .toRegex();
// ->
/hello.world/s
```
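
The difference the s flag makes to .anyChar can be checked directly with the regex shown above. The two-line input is an assumption chosen for illustration.

```javascript
const twoLines = 'hello\nworld';

// With s, "." also matches the newline between the two words.
const dotAll = /hello.world/s.test(twoLines);
// Without s, "." stops at newlines, so the same pattern fails.
const plain = /hello.world/.test(twoLines);

console.log(dotAll); // true
console.log(plain);  // false
```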

.anyChar


Matches any single character. When combined with .singleLine, it also matches newlines.

Example
```JavaScript
SuperExpressive()
  .anyChar
  .toRegex();
// ->
/./
```

.whitespaceChar


Matches any whitespace character, including the special whitespace characters: \r\n\t\f\v.

Example
```JavaScript
SuperExpressive()
  .whitespaceChar
  .toRegex();
// ->
/\s/
```

.nonWhitespaceChar


Matches any non-whitespace character. The special whitespace characters \r\n\t\f\v are excluded as well.

Example
```JavaScript
SuperExpressive()
  .nonWhitespaceChar
  .toRegex();
// ->
/\S/
```

.digit


Matches any digit from 0-9.

Example
```JavaScript
SuperExpressive()
  .digit
  .toRegex();
// ->
/\d/
```

.nonDigit


Matches any non-digit.

Example
```JavaScript
SuperExpressive()
  .nonDigit
  .toRegex();
// ->
/\D/
```

.word


Matches any alphanumeric character (a-z, A-Z, 0-9), as well as _.

Example
```JavaScript
SuperExpressive()
  .word
  .toRegex();
// ->
/\w/
```

.nonWord


Matches any character that is not alphanumeric (a-z, A-Z, 0-9) and is not _.

Example
```JavaScript
SuperExpressive()
  .nonWord
  .toRegex();
// ->
/\W/
```

.wordBoundary


Matches (without consuming any characters) immediately between a character matched by .word and a character not matched by .word (in either order).

Example
```JavaScript
SuperExpressive()
  .digit
  .wordBoundary
  .toRegex();
// ->
/\d\b/
```

.nonWordBoundary


Matches (without consuming any characters) at any position that is not a word boundary, for example between two characters matched by .word.

Example
```JavaScript
SuperExpressive()
  .digit
  .nonWordBoundary
  .toRegex();
// ->
/\d\B/
```
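
The two boundary assertions pick out different digits in the same input. The sketch below runs the regexes `/\d\b/` and `/\d\B/` from the .wordBoundary and .nonWordBoundary examples against a made-up string.

```javascript
const sample = 'abc123 def';

// \d\b: a digit immediately followed by a word boundary (the "3" before the space).
const atBoundary = /\d\b/.exec(sample);
// \d\B: a digit immediately followed by another word character (the "1" before "2").
const insideWord = /\d\B/.exec(sample);

console.log(atBoundary[0], atBoundary.index); // 3 5
console.log(insideWord[0], insideWord.index); // 1 3
```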

.newline


Matches a \n character.

Example
```JavaScript
SuperExpressive()
  .newline
  .toRegex();
// ->
/\n/
```

.carriageReturn


Matches a \r character.

Example
```JavaScript
SuperExpressive()
  .carriageReturn
  .toRegex();
// ->
/\r/
```

.tab


Matches a \t character.

Example
```JavaScript
SuperExpressive()
  .tab
  .toRegex();
// ->
/\t/
```

.nullByte


Matches a \u0000 character (ASCII 0).

Example
```JavaScript
SuperExpressive()
  .nullByte
  .toRegex();
// ->
/\0/
```

.anyOf


Matches a choice between specified elements. Needs to be finalised with .end().

Example
```JavaScript
SuperExpressive()
  .anyOf
    .range('a', 'f')
    .range('0', '9')
    .string('XXX')
  .end()
  .toRegex();
// ->
/(?:XXX|[a-f0-9])/
```

.capture


Creates a capture group for the following elements. Needs to be finalised with .end(). Can be later referenced with backreference(index).

Example
```JavaScript
SuperExpressive()
  .capture
    .range('a', 'f')
    .range('0', '9')
    .string('XXX')
  .end()
  .toRegex();
// ->
/([a-f][0-9]XXX)/
```

.namedCapture(name)


Creates a named capture group for the following elements. Needs to be finalised with .end(). Can be later referenced with namedBackreference(name) or backreference(index).

Example
```JavaScript
SuperExpressive()
  .namedCapture('interestingStuff')
    .range('a', 'f')
    .range('0', '9')
    .string('XXX')
  .end()
  .toRegex();
// ->
/(?<interestingStuff>[a-f][0-9]XXX)/
```

.namedBackreference(name)


Matches exactly what was previously matched by a namedCapture.

Example
```JavaScript
SuperExpressive()
  .namedCapture('interestingStuff')
    .range('a', 'f')
    .range('0', '9')
    .string('XXX')
  .end()
  .string('something else')
  .namedBackreference('interestingStuff')
  .toRegex();
// ->
/(?<interestingStuff>[a-f][0-9]XXX)something else\k<interestingStuff>/
```
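
A named backreference requires the same text to appear again, not merely text of the same shape. The sketch below assumes the expression above compiles to the standard JavaScript named-group syntax, `(?<interestingStuff>…)` and `\k<interestingStuff>`; the inputs are made up for illustration.

```javascript
const repeated =
  /(?<interestingStuff>[a-f][0-9]XXX)something else\k<interestingStuff>/;

// The captured text "a1XXX" appears again after "something else": match.
const same = repeated.exec('a1XXXsomething elsea1XXX');
// "b2XXX" has the right shape but is not the captured text: no match.
const different = repeated.exec('a1XXXsomething elseb2XXX');

console.log(same.groups.interestingStuff); // "a1XXX"
console.log(different);                    // null
```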

.backreference(index)


Matches exactly what was previously matched by a capture or namedCapture using a positional index. Note regex indexes start at 1, so the first capture group has index 1.

Example
```JavaScript
SuperExpressive()
  .capture
    .range('a', 'f')
    .range('0', '9')
    .string('XXX')
  .end()
  .string('something else')
  .backreference(1)
  .toRegex();
// ->
/([a-f][0-9]XXX)something else\1/
```

.group


Creates a non-capturing group of the following elements. Needs to be finalised with .end().

Example
```JavaScript
SuperExpressive()
  .optional.group
    .range('a', 'f')
    .range('0', '9')
    .string('XXX')
  .end()
  .toRegex();
// ->
/(?:[a-f][0-9]XXX)?/
```

.end()


Signifies the end of a SuperExpressive grouping, such as .anyOf, .group, or .capture.

Example
```JavaScript
SuperExpressive()
  .capture
    .anyOf
      .range('a', 'f')
      .range('0', '9')
      .string('XXX')
    .end()
  .end()
  .toRegex();
// ->
/((?:XXX|[a-f0-9]))/
```

.assertAhead


Assert that the following elements are found, without consuming them. Needs to be finalised with .end().

Example
```JavaScript
SuperExpressive()
  .assertAhead
    .range('a', 'f')
  .end()
  .range('a', 'z')
  .toRegex();
// ->
/(?=[a-f])[a-z]/
```

.assertNotAhead


Assert that the following elements are not found, without consuming them. Needs to be finalised with .end().

Example
```JavaScript
SuperExpressive()
  .assertNotAhead
    .range('a', 'f')
  .end()
  .range('g', 'z')
  .toRegex();
// ->
/(?![a-f])[g-z]/
```
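
"Without consuming" means the lookahead only checks what comes next; the matched text comes entirely from the element after it. A short sketch using the two regexes from the .assertAhead and .assertNotAhead examples, with made-up inputs:

```javascript
// Positive lookahead: the next character must be a-f, then [a-z] consumes it.
const aheadRegex = /(?=[a-f])[a-z]/;
const hit = aheadRegex.exec('xyzc');
const miss = aheadRegex.exec('xyz');

// Negative lookahead: the next character must NOT be a-f, then [g-z] consumes it.
const notAheadRegex = /(?![a-f])[g-z]/;
const skipped = notAheadRegex.exec('abcx');

console.log(hit[0], hit.index);         // c 3
console.log(miss);                      // null
console.log(skipped[0], skipped.index); // x 3
```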

.assertBehind


Assert that the elements contained within are found immediately before this point in the string. Needs to be finalised with .end().

Example
```JavaScript
SuperExpressive()
  .assertBehind
    .string('hello ')
  .end()
  .string('world')
  .toRegex();
// ->
/(?<=hello )world/
```

.assertNotBehind


Assert that the elements contained within are not found immediately before this point in the string. Needs to be finalised with .end().

Example
```JavaScript
SuperExpressive()
  .assertNotBehind
    .string('hello ')
  .end()
  .string('world')
  .toRegex();
// ->
/(?<!hello )world/
```
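
The two lookbehind assertions are mirror images of each other, which is easiest to see side by side. The sketch assumes they compile to the standard JavaScript lookbehind syntax, `(?<=…)` and `(?<!…)`; the inputs are made up for illustration.

```javascript
const afterHello = /(?<=hello )world/;
const notAfterHello = /(?<!hello )world/;

// The lookbehind is checked but not included in the matched text.
const found = afterHello.exec('hello world');
console.log(found[0], found.index); // world 6

console.log(afterHello.test('goodbye world'));    // false
console.log(notAfterHello.test('hello world'));   // false
console.log(notAfterHello.test('goodbye world')); // true
```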

.optional


Assert that the following element may or may not be matched.

Example
```JavaScript
SuperExpressive()
  .optional.digit
  .toRegex();
// ->
/\d?/
```

.zeroOrMore


Assert that the following element may not be matched, or may be matched multiple times.

Example
```JavaScript
SuperExpressive()
  .zeroOrMore.digit
  .toRegex();
// ->
/\d*/
```

.zeroOrMoreLazy


Assert that the following element may not be matched, or may be matched multiple times, but as few times as possible.

Example
```JavaScript
SuperExpressive()
  .zeroOrMoreLazy.digit
  .toRegex();
// ->
/\d*?/
```

.oneOrMore


Assert that the following element may be matched once, or may be matched multiple times.

Example
```JavaScript
SuperExpressive()
  .oneOrMore.digit
  .toRegex();
// ->
/\d+/
```

.oneOrMoreLazy


Assert that the following element may be matched once, or may be matched multiple times, but as few times as possible.

Example
```JavaScript
SuperExpressive()
  .oneOrMoreLazy.digit
  .toRegex();
// ->
/\d+?/
```
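
"As few times as possible" is easiest to see by running the greedy and lazy forms against the same input. This sketch uses the regexes from the .zeroOrMore, .zeroOrMoreLazy, .oneOrMore and .oneOrMoreLazy examples with a made-up string of digits.

```javascript
const digits = '12345';

// Greedy quantifiers take as many digits as they can.
const greedyPlus = digits.match(/\d+/)[0];
const greedyStar = digits.match(/\d*/)[0];

// Lazy quantifiers stop as soon as the minimum is satisfied.
const lazyPlus = digits.match(/\d+?/)[0];
const lazyStar = digits.match(/\d*?/)[0];

console.log(greedyPlus); // "12345"
console.log(greedyStar); // "12345"
console.log(lazyPlus);   // "1"
console.log(lazyStar);   // "" (zero digits is already enough)
```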

.exactly(n)


Assert that the following element will be matched exactly n times.

Example
```JavaScript
SuperExpressive()
  .exactly(5).digit
  .toRegex();
// ->
/\d{5}/
```

.atLeast(n)


Assert that the following element will be matched at least n times.

Example
```JavaScript
SuperExpressive()
  .atLeast(5).digit
  .toRegex();
// ->
/\d{5,}/
```

.between(x, y)


Assert that the following element will be matched somewhere between x and y times.

Example
```JavaScript
SuperExpressive()
  .between(3, 5).digit
  .toRegex();
// ->
/\d{3,5}/
```

.betweenLazy(x, y)


Assert that the following element will be matched somewhere between x and y times, but as few times as possible.

Example
```JavaScript
SuperExpressive()
  .betweenLazy(3, 5).digit
  .toRegex();
// ->
/\d{3,5}?/
```
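
The counted quantifiers can be compared on a single input. The sketch below runs the regexes from the .exactly, .atLeast, .between and .betweenLazy examples against made-up digit strings.

```javascript
const sevenDigits = '1234567';

const exactlyFive = sevenDigits.match(/\d{5}/)[0];     // first five digits
const atLeastFive = sevenDigits.match(/\d{5,}/)[0];    // all seven digits
const threeToFive = sevenDigits.match(/\d{3,5}/)[0];   // greedy: takes five
const threeToFiveLazy = sevenDigits.match(/\d{3,5}?/)[0]; // lazy: takes three
const tooShort = '1234'.match(/\d{5}/);                // fewer than five digits

console.log(exactlyFive);     // "12345"
console.log(atLeastFive);     // "1234567"
console.log(threeToFive);     // "12345"
console.log(threeToFiveLazy); // "123"
console.log(tooShort);        // null
```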

.startOfInput


Assert the start of input, or the start of a line when .lineByLine is used.

Example
```JavaScript
SuperExpressive()
  .startOfInput
  .string('hello')
  .toRegex();
// ->
/^hello/
```

.endOfInput


Assert the end of input, or the end of a line when .lineByLine is used.

Example
```JavaScript
SuperExpressive()
  .string('hello')
  .endOfInput
  .toRegex();
// ->
/hello$/
```

.anyOfChars(chars)


Matches any of the characters in the provided string chars.

Example
```JavaScript
SuperExpressive()
  .anyOfChars('aeiou')
  .toRegex();
// ->
/[aeiou]/
```

.anythingButChars(chars)


Matches any character, except any of those in the provided string chars.

Example
```JavaScript
SuperExpressive()
  .anythingButChars('aeiou')
  .toRegex();
// ->
/[^aeiou]/
```

.anythingButString(str)


Matches any string of the same length as str in which no character equals the character at the same position in str.

Example
```JavaScript
SuperExpressive()
  .anythingButString('aeiou')
  .toRegex();
// ->
/(?:[^a][^e][^i][^o][^u])/
```
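
Because the generated pattern excludes one character per position, it rejects more than just the exact string. The sketch below applies the regex from the example to a few five-character inputs chosen for illustration.

```javascript
const notAeiou = /(?:[^a][^e][^i][^o][^u])/;

// No position coincides with "aeiou": match.
const world = notAeiou.test('world');
// Identical to the excluded string: no match.
const aeiou = notAeiou.test('aeiou');
// Different string, but its second character is "e", which position 2 forbids.
const hello = notAeiou.test('hello');

console.log(world); // true
console.log(aeiou); // false
console.log(hello); // false
```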

.anythingButRange(a, b)


Matches any character, except those that would be captured by the .range specified by a and b.

Example
```JavaScript
SuperExpressive()
  .anythingButRange(0, 9)
  .toRegex();
// ->
/[^0-9]/
```

.string(s)


Matches the exact string s.

Example
```JavaScript
SuperExpressive()
  .string('hello')
  .toRegex();
// ->
/hello/
```

.char(c)


Matches the exact character c.

Example
```JavaScript
SuperExpressive()
  .char('x')
  .toRegex();
// ->
/x/
```

.range(a, b)


Matches any character that falls between a and b. Ordering is defined by a character's ASCII or Unicode value.

Example
```JavaScript
SuperExpressive()
  .range('a', 'z')
  .toRegex();
// ->
/[a-z]/
```

.subexpression(expr, opts?)


- opts.namespace: A string namespace to use on all named capture groups in the subexpression, to avoid naming collisions with your own named groups (default = '')
- opts.ignoreFlags: If set to true, any flags this subexpression specifies should be disregarded (default = true)
- opts.ignoreStartAndEnd: If set to true, any startOfInput/endOfInput assertions in this subexpression should be disregarded (default = true)

Matches another SuperExpressive instance inline. Can be used to create libraries, or to modularise your code. By default, flags and start/end of input markers are ignored, but can be explicitly turned on in the options object.

Example
```JavaScript
// A reusable SuperExpressive...
const fiveDigits = SuperExpressive().exactly(5).digit;

SuperExpressive()
  .oneOrMore.range('a', 'z')
  .atLeast(3).anyChar
  .subexpression(fiveDigits)
  .toRegex();
// ->
/[a-z]+.{3,}\d{5}/
```


.toRegexString()


Outputs a string representation of the regular expression that this SuperExpressive instance models.

Example
```JavaScript
SuperExpressive()
  .allowMultipleMatches
  .lineByLine
  .startOfInput
  .optional.string('0x')
  .capture
    .exactly(4).anyOf
      .range('A', 'F')
      .range('a', 'f')
      .range('0', '9')
    .end()
  .end()
  .endOfInput
  .toRegexString();
// ->
"/^(?:0x)?([A-Fa-f0-9]{4})$/gm"
```

.toRegex()


Outputs the regular expression that this SuperExpressive instance models.

Example
```JavaScript
SuperExpressive()
  .allowMultipleMatches
  .lineByLine
  .startOfInput
  .optional.string('0x')
  .capture
    .exactly(4).anyOf
      .range('A', 'F')
      .range('a', 'f')
      .range('0', '9')
    .end()
  .end()
  .endOfInput
  .toRegex();
// ->
/^(?:0x)?([A-Fa-f0-9]{4})$/gm
```