How to Write Regular Expressions to Code with JavaScript

By Chris Minnick, Eva Holland

Before you can make use of a regular expression in JavaScript, you need to create an object containing the expression. You can write regular expressions in one of two ways:

  • By using a regular expression literal

  • Through the constructor function of the RegExp object

Using the RegExp object

When you create a regular expression by calling the RegExp constructor function, the expression is compiled at run time, when the constructor is called, rather than when the script is loaded. Use the RegExp constructor function when you don’t know the value of the regular expression at the time you’re writing the script.

For example, you may be asking the user to input a regular expression, or you may be getting the regular expression from an external source or calculating some part of the regular expression when the script runs.
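As a minimal sketch of that idea, the following snippet compiles a pattern that is only known at run time. The helper name compilePattern and the sample pattern are assumptions made for illustration. The one fact it relies on is that the RegExp constructor throws a SyntaxError when it's given an invalid pattern, which matters when the pattern comes from a user.

```javascript
// Hypothetical helper: compile a pattern that is only known at run time.
// Returns null instead of throwing when the pattern is not valid.
function compilePattern(source, flags) {
  try {
    return new RegExp(source, flags);
  } catch (err) {
    // new RegExp throws a SyntaxError for invalid patterns or flags.
    return null;
  }
}

// Pretend this string was typed into a form field by the user.
const fromUser = compilePattern("[aeiou]", "g");
console.log("regular expression".match(fromUser).length); // 7

// An unbalanced parenthesis is not a valid pattern.
console.log(compilePattern("(", "g")); // null
```

Returning null is just one possible design choice. A real page might show the error message to the user instead.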

This program creates a regular expression using a random letter and then asks the user to type a sentence. When the user submits the form, the program calculates how many instances of the random letter were in the user-submitted text.

<html>
<head>
 <title>Letter Counting Game</title>
 <script>
 window.addEventListener('load', loader, false);
 // Get a random lowercase letter (character codes 97 through 122 are a to z).
 var letter = String.fromCharCode(97 + Math.floor(Math.random() * 26));
 /* Create a regular expression using the letter. Set the g option to find all occurrences. */
 var re = new RegExp(letter, 'g');
 function loader(e){
  document.getElementById("getText").addEventListener('submit', countLetter, false);
 }
 function countLetter(e){
  e.preventDefault();
  document.getElementById("results").innerHTML = "The secret letter was " + letter + ".";
  var userText = document.getElementById("userWords").value;
  // match() returns an array of matches, or null when there are none.
  var matches = userText.match(re);
  var count = matches ? matches.length : 0;
  document.getElementById("results").innerHTML += " You typed the secret letter " + count + " times.";
 }
 </script>
</head>
<body>
 <form id="getText">
 <p>I’m thinking of a letter! Type a sentence, and then I’ll tell you how many times your sentence uses my secret letter!</p>
 <input type="text" name="userWords" id="userWords">
 <input type="submit" name="submit">
 </form>
 <div id="results"></div>
</body>
</html>

This is the result of running the preceding program in a web browser.

The Letter Counting Game result.
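If you want to try the counting logic without a web page, here is a rough sketch of the same idea as a plain function. The name countLetterIn is an assumption made for this example and isn't part of the program above. It shows two details that are easy to miss: match returns null rather than an empty array when nothing matches, and the match is case-sensitive, so uppercase letters aren't counted.

```javascript
// Hypothetical helper that mirrors countLetter() without touching the page.
function countLetterIn(letter, text) {
  // The g flag asks for every occurrence, not just the first one.
  const re = new RegExp(letter, "g");
  // match() returns an array of matches, or null when there are none.
  const matches = text.match(re);
  return matches ? matches.length : 0;
}

console.log(countLetterIn("e", "The quick brown fox jumps over the lazy dog")); // 3
console.log(countLetterIn("z", "Hello")); // 0
console.log(countLetterIn("t", "The Test")); // 1, because "T" is not "t"
```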

Regular expression literals

To create a regular expression literal, you enclose the value of the regular expression between slashes instead of quotes.

For example:

var myRegularExpression = /JavaScript/;

Regular expression literals are compiled by the browser when the script is loaded and remain constant through the life of the script. As a result, regular expression literals offer better performance for expressions that don’t change.

The preceding example uses a regular expression to look for an exact match of the string JavaScript. A regular expression containing a string of characters to be matched exactly is called a simple pattern.
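To see a simple pattern at work, you could try something like the following sketch. The sample sentences and variable names are made up for illustration. The test method returns true when the pattern is found anywhere in the string.

```javascript
// A simple pattern: the characters must appear exactly as written.
const simplePattern = /JavaScript/;

console.log(simplePattern.test("I am learning JavaScript.")); // true
console.log(simplePattern.test("I am learning javascript.")); // false

// The same pattern built with the constructor, for comparison.
const sameFromConstructor = new RegExp("JavaScript");
console.log(sameFromConstructor.source === simplePattern.source); // true
```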

In a real application or program, you’ll want to account for users who use some variation on the correct spelling. For example, a user may input any of the following words and clearly mean JavaScript:

  • javascript

  • Javascript

  • java script

  • JS

  • js

There may even be more exotic variations. One of the wonderful and frustrating things about dealing with input from real live people is that you never know for sure what they’re going to do! In order to be able to detect variations in capitalization and spelling, you can use more sophisticated regular expressions to look for patterns or sets of characters, rather than just literal strings.

The following is a revised regular expression that will match JavaScript as well as Javascript or javascript:

var myRegularExpression = /[Jj]ava[Ss]cript/;

Things are starting to look a little foreign, but if you understand the meaning of the different characters, you’ll see that this is actually still pretty simple. The square brackets in a regular expression define a character set and will match any one of the characters within that set. By writing [Jj], what you’re saying is that either a capital or lowercase j will match.
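Here is a quick sketch that runs the revised expression against the variations listed earlier. The variable names are assumptions made for this example. Notice that the character sets handle the capitalization differences, but java script, JS, and js still don't match.

```javascript
const revisedPattern = /[Jj]ava[Ss]cript/;

// The spelling variations listed earlier, plus the correct spelling.
const spellings = ["JavaScript", "Javascript", "javascript", "java script", "JS", "js"];

// Record whether each spelling matches the pattern.
const matchResults = spellings.map((word) => revisedPattern.test(word));

spellings.forEach((word, i) => {
  console.log(word + ": " + matchResults[i]);
});
// JavaScript: true
// Javascript: true
// javascript: true
// java script: false
// JS: false
// js: false
```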

Testing regular expressions

Sometimes when you’re writing regular expressions, it’s helpful to have an easy way to test an expression to make sure that it’s actually doing what you want. A number of websites and tools can help you test your regular expressions.

regex101.com is one such site. To use regex101.com, type your regular expression in the box at the top of the screen and type some text in the box underneath it. The site checks the text against your regular expression and highlights the matches that are found.

This shows regex101.com using our example regular expression to test against a question about JavaScript.

Using regex101.com to test a regular expression.
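You can also do a rough version of this kind of check in your own code. The following sketch is not how regex101.com works internally. It's just one way, using the standard matchAll method, to list each match and where it starts. The helper name findMatches and the sample question are assumptions made for illustration.

```javascript
// Hypothetical helper: list every match of a pattern along with its position.
function findMatches(re, text) {
  // matchAll() requires the g flag, so add it if the pattern lacks it.
  const flags = re.flags.includes("g") ? re.flags : re.flags + "g";
  const globalRe = new RegExp(re.source, flags);
  return Array.from(text.matchAll(globalRe), (m) => ({
    text: m[0],     // the matched characters
    index: m.index  // where the match starts in the string
  }));
}

const found = findMatches(/[Jj]ava[Ss]cript/, "Is javascript the same as JavaScript?");
console.log(found);
// [ { text: "javascript", index: 3 }, { text: "JavaScript", index: 26 } ]
```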

Special characters in regular expressions

Regular expressions make it possible for you to look for numbers, letters, groups of letters, repetitions of characters, and much more in strings.

To create complex search patterns, you can use the regular expression special characters. The most commonly used special characters are listed here.

Special Character    Meaning
\                    Designates whether the next character should be treated as a special character or as a literal character. If the following character is a special character, the backslash designates that it should be treated literally.
^                    Finds the beginning of the input.
$                    Finds the end of the input.
*                    Finds the preceding character 0 or more times.
+                    Finds the preceding character 1 or more times.
?                    Finds the preceding character 0 or 1 time.
.                    Finds any single character except the newline character.
x|y                  Finds either x or y.
{n}                  Finds exactly n occurrences of the preceding character.
[xyz]                Finds any one of the characters in the brackets.
[^xyz]               Finds any one character other than the ones in the brackets.
[\b]                 Finds a backspace.
\b                   Finds a word boundary.
\B                   Finds a nonword boundary.
\d                   Finds a digit character.
\D                   Finds any nondigit character.
\n                   Finds a line feed.
\s                   Finds a single white space character, including space, tab, form feed, and line feed.
\S                   Finds a single nonwhite-space character.
\t                   Finds a tab.
\w                   Finds any alphanumeric character, including an underscore.
\W                   Finds any nonword character.
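
The following sketch tries out a handful of these special characters. All the sample strings and variable names are made up for illustration, and the last pattern is only one possible way to accept the spelling variations of JavaScript mentioned earlier, not a definitive one. Note that when a pattern is written as a string for the RegExp constructor, each backslash has to be doubled.

```javascript
// ^ and $ anchor the match; \d{5} requires exactly five digits.
const fiveDigits = /^\d{5}$/;
console.log(fiveDigits.test("90210")); // true
console.log(fiveDigits.test("9021"));  // false

// ? makes the preceding character optional.
console.log(/colou?r/.test("color"));  // true
console.log(/colou?r/.test("colour")); // true

// + repeats the preceding item; the g flag collects every match.
console.log("a1 b22 c333".match(/\d+/g)); // ["1", "22", "333"]

// \s matches spaces and tabs alike.
console.log("one  two\tthree".split(/\s+/)); // ["one", "two", "three"]

// \b keeps "cat" from matching inside a longer word.
console.log(/\bcat\b/.test("concatenate")); // false
console.log(/\bcat\b/.test("the cat sat")); // true

// A backslash before a special character makes it literal.
console.log(/3\.14/.test("3.14")); // true
console.log(/3\.14/.test("3x14")); // false

// In a string passed to the constructor, write \\d to get \d.
const digitsFromString = new RegExp("\\d+", "g");
console.log("a1 b22".match(digitsFromString)); // ["1", "22"]

// Hypothetical pattern for the spelling variations listed earlier:
// an optional space inside the word, or just the two initials.
const looseJavaScript = /^[Jj]ava ?[Ss]cript$|^[Jj][Ss]$/;
const variations = ["JavaScript", "Javascript", "javascript", "java script", "JS", "js"];
console.log(variations.every((v) => looseJavaScript.test(v))); // true
console.log(looseJavaScript.test("Java")); // false
```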