Understanding RegEx in JavaScript

When I first started learning RegEx, I thought it was the most cryptic thing ever and it seemed impossible to learn.  I think we’ve all heard the quote by Jamie Zawinski, “Some people, when confronted with a problem, think ‘I know, I’ll use regular expressions.’ Now they have two problems.”  I mean, what could (\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,6}) possibly mean?

Since then, I’ve learned to love them. Turns out, the above RegEx is pretty common and easy to understand.

There are many great RegEx visualizers online. My favorite is Regulex. Below is the diagram Regulex produces for the RegEx above.

[Image: Regulex diagram of the RegEx above]

This makes it pretty clear that it is a RegEx for an email address. There are also many RegEx cheat sheets. My favorite is OverAPI.
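To see the pattern in action, here is a minimal sketch that runs it against a few strings. The sample strings and variable names are made up for illustration, and the pattern is far from a complete email validator.

```javascript
// The email pattern from the introduction.
var emailRe = /(\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,6})/;

// Sample strings are illustrative only.
var plain = emailRe.exec('jane@example.com');
var inText = emailRe.exec('Contact jane@example.com today');
var dotted = emailRe.exec('first.last@example.com');
var noAt = emailRe.exec('not an email');

console.log(plain[0]);   // jane@example.com
console.log(inText[0]);  // jane@example.com
console.log(dotted[0]);  // last@example.com (\w does not match the dot)
console.log(noAt);       // null
```

The third result shows a limit of this pattern: because \w does not match a dot, only the part of the address after the dot is picked up.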

Here are a few fundamental aspects of RegEx that are good to be solid on:

Greedy vs nongreedy:

By default, regular expressions are greedy, meaning that they will match as much as possible.

console.log("12345".match(/\d+/));
// ["12345", index: 0, input: "12345"]

/\d+/ matches one or more consecutive digits. 12345 fits that description, and it happens to be the entire string. Nongreedy regular expressions match as little as possible. We can make a quantifier nongreedy by putting a ? after +, *, ?, or {}.

console.log("12345".match(/\d+?/));
// ["1", index: 0, input: "12345"]
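The difference is easier to see when the match could end in more than one place. Here is a small sketch using a made-up HTML snippet; the variable names are just for illustration.

```javascript
// Illustrative input with two possible closing > characters.
var html = '<b>bold</b>';

// Greedy: .+ runs to the last > it can reach.
var greedy = html.match(/<.+>/);

// Nongreedy: .+? stops at the first > it can reach.
var lazy = html.match(/<.+?>/);

console.log(greedy[0]); // <b>bold</b>
console.log(lazy[0]);   // <b>
```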



By default, a RegEx search starts at the first index of a string and attempts to find a match. If it finds one, it returns it. If it doesn’t, it moves to the next index and tries again. This continues until a match is found or the end of the string is reached. However, you can control where the match must occur using ^ (beginning of the string) and $ (end of the string), alone or in combination.

// Only match at the beginning of the string.
console.log("12345".match(/^\d+/));
// ["12345", index: 0, input: "12345"]

// Only match at the end of the string.
console.log("hello12345".match(/\d+$/));
// ["12345", index: 5, input: "hello12345"]

// Only match if the entire string matches.
console.log("12345".match(/^\d+$/));
// ["12345", index: 0, input: "12345"]
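Anchors matter most when a match should fail. The sketch below uses test, which returns true or false, to show when each anchored pattern rejects a string. isAllDigits is a hypothetical helper name and the inputs are made up.

```javascript
// Hypothetical helper: true only when the whole string is digits.
function isAllDigits(str) {
  return /^\d+$/.test(str);
}

var anywhere = /\d+/.test('abc123def');  // digits anywhere in the string
var atStart = /^\d+/.test('abc123def');  // digits must start the string
var atEnd = /\d+$/.test('abc123');       // digits must end the string

console.log(anywhere);              // true
console.log(atStart);               // false
console.log(atEnd);                 // true
console.log(isAllDigits('12345'));  // true
console.log(isAllDigits('123abc')); // false
```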



Positive Lookahead

/q(?=u)/ matches a q that is followed by a u without making u part of the match.

console.log("monkeyman".match(/monkey(?=man)/));
// ["monkey", index: 0, input: "monkeyman"]
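Going back to the q and u pattern, here is a short sketch of what "without making u part of the match" means in practice. The sample words are made up.

```javascript
var lookahead = /q(?=u)/;

var found = lookahead.exec('quit');
var missing = lookahead.exec('Iraq');
var replaced = 'quit'.replace(lookahead, 'Q');

console.log(found[0]);        // q (the u is checked but not consumed)
console.log(found[0].length); // 1
console.log(missing);         // null (this q has no u after it)
console.log(replaced);        // Quit (only the q was replaced)
```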


Negative Lookahead

/q(?!u)/ matches a q that is not followed by a u.

console.log("monkeybutt".match(/monkey(?!man)/));
// ["monkey", index: 0, input: "monkeybutt"]
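The same two sample words, run through the negative version, give the opposite results. Again, the inputs are made up for illustration.

```javascript
var notFollowedByU = /q(?!u)/;

var inIraq = notFollowedByU.exec('Iraq');
var inQuit = notFollowedByU.exec('quit');

console.log(inIraq[0], inIraq.index); // q 3
console.log(inQuit);                  // null (the only q is followed by a u)
```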



If you use a RegEx that contains subexpressions (grouped by parentheses) when using RegEx methods like exec and match, the subexpressions will also show up in the result array.

console.log(/(monkey)man/.exec("monkeyman"));
console.log("monkeyman".match(/(monkey)man/));
// both log ["monkeyman", "monkey", index: 0, input: "monkeyman"]
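With more than one subexpression, each one gets its own slot in the result array, in the order its opening parenthesis appears. A minimal sketch, using a made-up date string:

```javascript
// Three subexpressions: year, month, day.
var dateRe = /(\d{4})-(\d{2})-(\d{2})/;
var parts = dateRe.exec('Released on 2015-03-09.');

console.log(parts[0]);    // 2015-03-09 (the full match)
console.log(parts[1]);    // 2015
console.log(parts[2]);    // 03
console.log(parts[3]);    // 09
console.log(parts.index); // 12
```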

With replace we can use $n which inserts the nth parenthesized submatch string.

console.log('Aiken, Quinton'.replace(/(\w+), (\w+)/,"$2 $1"));
// Quinton Aiken

$1 references the first submatch and $2 references the second submatch.
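The same idea works with any number of submatches, and they can be used in any order. Here is a small sketch that rearranges a made-up date string:

```javascript
// $1 = year, $2 = month, $3 = day
var reordered = '2015-03-09'.replace(/(\d{4})-(\d{2})-(\d{2})/, '$3/$2/$1');

console.log(reordered); // 09/03/2015
```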

The global flag (g):

When used with replace, all occurrences of the pattern will be replaced.

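// Illustrative input: every digit is replaced, not just the first.
console.log("1abc2ac3ac".replace(/\d/g, "_"));
// _abc_ac_ac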

When used with match, all occurrences of the pattern will be returned in the results array.

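// Illustrative input: every digit is returned, not just the first.
console.log("6 apples, 5 pears, 7 plums".match(/\d/g));
// ["6", "5", "7"]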
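To make the effect of the flag concrete, here is a side-by-side sketch of the same patterns with and without g. The input string is made up.

```javascript
var text = 'a-b-c';

var replaceFirst = text.replace(/-/, '_');
var replaceAll = text.replace(/-/g, '_');

var matchFirst = text.match(/[a-z]/);
var matchAll = text.match(/[a-z]/g);

console.log(replaceFirst);                    // a_b-c (first occurrence only)
console.log(replaceAll);                      // a_b_c (every occurrence)
console.log(matchFirst[0], matchFirst.index); // a 0 (one match, with its index)
console.log(matchAll);                        // ["a", "b", "c"]
console.log(matchAll.index);                  // undefined (just the matched strings)
```

Note that the global match result is a plain list of strings: it has no index or input properties, and no submatches.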

When used with exec, things start to get slightly more confusing. All RegExp objects have a lastIndex property. exec, when called with the global flag, will update the lastIndex property of the RegEx to point to the index of the string immediately following the match.

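var Re = /y/g;

// "xyz" is an illustrative input; the y sits at index 1.
Re.exec("xyz");
console.log(Re.lastIndex);
// 2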

The next time that exec is called on the RegExp object, the search will start at its lastIndex property. exec returns null when it doesn’t find a match so we can utilize it nicely in a while loop.

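var input = "3 42 88",
    num = /\b\d+\b/g,
    match;

// Log each match along with the updated lastIndex.
while( match = num.exec(input) ) {
  console.log(match, num.lastIndex);
}
// ["3", index: 0, input: "3 42 88"] 1
// ["42", index: 2, input: "3 42 88"] 4
// ["88", index: 5, input: "3 42 88"] 7

console.log(num.lastIndex);
// 0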

During each iteration, num.lastIndex is updated and the next iteration’s pattern search begins at this index. Finally, when num.exec(input) returns null and we break out of the loop, num.lastIndex is reset back to 0.
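Because lastIndex lives on the RegExp object and not on the string, it carries over if the same global RegExp is reused on a different string. The sketch below, with made-up inputs, shows how a leftover lastIndex can make exec skip a match.

```javascript
var digits = /\d+/g;

var first = digits.exec('12 apples');
console.log(first[0], digits.lastIndex); // 12 2

// Same RegExp, different string: the search starts at index 2, past the 7.
var second = digits.exec('7 pears');
console.log(second, digits.lastIndex);   // null 0

// The failed search reset lastIndex to 0, so this time the 7 is found.
var third = digits.exec('7 pears');
console.log(third[0], digits.lastIndex); // 7 1
```

Setting lastIndex back to 0 by hand before searching a new string avoids the surprise.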

Coderbyte is a great site with free JavaScript coding challenges. One of them, “Results for Multiple Brackets”, is a great RegExp exercise. Here is a description of the problem:

Have the function MultipleBrackets(str) take the str parameter being passed and return 1 #ofBrackets if the brackets are correctly matched and each one is accounted for. Otherwise return 0. For example: if str is “(hello [world])(!)”, then the output should be 1 3 because all the brackets are matched and there are 3 pairs of brackets, but if str is “((hello [world])” then the output should be 0 because the brackets do not correctly match up. Only “(”, “)”, “[”, and “]” will be used as brackets. If str contains no brackets return 1.

I recommend heading on over to Coderbyte, creating an account and giving the exercise a go. My solution and explanation are below.

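function MultipleBrackets(str) {

  var count = 0,
  matches = [],
  // Matches a () or [] pair that has no other brackets inside it.
  Re = /\([^\(\[\]\)]*\)|\[[^\(\[\]\)]*\]/g;

  while( matches = Re.exec(str) ) {
    // Remove the innermost pair and count it.
    str = str.replace(matches[0],'');
    count++;
    // Start the next search from the beginning of the shortened string.
    Re.lastIndex = 0;
  }

  // Any bracket left over means the brackets did not match up.
  if( str.match(/[\(\)\[\]]/) )
    return '0';
  // Per the problem statement, a string with no brackets returns just 1.
  return count ? '1 ' + count : '1';
}
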

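Re matches an innermost pair of brackets: either a ( followed by any number of non-bracket characters and a ), or a [ followed by any number of non-bracket characters and a ].
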
During each iteration of the while loop, we start at the first index of the string by manually resetting Re.lastIndex. If we have a match, we replace it with an empty string. The degradation of "(hello [world])(!)" looks like this:

"(hello [world])(!)"
"(hello )(!)"
"(!)"
""

After this is complete, we do a final check to see if the degraded string contains a (, ), [, or ]. If it does, the brackets were not correctly matched. If it doesn’t, they were.
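The degradation can also be traced in code. The sketch below reuses the same pattern and loop, but records every intermediate string. degradationSteps is a hypothetical helper name, not part of the solution above.

```javascript
// Hypothetical helper: returns the string after each innermost pair is removed.
function degradationSteps(str) {
  var steps = [str],
      Re = /\([^\(\[\]\)]*\)|\[[^\(\[\]\)]*\]/g,
      matches;

  while ((matches = Re.exec(str))) {
    str = str.replace(matches[0], '');
    Re.lastIndex = 0; // search the shortened string from the start
    steps.push(str);
  }
  return steps;
}

var balanced = degradationSteps('(hello [world])(!)');
var unbalanced = degradationSteps('((hello [world])');

console.log(balanced);
// ["(hello [world])(!)", "(hello )(!)", "(!)", ""]
console.log(unbalanced);
// ["((hello [world])", "((hello )", "("]
```

The balanced string degrades all the way to an empty string, while the unbalanced one is left with a stray (, which is exactly what the final check looks for.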

For more information on RegEx, I recommend checking out Regular-Expressions.info and Eloquent JavaScript’s chapter on RegEx.

Leave a comment below and follow me on Twitter: @QuintonAiken.