JavaScript unique strings with RegExps



Author:

Gareth Heyes

@hackvertor

Published: Wed, 04 Feb 2009 10:19:01 GMT

Updated: Sat, 22 Mar 2025 15:38:12 GMT

I wrote a cool new feature for Hackvertor that finds the unique strings in some text. The function takes a regular expression as an argument, uses it to split the text into parts, and then tests each part against a dynamically built RegExp to check whether the string is unique or not.

I thought I'd run through the code as a change from my usual posts, which are mostly about hacking JavaScript.

<pre lang="javascript">
function unique(regexp, code) {
  code = code.split(regexp);
  var result = [];
  var found = ""; // alternation of every string seen so far
  for (var i = 0; i < code.length; i++) {
    // An empty pattern matches anything, so the first string is always kept.
    if (found == '' || !new RegExp(found).test(code[i])) {
      // Escape RegExp metacharacters in the string.
      var escaped = code[i].replace(/([\\.^$*+?{}\[\]\|\(\)!])/g, "\\$1");
      // Add word boundaries where the string starts or ends with a word character.
      if (/^[\d\w]/.test(escaped)) {
        escaped = '\\b' + escaped;
      }
      if (/[\d\w]$/.test(escaped)) {
        escaped = escaped + '\\b';
      }
      result.push(code[i]);
      if (found != '') {
        found += '|';
      }
      found += escaped;
    }
  }
  return result.join(",");
}

alert(unique(/\s+/, 'test test test test1 test1 test2 test2 test2 test3 test2 test3')); // test,test1,test2,test3
</pre>

I used RegExp.test because it just returns true or false and is much faster than match. The found string accumulates a dynamic RegExp: an alternation of all the strings matched so far. When a string starts or ends with a letter, digit or underscore, a \b word boundary is added on that side. This stops a string being matched as part of a longer one (test inside test1, for example) and allows for strings within quotes etc. I also escape any RegExp metacharacters found in the text, and return the unique strings joined by commas.
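To make the escaping and boundary step easier to see on its own, here is a minimal sketch that pulls it out of unique(). The helper names escapeToken and toPattern are hypothetical and only used for illustration; the escaping and \b logic is meant to mirror the function above, with \w standing in for [\d\w] since \w already covers digits.

```javascript
// Hypothetical helpers for illustration; not part of Hackvertor itself.
function escapeToken(token) {
  // Backslash-escape characters that have special meaning in a RegExp.
  return token.replace(/([\\.^$*+?{}\[\]\|\(\)!])/g, "\\$1");
}

function toPattern(token) {
  var escaped = escapeToken(token);
  // Add \b only on a side where the token starts or ends with a word character.
  if (/^\w/.test(escaped)) { escaped = "\\b" + escaped; }
  if (/\w$/.test(escaped)) { escaped = escaped + "\\b"; }
  return escaped;
}

console.log(toPattern("test"));   // \btest\b
console.log(toPattern("a.b"));    // \ba\.b\b
console.log(toPattern('"test"')); // "test"  (no \b next to the quotes)

// Without boundaries, "test" also matches inside "test1".
var loose = new RegExp(escapeToken("test"));
var strict = new RegExp(toPattern("test"));

console.log(loose.test("test1"));  // true
console.log(strict.test("test1")); // false
console.log(strict.test("test"));  // true

// test() yields a boolean; match() yields an array or null.
console.log(typeof strict.test("test")); // boolean
console.log("test1".match(strict));      // null

// The pattern is not anchored to the whole string, so a quoted "test"
// still contains a match for \btest\b.
console.log(strict.test('"test"')); // true
```

The last line suggests that ordering matters: as far as I can tell, a quoted "test" that appears after a bare test would be treated as already seen, while the reverse order keeps both.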

Here is the Hackvertor tag in action: Hackvertor unique tag
