Converting a string to slug with JavaScript

090503

Recently I've been working on implementing slugs in my CMS to be able to generate nicer URLs. In order to do so I've created a little JavaScript function that converts a string to a slug. I'll first give you the code and then explain it a bit. [update: 2.07.10 Fixed IE issue]

Note that you should not trust any JavaScript validation or processing. The submitted data should always be validated on the server. JavaScript validation or processing is there to provide an enhanced user experience; it is not a security measure.

function string_to_slug(str) {
  str = str.replace(/^\s+|\s+$/g, ''); // trim
  
  // remove accents, swap ñ for n, etc
  var from = "ÀÁÄÂÈÉËÊÌÍÏÎÒÓÖÔÙÚÜÛàáäâèéëêìíïîòóöôùúüûÑñÇç·/_,:;";
  var to   = "aaaaeeeeiiiioooouuuuaaaaeeeeiiiioooouuuunncc------";
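  for (var i=0, l=from.length ; i<l ; i++) {
    // split/join swaps every occurrence (replace with a string pattern only
    // swaps the first one); charAt also works in old IE, where from[i] does not
    str = str.split(from.charAt(i)).join(to.charAt(i));
  }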

  str = str.replace(/[^a-zA-Z0-9 -]/g, '') // remove invalid chars
    .replace(/\s+/g, '-') // collapse whitespace and replace by -
    .replace(/-+/g, '-') // collapse dashes
    .toLowerCase();
  return str;
}

Here's a step by step description:

  • The first thing we do is trim the string, that is, remove any whitespace at the beginning and end. The regular expression /^\s+|\s+$/g does exactly that:
    • / marks the start of the regular expression
    • ^\s+ means "one or more whitespace characters at the beginning of the string"
    • | means "or"
    • \s+$ means "one or more whitespace characters at the end of the string"
    • /g ends the regular expression, and sets the global flag (otherwise only one substitution would be performed)
  • We are going to remove any invalid characters, but first we'll replace any 'special' letters with their 'plain' versions. For example, in Spanish we have á, é and so on, and even though these are not valid characters in a slug, we don't want to simply remove them, so instead we replace them with a, e, etc. The JavaScript has nothing fancy here.

    Note that I also choose to replace ·/_,:; with dashes (the first dot is the middle dot, used for example in Catalan). I think this will generate better slugs than if we simply removed these characters.

    You might need/want to adjust this part of the function to suit your needs (your language might have other symbols that I haven't included here).

  • Now we're ready to remove any remaining invalid characters. The regular expression /[^a-zA-Z0-9 -]/g will match any character that is not a lowercase letter, an uppercase letter, a digit, a space or a dash. I won't explain this regexp in detail; this post is getting way too long! :) Do a search for "character classes" and you'll find plenty of info around.

    Note that we include spaces as a valid character. Don't worry, we'll get rid of them in the next step. We can't just remove them from the string, because we want to replace them with dashes. The reason why we didn't replace them in the previous step is that eliminating unwanted characters might join together two whitespace areas. For example, consider the string "One @ Two". If we replace spaces with dashes first and then eliminate unwanted characters, we'd get "One--Two". Those two dashes don't look pretty enough, so I choose to replace spaces after eliminating unwanted characters (see next step).

  • Now it's time to replace any spaces with dashes. But we'll collapse any whitespace as well, so multiple spaces will be converted to a single dash. The expression /\s+/g should be easy if you understood the one about trimming the string.
  • Almost there! The expression /-+/g matches any series of consecutive dashes (which may occur as a result of the previous substitutions), so we replace each series with a single dash.
  • Finally we call the JavaScript string method toLowerCase. Job done!
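
To see the steps above in action, here is a minimal sketch that records the string after each stage. The name slug_steps and the sample input are only illustrations, not part of the original function, and the accent loop uses split/join (an assumption on my side) so that every occurrence of a character is swapped.

```javascript
// Hypothetical helper: same pipeline as string_to_slug, but it keeps
// a copy of the string after every step so each stage can be inspected.
function slug_steps(str) {
  var steps = [];

  str = str.replace(/^\s+|\s+$/g, ''); // 1. trim
  steps.push(str);

  // 2. swap accented letters and some punctuation for plain equivalents
  var from = "ÀÁÄÂÈÉËÊÌÍÏÎÒÓÖÔÙÚÜÛàáäâèéëêìíïîòóöôùúüûÑñÇç·/_,:;";
  var to   = "aaaaeeeeiiiioooouuuuaaaaeeeeiiiioooouuuunncc------";
  for (var i = 0, l = from.length; i < l; i++) {
    str = str.split(from.charAt(i)).join(to.charAt(i));
  }
  steps.push(str);

  str = str.replace(/[^a-zA-Z0-9 -]/g, ''); // 3. remove invalid chars
  steps.push(str);

  str = str.replace(/\s+/g, '-'); // 4. each run of whitespace becomes one dash
  steps.push(str);

  str = str.replace(/-+/g, '-'); // 5. collapse consecutive dashes
  steps.push(str);

  str = str.toLowerCase(); // 6. lowercase
  steps.push(str);

  return steps;
}

var steps = slug_steps("  Año Nuevo: One @ Two  ");
// steps[0] "Año Nuevo: One @ Two"
// steps[1] "Ano Nuevo- One @ Two"
// steps[2] "Ano Nuevo- One  Two"   (two spaces left where the @ was)
// steps[3] "Ano-Nuevo--One-Two"
// steps[4] "Ano-Nuevo-One-Two"
// steps[5] "ano-nuevo-one-two"
```

Notice how removing the @ leaves two spaces side by side, and how the whitespace step turns them into a single dash.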

There's room for improvement. For instance, we could replace the & sign with "and", but that brings problems with multilanguage sites. One could detect the language being used and replace it with the appropriate word, but that seems a bit overkill to me... As it is, this should generate nice slugs in most cases.
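
If you did want to handle the & sign, one possible approach is a small lookup table keyed by language code. This is only a sketch: the names AMPERSAND_WORDS and replace_ampersand are made up for this example, the three languages are just samples, and how you find out the language of the text is left to you.

```javascript
// Hypothetical lookup table: language code -> word that stands in for "&".
var AMPERSAND_WORDS = { en: 'and', es: 'y', ca: 'i' };

// Hypothetical helper, meant to run BEFORE string_to_slug, so that the word
// survives the "remove invalid chars" step instead of the "&" being dropped.
function replace_ampersand(str, lang) {
  if (!Object.prototype.hasOwnProperty.call(AMPERSAND_WORDS, lang)) {
    return str; // unknown language: leave the string untouched
  }
  // Pad the word with spaces; the slug function turns them into dashes later.
  return str.replace(/\s*&\s*/g, ' ' + AMPERSAND_WORDS[lang] + ' ');
}

var english = replace_ampersand('Tom & Jerry', 'en'); // "Tom and Jerry"
var spanish = replace_ampersand('Tom&Jerry', 'es');   // "Tom y Jerry"
var unknown = replace_ampersand('Tom & Jerry', 'xx'); // "Tom & Jerry"
```

For an unknown language the string is returned as is, so the & simply gets removed by the slug function, which is the current behaviour anyway.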

6 comments to “Converting a string to slug with JavaScript”

  1. #01 By Tiago S., 090829 at 08:22

    Hi, Thanks for your post. Saved me a lot of time and work that I
    would have if I had to convert my similar function from ruby to js.
    http://snippets.dzone.com/posts/show/2384 Thanks again for sharing,
    Tiago

  2. #02 By David Prek, 091007 at 06:31

    Looks great except does not work in IE browsers.

  3. #03 By dense13, 091007 at 09:32

    You're right David, I'll have to look into that. The strange thing is, I remember it working in my CMS... I'll try to sort it out soon.

  4. #04 By Shane, 100610 at 01:48

    Awesome! Thank you. Saves me having to try get my head around more
    regex... ;)

  5. #05 By dense13, 100610 at 11:13

    Glad you find it useful Shane. But make sure you test it in IE (see comments 2 and 3). I still haven't gotten around to checking that out.

  6. #06 By dense13, 100702 at 13:39

    @David Prek, I managed to fix that, there was a problem with Regular expressions that was triggering an Out of memory message. In fact I don't know why I was using a regular expression in the loop, it's not necessary at all. I must have done it for a reason, but obviously it was not a _good_ reason. :)
