How to replace plain URLs with links? (C# and Javascript)

Idea: In the Suppo project we want to replace URLs in chat conversations with HTML links. The first step was, of course, to search Google for recommendations or ready-made regular expressions. BUT an answer on Stack Overflow (http://stackoverflow.com/questions/37684/how-to-replace-plain-urls-with-links) says:

“First off, rolling your own regexp to parse URLs is a terrible idea. You must imagine this is a common enough problem that someone has written, debugged and tested a library for it, according to the RFCs. URIs are complex …”

The answer then recommends several JavaScript libraries (expect similar libraries for C#) such as linkifyjs, anchorme.js and autolinker.js. OK, so where is the problem?

The problem for me is that all of these libraries are huge: more than 30KB minified.

Why?

Because we are developing a live chat window (Suppo) that is attached to every page of our clients' sites, and an extra 30KB of JavaScript is not acceptable when our whole solution is 63KB in total (including the core of the jQuery library).

So what next?

Back to the beginning: compose a simple regular expression. It does not need to work on every URL; handling most of them is enough.

C#:

public static string Linkify(this string text)
{
    // www|http|https|ftp|news|file
    text = Regex.Replace(
        text,
        @"((www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:@=.+?,##%&~-]*[^.|\'|\# |!|\(|?|,| |>|<|;|\)])",
        "<a href=\"$1\" target=\"_blank\">$1</a>",
        RegexOptions.IgnoreCase)
        .Replace("href=\"www", "href=\"http://www");

    // mailto
    text = Regex.Replace(
        text,
        @"(([a-zA-Z0-9_\-\.])+@[a-zA-Z_]+?(\.[a-zA-Z]{2,6})+)",
        "<a href=\"mailto:$1\">$1</a>",
        RegexOptions.IgnoreCase);

    return text;
}

Javascript:

// www|http|https|ftp|news|file
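text = text.replace(
    /((www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:@=.+?,##%&~-]*[^.|\'|\# |!|\(|?|,| |>|<|;|\)])/gim,
    '<a href="$1" target="_blank">$1</a>')
    // same fix-up as in the C# version: give bare www links a scheme
    .replace(/href="www/g, 'href="http://www');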

// mailto
text = text.replace(
    /(([a-zA-Z0-9\-\_\.])+@[a-zA-Z\_]+?(\.[a-zA-Z]{2,6})+)/gim,
    '<a href="mailto:$1">$1</a>');

It's simple: the first part of the code finds text starting with www|http|https|ftp|news|file and replaces it with <a href="http(s)://...">, and the second part finds email addresses and replaces them with <a href="mailto:...">.
That is all, and it's less than 1KB of code.
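
To see the two replacements working together, here is a minimal sketch that wraps them in one function and runs it on a few sample strings. The function name linkify (for the JavaScript side) and the sample inputs are assumptions made for this illustration, and the www fix-up is borrowed from the C# version above.

```javascript
// Hypothetical wrapper around the two replacements from this post.
function linkify(text) {
    // URLs starting with www. or a scheme (http, https, ftp, news, file)
    text = text.replace(
        /((www\.|(http|https|ftp|news|file)+\:\/\/)[_.a-z0-9-]+\.[a-z0-9\/_:@=.+?,##%&~-]*[^.|\'|\# |!|\(|?|,| |>|<|;|\)])/gim,
        '<a href="$1" target="_blank">$1</a>')
        // Mirrors the C# version: give bare www links an explicit scheme.
        .replace(/href="www/g, 'href="http://www');

    // Email addresses
    text = text.replace(
        /(([a-zA-Z0-9\-\_\.])+@[a-zA-Z\_]+?(\.[a-zA-Z]{2,6})+)/gim,
        '<a href="mailto:$1">$1</a>');

    return text;
}

// Sample inputs (made up for the demo)
console.log(linkify("Docs: http://example.com/page and www.example.com."));
console.log(linkify("Write to bob@example.com"));
console.log(linkify("example.com")); // no scheme and no www: left unchanged
```

Trailing punctuation such as the final dot in the first sample stays outside the link, and a bare domain like example.com is not touched at all, which is the kind of gap we accept in exchange for the small size.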
