Regular Expressions - Tutorial Part 8: Usage and examples

 


Type: Tutorial for Beginners
Category: Tools & Utilities
Language: English
Author: Stefan Trost Media
Date: 13.07.2011

 
 


About the author

Stefan Trost is a developer of software and web solutions and is also happy to take care of your needs and requests.


This is the last part of our tutorial on regular expressions. In this part, I would like to present some areas of application for regular expressions. If you have further questions or if something is unclear, feel free to write a comment so that I can expand this tutorial if necessary. So far, the following parts have been published:

Part 1: Basics | Part 2: Normal strings, grouping and repetitions | Part 3: Meta characters and combinations | Part 4: Selections of characters and alternatives | Part 5: Character groups and classes | Part 6: Reusing and backward references | Part 7: Modifiers | Part 8: Usage and Examples

Important Note

To try out the examples and to test your own regular expressions, you can use the software Text Converter in its free Basic version. The first part of this tutorial explains how to use the application.

An Internet address

Regular Expression:   [a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}

First of all, we want to find an Internet address with the help of a regular expression. An Internet address consists of a name of arbitrary length, followed by a dot and the domain ending. The ending can have two to four characters, and we assume that an Internet address consists of the letters a to z in lower or upper case and the digits 0 to 9. In addition, we would like to allow the dot and the hyphen in our domain. The box shows what a regular expression for this case looks like.

At the beginning, we have a group of characters in square brackets, which stands for the name. Inside square brackets, the dot is treated as a normal character, and the hyphen is treated as a normal character when it comes last, so we write both at the end of the group. With the + behind the group, we say that characters from this group can be repeated an arbitrary number of times. We do not use the asterisk instead of the plus because we want to have at least one character. After that group, we write "\.". That is the dot between the domain name and the domain ending. We have to escape this dot with \ because, outside of square brackets, the dot is a meta character in regular expressions. [a-zA-Z] is our domain ending, which can only consist of letters. Domain endings can have two (de) to four (info or mobi) characters, so we write {2,4} behind the ending. Longer endings such as museum also exist; to allow them, raise the upper limit in the curly brackets. With that, our regular expression is complete, and we can use it to find arbitrary domains (at least as long as the domain contains no umlauts, but of course, we can easily add them to the expression).
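If you would like to try the expression outside of Text Converter, the following is a minimal sketch in Python, assuming its standard re module. The sample text and the variable names are made up for this illustration and are not part of the tutorial.

```python
import re

# The domain expression from the box, unchanged.
DOMAIN = re.compile(r"[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}")

# Hypothetical sample text, invented for this example.
text = "Visit example.com or my-shop.info, but not localhost."

# findall returns every non-overlapping match as a string.
domains = DOMAIN.findall(text)
print(domains)  # ['example.com', 'my-shop.info']
```

"localhost." is not found because the final dot is not followed by the two to four letters that the expression requires for the ending.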

An e-mail address

\b[a-zA-Z0-9._+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b

In the next step, we want to search for an e-mail address with a regular expression. You can see the example in the box. The part of an e-mail address in front of the @ can consist of letters, digits, dots, underscores, plus signs and hyphens. After that, an @ and an Internet address follow. For the Internet address, we can reuse the first example, so we only have to write the characters allowed in front of the @ between [ and ]. Again, we write the special characters at the end of the group, with the hyphen last, and with the + behind the square brackets, we say that the characters from this group can be repeated an arbitrary number of times. The @ is not a meta character and can be used in regular expressions as it is. If we want our e-mail address to be a whole word, we can write \b in front of and behind the expression. \b stands for the position at the beginning or the end of a word.
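To see what the \b at both ends changes, here is a small sketch in Python, again assuming the standard re module. The sample addresses and the names EMAIL and EMAIL_NO_B are hypothetical and only serve this illustration.

```python
import re

# The e-mail expression from the box, with \b at both ends.
EMAIL = re.compile(r"\b[a-zA-Z0-9._+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b")
# The same expression without \b, for comparison only.
EMAIL_NO_B = re.compile(r"[a-zA-Z0-9._+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}")

# Hypothetical sample text, invented for this example.
text = "Write to john.doe+news@example.com or info@my-shop.de today."
emails = EMAIL.findall(text)
print(emails)  # ['john.doe+news@example.com', 'info@my-shop.de']

# An ending longer than four letters: with \b there is no match at all,
# without \b the expression stops in the middle of the word.
sample = "user@example.community"
with_b = EMAIL.search(sample)
without_b = EMAIL_NO_B.search(sample).group()
print(with_b)     # None
print(without_b)  # user@example.comm
```

So \b protects against cutting a match out of the middle of a longer word, at the price of rejecting addresses whose ending does not fit {2,4}.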

Link to an e-mail address

Search for:   (\b[a-zA-Z0-9._+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b)

Replace with: <a href="mailto:$1">$1</a>

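Now we want to turn every e-mail address in a text into a link. For this, we put the expression from the previous example in round brackets so that the found address is stored as the first group. A link to an e-mail address has the form <a href="mailto:...">...</a>. At the positions of the dots, the found address should be inserted, so we write $1 at these positions, because $1 delivers the address.

If we now use this expression, each e-mail address found in the text will be replaced with its own link. In this way, regular expressions save us a lot of work.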
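The same search and replace can be sketched in Python, assuming the standard re module. Note that Python writes the reference to the first group as \1 in the replacement instead of $1. The sample sentence and the variable names are made up for this illustration.

```python
import re

# The search expression from the box; the round brackets form group 1.
EMAIL = re.compile(r"(\b[a-zA-Z0-9._+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}\b)")

# Hypothetical sample text, invented for this example.
text = "Questions? Mail info@example.com."

# \1 inserts the content of group 1, like $1 in the tutorial's syntax.
linked = EMAIL.sub(r'<a href="mailto:\1">\1</a>', text)
print(linked)
# Questions? Mail <a href="mailto:info@example.com">info@example.com</a>.
```

The full stop at the end of the sentence stays outside the link because the expression requires letters after the last dot of the address.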

Further questions?

Regular expressions are a very complex topic, and sometimes they are not easy to understand. If you have further questions, you are welcome to write them in the comments. If you need help with a particular regular expression or would like me to create a regular expression for a specific problem, I can do so for a donation. Simply write to me.

 

© Stefan Trost - The usage of this tutorial, even in parts, is prohibited without prior written consent of Stefan Trost. But of course, you are welcome to link to this tutorial.

 
  
 
