Regular Expressions - Tutorial Part 1: Basics

 


 

Type: Tutorial for Beginners
Category: Tools & Utilities
Language: English
Author: Stefan Trost Media
Date: 13.07.2011

 
 


About the author

Stefan Trost is a developer of software and web solutions who is also happy to take care of your needs and wishes.


In this tutorial, we want to learn how to use regular expressions. The first part covers some basics; the further parts (see the bottom of this text for a list) explain more and more of this extensive topic. But what are regular expressions, and what can we use them for?

What are regular expressions and what can we use them for?

In general, regular expressions are a way to describe text. If we want to replace one string with another, this is very easy as long as we know both strings, for example when we replace "word1" with "word2".

But what can we do if a word can occur in several spellings? Of course, we can type in each variant to replace them all. It becomes much more difficult, however, if we want to find every e-mail address in a text, no matter what the address looks like. It is not possible to search for every combination of characters and Internet addresses we can think of; there are simply too many. So it would be nice if we could tell the computer what an e-mail address looks like, so that the computer can do the searching for us.

This is exactly where regular expressions come in. A regular expression describes text that follows a specific pattern. We can create a regular expression that says the following: first there are some letters or digits, then an @ character, and the string ends with an Internet address consisting of an arbitrary domain name and a domain ending. We can then use this search term to find and replace every e-mail address we can imagine. How to do this and how it works in detail, we will learn later in this tutorial.
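As a small preview, the e-mail description above could be sketched in Python as follows. The pattern is a deliberately simplified assumption and not a complete e-mail validator; it uses syntax that this part of the tutorial has not introduced yet. The names EMAIL_PATTERN and mask_emails, as well as the sample addresses, are made up for this illustration.

```python
import re

# Simplified, hypothetical pattern (an assumption, not a full validator):
# some letters/digits, an @ character, a domain name, a dot, a domain ending.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_emails(text, replacement="[e-mail]"):
    """Replace everything that matches the simplified pattern."""
    return EMAIL_PATTERN.sub(replacement, text)


sample = "Write to info@example.com or to jane.doe@mail.example.org today."

found = EMAIL_PATTERN.findall(sample)   # all matching addresses
masked = mask_emails(sample)            # text with the addresses replaced

print(found)
print(masked)
```

The point of the sketch is only that one pattern finds both addresses, although neither of them was typed into the search term literally.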

Beyond searching and replacing text, there are many other areas of application for regular expressions. For example, you can rewrite parts of a text according to special patterns. Regular expressions can also be used in htaccess files: if you are using an htaccess file on your homepage, you can, for example, rewrite and redirect requests that match a particular pattern. In these and many other areas, regular expressions can make your life easier.

Used software

If you want to see directly how the regular expressions described in this tutorial work, or if you want to play a little with different variants of them, you can use the software Text Converter. Simply drag an arbitrary text file (file extension TXT) onto the program and click on "Replace Text" in the actions on the right side of the main window. Here you can see a find box and a replace box. Next to them, you can activate separately for each box whether regular expressions should be used or not.

The regular expressions mentioned in this tutorial can be typed or copied into the search box, and you can see which parts of the text file they match. The changes appear immediately in the preview at the bottom left of the Text Converter while you type your expressions. Of course, you can also follow this tutorial without testing the expressions in the Text Converter.

The Text Converter is a comprehensive and powerful program with which you can change multiple text files in different ways at the same time. It is available in a free Basic version and a Pro version with extended functions. To test the regular expressions of this tutorial with the preview function of the Text Converter (without having to save the files), the Basic version is completely sufficient.

A first example

But enough of the preface. Let us have a look at a first example. We have already said that regular expressions can describe strings. One of the most important characters within regular expressions is the point (the dot "."). The point can stand for any arbitrary character.

Example 1                      

Search for:       .          Original Text:       abc def 123

Replace with:     x          After replacement:   xxxxxxxxxxx

 

Example 2

Search for:       .+         Original Text:       abc def 123

Replace with:     x          After replacement:   x

 

Example 3

Search for:       .*         Original Text:       abc def 123

Replace with:     x          After replacement:   xx

Simply enter a point in the search box of the action "Replace Text" in the Text Converter and activate "regular expressions" under the box. After that, type an "x" into the replace box. With this, we replace every character matching the regular expression "." with the character "x". The result is the following: each character in the original text matches ".", so each character is replaced with an "x", and we get a file with an "x" at each position where a character was before.
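If you prefer to follow along without the Text Converter, the first example can also be reproduced with a few lines of Python. This is only a sketch using Python's standard re module, which is an assumption of this illustration and not the tool used in this tutorial; other regular expression engines can differ in details.

```python
import re

original = "abc def 123"

# "." stands for any single character (in Python, by default every
# character except a line break), so every character is replaced.
result_dot = re.sub(r".", "x", original)

print(result_dot)  # xxxxxxxxxxx
```

The result has exactly as many characters as the original text, because each single character was matched and replaced on its own.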

The second example shows something different. Here we change the first example and write a plus behind the point. The plus is also a so-called meta character and has a special meaning within regular expressions. It refers to the character standing directly before it and means: the character in front of me has to appear at least once, but it can also appear multiple times. Combined with the point, this means that an arbitrary character can appear as often as it wants. So, in the second example, the whole string "abc def 123" is detected and replaced as a whole with an "x". As a result, we get a single "x".
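The behavior of the plus can be checked with a short Python sketch, again using the standard re module as an assumption of this illustration. The second call uses a made-up sample text to show that the plus always refers to the character directly in front of it, here the letter "a".

```python
import re

original = "abc def 123"

# ".+" means: any character, at least once, as often as possible.
# The whole text is one single match and becomes one "x".
result_plus = re.sub(r".+", "x", original)

# "a+" means: the letter "a", at least once. "ct" contains no "a"
# and is therefore left unchanged.
result_a_plus = re.sub(r"a+", "x", "caaat cat ct")

print(result_plus)    # x
print(result_a_plus)  # cxt cxt ct
```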

Equally interesting is the asterisk *. This meta character means that the character in front of it can appear any number of times, so it is also possible that it never appears. In the third example, the whole string is detected again and replaced with an "x". In addition, "no character" (the empty string remaining at the end of the text) is also detected and replaced with an "x", so we get the new string "xx".
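Why the third example produces two "x" becomes visible when the individual matches are listed. The following Python sketch assumes Python 3.7 or newer, where the replacement behaves as described here; older Python versions and other engines may treat the empty match at the end differently. The sample text in the last call is made up for this illustration.

```python
import re

original = "abc def 123"

# ".*" finds two matches: the whole text and the empty string
# ("no character") that remains at the very end.
matches = [m.group() for m in re.finditer(r".*", original)]

# Both matches are replaced, so the result consists of two "x".
result_star = re.sub(r".*", "x", original)

# "ca*t" means: "c", then any number of "a" (also none), then "t".
result_a_star = re.sub(r"ca*t", "x", "ct cat caaat")

print(matches)        # ['abc def 123', '']
print(result_star)    # xx
print(result_a_star)  # x x x
```

Compared with the plus, the asterisk also accepts zero occurrences, which is why "ct" matches "ca*t" and why the empty string at the end of the text counts as a match of ".*".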

Summary

  • Regular expressions can describe strings. You can simplify many tasks with them.
  • The point . can stand for an arbitrary character.
  • The plus + means: the character before the plus has to appear at least once and can also appear multiple times.
  • The asterisk * means: the preceding character can appear any number of times (also never).

Read more

 

© Stefan Trost - The usage of this tutorial, even in parts, is prohibited without prior written consent of Stefan Trost. But of course, you are welcome to link to this tutorial.

 
  
 
