Regular Expressions - Tutorial Part 4: Selections of characters and alternatives

 


 

Type: Tutorial for Beginners
Category: Tools & Utilities
Language: English
Author: Stefan Trost Media
Date: 13.07.2011

 
 


About the author

Stefan Trost develops software and web solutions and is happy to address your needs and requests.


In the first three parts of this tutorial, we dealt only with explicitly defined characters. With regular expressions, however, it is also possible to define whole character classes or alternative characters. This part of the tutorial covers both. So far, the following parts have been published:

Part 1: Basics | Part 2: Normal strings, grouping and repetitions | Part 3: Meta characters and combinations | Part 4: Selections of characters and alternatives | Part 5: Character groups and classes | Part 6: Reusing and backward references | Part 7: Modifiers | Part 8: Usage and Examples

Important Note

To try out the examples and to test your own regular expressions, you can use the software Text Converter in its free Basic version. The first part of this tutorial explains how to use the application.

Character selections and alternative characters

Imagine you want to search for a word that can be written in two different ways, but you want to use only one regular expression for it. As a simple example, we take the character combinations "axa" and "aza". We therefore need a regular expression that matches an "x" as well as a "z" surrounded by two "a"s. You can see this expression in the first line of example 1.

Example 1

Search for:    Replace with:    Original:        After replacement:

a[zx]a         X                aza aaa axa      X aaa X

ab[cd]         X                abcd abc abd     Xd X X

[abc]          X                abcd abc abd     XXXd XXX XXd

[ab]+          X                abc aaa ababc    Xc X Xc

[ab]+c         X                abc aaa ababc    X aaa X

The square brackets define a character selection. Any one of the characters in the selection can appear at this position. In the first line, the possible matches are "aza" and "axa". In the second line, "abc" and "abd" match the expression. From the string "abcd", only "abc" is replaced and the "d" stays unchanged.

In the third line of example 1, we have defined only a character selection without a fixed character in front of or behind it. Now we are searching for "a", "b" or "c" individually, and every occurrence is replaced.
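The first three rows of example 1 can also be reproduced outside of Text Converter. The following is a minimal sketch in Python, assuming that Python's `re` module treats these simple character selections the same way as the engine used in this tutorial; the variable names `rows` and `results` are chosen only for illustration.

```python
import re

# Patterns and sample texts taken from the first three rows of example 1.
rows = [
    (r"a[zx]a", "aza aaa axa"),
    (r"ab[cd]", "abcd abc abd"),
    (r"[abc]", "abcd abc abd"),
]

# re.sub replaces every non-overlapping match with the replacement "X".
results = [re.sub(pattern, "X", text) for pattern, text in rows]

for (pattern, text), result in zip(rows, results):
    print(f"{pattern:8} {text:14} -> {result}")
```

Each printed line shows the search pattern, the original text and the text after replacement, in the same order as the columns of the example table.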

In the fourth line of example 1, we combine a character selection with a plus. The plus means: the preceding element has to appear at least once and can be repeated any number of times. In this case, the preceding element is the selection, that is, an "a" or a "b". The whole expression matches strings like "aaa" and "bbbb", but also "ababbaab", because the repeated characters need not be the same.

In the last line, we add a "c" to the previous expression. With this, we are searching for a string that ends with a "c" and in which any number of "a"s and "b"s, at least one, occur in any combination before the "c". For this reason, "abc" and "ababc" are replaced completely, while "aaa" stays unchanged.
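To see which parts of the text the two patterns with the plus actually match, it can help to list the matches before replacing them. This is a small sketch in Python, assuming again that Python's `re` module behaves like the engine used in the tutorial for these patterns; all variable names are illustrative.

```python
import re

text = "abc aaa ababc"

# [ab]+ matches the longest possible run of "a" and "b" characters.
runs = re.findall(r"[ab]+", text)

# [ab]+c additionally requires a "c" directly after such a run.
runs_with_c = re.findall(r"[ab]+c", text)

# The same patterns used for replacing, as in the example table.
replaced_runs = re.sub(r"[ab]+", "X", text)
replaced_runs_with_c = re.sub(r"[ab]+c", "X", text)

print(runs, replaced_runs)
print(runs_with_c, replaced_runs_with_c)
```

The string "aaa" appears in the first list but not in the second, because no "c" follows it.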

Alternative strings

Example 2

Search for:    Replace with:    Original:        After replacement:

abc|def        X                abcd ef defg     Xd ef Xg

z(abc|def)     X                abcd ef zdefg    abcd ef Xg

In example 1, we discussed alternative characters within a character selection. But there is also another way: the meta character | stands for alternatives as well. In the first line of example 2, we use the regular expression "abc|def" to search for "abc" or "def" and replace each string found with an "X". In this way, you can define regular expressions with whole parts of words as alternatives.

In the second line, we use the regular expression "z(abc|def)". This means: we are searching for a string that begins with "z", followed by the sequence "abc" or "def". The "abc" at the beginning is not replaced because there is no "z" in front of it. The "zdef" is replaced. The round brackets group "abc|def", so the "z" applies to both alternatives.
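The effect of the round brackets becomes clearer when the same alternatives are tried with and without grouping. The sketch below uses Python's `re` module, on the assumption that it handles | and round brackets like the engine used in this tutorial; the third pattern, "zabc|def", is not part of example 2 and is added only for comparison.

```python
import re

# Without a group, | splits the whole pattern: either "abc" or "def".
plain = re.sub(r"abc|def", "X", "abcd ef defg")

# With a group, the "z" is required in front of either alternative.
grouped = re.sub(r"z(abc|def)", "X", "abcd ef zdefg")

# For comparison (not in the tutorial): without the brackets,
# the pattern means "zabc" or "def", so the "z" is left in place.
ungrouped = re.sub(r"zabc|def", "X", "abcd ef zdefg")

print(plain)
print(grouped)
print(ungrouped)
```

In the ungrouped variant, only "def" is replaced and the "z" in front of it remains, which is why the brackets are needed when a prefix should apply to all alternatives.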

Square and round brackets

In the following example, we want to take a closer look at the difference between square and round brackets. Square brackets define a selection of characters: each character within the brackets is an alternative to the others. In contrast, round brackets group characters. If characters stand side by side within round brackets, they also have to appear in this order.

Example 3

Search for:    Replace with:    Original:        After replacement:

a(bc|de)       X                abcd ade ae      Xd X ae

a[bc|de]       X                abcd ade ae      Xcd Xe X

In example 3, we compare two search patterns. The first uses round brackets, the second uses square brackets. The pattern with the round brackets means: first an "a" has to appear, followed by "bc" or "de". So "abc" and "ade" are replaced, while "ae" stays unchanged.

Things are different if we use square brackets. In this second pattern, the | does not separate the alternatives "bc" and "de". Inside square brackets, the | loses its special meaning and is simply one more character of the selection. The pattern therefore means: an "a" followed by exactly one of the characters "b", "c", "|", "d" or "e". So, in this example, "ab", "ad" and "ae" are replaced.
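The two patterns of example 3 can be compared directly in a short script. This is a sketch in Python, assuming that Python's `re` module matches the behaviour of the engine used in the tutorial; in Python, a | inside square brackets is an ordinary character. The sample text is the one from example 3 with " a|" appended, an addition made here only to make that visible.

```python
import re

# Sample text from example 3, extended by " a|" for illustration.
text = "abcd ade ae a|"

# Round brackets: "a" followed by the sequence "bc" or the sequence "de".
round_result = re.sub(r"a(bc|de)", "X", text)

# Square brackets: "a" followed by one single character out of b, c, |, d, e.
square_result = re.sub(r"a[bc|de]", "X", text)

print(round_result)
print(square_result)
```

With the square brackets, "a|" is replaced as well, because the | counts as a member of the selection rather than as a separator.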

Summary

  • Square brackets define a selection of characters. Exactly one character of the selection has to appear at this position.
  • With | you can separate alternative strings. The expression matches as soon as one of the alternatives fits.
  • Square and round brackets have different meanings. Square brackets define a selection, round brackets are used for grouping.


 

© Stefan Trost - The usage of this tutorial, even in parts, is prohibited without prior written consent of Stefan Trost. But of course, you are welcome to link to this tutorial.

 
  
 
