This tutorial presents Java powerful regular-expression processing capabilities using class
Matcher and class
matches method. This tutorial is intended for students and developers who are familiar with basic Java string-processing techniques.
[Note: This tutorial is an excerpt (Section 29.7) of Chapter 29, Strings, Characters and Regular Expressions, from our textbook Java How to Program, 6/e. This tutorial may refer to other chapters or sections of the book that are not included here. Permission Information: Deitel, Harvey M. and Paul J., JAVA HOW TO PROGRAM, ©2005, pp.1378-1387. Electronically reproduced by permission of Pearson Education, Inc., Upper Saddle River, New Jersey.]29.7 Regular Expressions, Class Pattern and Class Matcher (Continued)
Method replaceAll replaces text in a string with new text (the second argument) wherever the original string matches a regular expression (the first argument). Line 14 replaces every instance of "*" in firstString with "^". Note that the regular expression ("\\*") precedes character * with two backslashes, \. Normally, * is a quantifier indicating that a regular expression should match any number of occurrences of a preceding pattern. However, in line 14, we want to find all occurrences of the literal character *—to do this, we must escape character * with character \. By escaping a special regular-expression character with a \, we instruct the regular-expression matching engine to find the actual character, as opposed to what it represents in a regular expression. Since the expression is stored in a Java string and \ is a special character in Java strings, we must include an additional \. So the Java string "\\*" represents the regular-expression pattern \* which matches a single * character in the search string. In line 19, every match for the regular expression "stars" in firstString is replaced with "carets". Method replaceFirst (line 32) replaces the first occurrence of a pattern match. Java Strings are immutable, therefore method replaceFirst returns a new string in which the appropriate characters have been replaced. This line takes the original string and replaces it with the string returned by replaceFirst. By iterating three times we replace the first three instances of a digit (\d) in secondString with the text "digit". Method split divides a string into several substrings. The original string is broken in any location that matches a specified regular expression. Method split returns an array of strings containing the substrings between matches for the regular expression. In line 38, we use method split to tokenize a string of comma-separated integers. The argument is the regular expression that locates the delimiter. In this case, we use the regular expression ",\\s*" to separate the substrings wherever a comma occurs. By matching any white-space characters, we eliminate extra spaces from the resulting substrings. Note that the commas and whitespace are not returned as part of the substrings. Again, note that the Java string ",\\s*" represents the regular expression ,\s*.
Classes Pattern and Matcher
In addition to the regular-expression capabilities of class String, Java provides other classes in package java.util.regex that help developers manipulate regular expressions. Class Pattern represents a regular expression. Class Matcher contains both a regular-expression pattern and a CharSequence in which to search for the pattern. CharSequence is an interface that allows read access to a sequence of characters. The interface requires that the methods charAt, length, subSequence and toString be declared. Both String and StringBuffer implement interface CharSequence, so an instance of either of these classes can be used with class Matcher.
Common Programming Error 29.4
|A regular expression can be tested against an object of any class that implements interface CharSequence, but the regular expression must be a String. Attempting to create a regular expression as a StringBuffer is an error.|
If a regular expression will be used only once, static Pattern method matches can be used. This method takes a string that specifies the regular expression and a CharSequence on which to perform the match. This method returns a boolean indicating whether the search object (the second argument) matches the regular expression. If a regular expression will be used more than once, it is more efficient to use static Pattern method compile to create a specific Pattern object for that regular expression. This method receives a string representing the pattern and returns a new Pattern object, which can then be used to call method matcher. This method receives a CharSequence to search and returns a Matcher object. Matcher provides method matches, which performs the same task as Pattern method matches, but receives no arguments—the search pattern and search object are encapsulated in the Matcher object. Class Matcher provides other methods, including find, lookingAt, replaceFirst and replaceAll.