Transcript Slide 1

Chapter 9
Characters and Strings
© The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Chapter 9 Objectives
After you have read and studied this
chapter, you should be able to
• Declare and manipulate data of the char
data type.
• Write string processing programs, using
String and StringBuffer objects.
• Differentiate the String and StringBuffer
classes and use the correct class in
solving a given task.
• Tell the difference between equality and
equivalence tests for String objects.
9.1 Characters
In Java, single characters are
represented using the data type char.
Character constants are written as
symbols enclosed in single quotes:
char ch1 = 'X';
9.1 Characters
ASCII stands for American Standard
Code for Information Interchange.
ASCII is one of the document coding
schemes widely used today. This
coding scheme allows different
computers to share information easily.
9.1 Characters
ASCII works well for English-language
documents because all characters and
punctuation marks are included in the
ASCII codes.
ASCII does not represent the full
character sets of other languages.
9.1 Characters
The Unicode Worldwide Character
Standard (Unicode) supports the
interchange, processing, and display
of the written texts of diverse
languages.
Java uses the Unicode standard for
representing char constants.
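A minimal sketch of how a char value relates to its numeric Unicode code. The class and method names (CharDemo, codeOf, next) are illustrative assumptions, not part of the chapter.

```java
class CharDemo {
    // Returns the numeric Unicode code of a character.
    static int codeOf(char ch) {
        return (int) ch;
    }

    // Returns the character that follows ch in Unicode order.
    static char next(char ch) {
        return (char) (ch + 1);
    }

    public static void main(String[] args) {
        char ch1 = 'X';
        System.out.println(codeOf(ch1)); // 88
        System.out.println(next(ch1));   // Y
        System.out.println('a' < 'b');   // true: chars compare by code
    }
}
```

Because a char is stored as a number, characters can be compared and incremented like integers.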
9.2 Strings
A string is a sequence of characters that
is treated as a single value.
Instances of the String class are used
to represent strings in Java.
We access individual characters of a
string by calling the charAt method of
the String object.
9.2 Strings
Each character in a string has an index
we use to access the character.
Java uses zero-based indexing; the first
character’s index is 0, the second is 1,
and so on.
To refer to the first character of the word
name, we say
name.charAt(0).
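As a sketch of zero-based indexing, the following visits every character of a string with charAt. The class name, the helper method count, and the sample string are assumptions chosen for illustration.

```java
class CharAtDemo {
    // Counts how many times target occurs in text by checking each index.
    static int count(String text, char target) {
        int n = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == target) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String name = "Sumatra";
        System.out.println(name.charAt(0));                 // S (first character)
        System.out.println(name.charAt(name.length() - 1)); // a (last character)
        System.out.println(count(name, 'a'));               // 2
    }
}
```

Note that the last valid index is length() - 1, not length().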
Fig. 9.1
An indexed expression is used to refer
to individual characters in a string.
9.2 Strings
Since String is a class, we can create
an instance of the class by using the
new operator.
The statements we have used so far,
such as
String name1 = "Kona";
work as a shorthand for
String name1 = new String("Kona");
But this shorthand works for the String
class only.
9.2 Strings
String comparison may be done in
several ways.
• The methods equals and
equalsIgnoreCase compare string values;
one is case-sensitive and one is not.
• The method compareTo returns a value:
• Zero (0) if the strings are equal.
• A negative integer if the first string is less
than the second.
• A positive integer if the first string is
greater than the second.
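The comparison methods listed above can be sketched as follows. The class name and the helper describe are hypothetical; only equals, equalsIgnoreCase, and compareTo come from the String class.

```java
class CompareDemo {
    // Translates the sign of compareTo into a readable word.
    static String describe(String a, String b) {
        int result = a.compareTo(b);
        if (result == 0) {
            return "equal";
        }
        return result < 0 ? "less" : "greater";
    }

    public static void main(String[] args) {
        System.out.println("Java".equals("java"));           // false (case-sensitive)
        System.out.println("Java".equalsIgnoreCase("java")); // true
        System.out.println(describe("apple", "banana"));     // less
        System.out.println(describe("pear", "peach"));       // greater
        System.out.println(describe("kona", "kona"));        // equal
    }
}
```

Only the sign of the compareTo result is meaningful here; the exact integer returned should not be relied on.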
9.3 Pattern Matching
and Regular Expression
Pattern matching is a common function
in many applications.
In Java 2 SDK 1.4, two new classes,
Pattern and Matcher, were added.
The String class also includes several
new methods that support pattern
matching.
9.3 Pattern Matching
and Regular Expression
The matches method from the String
class is similar to the equals method.
However, unlike equals, the argument
to matches can be a pattern.
9.3 Pattern Matching
and Regular Expression
Suppose that all new students are
assigned a three-digit code:
• The first digit represents the major (5
indicates computer science);
• The second digit represents either in-state
(1), out-of-state (2), or foreign (3);
• The third digit indicates campus housing:
• On-campus dorms are numbered 1-7.
• Students living off-campus are represented
by the digit 8.
9.3 Pattern Matching
and Regular Expression
The valid code pattern for computer
science majors living on-campus:
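One pattern consistent with the three-digit coding rules above is 5[123][1-7]: the digit 5, then 1, 2, or 3, then a dorm number from 1 to 7. This pattern is an assumption derived from those rules, and the class and method names below are illustrative.

```java
class StudentCodeDemo {
    // Assumed pattern: CS major (5), residency 1-3, on-campus dorm 1-7.
    static final String CS_ON_CAMPUS = "5[123][1-7]";

    static boolean isCsOnCampus(String code) {
        return code.matches(CS_ON_CAMPUS);
    }

    public static void main(String[] args) {
        System.out.println(isCsOnCampus("513")); // true
        System.out.println(isCsOnCampus("518")); // false: 8 means off-campus
        System.out.println(isCsOnCampus("413")); // false: not a CS major
    }
}
```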
9.3 Pattern Matching
and Regular Expression
Such a pattern is called a regular
expression. It allows us to denote a
large set of “words” (any sequence of
symbols) succinctly.
Brackets [] represent choices, so [abc]
means a, b, or c.
For example, the definition for a valid
Java identifier may be stated as
[a-zA-Z][a-zA-Z0-9_$]*
9.3 Pattern Matching
and Regular Expression
Expression        Description
[013]             A single digit 0, 1, or 3.
[0-9][0-9]        Any two-digit number from 00 to 99.
A[0-4]b[05]       A string that consists of four characters. The first character is A. The second character is a digit between 0 and 4, inclusive. The third character is b. The last character is either 0 or 5.
[0-9&&[^4567]]    A single digit that is 0, 1, 2, 3, 8, or 9.
[a-z0-9]          A single character that is either a lowercase letter or a digit.
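The bracket expressions in the table can be checked with the matches method. This is a small sketch; the class name and the helper m are assumptions.

```java
class CharClassDemo {
    // Thin wrapper so each check reads as (input, pattern).
    static boolean m(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(m("3", "[013]"));          // true
        System.out.println(m("2", "[013]"));          // false
        System.out.println(m("42", "[0-9][0-9]"));    // true
        System.out.println(m("A3b5", "A[0-4]b[05]")); // true
        System.out.println(m("8", "[0-9&&[^4567]]")); // true
        System.out.println(m("5", "[0-9&&[^4567]]")); // false: 5 is excluded
    }
}
```

Inside brackets, && intersects two sets and ^ negates a set, which is how the fourth expression excludes 4 through 7.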
9.3 Pattern Matching
and Regular Expression
Expression   Description
X{N}         Repeat X exactly N times, where X is a regular expression for a single character.
X{N,}        Repeat X at least N times.
X{N,M}       Repeat X at least N but no more than M times.
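A brief sketch of the three repetition forms in the table. The class name, the helper m, and the sample patterns are illustrative assumptions.

```java
class RepeatDemo {
    static boolean m(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        // X{N}: exactly three digits
        System.out.println(m("123", "[0-9]{3}"));     // true
        System.out.println(m("12", "[0-9]{3}"));      // false
        // X{N,}: two or more lowercase letters
        System.out.println(m("abcdef", "[a-z]{2,}")); // true
        System.out.println(m("a", "[a-z]{2,}"));      // false
        // X{N,M}: between two and four digits
        System.out.println(m("1234", "[0-9]{2,4}"));  // true
        System.out.println(m("12345", "[0-9]{2,4}")); // false
    }
}
```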
9.3 Pattern Matching
and Regular Expression
The period symbol (.) is used to match any
character except a line terminator (\n or
\r).
String document;
document = ...; //assign text to 'document'
if (document.matches(".*zen of objects.*")) {
    System.out.println("Found");
} else {
    System.out.println("Not found");
}
9.3 Pattern Matching
and Regular Expression
Brackets ([ ]) are used for expressing a
range of choices for a given character.
To express a range of choices for
multiple characters, use parentheses
and the vertical bar.
9.3 Pattern Matching
and Regular Expression
Expression            Description
[wb](ad|eed)          Matches wad, weed, bad, and beed.
(pro|anti)-OOP        Matches pro-OOP and anti-OOP.
(AZ|CA|CO)[0-9]{4}    Matches AZxxxx, CAxxxx, and COxxxx, where x is a single digit.
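The parenthesized alternatives in the table can be sketched as follows. The class name, the helper m, and the sample inputs are assumptions.

```java
class AlternationDemo {
    static boolean m(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(m("weed", "[wb](ad|eed)"));         // true
        System.out.println(m("bed", "[wb](ad|eed)"));          // false
        System.out.println(m("anti-OOP", "(pro|anti)-OOP"));   // true
        System.out.println(m("CO0420", "(AZ|CA|CO)[0-9]{4}")); // true
        System.out.println(m("NY0420", "(AZ|CA|CO)[0-9]{4}")); // false
    }
}
```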
9.3 Pattern Matching
and Regular Expression
The replaceAll method is new to the
Version 1.4 String class.
This method allows us to replace all
occurrences of a substring that
matches a given regular expression
with a given replacement string.
9.3 Pattern Matching
and Regular Expression
For example, to replace all vowels in a
string with the @ symbol:
String originalText, modifiedText;
originalText = ...; //assign string to 'originalText'
modifiedText = originalText.replaceAll("[aeiou]", "@");
Note that this method does not change
the original text; it simply returns a
modified text as a separate string.
9.3 Pattern Matching
and Regular Expression
To match a whole word, use the \b symbol
to designate the word boundary.
str.replaceAll("\\btemp\\b", "temporary");
Two backslashes are necessary because
we must write the expression as a String
literal. The doubled backslash
prevents the compiler from interpreting
the regular expression backslash as the
start of a string escape sequence.
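A small sketch of whole-word replacement with \b. The class name, the method expand, and the sample sentence are illustrative assumptions.

```java
class WordBoundaryDemo {
    // Replaces the whole word "temp" only; longer words containing it are untouched.
    static String expand(String text) {
        return text.replaceAll("\\btemp\\b", "temporary");
    }

    public static void main(String[] args) {
        String text = "temp file, temperature, temp";
        System.out.println(expand(text));
        // prints: temporary file, temperature, temporary
    }
}
```

The word temperature is left alone because no word boundary follows its first four letters.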
9.3 Pattern Matching
and Regular Expression
The backslash is also used to search for
a character that has a special meaning in
regular expressions. For example:
• To search for the plus symbol (+) in text,
we use the backslash as \+.
• To express it as a string, we write "\\+".
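The effect of escaping the plus symbol can be sketched as follows; the class name, the method spellOut, and the sample strings are assumptions.

```java
class EscapeDemo {
    // Replaces each literal plus sign with the word " plus ".
    static String spellOut(String text) {
        return text.replaceAll("\\+", " plus ");
    }

    public static void main(String[] args) {
        System.out.println(spellOut("1+1=2"));        // 1 plus 1=2
        System.out.println("a+b".matches("a\\+b"));   // true: + is literal
        System.out.println("a+b".matches("a+b"));     // false: + means "one or more a"
        System.out.println("aab".matches("a+b"));     // true
    }
}
```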
9.4 The Pattern and Matcher Classes
The matches and replaceAll methods
of the String class are shorthand for
using the Pattern and Matcher
classes from the java.util.regex
package.
9.4 The Pattern and Matcher Classes
If str and regex are String objects, then
both
str.matches(regex);
and
Pattern.matches(regex, str);
are equivalent to
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
matcher.matches();
9.4 The Pattern and Matcher Classes
Creating Pattern and Matcher objects
gives us more options and efficiency.
The compile method of the Pattern class
converts the stated regular expression to
an internal format to carry out the pattern-matching operation.
This conversion is carried out every time
the matches method of the String or
Pattern class is executed.
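Since the conversion is repeated on every call to matches, one way to avoid the repeated work is to compile the pattern once and reuse it. The sketch below assumes a hypothetical class ReusePatternDemo and helper countValid; the identifier pattern is the one given earlier in the chapter.

```java
import java.util.regex.Pattern;

class ReusePatternDemo {
    // Compiled once when the class loads, then shared by every call.
    private static final Pattern IDENTIFIER =
            Pattern.compile("[a-zA-Z][a-zA-Z0-9_$]*");

    // Counts the candidates that match the identifier pattern.
    static int countValid(String[] candidates) {
        int n = 0;
        for (String s : candidates) {
            if (IDENTIFIER.matcher(s).matches()) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String[] words = { "total", "2fast", "my_var", "x$" };
        System.out.println(countValid(words)); // 3 ("2fast" starts with a digit)
    }
}
```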
9.4 The Pattern and Matcher Classes
/*
    Chapter 9 Sample Program: Checks whether
    the input string is a valid identifier.
    This version uses the Matcher and Pattern
    classes.
    File: Ch9MatchJavaIdentifier2.java
*/
import javax.swing.*;
import java.util.regex.*;
class Ch9MatchJavaIdentifier2 {
    private static final String STOP = "STOP";
    private static final String VALID = "Valid Java identifier";
    private static final String INVALID = "Not a valid Java identifier";
9.4 The Pattern and Matcher Classes
    private static final String VALID_IDENTIFIER_PATTERN
            = "[a-zA-Z][a-zA-Z0-9_$]*";
    public static void main (String[] args) {
        String str, reply;
        Matcher matcher;
        // Compile the pattern once, outside the loop
        Pattern pattern = Pattern.compile(VALID_IDENTIFIER_PATTERN);
        while (true) {
            str = JOptionPane.showInputDialog(null, "Identifier:");
            // showInputDialog returns null if the dialog is cancelled
            if (str == null || str.equals(STOP)) break;
9.4 The Pattern and Matcher Classes
            matcher = pattern.matcher(str);
            if (matcher.matches()) {
                reply = VALID;
            } else {
                reply = INVALID;
            }
            JOptionPane.showMessageDialog(null, str + ":\n" + reply);
        }
    }
}
9.4 The Pattern and Matcher Classes
The find method is another powerful
method of the Matcher class.
The method searches for the next
sequence in a string that matches the
pattern, and returns true if the pattern
is found.
9.4 The Pattern and Matcher Classes
When a matcher finds a matching
sequence of characters, we can query
the location of the sequence by using
the start and end methods.
9.4 The Pattern and Matcher Classes
The start method returns the position in
the string where the first character of
the pattern is found.
The end method returns the value 1
more than the position in the string
where the last character of the pattern
is found.
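The find, start, and end methods can be combined in a loop to report every match. This is a sketch under assumed names (FindDemo, locate) with an assumed sample sentence.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindDemo {
    // Returns the "start-end" positions of every match, separated by spaces.
    static String locate(String text, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        StringBuffer positions = new StringBuffer();
        while (matcher.find()) {
            if (positions.length() > 0) {
                positions.append(' ');
            }
            positions.append(matcher.start()).append('-').append(matcher.end());
        }
        return positions.toString();
    }

    public static void main(String[] args) {
        String text = "I love Java and Java loves me";
        System.out.println(locate(text, "Java")); // 7-11 16-20
    }
}
```

Because end returns one more than the index of the last matched character, end minus start gives the length of the match (4 for each occurrence here).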
9.5 Comparing Strings
Comparing String objects is similar to
comparing other objects.
The equality test (==) is true if the
contents of the variables are the same.
For a reference data type, the equality
test is true if both variables refer to the
same object, because they both
contain the same address.
9.5 Comparing Strings
The equals method is true if the String
objects to which the two variables refer
contain the same string value.
Fig. 9.2A
The difference between the equality test
and the equals method.
Fig. 9.2B and C
9.5 Comparing Strings
As long as a new String object is created
using the new operator, the rule for
comparing objects applies to comparing
strings.
String str = new String("Java");
If the new operator is not used, string
data behave as if they were of a
primitive data type: identical string
literals refer to one shared String
object, so the equality test on them is
true.
String str = "Java";
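The difference between the equality test and the equals method can be sketched as follows. The class and helper names are assumptions chosen for illustration.

```java
class EqualityDemo {
    // Equality test: true only if both variables refer to the same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    // Equivalence test: true if the two strings hold the same characters.
    static boolean sameValue(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String s1 = new String("Java");
        String s2 = new String("Java");
        System.out.println(sameObject(s1, s2)); // false: two distinct objects
        System.out.println(sameValue(s1, s2));  // true: same string value

        String s3 = "Java";
        String s4 = "Java";
        System.out.println(sameObject(s3, s4)); // true: literals share one object
    }
}
```

In practice, equals is the safer choice whenever the string value is what matters.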
Fig. 9.3
The difference between using and not
using the new operator for String.
9.6 StringBuffer
When a String object is created, it
cannot be changed.
Manipulating the content of a string,
such as replacing a character,
appending one string to another,
deleting a portion of a string, and so
on, may be accomplished by using the
StringBuffer class.
9.6 StringBuffer
For example:
StringBuffer word = new StringBuffer("Java");
word.setCharAt(0, 'D');
word.setCharAt(1, 'i');
changes the string from "Java" to "Diva".
9.6 StringBuffer
The following example reads a sentence
and replaces all vowels in the sentence
with the character X.
/*
    Chapter 9 Sample Program: Replace every
    vowel in a given sentence with 'X' using
    StringBuffer.
    File: Ch9ReplaceVowelsWithX.java
*/
import javax.swing.*;
class Ch9ReplaceVowelsWithX {
    public static void main (String[] args) {
9.6 StringBuffer
        StringBuffer tempStringBuffer;
        String       inSentence;
        int          numberOfCharacters;
        char         letter;
        inSentence = JOptionPane.showInputDialog(null, "Enter a sentence:");
        tempStringBuffer = new StringBuffer(inSentence);
        numberOfCharacters = tempStringBuffer.length();
        for (int index = 0; index < numberOfCharacters; index++) {
9.6 StringBuffer
            letter = tempStringBuffer.charAt(index);
            if (letter == 'a' || letter == 'A' ||
                letter == 'e' || letter == 'E' ||
                letter == 'i' || letter == 'I' ||
                letter == 'o' || letter == 'O' ||
                letter == 'u' || letter == 'U') {
                tempStringBuffer.setCharAt(index, 'X');
            }
        }
        System.out.println("Input: " + inSentence + "\n");
        System.out.println("Output: " + tempStringBuffer);
    }
}
9.6 StringBuffer
We cannot input StringBuffer objects.
We must input String objects and
convert them to StringBuffer objects.
9.6 StringBuffer
We use the append method to append
a String or StringBuffer object to the
end of a StringBuffer object.
The method append can also take an
argument of the primitive data type.
Any primitive data type argument is
converted to a string before it is
appended to a StringBuffer object.
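A short sketch of append with String, primitive, and StringBuffer arguments. The class name, the method build, and the sample values are illustrative assumptions.

```java
class AppendDemo {
    // Builds a sentence by appending values of several types.
    static String build() {
        StringBuffer suffix = new StringBuffer(" each");
        StringBuffer result = new StringBuffer("Total: ");
        result.append(3);            // int is converted to "3"
        result.append(" items at "); // String
        result.append(1.5);          // double is converted to "1.5"
        result.append(suffix);       // another StringBuffer
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(build()); // Total: 3 items at 1.5 each
    }
}
```

Each append call changes the same StringBuffer object rather than creating a new one.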
9.6 StringBuffer
We can insert a string at a specified
position by using the insert method.
The syntax for this method is:
<StringBuffer>.insert(<insertIndex>, <value>);
where <insertIndex> must be greater
than or equal to 0 and less than or equal
to the length of <StringBuffer>, and
the <value> is an object or a value of
the primitive data type.
9.6 StringBuffer
For example, executing
StringBuffer str = new StringBuffer("Java is great");
str.insert(8, "really ");
changes the string
Java is great
to
Java is really great