Characters and Strings


Chapter 9
Characters and Strings
© The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Chapter 9 Objectives
After you have read and studied this chapter,
you should be able to
• Declare and manipulate data of the char data
type.
• Write string processing programs, using String
and StringBuffer objects.
• Differentiate the String and StringBuffer
classes and use the correct class in solving a
given task.
• Tell the difference between equality and
equivalence tests for String objects.
9.1 Characters
In Java, single characters are represented using
the data type char.
Character constants are written as symbols
enclosed in single quotes:
char ch1 = 'X';
ASCII stands for American Standard Code for
Information Interchange.
ASCII is one of the document coding schemes
widely used today. This coding scheme allows
different computers to share information easily.
ASCII works well for English-language documents
because all characters and punctuation marks are
included in the ASCII codes.
ASCII does not represent the full character sets of
other languages.
The Unicode Worldwide Character Standard
(Unicode) supports the interchange, processing,
and display of the written texts of diverse
languages.
Java uses the Unicode standard for representing
char constants.
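As a small illustration of this point, the sketch below converts a char to an int to reveal its Unicode code value. The class name CharCodeDemo and the helper codeOf are made up for this example.

```java
class CharCodeDemo {
    // Returns the numeric Unicode code of a char (widening conversion char -> int).
    static int codeOf(char ch) {
        return ch;
    }

    public static void main(String[] args) {
        char ch1 = 'X';
        System.out.println(ch1 + " has code " + codeOf(ch1));   // X has code 88

        // A char constant can also be written as a Unicode escape.
        char omega = '\u03A9';                                   // Greek capital omega
        System.out.println(codeOf(omega));                       // 937
    }
}
```

The code for 'X' is the same in ASCII and Unicode, because the first 128 Unicode codes coincide with ASCII.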
9.2 Strings
A string is a sequence of characters that is treated
as a single value.
Instances of the String class are used to represent
strings in Java.
We access individual characters of a string by
calling the charAt method of the String object.
Each character in a string has an index we use to
access the character.
Java uses zero-based indexing; the first
character’s index is 0, the second is 1, and so on.
To refer to the first character of the word name, we
say
name.charAt(0).
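The zero-based indexing described above can be sketched as follows. The class CharAtDemo, the helper countChar, and the sample word are assumptions chosen for illustration only.

```java
class CharAtDemo {
    // Counts how many times target occurs in text by visiting every index
    // from 0 up to (but not including) text.length().
    static int countChar(String text, char target) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String name = "Sumatra";
        System.out.println(name.charAt(0));                   // S
        System.out.println(name.charAt(name.length() - 1));   // a
        System.out.println(countChar(name, 'a'));             // 2
    }
}
```

Note that the last valid index is length() - 1; calling charAt with any larger index throws a StringIndexOutOfBoundsException.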
Fig. 9.1
An indexed expression is used to refer to individual
characters in a string.
Since String is a class, we can create an instance
of the class by using the new operator.
The statements we have used so far, such as
String name1 = "Kona";
work much like
String name1 = new String("Kona");
although the two forms are not identical: the new operator
always creates a separate String object (see Section 9.5).
This literal shorthand is available for the String class only.
String comparison may be done in several ways.
• The methods equals and equalsIgnoreCase compare
string values; one is case-sensitive and one is not.
• The method compareTo returns a value:
• Zero (0) if the strings are equal.
• A negative integer if the first string is less than the
second.
• A positive integer if the first string is greater than the
second.
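The comparison methods listed above might be exercised as in the sketch below. Only the sign of the compareTo result is specified, so the hypothetical helper sign reduces it to -1, 0, or 1; the class name CompareDemo is likewise made up.

```java
class CompareDemo {
    // Reduces compareTo's result to -1, 0, or 1, since only the sign is meaningful.
    static int sign(String first, String second) {
        return Integer.signum(first.compareTo(second));
    }

    public static void main(String[] args) {
        System.out.println("Java".equals("java"));             // false
        System.out.println("Java".equalsIgnoreCase("java"));   // true
        System.out.println(sign("apple", "banana"));           // -1
        System.out.println(sign("banana", "apple"));           // 1
        System.out.println(sign("apple", "apple"));            // 0
        // Ordering follows character codes, so uppercase letters come first.
        System.out.println(sign("Zebra", "apple"));            // -1
    }
}
```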
9.3 Pattern Matching and Regular Expression
Pattern matching is a common function in many
applications.
Java 2 SDK 1.4 added two new classes, Pattern and
Matcher.
The String class also includes several new
methods that support pattern matching.
The matches method from the String class is
similar to the equals method.
However, unlike equals, the argument to matches
can be a pattern.
Suppose that all new students are assigned
a three-digit code:
• The first digit represents the major (5 indicates
computer science);
• The second digit represents either in-state (1),
out-of-state (2), or foreign (3);
• The third digit indicates campus housing:
• On-campus dorms are numbered 1-7.
• Students living off-campus are represented by
the digit 8.
The valid code pattern for computer
science majors living on-campus:
5[123][1-7]
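Following the three-digit coding rules given earlier (major 5, residency 1 to 3, dorm 1 to 7), one way to write and test such a pattern is sketched below. The pattern string is derived from those rules, and the names StudentCodeDemo and isCsOnCampus are hypothetical.

```java
class StudentCodeDemo {
    // Assumed pattern: 5 = computer science, [123] = residency, [1-7] = dorm number.
    static final String CS_ON_CAMPUS = "5[123][1-7]";

    // matches succeeds only if the whole string fits the pattern.
    static boolean isCsOnCampus(String code) {
        return code.matches(CS_ON_CAMPUS);
    }

    public static void main(String[] args) {
        System.out.println(isCsOnCampus("513"));    // true
        System.out.println(isCsOnCampus("518"));    // false (8 means off-campus)
        System.out.println(isCsOnCampus("413"));    // false (not a CS major)
        System.out.println(isCsOnCampus("5137"));   // false (too long)
    }
}
```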
Such a pattern is called a regular expression. It lets
us denote a large set of “words” (any sequence of
symbols) succinctly.
Brackets [] represent choices, so [abc] means a,
b, or c.
For example, a simplified definition of a valid Java
identifier may be stated as
[a-zA-Z][a-zA-Z0-9_$]*
(Java itself also allows an identifier to begin with _ or $.)
Expression        Description
[013]             A single digit 0, 1, or 3.
[0-9][0-9]        Any two-digit number from 00 to 99.
A[0-4]b[05]       A string that consists of four characters. The first
                  character is A. The second character is a number
                  between 0 and 4, inclusive. The third character is b.
                  The last character is either 0 or 5.
[0-9&&[^4567]]    A single digit that is 0, 1, 2, 3, 8, or 9.
[a-z0-9]          A single character that is either a lowercase letter
                  or a digit.
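A few of the bracket expressions from the table can be checked with the String matches method, as in this sketch. The class name CharClassDemo and the helper isDigitExcept4To7 are assumptions for illustration.

```java
class CharClassDemo {
    // && inside brackets intersects two character classes:
    // a digit 0-9 that is also NOT one of 4, 5, 6, 7.
    static boolean isDigitExcept4To7(String s) {
        return s.matches("[0-9&&[^4567]]");
    }

    public static void main(String[] args) {
        System.out.println("A3b5".matches("A[0-4]b[05]"));   // true
        System.out.println("A5b5".matches("A[0-4]b[05]"));   // false (5 is not in 0-4)
        System.out.println(isDigitExcept4To7("8"));          // true
        System.out.println(isDigitExcept4To7("5"));          // false
        System.out.println("q".matches("[a-z0-9]"));         // true
    }
}
```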
Expression    Description
X{N}          Repeat X exactly N times, where X is a regular expression
              for a single character.
X{N,}         Repeat X at least N times.
X{N,M}        Repeat X at least N but no more than M times.
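The three repetition forms can be tried out as follows. This is a minimal sketch; the class QuantifierDemo and its helper isThreeDigits are invented names, and the sample strings are arbitrary.

```java
class QuantifierDemo {
    // X{N}: exactly three digits.
    static boolean isThreeDigits(String s) {
        return s.matches("[0-9]{3}");
    }

    public static void main(String[] args) {
        System.out.println(isThreeDigits("123"));            // true
        System.out.println(isThreeDigits("12"));             // false
        // X{N,}: at least two lowercase letters.
        System.out.println("abcde".matches("[a-z]{2,}"));    // true
        // X{N,M}: between two and four lowercase letters.
        System.out.println("abcd".matches("[a-z]{2,4}"));    // true
        System.out.println("abcde".matches("[a-z]{2,4}"));   // false
    }
}
```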
The period symbol (.) is used to match any
character except a line terminator (\n or \r).
String document;
document = ...; // assign text to 'document'
if (document.matches(".*zen of objects.*")) {
    System.out.println("Found");
} else {
    System.out.println("Not found");
}
Brackets ([ ]) are used for expressing a range of
choices for a given character.
To express a range of choices for multiple
characters, use parentheses and the vertical bar.
Expression            Description
[wb](ad|eed)          Matches wad, weed, bad, and beed.
(pro|anti)-OOP        Matches pro-OOP and anti-OOP.
(AZ|CA|CO)[0-9]{4}    Matches AZxxxx, CAxxxx, and COxxxx, where x is a
                      single digit.
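The grouping expressions in the table can be verified with a short program such as the one below. The class AlternationDemo and the helper isStateCode are hypothetical names used only for this sketch.

```java
class AlternationDemo {
    // One of three state prefixes followed by exactly four digits.
    static boolean isStateCode(String s) {
        return s.matches("(AZ|CA|CO)[0-9]{4}");
    }

    public static void main(String[] args) {
        System.out.println("weed".matches("[wb](ad|eed)"));       // true
        System.out.println("wed".matches("[wb](ad|eed)"));        // false
        System.out.println("anti-OOP".matches("(pro|anti)-OOP")); // true
        System.out.println(isStateCode("CA1234"));                // true
        System.out.println(isStateCode("NV1234"));                // false
    }
}
```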
The replaceAll method was added to the String class
in version 1.4.
This method allows us to replace all occurrences
of a substring that matches a given regular
expression with a given replacement string.
For example, to replace all lowercase vowels in a string with
the @ symbol:
String originalText, modifiedText;
originalText = ...; // assign string to 'originalText'
modifiedText = originalText.replaceAll("[aeiou]", "@");
Note that this method does not change the original
text; it simply returns a modified text as a separate
string.
To match a whole word, use the \b symbol to
designate the word boundary.
str.replaceAll("\\btemp\\b", "temporary");
Two backslashes are necessary because we must
write the expression as a Java string literal.
Doubling the backslash prevents the compiler from
interpreting \b as a string escape sequence (the
backspace character), so the regular expression
receives \b as intended.
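To make the effect of the word boundary concrete, the sketch below compares the same replacement with and without \b. The class WordBoundaryDemo, the helper expandTemp, and the sample sentence are assumptions for illustration.

```java
class WordBoundaryDemo {
    // Replaces "temp" only where it stands alone as a whole word.
    static String expandTemp(String text) {
        return text.replaceAll("\\btemp\\b", "temporary");
    }

    public static void main(String[] args) {
        String text = "temp file in temple";
        System.out.println(expandTemp(text));
        // prints: temporary file in temple

        // Without \b, the "temp" inside "temple" is replaced as well.
        System.out.println(text.replaceAll("temp", "temporary"));
        // prints: temporary file in temporaryle
    }
}
```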
The backslash is also used to search for a character that
has a special meaning in regular expressions. For example:
• To search for the plus symbol (+) in text, we use the
backslash as \+.
• To express it as a string, we write "\\+".
str.replaceAll("(C\\+\\+|C)", "Java");
• The longer alternative C\\+\\+ is listed first; with C first,
the C in C++ would match on its own and leave ++ behind.
• Other regular expression control characters also need
backslashes to be matched literally (for example: *, (, ), [, ], {,
}).
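The escaping rule can be seen in action in the sketch below, which also shows that the order of alternatives matters when one alternative is a prefix of another. The class EscapePlusDemo and helper toJava are invented for this example.

```java
class EscapePlusDemo {
    // The longer alternative C\+\+ is tried before the plain C.
    static String toJava(String text) {
        return text.replaceAll("C\\+\\+|C", "Java");
    }

    public static void main(String[] args) {
        String text = "I know C and C++";
        System.out.println(toJava(text));
        // prints: I know Java and Java

        // With C listed first, the C of C++ matches alone and ++ is left over.
        System.out.println(text.replaceAll("C|C\\+\\+", "Java"));
        // prints: I know Java and Java++

        // An escaped + matches a literal plus sign.
        System.out.println("1+1".replaceAll("\\+", " plus "));
        // prints: 1 plus 1
    }
}
```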
9.4 The Pattern and Matcher Classes
The matches and replaceAll methods of the
String class are shorthand for using the Pattern
and Matcher classes from the java.util.regex
package.
If str and regex are String objects, then both
str.matches(regex);
and
Pattern.matches(regex, str);
are equivalent to
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
matcher.matches();
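The equivalence stated above can be checked by running all three forms on the same input, as in this sketch. The class MatchThreeWays and its three helper methods are hypothetical; the pattern is the simplified identifier pattern from Section 9.3.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchThreeWays {
    static boolean viaString(String str, String regex) {
        return str.matches(regex);
    }

    static boolean viaPattern(String str, String regex) {
        return Pattern.matches(regex, str);   // note the argument order
    }

    static boolean viaMatcher(String str, String regex) {
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(str);
        return matcher.matches();
    }

    public static void main(String[] args) {
        String regex = "[a-zA-Z][a-zA-Z0-9_$]*";
        for (String str : new String[] {"count1", "1count"}) {
            System.out.println(str + ": " + viaString(str, regex) + " "
                    + viaPattern(str, regex) + " " + viaMatcher(str, regex));
        }
        // prints: count1: true true true
        //         1count: false false false
    }
}
```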
Creating Pattern and Matcher objects gives us more
options and efficiency.
The compile method of the Pattern class converts
the stated regular expression to an internal format
to carry out the pattern-matching operation.
This conversion is carried out every time the
matches method of the String or Pattern class is
executed, so compiling a pattern once and reusing it
saves work when the same pattern is applied repeatedly.
/*
    Chapter 9 Sample Program: Checks whether the input
    string is a valid identifier. This version uses the
    Matcher and Pattern classes.
    File: Ch9MatchJavaIdentifier2.java
*/
import javax.swing.*;
import java.util.regex.*;

class Ch9MatchJavaIdentifier2
{
    private static final String STOP    = "STOP";
    private static final String VALID   = "Valid Java identifier";
    private static final String INVALID = "Not a valid Java identifier";
    private static final String VALID_IDENTIFIER_PATTERN
                                        = "[a-zA-Z][a-zA-Z0-9_$]*";

    public static void main (String[] args)
    {
        String  str, reply;
        Matcher matcher;
        // Compile the pattern once, outside the loop, and reuse it.
        Pattern pattern = Pattern.compile(VALID_IDENTIFIER_PATTERN);

        while (true)
        {
            str = JOptionPane.showInputDialog(null, "Identifier:");
            // showInputDialog returns null if the dialog is canceled.
            if (str == null || str.equals(STOP)) break;

            matcher = pattern.matcher(str);
            if (matcher.matches())
                reply = VALID;
            else
                reply = INVALID;
            JOptionPane.showMessageDialog(null, str + ":\n" + reply);
        }
    }
}
The find method is another powerful method of the
Matcher class.
The method searches for the next sequence in a
string that matches the pattern, and returns true
if the pattern is found.
When a matcher finds a matching sequence of
characters, we can query the location of the
sequence by using the start and end methods.
The start method returns the position in the string
where the first character of the pattern is found.
The end method returns the value 1 more than the
position in the string where the last character of
the pattern is found.
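The find, start, and end methods are typically used together in a loop. The sketch below reports each match as a start-end pair; the class FindDemo, the helper locate, and the sample sentence are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindDemo {
    // Returns the "start-end" positions of every match, separated by spaces.
    static String locate(String text, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        StringBuffer spans = new StringBuffer();
        while (matcher.find()) {              // advance to the next match, if any
            if (spans.length() > 0) {
                spans.append(' ');
            }
            spans.append(matcher.start()).append('-').append(matcher.end());
        }
        return spans.toString();
    }

    public static void main(String[] args) {
        System.out.println(locate("I love Java and Java loves me", "Java"));
        // prints: 7-11 16-20
    }
}
```

Because end is one past the last matched character, text.substring(start, end) yields exactly the matched text.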
9.5 Comparing Strings
Comparing String objects is similar to comparing
other objects.
The equality test (==) is true if the contents of the
variables are the same.
For a reference data type, the equality test is true
if both variables refer to the same object,
because they both contain the same address.
The equals method is true if the String objects to
which the two variables refer contain the same
string value.
Fig. 9.2A
The difference between the equality test and the
equals method.
Fig. 9.2B and C
As long as a new String object is created using the
new operator, the rule for comparing objects
applies to comparing strings.
String str = new String("Java");
If the new operator is not used, string literals with the
same content share a single String object, so under ==
string data behave as if they were of a primitive data type.
String str = "Java";
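The difference between the equality test and the equals method can be observed directly. In this sketch the class StringEqualityDemo and the helper sameObject are invented names; the behavior shown for literals relies on Java sharing one object among identical string literals.

```java
class StringEqualityDemo {
    // True only if both variables refer to the very same object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = new String("Java");
        String b = new String("Java");
        String c = "Java";
        String d = "Java";

        System.out.println(sameObject(a, b));   // false: two separate objects
        System.out.println(a.equals(b));        // true:  same string value
        System.out.println(sameObject(c, d));   // true:  literals share one object
        System.out.println(sameObject(a, c));   // false: new always creates an object
    }
}
```

In practice, equals is the safer choice whenever the intent is to compare string values.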
Fig. 9.3
The difference between using and not using the
new operator for String.
9.6 StringBuffer
Once a String object is created, it cannot be
changed.
Manipulating the content of a string, such as
replacing a character, appending one string to
another, deleting a portion of a string, and
so on, may be accomplished by using the
StringBuffer class.
For example:
StringBuffer word = new StringBuffer("Java");
word.setCharAt(0, 'D');
word.setCharAt(1, 'i');
changes the string from “Java” to “Diva.”
The following example reads a sentence and
replaces all vowels in the sentence with the
character X.
/*
    Chapter 9 Sample Program: Replace every vowel in a
    given sentence with 'X' using StringBuffer.
    File: Ch9ReplaceVowelsWithX.java
*/
import javax.swing.*;

class Ch9ReplaceVowelsWithX
{
    public static void main (String[] args)
    {
        StringBuffer tempStringBuffer;
        String       inSentence;
        int          numberOfCharacters;
        char         letter;

        inSentence = JOptionPane.showInputDialog(null, "Enter a sentence:");
        // showInputDialog returns null if the dialog is canceled.
        if (inSentence == null) return;

        tempStringBuffer   = new StringBuffer(inSentence);
        numberOfCharacters = tempStringBuffer.length();

        for (int index = 0; index < numberOfCharacters; index++)
        {
            letter = tempStringBuffer.charAt(index);
            if ( letter == 'a' || letter == 'A' ||
                 letter == 'e' || letter == 'E' ||
                 letter == 'i' || letter == 'I' ||
                 letter == 'o' || letter == 'O' ||
                 letter == 'u' || letter == 'U'
               )
            {
                tempStringBuffer.setCharAt(index, 'X');
            }
        }
        System.out.println( "Input: " + inSentence + "\n");
        System.out.println( "Output: " + tempStringBuffer );
    }
}
We cannot read input directly into a StringBuffer object.
We must input String objects and convert them to
StringBuffer objects.
We use the append method to append a String or
StringBuffer object to the end of a StringBuffer
object.
The method append can also take an argument of
a primitive data type.
Any primitive data type argument is converted to a
string before it is appended to a StringBuffer
object.
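The behavior of append with both string and primitive arguments might look like the following. The class AppendDemo, the helper build, and the sample values are assumptions chosen for illustration.

```java
class AppendDemo {
    // Builds a line of text by appending values of several types.
    static String build() {
        StringBuffer buffer = new StringBuffer("Total: ");
        buffer.append(3);            // int is converted to "3"
        buffer.append(" items, ");   // String
        buffer.append(4.5);          // double is converted to "4.5"
        buffer.append(' ');          // char
        buffer.append(true);         // boolean is converted to "true"
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(build());   // Total: 3 items, 4.5 true
    }
}
```

Since append returns the same StringBuffer, the calls could also be chained in a single statement.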
We can insert a string at a specified position by using
the insert method.
The syntax for this method is:
<StringBuffer>.insert(<insertIndex>, <value>);
where <insertIndex> must be greater than or
equal to 0 and less than or equal to the length of
<StringBuffer>, and the <value> is an object
or a value of the primitive data type.
For example, executing
StringBuffer str =
    new StringBuffer("Java is great");
str.insert(8, "really ");
changes the string
Java is great
to
Java is really great.
(The inserted text ends with a space so that the words stay separated.)
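The index rule for insert can be explored with a small helper, sketched below. The class InsertDemo and the method insertAt are hypothetical; the inserted word carries a trailing space so that it stays separated from the word that follows.

```java
class InsertDemo {
    // Inserts piece into text at the given index and returns the result.
    static String insertAt(String text, int index, String piece) {
        return new StringBuffer(text).insert(index, piece).toString();
    }

    public static void main(String[] args) {
        String text = "Java is great";                       // length 13
        System.out.println(insertAt(text, 8, "really "));    // Java is really great
        System.out.println(insertAt(text, 0, "Yes, "));      // Yes, Java is great
        // An index equal to the length appends at the end.
        System.out.println(insertAt(text, 13, "!"));         // Java is great!
    }
}
```

An index outside the range 0 to length() causes a StringIndexOutOfBoundsException.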