Lecture Notes 3
Chapter 5 Strings
CSC1310 Fall 2009
Strings
A string is an ordered collection of characters that
stores and represents text-based information.
Strings in Python are immutable (i.e., they cannot be
changed in place) sequences (i.e., they have a left-to-right order).
Single and Double Quotes
Single and double quotes are interchangeable.
>>> 'Python', "Python"
Empty literal: '' or "".
Having both lets you embed a quote character of the
other type inside a string:
>>> "knight's", 'knight"s'
Python automatically concatenates adjacent
string literals:
>>> "Title" ' of' " the book"
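The quoting rules above can be tried out with a short sketch. The variable names are illustrative only, and the snippet is written so that it should run under either Python 2 or Python 3.

```python
# Single and double quotes produce the same string object value.
single = 'Python'
double = "Python"
same = single == double

# Each quote style can hold the other quote character unescaped.
apostrophe = "knight's"
inner_double = 'knight"s'

# Adjacent literals are joined when the code is compiled.
title = "Title" ' of' " the book"
print(title)
```

Adjacent-literal concatenation applies only to literals; joining variables still needs the + operator.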
Escape Sequences
An escape sequence is a special byte code embedded
in a string to represent a character that cannot be easily typed on a
keyboard.
A backslash (\) followed by one or more characters is replaced by a
single character in the resulting string.
\n - Newline
\t - Horizontal tab
>>> s = 'a\nb\tc' # 5 characters!
>>> s
'a\nb\tc'
>>> print s
a
b       c
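As a small check of the "5 characters" claim, the sketch below compares the stored form of the string with its printed form. Variable names are illustrative; print() is called with a single argument so the snippet should behave the same under Python 2 and Python 3.

```python
s = 'a\nb\tc'

# Each escape sequence counts as one character.
length = len(s)

# repr() shows the escapes; print() interprets them.
print(repr(s))
print(s)

# Splitting on the newline shows the tab stays inside the second line.
lines = s.split('\n')
```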
Escape Sequences
\\   - Backslash
\'   - Single quote
\"   - Double quote
\a   - Bell
\b   - Backspace
\r   - Carriage return
\xhh - Hex digits value hh
\0   - Null (binary zero byte)
>>> print 'a\0m\0c' # 5 characters!
amc
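A minimal sketch of a few entries from the table, assuming nothing beyond built-in string behavior. The names are hypothetical and chosen only for this illustration.

```python
# A doubled backslash is stored as one backslash character.
backslash = '\\'

# \xhh gives the character whose code is the hex value hh (0x41 is 'A').
hex_a = '\x41'

# Null characters are real characters even though they do not display.
nul = 'a\0m\0c'
visible = nul.replace('\0', '')
```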
Raw Strings
>>> print 'C:\temp\new.txt' # \t and \n are read as tab and newline
>>> print 'C:\\temp\\new.txt' # doubled backslashes are kept literally
A raw string suppresses escape processing.
Format: r"text" or r'text' (R"text" or R'text')
>>> print r'C:\temp\new.txt'
Raw strings are useful for directory paths and
text pattern matching.
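The three spellings of the path can be compared directly. This is a sketch with illustrative variable names; the path itself is the one used in the notes.

```python
# Without the r prefix, \t and \n become a tab and a newline.
plain = 'C:\temp\new.txt'

# Doubling each backslash keeps it literal.
escaped = 'C:\\temp\\new.txt'

# The raw string gives the same result with less typing.
raw = r'C:\temp\new.txt'

print(raw == escaped)
print(len(plain), len(raw))
```

One caveat worth knowing: a raw string cannot end in a single backslash, so a trailing path separator still has to be written another way.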
Triple-Quoted Strings or Block Strings
A block string is a convenient literal format for coding
multiline text data (error messages, HTML or XML code).
Format: """text""" or '''text'''
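A short sketch of a block string holding a multiline error message. The message text is made up for the example.

```python
# Line breaks inside the triple quotes become \n characters in the string.
message = """Error:
  file not found
  please check the path"""

line_count = len(message.split('\n'))
print(message)
```

Single and double quotes can appear inside the block without escaping, which is why the format suits HTML or XML fragments.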
Unicode Strings
Unicode ("wide" character) strings are used to
support non-Latin characters that require more than
one byte in memory.
Format: u"text" or u'text' (U"text" or U'text')
An expression that mixes Unicode and normal strings has
a Unicode string as its result.
>>> 'fall' + u'08'
u'fall08'
>>> str(u'fall08'), unicode('fall08')
('fall08', u'fall08')
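The "more than one byte" point can be illustrated by encoding a single accented character. This sketch assumes UTF-8 as the encoding, which the notes do not specify, and uses the u prefix so it should run under Python 2 and under Python 3.3 or later (where the prefix is accepted but optional).

```python
# One character: LATIN SMALL LETTER E WITH ACUTE.
ch = u'\u00e9'

# Encoded as UTF-8 (an assumption for this example) it occupies two bytes.
utf8_bytes = ch.encode('utf-8')

print(len(ch), len(utf8_bytes))
```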
Basic Operations: len(), +, *, in
len(str) returns the length of the string str.
>>> len('abc')
str1 + str2 (concatenation) creates a new string by
joining operands str1 and str2.
>>> 'abc' + 'def', len('abc' + 'def')
str * i (repetition) creates a new string made of i copies of str.
>>> print '-' * 80
str1 in str2 (membership) returns True if str1 is a
substring of str2; otherwise, it returns False.
>>> day = 'Monday 8th Sept 2008'
>>> 'sep' in day # False: the test is case-sensitive
>>> 'th Sep' in day # True
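The four operations can be collected in one sketch. The variable names are illustrative, and the last line shows one possible way to make the membership test ignore case.

```python
day = 'Monday 8th Sept 2008'

length = len('abc')           # number of characters
joined = 'abc' + 'def'        # concatenation builds a new string
rule = '-' * 10               # ten copies of '-'

lower_hit = 'sep' in day      # case-sensitive, so no match
exact_hit = 'th Sep' in day   # matches across the space

# Lowercasing first gives a case-insensitive test.
ci_hit = 'sep' in day.lower()
```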
Indexing
Each character in a string can be accessed by its
position (offset), called its index.
>>> S = 'STRINGINPYTHON'
>>> S[14], S[-15] # IndexError: valid offsets are 0 to 13 and -14 to -1
A negative offset counts backward
from the end (offset -x is the xth character from the end).
>>> S[0], S[10], S[13], S[-5], S[-14]
('S', 'T', 'N', 'Y', 'S')
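A brief sketch of how positive and negative offsets relate, using the same string as the notes. The flag name out_of_range is hypothetical.

```python
S = 'STRINGINPYTHON'

first, last = S[0], S[-1]

# A negative offset -x refers to the same item as len(S) - x.
same_char = S[-5] == S[len(S) - 5]

# Offsets outside the valid range raise IndexError.
try:
    S[14]
    out_of_range = False
except IndexError:
    out_of_range = True
```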
Slicing
Slicing allows us to extract an entire section
(substring) in a single step.
str1[offset1:offset2] returns the substring of str1
starting at offset1 (inclusive) and ending at
offset2 (exclusive).
>>> S[1:3] # extract items at offsets 1 and 2
>>> S[1:] # all items past the first
>>> S[:3] # extract items at offsets 0, 1, 2
>>> S[:-1] # fetch all but the last item
>>> S[-1:] # extract the last item
>>> S[:] # a copy of the string
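The slices listed above, evaluated on the string from the indexing slide. This is a sketch; the names on the left are chosen only for readability.

```python
S = 'STRINGINPYTHON'

middle = S[1:3]        # offsets 1 and 2
rest = S[1:]           # everything past the first item
head = S[:3]           # offsets 0, 1, 2
all_but_last = S[:-1]
tail = S[-1:]          # last item, still a string
whole = S[:]

# Unlike indexing, slicing tolerates out-of-range bounds.
clipped = S[10:100]

# Splitting at any offset and rejoining gives back the original.
rejoined = S[:3] + S[3:]
```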
Since Python 2.3, slices accept a
third index: the stride (step).
>>> S = '0123456789'
>>> S[1:10:2]
'13579'
To reverse a string, use a step of -1:
>>> "hello"[::-1]
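A sketch of the stride in use. The helper is_palindrome is hypothetical and is included only to show one application of the reversal idiom.

```python
digits = '0123456789'

odds = digits[1:10:2]     # every second item starting at offset 1
evens = digits[::2]       # every second item starting at offset 0
backward = 'hello'[::-1]  # step -1 walks from the end to the start


def is_palindrome(text):
    """Return True if text reads the same in both directions."""
    return text == text[::-1]
```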
String Conversion
You cannot add a number and a string together.
int(str1) converts the string str1 into an integer.
float(str1) converts the string str1 into a floating-point
number.
Older techniques: the functions string.atoi(str1) and
string.atof(str1).
>>> int("42") + 1, float("42") + 1
str(i) converts the number i to a string (as do backquotes: `i`).
>>> "fall0" + str(8)
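The conversions can be checked with a short sketch, including what happens when a string and a number are mixed without converting. The flag name mixed_ok is illustrative.

```python
total = int("42") + 1      # string to integer, then arithmetic
ratio = float("42") + 1    # string to float, then arithmetic
label = "fall0" + str(8)   # number to string, then concatenation

# Mixing the two types directly raises TypeError.
try:
    "fall0" + 8
    mixed_ok = True
except TypeError:
    mixed_ok = False
```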
Changing Strings
You cannot change a string in place by assigning
a value to an index (S[0] = 'X' raises an error).
To modify a string, create a new one with concatenation and
slicing.
>>> s = 'spam'
>>> s = s + " again" # same as s += " again"
>>> s
>>> s = s[:3] + " is here" + s[-1:]
>>> s
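A sketch of the same steps with the failed assignment made explicit. The names assigned, s2 and changed are hypothetical; s2 is used instead of rebinding s so that each intermediate value stays visible.

```python
s = 'spam'

# Item assignment is not supported on strings.
try:
    s[0] = 'X'
    assigned = True
except TypeError:
    assigned = False

# Each of these expressions builds a new string object.
s = s + ' again'
s2 = s[:3] + ' is here' + s[-1:]
changed = 'X' + s[1:]
```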
Alternatively, format a string.
Formatting Strings
"format string" % "object to insert"
>>> s = "Sales tax"
>>> s = "%s is %d percent!" % (s, 8)
%s string
%d decimal integer
%i integer (%u - unsigned integer)
%o octal integer
%x hex integer (%X - uppercase hex integer)
%e floating-point exponent (%E - uppercase)
%f floating-point decimal (%F - uppercase)
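Several of the codes in the list can be tried in one sketch. The ".2" precision in the %f example is an addition not covered in the notes and is included only to show a common variation.

```python
s = "Sales tax"

# A tuple on the right supplies one value per conversion code.
msg = "%s is %d percent!" % (s, 8)

hex_lower = "%x" % 255
hex_upper = "%X" % 255
octal = "%o" % 8
sci = "%e" % 1234.5

# Assumption beyond the notes: ".2" limits the output to two decimals.
fixed = "%.2f" % 3.14159

print(msg)
```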
String Methods (p.91 table 5-4)
str1.replace(str2, str3) returns a new string in which each occurrence of
str2 in str1 is replaced by str3.
>>> 'string in python'.replace('in', 'XXXX')
str1.find(str2) returns the offset where substring
str2 first appears in str1, or -1 if it is not found.
>>> where = 'string in python'.find('in')
>>> 'string in python'[:where]
An optional third argument to replace limits the number of replacements:
>>> 'in1in2in3in4in5'.replace('in', 'XX', 3)
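The results of these calls, worked through as a sketch. Variable names are illustrative.

```python
text = 'string in python'

# Both occurrences of 'in' are replaced; text itself is left unchanged.
replaced = text.replace('in', 'XXXX')

where = text.find('in')      # offset of the first occurrence
prefix = text[:where]        # everything before it
missing = text.find('zz')    # no match

# A count of 3 replaces only the first three occurrences.
limited = 'in1in2in3in4in5'.replace('in', 'XX', 3)
```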
String Methods
str1.upper(), str1.lower(), str1.swapcase()
str1.count(substr, start, end)
str1.endswith(suffix, start, end)
str1.startswith(prefix, start, end)
str1.index(substr, start, end)
str1.isalnum(), str1.isalpha(), str1.isdigit(),
str1.islower(), str1.isspace(), str1.isupper()
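A sketch exercising the methods listed above on small sample strings. The samples are invented for the example; start and end are optional in each call.

```python
word = 'Python'
cases = (word.upper(), word.lower(), word.swapcase())

fruit = 'banana'
count_all = fruit.count('an')        # non-overlapping occurrences
count_from_2 = fruit.count('an', 2)  # search starts at offset 2
position = fruit.index('na')

name = 'new.txt'
is_txt = name.endswith('.txt')
is_new = name.startswith('new')

# index() is like find() but raises ValueError when there is no match.
try:
    fruit.index('zz')
    found = True
except ValueError:
    found = False

checks = ('42'.isdigit(), 'abc1'.isalnum(), 'abc1'.isalpha(),
          'abc'.islower(), ' \t'.isspace(), 'ABC'.isupper())
```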
String Module
maketrans()/translate()
>>> import string
>>> convert = string.maketrans(" _-", "_-+")
>>> text = "It is a two_part - one_part"
>>> text.translate(convert)
'It_is_a_two-part_+_one-part'
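The notes use string.maketrans, which belongs to Python 2. The sketch below falls back to it only when str.maketrans (the Python 3 spelling) is unavailable, so it should run under either version; the try/except arrangement is an assumption of this example, not something from the notes.

```python
try:
    maketrans = str.maketrans        # Python 3
except AttributeError:
    from string import maketrans     # Python 2

# Each character in the first string maps to the one at the same
# position in the second: ' ' -> '_', '_' -> '-', '-' -> '+'.
convert = maketrans(" _-", "_-+")

text = "It is a two_part - one_part"
result = text.translate(convert)
print(result)
```

All mappings are applied in a single pass, so an underscore produced from a space is not converted again into a hyphen.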
String Module
Constants
digits     '0123456789'
octdigits  '01234567'
hexdigits  '0123456789abcdefABCDEF'
lowercase  'abcdefghijklmnopqrstuvwxyz'
uppercase  'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
letters    lowercase + uppercase
whitespace ' \t\n\r\v\f'
>>> import string
>>> x = raw_input()
>>> x in string.digits # True if x is a substring of '0123456789', e.g. a single digit
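Because in tests for a substring, the check above behaves as expected only for a single character. The sketch below shows the difference and one possible per-character alternative; the helper all_digits is hypothetical, and fixed sample values stand in for raw_input() so the snippet runs without user input.

```python
import string

single = '7' in string.digits    # one digit: found
run = '12' in string.digits      # found only because '12' is contiguous
gap = '13' in string.digits      # digits, but not a substring
empty = '' in string.digits      # the empty string is in every string


def all_digits(text):
    """Return True if text is non-empty and every character is a digit."""
    return len(text) > 0 and all(ch in string.digits for ch in text)
```

For plain ASCII input, all_digits('13') agrees with '13'.isdigit().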