Lecture Notes 3


Chapter 5 Strings
CSC1310 Fall 2009
Strings

A string is an ordered collection of characters that
stores and represents text-based information.

Strings in Python are immutable (i.e., they cannot be
changed in place) sequences (i.e., they have a left-to-right order).
Single and Double Quotes
Single and double quotes are interchangeable.
>>> 'Python', "Python"
 Empty literal: '' or "".
 Having both kinds lets you embed a quote character of the
other type inside a string:
>>> "knight's", 'knight"s'
 Python automatically concatenates adjacent
string literals:
>>> "Title" ' of' " the book"

Escape Sequences
An escape sequence is a special code embedded in a
string literal to represent a character that cannot
easily be typed on a keyboard.
 \ followed by one (or more) character(s) is replaced by a
single character in the resulting string.
 \n - Newline
 \t - Horizontal Tab
>>> s = 'a\nb\tc' # 5 characters!
>>> s
'a\nb\tc'
>>> print s
a
b       c

Escape Sequences
 \\ - Backslash
 \' - Single quote
 \" - Double quote
 \a - Bell
 \b - Backspace
 \r - Carriage return
 \xhh - Hex digits value hh
 \0 - Null (binary zero byte)
>>> print 'a\0m\0c' # 5 characters!
amc
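
The character counts claimed above can be checked with len(). This is a minimal sketch, written with print() as a function so that it also runs under Python 3 (an assumption; the notes themselves use Python 2 syntax).

```python
# Each escape sequence is stored as a single character.
s = 'a\nb\tc'
t = 'a\0m\0c'

print(len(s))            # 5: 'a', newline, 'b', tab, 'c'
print(len(t))            # 5: the two null characters count, even if they are not displayed
print('\x41' == 'A')     # True: hex value 41 is the code for 'A'
```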

Raw Strings
>>> print 'C:\temp\new.txt' # \t and \n are read as tab and newline
>>> print 'C:\\temp\\new.txt' # doubled backslashes give the intended path
 A raw string suppresses escape processing.

Format: r"text" or r'text' (R"text" or R'text')
>>> print r'C:\temp\new.txt'
 Raw strings are useful for directory paths and
text pattern matching.
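
To see the difference between the three spellings of the path, one can compare their lengths. A small sketch, assuming Python 3 style print() (the notes use the Python 2 print statement):

```python
plain = 'C:\temp\new.txt'       # \t and \n become a tab and a newline
escaped = 'C:\\temp\\new.txt'   # each \\ is one literal backslash
raw = r'C:\temp\new.txt'        # raw string: backslashes are kept as typed

print(len(plain))        # 13
print(len(raw))          # 15
print(escaped == raw)    # True
```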
Triple Quoted Strings or Block Strings

A block string is a convenient literal format for coding
multiline text data (error messages, HTML or XML code).
Format: """text""" or '''text'''
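
A short illustration of a block string. The message text is made up for the example, and print() is used as a function so that the sketch also runs under Python 3.

```python
# Line breaks inside the triple quotes become newline characters in the string.
msg = """Error: file not found.
Check the "path" and try again."""

print(msg)
lines = msg.split('\n')
print(len(lines))        # 2
```

Both single and double quote characters can appear inside the block without escaping.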
Unicode Strings
Unicode ("wide" character) strings are used to
support non-Latin characters that require more than
one byte in memory.
 Format: u"text" or u'text' (U"text" or U'text')
 An expression that mixes Unicode and normal strings has
a Unicode string as its result.
>>> 'fall' + u'08'
u'fall08'
>>> str(u'fall08'), unicode('fall08')
('fall08', u'fall08')

Basic operations: len(), +, *, in
The len(str) function returns the length of the string str.
>>> len('abc')
 str1 + str2 (concatenation) creates a new string by
joining operands str1 and str2.
>>> 'abc' + 'def', len('abc' + 'def')
 str * i (repetition) creates a new string made of i copies of str.
>>> print '-' * 80
 str1 in str2 (membership) returns True if str1 is a
substring of str2; otherwise, it returns False.
>>> day = 'Monday 8th Sept 2008'
>>> 'sep' in day
>>> 'th Sep' in day
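
The results of the operations above, as a sketch that should run as written (print() is used as a function, which assumes Python 3 or a compatible Python 2 setup):

```python
day = 'Monday 8th Sept 2008'

print(len('abc'))            # 3
print('abc' + 'def')         # abcdef
print(len('-' * 80))         # 80
print('sep' in day)          # False: the test is case-sensitive
print('th Sep' in day)       # True
```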

Indexing
Each character in the string can be accessed by its
position (offset), called its index.
>>> S = 'STRINGINPYTHON'

>>> S[14], S[-15] # both are out of range and raise IndexError
 A negative offset can be viewed as counting backward
from the end (offset -x is the xth character from the end).
>>> S[0], S[10], S[13], S[-5], S[-14]
('S', 'T', 'N', 'Y', 'S')
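
One way to read negative offsets is that S[-x] names the same character as S[len(S) - x]. The sketch below checks this for every valid x and shows what happens just past the end; the variable names are chosen for illustration only.

```python
S = 'STRINGINPYTHON'
n = len(S)               # 14, so valid offsets are 0..13 and -14..-1

# S[-x] and S[n - x] refer to the same character for x = 1..n
same = all(S[-x] == S[n - x] for x in range(1, n + 1))
print(same)              # True

try:
    S[14]                # one past the last valid offset
except IndexError as exc:
    error = str(exc)
    print(error)
```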

Slicing
Slicing allows us to extract an entire section
(substring) in a single step.
 str1[offset1:offset2] returns the substring of str1
starting at offset1 (inclusive) and ending at
offset2 (exclusive).
>>> S[1:3] # extract items at offsets 1 and 2
>>> S[1:] # all items past the first
>>> S[:3] # extract items at offsets 0, 1, 2
>>> S[:-1] # fetch all but the last item
>>> S[-1:] # extract the last item
>>> S[:] # a copy of the string
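
Applied to the string from the indexing example, the slices above give the following results. A minimal sketch using print() as a function (a Python 3 assumption):

```python
S = 'STRINGINPYTHON'

print(S[1:3])            # TR
print(S[1:])             # TRINGINPYTHON
print(S[:3])             # STR
print(S[:-1])            # STRINGINPYTHO
print(S[-1:])            # N
print(S[:] == S)         # True

# For any offset i, the two halves of a split always rebuild the string.
print(S[:5] + S[5:] == S)   # True
```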

In Python 2.3 and later
 Third index: stride (step)
>>> S = '0123456789'
>>> S[1:10:2]
'13579'
 To reverse a string, use step = -1
>>> "hello"[::-1]
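
A few more stride examples, as a sketch (the name digits is chosen here for illustration):

```python
digits = '0123456789'

print(digits[1:10:2])    # 13579: every second item starting at offset 1
print(digits[::2])       # 02468: every second item of the whole string
print('hello'[::-1])     # olleh: a negative step walks right to left
```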

String Conversion
You cannot add a number and a string together.
 int(str1) converts the string str1 into an integer.
 float(str1) converts the string str1 into a floating-point
number.
 Older techniques: functions string.atoi(str1) and
string.atof(str1).
>>> int("42") + 1, float("42") + 1
 str(i) converts the number i to a string (older form: `i`)
>>> "fall0" + str(8)
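
The sketch below shows the error raised by mixing the two types and the explicit conversions that avoid it. It uses print() as a function, which assumes Python 3 style.

```python
try:
    total = '42' + 1             # mixing str and int is not allowed
except TypeError:
    total = None
    print('cannot add a string and a number')

print(int('42') + 1)             # 43
print(float('42') + 1)           # 43.0
print('fall0' + str(8))          # fall08
```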

Changing Strings
You cannot change a string in place by assigning a
value to an index (S[0] = 'X' raises an error).
 To modify: create a new string with concatenation and
slicing.
>>> s = 'spam'
>>> s = s + " again" # same as s += " again"
>>> s
>>> s = s[:3] + " is here" + s[-1:]
>>> s
 Alternatively, format a string.
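
Traced step by step, the example above behaves as follows. A minimal sketch; the flag name assignment_failed is introduced only for this illustration.

```python
s = 'spam'

try:
    s[0] = 'X'                   # in-place change is rejected
except TypeError:
    assignment_failed = True

s = s + ' again'                 # builds a new string and rebinds the name s
print(s)                         # spam again

s = s[:3] + ' is here' + s[-1:]  # first three characters + new text + last character
print(s)                         # spa is heren
```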

Formatting Strings
"format string" % "object to insert"
>>> s = "Sales tax"
>>> s = "%s is %d percent!" % (s, 8)
 %s string
 %d decimal integer
 %i integer (%u - unsigned integer)
 %o octal integer
 %x hex integer (%X - uppercase hex integer)
 %e floating-point exponent (%E - uppercase)
 %f floating-point decimal (%F - uppercase)
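
Some of the conversion codes listed above in action. The sample values (255, 1234.5, 2.5) are arbitrary choices for this sketch.

```python
s = 'Sales tax'
line = '%s is %d percent!' % (s, 8)
print(line)                              # Sales tax is 8 percent!

print('%o %x %X' % (255, 255, 255))      # 377 ff FF
print('%e' % 1234.5)                     # 1.234500e+03
print('%f' % 2.5)                        # 2.500000
```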

String Methods (p.91 table 5-4)
str1.replace(str2, str3) returns a new string in which each
occurrence of str2 in str1 is replaced by str3.
>>> 'string in python'.replace('in', 'XXXX')
 str1.find(str2) returns the offset where substring
str2 first appears in str1, or -1 if it is not found.
>>> where = 'string in python'.find('in')
>>> 'string in python'[:where]
 An optional third argument to replace limits the number of replacements.
>>> 'in1in2in3in4in5'.replace('in', 'XX', 3)
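
The results of these calls, as a sketch that should run as written (the search for 'java' is an added illustration of the -1 case):

```python
text = 'string in python'

print(text.replace('in', 'XXXX'))                  # strXXXXg XXXX python
where = text.find('in')
print(where)                                       # 3
print(text[:where])                                # str
print(text.find('java'))                           # -1: substring not found
print('in1in2in3in4in5'.replace('in', 'XX', 3))    # XX1XX2XX3in4in5
print(text)                                        # unchanged: string in python
```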
String Methods





 str1.upper(), str1.lower(), str1.swapcase()
 str1.count(substr, start, end)
 str1.endswith(suffix, start, end)
 str1.startswith(prefix, start, end)
 str1.index(substr, start, end)
 str1.isalnum(), str1.isalpha(), str1.isdigit(),
str1.islower(), str1.isspace(), str1.isupper()
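
A quick tour of the methods listed above on a sample string. The sample values are chosen for this sketch only; note that index() behaves like find() but raises ValueError instead of returning -1.

```python
s = 'String In Python'

upper = s.upper()                # 'STRING IN PYTHON'
swapped = s.swapcase()           # 'sTRING iN pYTHON'
n_count = s.count('n')           # 3
starts = s.startswith('Str')     # True
ends = s.endswith('on')          # True
pos = s.index('In')              # 7

try:
    s.index('Java')              # not present
except ValueError:
    missing = True

checks = ('abc123'.isalnum(), 'abc123'.isalpha(), '123'.isdigit())
print(checks)                    # (True, False, True)
```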
String Module
maketrans()/translate()
>>> import string
>>> convert = string.maketrans(" _-", "_-+")
>>> input = "It is a two_part - one_part"
>>> input.translate(convert)
'It_is_a_two-part_+_one-part'
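
The example above uses the Python 2 string module. As a sketch under the assumption that Python 3 is being used instead, the same translation table can be built with str.maketrans(); the names table and text are chosen here for illustration.

```python
# Python 3 variant (assumption): maketrans is a static method of str.
# Each character of the first argument maps to the character at the
# same position in the second: ' ' -> '_', '_' -> '-', '-' -> '+'.
table = str.maketrans(' _-', '_-+')

text = 'It is a two_part - one_part'
result = text.translate(table)
print(result)                    # It_is_a_two-part_+_one-part
```

All replacements happen in a single pass, so a space that becomes '_' is not then turned into '-'.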

String Module

Constants







 digits '0123456789'
 octdigits '01234567'
 hexdigits '0123456789abcdefABCDEF'
 lowercase 'abcdefghijklmnopqrstuvwxyz'
 uppercase 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
 letters lowercase + uppercase
 whitespace '\t\n\x0b\x0c\r ' (tab, newline, vertical tab, form feed, carriage return, space)
>>> import string
>>> x = raw_input()
>>> x in string.digits
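
Because in tests for a substring, x in string.digits is only a reliable digit check when x is a single character. The sketch below shows the edge cases; all_digits is a hypothetical helper written for this illustration, not part of the string module.

```python
import string

def all_digits(text):
    """Hypothetical helper: True if text is non-empty and every character is a digit."""
    return len(text) > 0 and all(ch in string.digits for ch in text)

print('7' in string.digits)      # True
print('12' in string.digits)     # True: '12' is a substring of '0123456789'
print('21' in string.digits)     # False, although both characters are digits
print('' in string.digits)       # True: the empty string is a substring of any string
print(all_digits('21'))          # True
print(all_digits(''))            # False
```

For whole strings, the isdigit() method listed earlier is usually the simpler choice.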