Assignment 3 and 4 Info

Programming Assignment 3
due November 15, 2007
• Write a "proof of concept" Python program
for copying one directory system to
another.
• In your favorite operating system, create a
directory structure.
• Then create a copy of that directory
structure in a new area on disk subject to
restrictions on file names.
• The original directory is called "source1" and the copy to
be created is to be called "dest1"
• Directory and file names in source1 may be of up to 12
characters in length and may be any of a-z, 0-9, +, -, @,
$; in dest1 they may be up to 8 characters in length and
may be any of a-z, 0-9, +, -. (If @ and/or $ are not legal
characters for directory/file names on your system,
choose two other legal characters.)
• The full (relative, starting with "dest1") path name of a
file within dest1 must have length at most 40 (counting
path separators such as "/" or ":")
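The naming restrictions above can be checked mechanically before any copying starts. The following is a minimal Python 3 sketch, not part of the assignment text; the names `SOURCE_NAME`, `DEST_NAME`, `is_valid_source_name` and `is_valid_dest_path` are hypothetical, and "/" is assumed as the path separator.

```python
import re

# Allowed characters and lengths, taken from the restrictions above.
SOURCE_NAME = re.compile(r"[a-z0-9+\-@$]{1,12}")
DEST_NAME = re.compile(r"[a-z0-9+\-]{1,8}")
MAX_DEST_PATH = 40  # includes path separators and the leading "dest1"


def is_valid_source_name(name):
    """True if a single file or directory name is legal in source1."""
    return SOURCE_NAME.fullmatch(name) is not None


def is_valid_dest_path(path, sep="/"):
    """True if a relative path starting with "dest1" obeys the dest1 rules."""
    if len(path) > MAX_DEST_PATH:
        return False
    return all(DEST_NAME.fullmatch(part) for part in path.split(sep))
```

A check like this is useful mainly as a sanity test on the output of the conversion functions.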
file name conversions
• -1- replace @ by -a and $ by -s
• -2- escape + by -+ and - by --
• "w@$h" becomes "w-a-sh"
• "a-b+c" becomes "a--b-+c"
• -3- truncate long names and add unique number to
distinguish between original names that truncate to
same short name
• "washington" becomes "washingt"
• "adams-john" becomes "adams--j"
• "adams-john-quincy" becomes "adams-+1"
• (You may assume that no more than 10 names truncate to the
same 8 character string.)
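The three conversions can be sketched in Python 3 as follows. The function names `escape_name` and `shorten_names` are hypothetical. The numbering scheme (keep the first five characters and append "-+" plus a digit) is an assumption inferred from the single "adams-+1" example; the slides do not spell it out, and this sketch does not check whether a numbered name collides with some other name.

```python
def escape_name(name):
    """Conversions 1 and 2 in a single pass, so that the '-' introduced
    by '@' -> '-a' is not escaped a second time."""
    table = {"@": "-a", "$": "-s", "+": "-+", "-": "--"}
    return "".join(table.get(ch, ch) for ch in name)


def shorten_names(names, limit=8):
    """Conversion 3: map each original name to a name of at most `limit`
    characters, numbering names that truncate to the same string."""
    seen = {}    # truncated name -> how many originals produced it so far
    result = {}  # original name -> converted name
    for name in names:
        short = escape_name(name)[:limit]
        count = seen.get(short, 0)
        seen[short] = count + 1
        if count:
            # Assumed scheme: a single digit suffices because at most
            # 10 names truncate to the same string.
            short = short[:limit - 3] + "-+" + str(count)
        result[name] = short
    return result
```

For example, `shorten_names(["washington", "adams-john", "adams-john-quincy"])` reproduces the three converted names listed above.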
directory name conversions
• If the path name of a file in dest1 is too
long (greater than 40 characters) you must
systematically rename subdirectories to
shorter names.
• dest1/presiden/virginia/18thcent/washingt
• (41 characters)
• becomes
• dest1/presiden/virginia/18thcen/washingt
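One possible "systematic" rule, sketched in Python 3, is to trim characters from the deepest subdirectory first and move upward only if that is not enough. This rule is an assumption that happens to match the example above; the assignment does not prescribe it. The name `fit_path` is hypothetical, and a full solution would also have to apply the same renaming to every other file under the shortened directory and keep sibling names unique.

```python
def fit_path(path, limit=40, sep="/"):
    """Shorten subdirectory names, deepest first, until the whole path
    has at most `limit` characters. The first component ("dest1") and
    the last component (the file name) are left unchanged."""
    parts = path.split(sep)
    excess = len(path) - limit
    i = len(parts) - 2  # index of the deepest directory
    while excess > 0 and i > 0:
        # Keep at least one character of each directory name.
        cut = min(excess, len(parts[i]) - 1)
        parts[i] = parts[i][:len(parts[i]) - cut]
        excess -= cut
        i -= 1
    if excess > 0:
        raise ValueError("path cannot be shortened to %d characters" % limit)
    return sep.join(parts)
```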
provide alternate displays
• Write a set of Python functions which will allow
the user to work with the dest1 directories and
files using the names that were used in source1
(without direct reference to source1)
• Hints: You'll need to build a list or dictionary or ...
of directory names and file names with path
information. Your best strategy may be to
construct the list(s)/dictionary(ies) before
creating anything in dest1.
• See next slide for explanation
(thanks to DJ)
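The hint about building the table first can be sketched in Python 3 with `os.walk`. The function name `build_table` and its `convert` parameter are hypothetical; the default `convert` simply truncates to 8 characters and stands in for the real file name conversion, which also has to escape characters and keep names unique.

```python
import os


def build_table(source_root, dest_root="dest1", convert=lambda name: name[:8]):
    """Map every path under source_root to its planned path under
    dest_root, without creating anything on disk."""
    table = {source_root: dest_root}
    # os.walk is top-down by default, so a directory's own entry is
    # already in the table when its contents are visited.
    for dirpath, dirnames, filenames in os.walk(source_root):
        for name in sorted(dirnames) + sorted(filenames):
            src = os.path.join(dirpath, name)
            table[src] = os.path.join(table[dirpath], convert(name))
    return table
```

Because nothing is written while the table is built, path lengths can be checked and directories renamed before the first file is copied.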
• The program for Assignment 3 should consist of three pieces of functionality:
• 1) A function that, given a source and destination directory, performs the copy with renaming as
described in the previous notes on this assignment.
• 2) A function that, given a path name in an already copied source directory, returns the
destination path name for that file.
• 3) A function that, given a path name in the destination directory of a previous copy, returns the
source path name.
• It is recommended that #2 and #3 involve table lookup of some sort.
• These functions should be wrapped in a menu-driven program that includes an option to quit and that
redisplays the menu after each choice until the quit option is chosen.
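Functions #2 and #3 and the menu loop might look like the Python 3 sketch below. It assumes a dictionary `table` mapping source paths to destination paths has already been built; the names `make_lookups` and `run_menu` are hypothetical. The copy option (#1) is left out to keep the sketch short, and `read` and `write` are parameters only so the loop can be exercised without a keyboard.

```python
def make_lookups(table):
    """Return (to_dest, to_source) lookup functions for a table that
    maps source paths to destination paths."""
    reverse = {dest: src for src, dest in table.items()}
    return table.get, reverse.get


def run_menu(table, read=input, write=print):
    """Show the menu after every choice until the user picks quit."""
    to_dest, to_source = make_lookups(table)
    while True:
        write("1) source path -> destination path")
        write("2) destination path -> source path")
        write("q) quit")
        choice = read("Choice: ").strip().lower()
        if choice == "q":
            break
        elif choice == "1":
            write(to_dest(read("Source path: ").strip()) or "not found")
        elif choice == "2":
            write(to_source(read("Destination path: ").strip()) or "not found")
        else:
            write("unknown choice")
```

Keeping both directions in dictionaries means neither lookup has to touch source1 or dest1 on disk.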
Assignment 3 - directory structure on source (include these)
• /place/usa/westvirginia/monongalia/morgantown
• /place/usa/westvirginia/monongalia/starcity
• /place/usa/westvirginia/monongalia/westover
• /president/washington
• /president/adams+john
• /president/adams+john+q
Assignment 4 - Web Scraping
• You have been hired by the laziest Mountaineer fan ever. He loves Mountaineer
football and Men’s Basketball, but won’t get off his couch to check the schedule (even
if it’s on fire). He wants someone who can generate a local electronic copy of the
2007 – 2008 football and Men’s basketball schedules for him with one simple Python
script he can run from his laptop computer. The program has to go out on the
Internet to fetch the schedule in case the times change, but can use any legitimate
sports website (foxsports.msn.com, espn.com, etc.) or the official WVU site
(www.wvu.edu) in order to generate the results. It should use web scraping
techniques to demonstrate state-of-the-art technology.
• Possible technologies or documents to research for this assignment include:
urllib/urllib2, BeautifulSoup, Dive Into Python Chapter 8.
• This assignment is due no later than 7 p.m. on Thursday, November 29, 2007.
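As a rough illustration of the scraping step, the Python 3 sketch below fetches a page and collects the text of its table rows. It uses only the standard library: `urllib.request` is the Python 3 successor to urllib2, and `html.parser` stands in here for BeautifulSoup. The names `TableRows`, `parse_rows` and `fetch_rows` are hypothetical, and the sketch assumes the schedule is published as an HTML table, which has to be verified against whichever site is chosen.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_rows(html):
    """Return the table rows found in an HTML string as lists of cell text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


def fetch_rows(url):
    """Download a page and return its table rows (requires network access)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return parse_rows(response.read().decode(charset, errors="replace"))
```

The rows returned by `fetch_rows` could then be written to a local text or CSV file to produce the "local electronic copy" the fan asks for.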