data - ipt12

Information Systems
&
Databases
1
2.1 Information Systems
Look at Integrated Information
Processes
2
I/O
(Internal)
Input
Store
&
Retrieve
Process
Organise
Analyse
Transmit
&
Receive
3
Understand
the Problem
External
I/O
Display
Collect
Modem
NIC
Process
Organise
Analyse
Information
Processes
Output
Processor
Hardware
Printer
Monitor
Speaker
Scanner
Keyboard
Microphone
CD
Disc
Memory
Input
&
Output
Interdependence
of
Information Processes
Store
&
Retrieve
Organise
Retrieve
Format doc
Analyse
Transmit
&
Receive
4
Understand
the Problem/Make Decision
Print
Spellcheck
Display
Collect
Enter data into WP
Retrieve & Fax
Save document
Process
Word Processing
Context Diagram
OCR
Word
Process
Input
Text
Output
Printer
document
Data Flow Diagram
Retrieve
Store
&
Retrieve
Process
Enter data into WP
Retrieve & Fax
Organise
Print
Format doc
Save
User
Folder
Spellcheck
Analyse
Transmit
&
Receive
Word
Process
Input
Text
OCR
Display
Save document
Collect
Word
Processing
Print
Printer
document
Retrieve
Document
Detailed level Data Flow Diagram
OCR
Input
Text
Create
doc
Save
Para
graphs doc
Edit
doc
Corrected
Draft
Print
doc
Three
Copies
Printer
Draft
pages
Final Draft
User
Folder
3rd level Data Flow Diagram
Memory
keystrokes
OCR
Input
Text
Memory
sentences
Create
doc
Errors
Save
Para
graphs doc
Corrected
Draft
Corrections
Edit
doc
Print
doc
Three
Copies
Printer
Draft
pages
5
Final Draft
User
Folder
Design Solution
3rd level Data Flow Diagram
Retrieve
Memory
keystrokes
Display
Retrieve & Fax
Save document
Collect
Store
&
Process
RetrieveSource
Enter data into WP
OrganiseScanPrint
Formatdocdoc
Spellcheck
VDU
Check
Transmit
AnalyseOCR
&
Receive
Scan
OK?
OCR Input
Text
Memory
sentences
Create
doc
Errors
Para Save
doc
graphs
pages
Corrections
Corrected Edit
doc
Draft
Draft
Print
doc
Three Printer
Copies
Final Draft
User
Folder
No
Yes
Draft
Save
Input
OCR
File
Edit
Input
OCR
File
Final
Draft
Fax
Document
Save
Draft
Via type 3 modem
Fax
Draft
Edit/save
Draft
How
Display?
Print
Final
Draft
6
Final
Draft
Print
Document
3 copies
System
Flow
Chart
Design Solution
Interdependence
of
Information Processes
Store
&
Retrieve
Organise
Retrieve
Format doc
Analyse
Transmit
&
Receive
7 & Evaluate
Test
Print
Spellcheck
Display
Collect
Enter data into WP
Retrieve & Fax
Save document
Process
Word Processing
Characteristics of an
information system
8
•Organising, or
•Analysing
•Sorting
•Grouping
•Indexing
•Summarising
•Listing
•Tabling
•Synthesis
•Hierarchy (priority listing)
•Questioning (querying)
•Reporting
Types of information systems
• Process Transactions
– Transaction Processing System (TPS)
• Information about an organisation
– Management Information System (MIS)
• Help make decisions
– Decision Support Systems (DSS)
– Expert System
• Manage information within organisation
– Office Automation Systems (OAS)
– Multimedia System
– Database Management System (DBMS)
9
Comparing Systems
Type | Purpose | Examples
TPS | Process transactions | Banking; POS; reservations; loans
MIS | Information about an organisation | Accounting; ordering; budgeting; production
DSS | To help make decisions | Expert system; data mining; risk analysis; modelling; simulations
OAS | Office administration | Personal productivity, e.g. MS Office; groupware: email, calendars
Representing an information
system
Purpose
• Who is it for?
• Need(s) they have
Information System
Information Processes
C O A S T/- P D
PEOPLE
Participants
Users
Others
11
Data(Input)
Information
(Output)
Information
Technology
H/W
S/W
Comms
2.2 Database information
systems
Examples
12
School Databases
• Represent a School Information System
diagrammatically showing:
– Purpose
– Processes
– People (Participants/Users)
– Data/Information
– Information Technology
• Describe the relationships between the
participants, data/information and information
technology
School Administration Database
• Purpose
– The efficient and effective operation of the school
• Processes
– Timetabling; staff allocations; take roll; record marks; s
• People (Participants/Users)
– Teachers; students; administrators; clerks; assistants;
technical support; gardeners; executive
• Data/Information
– Classrooms; subjects; class rolls; equipment; books;
timetable; attendance; marks; names & addresses
• Information Technology
14
– Keyboard, monitor, printer, network, file server
– School Administration System; Library system
RTA
Police
Driving
Inspectors
Vehicle
Inspectors
RTA
System
Cashiers
Managers
Vehicles
Drivers
15
Registration
Clerks
RTA ?
Purpose
• Who is it for?
• Need(s) they have
Information System
Information Processes
Participants
& Users
16
Data/
Information
Information
Technology
Video Store Database
Purpose
• Who is it for?
• Need(s) they have
Information System
Information Processes
Participants
& Users
17
Data/
Information
Information
Technology
Video Store System
Information Processes
C: Scan video/card
S: save database
T: Transfer data
A: Search database
A: Update data
D: Display information
Participants
Manager
Staff
USERS
Customers
OTHERS
Suppliers
18
Data/
Member data
Video data
Barcodes
Information
Receipt
Information
Technology
H: Personal Computer
H: Barcode reader
S: Database
C: Phone
2.3 Organisational Methods
19
Organising data
Non-computer methods
• Organising is the structuring of data
• Manual type databases
– Telephone book
– Dictionaries
– Recipe card files
– Encyclopedia
– To do list
Advantages & Disadvantages
of computer-based methods of organising data
Advantages
• Faster
• Data does not have to be in set order (flexible)
• Data management is easier
• Data exchange possible
• Large storage
Disadvantages
• Need a computer
• Training usually required
• Security e.g during exchange of data
21
Computer based methods
of organising data
• Flat file systems
• Database management systems
• Hypermedia
22
Flat-file databases
• Data is stored and retrieved from a single
Table
• Database file contains multiple records
• Each record contains many fields
• Each field contains data (characters)
• Ability to search the database using data in
any field (key field or search field)
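A minimal sketch of the flat-file idea, assuming a single table held as a list of records. The field names and sample rows are hypothetical and only illustrate searching on any field.

```python
# A flat-file database sketched as one table: a list of records,
# where each record maps field names to data.
# The field names and sample rows are hypothetical.
dogs = [
    {"Name": "Spotty", "Breed": "Bassett", "YearBorn": 1998},
    {"Name": "Rex", "Breed": "Kelpie", "YearBorn": 2001},
    {"Name": "Molly", "Breed": "Bassett", "YearBorn": 2000},
]


def search(table, field, value):
    """Return every record whose chosen search field equals value."""
    return [record for record in table if record[field] == value]


# Any field can act as the search field.
bassetts = search(dogs, "Breed", "Bassett")
print([record["Name"] for record in bassetts])
```

Because there is only one table, every search is a scan over the same list of records.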
23
Flat-file database
Field
File
•1
•2
•3
Records
24
•4
•5
•John
Database
25
Definitions
•Level
•Description
•File
•Block of data
•Record
•Collection of data about
one entity
•Specific aspect, data of
same type
•Data unique to record
•Spotty’s
•Individual letter or
number, etc
•Spotty
•Field
•Key field
•Character
26
•Example
•(Vet’s DB)
•Dogs
•Spotty, Bassett, 1998,
Y, N, Jan03
•Year Born = 1998
DBMS
• Data Base Management System
• Software package that allows users to
access a database
• E.g Microsoft Access; Oracle 8i; Informix;
IBM DB2, SQL Server
• The data is stored in a database
• DBMS database can have different inherent
structures
27
DataBase Structures
The common database structures
• Single Table
– Flat-file
• Multiple Tables
– Hierarchical
– Network
– Relational
28
Hypermedia c.f. Databases
Hypermedia
• Many files & types
• Many locations
Distributed DB
• Many files
• One file type
• Many locations
• Many tables
• Synchronisation
29
Flat-file DB
• One file & type
• One location
• One ‘table’
DBMS
• One file & type
• One location
• Many tables
Hierarchical DB Structure
Organises data as a series of levels.
Top down structure (nodes and branches)
Each node can have many branches
30
Each lower level node (child) is linked to a single higher order
node (parent)
Network DB Structure
Data organised as series of nodes linked by branches
Each node can have many branches
Low-level node (child) can link to more than one high-level node (parent)
31
Relational DB Structure
Organises data in a series of related tables
Relationships established between the tables to
provide a flexible way of storing and accessing the
data
32
DB Structures Compared
Hierarchical
33
Network
Relational
Type of DB Keys
DB Key is a field that allows data to be accessed; any
field can be used as a key field
• Single key - uses one field
• Composite key - a group of two or more keys that can
be used to uniquely identify a record in the database
• Primary key - data unique to DB
• Secondary key - not necessarily unique (can be a
single key or a composite key)
• Foreign Key (relational databases) Primary key from
one table used to create relationship with another entity
34
DB Keys
Name, Address, Gender, Age
J Smith, 3 River Rd, F, 20
J Smith, 3 River Rd, M, 30
J Smith, 3 River Rd, M, 50
J Smith, 16 East St, M, 30
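The sample records above can be used to show why a composite key is sometimes needed. The sketch below is an illustration only: the helper names `is_unique` and `candidate_keys` are hypothetical, and the check simply tests which field combinations hold a different value in every record.

```python
from itertools import combinations

fields = ("Name", "Address", "Gender", "Age")
records = [
    ("J Smith", "3 River Rd", "F", 20),
    ("J Smith", "3 River Rd", "M", 30),
    ("J Smith", "3 River Rd", "M", 50),
    ("J Smith", "16 East St", "M", 30),
]


def is_unique(columns):
    """True if the chosen column positions identify every record uniquely."""
    values = [tuple(row[i] for i in columns) for row in records]
    return len(set(values)) == len(values)


def candidate_keys(size):
    """List the field combinations of a given size that are unique."""
    return [
        tuple(fields[i] for i in columns)
        for columns in combinations(range(len(fields)), size)
        if is_unique(columns)
    ]


print(candidate_keys(1))  # single keys
print(candidate_keys(2))  # composite keys made of two fields
```

For these four records no single field is unique, while Address combined with Age is, so a composite key is required.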
Relational Database Schemas
• SCHEMA
– Organised plan of the entire database
– Shows how and where data is found
– Describes data and the data’s logical relationships
• DB Schema consists of:
– Entities
– Attributes
– Relationships
37
ERD
• Entity-Relationship Diagram
• Graphical method of identifying entities and
showing relationships
• A type of Data Modelling Tool
Primary Key
38
•CUSTOMER
•PETS
•Customer_Number
•Pet_ID
•Customer_Type
•Customer_Number
•Customer_Name
•Pet_Name
•Street
•Animal_Type
•Suburb
•Breed
•State
•DOB
Foreign Key
DB Definitions
• Entity
– Specific thing about which information is
collected e.g a company, a person, a product
• Attribute
– Defined property of an entity e.g for a entity
(student) attributes could be Surname, Given
Name, DOB, Courses Taken etc
– In a flat-file database an attribute equates to a
field
• Relationship
39
Data Dictionary
• Data dictionaries hold METADATA (data
about data in the database)
• Used to design & manage the database
• Known as a Data Modelling tool
• Data Dictionary contains
– Table Name
– Attribute Name (Field Name)
– Field Data Type
– Field Size
– Field Description/example (purpose)
DB Relationships
• The ways the entities in the database are
related
– One-to-one
– One-to-many
– Many-to-one
– Many-to-many
One-to-one
42
•CUSTOMER
•PETS
•Customer_Number
•Pet_ID
•Customer_Type
•Customer_Number
•Customer_Name
•Pet_Name
•Street
•Animal_Type
•Suburb
•Breed
•State
•DOB
One-to-many
•VISIT
•PETS
•Pet_ID
•Customer_Number
•Pet_Name
•Animal_Type
•Pet_ID
•Visit_Date
•FollowUp_Date
•Amount
•VISIT
•Breed
•Pet_ID
•DOB
•Visit_Date
•FollowUp_Date
•Amount
43
Many-to-many
Suppliers
Repairers
Repairer_ID
Name
Street
Suburb
Postcode
Supplier_ID
Name
Street
Suburb
Postcode
Supplier_ID
Name
Street
Suburb
Postcode
Repairers
Repairer_ID
Name
Street
Suburb
Postcode
44
Suppliers
Suppliers
Supplier_ID
Name
Street
Suburb
Postcode
Flat-file c.f. Relational
Feature | Flat-file DB | Relational DB
# files | One | One
# tables | One | Many
Redundancy | High | Low
Processing | Simple | Complex
Relationships | Nil | Yes
Which to choose?
• Consider a variety of scenarios
• What factors would influence your decision
when choosing either a computer-based or non-computer-based method to organise data?
46
Which database?
• Flat-file for simple tasks, a single entity
and when redundancy not an issue
• Relational for complex tasks or where
multiple entities are involved
47
Flat-file database
• A relatively simple database system in which each
database is contained in a single table.
• In contrast, relational database systems can use
multiple tables to store information, and each table
can have a different record format.
48
Database Management Systems
• A collection of programs that enables you to store, modify,
and extract information from a database. There are many
different types of DBMSs, ranging from small systems that
run on personal computers to huge systems that run on
mainframes. Examples of database applications are:
– computerized library systems
– automated teller machines
– flight reservation systems
– computerized parts inventory systems
• The terms relational, network, flat, and hierarchical all refer
to the way a DBMS organizes information internally. The
internal organization can affect how quickly and flexibly you
can extract information.
Advantages/Disadvantages
DBMS
ADVANTAGES
• Data independence
• Reduced data
redundancy
• Easier to maintain data
integrity
• Easier data security
• Economy of scale
50
DISADVANTAGES
• Larger file sizes
• Higher cost
• More hardware
required
• Higher impact of
failure
Data Modelling Tools
• Data Modelling: the process of identifying
entities, attributes and relationships
• Data Modelling Tools include:
– Entity Relationship Diagrams (ERDs)
– Data Dictionary
54
Database Normalisation
• Process of organising data to minimise data
redundancy
• Relates to relational databases only
• Involves dividing a database into two or
more tables and defining the relationship
between the tables
• Objective: isolate data so that additions,
deletions and modifications of a field can be
made in just one table and then propagated
through the database via defined
relationships
55
‘Zero Normal Form - ZNF’
• Let’s say we want to store some personal
bookmarks (URLs) for a number of users
• ZNF: because none of the rules of normalisation
have been applied
• If we want more URLs, we would need to add
more columns - this means data input screen
would need to change every time we added an
extra URL
56
First Normal Form
1. Eliminate repeating groups in individual tables.
2. Create a separate table for each set of related data.
3. Identify each set of related data with a primary key.
• We've solved the problem of URL field limitation.
• But look at the headache we've now caused ourselves.
• Every time we input a new record we've got to duplicate company and user name.
• Our database will grow much larger than we'd ever want it to.
• We could easily begin corrupting our data by misspelling redundant information.
57
Second
Normal Form
1. Create separate tables for
sets of values that apply to
multiple records.
2. Relate these tables with a
foreign key.
• We're in much better shape.
• But what happens when we want to add another employee of company
ABC? Or 200 employees?
• Now we've got company names and addresses duplicating themselves, a
situation just rife for introducing errors into our data.
58
Third
Normal Form
1. Eliminate fields that
do not depend on the
key.
• Users and urls tables can
grow without unnecessary
duplication or corruption of
data.
• Most developers will say
the Third Normal Form is far
enough - and in most cases
they would be correct.
• But look at our url fields do you notice the duplication
of data?
59
Fourth
Normal Form
1. In a many-to-many
relationship,
independent
entities can not be
stored in the same
table.
60
Fifth Normal Form
1. The original table must be reconstructed from the tables
into which it has been broken down.
61
•
There is one more form of normalization which is sometimes applied,
but it is indeed very esoteric and is in most cases probably not
required to get the most functionality out of your data structure or
application. Its tenet suggests:
•
The benefit of applying this rule ensures you have not created any
extraneous columns in your tables, and that all of the table structures
you have created are only as large as they need to be. It's good
practice to apply this rule, but unless you're dealing with a very large
data schema you probably won't need it.
Data Relationships
• One-to-one, one-to-many, and many-to-many.
• Look at the users table in the First Normal Form example
above. For a moment let's imagine we put the url fields in a
separate table, and every time we input one record into the
users table we would input one row into the urls table. We
would then have a one-to-one relationship: each row in the
users table would have exactly one corresponding row in the
urls table. For the purposes of our application this would
neither be useful nor normalized.
• Now look at the tables in the Second Normal Form example.
Our tables allow one user to have many urls associated with
his user record. This is a one-to-many relationship.
• The many-to-many relationship, however, is slightly more
complex. Notice in our Third Normal Form example we have
one user related to many urls. As mentioned, we want to
change that structure to allow many users to be related to
many urls, and thus we want a many-to-many relationship.
Let's take a look at what that would do to our table structure
before we discuss it:
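One common way to hold a many-to-many relationship is a linking table with one row per pairing. The sketch below assumes this approach and uses Python's built-in sqlite3 module. The table names, column names and sample rows are hypothetical, not taken from the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE urls  (url_id  INTEGER PRIMARY KEY, url  TEXT);
    -- Linking table: one row per (user, url) pairing.
    CREATE TABLE user_urls (
        user_id INTEGER REFERENCES users(user_id),
        url_id  INTEGER REFERENCES urls(url_id),
        PRIMARY KEY (user_id, url_id)
    );
    INSERT INTO users VALUES (1, 'Joe'), (2, 'Jill');
    INSERT INTO urls  VALUES (1, 'abc.com'), (2, 'xyz.com');
    INSERT INTO user_urls VALUES (1, 1), (1, 2), (2, 1);
""")


def urls_for(name):
    """All urls bookmarked by one user."""
    rows = conn.execute(
        """SELECT urls.url FROM urls
           JOIN user_urls ON user_urls.url_id = urls.url_id
           JOIN users ON users.user_id = user_urls.user_id
           WHERE users.name = ? ORDER BY urls.url""",
        (name,),
    )
    return [url for (url,) in rows]


def users_for(url):
    """All users who bookmarked one url."""
    rows = conn.execute(
        """SELECT users.name FROM users
           JOIN user_urls ON user_urls.user_id = users.user_id
           JOIN urls ON urls.url_id = user_urls.url_id
           WHERE urls.url = ? ORDER BY users.name""",
        (url,),
    )
    return [name for (name,) in rows]


print(urls_for("Joe"))
print(users_for("abc.com"))
```

Each url is stored once in `urls`, and only the small key pairs in the linking table are repeated.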
62
Database Keys
• In database management systems, a key is a field that you use to
sort data.
• It can also be called a key field , sort key, index, or key word.
• For example, if you sort records by age, then the age field is a
key.
• Most database management systems allow you to have more than
one key so that you can sort records in different ways.
• One of the keys is designated the primary key, and must hold a
unique value for each record.
• A key field that identifies records in a different table is called a
foreign key.
63
Sorting & Searching
• SORT: ascending (A-Z; 0-9) or descending
(Z-A; 9-0)
– Multiple sorts
• SEARCH
– Query the database using Structured Query
Language (SQL)
64
SQL Syntax
Keyword | Source | Description | Example
SELECT | Field(s) | What is to be displayed | Surname
FROM | Table(s) | Tables the fields are to come from | Students
WHERE | Operator, data | Search criteria | Gender='F'
ORDER BY | Field(s) | Order of display | Surname ASC
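The four keywords combine into a single query. The sketch below runs the example values (Surname, Students, Gender='F', Surname ASC) against an in-memory SQLite database using Python's built-in sqlite3 module. The student rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (Surname TEXT, Gender TEXT)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?)",
    [("Nguyen", "F"), ("Adams", "F"), ("Brown", "M")],  # hypothetical rows
)

query = """
    SELECT Surname          -- what is to be displayed
    FROM Students           -- table the field comes from
    WHERE Gender = 'F'      -- search criteria
    ORDER BY Surname ASC    -- order of display
"""
surnames = [row[0] for row in conn.execute(query)]
print(surnames)
```

Only the female students are returned, sorted in ascending order of surname.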
Operators
Type | Operator | Description | Example
Relational | > | Greater than | A>B
Relational | <= | Less or equal | A<=B
Relational | Contains | Criteria | Contains 'a'
Logical | AND | Both must be true | A AND B
Logical | OR | Either true | A OR B
Logical | NOT | False | A NOT B
Sizing a database
- Formal method
1) Determine how many fields there are on the schema.
2) Next determine the datatypes of each of the fields -record the field length
3) Next, locate information on these datatypes storage requirements.
4) Identify any Diary fields and fields > 255 characters in length
5) Add up the byte values for each field's storage requirements.
6) Calculate for diary fields or fields > 255 characters in length,
7) Determine if you have any indexes built on any fields in the schema. If so, multiply the value of
the storage length of that field's datatype by 1.5.
8) Add up all the byte values you determined and recorded in the previous steps. This is the base
value for how much space each ticket or record requires in your database.
9) Estimate how many tickets a day will be entered into the system. Multiply that number by the
value you got in step 8.
10) Estimate how many days a year your system will be in production. Use a figure of 200 days for
normal businesses (taking into account holidays, weekends, etc.) or 365 if operating a 24x7
production system. Multiply this by the value you got in step 9.
11) Multiply the value in step 10 by 1.2 (this adds 20% to account for system "slop" and other
miscellaneous overhead). This value now represents approximately how much disk storage it will
take to accommodate this schema's growth for the next year. If you have multiple schemas, you will
need to do all the previous calculations for each schema.
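The formal method can be sketched as a small function. This is a rough illustration under stated assumptions: the function name, parameter names and field sizes are hypothetical, an indexed field is assumed to cost 1.5 times its storage length, and the allowance for diary or long text fields (step 6, which the steps do not spell out) is left for the caller to supply.

```python
def yearly_bytes(field_bytes, indexed, large_field_bytes,
                 records_per_day, days_per_year=200):
    """Estimate a year's disk usage for one schema, in bytes.

    field_bytes: storage size of each ordinary field (steps 1 to 5)
    indexed: names of the fields that carry an index (step 7)
    large_field_bytes: allowance for diary or long text fields (step 6);
        the calculation is not given, so the caller supplies a figure
    """
    per_record = large_field_bytes
    for name, size in field_bytes.items():
        # Assumption: an indexed field costs 1.5 times its storage length.
        per_record += size * 1.5 if name in indexed else size
    per_day = per_record * records_per_day   # step 9
    per_year = per_day * days_per_year       # step 10
    return per_year * 1.2                    # step 11: add 20% overhead


# Hypothetical schema: three small fields, one of them indexed.
estimate = yearly_bytes(
    field_bytes={"id": 4, "surname": 30, "status": 16},
    indexed={"id"},
    large_field_bytes=0,
    records_per_day=100,
)
print(round(estimate))
```

With multiple schemas, the function would be called once per schema and the results added together.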
Sizing a database
- Quick & Dirty Method
68
•
1) For every 20 or so fields on a schema use a base value of about 1K per ticket. For
example, if your schema has 60 fields, figure a base of 3K per ticket. If there are any
indexed fields, add another 0.5k to the base value.
•
2) If there are large text fields or diary fields on the schema, add up all their lengths,
multiply by 1/3 and add this to the base total obtained in step 1. For example, my
schema has 60 fields, of which 5 are fields > 255 characters in length. I use a base of
3.5K per ticket (it has some indexes). I add up the lengths of each of the 5 large text
fields (10K, 20K, 5K, 13K, 10K - total of 58K) and multiply by 1/3 for a total of about
19.3K. Add this to the base of 3.5K. This gives a value of about 22.8K per ticket or
record.
•
3) Now multiply the value you obtained in step 2 by the number of tickets a day you
expect to enter into the system.
•
4) Now multiply the value in step 3 by the number of business days per year (200 for
normal businesses, 365 for 24x7 shops). This is the ballpark figure for how much disk
space your schema will need to store a year's worth of data.
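The quick method is simple arithmetic, so the worked example (60 fields, some indexes, five large text fields) can be checked with a short sketch. The function names are hypothetical, and the figure of 50 tickets a day is an assumed value used only to finish the calculation.

```python
def kb_per_record(field_count, has_indexes, large_field_kb):
    """Step 1 and step 2: base value plus one third of large field lengths."""
    base = field_count / 20          # about 1K for every 20 fields
    if has_indexes:
        base += 0.5                  # extra 0.5K when fields are indexed
    return base + sum(large_field_kb) / 3


def kb_per_year(per_record_kb, records_per_day, days_per_year=200):
    """Step 3 and step 4: scale up to a year's worth of records."""
    return per_record_kb * records_per_day * days_per_year


per_record = kb_per_record(60, True, [10, 20, 5, 13, 10])
print(round(per_record, 1))                   # about 22.8K, as in the example
print(round(kb_per_year(per_record, 50)))     # 50 tickets a day is assumed
```

Use 365 for `days_per_year` when sizing a 24x7 system.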
References
• www.webopedia.com
– Database, databases - normalization
• www.phpbuilder.com/columns/barry20000731.php3
– Database Normalization And Design Techniques
– Barry Wise (itcn.com)
– INT Media Group - 2002
• support.microsoft.com
• www.devshed.com
69
2.5 Other Information Processes
DISPLAYING - for database
information systems
70
Database Views
• Data in a database can be viewed in different
ways for different purposes
View | Purpose
Form view | Record displayed as though it were on paper (can be used to delete, edit and add data).
Report view | Displays only the information that will be printed.
Query view | Shows only the data that answers the question.
Database
Reports
Typically from a
Report Generator
72
• a program, usually part of a database management system,
that extracts information from one or more files and
presents the information in a specified format. Most report
writers allow you to select records that meet certain
conditions and to display selected fields in rows and
columns. You can also format data into pie charts, bar
charts, and other diagrams. Once you have created a format
for a report, you can save the format specifications in a file
and continue reusing it for new data.
Report Design Principles
• Consistency of styles, fonts, formatting
• Clear headings and labels
• Clean, simple page layouts
• Page numbers, dates, version details
Database screens
• Design and create screens for interacting
with selected parts of a database and justify
their appropriateness
– Form views
74
Hypermedia/Hypertext
• Hypermedia: a combination of media whose
locations are linked electronically to provide an
easy way to navigate between information
• Information stored as separate documents (or files)
• Hypertext: a system that allows documents to be
cross-linked.
75
Nodes & Links
• Link (Hyperlink) usually indicated by a
highlighted item. The author must specify
the location of the item.
• Node: If another computer is the destination of the
link, that computer is called a node.
76
Characteristics of Hypermedia
Characteristic | Definition | Example
Hypertext | Text with links to other documents | "Hot Spot"
Link | Connection between elements | Hypertext; Image Map
Node | Location where data is stored | Server on WWW
URL | Internet resource address | http:www…..
Metadata | Data about data | HTML tags
URLs
• Uniform Resource Locator = node
http://www.cambridge.edu.au/IPT/ipthsc.htm
– http: Protocol
– www: World Wide Web
– www.cambridge.edu.au: Domain name
– au: Country
– IPT: Directory (folder)
– ipthsc.htm: Web page
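The parts of the example URL can also be pulled apart in code. This is a small sketch using Python's standard urllib.parse and pathlib modules. The variable names are illustrative only.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

url = "http://www.cambridge.edu.au/IPT/ipthsc.htm"
parts = urlsplit(url)
path = PurePosixPath(parts.path)

protocol = parts.scheme            # the protocol
domain = parts.hostname            # the domain name
country = domain.split(".")[-1]    # last label of the domain name
folder = path.parent.name          # the directory (folder)
page = path.name                   # the web page

print(protocol, domain, country, folder, page)
```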
Metadata Tags
• Metadata: Data about data
• HTML uses tags to tell the browser how the
following data is to be handled
• E.g <H1> Metadata</H1>
79
Software for Hypermedia
• Microsoft PowerPoint can be used to
organise text, graphics and sounds for a
presentation
80
Which system to use?
Information task | Manual System | Flat-file DB | Relational DBMS | Hypermedia
Appointments Diary | Ideal for personal use | Ideal for business use | Too complex for task | Ideal for travellers
Product Catalogue | Ideal for letterbox drops | Simple lists only | Suitable but generally too complex | Ideal for web access
Stock/Inventory | Too slow/difficult to maintain | Simple lists only | Ideal for complex tasks | Generally not suited
Personnel Records | Too slow/difficult to maintain | Simple lists only | Ideal for complex tasks | Generally not suitable (privacy/security)
Class marks/Test results | Ideal (private & secure) | Ideal for processing | Too complex for task | Generally not suited (except HSC results)
Mail Merging | Not suited | Ideal | Suited but generally too complex | Generally not suited
Phone/Address List | Ideal for personal use | Ideal for larger lists | Suitable but generally too complex | Ideal for web access
How to decide?
• Simple tasks, small amount of stable data (Manual
System)
• Complex tasks, large amount of changing data
(Computer-based System)
– Basic processing, easy to learn/use, simple structures
(Flat-file database)
– Complex processing, complex structures (Relational
Database)
– Limited processing, needs wide distribution of data
(Hypermedia)
82
2.4 Storage & Retrieval
83
Data Access Methods
Direct (Random) & sequential
84
To go from point A
to point Z in a
sequential-access
system, you must
pass through all
intervening points.
In a random-access
system, you can
jump directly to
point Z. Disks are
random access
media, whereas
tapes are sequential
access media.
Storage
• ON-LINE: data processed under direct control of CPU,
e.g. memory or Direct Access Disk (DAD)
• OFF-LINE: data controlled by the system, e.g. tape or
disk
Device | Description | Storage method | Capacity | Data access
Hard disk | Fixed metal or glass platters | Magnetic | Gigabytes | Direct
CD-ROM, DVD | Plastic disk, metal coating | Optical (laser) | 650MB-17GB | Direct
Removable cartridge | External metal or plastic disk | Magnetic | 100MB-2GB | Direct
Tape | Thin strip of plastic on reels | Magnetic | Gigabytes | Sequential
Distributed databases
86
• A database that consists of two or more data
files located at different sites on a computer
network.
• Because the database is distributed,
different users can access it without
interfering with one another.
• However, the DBMS must periodically
synchronize the scattered databases to make
sure that they all have consistent data.
Encryption & Decryption
• Encryption: The translation of data by a secret code.
• The most effective way to achieve data security.
• To read an encrypted file, you must have access to a secret
key or password that enables you to decrypt it.
• Unencrypted data is called plain text ; encrypted data is
referred to as cipher text.
• There are two main types of encryption: asymmetric
encryption (also called public-key encryption) and
symmetric encryption.
87
Symmetric & Asymmetric
encryption
• Symmetric encryption: the same key is used to
encrypt and decrypt the message.
• Asymmetric: A cryptographic system that uses
two keys -- a public key known to everyone and a
private or secret key known only to the recipient
of the message. When John wants to send a secure
message to Jane, he uses Jane's public key to
encrypt the message. Jane then uses her private
key to decrypt it.
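The symmetric case, where one shared key both encrypts and decrypts, can be shown with a toy example. The sketch below uses a simple XOR cipher purely for illustration. It is not secure and is not how real encryption products work. The key and message are hypothetical.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Combine each byte of data with the repeating key using XOR.

    Applying the same function with the same key a second time
    restores the original bytes, which is the symmetric property.
    """
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


key = b"secret"                              # shared by sender and recipient
plain_text = b"Meet me at noon"
cipher_text = xor_cipher(plain_text, key)    # encrypt
recovered = xor_cipher(cipher_text, key)     # decrypt with the same key

print(cipher_text)
print(recovered)
```

Asymmetric encryption differs in that the encrypting key (public) and the decrypting key (private) are not the same, so it cannot be sketched with a single shared key like this.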
88
Hypermedia - search engines
PRELIMINARY SEARCHING HINTS
1. Choose a search engine, directory or library in accordance with the kind of
search you are doing and the kind of results you are seeking.
2. Consider: Are you looking for a Web site? Information that might be contained
within Usenet? Academic articles that may only be retrievable with gopher?
3. Determine your aims: Do you want a specific hard-to-find document on an
esoteric subject, or general information on a broader topic? Do you need to
search the entire Web, or is what you are seeking likely to be found on a
number of sites, or only the most popular sites?
4. In making your choice, determine whether the information you are looking for
is likely to be in a page's title or first paragraph, or buried deeper within the
document or site.
5. Use a search engine's advanced features, if available, and read the help files if
you are unclear about its searching procedure.
SOURCE: http://www.windweaver.com/searchguide.htm#PRELIMINARY
89
SEARCH TERMS AND SYNTAX
90
1. Enter synonyms, alternate spellings and alternate forms (e.g. dance, dancing,
dances) for your search terms.
2. Enter all the singular or unique terms which are likely to be included in the
document or site you are seeking.
3. Avoid using very common terms (e.g. Internet, people) which may lead to a
preponderance of irrelevant search results.
4. Determine how your search engine uses capitals and plurals, and enter
capitalized or plural forms of your search words if appropriate.
5. Use a phrase or proper name if possible to narrow your search and therefore
retrieve more relevant results (unless you want a large number of results)
6. Use multiple operators (e.g. AND, NOT) if a search engine allows you to do so.
7. If you receive too many results, refine and improve your search. (After perusing
the results, you may become aware of how to use NOT - e.g. Boston AND
hockey AND NOT Bruins)
8. Pay attention to proper spacing and punctuation in your search syntax (i.e. no
space when using + means +term not + term)
9. Words between quotation marks are treated as phrases.
Web References
• Use full referencing
– Author
– Title
– Publisher
– Publication Date
PLUS
– Date referenced
– URL!
World Wide Web
• Use a search engine to locate data on the
World Wide Web
92
How a search engine works
• http://www.howstuffworks.com/search-engine.htm
93
Type | Advantages | Disadvantages
"Crawler" | Finds more pages; fewer 'dead links' | Less reliable page descriptions
Web directory | Better page classification system; more reliable page descriptions | Many 'dead links'; fewer pages indexed
Web crawler
94
Web Directory
• People create the directory
95
2.6 Issues
Related to information systems and
databases
96
Issues in handling data
• Privacy, security and accuracy
• What would be examples of inappropriate
use of data in a school web site relating to
the above issues?
97
Acknowledgement of data
sources
• Use full referencing
– Author
– Title
– Publisher
– Publication Date
PLUS
– Date referenced
– URL!
Freedom of Information
In New South Wales, the Freedom of Information Act 1989
gives you the legal right to :
• Obtain access to information held as records by State
Government Agencies, a Government Minister, local
government and other public bodies;
• Request amendments to records of a personal nature that
are inaccurate; and
• Appeal against a decision not to grant access to
information or to amend personal records.
99
Privacy
• The Privacy and Personal Information Protection Act
was passed in 1998 and established the Office of the
NSW Privacy Commissioner. The jurisdiction of the
Act is generally limited to state and local government
agencies.
• the Act introduces a set of privacy standards for the
NSW public sector. These standards regulate the
way public sector agencies deal with personal
information.
100
Data accuracy & reliability
• Data integrity - data should be validated and
cross-checked
– Check the source of the data
– Check data against other sources
– Use your own intelligence
– Acknowledge bias
Data Validation methods
• Range checks (dates, amounts)
• List check (against other known data e.g
names)
• Type check (numerical, date, text)
• Check digit
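Each validation method can be sketched as a small function. The function names are hypothetical. The check digit example assumes the Luhn scheme, which is only one of several check digit schemes in use.

```python
def range_check(value, low, high):
    """Range check: value must lie between low and high inclusive."""
    return low <= value <= high


def list_check(value, allowed):
    """List check: value must appear in a list of known data."""
    return value in allowed


def type_check(value, expected_type):
    """Type check: value must be of the expected data type."""
    return isinstance(value, expected_type)


def check_digit_ok(number: str) -> bool:
    """Check digit (Luhn scheme, assumed here): the last digit
    must make the weighted digit sum a multiple of 10."""
    total = 0
    for position, ch in enumerate(reversed(number)):
        digit = int(ch)
        if position % 2 == 1:        # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(range_check(50, 0, 100))
print(list_check("F", {"F", "M"}))
print(type_check("12", int))
print(check_digit_ok("79927398713"))
```

A value that fails any of these checks would normally be rejected at data entry, before it is stored.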
102
Access to data
• Who owns my data? Who should be able to
access it?
• Should all information be free and available?
• What about medical and credit data?
103
New Trends
• In the organisation, processing, storage and
retrieval of data
• Syllabus refers to:
– Data warehousing
– Data mining
104
Data Warehouse
• A collection of data designed to support
management decision making. Data warehouses
contain a wide variety of data that present a
coherent picture of business conditions at a single
point in time.
• Development of a data warehouse includes
development of systems to extract data from
operating systems plus installation of a warehouse
database system that provides managers flexible
access to the data.
• The term data warehousing generally refers to
combining many different databases across an entire
enterprise.
105
An extra DB!
106
Features of a Data Warehouse
• Strategic data (not operational)
• Temporal data (time periods)
• Summary data
• Read-only
Data Mining
• A hot buzzword for a class of database
applications that look for hidden patterns in a
group of data.
• For example, data mining software can help retail
companies find customers with common interests.
• The term is commonly misused to describe
software that presents data in new ways.
• True data mining software doesn't just change the
presentation, but actually discovers previously
unknown relationships among the data.
108
“Drill Down”
• to move from summary information to detailed
data by focusing in on something.
• To drill down through a series of folders, for
example, on a desktop means to go through the
hierarchy of folders to find a specific file or to
click through drop-down menus in a GUI.
• To drill down through a database is to access
information by starting with a general category
and moving through the hierarchy of field to file
to record.
109
Activity
• Identify and apply issues of ownership,
accuracy, security and privacy of
information.
110
Activity
• Discuss issues of access to and control of
information
• Validate information retrieved from the
Internet
111
Activity
• Design and generate reports from a database
112
Activity
• Summarise, extrapolate and report on data
retrieved from the Internet
113
Activity
Create a Database
• Create a simple relational database from a
schematic diagram and data dictionary
• Populate a relational database with data
114
Activity
Create a Data Dictionary
• Create a data dictionary for a given set of
data
115
Activity
Documentation
• Create documentation, including data
modelling, to indicate how a relational
database has been used to organise data
116
Construct
• Construct a hypertext document from a
storyboard
117
Activity
Schema Changes
• Modify an existing schema to meet a
change in user requirements
118
Data Flow Diagram
• Diagrammatically represents the flow of
information within an information system
Process
External
Entity
119
Data Flow
Data
Store
Activity
• Develop DFDs for a Library System
Borrow
Book
Book
Borrower
Borrower
Details
Borrowers
File
Acquisitions
Book
120
Book Details
Orders
File
DFD - Video Store System
Member
Card, PIN, cash
MovieTime
Video
Video, receipt
Member data
Video data
Transaction
Data
Video
Database
121
DFD - Appointment Diary
Person
1
Request
appointment
Diary
Enter
appointment
Confirm
appointment
122
Check
Diary
Enter
appointment
Diary
Person
2
Backup & Security
• To copy files to a second medium (a disk or
tape) as a precaution in case the first
medium fails
• FIREWALL: hardware or software that
prevents unauthorised access to a network
123
Methods of data backup
Backup | Description | Advantages | Disadvantages
Full | All data on system is backed up | Quick recovery; all data saved | Longer backup
Differential | Only changed data since last full backup | Rapid backup; minimal space | Recovery requires full backup and one differential backup
Incremental | Only changed data since last incremental backup | Fastest backup; minimal space | Longer recovery; needs full backup and a number of incremental backups
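The difference in recovery effort can be sketched in code. The function below is a simplified illustration: the name `restore_chain` and the sample schedules are hypothetical, and it assumes a schedule uses either differential or incremental backups after each full backup, not a mixture.

```python
def restore_chain(backups):
    """Return the backups needed to restore, oldest first.

    backups: list of (day, kind) tuples in chronological order, where
    kind is "full", "differential" or "incremental".
    """
    # Recovery always starts from the most recent full backup.
    last_full = max(i for i, (_, kind) in enumerate(backups) if kind == "full")
    chain = [backups[last_full]]
    later = backups[last_full + 1:]

    # Differential: only the latest one is needed, since each holds
    # every change made since the full backup.
    differentials = [b for b in later if b[1] == "differential"]
    if differentials:
        chain.append(differentials[-1])

    # Incremental: every one is needed, since each holds only the
    # changes made since the previous backup.
    chain.extend(b for b in later if b[1] == "incremental")
    return chain


diff_week = [("Sun", "full"), ("Mon", "differential"),
             ("Tue", "differential"), ("Wed", "differential")]
incr_week = [("Sun", "full"), ("Mon", "incremental"),
             ("Tue", "incremental"), ("Wed", "incremental")]

print(restore_chain(diff_week))
print(restore_chain(incr_week))
```

The differential schedule restores from two backups, while the incremental schedule needs all four, which is why incremental recovery takes longer.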
124
Design & develop a storyboard
• For each of the following situations, decide
the most appropriate storyboard layout.
• Design a set of simple labelled sketches to
illustrate
– A. A very simple children’s educational
program on desert animals for young children
– B. A glossary of medical terms for Doctors (in
alphabetical order).
125
Storyboard
• A series of frames, each representing a
different action or scene
• Storyboards are used to plan and organise
hypermedia projects
• Usually drawn on paper; frequently edited
• Contain navigation paths, content
information and graphic concepts
• Simple to construct
126
Storyboard Layout Options
• Linear
e.g “cartoon-like’
• Hierarchical
e.g simple web pages
• Non-linear
• Combination
127
e.g complex
web site
Use Software
• Use software that links data, such as:
• HTML editors
• web page creation software or a hypertext
package
128