Data and Knowledge Management

10-1
Data Management:
A Critical Success Factor
• The difficulties and the process
• Data sources and collection
• Data quality
• Multimedia and object-oriented databases
• Document management
10-2
The Difficulties and the Process:
The Difficulties
• Data amount increases exponentially
• Data: multiple sources
• Small portion of data useful for specific
decisions
• Increased need for external data
10-3
The Difficulties and the Process:
The Difficulties
• Differing legal requirements among
countries
• Selecting a data management tool from the
large number available
• Data security, quality, and integrity
10-4
The Difficulties and the Process:
Data Life Cycle Process and
Knowledge Discovery
• Data Collection
• Stored in databases
• Processed
• Stored in data warehouse
• Transformation - ready for analysis
• Data mining tools - knowledge
• Presentation
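The steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the sample records, the function names, and the "highest total" rule standing in for a data mining tool are all hypothetical and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical raw records, as they might arrive from data collection.
RAW_SALES = [
    {"region": "east", "amount": "120.0"},
    {"region": "west", "amount": "80.5"},
    {"region": "east", "amount": "not recorded"},  # bad value, dropped in processing
    {"region": "west", "amount": "19.5"},
]


def process(records):
    """Processing step: keep only records whose amount parses as a number."""
    clean = []
    for rec in records:
        try:
            clean.append({"region": rec["region"], "amount": float(rec["amount"])})
        except ValueError:
            continue  # skip records that fail the numeric check
    return clean


def transform(records):
    """Transformation step: total the amounts per region, ready for analysis."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["region"]] += rec["amount"]
    return dict(totals)


def mine(totals):
    """Stand-in for a data mining tool: pick the region with the highest total."""
    return max(totals, key=totals.get)


def present(totals, top_region):
    """Presentation step: format the finding for a reader."""
    return f"Top region: {top_region} ({totals[top_region]:.2f})"


warehouse = process(RAW_SALES)   # cleaned records, as stored in the warehouse
totals = transform(warehouse)    # analysis-ready summary
summary = present(totals, mine(totals))
print(summary)
```

Each function corresponds to one stage of the life cycle, so a stage can be replaced (for example, a real mining technique in place of mine) without touching the others.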
10-5
Data Sources and Collection
• Internal data
• Personal data
• External data
• Internet and commercial database services
• Methods for collecting raw data
10-6
Data Quality (DQ)
• Intrinsic DQ:
– Accuracy, objectivity, believability, and
reputation
• Accessibility DQ:
– Accessibility and access security
10-7
Data Quality (DQ)
• Contextual DQ:
– Relevancy, value added, timeliness,
completeness
• Representation DQ:
– Interpretability, ease of understanding, concise
representation, and consistent representation
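Some of these dimensions can be measured directly. The sketch below shows one possible way to score completeness and timeliness (two of the contextual dimensions); the record layout, field names, and the 30-day freshness threshold are assumptions made for the example, not definitions from the slides.

```python
from datetime import date


def completeness(records, required_fields):
    """Fraction of required fields that are filled in across all records."""
    total = len(records) * len(required_fields)
    if total == 0:
        return 1.0
    filled = sum(
        1
        for rec in records
        for field in required_fields
        if rec.get(field) not in (None, "")
    )
    return filled / total


def timeliness(records, today, max_age_days):
    """Fraction of records updated within max_age_days of today."""
    if not records:
        return 1.0
    fresh = sum(1 for rec in records if (today - rec["updated"]).days <= max_age_days)
    return fresh / len(records)


# Hypothetical customer records.
CUSTOMERS = [
    {"name": "Ada", "email": "ada@example.com", "updated": date(2024, 1, 10)},
    {"name": "Ben", "email": "", "updated": date(2023, 6, 1)},
    {"name": "Cy", "email": "cy@example.com", "updated": date(2024, 1, 2)},
    {"name": None, "email": "dee@example.com", "updated": date(2022, 12, 31)},
]

completeness_score = completeness(CUSTOMERS, ["name", "email"])
timeliness_score = timeliness(CUSTOMERS, today=date(2024, 1, 15), max_age_days=30)
print(completeness_score, timeliness_score)
```

Dimensions such as believability or reputation are judgments rather than counts, so they would need a different kind of assessment than the ratios used here.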
10-8
10-9
Multimedia and Object-Oriented
Databases
• Object-Oriented database (multimedia
database)
• Document management
10-10
Data Warehousing,
Mining, and Analysis
• Transaction versus analytical processing
• Data warehouse and data marts
• Knowledge discovery, analysis, and mining
10-11
Transaction Versus Analytical
Processing
Good Data Delivery System
• Easy data access by end users
• Quicker decision making
• Accurate and effective decision making
• Flexible decision making
10-12
Transaction Versus Analytical
Processing
Solution
• Business representation of data for end
users
• Client-server environment - end users query
and reporting capability
• Server-based repository (data warehouse)
10-13
The Data Warehouse and Marts
The purpose of a data warehouse is to
establish a data repository that makes
operational data accessible in a form readily
acceptable for analytical processing
activities . . .
A data mart is … dedicated to a functional
or regional area.
10-14
Characteristics of Data
Warehousing
• Organization
• Consistency
• Time variant
• Nonvolatile
• Relational
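Two of these characteristics, time variant and nonvolatile, can be illustrated with an append-only table in which every load carries a snapshot date and earlier rows are never changed. This is a minimal sketch; the class name WarehouseTable and its methods are hypothetical and are not taken from any warehouse product.

```python
from datetime import date


class WarehouseTable:
    """Append-only table: rows are loaded with a snapshot date and never changed."""

    def __init__(self):
        self._rows = []  # list of (snapshot_date, key, value)

    def load(self, snapshot_date, rows):
        """Nonvolatile: a load only appends; there is no update or delete."""
        for key, value in rows.items():
            self._rows.append((snapshot_date, key, value))

    def as_of(self, when):
        """Time variant: latest value per key among snapshots on or before `when`."""
        latest = {}
        for snap, key, value in sorted(self._rows, key=lambda row: row[0]):
            if snap <= when:
                latest[key] = value
        return latest

    def history(self, key):
        """All recorded values for one key, oldest first."""
        ordered = sorted(self._rows, key=lambda row: row[0])
        return [(snap, value) for snap, k, value in ordered if k == key]


# Hypothetical month-end inventory snapshots.
inventory = WarehouseTable()
inventory.load(date(2024, 1, 31), {"widget": 100, "gadget": 40})
inventory.load(date(2024, 2, 29), {"widget": 90})

mid_february = inventory.as_of(date(2024, 2, 15))
early_march = inventory.as_of(date(2024, 3, 1))
widget_history = inventory.history("widget")
print(mid_february, early_march, widget_history)
```

Because the January rows are kept when February is loaded, a question about any past date can still be answered, which an operational system that overwrites its records could not do.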
10-15
The Data Warehouse and Marts
• Benefits
• Cost
• Architecture
• Putting the data warehouse on the internet
• Suitability
10-16
Knowledge Discovery, Analysis,
and Mining
• Foundations of knowledge discovery in
databases (KDD)
• Tools and techniques of KDD
• Online analytical processing (OLAP)
• Data mining
10-17
The Foundations of Knowledge
Discovery in Databases (KDD)
• Massive data collection
• Powerful multiprocessor computers
• Data mining algorithms
10-18
10-19
OLAP Queries
• Access very large amounts of data
• Analyze the relationships between many
types of business elements
• Involve aggregated data
• Compare aggregated data over hierarchical
time periods
10-20
OLAP Queries
• Present data in different perspectives
• Involve complex calculations between data
elements
• Able to respond quickly to user requests
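Comparing aggregated data over hierarchical time periods amounts to a roll-up: the same facts are summed at the month, quarter, or year level. The sketch below uses only the Python standard library; the sales figures and the names period_key and roll_up are made up for the illustration, and a real OLAP server would do this at far larger scale.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sales facts: (day, region, amount).
SALES = [
    (date(2024, 1, 15), "east", 100),
    (date(2024, 2, 3), "east", 150),
    (date(2024, 4, 20), "east", 200),
    (date(2024, 1, 9), "west", 80),
    (date(2024, 5, 11), "west", 120),
]


def period_key(day, level):
    """Map a date to its place in the time hierarchy at the given level."""
    if level == "year":
        return (day.year,)
    if level == "quarter":
        return (day.year, (day.month - 1) // 3 + 1)
    if level == "month":
        return (day.year, day.month)
    raise ValueError(f"unknown level: {level}")


def roll_up(facts, level, by_region=False):
    """Sum amounts per time period, optionally split by region."""
    totals = defaultdict(int)
    for day, region, amount in facts:
        key = period_key(day, level)
        if by_region:
            key = (region,) + key
        totals[key] += amount
    return dict(totals)


by_quarter = roll_up(SALES, "quarter")
by_year = roll_up(SALES, "year")
by_region_quarter = roll_up(SALES, "quarter", by_region=True)
print(by_quarter, by_year, by_region_quarter)
```

Moving from by_quarter to by_region_quarter is a drill-down along a second dimension, which is the "different perspectives" idea in the list above.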
10-21
Data Mining
• Automated prediction of trends
• Automated discovery of previously
unknown patterns
10-22
Data Mining
Characteristics and Objectives
• Data often buried deep within large
databases
• Data may be consolidated in data
warehouse or kept in internet and intranet
servers
• Usually client-server architecture
10-23
Data Mining
Characteristics and Objectives
• Data mining tools extract information
buried in corporate files or archived public
records
• The “miner” is often an end user
• “Striking it rich” usually involves finding
unexpected, valuable results
• Parallel processing
10-24
Data Mining
Characteristics and Objectives
• Data mining yields five types of
information
• Data miners can use one or several tools
10-25
Data Mining Yields Five Types of
Information
• Association
• Sequences
• Classifications
• Clusters
• Forecasting
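As an illustration of the first type, association, a rule such as "customers who buy bread also buy butter" is commonly scored by its support and confidence. The slides do not define these measures; the sketch below uses their usual definitions, and the shopping baskets are invented for the example.

```python
def count_containing(transactions, items):
    """Number of transactions that contain every item in `items`."""
    wanted = set(items)
    return sum(1 for basket in transactions if wanted <= set(basket))


def support(transactions, items):
    """Share of all transactions that contain the item set."""
    return count_containing(transactions, items) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the share that also contain the consequent."""
    base = count_containing(transactions, antecedent)
    if base == 0:
        return 0.0
    both = count_containing(transactions, set(antecedent) | set(consequent))
    return both / base


# Hypothetical shopping baskets.
BASKETS = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter", "jam"},
]

rule_support = support(BASKETS, {"bread", "butter"})
rule_confidence = confidence(BASKETS, {"bread"}, {"butter"})
print(rule_support, rule_confidence)
```

Here three of the five baskets contain both items, and three of the four baskets with bread also contain butter.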
10-26
Data Mining Techniques
• Case-based reasoning
• Neural computing
• Intelligent agents
• Others: decision trees, genetic algorithms,
nearest neighbor method, and rule reduction
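Of the techniques listed, the nearest neighbor method is the simplest to show in a few lines: a new case receives the label of the most similar known case. The sketch below is one plain version of the idea (a single neighbor, Euclidean distance); the training cases and their (age, income) features are hypothetical.

```python
import math


def nearest_neighbor(training, point):
    """Return the label of the training example closest to `point`."""
    _, label = min(training, key=lambda example: math.dist(example[0], point))
    return label


# Hypothetical labeled cases: (age, income in thousands) -> spending segment.
TRAINING = [
    ((25, 30), "low"),
    ((40, 80), "high"),
    ((35, 60), "high"),
    ((22, 20), "low"),
]

segment_a = nearest_neighbor(TRAINING, (38, 75))
segment_b = nearest_neighbor(TRAINING, (24, 28))
print(segment_a, segment_b)
```

In practice the features would usually be rescaled first, since a feature with a wide numeric range (income here) otherwise dominates the distance.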
10-27
Data Visualization Technologies
• Data visualization
• Multidimensionality
• Geographical information systems (GIS)
10-28
Data Visualization
Data visualization refers to presentation of
data by technologies such as digital images,
geographical information systems, graphical
user interfaces, multidimensional tables and
graphs, virtual reality, three-dimensional
presentations, and animation.
10-29
Multidimensionality
• Major advantage - data can be organized the
way managers prefer to see the data
• Three factors: dimensions, measures, and
time
10-30
Examples
• Dimensions
– Products, salespeople, market segments,
business units, geographical locations
• Measures
– Money, sales volume, head count, inventory,
profit, actual versus forecasted
• Time
– Daily, weekly, monthly, quarterly, yearly
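The three factors can be combined in a small pivot: pick two of the dimensions (time can be one of them) as rows and columns, and sum a measure in each cell. This is a minimal sketch using only the standard library; the fact records and the pivot function are assumptions made for the illustration.

```python
from collections import defaultdict

# Hypothetical facts: two dimensions (product, region), a time period, one measure.
FACTS = [
    {"product": "widget", "region": "east", "quarter": "Q1", "sales": 100},
    {"product": "widget", "region": "west", "quarter": "Q1", "sales": 60},
    {"product": "widget", "region": "east", "quarter": "Q2", "sales": 120},
    {"product": "gadget", "region": "east", "quarter": "Q1", "sales": 40},
    {"product": "gadget", "region": "west", "quarter": "Q2", "sales": 70},
]


def pivot(facts, row_dim, col_dim, measure):
    """Sum `measure` into a table with row_dim as rows and col_dim as columns."""
    table = defaultdict(lambda: defaultdict(int))
    for fact in facts:
        table[fact[row_dim]][fact[col_dim]] += fact[measure]
    return {row: dict(cols) for row, cols in table.items()}


product_by_quarter = pivot(FACTS, "product", "quarter", "sales")
region_by_product = pivot(FACTS, "region", "product", "sales")
print(product_by_quarter)
print(region_by_product)
```

The same facts answer both questions; only the choice of row and column changes, which is what lets managers organize the data the way they prefer to see it.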
10-31
Geographical Information
Systems (GIS)
A GIS is a computer-based system for
capturing, storing, checking, integrating,
manipulating, and displaying data using
digitized maps.
10-32
Geographical Information
Systems (GIS)
• Software
• Data
• Emerging GIS applications
10-33
Emerging GIS Applications
• Integration of GIS and GPS
– Reengineer aviation and shipping industries
• Intelligent GIS (integration of GIS and ES)
• User interface
– Multimedia, 3D graphics, animated and
interactive maps
• Web applications
10-34
Marketing Databases in Action
• The Marketing Transaction Database
(MTD)
• Implementation Examples
10-35
The Marketing Transaction
Database (MTD)
… a new kind of database, oriented toward
targeting and personalizing marketing
messages in real time.
10-36
10-37
Knowledge Management
• Knowledge management or managing
knowledge databases
• A knowledge base is a database that
contains information or organizational
know-how.
10-38
Knowledge Management
• Knowledge bases and organizational
learning
• Implementing knowledge management
systems
10-39
Arthur Andersen’s
Learning Organization Knowledge Base
• Global best practices hotline
• These data combined with ongoing research
identify areas to be developed
• Research analysis team with content experts
to develop best practices
• Qualitative and quantitative information and
tools are released on CD-ROM for
corporate wide access
10-40
Arthur Andersen’s
Knowledge Base
• Best company profiles
• Relevant Arthur Andersen engagement
experience
• Top 10 case studies and articles
• World-class performance measures
• Diagnostic tools
10-41
Arthur Andersen’s
Knowledge Base
• Customizable presentations
• Process definitions and directory of internal
experts
• Best control practice
• Tax implementations
10-42
Managerial Issues
• Cost-benefit analysis
• Where to store data physically
• Disaster recovery
• Internal or external
• Data security and ethics
• Data purging
10-43
Managerial Issues
• The legacy data problem
• Data delivery
• Privacy
10-44
Copyright © 1999 John Wiley & Sons, Incorporated. All rights
reserved. Reproduction or translation of this work beyond that
permitted in Section 117 of the 1976 United States Copyright Act
without the express written permission of the copyright owner is
unlawful. Requests for further information should be addressed to
the Permissions Department, John Wiley & Sons, Inc. Adopters of
the textbook are granted permission to make back-up copies for
their own use only, to make copies for distribution to students of
the course the textbook is used in, and to modify this material to
best suit their instructional needs. Under no circumstances can
copies be made for resale. The publisher assumes no
responsibility for errors, omissions, or damages caused by the use
of these programs or from the use of the information contained
herein.
10-45