Knowledge management systems

CHAPTER 5
Data and Knowledge Management
CHAPTER OUTLINE
5.1 Managing Data
5.2 The Database Approach
5.3 Database Management Systems
5.4 Data Warehouses and Data Marts
5.5 Knowledge Management
DIFFICULTIES OF MANAGING DATA
• The amount of data is increasing exponentially.
• Data are scattered throughout organizations and collected by
many individuals using various methods and devices.
• Data come from many sources.
• Data security, quality, and integrity are critical.
ANNUAL FLOOD OF DATA FROM ...
Credit card swipes
E-mails
Digital video
Online TV
RFID tags
Blogs
Digital video surveillance
Radiology scans
Source: Media Bakery
ANNUAL FLOOD OF NEW DATA!
In the zettabyte range
A zettabyte is 1,000 exabytes
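The unit arithmetic behind the zettabyte figure can be checked with a short sketch. It assumes decimal (SI) units, where each unit is 1,000 times the previous one; the `UNITS` list and the `bytes_in` helper are illustrative names, not part of the slides.

```python
# Decimal (SI) storage units, each 1,000 times the previous one (assumption).
UNITS = ["kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte"]

def bytes_in(unit: str) -> int:
    """Return the number of bytes in one `unit`, using powers of 1,000."""
    return 1000 ** (UNITS.index(unit) + 1)

# How many exabytes fit in one zettabyte?
exabytes_per_zettabyte = bytes_in("zettabyte") // bytes_in("exabyte")
print(exabytes_per_zettabyte)          # 1000
print(f"{bytes_in('zettabyte'):.0e}")  # 1e+21
```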
© Fanatic Studio/Age Fotostock America, Inc.
DATA GOVERNANCE
• Data Governance (see video)
• Master Data Management (see video)
• Master Data (see video)
MASTER DATA MANAGEMENT
John Stevens registers for Introduction to Management
Information Systems (ISMN 3140) from 10 AM until 11 AM
on Mondays and Wednesdays in Room 41 Smith Hall,
taught by Professor Rainer.
Master Data     Transaction Data
Student         John Stevens
Course          Intro to Management Information Systems
Course No.      ISMN 3140
Time            10 AM until 11 AM
Weekday         Mondays and Wednesdays
Location        Room 41 Smith Hall
Instructor      Professor Rainer
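The split between master data and transaction data in the registration example can be sketched in code. This is only an illustration: the identifiers (`S001`, `P001`, `R41`) and the `Registration` class are hypothetical and do not come from the slides. Master data are the shared lookup tables; the transaction is a single event that refers to them by key.

```python
from dataclasses import dataclass

# Master data: core entities shared across many transactions.
# All keys below (S001, P001, R41) are hypothetical.
students = {"S001": "John Stevens"}
courses = {"ISMN 3140": "Introduction to Management Information Systems"}
instructors = {"P001": "Professor Rainer"}
locations = {"R41": "Room 41 Smith Hall"}

@dataclass(frozen=True)
class Registration:
    """Transaction data: one registration event that references master data by key."""
    student_id: str
    course_no: str
    instructor_id: str
    location_id: str
    weekdays: tuple
    time: str

reg = Registration("S001", "ISMN 3140", "P001", "R41",
                   ("Mondays", "Wednesdays"), "10 AM until 11 AM")

def describe(r: Registration) -> str:
    """Resolve the keys in a transaction against the master data."""
    return (f"{students[r.student_id]} registers for {courses[r.course_no]} "
            f"({r.course_no}), {r.time} on {' and '.join(r.weekdays)}, "
            f"in {locations[r.location_id]}, taught by {instructors[r.instructor_id]}.")

print(describe(reg))
```

Because the transaction stores only keys, a correction to a student's name or a room label is made once in the master data and is picked up by every transaction that refers to it.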
BIG DATA
• Defining Big Data: diverse, high-volume, high-velocity
information assets that require new forms of processing to
enable enhanced decision making, insight discovery, and process
optimization.
Big Data generally:
• Exhibit variety;
• Include structured, unstructured, and semi-structured data;
• Are generated at high velocity with an uncertain pattern;
• Do not fit neatly into traditional, structured, relational
databases (discussed later in this chapter); and
• Can be captured, processed, transformed, and analyzed in a
reasonable amount of time only by sophisticated information
systems.
EXAMPLES OF BIG DATA
• When the Sloan Digital Sky Survey in New Mexico was launched
in 2000, its telescope collected more data in its first few weeks
than had been amassed in the entire history of astronomy. By
2013, the survey’s archive contained hundreds of terabytes of
data. However, the Large Synoptic Survey Telescope in Chile, due
to come online in 2016, will collect that quantity of data every
five days.
• In 2013 Google was processing more than 24 petabytes of
data every day.
• Facebook members upload more than 10 million new photos
every hour. In addition, they click a “like” button or leave a
comment nearly 3 billion times every day.
• The 800 million monthly users of Google’s YouTube service
upload more than an hour of video every second.
• The number of messages on Twitter grows at 200 percent every
year. By mid-2013 the volume exceeded 450 million tweets per
day.
CHARACTERISTICS OF BIG DATA
Volume: We have noted the incredible volume of Big Data in this
chapter. Although the sheer volume of Big Data presents data
management problems, this volume also makes Big Data
incredibly valuable. Irrespective of their source, structure,
format, and frequency, data are always valuable. If certain
types of data appear to have no value today, it is because we
have not yet been able to analyze them effectively. For example,
several years ago when Google began harnessing satellite
imagery, capturing street views, and then sharing these
geographical data for free, few people understood their value.
Today, we recognize that such data are incredibly useful (e.g.,
consider the myriad of uses for Google Maps).
CHARACTERISTICS OF BIG DATA
Velocity: The rate at which data flow into an organization is
rapidly increasing. Velocity is critical because it increases the
speed of the feedback loop between a company and its
customers. For example, the Internet and mobile technology
enable online retailers to compile histories not only on final
sales, but on their customers’ every click and interaction.
Companies that can quickly utilize that information, for
example by recommending additional purchases, gain
competitive advantage.
CHARACTERISTICS OF BIG DATA
Variety: Traditional data formats tend to be structured,
relatively well described, and they change slowly. Traditional
data include financial market data, point-of-sale transactions,
and much more. In contrast, Big Data formats change rapidly.
They include satellite imagery, broadcast audio streams, digital
music files, Web page content, scans of government
documents, and comments posted on social networks.
MANAGING BIG DATA
The first step for many organizations toward managing Big Data
was to integrate information silos into a database environment
and then to develop data warehouses for decision making. After
completing this step, many organizations turned their attention
to the business of information management: making sense of
their proliferating data. In recent years, Oracle, IBM,
Microsoft, and SAP have spent billions of dollars purchasing
software firms that specialize in data management and
business intelligence. In addition, many organizations are
turning to NoSQL databases (think of them as “not only SQL”
databases) to process Big Data. These databases provide an
alternative for firms that have more and different kinds of data
(Big Data) in addition to the traditional, structured data that fit
neatly into the rows and columns of relational databases.
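The contrast between data that fit rows and columns and data that do not can be sketched as follows. This is a simplified illustration, not how any particular NoSQL product stores data: the customer records, field names, and the `fits_fixed_schema` helper are all hypothetical.

```python
import json

# Relational style: every row has the same fixed columns.
columns = ("customer_id", "name", "city")
rows = [
    (1, "Ana", "Lisbon"),
    (2, "Ben", "Boston"),
]

# Document style: each record is self-describing and may carry
# different, nested, or missing fields.
documents = [
    {"customer_id": 1, "name": "Ana", "city": "Lisbon"},
    {"customer_id": 2, "name": "Ben",
     "clicks": [{"page": "/home", "seconds": 12}, {"page": "/cart", "seconds": 40}],
     "photo_tags": ["beach", "sunset"]},
]

def fits_fixed_schema(doc: dict) -> bool:
    """True if the document has exactly the relational columns and only flat values."""
    flat = all(not isinstance(v, (list, dict)) for v in doc.values())
    return set(doc) == set(columns) and flat

for doc in documents:
    print(fits_fixed_schema(doc), json.dumps(doc))
```

The first document maps directly onto a table row; the second, with its nested click history and missing city, would need extra tables or empty columns in a relational design.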
LEVERAGING BIG DATA
Organizations must do more than simply manage Big Data; they
must also gain value from it.
In general, there are six broadly applicable ways to leverage Big
Data to gain value.
Creating Transparency. Simply making Big Data easier for
relevant stakeholders to access in a timely manner can create
tremendous business value. In the public sector, for example,
making relevant data more readily accessible across otherwise
separate departments can sharply reduce search and
processing times. In manufacturing, integrating data from R&D,
engineering, and manufacturing units to enable concurrent
engineering can significantly reduce time to market and
improve quality.
LEVERAGING BIG DATA
Enabling Experimentation. Experimentation allows organizations to
discover needs and improve performance. As organizations create
and store more data in digital form, they can collect more
accurate and detailed performance data (in real or near-real
time) on everything from product inventories to personnel sick
days. IT enables organizations to set up controlled experiments.
For example, Amazon constantly experiments by offering slightly
different “looks” on its Web site. These experiments are called
A/B experiments, because each experiment compares only two
versions of the site. Here is how the experiment works: Hundreds
of thousands of people who click on Amazon.com will see one
version of the Web site, and hundreds of thousands of others will
see the other version. One experiment might change the location
of the “Buy” button on the Web page. Another might change the
size of a particular font on the Web page. Amazon captures data
on an assortment of variables from all of the clicks, including
which pages users visited, the time they spent on each page, and
whether the click led to a purchase. It then analyzes all of these
data to “tweak” its Web site to provide the optimal user
experience.
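One common way to analyze an experiment of this kind is a two-proportion z-test on the purchase rates of the two versions. The sketch below is an assumption about how such an analysis might look, not a description of Amazon's method; the visitor and purchase counts are hypothetical.

```python
import math

def ab_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-proportion z-test comparing purchase rates of versions A and B.

    Returns (rate_a, rate_b, z, two_sided_p_value).
    """
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    # Pooled purchase rate under the hypothesis that A and B perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value

# Hypothetical counts: visitors who saw each version and how many purchased.
rate_a, rate_b, z, p = ab_test(conv_a=2000, n_a=100_000, conv_b=2200, n_b=100_000)
print(f"A: {rate_a:.2%}  B: {rate_b:.2%}  z = {z:.2f}  p = {p:.4f}")
```

A small p-value suggests the difference in purchase rates is unlikely to be due to chance alone, which is the kind of evidence used to decide whether a change is kept.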
LEVERAGING BIG DATA
Segmenting Population to Customize Actions. Big Data allows
organizations to create narrowly defined customer
segmentations and to tailor products and services to precisely
meet customer needs. For example, companies are able to
perform micro-segmentation of customers in real time to
precisely target promotions and advertising. Suppose, for
instance, that a company knows you are in one of its stores,
considering a particular product. (They can obtain this
information from your smartphone, from in-store cameras, and
from facial recognition software.) They can send a coupon
directly to your phone offering 10 percent off if you buy the
product within the next five minutes.
LEVERAGING BIG DATA
Replacing/Supporting Human Decision Making with Automated
Algorithms.
Sophisticated analytics can substantially improve decision
making, minimize risks, and unearth valuable insights. For
example, tax agencies use automated risk-analysis software
tools to identify tax returns that warrant further
examination, and retailers can use algorithms to fine-tune
inventories and pricing in response to real-time in-store and
online sales.
LEVERAGING BIG DATA
Innovating New Business Models, Products, and Services. Big
Data enables companies to create new products and services,
enhance existing ones, and invent entirely new business
models. For example, manufacturers utilize data obtained from
the use of actual products to improve the development of the
next generation of products and to create innovative after-sales
service offerings. The emergence of real-time location data has
created an entirely new set of location-based services ranging
from navigation to pricing property and casualty insurance
based on where, and how, people drive their cars.
LEVERAGING BIG DATA
Organizations Can Analyze Far More Data. In some cases,
organizations can even process all the data relating to a particular
phenomenon, meaning that they do not have to rely as much on
sampling. Random sampling works well, but it is not as effective
as analyzing an entire dataset. In addition, random sampling has
some basic weaknesses. To begin with, its accuracy depends on
ensuring randomness when collecting the sample data. However,
achieving such randomness is tricky. Systematic biases in the
process of data collection can cause the results to be highly
inaccurate. For example, consider political polling using landline
phones. This sample tends to exclude people who use only cell
phones. This
bias can seriously skew the results, because cell phone users are
typically younger and more liberal than people who rely primarily
on landline phones.
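The effect of this kind of systematic bias can be illustrated with a small simulation. Everything here is hypothetical: the population size, the share of cell-only users, and the support rates are made-up numbers chosen only to show how a landline-only poll can diverge from both the full dataset and a properly random sample.

```python
import random

# Hypothetical population of 10,000 voters: (phone_type, supports_candidate).
population = (
    [("cell_only", True)] * 3600 + [("cell_only", False)] * 2400 +
    [("landline", True)] * 1600 + [("landline", False)] * 2400
)

def support_rate(people) -> float:
    """Share of people in the group who support the candidate."""
    return sum(supports for _, supports in people) / len(people)

# Analyzing the entire dataset.
true_rate = support_rate(population)

# Landline-only poll: systematically excludes cell-only users.
landline_poll = [p for p in population if p[0] == "landline"]
biased_rate = support_rate(landline_poll)

# Simple random sample; fixed seed so the run is repeatable.
rng = random.Random(42)
random_rate = support_rate(rng.sample(population, 1000))

print(f"entire population:      {true_rate:.1%}")
print(f"landline-only poll:     {biased_rate:.1%}")
print(f"random sample of 1,000: {random_rate:.1%}")
```

In this made-up population the landline-only poll is off by twelve percentage points no matter how many people it reaches, whereas the random sample's error shrinks as the sample grows.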
5.2 THE DATABASE APPROACH
Database management systems (DBMSs) minimize the
following problems:
Data redundancy
Data isolation
Data inconsistency
DATABASE APPROACH (CONTINUED)
DBMSs maximize the following:
Data security
Data integrity
Data independence
DATABASE MANAGEMENT SYSTEMS
DATA HIERARCHY
Bit
Byte
Field
Record
File (or table)
Database
HIERARCHY OF DATA FOR A COMPUTER-BASED FILE
DATA HIERARCHY (CONTINUED)
Bit (binary digit)
Byte (eight bits)
DATA HIERARCHY (CONTINUED)
Example of Field and Record
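The levels of the data hierarchy, including a field and a record, can be walked through in a few lines. This is a sketch only: the bit pattern, the second student, and the dictionary layout are illustrative assumptions, while the first record reuses the registration example from earlier in the chapter.

```python
# Bit -> byte: eight bits make one byte, and one byte can encode one character.
bits = "01001010"
byte_value = int(bits, 2)          # 74
character = chr(byte_value)        # "J"

# Byte -> field: a group of characters forms a field, such as a name.
name_field = "John Stevens".encode("ascii")   # 12 bytes, one per character

# Field -> record: related fields describe one entity.
record = {"student": "John Stevens", "course_no": "ISMN 3140", "instructor": "Rainer"}

# Record -> file (table): records of the same type. The second record is hypothetical.
student_file = [record,
                {"student": "Jane Doe", "course_no": "ISMN 3140", "instructor": "Rainer"}]

# File -> database: a collection of related files.
database = {"registrations": student_file}

print(character, len(name_field), len(student_file))
```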
DESIGNING THE DATABASE
Data model
Entity
Attribute
Primary key
Secondary keys
ENTITY-RELATIONSHIP MODELING
Database designers plan the database design in a process
called entity-relationship (ER) modeling.
ER diagrams consist of entities, attributes, and relationships.
Entity classes
Instance
Identifiers
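The ER vocabulary above (entity classes, instances, identifiers, attributes, relationships) can be sketched with plain data classes. This is an illustration under assumptions: the `Student`, `Course`, and `Enrollment` classes and the key `S001` are hypothetical, and a real design would be drawn as an ER diagram before any code is written.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Student:            # entity class
    student_id: str       # identifier (primary key); values below are hypothetical
    name: str             # attribute

@dataclass(frozen=True)
class Course:             # entity class
    course_no: str        # identifier
    title: str            # attribute

@dataclass(frozen=True)
class Enrollment:         # relationship between a Student and a Course
    student_id: str
    course_no: str

# Instances of the entity classes.
john = Student("S001", "John Stevens")
mis = Course("ISMN 3140", "Introduction to Management Information Systems")
enrollments = [Enrollment(john.student_id, mis.course_no)]

def courses_for(student: Student, courses: dict) -> list:
    """Follow the relationship from a student to the titles of their courses."""
    return [courses[e.course_no].title
            for e in enrollments if e.student_id == student.student_id]

print(courses_for(john, {mis.course_no: mis}))
```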
5.3 DATABASE MANAGEMENT SYSTEMS
Database management system (DBMS)
Relational database model
Structured Query Language (SQL)
Query by Example (QBE)
STUDENT DATABASE EXAMPLE
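A small student table queried with SQL might look like the sketch below. It uses Python's built-in `sqlite3` module with an in-memory database; the table layout, column names, and student rows are hypothetical and stand in for the figure on the original slide.

```python
import sqlite3

# In-memory relational database with one hypothetical student table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "student_id INTEGER PRIMARY KEY, name TEXT, major TEXT, gpa REAL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "John Stevens", "MIS", 3.6),
     (2, "Jane Doe", "Finance", 3.2),
     (3, "Ali Khan", "MIS", 3.9)],
)

# SQL query: names of MIS majors with a GPA above 3.5, highest first.
rows = conn.execute(
    "SELECT name, gpa FROM student WHERE major = ? AND gpa > ? ORDER BY gpa DESC",
    ("MIS", 3.5),
).fetchall()
print(rows)
conn.close()
```

The query states what is wanted (SELECT ... WHERE ...) and leaves it to the DBMS to work out how to retrieve it.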
NORMALIZATION
Normalization
Minimum redundancy
Maximum data integrity
Best processing performance
Data are normalized when the attributes in a table
depend only on the primary key.
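The idea that attributes should depend only on the primary key can be shown by splitting a non-normalized table in two. The order and customer values below are hypothetical, and dictionaries stand in for database tables.

```python
# Non-normalized relation: customer details repeat on every order row.
# All values are hypothetical.
orders_flat = [
    {"order_no": 1, "customer_id": "C1", "customer_name": "Ana", "city": "Lisbon", "item": "Pizza"},
    {"order_no": 2, "customer_id": "C1", "customer_name": "Ana", "city": "Lisbon", "item": "Salad"},
    {"order_no": 3, "customer_id": "C2", "customer_name": "Ben", "city": "Boston", "item": "Pizza"},
]

# Customer attributes depend on customer_id, not on order_no, so move them
# to their own table keyed by customer_id.
customers = {
    row["customer_id"]: {"customer_name": row["customer_name"], "city": row["city"]}
    for row in orders_flat
}

# The orders table keeps only attributes that depend on its own key, order_no,
# plus customer_id as a link back to the customer table.
orders = {
    row["order_no"]: {"customer_id": row["customer_id"], "item": row["item"]}
    for row in orders_flat
}

# A change of city is now made in one place instead of on every order row.
customers["C1"]["city"] = "Porto"
print(len(customers), len(orders), customers[orders[2]["customer_id"]]["city"])
```

After the split each customer is stored once, which removes the redundancy and the risk that two order rows disagree about the same customer.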
NON-NORMALIZED RELATION
NORMALIZING THE DATABASE (PART A)
NORMALIZING THE DATABASE (PART B)
NORMALIZATION PRODUCES ORDER
5.4 DATA WAREHOUSING
Data warehouses and Data Marts
Organized by business dimension or subject
Multidimensional
Historical
Use online analytical processing
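Organizing data by business dimension and summarizing them along those dimensions can be sketched as a simple roll-up. This is a toy illustration of the idea behind multidimensional analysis, not an OLAP engine; the sales facts and the `roll_up` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical historical sales facts: (product, region, year, amount).
facts = [
    ("nuts", "east", 2012, 50), ("nuts", "west", 2012, 60),
    ("bolts", "east", 2012, 40), ("nuts", "east", 2013, 70),
    ("bolts", "west", 2013, 90), ("bolts", "east", 2013, 30),
]
DIMENSIONS = {"product": 0, "region": 1, "year": 2}

def roll_up(*dims: str) -> dict:
    """Total sales grouped by the chosen business dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[DIMENSIONS[d]] for d in dims)
        totals[key] += fact[3]
    return dict(totals)

print(roll_up("product"))
print(roll_up("region", "year"))
```

The same facts answer different questions depending on which dimensions are chosen, which is what makes the multidimensional organization useful for analysis.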
DATA WAREHOUSE FRAMEWORK & VIEWS
BENEFITS OF DATA WAREHOUSING
End users can access data quickly and easily via Web browsers
because the data are located in one place.
End users can conduct extensive analysis with data in ways that
may not have been possible before.
End users have a consolidated view of organizational data.
5.5 KNOWLEDGE MANAGEMENT
Knowledge management (KM)
Knowledge
Intellectual capital (or intellectual assets)
© Peter Eggermann/Age Fotostock America, Inc.
KNOWLEDGE MANAGEMENT (CONTINUED)
Explicit Knowledge
(above the waterline)
Tacit Knowledge
(below the waterline)
© Ina Penning/Age Fotostock America, Inc.
KNOWLEDGE MANAGEMENT (CONTINUED)
Knowledge management systems (KMSs)
Best practices
© Peter Eggermann/Age Fotostock America, Inc.
KNOWLEDGE MANAGEMENT SYSTEM CYCLE
Create knowledge
Capture knowledge
Refine knowledge
Store knowledge
Manage knowledge
Disseminate knowledge
HOMEWORK
• Answer the questions for the Closing Case, “Can Organizations
Have Too Much Data?”