Big Data - SUNY Oneonta

Big Data
SUNY IITG
Outline
• Introduction
• Similar Topics
• What Does Big Data Look Like?
• Why Use Big Data?
– How Is It Useful?
– What Companies Rely On Big Data?
• Summary
• Questions
• References
4/6/2016
[Figure: events, existence, changes → observation → recording → postprocess. Caption: Where (big) data starts to play a role]
What exactly is Big Data?
• Extremely large data sets that may be analyzed computationally to reveal patterns,
trends, and associations, especially relating to human behavior and interactions.
Q: What exactly is big data, and why does the University need an institute?
A: Currently, many domains, including science, engineering, health care,
environmental science, e-commerce and, increasingly, the humanities, generate
massive amounts of data. These data are accumulating to the point where making
sense of them is a huge challenge. And it’s not just about size, speed and variety,
but also the complexity of the data sets, the enormous numbers of variables and
the uncertainty in measurements from global environmental monitoring systems,
studies of gene expression and others. Because data can come from multiple
sources, they also must be integrated, creating additional complications.
So we need a Big Data Institute at the University because many of the problems
we face in science, engineering, health care and the humanities require very
powerful tools for answering questions across the domains. And because U.Va. is a
complete university, we can combine our efforts to solve some of the most
challenging problems.
History of data
• Data has been recorded by human beings since
ancient times.
– Experts generated data
– Average people consumed data
• In the past century, recording became easier with computers
– Experts or specialized software still generated
data
– Average people consumed data
• Nowadays, data is generated by people using various
devices and software
– By everybody
“Big” data
• Data on paper
• The use of computers creates the big data problem,
with the following characteristics:
– Faster: high speed, the rapidity with which data comes
in
– Easier to record
– More variety
– Larger amounts
– Various sources
Multi-faceted
• “Big data” is something that has multiple
definitions and resources
• Sensor networks
• High-precision devices
• Surveillance
• …
• Data fusion and information analysis
• More accurate analyses may lead to more
confident decision making. And better
decisions can mean greater operational
efficiencies, cost reductions and reduced risk.
(Predictive Analytics, Big Data)
• What is Data Mining?
Who is in the jungle?
• SAS
Introduction to Big Data
• Extremely large data sets which grow
exponentially
• Difficult to process with traditional methods
• Reveals patterns, trends & associations
• What is Big Data and how does it work?
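To make "grow exponentially" concrete, here is a minimal Python sketch. The starting size (1 TB) and the doubling period (2 years) are hypothetical values chosen for illustration; they are not figures from these slides.

```python
def projected_size(initial_tb: float, doubling_years: float, years: float) -> float:
    """Size after `years` if the data set doubles every `doubling_years`."""
    return initial_tb * 2 ** (years / doubling_years)


# Hypothetical example: 1 TB today, doubling every 2 years.
for year in range(0, 11, 2):
    print(f"year {year:2d}: {projected_size(1.0, 2.0, year):6.1f} TB")
```

Each step multiplies the previous size by the same factor, which is why storage and processing that are adequate today can fall behind within a few doubling periods.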
Introduction (continued)
• Unimportant: Size of data
• Important: Ability to analyze such a data set
Graphical Representation
• “There were 5 exabytes of information created between the dawn of
civilization through 2003, but that much information is now created
every 2 days.”
– Eric Schmidt, Google CEO (2010)
Confusion on Big Data
• Not usually confused with another concept
• Confusion occurs when trying to understand
why
Components of Big Data: Multi-faceted
Relationships among data
Data Representation: Table
Measure            In words          In digits
Pictures/Day       55 million        55,000,000
Tweets/Day         340 million       340,000,000
Documents/Day      1 billion         1,000,000,000
Total Bytes/Day    2.5 quintillion   2,500,000,000,000,000,000
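As a quick sanity check on the total, 2.5 quintillion bytes per day can be converted into more familiar units. The sketch below assumes SI units (1 exabyte = 10^18 bytes) and the short-scale quintillion (10^18). The result, 2.5 exabytes per day, agrees with the rate in the Eric Schmidt quote (5 exabytes every 2 days).

```python
BYTES_PER_EXABYTE = 10**18   # SI exabyte; one short-scale quintillion is also 10**18
BYTES_PER_TERABYTE = 10**12  # SI terabyte
SECONDS_PER_DAY = 24 * 60 * 60

total_bytes_per_day = 25 * 10**17  # 2.5 quintillion bytes, from the table

exabytes_per_day = total_bytes_per_day / BYTES_PER_EXABYTE
terabytes_per_second = total_bytes_per_day / SECONDS_PER_DAY / BYTES_PER_TERABYTE

print(f"{total_bytes_per_day:,} bytes/day")
print(f"{exabytes_per_day} EB/day")
print(f"{terabytes_per_second:.1f} TB/second")
```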
Process of Big Data Creation
Mainframe → Client/Server → The Internet → Social Media/The Cloud
Why Use Big Data?
• Better management of data
• Speed, capacity & scalability benefits
• Better visualization of data
• Data analysis capabilities will evolve
Big Data Analysis Software
• Amazon
– Amazon DynamoDB
– Amazon Redshift
• DataStax
– Cassandra
• Originally developed at Facebook, inspired by Amazon's Dynamo design
Who Relies on Big Data?
• Majority of companies
– Retail
• Amazon
– Entertainment
• Netflix
– Health
• MyFitnessPal
Amazon
• Predicts what customers want before they
start their search
• “Frequently Bought Together”
• “Customers Who Bought This Item Also
Bought”
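One simple way to produce a "Customers Who Bought This Item Also Bought" list is to count how often pairs of items show up in the same order. The Python sketch below illustrates that idea only; it is not Amazon's actual method, and the order data and the `also_bought` helper are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical order history: each set is one customer's basket.
orders = [
    {"tent", "sleeping bag", "lantern"},
    {"tent", "sleeping bag"},
    {"tent", "lantern"},
    {"tent", "sleeping bag", "pillow"},
]

# Count how often each pair of items appears in the same order.
pair_counts = Counter()
for basket in orders:
    for a, b in combinations(sorted(basket), 2):
        pair_counts[(a, b)] += 1


def also_bought(item: str, top_n: int = 2) -> list[str]:
    """Items most often bought together with `item`, most frequent first."""
    scores = Counter()
    for (a, b), count in pair_counts.items():
        if a == item:
            scores[b] += count
        elif b == item:
            scores[a] += count
    return [other for other, _ in scores.most_common(top_n)]


print(also_bought("tent"))
```

With millions of orders the counting step is the expensive part, which is where distributed storage and processing come in.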
Netflix
• Analyzes viewing habits
• Improves suggestions
MyFitnessPal
• Extensive database with information
immediately available
Big Data’s Influence on the
Future
• Healthcare industry could save $300 billion a
year by using big data analytics
• Big Data has helped predict crimes three times
more accurately than current forecasting
• Companies involved in retail could increase
profit by more than 60%
Fun Facts
• 1.9 million IT jobs will be created by 2015 to
work with Big Data projects
• Data transferred via mobile networks
increased by 81% every month between 2012
and 2014
• NSA is only capable of analyzing 1.6% of all
internet traffic per day
Summary
Big Data has a much larger impact on daily
life than the majority of society realizes.
Whether people are actively on the internet or
shopping at the grocery store, their data is being
recorded and analyzed instantaneously. Without
Big Data, the technology available to us today
would not be as reliable or successful.
Questions
• What is more important, the amount of data
or the process by which the data is analyzed?
– The process of analyzing the data
• What is the pattern of growth for Big Data?
– Exponential growth
References
• http://news.virginia.edu/content/uva-appoints-engineeringprofessor-don-brown-lead-new-big-data-institute
• http://www.sas.com/en_us/insights/big-data/what-is-bigdata.html
• http://oxfamblogs.org/fp2p/what-is-the-future-impact-of-bigdata/#prettyPhoto
• http://smartdatacollective.com/bernardmarr/232941/top-10big-data-quotes-all-time
• http://www.cio.com/article/2385690/big-data/5-reasons-tomove-to-big-data--and-1-reason-why-it-won-t-be-easy-.html
• http://www.informationweek.com/big-data/big-dataanalytics/13-big-data-vendors-to-watch-in-2013/d/did/1107738?page_number=1