Transcript Big data

Challenges and Opportunities with Big Data
Dr Hammou Messatfa
IBM Europe Government CTO
Member of the IBM Academy of Technology
© 2013 IBM Corporation
Agenda
• What is big data, why is it such a popular topic, and why now?
• Implications on skills
• Organizations are extracting value from big data
• Implications on Research
• IBM’s big data journey
Data is the new Oil. Data is just like crude. It’s
valuable, but if unrefined it cannot really be used.
– Clive Humby, DunnHumby
We have for the first time an economy based on
a key resource [Information] that is not only renewable,
but self-generating. Running out of it is not a problem,
but drowning in it is.
– John Naisbitt
The number of organizations that see analytics as a competitive advantage is growing.
[Chart: years 2010, 2011, 2012; labels "63%", "business initiative", "BUSINESS IMPERATIVE"]
Analytics is Progressing from the Possible to the Proven
• Smarter Healthcare Analytics: helps detect life-threatening conditions up to 24 hours sooner
• Smarter Revenue Management (Tax Agency): $16 billion reduction in improper payments
• Smarter Crime Prevention: cut serious crime by 30%
Big data characteristics
Big data embodies new data characteristics created by today’s
digitized marketplace
Characteristics of big data
Source: IBM methodology
Organizations are evolving their big data journey
PEOPLE & PROCESS
What skills and processes do I need to add or
modify to be successful?
RESEARCH
An acute shortage of skills threatens our ability to address emerging opportunities and risks
Among organizations worldwide today…
• … have major skill gaps in mobile, business analytics, and security
• … has all the skills it needs to be successful applying advanced technology* for business benefit
• … report a skills shortage in the ability to manage information
* Includes business analytics, mobile computing, social business, and cloud computing
Sources: IBM Tech Trends report 2012, Enterprise Strategy Group, CompTIA
Big data requires a broad set of skills
"By 2015, big data demand will reach 4.4 million jobs globally, but only
one-third of those jobs will be filled."
Source: Gartner "Gartner's Top Predictions for IT Organizations and Users, 2013 and Beyond: Balancing Economics, Risk,
Opportunity and Innovation" 19 Oct 2012
• Math and Operations Research Expertise: develop analytic algorithms
• Data Experts: data architecture, management, governance, policy
• Decision Making Executive and Management: apply information to solve business issues
• Tool Developers: mask complexity and analytics to lower skills boundaries
• Industry Vertical Domain Expertise: develop hypothesis, identify relevant business issues, ask the right questions
• Visualization Expertise: interpret data sets, determine correlations and present in meaningful ways
Sample critical job roles
Data Policy is the fastest-growing job role!
http://www.cutter.com/bia/fulltext/reports/2013/02/index.html
http://ibmdatamag.com/2013/02/i-am-an-information-strategist
IBM Academic Initiative
August 14, 2013
IBM and Universities Team Up to Close a 'Big Data' Skills Gap
By Lee Gardner
IBM Corporation's skills program focused on partnering with university faculty
Our mission
• Partner with academic institutions to better educate millions of students for a smarter planet and more competitive IT workforce
Key offering areas
• Business Analytics
• Big Data
• Security
• Software Engineering
• Mobile Development
www.ibm.com/academicinitiative
IBM Academic Collaboration Capabilities Map
• Faculty-centric education: Academic Events, T3 Training, Curriculum Support, Laboratories, Instructor Certification, Faculty Awards, Faculty Internships, Advisory Boards
• Student-centric education: Job Fairs, Student Training, Industry Mentorship, Student Projects, Student Certification, Student Contests, Internship & Hiring, Student Communities
• Research: Scientific Conferences, IBM Research Visits, Joint Funding Proposals, Research Projects, Publications, Faculty Awards, Research Internships, Research Network
• Business: Industry Events, Technology Briefings, Market Surveys, Services, Products, Entrepreneurship, Venture Capital, Industry Community
• Partnership Management
Organizations are evolving their big data journey
STRATEGY & VALUE
What are the key business issues or opportunities
that big data can help me to address?
Big data adoption
Patterns of organizational behavior are consistent across four
stages of big data adoption
When segmented into four groups based on current levels of big data activity, respondents showed significant consistency in organizational behaviors
Total respondents n = 1061
Totals do not equal 100% due to rounding
Big data is a business priority
– inspiring new models and processes for organizations, and even entire
industries
Organizations are evolving their big data journey
RESEARCH
What are the essential analytics capabilities we
need to ensure we have in place?
But there’s still room for research!
• Improve individual tools
– Handle particular data types better
– Make it easier to find entities in data
– Make it easier to compose analyses from existing models
• Improve the environment for exploring massive data
– Pre-integrated data sets to provide context
– Powerful infrastructure for data management and analytics
– Rich collection of analytics and tools for analysis
– Expertise in all aspects of the process
– A great user experience through automation and intelligent guidance
• Leverage tools and environment to solve important problems for people, industry and the world at large
The Big Data Approach to Analytics is Different
Traditional Analytics: Structured & Repeatable (structure built to store data)
• Business Users determine questions; IT Team builds system to answer known questions
• Capacity-constrained down-sampling of available information
• Carefully cleanse a small amount of information before any analysis
Big Data Analytics: Iterative & Exploratory (data is the structure)
• IT Team delivers data on flexible platform; Business Users explore and ask any question
• Analyze ALL available information: whole-population analytics connects the dots
• Analyze information as is & cleanse as needed & existing repeatable
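A minimal sketch of the contrast above between capacity-constrained down-sampling and whole-population analysis. The records, the "unusual" label, and the helper names are invented for illustration and do not come from the slides; the point is only that a rare pattern can vanish from a sample while a full scan still finds it.

```python
from collections import Counter


def count_by_key(records, key):
    """Count how often each value of `key` occurs in the records."""
    return Counter(r[key] for r in records)


def systematic_sample(records, step):
    """Keep every `step`-th record (a simple form of down-sampling)."""
    return records[::step]


# Synthetic data (hypothetical): 10,000 records, 3 of them unusual.
records = [{"id": i, "kind": "normal"} for i in range(10_000)]
for i in (1_234, 5_678, 9_101):
    records[i]["kind"] = "unusual"

# Traditional approach: analyze a 1% sample.
sample = systematic_sample(records, step=100)
sample_counts = count_by_key(sample, "kind")

# Big data approach: analyze all available records.
full_counts = count_by_key(records, "kind")

print("unusual records in sample:", sample_counts["unusual"])
print("unusual records in full data:", full_counts["unusual"])
```

In this constructed case the three unusual records fall between the sampled positions, so only the full scan reports them.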
The Big Data Approach to Analytics is Different
Traditional Analytics: Structured & Repeatable (structure built to store data)
• Diagram labels: Hypothesis, Question, Data, Answer
• Start with hypothesis, test against selected data
• Analyze after landing…
Big Data Analytics: Iterative & Exploratory (data is the structure)
• Diagram labels: All Information, Data, Exploration, Correlation, Actionable Insight
• Data leads the way: explore all data, identify correlations
• Analyze in motion…
Big data: This is just the beginning
[Chart: volume in exabytes (axis 0 to 9000) and percentage of uncertain data (axis 0 to 100), 2010 to 2015, for Enterprise Data, VoIP, Social Media, and Sensors & Devices; annotated with Volume, Variety, Veracity, and a "You are here" marker]
Source: IBM Global Technology Outlook 2012
IBM source data is based on analysis done by the IBM Market Intelligence Department. IBM Market Intelligence data
is provided for illustrative purposes and is not intended to be a guarantee of future growth rates or market opportunity
Preparing data for analysis itself requires analytics
Q: “How exposed am I to my borrowers?”
Midas Flow: Crawl, Extract, Resolve, Map & Fuse
Data sources: SEC, FDIC
Other diagram labels: Financial Companies & Key People, Temporal Analysis, Reports, Search UI
Entities: Loan, Company, Security, Event, Person
Relationship labels: borrower, lender; subsidiaries, banking subsidiaries; issuer; insider, 5% owner, 10% owner; employment, director, officer; holdings, transactions; job changes; bankruptcies, merger/acquisitions
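The Resolve and Map & Fuse stages named above can be sketched roughly as follows. This is not the Midas implementation: the company names, loan amounts, suffix list, and the functions `resolve` and `fuse` are all hypothetical, chosen only to show why records from different sources must be matched to one entity before a question about exposure can be answered.

```python
import re
from collections import defaultdict

# Hypothetical list of legal suffixes to ignore when matching names.
SUFFIXES = {"inc", "corp", "corporation", "co", "ltd", "llc"}


def resolve(name):
    """Normalize a company name into a crude matching key."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def fuse(records):
    """Merge records whose names resolve to the same key."""
    fused = defaultdict(lambda: {"names": set(), "sources": set(), "loans": []})
    for rec in records:
        entity = fused[resolve(rec["company"])]
        entity["names"].add(rec["company"])
        entity["sources"].add(rec["source"])
        entity["loans"].extend(rec.get("loans", []))
    return dict(fused)


# Invented records standing in for extracted filings.
records = [
    {"source": "SEC", "company": "Acme Holdings, Inc.", "loans": [5_000_000]},
    {"source": "FDIC", "company": "ACME HOLDINGS INC", "loans": [2_500_000]},
    {"source": "SEC", "company": "Birch Bank Corp."},
]

entities = fuse(records)
# Total lending per resolved entity: a toy answer to the exposure question.
exposure = {key: sum(e["loans"]) for key, e in entities.items()}
print(exposure)
```

Without the resolve step, the two spellings of the same borrower would be counted as separate companies and the exposure would be understated for each.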
Sample Application – Real Time Lead Generation
Michael’s online friends offer lots of advice
• “Buying a DSLR today!”
• “Go for the best, DP2000”
• “Thrza gr8 deal on ZX-550 @ the mall”
Data: 250M tweets/day; Prior Social Business Transactions Data
Analytics: Entity Extraction, Fact Discovery, Intent & Sentiment; Influencers; Intent
Millions of tweets yield one company-specific fact: Customer ready to buy a DSLR camera today, possibly at a nearby mall
Text Analytics used to extract intent from Social Media
• “Wifey’s birthday tomorrow, looking for a killer dslr”: Married, Male, Spouse Birthdate, Gift Type, Intent to Purchase, Timeframe
• “Maybe I should buy her that purple roadster, while I’m at it. ;-) lol”: Sarcasm, Wishful Thinking; Intent to Purchase, Gift Type?
• “In NYC area this w/e, any good malls nearby?”: Potential Locations and Activity; Region & City Location, Timeframe, Intent to Shop
• Resultant fact base contains billions of facts, and is incrementally updated
• Fact segmentation or clustering is rapid enough to drive a business decision
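The tagging shown on the text analytics slide above can be approximated, very crudely, with keyword rules. This sketch is an assumption for illustration only: the rule names and patterns are invented, and production text analytics would rely on far richer language models than a handful of regular expressions.

```python
import re

# Hypothetical rules: each fact label maps to a pattern that hints at it.
RULES = {
    "intent_to_purchase": re.compile(r"\b(buy|buying|looking for)\b", re.I),
    "intent_to_shop": re.compile(r"\b(mall|malls|store|stores)\b", re.I),
    "timeframe": re.compile(r"\b(today|tomorrow|this w/e|this weekend)\b", re.I),
    "spouse": re.compile(r"\b(wifey|wife|husband|hubby)\b", re.I),
    "possible_sarcasm": re.compile(r"(;-\)|\blol\b)", re.I),
}


def extract_facts(message):
    """Return the set of fact labels whose pattern occurs in the message."""
    return {name for name, pattern in RULES.items() if pattern.search(message)}


messages = [
    "Wifey's birthday tomorrow, looking for a killer dslr",
    "Maybe I should buy her that purple roadster, while I'm at it. ;-) lol",
    "In NYC area this w/e, any good malls nearby?",
]

for message in messages:
    print(sorted(extract_facts(message)), "<-", message)
```

Note that the second message matches both a purchase rule and a sarcasm marker, which is the kind of conflict a downstream step would need to weigh before treating it as a real lead.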
Deriving actionable consumer insights from social media
Leverage social media and computational models to predict intrinsic traits that influence consumer behavior
Leading IBM: Eras of computing
[Chart: Computer Intelligence over Time across the Tabulating Systems Era, Programmable Systems Era, and Cognitive Systems Era]
© 2012 International Business Machines Corporation
Symptoms: A 58-year-old woman presented to her primary care physician after several days of dizziness, anorexia, dry mouth, increased thirst, and frequent urination. She had also had a fever and reported that food would “get stuck” when she was swallowing. She reported no pain in her abdomen, back, or flank and no cough, shortness of breath, diarrhea, or dysuria.
Patient History: Her history was notable for cutaneous lupus, hyperlipidemia, osteoporosis, frequent urinary tract infections, a left oophorectomy for a benign cyst, and primary hypothyroidism, diagnosed a year earlier.
Medications: Her medications were levothyroxine, hydroxychloroquine, pravastatin, and alendronate.
Family History: Her family history included oral and bladder cancer in her mother, Graves' disease in two sisters, hemochromatosis in one sister, and idiopathic thrombocytopenic purpura in one sister.
Findings: A urine dipstick was positive for leukocyte esterase and nitrites. The patient was given a prescription for ciprofloxacin for a urinary tract infection. 3 days later, patient reported weakness and dizziness. Her supine blood pressure was 120/80 mm Hg, and pulse was 88.
Extracted from the record:
• Symptoms: difficulty swallowing, fever, dry mouth, thirst, anorexia, frequent urination, dizziness
• Negative symptoms: no abdominal pain, no back pain, no cough, no diarrhea
• Family History: oral cancer, bladder cancer, hemochromatosis, purpura, Graves’ disease (thyroid autoimmune)
• Patient History: cutaneous lupus, osteoporosis, hyperlipidemia, frequent UTI, hypothyroidism
Putting the pieces together at point of impact can be game changing
can be life changing
Extracted from the record:
• Medications: alendronate, pravastatin, levothyroxine, hydroxychloroquine
• Findings: urine dipstick: leukocyte esterase; supine 120/80 mm Hg; heart rate: 88 bpm; urine culture: E. coli
Diagnosis Models (confidence): Renal Failure, UTI, Diabetes, Influenza, Hypokalemia, Esophagitis
Most Confident Diagnosis shown: UTI, Diabetes, Esophagitis, Influenza
• Extract Symptoms from record
• Use paraphrasings mined from text to handle alternate phrasings and variants
• Perform broad search for possible diagnoses
• Score Confidence in each diagnosis based on evidence so far
• Identify negative Symptoms
• Reason with mined relations to explain away symptoms (thirst is consistent w/ UTI)
• Extract Family History
• Use Medical Taxonomies to generalize medical conditions to the granularity used by the models
• Extract Patient History
• Extract Medications
• Use database of drug side-effects
• Together, multiple diagnoses may best explain symptoms
• Extract Findings: Confirms that UTI was present
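The idea of scoring confidence in each diagnosis as evidence accumulates can be illustrated with a toy calculation. This is not how Watson scores hypotheses: the `KNOWLEDGE` table, the scoring formula, and the function names are assumptions made up for this sketch, and the symptom associations are simplified placeholders rather than medical guidance.

```python
# Toy knowledge base (invented for illustration, not medical guidance):
# each diagnosis maps to the evidence items that would support it.
KNOWLEDGE = {
    "UTI": {"frequent urination", "fever", "leukocyte esterase",
            "urine culture: E. coli"},
    "Diabetes": {"thirst", "frequent urination", "dry mouth"},
    "Influenza": {"fever", "cough"},
    "Esophagitis": {"difficulty swallowing"},
}


def score(expected, present, absent):
    """Share of expected evidence observed, minus the share explicitly ruled out."""
    support = len(expected & present)
    contradiction = len(expected & absent)
    return (support - contradiction) / len(expected)


def score_all(present, absent):
    """Score every diagnosis in the toy knowledge base."""
    return {dx: score(expected, present, absent)
            for dx, expected in KNOWLEDGE.items()}


symptoms = {"difficulty swallowing", "fever", "dry mouth", "thirst",
            "anorexia", "frequent urination", "dizziness"}
negatives = {"abdominal pain", "back pain", "cough", "diarrhea"}
findings = {"leukocyte esterase", "urine culture: E. coli"}

# Confidence from symptoms alone, then again once findings are extracted.
before = score_all(symptoms, negatives)
after = score_all(symptoms | findings, negatives)

for dx in KNOWLEDGE:
    print(f"{dx}: {before[dx]:.2f} -> {after[dx]:.2f}")
```

In this sketch the negative symptom "no cough" cancels the support that fever gives to influenza, and adding the laboratory findings raises the score for UTI, mirroring the "identify negative symptoms" and "extract findings" steps listed above.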
Watson enables three classes of cognitive services
Ask
• Leverage vast amounts of data
• Ask questions for greater insights
• Natural language inquiries
• e.g. - Next generation Chat
Discover
• Find the rationale for given answers
• Prompt for inputs to yield improved responses
• Inspire considerations of new ideas
• e.g. - Next generation Search → Discovery
Decide
• Ingest and analyze domain sources, info models
• Generate evidence based decisions with confidence
• Learn with new outcomes and actions
• e.g. - Next generation Apps → Probabilistic Apps
IBM Research: The World is Our Lab
Locations shown: Dublin, China, Zurich, Almaden, Watson, Haifa, India, Tokyo, Austin, Brazil, Melbourne
Legend: IBM Research labs; Labs added since 2010; Other IBM Research presence
IBM building strength and leadership in big data and analytics
Building the most comprehensive Business Analytics & Optimization portfolio & partnerships
• More than $16B in Acquisitions Since 2005
• More than 10,000 Technical Professionals
• More than 7,500 Dedicated Consultants
• Largest Math Department in Private Industry
• More than 27,000 Business Partner Certifications
• Partner with more than 1000 Universities on Analytics
Capabilities on the 2005 to 2013 timeline: Talent Acquisition, Social Analytics/Consumer Insight, Workload Optimized Systems, Advanced Case Management, Content Analytics, Decision Management, Stream Computing, Pervasive Content, pureScale, pureXML, Deep Compression, Developer Productivity, Autonomic Operations
Innovation that Matters