Net-Centric 2016
Big Data
Why Bigger Does Not Mean Better
Adrian Farrel : Old Dog Consulting
[email protected]
www.isocore.com/2016
What Shall We Talk About?
• What is “Big Data”?
• How big is “Big”?
• The big database: watering hole or tar pit?
• Clogging the pipes
• The observer effect
• Getting the balance right
The More You Know, The More You Can Know
• The basic premise and promise of Big Data
• “Give me all of the data and I will answer all of your questions”
• Applies to any scenario and any data set
• 57% of customers who bought Jell-O also bought floor mops
• The proposal:
• Collect all data about a network as it happens
• Call it “Telemetry”
• Use it in real time to diagnose and predict all network behaviour
• Call it intelligent network operation
Let Me Tell You a Story
• I like fairy stories
• I write fairy stories
• The underlying principles…
• Magic is an everyday occurrence
• Unexpected and wonderful things can happen
• The ending isn’t always happy
• There can be blood and tears
• People get what they deserve
• Big Data is a Fairy Story
When More Is Less
• You can be swamped with information
• It may be meaningless details
• It could be valid data but…
• It might not be immediately relevant
• It can swamp the one piece of information that you really need to know
• It could distract you
• It might be false (or at least a false positive)
• The more data you have the harder it is to find out the facts
“Sit Down and Tell Me All About It”
• That dangerous little word “all”
• It’s pretty rare that someone has enough time to hear about all of your problems
• And they probably don’t care that much
[Cartoon captions: “Let me see, I made a short list of a few other things” / “Will he never shut up?”]
“What I Did on My Holidays”
• It’s a typical English assignment in school
• Write an essay about what you did on your holidays
• My brother cares about details
• He began with what time he woke up and what was on the radio
• He went on to describe breakfast
• And how he cleaned his teeth
• And what he wore (including in what order he put his clothes on)
• Six pages later, and at the end of the time for the assignment, he still hadn’t got out of the house
“What Are Your Symptoms?”
• The doctor asks you: “What seems to be the trouble?”
• You tell him…
• I have a bit of a headache
• I’m a bit slow first thing in the morning
• I’m really worried about my bills
• My knees hurt
• There’s a pain in my back when I lie down
• My wife is always complaining
• My finger joints are a bit stiff
• I have a long list of jobs I need to do
• What will he diagnose?
• In reality, doctors often don’t discover the real reason for a visit until right at the end of the consultation
• When the patient is forced to say what is really troubling them
• I’m in so much pain that I can’t sleep
What Are The Traffic Conditions?
• Quite a bit of traffic on Route 7
• Traffic lights not working in down-town Baltimore
• Roadworks in Leesburg
• Ooooh, look! A red Ferrari!
• Abnormal load on Route 12 going West
• School children on their way about now
• That’s a big bus
• Oh, by the way, watch out! There’s a car coming!
Seeing the Wood – There Are Too Many Trees
• Sometimes it can be really hard to sort out the important details if you have too much data to sort through.
• It can also be difficult to get a good view of the overall situation if you are looking at all of the separate data points.
• We say “It is hard to see the wood for the trees.”
• The point is not that the data is not valid
• Simply that there is too much
• It stops us seeing the details, but also the shape of the bigger picture
Just Take A Sip
• The rate of arrival of new data can be enormous
• We’re not just talking about a static data set
• Static sets can be processed out of real time and used to find out what happened
• Big Data in networking is ever-changing
• Traffic fluctuates
• Devices and links “twitch”
• Steady state is only a statistical average
• You can drown in new data and never get to process the big picture
Getting Bogged Down
And Pulling Everyone Else Into Your Mess
The Hawthorne Effect
• A term in psychology
• The act of analysing (productivity) behaviour of a population tends to improve productivity
• When the population knows it is being specially monitored
• Because it’s a novelty
• Because they like to know that someone cares
• Because they know they’re being watched
• And…
• Experimental evidence that we don’t understand…
• The impact of changing variables
• The other variables in the experiment
The Probe Effect and The Observer Effect
• In software and hardware systems
• The act of observing a process may affect the outcome
• For example:
• Turning on logging or tracing in software
• Slows things down
• Causes things to happen in a different order
• Moves things around in memory
• And a “debug build” changes the executable image
• For example:
• Probing hardware can change the capacitance or
resistance in a circuit
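The software side of the probe effect can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `checksum`, the helper `timed`, and the logger name `probe_demo` are hypothetical, and the trace is written to an in-memory buffer rather than a real log file. The same computation is run with and without per-item tracing; the answer is unchanged, but the traced run does extra work for every item it observes.

```python
import io
import logging
import time


def checksum(values, log=None):
    """Sum the values; optionally emit one debug record per item."""
    total = 0
    for v in values:
        total += v
        if log is not None:
            log.debug("added %d, running total %d", v, total)
    return total


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical logger that writes its trace to an in-memory buffer.
buffer = io.StringIO()
logger = logging.getLogger("probe_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(logging.StreamHandler(buffer))

data = list(range(10_000))
plain_result, plain_time = timed(checksum, data)           # no observation
traced_result, traced_time = timed(checksum, data, logger)  # observed run
trace_lines = buffer.getvalue().count("\n")

print(f"untraced: {plain_time:.6f}s, traced: {traced_time:.6f}s, "
      f"{trace_lines} trace lines written")
```

The exact timings depend on the machine, so none are quoted here; the point is only that the observed run produces one trace record per item on top of the real work.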
None of This is New
Consequences In Networking
• Packet counters
• Internal logging
• Syslog and alarms
• SNMP Notifications (Traps)
• Netflow (IPFIX)
• OAM
• Fault detection
• Fault isolation
• Network diagnosis
What Have We Learned?
• Things tend to blow up
• In networking, the act of gathering information impacts the ability to transmit data
• More generally true in packet networks?
• This can be so bad that you lose the ability to control the system
• It’s a DoS vector
• So everything has to have thresholds and rate limits and information aggregation
• But in that case you’re not gathering all the data
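The trade-off in the last two points can be made concrete with a small sketch of a rate limit combined with aggregation. It assumes a token-bucket limiter, which is one common choice and not something the slides specify; the class `RateLimitedReporter`, its parameters, and the event kind `link_flap` are all hypothetical. Events beyond the allowed rate are not forwarded individually; they are only counted.

```python
from collections import Counter


class RateLimitedReporter:
    """Token bucket: forward at most `rate` events per second with bursts
    of up to `burst`; anything beyond that is counted, not forwarded."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)   # bucket starts full
        self.last = 0.0              # time of the previous event
        self.forwarded = []          # events sent on individually
        self.suppressed = Counter()  # aggregated count per event kind

    def report(self, now, kind):
        # Refill the bucket for the time elapsed since the last event.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            self.forwarded.append((now, kind))
            return True
        self.suppressed[kind] += 1
        return False

    def summary(self):
        """Aggregate view of everything that was not forwarded."""
        return dict(self.suppressed)


# Ten events in under a second, then one more after a quiet period.
reporter = RateLimitedReporter(rate=1.0, burst=2)
for i in range(10):
    reporter.report(now=i / 10, kind="link_flap")
reporter.report(now=5.0, kind="link_flap")

print(len(reporter.forwarded), "forwarded;", reporter.summary(), "suppressed")
```

The limiter protects the network and the collector, but the suppressed events survive only as a count, which is exactly the sense in which you are no longer gathering all the data.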
None of This Is New… Or is it?
• Maybe some things are new
• Better understanding of networks
• More OAM and measurement tools
• High capacity resources
• Faster CPUs and collaborative processing
• More storage and memory
• Bigger links
• Better algorithms
• Artificial intelligence
Big Data – Just Big Enough
• The clue is in collecting the data you need
• Don’t just collect all data
• But how do you know what data you need?
• This has to be part of the solution set
• You have smart processing of your data set
• You must also have smart building of your data set
• (And you also need smart collection mechanisms)
• Then you pretty much solve all of the problems
Artificial Intelligence
• Everyone is talking about “machine learning” and AI
• This has been the tool of choice for processing enormous and complex data sets for years
• Gene patterning, drug trials, face recognition, self-driving cars, Internet searches, …
• Does Big Data in networking have any meaning without AI?
• To process the data
• To determine what data to collect
Questions
[email protected]