Cross-Platform Aviation Analytics
Using Big-Data Integration Methods
2013 Integrated Communications Navigation
and Surveillance (ICNS) Conference
April 25, 2013
Dr. Tulinda Larsen
Vice President
[email protected]
Mobile. +1 (443) 510-3566
4833 Rugby Avenue, Suite 301
Bethesda, Maryland 20814
www.masflight.com
THE ANALYSIS CHALLENGE
The Analysis Challenge
Scale and complexity of aviation data limit research applications
Applications shown: Fuel and Oil Conservation; Gate and Terminal Use; Weather Plan & Ops Recovery; Pilot and Crew Staffing; Operational Optimization
Problems Acquiring Data
Obtaining radar and airport data, schedules, weather maps and forecasts, fleet information
• Real-time transmission of very large data
• Proprietary and inconsistent formats
• No conditioning or validation
Problems Analyzing Information
Using data for strategic planning and recovery, cost improvement and new market opportunities
• Goes beyond desktop capability
• Time-consuming manual slicing of data
• Need weather and competitor information to answer key operational questions
Big-data analytical methods can address these challenges
CLOUD COMPUTING
What is Cloud Computing?
The cloud consists of terrestrial servers across the Internet that collectively store, manage and process data
• Cloud computing is the use of resources (hardware and software) that are delivered as a service over the Internet or other network
• Figurative “Cloud”: the term comes from the common use of a cloud-shaped symbol as an abstraction for the Internet, but application to virtual servers is as recent as 2006
[Figure: Cloud Computing Architecture. Layers: Application, Platform, Infrastructure. Labels: Network, Identity, Monitoring, Content, Object Storage, Computation, Financial Metrics, Communication, Collaboration, Storage, Databases]
CLOUD COMPUTING
What are Cloud Architectures?
• Cloud computing services can be delivered by an internal IT organization (company-owned private cloud), by an external service provider (managed services private cloud or public cloud provider), or shared between users
[Figure: cloud provider types. Labels: Public Cloud Providers; Community Cloud Providers; Managed Services Private Providers; Company Private Clouds; Private Infrastructure. Axis: Low Industry Expertise to High Industry Focus]
In aviation, cloud resources can be customized and shared among consortiums of customers (community cloud) or shared with customers in other industries (public cloud)
BIG-DATA ANALYTICS
What is Big-Data Analytics?
• The process of examining diverse, large-scale data sets to uncover
patterns, unknown correlations and other useful information
• Organizations have different levels of (1) database management
expertise and (2) knowledge to process and analyze big data sets
– “Big data” is a relative term based on the user
– Data tables in excess of ten terabytes (10TB) are difficult to work with
using most relational database management systems, and particularly
using desktop statistics and visualization packages, including Microsoft
Excel and Access
• Unstructured data sources in the operational world simply do not fit
into desktop or small-scale database structures
– They can be hosted using cloud computing at lower cost, and mined
more efficiently, than with on-premises database architectures
BIG-DATA ANALYTICS
What are Big-Data Analytics Tools?
• Big-data analytics employs software tools from advanced analytics disciplines such as data mining and predictive analytics.
– Mining or analyzing data and trends in these multi-terabyte data sets requires parallel software running on tens, hundreds, or even thousands of servers to keep pace with user demands and processing expectations.
• A new class of big-data methods has emerged to address user demands for horizontal scaling and availability of underlying data
– Hadoop and MapReduce, among others, offer fast processing speed.
– Great for large-scale static data sets, but not so great for real-time data
– Most organizations employ a hybrid method combining technologies
• A robust open source framework supports processing in clustered systems.
• Platform-as-a-service vendors (Microsoft, Amazon, Google) offer turn-key solutions for analysts to simply upload, link and compute basic data sets
– Great for simple historical analysis; bad for real-time or diverse data sets
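The map-and-reduce pattern behind Hadoop-style tools can be illustrated with a minimal single-machine sketch. This is not Hadoop itself: the record layout (airline code, delay in minutes), the sample values and the function names are all assumptions made for illustration.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical flight records: (airline code, departure delay in minutes).
RECORDS = [
    ("UA", 12), ("AA", 0), ("UA", 45), ("WN", 5), ("AA", 30), ("WN", 0),
]

def map_phase(chunk):
    """Map step: emit a partial (count, total delay) per airline for one chunk."""
    partial = defaultdict(lambda: [0, 0])
    for airline, delay in chunk:
        partial[airline][0] += 1
        partial[airline][1] += delay
    return {airline: tuple(pair) for airline, pair in partial.items()}

def reduce_phase(left, right):
    """Reduce step: merge two partial results by summing counts and totals."""
    merged = dict(left)
    for airline, (count, total) in right.items():
        prev_count, prev_total = merged.get(airline, (0, 0))
        merged[airline] = (prev_count + count, prev_total + total)
    return merged

def average_delay(records, n_chunks=3):
    """Split records into chunks, map each chunk, then reduce the partials."""
    chunks = [records[i::n_chunks] for i in range(n_chunks)]
    # In a real cluster each chunk could be mapped on a separate server.
    partials = list(map(map_phase, chunks))
    totals = reduce(reduce_phase, partials, {})
    return {airline: total / count for airline, (count, total) in totals.items()}
```

Because the reduce step only sums partial counts and totals, the result does not depend on how the records are split, which is what allows the work to be spread across many servers.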
MASFLIGHT
masFlight: A Global Aviation Data
Warehouse and Big-Data Analytics Platform
Hybrid Architecture
• Physical architecture for secure data feeds
• Cloud-based instances for linking
• Managed cloud data tables
• Integrates with local BI and warehouses
Redundancy
• Multi-source data acquisition
• Real-time validation and processing
• Replication across cloud infrastructure
• Load balancing and parallel processing
Backup
• Cluster processing to reduce dependencies
• Monitored data integrity and performance
• Multiple geographic zones and clusters
• Imaging of tables for replication
Customization
• Customizable for specific user requirements
• Dashboards and web templates
• Integrated internal data in warehouse
• Connect to local BI systems
DATA AND APPLICATIONS
masFlight’s Data and Applications Platform
OUR CLOUD-BASED DATA WAREHOUSE
In-House Servers: For private gov’t feeds
Secure External Network
Cloud Warehouse: Linked information, 60TB structured data
Data Input Feeds
• Reference and Static Data: Geospatial, airline, airport info
• Current Weather: Global hourly conditions
• Forecast Weather: Standard and severe forecasts
• Flight Schedules: What’s planned to operate
• Airport & Gate Status: Multisource, real-time feeds
• Secure U.S./Canada Radar: Authorized direct access
• Other Airspace Data: Satellite and transponder info
• Government Economic Data: Revenue and audited data
OUR CUSTOMER APPLICATIONS
• Web Application (masflight.com): HTML 5 / Ruby; analyst focused; customizable; fast deployment; SaaS revenue model
• Dashboards & Web Services: REST web services; feed internal systems; custom dashboards; flexible interfaces
• Robots and Java Applications: Automated collection; updated in real time
• Cloud Managed Database Hosting: Virtual tables; bypass constraints; ultimate customization
MASFLIGHT PLATFORM
masFlight Platform
Multisource, integrated airline operations data
• Planned Flight Schedules
• Airport Runway Data
• Airport Gate & Terminal Data
• Airline Ops Data
• Multisource Flight Status
• U.S. Radar Data
• Airline Fleet Information
• Global Weather Data and Maps
Key Partners and Suppliers:
Our platform shows where, when and why problems occur
• Examine diversions, cancellations, delays and determine root causes
• Deep-dive into airport gates, taxi times, and runway patterns
• Analyze air space usage and air traffic management
END TO END CAPABILITY
Big-Data Analytics Facilitates End-to-End Analysis
A full picture of each flight is critical for analyzing operations
Query flights from planned schedule through post-operation recovery
Up to 500 data points per flight
[Figure: sample flight timeline; labels KIAD, V268, SWANN; times 1502Z, 1550Z, 1620Z]
Origin weather
Origin information
Operating airline
Scheduled times
Departure gate/time
Taxi-out/takeoff times
Flight plan filed
Actual path flown
Congestion
Weather diversions
En-route times and fixes
Arrival weather
Destination information
Landing/taxi times
Arrival gate/time
Diversion data
Aircraft information
Other sources only offer limited, disaggregated and unformatted regional data
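One way to picture an end-to-end flight record is as a single structure that carries the planned, actual and post-operation fields together. The sketch below is illustrative only: the class name, field names and helper methods are hypothetical, and it shows a handful of fields rather than the up to 500 data points mentioned above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional

@dataclass
class FlightRecord:
    """Hypothetical end-to-end flight record; all field names are illustrative."""
    airline: str
    origin: str
    destination: str
    scheduled_departure: datetime
    scheduled_arrival: datetime
    gate_out: Optional[datetime] = None    # actual departure from the gate
    wheels_off: Optional[datetime] = None  # takeoff time
    wheels_on: Optional[datetime] = None   # landing time
    gate_in: Optional[datetime] = None     # actual arrival at the gate
    diverted_to: Optional[str] = None      # diversion airport, if any
    route_fixes: List[str] = field(default_factory=list)

    def departure_delay(self) -> Optional[timedelta]:
        """Gate departure minus scheduled departure, if the flight has left."""
        if self.gate_out is None:
            return None
        return self.gate_out - self.scheduled_departure

    def taxi_out(self) -> Optional[timedelta]:
        """Time between leaving the gate and takeoff, if both are known."""
        if self.gate_out is None or self.wheels_off is None:
            return None
        return self.wheels_off - self.gate_out

    def taxi_in(self) -> Optional[timedelta]:
        """Time between landing and gate arrival, if both are known."""
        if self.wheels_on is None or self.gate_in is None:
            return None
        return self.gate_in - self.wheels_on
```

Keeping unknown values as `None` lets the same record be queried before, during and after the flight without special cases for each stage.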
COVERAGE
A Global Solution
masFlight tracks flights, airports and weather around the world
[Maps: North and South America; EMEA and Asia]
• Global daily flight information capture
– 82,000 flights
– 350 airlines
– 1,700 airports
• Integrated weather data for 6,000 stations
– Match weather to delays
– Validate block forecasts at granular level
– Add weather analytics to IRROPS review and scenario planning
White lines are flights in the masFlight platform from February 8, 2013.
Yellow pins are weather stations feeding hourly data to our platform.
Maps from Google Earth / masFlight
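Matching weather to delays comes down to pairing each flight with the most recent hourly observation at its airport. The following is a minimal sketch under stated assumptions: the dictionary keys, station identifiers and function names are hypothetical, and a production system would also handle time zones and missing stations more carefully.

```python
from bisect import bisect_right
from datetime import datetime

def latest_observation(observations, when):
    """Return the most recent (time, conditions) pair at or before `when`.

    `observations` must be sorted by time. Returns None if nothing qualifies.
    """
    times = [obs_time for obs_time, _ in observations]
    idx = bisect_right(times, when)
    return observations[idx - 1] if idx else None

def match_weather(flights, weather_by_station):
    """Attach the latest origin-station conditions to each flight dict."""
    matched = []
    for flight in flights:
        station_obs = weather_by_station.get(flight["origin"], [])
        obs = latest_observation(station_obs, flight["departure"])
        matched.append({**flight, "weather": obs[1] if obs else None})
    return matched
```

A binary search keeps each lookup fast even when a station has years of hourly observations.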
TOWER CLOSINGS
Example 1: Proposed FAA Tower Closures
masFlight used big data to link airport operations across three large data sets:
– Current and historical airline schedules
– Raw Aircraft Situation Display to Industry (ASDI) radar data from the FAA
– Enhanced Traffic Management System Counts (ETMS), including airport operations counts by type (commercial, freight, etc.), departure & arrival
Findings: Proposed Tower Closings
• From schedules database: 55 airports with scheduled passenger airline service
– 14 EAS Airports
• From ASDI & ETMS: 10,600 weekly flights on a flight plan (excluding VFR and local traffic)
– 6,500 Part 91/125 weekly flights
– 4,100 Part 135/121 weekly flights
[Map: dots indicate closures; red dots have scheduled service]
Based on scheduled service March 1 – 7, 2013; scheduled service includes scheduled charter flights, cargo flights, and passenger flights
TOWER CLOSINGS
Example 1: Big-Data Analytics Applied to
ASDI and ETMS To Analyze Operations
Distribution of Airports Affected by Tower Closures, by Average Number of “Daily” Impacted Flights

Average Number of Daily Operations with a Flight Plan Filed    Count of Airports
Up to 5                                                        44
5-10                                                           26
10-15                                                          24
15-20                                                          23
20-25                                                          11
25-30                                                          10
30-35                                                          6
35-40                                                          2
40-45                                                          1
45+                                                            2
Source: ASDI radar data – Part 91/125 flying and Part 135/121 flying – March 1-7, 2013; masFlight analysis
Note: Average “daily“ operations based on 5-day week
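The distribution above can be reproduced in outline by converting weekly flight-plan counts to a 5-day daily average and grouping airports into 5-wide bins. This is a sketch, not the analysis actually used: the function names and airport identifiers are hypothetical, and placing a value that falls exactly on a bin edge into the higher bin is an assumption.

```python
from collections import Counter

BIN_WIDTH = 5
TOP_BIN_START = 45
DAYS_PER_WEEK = 5  # the chart note states that "daily" averages use a 5-day week

def bin_label(avg_daily):
    """Return the chart bin label for an average daily operations value."""
    if avg_daily >= TOP_BIN_START:
        return "45+"
    low = int(avg_daily // BIN_WIDTH) * BIN_WIDTH
    # Assumption: a value exactly on an edge (e.g. 5.0) goes to the higher bin.
    return "Up to 5" if low == 0 else f"{low}-{low + BIN_WIDTH}"

def distribution(weekly_ops_by_airport):
    """Count airports per bin from a mapping of airport -> weekly operations."""
    counts = Counter()
    for weekly_ops in weekly_ops_by_airport.values():
        counts[bin_label(weekly_ops / DAYS_PER_WEEK)] += 1
    return counts
```

For example, an airport with 30 weekly flight-plan operations averages 6 per day and lands in the 5-10 bin.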
CAUSAL FACTORS
Example 2: Aviation Safety Causal Factor
Data-mining algorithms can mine the text of safety reports
to obtain specific data that can be used to analyze causal factors.
For example, consider the following ASRS report (ACN 1031837):
“Departing IAH in a 737-800 at about 17,000 FT, 11 miles behind a 737-900 on the Junction departure over
CUZZZ Intersection. Smooth air with wind on the nose bearing 275 degrees at 18 KTS.
We were suddenly in moderate chop which lasted 4 or 5 seconds then stopped and then resumed for
another 4 or 5 seconds with a significant amount of right rolling… I selected a max rate climb mode in the
FMC in order to climb above the wake and flight path of the leading -900. We asked ATC for the type ahead
of us and reported the wake encounter. The -900 was about 3,300 FT higher than we were.”
• Synopsis
– B737-800 First Officer reported wake encounter from preceding B737-900
with resultant roll and moderate chop.
What causal factors can be identified from this narrative that
could be applied to future predictive applications?
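As a rough illustration of mining such a narrative, the sketch below uses regular expressions to pull candidate data elements from an excerpt of the report quoted above. The patterns, field names and function names are assumptions tuned to this one excerpt; real safety reports vary widely in phrasing and would need more robust text-mining methods.

```python
import re

# Excerpt from the ASRS narrative quoted above (ACN 1031837).
NARRATIVE = (
    "Departing IAH in a 737-800 at about 17,000 FT, 11 miles behind a 737-900 "
    "on the Junction departure over CUZZZ Intersection. Smooth air with wind "
    "on the nose bearing 275 degrees at 18 KTS."
)

def _first(pattern, text):
    """Return the first capture group of `pattern` in `text`, or None."""
    match = re.search(pattern, text)
    return match.group(1) if match else None

def extract_factors(text):
    """Pull candidate causal-factor fields out of a free-text safety report."""
    wind = re.search(r"(\d{3}) degrees at (\d+) KTS", text)
    separation = _first(r"(\d+) miles", text)
    altitudes = re.findall(r"(\d{1,3}(?:,\d{3})+) FT", text)
    return {
        "origin": _first(r"Departing ([A-Z]{3})\b", text),
        "aircraft_types": re.findall(r"\b7\d7-\d{3}\b", text),
        "altitudes_ft": [int(a.replace(",", "")) for a in altitudes],
        "separation_miles": int(separation) if separation else None,
        "fix": _first(r"\b([A-Z]{5}) Intersection", text),
        "wind_deg_kts": (int(wind.group(1)), int(wind.group(2))) if wind else None,
    }
```

Fields that the patterns cannot find are returned as `None` or an empty list, so a missing element in one report does not stop the extraction of the others.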
CAUSAL FACTORS
Example 2: Identifying Causal Factors
Indicators – Data Element
• Time of day
• Date range (month, day)
• Aircraft type
• Fix or coordinates
• Originating airport
• Destination airport
• Weather notes
Methods – Identifying Context and Causes
• We pinpoint the sequencing of flights on the IAH Junction Seven departure (at CUZZZ) during the specified wind conditions to find cases where a B737-900 at 20,000 feet precedes by 11 miles a B737-800 at 17,000 feet
• Search related data sets including ASDI (flight tracks, local traffic and congestion)
• Weather conditions for alternative causes (winds aloft, shear and convective activity)
• Airline specific information (repeated occurrence of event in aircraft type)
Big data gives us visibility into contextual factors even if specific
data points are missing such as a specific date or route.
Big-data analytics gives us insight into unreported factors as well.
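The sequencing search described for this example can be sketched as a scan over consecutive crossings of one fix. Everything here is an assumption for illustration: the record layout, the use of ICAO type designators B739 and B738, the 12-mile threshold, and the estimate of in-trail separation as time gap multiplied by the follower's ground speed (treated as nautical miles).

```python
from datetime import datetime

def find_wake_candidates(crossings, leader="B739", follower="B738", max_sep_nm=12.0):
    """Find leader/follower pairs crossing one fix in close trail.

    `crossings` is a list of dicts with keys: id, time, type, altitude_ft,
    groundspeed_kts. Returns (leader id, follower id, separation in nm).
    """
    ordered = sorted(crossings, key=lambda c: c["time"])
    hits = []
    for lead, trail in zip(ordered, ordered[1:]):
        if lead["type"] != leader or trail["type"] != follower:
            continue
        gap_hours = (trail["time"] - lead["time"]).total_seconds() / 3600
        # Assumption: separation is approximated from the follower's ground speed.
        sep_nm = gap_hours * trail["groundspeed_kts"]
        if sep_nm <= max_sep_nm and trail["altitude_ft"] < lead["altitude_ft"]:
            hits.append((lead["id"], trail["id"], round(sep_nm, 1)))
    return hits
```

Candidate pairs found this way could then be joined with wind and weather data to test whether the reported conditions recur.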
COMPARING OTP AND UTILIZATION
Example 3: Correlating Utilization and Delays
[Figure: Daily Utilization vs. On-time Departures, January 2013 System Operations. Panels: Narrowbodies by Day of Week; Widebodies by Day of Week. X axis: hours of daily utilization (7.0 to 13.0). Y axis: on-time departure performance (60% to 100%). Correlation coefficient: -0.53. Includes AA, AC, AS, B6, F9, FL, NK, UA, US, VX and WN.]
SOURCE: masFlight (masflight.com)
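A correlation coefficient of this kind is typically computed as a Pearson correlation between the two series. The sketch below shows that calculation in plain Python; whether the chart used exactly this statistic is an assumption, and the function name is hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)
```

A negative result, as on the slide, means that days with higher aircraft utilization tended to have lower on-time departure performance; it does not by itself establish cause.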
UTILIZATION BY HUB
Example 4: Daily Utilization of Gates, by Hub
Big-data analysis of different carriers – daily departures per gate used
[Figure: six bar-chart panels of Average Daily Deps per Gate Used: United Airlines Hubs; Alaska Airlines Hubs; American Hubs; JetBlue Focus; US Airways Hubs; AirTran Hubs. Airports shown: ANC, ATL, BOS, BWI, CLE, CLT, DCA, DEN, DFW, EWR, FLL, GEG, IAD, IAH, JFK, LAX, LGA, LGB, MCO, MIA, MKE, ORD, PDX, PHL, PHX, SAN, SEA, SFO, SJC. Values range from 2.7 to 7.8 daily departures per gate used.]
June 1 through August 31, 2012. Gates with minimum 1x daily use
SOURCE: masFlight (masflight.com)
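The gate-utilization metric can be sketched as follows: for each airport and day, divide the number of departures by the number of distinct gates used that day, then average over days. This is one plausible reading of "daily departures per gate used", not a confirmed definition; the record layout and function name are hypothetical.

```python
from collections import defaultdict

def avg_daily_deps_per_gate(departures):
    """Average daily departures per gate used, by airport.

    `departures` is an iterable of (airport, gate, date) tuples, one per
    departure. A gate counts on a given day only if it had at least one
    departure that day (an assumed reading of "minimum 1x daily use").
    """
    per_day = defaultdict(lambda: defaultdict(int))  # (airport, date) -> gate -> count
    for airport, gate, day in departures:
        per_day[(airport, day)][gate] += 1

    daily_ratios = defaultdict(list)
    for (airport, _day), gates in per_day.items():
        daily_ratios[airport].append(sum(gates.values()) / len(gates))

    return {airport: sum(r) / len(r) for airport, r in daily_ratios.items()}
```

With carrier added to the grouping key, the same calculation would give the per-carrier hub comparison shown in the chart.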
CONCLUSIONS
Conclusions for Big Data in Aviation
• Big data transforms operational and commercial problems that were practically unsolvable using discrete data and on-premises hardware
• Big data offers new insight into existing data by centralizing data acquisition and consolidation in the cloud and mining data sets efficiently
• There is a rich portfolio of information that can feed aviation data analytics
– Flight position, schedules, airport/gate, weather and government data sets offer incredible insight into the underlying causes of aviation inefficiency.
– The excessive size of each set forces analysts to consider cloud-based architectures to store, link and mine the underlying information
– When structured, validated and linked, these data sources become significantly more compelling for applied research than they are individually
• Today’s cloud-based technologies offer a solution
CONCLUSIONS
Conclusions: Our Approach
• masFlight’s data warehouse and analysis methods provide a
valuable example for others attempting cloud-based
analytics of aviation data sets
• masFlight’s hybrid architecture, consolidating secure data feeds in
on-premises server installations and feeding structured data into the
cloud for distribution, addresses the unique format, security and
scale requirements of the industry
• masFlight’s method is well suited for airline performance review,
competitive benchmarking, airport operations and schedule design,
and has demonstrated value in addressing real-world problems in
airline and airport operations as well as government applications