Scaling Hailo in the cloud
@davegardnerisme
Cloud Expo Europe
January 2013
Hailo is the taxi app. Use Hailo to get a black cab wherever you are, whenever you want.
• The world’s highest-rated taxi app - over 7,000 five-star reviews
• Over 300,000 registered passengers
• A Hailo hail is accepted around the world every 5 seconds
• Hailo is growing (30%+) every month
• Became the largest taxi network in all of Ireland within two months of launch
“I come to use clouds, not to build them...”
Adrian Cockcroft
http://bit.ly/WM4g2Z
Hailo runs on AWS. AWS allows us to grow rapidly.
• 2 regions
• 6 availability zones
• We use some AWS services: ELB, Route53 DNS, EBS
[Diagram: DC 1 to DC 6]
Going global…
[Diagram: Route53 latency-based DNS in front of two regions, us-east-1 and eu-west-1. Each region has an ELB and an API layer, with web services and backing services (C*, ZK..) behind them.]
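The idea behind latency-based DNS can be sketched in a few lines: given a latency estimate per region for a client, answer with the endpoint of the lowest-latency region. This is only an illustration of the selection rule, not how Route53 is implemented or configured. The hostnames, the function names and the latency figures are all hypothetical.

```python
# Hypothetical endpoints for the two regions named on the slide.
REGION_ENDPOINTS = {
    "us-east-1": "api-us-east-1.example.com",
    "eu-west-1": "api-eu-west-1.example.com",
}


def pick_region(latency_ms):
    """Return the known region with the lowest estimated latency.

    latency_ms maps region name -> estimated round-trip time in
    milliseconds for one client (assumed to be measured elsewhere).
    """
    candidates = {
        region: ms for region, ms in latency_ms.items()
        if region in REGION_ENDPOINTS
    }
    if not candidates:
        raise LookupError("no latency data for any known region")
    return min(candidates, key=candidates.get)


def resolve(latency_ms):
    """Return the endpoint a latency-based resolver would answer with."""
    return REGION_ENDPOINTS[pick_region(latency_ms)]
```

A client that sees roughly 35 ms to eu-west-1 and 120 ms to us-east-1 would be sent to the eu-west-1 endpoint, where that region's ELB takes over.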
We favour technologies that are:
• distributed
• resilient
• operationally simple
• Cassandra for distributed primary storage
• ZooKeeper for distributed locking
• Acunu Analytics for distributed real-time analytics
• NSQ for distributed queuing
Our C* (Cassandra) usage:
• Two global clusters spanning 2 regions and 6 AZs
• Use cases include customer records, job history and more
Our AA (Acunu Analytics) usage:
• Main event stream is location updates from drivers (500/sec)
• Can aggregate in real time to answer questions such as “how many drivers were active in the last 10 minutes?”
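To make the "active in the last 10 minutes" question concrete, here is a minimal single-process sketch of a sliding-window distinct count over a stream of location updates. It is not the Acunu Analytics API and says nothing about how that product works internally. The class name, the method names and the assumption that timestamps arrive in non-decreasing order are all illustrative.

```python
from collections import deque


class ActiveDriverWindow:
    """Counts distinct drivers seen within a trailing time window."""

    def __init__(self, window_seconds=600):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, driver_id), oldest first
        self.counts = {}       # driver_id -> events still inside the window

    def record(self, driver_id, timestamp):
        """Ingest one location update (timestamps assumed non-decreasing)."""
        self.events.append((timestamp, driver_id))
        self.counts[driver_id] = self.counts.get(driver_id, 0) + 1
        self._expire(timestamp)

    def active(self, now):
        """Return how many distinct drivers reported within the window."""
        self._expire(now)
        return len(self.counts)

    def _expire(self, now):
        # Drop events that are window_seconds old or older.
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, driver_id = self.events.popleft()
            self.counts[driver_id] -= 1
            if self.counts[driver_id] == 0:
                del self.counts[driver_id]
```

Because expired events are dropped as new ones arrive, memory stays proportional to the number of updates inside the window. At 500 updates per second that is about 300,000 events for a 10-minute window.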
The key point is to keep services stateless and use specific tools for specific jobs: storage, search, analytics, coordination (makes it easier to scale)
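As a rough illustration of what "stateless" means here, the sketch below keeps all data in an external store and lets the service object hold nothing between calls, so any instance can serve any request. The store is an in-memory stand-in for something like Cassandra, and the class and method names are hypothetical rather than taken from Hailo's code.

```python
class InMemoryStore:
    """Stand-in for an external store (hypothetical interface)."""

    def __init__(self):
        self._rows = {}

    def get(self, key, default=None):
        return self._rows.get(key, default)

    def put(self, key, value):
        self._rows[key] = value


class JobHistoryService:
    """Keeps no data of its own; every read and write goes to the store."""

    def __init__(self, store):
        self.store = store

    def record_job(self, customer_id, job):
        # Read-modify-write for brevity; a real store would need an
        # append operation or some coordination to be safe under concurrency.
        history = list(self.store.get(customer_id, []))
        history.append(job)
        self.store.put(customer_id, history)

    def job_history(self, customer_id):
        return list(self.store.get(customer_id, []))
```

Two instances that share the store are interchangeable: a job recorded through one is visible through the other, which is what allows more instances to be added behind a load balancer without migrating any state.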
Next steps
• Expansion, expansion, expansion
• NYC, Tokyo up next including third DC in Asia