What is “Data Science”?
How Can Engineering Take Data
Sciences from Ideas to Action
By Adam Stallard
What is Data Science
The Data Scientists
Scientists vs. Engineers
A Real Life Example – The Idea
Roles
Into Action
Demo
What is “Data Science”?
big data
business analytics
mathematics
programming
probability models
data engineering
machine learning
visualization
statistics
data warehousing
data analytics
insight
The Data Scientists
Skill Overlap
Scientists vs. Engineers
Scientists
e.g. theoretical physics
explore the natural world with theory and practical experiments
discover and expand knowledge
provide information and knowledge for engineers to use
Engineers
e.g. robotic engineering
apply (scientific) knowledge to solve problems
develop and/or improve products or processes
optimisation and efficiency are key (cost, tools, etc.)
Scientists vs. Engineers
“In engineering you do not start a project unless you know the answer, while in science you do not start a project if you know the answer.”
The Idea: A Real Life Example
Idea (aka Business Question)
Does the early opinion of Morpheus match our segmentation of potential buyers?
Role of the Data Scientists
gather requirements from stakeholders
prepare any existing usable data models
natural language processing in R, Python, etc.
liaise with various team members – Data Science Engineers
analyse the data and present the outcome
attend meetings
Role of the Data Science Engineers
gather requirements from Data Scientists
prepare any existing applications/processes or develop new ones
research and decide on the best solution for the task
retrieve, profile, clean and integrate data
automate repetitive tasks – make life easier
liaise with various stakeholders
attend meetings
Data Scientist Requirements
Brief / Requirements:
download user posts/comments from social networks and dedicated gaming websites for analysis
data delivered in CSV format or integrated into the data warehouse
configurable parameters, including but not limited to website and comment filters
mobile – can be used anywhere with the correct authentication, etc.
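The "configurable parameters" requirement could be sketched as follows. This is only an illustration under assumptions: the names `ScrapeParams`, `keep_comment`, `keywords` and `min_length` are hypothetical and do not come from the application that was built.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScrapeParams:
    """Hypothetical parameter set a user might submit (names are assumptions)."""
    website: str                                        # which site to scrape
    keywords: List[str] = field(default_factory=list)   # keep comments mentioning any of these
    min_length: int = 0                                 # drop comments shorter than this


def keep_comment(text: str, params: ScrapeParams) -> bool:
    """Return True if a comment passes the user's filters."""
    if len(text) < params.min_length:
        return False
    if not params.keywords:
        # No keyword filter set: every sufficiently long comment is kept.
        return True
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in params.keywords)
```

For example, `keep_comment("Morpheus looks great", ScrapeParams(website="reddit", keywords=["morpheus"]))` keeps the comment, because the keyword match is case-insensitive.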
Data Science Engineers Into Action
What the DSE will be developing:
- create a website/comment scraper – modular and scalable
- comments logged to a database
- interface allows users to set parameters
- application hosted on web server
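One way the "modular" scraper and the database logging might fit together is sketched below. It is a minimal sketch under stated assumptions: the registry `SCRAPERS`, the `register` decorator, the table layout and the canned `scrape_example` data are all hypothetical, and Python's built-in `sqlite3` stands in for the MySQL database so the example runs without a server.

```python
import sqlite3
from typing import Callable, Dict, Iterable, Tuple

Comment = Tuple[str, str, str]            # (comment_id, author, body)
Scraper = Callable[[str], Iterable[Comment]]

# Registry of per-website scrapers: adding a site means adding one function.
SCRAPERS: Dict[str, Scraper] = {}


def register(site: str) -> Callable[[Scraper], Scraper]:
    """Decorator that files a scraper function under its site name."""
    def decorator(func: Scraper) -> Scraper:
        SCRAPERS[site] = func
        return func
    return decorator


@register("example-site")
def scrape_example(query: str) -> Iterable[Comment]:
    # Placeholder: a real module would fetch and parse pages here.
    return [("c1", "alice", f"First thoughts on {query}"),
            ("c2", "bob", f"{query} looks promising")]


def log_comments(conn: sqlite3.Connection, site: str,
                 comments: Iterable[Comment]) -> int:
    """Store comments, skip duplicates, and return the row count for the site."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS comments ("
        "site TEXT, comment_id TEXT, author TEXT, body TEXT, "
        "PRIMARY KEY (site, comment_id))"
    )
    rows = [(site, cid, author, body) for cid, author, body in comments]
    conn.executemany("INSERT OR IGNORE INTO comments VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    cursor = conn.execute("SELECT COUNT(*) FROM comments WHERE site = ?", (site,))
    return cursor.fetchone()[0]


def run(site: str, query: str, conn: sqlite3.Connection) -> int:
    """Look up the scraper chosen by the user and log its output."""
    return log_comments(conn, site, SCRAPERS[site](query))
```

Keeping each website behind the same function signature means the web interface only has to pass a site name and a query; re-running a scrape does not duplicate rows because of the primary key.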
Data Science Engineers Sprint Into Action
First Sprint
research
develop script for Reddit comment scraping
log Reddit comments to MySQL database
start application via web interface
comment output
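The Reddit scraping step might look roughly like the sketch below. It assumes the comment data has already been downloaded as JSON in Reddit's nested listing layout (comments as `t1` children, with replies as a nested listing or an empty string); the function name `iter_comments` and the `SAMPLE` payload are invented for illustration and are not the script from the sprint.

```python
from typing import Any, Dict, Iterator


def iter_comments(listing: Dict[str, Any]) -> Iterator[Dict[str, str]]:
    """Walk a comment listing depth-first and yield flat comment records."""
    for child in listing.get("data", {}).get("children", []):
        if child.get("kind") != "t1":
            # Skip non-comment entries such as "more" placeholders.
            continue
        data = child["data"]
        yield {
            "id": data["id"],
            "author": data.get("author", ""),
            "body": data.get("body", ""),
        }
        replies = data.get("replies")
        if isinstance(replies, dict):
            # Replies are themselves a listing; an empty string means none.
            yield from iter_comments(replies)


# Made-up payload in the assumed layout, used only to exercise the function.
SAMPLE = {
    "kind": "Listing",
    "data": {"children": [
        {"kind": "t1", "data": {
            "id": "a1", "author": "alice", "body": "Morpheus looks fun",
            "replies": {"kind": "Listing", "data": {"children": [
                {"kind": "t1", "data": {"id": "a2", "author": "bob",
                                        "body": "Agreed", "replies": ""}},
                {"kind": "more", "data": {"children": ["zz"]}},
            ]}},
        }},
        {"kind": "t1", "data": {"id": "b1", "author": "carol",
                                "body": "Not convinced", "replies": ""}},
    ]},
}

if __name__ == "__main__":
    for comment in iter_comments(SAMPLE):
        print(comment["id"], comment["author"], comment["body"])
```

Flattening the tree into simple records makes the next sprint item straightforward, since each record maps onto one database row.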
Data Science Engineers – Tools of the Trade
What Was Built - Application Demo
bmxgamer.coeus.feralhosting.com/flask/
Just for Fun
Sprint Review
Your thoughts?
What would you improve?
Tasks for the next sprint?
Thank you for Listening
Any Questions?