What is “Data Science”?


How Can Engineering Take Data
Sciences from Ideas to Action
By Adam Stallard
What is Data Science
The Data Scientists
Scientists vs. Engineers
A Real Life Example – The Idea
Roles
Into Action
Demo
What is “Data Science”?

- big data
- business analytics
- mathematics
- programming
- probability models
- data engineering
- machine learning
- visualization
- statistics
- data warehousing
- data analytics
- insight
The Data Scientists
Skill Overlap

- big data
- business analytics
- mathematics
- programming
- probability models
- data engineering
- machine learning
- visualization
- statistics
- data warehousing
- data analytics
- insight
Scientists vs. Engineers
Scientists
- theoretical physics
- explore the natural world with theory and practical experiments
- discover and expand knowledge
- provide information and knowledge for engineers to use
Engineers
- robotic engineering
- apply (scientific) knowledge to solve problems
- development and/or improvement of products or processes
- optimisation and efficiency are key (cost, tools, etc.)
Scientists vs. Engineers
“In engineering you do not start a project unless you know the answer, while in science you do not start a project if you know the answer.”
The Idea: A Real Life Example
Idea (aka Business Question)
Does the early opinion of Morpheus match our segmentation of potential buyers?
Role of the Data Scientists
- gather requirements from stakeholders
- prepare any existing usable data models
- natural language processing in R/Python/etc.
- liaise with various team members – Data Science Engineers
- analysis of data / presentation of outcome
- attend meetings
Role of the Data Science Engineers
- gather requirements from Data Scientists
- prepare any existing applications/processes or develop new ones
- research and decide on the best solution for the task
- retrieve, profile, clean and integrate data
- automation of repetitive tasks – make life easier
- liaise with various stakeholders
- attend meetings
Data Scientist Requirements
Brief / Requirements:
- download user posts/comments from social networks and dedicated gaming websites to analyse
- data in CSV format or integrated into the data warehouse
- set parameters, including but not limited to the website and comment filters
- mobile – can be used anywhere with the correct authentication etc.
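As a rough illustration of the brief above, the sketch below filters a few comments by website and keyword and writes the result as CSV. The field names (`site`, `author`, `body`), the function `export_comments` and the sample rows are hypothetical, chosen only for illustration; they are not taken from the project.

```python
import csv
import io

# Hypothetical sample rows; real data would come from the scraper.
SAMPLE_COMMENTS = [
    {"site": "reddit", "author": "user_a", "body": "Morpheus looks promising"},
    {"site": "reddit", "author": "user_b", "body": "Not sure about the price"},
    {"site": "forum", "author": "user_c", "body": "Morpheus demo was fun"},
]


def export_comments(comments, site=None, keyword=None):
    """Return CSV text for comments matching the optional site and keyword."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["site", "author", "body"], lineterminator="\n"
    )
    writer.writeheader()
    for comment in comments:
        # Parameter 1: restrict to one website, if requested.
        if site is not None and comment["site"] != site:
            continue
        # Parameter 2: case-insensitive keyword filter on the comment text.
        if keyword is not None and keyword.lower() not in comment["body"].lower():
            continue
        writer.writerow(comment)
    return buffer.getvalue()


if __name__ == "__main__":
    print(export_comments(SAMPLE_COMMENTS, site="reddit", keyword="morpheus"))
```

Leaving both parameters as `None` exports everything, which keeps the "including but not limited to" part of the brief open for further filters.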
Data Science Engineers Into Action
What the DSE will be developing:
- create a website/comment scraper – modular and scalable
- comments logged to a database
- interface allows users to set parameters
- application hosted on web server
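One way to read "modular and scalable" is a registry in which each website gets its own scraper function behind a common interface. The sketch below is an assumption about how that could look, not the design actually built: `Comment`, `SCRAPERS`, `register` and `run` are hypothetical names, and the scraper bodies are placeholders that return canned data instead of fetching pages.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Comment:
    site: str
    author: str
    body: str


# Registry mapping a site name to the function that scrapes it.
SCRAPERS: Dict[str, Callable[[str], List[Comment]]] = {}


def register(site: str):
    """Decorator that adds a scraper function to the registry."""
    def decorator(func):
        SCRAPERS[site] = func
        return func
    return decorator


@register("reddit")
def scrape_reddit(query: str) -> List[Comment]:
    # Placeholder: a real module would fetch and parse pages here.
    return [Comment("reddit", "user_a", f"sample comment about {query}")]


@register("forum")
def scrape_forum(query: str) -> List[Comment]:
    # Placeholder for a second, independent site module.
    return [Comment("forum", "user_b", f"forum post mentioning {query}")]


def run(sites: List[str], query: str) -> List[Comment]:
    """Run the selected scrapers in order and collect their comments."""
    results: List[Comment] = []
    for site in sites:
        results.extend(SCRAPERS[site](query))
    return results


if __name__ == "__main__":
    for comment in run(["reddit", "forum"], "Morpheus"):
        print(comment)
```

Adding a new website then means adding one decorated function, and the user-facing interface only needs to pass a list of site names and a query.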
Data Science Engineers Sprint Into Action
First Sprint
- research
- develop script for Reddit comment scraping
- log Reddit comments to MySQL database
- start application via web interface
- comment output
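The scraping and logging steps of the first sprint could be sketched as follows. This is not the script from the talk. It assumes Reddit's JSON output, where comments arrive as nested "Listing" objects whose comment children have kind "t1"; `SAMPLE_LISTING` is a hand-written stand-in for a fetched response, and `sqlite3` is used in place of MySQL so the example runs without a database server. The function names are hypothetical.

```python
import sqlite3

# Hand-written sample shaped like one comment listing from Reddit's JSON output.
SAMPLE_LISTING = {
    "kind": "Listing",
    "data": {"children": [
        {"kind": "t1", "data": {
            "id": "c1", "author": "user_a", "body": "Looks great",
            "replies": {"kind": "Listing", "data": {"children": [
                {"kind": "t1", "data": {
                    "id": "c2", "author": "user_b", "body": "Agreed",
                    "replies": ""}},
            ]}},
        }},
    ]},
}


def flatten_comments(listing):
    """Yield (id, author, body) for every comment, including nested replies."""
    for child in listing["data"]["children"]:
        if child["kind"] != "t1":  # skip non-comment entries such as "more"
            continue
        data = child["data"]
        yield data["id"], data["author"], data["body"]
        replies = data.get("replies")
        if isinstance(replies, dict):  # an empty string means no replies
            yield from flatten_comments(replies)


def log_comments(conn, rows):
    """Insert comments, ignoring ids that are already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS comments "
        "(id TEXT PRIMARY KEY, author TEXT, body TEXT)"
    )
    # SQLite syntax; the MySQL equivalent is INSERT IGNORE.
    conn.executemany(
        "INSERT OR IGNORE INTO comments (id, author, body) VALUES (?, ?, ?)", rows
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    log_comments(conn, flatten_comments(SAMPLE_LISTING))
    print(conn.execute("SELECT id, author, body FROM comments ORDER BY id").fetchall())
```

Keying the table on the comment id means the scraper can be re-run on the same thread without creating duplicate rows.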
Data Science Engineers – Tools of the Trade
What Was Built - Application Demo
bmxgamer.coeus.feralhosting.com/flask/
Just for Fun
Sprint Review
- Your thoughts?
- What would you improve?
- Tasks for the next sprint?
Thank you for Listening

Any Questions?