Analytics and Data Science
Understanding the field & setting expectations
ANALYTICS AND DATA SCIENCE
BACKGROUND
Personal: International
Academic: UNT Alumni (Mathematics), Economics & Mathematics
Professional: Academic Research, Hilton, Ansira, Sabre
ANALYTICS & DATA SCIENCE DEFINED
Analytics: Discovery and communication of meaningful patterns in data.
Data Science: The novel application of algorithms and statistical techniques to solve business problems.
Reality: Different meanings at different companies
A relatively new field
The culture of the company determines the nature of work that you do
Most companies are still in the process of defining their analytics strategy
Titles common to the field:
Data Scientist, Analytics Consultant, Statistical Modeler, Risk Analyst, Statistician.
TYPES OF PROBLEMS TYPICALLY ENCOUNTERED
Forecasting
“Predictive Analytics”: Classification
Logistic Regression, SVM, Random Forest, Gradient Boosting
Fraud, Customer Acquisition
Customer Retention/Churn Modeling
Who is likely to leave for a competitor
Recommendation Engines
Netflix Challenge
Customer Choice Modeling
What will people buy
Multinomial Logit Model
Optimization
Market Mix Modeling
Clustering/Market Basket Analysis
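To make the Multinomial Logit Model listed under Customer Choice Modeling concrete, here is a minimal sketch of how it turns utilities into choice probabilities: P(i) = exp(V_i) / sum over j of exp(V_j). The option names and utility values are hypothetical, chosen only for illustration; in practice the utilities would be estimated from data.

```python
import math


def choice_probabilities(utilities):
    """Multinomial logit: P(i) = exp(V_i) / sum_j exp(V_j).

    `utilities` maps each option to its (assumed, already estimated) utility.
    """
    # Subtract the maximum utility before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    max_u = max(utilities.values())
    exp_u = {option: math.exp(v - max_u) for option, v in utilities.items()}
    total = sum(exp_u.values())
    return {option: e / total for option, e in exp_u.items()}


# Hypothetical utilities for three options a customer could pick.
utilities = {"economy": 1.0, "premium": 0.5, "no_purchase": 0.0}
probs = choice_probabilities(utilities)

for option, p in probs.items():
    print(f"{option}: {p:.3f}")
```

The probabilities always sum to 1, and the option with the highest utility gets the largest share.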
DATABASES & BIG DATA
Most companies house their data in relational databases
Oracle, Teradata, IBM DB2, Microsoft SQL Server
SQL queries are used to retrieve data
SQL: a basic entry-level requirement to work in this field
Most tasks require significant amounts of time and energy combining tables and data
Hadoop: an open-source distributed framework for storing and processing large amounts of data
Petabytes
Java-based
MapReduce
Pig, Hive (SQL syntax, Facebook), Impala (SQL syntax), Spark
Spark – UTD offers a Spark course
HTML
JSON
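Since so much of the work is combining tables with SQL, a small sketch of a typical join may help. It uses Python's built-in sqlite3 module with an in-memory database; the table names, columns, and rows are hypothetical and only stand in for real company data.

```python
import sqlite3

# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE bookings (booking_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
INSERT INTO bookings VALUES (10, 1, 120.0), (11, 1, 80.0), (12, 2, 45.5);
""")

# LEFT JOIN keeps customers with no bookings; COALESCE turns their NULL sum into 0.
rows = conn.execute("""
SELECT c.name,
       COUNT(b.booking_id)        AS n_bookings,
       COALESCE(SUM(b.amount), 0) AS total_spend
FROM customers AS c
LEFT JOIN bookings AS b ON b.customer_id = c.customer_id
GROUP BY c.customer_id, c.name
ORDER BY c.customer_id
""").fetchall()
conn.close()

for name, n_bookings, total_spend in rows:
    print(name, n_bookings, total_spend)
```

The same query pattern (join, aggregate, group) carries over to Oracle, Teradata, DB2, and SQL Server, with minor dialect differences.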
PROGRAMMING LANGUAGES
Statistical Programming Languages
R – Open source, easy to learn, unparalleled number of packages and functionality, but has memory limitations.
SAS – Very common in businesses but expensive and losing market share to R; handles large data sets well.
Python – Versatile, reasonable number of packages, R’s biggest competitor.
MATLAB – More common in engineering fields.
General Programming Languages
Java – Not knowing Java has cost me at least 4 jobs.
C/C++ – For writing faster R programs
Scala – Used with Spark; more common among people at the forefront of development
INTERNATIONAL STUDENTS
Search for positions you are overqualified for.
More likely to sponsor you
State your status as soon as possible
Some companies have policies against hiring international students.
myvisajobs.com
See companies that are sponsoring
See salaries for negotiation purposes
Others.
THINGS YOU MUST HAVE UNDER YOUR BELT
SQL
Experience with Large Data Sets
Get exposure
Java
Specialize in something
Linux Experience
Take courses
Free courses at UNT
Very Strong in at least one area (Optimization, Forecasting, Classification)
10k records is not large
SAS/ R
Fundamental Requirement
Learn it.
Multiple projects (at least 3): code a research paper, apply a technique to company data, participate in Kaggle, do an internship.
RECRUITING
Universities
UTD – School of Management/Operations Research
OSU (Oklahoma) – Analytics and Data Mining programs
UNT – Economics
SMU – Statistics
Economics, Mathematics, Statistics, Operations Research, Computer Science, Engineering.
Companies
AT&T, Sabre, Epsilon, Amazon
AnalyticRecruiting.com (lots of phone interviews)
Kforce.com (very promising and takes care of visa issues)
MISCELLANEOUS
Kaggle.com
The Home of Data Science
Companies recruit there and pay winners
Many Kaggle winners manage analytics teams
Compete! Get recognized.
Internships are extremely important
AT&T, Sabre, Epsilon, Amazon, Santander, Capital One in Plano
Companies prefer to hire mathematicians
Never accept the first offer
They always divide by 2
Jumping around vs. staying at one company
Dallas R user group – Network
Meetup.com – Network
INFORMS local chapter
BOOKS
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
The Art of R Programming
The Theory and Practice of Revenue Management
THANK YOU!