- Jonathan Adler

Math problems in Industry
Dr. Jonathan D Adler
http://jadler.info
Questions about math jobs
• What kind of jobs are actually out there for math
students?
• Anything besides teaching or accounting?
• What jobs are actually enjoyable?
• What is the work like in these jobs?
• What kind of skills do I need to get one of these jobs?
• Who should I be asking these questions to?
2
Allow me to answer your questions
By telling you my life story
3
College: Worcester Polytechnic Institute
4
About me
• I became a math major because I liked taking math
classes
• When I was an undergraduate in math I had no idea
what kinds of jobs I could get. I figured the breakdown
was:
1. High school math teacher
2. College math professor
3. Industry???
• As someone who liked taking math classes, (2)
seemed like the natural choice for me.
• Decided to get a BS/MS so that I could then get a PhD
• Maybe more useful for industry I guess?
• Did my master’s thesis on graph theory and monadic
second-order logic
5
Maybe being a professor wasn’t right for me
• Had summer research and internships, and didn’t like
doing research
• If being a math professor means spending a lot of time
trying to find exact solutions and coming up empty,
maybe it wasn’t right for me.
• My internship was fun! I solved problems, and wrote
code people could use! Maybe industry would be right
for me.
• I had a few more internships:
• Did a project with the Bose Corporation numerically
modeling screw insertion into plastic
• Worked at Boeing making mathematical forecasts of how
aircraft demand would change in the near future
• It felt like math
6
Finished school, time to find a job
• When searching for my first job, there were zero non-teaching jobs labelled “mathematician”
• Lots of jobs would take math majors
• “If they’re hiring math majors, that must mean I’ll be
doing math! Let’s see how it goes!”
• Ended up getting a job at a company called Vistaprint
7
First real job: Vistaprint
8
Vistaprint
• Vistaprint sells custom printed business cards online
• Referred by a friend (hiring math majors!)
• Job description: maintain the statistical models used to
forecast the business
9
Vistaprint problem 1: customer segmentation
• Vistaprint had millions of customers
• Some customers spent a little money on holiday greeting
cards
• Some customers spent hundreds of dollars on materials
to promote their small business
• Some customers paid Vistaprint to host their small
business website
• Every day, Vistaprint sends an email with a coupon
• What message should you put in the email?
• How much should the coupon save?
• The right answer can dramatically increase Vistaprint’s
revenue
10
Vistaprint problem 1: customer segmentation
• Idea: use statistics to split customers into clusters by
their attributes
• Analyze the customers in each cluster to see what type of
customers they are
• Send a different email to each cluster
• Run an A/B test to see what the right coupon value is
• Try sending two coupons, one for $5 off and one for $10 off,
and see which does better
• If the $10 coupon does well enough to offset the increased
cost, use it again in the future
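The $5 versus $10 comparison can be sketched in a few lines of Python. This is a minimal illustration rather than Vistaprint's actual method: the `Arm` class, the `net_revenue_per_email` helper, and every number below are hypothetical, and the sketch assumes the coupon's cost is simply its face value subtracted from each order.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    """One arm of the coupon test. All numbers used below are hypothetical."""
    coupon_value: float     # dollars off
    emails_sent: int
    orders: int
    avg_order_value: float  # dollars, before the coupon is applied


def net_revenue_per_email(arm: Arm) -> float:
    """Revenue per email after subtracting the coupon from each order."""
    net_per_order = arm.avg_order_value - arm.coupon_value
    return arm.orders * net_per_order / arm.emails_sent


five = Arm(coupon_value=5.0, emails_sent=10_000, orders=200, avg_order_value=40.0)
ten = Arm(coupon_value=10.0, emails_sent=10_000, orders=260, avg_order_value=40.0)

# The $10 coupon wins only if the extra orders offset the larger discount.
better = max((five, ten), key=net_revenue_per_email)
print(f"$5 arm:  {net_revenue_per_email(five):.2f} per email")
print(f"$10 arm: {net_revenue_per_email(ten):.2f} per email")
print(f"Use the ${better.coupon_value:.0f} coupon")
```

A real test would also check that the difference between the two arms is larger than what random noise alone would produce before acting on it.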
11
Vistaprint problem 2: the general analytics problem
• The vast majority of my work was answering questions like this:
“Do customers who buy business cards tend to buy holiday cards as well?”
• Seems like an obvious question, but how do you turn it into
something you can answer with numbers?
• What does “buying business cards” mean? Look at the past year’s
data and see if they have made any purchases that included
business cards
• What does “tend to buy” mean? See if they put a holiday card in
the order with business cards, or maybe ordered within the next
month
• What should we compare this to? Customers who hadn’t ordered
any business cards maybe?
• We had data on every customer, order, and product we sold
• Look in the data for every order over the past 12 months, and
count the orders that included business cards and the orders that
included both business cards and holiday cards. Dividing the second
count by the first gives the percent of business card orders that
also had holiday cards
• Really want to answer the question they didn’t ask:
Can we get more orders for holiday cards by marketing to the people who
bought business cards?
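The counting described on this slide can be sketched as follows. The order data, the product names, and the `holiday_rate` helper are all made up for illustration. The comparison group (orders with no business cards) is the one the slide suggests as a possibility.

```python
# Hypothetical order data: each order is (customer_id, set of product types).
orders = [
    ("c1", {"business_cards"}),
    ("c1", {"business_cards", "holiday_cards"}),
    ("c2", {"holiday_cards"}),
    ("c3", {"business_cards", "holiday_cards"}),
    ("c4", {"flyers"}),
    ("c5", {"business_cards"}),
    ("c6", {"flyers"}),
    ("c7", {"mugs"}),
]


def holiday_rate(order_list):
    """Fraction of the given orders that include holiday cards."""
    if not order_list:
        return 0.0
    hits = sum(1 for _, products in order_list if "holiday_cards" in products)
    return hits / len(order_list)


with_biz = [o for o in orders if "business_cards" in o[1]]
without_biz = [o for o in orders if "business_cards" not in o[1]]

rate_with = holiday_rate(with_biz)        # holiday cards among business card orders
rate_without = holiday_rate(without_biz)  # baseline: orders with no business cards

print(f"With business cards:    {rate_with:.0%}")
print(f"Without business cards: {rate_without:.0%}")
```

In practice the same counts would more likely come from a SQL query over the order tables; the logic is the same either way.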
12
Difference between “Math” and “mathy”
[Figure: a ladder of kinds of work, running from “Math” work at the top toward “mathy” work further down]
• Proving theorems
• Developing new methods
• Finding new applications for existing methods
• Using existing methods in conventional ways
• Manipulating data in interesting ways
• Manipulating data in Excel
• Manipulating charts in PowerPoint
• Pressing the button that needs to be pressed
• Office work
• Going to meetings
Labels on the figure:
• Every math undergrad thinks they will end up here
• Vistaprint job (as hired)
• Interesting jobs for math grad students fall here
• Interesting jobs for math undergrads fall here
• Most jobs for math undergrads fall here
13
Difference between “Math” and “mathy”
[Figure: the ladder from “Math” work at the top toward “mathy” work further down, with job titles beside the rungs]
Job titles
• Proving theorems; developing new methods: Researcher, Scientist
• Finding new applications for existing methods; using existing methods in conventional ways: Machine learning expert / Senior data scientist / Statistician / Operations researcher / Advanced analytics expert
• Manipulating data in interesting ways; manipulating data in Excel; manipulating charts in PowerPoint: Analytics / Data Science Analyst
• Pressing the button that needs to be pressed; office work; going to meetings: Business analyst
14
Difference between “Math” and “mathy”
• Nothing “wrong” with being lower on the ladder
• Just because it isn’t “Math” doesn’t mean it isn’t
interesting and a good brain workout
• Way more jobs lower on the ladder
• Too high up and your stuff doesn’t get used
• Thinking all the time is exhausting
• You can move up the ladder
• Find a new area to do something mathematical
• Improve a current process with clever tricks
• Automate a process that is boring
15
Vistaprint problem 3: forecasting and quality control
• Problem:
• Vistaprint had data on revenue for every order for the past few
years
• On a recent Tuesday there was a bug which went undetected and
lowered sales dramatically
• A director emailed my department and asked for the history of
sales over all Tuesdays. He was going to take the average, and
if a Tuesday came in below it, conclude there was a bug
• Options:
a) Run a query on the database to get the data and email it to
him
b) Point out it’s a much more complicated and interesting
problem
• Sales are increasing over time
• Only interested in sudden drops
• How do you correctly determine how low is “too low”?
• What if you want to detect it more quickly than that?
16
Vistaprint problem 3: forecasting and quality control
• I was loud and complained we were doing things the wrong way, and
ultimately was put in charge of a team to create a system of
statistical quality control tools to detect anomalies
• Normalize the data:
• By day of week
• By time of year
• By whether it is a holiday
• Compare to the most recent few hours for sudden drops, and to the
previous week for longer-term decay. Use the mean and standard
deviation to detect when a drop is sufficiently large
[Figure: daily sales volume in millions of dollars, vertical axis from 0 to 3]
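The mean-and-standard-deviation check can be sketched like this. It is a simplified illustration, not the tool described on the slide: the `is_sudden_drop` function, the threshold of 3 standard deviations, and the sales figures are all assumptions. Normalization is handled crudely here by comparing a Tuesday only against other recent Tuesdays.

```python
from statistics import mean, stdev


def is_sudden_drop(history, today, threshold=3.0):
    """Flag today's sales if they fall more than `threshold` standard
    deviations below the mean of comparable past days.

    `history` should hold comparable days only (for example the same
    weekday over recent weeks), so weekday effects and long-run growth
    do not drown out the signal.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return today < mu
    z = (today - mu) / sigma
    return z < -threshold


# Hypothetical sales for the last six Tuesdays, in millions of dollars.
recent_tuesdays = [2.0, 2.1, 1.9, 2.2, 2.0, 2.1]

print(is_sudden_drop(recent_tuesdays, 1.2))   # far below the recent Tuesdays
print(is_sudden_drop(recent_tuesdays, 1.95))  # within normal variation
```

Running the same check on hourly data instead of daily totals is one way to detect a drop sooner, at the cost of noisier numbers.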
17
Skills you should get before leaving school
• To be decent in industry you need to know
• Basic databases (intro database course)
• Everyone stores their data in SQL (or equivalent)
• Since industry mathematics is all based on data, you need to be able to
manipulate it
• Basic programming (intro programming course)
• Don’t have to be a CS wizard, but do need to be able to understand
loops, functions, and the simple stuff
• Language isn’t a big deal (R, Python, and MATLAB all okay), but need to be
able to do more than just run built in functions
• Basic statistics (intro statistics course)
• Everything has uncertainty
• Need to be able to understand how the uncertainty of the data will affect
your results
• Unfortunately assumptions from class never hold, so get ready to roll with
it
18
Skills you should get before leaving school
• To be excellent in industry you need
• Everything from the previous list
• The ability to learn more things
• Projects and internships are potentially great ways to
learn how to learn
19
Vistaprint (continued)
• By the time I left Vistaprint I was:
• Building new models for forecasting sales volumes using
much more advanced methodologies
• If you have the previous three years of sales volumes, how
can you predict what next year’s will be? By day?
• Leading a team to build a statistical quality control tool
for sales metrics
• Dabbling in a recommendation engine to decide what
products to show to users of the site
• Given that you know a lot about the person visiting the
site, how can you decide what products to recommend to
them?
20
Next real job: Boeing
21
Boeing
• Boeing sells airplanes
• Worked in the market forecasting group: helped predict
number of airplanes the world would need in 20 years
• Example problem: suppose you have historic data on
when aircraft were built and when they were scrapped;
how can you predict when airplanes that are currently
flying will be scrapped? Answer is affected by things like
economic downturns
22
Boeing problem 1: airplane analysis
• This is a statistical analysis: given an airplane whose
model and age you know, what is the probability
it will be scrapped in a particular year?
• Model and age are the independent variables; the probability
of being scrapped is what is predicted
• We ultimately used a logistic regression
Model  Tail Number  Age  Usage      Year  Status
727    N12345       35   Freight    2015  Flying
727    N76543       37   Freight    2015  Parked
737    N12486       29   Freight    2015  Scrapped
737    N74824       15   Passenger  2015  Flying
737    N62349       12   Passenger  2015  Flying
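To make the logistic regression idea concrete, here is a small sketch fitted by plain gradient descent. It is an illustration only: the training rows are invented, the `fit_logistic` helper is hypothetical, and the model uses age alone, whereas the analysis described above also used the airplane model as a variable.

```python
import math

# Hypothetical training rows: (age in years, 1 if scrapped that year else 0).
rows = [(5, 0), (10, 0), (12, 0), (15, 0), (20, 0),
        (25, 1), (29, 1), (30, 0), (35, 1), (37, 1)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, steps=5000, lr=0.1):
    """Fit p(scrapped) = sigmoid(b0 + b1 * scaled_age) by gradient descent
    on the average log-loss. Returns a function mapping age to probability."""
    ages = [a for a, _ in rows]
    mu = sum(ages) / len(ages)
    sd = (sum((a - mu) ** 2 for a in ages) / len(ages)) ** 0.5
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for age, y in rows:
            x = (age - mu) / sd              # scale age so the steps are stable
            err = sigmoid(b0 + b1 * x) - y   # gradient of the log-loss
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / len(rows)
        b1 -= lr * g1 / len(rows)
    return lambda age: sigmoid(b0 + b1 * (age - mu) / sd)


predict = fit_logistic(rows)
for age in (12, 29, 37):
    print(f"age {age}: estimated scrap probability {predict(age):.2f}")
```

In practice a statistics package would do the fitting and report uncertainty on the coefficients; the hand-written loop is only there to show what the model is.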
23
Boeing problem 2: passengers per seat
• To know how many airplanes will be in the world in 20
years, it helps to know how many people will be riding in
each plane
• Load factor is the passengers per seat (ex: 60 passengers
and 100 seats = 60%)
• How do you predict future load factors from historic data?
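The load factor arithmetic from the example, written as a tiny function (the `load_factor` name is just for illustration):

```python
def load_factor(passengers: int, seats: int) -> float:
    """Passengers per seat, as a fraction between 0 and 1 for a normal flight."""
    if seats <= 0:
        raise ValueError("seats must be positive")
    return passengers / seats


# The example from the slide: 60 passengers on a 100-seat airplane.
print(f"{load_factor(60, 100):.0%}")
```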
24
Boeing
• While the job had some math, ultimately I fell lower on the
ladder
• “I know what will help get me higher up that ladder!
More school!”
• Maybe I’ll realize I wanted to be a professor after all
25
Grad school 2: Arizona State University
26
Arizona State University
• Went to get a PhD in Industrial Engineering
• Liked the operations research part of math, but that
wasn’t in the math department at ASU
• Conveniently also taught me statistics and data mining
• Researched routing policies for electric vehicles (graph
theory meets optimization meets transportation)
• Being an academic is:
• An extremely difficult job to get in the first place
• Not ultimately rewarding to me
• Academia: publish papers that move the field forward
slightly
• Industry: make models and tools that can help your one
particular company a lot
27
Sample problem: Brownie text analytics
• A local brownie company had a website where people
could order brownies online
• Some people sent them to friends and family
• Businesses sent them to clients and customers
• Brownie company wanted to know how to split business
orders and consumer orders
• Different customers would get different catalogs
• Gift messages on the brownie boxes held the key: use
text analytics to decide the purpose of the order
• Fancy way of just counting the business words and the
consumer words
THANK YOU ALL FOR YOUR AND HARD WORK, IT IS TRULY APPRECIATED BY
THE MANAGEMENT TEAM
CONGRATULATIONS AND BEST OF LUCK ON YOUR NEW JOB! WE ARE VERY
PROUD OF YOU! LOVE MOM
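Counting business words against consumer words can be sketched in a few lines. The two word lists and the `classify_message` helper are hypothetical, chosen only to make the example run; a real project would presumably build the lists from orders whose type was already known.

```python
import re

# Hypothetical word lists, for illustration only.
BUSINESS_WORDS = {"team", "management", "work", "client", "business", "appreciated"}
CONSUMER_WORDS = {"love", "mom", "dad", "birthday", "proud", "congratulations"}


def classify_message(message: str) -> str:
    """Label a gift message by whichever word list it matches more often."""
    words = re.findall(r"[a-z']+", message.lower())
    business = sum(w in BUSINESS_WORDS for w in words)
    consumer = sum(w in CONSUMER_WORDS for w in words)
    if business > consumer:
        return "business"
    if consumer > business:
        return "consumer"
    return "unknown"


msg1 = ("THANK YOU ALL FOR YOUR AND HARD WORK, IT IS TRULY APPRECIATED BY "
        "THE MANAGEMENT TEAM")
msg2 = ("CONGRATULATIONS AND BEST OF LUCK ON YOUR NEW JOB! WE ARE VERY "
        "PROUD OF YOU! LOVE MOM")

print(classify_message(msg1))  # business
print(classify_message(msg2))  # consumer
```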
28
Next job: Promontory Growth and Innovation
29
Promontory Growth and Innovation
• PGI was a “management consulting” company, in that we
helped companies increase profitability
• I was a lead on the team that took client data and tried
to figure out what’s going on
• Ended up using a grab bag of mathematical techniques
• Data mining
• Bayesian statistics
• Linear programming
• Graph theory
• Pretty high up the “mathy” ladder yet still got to deliver
real value
• During the job I accidentally became a
• Software developer
• Project manager
30
Promontory Growth and Innovation problem: staffing
• A hospital has hourly staff, who work in different
departments and jobs (sometimes multiple)
• The staff is taking a lot of overtime hours
• How much of that overtime is necessary?
• A matching problem:
• The departments and jobs have a demand that needs to
be filled
• The staff has a supply
• There is a bipartite graph between those two sets, with
an edge wherever an employee can do that job
• A linear program can be solved to figure out how much
of the demand can be served before needing overtime
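The supply-and-demand matching can be sketched without an LP solver. The code below swaps in a maximum-flow computation (Edmonds-Karp), which answers the same question for this bipartite structure; it is not the linear program used on the project. The hours, demands, and who-can-work-where edges are all invented for illustration.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity map."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual[u][v] += c
    total = 0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in list(residual[u].items()):
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical weekly numbers, in hours.
regular_hours = {"Joseph": 40, "Susan": 40, "Mary": 40}
demand = {"ER": 60, "OR": 20, "Pharmacy": 10, "Recovery": 20}
can_work = {"Joseph": ["ER", "OR"], "Susan": ["OR", "Pharmacy"],
            "Mary": ["Pharmacy", "Recovery"]}

capacity = defaultdict(dict)
for person, hours in regular_hours.items():
    capacity["source"][person] = hours
    for dept in can_work[person]:
        capacity[person][dept] = hours
for dept, hours in demand.items():
    capacity[dept]["sink"] = hours

covered = max_flow(capacity, "source", "sink")
overtime_needed = sum(demand.values()) - covered
print(f"Covered by regular hours: {covered}, overtime needed: {overtime_needed}")
```

In this made-up case total regular hours (120) exceed total demand (110), yet some overtime is still unavoidable because only one person can cover the ER. That is the kind of result the matching structure exposes and a simple total would miss.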
31
Promontory Growth and Innovation problem: staffing
[Figure: bipartite graph connecting staff (Joseph, Susan, Mary) to departments (ER, OR, Pharmacy, Recovery)]
32
Grad school only sort of helps
• Graduate school can teach you how to learn on your
own
• Essential for jobs higher up the mathy ladder
• Grad school may teach you some of the three key skills
(statistics, databases, programming)
• An MS or PhD may open doors for some jobs higher up
on the mathy ladder
• A doctorate without experience is an even harder hire
• Has a huge opportunity cost if you’re not careful
33
Today: Microsoft Studios
34
Microsoft Studios
• Microsoft makes lots of first-party games: Halo,
Minecraft, Gears of War, etc.
• Each game creates data (player kills and deaths, times
logged in, in-game purchases)
• That data can be used to:
• Prevent cheating
• Forecast sales of future games
• Improve the game itself
• Data from one game can be shared with others for a
better overall picture
• I am a member of the (just starting) shared analytics
team across all titles
35
Microsoft Studios problem: cheat detection
• Sometimes people cheat when they play games. How do
you detect the cheaters?
• Look for anomalies in the data:
• People who have lap times in a racing game that are faster
than physically possible
• People who have scores 3 standard deviations above the
mean
• People who submit multiple scores at once
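The "3 standard deviations above the mean" rule can be sketched as below. The `flag_outlier_scores` function, the player names, and the scores are hypothetical, and this is only one simple way to apply the rule.

```python
from statistics import mean, stdev


def flag_outlier_scores(scores, threshold=3.0):
    """Return the players whose score is more than `threshold` standard
    deviations above the mean of all submitted scores."""
    values = list(scores.values())
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [player for player, s in scores.items() if (s - mu) / sigma > threshold]


# Hypothetical leaderboard: twenty ordinary scores and one suspicious one.
scores = {f"player{i}": 100 + i for i in range(20)}
scores["suspect"] = 1000

print(flag_outlier_scores(scores))
```

One caveat: an extreme score inflates the standard deviation it is measured against, so with only a handful of players no score can reach 3 standard deviations. The rule needs a reasonably large pool of scores to work.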
36
Questions (revisited)
• What kind of jobs are actually out there for math students?
• Teaching jobs – high school teachers and professors at small liberal arts colleges
• Industry jobs – helping companies by using math on their data
• Research jobs – doing academic research to move the field
forward
• What is the work like in these jobs?
• Lots of crunching numbers and using math to answer questions
• What kind of skills do I need to get one of these jobs?
• Stats, programming, databases, and some focused math classes
• Who should I be asking these questions to?
• People who have been down this road before. Reach out to people
like me either directly or through your local university
37
Questions?
(For a copy of this presentation go to: http://jadler.info)
38