INFO296A2_2016_Spring_Lecture_1_Intro_v3.0x


Thought Leaders in Data Science and Analytics
- Big Data Analytics
Ram Akella
[email protected]
Cell 650-279-3078
Skype ID: ramakella1
621C SDH (Sutardja Dai Hall)
http://courses.ischool.berkeley.edu/i296a-dsa/s16/
https://piazza.com/berkeley/spring2016/info296a/home
(Under development; see http://courses.ischool.berkeley.edu/i296a-dsa/s15/ for now)
Monday, January 25, 2016
Introduction
• Technology cycles
– Analytics
– Architecture/Infrastructure
– Interaction
• A possible cut
– Enterprise Analytics:
• Enterprise databases (DB) and Business Intelligence (BI)
– Web Analytics
• Leading to Hadoop, Spark/Shark, Streaming + Analytics
– Internet of Things
• Continuous sensing and proactive response
• What is new and different about it?
Today’s Seminar(s)
• Computational Advertising: Dr. Jimi Shanahan
– Web Analytics: NativeX/Adobe, plus more
• Entrepreneurial Data Analytics: Dr. Sudhakar Muddu
– Developing Data Analytics Products and/or firms
– IoT, Security, Big Data
– Splunk, plus more
Next Week's Seminar(s)
• Web Analytics and Social Media: Eric Rasmussen
– Groupon
• IoT and Enterprise Analytics: Jose Abelenda
– Lyft
- Challenge problems follow
Data Science and Analytics: What is it?
• Components
– Data collection, storage, and basic processing
• Architecture and Infrastructure
– Analytics
– Domain
– Business Needs
• To solve real Big Data problems, need expertise in some or all of these areas
• Need to form teams!
Seminar and Course Structure
• Set of broad ranging industry talks
– Provide perspectives on
• Domain and business needs
• Infrastructure needs
• Analytics needs
• State of the art
• Basic
– 2 units: Class participation + 2 team reports on seminar themes (due 3/14 and 5/2)
• Advanced (subject to review of CVs and approval)
– 2 or 3 units: Potential projects, ideas, mentoring (including industry executives, VCs) and possible data (Reports due 2/8, 3/7, 4/11, 5/2)
– Either
i) Startup product development
OR
ii) Industry problem solution
Projects and Participation
• These are key to learning
• Forming teams is critical
• Need Analytics, infrastructure, business, domain
• We can help
– Faculty
– Staff
– Other students
– Industry executives, managers, researchers and personnel
– VCs
• Requires submission of CV, proposal
Background
• Basic
– An Introduction to Statistical Learning (James, Witten, Hastie, Tibshirani)
– R or equivalent
– Data Mining, linear algebra, statistics, or equivalent
– Additional (specialized): Field Experiments (Gerber, Green)
– Background courses on next slide
• Advanced: To discuss
– Coursera courses, edX courses
– Campus courses
Background Courses
• Big Data Analytics Background Resources
• http://www-bcf.usc.edu/~gareth/ISL/
• https://work.caltech.edu/telecourse.html
• http://www.stat.berkeley.edu/~mjwain/Fall2012_Stat241a/
• http://datascienc.es/
• http://courses.ischool.berkeley.edu/i290-dma/s12/
• https://blogs.ischool.berkeley.edu/i290-abdts12/author/hearst/
• http://www.cs.berkeley.edu/~jordan/courses/294-fall09/
• http://alex.smola.org/teaching/berkeley2012/
• http://www.cs.berkeley.edu/~jordan/courses/281Aspring14/
Action
• Sign up sheet
• Set up teams
• Provide CVs
• Start determining data sets and projects
• Meeting times, including Skype (beyond class times)
• Set up boot camp times for Infrastructure and Machine Learning/Data Mining
• Use Piazza!
Meeting Times/Office hours
• Mon: 11-1 pm (backup: 4/5-5/6 pm), by appointment
• Tue/Th: By appointment
• Skype/tel, in addition to in-person meetings
Course Expectations
• This is about addressing the unstructured real world and Silicon Valley
• NOT a structured course with an organized, linear flow
• You are expected to already know or learn data mining and machine learning
– Bootcamp for those who need assistance
• Seminars to provide industry context
– Again, thematic, but no evident linear flow structure (executive schedules!)
• Industry and VC mentors for
– Entrepreneurial project on data analytics product development