ELECTIVE SEMESTER: BIG DATA
WHO AM I?
• Peter Odenhoven
• [email protected]
• Amsterdam University of Applied Sciences
• Background: Mathematics / Statistics
• Currently teaching:
o Programming: Java, C#, Python, R
o Databases: SQL and NoSQL
o Data warehousing / Business Intelligence
o Big Data
IT’S ALL ABOUT FINDING STUFF …
PROJECT ASSIGNMENTS
1. HvA CHIEF: charging electric cars
2. AFC AJAX: monitoring physics of players
3. CURVE FEVER: toxic behaviour in a multiplayer game
4. Digital Life Centre: sensor data
5. NIKHEF: CERN information use
10 REASONS WHY DATA SCIENTIST IS THE SEXIEST
JOB OF THE 21ST CENTURY (OR NOT)
WHAT YOU SHOULD BRING,
WHAT YOU SHOULD GET
NOT ALL PROBLEMS ARE BIG DATA PROBLEMS
OLD AND NEW FRIENDS: SQL AND NOSQL
“… DO I LOOK LIKE A MAN WITH A PLAN…”
module    description                    ec
Project   Big data applications           6
1         Business Studio                 4
2         Data analysis/mining            4
          Business skills                 1
total                                    15

Project   Big data applications           5
3         Data processing and storage     4
4         Information visualization       4
          Professional skills             2
total                                    15
BUSINESS STUDIO
• Business opportunities
• Data warehousing
• Reporting
• Ethics
• Legal issues
DATA ANALYSIS AND DATA MINING
• Some mathematics, MAPLE
• More theory on algorithms
• A lot of practice:
o RapidMiner
o R
o Python
INFORMATION VISUALIZATION
The best data visualizations are ones that expose something new about the underlying patterns and relationships contained within the data.
Understanding those relationships, and being able to observe them, is key to good decision making.
MORE DETAILS: DATA PROCESSING
• Read and write different data types from different data sources
• Process data (filter, clean, combine, etc.)
• Understand the MapReduce concept
• Read and write data from a distributed file system
• Work with tools such as R and Hadoop
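To make the MapReduce concept concrete, here is a minimal single-machine sketch in plain Python. It is not Hadoop and does not use any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample sentences are made up for illustration. It only shows the three steps of the idea: map each record to key/value pairs, group the pairs by key, and reduce each group to one result.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: collect all emitted values under their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, two "records".
    sample = ["big data is big", "data about data"]
    print(word_count(sample))
```

In a real cluster the map and reduce calls would run in parallel on different machines, and the shuffle step would move data over the network; here everything runs in one process so the flow is easy to follow.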
GROUPING
• Individual preferences 1, 2 and 3
• We decide, based upon …
o Interest
o Ambition
o Background
• Groups are not necessarily bound to classes…