Class14_VizPythonx

Using Python to Retrieve and
Visualize Data (Part 2 of 2)
Jon Goodall
Hydroinformatics
Fall 2014
This work was funded by National Science
Foundation Grants EPS 1135482 and EPS
1208732
Quick Review
• Last class we covered
– How to create a time series plot that reads data directly from an
ODM MySQL database
– How to generalize the script to work for any SiteID, VariableID,
StartDateTime, and EndDateTime
• Today’s Agenda:
– Go over the “cleaned-up” script from last class’s material
– Show how to create a figure with multiple subplots
– Show how to use functions to organize your code
– Show how to use classes to organize your code
– Briefly introduce the Pandas package for data analysis in Python
Solution to Challenge Problems from
Last Class
• In Canvas, download
Class13_InClassDemo_Clean.py for a solution
to the challenge problems presented at the
end of last class’s lecture.
• We will walk through this code in class.
One Figure, Multiple Subplots
• It is often valuable to include multiple subplots on a single
figure.
• You can do this by using the add_subplot method on the
figure object as shown below.
ax = fig.add_subplot(211)
OR
ax = fig.add_subplot(2,1,1)
– First number is number of rows
– Second number is number of columns
– Third number is the specific plot assigned to the axes object ‘ax’
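As a rough illustration of add_subplot, here is a minimal sketch that stacks two plots in one figure. The data are made-up numbers rather than values read from the ODM database, and the variable names and output file name are only placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Made-up example data (not from the ODM database)
hours = list(range(24))
temperature = [10 + 0.5 * h for h in hours]
discharge = [100 - 2 * h for h in hours]

fig = plt.figure(figsize=(8, 6))

# 2 rows, 1 column, first (top) plot
ax1 = fig.add_subplot(2, 1, 1)
ax1.plot(hours, temperature, color="red")
ax1.set_ylabel("Temperature")

# 2 rows, 1 column, second (bottom) plot
ax2 = fig.add_subplot(2, 1, 2)
ax2.plot(hours, discharge, color="blue")
ax2.set_xlabel("Hour")
ax2.set_ylabel("Discharge")

fig.tight_layout()                 # keep labels from overlapping
fig.savefig("two_subplots.png")    # hypothetical output file name
```

Each call to add_subplot returns a separate axes object, so anything you would normally do to 'ax' (plot, set labels, set limits) is done once per subplot.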
Challenge Problem
• Extend Class13_InClassDemo_Clean.py to
have at least two subplots
• Do not use functions (yet) to do this.
• Take ~ 5 minutes to think about this, and then
I will present a solution.
Solution
See Class13_InClassDemo_Clean_Subplots.py in Canvas
Example Codes on Canvas
• I’ve put some example codes on Canvas.
Please download these and we will walk
through them in class.
– Class14_InClassDemoFunctions.py = An example
of creating and using functions
– Class14_InClassDemoClass.py = An example of
creating and using a class
– Class14_InClassDemoPandas.py = An example of
using the Pandas library
Using Functions to Reduce
Repetitive Code
• Syntax for a function in Python:
def MyFunction(Arg1, Arg2):
    A = Arg1 + Arg2
    return A
• This function takes two arguments, adds
them, then returns the sum.
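The same pattern can wrap a step you would otherwise repeat for each SiteID and VariableID. The sketch below is a hypothetical helper, not code from the class demo files: build_query is an invented name, and the table and column names (DataValues, LocalDateTime, DataValue) are assumed from the ODM schema and may need adjusting for your database.

```python
from datetime import datetime


def build_query(SiteID, VariableID, StartDateTime, EndDateTime):
    """Return a parameterized SQL string and the tuple of values to fill it."""
    # Assumed ODM table/column names; change these to match your database.
    sql = (
        "SELECT LocalDateTime, DataValue FROM DataValues "
        "WHERE SiteID = %s AND VariableID = %s "
        "AND LocalDateTime BETWEEN %s AND %s "
        "ORDER BY LocalDateTime"
    )
    params = (SiteID, VariableID, StartDateTime, EndDateTime)
    return sql, params


# Call the function once per time series instead of copying the query text
sql, params = build_query(1, 36, datetime(2014, 10, 1), datetime(2014, 10, 8))
print(sql)
print(params)
```

With PyMySQL, the two returned values would typically be passed together as cursor.execute(sql, params), which lets the driver fill in the %s placeholders safely.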
Example of Creating Subplots
using Functions
• See Class14_InClassDemoFunctions.py for an
example of how to use functions to solve the
challenge problem.
• We will walk through this code in class.
Creating Your Own Class to Further
Simplify your Code
• Syntax for a class in Python:
class TimeSeries():
    def __init__(self, SiteID, VariableID):
        # code to initialize a new object of this class goes here
        pass

    def plot(self):
        # code to create a plot of the time series goes here
        pass
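To make the skeleton concrete, here is one possible way to fill it in. This is a sketch under stated assumptions, not the contents of Class14_InClassDemoClass.py: the optional dates and values arguments are additions for illustration, so the example can run on made-up data instead of querying the database.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt


class TimeSeries:
    def __init__(self, SiteID, VariableID, dates=None, values=None):
        # Store the identifiers so every method can use them later
        self.SiteID = SiteID
        self.VariableID = VariableID
        # Assumption: data are passed in directly here; a real version
        # might instead run a database query inside __init__
        self.dates = list(dates) if dates is not None else []
        self.values = list(values) if values is not None else []

    def plot(self, ax=None):
        # Create a new figure only if the caller did not supply an axes
        if ax is None:
            fig, ax = plt.subplots()
        ax.plot(self.dates, self.values)
        ax.set_title("Site %s, Variable %s" % (self.SiteID, self.VariableID))
        return ax


# Made-up data for illustration
ts = TimeSeries(1, 36, dates=[1, 2, 3], values=[4.0, 5.5, 5.0])
ax = ts.plot()
```

Letting plot accept an existing axes means the same object can draw into any subplot of a larger figure, for example ts.plot(ax=fig.add_subplot(2, 1, 1)).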
Example of Creating your own Class
• See Class14_InClassDemoClass.py for an
example of how to create your own class to
solve the challenge problem.
• We will walk through this code in class.
Calling your Class from another Script
• Now that you have your own user-defined
class, you can import it into a script.
• Example:
from Class14_InClassDemoClass import TimeSeries
ts = TimeSeries(1,36)
ts.plot()
Introduction to the Pandas Package
• Includes classes designed specifically for data analysis
and visualization including
– DataFrame
– Series
• See Class14_InClassDemoPandas.py for an example of
how to plot a time series using the Pandas Library. We
will walk through this code in class.
• I found this blog post to be a very helpful introduction
to some of the time series functionality in Pandas.
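As a small taste of the Pandas time series functionality, the sketch below builds a DataFrame from made-up rows shaped like a database result, computes daily means, and plots both. It is not the class demo script; the column names LocalDateTime and DataValue are assumed from the ODM schema, and the output file name is a placeholder.

```python
from datetime import datetime, timedelta

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows shaped like a query result: (LocalDateTime, DataValue),
# one value every 30 minutes for two days
start = datetime(2014, 10, 1)
rows = [(start + timedelta(minutes=30 * i), 10 + 0.1 * (i % 48)) for i in range(96)]

# Put the rows in a DataFrame indexed by time
df = pd.DataFrame(rows, columns=["LocalDateTime", "DataValue"])
df = df.set_index("LocalDateTime")

# Aggregate the 30-minute values to daily means in one line
daily = df["DataValue"].resample("D").mean()

# Plot the raw and aggregated series in separate subplots
fig, (ax1, ax2) = plt.subplots(2, 1)
df["DataValue"].plot(ax=ax1, title="30-minute values")
daily.plot(ax=ax2, style="o-", title="Daily mean")
fig.tight_layout()
fig.savefig("pandas_timeseries.png")  # hypothetical output file name
```

Because the index holds datetimes, Pandas formats the time axis and handles the daily grouping without any explicit loops.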
Resulting Image from Example
Pandas Script
Take-home message: Creating a similar figure using matplotlib alone is
possible, but it would take many more lines of code.
Assignment 5
• Now you get to try this on your own.
• Build from the examples provided in class
(and/or others you find online) to create your
own ‘publication ready figure’ from the data
stored in LBRODM
• Exactly what plot you create is up to you.
• Details for the assignment are provided on
Canvas
• The assignment will be due October 16.
Summary
• I hope I convinced you that
– Reproducibility matters and it should be a goal of your data
analysis and visualization steps
– Using PyMySQL and matplotlib, it is possible to automate data
visualizations with a script
– There are many ways to organize your code and functions
• Messy code is quick to write, but it may come back to haunt you if it is
part of a larger project
– The Pandas library provides classes that make it simpler to handle
data analysis and visualization processes.
• I hope that you now have some basic concepts that you can
build from using online documentation and examples
– Like all things, practice makes perfect.
Resources
• matplotlib gallery of examples
– http://matplotlib.org/gallery.html
• Pandas compared to SQL
– http://pandas.pydata.org/pandas-docs/stable/comparison_with_sql.html
• Pandas compared to R
– http://pandas.pydata.org/pandas-docs/stable/comparison_with_r.html