What will be covered
Download
Report
Transcript What will be covered
Advanced Excel
for Data Analysis
Yale University
StatLab
What will be covered
Following up on Intro to Excel Workshop
Importing and exporting data
Data management
cleaning and manipulation
Statistics and data analysis
Pivot tables
Graphing and charts
Writing custom Macros/VBA
More time for specific questions at the end
Quick Review of Some Basics
We’re using Excel 2007
Excel 2010 looks different again
(improvement)
Recent Excel improvements/changes
More rows (not 65536 anymore)
Note new file suffixes: .xlsx and .xlsm
Not backwards compatible
-save as .xls to re-open in older versions
-free download allows older versions to open
More levels of sorting and filtering
Better charting
When and Why Excel
It’s a spreadsheet
Great for calculations and formulas
Okay for Data Management
Everybody has Excel (including Macs)
No VBA or SharePoint access for Macs
That’s how we got the Data
But that’s what most people use it for
Should probably be using Access
Ubiquitous file format widely imported and
exported
We’ve been using it our whole lives, right?
Most of us actually only scratch the surface.
Importing And Exporting
Important for Importing and Exporting:
Define Variable Types (string, date, int)
Open
“All Files” (*.*)
Text to Column, Delimited (basic)
Let’s see an example
http://statlab.stat.yale.edu/help/workshops/introData/dataintro
StatTransfer
Supports Excel from/to anything
Does batch file conversion
Embedding in Word/Powerpoint
Keeps data linked so if you change data you don’t have to reimport to document
Working with Data
Treat Data Like Data, not a bunch of cells
Defining Data Ranges
Name your Sheets
Name your data ranges
Conditional formatting and Data Validation
Quick way to determine whether your data looks good
Set ranges so that if a value falls outside the allowed range,
the cell formatting changes
Create a unique ID row if it doesn’t exist
Allows you to restore order and merge
Filtering, Sorting and Finding Data
Most common question: I need to find unique
values or a subset of my data.
Adding a conditional dummy variable is easy
Sorting and Filtering – make sure you highlight the
entire dataset!
Remember to copy and paste as values
Lookups
VLOOKUP and HLOOKUP
Used with named ranges
Text Manipulation
If cell A1 = “abcde ”
TRIM(A7)=abcde
RIGHT (A7,2)=de
and LEFT, MID
SUBSTITUTE (A7, “c”, “o”)=abode
Comparing Cells (especially in Macros):
gets rid of extra spaces
If cell A1 = “12345” (a string, not a #)
VALUE(A1)=12345
Useful for building syntax, repetitive text for other
programs
Statistics and Data Analysis
Many basic summary functions
CONFIDENCE for confidence intervals
Probability distributions
NORMDIST, NORMINV, etc.
Array functions
AVERAGE, STDEV, CORREL, etc.
LINEST, FORECAST
Data Analysis Tools
Data tab -> Data Analysis
If not installed, upper left circle -> Add-Ins
Linear regression, Chi-square
Pivot Tables and Pivot Charts
Easy snapshot views of your data.
Great for transforming time series data into
tables
Pivot charts for Large Data
Be aware: Default calculation is “COUNT” and
you probably want “SUM”
Click-and-drag interface makes this easy to use,
but requires manual refresh anytime your data
changes
Graphing and Charts
Dependent on column/row as x,y
You may need to copy, paste special/transpose to get the
chart type you want
Charts for publishing
Black and white usually preferred
Patterns not gradients
Dots Per Inch (DPI)
Charts should depend on what is being conveyed
Bar graph (measurements)
Scatter plots (correlation, multivariate)
No Pie Charts
More Advanced Charts
Trend lines
Error Bars
Stock (High-Low-Close)
Changing the data to series
Removing/Hiding lines
Excel wants defined variable names, but it can be tricked
Excellent tips for advanced graphing online
http://peltiertech.com/
Writing Your Own Macros
Be careful: You can’t UNDO running a macro.
Use the record function to get the idea and then
customize for your needs.
Basic structure is Object Oriented
Object.property()
The built-in Editor will show options
Supports If, then, else, for loops , while loops
Be careful: You can’t UNDO running a macro.
More advanced
Financial Calculation functions
Present value, IRR, amortization
Database connections
What-if analysis
Forms (with Data Validation)
Password protection
Resources
Lots of Yale resources available
StatLab Consultants
Sherlock Campbell
Themba Flowers
Orbis.yale.edu
We have access to several online resources
Excel Hacks (O’Reilly)
YouTube. (Oh really?)
Internet Searches
Bing might actually be better for Office
Thank you
Questions?