What will be covered

Download Report

Transcript What will be covered

Advanced Excel
for Data Analysis
Yale University
StatLab
What will be covered

Following up on Intro to Excel Workshop


Importing and exporting data
Data management






cleaning and manipulation
Statistics and data analysis
Pivot tables
Graphing and charts
Writing custom Macros/VBA
More time for specific questions at the end
Quick Review of Some Basics

We’re using Excel 2007


Excel 2010 looks different again
(improvement)
Recent Excel improvements/changes
More rows (not 65536 anymore)
 Note new file suffixes: .xlsx and .xlsm


Not backwards compatible
-save as .xls to re-open in older versions
-free download allows older versions to open
More levels of sorting and filtering
 Better charting

When and Why Excel

It’s a spreadsheet


Great for calculations and formulas
Okay for Data Management



Everybody has Excel (including Macs)


No VBA or SharePoint access for Macs
That’s how we got the Data


But that’s what most people use it for
Should probably be using Access
Ubiquitous file format widely imported and
exported
We’ve been using it our whole lives, right?

Most of us actually only scratch the surface.
Importing And Exporting


Important for Importing and Exporting:
Define Variable Types (string, date, int)
Open



“All Files” (*.*)
Text to Column, Delimited (basic)
Let’s see an example
http://statlab.stat.yale.edu/help/workshops/introData/dataintro

StatTransfer



Supports Excel from/to anything
Does batch file conversion
Embedding in Word/Powerpoint

Keeps data linked so if you change data you don’t have to reimport to document
Working with Data

Treat Data Like Data, not a bunch of cells

Defining Data Ranges
Name your Sheets
 Name your data ranges


Conditional formatting and Data Validation
Quick way to determine whether your data looks good
 Set ranges so that if a value falls outside the allowed range,
the cell formatting changes


Create a unique ID row if it doesn’t exist

Allows you to restore order and merge
Filtering, Sorting and Finding Data

Most common question: I need to find unique
values or a subset of my data.


Adding a conditional dummy variable is easy


Sorting and Filtering – make sure you highlight the
entire dataset!
Remember to copy and paste as values
Lookups
VLOOKUP and HLOOKUP
 Used with named ranges

Text Manipulation

If cell A1 = “abcde ”

TRIM(A7)=abcde


RIGHT (A7,2)=de



and LEFT, MID
SUBSTITUTE (A7, “c”, “o”)=abode
Comparing Cells (especially in Macros):



gets rid of extra spaces
If cell A1 = “12345” (a string, not a #)
VALUE(A1)=12345
Useful for building syntax, repetitive text for other
programs
Statistics and Data Analysis

Many basic summary functions



CONFIDENCE for confidence intervals
Probability distributions


NORMDIST, NORMINV, etc.
Array functions


AVERAGE, STDEV, CORREL, etc.
LINEST, FORECAST
Data Analysis Tools

Data tab -> Data Analysis


If not installed, upper left circle -> Add-Ins
Linear regression, Chi-square
Pivot Tables and Pivot Charts

Easy snapshot views of your data.


Great for transforming time series data into
tables


Pivot charts for Large Data
Be aware: Default calculation is “COUNT” and
you probably want “SUM”
Click-and-drag interface makes this easy to use,
but requires manual refresh anytime your data
changes
Graphing and Charts

Dependent on column/row as x,y


You may need to copy, paste special/transpose to get the
chart type you want
Charts for publishing

Black and white usually preferred



Patterns not gradients
Dots Per Inch (DPI)
Charts should depend on what is being conveyed



Bar graph (measurements)
Scatter plots (correlation, multivariate)
No Pie Charts
More Advanced Charts

Trend lines




Error Bars
Stock (High-Low-Close)


Changing the data to series
Removing/Hiding lines
Excel wants defined variable names, but it can be tricked
Excellent tips for advanced graphing online

http://peltiertech.com/
Writing Your Own Macros
Be careful: You can’t UNDO running a macro.
 Use the record function to get the idea and then
customize for your needs.
 Basic structure is Object Oriented

Object.property()
 The built-in Editor will show options

Supports If, then, else, for loops , while loops
 Be careful: You can’t UNDO running a macro.

More advanced

Financial Calculation functions





Present value, IRR, amortization
Database connections
What-if analysis
Forms (with Data Validation)
Password protection
Resources

Lots of Yale resources available
StatLab Consultants
 Sherlock Campbell
 Themba Flowers
 Orbis.yale.edu


We have access to several online resources
Excel Hacks (O’Reilly)
 YouTube. (Oh really?)
 Internet Searches


Bing might actually be better for Office
Thank you
Questions?