Transcript IT 163

IT 456-01
System Administration
Lesson 2
Dr Jeffrey A Robinson
Overview of Course
Week 1 – Introduction
Week 2 – Installation of SQL and management Tools
Week 3 - Creating and Using a Database
Week 4 - Using the SQL Server Agent Service
Week 5 - Disaster Recovery
Week 6 - SQL Server Security
Week 7 - Performance Monitoring
Week 8 - Performance Optimization
Week 9 - High Availability
Unit 2
Chapter 1 introduces SQL Server 2008,
listing the components, services, and
editions of SQL Server 2008. It also
describes how to plan and perform a SQL
Server 2008 installation. Hardware and
software prerequisites, server and security
considerations, and the installation
process are described
SQL Server major components
The five SQL Server 2008 editions available for use in
production environment are
–
–
–
–
–
SQL Server 2008 Enterprise edition
SQL Server 2008 Standard edition
SQL Server 2008 Mobile edition
SQL Server 2008 Workgroup edition
SQL Server 2008 Web edition
SQL Server major components
Instructions in DocShare folder
– IT456_-_Installing_a_Default_Instance_of_SQL_Server_20058
SQL Server 2008
SQL Server Management Studio (SSMS)
SQL Server Business Intelligence Development Studio !!!
Databases (AdventureWorks Northwnd Pubs NW
Traders)
Services
– Analysis Services
– Notification Services
– Integration Services (need XP Pro or Vista Ultimate)
NOT Home or Business versions
Books online !!!
DM databases and Excel DM extentions
Unit 2 assignment
200–300 word - Describe the requirements
of MS SQL Server 2008 and the process
you must follow prior to the installation of
MS SQL Server 2008. (Hint - Before
installing MS SQL Server 2008 an
installation plan should be created…. )
Unit 2 assignment (cont)
Project – Install a default instance of SQL
Server 2008.
– a. For directions, see Chapter 1.
– b. Create the NWTraders Database using
instructions in the file in Doc Sharing called
"Create NWTraders Database
– c. Run the scripts from the file in DocSharing
called “Create NWTraders Objects and
Data.”
Unit 3 assignment
Write a 200–300 word response to each of
the following topics.
1. 10 points. Define five options used
when creating a database and identify the
importance of each option.
2. 10 points. Define the options for the
recovery method available when creating
a database. Identify the strengths and
weaknesses of each option.
Unit 3 assignment (cont)
For this Unit's Project you will:
– Develop a stored procedure within the
NWTraders database to evaluate
characteristics of a customer and his/her
purchases or employee and the purchases
he/she have been involved in the
procurement. Examples include the
evaluation of purchases by customer by
region, zip code with items or evaluation of
employees by title, birthdate, region or
zipcode.
Next week
Week 3 - SQL Server Configuration
and Internals
Important stuff on TLOGs
Additional backup Slides
Excel - Add ins
To enable the full resources of Excel you need to be
sure that the appropriate Add-in are enables in excel
In Office 2007 this is done by clicking on the
Office icon in the upper left of the Excel window
Then select the Excel Options
button on the bottom of the
Popup menu
Click n Add-Ins on the left
Navigation bar and OK on the
Next pop-up menu
Excel – Add-ins
You can then select which addins to select or enable. Make
sure you enable the Analysis
ToolPak and the Solver Add-in
in particular.
Depending on additional
packages or applications you
have purchased, there may be
additional add-ins that you can
enable and use.
(Some add-ins require you to
restart Excel or even reboot
depending on your version and
operating system)
Excel Statistical Add-ins
Once installed these new utilities will show up on the
Excel toolbar ribbon under the Data Tab
Third party add-ins may show up as new tabs on the
toolbar ribbon or be listed under the Add-In tab
Excel Add-Ins
Once installed a much wider suite of statistical
tools will be enabled in Excel, including:
–
–
–
Anova: Single Factor, Two-Factor with or without Replication
Correlation and Covariance
Descriptive Statistics
Summary statistics Select if you want Excel to produce one field for each of the following statistics in
the output table: Mean, Standard Error (of the mean), Median, Mode, Standard Deviation, Variance,
Kurtosis, Skewness, Range, Minimum, Maximum, Sum, Count, Largest (#), Smallest (#), and
Confidence Level.
–
–
–
–
–
–
–
–
–
–
–
–
Exponential Smoothing
F-Test Two-Sample for Variances
Fourier Analysis
Histogram
Moving Average
Random Number Generation
Rank and Percentile
Regression (for linear fits and normal probability plots)
Sampling
t-Test: Paired Two Sample For Means
t-Test: Two-Sample Assuming Equal Variances
t-Test: Two-Sample Assuming Unequal Variances
Excel 2007 changes
Excel has undergone a number of changes in the 2007 Office version.
While we have discussed a few, here is a partial list of some more
important changes.
Excel Analyze tools Add-in
The Excel Add-in comes with a number of excellent analysis features.
Coupled with the Analysis Services add-in Excel becomes a powerful
entry level data analysis tool.
Additionally, under the ANALYZE tab, there are features such as:
- Analyze Key Influencers
- Detect categories
- Fill from Example
- Forecast
- Highlight Exceptions
- and Scenario Analysis
Excel Analyze
Once you are connected, the following
features will be enabled in the ANALYZE
tab of your Excel workbook
Excel Analyze Add-in
The Analyze Key Influencers tool lets you select a column that contains a
desired outcome or target value, and then analyze the patterns in a data set
to determine which factors had the strongest influence on the outcome. For
example, if you have a customer list that includes a column that shows the
total purchases for each customer over the past year, you could analyze the
table to determine the customer demographics for your top purchasers.
The tool also lets you select a pair of possible outcomes and compare
them. If the column that you are predicting contains discrete values such as
"Purchase = Yes", "Purchase = No", or "Purchase = Yes but not now", you
can compare the consumers who purchase immediately against those who
purchase later, and those who purchase later against those who never
purchase.
This helps determine possible decision-making factors.
Excel Analyze Add-in
The Detect Categories tool automatically
finds rows in a table that have similar
characteristics.
When the tool finishes, it creates a report
that lists the categories it found, together
with their distinguishing characteristics. By
default it adds a new column to the data
table that contains the proposed category.
You can then review the categories and
rename them.
Excel DM Add-in
The features of the Excel DM add-in suite include:
- Explore data
- Partition Data
- Clean data
- Estimate
- Cluster
- Associate
- Forecast
We will examine some of these in more detail in labs following this
lecture.
Excel DM Add-in
Explore data
The Explore Data wizard lets you preview your
data, and gather statistical information that is
useful in cleanup or planning the data modeling
phase
Partition data
The Partition Data wizard is a fast, easy way to
separate data into training and testing sets. It
includes sampling options such as random
sampling and oversampling.
Excel DM Add-in
Clean data
There are two Clean Data wizards, the Outliers wizard and the
Relabel wizard. These wizards help simplify data cleaning prior to
data mining.
Another tool that is useful in data cleaning is the Fill From Example
tool, in the Table Analysis Tools for Excel. This tool detects missing
values and lets you either remove them or replace them with a
mean, null, or other value.
The Remove Outliers wizard graphs the distribution of values in
your data and lets you remove extreme values.
The Relabel Data wizard lets you work with the data values to make
them easier to read and understand. For example, you can replace
numeric values with text labels, or assign a group label to
continuous data ranges.
Excel DM Add-in
Classify
The Classify Data wizard builds a classification model that predicts
the values of one column based on values in other columns in the
model. It uses the Microsoft Decision Trees algorithm, which is a
classification and regression algorithm used for predictive modeling
of both discrete and continuous attributes.
Estimate
The Estimate wizard builds an estimation model that extracts data
patterns and uses the patterns to predict continuous numeric, date,
or time values. It uses the Microsoft Decision Trees algorithm.
Cluster
The Cluster wizard builds a clustering model that detects groups of
rows that share similar characteristics.
Excel DM Add-in
Associate
The Associate wizard builds an association model that
detects associations between items that appear in
multiple transactions: for example, in market basket
analysis.
Forecast
The Forecast wizard builds a forecasting model that
detects patterns in a series of cells, and then forecasts
additional values.
Advanced
Create mining Model
Add Model to Structure
Accuracy and Validation
Accuracy Chart
The Accuracy Chart wizard helps you
create a prediction query and assess the
performance of a data mining model by
creating a lift chart or scatter plot chart.
The lift chart is important because it helps
distinguish between models in a structure
that are almost the same, to help you
determine which model provides the best
predictions.
Accuracy and Validation
Classification Matrix
The Classification Matrix wizard helps you create a prediction
query to assess the performance of a classification model.
The output is a chart that summarizes both accurate and
inaccurate predictions made by the model. The matrix is a
valuable tool because it not only shows how frequently the
model correctly predicted a value, but also shows which
values the model most frequently predicted incorrectly.
Profit Chart
The Profit Chart wizard helps you determine the costs and
benefits of using a data mining model. It assesses the
performance of a prediction, but also incorporates the cost of
taking action based on the prediction, and estimates the
return on investment
Enabling the Excel DM Suite
First install the tool bar (the same way the Stats Add-in was
installed)
Then enable the add-in by clicking START > ALL PROGRAMS >
SQL Server 2005 DM Addins > Sever Configuration Utility
Enable Temporary Session mining Models (these are objects that
are created by SQL Server and Excel and are deleted after use.
Allow users to create new databases (unless you want to used an
existing SQL Server database; finally enable permissions for users
to create, modify and delete the objects in this DB. (The default DB
name is DMAddinsDB)
Excel DM Suite
Setting up Data Analysis tables
Once SQL server is configured to enable Excel DM add-ins, open
your Excel workbook.
Select a worksheet and highlight the data you want to convert to an
Excel table (as shown on the previous slides on Excel tables; HOME
> STYLES > FORMAT AS TABLE)
Next, connect to the server and database that you previously
enabled. (On the ribbon: ANALYZE > CONNECTION > Add New
Connection) Enter the server name and the local database that you
specified when running the configuration utility. The connection to
the SQL Server Analysis Services Utilities will be displayed in the
Excel Ribbon
Excel DM - Addins
However, with DM features enabled, you also get the
following new tools, as well. (Under the Data Mining Tab)
Notice: there are now many tools are your disposal.
However, you must first indicate the data you want to
analyze. The method for selecting data is the same for
all the DM tools, but I will use the classify tool as an
example.
Excel DM - Addins
Click on the classify data icon to access the Classify data Wizard.
The defaults are the Excel tables in your workbook or you and enter
a range of data
within a table.
You can, however
link directly to
existing SQL
Server databases
to access tables,
columns and rows
of data in existing
relational
databases
Excel DM - Addins
To link to an analysis services data source you need to
enter the Data Source name, Server name and catalog
name.
Once entered, you will
be able to access all
of the data in that
data source by simply
clicking on the field
or columns that you
want to import to
Excel.
Excel DM - Addins
The Data Source Editor will automatically generate an
SQL query to pull the specified data from the SQL
Server database for use.
You can accept the default
SQL Query or edit it
manually, if you wish.
Once loaded you can use the
various DM tools, but these
tools will only work if the
data has already been
imported to excel.
Excel DM – Classification Tree
By simply selecting the default table in the sample database provided with
the DM Addins (DMAddins_SampleData.xlsx) Source Data tab, the
classification tool completes the creation of a classification tree (shown
here)
Excel DM – Explore Data
•
•
•
•
•
•
•
Another useful initial tool in the DM suite is EXPLORE DATA. Again
selecting data you get immediate results.
You can automatically
Generate distribution
histograms, with
a variable number of
buckets for numeric data
and a sorted Pareto chart
for non-numeric data.
•
•
•
•
•
This is orders of magnitude
easier than pivot tables
and histogram generation
using the Analysis add-in
tool