Normalizing Database Files

Download Report

Transcript Normalizing Database Files

Normalizing Database Files
Professor Ralph Westfall
May, 2011
Concepts Covered

Using SQL Server Management Studio to
take unnormalized data in a file and splitting
it into two normalized files


For example, repeated data in a field that has a
one to many or many to many relationship with
the other fields in the table
Doing the same thing with Excel
Create Non-Redundant Table
CREATE TABLE currenciesCKey (
Currency VARCHAR(15))
INSERT INTO currencies (Currency)
SELECT DISTINCT Currency from nations

Word DISTINCT gets rid of duplicated entries
in the normalized new table
Finish Creating New Table

Click Execute button


Click in Object Explorer window, then click
Refresh icon and expand Tables node for this
database to see the new table
May need to right-click Tables>Refresh for the
database used to see table that was just created
Add Normalized New Table Key

Right-click this new table name>Design






Right-click top field>Insert Column
Change Data Type to int (or something else
appropriate)
Right-click this new column>Set Primary Key
Set (Is Identity) to Yes in Column Properties
Set Identity Seed to a value large enough so all
current and expected rows will have same
number of digits in their key
Click diskette icon to save this new table
Create Table with Foreign Key


Define a new table that has a foreign key to
reference the data in the second table
Do a join that includes all fields from both
tables EXCEPT for the field that went into the
separate normalized table


the key field from the second table gets added to
the new table but the data the key field
references doesn't go into it
Results of this join fill the new table with data
Create Table with Foreign Key
CREATE TABLE [table] ([field [datatype], [field]
[datatype], etc.)
INSERT INTO [table] ([field], [field],
[field],[table].[field], etc.)
WHERE [table].[field]=[other_table].[field]
 Will have all fields (excluding duplicates) in
the new table



Will need to qualify fields with same names in
both tables by name of the table they come from
Execute, then Refresh to see the new table
See sample code in Notes
Verify Data

Use SQL Server Management Studio to
Select Top 1000 Rows to see data in new
table


visually check foreign keys against values in
other table
Or run a SQL query to join and view the data
Backup

Backup data from the original table





Right-click database name>Tasks>Export
Data>Next
Data source is SQL Server Native Client
10.0>Use Windows Authentication>Database is
the one that was being used>Next
Create a new .txt file on desktop>Destination is
Flat File Destination>Browse for that file>Format:
Delimited>Next
Copy data from one or more etc.>Next
Source table or view: [original table]>Next>Next>
Finish
Finish Backup and Delete

Verify that the new table has the same
number of rows as the original table



SELECT COUNT([key_field]) FROM
[original_table] and then repeat with new table
Delete the original table
Detach the database
Excel



Could easily normalize small and middle
sized data sets in Excel
See Filter for unique values or remove
duplicate values for how to get unique values
for second table
Copy the column to be normalized into a new
table



add sequence of numbers in column to the left
put captions over each of these two columns
follow instructions in the link above
Excel - 2

Add an extra column to the main table



create a VLOOKUP column on the second table
you just created that converts the data in the unnormalized column in the main table to the key
value
use Paste Special to replace the actual with key
values
save both tables as .csv files to load into a
database