Transcript Lecture 2

Architecture of Database Management
System (SQL Server)
The most important mechanism of SQL server:
 High level of abstractions (users acts on logical
structures (e.g. tables, views) rather than physical
structures (e.g. database files).
 Dialect of SQL called Transact-SQL (T-SQL is a very
powerful query/programming language).
 Constraints (they allow to ensure correctness and
integrity of the data).
 High performance (e.g. efficient query processing).
 Scalability (capability to work in both small and large
 High availability (downtime is reduced to minimum in
case of failure).
 Transaction management and concurrency control.
 Security (secure authentication and authorization
methods, different auditing mechanisms).
 Effective backup and restore strategies.
The database server must have installed the proper version of
Windows Server.
There are several SQL Server editions:
 Datacenter,
 Enterprise,
 Standard,
 Developer,
 Express.
The first three editions are typical production editions of SQL
The Datacenter edition is the most powerful in terms of
scalability, virtualization and consolidation.
The Enterprise edition allows to use more processors and
consume more memory than the Standard one.
The Developer edition provides the same functionality as
Datacenter. It is dedicated to development and test
environments. The Developer edition can be upgraded for
production use.
SQL Server Express is free edition aimed at developing database
systems in small enterprises.
The architecture of SQL Server
Relational engine consists of, among others, three main
 Query parser – it takes the source code of a query/program
as input, performs analysis and returns its syntactic tree.
 Query optimizer – it analyses query executions plans and
chooses the one which is calculated to be cheapest. SQL
Server query optimizer support many optimization methods
including indices, query rewriting and caching query plans.
 Query evaluator – it executes the query and returns its
Storage engine is responsible for:
 managing database files,
 storage and retrieval of data,
 caching data for reuse,
 controlling concurrency.
SQL OS provides API (application
programming interface) which is used by
all SQL Server components. It includes
services responsible for memory
management, scheduling, deadlock
detection, exception handling, etc.
 It
is possible to install more than one copy of
SQL Server on the same server.
 Each copy is called an instance of SQL Server.
Each instance contains the set of specific
 There are two instance types: default
instance (called MSSQLSERVER) and named
instance (its name is defined by a user during
 Only one default instance can exist on the
database server.
The most common SQL Server services are:
 SQL Server service (it is known as the
database engine),
 SQL Server Agent - it lets to automate some
administrative tasks and monitor the
instance and its databases (this service is not
available in the Express Edition of SQL
The SQL Server service is named
MSSQL$instance_name for a named instance
(for the default instance it is named MSSQL).
An instance of SQL Server can involve also other
services which provide additional functionality:
 Integration Services – it provides mechanisms to
develop Extract, Transform, Load (ETL) processes.
 Analysis Services - it supports online analytical
processing (OLAP) and data mining.
 SQL Full-Text Filter Daemon Launcher – it is used by
the full-text search mechanism which allows to
formulate full-text queries addressing semistructured data.
 Reporting Services – it lets to create reports and
manage them.
 SQL Server Agent – it lets to automate some
administrative tasks.
 SQL Server Browser – this service is responsible for
listening incoming requests. It provides information
about instances and their ports to these requests.
Server Configuration Manager (SSCM) is
an application which allows to configure the
instance services and network protocols used
by SQL Server.
 It can be noticed that some services (i.e. SQL
Server Agent, SQL Server,) are always
associated with the proper instance (they are
followed by the instance name) and some
components (i.e. SQL Server Browser) are
not instance-aware (they can be shared by
all instances of SQL Server which are
installed on the same database server).
SQL Server Management Studio (SSMS) is an application
which lets to administer of all SQL Server components.
SSMS enables, among others:
 configuring the instance properties (e.g. security,
connections, memory, processors, etc.),
 creating and managing the instance objects (e.g.
logins, linked servers, triggers, etc.),
 creating and managing the system and user-defined
 developing various types of scripts (e.g. Transact SQL,
XML, MDX scripts, etc.),
 analysing query plans and statistics,
 configuring and managing the instance services (e.g.
SQL Server Agent, Analysis Services, Reporting
The following elements are necessary to
connect to the SQL Server Database Engine:
 Network protocols on the server and user
hosts (these protocols must be configured
and activated).
 SQL Server Native client on user computers
(it supports two main database APIs: OLE DB
(Object Linking and Embedding for
Databases) and ODBC (Open Database
 The Database Engine must listen on at lest
one network protocol.
SQL Server supports the following network
 Shared Memory – it lets users connect to an
instance of SQL Server if the connection is
established on the host on which the instance is
 TCP/IP - it lets users connect to an instance of
SQL Server through the network. They must
provide the host name and instance name.
 Named Pipes – it lets to network access to an
instance of SQL Server. The connection can be
established on the basis of one of the following
protocols: NetBEUI, TCP/IP, and IPX/SPX.
 VIA (Virtual Interface Architecture) – it is
dedicated to System Area Networks.
SQL Server applies a Microsoft communication
format called TDS (tabular data stream) to
communicate with user hosts. During the
installation of SQL Server a special object
called TDS Endpoint is created for each
network protocol. The endpoints have the
following names:
 TSQL Local Machine (for the Shared Memory
 TSQL Named Pipes,
 TSQL Default TCP/IP,
 TSQL Default VIA.
Users must have the proper permissions on a
given endpoint to connect to the SQL Server
The following figure summarizes the main
components required for connections:
 Logon
to the Windows sever on which the
instance of SQL Server is running.
 Launch SSMS.
 Provide server name (e.g. .\SQLExpress – it
allows to connect to the named instance
SQLExpress which resides on the local
 Select the authentication mode.
 Provide user name and password.
Database files
An instance of SQL Server contains the system and userdefined databases. Each database consists of at least two
 primary database file – it stores data and has the
extension *.mdf,
 transaction log file – it lets to recover the database in
the case of failure.
A database can have more than one datafile. Additional data
files are called secondary data files and have the extension
*.ndf. Applying multiple data files results in the following
 better performance (readings are much faster if the files
are located on different disks),
 using more cores (in the case of one data file only one
core is used because only one I/O thread is created),
 easier management (e.g. it is possible to backup only
some part of a database).
The following figure presents the structure of a
SQL Server database:
primary data
file *.mdf
secondonary data
files *.ndf
log file *.ldf
additional log
files *.ldf
Pages and extents
 A page is the basic unit of data storage.
 It has the size of 8kB in SQL Server.
 All read/write operations are preformed on
the page level.
 Each page posses a unique number and stores
also the number of file to which it belongs.
 A page consists of the header and the area in
which data is stored.
 This region has the size of 8060 bytes.
 SQL Server groups pages into areas called
 They are introduced to facilitate space
Filegroups are containers for databases files.
The concept of filegroups allows to simplify the
following administrative tasks:
 backing up and restoring databases (it is
possible to backup and restore a single
 controlling in which files database objects
(e.g. tables, indexes) are stored.
 Each SQL Server database contains so called
primary file group.
 The primary database file belongs to this group.
 Other data files can be also the member of the
primary group.
 Alternatively users can define additional
filegroups and assign secondary database files to
The following figure presents the structure of a SQL Server database
including filegroups.
The database contains four files: three data files and one log file.
The primary data file demoDB.mdf belongs to the primary filegroup.
The secondary data files demoDB2.ndf and demoDB3.ndf belong to the userdefined filegroup called udfg.
SQL Server provides the following system databases:
 system - it stores the instance metadata and
configuration information. An instance does not start
if the files of master are unavailable.
 model - it is the template for user-defined
 msdb - it stores backup and restore history. SQL
Agent objects (e.g. jobs, alerts, schedules) are also
kept in msdb.
 resource - This is read only database which stores
all the system objects. These objects are presented
in the sys schema which occurs in each database.
 tempdb - it stores temporary objects and
intermediate query results.
User-defined databases
CREATE DATABASE database_name
[ PRIMARY ] [ files_defintion [ ,...n ] ]
[ , FILEGROUP filegroup_name file_defintion [ ,...n ] ]
files_defintion [ ,...n ]
file_defintion :=
(NAME = logical_file_name ,
FILENAME = 'physical_file_name'
[ , SIZE = file_size [ KB | MB | GB | TB ] ]
[ , MAXSIZE = { max_size [ KB | MB | GB | TB ] | UNLIMITED ]
[ , FILEGROWTH = growth_value [ KB | MB | GB | TB | % ] ]
) [ ,...n ]
database_name – it represents the name of a new database.
ON – it lets to explicitly define database files.
PRIMARY – it indicates that the first file presented in files_defintion becomes the primary data file.
LOG ON – it lets to explicitly define log files.
file_definition – it specifies list of files and their properties.
NAME = logical_file_name – it specifies the logical file name.
NAME = 'physical_file_name' – it specifies the physical file name.
SIZE = file_szie – it specifies the size of the file.
MAXSIZE = max_szie – it specifies the maximum size of the file. The file can grow until it reaches max_size.
FILEGROWTH = growth_value – it automatically increases the file when there is no enough space to perform user
transactions. growth_value represents the amount of space which SQL Server adds to the file.