MS exam 70-229 - SQL Server 2000 Database Design & Implementation Guide
Contents:
Part 1: Introduction to MS SQL Server 2000
Part 2: Using basic T-SQL
Part 3: Database design
Part 4: Accessing and modifying data in SQL database
Part 5: Implementing stored procedures and triggers
Part 6: Working with views
Part 7: Implementing indexes
Part 8: Working with transactions and locks
Part 9: SQL Server Security
Part 10: Performance tuning
Part 11: Miscellaneous points
Preface
This guide is far from complete. Some time ago, back in 2005, I wanted to take the exam. However, I never did.
The guide still needed some work and thus it sat on my table for a while. Given that I could either trash the guide or release it
as is, I decided it was worth giving it away.
I have written this short preparation guide as a way for myself to ease studying for the Microsoft
70-229 exam titled: "SQL Server 2000 Database Design & Implementation".
I provide this guide as is, without any guarantees, explicit or implied, as
to its contents. You may use the information contained herein in your computer career, however
I take no responsibility for any damages you may incur as a result of following this guide.
You may use this document freely and share it with anybody as long as you provide the
whole document in one piece and do not charge any money for it. If you find any mistakes,
please feel free to inform me, Tom Kitta, about them. Legal stuff aside,
let us start.
Guide version 0.1 last updated on 17/10/2005
Part 1: Introduction to MS SQL Server 2000
[1.1] SQL server components and capabilities
- SQL server supports up to 32 processors and 64 GB of RAM in Enterprise edition. Also included are log shipping and clustering.
Developer edition has the same features as Enterprise edition but is license limited. Due to memory addressing on 32-bit processors, RAM above 4 GB needs
to be statically allocated.
- Standard edition supports up to 4 processors and 2GB of RAM.
- Personal edition supports up to 2 processors and can be installed on client operating systems (like Windows XP); however, it is limited to 8
concurrent operations by performance degradation (a workload governor imposes wait times if there are more than 8
concurrent operations in the system). Windows CE edition runs on Pocket PCs.
- Desktop engine (MSDE 2000) doesn't come with any user interface and is limited in database size as well as performance
(same workload governor as Personal edition); best used for embedded applications.
- SQL server 2000 supports SQL-92 standard (entry level).
- SQL server 2000 consists of the following components:
- Relational database engine.
- DTS - Data Transformation Services.
- English query - query the system in plain English, without the need for T-SQL knowledge.
- Analysis services.
- Meta Data services.
- Tools, both GUI and command prompt.
- Online help files and books.
- The following are system databases present in SQL server 2000: "master", "tempdb", "model"
and "msdb". When SQL server replication is turned on, another system database is created, "distribution".
- Collation - controls storage of characters (strings) as well as the rules used when sorting characters (comparison rules). Each
object can have its own collation. Each collation has three properties: sort order for Unicode data, sort order for non-Unicode
data, and the non-Unicode data code page.
- Unicode two-byte bit patterns are defined by the Unicode standard, thus you cannot set your own code page for Unicode data.
- Database pages in SQL Server 2000 are 8 KB each, with the first 96 bytes reserved for the page header - you are left with 8060
bytes for your own data.
- You are limited to 1024 columns per table. SQL server is capable of storing over 2 billion objects per database. Keys and
indexes are limited to 900 bytes in size. No more than 16 columns can make up an index or a key.
- Extents are groups of 8 pages (64 KB) and are the basic allocation unit. For small objects (less than 64 KB) extents can be mixed, with
up to 8 objects per extent.
- The global allocation map (GAM) governs the allocation status of about 64,000 extents (4 GB of data) - a free extent has a value
of 1 in the GAM, a used extent has a value of 0.
- The shared global allocation map (SGAM) is like the GAM but for mixed extents: a value of 1 in the SGAM means the extent is a mixed extent with
free pages, a value of 0 means it is not a mixed extent or it is full.
- Primary data file extension is ".mdf", secondary data file extension is ".ndf", log file extension is ".ldf".
- It is recommended that log files be about 25% of the size of the database.
- Clustered tables have their pages linked in a doubly linked list and organized as a B-tree by the clustered index.
Tables without a clustered index are called heaps.
- Clients send SQL data to the server using the Tabular Data Stream (TDS) protocol. The newest version of TDS is 8, SQL Server 7 used version 7,
and the earliest supported version is 4.2.
- Full-text searching is administered by the MS Search service via full-text catalogs that are stored as separate files which are not
managed by MS SQL Server itself.
- All SQL runs inside a transaction - SQL Server supports autocommit mode, which is on by default, as well as implicit and explicit transactions.
- The data definition language used by SQL Server is an extension of SQL-92 DDL with support for the "CREATE", "ALTER" and "DROP" commands.
- SQL server supports the following APIs:
- ActiveX Data Objects (ADO).
- ODBC, with APIs built over ODBC such as Remote Data Objects (RDO) and Data Access Objects (DAO).
- Embedded SQL for C (ESQL).
- Legacy (pre SQL-92 Standard) DB-library for C.
Part 2: Using basic T-SQL
[2.1] Using query analyzer
- The execution plan tab, trace tab, statistics tab and estimated execution plan tab are not shown by default in Query Analyzer; you need to enable
each tab you want to see by selecting the appropriate item from the Query menu.
- You may use the debugger (single instance only) by selecting a stored procedure in the object browser located in Query Analyzer and
selecting "Debug". It is not advised to use the debugger on production servers due to the object locking the debugger uses.
- isqlw is a command line utility that by default opens query analyzer. It can be used without interactive GUI
environment with input and output files.
- Query analyzer uses ODBC API with SQL-92 compliant drivers set with default settings to connect to SQL server.
- The isql utility is used to interact with the SQL server from the command line. It uses case-sensitive switches: "-S" for server,
"-U" for user, "-P" for password. It uses DB-Library to connect to the server and thus doesn't set any
connection settings by default - you will need to use the "SET" command to set your own connection settings at the OS command
prompt; for example, you may set the editor to notepad with "SET EDITOR=notepad". If no user
name/password/server is specified, environment variables are checked for these, and the current user/password is used via
Windows authentication. There is a limit of 1000 characters per display line. This client works at the SQL Server 6.5 level. Use
"QUIT" or "EXIT" to exit (not case sensitive), use "GO" to execute a query.
- The osql command prompt utility is identical to the "isql" utility except for the way it connects to the SQL
server: osql uses the ODBC driver with default connection settings and thus complies with the SQL-92 standard.
[2.2] Basic T-SQL
- All client applications use T-SQL to communicate with the server. T-SQL is used to send three types of commands: DDL, DCL, DML.
- DDL - data definition language is used for object manipulation and uses keywords "CREATE",
"ALTER" and "DROP".
- DCL - data control language is used to control access to objects and uses keywords "GRANT",
"REVOKE" and "DENY".
- DML - data manipulation language is used to manipulate data within objects (created with DDL) and uses keywords such
as "INSERT", "DELETE", "UPDATE" and "SELECT".
- All objects can have a name called an "identifier"; for some objects an identifier is optional. There are two types
of identifiers, "regular" and "delimited". Delimited identifiers need to be placed in "[" and "]"
(or in double quotation marks, but "SET QUOTED_IDENTIFIER" has to be ON to use quotes) because they do not comply with identifier naming rules (i.e. they may have spaces in them or be
identical to SQL reserved words).
- Variables can be used in batches and scripts and begin with "@". They must be declared with "DECLARE" before use; for example "SET @myData = 14" sets variable
"@myData" to the value 14.
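A minimal sketch of declaring, assigning and reading a variable in a batch (the variable name is arbitrary):

```sql
-- Declare a variable, assign it with SET, then use it in a query.
DECLARE @myData int
SET @myData = 14
SELECT @myData AS myValue
-- SELECT can also assign a value: SELECT @myData = @myData + 1
```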
- SQL Server supports different types of functions:
- Aggregate functions are allowed as expressions only in "SELECT", "COMPUTE" and "HAVING".
- Scalar functions take single value and return single value.
- Rowset functions return objects that can be used as table references in T-SQL statements. Rowset functions may return
different results when they are called multiple times.
- User defined functions can be scalar or table returning in nature. Table returning functions can be either made of a
single select statement that returns a table (inline) or a set of statements that build a table that is returned
(multi-statement).
- User defined data types are expressed in terms of built-in data types. They can be added with the help of "sp_addtype
{type}, {system type}, [null|not null], [owner]" and dropped with "sp_droptype {type}".
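A short sketch of the stored procedures above; the type name "ssn" is an illustrative choice:

```sql
-- Create a user defined type based on varchar(11), not allowing nulls.
EXEC sp_addtype ssn, 'varchar(11)', 'NOT NULL'
-- ...the type can now be used in column definitions in this database...
-- Drop the type again (only possible while no table uses it).
EXEC sp_droptype ssn
```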
- Control of flow keywords:
- "BEGIN - END" pair is used to group together blocks of SQL statements.
- "BREAK" and "CONTINUE" is used with while loops, created with WHILE keyword.
- "IF - ELSE" is used for conditional processing.
- "GOTO" - is used to jump to specified label.
- "RETURN" - terminates query, stored procedure and batch.
- "WAITFOR" - suspends execution of a connection.
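The control of flow keywords above can be sketched together in one batch (values are arbitrary):

```sql
-- WHILE loop with BEGIN-END, IF-ELSE, BREAK and CONTINUE.
DECLARE @counter int
SET @counter = 0
WHILE @counter < 10
BEGIN
    SET @counter = @counter + 1
    IF @counter = 5
        BREAK          -- leave the loop early
    ELSE
        CONTINUE       -- jump to the next iteration
END
PRINT 'Stopped at: ' + CAST(@counter AS varchar(10))
```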
- T-SQL scripts support two types of comments, inline "--" and C-style block "/* */". Note that
block comments cannot span batches - a batch end is signaled by a "GO" statement at the beginning of a line, which is always
honored regardless of comments (it takes effect even inside a comment if it is at the beginning of a line).
- Processing order of T-SQL commands (like simple SELECT): Parse text, build query execution sequence tree, use query optimizer to update
query execution tree, relational engine executes plan - asks (using OLE DB) storage engine for data, relational engine processes
returned data to transform it into requested format.
- A batch is a grouping of T-SQL statements that is sent together to SQL server. Most run-time errors will stop batch
execution once encountered. Batch end is signaled by the "GO" command. Batches are compiled into single execution plan.
Batches are restricted in size to the size of a network packet, default 4 KB, maximum of 64 KB.
Batch restrictions:
- Certain "CREATE" statements (such as CREATE VIEW, CREATE PROCEDURE, CREATE TRIGGER, CREATE RULE and CREATE DEFAULT) must be the only statement in their batch.
- You cannot alter a table and then reference new columns later on in a batch.
- The "EXECUTE" keyword is only needed if the call is not the first statement in a batch. Note that objects executed with
this command use their own execution plan (not part of the batch's execution plan).
- Identifier rules: the first character must be a letter (lower or upper case) or one of "@", "#", "_" - "@" starts a variable and "#" a temporary object;
"@@" is used by some global variables and "##" by global temporary objects. Digits can be used, just not as the first character;
"$" can also be used. Spaces and reserved keywords are not allowed.
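The batch rules above can be sketched against the pubs sample database (the added column name is illustrative):

```sql
-- A column added by ALTER TABLE cannot be referenced in the same batch;
-- GO at the start of a line ends the batch and makes the column visible.
ALTER TABLE authors ADD phone2 char(12) NULL
GO
UPDATE authors SET phone2 = phone   -- legal only after the batch boundary
GO
```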
Part 3: Database design and manipulation
[3.1] Basics of database design and manipulation
- Databases are made of tables (entities) that contain rows (tuples) and columns (attributes) - note the definitions.
- OLTP - online transaction processing. OLAP - online analytical processing (decision support databases) used for
historical warehouses and data mining; denormalized for optimum read performance.
- Normalization should not be taken to extremes - avoid cases where you have queries that need more than four joins.
- Normalization rules for well-designed databases:
- Every table should have a unique row identifier - a primary key. A primary key should be as short as possible, it should
uniquely identify every row and should not change. A system generated primary key is called a surrogate key.
- Only one entity should be stored per table.
- Multiple values should not be stored in the same column or row (i.e. lists) and new columns should not be created to
store similar type of information.
- Columns should avoid allowing null values as they need special processing that takes time.
- Normalization forms (formal forms):
- 1st normal form: there are no duplicate rows in a table, each column is single valued and entries in each column are of same type.
- 2nd normal form: in 1st normal form and all non-key attributes are dependent on the primary key.
- 3rd normal form: in 1st and 2nd normal forms and non-key attributes do not depend on each other (only on the primary key).
This is the form at which most DB normalization stops.
- BCNF (Boyce-Codd) normal form: in 1st, 2nd and 3rd normal forms and every determinant is a candidate
key (a stronger version of 3rd normal form).
- 4th normal form: in 1st, 2nd, 3rd and BCNF normal forms and all dependencies are result of keys.
- 5th normal form: in 1st, 2nd, 3rd, BCNF and 4th normal forms and every join dependency in the entity is a consequence
of the candidate keys of that entity - relationships cannot be decomposed any further.
- When deciding on the column(s) that are going to be used as a primary key you may come across different combinations -
candidate keys. Keys that you don't choose to be your primary key are called alternate keys. (Note that all candidate
keys pass the requirements for a primary key.)
- Relationship types: "one to one" (rare since data could be in just one table), "one to many"
(common) and "many to many" (need junction table).
- System design considerations:
- This is rather obvious, but you need to know what business logic/functions you are going to model in your database. Goals set must be realistic.
- The amount of data the system is going to handle and how that amount will change (most likely grow at a certain rate).
- How the data will be used (tasks done by users) and by whom (how many and which users).
- Identify business rules that will need to be implemented on database end.
- Database design steps:
- Logical design - the step before you start to divide things into entities; interview the client with the goal of finding out what is needed.
- Identify your entities (tables) and the attributes they have (columns).
- Identify all relationships between your entities - record foreign keys. When a table has a relation of 1 (one) write "1" beside it,
when it has a relation of many write "∞" beside it.
- Record all data constraints.
- Using filegroups:
- Transaction log files are not part of a filegroup.
- Each data file is a member of only one filegroup.
- Each filegroup (and thus a file) can only be used by a single database.
- Primary data file contains startup information & data - every database has one primary file. Primary file is in the primary filegroup.
- Secondary data files are used to increase performance or to work around file size limitations (set at 32 TB).
- You may create tables in specific filegroup - like partitions in Oracle.
- All system tables are located in the primary filegroup.
- Initially the primary filegroup is the default filegroup, but you may change that later on to any user defined filegroup.
- It is a good idea to reserve primary filegroup for system objects while user defined filegroup stores other objects.
- Objects that compete for space and/or I/O resources should be placed in different filegroups. For example, tables that are
joined in a query could be placed in separate filegroups. Remember that indexes are objects too and can be placed in
filegroups separate from their tables.
- Transaction log files are best placed on a different disk than filegroups - for increased performance.
- You may back up a single filegroup rather than the whole database. Filegroups can also be checked individually for consistency.
- Primary file group has to be in read-write mode, however, for historical data, secondary file groups can be read-only.
- In order to have point-of-failure recovery ability you must protect your primary filegroup.
- Database creation & management rules:
- Database name follows same naming convention as database objects.
- By default, only members of "sysadmin" and "dbcreator" server roles can create new databases.
- There can be at most 32767 databases created on a single server.
- User who creates a database becomes its owner.
- Newly created database files (there are at least two - a primary data file and a transaction log file) are
"wiped clean" by the server before the copy of the "model" database is made - the space is overwritten with zeros.
- When creating a new database you should always specify a maximum DB size so as to prevent the DB from taking over all disk space
(sort of like each new release of Windows ☺).
- You may create a new database either with Enterprise Manager or by using the "CREATE DATABASE" T-SQL statement.
Note that with the T-SQL option you need to set all DB settings in the DB creation script. In Enterprise Manager you may also use
the "Create Database Wizard".
- You can view detailed information about a database by using enterprise manager, executing
"sp_helpdb {database name}" stored procedure or using function "databasepropertyex" as in this example
"SELECT databasepropertyex('pubs','status')".
- All database changes can be made from enterprise manager. Some changes can be made with "ALTER DATABASE"
other with help of stored procedures like "sp_dboption".
- Databases can be deleted using enterprise manager or with "DROP DATABASE" command.
- RAID 0 (striping) is not recommended for any SQL files (except maybe tempdb). RAID 1 is recommended for transaction log
files due to its speed benefits for sequential I/O.
- When multiple RAID channels are available, combining them into a logical spanned drive creates a drive whose
throughput is the sum of all channels used to create it.
- SQL server supports "raw" partitions. Raw partitions are partitions that have not been formatted with an MS file
system. On a raw partition you can place only a single file. You cannot use some features with raw partitions since they cannot
expand automatically (no autogrow) - you need to expand them manually.
- You may use "DBCC SHRINKDATABASE ({database name})" or "DBCC SHRINKFILE ({logical file name})" to decrease the amount of space taken by the database.
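A minimal sketch of the shrink commands, using the pubs sample database (the logical file name and targets are illustrative):

```sql
-- Shrink the whole database, leaving 10 percent free space in its files.
DBCC SHRINKDATABASE (pubs, 10)
-- Shrink one file of the current database to a 5 MB target;
-- the first argument is the file's logical name, not the OS file name.
DBCC SHRINKFILE (pubs_data, 5)
```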
[3.2] Creating and modifying databases
- You create new databases with the help of "CREATE DATABASE" statement. Only members of "sysadmin" and
"dbcreator" (by default) can create new databases.
- CREATE DATABASE {database name} ON (
- {[PRIMARY] [file specification],[file specification] etc} [, FILEGROUP {filegroup name} [file specification],[file specification] etc] -- can have multiple filegroups but only one primary.
- ) LOG ON (
- [file specification],[file specification] etc)
- [COLLATE {collation name}] -- defaults to the server default collation.
- File specifications consist of:
- (NAME = logical file name,
- FILENAME = 'operating system file name in single quotes',
- SIZE = initial size in KB/MB/GB/TB; if not specified, defaults to the model database size,
- MAXSIZE = max size in KB/MB/GB/TB or UNLIMITED; unlimited is the default and is not recommended,
- FILEGROWTH = growth increment specified in KB/MB or as a %, with a minimum of 64 KB; the default is 10%)
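Putting the file specifications together, a minimal sketch (database name, logical names and paths are illustrative):

```sql
CREATE DATABASE Sales
ON PRIMARY
  ( NAME = Sales_data,                              -- logical file name
    FILENAME = 'C:\MSSQL\Data\Sales_data.mdf',      -- OS file name
    SIZE = 10MB,
    MAXSIZE = 100MB,                                -- cap growth explicitly
    FILEGROWTH = 10% )
LOG ON
  ( NAME = Sales_log,
    FILENAME = 'C:\MSSQL\Data\Sales_log.ldf',
    SIZE = 3MB,
    MAXSIZE = 25MB,
    FILEGROWTH = 1MB )
```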
- To create read only database (for example a database that is going to reside on a CD-ROM) use
"sp_create_removable" stored procedure instead of "CREATE DATABASE".
- You can rename databases with "sp_renamedb"; however, the database needs to be in single user mode before it can be renamed.
- You modify databases with "ALTER DATABASE" statement:
- ALTER DATABASE {database name}
- [ADD FILE {(file specification)} TO FILEGROUP {file group name}]|
- [ADD FILEGROUP {file group name}]|
- [ADD LOG FILE {(file specification)}]|
- [REMOVE FILE {logical file name}]|
- [REMOVE FILEGROUP {file group name}]|
- [MODIFY NAME = {new database name}]|
- [MODIFY FILEGROUP {file group name} [READONLY|READWRITE|DEFAULT] NAME={new name}]|
- [SET {options}]|
- [COLLATE {collation name}]
- With the above SET statement you can set various options; in older versions of SQL Server you needed to use "sp_dboption".
- Some interesting (or not self-explanatory) options available:
- RESTRICTED_USER - database in use only by "db_owner", "dbcreator" and "sysadmin".
- AUTO_CLOSE - database option that when turned on will free database assigned resources once all users finished using it.
- ANSI_NULLS - SQL option; when turned on, it forces all comparisons that involve nulls to evaluate to "unknown". Otherwise, if both values compared are null, "true" is returned.
- ANSI_PADDING - SQL option when turned on strings are padded to same length before comparison or insertion operations.
- ANSI_WARNINGS - SQL option when turned on errors such as divide by zero will show up.
- QUOTED_IDENTIFIER - SQL option when turned on double quotation marks may be used to enclose delimited identifiers.
- BULK_LOGGED - recovery setting, certain bulk operations will not be logged by the system.
- SIMPLE - recovery setting, log space reused when not needed by server recovery - i.e. only active part of the log is kept.
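A sketch of the ALTER DATABASE forms above, continuing the hypothetical "Sales" database (filegroup, file names and paths are illustrative):

```sql
-- Add a new filegroup, put a secondary data file in it,
-- then change a database option with SET.
ALTER DATABASE Sales ADD FILEGROUP HistoryFG
GO
ALTER DATABASE Sales
ADD FILE ( NAME = Sales_hist,
           FILENAME = 'D:\MSSQL\Data\Sales_hist.ndf',
           SIZE = 5MB )
TO FILEGROUP HistoryFG
GO
ALTER DATABASE Sales SET AUTO_CLOSE ON
```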
- Databases are removed with "DROP DATABASE {database name}" statement.
- You can attach a database with "sp_attach_db @dbname='{database name}', @filename1='{file name 1}'" (up to 16 files can be specified). You can detach a database with "sp_detach_db @dbname='{database name}'".
- Stored procedure "sp_helpfile" can be used to determine files used by current database.
- The command "DBCC SQLPERF (LOGSPACE)" will return the current amount of space used by all logs - other statistics can be found with "SQLPERF".
- Stored procedure "sp_spaceused {objectname}" returns amount of space used by given object.
[3.3] Table management
- Every column in a table has associated data type, it is an attribute of a column. Other objects can also have a data type
associated with them, these are: variables, stored procedure parameters, function return values. Stored procedures may also
return a value (return code) that is always an integer.
- Tables should be created in a way that avoids the use of "nullable" columns. Try to use default column values with
the help of the "DEFAULT" keyword - note that defaults are treated as constraints when creating tables. When
defining a column's datatype you also define whether the given column can store nulls with the "NULL" or
"NOT NULL" keywords.
- Data type can be broken into four parts: type of data that object can store, length of data that can be stored, precision
and number scale (last two for numbers only).
- The following (built-in) data types are not limited to 8 KB in size: "image", "text" and "ntext". Note that Unicode
characters take 2 bytes each. The maximum size for BLOB type objects is set at 2 GB (note possible truncation in DTS transformations).
- The "sql_variant" data type can hold values of other types (small size types), except text, ntext, timestamp,
image, and sql_variant datatypes.
- SQL Server 2000 supports the "text in row" option: the "sp_tableoption '{table name}', 'text in row',
'{number of bytes from 24 to 7000, with 256 as default}'" stored procedure allows storage of BLOB datatypes in rows
for small sizes - this increases the speed of access for small BLOB objects while keeping size flexibility.
- The "datetime" datatype is stored as two integers (4 bytes each). The earliest date stored is 1753/01/01,
latest 9999/12/31. "DATEDIFF(datepart, startdate, enddate)" is used for date comparison, startdate is subtracted from enddate.
- The "cursor" and "table" data types cannot be assigned to a table column.
- User defined datatypes can be added with Enterprise Manager or with the "sp_addtype" stored procedure. Types created
in a database exist only in that database; for system wide types, create them in the "model" database (new types will
exist only in subsequently created databases).
- When changing defaults with T-SQL you need to delete old default first before creating a new one. Defaults must comply with
column specified datatype. You cannot create defaults for "identity", "primary key" or columns with "timestamp" type.
- Each table can contain only one identity column (of type that is subset of integers) and one global unique identifier column
(used for replication, unique across all servers).
- With T-SQL, identity columns are created with the "IDENTITY(seed, increment)" property; the position of this
keyword in the column definition is not enforced by SQL Server. If you want to add or remove the identity property from a column, it is best
done through Enterprise Manager, since this operation involves a lot of SQL (Enterprise Manager is not efficient as it copies
the whole table; manually you can just create a new table and integrate it into your table structure).
- Identity column can be referenced by its column name or by "IDENTITYCOL" keyword.
- Global unique identifier columns are non-nullable columns that have their type set to "uniqueidentifier" and whose
default is created with the function "NEWID()". You also need to specify "ROWGUIDCOL" during column
definition, otherwise you just create a normal column that happens to store GUIDs. The keyword "ROWGUIDCOL" can be used
to reference the column that contains the global unique identifier.
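A sketch combining the identity and GUID column rules above (the table and column names are illustrative):

```sql
-- Table with an identity column and a global unique identifier column.
CREATE TABLE orders
(
    orderID   int IDENTITY(1,1) NOT NULL,
    orderGUID uniqueidentifier ROWGUIDCOL NOT NULL DEFAULT NEWID(),
    amount    money NULL
)
-- IDENTITYCOL and ROWGUIDCOL can stand in for the column names:
SELECT IDENTITYCOL, ROWGUIDCOL FROM orders
```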
- Information about tables can be provided by "sp_help {table name in quotes; if none provided, lists all objects in
the current DB}", Enterprise Manager, and "sp_depends {object name, mandatory, in quotes}" - lists dependent objects.
[3.4] Data integrity - making sure values stored in the DB are correct
- Methods that SQL server uses to enforce data integrity: "NOT NULL|NULL", data types (including user defined), "DEFAULT" definitions,
"IDENTITY" property, constraints, rules, triggers and indexes.
- Constraints, for optimization reasons, are preferred data integrity enforcement mechanism to defaults, rules and triggers.
- Main types of data integrity:
- Entity integrity - enforces integrity of primary key (unique identifier column) of a table.
- Domain integrity - enforces validity of entries in a particular column, such as type, format, length, range of values and other constraints.
- Referential integrity - ensures that key values are consistent across tables, no references to invalid key entries exist.
- User defined integrity - enforces user defined business rules not present in other categories.
- SQL server supports cascading referential integrity - it is placed right after foreign key constraint definition, with
"[ON DELETE {NO ACTION|CASCADE}] [ON UPDATE {NO ACTION|CASCADE}]".
- Constraints are divided into four groups: "Primary key", "foreign key", "unique" and
"check". Constraints also have two subtypes, "column level" constraints and "table level" constraints.
- Primary key column(s) have unique values, enforced by a special unique index which also helps with data access speed. Only one
primary key constraint is allowed per table. You create a column level constraint with the "PRIMARY KEY" keyword. You create
a table level primary key constraint with "CONSTRAINT {name of constraint} PRIMARY KEY ({column list})".
- Foreign key constraints can contain null values; however, if nulls are present in a composite column foreign key, the
constraint is not verified. A foreign key constraint may reference the primary key constraint of another table or a unique constraint of
a foreign table. You can create a foreign key reference in two ways: with a column based constraint using the "REFERENCES"
keyword, as in "REFERENCES orders(orderID)", or a table level constraint, as in
"CONSTRAINT {name} FOREIGN KEY ({columns}) REFERENCES {tableName}({columnList})". When adding a foreign key constraint you
may choose not to perform the default column check. You may also disable this constraint for insert and update operations.
- A unique constraint allows columns to have null values; there can be many unique constraint columns in a table, and a unique
constraint column may be referenced as a foreign key in a different table. When a unique constraint is created, by default, SQL
Server (as with a primary key) creates a non-clustered unique index. You create unique constraints with the "UNIQUE" keyword
for column level constraints and with the standard constraint syntax for table level constraints.
- Check constraints are used to limit acceptable column values to some sub-range of the column's data type. You can apply multiple
check constraints to the same column; they will be evaluated in the order they were created. You add a column level check
constraint by using the same syntax as a table level constraint, but placed right after (without a comma) the type definition:
"CONSTRAINT {name} CHECK ({columnName} {range of values})". As with unique constraints, you can disable check
constraints for update and insert operations as well as for replication purposes.
- Before you can modify an existing constraint, you need to drop it and then re-create it with the new settings.
- To stop SQL Server from checking existing data before it implements a constraint, use the keyword "WITH NOCHECK" after the table
name. To disable a constraint, use "NOCHECK CONSTRAINT {constraint name}" after the table name (i.e. omit the "WITH").
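The four constraint types, cascading referential integrity and WITH NOCHECK can be sketched in one script (the tables and names are illustrative; the foreign key assumes an "orders" table with an "orderID" primary key):

```sql
CREATE TABLE orderDetails
(
    orderDetailID int NOT NULL CONSTRAINT pk_orderDetails PRIMARY KEY,
    orderID       int NOT NULL
        CONSTRAINT fk_orderDetails_orders REFERENCES orders(orderID)
        ON DELETE CASCADE,                       -- cascading delete
    quantity      int NOT NULL CONSTRAINT ck_quantity CHECK (quantity > 0),
    lineCode      char(10) NULL CONSTRAINT uq_lineCode UNIQUE
)
-- Add a constraint later, skipping the check of existing rows:
ALTER TABLE orderDetails WITH NOCHECK
ADD CONSTRAINT ck_orderID CHECK (orderID > 0)
```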
- Stored procedure "sp_helpconstraint {tablename}" shows all constraints present in a table (they are
also present in detailed output of "sp_help").
- Check constraints can use the "BETWEEN", "LIKE" and "NOT LIKE" keywords.
- You can create global default constraints (default objects) using: "CREATE DEFAULT {default name} AS
{default value}". You bind a default to a column in a table using "sp_bindefault {default name}, {table.column}".
- You can create global rules (preferred way is to use check constraints) with: "CREATE RULE {rule name}
{some variable @testVar constraint definition}". You bind rules to columns using "sp_bindrule {rule name}, {table.column}".
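A sketch of default and rule objects bound to pubs columns (the object names and the 'CA' value are illustrative):

```sql
-- Global default object, bound to the authors.state column.
CREATE DEFAULT df_state AS 'CA'
GO
EXEC sp_bindefault 'df_state', 'authors.state'
GO
-- Global rule: the bound column must look like a 5-digit zip code.
CREATE RULE rl_zip AS @zip LIKE '[0-9][0-9][0-9][0-9][0-9]'
GO
EXEC sp_bindrule 'rl_zip', 'authors.zip'
```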
- SQL server supports three types of replication:
- "snapshot" - distributes data to subscribers as it appeared at the moment the snapshot was taken. Copies
whole data set.
- "transactional" - uses snapshot to push data to subscribers for initial population. As changes occur, they
are propagated to subscribers. Subscriber data is read only.
- "merge" - similar to transactional replication but allows data changes to be made by the subscriber. Changes
are merged together at publishing server.
Part 4: Accessing and modifying data in SQL database
[4.1] Accessing database data
- Basic elements of the SQL select statement (its most important functionality), broken into parts according to the order they may appear,
with optional parameters in square brackets:
- SELECT [DISTINCT] {list of table column names or acceptable substitutes such as aggregates}.
- [INTO {new table name}] - used to create a new table from the select output (data from multiple tables can be combined). This is different from an INSERT...SELECT statement; here you write SELECT {columns} INTO {new table name} without the "INSERT" keyword.
- FROM {table list}.
- [WHERE {single search condition}] - for multiple search conditions add as many AND {search condition} and OR
{search condition}.
- [GROUP BY {column list}] - used with aggregates in SELECT list.
- [WITH CUBE|WITH ROLLUP] - used to expand the information about the data returned to the user by SQL Server. Needs "GROUP BY" to be used; "WITH CUBE" or "WITH ROLLUP" may be specified, but not both.
- [HAVING {aggregate conditions}] - limits placed on aggregate functions present in select statement, can use AND and
OR for multiple conditions.
- [ORDER BY {column list}[ASC|DESC]] - order by column data; you can use ASC or DESC after each column name to specify whether
sorting should be ascending or descending.
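The clause order above can be sketched in one query against the pubs sample database:

```sql
-- SELECT with most of the optional clauses, in their required order.
SELECT TOP 10 type, AVG(price) AS avgPrice   -- aggregate with an alias
FROM titles
WHERE price IS NOT NULL                      -- row filter before grouping
GROUP BY type
HAVING AVG(price) > 10                       -- filter on the aggregate
ORDER BY avgPrice DESC
```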
- In the select clause you may use "DISTINCT" to return distinct values only and "TOP n" to choose the first n
values returned by the query. The "AS" keyword is optional and is used to rename column names or output from
aggregates. Note that when using an alias for a table name, subsequent references to that table's columns may use the alias; for
renamed aggregates you cannot use the alias elsewhere in the SQL - it is for display purposes only. An alias for a table needs to be used instead
of the table name in subsequent SQL; if the table name is used instead of the alias defined for it, SQL Server will throw an error.
- There are two types of joins, "inner" and "outer". Both join types are based on set theory. Outer joins can be left
(all rows from left table with join condition matching rows from right table), right (all rows from right table with join
condition matching rows from left table) or full (all rows from both tables regardless of join condition). Most joins can be
re-written as sub-queries and vice versa.
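A minimal sketch of a left outer join, assuming hypothetical customers and orders tables:

```sql
-- All customers; order amounts where a matching order exists, NULL otherwise
SELECT c.name, o.amount
FROM customers c
LEFT OUTER JOIN orders o ON c.customerId = o.customerId
```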
- Subqueries can be used to retrieve values that are part of the select list as well as placed in the where clause.
In the where clause different keywords can be placed between the tested expression and the subquery, such as
"IN", "NOT IN", "{comparison operator} [ANY|ALL]", "EXISTS", "NOT EXISTS"
(an existence check returns True or False depending on whether the subquery returns any rows).
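For illustration, two equivalent ways to find customers that have at least one order (tables are hypothetical):

```sql
-- Correlated subquery with EXISTS
SELECT name FROM customers c
WHERE EXISTS (SELECT * FROM orders o WHERE o.customerId = c.customerId)

-- Same result with IN
SELECT name FROM customers
WHERE customerId IN (SELECT customerId FROM orders)
```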
- The "cube" operator generates result set that includes aggregates for all combinations of values in selected columns - 3D
data view. You will get an extra row(s) (at least one extra row) in your query where for each output column combination you
will get values of aggregates. This includes null value for output columns.
- The "rollup" operator is used to create hierarchical subtotals from lowest to highest level according to the columns listed in the group by clause.
- With both cube and rollup operators you can use the aggregate function "GROUPING({column used in group by clause})",
which marks each row in the result: 1 if the row was produced by the rollup or cube operation (a summary row), 0 if not.
- Both cube and rollup are not ANSI standard, however, they should be used instead of "COMPUTE|COMPUTE BY".
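A rollup sketch over a hypothetical sales table; GROUPING marks the summary rows the operator adds:

```sql
-- Per region/product totals plus a subtotal per region and a grand total;
-- GROUPING(product) is 1 on the added summary rows
SELECT region, product, SUM(amount) AS total,
       GROUPING(product) AS isSummaryRow
FROM sales
GROUP BY region, product WITH ROLLUP
```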
- You use compute statement to create additional results sets after execution of a select statement. These result sets contain
summary information about data present in a column(s) using aggregate function. "COMPUTE {aggregate function such as
sum({column name})}, [more computations] [BY] [column name]".
- "sp_executesql" stored procedure can be used to execute SQL statements that can be reused multiple times or contain
dynamic SQL.
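A minimal sketch of sp_executesql with a parameter, so the cached plan can be reused for different values (the table name is hypothetical):

```sql
DECLARE @sql nvarchar(500)
SET @sql = N'SELECT customerId FROM orders WHERE amount > @minAmount'
-- The second argument declares the parameters embedded in the dynamic SQL
EXEC sp_executesql @sql, N'@minAmount money', @minAmount = 100
```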
[4.2] Updating, deleting and inserting database data
- Insert syntax: "INSERT [INTO] {table or view} [(list of columns comma separated)] VALUES (list of values comma
separated, in quotes if needed)". Note that "INTO" is optional.
- Insert select query is created with "INSERT [INTO] {table or view} [(list of columns comma separated)]
SELECT [(list of columns comma separated)]" -- rest of select as usual with possible filters.
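Both forms sketched against a hypothetical customers table:

```sql
-- Plain insert with an explicit column list and a VALUES clause
INSERT INTO customers (customerId, name) VALUES (1, 'Tom')

-- Insert rows returned by a query into an existing archive table
INSERT INTO customersArchive (customerId, name)
SELECT customerId, name FROM customers WHERE active = 0
```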
- In order to insert large amounts of data into columns of supported types (i.e. text, ntext, image and custom types based on
them) you can use the "WRITETEXT" statement. This statement doesn't work on views. ADO applications can use the
"AppendChunk()" method, OLE DB applications the "ISequentialStream" interface, ODBC the
"SQLPutData()" function and DB-library the "dbwritetext()" function.
- Update syntax: "UPDATE {table name} SET {column = new value list of name value pairs which are comma separated} [WHERE condition]".
- Using "UPDATE" you can update data on remote servers using a linked server name or the "OPENROWSET",
"OPENDATASOURCE" or "OPENQUERY" functions.
- Update is only performed if new data adheres to all constraints placed on target table.
- You may use SELECT to select data from different table than the one you are updating, as in
"SET {column name}= {here goes select statement for a value}".
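For example (tables and columns hypothetical), setting a column from a subquery against another table:

```sql
UPDATE customers
SET lastOrderDate = (SELECT MAX(orderDate) FROM orders
                     WHERE orders.customerId = customers.customerId)
WHERE customerId = 1
```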
- Delete syntax: "DELETE {table or view} [FROM {tables, views} WHERE {condition(s)}]" only data from the table
right after "DELETE" gets deleted. The "FROM" and "WHERE" are optional - used to specify rows that need to be deleted.
- "TRUNCATE TABLE {table name}" is a non-logged way of quickly deleting all data from a database table; it also
resets any identity counter on the table. The table definition and its indexes remain in place.
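Both forms side by side, on a hypothetical orders table:

```sql
-- Logged delete of rows matching a condition
DELETE FROM orders WHERE orderDate < '1999-01-01'

-- Fast removal of every row; identity values restart from the seed
TRUNCATE TABLE orders
```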
- The "OPTION (FAST n)" allows a select query to be optimized for display of the first "n" rows as fast as possible.
[4.3] Advanced methods of accessing and modifying data in SQL server
- "bcp" bulk copy program is a command line utility that is used to get data to and from a data file.
- From T-SQL you may use the "BULK INSERT" statement to insert large amounts of data into SQL server from an outside
text file. Unlike bcp, however, this command cannot copy data out to a data file. For text file format this is the fastest way to
get data into SQL server. For native data, it is as fast as bcp and DTS.
- As with DTS, the order of the columns in a data file does not matter and unneeded columns are not a problem, since
both bcp and bulk insert can skip unneeded data as well as re-arrange the data that is needed.
- Bcp syntax and important switches:
- bcp {database..dbtable | query} in|out|queryout {data file name} (queryout is used when copying out the result of a query).
- The -U switch is for user name, case sensitive.
- The -T switch is for trusted connection (i.e. using windows authentication).
- The -P switch is for user password.
- The -S switch is for server name.
- Bulk insert syntax:
- BULK INSERT {table name}
- FROM {data file path in quotes}
- WITH (DATAFILETYPE = '{file type}')
- Different file types:
- native, (-n in bcp) used to store data in SQL native format that is only understood by bcp/ bulk insert.
- character, char in bulk insert (-c in bcp) uses standard characters (plain text) for columns. This is bulk
insert default mode.
- unicode character, widechar in bulk insert (-w in bcp) same as character but in unicode.
- unicode native, widenative in bulk insert (-N in bcp) uses native mode for all non-character data while using
unicode for all character data.
- For the fastest possible bulk insert (not logged for every insert) enable the "select into/bulk copy" option. If you need
everything enforced and logged, then use "BCP" and "BULK INSERT" with the options "FIRE_TRIGGERS" and
"CHECK_CONSTRAINTS". By default bulk operations skip triggers and constraints - to speed things up.
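A sketch of a character-mode load with the slower but safer options enabled (the table and file path are hypothetical):

```sql
BULK INSERT orders
FROM 'C:\data\orders.txt'
WITH (DATAFILETYPE = 'char',
      FIELDTERMINATOR = ',',
      FIRE_TRIGGERS,
      CHECK_CONSTRAINTS)
```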
- DTS packages can be saved to SQL server, SQL Server Meta Data Services, structured file or MS Visual Basic file.
- Package execution can be scheduled. Package source version support is built into SQL server. You can password protect your packages.
- Each package can contain objects of four different types: "DTS tasks", "DTS transformations",
"DTS connections" and "DTS workflow".
- DTS task types:
- Data import and export - import/ export data from text files of different types or OLE DB data sources.
- Data transformation - you may re-arrange data columns, remove some or even run data driven query.
- Database object copy - you can transfer any DB object, for example trigger, index or user login.
- Sending messages - you can send email, execute other DTS packages and send messages to other packages.
- Execute T-SQL or ActiveX scripts on a data source.
- Since DTS is based on COM (Component Object Model) you can create your own custom tasks.
- DTS transformation abilities:
- Column data manipulation - for example changing size or type of column data.
- Applying ActiveX functions - complex conditional operations.
- Using one of the built-in transformations or writing your own.
- SQL Server can access external data through OLE DB which is provided as a "rowset". You may reference a
rowset in a query in the same way as SQL table.
- You can access external data through a linked server. A linked server can be added through enterprise manager or using
"sp_addlinkedserver" stored procedure. You reference tables in linked server via
"{linked server name}.catalog.schema.{object name}" as in "SELECT * FROM pollData...polls".
- The "OPENQUERY" function is used to execute a query on a linked server. Its syntax is
"openquery({server name},{sql to be executed})"; the function results can be placed instead of a table name in a select operation.
- "OPENROWSET" is used to access remote data except you need to provide connection information for the foreign server;
use it for infrequent access to remote servers. This function returns the results of a query as opposed to an entire table, as "OPENDATASOURCE" does.
- "OPENDATASOURCE" is similar to "OPENROWSET"; it supports "catalog.schema.object" notation
and exposes multiple row sets. Used for ad-hoc operations. This statement is a macro, as is its cousin "OPENROWSET".
- "OPENXML" - used with small XML documents for which this statement provides a row set result - i.e. can be used in place of a table.
- A cursor is defined as an entity that maps over a result set and establishes a position on a single row within the
result set. Operations can be performed on the row the cursor is currently positioned on.
- Cursors extend result sets by providing application with a way to work with one or few rows at a time instead of whole
results sets returned by SELECT statement.
- There are three types of cursors, T-SQL server cursors, API server cursors (both together are referred to as server cursors) and client cursors.
- Server cursors support only single result sets and don't support some T-SQL statements such as "COMPUTE" and "INTO".
- Using cursors to retrieve data:
- You declare a cursor with "DECLARE {cursorName} CURSOR FOR" statement followed by standard SELECT statement
that declares the result set with which cursor is going to work.
- After cursor is declared, it needs to be populated with "OPEN {cursorName}" - this statement executes select
statement present in cursor's definition above.
- Use "FETCH NEXT|FIRST|LAST|PRIOR|ABSOLUTE n|RELATIVE n (note that n can be positive or negative) FROM
{cursorName}" to retrieve rows, this is executed many times, since there are usually many rows in the cursor result set.
- Use "CLOSE {cursorName}" to end active cursor use; note that the cursor remains positioned at the last row and can
still be re-opened with "OPEN {cursorName}".
- To remove the cursor from the system, freeing all resources, use "DEALLOCATE {cursorName}".
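The whole declare - open - fetch - close - deallocate cycle in one sketch (the customers table is hypothetical):

```sql
DECLARE @name varchar(50)
DECLARE custCursor CURSOR FOR
    SELECT name FROM customers
OPEN custCursor
FETCH NEXT FROM custCursor INTO @name
-- @@FETCH_STATUS is 0 while rows keep coming
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @name
    FETCH NEXT FROM custCursor INTO @name
END
CLOSE custCursor
DEALLOCATE custCursor
```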
- Cursors are subject to the same locking rules as any other select statement. Additional "scroll" locks can
be added for cursors based on cursor concurrency level.
- Fetching of every row requires round trip to the server. T-SQL cursors don't support batches. T-SQL cursors work best when
they are part of stored procedure (or a special stored procedure - trigger) where they are part of the execution plan and
network traffic is non-existent due to operations being performed on the same server.
- T-SQL cursors exist only in the T-SQL world and can thus only be referenced by other T-SQL statements. For applications you need to use
cursor support built into the database API. T-SQL cursors are also limited to one row at a time, while API server cursors are not.
- API server cursors go through the same process as T-SQL cursors, they are supported on OLE DB, ODBC, ADO and
DB-library (if "DBCLIENTCURSOR" is not set DB-library works same as other APIs).
- There are four cursor types for API server cursors, forward-only, static (aka insensitive or snapshot), dynamic and keyset-driven.
- Cursor behavior is defined with two keywords in SQL-92: "SCROLL" and "INSENSITIVE"; database APIs may add more.
- Client cursors are implemented by caching all of the client's result set rows. Thus each time client application calls a
cursor the driver performs cursor operation on the result set cached on the client (not all rows may be cached, all is default).
- Client cursors cannot be dynamic, they can be static or forward only. Client cursors are supported by ODBC, DB-library and ADO.
- Client cursors should only be used if one needs to go around the limitations of T-SQL with server cursors or their
inability to work with batches.
- All cursors require memory and other caching resources, large cursors or unnecessary cursors can degrade performance.
- There are four types of cursors:
- Static cursors - return snapshot result (stored in tempdb).
- Dynamic cursors - opposite to static, all changes are seen as users scroll through data.
- Keyset cursors - as static cursors but controlled by unique set of identifiers stored in tempdb called a keyset.
- Forward only cursors - defined with "FORWARD_ONLY" statement, more efficient than regular cursors.
- SQL Server support for XML overview:
- Ability to retrieve XML from SELECT statements with "FOR XML" statement.
- Ability to write XML data with "OPENXML" rowset provider.
- Retrieving XML data with help of Xpath query language.
- Support for XML-Data Reduced schemas (XDR).
- Ability to access SQL server through HTTP protocol.
- OLE DB provider enabled for XML documents to be set as command text and return sets as stream.
- The syntax for select statement XML retrieval: "FOR XML {RAW|AUTO|EXPLICIT} [,XMLDATA] [,ELEMENTS] [, BINARY BASE64]".
- Raw mode transforms each row into XML element with generic identifier. Auto mode returns query results as nested XML
elements with each column as an element (with "ELEMENTS" option) or attribute (default). In explicit mode query
creator controls shape of XML - requires the first column to hold positive integers that represent XML tag IDs.
The "XMLDATA" option returns XML data schema and should be used to find out data types.
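For example, against a hypothetical customers table:

```sql
-- AUTO mode, columns as attributes (the default)
SELECT customerId, name FROM customers FOR XML AUTO

-- AUTO mode with columns as nested elements instead
SELECT customerId, name FROM customers FOR XML AUTO, ELEMENTS
```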
- "OPENXML" is a rowset provider similar to OPENROWSET; it can be used anywhere a table or view name can. In order
to use it, the "sp_xml_preparedocument" stored procedure needs to be run first to parse the XML document and return an object
reference. OPENXML needs a reference to a rowset schema given with the "WITH" keyword - it enables OPENXML to know how to
map XML data to column names.
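A minimal OPENXML sketch; the document and column mapping here are made up:

```sql
DECLARE @doc int, @xml varchar(500)
SET @xml = '<root><customer id="1" name="Tom"/></root>'
-- Parse the document and get a handle to it
EXEC sp_xml_preparedocument @doc OUTPUT, @xml
-- Read attributes of each /root/customer node as rowset columns (flag 1 = attribute-centric)
SELECT * FROM OPENXML(@doc, '/root/customer', 1)
       WITH (id int '@id', name varchar(50) '@name')
-- Release the parsed document
EXEC sp_xml_removedocument @doc
```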
Part 5: Implementing stored procedures and triggers
[5.1] Stored procedure basics
- Every piece of SQL sent to the server has to be checked for validity and security permission, and a new execution plan needs to
be built. Complex T-SQL that makes up the body of a stored procedure is treated as a single piece of SQL. The execution plan
for a stored procedure is created when the stored procedure is run for the first time. Every subsequent run of the stored procedure
forces SQL server to check the cache for an execution plan and create a new plan only if the cache is empty.
- Stored procedures can return integer status code. Extended stored procedures can be created with C++ or other programming
languages that can be compiled into DLL (Dynamic linked library).
- System stored procedures are stored in the "MASTER" database. There are hundreds of system procedures included with SQL server.
- Temporary stored procedures are created in "TEMPDB". There are three types of temporary stored procedures: local
(private) - stored procedure name begins with "#", global - stored procedure name begins with "##" and
created directly in TEMPDB.
- Scope of local temporary stored procedure is limited to the connection that created it, after connection is done, it is
deleted. There is no chance of name collision since SQL appends connection number to procedure name. Other users can see stored
procedure but cannot access it.
- Global temporary stored procedure can be executed by any connection, it is not private. This stored procedure has
fixed security permission granting access to public role. This stored procedure is deleted as soon as connection that
created it is gone and any other connections are done executing it.
- Temporary stored procedures created directly in "TEMPDB" exist till the server is restarted. You need to set security permissions for
them and provide the path to the objects they are working with.
- Extended stored procedures are external DLL programs. Most extended stored procedures start with "xp_", but
this is not a rule. To determine stored procedure type use the "OBJECTPROPERTY" function with the property "IsExtendedProc".
- Remote stored procedures execute stored procedure on a remote server, they have been replaced by distributed queries.
- Stored procedure code (body) is stored in the "sysComments" table.
- Stored procedures can be created from enterprise manager in two ways, from tree menu and using wizard. They can also be
created using "CREATE PROCEDURE" (or "CREATE PROC" for short) statement. For API access use SQL-DMO
"storedProcedure" object.
- "CREATE PROC" is followed by stored procedure name. If you want to encrypt your stored procedure, add after its
name "WITH ENCRYPTION" - note that you will not be able to decrypt your stored procedure. If you want to force SQL
server to recompile your stored procedure every time, add "WITH RECOMPILE". On the next line it is customary to
define input and output parameters delimited by a comma. Output parameters are distinguished from input parameters by presence
of "OUTPUT" statement after their definition. Parameters are optional.
- After parameters are defined, if any, stored procedure body begins with "AS" keyword.
- If you want to ensure stored procedure ownership - for example dbo, to make sure anyone can run it regardless of table
ownership - precede the stored procedure name with the owner name. You don't specify a database name in the stored procedure name.
- When stored procedure is created SQL server doesn't check existence of objects it references, SQL does this only at
execution. By default SQL server assumes that the owner of all objects referenced in your stored procedure is procedure's
owner. If this is not the case, make sure you add proper owner to every object.
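Putting the pieces together - owner, an input parameter, an OUTPUT parameter and a body (the procedure and table names are hypothetical):

```sql
CREATE PROCEDURE dbo.getCustomerName
    @customerId int,            -- input parameter
    @name varchar(50) OUTPUT    -- output parameter
AS
    SELECT @name = name
    FROM dbo.customers
    WHERE customerId = @customerId
    RETURN 0
```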
- In order to use extended stored procedure you need to register it with SQL server. You can do that either through
enterprise manager or with help of "sp_addextendedproc" stored procedure.
- When resolving a stored procedure name SQL server automatically searches the Master database for system stored
procedures and TEMPDB for local and global temporary stored procedures. It doesn't search the Master database for
extended stored procedures. For your own stored procedures you may need to provide the database name and owner,
depending on where you are. Different users may have stored procedures named the same in the same database.
- Stored procedures cannot create other stored procedures or similar objects; they may create temporary tables.
- You may specify parameters to a stored procedure in two ways: one is to include them right after the stored procedure name.
The other is to define, with @, all variables that the stored procedure needs as name-value pairs.
- Stored procedures are limited to 128Mb in size. Maximum number of stored procedure parameters is set at 2100.
- You may create local and global temporary stored procedures using # and ## before procedure name, respectively. After procedure
name you may include "[;number]" where the number is "grouping ID" - it groups stored procedures together, it
must be sequential - numbering starts at 1.
- You can configure stored procedures that are owned by dbo and are in Master database to run when SQL server starts. This
is done with "sp_procoption" stored procedure. For procedures that are outside of master database, just nest
them inside master database stored procedures.
- Stored procedures can be modified with "ALTER PROCEDURE" (or "ALTER PROC" for short). A modified
stored procedure retains its security settings and its id.
- Procedures are removed from the database with the help of "DROP PROCEDURE" statement. You can also use enterprise manager.
- There are many reasons for a stored procedure to recompile, in addition to it being removed from cache:
- When a lot of data that stored procedure accesses changes, stored procedure might get a new execution plan.
- Creating or dropping objects inside a stored procedure together with data access functions such as "SELECT"
can cause recompile since new objects might be created that were not available before.
- Using cursors that access temporary tables may cause recompile of a stored procedure.
- Referencing temporary tables that were not created inside a stored procedure may cause a recompile - create temporary
objects inside stored procedures (if they are for their sole use).
[5.2] Programming stored procedures
- You can assign query values to a variable in a stored procedure, just use "{variable name} = {select query}".
- When a stored procedure returns values through output parameters, you provide the parameters as name-value pairs
when you execute it and add "OUTPUT" at the end of each output pair, for example "@output1 = @resultOutput1 OUTPUT".
- You may skip the name-value pair system and provide just the values. However, in this case the order of parameter
definitions in the stored procedure determines which value goes to which parameter. You still need to
type OUTPUT after the value if it is for an output parameter.
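A sketch of calling a procedure with an output parameter and capturing its return code; the procedure dbo.getCustomerName (an int input, a varchar output) is assumed for illustration:

```sql
DECLARE @result varchar(50), @status int
-- Name-value pairs; OUTPUT after the value of the output parameter
EXEC @status = dbo.getCustomerName @customerId = 1, @name = @result OUTPUT
SELECT @result AS customerName, @status AS returnCode
```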
- To output a variable's value use a select statement. You may also assign values stored inside variables to columns in a select statement.
- The "RETURN" statement is run to return stored procedure exit code, when it is encountered stored procedure
exits with provided error code as in "RETURN(1)" - return code 1.
- To give a parameter a default value, assign it one at the place of declaration in the stored procedure,
for example "@testInput int = 0".
- When errors occur a special variable "@@ERROR" changes value from the default 0 to the error code. Note that this variable
changes with every SQL statement execution - this includes IF statements in stored procedures.
- The function "RAISERROR('{msg text}',{severity},{state})" can be used to throw errors. Error messages are stored
in the "sysmessages" table; user defined error messages should have an error number greater than 50000.
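A sketch of checking @@ERROR right after a statement and raising an ad-hoc error (the table is hypothetical; ad-hoc RAISERROR messages get error number 50000):

```sql
UPDATE customers SET name = NULL WHERE customerId = 1
-- @@ERROR must be tested immediately - the next statement resets it
IF @@ERROR <> 0
    RAISERROR('update of customers failed', 16, 1)
```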
- You can output strings from a stored procedure using the "PRINT" statement followed by your string. In regular
T-SQL script, what you print will show up in the "messages" window of say query analyzer.
- Procedures can be recursive and can be nested up to 32 levels deep - "@@NESTLEVEL" indicates the current nesting level.
You can use cursors in stored procedures as well; however, it is preferred to use result sets since they are easier to program
and are more efficient.
- You use and declare cursors in stored procedures as you do outside them. Note that if you are going to return a cursor
from a stored procedure via parameter, that parameter must be of type "cursor".
- Rules for using "IF", "ELSE" and "ELSE IF" in stored procedures:
- In "IF" you may use comparison operators, like "=", for example "IF @x=5".
- You may also use in IF logical operators such as "OR" and "AND", for example "IF @x=5 AND @y='t'".
- For single statement inside IF (i.e. IF followed by say select) you don't need "BEGIN" and "END", however you may still use them if you want to.
- If your IF is to enclose multiple statements you need to use "BEGIN" right after IF and "END" right
after block of statements. BEGIN and END serve the same purpose as curly brackets in C.
- You may use "ELSE" and "ELSE IF".
- You may use "WHILE" loop as in "WHILE(condition that stays true during loop run)".
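The control-flow rules above in one sketch:

```sql
DECLARE @x int
SET @x = 5
IF @x = 5 AND @x > 0
BEGIN
    -- BEGIN/END needed only because the branch could hold many statements
    PRINT 'x is five'
END
ELSE
    PRINT 'x is not five'

-- Loop runs while the condition stays true
WHILE @x > 0
    SET @x = @x - 1
```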
- The variable "@@ROWCOUNT" contains the number of rows affected by the last query; it is reset with the execution of
every SQL statement, such as "IF". The variable
"@@FETCH_STATUS" contains the status of the last cursor fetch: 0 on success, -1 when no more rows are left. The variable
"@@IDENTITY" contains the identity value (column value, integer) of the last insert statement into any table in the
current session.
- The "IDENT_CURRENT('{table name}')" function can tell the last inserted identity value into table provided.
- By convention a local variable such as "@return_status" is used to capture the return value from a stored procedure. You cannot return
"NULL" from a stored procedure; an error will be generated and "0" will be returned.
- The "sp_recompile [ @objname = ] 'object'" stored procedure marks the given stored procedure - or, if a table is given, all
stored procedures and triggers that reference that table - for recompilation the next time they are run. Stored procedures are only optimized when they are compiled.
[5.3] Working with triggers
- Triggers (as most other objects) can be created with T-SQL and using the enterprise manager.
- Triggers are special stored procedures that are invoked after data is modified with one of "INSERT", "UPDATE" or "DELETE".
- You should consider using simpler integrity preserving techniques, such as constraints or (legacy) rules. Triggers should be
used when you need to perform complex integrity preserving operation, one that cannot be accomplished with a constraint. One
such place is making changes in different than the current table(s), for logging reasons or for reasons of denormalized
(duplicated) data in the database. Triggers may also add extra security to your application, for example, any modification to
employees table can be sent via email to system administrator - this is an example of an external action.
- Triggers expand the idea of cascading referential integrity (updates or deletes to foreign keys) since they can operate
on any table in the database or objects outside of it (they are stored procedures). You should let "ON DELETE CASCADE ON
UPDATE CASCADE" take care of basic referential integrity.
- Triggers cannot be created on temporary or system tables, but they can reference temporary or system tables. Triggers
cannot be run manually, they only respond to events. One trigger can be set to respond to multiple events. A trigger is
applied to exactly one table or view - and an "AFTER" trigger can only be applied to a table.
- The "INSTEAD OF UPDATE" and "INSTEAD OF DELETE" triggers cannot be defined on tables that have
"ON UPDATE" or "ON DELETE" cascading referential integrity defined.
- Both "TRUNCATE TABLE" and "WRITETEXT" don't cause a trigger to run since they are non-logged operations.
- When a trigger fires, it has the ability to query the data being modified. Newly inserted data is stored in a memory
table called "INSERTED"; deleted data is stored in a memory table called "DELETED". An update is treated as a delete
followed by an insert, so an update trigger sees the old rows in "DELETED" and the new rows in "INSERTED".
- SQL server supports two classes of triggers, "AFTER" and "INSTEAD OF". The AFTER is the default and is
run in addition to the action (event). The INSTEAD OF trigger is run instead of the action (event) that started it. The AFTER
trigger is also known as the FOR trigger. The INSTEAD OF trigger is new to SQL server 2000.
- The INSTEAD OF triggers - as their function suggests - can only be applied once per action per table or view. This limitation
doesn't apply to AFTER triggers: there can be many AFTER triggers applied to a table for the same action. You can
specify trigger run order (which trigger runs first and which runs last) with "sp_settriggerorder".
- You can apply both types of triggers onto a table. The INSTEAD OF trigger always runs first, if it fails, transaction is rolled
back. If there are any constraints on the table, they are run after INSTEAD OF trigger. The AFTER trigger is run last,
only if there were no errors in execution before it.
- Only the INSTEAD OF triggers can access table columns of text, ntext and image type.
- T-SQL Triggers creation syntax (for T-SQL, you can also use "osql" command prompt tool or other tools):
- CREATE TRIGGER {trigger name}
- ON {table name or views - the last ones only for INSTEAD OF triggers}
- FOR|AFTER|INSTEAD OF {trigger class and type(s) - i.e. INSERT, UPDATE and DELETE} - note that FOR and AFTER mean the same thing, pick one only.
- AS {follow with T-SQL as when creating standard stored procedure}
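A minimal AFTER trigger following the syntax above; the employees and employeeAudit tables are hypothetical:

```sql
CREATE TRIGGER trgEmployeeInsert
ON employees
AFTER INSERT
AS
    SET NOCOUNT ON
    -- The inserted memory table holds the new rows
    INSERT INTO employeeAudit (employeeId, changedAt)
    SELECT employeeId, GETDATE() FROM inserted
```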
- Same as with regular stored procedures you cannot specify database name within trigger name, only the owner and trigger
name can be placed there. Other rules and permissions set as in regular stored procedures as well.
- After the "ON" you specify table name or view name, you can include the owner of the table. A trigger can be
applied only to a single table, if you need to create same trigger for many tables you will need to duplicate code.
- The "FOR|AFTER|INSTEAD OF" statement is followed by the list of event(s) that will force trigger execution.
The list may contain one or all of "INSERT, DELETE, UPDATE".
- Triggers can be modified with "ALTER TRIGGER {trigger name}" statement and deleted with
"DROP TRIGGER {trigger name}" statement. Trigger text can be viewed with "sp_helptrigger" or
"sp_helptext" stored procedures.
- You may disable a trigger with "DISABLE TRIGGER {trigger name}" and enable it with "ENABLE TRIGGER {trigger
name}", both used inside an "ALTER TABLE {table name}" statement.
- In both update and insert triggers you have access to special functionality designed for determining which columns were
affected. You use the true/false value returned by "UPDATE({column name without quotes})" - if true, the column was
assigned a value by the insert or update.
- For testing multiple column update or insert you may use "COLUMNS_UPDATED()" function. It returns a bit mask for
updated columns. For testing update of a column that is higher than 8, you need to use SUBSTRING function as well (8th column
is represented as number 128, 7th as 64 and so on).
- To roll back the operation that fired the trigger you may use "ROLLBACK TRANSACTION". Triggers should be transparent to the
applications that use them (unless such applications were configured with triggers in mind). Thus, you should be careful when
you return values from a trigger. Also, it might be a good idea to suppress the row modification message returned by a trigger with
the "SET NOCOUNT ON" statement - users will never know that a trigger executed.
Part 6: Working with views and functions
[6.1] View basics
- A "view" is a virtual table whose contents are defined by a query. View contents are not stored in a physical location, a view is
just combination of data from different places.
- Views are used for security reasons - they allow system administrator a way of sharing data without giving direct access to
the raw data. Views are also used to simplify the way database is seen by the users - define complex logic such as multiple joins
or unions in a view. With help of partitioning (implemented with UNION statement), views can be
used to improve database performance. Views can also be used to export and import data - with help of bcp program or BULK INSERT.
- Security in views involves restricting users to specific tables or even specific rows. Different tables can be joined to
look like one. Aggregate information can be supplied without access to detailed info.
- Views that are complex - expensive to execute - can have a clustered index created on them; in this case the view's data is
stored to disk. The view will always reflect data changes made to the tables it references.
- You cannot have an indexed partitioned view: either the view has an index or it is partitioned, not both. For indexed views all
tables must be in the same database.
- Indexed views are only supported by SQL server enterprise and developer editions.
- Accessing data returned by a view is easy, just treat it as a table - place view's name where you normally place table name.
- A federation of servers is a group of servers that share data for optimization purposes but are all administered separately.
- There are three classes of views, depending on the class, different restrictions are placed on your ability to update data presented in a view.
- Simple views - there is no "INSTEAD OF" trigger used and view is not partitioned.
- To delete data from the view you must construct the view based on only one underlying table (the one you are going to delete data from).
- A view cannot be based only on an expression, i.e. there must be at least one table in the "FROM" part of the query.
- No aggregate functions can be used in the main query (they are permitted in Subqueries).
- You cannot use derived columns in select list. Derived column is a result set formed by operation that is not a simple column reference.
- If the "WITH CHECK OPTION" statement is used when the view is created, no rows can be modified in such a way that
they would be removed from the view - such an operation will throw an error. This doesn't affect referenced tables when they are updated directly.
- Columns listed in UPDATE or INSERT statement must belong to only one table that is part of the view. You cannot
update columns that belong to different tables (that are part of the view) at the same time.
- Rather obvious - all insert and update operations have to observe any constraints placed on the underlying table.
- Simple updatable views in their select statement cannot contain any aggregate functions (subqueries may contain them).
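The simple-view rules above can be sketched with a hypothetical table and view (all object names here are illustrative, not from the exam material):

```sql
-- Hypothetical table; all names here are illustrative.
CREATE TABLE dbo.Employees
    (EmployeeID int PRIMARY KEY, LastName varchar(50), Region char(2))
GO
-- A simple view over one table; WITH CHECK OPTION prevents modifications
-- through the view that would make a row disappear from the view.
CREATE VIEW dbo.WestEmployees
AS
    SELECT EmployeeID, LastName, Region
    FROM dbo.Employees
    WHERE Region = 'WE'
WITH CHECK OPTION
GO
-- Allowed: the row still satisfies the view's WHERE clause.
UPDATE dbo.WestEmployees SET LastName = 'Smith' WHERE EmployeeID = 1
-- This would raise an error: the row would leave the view.
-- UPDATE dbo.WestEmployees SET Region = 'EA' WHERE EmployeeID = 1
```

Because the view is based on a single underlying table, deletes through it are also allowed.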
- Partitioned views.
- Updatable only on SQL server 2000 Enterprise (or Developer) editions.
- Partitioned views are only updatable if sets of SELECT statements that define them are combined into one using
"UNION ALL" statement. Each SELECT references only one table.
- You cannot use the "OPENDATASOURCE" or "OPENROWSET" functions to specify a pass-through query -
these functions may only reference a table name, the same way linked servers are referenced with a four-part name.
- For insert operations you need to specify all columns (including ones with default and null clauses) - tables that
have an identity column or a column of the "timestamp" type cannot be modified.
- You cannot insert, update or delete if a self-join is present in the view (either of the view with itself, or of member tables with each other).
- For update and insert operation you cannot use "DEFAULT" as the value of a column (when DEFAULT is specified on that column).
- Distributed partitioned views perform better when you enable lazy schema validation (the query processor will not request object metadata
until it is needed) with the help of the "sp_serveroption" stored procedure.
- Views with defined INSTEAD OF triggers - extend capabilities of UPDATE and DELETE through the use of a trigger.
- Rules for defining views:
- You cannot use AFTER triggers with views, only INSTEAD OF triggers (for table update operations).
- You cannot use DEFAULT or rules with views as part of a query.
- You cannot use full-text indexes defined on views but view query may reference tables that support full-text indexes.
- The following cannot be used in view main query: "ORDER BY (unless you use TOP), INTO, COMPUTE and COMPUTE BY".
- Temporary views and views on temporary tables are not supported.
- To create a view you must have permission to do so granted by database owner.
- Views are created in a way very similar to stored procedures. You may use enterprise manager or "CREATE VIEW" statement.
- When you create a clustered index on a view you save the processing time required to build the result set (called
materializing the view). Certain restrictions exist for a view that is to hold a clustered index.
- You need to set the "ANSI_NULLS" and "QUOTED_IDENTIFIER" options ON before creating the view. All tables
referenced by the view also need to have been created with "ANSI_NULLS" ON. The "SCHEMABINDING" option must
also be used when creating the view.
- View for which you want to create a clustered index cannot reference other views.
- All referenced tables have to be in the same database and have the same owner.
- All used functions have to be deterministic in nature.
- Certain T-SQL is not allowed in addition to the standard SQL restrictions placed on views. For example, most aggregate functions are disallowed, and COUNT(*) cannot be used - only COUNT_BIG(*).
- All column names need to be defined, you cannot use * (star).
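A minimal sketch of creating an indexed view under the restrictions above, assuming a hypothetical dbo.Orders table whose OrderAmount column is NOT NULL (all names are illustrative):

```sql
-- The SET options below must be ON when the view and its index are created.
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
GO
-- SCHEMABINDING and two-part object names are mandatory; COUNT_BIG(*)
-- stands in for the disallowed COUNT(*).
CREATE VIEW dbo.OrderTotals WITH SCHEMABINDING
AS
    SELECT CustomerID,
           COUNT_BIG(*)     AS OrderCount,
           SUM(OrderAmount) AS TotalAmount
    FROM dbo.Orders
    GROUP BY CustomerID
GO
-- The unique clustered index materializes the view's result set to disk.
CREATE UNIQUE CLUSTERED INDEX IX_OrderTotals
    ON dbo.OrderTotals (CustomerID)
```

Once the clustered index exists, additional nonclustered indexes can be created on the view.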
- Unlike with stored procedures, a view's execution plan is recreated each time the view is accessed; an exception to that is an indexed view.
Only the view's query tree is cached.
- The "SCHEMABINDING" option binds the view, procedure or function to the schema of the objects it references.
Referenced objects then cannot change (i.e., for example, they cannot be dropped) without the view being modified first. All objects
referenced in the view need a two-part name, i.e. "owner.object".
- Indexes on views are created with the "CREATE INDEX" statement. Standard options like "ANSI_NULLS" have to be turned
on for the operation to proceed.
- To partition tables, create small tables with restrictions on the number of rows each table can have. Implement limits on
the number of rows per table using "CHECK" constraint when creating the table.
- Partitioned view is a bunch of select statements from each small table (that the big one was divided into) separated with
"UNION ALL" statement ("UNION ALL" returns all rows, including duplicates, where "UNION" removes duplicates).
- When using "UNION" note that the number of columns in both queries must be the same and data types must be
compatible. The column names in the first query will give returned dataset column names.
- Views can be modified in similar way to stored procedures - using "ALTER VIEW {view name}" statement. Views are
deleted in similar fashion to stored procedures, using "DROP VIEW {view name}" statement.
[6.2] Working with functions
- There are three types of functions: scalar, table-valued and system-defined table/scalar functions. Scalar functions
can be used in query filtering, constraint building and computed column definitions. Table-valued functions can be used
as an alternative to read-only views - they can contain more than one statement, whereas views are limited to a single SELECT.
- Functions accept inputs (up to 1024) and return only one output with the help of "RETURN" statement which is required.
- Scalar functions cannot return these data types: text, ntext, image, cursor, timestamp.
- Functions don't change anything in the database - they don't create any new objects.
- Table-returning system UDFs have their names preceded by a double colon (::).
- You may create your own system functions: prefix the name with "fn_", set the owner to
"system_function_schema", and create them in the Master database.
- You may use the system information schema views "ROUTINES", "ROUTINE_COLUMNS" and "PARAMETERS" to see information about
functions and views. The first view is the only way to get information about system table-valued UDFs (in the Master db).
- Tables returned by table returning UDFs have certain limitations, for example you cannot manipulate any text and ntext
fields with "UPDATETEXT, WRITETEXT and READTEXT" statements.
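A hedged sketch of both function kinds; the table dbo.Orders and all function names are illustrative:

```sql
-- Scalar UDF: usable in WHERE clauses, constraints and computed columns.
CREATE FUNCTION dbo.fn_Area (@w int, @h int)
RETURNS int
AS
BEGIN
    RETURN (@w * @h)   -- the RETURN statement is required
END
GO
-- Inline table-valued UDF: a parameterized alternative to a read-only view.
CREATE FUNCTION dbo.OrdersFor (@CustomerID int)
RETURNS TABLE
AS
    RETURN (SELECT OrderID, OrderAmount
            FROM dbo.Orders
            WHERE CustomerID = @CustomerID)
GO
-- Scalar calls need the owner prefix; table-valued calls go in FROM.
SELECT dbo.fn_Area(3, 4)
SELECT * FROM dbo.OrdersFor(42)
```

Note that neither function modifies anything in the database, consistent with the rule above.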
Part 7: Implementing indexes
[7.1] Index basics
- There are two types of indexes, "clustered" and "nonclustered". There can be only one clustered index per table or view.
Both types of indexes are implemented as B+-trees. The leaf level of a clustered index contains the actual data rows,
while the leaf level of a nonclustered index contains bookmarks to the table records.
- Full-text index is not SQL Server index - MS search engine maintains full-text index. With this service you can search for
specific strings in text. Full-text index is contained in a catalog that is maintained by MS search service.
- A clustered index is created on the "PRIMARY KEY" column at its creation, if clustered index already exist in
given table when primary key constraint is added, only non-clustered index is created.
- Minimal B-trees are created when the column used to create the index has few (or no) duplicate values. When you create a
clustered index on a column that has duplicates, SQL Server will need to add a hidden unique value to each duplicate key so the
key column remains unique.
- You may specify a "fillfactor" for both types of indexes. The fillfactor specifies how full each leaf page is made
when the index is created, leaving the rest as free space for future inserts before a "page split" is needed. High
fillfactors are good for static tables, low fillfactors for dynamic tables. Fillfactor is measured as a percentage; the default
value is 0 (similar to 100%). The fillfactor setting is only enforced at index creation. Low values can slow down read operations.
- A column on which a unique constraint is created automatically gets a nonclustered unique index - to assert the uniqueness
of values entered into the constrained column.
- To force a particular type of index for a unique or primary key column, include "NONCLUSTERED" or
"CLUSTERED" as appropriate after the constraint definition when creating the constraint (as part of a new table or otherwise).
- You can create up to 249 nonclustered indexes on a table (plus one clustered) or 250 nonclustered only indexes.
- For a view, you first need to create a unique clustered index before creating nonclustered indexes, this doesn't apply to
tables.
- A table that doesn't have a clustered index on any of its columns is called a heap. If a table contains only
nonclustered indexes, their bookmarks point directly to rows in the table. When a clustered index is added, all nonclustered
index bookmarks point to keys in the clustered index instead.
- Indexes can be made "unique", they can be defined on multiple columns (such index is said to be a composite
index), have defined fill factor, have a pad index (change space allocation in the B-tree) and have defined sort order for keys
(ascending or descending). Other advanced features exist.
- A "covering index" is one that when used in a query (said to be covered) prevents the query from going directly to the table -
all results are supplied by the index.
- Sort order placed on the index defines the default sort in which query results will be returned. Note that for performance
reasons, for a single column index, there is no difference what sort it is in.
- The stored procedure "sp_helpindex {table name}" can be used to get information about indexes placed on a table.
You can also use enterprise manager or query "sysindexes" system table directly.
[7.2] Index administration
- Index can be created in enterprise manager directly or with help of a wizard. You may also use
"CREATE [UNIQUE] [CLUSTERED|NONCLUSTERED] INDEX {index name} ON {table name(column list)} [WITH {list of options}] [ON {file group}]" statement.
Index can also be created as part of table definition (create and alter). After column name you may specify "DESC"
or "ASC" for sort order.
- By default you create nonclustered indexes that use system wide padding settings. Sorting is ascending. Filegroup is the
default file group by default.
- In create or alter table statements index relevant commands are these related to primary key and unique constraint, which
each create an index by default. As additional option you may specify the fill factor and file group where new index will be
created.
- Indexes are removed with the "DROP INDEX {tableName.indexName}" statement. You need to drop indexes before recreating
them unless you use "DROP_EXISTING" in the create statement. When "DROP_EXISTING" is used to rebuild a clustered index,
the secondary indexes do not have to go through a reindexing operation.
- When clustered index is created or dropped on a table all secondary indexes have to be reindexed (they will now use
different bookmark types). You can rebuild all indexes using "DBCC DBREINDEX ('[table name]', [index name, leave blank for
all], [fill factor such as 90])" statement - you will avoid having to drop and recreate all indexes.
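The statements above can be sketched as follows (table and index names are illustrative, and IX_Emp_Name is assumed to already exist for the DROP_EXISTING rebuild):

```sql
-- Composite nonclustered index with an explicit fill factor and a
-- descending sort order on the second column.
CREATE NONCLUSTERED INDEX IX_Emp_Name
    ON dbo.Employees (LastName ASC, HireDate DESC)
    WITH FILLFACTOR = 80
GO
-- Rebuild the existing index in place instead of dropping it first.
CREATE NONCLUSTERED INDEX IX_Emp_Name
    ON dbo.Employees (LastName ASC, HireDate DESC)
    WITH DROP_EXISTING, FILLFACTOR = 90
GO
-- Rebuild every index on the table in one pass with fill factor 90.
DBCC DBREINDEX ('dbo.Employees', '', 90)
-- Remove a single index.
DROP INDEX dbo.Employees.IX_Emp_Name
```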
- Rules for creating effective indexes:
- Remember that clustered index is more important than any nonclustered index - since you only can have one.
- Use clustered indexes on columns that have unique values, use nonclustered index for columns that don't contain unique data.
- Use nonclustered indexes on foreign key columns, columns that are commonly used and columns with few unique values.
- Wildcard queries are time intensive because they are not aided by indexes.
- Range queries are best used with clustered indexes.
- A query's required sort order can be satisfied by the clustered index's sort order.
- For purposes of keeping indexes small, your main index - clustered index, should contain as small values as possible
(these are used for bookmarks by nonclustered indexes).
- There is no need for indexes on small tables, especially when they rarely change.
- Remember that when table information is modified indexes have to be updated as well - too many indexes may decrease
performance. This is especially true for tables that undergo frequent modifications.
- When insert performance drops on heavily inserted to table, you may consider dropping indexes and recreating them with lower fill factor.
- Index tuning wizard can be used with input consisting of a trace file, sql script file or direct SQL script.
- Index tuning wizard doesn't make recommendations for primary key constraints, indexes in foreign databases, or unique
indexes. It also doesn't evaluate queries with quoted identifiers or SQL server 6.5 queries; however, it will consider query,
table and index hints (which can cause suboptimal recommendations to be given).
- Index tuning wizard has options that limit number of queries considered and it can generate script for its recommendations
to be implemented at a later date.
- SQL server maintains index use statistics that can be used for index tuning. You may disable statistics gathering on static
tables using "STATISTICS_NORECOMPUTE" option of create index statement. To turn it back on, use
"UPDATE STATISTICS {table} {index}".
- Multicolumn indexes have the ability to distinguish rows which have repeated values in a single column - the multiple columns together make each row unique.
- You can use "update statistics [table name] {index name}" to update statistics on a table where a lot of data
changed recently - index key distribution will be updated.
Part 8: Working with transactions and locks
[8.1] SQL server transaction basics
- All T-SQL statements are processed as a part of a transaction. Locking is used to ensure that data accessed by the
transaction doesn't change in unwanted way.
- Transactions must exhibit properties that are referred to as "ACID":
- Atomicity - transactions are atomic units of work, either all or no work gets done.
- Consistency - completed transactions leave the database in consistent state - i.e. all data structures at the end of the transaction must be in valid state.
- Isolation - concurrent transactions must be isolated from other transactions. This means that a transaction
either sees data as it was before other transaction modified it, or after another transaction ended. Data is never seen in intermediate state.
- Durability - when transaction is complete its results are permanently stored by the system (i.e. even if system is to shut down immediately).
- Transaction log supports three recovery operations:
- Recovery of individual transaction - executed when application issues "ROLLBACK" or when SQL server detects an error.
- Recovery of all incomplete transactions when SQL server starts - when SQL server is started it automatically runs
recovery operation on every database. Every modification that is in the log but was not written is rolled forward, every
incomplete transaction found in the log is rolled back.
- Rolling of restored database forward to the failure point - after hardware loss you may recover your database to the point
of failure by restoring database backups and follow by restoring sequence of transaction log file backups. All not complete
transactions are rolled back.
- SQL server uses a "write ahead log" style. It ensures that no modifications are written to the data file on disk
before they are written to the log file.
- Pages are written to disk when a checkpoint is encountered. Writing a page to disk is called "page flushing".
Pages that were not yet written to the data file, but modified in server memory, are called "dirty pages". Note
that before a dirty page is flushed to disk, the corresponding log records must be written first.
- The transaction log is a sequence of log records. Each record in the sequence is identified by its "log sequence number
(LSN)". New records are written to the end of the log file with higher LSNs than their predecessors. Either a description of the
logical operation performed is stored, or before-and-after images of the data.
- The following types of transactions are stored in the log file: Every data modification (insert, delete and update), any
modification of a table or index, start and end of every transaction, extent allocation or de-allocation, rollback operations.
- Checkpoint occurs as a result of the following operations: when "ALTER DATABASE" is executed, when
"CHECKPOINT" is executed, when SQL server is stopped or when SQL server generates checkpoint itself (based on number of records in the log).
- Removing old log entries is called "truncating log". Active part of the log can never be truncated, active log
record is determined by "minimum recovery log sequence number MinLSN".
- Each physical log file is divided into a number of virtual log files. Their size and number are dynamically allocated by SQL
server. There can be too many virtual log files in one physical file if the log file has grown to a large size in small increments.
- Shrinking a log file reduces its physical size (truncating only removes unneeded entries). You can shrink logs in
increments of the virtual log file size, provided that the virtual log file doesn't contain any active log entries.
- You shrink the database with "DBCC SHRINKDATABASE", file with "DBCC SHRINKFILE".
- Transaction boundaries can be specified with T-SQL: "BEGIN TRAN[SACTION]", "ROLLBACK TRAN[SACTION] [transaction name]",
"COMMIT TRAN[SACTION]" and others. They can also be specified with an application API.
- "SAVE TRAN[SACTION] {savepoint_name}" is used to create a marker point inside a transaction to which the
transaction can be rolled back. This construct is not supported for distributed transactions.
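A hedged sketch of a savepoint in use; the tables and savepoint name are illustrative:

```sql
BEGIN TRAN OrderEntry
    INSERT INTO dbo.Orders (OrderID, Amount) VALUES (1, 100)
    SAVE TRAN BeforeDetails              -- marker inside the transaction
    INSERT INTO dbo.OrderDetails (OrderID, Line) VALUES (1, 1)
    IF @@ERROR <> 0
        ROLLBACK TRAN BeforeDetails      -- undo only work after the savepoint
COMMIT TRAN OrderEntry                   -- the first insert still commits
```

Rolling back to a savepoint does not end the transaction; a final COMMIT or full ROLLBACK is still required.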
- Certain SQL statements cannot be placed inside a transaction, such as "CREATE|ALTER|DROP DATABASE",
"LOAD DATABASE|TRANSACTION" and others.
- Transactions require a lot of resources - thus programmers should code them efficiently - certain programming practices should be followed.
- Transactions should not be used merely for data browsing (this is why, when you view table data in SQL server, the result set is cleared after a timeout).
- Transactions should be as short as possible and access as little data as needed.
- User transactions should not include code that does unneeded processing, for example, in web development, output
formatting should not be done inside a transaction.
- Explicit transaction is one where both start and end of a transaction are explicitly defined. In early versions of SQL server
these transactions were called "user defined".
- An autocommit transaction is one which starts at the start of every SQL statement and ends with its completion. This is the
default transaction mode of SQL server for both direct T-SQL and API.
- Implicit transaction mode provides continuous transactions - once one transaction ends, another is started. To end a
transaction you need to issue either "COMMIT" or "ROLLBACK" command. You can set this mode using
"SET IMPLICIT_TRANSACTIONS {ON|OFF}".
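Implicit transaction mode can be demonstrated with a short sketch (the table is illustrative):

```sql
SET IMPLICIT_TRANSACTIONS ON
-- The next data-modifying statement implicitly opens a transaction...
UPDATE dbo.Employees SET LastName = 'Jones' WHERE EmployeeID = 1
SELECT @@TRANCOUNT      -- 1: a transaction is now open
-- ...and it stays open until explicitly ended.
COMMIT TRAN
SET IMPLICIT_TRANSACTIONS OFF
```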
- Distributed transactions span at least two servers which are known as resource managers. Transaction has to be managed by
transaction manager that supports "X/Open XA" specifications. Note that transactions that span two or more databases
on the same server are also distributed transactions, but the management is handled internally. With external distributed
transactions the commit procedure has two stages, prepare phase and commit phase.
- When a batch of SQL statements is executed, each is part of a separate transaction. When a run-time error occurs during
execution of one of these SQL statements, it is rolled back. If a compile error occurs (SQL server cannot build an execution
plan) then none of the SQL statements are executed - transaction and T-SQL processing never took place.
[8.2] SQL server locking
- Concurrency control - the ability of a system to prevent modifications made by one person from adversely affecting another person.
- Optimistic concurrency control - users don't lock data when they read it - when a data update is performed the system checks
whether another user changed the data. If it was changed, then an error is thrown and the transaction is rolled back. This is used
in environments with a low number of collisions.
- Pessimistic concurrency control - users lock data they work with so that other users will not be able to modify it. Used in
environments where there is a large number of collisions.
- The following problems exist with concurrency when locking is not implemented:
- Uncommitted dependency (dirty read) - occur when a row that is in the process of being updated is selected by another
transaction. The transaction that is doing the read gets data that is not yet committed by updating transaction.
- Inconsistent analysis (non-repeatable read) - occurs when transaction reads data row more than once and gets different
results every time. Note that the row being read more than once is committed every time by updating (another) transaction -
this is not a dirty read.
- Phantom reads - occur when a transaction reads a set of rows more than once and a row present the first time is missing the
second time due to the actions of another transaction which deleted that particular row. A similar situation can occur with the
delete transaction replaced by an insert transaction.
- Buried or lost updates - occur when at least two transactions select the same row and update it based on the value
originally selected (the transactions don't know about each other). This problem also occurs on file systems when multiple
users have the same file open and both update it - MS SourceSafe and other version control programs were created to tackle this issue.
- Type of locking is specified by the user as part of transaction isolation level.
- Transaction is said to be blocked when it is waiting for a resource that another process has placed a lock on. By default
transactions don't time out, unless "SET LOCK_TIMEOUT {time in ms}" is set.
- By default SQL server manages locks and transactions on per connection basis, i.e. locks cannot be shared among different connections.
- A bound connection enables two or more connections to share the same transaction and locks. This decreases locking issues - improves speed.
- Locks can be applied on small and large chunks of data, such as rows, pages, keys, tables, indexes and databases. These are
determined by SQL server, they can be escalated or de-escalated by the server (rare event).
- There are several types of locking modes: schema, exclusive, update, shared and intent. Lock mode determines level of dependency.
- SQL server has the ability to automatically detect deadlocks and resolve them by terminating one of the transactions that formed the deadlock.
- In order to find out the query that is producing a deadlock you need to turn on trace flag 1204, either when the server starts or
using "DBCC TRACEON(1204)". You may also capture the "deadlock chain" event in SQL profiler.
- SQL server supports distributed transactions - transactions that span multiple servers.
- SQL-92 isolation levels:
- Read uncommitted - allows dirty reads; ensures only that physically corrupt data is not read, nothing more.
- Read committed - SQL server default isolation level, allows for phantom reads and inconsistent analysis.
- Repeatable read - permits phantom reads only.
- Serializable - complete isolation.
- You can customize the following SQL locking features:
- Setting lock timeout duration - note that applications need to trap error 1222, which is raised when a
"LOCK_TIMEOUT" has been set and expires. Occurrence of this error doesn't automatically roll back the transaction. The current
timeout value is contained in the variable "@@LOCK_TIMEOUT" - it returns "-1" if it is not defined.
- Managing deadlocks - cyclic dependencies of two processes on each other can be minimized when objects are accessed in
the same order, transactions adhere to efficiency rules (i.e. they are short), the lowest appropriate isolation level is
used and connections are bound.
- Using table level lock hints - hints can be specified for SELECT, UPDATE, INSERT and DELETE statements. Table level
locks take effect only for the query they are part of. Lock hints are applied as in "WITH ({list of locks, certain
lock types cannot be issued together, for example NOLOCK and ROWLOCK})" right after "FROM {tables}".
- Configuring the granularity of a lock placed on an index - you may disallow certain types of locking such as page
level, row level or both page and row level. SQL server chooses locks automatically, however, if you know how data from
a table is accessed, disallowing certain locks on an index can speed things up.
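The lock timeout and table-hint features above can be sketched like this (table and column names are illustrative, and NOLOCK is shown purely to illustrate hint syntax):

```sql
-- Raise error 1222 instead of blocking forever once a lock wait
-- exceeds two seconds.
SET LOCK_TIMEOUT 2000
SELECT @@LOCK_TIMEOUT          -- 2000; -1 would mean no timeout was set

-- Table-level lock hints apply only to the query they appear in.
SELECT LastName
FROM dbo.Employees WITH (NOLOCK)     -- read without taking shared locks
WHERE Region = 'WE'

UPDATE dbo.Employees WITH (TABLOCKX) -- take an exclusive table-level lock
SET Region = 'EA'
WHERE Region = 'WE'
```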
- Transaction isolation levels are set with the command "SET TRANSACTION ISOLATION LEVEL {level}" and last till end
of current session.
- To determine whether blocking is occurring (or transactions are held too long) you may run the "sp_who" or
"DBCC OPENTRAN" commands every few seconds. Alternatively, you may capture transaction events with SQL profiler.
Part 9: SQL Server Security
[9.1] Security basics
- There are five levels of security (layers):
- Physical security - make sure no one simply steals your server(s) (they are not cheap + data they store is so valuable
it will in many cases make or break your organization).
- Network security - data traveling on the wire between client and server can be encrypted using application layer
security (either SSL or RPC) or network layer with IPSec. You configure application encryption through client network
utility. You may consider changing default port 1433.
- Domain security - login through credentials provided by Windows domain accounts. Note that SQL passwords are case
sensitive if SQL server was installed with case sensitive sort order turned on. Windows authentication allows you to
grant or deny connection rights to individual users or windows groups.
- Local computer security - security offered by OS SQL server was installed on top of.
- SQL server security - divided into four parts, authentication, authorization, auditing and encryption.
- SQL server supports two authentication methods, Windows and SQL server. Authentication mode is configured during server
installation and later from SQL server properties screen accessed from enterprise manager. You can have
"windows only" authentication or "mixed mode" authentication.
- SQL server supports Windows groups and server roles - these group sets of privileges together used in authorization procedure.
- There are two types of SQL server roles, standard and application. Standard roles have privileges that users inherit, can
be nested, can contain windows groups. SQL server includes pre-defined roles. Application roles are used for application
specific permissions (permission is given to an application, not a user) and are protected only by a password, they don't
contain users.
- Only a single application role is permitted per connection. When you switch to an application role via "sp_setapprole"
only the application's permissions are used; user permissions are ignored.
- The "public role" is similar to the Windows "Everyone" group - every database user is automatically a member of it.
- There are three permission states, "Grant", "Revoke" and "Deny". Revoke permission means to remove it. Effective permissions
(sum of all states) are based on processing deny permissions first - they are the most important, followed by grants.
- There are three types of permissions in SQL server, object, statement and implied. Statement permissions refer to keywords
such as "CREATE". Implied permissions are assigned to object owners, fixed server roles and fixed database roles -
they cannot be revoked.
- Event auditing is configured using SQL profiler. Events can be logged to a database or a file.
- Ownership chain exists when a set of nested objects and calling object are owned by the same user and exist in the same
database. Chains allow SQL server to check only permissions for the calling object - speed things up. Ownership chains apply
only to SELECT, INSERT, DELETE and UPDATE statements.
- When a stored procedure runs, any objects created by it are owned by procedure's owner, not by the user who called the
procedure - i.e. procedure is like a program.
- It is a good idea to specify database owner - "dbo" as object owner in order to optimize permission assignment.
The goal is to create ownership chains through consistent object ownership.
- When assigning permissions first look to Windows groups (if the server is part of a domain) and fixed roles, then to standard
roles, and finish with individual user permissions.
- Make use of delegating permission of SQL server - let other users have permission to grant object permissions, ease load
on yourself.
[9.2] How security is implemented and administered
- Authentication can be configured through enterprise manager, security tree. Windows accounts need a path to the account; it
can either be a domain name, a local computer name, or "BUILTIN" for special built-in groups (such as
"Administrators", "Power Users" or "Users"). The format is "{domain|computer name|BUILTIN}\{user name}".
- Windows accounts are granted with "EXEC sp_grantlogin @loginame = '{domain|computer name|BUILTIN}\{user name}'".
Access rights are denied with "sp_denylogin" in place of "sp_grantlogin". To remove a Windows account from
SQL server, use the "sp_revokelogin" stored procedure.
- SQL server logins are added with "sp_addlogin @loginame = '{login name}' [@passwd = '{login password}']
[@defdb='{default database, it is Master if not given}']" Other options are "@deflanguage, @encryptopt
(don't encrypt) @sid (specify SID instead of auto generation)". Note that password is optional.
- SQL server logins cannot be denied, instead you remove them with "sp_droplogin".
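The login management procedures above can be sketched together (domain, user and database names are illustrative):

```sql
-- Windows account management.
EXEC sp_grantlogin  @loginame = 'MYDOMAIN\alice'   -- allow connection
EXEC sp_denylogin   @loginame = 'MYDOMAIN\bob'     -- block connection
EXEC sp_revokelogin @loginame = 'MYDOMAIN\carol'   -- remove the entry

-- SQL Server login: password and default database are optional.
EXEC sp_addlogin @loginame = 'webuser',
                 @passwd   = 's3cret',
                 @defdb    = 'Sales'
-- SQL logins cannot be denied - drop them instead.
EXEC sp_droplogin @loginame = 'webuser'
```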
- After a user is authenticated, he or she needs to be authorized. Users are authorized to use a resource if guests are
permitted to access it, or a Windows group of which they are a member has access, or they have access through a standard (or fixed) role.
- Authorization can be configured through enterprise manager for each database through the users node. You can only work here with
users that have an authenticated SQL server login. Any account added to the database is automatically placed into the
"public" role.
- You use two stored procedures to grant or revoke access to a database to security account - "sp_grantdbaccess
@loginame='{domain\username}' [@name_in_db='{alias name}']" and "sp_revokedbaccess @name_in_db={user name}".
- From users node of enterprise manager you can change permissions that relate to users, role and group by accessing a user
and clicking on permissions button - all objects and user permissions are listed on them.
- Create and backup permissions on per user basis are handled for the whole database - accessed by right clicking on database
and selecting permissions tab from properties window.
- T-SQL permission commands: "GRANT [ALL] {permissions} [(column list)] ON {table|view|stored procedure|function}
TO {security account(s), comma separated} [WITH GRANT OPTION] [AS {group name|role name}]". The
"WITH GRANT OPTION" allows the security account to grant other accounts the permission it has just received.
"ALL" means all valid permissions. The "AS {group name|role name}" clause - if permissions are granted by the authority
of a group or a role, you need to specify the group or role that has authority to execute the grant and of which you are a member. Note
that some permissions can have columns listed for which they are valid, such as "GRANT SELECT (firstName, lastName),
UPDATE (firstName, lastName) ON dbo.employees TO test".
- The "REVOKE [GRANT OPTION FOR] {permissions} ON {objects} FROM {user} [CASCADE]" statement is used to remove (delete)
permissions from a security account. The "CASCADE" option removes permissions from this account and from
any account that this account has granted permissions to. "GRANT OPTION FOR" is used to revoke only the
"WITH GRANT OPTION".
- The "DENY {permissions} TO {user(s)} [CASCADE]" statement is used to deny access to users; the format is similar to both
grant and revoke.
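The three permission statements can be sketched together (the table, columns and account name are illustrative):

```sql
-- Column-level object permission, with the right to pass it on.
GRANT SELECT (FirstName, LastName) ON dbo.Employees
    TO test WITH GRANT OPTION
-- Statement permission.
GRANT CREATE TABLE TO test
-- Remove the permission from test and from everyone test granted it to.
REVOKE SELECT (FirstName, LastName) ON dbo.Employees FROM test CASCADE
-- DENY overrides any grant the account receives through roles or groups.
DENY SELECT ON dbo.Employees TO test
```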
- You can create roles in Enterprise manager or with the "sp_addrole" stored procedure (delete them with
"sp_droprole"). You cannot create a circular dependency between roles (A is a member of B and B
is a member of A). Fixed server roles cannot be members of any other role. You cannot modify fixed server or
database roles.
- Application roles can be added using enterprise manager or with "sp_addapprole @rolename='{role name}'
@password='{pwd}'".
- Database roles are contained in each database (under Roles), while server roles are placed in the Security folder of a server.
- You can add/remove members from roles by changing role properties in Enterprise Manager or with the help of
"sp_addrolemember" and "sp_addsrvrolemember".
- The whole point of roles is to create group-like structures that live inside SQL Server. Roles can be database or server based. They
can have users and other roles as members (but fixed roles cannot be members of other roles). You assign permissions to roles just like you assign them to users (from the
edit screen that appears after you click the permissions button).
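Putting the role procedures together (the role, user, login, and table names below are illustrative):

```sql
-- Create a database role and add a member
EXEC sp_addrole 'Auditors'
EXEC sp_addrolemember 'Auditors', 'test'

-- Grant permissions to the role instead of to individual users
GRANT SELECT ON dbo.employees TO Auditors

-- Fixed server roles use a separate procedure (login first, then role)
EXEC sp_addsrvrolemember 'testlogin', 'securityadmin'
```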
- To find out in SQL Profiler who is executing security statements such as "GRANT", use the "GDR Event" classes (GDR stands for Grant/Deny/Revoke).
Part 10: Performance tuning
[10.1] Performance tuning
- Base-lining is the process of determining the "base" performance of a system against which future performance is going to
be compared. After all, how do we know the system has sped up or slowed down if we don't know how fast it was going in the past?
- Database performance can be monitored with SQL Profiler. Current activity can be seen with Enterprise Manager. Problems in
the past will be present in the SQL Server logs. You can also use the stored procedures "sp_who" and "sp_monitor",
as well as Windows Performance Monitor.
- SQL profiler monitors database and server activity sorting it into events. Events can be saved to trace file (".trc") or to
a trace table. You can rerun saved traces against SQL server.
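A saved trace file can also be loaded back into a table for ad-hoc querying with the built-in "fn_trace_gettable" function (the file path below is a placeholder):

```sql
-- Load a saved Profiler trace file into a table for analysis
SELECT * INTO trace_analysis
FROM ::fn_trace_gettable('C:\traces\mytrace.trc', default)
```

This is useful when you want to aggregate events with ordinary T-SQL rather than scroll through the Profiler GUI.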
- SQL profiler by default audits some events: "Audit Login Event", "Audit Logout Event",
"ExistingConnection", "RPC:Completed", "SQL:BatchCompleted".
- In order to be able to replay activity (provided that no cursors were used) you need "Audit Login Event", "Audit Logout Event",
"ExistingConnection", "RPC:Starting", "SQL:BatchStarting".
- Profiler can be configured to break the trace file into small chunks or to create cyclic (rollover) trace files. SQL Profiler will stop
if free space on the disk drops under 10 MB.
- You can use a trace template to define the criteria for your trace. When a trace is running you can generate a script that
contains the SQL needed to capture the trace (the SQL Profiler itself uses for the trace) - from the Save As menu.
- The Index Tuning Wizard is a tool used to create optimal indexes without the need for a deep understanding of database
internals. This is achieved with the help of a SQL Profiler trace or a SQL script that shows the workload the database is
under. SQL Batch events have to be present for the data to be of any value to the Index Tuning Wizard.
- The Index Tuning Wizard has its limitations; for example, it will not suggest indexes for cross-database data access. For
tables with a primary key index, no clustered index recommendation will be given.
- The Index Tuning Wizard can be started from many applications: SQL Query Analyzer, Enterprise Manager and SQL Profiler.
- You can submit a SQL Query Analyzer selection for analysis if you start the Index Tuning Wizard from SQL Query Analyzer.
Part 11: Miscellaneous points
[11.1] Other interesting points
- Security is broken down into two stages (parts): "authentication" and "authorization".
- SQL server can work in two authentication modes, Windows and mixed mode.
- When installing failover clusters you need to disable NetBIOS.
- "sp_configure" stored procedure is used to configure server wide database settings.
- When you compare a known value against a "NULL" you will get UNKNOWN as the comparison result.
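For example, with the default ANSI_NULLS setting an equality test against NULL never qualifies any rows; IS NULL must be used instead (the "dbo.employees" table and "middleName" column are hypothetical):

```sql
-- Returns no rows: middleName = NULL evaluates to UNKNOWN
SELECT * FROM dbo.employees WHERE middleName = NULL

-- Correct way to test for NULL
SELECT * FROM dbo.employees WHERE middleName IS NULL
```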
- The "COLUMNPROPERTY" function returns information about a certain property, for example precision, of a column. The syntax
is COLUMNPROPERTY(object ID, column name, property name). Note that you can get the object ID from an object name with the help of the
"OBJECT_ID(object name)" function.
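Combining the two functions (the "dbo.employees" table and "salary" column are hypothetical names):

```sql
-- Return the precision of the salary column
SELECT COLUMNPROPERTY(OBJECT_ID('dbo.employees'), 'salary', 'Precision')
```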
- If you want to build a column name inside the SELECT statement use this syntax: SELECT "columnName" =
'text contained in it'. Please note the double quotes enclosing the column name.
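SQL Server 2000 accepts several equivalent ways to name a result column; a quick sketch (the "dbo.employees" table and its columns are hypothetical):

```sql
-- Three equivalent ways to name a result column
SELECT "Full Name" = firstName + ' ' + lastName FROM dbo.employees
SELECT firstName + ' ' + lastName AS "Full Name" FROM dbo.employees
SELECT firstName + ' ' + lastName "Full Name" FROM dbo.employees
```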
- DBCC stands for database console command.
- Filtering out system "server process IDs" ("SPIDs") saves system resources and time when you run a
trace on a busy server.
- RPC (remote procedure call) requests bypass much of the statement parsing and parameter processing in SQL Server
and are faster than using the Transact-SQL EXECUTE statement. They are used by the ODBC driver when a SQL statement passed to it calls a stored procedure.
- Both sample databases, "Pubs" and "Northwind", can be rebuilt using scripts present in the
"install" directory.
- SQL Server 2000 supports only limited encryption (user logins and stored procedure/view/type/rule/function definitions). SQL
Server 2005 supports native encryption that allows users to encrypt table data with ease. In SQL 2000 you need extended
stored procedures to do that.
- A list of useful DBCC commands and built-in stored procedures can be found on this
administrator's quick reference page.