Database

Transactions in Nested SP

34th Friday Fun Session (Part 2) – 22nd Sep 2017

When one (MS SQL Server) Stored Procedure (SP), say, A calls another SP, say, B inside it; we call the SPs are nested. It is not always the case that A calls B. B can be called directly as well. Now if they are required to execute within transaction, in both the cases – when executed individually or nested, we face some interesting situations. Here we discuss how to manage transactions for those cases, when scope is limited within a database.

Requirements

We usually face two kinds of requirements.

  1. Full Commit: either all in both parent (A) and child (B) have to be executed or all should fail.
  2. Partial Commit: if one or more calls to child fail, let them fail. But persist/commit the changes done by parent and the successful calls to child.

Nesting does not need to be limited within two levels (parent and child) as stated here. For example, B can further call another SP C, or A can call any of B or C and so on. The requirements can vary in some other ways as well. However, if we solve the above two cases, the same mechanism/logic can be used/extended to handle all other situations.

Note that, we are also considering the situation that the child, B can be directly called, and in that situation, we are making sure that it is executing within its own transaction.

Full commit

The way we have written the SPs A and B as shown below, A is calling B 6 times. However B fails on the last call. A is inserting data into a table called OuterTable and B is doing the same for InnerTable.

In all or nothing case, we want to make sure, when we execute A, if any call to B fails, no data should be inserted in any of the tables; all should fail. It is obvious that if all the calls pass, then all data changed by them should persist.

It is also required that if B is executed directly, everything happening inside it must happen within a transaction.

Code for SP A:

CREATE Procedure [A]
AS
BEGIN
  BEGIN TRANSACTION;

  BEGIN TRY
    DECLARE @Success AS INT
    DECLARE @Counter AS INT

    SET @Success = 0
    SET @Counter = 0

    INSERT INTO [OuterTable] ([Name]) VALUES(@Counter)

    WHILE < 6
    BEGIN
      EXEC @Success = [B] @Counter;

      IF @Success = 0
        THROW 50001, 'B Failed! Rollback all!', 1

      SET @Counter = @Counter + 1;
    END

    IF @@TRANCOUNT > 0
      COMMIT TRANSACTION;
  END TRY

  BEGIN CATCH
    SET @Success = 0

    IF @@TRANCOUNT > 0
      ROLLBACK TRANSACTION;
  END CATCH;

  RETURN @Success
END

Code for SP B:

CREATE Procedure [B] (@CallCounter INT)
AS
BEGIN
  BEGIN TRANSACTION;

  BEGIN TRY
    DECLARE @Success AS INT
    SET @Success = 0

    IF @CallCounter = 5
      THROW 50001, 'Last call cannot be executed! Rollback all!', 1

    INSERT INTO [InnerTable] ([Name]) VALUES(@Counter)

    SET @Success = 1

    IF @@TRANCOUNT > 0
      COMMIT TRANSACTION;
  END TRY

  BEGIN CATCH
    SET @Success = 0

    IF @@TRANCOUNT > 0
      ROLLBACK TRANSACTION;
  END CATCH;

  RETURN @Success
END

When A gets called it starts a transaction. It then calls B, that succeeds 5 times, and subsequently fails on the last one.

@@TRANCOUNT is important

When B starts, it begins a transaction, unconditionally. Each time A calls B, @@TRANCOUNT gets incremented by 1 and reaches 2. Each time B succeeds, it executes a commit, thus reducing @@TRANCOUNT Back to 1, corresponding to the transaction started by A.

However, the last call to B fails. It then throws an exception, that gets the execution flow jump into the catch block, where it checks @@TRANCOUNT, that is now 2. It then rollbacks the transaction. A rollback discards all transactions in the nested call and resets @@TRANCOUNT. Before commit or rollback, it is a safe practice to make sure that it is still under a transaction, by checking @@TRANCOUNT value that should be greater than 0.

At this situation, B returns 0 to indicate failure (usual practice is to use 0 for indicating success though). A gets to know that B has failed. A throws exception, and ends up in catch block, where it checks @@TRANCOUNT, that is already 0. No further rollback is called and at the end 0 is returned from A to mean that it has failed.

Everything that A and B have done so far are all rolled back. After all, no matter how many transactions are started, a single rollback can discard all. At the end, no data changes happen to either of the tables, where they inserted data.

If B executes directly, depending on the parameter, data would persist or it would fail. However, all inside it would happen within a transaction.

Partial Commit

In this case, we want to persist the changes made by 5 successful calls to B, along with what A has done. The only thing that should get discarded is the failed 6th call to B.

Code for SP A:

CREATE Procedure [A]
AS
BEGIN
  BEGIN TRANSACTION;

  BEGIN TRY
    DECLARE @Success AS INT
    DECLARE @Counter AS INT

    SET @Success = 0
    SET @Counter = 0

    INSERT INTO [OuterTable] ([Name]) VALUES(@Counter)

    WHILE < 6
    BEGIN
      EXEC @Success = [B] @Counter;

      IF @Success = 0
        PRINT 'I know B Failed! But I am just ignoring that!', 1

      IF @Success = 2
        THROW 50001, 'Something terribly wrong! Rollback all!', 1

      SET @Counter = @Counter + 1;
    END

    IF @@TRANCOUNT > 0
      COMMIT TRANSACTION;
  END TRY

  BEGIN CATCH
    SET @Success = 0

    IF @@TRANCOUNT > 0
      ROLLBACK TRANSACTION;
  END CATCH;

  RETURN @Success
END

Code for SP B:

CREATE Procedure [B] (@CallCounter INT)
AS
BEGIN
  DECLARE @ProcedureSave NVARCHAR(MAX)
  DECLARE @TranCounter INT;

  SET @ProcedureSave = 'B-' + CONVERT(CHAR(23), GETDATE(), 121)
  SET @TranCounter = @@TRANCOUNT;

  IF @TranCounter > 0
    SAVE TRANSACTION @ProcedureSave;
  ELSE
    BEGIN TRANSACTION;

  BEGIN TRY
    DECLARE @Success AS INT
    SET @Success = 0

    IF @CallCounter = 5
      THROW 50001, 'Last call cannot be executed! Rollback all!', 1

    INSERT INTO [InnerTable] ([Name]) VALUES(@Counter)

    SET @Success = 1

    IF @TranCounter = 0 AND @@TRANCOUNT > 0
      COMMIT TRANSACTION;
  END TRY

  BEGIN CATCH
    SET @Success = 0

    IF @TranCounter = 0
      ROLLBACK TRANSACTION;
    ELSE
      IF XACT_STATE() <> -1
        ROLLBACK TRANSACTION @ProcedureSave;
      ELSE
        SET @Success = 2
  END CATCH;

  RETURN @Success
END

Savepoint

B starts a transaction, only if it is already not inside one. Meaning, if it is directly called then it starts a transaction of its own. However, if it is called by A, it does not do so. In that case, it creates a savepoint. A savepoint name is created in a way so that for each instance of the SP’s execution, it remains unique. A savepoint is a point where it can come back by executing a partial rollback. In such a situation, all changes between the savepoint and the place where partial rollback is called get discarded.

That is precisely what happens, when B fails and an exception is thrown. In catch block, it checks whether it started the transaction on its own. If that is the case, it simply calls a (full) rollback. However, if it was called by A, that also means it created a savepoint; it would simply execute a partial rollback using the savepoint. A partial rollback only discards the changes done by B itself.

XACT_STATE() is a safe choice

However, prior to executing the partial rollback it should check whether it is safe to do so. If XACT_STATE() returns -1, that means something went wrong, something that should happen rarely. In this situation, a partial rollback is no longer possible – only a full rollback can be done. Special failure code 2 is returned by B to indicate such a colossal failure.

A remains mostly the same. The only thing that gets changed is when it knows B has failed. Earlier, it would have thrown an exception. But this time, if the returned failure code is 0, it simply ignores that B has failed. If the returned failure code is 2, it knows that B fails to execute partial rollback  – something really bad – A executes a (full) rollback.

In the above case, we see that the outer table gets one new row, and the inner table gets 5 new rows; notwithstanding the fact that 6th call to B fails.

When B is called directly – not from A – it manages its own transaction; savepoint is never used.

Advertisements

Executing SP using EF

34th Friday Fun Session (Part 1) – 22nd Sep 2017

Many a times, we use Entity Framework (EF), Microsoft’s recommended data access technology for an application, to execute (MS SQL Server) Stored Procedure (SP), and retrieve the results emitted by them. Here we discuss different kinds of output that a SP can produce and how we can retrieve them using EF.

SP output

A SP typically provides the following kinds of results:

  1. Return code (single integer)
  2. Output parameters (one or more, any data type)
  3. Result set
    1. Single result set
    2. Multiple result set
      1. All having the same schema
      2. Having different schema

Return code

SP can return a single integer return code. Return statement without any value (null) would automatically return 0. It is mostly used to exit execution of a SP when certain condition is met.

CREATE Procedure SpReturnCode
AS
BEGIN
  RETURN 1;
END

Using T-SQL we can execute SP like below.

DECLARE @Success AS INT
EXEC @Success = [SpReturnCode];
PRINT @Success

Output parameter

SP can return one or more values, each having its own data type.

CREATE Procedure SpOutputParameter (@InputValue INT, @OutputValue INT OUTPUT)
AS
BEGIN
  SET @OutputValue = @InputValue + 1;
  RETURN 0;
END

Using T-SQL we can execute SP like below.

DECLARE @ReturnValue INT;
EXECUTE [SpOutputParameter] 2, @ReturnValue OUTPUT;
PRINT @ReturnValue

Single result set

Returns a result set having 0 or more rows of a certain schema. The following SP returns a result set with a single column named, Success.

CREATE Procedure SpSingleResultSet
AS
BEGIN
  SELECT 3 AS Success
  RETURN 0;
END

Multiple result set, same schema

The following SP returns the result set for Employee schema twice.

CREATE Procedure SpMultipleResultSetSameSchema
AS
BEGIN
  SELECT * FROM [Employee]
  SELECT * FROM [Employee] SELECT [EmployeeId] > 10

  RETURN 0;
END

The following SP returns a result set that is not associated with any database entity.

CREATE Procedure SpMultipleResultSetNonDbContextEntity
AS
BEGIN
  DECLARE @Loop AS INT
  SET @Loop = 0

  WHILE @Loop < 10
  BEGIN
    EXEC SpSingleResultSet
    SET @Loop = @Loop + 1
  END

  RETURN 0;
END

Multiple result set, multiple schema

The following SP returns two different result sets: one for Company and another for Employee.

CREATE Procedure SpMultipleResultSetMultipleSchema
AS
BEGIN
  SELECT * FROM [Company]
  SELECT * FROM [Employee]

  RETURN 0;
END

Executing SP using EF

We will use the following different ways in EF to read different kinds of SP output as described earlier:

  1. ExecuteSqlCommand
  2. SqlQuery
  3. CreateCommand/ExecuteReader

ExecuteSqlCommand

This executes a given DDL/DML command. This can be executed when no result set needs to be returned.

Return code

Return code does not require any explicit output parameter to be used in the SP. However, while calling from EF, it should be treated as an output parameter by specifying the direction for the parameter.

SqlParameter returnCode = new SqlParameter("@ReturnCode", SqlDbType.Int);
returnCode.Direction = ParameterDirection.Output;

Db.Database.ExecuteSqlCommand("exec @ReturnCode = [SpReturnCode] ", returnCode);
var returnCodeValue = (int)returnCode.Value;

Output Parameter

For each of the output parameters, we need to declare an output parameter in EF matching the appropriate data type.

SqlParameter inputParam = new SqlParameter("@InputValue", SqlDbType.Int);
inputParam.Direction = ParameterDirection.Input;

SqlParameter outputParam = new SqlParameter("@OutputValue ", SqlDbType.Int);
outputParam.Direction = ParameterDirection.Output;

Db.Database.ExecuteSqlCommand("[SpOutputParameter] @InputValue, @OutputValue OUT", inputParam, outputParam);
var returnValue = (int)outputParam.Value;

SqlQuery

SqlQuery is usually used when SP returns a single result set. However, it can return any data type including primitive types – not necessarily only entity type. If the SP returns multiple result sets, it will only get the first one. However, the complete execution of the entire SP does happen.

public class SuccessSet
{
  public int Success { get; set; }
}

var result = Db.Database.SqlQuery("[SpSingleResultSet]").ToList();

CreateCommand/ExecuteReader

When multiple result sets to be returned this method can be used. We need to use IObjectContextAdapter interface that makes use explicit interface implementation.

Db.Database.Initialize(force: false);
var cmd = Db.Database.Connection.CreateCommand();
cmd.CommandText = "[SpMultipleResultSetSameSchema]";

Db.Database.Connection.Open();

var reader = cmd.ExecuteReader();
var employees =
  ((System.Data.Entity.Infrastructure.IObjectContextAdapter)Db)
    .ObjectContext
    .Translate<Employee>(reader, "Employee", System.Data.Entity.Core.Objects.MergeOption.AppendOnly);

foreach (var employee in employees)
  Console.WriteLine(employee. Name);

While(!reader.NextResult())
{
  employees =
    ((System.Data.Entity.Infrastructure.IObjectContextAdapter)Db)
    .ObjectContext
    .Translate<Employee>(reader, "Employee", System.Data.Entity.Core.Objects.MergeOption.AppendOnly);

  foreach (var employee in employees)
    Console.WriteLine(employee. Name);
}

Db.GetDatabase().Connection.Close();

When result set is not in DbContext, slight modification is required.

var successSets =
  ((System.Data.Entity.Infrastructure.IObjectContextAdapter)Db)
  .ObjectContext
  .Translate<SuccessSet>(reader);

When different schema are used we can still use the same, we just need to use the right object while translating.

var companies =
  ((System.Data.Entity.Infrastructure.IObjectContextAdapter)Db)
  .ObjectContext
  .Translate<Company>(reader, "Company", System.Data.Entity.Core.Objects.MergeOption.AppendOnly);

reader.NextResult();

var employees =
  ((System.Data.Entity.Infrastructure.IObjectContextAdapter)Db)
  .ObjectContext
  .Translate<Employee>(reader, "Employee", System.Data.Entity.Core.Objects.MergeOption.AppendOnly);

 

Incision into Isolation Levels

22nd Friday Fun Session (Part 1) – 16th Jun 2017

We are trying to see how isolation level, serializable to be precise, can help us implementing a synchronization mechanism for web application.

Let us start with ACID

ACID stands for Atomicity, Consistency, Isolation and Durability. It is detailed in ISO standard. Database systems implement this so that a sequence of operations, called as transaction, can be perceived as a single logical operation.

Atomicity

All operations of a transaction are all done or nothing done. Logging with undo capability can be used to achieve this.

Consistency

Given that all database constraints (foreign key, unique etc.) are valid at the beginning, the same should be maintained, at the end of the transaction as well.

Durability

All changes done by a committed transaction must go to storage even if database system crashes in the middle. Logging with redo capability can be used to achieve this.

Why Logging?

We talked about logging and then redo/undo in the previous sections. Why Logging? Well, when some transactions changes data, they are not immediately written to disk. Rather those pages are marked as dirty. Lazy writing flushes them to disk later. Instant writing to disk is expensive. Instead, logging the operation that is directly written to disk immediately, is much cheaper.

However, performance, while important is not a must. Logging is essential to ensure atomicity and durability. Any modification must be written to log before applying to actual database. This is known as write-ahead logging (WAL) This is to make sure that in case of a crash (say, 2 out 5 operations of a transaction are written to database storage and then it crashes), system can come back, read the log and figure out what was supposed to be done and what was not supposed to be done. By redoing and undoing necessary operations, durability and atomicity is ensured.

Focus on the I of ACID

Today we focus on the I of ACID, called isolation. When we are writing a transaction, we write the operations inside it thinking nobody else is doing anything else to the data that we are dealing with. Isolation property defines such an environment and database systems implements that.

So, why do we need such an environment? Well, without this, in a highly concurrent transaction execution environment, our understanding of the data we are working with will not hold true, as other would change them simultaneously. It will happen largely due to three problems: dirty read, non-repeatable read, and phantom read.

However, creating such an isolated environment is expensive in terms of performance. Hence, a number of other isolation levels are introduced, giving various degrees of isolation rather than a complete isolation.

The ISO standard defines the following Isolation levels that we will describe in terms of two transactions T1 and T2 that executes in parallel.

Read Uncommitted

Transaction 1 (T1) updates salary for Joe
Transaction 2 (T2) reads updated salary for Joe
T1 aborts transaction

We see that, T2 read dirty (because T1 did not commit the updated salary) data and went ahead with his decisions/operations inside it based on it, that was of course a wrong thing it did.

As the name implies Read Uncommitted reads uncommitted data, also called dirty data that is wrong. So we see, this isolation level does not guarantee isolation property and it is an example of a weaker isolation level. Note that along with dirty read, it also has non-repeatable read and phantom read problems.

Read Committed

The next better isolation level, as the name Read Committed implies, reads only committed data and solves the dirty read problem encountered previously in Read Uncommitted isolation level. Let us see through an example. Now T1 is running in Read Committed isolation level.

T1 reads the salary of Joe
T2 updates the salary of Joe and commits
T1 reads the salary of Joe

So we see T1 reads the salary of Joe twice, and it is different in the two cases. In the second case, it reads the data that was modified and committed by T2. No more dirty read by T1. Good.

But the isolation property expects each transaction to happen in complete isolation, meaning it would assume it is the only transaction that is taking place now. Joe’s salary was not updated by T1. Then why should T1 see different data when it reads the second time?

So we see, T1 could not repeat a read (the same salary for Joe). Hence, this problem is called non-repeatable read. Read Committed, like Read Uncommitted is another weaker isolation level. Again, note that, along with non-repeatable read it also has the phantom read problem.

Repeatable Read

To solve the non-repeatable read problem, Repeatable Read isolation level comes into picture. Since T1 reads the salary of Joe, no other transaction should be able to modify Joe’s data if we run T1 in Repeatable Read isolation level.

If we repeat the previous transactions we did earlier we would see T2 waits for T1 to finish first. Because T1 would use the right locks on the rows it reads so that others cannot delete/modify it. Repeatable read is solving the non-repeatable read problem as the name implies.

However, that won’t stop new data insertion. After all, Repeatable Read put necessary locks only on the data that it has read, not on future data. Hence, we will see ghost/phantom data. Let’s see an example.

T1 reads 4 rows in employee table
T2 inserts one record in employee table and commits
T1 reads 5 rows in employee table

We see that T1 sees a phantom row (the newly inserted row by T2) in its second read of employee table. Repeatable Read, once again, another weaker isolation level.

Serializable

So far, we see different isolation levels providing different degrees of isolation level but not what I of ACID really defines as isolation. We also know that weaker isolation levels are introduced to avoid the performance penalty that occurs for executing transactions in complete isolation. But at times, it becomes an absolute necessity to execute transaction in full isolation. Serializable comes into picture to implement that complete isolation. In serializable isolation level, it is ensured that we get the effect as if all transactions are happened one after another, in the order they started.

So if we rerun the earlier two transactions, we would see T2 waiting for T1 to complete first. Hence both the reads of T1 would read 4 rows. Only after T1 is done that T2 would insert a new row.

This solves all the three problems: dirty read, non-repeatable read and phantom reads.

At this point, it can be mentioned here that ISO standard expects serializable, not serial. The end result of a serializable execution is to produce a result equivalent to executing them one after another. Serializable does not necessarily executing transaction one after another, just that the end result is the same, had they executed serially.

MS SQL Server implementation

With Serializable, we are done with the 4 ISO transaction isolation levels. MS SQL Server implements all of them. In addition, it implements a fifth one, called Snapshot.

Snapshot

It is an isolation level that solves all the three problems just like serializable. So, why do we have two isolation levels doing the same thing? What special thing snapshot is doing?

If we closely observe the earlier serializable isolation level, implemented using locks, we see that it is too pessimistic. T2 has to wait for T1 to finish. But T2 could be simultaneously executed. After all, T1 is only reading, not modifying any data.

Snapshot comes into picture with optimistic concurrency control. It uses multiversion concurrency control (MVCC) to implement this. Every transaction starts with the latest committed copy it sees and keeps on executing the operations inside it.

So, for the last example we saw, in snapshot, T1 would read 4 rows in both the reads. After all, it had its own private copy. On the other hand T2 would start with its own copy, add a row in the middle. At the end, it would see there was no conflict. This is because no other transaction, T1 in this case, did anything conflicting. So, two transactions are simultaneously executed without violating the requirements of a serializable solation level.

What would happen if both T1 and T2 modify the same data, creating a conflicting situation? Well, both the transaction started with its own copy hoping that at the end there would no conflict, hence it is called optimistic. But if there is a conflict, the one committed first would win. The other would fail and rollback.

By the way, even though it is called serializable, write skew, anomaly is still present and it cannot be called serializable in ISO definition.

Database needs to be configured, which at times takes a while, to use this isolation. Again, since each transaction uses its own private copy, it is resource intensive.

At a glance

Isolation levels

Default transaction isolation level for MS SQL Server

Read Committed is the default isolation level set for MS SQL Server. Keeping performance in mind, it is done this way. So all along if you had thought, by default, you were getting the I of ACID by SQL Server, you are wrong. You are living with non-repeatable read and phantom read unless you have explicitly changed the isolation level or used locks.

How to set isolation level in MS SQL Server?

We can set one isolation level at a time using the following command:

SET TRANSACTION ISOLATION LEVEL 
   { READ UNCOMMITTED 
   | READ COMMITTED 
   | REPEATABLE READ 
   | SNAPSHOT 
   | SERIALIZABLE 
   } 
[ ; ]

As mentioned earlier, to set snapshot isolation level some database specific configuration is required before executing the above command.

How long isolation level remains active?

Once set, it lasts for the session, the duration of which is largely controlled by the component that creates it. When another session starts, it starts with the default Read Committed.

Transaction

It is obvious yet important to remember, isolation levels works on transaction. After all, it is called transaction isolation level. If we want to isolate (using isolation level) the execution of a set operation as a single logical operation, then they have to be wrapped with Begin and Commit transaction.

MS SQL Server data (table) storage

What does MS SQL Server store?

MS SQL Server has data files and (transactional) log files.

How is the log file organized?

Log files contain a series of log records, not divided by pages.

How is the data file organized?

Data files are grouped by extent and each extent by pages. Every page belongs to an extent.

What is page and extent?

Each page is 8 KB (128 pages/MB) and 8 contiguous pages make an extent of 64KB (16 extent/MB).

Disk I/O is performed at page level. That means, the minimum SQL server would read/write at one go is a page ~ 8KB.

MS SQL Server can also read a whole extent at one go instead of reading a single page when it makes sense. Look out for partition alignment so that reading one extent does not end up touching two disk blocks, affecting performance.

What are the different types of pages and what do they store?

Each page starts with a 96-byte header that stores information about the page including page number, page type etc.

Page types include Data, Index, Text/Image, Page Free Space (PFS, information about page allocation/free pages), Global Allocation Map (GAM, extent allocation information), Shared Global Allocation Map (SGAM, mixed extent with at least one unused page information), Bulk Changed Map (information about extents that got changed by a bulk logged operation since last BACKUP LOG statement) Differential Changed Map (information about extents that got changed since last BACKUP DATABASE statement), Index allocation map (IAM, page/extent allocation for an object, say a heap of a table).

So basically, pages store data and metadata for the MS SQL Server to make efficient use of them for various I/O (related) operations.

What are the different types of extents?

There are two types of extents: uniform and mixed extent. When a table is created instead of allocating a whole extent (8 pages), a single page (a page from a mixed extent type) is allocated. This goes on until the table crosses 8-page size. Onwards, every time a whole extent (uniform extent type) is allocated.

IAM pages (come from a mixed extent) keep track of these allocations.

Where does my table data go?

First of all, a table can have partitions. Let’s say we have not partitioned a table. Then we have just one partition. By the way, partition resides in a single file group.

Table can have a clustered index or not. If clustered index then the data organization type is B-tree else Heap (no logical order of data). It can be mentioned here that nonclustered index has same b-tree structure with the difference that it does not have data pages. Rather, it has index pages containing index rows, each having a row locator pointing to a data row in the clustered index or heap. Heap or b-tree of a partition can have up to 3 allocations units.

What is an allocation unit?

Allocation unit is a collection of pages within a heap or B-tree. We have three different types of allocation units based on the kind of data they would store. They are:

  1. IN_ROW_DATA: for data (data page) or index (index page) rows, can save all but LOB.
  1. LOB_DATA: for text, ntext, image, xml, varchar (max), nvarchar (max), varbinary (max) data types. Uses Text/Image pages.
  1. ROW_OVERFLOW_DATA: for variable length data stored in varchar, nvarchar, varbinary, or sql_variant columns that exceed the 8060 byte row size limit. Uses Text/Image pages.

How are the data rows saved in a data page?

We are talking about data pages (used by both heap and b-tree to store data rows and not index rows) in IN_ROW_DATA allocation unit. They are saved serially starting from the end of 96-byte header, not necessarily in the logical order as dictated by a clustered index, if any. It is the row offset table that maintains the logical order.

What is row offset table?

For each data row there is a 2-byte entry, saving the offset. It tells from where the data row starts. The entry for the first data row (logical, not physical) is there at the very end of the page. Suppose the data page contains 20 rows. Then there will be 20 entries and these entries make a row offset table. The first (0th) entry would say where the first logical data row would start. The physical position of this might be somewhere in the middle of the page and not necessarily at the beginning meaning just after the header.

What is the maximum data size in a data page?

Maximum data and overhead size in a data page is 8060 bytes.

How to save a row bigger than 8060 bytes?

A table can contain maximum 8060 bytes per row. Unless it has a variable type column defined. It would then use a 24-byte pointer in data page to point a Text/Image page in ROW_OVERFLOW_DATA allocation unit where they would move the data, starting with the largest width column. Data of a single variable column must not exceed 8000 bytes but combined width can exceed 8060 bytes. However, if a row is less than 8060 bytes and there is no space in the page then page split – not a row-overflow will occur.

A variable column value must be completely in-row or completely out-row. Multiple such columns can spread over several pages in ROW_OVERFLOW_DATA allocation unit. Such Text/Image pages are managed by a chain of IAM pages. Note that, the index key of a clustered index cannot contain varchar columns that have existing data in the ROW_OVERFLOW_DATA allocation unit.

How to save single column data bigger than 8000 bytes?

LOB column can store up to 2 GB data. Whether at attempt to save the data in-row would take place depends on the settings. If saved out-row, a 16-byte pointer from data page for each LOB column would point to the initial Text/Image page in LOB_DATA allocation unit. A page in LOB_DATA can store multiple columns from multiple rows but for a single table. Text/Image pages storing LOB data are managed by a chain of IAM pages.

MS SQL Server nvarchar issues

Do you need to store non-English characters, like Chinese characters?

Then you need UNICODE characters. nvarchar, ntext, nchar are your options. n is for ISO synonym national. Uses UNICODE UCS-2 character set, 2 bytes for a character, double than that of char/varchar/text. This is a price you need to pay for multi-ligual support.

nchar or nvarchar/ntext?

If you know the data length is always going to be around a fixed size, say, 20 characters then nchar, else nvarchar/ntext.

nvarchar ($fixed size$) or nvarchar (max)/ntext?

If you know your data is always going to be somewhere between a fixed range, say, 1 to 100 characters and would never exceed 100 then nvarchar (100). It makes sense, looks good. Use fixed length nvarchar as long the size you know would be less than 8000 bytes. More than 8K means the data would be saved out-row. Fixed length nvarchar can’t help.

If your data size can be NULL to 2 GB and anything in between then you have no option but nvarchar (max)/ntext.

nvarchar (max) or ntext?

Go for nvarchar (max), not ntext. ntext will not be supported in future and nvarchar (max) better in performance than ntext.

Is nvarchar (max) bad for performance?

SELECT 
[name] AS tablename, 
large_value_types_out_of_row 
FROM sys.tables

If large_value_types_out_of_row for the table that contains your nvarchar (max) column is 0 (=OFF) then MS SQL Server would try to save the data in-row, given that the row fits in 8K page size. If the row does not fit in 8K then the nvarchar (max) data would go out-row, one or more extra pages depending on the nvarchar (max) column data size.

If MS SQL Server has to read extra pages for a row, performance would suffer. But what to do? The data is big, right?

If large_value_types_out_of_row is 1 (=ON), then no matter how small is your nvarchar (max) column data, it will always be saved out-row.

In-row or out-row?

If the select statements you use for the table excludes the nvarchar (max) columns then use out-row. Because when you will be reading the non-nvarchar (max) columns, you are not reading the other pages used to store long nvarchar (max). Not only that, since all those big data are not in the same page, your single page can now fit more rows with the small data. You will be doing good here.

But if you have to read the whole row (all columns) at one go, then in-row good.

How to change in-row to out-row and vice versa?

You can change anytime.

sp_tableoption N'MyTableName', 'large value types out of row', 'ON'

However, doing so won’t immediately change the existing data layout. New inserted data would follow the new way. Existing data, if updated would be saved in the new layout. Well, you can always read the existing data, change it and then update it again with the old value. That way, you save all your data in the new layout.

What about ntext?

Well, we already decided not to use next. But since we are discussing this let’s see the respective issues.

This is how you can know the existing layout.

SELECT 
[name] AS tablename, 
text_in_row_limit 
FROM sys.tables

text_in_row_limit = 0 says, the option is not set. If non-zero, say, 256 then that is the number of bytes that will be saved in-row.

This is how to change it.

sp_tableoption N'MyTableName', 'text in row', ON

text_in_row_limit would be 256.

sp_tableoption N'MyTableName', 'text in row', '1000'

If you want that limit to be 1000 bytes. Yes, setting itself won’t change the existing data layout just like nvarchar.

How full are the pages?

Remember, you pages should be full as much as possible. Otherwise you will be reading a page but get very small data. You will be doing bad.

SELECT 
page_count, 
avg_page_space_used_in_percent 
FROM sys.dm_db_index_physical_stats(DB_ID(N'MyDatabaseName'), 
                                    OBJECT_ID(N'MyTableName'), 
                                    NULL, NULL, 'DETAILED')

But ignore small tables having less than 1000 pages (page_count).

Are the rows fitting in 8K?

SELECT 
OBJECT_NAME(object_id) as TableName, 
OBJECT_NAME(index_id) as IndexName, 
min_record_size_in_bytes, 
max_record_size_in_bytes, 
avg_record_size_in_bytes 
FROM sys.dm_db_index_physical_stats(DB_ID(N'MyDatabaseName'), 
                                    OBJECT_ID(N'MyTableName'), 
                                    NULL, NULL, 'DETAILED')

It is better to design tables in way so that data of n (n >= 1) rows fit in 8K, a page.

Is there any other limitation for nvarchar (max)?

You cannot create an index on nvarchar (max) column.

Also avoid using nvarchar (max) column in where clause.

What are the issues for migrating varchar to nvarchar?

Your migrated data is going to be double. Some rows might no longer fit in-row. They may over flow to out-row. Performance might be affected.

MS SQL Server recovery model

Recovery model

A recovery model is a database property that controls how transactions are logged. Backup and restore operations occur within the context of the recovery model of the database.

Why log transactions?

To guarantee ACID properties over crashes or hardware failures every action is logged (to disk). In case of system failure, it would read the log and redo/undo the logged actions to bring the database to a consistent state.

Where it is logged?

MS SQL Server maintains mainly two kinds of files. Data file and log file. Log file can map to one or more physical files. Also note that transactional log is different from other logs like setup log, error log, profiler log, agent log, event log etc.

Types of recovery model

Depending on how extensively MS SQL Server logs there are 3 types of recovery models. They are:

  1. Simple recovery model
  2. Full recovery model
  3. Bulked logged recovery model

Simple recovery model

Suppose every day you take a backup of the database. If there is a failure, you at most lose the data for one day and you can afford that. In that case, you can choose simple recovery model. Here, every transaction is logged. However, as soon as those transactions are written to disk (checkpoint), the log is discarded.

Here you don’t have the provision to take back up for log. You only backup data. And as said before, in case of failure, you can recover the data that you backed up last time. You loose everything else, done since the data backup.

You get rid of the administrative task of taking the log backup. You get rid of running into the risk of managing very large log files.

However you lose the following features:

Log shipping – This is a disaster recovery technique where the transactional log is copied from primary (production) database on a primary server (the production MS SQL Server instance) to one or more secondary database (warm backup of primary database) on separate secondary server instance (where warm standby copy/copies is/are stored). This feature works with the help of full recovery model that will be discussed a bit later.

AlwaysOn or Database mirroring – These features support/create fail-over environment using redundancy. If the primary database fails, the secondary (not backups) database takes over. These features also work with the help full recovery model.

Media recovery without data loss – If media (tape/disk) fails recovery without data loss not possible. It needs full recovery model.

Point-in-time restores – You cannot restore the database to any arbitrary point defined by a date and time. For this to work, you need full recover model.

Full recovery model

By this time, it is clear that for the features above to work you need full recovery model. Well, it will log all the transactions from the beginning even if those transactions are written from memory to disk. So it is obvious that the log size will be too big. It means there is provision to take log backups. And that you have to take log backup regularly else you might run out of production server disk space. Yes, if you backup log then the backed-up up log will be removed from the database log file.

Armed with full recovery model (with the pain of administrating transactional log) you won’t lose any data due to a lost or damaged data file. You can restore database to any point of time.

Unless the tail of transactional log file is damaged all is good. Else you lose the data since last time you took a log backup.

Bulk recovery model

Same as full recovery model just that for the bulk operations (to be precise, a set of operations those can be minimally logged) it does not log everything. This model allows bulk copy to work with high performance by not logging all transactions.

Point-in-time recovery not supported. Recovery possible to the end of any (data and log) backup.

If log is damaged or bulk logged operations occurred since the most recent log backup, changes since the last backup will be lost. Else no data is lost.

Why log size too big?

This is a typical problem people face.

You can check log_reuse_wait_desc column in sys.databases. It will tell you why?

However, most likely the database is using the default full recovery model. You are unaware of it: especially, if the database is created by SharePoint. And you forgot to take regular log backup.

Reference

FILESTREAM – considerations, restrictions and limitations

There are considerations, restrictions and limitations to use FILESTREAM. They are listed below:

Works as it is or with some considerations

  1. SQL Server Integration Service (SSIS) – uses DT_IMAGE SSIS, can use import and export column transformation
  2. Log Shipping – supports – both primary and secondary servers must run MS SQL Server 2008 or above
  3. SQL Server Express – supports
  4. Full Text Indexing – works if there is an additional column in the table with filename extension for each BLOB
  5. Failover clustering – FILESTREAM filegroup must be on a shared disk, enable FILESTREAM on each node hosting FILESTREAM instance

Limited work

  1. Replication – use NEWSEQUENTIALID(), Merge Replication synchronization over https using Web Synchronization has 50 MB limit
  2. Distributed Queries and Linked Server – can not use PathName() for four part name
  3. Contained Database – requires extra configuration outside database, hence not fully contained

Does not work

  1. Transparent Data Encryption (TDE) – does not work
  2. Database Snapshots – does not work with FILESTREAM filegroups
  3. Database Mirroring – does not support

Reference

Prev

FILESTREAM – setup

To use FILESTREAM feature both Windows administrator and SQL Server administrator have to enable it. While enabling from Windows there are three options to choose from:

  1. Enable FILESTREAM for Transact-SQL Access check box.
  2. Enable FILESTREAM for file I/O streaming access. This is to read and write FILESTREAM data from Windows. Enter the name of the Windows share in the Windows Share Name box.
  3. Select Allow remote clients to have streaming access to FILESTREAM data. This is to let remote clients access the FILESTREAM data stored on this share.

For SQL Server instance there are three options to choose from:

  1. 0 – Disable FILESTREAM support for this instance
  2. 1 – enable FILESTREAM for Transact-SQL access only
  3. 2 – enable FILESTREAM for Transact-SQL and Win32 streaming access

Note that FILESTREAM would perform worse for 1MB+ size files for full read/write if Transact-SQL is chosen (and Win32 streaming is not chosen) than no FILESTREAM BLOB.

Now that FILESTREAM is enabled, we can go ahead to do the rest of the setups.

First, create a file group to keep the FILESTREAM data. There is a 1:1 mapping between data containers and file groups.

       ALTER DATABASE MyDatabase ADD
       FILEGROUP MyFileStreamGroup1 CONTAINS FILESTREAM;

 Second, define data container. In the below example, “D:\Production” must exist already. “D:\Production\FileStreamDataContainerRoot” is going to be the root of the data container.

       ALTER DATABASE MyDatabase ADD FILE (
       NAME = FileStreamGroup1File,
       FILENAME = 'D:\Production\FileStreamDataContainerRoot')
       TO FILEGROUP MyFileStreamGroup1 ;

Third, create a table with a FILESTREAM column. All FILESTREAM columns of a table must go to a single FILESTREAM group that can be specified, defaulting to default FILESTREAM filegroup. Table must have a column of the non-null uniqueidentifier data type with ROWGUIDCOL attribute and either a UNIQUE or PRIMARY KEY single-column constraint. FILESTREAM data container will have a directory for a table with subdirectory corresponding to the FILESTREAM column of the table.

       USE MyDatabase ;
       GO

       CREATE TABLE TableWithFileStreamColumn (
       DocumentID INT IDENTITY PRIMARY KEY,
       Document VARBINARY (MAX) FILESTREAM NULL,
       DocGUID UNIQUEIDENTIFIER NOT NULL ROWGUIDCOL UNIQUE DEFAULT NEWID ())
       FILESTREAM_ON MyFileStreamGroup1;

Now it is time to use FILESTREAM using standard Transact-SQL or Win32 APIs.

Reference

Prev          Next

FILESTREAM – hardware/OS consideration

There should be some considerations regarding hardware selection/configuration and OS configuration like:

  • Hard disk – SCSI is better for reliability and performance
  • RAID level – RAID 10 is recommended even though at least 50% space is redundant depending on configuration
  • Storage layout – all data in one volume may not be the right choice if load is high
  • Cluster size – 64KB and above to reduce fragmentation, also defragment periodically
  • Compression – cannot be enabled if cluster size is 4KB or less, may not be effective for all kinds of data, comes at the expense of CPU
  • Anti-virus configuration – AV can need to be setup properly so that it does not delete FILESTREAM files
  • Disable 8.3 naming for NTFS – if number of files are high then NTFS has to work harder to get a unique name
  • Turn off last access time update – since files are not accessed by explorer rather SQL Server turning off this feature is OK and it saves a lot of time
  • Security – SQL server need to be configured to use integrated security (Win32 file APIs to work that uses Windows token associated with the client’s thread accessing files)

Reference

Prev          Next

FILESTREAM – what and when

If you have to store files (BLOB) in MS SQL Server database, you might consider using FILESTREAM (started from MS SQL Server 2008). This is essentially enabling FILESTREAM attribute of varbinary (max) type column. Some other setups are necessary prior to that.

FILESTREAM combines MS SQL Server with NTFS file system. Database is good at reading/writing/processing small pieces (rows and columns) of information scattered around in the database files while file system is good at reading/writing files as a whole. File system start performing better when the files are larger than 1 MB.

By combining the two you can get the best of the both worlds, like getting past the low max size limit of database BLOB and use max NTFS size, good streaming performance of file system, integration with structured data (transactional consistency), better fragmentation management, manageability (backups etc can be done together) etc.

However if you are doing frequent small updates FILESTREAM performs poor. This is because here it has to work more – guarantee transactional consistency as well write to file system that essentially creates/writes a new copy and delete the old one. It also takes more time to take backup.

FILESTREAM data can be accessed in two ways: standard Transact-SQL and by using Win32 APIs. If you want the best performance of File System you need to use Win32 APIs. There are provisions for the Win32 APIs to work within the context of database transaction.

FILESTREAM BLOBS use file system cache and does not use database buffer pool. This improves query processing and overall database performance.

However do note that certain features like Transparent Data Encryption (TDE) are not supported with FILESTREAM.

To summarize, if you are to save files 1 MB or larger in database that will be read a lot with few small updates and that a middle tier is available where you can use Win32 APIs for accessing them then FILESTREAM is a good choice.

Reference

Next