SQL SERVER – Beginning Reading Transaction Log File

In this article we will examine what the SQL Server transaction log is, why we need one, and how it works.

Transaction Log Overview

The SQL Server transaction log is perhaps one of the least understood parts of the SQL Server architecture. At the same time, it is one of the most essential components of every database.

Every SQL Server database has a transaction log that records all transactions and the database modifications made by each transaction. In case of a system failure, the transaction log may be required to bring your database back to a consistent state. That is why the transaction log should never be deleted or moved.

Transaction Log Purpose

Let’s have a look at ACID. Atomicity, Consistency, Isolation, and Durability are a set of properties that guarantee that database transactions are processed reliably. The transaction log is what gives us durability: it guarantees that committed transactions survive permanently. Durability is achieved by flushing the transaction’s log records to non-volatile storage before acknowledging the commit. In other words, the transaction must be stored on disk, not just in memory. Otherwise, if the database server lost power while your changes existed only in memory, you could find after the restart that your changes were never saved. Thus, all changes must be written to disk before a successful commit can be returned to the application.
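As a minimal illustration (dbo.Person here is the sample table we create later in this article), the log records describing a change must be hardened before the commit is acknowledged:

BEGIN TRANSACTION
UPDATE dbo.Person SET LastName = 'Dave' WHERE ID = 1
-- The log records describing this update must reach non-volatile
-- storage before the COMMIT below can return success to the client.
COMMIT TRANSACTION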

SQL Server stores data rows on 8 KB pages that can be located anywhere in the data file. A page can hold a large number of rows (on the order of 400 for small rows). The page is the smallest unit of reading and writing, so if you modify one row on a page, the entire 8 KB page is written back to disk.
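As a rough sketch of how to check this for a particular table (dbo.Person is the sample table we create later; sys.dm_db_index_physical_stats is a documented DMV):

SELECT index_type_desc, page_count, record_count, avg_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(
    DB_ID(), OBJECT_ID('dbo.Person'), NULL, NULL, 'DETAILED')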

Let’s consider the situation when you update 100 rows in a single transaction. Some of these rows could be spread out across the data file.

You would need to drag the disk head to each position, write the updated page to disk, and move on to the next location, waiting for the disk to spin and the disk head to move each time.

The transaction log, by contrast, is a sequential file. Rather than making a randomly placed write for each modified page, SQL Server appends log records to the end of the log, and it can write in units as small as one disk sector. A disk sector is a physical sector on a drive, and on most drives a sector is 512 bytes.

Based on this, we can conclude that SQL Server can harden a single tiny write transaction with as little as a 512-byte write to the log file.

And because the transaction log is sequential, you don’t need to wait for the disk head to move to the next position before writing the change: the next position is already right under the disk head. This means the changes can be written to disk very quickly.

While the changes are being written to the transaction log, the in-memory version of the data page is also updated. Later, the checkpoint process (and, under memory pressure, the lazy writer) writes these dirty data pages to disk. If there is a power failure, the data pages in memory are lost, but we still have a durable record of every change that occurred in the database.
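You can also trigger this flush on demand; CHECKPOINT is a standard T-SQL command that writes all dirty pages of the current database to disk:

-- Force all dirty pages of the current database to disk.
CHECKPOINT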

Reading Transaction Log File

Let’s examine how SQL Server logs transactions and what information is stored in the transaction log file.

To read the transaction log file we will use the fn_dblog function, one of several undocumented SQL Server functions.

Every record in the SQL Server transaction log is uniquely identified by a log sequence number (LSN). LSNs are strictly ordered: if LSN2 > LSN1, then the change described by the log record at LSN2 occurred after the change described by the log record at LSN1.

fn_dblog accepts two parameters:

  • The first is the starting log sequence number (LSN). If you specify NULL, the function returns everything from the start of the log.
  • The second is the ending LSN. If you specify NULL, the function returns everything up to the end of the log (see the minimal call below).
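This minimal call returns every available log record for the current database:

SELECT * FROM fn_dblog(NULL, NULL)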

I will use dbForge Studio for SQL Server to run SQL code.

For demonstration purposes let’s create a test database and a table.

The IDE allows you to create a database in a few clicks. Once you establish a connection to SQL Server, right-click the connection, and then click New Database. The Database Designer will appear. In the designer window, enter the database name (I will use ViewTransactionLog) and click the Update Database button. That is all. To create a table, expand the database node, right-click the Tables folder, and then click New Table.

The table will contain three columns: ID, FirstName, and LastName.
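If you prefer plain T-SQL to the visual designers, an equivalent script might look like this (the table name Person and the column types are my assumptions; the text above only names the columns):

CREATE DATABASE ViewTransactionLog
GO
USE ViewTransactionLog
GO
CREATE TABLE Person
(
    ID INT,
    FirstName VARCHAR(50),
    LastName VARCHAR(50)
)
GO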

Now we can use the SQL script editor to execute the fn_dblog function.

Running fn_dblog at this point returns 461 rows, generated just by creating the database and the blank table. To see the transaction log data:
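A query along these lines selects the most informative columns (fn_dblog is undocumented, so the exact column list may vary between versions):

SELECT [Current LSN], [Transaction ID], [Transaction Name], Operation, Context, [Begin Time]
FROM fn_dblog(NULL, NULL)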

Note that the Transaction Name column contains the database name. This value appears only in the row that starts a transaction, where the Operation column shows “LOP_BEGIN_XACT” (begin transaction). The Operation column shows which operation SQL Server performed: insert, update, delete, shrink, lock, page allocation, and so on. The Transaction ID is the same for all parts of a transaction.

Now let’s run several DML commands to see how they affect the transaction log file.
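For example (the exact statements below are my own; any small batch of DML will do):

INSERT INTO Person (ID, FirstName, LastName)
VALUES (1, 'Pinal', 'Dave')
GO
UPDATE Person SET LastName = 'Kumar' WHERE ID = 1
GO
DELETE FROM Person WHERE ID = 1
GO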

Let’s have a look at the transaction log.

We can read the first row in the log as:

  • an INSERT statement with transaction ID 0000:00000a80
  • that started at 2014/10/10 10:44:52:273
  • and inserted a row into the heap table ‘dbo.Person’ on page ID 0000011d:00000140
  • and finished at 2014/10/10 10:44:52:273

Conclusion

As you can see, the transaction log is a list of changes that have occurred in a database. The transaction log also gives us the ability to replay any operation that occurred in the database.

If you take a full database backup and then keep copies of the transaction log, you will be able to replay every operation and bring the database back to a consistent state as of any point in time.

For instance, if you accidentally dropped a table, you can bring the database back to the point just before the drop, either by time or by a specific LSN, and recover the data.
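A hedged sketch of such a point-in-time restore (the backup paths and the stop time are placeholders):

-- Restore the last full backup without recovering the database yet.
RESTORE DATABASE ViewTransactionLog
FROM DISK = N'C:\Backup\ViewTransactionLog_full.bak'
WITH NORECOVERY
GO
-- Replay the log, stopping just before the accidental DROP.
-- To stop at a specific LSN instead, use STOPBEFOREMARK = 'lsn:<lsn>'.
RESTORE LOG ViewTransactionLog
FROM DISK = N'C:\Backup\ViewTransactionLog_log.trn'
WITH STOPAT = '2014-10-10 10:44:00', RECOVERY
GO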

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL Authority News – Save USD 600 for SQL Live! 360 in November

I love talking about performance tuning, and I love teaching it. This is one subject everyone likes to discuss, as it never gets old. I live in India, so I often deliver conference sessions on this subject here. However, if you are in the USA, this is your opportunity to join me to discuss SQL Server performance.

I’ll be speaking at Live! 360, November 17 – 21 in Orlando, FL. Live! 360 brings together five conferences, and the brightest minds in IT and Dev, to explore leading edge technologies and conquer current ones. More info: http://bit.ly/LSPK22HOME

As a speaker, I can extend $600 savings on the 5-day package. Register here: http://bit.ly/LSPK22REG and use code LSPK22.

I’ll be presenting the following sessions:

Secrets of SQL Server: Database Worst Practices
Abstract: Chances are you have heard, or even uttered, this expression. This demo-oriented session will show many examples where database professionals were dumbfounded by their own mistakes, and it may even bring back memories of your own early DBA days. The goal of this session is to expose the small details that can be dangerous to the production environment and SQL Server as a whole, as well as to discuss worst practices and how to avoid them. Shedding light on some of these perils and the tricks to avoid them may even save your current job. After attending this session, developers will only need 60 seconds to improve the performance of their database server in their SharePoint implementation. We will have a quiz during the session to keep the conversation alive. Developers will walk out with scripts and knowledge that can be applied to their servers immediately after the session. Additionally, all attendees of the session will have access to the learning material presented in the session.

Workshop: Performance Tuning Best Practices – Every DBA and Developer MUST Know
In this session we will try to bridge the gap between DBA and Developer. We will approach performance as a whole subject rather than dividing it into DBA and Developer problems.

Performance tuning is a tricky subject, but not as difficult as we think. It is very common in this industry for the DBA to believe the performance issue is due to bad code, and for the Developer to believe it is due to a bad deployment of the database server.

We will take a very different approach to solving SQL Server performance problems, with stories drawn from real-world scenarios and a practical approach to solving performance problems.

We will primarily discuss:

  • Best Deployment Practices
  • Indexes, Best Practices and Maintenance
  • Identifying slow running queries and Tuning
  • Identifying resource bottleneck with the help of SQL Wait Stats
  • Optimizing Resources – CPU, IO, Memory
  • Best Practices to Resolve ad-hoc issues

You will learn:

  • Performance Tuning
  • Best Practices
  • Index and Query Tuning

SPECIAL OFFER:

As a speaker, I can extend $600 savings on the 5-day package. Register here: http://bit.ly/LSPK22REG and use code LSPK22.

Get better at what’s available, and get ready for what’s coming. We delve deep into the topics that matter most in your world today, covering practical knowledge, fundamentals, new technologies, and futures. Register now: http://bit.ly/LSPK22REG

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – What is the Difference Between An INNER JOIN and WHERE Clause

Just received a question in email:

“What is the Difference Between An INNER JOIN and WHERE Clause?”

Well, the answer could be quite long, but I will try to answer in a few simple words.

If you are talking about the resultset, there is not much difference. If you are talking about performance, the SQL Server engine is smart enough to automatically rewrite your query in most cases, so you will see no difference in performance either. That said, I still prefer that you use INNER JOIN when a query involves more than one table, as that is the ANSI-standard syntax.

If you are ever going to use an OUTER JOIN, the question of JOIN vs. WHERE does not make sense at all, as the answers can be completely different. Here is a quick example.

Let us create two tables and select the sample data from them.

USE TempDB
GO
-- Creating Sales Rep Table
CREATE TABLE SalesRep (ID INT, SalesRep VARCHAR(10))
GO
INSERT INTO SalesRep (ID, SalesRep)
SELECT 1, 'Pinal'
UNION ALL
SELECT 2, 'Vinod'
UNION ALL
SELECT 3, 'Balmukund'
UNION ALL
SELECT 4, 'Chandra'
UNION ALL
SELECT 5, 'DJ'
GO
-- Creating Sales Orders Table
CREATE TABLE Orders (ID INT, Amount INT)
GO
INSERT INTO Orders (ID, Amount)
SELECT 1, 100
UNION ALL
SELECT 1, 200
UNION ALL
SELECT 2, 500
UNION ALL
SELECT 2, 300
UNION ALL
SELECT 3, 600
GO
-- Select Data
SELECT *
FROM SalesRep
GO
SELECT *
FROM Orders
GO

Here is the resultset of the data selected from the above two tables.

Now let us see a quick example of the INNER JOIN and WHERE condition.

-- JOIN
SELECT s.SalesRep, o.Amount
FROM SalesRep s
INNER JOIN Orders o ON s.id = o.id
GO
-- WHERE
SELECT s.SalesRep, o.Amount
FROM SalesRep s, Orders o
WHERE s.id = o.id
GO

You will notice from the resultsets that both queries return exactly the same result. As I mentioned earlier, when an INNER JOIN and a WHERE clause express the same join condition, there is no impact on the resultset.

Let us see a quick example where an OUTER JOIN gives completely different results from the WHERE clause, because an outer join expresses totally different business logic. As I mentioned above, comparing an outer join with a WHERE clause is like comparing apples and oranges; they are not logically the same. In the example below, the LEFT OUTER JOIN also returns the sales reps with no orders (Chandra and DJ) with NULL amounts, while the WHERE version drops those rows.

-- JOIN
SELECT s.SalesRep, o.Amount
FROM SalesRep s
LEFT OUTER JOIN Orders o ON s.id = o.id
GO
-- WHERE
SELECT s.SalesRep, o.Amount
FROM SalesRep s, Orders o
WHERE s.id = o.id
GO

Here is the script to clean up the tables created.

-- Clean up
DROP TABLE SalesRep
DROP TABLE Orders
GO

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – SQL Server 2008 Service Pack 4 – Download

It has been a long time since SQL Server 2008 received a Service Pack update. Microsoft has finally released SQL Server 2008 Service Pack 4 and its feature pack. SQL Server 2008 SP4 contains fixes for reported issues, as well as the hotfix solutions provided since SQL Server 2008 Service Pack 2, up to and including Cumulative Update 13.

I personally switched to SQL Server 2014 a few months ago and I am happy with its performance and robust behavior. Many customers and clients are still using SQL Server 2012. However, if you are using SQL Server 2008, I suggest that you look at upgrading to the latest version of SQL Server, or at least update your installation with the latest service pack.

You can download SQL Server 2008 Service Pack 4 from the following link:

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Add Auto Incremental Identity Column to Table After Creating Table

Question: Is it possible to add an auto-incremental identity column to a table in SQL Server after creating the table?

Answer: There are two answers – No and Yes. Let us see them one by one.

Answer No – If you have an existing integer column in your table and you want to convert that column to an identity column, it is not possible in SQL Server. You will have to add a new column.

Answer Yes – If you want to add a new identity column to the table, it is totally possible with the help of the following script.

ALTER TABLE YourTable ADD IDCol INT IDENTITY(1,1)

If you want to replace your old column with an identity column, you can drop the old column first and then add a new identity column right after it with the help of the following script.

ALTER TABLE YourTable DROP COLUMN IDCol
ALTER TABLE YourTable ADD IDCol INT IDENTITY(1,1)

Let me know if you have any other workaround besides SSMS (as that option just drops the table and recreates it).
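One hedged workaround (all object names below are placeholders) is to rebuild the table so that the existing integer values are preserved inside a real identity column:

-- Create a replacement table whose ID column is an identity column.
CREATE TABLE YourTableNew (IDCol INT IDENTITY(1,1), Name VARCHAR(100))
GO
-- Copy the rows, keeping the old integer values as the identity values.
SET IDENTITY_INSERT YourTableNew ON
INSERT INTO YourTableNew (IDCol, Name)
SELECT OldIntCol, Name FROM YourTable
SET IDENTITY_INSERT YourTableNew OFF
GO
-- Swap the tables.
DROP TABLE YourTable
GO
EXEC sp_rename 'YourTableNew', 'YourTable'
GO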

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Row Offset in SQL Server For Different Version

A very common question is how to find the row offset in SQL Server. Here are a few examples based on different versions of SQL Server. I have included SQL Server 2000 as well, even though it is an almost 15-year-old product and I encounter it less and less on production servers.

I have used the AdventureWorks database for the examples.

USE AdventureWorks2014
GO
--------------------------------------------------
-- SQL Server 2012/2014
--------------------------------------------------
DECLARE @RowsPerPage INT = 10, @PageNumber INT = 6
SELECT SalesOrderDetailID, SalesOrderID, ProductID
FROM Sales.SalesOrderDetail
ORDER BY SalesOrderDetailID
OFFSET
(@PageNumber-1)*@RowsPerPage ROWS
FETCH NEXT @RowsPerPage ROWS ONLY
GO
--------------------------------------------------
-- SQL Server 2008 / R2
-- SQL Server 2005
--------------------------------------------------
DECLARE @RowsPerPage INT, @PageNumber INT
-- SQL Server 2005 does not allow inline DECLARE initialization, so SET is used.
SET @RowsPerPage = 10
SET @PageNumber = 6
SELECT SalesOrderDetailID, SalesOrderID, ProductID
FROM (
SELECT SalesOrderDetailID, SalesOrderID, ProductID,
ROW_NUMBER() OVER (ORDER BY SalesOrderDetailID) AS RowNum
FROM Sales.SalesOrderDetail ) AS SOD
WHERE SOD.RowNum BETWEEN ((@PageNumber-1)*@RowsPerPage)+1
AND @RowsPerPage*(@PageNumber)
GO
--------------------------------------------------
-- SQL Server 2000
--------------------------------------------------
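-- Note: SQL Server 2000 itself does not accept a variable in TOP
-- (that arrived in SQL Server 2005); on a real 2000 instance you would
-- use SET ROWCOUNT or dynamic SQL instead. The pattern below runs as
-- written on SQL Server 2005 and later.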
DECLARE @RowsPerPage INT
DECLARE @PageNumber INT
SET @RowsPerPage = 10
SET @PageNumber = 6
SELECT SalesOrderDetailID, SalesOrderID, ProductID
FROM
(
SELECT TOP (@RowsPerPage)
SalesOrderDetailID, SalesOrderID, ProductID
FROM
(
SELECT TOP ((@PageNumber)*@RowsPerPage)
SalesOrderDetailID, SalesOrderID, ProductID
FROM Sales.SalesOrderDetail
ORDER BY SalesOrderDetailID
) AS SOD
ORDER BY SalesOrderDetailID DESC
) AS SOD2
ORDER BY SalesOrderDetailID ASC
GO

The results of the above queries are identical to each other.

This blog post is based on an earlier blog post; a few errors from that post have been corrected here.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Tuning Queries is Sometimes a Waste of Time – Notes from the Field #049

[Note from Pinal]: This is the 49th episode of the Notes from the Field series. Every day I get a few emails asking how to tune queries so that the entire server runs faster, and I always end up writing a long answer. The reason is simple: query tuning is indeed the need of the hour, but it is not everything. There are many more things one should do along with tuning queries. I asked the same question to Brian Moran, who deals every day with consultants and organizations facing SQL Server issues.

In this episode of the Notes from the Field series, database expert Brian Moran explains how query tuning is sometimes a waste of time when we are facing performance and scalability issues. Read Brian’s experience in his own words.


Tuning queries is sometimes a waste of time when you’re trying to fix performance and scalability problems. That sounds crazy, but it’s true. Designing an efficient workload is almost always more important than maintaining a laser focus on query tuning and the other technical tricks of the DBA trade.

Here’s an example that helps to explain what I mean.

Simple Example

What’s the shortest distance between two points? A straight line, of course. Everyone knows that. Imagine that you need to calculate the most efficient path from Washington DC to New York. Imagine those cities on a traditional 3D globe. The straight line is simple to figure out, right? Just head north as the crow flies.

But what if you wander about and take a curvy path as you head north? DC to New York is about 204 miles as the crow flies, but Google says the distance is 226 miles if you take I-95 and the New Jersey Turnpike. Google also presents some reasonable travel options that are as far as 264 miles. It’s easy to imagine you could increase the distance even more by taking scenic roads and side trips along the way. Of course, you can also draw a straight line on a globe between DC and New York that heads in a southerly direction, which is over 24,000 miles. That’s over 100 times more expensive than the straight route northwards.

Now, let me map that mapping example back to database performance tuning. Sometimes database performance tuning is an exercise in making the workload more efficient, just as you might want to make your route to New York more efficient. To make the database workload more efficient, you can tune queries, add indexes, and do all sorts of other neat tricks. Query tuning is like an exercise in flattening the northerly path from DC to New York to make as straight a line as possible. You want to tune your queries so that they’re as direct as possible and don’t veer off into unnecessary detours.

But what if you’re trying to flatten a squiggly line from DC to New York that starts out by heading south, 180 degrees the wrong way? You can make that line as straight and as efficient as you want, but heading south from DC to get to New York is never going to be efficient, no matter how straight a line you draw. A route that goes the wrong way can be optimized, but it’s still going the wrong way.

Inefficient Workload

The same idea applies to queries. If you start out tuning an inefficient workload, you’ll end up with a tuned workload that is still inefficient.

This seems like a silly example; everyone intuitively knows that the southerly route is inherently a poor choice. You can tune the route as much as you want, but in the best case it is still going to be over 100 times worse than going north.

Oddly enough, 25 years of database consulting tells me that many DBAs spend most of their time tuning queries that are the database equivalent of trying to make that southerly line from DC to New York as straight as possible. They’ll spend days, weeks, or months trying to shave off that last 10%.

All too often, DBAs fail to take a moment to pause, understand the nature of the problem they are trying to solve, and envision a more efficient way to solve it. These DBAs don’t recognize that changing the workload can be vastly better than tuning the workload.

Two Classic Examples

Here are two classic examples that most DBAs intuitively understand. Say you have a business need defined in a way that requires a drop-down list box populated with 100,000 rows that users will have to scroll through. You can try to tune the process so that it loads those 100,000 rows more efficiently every time the screen is touched, but that tuning does not change the workload.

Rather than simply tuning the existing workload, a better investment of time would be to help the designers of that screen understand and accept a solution that doesn’t require loading 100,000 rows. Or perhaps you have a system that makes heavy use of database cursors in some procedural manner. Most DBAs know that rearchitecting with a set-based solution will almost always be better than trying to make the cursor-based approach as fast as possible (see the sketch below).
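As a minimal sketch (the table and column names are hypothetical), compare a cursor that touches one row per iteration with the equivalent single set-based statement:

-- Cursor-based: one statement execution per row.
DECLARE @ID INT
DECLARE order_cursor CURSOR FOR
    SELECT ID FROM Orders WHERE Amount < 100
OPEN order_cursor
FETCH NEXT FROM order_cursor INTO @ID
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE Orders SET Amount = 100 WHERE ID = @ID
    FETCH NEXT FROM order_cursor INTO @ID
END
CLOSE order_cursor
DEALLOCATE order_cursor
GO

-- Set-based: the same work in a single statement.
UPDATE Orders SET Amount = 100 WHERE Amount < 100
GO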

Here’s a 15-year-old example of a design-pattern mistake that is still common today. I was working on an office supply site when e-commerce was first becoming big business. The application was built in a very object-oriented manner. Developers love object-oriented approaches, but databases aren’t always as fond of them. In this case, the object-oriented design pattern led to 1,000, and sometimes 2,000 or more, round trips to SQL Server when a single user searched for office chairs, and the search brought back just one screen of results.

This was a problem because the system needed to support hundreds or thousands of concurrent users. The existing workload might therefore need to handle hundreds of thousands or millions of batch requests per second. The client wanted to do this on a single SQL Server instance since that’s all their budget could afford.

Summary

I could have spent an infinite amount of time tuning the queries used in this approach, but the workload was never going to be efficient. I was never going to be able to handle hundreds of thousands of batch requests per second on a single instance of SQL Server 2000. Instead, I made some changes to the architecture. In other words, I changed the workload, and I was able to achieve the performance goals I had.

The examples I used here are pretty basic, and most DBAs today are aware of how to deal with such situations. But I’m willing to bet that many people reading this post are struggling with a performance problem where tuning the queries is a poor use of time compared to completely changing the essence of the workload. It’s too easy to lose sight of the forest for the trees.

Having a hard time figuring out how to tune your queries to achieve your performance goals? Maybe you should take a step back and focus on changes you can make to the workload instead.

If you want to get started with performance tuning and database security with the help of experts, read more over at Fix Your SQL Server.

Reference: Pinal Dave (http://blog.sqlauthority.com)