SQL SERVER – Do You Know Your Data’s Classification? – Notes from the Field #050

[Notes from Pinal]: Data is a very simple word, yet it is very powerful. There is a famous saying – Know Your Data. I have quite often found that developers do not know their data; they are often confused about it and not sure how to answer questions about it. My friend Tim Radney is an amazing person who usually has an answer to every question that looks simple but is complicated in reality.

Linchpin People are database coaches and wellness experts for a data driven world. In this 50th episode of the Notes from the Field series, database expert Tim Radney (partner at Linchpin People) explains the subject in very simple words. Please follow Tim on his twitter handle at @tradney.


Do you know your data’s classification?

As data professionals, we have many responsibilities. We could be responsible for backing up and restoring data, writing reports, building queries, writing stored procedures, tuning workloads, or any vast number of responsibilities.

If you are in the business of granting access to the data either through reports/queries or provisioning login access, you should be aware of the type of data you are granting access to. Your company likely has policies in place that should guide how access to certain classifications of data should be handled.

Some of the more common types of data over which your company would have stricter controls are those related to PII, PCI, SOX, GLBA, or HIPAA. You should know what the guidelines within your company are for access to this data and help make sure that those standards are being upheld.

These data assets may require additional auditing of who has access to view, print, export, and so on. When the data is viewed, it may require watermarks or headers/footers on any reports.

Your organization may require certain types of data, such as PCI data, to be audited so that you know who is viewing certain elements of that data.
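
For example, if you are on SQL Server 2008 or later, the built-in SQL Server Audit feature is one way to record who is reading a sensitive table. The sketch below is only an illustration; the audit name, database, table, and file path are all hypothetical, so adjust them to your own environment and policies.

USE master;
GO
-- Server-level audit that writes audit events to files in a folder (path is an assumption)
CREATE SERVER AUDIT Audit_PCI_Access
TO FILE (FILEPATH = 'D:\SQLAudit\');
GO
ALTER SERVER AUDIT Audit_PCI_Access WITH (STATE = ON);
GO
-- Database-level specification that records every SELECT against the sensitive table
USE SalesDB;   -- hypothetical database holding cardholder data
GO
CREATE DATABASE AUDIT SPECIFICATION Audit_CardData_Select
FOR SERVER AUDIT Audit_PCI_Access
ADD (SELECT ON OBJECT::dbo.CardData BY public)   -- dbo.CardData is a hypothetical table
WITH (STATE = ON);
GO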

Worse, your organization may need to be doing these things but is not. If not, you should be asking why and helping to implement a data governance plan.

If you want me to take a look at your server and its settings, or if your server is facing any issue we can Fix Your SQL Server.

Note: Tim has also written an excellent book on SQL Backup and Recovery, a must have for everyone.

Reference: Pinal Dave (http://blog.sqlauthority.com)


SQL SERVER – Beginning Reading Transaction Log File

In this article we will examine what the SQL Server transaction log is, why we need one, and how it works.

Transaction Log Overview

The SQL Server transaction log is perhaps one of the least understood parts of the SQL Server architecture. At the same time, the transaction log is one of the most essential components of every database.

Every SQL Server database has a transaction log that keeps all transactions and the database changes made by each transaction. In case of a system failure, the transaction log might be required to bring your database back to a consistent state. That is why the transaction log should never be deleted or moved.

Transaction Log Purpose

Let’s have a look at ACID. Atomicity, Consistency, Isolation, and Durability are a set of properties that guarantee that database transactions are processed reliably. The transaction log assists us with durability: it guarantees that committed transactions will persist. Durability is achieved by flushing the transaction’s log records to non-volatile storage before acknowledging the commit. In other words, the transaction must be stored on disk. Otherwise, if a power failure occurred while your changes existed only in the memory of the database server, you might find after the server restart that your changes weren’t saved. Thus, all changes must be written to disk before we can return a successful commit to the application.
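
As a rough illustration (the table and values below are hypothetical), this is what the durability guarantee means in practice:

BEGIN TRANSACTION;

UPDATE dbo.Accounts              -- hypothetical table
SET Balance = Balance - 100
WHERE AccountID = 42;

-- COMMIT does not return to the client until the log records describing this
-- change have been flushed to the transaction log file on disk.
COMMIT TRANSACTION;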

SQL Server stores data rows on 8 KB pages, which can be located anywhere in the data file. A page can hold a large number of rows (roughly 400 small rows per page). The page is the smallest unit of reading and writing, so if you modify one row on a page, the entire 8 KB page is written back to disk.

Let’s consider the situation when you update 100 rows in a single transaction. Some of these rows could be spread out across the data file.

You would need to drag the disk head to each position, write the updated page to disk, and move on to the next location, waiting for the disk to spin and the disk head to move each time.

The transaction log is a sequential file; rather than making a randomly placed write for each page, SQL Server can write log records in 512-byte sectors.

The smallest write should hit one disk sector. A disk sector is a physical sector on a drive. Most drives contain sectors which are 512 bytes.

Based on this, we can conclude that SQL Server will write 512 bytes to the log file for a single tiny write transaction.

Because the transaction log is sequential, you don’t need to wait for the disk head to move to the next position to write a change; the next position is already right under the disk head. This means changes can be written to disk quickly.

While changes are written to the transaction log, the in-memory version of the data page is also updated. Later, the Lazy Writer (or checkpoint) process writes these data pages to disk. If a power failure occurs, all the data pages in memory are lost, but we still have a record of every change that occurred in the database.

Reading Transaction Log File

Let’s examine how SQL Server logs transactions and what information is stored in the transaction log file.

To read the transaction log file we will use the fn_dblog function. This function is one of several undocumented SQL Server functions.

Every record in the SQL Server transaction log is uniquely identified by a log sequence number (LSN). LSNs are ordered: if LSNn+1 > LSNn, the change described by the log record at LSNn+1 occurred after the change described by the log record at LSNn.

fn_dblog accepts two parameters:

  • The first is the starting log sequence number. You can also specify NULL, which means everything from the start of the log will be returned.
  • The second is the ending log sequence number. You can also specify NULL, which means everything up to the end of the log will be returned.

I will use dbForge Studio for SQL Server to run SQL code.

For demonstration purposes let’s create a test database and a table.

The IDE allows you to create a database in a few clicks. Once you establish a connection to SQL Server, right-click the connection, and then click New Database. The Database Designer will appear. In the designer window, enter the database name (I will use ViewTransactionLog) and click the Update Database button. That is all. To create a table, expand the database node, right-click the Tables folder, and then click New Table.

The table will contain three columns: ID, FirstName, and LastName.
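
If you prefer plain T-SQL to the designer, an equivalent script might look like the following (the column sizes are my assumption; the database and table names match the ones used later in this post):

CREATE DATABASE ViewTransactionLog;
GO
USE ViewTransactionLog;
GO
CREATE TABLE dbo.Person
(
    ID        INT,
    FirstName VARCHAR(50),
    LastName  VARCHAR(50)
);
GO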

Now we can use SQL script editor to execute the fn_dblog function.
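
In its simplest form the call looks like this; passing NULL for both parameters returns every available log record:

USE ViewTransactionLog;
GO
SELECT *
FROM fn_dblog(NULL, NULL);
GO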

Notice that 461 rows have been generated just for creating the database and the blank table. Let us look at the transaction log data.

Note that the Transaction Name column contains the database name. This value appears only on the row where a transaction starts, which shows “LOP_BEGIN_XACT” (begin transaction) in the Operation column. The Operation column shows which operation is being performed by SQL Server: insert, update, delete, shrink, lock, page allocation, and so on. The Transaction ID is the same for all parts of a transaction.

Now let’s run several DML commands to see how this affects the transaction log file.
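
For example, statements along these lines will do (the actual values are not important; they are here only to generate log records):

INSERT INTO dbo.Person (ID, FirstName, LastName)
VALUES (1, 'Pinal', 'D');
GO
UPDATE dbo.Person
SET LastName = 'Dave'
WHERE ID = 1;
GO
DELETE FROM dbo.Person
WHERE ID = 1;
GO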

Let’s have a look at the transaction log.
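
To make the output easier to read, you can select just a handful of columns and filter on the operations you care about. Because fn_dblog is undocumented, the column names below may vary slightly between SQL Server versions:

SELECT [Current LSN],
       Operation,
       [Transaction ID],
       [Transaction Name],
       AllocUnitName,
       [Begin Time],
       [End Time]
FROM fn_dblog(NULL, NULL)
WHERE Operation IN ('LOP_BEGIN_XACT', 'LOP_INSERT_ROWS', 'LOP_COMMIT_XACT');
GO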

We can read the first row in the log as:

  • INSERT statement with transaction ID 0000:00000a80
  • started at 2014/10/10 10:44:52:273
  • INSERT a row into the HEAP table ‘dbo.Person’ on page ID 0000011d:00000140
  • finished at 2014/10/10 10:44:52:273

Conclusion

As you can see, the transaction log is a list of changes that have occurred in a database. The transaction log also gives us the ability to replay any operation that occurred in the database.

If you take a full database backup, and then have copies of the transaction log, you will be able to replay every operation and bring the database back to a consistent state at any point in time.

For instance, if you accidentally dropped a table, you can bring the database back to the point in time just before the drop, or to a specific LSN, and recover the data.
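
A minimal sketch of such a point-in-time restore could look like this (the file names, paths, and the STOPAT time are hypothetical):

RESTORE DATABASE ViewTransactionLog
FROM DISK = 'D:\Backup\ViewTransactionLog_Full.bak'
WITH NORECOVERY, REPLACE;
GO
-- Roll the log forward only up to the moment just before the accidental drop.
-- STOPATMARK / STOPBEFOREMARK accept an 'lsn:' value if you prefer to stop at a specific LSN.
RESTORE LOG ViewTransactionLog
FROM DISK = 'D:\Backup\ViewTransactionLog_Log.trn'
WITH STOPAT = '2014-10-10 10:44:00', RECOVERY;
GO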

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL Authority News – Save USD 600 for SQL Live! 360 in November

I love talking about performance tuning and I love teaching performance tuning. I think this is one subject everyone likes to discuss, as it never gets old. I live in India, so I often deliver conference sessions on this subject here. However, if you are in the USA, this is your opportunity to join me to discuss SQL Server performance.

I’ll be speaking at Live! 360, November 17 – 21 in Orlando, FL. Live! 360 brings together five conferences, and the brightest minds in IT and Dev, to explore leading edge technologies and conquer current ones. More info: http://bit.ly/LSPK22HOME

As a speaker, I can extend $600 savings on the 5-day package. Register here: http://bit.ly/LSPK22REG and use code LSPK22.

I’ll be presenting the following sessions:

Secrets of SQL Server: Database Worst Practices
Abstract: Chances are you have heard, or even uttered, this expression. This demo-oriented session will show many examples where database professionals were dumbfounded by their own mistakes, and could even bring back memories of your own early DBA days. The goal of this session is to expose the small details that can be dangerous to the production environment and SQL Server as a whole, as well as talk about worst practices and how to avoid them. Shedding light on some of these perils and the tricks to avoid them may even save your current job. After attending this session, Developers will only need 60 seconds to improve performance of their database server in their SharePoint implementation. We will have a quiz during the session to keep the conversation alive. Developers will walk out with scripts and knowledge that can be applied to their servers, immediately post the session. Additionally, all attendees of the session will have access to learning material presented in the session.

Workshop: Performance Tuning Best Practices – Every DBA and Developer MUST Know
In this session we will try to bridge the gap between DBA and Developer. We will approach performance as a whole, rather than dividing problems into DBA problems or Developer problems.

Performance Tuning is a tricky subject, but not as difficult as we think. It is very common to see in this industry that the DBA believes the performance issue is due to bad code and the Developer believes it is due to the bad deployment of the database server.

We will take very different approaches to solving SQL Server performance problems, with stories from real-world scenarios and a practical approach to resolving them.

We will primarily discuss:

  • Best Deployment Practices
  • Indexes, Best Practices and Maintenance
  • Identifying slow running queries and Tuning
  • Identifying resource bottleneck with the help of SQL Wait Stats
  • Optimizing Resources – CPU, IO, Memory
  • Best Practices to Resolve ad-hoc issues

You will learn:

  • Performance Tuning
  • Best Practices
  • Index and Query Tuning

SPECIAL OFFER:

As a speaker, I can extend $600 savings on the 5-day package. Register here: http://bit.ly/LSPK22REG and use code LSPK22.

Get better at what’s available, and get ready for what’s coming. We delve deep into the topics that matter most in your world today, covering practical knowledge, fundamentals, new technologies, and futures. Register now: http://bit.ly/LSPK22REG

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – What is the Difference Between An INNER JOIN and WHERE Clause

Just received a question in email:

“What is the Difference Between An INNER JOIN and WHERE Clause?”

Well, the answer could be quite long, but I will try to answer it in a few words and keep it very simple.

If you are talking about the resultset, there will not be much difference. If you are talking about performance, the SQL Server engine is smart enough to automatically rewrite your query in most cases, so you will see no difference in performance. With that said, I still prefer that you use INNER JOIN when a query involves more than one table, as that is the ANSI-standard syntax.

If you are ever going to use an OUTER JOIN, the question of JOIN vs. WHERE does not make sense at all, as the answers will differ in most cases. Here is a quick example.

Let us create two tables and select the sample data from them.
USE TempDB
GO
-- Creating Sales Rep Table
CREATE TABLE SalesRep (ID INT, SalesRep VARCHAR(10))
GO
INSERT INTO SalesRep (ID, SalesRep)
SELECT 1, 'Pinal'
UNION ALL
SELECT 2, 'Vinod'
UNION ALL
SELECT 3, 'Balmukund'
UNION ALL
SELECT 4, 'Chandra'
UNION ALL
SELECT 5, 'DJ'
GO
-- Creating Sales Orders Table
CREATE TABLE Orders (ID INT, Amount INT)
GO
INSERT INTO Orders (ID, Amount)
SELECT 1, 100
UNION ALL
SELECT 1, 200
UNION ALL
SELECT 2, 500
UNION ALL
SELECT 2, 300
UNION ALL
SELECT 3, 600
GO
-- Select Data
SELECT *
FROM SalesRep
GO
SELECT *
FROM Orders
GO

Here is the resultset of the data selected from above two tables.

Now let us see a quick example of the INNER JOIN and WHERE condition.

-- JOIN
SELECT s.SalesRep, o.Amount
FROM SalesRep s
INNER JOIN Orders o ON s.id = o.id
GO
-- WHERE
SELECT s.SalesRep, o.Amount
FROM SalesRep s, Orders o
WHERE s.id = o.id
GO

You will notice from the resultset that both queries return exactly the same result. As I mentioned earlier, when the INNER JOIN condition and the WHERE clause express the same predicate, there is no difference in the resultset.
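
If you would like to confirm that the engine treats both forms the same way, one quick check is to compare the textual execution plans with SET SHOWPLAN_TEXT; both statements should produce identical plans:

-- Show the estimated plan as text instead of executing the queries
SET SHOWPLAN_TEXT ON;
GO
SELECT s.SalesRep, o.Amount
FROM SalesRep s
INNER JOIN Orders o ON s.id = o.id;
GO
SELECT s.SalesRep, o.Amount
FROM SalesRep s, Orders o
WHERE s.id = o.id;
GO
SET SHOWPLAN_TEXT OFF;
GO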

Let us see a quick example where an OUTER JOIN gives completely different results from the WHERE clause version, because the business logic behind an outer join is entirely different. As I mentioned above, comparing an outer join with a WHERE clause is like comparing apples and oranges; they are not logically the same.

-- JOIN
SELECT s.SalesRep, o.Amount
FROM SalesRep s
LEFT OUTER JOIN Orders o ON s.id = o.id
GO
-- WHERE
SELECT s.SalesRep, o.Amount
FROM SalesRep s, Orders o
WHERE s.id = o.id
GO

Here is the script to clean up the tables created.

-- Clean up
DROP TABLE SalesRep
DROP TABLE Orders
GO

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – SQL Server 2008 Service Pack 4 – Download

It has been a long time since SQL Server 2008 received a service pack update. Microsoft has finally released SQL Server 2008 Service Pack 4 and its feature pack. SQL Server 2008 SP4 contains fixes for reported issues, as well as hotfixes provided since SQL Server 2008 Service Pack 2, up to and including Cumulative Update 13.

I personally switched to SQL Server 2014 a few months ago and I am happy with its performance and robust behavior. Many customers and clients are still using SQL Server 2012. However, if you are using SQL Server 2008, I suggest that you look at upgrading to the latest version of SQL Server, or at least update your installation with the latest service pack.

You can download SQL Server 2008 Service Pack 4 from the following link:

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Add Auto Incremental Identity Column to Table After Creating Table

Question: Is it possible to add an auto-incrementing identity column to a table in SQL Server after the table has been created?

Answer: There are two answers – No and Yes. Let us see them one by one.

Answer No – If you have an existing integer column in your table and you want to convert that column to an identity column, it is not possible in SQL Server. You will have to add a new column instead.

Answer Yes – If you want to add a new column to the table, it is entirely possible to do so with the help of the following script.

ALTER TABLE YourTable ADD IDCol INT IDENTITY(1,1)
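
For example, a quick test on a throwaway table (the table name and data below are made up) shows that the existing rows are numbered automatically when the identity column is added:

CREATE TABLE dbo.DemoTable (Name VARCHAR(20));
INSERT INTO dbo.DemoTable (Name) VALUES ('Pinal'), ('Vinod'), ('Balmukund');
GO
ALTER TABLE dbo.DemoTable ADD IDCol INT IDENTITY(1,1);
GO
SELECT IDCol, Name FROM dbo.DemoTable;   -- existing rows are numbered 1 through 3
GO
DROP TABLE dbo.DemoTable;
GO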

If you want to replace your old integer column with an identity column, you can drop the old column first and then add a new identity column right after that with the help of the following script.
ALTER TABLE YourTable DROP COLUMN IDCol
ALTER TABLE YourTable ADD IDCol INT IDENTITY(1,1)

Let me know if you have any other workaround besides SSMS (as that option just drops the table and recreates it).

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Row Offset in SQL Server For Different Version

A very common question is how to find the row offset in SQL Server. Here are a few examples based on different versions of SQL Server. I have included SQL Server 2000 as well, even though it is almost a 15-year-old product and I encounter it less and less on production servers.

I have used the AdventureWorks database for the examples below.

USE AdventureWorks2014
GO
--------------------------------------------------
-- SQL Server 2012/2014
--------------------------------------------------
DECLARE @RowsPerPage INT = 10, @PageNumber INT = 6
SELECT SalesOrderDetailID, SalesOrderID, ProductID
FROM Sales.SalesOrderDetail
ORDER BY SalesOrderDetailID
OFFSET (@PageNumber-1)*@RowsPerPage ROWS
FETCH NEXT @RowsPerPage ROWS ONLY
GO
--------------------------------------------------
-- SQL Server 2008 / R2
-- SQL Server 2005
--------------------------------------------------
-- DECLARE with inline initialization requires SQL Server 2008; use SET for 2005 compatibility
DECLARE @RowsPerPage INT, @PageNumber INT
SET @RowsPerPage = 10
SET @PageNumber = 6
SELECT SalesOrderDetailID, SalesOrderID, ProductID
FROM (
SELECT SalesOrderDetailID, SalesOrderID, ProductID,
ROW_NUMBER() OVER (ORDER BY SalesOrderDetailID) AS RowNum
FROM Sales.SalesOrderDetail ) AS SOD
WHERE SOD.RowNum BETWEEN ((@PageNumber-1)*@RowsPerPage)+1
AND @RowsPerPage*(@PageNumber)
GO
--------------------------------------------------
-- SQL Server 2000
--------------------------------------------------
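-- Note: TOP (@variable) actually requires SQL Server 2005 or later. On a real
-- SQL Server 2000 instance you would use SET ROWCOUNT or a literal TOP value instead.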
DECLARE @RowsPerPage INT
DECLARE @PageNumber INT
SET @RowsPerPage = 10
SET @PageNumber = 6
SELECT SalesOrderDetailID, SalesOrderID, ProductID
FROM
(
SELECT TOP (@RowsPerPage)
SalesOrderDetailID, SalesOrderID, ProductID
FROM
(
SELECT TOP ((@PageNumber)*@RowsPerPage)
SalesOrderDetailID, SalesOrderID, ProductID
FROM Sales.SalesOrderDetail
ORDER BY SalesOrderDetailID
) AS SOD
ORDER BY SalesOrderDetailID DESC
) AS SOD2
ORDER BY SalesOrderDetailID ASC
GO

The results of the above queries are identical to each other.

This blog post is based on an earlier blog post; a few errors from that post have been corrected here.

Reference: Pinal Dave (http://blog.sqlauthority.com)