SQL SERVER – User Defined Audit with SQL Server

Talk to any financial institution or bank and you will find them super paranoid when it comes to the security and auditing policies applied to their organizations. On a recent visit to one of our customers, I had to go through a number of screenings just to enter their premises. From the entrance to the car parking, reception and elevators, there was someone asking us for some information at every step. At the time, the more I thought about it, the more frustrated I became. After returning home, I talked through the complete incident with my family, and I was pleasantly surprised by the calmness with which they discussed the process with me.

It was a great way to learn patience and why it is important. Organizations have their reasons: they need their data secure, and they want procedures in place so that there is no data loss, no theft of data and much more. In this competitive world this is super critical; these procedures are critical for survival. Hence this blog is inspired by the auditing capability in SQL Server 2012 and what I found interesting about it.

Creating our Audit

Creating an audit is as simple as going through a short wizard with some basic data. Go to SQL Server Management Studio -> Security -> Audits and create a new audit. Go ahead and add the file path; in our example I have added it as “C:\Audit”.

The T-SQL equivalent is:

CREATE SERVER AUDIT [Audit-20141115-213944]
TO FILE
(  FILEPATH = N'C:\Audit'
,MAXSIZE = 0 MB
,MAX_ROLLOVER_FILES = 2147483647
,RESERVE_DISK_SPACE = OFF)
WITH
(  QUEUE_DELAY = 1000
,ON_FAILURE = CONTINUE)
GO

After creating the audit, we will make sure it is enabled. The audit defines where our auditing data will be written; in our example it is configured to a file, as shown above.

Once we enable the Audit node, the next step is to add the audit operation via a server audit specification. Here we are going to add the “User Defined Audit Group”:

CREATE SERVER AUDIT SPECIFICATION [ServerAuditSpecification-20141115-214433]
FOR SERVER AUDIT [Audit-20141115-213944]
ADD (USER_DEFINED_AUDIT_GROUP)
GO
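
Both the audit and the audit specification need to be in the ON state before any events are recorded. If you prefer T-SQL over the UI for this step, a minimal sketch using the same names created above would be:

-- Enable the server audit so events start flowing to C:\Audit
ALTER SERVER AUDIT [Audit-20141115-213944] WITH (STATE = ON)
GO
-- Enable the audit specification that captures USER_DEFINED_AUDIT_GROUP
ALTER SERVER AUDIT SPECIFICATION [ServerAuditSpecification-20141115-214433] WITH (STATE = ON)
GO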

Creating User defined Audit entries

This is a capability that was introduced with SQL Server 2012, and I personally feel it is quite powerful for a couple of reasons. Apart from the standard auditing capability, applications can now raise specific audit events into the audit log, so that we can track or audit logically from an application’s point of view.

The simplest way to raise an audit record from an application is the sp_audit_write stored procedure. A typical call looks like:

EXEC sys.sp_audit_write 1, 0, N'This is an audit from Pinal'

This command puts an entry into the audit we just defined in the previous step. To view the entry, select the “View Audit Log” option on the Audit we just created. The Log Viewer looks like below:

In the example above, we can see the details. Please note the audit shows “Succeeded as False” because we passed the second parameter to sys.sp_audit_write as 0, hence it is False. If you want it recorded as True, pass the value as 1. Also, the statement that got us this information is available as part of the “Statements” line item.

If you want to view all the entries in our audit file, we can use the sys.fn_get_audit_file function:

SELECT * FROM sys.fn_get_audit_file('c:\Audit\*', NULL, NULL)

I hope you got a flavor of how custom user-defined audits can be defined. This is a great way to use the auditing feature from your application’s point of view. Do let me know if you found this useful and if you are planning to use it in your environments.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Filtering CPU Bound Execution Plans with Extended Events

Let me take you on a tour of what we are talking about here, from a scenario point of view. Prior to SQL Server 2012, tracing the execution plan was an all-or-nothing approach. In Profiler, if we select the execution plan events, we get every occurrence of every execution plan for all the statements executed inside SQL Server captured by the trace. This bloats up the trace file very quickly and makes any sort of analysis unmanageable. So, in the general performance templates used inside Profiler, we do not capture execution plans – and rightly so.

I have been waiting for the ability to capture the execution plan only when a query executes longer than a certain amount of time. Yes, I could add a few filters in Profiler, but that wasn’t the most efficient way to work. In this blog we will look at the Extended Event query_post_execution_showplan (similar to Showplan XML Statistics Profile) in SQL Server 2012, which has an interesting addition of cpu_time and duration for filtering. Personally, I think this is a great addition to the usage of Extended Events.

Why use this?

We want to use this to monitor overall server performance and to troubleshoot when things go wrong. When server responses are slow and we have not identified any particular query, we can choose to capture all execution plans whose cpu_time or duration exceeds a certain threshold. This is one of the ways this feature can be used effectively.

Extended Event UI

If you are not familiar with the Extended Events UI, here is a great way to start with this task. Go to SQL Server Management Studio -> Management Node -> Extended Events -> Sessions -> Right-click and select “New Session…”.

This starts a wizard which makes our configuration quite easy. In the Event library, search using the keyword “showplan“. query_post_execution_showplan will show up as a search result. Assuming you are on SQL Server 2012 or above, note that cpu_time and duration are among the available event fields.

Next, select this event and move the entry to the “Selected events:” list. This enables the “Configure” button at the top of the dialog box.

Click the Filter (Predicate) tab. You can choose cpu_time, an operator and a value. Note that cpu_time is in microseconds. For example, if you want to capture execution plans for all queries taking 10 seconds of CPU, you would enter 10000000. The same can be done for queries that run for more than 10 seconds by selecting the “duration” field instead.

Extended Event T-SQL

Instead of showing the UI version of the command, I have gone ahead and scripted out the same extended event session using the duration field. The command is:

CREATE EVENT SESSION [CPU_XEvent] ON SERVER
ADD EVENT sqlserver.query_post_execution_showplan
(
    ACTION(package0.process_id, sqlserver.database_name, sqlserver.nt_username, sqlserver.sql_text)
    WHERE ([duration] >= (10000000))
)
GO

In addition to the filter, I have gone ahead and added a few global fields (actions) which we might be interested in as part of the collection process. Use this with caution and be careful about collecting every single event using Extended Events. I was pleasantly surprised to see cpu_time and duration available in SQL Server 2014 as well, and hence thought it was worth a mention.

Finally, if you are doing this on a live server, make sure to enable the Extended Event session; we can then use the “Watch Live Data” option in SSMS, which I found very useful.
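
If you scripted the session as shown above, a minimal sketch to start it (assuming the session name CPU_XEvent from the script) before opening “Watch Live Data” would be:

-- Start the extended event session so it begins capturing plans
ALTER EVENT SESSION [CPU_XEvent] ON SERVER STATE = START
GO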

If you have used other events of Extended Events in your environment, I would like to learn from you on how you used them and to enable what scenarios.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Validation Rules: Code or Database? – Notes from the Field #054

[Note from Pinal]: This is the 54th episode of the Notes from the Field series. Whom do we blame for a mistake or error in the system? Well, the developer blames the DBA and the DBA blames the developers. Sometimes there is no solution to this catch-22 situation. I have been there and I am very sure that you have been there too. Well, this is an amazing and beautiful article by G. Andrew Duthie. He has attempted to demystify a problem which we all face every day.

In this episode of the Notes from the Field series, database expert Andrew Duthie explains validation rules and where they should be implemented. Read the experience of Andrew in his own words.

If you’re a DBA dealing with developers, you may run into the question of whether it’s better to allow the developers to write validation rules in their .NET app layer, or whether you should put your foot down and insist that the validation be implemented in stored procedures. The answer, as you might expect from a consultant, is “it depends.”

Advantages of Code-based Validation

One of the things that might inform your decision on what to use for validation is the skills possessed by the development team. If you’ve got a developer or developers who are well-versed in C# and LINQ, but don’t have a lot of experience writing stored procedures, you may want to cut them a break and let them use the tools they’re more familiar with.

Writing validation rules in code at the application layer allows developers to stay within the realm of .NET objects, which can result in faster development time.

Disadvantages of Code-based Validation

While there are probably more that could be discussed, I’ll mention just two of the significant disadvantages to writing validation rules in code.

First, if the code for the validation rules is using LINQ, particularly if the rules are complex, there’s the possibility of queries that generate sub-optimal SQL under the covers. This can be mitigated by profiling the queries to make sure that any performance hogs are caught as early as possible, but it’s certainly a valid concern.

Second, from a maintainability standpoint, having rules in the app means that adding rules requires the app to be recompiled and redeployed. For some apps and environments, this may not be a big deal, but in others, it could definitely be a deal-breaker.

Advantages of Stored Procedure-based Validation

Using stored procedures for validation provides some key advantages. One is proximity to the data. Unlike code-based validation, which may require pumping significant amounts of data over the wire from the database to the app tier, stored procedure-based validation keeps the logic on the DB tier, so performance may be significantly better.

Another advantage is that with a good execution design (for example, a master stored procedure that executes a list of validation rules in a specified order based on a configuration table), it can be relatively easy to introduce new rules with less disruption than having to recompile and redeploy an entire application.
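
To make that idea concrete, here is a minimal sketch of what such a configuration-driven approach could look like. The table and procedure names are hypothetical illustrations, not part of any product or of the author’s environment:

-- Hypothetical configuration table: one row per validation rule, executed in order
CREATE TABLE dbo.ValidationRule
(
    RuleId         INT IDENTITY(1,1) PRIMARY KEY,
    RuleProcName   SYSNAME NOT NULL,  -- each rule lives in its own stored procedure
    ExecutionOrder INT NOT NULL,
    IsEnabled      BIT NOT NULL DEFAULT 1
)
GO
-- Hypothetical master procedure that runs every enabled rule in the configured order
CREATE PROCEDURE dbo.usp_RunValidationRules
AS
BEGIN
    DECLARE @proc SYSNAME;
    DECLARE rule_cursor CURSOR LOCAL FAST_FORWARD FOR
        SELECT RuleProcName
        FROM dbo.ValidationRule
        WHERE IsEnabled = 1
        ORDER BY ExecutionOrder;
    OPEN rule_cursor;
    FETCH NEXT FROM rule_cursor INTO @proc;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        EXEC @proc;  -- each rule procedure raises an error or logs a failure
        FETCH NEXT FROM rule_cursor INTO @proc;
    END
    CLOSE rule_cursor;
    DEALLOCATE rule_cursor;
END
GO

Adding a new rule then means deploying one new stored procedure and inserting one configuration row, with no application recompile.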

Disadvantages of Stored Procedure-based Validation

The major disadvantage of using stored procedures for validation, speaking as an app developer, is the basic impedance mismatch between .NET code (C# or Visual Basic) and T-SQL. While it’s certainly possible for developers to master both, there’s a mental cost in switching between these environments, and a potential for mistakes when transitioning from one to the other.

The other downside of stored procedures is the mixing of application logic between the app tier and the database tier. While validation close to the data can, as noted, improve performance, if some parts of the application logic live in both the app and database tiers, this could make for more costly maintenance down the road.

Consistency is Key

One additional point I’d like to make is that it’s probably wise to choose one option or the other, not both. If you have multiple applications in development (or even in maintenance mode), having a mix of app-based and sproc-based validation will likely give you headaches at some point. So get your team together and have a discussion about how you’re currently handling things, and whether there might be a better way.

Summary

The short answer to “which is better” is really “either.” It all depends on the skills of your developers, the performance you need from your app, and the other factors I’ve discussed. Although I’m coming at this from the perspective of an app developer, I’ve recently become more comfortable with the idea of stored procedure-based validation, particularly in instances where more than one app may be targeting the same database, since this can help reduce redundancy, and centralize management of rules.

I’d love to get your feedback on how you’ve handled validation rules in your environment, so feel free to share a comment below.

If you want to get started with SQL Server with the help of experts, read more over at Fix Your SQL Server.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – FIX – Msg 4864, Level 16, State 1 – Bulk load data conversion error

Working with SQL Server is so rewarding, and most of the time I get a chance to revisit and explore some of these errors in more depth. A long time ago, I wrote a blog post about reading data from a CSV/text file and inserting it into a SQL Server table using the BULK INSERT command. You can read it here: SQL SERVER – Import CSV File Into SQL Server Using Bulk Insert – Load Comma Delimited File Into SQL Server

There have been many comments on that blog describing errors received by my readers. This blog post fixes a few of the errors mentioned there.

Msg 4860, Level 16, State 1

This is one of the most common errors reported by readers. Let’s have a look at the steps to reproduce it:

CREATE TABLE library_books_loan
(
student_id VARCHAR(10) NOT NULL,
book_id    VARCHAR(10) NOT NULL,
branch_id  SMALLINT NOT NULL,
id_no      VARCHAR(10) NOT NULL,
date_out   DATE NULL DEFAULT NULL,
due_date   DATE NULL DEFAULT NULL,
date_in    DATE NULL DEFAULT NULL
)

Here is the sample text file I want to insert into this table:

student_id,book_id,branch_id,id_no,date_out,due_date,date_in<>
1,0399147020,1,9019,2013-11-22,2013-12-06,2013-12-01<>
2,0030059380,4,9007,2013-12-01,2013-12-15,2013-12-16<>
3,0671880756,5,9018,2013-12-08,2013-12-22,2013-12-22<>
4,0911625291,3,9013,2014-01-02,2014-01-16,2014-01-12<>
5,0688161995,5,9022,2014-02-10,2014-02-24,2014-03-01<>
6,0911625291,2,9011,2014-03-03,2014-03-17,2014-03-16<>
7,1861003730,3,9034,2014-04-17,2014-05-01,NULL<>
12,0911625607,2,9018,2014-04-19,2014-05-03,NULL<>

We can save the file as C:\Temp\Books_Library.txt. Here is the command which we will run to import the data.

BULK INSERT library_books_loan
FROM 'C:\Temp\Books_Library.txt'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '<>',
FIRSTROW = 2
)
The common mistake users make is that the file is saved locally (on their computer) while the command is executed on a remote instance of SQL Server. Here is the error which you would receive if you do this:

Msg 4860, Level 16, State 1, Line 12
Cannot bulk load. The file "C:\Temp\Books_Library.txt.txt" does not exist. 

So please make sure that the file exists on the machine where SQL Server is running and that the path is correct on the server itself.

Msg 4864, Level 16, State 1

If the file is saved correctly on the server and BULK INSERT is run, we are likely to get the error messages below:

Msg 4864, Level 16, State 1, Line 12
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 8, column 7 (date_in).
Msg 4864, Level 16, State 1, Line 12
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 9, column 7 (date_in).

If we read the error message carefully, it is complaining about row 8 and row 9, column date_in. If we look back at the data, we can see that NULL is provided as the value.

This means that SQL Server is treating that value as the string “NULL” and trying to insert it into a column which is defined as a date.

Here is a simple demo that explains it:

CREATE TABLE SQLAuth (d DATE)
GO
INSERT INTO SQLAuth VALUES ('NULL')
GO

Msg 241, Level 16, State 1, Line 3
Conversion failed when converting date and/or time from character string.

Solution

Modify the text file and don’t pass any value for the column; it will then be treated as NULL automatically. Here is the modified version:

student_id,book_id,branch_id,id_no,date_out,due_date,date_in<>
1,0399147020,1,9019,2013-11-22,2013-12-06,2013-12-01<>
2,0030059380,4,9007,2013-12-01,2013-12-15,2013-12-16<>
3,0671880756,5,9018,2013-12-08,2013-12-22,2013-12-22<>
4,0911625291,3,9013,2014-01-02,2014-01-16,2014-01-12<>
5,0688161995,5,9022,2014-02-10,2014-02-24,2014-03-01<>
6,0911625291,2,9011,2014-03-03,2014-03-17,2014-03-16<>
7,1861003730,3,9034,2014-04-17,2014-05-01,<>
12,0911625607,2,9018,2014-04-19,2014-05-03,<>

If we run the same command now, the data should be inserted, and we can see NULL values inserted in our destination table.
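
To confirm, a quick verification query (a simple sketch against the table created above) is:

-- Rows loaded from the file with an empty date_in field should now be NULL
SELECT student_id, book_id, date_in
FROM library_books_loan
WHERE date_in IS NULL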

These are some common errors one can get while using the BULK INSERT command. In case you are facing other errors, please comment and I shall try to respond to them in a future post.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – What are Hypothetical Indexes?

If you ever thought this was some sort of trick to bring you to the blog, well, you are wrong. This in fact is something worth a look and an interesting thing to know. Before I explain the finer details, let me tell you that this is NOT a new feature for performance improvement of the SQL Server Engine.

While profiling one of the databases, one of my DBA friends asked why there is a difference between the indexes shown in SQL Server Management Studio and those returned by a T-SQL query. Since it didn’t make sense to me, I asked him to send me the details. This led to the learning which I thought was worth a share. Here is what I mean:

Trust me, there is absolutely no Photoshop trick in the above image. The sys.indexes catalog view shows 7 entries for a table, but Object Explorer shows only one index on the table, which is the ONLY object inside the database.

Before we talk about the reason, you can play along with me by creating a sample table using the below script.

USE MASTER
GO
IF DB_ID('HypotheticalIndex') IS NOT NULL
BEGIN
    ALTER DATABASE HypotheticalIndex SET SINGLE_USER WITH ROLLBACK IMMEDIATE
    DROP DATABASE HypotheticalIndex
END
CREATE DATABASE HypotheticalIndex
GO
USE HypotheticalIndex
GO
CREATE TABLE [Alumni]
(
[SSN]         [INT] IDENTITY(1, 1) NOT NULL,
[StudentName] [CHAR](200) NULL,
[GradDate]    [DATETIME] NULL
)
GO
SET NOCOUNT ON
GO
INSERT INTO Alumni (StudentName, GradDate)
VALUES (RAND() * 1000, DATEADD(dd, RAND() * 100, GETDATE()))
GO 60000 -- should take around 50 seconds
SET NOCOUNT OFF
GO

The above script creates a database and populates rows. Let’s create an index on the table.

CREATE INDEX Idx_Alumni_GradDate
ON Alumni(GradDate)

To create a hypothetical index on the same column, we can run the below command:

CREATE INDEX Alumni_hyp_1
ON Alumni(GradDate)
WITH STATISTICS_ONLY=1
GO

Here the key is the undocumented extension WITH STATISTICS_ONLY, which is available with the CREATE INDEX command.
Let us next look at the sys.indexes output:

SELECT name,
       index_id,
       type_desc,
       data_space_id,
       is_hypothetical
FROM   sys.indexes
WHERE  OBJECT_ID = OBJECT_ID('Alumni')
AND    type_desc <> 'HEAP'

Notice the highlighted column values for the hypothetical index. data_space_id is zero because there is no physical storage for this index; it is not stored in any filegroup or file.

Let’s look at the statistics.

DBCC SHOW_STATISTICS (Alumni, Alumni_hyp_1) WITH STAT_HEADER
GO

We can see that rows are sampled and a statistics object is generated for the hypothetical index.

This means that a hypothetical index is an index that has metadata (in sys.indexes with is_hypothetical = 1) and a statistics object associated with it (in sys.stats), but does not have physical storage. It is used only for costing evaluation of query plans in conjunction with the “DBCC AUTOPILOT” and “SET AUTOPILOT ON” commands. Those settings are a topic for a future discussion and blog post, so let us move along.

SQL Server Management Studio filters out hypothetical indexes in Object Explorer – this is the reason my friend saw a difference between T-SQL and SSMS.

The STATISTICS_ONLY option is undocumented, but I was playing around with it and found something interesting. Based on the value passed to this parameter, the statistics sampling changes.

CREATE INDEX Alumni_hyp_0
ON Alumni(GradDate)
WITH STATISTICS_ONLY=0
GO
DBCC SHOW_STATISTICS (Alumni, Alumni_hyp_0) WITH STAT_HEADER
GO

If we pass a value of zero, the statistics object is not generated.

If I pass the highest value (2^31-1) = 2147483647, then sampling is done with a full scan.

CREATE INDEX [Alumni_hyp_2147483647]
ON Alumni(GradDate)
WITH STATISTICS_ONLY=2147483647
GO
DBCC SHOW_STATISTICS (Alumni, 'Alumni_hyp_2147483647') WITH STAT_HEADER
GO

To find the hypothetical indexes in a database, we can run the below query.

SELECT  *
FROM    sys.indexes
WHERE   is_hypothetical = 1

Generally the index name would contain _dta_index, because the DTA uses hypothetical indexes behind the scenes to evaluate the cost of an index by creating it hypothetically. If DTA exits gracefully, it cleans up these indexes. But if someone loses patience while DTA is running and kills it using Task Manager, then those indexes are left behind. It is safe to drop hypothetical indexes, and they can be dropped using the normal DROP INDEX command. I must point out that if you apply a recommendation provided by the DTA while tuning a query and don’t change the suggested name, the index will have _dta_ in its name. The name alone doesn’t mean it is hypothetical – we need to use the is_hypothetical column to identify them.
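
If you want to clean up leftovers from an aborted DTA run, a small sketch that generates the DROP INDEX statements for every hypothetical index in the current database (review the output before executing it) could look like this:

-- Generate DROP INDEX statements for all hypothetical indexes
SELECT 'DROP INDEX ' + QUOTENAME(i.name) + ' ON '
       + QUOTENAME(SCHEMA_NAME(o.schema_id)) + '.' + QUOTENAME(o.name) AS DropCommand
FROM sys.indexes i
INNER JOIN sys.objects o ON i.object_id = o.object_id
WHERE i.is_hypothetical = 1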

How did I learn this? Someone asked how the Database Engine Tuning Advisor (a.k.a. DTA) works – how does it create and evaluate indexes on huge tables? So I captured a Profiler trace while running DTA and found many interesting facts about the under-the-cover workings of the tool. This blog is the result of an idea that came from looking at that trace. Learning never stops if you are working with SQL Server!

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – FIX – Error: One or more files do not match the primary file of the database

Writing about SQL Server for close to 8 years, almost every single day, means I get a lot of questions from people on a daily basis. Though I try to answer as many as possible (via email, blog, Twitter and Facebook), I get many interesting questions. They range from installation, scalability, performance and T-SQL to new features and a variety of error messages. Recently I got an email which had the below question:

Hi Pinal,

I was given a task to move the files for a few databases and also to rename them. While doing this activity I made some mistakes, and a few of my databases are not coming online. Here is the error I am getting for those databases.

2014-10-20 17:52:39.08 spid37s     Error: 5173, Severity: 16, State: 1.

2014-10-20 17:52:39.08 spid37s     One or more files do not match the primary file of the database. If you are attempting to attach a database, retry the operation with the correct files.  If this is an existing database, the file may be corrupted and should be restored from a backup.

2014-10-20 17:52:39.08 spid37s     Log file ‘E:\Program Files\Microsoft SQL Server\MSSQL12.SQL2014\MSSQL\DATA\SQLAuth_log.ldf’ does not match the primary file.  It may be from a different database or the log may have been rebuilt previously.

Can you please help me with how I can resolve this issue and what needs to be done to mitigate it?

On further investigation of the ERRORLOG, it was found that the database is in the “RECOVERY_PENDING” state because recovery of the database could not complete. The cause of the error is that the files being used by the database do not all belong to the same database. There is a safety mechanism in the database startup which checks whether all the files belong to the same database. This is very important because we don’t want to overwrite existing files inadvertently.
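
Before looking at the files themselves, a quick way to confirm which databases are stuck in this state is to query sys.databases:

-- List databases that are not ONLINE (RECOVERY_PENDING, SUSPECT, etc.)
SELECT name, state_desc
FROM sys.databases
WHERE state_desc <> 'ONLINE'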

It is not possible to simply open and read an MDF file to find out the information stored in it. We can use the DBCC CHECKPRIMARYFILE command to read the primary file header and see the information stored there. Here is a quick demo of the various parameters and their output. I have found this quite powerful and thought it was worth a share.

First, let’s create the database with two data files and one log file.

CREATE DATABASE [SQLAuthority]
ON PRIMARY
(NAME = N'SQLAuthority_MDF', FILENAME = N'C:\Temp\SQLAuthority_MDF.mdf'),
(NAME = N'SQLAuthority_NDF', FILENAME = N'C:\Temp\SQLAuthority_NDF.ndf')
LOG ON
(NAME = N'SQLAuthority_log', FILENAME = N'C:\Temp\SQLAuthority_log.ldf')
GO

Let us detach the database using the following command:

sp_detach_db 'SQLAuthority'

Now let’s use DBCC CHECKPRIMARYFILE to read the file header and get the details.
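
The original post showed the commands and their output as screenshots; as a sketch, using the file paths from the script above, the basic call with option 0 looks like this:

-- Option 0: returns 1 if the file is the primary (MDF) file of a database, 0 otherwise
DBCC CHECKPRIMARYFILE ('C:\Temp\SQLAuthority_MDF.mdf', 0)
GO
DBCC CHECKPRIMARYFILE ('C:\Temp\SQLAuthority_NDF.ndf', 0)
GO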

We would see “1” in the output only for the first file, because that is the primary data file. Let’s run the same command with the other parameters and check the output.
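
A sketch of the same command with the other parameters, run against the primary file, would be:

-- Option 1: file IDs, filegroup IDs, logical and physical names of the files of the database
DBCC CHECKPRIMARYFILE ('C:\Temp\SQLAuthority_MDF.mdf', 1)
GO
-- Option 2: database name, version and collation stored in the primary file header
DBCC CHECKPRIMARYFILE ('C:\Temp\SQLAuthority_MDF.mdf', 2)
GO
-- Option 3: a subset of the option 1 output
DBCC CHECKPRIMARYFILE ('C:\Temp\SQLAuthority_MDF.mdf', 3)
GO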

As we can see, by using this command we can read the header of the file and get various properties of the primary file.

  1. Option 1: file IDs, filegroup IDs, logical names and physical names of the files belonging to the same database.
  2. Option 2: which database this file belongs to, the database version and the collation.
  3. Option 3: a subset of the option 1 output.

This doesn’t mean that we can take two different files of the same database from different servers and attach them together. This command is sometimes useful when you don’t know the logical file names of the files you have and want to use them (for example, when attach is not working due to corruption).

I hope this was an interesting option and you were able to learn something from it, as I did when first using it.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Take the Quiz for a chance to win a Quadcopter Drone

It has been a long time since we ran a quiz. So let us get ready for one. The quiz has two parts. You have to get both parts correct to win a Quadcopter with Camera (we will call it a drone). We will be giving away a total of 2 Quadcopters.

The quiz is extremely easy and I will ship the Drone anywhere in the world where Amazon will ship it.

Let us jump directly to the quiz. Please complete both parts of the contest.

Contest Part 1: Brain Teasers

Please execute the following script and answer the questions.

-- What will be the output of the following? and Why?
SELECT 28E3
-- What will be the output of the following? and Why?
SELECT 28F3
-- What will be the output of the following? and Why?
SELECT F328

Contest Part 2: Download and Activate Rapid SQL

Question: Download and Activate Rapid SQL.

Hint: You have to download and activate Rapid SQL. If you do not activate Rapid SQL, you will be disqualified from the contest. Why take the risk? Let us start!

That’s it!

Just answer the above questions in the comments area below, in the following format.

Remember:

  • Download RapidSQL from this link.
  • Use the comments area right below the blog to participate in the contest
  • Answer before November 21, 2014 midnight GMT.
  • The winners will be announced on December 8.
  • The winners will be selected randomly from all the valid answers.
  • All the valid answers will be kept hidden till December 24, 2014.
  • There will be a total of two winners.
  • The contest is open for any country of the world where Amazon ships products.

Reference: Pinal Dave (http://blog.sqlauthority.com)