SQL SERVER – What is Semantic Search Inside SQL Server?

Every now and then I bump into things that I had never even heard about when it comes to working with SQL Server. This time I came across the term Semantic Search and wondered where and when it came about. In this blog, I will lay the foundation for this topic, and later we will look at how it can be used in our environments.

Wikipedia defines Semantics as the study of meaning. Microsoft defines “Semantics” as providing higher-level insight into large quantities of data even across diverse unstructured data.

In SQL Server 2012, this meaning revolves around key phrases in documents and documents that share similar concepts. However, if we define extraction as “from unstructured data, create structured data that describes it”, then SQL Server already has three instances of extraction: full-text indexing, XML indexing, and spatial indexing (which takes unstructured spatial values as input and creates a structured index supporting spatial queries).

Full-text search uses an inverted index for storing keywords from documents. This can answer questions like “What documents contain this word?”, “What documents have this word NEAR that word?”, or “What documents have SQL Server in the title?”

While full-text search lets you query the words in a document, semantic search lets you query the meaning of the document. Semantic Search adds two additional index types as part of SQL Server’s vision for Rich Unstructured Data Services (RUDS).

  • A tag index – given a corpus of unstructured text documents, extract the key concepts in these documents, or more specifically, the key phrases.
  • A similarity index – utilizing the extracted key concepts to determine which concepts are related, using related concepts to infer related documents, and so on. An example of using similarity is finding resumes that match a job description.
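
As a preview, here is a sketch of how these two indexes are queried once configured; the table name Documents, the column DocumentContent, and the document key value 1 below are all hypothetical:

-- Tag index: top key phrases across the indexed documents
SELECT TOP (10) keyphrase, score
FROM SEMANTICKEYPHRASETABLE(dbo.Documents, DocumentContent)
ORDER BY score DESC;

-- Similarity index: documents most similar to document 1
SELECT TOP (10) matched_document_key, score
FROM SEMANTICSIMILARITYTABLE(dbo.Documents, DocumentContent, 1)
ORDER BY score DESC;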

SQL Server Semantic Search falls into the “Patterns/Outliers” segment. It is optimally designed to perform the extraction one time, but populate both the full-text index and the two semantic indexes, as configured. Therefore, it is possible to create a full-text index without semantic indexes, but it is not possible to create semantic indexes without also creating the full-text index.

Note: You cannot create one semantic index and not the other. You may use only one, but both are created during population when you specify semantic indexing.

If you are wondering whether this is enabled on your SQL Server 2012 or 2014 instance, you can run the following command to determine whether full-text and semantic search are installed. A return value of 1 indicates that these components are installed; a return value of 0 indicates that they are not.
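
-- Returns 1 if the Full-Text and Semantic Extractions for Search
-- components are installed; 0 if they are not
SELECT SERVERPROPERTY('IsFullTextInstalled');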


Before I write the subsequent blogs about what these are and how to enable them, I am interested in knowing: has anyone worked with or used Semantic Search with SQL Server in your applications?

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Script level upgrade for database ‘master’ failed – CREATE SCHEMA failed due to previous errors

SQL Server troubleshooting topics seem to be the most searched for on this blog. In keeping with this, today’s blog is also inspired by one such error message. Some time back I wrote the below blog, which talks about the script upgrade mode of SQL Server:

SQL SERVER – Login Failed For User – Reason Server is in Script Upgrade Mode

One of the readers reached out to me with the below errors in the ERRORLOG:

Setting object permissions…
Error: 2714, Severity: 16, State: 6.
There is already an object named ‘TargetServersRole’ in the database.
Error: 2759, Severity: 16, State: 0.
CREATE SCHEMA failed due to previous errors.
Error: 912, Severity: 21, State: 2.
Script level upgrade for database ‘master’ failed because upgrade step ‘sqlagent100_msdb_upgrade.sql’ encountered error 2714, state 6, severity 25. This is a serious error condition which might interfere with regular operation and the database will be taken offline. If the error happened during upgrade of the ‘master’ database, it will prevent the entire SQL Server instance from starting. Examine the previous errorlog entries for errors, take the appropriate corrective actions and re-start the database so that the script upgrade steps run to completion.
Error: 3417, Severity: 21, State: 3.
Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it. For more information about how to rebuild the master database, see SQL Server Books Online.

The above error is truly scary: “Restore master from a full backup, repair it, or rebuild it”. So I asked my friend to share the script ‘sqlagent100_msdb_upgrade.sql’ to check which query was failing. It was not very difficult to find (I searched for “Setting object permissions…”, which was printed in the ERRORLOG right before the error), and I found that the piece of the script below was the one failing.

IF (EXISTS (SELECT *
            FROM msdb.dbo.sysusers
            WHERE (name = N'TargetServersRole') AND
                  (issqlrole = 1)))
BEGIN
  -- If there are no members in the role, then drop and re-create it
  IF ((SELECT COUNT(*)
       FROM msdb.dbo.sysusers   su,
            msdb.dbo.sysmembers sm
       WHERE (su.uid = sm.groupuid) AND
             (su.name = N'TargetServersRole') AND
             (su.issqlrole = 1)) = 0)
  BEGIN
    EXECUTE msdb.dbo.sp_droprole @rolename = N'TargetServersRole'
    EXECUTE msdb.dbo.sp_addrole @rolename = N'TargetServersRole'
  END
END
ELSE
  EXECUTE msdb.dbo.sp_addrole @rolename = N'TargetServersRole'

To be very precise, adding the role was failing in the msdb database on his SQL Server instance. I asked him to run the sp_addrole statement from the script manually, and it reproduced the same error.

Now, since we were not able to connect to SQL Server, we needed to use trace flag 902 to bypass the upgrade script. Here are the steps we followed:

1. Added the startup parameter -T902 via SQL Server Configuration Manager.
2. Started the SQL Server service. It allowed us to connect because the trace flag bypasses script upgrade mode.
3. Found the schema/role TargetServersRole in msdb and made a note of its members, if any (see the query after these steps).
4. In his case, the schema was owned by someone else. I asked him to fix the ownership using the below T-SQL:
USE [msdb]
ALTER AUTHORIZATION ON ROLE::[TargetServersRole] TO [TargetServersRole]

5. After fixing the cause, we stopped SQL Server, removed the trace flag, and started it again.
6. This time the script executed successfully and SQL Server came out of script upgrade mode.
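
For step 3, here is a sketch using catalog views to check the schema owner and any role members (run while connected under the trace flag):

USE [msdb]
-- Who owns the TargetServersRole schema?
SELECT s.name AS schema_name, dp.name AS owner_name
FROM sys.schemas s
INNER JOIN sys.database_principals dp ON s.principal_id = dp.principal_id
WHERE s.name = N'TargetServersRole';
-- Does the TargetServersRole role have any members?
SELECT mp.name AS member_name
FROM sys.database_role_members drm
INNER JOIN sys.database_principals rp ON drm.role_principal_id = rp.principal_id
INNER JOIN sys.database_principals mp ON drm.member_principal_id = mp.principal_id
WHERE rp.name = N'TargetServersRole';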

Have you also encountered a similar issue?

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Taking Backup Without Consuming Disk Space

Long back, I learned this trick and found an interesting use for it. I totally understand that it’s very dangerous. Recently I saw someone getting bitten by it, so I thought of sharing.

Warning: Don’t do this in any production environment

The trick here is to take a backup on the NUL device. It’s not a typo, and there is no missing L in NUL. It is called the “null device”. As per Wikipedia –

The null device is a device file that discards all data written to it, but reports that the write operation succeeded. It is represented by NUL: or NUL on DOS.

Taking a backup on the NUL device is as good as taking a backup and deleting it. Since the backup is taken to a device, SQL Server sends it to the operating system the same way any backup would be sent; because it is the NUL device, the operating system discards the data and tells SQL Server that it was written successfully. You can think of it as writing directly to a recycle bin that is emptied immediately.

Coming to its innovative use: I have seen a few DBAs with a scheduled job that takes a backup of the transaction log to the NUL device using the below command.
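
A minimal sketch of such a command (the database name is illustrative):

-- The backup reports success, but the data is discarded by the OS
BACKUP LOG [MyDB] TO DISK = 'NUL'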


When I asked one of them why they were doing it, I got a very interesting answer.

Before moving to SQL Server 2012 AlwaysOn Availability Groups, I had a database which had always been in the “simple” recovery model. To put it in an AG, it must be in the “full” recovery model, and I know the reason as well. So due to the full recovery model, I am forced to take transaction log backups and burn up disk space with them. In reality, I don’t need them and I am OK with the last full backup, which I take once daily. I need the secondary replica for reporting purposes only. That’s the reason I am taking LOG backups to the NUL device. On a lighter note, I believe that Microsoft should have another recovery model – for example, “Simple-AvailabilityGroup” – which would function the same as the “simple” recovery model and discard log records after they have been applied to all secondary replicas. What do you think, Pinal?

Here is the message in the ERRORLOG when a backup is taken on NUL. Notice the disk path shown as ‘nul’.

Database backed up. Database: master, creation date(time): 2015/10/15(05:53:51), pages dumped: 909, first LSN: 6795:16:84, last LSN: 6795:72:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {‘nul’}). This is an informational message only. No user action is required.

I hope it’s clear that this command disturbs your disaster recovery plan: you might assume that backups are being taken, as shown in the SQL Server logs, but they are not present anywhere. Please be very careful in using this command.

Reference : Pinal Dave (http://blog.SQLAuthority.com)

Interview Question of the Week #042 – How Does the SPACE Function Work in SQL Server?

This question may sound strange, but I have personally observed it being asked in one of the interviews.

Question: How Does the SPACE Function Work in SQL Server?

Answer: The SPACE function accepts one integer parameter and returns a string consisting of as many spaces as the value of the parameter.
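
For instance, SPACE(5) returns a string of five spaces (the angle brackets below only make the spaces visible):

SELECT '<' + SPACE(5) + '>'; -- returns: <     >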

Observe the output of the following query:

USE AdventureWorks2014;
SELECT RTRIM(LastName) + ',' + SPACE(5) + ',' + LTRIM(FirstName)
FROM Person.Person
ORDER BY LastName, FirstName;

You will notice there is a gap of five spaces between the two commas.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Simple Way to Find Existence of Column in a Table

If you want to check whether a column already exists in a table, you can use system views like sys.columns or INFORMATION_SCHEMA.COLUMNS.

Let us create this dataset:
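
A minimal sketch of a test table (the names are illustrative):

CREATE TABLE dbo.TestTable (id INT, name VARCHAR(100));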


Suppose you want to find out whether a column named NAME exists and print a message. You can do it by using either of the following methods.
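
Here is a sketch of the two approaches, using the illustrative table above:

IF EXISTS (SELECT 1 FROM sys.columns
WHERE object_id = OBJECT_ID('dbo.TestTable') AND name = 'name')
PRINT 'Column exists';

IF EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo' AND TABLE_NAME = 'TestTable' AND COLUMN_NAME = 'name')
PRINT 'Column exists';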

But did you know there is a shorter way to do this? Use the COL_LENGTH system function.
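
A sketch with COL_LENGTH, using the same illustrative names:

IF COL_LENGTH('dbo.TestTable', 'name') IS NOT NULL
PRINT 'Column exists';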


What it does is return the defined length of the column in bytes. If the result is NULL, the column does not exist in the table; otherwise it exists.

It is simpler and faster.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – How Can InMemory Objects be Identified in SQL Server?

When SQL Server introduced the InMemory OLTP functionality, it was a great addition to the database engine. I have seen people use this capability in a limited fashion, for specific use case scenarios. Even for a seasoned developer, it is difficult to know what the consumers of memory are. In a recent conversation, I remember someone asking me: “Why is SQL Server using so much memory? Since you introduced InMemory OLTP, I suspect it is your new feature that is consuming all the memory on my servers. Is there any way to identify which InMemory objects are currently in memory?”

In this blog, let me walk through a step-by-step approach to figuring this out. We will create a new database for testing.

-- Let us create a small database for testing, with the In-Memory filegroup
-- (the database name and container path below are illustrative)
CREATE DATABASE InMemDemo;
ALTER DATABASE InMemDemo ADD FILEGROUP imfg CONTAINS MEMORY_OPTIMIZED_DATA;
ALTER DATABASE InMemDemo ADD FILE (NAME = 'imfg_dir', FILENAME = 'C:\Data\imfg_dir') TO FILEGROUP imfg;
GO
USE InMemDemo;
-- Create an InMemory table
CREATE TABLE dbo.tbl_im_oltp (
c1 INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1024),
c2 CHAR(1) NOT NULL,
c3 CHAR(8000) NOT NULL) WITH (MEMORY_OPTIMIZED = ON);

The next step is to insert a few rows of data into the InMemory table.

-- Insert 100 rows
DECLARE @i INT = 0;
WHILE (@i < 100)
BEGIN
INSERT INTO dbo.tbl_im_oltp VALUES (@i, 'a', REPLICATE('b', 8000));
SET @i += 1;
END

Now that we have created the table and loaded data, the next step is to find out which tables are part of our InMemory OLTP setup. We have a number of DMVs available which can be used to view this information.

SELECT OBJECT_NAME(object_id) AS table_name, memory_used_by_table_kb
FROM sys.dm_db_xtp_table_memory_stats;
SELECT memory_consumer_desc, allocated_bytes, used_bytes
FROM sys.dm_db_xtp_memory_consumers;

As you can see from the DMV output, we can identify the objects that are in memory at this point in time. Using DMVs for this requirement is the ideal approach.

The next step is to clean up our test objects.

-- Clean up task
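-- Assuming the illustrative database name created above
USE master;
GO
DROP DATABASE InMemDemo;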

Do let me know if you found this script useful. Have you ever had this requirement in your environments? How do you troubleshoot such InMemory table usage on your production servers?

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Interesting Observations Using MONEY Datatype

MONEY is one of the data types supported in SQL Server. Here are some interesting observations about it.

Observation 1: The MONEY datatype can accept up to four digits of scale.

DECLARE @money money
SET @money='$1,200.45679'
SELECT @money

which results in 1200.4567

Observation 2: If the scale value exceeds 4 digits, it is rounded to four digits

DECLARE @money money
SET @money=1200.45679
SELECT @money

which results in 1200.4568

Observation 3: MONEY is the only datatype that can accept formatted (comma-separated) numbers

DECLARE @money money
SET @money='1,200.45679'
SELECT @money

Observation 4: If you assign the same formatted string to a DECIMAL, NUMERIC, or FLOAT variable, you will get an error

DECLARE @money FLOAT
SET @money='1,200.45679'
SELECT @money

The error is

Msg 8114, Level 16, State 5, Line 3
Error converting data type varchar to float.

Observation 5: The MONEY datatype can also accept a currency symbol prefixed to the number

DECLARE @money money
SET @money='$1,200.45679'
SELECT @money

Observation 6: All commas are ignored by the MONEY datatype, regardless of their position

DECLARE @money money
SET @money='1,2,0,0.4,5,6,7,9'
SELECT @money

which results in 1200.4568

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Database Size Limitation in SQL Express

I have a tendency to write a number of interview-style questions, and those are quite popular on this blog. Having said that, sometimes we get asked tricky questions on a topic where we know the basics – but they still make us think twice. This blog was inspired by someone who quizzed me with a number of questions on SQL Server Express Edition, and I thought to put them up here in FAQ style for reference.

We all know that there is a limit in SQL Server Express Edition with respect to the size of the database. There has been much confusion around this area. This blog post is meant to clear up those doubts – though by no means is it exhaustive. I have kept the questions in Q&A format and will add more.

  1. What is the size limit you are talking about in SQL Server Express Edition?

That is a function of the version of SQL Server:
SQL Server 2000 Desktop Engine (MSDE) has a limit of 2 GB for data file(s) size.

SQL Server 2005 Express Edition has a limit of 4 GB for data file(s) size.

SQL Server 2008, 2008 R2, 2012, and 2014 Express Editions have a limit of 10 GB for data file(s) size.

  2. What error would I get if I cross the data size limits mentioned above?

You are likely to face errors in that scenario. Here are two common errors for reference:
CREATE DATABASE or ALTER DATABASE failed because the resulting cumulative database size would exceed your licensed limit of 10240 MB per database

The size in the error message depends on the limit for that version of SQL Server Express. The above message is from SQL Server 2008 R2 through SQL Server 2014.

In the ERRORLOG, you would see errors like the below when the limit is reached.
Could not allocate space for object ‘TableName’.’IndexName’ in database ‘DatabaseName’ because the ‘PRIMARY’ filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

  3. Is this size limit per instance or per database?

The size limit is per database, so if the database is composed of multiple data files, the limit applies across all data files in that single database.
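
To check how close each database is to the cap, one option (a sketch using sys.master_files, which reports size in 8 KB pages) is:

-- Total data file size (MB) per database; LOG files are excluded
SELECT DB_NAME(database_id) AS database_name,
SUM(size) * 8 / 1024 AS data_size_mb
FROM sys.master_files
WHERE type_desc = 'ROWS'
GROUP BY database_id;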

  4. Can I have two databases of 10 GB each in SQL Server 2014 Express?

Yes. As explained in #3, it is a per-database data file limit.

  5. Does this limit include the transaction log (LDF) file?

No. It’s only for data files.

  6. Does this limit include a FILESTREAM file container?

No. It’s only for data files.

  7. Is there any workaround to increase the limit?

There is no way to increase the file size limit itself. However, since FILESTREAM data is not part of the limit, it is possible to move blob data to FILESTREAM storage.

  8. Will my database become unusable?

No. Adding new data that requires a file size increase would fail with the error mentioned above, but SELECT statements can still run against the database.

  9. What if I want to upgrade from SQL Express to a full version of SQL Server?

That can be done using an “Edition Upgrade”. The below post talks about the EVAL edition, but the same steps can be applied for a SQL Express upgrade as well:

SQL SERVER – Evaluation Period Has Expired – How to Activate SQL Server?

  10. What else can be done?

Delete unwanted data and shrink the data file.

These are some of the common questions I have seen around SQL Server Express Edition and its file size limits. Do let me know if you want to know anything specific in this area for Express Editions. I will be more than happy to expand the FAQ list to include it. Drop a line via comments.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – Log Shipping Restore Job Error: The file is too recent to apply to the secondary database

If you are a DBA who has handled log shipping as a high availability solution, there are a number of common errors that, over a period of time, you become a pro at resolving. Here is one such common error which you must have seen:

2015-10-13 21:09:05.13     *** Error: The file ‘C:\LS_S\LSDemo_20151013153827.trn’ is too recent to apply to the secondary database ‘LSDemo’.(Microsoft.SqlServer.Management.LogShipping) ***
2015-10-13 21:09:05.13     *** Error: The log in this backup set begins at LSN 32000000047300001, which is too recent to apply to the database. An earlier log backup that includes LSN 32000000047000001 can be restored.
RESTORE LOG is terminating abnormally.(.Net SqlClient Data Provider) ***
The above error is shown in the failure history of the restore job. If the failure persists beyond the configured threshold, we would also start seeing the below error in the SQL ERRORLOG on the secondary:
2015-10-14 06:22:00.240 spid60       Error: 14421, Severity: 16, State: 1.
2015-10-14 06:22:00.240 spid60       The log shipping secondary database PinalServer.LSDemo has restore threshold of 45 minutes and is out of sync. No restore was performed for 553 minutes. Restored latency is 4 minutes. Check agent log and logshipping monitor information.

To start troubleshooting, we can look at the Job Activity Monitor on the secondary, where the restore job shows a failed state.

If you know SQL Server transaction log backup basics, you might be able to guess the cause. If we look closely at the error, it talks about an LSN mismatch. In most cases, a manual transaction log backup was taken. I also remember a few scenarios where a 3rd party tool had taken a transaction log backup of a database which was also part of a log shipping configuration.

Since we know the cause now, what we need to figure out is – where is that “out of band” backup? Here is the query which I had written in an earlier blog.

-- Assign the database name to variable below
DECLARE @db_name VARCHAR(100)
SELECT @db_name = 'LSDemo'
-- query
SELECT TOP (30) s.database_name
,m.physical_device_name
,CAST(CAST(s.backup_size / 1000000 AS INT) AS VARCHAR(14)) + ' ' + 'MB' AS bkSize
,CAST(DATEDIFF(second, s.backup_start_date, s.backup_finish_date) AS VARCHAR(4)) + ' ' + 'Seconds' TimeTaken
,CAST(s.first_lsn AS VARCHAR(50)) AS first_lsn
,CAST(s.last_lsn AS VARCHAR(50)) AS last_lsn
,CASE s.[type]
WHEN 'D' THEN 'Full'
WHEN 'I' THEN 'Differential'
WHEN 'L' THEN 'Transaction Log'
END AS BackupType
FROM msdb.dbo.backupset s
INNER JOIN msdb.dbo.backupmediafamily m ON s.media_set_id = m.media_set_id
WHERE s.database_name = @db_name
ORDER BY backup_start_date DESC

Once we run the query, we get the list of backups taken on the database. This information is picked up from the msdb database.

In the query output, the “out of band” backup stands out: look for a transaction log backup whose physical_device_name does not match the usual log shipping backup location.

Once we have found the “problematic” backup, we need to restore it manually on the secondary database. Make sure to use either the NORECOVERY or STANDBY option so that further logs can be restored. Once the file is restored, the restore job will be able to pick up from the same place and will catch up automatically.
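
A minimal sketch of the manual restore (the file path and name are illustrative):

-- Use NORECOVERY (or STANDBY) so subsequent log shipping restores can continue
RESTORE LOG [LSDemo]
FROM DISK = N'C:\Backups\LSDemo_OutOfBand.trn'
WITH NORECOVERY;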

What are the other problems you have seen with log shipping? If you can share some of the common errors, it would be of great help to others, and with your help I will try to blog about them too.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – ColumnStore Frequently Asked Queries

This blog talks about troubleshooting scenarios, in FAQ format, when dealing with ColumnStore indexes inside SQL Server. Use a combination of the trace flags, query hints, performance counters, and extended events discussed in various blogs to gather relevant data and troubleshoot the problem. Sometimes I have seen people use ColumnStore as a silver bullet to tackle performance problems on their upgraded SQL Server. The common question is: creating the ColumnStore index did not result in as much performance improvement as expected. What went wrong? Where do we start? If you ask me, the diagnostic effort may include asking some of the following questions:

Did the optimizer use the ColumnStore index? If not, why not?

A user can determine whether a ColumnStore index was used in the query plan by looking at SHOWPLAN. A new property, Storage, can have one of two values: RowStore or ColumnStore. Storage = ColumnStore indicates that the ColumnStore index was used in the query plan.

The query optimizer makes a cost-based decision regarding whether to use the ColumnStore index as an access method for a given query. There are certain limitations on ColumnStore use to be aware of:

  • Use of the ColumnStore index can be forced (or avoided) by using a table hint (for example, FROM t WITH (INDEX(myColumnStoreIndex))); see the sketch after this list.
  • The ColumnStore index does not support SEEK. If the query is expected to return a small fraction of the rows, the optimizer is *unlikely* to choose the ColumnStore index.
  • If the table hint FORCESEEK is used, the optimizer will not consider the ColumnStore index.
  • If a hint to use the ColumnStore index is combined with FORCESEEK, the optimizer will not be able to generate a plan and the query will fail and return an error message.
  • If SET FORCEPLAN or the FORCE ORDER hint is used, the optimizer will honor the hint. If the optimizer would otherwise have used the starjoin optimization with the starjoin bitmaps, using the hints can cause performance to be worse than without the hints. Batch processing may be used, but is not guaranteed.
  • A query hint IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX will prevent use of any ColumnStore index on a per-query basis.
  • If TF 9453 is turned on, the query optimizer will not choose a ColumnStore index (and will not use batch mode processing) for any query.
  • If TF 9462 is turned on, the query optimizer will not choose a ColumnStore index unless a query hint explicitly specifies a ColumnStore index.
  • If TF 9453 and TF 9462 are both on, the query optimizer will not choose a ColumnStore index. If a query hint is added to specify a ColumnStore index, the query will fail with error message 8622, which states that the query processor could not produce a query plan because of the hints defined in the query.
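
For reference, here is a sketch of the hint syntax mentioned above (the table and index names are illustrative):

-- Force the optimizer to use a specific nonclustered ColumnStore index
SELECT SUM(f.SalesAmount)
FROM dbo.FactSales AS f WITH (INDEX (NCCI_FactSales));

-- Prevent use of any nonclustered ColumnStore index for this query only
SELECT SUM(f.SalesAmount)
FROM dbo.FactSales AS f
OPTION (IGNORE_NONCLUSTERED_COLUMNSTORE_INDEX);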

The next question to ask would be, was the query executed using batch processing?

A user can determine whether batch processing was used for a particular operator by looking at SHOWPLAN. EstimatedExecutionMode can have one of two values, row or batch.

EstimatedExecutionMode = batch indicates that batch processing was chosen in the query plan. ActualExecutionMode also can have one of two values, either row or batch. If EstimatedExecutionMode = batch and ActualExecutionMode = row, then at runtime the query could not be executed in batch mode, indicating that not enough memory was available for batch processing (or there were not enough threads available for parallel execution). The combination EstimatedExecutionMode = row and ActualExecutionMode = batch will never occur. For a complex query, some parts may execute in batch mode and some parts in row mode.

The next analysis would be: did the ColumnStore index provide good compression for my data?

The degree to which the data can be compressed is highly data-dependent. A column with a large number of unique strings cannot be compressed as much as a column with many duplicate string values. Similarly, numeric data that cover a large range of values cannot be as efficiently encoded as can numeric data that cover a more restricted range of values. For a given set of data, including fewer columns in the ColumnStore index will usually result in better compression of the columns in the index as well as resulting in a smaller index by virtue of having less data (fewer columns). However, it is generally recommended that all the columns in a table be included in the ColumnStore index to ensure that all queries will benefit from the use of the index.

The last question to ask here would be, did the query optimizer find a good plan?

The query plan is displayed in Showplan as usual. Use of the batch/ColumnStore plan for star join queries can be determined by looking at the access method (was a ColumnStore index used for the fact table?), the presence of one or more Batch Hash Table Build operators, and ActualExecutionMode = Batch.

Reference: Pinal Dave (http://blog.sqlauthority.com)