Interview Question of the Week #042 – How Does SPACE Function Works in SQL Server?

This question may sound strange but I have personally observed this being asked in one of the interview.

Question: How Does SPACE Function Works in SQL Server?

Answer: SPACE function accepts one numeric parameter and generates as many as space as a value of the parameter.

Observe the output of the following query:

USE AdventureWorks2014;
GO
SELECT RTRIM(LastName) + ',' + SPACE(5) + ',' + LTRIM(FirstName)
FROM Person.Person
ORDER BY LastName, FirstName;
GO

You will notice there is gap of five space between two commas.

spacefunction Interview Question of the Week #042   How Does SPACE Function Works in SQL Server?

Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #041 – What is a Merge Join?

Question: What is a Merge Join?

Answer: The Merge Join transformation provides an output that is generated by joining two sorted data sets using a FULL, LEFT, or INNER join. The Merge Join transformation requires that both inputs be sorted and that the joined columns have matching meta-data. The user cannot join a column that has a numeric data type with a column that has a character data type. If the data have a string data type, the length of the column in the second input must be less than or equal to the length of the column in the first input with which it is merged.

USE pubs
GO
SELECT a.au_id
FROM authors a
JOIN titleauthor b ON a.au_id = b.au_id
OPTION (MERGE JOIN)
GO

Merge join itself is very fast, but it can be an expensive choice if sort operations are required. However, if the data volume is large and the desired data can be obtained presorted from existing B-tree indexes, merge join is often the fastest available join algorithm. Because the query optimizer usually selects the best execution plan for a given select statement, it is not necessary to enforce the desirable join type, but sometimes it can be useful. User can enforce the desirable join type by using the OPTION clause.

Following query will be benefited by MERGE JOIN because authors and titleauthor both has a primary key index on au_id. Due to Primary Key on au_id it is physically sorted by au_id.

Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #039 – What is Included Column Index in SQL Server 2005 and Onwards?

You may find it interesting that not many people still know what is Included Index in SQL Server. I have seen it personally around me and also have seen at

Question: What is Included Column Index?

Answer: 

In SQL Server 2005, the functionality of non-clustered indexes is extended by adding non-key columns to the leaf level of the non-clustered index. Non-key columns, can help to create cover indexes.By including non-key columns, you can create non-clustered indexes that cover more queries. The Database Engine does not consider non-key columns when calculating the number of index key columns or index key size. Non-key columns can be included in non-clustered index to avoid exceeding the current index size limitations of a maximum of 16 key columns and a maximum index key size of 900 bytes. Another advantage is that using non-key column in the index we can have index data types not allowed as index key columns generally.

In following example column FileName is varchar(400), which will increase the size of the index key bigger than it is allowed. If we still want to include in our cover index to gain performance we can do it by using the Keyword INCLUDE.

USE AdventureWorks
GO
CREATE INDEX IX_Document_Title
ON Production.Document (Title, Revision)
INCLUDE (FileName)

Non-key columns can be included only in non-clustered indexes. Columns can’t be defined in both the key column and the INCLUDE list. Column names can’t be repeated in the INCLUDE list. Non-key columns can be dropped from a table only after the non-key index is dropped first. For Included Column Index to exist there must be at least one key column defined with a maximum of 16 key columns and 1023 included columns.

Avoid adding unnecessary columns. Adding too many index columns, key or non-key as they will affect negatively on performance. Fewer index rows will fit on a page. This could create I/O increases and reduced cache efficiency. More disk space will be required to store the index. Index maintenance may increase the time that it takes to perform modifications, inserts, updates, or deletes, to the underlying table or indexed view.

Another example to test:
Create following Index on Database AdventureWorks in SQL SERVER 2005

USE AdventureWorks
GO
CREATE NONCLUSTERED INDEX IX_Address_PostalCode
ON Person.Address (PostalCode)
INCLUDE (AddressLine1, AddressLine2, City, StateProvinceID)
GO

Test the performance of following query before and after creating Index. The performance improvement is significant.

SELECT AddressLine1, AddressLine2, City, StateProvinceID, PostalCode
FROM Person.Address
WHERE PostalCode BETWEEN '98000'
AND '99999';
GO

Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #038 – What is Left Semi Join Showplan Operator?

It is very common of interviewers to ask questions which are a bit off and sometimes not used in daily life.  Here is such question I heard the other dya.

Question: What is Left Semi Join Showplan Operator?

Answer: 

There are few interesting kinds of joint operations exists when execution plan is displayed in text format.

Left Semi Join Showplan Operator
The Left Semi Join operator returns each row from the first (top) input when there is a matching row in the second (bottom) input. If no join predicate exists in the Argument column, each row is a matching row.

Left Anti Semi Join Showplan Operator
The Left Anti Semi Join operator returns each row from the first (top) input when there is no matching row in the second (bottom) input. If no join predicate exists in the Argument column, each row is a matching row.

Right Anti Semi Join Showplan Operator

The Right Anti Semi Join operator outputs each row from the second (bottom) input when a matching row in the first (top) input does not exist. A matching row is defined as a row that satisfies the predicate in the Argument column (if no predicate exists, each row is a matching row).

Right Semi Join Showplan Operator
The Right Semi Join operator returns each row from the second (bottom) input when there is a matching row in the first (top) input. If no join predicate exists in the Argument column, each row is a matching row.

Following script will display Left Anti Semi Join Showplan Operator in the result pane.
USE AdventureWorks;
GO
SET SHOWPLAN_TEXT ON
GO
SELECT ProductID
FROM Production.Product
WHERE ProductID
NOT IN (
SELECT ProductID
FROM Production.WorkOrder);
GO
SET SHOWPLAN_TEXT OFF
GO

leftsemijoin Interview Question of the Week #038   What is Left Semi Join Showplan Operator?

Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #037 – What are the Properties of Relational Tables?

The other day in the interview, I heard this question and it sent me back to my college days as I used to hear similar questions from my professors. Here is the same question again and its answer.

Question: What are the properties of the relational tables?

busstra Interview Question of the Week #037   What are the Properties of Relational Tables?

Answers: Relational tables have six properties:

Values Are Atomic
This property implies that columns in a relational table are not repeating group or arrays. The key benefit of the one value property is that it simplifies the data manipulation logic. Such tables are referred to as being in the “first normal form” (1NF).

Column Values Are of the Same Kind
In relational terms this means that all values in a column come from the same domain. A domain is a set of values which a column may have. This property simplifies data access because developers and users can be certain of the type of data contained in a given column. It also simplifies data validation. Because all values are from the same domain, the domain can be defined and enforced with the Data Definition Language (DDL) of the database software.

Each Row is Unique
This property ensures that no two rows in a relational table are identical; there is at least one column, or set of columns, the values of which uniquely identify each row in the table. Such columns are called primary keys. This property guarantees that every row in a relational table is meaningful and that a specific row can be identified by specifying the primary key value.

The Sequence of Columns is Insignificant
This property states that the ordering of the columns in the relational table has no meaning. Columns can be retrieved in any order and in various sequences. The benefit of this property is that it enables many users to share the same table without concern of how the table is organized. It also permits the physical structure of the database to change without affecting the relational tables.

The Sequence of Rows is Insignificant
This property is analogous to the one above, but applies to rows instead of columns. The main benefit is that the rows of a relational table can be retrieved in different order and sequences. Adding information to a relational table is simplified and does not affect existing queries.

Each Column Has a Unique Name
Because the sequence of columns is insignificant, columns must be referenced by name and not by position. A column name need not be unique within an entire database, but only within the table to which it belongs.

Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #036 – What is the difference between @@DATEFIRST and SET DATEFIRST?

Question: What is the difference between @@DATEFIRST and SET DATEFIRST with respect to SQL Server?

Answer: SQL Server with US English as default language, SQL Server sets DATEFIRST to 7 (Sunday) by default. We can reset any day as the first day of the week using

SET DATEFIRST 5

This will set Friday as the first day of the week.
@@DATEFIRST returns the current value, for the session, of SET DATEFIRST.

SET LANGUAGE italian
GO
SELECT @@DATEFIRST
GO
----This will return result as 1(Monday)
SET LANGUAGE us_english
GO
SELECT @@DATEFIRST
GO
----This will return result as 7(Sunday)

In this way @@DATEFIRST and SET DATEFIRST are related. When I learned about this feature I was very glad as our company has started to server global clients and simple feature like this helps a lot to avoid confusion.

Reference: Pinal Dave (http://blog.sqlauthority.com)

SQL SERVER – How to View Objects in mssqlsystemresource System Database?

The series of interview questions are always fun. Sometimes I get to hear strange questions and are worth a detailed post. This blog is in continuation to one such question that landed into my Inbox which I personally felt had to have little explanation. During interviews there would be at least one question asked about system databases and someone might ask about hidden system database is – mssqlsystemresource. There are a few facts which are known to most of us:

  1. It is a hidden system database. ID of this database is 32768.
  2. It stores schema of system object.
  3. It helps in faster patching because there is no need to ALTER system objects. Since the schema is stored in this database, it would be just replacing mdf and ldf files.

There are a few mythsmyths about this database, none of below is true.

  1. It can be used to rollback service pack. Just replace old files and you are done. This is INCORRECT! Service pack is not just this database replacement.
  2. Backup and restore needed for this database. This is also INCORRECT! This database is just like any other binary like exe and DLL which are needed for SQL Server to run. In case the files are damaged, you need same version of the file from some other SQL instance.

Here are few blogs which I have written earlier:

SQL SERVER – mssqlsystemresource – Resource Database

SQL SERVER – Location of Resource Database in SQL Server Editions

If you want to see the objects under this database, there is a little trick.

  • Stop the SQL Server service
  • Copy the mssqlsystemresource.mdf and mssqlsystemresource.ldf to a new path. Location of these files are listed in my earlier blog listed above,
  • Start the SQL Server Service.
  • Use the following command to attach the data and log file as a new user database.

USE [master]
GO
CREATE DATABASE [mssqlsystemresource_copy] ON
(FILENAME = N'E:\temp\mssqlsystemresource.mdf' ),
(
FILENAME = N'E:\temp\mssqlsystemresource.ldf' )
FOR ATTACH
GO

  • Here is what we would see in management studio. Note that there are NO tables in this database. Just the views and procedures which are in sys schema. Generally they are visible under system views in other databases

sysres 01 SQL SERVER   How to View Objects in mssqlsystemresource System Database?

We can also have a look at the definition of views. Note that this option won’t come for system objects.

sysres 02 SQL SERVER   How to View Objects in mssqlsystemresource System Database?

In practical scenario, there is no need to ever do this but it is always good for a DBA to know what is happening under the hood in SQL Server. I hope this will give you more opportunity to explore.

Note: Please DONOT change the system ResourceDB or replace the same in production environments. It is not advisable. This blog has to be seen as educational and for exploration purposes only.

Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #035 – Different Ways to Identify the Open Transactions

Question: What are different ways to identify open transactions in SQL Server?

Answer: There are many ways to identify the open transactions. If there are some transactions which were started and not yet committed or rollback, you can use any of these to find them

Use DBCC command

There is a specific DBCC command for this.

DBCC OPENTRAN

Use SYS.SYSPROCESSES view

SELECT * FROM SYS.SYSPROCESSES WHERE OPEN_TRAN = 1

This will list out more details about the open transactions

Use SYS.DM_TRAN_SESSION_TRANSACTIONS view

SELECT * FROM SYS.DM_TRAN_SESSION_TRANSACTIONS

Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #034 – What is the Difference Between Distinct and Group By

Question: What is the difference between DISTINCT and GROUP BY?

Answer:

A DISTINCT and GROUP BY usually generate the same query plan, so performance should be the same across both query constructs. GROUP BY should be used to apply aggregate operators to each group. If all you need is to remove duplicates, then use DISTINCT. If you are using sub-queries execution plan for that query varies so in that case you need to check the execution plan before making decision of which is faster.

Example of DISTINCT:

SELECT DISTINCT Employee, Rank
FROM Employees

Example of GROUP BY:

SELECT Employee, Rank
FROM Employees
GROUP BY Employee, Rank

Example of GROUP BY with aggregate function:

SELECT Employee, Rank, COUNT(*) EmployeeCount
FROM Employees
GROUP BY Employee, Rank

Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #033 – How to Invalidate Procedure Cache of SQL Server?

icon clean Interview Question of the Week #033   How to Invalidate Procedure Cache of SQL Server?Question: How to invalidate all the stored procedure plans in SQL Server?

Answer: DBCC FREEPROCCACHE will invalidate all stored procedure plans that the optimizer has cached in memory and force SQL Server to compile new plans the next time those procedures are run.

Remember not to run this on a production server without proper planning as once you run this DBCC command it will invalidate all the stored procedure plans. This means if the user runs this during pick time, many Stored Procedures will execute at the same time and create a resource bottleneck.

Reference: Pinal Dave (http://blog.sqlauthority.com)