Interview Question of the Week #029 – Difference Between CHARINDEX vs PATINDEX

Question: What is the difference between CHARINDEX vs PATINDEX?

Answer: We can use either CHARINDEX or PATINDEX to search within a text column in SQL Server. Both functions return the starting position of the pattern you specify.


Both functions take two arguments. With PATINDEX, you must include percent signs before and after the pattern, unless you are looking for the pattern at the very beginning (omit the leading %) or the very end (omit the trailing %) of the column. With CHARINDEX, the pattern cannot include wildcard characters. The second argument is a character expression, usually a column name, in which SQL Server searches for the specified pattern.

Example of CHARINDEX:
USE AdventureWorks2014;
GO
SELECT CHARINDEX('important', DocumentSummary),
DocumentSummary
FROM Production.Document
WHERE Title = 'Repair and Service Guidelines';
GO

Example of PATINDEX:
USE AdventureWorks2014;
GO
SELECT PATINDEX('%imp_rtant%', DocumentSummary),
DocumentSummary
FROM Production.Document
WHERE Title = 'Repair and Service Guidelines';
GO
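
If the pattern should match only at the start of the string, omit the leading percent sign, as noted above. A minimal sketch against the same table (the pattern text here is illustrative):

USE AdventureWorks2014;
GO
-- Illustrative pattern: with no leading %, PATINDEX returns 1 only when
-- DocumentSummary begins with the pattern, and 0 otherwise
SELECT PATINDEX('It is%', DocumentSummary),
DocumentSummary
FROM Production.Document
WHERE Title = 'Repair and Service Guidelines';
GO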

Summary:
PATINDEX is CHARINDEX plus wildcard search. Use whichever of them fits your business need.

Reference: Pinal Dave (http://blog.SQLAuthority.com)

Interview Question of the Week #028 – Few SQL Best Practices

Just last week, I moderated a SQL Server interview and heard a very interesting conversation.

Question – What best practices are you following for SQL Server?

When I heard this question, I thought the candidate would answer many different things, but he was stunned and had nothing much to add to the conversation. That reminded me of one of my older articles, which I believe still adds quite a lot of value to this conversation. Here is the article as an answer to the question.

Answer – 

1. Know your data and business application well.
Familiarize yourself with the data and how the application uses it; you must be aware of the data volume and distribution in your database.

2. Test your queries with realistic data.
A SQL statement tested with unrealistic data may behave differently when used in production. To ensure rigorous testing, the data distribution in the test environment must also closely resemble that in the production environment.

3. Write identical SQL statements in your applications.
Take full advantage of stored procedures and functions wherever possible. The benefit is a performance gain, as they are precompiled.
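
As a minimal sketch of the idea (the procedure name and returned columns below are illustrative, not from the original article):

-- Hypothetical procedure wrapping a frequently repeated query
CREATE PROCEDURE dbo.GetEmployeesByTitle
@JobTitle NVARCHAR(50)
AS
BEGIN
SELECT BusinessEntityID, JobTitle, HireDate
FROM HumanResources.Employee
WHERE JobTitle = @JobTitle;
END
GO
-- Every caller now issues the identical statement and reuses its cached plan
EXEC dbo.GetEmployeesByTitle @JobTitle = 'Design Engineer';
GO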

4. Use indexes on the tables carefully.
Be sure to create all the necessary indexes on the tables. However, too many of them can degrade performance.

5. Make an indexed path available.
To take advantage of indexes, write your SQL in such a manner that an indexed path is available to it. Using SQL hints is one of the ways to ensure the index is used.
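
For instance, a table hint can force a particular index. A minimal sketch (the index name below is an assumption; verify the execution plan before resorting to hints):

-- Forces the optimizer to use the named index (index name is an assumption)
SELECT FirstName, LastName
FROM Person.Person WITH (INDEX (IX_Person_LastName_FirstName_MiddleName))
WHERE LastName = 'Smith';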

6. Understand the Optimizer.
Understand how the optimizer uses indexes and how it processes the WHERE, ORDER BY, and HAVING clauses.

7. Think globally when acting locally.
Any changes you make in the database to tune one SQL statement may affect the performance of other statements used by applications and users.

8. The WHERE clause is crucial.
The following WHERE clause constructs would not use the index access path even if an index is available:
* Table1Col1 (comparison operator such as >, >=, <=) Table1Col2, i.e. comparing two columns
* Table1Col1 IS (NOT) NULL
* Table1Col1 NOT IN (value1, value2)
* Table1Col1 != expression
* Table1Col1 LIKE '%pattern%'
* NOT EXISTS subquery
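
Where possible, rewrite such predicates into a form that can use an index. A small sketch using the illustrative names from the list above:

-- Not index-friendly: the leading wildcard forces a scan
SELECT * FROM Table1 WHERE Table1Col1 LIKE '%pattern%';
-- Index-friendly: a trailing wildcard alone lets the optimizer seek
-- on an index over Table1Col1
SELECT * FROM Table1 WHERE Table1Col1 LIKE 'pattern%';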

9. Use WHERE instead of HAVING for record filtering.
Avoid using the HAVING clause along with GROUP BY on an indexed column.
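
A minimal sketch of the difference, using the AdventureWorks employee table that appears later in this series:

-- WHERE filters rows before grouping, so fewer rows are grouped
SELECT JobTitle, COUNT(*) AS EmployeeCount
FROM HumanResources.Employee
WHERE JobTitle LIKE 'S%'
GROUP BY JobTitle;
-- HAVING filters after grouping: every row is grouped first, then discarded
SELECT JobTitle, COUNT(*) AS EmployeeCount
FROM HumanResources.Employee
GROUP BY JobTitle
HAVING JobTitle LIKE 'S%';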

10. Specify the leading index columns in WHERE clauses.
For a composite index, the query would use the index as long as the leading column of the index is specified in the WHERE clause.
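
For example, with a hypothetical composite index on (LastName, FirstName), only the first query below can seek on it:

-- Hypothetical index: CREATE INDEX IX_Name ON Customers (LastName, FirstName);
-- Uses the index: the leading column LastName appears in the WHERE clause
SELECT * FROM Customers WHERE LastName = 'Smith';
-- Cannot seek on the index: the leading column LastName is not specified
SELECT * FROM Customers WHERE FirstName = 'John';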

11. Evaluate index scan vs. full table scan. (Index Only Searches Vs Large Table Scan, Minimize Table Passes)
If you are selecting more than about 15 percent of the rows from a table, a full table scan is usually faster than an index access path. An index is also not used if SQL Server has to perform implicit data conversion. When the percentage of table rows accessed is 15 percent or less, index access tends to work better, even though it costs multiple logical reads per row accessed (index pages plus data pages), whereas a full table scan can read all the rows in a block in one logical read.

12. Use ORDER BY for index scan.
The SQL Server optimizer will use an index scan if the ORDER BY clause is on an indexed column. The following query illustrates this point.
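
The original post's query was not preserved in this copy; here is a minimal sketch of the idea (the table and its index on LastName are assumptions based on AdventureWorks2014):

-- ORDER BY on an indexed column lets the optimizer read the index in order,
-- avoiding a separate sort step in the plan
SELECT LastName, FirstName
FROM Person.Person
ORDER BY LastName;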

13. Minimize table passes.
Usually, reducing the number of table passes in a SQL query results in better performance.

14. Join tables in the proper order.
Always perform the most restrictive search first to filter out the maximum number of rows in the early phases of a multiple table join. This way, the optimizer will have to work with fewer rows in the subsequent phases of join, improving performance.

15. Redundancy is good in where condition.
Provide as much information as possible in the WHERE clause. It will help the optimizer to clearly infer conditions.

16. Keep it simple, stupid.
Very complex SQL statements can overwhelm the optimizer; sometimes writing multiple, simpler SQL will yield better performance than a single complex SQL statement.

17. You can reach the same destination in different ways.
Each SQL may use a different access path and may perform differently.

18. Reduce network traffic and increase throughput.
Using T-SQL batches instead of multiple individual SQL statements can achieve better performance as well as reduce network traffic. Stored procedures are better still, as they are stored in SQL Server and precompiled.

19. Better Hardware.
Better hardware always helps performance. SCSI drives, a RAID 10 array, multi-processor CPUs, and a 64-bit operating system can improve performance by a great amount.

20. Avoid Cursors.
Using SQL Server cursors can result in some performance degradation in comparison with set-based SELECT statements. Try to use a correlated subquery or derived tables if you need to perform row-by-row operations.
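
A small sketch of the set-based alternative (table and column names are illustrative): instead of looping over rows with a cursor, one UPDATE with a correlated subquery does the same work in a single statement.

-- Illustrative schema: dbo.Orders(OrderID, TotalDue), dbo.OrderDetail(OrderID, LineTotal)
UPDATE o
SET o.TotalDue = (SELECT SUM(d.LineTotal)
FROM dbo.OrderDetail d
WHERE d.OrderID = o.OrderID)
FROM dbo.Orders o;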

Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #027 – Move TempDB from One Drive to Another Drive When Low Disk Space

I am often invited to attend various interviews, and once in a while I see practical questions discussed in the interview. Here is one important interview question related to TempDB.

Question – “What will you do when your log file of TempDB is full?”

Answer – The best solution when the log file of TempDB is full is to move it to a new drive. Here are more details about the same.

Make sure that TempDB is set to autogrow and do not set a maximum size for TempDB. If the current drive is too full to allow autogrow events, then arrange a bigger drive, or add files to TempDB on another device (using ALTER DATABASE, as described below) and allow those files to autogrow.

Move TempDB from one drive to another drive. There are two major reasons why TempDB may need to be moved from one drive to another:
1) TempDB grows bigger and the existing drive does not have enough space.
2) Moving TempDB files to a different physical drive helps to improve database disk reads, as the drives can be read simultaneously.

Follow the directions below exactly to move the data and log files from one drive (c:) to others (d: and e:).

Open Query Analyzer and connect to your server. Run this script to get the names of the files used for TempDB.

USE TempDB
GO
EXEC sp_helpfile
GO

Results will be something like:

name    fileid  filename                                                      filegroup  size
------- ------  ------------------------------------------------------------  ---------  --------
tempdev 1       C:\Program Files\Microsoft SQL Server\MSSQL\data\tempdb.mdf   PRIMARY    16000 KB
templog 2       C:\Program Files\Microsoft SQL Server\MSSQL\data\templog.ldf  NULL       1024 KB

The output also includes other information related to the database. The names of the files are usually tempdev and templog by default; these names will be used in the next statement. Run the following code to move the mdf and ldf files.

USE MASTER
GO
ALTER DATABASE TempDB MODIFY FILE
(NAME = tempdev, FILENAME = 'd:\data\tempdb.mdf')
GO
ALTER DATABASE TempDB MODIFY FILE
(NAME = templog, FILENAME = 'e:\data\templog.ldf')
GO

The definition of TempDB is now changed. However, no changes are made to TempDB until SQL Server restarts. Stop and restart SQL Server, and it will create the TempDB files in the new locations.
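
The other option mentioned earlier, adding a file to TempDB on a different drive instead of moving the existing ones, would look something like this (file name, path, and sizes are illustrative):

USE MASTER
GO
-- Adds a second data file to TempDB on another drive (values are illustrative)
ALTER DATABASE TempDB ADD FILE
(NAME = tempdev2, FILENAME = 'd:\data\tempdb2.ndf', SIZE = 16MB, FILEGROWTH = 10%)
GO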


Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #026 – 64 Bit Vs 32 Bit Confusion

I often see people getting lost in conversations about 64-bit and 32-bit operating systems and applications. Plenty of individuals are still not sure what the difference between them is, as well as what runs on what platform. Here are a few very simple questions related to Windows OS and SQL Server.


Question: Can you run SQL SERVER 32 bit on Windows 32 bit OS?

Answer: Yes

Question: Can you run SQL SERVER 64 bit on Windows 32 bit OS?

Answer: No

Question: Can you run SQL SERVER 32 bit on Windows 64 bit OS?

Answer: Yes

Question: Can you run SQL SERVER 64 bit on Windows 64 bit OS?

Answer: Yes

There you go. In simple words, you can run pretty much everything on a 64-bit Windows OS (32-bit SQL Server runs on it through the WOW64 subsystem). However, a 32-bit Windows OS can't run 64-bit SQL Server.

Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #025 – How to Implement Paging in SQL Server?

Here is a very popular question when a SQL Server interview includes a live coding example.

Question: How to implement paging in SQL Server? Explain with the help of script.

Answer: Here is the script to implement paging in SQL Server 2005.

USE AdventureWorks
GO
DECLARE @StartRow INT
DECLARE @EndRow INT
SET @StartRow = 120
SET @EndRow = 140
SELECT FirstName, LastName, EmailAddress
FROM (
SELECT PC.FirstName, PC.LastName, PC.EmailAddress,
ROW_NUMBER() OVER(
ORDER BY PC.FirstName, PC.LastName, PC.ContactID) AS RowNumber
FROM Person.Contact PC) PersonContact
WHERE RowNumber > @StartRow
AND RowNumber < @EndRow
ORDER BY FirstName, LastName, EmailAddress
GO

The above query uses the ROW_NUMBER() function, but you can also use a CTE to accomplish the same. Here is the script for the same:

USE AdventureWorks
GO
DECLARE @StartRow INT
DECLARE @EndRow INT
SET @StartRow = 120;
SET @EndRow = 140;
WITH PersonContact AS
(
SELECT PC.FirstName, PC.LastName, PC.EmailAddress,
ROW_NUMBER() OVER(
ORDER BY PC.FirstName, PC.LastName,PC.ContactID) AS RowNumber
FROM Person.Contact PC)
SELECT FirstName, LastName, EmailAddress
FROM PersonContact
WHERE RowNumber > @StartRow
AND RowNumber < @EndRow
ORDER BY FirstName, LastName, EmailAddress
GO
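
On SQL Server 2012 and later, the same paging can be written more directly with OFFSET...FETCH; this was not part of the original 2005-era answer, but is worth knowing:

USE AdventureWorks
GO
-- Skips the first 120 rows and returns the next 19,
-- matching the RowNumber > 120 AND RowNumber < 140 range above
SELECT PC.FirstName, PC.LastName, PC.EmailAddress
FROM Person.Contact PC
ORDER BY PC.FirstName, PC.LastName, PC.ContactID
OFFSET 120 ROWS FETCH NEXT 19 ROWS ONLY;
GO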


Reference: Pinal Dave (http://blog.SQLAuthority.com)

Interview Question of the Week #024 – What is the Best Recovery Model?

Here is one of the most popular questions I often see people asking on the internet.

Question: What is the best recovery model for my database?

Answer: Every situation is different and each situation has different needs for the recovery model.

SQL Server offers three recovery models: full recovery, simple recovery and bulk-logged recovery. The recovery model determines how much data loss is acceptable and whether and how transaction logs can be backed up.

Select Simple Recovery Model if:
* Your data is not critical.
* Losing all transactions since the last full or differential backup is not an issue.
* Data is derived from other data sources and is easily recreated.
* Data is static and does not change often.

-- Set the recovery model to Simple
ALTER DATABASE AdventureWorks SET RECOVERY SIMPLE
GO

Select Bulk-Logged Recovery Model if:
* Data is critical, but logging large data loads bogs down the system.
* Most bulk operations are done off hours and do not interfere with normal transaction processing.
* You need to be able to recover to a point in time.

-- Set the recovery model to Bulk Logged
ALTER DATABASE AdventureWorks SET RECOVERY BULK_LOGGED
GO

Select Full Recovery Model if:
* Data is critical and no data can be lost.
* You always need the ability to do a point-in-time recovery.
* Bulk-logged activities are intermixed with normal transaction processing.
* You are using replication and need the ability to resynchronize all databases involved in replication to a specific point in time.

-- Set the recovery model to Full
ALTER DATABASE AdventureWorks SET RECOVERY FULL
GO

You can switch from any recovery model to another recovery model, but prior to or after the switch, you may need to issue additional transaction log or full backups to ensure you have a complete backup set.
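
To check the current recovery model of your databases before and after switching, you can query sys.databases:

-- Shows the current recovery model of every database on the instance
SELECT name, recovery_model_desc
FROM sys.databases;
GO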

Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #023 – Error Handling with TRY…CATCH

Here is one of the most popular questions I see people discussing in SQL interviews.

Question – How do you handle errors with the help of TRY…CATCH?

Answer –

TRY...CATCH helps you separate the action code from the error-handling code. The code meant for the action is enclosed in the TRY block, and the code for error handling is enclosed in the CATCH block. If the code within the TRY block fails, control automatically jumps to the CATCH block, where the transaction can be rolled back and execution can resume. In addition, the CATCH block captures and provides error information that shows you the ID, message text, state, severity and transaction state of an error.

Functions to be used in the CATCH block are:

  • ERROR_NUMBER: returns the error number, and is the same value as @@ERROR.
  • ERROR_SEVERITY: returns the severity level of the error that invoked the CATCH block.
  • ERROR_STATE: returns the state number of the error.
  • ERROR_LINE: returns the line number where the error occurred.
  • ERROR_PROCEDURE: returns the name of the stored procedure or trigger for which the error occurred.
  • ERROR_MESSAGE: returns the full message text of the error. The text includes the values supplied for any substitutable parameters, such as lengths, object names, or times.

You can use these functions anywhere inside a CATCH block, and they will return information regarding the error that has occurred. These functions will return the value null outside of the CATCH block.

Syntax:
BEGIN TRY
{ sql_statement |
statement_block }
END TRY
BEGIN CATCH
{ sql_statement |
statement_block }
END CATCH

The TRY or CATCH block can contain a single T-SQL statement or a series of statements. The CATCH block must follow immediately after the TRY block. A TRY...CATCH construct cannot span more than a single batch; in addition, it cannot span an IF/ELSE statement.
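
A minimal runnable sketch putting the syntax and the error functions together:

BEGIN TRY
-- Force an error: division by zero
SELECT 1/0;
END TRY
BEGIN CATCH
SELECT ERROR_NUMBER() AS ErrorNumber,
ERROR_SEVERITY() AS ErrorSeverity,
ERROR_STATE() AS ErrorState,
ERROR_LINE() AS ErrorLine,
ERROR_MESSAGE() AS ErrorMessage;
END CATCH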

You can read more about this subject over here: Explanation of TRY...CATCH and ERROR Handling.

Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #022 – How to Get Started with Big Data?

Big data is one of the most popular subjects in recent times, and everybody wants to get started with it. During recent interviews, there have been plenty of questions related to Big Data. Here is the most popular question I receive on this subject.

Question: How to get started with Big Data?

Answer: Early last year I wrote a timeless series on the subject of Big Data. Here is the link to the entire series.


Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #021 – Difference Between Index Seek and Index Scan (Table Scan)

Question: What is the difference between Index Seek and Index Scan?

Answer: Index Scan retrieves all the rows from the table. Index Seek retrieves selective rows from the table.

Index Scan:
Since a scan touches every row in the table whether or not it qualifies, the cost is proportional to the total number of rows in the table. Thus, a scan is an efficient strategy if the table is small or if most of the rows qualify for the predicate.

Index Seek:
Since a seek only touches rows that qualify and pages that contain these qualifying rows, the cost is proportional to the number of qualifying rows and pages rather than to the total number of rows in the table.

Index Scan is nothing but a scan of the data pages from the first page to the last page. Even when there is an index on a table, if the query touches a large amount of data (retrieving more than, say, 50 or 90 percent of the rows), the optimizer will simply scan all the data pages to retrieve the data rows. If there is no index, then you might see a Table Scan in the execution plan.
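
A quick way to see the difference is to compare the execution plans of a selective query against an unrestricted one. A minimal sketch against AdventureWorks2014 (the exact operators you see depend on your indexes):

USE AdventureWorks2014
GO
-- Typically an Index Seek: a selective predicate on the clustered key
SELECT BusinessEntityID, JobTitle
FROM HumanResources.Employee
WHERE BusinessEntityID = 10;
-- Typically an Index Scan: no predicate, so every row is read
SELECT BusinessEntityID, JobTitle
FROM HumanResources.Employee;
GO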


Reference: Pinal Dave (http://blog.sqlauthority.com)

Interview Question of the Week #020 – What is the Difference Between DISTINCT and GROUP BY?

Question: What is the Difference Between DISTINCT and GROUP BY?

Answer: DISTINCT and GROUP BY usually generate the same query plan, so performance should be the same across both query constructs. GROUP BY should be used when you need to apply aggregate operators to each group. If all you need is to remove duplicates, use DISTINCT. If you are using sub-queries, the execution plan for the query can vary, so in that case check the execution plan before deciding which is faster.

USE AdventureWorks2014
GO
-- Example of DISTINCT:
SELECT DISTINCT JobTitle, Gender
FROM [HumanResources].[Employee]
GO


USE AdventureWorks2014
GO
-- Example of GROUP BY:
SELECT JobTitle, Gender
FROM [HumanResources].[Employee]
GROUP BY JobTitle, Gender
GO


USE AdventureWorks2014
GO
-- Example of GROUP BY with aggregate function:
SELECT JobTitle, Gender, COUNT(*) EmployeeCount
FROM [HumanResources].[Employee]
GROUP BY JobTitle, Gender
GO


Reference: Pinal Dave (http://blog.sqlauthority.com)