# SQL SERVER – Tips from the SQL Joes 2 Pros Development Series – Many to Many Relationships – Day 8 of 35

Answer simple quiz at the end of the blog post and -
Every day one winner from India will get Joes 2 Pros Volume 2.
Every day one winner from United States will get Joes 2 Pros Volume 2.

## Many to Many relationships

If anyone has done some shopping on the internet you are familiar with the term “Shopping Cart” or “Shopping basket”. After you have selected a product you want to buy the storefront will gladly let you keep on shopping until there are many items in you shopping cart. On my last trip to Amazon.com I put 3 things in my cart before checking out. One item was a HDMI cable box converter for my flat screen TV at home. I took a closer look at this products rating and reviews before buying. Below this product it said this items was in the top 100 selling items in home electronics. This means this items has sold to more people than just me and I was not the first to buy this product.

Based on the story above what is the relationship to a product (like the HDMI cable box) and a customer who may want to buy that item? As a customer I can buy many products at once in my shopping cart. Does this mean there is a one to many relationship between customers (like me) and products (like the HDMI cable box). Yes there is but that is only a half truth. A product like the HDMI cable box can be purchased by many different customers. Therefore there is a Many to Many relationship between Customers and products. This product can be bought by many customers and many customer can buy this product.

Another many to many relationship is between books and Authors. I have written 5 books so we know there is at least a one to many relationship between Authors and Books. Now look at my 4th book (SQL Programming Joes 2 Pros Volume 4 – ISBN-13: 978-1451579482). This book has two authors (Rick A. Morelan and Pinal Dave). Since an Author can write many books and a book can be written by many authors there is a many to many relationship between authors and books.

We need a new process to map this complex relationship. Sometimes tables exist for the sole purpose of allowing indirect relationships between tables. What do you call tables the handle relationships for other tables? Common terms for these type of tables are mapping tables, bridge tables, or junction tables.

## Many to Many relationships with invoicing systems

As discussed earlier a product can appear on many invoices and an invoice can have many products. The relationship between products and invoices is known as a many-to-many relationship as seen in the figure below.

If you were to ask yourself “What Products are on Invoice 5631?” or “Go-Duck was ordered on how many invoices?” you could resolve this query? The Figure below shows us how to resolve both mappings. We can see the Go-Duck toy was order twice (on 7/15 and on 9/22). We can also see the shopping cart 5631 has two items in it (Toy Car and Furchee).

Now we’ll take a look at many-to-many relationships between sales invoices and products in the JProCo sample database.

Here is an example of JProCo’s sales invoices mapping to a bridge table (SalesInvoiceDetail) in order to map over to the CurrentProducts table. The CurrentProducts table gives us all the detail of the current products that have been ordered.

With SalesInvoice 5 It looks like many products were ordered on that one invoice (Products 9, 11, 12, and 16). To see what those products are, we would look over to the CurrentProducts table. We see Product 9 is an Underwater Tour 3 Days East Coast. Product 11 is an Underwater Tour 1 Week East Coast, and so forth. So a SalesInvoice can have many products, and products can be ordered on multiple sales invoices.

The SalesInvoiceDetail table is the mapping table that is handling the many to many relationship between the SalesInvoice table and the CurrentProducts table. This 3 table join between these tables would look like the query you see below.

```SELECT si.CustomerID, si.InvoiceID, si.OrderDate, sd.Quantity, cp.ProductName, cp.RetailPrice FROM dbo.SalesInvoiceDetail AS sd INNER JOIN dbo.SalesInvoice AS si ON sd.InvoiceID = si.InvoiceID INNER JOIN CurrentProducts AS cp ON cp.ProductID = sd.ProductID ```

Note: If you want to setup the sample JProCo database on your system you can watch this video. For this post you will want to run the SQLQueriesChapter3.0Setup.sql script from Volume 2.

## Question 8

Q.8) You have tables named dbo.SalesInvoice and dbo.SalesInvoiceDetail. CustomerID is located in the SalesInvoice table and InvoiceID is located in both tables. You have been told to show the discount amounts from the SalesInvoiceDetail table that correspond to the sales of a specific CustomerID of 490. Which T-SQL statement should you use?

1. SELECT sd.CustomerID, si.DiscountAmt, cp.ProductName
FROM dbo.SalesInvoiceDetail sd
INNER JOIN dbo.SalesInvoice si
ON sd.InvoiceID= si.InvoiceID
INNER JOIN CurrentProducts AS cp
ON cp.ProductID = sd.ProductID
WHERE si.CustomerID= 490
2. SELECT sd.CustomerID, si.DiscountAmt, cp.ProductName
FROM dbo.SalesInvoiceDetail sd
INNER JOIN CurrentProducts AS cp
ON cp.ProductID = sd.ProductID
WHERE si.CustomerID= 490
3. SELECT sd.CustomerID, si.DiscountAmt, cp.ProductName
FROM dbo.SalesInvoiceDetail sd
WHERE EXISTS (dbo.SalesInvoice si
ON sd.InvoiceID= si.InvoiceID
INNER JOIN CurrentProducts AS cp
ON cp.ProductID = sd.ProductID
WHERE si.CustomerID= 490)

### Rules:

Every day one winner will be announced from United States.
Every day one winner will be announced from India.
Winner from United States will get Joes 2 Pros Volume 2.
Winner from India will get Joes 2 Pros Volume 2.
The contest is open till next blog post shows up at http://blog.sqlauthority.com which is next day GTM+2.5.

Reference:  Pinal Dave (http://blog.SQLAuthority.com)

# SQL SERVER – Tips from the SQL Joes 2 Pros Development Series – Efficient Query Writing Strategy – Day 4 of 35

Answer simple quiz at the end of the blog post and -
Every day one winner from India will get Joes 2 Pros Volume 1.
Every day one winner from United States will get Joes 2 Pros Volume 1.

## Query Writing Strategy

Some people may push back on this next technique or misunderstand until getting to the very end. The goal is to have fewer errors as you write complex queries more quickly by making sure the easy stuff works first.

If you are a SQL expert who only works on the same database for the rest of your life who will never type the wrong field name in a query then yes you are the fastest query writer your company could ever have. Let’s face it, sometime we as SQL experts need to get familiar with our underlying data before we get great coding velocity. Brute force or memorized keystroke solutions can be fun when learning but not good when you have a deadline and want to write queries quickly, with fewer errors, and test things as you continue to write more code. With that in mind here is something I have yet to find in any book. When enthusiastic SQL students do this, they experience a revelation. The number of errors drops significantly and the speed at writing complex queries increases immediately.

Knowing how to narrow down what you are looking for amongst a vast list of choices helps immensely. Grabbing the right tables first and then the fields second is much like grabbing the right menu before ordering an item from it. In fact, one student named Tim took this back to his team of SQL developers and they immediately implemented this process.

We are all used to following steps we know have proven to work. Most of the time, actions are sequential from top to bottom or left to right. Other times we complete things in phases. The two phases you are going to see here apply to joining tables or any other query that has plenty of logic. Just remember to organize first and clean up (or itemize) second.

When you go to a new restaurant, you ask to see the menu. You want to see all they have to offer. The odds are you may like half the items, but only need a few for to feed your family. Looking at the menu is like starting off with a SELECT *. After looking at all the fields, you pick the one(s) you want.

Phase I:  Organize. When you’re building a new query from many tables, you’ll find yourself wondering, “Where do I start?” Really there are three steps to this phase.  First, lay the steps out by identifying which tables contain the essential data. Second, get all your joins working with a SELECT * statement. Third, add any basic filtering criteria.

Phase II.  Itemize. Once all joins and criteria, such as SELECT *, FROM and WHERE are working, you are ready for Phase II. This entails going back and changing your SELECT * to an itemized select field list as your final step.

Let’s explore how this two-phase process of “Organize then Itemize” is a time-saver. We are going to use one of the challenges from the last lab. In Lab 3.2 (Outer Joins) in Skill Check 2, you needed to get four fields from two different tables. If you list all four desired fields and test one table at time, you get an error as seen on the right side of Figure 4.1.

In Figure 4.1 we write the SELECT statement and part of the FROM clause. The FROM clause will have two tables when we are done, but for now we just want to get the Location table working. When we use the SELECT * it removes any possible errors from line 1. From there, we can focus on our more complicated join logic. We can add tables one at a time until all is working. This is the Organize phase.

SELECT * never results in an “invalid column name” error, but a SELECT list can. After your query is organized and working, you can go back and itemize the SELECT field list to display just what you want. This is the Itemize phase. These steps are broken down as follows:

```--ORGANIZE PHASE: Get SELECT * query written. -- Test first table logic SELECT * FROM Location --Test second table with join SELECT * FROM Location INNER JOIN Employee ON Location.LocationID = Employee.LocationID --Test all tables with criteria SELECT * FROM Location INNER JOIN Employee ON Location.LocationID = Employee.LocationID WHERE [State] = 'WA' --ITEMIZE PHASE: Change to SELECT field list --Choose your fields SELECT FirstName, LastName, City, [State] FROM Location INNER JOIN Employee ON Location.LocationID = Employee.LocationID WHERE [State] = 'WA' ```

SELECT is always the first statement in a query. It’s natural to want to finish your SELECT statement before writing the FROM clause. Start with SELECT * and finish the query. Do your field list when all else is done. Use this method and you will never again get a field selection error while building queries.

Note: If you want to setup the sample JProCo database on your system you can watch this video.

## Question 4

Q. 4) Square brackets are required when…

1. The table name conflicts with a keyword
2. The table name is the same as another table.
3. The table uses the same name as the database.
4. To alias the table.

### Rules:

Every day one winner will be announced from United States.
Every day one winner will be announced from India.
Winner from United States will get Joes 2 Pros Volume 1.
Winner from India will get Joes 2 Pros Volume 1.
The contest is open till next blog post shows up at http://blog.sqlauthority.com which is next day GTM+2.5.

Reference:  Pinal Dave (http://blog.SQLAuthority.com)

# SQL SERVER – Database Worst Practices – New Town and New Job and New Disasters

This blog post is written in response to the T-SQL Tuesday hosted by Amit Banerjee. I know Amit personally and very big fan of his community activities. I read his blog, read his comments, follow his tweets, and the most importantly reach out to him when I feel like talking SQL.

Amit has selected a very interesting subject – Best Practices. When I read the subject, I can directly relate this subject to my real world interactions. I have seen so many developers and DBAs making fatal mistakes in the early days of their career. Most of the time, they get another chance because they are new to the job or that particular task, which they are doing for the first time. However, such excuses cannot be made for experts and senior professionals. They are expected perform and carry out the right practices always.

Instead of writing best practices, I am going to write about few of the worst practices which I have quite often seen in the real world. Some of them are really bad that I often wonder how those who still use them were able to continue on the job so far. Let us go over few of the worst practices I have observed in the industry.

## My log file is growing too big – I truncated the log without taking backup.

This has to be the numero uno of worst practices. Every time I come across such a statement, I almost miss a heartbeat. Before I continue, let me confess that at the beginning of the career, I have practiced the same. After a while, I had learned it the hard way. This is never a good practice; truncating the log file is not an option. The reason why I rate this as the worst practice: this one mistake can make database impossible to recover. When someone truncates the log file without taking backup, there is no chance of recovery.

Here are a couple of articles which I have written on how to prevent log file from growing too big.

How to Stop Growing Log File Too Big

Log File Growing for Model Database – model Database Log File Grew Too Big

master Database Log File Grew Too Big

## I shrink my database daily to regain the space.

This is one of the popular worst practices. I have seen administrators shrinking the database at the end of the day to gain the space only to lose it the very next day. Shrinking is a VERY BAD operation. It increases fragmentation, reduces the performance, and wastes the resources. I strongly advise not to do it.

Here are few articles I had earlier written on this subject.

Shrinking Database is Bad – Increases Fragmentation – Reduces Performance

SHRINKDATABASE For Every Database in the SQL Server

Shrinking NDF and MDF Files – Readers’ Opinion

SHRINKFILE and TRUNCATE Log File in SQL Server 2008

## Clustered Index makes the table to sort every time. I do not have clustered index on any table.

For an OLTP system, Index is very important and clustered index is the most important index (in my opinion). Clustered index forces order to the table and removes the ‘forwarding records’ problem from the database. Personally, I consider tables without clustered index performance to be unacceptable?. In my OLTP system, I always recommend that all tables should have clustered index.

Here is a quick script that can help in identifying a table without clustered index in the database.

Find Table without Clustered Index – Find Table with no Primary Key

Clustered Index on Separate Drive From Table Location

Observation – Effect of Clustered Index over Nonclustered Index

## TempDB is not important; so I will keep it on my slow drive.

Personally, I have tremendous respect for TempDB. Even though it resets every time when the server restarts, this is a single most important database that is shared among all the other databases in the system. This database is used for sorting, temporary objects, triggers, row version and in other operations. Keeping it on the slow drive is not the solution, but reality is that it will just create many performance-related problems in the overall system. If your TempDB is becoming full, move it to the another drive.

Here are a few blog posts I have written on TempDB.

T-SQL Script to Find Details About TempDB

TempDB is Full. Move TempDB from one drive to another drive

Reducing Page Contention on TempDB

Improvements in TempDB

TempDB Restrictions – Temp Database Restrictions

Ideal TempDBFileGrowth Value

## I am confused between Full, Differential, and Log Backup

Inability to understand the proper recovery model is another worst practice. I have people restoring many differential backups while restoring the database. I quite often see that log file backup interval is so huge it is more than differential backup interval. There are so many factors which can lead to disaster and data loss, leading to people to look for a new job in a new town at times. If you are confused regarding what is tail log backup, then stop, and learn from online books before implementing the backup strategy. Even if you are not responsible for implementing the backup strategy, I would still suggest you to read how to carry out proper backup as you never know when it will land heavily on your job card!

Here are few interesting write-ups on this subject on this blog.

Backup Timeline and Understanding of Database Restore Process in Full Recovery Model

Restore Sequence and Understanding NORECOVERY and RECOVERY

Mirrored Backup and Restore and Split File Backup

Restore Database Without or With Backup – Everything About Restore and Backup

Restore Database Backup using SQL Script (T-SQL)

There are many more practices I can write, but I think these five are the top worst practices. Feel free to post your opinions and suggestions.

Reference: Pinal Dave (http://blog.SQLAuthority.com)

# SQL SERVER – Solution – Puzzle – SELECT * vs SELECT COUNT(*)

Earlier I have published Puzzle Why SELECT * throws an error but SELECT COUNT(*) does not.

This question have received many interesting comments. Let us go over few of the answers, which are valid. Before I start the same, let me acknowledge Rob Farley who has not only answered correctly very first but also started interesting conversation in the same thread.

The usual question will be what is the right answer. I would like to point to official Microsoft Connect Items which discusses the same.

RGarvao

https://connect.microsoft.com/SQLServer/feedback/details/671475/select-test-where-exists-select

tiberiu utan

http://connect.microsoft.com/SQLServer/feedback/details/338532/count-returns-a-value-1

Rob Farley

count(*) is about counting rows, not a particular column. It doesn’t even look to see what columns are available, it’ll just count the rows, which in the case of a missing FROM clause, is 1.

“select *” is designed to return columns, and therefore barfs if there are none available.

Even more odd is this one:

select ‘blah’ where exists (select *)

You might be surprised at the results…

Koushik

The engine performs a “Constant scan” for Count(*) where as in the case of “SELECT *” the engine is trying to perform either Index/Cluster/Table scans.

amikolaj

When you query ‘select * from sometable’, SQL replaces * with the current schema of that table. With out a source for the schema, SQL throws an error.

so when you query ‘select count(*)’, you are counting the one row. * is just a constant to SQL here. Check out the execution plan. Like the description states – ‘Scan an internal table of constants.’

You could do ‘select COUNT(‘my name is adam and this is my answer’)’ and get the same answer.

Netra Acharya

SELECT *
Here, * represents all columns from a table. So it always looks for a table (As we know, there should be FROM clause before specifying table name). So, it throws an error whenever this condition is not satisfied.

SELECT COUNT(*)
Here, COUNT is a Function. So it is not mandetory to provide a table.

Check it out this:
DECLARE @cnt INT
SET @cnt = COUNT(*)
SELECT @cnt
SET @cnt = COUNT(‘x’)
SELECT @cnt

Naveen

Select 1 / Select ‘*’ will return 1/* as expected.

Select Count(1)/Count(*) will return the count of result set of select statement.

Count(1)/Count(*) will have one 1/* for each row in the result set of select statement.

Select 1 or Select ‘*’ result set will contain only 1 result. so count is 1.

Where as “Select *” is a sysntax which expects the table or equauivalent to table (table functions, etc..). It is like compilation error for that query.

Ramesh

Hi Friends,
Count is an aggregate function and it expects the rows (list of records) for a specified single column or whole rows for *.

So, when we use ‘select *’ it definitely give and error because ‘*’ is meant to have all the fields but there is not any table and without table it can only raise an error.

So, in the case of ‘Select Count(*)’, there will be an error as a record in the count function so you will get the result as ’1′.

Try using : Select COUNT(‘RAMESH’) and think there is an error ‘Must specify table to select from.’ in place of ‘RAMESH’

Pinal : If i am wrong then please clarify this.

Sachin Nandanwar

Any aggregate function expects a constant or a column name as an expression.
DO NOT be confused with * in an aggregate function.The aggregate function does not treat it as a column name or a set of column names but a constant value, as * is a key word in SQL.

You can replace any value instead of * for the COUNT function.Ex

Select COUNT(5) will result as 1.

The error resulting from select * is obvious it expects an object where it can extract the result set.

I sincerely thank you all for wonderful conversation, I personally enjoyed it and I am sure all of you have the same feeling.

Reference: Pinal Dave (http://blog.SQLAuthority.com)