SQL SERVER – How Many Line of Code Do You Have in Database?

Last weekend I happened to be in one of the beautiful cities in Sri Lanka for some fun time with SQL Enthusiasts. We had our first ever SQLSaturday Event in Sri Lanka and it was just too much fun to be amongst the amazing audiences. I was fortunate to deliver a couple of sessions and not to forget the hours and hours of interaction with the attendees post session. In this blog post we will answer the question How Many Line of Code Do You Have in Database?

SQLSaturdays and Usergroups are an amazing way to learn and share your experience with our SQL lovers from the neighboring country. I couldn’t have asked for a better reception and hospitality. I am eagerly looking forward for the future SQLSaturdays too.

SQL SERVER - How Many Line of Code Do You Have in Database? srilanka-beach

In one of those late night SQL talks, one of my friends asked an interesting question. They were a product based company and one of their clients was asking how many total lines of code they have in their product. My instinct was to say – “Doesn’t Visual Studio help you find this in a jiffy? Getting this data must have been some right click somewhere?” My friend looked at me and said – “Pinal, I never had a problem with my .NET code because VS did help him in a lot of ways. His problem was when working with SQL Server.” Trust me, I am no VS expert here – but the SQL part of the question did get me thinking. I said – “I think I can come up with something may be a couple of days after returning to India”.

I thought of putting together something rudimentary and basic to get the ball rolling. So here is my first shot at this requirement.

SELECT
DB_NAME(DB_ID()) [DB_Name],
TYPE,
COUNT(*)  AS Object_Count,
SUM(LinesOfCode) AS LinesOfCode
FROM (
SELECT
TYPE,
LEN(definition)- LEN(REPLACE(definition,CHAR(10),'')) AS LinesOfCode,
OBJECT_NAME(OBJECT_ID) AS NameOfObject
FROM sys.all_sql_modules a
JOIN sysobjects  s
ON a.OBJECT_ID = s.id
-- AND xtype IN('TR', 'P', 'FN', 'IF', 'TF', 'V')
WHERE OBJECTPROPERTY(OBJECT_ID,'IsMSShipped') = 0
) SubQuery
GROUP BY TYPE

The above script gives a high level numbers of the LOC inside SQL Server by counting the number of “Carriage return” keys in our various functions, modules, procedures, triggers etc. I know it may not be perfect or scientific. But this is my first shot at this requirement. I ran the same against the AdventureWorks database to get the below output.

SQL SERVER - How Many Line of Code Do You Have in Database? Lines-of-code-01

I would love to know any other methods you have ever used in your environments for this requirement? Do you need such numbers in your projects? What have you done for such requirements? Let me know via your comments so that we can learn together.

Reference : Pinal Dave (https://blog.sqlauthority.com)

SQL Saturday, SQL Scripts, SQL Server, SQLAuthority Author Visit
Previous Post
SQL SERVER – The Basics of the Execute Process Task – Notes from the Field #084
Next Post
Interview Question of the Week #024 – What is the Best Recovery Model?

Related Posts

18 Comments. Leave new

  • Hi, I have taken a backup of my database and it’s .bak file size is 200MB. Again I took a backup of same database without any change in the database name,now the size gone 400MB and again I did, it gone 600MB. What logic it uses, because there was no change in the database data between these intervals of backup process, data remains same. But why size increases?

    thanks,
    Kavi

    Reply
    • Chris Burton
      June 12, 2015 8:42 pm

      You are inadvertently creating a backup set containing both backups. If you look at the backup options, the default option under the overwrite media section is: “append to existing backup set” if you change that to “overwrite existing backup set” then you will find the size will be only the size of the single backup.

      However if you do this you will only have the latest backup. It might be a good idea to create a new backup file for each backup that uses the setting “overwrite existing backup set” and in this case the file name would include the ISO date time (example DatabaseName20150612Full.bak ). I would reccomend keeping multiple days of backup files if you can so you can go back to a previous version if you need to.

      Reply
      • Thanks Chris for your reply. It’s new backup,Marc and now I understood the logic with Chris’ answer. Imran, I had a doubt, googled it, couldn’t find answer and then first thing came into my mind was asking it to Pinal. I didn’t know where to ask so that was the reason I asked in his recent post. I believe your comment is not an answer to my question anyway thanks for your concern.

    • Marc Weinstein
      June 12, 2015 11:09 pm

      Are these new backup files, or are you appending?

      Reply
    • Seriously what does your question had to do with this post? This is considered as SPAM. Post question at right place man.

      Reply
  • Marc Weinstein
    June 12, 2015 11:10 pm

    My approach has been to script out all the relevant objects, then open them up in Notepad++, which has a great feature to remove blank or duplicate blank lines. Do blank lines count? Maybe. Should duplicate blank lines count? Probably not.

    Reply
    • Marc, yes – that is surely another approach. I thought of making the first attempt here. Thanks for the suggestions. Helps for sure.

      Reply
  • I had a similar thought to try and determine the “complexity” of a db by checking total number of tables, total number of columns, total number of schemas, object counts by type. Obviously not foolproof but a quick way to see if a db is small and easy or large and complex.

    It gets tricky when you have a large number of tables and stored procs to just for CRUD operations. You’d need to separate the CRUD procedures from the business logic/reporting ones.

    For the blank vs duplicate lines, you could easily do a character count and if you were really worried about it, you could remove all blanks with replace before character counts.

    Reply
  • Michael Barnett
    June 15, 2015 8:58 pm

    DB_Name TYPE Object_Count LinesOfCode
    ATT FN 71 11158
    ATT IF 7 1159
    ATT P 1958 202244
    ATT TF 8 13677
    ATT TR 93 5951
    ATT V 62 7082

    Reply
  • Henrik Johansson
    June 19, 2015 8:10 pm

    Hi!

    With all due respect to you great individuals doing such impressive work of breaking down the problem, I’d just want to put up the query: Is it academic?

    With the database query-size is constantly growing with each transaction, it would be useless to determine the total number thereof, unless of course the database is static, in which case what would it matter?

    Am I right? Am I wrong?

    Please give feedback!

    Reply
    • As you correctly said, it doesn’t really matter. It is the first time someone asked me this question since I first started working with databases since 9-10 yrs now. I thought, even if this is a point in time (Static) value – thought I must be able to simplify the same with a possible solution.
      Thanks again for taking time in writing your thoughts. This is what makes it an interesting conversation.

      Reply
  • Nayakanti Pavan Kumar
    June 22, 2015 8:00 pm

    I agree that there were always few options available to gather the code metrics for simple sql code. But we had challenge with the SSIS packages in order to share code metrics in order to match the leadership expectations. Then, later we could convince them that it is not liable to compare the lines of code of SSIS with sql code. (SSIS can be opened in xml format, BTW)

    Reply
  • Hi,
    Is there any practical use of those numbers. Or we just satisfying a client. Because most of people use to align their scripts more decently, which intrun increase # of lines.

    Let me know if i had missed anything interesting from above conversation.

    Reply
  • Thank you Pinal for this. I’ve been using SQL 2008 for a few years (self taught with the aide of Google). I wonder… rather than counting the total number of lines, is there a function to return the immediate line number? I see where one can use the ERROR_LINE ( ) function to return the line in which an error has occured…

    In my case, I have multiple queries in my SQL where one query is a prerequisit for the next or somewhere down the line. At intervals, there is output generated with a “title”. I’d like to insert the line number of where the output occurs. If someone else is running my code (with 1000+ lines) and the output doesn’t look right I’d like for them to have a good idea of where to start their investigation

    Reply
  • The object types are not always self explanitory. I have updated to produce the actual types of object.

    DECLARE @types TABLE (
    type VARCHAR(10)
    ,type_desc VARCHAR(255)
    )

    INSERT INTO @types
    SELECT [type]
    ,[type_desc]
    FROM sys.objects
    GROUP BY [type]
    ,[type_desc]

    SELECT DB_NAME(DB_ID()) [DB_Name]
    ,TYPE
    ,type_desc
    ,COUNT(*) AS Object_Count
    ,SUM(LinesOfCode) AS LinesOfCode
    FROM (
    SELECT s.TYPE
    ,t.type_desc
    ,LEN(DEFINITION) – LEN(REPLACE(DEFINITION, CHAR(10), ”)) AS LinesOfCode
    ,OBJECT_NAME(OBJECT_ID) AS NameOfObject
    FROM sys.all_sql_modules a
    INNER JOIN sysobjects s ON a.OBJECT_ID = s.id
    INNER JOIN @types t ON t.type = s.type
    — AND xtype IN(‘TR’, ‘P’, ‘FN’, ‘IF’, ‘TF’, ‘V’)
    WHERE OBJECTPROPERTY(OBJECT_ID, ‘IsMSShipped’) = 0
    ) SubQuery
    GROUP BY TYPE
    ,type_desc

    Reply

Leave a Reply