
SQLblog.com - The SQL Server blog spot on the web

Tibor Karaszi

  • The SQL Server script repository

    How come I've missed this? I don't know, but I'm writing this blog post to bookmark this site.

    You'll find a handful of scripts, mainly using Dynamic Management Views and Functions, to monitor various aspects of SQL Server.

    http://www.microsoft.com/technet/scriptcenter/scripts/sql/sql2005/default.mspx?mfr=true
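
    As a taste of the style, here's a minimal DMV query of my own (not one of the repository's scripts) in the same monitoring spirit; it lists the top waits on the instance:

    --Top waits on the instance; my own minimal example, not from the repository
    SELECT TOP 10 wait_type, wait_time_ms, waiting_tasks_count
    FROM sys.dm_os_wait_stats
    ORDER BY wait_time_ms DESC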


  • CTP6 is here

  • Kalen Delaney's SQL Server Internals DVD

    Just wanted to let you know about another SQL Server resource from Kalen Delaney.

    A set of DVDs, where Kalen discusses internals and such. So far, I've looked at the first one, covering meta-data and architecture.

    Having known Kalen for some 10 years now (time flies :-) ), I'm not surprised by the high quality of the presentation and material.

    Check it out at: SQLServerDVD.com.


  • Finally got Kalen's new Query Tuning book...

    I know this book has been out for a little while, but I only just got my copy. This one I've been looking forward to for a long time, and a quick look through the book is very promising (as expected).

    The full name of the book, btw, is "Inside Microsoft SQL Server 2005 Query Tuning and Optimization" from MS press. The title says it all, I guess. :-)


  • New version of Niels Berglund's SQLCLRProject

    SQLCLRProject is an alternative to the built-in support in VS for deploying SQLCLR objects. Check it out at:

    http://nielsb.wordpress.com/sqlclrproject/


  • Non-trusted constraints and performance

    (See my part 1 article about non-trusted constraints in general: http://sqlblog.com/blogs/tibor_karaszi/archive/2008/01/12/non-trusted-constraints.aspx)

    The optimizer is a pretty smart little animal. It can even use constraints to eliminate some data access or part of a query. That assumes however that the constraint is trusted. For example, for a check constraint:

    USE Adventureworks
    SET STATISTICS IO ON
    --Also check the execution plan
    SELECT AVG(OrderQty) FROM Sales.SalesOrderDetail
    SELECT AVG(OrderQty) FROM Sales.SalesOrderDetail WHERE UnitPrice < 0

    Note that the table isn't even accessed for the second query. SQL Server knows that there can be no rows in the table where UnitPrice < 0, because there is a constraint:

    SELECT OBJECT_NAME(parent_object_id) AS table_name, name, definition
    FROM sys.check_constraints
    WHERE parent_object_id = OBJECT_ID('sales.salesorderdetail')
    ORDER BY table_name

    Notice the constraint CK_SalesOrderDetail_UnitPrice with the condition ([UnitPrice]>=(0.00)). But what if the constraint isn't trusted?

    ALTER TABLE Sales.SalesOrderDetail NOCHECK CONSTRAINT CK_SalesOrderDetail_UnitPrice
    SELECT AVG(OrderQty) FROM Sales.SalesOrderDetail WHERE UnitPrice < 0

    Now we do have data access. And even if we enable the constraint, it will be non-trusted and SQL Server cannot know for sure that no row violates the condition:

    ALTER TABLE Sales.SalesOrderDetail CHECK CONSTRAINT CK_SalesOrderDetail_UnitPrice
    SELECT AVG(OrderQty) FROM Sales.SalesOrderDetail WHERE UnitPrice < 0

    We need to enable the constraint WITH CHECK to make sure it is trusted:

    ALTER TABLE Sales.SalesOrderDetail WITH CHECK CHECK CONSTRAINT CK_SalesOrderDetail_UnitPrice
    SELECT AVG(OrderQty) FROM Sales.SalesOrderDetail WHERE UnitPrice < 0
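
    To verify, you can check the is_not_trusted column for the constraint (it should now be 0):

    SELECT name, is_not_trusted
    FROM sys.check_constraints
    WHERE name = 'CK_SalesOrderDetail_UnitPrice'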

    Performance can also be improved with foreign key constraints. For example, we have a foreign key which states that we cannot have an order detail row for an order that doesn't exist in the orders table. Now consider the query below (check out both I/O and the execution plan):

    SELECT sd.SalesOrderID, sd.CarrierTrackingNumber
    FROM Sales.SalesOrderHeader AS s
    INNER JOIN Sales.SalesOrderDetail AS sd
      ON s.SalesOrderID = sd.SalesOrderID
    WHERE sd.OrderQty > 20

    We didn't return any columns from the SalesOrderHeader table, and since SQL Server knows that each row in the SalesOrderDetail table has a corresponding row in the SalesOrderHeader table, there's no need to access the SalesOrderHeader table at all. But if the constraint isn't trusted:

    ALTER TABLE Sales.SalesOrderDetail NOCHECK CONSTRAINT FK_SalesOrderDetail_SalesOrderHeader_SalesOrderID

    SELECT sd.SalesOrderID, sd.CarrierTrackingNumber
    FROM Sales.SalesOrderHeader AS s
    INNER JOIN Sales.SalesOrderDetail AS sd
      ON s.SalesOrderID = sd.SalesOrderID
    WHERE sd.OrderQty > 20

    Notice that now the SalesOrderHeader table *is* accessed, because the constraint is disabled. And, as above, it isn't enough to enable it; we need to enable it WITH CHECK to make it trusted again. OK, you might ask why we even bother with the join if we only need columns from the referencing table in the first place. But consider this view:

    CREATE VIEW myView AS
    SELECT sd.SalesOrderID, s.RevisionNumber, s.DueDate, sd.CarrierTrackingNumber
    FROM Sales.SalesOrderHeader AS s
    INNER JOIN Sales.SalesOrderDetail AS sd
      ON s.SalesOrderID = sd.SalesOrderID

    The view accesses columns from both tables. But it is quite possible that we have users of this view who are only interested in columns from the SalesOrderDetail table:

    SELECT SalesOrderID, CarrierTrackingNumber
    FROM myView

    If the constraint is not trusted, then both tables are accessed (with the join operation, of course). But if we make sure that the constraint is trusted, then only the SalesOrderDetail table is accessed and no join is performed:

    ALTER TABLE Sales.SalesOrderDetail WITH CHECK CHECK CONSTRAINT FK_SalesOrderDetail_SalesOrderHeader_SalesOrderID

    SELECT SalesOrderID, CarrierTrackingNumber
    FROM myView

    So, bottom line is that you should be careful with non-trusted constraints. Obviously because you might have data in the table which violates the constraint, but also for performance reasons. It is easy to check whether you have any non-trusted constraints in the database:

    SELECT OBJECT_NAME(parent_object_id) AS table_name, name
    FROM sys.check_constraints
    WHERE is_not_trusted = 1
    UNION ALL
    SELECT OBJECT_NAME(parent_object_id) AS table_name, name
    FROM sys.foreign_keys
    WHERE is_not_trusted = 1
    ORDER BY table_name

    I should add that to me, the purpose of a constraint is to make sure I have valid data. The performance aspect is just a bonus. But I admit that I'm pretty impressed by the optimizer in some cases. :-)



  • Non-trusted constraints

    A discussion in the newsgroups before the holidays led to trusted constraints and performance. This inspired me to blog about it, but I decided to have some vacation first :-). Instead of one long article, I'll do a two-part series. This one is about non-trusted constraints in general, and another will follow about non-trusted constraints and performance.

    As you might know, we can disable check and foreign key constraints. This can be done when we create the constraint, for instance:

    USE tempdb
    CREATE TABLE t1(c1 int)
    INSERT INTO t1(c1) VALUES(-1)
    GO
    ALTER TABLE t1 WITH NOCHECK ADD CONSTRAINT CK_t1_c1 CHECK(c1 > 0)

    The above allows us to add the constraint even though we have rows that violate it. The constraint isn't disabled, but we (can) have rows in the table that violate the constraint - the constraint isn't trusted. We can also disable an existing constraint:

    USE tempdb
    CREATE TABLE t2(c1 INT CONSTRAINT CK_t2_c1 CHECK(c1 > 0))
    INSERT INTO t2(c1) VALUES(1)
    GO
    ALTER TABLE t2 NOCHECK CONSTRAINT CK_t2_c1
    GO
    INSERT INTO t2(c1) VALUES(-1)

    Again, we now have rows in the table that violate the constraint. For the first example, the constraint is enabled, but we didn't check existing rows when we added it; if we try to add a row which violates the constraint, we get an error message. For the second example, the constraint isn't even enabled. We can enable a disabled constraint:

    ALTER TABLE t2 CHECK CONSTRAINT CK_t2_c1
    GO
    INSERT INTO t2(c1) VALUES(-1)

    The INSERT command immediately above will fail with an error message. The constraint in table t2 is now enabled, but the constraints for both table t1 and table t2 are non-trusted. For table t1, we added the constraint with existing data and told SQL Server not to check the existing data, so SQL Server cannot trust this constraint. For table t2, we disabled the constraint, added data, then enabled the constraint; SQL Server cannot trust it because we might have added data which violates the constraint while it was disabled. There's an easy way to check whether you have non-trusted constraints. For instance, for check constraints:

    SELECT OBJECT_NAME(parent_object_id) AS table_name, name
    FROM sys.check_constraints
    WHERE is_not_trusted = 1

    Now, can we turn a non-trusted constraint into a trusted constraint? Yes. But we first need to get rid of offending data:

    DELETE FROM t1 WHERE c1 <= 0
    DELETE FROM t2 WHERE c1 <= 0

    And now we want to make sure the constraints are trusted:

    ALTER TABLE t1 WITH CHECK CHECK CONSTRAINT CK_t1_c1
    ALTER TABLE t2 WITH CHECK CHECK CONSTRAINT CK_t2_c1

    There's no typo above. "WITH CHECK" is to validate the data, and "CHECK CONSTRAINT" is to enable the constraint.

    So, why would we want to bother with disabled and non-trusted constraints? The purpose of constraints is to make sure that we have consistent data. Why would we want to break this in the first place? Well, rarely. But here are a couple of possible scenarios where one could consider disabling constraints:

    • We run some batch operation once a week. With constraints enabled, this takes 5 hours. With constraints disabled, it takes 1 hour. We "know" that the batch is written in a way so that it doesn't violate any of our constraints.
    • We want to expand the domain of allowable values for a column. Today we allow the values Y and N; we will also allow the value U. This is implemented as a check constraint. We remove the existing constraint and add a new one (which also allows U). We know that none of the existing rows can violate the new constraint, since we are expanding the domain of allowable values. Adding the new constraint with NOCHECK is much faster (see the sketch below).
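
    As a sketch of the second scenario (table, column and constraint names are made up for illustration):

    --Hypothetical names; expand the domain from (Y, N) to (Y, N, U)
    ALTER TABLE dbo.someTable DROP CONSTRAINT CK_someTable_status
    ALTER TABLE dbo.someTable WITH NOCHECK
      ADD CONSTRAINT CK_someTable_status CHECK(status IN('Y', 'N', 'U'))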

    The above examples might seem a bit ... constructed. I haven't encountered many non-trusted constraints in reality, but it has happened. And my initial goal was to talk about non-trusted constraints and performance, which will come in the following blog post.


  • More maint procedures

    A few months ago, I blogged about a simple procedure to do backups (http://sqlblog.com/blogs/tibor_karaszi/archive/2007/09/25/tsql-script-to-do-backup-like-maintenance-plans.aspx). Just to let you know that Ola Hallengren has some scripts of his own, much more elaborate than mine, including things like index reorganization. Check it out at http://ola.hallengren.com/.


  • Backup compression in SQL Server 2008

    Having a few moments to spare, I decided to give this a spin.

    Specifying for the backup to be compressed is really simple. Just specify COMPRESSION in the WITH clause of the BACKUP command. For example:

    BACKUP DATABASE Adventureworks
    TO DISK = 'C:\Advc.bak'
    WITH INIT, COMPRESSION

    For fun, I compared backup file size and backup time with and without compression. I ran the script below after priming the cache (not that it should matter, since backup reads pages from disk; see http://sqlblog.com/blogs/tibor_karaszi/archive/2007/08/03/does-backup-utilize-pages-in-cache.aspx). I also deleted the backup files, if they existed, before execution.

    DECLARE @dt datetime

    SET @dt = GETDATE()
    BACKUP DATABASE Adventureworks
    TO DISK = 'C:\Adv.bak'
    WITH INIT
    SELECT DATEDIFF(ms, @dt, GETDATE())

    SET @dt = GETDATE()
    BACKUP DATABASE Adventureworks
    TO DISK = 'C:\Advc.bak'
    WITH INIT, COMPRESSION
    SELECT DATEDIFF(ms, @dt, GETDATE())

    Size of the backup file is 129 MB vs. 35 MB. Of course, the compression ratio varies depending on what type of data there is in the database (string data tends to compress better than other data, for instance). Times to execute the commands were 7.5 seconds vs. 3.8 seconds. Now, this is a virtual machine on VPC and Adventureworks is a tiny database, but at least we can see clear potential for savings here.

    So how do we know if a database backup was compressed? If you've used RESTORE HEADERONLY, you probably noticed this annoying column called "Compressed". Well, this is the first time we see a value of "1" in that column.
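
    For example:

    --Check the Compressed column in the output
    RESTORE HEADERONLY FROM DISK = 'C:\Advc.bak'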

    It seems we cannot mix compressed and non-compressed backups on the same file, though. (Not that I often have multiple backups on the same file.) If I append a backup using COMPRESSION to a file where there already is a non-compressed backup, I get an error. And if I do a backup without COMPRESSION to a file where there are compressed backups, the new backup will be compressed (even without saying COMPRESSION). Something to look out for if you have several backups on the same backup files.
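
    To reproduce the first case (assuming C:\Adv.bak still holds the non-compressed backup from above), something like this should give the error:

    --Appending a compressed backup to a file holding a non-compressed backup fails
    BACKUP DATABASE Adventureworks
    TO DISK = 'C:\Adv.bak'
    WITH NOINIT, COMPRESSION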

    So what about the RESTORE command? Well, there's nothing to say, really. You don't have to specify in the RESTORE command that the backup was compressed. Time for RESTORE was 10.4 vs. 6.3 seconds (with the destination database already existing). I expect the difference to be bigger on a real installation with a realistic database size.
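
    In other words, just a plain restore; something like this (REPLACE since my destination database already existed):

    --No compression-related option is needed on restore
    RESTORE DATABASE Adventureworks
    FROM DISK = 'C:\Advc.bak'
    WITH REPLACE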


  • Owner/schema qualifying object names

    It isn't easy to remember all the details regarding object/schema qualifying object names. There are many variables involved, such as:

    • Version of SQL Server
    • Qualifying the owner when you call the proc (EXEC dbo.procname)
    • Qualifying object references inside the proc code
    • Qualifying object names if you aren't using procedures
    • Whether the user is the same as the owner of the proc/table
    • Default schema for the user

    So, I decided to give it a spin for different combinations and investigate both profiler events and the number of plans in cache. I won't post all the details here; that would be too much to write down. You can use the scripts at the end to do your own investigation. I did not investigate the differences regarding compile locks. Anyhow, here are my conclusions:

    • I could not produce any recompile (SP:Recompile or SQL:StmtRecompile) event for any combination.
    • I did find SP:CacheMiss events on 2000 when you execute a proc and don't qualify the proc name (on 2005 I always got those events). Then a subsequent SP:CacheHit will follow.
    • For straight SQL (no procedures) I noticed that each user gets its own plan when you don't owner-qualify the table name. This makes sense. An interesting aspect on 2005 was that if you specify a default schema for the users (and two users have the same default schema), then the users will share the plan (basically the default schema becomes the "owner" of the plan). See the snippet right after this list.
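
    As a side note, here's how I'd assign the default schema on 2005 for the test users in the scripts below (this step is not part of the scripts; add it if you want to test the shared-plan behavior):

    --2005 only: give both users the same default schema so they can share plans
    ALTER USER Kalle WITH DEFAULT_SCHEMA = dbo
    ALTER USER Olle WITH DEFAULT_SCHEMA = dbo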

    Below are the scripts I used:

    --Login as sysadmin:
    USE master
    IF DB_ID('myTestDb') IS NOT NULL DROP DATABASE myTestDb
    IF EXISTS(SELECT * FROM syslogins WHERE name = 'Kalle') EXEC sp_droplogin 'Kalle'
    IF EXISTS(SELECT * FROM syslogins WHERE name = 'Olle') EXEC sp_droplogin 'Olle'
    GO
    EXEC sp_addlogin 'Kalle', '*hjk&6f' EXEC sp_addlogin 'Olle', '*hjk&6f'
    CREATE DATABASE myTestDb
    GO
    USE myTestDb
    EXEC sp_grantdbaccess 'Kalle' EXEC sp_grantdbaccess 'Olle'
    GO
    CREATE TABLE dbo.t(c1 int identity PRIMARY KEY, c2 char(30))
    INSERT INTO dbo.t SELECT TOP 1000 'hello' FROM sysobjects a CROSS JOIN sysobjects b
    CREATE INDEX x ON t(c1)
    GO
    CREATE PROC dbo.p AS SELECT c1 FROM t WHERE c1 = 34 AND c2 = 'Hello'
    GO
    CREATE PROC dbo.p_q AS SELECT c1 FROM dbo.t WHERE c1 = 34 AND c2 = 'Hello'
    GO
    GRANT EXECUTE ON dbo.p TO Kalle, Olle
    GRANT EXECUTE ON dbo.p_q TO Kalle, Olle
    GRANT SELECT ON t TO Kalle, Olle

    --Number of plans in cache, run after executions of proc
    SELECT OBJECT_NAME(objid), sqluid, *
    FROM master..syscacheobjects
    WHERE dbid = DB_ID()
    AND cacheobjtype = 'Compiled Plan'
    AND sql NOT LIKE '%PSTATMAN%'


    --Run this three times, logged in as sysadmin (dbo), Kalle and Olle
    USE myTestDb
    GO
    EXEC dbo.p
    GO
    EXEC dbo.p_q
    GO
    EXEC p
    GO
    EXEC p_q
    GO
    SELECT c1 FROM t WHERE c1 = 34 AND c2 = 'Hello'
    GO
    SELECT c1 FROM dbo.t WHERE c1 = 34 AND c2 = 'Hello'
    GO
    USE master


  • Sorted views...

    Here's one that pops up regularly. Something like:

    "My views are no longer ordered in SQL Server 2005. I have ORDER BY in the view, but when I select from it, the rows are not returned according to the ORDER BY."

    Since I tend to reply often enough to this type of post, I decided to put it in writing once and for all, so I can point to this blog post. (Laziness is a virtue ;-) .)

    A view is by definition not sorted. A view is supposed to behave like a table (and thanks to that we have the same language to operate against views as we have to operate against tables). This is why ORDER BY is not by itself allowed in a view definition.

    ORDER BY is allowed if you also have TOP, but the purpose of the ORDER BY is then to make sure that the correct rows are returned (the 5 most expensive books, for instance), not to return them in any particular order.
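
    For example, a view returning the 5 most expensive books might look like this (table and column names are made up):

    --ORDER BY here only determines WHICH 5 rows qualify, not the output order
    CREATE VIEW mostExpensiveBooks AS
    SELECT TOP 5 title, price
    FROM dbo.books
    ORDER BY price DESC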

    At some point in time, the "workaround" to have TOP 100 PERCENT popped up. The thinking was that we are now allowed to have ORDER BY, which makes the view "sorted". It doesn't. However, SQL Server 2000's optimizer wasn't as smart as 2005's, so for simple queries you often found that data was returned according to your ORDER BY.

    SQL Server 2005's optimizer is smarter, and if it finds TOP 100 PERCENT and ORDER BY it realizes that neither operation affects *which* rows to return, so both operations are removed from the execution plan. That is smart, since they don't affect the data anyhow. Remember that a view is by definition not sorted.

    Some developers have relied on the 2000 behavior to create "sorted views". This behavior was never documented, just a side effect of the execution plan, and was never guaranteed.

    So what do we do? Well, the same as when we run queries against a table: we use ORDER BY when we read data from the view!
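
    Using the hypothetical view from above:

    --Sort at query time, not in the view definition
    SELECT title, price
    FROM mostExpensiveBooks
    ORDER BY price DESC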

    Aside: The query builder which you can use when you create a view has a very strange behavior. It allows you to check a "sort" checkbox, and it will then add ORDER BY and TOP 100 PERCENT to the SELECT statement. Given the explanation above, this is kind of silly, but I'm hoping that MS will tidy this up for the next version of SQL Server.


  • Find table and index name for fragmented indexes

    Got this question from a newsgroup today. The answer is pretty simple, just use the dynamic management view sys.dm_db_index_physical_stats. I'm posting this here mostly so I have somewhere to refer to when asked this question...

    I prefer to have a helper function to get the index name: 

    CREATE FUNCTION dbo.index_name (@object_id int, @index_id int)
    RETURNS sysname
    AS
    BEGIN
      RETURN
        (SELECT name FROM sys.indexes WHERE object_id = @object_id AND index_id = @index_id)
    END;
    GO

    And then a simple query:

    SELECT
     OBJECT_NAME(object_id) AS tblName
    ,dbo.index_name(object_id, index_id) AS ixName
    ,avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL)
    WHERE avg_fragmentation_in_percent > 20
    AND index_type_desc IN('CLUSTERED INDEX', 'NONCLUSTERED INDEX')

    Then you just adjust the search clause to your liking. One hint is to exclude indexes with few pages (the page_count column).
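
    For instance, with a page_count filter added (the 1000-page threshold is just a ballpark figure of mine):

    SELECT
     OBJECT_NAME(object_id) AS tblName
    ,dbo.index_name(object_id, index_id) AS ixName
    ,avg_fragmentation_in_percent
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL)
    WHERE avg_fragmentation_in_percent > 20
    AND index_type_desc IN('CLUSTERED INDEX', 'NONCLUSTERED INDEX')
    AND page_count > 1000 --Skip small indexes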



  • TSQL script to do backup like Maintenance Plans

    This is a frequently asked question, both in forums and in class:

    How can I, through TSQL, create new backup files for each backup and have the date and time in the file names?

    The reason for the above is most often that you want a higher degree of customization than maintenance plans allow for. An obvious example in 2005 is adding the CHECKSUM option to the BACKUP command. I know there already exist several scripts like mine "out there"; this is just my simple version, so I remember where to point whenever I get this question next. :-)

    Since I probably have some bugs in it, I have it on my website. This way I can fix the bugs as needed. You are most welcome to report bugs in the script if you spot any! And here's the URL:

    http://www.karaszi.com/SQLServer/util_backup_script_like_MP.asp
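
    The core idea, as a minimal sketch (this is not the full script linked above; path, database name and options are just examples):

    --Build a file name like C:\backup\MyDb_20080115_2330.bak; names are examples
    DECLARE @file varchar(200)
    SET @file = 'C:\backup\MyDb_'
      + REPLACE(REPLACE(REPLACE(CONVERT(varchar(16), GETDATE(), 120), '-', ''), ' ', '_'), ':', '')
      + '.bak'
    BACKUP DATABASE MyDb TO DISK = @file WITH CHECKSUM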


  • COPY_ONLY backups and SSMS

    Here's a surprising one:

    SQL Server 2005 introduced the COPY_ONLY option of the BACKUP command. This is relevant for

    • Database backups. The backup will not interfere with your differential backups (it will not reset the DCM pages).
    • Log backups. The backup will not break the log chain (empty the log).

    There was a post in the newsgroup where a gentleman wanted to do a restore from a backup taken with COPY_ONLY. And he claimed that the GUI would not list that backup from the backup file. I basically replied that my guess was that he didn't drive the restore GUI correctly (typing the RESTORE command worked just fine, of course). I also said that the backup you produce shows no evidence of being taken using the COPY_ONLY option, since this option only affects what happens with the source database (see above). Needless to say, I was incorrect (else you wouldn't be reading this blog post).

    It turns out that SSMS does not list backups taken using the COPY_ONLY option. Here's a part of my reply from the newsgroup discussion (edited):

    Let me try it and see if I get the same behaviour in the GUI as you describe:

    BACKUP DATABASE pubs
    TO DISK = N'C:\pubs.bak'
    WITH INIT, COPY_ONLY

    Right-click Databases folder, Restore Database, Type in "pubs" for database name, select "from device", press "..."
    Backup media: File
    File name: C:\pubs.bak, OK
    OK
    ... and indeed, there is nothing listed!

    OK, let's do the same except I don't specify COPY_ONLY... And now the backup is listed! So, my apologies; I was incorrect. I'm surprised that the backup somehow indicates it was done using COPY_ONLY. Let me try something else:

    BACKUP DATABASE pubs
    TO DISK = N'C:\pubs.bak'
    WITH INIT

    BACKUP DATABASE pubs
    TO DISK = N'C:\pubs.bak'
    WITH NOINIT, COPY_ONLY

    RESTORE HEADERONLY FROM DISK = N'C:\pubs.bak'

    Yes, RESTORE HEADERONLY does indicate whether the backup was done using COPY_ONLY; I see a difference in the "flags" column as well as the "IsCopyOnly" column. And the restore dialog only shows the first backup. Let me now try the other way around:

    BACKUP DATABASE pubs
    TO DISK = N'C:\pubs.bak'
    WITH INIT, COPY_ONLY

    BACKUP DATABASE pubs
    TO DISK = N'C:\pubs.bak'
    WITH NOINIT

    Now the restore dialog only shows the second backup in the backup file (position 2). I get the same result if I type in some other database name to restore into (a non-existing database).

    So, the restore dialog does indeed refuse to list backups done using COPY_ONLY. Here's another reason to type the RESTORE command instead of relying on how the GUI developer believes the restore should be done... :-)


  • Query cost

    I often get a question about the unit for a query plan's "Estimated Subtree Cost". I used to say "this is just a 'tick'", but now I have a better explanation to point to:

    http://blogs.msdn.com/sqlqueryprocessing/archive/2006/10/11/What-is-this-cost.aspx

