<?xml version='1.0' encoding='UTF-8'?><rss xmlns:atom='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' version='2.0'><channel><atom:id>tag:blogger.com,1999:blog-3912586870549609682</atom:id><lastBuildDate>Mon, 01 Sep 2008 22:38:00 +0000</lastBuildDate><title>All Fired Up...</title><description>Advanced SQL Server discussions, articles and tools.</description><link>http://www.fotia.co.uk/fotia/Blog/AllFiredUp...html</link><managingEditor>noreply@blogger.com (Stefan Delmarco)</managingEditor><generator>Blogger</generator><openSearch:totalResults>29</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3912586870549609682.post-7919049480691442739</guid><pubDate>Fri, 01 Feb 2008 01:10:00 +0000</pubDate><atom:updated>2008-02-04T23:15:21.982Z</atom:updated><category domain='http://www.blogger.com/atom/ns#'>sql</category><category domain='http://www.blogger.com/atom/ns#'>partitioned views</category><title>Partitioned Views on a Budget</title><description>&lt;font style="font-size: small; font-family: georgia"&gt;   &lt;p&gt;&lt;a title="Partitioned Views" href="http://www.fotia.co.uk/fotia/FA.02.Sql2KPartitionedViews.01.aspx"&gt;Partitioned views&lt;/a&gt; are a great feature of SQL Server. By injecting an additional layer of abstraction between your logical view and physical implementation they give you fine grained control over data partitioning. However, the cover charge required for entry to this exclusive club is pretty &lt;a href="http://msdn2.microsoft.com/en-us/library/ms190019.aspx"&gt;high&lt;/a&gt;. The checklist required to implement a partitioned view can often interfere with your design to such a degree that they become more of a hindrance than a benefit. However, by decomposing the features SQL Server uses to implement the complete partitioned views specification we can get most of the benefits without meeting all of the criteria.&lt;/p&gt; &lt;/font&gt;  &lt;p&gt;One of the benefits of partitioned views is the query optimizer's ability to determine which subset of the member tables need to be queried for each access to the partitioned view. The SQL Server query optimizer has clever features built into it that are able to understand variables and constants in TSQL and relate those constants to search predicates. The optimizer is able to use that insight to short-circuit query plan execution and not bother with branches that will never execute. For example:&lt;/p&gt; &lt;/font&gt;&lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;   &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;declare&lt;/font&gt; @constant &lt;font color="#0000ff"&gt;int&lt;/font&gt;&lt;font color="#696969"&gt;;&lt;/font&gt;       &lt;br /&gt;&lt;font color="#0000ff"&gt;set&lt;/font&gt; @constant = 0;       &lt;br /&gt;      &lt;br /&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; &lt;font color="#696969"&gt;*&lt;/font&gt;       &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; Person.Contact       &lt;br /&gt;&lt;font color="#0000ff"&gt;where&lt;/font&gt; 1 &lt;font color="#696969"&gt;= &lt;/font&gt;@constant; &lt;/div&gt; &lt;/font&gt;  &lt;p&gt;The query plan for this bit of TSQL looks like this:&lt;/p&gt;  &lt;p&gt;&lt;img src="http://www.fotia.co.uk/fotia/Blog/Images/200802.StartupExpression.png" /&gt;&lt;/p&gt;  &lt;p&gt;The SQL Server query optimizer has introduced a filter into the query plan that has a special property called a Startup Expression Predicate. This property is the short-circuit logic that determines whether or not the clustered index scan on the Person.Contact table executes or not. If the startup expression evaluates to FALSE then the optimizer doesn't bother executing any portion of that query. In this example, SQL Server will do nothing when @constant = 1 evaluates to FALSE. It is the exact same technology that partitioned views use to determine which underlying member tables to access based on the member tables' check constraints / partitioning columns.&lt;/p&gt;  &lt;p&gt;For example, consider a situation where I would like to query a table that exists in a number of databases. I would like to have a partitioned view that aggregates these member tables in a single view. However, I don't want to pay the penalty of having to query every member table for every query. If I specify the database name in the SELECT query, I want the optimizer to use its short-circuit feature to go directly to the one table that will satisfy the search arguments. The solution is easier than you think!&lt;/p&gt; &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;   &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;create database&lt;/font&gt; Database01;       &lt;br /&gt;&lt;font color="#0000ff"&gt;create database&lt;/font&gt; Database02;       &lt;br /&gt;&lt;font color="#0000ff"&gt;create database&lt;/font&gt; Database10;       &lt;br /&gt;&lt;font color="#0000ff"&gt;create database&lt;/font&gt; Database11;       &lt;br /&gt;go       &lt;br /&gt;      &lt;br /&gt;&lt;font color="#0000ff"&gt;use&lt;/font&gt; Database01       &lt;br /&gt;&lt;font color="#0000ff"&gt;create table&lt;/font&gt; MemberTable (       &lt;br /&gt;&amp;#160;&amp;#160; Col1 &lt;font color="#0000ff"&gt;varchar&lt;/font&gt;(10));       &lt;br /&gt;go       &lt;br /&gt;      &lt;br /&gt;&lt;font color="#0000ff"&gt;use&lt;/font&gt; Database02       &lt;br /&gt;&lt;font color="#0000ff"&gt;create table&lt;/font&gt; MemberTable (       &lt;br /&gt;&amp;#160;&amp;#160; Col1 &lt;font color="#0000ff"&gt;varchar&lt;/font&gt;(10));       &lt;br /&gt;go       &lt;br /&gt;      &lt;br /&gt;&lt;font color="#0000ff"&gt;use&lt;/font&gt; Database10       &lt;br /&gt;&lt;font color="#0000ff"&gt;create table&lt;/font&gt; MemberTable (       &lt;br /&gt;&amp;#160;&amp;#160; Col1 &lt;font color="#0000ff"&gt;varchar&lt;/font&gt;(10));       &lt;br /&gt;go       &lt;br /&gt;      &lt;br /&gt;      &lt;br /&gt;&lt;font color="#0000ff"&gt;use&lt;/font&gt; Database11       &lt;br /&gt;&lt;font color="#0000ff"&gt;create table&lt;/font&gt; MemberTable (       &lt;br /&gt;&amp;#160;&amp;#160; Col1 &lt;font color="#0000ff"&gt;varchar&lt;/font&gt;(10));       &lt;br /&gt;go       &lt;br /&gt;      &lt;br /&gt;&lt;font color="#0000ff"&gt;insert&lt;/font&gt; Database01.dbo.MemberTable &lt;font color="#0000ff"&gt;values&lt;/font&gt; (&lt;font color="#ff0000"&gt;'Data01'&lt;/font&gt;);       &lt;br /&gt;&lt;font color="#0000ff"&gt;insert&lt;/font&gt; Database02.dbo.MemberTable &lt;font color="#0000ff"&gt;values&lt;/font&gt; (&lt;font color="#ff0000"&gt;'Data02'&lt;/font&gt;);       &lt;br /&gt;&lt;font color="#0000ff"&gt;insert&lt;/font&gt; Database10.dbo.MemberTable &lt;font color="#0000ff"&gt;values&lt;/font&gt; (&lt;font color="#ff0000"&gt;'Data10'&lt;/font&gt;);       &lt;br /&gt;&lt;font color="#0000ff"&gt;insert&lt;/font&gt; Database11.dbo.MemberTable &lt;font color="#0000ff"&gt;values&lt;/font&gt; (&lt;font color="#ff0000"&gt;'Data11'&lt;/font&gt;);       &lt;br /&gt;go       &lt;br /&gt;      &lt;br /&gt;&lt;font color="#0000ff"&gt;use&lt;/font&gt; tempdb;       &lt;br /&gt;go       &lt;br /&gt;&lt;font color="#0000ff"&gt;create view&lt;/font&gt; BudgetView &lt;font color="#0000ff"&gt;as&lt;/font&gt;       &lt;br /&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; &lt;font color="#ff0000"&gt;'Database01'&lt;/font&gt; &lt;font color="#0000ff"&gt;as&lt;/font&gt; SourceDatabase, Col1       &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; Database01.dbo.MemberTable       &lt;br /&gt;&lt;font color="#0000ff"&gt;union all &lt;/font&gt;      &lt;br /&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; &lt;font color="#ff0000"&gt;'Database02'&lt;/font&gt;, Col1       &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; Database02.dbo.MemberTable       &lt;br /&gt;&lt;font color="#0000ff"&gt;union all&lt;/font&gt;       &lt;br /&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; &lt;font color="#ff0000"&gt;'Database10'&lt;/font&gt;, Col1       &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; Database10.dbo.MemberTable       &lt;br /&gt;&lt;font color="#0000ff"&gt;union all        &lt;br /&gt;select&lt;/font&gt; &lt;font color="#ff0000"&gt;'Database11'&lt;/font&gt;, Col1       &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; Database11.dbo.MemberTable       &lt;br /&gt;go       &lt;br /&gt;      &lt;br /&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; *       &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; BudgetView;       &lt;br /&gt;      &lt;br /&gt;      &lt;table class="grid"&gt;&lt;tbody&gt;         &lt;tr&gt;           &lt;td&gt;SourceDatabase&lt;/td&gt;            &lt;td&gt;Col1&lt;/td&gt;         &lt;/tr&gt;          &lt;tr&gt;           &lt;td&gt;--------------&lt;/td&gt;            &lt;td&gt;----------&lt;/td&gt;         &lt;/tr&gt;          &lt;tr&gt;           &lt;td&gt;Database01&lt;/td&gt;            &lt;td&gt;Data01&lt;/td&gt;         &lt;/tr&gt;          &lt;tr&gt;           &lt;td&gt;Database02&lt;/td&gt;            &lt;td&gt;Data02&lt;/td&gt;         &lt;/tr&gt;          &lt;tr&gt;           &lt;td&gt;Database10&lt;/td&gt;            &lt;td&gt;Data10&lt;/td&gt;         &lt;/tr&gt;          &lt;tr&gt;           &lt;td&gt;Database11&lt;/td&gt;            &lt;td&gt;Data11&lt;/td&gt;         &lt;/tr&gt;       &lt;/tbody&gt;&lt;/table&gt;      &lt;p&gt;&lt;b&gt;(4 rows(s) affected)&lt;/b&gt;&lt;/p&gt;   &lt;/div&gt; &lt;/font&gt;  &lt;p&gt;The query plan for a SELECT of all of the rows shows SQL accessing each member table in turn:&lt;/p&gt;  &lt;p&gt;&lt;img src="http://www.fotia.co.uk/fotia/Blog/Images/200802.TableScans.PNG" /&gt;&lt;/p&gt;  &lt;p&gt;This is the clever bit. We've put the partitioning criteria that separates each member table into the view's definition. We've hardcoded the database name of each member table into the first column of the view. The really clever bit is that SQL Server understands this! Try the following:&lt;/p&gt; &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;   &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; *       &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; BudgetView       &lt;br /&gt;&lt;font color="#0000ff"&gt;where&lt;/font&gt; SourceDatabase =&lt;font color="#ff0000"&gt; 'Database01'&lt;/font&gt;;       &lt;br /&gt;      &lt;br /&gt;      &lt;table class="grid"&gt;&lt;tbody&gt;         &lt;tr&gt;           &lt;td&gt;SourceDatabase&lt;/td&gt;            &lt;td&gt;Col1&lt;/td&gt;         &lt;/tr&gt;          &lt;tr&gt;           &lt;td&gt;--------------&lt;/td&gt;            &lt;td&gt;----------&lt;/td&gt;         &lt;/tr&gt;          &lt;tr&gt;           &lt;td&gt;Database01&lt;/td&gt;            &lt;td&gt;Data01&lt;/td&gt;         &lt;/tr&gt;       &lt;/tbody&gt;&lt;/table&gt;      &lt;p&gt;&lt;b&gt;(1 row(s) affected)&lt;/b&gt;&lt;/p&gt;   &lt;/div&gt; &lt;/font&gt;  &lt;p&gt;SQL Server has looked at the search argument we specified for SourceDatabase and short-circuited the query plan. It &lt;b&gt;only&lt;/b&gt; queried Database01's MemberTable and didn't bother accessing any of the others!&lt;/p&gt;  &lt;p&gt;&lt;img src="http://www.fotia.co.uk/fotia/Blog/Images/200802.SingleTable.PNG" /&gt;&lt;/p&gt;  &lt;p&gt;It gets even better, SQL Server understands all manner of comparisons, including LIKEs, in this example it only access databases where the name starts with &lt;strong&gt;Database0&lt;/strong&gt;.&lt;/p&gt; &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;   &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; *       &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; BudgetView       &lt;br /&gt;&lt;font color="#0000ff"&gt;where&lt;/font&gt; SourceDatabase like &lt;font color="#ff0000"&gt;'Database0%'&lt;/font&gt;;       &lt;br /&gt;      &lt;br /&gt;      &lt;table class="grid"&gt;&lt;tbody&gt;         &lt;tr&gt;           &lt;td&gt;SourceDatabase&lt;/td&gt;            &lt;td&gt;Col1&lt;/td&gt;         &lt;/tr&gt;          &lt;tr&gt;           &lt;td&gt;--------------&lt;/td&gt;            &lt;td&gt;----------&lt;/td&gt;         &lt;/tr&gt;          &lt;tr&gt;           &lt;td&gt;Database01&lt;/td&gt;            &lt;td&gt;Data01&lt;/td&gt;         &lt;/tr&gt;          &lt;tr&gt;           &lt;td&gt;Database02&lt;/td&gt;            &lt;td&gt;Data02&lt;/td&gt;         &lt;/tr&gt;       &lt;/tbody&gt;&lt;/table&gt;      &lt;p&gt;&lt;b&gt;(2 row(s) affected)&lt;/b&gt;&lt;/p&gt;   &lt;/div&gt; &lt;/font&gt;  &lt;p&gt;&lt;img src="http://www.fotia.co.uk/fotia/Blog/Images/200802.DualTable.PNG" /&gt;&lt;/p&gt;  &lt;p&gt;The story with parameters is also good! The startup filters we saw earlier make an appearance: &lt;/p&gt; &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;   &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;declare&lt;/font&gt; @sourceDatabase &lt;font color="#0000ff"&gt;varchar&lt;/font&gt;(20);       &lt;br /&gt;&lt;font color="#0000ff"&gt;set&lt;/font&gt; @sourceDatabase = &lt;font color="#ff0000"&gt;'Database10'&lt;/font&gt;;       &lt;br /&gt;      &lt;br /&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; *       &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; dbo.BudgetView       &lt;br /&gt;&lt;font color="#0000ff"&gt;where&lt;/font&gt; SourceDatabase = @SourceDatabase;       &lt;br /&gt;      &lt;table class="grid"&gt;&lt;tbody&gt;         &lt;tr&gt;           &lt;td&gt;SourceDatabase&lt;/td&gt;            &lt;td&gt;Col1&lt;/td&gt;         &lt;/tr&gt;          &lt;tr&gt;           &lt;td&gt;--------------&lt;/td&gt;            &lt;td&gt;----------&lt;/td&gt;         &lt;/tr&gt;          &lt;tr&gt;           &lt;td&gt;Database10&lt;/td&gt;            &lt;td&gt;Data10&lt;/td&gt;         &lt;/tr&gt;       &lt;/tbody&gt;&lt;/table&gt;      &lt;p&gt;&lt;b&gt;(1 row(s) affected)&lt;/b&gt;&lt;/p&gt;   &lt;/div&gt; &lt;/font&gt;  &lt;p&gt;&lt;img src="http://www.fotia.co.uk/fotia/Blog/Images/200802.VariableFilter.PNG" /&gt;&lt;/p&gt;  &lt;p&gt;At execution time SQL Server will dynamically determine which branches of the query plan to short-circuit. This is great as it allows a single query plan to be re-used for any value of @sourceDatabase and still deliver the short-circuits benefits.&lt;/p&gt;  &lt;p&gt;We've managed to create a view that has the same short-circuit benefits as a pukka partitioned view without having to adhere to all the partitioned view requirements. Our base tables don't have primary keys defined and there are no check constraints. On the downside our view is not updateable, it cannot be inserted into, updated or deleted. If you need all of these features you can either implement INSTEAD OF triggers or go with fully-fledged partitioned views. In this example we used the partitioning column to identify the source database name. Constant partitioning columns are also useful for soft-delete / archive tables, coarse categories, or chunky time periods.&lt;/p&gt; &lt;/font&gt;  </description><link>http://www.fotia.co.uk/fotia/Blog/2008/02/partitioned-views-on-budget.html</link><author>noreply@blogger.com (Stefan Delmarco)</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3912586870549609682.post-8985261877128987891</guid><pubDate>Mon, 23 Apr 2007 07:13:00 +0000</pubDate><atom:updated>2008-02-04T23:13:49.697Z</atom:updated><category domain='http://www.blogger.com/atom/ns#'>sql</category><category domain='http://www.blogger.com/atom/ns#'>functions</category><title>The NEWSEQUENTIALID Function</title><description>&lt;font style="font-size: small; font-family: georgia"&gt;   &lt;p&gt;The NEWSEQUENTIALID system function is an addition to SQL Server 2005. It seeks to bring together, what used to be, conflicting requirements in SQL Server 2000; namely identity-level insert performance and globally unique values. &lt;/p&gt;    &lt;h2&gt;Insert Performance and Uniqueness&lt;/h2&gt;    &lt;p&gt;The best insert performance will always be achieved by using an identity column as the clustering key. With a monotonically increasing identity column as the clustered index, data is always inserted at the end of the table. At a basic level this is equivalent to the performance of a table that has no clustered index (i.e. a heap) where the insert position for the next row is the page where the last row was inserted. Secondly, there will be no page splits. With a normal clustered index, SQL Server will move half the rows on a full page to a newly allocated page in order to make space for the new row. With an identity column on a clustered index SQL Server will not 'split' the page in this manner. It will just allocate another page and continue inserting rows. It has enough smarts to know not to copy half the rows to the newly allocated page as all new rows will be inserted into the new page. &lt;/p&gt;    &lt;p&gt;The limitations of identity columns come to the fore when you need have unique keys that span either tables, databases or SQL Servers. For example, say you have multiple branches that can each generate orders. Each branch has a replicated copy of the Orders table, however you don't want duplicate order numbers generated across branches. Common solutions to this problem include creating a composite order number that includes the branch id, reserving ranges of the identity column or making use of a central 'number fountain'. Each of these approaches introduce a level of indirection that must be adhered to before a row can be created. If the order number is not required to be human-readable friendly, GUIDs are often used instead. &lt;/p&gt;    &lt;h2&gt;GUIDs&lt;/h2&gt;    &lt;p&gt;GUIDs (Globally Unique Identifiers) are pseudo-random 16-byte (128-bit) numbers whose generation algorithm provides a sufficiently high degree of probability that the same GUID will never be generated twice on any computer at any time. Although a GUID is a binary array, it is often represented in its hexadecimal form using the following format, for example: &lt;font style="font-weight: bold"&gt;dbbc2827-edf8-4a2d-92ad-c1e0059304d7&lt;/font&gt;. This makes them suitable for the 'distributed uniqueness requirement' of our Orders table. &lt;/p&gt;    &lt;p&gt;The problem would be pretty much solved there unless we have some non-trivial performance requirements, especially around insert volumes and speed. As GUIDs are effectively pseudo-random (the degree of randomness / predictability was changed by Microsoft in response to privacy concerns) they do not possess the same orderly insert benefits as identity columns when used as the clustering key. The insert IO pattern changes from sequential to random-access when using GUIDs instead of an IDENTITY column. Not only do GUIDs require far more IO to locate the insert location in the clustered index (the index has to be traversed from the root for every insert), fragmentation at the leaf level leads to less dense pages, which in turn requires defragmentation / reindexing. &lt;/p&gt;    &lt;h2&gt;NEWSEQUENTIALID()&lt;/h2&gt;    &lt;p&gt;Enter NEWSEQUENTIALID(), stage left. This is a new system function included in SQL Server 2005 that combines the best of both worlds. NEWSEQUENTIALID() will generate a GUID that will be greater in value than the previously generated one. &lt;/p&gt;    &lt;p&gt;It is easy to see how this function works by looking at the output the following TSQL script generates: &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;create table&lt;/font&gt; TestTable (         &lt;br /&gt;&amp;#160;&amp;#160; id &lt;font color="#0000ff"&gt;uniqueidentifier default&lt;/font&gt; &lt;font color="#ff00ff"&gt;newsequentialid&lt;/font&gt;() &lt;font color="#696969"&gt;not null&lt;/font&gt; &lt;font color="#0000ff"&gt;primary key clustered&lt;/font&gt;,         &lt;br /&gt;&amp;#160;&amp;#160; sequence &lt;font color="#0000ff"&gt;int&lt;/font&gt;);         &lt;br /&gt;go         &lt;br /&gt;        &lt;br /&gt;&lt;font color="#008000"&gt;-- Insert 100 rows.&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;declare&lt;/font&gt; @count &lt;font color="#0000ff"&gt;int&lt;/font&gt;;         &lt;br /&gt;&lt;font color="#0000ff"&gt;set&lt;/font&gt; @count = 0;         &lt;br /&gt;&lt;font color="#0000ff"&gt;while&lt;/font&gt; @count &amp;lt; 100 &lt;font color="#0000ff"&gt;begin&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;insert&lt;/font&gt; TestTable (sequence)         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;values&lt;/font&gt; (@count);         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;set&lt;/font&gt; @count = @count + 1;         &lt;br /&gt;&lt;font color="#0000ff"&gt;end&lt;/font&gt;;         &lt;br /&gt;go         &lt;br /&gt;        &lt;br /&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; *         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; TestTable         &lt;br /&gt;&lt;font color="#0000ff"&gt;order by&lt;/font&gt; id;         &lt;br /&gt;go         &lt;br /&gt;        &lt;br /&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;id&lt;/td&gt;              &lt;td&gt;sequence&lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;------------------------------------&lt;/td&gt;              &lt;td&gt;-----------&lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;FA780E3B-03C2-DB11-BD9F-0011D82F3F23&lt;/td&gt;              &lt;td&gt;0&lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;FB780E3B-03C2-DB11-BD9F-0011D82F3F23&lt;/td&gt;              &lt;td&gt;1&lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;FC780E3B-03C2-DB11-BD9F-0011D82F3F23&lt;/td&gt;              &lt;td&gt;2&lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;FD780E3B-03C2-DB11-BD9F-0011D82F3F23&lt;/td&gt;              &lt;td&gt;3&lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;FE780E3B-03C2-DB11-BD9F-0011D82F3F23&lt;/td&gt;              &lt;td&gt;4&lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;...&lt;/td&gt;              &lt;td&gt;&amp;#160;&lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;5C790E3B-03C2-DB11-BD9F-0011D82F3F23&lt;/td&gt;              &lt;td&gt;98&lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;5D790E3B-03C2-DB11-BD9F-0011D82F3F23&lt;/td&gt;              &lt;td&gt;99&lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;     &lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;You'll notice that there is a clear bit-shifting pattern that SQL Server is employing to keep the GUIDs increasing in relative value. These are the restrictions in NEWSEQUENTIALID's use: &lt;/p&gt;    &lt;ul&gt;     &lt;li&gt;NEWSEQUENTIALID cannot be used in arbitrary TSQL statements. It can only be specified as the DEFAULT value for an uniqueidentifier column &lt;/li&gt;      &lt;li&gt;More than one column in a table can use NEWSEQUENTIALID &lt;/li&gt;      &lt;li&gt;It cannot be combined with other scalar functions e.g. REVERSE(NEWSEQUENTIALID()) is not allowed &lt;/li&gt;   &lt;/ul&gt;    &lt;p&gt;The network card's MAC address is easily identifiable in the generated GUID. My laptop's MAC address is&lt;/p&gt;    &lt;div class="tsql"&gt;&amp;#160;&amp;#160; 00-01-4A-28-64-8B&lt;/div&gt;    &lt;p&gt;This is one of the NEWSEQUENTIALID GUIDs that was generated:&lt;/p&gt;    &lt;div class="tsql"&gt;&amp;#160;&amp;#160; 4EAC7708-30C3-DB11-B902-&lt;font style="font-weight: bold; text-decoration: underline"&gt;00014A28648B&lt;/font&gt;&lt;/div&gt;    &lt;p&gt;Hence the privacy warnings in BOL:&lt;/p&gt;   &lt;img class="pix" alt="NEWSEQUENTIALID() Privacy Warning" src="/fotia/Blog/Images/200704.NewSequentialIdPrivacy.PNG" /&gt;     &lt;p&gt;Correlating the position of the MAC address in the GUID with the bits that get incremented / shifted for each successive GUID sheds light on why the GUIDS remain globally unique. The MAC address portion of the GUID does not change. &lt;/p&gt;    &lt;h2&gt;Performance Comparison&lt;/h2&gt;    &lt;p&gt;The following test demonstrates the benefits of NEWSEQUENTIALID. The insert performance of the various clustering key variations was measured and compared. For each test the same TestTable was recreated with a different Id column definition: &lt;/p&gt;    &lt;ul&gt;     &lt;li&gt;IDENTITY(,) &lt;/li&gt;      &lt;li&gt;NEWID() &lt;/li&gt;      &lt;li&gt;NEWSEQUENTIALID() &lt;/li&gt;   &lt;/ul&gt;    &lt;p&gt;The following TSQL contains the DDL used to construct the tables for the 3 clustered keys: &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#008000"&gt;-- IDENTITY&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;create table&lt;/font&gt; TestTable (         &lt;br /&gt;&amp;#160;&amp;#160; id &lt;font color="#0000ff"&gt;int identity&lt;/font&gt;(1,1) &lt;font color="#696969"&gt;not null&lt;/font&gt; &lt;font color="#0000ff"&gt;primary key clustered&lt;/font&gt;,         &lt;br /&gt;&amp;#160;&amp;#160; sequence &lt;font color="#0000ff"&gt;int&lt;/font&gt; &lt;font color="#696969"&gt;not null&lt;/font&gt;,         &lt;br /&gt;&amp;#160;&amp;#160; data &lt;font color="#0000ff"&gt;char&lt;/font&gt;(250) &lt;font color="#696969"&gt;not null&lt;/font&gt; &lt;font color="#0000ff"&gt;default&lt;/font&gt; &lt;font color="#ff0000"&gt;''&lt;/font&gt;);         &lt;br /&gt;go         &lt;br /&gt;        &lt;br /&gt;&lt;font color="#008000"&gt;-- NEWID&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;create table&lt;/font&gt; TestTable (         &lt;br /&gt;&amp;#160;&amp;#160; id &lt;font color="#0000ff"&gt;uniqueidentifier default&lt;/font&gt; &lt;font color="#ff00ff"&gt;newid&lt;/font&gt;() not null &lt;font color="#0000ff"&gt;primary key clustered&lt;/font&gt;,         &lt;br /&gt;&amp;#160;&amp;#160; sequence &lt;font color="#0000ff"&gt;int&lt;/font&gt; &lt;font color="#696969"&gt;not null&lt;/font&gt;,         &lt;br /&gt;&amp;#160;&amp;#160; data &lt;font color="#0000ff"&gt;char&lt;/font&gt;(250) &lt;font color="#696969"&gt;not null&lt;/font&gt; &lt;font color="#0000ff"&gt;default&lt;/font&gt; &lt;font color="#ff0000"&gt;''&lt;/font&gt;);         &lt;br /&gt;go         &lt;br /&gt;        &lt;br /&gt;&lt;font color="#008000"&gt;-- NEWSEQUENTIALID&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;create table&lt;/font&gt; TestTable (         &lt;br /&gt;&amp;#160;&amp;#160; id &lt;font color="#0000ff"&gt;uniqueidentifier default&lt;/font&gt; &lt;font color="#ff00ff"&gt;newsequentialid&lt;/font&gt;() not null &lt;font color="#0000ff"&gt;primary key clustered&lt;/font&gt;,         &lt;br /&gt;&amp;#160;&amp;#160; sequence &lt;font color="#0000ff"&gt;int&lt;/font&gt; &lt;font color="#696969"&gt;not null&lt;/font&gt;,         &lt;br /&gt;&amp;#160;&amp;#160; data &lt;font color="#0000ff"&gt;char&lt;/font&gt;(250) &lt;font color="#696969"&gt;not null&lt;/font&gt; &lt;font color="#0000ff"&gt;default&lt;/font&gt; &lt;font color="#ff0000"&gt;''&lt;/font&gt;);         &lt;br /&gt;go         &lt;br /&gt;&lt;/div&gt;      &lt;p&gt;For each test 50,000 rows were inserted into TestTable table using the following TSQL script: &lt;/p&gt;      &lt;div class="tsql"&gt;&lt;font color="#008000"&gt;-- Insert 50,000 rows.&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;declare&lt;/font&gt; @count &lt;font color="#0000ff"&gt;int&lt;/font&gt;;         &lt;br /&gt;&lt;font color="#0000ff"&gt;set&lt;/font&gt; @count = 0;         &lt;br /&gt;&lt;font color="#0000ff"&gt;while&lt;/font&gt; @count &amp;lt; 50000 &lt;font color="#0000ff"&gt;begin&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;insert&lt;/font&gt; TestTable (sequence)         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;values&lt;/font&gt; (@count);         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;set&lt;/font&gt; @count = @count + 1;         &lt;br /&gt;&lt;font color="#0000ff"&gt;end&lt;/font&gt;;         &lt;br /&gt;go         &lt;br /&gt;&lt;/div&gt;      &lt;p&gt;The following metrics were gathered after each run: &lt;/p&gt;      &lt;div class="tsql"&gt;&lt;font color="#008000"&gt;-- Get the number of read / writes for this session...&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; reads, writes         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; &lt;font color="#0000ff"&gt;sys.dm_exec_sessions&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;where&lt;/font&gt; session_id = &lt;font color="#ff00ff"&gt;@@spid&lt;/font&gt;;         &lt;br /&gt;        &lt;br /&gt;&lt;font color="#008000"&gt;-- Get the page fragmentation and density at the leaf level.&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; index_type_desc, index_depth, page_count, avg_page_space_used_in_percent, avg_fragmentation_in_percent, record_count         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; sys.dm_db_index_physical_stats(&lt;font color="#ff00ff"&gt;db_id&lt;/font&gt;(), &lt;font color="#ff00ff"&gt;object_id&lt;/font&gt;(&lt;font color="#ff0000"&gt;'TestTable'&lt;/font&gt;), &lt;font color="#696969"&gt;null&lt;/font&gt;, &lt;font color="#696969"&gt;null&lt;/font&gt;, &lt;font color="#ff0000"&gt;'detailed'&lt;/font&gt;)         &lt;br /&gt;&lt;font color="#0000ff"&gt;where&lt;/font&gt; index_level = 0;         &lt;br /&gt;go         &lt;br /&gt;&lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;The results are quite compelling:&lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;       &lt;table class="data"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;th&gt;&amp;#160;&lt;/th&gt;              &lt;th&gt;Reads&lt;/th&gt;              &lt;th&gt;Writes&lt;/th&gt;              &lt;th&gt;Leaf Pages&lt;/th&gt;              &lt;th&gt;Avg Page Used&lt;/th&gt;              &lt;th&gt;Avg Fragmentation&lt;/th&gt;              &lt;th&gt;Record Count&lt;/th&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;th&gt;IDENTITY(,)&lt;/th&gt;              &lt;td&gt;0&lt;/td&gt;              &lt;td&gt;1,683&lt;/td&gt;              &lt;td&gt;1,667&lt;/td&gt;              &lt;td&gt;98.9%&lt;/td&gt;              &lt;td&gt;0.7%&lt;/td&gt;              &lt;td&gt;50,000&lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;th&gt;NEWID()&lt;/th&gt;              &lt;td&gt;0&lt;/td&gt;              &lt;td&gt;5,386&lt;/td&gt;              &lt;td&gt;2,486&lt;/td&gt;              &lt;td&gt;69.3%&lt;/td&gt;              &lt;td&gt;99.2%&lt;/td&gt;              &lt;td&gt;50,000&lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;th&gt;NEWSEQUENTIALID()&lt;/th&gt;              &lt;td&gt;0&lt;/td&gt;              &lt;td&gt;1,746&lt;/td&gt;              &lt;td&gt;1,725&lt;/td&gt;              &lt;td&gt;99.9%&lt;/td&gt;              &lt;td&gt;1.0%&lt;/td&gt;              &lt;td&gt;50,000&lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;     &lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;Most striking is the number of writes required by the NEWID system function. This, coupled with the average page density of 69%, is evidence of the page splitting caused by the random distribution of inserts at the leaf level. As soon as a page fills up, it needs to be split into 2 pages of 50% each for the insert to complete. Not only has page splitting resulted in poor page density, it has fragmented the data pages quite badly (there is a 99% probability that the next data page is not next to the current one). In our tests the most likely place for a free page required for the page split is at the end of the table irrespective of where the row is being inserted. Therefore to read the rows in order the scan needs to keep jumping back and forth between widely distributed split pages, hence the appalling fragmentation. &lt;/p&gt;    &lt;p&gt;The minor difference in page count between IDENTITY and NEWSEQUENTIALID metrics is due to the difference in size between IDENTITY's INT (4 bytes) and NEWSEQUENTIALID's UNQUEIDENTIFIER (16 bytes). However, note that any non-clustered index on the UNIQUEIDENTIFIER table will be 4 times larger than the same index on the IDENTITY table. Therefore IDENTITY is still preferred over NEWSEQUENTIALID when choosing between the two. &lt;/p&gt;    &lt;p&gt;The NEWSEQUENTIALID system function has clearly lived up to its claim of providing GUID-like uniqueness coupled with identity-level insert performance. The number of writes, fragmentation and page density are all inline with identity-level metrics. These benefits make NEWSEQUENTIALID a compelling feature. &lt;/p&gt; &lt;/font&gt;  </description><link>http://www.fotia.co.uk/fotia/Blog/2007/04/newsequentialid-function.html</link><author>noreply@blogger.com (Stefan Delmarco)</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3912586870549609682.post-6549536543849322791</guid><pubDate>Mon, 12 Mar 2007 23:51:00 +0000</pubDate><atom:updated>2008-02-04T23:12:19.707Z</atom:updated><category domain='http://www.blogger.com/atom/ns#'>sql</category><category domain='http://www.blogger.com/atom/ns#'>.net</category><title>When a String Is Not a String</title><description>&lt;font style="font-size: small; font-family: georgia"&gt;   &lt;p&gt;This article was inspired by a seemingly innocent bit of code that caused a performance problem in a very unexpected way. It is the kind of code that you'd find in many samples. Unfortunately, it has a potential flaw that is only exposed when large data volumes are present. &lt;/p&gt;    &lt;h2&gt;The Problem Code&lt;/h2&gt;    &lt;p&gt;The Data Access Layer (DAL) code in question is as follows: &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;public static class&lt;/font&gt; &lt;font color="#008b8b"&gt;AccountDal&lt;/font&gt;         &lt;br /&gt;{         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;public static string&lt;/font&gt; GetTerritoryForAccount(string accountNumber)         &lt;br /&gt;&amp;#160;&amp;#160; {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;using&lt;/font&gt;(&lt;font color="#008b8b"&gt;SqlConnection&lt;/font&gt; con = &lt;font color="#0000ff"&gt;new&lt;/font&gt; &lt;font color="#008b8b"&gt;SqlConnection&lt;/font&gt;(&lt;font color="#008b8b"&gt;@&amp;quot;Server=localhost;Database=AdventureWorks;          &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; Integrated Security=SSPI&amp;quot;&lt;/font&gt;))         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;using&lt;/font&gt;(&lt;font color="#008b8b"&gt;SqlCommand&lt;/font&gt; cmd = con.CreateCommand())         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; cmd.CommandText = &lt;font color="#008b8b"&gt;@&amp;quot;          &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; select t.Name           &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; from Sales.Customer c           &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; inner join Sales.SalesTerritory t           &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; on c.TerritoryID = t.TerritoryId           &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; where c.AccountNumber = @AccountNumber&amp;quot;&lt;/font&gt;;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; cmd.CommandType = &lt;font color="#008b8b"&gt;CommandType&lt;/font&gt;.Text;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; cmd.Parameters.AddWithValue(&lt;font color="#008b8b"&gt;&amp;quot;@AccountNumber&amp;quot;&lt;/font&gt;, accountNumber);         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; con.Open();         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;return&lt;/font&gt; cmd.ExecuteScalar() &lt;font color="#0000ff"&gt;as string&lt;/font&gt;;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; }         &lt;br /&gt;&amp;#160;&amp;#160; }         &lt;br /&gt;}         &lt;br /&gt;&lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;Very simple stuff. Open a connection to SQL, select a specific row from the Sales.Customer table, join the resultant row to the Sales.SalesTerritory table and return the name of the territory. The code is making diligent use of parameters for reusable query plans, strong typing and to prevent SQL injection attacks. The following test exercises the code: &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;[&lt;font color="#008b8b"&gt;TestClass&lt;/font&gt;]         &lt;br /&gt;&lt;font color="#0000ff"&gt;public class&lt;/font&gt; &lt;font color="#008b8b"&gt;AccountDalTests&lt;/font&gt;         &lt;br /&gt;{         &lt;br /&gt;&amp;#160;&amp;#160; [&lt;font color="#008b8b"&gt;TestMethod&lt;/font&gt;]         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;public void&lt;/font&gt; CanGetTerritory()         &lt;br /&gt;&amp;#160;&amp;#160; {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;string&lt;/font&gt; territory = &lt;font color="#008b8b"&gt;AccountDal&lt;/font&gt;.GetTerritoryForAccount(&lt;font color="#008b8b"&gt;&amp;quot;AW00000010&amp;quot;&lt;/font&gt;);         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008b8b"&gt;Assert&lt;/font&gt;.AreEqual(territory, &lt;font color="#008b8b"&gt;&amp;quot;Canada&amp;quot;&lt;/font&gt;);         &lt;br /&gt;&amp;#160;&amp;#160; }         &lt;br /&gt;}         &lt;br /&gt;&lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;This test passes, so functionally the code works as expected. Running this bit of code on a decently spec'd developer workstation takes about 800 ms. Running it repeatedly doesn't improve on this figure. The Sales.Customer table contains 19,185 rows. Given that we're looking up a single row in the Sales.Customer row the alarm bells should be ringing, especially as the Sales.Customer has a nonclustered, unique index on AccountNumber: &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;exec&lt;/font&gt; &lt;font color="#800000"&gt;sp_helpindex&lt;/font&gt; &lt;font color="#ff0000"&gt;'Sales.Customer'&lt;/font&gt;;         &lt;br /&gt;        &lt;br /&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;index_name &lt;/td&gt;              &lt;td&gt;index_description &lt;/td&gt;              &lt;td&gt;index_key &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;-------------------------- &lt;/td&gt;              &lt;td&gt;----------------------------------------------------- &lt;/td&gt;              &lt;td&gt;-------------------- &lt;/td&gt;           &lt;/tr&gt;            &lt;tr style="font-weight: bold"&gt;             &lt;td&gt;AK_Customer_AccountNumber &lt;/td&gt;              &lt;td&gt;nonclustered, unique located on PRIMARY &lt;/td&gt;              &lt;td&gt;AccountNumber &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;AK_Customer_rowguid &lt;/td&gt;              &lt;td&gt;nonclustered, unique located on PRIMARY &lt;/td&gt;              &lt;td&gt;rowguid &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;IX_Customer_TerritoryID &lt;/td&gt;              &lt;td&gt;nonclustered located on PRIMARY &lt;/td&gt;              &lt;td&gt;TerritoryID &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;PK_Customer_CustomerID &lt;/td&gt;              &lt;td&gt;clustered, unique, primary key located on PRIMARY &lt;/td&gt;              &lt;td&gt;CustomerID &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;     &lt;/div&gt;   &lt;/font&gt;    &lt;h2&gt;Diagnosing the Issue&lt;/h2&gt;    &lt;p&gt;Capturing the TSQL being executed in SQL Profiler and running it through Query Analyzer shows the following query plan. The query plan shows exactly where all the time is being spent: &lt;img class="pix" alt="Original Query Plan" src="/fotia/Blog/Images/200703.QueryPlanBefore.png" /&gt; The IX_Customer_TerritoryID index is being used. However, we're performing an index scan of all the AccountNumbers instead of just an index seek to the specific row we're returning. The compute scalar step following the index scan shows precisely why. Expr1004 is the output of an implicit conversion of every AccountNumber to NVARCHAR(10). Looking at the definition of the Sales.Customer table shows that AccountNumber is, in fact, a VARCHAR(10): &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; column_name, data_type, character_maximum_length         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; &lt;font color="#008000"&gt;information_schema.columns&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;where&lt;/font&gt; table_catalog = &lt;font color="#ff0000"&gt;'AdventureWorks'&lt;/font&gt;         &lt;br /&gt;&amp;#160;&lt;font color="#696969"&gt;and&lt;/font&gt; table_schema = &lt;font color="#ff0000"&gt;'Sales'&lt;/font&gt;         &lt;br /&gt;&amp;#160;&lt;font color="#696969"&gt;and&lt;/font&gt; table_name = &lt;font color="#ff0000"&gt;'Customer'&lt;/font&gt;;         &lt;br /&gt;        &lt;br /&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;column_name &lt;/td&gt;              &lt;td&gt;data_type &lt;/td&gt;              &lt;td&gt;character_maximum_length &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;-------------- &lt;/td&gt;              &lt;td&gt;------------------- &lt;/td&gt;              &lt;td&gt;------------------------ &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;CustomerID &lt;/td&gt;              &lt;td&gt;int &lt;/td&gt;              &lt;td&gt;NULL &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;TerritoryID &lt;/td&gt;              &lt;td&gt;int &lt;/td&gt;              &lt;td&gt;NULL &lt;/td&gt;           &lt;/tr&gt;            &lt;tr style="font-weight: bold"&gt;             &lt;td&gt;AccountNumber &lt;/td&gt;              &lt;td&gt;varchar &lt;/td&gt;              &lt;td&gt;10 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;CustomerType &lt;/td&gt;              &lt;td&gt;nchar &lt;/td&gt;              &lt;td&gt;1 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;rowguid &lt;/td&gt;              &lt;td&gt;uniqueidentifier &lt;/td&gt;              &lt;td&gt;NULL &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;ModifiedDate &lt;/td&gt;              &lt;td&gt;datetime &lt;/td&gt;              &lt;td&gt;NULL &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;     &lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;Looking at the TSQL the application sends to SQL Server, we can see that it is binding @AccountNumber as an NVARCHAR(10): &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;exec&lt;/font&gt; &lt;font color="#800000"&gt;sp_executesql&lt;/font&gt; N&lt;font color="#ff0000"&gt;'          &lt;br /&gt;&amp;#160;&amp;#160; select t.Name           &lt;br /&gt;&amp;#160;&amp;#160; from Sales.Customer c           &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; inner join Sales.SalesTerritory t           &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; on c.TerritoryID = t.TerritoryId           &lt;br /&gt;&amp;#160;&amp;#160; where c.AccountNumber = @AccountNumber'&lt;/font&gt;,         &lt;br /&gt;&amp;#160;&amp;#160; N&lt;font color="#ff0000"&gt;'@AccountNumber &lt;font style="font-weight: bold; text-decoration: underline"&gt;nvarchar(10)&lt;/font&gt;'&lt;/font&gt;,         &lt;br /&gt;&amp;#160;&amp;#160; @AccountNumber=N&lt;font color="#ff0000"&gt;'AW00000010'&lt;/font&gt;         &lt;br /&gt;&lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;So the application is binding an NVARCHAR(10) parameter that is then being compared to a VARCHAR(10) column. SQL Server then converts every row of the VARCHAR(10) column to an NVARCHAR(10) data type. Clearly this is where the performance problem is. The scalability of this query is therefore linearly dependant on the number of rows in the table. The more rows there are, the more values need to be converted. This not a good situation to be in. &lt;/p&gt;    &lt;h2&gt;Data Type Precedence&lt;/h2&gt;    &lt;p&gt;Whenever SQL Server is asked to compare two differing data types, it uses its data type precedence rules to determine which data type needs to be converted to perform the comparison. Assuming an implicit conversion exists, the data type that appears lower in the list is converted to the data type that appears higher in the list. SQL Server's &lt;a href="http://msdn2.microsoft.com/en-us/library/ms190309.aspx"&gt;Data Type Precedence&lt;/a&gt; order has NVARCHAR appearing higher than VARCHAR. Therefore, all VARCHAR data types need to be converted to NVARCHAR in order to evaluate the AccountNumber values.&lt;/p&gt;    &lt;p&gt;Unfortunately, in our case, we have ~20,000 VARCHARs to convert to NVARCHAR and 1 NVARCHAR @AccountNumber! This is the reason for the index scan in the query plan. Note that with SQL Server 7.0, the query optimiser would always convert the literal and never the column. SQL Server 2000/5 strictly adheres to the Data Type Precedence rules, even if it means having to convert 19,185 rows. This change in behaviour is documented in KB &lt;a href="http://support.microsoft.com/?id=271566"&gt;271566&lt;/a&gt;. &lt;/p&gt;    &lt;p&gt;Looking back at the original C# code, we didn't specify a SqlDbType when adding the @AccountNumber parameter. The SqlParameter class will infer the SqlDbType from the object value specified if it is not specified explicitly. A System.String is always interpreted as SqlDbType.NVarChar as all .NET strings are Unicode. &lt;/p&gt;    &lt;p&gt;There are a number of possible solutions to this problem: &lt;/p&gt;    &lt;ul&gt;     &lt;li&gt;Change the C# code to have knowledge of the SqlDbTypes the parameters are binding to &lt;/li&gt;      &lt;li&gt;Standardise by only allowing inferred SqlDbType types to be used in your database &lt;/li&gt;   &lt;/ul&gt;    &lt;p&gt;Personally, I prefer the first option by a country mile. The second option prevents you from designing a data model that fully exploits all of the advantages of the SQL Server domain. For example, disallowing VARCHARs will double the amount of storage required to store your VARCHARs. I prefer to use a data type dictionary. &lt;/p&gt;    &lt;h2&gt;Data Type Dictionary&lt;/h2&gt;    &lt;p&gt;Data dictionaries are normally only used within databases. However, introducing a data type dictionary in the DAL allows you to have fine-grained control over your data types as well as other data type facets such as length and precision. In addition, it provides for better type safety as users of the data dictionary can only pass a .NET string to get an AccountNumber SqlParameter object. Any other .NET would obviously be a compile time error. Lastly, should you ever have to change data types (e.g. accountNumber becomes a System.Int32) it would be easy to make the change to the GetAccountNumber method and get compile time errors where string accountNumbers are still being used. &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;public static class&lt;/font&gt; &lt;font color="#008b8b"&gt;SqlDictionary&lt;/font&gt;         &lt;br /&gt;{         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;public static&lt;/font&gt; &lt;font color="#008b8b"&gt;SqlParameter&lt;/font&gt; GetAccountNumber(&lt;font color="#0000ff"&gt;string&lt;/font&gt; accountNumber)         &lt;br /&gt;&amp;#160;&amp;#160; {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008b8b"&gt;SqlParameter&lt;/font&gt; parameter = &lt;font color="#0000ff"&gt;new&lt;/font&gt; &lt;font color="#008b8b"&gt;SqlParameter&lt;/font&gt;(&lt;font color="#008b8b"&gt;&amp;quot;@AccountNumber&amp;quot;&lt;/font&gt;, &lt;font color="#008b8b"&gt;SqlDbType&lt;/font&gt;.VarChar, 10);         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; parameter.Value = accountNumber;         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;return&lt;/font&gt; parameter;         &lt;br /&gt;&amp;#160;&amp;#160; }         &lt;br /&gt;}         &lt;br /&gt;&lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;With this change, the query plan now looks far more respectable: &lt;img class="pix" alt="Updated Query Plan" src="/fotia/Blog/Images/200703.QueryPlanAfter.png" /&gt; &lt;/p&gt;    &lt;p&gt;Relying on inferred Object / Relational data mappings can lead to unexpected issues. This is a classic example of the ORM-impedance mismatch that so many tools try to abstract away. However, by utilising a simple data dictionary, many of these problems can be easily solved by writing disciplined code. &lt;/p&gt;    &lt;h2&gt;Default Implicit Conversion&lt;/h2&gt;    &lt;p&gt;SQL Server 2005 has mediated the data type precedence trap somewhat. For example, in SQL Server 2000, comparing a BIT column to a literal 1 always resulted in the BIT column being upcast to a TINYINT. Whilst this may sound a bit bizarre, the problem was that the literal number 1 was always implicitly assigned the TINYINT data type and TINYINT appears above BIT in the data type precedence rules. SQL Server 2005 now applies a bit more intelligence. It assigns the literal the same data type as the column so no conversion is required. &lt;/p&gt;    &lt;p&gt;So, if you're still using SQL Server 2000, be sure to specify your BIT literals as 0x1. Whilst this doesn't equate to a BIT data type, its BINARY datatype is lower on the data type precedence rules and is therefore implicitly upcast to a BIT data type. Using this standard you'll never fall into the trap of upcasting all your BITS to TINYINTs. Note that if you run a SQL Server 2005 database in 80 compatability level you'll get the old SQL Server 2000 behaviour. &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#008000"&gt;-- SQL Server 2000 implicit conversion issue.          &lt;br /&gt;-- SQL Server 2005 OK.&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; ProductID         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; Production.Product         &lt;br /&gt;&lt;font color="#0000ff"&gt;where&lt;/font&gt; FinishedGoodsFlag = 1;         &lt;br /&gt;        &lt;br /&gt;&lt;font color="#008000"&gt;-- SQL Server 2000 OK.&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; ProductID         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; Production.Product         &lt;br /&gt;&lt;font color="#0000ff"&gt;where&lt;/font&gt; FinishedGoodsFlag = 0x1;         &lt;br /&gt;&lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;The implicit conversion is quite easy to spot in the query plan. &lt;/p&gt;   &lt;img class="pix" alt="Convert SQL 200 vs 2005" src="/fotia/Blog/Images/200703.Convert2Kvs2K5.png" /&gt; &lt;/font&gt;  </description><link>http://www.fotia.co.uk/fotia/Blog/2007/03/when-string-is-not-string.html</link><author>noreply@blogger.com (Stefan Delmarco)</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3912586870549609682.post-4199268131752539985</guid><pubDate>Thu, 01 Mar 2007 22:35:00 +0000</pubDate><atom:updated>2008-02-04T23:10:51.658Z</atom:updated><category domain='http://www.blogger.com/atom/ns#'>sql</category><title>KB Article Hidden Gems</title><description>&lt;font style="font-size: small; font-family: georgia"&gt;   &lt;p&gt;One habit that has paid off repeatedly has been to loyally scan through PSS's SQL Server 2005 KB Article &lt;a href="http://support.microsoft.com/common/rss.aspx?rssid=2855&amp;amp;ln=en-us"&gt;RSS Feed&lt;/a&gt; on a daily basis. There isn't a huge amount of volume; some days as many as 5 new articles, most days none. It works very well as my SQL Server gossip column. The KB articles tell me what the latest issues are and what information is considered critical enough by PSS to warrant a KB article.&lt;/p&gt;    &lt;p&gt;For example, yesterday the following KB article was published:&lt;/p&gt;    &lt;p&gt;&lt;a href="http://support.microsoft.com/default.aspx?id=929240"&gt;FIX: I/O requests that are generated by the checkpoint process may cause I/O bottlenecks if the I/O subsystem is not fast enough to sustain the IO requests in SQL Server 2005&lt;/a&gt;&lt;/p&gt;    &lt;p&gt;If you take the time to scroll to the very end of the article, you'll find the &lt;b&gt;More Information&lt;/b&gt; section. This is where the really good stuff is kept. This specific hotfix is not what you would expect. The hotfix adds an additional SQL Server command line option to throttle the I/O requests to the disk subsystem. I/O subsystem issues are common when dealing with SQL Server so knowing about this specific issue and the existence of the hotfix will give you some options when investigating I/O subsystem issues.&lt;/p&gt; &lt;/font&gt;  </description><link>http://www.fotia.co.uk/fotia/Blog/2007/03/kb-article-hidden-gems.html</link><author>noreply@blogger.com (Stefan Delmarco)</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3912586870549609682.post-1457585753553891492</guid><pubDate>Thu, 18 Jan 2007 19:27:00 +0000</pubDate><atom:updated>2008-02-04T23:10:01.120Z</atom:updated><category domain='http://www.blogger.com/atom/ns#'>.net</category><title>Covariant Generic List</title><description>&lt;font style="font-size: small; font-family: georgia"&gt;   &lt;p&gt;The introduction of generics in .NET 2.0 has opened up a whole new approach to class design in C#. The first, and most obvious, benefit generics provide is strongly typed containers. We no longer have to keep casting, boxing / unboxing, to / from object for ArrayList items anymore. &lt;/p&gt;    &lt;p&gt;However, with the power of generics come a few limitations. This article explores some subtle restrictions in covariance with generics. &lt;/p&gt;    &lt;h2&gt;What is Covariance?&lt;/h2&gt;    &lt;p&gt;In mathematical terms covariance is the measure of the tendency of two things to move or vary together. In the world of object-oriented programming, covariance describes the situation in which a derived type is used where a base class was expected. For example, in .NET arrays are covariant. That is: &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#008b8b"&gt;DerivedClass&lt;/font&gt;[] derivedClasses = &lt;font color="#0000ff"&gt;new&lt;/font&gt; &lt;font color="#008b8b"&gt;DerivedClass&lt;/font&gt;[3];         &lt;br /&gt;&lt;font color="#008b8b"&gt;BaseClass&lt;/font&gt;[] baseClasses = derivedClasses; &lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;As a worked example, let's say that we have a People class. An instance of the People class is created from a data source that requires read / write access to the class's properties. A good example of such a data source is XML serialisation. The XmlSerializer class requires read / write properties and a default constructor in order to create an object from XML. However, once this object has been created, I want the object to be immutable. I don't want any properties of the object to be changed once it has been created. The natural implementation for this requirement is to create the IPerson interface that only exposes the property getters: &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#008000"&gt;// Immutable Person.&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;public interface&lt;/font&gt; &lt;font color="#ff00ff"&gt;IPerson&lt;/font&gt;         &lt;br /&gt;{         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;string&lt;/font&gt; Name { &lt;font color="#0000ff"&gt;get&lt;/font&gt;; }         &lt;br /&gt;}         &lt;br /&gt;        &lt;br /&gt;&lt;font color="#0000ff"&gt;public class&lt;/font&gt; &lt;font color="#008b8b"&gt;Person&lt;/font&gt; : &lt;font color="#ff00ff"&gt;IPerson&lt;/font&gt;         &lt;br /&gt;{         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;private string&lt;/font&gt; name;         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;public&lt;/font&gt; Person()         &lt;br /&gt;&amp;#160;&amp;#160; {}         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;public string&lt;/font&gt; Name         &lt;br /&gt;&amp;#160;&amp;#160; {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;get&lt;/font&gt; { &lt;font color="#0000ff"&gt;return this&lt;/font&gt;.name; }         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;set&lt;/font&gt; { &lt;font color="#0000ff"&gt;this&lt;/font&gt;.name = &lt;font color="#0000ff"&gt;value&lt;/font&gt;; }         &lt;br /&gt;&amp;#160;&amp;#160; }         &lt;br /&gt;}         &lt;br /&gt;        &lt;br /&gt;&lt;font color="#0000ff"&gt;public static class&lt;/font&gt; &lt;font color="#008b8b"&gt;PersonFactory&lt;/font&gt;         &lt;br /&gt;{         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;// Factory method to create an IPerson instance.&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;public static&lt;/font&gt; &lt;font color="#ff00ff"&gt;IPerson&lt;/font&gt; CreatePerson()         &lt;br /&gt;&amp;#160;&amp;#160; {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;// Simulate a data source.&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008b8b"&gt;Person&lt;/font&gt; person = &lt;font color="#0000ff"&gt;new&lt;/font&gt; &lt;font color="#008b8b"&gt;Person&lt;/font&gt;();         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; person.Name = &lt;font color="#ff8c00"&gt;&amp;quot;Trogdor&amp;quot;&lt;/font&gt;;         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;return&lt;/font&gt; person;         &lt;br /&gt;&amp;#160;&amp;#160; }         &lt;br /&gt;}         &lt;br /&gt;&lt;/div&gt;   &lt;/font&gt;    &lt;h2&gt;Generic Covariance Limitations&lt;/h2&gt;    &lt;p&gt;Now, let's say that we aren't creating single Person objects. Instead the data source returns an IList of Person objects. We would naturally try the following code first: &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;public static class&lt;/font&gt; &lt;font color="#008b8b"&gt;PeopleFactory&lt;/font&gt;         &lt;br /&gt;{         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;public static&lt;/font&gt; &lt;font color="#ff00ff"&gt;IList&lt;/font&gt;&amp;lt;&lt;font color="#008b8b"&gt;IPerson&lt;/font&gt;&amp;gt; CreatePeople()         &lt;br /&gt;&amp;#160;&amp;#160; {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;// Simulate a data source.&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008b8b"&gt;List&lt;/font&gt;&amp;lt;&lt;font color="#008b8b"&gt;Person&lt;/font&gt;&amp;gt; people = &lt;font color="#0000ff"&gt;new&lt;/font&gt; &lt;font color="#008b8b"&gt;List&lt;/font&gt;&amp;lt;&lt;font color="#008b8b"&gt;Person&lt;/font&gt;&amp;gt;();         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;// Person objects added to list.&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;// ...&lt;/font&gt;         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;return&lt;/font&gt; people;         &lt;br /&gt;&amp;#160;&amp;#160; }         &lt;br /&gt;}         &lt;br /&gt;&lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;The code above does not compile. The C# compiler fails with error:      &lt;br /&gt;&lt;/p&gt;    &lt;div class="tsql"&gt;&lt;font color="#ff0000"&gt;CS0030: Cannot convert type 'System.Collections.Generic.List&amp;lt;TestScratch.Person&amp;gt;' to 'System.Collections.Generic.List&amp;lt;TestScratch.IPerson&amp;gt;' &lt;/font&gt;&lt;/div&gt;    &lt;p&gt;The problem we've run into here has to do with the generic type compatibility, specifically, generics are not covariant. &lt;/p&gt;    &lt;p&gt;This is nicely summarised over on MSDN: &lt;a href="http://msdn2.microsoft.com/en-gb/library/aa479859.aspx#fundamentals_topic12"&gt;Are Generics Covariant, Contra-Variant, or Invariant?&lt;/a&gt;. In a nutshell, generics cannot be covariant as it would allow illegal constructs. For example: &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;public class&lt;/font&gt; &lt;font color="#008b8b"&gt;Adult&lt;/font&gt; : &lt;font color="#ff00ff"&gt;IPerson&lt;/font&gt;         &lt;br /&gt;{         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;//&lt;/font&gt;         &lt;br /&gt;}         &lt;br /&gt;        &lt;br /&gt;&lt;font color="#0000ff"&gt;public class&lt;/font&gt; &lt;font color="#008b8b"&gt;Child&lt;/font&gt; : &lt;font color="#ff00ff"&gt;IPerson&lt;/font&gt;         &lt;br /&gt;{         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;//&lt;/font&gt;         &lt;br /&gt;}         &lt;br /&gt;        &lt;br /&gt;&lt;font color="#0000ff"&gt;public void&lt;/font&gt; WhyCoVarianceIsNotAllowed()         &lt;br /&gt;{         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#008b8b"&gt;List&lt;/font&gt;&amp;lt;&lt;font color="#008b8b"&gt;Adult&lt;/font&gt;&amp;gt; adults = &lt;font color="#0000ff"&gt;new&lt;/font&gt; &lt;font color="#008b8b"&gt;List&lt;/font&gt;&amp;lt;&lt;font color="#008b8b"&gt;Adult&lt;/font&gt;&amp;gt;();         &lt;br /&gt;&amp;#160;&amp;#160; adults.Add(&lt;font color="#0000ff"&gt;new&lt;/font&gt; &lt;font color="#008b8b"&gt;Adult&lt;/font&gt;());         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;// If the following line would be allowed...&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#008b8b"&gt;List&lt;/font&gt;&amp;lt;&lt;font color="#ff00ff"&gt;IPerson&lt;/font&gt;&amp;gt; people = adults;         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;// This line would have to be legal, i.e. adding a Child to an Adult List.&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160; people.Add(&lt;font color="#0000ff"&gt;new&lt;/font&gt; &lt;font color="#008b8b"&gt;Child&lt;/font&gt;());         &lt;br /&gt;}         &lt;br /&gt;&lt;/div&gt;   &lt;/font&gt;    &lt;h2&gt;A Generic Covariant List Implementation&lt;/h2&gt;    &lt;p&gt;As dire as this seems, not all is lost. A natural extension to the immutable object requirement would be to constrain the list of immutable objects to be itself immutable. This additional restriction allows us to circumvent the generic covariance issue as the immutable list would not allow the 'illegal' situation to arise. The most natural interface to return for an immutable list is IEnumerable&amp;lt;T&amp;gt;. We therefore need to find a way of converting / casting IList&amp;lt;Person&amp;gt; to IEnumerable&amp;lt;IPerson&amp;gt;. As the C# compiler does not make any generic covariance exceptions, it will still not allow us to directly cast IList&amp;lt;Person&amp;gt; to IEnumerable&amp;lt;IPerson&amp;gt;. Instead we need to find a way to express the readonly nature of IEnumerable in a way that gets the C# compiler to accept that what we're trying to do is legal. That is what the following class accomplishes: &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;public class&lt;/font&gt; &lt;font color="#008b8b"&gt;EnumerableGeneric&lt;/font&gt;&amp;lt;TClass, TInterface&amp;gt; : &lt;font color="#ff00ff"&gt;IEnumerable&lt;/font&gt;&amp;lt;TInterface&amp;gt;         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;where&lt;/font&gt; TClass : TInterface         &lt;br /&gt;{         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;private&lt;/font&gt; &lt;font color="#ff00ff"&gt;IList&lt;/font&gt;&amp;lt;TClass&amp;gt; list;         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;public&lt;/font&gt; EnumerableGeneric(&lt;font color="#ff00ff"&gt;IList&lt;/font&gt;&amp;lt;TClass&amp;gt; list)         &lt;br /&gt;&amp;#160;&amp;#160; {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;this&lt;/font&gt;.list = list;         &lt;br /&gt;&amp;#160;&amp;#160; }         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;public&lt;/font&gt; &lt;font color="#ff00ff"&gt;IEnumerator&lt;/font&gt;&amp;lt;TInterface&amp;gt; GetEnumerator()         &lt;br /&gt;&amp;#160;&amp;#160; {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;foreach&lt;/font&gt;(TClass item &lt;font color="#0000ff"&gt;in&lt;/font&gt; list)         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;yield return&lt;/font&gt; item;         &lt;br /&gt;&amp;#160;&amp;#160; }         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#ff00ff"&gt;IEnumerator&lt;/font&gt; &lt;font color="#ff00ff"&gt;IEnumerable&lt;/font&gt;.GetEnumerator()         &lt;br /&gt;&amp;#160;&amp;#160; {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;return&lt;/font&gt; &lt;font color="#0000ff"&gt;this&lt;/font&gt;.GetEnumerator();         &lt;br /&gt;&amp;#160;&amp;#160; }         &lt;br /&gt;}         &lt;br /&gt;&lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;This class pulls together pretty much all of the .NET 2.0 features. Firstly, it uses generic constraints to ensure that TClass implements TInterface (or TInterface could be a base class of TClass). Secondly, the compiler understands constraint as it allows us to use yield return to implicitly cast TClass to TInterface. This class then becomes a simple generic wrapper that can be used whenever a covariant list is required: &lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;public static class&lt;/font&gt; &lt;font color="#008b8b"&gt;PeopleFactory&lt;/font&gt;         &lt;br /&gt;{         &lt;br /&gt;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;public static&lt;/font&gt; &lt;font color="#ff00ff"&gt;IEnumerable&lt;/font&gt;&amp;lt;&lt;font color="#ff00ff"&gt;IPerson&lt;/font&gt;&amp;gt; CreatePeople()         &lt;br /&gt;&amp;#160;&amp;#160; {         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;// Simulate a data source.&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#ff00ff"&gt;IList&lt;/font&gt;&amp;lt;&lt;font color="#008b8b"&gt;Person&lt;/font&gt;&amp;gt; people = &lt;font color="#0000ff"&gt;new&lt;/font&gt; List&amp;lt;Person&amp;gt;();         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;// Person objects added to list.&lt;/font&gt;         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;// ...&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;return new&lt;/font&gt; &lt;font color="#008b8b"&gt;EnumerableGeneric&lt;/font&gt;&amp;lt;&lt;font color="#008b8b"&gt;Person&lt;/font&gt;, &lt;font color="#ff00ff"&gt;IPerson&lt;/font&gt;&amp;gt;(people);         &lt;br /&gt;&amp;#160;&amp;#160; }         &lt;br /&gt;}         &lt;br /&gt;&lt;/div&gt;   &lt;/font&gt;&lt;/font&gt;  </description><link>http://www.fotia.co.uk/fotia/Blog/2007/01/covariant-generic-list.html</link><author>noreply@blogger.com (Stefan Delmarco)</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3912586870549609682.post-2758620945276793951</guid><pubDate>Mon, 20 Nov 2006 19:41:00 +0000</pubDate><atom:updated>2008-02-04T23:08:50.958Z</atom:updated><category domain='http://www.blogger.com/atom/ns#'>sql</category><category domain='http://www.blogger.com/atom/ns#'>functions</category><title>The TABLESAMPLE Clause</title><description>&lt;font style="font-size: small; font-family: georgia"&gt;   &lt;p&gt;When I first same across TABLESAMPLE I had visions of some advanced index hints that allowed fine grained control over table statistics (distribution, ranges, density, etc.). Unfortunately not. As per &lt;a href="http://en.wikipedia.org/wiki/Occam's_Razor"&gt;Occam's Razor&lt;/a&gt; the simplest explanation is the correct one. It just... returns a &lt;b&gt;sample&lt;/b&gt; from the specified &lt;b&gt;table&lt;/b&gt;...?&lt;/p&gt;    &lt;p&gt;The complete syntax for TABLESAMPLE is as follows (from BOL):&lt;/p&gt;   &lt;img class="pix" alt="TABLESAMPLE" src="/fotia/Blog/Images/200611.TableSample.jpg" /&gt;     &lt;p&gt;Given this syntax, the most obvious query to test drive TABLESAMPLE would be to return, say, 10 rows from the Sales.SalesOrderHeader table. After all, this syntax looks a lot like TOP:&lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; SalesOrderId, OrderDate, TotalDue         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; Sales.SalesOrderHeader &lt;font color="#0000ff"&gt;tablesample&lt;/font&gt; system (10 rows)&lt;font color="#696969"&gt;;          &lt;br /&gt;&lt;/font&gt;        &lt;br /&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;SalesOrderId &lt;/td&gt;              &lt;td&gt;OrderDate &lt;/td&gt;              &lt;td&gt;TotalDue &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;------------ &lt;/td&gt;              &lt;td&gt;----------------------- &lt;/td&gt;              &lt;td&gt;--------------------- &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;        &lt;p&gt;&lt;b&gt;(0 rows(s) affected)&lt;/b&gt;&lt;/p&gt;     &lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;How unexpected is that! Zero rows returned?! Huh? For the sake of adding to the confusion, change the query to return 1,000 rows as follows:&lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; SalesOrderId, OrderDate, TotalDue         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; Sales.SalesOrderHeader &lt;font color="#0000ff"&gt;tablesample&lt;/font&gt; system (1000 rows)&lt;font color="#696969"&gt;;&lt;/font&gt;&lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;Then run the query a few times and observe how many rows get returned:&lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;       &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;SalesOrderId &lt;/td&gt;              &lt;td&gt;OrderDate &lt;/td&gt;              &lt;td&gt;TotalDue &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;------------ &lt;/td&gt;              &lt;td&gt;----------------------- &lt;/td&gt;              &lt;td&gt;--------------------- &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;44137 &lt;/td&gt;              &lt;td&gt;2001-09-01 00:00:00.000 &lt;/td&gt;              &lt;td&gt;3953.9884 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;... &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;        &lt;p&gt;&lt;b&gt;(938 rows(s) affected)&lt;/b&gt;&lt;/p&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;SalesOrderId &lt;/td&gt;              &lt;td&gt;OrderDate &lt;/td&gt;              &lt;td&gt;TotalDue &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;------------ &lt;/td&gt;              &lt;td&gt;----------------------- &lt;/td&gt;              &lt;td&gt;--------------------- &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;44407 &lt;/td&gt;              &lt;td&gt;2001-10-16 00:00:00.000 &lt;/td&gt;              &lt;td&gt;3729.364 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;... &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;        &lt;p&gt;&lt;b&gt;(1263 rows(s) affected)&lt;/b&gt;&lt;/p&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;SalesOrderId &lt;/td&gt;              &lt;td&gt;OrderDate &lt;/td&gt;              &lt;td&gt;TotalDue &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;------------ &lt;/td&gt;              &lt;td&gt;----------------------- &lt;/td&gt;              &lt;td&gt;--------------------- &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;44757 &lt;/td&gt;              &lt;td&gt;2001-12-01 00:00:00.000 &lt;/td&gt;              &lt;td&gt;49953.7086 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;... &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;        &lt;p&gt;&lt;b&gt;(1085 rows(s) affected)&lt;/b&gt;&lt;/p&gt;     &lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;Each execution returns a &lt;b&gt;different&lt;/b&gt; number of rows! Some of them just below 1,000 and others just above. Clearly there is a non-deterministic factor at play here. Without any data changing, re-running the identical query keeps giving different results. Digging into the BOL documentation shows that the TABLESAMPLE algorithm has a very strong statistical slant to it. It works as follows:&lt;/p&gt;    &lt;p&gt;For the table affected by the TABLESAMPLE, SQL Server needs to determine a probability that a &lt;a href="http://msdn2.microsoft.com/en-us/library/ms190969.aspx"&gt;page&lt;/a&gt; of that table's data will be sampled. This probability is equal to the PERCENTAGE specified in the TABLESAMPLE. Alternatively, if a number of ROWS was specified, that number is divided by the total number of rows in the table to calculate the probability. In our case, the Sales.SalesOrderHeader table has 31,465 rows. We asked for 1,000 rows in the second query. Therefore the probability is calculated as 3.2%.&lt;/p&gt;    &lt;p&gt;Once this probability has been calculated, SQL Server will evaluate it for every page of the table. The details of the algorithm aren't documented but it is along the lines of:&lt;/p&gt;    &lt;ul&gt;     &lt;li&gt;For each page, a random number between 0 and 1 is generated (e.g. RAND()). &lt;/li&gt;      &lt;li&gt;If the random number is greater than the probability, SQL Server will pretend that the page for the table does not exist (i.e. it is not sampled / 'out-of-scope'). &lt;/li&gt;      &lt;li&gt;Otherwise, SQL Server considers &lt;b&gt;all&lt;/b&gt; of the rows on the page sampled / 'in scope'. &lt;/li&gt;   &lt;/ul&gt;    &lt;p&gt;This is very revealing as it explains the behaviour we saw with the first query. We asked for a sample of 10 rows. SQL Server therefore calculated the probability of a page being returned as:      &lt;br /&gt;10 / 31,465 = 0.032 %, a very small number. If we execute the query once, there is a chance SQL Server will not generate a random number that falls within this small probability. However, we can increase our chances of getting at least one page returned by executing the select statement a number of times:&lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;declare&lt;/font&gt; @loopCount &lt;font color="#0000ff"&gt;int&lt;/font&gt;&lt;font color="#696969"&gt;;&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;set&lt;/font&gt; @loopCount = 0&lt;font color="#696969"&gt;;&lt;/font&gt;         &lt;br /&gt;        &lt;br /&gt;&lt;font color="#0000ff"&gt;declare&lt;/font&gt; @rowcount &lt;font color="#0000ff"&gt;int&lt;/font&gt;&lt;font color="#696969"&gt;;&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;set&lt;/font&gt; @rowCount = 0&lt;font color="#696969"&gt;;&lt;/font&gt;         &lt;br /&gt;        &lt;br /&gt;&lt;font color="#008000"&gt;-- Keep looping till we manage to sample at least one page.&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;while&lt;/font&gt; @rowCount = 0 &lt;font color="#0000ff"&gt;begin&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;select&lt;/font&gt; SalesOrderId, OrderDate, TotalDue         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;from&lt;/font&gt; Sales.SalesOrderHeader &lt;font color="#0000ff"&gt;tablesample&lt;/font&gt; system (10 rows)&lt;font color="#696969"&gt;;&lt;/font&gt;         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;set&lt;/font&gt; @rowcount = &lt;font color="#ff00ff"&gt;@@rowcount&lt;/font&gt;&lt;font color="#696969"&gt;;&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;set&lt;/font&gt; @loopCount = @loopCount + 1&lt;font color="#696969"&gt;;&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;end&lt;/font&gt;&lt;font color="#696969"&gt;;&lt;/font&gt;         &lt;br /&gt;        &lt;br /&gt;&lt;font color="#0000ff"&gt;raiserror&lt;/font&gt;(&lt;font color="#ff0000"&gt;'Required %d loops to return %d rows'&lt;/font&gt;, 0, 1, @loopCount, @rowCount)&lt;font color="#696969"&gt;;          &lt;br /&gt;&lt;/font&gt;        &lt;br /&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;SalesOrderId &lt;/td&gt;              &lt;td&gt;OrderDate &lt;/td&gt;              &lt;td&gt;TotalDue &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;------------ &lt;/td&gt;              &lt;td&gt;----------------------- &lt;/td&gt;              &lt;td&gt;--------------------- &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;        &lt;p&gt;&lt;b&gt;(0 rows(s) affected)&lt;/b&gt;&lt;/p&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;SalesOrderId &lt;/td&gt;              &lt;td&gt;OrderDate &lt;/td&gt;              &lt;td&gt;TotalDue &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;------------ &lt;/td&gt;              &lt;td&gt;----------------------- &lt;/td&gt;              &lt;td&gt;--------------------- &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;        &lt;p&gt;&lt;b&gt;(0 rows(s) affected)&lt;/b&gt;&lt;/p&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;SalesOrderId &lt;/td&gt;              &lt;td&gt;OrderDate &lt;/td&gt;              &lt;td&gt;TotalDue &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;------------ &lt;/td&gt;              &lt;td&gt;----------------------- &lt;/td&gt;              &lt;td&gt;--------------------- &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;        &lt;p&gt;&lt;b&gt;(0 rows(s) affected)&lt;/b&gt;&lt;/p&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;SalesOrderId &lt;/td&gt;              &lt;td&gt;OrderDate &lt;/td&gt;              &lt;td&gt;TotalDue &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;------------ &lt;/td&gt;              &lt;td&gt;----------------------- &lt;/td&gt;              &lt;td&gt;--------------------- &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;        &lt;p&gt;&lt;b&gt;(0 rows(s) affected)&lt;/b&gt;&lt;/p&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;SalesOrderId &lt;/td&gt;              &lt;td&gt;OrderDate &lt;/td&gt;              &lt;td&gt;TotalDue &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;------------ &lt;/td&gt;              &lt;td&gt;----------------------- &lt;/td&gt;              &lt;td&gt;--------------------- &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;69998 &lt;/td&gt;              &lt;td&gt;2004-05-06 00:00:00.000 &lt;/td&gt;              &lt;td&gt;2649.8453 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;... &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;        &lt;p&gt;&lt;b&gt;(46 rows(s) affected)&lt;/b&gt;&lt;/p&gt;        &lt;p&gt;&lt;b&gt;Required 5 loops to return 46 rows&lt;/b&gt;&lt;/p&gt;     &lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;Interestingly enough, when we get back some rows (i.e. we get a probability hit) we get 91 rows returned instead of 10. This reveals another facet of the TABLESAMPLE algorithm. It only goes as far as the probability calculation of a page for sampling data. Once it has decided that a page of the table is to be included in the sample, it will not limit the number of rows sampled from that page. Therefore you are always liable to get back the number of rows equivalent to the page size divided by the row width (assuming, of course, that the pages are densely packed). Running the above query repeatedly returns 40-45 rows, indicating that a page of the Sales.SalesOrderHeader table typically holds 40-45 rows per page.&lt;/p&gt;    &lt;p&gt;A couple of additional points to note: &lt;/p&gt;    &lt;ul&gt;     &lt;li&gt;The TABLESAMPLE is applied before any WHERE clauses are evaluated. &lt;/li&gt;      &lt;li&gt;This includes any JOIN conditions as well. Some misleading results can be obtained if two tables that are JOIN'd have a TABLESAMPLE applied to them. As the sampling of the tables are independent and uncorrelated, there is no guarantee that related rows from both tables will be included. &lt;/li&gt;      &lt;li&gt;The TOP clause can be combined with TABLESAMPLE to cap the number of rows returned. We saw earlier how the TABLESAMPLE algorithm can return more rows than requested. However, there is still every chance that less rows will be returned than was specified in the TABLESAMPLE. &lt;/li&gt;   &lt;/ul&gt;    &lt;p&gt;&lt;/p&gt;    &lt;p&gt;The REPEATABLE option of the TABLESAMPLE clause basically causes SQL Server to use the same random numbers when evaluating each of the table's pages again (SQL Server is probably using a &lt;a href="http://en.wikipedia.org/wiki/Linear_congruential_generator"&gt;linear congruential generator&lt;/a&gt; to obtain the random numbers with the same seed). The net effect is that the same pages will be sampled assuming no underlying data has been changed.&lt;/p&gt;    &lt;p&gt;In terms of practical applications of the TABLESAMPLE algorithm, I really struggled to think of a situation where it would have been useful. At face value, all that TABLESAMPLE does is return a certain number or percentage of 'random-ish' rows from the specified table. Unless you have some highly-specialised requirement to get a 'flavour' for the data in the table, this new clause isn't that useful. It doesn't augment the TOP / ROWCOUNT clause at all and doesn't expose any customisable or extensible behaviour.&lt;/p&gt;    &lt;p&gt;I strongly suspect that the TABLESAMPLE clause was added specifically for the optimiser's gathering of statistics. Anyone who has seen STATMAN scrolling past in a SQL Profiler trace will know that the update of statistics uses a number of undocumented TSQL operators and index hints. For example:&lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; SalesOrderId, OrderDate, TotalDue         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; (         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;select top&lt;/font&gt; 100 &lt;font color="#0000ff"&gt;percent&lt;/font&gt; SalesOrderId, OrderDate, TotalDue         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;from&lt;/font&gt; Sales.SalesOrderHeader &lt;font color="#0000ff"&gt;with&lt;/font&gt; (readcommitted, sample 10e-2 &lt;font color="#0000ff"&gt;percent&lt;/font&gt;)         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;order by&lt;/font&gt; 1, 2, 3)         &lt;br /&gt;&lt;font color="#0000ff"&gt;option&lt;/font&gt; (bypass optimizer_queue, maxdop 1)&lt;font color="#696969"&gt;;&lt;/font&gt;&lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;I count at least 2 undocumented operators and 1 view ordering workaround being used in this query. TABLESAMPLE is definitely a much cleaner implementation for a query that appears to be performing the same function.&lt;/p&gt;    &lt;p&gt;Who knows, maybe one day your boss will ask your team: &lt;i&gt;&amp;quot;You know what, it would be really cool if we could return random pages from this table, according to a probability we specify, get back different results every time we execute it, and even return vastly different numbers of rows than what we asked for&amp;quot;&lt;/i&gt;. You'll be the first one to raise your hand...&lt;/p&gt; &lt;/font&gt;  </description><link>http://www.fotia.co.uk/fotia/Blog/2006/11/tablesample-clause.html</link><author>noreply@blogger.com (Stefan Delmarco)</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3912586870549609682.post-7459003878340261328</guid><pubDate>Fri, 13 Oct 2006 14:22:00 +0000</pubDate><atom:updated>2008-02-04T23:06:27.244Z</atom:updated><category domain='http://www.blogger.com/atom/ns#'>sql</category><category domain='http://www.blogger.com/atom/ns#'>functions</category><title>The UNPIVOT Operator</title><description>&lt;font style="font-size: small; font-family: georgia"&gt;   &lt;p&gt;Completing the discussion on the &lt;a href="DY.16.ThePivotOperator.aspx"&gt;PIVOT&lt;/a&gt; operator is its antonym, the UNPIVOT. However, I would argue that the UNPIVOT operator has been misnamed by Microsoft as it creates the mistaken impression that PIVOT -&amp;gt; UNPIVOT -&amp;gt; PIVOT gets you back to where you started. From BOL: &lt;/p&gt;    &lt;p&gt;&lt;i&gt;UNPIVOT performs the opposite operation to PIVOT by rotating columns of a table-valued expression into column values.&lt;/i&gt;&lt;/p&gt;    &lt;p&gt;It fails to highlight that fact that PIVOT has aggregated the raw data in order to rotate the underlying rows into columns - you have to read to the very end of the &lt;b&gt;Using PIVOT and UNPIVOT&lt;/b&gt; sample for the fine print in BOL. In contrast, the UNPIVOT operator does not perform any aggregation whatsoever. It just turns columns into rows. For each column in the table being &lt;i&gt;UNPIVOT'd&lt;/i&gt; you get a row in the result set where the row values are ColumnName and ColumnValue. It therefore helps to think of &lt;a href="http://en.wikipedia.org/wiki/Matrix_transpose"&gt;transpose&lt;/a&gt; instead of UNPIVOT as this is more accurate description of the UNPIVOT operator's behaviour.&lt;/p&gt;    &lt;p&gt;Nevertheless, let's have a look at the syntax to try and get a better feel for what UNPIVOT brings to the table. As a worked example, consider the following requirement: Write a query to produce a result set that contains a list of products (Production.Product) and their SellStartDate and SellEndDate. The SellStartDate and SellEndDate need to be listed in a single column with a second column specifying whether it is the SellStartDate or SellEndDate value being specified. Lastly, the result set needs to be sorted by this consolidated date column to produce a linear timeline of 'product events'.&lt;/p&gt;    &lt;p&gt;In SQL Server 2000, you'd probably end up with a query that looks very much like the following:&lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; ProductId, [Name], &lt;font color="#ff0000"&gt;'SellStartDate'&lt;/font&gt; &lt;font color="#0000ff"&gt;as&lt;/font&gt; DateType, SellStartDate &lt;font color="#0000ff"&gt;as&lt;/font&gt; DateValue         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; Production.Product         &lt;br /&gt;&lt;font color="#0000ff"&gt;union&lt;/font&gt; &lt;font color="#696969"&gt;all&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; ProductId, [Name], &lt;font color="#ff0000"&gt;'SellEndDate'&lt;/font&gt; &lt;font color="#0000ff"&gt;as&lt;/font&gt; DateType, SellEndDate &lt;font color="#0000ff"&gt;as&lt;/font&gt; DateValue         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; Production.Product         &lt;br /&gt;&lt;font color="#0000ff"&gt;where&lt;/font&gt; SellEndDate &lt;font color="#696969"&gt;is not null&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;order by&lt;/font&gt; DateValue&lt;font color="#696969"&gt;;         &lt;br /&gt;          &lt;br /&gt;&lt;/font&gt;         &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;ProductId &lt;/td&gt;              &lt;td&gt;Name &lt;/td&gt;              &lt;td&gt;DateType &lt;/td&gt;              &lt;td&gt;DateValue &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;... &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;320 &lt;/td&gt;              &lt;td&gt;Chainring Bolts &lt;/td&gt;              &lt;td&gt;SellStartDate &lt;/td&gt;              &lt;td&gt;1998-06-01 00:00:00.000 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;321 &lt;/td&gt;              &lt;td&gt;Chainring Nut &lt;/td&gt;              &lt;td&gt;SellStartDate &lt;/td&gt;              &lt;td&gt;1998-06-01 00:00:00.000 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;322 &lt;/td&gt;              &lt;td&gt;Chainring &lt;/td&gt;              &lt;td&gt;SellStartDate &lt;/td&gt;              &lt;td&gt;1998-06-01 00:00:00.000 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;323 &lt;/td&gt;              &lt;td&gt;Crown Race &lt;/td&gt;              &lt;td&gt;SellStartDate &lt;/td&gt;              &lt;td&gt;1998-06-01 00:00:00.000 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;324 &lt;/td&gt;              &lt;td&gt;Chain Stays &lt;/td&gt;              &lt;td&gt;SellStartDate &lt;/td&gt;              &lt;td&gt;1998-06-01 00:00:00.000 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;325 &lt;/td&gt;              &lt;td&gt;Decal 1 &lt;/td&gt;              &lt;td&gt;SellStartDate &lt;/td&gt;              &lt;td&gt;1998-06-01 00:00:00.000 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;326 &lt;/td&gt;              &lt;td&gt;Decal 2 &lt;/td&gt;              &lt;td&gt;SellStartDate &lt;/td&gt;              &lt;td&gt;1998-06-01 00:00:00.000 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;327 &lt;/td&gt;              &lt;td&gt;Down Tube &lt;/td&gt;              &lt;td&gt;SellStartDate &lt;/td&gt;              &lt;td&gt;1998-06-01 00:00:00.000 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;328 &lt;/td&gt;              &lt;td&gt;Mountain End Caps &lt;/td&gt;              &lt;td&gt;SellStartDate &lt;/td&gt;              &lt;td&gt;1998-06-01 00:00:00.000 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;329 &lt;/td&gt;              &lt;td&gt;Road End Caps &lt;/td&gt;              &lt;td&gt;SellStartDate &lt;/td&gt;              &lt;td&gt;1998-06-01 00:00:00.000 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;330 &lt;/td&gt;              &lt;td&gt;Touring End Caps &lt;/td&gt;              &lt;td&gt;SellStartDate &lt;/td&gt;              &lt;td&gt;1998-06-01 00:00:00.000 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;... &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;     &lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;Using the UNPIVOT operator, in SQL Server 2005, the equivalent is as follows:&lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; ProductId, [Name], DateType, DateValue         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#696969"&gt;(&lt;/font&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; ProductID, [Name], SellStartDate, SellEndDate         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;from&lt;/font&gt; Production.Product&lt;font color="#696969"&gt;)&lt;/font&gt; &lt;font color="#0000ff"&gt;as&lt;/font&gt; rawData         &lt;br /&gt;&lt;font color="#696969"&gt;unpivot&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#696969"&gt;(&lt;/font&gt;DateValue &lt;font color="#0000ff"&gt;for&lt;/font&gt; DateType &lt;font color="#696969"&gt;in&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#696969"&gt;(&lt;/font&gt;SellStartDate, SellEndDate&lt;font color="#696969"&gt;))&lt;/font&gt; &lt;font color="#0000ff"&gt;as&lt;/font&gt; transposed         &lt;br /&gt;&lt;font color="#0000ff"&gt;order by&lt;/font&gt; DateValue&lt;font color="#696969"&gt;;&lt;/font&gt; &lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;The key to understanding the UNPIVOT syntax is to see how the DateValue and DateType columns are specified. When transposing the columns into rows, they effectively become name / value pairs. The name is the name of the column and value is the value of the column for the relative row. BOL calls it &lt;i&gt;value_column&lt;/i&gt; and &lt;i&gt;pivot_column&lt;/i&gt; but just think of them as name / value pairs.&lt;/p&gt;    &lt;p&gt;The first portion of the query selects the raw data which we will be transposing. The UNPIVOT portion of the query requires the name of the columns to be used for the name and value pair columns (Value FOR Name) followed by the columns that will be transposed. Lastly, the outermost select allows you specify which transposed columns you want to return, so you can choose not to return the Name column if you don't want it.&lt;/p&gt;    &lt;p&gt;As the SELECT...CASE statements were able to duplicate all the functionality of the &lt;a href="DY.16.ThePivotOperator.aspx"&gt;PIVOT&lt;/a&gt; operator, SELECT...UNION ALL can perform all the same tricks as UNPIVOT. However, the advantage again is a more compact and expressive syntax.&lt;/p&gt;    &lt;p&gt;One behaviour of UNPIVOT to watch out for is that it will not transpose column values that are NULL. NULL values will not appear as name / value pairs in the results. In the sample above I've compensated for this behaviour by including &lt;i&gt;where SellEndDate is not null&lt;/i&gt; in the SQL Server 2000 equivalent query. Note that there is way to change this behaviour.&lt;/p&gt;    &lt;p&gt;In terms of performance, the query optimiser highlights a significant difference between the SELECT...UNION ALL and UNPIVOT queries. The former requires a pass over the underlying table for each select (2 passes in our sample) whereas the UNPIVOT requires only a single pass. SQL Server is able to transpose the columns into rows for each column in a single pass over the underlying data. This makes the UNPIVOT query less expensive than the equivalent UNION ALL. The size of this penalty is proportional to the number of rows being processed. This fact along makes UNPIVOT a very useful tool to keep note of in your TSQL arsenal.&lt;/p&gt; &lt;/font&gt;  </description><link>http://www.fotia.co.uk/fotia/Blog/2006/10/unpivot-operator.html</link><author>noreply@blogger.com (Stefan Delmarco)</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3912586870549609682.post-4621243415141018660</guid><pubDate>Sun, 01 Oct 2006 20:54:00 +0000</pubDate><atom:updated>2008-02-04T23:04:47.457Z</atom:updated><category domain='http://www.blogger.com/atom/ns#'>sql</category><title>SPID Blocks Itself...?</title><description>&lt;font style="font-size: small; font-family: georgia"&gt;   &lt;p&gt;Came across this useful KB the other day: &lt;a href="http://support.microsoft.com/?id=906344"&gt;906344&lt;/a&gt;, &lt;b&gt;The blocked column in the sysprocesses table is populated for latch waits after you install SQL Server 2000 SP4&lt;/b&gt;. The technical detail may be a bit confusing. Essentially the SPID is doing a convoluted &amp;quot;WaitForSingleObject&amp;quot;-type block, waiting for a data page to be fetched into memory from disk. If you see a lot of this type of blocking you may want to check your other performance counters as you may have poor performing IO or buffer pool memory pressure.&lt;/p&gt; &lt;/font&gt;  </description><link>http://www.fotia.co.uk/fotia/Blog/2006/10/spid-blocks-itself.html</link><author>noreply@blogger.com (Stefan Delmarco)</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3912586870549609682.post-4263470973485378611</guid><pubDate>Sat, 23 Sep 2006 00:24:00 +0000</pubDate><atom:updated>2008-02-04T23:03:02.711Z</atom:updated><category domain='http://www.blogger.com/atom/ns#'>sql</category><title>Microsoft SQL Server Development Team Blogs</title><description>&lt;font style="font-size: small; font-family: georgia"&gt;   &lt;p&gt;The best source for SQL Server internals and advanced TSQL are the Microsot SQL Server development teams' blogs. Some are more active than others, but they're all worth subscribing to for that gem you don't want to miss. For example, the SCHEMABINDING &lt;a href="http://blogs.msdn.com/sqlprogrammability/archive/2006/05/12/596424.aspx"&gt;tip&lt;/a&gt; is a real nugget.&lt;/p&gt;    &lt;ul&gt;     &lt;li&gt;&lt;a href="http://blogs.msdn.com/sqlprogrammability/"&gt;SQL Programmability &amp;amp; API Development Team Blog&lt;/a&gt; &lt;/li&gt;      &lt;li&gt;&lt;a href="http://blogs.msdn.com/sqltips/"&gt;SQL Server Engine Tips&lt;/a&gt; &lt;/li&gt;      &lt;li&gt;&lt;a href="http://blogs.msdn.com/sqlserverstorageengine//"&gt;SQL Server Storage Engine&lt;/a&gt; &lt;/li&gt;      &lt;li&gt;&lt;a href="http://blogs.msdn.com/sqlqueryprocessing/"&gt;Tips, Tricks, and Advice from the SQL Server Query Processing Team&lt;/a&gt; &lt;/li&gt;      &lt;li&gt;&lt;a href="http://blogs.msdn.com/sqlclr/"&gt;SQL Server 2005: CLR Integration&lt;/a&gt;&lt;/li&gt;   &lt;/ul&gt; &lt;/font&gt;  </description><link>http://www.fotia.co.uk/fotia/Blog/2006/09/microsoft-sql-server-development-team.html</link><author>noreply@blogger.com (Stefan Delmarco)</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3912586870549609682.post-7311548577161661814</guid><pubDate>Fri, 22 Sep 2006 20:49:00 +0000</pubDate><atom:updated>2008-02-04T23:01:14.253Z</atom:updated><category domain='http://www.blogger.com/atom/ns#'>visual studio</category><category domain='http://www.blogger.com/atom/ns#'>sql</category><title>For Database Professionals</title><description>&lt;font style="font-family: Georgia; font-size: small"&gt; &lt;p&gt;Visual Studio 2005 Team Edition for Database Professionals CTP 5 - Beta, download &lt;a href="http://www.microsoft.com/downloads/details.aspx?FamilyID=4014554e-903a-4a62-b429-2b027321c32d&amp;amp;DisplayLang=en"&gt;here&lt;/a&gt;. An update patch for CTP 5 already &lt;a href="http://www.microsoft.com/downloads/details.aspx?FamilyID=9f7d1be0-79c0-4d26-81e2-4ccd07d2352f&amp;amp;DisplayLang=en"&gt;available&lt;/a&gt;&lt;/p&gt;&lt;/font&gt;  </description><link>http://www.fotia.co.uk/fotia/Blog/2006/09/for-database-professionals.html</link><author>noreply@blogger.com (Stefan Delmarco)</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3912586870549609682.post-8296940258164873951</guid><pubDate>Thu, 21 Sep 2006 23:12:00 +0000</pubDate><atom:updated>2008-02-04T22:59:51.207Z</atom:updated><category domain='http://www.blogger.com/atom/ns#'>sql</category><category domain='http://www.blogger.com/atom/ns#'>functions</category><title>The PIVOT Operator</title><description>&lt;font style="font-size: small; font-family: georgia"&gt;   &lt;p&gt;SQL Server 2005 introduces the PIVOT operator to the TSQL stable. The intention of this operator is to make it easier to transform unique values in rows into columns and aggregate the resulting data. Anyone familiar with Excel pivot tables or Access cross-tab queries will immediately be able to visualise what pivot tables look like. Prior to SQL Server 2005, the only way to transform data, in the manner provided by PIVOT, was to use a number of SELECT...CASE statements. As we'll see, the PIVOT operator has not given us functionality that was not possible before. However, it has made pivoting data much easier to perform and far more readable.&lt;/p&gt;    &lt;p&gt;In order to understand how to form a PIVOT query, it is important to be able to describe, in plain English, what the query is trying to accomplish. Personally I have found that thinking of the PIVOT result set in terms of a graph (i.e. a X- and Y- axis) makes it easier to formulate into TSQL.&lt;/p&gt;    &lt;p&gt;As a worked example, consider the following requirement. You need to generate a report that will correlate the number of addresses that were changed per city for each year. In other words, formulate a result set that will have each city listed as a column (X-axis). For each city, count the number of times an address in that city was modified in a given year (Y-axis). In SQL Server 2000, you'd probably create some TSQL similar to the following:&lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; &lt;font color="#ff00ff"&gt;datepart&lt;/font&gt;(yyyy, modifiedDate) &lt;font color="#0000ff"&gt;as&lt;/font&gt; [WhenUpdates],         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#ff00ff"&gt;sum&lt;/font&gt;(&lt;font color="#0000ff"&gt;case&lt;/font&gt; city &lt;font color="#0000ff"&gt;when&lt;/font&gt; &lt;font color="#ff0000"&gt;'Monroe'&lt;/font&gt; &lt;font color="#0000ff"&gt;then&lt;/font&gt; 1 &lt;font color="#0000ff"&gt;else&lt;/font&gt; 0 &lt;font color="#0000ff"&gt;end&lt;/font&gt;) &lt;font color="#0000ff"&gt;as&lt;/font&gt; [Monroe],         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#ff00ff"&gt;sum&lt;/font&gt;(&lt;font color="#0000ff"&gt;case&lt;/font&gt; city &lt;font color="#0000ff"&gt;when&lt;/font&gt; &lt;font color="#ff0000"&gt;'Redmond'&lt;/font&gt; &lt;font color="#0000ff"&gt;then&lt;/font&gt; 1 &lt;font color="#0000ff"&gt;else&lt;/font&gt; 0 &lt;font color="#0000ff"&gt;end&lt;/font&gt;) &lt;font color="#0000ff"&gt;as&lt;/font&gt; [Redmond],         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#ff00ff"&gt;sum&lt;/font&gt;(&lt;font color="#0000ff"&gt;case&lt;/font&gt; city &lt;font color="#0000ff"&gt;when&lt;/font&gt; &lt;font color="#ff0000"&gt;'Snohomish'&lt;/font&gt; &lt;font color="#0000ff"&gt;then&lt;/font&gt; 1 &lt;font color="#0000ff"&gt;else&lt;/font&gt; 0&lt;font color="#0000ff"&gt; end&lt;/font&gt;) &lt;font color="#0000ff"&gt;as&lt;/font&gt; [Snohomish],         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#ff00ff"&gt;sum&lt;/font&gt;(&lt;font color="#0000ff"&gt;case&lt;/font&gt; city &lt;font color="#0000ff"&gt;when&lt;/font&gt; &lt;font color="#ff0000"&gt;'Issaquah'&lt;/font&gt; &lt;font color="#0000ff"&gt;then&lt;/font&gt; 1 &lt;font color="#0000ff"&gt;else&lt;/font&gt; 0 &lt;font color="#0000ff"&gt;end&lt;/font&gt;) &lt;font color="#0000ff"&gt;as&lt;/font&gt; [Issaquah]         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; Person.Address         &lt;br /&gt;&lt;font color="#0000ff"&gt;group by&lt;/font&gt; &lt;font color="#ff00ff"&gt;datepart&lt;/font&gt;(yyyy, modifiedDate)         &lt;br /&gt;&lt;font color="#0000ff"&gt;order by&lt;/font&gt; 1&lt;font color="#696969"&gt;;         &lt;br /&gt;&lt;/font&gt;         &lt;br /&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;WhenUpdates &lt;/td&gt;              &lt;td&gt;Monroe &lt;/td&gt;              &lt;td&gt;Redmond &lt;/td&gt;              &lt;td&gt;Snohomish &lt;/td&gt;              &lt;td&gt;Issaquah &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;1996 &lt;/td&gt;              &lt;td&gt;1 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;1997 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;1 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;1998 &lt;/td&gt;              &lt;td&gt;2 &lt;/td&gt;              &lt;td&gt;2 &lt;/td&gt;              &lt;td&gt;1 &lt;/td&gt;              &lt;td&gt;1 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;1999 &lt;/td&gt;              &lt;td&gt;11 &lt;/td&gt;              &lt;td&gt;12 &lt;/td&gt;              &lt;td&gt;7 &lt;/td&gt;              &lt;td&gt;13 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2000 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;3 &lt;/td&gt;              &lt;td&gt;2 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2001 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;5 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;6 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2002 &lt;/td&gt;              &lt;td&gt;1 &lt;/td&gt;              &lt;td&gt;9 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;13 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2003 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;45 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;42 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2004 &lt;/td&gt;              &lt;td&gt;1 &lt;/td&gt;              &lt;td&gt;44 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;45 &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;     &lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;A couple of things to note: &lt;/p&gt;    &lt;ul&gt;     &lt;li&gt;Even though there are many more cities in the Person.Address table than the ones specified, we must 'hardcode' the city names into the query. There is no way to dynamically create additional columns if a new city is encountered. That fact alone makes this query difficult to maintain. If another city is added to Person.Address, this query would have to be modified. &lt;/li&gt;      &lt;li&gt;The Y-axis (WhenUpdated) column is dynamic, we do not require prior knowledge of the modifiedDate's content when formulating the query. When an address is modified in a new year (e.g. 2005), another row will automatically be added to this result set without any changes. &lt;/li&gt;   &lt;/ul&gt;    &lt;p&gt;Now, for the PIVOT operator. There is good news and bad news. The good news is that the syntax is far more compact and readable. The bad news is that it overcomes none of the limitations mentioned above! However, as we'll see, the good news is still worthwhile...&lt;/p&gt;    &lt;p&gt;A PIVOT query is split into a number of sections. First, you need to write the SELECT statement that provides you with the raw, non-pivoted data. In our example, the raw data would be the following query:&lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; &lt;font color="#ff00ff"&gt;datepart&lt;/font&gt;(yyyy, modifiedDate) &lt;font color="#0000ff"&gt;as&lt;/font&gt; [WhenUpdated], AddressID, City         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; Person.Address&lt;font color="#696969"&gt;;         &lt;br /&gt;&lt;/font&gt;         &lt;br /&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;WhenUpdated &lt;/td&gt;              &lt;td&gt;AddressID &lt;/td&gt;              &lt;td&gt;City &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;------------------------------ &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;... &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2004 &lt;/td&gt;              &lt;td&gt;19 &lt;/td&gt;              &lt;td&gt;Bothell &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;1999 &lt;/td&gt;              &lt;td&gt;20 &lt;/td&gt;              &lt;td&gt;Bothell &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;1999 &lt;/td&gt;              &lt;td&gt;21 &lt;/td&gt;              &lt;td&gt;Bothell &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2001 &lt;/td&gt;              &lt;td&gt;22 &lt;/td&gt;              &lt;td&gt;Portland &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;1999 &lt;/td&gt;              &lt;td&gt;23 &lt;/td&gt;              &lt;td&gt;Seattle &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2001 &lt;/td&gt;              &lt;td&gt;24 &lt;/td&gt;              &lt;td&gt;Duluth &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2002 &lt;/td&gt;              &lt;td&gt;25 &lt;/td&gt;              &lt;td&gt;Dallas &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2001 &lt;/td&gt;              &lt;td&gt;26 &lt;/td&gt;              &lt;td&gt;San Francisco &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;... &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;     &lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;WhenUpdated is the Y-axis, AddressID is a data point - we're going to count AddressIDs for a given (X,Y) / (City,WhenUpdated) combination - and City is the X-axis. Once we have the raw data, it is simple case of articulating which column is used for what function in the pivot:&lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; Pvt.*         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; (         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;-- The raw data as a derived table...&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;select&lt;/font&gt; datepart(yyyy, modifiedDate) &lt;font color="#0000ff"&gt;as&lt;/font&gt; [WhenUpdated], AddressID, City         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;from&lt;/font&gt; Person.Address) &lt;font color="#0000ff"&gt;as&lt;/font&gt; RawData         &lt;br /&gt;&lt;font color="#696969"&gt;pivot&lt;/font&gt; (         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#ff00ff"&gt;count&lt;/font&gt;(AddressID)&amp;#160; &lt;font color="#008000"&gt;-- How each X,Y point is calculated...&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;for&lt;/font&gt; City&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;-- The X-axis, stated explicitly...&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#696969"&gt;in&lt;/font&gt; ([Monroe], [Redmond], [Snohomish], [Issaquah])) &lt;font color="#0000ff"&gt;as&lt;/font&gt; Pvt         &lt;br /&gt;&lt;font color="#0000ff"&gt;order by&lt;/font&gt; 1&lt;font color="#696969"&gt;;         &lt;br /&gt;&lt;/font&gt;         &lt;br /&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;WhenUpdated &lt;/td&gt;              &lt;td&gt;Monroe &lt;/td&gt;              &lt;td&gt;Redmond &lt;/td&gt;              &lt;td&gt;Snohomish &lt;/td&gt;              &lt;td&gt;Issaquah &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;1996 &lt;/td&gt;              &lt;td&gt;1 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;1997 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;1 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;1998 &lt;/td&gt;              &lt;td&gt;2 &lt;/td&gt;              &lt;td&gt;2 &lt;/td&gt;              &lt;td&gt;1 &lt;/td&gt;              &lt;td&gt;1 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;1999 &lt;/td&gt;              &lt;td&gt;11 &lt;/td&gt;              &lt;td&gt;12 &lt;/td&gt;              &lt;td&gt;7 &lt;/td&gt;              &lt;td&gt;13 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2000 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;3 &lt;/td&gt;              &lt;td&gt;2 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2001 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;5 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;6 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2002 &lt;/td&gt;              &lt;td&gt;1 &lt;/td&gt;              &lt;td&gt;9 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;13 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2003 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;45 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;42 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2004 &lt;/td&gt;              &lt;td&gt;1 &lt;/td&gt;              &lt;td&gt;44 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;45 &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;     &lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;At first glance the PIVOT section seems a bit unintuitive. However, if you work through the syntax it will start to make sense. The PIVOT operator needs to know how to calculate each (X,Y) data point's value, specified as an aggregate function. The FOR column contains the X-axis values and the IN contains the hard-coded list of X-axis columns. As you can see, there is no getting around the SELECT...CASE limitation of having to know the transformed columns before-hand. In case you are wondering: no, IN cannot take a wildcard, e.g. IN(*). I believe that the limitation exists due to the optimizer's requirement to have fixed metadata when the execution plan is generated. This would not be possible if the column metadata is dynamically generated from the data itself.&lt;/p&gt;    &lt;p&gt;The Y-axis is inferred as the distinct set of all columns that have not been used by the aggregate function or by the X-Axis. In our case, that would be the WhenUpdated computed column.&lt;/p&gt;    &lt;p&gt;In terms of execution plans, the SELECT...CASE and PIVOT produce almost identical plans. However, as we're throwing away more of the rows in the Person.Address table (only 267 of the 19,614 rows are for the cities in our X-axis), both queries would be far more efficient if they included the set of X-axis column values (i.e. the names of the cities) in their WHERE clauses.&lt;/p&gt;    &lt;p&gt;Instead of trying to dream up dynamic-SQL ways of working around this problem (non-dynamic columns and rows falling outside the predefined X-axis values) I would argue that you're better off aggregating the data in TSQL (using 'traditional' SELECT...GROUP BY statements) and transforming it in your middle tier / client application where there are better data structures that can be used to represent extremely wide or sparse result sets.&lt;/p&gt;    &lt;p&gt;A better pattern to follow is to realise that the X-axis can be used to specify fixed ranges instead of raw values. Using this technique you can select ranges that will guarantee that you have covered off all possible X-axis values. For example, instead of allowing the raw unique values of a column to destabilise the PIVOT, the following query maps the values in the TotalDue column in a number of well-known buckets and then uses those values as the X-axis. Using this technique there is no possibility of having rows fall outside the X-axis range:&lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;select&lt;/font&gt; Pvt.*         &lt;br /&gt;&lt;font color="#0000ff"&gt;from&lt;/font&gt; (         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;select&lt;/font&gt; &lt;font color="#ff00ff"&gt;datepart&lt;/font&gt;(yyyy, OrderDate) &lt;font color="#0000ff"&gt;as&lt;/font&gt; [OrderDate],         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;case&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;when&lt;/font&gt; TotalDue &amp;lt; 100 &lt;font color="#0000ff"&gt;then&lt;/font&gt; &lt;font color="#ff0000"&gt;'NoMargin'&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;when&lt;/font&gt; TotalDue &amp;gt;= 100 and TotalDue &amp;gt; 1000 &lt;font color="#0000ff"&gt;then&lt;/font&gt; &lt;font color="#ff0000"&gt;'OnTarget'&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;else&lt;/font&gt; &lt;font color="#ff0000"&gt;'ExtraBonus'&lt;/font&gt; &lt;font color="#0000ff"&gt;end as&lt;/font&gt; [Profit],         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; PurchaseOrderId         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;from&lt;/font&gt; Purchasing.PurchaseOrderHeader) &lt;font color="#0000ff"&gt;as&lt;/font&gt; RawData         &lt;br /&gt;&lt;font color="#696969"&gt;pivot&lt;/font&gt; (         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#ff00ff"&gt;count&lt;/font&gt;(PurchaseOrderId)         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;for&lt;/font&gt; Profit         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#696969"&gt;in&lt;/font&gt;([NoMargin], [OnTarget], [ExtraBonus])) &lt;font color="#0000ff"&gt;as&lt;/font&gt; Pvt         &lt;br /&gt;&lt;font color="#0000ff"&gt;order by&lt;/font&gt; 1&lt;font color="#696969"&gt;;         &lt;br /&gt;&lt;/font&gt;         &lt;br /&gt;        &lt;table class="grid"&gt;&lt;tbody&gt;           &lt;tr&gt;             &lt;td&gt;OrderDate &lt;/td&gt;              &lt;td&gt;NoMargin &lt;/td&gt;              &lt;td&gt;OnTarget &lt;/td&gt;              &lt;td&gt;ExtraBonus &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;              &lt;td&gt;----------- &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2001 &lt;/td&gt;              &lt;td&gt;0 &lt;/td&gt;              &lt;td&gt;4 &lt;/td&gt;              &lt;td&gt;4 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2002 &lt;/td&gt;              &lt;td&gt;7 &lt;/td&gt;              &lt;td&gt;105 &lt;/td&gt;              &lt;td&gt;160 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2003 &lt;/td&gt;              &lt;td&gt;26 &lt;/td&gt;              &lt;td&gt;399 &lt;/td&gt;              &lt;td&gt;607 &lt;/td&gt;           &lt;/tr&gt;            &lt;tr&gt;             &lt;td&gt;2004 &lt;/td&gt;              &lt;td&gt;68 &lt;/td&gt;              &lt;td&gt;1070 &lt;/td&gt;              &lt;td&gt;1550 &lt;/td&gt;           &lt;/tr&gt;         &lt;/tbody&gt;&lt;/table&gt;     &lt;/div&gt;   &lt;/font&gt;    &lt;p&gt;As we've seen, the PIVOT operator does not provide any transformational capabilities that weren't possible in SQL Server 2000 (with liberal use of SELECT...CASE). However, as any experienced developer will know, the easier it is to express a reasonably complex SET-based problem, the better chance you have of getting it right, first time. Even more importantly, the poor developer required to maintain it (after you've moved on to greater conquests) will have a better chance of getting it right for the second time!&lt;/p&gt; &lt;/font&gt;  </description><link>http://www.fotia.co.uk/fotia/Blog/2006/09/pivot-operator.html</link><author>noreply@blogger.com (Stefan Delmarco)</author></item><item><guid isPermaLink='false'>tag:blogger.com,1999:blog-3912586870549609682.post-3366047632390012872</guid><pubDate>Sat, 26 Aug 2006 15:47:00 +0000</pubDate><atom:updated>2008-02-04T22:57:08.204Z</atom:updated><category domain='http://www.blogger.com/atom/ns#'>sql</category><category domain='http://www.blogger.com/atom/ns#'>functions</category><title>The APPLY Operator</title><description>&lt;font style="font-size: small; font-family: georgia"&gt;   &lt;p&gt;We've come across the APPLY operator before when we were exploring &lt;a href="DY.11.DynamicManagementViewsPart2.aspx"&gt;DMVs&lt;/a&gt;. The APPLY operator has been added to the TSQL repertoire in SQL Server 2005. It fills an important gap with User-Defined Functions (UDFs).&lt;/p&gt;    &lt;p&gt;UDFs are less well-known than stored procedures (SPs) and are often overlooked by inexperienced SQL developers. &lt;a href="http://msdn2.microsoft.com/en-us/library/ms130214(SQL.90).aspx"&gt;BOL&lt;/a&gt; has a good &lt;a href="http://msdn2.microsoft.com/en-us/library/ms187650.aspx"&gt;write-up&lt;/a&gt; on how to choose between SPs and UDFs. For the context of this article, a brief overview of UDFs:&lt;/p&gt;    &lt;p&gt;UDFs come in two flavours: those that returns scalars and those that return tables. Scalar-valued UDFs are very straightforward. They take in a bunch of parameters and return a single scalar value. These are functions in the traditional VB / C# / C++ sense. Table-valued functions are a little different. They also can take in a bunch of parameters. However, they are able to return a table instead of a scalar value. This returned table is referenced in TSQL in the same manner as a table or a view. This makes them look like a cross between functions and views. In fact, there are two types of table-valued UDFs. Those that contain a single SELECT statement (they look very much like a derived table) called 'Inline Table-Valued Functions' and those that build-up the return table using procedural logic (they look like stored procedures) called 'Multistatement Table-Values Functions'. It is the latter-type that we are interested in.&lt;/p&gt;    &lt;p&gt;Multistatement table-valued functions are very powerful as they allow us to encapsulate complex procedural logic in a function when that logic cannot be implemented in a database view (i.e. you cannot express that desired output using a single set-based TSQL expression). You are free to construct the output rows in what ever manner you wish! The only restriction is that you need to declare the schema of the table that the UDF returns upfront as part of the UDF's declaration.&lt;/p&gt;    &lt;p&gt;As an example, consider the following table-valued UDF:&lt;/p&gt;   &lt;font style="font-size: small; font-family: &amp;#39;Courier New&amp;#39;, courier, monospace"&gt;     &lt;div class="tsql"&gt;&lt;font color="#0000ff"&gt;create function&lt;/font&gt; dbo.udf_DecomposeDns (         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; @webServiceUrl &lt;font color="#0000ff"&gt;nvarchar&lt;/font&gt;(1024))         &lt;br /&gt;&lt;font color="#0000ff"&gt;returns&lt;/font&gt; @dnsDomains &lt;font color="#0000ff"&gt;table&lt;/font&gt; (         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; domain &lt;font color="#0000ff"&gt;nvarchar&lt;/font&gt;(1024))         &lt;br /&gt;&lt;font color="#0000ff"&gt;as begin&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;-- Return no rows if NULL...&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;if&lt;/font&gt; @webServiceUrl &lt;font color="#696969"&gt;is NULL&lt;/font&gt;         &lt;br /&gt;&lt;font color="#0000ff"&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; return&lt;/font&gt;&lt;font color="#696969"&gt;;&lt;/font&gt;         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;declare&lt;/font&gt; @dns &lt;font color="#0000ff"&gt;nvarchar&lt;/font&gt;(1024)&lt;font color="#696969"&gt;;&lt;/font&gt;         &lt;br /&gt;        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#008000"&gt;-- Remove the contents after the DNS...&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;declare&lt;/font&gt; @slashIndex &lt;font color="#0000ff"&gt;int&lt;/font&gt;&lt;font color="#696969"&gt;;&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font color="#0000ff"&gt;set&lt;/font&gt; @slashIndex = &lt;font color="#ff00ff"&gt;charindex&lt;/font&gt;(N&lt;font color="#ff0000"&gt;'/'&lt;/font&gt;, @webServiceUrl)&lt;font color="#696969"&gt;;&lt;/font&gt;         &lt;br /&gt;&amp;#160;&amp;#16