Friday, August 24, 2018

QUOTED_IDENTIFIER And PERSISTED Computed Columns

Many of us are familiar with this error message:

INSERT failed because the following SET options have incorrect settings: 'QUOTED_IDENTIFIER'. Verify that SET options are correct for use with indexed views and/or indexes on computed columns and/or filtered indexes and/or query notifications and/or XML data type methods and/or spatial index operations.

What is missing from the list, however, is PERSISTED computed columns. Below are two examples; the only difference between them is whether the computed column is persisted.

Example 1 - Creates a table with a non-persisted computed column. QUOTED_IDENTIFIER is ON for table create.

But we are able to INSERT (and SELECT) data with QUOTED_IDENTIFIER OFF.


DROP TABLE IF EXISTS [dbo].[Numbers];
GO

SET QUOTED_IDENTIFIER ON;
GO

CREATE TABLE [dbo].[Numbers](
       [ID] [int] NOT NULL,
       [Number1] [int] NOT NULL,
       [Number2] [int] NOT NULL,
       [SumOfNumbers]  AS ([Number1]+[Number2]),
 CONSTRAINT [PK_Numbers] PRIMARY KEY CLUSTERED
(
       [ID] ASC
));
GO

SET QUOTED_IDENTIFIER OFF;
GO

INSERT
INTO dbo.Numbers (ID,Number1,Number2)
VALUES (1,5,6);

SELECT * FROM dbo.Numbers;


Results... the INSERT succeeds and the SELECT returns the inserted row.

Example 2 - Creates a table with a persisted computed column. QUOTED_IDENTIFIER is ON for table create.

But we are not able to INSERT data with QUOTED_IDENTIFIER OFF.


DROP TABLE IF EXISTS [dbo].[Numbers];
GO

SET QUOTED_IDENTIFIER ON;
GO

CREATE TABLE [dbo].[Numbers](
       [ID] [int] NOT NULL,
       [Number1] [int] NOT NULL,
       [Number2] [int] NOT NULL,
       [SumOfNumbers]  AS ([Number1]+[Number2]) PERSISTED,
 CONSTRAINT [PK_Numbers] PRIMARY KEY CLUSTERED
(
       [ID] ASC
));
GO

SET QUOTED_IDENTIFIER OFF;
GO

INSERT
INTO dbo.Numbers (ID,Number1,Number2)
VALUES (1,5,6);

SELECT * FROM dbo.Numbers;


Results...

INSERT failed because the following SET options have incorrect settings: 'QUOTED_IDENTIFIER'. Verify that SET options are correct for use with indexed views and/or indexes on computed columns and/or filtered indexes and/or query notifications and/or XML data type methods and/or spatial index operations.

...with no mention of persisted computed columns.

If you change QUOTED_IDENTIFIER to ON, the INSERT works even when the computed column is persisted.
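Continuing Example 2, switching the setting back resolves the error:

```sql
-- With the persisted computed column still in place, the INSERT
-- succeeds once QUOTED_IDENTIFIER is back ON.
SET QUOTED_IDENTIFIER ON;
GO

INSERT INTO dbo.Numbers (ID, Number1, Number2)
VALUES (2, 7, 8);

SELECT * FROM dbo.Numbers;
```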

 I found some old documentation here, but nothing current. 

Anyway, it took me some time to figure it out and so I decided to post.

Later...

Monday, March 5, 2018

OPEN SYMMETRIC KEY & the Plan Cache


SQL Server natively supports column-level encryption. You can visit here to learn how to do it. But the basics are:

OPEN SYMMETRIC KEY
Do encryption and/or decryption stuff
CLOSE SYMMETRIC KEY

In order to use OPEN SYMMETRIC KEY, you have to specify either a password or a certificate that protects the symmetric key. If you use a password, it sits in clear text in the sproc code, and so SQL Server does not put the sproc in the plan cache. That means no execution plan reuse and no usage & performance data in the sys.dm_exec_procedure_stats DMV.
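As a sketch, the two forms look like this (the key, certificate, table, and column names here are hypothetical):

```sql
-- Form 1: key protected by a password - the password sits in
-- clear text in the sproc code.
OPEN SYMMETRIC KEY MySymKey
    DECRYPTION BY PASSWORD = 'StrongP@ssw0rd!';

-- ...encryption/decryption work here...

CLOSE SYMMETRIC KEY MySymKey;
GO

-- Form 2: key protected by a certificate - no secret in the code,
-- but the sproc still will not be cached.
OPEN SYMMETRIC KEY MySymKey
    DECRYPTION BY CERTIFICATE MyCert;

SELECT CONVERT(VARCHAR(100), DECRYPTBYKEY(SSN_Encrypted)) AS SSN
FROM dbo.Customers;

CLOSE SYMMETRIC KEY MySymKey;
```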

Thing is… even if, instead of a password, you use a certificate, which presents no security exposure, SQL Server still does not put the sproc in the plan cache. If it sees OPEN SYMMETRIC KEY, no plan cache.

Guilt by association.  

SQL Server 2008 introduced another option for decryption only – DECRYPTBYKEYAUTOCERT. This function accepts the certificate (by ID) as a parameter and works on its own, without OPEN SYMMETRIC KEY, so the sproc can go in the plan cache. Currently, there is no corresponding ENCRYPTBYKEYAUTOCERT.
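A sketch of a DECRYPTBYKEYAUTOCERT call (certificate, table, and column names are hypothetical):

```sql
-- The certificate protecting the symmetric key is resolved via
-- CERT_ID; no OPEN/CLOSE SYMMETRIC KEY is needed, so the sproc
-- containing this statement can be cached.
SELECT CONVERT(VARCHAR(100),
       DECRYPTBYKEYAUTOCERT(CERT_ID('MyCert'), NULL, SSN_Encrypted)) AS SSN
FROM dbo.Customers;
```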

Here is the big picture:



So why not a corresponding ENCRYPTBYKEYAUTOCERT?

I asked Microsoft at SQL PASS 2017. They took the question back to the mother ship and here is the email reply I received.


Hi Regan,

So the reason we support decryptByAutoCert but not the encryption part is twofold:

1.      Performance: We had received a steady flow of complaints that select operations with decryption occur frequently and should be made as performant as possible; hence we invested in multi-threading of decryption.
2.      Unlike decryption, which is deterministic and can pretty much execute in parallel without much code change, encryption is much harder to parallelize and even harder to test and ensure it works correctly.

So unfortunately, we have not funded parallelizing the encryption operation yet.
Thanks for bringing this to the forefront again. I will put this on our list for planning items.

There is a UserVoice (formerly Connect) item here. Please vote!

In the meantime, there are two workarounds.

(1)   To at least get the sproc in the DMVs so you can monitor usage & performance, move the actual T-SQL code to a “private” sproc and have the application-called sproc just be a wrapper to call the “private” version. This does not get the real code into the plan cache; it still has to be compiled on every execution. But at least you can get all the metrics from the sys.dm_exec_procedure_stats DMV for the wrapper sproc.
(2)   This 2010 MSDN blog post suggests using EXECUTE to fire dynamic SQL for the OPEN SYMMETRIC KEY command (whether using password or certificate). This enables the sproc to go in the plan cache. You still get a SP:CacheMiss for the OPEN SYMMETRIC KEY - but at least now it is only for that one command. 
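A sketch of that dynamic SQL workaround (key and certificate names are hypothetical; open symmetric keys are session-scoped, so a key opened inside EXECUTE stays open for the rest of the batch):

```sql
-- Firing OPEN/CLOSE SYMMETRIC KEY via EXECUTE keeps the outer sproc
-- cacheable; only the dynamic statement itself misses the cache.
EXEC ('OPEN SYMMETRIC KEY MySymKey DECRYPTION BY CERTIFICATE MyCert;');

-- ...encryption/decryption work here, e.g. DECRYPTBYKEY(...)...

EXEC ('CLOSE SYMMETRIC KEY MySymKey;');
```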

We do what we can.

Later



Thursday, February 2, 2017

NULLs and the T-SQL EXCEPT Clause

NULL is the absence of a value, which makes it an unknown value. But it is not unknown the way “x” is unknown in algebra: “x” stands for an actual value, and so x = x. NULL is not only unknown; it is the absence of a value altogether. Consequently, NULL does not equal NULL, and any calculation involving NULL yields NULL.

So…
2 + 2 + x is equal to 4 + x

But…
2 + 2 + NULL is not equal to 4 + NULL

And T-SQL comparisons align with this.
2 + 2 = 4
2 + 2 != 5
And any comparison with a NULL in it evaluates to UNKNOWN – it is never treated as equal.
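This is easy to see directly (a minimal sketch):

```sql
-- NULL = NULL evaluates to UNKNOWN, so neither IF branch runs.
IF NULL = NULL        SELECT 'Equal' AS Result;
IF NOT (NULL = NULL)  SELECT 'Not Equal' AS Result;

-- Likewise, the comparison below falls through to the ELSE branch.
SELECT CASE WHEN 2 + 2 + NULL = 4 + NULL
            THEN 'Equal'
            ELSE 'Not proven equal'
       END AS Result;
```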


All is well in the world so far.


Enter EXCEPT, which outputs the distinct rows from a first dataset that are unmatched by any row in a second dataset. A simple example is below. 2 is in the top dataset but not in the bottom set, and so it is returned as a result of the EXCEPT clause. Even though the top dataset has two 2s, only one 2 is returned because the EXCEPT clause inherently produces a distinct result.
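A sketch of that example using table value constructors:

```sql
-- Top set: {1, 2, 2}; bottom set: {1}.
SELECT n FROM (VALUES (1), (2), (2)) AS TopSet(n)
EXCEPT
SELECT n FROM (VALUES (1)) AS BottomSet(n);
-- Returns a single row with the value 2.
```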


We will replace the 2s in the top dataset with NULLs. Notice that although T-SQL functions treat NULLs as not equal to each other, EXCEPT returns a single NULL. Both NULLs are taken to be "equal". 
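The same sketch with the 2s replaced by NULLs:

```sql
-- Top set: {1, NULL, NULL}; bottom set: {1}.
SELECT n FROM (VALUES (1), (NULL), (NULL)) AS TopSet(n)
EXCEPT
SELECT n FROM (VALUES (1)) AS BottomSet(n);
-- Returns a single row: NULL.
```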



This conforms to the general functionality of DISTINCT.



Finally, moving one of the NULLs to the bottom set, the result of the EXCEPT is empty – thereby treating NULL in the top set (which could be 2+2) as matching NULL in the bottom set (which could be 5).


The MSDN article goes so far as to describe the two NULL values as “equal” although the SQL standard uses the language “IS NOT DISTINCT FROM”.


So although T-SQL comparisons treat NULLs as not equal to each other, the T-SQL EXCEPT clause treats NULLs as equal. In the case below, each SELECT separately conforms to the logic that anything plus a missing value nets a missing value, but EXCEPT then treats those resulting missing values as matching. The result of the EXCEPT clause is empty.
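A minimal sketch of that case:

```sql
-- Each side evaluates to NULL (2 + 2 + NULL and 5 + NULL),
-- yet EXCEPT treats the two NULL results as matching.
SELECT 2 + 2 + NULL AS Result
EXCEPT
SELECT 5 + NULL;
-- The result set is empty.
```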



Might 2 + 2 actually equal 5?

Happy trails....


Monday, August 1, 2016

SELECT Clause Aliases Affect ORDER BY clause

If a SELECT clause transforms a column and then aliases the result with the same name as the original column in the base table, the ORDER BY clause is affected and cannot make use of the sort order of the base table. The SELECT clause alias affects the ORDER BY clause.

For an example, I am using a simple two-column table with an auto-increment Clustered PK and string column.

  



CASTing the PK as a VARCHAR and aliasing it with a different name (n1), an ORDER BY on the base table PK column performs an Index Seek and leverages the base table sort order.
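A sketch of that query, assuming a table dbo.T with an INT IDENTITY clustered PK column n and a string column s (all names hypothetical):

```sql
-- The alias (n1) differs from the base column name, so ORDER BY n
-- still refers to the base table's INT column and the clustered
-- index order can be reused - no Sort operator.
SELECT CAST(n AS VARCHAR(10)) AS n1, s
FROM dbo.T
ORDER BY n;
```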



But CASTing the PK as a VARCHAR and aliasing it with the base table column name, the same ORDER BY on that column name performs an Index Seek but must also sort the data – because the alias now refers to the VARCHAR result, so the data is actually sorted as a VARCHAR.
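A sketch of that query, again assuming a hypothetical table dbo.T with an INT clustered PK column n and a string column s:

```sql
-- The alias shadows the base column name, so ORDER BY n now refers
-- to the VARCHAR result of the CAST and a Sort operator is required
-- (as VARCHARs, '10' sorts before '2').
SELECT CAST(n AS VARCHAR(10)) AS n, s
FROM dbo.T
ORDER BY n;
```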

 


The column 'n' has its meaning changed between execution of the WHERE clause and execution of the ORDER BY clause. The transformed data is what is being ordered.



To return transformed data – named the same as the base table column but ordered as in the base table PK (which was a weird requirement I recently had) – you need to fully qualify the column reference in the ORDER BY clause.
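A sketch of the fully qualified form, against the same hypothetical table dbo.T:

```sql
-- Qualifying the column in the ORDER BY forces it to refer to the
-- base table's INT column, so the output is aliased as n but still
-- ordered by the clustered PK - no Sort operator.
SELECT CAST(t.n AS VARCHAR(10)) AS n, t.s
FROM dbo.T AS t
ORDER BY t.n;
```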







Saturday, December 27, 2014

Database Tables Without a Primary Key or Cluster Key

Here is a simple script to report all tables in a database with indicators for those without PKs or CKs.


WITH DataCTE
AS
(
SELECT
        SCHEMA_NAME(schema_id)           AS SchemaName
       ,name                                    AS TableName
       ,'PK'                                    AS KeyType
       ,CASE
              WHEN OBJECTPROPERTY(OBJECT_ID,'TableHasPrimaryKey') = 0 THEN 'X'
              ELSE ''
       END AS MissingIndicator
FROM
       sys.tables
UNION ALL
SELECT
        SCHEMA_NAME(schema_id)           AS SchemaName
       ,name                                    AS TableName
       ,'CK'                                    AS KeyType
       ,CASE
              WHEN OBJECTPROPERTY(OBJECT_ID,'TableHasClustIndex') = 0 THEN 'X'
              ELSE ''
       END AS MissingIndicator
FROM
       sys.tables
)

SELECT
       SchemaName
       ,TableName
       ,ISNULL([CK],'') AS MissingCK
       ,ISNULL([PK],'') AS MissingPK
FROM
       DataCTE
PIVOT
       (
       MIN(MissingIndicator)
       FOR KeyType IN
              (
              [CK],[PK]
              )
       ) pvtData
ORDER BY
       SchemaName
       ,TableName


Here is a sample execution of it against the AdventureWorks 2014 database. The ProductProductPhoto table has no CK in the database as provided at codeplex. I removed the PK on ProductSubcategory (which also is the CK) to test that bit of the script.





Wednesday, October 16, 2013

Returning a Default Row When No Data Would Be Returned


Recently, I had to tune a stored procedure that had an odd behavior. If no data was found matching the parameters passed in, it still returned a row with empty string values. The way the sproc did this was to first check for the existence of data for the parameters passed in. If data was there, it issued the real SELECT statement; otherwise, a SELECT of empty string values.

IF NOT EXISTS (SELECT * FROM dbo.TestData WHERE Col1 = @Col1 AND Col2 = @Col2)
       SELECT
               '' AS Col1
              ,'' AS Col2
              ,'' AS Col3
              ,'' AS Col4
ELSE
       SELECT
               Col1
              ,Col2
              ,Col3
              ,Col4
       FROM
              dbo.TestData
       WHERE
              Col1 = @Col1
       AND Col2 = @Col2

Naturally, this sort of logic should not be in the database tier. But the contract with the application was established, and the application call to the database was not sufficiently encapsulated – and so I had to work with this and change only the implementation of the sproc while preserving this contract.

So the obvious problem with the solution above is that when there is a match, there are two requests for the data as the execution plan below shows.



So the challenge was to preserve the default row contract with the application while eliminating the extra requests to check for the existence of the data. I will present three solutions, the third being the best I know of. But naturally, let me know if you have a better solution.

 (1)  UNION ALL the default row.

SELECT
        Col1
       ,Col2
       ,Col3
       ,Col4
FROM
       dbo.TestData
WHERE
       Col1 = @Col1
AND Col2 = @Col2
UNION ALL
SELECT
        '' AS Col1
       ,'' AS Col2
       ,'' AS Col3
       ,'' AS Col4

The problem here is that the default row is returned even when the data exists.


So for this solution, additional logic is needed to return either the records matching the parameters or the default row, but not both – something like this.

 WITH DataCTE
AS
(
SELECT
        Col1
       ,Col2
       ,Col3
       ,Col4
       ,0 AS DefaultRowInd
FROM
       dbo.TestData
WHERE
       Col1 = @Col1
AND Col2 = @Col2
UNION ALL
SELECT
        '' AS Col1
       ,'' AS Col2
       ,'' AS Col3
       ,'' AS Col4
       ,1 AS DefaultRowInd
)

SELECT
        Col1
       ,Col2
       ,Col3
       ,Col4
FROM
       DataCTE
WHERE
       DefaultRowInd = (SELECT MIN(DefaultRowInd) FROM DataCTE)

But the CTE does not shield us from two requests for the data. The WHERE clause at the end referencing DataCTE ultimately is a separate request for the data against the base table, and so - in this regard - no different from the original sproc implementation.



(2)  Table variable solution. This solution reads the data once into a table variable and then checks there for the existence of the data.

DECLARE @Data TABLE (Col1 VARCHAR(50),Col2 VARCHAR(50),Col3 VARCHAR(50),Col4 VARCHAR(50))
INSERT
INTO
       @Data
(
        Col1
       ,Col2
       ,Col3
       ,Col4
)
SELECT
        Col1
       ,Col2
       ,Col3
       ,Col4
FROM
       dbo.TestData
WHERE
       Col1 = @Col1
AND Col2 = @Col2

IF (SELECT COUNT(*) FROM @Data) = 0
       SELECT
               '' AS Col1
              ,'' AS Col2
              ,'' AS Col3
              ,'' AS Col4
ELSE
       SELECT
               Col1
              ,Col2
              ,Col3
              ,Col4
       FROM
              @Data

Although, in a heavy-handed way, this avoids a separate existence-check hit against the base table, the sproc still has to do the same sort of processing against the table variable – and it is actually more expensive. The execution plan cost comparison is roughly 25% (original) vs. 75% (table variable solution).

(3)  SELECT NULL and LEFT OUTER JOIN. This is the best solution I was able to work out.

SELECT
        ISNULL(Col1,'') AS Col1
       ,ISNULL(Col2,'') AS Col2
       ,ISNULL(Col3,'') AS Col3
       ,ISNULL(Col4,'') AS Col4
FROM
       (SELECT NULL AS DefaultVal) DefaultRow
       LEFT OUTER JOIN dbo.TestData
              ON  Col1 = @Col1
              AND Col2 = @Col2

This solution guarantees a record is returned via the SELECT NULL derived table and displays data from the base table only if it exists – otherwise the ISNULL functions return empty strings. NOTE: In order for this to work, it is important to move the WHERE clause criteria to the LEFT OUTER JOIN criteria – so it is applied only to the base table, and not to the entire query. If it is not moved from the WHERE clause, the T-SQL will not return the default row.
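For contrast, a sketch of the same query with the criteria left in a WHERE clause:

```sql
-- WRONG: when no data matches, the columns from dbo.TestData are
-- NULL, the WHERE predicates evaluate to UNKNOWN, and even the
-- single DefaultRow row is filtered out - no default row returned.
SELECT
        ISNULL(Col1,'') AS Col1
       ,ISNULL(Col2,'') AS Col2
       ,ISNULL(Col3,'') AS Col3
       ,ISNULL(Col4,'') AS Col4
FROM
       (SELECT NULL AS DefaultVal) DefaultRow
       LEFT OUTER JOIN dbo.TestData
              ON 1 = 1
WHERE
       Col1 = @Col1
AND    Col2 = @Col2;
```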

Below are the result sets for both conditions confirming the required results.


Comparing the execution plans, we see the tuned version has an execution plan cost equal to half that of the original sproc – which makes sense because we have eliminated one database hit without introducing costly processing.



Hope this is of some benefit.

Later

 -Regan