With SQL Server 2017, Microsoft improved the journey towards making your code go faster without requiring any code changes from your developers. SQL Server 2019 gives you more of this super cool black magic.

Today, we are going to show you why scalar function execution is changing in SQL Server 2019 and how you can make scalar functions run faster if you are not on SQL Server 2019 yet. (SPOILER ALERT: It will require some code changes.)

Why are Scalar Functions Silent Performance Killers?

Scalar functions are silent performance killers because, before SQL Server 2019, you wouldn't see them in execution plans or in SET STATISTICS output, which is how most people benchmark a query's performance. Below we have a video going over scalar functions and how to see the silent performance killer.

Hello SQL Server User, meet your silent but deadly performance killer.
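If you are on SQL Server 2016 or newer, one way to catch scalar functions red-handed is the sys.dm_exec_function_stats DMV. Here is a minimal sketch (the column aliases are mine) that surfaces the cumulative CPU cost per function, cost that the execution plan hides:

/* Surface the scalar function cost that execution plans hide.
   Requires SQL Server 2016 or newer. */
SELECT OBJECT_NAME(fs.object_id) AS FunctionName,
       fs.execution_count,
       fs.total_worker_time AS TotalCpuMicroseconds,
       fs.total_elapsed_time AS TotalElapsedMicroseconds
FROM sys.dm_exec_function_stats AS fs
ORDER BY fs.total_worker_time DESC;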

How does SQL Server 2019 Make this better?

This is a great question. I am glad you asked.

If you saw the video above, you know that scalar functions do row-by-row processing, and you can see in the Profiler trace that the GetMaxProductQty_Scalar function was executed for all 504 rows in the Product table.
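If you would rather not fire up Profiler, an Extended Events session watching the module_end event should show the same row-by-row behavior; you would expect roughly 504 events for the query in the video. A minimal sketch, with a session name I made up for this demo:

/* Sketch: capture each function completion with Extended Events
   instead of Profiler. The session name is arbitrary. */
CREATE EVENT SESSION [TrackScalarFunctionCalls] ON SERVER
ADD EVENT sqlserver.module_end
    (ACTION (sqlserver.sql_text))
ADD TARGET package0.ring_buffer;
GO
ALTER EVENT SESSION [TrackScalarFunctionCalls] ON SERVER STATE = START;
GO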

In SQL Server 2019, that row-by-row processing is converted to set-based logic where possible. This is shown in the next video below in great detail.
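If you want to check whether SQL Server 2019 considers a given function eligible for this treatment, the new is_inlineable column on sys.sql_modules will tell you. A quick sketch:

/* Check whether SQL Server 2019 can inline a scalar function.
   is_inlineable is new in SQL Server 2019. */
SELECT OBJECT_NAME(m.object_id) AS FunctionName,
       m.is_inlineable
FROM sys.sql_modules AS m
WHERE OBJECT_NAME(m.object_id) = 'GetMaxProductQty_Scalar';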

Can you help me find more information on Scalar Function changes with SQL 2019?

YES! The following are some great extra reference articles I would highly recommend reviewing.


Can I have the source code and play with the demo?

Yes, I strongly encourage it. In fact, if you have any questions while going through the demo, feel free to ask me.

USE [master];
GO
ALTER DATABASE [AdventureWorks2012] 
SET QUERY_STORE (OPERATION_MODE = READ_WRITE, 
/* For demo purposes. Most likely don't want these settings in production! */
					DATA_FLUSH_INTERVAL_SECONDS = 60, 
					QUERY_CAPTURE_MODE = ALL, 
					INTERVAL_LENGTH_MINUTES = 1);
GO

USE [AdventureWorks2012];
GO
DBCC DROPCLEANBUFFERS; -- For demo to get PAGEIOLATCH waits
ALTER DATABASE CURRENT SET QUERY_STORE CLEAR ALL;

/* 
How do scalar functions work before SQL Server 2019?
Let's see...
*/
USE [master]
GO
ALTER DATABASE [AdventureWorks2012] SET COMPATIBILITY_LEVEL = 120
GO

USE [AdventureWorks2012]
GO
SELECT @@SPID

SET STATISTICS IO ON
SET STATISTICS TIME ON
GO

/*** Makes demo go faster 
CREATE INDEX idx_test ON Sales.SalesOrderDetailEnlarged (ProductID)
INCLUDE (OrderQty)
WITH (DATA_COMPRESSION = PAGE, ONLINE=ON)
***/

/* CleanUp - Remove functions for the demo 
DROP IF EXISTS was added in SQL 2016
*/
USE [AdventureWorks2012]
GO
DROP FUNCTION IF EXISTS GetMaxProductQty_Scalar
DROP FUNCTION IF EXISTS GetMaxProductQty_Inline
GO

/* Now the fun begins... */
USE [AdventureWorks2012]
GO
CREATE FUNCTION GetMaxProductQty_Scalar
(
    @ProductId INT
)
RETURNS INT
AS
BEGIN
    DECLARE @maxQty INT

    SELECT @maxQty = MAX(sod.OrderQty)
    FROM Sales.SalesOrderDetailEnlarged sod
    WHERE sod.ProductId = @ProductId

    RETURN (@maxQty)
END



/* Clear all pages from memory
WARNING: Do not do this in production!
This gives us a scenario where no pages or plans are in memory. */
CHECKPOINT
DBCC DROPCLEANBUFFERS;
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;
/* New in SQL Server 2016 and Azure SQL Database.
This removes the plan cache for the database you are in.
You can use this in place of DBCC FREEPROCCACHE, which would
remove the plan cache for ALL databases. */


/* TODO: LOAD PROFILER - Show ROW BY ROW Function Calls */
/* Row By Row Processing - Scalar Function */
SELECT
    ProductId,
    dbo.GetMaxProductQty_Scalar(ProductId) As MaxQty
FROM Production.Product
ORDER BY 2 DESC

/* Where are the function calls? Didn't we also hit SalesOrderDetailEnlarged?

(504 row(s) affected)
Table 'Product'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 2, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

  SQL Server Execution Times:
   CPU time = 1750 ms,  elapsed time = 1958 ms.
*/

/** Let's look at SQL Server 2019 CTP 2.4 inline scalar functions. **/
USE [master]
GO
ALTER DATABASE [AdventureWorks2012] SET COMPATIBILITY_LEVEL = 150
GO


/* Clear all pages from memory
WARNING: Do not do this in production!
This gives us a scenario where no pages or plans are in memory. */
USE [AdventureWorks2012]
GO
CHECKPOINT
DBCC DROPCLEANBUFFERS;
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;
GO

/* TODO: LOAD PROFILER - Show ROW BY ROW Function Calls */
/* Row By Row Processing - Scalar Function */
USE [AdventureWorks2012]
GO
SELECT
    ProductId,
    dbo.GetMaxProductQty_Scalar(ProductId) As MaxQty
FROM Production.Product
ORDER BY 2 DESC
GO
/**** We now SEE SalesOrderDetailEnlarged in the plan and in SET STATISTICS IO.
WE DIDN'T SEE THIS BEFORE SQL SERVER 2019!  ****


(504 rows affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetailEnlarged'. Scan count 504, logical reads 11930, physical reads 2, read-ahead reads 8341, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Product'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 2156 ms,  elapsed time = 2494 ms.

*/
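Since we turned Query Store on at the top of the demo, you can also compare the two runs there instead of relying on SET STATISTICS alone. A rough sketch pulling runtime stats for our query (the LIKE filter is just one convenient way to find it):

/* Sketch: compare runs of our query using Query Store. */
SELECT qt.query_sql_text,
       p.plan_id,
       rs.count_executions,
       rs.avg_cpu_time,
       rs.avg_duration
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
    ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
    ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs
    ON p.plan_id = rs.plan_id
WHERE qt.query_sql_text LIKE '%GetMaxProductQty_Scalar%'
ORDER BY rs.avg_duration DESC;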


/*** HOW CAN WE MAKE THIS BETTER 
		if we are NOT on SQL SERVER 2019? ***/
USE [master]
GO
ALTER DATABASE [AdventureWorks2012] SET COMPATIBILITY_LEVEL = 140
GO

/* 
Utilize a table-valued function
instead of a scalar function where possible.
*/
USE [AdventureWorks2012]
GO
CREATE FUNCTION GetMaxProductQty_Inline
(
    @ProductId INT
)
RETURNS TABLE
AS
    RETURN
    (
        SELECT MAX(sode.OrderQty) AS maxqty
        FROM Sales.SalesOrderDetailEnlarged sode
        WHERE sode.ProductId = @ProductId
    )
GO

CHECKPOINT
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;

/*
WHAT is APPLY?
https://technet.microsoft.com/en-us/library/ms175156(v=sql.105).aspx
The APPLY operator allows you to invoke a table-valued function 
for each row returned by an outer table expression of a query. 
The table-valued function acts as the right input and the outer 
table expression acts as the left input. The right input is 
evaluated for each row from the left input and the rows produced 
are combined for the final output. 
*/

SELECT  p.ProductId,
		MaxQty
FROM Production.Product p
CROSS APPLY dbo.GetMaxProductQty_Inline(p.ProductId) tvf
ORDER BY 2 DESC


/*
(504 rows affected)
Table 'Product'. Scan count 2, logical reads 40, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetailEnlarged'. Scan count 3, logical reads 8230, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 3015 ms,  elapsed time = 2018 ms.

*/



CHECKPOINT
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;

/* In this case, we can simply join the two tables */
SELECT p.ProductId, MAX(sod.OrderQty)
FROM Sales.SalesOrderDetailEnlarged sod
RIGHT JOIN Production.Product p ON sod.ProductID = p.ProductID
GROUP BY p.ProductID
ORDER BY 2 DESC

/* (504 rows affected)
Table 'Product'. Scan count 3, logical reads 40, physical reads 1, read-ahead reads 14, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetailEnlarged'. Scan count 3, logical reads 8226, physical reads 1, read-ahead reads 8317, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 3048 ms,  elapsed time = 2122 ms.

*/

/* Add an index to improve performance - a nonclustered columnstore index */
USE [AdventureWorks2012]
GO
CREATE NONCLUSTERED COLUMNSTORE INDEX nccidx_SalesOrderDE on Sales.SalesOrderDetailEnlarged (ProductID, OrderQty)
GO

USE [AdventureWorks2012]
GO
CHECKPOINT
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;
GO

SELECT  ProductId,
		MaxQty
FROM Production.Product
CROSS APPLY dbo.GetMaxProductQty_Inline(ProductId) MaxQty
ORDER BY 2 DESC

/*
(504 rows affected)
Table 'SalesOrderDetailEnlarged'. Scan count 2, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 8251, lob physical reads 13, lob read-ahead reads 30209.
Table 'SalesOrderDetailEnlarged'. Segment reads 6, segment skipped 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Product'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 47 ms,  elapsed time = 724 ms.
*/


/*** Does a nonclustered columnstore index change anything 
	with SQL 2019? 
	
	We wouldn't be doing this if it didn't ;-)
	****/
USE [master]
GO
ALTER DATABASE [AdventureWorks2012] SET COMPATIBILITY_LEVEL = 150
GO


/* Clear all pages from memory
WARNING: Do not do this in production!
This gives us a scenario where no pages or plans are in memory. */
USE [AdventureWorks2012]
GO
CHECKPOINT
DBCC DROPCLEANBUFFERS;
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE;
GO

USE [AdventureWorks2012]
GO
SELECT
    ProductId,
    dbo.GetMaxProductQty_Scalar(ProductId) As MaxQty
FROM Production.Product
ORDER BY 2 DESC

/* YOUR MILEAGE MAY VARY...
(504 rows affected)
Table 'Worktable'. Scan count 504, logical reads 14546904, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetailEnlarged'. Scan count 2, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 4006, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetailEnlarged'. Segment reads 6, segment skipped 0.
Table 'Product'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 31203 ms,  elapsed time = 31640 ms.


*/
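As the numbers above show, inlining can actually make things worse when a better access path (here, the columnstore index) already made the scalar function cheap. Later SQL Server 2019 builds let you opt a single function out of inlining with the INLINE = OFF clause instead of dropping the whole database back a compatibility level. A sketch, noting this clause may not exist yet in the CTP you are running:

/* Sketch: opt one function out of scalar UDF inlining
   without changing the database compatibility level. */
ALTER FUNCTION GetMaxProductQty_Scalar
(
    @ProductId INT
)
RETURNS INT
WITH INLINE = OFF
AS
BEGIN
    DECLARE @maxQty INT

    SELECT @maxQty = MAX(sod.OrderQty)
    FROM Sales.SalesOrderDetailEnlarged sod
    WHERE sod.ProductId = @ProductId

    RETURN (@maxQty)
END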


/* ---------- > CLEANUP < ----------- */
USE [AdventureWorks2012]
GO
DROP FUNCTION IF EXISTS GetMaxProductQty_Scalar
DROP FUNCTION IF EXISTS GetMaxProductQty_Inline
DROP INDEX IF EXISTS idx_test ON Sales.SalesOrderDetailEnlarged
DROP INDEX IF EXISTS nccidx_SalesOrderDE on Sales.SalesOrderDetailEnlarged 
GO

[Updated on 10/25/2018]

With MSIgnite being this week, tons of great new features are being announced. While many will focus on Spark integration, performance tuning, security, and containers, sometimes it's the simple things that give us the most value. I will talk about a feature that is hidden on page 17 of Microsoft's SQL 2019 whitepaper. This feature is not getting a lot of love right now, but everyone will love it. I will be honest, I didn't even know about it until reading the whitepaper. What is really super cool is that this new feature will find its way into SQL Server 2017 and SQL Server 2016. More on this in a bit.

In the past, I have struggled with the classic string truncation error. Everyone has at one point or another. It's tricky to troubleshoot, especially when you are processing a big batch of data.

String or binary data would be truncated

So even in SQL Server 2019, SQL Server 2017 CU12, and SQL Server 2016 (an upcoming SQL Server 2016 SP2 CU), by default you will see the very classic "String or binary data would be truncated. The statement has been terminated." message. Developers, DBAs, IT professionals, and almost everyone who has ever touched SQL Server have seen this frustrating error at least once. It is frustrating for two separate reasons. One, you have no idea which column had the truncation. Two, you have no idea which row had the error.

Let’s take a look at a very simple example.

use [tempdb]
go

DROP TABLE IF EXISTS dbo.WorldSeries
GO

CREATE TABLE dbo.WorldSeries (TeamName varchar(200), Championships varchar(3))
GO

INSERT INTO dbo.WorldSeries (TeamName, Championships) VALUES ('Texas Rangers', null)

INSERT INTO dbo.WorldSeries (TeamName, Championships) VALUES ('Houston Astros', 'one'), 
('Atlanta Braves', 'two'), ('Pittsburgh Pirates', 'five')
GO

You will see that our first statement, inserting the Texas Rangers, completes successfully even though the Rangers have zero (I like NULL) championships. We have an error in the next batch, which does three inserts in a single statement, so the whole batch rolls back. In this example we could figure out where our error is, but let's pretend we are doing a massive data load into several columns with billions of records. Actually, I don't even want to think about troubleshooting that, and with this new hot feature in SQL Server 2019, you won't have to either.
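You can confirm the whole multi-row batch rolled back by checking what actually landed in the table; only the Rangers row should be there:

/* Only the single-row Texas Rangers insert should have survived. */
SELECT TeamName, Championships
FROM dbo.WorldSeries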

SQL Server 2019 Feature Everyone Will Love

Okay, I think you know where this is going... Let's not prolong this any longer. We will rerun the last statement after simply enabling trace flag 460.

DBCC TRACEON (460)
INSERT INTO dbo.WorldSeries (TeamName, Championships) VALUES ('Houston Astros', 'one'), 
('Atlanta Braves', 'two'), ('Pittsburgh Pirates', 'five')

GO

Now it is crystal clear.  The Pittsburgh Pirates have won too many championships to fit into this table. Okay, I guess we can both agree the string “five” is too big to fit into a varchar(3) column.
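For reference, the improved message (error 2628) looks roughly like the following, though the exact wording can vary by build:

/* Msg 2628:
String or binary data would be truncated in table 'tempdb.dbo.WorldSeries',
column 'Championships'. Truncated value: 'fiv'. */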

John’s Thoughts and Recommendation

All you have to do in SQL Server 2019 CTP 2.0 (the newest public release at the time of writing) is enable trace flag 460. I hope the need for trace flag 460 goes away when SQL Server 2019 is generally available. That said, enabling the trace flag is easy, and this feature is going to be a lifesaver, especially for people consuming data from other sources who struggle troubleshooting data that doesn't fit into their data model due to the length of a value.
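A quick note on scope: DBCC TRACEON(460) by itself only affects your session. Adding the -1 argument turns the flag on server-wide until the next restart, and a -T460 startup parameter would make it stick across restarts:

/* Session-level, as used above */
DBCC TRACEON (460)

/* Server-wide, until the next restart */
DBCC TRACEON (460, -1)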

I am sure I am not the only person who provided feedback on this new feature to Microsoft. That said, I am really excited to see Pedro's announcement that trace flag 460 shouldn't be required once SQL Server 2019 is generally available (GA). The new error message (message 2628) will replace the old error message (message 8152). I am also excited that you will be able to utilize this new feature in SQL Server 2016 and 2017. It makes perfect sense for backward compatibility that you will need to enable trace flag 460 if you are using SQL Server 2017 CU12 or SQL Server 2016. Way to go, Pedro and the SQL Server team. This is amazing news!
