Hello, first-time poster here; please let me know if more information is needed.
I have a query that uses a CTE to join two tables (defined below). The Reading table contains ~7 billion records and is partitioned monthly by ReadingTime. The ReadingAggregateNeeded table contains ~12 million records and is not partitioned. The CTE version of the query, which joins the two tables directly, never completes (I see PAGEIOLATCH and CXPACKET waits, as if it's doing a table scan). If I rewrite the query to join to a table variable instead (essentially loading the ReadingAggregateNeeded data into a variable first), it completes almost instantly. Both queries are below.
I have rebuilt all of the indexes on both tables, and I have updated statistics. According to dm_db_index_physical_stats, there is no fragmentation on my indexes.
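A fragmentation check of this kind can be sketched as below. This is only an illustration: the 'LIMITED' scan mode, the join to sys.indexes, and the column selection are assumptions, not necessarily the exact query that was run.

```sql
-- Sketch: report fragmentation per index and partition for dbo.Reading.
-- 'LIMITED' is an assumed scan mode, picked because it is the cheapest on a very large table.
SELECT  i.name AS IndexName,
        ps.partition_number,
        ps.index_type_desc,
        ps.avg_fragmentation_in_percent,
        ps.page_count
FROM    sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'dbo.Reading'), NULL, NULL, 'LIMITED') AS ps
INNER JOIN sys.indexes AS i
        ON i.object_id = ps.object_id
       AND i.index_id = ps.index_id
ORDER BY i.name, ps.partition_number;
```

Swapping the object name for dbo.ReadingAggregateNeeded gives the same report for the smaller table.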
The server that I'm having this problem on is running SQL Server 2008 R2 Developer Edition (64-bit), version 10.50.1600.1.
Can anyone help me to understand what is going on?
Thanks,
Ryan
CREATE TABLE [dbo].[Reading](
    [ReadingID] [BIGINT] IDENTITY(1,1) NOT NULL,
    [SensorID] [INT] NOT NULL,
    [ReadingTime] [DATETIME] NOT NULL,
    [ReadingValue] [REAL] NOT NULL,
    [RawReadingID] [BIGINT] NULL,
    CONSTRAINT [PK_Reading] PRIMARY KEY NONCLUSTERED
    (
        [ReadingID] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
            ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 100) ON [PRIMARY]
)
GO

CREATE CLUSTERED INDEX [SensorReadingTimeValue] ON [dbo].[Reading]
(
    [SensorID] ASC,
    [ReadingTime] ASC,
    [ReadingValue] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
        IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
        ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 95)
GO

CREATE TABLE [dbo].[ReadingAggregateNeeded](
    [ReadingAggregateNeededId] [bigint] IDENTITY(1,1) NOT NULL,
    [SensorId] [int] NOT NULL,
    [ReadingTime] [datetime] NOT NULL,
    CONSTRAINT [PK_ReadingAggregateNeeded] PRIMARY KEY NONCLUSTERED
    (
        [ReadingAggregateNeededId] ASC
    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
            ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 100) ON [PRIMARY]
)
GO

CREATE CLUSTERED INDEX [ix_SensorIDReadingTimeID] ON [dbo].[ReadingAggregateNeeded]
(
    [SensorId] ASC,
    [ReadingTime] ASC,
    [ReadingAggregateNeededId] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
        IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
        ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
CTE version (never completes):
DECLARE @MaxAggNeededId BIGINT;  -- terminator is required before a statement that starts with WITH

-- temporarily limit the amount of rows from ReadingAggregateNeeded for testing
WITH top10 AS
(
    SELECT TOP 10 ReadingAggregateNeededId
    FROM ReadingAggregateNeeded
    ORDER BY ReadingAggregateNeededId
)
SELECT @MaxAggNeededId = MAX(ReadingAggregateNeededId)
FROM top10;

-- distinct (sensor, hour) pairs that need aggregating; HourStart is ReadingTime with minutes and seconds removed
WITH HourlyToProcess AS
(
    SELECT  SensorID
          , DATEADD(MINUTE, -1 * DATEPART(MINUTE, ReadingTime),
                DATEADD(SECOND, -1 * DATEPART(SECOND, ReadingTime), ReadingTime)) AS HourStart
    FROM    ReadingAggregateNeeded
    WHERE   ReadingAggregateNeededId <= @MaxAggNeededId
    GROUP BY SensorID
          , DATEADD(MINUTE, -1 * DATEPART(MINUTE, ReadingTime),
                DATEADD(SECOND, -1 * DATEPART(SECOND, ReadingTime), ReadingTime))
)
SELECT  Reading.SensorID
      , HourlyToProcess.HourStart
      , AVG(CAST(Reading.ReadingValue AS FLOAT)) AS ReadingAverage
      , SUM(CAST(Reading.ReadingValue AS FLOAT)) AS ReadingSum
      , MAX(CAST(Reading.ReadingValue AS FLOAT)) AS ReadingMax
      , MIN(CAST(Reading.ReadingValue AS FLOAT)) AS ReadingMin
      , COUNT(Reading.ReadingValue) AS ReadingCount
FROM    HourlyToProcess
INNER JOIN Reading
        ON Reading.SensorID = HourlyToProcess.SensorID
       AND Reading.ReadingTime >= HourlyToProcess.HourStart
       AND Reading.ReadingTime < DATEADD(HOUR, 1, HourlyToProcess.HourStart)
GROUP BY Reading.SensorID
      , HourlyToProcess.HourStart
Table variable version (completes instantly):
DECLARE @MaxAggNeededId BIGINT;  -- terminator is required before a statement that starts with WITH

-- temporarily limit the amount of rows from ReadingAggregateNeeded for testing
WITH top10 AS
(
    SELECT TOP 10 ReadingAggregateNeededId
    FROM ReadingAggregateNeeded
    ORDER BY ReadingAggregateNeededId
)
SELECT @MaxAggNeededId = MAX(ReadingAggregateNeededId)
FROM top10;

DECLARE @HourlyToProcess TABLE
(
    SensorID INT
  , HourStart DATETIME
);

-- materialize the distinct (sensor, hour) pairs before joining to Reading
INSERT INTO @HourlyToProcess
(
    SensorID
  , HourStart
)
SELECT  SensorID
      , DATEADD(MINUTE, -1 * DATEPART(MINUTE, ReadingTime),
            DATEADD(SECOND, -1 * DATEPART(SECOND, ReadingTime), ReadingTime)) AS HourStart
FROM    ReadingAggregateNeeded
WHERE   ReadingAggregateNeededId <= @MaxAggNeededId
GROUP BY SensorID
      , DATEADD(MINUTE, -1 * DATEPART(MINUTE, ReadingTime),
            DATEADD(SECOND, -1 * DATEPART(SECOND, ReadingTime), ReadingTime));

SELECT  Reading.SensorID
      , HourlyToProcess.HourStart
      , AVG(CAST(Reading.ReadingValue AS FLOAT)) AS ReadingAverage
      , SUM(CAST(Reading.ReadingValue AS FLOAT)) AS ReadingSum
      , MAX(CAST(Reading.ReadingValue AS FLOAT)) AS ReadingMax
      , MIN(CAST(Reading.ReadingValue AS FLOAT)) AS ReadingMin
      , COUNT(Reading.ReadingValue) AS ReadingCount
FROM    @HourlyToProcess HourlyToProcess
INNER JOIN Reading
        ON Reading.SensorID = HourlyToProcess.SensorID
       AND Reading.ReadingTime >= HourlyToProcess.HourStart
       AND Reading.ReadingTime < DATEADD(HOUR, 1, HourlyToProcess.HourStart)
GROUP BY Reading.SensorID
      , HourlyToProcess.HourStart