Channel: SQL Server Database Engine forum

How can I bulk insert into the deltastore? How to use SqlBulkCopy efficiently on a table with a nonclustered columnstore index?


Hello SQL Server community,

We have an application that loads event data in batches.

The application inserts events into a SQL Server table using bulk insert (System.Data.SqlClient.SqlBulkCopy), in batches of 1 to 10,000 events.

We have added a non-clustered columnstore index to the table.

Now each bulk insert produces a COMPRESSED rowgroup the size of the batch. Over time this becomes inefficient: you accumulate lots of relatively small compressed rowgroups, and queries that use the index become MUCH slower.

Essentially we are back to the days when NCCIs had no deltastore.

Of course, you can run a costly REORGANIZE on the NCCI to merge those tiny compressed rowgroups into larger ones.


If you execute an ordinary INSERT statement instead, the index records the rows in the deltastore, and things are handled much more efficiently.

 

Therefore my question: is there any way to ask SQL Server to treat a bulk insert as a normal insert when updating columnstore indexes?

Put another way: is there any way to disable the BULKLOAD rowgroup trim reason for a columnstore index when bulk-loading data?
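
In case it helps with reproducing what I am seeing, the fragmentation can be summarized per rowgroup state and trim reason with a query roughly like the following. This is only a sketch: it assumes the table is [dbo].[Table] from the repro script further down, and the column aliases are made up for illustration.

```sql
-- Sketch: count the rowgroups of the columnstore index by state and trim reason.
-- Assumption: the table is [dbo].[Table] from the repro script; adjust the name as needed.
SELECT rg.state_desc,                               -- OPEN, CLOSED, COMPRESSED, TOMBSTONE
       rg.trim_reason_desc,                         -- e.g. BULKLOAD for undersized bulk-loaded rowgroups
       COUNT(*)           AS rowgroup_count,
       SUM(rg.total_rows) AS total_rows,
       AVG(rg.total_rows) AS avg_rows_per_rowgroup
FROM sys.dm_db_column_store_row_group_physical_stats AS rg
WHERE rg.object_id = OBJECT_ID(N'[dbo].[Table]')
GROUP BY rg.state_desc, rg.trim_reason_desc
ORDER BY rg.state_desc, rg.trim_reason_desc;
```

A large rowgroup_count combined with a small avg_rows_per_rowgroup for COMPRESSED rowgroups is the pattern described above.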

Thank you very much,

Alexander

This script illustrates the question more precisely:

To run it, you need to create a file on the file system for BULK INSERT.

It creates a database and cleans it up afterwards.

SQL Server 2017 (14.0.3223.3) was used. 

Behavior is the same on Microsoft SQL Azure (RTM) - 12.0.2000.8   Aug 27 2019 17:56:41

USE [master]
GO

THROW 51000, 'Create a file C:\TestColumnStoreInservVSBulkInsert.txt with content: "test, 1" and comment this line', 1;  

CREATE DATABASE TestColumnStoreInservVSBulkInsert 
GO
use [TestColumnStoreInservVSBulkInsert]

CREATE TABLE [Table](
[value] [varchar](20) NOT NULL INDEX  IX_1 CLUSTERED,
[state] int not NULL
)

CREATE NONCLUSTERED COLUMNSTORE INDEX [NCI_1] ON [dbo].[Table]
(
[value],
[state]
)WITH (DROP_EXISTING = OFF, COMPRESSION_DELAY = 0) 

insert into [Table] values (('TestInitail'), (1))

DECLARE @IndexStateQuery VARCHAR(MAX)  
SET @IndexStateQuery = 'SELECT i.object_id,   
    object_name(i.object_id) AS TableName,   
    i.name AS IndexName,   
    i.index_id,   
    i.type_desc,   
    CSRowGroups.*
FROM sys.indexes AS i  
JOIN sys.dm_db_column_store_row_group_physical_stats AS CSRowGroups  
    ON i.object_id = CSRowGroups.object_id AND i.index_id = CSRowGroups.index_id   
ORDER BY object_name(i.object_id), i.name, row_group_id;  '

EXEC (@IndexStateQuery)
-- Creates a COMPRESSED rowgroup containing just 1 record!
-- QUESTION: How can this statement be made to add the data to the OPEN rowgroup instead?
BULK INSERT [Table] FROM 'C:\TestColumnStoreInservVSBulkInsert.txt' WITH ( FORMAT='CSV', ROWS_PER_BATCH = 1);

EXEC (@IndexStateQuery)
-- Adds one record to the existing OPEN rowgroup
insert into [Table] select top 1 * from [Table]
EXEC (@IndexStateQuery)

--Costly fix: merge and recompress the small compressed rowgroups
--ALTER INDEX NCI_1   ON [Table] REORGANIZE   
--EXEC (@IndexStateQuery)


--Cleanup
use [master]
alter database [TestColumnStoreInservVSBulkInsert] set single_user with rollback immediate
drop database [TestColumnStoreInservVSBulkInsert]

 


