Columnstore Index
Columnstore index was introduced in the SQL Server 2012 to significantly speed-up the processing time of common data warehousing queries.
Generally data warehousing workloads involve summarizing large volume of data. As you might know, we use different techniques in data-warehousing (DW) and decision support systems (DSS) to improve performance like pre-computed summary tables, indexed views, OLAP cubes etc. Although, these techniques greatly speed up query the data processing but they have some side effects. These techniques can be inflexible, hard to maintain, and must be designed specifically for specific query problem.
Data warehousing processing overhead can be significantly reduced by using SQL Server Columnstore index. Not only that, columnstore indexes reduces the overhead of the other solutions. Columnstore indexes also enable queries to compute the results quickly so that pre-computation is not required.
Characterisitcs of SQL Server Columnstore Indexes:
How to create Columnstore Index:
Columnstore index can be create using SQL Server Management Studio as well as using T-SQL.
Creating Columnstore index by using SQL Server Management Studio (SSMS):
Creating Columnstore index by using T-SQL:
Create columnstore index on Employee table, name it CSI_Employee, and include all the columns!
GO
Data Types
Columnstore index support some data-types and doesn't support others. You need to be aware of this while creating columnstore index to avoid any run time errors.
The following data types can be included in a columnstore index:
The following data types cannot be included in a columnstore index:
Columnstore index was introduced in the SQL Server 2012 to significantly speed-up the processing time of common data warehousing queries.
Generally data warehousing workloads involve summarizing large volume of data. As you might know, we use different techniques in data-warehousing (DW) and decision support systems (DSS) to improve performance like pre-computed summary tables, indexed views, OLAP cubes etc. Although, these techniques greatly speed up query the data processing but they have some side effects. These techniques can be inflexible, hard to maintain, and must be designed specifically for specific query problem.
Data warehousing processing overhead can be significantly reduced by using SQL Server Columnstore index. Not only that, columnstore indexes reduces the overhead of the other solutions. Columnstore indexes also enable queries to compute the results quickly so that pre-computation is not required.
Characterisitcs of SQL Server Columnstore Indexes:
· Columnar data format – unlike the traditional row based organization of data (called rowstore format), columnstore indexes group and store data in one column at a time. SQL Server query processing can take advantage of this new data
layout and significantly improve query execution time.
· Faster query results – Columnstore indexes can produce faster results for the
following reasons:
o Only the columns needed must be read. Therefore, less data is read
from disk to memory and then from memory to processor cache.
o Columns are heavily compressed. This reduces the number of bytes
that must be read and moved and hence less I/O operations.
o Most queries do not touch all columns of the table. Therefore,
many columns will never be brought into memory. This, combined with excellent
compression, improves buffer pool usage, which reduces total I/O.
o Advanced query execution technology processes chunks of columns
called batches in a streamlined manner,
reducing CPU usage.
· Key columns – There is no concept of key columns in a columnstore index.
· Clustered index key – all the columns of clustered index must be present in the nonclustered columnstore index. If a column
in the clustering key is not listed in the create index statement, it will be
added to the columnstore index automatically.
· Partitioning – Columnstore indexes works with table partitioning. No change to
the table partitioning syntax is required. Albeit, nonclustered
columnstore index can only be created on a partitioned table if the
partitioning column is part of the columnstore index.
· Record Size – There is no index key record size limitation on columnstore indexes.
· Query processing – Along with the columnstore index, SQL Server introduces batch
processing to take advantage of the columnar orientation of the data. The columnstore
structure and batch processing both contribute to performance gains.
· Table cannot be
updated – For SQL Server 2012, a table with a columnstore
index cannot be updated.
How to create Columnstore Index:
Columnstore index can be create using SQL Server Management Studio as well as using T-SQL.
Creating Columnstore index by using SQL Server Management Studio (SSMS):
1.
Open Management Studio and use Object Explorer to connect to the SQL Server Database Engine.
2.
In Object Explorer, expand the instance of SQL Server, expand Databases, expand a database, expand a table, right-click on required
table, point to New Index, and then click Non-Clustered Columnstore Index.
3.
In the Index name dialog box, under the General tab, type a name for the new index, and then click Add.
4.
In the Select Columns dialog box, select
the columns to participate in the columnstore index, and then click OK two times to create the index.
Creating Columnstore index by using T-SQL:
Create columnstore index on Employee table, name it CSI_Employee, and include all the columns!
-- Check whether index CSI_Employee exists on Employee
-- and delete if exists.
IF EXISTS( SELECT name FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.Employee')
AND name = 'CSI_Employee')
DROP INDEX CSI_Employee ON dbo.Employee
GO
-- Create nonclustered columnstore index CSI_Employee on
Employee.
CREATE NONCLUSTERED COLUMNSTORE
INDEX CSI_Employee
ON dbo.Employee
(
EmployeeID
,FirstName
,MiddleName
,LastName
,DOB
,DepartmentID
)
Data Types
Columnstore index support some data-types and doesn't support others. You need to be aware of this while creating columnstore index to avoid any run time errors.
The following data types can be included in a columnstore index:
· char and varchar
· nchar and nvarchar (except varchar(max) and
nvarchar(max))
· decimal (and numeric) (Except with precision greater than 18 digits.)
· int, bigint, smallint, and tinyint
· float (and real)
· bit
· money and smallmoney
· All date and time data types (except datetimeoffset
with scale greater than 2)
The following data types cannot be included in a columnstore index:
· binary and varbinary
· ntext, text, and image
· varchar(max) and nvarchar(max)
· uniqueidentifier
·
timestamp
·
sql_variant
·
decimal and numeric with precision greater than 18 digits
·
datetimeoffset with
scale greater than 2
·
CLR types (hierarchyid and spatial
types)
·
xml
There are certain restrictions and limitations on columnstore indexes which are listed below:
·
Columnstor index cannot have more than
1024 columns.
·
Columnstor index annot be clustered or unique.
Only nonclustered columnstore indexes are available.
·
Columnstor index cannot be created on a
view or indexed view.
·
Cannot include columns of restricted data-types as mentioned above.
·
Cannot act as a primary
key or a foreign key.
·
Cannot be changed using
the ALTER INDEX statement. Drop and re-create the columnstore index
instead.
·
Cannot be created with
the INCLUDE keyword.
·
Cannot include the ASC
or DESC keywords for sorting the index. Columnstore indexes are ordered
according to the compression algorithms.
·
Does not use or keep
statistics in the manner of a traditional index.
Note: You can compare the performance optimization be executing heavy volume queries before and after creating columnstore index. You might get 50% to 90% performance gain!
No comments:
Post a Comment