COLUMNSTORE Index

Leave a comment (0) Go to comments

SQL Server’s indexes, clustered and nonclustered, are based on the B-tree. B-tree structure is great for finding data that match the condition on the basis of the primary key. They give us good results in the form of performance when all data scan is required. Then why we needed columnstore indexes? Before jumping to the reason lets visit the few advantages of row store option of indexes above column store option. While it’s possible to build a system that stores all data in columnar format, row stores still have advantages in some situations. B-tree provides a very efficient method for looking up or modifying a single row of data. So if our OLTP data includes many single row lookups and many updates and deletes, the preferable option is to use row store technology. But, in OLAP queries scan, aggregate, and join large amounts of data and here comes the usage of column stores technology. The column store technology provides other major advantages. Lets go thru them one by one.

SQL Server now provides you with a choice. You can build columnstore indexes on your big data warehouse tables and get the benefits of column store technology and batch mode processing without giving up the benefits of traditional row store technology when a B-tree is the right tool for the job.

There are two main reasons:

1) Almost every organization stores data in row fashion. Both heaps and B-trees are row stores because they store the values from each column in a given row contiguously. So finding the values corresponding to a particular row is very efficient when data is grouped together on one page. Storing data by rows is less ideal for compressing the data. When data is stored row-wise, the number of rows per page is relatively few, so the opportunities to exploit similarity among values are limited. A column store organizes data in column-wise fashion. In column wise fashion, data from a single column is stored contiguously. Usually there is repetition and similarity among values within a column and it gives the chance of compressing the indexed data in SQL Server.


2) Storing the data in column wise fashion, helps in reduction of IO cost as each column can be accessed independently from each other. If a query touches only a subset of the columns in a table, IO is reduced. In OLAP, a multidimensional query never touches all the columns but only 20-25% of the columns in table and the usage of column wise technology, helps in great reduction of IO and speeding up the queries.

When the query uses at least one columnstore index, query plan processing can speed up joins, aggregations, and filtering. The query optimizer takes care of choosing when to use batch mode processing in case of columnstore index and when to use traditional row mode query processing.

SQL Server now provides you with a choice. You can build columnstore indexes on your big data warehouse tables and get the benefits of column store technology and batch mode processing without giving up the benefits of traditional row store technology when a B-tree is the right tool for the job.

Columnstore indexes are available in CTP 3 of SQL Server Code Name “Denali.”

Columnstore indexes can be created on table using a small variation on existing syntax for creating indexes. Let’s have an index named firstcolumnstoreindex on a table named columnstoretable with three columns, named col1, col2, and col3, use the following syntax:

CREATE NONCLUSTERED COLUMNSTORE INDEX firstcolumnstoreindex ON columnstoretable (col1, col2, col3);

While putting all columns in columnstore index, It does not matter what order you list the columns because a columnstore index does not have a key like a B-tree index does. The data will be re-ordered automatically to get the best compression.

It is necessary to populate the table with data before creating the columnstore index. To update the data in the table which has columnstore index created on it, we need to disable or drop the columnstore index and update the table and then rebuild the columnstore index. Or we can use the method of partitioning. If table is partitioned, first put data to be updated or inserted into a staging table and then build a columnstore index on the staging table, and switch the staging table into an empty partition of your main table. And vice-versa.

EOF - COLUMNSTORE Index, SQL Server 2012 , 2008 R2 , 2008 , 2005 , 2000

Leave a Reply

Your email address will not be published. Required fields are marked *


*


You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Notify me of followup comments via e-mail. You can also subscribe without commenting.