- Index (definition)
- Data structure / database object to speed up data retrieval.
- Stores a sorted copy of one or more columns plus pointers to data rows.
- Speeds up
SELECTqueries. - Slows down
INSERT,UPDATE,DELETEbecause indexes must be updated.
- Why indexes help performance
- Reduce full table scans by quickly locating rows.
- Make queries with
WHERE,JOIN,ORDER BY,GROUP BYfaster when they use indexed columns.
Types of Indexes
[1] Clustered index
- Determines physical order of rows in the table.
- Table rows are stored in the same order as the clustered key.
- Only one clustered index per table.
- Leaf nodes contain the actual data pages.
- Fast data retrieval, slower data modifications.
- Example:
1 2CREATE CLUSTERED INDEX IX_Employees_EmployeeID ON Employees (EmployeeID);
[2] Non-clustered index
- Separate structure from the table data.
- Contains indexed column(s) plus pointers (row locators) to actual rows.
- A table can have multiple non-clustered indexes.
- Leaf nodes contain pointers, not the data pages themselves.
- Improves read performance on those columns, slows writes.
- Example:
1 2CREATE NONCLUSTERED INDEX IX_Employees_LastName ON Employees (LastName);
[3] Other common index types
- Unique index: Enforces that all values in the index are unique.
- Composite index: Index over multiple columns.
- Full-text index: Optimized for text search on large text columns.
[4] Covering index
- Index that contains all columns needed by a query.
- Query can be satisfied entirely from the index (no table lookup).
Difference between clustered and non-clustered index
- Clustered: physically sorts table rows. Only one per table.
- Non-clustered: separate structure, can have multiple.
When to use indexes
- Columns frequently used in
WHERE. - Columns used in
JOINconditions. - Columns used in
ORDER BYorGROUP BY. - Columns/column combinations that match common query patterns.
When NOT to use indexes
- Very small tables (full table scan may be faster).
- Columns with high write activity (frequent inserts/updates/deletes) because indexes slow down inserts, updates, and deletes..
- Columns with low selectivity (few distinct values, e.g., boolean flags).
- Checking index usage
- SQL Server:
EXPLAIN,SET STATISTICS IO ON,sys.dm_db_index_usage_stats. - MySQL:
EXPLAIN SELECT ....
- SQL Server:
- Quick example (interview-friendly)
1 2 3 4 5-- Without index SELECT * FROM Employees WHERE DepartmentID = 5; -- With index on DepartmentID CREATE INDEX idx_DepartmentID ON Employees(DepartmentID);- The query now runs much faster, especially on large tables.. With the index, lookups on
DepartmentIDare much faster on large tables.
- The query now runs much faster, especially on large tables.. With the index, lookups on
Tips to Improve Performance Using Indexes
- Index the columns used in WHERE, JOIN, ORDER BY, GROUP BY
- Use composite indexes wisely to cover multiple query columns
- Avoid over-indexing – too many indexes slow down writes
- Use filtered/indexed views for frequently queried subsets
- Update statistics regularly to help the optimizer choose indexes
SQL Index
An index is a database object that improves the speed of data retrieval operations on a database table. It is a data structure that stores a sorted copy of one or more columns of a table, along with pointers to the actual data rows. Indexes are used to speed up data retrieval, but they can slow down data modification operations (INSERT, UPDATE, DELETE) because the index must also be updated.
Types of Indexes
There are 2 main types of indexes: clustered indexes and non-clustered indexes.
Clustered Index
A clustered index is a special type of index that determines the physical order of data in a table. Because the data is stored in the same order as the index, a clustered index can significantly speed up data retrieval operations. A table can have only one clustered index.
Characteristics of Clustered Indexes:
- Determines the physical order of data in a table
- A table can have only one clustered index
- The leaf nodes of a clustered index contain the actual data pages
- Faster data retrieval operations
- Slower data modification operations
Example:
| |
Non-Clustered Index
A non-clustered index is a separate data structure that contains the indexed columns and pointers to the actual data rows. A table can have multiple non-clustered indexes.
Characteristics of Non-Clustered Indexes:
- Separate data structure from the table
- A table can have multiple non-clustered indexes
- The leaf nodes of a non-clustered index contain pointers to the actual data rows
- Faster data retrieval operations
- Slower data modification operations
Example:
| |
When to Use Indexes
Indexes should be used on columns that are frequently used in search conditions (WHERE clause) or join conditions (JOIN clause). They should also be used on columns that are frequently used in ORDER BY or GROUP BY clauses.
When Not to Use Indexes
Indexes should not be used on columns that are frequently updated, as this will slow down data modification operations. They should also not be used on columns that have low selectivity (i.e., columns that have many duplicate values).