# Exploring Statistics
> [!Note]
> Ensuring the optimizer has **accurate statistics**—through both automatic and manual methods—relies on understanding **histograms**, **density**, **string statistics**, the variety of **single-column**, **multi-column**, and **filtered** statistics, and knowing when to apply **FULLSCAN** or rely on default sampling, all while monitoring the **staleness threshold** for updates.
SQL Server can **automatically** generate and refresh statistics under the `AUTO_CREATE_STATISTICS` and `AUTO_UPDATE_STATISTICS` settings, building single-column histograms on demand when queries reference unindexed columns. You may also create statistics explicitly with the `CREATE STATISTICS` command to cover multi-column correlations or filtered subsets of data.
At the heart of each statistics object lies a **histogram**, which captures the distribution of up to 200 value ranges for the leading column. **Density** metrics complement this by quantifying how selective each column is for equality predicates. For character data, **string statistics** enhance the optimizer’s ability to estimate `LIKE` and text patterns accurately.
When maximum precision is required—especially after substantial data loads—you can force a full population of the histogram with `UPDATE STATISTICS … WITH FULLSCAN`. This bypasses the default sampling algorithm to ensure every row contributes to the distribution map.
To prevent estimates from drifting, SQL Server defines a **staleness threshold**: once **500** plus **20%** of the rows in the key column have changed, the engine triggers an automatic statistics update on that object.
Below is a concise tabular summary of these key statistics behaviors:
|**Aspect**|**Details**|
|---|---|
|**Creation & Refresh**|Automatic via **AUTO_CREATE_STATISTICS** / **AUTO_UPDATE_STATISTICS**, or manual with `CREATE STATISTICS`|
|**Core Components**|**Histogram** (value distribution), **Density** (equality selectivity), **String Stats** (text pattern estimates)|
|**Types Available**|**Single-column** (auto), **Multi-column** (manual), **Filtered** (manual)|
|**Advanced Update Option**|`UPDATE STATISTICS … WITH FULLSCAN` for complete histogram accuracy versus default sampled population|
|**Staleness Threshold**|**500 + 20%** of modifications on the key column triggers automatic update|
> [!tip]
> Use **FULLSCAN** sparingly—reserve it for critical tables after bulk loads—to avoid prolonged maintenance windows and high I/O during sampling.
> [!warning]
> Overly delayed or infrequent statistics updates on high-churn tables can mislead the optimizer into poor plan choices, so monitor staleness via `sys.dm_db_stats_properties`.
---
## ## References
- Korotkevitch, D. (2022). _SQL Server advanced troubleshooting and performance tuning: Best practices and techniques_. O’Reilly Media.
- Nevarez, B. (2022). _SQL Server query tuning and optimization: Optimize Microsoft SQL Server 2022 queries and applications_. Packt Publishing.