SQLintersection session – Queries Gone Wild 2: Statistics and Cardinality Estimation, Versions 2008, 2008R2, 2012 and 2014

I attended the SQLintersection conference in November 2014 in Las Vegas, Nevada. I am including my notes from the sessions in this blog, primarily for my own reference later. For anyone who comes across these notes, please take them with a grain of salt – I may very easily have misheard something or have written it down incorrectly.

 

Session Title:  Queries Gone Wild 2: Statistics and Cardinality Estimation, Versions 2008, 2008R2, 2012 and 2014
Instructor: Kimberly Tripp, Joe Sack

 

Part I: Statistics and Data Distribution (Tripp)

 

  • Query execution
    • Standardization/normalization/algebrization creates a parse tree; this includes things such as expanding select * and views
    • Heuristics-based optimization (basically a guess) vs cost-based optimization
    • See Kimberly’s SP talk at this conference for limitations on SP performance
  • Query rewrites sometimes help
    • Joins written as subqueries (and vice versa)
    • Derived tables
    • Intermediate materialization
    • Check SARGability of prefixes
  • Selectivity
    • The tipping point – the row count at which SQL stops using an NCI with bookmark lookups in favor of a scan – is a very small percentage of the table
    • This is why lots of narrow NCIs aren’t very helpful
    • Tipping point calculation – number of pages in the CI * (between 1/4 and 1/3) = number of rows at which bookmark lookups stop paying off (see the sketch below)
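As a rough sketch of that calculation (dbo.Sales is a hypothetical table), the clustered index page count and the resulting tipping-point range can be pulled from sys.dm_db_partition_stats:

    -- Tipping-point range: between 1/4 and 1/3 of the CI's data pages,
    -- expressed as a row count.
    SELECT OBJECT_NAME(ps.object_id)            AS table_name,
           SUM(ps.in_row_data_page_count)       AS ci_pages,
           SUM(ps.in_row_data_page_count) / 4.0 AS tipping_point_low_rows,
           SUM(ps.in_row_data_page_count) / 3.0 AS tipping_point_high_rows
    FROM sys.dm_db_partition_stats AS ps
    WHERE ps.object_id = OBJECT_ID(N'dbo.Sales')  -- hypothetical table
      AND ps.index_id IN (0, 1)                   -- heap or clustered index
    GROUP BY ps.object_id;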
  • What to look for in statistics
    • Header: Updated date, rows vs rows sampled
    • Density vector
    • Histogram
  • “Bitmasks are evil”
  • When stats are built, SQL starts to “compress” the histogram, meaning there may be fewer than 200 steps. This can introduce a bit of skew in the stats, where “earlier” rows get compressed out earlier and more often, resulting in more “lossiness” for those rows
  • Leave auto-create stats on (except for very limited high-throughput environments that require very consistent timing)
  • DBCC SHOW_STATISTICS (tablename, statname)
    • STAT_HEADER
    • DENSITY_VECTOR
      • Output STAT_HEADER and DENSITY_VECTOR to temp table to join
    • HISTOGRAM
    • STAT_HEADER JOIN DENSITY_VECTOR (undocumented)
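For reference, those options map directly onto the DBCC syntax (dbo.Sales and the statistic name are hypothetical):

    DBCC SHOW_STATISTICS (N'dbo.Sales', N'IX_Sales_OrderDate') WITH STAT_HEADER;
    DBCC SHOW_STATISTICS (N'dbo.Sales', N'IX_Sales_OrderDate') WITH DENSITY_VECTOR;
    DBCC SHOW_STATISTICS (N'dbo.Sales', N'IX_Sales_OrderDate') WITH HISTOGRAM;
    -- Undocumented: header and density vector in a single rowset
    DBCC SHOW_STATISTICS (N'dbo.Sales', N'IX_Sales_OrderDate') WITH STAT_HEADER JOIN DENSITY_VECTOR;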
  • sp_autostats tablename
    • Not especially useful, except to quickly see whether auto update is disabled and the last updated time
  • stats
    • See Kimberly’s query to get an at-a-glance view of all stats on a DB
  • STATS_DATE() function
  • dm_db_stats_properties
    • Rows vs rows sampled
    • modification_counter
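A quick sketch that pulls those properties for every statistic on a table (dbo.Sales is hypothetical; the DMF requires 2008 R2 SP2 or 2012 SP1 and later):

    SELECT s.name AS stat_name,
           sp.last_updated,
           sp.rows,
           sp.rows_sampled,
           sp.modification_counter
    FROM sys.stats AS s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
    WHERE s.object_id = OBJECT_ID(N'dbo.Sales');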
  • Stats types
    • Auto-created
    • User-created
    • Hypothetical (what-if analysis (autopilot))
      • Ignored by query optimizer
      • DBCC AUTOPILOT (blog)
    • sp_updatestats (don’t use; it’s a sledgehammer)
      • A single modified row is enough to trigger an update
    • sp_createstats (OK to use)
      • sp_createstats ‘indexonly’, ‘fullscan’
        • (whether to use fullscan depends on the table size)
      • Updating
        • Leave autoupdate on, but manage it yourself off-hours
        • Often based on sysrscols.rcmodified vs. the entire table (3 to 8 percent)
      • Catch-all update statements, e.g.: SET col = ISNULL(@newvalue, col)
        • Cause the modification counters to increase even if nothing actually changes
        • Can greatly skew autoupdate thresholds (see the sketch below)
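A minimal sketch of that catch-all pattern (table and column names are hypothetical):

    -- Every qualifying row is written – and counted as a modification –
    -- even when @NewStatus is NULL and the stored value does not change.
    UPDATE dbo.Orders
    SET Status = ISNULL(@NewStatus, Status)
    WHERE CustomerID = @CustomerID;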
      • Auto updating
        • Can be done async (not enabled by default; see the setting below)

auto_update_stats   auto_update_stats_async   Behavior
Off                 n/a                       No update
On                  Off                       Queries wait for the synchronous update
On                  On                        Stats updated asynchronously
  • Async updates are generally good for OLTP environments
  • TF 2371 uses dynamic auto update threshold
  • Leave on as a safety net
  • Best to roll your own updating (possibly using the Ola Hallengren scripts as a basis)
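Enabling the async behavior is a single database-level setting (SalesDB is a hypothetical database name):

    ALTER DATABASE SalesDB SET AUTO_UPDATE_STATISTICS_ASYNC ON;
    -- AUTO_UPDATE_STATISTICS must also be ON for the async option to have any effect.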
  • Incremental stats (2014)
    • For partitioned tables: if 20% of a partition’s records change, stats are recomputed for that partition and then rolled into the overall table stats
    • Very lossy
    • Despite some information out there to the contrary, this is not partition-level stats
  • If for some reason auto_update_stats is off (on 2008 R2 and earlier), updating stats does not invalidate plans
    • Recommended: run sp_recompile to force the affected plans to recompile
  • Watch for cases where stats are bad and have been sampled; use a full scan instead
    • Test it out, then use full scan in off-hours scripts (see the example below)
    • Additional benefit: sampled scans are serial, but a full scan can be parallelized (subject to MAXDOP)
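The off-hours version is just an explicit update (hypothetical object names):

    -- Update one statistic with a full scan...
    UPDATE STATISTICS dbo.Sales IX_Sales_OrderDate WITH FULLSCAN;
    -- ...or every statistic on the table:
    UPDATE STATISTICS dbo.Sales WITH FULLSCAN;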
  • Filtered stats
    • Session settings not required
    • Create stats over a range of values
    • Add a WHERE clause to the CREATE STATISTICS statement (see the sketch below)
    • Beware of using sampling (but weigh that against full-scan costs)
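A minimal sketch of a filtered statistic, assuming a hypothetical dbo.Sales table with an OrderDate column:

    CREATE STATISTICS st_Sales_OrderDate_2014
    ON dbo.Sales (OrderDate)
    WHERE OrderDate >= '20140101'
    WITH FULLSCAN;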
  • Analyzing data skew
    • See sp_SQLSkills_AnalyzeColumnSkew and sp_SQLSkills_AnalyzeAllLeadingIndexColumnSkew
    • sp_SQLSkills_CreateFilteredStats
      • Splits histogram into 10 to 20 chunks
      • Drop and recreate every week or so
    • Filtered stats only autoupdate on the same schedule as the entire table – not nearly often enough
  • No good way to track stats usage to see how often a stat is used or not used
  • TF 9481 – Use legacy CE
  • TF 3604 – Output to client
  • TF 9204 – Which stat is used
  • New CE does not use
    • Filtered stats
    • Multi-column stats
    • Autopilot
  • Limitations to filtered stats
    • Need to recompile or use plan guide to take advantage (OPTION(RECOMPILE))
    • Better to use partitioned views
    • Subject to interval subsumption problem when query predicate spans filter interval
    • But these are really designed for point queries
    • The same problem occurs with an IN clause whose values span ranges. Work around it with UNION ALL (even via dynamic SQL); see the sketch below
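A sketch of the UNION ALL workaround, assuming filtered stats split a hypothetical dbo.Sales table by Region:

    -- Instead of: ... WHERE Region IN (N'East', N'West')
    -- (which spans two filter intervals), let each branch stay within
    -- a single filtered statistic:
    SELECT OrderID, Region FROM dbo.Sales WHERE Region = N'East'
    UNION ALL
    SELECT OrderID, Region FROM dbo.Sales WHERE Region = N'West';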
  • See the whitepaper on When to Break Down Complex Queries
  • Unevenly distributed data throughout the table
    • Can really only be solved through indexing

 

Part IIA: Cardinality Estimation (Sack)

 

  • Problem with underestimating
    • Memory grants too small lead to spills
  • Problem with overestimating
    • Memory grants too large cause concurrency problems, with other queries having to wait
    • Check sys.dm_exec_query_memory_grants for granted_memory_kb = NULL (see the query below)
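That check is a one-liner against the DMV:

    -- Sessions still waiting for a workspace memory grant
    SELECT session_id, requested_memory_kb, granted_memory_kb
    FROM sys.dm_exec_query_memory_grants
    WHERE granted_memory_kb IS NULL;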
  • General problems
    • Serial vs. parallelism
    • Seek vs. scan
    • Bad table join order
    • Bad join algorithm
  • Ways the CE comes up with estimates
    • Direct hit: Specific value in stats, uses EQ_ROWS
    • Intrastep hit: Uses AVG_RANGE_ROWS
    • Unknown runtime value: Uses density vector (all_density * row_count)
    • Distinct values: uses 1.0 / all_density (worked example below)
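To make the density-vector cases concrete, a worked example with hypothetical numbers – a 1,000,000-row table whose statistic has all_density = 0.0001:

    Unknown runtime value:  all_density * row_count = 0.0001 * 1,000,000 = 100 rows
    Distinct value count:   1.0 / all_density = 1 / 0.0001 = 10,000 distinct values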
  • XEvent inaccurate_cardinality_estimate
  • dm_exec_query_stats has actual row counts for last time executed
    • Can hide leaf-level skews
  • Warnings about troubleshooting CE issues
    • Be sure to check the number of executions (estimates are per execution, so multiply by the execution count before comparing row counts)
    • Some cardinality issues cannot be readily resolved (example: an entity-attribute-value table that contains many data types) and require fundamental redesign
  • Common CE-related problems
    • Missing or stale statistics
      • Verify auto_stats on
      • Check whether stats are marked no_recompute
      • Check stats date; manually update if needed
      • TF 2371 – decreasing dynamic threshold for auto updates
    • Sampling issues
      • Compare rows vs. rows sampled
      • Will full scan work? Don’t know until you try it but can work in about 20% of cases
    • Hidden column correlation
      • Create multi-column stats or indexes (see the sketch below)
      • The optimizer will only create single-column stats automatically
      • If a value goes off the stats histogram, the estimate reverts to the density vector, which can be worse
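Creating the multi-column statistic by hand is straightforward (hypothetical names):

    -- Column order matters: only the leading column gets a histogram;
    -- the density vector covers the column combinations.
    CREATE STATISTICS st_Sales_Region_City ON dbo.Sales (Region, City);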
    • Intra-table column comparisons
      • i.e., WHERE col1 < col2
      • Optimizer does not know how to correlate
      • Create a computed column (i.e., col1 - col2) and create stats on it (see the sketch below)
        • Only works for pretty simple comparisons
      • Use self-joins, CTEs, or normalize the data
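A sketch of the computed-column approach for WHERE col1 < col2 (dbo.t and its columns are hypothetical):

    ALTER TABLE dbo.t ADD col_diff AS (col1 - col2);
    CREATE STATISTICS st_col_diff ON dbo.t (col_diff);
    -- Rewrite the predicate to reference the computed column:
    --   WHERE col_diff < 0   (instead of WHERE col1 < col2)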
    • Table variables
      • Generally not an issue for a very small number of rows (fewer than about 100)
      • Temp tables are better
    • Scalar or multi-statement table value functions
      • Convert to inline
      • Eliminate functions
      • Ask whether the function belongs in the data tier at all
    • Parameter sniffing
      • See the SQL Server: Optimizing Stored Procedure Performance class on Pluralsight
      • Check ParameterCompiledValue vs. ParameterRuntimeValue in the actual plan
      • Standard parameter sniffing fixes (one sketch below)
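One of the standard fixes, as a sketch (recompile per execution so each parameter value is sniffed fresh; the object names are hypothetical):

    SELECT OrderID, OrderDate
    FROM dbo.Sales
    WHERE CustomerID = @CustomerID
    OPTION (RECOMPILE);
    -- Alternative: OPTION (OPTIMIZE FOR (@CustomerID UNKNOWN)) to always
    -- optimize from the density vector instead of a sniffed value.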
    • Implicit data type conversion
      • See Kehayias’ blog post on what causes conversions
      • Use correct datatypes
      • Use consistent naming conventions
      • Use catalog views to monitor for inconsistencies (see the query below)
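A sketch of one such catalog-view check: find column names declared with more than one data type across the database.

    SELECT c.name AS column_name,
           COUNT(DISTINCT ty.name) AS distinct_types
    FROM sys.columns AS c
    JOIN sys.types AS ty ON ty.user_type_id = c.user_type_id
    GROUP BY c.name
    HAVING COUNT(DISTINCT ty.name) > 1;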
    • Complex predicates
      • i.e., WHERE LEFT(firstname, 15) + ' ' + LEFT(lastname, 15) = 'xxxx'
      • Simplify when possible
    • Query complexity
      • Many object references (implicit or explicit)
      • Break into smaller materialized sets
        • I/O penalty, but can be well worth it overall
      • Gatekeeper rows (DW scenarios)
        • See this blog post
        • Covering nonclustered index using FORCESEEK (usually a bad thing, but works here)
        • Dynamic SQL, columnstore index (still get skew, but speed of columnstore overcomes problems)
        • No really good resolution as of 2014
      • Linked servers
        • Prior to SQL 2012 SP1, non-admin accounts don’t have permission to retrieve stats from the remote server
        • Can grant the db_ddladmin role as a workaround
        • Consider ETL or replication
      • Optimizer bugs
        • Very rare
        • Check Connect (either add to an existing item or create a new one with a detailed reproduction)
      • See Joe’s list of troubleshooting questions

 

Part IIB: Cardinality Estimator Changes (Sack)

 

  • Many trace flags (disabled by default)
  • SQL Server 7.0 through 2012 used essentially the same estimator
  • New CE in 2014
    • Workloads may or may not benefit from it
    • Context DB compat level must be 120+
    • See this article about using trace flags
    • TF 2312 – Use new CE
    • TF 9481 – Use legacy CE
    • Root node of plan has CardinalityEstimationModelVersion
      • 120 = new; 70 = old
    • To override per query: OPTION (QUERYTRACEON 9481)
      • But this requires ‘sa’ rights
      • Use plan guides to work around (see the sketch below)
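A sketch of the plan-guide workaround (the statement text and names are hypothetical; the guide only matches if the submitted text is identical):

    EXEC sp_create_plan_guide
        @name            = N'pg_legacy_ce',
        @stmt            = N'SELECT OrderID FROM dbo.Sales WHERE CustomerID = 42;',
        @type            = N'SQL',
        @module_or_batch = NULL,
        @params          = NULL,
        @hints           = N'OPTION (QUERYTRACEON 9481)';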
    • New behavior
      • Multi-predicate correlation
        • Old: p0 * p1 * p2 * p3
        • New: p0 * p1 ^ (1/2) * p2 ^ (1/4) * p3 ^ (1/8)
        • Exponential backoff, ordered by selectivity (most selective first)
        • Does not use multi-column stats
        • Applies only to the first 4 predicates (?) – see the worked example below
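A worked example with hypothetical numbers – a 1,000,000-row table and four predicates, each with selectivity 0.1, ordered most selective first:

    Legacy CE: 0.1 * 0.1 * 0.1 * 0.1 * 1,000,000 = 100 rows
    New CE:    0.1 * 0.1^(1/2) * 0.1^(1/4) * 0.1^(1/8) * 1,000,000 ≈ 13,335 rows

The backoff assumes some correlation between the predicates, so the new estimate comes out considerably higher.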
      • Out-of-range value estimation
        • Old: Ascending key problem underestimates recently-added rows; estimates 1 row above top step in histogram
          • TF 2389, 2390 enable auto generation of stats on ascending keys
        • New: Assumes existence of out-of-range rows; estimate based on density vector
      • Join estimates / containment
        • Old: Correlation assumed if non-join filter specified
        • New: No correlation assumed
      • Join estimation
        • The new CE makes assumptions best suited to uniformly distributed data
        • Data skew may result in worse estimates than under the legacy CE
      • Distinct value count estimation (ambient cardinality)
        • Geared at ORM-generated code that includes extraneous joins or query elements
        • Tends to decrease estimates
      • TF 2363 – shows internal optimizer steps
      • XEvent query_optimizer_estimate_cardinality – indicates what stats are used
        • Usually for edge cases / MS Support
      • Options for regressions
        • Use compat level < 120, TF 2312 for selected queries
        • Use compat level 120, TF 9481 for queries with regressions
        • Use compat level 120, server trace flag 9481 globally
      • Be sure to test! If you cannot test, use the legacy CE
