SQLintersection session – Top Customer Problems: How to Avoid Calling Bob

I attended the SQL intersection conference in November 2014 in Las Vegas, Nevada.  I am including my notes from the sessions in this blog, primarily for my own reference later.  For anyone that comes across these notes, please take them with a grain of salt – I may very easily have misheard something or have written it down incorrectly.

Session Title:  Top Customer Problems: How to Avoid Calling Bob
Instructor: Bob Ward

  • Biggest issues (in order)
    • Performance
      • Is it a “running” problem (high CPU) vs a “waiting” problem (IO bottleneck, blocking, latch waits, other waits)?
        • Blocking – usually a problem within the app
        • IO bottleneck (PAGEIOLATCH or WRITELOG)
        • Latch waits, hotspots (PAGELATCH_XX)
        • RESOURCE_SEMAPHORE waits
        • ASYNC_NETWORK_IO can be a network problem, but usually a problem with client result processing
        • Other waits, check DMV
      • If the problem is current: Performance Dashboard Reports and DMVs
        • Performance Dashboard Reports
          • DMVs only
          • Create SP in master
          • 2005 version (use for 2005, 2008, 2008R2)
          • 2012 version (user for 2012, 2014)
        • Check for other process using lots of resources
          • High kernel time (i.e., anti-virus)
        • Within SQL Server
          • Background/system tasks
          • High recompiles
        • Query
          • Usually about cardinality estimation
            • Many root causes, see table
            • Tables variables (estimate = 1)
              • TF 2453 (2012 SP2+)
            • Data type mismatch (cannot make estimate, change data types)
          • Poor/missing indexes
          • Query design (lots of scans)
          • Lots of parallelism (check DOP)
        • Post-mortem problems: PSSDiag, SQLNexus
      • Backup/Restore
        • Backup
          • OS error 64 – Network problems (backup started, but connection dropped)
          • OS error 1450 – OS memory problems (backing up to a remote drive)
          • Transaction log growth (full recovery with no t-log backups)
          • VDI backup failure (3rd party app problems)
        • Restore
          • Disk bottleneck
          • Long-running recovery
          • VLF fragmentation
          • CHECKSUM failures
            • Use WITH CONTINUE_AFTER_ERROR to override
          • Consider restoring to a different DB vs replace (if error while restoring, still have fallback)
        • Consistency Issues
          • Message 824 error / DBCC CHECKDB
          • Checksum errors
          • Suspect DB
          • No “magic tricks” to fix
          • Almost always IO path error (try SQLIOSIM)
          • Can also be memory corruption
            • Constant page sniffing – thread that randomly check memory pages
          • Could be a SQL bug (replay the log)
          • Session id of -3 = recovery (deferred)
          • Page-level recovery
        • Stackdumps/Exceptions
          • Verify that the faulting module is SQL
        • Obscure errors
          • Message 1204 – Deadlock
          • Message 833 – IO delay warning (15 sec IO delay)
          • Message 17883 – Non-yielding scheduler (almost always a SQL bug)
          • Message 17884, 17888 – usually a long blocking chain
          • Message 844, 845, 846, 847 – latch time out
          • Use error_reported XEvent to track down a specific error
        • Startup problems
          • Permissions problem (may have changed service account followed by a hard restart)
          • Config (i.e., max server memory)
          • Damaged files
            • Keep file copies of master (shut down SQL, copy file off) and binaries
          • Backup tail of model causes restart problems
        • Memory issues
          • Out of memory
          • Memory grown (OK if below max server memory)
        • Hidden gems
          • Sytem health session persisted (2014)
          • NUMA guidance
          • System Center Advisor

tempdb bulk operation optimized (fix for 2012, built in to 2014)

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.