SQLintersection session – Parallel Query Execution, Part 1

I attended the SQL intersection conference in November 2014 in Las Vegas, Nevada. I am including my notes from the sessions in this blog, primarily for my own reference later. For anyone that comes across these notes, please take them with a grain of salt – I may very easily have misheard something or have written it down incorrectly.

Session Title: Parallel Query Execution, Part 1
Instructor: Paul White

Within a parallel plan, each thread is effectively running a serial plan

Parallel seeks and scans
- 2005 and earlier use parallel page provider
- 2008+ use range enumerators
  - May see ACCESS_METHODS_DATASET_PARENT latch waits
- Only forward scans and seeks are supported
  - Backward scans can only happen serially
- Exchange types
  - Hash – Distributed based on hash function
  - Round Robin – Sent to next available consumer
  - Broadcast – Pushed to all consumers
  - Demand – Row at a time (rare)
  - Range – Mostly index build plans
- Exchanges can be sort order preserving or not
- Root node has property called ThreadStats
  - Will show number of threads reserved and number of threads used
  - May not use all reserved threads depending on timing of thread allocated in the query (can reuse threads that are already complete)
- exec_context_id is the thread number
  - 0 = serial
- Query to identify node id where CXPACKET wait may be happening
- Parallel merge join requires order preservation
  - Does not scale well to high DOP
- Parallel hash join
  - Generally scales well to high DOP
  - Bitmap reduction – can only appear in parallel plans, and can greatly reduce work to be done (i.e., early reduction)
- Parallel nested loops join
  - Inner side runs serially
  - Optimizer is “biased” against them
  - TF 8649 – Force parallel plan
  - Various tricks or plan guides to force usage
    - Such as specifying row goals
    - Express queries as APPLY – As query is pushed toward parallel nested loop queries, can push toward smaller memory grants and more efficient use of resources
  - Parallel sorts
    - Observer memory fraction percentage
    - Tricks to avoid spills
      - Cast strings to larger strings (i.e., select cast(str as varchar(100))) to force larger memory grant
      - Split sort into smaller sorts (initial select into temp table, then cross apply to temp table)
    - Avoid too much parallelism – set the cost threshold for parallelism
    - Parallel deadlocks
      - Once deadlock is detected, contents of CXPACKET buffers spill to tempdb
      - Interesting crafted demo that causes spills only at even DOPs (but not with odd)!

S	M	T	W	T	F	S
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28	29	30	31

sqltran.org

SQLintersection session – Parallel Query Execution, Part 1

Leave a Reply Cancel reply

BEGIN TRAN … COMMIT