Examples
Example 2-1 Assembly Code with an Unpredictable Branch ............................. 2-17
Example 2-2 Code Optimization to Eliminate Branches .....................................2-17
Example 2-3 Eliminating Branch with CMOV Instruction.................................... 2-18
Example 2-4 Use of pause Instruction ............................................................... 2-19
Example 2-5 Pentium 4 Processor Static Branch Prediction Algorithm..............2-20
Example 2-6 Static Taken Prediction Example ................................................... 2-21
Example 2-7 Static Not-Taken Prediction Example ............................................2-21
Example 2-8 Indirect Branch With Two Favored Targets .................................... 2-25
Example 2-9 A Peeling Technique to Reduce Indirect Branch Misprediction ..... 2-26
Example 2-10 Loop Unrolling ...............................................................................2-28
Example 2-11 Code That Causes Cache Line Split ............................................. 2-31
Example 2-12 Several Situations of Small Loads After Large Store .................... 2-35
Example 2-13 A Non-forwarding Example of Large Load After Small Store ........ 2-36
Example 2-14 A Non-forwarding Situation in Compiler Generated Code............. 2-36
Example 2-15 Two Examples to Avoid the Non-forwarding Situation in
Example 2-14 ................................................................................ 2-36
Example 2-16 Large and Small Load Stalls ......................................................... 2-37
Example 2-17 An Example of Loop-carried Dependence Chain .......................... 2-39
Example 2-18 Rearranging a Data Structure ....................................................... 2-39
Example 2-19 Decomposing an Array..................................................................2-40
Example 2-20 Dynamic Stack Alignment ............................................................. 2-43
Example 2-21 Non-temporal Stores and 64-byte Bus Write Transactions............ 2-54
Example 2-22 Non-temporal Stores and Partial Bus Write Transactions ............. 2-54
Example 2-23 Algorithm to Avoid Changing the Rounding Mode......................... 2-66
Example 2-24 Dependencies Caused by Referencing Partial Registers.............. 2-77
Example 2-25 Recombining LOAD/OP Code into REG,MEM Form.....................2-91
Example 2-26 Spill Scheduling Example Code .................................................... 2-92
Example 3-1 Identification of MMX Technology with cpuid................................... 3-3
Example 3-2 Identification of SSE with cpuid ....................................................... 3-4
Example 3-3 Identification of SSE by the OS ....................................................... 3-4