Changes between Version 3 and Version 4 of SimdLlvm


Timestamp:
Oct 5, 2011 6:37:34 AM (3 years ago)
Author:
chak
Comment:

--

  • SimdLlvm

    v3 v4  
    55The SIMD vector extension to GHC proposed here maps to LLVM's vector type in a straightforward manner, which in turn enables us to target a wide range of hardware capabilities. However, GHC's native code generator will simply map SIMD vector operations to ordinary scalar code (in order to avoid having to deal with the complexities of SSE, AVX, NEON, etc.). 
    66 
    7 == Summary of the most widely used SIMD extensions == 
     7== Variations in the most widely used SIMD extensions == 
    88 
    99Intel and AMD CPUs use the [http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions SSE family] of extensions and, more recently (since Q1 2011), the [http://en.wikipedia.org/wiki/Advanced_Vector_Extensions AVX] extensions. ARM CPUs (Cortex A series) use the [http://www.arm.com/products/processors/technologies/neon.php NEON] extensions. Variations between different families of SIMD extensions, and between members of the same family, include the following: 
     
    1717 '''Alignment requirements''':: 
    1818  ??? 
     19 
     20While LLVM mostly shields us from these differences, we need to implement traversals of unboxed Haskell arrays as strided loops, where the stride corresponds to the SIMD vector length. Although LLVM enables us to use a stride that differs from the SIMD register width of the target architecture, it makes sense to use the target vector width already in the Haskell code. Why? If the Haskell stride is smaller than the SIMD registers, we do not fully exploit all available parallelism. If the Haskell stride is longer than the SIMD registers, we produce less efficient code for the leftover elements at the end of an array whose length is not a multiple of the stride, and we force LLVM to expand individual vector operations into multiple target instructions. 
     21 
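The strided traversal described above can be sketched in plain Haskell. This is only an illustration under stated assumptions, not the proposed GHC primitives: the names `sseFloatLanes`, `splitWork` and `stridedSum` are hypothetical, the array is assumed to be zero-based, the stride is assumed to be positive, and the `chunk` helper merely stands in for what would be a single SIMD operation over one vector of elements.

```haskell
import Data.Array.Unboxed (UArray, bounds, listArray, (!))

-- | Assumed stride for illustration: four Float lanes, as in one
-- 128-bit SSE register. A real implementation would take this from
-- the target architecture.
sseFloatLanes :: Int
sseFloatLanes = 4

-- | Split an array of length n into (full vector steps, leftover
-- elements) for a stride of w elements.
splitWork :: Int -> Int -> (Int, Int)
splitWork w n = n `quotRem` w

-- | Sum a zero-based unboxed array with a strided main loop followed
-- by a scalar loop for the leftover elements.
stridedSum :: Int -> UArray Int Float -> Float
stridedSum w arr = go 0 0
  where
    n = let (lo, hi) = bounds arr in hi - lo + 1
    (full, _) = splitWork w n
    vecEnd = full * w
    -- Stand-in for one SIMD step: combines w adjacent elements.
    chunk i = sum [arr ! (i + k) | k <- [0 .. w - 1]]
    go i acc
      | i < vecEnd = go (i + w) (acc + chunk i) -- strided part
      | i < n      = go (i + 1) (acc + arr ! i) -- scalar tail
      | otherwise  = acc
```

For instance, `stridedSum sseFloatLanes (listArray (0, 9) [1 .. 10])` takes two full four-element steps and then handles the remaining two elements in the scalar tail. With a stride of 8 the same array gets only one full step, which shows how a stride wider than the array's natural multiple leaves more work to the less efficient tail.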
     22== Type-dependent vector sizes ==