Changes between Version 4 and Version 5 of Commentary/Compiler/Backends/LLVM/DevelopmentNotes


Timestamp:
Jun 9, 2010 4:43:48 PM (4 years ago)
Author:
dterei
Comment:

Take out some done items

  • Commentary/Compiler/Backends/LLVM/DevelopmentNotes

    v4 v5  
    3939 * Fix stack calculation in LLVM (my changes must have broken it). 
    4040 
    41 == Missing Control Flow Statements == 
    42  
    43 LLVM requires all control flow to be explicit, including 'ret void'. Cmm doesn't: a missing 'ret void' is assumed, as in C. For most code this is fine, since with CPS all blocks end with tail calls or branches. With handwritten Cmm, however, it can occur, although currently all cases are handled correctly. NOTE: This does manifest! It happens on x86-64 for handwritten Cmm code and causes a failure when trying to bootstrap GHC. 
    44  
    45 e.g. rts/Apply.cmm has the code: 
    46 {{{ 
    47 INFO_TABLE(stg_PAP,/*special layout*/0,0,PAP,"PAP","PAP") 
    48 {  foreign "C" barf("PAP object entered!") never returns; } 
    49 }}} 
    50  
    51 On x86-64, this gets passed to the back-end as: 
    52 {{{ 
    53 stg_PAP_entry() 
    54         { has static closure: False update_frame: <none> 
    55           type: 0 
    56           desc: 0 
    57           tag: 26 
    58           ptrs: 0 
    59           nptrs: 0 
    60           srt: _no_srt_ 
    61         } 
    62     cp: I64[BaseReg + 16] = R3; 
    63         I64[BaseReg + 24] = R4; 
    64         I64[BaseReg + 32] = R5; 
    65         I64[BaseReg + 40] = R6; 
    66         F32[BaseReg + 80] = F1; 
    67         F32[BaseReg + 84] = F2; 
    68         F32[BaseReg + 88] = F3; 
    69         F32[BaseReg + 92] = F4; 
    70         F64[BaseReg + 96] = D1; 
    71         F64[BaseReg + 104] = D2; 
    72         foreign "ccall" barf((cn_str, PtrHint))[_unsafe_call_] never returns; 
    73         R3 = I64[BaseReg + 16]; 
    74         R4 = I64[BaseReg + 24]; 
    75         R5 = I64[BaseReg + 32]; 
    76         R6 = I64[BaseReg + 40]; 
    77         F1 = F32[BaseReg + 80]; 
    78         F2 = F32[BaseReg + 84]; 
    79         F3 = F32[BaseReg + 88]; 
    80         F4 = F32[BaseReg + 92]; 
    81         D1 = F64[BaseReg + 96]; 
    82         D2 = F64[BaseReg + 104]; 
    83 } 
    84 }}} 
    85  
    86 This is the only case that occurs where a Cmm block doesn't end with a control flow statement, and it only arises because the block makes a !'never returns' call beforehand, which exits the function for good. 
    87  
    88 '''Solutions''': 
    89  
    90 1) Have a pass in the LLVM back-end which checks each basic block and adds an assumed {{{return void}}} at the end if it doesn't end with a control flow statement. 
    91  
    92 2) Modify {{{compiler/codeGen/CgForeignCall.hs}}}, changing the function {{{emitForeignCall'}}} as follows: 
    93  
    94 {{{ 
    95 -- alternative entry point, used by CmmParse 
    96 emitForeignCall' 
    97     :: Safety 
    98     -> HintedCmmFormals    -- where to put the results 
    99     -> CmmCallTarget       -- the op 
    100     -> [CmmHinted CmmExpr] -- arguments 
    101     -> Maybe [GlobalReg]   -- live vars, in case we save them 
    102     -> C_SRT               -- the SRT of the calls continuation 
    103     -> CmmReturnInfo 
    104     -> Code 
    105 emitForeignCall' safety results target args vols _srt ret 
    106  | not (playSafe safety) = do 
    107    temp_args <- load_args_into_temps args 
    108    let (caller_save, caller_load) = callerSaveVolatileRegs vols 
    109 +  let caller_load' = if ret == CmmNeverReturns then [] else caller_load 
    110    stmtsC caller_save 
    111    stmtC (CmmCall target results temp_args CmmUnsafe ret) 
    112 -  stmtsC caller_load 
    113 +  stmtsC caller_load' 
    114 }}} 
    115  
    116 This stops caller-save registers from being restored if the call is never meant to return, which should be fine since the restoring code is dead code anyway. 
    117  
    118 For the moment I've handled it in the first way, as I felt my patch had a better chance of being merged if it changed existing code in GHC as little as possible. I prefer the second way though: adding 'return void' to Cmm basic blocks feels like a hack, whereas the second approach gives Cmm the property that all blocks end in a control flow statement, which seems pretty useful to me.  
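
The first solution can be illustrated with a minimal sketch. It assumes a toy block representation: the `Stmt` and `Block` types and the functions `isTerminator` and `ensureTerminated` are hypothetical names made up for this example and are not GHC's actual Cmm or LLVM back-end types. The pass appends an explicit return to any block whose last statement is not a control flow statement, and leaves all other blocks unchanged.

```haskell
-- Toy stand-ins for Cmm statements. These are NOT GHC's real data types;
-- they only model "control flow" versus "everything else".
data Stmt
  = Assign String String      -- register := expression (both opaque here)
  | Call String               -- a call after which execution falls through
  | Branch String             -- unconditional jump to a block label
  | CondBranch String String  -- conditional jump with both targets explicit
  | TailCall String           -- jump to another function
  | Return                    -- explicit 'ret void'
  deriving (Eq, Show)

-- A basic block: a label plus its statements in order.
data Block = Block String [Stmt]
  deriving (Eq, Show)

-- True for statements that explicitly end a basic block.
isTerminator :: Stmt -> Bool
isTerminator s = case s of
  Branch _       -> True
  CondBranch _ _ -> True
  TailCall _     -> True
  Return         -> True
  _              -> False

-- Append the assumed 'ret void' when a block is empty or does not
-- already end in a control flow statement.
ensureTerminated :: Block -> Block
ensureTerminated b@(Block lbl stmts)
  | null stmts                = Block lbl [Return]
  | isTerminator (last stmts) = b
  | otherwise                 = Block lbl (stmts ++ [Return])

-- A block shaped like the stg_PAP_entry example: register restores
-- follow a call that never returns, so nothing terminates the block.
papLike :: Block
papLike = Block "cp" [Assign "I64[BaseReg + 16]" "R3", Call "barf", Assign "R3" "I64[BaseReg + 16]"]
```

Loading this in GHCi and evaluating `ensureTerminated papLike` yields the same block with `Return` appended, while a block that already ends in `TailCall` or `Branch` is returned as is.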
    119  
    12041== Known Function mistaken for Unknown External Label == 
    12142 
     
    13051 * See GHC trac ticket #1852. Floats are padded to word size (4 extra bytes on a 64 bit machine) by putting an appropriate `CmmLit` before them. On `fasm` this is necessary and forces the NCG to produce correct code. On `fvia-C`, this isn't necessary so it strips this padding out. What approach does LLVM need? 
    13152 
    132  * Should I be using `FiniteMap` instead of Data.Map? 
    133  
    13453 * {{{SPARC/CodeGen/Gen32.hs}}} seems to have a few special cases for `CmmMachOp`. Perhaps these should also be handled in LLVM to improve performance? 
    13554 
    13655 * tail call only supported on `x86`/`x86-64` and `PowerPC`. What about `SPARC`? How will we use the LLVM back-end on SPARC? 
    137  
    138  * Using hard-coded names for my implementation of STG virtual registers. I think this is fine but should check what are the rules for the unique name generator to make sure it can't conflict. Or perhaps generate unique names for them at the start of each function. 
    13956 
    14057 
     
    14360 * look into lto/gold. 
    14461 * Use a new Monad instead of passing `LlvmEnv` around everywhere. 
    145  * Use `OrdList` instead of []. 
    14662 * Should be able to put all `CmmProc` and `CmmData` labels in environment at start and after that, can print out LLVM IR as I generate it for each data and proc instead of storing. 
    14763 * Look at using LLVM intrinsic functions. There are a few math functions. Also, there is a `smul_overflow` detect function. 
    14864 * Improve Type safety of LLVM module (e.g split out pointers to own data type, to limit where they can be used). More type checking in ppr stage. 
    14965 * Rearrange some functions and files better. 
    150  * Look at merging some of code with NCG. 
    15166 * handling of `LlvmVar` or `LlvmType` for function signature isn't nice. Whole function signature handling could be better really. 
    152  * handling of global registers and custom calling convention ugly. Lots of very similar code, need to change to handle multiple architectures easily. 
    153  
    154 DONE: improved a fair amount. Could still see if can be cleaned up more. Could extend the `RealReg` type for example to include name and LLVM type, making the `getRealRegReg` and `getRealRegArg` functions very simple (and removing the `lmBase`... functions). 
    155  
    15667 * {{{LlvmCodeGen.CodeGen.genCall}}} code for foreign calls is quite complex, could use a clean-up.