Opened 10 months ago

Last modified 3 months ago

#14391 patch task

Make the simplifier independent of the typechecker

Reported by: nomeata Owned by: ulysses4ever
Priority: normal Milestone:
Component: Compiler Version: 8.3
Keywords: newcomer Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Rev(s): Phab:D4503
Wiki Page:

Description (last modified by ulysses4ever)

I noticed that the simplifier module depends on all of the type checker, and HsSyn stuff, and renamer stuff, which I found strange.

After a little investigation, it seems that the simplifier depends on CoreMonad, and that pulls some very few type-checker related things:

  1. import TcRnMonad        ( initTcForLookup )
    import {-# SOURCE #-} TcSplice ( lookupThName_maybe )
    

for

thNameToGhcName :: TH.Name -> CoreM (Maybe Name)
thNameToGhcName th_name = do
    hsc_env <- getHscEnv
    liftIO $ initTcForLookup hsc_env (lookupThName_maybe th_name)

which is not even used in GHC, but only in GHC Plugins, so this could probably be moved to a separate module pulled in by GhcPlugins.hs

2.

import TcEnv            ( lookupGlobal )

for

instance MonadThings CoreM where
    lookupThing name = do { hsc_env <- getHscEnv
                          ; liftIO $ lookupGlobal hsc_env name }

This might be a bit harder to disentangle. But if successful, it would probably make building GHC in parallel quite a bit faster. And it just seems strange to me that the Core-to-Core code should depend on the type checker…

Simon says:

Both of these code paths go through initTcForLookup which is massive overkill, as the comment with TcEnv.lookupGlobal says. There's clearly a ToDo here to strip off the redundant stuff and do a minimal lookup.

I am optimistically marking this as newcomer because it is a refactoring task, and a good way of learning a bit about various pieces, with a reasonably clear notion of “success”.


UPDATE (April 2018, Phab:4503): both points described above are addressed to the extent when in order to close the ticket we need to do mere code movement. Namely.

  1. lookupThName_maybe and initTcForLookup are eliminated from CoreMonad.hs completely. Its client, though, thNameToGhcName, is better to be moved in the future also, for it is not used in the CoreMonad.hs (or anywhere else) anyway. Joachim suggested “any module reexported by GhcPlugins (or maybe even that module itself)”.
  1. CoreMonad.hs still calls lookupGlobal which is no longer bound to the typechecker monad, but still resides in TcEnv.hs — it should be moved out of Tc-land at some point in the future in order to close the ticket.

Change History (26)

comment:1 Changed 10 months ago by jkiefer

Hello. Am newcomer. I'd love to take a stab at this! Will start hacking away.

comment:2 Changed 10 months ago by bgamari

Great! Do let us know if you encounter trouble.

comment:3 Changed 10 months ago by simonpj

Great.

I suggest you start with TcEnv.lookupGlobal.

  • It invokes initTcForLookup (massive overkill) in order to call tcLookupGlobal. We need a versionn of this function that operates in the IO monad, not the TcM monad.
  • What does tcLookupGlobal get from the TcM monad? It'll need to get these things as explicit arguments instead, I guess. For example: consults the tcg_type_env, which was initialised by initTcForLookup. And it uses tcg_semantic_mod likewise.
  • Then it hands off to LoadIface.tcLookupImported_maybe. That does a bit more IO-ish things before finally deciding to load a new interface decl in tcImportDecl_maybe.
  • tcImportDecl_maybe uses initIfaceTcRn to make an IfM moand in which to do the loading work. But instesad you can make an IfM from scratch, by writing a variant of initIfaceTcRn.

Nothing really hard here, but you'll need to carefully tease out what dependencies are where. (The lack of explicit dependencies is, of course, both the blessing and the curse of monadic progrmaming.)

Happy to help.

comment:4 Changed 9 months ago by jkiefer

Alrighty! First question while I investigate more into performing a minimal lookup instead of using initTcForLookup.

(Bear with me, I'm writing some of this out as I understand it myself)

We want to refactor thNameToGhcName to no longer be a part of CoreMonad and instead exist in a module to be pulled by GhcPlugins.hs.

Do correct me if I'm wrong, but creating a singleton module with just thNameToGhcName in it seems like a naive approach. Is there an existing module that is pulled in by GhcPlugins.hsthat would be an appropriate home for thNameToGhcName? Or should a new one simply be created?

comment:5 Changed 9 months ago by simonpj

I suggest you start with TcEnv.lookupGlobal.

Doing this will avoid you getting sucked into template haskell stuff right away. We can come back to your question when this is done.

comment:6 in reply to:  5 Changed 9 months ago by jkiefer

Gotcha. Thank you for the guidance!

comment:7 Changed 6 months ago by ulysses4ever

As it seems to stale for a while, I'd like to give it a try.

IIUC, the strategy laid by Simon suggests developing a Tc-less version of lookupGlobal. For this we need to identify ties to Tc inside it and then try to cut those.

First tie (also spelled by Simon above) is “tcg_type_env, which was initialised by initTcForLookup”. I looked at initTcForLookup and its dependencies, and it seems to me that tcg_type_envis initialized with emptyNameEnv there. A question: does that mean that corresponding part of the tcLookupGlobal which queries tcg_type_env is not needed at all and can be omitted in the Tc-less version of lookupGlobal?

comment:8 Changed 6 months ago by simonpj

does that mean that corresponding part of the tcLookupGlobal which queries tcg_type_env is not needed at all and can be omitted

Yes I think so.

comment:9 Changed 6 months ago by ulysses4ever

One more question. Consider a part of tcImportDecl_maybe:

initIfaceTcRn (importDecl name)

You said that it is necessary to create a variant of initIfaceTcRn. That sounds fine. But here is another thing: importDecl has IfM in its type. And IfM is defined in TcRnTypes, so leaving it doesn't bring us decoupling from the typechecker. Should we also replace importDecl? This sounds like some more work, because it depends on loadInterface which is >100 LOCs.

Last edited 6 months ago by ulysses4ever (previous) (diff)

comment:10 Changed 6 months ago by ulysses4ever

Ok, after some more thinking it seems to me that importing just TcRnTypes is not big deal. Now I'm wondering if it is legit to deal with initIfaceTcRn by just repalacing it with TcRnMonad.initIfaceLoad. If so, do we need to move it to some less Tc-heavy place?

comment:11 Changed 6 months ago by simonpj

importDecl has IfM in its type. And IfM is defined in TcRnTypes, so leaving it doesn't bring us decoupling from the typechecker. Should we also replace importDecl?

No, leave all that. The point is as follows (please document this in your patch):

  • TcEnv.lookupGlobal may look up an Id that one one has previously looked up.
  • If so, we are going to read its interface file, and add its bindings to the ExternalPackageTable, which is a persistent in-memory cache of information about other modules.
  • Loading that interface file does quite a bit of work, but we don't consider that as "part of the typechecker"; it's essentially just de-serialising interface-file data on disk. For example, any failures are not user errors; they represent messed-up files or GHC bugs, so can legitimately raise an exception.
  • The entire mechanism of importDecl and loadInterface is part of this. Don't duplicate it!
Last edited 6 months ago by simonpj (previous) (diff)

comment:12 Changed 6 months ago by simonpj

I'm wondering if it is legit to deal with initIfaceTcRn by just repalacing it with TcRnMonad.initIfaceLoad

Yes that sounds just right.

If so, do we need to move it to some less Tc-heavy place?

We might indeed want to do this. But you might want to keep the moving-code-around work in a separate patch, lest the diffs from that refactoring obscure the main payload

comment:13 Changed 5 months ago by ulysses4ever

Owner: set to ulysses4ever

I have one more question about tcLookupGlobal (hope the last one for it). Here its part:

if nameIsLocalOrFrom (tcg_semantic_mod env) name
then notFound name  -- Internal names can happen in GHCi
else

I have trouble with notFound part which essentially does careful error-reporting inside Tc monad. I'm lost on how to port this to IO with the same amount of precision. So far I come up with dumb solution (to be placed in the then branch):

pprPanic "lookupGlobal" (ppr name)

It is probably not good enough. Or is it?

comment:14 Changed 5 months ago by ulysses4ever

So far I've replaced error-reporting for TcM (notFound, failWithTc) with pprPanic.

Now I have full version of lookupGlobal in IO (link). It doesn't seem to break any test (besides already broken ones). Should I submit it to the Phabricator or proceed with the second part pointed out by Joachim, thNameToGhcName?

comment:15 Changed 5 months ago by ulysses4ever

Actually, I don't see thNameToGhcName used anywhere (on master). So it could be (relatively) freely moved. Maybe TcSplice, which it depends upon, is the right place for it? In that case the return type's monad, probably, should be changed from CoreM to plain IO.

comment:16 Changed 5 months ago by nomeata

thNameToGhcName is useful for GHC plugins, so any module reexported by GhcPlugins (or maybe even that module itself) is a good place.

comment:17 Changed 5 months ago by ulysses4ever

Differential Rev(s): D4503

comment:18 Changed 5 months ago by ulysses4ever

Status: newpatch

comment:19 Changed 5 months ago by ulysses4ever

Differential Rev(s): D4503Phab:D4503

comment:20 Changed 5 months ago by ulysses4ever

Let's turn to thNameToGhcName now. We want to remove initTcForLookup from it, and for this we need to replace lookupThName_maybe with Tc-less version.

The lookupThName_maybe, in turn, depends on lookupGlobalOccRn_maybe which seems to be very Rn-heavy inside. So I'm not sure how to proceed: should I try to rewrite lookupGlobalOccRn_maybe or leave it. In the latter case we need to find a way to run RnM in thNameToGhcName, which, I guess, not far from the dreaded initTcForLookup?

comment:21 Changed 5 months ago by simonpj

Well,

  • initTcForLookup initialises the tcg_rdr_env to emptyGlobalRdrEnv, and tcl_rdr to emptyLocalRdrEnv.
  • So in lookupThName_maybe, the lookupLocalRdrEnv is guaranteed to faile; and in lookupGlobalOccRn_maybe only the lookupExactOrOrig stuff can succeed.
  • For lookupOrig we are simply looking in the OrigNameCache, which is available in CoreM.
  • I think lookupExact is even easier, because I think the name must be an External name.

Maybe that can help you get further

comment:22 Changed 5 months ago by ulysses4ever

This makes a lot sense, thanks! I've started doing this.

For the other, simpler part of the refactoring, namely, the module canocalization thing, I've submitted update to the Phab. I'm not sure where to put that refactored function though.

comment:23 Changed 5 months ago by ulysses4ever

Done with TH part closely following your directions. See Phab.

comment:24 Changed 4 months ago by ulysses4ever

Description: modified (diff)

comment:25 Changed 4 months ago by ulysses4ever

Description: modified (diff)

comment:26 Changed 3 months ago by Ben Gamari <ben@…>

In bb3fa2d1/ghc:

Less Tc inside simplCore (Phase 1 for #14391)

Simplifier depends on typechecker in two points: `thNameToGhcName`
(`lookupThName_maybe`, in particular)  and `lookupGlobal`. We want to
cut the ties in two steps.

1. (Presented in this commit), reimplement both functions in a way that
doesn't use typechecker.

2. (Should follow), do code moving: a) `lookupGlobal` should go in some
typechecker-free place; b) `thNameToGhcName` should leave simplifier,
because it is not used there at all (probably, it should be placed
somewhere where `GhcPlugins` can see it -- this is suggested by Joachim
on Trac).

Details
=======

We redesigned lookup interface a bit so that it exposes some
`IO`-equivalents of `Tc`-features in use.

First, `CoreMonad.hs` still calls `lookupGlobal` which is no longer
bound to the typechecker monad, but still resides in `TcEnv.hs` — it
should be moved out of Tc-land at some point (“Phase 2”) in the
future in order to achieve its part of the #14391's goal.

Second, `lookupThName_maybe` is eliminated from `CoreMonad.hs`
completely; this already achieves its part of the goal of #14391. Its
client, though, `thNameToGhcName`, is better to be moved in the future
also, for it is not used in the `CoreMonad.hs` (or anywhere else)
anyway. Joachim suggested “any module reexported by GhcPlugins (or
maybe even that module itself)”.

As a side goal, we removed `initTcForLookup` which was instrumental for
the past version of `lookupGlobal`. This, in turn, called for pushing
some more parts of the lookup interface from the `Tc`-monad to `IO`,
most notably, adding `IO`-version of `lookupOrig` and pushing
`dataConInfoPtrToName` to `IO`. The `lookupOrig` part, in turn,
triggered a slight redesign of name cache updating interface: we now
have both, `updNameCacheIO` and `updNameCacheTc`, both accepting `mod`
and `occ` to force them inside, instead of more error-prone outside
before. But all these hardly have to do anything with #14391, mere
refactoring.

Reviewers: simonpj, nomeata, bgamari, hvr

Reviewed By: simonpj, bgamari

Subscribers: rwbarton, thomie, carter

GHC Trac Issues: #14391

Differential Revision: https://phabricator.haskell.org/D4503
Note: See TracTickets for help on using tickets.