GHC accepts invalid program because of EPS poisoning
|Reported by:||errge||Owned by:|
|Type of failure:||GHC accepts invalid program||Difficulty:||Moderate (less than a day)|
|Test Case:||Blocked By:|
Assume the following setup:
- compiling in batch mode (--make),
- package klassz, module Class contains the class "Class a" and the data "Data",
- package klassz, module A contains an (orphan) instance "Class Data",
- package main, module AImporter just imports A,
- package main, module B imports only Class, but uses the instance (invalidly);
- if module Main (package main) imports B first and AImporter second, the program compiles;
- if module Main (package main) imports AImporter first and B second, it doesn't compile.
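The layout above can be sketched roughly as follows. This is only an illustration, not the contents of the attached tgz: the class method `method`, the constructor `Data`, and the binding `useIt` are hypothetical names, and each commented section stands for a separate file.

```haskell
-- package klassz, file Class.hs
module Class where

class Class a where
  method :: a -> String   -- hypothetical method, for illustration only

data Data = Data

-- package klassz, file A.hs
module A where

import Class

-- Orphan instance: A defines neither the class nor the type.
instance Class Data where
  method _ = "Data"

-- package main, file AImporter.hs
module AImporter where

import A

-- package main, file B.hs
module B where

import Class   -- brings the class and the type into scope, but not A's instance

useIt :: String
useIt = method Data   -- needs "instance Class Data", which B never imports

-- package main, file Main.hs
module Main where

import B          -- this order is the one reported as compiling;
import AImporter  -- swapping the two imports gives the rejected variant

main :: IO ()
main = putStrLn useIt
```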
The attached tgz contains this setup. Running diff -u main-docompile main-nocompile shows that the only difference between the two main directories is the order of the import statements. The language does not allow compilation success to depend on import order, so this is a bug.
My current understanding is that this code should never compile: both mains should be rejected, because module B imports only the class, not the instance.
The cause is that the EPS (external package state) only ever grows and never shrinks between the compilation of different modules in a single batch compilation. With this naïve approach, instances are loaded and then never unloaded, so the instance can be found, as if by magic, when the invalid module B is compiled.
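To make the mechanism concrete, here is a toy model of a grow-only EPS. It is a sketch under heavy assumptions: `Unit`, `Eps`, `load`, and `compileAll` are hypothetical stand-ins and not GHC's real types or functions, the EPS is reduced to a set of instance names, and the compile order is passed in directly rather than derived from import order.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical, simplified stand-ins; these are not GHC's real types.
type ModuleName = String
type Instance   = String
type Eps        = Set.Set Instance   -- instances loaded so far

-- Instances declared by each module of the external package klassz.
klassz :: Map.Map ModuleName [Instance]
klassz = Map.fromList [("Class", []), ("A", ["Class Data"])]

-- A home-package module: its name, its external imports, the instances it needs.
data Unit = Unit
  { unitName    :: ModuleName
  , unitImports :: [ModuleName]
  , unitNeeds   :: [Instance]
  }

aImporter, b :: Unit
aImporter = Unit "AImporter" ["A"] []
b         = Unit "B" ["Class"] ["Class Data"]

-- Loading an interface only ever adds to the EPS.
load :: ModuleName -> Eps -> Eps
load m eps = Set.union eps (Set.fromList (Map.findWithDefault [] m klassz))

-- Compile units in the given order, threading one shared EPS through all of them.
compileAll :: [Unit] -> [(ModuleName, Bool)]
compileAll = go Set.empty
  where
    go _ [] = []
    go eps (u : us) =
      let eps' = foldr load eps (unitImports u)
          ok   = all (`Set.member` eps') (unitNeeds u)
      in (unitName u, ok) : go eps' us

main :: IO ()
main = do
  print (compileAll [aImporter, b])  -- [("AImporter",True),("B",True)]
  print (compileAll [b, aImporter])  -- [("B",False),("AImporter",True)]
```

In this model B is accepted only when AImporter happens to be compiled before it, which mirrors the order dependence described above.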
The current code can be found in compiler/iface/LoadIface.hs line 280.
I propose that, instead of always loading ifaces directly into the EPS, we introduce a proper cache for interfaces that holds the parsed interface data in a ModuleEnv. We can then start with an empty EPS at the beginning of the compilation of every unit and quickly merge in the information from this cache. I guess that most of the time is spent on file-reading IO and parsing, not on merging multiple interface files together.
I also think that this kind of issue will become more and more prominent as people start to use parallel cabal and ghc: if compilation of the attached example program is parallelized with an unpredictable schedule, it will build in some cases and fail in others.
Any opinions or alternative fix ideas?
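A minimal sketch of the proposed scheme, under the same kind of simplifying assumptions: `IfaceCache`, `freshEps`, and `accepts` are hypothetical names, a plain `Map` stands in for the ModuleEnv, and an interface is reduced to the list of instances it declares. Nothing here is GHC's actual API.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical, simplified stand-ins; these are not GHC's real types.
type ModuleName = String
type Instance   = String
type Eps        = Set.Set Instance

-- Stand-in for the proposed cache of parsed interfaces, keyed by module.
type IfaceCache = Map.Map ModuleName [Instance]

ifaceCache :: IfaceCache
ifaceCache = Map.fromList [("Class", []), ("A", ["Class Data"])]

-- Build a fresh EPS for one unit by merging only the interfaces it imports.
freshEps :: IfaceCache -> [ModuleName] -> Eps
freshEps cache imports =
  Set.fromList (concat [Map.findWithDefault [] m cache | m <- imports])

-- Can a unit with these imports see every instance it needs?
accepts :: IfaceCache -> [ModuleName] -> [Instance] -> Bool
accepts cache imports needs = all (`Set.member` eps) needs
  where
    eps = freshEps cache imports

main :: IO ()
main = do
  print (accepts ifaceCache ["A"] [])                  -- AImporter: True
  print (accepts ifaceCache ["Class"] ["Class Data"])  -- B: False
```

Because each unit's EPS is rebuilt from the cache, the verdict for B no longer depends on which other units were compiled earlier. The cache itself is filled once, so the reading and parsing cost is not repeated per unit.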