I am proposing an additional feature, -ddump-splices-file, that generates a corresponding .hs.th file for every .hs file that uses Template Haskell.
-ddump-splices is an invaluable but frustrating way to look at generated Haskell code. If TH generation were some kind of error message, the current output would make sense. However, TH generates code that we rely on, and that code would be easier to comprehend if we could see it in a form as close as possible to our existing Haskell code.
There is a valid complaint that when TH defines something you can't just grep for it, you have to know what TH is defining by reading documentation and imagining something that isn't in front of you.
If you have a file Foo.hs, -ddump-splices-file will generate Foo.hs.th. Then whenever someone greps after building they will find the declaration. If you check these files in, they can grep even before building. Similarly, an IDE can show these files as the source of a declaration. Also, if the generated code changes because a TH function changes, that will be visible in the diff.
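To make the grep complaint concrete, here is a minimal sketch of a module whose only definition of a name comes from a splice. The name answer and its value are made up for illustration. Searching the source for "answer =" finds nothing, yet the module compiles and uses it.

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH

-- A top-level declaration splice. It builds the declaration `answer = 42`
-- from AST constructors, so the text "answer =" never appears in this file
-- and grep cannot locate the definition.
$(pure [ValD (VarP (mkName "answer")) (NormalB (LitE (IntegerL 42))) []])

-- Declarations after the splice can refer to the generated name.
main :: IO ()
main = print (answer :: Integer)
```

With the proposed flag, the generated declaration would appear as ordinary source text in the companion file, where grep and editors can find it.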
This seems like a relatively easy feature to add. Any pointers on where to get started?
Trac metadata
Version: 7.6.3
Type: FeatureRequest
TypeOfFailure: OtherFailure
Priority: normal
Resolution: Unresolved
Component: Compiler
Test case:
Differential revisions:
BlockedBy:
Related:
Blocking:
CC:
Operating system:
Architecture:
I think the first step would be to write up a concrete proposal for the new feature on a wiki page. A few questions I have that would need to be answered:
If the source file is a .lhs file, is the output .lhs.th?
Can a user alter the extension?
Can a user alter the directory where this file is created?
Then, for implementing it, I would start by looking at DynFlags and then poking around to see how -ddump-splices works. I imagine it wouldn't be difficult just to redirect the -ddump-splices output to a file. Caveat: I haven't done anything quite like this to GHC, so my suggestions may be wrong.
Yes. I thought appending .th would be good because it is tab-completion friendly, but it also works with the different Haskell file extensions, such as .lhs.
No, we would want to see feedback from users that this is necessary. If .th is too overloaded we could just make it longer. One more character, such as .ths, might help. Although the extension could easily conflict with something elsewhere on someone's filesystem, it is unlikely to conflict in a directory with Haskell code.
no, we would want to see feedback from users that this is necessary
I’d recommend Foo.th, just like the .imports file that -ddump-minimal-imports generates. That is probably a feature you’d want to look at, as it is somewhat similar to what you want.
Also, it would be useful if the .th file contains precise code locations of the origin of the splices. This would allow the tools that replace TH splices by their output in the original file to use the .th file conveniently.
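As a rough sketch of what a location annotation could look like, the snippet below renders a Template Haskell Loc value as a Haskell comment. The Loc type and its fields come from the template-haskell library. The comment format and the function name locComment are assumptions made for illustration, not an agreed design.

```haskell
import Language.Haskell.TH.Syntax (Loc (..))

-- Render a source span as a one-line Haskell comment.
-- The "-- splice at FILE:(line,col)-(line,col)" layout is hypothetical.
locComment :: Loc -> String
locComment loc =
  "-- splice at " ++ loc_filename loc
    ++ ":" ++ show (loc_start loc)
    ++ "-" ++ show (loc_end loc)

-- A hand-written location standing in for a real splice site.
exampleLoc :: Loc
exampleLoc = Loc
  { loc_filename = "Foo.hs"
  , loc_package  = "main"
  , loc_module   = "Foo"
  , loc_start    = (12, 1)
  , loc_end      = (12, 20)
  }

main :: IO ()
main = putStrLn (locComment exampleLoc)
-- prints: -- splice at Foo.hs:(12,1)-(12,20)
```

Keeping the annotation inside a comment means the file stays valid Haskell, while a tool can still parse the span back out.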
Bonus points (well, different and more complicated feature actually, so ignore this for now): Enable a mode where GHC will read the .th file and use that instead of actually running Template Haskell. Distributing the .th files will then allow building packages on architectures where Template Haskell is not available.
Adding location information is a great idea, and if it is easy to put the locations in a comment I will do that. However, you might need to get involved with this to add in the features you want. Build-caching is a very interesting feature, but I think it will require GHC/cabal build experts and otherwise a lot more input to think about and implement correctly, so I will leave that for another ticket.
The problem with Foo.th is that tab completion of the filename now stops at "Foo.", whereas with Foo.hs.th tab completion first stops at Foo.hs, which is what you want most of the time. Also, .hs.th together makes for a more unique file extension. If this feature is extended to build caching there will probably be a desire to change how the extensions work. This will actually be a good thing since it will avoid the need to think about backwards compatible files.
> Adding location information is a great idea, and if it is easy to put the locations in a comment I will do that. However, you might need to get involved with this to add in the features you want. Build-caching is a very interesting feature, but I think it will require GHC/cabal build experts and otherwise a lot more input to think about and implement correctly, so I will leave that for another ticket.
Yes, that was just brainstorming... :-)
> The problem with Foo.th is that tab completion of the filename now stops at "Foo.", whereas with Foo.hs.th tab completion first stops at Foo.hs, which is what you want most of the time.
Maybe it's different with different workflows, but I, and probably lots of developers, usually happen to have .hi and .o files around that prevent the correct completion anyway. So unless you change those to .hs.hi and .hs.o as well, for the sake of consistency, .th is what follows the principle of least surprise. (But note that this is bikeshedding; do not let such a discussion discourage you from implementing the feature in the first place.)
If you use cabal there shouldn't be .hi and .o files lying around. If they are around, they should be gitignored. Tab completion can at least use .gitignore to favor .hs files. I know that doesn't happen by default in most editors, but I think most of the community is using cabal or some other build system that they could set up to have a dist folder. .th files should not end up in dist/. Some people might want to gitignore them, but I think if you go to the effort of turning them on you want to check them in.
I think I'm at least partly to blame for Greg's proposal, so naturally, I agree with the idea.
I just wanted to add a piece of evidence for why I would prefer Foo.th.hs over Foo.hs.th or Foo.th: it allows tools/scripts to find the file when searching for *.hs. Of course, tools/scripts can be configured, so this can be worked around. But I find this issue slightly more influential than the tab-completion issue -- I'm accustomed to not having tab completion of the extension.
Whatever the decision, I'd be happy to have this feature.
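The three naming schemes discussed so far can be compared side by side. The sketch below uses System.FilePath from the filepath package (which ships with GHC). The function names are invented for illustration, and the *.hs check is a simplified stand-in for what a shell glob or search tool would match.

```haskell
import System.FilePath (dropExtension, replaceExtension, takeExtension, (<.>))

-- Foo.hs -> Foo.hs.th (append .th, as in the original proposal)
appendTh :: FilePath -> FilePath
appendTh src = src <.> "th"

-- Foo.hs -> Foo.th (replace the extension, like the .imports files)
replaceWithTh :: FilePath -> FilePath
replaceWithTh src = replaceExtension src "th"

-- Foo.hs -> Foo.th.hs (insert .th before the original extension)
insertTh :: FilePath -> FilePath
insertTh src = dropExtension src <.> "th" <.> takeExtension src

-- Would a search for *.hs pick this file up?
foundByHsGlob :: FilePath -> Bool
foundByHsGlob path = takeExtension path == ".hs"

main :: IO ()
main = mapM_ report [appendTh, replaceWithTh, insertTh]
  where
    report scheme =
      let out = scheme "Foo.hs"
      in putStrLn (out ++ " matches *.hs: " ++ show (foundByHsGlob out))
```

Only the Foo.th.hs scheme is matched by *.hs, which is the point made in favor of it. For a literate source, the same functions give Foo.lhs.th, Foo.th, and Foo.th.lhs respectively.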
This seems like a very reasonable thing to do. I'm not volunteering to do it myself, but I'll gladly support anyone who does; I know how the TH implementation works.
The "untyped" splices are expanded by the renamer, and the "typed" ones by the type checker. So if you want to see all splices expanded, you need to look at the output of the type checker. Fortunately that's not difficult: it is more or less what -ddump-tc shows you. So to a first approximation, what you want is to take the output of -ddump-tc and put it in a file.
But there are always details:
-ddump-tc is, as its name implies, a debugging flag. We have not taken care to ensure that the pretty-printed output is fully-parsable Haskell. It should be, but you'd need to work on the Outputable instances for HsSyn to make it fully working.
The type checker "elaborates" the code by adding type abstractions and applications, dictionary abstractions and applications, and so on. For debugging purposes you want to see this; but for your purposes you want to suppress all the elaboration stuff. I've been careful to use different data constructors in HsSyn for elaboration code, so it should be easy to suppress it. But to do that you need to pass a flag into the pretty printer (to tell it whether to suppress it) and we need to think about how to do that. You definitely don't want to write two pretty-printers!
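For readers less familiar with the two kinds of splice mentioned above, here is a small sketch with one of each. The names untypedSum and typedSum are made up. According to the description above, the first is expanded by the renamer and the second by the type checker, so a dump taken after renaming alone would show only the first one expanded.

```haskell
{-# LANGUAGE TemplateHaskell #-}

-- Untyped expression splice: $( ... ) around an untyped quote [| ... |].
untypedSum :: Int
untypedSum = $([| 1 + 2 |])

-- Typed expression splice: $$( ... ) around a typed quote [|| ... ||].
typedSum :: Int
typedSum = $$([|| 3 + 4 ||])

main :: IO ()
main = print (untypedSum, typedSum)
-- prints (3,7)
```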
The usual process is to start a GHC Trac wiki page to describe the (user-facing) specification, and sketch any implementation details or choices. And use the ticket or ghc-devs to discuss.
Running ghc -ddump-tc -ddump-to-file Foo.hs produces a file Foo.dump-tc:

    ==================== Typechecker output ====================
    2014-05-19 03:56:39.777604 UTC
    TYPE SIGNATURES
    TYPE CONSTRUCTORS
      Foo.Foo :: *
      data Foo
        No C type associated
        RecFlag NonRecursive, Not promotable
        = Foo :: GHC.Types.Int -> Foo Stricts: _
        FamilyInstance: none
    COERCION AXIOMS
    Dependent modules: []
    Dependent packages: [base, ghc-prim, integer-gmp]
    ==================== Typechecker output ====================
    2014-05-19 03:56:39.78129 UTC
    ==================== Typechecker ====================
The code change only took me 10 minutes, so that is an encouraging start. This is not what I want as an end result, but is it useful now for others who use -ddump-tc to debug? Should I add a new flag for my desired functionality?
I think you'll want a different flag. -ddump-tc is primarily for debugging, and so it's fine for it to spit out all kinds of non-Haskell-source-code stuff.
I submitted a patch in #9126 (closed) to get -ddump-to-file to work with more debug options. I will pursue a more ideal output with a different flag on this ticket.
I finally made a patch for #9126 (closed) that doesn't have failing test cases.
I have a better understanding of how the printing works now. For solving this ticket (creating a file of valid Haskell code of the generated Template Haskell), I did have a question about the suggested approach of using the Typechecker output.
-ddump-splices seems to contain exactly what I want, plus some extra stuff. Is the idea behind using the Typechecker output, instead of refactoring the splices dump, that I will get more generated code than just the Template Haskell output?
I'm not sure whether you want to dump out the entire source code (after expanding splices), or just the splices (which is what -ddump-splices does).
If the former, then you need to dump the typechecker output (rather than the earlier renamer output) because type splices are expanded by the type checker.
A big problem here is that -ddump-to-file is a global flag. I assume dumping to stdout is the default because users want that ephemeral behavior. The design for this ticket is that a user always wants to save it to a file, and that should not have to make all other dumps go to a file. If someone wants to ephemerally dump splices to stdout, -ddump-splices is already there, well known, and better suited to the job.
Additionally, I want a .hs file to signify that the output is valid Haskell, not just a dump in an ad-hoc format, whereas with dumping the convention is dump-FLAG.
So I did intentionally want to get away from dumping, but I am definitely open to ideas for better naming.
> The design for this ticket is that a user always wants to save it to a file, and that should not have to make all other dumps go to a file.
So this allows -dth-file and another dump flag to be used at the same time. Why is that useful?
> Additionally, I want a .hs file to signify that the output is valid Haskell, not just a dump in an ad-hoc format, whereas with dumping the convention is dump-FLAG.
This convention could be changed: -ddump-th -ddump-to-file would dump to a .th.hs file.
A better name than -dth-file or -dth-dec-file could be -dth-to-file.
It seems like you are thinking of this as another debugging dump flag? It isn't at all; it just uses the existing dump system for implementation convenience. It is designed to be always turned on (if desired) and produce output that can be checked into source control. This should not affect whether actual dump flags are sent to stdout or to a file. I think -to- was supposed to help distinguish between dumping to stdout or to a file, so it doesn't seem necessary for something that always creates a file.