Opened 3 years ago

# literate markdown not handled correctly by unlit

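• Reported by: guest
• Owned by:
• Priority: low
• Milestone: 7.6.2
• Component: Compiler
• Version: 7.0.1
• Cc: dagitj@…, jmg@…, trevor@…
• Operating System: Unknown/Multiple
• Architecture: Unknown/Multiple
• Type of failure: GHC rejects valid program
• #7120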

### Description

This simple literate Haskell program, which uses Markdown in its comments, causes problems for unlit:

    ### Ok so lets try this again.

    ### A page that loads and compiles:

    > myfact 0 = 1
    > myfact n = n * n-1

    Lets see if it works!


If I run unlit and collect the output I can see where it went wrong:

    $ ~/lib/ghc-7.0.1/unlit Main.lhs Main.lpp
    $ cat Main.lpp
    ### Ok so lets try this again.

    ### A page that loads and compiles:

    myfact 0 = 1
    myfact n = n * n-1



When I look through the source code of unlit.c I think the place to check for this would be here:

    if ( c == '#' ) {
        if ( ignore_shebang ) {
            c1 = egetc(istream);
            if ( c1 == '!' ) {
                while (c=egetc(istream), !isLineTerm(c)) ;
                return SHEBANG;
            }
            myputc(c, ostream);
            c=c1;
        }
        if ( leavecpp ) {
            myputc(c, ostream);
            while (c=egetc(istream), !isLineTerm(c))
                myputc(c,ostream);
            myputc('\n',ostream);
            return HASH;
        }
    }
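
To make the effect of that branch concrete, here is a minimal, simplified model of the decision it makes for one line. It is only a sketch: `line_kind`, `classify_line`, and the `LINE_*` constants are hypothetical names invented for illustration, the `'>'` check is an assumption about how bird-track lines are recognised elsewhere, and the real unlit.c works character by character on a stream rather than on whole lines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical line categories; these names are NOT taken from unlit.c. */
typedef enum { LINE_TEXT, LINE_CODE, LINE_SHEBANG, LINE_HASH } line_kind;

/* Simplified model of the quoted branch. The flag names mirror
   ignore_shebang and leavecpp from the excerpt; everything else
   is an assumption made for this sketch. */
static line_kind classify_line(const char *line, int ignore_shebang, int leavecpp)
{
    assert(line != NULL);

    if (line[0] == '>')                 /* bird-track code line (assumed) */
        return LINE_CODE;

    if (line[0] == '#') {
        if (ignore_shebang && line[1] == '!')
            return LINE_SHEBANG;        /* "#!..." line is skipped */
        if (leavecpp)
            return LINE_HASH;           /* copied to the output verbatim */
    }
    return LINE_TEXT;                   /* ordinary comment text */
}
```

Under this model, a Markdown heading such as `### Ok so lets try this again.` is classified the same way as a cpp directive whenever `leavecpp` is set, which matches the headings showing up in `Main.lpp` above.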


It seems that cabal has a similar unlit function:

I haven't tested it, but I think the Cabal version would handle this case correctly (or be easier to fix than a C program from 1990). Would it be possible/wise/feasible to extract the Cabal version and make it a permanent replacement for the current unlit.c code?

### comment:1 Changed 3 years ago by nalaurethsulfate

In addition to the Cabal version, perhaps the Perl script mentioned in the obscure unlit.c README reference (http://www.desy.de/user/projects/LitProg/glasgow/programs-and-options.html, lit2stuff) could be called with the correct options to remove the comments from literate Haskell files.

### comment:2 Changed 3 years ago by nalaurethsulfate

While it might be easier to fix, the Cabal program also handles the same test case incorrectly:

    GHCi, version 6.12.1: http://www.haskell.org/ghc/  :? for help
    Prelude> :m Distribution.Simple.PreProcess.Unlit
    Prelude Distribution.Simple.PreProcess.Unlit> f <- readFile "test.lhs"
    Prelude Distribution.Simple.PreProcess.Unlit> f
    "### Ok so lets try this again.\n\n### A page that loads and compiles:\n\n> myfact 0 = 1 \n> myfact n = n * n-1\n\nLets see if it works!\n"
    Prelude Distribution.Simple.PreProcess.Unlit> unlit "log.txt" f
    Left "### Ok so lets try this again.\n\n### A page that loads and compiles:\n\n myfact 0 = 1 \n myfact n = n * n-1\n\n -- Lets see if it works!\n\n"
    Prelude Distribution.Simple.PreProcess.Unlit>

I don't think that this is terribly surprising though, and it shouldn't be too difficult to fix. If someone could please explain why CPP lines wouldn't be in code blocks (no matter how they are delimited), that would help a lot.

### comment:3 Changed 3 years ago by duncan

So the problem here is that ghc does unlit before cpp and so it has to pass the #cpp directives through. It has to do unlit before cpp because in the worst case the only time ghc finds out cpp is needed is when it encounters a {-# LANGUAGE CPP #-} pragma.

In principle I suppose that ghc could unlit with cpp passthrough only for the pass where it reads the module head to find pragmas, and then, if cpp is not required, re-unlit the file without the cpp passthrough mode.

Technically this probably does count as H98 non-compliance. The CPP extension interferes with the use of # in ordinary (non-cpp) lhs files.
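
The two-pass idea above (unlit with cpp passthrough only to read the module head, then re-unlit without it when CPP is not enabled) could look roughly like the following. This is a sketch under stated assumptions, not how ghc or unlit.c is implemented: `unlit`, `wants_cpp`, and `unlit_two_pass` are hypothetical helpers, only bird-track style is handled, the pragma check is a crude substring search for the exact text `{-# LANGUAGE CPP #-}`, and dropped lines are replaced by empty lines on the assumption that line numbers should stay aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: unlit bird-track source `src` into `dst` (capacity `cap`).
   Lines starting with '#' are kept only when cpp_passthrough is non-zero.
   Returns 0 on success, -1 if `dst` is too small. */
static int unlit(const char *src, int cpp_passthrough, char *dst, size_t cap)
{
    size_t n = 0;
    assert(src != NULL && dst != NULL);

    while (*src) {
        const char *eol = strchr(src, '\n');
        size_t len = eol ? (size_t)(eol - src) : strlen(src);
        int keep = (src[0] == '>') || (cpp_passthrough && src[0] == '#');

        if (n + len + 2 > cap)          /* line + '\n' + terminating NUL */
            return -1;
        if (keep) {
            memcpy(dst + n, src, len);
            if (src[0] == '>')
                dst[n] = ' ';           /* blank out the bird track */
            n += len;
        }
        dst[n++] = '\n';                /* dropped lines become empty lines */
        src += len + (eol ? 1 : 0);
    }
    dst[n] = '\0';
    return 0;
}

/* Hypothetical and deliberately crude: does the unlitted text enable CPP? */
static int wants_cpp(const char *unlitted)
{
    return strstr(unlitted, "{-# LANGUAGE CPP #-}") != NULL;
}

/* Pass 1 keeps '#' lines so the pragma can be found; pass 2 drops them
   again when CPP turns out not to be required. */
static int unlit_two_pass(const char *src, char *dst, size_t cap)
{
    if (unlit(src, 1, dst, cap) != 0)
        return -1;
    if (wants_cpp(dst))
        return 0;
    return unlit(src, 0, dst, cap);
}
```

With this arrangement a Markdown heading in a file that does not enable CPP is treated as ordinary comment text, while `#define` lines survive in files that do. The cost is a second unlit pass for every literate file that does not use CPP.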

### comment:4 Changed 3 years ago by simonmar

Thanks Duncan for pointing out one good reason why we need to do unlit before CPP.

### comment:5 Changed 3 years ago by igloo

• Milestone set to 7.2.1

### comment:7 Changed 2 years ago by jmg

I've run into this problem when trying to use org-mode markup in a lhs file. This bug prevents me from using quite a lot of org-mode specific in-file settings. They all start with a '#' in the first column.

### comment:8 Changed 2 years ago by igloo

• Milestone changed from 7.4.1 to 7.6.1
• Priority changed from normal to low

### comment:10 Changed 19 months ago by igloo

• Milestone changed from 7.6.1 to 7.6.2