Opened 18 months ago

Closed 17 months ago

Last modified 17 months ago

#8607 closed bug (fixed)

Invalid location reported for type constructors

Reported by: edsko Owned by:
Priority: normal Milestone:
Component: Compiler Version: 7.7
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: Differential Revisions:

Description

Given

module Example where

data T = MkT

The locations as reported in 7.4 and 7.7 are different:

  • In 7.4 the ADT is represented as a TyData, itself part of an HsGroup. The TyData contains a Located Name; the location of this name is reported correctly as 3:6. The SrcSpan associated with the Name itself (which represents the def site) is also reported as 3:6, which is dubious.
  • In 7.8 the ADT is represented as a DataDecl (part of TyClGroup inside a HsGroup). As before, the DataDeclcontains a Located Name; the location now however is 3:1-12, which doesn't make sense. The def site associated with the name itself is also 3:1-12, which does make sense.

It seems that when the def site was fixed, it also changed the location of the identifier itself. These two are separate.

Attachments (3)

T8607.hs (1.4 KB) - added by edsko 18 months ago.
Test case (needs T8607A.hs)
T8607.2.hs (1.5 KB) - added by edsko 18 months ago.
T8607A.hs (159 bytes) - added by edsko 18 months ago.
Aux file for the test case

Download all attachments as: .zip

Change History (21)

Changed 18 months ago by edsko

Test case (needs T8607A.hs)

comment:1 Changed 18 months ago by edsko

Actually, the def-site as reported by 7.4 is correct (it's the position where the id is defined, not the span of the definition proper). So I think 7.4 is correct about both locations, and 7.7 wrong about both.

comment:2 Changed 18 months ago by edsko

The same problem goes for class definitions; the location of C in

class C a where
  f :: a -> a

is reported as the span of the entire class, rather than the span of just the identifier.

Changed 18 months ago by edsko

comment:3 Changed 18 months ago by edsko

.. and the same for type synonyms ..

Location of Foo in

type Foo = Int

is reported as the entire synonym declaration (1-14 instead of 6-8).

comment:4 Changed 18 months ago by edsko

.. and type families ..

Location of Bar in

type family Bar a

is reported as the entire line (type family instances do appear to be correct).

Changed 18 months ago by edsko

Aux file for the test case

comment:6 Changed 18 months ago by goldfire

  • Owner set to goldfire

Yes, it does appear to be somewhat intentional. But, it looks like some plumbing change in TcTyClsDecls could restore the old behavior. (I'm thinking of returning TcM [Located TyThing] from tcTyClDecl.) I can take a look at this next week.

comment:7 Changed 18 months ago by edsko

FYI, I sent the following email to SPJ, but he won't be back until next year:

In the commit message you say "The only wrinkle is that, since we don't have the original declaration, we don't have its SrcSpan to put in the error message". But the function you are talking about is checkValidDecl, right? And that takes a Located Name as argument -- would it be possible to use the nameSrcSpan of the Name instead of the error message? I.e., have the def site of the Name be the entire declaration of the type, but the location of the identifier itself still the actual location of the identifier? Changing the SrcSpan associated with the identifier itself is troublesome for IDEs that want to know information about identifiers at particular locations.

I'm not 100% how to go about this though, because the parser obviously generates a RdrName, not a Name, so we don't yet have the distinction between the location of a name (Located Name) and its nameSrcSpan.

comment:8 Changed 18 months ago by goldfire

I'm not sure what chunk of code you're talking about. While I agree that a Located Name is redundant, I don't see how that helps us here. It seems to me that the function of interest is checkValidTyCl, which just takes a TyThing, devoid of any location information other than what is stored in the n_loc field of Names. By passing in an extra location (by using Located TyThing), we can get good error messages and proper spans on identifiers.

I do feel a little dirty decorating the Core-ish TyThing with the Haskell-ish Located, but I think this is a reasonable place to do so. Other suggestions are welcome.

comment:9 Changed 18 months ago by edsko

Well I'm not familiar at all with this code, so I may be talking nonsense; I just wanted to point out that a Located Name is _not_ redundant: a Located Name contains two locations, and that's exactly right: one if the location of the identifier, and the other is the def span of whatever the identifier is referring to. The change in this commit changed _both_ of those locations to be the span of the entire type declaration; it seems to be that it ought to be sufficient to change only the nameSrcSpan but not the location of the identifier itself.

comment:10 Changed 17 months ago by goldfire

  • Owner goldfire deleted

So, I took a look at this, and it's ugly. The problem is that each TyClDecl can give rise to potentially many top-level things. Of these things, only the TyCons are checked for validity, so we really only need location info for TyCons. It's a little painful returning a [Located TyThing] when the Located bit applies to only one disjunct of TyThing. But, I was willing to deal with that. The real problem comes from the fact that one declaration can actually produce many TyCons: a class with associated types. According to the validity checking code -- which checks all TyCons, top-level and not -- we would need these associated type TyCons to have correct locations. Unfortunately, there is no clean way to get good locations for associated types without polluting the code in tcClassATs and possibly ClassATItem, which is persisted within the Class datatype. Very yuck.

Is all of this doable? Absolutely, but it would make Simon wish he had just stayed on holiday. So, I propose that, to fix this problem, we store location information in the TyCon. This should be easy to get correct and easy to use. It would be cleaner than adding Located in various places throughout TcTyClsDecls. But, it stores location information in a very Core-ish place, and I would rather consult with others before going ahead with this plan. There is precedent: CoAxBranches store locations for similar reasons. But, of course, I was the chief person behind CoAxBranches, and it's a little silly to use my own design decision as precedent.

Thoughts?

comment:11 Changed 17 months ago by edsko

So correct me if I'm wrong, but -- TyCon contains a Name, right? And that Name has a nameSrcSpan. And that nameSrcSpan is what is used for error messages (probably through setSrcSpan (getSrcSpan thing) at the start of checkValidTyCl?). So TyCons already have a built-in location, and I guess this was precisely the purpose of Simon's patch.

I don't have a problem with the nameSrcSpan of a Name being set to be the location of the entire type declaration. What is a problem, however, is that in the AST, that name becomes a Located Name, for example as in

data TyClDecl name
  | -- | @data@ declaration
    DataDecl { tcdLName    :: Located name        -- ^ Type constructor
...

So Located Name has two SrcSpans; one for the Located part, and one for the Name part (nameSrcSpan). The problem is that Simon's patch changed both of these SrcSpans to point to the entire span of the type declaration. They are logically different -- one tells you where this occurrence of the identifier is, the other tells you where the identifier is defined. For IDE purposes we need to know accurate information about the location of this occurrence, even if it's nameSrcSpan (def site) points somewhere else.

Do I make any sense at all?

comment:12 Changed 17 months ago by goldfire

Yes, I see what you're saying much better now. But, it still doesn't seem to quite correspond with what's going on in Simon's original change. That commit (1745779...) seems to change only the location in the Located bit, not the one in the nameSrcSpan.

Oh, it's all suddenly clear.

Simon's commit was overzealous. It turns out that the location stored in the Located bit in a tcdLName seems to be ignored in the validity checker -- it's only the location in the nameSrcSpan that does any work. So, the changes Simon made to RdrHsSyn were totally unnecessary to correspond to the refactoring in TcTyClsDecls. The solution is dead easy: I can just revert the changes in RdrHsSyn and revert the corresponding changes in the testsuite, while keeping the refactoring in TcTyClsDecls, which was the whole point to begin with. Have to run now, but will do this later today.

Thanks for pointing me in this direction!

comment:13 Changed 17 months ago by Richard Eisenberg <eir@…>

In 724690f86f9bf92e886a785141c9ef423ddae05e/ghc:

Revert "Simplify the plumbing for checkValidTyCl"

This reverts commit 174577912de7a21b8fe01881a28f5aafce02b92e.

This is part of the fix for #8607. Only reverting RdrHsSyn.lhs.

Conflicts:

	compiler/parser/RdrHsSyn.lhs
	compiler/typecheck/TcTyClsDecls.lhs

comment:14 follow-up: Changed 17 months ago by Richard Eisenberg <eir@…>

In e4afeedc5b8ac0f48cbeac09aa702c8d10433cdb/ghc:

Fix #8607.

The solution (after many false starts) is to change the behavior of
hsLTyClDeclBinders. The idea is that the locations of the names that
the parser generates should really be the names' locations, unlike
what was done in 1745779... But, when the renamer is creating Names
from the RdrNames, the locations stored in the Names should be the
declarations' locations. This is now achieved in hsLTyClDeclBinders,
which returns [Located name], but the location is that of the
*declaration*, not the name itself.

comment:15 Changed 17 months ago by Richard Eisenberg <eir@…>

In 9e28639756bddb797ac99ec0613aeb2a70b0e4b9/testsuite:

Error wibbles while fixing #8607.

comment:16 Changed 17 months ago by goldfire

  • Resolution set to fixed
  • Status changed from new to closed

See the commit message above for details -- somewhat different than my last Trac post on the subject. The solution I found was much cleaner than what I proposed here.

comment:17 in reply to: ↑ 14 Changed 17 months ago by edsko

Replying to Richard Eisenberg <eir@…>:

The solution (after many false starts) is to change the behavior of
hsLTyClDeclBinders. The idea is that the locations of the names that
the parser generates should really be the names' locations, unlike
what was done in 1745779... But, when the renamer is creating Names
from the RdrNames, the locations stored in the Names should be the
declarations' locations. This is now achieved in hsLTyClDeclBinders,
which returns [Located name], but the location is that of the
*declaration*, not the name itself.

Yes, this sounds like the Right Thing To Do. I have tested our code with this patch applied and it seems to work perfectly. Thank you very much!

comment:18 Changed 17 months ago by simonpj

I like it, thank you. Simon

Note: See TracTickets for help on using tickets.