Opened 7 months ago

Closed 4 months ago

#15149 closed bug (fixed)

Identical distinct type family fields miscompiled

Reported by: NeilMitchell Owned by: adamgundry
Priority: normal Milestone: 8.6.1
Component: Compiler Version: 8.4.2
Keywords: ORF Cc: ndmitchell@…, adamgundry
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: GHC rejects valid program Test Case: rename/should_compile/T15149
Blocked By: Blocking:
Related Tickets: #14747, #15277 Differential Rev(s): Phab:D4821
Wiki Page:

Description

Given the code:

-- An.hs
{-# LANGUAGE TypeFamilies #-}
module An where
data family An c :: *

-- AnInt.hs
{-# LANGUAGE TypeFamilies #-}
module AnInt where
import An
data instance An Int = AnInt {an :: Int} deriving Show

-- AnDouble.hs
{-# LANGUAGE TypeFamilies #-}
module AnDouble where
import An
data instance An Double = AnDouble {an :: Double} deriving Show

-- Main.hs
{-# LANGUAGE DisambiguateRecordFields #-}
module Main where
import AnInt
import AnDouble
main = print (AnDouble{an=1}, AnInt{an=1})

I would expect this code to work. In reality it fails at runtime with GHC 8.2.2:

Main.hs:4:15-28: warning: [-Wmissing-fields]
    * Fields of `AnDouble' not initialised: an
    * In the expression: AnDouble {an = 1}
      In the first argument of `print', namely
        `(AnDouble {an = 1}, AnInt {an = 1})'
      In the expression: print (AnDouble {an = 1}, AnInt {an = 1})
  |
6 | main = print (AnDouble{an=1}, AnInt{an=1})
  |               ^^^^^^^^^^^^^^

*** Exception: Main.hs:4:15-28: Missing field in record construction an

And fails at compile time in GHC 8.4.2:

Main.hs:4:31-41: error:
    * Constructor `AnInt' does not have field `an'
    * In the expression: AnInt {an = 1}
      In the first argument of `print', namely
        `(AnDouble {an = 1}, AnInt {an = 1})'
      In the expression: print (AnDouble {an = 1}, AnInt {an = 1})
  |
6 | main = print (AnDouble{an=1}, AnInt{an=1})
  |                               ^^^^^^^^^^^

This code was extracted from a real example, where this bug is pretty fatal, as I haven't been able to find any workarounds (without just avoiding clashing record fields).

Change History (15)

comment:1 Changed 7 months ago by RyanGlScott

Keywords: ORF added

comment:2 Changed 7 months ago by simonpj

Cc: adamgundry added

I investigated. Here is what is going on:

  • When compiling module An, there are two an's in scope
  • You might resaonably think that with DisamgiguateRecordFields, the an in AnInt { an = 1 } is obviously the an from module AnInt.
  • But in fact GHC's renamer uses the type constructor, not the data constructor to disambiguate which an you mean. And both an's have the same "parent", namely the type constructor An.

This was fine before data families, because a single type constructor could not have data constructors with distinct an record fields. But now it can.

A short term fix is to complain about ambiguity rather than arbitrarily picking one (as is done now). This happens here in TcExpr:

tcRecordField con_like flds_w_tys (L loc (FieldOcc sel_name lbl)) rhs
  | Just field_ty <- assocMaybe flds_w_tys sel_name
  = ...

That assocMaybe just picks the first. So we could fix that.

But how can we fix it properly? After all, all the information is there, staring us in the face.

  • One way forward might be to say that the data constructor AnInt is the "parent" of AnInt.an, and the type constructor An is the parent of the data constructor AnInt. But that change would mean we had a multi-level parent structure, with consequences that are hard to foresee.
  • But I think a better way is to stop trying to make the choice in the renamer. Instead in a HsRecordFields structure, keep the field names as not-yet-looked-up OccNames in the renamer, and look them up in the typechecker. At that point, we have the data constructor available in its full glory and don't need to bother with the tricky parent structures in the renamer.

Adam, do you think that would work?

comment:3 Changed 7 months ago by adamgundry

I think this would work, and would amount to eliminating the distinction between HsRecField (containing FieldOcc) and HsRecUpdField (containing AmbiguousFieldOcc). At the moment, the former is used for record construction and pattern matching (fields always resolved by the renamer), while the latter is used for selection and update (ambiguous fields resolved by the type-checker). Here "ambiguous" really means "ambiguous from the renamer's perspective, but not necessarily the type-checker".

It would probably be simpler to defer all field resolution to the type-checker, though we've also considered going in the opposite direction, and moving all field resolution back to the renamer (see https://github.com/ghc-proposals/ghc-proposals/pull/84). I'm not really sure which is better!

There's an awkward corner if we were to defer all field resolution to the type-checker, which is that Ambiguous fields are not currently supported by DsMeta. That is, using them inside a TH bracket will fail with a "not supported" error. The basic problem is that DsMeta works on renamed syntax, even though it runs after type-checking. Would that be difficult to change?

comment:4 Changed 7 months ago by simonpj

It would probably be simpler to defer all field resolution to the type-checker, though we've also considered going in the opposite direction, and moving all field resolution back to the renamer

That proposal (DuplicateRecordFields simplification) is about simplifying the DRF spec. Yes, it'd make it possible to resolve that particular aspect of record fields in the renamer -- but it by no means requires us to do so.

That is, using them inside a TH bracket will fail with a "not supported" error

My baseline thought is that C { x = e1, y = e2 } should have x and y as OccNames right up to the type checker. And that would be so in TH syntax too. In TH we have

data Exp = ... | RecConE [FieldExp]
type FieldExp = (Name,Pat)

We could consider changing that Name to OccName, or just using the NameS variant of Name. No serious problem there.

My notion of using OccName is a bit too simple.

  • It is in principle fine for C { ... } in both expressions and patterns, but in fact H98 syntax allows a qualified name there, thus C { M.x = e }. That suggests we need a RdrName there, not an OccName. The type checker can still resolve it, since the name to the LHS of the = need consider only the top-level GlobalRdrEnv and that is entirely available in the typechecker.
  • For record update e { x = e2 }, the same applies but now the use of qualified names to disambiguate is more important still.

I think this boils down to

  • Remove the extFieldOcc field of data FieldOcc, leaving only a Located RdrName. (In fact we can then inline FieldOcc.)
  • Move the lookup machinery from the renamer to the typechecker

Do you think that'd work? Any motivation to try it?

Last edited 6 months ago by simonpj (previous) (diff)

comment:5 Changed 7 months ago by simonpj

PS, Neil you say

I haven't been able to find any workarounds (without just avoiding clashing record fields).

Can't you just not use DisambiguateRecordFields and instead say this?

main = print (AnDouble{AnDouble.an=1}, AnInt{AnInt.an=1})

comment:6 Changed 7 months ago by adamgundry

Yes, I can try to take a look at this.

My baseline thought is that C { x = e1, y = e2 } should have x and y as OccNames right up to the type checker. And that would be so in TH syntax to.

I agree that this would be simpler (in the presence of DisambiguateRecordFields, DuplicateRecordFields, etc.). I'm concerned that it might impact TH users though, as they will have only an OccName where previously they had a Name (and they might not easily be able to reliably look up the OccName because of precisely this ticket).

Should this go through ghc-proposals, or is there another process for discussing TH changes?

comment:7 Changed 6 months ago by adamgundry

Owner: set to adamgundry

comment:8 Changed 6 months ago by simonpj

I'm concerned that it might impact TH users though, as they will have only an OccName where previously they had a Name

Well comment:4 suggests leaving it as TH.Name; as subsequent bullets point out, we need correspondingly need RdrName not OccName in HsSyn.

comment:9 Changed 6 months ago by adamgundry

I think I've come across a simpler way to fix this and #14747. At the moment, rnHsRecFields goes to some trouble to figure out the parent type constructor of the data constructor (in find_tycon) and then does field name lookup using the parent type constructor. But we already have lookupConstructorFields which lets us directly find out the FieldLabels of a data constructor! (This is used for expanding dot-dot patterns.)

So why don't we just use lookupConstructorFields and search amongst them for the right one? We'd need to be a bit careful to still check the name is in scope (with the right module qualifier, if any), but that should be simple if we know the unambiguous selector name already.

Moving field label resolution to the typechecker still might be worth doing, because it should get rid of quite a bit of duplication. But it's not a small task (e.g. because of dot-dot patterns) so I think it's worth pursuing the small fix first.

comment:10 Changed 6 months ago by adamgundry

Differential Rev(s): Phab:D4821
Status: newpatch
Test Case: rename/should_compile/T15149

comment:11 Changed 6 months ago by simonpj

Moving field label resolution to the typechecker still might be worth doing, because it should get rid of quite a bit of duplication. But it's not a small task (e.g. because of dot-dot patterns) so I think it's worth pursuing the small fix first.

I agree with this point, but we should still do it! Then we could get rid of tcg_field_env altogether, which would be very worthwhile.

Maybe open a fresh ticket for it?

comment:12 Changed 6 months ago by adamgundry

I think Phab:D4821 should be good to go with the small fix, and I've opened #15277 for the bigger refactoring.

comment:13 Changed 6 months ago by Ben Gamari <ben@…>

In 7100850e/ghc:

Use data con name instead of parent in lookupRecFieldOcc

Test Plan: new tests rename/should_compile/{T14747,T15149}

Reviewers: simonpj, bgamari

Reviewed By: bgamari

Subscribers: rwbarton, thomie, carter

GHC Trac Issues: #14747, #15149

Differential Revision: https://phabricator.haskell.org/D4821

comment:14 Changed 6 months ago by bgamari

Milestone: 8.6.18.8.1

These won't be addressed for GHC 8.6.

comment:15 Changed 4 months ago by adamgundry

Milestone: 8.8.18.6.1
Resolution: fixed
Status: patchclosed

This was fixed in 8.6.1, looks like the ticket missed being closed when the patch was merged.

Note: See TracTickets for help on using tickets.