Opened 2 months ago

Last modified 2 months ago

#14562 new bug

IntRep vs WordRep

Reported by: andrewthad
Owned by:
Priority: normal
Milestone:
Component: Compiler
Version: 8.2.1
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:

Description (last modified by andrewthad)

Why do Int# and Word# have different runtime representations? By this I mean that:

Int# :: TYPE 'IntRep
Word# :: TYPE 'WordRep

To my understanding, they are always the same size and always live in the same set of registers. The docs for unsafeCoerce# state that it can be used for:

Casting an unboxed type to another unboxed type of the same size (but not coercions between floating-point and integral types)

This implies that a cast between Int# and Word# is acceptable. But if you're able to unsafeCoerce# between two types, shouldn't they be defined as having the same representation?
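As a concrete illustration of the same-size claim, here is a minimal sketch that reinterprets the bits of an Int# as a Word# and back. It uses the existing int2Word# and word2Int# primops from GHC.Exts rather than unsafeCoerce#, since the exact type of unsafeCoerce# differs between GHC versions. The wrapper names intBitsToWord and wordBitsToInt are made up for this example.

```haskell
{-# LANGUAGE MagicHash #-}

module Main (main) where

import GHC.Exts (Int (I#), Word (W#), int2Word#, word2Int#)

-- Hypothetical helper: reinterpret the bits of an Int as a Word.
intBitsToWord :: Int -> Word
intBitsToWord (I# i) = W# (int2Word# i)

-- Hypothetical helper: reinterpret the bits of a Word as an Int.
wordBitsToInt :: Word -> Int
wordBitsToInt (W# w) = I# (word2Int# w)

main :: IO ()
main = do
  -- Non-negative values keep their numeric value.
  print (intBitsToWord 5)
  -- -1 is all ones in two's complement, so it becomes the largest Word.
  print (intBitsToWord (-1) == maxBound)
  -- Converting there and back returns the original value.
  print (wordBitsToInt (intBitsToWord (-42)))
```

The round trip preserving the value is what one would expect if both types occupy the same machine word, which is the premise of this ticket.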

What I'm suggesting is that it may be better to collapse IntRep and WordRep into a single representation (probably named WordRep). We would then get slightly more reusable code in some cases:

data WordList (x :: TYPE 'WordRep)
  = WordListCons x (WordList x)
  | WordListNil

ints :: WordList Int#
ints = WordListCons 5# (WordListCons 8# WordListNil)

words :: WordList Word#
words = WordListCons 4## (WordListCons 12## WordListNil)

mapWordList :: forall (x :: TYPE 'WordRep). (x -> x) -> WordList x -> WordList x
mapWordList _ WordListNil = WordListNil
mapWordList f (WordListCons x xs) = WordListCons (f x) (mapWordList f xs)

biggerInts :: WordList Int#
biggerInts = mapWordList (\x -> x +# 3#) ints

biggerWords :: WordList Word#
biggerWords = mapWordList (\x -> plusWord# x 3##) words
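For comparison, here is a rough sketch of what I believe is expressible today, with IntRep and WordRep kept distinct. The list type has to be pinned to one representation, so the names IntList, mapIntList and sumIntList are illustrative only, and a Word# version would need its own copy of the same declarations indexed by 'WordRep.

```haskell
{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MagicHash      #-}

module Main (main) where

import GHC.Exts (Int (I#), Int#, RuntimeRep (IntRep), TYPE, (+#))

-- A list whose elements must have the IntRep representation.
-- Word# does not fit here because it has kind TYPE 'WordRep.
data IntList (x :: TYPE 'IntRep)
  = IntListCons x (IntList x)
  | IntListNil

-- The kind of x is inferred as TYPE 'IntRep from its use in IntList.
mapIntList :: (x -> x) -> IntList x -> IntList x
mapIntList _ IntListNil = IntListNil
mapIntList f (IntListCons x xs) = IntListCons (f x) (mapIntList f xs)

-- Sum the elements with an unboxed accumulator and box the result.
sumIntList :: IntList Int# -> Int
sumIntList = go 0#
  where
    go :: Int# -> IntList Int# -> Int
    go acc IntListNil = I# acc
    go acc (IntListCons x xs) = go (acc +# x) xs

ints :: IntList Int#
ints = IntListCons 5# (IntListCons 8# IntListNil)

main :: IO ()
main = print (sumIntList (mapIntList (\x -> x +# 3#) ints))
```

Under the proposal, the single WordList type above would cover both element types and this duplication would go away.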

For additional context, I'd add that, excluding SumRep and TupleRep (because you can produce different nestings with equivalent representations), coercions between types of different representations are always unsound.

Change History (2)

comment:1 Changed 2 months ago by andrewthad

Description: modified (diff)

comment:2 Changed 2 months ago by simonpj

I see sense in this, but I recall lots of to-and-fro about exactly when to distinguish signed (Int) from unsigned (Word) values. See particularly Note [Signed vs unsigned] in cmm/CmmType.

I'm not arguing one way or another here, just mentioning that the signed/unsigned choice ripples surprisingly far.
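As a small illustration of why signedness still matters when the bits are identical, the sketch below compares the same bit pattern viewed as an Int and as a Word. It uses only boxed types and Data.Bits, and it is not taken from the Note mentioned above. The helper name asWord is made up for this example.

```haskell
module Main (main) where

import Data.Bits (shiftR)

-- Hypothetical helper: view an Int's bit pattern as a Word.
-- fromIntegral wraps modulo 2^N here, so the bits are preserved.
asWord :: Int -> Word
asWord = fromIntegral

main :: IO ()
main = do
  let i = -8 :: Int
      w = asWord i
  -- Comparison against zero depends on the interpretation of the top bit.
  print (i < 0, w < 0)
  -- Right shift on Int is arithmetic and keeps the sign.
  print (i `shiftR` 1)
  -- Right shift on Word is logical and shifts in a zero bit.
  print (w `shiftR` 1 == maxBound `div` 2 - 3)
```

So even with a single shared representation, comparisons, shifts and division would still need to know which interpretation is intended.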
