Performance loss due to eta expansion

Given the attached file, at both -O2 and -O0, GHC translates the function:

test1 [1,2,3,4,5,6,7,8,9,10] = \x -> x
test1 _ = \x -> negate x

into the equivalent of:

test0 [1,2,3,4,5,6,7,8,9,10] x = x
test0 _ x = negate x

When applied in a loop with something like:

map (test1 [1..]) [1..1000]

The eta-expanded variant is much slower (roughly 2.3x, going by the timings below). Adding a trace blocks the transformation, and the code then runs faster by the same factor. Specifically:

test2 [1,2,3,4,5,6,7,8,9,10] = \x -> x
test2 _ = trace "here" $ \x -> negate x
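For readers without the attached file, the three variants can be placed side by side in one module. This is only a minimal sketch, not the attachment itself: the `Int` type signatures, the module layout, the `run` helper and the `main` driver are assumptions added for illustration, and it does not reproduce the Criterion setup. The point of interest is that with the lambda on the right-hand side, `test1 [1..]` could in principle perform the list match once and hand `map` a closure to reuse, whereas the two-argument form repeats the match for every element.

```haskell
module Main (main) where

import Debug.Trace (trace)

-- Two-argument form: what GHC is reported to turn test1 into.
-- The list match can only run once both arguments are supplied.
test0 :: [Int] -> Int -> Int
test0 [1,2,3,4,5,6,7,8,9,10] x = x
test0 _ x = negate x

-- Lambda on the right-hand side: as written, the list match
-- could be shared across all uses of the partial application.
test1 :: [Int] -> Int -> Int
test1 [1,2,3,4,5,6,7,8,9,10] = \x -> x
test1 _ = \x -> negate x

-- Same as test1, but the trace stops GHC from eta-expanding.
test2 :: [Int] -> Int -> Int
test2 [1,2,3,4,5,6,7,8,9,10] = \x -> x
test2 _ = trace "here" $ \x -> negate x

-- Hypothetical helper: the loop from the report, summed so the
-- result is forced. [1..] has an eleventh element, so every
-- variant falls through to the negate branch.
run :: ([Int] -> Int -> Int) -> Int
run f = sum (map (f [1..]) [1..1000])

main :: IO ()
main = mapM_ (print . run) [test0, test1, test2]
```

All three variants compute the same values, so any difference is purely in how often the pattern match is repeated. How many times "here" appears on stderr depends on whether the partial application `test2 [1..]` is shared, which is exactly the behaviour under discussion.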

Timings, as reported by Criterion at -O2 with GHC 7.10.2, are:

benchmarking test0 = 40.99 ns   (40.96 ns .. 41.02 ns)
benchmarking test1 = 41.09 ns   (41.06 ns .. 41.14 ns)
benchmarking test2 = 17.74 ns   (17.68 ns .. 17.81 ns)

Edited Mar 10, 2019 by Neil Mitchell
