Opened 4 years ago

Closed 18 months ago

#4258 closed task (fixed)

Finish new codegen

Reported by: igloo
Owned by: simonmar
Priority: high
Milestone: 7.8.1
Component: Compiler
Version: 6.12.3
Keywords:
Cc:
Operating System: Unknown/Multiple
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Difficulty: Unknown
Test Case:
Blocked By: #7192
Blocking: #1466, #1498, #3462, #4505, #5925
Related Tickets:

Description

This is a meta-ticket for completing the new codegen.

Change History (29)

comment:1 Changed 4 years ago by igloo

  • Blocking 2253 added

comment:2 Changed 4 years ago by igloo

  • Blocking 3940 added

comment:3 Changed 4 years ago by igloo

  • Blocking 3462 added

comment:4 Changed 4 years ago by igloo

  • Blocking 3132 added

comment:5 Changed 4 years ago by simonmar

  • Blocking 783 added

comment:6 Changed 4 years ago by simonmar

  • Blocking 1246 added

comment:7 Changed 4 years ago by simonmar

  • Blocking 1498 added

comment:8 Changed 3 years ago by igloo

  • Milestone changed from 7.4.1 to 7.6.1

comment:9 Changed 3 years ago by igloo

  • Blocking 1466 added

comment:10 Changed 3 years ago by igloo

  • Blocking 4065 added

comment:11 Changed 3 years ago by igloo

  • Blocking 4505 added

comment:12 Changed 3 years ago by igloo

  • Blocking 5156 added

comment:13 Changed 3 years ago by lelf

  • Cc anton.nik@… added

comment:14 Changed 3 years ago by simonmar

  • Blocking 783 removed

(In #783) This ticket was still open because we wanted to check that there was no bad SRT behaviour. I've checked various versions of the example code and I can't find any bad behaviour, so I'm adding the program as a test case and closing the ticket. If the bad case pops up again with the new code generator, the test will fail.

comment:15 Changed 3 years ago by igloo

  • Difficulty set to Unknown

From: Simon Marlow <marlowsd@gmail.com>
Date: Fri, 06 Jan 2012 11:06:01 +0000

Current status:

 - the new codegen passes all the tests
 - it generates code that is a little bigger/slower than the old codegen
 - compiling with it is horrendously slow (this is the blocker)
 - profiling might not work (I don't think this has been tested)

Simon PJ and I are partway through a refactoring sweep.  The plan is
still to switch at some point, and we don't intend to put any more
development effort into the old codegen.  However, the main sticking
point is performance - we can accept a compilation time hit of maybe
10% in return for the extra flexibility, but currently we're *way* off
that.  Hoopl seems to be the main culprit, so whether we have to avoid
Hoopl or try to optimise it, I don't know.

comment:16 Changed 2 years ago by lelf

  • Cc anton.nik@… removed

comment:17 Changed 2 years ago by simonmar

  • Blocking 5805 added

comment:18 Changed 2 years ago by simonmar

  • Blocking 5925 added

(In #5925) Let's do this in the new code generator.

comment:19 Changed 2 years ago by simonmar

  • Owner set to simonmar

comment:20 Changed 2 years ago by simonmar

  • Blocking 5156 removed

comment:21 Changed 2 years ago by simonmar

  • Blocking 5805 removed

comment:22 Changed 2 years ago by simonmar

  • Blocking 3940 removed

(In #3940) The new codegen does not have this bug.

comment:23 Changed 2 years ago by simonmar

  • Blocking 2253 removed

(In #2253) I came to check these with the new backend, and it turns out that the old backend is doing just fine on these now. It might be mostly due to this: 3d8ab554ced45c51f39951f29cc53277d5788c37.

These are compiled with HEAD as of yesterday, with -O2.

Program 1:

Main_mainzuzdszdwfoldlMzqzuloop_info:
.Lc2vG:
	cmpq $100000000,%rsi
	jle .Lc2vM
	movq %r14,%rbx
	jmp *0(%rbp)
.Lc2vM:
	cmpq $100000001,%rdi
	jle .Lc2vO
	movq %r14,%rbx
	jmp *0(%rbp)
.Lc2vO:
	cmpq $100000008,%r8
	jle .Lc2vR
	movq %r14,%rbx
	jmp *0(%rbp)
.Lc2vR:
	movq %rdi,%rbx
	imulq %r8,%rbx
	movq %rsi,%rax
	imulq %rbx,%rax
	addq %rax,%r14
	incq %rsi
	incq %rdi
	incq %r8
	jmp Main_mainzuzdszdwfoldlMzqzuloop_info

The new code generator does a bit better, commoning up the duplicate blocks:

Main_mainzuzdszdwfoldlMzqzuloop_info:
.Lc2vW:
	cmpq $100000000,%rsi
	jle .Lc2wt
.Lc2wj:
	movq %r14,%rbx
	jmp *(%rbp)
.Lc2wt:
	cmpq $100000001,%rdi
	jg .Lc2wj
	cmpq $100000008,%r8
	jg .Lc2wj
	movq %rdi,%rbx
	imulq %r8,%rbx
	movq %rsi,%rax
	imulq %rbx,%rax
	addq %rax,%r14
	incq %rsi
	incq %rdi
	incq %r8
	jmp Main_mainzuzdszdwfoldlMzqzuloop_info
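To illustrate what "commoning up the duplicate blocks" means, here is a minimal Python sketch of the idea on a toy control-flow graph. It is not GHC's implementation: the block labels (`check_x`, `ret_a`, and so on), the `common_up_blocks` helper, and the representation of blocks as tuples of strings are all hypothetical, and the loop body is abbreviated. The three identical return blocks of the old listing collapse into one, and the jumps to the duplicates are retargeted.

```python
def common_up_blocks(blocks):
    """Merge basic blocks with identical bodies and retarget jumps.

    `blocks` maps a label to a tuple of instruction strings. Jumps are
    written as "<opcode> <label>". This is a single pass; a real pass
    would repeat until no further duplicates appear.
    """
    canonical = {}  # body -> first label seen with that body
    rename = {}     # duplicate label -> surviving label
    for label, body in blocks.items():
        if body in canonical:
            rename[label] = canonical[body]
        else:
            canonical[body] = label

    def retarget(instr):
        # Rewrite the operand only when it names a removed block.
        op, _, target = instr.partition(" ")
        return f"{op} {rename[target]}" if target in rename else instr

    return {
        label: tuple(retarget(i) for i in body)
        for label, body in blocks.items()
        if label not in rename
    }


# Toy model of the old-codegen listing for Program 1 (labels are made up).
RET = ("movq %r14,%rbx", "jmp *0(%rbp)")
old_blocks = {
    "check_x": ("cmpq $100000000,%rsi", "jle check_y", "jmp ret_a"),
    "ret_a": RET,
    "check_y": ("cmpq $100000001,%rdi", "jle check_z", "jmp ret_b"),
    "ret_b": RET,
    "check_z": ("cmpq $100000008,%r8", "jle body", "jmp ret_c"),
    "ret_c": RET,
    "body": ("addq %rax,%r14", "jmp check_x"),  # loop body abbreviated
}

merged = common_up_blocks(old_blocks)
for label, body in merged.items():
    print(label, body)
```

After the pass only `ret_a` remains of the three return blocks, which corresponds to the single `.Lc2wj` block in the new listing.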

Program 2 (compiled with -O2 -fno-regs-graph, because the graph-colouring allocator generates slightly worse code on this one):

Main_mainzuzdszdwfoldlMzqzuloop_info:
.Lc2mJ:
	testq %rsi,%rsi
	jle .Lc2mR
.Lc2mS:
	addq $4,%r14
	decq %rsi
	jmp Main_mainzuzdszdwfoldlMzqzuloop_info
.Lc2mR:
	movl $1000000000,%esi
	jmp r2kR_info

r2kR_info:
.Lc2m8:
	testq %rsi,%rsi
	jle .Lc2mg
.Lc2mh:
	addq $28,%r14
	decq %rsi
	jmp r2kR_info
.Lc2mg:
	movq %r14,%rbx
	jmp *(%rbp)

Program 3:

Main_mainzuzdszdwfoldlMzqzuloop_info:
.Lc2hW:
	testq %rsi,%rsi
	jle .Lc2i1
	addq $8,%r14
	decq %rsi
	jmp Main_mainzuzdszdwfoldlMzqzuloop_info
.Lc2i1:
	movq %r14,%rbx
	jmp *0(%rbp)

Program 4:

Main_mainzuzdszdwfoldlMzqzuloop_info:
.Lc2lj:
	testq %rsi,%rsi
	jle .Lc2lo
	addq $36,%r14
	decq %rsi
	jmp Main_mainzuzdszdwfoldlMzqzuloop_info
.Lc2lo:
	movq %r14,%rbx
	jmp *0(%rbp)

Program 5:

Main_mainzuzdszdwfoldlMzqzuloop_info:
.Lc2rk:
	cmpq $100000000,%rsi
	jle .Lc2ro
	movq %r14,%rbx
	jmp *0(%rbp)
.Lc2ro:
	cmpq $100000001,%rdi
	jle .Lc2rr
	movq %r14,%rbx
	jmp *0(%rbp)
.Lc2rr:
	addq %rsi,%r14
	incq %rsi
	incq %rdi
	jmp Main_mainzuzdszdwfoldlMzqzuloop_info

Program 6:

Main_mainzuzdszdwfoldlMzqzuloop_info:
.Lc2tu:
	testq %r14,%r14
	jle .Lc2tA
	cmpq $39999999,%rsi
	jle .Lc2tD
	jmp *0(%rbp)
.Lc2tA:
	jmp *0(%rbp)
.Lc2tD:
	cvtsi2sdq %rsi,%xmm0
	movsd .Ln2tF(%rip),%xmm1
	mulsd %xmm0,%xmm1
	addsd %xmm1,%xmm5
	decq %r14
	incq %rsi
	jmp Main_mainzuzdszdwfoldlMzqzuloop_info

We still need the strength reduction; I'll make a separate ticket for that.
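For readers unfamiliar with the term, the following Python sketch shows one common form of strength reduction: replacing a per-iteration multiplication by the loop counter with a running addition. The ticket does not say which transformation was intended, so this is an assumption, and the function names are hypothetical. The sketch uses integers, where the two versions agree exactly; for floating-point loops such as Program 6, the same rewrite can change rounding.

```python
def weighted_sum_multiply(n, k):
    """Baseline: one multiplication per iteration (acc += k * i)."""
    acc = 0
    for i in range(n):
        acc += k * i
    return acc


def weighted_sum_reduced(n, k):
    """Strength-reduced: carry term == k * i and update it by addition."""
    acc = 0
    term = 0  # invariant: term == k * i at the top of each iteration
    for _ in range(n):
        acc += term
        term += k
    return acc


print(weighted_sum_multiply(10, 3), weighted_sum_reduced(10, 3))
```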

comment:24 Changed 2 years ago by simonmar

  • Blocking 4065 removed

(In #4065) The new code generator is giving essentially the same code for these two, except that the sense of the branch is different (but that shouldn't make a difference).

Test_zdwbar_info:
.LcrC:
	testq %r14,%r14
	jne .LcrI
.LcrJ:
	movq %rsi,%rbx
	jmp *(%rbp)
.LcrI:
	decq %r14
	incq %rsi
	jmp Test_zdwbar_info

Test_zdwfoo_info:
.Lcqj:
	testq %r14,%r14
	jle .Lcqr
.Lcqs:
	decq %r14
	incq %rsi
	jmp Test_zdwfoo_info
.Lcqr:
	movq %rsi,%rbx
	jmp *(%rbp)

The old code generator was ok on foo, but worse on bar:

Test_zdwbar_info:
.Lcra:
	movq %r14,%rax
	testq %r14,%r14
	jne .Lcrg
	movq %rsi,%rbx
	jmp *0(%rbp)
.Lcrg:
	leaq -1(%rax),%r14
	incq %rsi
	jmp Test_zdwbar_info
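As a rough reading of the listings in this comment, the two loops can be modelled in Python as below. This is a sketch under assumptions, not the original source of the test program: it takes %r14 as the counter `n` and %rsi as the accumulator `k`, and assumes both functions jump back to their own entry label after updating the registers. The only difference is the exit test (`jle` for foo, `jne` for bar), so the results agree for non-negative `n`.

```python
def foo(n, k):
    # Models Test_zdwfoo_info: leave the loop when n <= 0 (testq / jle).
    while n > 0:
        n -= 1
        k += 1
    return k


def bar(n, k):
    # Models Test_zdwbar_info: keep looping while n != 0 (testq / jne).
    # Only call this with n >= 0; a negative n would never reach zero here.
    while n != 0:
        n -= 1
        k += 1
    return k


print(foo(5, 10), bar(5, 10))
```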

comment:25 Changed 2 years ago by simonmar

  • Milestone changed from 7.6.1 to 7.8.1

Now scheduled for 7.8.1 (and looking more likely this time!)

comment:26 Changed 23 months ago by simonmar

  • Blocked By 7192 added

comment:27 Changed 21 months ago by simonmar

  • Blocking 3132 removed

comment:28 Changed 21 months ago by simonmar

  • Blocking 1246 removed

comment:29 Changed 18 months ago by simonmar

  • Resolution set to fixed
  • Status changed from new to closed

The new codegen has been the default for a while now. Closing this to unblock the tickets that depend on it.
