Opened 4 years ago

Last modified 3 years ago

#9806 new bug

malloc and mallocArray ignore Storable alignment requirements

Reported by: ekmett Owned by:
Priority: normal Milestone:
Component: Core Libraries Version: 7.8.3
Keywords: Cc: core-libraries-committee@…
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: None/Unknown Test Case:
Blocked By: Blocking:
Related Tickets: #8627 Differential Rev(s):
Wiki Page:


Not sure if this is a bug or a feature request:

allocaBytesAligned exists and allocates a new pinned byte array with a given alignment. It is used for both alloca and allocaArray.

However, we don't currently have such a facility with malloc.

malloc and mallocArray currently just invoke mallocBytes, and mallocBytes requests memory without alignment guarantees, so the resulting memory might well violate required Storable alignment for the result.

We don't currently pad things out or get properly aligned memory.

It strikes me that we should in theory add mallocBytesAligned, and switch malloc and mallocArray to invoke it instead of mallocBytes.

Foreign.Marshal.Alloc.mallocBytes just invokes system malloc, so it seems that memalign or posix_memalign would be suitable.

In theory _malloc on the system is probably aligning to a pretty common unit size, but if you start doing SIMD stuff you'll start seeing 16 byte alignment requirements, folks who care about cache architecture for example can have 128 byte alignments to pad things to separate cache lines.

We probably haven't noticed since you likely get back bytes with slot-sized alignment just by construction, so if your alignment requirements stay in the <= 8 range you'll be okay, but for larger alignments it appears you'd just get wrong answers.

If something more subtle is going on here (e.g. an alignment bigger than some threshold wont be respected or that users should explicitly expect mallocArray to silently allocate an array with the wrong alignment) that I missed then we definitely need better documentation to address the issue.

Change History (13)

comment:1 Changed 4 years ago by ekmett

comment:2 Changed 4 years ago by carter

[14/11/22-17:13:25]  <edwardk>	 the posix_memalign thing is probably the right fix
[14/11/22-17:13:37]  <edwardk>	 that'd ensure that malloc aligns to whatever boundary the actual storable requires
[14/11/22-17:13:50]  <edwardk>	 so when folks make data types for SIMD blocks of floats they work right
[14/11/22-17:14:30]  <edwardk>	 in theory it could affect even old x86 stuff with the old 10 byte long doubles
[14/11/22-17:14:35]  <edwardk>	 which want a 16 byte alignment

notes by edward for this ticket

comment:3 Changed 4 years ago by carter

this actually impacts the vector package, and possibly other code

comment:4 Changed 4 years ago by carter

its probably too late to get this this fixed up for 7.10, but i think this should definitely happen for 7.12

comment:5 Changed 4 years ago by carter

Milestone: 7.12.1

comment:6 Changed 4 years ago by Yuras

This issue is wider then the one in vector.

All the Foreign API that works with Storable, assumes alignment to be limited to the largest possible value for non-SIMD types. And malloc allocates memory block, suitable for any such Storable instance, that is why it ignores alignment. Note: malloc is implemented via C function with the same name, and the behavior is actually inherited from C standard lib, see here.

To allow large alignments we need to use either posix_memalign or other platform specific API. It is important that posix_memalign returns a pointer, that can be deallocated via C function free, see here. But unfortunately portability might be an issue, see here. (Also it might be slower, but I didn't try it myself.) Other platform specific functions return pointers that can't be deallocated via free function from C stdlib.

We can change free not to use the C free function. But that will break a lot of code that uses free to deallocate memory that comes from C, and uses malloc to allocate memory that will be deallocated in C via C function. Yes, that code is already broken, because it does silly things, but it is common enough to care.

Also realloc will be problematic too, because the corresponding C function doesn't support unusual alignment. Probably other functions will break too.

I'd propose to reclassify the issue to documentation bug. All SIMD code should use special API to allocate and deallocate memory. It is probably possible to fix vector to handle large alignments, but it will introduce inconsistency into API.

Sorry for long comment, and please correct me if I'm missing something.

comment:7 Changed 4 years ago by carter

hrm, you do raise a good point that we should try to understand which platforms / OS versions GHC 7.12 or whatever will support that dont have a posix_memalign feature.

comment:8 Changed 4 years ago by simonmar

Resolution: duplicate
Status: newclosed

We had an old ticket about this (which I've just closed as a dup of this one): #2917

comment:9 Changed 4 years ago by simonmar

Owner: ekmett deleted
Resolution: duplicate
Status: closednew

comment:10 Changed 4 years ago by carter

I'd like to point out that the posix standard SPECIFICALLY specifies that free is the function that must be used to deallocate posix_memalign allocated memory

point being: this change wouldn't change the correctness of any code using the C stdlib free function on supported platforms

Last edited 4 years ago by carter (previous) (diff)

comment:11 Changed 4 years ago by Yuras

Sorry my broken Engligh. Yes, posix_memalign requires free for deallocation, but I was trying to say, that we have to use other API on platforms that don't have posix_memalign. E.g. pagealign_alloc requires pagealign_free. If we provide strong alignment guarantees for malloc, then we should be able to implement that on all current and future platforms.

comment:12 Changed 3 years ago by thoughtpolice


Milestone renamed

comment:13 Changed 3 years ago by thomie

Milestone: 8.0.1
Note: See TracTickets for help on using tickets.