As others have noted, the crash is almost certainly because the buffer isn't ARGB8888: due to chroma subsampling the data is narrower than you think, so vImage wanders off the end of the buffer.
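
For what it's worth, here is a minimal sketch of the kind of check that prevents this, assuming the data arrives as a CVPixelBuffer (the helper name wrapAsARGB8888 is mine, not part of any API). A 4:2:0 biplanar buffer has no single 4-bytes-per-pixel plane, so wrapping one in a vImage_Buffer and handing it to an *_ARGB8888 function walks right off the end of the allocation.

    #include <Accelerate/Accelerate.h>
    #include <CoreVideo/CoreVideo.h>

    /* Only wrap the pixel buffer when it really is a 4-channel, 8-bit-per-channel
       layout. The caller unlocks the base address when it is done with the data. */
    static vImage_Error wrapAsARGB8888(CVPixelBufferRef pb, vImage_Buffer *out)
    {
        OSType fmt = CVPixelBufferGetPixelFormatType(pb);
        if (fmt != kCVPixelFormatType_32BGRA && fmt != kCVPixelFormatType_32ARGB)
            return kvImageInvalidImageFormat;   /* e.g. 420YpCbCr8BiPlanar*: bail out */

        CVPixelBufferLockBaseAddress(pb, kCVPixelBufferLock_ReadOnly);
        out->data     = CVPixelBufferGetBaseAddress(pb);
        out->width    = CVPixelBufferGetWidth(pb);
        out->height   = CVPixelBufferGetHeight(pb);
        out->rowBytes = CVPixelBufferGetBytesPerRow(pb);  /* not width * 4: rows may be padded */
        return kvImageNoError;
    }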

Flavor text:

I should probably describe how vImage is usually tested. A 2D buffer is created with guard pages down each side and at the top and bottom. This was entertaining, because the kernel ca. 2002 wasn't really designed to allocate memory well when there were over 100,000 guard pages active, and it ran veeerry slowly at first. We were tipped off by the kernel team that we needed to allocate a large chunk, then make our guarded buffers, then free the chunk; after that the kernel would run quickly, and it worked! Dunno about the modern VM.

Once the guard vImage_Buffers are made, the function is called with a variety of image sizes in each of the 4 corners of the image, for all 16 possible alignments (later 32, for AVX), and the results are compared against a "known good" scalar function.
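
In case it is useful, here is a sketch of the guard-page idea using mmap/mprotect; it is only an illustration of the technique, not the actual test harness, and the name guardedAlloc is mine.

    #include <sys/mman.h>
    #include <unistd.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Place the payload between PROT_NONE pages so any read or write that runs
       off the buffer faults immediately instead of silently corrupting memory. */
    static void *guardedAlloc(size_t payloadBytes)
    {
        size_t page  = (size_t)getpagesize();
        size_t body  = (payloadBytes + page - 1) & ~(page - 1);  /* round up to whole pages */
        size_t total = body + 2 * page;                          /* one guard page on each end */

        uint8_t *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                             MAP_ANON | MAP_PRIVATE, -1, 0);
        if (base == MAP_FAILED)
            return NULL;

        mprotect(base, page, PROT_NONE);                 /* guard below */
        mprotect(base + page + body, page, PROT_NONE);   /* guard above */

        /* Return a pointer placed so the payload ends flush against the upper guard
           page: an overrun off the right edge faults on the very first stray byte. */
        return base + page + (body - payloadBytes);
    }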

This elaborate scheme was necessary because the software misalignment scheme for AltiVec was devilishly tricky to get right and commonly would read off the end of the array. If you are one of the 13.2 people still living who ever did AltiVec, you'll know immediately what I'm talking about. (After many years of fighting with it, I came to believe lvsl and lvsr behavior for the 0th/16th element was designed backwards, such that you were better off using the left-shift instruction to shift right and the right-shift instruction to shift left, using lvsl(0, -ptr). It's the only time I've ever seen negative pointers in C, but I digress.) Long story short, the vector code test had to absolutely beat on it to have any chance of shipping something working, and that's what we did. We beat on it to combinatorial exhaustion. It took days.

There existed large iPhone clusters for this purpose, with hundreds of units. At one point we retired a bunch of them because they were no longer supported and had fun setting them up like dominoes to topple in a chain. It seemed sacrilegious somehow, but great fun! For the Mac side, we would borrow other engineers' machines by remote login overnight; there wasn't a lab with that many Macs available. I expect it is easier today with 24+ core machines. Back then we were lucky to have 2!

Consequently, over the years it has been the case that if a crashing bug came in on vImage, from inside or outside the company, it was almost always a case of user error: typically a rowBytes issue, transposed height and width, a wrong image size, or a wrong pixel format. Basically, any field in the vImage_Buffer structure is a possible source of unplanned program termination. Get it right!
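
The least error-prone way I know to fill those fields in for a destination buffer is to let vImageBuffer_Init pick the allocation and rowBytes rather than computing them by hand. A small sketch (the wrapper name makeDest is just for illustration):

    #include <Accelerate/Accelerate.h>
    #include <stdlib.h>

    /* Let vImage choose rowBytes, padding and alignment for an ARGB8888 destination.
       Note the argument order: height first, then width -- transposing them is a
       classic source of unplanned program termination. */
    static vImage_Error makeDest(vImage_Buffer *dest,
                                 vImagePixelCount height, vImagePixelCount width)
    {
        vImage_Error err = vImageBuffer_Init(dest, height, width,
                                             32 /* bits per pixel */, kvImageNoFlags);
        /* On success, the caller frees dest->data with free() when finished. */
        return err;
    }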

A lot of this could have been avoided by having vImage package up its buffers with formats attached and keep users out of it. However, all the other components like CV and CG already had their own opaque-ish buffer formats, and we wanted to interop trivially with them so that CG and CV could use vImage without unpacking buffers. So "unsafe unretained" raw pointers were used, and most of the time there is no metadata in the buffer about format, which causes a lot of problems for users. I suspect that had we packaged the format with the buffer, we wouldn't have needed to include the data format in the function name and other awkwardness, but that ship has sailed. We probably should have made both packaged and unpackaged variants available, but 20/20 hindsight and all that... I expect we would have eventually concluded the added safety would just discourage people from tiling their workloads, and everything would run much more slowly. You are tiling your workloads, right?
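
If you aren't, here is a rough sketch of what strip tiling can look like. It uses vImagePremultiplyData_ARGB8888 because that operation is pointwise and therefore safe to tile naively; filters with spatial support need overlapping tiles, which this does not attempt, and the function premultiplyTiled is only an illustration.

    #include <Accelerate/Accelerate.h>

    /* Process a large ARGB8888 image in horizontal strips so each tile stays
       cache-resident. The tiles alias the parent buffers: same rowBytes, offset
       data pointer. kvImageDoNotTile is passed since we are tiling ourselves. */
    static vImage_Error premultiplyTiled(const vImage_Buffer *src,
                                         const vImage_Buffer *dest,
                                         vImagePixelCount stripHeight)
    {
        for (vImagePixelCount y = 0; y < src->height; y += stripHeight) {
            vImagePixelCount h = src->height - y;
            if (h > stripHeight)
                h = stripHeight;

            vImage_Buffer srcTile = {
                .data     = (uint8_t *)src->data  + y * src->rowBytes,
                .height   = h,
                .width    = src->width,
                .rowBytes = src->rowBytes
            };
            vImage_Buffer dstTile = {
                .data     = (uint8_t *)dest->data + y * dest->rowBytes,
                .height   = h,
                .width    = dest->width,
                .rowBytes = dest->rowBytes
            };

            vImage_Error err = vImagePremultiplyData_ARGB8888(&srcTile, &dstTile,
                                                              kvImageDoNotTile);
            if (err != kvImageNoError)
                return err;
        }
        return kvImageNoError;
    }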

Posted by: Ian Ollmann