Add Highway SIMD acceleration to ImageBufAlgo [add, sub, mul, div, mad, resample] #4994

Open
ssh4net wants to merge 70 commits into AcademySoftwareFoundation:main from ssh4net:_hwy

@ssh4net
Contributor

@ssh4net ssh4net commented Jan 7, 2026

Optional SIMD optimizations for selected ImageBufAlgo operations using the Google Highway library:
• add/sub
• mul/div
• mad
• resample
Adds CMake and build system support, new implementation helpers, and developer documentation.

The code was mostly written using the frontier Opus 4.5 and Codex GPT 5.2 High models, under strict rules.

Checklist:

  • I have read the guidelines on contributions and code review procedures.
  • I have updated the documentation if my PR adds features or changes
    behavior.
  • I am sure that this PR's changes are tested somewhere in the testsuite.
  • I have run and passed the testsuite in CI before submitting the
    PR, by pushing the changes to my fork and seeing that the automated CI
    passed there. (Exceptions: If most tests pass and you can't figure out why
    the remaining ones fail, it's ok to submit the PR and ask for help. Or if
    any failures seem entirely unrelated to your change; sometimes things break
    on the GitHub runners.)
  • My code follows the prevailing code style of this project and I
    fixed any problems reported by the clang-format CI test.
  • If I added or modified a public C++ API call, I have also amended the
    corresponding Python bindings. If altering ImageBufAlgo functions, I also
    exposed the new functionality as oiiotool options.

@lgritz
Collaborator

lgritz commented Jan 7, 2026

I suspect you used an LLM for some of this? Which is fine, but I think you should document in the PR description (commit comment) which tool you used and for which parts.

Comment on lines +127 to +131
template<class Rtype, class Atype, class Btype>
static bool
add_impl_hwy(ImageBuf& R, const ImageBuf& A, const ImageBuf& B, ROI roi,
             int nthreads)
{
Collaborator

I haven't done a line-by-line comparison, but it seems to me that the only difference between add_impl_hwy, sub_impl_hwy, and mul_impl_hwy is likely going to be

[](auto d, auto a, auto b) { return hn::Add(a, b); }

versus that one lambda changing for Sub and Mul.

I would love for even the initial commit to reduce this whole thing to a shared hwy_binary_perpixel_op() template that takes the lambda housing the op kernel as a templated parameter.
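
As a rough illustration of the refactor suggested above, the sketch below routes add, sub, and mul through one templated helper that takes the kernel as a lambda. It is a scalar stand-in, not the PR's code: `binary_perpixel_op_sketch` and the `*_sketch` wrappers are hypothetical names, plain `std::vector` stands in for `ImageBuf`, and the lambdas use `+`, `-`, and `*` where the real kernels would call `hn::Add` and friends on Highway vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical shared helper: everything except `op` is identical for
// add, sub, and mul, so only the lambda changes between operations.
template<class Rtype, class Atype, class Btype, class Op>
bool
binary_perpixel_op_sketch(std::vector<Rtype>& r, const std::vector<Atype>& a,
                          const std::vector<Btype>& b, Op op)
{
    assert(a.size() == b.size());  // inputs must cover the same values
    r.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        r[i] = static_cast<Rtype>(op(a[i], b[i]));
    return true;
}

// Each operation becomes a thin wrapper that supplies its own kernel.
template<class R, class A, class B>
bool
add_sketch(std::vector<R>& r, const std::vector<A>& a, const std::vector<B>& b)
{
    return binary_perpixel_op_sketch(r, a, b,
                                     [](auto x, auto y) { return x + y; });
}

template<class R, class A, class B>
bool
sub_sketch(std::vector<R>& r, const std::vector<A>& a, const std::vector<B>& b)
{
    return binary_perpixel_op_sketch(r, a, b,
                                     [](auto x, auto y) { return x - y; });
}

template<class R, class A, class B>
bool
mul_sketch(std::vector<R>& r, const std::vector<A>& a, const std::vector<B>& b)
{
    return binary_perpixel_op_sketch(r, a, b,
                                     [](auto x, auto y) { return x * y; });
}
```

Because the kernel is a template parameter, the compiler can still inline it into the loop, so the shared helper should not cost anything relative to three hand-written copies.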

Comment on lines 155 to 161
// Process pixel by pixel (scalar fallback for strided channels)
for (int x = roi.xbegin; x < roi.xend; ++x) {
    Rtype* r_ptr = ChannelPtr<Rtype>(Rv, x, y, roi.chbegin);
    const Atype* a_ptr = ChannelPtr<Atype>(Av, x, y,
                                           roi.chbegin);
    const Btype* b_ptr = ChannelPtr<Btype>(Bv, x, y,
                                           roi.chbegin);
Collaborator

I think we should benchmark the strided case and see how it compares to the contiguous case and the full scalar fallback that we've always had. If there is no big speed gain, I would be in favor of eliminating this whole clause and let non-contiguous strides use the old scalar path, then there is much less template expansion for hwy in the cases where there is not a large gain to be had. Note that this means that the "to hwy or not to hwy" test would need to test contiguity in addition to just localpixels().
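
To make the "test contiguity in addition to localpixels()" point concrete, here is a minimal sketch of what such a gate might look like. The `BufferLayoutSketch` struct and `can_use_simd_sketch` function are hypothetical and only model the decision; the actual check in the PR may use different fields and conditions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a buffer's memory layout (not an OIIO type).
struct BufferLayoutSketch {
    bool localpixels;           // pixels live in directly addressable memory
    int nchannels;              // channels stored per pixel
    std::size_t channel_bytes;  // size of one channel value
    std::size_t pixel_stride;   // bytes from one pixel to the next
};

// Returns true only when the requested channel range makes a scanline one
// unbroken run of values: all channels are selected and pixels are packed
// with no padding between them. Anything else takes the old scalar path.
inline bool
can_use_simd_sketch(const BufferLayoutSketch& buf, int chbegin, int chend)
{
    assert(chbegin >= 0 && chbegin <= chend && chend <= buf.nchannels);
    if (!buf.localpixels)
        return false;
    const bool all_channels = (chbegin == 0 && chend == buf.nchannels);
    const bool tightly_packed
        = (buf.pixel_stride
           == static_cast<std::size_t>(buf.nchannels) * buf.channel_bytes);
    return all_channels && tightly_packed;
}
```

With a gate like this, an RGB-only ROI on an RGBA image would fall back to scalar code unless a dedicated fast path handles it.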

@lgritz
Collaborator

lgritz commented Feb 19, 2026

@ssh4net It's been a while since this PR has been updated, but after your last push, it's failing to build. Can you please rebase on main, fix so it passes CI, and ensure that there is a DCO sign-off on each commit? I would like to proceed with this in some form.

@ssh4net
Contributor Author

ssh4net commented Feb 20, 2026

@lgritz sure! Give me a bit of time. I have added some fixes based on the discussion above, but I have not fully verified them yet, and I switched to other projects 😅
Will check, rebase and push soon.

ssh4net and others added 23 commits February 24, 2026 16:45
Optional SIMD optimizations for selected ImageBufAlgo operations using the Google Highway library:
• add/sub
• mul/div
• mad
• resample
Adds CMake and build system support, new implementation helpers, and developer documentation.

Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
This reverts commit 4d3b1f3.

Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Co-authored-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad Erium <shaamaan@gmail.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Co-authored-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad Erium <shaamaan@gmail.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Co-authored-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad Erium <shaamaan@gmail.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Adds generic per-pixel HWY operation helpers for binary and ternary ops, refactors the add/sub/mul/div/mad HWY implementations to use these helpers, and ensures HWY SIMD is only used for contiguous channel ranges. Adds a new test to verify correct fallback to scalar code for strided (non-contiguous) ROI channel ranges.

Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Add specialized HWY fast-paths for add/sub/mul/div/mad to handle the common case where the ROI selects RGB channels of 4-channel (RGBA) images by processing full 4-channel interleaved data and preserving alpha bitwise. Introduce small op lambdas for each operator and handle float/half/double same-type cases with contiguous-channel checks, half-promote/demote paths, and division zero-safety. Also update tests to pre-fill destination buffers and compare results (removed ROI from compare) to validate the strided-ROI fallback behavior. Affects imagebufalgo_addsub.cpp, imagebufalgo_mad.cpp, imagebufalgo_muldiv.cpp and imagebufalgo_test.cpp.

Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
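
The RGB-in-RGBA fast path described in the commit above can be modeled in scalar code roughly as follows. This is a sketch under assumptions, not the PR's implementation: `rgb_op_preserve_alpha_sketch` is a hypothetical name, the buffers are plain float vectors of packed RGBA, and the inner loop over four channels stands in for one full-width vector operation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Apply `op` to all four interleaved channels of each pixel (as a packed
// SIMD operation would), then put the destination's original alpha back
// bit for bit, so only R, G, and B appear to have been written.
template<class Op>
void
rgb_op_preserve_alpha_sketch(std::vector<float>& r,
                             const std::vector<float>& a,
                             const std::vector<float>& b, Op op)
{
    assert(r.size() == a.size() && a.size() == b.size());
    assert(r.size() % 4 == 0);  // packed RGBA only
    for (std::size_t p = 0; p < r.size(); p += 4) {
        std::uint32_t alpha_bits;  // raw bits of the destination alpha
        std::memcpy(&alpha_bits, &r[p + 3], sizeof(alpha_bits));
        for (std::size_t c = 0; c < 4; ++c)
            r[p + c] = op(a[p + c], b[p + c]);
        std::memcpy(&r[p + 3], &alpha_bits, sizeof(alpha_bits));
    }
}
```

Pre-filling the destination before the call, as the updated tests do, is what makes it possible to check that alpha really was left untouched.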
Replace duplicated ad-hoc SIMD special-cases in add/sub/mad with generalized HWY helpers that handle the common packed-RGBA-but-ROI-is-RGB case. Introduce PromoteVec/DemoteVec, lane-type mapping for half, interleaved Load/Store helpers (including partial-vector variants), and per-pixel/ternary routines that preserve alpha or mask it for native integer ops. Also switch HwyPixels to use pixel/scanline stride, add necessary forward declarations and includes, and simplify callers to use the new helpers, broadening support to integer/native ops and reducing code duplication.

Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…()` (AcademySoftwareFoundation#4987)

Rearrangements in 3.1 dropped the list of recognized attributes from
the visible online docs and failed to document the span varieties. We
fix and also reword a lot of the descriptions for clarity and uniformity.

The previous organization was that there were several varieties of attribute(). In the header, the first one had the overall long explanation, including the list of all the recognized attributes. The other ones had short explanations of how they differed. In the docs, each one was referenced explicitly, pulling in its attendant bit of documentation.

What really happened is that in the header, I made the new span-based version the "flagship" one with the full explanation, but I neglected to reference it in the docs, so the long description disappeared.

I could have fixed by just adding refs to the new functions to the docs, as I originally meant to. But while I was there, I took the opportunity to surround the whole collection with a group marker, and then include the lot of them with a single reference to the group, rather than need to refer to each function variant individually. And while I was at it, I also reworded (and hopefully improved) some of those explanations.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…Foundation#4990)

Implement RLE compression support for the SGI output plugin. Reading RLE
encoded images was already supported, but writing was never done up
until this point.

The existing sgi test seems sufficient to catch issues and it covers
input/output of both 1 byte-per-pixel and 2 byte-per-pixel files.

The documentation for the image plugins is sometimes not very clear
about which attributes are relevant for input vs. output. There's
usually 3 sections: Attributes, Attributes for Input, and Attributes for
Output.

Before this PR, SGI mentioned the "compression" attribute in the
"general" Attributes section (rather than say just the Input section),
which caused a bit of grief as the only way to discover that RLE was not
implemented for Output was to glance at the file size of the resulting
file... I had assumed that compression was supported for output too but
discovered that it was not.

Now that this PR implements the attribute for output I've left the
documentation as-is in the "general" Attributes section since it applies
to both reading and writing now. But I'm open to suggestions here.

Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Starting with 1.21, libheif seems to change behavior: When no CICP
metadata is present, libheif now returns 2,2,2 (all unspecified) on
read. OIIO convention, though, is to not set the attribute if valid CICP
data is not in the file.

---------

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
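
The convention in the libheif commit above amounts to a small filter on the values read back. A minimal sketch, assuming the check is "skip only when all three code points are 2"; the function name is hypothetical and the real reader may test the values differently:

```cpp
#include <cassert>

// In the CICP code point tables, 2 means "unspecified" for color
// primaries, transfer characteristics, and matrix coefficients.
constexpr int kCicpUnspecifiedSketch = 2;

// Hypothetical helper: returns false for the 2,2,2 triple that newer
// libheif reports when the file carries no CICP metadata, so the caller
// can skip setting the attribute in that case.
inline bool
cicp_worth_storing_sketch(int primaries, int transfer, int matrix)
{
    assert(primaries >= 0 && transfer >= 0 && matrix >= 0);
    const bool all_unspecified = primaries == kCicpUnspecifiedSketch
                                 && transfer == kCicpUnspecifiedSketch
                                 && matrix == kCicpUnspecifiedSketch;
    return !all_unspecified;
}
```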
…wareFoundation#4993)

For IBA::resample() when bilinear interpolation is used, almost all of
the expense was due to its relying on ImageBuf::interppixel which is
simple but constructs a new ImageBuf::ConstIterator EVERY TIME, which is
very expensive.

Reimplement in a way that reuses a single iterator. This speeds up
IBA::resample by 20x or more, typically.

Also refactor resample to pull the handling of deep images into a
separate helper function and out of the main inner loop. And add some
benchmarking.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
* CI test vs the latest freetype 2.14.1
* Bump the version of freetype that we auto-build to the latest (from
2.13.2)
* Simplify BZip2 finding logic, switch to using targets

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…areFoundation#4998)

The Intel MacOS 15 CI testing is getting dicier... lots of times,
Homebrew doesn't have cached versions of updated packages, so it tries
to build from source, which takes forever. The big culprit today is Qt.
So, basically, just on this one CI job variant, don't ask it to install
Qt. If it's there, it's there. If not, just skip it. It's tested plenty
in other variants.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…cademySoftwareFoundation#4997)

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Fixes AcademySoftwareFoundation#5000

Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…wareFoundation#4995)

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
lgritz and others added 25 commits February 24, 2026 16:45
Reflecting this month's releases and other things that recently went
into main.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…areFoundation#5026)

Even though we have CI testing on Mac with ARM CPU that were passing,
after getting a new laptop, I saw some test failures that were due to
just a few pixels on a few tests needing a higher comparison threshold.
Results are correct, just different due to the math. I guess this
machine (CPU? build flags? specific compiler or library versions?) is
ever so slightly different than the CI Macs, so I caught a few more
instances that needed to be adjusted.

I tried to increase the thresholds as little as possible to fix the
problem.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…AcademySoftwareFoundation#5025)

I think it was basically harmless, since we do all the metadata name
comparisons using case-insensitive comparisons. But we use "Exif:" as
our prefix for Exif data throughout OIIO by convention, and there was
this tiny handful of places where we said "exif:".

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…ySoftwareFoundation#5027)

Need to test some MSVS-specific macros to determine what architecture to
report.

And especially, if it doesn't know the processor architecture, it still
should be *appending* that to the platform, not replacing it! This
caused MSVS-compiled OIIO on Windows to report "unknown arch?"

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
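
The fix in the commit above has two parts: consult MSVS-specific macros, and append the architecture rather than replace the platform. A minimal sketch of both, using hypothetical function names and an assumed `platform/arch` string format (the library's actual format and arch names may differ). `_M_X64`, `_M_ARM64`, and `_M_IX86` are MSVC's predefined architecture macros; the others are the GCC/Clang equivalents.

```cpp
#include <cassert>
#include <string>

// Pick an architecture name from predefined compiler macros.
inline const char*
arch_name_sketch()
{
#if defined(_M_X64) || defined(__x86_64__)
    return "x86_64";
#elif defined(_M_ARM64) || defined(__aarch64__)
    return "ARM64";
#elif defined(_M_IX86) || defined(__i386__)
    return "x86";
#else
    return "unknown arch?";
#endif
}

// Append the architecture to the platform name. Even when the
// architecture is unknown, the platform part must survive.
inline std::string
platform_with_arch_sketch(const std::string& platform)
{
    assert(!platform.empty());
    return platform + "/" + arch_name_sketch();
}
```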
AcademySoftwareFoundation#5031)

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…ademySoftwareFoundation#5029)

Since the OpenImageIO 2.5 series, when calls to `check_open` were added,
any format that did not declare support for "tiles" would immediately
fail to open. But many of the formats which attempted to emulate tiles,
by buffering the contents and writing it all as scanlines at the end,
were not updated. All of the tile emulation code for these formats is
effectively dead-code and untested.

Remove the tile emulation code from these formats.

An example of what the failure currently looks like:
```python
>>> out = oiio.ImageOutput.create("test.png")
>>> spec = oiio.ImageSpec(64, 64, 3, 'uint8')
>>> spec.tile_width = 64
>>> out.open("test.png", spec)
False

>>> out.geterror()
'png does not support tiled images'
```

No tests were impacted.

Signed-off-by: Jesse Yurkovich <jesse.y@gmail.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…ftwareFoundation#5006)

Review: we have long had two assertion macros: OIIO_ASSERT which aborts
upon failure in Debug builds and prints but continues in Release builds,
and OIIO_DASSERT which aborts in Debug builds and is completely inactive
for Release builds.

Inspired by C++26 contracts, and increasingly available "hardening
modes" in major compilers (especially with the LLVM/clang project's
libc++), I'm introducing some new verification helpers.

New macro `OIIO_CONTRACT_ASSERT` more closely mimics C++26
contract_assert in many ways, and perhaps will simply wrap C++
contract_assert when C++26 is on our menu.

Important ways that OIIO_CONTRACT_ASSERT differs from OIIO_ASSERT and
OIIO_DASSERT:

* Keeping in line with C++ contracts, there are 4 possible responses to
a failed contract assertion: Ignore, Observe (print only), Enforce
(print and abort) and Quick-Enforce (just abort).

* Also define hardening levels: None, Fast, Extensive, and Debug,
mimicking the levels of libc++. The idea is that maybe there will be
some CONTRACT_ASSERT checks you only want to do for certain hardening
levels.

* By default, the contract failure response is Enforce, unless it's both
a release build and the hardening level is set to None (in which case
the response will be Ignore). But it's also overrideable optionally on a
per-translation-unit basis by setting OIIO_ASSERTION_RESPONSE_DEFAULT
before any OIIO headers are included (though obviously that only applies
to inline functions or templates, not to any already-compiled code in
the library).

* Macros for explicit hardening levels: OIIO_HARDENING_ASSERT_FAST(),
EXTENSIVE(), and DEBUG(), which call CONTRACT_ASSERT only when the
hardening level is what's required or stricter.
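
The four responses and the default rule in the list above could be modeled roughly as follows. This is only a sketch of the described behavior: every name here is hypothetical, and the real macros are configured at compile time rather than through function arguments.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// The four possible responses to a failed contract assertion.
enum class ResponseSketch { Ignore, Observe, Enforce, QuickEnforce };

// What a given response does when a check runs.
struct OutcomeSketch {
    bool printed;
    bool aborted;
};

// Pure decision function, kept separate so it can be examined without
// actually terminating the program.
constexpr OutcomeSketch
decide_sketch(ResponseSketch response, bool condition_holds)
{
    if (condition_holds)
        return { false, false };
    switch (response) {
    case ResponseSketch::Observe: return { true, false };       // print only
    case ResponseSketch::Enforce: return { true, true };        // print, abort
    case ResponseSketch::QuickEnforce: return { false, true };  // just abort
    default: return { false, false };                           // Ignore
    }
}

// Default response: Enforce, unless this is a release build AND the
// hardening level is None, in which case failures are ignored.
constexpr ResponseSketch
default_response_sketch(bool release_build, bool hardening_none)
{
    return (release_build && hardening_none) ? ResponseSketch::Ignore
                                             : ResponseSketch::Enforce;
}

// Runtime entry point that acts on the decision.
inline void
contract_check_sketch(ResponseSketch response, bool condition_holds,
                      const char* text)
{
    assert(text != nullptr);
    const OutcomeSketch outcome = decide_sketch(response, condition_holds);
    if (outcome.printed)
        std::fprintf(stderr, "Contract assertion failed: %s\n", text);
    if (outcome.aborted)
        std::abort();
}
```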

I also changed the bounds checking in operator[] of string_view, span,
and image_span to use the contract assertions. Note that this adds a
tiny bit of overhead, since the default is "enforce" for release builds
(previously, using OIIO_DASSERT, it did no checks for release builds).
But the benchmarks seem to indicate that the perf difference is barely
measurable.

I added some benchmarking that proves that the bounds check adds a
minute overhead to an element access for a trivial `span<float>`, maybe
even indiscernible. Here are benchmarks comparing raw pointer access,
std::array access, span access with the new checks, span access
carefully bypassing the tests.

Linux workstation, gcc-11, on my work computer:

    pointer operator[]:     647.8 ns (+/- 0.1ns)
    std::array operator[]:  647.8 ns (+/- 0.1ns)
    span operator[] :       657.6 ns (+/- 0.5ns)
    span unsafe indexing:   648.2 ns (+/- 0.2ns)
    span range      :       648.1 ns (+/- 0.1ns)

These are the most stable tests I have, with the least trial-to-trial
variation, and show about a 1.5% speed hit on the bounds-checked span
access itself, which I think will be truly un-measurable in the context
of being interleaved with any other operations that you do with the data
you pull from the span.

Mac Intel, Apple Clang 17, on my (old) personal laptop: (much more
variable timing, probably from MacOS scheduler quirks)

    pointer operator[]:     929.2 ns (+/- 6.7ns)
    std::array operator[]:  913.1 ns (+/- 20.6ns)
    span operator[] :       905.8 ns (+/- 13.3ns)
    span unsafe indexing:   913.9 ns (+/- 16.6ns)
    span range      :       916.4 ns (+/- 20.3ns)

You can see that here there is no obvious penalty, in fact it appears a
little faster, but all within the timing uncertainty of the multiple
trials, so statistically it's hard to discern any penalty.

And a couple more for good measure from our CI, but note that because
these are uncontrolled machines somewhere on the GitHub cloud, the
timings might not be as reliable:

Windows, MSVS 2022:

    pointer operator[]:    3716.3 ns (+/- 6.3ns)
    std::array operator[]: 3715.5 ns (+/- 3.4ns)
    span operator[] :      3715.6 ns (+/- 2.6ns)
    span unsafe indexing:  3712.1 ns (+/- 0.7ns)
    span range      :      3714.2 ns (+/- 2.9ns)

Linux, gcc-14, C++20:

    pointer operator[]:    1130.9 ns (+/- 0.2ns),  884.2 k/s
    std::array operator[]: 1132.0 ns (+/- 0.4ns),  883.4 k/s
    span operator[] :      1133.7 ns (+/- 0.4ns),  882.1 k/s
    span unsafe indexing:  1134.2 ns (+/- 1.6ns),  881.7 k/s
    span range      :      1133.9 ns (+/- 0.7ns),  881.9 k/s

MacOS ARM:

    pointer operator[]:    3456.6 ns (+/- 7.5ns)
    std::array operator[]: 3466.8 ns (+/- 12.2ns)
    span operator[] :      3610.9 ns (+/- 11.0ns)
    span unsafe indexing:  3607.4 ns (+/- 4.9ns)
    span range      :      3612.4 ns (+/- 12.2ns)

Windows with MSVS and Linux with newer g++ don't appear to show any
penalty, and the bracketing of trial times indicates that maybe it's
consistent enough to be meaningful? I can't think of anything I'm doing
wrong here that would throw off the timing or disable the range checking
on these tests.

For MacOS ARM, the span looks like it has about a 4% penalty versus raw
pointers? But OTOH, span bounds-checked vs non-checked vs range-for are
all the same, so maybe the speed vs raw pointer is something else
entirely?

Also please note that a preferred way to avoid these extra bounds checks
entirely is to change an index-oriented loop like

    span s;
    for (size_t i = 0; i < s.size(); ++i)
        foo(s[i]);   // maybe bounds check on each iteration?

to a range based loop:

    span s;
    for (auto& v : s)
        foo(v);

which should be inherently safe and require no in-loop checks at all.

---------

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…ndation#5032)

Mac Intel is getting long in the tooth, and quite often the Homebrew
packages for Intel are found to be uncached and will try to build from
source. When it's OpenCV, that's disastrous for our CI build times, it
can get stalled for hours building all of OpenCV and its dependencies.
So disable it for that one build variant.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…mySoftwareFoundation#5030)

Extra protections for corrupted BMP files that claim to be palette
images, but have a BPP that doesn't support palette images. Also an
extra guard around accessing the palette array if it is empty.

Add an extra test case for this kind of corruption.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…mySoftwareFoundation#5035)

Fixes AcademySoftwareFoundation#5023

This was crashing when writing TIFF information that was supposed to be
arrays of more than one rational but was in fact provided as a single
value: the code was reading past the end of a memory array.

I noticed that this whole region needs a cleanup, this is not the only
problem. But a full overhaul seems too risky to backport, so my strategy
is as follows:

* THIS fix first, which I will backport right away to 3.0 and 3.1.

* I will then submit a separate PR (already implemented and tested) that
is a much more complete fix and overhaul of this portion of the code
(and other places). That will get merged into main when approved.

* After the second PR is merged, I'll hold it in main for a while to
test its safety, and then decide if it seems ok to backport to 3.1 (but
definitely not 3.0).

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…ademySoftwareFoundation#5039)

Bump the version of 'fmt' library that we download and build (if not
found) from 10.2 to 12.1.

Some other touch-ups in build_fmt.cmake.

Also, we have seen that recent fmt versions will fail to compile on MSVC
unless using the `/utf-8` compiler flag, so ensure that is used and also
passed on to other clients of libOpenImageIO_Util (which expose
templates using those headers).

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
AcademySoftwareFoundation#5036)

This is a more comprehensive fix for issues discovered in PR AcademySoftwareFoundation#5035.

The original problem reported in Issue AcademySoftwareFoundation#5023 was a crash when writing
TIFF information that was supposed to be arrays of more than one
rational, it was reading past the end of a memory array.

AcademySoftwareFoundation#5035 is a minimal, immediate fix to address the crashes. But in the
process, I saw a number of ways in which we were dropping metadata on
the floor when the types didn't exactly match, but that we *could*
handle with automatic conversion.

The new cases that we handle with this PR are:

* Exif RESOLUTIONUNIT tag is a short, but by convention we store it by
the name as a string in OIIO metadata, so we need to convert back to a
code (we did so for the main TIFF metadata, but not for Exif in TIFF).
* Handle Exif "version" and "flashpixversion" metadata which have
unusual encoding in TIFF files (they are 4-character strings, but must
be stored in a TIFF tag of type BYTES, not as the usual type ASCII that
most strings use).
* Handle things that TIFF insists are ASCII but that come to us as
metadata that's strings. Easy -- our `ParamValue.get_string()`
automatically converts other things like ints or floats into string
representation.
* Much more flexibility in automatically converting among the signed and
unsigned, 16 and 32 bit, integer types when the metadata in our
ImageSpec is integer but not the specific type of integer that TIFF/Exif
thinks it should be.

This doesn't appear to change the results of anything in our testsuite,
but it's possible that some non-TIFF-to-TIFF image conversions that
contain Exif data may now do certain type conversions properly instead
of just silently dropping the metadata that had non-matching (but
reasonably valid) types.

Additionally, to do this nicely, I ended up adding a new TypeURational
alias in typedesc.h (similar to TypeRational, but the case where both
numerator and denominator are unsigned ints).

And also fixed a random comment typo I noticed in tiffinput.cpp.

---------

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
This is a PR proposing to keep the gamma precision.

In some cases, we need more precise gamma values, while the existing
rounding operation loses most of the precision. This change will
continue to use rounded values to calculate and store color space
information, but retain the original value in the "Gamma" parameter. In
addition, it can also tidy up existing code.

I've verified with png/exif.png & python-colorconfig tests. No
regression is introduced.

Signed-off-by: Lumina Wang <lumina.wang@autodesk.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…ion#5042)

Also switch to a better idiom for detecting if we're a fork.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…demySoftwareFoundation#5043)

Implement support for reading and writing monochrome. Reading requires
libheif 1.17+ for heif_image_handle_get_preferred_decoding_colorspace.

Previously, writing a single-channel image would cause an exception due to
wrong parameters, but close() would continue writing the image and crash.
Destroy m_ctx on exception to prevent that for other potential errors.

Test added for monochrome read and write.

---------

Signed-off-by: Brecht Van Lommel <brecht@blender.org>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…ixVersion (AcademySoftwareFoundation#5045)

This allows us to correctly read the ExifVersion and FlashPixVersion
metadata in an EXIF block of a TIFF file.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
AcademySoftwareFoundation#5046)

Fixes AcademySoftwareFoundation#5044

Oops, the logic was a little mixed up when there were exactly two
images. One reason that this was a special case is that conceptually,
there is just a stack, but the implementation is that there is a
separate variable for the top item, and then the actual stack is all the
other items.

Also add more thorough testing of TOP/BOTTOM, including what happens for
2, 1, and also 0 items on the image stack (errors in that last case).

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…ftwareFoundation#5034)

* cmake utility build_dependency_with_cmake was unconditionally doing a
shallow clone and using `clone -b`, but that only works if it's got a
branch or tag name, not if it has a commit hash. So change the logic so
it does a shallow clone only if GIT_TAG is specified but GIT_COMMIT is
not.
* pybind11 self-builder is modified to allow a git commit override.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…size (AcademySoftwareFoundation#5037)

For various tile sizes (and scanline), benchmark how long it takes to
read and write a 4k x 2k image.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…areFoundation#5040)

Intel icc is deprecated and hasn't had a release for a few years. It's
holding us back, both by making us work around an ever growing number of
icc bugs and limitations that will never be fixed, as well as not
allowing us to upgrade minimum versions of certain dependencies, because
icc can't correctly compile newer versions (as an example, it cannot use
a 'fmt' library newer than the oldest we support, 7.0).

So it's time to thank icc for its service and put it on the ice floe for
the polar bears to eat. This is of course in main (future 3.2), and will
not be backported to release branches, since we never stop support of a
dependency or toolchain of existing releases. People requiring icc for
whatever reason may keep using OIIO 3.1 or older.

We will continue to support and test icx, the fully supported Intel
LLVM-based compiler.

---------

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…eFoundation#5041)

The previous minimum, 7.0, dated from mid-2020.

We are raising now (in main / future 3.2 only) to 9.0, which dates from
mid-2022, so we're still supporting several versions and/or years back.

Because this changes minimum dependency versions, it will NOT be
backported to release branches (3.1 or earlier).

I had to remove the CI test variant for icc, because ancient icc can't
correctly build newer versions of fmt, it seems. There is a separate PR
to simply drop icc from our list of supported compilers.

If anybody wants to argue for pulling the minimum up even farther (say,
to fmt 10.0, released in 2023, so still supporting 3 years back), which
would simplify even more places, I would consider it.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
…on#5061)

The CI stub generation has been broken for a few days, failing CI every
time. The checked-in stub files seem fine. Just turn off this check
until we can figure out why it is broken.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
)

I was seeing warnings with instantiation of the ispow2 function template
for unsigned type, where the `x >= 0` clause is meaningless. Use a
constexpr if to eliminate that pointless test for unsigned types.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
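
The `constexpr if` approach described in the commit above might look like the following. This is a sketch with a hypothetical name, not the library's function: it assumes the classic `(x & (x - 1)) == 0` bit test, which also reports true for 0, and it needs C++17.

```cpp
#include <cassert>
#include <type_traits>

// For signed types, reject negative values first. For unsigned types the
// `x >= 0` comparison is always true (and triggers warnings), so that
// branch is discarded at compile time and never instantiated.
template<typename T>
constexpr bool
ispow2_sketch(T x) noexcept
{
    static_assert(std::is_integral_v<T>, "integer types only");
    if constexpr (std::is_signed_v<T>)
        return x >= 0 && (x & (x - 1)) == 0;
    else
        return (x & (x - 1)) == 0;
}

// Compile-time checks of both branches.
static_assert(ispow2_sketch(8), "signed power of two");
static_assert(!ispow2_sketch(-4), "negative values are rejected");
static_assert(ispow2_sketch(16u), "unsigned power of two");
```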
…cademySoftwareFoundation#5054)

I tested out the JPEG XL CICP support and noticed that color primaries
12 was not supported. This pull request is looking to extend P3 support
for color primaries 12.
Note: color primaries 11 uses the DCI white point and color primaries 12
uses the D65 white point.

The JxlPrimaries enum only covers P3 primaries as value 11 and not 12. See
https://github.com/libjxl/libjxl/blob/main/lib/include/jxl/color_encoding.h#L55-L75
Further code is therefore required to account for this on read and write.

Tests for read and write of color primaries 11 and 12 were added.

Signed-off-by: Shane Smith <shane.smith@dreamworks.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>
Signed-off-by: Vlad (Kuzmin) Erium <libalias@gmail.com>

9 participants