Commit 6befba7
ACES 2.0 Output Transform performance optimisation (#2119)
* Extend ocioperf to take config file parameter on CLI
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Extend ocioconvert to take config on command line
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Extract tonescale_fwd function
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Extract inverse tonescale function
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Combine c and Z variables in J calculation exponent
replace 100.0 entries when referring to the scale of J
Extract calculation of nonlinear compression into functions
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Split RGB<->JMh function into two parts to expose opponent intermediate values
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Use function to compute matrix multiply for LMS calculations
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Remove unused member variable from JMhParams structure
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Combine chromatic adaptation weights into LMS matrix (and inverse) - CHANGES PIXEL OUTPUT
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Use matrix form for transforming cone responses to Aab
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Normalise the F_L parameter
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Remove ra and ba related variables to avoid them being out of sync with opponent calculation
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Make A<->J conversion function generic
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Deduplicate Y<->J conversions
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Factor JMh scaling parameters into Aab matrices
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Factor out references to PI, 360 and 180 constants
Avoid looking up cusp twice during inverse
Whilst searching for the cusp we have already constrained the search so we do not need to clamp
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Add functions to explain some of the calculations
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Further clarify when 100 means reference luminance
Migrate rescaling into tonescale s_2 parameter
Rename model_gamma to reflect it is actually the inverse
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* migrate init steps performed within other init functions to the top level to avoid repeat init of precomputed values.
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* extract some of the fixed values that only depend on the hue to reduce recomputation during inverse gamut mapping
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Avoid double lookup for reachMaxM value by resolving once the hue is known.
also reduces size of object on stack by not passing the whole table.
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Push wrapping of hues to the boundary,
mark up conversion points from external inputs etc
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Store gamma values as reciprocals
move more magic constants into const variables
factor some of the complex expressions into function (temporarily makes things slower)
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Add some missing includes to headers
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* minor cleanup to use std::array instead of plain array for test samples
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Inline reach boundary finding
restructure find_gamut_boundary_intersection to highlight common patterns.
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Extract gamut mapper compression function
rework get_focus_gain to directly compute the slope_gain
Share calculation of analytical threshold
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Rework gamut mapper to compress absolute M then only recalculate J
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Precalculate maximum search range for cusp lookup
next steps would be to factor hue into separate table to improve cache hits followed by redistribution to more uniform hues which should narrow search range
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Experiment with reusing slope calculations in gamut mapper
presmooth cusp values
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Add a collection of TODOs
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Restore function mapping table index to hue
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Minor tweaks to tonescale inverse clamp
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Remove duplicate table whilst calculating upper hull gamma
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Add some additional sample points for the upper hull gamma finder
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Slight tidy up of gamma fitting code
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Experiment with alternate smin implementation
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Remove unused function and tidy up comments
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Extract hue search into separate function
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Extract hues into separate table, merge gamma values into their place (gamma values now sampled on cusp hue intervals). Removes extra texture from GPU path.
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Simplify upper hull gamma hue lookup to avoid unneeded lerping as we are sampling the table entries directly
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Split out tonescale function, minor tweaks to Aab->JMh
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Build tables more uniformly, needs some clean up and lots of testing
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Speed up reach corner finding by switching to testing against the Achromatic rather than J limit
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Speed up hull gamma finding by computing values which depend only on the test points and not the gamma values themselves
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Adjust GPU hue lookup to take advantage of more uniform distribution
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Fix GLSL compatibility with hue lookup
Remove compiler warnings for unused parameters
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Attempt to simplify table generation code
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Explicitly allow GCC to perform additional optimisations - Needs some discussion
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Add extra entries to reach table to avoid needing to clamp to range during pixel processing
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* GPU move reach Max M sampling to avoid looking it up multiple times per pixel
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Remove smoothing from GPU path, it is baked into the cusp
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Fix bug with reach lookup
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Try only wrap hues on input to the shaders
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Rework GPU gamut compressor to follow the same algorithm as CPU. Not 100% the same: GPU still recalculates some values
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Rework solve_J_intersect to have fewer div instructions
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Adjust GPU code to better align with CPU code's structure, some additional precomputation is now applied during shader generation
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Precompute more scaling factors into matrices and nonlinear functions
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Experiment with unsigned integers for array access
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Bypass one J-> A conversion by saving the Aab computed earlier
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Test intrinsics for compression Norm calculation
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Attempt to calculate sin/cos only once per pixel.
Some minor micro optimisations.
Further alignment of GPU with CPU code,
Test values need evaluating
Some GPU results are different - TBD
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Remove unused parameters
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Try tree vectoriser for gcc
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Add Vectorise option for MSVC
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Remove unused function
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* constexpr std::max is only available from C++14; avoid calling it for now
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Try to fix intrinsic-based errors on some build configurations
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Another C++ 14 usage fix
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
* Remove check for CLANG left over from testing
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
---------
Signed-off-by: Kevin Wheatley <kevin.wheatley@framestore.com>
1 parent d807b38 commit 6befba7
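The matrix-folding steps in the commit message ("Combine chromatic adaptation weights into LMS matrix", "Factor JMh scaling parameters into Aab matrices") can be illustrated with a minimal C++ sketch. It assumes the weights act as a per-channel scale on the matrix output. The names `Vec3`, `Mat3`, `mulMatVec` and `foldRowScale` are hypothetical and are not taken from the OpenColorIO sources.

```cpp
#include <array>
#include <cstddef>

// Hypothetical types for illustration only (not the OpenColorIO ones).
using Vec3 = std::array<float, 3>;
using Mat3 = std::array<float, 9>; // row-major 3x3

// Multiply a row-major 3x3 matrix by a column vector.
inline Vec3 mulMatVec(const Mat3 & m, const Vec3 & v)
{
    return Vec3{{ m[0] * v[0] + m[1] * v[1] + m[2] * v[2],
                  m[3] * v[0] + m[4] * v[1] + m[5] * v[2],
                  m[6] * v[0] + m[7] * v[1] + m[8] * v[2] }};
}

// Fold a per-output-channel scale into the matrix, i.e. compute diag(w) * M
// once at initialisation so the per-pixel path needs a single multiply.
inline Mat3 foldRowScale(const Mat3 & m, const Vec3 & w)
{
    Mat3 out{};
    for (std::size_t r = 0; r < 3; ++r)
    {
        for (std::size_t c = 0; c < 3; ++c)
        {
            out[3 * r + c] = w[r] * m[3 * r + c];
        }
    }
    return out;
}
```

Folding changes the order of the floating-point operations, so the folded and unfolded paths can differ in the last bits for general inputs. That may be why the corresponding bullet is flagged as changing pixel output.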
12 files changed
Lines changed: 1722 additions & 1073 deletions
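Two of the steps in the commit message ("Push wrapping of hues to the boundary" and "Add extra entries to reach table to avoid needing to clamp to range during pixel processing") describe a common lookup-table pattern. The sketch below shows that pattern under stated assumptions: a uniformly sampled table over 0 to 360 degrees with linear interpolation. The functions `wrapHueDegrees`, `padHueTable` and `lookupPadded` are hypothetical and do not reflect the actual table layout used in the commit.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Wrap an external hue into [0, 360] once, at the boundary of the pipeline.
// (A tiny negative input can round up to exactly 360; the padding below
// keeps that case in bounds.)
inline float wrapHueDegrees(float h)
{
    float w = std::fmod(h, 360.0f);
    if (w < 0.0f)
    {
        w += 360.0f;
    }
    return w;
}

// Append copies of the first two samples so that indices n and n + 1 are
// valid, where n is the number of original samples.
inline std::vector<float> padHueTable(const std::vector<float> & samples)
{
    assert(!samples.empty());
    std::vector<float> padded(samples);
    padded.push_back(samples.front());
    padded.push_back(samples[1 % samples.size()]);
    return padded;
}

// Interpolate in the padded table. No clamp or modulo is needed here because
// the hue was wrapped on input and the table carries the extra entries.
inline float lookupPadded(const std::vector<float> & padded, float wrappedHue)
{
    assert(padded.size() >= 3);
    assert(wrappedHue >= 0.0f && wrappedHue <= 360.0f);
    const std::size_t n = padded.size() - 2; // original sample count
    const float pos = wrappedHue * (static_cast<float>(n) / 360.0f);
    const std::size_t i = static_cast<std::size_t>(pos); // at most n
    const float t = pos - static_cast<float>(i);
    return padded[i] + t * (padded[i + 1] - padded[i]);
}
```

The trade-off is a slightly larger table in exchange for a branch-free inner lookup, which tends to matter when the lookup runs several times per pixel.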
File tree
- src
  - OpenColorIO
    - ops/fixedfunction
      - ACES2
  - apps
    - ocioconvert
    - ocioperf
- tests
  - cpu
  - gpu
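The C++14 compatibility fixes in the commit message note that `std::max` is not `constexpr` in C++11. One common workaround, sketched here with the hypothetical names `constexprMax` and `kExampleTableSize`, is a small helper written as a single return statement so that it remains a valid C++11 `constexpr` function. This is an illustration of the general technique, not the change actually made in the commit.

```cpp
// C++11-compatible compile-time maximum. A C++11 constexpr function body
// may contain only a single return statement, hence the ternary.
template <typename T>
constexpr T constexprMax(T a, T b)
{
    return (a < b) ? b : a;
}

// Hypothetical usage: size a table from the larger of two compile-time values.
constexpr int kExampleTableSize = constexprMax(360, 256);
static_assert(kExampleTableSize == 360, "constexprMax picks the larger value");
```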