Vision encoders are increasingly used in modern applications, from vision-only models to multimodal systems such as vision-language models. Despite their remarkable success, it remains unclear how these architectures represent features internally. Here, we propose a novel approach for interpreting vision features via image reconstruction. We compare two related model families, SigLIP and SigLIP2, which differ only in their training objective, and show that encoders pre-trained on image-based tasks retain significantly more image information than those trained on non-image tasks such as contrastive learning. We further apply our method to a range of vision encoders, ranking them by the informativeness of their feature representations. Finally, we demonstrate that manipulating the feature space yields predictable changes in reconstructed images, revealing that orthogonal rotations, rather than spatial transformations, control color encoding. Our approach can be applied to any vision encoder, shedding light on the inner structure of its feature space. Code to reproduce our experiments is available at <a href="https://github.com/FusionBrainLab/feature_analysis">https://github.com/FusionBrainLab/feature_analysis</a>.
Visualization of feature space manipulation through red-blue channel swap.
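The channel-swap manipulation above rests on the claim that an orthogonal rotation of the feature space maps an image's features onto those of its red-blue-swapped counterpart. A minimal sketch of that idea is the classic orthogonal Procrustes fit: given paired feature matrices, solve for the orthogonal map between them via an SVD. The `fit_rotation` helper and the synthetic feature matrices below are illustrative assumptions, not the paper's code; a real experiment would embed images and their channel-swapped versions with an encoder such as SigLIP.

```python
import numpy as np

def fit_rotation(X, Y):
    """Orthogonal Procrustes: the orthogonal R minimizing ||X @ R - Y||_F."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Synthetic stand-ins for encoder features (rows = images, cols = feature dims).
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 16))                   # features of original images
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))   # planted ground-truth rotation
Y = X @ Q                                        # features after the "swap"

R = fit_rotation(X, Y)
assert np.allclose(R @ R.T, np.eye(16))  # R is orthogonal
assert np.allclose(R, Q)                 # and recovers the planted rotation
```

If the rotation fit explains the paired features well, while no spatial (pixel-domain) transformation does, that supports the paper's conclusion that color is encoded in rotational structure of the feature space.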
@@ -178,21 +180,7 @@
        </h2>
      </div>
-     <div class="item">
-       <!-- Your image here -->
-       <img src="static/images/carousel3.jpg" alt="MY ALT TEXT"/>
-       <h2 class="subtitle has-text-centered">
-         Third image description.
-       </h2>
-     </div>
-     <div class="item">
-       <!-- Your image here -->
-       <img src="static/images/carousel4.jpg" alt="MY ALT TEXT"/>
-       <h2 class="subtitle has-text-centered">
-         Fourth image description.
-       </h2>
      </div>
    </div>
  </div>
- </div>
</section>
<!-- End image carousel -->
@@ -291,7 +279,7 @@ <h2 class="title">BibTeX</h2>
      <div class="content">

        <p>
-         This page was built using the <a href="https://github.com/eliahuhorwitz/Academic-project-page-template" target="_blank">Academic Project Page Template</a> which was adopted from the <a href="https://nerfies.github.io" target="_blank">Nerfies</a> project page.
+         This page was built using the <a href="https://github.com/eliahuhorwitz/Academic-project-page-template" target="_blank">Academic Project Page Template</a>, which was adapted from the <a href="https://nerfies.github.io" target="_blank">Nerfies</a> project page.
          You are free to borrow the source code of this website; we just ask that you link back to this page in the footer. <br> This website is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/" target="_blank">Creative
          Commons Attribution-ShareAlike 4.0 International License</a>.