Pre-trained foundation models have revolutionized speech technology, as they have many adjacent fields. The combination of their capability and opacity has sparked interest among researchers in interpreting these models in various ways. While interpretability research in fields such as computer vision and natural language processing has made significant progress towards understanding model internals and explaining their decisions, speech technology has lagged behind despite the widespread use of complex, black-box neural models. Recent studies have begun to address this gap, marked by a growing body of literature focused on interpretability in the speech domain. This tutorial provides a structured overview of interpretability techniques, their applications, implications, and limitations when applied to speech models, aiming to help researchers and practitioners better understand, evaluate, debug, and optimize speech models while building trust in their predictions. In hands-on sessions, participants will explore how speech models encode distinct features (e.g., linguistic information) and utilize them in their inference. By the end, attendees will be equipped with the tools and knowledge to start analyzing and interpreting speech models in their own research, potentially inspiring new directions.
```{note}
We presented our tutorial about _Interpretability Techniques for Speech Models_ on **Sunday, August 17th** at this year's Interspeech conference in Rotterdam. <br> Check out the [programme](#interspeech-programme) below, and browse the materials through the sidebar menu (an overview of all slides and notebooks is [here](https://interpretingdl.github.io/speech-interpretability-tutorial/interspeech2025/intro.html)).
```
## Notebooks

We provide four notebooks, exemplifying the interpretability techniques we cover in the tutorial. The notebooks are designed to work on Google Colab; open them through the links below:
### Part I: Representation Understanding
- [Probing exercise: Does Wav2Vec2 encode vocal tract movements?](https://colab.research.google.com/github/interpretingdl/speech-interpretability-tutorial/blob/main/book/interspeech2025/representational-analyses/probing.ipynb)
- [Representation space comparisons: Locating acoustic and articulatory information in Wav2Vec2 with CKA](https://colab.research.google.com/github/interpretingdl/speech-interpretability-tutorial/blob/main/book/interspeech2025/representational-analyses/representation_space_comparisons.ipynb)
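The second Part I notebook uses Centered Kernel Alignment (CKA) to compare representation spaces across layers. As a rough, self-contained sketch of the metric itself (synthetic activations, not the notebook's actual code), linear CKA between two activation matrices can be computed as:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two representation matrices (samples x features)."""
    X = X - X.mean(axis=0)  # center each feature dimension
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2  # alignment of cross-covariances
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return hsic / (norm_x * norm_y)

rng = np.random.default_rng(0)
layer_a = rng.standard_normal((200, 64))            # stand-in for one layer's activations
layer_b = layer_a @ rng.standard_normal((64, 32))   # a (linearly) related second space
print(linear_cka(layer_a, layer_a))  # identical spaces score exactly 1.0
```

A convenient property, which the notebook exploits when comparing Wav2Vec2 layers of different width, is that linear CKA is invariant to orthogonal transformations and isotropic scaling of either space.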
### Part II: Feature Importance Scoring
- [Context Mixing: Quantifying Context-Mixing in Speech Transformers](https://colab.research.google.com/github/interpretingdl/speech-interpretability-tutorial/blob/main/book/interspeech2025/feature-importance-scoring/context_mixing.ipynb)
- [Feature attribution: Explaining Speech Classification Models with Feature Attribution](https://colab.research.google.com/github/interpretingdl/speech-interpretability-tutorial/blob/main/book/interspeech2025/feature-importance-scoring/feature_attribution.ipynb)
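Feature attribution scores input regions by how much they contribute to a model's prediction. As an illustrative toy sketch only (a made-up energy "model" and occlusion, one simple perturb-and-measure attribution method; the notebook covers richer techniques on real speech models), the core idea looks like this:

```python
import numpy as np

def occlusion_attribution(model, x, window=4):
    """Score each input frame by the prediction drop when a span around it is zeroed."""
    base = model(x)
    scores = np.zeros(len(x))
    for i in range(len(x)):
        occluded = x.copy()
        lo, hi = max(0, i - window // 2), min(len(x), i + window // 2 + 1)
        occluded[lo:hi] = 0.0           # silence a short span of frames
        scores[i] = base - model(occluded)  # large drop -> important frame
    return scores

# toy "model": mean energy of the signal, so high-energy frames matter most
model = lambda x: float(np.mean(x ** 2))
signal = np.concatenate([np.zeros(20), np.ones(10), np.zeros(20)])
scores = occlusion_attribution(model, signal)
print(np.argmax(scores))  # peak importance falls inside the high-energy span
```

For speech classifiers the same recipe applies over spectrogram patches or waveform segments, with the usual caveat that zeroing is only one of several possible baselines.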