
Commit f4316bb

add posterior distribution example for neural estimation
2 parents 521eb5d + 6f537f0

9 files changed: 543 additions & 18 deletions

.github/workflows/CI.yml

Lines changed: 2 additions & 2 deletions
@@ -16,7 +16,7 @@ jobs:
       matrix:
         version:
           - '1.11' # Replace this with the minimum Julia version that your package supports. E.g. if your package requires Julia 1.5 or higher, change this to '1.5'.
-          #- 'nightly'
+          - 'nightly'
         os:
           - ubuntu-latest
         arch:
@@ -59,4 +59,4 @@ jobs:
       - run: julia --project=docs docs/make.jl
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-          DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }}
+          DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }}

Project.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 name = "SequentialSamplingModels"
 uuid = "0e71a2a6-2b30-4447-8742-d083a85e82d1"
 authors = ["itsdfish"]
-version = "0.12.2"
+version = "0.12.1"

 [deps]
 Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"

docs/make.jl

Lines changed: 2 additions & 1 deletion
@@ -55,8 +55,9 @@ makedocs(
             "Advanced Model Specification" => "turing_advanced.md",
             "Hierarchical Models" => "turing_hierarchical.md",
             "Amortized Neural Estimation" => [
+                "Point Estimation" => "amortized_point_estimation.md",
                 "Bayesian Parameter Estimation" => "amortized_bayesian_parameter_estimation.md",
-            ]
+            ],
         ],
         "Model Comparison" => [
             "Bayes Factors" => "bayes_factor.md",

docs/src/amortized_bayesian_parameter_estimation.md

Lines changed: 42 additions & 11 deletions
@@ -1,8 +1,14 @@
 # Introduction

-The purpose of this example is to illustrate Bayesian parameter estimation with a neural estimator called normalizing flows. Normalizing flows are a special type of invertible neural network which can learn the posterior distribution by learning the mapping between samples from the prior distribution and simulated data generated from the model.
+The purpose of this example is to illustrate how to perform Bayesian parameter estimation with a neural parameter estimator. Neural parameter estimation learns the mapping between simulated data and the parameters of a model (Zammit-Mangion et al., 2024; Sainsbury-Dale et al., 2024; Radev et al., 2023). It constitutes a method of amortized inference, whereby a large upfront computational cost is incurred during training to enable rapid parameter estimation with the trained neural network. One benefit of amortized inference is that the trained neural network can be saved and reused to estimate parameters for multiple datasets. In this example, we use a neural estimator called a normalizing flow: a special type of invertible neural network that learns the posterior distribution by learning the mapping between parameters and the corresponding simulated data.

-In the example below, we estimate the parameters of the [lognormal race model](lnr.md) (LNR; ). In pratice, one is unlikely to estimate the parameters of the LRN with neural networks because it has an analytic likelihood function. However, we use the LNR for illustration because it is fast to stimulate and the estimation properties of the LRN are known. You can reveal copy-and-pastable version of the full code by clicking the ▶ below.
+## Overview
+
+In the example below, we estimate the parameters of the [lognormal race model](lnr.md) (LNR; Heathcote & Love, 2012; Rouder et al., 2015) with the package [NeuralEstimators.jl](https://msainsburydale.github.io/NeuralEstimators.jl/dev/). We will use a normalizing flow network to estimate the full posterior distribution of the LNR parameters. Generally speaking, neural parameter estimation is most useful for models with an intractable likelihood function, such as the leaky competing accumulator (Usher & McClelland, 2001); however, some of that model's key parameters are notoriously difficult to recover. We therefore use the LNR instead, because its parameters have good estimation properties and can be easily recovered (Rouder et al., 2015).
+
+## Full Code
+
+You can reveal a copy-and-pasteable version of the full code by clicking the ▶ below.

 ```@raw html
 <details>
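For readers new to normalizing flows, the mapping described in the introduction can be summarized with the change-of-variables formula. A sketch in our own notation, not taken from the package: an invertible map $f_\phi$ transports a simple base density $p_Z$ onto an approximate posterior,

```math
q_\phi(\theta \mid y) = p_Z\big(f_\phi^{-1}(\theta; y)\big) \, \big| \det J_{f_\phi^{-1}}(\theta; y) \big|,
```

where $y$ denotes the observed data and $J$ the Jacobian of the inverse map. Sampling from the approximate posterior then amounts to drawing $z \sim p_Z$ and computing $\theta = f_\phi(z; y)$.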
@@ -14,8 +20,8 @@ using CairoMakie
 using Distributions
 using Flux
 using NeuralEstimators
-using Random
 using Plots
+using Random
 using SequentialSamplingModels

 Random.seed!(544)
@@ -93,8 +99,8 @@ using CairoMakie
 using Distributions
 using Flux
 using NeuralEstimators
-using Random
 using Plots
+using Random
 using SequentialSamplingModels
 ```

@@ -105,7 +111,11 @@ Random.seed!(544)

 # Simulation Functions

-As previously noted, normalizing flow neural networks learn the mapping between the prior distribution and simulated data. Once that mapping is learned, the network is inverted to allow one to sample from posterior distribution. We define two functions to generate training data---one to sample from the prior distribution, and another to sample data from the model, given a sampled parameter vector from the prior distribution. In the code block below, the $K$ samples are generated from each prior and concatonated into a $4 \times K$ array.
+As previously noted, normalizing flow neural networks learn the mapping between the prior distribution and simulated data. Once that mapping is learned, the network is inverted to allow one to sample from the posterior distribution. We define two functions to generate training data---one to sample from the prior distribution, and another to sample data from the model, given a sampled parameter vector from the prior distribution.
+
+## Sample from Prior Distribution
+
+In the code block below, $K$ samples are generated from each prior and concatenated into a $4 \times K$ array.

 ```julia
 function sample_prior(K)
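The body of `sample_prior` falls outside the lines shown in these hunks. For orientation, a minimal sketch of such a function is given below; the specific priors are illustrative assumptions, not the ones used in the actual example:

```julia
using Distributions

# Hypothetical priors over the four LNR parameters:
# ν₁, ν₂ (log-space drift means), σ (log-space standard deviation), τ (non-decision time).
function sample_prior(K)
    ν = rand(Normal(0, 1), 2, K)                        # 2 × K drift means
    σ = rand(truncated(Normal(0, 1); lower = 0), 1, K)  # 1 × K positive standard deviations
    τ = rand(Uniform(0.1, 0.5), 1, K)                   # 1 × K non-decision times
    return Float32.(vcat(ν, σ, τ))                      # 4 × K array: one column per prior draw
end
```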
@@ -117,6 +127,8 @@ function sample_prior(K)
 end
 ```

+## Sample from Model
+
 The code block below specifies the function `simulate` to sample simulated data from the model. In this function, $\theta$ is a $4 \times K$ array, with each column representing an independent sample from the prior distribution, and $m$ is the number of trials sampled from the model per sample from the prior. The helper function `to_array` transforms the data into the required format: an $m \times 2$ array in which the first column consists of choice indices, and the second column consists of reaction times.

 ```julia
@@ -125,7 +137,11 @@ simulate(θ, m) = [to_array(rand(LNR(ϑ[1:2], ϑ[3], ϑ[4]), m)) for ϑ ∈ each
 ```
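The helper `to_array` is likewise not shown in this diff. A hypothetical sketch consistent with the description above, assuming `rand(::LNR, m)` returns a NamedTuple with `choice` and `rt` fields, as is typical for SequentialSamplingModels.jl:

```julia
# Convert simulated (choice, rt) data into the required m × 2 array:
# column 1 holds choice indices, column 2 holds reaction times.
to_array(data) = hcat(Float32.(data.choice), Float32.(data.rt))
```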
 # Configure Neural Network

-In this section, we will configure the neural network to perform Bayesian parameter estimation. At a high level, the neural network has two main components. The first component is the `NormalisingFlow`, which learns the density of the model. The second component is the `DeepSet` which learns the summary statistics undelying the distributions.
+In this section, we will configure the neural network to perform Bayesian parameter estimation. At a high level, the neural network has two primary components. The first component is the `DeepSet` neural network, which compresses the data by learning summary statistics describing the distribution of the data. The second component is an invertible neural network called a `NormalisingFlow`. A normalizing flow transforms a set of simple base distributions to approximate a complex distribution (see below). Importantly, because normalizing flows are invertible, they can be used to sample from the posterior distribution.
+
+![](https://uvadlc-notebooks.readthedocs.io/en/latest/_images/normalizing_flow_layout.png)
+*The sequential transformation process of a normalizing flow. [Credit](https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/tutorial11/NF_image_modeling.html).*
+
 ```julia
 # Approximate distribution
 approx_dist = NormalisingFlow(d, 2d)
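The `DeepSet` summary network referenced here is constructed outside the lines shown. A minimal sketch of how one might be assembled with Flux is given below; the layer widths and the `2d`-dimensional summary output are illustrative assumptions chosen to match `NormalisingFlow(d, 2d)` above:

```julia
using Flux, NeuralEstimators

d = 4    # number of LNR parameters
n = 2    # dimension of a single observation (choice, rt)
w = 128  # hypothetical hidden-layer width

# ψ embeds each observation; ϕ maps the pooled embedding to 2d summary statistics.
ψ = Chain(Dense(n, w, relu), Dense(w, w, relu))
ϕ = Chain(Dense(w, w, relu), Dense(w, 2d))
network = DeepSet(ψ, ϕ)
```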
@@ -141,13 +157,17 @@ estimator = PosteriorEstimator(approx_dist, network)

 # Train the Neural Network

+Next, to train the neural estimator, we pass the `estimator`, the function `sample_prior`, and the function `simulate` to the function `train`.
+
 ```julia
 estimator = train(
     estimator,
-    sample_prior,
-    simulate;
-    m,
-    K = 25_000
+    sample_prior,
+    simulate;
+    # the sample size
+    m,
+    # the number of training examples
+    K = 25_000
 )
 ```

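Once training completes, the estimator can be queried with new data. A hedged sketch of posterior sampling, assuming the `sampleposterior` function provided by NeuralEstimators.jl (check the package documentation for the exact signature in your version):

```julia
# Simulate one "observed" dataset from known parameters, then draw from the
# approximate posterior with the trained network.
θ_true = [-1.0, -0.5, 0.5, 0.3]
data = to_array(rand(LNR(θ_true[1:2], θ_true[3], θ_true[4]), 100))
post_samples = sampleposterior(estimator, data, 1000)  # assumed API: 1000 posterior draws
```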
@@ -233,4 +253,15 @@ Flux.loadmodel!(estimator, model_state)
 ```
 ```@raw html
 </details>
-```
+```
+# References
+
+Heathcote, A., & Love, J. (2012). Linear deterministic accumulator models of simple choice. Frontiers in Psychology, 3, 292.
+
+Radev, S. T., Schmitt, M., Schumacher, L., Elsemüller, L., Pratz, V., Schälte, Y., ... & Bürkner, P. C. (2023). BayesFlow: Amortized Bayesian workflows with neural networks. arXiv preprint arXiv:2306.16015.
+
+Rouder, J. N., Province, J. M., Morey, R. D., Gomez, P., & Heathcote, A. (2015). The lognormal race: A cognitive-process model of choice and latency with desirable psychometric properties. Psychometrika, 80(2), 491-513.
+
+Sainsbury-Dale, M., Zammit-Mangion, A., & Huser, R. (2024). Likelihood-free parameter estimation with neural Bayes estimators. The American Statistician, 78(1), 1-14.
+
+Usher, M., & McClelland, J. L. (2001). The time course of perceptual choice: The leaky, competing accumulator model. Psychological Review, 108(3), 550-592.
+
+Zammit-Mangion, A., Sainsbury-Dale, M., & Huser, R. (2024). Neural methods for amortized inference. Annual Review of Statistics and Its Application, 12.
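As the introduction notes, a key benefit of amortization is that the trained network can be saved and reused. The `Flux.loadmodel!` call in the final hunk suggests the standard Flux serialization pattern; a brief sketch (the file name is illustrative):

```julia
using Flux, JLD2

# Save the trained estimator's weights, then restore them into a freshly
# constructed estimator with the same architecture.
model_state = Flux.state(estimator)
jldsave("estimator_state.jld2"; model_state)

model_state = JLD2.load("estimator_state.jld2", "model_state")
Flux.loadmodel!(estimator, model_state)
```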
