
Commit 521eb5d

update amortized inference examples
1 parent 137e827 commit 521eb5d

8 files changed

Lines changed: 250 additions & 11 deletions


docs/Project.toml

Lines changed: 3 additions & 1 deletion
```diff
@@ -4,7 +4,9 @@ DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
 Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
 Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
 FillArrays = "1a297f60-69ca-5386-bcde-b61e274b549b"
+Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"
 KernelDensity = "5ab0869b-81aa-558d-bb23-cbf5423bbe9b"
+NeuralEstimators = "38f6df31-6b4a-4144-b2af-7ace2da57606"
 ParetoSmooth = "a68b5a21-f429-434e-8bfa-46b447300aac"
 Pigeons = "0eb8d820-af6a-4919-95ae-11206f830c31"
 Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
@@ -26,4 +28,4 @@ Plots = "1.0.0"
 StatsBase = "0.33.0,0.34.0"
 StatsModels = "0.7.0"
 StatsPlots = "0.15.0"
-Turing = "0.33.0,0.34,0.35.0,0.36.0"
+Turing = "0.33.0,0.34,0.35.0,0.36.0, 0.37.0"
```

docs/make.jl

Lines changed: 4 additions & 1 deletion
```diff
@@ -53,7 +53,10 @@ makedocs(
         "Mode Estimation" => "mode_estimation.md",
         "Simple Bayesian Model" => "turing_simple.md",
         "Advanced Model Specification" => "turing_advanced.md",
-        "Hierarchical Models" => "turing_hierarchical.md"
+        "Hierarchical Models" => "turing_hierarchical.md",
+        "Amortized Neural Estimation" => [
+            "Bayesian Parameter Estimation" => "amortized_bayesian_parameter_estimation.md",
+        ]
     ],
     "Model Comparison" => [
         "Bayes Factors" => "bayes_factor.md",
```
docs/src/amortized_bayesian_parameter_estimation.md

Lines changed: 236 additions & 0 deletions (new file)
# Introduction

The purpose of this example is to illustrate Bayesian parameter estimation with a type of neural estimator known as a normalizing flow. Normalizing flows are invertible neural networks that can learn the posterior distribution by learning the mapping between samples from the prior distribution and data simulated from the model.

In the example below, we estimate the parameters of the [lognormal race model](lnr.md) (LNR). In practice, one is unlikely to estimate the parameters of the LNR with neural networks because it has an analytic likelihood function. However, we use the LNR for illustration because it is fast to simulate and its estimation properties are known. You can reveal a copy-and-pastable version of the full code by clicking the ▶ below.
6+
7+
```@raw html
8+
<details>
9+
<summary><b>Show Full Code</b></summary>
10+
```
11+
```julia
12+
using AlgebraOfGraphics
13+
using CairoMakie
14+
using Distributions
15+
using Flux
16+
using NeuralEstimators
17+
using Random
18+
using Plots
19+
using SequentialSamplingModels
20+
21+
Random.seed!(544)
22+
23+
n = 2 # dimension of each data replicate
24+
m = 100 # number of independent replicates
25+
d = 4 # dimension of the parameter vector θ
26+
w = 128 # width of each hidden layer
27+
28+
function sample_prior(K)
29+
ν = rand(Normal(-2, 3), K, 2)
30+
σ = rand(truncated(Normal(1, 3), 0, Inf), K)
31+
τ = rand(Uniform(0.100, 0.300), K)
32+
θ = vcat', σ', τ')
33+
return θ
34+
end
35+
36+
to_array(x) = Float32[x.choice'; x.rt']
37+
simulate(θ, m) = [to_array(rand(LNR(ϑ[1:2], ϑ[3], ϑ[4]), m)) for ϑ eachcol(θ)]
38+
39+
# Approximate distribution
40+
approx_dist = NormalisingFlow(d, 2d)
41+
42+
# Neural network mapping data to summary statistics (of the same dimension used in the approximate distribution)
43+
ψ = Chain(x -> log.(x), Dense(n, w, relu), Dense(w, w, relu))
44+
ϕ = Chain(Dense(w, w, relu), Dense(w, 2d))
45+
network = DeepSet(ψ, ϕ)
46+
47+
# Initialise a neural posterior estimator
48+
estimator = PosteriorEstimator(approx_dist, network)
49+
50+
# Train the estimator
51+
estimator = train(
52+
estimator,
53+
sample_prior,
54+
simulate;
55+
m,
56+
K = 25_000
57+
)
58+
59+
# Assess the estimator
60+
θ_test = sample_prior(1000)
61+
data_test = simulate(θ_test, m)
62+
assessment = assess(estimator, θ_test, data_test; parameter_names = ["ν₁", "ν₂", "σ", "τ"])
63+
bias(assessment)
64+
rmse(assessment)
65+
recovery_plot = AlgebraOfGraphics.plot(assessment)
66+
67+
# perform Bayesian parameter estimation on simulated data
68+
θ = [-1.5, 0, 0.75, 0.250]
69+
data = simulate(θ, m)
70+
post_samples = sampleposterior(estimator, data)
71+
Plots.histogram(
72+
post_samples',
73+
layout = (4, 1),
74+
color = :grey,
75+
norm = true,
76+
leg = false,
77+
grid = false,
78+
xlabel = ["ν₁" "ν₂" "σ" "τ"]
79+
)
80+
vline!([θ'], color = :darkred, linewidth = 2)
81+
```
82+
```@raw html
83+
</details>
84+
```
# Load Dependencies

The first step is to load the dependencies. `NeuralEstimators` and `Flux` are the primary packages for performing Bayesian parameter estimation with normalizing flows. We will also load `AlgebraOfGraphics`, `CairoMakie`, and `Plots` to visualize the parameter recovery and posterior distributions.

```julia
using AlgebraOfGraphics
using CairoMakie
using Distributions
using Flux
using NeuralEstimators
using Random
using Plots
using SequentialSamplingModels
```
In the code block below, we set the seed for the random number generator so that the results are reproducible.

```julia
Random.seed!(544)
```
# Simulation Functions

As previously noted, normalizing flow neural networks learn the mapping between the prior distribution and simulated data. Once that mapping is learned, the network is inverted to allow one to sample from the posterior distribution. We define two functions to generate training data---one to sample from the prior distribution, and another to sample data from the model given a parameter vector drawn from the prior distribution. In the code block below, $K$ samples are generated from each prior and concatenated into a $4 \times K$ array.

```julia
function sample_prior(K)
    ν = rand(Normal(-2, 3), K, 2)
    σ = rand(truncated(Normal(1, 3), 0, Inf), K)
    τ = rand(Uniform(0.100, 0.300), K)
    θ = vcat(ν', σ', τ')
    return θ
end
```
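
As a quick sanity check (illustrative, not part of the original example), each column of the returned array is one draw of $(\nu_1, \nu_2, \sigma, \tau)$; the value 5 below is arbitrary:

```julia
θ = sample_prior(5)
size(θ) # (4, 5): one row per parameter, one column per prior draw
```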

The code block below specifies the function `simulate` to sample simulated data from the model. In this function, $\theta$ is a $4 \times K$ array, with each column representing an independent sample from the prior distribution, and $m$ is the number of trials sampled from the model per sample from the prior. The helper function `to_array` transforms the data into the required format: a $2 \times m$ array in which the first row contains the choice indices and the second row contains the reaction times.

```julia
to_array(x) = Float32[x.choice'; x.rt']
simulate(θ, m) = [to_array(rand(LNR(ϑ[1:2], ϑ[3], ϑ[4]), m)) for ϑ ∈ eachcol(θ)]
```
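
To see the data format concretely, here is a small illustrative call (the sizes 3 and 10 are arbitrary):

```julia
data = simulate(sample_prior(3), 10)
length(data)  # 3: one data set per prior draw
size(data[1]) # (2, 10): row 1 holds choices, row 2 holds reaction times
```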

# Configure Neural Network

In this section, we configure the neural network used to perform Bayesian parameter estimation. At a high level, the neural network has two main components. The first component is the `NormalisingFlow`, which learns the approximate posterior density. The second component is the `DeepSet`, which maps the raw data to the summary statistics underlying that density.

```julia
# Approximate distribution
approx_dist = NormalisingFlow(d, 2d)

# Neural network mapping data to summary statistics (of the same dimension used in the approximate distribution)
ψ = Chain(x -> log.(x), Dense(n, w, relu), Dense(w, w, relu))
ϕ = Chain(Dense(w, w, relu), Dense(w, 2d))
network = DeepSet(ψ, ϕ)

# Initialise a neural posterior estimator
estimator = PosteriorEstimator(approx_dist, network)
```
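
To make the `DeepSet` architecture concrete, the sketch below spells out the forward pass it represents, assuming mean pooling over replicates; the helper `deepset_sketch` and the pooling choice are our assumptions for illustration, not the package's exact implementation:

```julia
using Flux
using Statistics

# Minimal sketch of the DeepSet idea: summarize each replicate with ψ,
# pool across replicates with an order-invariant mean, then map the
# pooled summary through ϕ to the 2d statistics used by the flow.
function deepset_sketch(ψ, ϕ, data) # data is an n × m array
    h = ψ(data)                # w × m: one summary per replicate
    pooled = mean(h, dims = 2) # w × 1: invariant to replicate order
    return ϕ(pooled)           # 2d × 1: learned summary statistics
end
```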

# Train the Neural Network

In the code block below, we train the estimator on $K = 25{,}000$ parameter vectors sampled from the prior, with $m$ data replicates simulated per parameter vector.

```julia
estimator = train(
    estimator,
    sample_prior,
    simulate;
    m,
    K = 25_000
)
```
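
Training behavior can be tuned through keyword arguments to `train`, such as the number of epochs. The keyword name below is taken from the `NeuralEstimators` documentation, but should be verified against the version you have installed:

```julia
# Illustrative sketch: the same call as above, with an explicit epoch budget
estimator = train(
    estimator,
    sample_prior,
    simulate;
    m,
    K = 25_000,
    epochs = 100
)
```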

# Assess the Accuracy of the Neural Network

As shown in the code block below, the package `NeuralEstimators` provides three ways to assess the accuracy of the neural network: `bias`, `rmse`, and scatter plots of the parameter recovery. The parameter recovery plots below indicate that the neural network learned the mapping well.

```julia
θ_test = sample_prior(1000)
data_test = simulate(θ_test, m)
assessment = assess(estimator, θ_test, data_test; parameter_names = ["ν₁", "ν₂", "σ", "τ"])
bias(assessment)
rmse(assessment)
recovery_plot = AlgebraOfGraphics.plot(assessment)
```

![](assets/lnr_parameter_recovery.png)

# Perform Bayesian Parameter Estimation

Now that the neural network has been trained, we can perform Bayesian parameter estimation. In the example below, we simulate data from the model using the parameters defined in the vector $\theta$. The estimator and data are passed to `sampleposterior`, which generates samples from the posterior distribution of the parameters. As expected, the histograms show that the posterior distributions are centered near the true parameter values, displayed as red vertical lines.

```julia
θ = [-1.5, 0, 0.75, 0.150]
data = simulate(θ, m)
post_samples = sampleposterior(estimator, data)
Plots.histogram(
    post_samples',
    layout = (4, 1),
    color = :grey,
    norm = true,
    leg = false,
    grid = false,
    xlabel = ["ν₁" "ν₂" "σ" "τ"]
)
vline!([θ'], color = :darkred, linewidth = 2)
```

![](assets/lnr_posterior.png)
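
The posterior draws can also be summarized numerically. Below is a minimal sketch, assuming `post_samples` is a $4 \times N$ matrix with one row per parameter (consistent with the transpose used in the histogram call above):

```julia
using Statistics

# Posterior mean and 95% credible interval for each parameter
for (i, name) ∈ enumerate(["ν₁", "ν₂", "σ", "τ"])
    draws = post_samples[i, :]
    println(
        name, ": mean = ", round(mean(draws), digits = 3),
        ", 95% interval = ", round.(quantile(draws, [0.025, 0.975]), digits = 3)
    )
end
```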

# Save the Trained Neural Network

The code block below extracts the parameters of the trained neural network with `Flux.state` and saves them with the `@save` macro from `BSON`, so the estimator can be reused in future sessions.

```julia
using BSON: @save
using Flux

model_state = Flux.state(estimator)
@save "lnr_estimator.bson" model_state
```

You can load the trained neural network into a new Julia session with the `@load` macro from `BSON`. In order to successfully reuse the trained neural network, you will need to initialize the neural network before loading the trained parameters. You can reveal a copy-and-pastable version of the full code by clicking the ▶ below.

```@raw html
<details>
<summary><b>Show Full Code</b></summary>
```
```julia
using BSON: @load
using Distributions
using Flux
using NeuralEstimators
using Random
using SequentialSamplingModels

Random.seed!(544)

n = 2 # dimension of each data replicate
m = 100 # number of independent replicates
d = 4 # dimension of the parameter vector θ
w = 128 # width of each hidden layer

# Approximate distribution
approx_dist = NormalisingFlow(d, 2d)

# Neural network mapping data to summary statistics (of the same dimension used in the approximate distribution)
ψ = Chain(x -> log.(x), Dense(n, w, relu), Dense(w, w, relu))
ϕ = Chain(Dense(w, w, relu), Dense(w, 2d))
network = DeepSet(ψ, ϕ)

# Initialise a neural posterior estimator
estimator = PosteriorEstimator(approx_dist, network)

# Load the trained weights
@load "lnr_estimator.bson" model_state
Flux.loadmodel!(estimator, model_state)
```
```@raw html
</details>
```

docs/src/assets/lnr_parameter_recovery.png

221 KB (binary file)

docs/src/assets/lnr_posterior.png

21.9 KB (binary file)

docs/src/index.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -4,6 +4,7 @@
 This package provides a unified interface for simulating and evaluating popular sequential sampling models (SSMs), which integrates with the following packages:

 - [Distributions.jl](https://github.com/JuliaStats/Distributions.jl): functions for probability distributions
+- [NeuralEstimators.jl](https://github.com/msainsburydale/NeuralEstimators.jl): amortized inference using neural networks
 - [Pigeons.jl](http://pigeons.run/dev/): Bayesian parameter estimation and Bayes factors
 - [Plots.jl](https://github.com/JuliaPlots/Plots.jl): extended plotting tools for SSMs
 - [Turing.jl](https://turinglang.org/dev/docs/using-turing/get-started): Bayesian parameter estimation
```

src/LBA.jl

Lines changed: 3 additions & 6 deletions
```diff
@@ -98,8 +98,7 @@ function rand(rng::AbstractRNG, d::AbstractLBA)
 end

 function logpdf(d::AbstractLBA{T, T1}, c, rt) where {T, T1 <: Vector{<:Real}}
-    (; τ, A, k, ν, σ) = d
-    b = A + k
+    (; τ, ν, σ) = d
     LL = 0.0
     rt < τ ? (return -Inf) : nothing
     for i ∈ 1:length(ν)
@@ -115,8 +114,7 @@ function logpdf(d::AbstractLBA{T, T1}, c, rt) where {T, T1 <: Vector{<:Real}}
 end

 function logpdf(d::AbstractLBA{T, T1}, c, rt) where {T, T1 <: Real}
-    (; τ, A, k, ν, σ) = d
-    b = A + k
+    (; τ, ν, σ) = d
     LL = 0.0
     rt < τ ? (return -Inf) : nothing
     for i ∈ 1:length(ν)
@@ -150,8 +148,7 @@ function pdf(d::AbstractLBA{T, T1}, c, rt) where {T, T1 <: Vector{<:Real}}
 end

 function pdf(d::AbstractLBA{T, T1}, c, rt) where {T, T1 <: Real}
-    (; τ, A, k, ν, σ) = d
-    b = A + k
+    (; τ, ν, σ) = d
     den = 1.0
     rt < τ ? (return 1e-10) : nothing
     for i ∈ 1:length(ν)
```

src/type_system.jl

Lines changed: 3 additions & 3 deletions
```diff
@@ -38,7 +38,7 @@ An abstract type for the attentional drift diffusion model.
 abstract type AbstractaDDM <: AbstractDDM end

 """
-    AbstractLBA <: SSM2D
+    AbstractLBA{T, T1} <: SSM2D

 An abstract type for the linear ballistic accumulator model.
 """
@@ -52,14 +52,14 @@ An abstract type for the Wald model.
 abstract type AbstractWald <: SSM1D end

 """
-    AbstractLNR <: SSM2D
+    AbstractLNR{T, T1} <: SSM2D

 An abstract type for the lognormal race model
 """
 abstract type AbstractLNR{T, T1} <: SSM2D end

 """
-    AbstractMLBA <: AbstractLBA
+    AbstractMLBA{T, T1} <: AbstractLBA

 An abstract type for the multi-attribute linear ballistic accumulator
 """
```
