
Commit f4316bb

add posterior distribution example for neural estimation
2 parents 521eb5d + 6f537f0

9 files changed: 543 additions & 18 deletions

.github/workflows/CI.yml

Lines changed: 2 additions & 2 deletions
@@ -16,7 +16,7 @@ jobs:
       matrix:
         version:
           - '1.11' # Replace this with the minimum Julia version that your package supports. E.g. if your package requires Julia 1.5 or higher, change this to '1.5'.
-          #- 'nightly'
+          - 'nightly'
         os:
           - ubuntu-latest
         arch:
@@ -59,4 +59,4 @@ jobs:
       - run: julia --project=docs docs/make.jl
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-          DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }}
+          DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }}

Project.toml

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 name = "SequentialSamplingModels"
 uuid = "0e71a2a6-2b30-4447-8742-d083a85e82d1"
 authors = ["itsdfish"]
-version = "0.12.2"
+version = "0.12.1"

 [deps]
 Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"

docs/make.jl

Lines changed: 2 additions & 1 deletion
@@ -55,8 +55,9 @@ makedocs(
             "Advanced Model Specification" => "turing_advanced.md",
             "Hierarchical Models" => "turing_hierarchical.md",
             "Amortized Neural Estimation" => [
+                "Point Estimation" => "amortized_point_estimation.md",
                 "Bayesian Parameter Estimation" => "amortized_bayesian_parameter_estimation.md",
-            ]
+            ],
         ],
         "Model Comparison" => [
             "Bayes Factors" => "bayes_factor.md",

docs/src/amortized_bayesian_parameter_estimation.md

Lines changed: 42 additions & 11 deletions
@@ -1,8 +1,14 @@
 # Introduction

-The purpose of this example is to illustrate Bayesian parameter estimation with a neural estimator called normalizing flows. Normalizing flows are a special type of invertible neural network which can learn the posterior distribution by learning the mapping between samples from the prior distribution and simulated data generated from the model.
+The purpose of this example is to illustrate how to perform Bayesian parameter estimation with a neural parameter estimator. Neural parameter estimation learns the mapping between simulated data and the parameters of a model (Zammit-Mangion et al., 2024; Sainsbury-Dale et al., 2024; Radev et al., 2023). It constitutes a method of amortized inference, whereby a large upfront computational cost is incurred during training to enable rapid parameter estimation with the trained neural network. One benefit of amortized inference is that the trained neural network can be saved and reused to estimate parameters for multiple datasets. In this example, we use a neural estimator called a normalizing flow: a special type of invertible neural network that learns the posterior distribution by learning the mapping between parameters and the corresponding simulated data.

-In the example below, we estimate the parameters of the [lognormal race model](lnr.md) (LNR; ). In pratice, one is unlikely to estimate the parameters of the LRN with neural networks because it has an analytic likelihood function. However, we use the LNR for illustration because it is fast to stimulate and the estimation properties of the LRN are known. You can reveal copy-and-pastable version of the full code by clicking the ▶ below.
+## Overview
+
+In the example below, we estimate the parameters of the [lognormal race model](lnr.md) (LNR; Heathcote & Love, 2012; Rouder et al., 2015) with the package [NeuralEstimators.jl](https://msainsburydale.github.io/NeuralEstimators.jl/dev/). We will use a normalizing flow network to estimate the full posterior distribution of the LNR parameters. Generally speaking, neural parameter estimation is most useful for models with an intractable likelihood function, such as the leaky competing accumulator (Usher & McClelland, 2001); however, some of that model's key parameters are notoriously difficult to recover. We therefore use the LNR instead, because its parameters have good estimation properties and can be easily recovered (Rouder et al., 2015).
+
+## Full Code
+
+You can reveal a copy-and-pasteable version of the full code by clicking the ▶ below.

 ```@raw html
 <details>
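For readers new to normalizing flows, the mapping described in the introduction can be summarized with the change-of-variables formula. A sketch in our own notation, not taken from the package: an invertible map $f_\phi$ transports a simple base density $p_Z$ onto an approximate posterior,

```math
q_\phi(\theta \mid y) = p_Z\big(f_\phi^{-1}(\theta; y)\big) \, \big| \det J_{f_\phi^{-1}}(\theta; y) \big|,
```

where $y$ denotes the observed data and $J$ the Jacobian of the inverse map. Sampling from the approximate posterior then amounts to drawing $z \sim p_Z$ and computing $\theta = f_\phi(z; y)$.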
@@ -14,8 +20,8 @@ using CairoMakie
 using Distributions
 using Flux
 using NeuralEstimators
-using Random
 using Plots
+using Random
 using SequentialSamplingModels

 Random.seed!(544)
@@ -93,8 +99,8 @@ using CairoMakie
 using Distributions
 using Flux
 using NeuralEstimators
-using Random
 using Plots
+using Random
 using SequentialSamplingModels
 ```

@@ -105,7 +111,11 @@ Random.seed!(544)

 # Simulation Functions

-As previously noted, normalizing flow neural networks learn the mapping between the prior distribution and simulated data. Once that mapping is learned, the network is inverted to allow one to sample from posterior distribution. We define two functions to generate training data---one to sample from the prior distribution, and another to sample data from the model, given a sampled parameter vector from the prior distribution. In the code block below, the $K$ samples are generated from each prior and concatonated into a $4 \times K$ array.
+As previously noted, normalizing flow neural networks learn the mapping between the prior distribution and simulated data. Once that mapping is learned, the network is inverted to allow one to sample from the posterior distribution. We define two functions to generate training data---one to sample from the prior distribution, and another to sample data from the model, given a sampled parameter vector from the prior distribution.
+
+## Sample from Prior Distribution
+
+In the code block below, $K$ samples are generated from each prior and concatenated into a $4 \times K$ array.

 ```julia
 function sample_prior(K)
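The body of `sample_prior` falls outside the lines shown in these hunks. For orientation, a minimal sketch of such a function is given below; the specific priors are illustrative assumptions, not the ones used in the actual example:

```julia
using Distributions

# Hypothetical priors over the four LNR parameters:
# ν₁, ν₂ (log-space drift means), σ (log-space standard deviation), τ (non-decision time).
function sample_prior(K)
    ν = rand(Normal(0, 1), 2, K)                        # 2 × K drift means
    σ = rand(truncated(Normal(0, 1); lower = 0), 1, K)  # 1 × K positive standard deviations
    τ = rand(Uniform(0.1, 0.5), 1, K)                   # 1 × K non-decision times
    return Float32.(vcat(ν, σ, τ))                      # 4 × K array: one column per prior draw
end
```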
@@ -117,6 +127,8 @@ function sample_prior(K)
 end
 ```

+## Sample from Model
+
 The code block below specifies the function `simulate` to sample simulated data from the model. In this function, $\theta$ is a $4 \times K$ array, with each column representing an independent sample from the prior distribution, and $m$ is the number of trials sampled from the model per sample from the prior. The helper function `to_array` transforms the data into the required format: an $m \times 2$ array in which the first column consists of choice indices, and the second column consists of reaction times.

 ```julia
@@ -125,7 +137,11 @@ simulate(θ, m) = [to_array(rand(LNR(ϑ[1:2], ϑ[3], ϑ[4]), m)) for ϑ ∈ each
 ```
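The helper `to_array` is likewise not shown in this diff. A hypothetical sketch consistent with the description above, assuming `rand(::LNR, m)` returns a NamedTuple with `choice` and `rt` fields, as is typical for SequentialSamplingModels.jl:

```julia
# Convert simulated (choice, rt) data into the required m × 2 array:
# column 1 holds choice indices, column 2 holds reaction times.
to_array(data) = hcat(Float32.(data.choice), Float32.(data.rt))
```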
 # Configure Neural Network

-In this section, we will configure the neural network to perform Bayesian parameter estimation. At a high level, the neural network has two main components. The first component is the `NormalisingFlow`, which learns the density of the model. The second component is the `DeepSet` which learns the summary statistics undelying the distributions.
+In this section, we will configure the neural network to perform Bayesian parameter estimation. At a high level, the neural network has two primary components. The first component is the `DeepSet` neural network, which compresses the data by learning summary statistics describing the distribution of the data. The second component is an invertible neural network called a `NormalisingFlow`. A normalizing flow transforms a set of simple base distributions to approximate a complex distribution (see below). Importantly, because normalizing flows are invertible, they can be used to sample from the posterior distribution.
+
+![](https://uvadlc-notebooks.readthedocs.io/en/latest/_images/normalizing_flow_layout.png)
+*The sequential transformation process of a normalizing flow. [Credit](https://uvadlc-notebooks.readthedocs.io/en/latest/tutorial_notebooks/tutorial11/NF_image_modeling.html).*
+
 ```julia
 # Approximate distribution
 approx_dist = NormalisingFlow(d, 2d)
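The `DeepSet` summary network referenced here is constructed outside the lines shown. A minimal sketch of how one might be assembled with Flux is given below; the layer widths and the `2d`-dimensional summary output are illustrative assumptions chosen to match `NormalisingFlow(d, 2d)` above:

```julia
using Flux, NeuralEstimators

d = 4    # number of LNR parameters
n = 2    # dimension of a single observation (choice, rt)
w = 128  # hypothetical hidden-layer width

# ψ embeds each observation; ϕ maps the pooled embedding to 2d summary statistics.
ψ = Chain(Dense(n, w, relu), Dense(w, w, relu))
ϕ = Chain(Dense(w, w, relu), Dense(w, 2d))
network = DeepSet(ψ, ϕ)
```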
@@ -141,13 +157,17 @@ estimator = PosteriorEstimator(approx_dist, network)

 # Train the Neural Network

+Next, to train the neural estimator, we pass the `estimator`, the function `sample_prior`, and the function `simulate` to the function `train`.
+
 ```julia
 estimator = train(
     estimator,
-    sample_prior,
-    simulate;
-    m,
-    K = 25_000
+    sample_prior,
+    simulate;
+    # the sample size
+    m,
+    # the number of training examples
+    K = 25_000
 )
 ```

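Once training completes, the estimator can be queried with new data. A hedged sketch of posterior sampling, assuming the `sampleposterior` function provided by NeuralEstimators.jl (check the package documentation for the exact signature in your version):

```julia
# Simulate one "observed" dataset from known parameters, then draw from the
# approximate posterior with the trained network.
θ_true = [-1.0, -0.5, 0.5, 0.3]
data = to_array(rand(LNR(θ_true[1:2], θ_true[3], θ_true[4]), 100))
post_samples = sampleposterior(estimator, data, 1000)  # assumed API: 1000 posterior draws
```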
@@ -233,4 +253,15 @@ Flux.loadmodel!(estimator, model_state)
 ```
 ```@raw html
 </details>
-```
+```
+# References
+
+Heathcote, A., & Love, J. (2012). Linear deterministic accumulator models of simple choice. Frontiers in Psychology, 3, 292.
+
+Radev, S. T., Schmitt, M., Schumacher, L., Elsemüller, L., Pratz, V., Schälte, Y., ... & Bürkner, P. C. (2023). BayesFlow: Amortized Bayesian workflows with neural networks. arXiv preprint arXiv:2306.16015.
+
+Rouder, J. N., Province, J. M., Morey, R. D., Gomez, P., & Heathcote, A. (2015). The lognormal race: A cognitive-process model of choice and latency with desirable psychometric properties. Psychometrika, 80(2), 491-513.
+
+Sainsbury-Dale, M., Zammit-Mangion, A., & Huser, R. (2024). Likelihood-free parameter estimation with neural Bayes estimators. The American Statistician, 78(1), 1-14.
+
+Usher, M., & McClelland, J. L. (2001). The time course of perceptual choice: The leaky, competing accumulator model. Psychological Review, 108(3), 550-592.
+
+Zammit-Mangion, A., Sainsbury-Dale, M., & Huser, R. (2024). Neural methods for amortized inference. Annual Review of Statistics and Its Application, 12.
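As the introduction notes, a key benefit of amortization is that the trained network can be saved and reused. The `Flux.loadmodel!` call in the final hunk suggests the standard Flux serialization pattern; a brief sketch (the file name is illustrative):

```julia
using Flux, JLD2

# Save the trained estimator's weights, then restore them into a freshly
# constructed estimator with the same architecture.
model_state = Flux.state(estimator)
jldsave("estimator_state.jld2"; model_state)

model_state = JLD2.load("estimator_state.jld2", "model_state")
Flux.loadmodel!(estimator, model_state)
```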
