Adapted from Navarro-Brul et al., React. Chem. Eng., 2022
Design of Experiments (DOE) is the theory of devising the optimal set of trials for model-testing experimentation. A DOE package may offer up to four capabilities:
- Generation of the design, e.g., generating factorial designs, Latin hypercubes, etc., on user request.
- Analysis of the design, e.g., comparing the sampling optimality of different designs for the model hypothesis, evaluating the aliasing of factors, etc.
- Analysis of the response, e.g., testing the model and fitting its coefficients. Most open-source DOE packages lack this ability and rely on well-established statistical packages such as statsmodels and scikit-learn.
- Design augmentation, which is the typical pipeline of Active Learning (or Bayesian Optimization): the responses of the early trials are used to suggest a new set of trials that is a promising compromise between exploitation and exploration on the way to an optimum.
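The exploitation/exploration trade-off behind design augmentation can be illustrated with a deliberately simple rule. The sketch below is not Bayesian Optimization and is not taken from any package in this list: the function `suggest_next`, the weight `kappa`, and the response values are all hypothetical. It predicts the response of a candidate from its nearest tested trial and adds a bonus proportional to the distance from that trial.

```python
import math

def suggest_next(tested, responses, candidates, kappa=1.0):
    """Return the candidate with the best exploitation + exploration score.

    Toy rule: the predicted response of a candidate is the response of its
    nearest tested point (exploitation), and the bonus is kappa times the
    distance to that point (exploration).
    """
    def score(x):
        dist, resp = min((math.dist(x, t), r) for t, r in zip(tested, responses))
        return resp + kappa * dist
    return max(candidates, key=score)

tested = [(0.0, 0.0), (1.0, 1.0)]
responses = [0.2, 0.9]                      # hypothetical measured responses
candidates = [(0.1, 0.1), (0.9, 0.8), (0.0, 0.9)]

greedy = suggest_next(tested, responses, candidates, kappa=0.0)   # pure exploitation
curious = suggest_next(tested, responses, candidates, kappa=5.0)  # favors unexplored regions
```

With `kappa=0.0` the rule stays next to the best trial seen so far, while a large `kappa` sends the next trial to the least explored candidate.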
In the following, we list a number of open-source packages that focus on the generation and the analysis of designs.
Don't hesitate to open an issue to report any package (with a reasonable user base) that is missing from this list.
A collection of "classical" designs of experiments.
We refer to the fork pyDOE2, which adds only the Generalized Subset Design (GSD) method, i.e., a fractional factorial for factors with three or more levels, to pyDOE.
- Factorial Designs
  - General Full-Factorial (`fullfact`)
  - 2-level Full-Factorial (`ff2n`)
  - 2-level Fractional Factorial (`fracfact`)
  - Plackett-Burman (`pbdesign`)
  - Generalized Subset Designs (`gsd`)
- Response-Surface Designs
  - Box-Behnken (`bbdesign`)
  - Central-Composite (`ccdesign`)
- Randomized Designs
  - Latin-Hypercube (`lhs`)
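To show what the factorial designs in this list contain, here is a minimal standard-library sketch. It does not call pyDOE2 and is not its implementation; the function names `full_factorial`, `two_level_full`, and `half_fraction` are made up for illustration, and the half fraction assumes the common generator in which the last factor is the product of all the others.

```python
from itertools import product
from math import prod

def full_factorial(levels):
    """All combinations of level indices; levels[i] is the number of levels of factor i."""
    return [list(run) for run in product(*(range(n) for n in levels))]

def two_level_full(k):
    """2-level full factorial for k factors in coded units (-1, +1)."""
    return [list(run) for run in product((-1, 1), repeat=k)]

def half_fraction(k):
    """2^(k-1) fractional factorial: the last factor is the product of the others."""
    return [run + [prod(run)] for run in two_level_full(k - 1)]

print(len(full_factorial([2, 3, 4])))   # 2 * 3 * 4 = 24 runs
print(half_fraction(3))                 # 4 runs instead of the 8 of the full design
```

The half fraction halves the number of runs at the price of aliasing: the third factor cannot be separated from the interaction of the first two.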
Another collection of "classical" designs of experiments.
- Full factorial: `build.full_fact()`
- 2-level fractional factorial: `build.frac_fact_res()`
- Plackett-Burman: `build.plackett_burman()`
- Sukharev grid: `build.sukharev()`
- Box-Behnken: `build.box_behnken()`
- Box-Wilson (Central-composite)
  - with center-faced option: `build.central_composite()` with `face='ccf'` option
  - with center-inscribed option: `build.central_composite()` with `face='cci'` option
  - with center-circumscribed option: `build.central_composite()` with `face='ccc'` option
- Latin hypercube (simple): `build.lhs()`
- Latin hypercube (space-filling): `build.space_filling_lhs()`
- Random k-means cluster: `build.random_k_means()`
- Maximin reconstruction: `build.maximin()`
- Halton sequence based: `build.halton()`
- Uniform random matrix: `build.uniform_random()`
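The three central-composite variants differ only in where the axial ("star") points sit relative to the factorial cube. The sketch below is a plain-Python illustration under stated assumptions: a single center point and the rotatable value alpha = (2^k)^(1/4). The function `central_composite` defined here is hypothetical and is not the package's `build.central_composite()`, whose defaults may differ.

```python
from itertools import product

def central_composite(k, face="ccc"):
    """Central composite design for k factors in coded units, one center point.

    face='ccc': axial points at +/-alpha, outside the factorial cube.
    face='cci': the ccc design shrunk so that every point lies in [-1, 1].
    face='ccf': axial points on the faces of the cube (alpha = 1).
    """
    if face not in ("ccc", "cci", "ccf"):
        raise ValueError("face must be 'ccc', 'cci' or 'ccf'")
    alpha = 1.0 if face == "ccf" else (2 ** k) ** 0.25   # rotatable alpha (assumption)
    corner = 1.0 / alpha if face == "cci" else 1.0
    axial = 1.0 if face == "cci" else alpha
    runs = [[corner * s for s in signs] for signs in product((-1, 1), repeat=k)]
    for i in range(k):
        for s in (-1, 1):
            point = [0.0] * k
            point[i] = s * axial
            runs.append(point)
    runs.append([0.0] * k)                               # center point
    return runs

ccc = central_composite(2, "ccc")
cci = central_composite(2, "cci")
ccf = central_composite(2, "ccf")
```

For two factors each variant has 4 factorial, 4 axial, and 1 center run; only the circumscribed variant leaves the [-1, 1] range, which matters when the factor limits are hard constraints.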
Yet another collection of "classical" designs of experiments.
- Fractional Factorial: `build_factorial(factor_count, run_count)`
- Full Factorial: `build_full_factorial(factor_count)`
- Central Composite: `build_ccd(factor_count, alpha='rotatable', center_points=1)`
- Mixture Simplex Lattice: `build_simplex_lattice(factor_count, model_order=<ModelOrder.quadratic: 2>)`
- Mixture Simplex Centroid: `build_simplex_centroid(factor_count)`
- Optimal Designs: `build_optimal(factor_count, **kwargs)`
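Mixture designs are less familiar than factorials, so a small sketch may help. A {q, m} simplex lattice contains every blend of q components whose proportions are multiples of 1/m and sum to 1. The function `simplex_lattice` below is hypothetical and is not the package's `build_simplex_lattice`; treating the lattice degree m as the counterpart of the model order is an assumption.

```python
from fractions import Fraction
from itertools import product

def simplex_lattice(q, m):
    """{q, m} simplex lattice: proportions are multiples of 1/m that sum to 1."""
    return [
        tuple(Fraction(i, m) for i in combo)
        for combo in product(range(m + 1), repeat=q)
        if sum(combo) == m
    ]

lattice = simplex_lattice(3, 2)
for blend in lattice:
    print([str(x) for x in blend])
```

For three components and degree two, the lattice holds the three pure components and the three 50/50 binary blends.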
Analysis of the design:
- Statistical Power: `f_power(model, design, effect_size, alpha)`
- Alias list: `alias_list(model, design)`
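To make the notion of an alias list concrete, the sketch below finds aliased effects in a 2^(3-1) design built with the generator C = A*B. It is a hand-rolled illustration, not the package's `alias_list`; the helper names `effect_columns` and `aliased_pairs` are invented here, and only main effects and two-factor interactions are considered.

```python
from itertools import combinations, product

# 2^(3-1) fractional factorial with generator C = A*B
design = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]
names = "ABC"

def effect_columns(design, names):
    """Columns of the main effects and of all two-factor interactions."""
    cols = {n: [run[i] for run in design] for i, n in enumerate(names)}
    for (i, a), (j, b) in combinations(enumerate(names), 2):
        cols[a + b] = [run[i] * run[j] for run in design]
    return cols

def aliased_pairs(cols):
    """Pairs of effects whose +/-1 columns are identical or exactly opposite."""
    pairs = []
    for (n1, c1), (n2, c2) in combinations(cols.items(), 2):
        dot = sum(x * y for x, y in zip(c1, c2))
        if abs(dot) == len(c1):
            pairs.append((n1, n2))
    return pairs

aliases = aliased_pairs(effect_columns(design, names))
print(aliases)
```

Each main effect turns out to be aliased with the interaction of the other two factors, which is what a resolution III design implies.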
Collection of algorithms for uniform sampling, and related topics.
- `cube` - Uniform sampling from the unit hypercube
  - `cube.stratify_conventional`: stratification of the unit hypercube
  - `stratify_generalized`: generalized stratification of the unit hypercube
  - `cube.latin_design`: generate a random Latin hypercube design matrix
  - `cube.improved_latin_design`: generate an "improved" Latin hypercube design matrix
  - `cube.rank1_design`: design matrix for a rank-1 lattice
  - `cube.sample_halton`: generate a Halton point set
  - `cube.sample_maximin`: maximize the minimal distance in the unit hypercube with extensions
  - `cube.sample_k_means`: in its default setup, this algorithm converges to a centroidal Voronoi tessellation of the unit hypercube
  - `cube.grid`: create conventional grid in the unit hypercube
- `simplex` - Uniform sampling on the unit simplex
- `polytope` - Uniform sampling from convex polytopes
- `subset` - Select diverse subsets
  - `subset.psa_partition`: partition the data set into the given number of clusters with the part-and-select algorithm
  - `subset.psa_select`: select representative points with the part-and-select algorithm
  - `subset.select_greedy_maximin`: greedily select a subset according to the maximin criterion
  - `subset.select_greedy_maxisum`: greedily select a subset according to the maxisum criterion
Analysis of the design:
- `indicator.solow_polasky_diversity`: Solow-Polasky diversity
- `indicator.weitzman_diversity`: Weitzman diversity
- `indicator.sum_of_dists`: square root of the sum of all pairwise distances
- `indicator.average_inverse_dist`: average inverse distance
- `indicator.separation_dist`: minimal pairwise distance
- `indicator.wmh_index`: quality index of Wahl, Mercadier, and Helbert
- `indicator.sum_of_nn_dists`: sum of nearest-neighbor distances
- `indicator.unanchored_L2_discrepancy`: unanchored L2 discrepancy
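Two of these indicators are simple enough to sketch by hand, together with a Halton point set to evaluate them on. The code below uses only the standard library and is not the package's code: the functions `van_der_corput`, `halton`, `separation_dist`, and `sum_of_nn_dists` are re-implemented here for illustration, and using the first primes as bases is the usual choice rather than something the source states.

```python
import math
from itertools import combinations

def van_der_corput(index, base):
    """Radical inverse of a positive integer in the given base."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value

def halton(n_points, primes=(2, 3)):
    """First n_points of the Halton sequence, one prime base per dimension."""
    return [tuple(van_der_corput(i, p) for p in primes) for i in range(1, n_points + 1)]

def separation_dist(points):
    """Minimal pairwise Euclidean distance (larger means better spread)."""
    return min(math.dist(p, q) for p, q in combinations(points, 2))

def sum_of_nn_dists(points):
    """Sum over all points of the distance to their nearest neighbor."""
    return sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    )

points = halton(16)
print(separation_dist(points), sum_of_nn_dists(points))
```

Both indicators reward designs whose points keep away from each other, so they can be used to rank candidate designs of the same size.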
Definitive Screening Design - GitHub
Implementation of the DSD in Python: a small design aimed at screening all factors for second-order models.
dsd.generate(n_num, n_cat, factors_dict=None, method='dsd', min_13=True, n_fake_factors=0)
Analysis of the design:
dsd.analysis.get_map_of_correlations(X, effects)
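For readers who want to see why a DSD is so small, the sketch below builds one for six factors from a conference matrix C by stacking C, its fold-over -C, and a center run. This is the textbook conference-matrix construction, written with the standard library only; whether `dsd.generate` proceeds the same way is not stated in the source, and the helper names are hypothetical. The Paley construction used for C assumes a prime q with q % 4 == 1.

```python
def paley_conference_matrix(q=5):
    """Symmetric conference matrix of order q + 1 for a prime q with q % 4 == 1."""
    residues = {(x * x) % q for x in range(1, q)}

    def chi(a):
        a %= q
        return 0 if a == 0 else (1 if a in residues else -1)

    matrix = [[0] + [1] * q]
    for i in range(q):
        matrix.append([1] + [chi(i - j) for j in range(q)])
    return matrix

def definitive_screening_design(conference):
    """Stack C, -C and one center run: 2m + 1 runs for m three-level factors."""
    m = len(conference)
    fold = [[-x for x in row] for row in conference]
    return conference + fold + [[0] * m]

dsd_runs = definitive_screening_design(paley_conference_matrix(5))
for run in dsd_runs:
    print(run)
```

The result has 13 runs for 6 factors. Because of the fold-over, every main-effect column is orthogonal to the other main effects and to all quadratic and two-factor interaction columns.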
Package focused on the Latin Hypercube Design (LHD), to generate and analyze several variants of this design.
- Classical Latin hypercube: `pyLHD.LatinHypercube(size, seed, scramble)`
Analysis of the design:
Average Absolute Correlation, Maximum Absolute Correlation, Maximum Projection Criterion (Joseph 2015), Coverage measure, Inter-site Distance, Discrepancy, MaxiMin, Mesh Ratio, Phi_p Criterion.
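As a rough illustration of what a Latin hypercube is and how one of the criteria above can be computed, here is a standard-library sketch. It does not use pyLHD: the functions `latin_hypercube`, `pearson`, and `max_abs_correlation` are written for this example, and the exact definitions used by the package may differ.

```python
import random

def latin_hypercube(n, d, seed=None):
    """n points in [0, 1)^d with exactly one point per stratum in every dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(d):
        strata = list(range(n))
        rng.shuffle(strata)                                   # random stratum order
        columns.append([(s + rng.random()) / n for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n)]

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def max_abs_correlation(design):
    """Largest absolute correlation between any two columns of the design."""
    cols = list(zip(*design))
    return max(
        abs(pearson(cols[i], cols[j]))
        for i in range(len(cols))
        for j in range(i + 1, len(cols))
    )

lhd = latin_hypercube(10, 3, seed=0)
print(max_abs_correlation(lhd))
```

A random Latin hypercube guarantees good one-dimensional coverage but not low correlation between factors, which is why criteria like this one are used to compare or optimize variants.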
BoFire is a Bayesian Optimization Framework Intended for Real Experiments. It also provides features for generating a DoE when starting from scratch.
- D-, A-, G-, E-, and K-optimal designs in a constrained design space
- Space filling in a constrained design space
Analysis of the design:
bofire.utils.doe.get_confounding_matrix()
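The idea of space filling under constraints can be sketched without any framework. The code below is not BoFire's strategy and uses none of its API: it simply rejects random candidates that violate a constraint and then picks points greedily by the maximin criterion. The function `constrained_space_filling` and the example constraint are hypothetical, and the loop assumes the constraint is feasible, since it would otherwise never terminate.

```python
import math
import random

def constrained_space_filling(n_points, constraint, dim=2, n_candidates=500, seed=0):
    """Greedy maximin selection among random candidates that satisfy `constraint`."""
    rng = random.Random(seed)
    candidates = []
    while len(candidates) < n_candidates:          # rejection sampling
        x = tuple(rng.random() for _ in range(dim))
        if constraint(x):
            candidates.append(x)
    chosen = [candidates.pop(0)]
    while len(chosen) < n_points:
        # take the candidate farthest from its nearest already chosen point
        best = max(candidates, key=lambda c: min(math.dist(c, p) for p in chosen))
        candidates.remove(best)
        chosen.append(best)
    return chosen

def feasible(x):
    """Hypothetical mixture-like constraint: x1 + x2 <= 1."""
    return x[0] + x[1] <= 1.0

trials = constrained_space_filling(8, feasible)
```

Rejection sampling becomes inefficient when the feasible region is a small part of the box, which is one reason dedicated tools treat constrained designs as an optimization problem instead.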
The Orthogonal Array package contains functionality to generate and analyse orthogonal arrays, optimal designs and conference designs.
- Generate (`oapackage.arraydata_t()`) and extend (`oapackage.extend_array()`) orthogonal arrays
- Conference designs (`oapackage.conference_t()`)
- D-Efficient optimized design (`oapackage.Doptimize()`)
Analysis of the design:
- D-, Ds-, A-, E- efficiency of the design (`.Defficiency()`, `.DsEfficiency()`, `.Aefficiency()`, `.Eefficiency()`)
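As a hint of what a D-efficiency measures, the sketch below computes det(X'X)^(1/p) / n for a two-level design. It is written from the common textbook definition and does not call the package: the model matrix (intercept plus main effects only) is an assumption, the package may use a different model, and the functions `determinant` and `d_efficiency` are hypothetical helpers.

```python
from itertools import product

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n, det = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0                      # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def d_efficiency(design):
    """det(X'X)^(1/p) / n for an intercept + main-effects model matrix X."""
    x = [[1] + list(run) for run in design]
    n, p = len(x), len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(p)] for i in range(p)]
    return max(determinant(xtx), 0.0) ** (1.0 / p) / n

full = list(product((-1, 1), repeat=2))               # orthogonal 2^2 design
skewed = [(-1, -1), (-1, 1), (1, -1), (-1, -1)]       # one corner replaced by a repeat
print(d_efficiency(full), d_efficiency(skewed))
```

The orthogonal design reaches an efficiency of 1 under this model, while replacing one corner with a repeated run lowers it, which is the kind of comparison these efficiency methods are meant to support.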