Mathlas repository

doe package

The doe package contains modules related to Design of Experiments (DoE, for short). DoE plans how a limited source of information is queried so that each query yields as much useful information as possible. It can be applied to physical experiments or numerical simulations (among other fields), where obtaining information can be expensive or slow and careful planning of the experiments makes the output more useful.


Latin Hypercube Sampling [1] (LHC) is a method of placing points in an n-dimensional space that tries to maximize space occupation. To do so, the parameter space is divided into an equispaced grid and points are placed on the grid nodes without using the same coordinate value more than once per component. This means that, in an LHC sampling of a 2D space divided into an \(n\times n\) grid, once a point is placed at \(\left(\overline{x_i}, \overline{x_j}\right)\), no other point can have \(x_1=\overline{x_i}\) or \(x_2=\overline{x_j}\).
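The one-value-per-component rule can be illustrated with a minimal sketch. It is not the package's own code: the names `lhc_sample` and `is_latin_hypercube` are hypothetical, and points are represented by integer grid indices rather than physical coordinates.

```python
import random

def lhc_sample(n_points, n_dims, rng=None):
    """Return n_points tuples of grid indices forming a Latin hypercube.

    Hypothetical helper: each dimension gets its own random permutation
    of the indices 0..n_points-1, so no index repeats within a dimension.
    """
    rng = rng or random.Random()
    columns = []
    for _ in range(n_dims):
        perm = list(range(n_points))
        rng.shuffle(perm)
        columns.append(perm)
    # Transpose the per-dimension columns into one tuple per point.
    return list(zip(*columns))

def is_latin_hypercube(points):
    """Check that every grid index appears exactly once in each dimension."""
    n = len(points)
    return all(sorted(col) == list(range(n)) for col in zip(*points))

points = lhc_sample(5, 2, random.Random(0))
print(points, is_latin_hypercube(points))
```

Mapping an index `k` to a coordinate in an interval `[lo, hi]` would then be a matter of scaling, for example `lo + k * (hi - lo) / (n_points - 1)`.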

The most obvious problem with this approach is that the following configuration, where all the sampled points have been placed on a diagonal of the grid, is a perfectly valid LHC distribution of points, but it does not explore the space very well:

Degenerate LHC configuration where all sampled points have been placed on the diagonal of the grid

Our solution for consistently obtaining good-quality LHC distributions involves creating \(n_{seeds}\) initial sets of points and modifying each set up to \(n_{iter}\) times. We then compute a metric for how well the points are spread out in each of the distributions and return the best one. The metrics that can be used include the harmonic mean of the distances between points and the minimum distance between them.
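The seed-and-improve procedure can be sketched as follows. The text does not say how each set is modified, so this sketch assumes a simple move: swap one coordinate between two points (which preserves the LHC property) and keep the swap only if the metric does not get worse. All function names are hypothetical, and at least two points are assumed.

```python
import itertools
import math
import random

def min_distance(points):
    """Smallest Euclidean distance between any pair of points."""
    return min(math.dist(p, q) for p, q in itertools.combinations(points, 2))

def harmonic_mean_distance(points):
    """Harmonic mean of all pairwise Euclidean distances."""
    dists = [math.dist(p, q) for p, q in itertools.combinations(points, 2)]
    return len(dists) / sum(1.0 / d for d in dists)

def random_lhc(n_points, n_dims, rng):
    """Random LHC seed: one permutation of the grid indices per dimension."""
    columns = [rng.sample(range(n_points), n_points) for _ in range(n_dims)]
    return [list(p) for p in zip(*columns)]

def optimized_lhc(n_points, n_dims, n_seeds=10, n_iter=200,
                  metric=min_distance, seed=None):
    """Return the best of n_seeds LHC sets, each improved for n_iter moves."""
    rng = random.Random(seed)
    best, best_score = None, -math.inf
    for _ in range(n_seeds):
        points = random_lhc(n_points, n_dims, rng)
        score = metric(points)
        for _ in range(n_iter):
            # Assumed move: swap one coordinate between two distinct points.
            i, j = rng.sample(range(n_points), 2)
            d = rng.randrange(n_dims)
            points[i][d], points[j][d] = points[j][d], points[i][d]
            new_score = metric(points)
            if new_score >= score:
                score = new_score
            else:
                # The swap made the spread worse, so undo it.
                points[i][d], points[j][d] = points[j][d], points[i][d]
        if score > best_score:
            best, best_score = [tuple(p) for p in points], score
    return best, best_score

best, score = optimized_lhc(6, 2, n_seeds=3, n_iter=100, seed=1)
diagonal = [(i, i) for i in range(6)]
print(score, min_distance(diagonal))
```

On an integer grid, the diagonal configuration has a minimum distance of \(\sqrt{2}\), the smallest value any 2D LHC can have, so every set returned by this sketch scores at least as well under that metric. Passing `metric=harmonic_mean_distance` selects the other metric mentioned above.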

The following plot shows the initial LHC seed (on the left) and the optimal LHC distribution (on the right).

Initial LHC seed (left) and optimal LHC distribution (right)

Our implementation also includes a feature that creates point distributions containing a given set of points. Of course, these will not be LHC distributions, but they are created following the same idea.


Creates a distribution of points that fills the space using a global approach. Each point repels all the others, and the resulting transient is integrated until a stationary solution is found.
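A minimal sketch of this idea is shown below. Everything beyond "points repel each other until they stop moving" is an assumption made for illustration: the unit-hypercube domain, the inverse-square force law, the explicit Euler step with a capped displacement, and the name `repel_points` are not taken from the package.

```python
import math
import random

def repel_points(n_points, n_dims, step=0.01, max_move=0.05,
                 tol=1e-6, max_steps=2000, seed=None):
    """Spread points in the unit hypercube by mutual repulsion (sketch)."""
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(n_dims)] for _ in range(n_points)]
    for _ in range(max_steps):
        # Accumulate the inverse-square repulsion acting on every point.
        forces = [[0.0] * n_dims for _ in range(n_points)]
        for i in range(n_points):
            for j in range(i + 1, n_points):
                delta = [a - b for a, b in zip(pts[i], pts[j])]
                dist = max(math.hypot(*delta), 1e-9)  # avoid division by zero
                for d in range(n_dims):
                    f = delta[d] / dist ** 3
                    forces[i][d] += f
                    forces[j][d] -= f
        # Explicit Euler step, capped in size and clamped to the domain.
        largest = 0.0
        for p, force in zip(pts, forces):
            for d in range(n_dims):
                move = max(-max_move, min(max_move, step * force[d]))
                new = min(1.0, max(0.0, p[d] + move))
                largest = max(largest, abs(new - p[d]))
                p[d] = new
        if largest < tol:
            # No point moved appreciably: treat the state as stationary.
            break
    return pts

pts = repel_points(4, 2, seed=0)
print(pts)
```

The loop stops either when no coordinate changes by more than `tol` in a step or after `max_steps` iterations, so the returned set is only approximately stationary in the second case.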


1. McKay, M. D., Beckman, R. J. & Conover, W. J. A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code. Technometrics 21, 239 (1979).