A multivariate central limit theorem for randomized orthogonal array sampling designs in computer experiments


📝 Original Info

  • Title: A multivariate central limit theorem for randomized orthogonal array sampling designs in computer experiments
  • ArXiv ID: 0708.0656
  • Date: 2007-08-05
  • Authors: Wei-Liem Loh

📝 Abstract

Let $f:[0,1)^d \to {\mathbb R}$ be an integrable function. An objective of many computer experiments is to estimate $\int_{[0,1)^d} f(x)\,dx$ by evaluating $f$ at a finite number of points in $[0,1)^d$. There is a design issue in the choice of these points, and a popular choice is via the use of randomized orthogonal arrays. This article proves a multivariate central limit theorem for a class of randomized orthogonal array sampling designs [Owen (1992a)] as well as for a class of OA-based Latin hypercubes [Tang (1993)].

📄 Full Content

Let $X$ be a random vector uniformly distributed on the $d$-dimensional unit hypercube $[0,1)^d$, and let $f:[0,1)^d \to {\mathbb R}$ be an integrable function. The quantity to be estimated is

$$\mu = E f(X) = \int_{[0,1)^d} f(x)\,dx. \qquad (1)$$

For definiteness, let $n$, $d$, $q$ and $t$ be positive integers such that $t \le d$. An orthogonal array of strength $t$ is a matrix of $n$ rows and $d$ columns with elements taken from the set of symbols $\{0, 1, \ldots, q-1\}$ such that in any $n \times t$ submatrix, each of the $q^t$ possible rows occurs the same number of times. The class of all such arrays is denoted by $OA(n, d, q, t)$.
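The definition of strength can be checked mechanically. The following is a minimal sketch, not taken from the paper: `bose_oa` and `has_strength` are hypothetical helper names, and the construction assumes that $q$ is prime and $d \le q + 1$ (a standard modular-arithmetic construction of an array in $OA(q^2, d, q, 2)$).

```python
from collections import Counter
from itertools import combinations, product

import numpy as np


def bose_oa(q, d):
    """Hypothetical helper: an array in OA(q^2, d, q, 2), assuming q is prime and d <= q + 1.

    Rows are indexed by pairs (i, j) in {0, ..., q-1}^2. The columns are
    i, j, and (i + m*j) mod q for m = 1, ..., q - 1, truncated to d columns.
    """
    i, j = np.divmod(np.arange(q * q), q)
    cols = [i, j] + [(i + m * j) % q for m in range(1, q)]
    return np.stack(cols[:d], axis=1)


def has_strength(A, q, t):
    """Check that in every n x t submatrix each of the q^t possible rows occurs equally often."""
    n, d = A.shape
    if n % q**t != 0:
        return False
    expected = n // q**t
    for cols in combinations(range(d), t):
        counts = Counter(map(tuple, A[:, list(cols)].tolist()))
        for row in product(range(q), repeat=t):
            if counts.get(row, 0) != expected:
                return False
    return True


A = bose_oa(q=5, d=4)
print(A.shape)                      # 25 rows, 4 columns
print(has_strength(A, q=5, t=2))    # every pair of columns shows each of the 25 pairs once
print(has_strength(A, q=5, t=3))    # 25 rows cannot balance 125 triples
```

With $n = q^t$ rows, as in the designs considered here, "the same number of times" means exactly once.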

Comprehensive accounts of orthogonal arrays can be found in the books by Raghavarao (1971) and Hedayat, Sloane and Stufken (1999). Owen (1992a), (1994) and Tang (1993) independently proposed the use of randomized orthogonal arrays in computer experiment sampling designs. The main attraction of these designs is that they, in contrast to simple random sampling, stratify on all $t$-variate margins simultaneously. A class of randomized orthogonal array sampling designs proposed by Owen (1992a) is as follows. Let (a) $A \in OA(q^t, d, q, t)$ where $a_{i,j}$ denotes the $(i, j)$th element of $A$, (b) $\pi_1, \ldots, \pi_d$ be random permutations of $\{0, \ldots, q-1\}$, each uniformly distributed on all the $q!$ possible permutations, (c) $\{U_{i,j} : i = 1, \ldots, q^t, j = 1, \ldots, d\}$ be $[0, 1)$ uniform random variables, and (d) all the $U_{i,j}$'s and $\pi_k$'s be independent.

We randomize the symbols of $A$ by applying the permutation $\pi_j$ to the $j$th column of $A$, $j = 1, \ldots, d$. This gives us another orthogonal array $A^*$ such that its $(i, j)$th element satisfies $a^*_{i,j} = \pi_j(a_{i,j})$. An orthogonal array based sample of size $q^t$ (taken from $[0, 1)^d$) is defined to be $\{X_1, \ldots, X_{q^t}\}$ where for $i = 1, \ldots, q^t$, $X_i = (X_{i,1}, \ldots, X_{i,d})'$ and

$$X_{i,j} = \frac{a^*_{i,j} + U_{i,j}}{q}, \quad \forall j = 1, \ldots, d. \qquad (2)$$
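The randomization just described can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the author's code: `bose_oa` and `oa_sample` are hypothetical names, and the base array assumes $q$ prime, $t = 2$ and $d \le q + 1$.

```python
import numpy as np


def bose_oa(q, d):
    """Hypothetical helper: an array in OA(q^2, d, q, 2), assuming q is prime and d <= q + 1."""
    i, j = np.divmod(np.arange(q * q), q)
    cols = [i, j] + [(i + m * j) % q for m in range(1, q)]
    return np.stack(cols[:d], axis=1)


def oa_sample(A, q, rng):
    """Randomized orthogonal array sample: X[i, j] = (pi_j(A[i, j]) + U[i, j]) / q."""
    n, d = A.shape
    A_star = np.empty_like(A)
    for j in range(d):
        pi_j = rng.permutation(q)       # uniform random permutation of {0, ..., q-1}
        A_star[:, j] = pi_j[A[:, j]]    # a*_{i,j} = pi_j(a_{i,j})
    U = rng.random((n, d))              # independent U[0, 1) variables
    return (A_star + U) / q


rng = np.random.default_rng(0)
q, d = 7, 3
X = oa_sample(bose_oa(q, d), q, rng)

# Stratification on bivariate margins: for each pair of coordinates,
# every one of the q x q cells of side 1/q holds exactly one point.
cells = np.floor(X * q).astype(int)
for j, k in [(0, 1), (0, 2), (1, 2)]:
    print(j, k, len({(a, b) for a, b in zip(cells[:, j], cells[:, k])}))
```

Permuting symbols within a column keeps the array orthogonal, which is why the cell counts are unaffected by the choice of random seed.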

For $t \ge 2$, Tang (1993) observed that the above sampling designs may not stratify well on $s$-variate margins if $s < t$. He suggested modified designs that stratify on $t$-variate margins as well as 1-variate margins simultaneously. He called these designs OA-based Latin hypercubes. Finally, Owen (1997a), (1997b), in a series of articles, proposed the use of scrambled nets. Given $t \in \mathbb{Z}^+$, the scrambled nets stratify on $s$-variate margins whenever $t/s$ is a positive integer.

A class of OA-based Latin hypercubes can be constructed as follows. Let $A \in OA(q^t, d, q, t)$. As before, we randomize its symbols to obtain the orthogonal array $A^*$. Then for each column of $A^*$, we replace the $q^{t-1}$ positions with entry $k$ by a random permutation (with each such permutation having an equal probability of being chosen) of $\{kq^{t-1}, kq^{t-1} + 1, \ldots, (k + 1)q^{t-1} - 1\}$, for all $k = 0, \ldots, q-1$. After the replacement is done for all $d$ columns of $A^*$, the newly obtained matrix, say $A^{**}$, satisfies $A^{**} \in OA(q^t, d, q^t, 1)$.
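The replacement step can be sketched as follows. This is a minimal illustration, assuming $q$ prime and $t = 2$ for the base array; `bose_oa` and `oa_latin_array` are hypothetical names, not identifiers from the paper.

```python
import numpy as np


def bose_oa(q, d):
    """Hypothetical helper: an array in OA(q^2, d, q, 2), assuming q is prime and d <= q + 1."""
    i, j = np.divmod(np.arange(q * q), q)
    cols = [i, j] + [(i + m * j) % q for m in range(1, q)]
    return np.stack(cols[:d], axis=1)


def oa_latin_array(A, q, t, rng):
    """Build A** from A: randomize symbols, then spread each symbol k over its block of levels."""
    n, d = A.shape                      # n = q^t
    block = q ** (t - 1)                # number of positions holding each symbol in a column
    A2 = np.empty_like(A)
    for j in range(d):
        col = rng.permutation(q)[A[:, j]]           # column j of A*
        for k in range(q):
            idx = np.flatnonzero(col == k)          # the q^(t-1) positions with entry k
            # random permutation of {k*q^(t-1), ..., (k+1)*q^(t-1) - 1}
            A2[idx, j] = k * block + rng.permutation(block)
    return A2


rng = np.random.default_rng(0)
q, d, t = 5, 4, 2
A2 = oa_latin_array(bose_oa(q, d), q, t, rng)

# Each column of A** is a permutation of {0, ..., q^t - 1} (strength 1 on q^t symbols),
# and integer division by q^(t-1) recovers the symbol-randomized array A*.
print(all(sorted(A2[:, j].tolist()) == list(range(q**t)) for j in range(d)))
A_star = A2 // q ** (t - 1)
print(len({(a, b) for a, b in zip(A_star[:, 0], A_star[:, 1])}))
```

The two checks correspond to the two kinds of stratification Tang's modification is meant to provide: on univariate margins at resolution $q^{-t}$ and on $t$-variate margins at resolution $q^{-1}$.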

One version of OA-based Latin hypercubes that was considered by Owen (1997a), page 1906, is of the form $\{Y_1, \ldots, Y_{q^t}\}$ where for $i = 1, \ldots, q^t$, $Y_i = (Y_{i,1}, \ldots, Y_{i,d})'$ and

$$Y_{i,j} = \frac{a^{**}_{i,j} + U_{i,j}}{q^t}, \quad \forall j = 1, \ldots, d. \qquad (3)$$

Here $\{U_{i,j} : i = 1, \ldots, q^t, j = 1, \ldots, d\}$ are $U[0, 1)$ random variables independent of one another and of all other permutations, and $a^{**}_{i,j}$ denotes the $(i, j)$th element of $A^{**}$.

The class of OA-based Latin hypercubes proposed by Tang (1993) requires one more level of randomization where the columns of $A^{**}$ are randomized. We denote the resulting matrix by $A^{***}$. Tang's OA-based Latin hypercubes can be expressed as $\{Y^*_1, \ldots, Y^*_{q^t}\}$ where for $i = 1, \ldots, q^t$, $Y^*_i = (Y^*_{i,1}, \ldots, Y^*_{i,d})'$ and

$$Y^*_{i,j} = \frac{a^{***}_{i,j} + U_{i,j}}{q^t}, \quad \forall j = 1, \ldots, d. \qquad (4)$$

Here $\{U_{i,j} : i = 1, \ldots, q^t, j = 1, \ldots, d\}$ are, as before, $U[0, 1)$ random variables independent of one another and of all other permutations, and $a^{***}_{i,j}$ denotes the $(i, j)$th element of $A^{***}$. We note that $\{Y_1, \ldots, Y_{q^t}\}$ and $\{Y^*_1, \ldots, Y^*_{q^t}\}$ are Latin hypercube samples [see, for example, McKay, Conover and Beckman (1979) and Owen (1992b)].

The estimators for $\mu$ in (1) that we are concerned with are

$$\hat{\mu}_{oas} = q^{-t} \sum_{i=1}^{q^t} f(X_i), \quad \hat{\mu}_{oal} = q^{-t} \sum_{i=1}^{q^t} f(Y_i), \quad \text{and} \quad \hat{\mu}^*_{oal} = q^{-t} \sum_{i=1}^{q^t} f(Y^*_i), \qquad (5)$$

where the $X_i$'s, $Y_i$'s and $Y^*_i$'s are as in (2), (3) and (4) respectively. It is easily seen that $\hat{\mu}_{oas}$, $\hat{\mu}_{oal}$ and $\hat{\mu}^*_{oal}$ are all unbiased estimators for $\mu$. For simplicity, we write $\sigma^2_{oas} = \mathrm{Var}(\hat{\mu}_{oas})$, $\sigma^2_{oal} = \mathrm{Var}(\hat{\mu}_{oal})$ and $\sigma^{*2}_{oal} = \mathrm{Var}(\hat{\mu}^*_{oal})$.

In this article, we shall assume that $t = 2$. This significantly simplifies the notation as well as the theoretical arguments that follow. Also, as Owen (1992a) and Tang (1993) noted, orthogonal arrays of strength $t = 2$ lead to the most economical sample size $q^2$. This is important in practice, especially when $q$ is large. The following theorem is due to Owen (1992a) and Tang (1993).
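As an illustration of the first estimator with $t = 2$, the sketch below averages $f$ over a randomized orthogonal array sample. The helper names (`bose_oa`, `oa_sample`, `mu_hat`) and the two test integrands are assumptions made for this example only; the base array assumes $q$ prime and $d \le q + 1$.

```python
import numpy as np


def bose_oa(q, d):
    """Hypothetical helper: an array in OA(q^2, d, q, 2), assuming q is prime and d <= q + 1."""
    i, j = np.divmod(np.arange(q * q), q)
    cols = [i, j] + [(i + m * j) % q for m in range(1, q)]
    return np.stack(cols[:d], axis=1)


def oa_sample(A, q, rng):
    """Randomized orthogonal array sample: permute symbols per column, jitter, divide by q."""
    A_star = np.stack(
        [rng.permutation(q)[A[:, j]] for j in range(A.shape[1])], axis=1
    )
    return (A_star + rng.random(A.shape)) / q


def mu_hat(f, points):
    """Sample mean of f over the design points (q^{-t} times the sum when there are q^t points)."""
    return float(np.mean(f(points)))


rng = np.random.default_rng(1)
q, d = 7, 3
X = oa_sample(bose_oa(q, d), q, rng)


def f_smooth(x):
    # Illustrative integrand with known integral (e - 1)^d over [0, 1)^d.
    return np.exp(x.sum(axis=1))


def f_cells(x):
    # Depends on (x_1, x_2) only through their q x q cell, so the design
    # integrates it exactly: each cell is visited exactly once.
    return np.floor(q * x[:, 0]) * np.floor(q * x[:, 1])


est_smooth, exact_smooth = mu_hat(f_smooth, X), (np.e - 1) ** d
est_cells, exact_cells = mu_hat(f_cells, X), ((q - 1) / 2) ** 2
print(est_smooth, exact_smooth)
print(est_cells, exact_cells)
```

The second integrand shows the bivariate stratification directly: any function that is constant on the $q \times q$ cells of a two-dimensional margin is integrated without error, whatever the random permutations turn out to be.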

Theorem 1. Let $d \ge 3$, $f$ be a bounded continuous function on $[0, 1)^d$ and $\hat{\mu}_{oas}$, $\hat{\mu}^*_{oal}$ be as in (5) with $A \in OA(q^2, d, q, 2)$. Then as $q \to \infty$, we have

…

where for all $x = (x_1, \ldots, x_d)'$,

… $f_{k,l}(x_k, x_l)$. (6)

Theorem 1 implies that (i) the asymptotic va

…(Full text truncated)…


Reference

This content is AI-processed based on ArXiv data.
