Recent Developments in Nonregular Fractional Factorial Designs
Nonregular fractional factorial designs such as Plackett-Burman designs and other orthogonal arrays are widely used in various screening experiments for their run size economy and flexibility. The traditional analysis focuses on main effects only. Ha…
Authors: Hongquan Xu, Frederick K. H. Phoa and Weng Kee Wong
Hongquan Xu, Frederick K. H. Phoa and Weng Kee Wong
University of California, Los Angeles
October 25, 2018

Abstract: Nonregular fractional factorial designs such as Plackett-Burman designs and other orthogonal arrays are widely used in various screening experiments for their run size economy and flexibility. The traditional analysis focuses on main effects only. Hamada and Wu (1992) went beyond the traditional approach and proposed an analysis strategy to demonstrate that some interactions could be entertained and estimated beyond a few significant main effects. Their groundbreaking work stimulated much of the recent developments in design criterion creation, construction and analysis of nonregular designs. This paper reviews important developments in optimality criteria and comparison, including projection properties, generalized resolution, various generalized minimum aberration criteria, optimality results, construction methods and analysis strategies for nonregular designs.

Key words and phrases: Factor screening, generalized minimum aberration, generalized resolution, orthogonal array, Plackett-Burman design, projectivity.

1 Introduction

In many scientific investigations, the main interest is in the study of effects of many factors simultaneously. Factorial designs, especially two-level or three-level factorial designs, are the most commonly used experimental plans for this type of investigation. A full factorial experiment allows all factorial effects to be estimated independently. However, it is often too costly to perform a full factorial experiment, so a fractional factorial design, which is a subset or fraction of a full factorial design, is preferred since it is cost-effective.
Fractional factorial designs are classified into two broad types: regular designs and nonregular designs. Regular designs are constructed through defining relations among factors and are described in many textbooks such as Box, Hunter and Hunter (2005), Dean and Voss (1999), Montgomery (2005) and Wu and Hamada (2000). These designs have a simple aliasing structure in that any two effects are either orthogonal or fully aliased. The run sizes are always a power of 2, 3 or a prime, and thus the “gaps” between possible run sizes are getting wider as the power increases. The concept of resolution (Box and Hunter 1961) and its refinement minimum aberration (Fries and Hunter 1980) play a pivotal role in the optimal choice of regular designs. There are many recent developments on minimum aberration designs; see Wu and Hamada (2000) and Mukerjee and Wu (2006) for further references.

Nonregular designs such as Plackett-Burman designs and other orthogonal arrays are widely used in various screening experiments for their run size economy and flexibility (Wu and Hamada, 2000). They fill the gaps between regular designs in terms of various run sizes and are flexible in accommodating various combinations of factors with different numbers of levels. Unlike regular designs, nonregular designs may exhibit a complex aliasing structure, that is, a large number of effects may be neither orthogonal nor fully aliased, which makes it difficult to interpret their significance. For this reason, nonregular designs were traditionally used to estimate factor main effects only but not their interactions. However, in many practical situations it is often questionable whether the interaction effects are negligible.
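The complex aliasing structure of a nonregular design can be made concrete with a short computation. The following is a minimal sketch, not part of the original analysis: it assumes the 12-run Plackett-Burman design is built from the cyclic generator row + + − + + + − − − + − plus a final run of all minus signs, and the helper names `pb12`, `column` and `aliasing` are hypothetical. For every main effect it computes the normalized inner product with every two-factor interaction that does not involve that factor. For balanced, mutually orthogonal ±1 columns this quantity equals the correlation between the two contrasts.

```python
from fractions import Fraction
from itertools import combinations

# Assumed generator row of the 12-run Plackett-Burman design (+ = 1, - = -1).
GENERATOR = "++-+++---+-"


def pb12():
    """Build the 12 x 11 design: 11 cyclic left shifts of the generator plus a row of -1."""
    signs = [1 if ch == "+" else -1 for ch in GENERATOR]
    rows = [signs[k:] + signs[:k] for k in range(11)]
    rows.append([-1] * 11)
    return rows


def column(design, j):
    """Extract column j of the design as a list of +/-1 values."""
    return [row[j] for row in design]


def aliasing(u, v):
    """Normalized inner product of two +/-1 contrast columns, kept exact."""
    return Fraction(sum(a * b for a, b in zip(u, v)), len(u))


design = pb12()

# Absolute aliasing between main effect i and interaction j x k, for i not in {j, k}.
values = set()
for i in range(11):
    for j, k in combinations(range(11), 2):
        if i in (j, k):
            continue
        interaction = [a * b for a, b in zip(column(design, j), column(design, k))]
        values.add(abs(aliasing(column(design, i), interaction)))

print(values)
```

Under these assumptions every such pair is partially aliased with absolute value 1/3, so no interaction is either orthogonal to or fully aliased with a main effect outside it. In a regular design the same computation would return only 0 or 1.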
Hamada and Wu (1992) went beyond the traditional approach and proposed an analysis strategy to demonstrate that some interactions could be entertained and estimated through their complex aliasing structure. They pointed out that ignoring interactions can lead to (i) important effects being missed, (ii) spurious effects being detected, and (iii) estimated effects having reversed signs resulting in incorrectly recommended factor levels. Many of the recent studies on nonregular designs were motivated by the results in Hamada and Wu (1992). They include proposals of new optimality criteria, and the construction and analysis of nonregular designs. The primary aim of this paper is to review major developments in nonregular fractional factorial designs since 1992.

Here is a brief history of the major developments in nonregular designs. Plackett and Burman (1946) gave a large collection of two-level and three-level designs for multi-factorial experiments. These designs are often referred to as the Plackett-Burman designs in the literature. Rao (1947) introduced the concept of orthogonal arrays, including Plackett-Burman designs as special cases. Cheng (1980) showed that orthogonal arrays are universally optimal for the main effects model. Hamada and Wu (1992) successfully demonstrated that some interactions could be identified beyond a few significant main effects for Plackett-Burman designs and other orthogonal arrays. Lin and Draper (1992) studied the geometrical projection properties of Plackett-Burman designs while Wang and Wu (1995) and Cheng (1995, 1998) studied the hidden projection properties of Plackett-Burman designs and other orthogonal arrays. The hidden projection properties provide an explanation for the success of the analysis strategy due to Hamada and Wu (1992).
Sun and Wu (1993) were the first to coin the term “nonregular designs” when studying statistical properties of Hadamard matrices of order 16. Deng and Tang (1999) and Tang and Deng (1999) introduced the concepts of generalized resolution and generalized minimum aberration for two-level nonregular designs. Xu and Wu (2001) proposed the generalized minimum aberration for mixed-level nonregular designs. Because of the popularity of minimum aberration, the research on nonregular designs has been largely focused on the construction and properties of generalized minimum aberration designs. Our reference list suggests that keen interest in nonregular designs began in 1999 and continues to this day, as evidenced by the increasing number of scientific papers on nonregular designs in major statistical journals.

Section 2 reviews the data analysis strategies for nonregular designs. Section 3 discusses the geometrical and hidden projection properties of the Plackett-Burman designs and other orthogonal arrays. Section 4 introduces the generalized resolution and generalized minimum aberration and their statistical justifications. Section 5 introduces the minimum moment aberration criterion, another popular criterion for nonregular designs. Section 6 considers uniformity and connections with various optimality criteria. Section 7 reviews construction methods and optimality results. Section 8 gives concluding remarks and future directions.

2 Analysis Strategies

We begin with a review of a breakthrough approach (Hamada and Wu 1992) by entertaining interactions in Plackett-Burman designs and other orthogonal arrays after identifying a few important main effects. Then we review another strategy proposed by Cheng and Wu (2001) for the dual purposes of factor screening and response surface exploration (or interaction detection) with quantitative factors.
The analysis strategy proposed by Hamada and Wu (1992) consists of three steps.

Step 1. Entertain all the main effects and interactions that are orthogonal to the main effects. Use standard analysis methods such as ANOVA and half-normal plots to select significant effects.

Step 2. Entertain the significant effects identified in the previous step and the two-factor interactions that consist of at least one significant effect. Identify significant effects using a forward selection regression procedure.

Step 3. Entertain the significant effects identified in the previous step and all the main effects. Identify significant effects using a forward selection regression procedure.

Iterate between Steps 2 and 3 until the selected model stops changing. Note that the traditional analysis of Plackett-Burman or other nonregular designs ends at Step 1.

Hamada and Wu (1992) based their analysis strategy on two empirical principles, effect sparsity and effect heredity (see Wu and Hamada 2000, Section 3.5). Effect sparsity implies that only a few main effects and even fewer two-factor interactions are relatively important in a factorial experiment. Effect heredity means that in order for an interaction to be significant, at least one of its parent factors should be significant. Effect heredity excludes models that contain an interaction but none of its parent main effects, which lessens the problem of obtaining uninterpretable models. Hamada and Wu (1992) wrote that the strategy works well when both principles hold and the correlations between partially aliased effects are small to moderate. The effect sparsity suggests that only a few iterations will be required.
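The effect heredity principle determines which two-factor interactions enter the candidate set in Step 2. The sketch below illustrates only that bookkeeping, under the assumption that factors are labeled by single letters. The function name `heredity_candidates` is hypothetical, and the forward selection regression itself is not implemented here.

```python
from itertools import combinations


def heredity_candidates(factors, significant):
    """List the two-factor interactions with at least one significant parent factor.

    factors     -- iterable of factor labels, e.g. "ABCDEFG"
    significant -- set of labels of main effects currently judged significant
    """
    return [
        pair
        for pair in combinations(factors, 2)
        if pair[0] in significant or pair[1] in significant
    ]


# Seven two-level factors with a single significant main effect F.
one_parent = heredity_candidates("ABCDEFG", {"F"})
print(one_parent)

# Adding D as a second significant main effect enlarges the candidate set.
two_parents = heredity_candidates("ABCDEFG", {"F", "D"})
print(len(two_parents))
```

With seven factors there are 21 two-factor interactions in total. Heredity with one significant main effect keeps 6 of them, and with two significant main effects it keeps 11, which keeps the forward selection in Step 2 small when effect sparsity holds.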
Using this procedure, Hamada and Wu (1992) reanalyzed data from three real experiments: a cast fatigue experiment using a 12-run Plackett-Burman design with seven 2-level factors, a blood glucose experiment using an 18-run mixed-level orthogonal array with one 2-level and seven 3-level factors, and a heat exchange experiment using a 12-run Plackett-Burman design with ten 2-level factors. They demonstrated that the traditional main effects analysis was limited and the results were misleading.

For illustration, consider the cast fatigue experiment conducted by Hunter, Hodi and Eager (1982) that used a 12-run Plackett-Burman design to study the effects of seven factors (A–G) on the fatigue life of weld repaired castings. Table 1 gives the data matrix and responses, where columns 8–11 are not used. The original analysis by Hunter, Hodi and Eager (1982) identified two significant factors F and D. The factor D had a much smaller effect with a p value around 0.2. The fitted model was

$$\hat{y} = 5.73 + 0.458\,F - 0.258\,D, \qquad (1)$$

with an $R^2$ of 0.59. However, Hunter, Hodi and Eager (1982) noted a discrepancy between their fitted model (1) and previous work, namely, the sign of factor D was reversed. Applying the three-step analysis strategy, Hamada and Wu (1992) identified a significant two-factor interaction FG and obtained the following model

$$\hat{y} = 5.73 + 0.458\,F - 0.459\,FG. \qquad (2)$$

This model has $R^2 = 0.89$, which is a significant improvement over model (1) in terms of goodness of fit. The identification of FG was not only consistent with the engineering knowledge reported in Hunter, Hodi and Eager (1982) but also provided a sound explanation of the discrepancy in the sign of factor D. The coefficient of D in (1) actually estimates $D + \frac{1}{3}FG$ and therefore the sign of D in (1) could be negative even if D had a small positive effect. This experiment was later reanalyzed with other methods by several authors, including Box and Meyer (1993), Chipman, Hamada and Wu (1997), Westfall, Young and Lin (1998), Yuan, Joseph and Lin (2007), and Phoa, Pan and Xu (2007).

Table 1: Design Matrix and Responses, Cast Fatigue Experiment

      Factor                                       Logged
Run   A   B   C   D   E   F   G   8   9  10  11   Lifetime
  1   +   +   −   +   +   +   −   −   −   +   −   6.058
  2   +   −   +   +   +   −   −   −   +   −   +   4.733
  3   −   +   +   +   −   −   −   +   −   +   +   4.625
  4   +   +   +   −   −   −   +   −   +   +   −   5.899
  5   +   +   −   −   −   +   −   +   +   −   +   7.000
  6   +   −   −   −   +   −   +   +   −   +   +   5.752
  7   −   −   −   +   −   +   +   −   +   +   +   5.682
  8   −   −   +   −   +   +   −   +   +   +   −   6.607
  9   −   +   −   +   +   −   +   +   +   −   −   5.818
 10   +   −   +   +   −   +   +   +   −   −   −   5.917
 11   −   +   +   −   +   +   +   −   −   −   +   5.863
 12   −   −   −   −   −   −   −   −   −   −   −   4.809

Hamada and Wu (1992) discussed limitations of their analysis strategy and provided solutions. Wu and Hamada (2000, chap. 8) further suggested some extensions such as the use of all subset variable selection if possible.

For quantitative factors with more than two levels, Cheng and Wu (2001) proposed the following two-stage analysis strategy to achieve the dual objectives of factor screening and response surface exploration (or interaction detection) using a single design. These two stages are also the two key aspects of standard response surface methodology.

Stage 1. Perform factor screening and identify important factors.

Stage 2.
Fit a second-order model for the factors identified in stage 1.

For $m$ quantitative factors, denoted by $x_1, \ldots, x_m$, the second-order model is

$$y = \beta_0 + \sum_{i=1}^{m} \beta_i x_i + \sum_{i=1}^{m} \beta_{ii} x_i^2 + \sum_{1=i<j}^{m} \beta_{ij} x_i x_j + \epsilon.$$

… $> 0$, where the maximization is over all subsets of $r$ columns. Then the generalized resolution is defined to be

$$R = r + \delta, \quad \text{where} \quad \delta = 1 - \frac{\max_{|s|=r} J_r(s)}{N}. \qquad (4)$$

For the 12-run design in Table 1, $r = 3$, $\delta = 2/3$ and the generalized resolution is $R = 3.67$. It is easy to see that for an $OA(N, 2^m, t)$, $j_k(s) = 0$ for any $k \le t$ and therefore $r \le R < r + 1$ where $r = t + 1$. If $\delta > 0$, a subset $s$ of $D$ with $r$ columns contains at least $N\delta/2^r$ copies of a full $2^r$ factorial and therefore the projectivity of $D$ is at least $r$ (Deng and Tang (1999)). For a regular design, $\delta = 0$ and the projectivity is exactly $r - 1$.

Two regular designs of the same resolution can be distinguished using the minimum aberration criterion, and the same idea can be applied to nonregular designs using the minimum $G$-aberration criterion (Deng and Tang (1999)). Roughly speaking, the minimum $G$-aberration criterion always chooses a design with the smallest confounding frequency among designs with maximum generalized resolution. Formally, the minimum $G$-aberration criterion is to sequentially minimize the components in the confounding frequency vector

$$\mathrm{CFV}(D) = [(f_{11}, \ldots, f_{1N}); (f_{21}, \ldots, f_{2N}); \ldots; (f_{m1}, \ldots, f_{mN})],$$

where $f_{kj}$ denotes the frequency of $k$-column combinations $s$ with $J_k(s) = N + 1 - j$.

Minimum $G$-aberration is very stringent and it attempts to control $J$-characteristics in a very strict manner. Tang and Deng (1999) proposed a relaxed version of minimum $G$-aberration and called it the minimum $G_2$-aberration criterion. Let

$$A_k(D) = N^{-2} \sum_{|s|=k} J_k^2(s). \qquad (5)$$

The vector $(A_1(D), \ldots, A_m(D))$ is called the generalized wordlength pattern, because for a regular design $D$, $A_k(D)$ is the number of words of length $k$ in the defining contrast subgroup of $D$. The minimum $G_2$-aberration criterion (Tang and Deng (1999)) is to sequentially minimize the generalized wordlength pattern $A_1(D), A_2(D), \ldots, A_m(D)$. For regular designs both minimum $G$-aberration and minimum $G_2$-aberration criteria reduce to the traditional minimum aberration criterion. However, these two criteria can result in selecting different nonregular designs. We note that minimum $G$-aberration nonregular designs always have maximum generalized resolution whereas minimum $G_2$-aberration nonregular designs may not. This is in contrast to the regular case where minimum aberration regular designs always have maximum resolution among all regular designs.

Tang and Deng (1999) also defined minimum $G_e$-aberration for any $e > 0$ by replacing $J_k^2(s)$ with $J_k^e(s)$ in (5). However, only the minimum $G_2$-aberration criterion is popular due to various statistical justifications and theoretical results.

Xu and Wu (2001) proposed the generalized minimum aberration criterion for comparing asymmetrical (or mixed-level) designs. The generalized minimum aberration criterion was motivated from ANOVA models and includes the minimum $G_2$-aberration criterion as a special case. By exploring an important connection between design theory and coding theory, Xu and Wu (2001) showed that the components of the generalized wordlength pattern defined in (5) are linear combinations of the distribution of pairwise distances between the rows. This observation plays a pivotal role in the subsequent theoretical development of nonregular designs.

Ma and Fang (2001) independently extended the minimum $G_2$-aberration criterion for designs with more than two levels.
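The generalized resolution in (4) and the generalized wordlength pattern in (5) can be checked numerically for the 12-run design of Table 1. The sketch below is illustrative and rests on two assumptions: the $J$-characteristic is taken in the usual Deng and Tang sense, $J_k(s) = |\sum_i d_{i s_1} \cdots d_{i s_k}|$, and the full 11-column design is rebuilt from the first run of Table 1 by cyclic left shifts plus a final run of all minus signs. The names `pb12`, `j_char`, `generalized_resolution` and `gwlp` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

GENERATOR = "++-+++---+-"  # first run of the 12-run design in Table 1


def pb12():
    """Build the 12 x 11 design: 11 cyclic left shifts of the generator plus a row of -1."""
    signs = [1 if ch == "+" else -1 for ch in GENERATOR]
    rows = [signs[k:] + signs[:k] for k in range(11)]
    rows.append([-1] * 11)
    return rows


def j_char(design, cols):
    """Assumed J-characteristic: absolute column sum of the product of the chosen columns."""
    return abs(sum(prod(row[c] for c in cols) for row in design))


def generalized_resolution(design):
    """R = r + delta as in (4), with r the smallest size whose maximum J is positive."""
    n, m = len(design), len(design[0])
    for r in range(1, m + 1):
        j_max = max(j_char(design, s) for s in combinations(range(m), r))
        if j_max > 0:
            return r + 1 - Fraction(j_max, n)
    return None  # every J-characteristic vanishes


def gwlp(design, k):
    """A_k(D) as in (5): sum of squared J-characteristics over k-subsets, divided by N^2."""
    n, m = len(design), len(design[0])
    total = sum(j_char(design, s) ** 2 for s in combinations(range(m), k))
    return Fraction(total, n * n)


design = pb12()
R = generalized_resolution(design)
A = [gwlp(design, k) for k in (1, 2, 3)]
print(R, round(float(R), 2))
print(A)
```

Under these assumptions the computation returns $R = 11/3 \approx 3.67$, in agreement with the values $r = 3$ and $\delta = 2/3$ quoted above, and $A_1 = A_2 = 0$ because the design is an orthogonal array of strength two. Exact fractions are used so that the comparison with $2/3$ is not affected by rounding.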
Ma and Fang (2001) named their criterion the minimum generalized aberration criterion, which is a special case of the generalized minimum aberration criterion proposed by Xu and Wu (2001). Ye (2003) redefined the generalized wordlength pattern and generalized minimum aberration for two-level designs using indicator functions. Cheng and Ye (2004) defined generalized resolution and the generalized minimum aberration criterion for quantitative factors. The generalized minimum aberration criterion proposed by Xu and Wu (2001) is independent of the choice of treatment contrasts and thus model-free whereas the generalized minimum aberration criterion by Cheng and Ye (2004) depends on the specific model.

4.1 Statistical Justifications

Deng and Tang (1999) provided a statistical justification for the generalized resolution by showing that designs with maximum generalized resolution minimize the contamination of nonnegligible two-factor interactions on the estimation of main effects. Tang and Deng (1999) provided a similar statistical justification for minimum $G_2$-aberration designs. In a further extension, Xu and Wu (2001) gave a statistical justification for generalized minimum aberration designs with mixed levels.

A common situation that arises in practice is that the main effects are of primary interest but there are uninteresting yet non-negligible interactions that we know will affect the main effects estimates. To fix ideas, consider a two-level $N \times m$ design $D = (d_{ij})$ with columns denoted by $d_1, \ldots, d_m$ and generalized resolution between 3 and 4. Suppose that one fits a main effects model

$$y_i = \beta_0 + \sum_{j=1}^{m} \beta_j d_{ij} + \epsilon_i, \qquad (6)$$

but the true model is

$$y_i = \beta_0 + \sum_{j=1}^{m} \beta_j d_{ij} + \sum_{k\,\cdots} \cdots$$
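The contamination of a main effect estimate by an omitted interaction can be illustrated with the cast fatigue data of Table 1. The sketch below is a hedged illustration rather than the original authors' code. It re-fits models (1) and (2) using the fact that, for balanced and mutually orthogonal ±1 regressors, each least-squares coefficient equals the inner product of the regressor with the response divided by $N$, and it computes the aliasing coefficient between D and FG. The names `RUNS`, `col`, `times` and `fit` are hypothetical.

```python
from fractions import Fraction

# Table 1: sign pattern of the 11 columns and logged lifetime for each of the 12 runs.
RUNS = [
    ("++-+++---+-", 6.058),
    ("+-+++---+-+", 4.733),
    ("-+++---+-++", 4.625),
    ("+++---+-++-", 5.899),
    ("++---+-++-+", 7.000),
    ("+---+-++-++", 5.752),
    ("---+-++-+++", 5.682),
    ("--+-++-+++-", 6.607),
    ("-+-++-+++--", 5.818),
    ("+-++-+++---", 5.917),
    ("-++-+++---+", 5.863),
    ("-----------", 4.809),
]
y = [response for _, response in RUNS]
N = len(RUNS)


def col(name):
    """Return the +/-1 column of factor `name` (A..G occupy columns 0..6)."""
    j = "ABCDEFG".index(name)
    return [1 if signs[j] == "+" else -1 for signs, _ in RUNS]


def times(u, v):
    """Elementwise product of two columns, e.g. an interaction contrast."""
    return [a * b for a, b in zip(u, v)]


def fit(columns):
    """Least squares with intercept, valid only for balanced, mutually orthogonal +/-1 columns."""
    for i, u in enumerate(columns):
        assert sum(u) == 0, "column must be balanced"
        for v in columns[i + 1:]:
            assert sum(times(u, v)) == 0, "columns must be orthogonal"
    mean = sum(y) / N
    coefs = [sum(a * b for a, b in zip(x, y)) / N for x in columns]
    sst = sum((v - mean) ** 2 for v in y)
    r2 = N * sum(c * c for c in coefs) / sst
    return mean, coefs, r2


F, D, G = col("F"), col("D"), col("G")
FG = times(F, G)

b0_1, (bF_1, bD_1), r2_1 = fit([F, D])    # model (1)
b0_2, (bF_2, bFG_2), r2_2 = fit([F, FG])  # model (2)
alias_D_FG = Fraction(sum(times(D, FG)), N)

print(round(b0_1, 2), round(bF_1, 3), round(bD_1, 3), round(r2_1, 2))
print(round(b0_2, 2), round(bF_2, 3), round(r2_2, 2))
print(alias_D_FG)
```

The aliasing coefficient of 1/3 between D and FG is what makes the D coefficient in model (1) an estimate of $D + \frac{1}{3}FG$ when FG is omitted. The shortcut in `fit` does not apply to a model containing both D and FG, since those two columns are not orthogonal; a general least-squares solver would be needed there.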