DESIGN OF EXPERIMENTS

INTRODUCTION
Analysis of Variance (ANOVA) is a hypothesis-testing technique used to test the equality of two or more population (or treatment) means by examining the variances of the samples taken. ANOVA allows one to determine whether the differences between the samples are simply due to random error (sampling error) or whether there are systematic treatment effects that cause the mean in one group to differ from the mean in another. Most of the time ANOVA is used to compare three or more means; when the means of two samples are compared using ANOVA, the result is equivalent to a t-test for the means of independent samples. ANOVA is based on comparing the variation between the samples to the variation within each sample. If the between variation is much larger than the within variation, this is evidence that the population means are not all equal. If the between and within variations are approximately the same size, there is no significant difference between the sample means.

ELEMENTS OF A DESIGNED EXPERIMENT
Definition 1 The response variable is the variable of interest to be measured in the experiment. We also refer to the response as the dependent variable.
Definition 2 Factors are those variables whose effect on the response is of interest to the experimenter. Quantitative factors are measured on a numerical scale, whereas qualitative factors are not (naturally) measured on a numerical scale.
Definition 3 Factor levels are the values of the factor utilized in the experiment.
Definition 4 The treatments of an experiment are the factor-level combinations utilized.
Definition 5 An experimental unit is the object on which the response and factors are observed or measured.
Definition 6 A designed experiment is an experiment in which the analyst controls the specification of the treatments and the method of assigning the experimental units to each treatment. An observational experiment is an experiment in which the analyst simply observes the treatments and the response on a sample of experimental units.
LECTURE NOTES ON DESIGN OF EXPERIMENTS- Dr.V.GNANARAJ
Definition 7 The completely randomized design is a design in which treatments are randomly assigned to the experimental units, or in which independent random samples of experimental units are selected for each treatment.
Designs of experiments are classified into three types:
1) Completely Randomized Design or One-way Analysis of Variance
2) Randomized Block Design or Two-way Analysis of Variance
3) Latin Square Design or Three-way Analysis of Variance
1) COMPLETELY RANDOMIZED DESIGN or One-way Analysis of Variance
The test procedure compares the variation in observations between samples to the variation within samples. Completely randomized designs are the simplest designs: the treatments are assigned to the experimental units completely at random. This gives every experimental unit (plot, animal, soil sample, etc.) an equal probability of receiving a given treatment.
Suppose we wish to compare k population means (k ≥ 2). This situation can arise in two ways. If the study is observational, we obtain independently drawn samples from k distinct populations and compare the population means for some numerical response of interest. If the study is experimental, we use a completely randomized design to obtain data from k distinct treatment groups: the experimental units are randomly assigned to one of k treatments and the response value from each unit is recorded. The mean of the numerical response is then compared across the treatment groups.
Advantages of Completely Randomized Designs
1. Complete flexibility is allowed - any number of treatments and replicates may be used.
2. Relatively easy statistical analysis, even with variable replicates and variable experimental errors for different treatments.
3. Analysis remains simple when data are missing.
4. Provides the maximum number of degrees of freedom for error for a given number of experimental units and treatments.
Disadvantages of Completely Randomized Designs
1. Relatively low accuracy due to lack of restrictions, which allows environmental variation to enter experimental error.
2. Not suited for large numbers of treatments because a relatively large amount of experimental material is needed which increases the variation.
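The random assignment that defines a completely randomized design can be sketched in a few lines of Python. This is only an illustration: the function name, the plot labels and the seed are hypothetical, and equal replication per treatment is an assumption made for the example.

```python
import random


def completely_randomized_assignment(units, treatments, seed=None):
    """Randomly assign experimental units to treatments with (near-)equal replication."""
    rng = random.Random(seed)       # a local generator keeps the example reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)           # every ordering of the units is equally likely
    assignment = {treatment: [] for treatment in treatments}
    for position, unit in enumerate(shuffled):
        # deal the shuffled units out to the treatments in turn
        assignment[treatments[position % len(treatments)]].append(unit)
    return assignment


# Hypothetical example: 12 plots, 3 treatments, 4 replicates each.
plots = [f"plot{i}" for i in range(1, 13)]
layout = completely_randomized_assignment(plots, ["A", "B", "C"], seed=1)
for treatment, assigned in layout.items():
    print(treatment, sorted(assigned))
```

Shuffling first and then dealing the units out gives each unit the same chance of receiving any treatment while keeping the number of replicates balanced.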
Appropriate Use of Completely Randomized Designs
1. Under conditions where the experimental material is homogeneous, e.g. laboratory or growth chamber experiments.
2. Where a fraction of the experimental units is likely to be destroyed or fail to respond.
3. In small experiments where there is a small number of degrees of freedom.
The completely randomized design is seldom used in field experiments, where the randomized complete block design has been consistently more accurate because there are usually recognizable sources of environmental variation.
In one-way classification:
Null Hypothesis H0: µ1 = µ2 = ... = µk
Alternative Hypothesis Ha: at least two population means differ, i.e. µi ≠ µj for some i ≠ j.
Assumptions:
1. Samples are drawn independently (completely randomized design).
2. Population variances are equal, i.e. $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$.
3. Populations are normally distributed.
Notations:
Number of samples (or levels): $k$
Number of observations in the i-th sample: $n_i$, $i = 1, 2, 3, \ldots, k$
Total number of observations: $n = \sum_i n_i$
Observation j in the i-th sample: $x_{ij}$, $j = 1, 2, 3, \ldots, n_i$
Sum of the $n_i$ observations in the i-th sample: $T_i = \sum_j x_{ij}$
Sum of all observations: $T = \sum_i T_i = \sum_i \sum_j x_{ij}$
The Computational Formulae
Total sum of squares: $SS_T = \sum_i \sum_j x_{ij}^2 - \frac{T^2}{n}$
Between samples sum of squares: $SS_B = \sum_i \frac{T_i^2}{n_i} - \frac{T^2}{n}$
Within samples sum of squares: $SS_W = SS_T - SS_B$
Total mean square: $MS_T = \frac{SS_T}{n - 1}$
Between samples mean square: $MS_B = \frac{SS_B}{k - 1}$
Within samples mean square: $MS_W = \frac{SS_W}{n - k}$
Number of d.o.f. = (k - 1) + (n - k) = n - 1.
ANOVA TABLE
Source of variation   Sum of Squares   Degrees of Freedom   Mean Square   F Ratio
Between samples       SSB              k - 1                MSB           MSB / MSW
Within samples        SSW              n - k                MSW
Total                 SST              n - 1
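The computational formulae and the ANOVA table above translate directly into code. The sketch below is a minimal pure-Python version; the function name and the small data set are illustrative assumptions, not part of the notes.

```python
def one_way_anova(samples):
    """One-way ANOVA via the computational formulae (samples may differ in size)."""
    k = len(samples)                               # number of samples (levels)
    n = sum(len(sample) for sample in samples)     # total number of observations
    grand_total = sum(sum(sample) for sample in samples)   # T
    correction = grand_total ** 2 / n              # T^2 / n
    sst = sum(x * x for sample in samples for x in sample) - correction
    ssb = sum(sum(sample) ** 2 / len(sample) for sample in samples) - correction
    ssw = sst - ssb
    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return {
        "SSB": ssb, "SSW": ssw, "SST": sst,
        "df_between": k - 1, "df_within": n - k,
        "MSB": msb, "MSW": msw, "F": msb / msw,
    }


# Hypothetical data: three samples of three observations each.
toy_samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
table = one_way_anova(toy_samples)
print(table)
```

The returned F ratio is then compared with the tabulated value of F for (k - 1, n - k) degrees of freedom at the chosen level of significance.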
Working Method
STEP I: Set up the Null Hypothesis H0: µ1 = µ2 = ... = µk (population means are equal).
STEP II: Set up the Alternative Hypothesis H1: at least two of the population means differ.
STEP III: Take the l.o.s. = α.
STEP IV: Find the total number of observations n.
STEP V: Calculate T, the grand total of all observations.
STEP VI: Calculate the sums of squares SST, SSB, SSW.
STEP VII: Prepare the ANOVA table to calculate the F ratio.
STEP VIII: Conclusions.
i) If calculated F > Fα with (k - 1, n - k) d.o.f., reject H0.
ii) If calculated F < Fα with (k - 1, n - k) d.o.f., accept (do not reject) H0.

PROBLEM 1
Neuroscience researchers examined the impact of environment on rat development. Rats were randomly assigned to be raised in one of the following four test conditions: impoverished (wire mesh cage, housed alone), standard (cage with other rats), enriched (cage with other rats and toys), super enriched (cage with rats and toys changed on a periodic basis). After two months, the rats were tested on a variety of learning measures (including the number of trials to learn a maze to a criterion of three perfect trials) and several neurological measures (overall cortical weight, degree of dendritic branching, etc.). The data for the maze task are below. Compute the appropriate test for the data provided.
Impoverished   Standard   Enriched   Super Enriched
22             17         12         8
19             21         14         7
15             15         11         10
24             12         9          9
18             19         15         12
Solution:
Source of variation   Sum of Squares   Degrees of Freedom   Mean Square   F Ratio
Between samples       SSB = 323.35     3                    107.7833      MSB / MSW = 12.72
Within samples        SSW = 135.6      16                   8.475
Total                 SST = 458.95     19
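As a cross-check on the table above, the same sums of squares can be obtained from the definitional form, $SS_B = \sum_i n_i(\bar{x}_i - \bar{x})^2$ and $SS_W = \sum_i \sum_j (x_{ij} - \bar{x}_i)^2$, rather than from the computational formulae. The sketch below uses the maze data of Problem 1; the variable names are illustrative.

```python
from statistics import mean

# Number of trials to learn the maze, by rearing condition (Problem 1 data).
maze_trials = {
    "impoverished":   [22, 19, 15, 24, 18],
    "standard":       [17, 21, 15, 12, 19],
    "enriched":       [12, 14, 11, 9, 15],
    "super enriched": [8, 7, 10, 9, 12],
}

all_values = [x for group in maze_trials.values() for x in group]
grand_mean = mean(all_values)
k = len(maze_trials)
n = len(all_values)

# Between-samples variation: spread of the group means around the grand mean.
ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in maze_trials.values())
# Within-samples variation: spread of each observation around its own group mean.
ssw = sum((x - mean(g)) ** 2 for g in maze_trials.values() for x in g)

f_ratio = (ssb / (k - 1)) / (ssw / (n - k))
print(round(ssb, 2), round(ssw, 2), round(f_ratio, 2))
```

Both forms are algebraically identical; the definitional form makes the "between versus within" comparison explicit, while the computational form is quicker by hand.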
Null Hypothesis H0: µ1 = µ2 = µ3 = µ4
Alternative Hypothesis H1: At least two means differ
Test Statistic: Fc = 12.72
Table Value: F0.05,(3,16) = 3.24
Conclusion: Fc > F0.05,(3,16), Reject Null Hypothesis
1. What is your computed answer? F(3,16) = 12.72, p < .01
2. What would be the null hypothesis in this study? Environment will have no impact on learning ability as operationalized by maze performance in rats.
3. What would be the alternate hypothesis? Environment will have an impact on learning ability as operationalized by maze performance in rats.
4. What is your Fcrit? Fcrit = 5.29 (at α = .01)
5. Are there any significant differences between the four testing conditions? Yes. There is no significant difference between the impoverished group and the standard group (Fcomp = 2.32 and qobs = 2.15, n.s.). There is a significant difference between the impoverished group and both the enriched and super enriched groups (Fcomp = 16.15 and qobs = 5.68, p < .01, and Fcomp = 31.90 and qobs = 7.98, p < .01, respectively). There is no significant difference between the standard group and the enriched group (Fcomp = 6.24 and qobs = 3.53, n.s.). There is a significant difference between the standard group and the super enriched group (Fcomp = 17.03 and qobs = 5.83, p < .05). There is no significant difference between the enriched group and the super enriched group (Fcomp = 2.65 and qobs = 2.30, n.s.).
6. Interpret your answer. Environment may have an impact on ability to learn. Differences were found between groups when each group is compared to a group at least two levels above the one under study. For example, there is a difference between the impoverished group and the enriched and super enriched groups, but not between the impoverished and the standard groups.

PROBLEM 2
A research study was conducted to examine the clinical efficacy of a new antidepressant. Depressed patients were randomly assigned to one of three groups: a placebo group, a group that received a low dose of the drug, and a group that received a moderate dose of the drug. After four weeks of treatment, the patients completed the Beck Depression Inventory. The higher the score, the more depressed the patient. The data are presented below. Compute the appropriate test.
Placebo   Low Dose   Moderate Dose
38        22         14
47        19         26
39        8          11
25        23         18
42        31         5
Solution:
Source of variation   Sum of Squares     Degrees of Freedom   Mean Square   F Ratio
Between samples       SSB = 1484.9333    2                    742.46666     MSB / MSW = 11.27
Within samples        SSW = 790.8        12                   65.9
Total                 SST = 2275.73333   14
Null Hypothesis H0: µ1 = µ2 = µ3
Alternative Hypothesis H1: At least two means differ
Test Statistic: Fc = 11.27
Table Value: F0.01,(2,12) = 6.93
Conclusion: Fc > F0.01,(2,12), Reject Null Hypothesis
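The answers to this problem report pairwise comparison statistics, qobs and Fcomp, without giving their formulae. The sketch below assumes that qobs is the studentized range statistic, $q = |\bar{x}_i - \bar{x}_j| / \sqrt{MS_W / n}$ with n observations per group, and that $F_{comp} = q^2 / 2$. That is an assumption, not something the notes state, but it reproduces the reported values to within rounding.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

# Beck Depression Inventory scores by group (Problem 2 data).
bdi_scores = {
    "placebo":       [38, 47, 39, 25, 42],
    "low dose":      [22, 19, 8, 23, 31],
    "moderate dose": [14, 26, 11, 18, 5],
}

n_per_group = 5
k = len(bdi_scores)
n_total = n_per_group * k

# Within-samples mean square from the one-way ANOVA.
ssw = sum((x - mean(g)) ** 2 for g in bdi_scores.values() for x in g)
msw = ssw / (n_total - k)
standard_error = sqrt(msw / n_per_group)

pairwise = {}
for (name_a, a), (name_b, b) in combinations(bdi_scores.items(), 2):
    q_obs = abs(mean(a) - mean(b)) / standard_error   # assumed definition of qobs
    f_comp = q_obs ** 2 / 2                           # assumed relation to Fcomp
    pairwise[(name_a, name_b)] = (q_obs, f_comp)
    print(f"{name_a} vs {name_b}: q = {q_obs:.2f}, F = {f_comp:.2f}")
```

Each qobs would then be compared with the tabulated studentized range value for k groups and n - k error degrees of freedom.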
1. What is your computed answer? F(2,12) = 11.27, p < .01
2. What would be the null hypothesis in this study? There will be no difference in depression levels between the three groups. The groups taking the drug will not differ from the group taking the placebo.
3. What would be the alternate hypothesis? There will be a difference somewhere in depression levels between the three drug groups.
4. What probability level did you choose and why? p = .01. There is a risk involved with a Type I error. I do not want to erroneously say the drug works and then later find out that it doesn't.
5. What is your Fcrit? Fcrit = 6.93
6. Is there a significant difference between the groups? Yes, a significant difference exists somewhere between the three groups.
7. If there is a significant difference, where specifically are the differences? There is a significant difference between the placebo group and the low dose group (Fcomp = 11.75 and qobs = 4.84, p < .05). There is a significant difference between the placebo group and the moderate dose group (Fcomp = 20.77 and qobs = 6.44, p < .01). There is no significant difference between the low dose and the moderate dose groups (Fcomp = 1.27 and qobs = 1.59, n.s.).
8. Interpret your answer. The drug appears to help alleviate depression. However, as there is no significant difference between taking a low or a moderate dose, a low dose would be recommended.

PROBLEM 3
A manufacturer of television sets is interested in the effect on tube conductivity of four different types of coating for color picture tubes. The following conductivity data are obtained.
Coating Type   Conductivity
1              143 141 150 146
2              152 149 137 143
3              134 136 132 127
4              129 127 132 129
Test the null hypothesis H0: µ1 = µ2 = µ3 = µ4 against the alternative that at least two of the means differ. Use α = 0.05.
Solution:
Source of variation   Sum of Squares     Degrees of Freedom   Mean Square   F Ratio
Between samples       SSB = 844.68750    3                    281.56250     MSB / MSW = 14.30
Within samples        SSW = 236.25000    12                   19.68750
Total                 SST = 1080.93750   15
Null Hypothesis H0: µ1 = µ2 = µ3 = µ4
Alternative Hypothesis H1: At least two means differ
Test Statistic: Fc = 14.30
Table Value: F0.05,(3,12) = 3.49
Conclusion: Fc > F0.05,(3,12), Reject Null Hypothesis

PROBLEM 4
A manufacturer suspects that the batches of raw material furnished by her supplier differ significantly in calcium content. There is a large number of batches currently in the warehouse. Five of these are randomly selected for study. A chemist makes five determinations on each batch and obtains the following data.
Batch 1 23.46 23.48 23.56 23.39 23.40
Batch 2 23.59 23.46 23.42 23.49 23.50
Batch 3 23.51 23.64 23.46 23.52 23.49
Batch 4 23.28 23.40 23.37 23.46 23.39
Batch 5 23.29 23.46 23.37 23.32 23.38
Is there a significant variation in calcium content from batch to batch? Use α = 0.05.
Solution:
Source of variation   Sum of Squares    Degrees of Freedom   Mean Square   F Ratio
Between samples       SSB = 0.0969760   4                    0.0242440     MSB / MSW = 5.54
Within samples        SSW = 0.0876000   20                   0.0043800
Total                 SST = 0.1845760   24
H0: µ1 = µ2 = µ3 = µ4 = µ5
H1: At least two means differ
Test Statistic: Fc = 5.54
Table Value: F0.05,(4,20) = 2.87
Conclusion: Fc > F0.05,(4,20), Reject Null Hypothesis.

PROBLEM 5
Four laboratories each measure the tin coating weight of 12 disks, with the following results.
Lab A   Lab B   Lab C   Lab D
0.25    0.18    0.19    0.23
0.27    0.28    0.25    0.30
0.22    0.21    0.27    0.28
0.30    0.23    0.24    0.28
0.27    0.25    0.18    0.24
0.28    0.20    0.26    0.34
0.32    0.27    0.28    0.20
0.24    0.19    0.24    0.18
0.31    0.24    0.25    0.24
0.26    0.22    0.20    0.28
0.21    0.29    0.21    0.22
0.28    0.16    0.19    0.21
Construct an ANOVA table and test, at the 5% level of significance, the hypothesis that the differences among the four sample means can be attributed to chance.
Solution:
Source of variation   Sum of Squares   Degrees of Freedom   Mean Square   F Ratio
Between samples       SSB = 0.013      3                    0.0043        MSB / MSW = 2.87
Within samples        SSW = 0.0679     44                   0.0015
Total                 SST = 0.0809     47
H0: µ1 = µ2 = µ3 = µ4
H1: At least two means differ
Test Statistic: Fc = 2.87
Table Value: F0.05,(3,44) = 2.82
Conclusion: Using the rounded mean squares, Fc = 0.0043/0.0015 = 2.87 > F0.05,(3,44), which rejects the Null Hypothesis. The unrounded ratio, 0.004335/0.001543 = 2.81, falls just below the table value, so the evidence against H0 is borderline at the 5% level.

PROBLEM 5
A production manager wishes to test the effect of 5 similar milling machines on the surface finish of small castings. He selected 5 such machines and conducted the experiment with four replications under each machine, as per the 'Completely Randomized Design', and obtained the following readings.
Machine   Replication
          1    2    3    4
M1        25   30   16   36
M2        10   20   33   42
M3        40   30   49   22
M4        27   20   35   48
M5        15   8    45   34
Perform the required ANOVA and state the inference at the 5% l.o.s.
Solution:
Source of variation   Sum of Squares   Degrees of Freedom   Mean Square   F Ratio
Between samples       SSB = 303.5      4                    75.875        MSB / MSW = 0.4502
Within samples        SSW = 2528.25    15                   168.55
Total                 SST = 2831.75    19
H0: µ1 = µ2 = µ3 = µ4 = µ5
H1: At least two means differ
Test Statistic: Fc = 0.4502
Table Value: F0.05,(4,15) = 3.06
Conclusion: Fc < F0.05,(4,15), Accept Null Hypothesis. There is no significant difference between machines in terms of surface finish of small castings.

6) A study of depression and exercise was conducted. Three groups were used: those in a designed exercise program, a group that is sedentary, and a group of runners. A depression rating was given to the members of each group.
Exercise Group   Sedentary Group   Runners
63               71                49
58               64                52
61               68                47
60               65                51
62               67                48
59               67
Total 363        402               247      (grand total 1012)
Do the data provide sufficient evidence to indicate a difference among the population means at the 1% level of significance?
Answer:
H0: There is no difference among the population means. α = 1%.
We shift the origin to 60, subtracting 60 from each of the given values.
Group            Coded values xij         ni   Ti    Ti²/ni   Σ xij²
Exercise Group   3 -2 1 0 2 -1            6    3     1.5      19
Sedentary Group  11 4 8 5 7 7             6    42    294      324
Runners          -11 -8 -13 -9 -12        5    -53   561.8    579
Total                                     17   -8    857.3    922
T²/n = (-8)²/17 = 3.76
SSB = Σ Ti²/ni - T²/n = 857.3 - 3.76 = 853.54
SST = ΣΣ xij² - T²/n = 922 - 3.76 = 918.24
SSW = SST - SSB = 918.24 - 853.54 = 64.70
Source of variation   Sum of squares   Degrees of freedom   Mean square   F ratio
Between samples       853.54           2                    426.77        92.35
Within samples        64.70            14                   4.62
Total                 918.24           16
F0.01,(2,14) = 6.51. Fc > F0.01,(2,14). Therefore reject H0: there is a difference between the population means.

7) Three special ovens in a metal working shop are used to heat metal specimens. All the ovens are supposed to operate at the same temperature. It is known that the temperature of an oven varies. The table below shows the temperatures, in degrees centigrade, of each of the three ovens on a random sample of heatings. Test for a difference between mean oven temperatures at the 5% l.o.s.
Oven   Temperature (°C)
1      494 497 481 496 487
2      489 494 479 478
3      489 483 487 472 472 477
Solution:
We shift the origin to 489 and subtract 489 from the given values. The ovens have different numbers of readings, so the data are analysed as a one-way classification with unequal sample sizes.
Oven    Coded values xij        ni   Ti    Ti²/ni   Σ xij²
1       5 8 -8 7 -2             5    10    20       206
2       0 5 -10 -11             4    -16   64       246
3       0 -6 -2 -17 -17 -12     6    -54   486      762
Total                           15   -60   570      1214
T²/n = (-60)²/15 = 240
SST = ΣΣ xij² - T²/n = 1214 - 240 = 974
SSB = Σ Ti²/ni - T²/n = 570 - 240 = 330
SSW = SST - SSB = 974 - 330 = 644
ANOVA TABLE:
Source of variation   Sum of squares   d.o.f.   Mean square   F ratio
Between ovens         330              2        165           3.07
Within ovens          644              12       53.67
Total                 974              14
F0.05,(2,12) = 3.89
Fc = 3.07 < 3.89, so H0 is not rejected. Hence, there is no significant difference between the mean oven temperatures at the 5% level.

8) A manufacturer of television sets is interested in the effect on tube conductivity of four different types of coating for color picture tubes. The following conductivity data are obtained (each column is one coating type).
Coating type 1 2 3 4
conductivity 143 152 134 129
141 149 136 127
150 137 132 132
146 143 127 129
Test the null hypothesis H0: µ1 = µ2 = µ3 = µ4 against the alternative that at least two of the means differ. Use α = 0.05.
Answer:
Step 1: Null hypothesis H0: µ1 = µ2 = µ3 = µ4 (population means are equal).
Step 2: Alternative hypothesis H1: µi ≠ µj for at least one pair (population means are not all equal).
α = 0.05, n = 16, four observations per coating type.
Coating type   1       2          3          4          Total
Ti             580     581        529        517        T = 2207
Ti²/ni         84100   84390.25   69960.25   66822.25   305272.75
Σ xij²         84146   84523      70005      66835      305509
T²/n = 2207²/16 = 304428.0625
SSB = Σ Ti²/ni - T²/n = 305272.75 - 304428.0625 = 844.6875
SST = ΣΣ xij² - T²/n = 305509 - 304428.0625 = 1080.9375
SSW = SST - SSB = 1080.9375 - 844.6875 = 236.25
Source of variation   Sum of squares   Degrees of freedom   Mean square   F ratio
Between samples       844.6875         3                    281.56        Fc = 14.30
Within samples        236.25           12                   19.69
Total                 1080.9375        15
F0.05,(3,12) = 3.49. Fc > Fα. Therefore reject H0 (the population means are not all equal).
RANDOMIZED BLOCK DESIGN OR TWO-WAY ANALYSIS OF VARIANCE
Two-way (or multi-way) ANOVA is an appropriate analysis method for a study with a quantitative outcome and two (or more) categorical explanatory variables. It extends the one-factor situation to take account of a second factor. The second factor is often called a Blocking Factor because it places subjects or units into homogeneous groups called Blocks. The design itself is called a Randomized Block Design. The usual assumptions of Normality, equal variance, and independent errors apply.
If an experiment has a quantitative outcome and two categorical explanatory variables that are defined in such a way that each experimental unit (subject) can be exposed to any combination of one level of one explanatory variable and one level of the other, then the most common analysis method is two-way ANOVA. Because there are two different explanatory variables, the effect on the outcome of a change in one variable may either not depend on the level of the other variable (additive model) or depend on the level of the other variable (interaction model).
Assumptions
1. The population at each factor-level combination is (approximately) Normally distributed.
2. These normal populations have a common variance, σ².
3. The effect of one factor is the same at all levels of the other factor.
Notations
Number of levels of row factor: $r$
Number of levels of column factor: $c$
Total number of observations: $r \times c$
Observation in the (i, j)-th cell of the table (i-th level of the row factor and j-th level of the column factor): $x_{ij}$, $i = 1, 2, \ldots, r$, $j = 1, 2, \ldots, c$
Sum of the c observations in the i-th row: $T_{Ri} = \sum_j x_{ij}$
Sum of the r observations in the j-th column: $T_{Cj} = \sum_i x_{ij}$
Sum of all r × c observations: $T = \sum_i \sum_j x_{ij} = \sum_i T_{Ri} = \sum_j T_{Cj}$
Computational Formulae
Total Sum of Squares: $SS_T = \sum_i \sum_j x_{ij}^2 - \frac{T^2}{rc}$
Between Rows Sum of Squares: $SS_R = \sum_i \frac{T_{Ri}^2}{c} - \frac{T^2}{rc}$
Between Columns Sum of Squares: $SS_C = \sum_j \frac{T_{Cj}^2}{r} - \frac{T^2}{rc}$
Error (residual) Sum of Squares: SSE = SST - SSR - SSC
ANOVA TABLE
Source of variation   Sum of Squares   Degrees of Freedom   Mean Square   F Ratio
Between rows          SSR              r - 1                MSR           MSR / MSE
Between columns       SSC              c - 1                MSC           MSC / MSE
Error (residual)      SSE              (r - 1)(c - 1)       MSE
Total                 SST              rc - 1
Test for the row factor
H0: No effect due to row factor
H1: An effect due to row factor
Test Statistic: FR = MSR / MSE
Critical region: FR > Fα,(r-1, (r-1)(c-1))
Test for the column factor
H0: No effect due to column factor
H1: An effect due to column factor
Test Statistic: FC = MSC / MSE
Critical region: FC > Fα,(c-1, (r-1)(c-1))
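The two-way computational formulae can be sketched in Python as follows, for a table with exactly one observation per cell. The function name and the small data set are illustrative assumptions.

```python
def two_way_anova(table):
    """Two-way ANOVA (one observation per cell); table[i][j] is row i, column j."""
    r = len(table)                      # levels of the row factor
    c = len(table[0])                   # levels of the column factor
    total = sum(sum(row) for row in table)          # T
    correction = total ** 2 / (r * c)               # T^2 / rc
    sst = sum(x * x for row in table for x in row) - correction
    ssr = sum(sum(row) ** 2 for row in table) / c - correction
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - correction
    sse = sst - ssr - ssc
    msr = ssr / (r - 1)
    msc = ssc / (c - 1)
    mse = sse / ((r - 1) * (c - 1))
    return {
        "SSR": ssr, "SSC": ssc, "SSE": sse, "SST": sst,
        "MSR": msr, "MSC": msc, "MSE": mse,
        "F_R": msr / mse, "F_C": msc / mse,
    }


# Hypothetical 2 x 3 table used only to exercise the formulae.
toy_table = [[1, 2, 3],
             [2, 4, 6]]
result = two_way_anova(toy_table)
print(result)
```

`zip(*table)` transposes the table, so the column totals are computed with the same expression as the row totals.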
PROBLEM 1
Four laboratories, A, B, C and D, are used by food manufacturing companies for making nutrition analyses of their products. The following data are the fat contents (in grams) of the same weight of three similar types of peanut butter.
Analyse the data at 5% significance by (a) carrying out a one-way ANOVA to see if there is a difference between the fat content of the three brands; (b) performing a two-way ANOVA to see if there is any difference between the brands, using the laboratories as blocks. (c) Do you think there is any evidence that the results were not reasonably consistent between the four laboratories?

a) One-way ANOVA
                Laboratory
Peanut Butter   A       B       C       D       Mean
Brand 1         16.6    17.7    16.0    16.3    16.65
Brand 2         16.0    15.5    15.6    15.9    15.75
Brand 3         16.4    16.3    15.9    16.2    16.20
Mean            16.33   16.50   15.83   16.13   16.20
Sums of squares
Total SS: Inputting all the individual values into the calculator gives the following summary statistics: n = 12, x̄ = 16.20, sn = 0.546 ⇒ n sn² = 3.58
Between Brands SS: The brand means are x̄1 = 16.65, x̄2 = 15.75 and x̄3 = 16.20. Each of these means came from 4 values, so inputting the means with a frequency of 4 gives: n = 12, x̄ = 16.20, sn = 0.367 ⇒ n sn² = 1.62 (n and x̄ for checking)
Error SS: 3.58 - 1.62 = 1.96
ANOVA table (k = 3 brands, N = 12 values)
Source           S.S.   d.f.         M.S.S.          F
Between brands   1.62   3 - 1 = 2    1.62/2 = 0.81   0.81/0.22 = 3.72
Errors           1.96   11 - 2 = 9   1.96/9 = 0.22
Total            3.58   12 - 1 = 11
Hypothesis test
H0: µ1 = µ2 = µ3
H1: At least two of them are different.
Critical value: F0.05,(2,9) = 4.26 (degrees of freedom from 'between brands' and 'errors')
Test Statistic: 3.72
Conclusion: T.S. < C.V. so H0 is not rejected. There is no significant difference between the fat content of the brands.
b) Two-way ANOVA
Sums of squares
From (a): Total SS: n sn² = 3.58
Between Brands SS: n sn² = 1.62
Between Labs SS: The laboratory means are x̄A = 16.33, x̄B = 16.50, x̄C = 15.83, x̄D = 16.13. Each of these means came from 3 values, so inputting the means with a frequency of 3 gives: n = 12, x̄ = 16.20, sn = 0.248 ⇒ n sn² = 0.74 (n and x̄ for checking)
Error SS: 3.58 - (1.62 + 0.74) = 1.22
ANOVA table (3 brands, 4 laboratories, N = 12 values)
Source           S.S.   d.f.         M.S.S.          F
Between brands   1.62   3 - 1 = 2    1.62/2 = 0.81   0.81/0.203 = 3.98
Between labs     0.74   4 - 1 = 3    0.74/3 = 0.25   0.247/0.203 = 1.21
Errors           1.22   11 - 5 = 6   1.22/6 = 0.20
Total            3.58   12 - 1 = 11
Hypothesis test for Brands
H0: µ1 = µ2 = µ3
H1: At least two of them are different.
Critical value: F0.05,(2,6) = 5.14 (degrees of freedom from 'between brands' and 'errors')
Test Statistic: 3.98
Conclusion: T.S. < C.V. so H0 is not rejected. There is no significant difference between the fat content of the brands. Blocking has not changed the conclusion even though the test statistic has increased.

c) Hypothesis test for Laboratories
H0: µA = µB = µC = µD
H1: At least two of them are different.
Critical value: F0.05,(3,6) = 4.76 (degrees of freedom from 'between labs' and 'errors')
Test Statistic: 1.21
Conclusion: T.S. < C.V. so H0 is not rejected. The results from the different laboratories are consistent.
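The effect of blocking in this problem can be seen by computing the brand F ratio twice from the same data: once with the laboratory variation left in the error term (one-way), and once with it removed (two-way). The sketch below does this from the raw fat-content values; the variable names are illustrative, and because it works from unrounded sums of squares the last digit may differ slightly from hand calculations based on rounded summary statistics.

```python
# Fat content by brand (rows) and laboratory A, B, C, D (columns).
fat = {
    "Brand 1": [16.6, 17.7, 16.0, 16.3],
    "Brand 2": [16.0, 15.5, 15.6, 15.9],
    "Brand 3": [16.4, 16.3, 15.9, 16.2],
}
rows = list(fat.values())
r, c = len(rows), len(rows[0])          # 3 brands, 4 laboratories

total = sum(sum(row) for row in rows)
correction = total ** 2 / (r * c)
sst = sum(x * x for row in rows for x in row) - correction
ss_brands = sum(sum(row) ** 2 for row in rows) / c - correction
ss_labs = sum(sum(col) ** 2 for col in zip(*rows)) / r - correction

# (a) One-way: laboratory differences stay inside the error term.
error_one_way = sst - ss_brands
f_one_way = (ss_brands / (r - 1)) / (error_one_way / (r * c - r))

# (b) Two-way: laboratories are blocks, so their variation is removed from the error.
error_blocked = sst - ss_brands - ss_labs
f_blocked = (ss_brands / (r - 1)) / (error_blocked / ((r - 1) * (c - 1)))

print(f"one-way F = {f_one_way:.2f}, blocked F = {f_blocked:.2f}")
```

Blocking shrinks the error sum of squares but also costs error degrees of freedom (9 falls to 6), which raises the critical value; here the larger F ratio is still not enough to reach significance.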
PROBLEM 2
The following data represent the number of units of production per day turned out by 5 different workers using 4 different types of machines.
          MACHINE TYPE
WORKER    A    B    C    D
1         44   38   47   36
2         46   40   52   43
3         34   36   44   32
4         43   38   46   33
5         38   42   49   39
a) Test whether the five men differ with respect to mean productivity.
b) Test whether the mean productivity is the same for the four different machine types.
Take α = 5%.
Solution
We shift the origin to 40, subtracting 40 from each of the given values, and work with the coded values xij (r = 5 workers, c = 4 machine types).
          MACHINE TYPE
WORKER    A      B      C       D       TRi    TRi²/c   Σj xij²
1         4      -2     7       -4      5      6.25     85
2         6      0      12      3       21     110.25   189
3         -6     -4     4       -8      -14    49.00    132
4         3      -2     6       -7      0      0.00     98
5         -2     2      9       -1      8      16.00    90
TCj       5      -6     38      -17     T = 20 181.5    594
TCj²/r    5.0    7.2    288.8   57.8    Σ = 358.8
Σi xij²   101    28     326     139     Σ = 594
T²/rc = 20²/20 = 20
SST = ΣΣ xij² - T²/rc = 594 - 20 = 574
SSR = Σ TRi²/c - T²/rc = 181.5 - 20 = 161.5
SSC = Σ TCj²/r - T²/rc = 358.8 - 20 = 338.8
SSE = SST - SSR - SSC = 574 - (161.5 + 338.8) = 73.7
Source of Variation          S.S.    d.o.f.      M.S.S.    F
Between rows (Workers)       161.5   r - 1 = 4   40.375    40.375/6.142 = 6.57
Between columns (Machines)   338.8   c - 1 = 3   112.933   112.933/6.142 = 18.39
Errors                       73.7    12          6.142
Total                        574     19
F0.05,(4,12) = 3.26 and F0.05,(3,12) = 3.49
FR > F0.05,(4,12) with respect to rows, hence the 5 workers differ significantly. FC > F0.05,(3,12) with respect to columns, hence the 4 machine types also differ significantly in mean productivity.
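Shifting the origin simplifies the arithmetic without changing any sum of squares, because each sum of squares depends only on deviations between values. The sketch below checks this on the worker and machine data by computing the sums of squares from both the original and the coded values; the helper name is an illustrative assumption.

```python
def two_way_sums_of_squares(table):
    """Return (SSR, SSC, SSE) for a table with one observation per cell."""
    r, c = len(table), len(table[0])
    total = sum(sum(row) for row in table)
    correction = total ** 2 / (r * c)
    sst = sum(x * x for row in table for x in row) - correction
    ssr = sum(sum(row) ** 2 for row in table) / c - correction
    ssc = sum(sum(col) ** 2 for col in zip(*table)) / r - correction
    return ssr, ssc, sst - ssr - ssc


# Units produced per day: rows are workers 1-5, columns are machine types A-D.
production = [
    [44, 38, 47, 36],
    [46, 40, 52, 43],
    [34, 36, 44, 32],
    [43, 38, 46, 33],
    [38, 42, 49, 39],
]
coded = [[x - 40 for x in row] for row in production]   # origin shifted to 40

raw_ss = two_way_sums_of_squares(production)
coded_ss = two_way_sums_of_squares(coded)
print("raw:  ", [round(v, 2) for v in raw_ss])
print("coded:", [round(v, 2) for v in coded_ss])
```

Any convenient origin gives the same ANOVA table; a value near the middle of the data keeps the squares small for hand calculation.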
PROBLEM 3
An experiment was designed to study the performance of four different detergents for cleaning injectors. The following 'cleanliness' readings were obtained with specially designed equipment for 12 tanks of gas distributed over three different models of engines. Obtain the appropriate ANOVA table and test at the 1% LOS whether there are differences in the detergents on the engines.
              Engine 1   Engine 2   Engine 3
Detergent A   45         43         51
Detergent B   47         46         52
Detergent C   48         50         55
Detergent D   42         37         49
Solution: We choose 45 as origin.
              Engine 1   Engine 2   Engine 3   Ti       Ti²/3    Σj xij²
Detergent A   0          -2         6          4        5.33     40
Detergent B   2          1          7          10       33.33    54
Detergent C   3          5          10         18       108      134
Detergent D   -3         -8         4          -7       16.33    89
Tj            2          -4         27         T = 25   163      317
Tj²/4         1          4          182.25     Σ = 187.25
Σi xij²       22         94         201        Σ = 317
Now, with r = 4 detergents and c = 3 engines,
T²/rc = 25²/12 = 52.08
SST = ΣΣ xij² - T²/rc = 317 - 52.08 = 264.92
SSR = Σ Ti²/3 - T²/rc = 163 - 52.08 = 110.92
SSC = Σ Tj²/4 - T²/rc = 187.25 - 52.08 = 135.17
SSE = SST - SSR - SSC = 264.92 - 110.92 - 135.17 = 18.83
ANOVA Table:
Source of variation            Sum of squares   d.o.f.   Mean square   F ratio
Between rows (detergents)      110.92           3        36.97         FR = 36.97/3.14 = 11.78
Between columns (engines)      135.17           2        67.58         FC = 67.58/3.14 = 21.53
Error                          18.83            6        3.14
Total                          264.92           11
F0.01,(3,6) = 9.78 and F0.01,(2,6) = 10.92
H0: There is no difference in the detergents on the engines.
H1: There is a difference in the detergents on the engines.
Since FR = 11.78 > F0.01,(3,6) = 9.78, H0 is rejected at the 1% LOS: the detergents differ. The engines also differ significantly, since FC = 21.53 > 10.92.

4) An industrial engineer is conducting an experiment on eye focus time. He is interested in the effect of the distance of the object from the eye on the focus time. Four different distances are of interest. He has five subjects available for the experiment. Because there may be differences among individuals, he decides to conduct the experiment in a randomized block design. The data obtained follow.
Subject   Distance (ft): 4 6 8 10
1 10 7 5 6
2 6 6 3 4
3 6 6 3 4
4 6 1 2 2
5 6 6 5 3
Can we say that distance affects the eye focus time at the 5% l.o.s.?
SOLUTION:
Rows are distances (r = 4) and columns are subjects (c = 5).
                Subject
Distance (ft)   1     2       3       4       5      Ti       Ti²/5   Σj xij²
4               10    6       6       6       6      34       231.2   244
6               7     6       6       1       6      26       135.2   158
8               5     3       3       2       5      18       64.8    72
10              6     4       4       2       3      19       72.2    81
Tj              28    19      19      11      20     T = 97   503.4   555
Tj²/4           196   90.25   90.25   30.25   100    Σ = 506.75
Σi xij²         210   97      97      45      106    Σ = 555
T²/rc = 97²/20 = 470.45
SST = ΣΣ xij² - T²/rc = 555 - 470.45 = 84.55
SSR = Σ Ti²/5 - T²/rc = 503.4 - 470.45 = 32.95
SSC = Σ Tj²/4 - T²/rc = 506.75 - 470.45 = 36.3
SSE = SST - SSR - SSC = 84.55 - 32.95 - 36.3 = 15.3
ANOVA TABLE:
Source of variation           Sum of squares   Degrees of freedom   Mean square   F ratio
Between rows (distances)      32.95            3                    10.98         FR = 8.612
Between columns (subjects)    36.3             4                    9.075         FC = 7.118
Error                         15.3             12                   1.275
Total                         84.55            19
F0.05,(3,12) = 3.49 and F0.05,(4,12) = 3.26
FR > F0.05,(3,12): reject H0, so distance affects the eye focus time. FC > F0.05,(4,12): reject H0, so the subjects also differ.

5) Prior to submitting a quotation for a construction project, companies prepare a detailed analysis of the estimated labour and materials costs required to complete the project. A company which employs three project cost assessors wished to compare the mean values of these assessors' cost estimates. This was done by requiring each assessor to estimate independently the costs of the same four construction projects. These costs, in £0000s, are shown below.
            Assessor
            A    B    C
Project 1   46   49   44
Project 2   62   63   59
Project 3   50   54   54
Project 4   66   68   63
Solution:
H0: i) There is no significant difference between the assessors' mean cost estimates. ii) There is no significant difference between the projects.
h = 4 projects, k = 3 assessors, N = hk = 12. Set the origin at 50.
Project   A     B      C     Ti*      Ti*²   Σj xij²
1         -4    -1     -6    -11      121    53
2         12    13     9     34       1156   394
3         0     4      4     8        64     32
4         16    18     13    47       2209   749
T*j       24    34     20    T = 78   3550   1228
T*j²      576   1156   400   Σ = 2132
T²/N = 78²/12 = 507
Total SS = ΣΣ xij² - T²/N = 1228 - 507 = 721
Between rows (projects) SS = Σ Ti*²/k - T²/N = 3550/3 - 507 = 676.33
Between columns (assessors) SS = Σ T*j²/h - T²/N = 2132/4 - 507 = 26
Error SS = 721 - 676.33 - 26 = 18.67
ANOVA TABLE
SV                            SS       d.o.f.               MS       F ratio
Between rows (projects)       676.33   h - 1 = 3            225.44   225.44/3.11 = 72.46
Between columns (assessors)   26       k - 1 = 2            13       13/3.11 = 4.18
Errors                        18.67    (h - 1)(k - 1) = 6   3.11
Total                         721      hk - 1 = 11
At the 5% l.o.s. the table value for (3,6) degrees of freedom is F0.05,(3,6) = 4.76.
At the 5% l.o.s. the table value for (2,6) degrees of freedom is F0.05,(2,6) = 5.14.
i) Projects: calculated value > table value (72.46 > 4.76), so H0 is rejected. There is a significant difference between the projects.
ii) Assessors: calculated value < table value (4.18 < 5.14), so H0 is accepted. There is no significant difference between the assessors' mean cost estimates.
LATIN SQUARE DESIGN
An n x n Latin square is a square array of n distinct letters, with each letter appearing once and only once in each row and in each column.
Example:
A B C D
B C D A
C D A B
D A B C
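The example square is cyclic: each row is the previous row shifted one place to the left. A short sketch of that construction, together with a check of the defining property, is given below. The function names are illustrative, and a cyclic square is only one of many possible Latin squares; in practice the rows, columns and letters would also be randomized.

```python
def cyclic_latin_square(symbols):
    """Build an n x n Latin square by shifting the symbol list one place per row."""
    n = len(symbols)
    return [[symbols[(row + col) % n] for col in range(n)] for row in range(n)]


def is_latin_square(square):
    """True if every symbol appears exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok


square = cyclic_latin_square("ABCD")
for row in square:
    print(" ".join(row))
print("valid Latin square:", is_latin_square(square))
```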
NOTATIONS:
Number of levels of row factor: n
Number of levels of column factor: n
Number of levels of treatment factor: n (treatments are indexed by k)
Sum of the n observations in the i-th row: $TR_i = \sum_j x_{ij}$
Sum of the n observations in the j-th column: $TC_j = \sum_i x_{ij}$
Sum of the n observations receiving the k-th treatment: $T_K = \sum x_{ij}$, summed over the cells assigned treatment k
Sum of all n x n observations: $T = \sum_i \sum_j x_{ij} = \sum_i TR_i = \sum_j TC_j$
Computational Formulae
Total Sum of Squares: $SST = \sum_i \sum_j x_{ij}^2 - \dfrac{T^2}{n^2}$
Between Rows Sum of Squares: $SSR = \sum_i \dfrac{TR_i^2}{n} - \dfrac{T^2}{n^2}$
Between Columns Sum of Squares: $SSC = \sum_j \dfrac{TC_j^2}{n} - \dfrac{T^2}{n^2}$
Between Treatments Sum of Squares: $SSTk = \sum_k \dfrac{T_K^2}{n} - \dfrac{T^2}{n^2}$
Error (residual) Sum of Squares: SSE = SST - SSR - SSC - SSTk
ANOVA TABLE
Source of variation     Sum of Squares    Degrees of Freedom    Mean Square                 F Ratio
Between rows            SSR               n-1                   MSR = SSR/(n-1)             FR = MSR/MSE
Between columns         SSC               n-1                   MSC = SSC/(n-1)             FC = MSC/MSE
Between treatments      SSTk              n-1                   MSTk = SSTk/(n-1)           FTk = MSTk/MSE
Error (residual)        SSE               (n-1)(n-2)            MSE = SSE/((n-1)(n-2))
Total                   SST               n²-1

FR, FC and FTk each follow an F distribution with (n-1, (n-1)(n-2)) d.o.f.
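The computational formulae translate directly into code. The sketch below is illustrative only: the function name `latin_square_anova`, its arguments and the returned keys are assumptions made for this example, and the data used to exercise it are the burner measurements of worked problem 1 (rows are days, columns are engines).

```python
def latin_square_anova(values, letters):
    """ANOVA for an n x n Latin square.

    values[i][j]  : response in row i, column j
    letters[i][j] : treatment label assigned to that cell
    """
    n = len(values)
    grand_total = sum(sum(row) for row in values)
    correction = grand_total ** 2 / n ** 2            # T^2 / n^2

    row_totals = [sum(row) for row in values]         # TR_i
    col_totals = [sum(col) for col in zip(*values)]   # TC_j
    trt_totals = {}                                   # T_K for each treatment
    for value_row, letter_row in zip(values, letters):
        for x, letter in zip(value_row, letter_row):
            trt_totals[letter] = trt_totals.get(letter, 0) + x

    def between(totals):
        # sum of squared totals divided by n, minus the correction term
        return sum(t * t for t in totals) / n - correction

    sst = sum(x * x for row in values for x in row) - correction
    ssr = between(row_totals)
    ssc = between(col_totals)
    sstk = between(trt_totals.values())
    sse = sst - ssr - ssc - sstk

    df_factor, df_error = n - 1, (n - 1) * (n - 2)
    mse = sse / df_error
    f_ratios = {
        name: (ss / df_factor) / mse
        for name, ss in (("rows", ssr), ("columns", ssc), ("treatments", sstk))
    }
    return {"SST": sst, "SSR": ssr, "SSC": ssc, "SSTk": sstk, "SSE": sse,
            "MSE": mse, "F": f_ratios, "df": (df_factor, df_error)}


values = [[16, 17, 20], [16, 21, 15], [15, 12, 13]]
letters = [["B1", "B2", "B3"], ["B2", "B3", "B1"], ["B3", "B1", "B2"]]
table = latin_square_anova(values, letters)
for name in ("SST", "SSR", "SSC", "SSTk", "SSE"):
    print(name, round(table[name], 2))
```

Passing the treatment layout as a separate array of labels keeps the response values in their row and column positions, so no manual rearrangement by treatment is needed.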
1) The following data resulted from an experiment to compare three burners B1, B2 and B3. A Latin square design was used, as the tests were made on three engines and were spread over three days.

          Engine 1    Engine 2    Engine 3
DAY 1     B1 16       B2 17       B3 20
DAY 2     B2 16       B3 21       B1 15
DAY 3     B3 15       B1 12       B2 13

Test the hypothesis that there is no difference between the burners at 5% LOS.

H0: There is no difference between the burners (B1 = B2 = B3).
H1: There is a difference between the burners (the burner means are not all equal).

Here n = 3 and N = n² = 9.

             Engine 1    Engine 2    Engine 3    Ti         Ti²/3      Σ_j xij²
DAY 1        16          17          20          53         936.33     945
DAY 2        16          21          15          52         901.33     922
DAY 3        15          12          13          40         533.33     538
Tj           47          50          48          T = 145    2371.00    2405
Tj²/3        736.33      833.33      768.00      Σ = 2337.67
Σ_i xij²     737         874         794         Σ = 2405

$T^2/N = 145^2/9 = 2336.11$
$SST = \sum_i \sum_j x_{ij}^2 - T^2/N = 2405 - 2336.11 = 68.89$
$SSR = \sum_i T_i^2/3 - T^2/N = 2371.00 - 2336.11 = 34.89$
$SSC = \sum_j T_j^2/3 - T^2/N = 2337.67 - 2336.11 = 1.56$

Rearrange the data according to treatment:

             B1         B2         B3
DAY 1        16         17         20
DAY 2        15         16         21
DAY 3        12         13         15
Tk           43         46         56          T = 145
Tk²/3        616.33     705.33     1045.33     Σ = 2367.00
Σ xij²       625        714        1066        Σ = 2405

$SSTk = \sum_k T_k^2/3 - T^2/N = 2367.00 - 2336.11 = 30.89$
SSE = SST - SSR - SSC - SSTk = 1.56 (computed from the unrounded values)

ANOVA TABLE
SOURCE OF VARIATION     SS              D.O.F             MEAN SQUARE      F RATIO
BETWEEN ROWS            SSR = 34.89     (n-1) = 2         MSR = 17.44      FR = MSR/MSE = 22.43
BETWEEN COLUMNS         SSC = 1.56      (n-1) = 2         MSC = 0.78       FC = MSC/MSE = 1.00
BETWEEN TREATMENTS      SSTk = 30.89    (n-1) = 2         MSTk = 15.44     FTk = MSTk/MSE = 19.86
ERROR                   SSE = 1.56      (n-1)(n-2) = 2    MSE = 0.78
TOTAL                   SST = 68.89     n²-1 = 8

$F_{0.05}(2,2) = 19.0$ and FTk = 19.86, so FTk > $F_{0.05}(2,2)$: reject H0.
RESULT: There is a difference between the burners at 5% LOS. The comparison is borderline: if the sums of squares are cut to one decimal place during the working (SSE = 1.78), the ratio falls to 17.3 < 19.0 and the decision reverses, so intermediate values should be carried at full precision.
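In this problem the treatment F ratio lies close to the tabulated value of 19.0, so the outcome is sensitive to how intermediate results are rounded. As a hedged illustration (the variable names are hypothetical and the layout of the burner indices is read from the data table), the sums of squares can be carried as exact fractions with Python's standard `fractions` module:

```python
from fractions import Fraction

values = [[16, 17, 20], [16, 21, 15], [15, 12, 13]]   # rows: days, columns: engines
burners = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]           # burner used in each cell
n = 3

correction = Fraction(sum(map(sum, values)) ** 2, n * n)   # T^2 / N


def between(totals):
    """Sum of squared totals over n, minus the correction term, kept exact."""
    return Fraction(sum(t * t for t in totals), n) - correction


sst = sum(x * x for row in values for x in row) - correction
ssr = between([sum(row) for row in values])
ssc = between([sum(col) for col in zip(*values)])
burner_totals = [
    sum(values[i][j] for i in range(n) for j in range(n) if burners[i][j] == b)
    for b in (1, 2, 3)
]
sstk = between(burner_totals)
sse = sst - ssr - ssc - sstk

f_burners = (sstk / (n - 1)) / (sse / ((n - 1) * (n - 2)))
print(sse, f_burners, float(f_burners))

# Same ratio using sums of squares cut to one decimal place during hand working.
f_truncated = (30.8 / 2) / (1.78 / 2)
print(round(f_truncated, 1))
```

The exact ratio is 139/7, about 19.86, while the truncated working gives about 17.3; the two fall on opposite sides of 19.0.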
2) An oil company tested four different blends of gasoline for fuel efficiency according to a Latin square design in order to control for the variability of four different drivers and four different models of cars. Fuel efficiency was measured in miles per gallon (mpg) after driving cars over a standard course.

Fuel efficiencies (mpg) for 4 blends of gasoline (Latin square design: blends indicated by letters A-D)

              Car models
Drivers       I          II         III        IV
1             D 15.5     B 33.9     C 13.2     A 29.1
2             B 16.3     C 26.6     A 19.4     D 22.8
3             C 10.8     A 31.1     D 17.1     B 30.3
4             A 14.7     D 34.0     B 19.7     C 21.6

Analyse the data and draw your conclusions.

Solution:
Code the data as $X_{ij} = x_{ij} - 20$; n = 4, N = n² = 16.

Driver      I          II          III        IV         Ti          Ti²        Σ_j Xij²
1           -4.5       13.9        -6.8       9.1        11.7        136.89     342.51
2           -3.7       6.6         -0.6       2.8        5.1         26.01      65.45
3           -9.2       11.1        -2.9       10.3       9.3         86.49      322.35
4           -5.3       14.0        -0.3       1.6        10.0        100.00     226.74
Tj          -22.7      45.6        -10.6      23.8       T = 36.1    349.39     957.05
Tj²         515.29     2079.36     112.36     566.44     Σ Tj² = 3273.45

We will rearrange the measurements according to letters:

Letter                              Tk          Tk²
A      9.1    -0.6    11.1   -5.3   14.3        204.49
B      13.9   -3.7    10.3   -0.3   20.2        408.04
C      -6.8   6.6     -9.2   1.6    -7.8        60.84
D      -4.5   2.8     -2.9   14.0   9.4         88.36
                                    T = 36.1    Σ Tk² = 761.73

$T^2/N = (36.1)^2/16 = 81.45$
$SST = \sum_i \sum_j X_{ij}^2 - T^2/N = 957.05 - 81.45 = 875.60$
$SSR = \frac{1}{n}\sum_i T_i^2 - T^2/N = \frac{1}{4}(349.39) - 81.45 = 5.90$
$SSC = \frac{1}{n}\sum_j T_j^2 - T^2/N = \frac{1}{4}(3273.45) - 81.45 = 736.91$
$SSTk = \frac{1}{n}\sum_k T_k^2 - T^2/N = \frac{1}{4}(761.73) - 81.45 = 108.98$
SSE = SST - SSR - SSC - SSTk = 875.60 - 5.90 - 736.91 - 108.98 = 23.81

ANOVA TABLE:
SV                               SS         DOF               MS         F
Between rows (drivers)           5.90       n-1 = 3           1.97       0.50
Between columns (car models)     736.91     n-1 = 3           245.64     61.90
Between letters (blends)         108.98     n-1 = 3           36.33      9.15
Residual error                   23.81      (n-1)(n-2) = 6    3.97
Total                            875.60     n²-1 = 15

At 5% LOS the table value for (3,6) d.o.f. is $F_{0.05}(3,6) = 4.76$.
Rows: FR = 0.50 < 4.76, so H0 is accepted, i.e. there is no significant difference in fuel efficiency between rows (drivers).
Columns: FC = 61.90 > 4.76, so H0 is rejected, i.e. there is a significant difference in fuel efficiency between columns (car models).
Letters: FTk = 9.15 > 4.76, so H0 is rejected, i.e. there is a significant difference in fuel efficiency between the blends of gasoline.
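Subtracting 20 from every observation simplifies the hand arithmetic without affecting the analysis, because each sum of squares depends only on deviations. The sketch below, with a hypothetical helper `sums_of_squares`, illustrates this on the coded values from the solution table and on the same values shifted back by 20; the letter layout is read row by row from the Latin square.

```python
def sums_of_squares(values, letters):
    """Return (SST, SSR, SSC, SSTk, SSE) for an n x n Latin square."""
    n = len(values)
    correction = sum(map(sum, values)) ** 2 / n ** 2
    treatment_totals = {}
    for value_row, letter_row in zip(values, letters):
        for x, letter in zip(value_row, letter_row):
            treatment_totals[letter] = treatment_totals.get(letter, 0.0) + x

    def between(totals):
        return sum(t * t for t in totals) / n - correction

    sst = sum(x * x for row in values for x in row) - correction
    ssr = between(sum(row) for row in values)
    ssc = between(sum(col) for col in zip(*values))
    sstk = between(treatment_totals.values())
    return sst, ssr, ssc, sstk, sst - ssr - ssc - sstk


# Coded values (observation minus 20): rows are drivers, columns are car models.
coded = [[-4.5, 13.9, -6.8, 9.1],
         [-3.7, 6.6, -0.6, 2.8],
         [-9.2, 11.1, -2.9, 10.3],
         [-5.3, 14.0, -0.3, 1.6]]
blends = ["DBCA", "BCAD", "CADB", "ADBC"]
shifted_back = [[x + 20 for x in row] for row in coded]

coded_ss = sums_of_squares(coded, blends)
raw_ss = sums_of_squares(shifted_back, blends)
print([round(s, 2) for s in coded_ss])
print([round(s, 2) for s in raw_ss])
```

Both lines print the same five values, which is why the choice of origin is purely a matter of convenience.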
3) The numbers of wireworms counted in the plots of a Latin square following soil fumigations (L, M, N, O, P) in the previous year were:

             COLUMNS
ROWS         1         2         3         4         5
1            P(4)      O(2)      N(5)      L(1)      M(3)
2            M(5)      L(1)      O(6)      N(5)      P(3)
3            O(4)      M(8)      L(1)      P(5)      N(4)
4            N(12)     P(7)      M(7)      O(10)     L(5)
5            L(5)      N(4)      P(3)      M(6)      O(9)

Solution:
Code the data as $X_{ij} = x_{ij} - 1$; n = 5, N = 25.

             COLUMNS
ROWS         1       2       3       4       5       Ti         Ti²      Σ_j Xij²
1            3       1       4       0       2       10         100      30
2            4       0       5       4       2       15         225      61
3            3       7       0       4       3       17         289      83
4            11      6       6       9       4       36         1296     290
5            4       3       2       5       8       22         484      118
Tj           25      17      17      22      19      T = 100    2394     582
Tj²          625     289     289     484     361     Σ Tj² = 2048
Σ_i Xij²     171     95      81      138     97      Σ = 582

We will rearrange the measurements according to letters:

ROWS         L       M       N       O       P       Ti
1            0       2       4       1       3       10
2            0       4       4       5       2       15
3            0       7       3       3       4       17
4            4       6       11      9       6       36
5            4       5       3       8       2       22
Tk           8       24      25      26      17      T = 100
Tk²          64      576     625     676     289     Σ Tk² = 2230
Σ Xij²       32      130     171     180     69      Σ = 582

$T^2/N = (100)^2/25 = 400$
Total variation: $SST = \sum_i \sum_j X_{ij}^2 - T^2/N = 582 - 400 = 182$
$SSR = \frac{1}{n}\sum_i T_i^2 - T^2/N = \frac{1}{5}(2394) - 400 = 78.8$
$SSC = \frac{1}{n}\sum_j T_j^2 - T^2/N = \frac{1}{5}(2048) - 400 = 9.6$
$SSTk = \frac{1}{n}\sum_k T_k^2 - T^2/N = \frac{1}{5}(2230) - 400 = 46$
SSE = 182 - 78.8 - 9.6 - 46 = 47.6

ANOVA TABLE
Source of variation     Sum of squares    Degrees of freedom    Mean square    F ratio
Between rows            78.8              n-1 = 4               19.7           4.97
Between columns         9.6               n-1 = 4               2.4            0.61
Between letters         46                n-1 = 4               11.5           2.90
Residual error          47.6              (n-1)(n-2) = 12       3.97
Total                   182               n²-1 = 24

At 5% LOS the table value for (4,12) d.o.f. is $F_{0.05}(4,12) = 3.26$.
Rows: FR = 4.97 > 3.26, so H0 is rejected. There is a significant difference in wireworm counts between rows.
Columns: FC = 0.61 < 3.26, so H0 is accepted. There is no significant difference between columns.
Letters: FTk = 2.90 < 3.26, so H0 is accepted. There is no significant difference between the soil fumigations.
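The final step of each problem is the same decision rule: compare every calculated F ratio with the tabulated value for (n-1, (n-1)(n-2)) degrees of freedom. A minimal sketch for the wireworm data follows. It uses the raw counts arranged as in the coded solution table (subtracting 1 does not change the sums of squares), and the constant `F_CRIT = 3.26` is the standard tabulated 5% value for (4, 12) degrees of freedom, hard-coded here as an assumption rather than computed.

```python
F_CRIT = 3.26  # tabulated F(0.05; 4, 12), supplied by hand from an F table

# Raw wireworm counts: same row/column arrangement as the coded solution table.
counts = [[4, 2, 5, 1, 3],
          [5, 1, 6, 5, 3],
          [4, 8, 1, 5, 4],
          [12, 7, 7, 10, 5],
          [5, 4, 3, 6, 9]]
fumigants = ["PONLM", "MLONP", "OMLPN", "NPMOL", "LNPMO"]
n = 5

correction = sum(map(sum, counts)) ** 2 / n ** 2


def between(totals):
    """Sum of squared totals over n, minus the correction term."""
    return sum(t * t for t in totals) / n - correction


fumigant_totals = {}
for count_row, letter_row in zip(counts, fumigants):
    for x, letter in zip(count_row, letter_row):
        fumigant_totals[letter] = fumigant_totals.get(letter, 0) + x

ss = {
    "rows": between(sum(row) for row in counts),
    "columns": between(sum(col) for col in zip(*counts)),
    "fumigants": between(fumigant_totals.values()),
}
sst = sum(x * x for row in counts for x in row) - correction
mse = (sst - sum(ss.values())) / ((n - 1) * (n - 2))

f_ratio = {source: (value / (n - 1)) / mse for source, value in ss.items()}
for source, f in f_ratio.items():
    verdict = "reject H0" if f > F_CRIT else "accept H0"
    print(f"{source}: F = {f:.2f} -> {verdict}")
```

Only the row effect exceeds the critical value; the column and fumigant ratios fall below it.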