Please review this work and make any corrections; if you think more information is needed, please add it. The work needs better organization, and each sentence should be revised without changing its meaning or the graphs.
Question 1: the SAS code is:
First, we use the normal probability plot to identify the significant effects:
In both the normal plot and the half-normal plot, Temperature is the only effect that falls
far from the reference line compared with the other effects.
Moreover, Temperature is clearly the only effect that is highly significant among the
factors. This can be seen in both the main effects plot and the Lenth plot.
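The Lenth plot judges each effect against a pseudo standard error (PSE) computed from the effect estimates themselves. The following is a minimal Python sketch of that calculation. The effect values are hypothetical placeholders, not the estimates from this experiment, and the function and variable names are illustrative.

```python
from statistics import median

def lenth_pse(effects):
    """Return Lenth's pseudo standard error for a list of effect estimates."""
    abs_effects = [abs(e) for e in effects]
    s0 = 1.5 * median(abs_effects)                        # initial robust scale estimate
    trimmed = [a for a in abs_effects if a < 2.5 * s0]    # drop effects that look active
    return 1.5 * median(trimmed)

# Hypothetical effect estimates for a 2^3 design (illustration only).
effects = {"A": 23.0, "B": -5.0, "C": 1.5, "AB": 1.5, "AC": 10.0, "BC": 0.0, "ABC": 0.5}
pse = lenth_pse(list(effects.values()))

# Ratio of each |effect| to the PSE; large ratios point to active effects.
ratios = {name: abs(value) / pse for name, value in effects.items()}
```

An effect whose ratio is far above the others is the one that stands apart from the line in the half-normal plot.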
Second, we use analysis of variance to verify this result:
We set up a hypothesis test for each factor to see which factors impact Yield:
$H_0$: Temperature does not impact Yield.
$H_1$: Temperature does impact Yield.
From the ANOVA table, Temperature has a p-value of 0.0162 with an F value of 15.97.
We compare these results with the critical value $F_{0.05,1,56} = 4.01$ at significance
level $\alpha = 0.05$. These results confirm what we found in the first part. Hence we
reject the null hypothesis and conclude that Temperature does impact Yield.
$H_0$: Concentration does not impact Yield.
$H_1$: Concentration does impact Yield.
From the ANOVA table, Concentration has a p-value of 0.4340 with an F value of 0.75.
We compare these results with the critical value $F_{0.05,1,56} = 4.01$ at significance
level $\alpha = 0.05$. Hence we do not reject the null hypothesis and conclude that
Concentration does not impact Yield.
$H_0$: Catalyst does not impact Yield.
$H_1$: Catalyst does impact Yield.
From the ANOVA table, Catalyst has a p-value of 0.8073 with an F value of 0.07. We
compare these results with the critical value $F_{0.05,1,56} = 4.01$ at significance
level $\alpha = 0.05$. Hence we do not reject the null hypothesis and conclude that
Catalyst does not impact Yield.
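The three tests above all apply the same decision rule: reject the null hypothesis when the p-value is below the significance level. A short Python sketch of that rule follows, using the F values and p-values quoted in the text. The dictionary layout and the function name `decide` are illustrative choices, not part of the original SAS output.

```python
ALPHA = 0.05  # significance level used in the text

# (F value, p-value) pairs as quoted from the ANOVA table.
anova_results = {
    "Temperature": (15.97, 0.0162),
    "Concentration": (0.75, 0.4340),
    "Catalyst": (0.07, 0.8073),
}

def decide(p_value, alpha=ALPHA):
    """Reject H0 (the factor does not impact Yield) when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"

decisions = {factor: decide(p) for factor, (_f, p) in anova_results.items()}
```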
Looking at the residual plots above, we see that there may be some problem with
unequal variance. Normality and linearity look reasonably good in the other residual
graphs.
Question 2
We have an unreplicated 2^3 factorial design, which means the design has 8 treatment
combinations. The risk of conducting an experiment with an unreplicated design is that the
model may contain a lot of noise, especially when the dependent variable (yield) is highly
variable. Misleading conclusions may therefore result from the experiment. Another problem
with an unreplicated design is that there is no estimate of pure error, so we need to assume
that certain high-order interactions are negligible. When the high-order interactions may be
important, we instead examine the normal probability plot of the effect estimates.
Negligible effects are normally distributed with mean zero and tend to fall along the blue
line, whereas significant effects have nonzero means and do not lie along the blue line. The
important effects are the main effect of temperature and the interaction between temperature
and catalyst.
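The coordinates behind a normal probability plot of effects can be sketched in a few lines of Python. This is only an illustration: the effect values are hypothetical, the plotting position (i - 0.5)/n is one common convention, and the function name is made up for this example.

```python
from statistics import NormalDist

def normal_plot_points(effects):
    """Pair each effect (sorted ascending) with its theoretical normal quantile."""
    ordered = sorted(effects.items(), key=lambda item: item[1])
    n = len(ordered)
    points = []
    for i, (name, value) in enumerate(ordered, start=1):
        prob = (i - 0.5) / n                       # plotting position in (0, 1)
        points.append((name, value, NormalDist().inv_cdf(prob)))
    return points

# Hypothetical effect estimates for a 2^3 design (illustration only).
effects = {"A": 23.0, "B": -5.0, "C": 1.5, "AB": 1.5, "AC": 10.0, "BC": 0.0, "ABC": 0.5}
points = normal_plot_points(effects)
```

Plotting the effect value against the quantile for each point gives the graph: negligible effects cluster near a straight line through zero, while active effects sit away from it at the ends.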
To verify the significant terms with the analysis of variance, we pool the negligible
effects into an estimate of error. Because the interaction between temperature and catalyst
is significant, we also need to include the main effect of catalyst in the analysis of
variance.
Table 1
Source          Nparm  DF  Sum of Squares  F Ratio  Prob > F
temp                1   1       1058.0000  36.8000   0.0037*
catalyst            1   1          4.5000   0.1565   0.7126
temp*catalyst       1   1        200.0000   6.9565   0.0577
Table 2
Term           Estimate  Std Error  t Ratio  Prob>|t|
Intercept         66.75   1.895719    35.21   <.0001*
temp              11.5    1.895719     6.07   0.0037*
catalyst           0.75   1.895719     0.40   0.7126
temp*catalyst      5      1.895719     2.64   0.0577
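The two tables above can be cross-checked against each other. The Python sketch below backs out the error mean square implied by the reported sum of squares and F ratio for temp, then recomputes the other F ratios, the standard error, and the t ratio for temp. It assumes that each term has 1 degree of freedom, that the factors are coded -1/+1, and that there are 8 runs. The variable names are illustrative.

```python
import math

N_RUNS = 8  # unreplicated 2^3 design: 8 treatment combinations

# Sum of squares and F ratio for temp, as reported in the first table.
ss_temp, f_temp = 1058.0, 36.8
mse = ss_temp / f_temp                 # implied error mean square (term has 1 df)

f_catalyst = 4.5 / mse                 # F ratio for catalyst
f_interaction = 200.0 / mse            # F ratio for temp*catalyst
std_error = math.sqrt(mse / N_RUNS)    # standard error of a coefficient under -1/+1 coding
t_temp = 11.5 / std_error              # t ratio for temp; its square equals the F ratio
```

The recomputed values agree with the printed ones, which suggests the two tables come from the same fitted model.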
Based on table 1, the interaction between temperature and catalyst is significant (F=6.96,
p=0.058) at the 0.1 level (N.B. main effects are of little importance on their own when they
are involved in significant interactions).
The interaction between temperature and catalyst indicates that the effect of catalyst
depends on temperature: according to the estimates in table 2, increasing catalyst lowers the
predicted yield at low temperature (-1) and raises it at high temperature (+1). From table 2,
the estimated yield is given by
$\hat{y} = 66.75 + 11.5\,\text{temp} + 0.75\,\text{catalyst} + 5\,\text{temp} \times \text{catalyst}$
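To see what the fitted model predicts, the coefficients from the parameter estimates table (intercept 66.75, temp 11.5, catalyst 0.75, temp*catalyst 5) can be evaluated at the four combinations of coded levels. This is a small Python sketch, and the function and variable names are illustrative.

```python
def predicted_yield(temp, catalyst):
    """Fitted yield for coded factor levels (-1 = low, +1 = high)."""
    return 66.75 + 11.5 * temp + 0.75 * catalyst + 5.0 * temp * catalyst

# Predicted yield at each of the four (temp, catalyst) corners.
corners = {(t, c): predicted_yield(t, c) for t in (-1, 1) for c in (-1, 1)}

# Change in predicted yield when catalyst moves from low to high, at each temperature.
catalyst_effect_low_temp = corners[(-1, 1)] - corners[(-1, -1)]
catalyst_effect_high_temp = corners[(1, 1)] - corners[(1, -1)]
```

Comparing the two differences shows how the catalyst effect depends on the temperature level.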
Based on the normal quantile plot (fig 3), the normality assumption seems to be fulfilled,
and the predicted versus residual plot (fig 4) shows no pattern, thus the assumption of
homoscedasticity is also fulfilled.
Question 3
A. Dependent variable distribution
Based on the boxplot below, the distribution of the dependent variable (amount of metal
recovered from an ore sample of a given weight) is approximately symmetric, without
outliers. The mean of the dependent variable is 79.125 with a standard deviation of 10.69.
B. Full Factorial ANOVA
There are two replications, so there is an internal estimate of pure error for testing the
high-order interactions. From the ANOVA table, the model is statistically significant
(F=8.85, p=0.003), which means at least one term has a significant effect on the DV. The
model explains 88.56% of the total variability in the DV. The last two tables below
summarize the sums of squares and the percentage contribution of each model term
relative to the total sum of squares. Only the main effects of concentration (F=11.11, p=0.01)
and time (F=45.31, p<0.001) are significant at the 5% level. The main effect of time
dominates this process, accounting for 73.9% of the total variability explained by the
model, whereas the main effect of concentration accounts for about 17.9%.
Term             Contribution
temp                     0.4%
conc                    17.9%
temp*conc                1.3%
time                    73.9%
temp*time                0.8%
conc*time                0.8%
temp*conc*time           4.8%
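The contribution table can be ranked programmatically to confirm which terms dominate. The Python sketch below uses the percentages exactly as listed above; the variable names are illustrative.

```python
# Percentage contributions as listed in the table above.
contributions = {
    "temp": 0.4, "conc": 17.9, "temp*conc": 1.3, "time": 73.9,
    "temp*time": 0.8, "conc*time": 0.8, "temp*conc*time": 4.8,
}

# Rank terms from largest to smallest contribution.
ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
top_two = [name for name, _share in ranked[:2]]
top_two_share = sum(share for _name, share in ranked[:2])

# The listed percentages are rounded, so the total need not be exactly 100.
total = sum(contributions.values())
```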
C. ANOVA for Selected Factorial Model
From the previous step we found that only the main effects of time and concentration have
significant effects on the DV at the 5% level, so in this stage we estimate the model with
only these important effects. The model is significant (F=28.39, p<0.001), which means the
level means differ. The model explains 81.4% of the total variation. The main effects of
concentration and time are both significant at the 5% level, with (F=11.09, p=.005) and
(F=45.7, p<.001) respectively. From the profile plots, the estimated average of the DV
increases significantly when the levels of concentration and time increase from the low
level (-1) to the high level (+1).
D. Assumptions of the selected Factorial Model
From the QQ plot, the normality assumption seems to be fulfilled, given that the
quantiles tend to fall along the line except for one observation (an outlier).
Homoscedasticity also seems acceptable, since there is no pattern between the predicted
values and Rstudent (studentized residuals), but there is a clear outlier with Rstudent > 3
(the 1st observation, where y=80 and all factors are at the low level -1).
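The outlier rule used here, an absolute studentized residual above 3, is easy to apply mechanically. The Python sketch below shows the idea. The residual values are hypothetical and are not read from the plot, and the function name is illustrative.

```python
def flag_outliers(studentized_residuals, threshold=3.0):
    """Return 1-based indices of observations whose |residual| exceeds the threshold."""
    return [
        i for i, r in enumerate(studentized_residuals, start=1)
        if abs(r) > threshold
    ]

# Hypothetical studentized residuals (illustration only).
residuals = [3.4, -0.8, 0.5, 1.1, -1.3, 0.2, -0.6, 0.9]
outliers = flag_outliers(residuals)
```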
Question 4
A. Dependent variable distribution
Based on the boxplot, the distribution of the DV does not contain extreme values and is
slightly right skewed. The average of the DV is 198.33 with a standard deviation of 128.24,
which indicates large variability.
B. Main effects ANOVA
There are two nominal independent variables (factors). One independent variable has five
levels (material), and the other has three levels (machines). There are a total of 15
observations, which are divided into 15 cells. Each cell therefore contains one observation,
which means we have an unreplicated factorial design. There is then no estimate of pure
error, so we need to assume that the high-order interaction is negligible to perform the ANOVA.
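The degrees-of-freedom bookkeeping shows why the interaction has to be assumed negligible. With one observation per cell, no degrees of freedom remain for pure error, so the interaction degrees of freedom are used as the error term. A minimal Python sketch follows; the function name and dictionary keys are illustrative.

```python
def two_factor_df(levels_a, levels_b, reps):
    """Degrees of freedom for a two-factor factorial with `reps` observations per cell."""
    df_a = levels_a - 1
    df_b = levels_b - 1
    return {
        "total": levels_a * levels_b * reps - 1,
        "A": df_a,
        "B": df_b,
        "A*B": df_a * df_b,
        "pure_error": levels_a * levels_b * (reps - 1),
    }

# Material (5 levels) by machines (3 levels), one observation per cell.
df = two_factor_df(levels_a=5, levels_b=3, reps=1)

# In the main effects model the interaction is pooled into the error term.
error_df_main_effects = df["A*B"] + df["pure_error"]
```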
1) Model
From the ANOVA table, the model is statistically significant (F=41.65, p<.001), which means
at least one factor has a significant effect on the DV. The model explains 96.7% of the total
variability in the DV. The last table below summarizes the sums of squares and F-test of
each model term. Only the main effect of material (F=60.63, p<.001) is significant at the
5% level, whereas the main effect of machines is significant (F=3.71, p=0.073) only at the
10% level.
2) Model assumption
Based on the QQ plot and the plot of predicted values versus studentized residuals, both
assumptions seem to be fulfilled, except that there is an extreme value (outlier) with a
studentized residual > 6. In such a situation we need to investigate why this observation is
an outlier, or we can transform the DV (e.g. logarithmic transformation).
C. Main effect ANOVA with transformed DV
In our case we transform the DV to the logarithmic scale, so the transformed DV is defined as
$\text{DV}^{*} = \log(\text{DV})$
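A logarithmic transformation of this kind can be sketched in Python as follows. The natural logarithm is an assumption, since the base is not stated here, and the DV values are hypothetical.

```python
import math

# Hypothetical raw DV values (illustration only); all must be positive.
dv = [50.0, 120.0, 198.0, 450.0]

log_dv = [math.log(y) for y in dv]                  # natural log of each observation
back_transformed = [math.exp(v) for v in log_dv]    # returns to the original scale

# The log scale compresses large values: compare the max/min ratio before and after.
raw_ratio = max(dv) / min(dv)
log_ratio = max(log_dv) / min(log_dv)
```

Estimated averages from a model fitted on the log scale need to be back-transformed before they are reported in the original units.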
1) Model
After the transformation, F and R square increase to 57.25 and 97.7% respectively, and
both main effects become significant at the 5% level.
Based on the profile plot of the factor material, the 1st level has the highest estimated
average and the 5th level has the lowest estimated average. The estimated averages
of the 2nd and 3rd levels are not statistically different at the 5% level.
Based on the profile plot of the factor machines, the 1st level has the lowest estimated
average, followed by the 2nd level and then the 3rd level. However, there are no significant
differences between the 1st and 2nd levels or between the 2nd and 3rd levels, but there is a
significant difference between the 1st and 3rd levels.
2) Model assumption
Both assumptions are clearly valid after the transformation of the DV to the logarithmic scale.
Question 5
The experimental design contains two factors (clay and mold). Both factors have three levels.
There are a total of 45 observations, which are divided into 9 cells. The design is balanced,
so each cell contains five observations, which means we have a replicated factorial
design. We can therefore include the high-order interaction (i.e. a full factorial ANOVA).
A. Full Factorial ANOVA
1) Model
From the ANOVA table, the model is statistically significant (F=13.68, p<.001), which means
at least one effect is significant. The model explains 75.25% of the total variability in
the DV. The interaction is significant (F=2.99, p=0.032) at the 5% level, which means the
effect of one independent variable (clay or mold) on the dependent variable differs across
the levels of the other independent variable. Both main effects are also significant at the
5% level. However, the main effects are of little importance on their own when the
interaction is significant.
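The reported overall F and R square are linked by a standard identity, which gives a quick consistency check. The Python sketch below assumes the degrees of freedom implied by the design described above: 9 cells give 8 model degrees of freedom, and 45 observations leave 36 for error. The function name is illustrative.

```python
def overall_f(r_squared, df_model, df_error):
    """Overall F statistic implied by R^2 and the model/error degrees of freedom."""
    return (r_squared / df_model) / ((1.0 - r_squared) / df_error)

# Full factorial 3 x 3 design with 5 observations per cell (45 observations):
# model df = 3*3 - 1 = 8, error df = 45 - 9 = 36.
f_full = overall_f(0.7525, df_model=8, df_error=36)
```

The result rounds to the F value of 13.68 quoted from the ANOVA table.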
2) Model assumptions
Based on the QQ plot, the quantiles do not lie along the line, thus the normality assumption
is not fulfilled. In addition, the studentized residuals increase as the predicted values
increase (the variance is not constant), thus homoscedasticity is not fulfilled.
To fix these issues we will transform the DV.
B. Full Factorial ANOVA with transformed DV
1) Model
After trying many transformations, we found that the assumptions become valid when we
use the following transformation:
$\text{DV}^{*} = \text{DV}^{-3/2}$
The values of F and R square increase to 43.44 and 90.6% respectively, and the interaction
term is no longer significant (F=2.21, p=.087) at the 5% level. Moreover, the F values of the
main effects increase considerably, from 27.2 to 99.05 for the factor clay and from 21.54 to
70.28 for the factor mold, which means the importance of the main effects increases and the
importance of the interaction becomes negligible. Therefore we need to remove the
interaction term from the model.
2) Model assumptions
The assumptions are clearly valid, with no outliers, given that the absolute values of the
studentized residuals are < 3.
C. Main Effects ANOVA with transformed DV
1) Model
After removing the interaction term, the value of F increases to 75.5 and R square decreases
to 88.3%. The F values of the main effects decrease slightly, from 99.05 to 88.32 for the
factor clay and from 70.28 to 62.67 for the factor mold, so both main effects remain strong
without the interaction term.
Based on the profile plot of the factor clay, the 1st level has the highest estimated average,
and the 2nd and 3rd levels have the lowest estimated averages, which are not significantly
different at the 5% level.
Based on the profile plot of the factor mold, the 1st and 2nd levels have the lowest estimated
averages (they are significantly different) and the 3rd level has the highest estimated average.
2) Model assumptions
...