21 June, 2022

R programming interpretation

3.
a.
The model is adequate because the multiple R-squared is 0.9797. This means that the model explains about 97.97% of the variation in the heart data.
b.
X1 is the independent variable; in our case it is the biking variable.
X2 is the control variable; in this assignment it is the smoking variable.
Y is the dependent variable; in our case it is the heart disease variable.
ε is the error term.
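Taken together, these definitions correspond to the usual multiple linear regression form. The coefficient symbols β0, β1 and β2 are notation assumed here for illustration; they are not named in the answer above.

```latex
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon
```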
c.
The model is adequate because of its high multiple R-squared of 0.9796. This means that biking and smoking together explain about 97.96% of the variation in heart disease, so the model is suitable for investigating their impact on heart disease occurrence.
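As a quick check, R-squared can be recomputed from sums of squares. The sketch below is in Python rather than R, and it assumes that the sums of squares quoted under parts e and f (9090.6 for biking, 1086.0 for smoking, 211.7 residual) all come from the same two-predictor fit. The function name `r_squared` is only illustrative.

```python
def r_squared(explained_ss, residual_ss):
    """R-squared = 1 - SSE / SST, where SST = explained SS + residual SS."""
    total_ss = sum(explained_ss) + residual_ss
    return 1.0 - residual_ss / total_ss


# Sums of squares as quoted in this post (assumed to belong to one fit).
r2 = r_squared([9090.6, 1086.0], 211.7)
print(round(r2, 4))
```

With these rounded inputs the result comes out at about 0.9796, in line with the multiple R-squared reported here.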
d.
The validation checks that can be undertaken are as follows:
• Checking linearity
• Checking homoscedasticity
• Checking for the presence or absence of multicollinearity
• Checking independence and normality of errors
I can validate the following:
• Homoscedasticity
• Multicollinearity scores
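A multicollinearity score is commonly reported as a variance inflation factor (VIF). With exactly two predictors, the VIF of each one reduces to 1 / (1 - r²), where r is the Pearson correlation between them. The Python sketch below illustrates that special case only; the biking and smoking values are made up for the example and are not the heart data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical values, for illustration only (not the heart data).
biking = [10.0, 20.0, 30.0, 40.0, 50.0]
smoking = [12.0, 5.0, 20.0, 8.0, 15.0]
vif = vif_two_predictors(biking, smoking)
print(round(vif, 3))
```

A VIF close to 1 suggests little overlap between the predictors; larger values indicate stronger multicollinearity.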
e.
The two tables differ in their Sum Sq, Mean Sq and F value entries.
In the first table, biking accounts for a sum of squares of 9090.6 in heart disease, while in the second table it accounts for 9183.8.
In the first table, smoking accounts for a sum of squares of 1086.0, while in the second table it accounts for 992.7. These values measure the variation in heart disease attributed to each predictor, not the effect of a one-point increase. The combined totals (about 10176.6 and 10176.5) agree, which is consistent with the predictors having been entered in a different order in each table.
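One way to compare the two tables is to add up the sums of squares for the predictors in each. The Python sketch below does this with the figures quoted above. It rests on an assumption not confirmed in this post: that both tables report sequential sums of squares for the same model, with biking and smoking entered in opposite orders.

```python
# Sums of squares for each predictor, as quoted for the two tables.
first_table = {"biking": 9090.6, "smoking": 1086.0}
second_table = {"biking": 9183.8, "smoking": 992.7}

# Total variation attributed to the predictors in each table.
explained_first = sum(first_table.values())
explained_second = sum(second_table.values())

# Gap between the two totals; rounding in the quoted figures can
# account for a difference of this size.
gap = abs(explained_first - explained_second)
print(round(explained_first, 1), round(explained_second, 1), round(gap, 1))
```

If the assumption holds, the individual entries shift between biking and smoking depending on the order, while the total explained variation stays essentially the same.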
f.
SOURCE      D.F    S.S       M.S       F VALUE
BIKING      1      9090.6    9090.6    21251.7
RESIDUAL    495    211.7     0.4       -
TOTAL       496    9302.3    -         -
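The entries in this table are linked by two simple rules: a mean square is the sum of squares divided by its degrees of freedom, and the F value is the biking mean square divided by the residual mean square. The Python sketch below recomputes them from the rounded figures in the table; the helper name `mean_square` is illustrative.

```python
def mean_square(sum_sq, df):
    """Mean square = sum of squares / degrees of freedom."""
    return sum_sq / df


# Figures taken from the table in this answer.
ms_biking = mean_square(9090.6, 1)
ms_residual = mean_square(211.7, 495)

# F value = mean square of the term / residual mean square.
f_value = ms_biking / ms_residual
print(round(ms_residual, 4), round(f_value, 1))
```

Because the inputs are rounded, the recomputed F value lands close to, but not exactly on, the 21251.7 in the table. The total row carries only degrees of freedom and a sum of squares, since mean squares and F values are not added across rows.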
4.
a.
A design is balanced when all treatment groups have an equal number of experimental units.
Yes, the experiment is balanced.
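Checking balance amounts to counting the experimental units in each treatment combination and confirming that the counts match. The Python sketch below shows the idea. The group labels, method labels and two replicates per cell are hypothetical and are not taken from the experiment's data.

```python
from collections import Counter


def is_balanced(cell_labels):
    """True when every observed treatment cell has the same number of units."""
    counts = Counter(cell_labels)
    return len(set(counts.values())) == 1


# Hypothetical layout: 3 graduation groups x 3 training methods,
# with 2 units per cell (labels and counts are assumptions).
cells = [
    (group, method)
    for group in (1, 2, 3)
    for method in ("A", "B", "C")
    for _ in range(2)
]
print(is_balanced(cells))
```

This check only sees cells that appear in the data, so a combination with no observations at all would need to be checked separately.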
b.
Graduation group 1 performed best on the proficiency test across all training methods, while graduation group 3 performed worst.
c.
The first model is better than the second because it has a lower residual sum of squares: 47, compared with 64.33 for the second model. Graduation group accounts for a sum of squares of 152.33 in the proficiency score, while method of training accounts for 849.33.
d.
The Tukey HSD test is suitable for the data because it assesses whether the differences between pairs of groups are significant, and both training method and graduation are grouped variables.
e.
Based on the Tukey HSD, the training methods have a significant effect on the proficiency scores obtained. The training method p-value is less than the 0.05 significance level, so the null hypothesis is rejected. The p-value for the 2-1 graduation group comparison is greater than the significance level, so we fail to reject the null hypothesis. The p-values for the 3-1 and 3-2 graduation group comparisons are less than the 0.05 significance level, so the null hypothesis is rejected.
