22 May, 2020

Test of Association Assignment


Name________________________________
This assignment is worth 100 points. Each problem/part is worth 5 points.
Due Date: Thursday, May 21, 2020 by 11:59 p.m. EST.
You do not have to type your responses. If you wish to provide hand-written work and scan your document, that is fine. Upload your final document with complete responses to the ‘Test of Association Assignment’ in Canvas.

1.       Some students in the master's of analytics program at Harrisburg University claim they are really good at soccer. To see whether there is an association among students and faculty in terms of the goals they can score in a soccer game, the analytics faculty invited students and teaching assistants (Ph.D. students) to play in soccer matches. Every player was allowed to play in only one match. Over many matches, we counted the number of players who scored goals.

a.       The soccer.txt file is attached to the ‘Test of Association Assignment’ in Canvas. Import this file into R. Copy and paste your code below that shows where you imported the data. Name the data frame that you create ‘soccer.’
library(readr)
soccer <- read_csv("kochu/soccer.txt")
b.      When you examine the data frame you imported into R, you will notice that it is not in the appropriate format for performing a chi-square test. Using some of the R categorical functions that you have learned in the course, convert the soccer data frame that you imported in part a into a table such that the Job variable is on the rows and the Score variable is on the columns. Name the table you create ‘soccer_t’. Copy and paste your code below that created the table.
# Job on the rows, Score on the columns; assumes the count column is named Freq
soccer_t <- xtabs(Freq ~ Job + Score, data = soccer)
Table created below
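
As a minimal sketch of the long-to-table conversion, assuming the imported file has one row per Job/Score combination plus a count column (called Freq here, which is an assumed name), xtabs() can build the table and addmargins() can add the totals. The data frame demo and its counts are made up for illustration and are not the values in soccer.txt.

```r
# Hypothetical long-format data; the counts are invented, not taken from soccer.txt
demo <- data.frame(
  Job   = c("Student", "Student", "TA", "TA", "Faculty", "Faculty"),
  Score = c("Yes", "No", "Yes", "No", "Yes", "No"),
  Freq  = c(10, 20, 15, 15, 5, 25)
)

# Job on the rows, Score on the columns, cells filled with the counts
demo_t <- xtabs(Freq ~ Job + Score, data = demo)

# Add row totals, column totals, and the grand total
addmargins(demo_t)
```

If the file instead had one row per player, table() on the two columns would likely be the simpler choice.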

c.       Develop the appropriate null and alternative hypotheses for testing an association among these variables. State the null and alternative hypotheses in the context of this problem.

H0: There is no relationship between occupation and the frequency of scoring in a soccer match.
H1: There is a relationship between occupation and the frequency of scoring in a soccer match.


d.      Run the chi-square test in R using the assocstats() function. Copy and paste your code and the output below. What is the value of the chi-square test statistic? What is the p-value related to the chi-square test statistic?
library(vcd)  # assocstats() is provided by the vcd package
tab <- table(job, freq)
summary(assocstats(tab))



The chi-square test statistic is 12.
The p-value is 0.2851.
e.      Compute the χ2 test statistic that you obtained from the R output in part d.  You should show your computations. Show the computations of the expected counts. Provide a final table that includes the observed counts with the expected counts in parenthesis for each cell. The final table should have row and column margin totals and a grand total.
chisq.test(tab)$expected  # expected counts: (row total * column total) / grand total
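
The by-hand computation asked for here can be sketched as follows. This is only an illustration under stated assumptions: the matrix observed holds invented counts for a hypothetical 3 x 2 table, not the values in soccer.txt, and the names observed, expected, and chi_sq are chosen for this example.

```r
# Hypothetical observed counts (invented for illustration, not from soccer.txt)
observed <- matrix(c(10, 20,
                     15, 15,
                      5, 25),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(Job = c("Student", "TA", "Faculty"),
                                   Score = c("Yes", "No")))

# Expected count for each cell = (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Chi-square statistic = sum over cells of (observed - expected)^2 / expected
chi_sq <- sum((observed - expected)^2 / expected)

# Cross-check the by-hand values against chisq.test()
fit <- chisq.test(observed, correct = FALSE)
all.equal(expected, fit$expected, check.attributes = FALSE)
all.equal(chi_sq, unname(fit$statistic))

# Observed counts with row, column, and grand totals
addmargins(observed)
```

Comparing the hand-computed statistic with the chisq.test() result is a quick way to check whether the two agree.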


f.        Does the test statistic you found by hand in part e match the χ2 test statistic from the assocstats() output in part d? State Yes or No.

No


g.       Use the pchisq() function in R to find the p-value associated with the χ2 test statistic. Copy and paste your code below.  Does the p-value you found with the pchisq() function match the p-value from the assocstats() output in part d? State Yes or No.
Yes
pchisq(12, df = 10, lower.tail = FALSE)
The p-value is 0.2850565, which rounds to 0.2851 to four decimal places,
so it matches the p-value from assocstats().
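
The p-value is the upper-tail area of the chi-square distribution, which pchisq() can return either directly or as a complement of the lower tail. A short sketch using the statistic (12) and degrees of freedom (10) reported above; the variable names are chosen for this example.

```r
# Upper-tail probability P(X > 12) for a chi-square distribution with 10 df
p_upper <- pchisq(12, df = 10, lower.tail = FALSE)

# Equivalent form: 1 minus the lower-tail probability
p_complement <- 1 - pchisq(12, df = 10)

# Both forms agree; round to four decimal places for reporting
all.equal(p_upper, p_complement)
round(p_upper, 4)
```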


h.      Using the p-value for this test, do you reject or not reject the null hypothesis?

Do not reject the null hypothesis.


i.        State a conclusion back in terms of the context of the problem.

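Because the p-value (0.2851) is greater than 0.05, we do not reject the null hypothesis. There is not enough evidence to conclude that occupation is associated with the frequency of scoring in a soccer match.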








