Expert answer: RStudio HW. This question was already asked 1 year ago, as it’s the same HW. I saw that you answered this question for azoozeta1 already and I am hoping you can do it for me as well. Thanks.
homework_1_r__1_.pdf
Unformatted Attachment Preview
Homework 1
Due Wednesday, January 24
1.a. In R define three variables (vectors): Day, WaitingTime and BusLine. All should be numeric variables
with the following values. (You should combine all your work in an R script and submit that with a Word
document that presents and explains the results)
Day   WaitingTime   BusLine
1     43            0
2     27            0
3     4             0
4     18            0
5     17            0
6     31            0
7     16            1
8     18            1
9     7             0
10    41            0
11    22            1
12    5             1
13    14            1
14    12            0
15    23            0
16    7             1
17    12            1
18    3             1
19    18            1
20    9             1
b. Combine your three variables in a dataframe called “CrazyDave”.
c. For BusLine, label the value 0 as Damen and the value 1 as Halsted by using the “factor()” function.
d. Print the complete dataframe to check your work. Save your dataset by writing it as a csv file to your
default working directory.
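A minimal sketch of parts a through d in base R. The values come from the table in part a; the output file name CrazyDave.csv is an assumption, since the assignment does not specify one.

```r
# Part a: three numeric vectors, values copied from the assignment table
Day <- as.numeric(1:20)
WaitingTime <- c(43, 27, 4, 18, 17, 31, 16, 18, 7, 41,
                 22, 5, 14, 12, 23, 7, 12, 3, 18, 9)
BusLine <- c(0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
             1, 1, 1, 0, 0, 1, 1, 1, 1, 1)

# Part b: combine the vectors into a data frame
CrazyDave <- data.frame(Day, WaitingTime, BusLine)

# Part c: label 0 as Damen and 1 as Halsted
CrazyDave$BusLine <- factor(CrazyDave$BusLine,
                            levels = c(0, 1),
                            labels = c("Damen", "Halsted"))

# Part d: print the data frame, then save it to the working directory
print(CrazyDave)
write.csv(CrazyDave, "CrazyDave.csv", row.names = FALSE)  # file name is an assumption
```

Calling getwd() shows which directory the csv file was written to.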
e. Provide descriptive statistics (mean, median, mode, variance, standard deviation) as appropriate for these
variables. You can use describe() in the psych package to get most of these (it does not report the mode or the variance directly).
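As one possible way to approach part e without extra packages, the sketch below computes the statistics for WaitingTime in base R. Base R has no built-in statistical mode, so stat_mode() is a hypothetical helper written for this example. Day is only an index and BusLine is a category, so means of those two are not very informative.

```r
WaitingTime <- c(43, 27, 4, 18, 17, 31, 16, 18, 7, 41,
                 22, 5, 14, 12, 23, 7, 12, 3, 18, 9)

# Hypothetical helper: returns the most frequent value(s) of x
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])
}

wt_mean   <- mean(WaitingTime)
wt_median <- median(WaitingTime)
wt_mode   <- stat_mode(WaitingTime)
wt_var    <- var(WaitingTime)   # sample variance (divides by n - 1)
wt_sd     <- sd(WaitingTime)    # sample standard deviation

print(c(mean = wt_mean, median = wt_median, mode = wt_mode,
        variance = wt_var, sd = wt_sd))
```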
f. According to the CTA timetables both buses run at 10 minute intervals. Given our data could they possibly
be running consistently on time? How do you know?
g. If the CTA maintains the scheduled number of buses, but their arrival times are random, then waiting time
would follow an exponential distribution with parameter λ equal to the rate of 1 per 10 minutes. (This is known as
a Poisson process.) Look up the exponential distribution (Wikipedia will do) and find the formula for the
mean and standard deviation and calculate these values using the rate (0.1 per minute) from the CTA timetable.
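For an exponential distribution with rate λ, both the mean and the standard deviation equal 1/λ. The sketch below applies that formula to part g and, as an optional sanity check that is not part of the assignment, confirms it by numerical integration of the density.

```r
rate <- 0.1            # buses per minute, from the CTA timetable

exp_mean <- 1 / rate   # mean of an exponential distribution
exp_sd   <- 1 / rate   # its standard deviation is the same value

# Optional check: integrate against the density dexp()
num_mean <- integrate(function(x) x * dexp(x, rate = rate), 0, Inf)$value
num_var  <- integrate(function(x) (x - exp_mean)^2 * dexp(x, rate = rate),
                      0, Inf)$value

print(c(mean = exp_mean, sd = exp_sd,
        numeric_mean = num_mean, numeric_sd = sqrt(num_var)))
```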
h. An important result in statistics called the Central Limit Theorem says that the sample mean of n random
draws from an arbitrary distribution with mean μ and standard deviation σ converges to a normal
distribution with mean μ and standard deviation σ/√n. Use this to calculate the expected mean waiting time
and standard deviation of the mean based on the null hypothesis that arrival times follow the distribution in
g (random arrivals at rate 1/10 minutes).
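A short sketch of the part h calculation, assuming n = 20 (the number of days in the table). The simulation at the end is an illustration added here, not something the assignment asks for; the seed and the number of replications are arbitrary choices.

```r
rate  <- 0.1
n     <- 20            # sample size, taken from the 20 days in the table
mu    <- 1 / rate      # mean of the exponential distribution
sigma <- 1 / rate      # standard deviation of the exponential distribution

expected_mean <- mu
se_mean       <- sigma / sqrt(n)   # standard deviation of the sample mean

# Illustration only: simulate many samples of size n and compare
set.seed(1)
sim_means <- replicate(10000, mean(rexp(n, rate = rate)))

print(c(expected_mean = expected_mean, se_mean = se_mean,
        simulated_mean = mean(sim_means), simulated_sd = sd(sim_means)))
```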
i. Since our null hypothesis that bus arrivals are Poisson with rate λ = 0.1 also determines the variance under
the null, we can use a z test to determine whether our sample is consistent with the hypothesis. Construct a
z value by taking the difference between the sample mean and the calculated expected mean and dividing
by the standard deviation (of the mean) from h. Find a normal (z) lookup table (two sided) and find the
corresponding p value. This is the probability of observing a sample mean at least this far from the expected mean if the null hypothesis is true. If it is
less than 0.05 we reject the null at the 5% significance level (95% confidence) and accept the alternative hypothesis (that the
mean is significantly different from μ).
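The z test in part i can be sketched as follows. Instead of a printed lookup table, this example uses pnorm(), the normal cumulative distribution function in base R, to get the two-sided p value.

```r
WaitingTime <- c(43, 27, 4, 18, 17, 31, 16, 18, 7, 41,
                 22, 5, 14, 12, 23, 7, 12, 3, 18, 9)

n      <- length(WaitingTime)
mu0    <- 10                 # expected mean under the null (1 / rate)
sigma0 <- 10                 # standard deviation under the null (1 / rate)
se0    <- sigma0 / sqrt(n)   # standard deviation of the mean from part h

z       <- (mean(WaitingTime) - mu0) / se0
p_value <- 2 * pnorm(-abs(z))   # two-sided p value

print(c(z = z, p_value = p_value))
```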
j. Perhaps the waiting times are not independent (maybe buses run late because of problems that affect
several in a row). Then our calculations in g are not valid. This is typically the case for sample data, where we
do not know the population variance. Instead of the z test we use a t-test, which uses the sample variance to
do hypothesis testing. Use the value for the sample standard deviation for waiting time from part e and
calculate the standard deviation for the mean (by dividing by √n). Calculate the difference between the
sample mean and the null (10 minutes) and divide by the standard deviation of the mean to get a t-statistic.
Run the R function t.test() to check your work. The t-test reports the p value just like our z test above. Is the
mean significantly different from 10 minutes? Does this mean the Professor is not crazy?
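A sketch of part j: the t statistic is computed by hand and then compared with the one-sample t.test() from base R. The p value for the manual version uses pt() with n - 1 degrees of freedom.

```r
WaitingTime <- c(43, 27, 4, 18, 17, 31, 16, 18, 7, 41,
                 22, 5, 14, 12, 23, 7, 12, 3, 18, 9)

n       <- length(WaitingTime)
mu0     <- 10                          # null hypothesis: mean wait of 10 minutes
se_mean <- sd(WaitingTime) / sqrt(n)   # standard deviation of the mean

t_manual <- (mean(WaitingTime) - mu0) / se_mean
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)   # two-sided p value

# Check against the built-in test
tt <- t.test(WaitingTime, mu = mu0)
print(tt)
print(c(t_manual = t_manual, p_manual = p_manual))
```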
k. Things seem to be getting better for Professor Dave. Calculate the Pearson Correlation coefficient for
waiting time and day to determine if waiting times are actually improving over time. Use “cor.test()” to get
the correlation and a test of significance. What can you conclude from the result? Is it statistically
significant?
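For part k, cor.test() in base R computes the Pearson correlation by default and reports a t-based significance test alongside it. A minimal sketch:

```r
Day <- as.numeric(1:20)
WaitingTime <- c(43, 27, 4, 18, 17, 31, 16, 18, 7, 41,
                 22, 5, 14, 12, 23, 7, 12, 3, 18, 9)

# Pearson correlation with a two-sided significance test
ct <- cor.test(Day, WaitingTime, method = "pearson")
print(ct)

r_value <- unname(ct$estimate)   # correlation coefficient
p_value <- ct$p.value            # p value for the null of zero correlation
```

A negative coefficient means waiting times tend to fall as the day number rises.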
l. Dave was initially taking the Damen bus, but switched to the Halsted bus in hopes of avoiding his curse.
Use describeBy() from the psych package to calculate the mean and standard deviation for the two routes considered separately.
Use t.test() again to compare two independent samples. Use the “var.equal = TRUE” flag to run Student’s test
and then run again with “var.equal = FALSE” to repeat the test allowing for different variances. To determine
which test is appropriate we need to run Levene’s test for the equality of variances. You can find a function
to do this, “leveneTest()”, in the “car” package for R. If the significance value is less than .05 we reject the
null (that the variances are equal) and use a modified t-test which accounts for the unequal variances of the
two samples. Which should we use here? Does it matter for determining whether the Halsted bus is better?
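A sketch of part l that runs in base R only. tapply() stands in for the per-route summary, and Levene's test is reproduced by hand as a one-way ANOVA on absolute deviations from each group's median, which is the centering that leveneTest() in the car package uses by default. This hand-rolled version is a substitute chosen here so the example does not need extra packages.

```r
WaitingTime <- c(43, 27, 4, 18, 17, 31, 16, 18, 7, 41,
                 22, 5, 14, 12, 23, 7, 12, 3, 18, 9)
BusLine <- factor(c(0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
                    1, 1, 1, 0, 0, 1, 1, 1, 1, 1),
                  levels = c(0, 1), labels = c("Damen", "Halsted"))
CrazyDave <- data.frame(WaitingTime, BusLine)

# Mean and standard deviation per route
route_mean <- tapply(CrazyDave$WaitingTime, CrazyDave$BusLine, mean)
route_sd   <- tapply(CrazyDave$WaitingTime, CrazyDave$BusLine, sd)

# Student's t-test (equal variances) and Welch's t-test (unequal variances)
student <- t.test(WaitingTime ~ BusLine, data = CrazyDave, var.equal = TRUE)
welch   <- t.test(WaitingTime ~ BusLine, data = CrazyDave, var.equal = FALSE)

# Levene's test by hand: ANOVA on absolute deviations from the group median
abs_dev  <- abs(WaitingTime - ave(WaitingTime, BusLine, FUN = median))
levene   <- anova(lm(abs_dev ~ BusLine))
levene_p <- levene[["Pr(>F)"]][1]

print(rbind(mean = route_mean, sd = route_sd))
print(student)
print(welch)
print(levene)
```

Because both routes have 10 observations, the two t-tests produce the same t statistic and differ only in their degrees of freedom.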
m. Even if we can’t say that the Halsted bus is statistically significantly better, the average times are certainly
lower. Is this responsible for all of the improvement over time we found in k.? Run a linear regression of
Waiting Time on Day and BusLine. How much of the variance in waiting time is explained by the model? How
much confidence do we have in the results? Does this impact our interpretation of the correlation in k?
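The regression in part m can be sketched with lm() from base R. The R-squared value in the summary is the share of variance in waiting time explained by the model, and the coefficient table gives the standard errors and p values for Day and BusLine.

```r
Day <- as.numeric(1:20)
WaitingTime <- c(43, 27, 4, 18, 17, 31, 16, 18, 7, 41,
                 22, 5, 14, 12, 23, 7, 12, 3, 18, 9)
BusLine <- factor(c(0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
                    1, 1, 1, 0, 0, 1, 1, 1, 1, 1),
                  levels = c(0, 1), labels = c("Damen", "Halsted"))
CrazyDave <- data.frame(Day, WaitingTime, BusLine)

# Linear regression of waiting time on day and bus line
fit <- lm(WaitingTime ~ Day + BusLine, data = CrazyDave)
fit_summary <- summary(fit)
print(fit_summary)

r_squared <- fit_summary$r.squared        # share of variance explained
coef_table <- fit_summary$coefficients    # estimates, std. errors, t and p values
```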
…