# Final Project

Due Wednesday, December 21st, before 1pm in my office, NA 213. Late projects will not be accepted.

You may use various resources such as notes, your book, a computer, or a graphing calculator. You may ask your instructor. You may not do this project with help from the math lab or another person. This is not a group project. You must work independently; otherwise you will receive a 0.

You do not need to type the responses, but all responses should be legible. All work should be done on a separate piece of paper. Make sure to attach all printouts, graphs, and software computations.

## Chapter 1 (10 points)

In its most recently published Fact Book and Outcomes Report 2014-2015, MCC tracks the number of credits that all of its students take. Here are the results from Fall Semester 2010 through Fall Semester 2014: (MCC Fact Book and Outcomes Report 2014-2015, page 90)

1. To be a full-time student, a student needs to take 12 or more credits. Calculate each of the following. Round to the nearest whole percent.
   a. In FA 14 (Fall 2014), what percent of MCC students were full-time students?
   b. In FA 14 (Fall 2014), what percent of MCC students were part-time students?
   c. Are the numbers in the table above statistics or parameters? Explain.
   d. Are “Full-time Student” and “Part-time Student” categorical or quantitative data? Explain.
2. Suppose that you have been asked to conduct a survey of 500 MCC students to estimate the percent of full-time students and the percent of part-time students at MCC this semester. Of course, you want the sample to be representative of the MCC student population. If you had access to MCC’s student enrollment information, describe two possible sampling methods that could be used to create a representative sample of 500 students. For each sampling method, be sure to name the method and explain how it would be carried out.
## Chapters 2 and 3 (20 points)

Use Data Set 11 (Appendix B), Ages of Oscar Winners: Best Actresses, to solve the following problems:

1. Use software to find the following statistics for the data set: mean, median, mode, range, midrange, and standard deviation.
2. Organize the data set in a frequency table using bins of width 10, starting from 20-29, 30-39, and so on. Include columns for the relative and cumulative frequency.
3. Construct a histogram based on the table from step 2. Analyze the distribution curve: number of peaks, symmetry, and variation.
4. Use the frequency table to calculate the mean, median, mode, and standard deviation for the data set. Compare these values with the values you found in step 1.
5. Calculate the coefficient of variation of the data set.
6. Use the range rule of thumb to estimate the standard deviation of the data set. Compare this approximation with the values found in steps 1 and 4.
7. Use a calculator or software to find the 5-number summary of the data set. Construct a boxplot. Calculate the IQR, semi-interquartile range, and midrange.
8. Identify outliers, if there are any, and construct a modified boxplot.
9. Identify the data value of the 18th percentile, P18. Find the percentile of the value 40.

## Chapter 4 (10 points)

Classic Birthday Problem: Find the probability that among 25 randomly selected people, at least 2 have the same birthday. To solve this problem you have to use a simulation. A simulation of a procedure is a process that behaves the same way as the procedure so that similar results are produced.

For the above classic birthday problem, a simulation begins by representing birthdays by integers from 1 through 365, where 1 represents a birthday of January 1, 2 represents January 2, and so on. We can simulate 25 birthdays by using a calculator or computer to generate 25 random numbers (with repetition allowed) between 1 and 365.
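One way to carry out this kind of simulation on a computer is sketched below in Python. This is only an illustration of the procedure described here, not a required method: the function names `has_shared_birthday` and `estimate_probability`, the `seed` parameter, and the choice of Python's `random` module are all assumptions made for the example.

```python
import random

def has_shared_birthday(n_people=25, rng=random):
    """Run one trial: draw n_people birthdays (1..365, repeats allowed)
    and report whether at least two of them match."""
    birthdays = sorted(rng.randint(1, 365) for _ in range(n_people))
    # After sorting, any repeated birthday sits in adjacent positions.
    return any(a == b for a, b in zip(birthdays, birthdays[1:]))

def estimate_probability(trials=20, n_people=25, seed=None):
    """Repeat the trial and return the fraction of trials with a match."""
    rng = random.Random(seed)  # seed is optional; it only makes runs repeatable
    hits = sum(has_shared_birthday(n_people, rng) for _ in range(trials))
    return hits / trials

# Twenty trials give a rough estimate; more trials give a steadier one.
print(estimate_probability(trials=20))
print(estimate_probability(trials=5000))
```

Each run prints different estimates unless a `seed` is supplied, which is the expected behavior for a simulation.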
Those numbers can then be sorted, so it becomes easy to examine the list to determine whether any 2 of the simulated birthdays are the same. Repeat the process at least 20 times or until you are satisfied that you have a good estimate of the probability.

## Chapter 5 (10 points)

The analysis of the last digits of data can sometimes reveal whether the data have been collected through actual measurements or reported by subjects. Refer to an almanac or the Internet and find a collection of data (such as lengths of rivers in the world) with at least 30 data values, then analyze the distribution of the last digits to determine whether the values were obtained through actual measurements.

## Chapter 6 (10 points)

Use a coin or a computer to simulate births. You should simulate 100 births and report the number of girls and the number of boys in your simulation. Using n = the total number of births and x = the number of girls, compute the mean and the standard deviation for the number of girls. Is the simulated result unusual? Why or why not?

## Chapter 7 (10 points)

You have been hired by a college foundation to conduct a survey of graduates.

a) If you want to estimate the percentage of graduates who made a donation to the college after graduation, how many graduates must you survey if you want 98% confidence that your percentage has a margin of error of 5 percentage points?
b) If you want to estimate the mean amount of charitable contributions made by graduates, how many graduates must you survey if you want 98% confidence that your sample mean is in error by no more than $50? (Based on results from a pilot study, assume that the standard deviation of donations by graduates is $337.)

## Chapter 8 (10 points)

Pennsylvania Lottery. In the Pennsylvania Match 6 Lottery, six numbers between 1 and 49 are randomly drawn. Use a computer to generate 100 random numbers between 1 and 49 (with replacement) and calculate the mean and the standard deviation of your sample.
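Generating the sample and its summary statistics might look like the following Python sketch. The helper name `draw_sample` and its `seed` parameter are hypothetical, introduced only for this example; any calculator or statistics package that generates random integers would serve equally well.

```python
import random
import statistics

def draw_sample(n=100, low=1, high=49, seed=None):
    """Generate n random integers from low..high, with replacement."""
    rng = random.Random(seed)  # seed is optional; it only makes runs repeatable
    return [rng.randint(low, high) for _ in range(n)]

sample = draw_sample()
sample_mean = statistics.mean(sample)
# statistics.stdev is the sample standard deviation (divides by n - 1).
sample_sd = statistics.stdev(sample)

print("n =", len(sample))
print("mean =", round(sample_mean, 2))
print("standard deviation =", round(sample_sd, 2))
```

The printed values change from run to run, so the sample you attach to your work should be the one your own statistics were computed from.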
Use a 0.01 significance level to test the claim that the sample is selected from a population with a mean equal to 25, which is the mean of the population of all drawn numbers.

## Chapter 10 (20 points)

Use the Data Set Bears (measurements from anesthetized wild bears) to:

1. Create two different scatter plots.
   a) Graph 1: Age (x) and Weight (y)
   b) Graph 2: Chest (x) and Weight (y)
2. For both scatter plots, write up an analysis of all the information that you learn from the picture. At a minimum, your analysis should answer the following:
   - Is there a correlation? Why?
   - If there appears to be a correlation, describe the correlation.
   - State the correlation coefficient R, R^2, and the equation of the regression line. Discuss the “fit” of the regression line to the data.
3. Answer all of the following questions.
   a) Based on your analysis in step 2, do you think it is possible to infer a bear’s weight from its age, or a bear’s chest from its weight? Explain your answer.
   b) Using the relationships that you calculated, determine the approximate age and chest of a bear with the following weights:
      - 170 lb
      - 70 lb
      - 105 lb
   c) Suppose you measure the chest of a bear, you predict that the weight of the bear is 70 lb, and you later find out that the weight of that bear is actually 75 lb. Give one possible reason that your prediction was incorrect.
   d) Using the relationships that you calculated, determine the approximate weight of a bear with:
      - age 56
      - age 101
      - chest 30.0
      - chest 53.5
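If you choose to do the Chapter 10 computations in software, the correlation coefficient and least-squares regression line can be obtained along the lines of the Python sketch below. The function names `linear_fit` and `predict` are hypothetical, and the numbers in `x_values` and `y_values` are made-up placeholders, not values from the Bears data set; replace them with the actual columns.

```python
import math

def linear_fit(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # linear correlation coefficient
    return slope, intercept, r

def predict(x, slope, intercept):
    """Predicted y for a given x on the fitted line."""
    return intercept + slope * x

# Placeholder numbers for illustration only (NOT the Bears data set).
x_values = [26.0, 33.0, 40.0, 45.0, 52.0]
y_values = [60.0, 110.0, 180.0, 240.0, 350.0]

slope, intercept, r = linear_fit(x_values, y_values)
print("regression line: y =", round(intercept, 2), "+", round(slope, 2), "* x")
print("r =", round(r, 3), " r^2 =", round(r * r, 3))
print("predicted y at x = 30.0:", round(predict(30.0, slope, intercept), 1))
```

The same two functions can be applied once with age as x and once with chest as x to cover both scatter plots.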
