The data include individuals’ age, their “social media mobility” (a larger number reflects a more active social media life), their smartphone’s operating system (A or B), and whether their phone was infected. Your task is two-fold:
1) Using the first 180 observations, build a logistic regression model to predict whether a given individual’s smartphone will be infected. Cross-validate this model with the hold-out sample consisting of the last 60 observations.
2) Again using the first 180 observations, use CART to build a classification tree that predicts whether a given individual’s smartphone will be infected. Similarly, cross-validate this model with the hold-out sample.
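The fit-then-hold-out workflow of task 1 can be sketched in plain Python as follows. This is only a minimal illustration under stated assumptions, not the expected solution: the helper names (`parse_rows`, `fit_logistic`, `predict_proba`, `confusion`), the gradient-descent settings, the coding of OS "B" as 1, and the 0.5 cutoff are all hypothetical choices, and a statistics package would normally do the fitting. The demo uses only the first eight rows of the table below; the task itself would pass all 240 rows and split at 180.

```python
import math

def parse_rows(lines):
    """Turn 'Age | Social | OS | Infect |' rows into ([age, social, os_b], label)."""
    data = []
    for line in lines:
        age, social, os_name, infect = [c.strip() for c in line.split("|")[:4]]
        x = [float(age), float(social), 1.0 if os_name == "B" else 0.0]
        data.append((x, 1 if infect == "Yes" else 0))
    return data

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(train, steps=2000, lr=0.1):
    """Batch gradient descent on the log-loss, with standardized features."""
    k, n = len(train[0][0]), len(train)
    means = [sum(x[j] for x, _ in train) / n for j in range(k)]
    sds = [math.sqrt(sum((x[j] - means[j]) ** 2 for x, _ in train) / n) or 1.0
           for j in range(k)]
    w = [0.0] * (k + 1)  # w[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for x, y in train:
            z = [1.0] + [(x[j] - means[j]) / sds[j] for j in range(k)]
            err = sigmoid(sum(wi * zi for wi, zi in zip(w, z))) - y
            for j in range(k + 1):
                grad[j] += err * z[j]
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w, means, sds

def predict_proba(model, x):
    """Predicted probability of infection for one feature vector."""
    w, means, sds = model
    z = [1.0] + [(xj - m) / s for xj, m, s in zip(x, means, sds)]
    return sigmoid(sum(wi * zi for wi, zi in zip(w, z)))

def confusion(model, rows, cutoff=0.5):
    """Count true/false positives/negatives on a set of labelled rows."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for x, y in rows:
        pred = 1 if predict_proba(model, x) >= cutoff else 0
        key = ("T" if pred == y else "F") + ("P" if pred == 1 else "N")
        counts[key] += 1
    return counts

# Demo on the first eight observations only (assumption: illustration, not the full data).
sample = [
    "52 | 69 | A | No |", "52 | 65 | B | No |", "58 | 57 | B | No |",
    "55 | 61 | B | Yes |", "58 | 29 | A | No |", "50 | 63 | A | No |",
    "63 | 69 | A | Yes |", "55 | 61 | A | Yes |",
]
data = parse_rows(sample)
train, holdout = data[:6], data[6:]  # the task itself uses data[:180], data[180:]
model = fit_logistic(train)
print(confusion(model, holdout))
```

The confusion counts on the hold-out rows are what the cross-validation step would report; the same `confusion` helper could score any other classifier that exposes a probability.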
Age | Social | OS | Infect |
52 | 69 | A | No |
52 | 65 | B | No |
58 | 57 | B | No |
55 | 61 | B | Yes |
58 | 29 | A | No |
50 | 63 | A | No |
63 | 69 | A | Yes |
55 | 61 | A | Yes |
51 | 45 | B | Yes |
66 | 62 | A | No |
43 | 69 | B | Yes |
48 | 60 | A | No |
50 | 67 | B | No |
52 | 90 | A | No |
47 | 47 | A | No |
62 | 47 | B | No |
57 | 60 | B | Yes |
57 | 72 | A | Yes |
56 | 36 | A | Yes |
44 | 83 | A | No |
68 | 54 | A | No |
48 | 58 | A | No |
61 | 43 | A | Yes |
59 | 80 | A | Yes |
61 | 45 | B | No |
49 | 69 | A | Yes |
58 | 62 | A | No |
56 | 57 | A | No |
61 | 55 | A | No |
55 | 51 | A | No |
52 | 40 | A | No |
63 | 21 | B | No |
56 | 50 | B | Yes |
57 | 112 | A | Yes |
55 | 62 | A | No |
46 | 56 | A | No |
51 | 72 | B | Yes |
52 | 30 | A | No |
66 | 71 | B | Yes |
48 | 98 | A | No |
56 | 37 | A | No |
58 | 63 | A | No |
51 | 71 | A | No |
51 | 28 | B | No |
64 | 72 | B | Yes |
57 | 68 | B | Yes |
59 | 67 | A | No |
44 | 69 | A | No |
58 | 70 | B | No |
61 | 51 | A | No |
51 | 46 | A | No |
62 | 84 | A | No |
51 | 52 | B | No |
46 | 43 | A | No |
61 | 44 | B | Yes |
50 | 42 | A | No |
62 | 48 | A | No |
60 | 38 | A | No |
52 | 46 | A | No |
54 | 54 | B | No |
51 | 57 | A | No |
45 | 43 | B | No |
54 | 81 | A | Yes |
71 | 52 | A | Yes |
53 | 55 | B | No |
57 | 68 | B | Yes |
51 | 55 | A | Yes |
64 | 61 | A | No |
54 | 16 | B | Yes |
53 | 64 | A | Yes |
45 | 66 | B | No |
58 | 64 | B | Yes |
57 | 49 | A | No |
55 | 83 | B | No |
48 | 67 | B | No |
52 | 71 | B | Yes |
61 | 34 | B | No |
61 | 57 | A | No |
65 | 63 | B | Yes |
66 | 79 | A | Yes |
60 | 57 | B | Yes |
57 | 45 | A | Yes |
54 | 63 | A | Yes |
54 | 66 | B | No |
58 | 69 | B | Yes |
60 | 73 | B | No |
52 | 77 | A | No |
55 | 67 | A | No |
55 | 76 | A | Yes |
59 | 72 | A | Yes |
52 | 60 | A | No |
53 | 70 | A | No |
55 | 66 | A | Yes |
54 | 40 | A | No |
45 | 51 | A | No |
49 | 54 | B | Yes |
56 | 37 | A | No |
52 | 68 | B | Yes |
53 | 55 | A | Yes |
52 | 56 | B | Yes |
58 | 45 | A | Yes |
59 | 58 | B | No |
61 | 35 | A | No |
72 | 82 | B | Yes |
58 | 57 | A | No |
61 | 83 | B | Yes |
49 | 45 | A | No |
54 | 48 | A | No |
53 | 42 | A | No |
51 | 45 | A | No |
57 | 56 | A | No |
49 | 19 | A | No |
44 | 51 | A | No |
46 | 37 | A | No |
47 | 98 | B | No |
48 | 61 | A | No |
49 | 37 | A | No |
66 | 58 | B | No |
56 | 40 | A | No |
65 | 43 | B | No |
61 | 74 | A | No |
46 | 54 | A | Yes |
51 | 74 | B | Yes |
64 | 60 | B | No |
53 | 53 | A | No |
47 | 50 | A | No |
52 | 83 | A | Yes |
59 | 52 | A | Yes |
52 | 87 | B | Yes |
57 | 57 | A | No |
57 | 64 | A | No |
47 | 76 | A | Yes |
53 | 61 | A | Yes |
59 | 50 | B | No |
55 | 32 | B | Yes |
48 | 30 | A | No |
51 | 59 | A | No |
50 | 58 | A | No |
57 | 74 | A | No |
42 | 55 | A | No |
64 | 30 | A | No |
54 | 50 | A | Yes |
61 | 47 | B | Yes |
56 | 71 | A | No |
49 | 57 | B | Yes |
58 | 44 | A | Yes |
52 | 55 | A | No |
58 | 36 | A | Yes |
58 | 56 | A | No |
58 | 54 | A | No |
61 | 50 | A | Yes |
55 | 64 | A | No |
62 | 47 | B | Yes |
58 | 81 | A | Yes |
67 | 52 | A | No |
48 | 90 | B | Yes |
47 | 58 | A | No |
44 | 56 | A | No |
61 | 58 | A | Yes |
57 | 31 | A | No |
49 | 55 | A | No |
43 | 53 | A | No |
64 | 69 | B | Yes |
51 | 64 | A | Yes |
61 | 52 | A | Yes |
64 | 72 | A | Yes |
59 | 55 | B | No |
54 | 44 | A | No |
60 | 48 | A | No |
50 | 64 | A | No |
64 | 57 | B | Yes |
57 | 32 | A | Yes |
58 | 54 | A | Yes |
45 | 61 | A | No |
50 | 70 | B | Yes |
59 | 96 | A | Yes |
64 | 61 | A | Yes |
55 | 51 | B | Yes |
49 | 41 | B | No |
56 | 42 | A | No |
63 | 68 | A | No |
49 | 46 | B | Yes |
59 | 61 | B | No |
45 | 54 | B | No |
65 | 73 | B | No |
52 | 65 | B | Yes |
40 | 63 | B | Yes |
66 | 72 | A | No |
65 | 29 | B | No |
59 | 70 | B | Yes |
49 | 29 | B | Yes |
53 | 86 | B | Yes |
50 | 68 | A | No |
56 | 47 | A | No |
58 | 55 | B | Yes |
61 | 89 | A | Yes |
63 | 47 | A | No |
65 | 60 | A | Yes |
61 | 48 | B | No |
51 | 39 | A | No |
50 | 61 | B | No |
60 | 54 | A | No |
51 | 64 | B | No |
60 | 40 | B | No |
56 | 68 | A | No |
46 | 77 | B | No |
46 | 64 | B | Yes |
49 | 35 | B | Yes |
52 | 41 | A | No |
50 | 91 | A | Yes |
53 | 64 | A | No |
40 | 39 | A | No |
50 | 71 | A | No |
53 | 35 | B | Yes |
51 | 72 | B | Yes |
45 | 57 | B | No |
51 | 73 | A | No |
50 | 77 | A | No |
52 | 48 | A | No |
49 | 49 | B | No |
58 | 56 | A | No |
58 | 56 | A | No |
56 | 51 | B | No |
58 | 56 | B | Yes |
66 | 38 | B | No |
61 | 61 | B | Yes |
57 | 54 | B | No |
64 | 59 | A | No |
59 | 63 | A | No |
60 | 77 | A | Yes |
53 | 95 | A | No |
48 | 48 | A | No |
56 | 84 | A | No |
61 | 49 | B | No |
48 | 71 | B | Yes |
54 | 41 | A | No |
46 | 38 | A | Yes |
50 | 24 | B | No |
55 | 83 | B | Yes |
60 | 27 | A | No |
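For task 2, the core step of a CART classification tree is choosing the split that most reduces Gini impurity. The sketch below shows that single step in plain Python, assuming rows are stored as `([age, social], label)` pairs with label 1 for "Yes". The function names `gini` and `best_split` are hypothetical, only the first eight observations are used, and a full tree would apply this search recursively over every predictor (including OS) with a stopping or pruning rule.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)

def best_split(rows, feature):
    """Search midpoints of one numeric feature for the lowest weighted Gini.

    Returns (threshold, weighted_gini); threshold is None when no split
    improves on the impurity of the unsplit node.
    """
    values = sorted({x[feature] for x, _ in rows})
    best = (None, gini([y for _, y in rows]))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in rows if x[feature] <= t]
        right = [y for x, y in rows if x[feature] > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if score < best[1]:
            best = (t, score)
    return best

# First eight observations from the table: ([Age, Social], Infect == "Yes").
rows = [
    ([52, 69], 0), ([52, 65], 0), ([58, 57], 0), ([55, 61], 1),
    ([58, 29], 0), ([50, 63], 0), ([63, 69], 1), ([55, 61], 1),
]
for name, idx in (("Age", 0), ("Social", 1)):
    print(name, best_split(rows, idx))
```

Comparing the weighted impurity across predictors indicates which variable a tree would split on first for this subset.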
Parts 3 and 4: On the second worksheet is a quarterly time series of shipments beginning in the first quarter of 1997. Your task is to compare a LOESS-decomposition model and an ARIMA model with regard to forecasting this time series. Use the first 72 quarters of data to build your models and use the last 8 quarters to compare them.
Quarter | Year | Units |
1 | 1997 | 221.6 |
2 | 1997 | 222.2 |
3 | 1997 | 225.7 |
4 | 1997 | 226.1 |
1 | 1998 | 224.4 |
2 | 1998 | 222.8 |
3 | 1998 | 224.3 |
4 | 1998 | 221.7 |
1 | 1999 | 219.6 |
2 | 1999 | 220.3 |
3 | 1999 | 220.7 |
4 | 1999 | 218.1 |
1 | 2000 | 216.4 |
2 | 2000 | 219.2 |
3 | 2000 | 221.6 |
4 | 2000 | 216.5 |
1 | 2001 | 215.9 |
2 | 2001 | 220 |
3 | 2001 | 220.9 |
4 | 2001 | 214.2 |
1 | 2002 | 214.7 |
2 | 2002 | 218.2 |
3 | 2002 | 213.1 |
4 | 2002 | 204.3 |
1 | 2003 | 206.3 |
2 | 2003 | 209.1 |
3 | 2003 | 203 |
4 | 2003 | 193.1 |
1 | 2004 | 194.7 |
2 | 2004 | 195.3 |
3 | 2004 | 191.8 |
4 | 2004 | 183.2 |
1 | 2005 | 185.8 |
2 | 2005 | 184.2 |
3 | 2005 | 181.6 |
4 | 2005 | 169.1 |
1 | 2006 | 176.3 |
2 | 2006 | 172.7 |
3 | 2006 | 171.3 |
4 | 2006 | 158.8 |
1 | 2007 | 168.9 |
2 | 2007 | 164.5 |
3 | 2007 | 160.3 |
4 | 2007 | 151.3 |
1 | 2008 | 163.6 |
2 | 2008 | 157.5 |
3 | 2008 | 154.2 |
4 | 2008 | 144.2 |
1 | 2009 | 158.9 |
2 | 2009 | 151.8 |
3 | 2009 | 146.5 |
4 | 2009 | 138 |
1 | 2010 | 151.9 |
2 | 2010 | 146.9 |
3 | 2010 | 142.5 |
4 | 2010 | 136.6 |
1 | 2011 | 150.5 |
2 | 2011 | 143.9 |
3 | 2011 | 137.9 |
4 | 2011 | 131.1 |
1 | 2012 | 140.7 |
2 | 2012 | 132.5 |
3 | 2012 | 125.9 |
4 | 2012 | 119.2 |
1 | 2013 | 127.5 |
2 | 2013 | 119.2 |
3 | 2013 | 111.4 |
4 | 2013 | 107.3 |
1 | 2014 | 118.6 |
2 | 2014 | 108.8 |
3 | 2014 | 99.5 |
4 | 2014 | 96.2 |
1 | 2015 | 108.4 |
2 | 2015 | 98.1 |
3 | 2015 | 90.1 |
4 | 2015 | 88.4 |
1 | 2016 | 99.2 |
2 | 2016 | 89.9 |
3 | 2016 | 81.2 |
4 | 2016 | 82.1 |
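The comparison step for Parts 3 and 4 amounts to scoring each model's 8-quarter forecast against the held-out values. The sketch below shows that scoring in plain Python. It does not fit a LOESS decomposition or an ARIMA model: the `seasonal_naive` function is a stand-in baseline so the example runs, and all function names, plus the choice of RMSE and MAPE as error measures, are assumptions. The demo uses only the last 12 quarters of the table, so it trains on 4 values; the task itself would pass all 80 values with `n_train=72`.

```python
import math

def split_series(units, n_train=72):
    """Split a series into a training part and a hold-out part."""
    return units[:n_train], units[n_train:]

def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def seasonal_naive(train, horizon, period=4):
    """Repeat the last observed season. Stand-in baseline, NOT STL or ARIMA."""
    last = train[-period:]
    return [last[h % period] for h in range(horizon)]

# Last 12 quarters from the table (2014 Q1 to 2016 Q4), for illustration only.
units_tail = [118.6, 108.8, 99.5, 96.2,
              108.4, 98.1, 90.1, 88.4,
              99.2, 89.9, 81.2, 82.1]
train, holdout = split_series(units_tail, n_train=4)

# In the real exercise, each entry would be a fitted model's 8-step forecast.
forecasts = {"seasonal naive (stand-in)": seasonal_naive(train, len(holdout))}
for name, fc in forecasts.items():
    print(name, "RMSE:", round(rmse(holdout, fc), 2), "MAPE %:", round(mape(holdout, fc), 2))
```

Adding the LOESS-decomposition and ARIMA forecasts to the `forecasts` dictionary would put both models on the same hold-out footing.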