Plotting Conditionally Summed Data (base R or ggplot) - r

I started with a dataframe containing info on West Nile cases in Canada from 2012-2015. 600 observations of 10 variables in total.
> head(mosquitoes)
Years Weeks Province Avg.Temp Avg..Precepitation Wind Number.of.cases Number.of.Dead.Birds Mosquito.Pools.Tested Google.Trend.Searches
1 2015 17 Alberta 48 0.01 8 0 0 0 1
2 2015 18 Alberta 46 0.03 10 0 0 0 2
3 2015 19 Alberta 44 0.07 8 0 0 0 2
4 2015 20 Alberta 51 0.00 9 0 0 0 2
5 2015 21 Alberta 56 0.01 9 0 0 0 4
6 2015 22 Alberta 58 0.10 7 0 0 0 1
Here is the entire data set (sorry, it's large).
Years,Weeks,Province,Avg Temp ,Avg. Precepitation,Wind,Number of cases,Number of Dead Birds,Mosquito Pools Tested,Google Trend Searches
2015,17,Alberta,48,0.01,8,0,0,0,1
2015,18,Alberta,46,0.03,10,0,0,0,2
2015,19,Alberta,44,0.07,8,0,0,0,2
2015,20,Alberta,51,0,9,0,0,0,2
2015,21,Alberta,56,0.01,9,0,0,0,4
2015,22,Alberta,58,0.1,7,0,0,0,1
2015,23,Alberta,61,0.05,8,0,0,0,1
2015,24,Alberta,55,0.08,9,0,0,0,1
2015,25,Alberta,63,0.02,6,0,0,0,4
2015,26,Alberta,67,0.16,8,0,0,0,5
2015,27,Alberta,65,0.02,8,0,0,0,3
2015,28,Alberta,62,0.09,10,0,0,0,7
2015,29,Alberta,66,0.01,8,0,0,0,2
2015,30,Alberta,62,0.02,7,0,0,0,3
2015,31,Alberta,64,0.21,7,0,0,0,6
2015,32,Alberta,66,0.07,7,0,0,0,4
2015,33,Alberta,55,0.13,8,0,0,0,4
2015,34,Alberta,63,0,6,0,0,0,1
2015,35,Alberta,52,0.11,9,0,0,0,4
2015,36,Alberta,54,0.02,7,0,0,0,2
2015,37,Alberta,48,0.06,8,0,0,0,2
2015,38,Alberta,52,0.03,9,0,0,0,3
2015,39,Alberta,49,0.03,9,0,0,0,3
2015,40,Alberta,51,0,8,0,0,0,2
2015,41,Alberta,48,0,8,0,0,0,2
2014,17,Alberta,43,0.05,8,0,0,0,1
2014,18,Alberta,44,0.06,9,0,0,0,3
2014,19,Alberta,37,0.03,9,0,0,0,3
2014,20,Alberta,48,0.01,8,0,0,0,1
2014,21,Alberta,57,0.01,10,0,0,0,2
2014,22,Alberta,53,0.06,8,0,0,0,4
2014,23,Alberta,53,0.04,10,0,0,0,6
2014,24,Alberta,53,0.04,10,0,0,0,6
2014,25,Alberta,54,0.24,9,0,0,0,4
2014,26,Alberta,59,0.03,9,0,0,0,7
2014,27,Alberta,64,0.02,11,0,0,0,19
2014,28,Alberta,65,0.03,10,0,0,0,33
2014,29,Alberta,67,0.01,9,0,0,0,18
2014,30,Alberta,62,0.08,10,0,0,0,14
2014,31,Alberta,68,0,10,0,0,0,10
2014,32,Alberta,63,0.16,8,0,0,0,11
2014,33,Alberta,66,0.01,7,0,0,0,19
2014,34,Alberta,58,0.05,8,0,0,0,17
2014,35,Alberta,58,0.04,7,0,0,0,8
2014,36,Alberta,54,0.01,7,0,0,0,12
2014,37,Alberta,41,0.15,8,0,0,0,3
2014,38,Alberta,58,0,5,0,0,0,3
2014,39,Alberta,60,0.02,6,0,0,0,4
2014,40,Alberta,48,0.03,11,0,0,0,5
2014,41,Alberta,51,0,6,0,0,0,3
2013,17,Alberta,42,0,12,0,0,0,3
2013,18,Alberta,42,0.01,11,0,0,0,2
2013,19,Alberta,57,0,11,0,0,0,2
2013,20,Alberta,55,0.01,10,0,0,0,9
2013,21,Alberta,50,0.23,11,0,0,0,7
2013,22,Alberta,52,0.08,6,0,0,0,8
2013,23,Alberta,55,0.15,10,0,0,0,10
2013,24,Alberta,53,0.08,10,0,0,0,4
2013,25,Alberta,57,0.3,11,0,0,0,9
2013,26,Alberta,61,0.01,9,0,0,0,17
2013,27,Alberta,65,0.08,10,0,0,0,27
2013,28,Alberta,59,0.07,8,0,0,0,19
2013,29,Alberta,62,0.01,10,0,0,0,21
2013,30,Alberta,62,0.06,10,0,0,0,18
2013,31,Alberta,57,0.03,7,0,0,0,13
2013,32,Alberta,60,0.07,8,0,0,0,10
2013,33,Alberta,67,0,8,3,0,0,2
2013,34,Alberta,63,0,8,5,0,0,12
2013,35,Alberta,64,0.03,10,4,0,0,20
2013,36,Alberta,64,0.13,8,2,1,0,15
2013,37,Alberta,63,0,9,5,0,0,9
2013,38,Alberta,57,0.06,11,2,0,0,11
2013,39,Alberta,47,0,10,0,0,0,4
2013,40,Alberta,44,0,11,0,0,0,5
2013,41,Alberta,45,0.06,8,0,0,0,5
2012,17,Alberta,49,0.06,7,0,0,0,2
2012,18,Alberta,42,0.13,9,0,0,0,2
2012,19,Alberta,48,0,9,0,0,0,6
2012,20,Alberta,53,0.01,10,0,0,0,2
2012,21,Alberta,49,0.08,8,0,0,0,2
2012,22,Alberta,52,0,9,0,0,0,2
2012,23,Alberta,54,0.28,9,0,0,0,4
2012,24,Alberta,56,0.21,12,0,0,0,7
2012,25,Alberta,56,0.05,8,0,0,0,5
2012,26,Alberta,59,0.14,8,0,0,0,3
2012,27,Alberta,61,0.21,9,0,0,0,22
2012,28,Alberta,69,0,8,0,0,0,32
2012,29,Alberta,65,0.09,10,0,0,0,16
2012,30,Alberta,64,0.02,10,0,0,0,15
2012,31,Alberta,63,0.03,10,0,0,0,20
2012,32,Alberta,68,0,10,0,0,0,25
2012,33,Alberta,62,0.07,10,4,0,0,36
2012,34,Alberta,62,0.05,10,2,0,0,100
2012,35,Alberta,61,0.01,10,0,0,0,76
2012,36,Alberta,57,0,12,1,0,0,29
2012,37,Alberta,57,0,12,2,0,0,30
2012,38,Alberta,59,0,9,0,0,0,14
2012,39,Alberta,58,0.01,9,0,0,0,11
2012,40,Alberta,43,0.07,12,0,0,0,10
2012,41,Alberta,43,0.02,13,0,0,0,7
2015,17,British Columbia,53,0.03,10,0,0,0,5
2015,18,British Columbia,53,0.01,6,0,0,0,5
2015,19,British Columbia,58,0.01,7,0,0,0,5
2015,20,British Columbia,60,0,7,0,0,0,4
2015,21,British Columbia,62,0,7,0,0,0,6
2015,22,British Columbia,60,0.03,7,0,0,0,9
2015,23,British Columbia,62,0,13,0,0,0,9
2015,24,British Columbia,62,0.02,8,0,0,0,10
2015,25,British Columbia,66,0,9,0,0,0,7
2015,26,British Columbia,70,0,12,0,0,0,5
2015,27,British Columbia,67,0.01,9,0,0,0,11
2015,28,British Columbia,66,0,10,0,0,0,9
2015,29,British Columbia,65,0.04,9,0,0,0,14
2015,30,British Columbia,65,0.04,6,0,0,0,7
2015,31,British Columbia,65,0.02,9,0,0,0,7
2015,32,British Columbia,66,0.04,9,0,0,0,9
2015,33,British Columbia,65,0,9,0,0,0,11
2015,34,British Columbia,64,0.1,7,0,0,0,6
2015,35,British Columbia,57,0.12,10,0,0,0,4
2015,36,British Columbia,61,0.02,9,0,0,0,9
2015,37,British Columbia,58,0.09,9,0,0,0,9
2015,38,British Columbia,55,0.04,9,0,0,0,3
2015,39,British Columbia,52,0,6,0,0,0,3
2015,40,British Columbia,56,0.08,6,0,0,0,3
2015,41,British Columbia,51,0.04,7,0,0,0,7
2014,17,British Columbia,49,0.07,10,0,0,0,3
2014,18,British Columbia,54,0.03,8,0,0,0,4
2014,19,British Columbia,53,0.18,9,0,0,0,4
2014,20,British Columbia,60,0,8,0,0,0,6
2014,21,British Columbia,59,0.06,7,0,0,0,6
2014,22,British Columbia,56,0.09,7,0,0,0,6
2014,23,British Columbia,59,0,8,0,0,0,8
2014,24,British Columbia,60,0.03,10,0,0,0,7
2014,25,British Columbia,58,0.09,9,0,0,0,8
2014,26,British Columbia,62,0.05,7,0,0,0,10
2014,27,British Columbia,64,0.01,8,0,0,0,7
2014,28,British Columbia,66,0.01,8,0,0,0,19
2014,29,British Columbia,68,0,9,0,0,0,13
2014,30,British Columbia,63,0.06,8,0,0,0,12
2014,31,British Columbia,67,0,6,0,0,0,16
2014,32,British Columbia,66,0,7,0,0,0,25
2014,33,British Columbia,67,0.08,7,0,0,0,17
2014,34,British Columbia,65,0,6,0,0,0,13
2014,35,British Columbia,66,0,7,0,0,0,30
2014,36,British Columbia,61,0.05,7,0,0,0,9
2014,37,British Columbia,60,0,6,0,0,0,11
2014,38,British Columbia,61,0.02,6,0,0,0,3
2014,39,British Columbia,62,0.12,9,0,0,0,8
2014,40,British Columbia,56,0.04,6,0,0,0,9
2014,41,British Columbia,58,0.03,5,0,0,0,7
2013,17,British Columbia,50,0.03,7,0,0,0,14
2013,18,British Columbia,50,0,12,0,0,0,8
2013,19,British Columbia,59,0.03,6,0,0,0,5
2013,20,British Columbia,56,0.07,8,0,0,0,7
2013,21,British Columbia,54,0.04,8,0,0,0,4
2013,22,British Columbia,55,0.09,7,0,0,0,8
2013,23,British Columbia,60,0.01,9,0,0,0,14
2013,24,British Columbia,58,0.01,7,0,0,0,16
2013,25,British Columbia,62,0.04,8,0,0,0,10
2013,26,British Columbia,63,0.1,7,0,0,0,17
2013,27,British Columbia,67,0,8,0,0,0,29
2013,28,British Columbia,63,0,8,0,0,0,30
2013,29,British Columbia,66,0,9,0,0,0,20
2013,30,British Columbia,64,0,8,0,0,0,34
2013,31,British Columbia,64,0.02,8,0,0,0,11
2013,32,British Columbia,66,0,6,0,0,1,13
2013,33,British Columbia,66,0.02,8,0,0,1,16
2013,34,British Columbia,63,0.01,8,0,0,1,16
2013,35,British Columbia,65,0.17,7,0,1,1,12
2013,36,British Columbia,64,0.06,6,0,0,1,8
2013,37,British Columbia,63,0,6,0,0,1,14
2013,38,British Columbia,60,0.19,6,0,0,1,6
2013,39,British Columbia,54,0.23,10,0,0,1,6
2013,40,British Columbia,51,0.15,9,0,0,1,6
2013,41,British Columbia,51,0.01,8,0,0,1,8
2012,17,British Columbia,53,0.05,8,0,0,0,5
2012,18,British Columbia,50,0.11,7,0,0,0,6
2012,19,British Columbia,52,0,9,0,0,0,7
2012,20,British Columbia,54,0,10,0,0,0,8
2012,21,British Columbia,55,0.06,8,0,0,0,9
2012,22,British Columbia,57,0.07,7,0,0,0,8
2012,23,British Columbia,53,0.07,8,0,0,0,4
2012,24,British Columbia,57,0.04,8,0,0,0,4
2012,25,British Columbia,58,0.13,8,0,0,0,7
2012,26,British Columbia,60,0.04,8,0,0,0,8
2012,27,British Columbia,59,0.03,7,0,0,0,22
2012,28,British Columbia,66,0,6,0,0,0,30
2012,29,British Columbia,66,0.05,8,0,0,0,30
2012,30,British Columbia,63,0.03,8,0,0,0,38
2012,31,British Columbia,65,0,8,0,0,0,60
2012,32,British Columbia,67,0.01,8,0,0,0,34
2012,33,British Columbia,69,0,7,0,0,0,63
2012,34,British Columbia,63,0,8,0,0,0,100
2012,35,British Columbia,62,0,7,0,0,0,51
2012,36,British Columbia,62,0,7,0,0,0,32
2012,37,British Columbia,58,0.01,8,0,0,0,24
2012,38,British Columbia,60,0,6,0,0,0,13
2012,39,British Columbia,57,0,6,0,0,0,13
2012,40,British Columbia,53,0,8,0,0,0,6
2012,41,British Columbia,52,0.09,5,0,0,0,8
2015,17,Manitoba,56,0,10,0,0,0,4
2015,18,Manitoba,48,0,13,0,0,0,4
2015,19,Manitoba,46,0,10,0,0,0,4
2015,20,Manitoba,52,0,14,0,0,0,4
2015,21,Manitoba,57,0,10,0,0,12,4
2015,22,Manitoba,60,0,12,0,0,4,8
2015,23,Manitoba,67,0,9,0,0,87,8
2015,24,Manitoba,59,0,9,0,0,82,8
2015,25,Manitoba,66,0,7,0,0,44,8
2015,26,Manitoba,68,0,7,0,0,75,11
2015,27,Manitoba,66,0,10,0,0,73,17
2015,28,Manitoba,70,0,7,0,0,132,8
2015,29,Manitoba,69,0,9,0,0,139,17
2015,30,Manitoba,70,0,11,0,0,204,4
2015,31,Manitoba,63,0,9,0,0,275,13
2015,32,Manitoba,73,0,9,0,0,195,23
2015,33,Manitoba,62,0,10,0,0,228,13
2015,34,Manitoba,62,0,11,0,0,69,12
2015,35,Manitoba,73,0,11,1,0,92,10
2015,36,Manitoba,57,0,10,1,0,113,8
2015,37,Manitoba,60,0,11,2,0,34,4
2015,38,Manitoba,61,0,13,1,0,0,4
2015,39,Manitoba,53,0,13,0,0,0,6
2015,40,Manitoba,48,0,11,0,0,0,6
2015,41,Manitoba,44,0,11,0,0,0,6
2014,17,Manitoba,42,0,11,0,0,0,4
2014,18,Manitoba,42,0,14,0,0,0,0
2014,19,Manitoba,46,0,9,0,0,0,0
2014,20,Manitoba,45,0,10,0,0,0,0
2014,21,Manitoba,57,0,12,0,0,0,0
2014,22,Manitoba,66,0,8,0,0,0,0
2014,23,Manitoba,62,0,10,0,0,0,5
2014,24,Manitoba,60,0,11,0,0,0,13
2014,25,Manitoba,62,0,12,0,0,0,9
2014,26,Manitoba,66,0,10,0,0,0,7
2014,27,Manitoba,65,0,15,0,0,0,9
2014,28,Manitoba,67,0,11,0,0,0,36
2014,29,Manitoba,63,0,11,0,0,0,24
2014,30,Manitoba,68,0,9,0,0,0,53
2014,31,Manitoba,65,0,8,0,0,7,41
2014,32,Manitoba,71,0,8,0,0,7,48
2014,33,Manitoba,68,0,8,1,0,14,14
2014,34,Manitoba,67,0,8,2,0,19,18
2014,35,Manitoba,61,0,11,2,0,22,9
2014,36,Manitoba,60,0,8,0,0,24,4
2014,37,Manitoba,50,0,11,0,0,24,11
2014,38,Manitoba,52,0,10,0,0,24,4
2014,39,Manitoba,65,0,13,0,0,24,15
2014,40,Manitoba,47,0,16,0,0,24,4
2014,41,Manitoba,39,0,13,0,0,24,4
2013,17,Manitoba,36,0.01,12,0,0,0,4
2013,18,Manitoba,38,0.11,9,0,0,0,4
2013,19,Manitoba,49,0.02,12,0,0,0,4
2013,20,Manitoba,56,0.02,10,0,0,0,5
2013,21,Manitoba,55,0.05,14,0,0,0,4
2013,22,Manitoba,58,0.16,15,0,0,0,4
2013,23,Manitoba,57,0.01,9,0,0,0,9
2013,24,Manitoba,63,0.03,10,0,0,0,16
2013,25,Manitoba,66,0.1,9,0,0,0,23
2013,26,Manitoba,69,0.24,10,0,0,0,14
2013,27,Manitoba,72,0,6,0,0,0,23
2013,28,Manitoba,70,0.06,10,0,0,1,19
2013,29,Manitoba,66,0.1,9,0,0,1,45
2013,30,Manitoba,60,0.19,8,0,1,7,35
2013,31,Manitoba,61,0.03,7,0,0,10,31
2013,32,Manitoba,59,0.04,7,0,0,16,22
2013,33,Manitoba,64,0.02,8,1,0,16,24
2013,34,Manitoba,71,0.17,10,0,0,16,49
2013,35,Manitoba,76,0.01,7,0,0,17,14
2013,36,Manitoba,64,0,10,1,0,17,11
2013,37,Manitoba,63,0.01,8,0,0,19,9
2013,38,Manitoba,54,0,11,0,0,19,6
2013,39,Manitoba,60,0.1,12,0,0,19,13
2013,40,Manitoba,50,0.03,11,0,0,19,8
2013,41,Manitoba,52,0,10,0,1,19,4
2012,17,Manitoba,46,0.01,12,0,0,0,0
2012,18,Manitoba,51,0.05,11,0,0,0,0
2012,19,Manitoba,56,0.06,13,0,0,0,5
2012,20,Manitoba,58,0.16,12,0,0,0,6
2012,21,Manitoba,53,0.02,11,0,0,0,5
2012,22,Manitoba,53,0.13,9,0,0,0,5
2012,23,Manitoba,67,0.08,8,0,0,0,8
2012,24,Manitoba,62,0.17,11,0,0,0,10
2012,25,Manitoba,60,0.04,8,0,0,0,11
2012,26,Manitoba,68,0,10,0,0,0,11
2012,27,Manitoba,73,0.03,7,0,0,0,15
2012,28,Manitoba,73,0,7,0,0,0,17
2012,29,Manitoba,69,0.05,8,1,0,2,21
2012,30,Manitoba,71,0,8,1,0,20,36
2012,31,Manitoba,71,0.2,9,4,0,48,100
2012,32,Manitoba,67,0,9,7,0,62,47
2012,33,Manitoba,62,0.04,8,7,0,98,31
2012,34,Manitoba,69,0.01,7,6,0,108,84
2012,35,Manitoba,70,0.01,11,7,0,111,75
2012,36,Manitoba,63,0.01,11,1,0,116,22
2012,37,Manitoba,59,0.01,11,3,0,116,23
2012,38,Manitoba,47,0.01,12,2,0,116,13
2012,39,Manitoba,50,0,8,0,0,116,5
2012,40,Manitoba,46,0.02,15,0,0,116,7
2012,41,Manitoba,37,0.02,10,0,0,116,5
2015,17,Quebec,53,0,8,0,0,0,8
2015,18,Quebec,65,0.06,8,0,0,0,8
2015,19,Quebec,58,0.09,10,0,0,0,8
2015,20,Quebec,59,0.05,11,0,0,0,8
2015,21,Quebec,69,0.11,11,0,0,0,8
2015,22,Quebec,56,0.07,9,0,0,0,8
2015,23,Quebec,65,0.16,9,0,0,0,8
2015,24,Quebec,64,0.16,7,0,0,0,16
2015,25,Quebec,67,0.18,8,0,0,0,8
2015,26,Quebec,64,0.07,9,0,0,120,19
2015,27,Quebec,71,0.01,8,0,0,127,24
2015,28,Quebec,70,0.05,9,0,1,132,24
2015,29,Quebec,70,0.3,8,0,1,131,16
2015,30,Quebec,75,0.07,9,1,2,129,16
2015,31,Quebec,67,0.02,9,1,3,126,8
2015,32,Quebec,69,0.31,7,0,0,133,8
2015,33,Quebec,76,0.11,9,1,1,125,16
2015,34,Quebec,68,0.01,8,2,1,123,11
2015,35,Quebec,70,0,8,1,3,131,31
2015,36,Quebec,72,0.15,8,2,4,128,15
2015,37,Quebec,69,0.21,9,6,0,123,7
2015,38,Quebec,58,0,7,5,0,108,7
2015,39,Quebec,55,0.17,11,2,2,107,11
2015,40,Quebec,49,0.03,7,5,0,0,7
2015,41,Quebec,51,0.11,11,8,0,0,15
2014,17,Quebec,46,0.05,9,0,0,0,0
2014,18,Quebec,49,0.18,12,0,0,0,0
2014,19,Quebec,53,0.09,10,0,0,0,0
2014,20,Quebec,62,0.17,13,0,0,0,0
2014,21,Quebec,59,0.01,9,0,0,0,13
2014,22,Quebec,59,0.08,9,0,0,0,13
2014,23,Quebec,66,0.13,8,0,0,0,40
2014,24,Quebec,66,0.28,11,0,0,0,18
2014,25,Quebec,65,0.14,8,0,0,0,27
2014,26,Quebec,69,0.14,6,0,0,0,33
2014,27,Quebec,75,0.02,9,0,0,0,23
2014,28,Quebec,70,0.08,12,0,0,0,40
2014,29,Quebec,69,0.05,9,0,0,1,27
2014,30,Quebec,72,0.06,10,0,0,4,28
2014,31,Quebec,66,0.18,8,0,0,9,54
2014,32,Quebec,70,0.04,6,0,0,10,24
2014,33,Quebec,67,0.2,10,1,2,19,34
2014,34,Quebec,66,0,7,1,0,19,9
2014,35,Quebec,70,0,8,1,1,39,17
2014,36,Quebec,72,0.11,10,1,0,70,8
2014,37,Quebec,60,0.12,9,0,3,99,12
2014,38,Quebec,52,0.02,9,1,2,112,13
2014,39,Quebec,61,0.02,9,0,0,119,15
2014,40,Quebec,58,0.06,11,0,1,119,16
2014,41,Quebec,51,0.1,13,1,0,119,16
2013,17,Quebec,46,0.03,11,1,0,0,9
2013,18,Quebec,60,0.01,7,0,0,0,9
2013,19,Quebec,65,0.08,8,0,0,0,9
2013,20,Quebec,51,0.01,11,0,0,0,18
2013,21,Quebec,64,0.19,10,0,0,0,17
2013,22,Quebec,64,0.18,9,0,0,0,9
2013,23,Quebec,59,0.11,10,0,0,0,21
2013,24,Quebec,64,0.11,9,0,0,0,18
2013,25,Quebec,62,0.09,8,0,0,0,9
2013,26,Quebec,69,0.14,9,0,0,0,37
2013,27,Quebec,72,0.02,9,0,0,0,9
2013,28,Quebec,73,0.06,8,0,0,0,45
2013,29,Quebec,79,0.28,9,0,0,2,49
2013,30,Quebec,66,0.06,7,0,0,3,73
2013,31,Quebec,70,0.12,9,1,3,5,40
2013,32,Quebec,68,0.04,9,3,2,11,74
2013,33,Quebec,66,0.08,9,8,4,23,56
2013,34,Quebec,69,0.02,10,3,5,36,64
2013,35,Quebec,70,0.06,7,4,9,36,29
2013,36,Quebec,63,0.06,10,2,6,40,32
2013,37,Quebec,62,0.18,8,3,4,47,20
2013,38,Quebec,58,0.12,9,1,2,59,8
2013,39,Quebec,54,0.03,6,1,0,60,16
2013,40,Quebec,61,0,6,1,0,60,24
2013,41,Quebec,55,0.11,10,0,0,60,20
2012,17,Quebec,40,0.17,13,0,0,0,0
2012,18,Quebec,50,0.03,7,0,0,0,10
2012,19,Quebec,55,0.07,8,0,0,0,10
2012,20,Quebec,61,0.02,7,0,0,0,10
2012,21,Quebec,69,0.1,7,0,0,0,11
2012,22,Quebec,62,0.16,8,0,0,0,10
2012,23,Quebec,61,0.02,8,0,0,0,10
2012,24,Quebec,68,0.08,7,0,0,0,11
2012,25,Quebec,76,0.01,9,0,0,0,11
2012,26,Quebec,69,0.13,9,0,0,0,26
2012,27,Quebec,73,0.12,6,0,0,0,40
2012,28,Quebec,72,0,8,0,2,0,24
2012,29,Quebec,71,0.21,6,1,0,0,11
2012,30,Quebec,71,0.1,7,1,0,0,11
2012,31,Quebec,76,0.01,7,0,1,5,78
2012,32,Quebec,72,0.17,10,2,5,8,31
2012,33,Quebec,70,0.02,7,6,2,19,94
2012,34,Quebec,70,0,6,10,5,19,100
2012,35,Quebec,71,0.01,11,9,8,19,76
2012,36,Quebec,71,0.11,6,14,1,19,70
2012,37,Quebec,63,0.07,8,23,6,19,43
2012,38,Quebec,58,0.12,10,16,0,19,34
2012,39,Quebec,54,0.01,9,27,0,19,38
2012,40,Quebec,57,0.16,8,11,0,19,14
2012,41,Quebec,45,0.06,10,8,0,19,19
2015,17,Ontario,53,0,9,0,0,0,2
2015,18,Ontario,61,0.04,5,0,0,0,2
2015,19,Ontario,58,0.07,7,0,0,0,4
2015,20,Ontario,58,0,8,0,0,0,5
2015,21,Ontario,70,0.11,8,0,0,0,8
2015,22,Ontario,57,0.14,7,0,0,180,8
2015,23,Ontario,65,0.18,6,0,0,356,5
2015,24,Ontario,65,0.08,5,0,1,852,5
2015,25,Ontario,67,0.33,7,0,0,886,13
2015,26,Ontario,63,0.02,7,0,0,954,15
2015,27,Ontario,68,0.04,5,0,0,1152,13
2015,28,Ontario,67,0.03,6,1,0,1216,21
2015,29,Ontario,72,0.01,7,1,4,1219,16
2015,30,Ontario,76,0.03,6,1,1,1222,22
2015,31,Ontario,68,0.06,6,0,8,1176,24
2015,32,Ontario,69,0.21,6,0,0,1168,15
2015,33,Ontario,73,0.09,5,1,0,1168,24
2015,34,Ontario,64,0.01,5,5,1,987,12
2015,35,Ontario,75,0,5,2,1,881,18
2015,36,Ontario,70,0.11,5,5,0,802,9
2015,37,Ontario,65,0.07,6,1,2,712,6
2015,38,Ontario,60,0,5,5,4,526,4
2015,39,Ontario,55,0.04,9,2,2,396,6
2015,40,Ontario,53,0.14,6,3,0,65,5
2015,41,Ontario,52,0.04,8,3,4,0,2
2014,17,Ontario,46,0.05,8,0,0,0,3
2014,18,Ontario,47,0.14,9,0,0,0,2
2014,19,Ontario,53,0,9,0,0,0,2
2014,20,Ontario,56,0.13,6,0,0,0,3
2014,21,Ontario,57,0.09,5,0,0,0,4
2014,22,Ontario,65,0.02,6,0,0,0,7
2014,23,Ontario,63,0.04,6,0,0,0,10
2014,24,Ontario,65,0.19,6,0,0,0,16
2014,25,Ontario,66,0.16,5,0,0,0,13
2014,26,Ontario,69,0.06,4,0,0,0,7
2014,27,Ontario,72,0.09,7,0,0,0,20
2014,28,Ontario,68,0.12,6,0,0,0,17
2014,29,Ontario,66,0.21,5,1,0,0,13
2014,30,Ontario,68,0.03,5,0,0,2,14
2014,31,Ontario,67,0.35,5,0,0,5,35
2014,32,Ontario,68,0.21,4,0,0,9,22
2014,33,Ontario,65,0.12,7,2,0,11,30
2014,34,Ontario,67,0.02,4,0,2,13,11
2014,35,Ontario,67,0,6,2,3,30,18
2014,36,Ontario,71,0.39,5,5,0,43,13
2014,37,Ontario,60,0.15,6,1,0,52,10
2014,38,Ontario,53,0.02,4,0,1,56,7
2014,39,Ontario,60,0.08,4,0,0,56,3
2014,40,Ontario,61,0.06,4,0,0,56,6
2014,41,Ontario,50,0.06,6,0,0,56,4
2013,17,Ontario,43,0.05,6,0,0,0,2
2013,18,Ontario,57,0.05,6,0,0,0,3
2013,19,Ontario,59,0.04,5,0,0,0,4
2013,20,Ontario,51,0.02,8,0,0,0,3
2013,21,Ontario,60,0.17,8,0,0,0,7
2013,22,Ontario,64,0.16,6,1,0,0,9
2013,23,Ontario,58,0.05,7,1,0,0,9
2013,24,Ontario,64,0.29,6,0,0,0,12
2013,25,Ontario,64,0.11,5,0,0,0,12
2013,26,Ontario,73,0.06,4,0,1,2,12
2013,27,Ontario,71,0.05,5,1,0,2,20
2013,28,Ontario,72,0.13,6,2,0,4,15
2013,29,Ontario,80,0.05,5,1,2,12,20
2013,30,Ontario,65,0.12,6,5,0,22,56
2013,31,Ontario,66,0.26,5,4,8,41,43
2013,32,Ontario,67,0.04,6,5,6,65,32
2013,33,Ontario,63,0,5,5,2,89,24
2013,34,Ontario,70,0,5,2,0,131,30
2013,35,Ontario,72,0.2,3,2,8,155,22
2013,36,Ontario,63,0.12,6,7,2,179,12
2013,37,Ontario,64,0.04,6,3,2,190,15
2013,38,Ontario,57,0.17,4,5,2,194,9
2013,39,Ontario,55,0,4,0,1,196,5
2013,40,Ontario,61,0.04,4,5,0,198,9
2013,41,Ontario,56,0.04,4,1,0,198,4
2012,17,Ontario,40,0.06,11,0,0,0,4
2012,18,Ontario,50,0.12,6,0,0,0,3
2012,19,Ontario,56,0.07,6,0,0,0,3
2012,20,Ontario,58,0.02,4,0,0,0,3
2012,21,Ontario,69,0.01,6,0,0,0,5
2012,22,Ontario,64,0.09,8,0,0,0,3
2012,23,Ontario,63,0.03,6,1,0,0,6
2012,24,Ontario,67,0.08,6,0,0,0,4
2012,25,Ontario,76,0.17,6,0,0,2,7
2012,26,Ontario,70,0.04,7,0,0,6,10
2012,27,Ontario,75,0.04,5,3,1,10,39
2012,28,Ontario,73,0.02,5,5,3,19,24
2012,29,Ontario,75,0.06,6,9,1,30,19
2012,30,Ontario,72,0.38,6,14,2,89,17
2012,31,Ontario,73,0.16,4,23,1,162,77
2012,32,Ontario,70,0.14,6,44,1,249,46
2012,33,Ontario,68,0.05,4,44,8,312,64
2012,34,Ontario,67,0,4,38,4,375,83
2012,35,Ontario,70,0.15,6,26,0,409,100
2012,36,Ontario,69,0.56,4,25,0,434,79
2012,37,Ontario,61,0.03,5,17,2,454,37
2012,38,Ontario,57,0.16,5,3,4,462,23
2012,39,Ontario,53,0,6,2,6,462,24
2012,40,Ontario,57,0.03,5,3,0,464,18
2012,41,Ontario,42,0.04,5,1,0,464,10
2015,17,Saskatchewan,50,0,10,0,0,0,6
2015,18,Saskatchewan,46,0,11,0,0,0,12
2015,19,Saskatchewan,46,0,9,0,0,0,6
2015,20,Saskatchewan,53,0,8,0,0,0,6
2015,21,Saskatchewan,56,0,8,0,0,2,9
2015,22,Saskatchewan,60,0,10,0,0,0,9
2015,23,Saskatchewan,64,0,10,0,0,3,9
2015,24,Saskatchewan,57,0,8,0,0,3,12
2015,25,Saskatchewan,65,0,7,0,0,10,31
2015,26,Saskatchewan,70,0,6,0,0,13,15
2015,27,Saskatchewan,66,0,9,0,0,16,13
2015,28,Saskatchewan,67,0,8,0,0,40,15
2015,29,Saskatchewan,68,0,10,0,0,47,16
2015,30,Saskatchewan,63,0.02,9,0,0,69,43
2015,31,Saskatchewan,63,0,8,0,0,67,16
2015,32,Saskatchewan,70,0,8,0,0,80,28
2015,33,Saskatchewan,58,0,8,0,0,94,38
2015,34,Saskatchewan,62,0,8,0,0,42,21
2015,35,Saskatchewan,61,0,10,0,1,41,14
2015,36,Saskatchewan,53,0,8,0,0,0,9
2015,37,Saskatchewan,52,0,8,0,0,0,5
2015,38,Saskatchewan,54,0,10,0,0,0,5
2015,39,Saskatchewan,48,0,8,0,0,0,5
2015,40,Saskatchewan,48,0,9,0,0,0,8
2015,41,Saskatchewan,44,0,11,0,0,0,5
2014,17,Saskatchewan,40,0,12,0,0,0,6
2014,18,Saskatchewan,41,0,10,0,0,0,6
2014,19,Saskatchewan,41,0,9,0,0,0,6
2014,20,Saskatchewan,45,0,7,0,0,0,6
2014,21,Saskatchewan,59,0,10,0,0,0,13
2014,22,Saskatchewan,57,0,11,0,0,0,20
2014,23,Saskatchewan,55,0,8,0,0,0,17
2014,24,Saskatchewan,53,0,10,0,0,0,13
2014,25,Saskatchewan,57,0,10,0,0,0,7
2014,26,Saskatchewan,63,0,8,0,0,0,21
2014,27,Saskatchewan,66,0,11,0,0,0,26
2014,28,Saskatchewan,65,0,10,0,0,0,69
2014,29,Saskatchewan,64,0,9,0,0,0,65
2014,30,Saskatchewan,63,0,9,0,0,1,60
2014,31,Saskatchewan,67,0,6,0,0,1,36
2014,32,Saskatchewan,69,0,6,0,2,2,47
2014,33,Saskatchewan,67,0,7,0,0,9,67
2014,34,Saskatchewan,64,0,8,0,0,19,45
2014,35,Saskatchewan,58,0,9,0,0,20,34
2014,36,Saskatchewan,56,0,8,0,0,20,13
2014,37,Saskatchewan,46,0,9,0,0,20,19
2014,38,Saskatchewan,55,0,8,0,0,20,6
2014,39,Saskatchewan,61,0,9,0,0,20,16
2014,40,Saskatchewan,44,0,12,0,0,20,12
2014,41,Saskatchewan,45,0,9,0,0,20,6
2013,17,Saskatchewan,34,0,10,0,0,0,10
2013,18,Saskatchewan,40,0,12,0,0,0,14
2013,19,Saskatchewan,50,0,12,0,0,0,14
2013,20,Saskatchewan,59,0,9,0,0,0,7
2013,21,Saskatchewan,57,0,13,0,0,0,7
2013,22,Saskatchewan,60,0,9,0,0,0,14
2013,23,Saskatchewan,57,0,9,0,0,0,21
2013,24,Saskatchewan,57,0,10,0,0,0,20
2013,25,Saskatchewan,61,0,10,0,0,0,14
2013,26,Saskatchewan,64,0,7,0,0,0,41
2013,27,Saskatchewan,69,0,7,0,0,0,61
2013,28,Saskatchewan,65,0,8,0,0,1,65
2013,29,Saskatchewan,62,0,9,0,3,1,81
2013,30,Saskatchewan,60,0,9,0,1,3,75
2013,31,Saskatchewan,59,0,8,0,2,3,33
2013,32,Saskatchewan,60,0,6,0,1,18,44
2013,33,Saskatchewan,69,0,8,0,0,29,75
2013,34,Saskatchewan,66,0,8,1,1,29,60
2013,35,Saskatchewan,69,0,8,3,0,36,24
2013,36,Saskatchewan,67,0,7,1,0,40,21
2013,37,Saskatchewan,62,0,9,0,0,40,26
2013,38,Saskatchewan,57,0,10,1,2,40,32
2013,39,Saskatchewan,51,0,9,0,1,40,13
2013,40,Saskatchewan,45,0,11,0,0,40,29
2013,41,Saskatchewan,46,0,10,0,0,40,10
2012,17,Saskatchewan,44,0,13,0,0,0,24
2012,18,Saskatchewan,46,0,12,0,0,0,16
2012,19,Saskatchewan,51,0,13,0,0,0,16
2012,20,Saskatchewan,54,0,12,0,0,0,9
2012,21,Saskatchewan,48,0,11,0,0,0,17
2012,22,Saskatchewan,53,0,9,0,0,0,16
2012,23,Saskatchewan,61,0,13,0,0,0,8
2012,24,Saskatchewan,56,0,11,0,0,0,16
2012,25,Saskatchewan,58,0,7,0,0,0,25
2012,26,Saskatchewan,64,0,12,0,0,0,22
2012,27,Saskatchewan,65,0,9,0,0,0,23
2012,28,Saskatchewan,71,0,7,0,1,0,67
2012,29,Saskatchewan,67,0,10,0,0,0,34
2012,30,Saskatchewan,67,0,8,0,0,0,28
2012,31,Saskatchewan,64,0,8,0,0,0,59
2012,32,Saskatchewan,68,0,8,0,0,3,58
2012,33,Saskatchewan,59,0,8,2,0,4,34
2012,34,Saskatchewan,65,0,9,1,0,6,100
2012,35,Saskatchewan,64,0,9,0,0,6,49
2012,36,Saskatchewan,55,0,11,3,0,6,41
2012,37,Saskatchewan,58,0,13,0,0,6,16
2012,38,Saskatchewan,50,0,8,3,0,6,19
2012,39,Saskatchewan,55,0,6,0,0,6,15
2012,40,Saskatchewan,42,0,10,0,0,6,11
2012,41,Saskatchewan,36,0,8,0,0,6,7
First I produced this plot
But I did that in the most brute force way imaginable
#split out each year
cases2015 <- subset(mosquitoes, mosquitoes$Years==2015)
cases2014 <- subset(mosquitoes, mosquitoes$Years==2014)
cases2013 <- subset(mosquitoes, mosquitoes$Years==2013)
cases2012 <- subset(mosquitoes, mosquitoes$Years==2012)
#get the sums by week
aggregate2015 <- aggregate(cases2015$Number.of.cases, by=list(Weeks=cases2015$Weeks), FUN=sum)
aggregate2014 <- aggregate(cases2014$Number.of.cases, by=list(Weeks=cases2014$Weeks), FUN=sum)
aggregate2013 <- aggregate(cases2013$Number.of.cases, by=list(Weeks=cases2013$Weeks), FUN=sum)
aggregate2012 <- aggregate(cases2012$Number.of.cases, by=list(Weeks=cases2012$Weeks), FUN=sum)
#put the sums back together into a dataframe
aggregateSums <- aggregate2012
aggregateSums <- cbind(aggregateSums, aggregate2013[,2])
aggregateSums <- cbind(aggregateSums, aggregate2014[,2])
aggregateSums <- cbind(aggregateSums, aggregate2015[,2])
#give the columns useful names
colnames(aggregateSums) <- c("Weeks","Cases.2012","Cases.2013","Cases.2014","Cases.2015")
#base R plot
#plot the first set of points
plot(x=aggregateSums$Weeks,y=aggregateSums$Cases.2012,pch=16,col="Red",main="West Nile Cases",xlab="Week",ylab="Number of Cases")
#add additional years
points(x=aggregateSums$Weeks,y=aggregateSums$Cases.2013,pch=15,col="Blue")
points(x=aggregateSums$Weeks,y=aggregateSums$Cases.2014,pch=14,col="Orange")
points(x=aggregateSums$Weeks,y=aggregateSums$Cases.2015,pch=13,col="Brown")
#add the connecting lines
lines(x=aggregateSums$Weeks,y=aggregateSums$Cases.2012,col="Red")
lines(x=aggregateSums$Weeks,y=aggregateSums$Cases.2013,col="Blue")
lines(x=aggregateSums$Weeks,y=aggregateSums$Cases.2014,col="Orange")
lines(x=aggregateSums$Weeks,y=aggregateSums$Cases.2015,col="Brown")
#click to place legend
legend(locator(1),c("2012","2013","2014","2015"),pch=c(16,15,14,13), col=c("Red","Blue","Orange","Brown"))
So surely there has to be a more efficient way to get there.
My next step is to produce the same plot but for just one province at a time. I don't want to have to go through the above 6 times...
I'm open to accomplishing this via ggplot. If possible, I'd like to do it without resorting to additional packages (like plyr), as I'm trying to learn the base functionality for manipulating data.
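For the base-R route, here is a minimal sketch of one way to collapse the per-year subsets into a single call. It runs on a made-up stand-in for `mosquitoes` (same column names, invented counts), and `weekly_sums()` and `plot_cases()` are hypothetical helper names, not anything from the original code. `tapply()` builds a Weeks x Years matrix of sums and `matplot()` draws one line per column.

```r
# Toy stand-in for `mosquitoes`: same column names, made-up case counts.
mosquitoes <- expand.grid(Years = 2012:2015, Weeks = 17:19,
                          Province = c("Alberta", "Ontario"),
                          stringsAsFactors = FALSE)
mosquitoes$Number.of.cases <- seq_len(nrow(mosquitoes))

# weekly_sums() is a hypothetical helper: a Weeks x Years matrix of summed cases.
weekly_sums <- function(d) {
  with(d, tapply(Number.of.cases, list(Weeks = Weeks, Years = Years), sum))
}

# plot_cases() is a hypothetical helper: one line per year, drawn by matplot().
plot_cases <- function(d, title) {
  sums <- weekly_sums(d)
  cols <- c("red", "blue", "orange", "brown")
  matplot(as.numeric(rownames(sums)), sums, type = "b", lty = 1,
          pch = 16:13, col = cols, main = title,
          xlab = "Week", ylab = "Number of Cases")
  legend("topleft", legend = colnames(sums), pch = 16:13, col = cols, lty = 1)
  invisible(sums)
}

# All provinces together.
all_sums <- plot_cases(mosquitoes, "West Nile Cases")

# One plot per province, without repeating the code.
for (p in unique(mosquitoes$Province)) {
  plot_cases(subset(mosquitoes, Province == p), paste("West Nile Cases:", p))
}
```

The legend is placed with "topleft" rather than locator(1) so the loop does not need a click per plot. If some year had no rows for a given week, tapply() would leave an NA in that cell and matplot() would show a gap there.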
Just to close the loop after Biranjan's answer...
library(dplyr)
library(ggplot2)
mosq2 <- mosquitoes %>%
  select(Years,Weeks,Province,Number.of.cases) %>%
  group_by(Years,Weeks,Province) %>%
  summarise(sum_case=sum(Number.of.cases))
ggplot(data=mosq2, aes(x=as.factor(Weeks),y=sum_case,color=as.factor(Years))) +
geom_point(aes(shape=as.factor(Years))) +
geom_line(aes(group=as.factor(Years))) +
labs(title="West Nile Cases", x="weeks", y="Number of cases") +
theme(legend.title=element_blank()) +
facet_wrap(~Province,ncol=3) +
scale_x_discrete(breaks=c(17,30,41))
Turned out quite nicely

ggplot(data=data1, aes(x=as.factor(Weeks),y=sum_case,color=as.factor(Years)))+
geom_point(aes(shape=as.factor(Years)))+
geom_line(aes(group=as.factor(Years)))+
labs(title="West Nile cases",x="weeks",y="Number of cases")+
theme(legend.title=element_blank())
Update:
I had too few points in my simulation, so it rendered fine; that was the problem. I couldn't find a way to plot this using ggplot alone. The same code works if "dplyr" is used first and the variable names are edited accordingly. I know it is not what you are looking for, sorry to disappoint you.
library(dplyr)
data1 <- data %>%
select(Years,Weeks,Number.of.cases) %>%
group_by(Years,Weeks) %>%
summarise(sum_case=sum(Number.of.cases))

Related

Remove all rows above and below a value in R

We have citizen scientists recording data for us using In-Situ Aqua troll 600 instruments. The instrument is similar to a CTD, but the data format is a little different, different enough that I cannot use CTD trim from the OCE package in R. I need to remove all the rows of data recorded during the soak time (time in the water before they start lowering the instrument) and during the upcast, that is, all the rows after they reached the max depth. So I just need the center portion of my dataframe.
My Data
Date Time Salinity (ppt) (672441) Chlorophyll-a Fluorescence (RFU) (671721) RDO Concentration (mg/L) (672144) Temperature (°C) (676121) Depth (ft) (671051)
16:29.0 0 0.01089297 7.257619 31.91303 0.008220486
16:31.0 0 0.01765913 7.246986 31.93175 0.1499496
16:33.0 0 0.0130412 7.258863 31.93253 0.5387784
16:35.0 0 0.01299242 7.274049 31.93806 0.6187978
16:37.0 0 0.01429801 7.26965 31.94401 0.6640261
16:39.0 0 0.01342988 7.271608 31.93595 0.681709
16:41.0 0 0.01337719 7.271549 31.93503 0.684597
16:43.0 7.087267 0.007094439 6.98015 31.89018 1.598019
16:45.0 28.3442 0.007111916 6.268753 31.83806 1.687673
16:47.0 31.06357 0.007945394 6.197834 31.77821 1.418773
16:49.0 32.07076 0.0080788 6.166986 31.76881 1.382685
16:51.0 31.95504 0.004382414 6.191305 31.72906 1.358556
16:53.0 36.21165 0.01983912 5.732656 29.3942 123.4148
16:55.0 36.37849 0.02243886 5.626586 28.82502 125.2927
16:57.0 36.43061 0.02416219 5.450325 28.23787 126.7997
16:59.0 36.44484 0.02441683 5.421676 28.14037 127.0321
17:01.0 36.46815 4.510316 5.318929 28.09501 127.2064
17:03.0 36.41381 4.012657 5.241654 28.14595 127.2227
17:05.0 36.42724 0.7891375 5.174401 28.20383 127.2019
17:07.0 36.41064 0.4351442 5.120181 28.18592 127.197
17:09.0 36.38155 0.2253969 5.033384 28.21021 127.1895
17:11.0 36.37671 0.2089337 5.019629 28.21222 127.1885
17:13.0 36.43813 0.08728585 4.981099 28.17526 127.2223
17:15.0 36.47644 0.904435 4.951878 28.13579 127.2108
17:17.0 36.54742 0.1230291 4.93056 28.06166 127.2307
17:19.0 36.60466 10.04291 4.908442 27.9397 126.6003
17:21.0 36.61511 11.33922 4.904828 27.92038 126.5161
17:23.0 36.68179 0.6680982 4.87018 27.78319 123.707
17:25.0 36.74612 0.06539913 4.848994 27.72977 119.906
17:27.0 36.75729 0.02414635 4.826871 27.72545 114.9537
17:29.0 37.1578 0.01556828 4.804105 27.81129 113.3405
> depthmax<- max(WS$`Depth (ft) (671051)`, na.rm = TRUE)
> output <- WS[WS$"Depth (ft) (671051)" < depthmax,]
> Output2 <- output[output$"Depth (ft) (671051)" > 1,]
I tried these and got Output2 to work but can't seem to get output to work. Is there a more elegant way to do this? Just to recap, I need to remove all rows after the depthmax (127.2307) and all the rows before the depth when they start lowering the instrument (~2.41).
Your code does remove the maximum depth, but not the rows after the maximum depth is reached. You want to locate the row index of the maximum depth and delete that row and the ones after:
start <- tail(which(WS$`Depth (ft) (671051)` < 2.41), 1) + 1
end <- which.max(WS$`Depth (ft) (671051)`) - 1
output <- WS[start:end, ]
The first line finds the index of the last row less than 2.41 and adds 1 to get the starting row. The second line finds the index of the maximum depth and subtracts 1 to get the row before that.
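As a self-contained illustration, here is a minimal sketch on made-up depths (not the real cast) that includes one NA and an upcast that returns to the surface. It assumes the soak rows should only be searched for above the bottom of the cast, so a shallow reading at the end of the upcast is not mistaken for the end of the soak. The names `bottom`, `soak` and `downcast` are hypothetical.

```r
# Made-up depths: soak near the surface, downcast, maximum, then upcast.
WS <- data.frame(
  sample = 1:10,
  `Depth (ft) (671051)` = c(0.1, 0.6, NA, 1.4, 50, 100, 127.2, 120, 60, 1.0),
  check.names = FALSE
)

depth <- WS[["Depth (ft) (671051)"]]

# which.max() skips NA and returns an index into the original rows.
bottom <- which.max(depth)

# Look for soak rows only up to the bottom of the cast; which() drops NA.
soak <- which(depth[seq_len(bottom)] < 2.41)
start <- if (length(soak) > 0) max(soak) + 1 else 1

# Keep rows after the soak and before the maximum depth.
downcast <- WS[start:(bottom - 1), , drop = FALSE]
downcast
```

This follows the answer in dropping the maximum-depth row itself; using `start:bottom` instead would keep it.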

GameTheory package: Convert data frame of games to Coalition Set

I am looking to explore the GameTheory package from CRAN, but I would appreciate help in converting my data (in the form of a data frame of unique combinations and results) into the required coalition object. The precursor to this, I believe, is an ordered list of all coalition values (https://cran.r-project.org/web/packages/GameTheory/vignettes/GameTheory.pdf).
My real data has n ~ 30 'players', and unique combinations = large (say 1000 unique combinations), for which I have 1 and 0 identifiers to describe the combinations. This data is sparsely populated in that I do not have data for all combinations, but will assume combinations not described have zero value. I plan to have one specific 'player' who will appear in all combinations, and act as a baseline.
By way of example this is the data frame I am starting with:
require(GameTheory)
games <- read.csv('C:\\Users\\me\\Desktop\\SampleGames.csv', header = TRUE, row.names = 1)
games
n1 n2 n3 n4 Stakes Wins Success_Rate
1 1 1 0 0 800 60 7.50%
2 1 0 1 0 850 45 5.29%
3 1 0 0 1 150000 10 0.01%
4 1 1 1 0 300 25 8.33%
5 1 1 0 1 1800 65 3.61%
6 1 0 1 1 1900 55 2.89%
7 1 1 1 1 700 40 5.71%
8 1 0 0 0 3000000 10 0.00333%
where n1 is my universal player, and in this instance, I have described all combinations.
To calculate my 'base' coalition value from player {1} alone, I am looking to perform the calculation: 0.00333% (success rate) * all stakes, i.e.
0.00333% * (800 + 850 + 150000 + 300 + 1800 + 1900 + 700 + 3000000) = 105
I'll then have zero values for {2}, {3} and {4} as they never "play" alone in this example.
To calculate my first pair coalition value, I am looking to perform the calculation:
7.5%(800 + 300 + 1800 + 700) + 0.00333%(850 + 150000 + 1900 + 3000000) = 375
This is calculated as players {1,2} base win rate (7.5%) by the stakes they feature in, plus player {1} base win rate (0.00333%) by the combinations he features in that player {2} does not - i.e. exclusive sets.
This logic is repeated for the other unique combinations. For example row 4 would be the combination of {1,2,3} so the calculation is:
7.5%(800+1800) + 5.29%(850+1900) + 8.33%(300+700) + 0.00333%(3000000+150000) = 529 which descriptively is set {1,2} success rate% by Stakes for the combinations it appears in that {3} does not, {1,3} by where {2} does not feature, {1,2,3} by their occurrences, and the base player {1} by examples where neither {2} nor {3} occur.
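The three worked calculations above can be reproduced with a short base-R sketch. It rests on one reading of the rule, which is an assumption: each game's stakes are multiplied by the success rate of that game's players restricted to the coalition, and a restricted set with no row in the data counts as zero. `coalition_value()` is a hypothetical helper, and the GameTheory package itself is not called here.

```r
# Data from the question; success rates entered as proportions (7.50% -> 0.075).
games <- data.frame(
  n1 = c(1, 1, 1, 1, 1, 1, 1, 1),
  n2 = c(1, 0, 0, 1, 1, 0, 1, 0),
  n3 = c(0, 1, 0, 1, 0, 1, 1, 0),
  n4 = c(0, 0, 1, 0, 1, 1, 1, 0),
  Stakes = c(800, 850, 150000, 300, 1800, 1900, 700, 3000000),
  Rate = c(7.50, 5.29, 0.01, 8.33, 3.61, 2.89, 5.71, 0.00333) / 100
)
players <- c("n1", "n2", "n3", "n4")

# Encode each row's player set as a key such as "1100" and index rates by it.
row_key <- apply(games[players], 1, paste, collapse = "")
rate_by_key <- setNames(games$Rate, row_key)

# coalition_value() is a hypothetical helper implementing the assumed rule.
coalition_value <- function(members) {
  mask <- as.integer(players %in% members)
  # Restrict every game's player set to the coalition members.
  restricted <- sweep(as.matrix(games[players]), 2, mask, `*`)
  keys <- apply(restricted, 1, paste, collapse = "")
  rates <- rate_by_key[keys]
  rates[is.na(rates)] <- 0   # combinations with no data are assumed to be worth zero
  sum(games$Stakes * rates)
}

round(coalition_value("n1"))                  # base player {1}
round(coalition_value(c("n1", "n2")))         # pair {1,2}
round(coalition_value(c("n1", "n2", "n3")))   # triple {1,2,3}
```

Under this reading, any coalition without n1 comes out as zero, because no row in the data lacks the universal player.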
My expected outcome therefore should look like this I believe:
c(105,0,0,0, 375,304,110,0,0,0, 529,283,246,0, 400)
where the first four numbers are the single player combinations {1} {2} {3} and {4}, the next six numbers are two player combinations {1,2} {1,3} {1,4} (and the null cases {2,3} {2,4} {3,4} which don't exist), then the next four are the three player combinations {1,2,3} {1,2,4} {1,3,4} and the null case {2,3,4}, and lastly the full combination set {1,2,3,4}.
I'd then feed this in to the DefineGame function of the package to create my coalitions object.
Appreciate any help: I have tried to be as descriptive as possible. I really don't know where to start on generating the necessary sets and set exclusions.
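One possible reading of the rule described above: for a coalition S, every observed row contributes its stakes multiplied by the success rate of the set "players in that row who also belong to S", and any intersection that was never observed contributes zero. The sketch below implements that reading in base R. It rests on two assumptions that are not stated in the question: the rates for rows 1 to 7 are recomputed as Wins / Stakes (the printed percentages are rounded), while row 8 keeps the stated 0.00333% because its Wins value of 10 does not agree with that rate. The helper names set_key and coalition_value are made up for illustration.

```r
# Data from the question. ASSUMPTION: rates for rows 1-7 are Wins / Stakes;
# row 8 uses the stated 0.00333% because its Wins column disagrees with it.
games <- data.frame(
  n1 = c(1, 1, 1, 1, 1, 1, 1, 1),
  n2 = c(1, 0, 0, 1, 1, 0, 1, 0),
  n3 = c(0, 1, 0, 1, 0, 1, 1, 0),
  n4 = c(0, 0, 1, 0, 1, 1, 1, 0),
  Stakes = c(800, 850, 150000, 300, 1800, 1900, 700, 3000000),
  Wins   = c(60, 45, 10, 25, 65, 55, 40, 10)
)
games$rate <- games$Wins / games$Stakes
games$rate[8] <- 0.00333 / 100

players <- c("n1", "n2", "n3", "n4")
n <- length(players)
member <- as.matrix(games[, players]) == 1   # logical membership matrix

# Encode a membership vector as a string key such as "1100" (hypothetical helper)
set_key <- function(m) paste(as.integer(m), collapse = "")

# Success rate of every observed combination, looked up by its key
rate_by_set <- setNames(games$rate, apply(member, 1, set_key))

# Value of coalition S (logical vector of length n): each row contributes
# Stakes * rate(row members intersected with S); unobserved sets count as 0.
coalition_value <- function(S) {
  inter <- member & matrix(S, nrow(member), n, byrow = TRUE)
  r <- unname(rate_by_set[apply(inter, 1, set_key)])
  r[is.na(r)] <- 0
  sum(r * games$Stakes)
}

# Enumerate coalitions by size: singles, pairs, triples, then the full set
coalitions <- unlist(
  lapply(seq_len(n), function(k) combn(n, k, simplify = FALSE)),
  recursive = FALSE
)
values <- sapply(coalitions, function(idx) {
  S <- rep(FALSE, n)
  S[idx] <- TRUE
  coalition_value(S)
})
round(values)
```

Under those assumptions the rounded values agree with the expected vector listed in the question, and they come out in the same order (singles, pairs, triples, full set), so the vector could then be handed to DefineGame as planned. For around 30 players the number of coalitions is 2^30 - 1, so enumerating all of them this way would need a more economical representation.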

Including lagged independent variables - R

I would like to run a regression where I use both the current value and lagged values from a specific independent variable.
My dataset
This is an example extract from my dataset:
dt          nrOfCalls  nrOfOrders  nrOfOrdersLag1  nrOfOrdersLag2  nrOfOrdersLag3
2016/04/20         17           5               9               7              12
2016/04/21         12           8               5               9               7
2016/04/22         14           4               8               5               9
2016/04/23         15           6               4               8               5
2016/04/24         20          14               6               4               8
2016/04/25         10           3              14               6               4
where nrOfOrdersLagX is the number of orders X days ago. I have also included dummy variables (because of limited space I have not included these dummy variables in the example extract of my dataset).
My code
When I run the following code everything works perfectly fine:
reg <- lm(nrOfCalls ~ dummy1+...+dummy6+nrOfOrders, data=trainingSet)
However, when I try including the lagged values of the nrOfOrders regressor (for this example I only include one lagged value), I get some unexpected results. I use the following code:
reg <- lm(nrOfCalls ~ dummy1+...+dummy6+nrOfOrders+nrOfOrdersLag1, data=trainingSet)
Instead of including just the regressor nrOfOrdersLag1, it includes all kinds of regressors whose names are variations on nrOfOrdersLag1.
Call:
lm(formula = nrOfCalls ~ dummy1 + dummy2 + dummy3 + dummy4 +
dummy5 + dummy6 + nrOfOrders + nrOfOrdersLag1, data = trainCall)
Coefficients:
(Intercept) dummy1 dummy2 dummy3 dummy4
604.06334 -114.03241 -229.67540 -270.62292 -220.12409
dummy5 dummy6 nrOfOrders nrOfOrdersLag110707 nrOfOrdersLag11161
-457.22245 -465.17116 0.01729 -249.54641 -10.98526
nrOfOrdersLag111869 nrOfOrdersLag11207 nrOfOrdersLag11234 nrOfOrdersLag11262 nrOfOrdersLag11267
45.36821 33.46161 -17.70615 -384.09745 -413.64804
nrOfOrdersLag11279 nrOfOrdersLag11285 nrOfOrdersLag112945 nrOfOrdersLag11336 nrOfOrdersLag11348
-200.19660 32.75546 -264.04005 -47.13457 79.48368
nrOfOrdersLag11351 nrOfOrdersLag11355 nrOfOrdersLag11363 nrOfOrdersLag11364 nrOfOrdersLag11368
-208.62312 6.83426 -98.71679 170.29583 -93.83054
nrOfOrdersLag11375 nrOfOrdersLag11398 nrOfOrdersLag11456 nrOfOrdersLag11462 nrOfOrdersLag11464
50.54960 14.39958 118.73762 113.72744 190.54445
nrOfOrdersLag11469 nrOfOrdersLag114778 nrOfOrdersLag11486 nrOfOrdersLag11489 nrOfOrdersLag11504
-8.79258 84.35041 66.29121 29.67360 24.30553
nrOfOrdersLag11505 nrOfOrdersLag11511 nrOfOrdersLag11520 nrOfOrdersLag11521 nrOfOrdersLag11527
286.85352 69.76762 -159.45588 -38.90402 53.62128
nrOfOrdersLag11538 nrOfOrdersLag11540 nrOfOrdersLag11564 nrOfOrdersLag115674 nrOfOrdersLag11579
-104.66037 -60.10656 -58.32177 522.56810 77.65481
nrOfOrdersLag11587 nrOfOrdersLag11593 nrOfOrdersLag11603 nrOfOrdersLag11618 nrOfOrdersLag11622
34.63649 31.28570 -124.35673 16.43115 207.99435
nrOfOrdersLag11624 nrOfOrdersLag11626 nrOfOrdersLag11629 nrOfOrdersLag11631 nrOfOrdersLag11635
93.90391 78.94275 155.88327 15.32027 125.02409
nrOfOrdersLag11640 nrOfOrdersLag11645 nrOfOrdersLag11649 nrOfOrdersLag11651 nrOfOrdersLag11653
208.51996 -42.03086 -1.62533 164.73045 12.61157
nrOfOrdersLag11654 nrOfOrdersLag11673 nrOfOrdersLag11683 nrOfOrdersLag11688 nrOfOrdersLag11698
129.26306 -41.56615 137.09095 149.86866 -49.43096
nrOfOrdersLag11699 nrOfOrdersLag11702 nrOfOrdersLag11703 nrOfOrdersLag11705 nrOfOrdersLag11714
76.86530 202.69027 -70.26281 -173.43605 170.02302
nrOfOrdersLag11715 nrOfOrdersLag11716 nrOfOrdersLag11726 nrOfOrdersLag11749 nrOfOrdersLag11754
34.30252 75.45378 176.16211 76.39492 58.11995
nrOfOrdersLag11757 nrOfOrdersLag11764 nrOfOrdersLag11766 nrOfOrdersLag11772 nrOfOrdersLag11777
133.71731 137.62373 24.95059 -75.96096 54.03353
nrOfOrdersLag11778 nrOfOrdersLag11782 nrOfOrdersLag11793 nrOfOrdersLag11806 nrOfOrdersLag11810
-147.40657 -45.70752 27.76710 94.17449 -191.98461
nrOfOrdersLag11811 nrOfOrdersLag11812 nrOfOrdersLag11814 nrOfOrdersLag11815 nrOfOrdersLag11817
61.04646 145.25908 38.56959 18.22574 140.84081
nrOfOrdersLag11827 nrOfOrdersLag11832 nrOfOrdersLag11839 nrOfOrdersLag11841 nrOfOrdersLag11859
-254.56931 138.30797 -139.32523 -151.50010 39.27760
nrOfOrdersLag11860 nrOfOrdersLag11862 nrOfOrdersLag11868 nrOfOrdersLag11874 nrOfOrdersLag11876
304.88804 150.84361 30.75749 -91.55666 192.43385
nrOfOrdersLag11879 nrOfOrdersLag11880 nrOfOrdersLag11885 nrOfOrdersLag11887 nrOfOrdersLag11891
118.75260 -44.83615 163.35474 194.12038 127.79107
nrOfOrdersLag11896 nrOfOrdersLag11901 nrOfOrdersLag11914 nrOfOrdersLag11919 nrOfOrdersLag11921
82.79870 179.44324 303.18796 242.51540 159.40652
nrOfOrdersLag11928 nrOfOrdersLag11929 nrOfOrdersLag11932 nrOfOrdersLag11937 nrOfOrdersLag11939
484.73958 35.38640 286.54643 46.88513 48.94031
nrOfOrdersLag11952 nrOfOrdersLag11967 nrOfOrdersLag11988 nrOfOrdersLag11994 nrOfOrdersLag11996
265.02228 170.65576 47.77627 317.10968 383.09702
nrOfOrdersLag119987 nrOfOrdersLag12007 nrOfOrdersLag12010 nrOfOrdersLag12017 nrOfOrdersLag12018
416.71786 93.41540 61.71721 73.68938 136.60641
nrOfOrdersLag12019 nrOfOrdersLag12023 nrOfOrdersLag12027 nrOfOrdersLag12034 nrOfOrdersLag12040
88.13672 -214.93168 38.82154 148.72993 -60.63852
nrOfOrdersLag12050 nrOfOrdersLag12051 nrOfOrdersLag12056 nrOfOrdersLag12058 nrOfOrdersLag12060
205.21811 246.46001 163.20151 -0.35863 61.93024
nrOfOrdersLag12073 nrOfOrdersLag12082 nrOfOrdersLag12087 nrOfOrdersLag12093 nrOfOrdersLag12107
122.50936 -27.13307 -43.74262 366.51938 146.85581
nrOfOrdersLag12119 nrOfOrdersLag12122 nrOfOrdersLag12124 nrOfOrdersLag121319 nrOfOrdersLag12133
119.31341 36.35183 253.68015 115.01838 228.66567
nrOfOrdersLag12136 nrOfOrdersLag12137 nrOfOrdersLag12154 nrOfOrdersLag12167 nrOfOrdersLag12169
-9.97711 121.20416 -448.43096 324.45466 169.37446
nrOfOrdersLag12176 nrOfOrdersLag12180 nrOfOrdersLag12181 nrOfOrdersLag12184 nrOfOrdersLag12186
88.35432 -14.74399 41.03555 310.68640 308.82549
nrOfOrdersLag12189 nrOfOrdersLag12195 nrOfOrdersLag12202 nrOfOrdersLag12204 nrOfOrdersLag12216
121.87542 264.78895 191.52156 281.02113 168.29821
nrOfOrdersLag12219 nrOfOrdersLag12221 nrOfOrdersLag12231 nrOfOrdersLag12236 nrOfOrdersLag12237
218.48030 66.07233 -228.54230 111.06068 162.65347
nrOfOrdersLag12242 nrOfOrdersLag12244 nrOfOrdersLag12246 nrOfOrdersLag12261 nrOfOrdersLag12262
12.05505 114.60872 -123.06406 -45.54485 380.26022
nrOfOrdersLag12268 nrOfOrdersLag12271 nrOfOrdersLag12302 nrOfOrdersLag12304 nrOfOrdersLag12311
4.23556 249.55941 248.38079 103.12194 -71.69000
nrOfOrdersLag12313 nrOfOrdersLag12329 nrOfOrdersLag12345 nrOfOrdersLag12353 nrOfOrdersLag12356
247.93662 207.13958 314.96154 95.08688 300.10247
nrOfOrdersLag12361 nrOfOrdersLag12371 nrOfOrdersLag12376 nrOfOrdersLag12380 nrOfOrdersLag12384
37.27506 -167.84137 66.61313 247.32681 237.73556
nrOfOrdersLag12399 nrOfOrdersLag12406 nrOfOrdersLag12413 nrOfOrdersLag12417 nrOfOrdersLag12420
107.37362 399.28658 275.48695 95.07723 324.87029
nrOfOrdersLag12423 nrOfOrdersLag12434 nrOfOrdersLag12437 nrOfOrdersLag12442 nrOfOrdersLag12446
233.30480 193.45613 250.79606 322.78975 320.40151
nrOfOrdersLag12448 nrOfOrdersLag12449 nrOfOrdersLag12451 nrOfOrdersLag12460 nrOfOrdersLag124708
172.20478 -113.45790 108.52769 305.32173 -134.41931
nrOfOrdersLag12484 nrOfOrdersLag12486 nrOfOrdersLag12493 nrOfOrdersLag12497 nrOfOrdersLag12505
156.35931 -9.49808 223.13247 -67.47891 534.66815
nrOfOrdersLag12541 nrOfOrdersLag12552 nrOfOrdersLag12563 nrOfOrdersLag12588 nrOfOrdersLag12596
221.35464 1.92188 -53.40846 -473.89923 497.69016
nrOfOrdersLag12611 nrOfOrdersLag12618 nrOfOrdersLag12623 nrOfOrdersLag12632 nrOfOrdersLag12638
175.77150 125.22040 -302.58298 -159.54109 -337.04664
nrOfOrdersLag12646 nrOfOrdersLag12648 nrOfOrdersLag12663 nrOfOrdersLag12665 nrOfOrdersLag12687
539.15416 350.53169 -148.22458 147.67351 -349.52567
nrOfOrdersLag12696 nrOfOrdersLag12713 nrOfOrdersLag12721 nrOfOrdersLag12723 nrOfOrdersLag12743
-42.64843 141.90979 47.07766 -443.50878 356.28944
nrOfOrdersLag12745 nrOfOrdersLag12750 nrOfOrdersLag12753 nrOfOrdersLag12761 nrOfOrdersLag127688
14.65720 13.35666 8.30924 -191.17540 -123.52409
nrOfOrdersLag12802 nrOfOrdersLag12806 nrOfOrdersLag12812 nrOfOrdersLag12815 nrOfOrdersLag12818
128.14604 281.35157 361.79299 8.34690 86.67458
nrOfOrdersLag12824 nrOfOrdersLag12836 nrOfOrdersLag12841 nrOfOrdersLag12842 nrOfOrdersLag12876
518.23720 -357.78788 288.63660 433.15556 158.51341
nrOfOrdersLag12883 nrOfOrdersLag12884 nrOfOrdersLag12901 nrOfOrdersLag12941 nrOfOrdersLag12956
214.74913 68.99485 -208.43888 -297.43011 319.30849
nrOfOrdersLag12996 nrOfOrdersLag13007 nrOfOrdersLag13013 nrOfOrdersLag13023 nrOfOrdersLag13033
321.02569 -88.96746 80.93579 106.97804 -223.88599
nrOfOrdersLag13051 nrOfOrdersLag13072 nrOfOrdersLag13094 nrOfOrdersLag13098 nrOfOrdersLag13127
40.95339 161.48086 524.04025 -94.23016 17.50082
nrOfOrdersLag13152 nrOfOrdersLag13171 nrOfOrdersLag13185 nrOfOrdersLag13202 nrOfOrdersLag13205
-266.11135 8.82232 -107.11441 -141.14442 212.80057
nrOfOrdersLag13222 nrOfOrdersLag13277 nrOfOrdersLag13295 nrOfOrdersLag13321 nrOfOrdersLag13332
187.90431 306.69183 -24.55235 68.42339 -290.11682
nrOfOrdersLag13362 nrOfOrdersLag13378 nrOfOrdersLag13380 nrOfOrdersLag13391 nrOfOrdersLag13476
44.30976 463.85118 276.57882 -282.06457 34.35207
nrOfOrdersLag13488 nrOfOrdersLag13490 nrOfOrdersLag13530 nrOfOrdersLag13578 nrOfOrdersLag13599
217.46608 386.26006 194.69082 52.45357 406.44931
nrOfOrdersLag13611 nrOfOrdersLag13618 nrOfOrdersLag13626 nrOfOrdersLag13632 nrOfOrdersLag13635
242.81201 -22.19253 23.90163 -395.87751 103.44677
nrOfOrdersLag13674 nrOfOrdersLag13681 nrOfOrdersLag13767 nrOfOrdersLag13841 nrOfOrdersLag13849
200.18354 83.25027 -71.88190 382.05886 -279.73606
nrOfOrdersLag13857 nrOfOrdersLag13874 nrOfOrdersLag13885 nrOfOrdersLag13897 nrOfOrdersLag13908
370.92867 -17.14313 -140.99009 -244.17716 93.79552
nrOfOrdersLag13966 nrOfOrdersLag14009 nrOfOrdersLag14031 nrOfOrdersLag14111 nrOfOrdersLag14160
61.75484 224.96558 -107.99394 -126.12766 572.14222
nrOfOrdersLag14171 nrOfOrdersLag14205 nrOfOrdersLag14312 nrOfOrdersLag14468 nrOfOrdersLag14560
-42.29929 -379.41067 194.25204 -47.50642 -116.49251
nrOfOrdersLag14619 nrOfOrdersLag14640 nrOfOrdersLag14684 nrOfOrdersLag14762 nrOfOrdersLag14776
41.34325 -355.84333 -122.77109 -331.12296 404.86637
nrOfOrdersLag14865 nrOfOrdersLag14959 nrOfOrdersLag14967 nrOfOrdersLag15195 nrOfOrdersLag15218
371.14617 104.60840 -42.74014 99.78008 520.62517
nrOfOrdersLag15402 nrOfOrdersLag16029 nrOfOrdersLag16284 nrOfOrdersLag16321 nrOfOrdersLag16350
529.17004 161.02870 268.77256 74.02159 386.53868
nrOfOrdersLag16418 nrOfOrdersLag16557 nrOfOrdersLag16711 nrOfOrdersLag16722 nrOfOrdersLag16825
-81.37023 190.74905 225.64313 -131.70051 271.39936
nrOfOrdersLag16952 nrOfOrdersLag16996 nrOfOrdersLag17098 nrOfOrdersLag17251 nrOfOrdersLag17279
357.39158 408.46849 210.03477 -25.74894 NA
nrOfOrdersLag17292 nrOfOrdersLag17391 nrOfOrdersLag18642 nrOfOrdersLag18670 nrOfOrdersLag18949
262.00528 4.71906 326.28857 49.30983 174.99732
nrOfOrdersLag19202 nrOfOrdersLag19690 nrOfOrdersLag19772
16.13322 15.59552 -62.26111
I have no clue what is happening and why this is going wrong. Anybody that can help me out here? Thanks in advance!
The lagged independent variables were stored as factors instead of integer/numeric variables, so lm() fitted one dummy coefficient per factor level. Having fixed this, the lm call works as intended.
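A minimal sketch of that fix, using a small made-up trainingSet that reuses the six rows of the extract above (dummy variables omitted). The lag column is deliberately created as a factor to reproduce the symptom; how the column became a factor in the real data is not known, so that part is an assumption.

```r
# Toy data: nrOfOrdersLag1 is stored as a factor on purpose
trainingSet <- data.frame(
  nrOfCalls      = c(17, 12, 14, 15, 20, 10),
  nrOfOrders     = c(5, 8, 4, 6, 14, 3),
  nrOfOrdersLag1 = factor(c("9", "5", "8", "4", "6", "14"))
)

# Check the column types before fitting
str(trainingSet)

# Convert via character: as.numeric() applied directly to a factor
# would return the internal level codes, not the original numbers
trainingSet$nrOfOrdersLag1 <- as.numeric(as.character(trainingSet$nrOfOrdersLag1))

reg <- lm(nrOfCalls ~ nrOfOrders + nrOfOrdersLag1, data = trainingSet)
names(coef(reg))
```

After the conversion the model has a single coefficient for nrOfOrdersLag1 instead of one per distinct value.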

cut function and controlled frequency in the intervals

My question is pretty simple: the cut() function lets me choose the breaks along which the range of my vector is divided into intervals. I would like to control the number of observations within each newly created interval, similar to what can be obtained by passing quantile() breaks to cut(). However, I don't want to use quantiles, because I would like the intervals to be fixed so that I can match them between different databases for further comparison, and I want the same discrete values to appear in the labels of the newly cut vectors.
I used to use this for the quantile approach:
df$z<-cut(df$x, quantile(df$x, (0:10)/10), include.lowest=TRUE)
Which is fairly simple. My new approach is even simpler, so it resembles this for example:
df$z<-cut(df$x, c(0.04,0.055,0.06,0.065,0.07,0.075,0.08,0.085,0.09,0.095,0.11), include.lowest=T)
I then have another variable which I want to calculate some statistics on, according to the levels of the discrete variable.
So it would go something like this :
df$conf.intx<-ifelse(df$z=="1",t.test(df[df$z=="1",]$y)$conf.int[1],
ifelse(df$z=="2",t.test(df[df$z=="2",]$y)$conf.int[1],
ifelse(df$z=="3",t.test(df[df$z=="3",]$y)$conf.int[1],
ifelse(df$z=="4",t.test(df[df$z=="4",]$y)$conf.int[1],NA))))
But for me to be able to calculate this kind of t-test confidence interval on each of the 'pools' of the y values (which number in the same amount as the observations within the intervals of the discrete variable), I need to be able to control for the number of values within each created interval for z, so that my test remains valid, at least as far as the number of observations is concerned.
Simply put, I need an automated procedure that creates the vector of breaks for the z variable so that each interval contains a minimum number of observations. As an added complication, the breaks should be the same for two different databases, and I don't know whether that is possible.
Any help on the matter would be welcome, thank you in advance.
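A side note on the nested ifelse() above: the levels of a factor created by cut() are interval labels such as "(0.04,0.055]", not "1", "2", and so on, so comparing against "1" will not match. A per-level confidence interval is easier to get with split(). The sketch below uses a made-up df with synthetic x and y and only three intervals, purely for illustration.

```r
set.seed(1)
# Synthetic stand-in for the real data (assumption, not the poster's values)
df <- data.frame(x = runif(200, 0.04, 0.11), y = rnorm(200))
df$z <- cut(df$x, c(0.04, 0.06, 0.08, 0.11), include.lowest = TRUE)

# One t-test confidence interval per level of z
ci <- t(sapply(split(df$y, df$z), function(v) t.test(v)$conf.int))
colnames(ci) <- c("lower", "upper")
ci

# Map the interval bounds back onto the rows, replacing the nested ifelse()
df$conf.lower <- unname(ci[as.character(df$z), "lower"])
df$conf.upper <- unname(ci[as.character(df$z), "upper"])
```

t.test() stops with an error when a level has fewer than two observations, which is one more reason to check table(df$z) before running it.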
EDIT: here is a sample of my data for x.
structure(list(x = c(5.319125, 7.3036667, 5.5166167, 7.0308333,
5.6812917, 6.5496583, 5.6621833, 6.4682, 5.4897417, 7.185175,
6.44905, 7.2055833, 7.629375, 6.2282833, 6.6813917, 7.7976, 6.683975,
5.5089083, 7.307475, 7.3958667, 6.2036583, 6.2488833, 5.9372,
6.6180167, 6.4167833, 5.640275, 8.7416917, 8.3134167, 6.8996833,
5.1161917, 7.0606333, 5.2622667, 6.780925, 5.4615417, 6.48185,
5.51585, 6.2224333, 5.3660667, 7.196525, 6.2984083, 7.0137833,
7.4490083, 5.9712333, 6.4287833, 7.6693917, 6.4406417, 5.4135083,
7.16245, 7.2267, 5.820325, 6.066175, 5.760975, 6.4775, 6.2625,
5.5182583, 8.446625, 8.19025, 6.7955333, 4.7899583, 6.5680167,
4.5965917, 6.3539333, 4.6639, 6.0489667, 4.9047833, 5.353625,
4.711425, 6.6268833, 5.5458083, 6.3271917, 6.4591417, 5.1843917,
5.6117167, 7.1828417, 5.6956917, 5.0271917, 6.741875, 6.68305,
4.7859667, 5.3068667, 5.3245, 5.745675, 5.7518917, 5.37945, 8.0030417,
7.7064583, 6.2935333, 5.1838667, 6.9369333, 4.9734583, 6.7257167,
5.0510333, 6.4257667, 5.2858083, 5.7285167, 5.084, 7.0092833,
5.905875, 6.6893417, 6.8319583, 5.5558083, 5.9854833, 7.5552167,
6.064625, 5.3990333, 7.115175, 7.0600167, 5.1644833, 5.6848667,
5.7014417, 6.1051, 6.1186333, 5.7217667, 8.3685417, 8.071325,
6.6547333, 5.5972417, 7.4226, 5.539725, 7.26335, 5.645975, 6.87475,
5.8486167, 6.3001667, 5.5997833, 7.4353167, 6.5089583, 7.213625,
7.3125667, 6.12095, 6.5410083, 8.0639083, 6.6505167, 5.8886417,
7.6301167, 7.5850417, 5.7693667, 6.2480167, 6.1847167, 6.6896167,
6.6323917, 6.1972167, 8.8560333, 8.5501083, 7.1036167, 4.9929583,
6.9839583, 5.3847417, 6.8814417, 5.59555, 6.7867167, 5.7831333,
6.9370917, 5.7400917, 7.6922, 6.3151, 7.084725, 7.0414417, 5.95435,
6.4274167, 7.6692167, 6.9159, 6.0856083, 7.3079583, 7.1937667,
5.744675, 5.946525, 6.0651833, 6.8488833, 6.5924333, 5.772025,
8.3281167, 8.5475917, 6.7952917, 8.248525, 5.1931083, 7.0688917,
5.4793583, 7.0091583, 5.7593, 7.1053333, 5.9382583, 7.1765417,
6.003075, 7.7699833, 6.2757333, 7.2446583, 7.179275, 6.0013083,
6.447975, 7.7845833, 6.9071083, 6.1009, 7.425425, 7.4619083,
5.9380667, 6.2116, 6.13315, 7.0852, 7.0047417, 6.0763917, 8.5926583,
8.7468417, 7.2485167, 8.5096833, 5.1541, 7.0479917, 5.43065,
6.9689083, 5.7356, 7.0842917, 5.9051667, 7.1283333, 5.9666667,
7.7295583, 6.249925, 7.21005, 7.1427167, 5.9675583, 6.4135667,
7.7448583, 6.874275, 6.0679333, 7.388675, 7.429025, 5.911225,
6.1757167, 6.095225, 7.045775, 6.9870833, 6.0567333, 8.5771167,
8.7541917, 7.3187333, 8.5092083, 5.5746, 7.342925, 5.8561667,
7.4704667, 5.922225, 6.9787, 6.1564167, 7.6059667, 5.9122917,
7.7848833, 6.6192, 7.34055, 7.2352417, 5.9776083, 6.5197583,
7.4891583, 7.2185667, 6.4710167, 7.70945, 7.5078083, 6.1470417,
6.66115, 6.6899333, 7.4454083, 7.2270917, 6.350075, 8.3156667,
8.9007917, 6.7578083, 8.3258083, 5.1996, 6.9688833, 5.3592917,
6.7583417, 5.5623583, 6.756375, 5.7361, 7.120425, 5.6567, 7.6174667,
6.1474833, 7.1442167, 6.74475, 5.5820333, 6.0106, 7.142675, 6.667475,
5.9067917, 7.2392, 7.058675, 5.6394417, 5.9119167, 5.8367333,
6.798025, 6.694675, 5.8565917, 8.6035083, 8.912375, 7.0501083,
8.38045, 4.8478083, 6.7493167, 5.3686667, 6.5152333, 5.282025,
6.5464333, 5.5085583, 6.870975, 5.4757667, 7.318, 5.92225, 6.9300417,
6.5758083, 5.4233083, 5.8295583, 7.0451, 6.4790083, 5.68255,
6.9632833, 6.9965833, 5.5005667, 5.717725, 5.5938083, 6.5309,
6.4824583, 5.4429833, 8.072575, 8.3635, 6.5797167, 8.0352333,
4.6289833, 6.64105, 4.8883833, 6.2025833, 5.2291833, 6.4814667,
5.2211083, 6.5780083, 5.196275, 7.030725, 5.6001583, 6.620475,
6.2858333, 5.114375, 5.5424417, 6.7784917, 6.1561333, 5.339375,
6.6249083, 6.6248583, 5.139775, 5.4195, 5.4531833, 6.3348583,
6.4041417, 5.292, 7.6243833, 7.9624583, 6.3226417, 7.761175,
4.8419083, 6.8384083, 5.3500417, 6.5903333, 5.33275, 6.732575,
5.4486, 6.8069417, 5.4569583, 7.26275, 5.835525, 6.8680333, 6.6712333,
5.4720417, 5.904325, 7.1506917, 6.4746833, 5.638675, 6.9570667,
7.0017333, 5.5033667, 5.6859333, 5.651875, 6.5903, 6.529725,
5.4819667, 7.971975, 8.2337833, 6.5815333, 7.9736583, 5.7711917,
7.543325, 5.8986917, 7.5081333, 6.2920333, 7.5321667, 6.4908917,
7.7616583, 6.4509417, 8.08035, 6.8219, 7.7939167, 7.6491333,
6.4773583, 6.9338667, 8.1865583, 7.3998917, 6.572125, 7.9198417,
8.0568, 6.5880333, 6.8299667, 6.7399833, 7.6436, 7.509275, 6.5139833,
9.1520167, 9.3580667, 7.65415, 9.0725167, 5.7483583, 7.5230417,
5.89105, 7.4808833, 6.1969667, 7.4923583, 6.4092583, 7.70695,
6.3970833, 8.0971333, 6.7949083, 7.76445, 7.6170167, 6.4494333,
6.8997, 8.1575333, 7.3728417, 6.544075, 7.888, 8.0215, 6.5484,
6.7911667, 6.7121917, 7.6179083, 7.4731167, 6.4629167, 9.1226333,
9.3307083, 7.6230583, 9.024875, 5.543925, 7.1460833, 5.6575583,
7.5986083, 6.027075, 7.4386167, 6.3500333, 7.6694833, 6.3682583,
8.0843333, 6.7181083, 7.7376, 7.5818583, 6.4010667, 6.8440083,
8.1217917, 7.3290833, 6.5187333, 7.8591667, 7.9898583, 6.5051,
6.7251167, 6.6881333, 7.477675, 7.3571333, 6.3351833, 8.881575,
9.12315, 7.3851, 8.8008667, 5.3437833, 7.1560417, 5.5748, 7.4622583,
5.9412417, 7.3428667, 6.2594167, 7.5839167, 6.28685, 8.0270917,
6.6388333, 7.6611, 7.50065, 6.3217167, 6.7594417, 8.0401167,
7.252425, 6.444, 7.77975, 7.9104167, 6.42495, 6.6421667, 6.6103333,
7.3489417, 7.23205, 6.2059333, 8.726725, 8.994625, 7.2460917,
8.660125, 5.2502833, 7.2591, 5.6425417, 6.889925, 5.353675, 6.50635,
6.260675, 7.4236583, 5.9076417, 7.3915, 6.2134917, 7.1645333,
6.922675, 6.0295417, 6.1687917, 7.2771083, 6.6152333, 6.3299417,
7.167325, 6.647275, 5.726475, 5.93905, 6.2888583, 6.7497167,
6.4364083, 5.8906583, 7.6052917, 8.039425, 6.5672833, 7.8754667,
6.3086333, 5.352025, 7.2849417, 5.7184833, 6.9675917, 5.5615333,
6.6157917, 6.3505417, 7.4881, 6.0007417, 7.5110583, 6.35525,
7.254075, 7.0289083, 6.1994417, 6.2860833, 7.372575, 6.735975,
6.4628917, 7.3102167, 6.8619417, 5.9123667, 6.1611917, 6.4854083,
6.8942417, 6.563625, 6.0610083, 7.941625, 8.6969167, 6.66075,
8.1197167, 6.2802, 3.9638, 5.870825, 4.1852, 5.5841417, 4.3007583,
5.2352167, 4.4281417, 5.819425, 4.1990917, 5.9338917, 4.89765,
5.7204333, 5.6546833, 4.5632167, 4.9803333, 5.6962417, 5.247725,
4.7092583, 6.0145417, 5.6403917, 4.4016917, 4.7181, 4.5007833,
5.2828917, 5.1314167, 4.7492, 6.777575, 6.9040083, 4.9760583,
6.4471917, 5.0952833, 3.712725, 5.8215333, 4.025725, 5.5635,
4.2354083, 5.143525, 4.4900083, 5.6802417, 4.1214333, 5.8128,
4.7525583, 5.6412583, 5.5534917, 4.487475, 4.8237833, 5.6156917,
5.0573, 4.5755417, 5.8096083, 5.5252083, 4.3145583, 4.5437417,
4.194675, 5.0100833, 4.8972333, 4.590025, 6.6441417, 6.5789417,
4.6947667, 6.1648167, 4.8517333, 3.982925, 5.7966833, 4.1607083,
5.5564833, 4.2557417, 5.2304083, 4.8661333, 5.912875, 4.4988333,
6.03915, 4.9131583, 5.8518667, 5.6578583, 4.773225, 4.8958583,
5.8759833, 5.204725, 4.8961667, 5.9217, 5.58395, 4.5410667, 4.73445,
4.5922333, 5.2517333, 5.0220333, 4.619475, 6.4883667, 6.429175,
4.6796417, 6.3171083, 4.93615, 3.9278833, 5.7590417, 4.1155667,
5.612725, 4.2199833, 5.2126667, 4.805275, 5.8888833, 4.4363,
6.0380083, 4.892, 5.8192083, 5.64205, 4.708825, 4.8751583, 5.833775,
5.2210417, 4.853225, 5.924225, 5.5856583, 4.5386167, 4.7280917,
4.5618, 5.264425, 5.03855, 4.5539, 6.4993, 6.4900667, 4.6749083,
6.2961333, 4.918525, 4.0890583, 6.33385, 4.3470083, 5.9645, 4.6541833,
5.5438667, 4.9556583, 6.1590583, 4.6379417, 6.2876833, 5.2235167,
6.1387167, 6.0547583, 4.9545667, 5.254125, 6.05395, 5.4813417,
4.9971333, 6.2266583, 5.9172833, 4.7275917, 4.9274917, 4.443575,
5.3164917, 5.2507083, 5.1704583, 7.173075, 6.9351583, 5.0816667,
6.5568, 5.3417667, 5.1705167, 7.0777833, 5.6253333, 7.231225,
5.5799167, 6.6942917, 6.1014583, 7.538725, 5.7152667, 7.459275,
6.2406083, 7.064925, 6.9234417, 5.8328833, 6.1819583, 7.2127583,
6.8071583, 6.2599417, 7.2975417, 6.973875, 5.804125, 6.1944667,
6.38855, 7.0553583, 6.8393167, 6.1275417, 7.9986833, 8.5846,
6.4682167, 8.0134583, 6.1805917, 5.0699583, 6.9006667, 5.36365,
6.9204917, 5.4478667, 6.5391583, 6.0647417, 7.2951667, 5.6632833,
7.25595, 6.1057333, 6.9578417, 6.8235583, 5.8671833, 6.0716417,
7.060175, 6.5401, 6.1229417, 7.1305083, 6.7823417, 5.62415, 5.9202,
5.9957167, 6.7142167, 6.4706417, 5.9004667, 7.8304583, 8.2144667,
6.1530583, 7.6896417, 5.9285333, 4.2625417, 5.9677583, 4.58695,
6.0400083, 4.4215333, 5.6052833, 5.04165, 6.48845, 4.6423583,
6.1688833, 5.0256167, 5.926725, 5.7214667, 4.746375, 4.9828,
6.1583083, 5.6903, 5.217375, 6.1341583, 5.7868083, 4.5895333,
4.98235, 5.159725, 5.7866167, 5.6300833, 4.882975, 6.7210833,
7.4314833, 5.2493083, 6.8503833, 5.2225583, 3.8417833, 5.9798,
4.1168583, 5.63415, 4.3311333, 5.0777667, 4.6606833, 5.789425,
4.3565167, 5.9736167, 4.8910667, 5.9445417, 5.699275, 4.6897167,
4.9036083, 5.8767, 5.088675, 4.6224417, 5.8052833, 5.5697167,
4.3237, 4.6084333, 4.2958833, 5.1394417, 5.0137583, 4.7711, 6.771275,
6.5984417, 4.845625, 6.3338083, 5.1370333, 3.1820167, 5.2699667,
3.4827167, 5.0992583, 3.7040583, 4.6358583, 4.1604917, 5.2488333,
3.7522, 5.3774167, 4.2636167, 5.1998167, 5.0456333, 4.051475,
4.289175, 5.1718917, 4.5787083, 4.1461667, 5.2983167, 5.03025,
3.8709333, 4.0917167, 3.731925, 4.5584167, 4.4200333, 4.061375,
6.064225, 6.02975, 4.1590167, 5.6589083, 4.2614833, 3.68695,
5.587375, 3.91725, 5.3387, 4.0061667, 4.9563833, 4.1942, 5.6720583,
3.9584333, 5.6873583, 4.6251, 5.4801417, 5.3975583, 4.2382, 4.6710917,
5.4898083, 5.0469667, 4.4950083, 5.72005, 5.46085, 4.30355, 4.5525917,
4.3681667, 5.1723167, 5.0331417, 4.4793083, 6.5492917, 6.720225,
4.7550917, 6.197775, 4.8082917, 4.09925, 5.986525, 4.3104417,
5.68455, 4.4287167, 5.3555667, 4.5191083, 5.9269833, 4.2695917,
5.9984167, 4.981225, 5.8049917, 5.7680667, 4.5736667, 5.0673583,
5.7443583, 5.2811083, 4.719175, 6.0376667, 5.73875, 4.3947333,
4.8157333, 4.6093417, 5.3906417, 5.2357417, 4.684825, 6.8885583,
7.018425, 5.0878167, 6.5122333, 5.2084, 3.810525, 6.2600083,
3.6246583, 5.7396417, 4.0617917, 5.6724583, 4.2505833, 4.7518417,
4.1232, 6.208375, 4.5881167, 5.252575, 5.71795, 4.0840583, 4.700325,
6.2360333, 4.701725, 3.922525, 5.5162167, 5.6220333, 3.8836833,
4.4883667, 4.5398583)), .Names = "x", row.names = c(NA, -962L
), class = "data.frame")
Assuming I want 30 values per interval (the 'n'), here is the code I used:
df$z<-cut(df$x, seq(30,length(df$x),by=30)/length(df$x), include.lowest=T)
Which gives me:
> table(df$z)
[0.0312,0.0624] (0.0624,0.0936] (0.0936,0.125] (0.125,0.156] (0.156,0.187] (0.187,0.218] (0.218,0.249] (0.249,0.281] (0.281,0.312] (0.312,0.343] (0.343,0.374]
0 0 0 0 0 0 0 0 0 0 0
(0.374,0.405] (0.405,0.437] (0.437,0.468] (0.468,0.499] (0.499,0.53] (0.53,0.561] (0.561,0.593] (0.593,0.624] (0.624,0.655] (0.655,0.686] (0.686,0.717]
0 0 0 0 0 0 0 0 0 0 0
(0.717,0.748] (0.748,0.78] (0.78,0.811] (0.811,0.842] (0.842,0.873] (0.873,0.904] (0.904,0.936] (0.936,0.967] (0.967,0.998]
0 0 0 0 0 0 0 0 0
What I want is a similar result to what I get with quantiles:
df$zbis<-cut(df$x, quantile(df$x, (0:20)/20), include.lowest=T)
table(df$zbis)
[3.18,4.29] (4.29,4.62] (4.62,4.89] (4.89,5.14] (5.14,5.33] (5.33,5.53] (5.53,5.66] (5.66,5.8] (5.8,5.94] (5.94,6.1] (6.1,6.26] (6.26,6.45] (6.45,6.58] (6.58,6.74] (6.74,6.93]
49 48 48 48 48 48 48 48 48 48 48 48 48 48 48
(6.93,7.14] (7.14,7.34] (7.34,7.62] (7.62,8.06] (8.06,9.36]
48 48 48 48 49
Except I'd like this to be reproducible for another database, and so I can't use the quantile function, since I would not get the same intervals on a different database.
SECOND EDIT: here is the second sample from another database. 'x' is the same variable, and they have similar ranges.
structure(list(x = c(5.319125, 7.3036667, 5.5166167, 7.0308333,
5.6812917, 6.5496583, 5.6621833, 6.4682, 5.4897417, 7.185175,
6.44905, 7.2055833, 7.629375, 6.2282833, 6.6813917, 7.7976, 6.683975,
5.5089083, 7.307475, 7.3958667, 6.2036583, 6.2488833, 5.9372,
6.6180167, 6.4167833, 5.640275, 8.7416917, 8.3134167, 6.8996833,
5.1931083, 7.0688917, 5.4793583, 7.0091583, 5.7593, 7.1053333,
5.9382583, 7.1765417, 6.003075, 7.7699833, 6.2757333, 7.2446583,
7.179275, 6.0013083, 6.447975, 7.7845833, 6.9071083, 6.1009,
7.425425, 7.4619083, 5.9380667, 6.2116, 6.13315, 7.0852, 7.0047417,
6.0763917, 8.5926583, 8.7468417, 7.2485167, 8.5096833, 5.177275,
7.09985, 5.6444667, 7.0102417, 5.7303833, 7.0383333, 5.9870583,
7.3342083, 5.9363667, 7.7753333, 6.38355, 7.389575, 7.0396667,
5.889625, 6.29395, 7.51135, 6.940925, 6.1455417, 7.4281833, 7.4657167,
5.9707083, 6.1902083, 6.0936167, 6.9595167, 6.85065, 5.8525,
8.5148083, 8.805625, 7.00665, 8.4457, 5.3437833, 7.1560417, 5.5748,
7.4622583, 5.9412417, 7.3428667, 6.2594167, 7.5839167, 6.28685,
8.0270917, 6.6388333, 7.6611, 7.50065, 6.3217167, 6.7594417,
8.0401167, 7.252425, 6.444, 7.77975, 7.9104167, 6.42495, 6.6421667,
6.6103333, 7.3489417, 7.23205, 6.2059333, 8.726725, 8.994625,
7.2460917, 8.660125, 3.614125, 5.6345917, 3.9410417, 5.2901417,
4.0147333, 4.766825, 4.4500417, 5.5189, 4.11375, 5.6350667, 4.5756917,
5.5998833, 5.3663, 4.44405, 4.5767417, 5.552025, 4.847425, 4.4382583,
5.5769417, 5.2390667, 4.0610917, 4.4054833, 4.1917, 4.9029083,
4.6935917, 4.3499417, 6.0562333, 6.081225, 4.45855, 6.0121583,
4.740275, 4.5028, 6.4177833, 4.8716417, 6.1469917, 4.6208917,
5.7748083, 5.4530083, 6.694125, 5.0944333, 6.5123167, 5.3257083,
6.2765333, 6.0149167, 5.1815583, 5.30715, 6.4149083, 5.82245,
5.515425, 6.3654333, 5.8472833, 4.9798917, 5.1833583, 5.5210333,
6.0410667, 5.7377917, 5.2666083, 7.0378167, 7.744175, 5.718725,
7.3220583, 5.24325, 5.3256, 7.2155167, 5.696925, 7.0029667, 5.5235,
6.7261083, 6.2810667, 7.546825, 5.90915, 7.3299167, 6.2227333,
7.147075, 6.9142417, 6.0012083, 6.1725333, 7.29815, 6.7, 6.3454583,
7.2129583, 6.7559833, 5.8115, 6.0756667, 6.458225, 6.9969167,
6.778825, 6.2245833, 8.0809583, 8.875325, 6.7210917, 8.3203,
6.3513, 5.2591333, 7.1404917, 5.6266417, 6.9356, 5.4568, 6.6604,
6.206025, 7.48525, 5.8323667, 7.24635, 6.1446583, 7.066275, 6.8334,
5.9198667, 6.09505, 7.2206583, 6.63085, 6.270075, 7.1397333,
6.689125, 5.7441333, 6.042575, 6.38255, 6.9325833, 6.7175667,
6.1592, 8.00415, 8.8051167, 6.647125, 8.2465667, 6.2788167, 6.49435,
8.1847583, 6.664475, 8.0528583, 6.6822417, 7.376, 7.1517833,
8.2306833, 6.8584583, 8.3052167, 7.288375, 8.2758583, 7.7162583,
7.2807833, 7.0459, 8.2507833, 7.5855, 7.0505917, 8.2230167, 8.1669,
6.8184667, 6.9700583, 7.0936167, 7.7615667, 7.6239083, 7.0921667,
9.02585, 9.3416167, 7.6256333, 9.0869333, 8.0984667, 4.116325,
6.1680917, 4.56965, 5.797725, 4.36085, 5.42455, 5.144075, 6.1531833,
4.77825, 6.2533417, 5.0192083, 5.99395, 5.6934083, 4.9074167,
4.9823083, 5.9861667, 5.4068833, 5.1872833, 6.10095, 5.659325,
4.6632833, 4.86315, 5.221775, 5.5878, 5.3217083, 4.8202333, 6.4883083,
6.69355, 4.952075, 6.7075583, 5.00015, 5.2502833, 7.2591, 5.6425417,
6.889925, 5.353675, 6.50635, 6.260675, 7.4236583, 5.9076417,
7.3915, 6.2134917, 7.1645333, 6.922675, 6.0295417, 6.1687917,
7.2771083, 6.6152333, 6.3299417, 7.167325, 6.647275, 5.726475,
5.93905, 6.2888583, 6.7497167, 6.4364083, 5.8906583, 7.6052917,
8.039425, 6.5672833, 7.8754667, 6.3086333, 5.352025, 7.2849417,
5.7184833, 6.9675917, 5.5615333, 6.6157917, 6.3505417, 7.4881,
6.0007417, 7.5110583, 6.35525, 7.254075, 7.0289083, 6.1994417,
6.2860833, 7.372575, 6.735975, 6.4628917, 7.3102167, 6.8619417,
5.9123667, 6.1611917, 6.4854083, 6.8942417, 6.563625, 6.0610083,
7.941625, 8.6969167, 6.66075, 8.1197167, 6.2802, 3.9638, 5.870825,
4.1852, 5.5841417, 4.3007583, 5.2352167, 4.4281417, 5.819425,
4.1990917, 5.9338917, 4.89765, 5.7204333, 5.6546833, 4.5632167,
4.9803333, 5.6962417, 5.247725, 4.7092583, 6.0145417, 5.6403917,
4.4016917, 4.7181, 4.5007833, 5.2828917, 5.1314167, 4.7492, 6.777575,
6.9040083, 4.9760583, 6.4471917, 5.0952833, 3.712725, 5.8215333,
4.025725, 5.5635, 4.2354083, 5.143525, 4.4900083, 5.6802417,
4.1214333, 5.8128, 4.7525583, 5.6412583, 5.5534917, 4.487475,
4.8237833, 5.6156917, 5.0573, 4.5755417, 5.8096083, 5.5252083,
4.3145583, 4.5437417, 4.194675, 5.0100833, 4.8972333, 4.590025,
6.6441417, 6.5789417, 4.6947667, 6.1648167, 4.8517333, 4.1059833,
5.9023167, 4.2812417, 5.6593917, 4.3587583, 5.3359583, 4.983275,
6.0223417, 4.6178333, 6.1545333, 5.0244667, 5.9596, 5.7608833,
4.8875333, 4.9990583, 5.9919333, 5.3157417, 5.0169333, 6.024775,
5.6717167, 4.6372083, 4.8370583, 4.7311333, 5.3704, 5.133575,
4.7174917)), .Names = "x", row.names = c(NA, -455L), class = "data.frame")
Updated after some comments:
Since you state that the minimum number of cases in each group would be fine for you, I'd go with Hmisc::cut2
v <- rnorm(10, 0, 1)
Hmisc::cut2(v, m = 3) # minimum of 3 cases per group
The documentation for cut2 states:
m desired minimum number of observations in a group.
The algorithm does not guarantee that all groups will have at least m observations.
The same cuts for separate variables
If the distributions of your variables are very similar you could extract the exact cutpoints by setting the argument onlycuts = T and reuse them for the other variables. In case the distributions are different though, you will end up with few cases in some intervals.
Using your data:
library(magrittr)
library(Hmisc)
cuts <- cut2(df1$x, g = 20, onlycuts = T) # determine cuts based on df1
cut2(df1$x, cuts = cuts) %>% table
cut2(df2$x, cuts = cuts) %>% table*2 # multiplied by two for better comparison
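When the two distributions differ, one alternative (not part of Hmisc, just a base R sketch) is to start from a fixed grid of candidate breaks and drop breaks until every interval holds at least m observations in both data sets. The helper merge_breaks, the grid and the synthetic vectors x1 and x2 below are all assumptions for illustration; the real df1$x and df2$x would take their place.

```r
# Hypothetical helper: greedily merge intervals that are too small in either
# data set. Bin i lies between breaks[i] and breaks[i + 1].
merge_breaks <- function(x1, x2, breaks, m) {
  repeat {
    n1 <- as.vector(table(cut(x1, breaks, include.lowest = TRUE)))
    n2 <- as.vector(table(cut(x2, breaks, include.lowest = TRUE)))
    small <- which(pmin(n1, n2) < m)
    if (length(small) == 0 || length(breaks) <= 2) break
    i <- small[1]
    # Merge with the right neighbour, or with the left one for the last bin
    drop <- if (i < length(breaks) - 1) i + 1 else i
    breaks <- breaks[-drop]
  }
  breaks
}

set.seed(1)
x1 <- rnorm(962, mean = 6.2, sd = 1.1)   # synthetic stand-in for database 1
x2 <- rnorm(455, mean = 6.0, sd = 1.3)   # synthetic stand-in for database 2

candidate <- c(-Inf, seq(3.5, 9, by = 0.25), Inf)   # fixed, round-valued grid
breaks <- merge_breaks(x1, x2, candidate, m = 30)

z1 <- cut(x1, breaks, include.lowest = TRUE)
z2 <- cut(x2, breaks, include.lowest = TRUE)
table(z1)
table(z2)
```

Because both vectors are cut with the same breaks, the labels are identical across the two data sets, and the remaining break values still come from the original grid. The merge is greedy, so it is not guaranteed to keep the largest possible number of intervals.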
This is a good example of how NOT to pose a question. At last we have an example, and it is possible to post code that applies to it. (You apparently naively pasted the exact code in my comment without thinking about how to express 'n' and 'N' in the context of the problem.) I did need to add prob=c( seq(...) , 1) in order to capture the highest values.
This assumes that you want groups of size 100 (although it is still very unclear why this is needed).
x$xct <- cut( x$x, breaks=quantile(x$x, prob=c( seq(100, length(x$x), by=100)/length(x$x) , 1) ))
table(x$xct)
(4.64,5.17] (5.17,5.57] (5.57,5.85] (5.85,6.17] (6.17,6.51] (6.51,6.85]
100 100 100 100 100 100
(6.85,7.26] (7.26,7.94] (7.94,9.36]
100 100 62
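Note that the counts above add up to 862 of the 962 observations: the break vector starts at the 100/N quantile, so the lowest 100 values fall below the first break and end up as NA. A possible variation, sketched here on a synthetic x (an assumption standing in for the posted data), adds the 0 quantile and include.lowest so that every observation lands in a group.

```r
set.seed(1)
# Synthetic stand-in with the same number of rows as the posted sample
x <- data.frame(x = rnorm(962, mean = 6.2, sd = 1.1))

n <- 100                 # desired group size
N <- length(x$x)

# Start at the minimum (prob 0) and end at the maximum (prob 1)
probs <- c(0, seq(n, N, by = n) / N, 1)
x$xct <- cut(x$x, breaks = quantile(x$x, probs = probs), include.lowest = TRUE)

table(x$xct)
sum(is.na(x$xct))
```

The last group is still smaller than n whenever N is not a multiple of n.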

How can I apply fisher test on this set of data (nominal variables)

I'm pretty new to statistics:
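fisher = function(idxToTest, idxATI){
  idxDependent = c()
  dependent = c()
  p = c()
  for(i in seq_along(idxToTest))
  {
    # contingency table of the tested column against the reference column
    tbl = table(data[[idxToTest[i]]], data[[idxATI]])
    rez = fisher.test(tbl, workspace = 20000000000)
    if(rez$p.value < 0.1){
      # significant at the 10% level: record the column as dependent
      dependent = c(dependent, TRUE)
      idxDependent = c(idxDependent, idxToTest[i])
    }
    else{
      dependent = c(dependent, FALSE)
    }
    p = c(p, rez$p.value)
  }
  # return the collected results instead of discarding them
  list(idxDependent = idxDependent, dependent = dependent, p = p)
}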
This is the function I use. It seems to work.
What I have understood so far is that I have to pass data like this as the first parameter:
            Men Women
Dieting      10    30
Non-dieting   5    60
My data comes from a CSV:
data = read.csv('***.csv', header = TRUE, sep=',');
My first problem is that I don't know how to convert from:
Loan.Purpose Home.Ownership
lp_value_1 ho_value_2
lp_value_1 ho_value_2
lp_value_2 ho_value_1
lp_value_3 ho_value_2
lp_value_2 ho_value_3
lp_value_4 ho_value_2
lp_value_3 ho_value_3
to:
           ho_value_1 ho_value_2 ho_value_3
lp_value_1          0          2          0
lp_value_2          1          0          1
lp_value_3          0          1          1
lp_value_4          0          1          0
The second issue is that I don't know what the second parameter should be.
POST UPDATE: This is what I get using fisher.test(myTable):
Error in fisher.test(test) : FEXACT error 501.
The hash table key cannot be computed because the largest key
is larger than the largest representable int.
The algorithm cannot proceed.
Reduce the workspace size or use another algorithm.
where myTable is:
                   MORTGAGE NONE OTHER OWN RENT
car                      18    0     0   5   27
credit_card             190    0     2  38  214
debt_consolidation      620    0     2  87  598
educational               5    0     0   3    7
...
Basically, Fisher's exact test only works on smallish data sets because it requires a lot of memory. That is fine here, because the chi-square test makes minimal additional assumptions and is easier on the computer. Just do:
chisq.test(data$Loan.Purpose, data$Home.Ownership)
to get your p-values.
Make sure you read through and understand the help page for chisq.test, especially the examples at the bottom.
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/chisq.test.html
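As a minimal sketch of the conversion asked about in the question, table() turns two columns into the contingency table, and that table can be passed straight to chisq.test(). The data frame below is typed in from the seven example rows in the question, so the expected counts are far too small for the p-value to mean anything; it only shows the mechanics.

```r
# Toy data copied from the seven example rows in the question
data <- data.frame(
  Loan.Purpose   = c("lp_value_1", "lp_value_1", "lp_value_2", "lp_value_3",
                     "lp_value_2", "lp_value_4", "lp_value_3"),
  Home.Ownership = c("ho_value_2", "ho_value_2", "ho_value_1", "ho_value_2",
                     "ho_value_3", "ho_value_2", "ho_value_3")
)

# Cross-tabulate the two columns into a contingency table
tbl <- table(data$Loan.Purpose, data$Home.Ownership)
tbl

# chisq.test() warns here because the expected counts are tiny;
# the warning is suppressed only because this is a toy table
res <- suppressWarnings(chisq.test(tbl))
res$p.value
res$expected  # inspect expected counts before trusting the approximation
```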
Then look at a mosaic plot to see the quantities, like:
mosaicplot(table(data$Loan.Purpose, data$Home.Ownership))
This reference explains how mosaic plots work.
http://alumni.media.mit.edu/~tpminka/courses/36-350.2001/lectures/day12/
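If a Fisher-type test is still wanted for a table that is too large for the exact algorithm, fisher.test() can estimate the p-value by Monte Carlo simulation through its simulate.p.value and B arguments. The sketch below is an assumption-laden illustration: the matrix holds only the four rows visible in the question (the rest is elided there), and the all-zero NONE column is left out, so the result says nothing about the full data.

```r
# Partial table: only the rows shown in the question, NONE column omitted
partial <- matrix(
  c( 18, 0,  5,  27,
    190, 2, 38, 214,
    620, 2, 87, 598,
      5, 0,  3,   7),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("car", "credit_card", "debt_consolidation", "educational"),
    c("MORTGAGE", "OTHER", "OWN", "RENT")
  )
)

set.seed(1)  # the simulated p-value depends on the random seed
res <- fisher.test(partial, simulate.p.value = TRUE, B = 2000)
res$p.value
```

A larger B gives a more stable estimate at the cost of run time.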

Resources