Plotting Conditionally Summed Data (base R or ggplot) - r
I started with a data frame of West Nile cases in Canada from 2012 to 2015: 600 observations of 10 variables in total.
> head(mosquitoes)
Years Weeks Province Avg.Temp Avg..Precepitation Wind Number.of.cases Number.of.Dead.Birds Mosquito.Pools.Tested Google.Trend.Searches
1 2015 17 Alberta 48 0.01 8 0 0 0 1
2 2015 18 Alberta 46 0.03 10 0 0 0 2
3 2015 19 Alberta 44 0.07 8 0 0 0 2
4 2015 20 Alberta 51 0.00 9 0 0 0 2
5 2015 21 Alberta 56 0.01 9 0 0 0 4
6 2015 22 Alberta 58 0.10 7 0 0 0 1
Here is the entire data set (sorry, it's large):
Years,Weeks,Province,Avg Temp ,Avg. Precepitation,Wind,Number of cases,Number of Dead Birds,Mosquito Pools Tested,Google Trend Searches
2015,17,Alberta,48,0.01,8,0,0,0,1
2015,18,Alberta,46,0.03,10,0,0,0,2
2015,19,Alberta,44,0.07,8,0,0,0,2
2015,20,Alberta,51,0,9,0,0,0,2
2015,21,Alberta,56,0.01,9,0,0,0,4
2015,22,Alberta,58,0.1,7,0,0,0,1
2015,23,Alberta,61,0.05,8,0,0,0,1
2015,24,Alberta,55,0.08,9,0,0,0,1
2015,25,Alberta,63,0.02,6,0,0,0,4
2015,26,Alberta,67,0.16,8,0,0,0,5
2015,27,Alberta,65,0.02,8,0,0,0,3
2015,28,Alberta,62,0.09,10,0,0,0,7
2015,29,Alberta,66,0.01,8,0,0,0,2
2015,30,Alberta,62,0.02,7,0,0,0,3
2015,31,Alberta,64,0.21,7,0,0,0,6
2015,32,Alberta,66,0.07,7,0,0,0,4
2015,33,Alberta,55,0.13,8,0,0,0,4
2015,34,Alberta,63,0,6,0,0,0,1
2015,35,Alberta,52,0.11,9,0,0,0,4
2015,36,Alberta,54,0.02,7,0,0,0,2
2015,37,Alberta,48,0.06,8,0,0,0,2
2015,38,Alberta,52,0.03,9,0,0,0,3
2015,39,Alberta,49,0.03,9,0,0,0,3
2015,40,Alberta,51,0,8,0,0,0,2
2015,41,Alberta,48,0,8,0,0,0,2
2014,17,Alberta,43,0.05,8,0,0,0,1
2014,18,Alberta,44,0.06,9,0,0,0,3
2014,19,Alberta,37,0.03,9,0,0,0,3
2014,20,Alberta,48,0.01,8,0,0,0,1
2014,21,Alberta,57,0.01,10,0,0,0,2
2014,22,Alberta,53,0.06,8,0,0,0,4
2014,23,Alberta,53,0.04,10,0,0,0,6
2014,24,Alberta,53,0.04,10,0,0,0,6
2014,25,Alberta,54,0.24,9,0,0,0,4
2014,26,Alberta,59,0.03,9,0,0,0,7
2014,27,Alberta,64,0.02,11,0,0,0,19
2014,28,Alberta,65,0.03,10,0,0,0,33
2014,29,Alberta,67,0.01,9,0,0,0,18
2014,30,Alberta,62,0.08,10,0,0,0,14
2014,31,Alberta,68,0,10,0,0,0,10
2014,32,Alberta,63,0.16,8,0,0,0,11
2014,33,Alberta,66,0.01,7,0,0,0,19
2014,34,Alberta,58,0.05,8,0,0,0,17
2014,35,Alberta,58,0.04,7,0,0,0,8
2014,36,Alberta,54,0.01,7,0,0,0,12
2014,37,Alberta,41,0.15,8,0,0,0,3
2014,38,Alberta,58,0,5,0,0,0,3
2014,39,Alberta,60,0.02,6,0,0,0,4
2014,40,Alberta,48,0.03,11,0,0,0,5
2014,41,Alberta,51,0,6,0,0,0,3
2013,17,Alberta,42,0,12,0,0,0,3
2013,18,Alberta,42,0.01,11,0,0,0,2
2013,19,Alberta,57,0,11,0,0,0,2
2013,20,Alberta,55,0.01,10,0,0,0,9
2013,21,Alberta,50,0.23,11,0,0,0,7
2013,22,Alberta,52,0.08,6,0,0,0,8
2013,23,Alberta,55,0.15,10,0,0,0,10
2013,24,Alberta,53,0.08,10,0,0,0,4
2013,25,Alberta,57,0.3,11,0,0,0,9
2013,26,Alberta,61,0.01,9,0,0,0,17
2013,27,Alberta,65,0.08,10,0,0,0,27
2013,28,Alberta,59,0.07,8,0,0,0,19
2013,29,Alberta,62,0.01,10,0,0,0,21
2013,30,Alberta,62,0.06,10,0,0,0,18
2013,31,Alberta,57,0.03,7,0,0,0,13
2013,32,Alberta,60,0.07,8,0,0,0,10
2013,33,Alberta,67,0,8,3,0,0,2
2013,34,Alberta,63,0,8,5,0,0,12
2013,35,Alberta,64,0.03,10,4,0,0,20
2013,36,Alberta,64,0.13,8,2,1,0,15
2013,37,Alberta,63,0,9,5,0,0,9
2013,38,Alberta,57,0.06,11,2,0,0,11
2013,39,Alberta,47,0,10,0,0,0,4
2013,40,Alberta,44,0,11,0,0,0,5
2013,41,Alberta,45,0.06,8,0,0,0,5
2012,17,Alberta,49,0.06,7,0,0,0,2
2012,18,Alberta,42,0.13,9,0,0,0,2
2012,19,Alberta,48,0,9,0,0,0,6
2012,20,Alberta,53,0.01,10,0,0,0,2
2012,21,Alberta,49,0.08,8,0,0,0,2
2012,22,Alberta,52,0,9,0,0,0,2
2012,23,Alberta,54,0.28,9,0,0,0,4
2012,24,Alberta,56,0.21,12,0,0,0,7
2012,25,Alberta,56,0.05,8,0,0,0,5
2012,26,Alberta,59,0.14,8,0,0,0,3
2012,27,Alberta,61,0.21,9,0,0,0,22
2012,28,Alberta,69,0,8,0,0,0,32
2012,29,Alberta,65,0.09,10,0,0,0,16
2012,30,Alberta,64,0.02,10,0,0,0,15
2012,31,Alberta,63,0.03,10,0,0,0,20
2012,32,Alberta,68,0,10,0,0,0,25
2012,33,Alberta,62,0.07,10,4,0,0,36
2012,34,Alberta,62,0.05,10,2,0,0,100
2012,35,Alberta,61,0.01,10,0,0,0,76
2012,36,Alberta,57,0,12,1,0,0,29
2012,37,Alberta,57,0,12,2,0,0,30
2012,38,Alberta,59,0,9,0,0,0,14
2012,39,Alberta,58,0.01,9,0,0,0,11
2012,40,Alberta,43,0.07,12,0,0,0,10
2012,41,Alberta,43,0.02,13,0,0,0,7
2015,17,British Columbia,53,0.03,10,0,0,0,5
2015,18,British Columbia,53,0.01,6,0,0,0,5
2015,19,British Columbia,58,0.01,7,0,0,0,5
2015,20,British Columbia,60,0,7,0,0,0,4
2015,21,British Columbia,62,0,7,0,0,0,6
2015,22,British Columbia,60,0.03,7,0,0,0,9
2015,23,British Columbia,62,0,13,0,0,0,9
2015,24,British Columbia,62,0.02,8,0,0,0,10
2015,25,British Columbia,66,0,9,0,0,0,7
2015,26,British Columbia,70,0,12,0,0,0,5
2015,27,British Columbia,67,0.01,9,0,0,0,11
2015,28,British Columbia,66,0,10,0,0,0,9
2015,29,British Columbia,65,0.04,9,0,0,0,14
2015,30,British Columbia,65,0.04,6,0,0,0,7
2015,31,British Columbia,65,0.02,9,0,0,0,7
2015,32,British Columbia,66,0.04,9,0,0,0,9
2015,33,British Columbia,65,0,9,0,0,0,11
2015,34,British Columbia,64,0.1,7,0,0,0,6
2015,35,British Columbia,57,0.12,10,0,0,0,4
2015,36,British Columbia,61,0.02,9,0,0,0,9
2015,37,British Columbia,58,0.09,9,0,0,0,9
2015,38,British Columbia,55,0.04,9,0,0,0,3
2015,39,British Columbia,52,0,6,0,0,0,3
2015,40,British Columbia,56,0.08,6,0,0,0,3
2015,41,British Columbia,51,0.04,7,0,0,0,7
2014,17,British Columbia,49,0.07,10,0,0,0,3
2014,18,British Columbia,54,0.03,8,0,0,0,4
2014,19,British Columbia,53,0.18,9,0,0,0,4
2014,20,British Columbia,60,0,8,0,0,0,6
2014,21,British Columbia,59,0.06,7,0,0,0,6
2014,22,British Columbia,56,0.09,7,0,0,0,6
2014,23,British Columbia,59,0,8,0,0,0,8
2014,24,British Columbia,60,0.03,10,0,0,0,7
2014,25,British Columbia,58,0.09,9,0,0,0,8
2014,26,British Columbia,62,0.05,7,0,0,0,10
2014,27,British Columbia,64,0.01,8,0,0,0,7
2014,28,British Columbia,66,0.01,8,0,0,0,19
2014,29,British Columbia,68,0,9,0,0,0,13
2014,30,British Columbia,63,0.06,8,0,0,0,12
2014,31,British Columbia,67,0,6,0,0,0,16
2014,32,British Columbia,66,0,7,0,0,0,25
2014,33,British Columbia,67,0.08,7,0,0,0,17
2014,34,British Columbia,65,0,6,0,0,0,13
2014,35,British Columbia,66,0,7,0,0,0,30
2014,36,British Columbia,61,0.05,7,0,0,0,9
2014,37,British Columbia,60,0,6,0,0,0,11
2014,38,British Columbia,61,0.02,6,0,0,0,3
2014,39,British Columbia,62,0.12,9,0,0,0,8
2014,40,British Columbia,56,0.04,6,0,0,0,9
2014,41,British Columbia,58,0.03,5,0,0,0,7
2013,17,British Columbia,50,0.03,7,0,0,0,14
2013,18,British Columbia,50,0,12,0,0,0,8
2013,19,British Columbia,59,0.03,6,0,0,0,5
2013,20,British Columbia,56,0.07,8,0,0,0,7
2013,21,British Columbia,54,0.04,8,0,0,0,4
2013,22,British Columbia,55,0.09,7,0,0,0,8
2013,23,British Columbia,60,0.01,9,0,0,0,14
2013,24,British Columbia,58,0.01,7,0,0,0,16
2013,25,British Columbia,62,0.04,8,0,0,0,10
2013,26,British Columbia,63,0.1,7,0,0,0,17
2013,27,British Columbia,67,0,8,0,0,0,29
2013,28,British Columbia,63,0,8,0,0,0,30
2013,29,British Columbia,66,0,9,0,0,0,20
2013,30,British Columbia,64,0,8,0,0,0,34
2013,31,British Columbia,64,0.02,8,0,0,0,11
2013,32,British Columbia,66,0,6,0,0,1,13
2013,33,British Columbia,66,0.02,8,0,0,1,16
2013,34,British Columbia,63,0.01,8,0,0,1,16
2013,35,British Columbia,65,0.17,7,0,1,1,12
2013,36,British Columbia,64,0.06,6,0,0,1,8
2013,37,British Columbia,63,0,6,0,0,1,14
2013,38,British Columbia,60,0.19,6,0,0,1,6
2013,39,British Columbia,54,0.23,10,0,0,1,6
2013,40,British Columbia,51,0.15,9,0,0,1,6
2013,41,British Columbia,51,0.01,8,0,0,1,8
2012,17,British Columbia,53,0.05,8,0,0,0,5
2012,18,British Columbia,50,0.11,7,0,0,0,6
2012,19,British Columbia,52,0,9,0,0,0,7
2012,20,British Columbia,54,0,10,0,0,0,8
2012,21,British Columbia,55,0.06,8,0,0,0,9
2012,22,British Columbia,57,0.07,7,0,0,0,8
2012,23,British Columbia,53,0.07,8,0,0,0,4
2012,24,British Columbia,57,0.04,8,0,0,0,4
2012,25,British Columbia,58,0.13,8,0,0,0,7
2012,26,British Columbia,60,0.04,8,0,0,0,8
2012,27,British Columbia,59,0.03,7,0,0,0,22
2012,28,British Columbia,66,0,6,0,0,0,30
2012,29,British Columbia,66,0.05,8,0,0,0,30
2012,30,British Columbia,63,0.03,8,0,0,0,38
2012,31,British Columbia,65,0,8,0,0,0,60
2012,32,British Columbia,67,0.01,8,0,0,0,34
2012,33,British Columbia,69,0,7,0,0,0,63
2012,34,British Columbia,63,0,8,0,0,0,100
2012,35,British Columbia,62,0,7,0,0,0,51
2012,36,British Columbia,62,0,7,0,0,0,32
2012,37,British Columbia,58,0.01,8,0,0,0,24
2012,38,British Columbia,60,0,6,0,0,0,13
2012,39,British Columbia,57,0,6,0,0,0,13
2012,40,British Columbia,53,0,8,0,0,0,6
2012,41,British Columbia,52,0.09,5,0,0,0,8
2015,17,Manitoba,56,0,10,0,0,0,4
2015,18,Manitoba,48,0,13,0,0,0,4
2015,19,Manitoba,46,0,10,0,0,0,4
2015,20,Manitoba,52,0,14,0,0,0,4
2015,21,Manitoba,57,0,10,0,0,12,4
2015,22,Manitoba,60,0,12,0,0,4,8
2015,23,Manitoba,67,0,9,0,0,87,8
2015,24,Manitoba,59,0,9,0,0,82,8
2015,25,Manitoba,66,0,7,0,0,44,8
2015,26,Manitoba,68,0,7,0,0,75,11
2015,27,Manitoba,66,0,10,0,0,73,17
2015,28,Manitoba,70,0,7,0,0,132,8
2015,29,Manitoba,69,0,9,0,0,139,17
2015,30,Manitoba,70,0,11,0,0,204,4
2015,31,Manitoba,63,0,9,0,0,275,13
2015,32,Manitoba,73,0,9,0,0,195,23
2015,33,Manitoba,62,0,10,0,0,228,13
2015,34,Manitoba,62,0,11,0,0,69,12
2015,35,Manitoba,73,0,11,1,0,92,10
2015,36,Manitoba,57,0,10,1,0,113,8
2015,37,Manitoba,60,0,11,2,0,34,4
2015,38,Manitoba,61,0,13,1,0,0,4
2015,39,Manitoba,53,0,13,0,0,0,6
2015,40,Manitoba,48,0,11,0,0,0,6
2015,41,Manitoba,44,0,11,0,0,0,6
2014,17,Manitoba,42,0,11,0,0,0,4
2014,18,Manitoba,42,0,14,0,0,0,0
2014,19,Manitoba,46,0,9,0,0,0,0
2014,20,Manitoba,45,0,10,0,0,0,0
2014,21,Manitoba,57,0,12,0,0,0,0
2014,22,Manitoba,66,0,8,0,0,0,0
2014,23,Manitoba,62,0,10,0,0,0,5
2014,24,Manitoba,60,0,11,0,0,0,13
2014,25,Manitoba,62,0,12,0,0,0,9
2014,26,Manitoba,66,0,10,0,0,0,7
2014,27,Manitoba,65,0,15,0,0,0,9
2014,28,Manitoba,67,0,11,0,0,0,36
2014,29,Manitoba,63,0,11,0,0,0,24
2014,30,Manitoba,68,0,9,0,0,0,53
2014,31,Manitoba,65,0,8,0,0,7,41
2014,32,Manitoba,71,0,8,0,0,7,48
2014,33,Manitoba,68,0,8,1,0,14,14
2014,34,Manitoba,67,0,8,2,0,19,18
2014,35,Manitoba,61,0,11,2,0,22,9
2014,36,Manitoba,60,0,8,0,0,24,4
2014,37,Manitoba,50,0,11,0,0,24,11
2014,38,Manitoba,52,0,10,0,0,24,4
2014,39,Manitoba,65,0,13,0,0,24,15
2014,40,Manitoba,47,0,16,0,0,24,4
2014,41,Manitoba,39,0,13,0,0,24,4
2013,17,Manitoba,36,0.01,12,0,0,0,4
2013,18,Manitoba,38,0.11,9,0,0,0,4
2013,19,Manitoba,49,0.02,12,0,0,0,4
2013,20,Manitoba,56,0.02,10,0,0,0,5
2013,21,Manitoba,55,0.05,14,0,0,0,4
2013,22,Manitoba,58,0.16,15,0,0,0,4
2013,23,Manitoba,57,0.01,9,0,0,0,9
2013,24,Manitoba,63,0.03,10,0,0,0,16
2013,25,Manitoba,66,0.1,9,0,0,0,23
2013,26,Manitoba,69,0.24,10,0,0,0,14
2013,27,Manitoba,72,0,6,0,0,0,23
2013,28,Manitoba,70,0.06,10,0,0,1,19
2013,29,Manitoba,66,0.1,9,0,0,1,45
2013,30,Manitoba,60,0.19,8,0,1,7,35
2013,31,Manitoba,61,0.03,7,0,0,10,31
2013,32,Manitoba,59,0.04,7,0,0,16,22
2013,33,Manitoba,64,0.02,8,1,0,16,24
2013,34,Manitoba,71,0.17,10,0,0,16,49
2013,35,Manitoba,76,0.01,7,0,0,17,14
2013,36,Manitoba,64,0,10,1,0,17,11
2013,37,Manitoba,63,0.01,8,0,0,19,9
2013,38,Manitoba,54,0,11,0,0,19,6
2013,39,Manitoba,60,0.1,12,0,0,19,13
2013,40,Manitoba,50,0.03,11,0,0,19,8
2013,41,Manitoba,52,0,10,0,1,19,4
2012,17,Manitoba,46,0.01,12,0,0,0,0
2012,18,Manitoba,51,0.05,11,0,0,0,0
2012,19,Manitoba,56,0.06,13,0,0,0,5
2012,20,Manitoba,58,0.16,12,0,0,0,6
2012,21,Manitoba,53,0.02,11,0,0,0,5
2012,22,Manitoba,53,0.13,9,0,0,0,5
2012,23,Manitoba,67,0.08,8,0,0,0,8
2012,24,Manitoba,62,0.17,11,0,0,0,10
2012,25,Manitoba,60,0.04,8,0,0,0,11
2012,26,Manitoba,68,0,10,0,0,0,11
2012,27,Manitoba,73,0.03,7,0,0,0,15
2012,28,Manitoba,73,0,7,0,0,0,17
2012,29,Manitoba,69,0.05,8,1,0,2,21
2012,30,Manitoba,71,0,8,1,0,20,36
2012,31,Manitoba,71,0.2,9,4,0,48,100
2012,32,Manitoba,67,0,9,7,0,62,47
2012,33,Manitoba,62,0.04,8,7,0,98,31
2012,34,Manitoba,69,0.01,7,6,0,108,84
2012,35,Manitoba,70,0.01,11,7,0,111,75
2012,36,Manitoba,63,0.01,11,1,0,116,22
2012,37,Manitoba,59,0.01,11,3,0,116,23
2012,38,Manitoba,47,0.01,12,2,0,116,13
2012,39,Manitoba,50,0,8,0,0,116,5
2012,40,Manitoba,46,0.02,15,0,0,116,7
2012,41,Manitoba,37,0.02,10,0,0,116,5
2015,17,Quebec,53,0,8,0,0,0,8
2015,18,Quebec,65,0.06,8,0,0,0,8
2015,19,Quebec,58,0.09,10,0,0,0,8
2015,20,Quebec,59,0.05,11,0,0,0,8
2015,21,Quebec,69,0.11,11,0,0,0,8
2015,22,Quebec,56,0.07,9,0,0,0,8
2015,23,Quebec,65,0.16,9,0,0,0,8
2015,24,Quebec,64,0.16,7,0,0,0,16
2015,25,Quebec,67,0.18,8,0,0,0,8
2015,26,Quebec,64,0.07,9,0,0,120,19
2015,27,Quebec,71,0.01,8,0,0,127,24
2015,28,Quebec,70,0.05,9,0,1,132,24
2015,29,Quebec,70,0.3,8,0,1,131,16
2015,30,Quebec,75,0.07,9,1,2,129,16
2015,31,Quebec,67,0.02,9,1,3,126,8
2015,32,Quebec,69,0.31,7,0,0,133,8
2015,33,Quebec,76,0.11,9,1,1,125,16
2015,34,Quebec,68,0.01,8,2,1,123,11
2015,35,Quebec,70,0,8,1,3,131,31
2015,36,Quebec,72,0.15,8,2,4,128,15
2015,37,Quebec,69,0.21,9,6,0,123,7
2015,38,Quebec,58,0,7,5,0,108,7
2015,39,Quebec,55,0.17,11,2,2,107,11
2015,40,Quebec,49,0.03,7,5,0,0,7
2015,41,Quebec,51,0.11,11,8,0,0,15
2014,17,Quebec,46,0.05,9,0,0,0,0
2014,18,Quebec,49,0.18,12,0,0,0,0
2014,19,Quebec,53,0.09,10,0,0,0,0
2014,20,Quebec,62,0.17,13,0,0,0,0
2014,21,Quebec,59,0.01,9,0,0,0,13
2014,22,Quebec,59,0.08,9,0,0,0,13
2014,23,Quebec,66,0.13,8,0,0,0,40
2014,24,Quebec,66,0.28,11,0,0,0,18
2014,25,Quebec,65,0.14,8,0,0,0,27
2014,26,Quebec,69,0.14,6,0,0,0,33
2014,27,Quebec,75,0.02,9,0,0,0,23
2014,28,Quebec,70,0.08,12,0,0,0,40
2014,29,Quebec,69,0.05,9,0,0,1,27
2014,30,Quebec,72,0.06,10,0,0,4,28
2014,31,Quebec,66,0.18,8,0,0,9,54
2014,32,Quebec,70,0.04,6,0,0,10,24
2014,33,Quebec,67,0.2,10,1,2,19,34
2014,34,Quebec,66,0,7,1,0,19,9
2014,35,Quebec,70,0,8,1,1,39,17
2014,36,Quebec,72,0.11,10,1,0,70,8
2014,37,Quebec,60,0.12,9,0,3,99,12
2014,38,Quebec,52,0.02,9,1,2,112,13
2014,39,Quebec,61,0.02,9,0,0,119,15
2014,40,Quebec,58,0.06,11,0,1,119,16
2014,41,Quebec,51,0.1,13,1,0,119,16
2013,17,Quebec,46,0.03,11,1,0,0,9
2013,18,Quebec,60,0.01,7,0,0,0,9
2013,19,Quebec,65,0.08,8,0,0,0,9
2013,20,Quebec,51,0.01,11,0,0,0,18
2013,21,Quebec,64,0.19,10,0,0,0,17
2013,22,Quebec,64,0.18,9,0,0,0,9
2013,23,Quebec,59,0.11,10,0,0,0,21
2013,24,Quebec,64,0.11,9,0,0,0,18
2013,25,Quebec,62,0.09,8,0,0,0,9
2013,26,Quebec,69,0.14,9,0,0,0,37
2013,27,Quebec,72,0.02,9,0,0,0,9
2013,28,Quebec,73,0.06,8,0,0,0,45
2013,29,Quebec,79,0.28,9,0,0,2,49
2013,30,Quebec,66,0.06,7,0,0,3,73
2013,31,Quebec,70,0.12,9,1,3,5,40
2013,32,Quebec,68,0.04,9,3,2,11,74
2013,33,Quebec,66,0.08,9,8,4,23,56
2013,34,Quebec,69,0.02,10,3,5,36,64
2013,35,Quebec,70,0.06,7,4,9,36,29
2013,36,Quebec,63,0.06,10,2,6,40,32
2013,37,Quebec,62,0.18,8,3,4,47,20
2013,38,Quebec,58,0.12,9,1,2,59,8
2013,39,Quebec,54,0.03,6,1,0,60,16
2013,40,Quebec,61,0,6,1,0,60,24
2013,41,Quebec,55,0.11,10,0,0,60,20
2012,17,Quebec,40,0.17,13,0,0,0,0
2012,18,Quebec,50,0.03,7,0,0,0,10
2012,19,Quebec,55,0.07,8,0,0,0,10
2012,20,Quebec,61,0.02,7,0,0,0,10
2012,21,Quebec,69,0.1,7,0,0,0,11
2012,22,Quebec,62,0.16,8,0,0,0,10
2012,23,Quebec,61,0.02,8,0,0,0,10
2012,24,Quebec,68,0.08,7,0,0,0,11
2012,25,Quebec,76,0.01,9,0,0,0,11
2012,26,Quebec,69,0.13,9,0,0,0,26
2012,27,Quebec,73,0.12,6,0,0,0,40
2012,28,Quebec,72,0,8,0,2,0,24
2012,29,Quebec,71,0.21,6,1,0,0,11
2012,30,Quebec,71,0.1,7,1,0,0,11
2012,31,Quebec,76,0.01,7,0,1,5,78
2012,32,Quebec,72,0.17,10,2,5,8,31
2012,33,Quebec,70,0.02,7,6,2,19,94
2012,34,Quebec,70,0,6,10,5,19,100
2012,35,Quebec,71,0.01,11,9,8,19,76
2012,36,Quebec,71,0.11,6,14,1,19,70
2012,37,Quebec,63,0.07,8,23,6,19,43
2012,38,Quebec,58,0.12,10,16,0,19,34
2012,39,Quebec,54,0.01,9,27,0,19,38
2012,40,Quebec,57,0.16,8,11,0,19,14
2012,41,Quebec,45,0.06,10,8,0,19,19
2015,17,Ontario,53,0,9,0,0,0,2
2015,18,Ontario,61,0.04,5,0,0,0,2
2015,19,Ontario,58,0.07,7,0,0,0,4
2015,20,Ontario,58,0,8,0,0,0,5
2015,21,Ontario,70,0.11,8,0,0,0,8
2015,22,Ontario,57,0.14,7,0,0,180,8
2015,23,Ontario,65,0.18,6,0,0,356,5
2015,24,Ontario,65,0.08,5,0,1,852,5
2015,25,Ontario,67,0.33,7,0,0,886,13
2015,26,Ontario,63,0.02,7,0,0,954,15
2015,27,Ontario,68,0.04,5,0,0,1152,13
2015,28,Ontario,67,0.03,6,1,0,1216,21
2015,29,Ontario,72,0.01,7,1,4,1219,16
2015,30,Ontario,76,0.03,6,1,1,1222,22
2015,31,Ontario,68,0.06,6,0,8,1176,24
2015,32,Ontario,69,0.21,6,0,0,1168,15
2015,33,Ontario,73,0.09,5,1,0,1168,24
2015,34,Ontario,64,0.01,5,5,1,987,12
2015,35,Ontario,75,0,5,2,1,881,18
2015,36,Ontario,70,0.11,5,5,0,802,9
2015,37,Ontario,65,0.07,6,1,2,712,6
2015,38,Ontario,60,0,5,5,4,526,4
2015,39,Ontario,55,0.04,9,2,2,396,6
2015,40,Ontario,53,0.14,6,3,0,65,5
2015,41,Ontario,52,0.04,8,3,4,0,2
2014,17,Ontario,46,0.05,8,0,0,0,3
2014,18,Ontario,47,0.14,9,0,0,0,2
2014,19,Ontario,53,0,9,0,0,0,2
2014,20,Ontario,56,0.13,6,0,0,0,3
2014,21,Ontario,57,0.09,5,0,0,0,4
2014,22,Ontario,65,0.02,6,0,0,0,7
2014,23,Ontario,63,0.04,6,0,0,0,10
2014,24,Ontario,65,0.19,6,0,0,0,16
2014,25,Ontario,66,0.16,5,0,0,0,13
2014,26,Ontario,69,0.06,4,0,0,0,7
2014,27,Ontario,72,0.09,7,0,0,0,20
2014,28,Ontario,68,0.12,6,0,0,0,17
2014,29,Ontario,66,0.21,5,1,0,0,13
2014,30,Ontario,68,0.03,5,0,0,2,14
2014,31,Ontario,67,0.35,5,0,0,5,35
2014,32,Ontario,68,0.21,4,0,0,9,22
2014,33,Ontario,65,0.12,7,2,0,11,30
2014,34,Ontario,67,0.02,4,0,2,13,11
2014,35,Ontario,67,0,6,2,3,30,18
2014,36,Ontario,71,0.39,5,5,0,43,13
2014,37,Ontario,60,0.15,6,1,0,52,10
2014,38,Ontario,53,0.02,4,0,1,56,7
2014,39,Ontario,60,0.08,4,0,0,56,3
2014,40,Ontario,61,0.06,4,0,0,56,6
2014,41,Ontario,50,0.06,6,0,0,56,4
2013,17,Ontario,43,0.05,6,0,0,0,2
2013,18,Ontario,57,0.05,6,0,0,0,3
2013,19,Ontario,59,0.04,5,0,0,0,4
2013,20,Ontario,51,0.02,8,0,0,0,3
2013,21,Ontario,60,0.17,8,0,0,0,7
2013,22,Ontario,64,0.16,6,1,0,0,9
2013,23,Ontario,58,0.05,7,1,0,0,9
2013,24,Ontario,64,0.29,6,0,0,0,12
2013,25,Ontario,64,0.11,5,0,0,0,12
2013,26,Ontario,73,0.06,4,0,1,2,12
2013,27,Ontario,71,0.05,5,1,0,2,20
2013,28,Ontario,72,0.13,6,2,0,4,15
2013,29,Ontario,80,0.05,5,1,2,12,20
2013,30,Ontario,65,0.12,6,5,0,22,56
2013,31,Ontario,66,0.26,5,4,8,41,43
2013,32,Ontario,67,0.04,6,5,6,65,32
2013,33,Ontario,63,0,5,5,2,89,24
2013,34,Ontario,70,0,5,2,0,131,30
2013,35,Ontario,72,0.2,3,2,8,155,22
2013,36,Ontario,63,0.12,6,7,2,179,12
2013,37,Ontario,64,0.04,6,3,2,190,15
2013,38,Ontario,57,0.17,4,5,2,194,9
2013,39,Ontario,55,0,4,0,1,196,5
2013,40,Ontario,61,0.04,4,5,0,198,9
2013,41,Ontario,56,0.04,4,1,0,198,4
2012,17,Ontario,40,0.06,11,0,0,0,4
2012,18,Ontario,50,0.12,6,0,0,0,3
2012,19,Ontario,56,0.07,6,0,0,0,3
2012,20,Ontario,58,0.02,4,0,0,0,3
2012,21,Ontario,69,0.01,6,0,0,0,5
2012,22,Ontario,64,0.09,8,0,0,0,3
2012,23,Ontario,63,0.03,6,1,0,0,6
2012,24,Ontario,67,0.08,6,0,0,0,4
2012,25,Ontario,76,0.17,6,0,0,2,7
2012,26,Ontario,70,0.04,7,0,0,6,10
2012,27,Ontario,75,0.04,5,3,1,10,39
2012,28,Ontario,73,0.02,5,5,3,19,24
2012,29,Ontario,75,0.06,6,9,1,30,19
2012,30,Ontario,72,0.38,6,14,2,89,17
2012,31,Ontario,73,0.16,4,23,1,162,77
2012,32,Ontario,70,0.14,6,44,1,249,46
2012,33,Ontario,68,0.05,4,44,8,312,64
2012,34,Ontario,67,0,4,38,4,375,83
2012,35,Ontario,70,0.15,6,26,0,409,100
2012,36,Ontario,69,0.56,4,25,0,434,79
2012,37,Ontario,61,0.03,5,17,2,454,37
2012,38,Ontario,57,0.16,5,3,4,462,23
2012,39,Ontario,53,0,6,2,6,462,24
2012,40,Ontario,57,0.03,5,3,0,464,18
2012,41,Ontario,42,0.04,5,1,0,464,10
2015,17,Saskatchewan,50,0,10,0,0,0,6
2015,18,Saskatchewan,46,0,11,0,0,0,12
2015,19,Saskatchewan,46,0,9,0,0,0,6
2015,20,Saskatchewan,53,0,8,0,0,0,6
2015,21,Saskatchewan,56,0,8,0,0,2,9
2015,22,Saskatchewan,60,0,10,0,0,0,9
2015,23,Saskatchewan,64,0,10,0,0,3,9
2015,24,Saskatchewan,57,0,8,0,0,3,12
2015,25,Saskatchewan,65,0,7,0,0,10,31
2015,26,Saskatchewan,70,0,6,0,0,13,15
2015,27,Saskatchewan,66,0,9,0,0,16,13
2015,28,Saskatchewan,67,0,8,0,0,40,15
2015,29,Saskatchewan,68,0,10,0,0,47,16
2015,30,Saskatchewan,63,0.02,9,0,0,69,43
2015,31,Saskatchewan,63,0,8,0,0,67,16
2015,32,Saskatchewan,70,0,8,0,0,80,28
2015,33,Saskatchewan,58,0,8,0,0,94,38
2015,34,Saskatchewan,62,0,8,0,0,42,21
2015,35,Saskatchewan,61,0,10,0,1,41,14
2015,36,Saskatchewan,53,0,8,0,0,0,9
2015,37,Saskatchewan,52,0,8,0,0,0,5
2015,38,Saskatchewan,54,0,10,0,0,0,5
2015,39,Saskatchewan,48,0,8,0,0,0,5
2015,40,Saskatchewan,48,0,9,0,0,0,8
2015,41,Saskatchewan,44,0,11,0,0,0,5
2014,17,Saskatchewan,40,0,12,0,0,0,6
2014,18,Saskatchewan,41,0,10,0,0,0,6
2014,19,Saskatchewan,41,0,9,0,0,0,6
2014,20,Saskatchewan,45,0,7,0,0,0,6
2014,21,Saskatchewan,59,0,10,0,0,0,13
2014,22,Saskatchewan,57,0,11,0,0,0,20
2014,23,Saskatchewan,55,0,8,0,0,0,17
2014,24,Saskatchewan,53,0,10,0,0,0,13
2014,25,Saskatchewan,57,0,10,0,0,0,7
2014,26,Saskatchewan,63,0,8,0,0,0,21
2014,27,Saskatchewan,66,0,11,0,0,0,26
2014,28,Saskatchewan,65,0,10,0,0,0,69
2014,29,Saskatchewan,64,0,9,0,0,0,65
2014,30,Saskatchewan,63,0,9,0,0,1,60
2014,31,Saskatchewan,67,0,6,0,0,1,36
2014,32,Saskatchewan,69,0,6,0,2,2,47
2014,33,Saskatchewan,67,0,7,0,0,9,67
2014,34,Saskatchewan,64,0,8,0,0,19,45
2014,35,Saskatchewan,58,0,9,0,0,20,34
2014,36,Saskatchewan,56,0,8,0,0,20,13
2014,37,Saskatchewan,46,0,9,0,0,20,19
2014,38,Saskatchewan,55,0,8,0,0,20,6
2014,39,Saskatchewan,61,0,9,0,0,20,16
2014,40,Saskatchewan,44,0,12,0,0,20,12
2014,41,Saskatchewan,45,0,9,0,0,20,6
2013,17,Saskatchewan,34,0,10,0,0,0,10
2013,18,Saskatchewan,40,0,12,0,0,0,14
2013,19,Saskatchewan,50,0,12,0,0,0,14
2013,20,Saskatchewan,59,0,9,0,0,0,7
2013,21,Saskatchewan,57,0,13,0,0,0,7
2013,22,Saskatchewan,60,0,9,0,0,0,14
2013,23,Saskatchewan,57,0,9,0,0,0,21
2013,24,Saskatchewan,57,0,10,0,0,0,20
2013,25,Saskatchewan,61,0,10,0,0,0,14
2013,26,Saskatchewan,64,0,7,0,0,0,41
2013,27,Saskatchewan,69,0,7,0,0,0,61
2013,28,Saskatchewan,65,0,8,0,0,1,65
2013,29,Saskatchewan,62,0,9,0,3,1,81
2013,30,Saskatchewan,60,0,9,0,1,3,75
2013,31,Saskatchewan,59,0,8,0,2,3,33
2013,32,Saskatchewan,60,0,6,0,1,18,44
2013,33,Saskatchewan,69,0,8,0,0,29,75
2013,34,Saskatchewan,66,0,8,1,1,29,60
2013,35,Saskatchewan,69,0,8,3,0,36,24
2013,36,Saskatchewan,67,0,7,1,0,40,21
2013,37,Saskatchewan,62,0,9,0,0,40,26
2013,38,Saskatchewan,57,0,10,1,2,40,32
2013,39,Saskatchewan,51,0,9,0,1,40,13
2013,40,Saskatchewan,45,0,11,0,0,40,29
2013,41,Saskatchewan,46,0,10,0,0,40,10
2012,17,Saskatchewan,44,0,13,0,0,0,24
2012,18,Saskatchewan,46,0,12,0,0,0,16
2012,19,Saskatchewan,51,0,13,0,0,0,16
2012,20,Saskatchewan,54,0,12,0,0,0,9
2012,21,Saskatchewan,48,0,11,0,0,0,17
2012,22,Saskatchewan,53,0,9,0,0,0,16
2012,23,Saskatchewan,61,0,13,0,0,0,8
2012,24,Saskatchewan,56,0,11,0,0,0,16
2012,25,Saskatchewan,58,0,7,0,0,0,25
2012,26,Saskatchewan,64,0,12,0,0,0,22
2012,27,Saskatchewan,65,0,9,0,0,0,23
2012,28,Saskatchewan,71,0,7,0,1,0,67
2012,29,Saskatchewan,67,0,10,0,0,0,34
2012,30,Saskatchewan,67,0,8,0,0,0,28
2012,31,Saskatchewan,64,0,8,0,0,0,59
2012,32,Saskatchewan,68,0,8,0,0,3,58
2012,33,Saskatchewan,59,0,8,2,0,4,34
2012,34,Saskatchewan,65,0,9,1,0,6,100
2012,35,Saskatchewan,64,0,9,0,0,6,49
2012,36,Saskatchewan,55,0,11,3,0,6,41
2012,37,Saskatchewan,58,0,13,0,0,6,16
2012,38,Saskatchewan,50,0,8,3,0,6,19
2012,39,Saskatchewan,55,0,6,0,0,6,15
2012,40,Saskatchewan,42,0,10,0,0,6,11
2012,41,Saskatchewan,36,0,8,0,0,6,7
First I produced this plot, but I did it in the most brute-force way imaginable:
#split out each year
cases2015 <- subset(mosquitoes, mosquitoes$Years==2015)
cases2014 <- subset(mosquitoes, mosquitoes$Years==2014)
cases2013 <- subset(mosquitoes, mosquitoes$Years==2013)
cases2012 <- subset(mosquitoes, mosquitoes$Years==2012)
#get the sums by week
aggregate2015 <- aggregate(cases2015$Number.of.cases, by=list(Weeks=cases2015$Weeks), FUN=sum)
aggregate2014 <- aggregate(cases2014$Number.of.cases, by=list(Weeks=cases2014$Weeks), FUN=sum)
aggregate2013 <- aggregate(cases2013$Number.of.cases, by=list(Weeks=cases2013$Weeks), FUN=sum)
aggregate2012 <- aggregate(cases2012$Number.of.cases, by=list(Weeks=cases2012$Weeks), FUN=sum)
#put the sums back together into a dataframe
aggregateSums <- aggregate2012
aggregateSums <- cbind(aggregateSums, aggregate2013[,2])
aggregateSums <- cbind(aggregateSums, aggregate2014[,2])
aggregateSums <- cbind(aggregateSums, aggregate2015[,2])
#give the columns useful names
colnames(aggregateSums) <- c("Weeks","Cases.2012","Cases.2013","Cases.2014","Cases.2015")
#base R plot
#plot the first set of points
plot(x=aggregateSums$Weeks,y=aggregateSums$Cases.2012,pch=16,col="Red",main="West Nile Cases",xlab="Week",ylab="Number of Cases")
#add additional years
points(x=aggregateSums$Weeks,y=aggregateSums$Cases.2013,pch=15,col="Blue")
points(x=aggregateSums$Weeks,y=aggregateSums$Cases.2014,pch=14,col="Orange")
points(x=aggregateSums$Weeks,y=aggregateSums$Cases.2015,pch=13,col="Brown")
#add the connecting lines
lines(x=aggregateSums$Weeks,y=aggregateSums$Cases.2012,col="Red")
lines(x=aggregateSums$Weeks,y=aggregateSums$Cases.2013,col="Blue")
lines(x=aggregateSums$Weeks,y=aggregateSums$Cases.2014,col="Orange")
lines(x=aggregateSums$Weeks,y=aggregateSums$Cases.2015,col="Brown")
#click to place legend
legend(locator(1),c("2012","2013","2014","2015"),pch=c(16,15,14,13), col=c("Red","Blue","Orange","Brown"))
So surely there has to be a more efficient way to get there.
My next step is to produce the same plot but for just one province at a time. I don't want to have to go through the above 6 times...
I'm open to accomplishing this via ggplot. If possible, I'd like to do it without resorting to additional packages (like plyr), as I'm trying to learn the base functionality for manipulating data.
Just to close the loop after Biranjan's answer...
library(dplyr)
library(ggplot2)

# sum cases per year, week and province
mosq2 <- mosquitoes %>%
  select(Years, Weeks, Province, Number.of.cases) %>%
  group_by(Years, Weeks, Province) %>%
  summarise(sum_case = sum(Number.of.cases))

# one panel per province, one line per year
ggplot(data = mosq2, aes(x = as.factor(Weeks), y = sum_case, color = as.factor(Years))) +
  geom_point(aes(shape = as.factor(Years))) +
  geom_line(aes(group = as.factor(Years))) +
  labs(title = "West Nile Cases", x = "weeks", y = "Number of cases") +
  theme(legend.title = element_blank()) +
  facet_wrap(~Province, ncol = 3) +
  scale_x_discrete(breaks = c(17, 30, 41))
Turned out quite nicely
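For a base R route to the same result, one option is to let tapply() build the Weeks-by-Years table of sums in a single call and to let matplot() draw every year at once, wrapped in small functions so that the per-province plots need no copy-and-paste. This is a minimal sketch, not a definitive implementation: the `toy` data frame and its values are invented for illustration, the helper names `weekly_sums` and `plot_cases` are hypothetical, and only the column names are taken from the question.

```r
# Toy stand-in for `mosquitoes` (values invented; column names as in the question).
toy <- data.frame(
  Years = rep(c(2012, 2013), each = 6),
  Weeks = rep(rep(17:19, each = 2), times = 2),
  Province = rep(c("Ontario", "Quebec"), times = 6),
  Number.of.cases = c(1, 0, 3, 2, 5, 4, 0, 1, 2, 2, 6, 3)
)

# Hypothetical helper: Weeks x Years matrix of summed cases,
# for all provinces together or for a single province.
weekly_sums <- function(df, province = NULL) {
  if (!is.null(province)) df <- df[df$Province == province, ]
  tapply(df$Number.of.cases, list(Weeks = df$Weeks, Years = df$Years), sum)
}

# Hypothetical helper: one series per year, points and lines in one call.
plot_cases <- function(wide, title = "West Nile Cases") {
  n <- ncol(wide)
  matplot(as.numeric(rownames(wide)), wide, type = "b", lty = 1,
          pch = seq_len(n), col = seq_len(n),
          main = title, xlab = "Week", ylab = "Number of Cases")
  legend("topleft", legend = colnames(wide), pch = seq_len(n), col = seq_len(n))
}

# All provinces together: replaces the subset/aggregate/cbind steps.
all_wide <- weekly_sums(toy)
plot_cases(all_wide)

# Same plot, one province at a time.
for (p in unique(toy$Province)) {
  plot_cases(weekly_sums(toy, p), title = paste("West Nile Cases:", p))
}
```

With the real data, `weekly_sums(mosquitoes, "Ontario")` would be the per-province call; `legend("topleft", ...)` is used here instead of `locator(1)` so the script runs without a mouse click.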
library(ggplot2)

# data1 is the summarised data frame built in the update below
ggplot(data = data1, aes(x = as.factor(Weeks), y = sum_case, color = as.factor(Years))) +
  geom_point(aes(shape = as.factor(Years))) +
  geom_line(aes(group = as.factor(Years))) +
  labs(title = "West Nile cases", x = "weeks", y = "Number of cases") +
  theme(legend.title = element_blank())
Update:
My first attempt only rendered fine because my simulated data had too few points; that was the problem. I couldn't find a way to do the summing with ggplot alone. The same plotting code works if the data is summarised with "dplyr" first and the variable names are edited accordingly. I know this is not what you were looking for, sorry to disappoint you.
library(dplyr)

# `data` holds the question's data frame (called mosquitoes there)
data1 <- data %>%
  select(Years, Weeks, Number.of.cases) %>%
  group_by(Years, Weeks) %>%
  summarise(sum_case = sum(Number.of.cases))
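As a possible alternative to summarising first, ggplot2 can do the per-week summing itself through stat_summary(), which applies a function to the y values at each x within each group. The sketch below is run on invented toy data (the `toy` data frame is an assumption, only the column names follow the question), and `fun = sum` needs ggplot2 3.3.0 or later (older versions used `fun.y`).

```r
library(ggplot2)

# Toy stand-in for the question's data frame (values invented).
toy <- data.frame(
  Years = rep(c(2012, 2013), each = 6),
  Weeks = rep(rep(17:19, each = 2), times = 2),
  Province = rep(c("Ontario", "Quebec"), times = 6),
  Number.of.cases = c(1, 0, 3, 2, 5, 4, 0, 1, 2, 2, 6, 3)
)

# stat_summary() sums Number.of.cases over provinces for each week and year,
# so no separate aggregation step is needed before plotting.
p <- ggplot(toy, aes(x = Weeks, y = Number.of.cases,
                     colour = factor(Years), shape = factor(Years))) +
  stat_summary(fun = sum, geom = "line") +
  stat_summary(fun = sum, geom = "point") +
  labs(title = "West Nile cases", x = "weeks", y = "Number of cases") +
  theme(legend.title = element_blank())
print(p)
```

Adding `facet_wrap(~Province)` to this plot would give one panel per province, with the sums then computed within each panel.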
Related
Remove all rows above and below a value in R
We have citizen scientists recording data for us using In-Situ Aqua troll 600 instruments. It is similar to a CTD, but the data format is a little different; different enough that I cannot use CTD trim from the OCE package in R. I need to remove all the rows of data from the soak time (time in the water before they start lowering the instrument) and from the upcast, that is, all the rows after they reached the maximum depth. So I just need the center portion of my data frame.
My Data
Date Time  Salinity (ppt) (672441)  Chlorophyll-a Fluorescence (RFU) (671721)  RDO Concentration (mg/L) (672144)  Temperature (°C) (676121)  Depth (ft) (671051)
16:29.0 0 0.01089297 7.257619 31.91303 0.008220486
16:31.0 0 0.01765913 7.246986 31.93175 0.1499496
16:33.0 0 0.0130412 7.258863 31.93253 0.5387784
16:35.0 0 0.01299242 7.274049 31.93806 0.6187978
16:37.0 0 0.01429801 7.26965 31.94401 0.6640261
16:39.0 0 0.01342988 7.271608 31.93595 0.681709
16:41.0 0 0.01337719 7.271549 31.93503 0.684597
16:43.0 7.087267 0.007094439 6.98015 31.89018 1.598019
16:45.0 28.3442 0.007111916 6.268753 31.83806 1.687673
16:47.0 31.06357 0.007945394 6.197834 31.77821 1.418773
16:49.0 32.07076 0.0080788 6.166986 31.76881 1.382685
16:51.0 31.95504 0.004382414 6.191305 31.72906 1.358556
16:53.0 36.21165 0.01983912 5.732656 29.3942 123.4148
16:55.0 36.37849 0.02243886 5.626586 28.82502 125.2927
16:57.0 36.43061 0.02416219 5.450325 28.23787 126.7997
16:59.0 36.44484 0.02441683 5.421676 28.14037 127.0321
17:01.0 36.46815 4.510316 5.318929 28.09501 127.2064
17:03.0 36.41381 4.012657 5.241654 28.14595 127.2227
17:05.0 36.42724 0.7891375 5.174401 28.20383 127.2019
17:07.0 36.41064 0.4351442 5.120181 28.18592 127.197
17:09.0 36.38155 0.2253969 5.033384 28.21021 127.1895
17:11.0 36.37671 0.2089337 5.019629 28.21222 127.1885
17:13.0 36.43813 0.08728585 4.981099 28.17526 127.2223
17:15.0 36.47644 0.904435 4.951878 28.13579 127.2108
17:17.0 36.54742 0.1230291 4.93056 28.06166 127.2307
17:19.0 36.60466 10.04291 4.908442 27.9397 126.6003
17:21.0 36.61511 11.33922 4.904828 27.92038 126.5161
17:23.0 36.68179 0.6680982 4.87018 27.78319 123.707
17:25.0 36.74612 0.06539913 4.848994 27.72977 119.906
17:27.0 36.75729 0.02414635 4.826871 27.72545 114.9537
17:29.0 37.1578 0.01556828 4.804105 27.81129 113.3405
> depthmax <- max(WS$`Depth (ft) (671051)`, na.rm = TRUE)
> output <- WS[WS$"Depth (ft) (671051)" < depthmax,]
> Output2 <- output[output$"Depth (ft) (671051)" > 1,]
I tried these and got Output2 to work but can't seem to get output to work. Is there a more elegant way to do this? Just to recap, I need to remove all rows after the depthmax (127.2307) and all the rows before the depth at which they start lowering the instrument (~2.41).
Your code does remove the maximum depth, but not the rows after the maximum depth is reached. You want to locate the row index of the maximum depth and delete that row and the ones after:
start <- tail(which(na.omit(WS$`Depth (ft) (671051)`) < 2.41), 1) + 1
end <- which.max(na.omit(WS$`Depth (ft) (671051)`)) - 1
output <- WS[start:end, ]
The first line finds the index of the last row less than 2.41 and adds 1 to get the starting row. The second line finds the index of the maximum depth and subtracts 1 to get the row before that.
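The index-based trim can be tried on a small made-up cast. This is only a sketch under stated assumptions: the `cast` data frame and its values are invented, the 2.41 threshold is the one from the question, and the depth column is assumed to contain no missing values (with NAs present, indices computed on an na.omit() result would no longer line up with the rows of the full data frame).

```r
# Toy cast (invented values): soak near the surface, descent, then upcast.
cast <- data.frame(
  time  = 1:10,
  depth = c(0.5, 0.7, 1.4, 30, 80, 120, 127.2, 126.6, 119.9, 113.3)
)
soak_limit <- 2.41  # threshold taken from the question

depth <- cast$depth
start <- tail(which(depth < soak_limit), 1) + 1  # first row after the soak
end   <- which.max(depth) - 1                    # row just before the maximum depth
downcast <- cast[start:end, ]
print(downcast)
```

Here the three soak rows, the maximum-depth row and the upcast rows are dropped, leaving only the descent. Note that `which(depth < soak_limit)` would also pick up upcast rows if the instrument were recorded all the way back to the surface; restricting that search to rows before `which.max(depth)` would be one way to guard against it.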
GameTheory package: Convert data frame of games to Coalition Set
I am looking to explore the GameTheory package from CRAN, but I would appreciate help in converting my data (in the form of a data frame of unique combinations and results) into the required coalition object. The precursor to this I believe to be an ordered list of all coalition values (https://cran.r-project.org/web/packages/GameTheory/vignettes/GameTheory.pdf).
My real data has n ~ 30 'players', and unique combinations = large (say 1000 unique combinations), for which I have 1 and 0 identifiers to describe the combinations. This data is sparsely populated in that I do not have data for all combinations, but will assume combinations not described have zero value. I plan to have one specific 'player' who will appear in all combinations, and act as a baseline.
By way of example this is the data frame I am starting with:
require(GameTheory)
games <- read.csv('C:\\Users\\me\\Desktop\\SampleGames.csv', header = TRUE, row.names = 1)
games
n1 n2 n3 n4 Stakes Wins Success_Rate
1 1 1 0 0 800 60 7.50%
2 1 0 1 0 850 45 5.29%
3 1 0 0 1 150000 10 0.01%
4 1 1 1 0 300 25 8.33%
5 1 1 0 1 1800 65 3.61%
6 1 0 1 1 1900 55 2.89%
7 1 1 1 1 700 40 5.71%
8 1 0 0 0 3000000 10 0.00333%
where n1 is my universal player, and in this instance, I have described all combinations.
To calculate my 'base' coalition value from player {1} alone, I am looking to perform the calculation: 0.00333% (success rate) * all stakes, i.e.
0.00333% * (800 + 850 + 150000 + 300 + 1800 + 1900 + 700 + 3000000) = 105
I'll then have zero values for {2}, {3} and {4} as they never "play" alone in this example.
To calculate my first pair coalition value, I am looking to perform the calculation:
7.5%(800 + 300 + 1800 + 700) + 0.00333%(850 + 150000 + 1900 + 3000000) = 375
This is calculated as players {1,2} base win rate (7.5%) by the stakes they feature in, plus player {1} base win rate (0.00333%) by the combinations he features in that player {2} does not, i.e. exclusive sets.
This logic is repeated for the other unique combinations. For example row 4 would be the combination of {1,2,3} so the calculation is:
7.5%(800+1800) + 5.29%(850+1900) + 8.33%(300+700) + 0.00333%(3000000+150000) = 529
which descriptively is set {1,2} success rate% by Stakes for the combinations it appears in that {3} does not, {1,3} by where {2} does not feature, {1,2,3} by their occurrences, and the base player {1} by examples where neither {2} nor {3} occur.
My expected outcome therefore should look like this I believe:
c(105,0,0,0, 375,304,110,0,0,0, 529,283,246,0, 400)
where the first four numbers are the single player combinations {1} {2} {3} and {4}, the next six numbers are two player combinations {1,2} {1,3} {1,4} (and the null cases {2,3} {2,4} {3,4} which don't exist), then the next four are the three player combinations {1,2,3} {1,2,4} {1,3,4} and the null case {2,3,4}, and lastly the full combination set {1,2,3,4}.
I'd then feed this into the DefineGame function of the package to create my coalitions object.
Appreciate any help: I have tried to be as descriptive as possible. I really don't know where to start on generating the necessary sets and set exclusions.
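The three worked calculations above can be reproduced in base R under one possible reading of the rule: for a coalition S, each game contributes its stakes multiplied by the success rate of the combination formed by that game's players restricted to S (and 0 if that restricted combination has no row). This is a sketch of that reading only, not an established method; the helper name `coalition_value` is hypothetical, the rates are entered as rounded in the table, and it is not guaranteed to match every entry of the expected vector.

```r
# The eight example games from the question (success rates as fractions).
games <- data.frame(
  n1 = c(1, 1, 1, 1, 1, 1, 1, 1),
  n2 = c(1, 0, 0, 1, 1, 0, 1, 0),
  n3 = c(0, 1, 0, 1, 0, 1, 1, 0),
  n4 = c(0, 0, 1, 0, 1, 1, 1, 0),
  Stakes = c(800, 850, 150000, 300, 1800, 1900, 700, 3000000),
  Rate = c(7.50, 5.29, 0.01, 8.33, 3.61, 2.89, 5.71, 0.00333) / 100
)
players <- as.matrix(games[, c("n1", "n2", "n3", "n4")])
key <- apply(players, 1, paste, collapse = "")  # e.g. "1100" for {1,2}

# Hypothetical helper: value of a coalition given as a 0/1 membership vector.
coalition_value <- function(members) {
  # Restrict every game to the coalition's players ...
  restricted <- sweep(players, 2, members, `*`)
  restricted_key <- apply(restricted, 1, paste, collapse = "")
  # ... look up the success rate of that restricted combination (0 if absent) ...
  rate <- games$Rate[match(restricted_key, key)]
  rate[is.na(rate)] <- 0
  # ... and apply it to the stakes of the game.
  sum(rate * games$Stakes)
}

v1   <- coalition_value(c(1, 0, 0, 0))  # {1}
v12  <- coalition_value(c(1, 1, 0, 0))  # {1,2}
v123 <- coalition_value(c(1, 1, 1, 0))  # {1,2,3}
print(round(c(v1, v12, v123)))
```

Coalitions without player 1, such as `c(0, 1, 0, 0)`, come out as 0 under this reading because their restricted combinations never appear as rows. Looping the helper over all 0/1 membership vectors in the order the package expects would then give the input vector for DefineGame.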
Including lagged independent variables - R
I would like to run a regression where I use both the current value and lagged values of a specific independent variable.
My dataset
This is an example extract from my dataset:
dt         nrOfCalls nrOfOrders nrOfOrdersLag1 nrOfOrdersLag2 nrOfOrdersLag3
2016/04/20        17          5              9              7             12
2016/04/21        12          8              5              9              7
2016/04/22        14          4              8              5              9
2016/04/23        15          6              4              8              5
2016/04/24        20         14              6              4              8
2016/04/25        10          3             14              6              4
where nrOfOrdersLagX is the number of orders X days ago. The dataset also contains dummy variables (because of limited space I have not included them in the example extract).
My code
When I run the following code everything works perfectly fine:
reg <- lm(nrOfCalls ~ dummy1+...+dummy6+nrOfOrders, data=trainingSet)
However, when I try including the lagged values of the nrOfOrders regressor (for this example I only include one lagged value), I get some unusual results. I use the following code:
reg <- lm(nrOfCalls ~ dummy1+...+dummy6+nrOfOrders+nrOfOrdersLag1, data=trainingSet)
Instead of merely including the regressor nrOfOrdersLag1, the model includes all kinds of regressors whose names are variations on nrOfOrdersLag1.
Call: lm(formula = nrOfCalls ~ dummy1 + dummy2 + dummy3 + dummy4 + dummy5 + dummy6 + nrOfOrders + nrOfOrdersLag1, data = trainCall) Coefficients: (Intercept) dummy1 dummy2 dummy3 dummy4 604.06334 -114.03241 -229.67540 -270.62292 -220.12409 dummy5 dummy6 nrOfOrders nrOfOrdersLag110707 nrOfOrdersLag11161 -457.22245 -465.17116 0.01729 -249.54641 -10.98526 nrOfOrdersLag111869 nrOfOrdersLag11207 nrOfOrdersLag11234 nrOfOrdersLag11262 nrOfOrdersLag11267 45.36821 33.46161 -17.70615 -384.09745 -413.64804 nrOfOrdersLag11279 nrOfOrdersLag11285 nrOfOrdersLag112945 nrOfOrdersLag11336 nrOfOrdersLag11348 -200.19660 32.75546 -264.04005 -47.13457 79.48368 nrOfOrdersLag11351 nrOfOrdersLag11355 nrOfOrdersLag11363 nrOfOrdersLag11364 nrOfOrdersLag11368 -208.62312 6.83426 -98.71679 170.29583 -93.83054 nrOfOrdersLag11375 nrOfOrdersLag11398 nrOfOrdersLag11456 nrOfOrdersLag11462 nrOfOrdersLag11464 50.54960 14.39958 118.73762 113.72744 190.54445 nrOfOrdersLag11469 nrOfOrdersLag114778 nrOfOrdersLag11486 nrOfOrdersLag11489 nrOfOrdersLag11504 -8.79258 84.35041 66.29121 29.67360 24.30553 nrOfOrdersLag11505 nrOfOrdersLag11511 nrOfOrdersLag11520 nrOfOrdersLag11521 nrOfOrdersLag11527 286.85352 69.76762 -159.45588 -38.90402 53.62128 nrOfOrdersLag11538 nrOfOrdersLag11540 nrOfOrdersLag11564 nrOfOrdersLag115674 nrOfOrdersLag11579 -104.66037 -60.10656 -58.32177 522.56810 77.65481 nrOfOrdersLag11587 nrOfOrdersLag11593 nrOfOrdersLag11603 nrOfOrdersLag11618 nrOfOrdersLag11622 34.63649 31.28570 -124.35673 16.43115 207.99435 nrOfOrdersLag11624 nrOfOrdersLag11626 nrOfOrdersLag11629 nrOfOrdersLag11631 nrOfOrdersLag11635 93.90391 78.94275 155.88327 15.32027 125.02409 nrOfOrdersLag11640 nrOfOrdersLag11645 nrOfOrdersLag11649 nrOfOrdersLag11651 nrOfOrdersLag11653 208.51996 -42.03086 -1.62533 164.73045 12.61157 nrOfOrdersLag11654 nrOfOrdersLag11673 nrOfOrdersLag11683 nrOfOrdersLag11688 nrOfOrdersLag11698 129.26306 -41.56615 137.09095 149.86866 -49.43096 nrOfOrdersLag11699 nrOfOrdersLag11702 nrOfOrdersLag11703 
nrOfOrdersLag11705 nrOfOrdersLag11714 76.86530 202.69027 -70.26281 -173.43605 170.02302 nrOfOrdersLag11715 nrOfOrdersLag11716 nrOfOrdersLag11726 nrOfOrdersLag11749 nrOfOrdersLag11754 34.30252 75.45378 176.16211 76.39492 58.11995 nrOfOrdersLag11757 nrOfOrdersLag11764 nrOfOrdersLag11766 nrOfOrdersLag11772 nrOfOrdersLag11777 133.71731 137.62373 24.95059 -75.96096 54.03353 nrOfOrdersLag11778 nrOfOrdersLag11782 nrOfOrdersLag11793 nrOfOrdersLag11806 nrOfOrdersLag11810 -147.40657 -45.70752 27.76710 94.17449 -191.98461 nrOfOrdersLag11811 nrOfOrdersLag11812 nrOfOrdersLag11814 nrOfOrdersLag11815 nrOfOrdersLag11817 61.04646 145.25908 38.56959 18.22574 140.84081 nrOfOrdersLag11827 nrOfOrdersLag11832 nrOfOrdersLag11839 nrOfOrdersLag11841 nrOfOrdersLag11859 -254.56931 138.30797 -139.32523 -151.50010 39.27760 nrOfOrdersLag11860 nrOfOrdersLag11862 nrOfOrdersLag11868 nrOfOrdersLag11874 nrOfOrdersLag11876 304.88804 150.84361 30.75749 -91.55666 192.43385 nrOfOrdersLag11879 nrOfOrdersLag11880 nrOfOrdersLag11885 nrOfOrdersLag11887 nrOfOrdersLag11891 118.75260 -44.83615 163.35474 194.12038 127.79107 nrOfOrdersLag11896 nrOfOrdersLag11901 nrOfOrdersLag11914 nrOfOrdersLag11919 nrOfOrdersLag11921 82.79870 179.44324 303.18796 242.51540 159.40652 nrOfOrdersLag11928 nrOfOrdersLag11929 nrOfOrdersLag11932 nrOfOrdersLag11937 nrOfOrdersLag11939 484.73958 35.38640 286.54643 46.88513 48.94031 nrOfOrdersLag11952 nrOfOrdersLag11967 nrOfOrdersLag11988 nrOfOrdersLag11994 nrOfOrdersLag11996 265.02228 170.65576 47.77627 317.10968 383.09702 nrOfOrdersLag119987 nrOfOrdersLag12007 nrOfOrdersLag12010 nrOfOrdersLag12017 nrOfOrdersLag12018 416.71786 93.41540 61.71721 73.68938 136.60641 nrOfOrdersLag12019 nrOfOrdersLag12023 nrOfOrdersLag12027 nrOfOrdersLag12034 nrOfOrdersLag12040 88.13672 -214.93168 38.82154 148.72993 -60.63852 nrOfOrdersLag12050 nrOfOrdersLag12051 nrOfOrdersLag12056 nrOfOrdersLag12058 nrOfOrdersLag12060 205.21811 246.46001 163.20151 -0.35863 61.93024 nrOfOrdersLag12073 nrOfOrdersLag12082 
nrOfOrdersLag12087 nrOfOrdersLag12093 nrOfOrdersLag12107 122.50936 -27.13307 -43.74262 366.51938 146.85581 nrOfOrdersLag12119 nrOfOrdersLag12122 nrOfOrdersLag12124 nrOfOrdersLag121319 nrOfOrdersLag12133 119.31341 36.35183 253.68015 115.01838 228.66567 nrOfOrdersLag12136 nrOfOrdersLag12137 nrOfOrdersLag12154 nrOfOrdersLag12167 nrOfOrdersLag12169 -9.97711 121.20416 -448.43096 324.45466 169.37446 nrOfOrdersLag12176 nrOfOrdersLag12180 nrOfOrdersLag12181 nrOfOrdersLag12184 nrOfOrdersLag12186 88.35432 -14.74399 41.03555 310.68640 308.82549 nrOfOrdersLag12189 nrOfOrdersLag12195 nrOfOrdersLag12202 nrOfOrdersLag12204 nrOfOrdersLag12216 121.87542 264.78895 191.52156 281.02113 168.29821 nrOfOrdersLag12219 nrOfOrdersLag12221 nrOfOrdersLag12231 nrOfOrdersLag12236 nrOfOrdersLag12237 218.48030 66.07233 -228.54230 111.06068 162.65347 nrOfOrdersLag12242 nrOfOrdersLag12244 nrOfOrdersLag12246 nrOfOrdersLag12261 nrOfOrdersLag12262 12.05505 114.60872 -123.06406 -45.54485 380.26022 nrOfOrdersLag12268 nrOfOrdersLag12271 nrOfOrdersLag12302 nrOfOrdersLag12304 nrOfOrdersLag12311 4.23556 249.55941 248.38079 103.12194 -71.69000 nrOfOrdersLag12313 nrOfOrdersLag12329 nrOfOrdersLag12345 nrOfOrdersLag12353 nrOfOrdersLag12356 247.93662 207.13958 314.96154 95.08688 300.10247 nrOfOrdersLag12361 nrOfOrdersLag12371 nrOfOrdersLag12376 nrOfOrdersLag12380 nrOfOrdersLag12384 37.27506 -167.84137 66.61313 247.32681 237.73556 nrOfOrdersLag12399 nrOfOrdersLag12406 nrOfOrdersLag12413 nrOfOrdersLag12417 nrOfOrdersLag12420 107.37362 399.28658 275.48695 95.07723 324.87029 nrOfOrdersLag12423 nrOfOrdersLag12434 nrOfOrdersLag12437 nrOfOrdersLag12442 nrOfOrdersLag12446 233.30480 193.45613 250.79606 322.78975 320.40151 nrOfOrdersLag12448 nrOfOrdersLag12449 nrOfOrdersLag12451 nrOfOrdersLag12460 nrOfOrdersLag124708 172.20478 -113.45790 108.52769 305.32173 -134.41931 nrOfOrdersLag12484 nrOfOrdersLag12486 nrOfOrdersLag12493 nrOfOrdersLag12497 nrOfOrdersLag12505 156.35931 -9.49808 223.13247 -67.47891 534.66815 
nrOfOrdersLag12541 nrOfOrdersLag12552 nrOfOrdersLag12563 nrOfOrdersLag12588 nrOfOrdersLag12596 221.35464 1.92188 -53.40846 -473.89923 497.69016 nrOfOrdersLag12611 nrOfOrdersLag12618 nrOfOrdersLag12623 nrOfOrdersLag12632 nrOfOrdersLag12638 175.77150 125.22040 -302.58298 -159.54109 -337.04664 nrOfOrdersLag12646 nrOfOrdersLag12648 nrOfOrdersLag12663 nrOfOrdersLag12665 nrOfOrdersLag12687 539.15416 350.53169 -148.22458 147.67351 -349.52567 nrOfOrdersLag12696 nrOfOrdersLag12713 nrOfOrdersLag12721 nrOfOrdersLag12723 nrOfOrdersLag12743 -42.64843 141.90979 47.07766 -443.50878 356.28944 nrOfOrdersLag12745 nrOfOrdersLag12750 nrOfOrdersLag12753 nrOfOrdersLag12761 nrOfOrdersLag127688 14.65720 13.35666 8.30924 -191.17540 -123.52409 nrOfOrdersLag12802 nrOfOrdersLag12806 nrOfOrdersLag12812 nrOfOrdersLag12815 nrOfOrdersLag12818 128.14604 281.35157 361.79299 8.34690 86.67458 nrOfOrdersLag12824 nrOfOrdersLag12836 nrOfOrdersLag12841 nrOfOrdersLag12842 nrOfOrdersLag12876 518.23720 -357.78788 288.63660 433.15556 158.51341 nrOfOrdersLag12883 nrOfOrdersLag12884 nrOfOrdersLag12901 nrOfOrdersLag12941 nrOfOrdersLag12956 214.74913 68.99485 -208.43888 -297.43011 319.30849 nrOfOrdersLag12996 nrOfOrdersLag13007 nrOfOrdersLag13013 nrOfOrdersLag13023 nrOfOrdersLag13033 321.02569 -88.96746 80.93579 106.97804 -223.88599 nrOfOrdersLag13051 nrOfOrdersLag13072 nrOfOrdersLag13094 nrOfOrdersLag13098 nrOfOrdersLag13127 40.95339 161.48086 524.04025 -94.23016 17.50082 nrOfOrdersLag13152 nrOfOrdersLag13171 nrOfOrdersLag13185 nrOfOrdersLag13202 nrOfOrdersLag13205 -266.11135 8.82232 -107.11441 -141.14442 212.80057 nrOfOrdersLag13222 nrOfOrdersLag13277 nrOfOrdersLag13295 nrOfOrdersLag13321 nrOfOrdersLag13332 187.90431 306.69183 -24.55235 68.42339 -290.11682 nrOfOrdersLag13362 nrOfOrdersLag13378 nrOfOrdersLag13380 nrOfOrdersLag13391 nrOfOrdersLag13476 44.30976 463.85118 276.57882 -282.06457 34.35207 nrOfOrdersLag13488 nrOfOrdersLag13490 nrOfOrdersLag13530 nrOfOrdersLag13578 nrOfOrdersLag13599 217.46608 386.26006 
194.69082 52.45357 406.44931 nrOfOrdersLag13611 nrOfOrdersLag13618 nrOfOrdersLag13626 nrOfOrdersLag13632 nrOfOrdersLag13635 242.81201 -22.19253 23.90163 -395.87751 103.44677 nrOfOrdersLag13674 nrOfOrdersLag13681 nrOfOrdersLag13767 nrOfOrdersLag13841 nrOfOrdersLag13849 200.18354 83.25027 -71.88190 382.05886 -279.73606 nrOfOrdersLag13857 nrOfOrdersLag13874 nrOfOrdersLag13885 nrOfOrdersLag13897 nrOfOrdersLag13908 370.92867 -17.14313 -140.99009 -244.17716 93.79552 nrOfOrdersLag13966 nrOfOrdersLag14009 nrOfOrdersLag14031 nrOfOrdersLag14111 nrOfOrdersLag14160 61.75484 224.96558 -107.99394 -126.12766 572.14222 nrOfOrdersLag14171 nrOfOrdersLag14205 nrOfOrdersLag14312 nrOfOrdersLag14468 nrOfOrdersLag14560 -42.29929 -379.41067 194.25204 -47.50642 -116.49251 nrOfOrdersLag14619 nrOfOrdersLag14640 nrOfOrdersLag14684 nrOfOrdersLag14762 nrOfOrdersLag14776 41.34325 -355.84333 -122.77109 -331.12296 404.86637 nrOfOrdersLag14865 nrOfOrdersLag14959 nrOfOrdersLag14967 nrOfOrdersLag15195 nrOfOrdersLag15218 371.14617 104.60840 -42.74014 99.78008 520.62517 nrOfOrdersLag15402 nrOfOrdersLag16029 nrOfOrdersLag16284 nrOfOrdersLag16321 nrOfOrdersLag16350 529.17004 161.02870 268.77256 74.02159 386.53868 nrOfOrdersLag16418 nrOfOrdersLag16557 nrOfOrdersLag16711 nrOfOrdersLag16722 nrOfOrdersLag16825 -81.37023 190.74905 225.64313 -131.70051 271.39936 nrOfOrdersLag16952 nrOfOrdersLag16996 nrOfOrdersLag17098 nrOfOrdersLag17251 nrOfOrdersLag17279 357.39158 408.46849 210.03477 -25.74894 NA nrOfOrdersLag17292 nrOfOrdersLag17391 nrOfOrdersLag18642 nrOfOrdersLag18670 nrOfOrdersLag18949 262.00528 4.71906 326.28857 49.30983 174.99732 nrOfOrdersLag19202 nrOfOrdersLag19690 nrOfOrdersLag19772 16.13322 15.59552 -62.26111 I have no clue what is happening and why this is going wrong. Anybody that can help me out here? Thanks in advance!
The lagged independent variables were factor variables instead of integer/numeric variables. Having fixed this, the lm call works as intended.
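To illustrate the fix, here is a minimal sketch with made-up values taken from the extract in the question (the vector names `calls` and `lag1` are hypothetical). It assumes the lag column was read in as a factor, which is why lm() created one dummy regressor per distinct value. Converting through as.character() first matters, because as.numeric() on a factor returns the internal level codes rather than the original numbers.

```r
# Hypothetical lag column that was read in as a factor
nrOfOrdersLag1 <- factor(c("9", "5", "8", "4", "6", "14"))
calls <- c(17, 12, 14, 15, 20, 10)

# Wrong: this returns the internal level codes, not the values
codes <- as.numeric(nrOfOrdersLag1)

# Right: go through character first to recover the actual numbers
lag1 <- as.numeric(as.character(nrOfOrdersLag1))

# As a factor, lm() fits one dummy coefficient per level (plus intercept);
# as a numeric variable, it fits a single slope.
n_coef_factor  <- length(coef(lm(calls ~ nrOfOrdersLag1)))
n_coef_numeric <- length(coef(lm(calls ~ lag1)))
```

Checking str(trainingSet) before fitting is a quick way to spot columns that were unintentionally read in as factors.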
cut function and controlled frequency in the intervals
My question is pretty simple: the cut() function lets me choose the breaks along which the range of my vector is divided into intervals. I would like to control the number of observations within each newly created interval, similar to what can be obtained by passing quantiles as the breaks in the cut() call. However, I don't want to use quantiles, because I would like the intervals to be fixed so that I can match them between different databases for further comparison, and I want the same discrete values to appear in the labels of the newly cut vectors. I used to use this for the quantile approach:
df$z<-cut(df$x, quantile(df$x, (0:10)/10), include.lowest=TRUE)
which is fairly simple. My new approach is even simpler and resembles this, for example:
df$z<-cut(df$x, c(0.04,0.055,0.06,0.065,0.07,0.075,0.08,0.085,0.09,0.095,0.11), include.lowest=TRUE)
I then have another variable on which I want to calculate some statistics, according to the levels of the discrete variable. It would go something like this:
df$conf.intx<-ifelse(df$z=="1",t.test(df[df$z=="1",]$y)$conf.int[1], ifelse(df$z=="2",t.test(df[df$z=="2",]$y)$conf.int[1], ifelse(df$z=="3",t.test(df[df$z=="3",]$y)$conf.int[1], ifelse(df$z=="4",t.test(df[df$z=="4",]$y)$conf.int[1],NA))))
But to be able to calculate this kind of t-test confidence interval on each of the 'pools' of y values (each pool has as many values as there are observations in the corresponding interval of the discrete variable), I need to control the number of values within each created interval for z, so that my test remains valid, at least as far as the number of observations is concerned. Simply put, I need an automated procedure that creates the vector of breaks for the z variable so that each interval contains a minimum number of observations. As an added complication, the breaks should be the same for two different databases, and I don't know whether that is possible.
Any help on the matter would be welcome, thank you in advance. EDIT: here is a sample of my data for x. structure(list(x = c(5.319125, 7.3036667, 5.5166167, 7.0308333, 5.6812917, 6.5496583, 5.6621833, 6.4682, 5.4897417, 7.185175, 6.44905, 7.2055833, 7.629375, 6.2282833, 6.6813917, 7.7976, 6.683975, 5.5089083, 7.307475, 7.3958667, 6.2036583, 6.2488833, 5.9372, 6.6180167, 6.4167833, 5.640275, 8.7416917, 8.3134167, 6.8996833, 5.1161917, 7.0606333, 5.2622667, 6.780925, 5.4615417, 6.48185, 5.51585, 6.2224333, 5.3660667, 7.196525, 6.2984083, 7.0137833, 7.4490083, 5.9712333, 6.4287833, 7.6693917, 6.4406417, 5.4135083, 7.16245, 7.2267, 5.820325, 6.066175, 5.760975, 6.4775, 6.2625, 5.5182583, 8.446625, 8.19025, 6.7955333, 4.7899583, 6.5680167, 4.5965917, 6.3539333, 4.6639, 6.0489667, 4.9047833, 5.353625, 4.711425, 6.6268833, 5.5458083, 6.3271917, 6.4591417, 5.1843917, 5.6117167, 7.1828417, 5.6956917, 5.0271917, 6.741875, 6.68305, 4.7859667, 5.3068667, 5.3245, 5.745675, 5.7518917, 5.37945, 8.0030417, 7.7064583, 6.2935333, 5.1838667, 6.9369333, 4.9734583, 6.7257167, 5.0510333, 6.4257667, 5.2858083, 5.7285167, 5.084, 7.0092833, 5.905875, 6.6893417, 6.8319583, 5.5558083, 5.9854833, 7.5552167, 6.064625, 5.3990333, 7.115175, 7.0600167, 5.1644833, 5.6848667, 5.7014417, 6.1051, 6.1186333, 5.7217667, 8.3685417, 8.071325, 6.6547333, 5.5972417, 7.4226, 5.539725, 7.26335, 5.645975, 6.87475, 5.8486167, 6.3001667, 5.5997833, 7.4353167, 6.5089583, 7.213625, 7.3125667, 6.12095, 6.5410083, 8.0639083, 6.6505167, 5.8886417, 7.6301167, 7.5850417, 5.7693667, 6.2480167, 6.1847167, 6.6896167, 6.6323917, 6.1972167, 8.8560333, 8.5501083, 7.1036167, 4.9929583, 6.9839583, 5.3847417, 6.8814417, 5.59555, 6.7867167, 5.7831333, 6.9370917, 5.7400917, 7.6922, 6.3151, 7.084725, 7.0414417, 5.95435, 6.4274167, 7.6692167, 6.9159, 6.0856083, 7.3079583, 7.1937667, 5.744675, 5.946525, 6.0651833, 6.8488833, 6.5924333, 5.772025, 8.3281167, 8.5475917, 6.7952917, 8.248525, 5.1931083, 7.0688917, 5.4793583, 7.0091583, 
5.7593, 7.1053333, 5.9382583, 7.1765417, 6.003075, 7.7699833, 6.2757333, 7.2446583, 7.179275, 6.0013083, 6.447975, 7.7845833, 6.9071083, 6.1009, 7.425425, 7.4619083, 5.9380667, 6.2116, 6.13315, 7.0852, 7.0047417, 6.0763917, 8.5926583, 8.7468417, 7.2485167, 8.5096833, 5.1541, 7.0479917, 5.43065, 6.9689083, 5.7356, 7.0842917, 5.9051667, 7.1283333, 5.9666667, 7.7295583, 6.249925, 7.21005, 7.1427167, 5.9675583, 6.4135667, 7.7448583, 6.874275, 6.0679333, 7.388675, 7.429025, 5.911225, 6.1757167, 6.095225, 7.045775, 6.9870833, 6.0567333, 8.5771167, 8.7541917, 7.3187333, 8.5092083, 5.5746, 7.342925, 5.8561667, 7.4704667, 5.922225, 6.9787, 6.1564167, 7.6059667, 5.9122917, 7.7848833, 6.6192, 7.34055, 7.2352417, 5.9776083, 6.5197583, 7.4891583, 7.2185667, 6.4710167, 7.70945, 7.5078083, 6.1470417, 6.66115, 6.6899333, 7.4454083, 7.2270917, 6.350075, 8.3156667, 8.9007917, 6.7578083, 8.3258083, 5.1996, 6.9688833, 5.3592917, 6.7583417, 5.5623583, 6.756375, 5.7361, 7.120425, 5.6567, 7.6174667, 6.1474833, 7.1442167, 6.74475, 5.5820333, 6.0106, 7.142675, 6.667475, 5.9067917, 7.2392, 7.058675, 5.6394417, 5.9119167, 5.8367333, 6.798025, 6.694675, 5.8565917, 8.6035083, 8.912375, 7.0501083, 8.38045, 4.8478083, 6.7493167, 5.3686667, 6.5152333, 5.282025, 6.5464333, 5.5085583, 6.870975, 5.4757667, 7.318, 5.92225, 6.9300417, 6.5758083, 5.4233083, 5.8295583, 7.0451, 6.4790083, 5.68255, 6.9632833, 6.9965833, 5.5005667, 5.717725, 5.5938083, 6.5309, 6.4824583, 5.4429833, 8.072575, 8.3635, 6.5797167, 8.0352333, 4.6289833, 6.64105, 4.8883833, 6.2025833, 5.2291833, 6.4814667, 5.2211083, 6.5780083, 5.196275, 7.030725, 5.6001583, 6.620475, 6.2858333, 5.114375, 5.5424417, 6.7784917, 6.1561333, 5.339375, 6.6249083, 6.6248583, 5.139775, 5.4195, 5.4531833, 6.3348583, 6.4041417, 5.292, 7.6243833, 7.9624583, 6.3226417, 7.761175, 4.8419083, 6.8384083, 5.3500417, 6.5903333, 5.33275, 6.732575, 5.4486, 6.8069417, 5.4569583, 7.26275, 5.835525, 6.8680333, 6.6712333, 5.4720417, 5.904325, 7.1506917, 6.4746833, 
5.638675, 6.9570667, 7.0017333, 5.5033667, 5.6859333, 5.651875, 6.5903, 6.529725, 5.4819667, 7.971975, 8.2337833, 6.5815333, 7.9736583, 5.7711917, 7.543325, 5.8986917, 7.5081333, 6.2920333, 7.5321667, 6.4908917, 7.7616583, 6.4509417, 8.08035, 6.8219, 7.7939167, 7.6491333, 6.4773583, 6.9338667, 8.1865583, 7.3998917, 6.572125, 7.9198417, 8.0568, 6.5880333, 6.8299667, 6.7399833, 7.6436, 7.509275, 6.5139833, 9.1520167, 9.3580667, 7.65415, 9.0725167, 5.7483583, 7.5230417, 5.89105, 7.4808833, 6.1969667, 7.4923583, 6.4092583, 7.70695, 6.3970833, 8.0971333, 6.7949083, 7.76445, 7.6170167, 6.4494333, 6.8997, 8.1575333, 7.3728417, 6.544075, 7.888, 8.0215, 6.5484, 6.7911667, 6.7121917, 7.6179083, 7.4731167, 6.4629167, 9.1226333, 9.3307083, 7.6230583, 9.024875, 5.543925, 7.1460833, 5.6575583, 7.5986083, 6.027075, 7.4386167, 6.3500333, 7.6694833, 6.3682583, 8.0843333, 6.7181083, 7.7376, 7.5818583, 6.4010667, 6.8440083, 8.1217917, 7.3290833, 6.5187333, 7.8591667, 7.9898583, 6.5051, 6.7251167, 6.6881333, 7.477675, 7.3571333, 6.3351833, 8.881575, 9.12315, 7.3851, 8.8008667, 5.3437833, 7.1560417, 5.5748, 7.4622583, 5.9412417, 7.3428667, 6.2594167, 7.5839167, 6.28685, 8.0270917, 6.6388333, 7.6611, 7.50065, 6.3217167, 6.7594417, 8.0401167, 7.252425, 6.444, 7.77975, 7.9104167, 6.42495, 6.6421667, 6.6103333, 7.3489417, 7.23205, 6.2059333, 8.726725, 8.994625, 7.2460917, 8.660125, 5.2502833, 7.2591, 5.6425417, 6.889925, 5.353675, 6.50635, 6.260675, 7.4236583, 5.9076417, 7.3915, 6.2134917, 7.1645333, 6.922675, 6.0295417, 6.1687917, 7.2771083, 6.6152333, 6.3299417, 7.167325, 6.647275, 5.726475, 5.93905, 6.2888583, 6.7497167, 6.4364083, 5.8906583, 7.6052917, 8.039425, 6.5672833, 7.8754667, 6.3086333, 5.352025, 7.2849417, 5.7184833, 6.9675917, 5.5615333, 6.6157917, 6.3505417, 7.4881, 6.0007417, 7.5110583, 6.35525, 7.254075, 7.0289083, 6.1994417, 6.2860833, 7.372575, 6.735975, 6.4628917, 7.3102167, 6.8619417, 5.9123667, 6.1611917, 6.4854083, 6.8942417, 6.563625, 6.0610083, 7.941625, 8.6969167, 
6.66075, 8.1197167, 6.2802, 3.9638, 5.870825, 4.1852, 5.5841417, 4.3007583, 5.2352167, 4.4281417, 5.819425, 4.1990917, 5.9338917, 4.89765, 5.7204333, 5.6546833, 4.5632167, 4.9803333, 5.6962417, 5.247725, 4.7092583, 6.0145417, 5.6403917, 4.4016917, 4.7181, 4.5007833, 5.2828917, 5.1314167, 4.7492, 6.777575, 6.9040083, 4.9760583, 6.4471917, 5.0952833, 3.712725, 5.8215333, 4.025725, 5.5635, 4.2354083, 5.143525, 4.4900083, 5.6802417, 4.1214333, 5.8128, 4.7525583, 5.6412583, 5.5534917, 4.487475, 4.8237833, 5.6156917, 5.0573, 4.5755417, 5.8096083, 5.5252083, 4.3145583, 4.5437417, 4.194675, 5.0100833, 4.8972333, 4.590025, 6.6441417, 6.5789417, 4.6947667, 6.1648167, 4.8517333, 3.982925, 5.7966833, 4.1607083, 5.5564833, 4.2557417, 5.2304083, 4.8661333, 5.912875, 4.4988333, 6.03915, 4.9131583, 5.8518667, 5.6578583, 4.773225, 4.8958583, 5.8759833, 5.204725, 4.8961667, 5.9217, 5.58395, 4.5410667, 4.73445, 4.5922333, 5.2517333, 5.0220333, 4.619475, 6.4883667, 6.429175, 4.6796417, 6.3171083, 4.93615, 3.9278833, 5.7590417, 4.1155667, 5.612725, 4.2199833, 5.2126667, 4.805275, 5.8888833, 4.4363, 6.0380083, 4.892, 5.8192083, 5.64205, 4.708825, 4.8751583, 5.833775, 5.2210417, 4.853225, 5.924225, 5.5856583, 4.5386167, 4.7280917, 4.5618, 5.264425, 5.03855, 4.5539, 6.4993, 6.4900667, 4.6749083, 6.2961333, 4.918525, 4.0890583, 6.33385, 4.3470083, 5.9645, 4.6541833, 5.5438667, 4.9556583, 6.1590583, 4.6379417, 6.2876833, 5.2235167, 6.1387167, 6.0547583, 4.9545667, 5.254125, 6.05395, 5.4813417, 4.9971333, 6.2266583, 5.9172833, 4.7275917, 4.9274917, 4.443575, 5.3164917, 5.2507083, 5.1704583, 7.173075, 6.9351583, 5.0816667, 6.5568, 5.3417667, 5.1705167, 7.0777833, 5.6253333, 7.231225, 5.5799167, 6.6942917, 6.1014583, 7.538725, 5.7152667, 7.459275, 6.2406083, 7.064925, 6.9234417, 5.8328833, 6.1819583, 7.2127583, 6.8071583, 6.2599417, 7.2975417, 6.973875, 5.804125, 6.1944667, 6.38855, 7.0553583, 6.8393167, 6.1275417, 7.9986833, 8.5846, 6.4682167, 8.0134583, 6.1805917, 5.0699583, 6.9006667, 
5.36365, 6.9204917, 5.4478667, 6.5391583, 6.0647417, 7.2951667, 5.6632833, 7.25595, 6.1057333, 6.9578417, 6.8235583, 5.8671833, 6.0716417, 7.060175, 6.5401, 6.1229417, 7.1305083, 6.7823417, 5.62415, 5.9202, 5.9957167, 6.7142167, 6.4706417, 5.9004667, 7.8304583, 8.2144667, 6.1530583, 7.6896417, 5.9285333, 4.2625417, 5.9677583, 4.58695, 6.0400083, 4.4215333, 5.6052833, 5.04165, 6.48845, 4.6423583, 6.1688833, 5.0256167, 5.926725, 5.7214667, 4.746375, 4.9828, 6.1583083, 5.6903, 5.217375, 6.1341583, 5.7868083, 4.5895333, 4.98235, 5.159725, 5.7866167, 5.6300833, 4.882975, 6.7210833, 7.4314833, 5.2493083, 6.8503833, 5.2225583, 3.8417833, 5.9798, 4.1168583, 5.63415, 4.3311333, 5.0777667, 4.6606833, 5.789425, 4.3565167, 5.9736167, 4.8910667, 5.9445417, 5.699275, 4.6897167, 4.9036083, 5.8767, 5.088675, 4.6224417, 5.8052833, 5.5697167, 4.3237, 4.6084333, 4.2958833, 5.1394417, 5.0137583, 4.7711, 6.771275, 6.5984417, 4.845625, 6.3338083, 5.1370333, 3.1820167, 5.2699667, 3.4827167, 5.0992583, 3.7040583, 4.6358583, 4.1604917, 5.2488333, 3.7522, 5.3774167, 4.2636167, 5.1998167, 5.0456333, 4.051475, 4.289175, 5.1718917, 4.5787083, 4.1461667, 5.2983167, 5.03025, 3.8709333, 4.0917167, 3.731925, 4.5584167, 4.4200333, 4.061375, 6.064225, 6.02975, 4.1590167, 5.6589083, 4.2614833, 3.68695, 5.587375, 3.91725, 5.3387, 4.0061667, 4.9563833, 4.1942, 5.6720583, 3.9584333, 5.6873583, 4.6251, 5.4801417, 5.3975583, 4.2382, 4.6710917, 5.4898083, 5.0469667, 4.4950083, 5.72005, 5.46085, 4.30355, 4.5525917, 4.3681667, 5.1723167, 5.0331417, 4.4793083, 6.5492917, 6.720225, 4.7550917, 6.197775, 4.8082917, 4.09925, 5.986525, 4.3104417, 5.68455, 4.4287167, 5.3555667, 4.5191083, 5.9269833, 4.2695917, 5.9984167, 4.981225, 5.8049917, 5.7680667, 4.5736667, 5.0673583, 5.7443583, 5.2811083, 4.719175, 6.0376667, 5.73875, 4.3947333, 4.8157333, 4.6093417, 5.3906417, 5.2357417, 4.684825, 6.8885583, 7.018425, 5.0878167, 6.5122333, 5.2084, 3.810525, 6.2600083, 3.6246583, 5.7396417, 4.0617917, 5.6724583, 4.2505833, 
4.7518417, 4.1232, 6.208375, 4.5881167, 5.252575, 5.71795, 4.0840583, 4.700325, 6.2360333, 4.701725, 3.922525, 5.5162167, 5.6220333, 3.8836833, 4.4883667, 4.5398583)), .Names = "x", row.names = c(NA, -962L ), class = "data.frame") Assuming I want 30 values per interval (the 'n'), here is the code I used: df$z<-cut(df$x, seq(30,length(df$x),by=30)/length(df$x), include.lowest=T) Which gives me: > table(df$z) [0.0312,0.0624] (0.0624,0.0936] (0.0936,0.125] (0.125,0.156] (0.156,0.187] (0.187,0.218] (0.218,0.249] (0.249,0.281] (0.281,0.312] (0.312,0.343] (0.343,0.374] 0 0 0 0 0 0 0 0 0 0 0 (0.374,0.405] (0.405,0.437] (0.437,0.468] (0.468,0.499] (0.499,0.53] (0.53,0.561] (0.561,0.593] (0.593,0.624] (0.624,0.655] (0.655,0.686] (0.686,0.717] 0 0 0 0 0 0 0 0 0 0 0 (0.717,0.748] (0.748,0.78] (0.78,0.811] (0.811,0.842] (0.842,0.873] (0.873,0.904] (0.904,0.936] (0.936,0.967] (0.967,0.998] 0 0 0 0 0 0 0 0 0 What I want is a similar result to what I get with quantiles: df$zbis<-cut(df$x, quantile(df$x, (0:20)/20), include.lowest=T) table(df$zbis) [3.18,4.29] (4.29,4.62] (4.62,4.89] (4.89,5.14] (5.14,5.33] (5.33,5.53] (5.53,5.66] (5.66,5.8] (5.8,5.94] (5.94,6.1] (6.1,6.26] (6.26,6.45] (6.45,6.58] (6.58,6.74] (6.74,6.93] 49 48 48 48 48 48 48 48 48 48 48 48 48 48 48 (6.93,7.14] (7.14,7.34] (7.34,7.62] (7.62,8.06] (8.06,9.36] 48 48 48 48 49 Except I'd like this to be reproducible for another database, and so I can't use the quantile function, since I would not get the same intervals on a different database. SECOND EDIT: here is the second sample from another database. 'x' is the same variable, and they have similar ranges. 
structure(list(x = c(5.319125, 7.3036667, 5.5166167, 7.0308333, 5.6812917, 6.5496583, 5.6621833, 6.4682, 5.4897417, 7.185175, 6.44905, 7.2055833, 7.629375, 6.2282833, 6.6813917, 7.7976, 6.683975, 5.5089083, 7.307475, 7.3958667, 6.2036583, 6.2488833, 5.9372, 6.6180167, 6.4167833, 5.640275, 8.7416917, 8.3134167, 6.8996833, 5.1931083, 7.0688917, 5.4793583, 7.0091583, 5.7593, 7.1053333, 5.9382583, 7.1765417, 6.003075, 7.7699833, 6.2757333, 7.2446583, 7.179275, 6.0013083, 6.447975, 7.7845833, 6.9071083, 6.1009, 7.425425, 7.4619083, 5.9380667, 6.2116, 6.13315, 7.0852, 7.0047417, 6.0763917, 8.5926583, 8.7468417, 7.2485167, 8.5096833, 5.177275, 7.09985, 5.6444667, 7.0102417, 5.7303833, 7.0383333, 5.9870583, 7.3342083, 5.9363667, 7.7753333, 6.38355, 7.389575, 7.0396667, 5.889625, 6.29395, 7.51135, 6.940925, 6.1455417, 7.4281833, 7.4657167, 5.9707083, 6.1902083, 6.0936167, 6.9595167, 6.85065, 5.8525, 8.5148083, 8.805625, 7.00665, 8.4457, 5.3437833, 7.1560417, 5.5748, 7.4622583, 5.9412417, 7.3428667, 6.2594167, 7.5839167, 6.28685, 8.0270917, 6.6388333, 7.6611, 7.50065, 6.3217167, 6.7594417, 8.0401167, 7.252425, 6.444, 7.77975, 7.9104167, 6.42495, 6.6421667, 6.6103333, 7.3489417, 7.23205, 6.2059333, 8.726725, 8.994625, 7.2460917, 8.660125, 3.614125, 5.6345917, 3.9410417, 5.2901417, 4.0147333, 4.766825, 4.4500417, 5.5189, 4.11375, 5.6350667, 4.5756917, 5.5998833, 5.3663, 4.44405, 4.5767417, 5.552025, 4.847425, 4.4382583, 5.5769417, 5.2390667, 4.0610917, 4.4054833, 4.1917, 4.9029083, 4.6935917, 4.3499417, 6.0562333, 6.081225, 4.45855, 6.0121583, 4.740275, 4.5028, 6.4177833, 4.8716417, 6.1469917, 4.6208917, 5.7748083, 5.4530083, 6.694125, 5.0944333, 6.5123167, 5.3257083, 6.2765333, 6.0149167, 5.1815583, 5.30715, 6.4149083, 5.82245, 5.515425, 6.3654333, 5.8472833, 4.9798917, 5.1833583, 5.5210333, 6.0410667, 5.7377917, 5.2666083, 7.0378167, 7.744175, 5.718725, 7.3220583, 5.24325, 5.3256, 7.2155167, 5.696925, 7.0029667, 5.5235, 6.7261083, 6.2810667, 7.546825, 5.90915, 7.3299167, 
6.2227333, 7.147075, 6.9142417, 6.0012083, 6.1725333, 7.29815, 6.7, 6.3454583, 7.2129583, 6.7559833, 5.8115, 6.0756667, 6.458225, 6.9969167, 6.778825, 6.2245833, 8.0809583, 8.875325, 6.7210917, 8.3203, 6.3513, 5.2591333, 7.1404917, 5.6266417, 6.9356, 5.4568, 6.6604, 6.206025, 7.48525, 5.8323667, 7.24635, 6.1446583, 7.066275, 6.8334, 5.9198667, 6.09505, 7.2206583, 6.63085, 6.270075, 7.1397333, 6.689125, 5.7441333, 6.042575, 6.38255, 6.9325833, 6.7175667, 6.1592, 8.00415, 8.8051167, 6.647125, 8.2465667, 6.2788167, 6.49435, 8.1847583, 6.664475, 8.0528583, 6.6822417, 7.376, 7.1517833, 8.2306833, 6.8584583, 8.3052167, 7.288375, 8.2758583, 7.7162583, 7.2807833, 7.0459, 8.2507833, 7.5855, 7.0505917, 8.2230167, 8.1669, 6.8184667, 6.9700583, 7.0936167, 7.7615667, 7.6239083, 7.0921667, 9.02585, 9.3416167, 7.6256333, 9.0869333, 8.0984667, 4.116325, 6.1680917, 4.56965, 5.797725, 4.36085, 5.42455, 5.144075, 6.1531833, 4.77825, 6.2533417, 5.0192083, 5.99395, 5.6934083, 4.9074167, 4.9823083, 5.9861667, 5.4068833, 5.1872833, 6.10095, 5.659325, 4.6632833, 4.86315, 5.221775, 5.5878, 5.3217083, 4.8202333, 6.4883083, 6.69355, 4.952075, 6.7075583, 5.00015, 5.2502833, 7.2591, 5.6425417, 6.889925, 5.353675, 6.50635, 6.260675, 7.4236583, 5.9076417, 7.3915, 6.2134917, 7.1645333, 6.922675, 6.0295417, 6.1687917, 7.2771083, 6.6152333, 6.3299417, 7.167325, 6.647275, 5.726475, 5.93905, 6.2888583, 6.7497167, 6.4364083, 5.8906583, 7.6052917, 8.039425, 6.5672833, 7.8754667, 6.3086333, 5.352025, 7.2849417, 5.7184833, 6.9675917, 5.5615333, 6.6157917, 6.3505417, 7.4881, 6.0007417, 7.5110583, 6.35525, 7.254075, 7.0289083, 6.1994417, 6.2860833, 7.372575, 6.735975, 6.4628917, 7.3102167, 6.8619417, 5.9123667, 6.1611917, 6.4854083, 6.8942417, 6.563625, 6.0610083, 7.941625, 8.6969167, 6.66075, 8.1197167, 6.2802, 3.9638, 5.870825, 4.1852, 5.5841417, 4.3007583, 5.2352167, 4.4281417, 5.819425, 4.1990917, 5.9338917, 4.89765, 5.7204333, 5.6546833, 4.5632167, 4.9803333, 5.6962417, 5.247725, 4.7092583, 6.0145417, 
5.6403917, 4.4016917, 4.7181, 4.5007833, 5.2828917, 5.1314167, 4.7492, 6.777575, 6.9040083, 4.9760583, 6.4471917, 5.0952833, 3.712725, 5.8215333, 4.025725, 5.5635, 4.2354083, 5.143525, 4.4900083, 5.6802417, 4.1214333, 5.8128, 4.7525583, 5.6412583, 5.5534917, 4.487475, 4.8237833, 5.6156917, 5.0573, 4.5755417, 5.8096083, 5.5252083, 4.3145583, 4.5437417, 4.194675, 5.0100833, 4.8972333, 4.590025, 6.6441417, 6.5789417, 4.6947667, 6.1648167, 4.8517333, 4.1059833, 5.9023167, 4.2812417, 5.6593917, 4.3587583, 5.3359583, 4.983275, 6.0223417, 4.6178333, 6.1545333, 5.0244667, 5.9596, 5.7608833, 4.8875333, 4.9990583, 5.9919333, 5.3157417, 5.0169333, 6.024775, 5.6717167, 4.6372083, 4.8370583, 4.7311333, 5.3704, 5.133575, 4.7174917)), .Names = "x", row.names = c(NA, -455L), class = "data.frame")
Updated after some comments: Since you state that a minimum number of cases in each group would be fine for you, I'd go with Hmisc::cut2:
v <- rnorm(10, 0, 1)
Hmisc::cut2(v, m = 3) # minimum of 3 cases per group
The documentation for cut2 states:
m: desired minimum number of observations in a group. The algorithm does not guarantee that all groups will have at least m observations.
The same cuts for separate variables
If the distributions of your variables are very similar, you could extract the exact cutpoints by setting the argument onlycuts = TRUE and reuse them for the other variables. If the distributions are different, though, you will end up with few cases in some intervals. Using your data:
library(magrittr)
library(Hmisc)
cuts <- cut2(df1$x, g = 20, onlycuts = TRUE) # determine cuts based on df1
cut2(df1$x, cuts = cuts) %>% table
cut2(df2$x, cuts = cuts) %>% table*2 # multiplied by two for better comparison
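The idea of sharing one set of cutpoints across both databases can also be sketched in base R. This is an alternative illustration rather than the approach from the answer above: it assumes the breaks are taken from the quantiles of the pooled data and that the outer bins are left open-ended so no value falls outside them. The vectors `x1` and `x2` are simulated stand-ins for the two samples, not the data from the question.

```r
set.seed(1)
# Simulated stand-ins for the x column of the two databases (assumption)
x1 <- rnorm(962, mean = 6.3, sd = 1.1)
x2 <- rnorm(455, mean = 6.2, sd = 1.2)

# Inner cutpoints from the pooled deciles; outer bins are open-ended
pooled <- c(x1, x2)
inner  <- quantile(pooled, probs = seq(0.1, 0.9, by = 0.1), names = FALSE)
breaks <- c(-Inf, inner, Inf)

# Apply the same breaks to both samples, so the labels match exactly
z1 <- cut(x1, breaks)
z2 <- cut(x2, breaks)

# Per-interval counts for each database; the smallest cell shows whether
# every interval has enough observations for the planned t-tests
counts <- rbind(db1 = table(z1), db2 = table(z2))
counts
min(counts)
```

If min(counts) is below the required group size, reducing the number of quantile steps (fewer, wider intervals) is one way to raise it.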
This is a good example of how NOT to pose a question. Now that we finally have an example, it is possible to post code that applies to it. (You apparently pasted the exact code from my comment without thinking about how to express 'n' and 'N' in the context of the problem.) I did need to add prob=c( seq(...) , 1) in order to capture the highest values. This assumes that you want groups of size 100 (although it is still very unclear why this is needed).
x$xct <- cut( x$x, breaks=quantile(x$x, prob=c( seq(100, length(x$x), by=100)/length(x$x) , 1) ))
table(x$xct)
(4.64,5.17] (5.17,5.57] (5.57,5.85] (5.85,6.17] (6.17,6.51] (6.51,6.85]
        100         100         100         100         100         100
(6.85,7.26] (7.26,7.94] (7.94,9.36]
        100         100          62
Note that the first break is the 100/N quantile, so the lowest 100 values fall below every interval and are coded NA; to keep them, prepend 0 to prob and set include.lowest=TRUE.
How can I apply fisher test on this set of data (nominal variables)
I'm pretty new to statistics:
fisher = function(idxToTest, idxATI){
  idxDependent=c()
  dependent=c()
  p = c()
  for(i in c(1:length(idxToTest))) {
    tbl = table(data[[idxToTest[i]]], data[[idxATI]])
    rez = fisher.test(tbl, workspace = 20000000000)
    if(rez$p.value<0.1){
      dependent=c(dependent, TRUE)
      if(rez$p.value<0.1){
        idxDependent = c(idxDependent, idxToTest[i])
      }
    } else{
      dependent = c(dependent, FALSE)
    }
    p = c(p, rez$p.value)
  }
}
This is the function I use. It seems to work. What I have understood so far is that I have to pass as the first parameter data like:
            Men Women
Dieting      10    30
Non-dieting   5    60
My data comes from a CSV:
data = read.csv('***.csv', header = TRUE, sep=',');
My first problem is that I don't know how to convert from:
Loan.Purpose Home.Ownership
lp_value_1   ho_value_2
lp_value_1   ho_value_2
lp_value_2   ho_value_1
lp_value_3   ho_value_2
lp_value_2   ho_value_3
lp_value_4   ho_value_2
lp_value_3   ho_value_3
to:
           ho_value_1 ho_value_2 ho_value_3
lp_value_1          0          2          0
lp_value_2          1          0          1
lp_value_3          0          1          1
lp_value_4          0          1          0
The second issue is that I don't know what the second parameter should be.
POST UPDATE: This is what I get using fisher.test(myTable):
Error in fisher.test(test) : FEXACT error 501. The hash table key cannot be computed because the largest key is larger than the largest representable int. The algorithm cannot proceed. Reduce the workspace size or use another algorithm.
where myTable is:
                   MORTGAGE NONE OTHER OWN RENT
car                      18    0     0   5   27
credit_card             190    0     2  38  214
debt_consolidation      620    0     2  87  598
educational               5    0     0   3    7
...
Basically, Fisher tests only work on smallish data sets because they require a lot of memory. That is fine, though, because chi-square tests make minimal additional assumptions and are easier on the computer. Just do: chisq.test(data$Loan.Purpose, data$Home.Ownership) to get your p-values. Make sure you read through and understand the help page for chisq.test, especially the examples at the bottom. http://stat.ethz.ch/R-manual/R-patched/library/stats/html/chisq.test.html Then look at a mosaic plot to see the quantities, like: mosaicplot(table(data$Loan.Purpose, data$Home.Ownership)) (mosaicplot needs a contingency table or a formula, not two raw vectors). This reference explains how mosaic plots work: http://alumni.media.mit.edu/~tpminka/courses/36-350.2001/lectures/day12/
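As a small worked sketch of the steps discussed above, the following uses the seven example rows from the question as a stand-in for the CSV. It shows that table() builds the contingency table from the two columns, and that fisher.test() accepts simulate.p.value = TRUE to estimate the p-value by Monte Carlo instead of exact enumeration, which may sidestep the workspace error on larger tables. The object names `tbl`, `mc` and `chi` are only illustrative.

```r
# Stand-in for the CSV: the seven example rows from the question
data <- data.frame(
  Loan.Purpose = c("lp_value_1", "lp_value_1", "lp_value_2", "lp_value_3",
                   "lp_value_2", "lp_value_4", "lp_value_3"),
  Home.Ownership = c("ho_value_2", "ho_value_2", "ho_value_1", "ho_value_2",
                     "ho_value_3", "ho_value_2", "ho_value_3")
)

# table() cross-tabulates the two nominal variables
tbl <- table(data$Loan.Purpose, data$Home.Ownership)
tbl

# Monte Carlo version of Fisher's exact test (B = number of replicates)
set.seed(42)
mc <- fisher.test(tbl, simulate.p.value = TRUE, B = 10000)
mc$p.value

# Chi-square test; with counts this small R warns that the approximation
# may be inaccurate, so the warning is suppressed here for the toy data
chi <- suppressWarnings(chisq.test(tbl))
chi$p.value
```

On the real data, the second argument of the question's function would simply be the name or index of the column to test against, since both columns end up in the same table() call.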