Dropping small sample size from a factor plot - r

I'm new to R, so forgive my ignorance. I'm playing around with a dataset that records the mileage my car has achieved since I got it. Here's the data in .csv format. (Note: I keep this data in Excel, and when I saved it as space-delimited .txt, one line kept throwing an error in read.table saying there weren't the right number of columns, so I switched to .csv and it worked fine.)
Date,Miles,Gallons,Price.per.Gallon,Total.Cost,Grade,MPG,Price.per.Mile,Cumulative.Miles,Cumulative.Gallons,Cumulative.Cost,Cumulative.MPG,Cumulative.Price.per.Mile,Gas.Source,Car.Said,Delta,Average.Price.of.Gas,Avg.Temp
6/8/2011,391.8,9.751,3.749,36.556499,Regular,40.18049431,0.093303979,570.4,9.751,36.56,40.18049431,0.064095372,Dealer,41.18,1,3.74935904,82.8
6/22/2011,441.2,9.566,3.359,32.132194,Regular,46.12168095,0.072829089,1011.6,19.317,68.692194,43.14334524,0.067904502,Speedway,47,0.878319047,3.556048765,73.2
7/7/2011,460.6,9.594,3.599,34.528806,Regular,48.0091724,0.074964842,1472.2,28.911,103.221,44.75805057,0.070113436,BP,49.4,1.390827601,3.570301961,79.5
7/18/2011,397.4,8.178,3.319,27.142782,Regular,48.59378821,0.068300911,1869.6,37.089,130.363782,45.60381784,0.069728168,Shell,45.7,2.893788212,3.514890722,83.1
7/26/2011,368.7,8.959,3.359,30.093281,Regular,41.15414667,0.081619965,2238.3,46.048,160.457063,44.73809937,0.071687023,Kroger,42.9,1.745853332,3.484560958,79.1
8/8/2011,436.3,9.845,3.559,35.038355,Regular,44.31691214,0.080307942,2674.6,55.893,195.495418,44.6639114,0.073093329,Kroger,48,3.683087862,3.49767266,76
8/9/2011,262.2,4.986,3.479,17.346294,Regular,52.58724428,0.066156728,2936.8,60.879,212.841712,45.31283365,0.072474023,Shell,46.9,5.687244284,3.496143366,74.5
8/13/2011,250.1,5.887,3.369,19.833303,Regular,42.48343808,0.079301491,3186.9,66.766,232.675015,45.0633556,0.073009826,mobil,45.5,3.016561916,3.484932675,74.1
8/14/2011,424.4,8.699,3.759,32.699541,Regular,48.78721692,0.077048871,3611.3,75.465,265.374556,45.49261247,0.073484495,Speedway,49,0.212783079,3.516524959,68
8/18/2011,437,9.594,3.399,32.610006,regular,45.54930165,0.074622439,4048.3,85.059,297.984562,45.49900657,0.073607332,Speedway,47.6,2.050698353,3.503269049,77.1
8/30/2011,407.3,9.244,3.429,31.697676,Regular,44.06101255,0.077823904,4455.6,94.303,329.682238,45.35804799,0.073992782,Shell,48.6,4.538987451,3.495988866,66.6
9/10/2011,347.3,7.992,3.549,28.363608,Regular,43.45595596,0.081668897,4802.9,102.295,358.045846,45.20944328,0.074547845,Meijer,49.6,6.144044044,3.500130466,65
9/21/2011,375,8.874,3.369,29.896506,Regular,42.25828262,0.079724016,5177.9,111.169,387.942352,44.97386861,0.07492272,Meijer,44.9,2.641717377,3.489663054,67.5
10/5/2011,404.8,9.243,3.079,28.459197,Regular,43.79530455,0.07030434,5582.7,120.412,416.401549,44.88340033,0.074587843,UDF,45.4,1.604695445,3.458139961,61.5
10/14/2011,376.5,8.715,3.249,28.315035,Regular,43.20137694,0.075205936,5959.2,129.127,444.716584,44.76987772,0.074626894,UDF,46.4,3.198623064,3.444024751,56.4
10/23/2011,382.8,8.953,3.199,28.640647,Regular,42.75661789,0.074818827,6342,138.08,473.357231,44.63933951,0.074638479,Speedway,43.8,1.043382107,3.428137536,50.3
10/31/2011,403.4,9.517,3.299,31.396583,Regular,42.38730692,0.077829903,6745.4,147.597,504.753814,44.49412928,0.074829338,Kroger,45.7,3.312693076,3.419810796,47.5
11/15/2011,402.8,9.146,3.249,29.715354,Regular,44.04111087,0.073771981,7148.2,156.743,534.469168,44.46769553,0.074769756,UDF,45.1,1.058889132,3.409843936,54.4
11/29/2011,361.1,9.209,3.149,28.999141,Regular,39.21164079,0.080307785,7509.3,165.952,563.468309,44.1760268,0.075036063,BP,41.7,2.488359214,3.395369197,42.8
12/10/2011,354.2,9.23,3.199,29.52677,Regular,38.37486457,0.083361858,7863.5,175.182,592.995079,43.87037481,0.075411087,Shell,40.3,1.925135428,3.385022885,22.8
12/19/2011,357.4,8.957,2.999,26.862043,Regular,39.90175282,0.075159605,8220.9,184.139,619.857122,43.67733071,0.075400154,UDF,41.3,1.398247181,3.366245727,41.8
1/5/2012,322.6,8.549,3.459,29.570991,Regular,37.73540765,0.091664572,8543.5,192.688,649.428113,43.41370506,0.076014293,Speedway,41,3.26459235,3.370360962,32.3
1/14/2012,370,9.148,3.319,30.362212,Regular,40.44599913,0.082060032,8913.5,201.836,679.790325,43.27919697,0.076265252,Shell,42,1.554000875,3.368033081,17.9
1/28/2012,327.3,9.108,3.329,30.320532,Regular,35.93544137,0.09263835,9240.8,210.944,710.110857,42.96211317,0.076845171,BP,37.5,1.56455863,3.366347737,32.1
2/9/2012,307,7.971,3.399,27.093429,Regular,38.51461548,0.088252212,9547.8,218.915,737.204286,42.80017358,0.077211953,Shell,41.1,2.585384519,3.367536651,28.8
2/16/2012,370.5,10.057,3.229,32.474053,Regular,36.84001193,0.087649266,9918.3,228.972,769.678339,42.53838897,0.077601841,Speedway,42.2,5.359988068,3.361451789,44
2/29/2012,406.3,9.518,3.759,35.778162,Regular,42.6875394,0.088058484,10324.6,238.49,805.456501,42.54434148,0.078013337,Shell,42.9,0.212460601,3.377317711,54.1
3/14/2012,370.6,9.812,3.699,36.294588,Regular,37.77007746,0.097934668,10695.2,248.302,841.751089,42.35567978,0.078703632,UDF,40.5,2.729922544,3.390029436,63.6
3/23/2012,357.6,7.999,3.929,31.428071,Regular,44.7055882,0.087886105,11052.8,256.301,873.17916,42.429019,0.07900072,Shell,43.1,1.605588199,3.406850383,66
4/3/2012,252.5,4.57,3.849,17.58993,Regular,55.25164114,0.069663089,11305.3,260.871,890.76909,42.65364874,0.078792167,Meijer,41.9,13.35164114,3.414596065,58.6
4/13/2012,382.3,9.416,3.629,34.170664,Regular,40.6011045,0.089381805,11687.6,270.287,924.939754,42.58214417,0.079138553,Shell,44.3,3.698895497,3.422065264,51.2
4/24/2012,393.7,9.018,3.659,32.996862,Regular,43.65713018,0.083812197,12081.3,279.305,957.936616,42.61685254,0.079290856,UDF,43.3,0.357130184,3.429715243,49.2
5/7/2012,354.7,9.203,3.729,34.317987,Regular,38.54177985,0.096752148,12436,288.508,992.254603,42.48686345,0.079788887,Speedway,40.6,2.058220146,3.439262007,70.3
5/18/2012,378,9.505,3.699,35.158995,Regular,39.76854287,0.093013214,12814,298.013,1027.413598,42.40016375,0.080178992,Speedway,42.2,2.431457128,3.447546241,62.2
6/1/2012,381.5,9.781,3.699,36.179919,Regular,39.0041918,0.094835961,13195.5,307.794,1063.593517,42.29224741,0.080602745,Sunoco,41,1.9958082,3.455536875,61.4
6/12/2012,386.8,8.976,3.649,32.753424,Regular,43.09269162,0.084677932,13582.3,316.77,1096.346941,42.31492881,0.080718799,Meijer,44.1,1.007308378,3.46101885,75.5
6/23/2012,379.9,9.168,3.339,30.611952,Regular,41.43760908,0.080578973,13962.2,325.938,1126.958893,42.29025152,0.080714994,Kroger,41.8,0.362390925,3.457586697,74.4
7/8/2012,321.9,8.285,3.549,29.403465,Regular,38.85334943,0.091343476,14284.1,334.223,1156.362358,42.20505471,0.080954513,Shell,40.9,2.046650573,3.459852727,84.1
7/21/2012,369.5,8.88,3.479,30.89352,Regular,41.61036036,0.083608985,14653.6,343.103,1187.255878,42.18966316,0.081021447,Meijer,42.6,0.98963964,3.460348286,70.1
7/21/2012,385,7.808,3.499,27.320192,Regular,49.30840164,0.070961538,15038.6,350.911,1214.57607,42.34805976,0.080763906,Speedway,48.5,0.808401639,3.461208312,70.1
7/26/2012,367.1,9.644,3.479,33.551476,Regular,38.06511821,0.091396012,15405.7,360.555,1248.127546,42.23350113,0.081017256,BP,44.2,6.134881792,3.461684198,82.5
8/12/2012,376.6,9.287,3.769,35.002703,Regular,40.55130828,0.09294398,15782.3,369.842,1283.130249,42.19126005,0.081301854,BP,42.3,1.74869172,3.46940112,66.4
8/24/2012,414.9,9.22,3.859,35.57998,Regular,45,0.085755556,16197.2,379.062,1318.710229,42.25957759,0.081415938,Speedway,44.6,0.4,3.478877411,76.5
9/9/2012,373.3,8.984,3.799,34.130216,Regular,41.55164737,0.091428385,16570.5,388.046,1352.840445,42.24318766,0.081641498,Speedway,42.8,1.248352627,3.486288855,62.1
9/19/2012,408.1,9.123,3.799,34.658277,Regular,44.73309218,0.084925942,16978.6,397.169,1387.498722,42.30038095,0.081720443,BP,46.5,1.766907815,3.493471852,53.5
9/28/2012,408.3,9.281,3.659,33.959179,Regular,43.99310419,0.083172126,17386.9,406.45,1421.457901,42.33903309,0.081754534,BP,45.6,1.606895809,3.497251571,59
10/7/2012,393.1,8.942,3.699,33.076458,Regular,43.96108253,0.084142605,17780,415.392,1454.534359,42.37395039,0.081807332,Speedway,46.3,2.338917468,3.50159454,45
10/15/2012,402.9,9.075,3.549,32.207175,Regular,44.39669421,0.079938384,18182.9,424.467,1486.741534,42.41719615,0.081765919,Speedway,46.1,1.703305785,3.502608057,54.6
10/24/2012,365.7,8.264,3.299,27.262936,Regular,44.25217812,0.074550003,18548.6,432.731,1514.00447,42.45223938,0.081623652,Speedway,46.8,2.547821878,3.49871969,68.5
11/4/2012,363.3,9.561,3.259,31.159299,Regular,37.99811735,0.085767407,18911.9,442.292,1545.163769,42.35595489,0.081703254,Meijer,42,4.001882648,3.493537683,37.3
11/15/2012,391.9,10.224,3.499,35.773776,Regular,38.33137715,0.091282919,19303.8,452.516,1580.937545,42.26502488,0.081897737,Speedway,44.1,5.768622848,3.493661097,33.7
11/24/2012,430.2,9.068,3.579,32.454372,Regular,47.44155271,0.075440195,19734,461.584,1613.391917,42.36671982,0.081756963,BP,44.3,3.141552713,3.495337614,29.5
12/2/2012,394.5,9.146,3.239,29.623894,Regular,43.13361032,0.075092253,20128.5,470.73,1643.015811,42.38162004,0.081626341,Sunoco,45.8,2.666389679,3.490357128,55.1
12/12/2012,386.1,9.312,3.169,29.509728,Regular,41.46262887,0.076430272,20514.6,480.042,1672.525539,42.36379317,0.081528547,Speedway,43.4,1.937371134,3.484123345,31
12/23/2012,359.8,8.642,3.199,27.645758,Regular,41.63388105,0.076836459,20874.4,488.684,1700.171297,42.35088523,0.081447673,Speedway,42.4,0.766118954,3.479081159,30.7
1/6/2013,336.4,8.878,3.079,27.335362,Regular,37.89141699,0.081258508,21210.8,497.562,1727.506659,42.27131493,0.081444672,Meijer,41,3.108583014,3.47194251,33.2
1/21/2013,350,9.257,3.259,30.168563,Regular,37.80922545,0.086195894,21560.8,506.819,1757.675222,42.1898153,0.0815218,Meijer,40.6,2.790774549,3.468053135,20.8
2/1/2013,335.7,9.058,3.499,31.693942,Regular,37.0611614,0.094411504,21896.5,515.877,1789.369164,42.09976409,0.081719415,Meijer,38.7,1.638838596,3.468596514,12.1
2/13/2013,360.9,9.42,3.759,35.40978,Regular,38.31210191,0.098115212,22257.4,525.297,1824.778944,42.03184103,0.08198527,Speedway,41.4,3.087898089,3.473804236,31
2/26/2013,371.3,9.081,3.899,35.406819,Regular,40.88756745,0.09535906,22628.7,534.378,1860.185763,42.01239572,0.082204712,Meijer,42.2,1.312432551,3.481029838,36.9
3/9/2013,362.6,8.952,3.439,30.785928,Regular,40.5049151,0.084903276,22991.3,543.33,1890.971691,41.98755821,0.082247271,BP,42.7,2.195084897,3.480337347,36.5
3/21/2013,375.3,8.991,3.859,34.696269,Regular,41.74174174,0.092449424,23366.6,552.321,1925.66796,41.98355666,0.082411132,Kroger,44,2.258258258,3.486501437,23.8
4/8/2013,361.7,9,3.299,29.691,Regular,40.18888889,0.082087365,23728.3,561.321,1955.35896,41.95478167,0.082406197,Speedway,43.4,3.211111111,3.483495112,61.8
4/20/2013,362.3,8.036,3.699,29.725164,Regular,45.08461921,0.082045719,24090.6,569.357,1985.084124,41.99895672,0.082400776,BP,45.6,0.515380786,3.486536784,39
4/30/2013,382.3,8.246,3.539,29.182594,Regular,46.36187242,0.076334277,24472.9,577.603,2014.266718,42.06124276,0.082306009,Speedway,48.7,2.338127577,3.487285762,60.2
5/9/2013,397.3,8.722,3.339,29.122758,Regular,45.55147902,0.073301681,24870.2,586.325,2043.389476,42.1131625,0.082162165,Pilot,47.4,1.848520981,3.485079906,65.8
5/18/2013,399,9.051,3.899,35.289849,Regular,44.08352668,0.088445737,25269.2,595.376,2078.679325,42.14311628,0.082261382,Kroger,45.7,1.616473318,3.491372385,68.3
5/30/2013,380.2,9.04,3.659,33.07736,Regular,42.05752212,0.086999895,25649.4,604.416,2111.756685,42.14183609,0.082331621,Sunoco,44.4,2.342477876,3.493879522,78.2
6/14/2013,395.3,9.095,3.759,34.188105,Regular,43.46344145,0.086486479,26044.7,613.511,2145.94479,42.16142824,0.082394683,Meijer,45,1.536558549,3.497809803,67.6
6/22/2013,390.3,9.008,3.559,32.059472,Regular,43.32815275,0.082140589,26435,622.519,2178.004262,42.17831102,0.082390931,BP,44.3,0.971847247,3.49869524,78.2
7/4/2013,388.9,9.501,3.399,32.293899,Regular,40.93253342,0.083039082,26823.9,632.02,2210.298161,42.15958356,0.082400328,BP,43.7,2.767466582,3.497196546,71.6
7/18/2013,399.8,9.06,3.299,29.88894,Regular,44.12803532,0.07475973,27223.7,641.08,2240.187101,42.18740251,0.08228812,Speedway,45.2,1.07196468,3.494395553,83.9
8/25/2013,394.3,9.114,3.529,32.163306,Regular,43.2631117,0.081570647,27618,650.194,2272.350407,42.20248111,0.082277877,Kroger,45.8,2.536888304,3.494880616,74.6
9/5/2013,413.7,9.507,3.519,33.455133,Regular,43.51530451,0.0808681,28031.7,659.701,2305.80554,42.2214003,0.082257071,Speedway,46,2.484695488,3.495228202,70.2
9/14/2013,431.2,9.272,3.299,30.588328,Regular,46.50560828,0.070937681,28462.9,668.973,2336.393868,42.28077964,0.082085587,UDF,46.7,0.194391717,3.492508469,55.1
9/25/2013,417.6,9.685,3.159,30.594915,Regular,43.11822406,0.073263685,28880.5,678.658,2366.988783,42.29273065,0.081958026,Meijer,48.1,4.981775942,3.487749033,61.3
10/11/2013,421.9,9.202,3.299,30.357398,Regular,45.84872854,0.071954013,29302.4,687.86,2397.346181,42.34030181,0.081813987,Kroger,45.7,0.148728537,3.485224001,62.7
10/23/2013,389,8.975,3.259,29.249525,Regular,43.34261838,0.075191581,29691.4,696.835,2426.595706,42.35321131,0.081727224,Meijer,45.9,2.557381616,3.482310312,39.6
11/2/2013,392.8,8.852,3.299,29.202748,Regular,44.37415273,0.074345081,30084.2,705.687,2455.798454,42.3785616,0.081630838,Meijer,44.8,0.425847266,3.480010903,49.7
11/12/2013,363.5,9.114,2.959,26.968326,Regular,39.88369541,0.074190718,30447.7,714.801,2482.76678,42.34675105,0.081542014,Valero,44,4.116304586,3.473367804,31.4
11/24/2013,375.5,9.123,3.199,29.184477,Regular,41.15970624,0.077721643,30823.2,723.924,2511.951257,42.33179174,0.081495473,UDF,42.6,1.440293763,3.46991018,21.1
12/2/2013,364,9.006,2.999,27.008994,Regular,40.41749944,0.074200533,31187.2,732.93,2538.960251,42.30826955,0.08141033,Meijer,41.1,0.682500555,3.464123792,38.9
12/12/2013,325.8,8.576,2.979,25.547904,Regular,37.98973881,0.078415912,31513,741.506,2564.508155,42.25832293,0.081379372,Murphy,39.5,1.510261194,3.458513019,13.8
1/7/2014,317.1,8.915,3.199,28.519085,Regular,35.56926528,0.089937196,31830.1,750.421,2593.02724,42.17885693,0.081464628,Kroger,38.5,2.930734717,3.455430005,-3.6
1/15/2014,359.5,9.252,3.299,30.522348,Regular,38.85646347,0.08490222,32189.6,759.673,2623.549588,42.13839376,0.081503019,Meijer,41.1,2.243536533,3.453524856,28.6
1/27/2014,302.7,8.89,3.249,28.88361,Regular,34.04949381,0.095419921,32492.3,768.563,2652.433198,42.04482912,0.08163267,BP,35.8,1.750506187,3.451159109,37.1
2/4/2014,346.7,8.983,3.279,29.455257,Regular,38.59512412,0.084958918,32839,777.546,2681.888455,42.00497463,0.081667787,UDF,40,1.404875877,3.449170152,22.4
2/16/2014,310.1,8.773,3.459,30.345807,Regular,35.34708766,0.097858133,33149.1,786.319,2712.234262,41.93069225,0.081819243,Speedway,37.7,2.352912345,3.449279824,23.3
3/1/2014,361.8,9.065,3.599,32.624935,Regular,39.91174848,0.09017395,33510.9,795.384,2744.859197,41.90768233,0.081909444,Speedway,42.2,2.288251517,3.450986187,35.1
3/17/2014,354.2,9.356,3.579,33.485124,Regular,37.858059,0.094537335,33865.1,804.74,2778.344321,41.86060094,0.082041521,Speedway,41.9,4.041941,3.45247449,26.3
3/28/2014,354.1,9.165,3.579,32.801535,Regular,38.63611566,0.092633536,34219.2,813.905,2811.145856,41.82429153,0.082151127,UDF,39.8,1.163884343,3.453899234,51.9
4/8/2014,371.5,9.164,3.549,32.523036,Regular,40.53906591,0.087545184,34590.7,823.069,2843.668892,41.80998191,0.082209059,UDF,41.7,1.16093409,3.45495808,49.7
4/21/2014,373.8,9.216,3.679,33.905664,Regular,40.55989583,0.090705361,34964.5,832.285,2877.574556,41.79613954,0.082299891,Shell,42.2,1.640104167,3.457438925,64.1
5/2/2014,391.9,8.834,3.599,31.793566,Regular,44.36268961,0.081126731,35356.4,841.119,2909.368122,41.82309519,0.082286888,Speedway,44.8,0.437310392,3.458925695,50.9
5/10/2014,375.1,8.854,3.659,32.396786,Regular,42.36503275,0.086368398,35731.5,849.973,2941.764908,41.82874044,0.082329734,Speedway,45.8,3.434967246,3.46100983,65.5
5/21/2014,401.1,9.094,3.659,33.274946,Regular,44.10600396,0.082959227,36132.6,859.067,2975.039854,41.85284733,0.082336722,Speedway,45.6,1.493996041,3.463105734,72.3
6/6/2014,435.3,9.487,3.599,34.143713,Regular,45.88384105,0.0784372,36567.9,868.554,3009.183567,41.89687688,0.082290303,Speedway,50.5,4.616158954,3.464590074,67.5
6/21/2014,458.4,9.286,3.799,35.277514,Regular,49.36463493,0.076957928,37026.3,877.84,3044.461081,41.9758726,0.082224286,Kroger,49.6,0.235365066,3.468127541,73.8
7/5/2014,386.8,9.292,3.029,28.145468,Regular,41.6272062,0.072764912,37413.1,887.132,3072.606549,41.97222059,0.082126489,Speedway,44.5,2.872793801,3.463528031,69.2
7/19/2014,433.1,8.961,3.499,31.354539,Regular,48.33165941,0.072395611,37846.2,896.093,3103.961088,42.03581548,0.082015132,Kroger,48.3,0.031659413,3.463882753,66.7
8/6/2014,401.4,9.055,3.399,30.777945,Regular,44.32909994,0.076676495,38247.6,905.148,3134.739033,42.05875724,0.081959104,Speedway,47.6,3.270900055,3.463233673,73.1
8/25/2014,414.1,9.001,3.039,27.354039,Regular,46.00599933,0.066056602,38661.7,914.149,3162.093072,42.09762304,0.081788775,Speedway,46.9,0.894000667,3.459056535,78.2
9/15/2014,406.2,9.094,2.959,26.909146,Regular,44.66681328,0.066246051,39067.9,923.243,3189.002218,42.12292972,0.081627173,Kroger,47.1,2.433186717,3.454130947,59.5
9/30/2014,396.3,9.129,3.189,29.112381,Regular,43.41110746,0.073460462,39464.2,932.372,3218.114599,42.13554247,0.081545162,Kroger,46.7,3.28889254,3.451535009,62
10/22/2014,397.7,9.328,2.859,26.668752,Regular,42.63507719,0.06705746,39861.9,941.7,3244.783351,42.1404906,0.081400619,UDF,45.1,2.464922813,3.445665659,46.9
11/5/2014,413.2,9.262,2.879,26.665298,Regular,44.61239473,0.064533635,40275.1,950.962,3271.448649,42.16456599,0.081227574,UDF,46,1.387605269,3.440146556,50
11/17/2014,398.9,9.081,2.899,26.325819,Regular,43.9268803,0.065996037,40674,960.043,3297.774468,42.18123563,0.081078194,Speedway,45.2,1.2731197,3.435027877,28.6
11/25/2014,345.8,9.003,2.899,26.099697,Regular,38.40941908,0.075476278,41019.8,969.046,3323.874165,42.14619327,0.08103097,UDF,40.7,2.290580917,3.430047867,36.7
12/7/2014,345.6,8.738,2.139,18.690582,Regular,39.55138476,0.054081545,41365.4,977.784,3342.564747,42.12300467,0.080805812,Speedway,41.6,2.048615244,3.418510373,33
12/30/2014,360.8,9.013,1.869,16.845297,Regular,40.03106624,0.046688739,41726.2,986.797,3359.410044,42.10389776,0.080510807,Kroger,42.2,2.168933762,3.40435778,25.4
2/2/2015,338.8,8.725,2.059,17.964775,Regular,38.83094556,0.05302472,42065,995.522,3377.374819,42.0752128,0.080289429,Speedway,41.1,2.269054441,3.392566733,25.9
2/12/2015,321.7,8.765,2.359,20.676635,Regular,36.70279521,0.064273034,42386.7,1004.287,3398.051454,42.02832457,0.08016787,Speedway,39.2,2.497204792,3.383546191,26.3
3/3/2015,310.7,9.93,2.039,20.24727,Regular,31.28902316,0.065166624,42697.4,1014.217,3418.298724,41.92317818,0.080058709,AAFES,37.4,6.110976838,3.370382003,26.4
3/13/2015,408.5,9.404,2.199,20.679396,Regular,43.43896214,0.050622756,43105.9,1023.621,3438.97812,41.93710367,0.079779755,Kroger,42.7,0.738962144,3.359620524,46.1
3/22/2015,396.5,9.051,2.339,21.170289,Regular,43.80731411,0.05339291,43502.4,1032.672,3460.148409,41.9534954,0.079539253,Speedway,45.9,2.092685891,3.35067515,40
3/30/2015,386.7,8.931,1.999,17.853069,Regular,43.29862277,0.04616775,43889.1,1041.603,3478.001478,41.9650289,0.079245222,Meijer,44.4,1.101377225,3.339085504,46.4
4/10/2015,414,8.905,2.399,21.363095,Regular,46.49073554,0.051601679,44303.1,1050.508,3499.364573,42.00339264,0.078986901,Kroger,48.3,1.809264458,3.331116539,61
4/19/2015,368.7,7.84,2.419,18.96496,Regular,47.02806122,0.051437375,44671.8,1058.348,3518.329533,42.04061424,0.07875952,Shell,48.4,1.371938776,3.324359788,62.5
4/28/2015,407.9,9.18,2.179,20.00322,Regular,44.4335512,0.049039519,45079.7,1067.528,3538.332753,42.06119184,0.078490601,Speedway,47.5,3.066448802,3.314510489,49.3
5/10/2015,425.1,9.235,2.499,23.078265,Regular,46.03140227,0.054289026,45504.8,1076.763,3561.411018,42.09524287,0.078264513,Kroger,47.7,1.668597726,3.307516155,74.9
5/19/2015,436.6,9.161,2.629,24.084269,Regular,47.65855256,0.055163236,45941.4,1085.924,3585.495287,42.1421757,0.078044972,BP,49.1,1.44144744,3.301792102,62.9
5/28/2015,399.1,8.503,2.299,19.548397,Regular,46.9363754,0.0489812,46340.5,1094.427,3605.043684,42.17942357,0.077794665,UDF,49,2.063624603,3.294001047,72.9
6/9/2015,416.6,8.858,2.639,23.376262,Regular,47.03093249,0.056112007,46757.1,1103.285,3628.419946,42.21837513,0.077601475,Kroger,48.4,1.36906751,3.288742207,65.5
7/9/2015,419.6,8.917,2.389,21.302713,Regular,47.05618482,0.050769097,47176.7,1112.202,3649.722659,42.25716192,0.077362822,BP,49.4,2.343815184,3.281528588,73.1
7/30/2015,433.9,9.361,2.499,23.393139,Regular,46.35188548,0.053913664,47610.6,1121.563,3673.115798,42.29133807,0.077149118,UDF,48.9,2.548114518,3.274997301,76.2
8/12/2015,410.8,8.774,2.699,23.681026,Regular,46.82015044,0.05764612,48021.4,1130.337,3696.796824,42.32649201,0.076982279,UDF,47.5,0.679849556,3.270526245,68.3
8/23/2015,397,8.841,2.059,18.203619,Regular,44.90442258,0.045852945,48418.4,1139.178,3715.000443,42.34649897,0.076727039,UDF,48.8,3.895577423,3.26112376,72.1
9/1/2015,435.8,9.6,1.999,19.1904,Regular,45.39583333,0.044034878,48854.2,1148.778,3734.190843,42.37198136,0.076435411,Kroger,49.6,4.204166667,3.250576563,75.5
9/12/2015,422.5,8.493,2.269,19.270617,Regular,49.74685035,0.045610928,49276.7,1157.271,3753.46146,42.42610417,0.076171121,Kroger,45.3,4.446850347,3.243372952,58.8
9/22/2015,391.3,8.491,1.799,15.275309,Regular,46.08408904,0.039037335,49668,1165.762,3768.736769,42.45274764,0.075878569,Speedway,48.3,2.215910965,3.232852648,63.3
10/1/2015,421.3,8.961,2.459,22.035099,Regular,47.01484209,0.052302632,50089.3,1174.723,3790.771868,42.48754813,0.075680272,Kroger,50.2,3.185157906,3.22694956,55.9
10/25/2015,412.4,10.057,1.079,10.851503,Regular,41.00626429,0.026313053,50501.7,1184.78,3801.623371,42.47497426,0.075277137,UDF,45.8,4.793735706,3.208716699,55.2
11/14/2015,445.4,9.047,1.979,17.904013,Regular,49.23178954,0.040197604,50947.1,1193.827,3819.527384,42.52617842,0.074970457,Kroger,45.5,3.731789543,3.199397722,38.2
11/24/2015,395.3,9.451,1.899,17.947449,Regular,41.82626177,0.045402097,51342.4,1203.278,3837.474833,42.52068101,0.074742802,Meijer,44.4,2.573738229,3.189183907,37.7
12/9/2015,381.4,9.291,1.469,13.648479,Regular,41.05047896,0.03578521,51723.8,1212.569,3851.123312,42.50941596,0.074455537,Speedway,43.8,2.749521042,3.176003437,46.7
12/18/2015,391,8.715,1.839,16.026885,Regular,44.86517499,0.040989476,52114.8,1221.284,3867.150197,42.5262265,0.074204452,Kroger,46.1,1.234825014,3.166462671,30.8
12/31/2015,356.6,8.754,1.999,17.499246,Regular,40.7356637,0.049072479,52471.4,1230.038,3884.649443,42.51348332,0.074033653,Speedway,43.2,2.464336303,3.158154011,33.4
1/8/2016,375.7,10.531,1.099,11.573569,Regular,35.67562435,0.030805347,52847.1,1240.569,3896.223012,42.45543779,0.073726335,UDF,43.2,7.524375653,3.140674168,38.4
1/17/2016,408.8,8.996,1.199,10.786204,Regular,45.44241885,0.026385039,53255.9,1249.565,3907.009216,42.47694198,0.073362937,Kroger,41.1,4.342418853,3.126695463,24
1/26/2016,326.8,8.83,1.799,15.88517,Regular,37.01019253,0.048608231,53582.7,1258.395,3922.894386,42.43858248,0.073211958,Kroger,39.9,2.889807475,3.11737919,39.6
2/3/2016,338.2,7.974,1.599,12.750426,Regular,42.41284174,0.037700846,53920.9,1266.369,3935.644812,42.4384204,0.072989227,UDF,44.1,1.687158264,3.107818347,53.7
2/10/2016,355.1,8.88,1.349,11.97912,Regular,39.98873874,0.033734497,54276,1275.249,3947.623932,42.42136242,0.072732403,UDF,43.3,3.311261261,3.095571086,16.5
2/17/2016,334.9,8.703,1.559,13.567977,Regular,38.48098357,0.040513517,54610.9,1283.952,3961.191909,42.39465338,0.072534822,UDF,39.6,1.119016431,3.08515576,31.7
2/26/2016,375.8,8.959,1.879,16.833961,Regular,41.94664583,0.044795,54986.7,1292.911,3978.02587,42.39154899,0.072345237,UDF,44.4,2.453354169,3.076797916,29.9
3/13/2016,385.7,8.732,1.959,17.105988,Regular,44.17086578,0.0443505,55372.4,1301.643,3995.131858,42.40348544,0.072150238,UDF,45,0.829134219,3.06929923,54.5
4/5/2016,402.6,9.241,1.959,18.103119,Regular,43.56671356,0.044965522,55775,1310.884,4013.234977,42.41168555,0.071954011,Kroger,45.9,2.333286441,3.061472241,30.9
4/14/2016,370.8,8.674,2.1139,18.3359686,Regular,42.74844362,0.049449754,56145.8,1319.558,4031.570946,42.4138992,0.071805388,UDF,44.5,1.751556375,3.055243457,47.7
4/28/2016,397.6,9.2,2.399,22.0708,Regular,43.2173913,0.05551006,56543.4,1328.758,4053.641746,42.41946239,0.071690803,UDF,44.7,1.482608696,3.050699786,55.8
5/7/2016,377,8.884,1.669,14.827396,Regular,42.43583971,0.039329963,56920.4,1337.642,4068.469142,42.41957116,0.071476468,Speedway,44.1,1.664160288,3.041523174,62.3
5/18/2016,389.2,9.253,2.459,22.753127,Regular,42.06203393,0.058461272,57309.6,1346.895,4091.222269,42.41711492,0.071388079,Kroger,44.8,2.737966065,3.037521313,56.2
5/24/2016,410.5,8.846,2.579,22.813834,Regular,46.40515487,0.055575722,57720.1,1355.741,4114.036103,42.44313626,0.071275623,UDF,47.1,0.694845128,3.034529532,68
6/28/2016,376.6,8.994,2.349,21.126906,Regular,41.87235935,0.05609906,58096.7,1364.735,4135.163009,42.43937468,0.071177244,UDF,42.9,1.027640649,3.030011694,74.6
7/13/2016,357.2,9.138,1.579,14.428902,Regular,39.08951631,0.040394462,58453.9,1373.873,4149.591911,42.41709387,0.070989137,Meijer,41.6,2.510483694,3.020360623,79
8/4/2016,358.8,9.236,1.919,17.723884,Regular,38.84798614,0.04939767,58812.7,1383.109,4167.315795,42.3932604,0.070857413,Kroger,40.7,1.852013859,3.013006057,79.6
8/12/2016,386.7,8.98,2.239,20.10622,Regular,43.0623608,0.051994363,59199.4,1392.089,4187.422015,42.39757659,0.070734197,Kroger,44.7,1.637639198,3.008013148,82.9
8/22/2016,367.9,8.752,2.339,20.470928,Regular,42.03610603,0.055642642,59567.3,1400.841,4207.892943,42.39531824,0.070640988,Meijer,43.3,1.263893967,3.003833371,66.6
8/30/2016,360.1,9.337,2.139,19.971843,Regular,38.56699154,0.055461936,59927.4,1410.178,4227.864786,42.36997032,0.070549778,UDF,41.1,2.533008461,2.998107179,76
9/12/2016,410.1,9.475,2.159,20.456525,Regular,43.2823219,0.049881797,60337.5,1419.653,4248.321311,42.3760595,0.070409303,Marathon,45.1,1.8176781,2.992506838,66
9/22/2016,395.8,9.273,2.189,20.298597,Regular,42.68305834,0.051284985,60733.3,1428.926,4268.619908,42.37805177,0.070284669,UDF,44.4,1.716941659,2.987292489,73.6
10/5/2016,379.5,9.097,1.699,15.455803,Regular,41.71704958,0.040726754,61112.8,1438.023,4284.075711,42.37387024,0.07010112,Speedway,43.7,1.982950423,2.979142691,67.3
10/7/2016,129.6,2.722,2.309,6.285098,Regular,47.61204996,0.048496127,61242.4,1440.745,4290.360809,42.38376673,0.0700554,Kroger,47.2,0.412049963,2.977876591,67.4
10/10/2016,400.1,8.569,2.219,19.014611,Regular,46.69156261,0.047524646,61642.5,1449.314,4309.37542,42.40923637,0.06990916,Kroger,48.8,2.108437391,2.973389769,53.2
10/20/2016,395.7,8.947,1.949,17.437703,Regular,44.22711523,0.044067988,62038.2,1458.261,4326.813123,42.42038977,0.069744337,Wal Mart,46.8,2.572884766,2.967104738,59.5
10/31/2016,395.9,9.247,2.099,19.409453,Regular,42.81388558,0.049026151,62434.1,1467.508,4346.222576,42.42286925,0.069612961,Meijer,45.6,2.786114415,2.961634673,51.2
11/10/2016,414.6,8.899,1.999,17.789101,Regular,46.58950444,0.042906659,62848.7,1476.407,4364.011677,42.44798352,0.069436785,Meijer,48.3,1.710495561,2.955832421,45
11/22/2016,366,9.225,2.599,23.975775,Premium,39.67479675,0.065507582,63214.7,1485.632,4387.987452,42.43076347,0.069414036,Meijer,43.6,3.925203252,2.953616677,30.8
12/6/2016,393.2,9.229,1.989,18.356481,Regular,42.60483259,0.046684845,63607.9,1494.861,4406.343933,42.43183814,0.069273533,BP,44.8,2.195167407,2.947661309,N/A
12/21/2016,334,8.855,2.259,20.003445,Regular,37.71880294,0.059890554,63941.9,1503.716,4426.347378,42.40408428,0.069224521,UDF,39.2,1.481197064,2.943605959,N/A
1/9/2017,332,8.847,2.429,21.489363,Regular,37.52684526,0.064726997,64273.9,1512.563,4447.836741,42.37555725,0.069201289,Speedway,39.6,2.073154742,2.940596022,N/A
One of the things I tried was
plot(factor(Gas.Source),MPG)
It's exactly what you'd expect. Some of the factor levels have very few (or just one) observations, so rather than a box and whisker you just get a black line.
I understand this is exactly what I asked it to do, as some of those sources had very few observations. So what I'd like to do is efficiently remove the measurements associated with factor levels where there aren't enough observations to really produce a box and whisker...
I'm guessing I could do this by creating a new dataframe where I've used logical subscripting to select only those rows corresponding to a factor level whose count is greater than X, but I'm not sophisticated enough to figure that out yet.

Found what I was looking for here
Given that the original data was in a dataframe called mileage
tbl <- table(mileage$Gas.Source)
new.Mileage <- droplevels(mileage[mileage$Gas.Source %in% names(tbl)[tbl>10],,drop=FALSE])
new.Mileage now has only those rows where there were more than 10 observations at that factor level (i.e. from that gas source)
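For anyone who wants to try this without the full file, here is a minimal, self-contained sketch of the same idea. The small data frame below is made up for illustration (it is not the real mileage data), and min.n is a hypothetical name for the cutoff.

```r
# Made-up stand-in for the `mileage` data frame (illustration only)
mileage <- data.frame(
  Gas.Source = c(rep("Speedway", 12), rep("Kroger", 11),
                 rep("Valero", 1), rep("Pilot", 2)),
  MPG = c(seq(40, 45.5, by = 0.5), seq(41, 46, by = 0.5), 39.9, 45.5, 44.0),
  stringsAsFactors = TRUE
)

min.n <- 10  # hypothetical cutoff: keep sources with more than this many rows

# Count observations per factor level
counts <- table(mileage$Gas.Source)

# Names of the levels that have enough observations
keep <- names(counts)[counts > min.n]

# Keep only those rows, then drop the now-empty factor levels
new.Mileage <- droplevels(mileage[mileage$Gas.Source %in% keep, , drop = FALSE])

# plot() on a factor and a numeric vector draws one box per remaining level
plot(new.Mileage$Gas.Source, new.Mileage$MPG,
     xlab = "Gas source", ylab = "MPG")
```

Without droplevels the discarded sources would still be listed as levels of the factor, so the plot would keep empty slots for them.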

Related

Issue with Boxplot formula or variable definition

I have a csv file having 4 columns labeled AGE, DIASTOLIC, BMI and EVER.PREGNANT and 700 rows. The last column consists of only yes or no. I wish to plot the data BMI vs EVER.PREGNANT with an intent to comparing BMI of those with yes in the fourth column and no in the same column. What code should I write to get the required boxplot?
I have tried the following code:
Sheet=read.csv(/Downloads/1739230_1284354330_PIMA.csv - 1739230_1284354330_PIMA.csv.csv, sep=",")
boxplot(BMI~EVER.PREGNANT,data=sheet, main="BMI vs PREG",xlab="BMI",ylab="PREGNANT")
The error that I get is
Error in eval(expr,envr,enclos): object 'Sheet' not found
Similarly, what modifications can be done to plot AGE vs DIASTOLIC, where both columns are numbers? Will I get the 700 odd values nicely?
I answer here because it tells me not to extend the discussion :-).
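I think you haven't loaded your data set correctly. The file path has to be a quoted string; without the quotes read.csv fails, so Sheet is never created. Adding header = TRUE tells R that the first row holds the variable names (this is already the default for read.csv, so stating it is harmless). Also note that R is case-sensitive, so the boxplot call needs data=Sheet rather than data=sheet.
Sheet=read.csv("/Downloads/1739230_1284354330_PIMA.csv", sep=",", header = TRUE)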
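Once the data is loaded, the two plots asked about could look roughly like the sketch below. Since the real file isn't available here, the data frame is simulated with the same four column names; the values are invented purely for illustration.

```r
# Simulated stand-in for the PIMA file (values are invented)
set.seed(42)
Sheet <- data.frame(
  AGE           = sample(21:70, 50, replace = TRUE),
  DIASTOLIC     = round(rnorm(50, mean = 72, sd = 10)),
  BMI           = round(rnorm(50, mean = 32, sd = 6), 1),
  EVER.PREGNANT = factor(rep(c("yes", "no"), 25))
)

# Numeric response on the left of ~, grouping variable on the right:
# one box of BMI values for "no" and one for "yes"
boxplot(BMI ~ EVER.PREGNANT, data = Sheet,
        main = "BMI by EVER.PREGNANT",
        xlab = "EVER.PREGNANT", ylab = "BMI")

# Two numeric columns: a scatter plot with one point per row
plot(Sheet$AGE, Sheet$DIASTOLIC,
     main = "AGE vs DIASTOLIC",
     xlab = "AGE", ylab = "DIASTOLIC")
```

Note that the grouping variable ends up on the x-axis of the boxplot, so the axis labels are the other way round from the ones in the question.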

Questions associated with "Error: Aesthetics must be either length 1 or the same as the data"

I understand the subject "Error: Aesthetics must be either length 1 or the same as the data" has been done a lot (plenty of reading available online), however, I still have some unresolved questions
I am working with a dataset regarding all calls made to the Seattle Police Department in 2015. After I am done cleaning the data into an acceptable format I wind up with a dataset that is 62,092 rows and 13 columns (dataset name is SPD_2015). I would add a portion of the dataset to this question but I'm not entirely sure how to do it in a clean and legible format.
I used package lubridate to extract the times associated with my data set. I then created a bar graph that showed what time the crimes occur
ggplot(SPD_2015, aes(hour(date.reported.time))) +
geom_bar(width = 0.7)
and that works perfectly.
Since car prowls were the most frequently reported crime, I wanted to graph what time these car prowls occurred. This is when I came across the error "Error: Aesthetics must be either length 1 or the same as the data".
I read that ggplot2 does not like it when you subset within the ggplot code, so I subsetted my data by creating a separate data frame.
car.prowl <- filter(SPD_2015, summarized.offense.description == "CAR PROWL")
So here is my question. When I look at the dimensions of my newly created dataset "car.prowl", it has 11,539 rows and 13 columns. But when I examine the length of the hours in the occured.time column (the time that the crime occurred), I get a length of 62,092, which is the length of the original dataset. Why?
In my mind I am picturing that the following code would work:
ggplot(car.prowl, aes(hour(occured.time))) +
geom_bar()
The length of the car.prowl$occured.time is correct:
> length(car.prowl$occured.time)
[1] 11539
but when I apply the hour function I get the length of the original dataset:
> length(hour(car.prowl$occured.time))
[1] 62092
when it should be 11,539.
Thank you. Please let me know what I can do to make my question more clear.
It could be a caching issue as Jeremy said above. I'm not sure this would work, but you could try the below, chaining things together.
SPD_2015 %>%
  filter(summarized.offense.description == "CAR PROWL") %>%
  ggplot(aes(hour(occured.time))) +
  geom_bar()
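A variation on the same idea, as a minimal sketch rather than a confirmed fix: compute the hour as a column of the filtered data frame before plotting, so the aesthetic always has the same length as the data, and call lubridate::hour() with its namespace to rule out another hour() masking it. The toy data frame and the occured.hour column are hypothetical; only the column names summarized.offense.description and occured.time come from the question.

```r
library(dplyr)
library(ggplot2)
library(lubridate)

# Hypothetical stand-in for SPD_2015, using the column names from the question
toy <- data.frame(
  summarized.offense.description = c("CAR PROWL", "ASSAULT", "CAR PROWL", "BURGLARY"),
  occured.time = as.POSIXct(
    c("2015-01-01 08:15:00", "2015-01-01 13:40:00",
      "2015-01-02 22:05:00", "2015-01-03 02:30:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Filter first, then store the hour as a real column of the filtered data
car.prowl <- toy %>%
  filter(summarized.offense.description == "CAR PROWL") %>%
  mutate(occured.hour = lubridate::hour(occured.time))

# Sanity check: the new column has one value per remaining row
stopifnot(length(car.prowl$occured.hour) == nrow(car.prowl))

# The aesthetic now refers to a column of the plotted data frame
p <- ggplot(car.prowl, aes(occured.hour)) +
  geom_bar()
```

If the sanity check fails on the real data, that points at the objects in the session (for example a stale car.prowl or a different hour function) rather than at ggplot2.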

Criteria for deciding which character columns should be converted to factors

I have been working through the book "Analyzing Baseball Data with R" by Marchi and Albert and am wondering about an issue which they don't address.
Many of the datasets I need to import are fairly large (though not really "Big" in the sense of "Big Data"). For example, the Retrosheet Game Logs have 1 csv file per year dating back to 1871, where each file has a row for each game played that year and 161 columns. When I read one into a dataframe with read.csv() and the default setting for stringsAsFactors, fully 75 of the 161 columns become factors. Some of these columns conceptually are factors (such as one containing "D" or "N" for day or night games), but others are probably better left as strings (many of the columns contain names of starting pitchers, closers, etc.). I know how to convert columns from factors to strings or vice versa, but I don't want to have to scan through 161 columns, making an explicit decision for 75 of them.
The reason I think it is important is that I've noticed that conceptually small dataframes obtained by subsetting these game logs are surprisingly large, given the need to retain the full factor information. For example, given the dataframe GL2016 obtained by downloading, unzipping and then reading in the file, object.size(GL2016) is about 2.8 MB, and when I use:
df <- with(GL2016,GL2016[V7 == "CLE" & V13 == "D",])
to extract the home day games played by the Cleveland Indians in 2016, I get a df with 26 rows. 26/2428 (where 2428 is the number of rows in the whole dataframe) is slightly more than 1%, but object.size(df) is around 1.3 MB, which is far more than 1% of the size of GL2016.
I came up with an ad-hoc solution. I first defined a function:
big.factor <- function(v,k){is.factor(v) && length(levels(v)) > k}
And then used mutate_if from dplyr like this:
GL2016 %>% mutate_if(function(v){big.factor(v,30)},as.character) -> GL2016
30 is the number of teams in the MLB and I somewhat arbitrarily decided that any factor with more than 30 levels should probably be treated as a string.
After this code has been run, the number of factor variables has been reduced from 75 to 12. It works in the sense that even though now GL2016 is around 3.2 MB (slightly larger than before), if I now subset the dataframe to pull out the Cleveland day games, the resulting dataframe is just 0.1 MB.
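For illustration only, the same threshold rule can be sketched in base R on a toy data frame. The games data frame, its columns, and the cutoff of 2 are hypothetical; big.factor is the function defined above, written with nlevels().

```r
# Hypothetical toy frame: one low-cardinality factor, one name-like factor
games <- data.frame(
  day_night = factor(c("D", "N", "D", "N")),
  starter   = factor(c("a", "b", "c", "d"))
)

# Same rule as above: a factor is "big" if it has more than k levels
big.factor <- function(v, k) is.factor(v) && nlevels(v) > k

# Flag the columns that exceed the cutoff (k = 2 for this toy example)
to_convert <- vapply(games, big.factor, logical(1), k = 2)

# Convert only the flagged columns back to character
games[to_convert] <- lapply(games[to_convert], as.character)

str(games)
```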
Questions:
1) What criteria (hopefully less ad-hoc than what I used above) are relevant for deciding which character columns should be converted to factors when importing a large data set?
2) I am aware of the cost in terms of memory footprint of converting all character data to factors, but am I incurring any hidden costs (say in processing time) when I convert most of these factors back into strings?
Essentially, I think what you need to do is:
df <- with(GL2016,GL2016[V7 == "CLE" & V13 == "D",])
df <- droplevels(df)
The droplevels function removes all the unused factor levels, and thus reduces the size of df immensely.
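A small self-contained sketch of the effect, using a made-up data frame (games, team and starter are hypothetical names, not the Retrosheet columns):

```r
# Hypothetical toy data: 1000 rows, a factor with 1000 distinct levels
games <- data.frame(
  team    = factor(rep(c("CLE", "DET"), each = 500)),
  starter = factor(paste0("pitcher_", 1:1000))
)

# Subsetting keeps every level of every factor, used or not
cle <- games[games$team == "CLE", ]
nlevels(cle$starter)          # 1000, although only 500 rows remain

# droplevels() keeps only the levels that still occur in the subset
cle_trimmed <- droplevels(cle)
nlevels(cle_trimmed$starter)  # 500

# Compare the memory footprint before and after
object.size(cle)
object.size(cle_trimmed)
```

droplevels() leaves the values untouched and changes only the set of levels stored with each factor.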

cummeRbund csHeatmap column user-defined order

I am using the R package cummeRbund (from Bioconductor) to visualize RNA-seq data. I created a CuffGeneSet instance called "DEG_genes" that contains 662 genes that are significantly differentially expressed between males and females. My goal is to create a heatmap using csHeatmap() in which the male and female samples (replicates) are separated, but with a specific user-defined order within each sex category.
I used:
> DEG<-diffData(genes(cuff)) # take differentially expressed genes
> DEG_significant<-subset(DEG,significant=='yes') # retain only significant changes
> DEG_sign_IDs <- DEG_significant$gene_id # retrieve IDs
> DEG_genes<-getGenes(cuff,DEG_sign_IDs) # get CuffGeneSet instance
> hmap<-csHeatmap(DEG_genes,clustering='none',labRow=F,replicates=T)
This gives me ALMOST what I want: the heatmap shows Females on the left and Males on the right, but they are alphabetically ordered (Female_0,Female_1,Female_10,Female_11,Female_12...Female_19,Female_2,Female_20,Female_21..,Female_29 on the left and similarly for males Male_0,Male_1,Male_10...Male_19,Male_2,Male_20...etc on the right) and I want them to be in a specific order (clusterReps). I created a test vector with replicate names in a specific order (Males on the left with 0 and 6 exchanged, and Females on the right) as follows:
clusterReps<-c("Male_6","Male_1","Male_2","Male_3","Male_4","Male_5","Male_0","Male_7","Male_8","Male_9","Male_10","Male_11","Male_12","Male_13","Male_14","Male_15","Male_16","Male_17","Male_18","Male_19","Male_20","Male_21","Male_22","Male_23","Male_24","Male_25","Male_26","Male_27","Male_28","Male_29","Male_30","Male_31","Male_32","Male_33","Female_0","Female_1","Female_2","Female_3","Female_4","Female_5","Female_6","Female_7","Female_8","Female_9","Female_10","Female_11","Female_12","Female_13","Female_14","Female_15","Female_16","Female_17","Female_18","Female_19","Female_20","Female_21","Female_22","Female_23","Female_24","Female_25","Female_26","Female_27","Female_28")
I would like the data to be exactly the same, except that the order of the columns must follow the order of the "clusterReps" vector. Knowing that the heatmap is a ggplot object, I have searched for a solution for the last 2 days without success. There is a closely resembling problem on Stack Overflow that uses heatmap.2() instead of csHeatmap(); I tried to get a replicate fpkm matrix and use heatmap.2, but could only use heatmap_2, and some options were not accepted.
Using:
> hmap<-hmap+scale_x_discrete(limits=clusterReps)
Scale for 'x' is already present. Adding another scale for 'x', which will replace the existing scale.
only changes the x-axis labels but not the actual data (the heatmap remains identical).
Is there a similar function that rearranges the columns and not just labels?
Thanks in advance for your help, I'm not familiar with handling ggplot objects, and in particular heatmaps from cummeRbund.
EDIT:
Here is what I can give as further information:
> DEG_genes
CuffGeneSet instance for 662 genes
Slots:
annotation
fpkm
repFpkm
diff
count
isoforms CuffFeatureSet instance of size 930
TSS CuffFeatureSet instance of size 785
CDS CuffFeatureSet instance of size 230
promoters CuffFeatureSet instance of size 662
splicing CuffFeatureSet instance of size 785
relCDS CuffFeatureSet instance of size 662
> summary(DEG_genes)
Length Class Mode
662 CuffGeneSet S4
I am afraid I can't give more information for the moment, please let me know if you want me to execute a command and report the output if it can help.
I am not very fluent in R, but I was having the same problem. To solve it, I made a script that renames all my sample names in all the files inside the cuffdiff folder to something that gives the right order when sorted alphabetically, and then rebuilt the database.
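The naming part of that idea can be sketched in R as follows. This only shows how zero-padded names make alphabetical order agree with numeric order; it does not touch the cuffdiff files or rebuild the database, and the vector names old and padded are hypothetical.

```r
# A few replicate names in the style used in the question
old <- c("Female_0", "Female_1", "Female_10", "Female_2",
         "Male_0", "Male_10", "Male_2")

# Split each name into its prefix and its number
parts <- strsplit(old, "_", fixed = TRUE)

# Rebuild the names with a zero-padded, two-digit number
padded <- vapply(
  parts,
  function(p) sprintf("%s_%02d", p[1], as.integer(p[2])),
  character(1)
)

# Alphabetical sorting now follows numeric order within each sex
sort(padded)
```

A fully custom order (such as Male_6 before Male_1) would need names chosen so that they sort in that order, which is an extension of this sketch and not something the answer spells out.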

R storing different columns in different vectors to compute conditional probabilities

I am completely new to R. I tried reading the reference and a couple of good introductions, but I am still quite confused.
I am hoping to do the following:
I have produced a .txt file that looks like the following:
area,energy
1.41155882174e-05,1.0914586287e-11
1.46893363946e-05,5.25011714434e-11
1.39244046855e-05,1.57904991488e-10
1.64155121046e-05,9.0815757601e-12
1.85202830392e-05,8.3207522281e-11
1.5256036289e-05,4.24756620609e-10
1.82107587343e-05,0.0
I have the following command to read the file in R:
tbl <- read.csv("foo.txt", header = TRUE)
producing:
> tbl
area energy
1 1.411559e-05 1.091459e-11
2 1.468934e-05 5.250117e-11
3 1.392440e-05 1.579050e-10
4 1.641551e-05 9.081576e-12
5 1.852028e-05 8.320752e-11
6 1.525604e-05 4.247566e-10
7 1.821076e-05 0.000000e+00
Now I want to store the two columns in two separate vectors, area and energy respectively.
I tried:
area <- c(tbl$first)
energy <- c(tbl$second)
but it does not seem to work.
I need two separate vectors (which must include only the numerical data of each column) in order to compute:
> prob(energy, given = area), i.e. the conditional probability P(energy|area).
And then plot it. Can you help me please?
As @Ananda Mahto alluded to, the problem is in the way you are referring to columns.
To 'get' a column of a data frame in R, you have several options:
DataFrameName$ColumnName
DataFrameName[,ColumnNumber]
DataFrameName[["ColumnName"]]
So to get area, you would do:
tbl$area #or
tbl[,1] #or
tbl[["area"]]
With the first option generally being preferred (from what I've seen).
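As a runnable sketch using a few rows of the data from the question (read from an inline string here instead of foo.txt, which is an assumption made only to keep the example self-contained):

```r
# A few rows from the question, supplied inline instead of reading foo.txt
txt <- "area,energy
1.41155882174e-05,1.0914586287e-11
1.46893363946e-05,5.25011714434e-11
1.82107587343e-05,0.0"

tbl <- read.csv(text = txt, header = TRUE)

# Columns are addressed by their actual names, taken from the header line
area   <- tbl$area
energy <- tbl[["energy"]]

# Both are plain numeric vectors with one value per row
is.numeric(area)
length(energy) == nrow(tbl)

# A name that does not exist in the data frame yields NULL, not an error,
# which is why tbl$first appeared not to work
is.null(tbl$first)
```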
Incidentally, for your 'end goal', you don't need to do any of this:
with(tbl, prob(energy, given = area))
does the trick.

Resources