Error converting character strings to numeric - r

I'm trying to convert a series of 29 columns in a data frame 'df' to numeric variables. Each column currently contains strings that all look like this:
two three
"-2.5346346" "-4.2342342"
"-3.645735" "-2.23434542"
"-4.235234" "-1.23422"
as.character(two)
works fine.
as.numeric(as.character(two))
does not: as.numeric() returns NA for every observation, not just for some of them.
As far as I can tell, there are no extraneous commas, letters, etc. I cannot think what could be causing the problem and have run out of ideas. If it's at all relevant, I constructed the columns from vectors of strings (e.g. c("-3.23423", "-2.34532")), where each string became a new column, and now I'm wondering whether there's something about the 'str_extract_all' function I used for that step that I'm not aware of. Thank you.
Edited to include sample data.
head(df)
one two three four five six
1 c("-3.19474987" "-3.9386188" "-5.3585024" "-7.3370402" "-4.65656894" "-5.37296894"
2 c("-3.86805776" "-2.57038981" "-4.88910112" "-3.82336021" "-1.51641245" "-4.19533412"
3 c("-4.64324462" "-3.51131105" "-5.81064472" "-6.63382723" "-4.47048461" "-7.08557932"
4 c("-4.88484732" "-3.48084998" "-4.97011221" "-5.36993391" "-3.14765309" "-4.60799153"
5 c("-4.99299683" "-3.26320573" "-4.5861881" "-5.3340004" "-2.14507341" "-3.30230272"
6 c("-5.15376815" "-4.08624463" "-6.50014523" "-5.49561174" "-4.14988788" "-6.57583067"
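The sample above suggests the stored strings are not purely numeric: the first column carries a leading c( and the values appear wrapped in literal quote characters, either of which would make as.numeric() return NA. A minimal sketch of one possible cleanup, assuming those stray characters really are part of the strings (the two-column data frame and the helper name to_numeric are made up for illustration):

```r
# Toy stand-in for two of the 29 columns; the stray c( and quote
# characters are assumed to be part of the stored strings.
df <- data.frame(
  one = c('c("-3.19474987"', 'c("-3.86805776"'),
  two = c('"-3.9386188"', '"-2.57038981"'),
  stringsAsFactors = FALSE
)

# Drop everything except digits, the decimal point and the minus sign,
# then convert what is left.
to_numeric <- function(x) as.numeric(gsub("[^0-9.-]", "", as.character(x)))

# Apply the conversion to every column while keeping the data frame shape.
df[] <- lapply(df, to_numeric)
str(df)
```

If the real values can use scientific notation (e.g. 1e-5), the pattern would need to keep the letter e as well.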

Related

Dropping small sample size from a factor plot

I'm new to R, so forgive my ignorance. I'm playing around with a dataset that reflects the mileage my car has achieved since I got it. Here's the data, formatted as .csv. (Note: I have this data in Excel, and when I saved it as space-delimited .txt, one line kept throwing an error on read.table saying there weren't the right number of columns, so I switched to .csv and it worked fine.)
Date,Miles,Gallons,Price.per.Gallon,Total.Cost,Grade,MPG,Price.per.Mile,Cumulative.Miles,Cumulative.Gallons,Cumulative.Cost,Cumulative.MPG,Cumulative.Price.per.Mile,Gas.Source,Car.Said,Delta,Average.Price.of.Gas,Avg.Temp
6/8/2011,391.8,9.751,3.749,36.556499,Regular,40.18049431,0.093303979,570.4,9.751,36.56,40.18049431,0.064095372,Dealer,41.18,1,3.74935904,82.8
6/22/2011,441.2,9.566,3.359,32.132194,Regular,46.12168095,0.072829089,1011.6,19.317,68.692194,43.14334524,0.067904502,Speedway,47,0.878319047,3.556048765,73.2
7/7/2011,460.6,9.594,3.599,34.528806,Regular,48.0091724,0.074964842,1472.2,28.911,103.221,44.75805057,0.070113436,BP,49.4,1.390827601,3.570301961,79.5
7/18/2011,397.4,8.178,3.319,27.142782,Regular,48.59378821,0.068300911,1869.6,37.089,130.363782,45.60381784,0.069728168,Shell,45.7,2.893788212,3.514890722,83.1
7/26/2011,368.7,8.959,3.359,30.093281,Regular,41.15414667,0.081619965,2238.3,46.048,160.457063,44.73809937,0.071687023,Kroger,42.9,1.745853332,3.484560958,79.1
8/8/2011,436.3,9.845,3.559,35.038355,Regular,44.31691214,0.080307942,2674.6,55.893,195.495418,44.6639114,0.073093329,Kroger,48,3.683087862,3.49767266,76
8/9/2011,262.2,4.986,3.479,17.346294,Regular,52.58724428,0.066156728,2936.8,60.879,212.841712,45.31283365,0.072474023,Shell,46.9,5.687244284,3.496143366,74.5
8/13/2011,250.1,5.887,3.369,19.833303,Regular,42.48343808,0.079301491,3186.9,66.766,232.675015,45.0633556,0.073009826,mobil,45.5,3.016561916,3.484932675,74.1
8/14/2011,424.4,8.699,3.759,32.699541,Regular,48.78721692,0.077048871,3611.3,75.465,265.374556,45.49261247,0.073484495,Speedway,49,0.212783079,3.516524959,68
8/18/2011,437,9.594,3.399,32.610006,regular,45.54930165,0.074622439,4048.3,85.059,297.984562,45.49900657,0.073607332,Speedway,47.6,2.050698353,3.503269049,77.1
8/30/2011,407.3,9.244,3.429,31.697676,Regular,44.06101255,0.077823904,4455.6,94.303,329.682238,45.35804799,0.073992782,Shell,48.6,4.538987451,3.495988866,66.6
9/10/2011,347.3,7.992,3.549,28.363608,Regular,43.45595596,0.081668897,4802.9,102.295,358.045846,45.20944328,0.074547845,Meijer,49.6,6.144044044,3.500130466,65
9/21/2011,375,8.874,3.369,29.896506,Regular,42.25828262,0.079724016,5177.9,111.169,387.942352,44.97386861,0.07492272,Meijer,44.9,2.641717377,3.489663054,67.5
10/5/2011,404.8,9.243,3.079,28.459197,Regular,43.79530455,0.07030434,5582.7,120.412,416.401549,44.88340033,0.074587843,UDF,45.4,1.604695445,3.458139961,61.5
10/14/2011,376.5,8.715,3.249,28.315035,Regular,43.20137694,0.075205936,5959.2,129.127,444.716584,44.76987772,0.074626894,UDF,46.4,3.198623064,3.444024751,56.4
10/23/2011,382.8,8.953,3.199,28.640647,Regular,42.75661789,0.074818827,6342,138.08,473.357231,44.63933951,0.074638479,Speedway,43.8,1.043382107,3.428137536,50.3
10/31/2011,403.4,9.517,3.299,31.396583,Regular,42.38730692,0.077829903,6745.4,147.597,504.753814,44.49412928,0.074829338,Kroger,45.7,3.312693076,3.419810796,47.5
11/15/2011,402.8,9.146,3.249,29.715354,Regular,44.04111087,0.073771981,7148.2,156.743,534.469168,44.46769553,0.074769756,UDF,45.1,1.058889132,3.409843936,54.4
11/29/2011,361.1,9.209,3.149,28.999141,Regular,39.21164079,0.080307785,7509.3,165.952,563.468309,44.1760268,0.075036063,BP,41.7,2.488359214,3.395369197,42.8
12/10/2011,354.2,9.23,3.199,29.52677,Regular,38.37486457,0.083361858,7863.5,175.182,592.995079,43.87037481,0.075411087,Shell,40.3,1.925135428,3.385022885,22.8
12/19/2011,357.4,8.957,2.999,26.862043,Regular,39.90175282,0.075159605,8220.9,184.139,619.857122,43.67733071,0.075400154,UDF,41.3,1.398247181,3.366245727,41.8
1/5/2012,322.6,8.549,3.459,29.570991,Regular,37.73540765,0.091664572,8543.5,192.688,649.428113,43.41370506,0.076014293,Speedway,41,3.26459235,3.370360962,32.3
1/14/2012,370,9.148,3.319,30.362212,Regular,40.44599913,0.082060032,8913.5,201.836,679.790325,43.27919697,0.076265252,Shell,42,1.554000875,3.368033081,17.9
1/28/2012,327.3,9.108,3.329,30.320532,Regular,35.93544137,0.09263835,9240.8,210.944,710.110857,42.96211317,0.076845171,BP,37.5,1.56455863,3.366347737,32.1
2/9/2012,307,7.971,3.399,27.093429,Regular,38.51461548,0.088252212,9547.8,218.915,737.204286,42.80017358,0.077211953,Shell,41.1,2.585384519,3.367536651,28.8
2/16/2012,370.5,10.057,3.229,32.474053,Regular,36.84001193,0.087649266,9918.3,228.972,769.678339,42.53838897,0.077601841,Speedway,42.2,5.359988068,3.361451789,44
2/29/2012,406.3,9.518,3.759,35.778162,Regular,42.6875394,0.088058484,10324.6,238.49,805.456501,42.54434148,0.078013337,Shell,42.9,0.212460601,3.377317711,54.1
3/14/2012,370.6,9.812,3.699,36.294588,Regular,37.77007746,0.097934668,10695.2,248.302,841.751089,42.35567978,0.078703632,UDF,40.5,2.729922544,3.390029436,63.6
3/23/2012,357.6,7.999,3.929,31.428071,Regular,44.7055882,0.087886105,11052.8,256.301,873.17916,42.429019,0.07900072,Shell,43.1,1.605588199,3.406850383,66
4/3/2012,252.5,4.57,3.849,17.58993,Regular,55.25164114,0.069663089,11305.3,260.871,890.76909,42.65364874,0.078792167,Meijer,41.9,13.35164114,3.414596065,58.6
4/13/2012,382.3,9.416,3.629,34.170664,Regular,40.6011045,0.089381805,11687.6,270.287,924.939754,42.58214417,0.079138553,Shell,44.3,3.698895497,3.422065264,51.2
4/24/2012,393.7,9.018,3.659,32.996862,Regular,43.65713018,0.083812197,12081.3,279.305,957.936616,42.61685254,0.079290856,UDF,43.3,0.357130184,3.429715243,49.2
5/7/2012,354.7,9.203,3.729,34.317987,Regular,38.54177985,0.096752148,12436,288.508,992.254603,42.48686345,0.079788887,Speedway,40.6,2.058220146,3.439262007,70.3
5/18/2012,378,9.505,3.699,35.158995,Regular,39.76854287,0.093013214,12814,298.013,1027.413598,42.40016375,0.080178992,Speedway,42.2,2.431457128,3.447546241,62.2
6/1/2012,381.5,9.781,3.699,36.179919,Regular,39.0041918,0.094835961,13195.5,307.794,1063.593517,42.29224741,0.080602745,Sunoco,41,1.9958082,3.455536875,61.4
6/12/2012,386.8,8.976,3.649,32.753424,Regular,43.09269162,0.084677932,13582.3,316.77,1096.346941,42.31492881,0.080718799,Meijer,44.1,1.007308378,3.46101885,75.5
6/23/2012,379.9,9.168,3.339,30.611952,Regular,41.43760908,0.080578973,13962.2,325.938,1126.958893,42.29025152,0.080714994,Kroger,41.8,0.362390925,3.457586697,74.4
7/8/2012,321.9,8.285,3.549,29.403465,Regular,38.85334943,0.091343476,14284.1,334.223,1156.362358,42.20505471,0.080954513,Shell,40.9,2.046650573,3.459852727,84.1
7/21/2012,369.5,8.88,3.479,30.89352,Regular,41.61036036,0.083608985,14653.6,343.103,1187.255878,42.18966316,0.081021447,Meijer,42.6,0.98963964,3.460348286,70.1
7/21/2012,385,7.808,3.499,27.320192,Regular,49.30840164,0.070961538,15038.6,350.911,1214.57607,42.34805976,0.080763906,Speedway,48.5,0.808401639,3.461208312,70.1
7/26/2012,367.1,9.644,3.479,33.551476,Regular,38.06511821,0.091396012,15405.7,360.555,1248.127546,42.23350113,0.081017256,BP,44.2,6.134881792,3.461684198,82.5
8/12/2012,376.6,9.287,3.769,35.002703,Regular,40.55130828,0.09294398,15782.3,369.842,1283.130249,42.19126005,0.081301854,BP,42.3,1.74869172,3.46940112,66.4
8/24/2012,414.9,9.22,3.859,35.57998,Regular,45,0.085755556,16197.2,379.062,1318.710229,42.25957759,0.081415938,Speedway,44.6,0.4,3.478877411,76.5
9/9/2012,373.3,8.984,3.799,34.130216,Regular,41.55164737,0.091428385,16570.5,388.046,1352.840445,42.24318766,0.081641498,Speedway,42.8,1.248352627,3.486288855,62.1
9/19/2012,408.1,9.123,3.799,34.658277,Regular,44.73309218,0.084925942,16978.6,397.169,1387.498722,42.30038095,0.081720443,BP,46.5,1.766907815,3.493471852,53.5
9/28/2012,408.3,9.281,3.659,33.959179,Regular,43.99310419,0.083172126,17386.9,406.45,1421.457901,42.33903309,0.081754534,BP,45.6,1.606895809,3.497251571,59
10/7/2012,393.1,8.942,3.699,33.076458,Regular,43.96108253,0.084142605,17780,415.392,1454.534359,42.37395039,0.081807332,Speedway,46.3,2.338917468,3.50159454,45
10/15/2012,402.9,9.075,3.549,32.207175,Regular,44.39669421,0.079938384,18182.9,424.467,1486.741534,42.41719615,0.081765919,Speedway,46.1,1.703305785,3.502608057,54.6
10/24/2012,365.7,8.264,3.299,27.262936,Regular,44.25217812,0.074550003,18548.6,432.731,1514.00447,42.45223938,0.081623652,Speedway,46.8,2.547821878,3.49871969,68.5
11/4/2012,363.3,9.561,3.259,31.159299,Regular,37.99811735,0.085767407,18911.9,442.292,1545.163769,42.35595489,0.081703254,Meijer,42,4.001882648,3.493537683,37.3
11/15/2012,391.9,10.224,3.499,35.773776,Regular,38.33137715,0.091282919,19303.8,452.516,1580.937545,42.26502488,0.081897737,Speedway,44.1,5.768622848,3.493661097,33.7
11/24/2012,430.2,9.068,3.579,32.454372,Regular,47.44155271,0.075440195,19734,461.584,1613.391917,42.36671982,0.081756963,BP,44.3,3.141552713,3.495337614,29.5
12/2/2012,394.5,9.146,3.239,29.623894,Regular,43.13361032,0.075092253,20128.5,470.73,1643.015811,42.38162004,0.081626341,Sunoco,45.8,2.666389679,3.490357128,55.1
12/12/2012,386.1,9.312,3.169,29.509728,Regular,41.46262887,0.076430272,20514.6,480.042,1672.525539,42.36379317,0.081528547,Speedway,43.4,1.937371134,3.484123345,31
12/23/2012,359.8,8.642,3.199,27.645758,Regular,41.63388105,0.076836459,20874.4,488.684,1700.171297,42.35088523,0.081447673,Speedway,42.4,0.766118954,3.479081159,30.7
1/6/2013,336.4,8.878,3.079,27.335362,Regular,37.89141699,0.081258508,21210.8,497.562,1727.506659,42.27131493,0.081444672,Meijer,41,3.108583014,3.47194251,33.2
1/21/2013,350,9.257,3.259,30.168563,Regular,37.80922545,0.086195894,21560.8,506.819,1757.675222,42.1898153,0.0815218,Meijer,40.6,2.790774549,3.468053135,20.8
2/1/2013,335.7,9.058,3.499,31.693942,Regular,37.0611614,0.094411504,21896.5,515.877,1789.369164,42.09976409,0.081719415,Meijer,38.7,1.638838596,3.468596514,12.1
2/13/2013,360.9,9.42,3.759,35.40978,Regular,38.31210191,0.098115212,22257.4,525.297,1824.778944,42.03184103,0.08198527,Speedway,41.4,3.087898089,3.473804236,31
2/26/2013,371.3,9.081,3.899,35.406819,Regular,40.88756745,0.09535906,22628.7,534.378,1860.185763,42.01239572,0.082204712,Meijer,42.2,1.312432551,3.481029838,36.9
3/9/2013,362.6,8.952,3.439,30.785928,Regular,40.5049151,0.084903276,22991.3,543.33,1890.971691,41.98755821,0.082247271,BP,42.7,2.195084897,3.480337347,36.5
3/21/2013,375.3,8.991,3.859,34.696269,Regular,41.74174174,0.092449424,23366.6,552.321,1925.66796,41.98355666,0.082411132,Kroger,44,2.258258258,3.486501437,23.8
4/8/2013,361.7,9,3.299,29.691,Regular,40.18888889,0.082087365,23728.3,561.321,1955.35896,41.95478167,0.082406197,Speedway,43.4,3.211111111,3.483495112,61.8
4/20/2013,362.3,8.036,3.699,29.725164,Regular,45.08461921,0.082045719,24090.6,569.357,1985.084124,41.99895672,0.082400776,BP,45.6,0.515380786,3.486536784,39
4/30/2013,382.3,8.246,3.539,29.182594,Regular,46.36187242,0.076334277,24472.9,577.603,2014.266718,42.06124276,0.082306009,Speedway,48.7,2.338127577,3.487285762,60.2
5/9/2013,397.3,8.722,3.339,29.122758,Regular,45.55147902,0.073301681,24870.2,586.325,2043.389476,42.1131625,0.082162165,Pilot,47.4,1.848520981,3.485079906,65.8
5/18/2013,399,9.051,3.899,35.289849,Regular,44.08352668,0.088445737,25269.2,595.376,2078.679325,42.14311628,0.082261382,Kroger,45.7,1.616473318,3.491372385,68.3
5/30/2013,380.2,9.04,3.659,33.07736,Regular,42.05752212,0.086999895,25649.4,604.416,2111.756685,42.14183609,0.082331621,Sunoco,44.4,2.342477876,3.493879522,78.2
6/14/2013,395.3,9.095,3.759,34.188105,Regular,43.46344145,0.086486479,26044.7,613.511,2145.94479,42.16142824,0.082394683,Meijer,45,1.536558549,3.497809803,67.6
6/22/2013,390.3,9.008,3.559,32.059472,Regular,43.32815275,0.082140589,26435,622.519,2178.004262,42.17831102,0.082390931,BP,44.3,0.971847247,3.49869524,78.2
7/4/2013,388.9,9.501,3.399,32.293899,Regular,40.93253342,0.083039082,26823.9,632.02,2210.298161,42.15958356,0.082400328,BP,43.7,2.767466582,3.497196546,71.6
7/18/2013,399.8,9.06,3.299,29.88894,Regular,44.12803532,0.07475973,27223.7,641.08,2240.187101,42.18740251,0.08228812,Speedway,45.2,1.07196468,3.494395553,83.9
8/25/2013,394.3,9.114,3.529,32.163306,Regular,43.2631117,0.081570647,27618,650.194,2272.350407,42.20248111,0.082277877,Kroger,45.8,2.536888304,3.494880616,74.6
9/5/2013,413.7,9.507,3.519,33.455133,Regular,43.51530451,0.0808681,28031.7,659.701,2305.80554,42.2214003,0.082257071,Speedway,46,2.484695488,3.495228202,70.2
9/14/2013,431.2,9.272,3.299,30.588328,Regular,46.50560828,0.070937681,28462.9,668.973,2336.393868,42.28077964,0.082085587,UDF,46.7,0.194391717,3.492508469,55.1
9/25/2013,417.6,9.685,3.159,30.594915,Regular,43.11822406,0.073263685,28880.5,678.658,2366.988783,42.29273065,0.081958026,Meijer,48.1,4.981775942,3.487749033,61.3
10/11/2013,421.9,9.202,3.299,30.357398,Regular,45.84872854,0.071954013,29302.4,687.86,2397.346181,42.34030181,0.081813987,Kroger,45.7,0.148728537,3.485224001,62.7
10/23/2013,389,8.975,3.259,29.249525,Regular,43.34261838,0.075191581,29691.4,696.835,2426.595706,42.35321131,0.081727224,Meijer,45.9,2.557381616,3.482310312,39.6
11/2/2013,392.8,8.852,3.299,29.202748,Regular,44.37415273,0.074345081,30084.2,705.687,2455.798454,42.3785616,0.081630838,Meijer,44.8,0.425847266,3.480010903,49.7
11/12/2013,363.5,9.114,2.959,26.968326,Regular,39.88369541,0.074190718,30447.7,714.801,2482.76678,42.34675105,0.081542014,Valero,44,4.116304586,3.473367804,31.4
11/24/2013,375.5,9.123,3.199,29.184477,Regular,41.15970624,0.077721643,30823.2,723.924,2511.951257,42.33179174,0.081495473,UDF,42.6,1.440293763,3.46991018,21.1
12/2/2013,364,9.006,2.999,27.008994,Regular,40.41749944,0.074200533,31187.2,732.93,2538.960251,42.30826955,0.08141033,Meijer,41.1,0.682500555,3.464123792,38.9
12/12/2013,325.8,8.576,2.979,25.547904,Regular,37.98973881,0.078415912,31513,741.506,2564.508155,42.25832293,0.081379372,Murphy,39.5,1.510261194,3.458513019,13.8
1/7/2014,317.1,8.915,3.199,28.519085,Regular,35.56926528,0.089937196,31830.1,750.421,2593.02724,42.17885693,0.081464628,Kroger,38.5,2.930734717,3.455430005,-3.6
1/15/2014,359.5,9.252,3.299,30.522348,Regular,38.85646347,0.08490222,32189.6,759.673,2623.549588,42.13839376,0.081503019,Meijer,41.1,2.243536533,3.453524856,28.6
1/27/2014,302.7,8.89,3.249,28.88361,Regular,34.04949381,0.095419921,32492.3,768.563,2652.433198,42.04482912,0.08163267,BP,35.8,1.750506187,3.451159109,37.1
2/4/2014,346.7,8.983,3.279,29.455257,Regular,38.59512412,0.084958918,32839,777.546,2681.888455,42.00497463,0.081667787,UDF,40,1.404875877,3.449170152,22.4
2/16/2014,310.1,8.773,3.459,30.345807,Regular,35.34708766,0.097858133,33149.1,786.319,2712.234262,41.93069225,0.081819243,Speedway,37.7,2.352912345,3.449279824,23.3
3/1/2014,361.8,9.065,3.599,32.624935,Regular,39.91174848,0.09017395,33510.9,795.384,2744.859197,41.90768233,0.081909444,Speedway,42.2,2.288251517,3.450986187,35.1
3/17/2014,354.2,9.356,3.579,33.485124,Regular,37.858059,0.094537335,33865.1,804.74,2778.344321,41.86060094,0.082041521,Speedway,41.9,4.041941,3.45247449,26.3
3/28/2014,354.1,9.165,3.579,32.801535,Regular,38.63611566,0.092633536,34219.2,813.905,2811.145856,41.82429153,0.082151127,UDF,39.8,1.163884343,3.453899234,51.9
4/8/2014,371.5,9.164,3.549,32.523036,Regular,40.53906591,0.087545184,34590.7,823.069,2843.668892,41.80998191,0.082209059,UDF,41.7,1.16093409,3.45495808,49.7
4/21/2014,373.8,9.216,3.679,33.905664,Regular,40.55989583,0.090705361,34964.5,832.285,2877.574556,41.79613954,0.082299891,Shell,42.2,1.640104167,3.457438925,64.1
5/2/2014,391.9,8.834,3.599,31.793566,Regular,44.36268961,0.081126731,35356.4,841.119,2909.368122,41.82309519,0.082286888,Speedway,44.8,0.437310392,3.458925695,50.9
5/10/2014,375.1,8.854,3.659,32.396786,Regular,42.36503275,0.086368398,35731.5,849.973,2941.764908,41.82874044,0.082329734,Speedway,45.8,3.434967246,3.46100983,65.5
5/21/2014,401.1,9.094,3.659,33.274946,Regular,44.10600396,0.082959227,36132.6,859.067,2975.039854,41.85284733,0.082336722,Speedway,45.6,1.493996041,3.463105734,72.3
6/6/2014,435.3,9.487,3.599,34.143713,Regular,45.88384105,0.0784372,36567.9,868.554,3009.183567,41.89687688,0.082290303,Speedway,50.5,4.616158954,3.464590074,67.5
6/21/2014,458.4,9.286,3.799,35.277514,Regular,49.36463493,0.076957928,37026.3,877.84,3044.461081,41.9758726,0.082224286,Kroger,49.6,0.235365066,3.468127541,73.8
7/5/2014,386.8,9.292,3.029,28.145468,Regular,41.6272062,0.072764912,37413.1,887.132,3072.606549,41.97222059,0.082126489,Speedway,44.5,2.872793801,3.463528031,69.2
7/19/2014,433.1,8.961,3.499,31.354539,Regular,48.33165941,0.072395611,37846.2,896.093,3103.961088,42.03581548,0.082015132,Kroger,48.3,0.031659413,3.463882753,66.7
8/6/2014,401.4,9.055,3.399,30.777945,Regular,44.32909994,0.076676495,38247.6,905.148,3134.739033,42.05875724,0.081959104,Speedway,47.6,3.270900055,3.463233673,73.1
8/25/2014,414.1,9.001,3.039,27.354039,Regular,46.00599933,0.066056602,38661.7,914.149,3162.093072,42.09762304,0.081788775,Speedway,46.9,0.894000667,3.459056535,78.2
9/15/2014,406.2,9.094,2.959,26.909146,Regular,44.66681328,0.066246051,39067.9,923.243,3189.002218,42.12292972,0.081627173,Kroger,47.1,2.433186717,3.454130947,59.5
9/30/2014,396.3,9.129,3.189,29.112381,Regular,43.41110746,0.073460462,39464.2,932.372,3218.114599,42.13554247,0.081545162,Kroger,46.7,3.28889254,3.451535009,62
10/22/2014,397.7,9.328,2.859,26.668752,Regular,42.63507719,0.06705746,39861.9,941.7,3244.783351,42.1404906,0.081400619,UDF,45.1,2.464922813,3.445665659,46.9
11/5/2014,413.2,9.262,2.879,26.665298,Regular,44.61239473,0.064533635,40275.1,950.962,3271.448649,42.16456599,0.081227574,UDF,46,1.387605269,3.440146556,50
11/17/2014,398.9,9.081,2.899,26.325819,Regular,43.9268803,0.065996037,40674,960.043,3297.774468,42.18123563,0.081078194,Speedway,45.2,1.2731197,3.435027877,28.6
11/25/2014,345.8,9.003,2.899,26.099697,Regular,38.40941908,0.075476278,41019.8,969.046,3323.874165,42.14619327,0.08103097,UDF,40.7,2.290580917,3.430047867,36.7
12/7/2014,345.6,8.738,2.139,18.690582,Regular,39.55138476,0.054081545,41365.4,977.784,3342.564747,42.12300467,0.080805812,Speedway,41.6,2.048615244,3.418510373,33
12/30/2014,360.8,9.013,1.869,16.845297,Regular,40.03106624,0.046688739,41726.2,986.797,3359.410044,42.10389776,0.080510807,Kroger,42.2,2.168933762,3.40435778,25.4
2/2/2015,338.8,8.725,2.059,17.964775,Regular,38.83094556,0.05302472,42065,995.522,3377.374819,42.0752128,0.080289429,Speedway,41.1,2.269054441,3.392566733,25.9
2/12/2015,321.7,8.765,2.359,20.676635,Regular,36.70279521,0.064273034,42386.7,1004.287,3398.051454,42.02832457,0.08016787,Speedway,39.2,2.497204792,3.383546191,26.3
3/3/2015,310.7,9.93,2.039,20.24727,Regular,31.28902316,0.065166624,42697.4,1014.217,3418.298724,41.92317818,0.080058709,AAFES,37.4,6.110976838,3.370382003,26.4
3/13/2015,408.5,9.404,2.199,20.679396,Regular,43.43896214,0.050622756,43105.9,1023.621,3438.97812,41.93710367,0.079779755,Kroger,42.7,0.738962144,3.359620524,46.1
3/22/2015,396.5,9.051,2.339,21.170289,Regular,43.80731411,0.05339291,43502.4,1032.672,3460.148409,41.9534954,0.079539253,Speedway,45.9,2.092685891,3.35067515,40
3/30/2015,386.7,8.931,1.999,17.853069,Regular,43.29862277,0.04616775,43889.1,1041.603,3478.001478,41.9650289,0.079245222,Meijer,44.4,1.101377225,3.339085504,46.4
4/10/2015,414,8.905,2.399,21.363095,Regular,46.49073554,0.051601679,44303.1,1050.508,3499.364573,42.00339264,0.078986901,Kroger,48.3,1.809264458,3.331116539,61
4/19/2015,368.7,7.84,2.419,18.96496,Regular,47.02806122,0.051437375,44671.8,1058.348,3518.329533,42.04061424,0.07875952,Shell,48.4,1.371938776,3.324359788,62.5
4/28/2015,407.9,9.18,2.179,20.00322,Regular,44.4335512,0.049039519,45079.7,1067.528,3538.332753,42.06119184,0.078490601,Speedway,47.5,3.066448802,3.314510489,49.3
5/10/2015,425.1,9.235,2.499,23.078265,Regular,46.03140227,0.054289026,45504.8,1076.763,3561.411018,42.09524287,0.078264513,Kroger,47.7,1.668597726,3.307516155,74.9
5/19/2015,436.6,9.161,2.629,24.084269,Regular,47.65855256,0.055163236,45941.4,1085.924,3585.495287,42.1421757,0.078044972,BP,49.1,1.44144744,3.301792102,62.9
5/28/2015,399.1,8.503,2.299,19.548397,Regular,46.9363754,0.0489812,46340.5,1094.427,3605.043684,42.17942357,0.077794665,UDF,49,2.063624603,3.294001047,72.9
6/9/2015,416.6,8.858,2.639,23.376262,Regular,47.03093249,0.056112007,46757.1,1103.285,3628.419946,42.21837513,0.077601475,Kroger,48.4,1.36906751,3.288742207,65.5
7/9/2015,419.6,8.917,2.389,21.302713,Regular,47.05618482,0.050769097,47176.7,1112.202,3649.722659,42.25716192,0.077362822,BP,49.4,2.343815184,3.281528588,73.1
7/30/2015,433.9,9.361,2.499,23.393139,Regular,46.35188548,0.053913664,47610.6,1121.563,3673.115798,42.29133807,0.077149118,UDF,48.9,2.548114518,3.274997301,76.2
8/12/2015,410.8,8.774,2.699,23.681026,Regular,46.82015044,0.05764612,48021.4,1130.337,3696.796824,42.32649201,0.076982279,UDF,47.5,0.679849556,3.270526245,68.3
8/23/2015,397,8.841,2.059,18.203619,Regular,44.90442258,0.045852945,48418.4,1139.178,3715.000443,42.34649897,0.076727039,UDF,48.8,3.895577423,3.26112376,72.1
9/1/2015,435.8,9.6,1.999,19.1904,Regular,45.39583333,0.044034878,48854.2,1148.778,3734.190843,42.37198136,0.076435411,Kroger,49.6,4.204166667,3.250576563,75.5
9/12/2015,422.5,8.493,2.269,19.270617,Regular,49.74685035,0.045610928,49276.7,1157.271,3753.46146,42.42610417,0.076171121,Kroger,45.3,4.446850347,3.243372952,58.8
9/22/2015,391.3,8.491,1.799,15.275309,Regular,46.08408904,0.039037335,49668,1165.762,3768.736769,42.45274764,0.075878569,Speedway,48.3,2.215910965,3.232852648,63.3
10/1/2015,421.3,8.961,2.459,22.035099,Regular,47.01484209,0.052302632,50089.3,1174.723,3790.771868,42.48754813,0.075680272,Kroger,50.2,3.185157906,3.22694956,55.9
10/25/2015,412.4,10.057,1.079,10.851503,Regular,41.00626429,0.026313053,50501.7,1184.78,3801.623371,42.47497426,0.075277137,UDF,45.8,4.793735706,3.208716699,55.2
11/14/2015,445.4,9.047,1.979,17.904013,Regular,49.23178954,0.040197604,50947.1,1193.827,3819.527384,42.52617842,0.074970457,Kroger,45.5,3.731789543,3.199397722,38.2
11/24/2015,395.3,9.451,1.899,17.947449,Regular,41.82626177,0.045402097,51342.4,1203.278,3837.474833,42.52068101,0.074742802,Meijer,44.4,2.573738229,3.189183907,37.7
12/9/2015,381.4,9.291,1.469,13.648479,Regular,41.05047896,0.03578521,51723.8,1212.569,3851.123312,42.50941596,0.074455537,Speedway,43.8,2.749521042,3.176003437,46.7
12/18/2015,391,8.715,1.839,16.026885,Regular,44.86517499,0.040989476,52114.8,1221.284,3867.150197,42.5262265,0.074204452,Kroger,46.1,1.234825014,3.166462671,30.8
12/31/2015,356.6,8.754,1.999,17.499246,Regular,40.7356637,0.049072479,52471.4,1230.038,3884.649443,42.51348332,0.074033653,Speedway,43.2,2.464336303,3.158154011,33.4
1/8/2016,375.7,10.531,1.099,11.573569,Regular,35.67562435,0.030805347,52847.1,1240.569,3896.223012,42.45543779,0.073726335,UDF,43.2,7.524375653,3.140674168,38.4
1/17/2016,408.8,8.996,1.199,10.786204,Regular,45.44241885,0.026385039,53255.9,1249.565,3907.009216,42.47694198,0.073362937,Kroger,41.1,4.342418853,3.126695463,24
1/26/2016,326.8,8.83,1.799,15.88517,Regular,37.01019253,0.048608231,53582.7,1258.395,3922.894386,42.43858248,0.073211958,Kroger,39.9,2.889807475,3.11737919,39.6
2/3/2016,338.2,7.974,1.599,12.750426,Regular,42.41284174,0.037700846,53920.9,1266.369,3935.644812,42.4384204,0.072989227,UDF,44.1,1.687158264,3.107818347,53.7
2/10/2016,355.1,8.88,1.349,11.97912,Regular,39.98873874,0.033734497,54276,1275.249,3947.623932,42.42136242,0.072732403,UDF,43.3,3.311261261,3.095571086,16.5
2/17/2016,334.9,8.703,1.559,13.567977,Regular,38.48098357,0.040513517,54610.9,1283.952,3961.191909,42.39465338,0.072534822,UDF,39.6,1.119016431,3.08515576,31.7
2/26/2016,375.8,8.959,1.879,16.833961,Regular,41.94664583,0.044795,54986.7,1292.911,3978.02587,42.39154899,0.072345237,UDF,44.4,2.453354169,3.076797916,29.9
3/13/2016,385.7,8.732,1.959,17.105988,Regular,44.17086578,0.0443505,55372.4,1301.643,3995.131858,42.40348544,0.072150238,UDF,45,0.829134219,3.06929923,54.5
4/5/2016,402.6,9.241,1.959,18.103119,Regular,43.56671356,0.044965522,55775,1310.884,4013.234977,42.41168555,0.071954011,Kroger,45.9,2.333286441,3.061472241,30.9
4/14/2016,370.8,8.674,2.1139,18.3359686,Regular,42.74844362,0.049449754,56145.8,1319.558,4031.570946,42.4138992,0.071805388,UDF,44.5,1.751556375,3.055243457,47.7
4/28/2016,397.6,9.2,2.399,22.0708,Regular,43.2173913,0.05551006,56543.4,1328.758,4053.641746,42.41946239,0.071690803,UDF,44.7,1.482608696,3.050699786,55.8
5/7/2016,377,8.884,1.669,14.827396,Regular,42.43583971,0.039329963,56920.4,1337.642,4068.469142,42.41957116,0.071476468,Speedway,44.1,1.664160288,3.041523174,62.3
5/18/2016,389.2,9.253,2.459,22.753127,Regular,42.06203393,0.058461272,57309.6,1346.895,4091.222269,42.41711492,0.071388079,Kroger,44.8,2.737966065,3.037521313,56.2
5/24/2016,410.5,8.846,2.579,22.813834,Regular,46.40515487,0.055575722,57720.1,1355.741,4114.036103,42.44313626,0.071275623,UDF,47.1,0.694845128,3.034529532,68
6/28/2016,376.6,8.994,2.349,21.126906,Regular,41.87235935,0.05609906,58096.7,1364.735,4135.163009,42.43937468,0.071177244,UDF,42.9,1.027640649,3.030011694,74.6
7/13/2016,357.2,9.138,1.579,14.428902,Regular,39.08951631,0.040394462,58453.9,1373.873,4149.591911,42.41709387,0.070989137,Meijer,41.6,2.510483694,3.020360623,79
8/4/2016,358.8,9.236,1.919,17.723884,Regular,38.84798614,0.04939767,58812.7,1383.109,4167.315795,42.3932604,0.070857413,Kroger,40.7,1.852013859,3.013006057,79.6
8/12/2016,386.7,8.98,2.239,20.10622,Regular,43.0623608,0.051994363,59199.4,1392.089,4187.422015,42.39757659,0.070734197,Kroger,44.7,1.637639198,3.008013148,82.9
8/22/2016,367.9,8.752,2.339,20.470928,Regular,42.03610603,0.055642642,59567.3,1400.841,4207.892943,42.39531824,0.070640988,Meijer,43.3,1.263893967,3.003833371,66.6
8/30/2016,360.1,9.337,2.139,19.971843,Regular,38.56699154,0.055461936,59927.4,1410.178,4227.864786,42.36997032,0.070549778,UDF,41.1,2.533008461,2.998107179,76
9/12/2016,410.1,9.475,2.159,20.456525,Regular,43.2823219,0.049881797,60337.5,1419.653,4248.321311,42.3760595,0.070409303,Marathon,45.1,1.8176781,2.992506838,66
9/22/2016,395.8,9.273,2.189,20.298597,Regular,42.68305834,0.051284985,60733.3,1428.926,4268.619908,42.37805177,0.070284669,UDF,44.4,1.716941659,2.987292489,73.6
10/5/2016,379.5,9.097,1.699,15.455803,Regular,41.71704958,0.040726754,61112.8,1438.023,4284.075711,42.37387024,0.07010112,Speedway,43.7,1.982950423,2.979142691,67.3
10/7/2016,129.6,2.722,2.309,6.285098,Regular,47.61204996,0.048496127,61242.4,1440.745,4290.360809,42.38376673,0.0700554,Kroger,47.2,0.412049963,2.977876591,67.4
10/10/2016,400.1,8.569,2.219,19.014611,Regular,46.69156261,0.047524646,61642.5,1449.314,4309.37542,42.40923637,0.06990916,Kroger,48.8,2.108437391,2.973389769,53.2
10/20/2016,395.7,8.947,1.949,17.437703,Regular,44.22711523,0.044067988,62038.2,1458.261,4326.813123,42.42038977,0.069744337,Wal Mart,46.8,2.572884766,2.967104738,59.5
10/31/2016,395.9,9.247,2.099,19.409453,Regular,42.81388558,0.049026151,62434.1,1467.508,4346.222576,42.42286925,0.069612961,Meijer,45.6,2.786114415,2.961634673,51.2
11/10/2016,414.6,8.899,1.999,17.789101,Regular,46.58950444,0.042906659,62848.7,1476.407,4364.011677,42.44798352,0.069436785,Meijer,48.3,1.710495561,2.955832421,45
11/22/2016,366,9.225,2.599,23.975775,Premium,39.67479675,0.065507582,63214.7,1485.632,4387.987452,42.43076347,0.069414036,Meijer,43.6,3.925203252,2.953616677,30.8
12/6/2016,393.2,9.229,1.989,18.356481,Regular,42.60483259,0.046684845,63607.9,1494.861,4406.343933,42.43183814,0.069273533,BP,44.8,2.195167407,2.947661309,N/A
12/21/2016,334,8.855,2.259,20.003445,Regular,37.71880294,0.059890554,63941.9,1503.716,4426.347378,42.40408428,0.069224521,UDF,39.2,1.481197064,2.943605959,N/A
1/9/2017,332,8.847,2.429,21.489363,Regular,37.52684526,0.064726997,64273.9,1512.563,4447.836741,42.37555725,0.069201289,Speedway,39.6,2.073154742,2.940596022,N/A
One of the things I tried was
plot(factor(Gas.Source),MPG)
It's exactly what you'd expect. Some of the factor levels have very few observations (or just one), so rather than a box and whisker you just get a black line.
I understand this is exactly what I asked it to do, as some of those sources had very few observations. So what I'd like to do is efficiently remove the measurements associated with factor levels where there aren't enough observations to really produce a box and whisker...
I'm guessing I could do this by creating a new dataframe where I've used logical subscripting to select only those rows corresponding to a factor level that has a count that's greater than X....but I'm not sophisticated enough to figure that out yet.
Found what I was looking for here
Given that the original data was in a dataframe called mileage
tbl <- table(mileage$Gas.Source)
new.Mileage <- droplevels(mileage[mileage$Gas.Source %in% names(tbl)[tbl>10],,drop=FALSE])
new.Mileage now has only those rows where there were more than 10 observations at that factor level (i.e. from that gas source)
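The same filtering can be tried on a small made-up data frame. This is only a sketch: the six rows and the min_obs threshold are invented for illustration, and the cutoff is lowered so the toy data still has something left.

```r
# Made-up stand-in for the mileage data: three sources, one of them rare.
mileage <- data.frame(
  Gas.Source = c("Speedway", "Speedway", "Speedway", "Kroger", "Kroger", "Pilot"),
  MPG        = c(45.5, 46.1, 44.3, 43.0, 41.9, 45.6),
  stringsAsFactors = TRUE
)

min_obs <- 2  # hypothetical threshold; the answer above uses more than 10

# Count observations per source and keep the sources that meet the threshold.
tbl  <- table(mileage$Gas.Source)
keep <- names(tbl)[tbl >= min_obs]

# Subset the rows, then drop the now-empty factor levels so they do not
# show up as empty slots in the plot.
new.Mileage <- droplevels(mileage[mileage$Gas.Source %in% keep, , drop = FALSE])

levels(new.Mileage$Gas.Source)
plot(new.Mileage$Gas.Source, new.Mileage$MPG)
```

Without droplevels() the removed sources would still be factor levels and would still get a position on the x axis.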

Merge columns with the same name R

I'm fairly new to R. I'm working with a data set that is incredibly redundant, with a lot of columns (~400). Several column names are duplicated, but the data in those columns is not, so I need to sum the columns when collapsing them.
The columns all have a similar name that allows easy identification, so I'm hoping I can use that to my advantage.
I attempted to perform the following:
ColNames <- unique(colnames(df))
CombinedDf <- data.frame(sapply(ColNames, function(i) rowSums(df[, colnames(df) == i, drop = FALSE])))
This works if I sum over the range of columns that only contain integers, but the issue is that other columns have strings and such in them, so rowSums throws a fit.
Assuming that the identifier is "XXX", how can I aggregate all the columns that are of the same name leaving the other columns as is?
Thank you for your time.
Edit: Sample data has been asked for, I cannot give the exact data as it is sensitive, but I will give an example:
Name COL1XXX COL2XXX COL1XXX COL3XXX COL2XXX Type
Henry 5 15 25 31 1 Orange
Tom 8 16 12 4 3 Green
Should return
Name COL1XXX COL2XXX COL3XXX Type
Henry 30 16 31 Orange
Tom 20 19 4 Green
I'm not really sure, but you may try transposing the data and then aggregating by unique names.
t_df=as.data.frame(t(df))
new_df=aggregate(t_df, by=list(rownames(t_df)),sum)
Again, without sample data I'm unsure if it'll work, but based on what you said, that might work.
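Transposing a data frame that mixes text and numbers turns everything into character, so sum may fail on it. An alternative sketch, assuming the columns to collapse can be recognised by the "XXX" identifier in their names, is to sum only those columns and leave the rest alone (the data frame below just rebuilds the example from the question):

```r
# Rebuild the example; check.names = FALSE keeps the duplicated names.
df <- data.frame(
  Name = c("Henry", "Tom"),
  COL1XXX = c(5, 8), COL2XXX = c(15, 16), COL1XXX = c(25, 12),
  COL3XXX = c(31, 4), COL2XXX = c(1, 3),
  Type = c("Orange", "Green"),
  check.names = FALSE, stringsAsFactors = FALSE
)

id <- "XXX"                               # identifier shared by the columns to sum
nm <- colnames(df)
is_sum    <- grepl(id, nm, fixed = TRUE)  # TRUE for columns that should be collapsed
sum_names <- unique(nm[is_sum])

# For each distinct name, add up all columns carrying that name.
summed <- as.data.frame(
  setNames(
    lapply(sum_names, function(n) rowSums(df[, nm == n, drop = FALSE])),
    sum_names
  )
)

# Put the untouched columns back and restore the original column order.
result <- cbind(df[, !is_sum, drop = FALSE], summed)[, unique(nm)]
result
```

Columns without the identifier (Name and Type here) never reach rowSums, so strings in them are not a problem.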

R readr package - written and read in file doesn't match source

I apologize in advance for the lack of reproducibility here. I am doing an analysis on a very large (for me) dataset. It is from the CMS Open Payments database.
There are four files I downloaded from that website, read into R using readr, then manipulated a bit to make them smaller (column removal), and then stuck them all together using rbind. I would like to write my pared-down file out to an external hard drive so I don't have to read in all the data and redo the paring each time I want to work on it. (Obviously, it's all scripted, but it takes about 45 minutes to run, so I'd like to avoid it if possible.)
So I wrote out the data and read it in, but now I am getting different results. Below is about as close as I can get to a good example. The data is named sa_all. There is a column in the table for the source. It can only take on two values: gen or res. It is a column that is actually added as part of the analysis, not one that comes in the data.
table(sa_all$src)
     gen      res
14837291   822559
So I save the sa_all dataframe into a CSV file.
write.csv(sa_all, 'D:\\Open_Payments\\data\\written_files\\sa_all.csv',
row.names = FALSE)
Then I open it:
sa_all2 <- read_csv('D:\\Open_Payments\\data\\written_files\\sa_all.csv')
table(sa_all2$src)
       g      gen      res
       1 14837289   822559
I did receive the following parsing warnings.
Warning: 4 parsing failures.
row col expected actual
5454739 pmt_nature embedded null
7849361 src delimiter or quote 2
7849361 src embedded null
7849361 NA 28 columns 54 columns
Since I manually add the src column and it can only take on two values, I don't see how this could cause any parsing errors.
Has anyone had any similar problems using readr? Thank you.
Just to follow up on the comment:
write_csv(sa_all, 'D:\\Open_Payments\\data\\written_files\\sa_all.csv')
sa_all2a <- read_csv('D:\\Open_Payments\\data\\written_files\\sa_all.csv')
Warning: 83 parsing failures.
row col expected actual
1535657 drug2 embedded null
1535657 NA 28 columns 25 columns
1535748 drug1 embedded null
1535748 year an integer No
1535748 NA 28 columns 27 columns
Even more parsing errors and it looks like some columns are getting shuffled entirely:
table(sa_all2a$src)
         100000000278         Allergan Inc.                   gen GlaxoSmithKline, LLC.
                    1                     1              14837267                     1
                   No                   res
                    1                822559
There are columns for manufacturer names and it looks like those are leaking into the src column when I use the write_csv function.
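If the file only ever needs to be read back into R, one option that avoids CSV parsing altogether is R's native serialization with saveRDS and readRDS. This is only a sketch under that assumption and does not explain the parsing failures themselves; the tiny data frame and the temporary path are made-up stand-ins for sa_all and the real file location.

```r
# Made-up stand-in for sa_all; the manufacturer values reuse names seen above.
sa_all_demo <- data.frame(
  src          = c("gen", "gen", "res"),
  manufacturer = c("Allergan Inc.", "GlaxoSmithKline, LLC.", "Allergan Inc."),
  stringsAsFactors = FALSE
)

# Hypothetical location; in practice this would be a path on the external drive.
path <- tempfile(fileext = ".rds")

# saveRDS writes the object in R's binary format, so commas and quotes
# inside strings never have to be escaped or re-parsed.
saveRDS(sa_all_demo, path)
sa_all_demo2 <- readRDS(path)

# The round trip should reproduce the object exactly.
print(identical(sa_all_demo, sa_all_demo2))
print(table(sa_all_demo2$src))
```

The trade-off is that an .rds file is not readable by other tools the way a CSV is.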

R: Data transfer between two lists (source list smaller than target list)

I searched, but I couldn't find a similar question, so I apologize if I may have missed it.
My problem is actually pretty simple. I have two lists, a large one and a smaller one. The smaller one consists of the averages of the data in the large list (ten lines have been aggregated to form the small list, so it is one tenth the size of the larger one). All I want now is to add a new column to the large list (which is no problem) showing the averages next to the original data. I am aware that I will see each average ten times, but that's fine.
I tried to solve this "problem" with simple list comparisons, e.g. (the relevant averages, as well as the original data have identical identifiers in the first column):
Large_List$Average_column[ Large_List$identifier == Small_List$identifier ] <- Small_List$Average[ Large_List$identifier == Small_List$identifier ];
Yet for some reason, it doesn't work. Probably because the target vector is larger than the source vector. I really tried a lot, and the only thing that seems to work is a loop structure. But that is no option because my list is way too large... I am sure there must be a smart solution to this simple issue.
UPDATE & SPECIFICATION
Thank you for your suggestions. But it seems I need to be more specific. The problem is that in most, but not all, cases the average is formed from ten consecutive data points. Sometimes fewer are used because of gaps in the sample. Therefore, simple replication will unfortunately not do the job.
Here’s an example (1_Ident is the minute identifier, 10_Ident being the ten minute identifier) :
Original_List:
1_Ident  | 10_Ident | Minute_value
July1-0  | July1-0d | 1
July1-2  | July1-0d | 1
(..)
July1-10 | July1-0d | 1
July1-11 | July1-1d | 1
July1-12 | July1-1d | 2
July1-21 | July1-2d | 3
July1-31 | July1-3d | 2
Resulting Small_list:
10_Ident | Minute_average
July1-0d | 1
July1-1d | 1.5
July1-2d | 3
July1-3d | 2
Desired outcome:
Large_List:
1_Ident  | 10_Ident | Minute_value | Minute_average
July1-0  | July1-0d | 1            | 1
July1-2  | July1-0d | 1            | 1
(..)
July1-10 | July1-0d | 1            | 1
July1-11 | July1-1d | 1            | 1.5
July1-12 | July1-1d | 2            | 1.5
July1-21 | July1-2d | 3            | 3
July1-31 | July1-3d | 2            | 2
I think the main problem is that the Small_list$Minute_average vector is not the same size as the Large_list$Minute_value vector. As said, one could compare the two lists line by line, doing a loop, but the size of the tables is >1M lines, so that won't work.
What I want to do is basically the following:
1) Look in Large_List$10_Ident and compare it to Small_List$10_Ident
2) Where the values match, transfer the corresponding Small_List$Minute_average value to Large_List$Minute_average
Thanks!
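You could use match or merge to do that, but why not just calculate the averages directly from the groupings?
# ave() treats every extra argument as a grouping variable,
# so na.rm has to be set inside FUN rather than passed alongside it
Large_List$Average_column <- ave(Large_List$col_to_be_avgd,
                                 Large_List$group_var,
                                 FUN = function(x) mean(x, na.rm = TRUE))
The merge code might look like:
merge(Large_List, Small_List[c("identifier", "Average")], by = "identifier", all.x = TRUE)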
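The match route mentioned above can be sketched as follows. This is a minimal example on a cut-down version of the data from the question, with identifiers chosen so that the two tables agree; the columns are renamed Ident_1 and Ident_10 (an assumption made here, since R names that start with a digit need backticks), and idx is an illustrative name.

```r
# Cut-down version of the example data; column names adapted for R.
Large_List <- data.frame(
  Ident_1      = c("July1-0", "July1-2", "July1-11", "July1-12", "July1-21", "July1-31"),
  Ident_10     = c("July1-0d", "July1-0d", "July1-1d", "July1-1d", "July1-2d", "July1-3d"),
  Minute_value = c(1, 1, 1, 2, 3, 2),
  stringsAsFactors = FALSE
)
Small_List <- data.frame(
  Ident_10       = c("July1-0d", "July1-1d", "July1-2d", "July1-3d"),
  Minute_average = c(1, 1.5, 3, 2),
  stringsAsFactors = FALSE
)

# For every row of the large table, find the position of its ten-minute
# identifier in the small table (NA where there is no match).
idx <- match(Large_List$Ident_10, Small_List$Ident_10)

# Index the averages by those positions; the result has one entry per
# row of the large table, regardless of how many rows share a group.
Large_List$Minute_average <- Small_List$Minute_average[idx]
print(Large_List)
```

Because match is vectorized, no explicit loop is needed, and groups with fewer than ten rows are handled the same way as full ones.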

R storing different columns in different vectors to compute conditional probabilities

I am completely new to R. I tried reading the reference and a couple of good introductions, but I am still quite confused.
I am hoping to do the following:
I have produced a .txt file that looks like the following:
area,energy
1.41155882174e-05,1.0914586287e-11
1.46893363946e-05,5.25011714434e-11
1.39244046855e-05,1.57904991488e-10
1.64155121046e-05,9.0815757601e-12
1.85202830392e-05,8.3207522281e-11
1.5256036289e-05,4.24756620609e-10
1.82107587343e-05,0.0
I have the following command to read the file in R:
tbl <- read.csv("foo.txt", header=TRUE)
producing:
> tbl
          area       energy
1 1.411559e-05 1.091459e-11
2 1.468934e-05 5.250117e-11
3 1.392440e-05 1.579050e-10
4 1.641551e-05 9.081576e-12
5 1.852028e-05 8.320752e-11
6 1.525604e-05 4.247566e-10
7 1.821076e-05 0.000000e+00
Now I want to store the two columns in two separate vectors, area and energy.
I tried:
area <- c(tbl$first)
energy <- c(tbl$second)
but it does not seem to work.
I need two separate vectors (which must include only the numerical data of each column) in order to compute:
> prob(energy, given = area), i.e. the conditional probability P(energy|area).
And then plot it. Can you help me please?
As @Ananda Mahto alluded to, the problem is in the way you are referring to columns.
To 'get' a column of a data frame in R, you have several options:
DataFrameName$ColumnName
DataFrameName[,ColumnNumber]
DataFrameName[["ColumnName"]]
So to get area, you would do:
tbl$area #or
tbl[,1] #or
tbl[["area"]]
The first option is generally preferred (from what I've seen).
Incidentally, for your 'end goal', you don't need to do any of this:
with(tbl, prob(energy, given = area))
does the trick.
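To make the column access concrete, here is a small self-contained sketch that reads the data from the question out of a string instead of foo.txt (csv_text is an illustrative name) and pulls out both columns. It deliberately stops short of calling prob, since that function is not part of base R.

```r
# The file contents from the question, inlined so the example runs on its own.
csv_text <- "area,energy
1.41155882174e-05,1.0914586287e-11
1.46893363946e-05,5.25011714434e-11
1.39244046855e-05,1.57904991488e-10
1.64155121046e-05,9.0815757601e-12
1.85202830392e-05,8.3207522281e-11
1.5256036289e-05,4.24756620609e-10
1.82107587343e-05,0.0
"
tbl <- read.csv(text = csv_text, header = TRUE)

# Columns are referred to by their header names, not by "first"/"second".
area   <- tbl$area
energy <- tbl[["energy"]]

# tbl$first is NULL because no column has that name, which is why the
# original attempt produced nothing useful.
print(is.null(tbl$first))

# Both vectors are plain numeric vectors with one entry per row.
print(is.numeric(area))
print(length(energy))
```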

Resources