Issue with Boxplot formula or variable definition - r

I have a csv file with 700 rows and 4 columns labeled AGE, DIASTOLIC, BMI and EVER.PREGNANT. The last column contains only yes or no. I wish to plot BMI against EVER.PREGNANT, with the intent of comparing the BMI of those with yes in the fourth column against those with no. What code should I write to get the required boxplot?
I have tried the following code:
Sheet=read.csv(/Downloads/1739230_1284354330_PIMA.csv - 1739230_1284354330_PIMA.csv.csv, sep=",")
boxplot(BMI~EVER.PREGNANT,data=sheet, main="BMI vs PREG",xlab="BMI",ylab="PREGNANT")
The error that I get is
Error in eval(expr,envr,enclos): object 'Sheet' not found
Similarly, what modifications are needed to plot AGE vs DIASTOLIC, where both columns are numeric? Will all 700-odd values be displayed clearly?

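I am posting this as an answer because the site warns me not to extend the discussion in the comments :-).
I think your data set was not loaded correctly. The file path must be a quoted string; without the quotes read.csv fails, Sheet is never created, and that is what the error reports. Adding header = TRUE tells R that the first row holds the names of the variables (this is already the default for read.csv, so stating it is harmless but not strictly required). Also note that R is case-sensitive: the object is created as Sheet, so the boxplot call needs data=Sheet rather than data=sheet.
Sheet=read.csv("/Downloads/1739230_1284354330_PIMA.csv", sep=",", header = TRUE)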
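Once the data frame exists, both plots from the question can be sketched as follows. This is only an illustration under stated assumptions: the data below are simulated stand-ins for the CSV (the real values are not available here), and the titles and axis labels are my own choice. In the formula BMI ~ EVER.PREGNANT the groups go on the x-axis and BMI on the y-axis, so the labels are assigned accordingly.

```r
# Simulated stand-in for the CSV (assumption: the real file has these four columns).
set.seed(1)
Sheet <- data.frame(
  AGE           = sample(21:70, 700, replace = TRUE),
  DIASTOLIC     = round(rnorm(700, mean = 72, sd = 12)),
  BMI           = round(rnorm(700, mean = 32, sd = 7), 1),
  EVER.PREGNANT = factor(sample(c("yes", "no"), 700, replace = TRUE))
)

# One box per level of EVER.PREGNANT; groups on the x-axis, BMI on the y-axis.
bp <- boxplot(BMI ~ EVER.PREGNANT, data = Sheet,
              main = "BMI by pregnancy history",
              xlab = "Ever pregnant", ylab = "BMI")

# Two numeric columns: a scatter plot draws every row as one point.
plot(DIASTOLIC ~ AGE, data = Sheet,
     main = "Diastolic pressure vs age",
     xlab = "Age", ylab = "Diastolic")
```

With the real file, only the data.frame(...) block would be replaced by the read.csv call. A scatter plot of 700 points is usually still readable, although points with identical coordinates will overlap.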

Related

Create a new row to assign M/F to a column based on heading, referencing second table?

I am new to R (and coding in general) and am really stuck on how to approach this problem.
I have a very large data set; columns are sample ID# (~7000 samples) and rows are gene expression (~20,000 genes). Column headings are BIOPSY1-A, BIOPSY1-B, BIOPSY1-C, ..., BIOPSY200-Z. Each number (1-200) is a different patient, and each sample for that patient is a different letter (-A through -Z).
I would like to do some comparisons between samples that came from men and women. Gender is not included in this gene expression table. I have a separate file with patient numbers (BIOPSY1-200) and their gender M/F.
I would like to code something that will look at the column ID (ex: BIOPSY7-A), recognize that it includes "BIOPSY7" (but not == BIOPSY7, because there is BIOPSY7-A through BIOPSY7-Z), find "BIOPSY7" in the reference file, look up M/F, and create a new row with the M/F designation.
Honestly, I am so overwhelmed with coding this that I tried to open the file in Excel to manually input M/F, for the 7000 columns as it would probably be faster. However, the file is so large that Excel crashes when it opens.
Any input or resources that would put me on the right path would be extremely appreciated!!
I don't know exactly what your data looks like, so I made up my own based on your description. I'm sure you can adapt this answer to your needs and the structure of your dataset:
library(data.table)
genderfile <-data.frame("ID"=c("BIOPSY1", "BIOPSY2", "BIOPSY3", "BIOPSY4", "BIOPSY5"),"Gender"=c("F","M","M","F","M"))
#you can just read in your gender file to r with the line below
#genderfile <- read.csv("~/gender file.csv")
View(genderfile)
df<-matrix(rnorm(45, mean=10, sd=5),nrow=3)
colnames(df)<-c("BIOPSY1-A", "BIOPSY1-B", "BIOPSY1-C", "BIOPSY2-A", "BIOPSY2-B", "BIOPSY2-C","BIOPSY3-A", "BIOPSY3-B", "BIOPSY3-C","BIOPSY4-A", "BIOPSY4-B", "BIOPSY4-C","BIOPSY5-A", "BIOPSY5-B", "BIOPSY5-C")
df<-cbind(Gene=seq(1:3),df)
df<-as.data.frame(df)
#you can just read in your main df to r with the line below; fread keeps the dashes in the column names instead of turning them into periods (you need the data.table package installed and loaded)
#df<-fread("~/first file.csv")
View(df)
Note that the following line of code removes the dash and letter from the column names of df (I removed the first column by df[,-c(1)] because it is the Gene id):
substr(x=names(df[,-c(1)]),start=1,stop=nchar(names(df[,-c(1)]))-2)
#[1] "BIOPSY1" "BIOPSY1" "BIOPSY1" "BIOPSY2" "BIOPSY2" "BIOPSY2" "BIOPSY3" "BIOPSY3" "BIOPSY3" "BIOPSY4" "BIOPSY4"
#[12] "BIOPSY4" "BIOPSY5" "BIOPSY5" "BIOPSY5"
Now, we are ready to match the columns of df with the ID in genderfile to get the Gender column:
Gender<-genderfile[, "Gender"][match(substr(x=names(df[,-c(1)]),start=1,stop=nchar(names(df[,-c(1)]))-2), genderfile[,"ID"])]
Gender
#[1] F F F M M M M M M F F F M M M
Last step is to add the Gender defined above as a row to the df:
df_withGender<-rbind(c("Gender", as.character(Gender)), df)
View(df_withGender)
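As a variation on the steps above, here is a minimal base R sketch that strips the suffix with a regular expression instead of substr, so it does not rely on the suffix being exactly two characters long. A data frame column holds a single type, so a row of M/F letters cannot sit next to numeric expression values in the same column; for that reason the sketch keeps the gender as a separate named vector and uses it to pick columns. The miniature tables and the names expr, patient, gender, female and male are hypothetical, made up for illustration only.

```r
# Hypothetical miniature versions of the two files.
genderfile <- data.frame(
  ID     = c("BIOPSY1", "BIOPSY2", "BIOPSY10"),
  Gender = c("F", "M", "F"),
  stringsAsFactors = FALSE
)
expr <- data.frame(
  Gene         = 1:3,
  `BIOPSY1-A`  = c(9.1, 10.2, 11.3),
  `BIOPSY1-B`  = c(8.7, 10.9, 12.0),
  `BIOPSY2-A`  = c(7.5, 9.8, 10.1),
  `BIOPSY10-A` = c(6.9, 11.4, 9.6),
  `BIOPSY10-B` = c(7.2, 11.0, 9.9),
  check.names  = FALSE  # keep the dashes in the column names
)

# Drop the trailing "-<letters>" to recover the patient ID of each sample column.
sample_cols <- names(expr)[-1]
patient <- sub("-[A-Z]+$", "", sample_cols)

# Look up each patient's gender; the result is aligned with the sample columns.
gender <- genderfile$Gender[match(patient, genderfile$ID)]
names(gender) <- sample_cols

# Use the lookup to split the expression table by gender.
female <- expr[, c("Gene", sample_cols[gender == "F"]), drop = FALSE]
male   <- expr[, c("Gene", sample_cols[gender == "M"]), drop = FALSE]
```

The anchored pattern matches only a dash followed by capital letters at the end of the name, so "BIOPSY1-A" maps to "BIOPSY1" and is never confused with "BIOPSY10".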

Juxtaposing Replicate Data

I have provided a sample dataset that I have arranged in column format (called "full.table").
These data were extracted from a 96-well PCR plate, & while collecting my data, I always ran a duplicate experiment, meaning each variable (aka test) has 1 replicate. I would like to take all replicates and juxtapose them (have them be side by side), which would allow me to easily visualize replicates next to each other, and finally calculate an average value for the variable "Cq" between the two.
The complication stems from having done multiple tests over several days (complication one), and NOT having my samples always run in the same fashion on the PCR plate (complication two). Typically, as you can see in my data set below, Well A1 has a duplicate in Well B1; however, this is not always the case. Occasionally, Well A7 matches Well A8 (and NOT B7).
Replicates were always run on the same day, so an important variable here is “date” which I added via R before uploading to Stack Exchange. I am confused on how to re-arrange the data to get my desired result (not even sure where to start)
I have provided an example of what I would like in the end, called “sample.finished.table”
Logically, the 768 observations in this example should be halved, resulting in 384 total lines of data (385 with the header).
I appreciate any feedback. Thank you
full.table<- read.table("https://pastebin.com/raw/kTQhuttv", header=T, sep="")
sample.finished.table <- read.table("https://pastebin.com/raw/Phg7C9xD", header=T, sep="")
You can use dplyr here to group by sample and date and extract the requested values:
library(dplyr)
full.table %>% group_by(sample, date) %>% summarise(
  Well1 = first(Well), Cq1 = first(Cq),
  Well2 = last(Well), sample1 = last(sample), Cq2 = last(Cq),
  Cq_mean = mean(Cq[Cq > 0]))
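For anyone who wants to see the same idea without dplyr, here is a rough base R equivalent on a tiny made-up table. It is a sketch, not the pastebin data: the toy values are invented, and it assumes each (sample, date) pair occurs exactly twice. The Cq > 0 filter simply mirrors the dplyr call above.

```r
# Toy stand-in for full.table: every (sample, date) pair occurs exactly twice.
toy <- data.frame(
  Well   = c("A1", "B1", "A7", "A8"),
  sample = c("s1", "s1", "s2", "s2"),
  date   = c("2018-01-01", "2018-01-01", "2018-01-02", "2018-01-02"),
  Cq     = c(20.0, 22.0, 30.0, 31.0),
  stringsAsFactors = FALSE
)

# One small data frame per (sample, date) combination that actually occurs.
groups <- split(toy, list(toy$sample, toy$date), drop = TRUE)

# Put the two replicates of each group side by side and average Cq.
paired <- do.call(rbind, lapply(groups, function(g) {
  data.frame(
    sample  = g$sample[1],
    date    = g$date[1],
    Well1   = g$Well[1],
    Cq1     = g$Cq[1],
    Well2   = g$Well[nrow(g)],
    Cq2     = g$Cq[nrow(g)],
    Cq_mean = mean(g$Cq[g$Cq > 0]),
    stringsAsFactors = FALSE
  )
}))
rownames(paired) <- NULL
print(paired)
```

Because the grouping uses sample and date rather than well position, it does not matter whether the replicate sits in B1 or A8.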

Dropping small sample size from a factor plot

I'm new to R, so forgive my ignorance. I'm playing around with a dataset that reflects the mileage my car has achieved since I got it. Here's the data, formatted as .csv. (Note: I have this data in Excel, and when I saved it as space-delimited .txt, one line kept throwing an error in read.table saying there weren't the right number of columns, so I switched to .csv and it worked fine.)
Date,Miles,Gallons,Price.per.Gallon,Total.Cost,Grade,MPG,Price.per.Mile,Cumulative.Miles,Cumulative.Gallons,Cumulative.Cost,Cumulative.MPG,Cumulative.Price.per.Mile,Gas.Source,Car.Said,Delta,Average.Price.of.Gas,Avg.Temp
6/8/2011,391.8,9.751,3.749,36.556499,Regular,40.18049431,0.093303979,570.4,9.751,36.56,40.18049431,0.064095372,Dealer,41.18,1,3.74935904,82.8
6/22/2011,441.2,9.566,3.359,32.132194,Regular,46.12168095,0.072829089,1011.6,19.317,68.692194,43.14334524,0.067904502,Speedway,47,0.878319047,3.556048765,73.2
7/7/2011,460.6,9.594,3.599,34.528806,Regular,48.0091724,0.074964842,1472.2,28.911,103.221,44.75805057,0.070113436,BP,49.4,1.390827601,3.570301961,79.5
7/18/2011,397.4,8.178,3.319,27.142782,Regular,48.59378821,0.068300911,1869.6,37.089,130.363782,45.60381784,0.069728168,Shell,45.7,2.893788212,3.514890722,83.1
7/26/2011,368.7,8.959,3.359,30.093281,Regular,41.15414667,0.081619965,2238.3,46.048,160.457063,44.73809937,0.071687023,Kroger,42.9,1.745853332,3.484560958,79.1
8/8/2011,436.3,9.845,3.559,35.038355,Regular,44.31691214,0.080307942,2674.6,55.893,195.495418,44.6639114,0.073093329,Kroger,48,3.683087862,3.49767266,76
8/9/2011,262.2,4.986,3.479,17.346294,Regular,52.58724428,0.066156728,2936.8,60.879,212.841712,45.31283365,0.072474023,Shell,46.9,5.687244284,3.496143366,74.5
8/13/2011,250.1,5.887,3.369,19.833303,Regular,42.48343808,0.079301491,3186.9,66.766,232.675015,45.0633556,0.073009826,mobil,45.5,3.016561916,3.484932675,74.1
8/14/2011,424.4,8.699,3.759,32.699541,Regular,48.78721692,0.077048871,3611.3,75.465,265.374556,45.49261247,0.073484495,Speedway,49,0.212783079,3.516524959,68
8/18/2011,437,9.594,3.399,32.610006,regular,45.54930165,0.074622439,4048.3,85.059,297.984562,45.49900657,0.073607332,Speedway,47.6,2.050698353,3.503269049,77.1
8/30/2011,407.3,9.244,3.429,31.697676,Regular,44.06101255,0.077823904,4455.6,94.303,329.682238,45.35804799,0.073992782,Shell,48.6,4.538987451,3.495988866,66.6
9/10/2011,347.3,7.992,3.549,28.363608,Regular,43.45595596,0.081668897,4802.9,102.295,358.045846,45.20944328,0.074547845,Meijer,49.6,6.144044044,3.500130466,65
9/21/2011,375,8.874,3.369,29.896506,Regular,42.25828262,0.079724016,5177.9,111.169,387.942352,44.97386861,0.07492272,Meijer,44.9,2.641717377,3.489663054,67.5
10/5/2011,404.8,9.243,3.079,28.459197,Regular,43.79530455,0.07030434,5582.7,120.412,416.401549,44.88340033,0.074587843,UDF,45.4,1.604695445,3.458139961,61.5
10/14/2011,376.5,8.715,3.249,28.315035,Regular,43.20137694,0.075205936,5959.2,129.127,444.716584,44.76987772,0.074626894,UDF,46.4,3.198623064,3.444024751,56.4
10/23/2011,382.8,8.953,3.199,28.640647,Regular,42.75661789,0.074818827,6342,138.08,473.357231,44.63933951,0.074638479,Speedway,43.8,1.043382107,3.428137536,50.3
10/31/2011,403.4,9.517,3.299,31.396583,Regular,42.38730692,0.077829903,6745.4,147.597,504.753814,44.49412928,0.074829338,Kroger,45.7,3.312693076,3.419810796,47.5
11/15/2011,402.8,9.146,3.249,29.715354,Regular,44.04111087,0.073771981,7148.2,156.743,534.469168,44.46769553,0.074769756,UDF,45.1,1.058889132,3.409843936,54.4
11/29/2011,361.1,9.209,3.149,28.999141,Regular,39.21164079,0.080307785,7509.3,165.952,563.468309,44.1760268,0.075036063,BP,41.7,2.488359214,3.395369197,42.8
12/10/2011,354.2,9.23,3.199,29.52677,Regular,38.37486457,0.083361858,7863.5,175.182,592.995079,43.87037481,0.075411087,Shell,40.3,1.925135428,3.385022885,22.8
12/19/2011,357.4,8.957,2.999,26.862043,Regular,39.90175282,0.075159605,8220.9,184.139,619.857122,43.67733071,0.075400154,UDF,41.3,1.398247181,3.366245727,41.8
1/5/2012,322.6,8.549,3.459,29.570991,Regular,37.73540765,0.091664572,8543.5,192.688,649.428113,43.41370506,0.076014293,Speedway,41,3.26459235,3.370360962,32.3
1/14/2012,370,9.148,3.319,30.362212,Regular,40.44599913,0.082060032,8913.5,201.836,679.790325,43.27919697,0.076265252,Shell,42,1.554000875,3.368033081,17.9
1/28/2012,327.3,9.108,3.329,30.320532,Regular,35.93544137,0.09263835,9240.8,210.944,710.110857,42.96211317,0.076845171,BP,37.5,1.56455863,3.366347737,32.1
2/9/2012,307,7.971,3.399,27.093429,Regular,38.51461548,0.088252212,9547.8,218.915,737.204286,42.80017358,0.077211953,Shell,41.1,2.585384519,3.367536651,28.8
2/16/2012,370.5,10.057,3.229,32.474053,Regular,36.84001193,0.087649266,9918.3,228.972,769.678339,42.53838897,0.077601841,Speedway,42.2,5.359988068,3.361451789,44
2/29/2012,406.3,9.518,3.759,35.778162,Regular,42.6875394,0.088058484,10324.6,238.49,805.456501,42.54434148,0.078013337,Shell,42.9,0.212460601,3.377317711,54.1
3/14/2012,370.6,9.812,3.699,36.294588,Regular,37.77007746,0.097934668,10695.2,248.302,841.751089,42.35567978,0.078703632,UDF,40.5,2.729922544,3.390029436,63.6
3/23/2012,357.6,7.999,3.929,31.428071,Regular,44.7055882,0.087886105,11052.8,256.301,873.17916,42.429019,0.07900072,Shell,43.1,1.605588199,3.406850383,66
4/3/2012,252.5,4.57,3.849,17.58993,Regular,55.25164114,0.069663089,11305.3,260.871,890.76909,42.65364874,0.078792167,Meijer,41.9,13.35164114,3.414596065,58.6
4/13/2012,382.3,9.416,3.629,34.170664,Regular,40.6011045,0.089381805,11687.6,270.287,924.939754,42.58214417,0.079138553,Shell,44.3,3.698895497,3.422065264,51.2
4/24/2012,393.7,9.018,3.659,32.996862,Regular,43.65713018,0.083812197,12081.3,279.305,957.936616,42.61685254,0.079290856,UDF,43.3,0.357130184,3.429715243,49.2
5/7/2012,354.7,9.203,3.729,34.317987,Regular,38.54177985,0.096752148,12436,288.508,992.254603,42.48686345,0.079788887,Speedway,40.6,2.058220146,3.439262007,70.3
5/18/2012,378,9.505,3.699,35.158995,Regular,39.76854287,0.093013214,12814,298.013,1027.413598,42.40016375,0.080178992,Speedway,42.2,2.431457128,3.447546241,62.2
6/1/2012,381.5,9.781,3.699,36.179919,Regular,39.0041918,0.094835961,13195.5,307.794,1063.593517,42.29224741,0.080602745,Sunoco,41,1.9958082,3.455536875,61.4
6/12/2012,386.8,8.976,3.649,32.753424,Regular,43.09269162,0.084677932,13582.3,316.77,1096.346941,42.31492881,0.080718799,Meijer,44.1,1.007308378,3.46101885,75.5
6/23/2012,379.9,9.168,3.339,30.611952,Regular,41.43760908,0.080578973,13962.2,325.938,1126.958893,42.29025152,0.080714994,Kroger,41.8,0.362390925,3.457586697,74.4
7/8/2012,321.9,8.285,3.549,29.403465,Regular,38.85334943,0.091343476,14284.1,334.223,1156.362358,42.20505471,0.080954513,Shell,40.9,2.046650573,3.459852727,84.1
7/21/2012,369.5,8.88,3.479,30.89352,Regular,41.61036036,0.083608985,14653.6,343.103,1187.255878,42.18966316,0.081021447,Meijer,42.6,0.98963964,3.460348286,70.1
7/21/2012,385,7.808,3.499,27.320192,Regular,49.30840164,0.070961538,15038.6,350.911,1214.57607,42.34805976,0.080763906,Speedway,48.5,0.808401639,3.461208312,70.1
7/26/2012,367.1,9.644,3.479,33.551476,Regular,38.06511821,0.091396012,15405.7,360.555,1248.127546,42.23350113,0.081017256,BP,44.2,6.134881792,3.461684198,82.5
8/12/2012,376.6,9.287,3.769,35.002703,Regular,40.55130828,0.09294398,15782.3,369.842,1283.130249,42.19126005,0.081301854,BP,42.3,1.74869172,3.46940112,66.4
8/24/2012,414.9,9.22,3.859,35.57998,Regular,45,0.085755556,16197.2,379.062,1318.710229,42.25957759,0.081415938,Speedway,44.6,0.4,3.478877411,76.5
9/9/2012,373.3,8.984,3.799,34.130216,Regular,41.55164737,0.091428385,16570.5,388.046,1352.840445,42.24318766,0.081641498,Speedway,42.8,1.248352627,3.486288855,62.1
9/19/2012,408.1,9.123,3.799,34.658277,Regular,44.73309218,0.084925942,16978.6,397.169,1387.498722,42.30038095,0.081720443,BP,46.5,1.766907815,3.493471852,53.5
9/28/2012,408.3,9.281,3.659,33.959179,Regular,43.99310419,0.083172126,17386.9,406.45,1421.457901,42.33903309,0.081754534,BP,45.6,1.606895809,3.497251571,59
10/7/2012,393.1,8.942,3.699,33.076458,Regular,43.96108253,0.084142605,17780,415.392,1454.534359,42.37395039,0.081807332,Speedway,46.3,2.338917468,3.50159454,45
10/15/2012,402.9,9.075,3.549,32.207175,Regular,44.39669421,0.079938384,18182.9,424.467,1486.741534,42.41719615,0.081765919,Speedway,46.1,1.703305785,3.502608057,54.6
10/24/2012,365.7,8.264,3.299,27.262936,Regular,44.25217812,0.074550003,18548.6,432.731,1514.00447,42.45223938,0.081623652,Speedway,46.8,2.547821878,3.49871969,68.5
11/4/2012,363.3,9.561,3.259,31.159299,Regular,37.99811735,0.085767407,18911.9,442.292,1545.163769,42.35595489,0.081703254,Meijer,42,4.001882648,3.493537683,37.3
11/15/2012,391.9,10.224,3.499,35.773776,Regular,38.33137715,0.091282919,19303.8,452.516,1580.937545,42.26502488,0.081897737,Speedway,44.1,5.768622848,3.493661097,33.7
11/24/2012,430.2,9.068,3.579,32.454372,Regular,47.44155271,0.075440195,19734,461.584,1613.391917,42.36671982,0.081756963,BP,44.3,3.141552713,3.495337614,29.5
12/2/2012,394.5,9.146,3.239,29.623894,Regular,43.13361032,0.075092253,20128.5,470.73,1643.015811,42.38162004,0.081626341,Sunoco,45.8,2.666389679,3.490357128,55.1
12/12/2012,386.1,9.312,3.169,29.509728,Regular,41.46262887,0.076430272,20514.6,480.042,1672.525539,42.36379317,0.081528547,Speedway,43.4,1.937371134,3.484123345,31
12/23/2012,359.8,8.642,3.199,27.645758,Regular,41.63388105,0.076836459,20874.4,488.684,1700.171297,42.35088523,0.081447673,Speedway,42.4,0.766118954,3.479081159,30.7
1/6/2013,336.4,8.878,3.079,27.335362,Regular,37.89141699,0.081258508,21210.8,497.562,1727.506659,42.27131493,0.081444672,Meijer,41,3.108583014,3.47194251,33.2
1/21/2013,350,9.257,3.259,30.168563,Regular,37.80922545,0.086195894,21560.8,506.819,1757.675222,42.1898153,0.0815218,Meijer,40.6,2.790774549,3.468053135,20.8
2/1/2013,335.7,9.058,3.499,31.693942,Regular,37.0611614,0.094411504,21896.5,515.877,1789.369164,42.09976409,0.081719415,Meijer,38.7,1.638838596,3.468596514,12.1
2/13/2013,360.9,9.42,3.759,35.40978,Regular,38.31210191,0.098115212,22257.4,525.297,1824.778944,42.03184103,0.08198527,Speedway,41.4,3.087898089,3.473804236,31
2/26/2013,371.3,9.081,3.899,35.406819,Regular,40.88756745,0.09535906,22628.7,534.378,1860.185763,42.01239572,0.082204712,Meijer,42.2,1.312432551,3.481029838,36.9
3/9/2013,362.6,8.952,3.439,30.785928,Regular,40.5049151,0.084903276,22991.3,543.33,1890.971691,41.98755821,0.082247271,BP,42.7,2.195084897,3.480337347,36.5
3/21/2013,375.3,8.991,3.859,34.696269,Regular,41.74174174,0.092449424,23366.6,552.321,1925.66796,41.98355666,0.082411132,Kroger,44,2.258258258,3.486501437,23.8
4/8/2013,361.7,9,3.299,29.691,Regular,40.18888889,0.082087365,23728.3,561.321,1955.35896,41.95478167,0.082406197,Speedway,43.4,3.211111111,3.483495112,61.8
4/20/2013,362.3,8.036,3.699,29.725164,Regular,45.08461921,0.082045719,24090.6,569.357,1985.084124,41.99895672,0.082400776,BP,45.6,0.515380786,3.486536784,39
4/30/2013,382.3,8.246,3.539,29.182594,Regular,46.36187242,0.076334277,24472.9,577.603,2014.266718,42.06124276,0.082306009,Speedway,48.7,2.338127577,3.487285762,60.2
5/9/2013,397.3,8.722,3.339,29.122758,Regular,45.55147902,0.073301681,24870.2,586.325,2043.389476,42.1131625,0.082162165,Pilot,47.4,1.848520981,3.485079906,65.8
5/18/2013,399,9.051,3.899,35.289849,Regular,44.08352668,0.088445737,25269.2,595.376,2078.679325,42.14311628,0.082261382,Kroger,45.7,1.616473318,3.491372385,68.3
5/30/2013,380.2,9.04,3.659,33.07736,Regular,42.05752212,0.086999895,25649.4,604.416,2111.756685,42.14183609,0.082331621,Sunoco,44.4,2.342477876,3.493879522,78.2
6/14/2013,395.3,9.095,3.759,34.188105,Regular,43.46344145,0.086486479,26044.7,613.511,2145.94479,42.16142824,0.082394683,Meijer,45,1.536558549,3.497809803,67.6
6/22/2013,390.3,9.008,3.559,32.059472,Regular,43.32815275,0.082140589,26435,622.519,2178.004262,42.17831102,0.082390931,BP,44.3,0.971847247,3.49869524,78.2
7/4/2013,388.9,9.501,3.399,32.293899,Regular,40.93253342,0.083039082,26823.9,632.02,2210.298161,42.15958356,0.082400328,BP,43.7,2.767466582,3.497196546,71.6
7/18/2013,399.8,9.06,3.299,29.88894,Regular,44.12803532,0.07475973,27223.7,641.08,2240.187101,42.18740251,0.08228812,Speedway,45.2,1.07196468,3.494395553,83.9
8/25/2013,394.3,9.114,3.529,32.163306,Regular,43.2631117,0.081570647,27618,650.194,2272.350407,42.20248111,0.082277877,Kroger,45.8,2.536888304,3.494880616,74.6
9/5/2013,413.7,9.507,3.519,33.455133,Regular,43.51530451,0.0808681,28031.7,659.701,2305.80554,42.2214003,0.082257071,Speedway,46,2.484695488,3.495228202,70.2
9/14/2013,431.2,9.272,3.299,30.588328,Regular,46.50560828,0.070937681,28462.9,668.973,2336.393868,42.28077964,0.082085587,UDF,46.7,0.194391717,3.492508469,55.1
9/25/2013,417.6,9.685,3.159,30.594915,Regular,43.11822406,0.073263685,28880.5,678.658,2366.988783,42.29273065,0.081958026,Meijer,48.1,4.981775942,3.487749033,61.3
10/11/2013,421.9,9.202,3.299,30.357398,Regular,45.84872854,0.071954013,29302.4,687.86,2397.346181,42.34030181,0.081813987,Kroger,45.7,0.148728537,3.485224001,62.7
10/23/2013,389,8.975,3.259,29.249525,Regular,43.34261838,0.075191581,29691.4,696.835,2426.595706,42.35321131,0.081727224,Meijer,45.9,2.557381616,3.482310312,39.6
11/2/2013,392.8,8.852,3.299,29.202748,Regular,44.37415273,0.074345081,30084.2,705.687,2455.798454,42.3785616,0.081630838,Meijer,44.8,0.425847266,3.480010903,49.7
11/12/2013,363.5,9.114,2.959,26.968326,Regular,39.88369541,0.074190718,30447.7,714.801,2482.76678,42.34675105,0.081542014,Valero,44,4.116304586,3.473367804,31.4
11/24/2013,375.5,9.123,3.199,29.184477,Regular,41.15970624,0.077721643,30823.2,723.924,2511.951257,42.33179174,0.081495473,UDF,42.6,1.440293763,3.46991018,21.1
12/2/2013,364,9.006,2.999,27.008994,Regular,40.41749944,0.074200533,31187.2,732.93,2538.960251,42.30826955,0.08141033,Meijer,41.1,0.682500555,3.464123792,38.9
12/12/2013,325.8,8.576,2.979,25.547904,Regular,37.98973881,0.078415912,31513,741.506,2564.508155,42.25832293,0.081379372,Murphy,39.5,1.510261194,3.458513019,13.8
1/7/2014,317.1,8.915,3.199,28.519085,Regular,35.56926528,0.089937196,31830.1,750.421,2593.02724,42.17885693,0.081464628,Kroger,38.5,2.930734717,3.455430005,-3.6
1/15/2014,359.5,9.252,3.299,30.522348,Regular,38.85646347,0.08490222,32189.6,759.673,2623.549588,42.13839376,0.081503019,Meijer,41.1,2.243536533,3.453524856,28.6
1/27/2014,302.7,8.89,3.249,28.88361,Regular,34.04949381,0.095419921,32492.3,768.563,2652.433198,42.04482912,0.08163267,BP,35.8,1.750506187,3.451159109,37.1
2/4/2014,346.7,8.983,3.279,29.455257,Regular,38.59512412,0.084958918,32839,777.546,2681.888455,42.00497463,0.081667787,UDF,40,1.404875877,3.449170152,22.4
2/16/2014,310.1,8.773,3.459,30.345807,Regular,35.34708766,0.097858133,33149.1,786.319,2712.234262,41.93069225,0.081819243,Speedway,37.7,2.352912345,3.449279824,23.3
3/1/2014,361.8,9.065,3.599,32.624935,Regular,39.91174848,0.09017395,33510.9,795.384,2744.859197,41.90768233,0.081909444,Speedway,42.2,2.288251517,3.450986187,35.1
3/17/2014,354.2,9.356,3.579,33.485124,Regular,37.858059,0.094537335,33865.1,804.74,2778.344321,41.86060094,0.082041521,Speedway,41.9,4.041941,3.45247449,26.3
3/28/2014,354.1,9.165,3.579,32.801535,Regular,38.63611566,0.092633536,34219.2,813.905,2811.145856,41.82429153,0.082151127,UDF,39.8,1.163884343,3.453899234,51.9
4/8/2014,371.5,9.164,3.549,32.523036,Regular,40.53906591,0.087545184,34590.7,823.069,2843.668892,41.80998191,0.082209059,UDF,41.7,1.16093409,3.45495808,49.7
4/21/2014,373.8,9.216,3.679,33.905664,Regular,40.55989583,0.090705361,34964.5,832.285,2877.574556,41.79613954,0.082299891,Shell,42.2,1.640104167,3.457438925,64.1
5/2/2014,391.9,8.834,3.599,31.793566,Regular,44.36268961,0.081126731,35356.4,841.119,2909.368122,41.82309519,0.082286888,Speedway,44.8,0.437310392,3.458925695,50.9
5/10/2014,375.1,8.854,3.659,32.396786,Regular,42.36503275,0.086368398,35731.5,849.973,2941.764908,41.82874044,0.082329734,Speedway,45.8,3.434967246,3.46100983,65.5
5/21/2014,401.1,9.094,3.659,33.274946,Regular,44.10600396,0.082959227,36132.6,859.067,2975.039854,41.85284733,0.082336722,Speedway,45.6,1.493996041,3.463105734,72.3
6/6/2014,435.3,9.487,3.599,34.143713,Regular,45.88384105,0.0784372,36567.9,868.554,3009.183567,41.89687688,0.082290303,Speedway,50.5,4.616158954,3.464590074,67.5
6/21/2014,458.4,9.286,3.799,35.277514,Regular,49.36463493,0.076957928,37026.3,877.84,3044.461081,41.9758726,0.082224286,Kroger,49.6,0.235365066,3.468127541,73.8
7/5/2014,386.8,9.292,3.029,28.145468,Regular,41.6272062,0.072764912,37413.1,887.132,3072.606549,41.97222059,0.082126489,Speedway,44.5,2.872793801,3.463528031,69.2
7/19/2014,433.1,8.961,3.499,31.354539,Regular,48.33165941,0.072395611,37846.2,896.093,3103.961088,42.03581548,0.082015132,Kroger,48.3,0.031659413,3.463882753,66.7
8/6/2014,401.4,9.055,3.399,30.777945,Regular,44.32909994,0.076676495,38247.6,905.148,3134.739033,42.05875724,0.081959104,Speedway,47.6,3.270900055,3.463233673,73.1
8/25/2014,414.1,9.001,3.039,27.354039,Regular,46.00599933,0.066056602,38661.7,914.149,3162.093072,42.09762304,0.081788775,Speedway,46.9,0.894000667,3.459056535,78.2
9/15/2014,406.2,9.094,2.959,26.909146,Regular,44.66681328,0.066246051,39067.9,923.243,3189.002218,42.12292972,0.081627173,Kroger,47.1,2.433186717,3.454130947,59.5
9/30/2014,396.3,9.129,3.189,29.112381,Regular,43.41110746,0.073460462,39464.2,932.372,3218.114599,42.13554247,0.081545162,Kroger,46.7,3.28889254,3.451535009,62
10/22/2014,397.7,9.328,2.859,26.668752,Regular,42.63507719,0.06705746,39861.9,941.7,3244.783351,42.1404906,0.081400619,UDF,45.1,2.464922813,3.445665659,46.9
11/5/2014,413.2,9.262,2.879,26.665298,Regular,44.61239473,0.064533635,40275.1,950.962,3271.448649,42.16456599,0.081227574,UDF,46,1.387605269,3.440146556,50
11/17/2014,398.9,9.081,2.899,26.325819,Regular,43.9268803,0.065996037,40674,960.043,3297.774468,42.18123563,0.081078194,Speedway,45.2,1.2731197,3.435027877,28.6
11/25/2014,345.8,9.003,2.899,26.099697,Regular,38.40941908,0.075476278,41019.8,969.046,3323.874165,42.14619327,0.08103097,UDF,40.7,2.290580917,3.430047867,36.7
12/7/2014,345.6,8.738,2.139,18.690582,Regular,39.55138476,0.054081545,41365.4,977.784,3342.564747,42.12300467,0.080805812,Speedway,41.6,2.048615244,3.418510373,33
12/30/2014,360.8,9.013,1.869,16.845297,Regular,40.03106624,0.046688739,41726.2,986.797,3359.410044,42.10389776,0.080510807,Kroger,42.2,2.168933762,3.40435778,25.4
2/2/2015,338.8,8.725,2.059,17.964775,Regular,38.83094556,0.05302472,42065,995.522,3377.374819,42.0752128,0.080289429,Speedway,41.1,2.269054441,3.392566733,25.9
2/12/2015,321.7,8.765,2.359,20.676635,Regular,36.70279521,0.064273034,42386.7,1004.287,3398.051454,42.02832457,0.08016787,Speedway,39.2,2.497204792,3.383546191,26.3
3/3/2015,310.7,9.93,2.039,20.24727,Regular,31.28902316,0.065166624,42697.4,1014.217,3418.298724,41.92317818,0.080058709,AAFES,37.4,6.110976838,3.370382003,26.4
3/13/2015,408.5,9.404,2.199,20.679396,Regular,43.43896214,0.050622756,43105.9,1023.621,3438.97812,41.93710367,0.079779755,Kroger,42.7,0.738962144,3.359620524,46.1
3/22/2015,396.5,9.051,2.339,21.170289,Regular,43.80731411,0.05339291,43502.4,1032.672,3460.148409,41.9534954,0.079539253,Speedway,45.9,2.092685891,3.35067515,40
3/30/2015,386.7,8.931,1.999,17.853069,Regular,43.29862277,0.04616775,43889.1,1041.603,3478.001478,41.9650289,0.079245222,Meijer,44.4,1.101377225,3.339085504,46.4
4/10/2015,414,8.905,2.399,21.363095,Regular,46.49073554,0.051601679,44303.1,1050.508,3499.364573,42.00339264,0.078986901,Kroger,48.3,1.809264458,3.331116539,61
4/19/2015,368.7,7.84,2.419,18.96496,Regular,47.02806122,0.051437375,44671.8,1058.348,3518.329533,42.04061424,0.07875952,Shell,48.4,1.371938776,3.324359788,62.5
4/28/2015,407.9,9.18,2.179,20.00322,Regular,44.4335512,0.049039519,45079.7,1067.528,3538.332753,42.06119184,0.078490601,Speedway,47.5,3.066448802,3.314510489,49.3
5/10/2015,425.1,9.235,2.499,23.078265,Regular,46.03140227,0.054289026,45504.8,1076.763,3561.411018,42.09524287,0.078264513,Kroger,47.7,1.668597726,3.307516155,74.9
5/19/2015,436.6,9.161,2.629,24.084269,Regular,47.65855256,0.055163236,45941.4,1085.924,3585.495287,42.1421757,0.078044972,BP,49.1,1.44144744,3.301792102,62.9
5/28/2015,399.1,8.503,2.299,19.548397,Regular,46.9363754,0.0489812,46340.5,1094.427,3605.043684,42.17942357,0.077794665,UDF,49,2.063624603,3.294001047,72.9
6/9/2015,416.6,8.858,2.639,23.376262,Regular,47.03093249,0.056112007,46757.1,1103.285,3628.419946,42.21837513,0.077601475,Kroger,48.4,1.36906751,3.288742207,65.5
7/9/2015,419.6,8.917,2.389,21.302713,Regular,47.05618482,0.050769097,47176.7,1112.202,3649.722659,42.25716192,0.077362822,BP,49.4,2.343815184,3.281528588,73.1
7/30/2015,433.9,9.361,2.499,23.393139,Regular,46.35188548,0.053913664,47610.6,1121.563,3673.115798,42.29133807,0.077149118,UDF,48.9,2.548114518,3.274997301,76.2
8/12/2015,410.8,8.774,2.699,23.681026,Regular,46.82015044,0.05764612,48021.4,1130.337,3696.796824,42.32649201,0.076982279,UDF,47.5,0.679849556,3.270526245,68.3
8/23/2015,397,8.841,2.059,18.203619,Regular,44.90442258,0.045852945,48418.4,1139.178,3715.000443,42.34649897,0.076727039,UDF,48.8,3.895577423,3.26112376,72.1
9/1/2015,435.8,9.6,1.999,19.1904,Regular,45.39583333,0.044034878,48854.2,1148.778,3734.190843,42.37198136,0.076435411,Kroger,49.6,4.204166667,3.250576563,75.5
9/12/2015,422.5,8.493,2.269,19.270617,Regular,49.74685035,0.045610928,49276.7,1157.271,3753.46146,42.42610417,0.076171121,Kroger,45.3,4.446850347,3.243372952,58.8
9/22/2015,391.3,8.491,1.799,15.275309,Regular,46.08408904,0.039037335,49668,1165.762,3768.736769,42.45274764,0.075878569,Speedway,48.3,2.215910965,3.232852648,63.3
10/1/2015,421.3,8.961,2.459,22.035099,Regular,47.01484209,0.052302632,50089.3,1174.723,3790.771868,42.48754813,0.075680272,Kroger,50.2,3.185157906,3.22694956,55.9
10/25/2015,412.4,10.057,1.079,10.851503,Regular,41.00626429,0.026313053,50501.7,1184.78,3801.623371,42.47497426,0.075277137,UDF,45.8,4.793735706,3.208716699,55.2
11/14/2015,445.4,9.047,1.979,17.904013,Regular,49.23178954,0.040197604,50947.1,1193.827,3819.527384,42.52617842,0.074970457,Kroger,45.5,3.731789543,3.199397722,38.2
11/24/2015,395.3,9.451,1.899,17.947449,Regular,41.82626177,0.045402097,51342.4,1203.278,3837.474833,42.52068101,0.074742802,Meijer,44.4,2.573738229,3.189183907,37.7
12/9/2015,381.4,9.291,1.469,13.648479,Regular,41.05047896,0.03578521,51723.8,1212.569,3851.123312,42.50941596,0.074455537,Speedway,43.8,2.749521042,3.176003437,46.7
12/18/2015,391,8.715,1.839,16.026885,Regular,44.86517499,0.040989476,52114.8,1221.284,3867.150197,42.5262265,0.074204452,Kroger,46.1,1.234825014,3.166462671,30.8
12/31/2015,356.6,8.754,1.999,17.499246,Regular,40.7356637,0.049072479,52471.4,1230.038,3884.649443,42.51348332,0.074033653,Speedway,43.2,2.464336303,3.158154011,33.4
1/8/2016,375.7,10.531,1.099,11.573569,Regular,35.67562435,0.030805347,52847.1,1240.569,3896.223012,42.45543779,0.073726335,UDF,43.2,7.524375653,3.140674168,38.4
1/17/2016,408.8,8.996,1.199,10.786204,Regular,45.44241885,0.026385039,53255.9,1249.565,3907.009216,42.47694198,0.073362937,Kroger,41.1,4.342418853,3.126695463,24
1/26/2016,326.8,8.83,1.799,15.88517,Regular,37.01019253,0.048608231,53582.7,1258.395,3922.894386,42.43858248,0.073211958,Kroger,39.9,2.889807475,3.11737919,39.6
2/3/2016,338.2,7.974,1.599,12.750426,Regular,42.41284174,0.037700846,53920.9,1266.369,3935.644812,42.4384204,0.072989227,UDF,44.1,1.687158264,3.107818347,53.7
2/10/2016,355.1,8.88,1.349,11.97912,Regular,39.98873874,0.033734497,54276,1275.249,3947.623932,42.42136242,0.072732403,UDF,43.3,3.311261261,3.095571086,16.5
2/17/2016,334.9,8.703,1.559,13.567977,Regular,38.48098357,0.040513517,54610.9,1283.952,3961.191909,42.39465338,0.072534822,UDF,39.6,1.119016431,3.08515576,31.7
2/26/2016,375.8,8.959,1.879,16.833961,Regular,41.94664583,0.044795,54986.7,1292.911,3978.02587,42.39154899,0.072345237,UDF,44.4,2.453354169,3.076797916,29.9
3/13/2016,385.7,8.732,1.959,17.105988,Regular,44.17086578,0.0443505,55372.4,1301.643,3995.131858,42.40348544,0.072150238,UDF,45,0.829134219,3.06929923,54.5
4/5/2016,402.6,9.241,1.959,18.103119,Regular,43.56671356,0.044965522,55775,1310.884,4013.234977,42.41168555,0.071954011,Kroger,45.9,2.333286441,3.061472241,30.9
4/14/2016,370.8,8.674,2.1139,18.3359686,Regular,42.74844362,0.049449754,56145.8,1319.558,4031.570946,42.4138992,0.071805388,UDF,44.5,1.751556375,3.055243457,47.7
4/28/2016,397.6,9.2,2.399,22.0708,Regular,43.2173913,0.05551006,56543.4,1328.758,4053.641746,42.41946239,0.071690803,UDF,44.7,1.482608696,3.050699786,55.8
5/7/2016,377,8.884,1.669,14.827396,Regular,42.43583971,0.039329963,56920.4,1337.642,4068.469142,42.41957116,0.071476468,Speedway,44.1,1.664160288,3.041523174,62.3
5/18/2016,389.2,9.253,2.459,22.753127,Regular,42.06203393,0.058461272,57309.6,1346.895,4091.222269,42.41711492,0.071388079,Kroger,44.8,2.737966065,3.037521313,56.2
5/24/2016,410.5,8.846,2.579,22.813834,Regular,46.40515487,0.055575722,57720.1,1355.741,4114.036103,42.44313626,0.071275623,UDF,47.1,0.694845128,3.034529532,68
6/28/2016,376.6,8.994,2.349,21.126906,Regular,41.87235935,0.05609906,58096.7,1364.735,4135.163009,42.43937468,0.071177244,UDF,42.9,1.027640649,3.030011694,74.6
7/13/2016,357.2,9.138,1.579,14.428902,Regular,39.08951631,0.040394462,58453.9,1373.873,4149.591911,42.41709387,0.070989137,Meijer,41.6,2.510483694,3.020360623,79
8/4/2016,358.8,9.236,1.919,17.723884,Regular,38.84798614,0.04939767,58812.7,1383.109,4167.315795,42.3932604,0.070857413,Kroger,40.7,1.852013859,3.013006057,79.6
8/12/2016,386.7,8.98,2.239,20.10622,Regular,43.0623608,0.051994363,59199.4,1392.089,4187.422015,42.39757659,0.070734197,Kroger,44.7,1.637639198,3.008013148,82.9
8/22/2016,367.9,8.752,2.339,20.470928,Regular,42.03610603,0.055642642,59567.3,1400.841,4207.892943,42.39531824,0.070640988,Meijer,43.3,1.263893967,3.003833371,66.6
8/30/2016,360.1,9.337,2.139,19.971843,Regular,38.56699154,0.055461936,59927.4,1410.178,4227.864786,42.36997032,0.070549778,UDF,41.1,2.533008461,2.998107179,76
9/12/2016,410.1,9.475,2.159,20.456525,Regular,43.2823219,0.049881797,60337.5,1419.653,4248.321311,42.3760595,0.070409303,Marathon,45.1,1.8176781,2.992506838,66
9/22/2016,395.8,9.273,2.189,20.298597,Regular,42.68305834,0.051284985,60733.3,1428.926,4268.619908,42.37805177,0.070284669,UDF,44.4,1.716941659,2.987292489,73.6
10/5/2016,379.5,9.097,1.699,15.455803,Regular,41.71704958,0.040726754,61112.8,1438.023,4284.075711,42.37387024,0.07010112,Speedway,43.7,1.982950423,2.979142691,67.3
10/7/2016,129.6,2.722,2.309,6.285098,Regular,47.61204996,0.048496127,61242.4,1440.745,4290.360809,42.38376673,0.0700554,Kroger,47.2,0.412049963,2.977876591,67.4
10/10/2016,400.1,8.569,2.219,19.014611,Regular,46.69156261,0.047524646,61642.5,1449.314,4309.37542,42.40923637,0.06990916,Kroger,48.8,2.108437391,2.973389769,53.2
10/20/2016,395.7,8.947,1.949,17.437703,Regular,44.22711523,0.044067988,62038.2,1458.261,4326.813123,42.42038977,0.069744337,Wal Mart,46.8,2.572884766,2.967104738,59.5
10/31/2016,395.9,9.247,2.099,19.409453,Regular,42.81388558,0.049026151,62434.1,1467.508,4346.222576,42.42286925,0.069612961,Meijer,45.6,2.786114415,2.961634673,51.2
11/10/2016,414.6,8.899,1.999,17.789101,Regular,46.58950444,0.042906659,62848.7,1476.407,4364.011677,42.44798352,0.069436785,Meijer,48.3,1.710495561,2.955832421,45
11/22/2016,366,9.225,2.599,23.975775,Premium,39.67479675,0.065507582,63214.7,1485.632,4387.987452,42.43076347,0.069414036,Meijer,43.6,3.925203252,2.953616677,30.8
12/6/2016,393.2,9.229,1.989,18.356481,Regular,42.60483259,0.046684845,63607.9,1494.861,4406.343933,42.43183814,0.069273533,BP,44.8,2.195167407,2.947661309,N/A
12/21/2016,334,8.855,2.259,20.003445,Regular,37.71880294,0.059890554,63941.9,1503.716,4426.347378,42.40408428,0.069224521,UDF,39.2,1.481197064,2.943605959,N/A
1/9/2017,332,8.847,2.429,21.489363,Regular,37.52684526,0.064726997,64273.9,1512.563,4447.836741,42.37555725,0.069201289,Speedway,39.6,2.073154742,2.940596022,N/A
One of the things I tried was
plot(factor(Gas.Source),MPG)
It's exactly what you'd expect. Some of the factor levels have very few observations (or only one), so rather than a box and whisker you just get a black line.
I understand this is exactly what I asked it to do, as some of those sources had very few observations. So what I'd like to do is efficiently remove the measurements associated with factor levels where there aren't enough observations to really produce a box and whisker.
I'm guessing I could do this by creating a new data frame where I've used logical subscripting to select only those rows corresponding to a factor level that has a count greater than X, but I'm not sophisticated enough to figure that out yet.
Found what I was looking for here
Given that the original data was in a dataframe called mileage
tbl <- table(mileage$Gas.Source)
new.Mileage <- droplevels(mileage[mileage$Gas.Source %in% names(tbl)[tbl>10],,drop=FALSE])
new.Mileage now contains only the rows whose factor level (i.e. gas source) has more than 10 observations, and droplevels() removes the factor levels that are left empty.
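A minimal sketch of the same filtering on made-up data, assuming a data frame with columns Gas.Source and MPG. The values and the threshold of 2 (the name min.obs is hypothetical) are invented for illustration and are not taken from the real mileage log:

```r
# Toy data: invented values, not the real mileage log
mileage <- data.frame(
  Gas.Source = factor(c("UDF", "UDF", "UDF", "Kroger", "Kroger", "Kroger", "BP")),
  MPG        = c(42.7, 43.2, 46.4, 42.1, 38.8, 43.1, 42.6)
)

min.obs <- 2  # hypothetical threshold; the answer above uses 10

# Count observations per source and keep the names of the well-populated ones
tbl  <- table(mileage$Gas.Source)
keep <- names(tbl)[tbl > min.obs]

# Subset the rows, then drop the factor levels that no longer occur
new.Mileage <- droplevels(mileage[mileage$Gas.Source %in% keep, , drop = FALSE])

# One box per remaining source; "BP" (a single observation) is gone
boxplot(MPG ~ Gas.Source, data = new.Mileage)
```

Without droplevels(), the removed sources would stay on as empty factor levels and still take up a slot on the plot's x axis.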

How to use the functcomp function in R

I am having trouble using the functcomp function (from the FD package) in R.
I have 2 datasets: one with species frequency, and the other listing the functional traits of my species. The frequency dataset has 264 species listed in the first row and 27 sites listed in the first column; all values in the dataset are between 0 and 1. The functional trait dataset has the same 264 species (copied and pasted from the frequency dataset to make sure they are identical) listed in the first column, and 5 different functional traits listed in the first row (height, life history, life form, origin, palatability).
I am using the following code:
traits.df <- read.table("species_functional_traits_6_ August.txt", header = TRUE)
frequency.df <- read.table("Spring 2014 - combined table - 6 August.txt", header = TRUE)
x <- (as.matrix(traits.df))
a <- (as.matrix(frequency.df))
functcomp(x, a, CWM.type = c("dom", "all"), bin.num = height)
But keep getting the following error message:
Error in functcomp(x, a, CWM.type = c("dom", "all"), bin.num = height) :
Different number of species in 'x' and 'a'.
I have tried fiddling with a couple of things in the code and datasets, but cannot work out what I am doing wrong here. Any help would be greatly appreciated!
Here are links to the frequency and trait data (a subset of it, but I still get the same error message with this data) as tab-delimited text files:
frequency: https://www.dropbox.com/s/girs3nrq1ciyg1a/frequency%20-%20small.txt?dl=0
traits: https://www.dropbox.com/s/l888sallx7mu3f6/traits%20-%20small.txt?dl=0
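Try specifying row.names = 1 when you read in your tables; this solved the same problem for me. Without it, the first column (the site or species names) is most likely read as an ordinary data column, so 'a' ends up with one more column than 'x' has rows.
Anna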
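To see why row.names = 1 matters, here is a minimal sketch on made-up tables. The species names, sites and values are invented, and inline text = strings stand in for the real files. It only inspects dimensions and does not call functcomp itself; it assumes the error comes from comparing the number of rows in the trait table with the number of columns in the frequency table:

```r
# Invented tab-delimited tables standing in for the real files
traits.txt <- "species\theight\torigin
sp1\t10\tnative
sp2\t25\texotic
sp3\t5\tnative"

freq.txt <- "site\tsp1\tsp2\tsp3
siteA\t0.2\t0.0\t0.8
siteB\t0.5\t0.5\t0.0"

# Without row.names = 1 the name column is read in as a data column
traits.bad <- read.table(text = traits.txt, header = TRUE, sep = "\t")
freq.bad   <- read.table(text = freq.txt,   header = TRUE, sep = "\t")
nrow(traits.bad) == ncol(freq.bad)    # FALSE: 3 species rows vs 4 columns

# With row.names = 1 the names become row names and the dimensions line up
traits.df    <- read.table(text = traits.txt, header = TRUE, sep = "\t", row.names = 1)
frequency.df <- read.table(text = freq.txt,   header = TRUE, sep = "\t", row.names = 1)
nrow(traits.df) == ncol(frequency.df)                    # TRUE
identical(rownames(traits.df), colnames(frequency.df))   # TRUE
```

Running the last two checks on the real tables before calling functcomp is a quick way to confirm that both files list the same species in the same order.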

R storing different columns in different vectors to compute conditional probabilities

I am completely new to R. I tried reading the reference and a couple of good introductions, but I am still quite confused.
I am hoping to do the following:
I have produced a .txt file that looks like the following:
area,energy
1.41155882174e-05,1.0914586287e-11
1.46893363946e-05,5.25011714434e-11
1.39244046855e-05,1.57904991488e-10
1.64155121046e-05,9.0815757601e-12
1.85202830392e-05,8.3207522281e-11
1.5256036289e-05,4.24756620609e-10
1.82107587343e-05,0.0
I have the following command to read the file in R:
tbl <- read.csv("foo.txt", header=TRUE)
producing:
> tbl
          area       energy
1 1.411559e-05 1.091459e-11
2 1.468934e-05 5.250117e-11
3 1.392440e-05 1.579050e-10
4 1.641551e-05 9.081576e-12
5 1.852028e-05 8.320752e-11
6 1.525604e-05 4.247566e-10
7 1.821076e-05 0.000000e+00
Now I want to store the two columns in two separate vectors, area and energy respectively.
I tried:
area <- c(tbl$first)
energy <- c(tbl$second)
but it does not seem to work.
I need two different vectors (which must include only the numerical data of each column) in order to compute:
> prob(energy, given = area), i.e. the conditional probability P(energy|area).
And then plot it. Can you help me please?
As @Ananda Mahto alluded to, the problem is in the way you are referring to columns.
To 'get' a column of a data frame in R, you have several options:
DataFrameName$ColumnName
DataFrameName[,ColumnNumber]
DataFrameName[["ColumnName"]]
So to get area, you would do:
tbl$area #or
tbl[,1] #or
tbl[["area"]]
With the first option generally being preferred (from what I've seen).
Incidentally, for your 'end goal', you don't need to do any of this:
with(tbl, prob(energy, given = area))
does the trick.
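The column-access forms above can be checked on a small data frame. This sketch rebuilds the first three rows of the question's table inline (instead of reading foo.txt) and confirms that all three forms return the same numeric vector. The ratio computed at the end is only a stand-in to show how with() works; it is not the conditional probability from the question:

```r
# First three rows of the question's file, supplied inline via text =
tbl <- read.csv(text = "area,energy
1.41155882174e-05,1.0914586287e-11
1.46893363946e-05,5.25011714434e-11
1.39244046855e-05,1.57904991488e-10", header = TRUE)

# Columns are referred to by their actual names, not by $first / $second
area   <- tbl$area
energy <- tbl[["energy"]]

is.numeric(area)                     # TRUE
identical(tbl$area, tbl[, 1])        # TRUE
identical(tbl$area, tbl[["area"]])   # TRUE

# with() evaluates an expression using the data frame's columns as variables,
# so no separate vectors are needed
ratio <- with(tbl, energy / area)
```

Wrapping the columns in c(), as in the question, is harmless but unnecessary: tbl$area is already a plain numeric vector.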

Resources