I need help to find the best way to convert the table below using the conditions:
If...
the data of the 1st column (plot number) and
the data of the 2nd column (subplot number) and
the data of the 3rd column (trees) and
the name of the tree in the 4th column (tree_species) and
the data of the 5th column (stems)
are the SAME in different rows, the new column dbh_equivalent will be the result of the function:
=SQRT((dbc_cm in row 1)^2 + (dbc_cm in row 2)^2 + ... + (dbc_cm in row n)^2).
That is, in the table above the result would be:
Thanks
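A minimal base R sketch of the grouping described above, assuming the table is a data frame with columns named plot, subplot, tree, tree_species, stems and dbc_cm (the first three names and all of the values are made up for illustration, since the original table is not shown). ave() applies a function within each group and returns a vector as long as the input, so the result can be stored as a new column directly.

```r
# Hypothetical data: rows 1 and 2 match on all five grouping columns
trees <- data.frame(
  plot         = c(1, 1, 1, 2),
  subplot      = c(1, 1, 1, 1),
  tree         = c(1, 1, 2, 1),
  tree_species = c("oak", "oak", "pine", "oak"),
  stems        = c(2, 2, 1, 1),
  dbc_cm       = c(3, 4, 5, 6),
  stringsAsFactors = FALSE
)

# Square root of the sum of squared diameters within each group
trees$dbh_equivalent <- ave(
  trees$dbc_cm,
  trees$plot, trees$subplot, trees$tree, trees$tree_species, trees$stems,
  FUN = function(x) sqrt(sum(x^2))
)

print(trees)
```

Rows that share all five grouping values get the same dbh_equivalent; a row with no match simply keeps its own dbc_cm. If one row per group is wanted instead, aggregate() with the same function would be an alternative.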
I have a dataframe where the rows are the names of different genes, with 2 columns called: Control_mean and Patient_mean.
I want to create a third column where I store the value of "Patient_mean - Control_mean" for each row, but I can't figure out how!
I tried to do so using this:
for(i in 1:nrow(newdf8)){
newdf8$log2FC[i] <- (newdf8[,2] - newdf8[,1])
}
but it didn't work, since all the values in the new column became the same number, and not the value of the actual difference.
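The loop above assigns the whole column difference (newdf8[,2] - newdf8[,1]) to a single element on every pass, rather than the i-th difference. Arithmetic on data frame columns in R is already element-wise, so no loop is needed. A small sketch with made-up gene names and values:

```r
# Hypothetical example data; the column names follow the question
newdf8 <- data.frame(
  Control_mean = c(1.0, 2.5, 4.0),
  Patient_mean = c(2.0, 2.0, 7.5),
  row.names    = c("geneA", "geneB", "geneC")
)

# Vectorised: one subtraction per row, no loop
newdf8$log2FC <- newdf8$Patient_mean - newdf8$Control_mean

print(newdf8)
```

If a loop is really wanted, indexing both sides with i (newdf8[i, 2] - newdf8[i, 1]) would give the per-row difference as well.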
My first question here...
I have 2 dataframes, both with a different number of rows.
The first one has 3 columns, the second one has 1 column.
I want to make all combinations of values from the 1st column of the 1st dataframe with values in the 1st (and only) column of the second dataframe, and values of 2nd column of 1st dataframe with values in 1st (and only) column of second dataframe, and so on...
I assume the result will be a one-column dataframe (?).
Something like this:
Attempts with combn did not help me yet...
Thanks!
Probably not fully what you want, but it provides a starting point. This assumes your first dataframe is called df and the other one (with one column) is called df2.
# make the data long using tidyr
df_long <- tidyr::pivot_longer(df, cols = c("loc1", "loc2", "loc3"))
# cartesian join with the codes column; CJ() comes from data.table and takes vectors
data.table::CJ(df_long$value, df2[[1]])
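For comparison, here is a base R sketch of the same idea without tidyr or data.table. The column names loc1 to loc3 come from the answer above; the column name code and all values are invented for illustration. unlist() flattens the three columns into one vector, and expand.grid() builds every pairing with the values of df2.

```r
# Hypothetical inputs
df  <- data.frame(loc1 = c("a", "b"), loc2 = c("c", "d"), loc3 = c("e", "f"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(code = c("x", "y", "z"), stringsAsFactors = FALSE)

# All pairings of every value in df with every value in df2
combos <- expand.grid(value = unlist(df, use.names = FALSE),
                      code  = df2[[1]],
                      stringsAsFactors = FALSE)

# Collapse each pairing into one string to get a one-column data frame
result <- data.frame(combo = paste(combos$value, combos$code, sep = "_"),
                     stringsAsFactors = FALSE)

print(head(result))
```

Whether the pairs should be pasted together or kept as two columns depends on the expected output, which the question does not show.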
I would like to divide every number in all columns by 1000. I would like to omit the row header and the 1st column from this function.
I have tried this code:
TEST2=(TEST[2:503,]/(1000))
But it is not what I am looking for. My dataframe has 503 columns.
Is TEST a dataframe? In that case, the row header won't be divided by 1000. To choose all columns except the first, use the j index (the part after the comma), e.g.
TEST[, 2:ncol(TEST)]/1000 # selects every row and 2nd to last columns
# same thing
TEST[, -1]/1000 # selects every row and every but the 1st column
Or you can select columns by name, etc (you select columns just like how you are selecting rows at the moment).
Probably take a look at ?'[' to learn how to select particular rows and columns.
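If the first column should be kept in the result rather than dropped, one option is to assign the scaled values back into the same columns. A small sketch with a made-up TEST (the column names id, x and y are hypothetical):

```r
# Hypothetical data: first column is a label, the rest are numeric
TEST <- data.frame(id = c("a", "b"),
                   x  = c(1000, 2000),
                   y  = c(500, 250),
                   stringsAsFactors = FALSE)

TEST2 <- TEST
# Divide every column except the first by 1000, leaving the first untouched
TEST2[, -1] <- TEST2[, -1] / 1000

print(TEST2)
```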
I have a data frame (df) with 8 columns and 1200 rows. Among those 8 columns I want to find the minimum value of column 7 and find the corresponding value of column 2 in that particular row where the minimum value of column 7 was found. Also column 2 holds characters so I want a character vector giving me its value.
I found the minimum of column 7 using
min_val <- min(as.numeric(df[, 7]), na.rm = TRUE)
Now how do I get the value from column 2 (variable name of column being 'column.2') corresponding to the row in which column 7 contains value of 'min_val' as calculated above?
This might be a trivial question but I am new to R so any help will be much appreciated.
Use which.min to get the index of the minimum value. Something like:
df[which.min(df[,7]),2]
Note that which.min only returns the first index of the minimum, so if you've got several rows with the same minimal value, you will only get the first one.
If you want to get all the minimum rows, you can use:
df[which(df[,7]==min(df[,7])), 2]
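A small worked sketch of both variants on made-up data. The name column.2 comes from the question; the other column names and the values are hypothetical. It also passes na.rm = TRUE to min(), as the question does, because min() returns NA as soon as the column contains one missing value.

```r
# Hypothetical 8-column data frame with a tie and a missing value in column 7
df <- data.frame(
  V1 = 1:4,
  column.2 = c("a", "b", "c", "d"),
  V3 = 0, V4 = 0, V5 = 0, V6 = 0,
  V7 = c(3, 1, NA, 1),
  V8 = 0,
  stringsAsFactors = FALSE
)

# First row holding the minimum (which.min skips NA values)
first_min <- df[which.min(df[, 7]), 2]

# Every row holding the minimum
min_val <- min(df[, 7], na.rm = TRUE)
all_min <- df[which(df[, 7] == min_val), 2]

print(first_min)
print(all_min)
```

Both results are character vectors because column 2 holds characters.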
The same answer as juba's, but using the data.table package (his answer uses only base R, with no need to load any libraries).
# Load data.table
library(data.table)
# Convert df to a data.table in place (needed for the column-name syntax below)
setDT(df)
# Get the 2nd column's value corresponding to the first minimum value in the 7th column
df[which.min(V7), V2]
# Get all values in the 2nd column corresponding to the minimum value in the 7th column
df[V7 == min(V7), V2]
For handling data.frame-like objects, data.table is quite handy and helpful, just like the dplyr package. Both are worth a look.
Here I've assumed your colnames were named as V1..V8. Otherwise, just replace the V7/V2 with the respective column names in 7th and 2nd position of your data, respectively.
I need to extract the columns from a dataset without header names.
I have a ~10000 x 3 data set and I need to plot the first column against the other two.
I know how to do it when the columns have names ~ plot(data$V1, data$V2) but in this case they do not. How do I access each column individually when they do not have names?
Thanks
Why not give them sensible names?
names(data) <- c("This", "That", "Other")
plot(data$This, data$That)
That's a better solution than using the column number, since names are meaningful and if your data changes to have a different number of columns your code may break in several places. Give your data the correct names and as long as you always refer to data$This then your code will work.
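As a side note, a data frame read with header = FALSE is not really nameless: read.table() assigns the default names V1, V2, V3, and those can be overwritten with names(). A sketch using inline text in place of a file (the values are made up; This, That and Other are the placeholder names from the answer above):

```r
# Read a headerless 3-column data set; inline text stands in for a file here
data <- read.table(text = "1 10 100\n2 20 200\n3 30 300", header = FALSE)

# Names assigned automatically by read.table
default_names <- names(data)

# Replace them with meaningful names
names(data) <- c("This", "That", "Other")

print(default_names)
print(data$That)
# plot(data$This, data$That) would then work as in the answer above
```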
I usually select columns by their position in the matrix/data frame.
e.g.
dataset[,4] to select the 4th column.
The 1st number in brackets refers to rows, the second to columns. Here, I didn't use a "1st number" so all rows of column 4 are selected, i.e., the whole column.
This is easy to remember since it stems from matrix calculations. E.g., a 4x3 dimensional matrix has 4 rows and 3 columns. Thus when I want to select the 1st row of the third column, I could do something like matrix[1,3]
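The indexing rules above can be tried on a small 4x3 matrix (the values are arbitrary). The same [row, column] syntax works on a data frame, and [[ ]] with a position is another way to pull out a single column.

```r
# A 4x3 matrix filled column by column with 1..12
m <- matrix(1:12, nrow = 4, ncol = 3)

one_cell  <- m[1, 3]   # 1st row of the 3rd column
whole_col <- m[, 3]    # all rows of the 3rd column

# The same positional indexing on a data frame
dataset    <- as.data.frame(m)
second_col <- dataset[, 2]    # all rows of the 2nd column
same_col   <- dataset[[2]]    # equivalent, using [[ ]]

print(one_cell)
print(whole_col)
print(second_col)
```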