Join columns from different files - r

How can I combine a specific column from one file with a column from another file in R?
I also want to subtract 50 from the maximum of each column. I tried this, but it didn't work:
a <- 50-max(datafile1$X2018.03.06,datafile1$X2017.07.13)
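One possible reading of the attempt above: max() with two arguments returns a single overall maximum, and 50 - max(...) subtracts in the opposite direction from "subtract 50 from the maximum". The following is a minimal sketch, assuming the second column actually lives in a second data frame (here called datafile2, a hypothetical name) and that both files have the same number of rows. The values are made up for illustration.

```r
# Hypothetical stand-ins for the two files; in practice these would come from
# read.csv() or a similar reader. The numbers are invented.
datafile1 <- data.frame(X2018.03.06 = c(120, 95, 143))
datafile2 <- data.frame(X2017.07.13 = c(88, 101, 76))

# Put one column from each file side by side (assumes equal row counts).
combined <- cbind(datafile1["X2018.03.06"], datafile2["X2017.07.13"])

# Maximum of each column separately, then subtract 50 from each maximum.
a <- sapply(combined, max, na.rm = TRUE) - 50
print(a)
```

If the rows of the two files need to be matched on a shared key rather than by position, merge() would probably be a better fit than cbind().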

Related

How do I subtract values from one column from another in a dataframe?

I have a dataframe where the rows are the names of different genes, with 2 columns called: Control_mean and Patient_mean.
I want to create a third column that stores the value of "Patient_mean - Control_mean" for each row, but I can't figure out how!
I tried to do so using this:
for(i in 1:nrow(newdf8)){
newdf8$log2FC[i] <- (newdf8[,2] - newdf8[,1])
}
but it didn't work: all the values in the new column became the same number rather than the actual difference for each row.
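The likely cause is that the right-hand side of the loop subtracts whole columns, so a full vector is assigned into a single element and only its first value is kept. Subtraction in R is already vectorised, so no loop should be needed. A minimal sketch, with invented gene names and values:

```r
# Hypothetical data: rows are genes, columns are the two means.
newdf8 <- data.frame(Control_mean = c(2.0, 5.5, 3.2),
                     Patient_mean = c(3.5, 4.0, 3.2),
                     row.names = c("geneA", "geneB", "geneC"))

# Element-wise difference, one value per row.
newdf8$log2FC <- newdf8$Patient_mean - newdf8$Control_mean

# If a loop is really wanted, index the row on both sides:
# for (i in seq_len(nrow(newdf8))) {
#   newdf8$log2FC[i] <- newdf8[i, 2] - newdf8[i, 1]
# }
print(newdf8)
```

Referring to the columns by name rather than by position (newdf8[, 2]) also guards against the column order changing later.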

Transform multiple entries in cell into new columns in R

I have a messy dataset with multiple entries in some cells. The numbers in parentheses refer to the specific columns "(1)", "(2)", and "(3)". In this example, in a cell with multiple entries, 30 refers to column (2) and 20 refers to column (1); there is no information for column (3).
I would like to split up and extract the values in these cells and create 3 additional columns.
Several hundred cells are affected, across several columns.
In the end I would like to have 3 new columns for each column affected. Any idea how I do that? I'm still a rookie so help is much appreciated!
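The exact cell format is not shown, so the following is only a sketch under an assumed format: each cell holds entries like "30 (2), 20 (1)", where the number in parentheses is the target column. The column name x, the helper split_cell, and the sample values are all hypothetical.

```r
# Hypothetical input; the real cell format may differ.
df <- data.frame(x = c("30 (2), 20 (1)", "5 (3)", "7 (1), 8 (2), 9 (3)"),
                 stringsAsFactors = FALSE)

# Turn one cell into a vector with one slot per target column.
split_cell <- function(cell, n_targets = 3) {
  out <- rep(NA_real_, n_targets)
  # Find every "<number> (<index>)" pair in the cell.
  pattern <- "[0-9.]+[[:space:]]*\\([0-9]+\\)"
  pairs <- regmatches(cell, gregexpr(pattern, cell))[[1]]
  for (p in pairs) {
    value <- as.numeric(sub("[[:space:]]*\\(.*$", "", p))
    index <- as.integer(sub("^.*\\(([0-9]+)\\)$", "\\1", p))
    if (index >= 1 && index <= n_targets) out[index] <- value
  }
  out
}

# One row per original cell, one new column per target.
new_cols <- t(sapply(df$x, split_cell, USE.NAMES = FALSE))
colnames(new_cols) <- paste0("x_", 1:3)
df <- cbind(df, new_cols)
print(df)
```

To cover several affected columns, the same helper could be applied to each of them in turn, changing the prefix used for the new column names.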

How to extract specific rows depending on part of the strings in one column in R

In R, I am trying to extract the rows that contain a specific string in one column.
The data structure is as follows:
ERC1 20679 14959 9770 RAB6-interacting protein 2 isoform
I want to extract the rows which have RAB6 in the last column. That column contains other words besides RAB6, so I cannot use column = "RAB6" to get them. It's just like the search function in Excel. Does anyone have any ideas?
Assuming that your data frame is df:
df[grep("^RAB6", df$column),]
If not all values start with RAB6, remove the ^.
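As a variation on the answer above, grepl() returns a logical vector instead of row positions. A minimal sketch, where the first row follows the example in the question and the other rows and the column names are invented:

```r
# First row is taken from the question; the rest is hypothetical.
df <- data.frame(gene = c("ERC1", "ABC2", "XYZ3"),
                 description = c("RAB6-interacting protein 2 isoform",
                                 "unrelated protein",
                                 "putative RAB6 effector"),
                 stringsAsFactors = FALSE)

# fixed = TRUE treats "RAB6" as a literal string, not a regular expression,
# and matches it anywhere in the text.
hits <- df[grepl("RAB6", df$description, fixed = TRUE), ]
print(hits)
```

The logical form is also convenient for the opposite selection, since !grepl(...) keeps the rows that do not match.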

How to eliminate several rows from a dataframe using R

I have two Excel files, A and B. Excel file A has 6 columns with 10,000 rows. Assume that the columns are named A-F. Excel file B has one column (A) with 3500 rows. Here is what I want to do:
I want to eliminate rows based on the cell values (ids) in column A of Excel file A and end up with a dataframe without them. To elaborate: I want to check each id in column A of Excel file A against all ids in column A of Excel file B. If an id in column A of file A matches any of the ids listed in column A of file B, I want to eliminate the row with that id from file A.
I was able to do this in Excel. I want to do the same in R as a cross-check. I am new to R and still learning. Could someone suggest a good way to do this?
I know how to subset rows based on a column header and a particular cell value. But in this case I have data with 10,000 observations, of which I want to eliminate at least 3500 by matching the ids.
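One common way to do this kind of filtering is the %in% operator, negated. A minimal sketch, using small invented data frames in place of the two Excel files (the names fileA and fileB are hypothetical; the real data would first need to be read in, for example with a package such as readxl):

```r
# Hypothetical stand-ins for the two Excel files.
fileA <- data.frame(A = c(101, 102, 103, 104, 105),
                    B = letters[1:5],
                    stringsAsFactors = FALSE)
fileB <- data.frame(A = c(102, 105))

# Keep only the rows of fileA whose id does NOT appear in fileB.
kept <- fileA[!(fileA$A %in% fileB$A), ]
print(kept)

# Quick cross-check: how many rows were dropped.
nrow(fileA) - nrow(kept)
```

Comparing the dropped count with the result obtained in Excel gives a simple sanity check. Ids stored as text in one file and as numbers in the other may need converting to a common type first.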

extract columns that don't have a header or name in R

I need to extract the columns from a dataset without header names.
I have a ~10000 x 3 data set and I need to plot the first column against the other two.
I know how to do it when the columns have names, e.g. plot(data$V1, data$V2), but in this case they do not. How do I access each column individually when they do not have names?
Thanks
Why not give them sensible names?
names(data)=c("This","That","Other")
plot(data$This,data$That)
That's a better solution than using the column number, since names are meaningful, and if your data later changes to have a different number of columns, code that relies on column numbers may break in several places. Give your data the correct names, and as long as you always refer to data$This, your code will keep working.
I usually select columns by their position in the matrix/data frame, e.g.
dataset[,4] to select the 4th column.
The first number in the brackets refers to rows, the second to columns. Here, I left the first number out, so all rows of column 4 are selected, i.e., the whole column.
This is easy to remember since it follows matrix notation. E.g., a 4x3 matrix has 4 rows and 3 columns. Thus, when I want to select the 1st row of the third column, I could do something like matrix[1,3]
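Both answers can be tried on a small example. The sketch below uses three invented rows in place of the ~10000 x 3 data set. It also shows that read.table() with header = FALSE assigns the default names V1, V2, V3, so a data frame read that way may already have usable names.

```r
# Hypothetical three-row stand-in for the headerless file.
data <- read.table(text = "1 10 100
2 20 200
3 30 300", header = FALSE)

names(data)          # default names given by read.table

# Positional access: [[k]] or [, k] returns the k-th column as a vector.
x  <- data[[1]]
y1 <- data[, 2]
y2 <- data[, 3]

# First column against the other two on one set of axes.
plot(x, y1, ylim = range(y1, y2), xlab = "column 1", ylab = "columns 2 and 3")
points(x, y2, pch = 2)
```

If the two y columns are on very different scales, as in this toy data, two separate plots may be easier to read than one shared axis.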

Resources