Let's say I have a dataset (ds) with 4 rows and 3 variables, as seen below:
ds
x1 x2 x3
1 0 0
0 0 1
0 1 0
0 0 1
How do I change the "1" to a unique value for each column and combine them into a single column?
So, the first step:
x1 x2 x3
1 0 0
0 0 3
0 2 0
0 0 3
Then, the second step (creating x4):
x1 x2 x3 x4
1 0 0 1
0 0 3 3
0 2 0 2
0 0 3 3
I have a lot more variables than this. I just want to know how to minimize the number of lines I write, so it's not 10+ lines.
You could do this:
df <- read.table(text="x1 x2 x3
1 0 0
0 0 1
0 1 0
0 0 1", header=TRUE, stringsAsFactors=FALSE)
df <- df*col(df)
df$x4 <- rowSums(df)
x1 x2 x3 x4
1 1 0 0 1
2 0 0 3 3
3 0 2 0 2
4 0 0 3 3
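To see why this works, it may help to look at what col() returns: a matrix of the same shape as its argument whose entries are the column numbers. The sketch below is only an illustration (the intermediate name idx is made up here); it computes x4 without overwriting the original 0/1 columns.

```r
df <- data.frame(x1 = c(1, 0, 0, 0),
                 x2 = c(0, 0, 1, 0),
                 x3 = c(0, 1, 0, 1))

# col() gives each cell its column number (1, 2, 3, ...)
idx <- col(df)

# Multiplying cell-wise turns every 1 into its column number;
# because each row holds a single 1, the row sum is that number.
x4 <- rowSums(df * idx)
```

This assumes each row contains at most one 1. If a row had several, the row sum would add their column numbers together.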
I am trying to find a simple way (1 line of code or so) to multiply every value in every column of a dataframe by 100, for example.
df <- data.frame(replicate(10,sample(0:1,1000,rep=TRUE)))
head(df)
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 0 1 1 0 0 0 0 1 0 1
2 0 0 1 0 0 1 0 1 0 0
3 0 0 1 1 0 0 1 1 0 0
4 0 1 0 1 1 1 0 0 0 1
5 1 0 1 1 0 0 0 1 0 0
6 0 0 0 0 1 0 0 1 1 1
The way I am currently doing it:
dfX1 <- as.data.frame(df$X1 * 100)
But this way I would have to do this 10 times... and then use the cbind function to bind them all back together again.
dfFULL <- cbind(dfX1, dfX2, dfX3...)
Anybody know of a cleaner way?
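One possible sketch, assuming every column is numeric as in the example: arithmetic on a data frame is applied cell by cell, so the whole thing can be multiplied at once. The variable num is a made-up helper name, used only to show how to restrict the operation if some columns were not numeric.

```r
set.seed(1)  # only so the sketch is reproducible
df <- data.frame(replicate(10, sample(0:1, 1000, replace = TRUE)))

# Arithmetic on a data frame is applied to every cell
dfFULL <- df * 100

# If some columns were not numeric, restrict to the numeric ones
num <- vapply(df, is.numeric, logical(1))
df[num] <- df[num] * 100
```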
I have two dataframes, A and B, each with 64 rows and 431 columns. Each dataframe contains only zeros and ones. I need to create a new dataframe C that has a value of 1 where the cell of A is equal to the corresponding cell of B, and a value of 0 where they differ. How do I apply an if statement to each cell of the two dataframes?
Example of dataframes
A <- data.frame(replicate(431,sample(0:1,64,rep=TRUE)))
B <- data.frame(replicate(431,sample(0:1,64,rep=TRUE)))
Example rows from A
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 0 1 1 0 1 0 1 0 0 1
2 1 1 0 1 1 0 0 0 0 0
3 1 0 0 0 1 0 0 1 1 0
4 0 0 0 0 1 1 1 1 1 0
5 1 0 1 1 0 0 0 1 1 1
Example rows from B
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 1 0 1 0 0 1 0 1 0 1
2 0 0 0 1 0 1 1 1 1 1
3 1 0 1 1 1 1 0 0 0 0
4 1 0 0 0 0 1 1 0 0 0
5 0 0 0 0 1 1 1 1 1 0
Output I would like to obtain, dataframe C
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 0 0 1 0 0 0 0 0 1 0
2 0 0 1 1 0 0 0 0 0 0
3 1 1 0 0 1 0 1 0 0 1
4 0 1 1 1 0 1 1 0 0 1
5 0 1 0 0 0 0 0 1 0 0
Because of R's behind-the-scenes magic, you don't even need to use an if statement. You can just do this:
C <- (A == B) * 1
The first part (A == B) goes through every cell of A and B and compares them directly. The result is a bunch of TRUE and FALSE values. Multiplying everything by 1 forces the TRUE values to become 1 and FALSE to become 0.
You assess whether A and B are the same (cell-wise) and then transform the TRUE / FALSE values into binary by multiplying them by 1:
df <- (A == B) * 1
The previous answers are correct. If you really want to use an if statement, then you can use this:
C <- ifelse(A == B, 1, 0)
Basic operations on R matrix-like data structures tend to be cell-wise. Logicals mixed with numbers in arithmetic are coerced to numbers, 0 (FALSE) and 1 (TRUE), so (A == B) + 0 would do what you want cell-wise. However, the comparison returns a matrix, so to make sure that the result is a data.frame you need to call as.data.frame:
C = as.data.frame((A == B) + 0)
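A small sketch of the point about matrices versus data frames, using two made-up 2x2 inputs rather than the 64x431 ones from the question:

```r
A <- data.frame(X1 = c(0, 1), X2 = c(1, 1))
B <- data.frame(X1 = c(1, 1), X2 = c(1, 0))

# Comparing two data frames cell-wise yields a logical matrix
eq <- A == B
is.matrix(eq)

# Adding 0 coerces TRUE/FALSE to 1/0; as.data.frame restores the class
C <- as.data.frame(eq + 0)
```

Whether the final as.data.frame call is needed depends on what happens next: a matrix is fine for further arithmetic, while column access with $ requires a data frame.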
We’d like to merge some columns from a data frame with the matching columns from various different data frames. Our main data frame predict looks as follows:
>predict
x1 x2 x3
1 1 1
0 1 0
1 1 0
1 1 0
0 0 1
(There may be more columns depending on the quantity of prediction runs)
Our goal is to merge this data frame with the y-columns from three different test data frames (df_1, df_2 and df_3), which all have the same structure. The needed columns are accessed through df_1$y[test] (test is a logical vector which identifies the 5 values that match our x-values) and have the same structure as the x-columns from predict.
The desired output would look like this:
>predict_test
x1 x2 x3 y1 y2 y3
1 1 1 1 1 1
0 1 0 0 0 0
1 1 0 0 1 0
1 1 0 1 1 1
0 0 1 0 0 1
In the next step we need to stack the x-columns and the y-columns into one column each in order to do evaluations. It is important to stack them in the correct order, i.e. x2 under x1 and x3 under x2, and likewise for the y-columns.
>predict_test_stack
x_all y_all
1 1
0 0
1 0
1 1
0 0
1 1
1 0
1 1
1 1
0 0
1 1
0 0
0 0
0 1
1 1
This probably works with melt, but we don't know how to apply it while indicating two different id variables.
Thanks for your help.
data
df1 <- read.table(text = "x1 x2 x3
1 1 1
0 1 0
1 1 0
1 1 0
0 0 1",stringsAsFactors = FALSE,header=TRUE)
df2 <- read.table(text = "y1 y2 y3
1 1 1
0 0 0
0 1 0
1 1 1
0 0 1",stringsAsFactors = FALSE,header=TRUE)
solution
We place the data.frames side by side, then unlist the result into a matrix with one column per original data.frame. Finally, we set the names by taking the first column name of each data.frame and stripping the digit.
list1 <- list(df1,df2)
side_by_side <- data.frame(list1)
# x1 x2 x3 y1 y2 y3
# 1 1 1 1 1 1 1
# 2 0 1 0 0 0 0
# 3 1 1 0 0 1 0
# 4 1 1 0 1 1 1
# 5 0 0 1 0 0 1
output <- data.frame(matrix(unlist(side_by_side),ncol = length(list1)))
names(output) <- sapply(list1,function(x){sub("[[:digit:]]","",names(x)[1])})
# x y
# 1 1 1
# 2 0 0
# 3 1 0
# 4 1 1
# 5 0 0
# 6 1 1
# 7 1 0
# 8 1 1
# 9 1 1
# 10 0 0
# 11 1 1
# 12 0 0
# 13 0 0
# 14 0 1
# 15 1 1
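An alternative sketch, not the method above: unlist() flattens a data frame column by column, so calling it on each data frame separately gives the stacked columns directly, in the order x1, x2, x3 and y1, y2, y3. The column names x_all and y_all come from the question; the name stacked is made up here.

```r
df1 <- data.frame(x1 = c(1, 0, 1, 1, 0),
                  x2 = c(1, 1, 1, 1, 0),
                  x3 = c(1, 0, 0, 0, 1))
df2 <- data.frame(y1 = c(1, 0, 0, 1, 0),
                  y2 = c(1, 0, 1, 1, 0),
                  y3 = c(1, 0, 0, 1, 1))

# unlist() walks the columns in order, so x2 lands under x1, x3 under x2
stacked <- data.frame(x_all = unlist(df1, use.names = FALSE),
                      y_all = unlist(df2, use.names = FALSE))
```

This relies on both data frames having the same number of rows and columns; otherwise the two stacked vectors would differ in length.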
I have a data frame that looks like this
Site <- c("X1","X2","X3","X4","X5","X6","X7","X8","X9","X10")
A <- c(0,0,1,2,4,5,6,7,13,56)
B <- c(1,0,0,0,0,4,5,7,7,8)
C <- c(2,3,0,0,4,5,67,8,43,21)
D <- c(134,0,0,2,0,0,9,0,45,55)
mydata <- data.frame(Site,A,B,C,D,stringsAsFactors=FALSE)
I want to convert all values > 0 to be 1 (i.e. binary), without jeopardising the column and row names.
I have tried mydata[mydata>=1]<-1, but it also changed my first column (the row names) to 1:
head(mydata)
Site A B C D
1 1 0 1 1 1
2 1 0 0 1 0
3 1 1 0 0 0
4 1 1 0 0 1
5 1 1 0 1 0
6 1 1 1 1 0
So how do I change just the values to binary, not the row names?
We can create a logical matrix and coerce to binary
mydata[-1] <- +(mydata[-1] > 0)
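To unpack that one-liner, here is a step-by-step sketch on a shortened, made-up version of the data (flags and ones are illustrative names). The original attempt compared the whole data frame, including Site; a character column is compared as text, which is presumably why Site was overwritten too. Dropping the first column with [-1] avoids that.

```r
mydata <- data.frame(Site = c("X1", "X2", "X3"),
                     A = c(0, 1, 13),
                     B = c(1, 0, 7),
                     stringsAsFactors = FALSE)

flags <- mydata[-1] > 0   # logical matrix for columns A and B only
ones  <- +flags           # unary plus turns TRUE/FALSE into 1/0

mydata[-1] <- ones        # Site is never touched
```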
As an alternative to the answer given by @akrun (+1), we can also try using sapply() to convert any positive number to 1 and everything else to 0:
mydata[-1] <- sapply(mydata[-1], function(x) { as.numeric(x > 0) })
mydata
Site A B C D
1 X1 0 1 1 1
2 X2 0 0 1 0
3 X3 1 0 0 0
4 X4 1 0 0 1
5 X5 1 0 1 0
6 X6 1 1 1 0
7 X7 1 1 1 1
8 X8 1 1 1 0
9 X9 1 1 1 1
10 X10 1 1 1 1
If we weren't sure about the relative positioning of the columns, we could also address the numeric columns using mydata[c("A", "B", "C", "D")] or something similar.
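A possible sketch of that idea, assuming the columns to convert are exactly the numeric ones (num_cols is a made-up name and the data are shortened for illustration):

```r
mydata <- data.frame(Site = c("X1", "X2", "X3"),
                     A = c(0, 2, 56),
                     D = c(134, 0, 0),
                     stringsAsFactors = FALSE)

# Pick the numeric columns by type instead of by position
num_cols <- names(mydata)[vapply(mydata, is.numeric, logical(1))]

# Convert each selected column to 0/1, leaving the others alone
mydata[num_cols] <- lapply(mydata[num_cols], function(x) as.numeric(x > 0))
```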
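You could also try this, which works regardless of whether the number is negative or positive. It relies on 0/0 being NaN, which is.na() flags, while any other number divided by itself is 1: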
mydata[-1] <- (!is.na(mydata[-1]/mydata[-1]))*1
The ifelse function lets you assign one value where a condition holds and another where it does not. It works on vectors and also on data frames. Here I bind the Site column with the transformed columns.
myBinData <- data.frame(Site = mydata$Site, ifelse(mydata[, -1] == 0, 0, 1))
Site A B C D
1 X1 0 1 1 1
2 X2 0 0 1 0
3 X3 1 0 0 0
4 X4 1 0 0 1
5 X5 1 0 1 0
6 X6 1 1 1 0
7 X7 1 1 1 1
8 X8 1 1 1 0
9 X9 1 1 1 1
10 X10 1 1 1 1