Change values of a vector to 0 and 1 - r

I would like to set some values of a vector to 0 and all the others to 1. The second assignment doesn't work. Why?
a <- c(1,34,5,3,6,67,3,2)
a[c(1,3,5)] <- 0 # works
a[!c(1,3,5)] <- 1 # doesn't work
The result should look like
a
[1] 0 1 0 1 0 1 1 1

! negates logical values. To exclude positions by index, use - instead:
a[-c(1,3,5)] <- 1
a
#[1] 0 1 0 1 0 1 1 1
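
To see why the ! version does nothing, it may help to look at the index it produces. A minimal sketch, where idx and the other intermediate names are only illustrative:

```r
a <- c(1, 34, 5, 3, 6, 67, 3, 2)
idx <- c(1, 3, 5)

# ! coerces the numbers to logical (non-zero is TRUE) and negates them
neg_logical <- !idx                 # FALSE FALSE FALSE

# a logical index is recycled over the vector, so nothing is selected
selected_by_not <- a[neg_logical]   # numeric(0)

# a negative index drops the listed positions instead
selected_by_minus <- a[-idx]        # 34 3 67 3 2

a[idx] <- 0
a[-idx] <- 1
a
```

So a[!c(1,3,5)] <- 1 runs without an error but assigns to an empty selection.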

You can try
> +!!replace(a,c(1,3,5),0)
[1] 0 1 0 1 0 1 1 1

We can create the logical index with %in%
a[!seq_along(a) %in% c(1, 3, 5)] <- 1
a
#[1] 0 1 0 1 0 1 1 1
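
If only the 0/1 pattern is needed, the logical index can also be turned into the result directly. A small sketch, assuming the original values of a do not matter afterwards (zero_pos and keep_one are made-up names):

```r
a <- c(1, 34, 5, 3, 6, 67, 3, 2)
zero_pos <- c(1, 3, 5)                      # positions that should become 0

# TRUE for every position that is not listed in zero_pos
keep_one <- !(seq_along(a) %in% zero_pos)

# TRUE becomes 1, FALSE becomes 0
result <- as.integer(keep_one)
result
```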

Related

I need help putting values from one vector into another in R

I have two vectors in R
Vector 1
0 0 0 0 0 0 0 0 0 0
Vector 2
1 1 3 1 1 1 1 1
I need to put the values from vector 2 into vector 1 but into specific positions so that vector 1 becomes
1 1 3 0 0 1 1 1 1 1
I need to do this in one line of code. I tried doing:
vector1[1:3,6:10] = vector2[1:3,4:8]
but I am getting the error "incorrect number of dimensions".
Is it possible to do this?
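A comma inside [ ] asks for matrix-style indexing, which is why a plain vector gives that error. Combine the positions with c() instead:
vector1[c(1:3,6:10)] = vector2[c(1:3,4:8)]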
> vector1
[1] 1 1 3 0 0 1 1 1 1 1
We may use negative indexing
vector1[-(4:5)] <- vector2
vector1
[1] 1 1 3 0 0 1 1 1 1 1
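
Both answers rely on the number of selected positions matching the length of vector2. A short sketch that builds the two vectors from the question and checks that the two approaches agree (target, by_position and by_exclusion are illustrative names):

```r
vector1 <- rep(0, 10)
vector2 <- c(1, 1, 3, 1, 1, 1, 1, 1)

# positions in vector1 that receive a value
target <- c(1:3, 6:10)
stopifnot(length(target) == length(vector2))

# positive indexing: name the positions to fill
by_position <- vector1
by_position[target] <- vector2

# negative indexing: name the positions to leave alone
by_exclusion <- vector1
by_exclusion[-(4:5)] <- vector2

identical(by_position, by_exclusion)
```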

Converting a factor to numeric changes the values (avoid 0 1 becoming 1 2)

I want to convert a factor to numeric so that I can take its mean. as.numeric changes the values, and numeric doesn't work.
mtcars$vec <- factor(c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1))
num.cols <- c("vec" )
mtcars[num.cols] <- lapply(mtcars[num.cols], as.numeric)
str(mtcars)
mtcars$vec
The expected result should be numeric and contain only 0 and 1:
mtcars$vec
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
many thanks in advance
We need to convert to character first and then to numeric. Applying as.numeric directly to a factor returns the internal integer codes, which start at 1, rather than the values shown as labels. That is easy to miss here because the values are binary, so 0 and 1 silently become 1 and 2.
mtcars[num.cols] <- lapply(mtcars[num.cols],
function(x) as.numeric(as.character(x)))
mtcars$vec
#[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
A slightly faster option converts only the levels and then indexes them by the factor:
mtcars[num.cols] <- lapply(mtcars[num.cols], function(x) as.numeric(levels(x))[x])
For a single column, this can be written directly:
mtcars[[num.cols]] <- as.numeric(levels(mtcars[[num.cols]]))[mtcars[[num.cols]]]
As an example
v1 <- factor(c(15, 15, 3, 3))
as.numeric(v1)
#[1] 2 2 1 1
as.numeric(as.character(v1))
#[1] 15 15 3 3
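
For the 0/1 case in the question, the effect on the mean can be checked with a small made-up factor. This is only a sketch; f and the other names are illustrative and not part of mtcars:

```r
f <- factor(c(0, 0, 1, 1, 1))

levels(f)        # the labels: "0" "1"
as.integer(f)    # the internal codes: 1 1 2 2 2

# mean of the codes, which is not what we want
codes_mean <- mean(as.integer(f))

# mean of the actual values
values <- as.numeric(as.character(f))
values_mean <- mean(values)

c(codes_mean, values_mean)
```

The first mean is shifted by exactly 1 because every code is one larger than the value it stands for.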

How to remove duplicate values from different rows per unique identifier?

I'm just starting to use R. I have a dataset with unique identifiers (1958 patients) in the first column and 0's and 1's in columns 2-35.
For example:
Patient A: 0 1 0 1 1 1 1 1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 NA NA
I want to change this to:
Patient A: 0 1 0 1 0 1
Thanks in advance.
We can use tapply, grouping the values by whether they change from one position to the next, i.e.
tapply(x[!is.na(x)], cumsum(c(TRUE, diff(x[!is.na(x)]) != 0)), FUN = unique)
#1 2 3 4 5 6
#0 1 0 1 0 1
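
The grouping step is easier to follow when it is split up. A sketch on a short made-up vector (x_obs, changed and run_id are illustrative names):

```r
x <- c(0, 1, 0, 1, 1, 1, 0, 0, 1, NA, NA)

# drop the missing values first
x_obs <- x[!is.na(x)]

# TRUE wherever a value differs from the one before it (the first always counts)
changed <- c(TRUE, diff(x_obs) != 0)

# running count of changes: every run of equal values shares one id
run_id <- cumsum(changed)

# one value per run
collapsed <- as.vector(tapply(x_obs, run_id, FUN = unique))
collapsed
```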
Based on your example, it is not clear whether NA's can also occur in the middle and how you would want to deal with that situation. Either an NA marks a boundary, so that 1 NA 1 keeps both 1's (option 1), or the NA is simply dropped, so that 1 NA 1 becomes 1 1 and the two 1's are merged into a single 1 (option 2).
That determines at which point to remove NA's in the code.
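
The difference between the two readings can be sketched with base R's rle(), as an alternative that needs no extra package. The function names below are made up for illustration:

```r
# NA marks a boundary: collapse runs first, then drop the NA runs
collapse_na_boundary <- function(x) {
  v <- rle(x)$values
  v[!is.na(v)]
}

# NA is ignored: drop NA first, then collapse runs
collapse_na_removed <- function(x) {
  rle(x[!is.na(x)])$values
}

x <- c(0, 1, NA, 1, 1, 0)
res_boundary <- collapse_na_boundary(x)   # the 1's on both sides of NA are kept
res_removed <- collapse_na_removed(x)     # the 1's around NA merge into one
res_boundary
res_removed
```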
You could use S4Vectors run length encoding, which would allow you to have more than just 0 and 1.
library(S4Vectors)
## create example data
set.seed(1)
x <- sample(c(0,1), (1958*34), replace=TRUE, prob=c(.4, .6))
x[sample(length(x), 200)] <- NA
x <- matrix(x, nrow=1958, ncol=34)
df <- data.frame(patient.id = paste0("P", seq_len(1958)), x, stringsAsFactors = FALSE)
## define function to remove NA values
# option 1
fun.NA.boundary <- function(x) {
a <- runValue(Rle(x))
a[!is.na(a)]
}
# option 2
fun.NA.remove <- function(x) runValue(Rle(x[!is.na(x)]))
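## calculate results
# note: x holds only the 34 value columns, so x[,-1] also leaves out the first of them;
# pass x (or df[,-1]) instead to use every column
# option 1
reslist <- apply(x[,-1], 1, function(y) fun.NA.boundary(y))
# option 2
reslist <- apply(x[,-1], 1, function(y) fun.NA.remove(y))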
names(reslist) <- df$patient.id
head(reslist)
#> $P1
#> [1] 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
#>
#> $P2
#> [1] 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
#>
#> $P3
#> [1] 0 1 0 1 0 1 0 1 0 1 0 1 0 1
#>
#> $P4
#> [1] 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
#>
#> $P5
#> [1] 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0
#>
#> $P6
#> [1] 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

Create vector = (0 1 1 0 0 0 1 1 1 1)?

How can I create the following vector?
vec = (0 1 1 0 0 0 1 1 1 1)
I already tried rep(0:1,times=1:4), which works with numbers other than 0 but does not work here.
For rep, 'times' and 'x' need to have the same length (unless the length of 'times' equals 1). Therefore, you need to make a vector 'x' with length 4 in this case.
> rep(rep(0:1,2),times=1:4)
[1] 0 1 1 0 0 0 1 1 1 1
Here's a generic solution:
> increp=function(n){rep(0:(n-1), times=1:n) %% 2}
> increp(4)
[1] 0 1 1 0 0 0 1 1 1 1
> increp(3)
[1] 0 1 1 0 0 0
> increp(2)
[1] 0 1 1
> increp(6)
[1] 0 1 1 0 0 0 1 1 1 1 0 0 0 0 0 1 1 1 1 1 1
It generates 0,1,1,2,2,2,3,3,3,3,... up to the required length and then converts each value to 0 or 1 depending on whether it is even or odd.
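
The two stages can be looked at separately. A small sketch for n = 4, with counts and parity as illustrative names:

```r
increp <- function(n) rep(0:(n - 1), times = 1:n) %% 2

n <- 4

# stage 1: the value k is repeated k + 1 times
counts <- rep(0:(n - 1), times = 1:n)

# stage 2: even values become 0, odd values become 1
parity <- counts %% 2

counts
parity

# the result always has 1 + 2 + ... + n elements
length(parity) == n * (n + 1) / 2
```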

How to count the frequency in binary transactions per column, and add the result after the last row in R?

I have a txt file (data5.txt):
1 0 1 0 0
1 1 1 0 0
0 0 1 0 0
1 1 1 0 1
0 0 0 0 1
0 0 1 1 1
1 0 0 0 0
1 1 1 1 1
0 1 0 0 1
1 1 0 0 0
I need to count the frequency of ones and zeros in each column.
If the frequency of ones >= the frequency of zeros, I want to put a 1 after the last row for that column.
I'm new to R. I tried this and got an error:
Error in if (z >= d) data[n, i] = 1 else data[n, i] = 0 :
missing value where TRUE/FALSE needed
my code:
data<-read.table("data5.txt", sep="")
m =length(data)
d=length(data[,1])/2
n=length(data[,1])+1
for(i in 1:m)
{
z=sum(data[,i])
if (z>=d) data[n,i]=1 else data[n,i]=0
}
You may try this (df is the data frame read from the file):
rbind(df, ifelse(colSums(df == 1) >= colSums(df == 0), 1, NA))
# V1 V2 V3 V4 V5
# 1 1 0 1 0 0
# 2 1 1 1 0 0
# 3 0 0 1 0 0
# 4 1 1 1 0 1
# 5 0 0 0 0 1
# 6 0 0 1 1 1
# 7 1 0 0 0 0
# 8 1 1 1 1 1
# 9 0 1 0 0 1
# 10 1 1 0 0 0
# 11 1 1 1 NA 1
Update, thanks to a nice suggestion from @Arun:
rbind(df, ifelse(colSums(df == 1) >= ceiling(nrow(df)/2), 1, NA))
or even:
rbind(df, ifelse(colSums(df == 1) >= nrow(df)/2, 1, NA))
Thanks to @SvenHohenstein.
Possibly I misinterpreted your intended results. If you want 0 when the frequency of ones is smaller than the frequency of zeros, then this suffices:
rbind(df, colSums(df) >= nrow(df) / 2)
Again, thanks to @SvenHohenstein for his useful comments!
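
As far as I can tell, the loop in the question fails because data[n,i] = ... appends row 11 during the first iteration and fills the other columns of that row with NA, so the next sum() returns NA and if() has nothing to test. Computing all columns at once avoids that. A self-contained sketch, with the file contents pasted in as text instead of being read from data5.txt (dat, ones and majority are illustrative names):

```r
txt <- "1 0 1 0 0
1 1 1 0 0
0 0 1 0 0
1 1 1 0 1
0 0 0 0 1
0 0 1 1 1
1 0 0 0 0
1 1 1 1 1
0 1 0 0 1
1 1 0 0 0"

dat <- read.table(text = txt)

# number of ones per column, computed before any row is added
ones <- colSums(dat == 1)

# 1 if ones make up at least half of the rows, otherwise 0
majority <- as.integer(ones >= nrow(dat) / 2)

# append the summary as row 11
result <- rbind(dat, majority)
result[nrow(result), ]
```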
