Ranks and identification of elements in R

I have two vectors with different elements, say x = c(1,3,4) and y = c(2,9).
I want an indicator vector that marks the elements of x with 1 and those of y with 0 once both vectors are combined and sorted, i.e.
(1,2,3,4,9) -----> (1,0,1,1,0)
How can I get the vector of zeros and ones (1,0,1,1,0) in R?
Thanks

The following option surely isn't the most efficient, but it's the simplest and most direct one:
a <- c(1,2,3,4)
b <- c(5,6,7,8)
f <- function(vec0, vec1, inp)
{
  out <- rep(NA, length(inp))  # NA if the input element is in neither vector
  for (i in seq_along(inp))
  {
    # inp[i] == vec0 is a logical vector; sum() counts its TRUEs, and
    # if() treats a non-zero count as TRUE
    if (sum(inp[i] == vec0)) out[i] <- 0
  }
  for (i in seq_along(inp))
  {
    if (sum(inp[i] == vec1)) out[i] <- 1
  }
  return(out)
}
Works just fine:
> f(vec0=a,vec1=b,inp=c(1,6,4,8,2,4,8,7,10))
[1] 0 1 0 1 0 0 1 1 NA

First, define a function that does that:
blah <- function(vector,
                 x = c(1,3,4),
                 y = c(2,9)) {
  outVector <- rep(x = NA, times = length(vector))
  outVector[vector %in% x] <- 1
  outVector[vector %in% y] <- 0
  return(outVector)
}
Then you can use the function:
blah(vector = 1:9)
blah(vector = c(1,2,3,4,9))
You can also change the values of x and y:
blah(vector = 1:10, x = c(1:5*2), y = c((1:5*2)-1))
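For the vectors in the question, the same idea can be written without a helper function. This is a minimal sketch, assuming the goal is to sort the combined values and flag those that came from x (the names z and indicator are introduced here for illustration only):

```r
x <- c(1, 3, 4)
y <- c(2, 9)

# Combine and sort both vectors, then flag the values that come from x
z <- sort(c(x, y))
indicator <- as.integer(z %in% x)

z          # 1 2 3 4 9
indicator  # 1 0 1 1 0
```

Because the two vectors share no elements, every position that is not flagged with 1 belongs to y.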

Related

Adding indicator values by using a for loop in R

I'm relatively inexperienced with R. I have a vector of true and false values. I want to make these numeric (i.e., 0 or 1). I have tried to write this for loop, but there are several syntax errors that I don't know how to fix.
indY <- rep(NA, nrow(dat)) # making an empty vector
# use for loop to fill each entry with either 0 or 1
for (i in 1:length(y)) {
if newy[i] == TRUE:
indY[i] == 1
else:
indY[i] == 0
}
Any suggestions? Thanks.
This should be fine:
as.numeric(y)
Assuming y is a vector of TRUEs and FALSEs.
For bonus points, use +:
y <- c(TRUE, TRUE, FALSE, TRUE)
+y
#[1] 1 1 0 1
This coerces y to numeric and returns the vector.
Try:
df$y <- ifelse(df$y, 1, 0)
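If the loop itself is the point of the exercise, a corrected version might look like the sketch below. It assumes y is the logical vector (the question mixes the names y, newy and dat, so the sample data here is made up). The two syntax issues are that R uses braces rather than colons for if/else, and that assignment is <-, not ==.

```r
y <- c(TRUE, TRUE, FALSE, TRUE)   # assumed sample input

indY <- rep(NA, length(y))        # empty vector to fill
for (i in seq_along(y)) {
  if (y[i]) {
    indY[i] <- 1
  } else {
    indY[i] <- 0
  }
}
indY
# [1] 1 1 0 1
```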

Elementary problems with a Loop in R

I am working with for-loops in R;
I have a data frame which contains n columns.
I have to build a vector of length n where each element is 1 if the column is a double, 0 otherwise.
this is what I have tried:
y<-rep(0,dim.data.frame(datafr)[2])
attach(datafr)
x<-names(dat)
for (j in 1:length(x)){
for(i in x){
if(is.double(i)){
y[j]<-1
}else{
y[j]<-0
}
}
}
However, it does not work: the returned vector y contains no 1s, just n 0s.
All values in a data.frame column share the same type, so you only need to check each column once rather than each value.
A simple approach is to create the vector with sapply which loops (in this case) over the columns of a frame.
datafr <- data.frame(a=1:5, b=1:5 + 0, d=letters[1:5])
sapply(datafr, is.double)
# a b d
# FALSE TRUE FALSE
If you must use a for loop, this can be unrolled with
y <- integer(ncol(datafr)) # defaults to 0
y
# [1] 0 0 0
for (j in seq_along(datafr)) {
  if (is.double(datafr[[j]])) {
    y[j] <- 1L
  }
}
y
# [1] 0 1 0
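As for why the loop in the question only ever produces 0: names() returns character strings, so the inner loop tests the column names rather than the columns, and is.double() of a string is always FALSE. A small sketch using the same sample datafr as above (the variable nm is introduced here for illustration):

```r
datafr <- data.frame(a = 1:5, b = 1:5 + 0, d = letters[1:5])

# Iterating over names() yields strings, never the column contents
for (nm in names(datafr)) {
  print(is.double(nm))        # FALSE each time: nm is a character string
}

# Indexing the data frame by name tests the column itself
y <- as.integer(sapply(names(datafr), function(nm) is.double(datafr[[nm]])))
y
# [1] 0 1 0
```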

About missing value where TRUE/FALSE needed in R

I want to return the number of times in string vector v that the element at the next successive index has more characters than the current index.
Here's my code
BiggerPairs <- function (v) {
numberOfTimes <- 0
for (i in 1:length(v)) {
if((nchar(v[i+1])) > (nchar(v[i]))) {
numberOfTimes <- numberOfTimes + 1
}
}
return(numberOfTimes)
}
}
I get the error: missing value where TRUE/FALSE needed.
I do not know why this happens.
The error says that if() received a missing value (NA) where it needs a single TRUE or FALSE. There are two likely reasons for this:
You have NAs in your vector v (I suspect this is not the actual issue).
The loop runs over 1:length(v), so on the last iteration it compares nchar(v[n+1]) with nchar(v[n]). There is no v[n+1], so the comparison yields a missing value and you get the error.
To remove NAs, try the following code:
v <- na.omit(v)
To improve your loop, stop at the second-to-last element:
for (i in seq_len(length(v) - 1)) {  # no iterations when v has a single element
  if (nchar(v[i + 1]) > nchar(v[i])) {
    numberOfTimes <- numberOfTimes + 1
  }
}
Here is some example dummy code.
# create random 15 numbers
set.seed(1)
v <- rnorm(15)
# accessing the 16th element produces an NA
v[16]
#[1] NA
# if we add an NA and try to do a comparison, the result is NA (which if() cannot handle)
v[10] <- NA
v[10] > v[9]
#[1] NA
# if we remove NAs and limit our loop to N-1, we should get a fair comparison
v <- na.omit(v)
numberOfTimes <- 0
for(i in 1:(length(v) -1)) {
  if(nchar(v[i + 1]) > nchar(v[i])) {
    numberOfTimes <- numberOfTimes + 1
  }
}
numberOfTimes
#[1] 5
Is this what you're after? I don't think there is any need for a for loop.
I'm generating some sample data, since you don't provide any.
# Generate some sample data
set.seed(2017);
v <- sapply(sample(30, 10), function(x)
paste(sample(letters, x, replace = T), collapse = ""))
v;
#[1] "raalmkksyvqjytfxqibgwaifxqdc" "enopfcznbrutnwjq"
#[3] "thfzoxgjptsmec" "qrzrdwzj"
#[5] "knkydwnxgfdejcwqnovdv" "fxexzbfpampbadbyeypk"
#[7] "c" "jiukokceniv"
#[9] "qpfifsftlflxwgfhfbzzszl" "foltth"
The following vector marks with 1 the positions in v where an entry has more characters than the previous entry.
# The following vector has the same length as v and
# returns 1 at the index position i where
# nchar(v[i]) > nchar(v[i-1])
idx <- c(0, diff(nchar(v)) > 0);
idx;
# [1] 0 0 0 0 1 0 0 1 1 0
If you're just interested in whether there is any entry with more characters than the previous entry, you can do this:
# If you just want to test if there is any position where
# nchar(v[i+1]) > nchar(v[i]) you can do
any(idx == 1);
#[1] TRUE
Or count the number of occurrences:
sum(idx);
#[1] 3
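Putting this together, the function from the question can be written without a loop. This is a minimal sketch that keeps the question's name BiggerPairs and assumes that pairs involving an NA should simply not be counted:

```r
BiggerPairs <- function(v) {
  # diff() compares each element's length with the previous one's;
  # na.rm = TRUE skips pairs that involve an NA
  sum(diff(nchar(v)) > 0, na.rm = TRUE)
}

BiggerPairs(c("a", "bb", "b", "ccc", "dddd"))
# [1] 3
```

Since diff() returns an empty vector for inputs shorter than two elements, the function returns 0 in that case instead of raising the error from the question.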

R: Remove the number of occurrences of values in one vector from another vector, but not all

Apologies for the confusing title, but I don't know how to express my problem otherwise. In R, I have the following problem which I want to solve:
x <- seq(1,1, length.out=10)
y <- seq(0,0, length.out=10)
z <- c(x, y)
p <- c(1,0,1,1,0,0)
How can I remove the values in p from z, so that a new vector i has three fewer occurrences of 1 and three fewer occurrences of 0? In other words, what do I have to do to arrive at the following result? The order of the 1s and 0s in z should not matter (they could just as well be in random order), and other numbers may be involved as well.
i
> 1 1 1 1 1 1 1 0 0 0 0 0 0 0
Thanks in advance!
Similar to @VincentGuillemot's answer, but in functional programming style. Uses the purrr package:
library(purrr)
i <- z
# each call drops the first remaining occurrence of x from i
map(p, function(x) { i <<- i[-min(which(i == x))]})
i
#[1] 1 1 1 1 1 1 1 0 0 0 0 0 0 0
There might be numerous better ways to do it:
i <- z
for (val in p) {
  if (val %in% i) {
    i <- i[ - which(i==val)[1] ]
  }
}
Another solution that I like better because it does not require a test (thanks to @Franck's suggestion):
for (val in p)
  i <- i[ - match(val, i, nomatch = integer(0) ) ]
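The loop-based approach can also be wrapped in a small function. The sketch below is one possible variant, not the answerers' code: the name remove_once is hypothetical, and it assumes that values of p that do not occur in z should be ignored. It checks the result of match() explicitly before dropping an element.

```r
# Hypothetical helper: remove one occurrence of each element of p from z
remove_once <- function(z, p) {
  for (val in p) {
    pos <- match(val, z)            # position of the first occurrence, or NA
    if (!is.na(pos)) z <- z[-pos]   # drop it only if it was found
  }
  z
}

z <- c(rep(1, 10), rep(0, 10))
p <- c(1, 0, 1, 1, 0, 0)
i <- remove_once(z, p)
i
# [1] 1 1 1 1 1 1 1 0 0 0 0 0 0 0
```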

R: Remove repeated values and keep the first one in a binary vector

I would like to remove the repeated ones but keep the first in a binary vector:
x = c(0,0,1,1,0,1,0,1,1,1,0,1) # the input
y = c(0,0,1,0,1,0,1,0,1) # the desired output
i.e., one 1 is removed from the first run of 1s and two 1s from the third run, so that only the first 1 of each run is kept.
I am trying to use rle with cumsum but have not yet figured it out. Any suggestion would be appreciated.
Using rle/inverse.rle
res <- rle(x)
res$lengths[res$values == 1] <- 1
inverse.rle(res)
## [1] 0 0 1 0 1 0 1 0 1
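To see why this works, it can help to look at what rle() returns for x: the value of each run and its length. Setting the length of every run of 1s to 1 and inverting the encoding then gives the desired vector. A short illustration (the name runs is introduced here only for this sketch):

```r
x <- c(0,0,1,1,0,1,0,1,1,1,0,1)

runs <- rle(x)
runs$values
# [1] 0 1 0 1 0 1 0 1
runs$lengths
# [1] 2 2 1 1 1 3 1 1
```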
We can use diff:
x[c(1, diff(x)) == 1 | x == 0]
x = c(0,0,1,1,0,1,0,1,1,1,0,1)
x[!(x == 1 &                # remove each value that is a 1
    c(x[-1] == 1, FALSE))]  # followed by a 1 (never the case for the last value)
#[1] 0 0 1 0 1 0 1 0 1
x = c(0,0,1,1,0,1,0,1,1,1,0,1)
x1 <- rle(x)
x1$lengths[x1$values==1] <- 1
inverse.rle(x1)
Depending on the vector size you could loop through it and use conditions for appending the value to the result. Here is a simple solution using your given input.
x <- c(0,0,1,1,0,1,0,1,1,1,0,1)
prev <- 0
y <- c()
for(i in x){
  if (i == 1){
    if (prev != 1){
      y <- append(y,i)
    }
  }else{
    y <- append(y,i)
  }
  prev <- i
}

Resources