My input is a vector like this
v = c(1,2,2,3,4,5,4,1,1)
unique(v) == c(1,2,3,4,5)
Instead, I need to check uniqueness only between consecutive elements, so that a value is dropped only when it repeats the one immediately before it:
.f(v) == c(1,2,3,4,5,4,1)
Use rle from base R and extract the 'values'
rle(v)$values
[1] 1 2 3 4 5 4 1
unique returns the distinct values of the whole vector, whereas rle returns a list with the 'values' and 'lengths' of each run of adjacent equal elements.
Another option is to compare each element with the one before it, take the cumulative sum to get a run id, and apply duplicated to keep only the first element of each run:
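To see what rle contributes here, it may help to look at the whole result rather than only its values. A minimal sketch using the v from the question (the name r is only for illustration):

```r
v <- c(1, 2, 2, 3, 4, 5, 4, 1, 1)

r <- rle(v)      # run-length encoding: one entry per run of equal neighbours
r$values         # the value of each run
r$lengths        # how many elements each run contains

# inverse.rle() rebuilds the original vector from the encoding
identical(inverse.rle(r), v)
```

Taking only r$values therefore keeps one element per run and discards the run lengths.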
v[!duplicated(cumsum(c(TRUE, v[-1] != v[-length(v)])))]
[1] 1 2 3 4 5 4 1
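The one-liner above can be unpacked into its steps. This is only a sketch; the names starts, run_id, res_dup and res_direct are made up for illustration:

```r
v <- c(1, 2, 2, 3, 4, 5, 4, 1, 1)

# TRUE wherever an element differs from the one before it;
# the first element has no predecessor, so it is always kept
starts <- c(TRUE, v[-1] != v[-length(v)])

# cumsum() turns the run starts into one run id per element
run_id <- cumsum(starts)

# keeping the first element of each run id ...
res_dup <- v[!duplicated(run_id)]

# ... selects the same elements as indexing with the run starts directly
res_direct <- v[starts]
```

For this input both results are identical, so the shorter v[starts] form seems sufficient when only the de-duplicated vector is needed.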
Another possible solution:
v[v != dplyr::lag(v, default = Inf)]
#> [1] 1 2 3 4 5 4 1
I have a vector of numbers that contains some gaps. For example,
vec <- c(3,1,7,3,5,7)
So there are 4 distinct values, and I would like to transform the vector into consecutive integers (without gaps) giving the rank of each entry among the distinct values, while keeping the original positions. In this case, I would like to obtain
2 1 4 2 3 4
that is, values between 1 and 4 showing where each entry of vec falls in the sorted order.
You can use match to look up each value in the sorted vector of unique values. For example:
vec <- c(3,1,7,3,5,7)
match(vec, sort(unique(vec)))
# [1] 2 1 4 2 3 4
This works because match returns the position of each value in the lookup vector, and positions start at 1.
We may also use factor, whose integer codes follow the sorted unique values:
as.integer(factor(vec))
[1] 2 1 4 2 3 4
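Both answers rely on the same sorted lookup of distinct values. A small sketch that makes the lookup explicit and checks that the two approaches agree for this input (lookup, by_match and by_factor are illustrative names):

```r
vec <- c(3, 1, 7, 3, 5, 7)

lookup <- sort(unique(vec))              # distinct values in increasing order

by_match  <- match(vec, lookup)          # position of each entry in the lookup
by_factor <- as.integer(factor(vec))     # integer codes of the sorted factor levels

identical(by_match, by_factor)
```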
I have a task to multiply numbers in a vector, but only those that are divisible by 3 (remainder 0). I figured out how to replace certain elements in a vector, but it only works if I replace them with one fixed number. I wasn't able to find an answer at http://www.r-tutor.com/r-introduction/vector or even on this site; everyone only extracts values into another vector.
x <- c(1,1,2,2,2,3,3)
x[x%%2==0] = 5
# [1] 1 1 5 5 5 3 3
Why doesn't this work?
x[x%%3==0] = x*3
I expect to get this:
c(1,1,5,5,5,9,9)
The left-hand side and the right-hand side of the assignment do not have the same length.
length(x*3)
#[1] 7
length(x[x%%3 ==0])
#[1] 2
We need to do
x[x%%3==0] <- x[x%%3==0]*3
x
#[1] 1 1 5 5 5 9 9
Instead of repeating the logical vector, we can store it in an object and use that for the substitution:
i1 <- x%%3 == 0
x[i1] <- x[i1]*3
In the first assignment, the right-hand side was a single value, which is recycled to replace every element for which the logical condition is met.
Another option (valid only when all values are non-negative) is
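The same replacement can also be written without subsetting on the left-hand side. A sketch using base R's replace and ifelse, starting from x as it stands after the first assignment in the question (by_replace and by_ifelse are illustrative names):

```r
x <- c(1, 1, 5, 5, 5, 3, 3)
i1 <- x %% 3 == 0                        # TRUE for elements divisible by 3

# replace() takes the values for the selected positions only,
# so both sides have the same length
by_replace <- replace(x, i1, x[i1] * 3)

# ifelse() works element-wise on full-length vectors
by_ifelse <- ifelse(i1, x * 3, x)
```

Neither call modifies x itself; assign the result back to x if the change should be kept.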
pmax(x, x*(!x%%3)*3)
#[1] 1 1 5 5 5 9 9
I have a data.frame built up as follows:
a b c d column_name
1 2 3 4 a
2 3 4 1 b
3 4 1 2 c
4 1 2 3 d
Now I want to get, for each row, the value of the column whose name is given in column_name. I built this with nested ifelse calls like so:
df$value <- ifelse(df$column_name=="a", df$a,
ifelse(df$column_name=="b", df$b,
ifelse(df$column_name=="c", df$c,
ifelse(df$column_name=="d", df$d, "NA"))))
However, this is neither pretty nor efficient. With more than 4 possible columns it becomes impractical.
Does anyone know a more efficient and elegant way? I tried apply(), but couldn't get it to work.
We can create a column index by matching 'column_name' against the column names of the dataset (match(df$column_name, colnames(df))), cbind it with the row index (1:nrow(df)), extract the elements of 'df' at those row/column positions, and assign (<-) the result to create the 'value' column.
df$value <- df[-ncol(df)][cbind(1:nrow(df), match(df$column_name, colnames(df)))]
df$value
#[1] 1 3 1 3
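For a reproducible version, the data frame from the question can be rebuilt and the index matrix inspected on its own. This is a sketch; idx is an illustrative name and the column types are assumed to be numeric plus one character column:

```r
df <- data.frame(
  a = c(1, 2, 3, 4),
  b = c(2, 3, 4, 1),
  c = c(3, 4, 1, 2),
  d = c(4, 1, 2, 3),
  column_name = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# one (row, column) pair per row of df
idx <- cbind(seq_len(nrow(df)), match(df$column_name, colnames(df)))

# indexing with a two-column matrix picks exactly one cell per pair;
# dropping the last (character) column first keeps the result numeric
df$value <- df[-ncol(df)][idx]
df$value
```

Dropping the column_name column before indexing matters because a data frame with mixed column types would otherwise be converted to a character matrix.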
I need some hints on how to loop over a vector efficiently without a for loop, for performance reasons.
At first glance, the recommendation is to use functions such as apply() or sapply().
I have a vector that is converted into a matrix:
x1<-c(1,2,4,1,4,3,5,3,1,0)
Looping through the vector, I need to set x1[i+1] = x1[i] whenever x1[i] > x1[i+1].
Example:
Input vector:
x1<-as.matrix(c(1,2,4,1,4,3,5,3,1,0))
Output vector:
c(1,2,4,4,4,4,5,5,5,5)
My approach is to use a user-defined function in apply(), but I have difficulty expressing the relation between x1[i] and x1[i+1] in that function.
I would be very grateful for your ideas or hints.
In general you can use Reduce with accumulate=TRUE for cumulative operations
Reduce(max,x1,accumulate=TRUE)
# [1] 1 2 4 4 4 4 5 5 5 5
But as @Khashaa points out, the common cases cumsum, cumprod, cummin and, in your case, cummax are provided as efficient base functions.
cummax(x1)
# [1] 1 2 4 4 4 4 5 5 5 5
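For this input we could also do this using ave (with the vector x1). Note that this groups by increases over the previous element, so unlike cummax it is not a running maximum in general.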
ave(x1,cumsum(c(TRUE,x1[-1]>x1[-length(x1)])), FUN=function(x) head(x,1))
#[1] 1 2 4 4 4 4 5 5 5 5
We create a grouping variable based on the condition described in the OP's post: check whether each element (x1[-1], first element removed) is greater than the one before it (x1[-length(x1)], last element removed).
x1[-1]>x1[-length(x1)]
#[1] TRUE TRUE FALSE TRUE FALSE TRUE FALSE FALSE FALSE
The length is one less than the length of the vector x1. So, we prepend TRUE to make the lengths equal and then take the cumsum
cumsum(c(TRUE,x1[-1]>x1[-length(x1)]))
#[1] 1 2 3 3 4 4 5 5 5 5
We use this as the grouping variable in ave and select the first observation of 'x1' within each group.
Another option would be to get the logical index (c(TRUE, x1[-1] > x1[-length(x1)])) as before, negate it (!) so that TRUE becomes FALSE and FALSE becomes TRUE, convert the TRUE values to NA (NA^(!...)), and then use na.locf from library(zoo) to replace the NA values with the preceding non-NA value.
library(zoo)
na.locf(x1*NA^(!c(TRUE,x1[-1]>x1[-length(x1)])))
#[1] 1 2 4 4 4 4 5 5 5 5
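Grouping by "greater than the previous element" is not the same as a running maximum in general; the two only coincide for inputs like the one above. A small check with a made-up vector y (not from the question) illustrates the difference:

```r
y <- c(5, 1, 2)

# running maximum, which is what the question's rule produces
running_max <- cummax(y)

# first value of each "increase over the previous element" group
first_of_group <- ave(y, cumsum(c(TRUE, y[-1] > y[-length(y)])),
                      FUN = function(x) head(x, 1))

running_max       # 5 5 5
first_of_group    # 5 5 2
```

Here 2 is greater than 1 and so starts a new group, even though it is still below the earlier maximum of 5. For the rule stated in the question, cummax looks like the safer choice.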
I have a vector:
x <- rnorm(100)
I would like to create a vector that stores the positions of the first, second, third, ..., 100th highest values in x.
For example, if x = 4,9,2,0,10,11 then the desired vector would be 6,5,2,1,3,4. Is there a function for doing this?
Try using order
> order(x, decreasing =TRUE)
[1] 6 5 2 1 3 4
Try this:
> order(-x)
[1] 6 5 2 1 3 4
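Both answers can be checked on the small example from the question. The sketch below also shows rank for contrast, since it is easy to confuse with order (by_order, by_negation and ranks are illustrative names):

```r
x <- c(4, 9, 2, 0, 10, 11)

by_order    <- order(x, decreasing = TRUE)  # positions of the largest, 2nd largest, ... values
by_negation <- order(-x)                    # same result for this numeric input

# rank() answers a different question: the rank of each element in place
ranks <- rank(-x)

by_order    # 6 5 2 1 3 4
ranks       # 4 3 5 6 2 1
```

Negating x only works for numeric data, so the decreasing = TRUE form is the more general of the two.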