How can I multiply specific numbers in a vector using R

I have a task to multiply numbers in a vector, but only those that are divisible by 3 (that is, x %% 3 is 0). I figured out how to replace certain elements in a vector with different values, but it works only if I replace them with one fixed number. I wasn't able to find an answer at http://www.r-tutor.com/r-introduction/vector or on this site; everyone only extracts the values into another vector.
x <- c(1,1,2,2,2,3,3)
x[x%%2==0] = 5
# [1] 1 1 5 5 5 3 3
Why doesn't this work?
x[x%%3==0] = x*3
I expect to get this:
c(1,1,5,5,5,9,9)

The vectors on the lhs and rhs of the assignment operator do not have the same length.
length(x*3)
#[1] 7
length(x[x%%3 ==0])
#[1] 2
We need to do
x[x%%3==0] <- x[x%%3==0]*3
x
#[1] 1 1 5 5 5 9 9
Instead of repeating the logical vector, we can store it in an object and then do the substitution
i1 <- x%%3 == 0
x[i1] <- x[i1]*3
In the first assignment, the right-hand side was a single value, so it was recycled to replace every element for which the logical condition is met.
Another option, applied to x as it was before the replacement above (and relying on the values being positive), is
pmax(x, x*(!x%%3)*3)
#[1] 1 1 5 5 5 9 9
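The length mismatch described above can be checked directly. The following is a minimal sketch; the ifelse() variant is not part of the answer and is shown only as one possible alternative that avoids indexing on both sides.

```r
# Start from the vector as it was after the first replacement
x <- c(1, 1, 5, 5, 5, 3, 3)

# Logical index of the elements divisible by 3
i1 <- x %% 3 == 0

# x * 3 has 7 elements, but only sum(i1) = 2 positions are being replaced
length(x * 3)
sum(i1)

# Alternative sketch: ifelse() picks x * 3 where the test is TRUE, x otherwise
y <- ifelse(i1, x * 3, x)
y
# [1] 1 1 5 5 5 9 9
```

Unlike the indexed assignment, ifelse() returns a new vector and leaves x unchanged.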

Related

How to create a vector of positions of a numeric vector in R?

I have a vector of numbers that contain some gaps. For example,
vec <- c(3,1,7,3,5,7)
So, there are 4 distinct values, and I would like to transform the vector into one (without gaps) that gives the rank of each entry among the distinct values while keeping every entry in its original position. So, in this case, I would like to obtain
2 1 4 2 3 4
That is, a sequence of values between 1 and 4 giving the order of each entry in the original vector vec.
You can use match to help you look up the values in a sorted unique order. For example
vec <- c(3,1,7,3,5,7)
match(vec, sort(unique(vec)))
# [1] 2 1 4 2 3 4
This works because match returns the position of each value in the sorted unique vector, and positions start at 1.
We may use factor
as.integer(factor(vec))
[1] 2 1 4 2 3 4
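To make the link between the two answers explicit, here is a small sketch that builds the lookup table by hand and compares both results. The names lookup, by_match and by_factor are illustrative only.

```r
vec <- c(3, 1, 7, 3, 5, 7)

# Sorted distinct values act as a lookup table: 1 3 5 7
lookup <- sort(unique(vec))

# match() returns, for each element of vec, its position in the lookup table
by_match <- match(vec, lookup)

# factor() builds the same sorted levels; the integer codes are the positions
by_factor <- as.integer(factor(vec))

identical(by_match, by_factor)
# [1] TRUE
```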

Unique with subsequent element only

My input is a vector like this
v = c(1,2,2,3,4,5,4,1,1)
unique(v) == c(1,2,3,4,5)
Instead, I need to enforce uniqueness only between consecutive elements:
.f(v) == c(1,2,3,4,5,4,1)
Use rle from base R and extract the 'values'
rle(v)$values
[1] 1 2 3 4 5 4 1
unique returns the distinct values of the whole vector, whereas rle returns a list with the 'values' and 'lengths' of each run of equal adjacent elements.
Another option is to compare each element with the one before it, take the cumulative sum to label the runs, and use duplicated to subset the vector
v[!duplicated(cumsum(c(TRUE, v[-1] != v[-length(v)])))]
[1] 1 2 3 4 5 4 1
Another possible solution:
v[v != dplyr::lag(v, default = Inf)]
#> [1] 1 2 3 4 5 4 1
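As a rough illustration of what rle stores, and of how the lag comparison can be written without dplyr, consider the sketch below. The names r, keep and res are illustrative.

```r
v <- c(1, 2, 2, 3, 4, 5, 4, 1, 1)

# rle() stores each run of equal adjacent values once, with its length
r <- rle(v)
r$values   # 1 2 3 4 5 4 1
r$lengths  # 1 2 1 1 1 1 2

# Base-R sketch of the lag comparison: keep the first element, then every
# element that differs from the one before it
keep <- c(TRUE, v[-1] != v[-length(v)])
res <- v[keep]
res
# [1] 1 2 3 4 5 4 1
```

Because the run lengths are kept, inverse.rle(r) rebuilds the original vector.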

Get first n indexes fulfilling a condition in r

I have a large vector where I have different values. I would like to find first N values which are less than a particular value.
For example in the following vector I want only 3 indexes which are less than 3
x2 <- c(1.6,0.35,1,3,6,8,1.5,2)
x3 <- which(x2 < 3)
x3
[1] 1 2 3 7 8
From x3 I can extract the first three values, but they are not the indexes of the smallest values in the vector. If I sort the x2 vector before applying the condition, I lose the original indexes of the values. What I want in the end is as follows
[1] 2 3 7
The rank function is what you are looking for:
which(rank(x2)<=3 & x2<3)
#[1] 2 3 7
Try:
match(sort(x2[x2 < 3])[1:3], x2)
#[1] 2 3 7
We can match the smallest 3 values less than the threshold to the original vector.
edit
This will work with unique and non-unique vectors
which(!is.na(match(x2, sort(x2[x2 < 3])[1:3])))
[1] 2 3 7
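The same idea can also be expressed with order(), which keeps track of the original positions while sorting. This is only a sketch: first_n_below is a hypothetical helper, not something from the answers above, and it assumes the input has no NA values.

```r
# Hypothetical helper: indexes of the n smallest values below a threshold,
# returned in their original order
first_n_below <- function(x, threshold, n) {
  idx <- order(x)                  # positions sorted by value, smallest first
  idx <- idx[x[idx] < threshold]   # keep only positions whose value qualifies
  sort(head(idx, n))               # take n of them, restore positional order
}

x2 <- c(1.6, 0.35, 1, 3, 6, 8, 1.5, 2)
first_n_below(x2, 3, 3)
# [1] 2 3 7
```

Because order() returns one position per element, this always yields at most n indexes, even when the vector contains tied values.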

how to loop a vector comparing rows without FOR

I need some hints on how to loop over a vector efficiently without using a “FOR…” loop, because of optimization issues.
At first glance, functions such as apply() and sapply() are recommended.
I have a vector converted into matrix:
x1<-c(1,2,4,1,4,3,5,3,1,0)
Looping through the vector, I need to set x1[i+1] = x1[i] whenever x1[i] > x1[i+1].
Example:
Input vector:
x1<-as.matrix(c(1,2,4,1,4,3,5,3,1,0))
Output vector:
c(1,2,4,4,4,4,5,5,5,5)
My approach is to use a user-defined function in apply(), but I am having difficulty coding the relation between x1[i] and x1[i+1] correctly inside that function.
I would be very grateful for your ideas or hints.
In general you can use Reduce with accumulate=TRUE for cumulative operations
Reduce(max,x1,accumulate=TRUE)
# [1] 1 2 4 4 4 4 5 5 5 5
But as @Khashaa points out, the common cases cumsum, cumprod, cummin and, in your case, cummax are provided as efficient base functions.
cummax(x1)
# [1] 1 2 4 4 4 4 5 5 5 5
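For reference, the rule from the question can be written as an explicit loop and compared with both vectorised forms. This is a sketch for checking the equivalence only; loop_res is an illustrative name.

```r
x1 <- c(1, 2, 4, 1, 4, 3, 5, 3, 1, 0)

# Explicit loop version of the rule: carry the previous value forward
# whenever it is larger than the next one
loop_res <- x1
for (i in seq_len(length(loop_res) - 1)) {
  if (loop_res[i] > loop_res[i + 1]) loop_res[i + 1] <- loop_res[i]
}

# Both vectorised forms reproduce the loop
all.equal(loop_res, cummax(x1))
# [1] TRUE
all.equal(loop_res, Reduce(max, x1, accumulate = TRUE))
# [1] TRUE
```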
We could do this using ave. (Using the vector x1)
ave(x1,cumsum(c(TRUE,x1[-1]>x1[-length(x1)])), FUN=function(x) head(x,1))
#[1] 1 2 4 4 4 4 5 5 5 5
We create a grouping variable based on the condition described in the OP's post. Check whether the succeeding element (x1[-1] - removed first element) is greater than the current element (x1[-length(x1)] -removed last element).
x1[-1]>x1[-length(x1)]
#[1] TRUE TRUE FALSE TRUE FALSE TRUE FALSE FALSE FALSE
The length is one less than the length of the vector x1. So, we append TRUE to make the length equal and then do the cumsum
cumsum(c(TRUE,x1[-1]>x1[-length(x1)]))
#[1] 1 2 3 3 4 4 5 5 5 5
This we use as the grouping variable in ave and select the first observation of 'x1' within each group.
Another option would be to get the logical index (c(TRUE, x1[-1] > x1[-length(x1)])) as before, negate it (!) so that TRUE becomes FALSE and FALSE becomes TRUE, turn the positions that are now TRUE into NA (NA^(!...), since NA^TRUE is NA while NA^FALSE is 1), and then use na.locf from library(zoo) to replace the NA values with the preceding non-NA value.
library(zoo)
na.locf(x1*NA^(!c(TRUE,x1[-1]>x1[-length(x1)])))
#[1] 1 2 4 4 4 4 5 5 5 5
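One caution, offered as an observation rather than something stated in the answer: the grouping starts a new group at every rise, including a rise that stays below an earlier peak. On the example vector this never happens, but on a hypothetical input such as c(5, 1, 2) the grouping approach and cummax appear to disagree, as this sketch shows.

```r
z <- c(5, 1, 2)   # hypothetical input, not from the question

# Same grouping as above: a new group starts wherever the value rises
grp <- cumsum(c(TRUE, z[-1] > z[-length(z)]))
grp
# [1] 1 1 2

by_ave <- ave(z, grp, FUN = function(x) head(x, 1))
by_ave
# [1] 5 5 2

cummax(z)
# [1] 5 5 5
```

If the running maximum is what is wanted for arbitrary input, cummax is the safer choice.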

Arguments for Subset within a function in R colon v. greater or equal to

Suppose I have the following data.
x<- c(1,2, 3,4,5,1,3,8,2)
y<- c(4,2, 5,6,7,6,7,8,9)
data<-cbind(x,y)
x y
1 1 4
2 2 2
3 3 5
4 4 6
5 5 7
6 1 6
7 3 7
8 8 8
9 2 9
Now, if I subset this data to select only the observations with "x" between 1 and 3 I can do:
s1<- subset(data, x>=1 & x<=3)
and obtain my desired output:
x y
1 1 4
2 2 2
3 3 5
4 1 6
5 3 7
6 2 9
However, if I subset using the colon operator I obtained a different result:
s2<- subset(data, x==1:3)
x y
1 1 4
2 2 2
3 3 5
This time it only includes the first observation in which "x" was 1,2, or 3. Why?
I would like to use the ":" operator because I am writing a function so the user would input a range of values from which she wants to see an average calculated over the "y" variable. I would prefer if they can use ":" operator to pass this argument to the subset function inside my function but I don't know why subsetting with ":" gives me different results.
I'd appreciate any suggestions in this regard.
You can use %in% instead of ==
subset(data, x %in% 1:3)
In general, when comparing two vectors of unequal length, %in% should be used. == compares element by element and recycles the shorter vector; there are cases where we can take advantage of that recycling, for instance when the length of one vector is double that of the other, but it can also fail. Some examples with some description are here.
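The difference between the two operators can be seen by printing the logical vectors that subset receives. The sketch below also wraps the %in% version in a function, since that was the goal in the question; mean_y_for is a hypothetical name, and a data frame is assumed instead of the cbind matrix so that columns can be accessed with $.

```r
x <- c(1, 2, 3, 4, 5, 1, 3, 8, 2)
y <- c(4, 2, 5, 6, 7, 6, 7, 8, 9)

# == compares element by element, recycling 1:3 as 1 2 3 1 2 3 1 2 3
x == 1:3
# [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE

# %in% asks, for each element of x, whether it occurs anywhere in 1:3
x %in% 1:3
# [1]  TRUE  TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE  TRUE

# Hypothetical wrapper: average y over the rows whose x is one of `values`
mean_y_for <- function(df, values) {
  mean(df$y[df$x %in% values])
}

mean_y_for(data.frame(x, y), 1:3)
# [1] 5.5
```

Note that x %in% 1:3 matches only the exact values 1, 2 and 3; for non-integer data a range test such as x >= 1 & x <= 3 selects a different set of rows.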
