I am studying R and Data Science. In one exercise, I need to check whether each number in an array is even.
My code:
vetor <- list(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
for (i in vetor) {
if (i %% 2 == 0) {
print(i)
}
}
But the result is a warning message:
Warning message:
In if (i%%2 == 0) { :
a condição tem comprimento > 1 e somente o primeiro elemento será usado
Translating:
The condition has a length > 1 and only the first element will be used.
What I need is for each element of the list to be checked and, if it is even, printed.
In R, how can I do it?
The list() wrapper is not needed:
vetor <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Running the OP's code on the plain vector:
for (i in vetor) {
if (i %% 2 == 0) {
print(i)
}
}
#[1] 2
#[1] 4
#[1] 6
#[1] 8
#[1] 10
The %% and == operators are vectorized, so we don't need a loop:
vetor[vetor %% 2 == 0]
#[1] 2 4 6 8 10
When we wrap the vector with list, the result is a list of length 1 whose single element is the whole vector. The for loop in R is a for-each loop, not the traditional counter-controlled three-part loop, so i will be the whole vetor vector.
Because if/else expects a single value rather than a vector of length greater than 1, this results in the warning message (from R 4.2.0 onward it is an error instead).
Or, if we want to store the values in a list with each element of length 1, use as.list:
vetor <- as.list(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
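To make the difference between the two wrappers concrete, here is a minimal sketch. The names wrapped, split_up and evens are made up for illustration and do not appear in the question.

```r
# list() keeps the whole vector as one element; as.list() splits it up.
wrapped  <- list(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
split_up <- as.list(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))

length(wrapped)   # one element holding all ten numbers
length(split_up)  # ten elements, one number each

# With as.list(), each i is a single number, so if() gets a length-one condition.
evens <- c()
for (i in split_up) {
  if (i %% 2 == 0) {
    evens <- c(evens, i)
  }
}
evens
```

Collecting the values in evens rather than printing them is only a convenience for checking the result; print(i) inside the loop works just as well.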
Let's break down your code and dig into each step to see what happened ...
You should notice that vetor is a list, i.e.,
> vetor
[[1]]
[1] 1 2 3 4 5 6 7 8 9 10
In this case, the iterator i takes the whole vector stored in vetor as its value, which can be seen from
> for (i in vetor) {
+ str(i)
+ }
num [1:10] 1 2 3 4 5 6 7 8 9 10
Therefore, when you use the condition i%%2==0, you are in effect evaluating
> for (i in vetor) {
+ print(i %% 2 == 0)
+ }
[1] FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE
which is not a single logical value, as the condition of an if ... else ... statement should be. That is the reason you got the warning.
Regarding the workaround, you can refer to @akrun's answer, which should help you a lot.
I just want to fill this matrix. I know this is a very easy problem, but I'm not experienced at all.
I won't use these exact numbers in the matrix.
I tried a for loop, but the problem is that the loop only keeps the last iteration.
To repeat: I don't want the numbers from 1 to 9.
I have this:
(mat<-matrix(0, nrow = 3, ncol = 3))
for (i in 1:3) {
for (j in 1:3) {
if (j==1 & i==1) {
mat[i,j]=6
} else if (j==1 & i==2) {
mat[i,j]=7
} else if (j==1 & i==3) {
mat[i,j]=8
}
}
}
I want code that does not need the conditions & i==1, & i==2, & i==3.
I want to introduce another variable k running from 1 to 3. I tried this, but the loop only gives me the value for 3.
Thank you so much.
Edit: I'm going to show you an example of what I want to solve. You will see that it's the same problem. I have the following data frame:
base2<-c( 20, 15, 17, 23, 19, 21, 16, 22, 18)
base2.1<-c( 6, 5, 3, 4, 1, 7, 2, 9, 8)
base3<-data.frame(base2,base2.1)
names(base3)=c("age","mean")
base3
I want to fill a vector vec where vec[1]=5 (because, as you can see, the mean for age=15 is 5), vec[2]=2 (the mean for age=16), and so on.
I have tried this just for the first element:
(vec<-c(rep(0,length(base3$mean))))
for (i in 1:length(base3$mean)) {
if (base3$age[i]==15) {
vec[1]=base3$mean[which(base3$age==15)]
}
}
vec
Of course, I don't want to hard-code the number 1 in this part of the loop: vec[1]=base3$mean[which(base3$age==15)]
If I want to fill the entire vector, I have this:
for (i in 1:length(base3$mean)) {
for (j in sort(base3$age)) {
vec[i]=base3$mean[which(base3$age==j)]
}
}
vec
But the for loop only gives me the last iteration:
# [1] 4 4 4 4 4 4 4 4 4
I want the next result:
[1] 5 2 3 8 1 6 7 9 4
This simply amounts to ordering the vector age and using the returned index to subset the vector mean.
i <- order(base3$age)
vec <- base3$mean[i]
vec
#[1] 5 2 3 8 1 6 7 9 4
See help("order"), and the difference to sort.
A one-liner is
vec <- base3$mean[order(base3$age)]
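As a small illustration of the difference between sort and order, the sketch below reuses the question's numbers. The standalone names age and mean_val are chosen here only so the example runs without the data frame.

```r
age      <- c(20, 15, 17, 23, 19, 21, 16, 22, 18)
mean_val <- c(6, 5, 3, 4, 1, 7, 2, 9, 8)

# sort() returns the values themselves in ascending order.
sorted_age <- sort(age)

# order() returns the positions that would put age in ascending order,
# which is what we need to rearrange a second vector the same way.
idx <- order(age)

sorted_age
idx
mean_val[idx]
```

The nested loop in the question overwrites vec[i] for every j, which is why only the value for the largest age (4) survives; indexing with order avoids the loop altogether.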
I want to retain the running maximum of a vector. My R code is below.
How can I fix this code so it runs without errors?
dat is stored in a data frame
dat=c(3, 5, 4, 2, 8, NA, NA, 9, 10, 3)
desired output is
MaxRuns=c(3,5,5,5,8,8,8,9,10,10)
maxValue=function(dat){
maxv=0
for (i in 1:10) MaxRuns(i)=0
for (i in 1:10){
if dat(i) > maxv {
maxv=dat(i) }
MaxRuns(i)=maxv
}
return(maxv)
}
maxValue<-maxValue(dat)
maxValue
Errors:
dat=c(3, 5, 4, 2, 8, NA, NA, 9, 10, 3)
> maxValue=function(dat){
+ maxv=0
+ for (i in 1:10) MaxRuns(i)=0
+ for (i in 1:10){
+ if dat(i) > maxv {
Error: unexpected symbol in:
" for (i in 1:10){
if dat"
> maxv=dat(i) }
Error: unexpected '}' in " maxv=dat(i) }"
> MaxRuns(i)=maxv
Error: object 'maxv' not found
> }
Error: unexpected '}' in " }"
> return(maxv)
Error: object 'maxv' not found
> }
Error: unexpected '}' in " }"
> maxValue<-maxValue(dat)
Error in maxValue(dat) : could not find function "maxValue"
> maxValue
Error: object 'maxValue' not found
Thank you. MM
This looks like cummax, but you need to handle the NAs. As all values in dat are positive, the NAs are replaced with 0 here.
cummax(replace(dat, is.na(dat), 0))
#[1] 3 5 5 5 8 8 8 9 10 10
As mentioned by @Dason, replacing the NA values with the minimum of dat makes this more general:
cummax(replace(dat, is.na(dat), min(dat, na.rm = TRUE)))
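Another variant, sketched here as an assumption rather than something taken from the answers, is to replace the NAs with -Inf, which can never become the running maximum. The helper name running_max is hypothetical.

```r
dat <- c(3, 5, 4, 2, 8, NA, NA, 9, 10, 3)

# Hypothetical helper: NAs are replaced by -Inf so they never raise the maximum.
running_max <- function(x) {
  cummax(replace(x, is.na(x), -Inf))
}

res_pos <- running_max(dat)

# Also behaves sensibly when every value is negative, where replacing NA by 0 would not.
res_neg <- running_max(c(-4, NA, -2, -7))

res_pos
res_neg
```

One caveat: if the vector starts with NA, the leading positions come back as -Inf, so they may need separate handling.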
You access each element of a vector using square brackets ([]), not round brackets (()). I would write the loop something like this.
maxv = integer(length = length(dat))
current_max = 0
for (i in seq_along(dat)) {
if (!is.na(dat[i]) && dat[i] > current_max) {
current_max <- dat[i]
}
maxv[i] <- current_max
}
maxv
#[1] 3 5 5 5 8 8 8 9 10 10
There are a few things to fix.
You can check whether an entry is NA by using is.na. Also, when you access the entries of dat, use dat[i] rather than dat(i). Finally, try not to give a variable the same name as a function.
dat=c(3, 5, 4, 2, 8, NA, NA, 9, 10, 3)
maxValue=function(dat){
maxv=0
MaxRuns = rep(0, 10)
for (i in 1:10){
if (!is.na(dat[i]) && dat[i] > maxv){
maxv=dat[i] }
MaxRuns[i]=maxv
}
return(MaxRuns)
}
maxRuns<-maxValue(dat)
print(maxRuns)
prints out
[1] 3 5 5 5 8 8 8 9 10 10
I have a question. I have the following data:
c(1, 2, 4, 5, 1, 8, 9)
I set l = 2 and u = 6.
I want to find all the values in the range (3,7).
How can I do this?
In base R we can use comparison operators to create a logical vector and use that for subsetting the original vector
x[x > 2 & x <= 6]
#[1] 3 5 6
Or, using a for loop: initialize an empty vector, loop through the elements of 'x', and if the value is greater than 2 and at most 6, append it to that vector
v1 <- c()
for(i in x) {
if(i > 2 & i <= 6) v1 <- c(v1, i)
}
v1
#[1] 3 5 6
data
x <- c(3, 5, 6, 8, 1, 2, 1)
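The same subsetting can be written with the bounds held in variables, as the question does with l and u. This is only a sketch: it keeps the answer's choice of an exclusive lower and inclusive upper bound, and the names in_range and positions are made up.

```r
x <- c(3, 5, 6, 8, 1, 2, 1)
l <- 2
u <- 6

# Logical subsetting: keep the values greater than l and at most u.
in_range <- x[x > l & x <= u]

# which() gives the positions of those values instead of the values themselves.
positions <- which(x > l & x <= u)

in_range
positions
```

Switching to > and < (or >= and <=) changes whether the endpoints are included, so the comparison operators should match the interval that is actually wanted.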
I have a vector:
as <- c(1,2,3,4,5,9)
I need to extract the first continuous sequence in the vector, starting at index 1, such that the output is the following:
1 2 3 4 5
Is there a smart function for doing this, or do I have to do something not so elegant like this:
a <- c(1,2,3,4,5,9)
is_continunous <- c()
for (i in 1:length(a)) {
if(a[i+1] - a[i] == 1) {
is_continunous <- c(is_continunous, i)
} else {
break
}
}
continunous_numbers <- c()
if(is_continunous[1] == 1) {
is_continunous <- c(is_continunous, length(is_continunous)+1)
continunous_numbers <- a[is_continunous]
}
It does the trick, but I would expect that there is a function that can already do this.
It isn't clear whether you need the indices of the continuous sequence only if it starts at index one, or the first such sequence wherever it starts.
In both cases, you need to start by checking the difference between adjacent elements:
d_as <- diff(as)
If you need the first sequence only if it starts at index 1:
if(d_as[1]==1) 1:(rle(d_as)$lengths[1]+1) else NULL
# [1] 1 2 3 4 5
rle gives the lengths and values of each run of consecutive equal values.
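To show what rle returns on the differences, here is a minimal sketch. The vector is named as_vec here (an arbitrary choice) and runs and first_run are illustrative names.

```r
as_vec <- c(1, 2, 3, 4, 5, 9)

d    <- diff(as_vec)   # differences between neighbours: 1 1 1 1 4
runs <- rle(d)         # runs of equal differences

runs$lengths           # how long each run is
runs$values            # which difference each run repeats

# A run of n differences equal to 1 spans n + 1 elements of the original vector.
# This assumes the first run has value 1, i.e. the sequence starts at index 1.
first_run <- seq_len(runs$lengths[1] + 1)
as_vec[first_run]
```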
If you need the first continuous sequence, whatever the starting index is:
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
Examples (for the second option):
as <- c(1,2,3,4,5,9)
d_as <- diff(as)
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
#[1] 1 2 3 4 5
as <- c(4,3,1,2,3,4,5,9)
d_as <- diff(as)
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
# [1] 3 4 5 6 7
as <- c(1, 2, 3, 6, 7, 8)
d_as <- diff(as)
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
# [1] 1 2 3
A simple way to catch the sequence would be to find the diff of your vector and grab all elements with diff == 1 plus the very next element, i.e.
d1<- which(diff(as) == 1)
as[c(d1, d1[length(d1)]+1)]
NOTE
This will only work if you have just one sequence in your vector. To make it more general, I'd suggest creating a function like so:
get_seq <- function(vec){
d1 <- which(diff(vec) == 1)
if(all(diff(d1) == 1)){
return(c(d1, d1[length(d1)]+1))
}else{
d2 <- split(d1, cumsum(c(1, diff(d1) != 1)))[[1]]
return(c(d2, d2[length(d2)]+1))
}
}
#testing it
as <- c(3, 5, 1, 2, 3, 4, 9, 7, 5, 4, 5, 6, 7, 8)
get_seq(as)
#[1] 3 4 5 6
as <- c(8, 9, 10, 11, 1, 2, 3, 4, 7, 8, 9, 10)
get_seq(as)
#[1] 1 2 3 4
as <- c(1, 2, 3, 4, 5, 6, 11)
get_seq(as)
#[1] 1 2 3 4 5 6
I have a question about finding index values in a vector.
Let's say I have a vector as follows:
vector <- c(1,2,4,6,8,10)
And, let's say I have the value '5'. I would like to find the maximum index in "vector" such that it is less than or equal to the value 5. In the case of the example above, this index would be 3 (since 4 is less than or equal to 5). Similarly, if instead I had a vector such as:
vector <- c(1,2,4,5,6,8,10)
Then if I were to find a value less than or equal to 5, this index would now be 4 instead of 3.
However, I also want to find the first and last time this index occurs. For example, if I had a vector such as:
vector <- c(1,1,2,2,4,5,5,5,5,6,8,10)
Then the first time this index occurs would be 6 and the last time this index occurs would be 9.
Is there a short, one-line method which would allow me to perform this task? Up until now I have been using the function max(which(....)), however I find that this method is extremely inefficient for large datasets since it will literally list hundreds/thousands of values, so I would like to find a more efficient method if possible which can fit in one line.
Thanks in advance.
You can use the following code:
min(max(which(vector <= 5)), min(which(vector == 5)))
First, it finds all indices where vector is less than or equal to 5 with the which function, then takes the maximum one.
Second, it finds all indices where vector is equal to 5 and takes the minimum.
Third, it takes the smaller of these two indices.
Thanks to all those who replied. I actually found an extremely short, one-line method to do this by downloading the BBmisc package. It has functions called which.last and which.first, and they perform the actions I need. Thanks again for taking the time to reply, I appreciate it.
You can use:
my_ind <- function(vec, num){
ind <- which.max(vec == num) # Check for equality first
if(ind == 1L && vec[1L] != num){
ind <- which.min(vec < num) - 1L
}
ind
}
my_ind(c(1,2,4,6,8,10), 5L) # 3
my_ind(c(1,2,4,5,6,8,10), 5L) # 4
my_ind(c(1,1,2,2,4,5,5,5,5,6,8,10), 5L) # 6
my_ind(c(5,8,10), 5L) # 1
my_ind(c(6,8,10), 5L) # 0 - returns 0 if all(vec > 5L)
I don't see a need for packages here. It seems like the construct which(x == max(x[x <= 5])) would work for you.
x <- c(1, 2, 4, 6, 8, 10)
which(x == max(x[x <= 5]))
# [1] 3
x <- c(1, 2, 4, 5, 6, 8, 10)
which(x == max(x[x <= 5]))
# [1] 4
x <- c(1, 1, 2, 2, 4, 5, 5, 5, 5, 6, 8, 10)
which(x == max(x[x <= 5]))
# [1] 6 7 8 9
And to find the first/last index when there are multiple indices, use head/tail.
head(which(x == max(x[x <= 5])), 1)
# [1] 6
tail(which(x == max(x[x <= 5])), 1)
# [1] 9
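Since the question asks about efficiency on large vectors, one more option is base R's findInterval. It is sketched here under the assumption that the vector is sorted in non-decreasing order, as in all of the question's examples (findInterval rejects unsorted input). The names last_idx and first_idx are made up.

```r
x <- c(1, 1, 2, 2, 4, 5, 5, 5, 5, 6, 8, 10)

# For a sorted vector, findInterval() returns the last index whose value is <= 5.
last_idx <- findInterval(5, x)

# match() returns the first position holding that same value.
first_idx <- match(x[last_idx], x)

last_idx
first_idx

# The same call on a vector that does not contain 5 falls back to the largest smaller value.
findInterval(5, c(1, 2, 4, 6, 8, 10))
```

If no element is less than or equal to the target, findInterval returns 0 and the match step yields an empty result, so that case would need its own check.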