I want to compute the running maximum of a vector, carrying the last maximum forward across NA values. My R code is below.
How can I fix this code so that it runs without errors?
dat comes from a data frame:
dat=c(3, 5, 4, 2, 8, NA, NA, 9, 10, 3)
The desired output is
MaxRuns=c(3,5,5,5,8,8,8,9,10,10)
maxValue=function(dat){
maxv=0
for (i in 1:10) MaxRuns(i)=0
for (i in 1:10){
if dat(i) > maxv {
maxv=dat(i) }
MaxRuns(i)=maxv
}
return(maxv)
}
maxValue<-maxValue(dat)
maxValue
Errors:
dat=c(3, 5, 4, 2, 8, NA, NA, 9, 10, 3)
> maxValue=function(dat){
+ maxv=0
+ for (i in 1:10) MaxRuns(i)=0
+ for (i in 1:10){
+ if dat(i) > maxv {
Error: unexpected symbol in:
" for (i in 1:10){
if dat"
> maxv=dat(i) }
Error: unexpected '}' in " maxv=dat(i) }"
> MaxRuns(i)=maxv
Error: object 'maxv' not found
> }
Error: unexpected '}' in " }"
> return(maxv)
Error: object 'maxv' not found
> }
Error: unexpected '}' in " }"
> maxValue<-maxValue(dat)
Error in maxValue(dat) : could not find function "maxValue"
> maxValue
Error: object 'maxValue' not found
Thank you. MM
This looks like a job for cummax, but you need to handle the NAs first. Because dat is entirely positive, replacing the NAs with 0 works here.
cummax(replace(dat, is.na(dat), 0))
#[1] 3 5 5 5 8 8 8 9 10 10
As mentioned by @Dason, replacing the NA values with the minimum of the vector makes this work in general, including for vectors with negative values.
cummax(replace(dat, is.na(dat), min(dat, na.rm = TRUE)))
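To see why the NAs have to be replaced at all, here is a minimal sketch. The helper name running_max is hypothetical (it is not a base R function), and it uses -Inf as the filler instead of min, on the assumption that a leading NA may reasonably come out as -Inf.

```r
dat <- c(3, 5, 4, 2, 8, NA, NA, 9, 10, 3)

# cummax() alone propagates the first NA to every later position
cummax(dat)

# running_max is a hypothetical helper name, not part of base R
running_max <- function(x) {
  # -Inf can never raise the running maximum, so NAs are effectively skipped
  cummax(replace(x, is.na(x), -Inf))
}

running_max(dat)
running_max(c(-4, NA, -2, -7))  # also fine when every value is negative
```

If a leading NA should stay NA rather than become -Inf, the result would need one more replace step.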
You access the elements of a vector with square brackets ([]), not round brackets (()). I would write the loop something like this.
maxv <- numeric(length(dat))
current_max <- 0
for (i in seq_along(dat)) {
  # Test is.na() first so that && never evaluates a comparison against NA
  if (!is.na(dat[i]) && dat[i] > current_max) {
    current_max <- dat[i]
  }
  maxv[i] <- current_max
}
maxv
#[1] 3 5 5 5 8 8 8 9 10 10
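The difference between the two kinds of brackets can be checked directly. This is only an illustrative sketch: it wraps the failing call in tryCatch so the script keeps running, and the exact wording of the captured message depends on the R locale.

```r
dat <- c(3, 5, 4, 2, 8, NA, NA, 9, 10, 3)

# Square brackets index into the vector
second <- dat[2]
second

# Round brackets are a function call. No function named `dat` exists here,
# so the call fails; tryCatch() captures the error message as a string.
msg <- tryCatch(dat(2), error = function(e) conditionMessage(e))
msg
```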
There are a few things to fix.
You can check whether an entry is NA with is.na. When you access the entries of dat, use dat[i] rather than dat(i). Also, try not to give a variable the same name as a function.
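dat <- c(3, 5, 4, 2, 8, NA, NA, 9, 10, 3)
maxValue <- function(dat) {
  maxv <- 0
  # One slot per input element instead of a hard-coded 10
  MaxRuns <- rep(0, length(dat))
  for (i in seq_along(dat)) {
    # Skip NA entries so the previous maximum carries forward
    if (!is.na(dat[i]) && dat[i] > maxv) {
      maxv <- dat[i]
    }
    MaxRuns[i] <- maxv
  }
  return(MaxRuns)
}
maxRuns <- maxValue(dat)
print(maxRuns)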
prints out
[1] 3 5 5 5 8 8 8 9 10 10
I just started with R and I've come across an issue I can fix but not quite understand.
Consider this simple code:
foo <- function(v) {
for(i in 1:length(v)-1)
if(v[i] > v[i+1])
#do something here
return()
}
v <- c(10, 40, 40, 10, 20, 70, 30, 20)
foo(v)
Running it will give this error:
Error in if (v[i] > v[i + 1]) return() : argument is of length zero
But replacing the if with the following code gets rid of the error:
if(isTRUE(v[i] > v[i+1]))
I come from a C/Java background so my question is, why? Why does this simple integer comparison need to be wrapped in isTRUE to work?
On similar questions I've found that isTRUE helps protect against cases where one of the two arguments is NA or NULL, but why is this the case here with two numbers?
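1:length(v)-1 is interpreted as (1:length(v))-1, because : has higher precedence than -. The loop therefore starts at i = 0. R vectors are indexed from 1, and v[0] is a zero-length vector, so the comparison v[0] > v[1] has length zero, which is exactly what the error message reports. You should instead write 1:(length(v)-1):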
> length(v)
[1] 8
> 1:length(v)-1
[1] 0 1 2 3 4 5 6 7
> 1:(length(v)-1)
[1] 1 2 3 4 5 6 7
> v[9]
[1] NA
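The precedence issue and the zero-length comparison can be reproduced in a few lines. This is a small sketch; the variable names are only illustrative, and seq_len is offered as one possible safer idiom, not as the only fix.

```r
v <- c(10, 40, 40, 10, 20, 70, 30, 20)

# `:` binds more tightly than binary `-`, so these two are the same
idx_wrong <- 1:length(v) - 1
idx_same <- (1:length(v)) - 1
identical(idx_wrong, idx_same)

# Index 0 selects nothing, so the comparison has length zero
first_cmp <- v[0] > v[1]
length(first_cmp)

# seq_len() avoids the precedence trap and is also safe for a vector of
# length 1, where 1:(length(v) - 1) would count down as 1, 0
idx_right <- seq_len(length(v) - 1)
idx_right
```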
Complete function:
foo <- function(v) {
for(i in 1:(length(v)-1))
{
if(v[i] > v[i+1])
{
#do something here
}
}
return()
}
v <- c(10, 40, 40, 10, 20, 70, 30, 20)
> foo(v)
# NULL
isTRUE(x) returns TRUE if, and only if, x is TRUE. This means that:
> isTRUE(NA)
[1] FALSE
However:
> NA == TRUE
[1] NA
(not FALSE)
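A few more cases, as a quick sketch, show why wrapping the condition in isTRUE silences the error: anything that is not a single, non-missing TRUE comes back as FALSE, including the zero-length result of v[0] > v[1] from the question.

```r
isTRUE(TRUE)            # a single TRUE
isTRUE(NA)              # missing value
isTRUE(logical(0))      # zero-length, like v[0] > v[1]
isTRUE(c(TRUE, TRUE))   # length greater than one

# The comparison from the question, made safe for `if`
v <- c(10, 40, 40, 10, 20, 70, 30, 20)
safe <- isTRUE(v[0] > v[1])
safe
```

Note that this hides the off-by-one index rather than fixing it, so correcting the loop range is still the better repair.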
I am studying R and data science. For an exercise, I need to check whether each number in an array is even.
My code:
vetor <- list(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
for (i in vetor) {
if (i %% 2 == 0) {
print(i)
}
}
But the result is a warning message:
Warning message:
In if (i%%2 == 0) { :
a condição tem comprimento > 1 e somente o primeiro elemento será usado
Translating:
The condition has a length > 1 and only the first element will be used.
What I need is for each element of the list to be checked and, if it is even, printed.
In R, how can I do it?
The list wrapper is not needed:
vetor <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Running the OP's code on the plain vector:
for (i in vetor) {
if (i %% 2 == 0) {
print(i)
}
}
#[1] 2
#[1] 4
#[1] 6
#[1] 8
#[1] 10
These are vectorized operations, so we don't need a loop:
vetor[vetor %% 2 == 0]
#[1] 2 4 6 8 10
When we wrap the vector in list, we get a list of length 1 whose single element is the whole vector. The for loop in R is a for-each loop, not the traditional counter-controlled three-part loop, so i will be the whole vetor vector.
Because if/else expects a single value rather than a vector of length greater than 1, this produces the warning message (since R 4.2.0 it is an error instead).
Or, if we want to store it as a list in which each element has length 1, use as.list:
vetor <- as.list(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10))
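The difference between list and as.list can be checked with length. The sketch below uses a shorter vector and illustrative variable names (wrapped, split_up, evens) that are not from the original post.

```r
wrapped <- list(c(1, 2, 3, 4, 5, 6))      # one element holding the whole vector
split_up <- as.list(c(1, 2, 3, 4, 5, 6))  # six elements, each of length 1

length(wrapped)
length(split_up)

# With as.list(), each iteration sees a single number, so `if` is happy
evens <- c()
for (i in split_up) {
  if (i %% 2 == 0) evens <- c(evens, i)
}
evens
```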
Let's break down your code and dig into each step to see what happened ...
You should notice that vetor is a list, i.e.,
> vetor
[[1]]
[1] 1 2 3 4 5 6 7 8 9 10
In this case, the iterator i in vetor denotes the array in vetor, which can be seen from
> for (i in vetor) {
+ str(i)
+ }
num [1:10] 1 2 3 4 5 6 7 8 9 10
Therefore, when you use the condition i%%2==0, you are in fact evaluating
> for (i in vetor) {
+ print(i %% 2 == 0)
+ }
[1] FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE
which is not a single logical value, as the condition of an if ... else ... statement must be. That is the reason you got the warning.
For the workaround, see @akrun's answer.
I'm fairly new to R and am having trouble implementing something that should be very basic. Can someone point me in the right direction?
I need to apply a logical calculation based on the values of two vectors and return the value of that function in a third vector.
I want to do this in a user defined function so I can easily apply this in several other areas of the algorithm and make modifications to the implementation with ease.
Here's what I have tried, but I cannot get this implementation to work. I believe it is because I cannot send vectors as parameters to this function.
calcSignal <- function(fVector, sVector) {
if(!is.numeric(fVector) || !is.numeric(sVector)) {
0
}
else if (fVector > sVector) {
1
}
else if (fVector < sVector) {
-1
}
else {
0 # is equal case
}
}
# set up data frame
df <- data.frame(x=c("NA", 2, 9, 7, 0, 5), y=c(4, 1, 5, 9, 0, "NA"))
# call function
df$z <- calcSignal(df$x, df$y)
I want the output to be a vector with the following values, but I am not implementing the function correctly.
[0,-1,1,-1,0,0]
Can someone help explain how to implement this function to properly perform the logic outlined?
I appreciate your assistance!
There are some misunderstandings in your code:
In R, "NA" in quotes is a character value (strings are called character in R). The correct form for a missing value is NA without quotes.
Note that, before R 4.0.0, data.frame converted character columns to factors by default; this can be disabled with data.frame(..., stringsAsFactors = FALSE).
Each column of a data.frame has a type, not each element. So when a column contains numbers and NA, its class is numeric and is.numeric returns TRUE for the whole column, NA elements included. Use is.na to test individual elements.
|| is meant for single values: it used to look only at the first element of each vector, and since R 4.3.0 it raises an error for longer vectors. | does the comparison element by element.
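These points can be verified on small vectors. The following is a minimal sketch with made-up names (x_chr, x_num, flag), not code from the question.

```r
x_chr <- c("NA", 2, 9)   # the quoted "NA" forces the whole vector to character
x_num <- c(NA, 2, 9)     # a real missing value keeps the vector numeric

class(x_chr)
class(x_num)

is.na(x_chr)             # the string "NA" is not a missing value
is.na(x_num)
is.numeric(x_num)        # one answer for the whole vector, NA included

# `|` works element by element, which is what a vectorized test needs
flag <- x_num > 5 | is.na(x_num)
flag
```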
Now let's implement what you wanted:
Implementation 1:
#set up data frame
df <- data.frame(x=c(NA, 2, 9, 7, 0, 5), y=c(4, 1, 5, 9, 0, NA))
calcSignal <- function(f,s){
if(is.na(f) | is.na(s))
return(0)
else if(f>s)
return(1)
else if(f<s)
return(-1)
else
return(0)
}
df$z = mapply(calcSignal, df$x, df$y, SIMPLIFY = T)
To run a function element-wise over two or more vectors, we can use mapply.
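As a stripped-down illustration of what mapply does, here is a sketch with a toy scalar function. The name bigger is hypothetical and only serves the example.

```r
# Toy function that only makes sense for one pair of values at a time,
# because `if` needs a single TRUE or FALSE
bigger <- function(a, b) if (a > b) a else b

# mapply() calls bigger(1, 4), bigger(8, 2), bigger(3, 3) and collects
# the results into a vector
res <- mapply(bigger, c(1, 8, 3), c(4, 2, 3))
res
```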
Implementation 2
Not much different from the previous one; here the function is easier to use.
#set up data frame
df <- data.frame(x=c(NA, 2, 9, 7, 0, 5), y=c(4, 1, 5, 9, 0, NA))
calcSignal <- function(fVector, sVector) {
res = mapply(function(f,s){
if(is.na(f) | is.na(s))
return(0)
else if(f>s)
return(1)
else if(f<s)
return(-1)
else
return(0)
},fVector,sVector,SIMPLIFY = T)
return(res)
}
df$z = calcSignal(df$x,df$y)
Implementation 3 (Vectorized)
This one is much better because it is vectorized and therefore much faster:
calcSignal <- function(fVector, sVector) {
res = rep(0,length(fVector))
res[fVector>sVector] = 1
res[fVector<sVector] = -1
# This line isn't necessary; it's just for clarification
res[(is.na(fVector) | is.na(sVector))] = 0
return(res)
}
df$z = calcSignal(df$x,df$y)
Output:
> df
x y z
1 NA 4 0
2 2 1 1
3 9 5 1
4 7 9 -1
5 0 0 0
6 5 NA 0
No need for loopage as ?sign has your back:
# fixing the "NA" issue:
df <- data.frame(x=c(NA, 2, 9, 7, 0, 5), y=c(4, 1, 5, 9, 0, NA))
s <- sign(df$x - df$y)
s[is.na(s)] <- 0
s
#[1] 0 1 1 -1 0 0
ifelse is another handy function. Less elegant here than sign though
df <- data.frame(x=c(NA, 2, 9, 7, 0, 5), y=c(4, 1, 5, 9, 0, NA))
cs <- function(x, y){
a <- x > y
b <- x < y
out <- ifelse(a, 1, ifelse(b, -1, 0))
ifelse(is.na(out), 0, out)
}
cs(df$x, df$y)
I have a vector:
as <- c(1,2,3,4,5,9)
I need to extract the first continuous sequence in the vector, starting at index 1, so that the output is the following:
1 2 3 4 5
Is there a smart function for doing this, or do I have to do something not so elegant like this:
a <- c(1,2,3,4,5,9)
is_continunous <- c()
for (i in 1:length(a)) {
if(a[i+1] - a[i] == 1) {
is_continunous <- c(is_continunous, i)
} else {
break
}
}
continunous_numbers <- c()
if(is_continunous[1] == 1) {
is_continunous <- c(is_continunous, length(is_continunous)+1)
continunous_numbers <- a[is_continunous]
}
It does the trick, but I would expect that there is a function that can already do this.
It isn't clear whether you need the indices of the continuous sequence only if it starts at index 1, or the first such sequence wherever it begins.
In both cases, you need to start by checking the difference between adjacent elements:
d_as <- diff(as)
If you need the first sequence only if it starts at index 1:
if(d_as[1]==1) 1:(rle(d_as)$lengths[1]+1) else NULL
# [1] 1 2 3 4 5
rle gives the lengths and values of each run of consecutive identical values.
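To make the rle step concrete, this short sketch prints its two components for the example vector. The variable r is just a local name for the result.

```r
as <- c(1, 2, 3, 4, 5, 9)
d_as <- diff(as)   # differences between neighbours: 1 1 1 1 4

r <- rle(d_as)
r$lengths          # how long each run of equal differences is
r$values           # the difference each run consists of

# The first run has 4 differences of 1, so it spans 4 + 1 = 5 elements
idx <- 1:(r$lengths[1] + 1)
as[idx]
```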
If you need the first continuous sequence, whatever the starting index is:
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
Examples (for the second option):
as <- c(1,2,3,4,5,9)
d_as <- diff(as)
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
#[1] 1 2 3 4 5
as <- c(4,3,1,2,3,4,5,9)
d_as <- diff(as)
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
# [1] 3 4 5 6 7
as <- c(1, 2, 3, 6, 7, 8)
d_as <- diff(as)
rle_d_as <- rle(d_as)
which(d_as==1)[1]+(0:(rle_d_as$lengths[rle_d_as$values==1][1]))
# [1] 1 2 3
A simple way to catch the sequence would be to find the diff of your vector and grab all elements with diff == 1 plus the very next element, i.e.
d1<- which(diff(as) == 1)
as[c(d1, d1[length(d1)]+1)]
NOTE
This will only work if you have only one sequence in your vector. To make it more general, I'd suggest creating a function like this:
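get_seq <- function(vec){
  # Use the argument vec, not the global variable as
  d1 <- which(diff(vec) == 1)
  if(all(diff(d1) == 1)){
    # Only one run: return its indices plus the element that closes it
    return(c(d1, d1[length(d1)]+1))
  }else{
    # Several runs: keep the indices of the first one only
    d2 <- split(d1, cumsum(c(1, diff(d1) != 1)))[[1]]
    return(c(d2, d2[length(d2)]+1))
  }
}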
#testing it
as <- c(3, 5, 1, 2, 3, 4, 9, 7, 5, 4, 5, 6, 7, 8)
get_seq(as)
#[1] 3 4 5 6
as <- c(8, 9, 10, 11, 1, 2, 3, 4, 7, 8, 9, 10)
get_seq(as)
#[1] 1 2 3 4
as <- c(1, 2, 3, 4, 5, 6, 11)
get_seq(as)
#[1] 1 2 3 4 5 6
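Note that the outputs above are positions in the vector, not values. If the values themselves are wanted, one possible variant is to split the vector into runs directly. This is a sketch under that assumption; first_run is a hypothetical helper name.

```r
# first_run is a hypothetical helper name, not from the answers above
first_run <- function(x) {
  if (length(x) < 2) return(NULL)
  # A new group starts wherever the step from the previous element is not 1
  grp <- cumsum(c(TRUE, diff(x) != 1))
  runs <- split(x, grp)
  # Keep the first group that has more than one element
  long <- which(lengths(runs) > 1)
  if (length(long) == 0) return(NULL)
  runs[[long[1]]]
}

first_run(c(4, 3, 1, 2, 3, 4, 5, 9))
first_run(c(1, 2, 3, 6, 7, 8))
first_run(c(5, 3, 1))  # no increasing run at all
```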
Suppose I have an array like
a <- seq(1, 100, 1)
and I want to select every third element with a for() loop, starting from the second one, e.g. 2, 5, 8, 11 and so on.
How should I use for() in this case?
b <- NULL
# for(i in 1:length(a)) { # Is there any additional argument?
# b[i] <- a[...] # Or I can just multiply 'i' by some integer?
# }
Thanks,
Use 3 as the value of by in seq:
for (i in seq(2, length(a), by=3)) {}
> seq(2, 11, 3)
[1] 2 5 8 11
Why use for at all? Index with the sequence directly:
b <- a[seq(2,length(a),3)]
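For comparison, here is a sketch that puts the loop and the loop-free versions side by side. The names b_seq, b_lgl and b_loop are illustrative; the logical-index variant relies on R recycling a short logical index along the vector.

```r
a <- seq(1, 100, 1)

# Index with a sequence of positions: 2, 5, 8, ...
b_seq <- a[seq(2, length(a), by = 3)]

# Logical index, recycled along `a`: skip, keep, skip, skip, keep, skip, ...
b_lgl <- a[c(FALSE, TRUE, FALSE)]

# The for() version asked about in the question
b_loop <- numeric(0)
for (i in seq(2, length(a), by = 3)) {
  b_loop <- c(b_loop, a[i])
}

head(b_seq, 4)
identical(b_seq, b_lgl)
identical(b_seq, b_loop)
```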