I want to return the number of times in string vector v that the element at the next successive index has more characters than the current index.
Here's my code
BiggerPairs <- function (v) {
numberOfTimes <- 0
for (i in 1:length(v)) {
if((nchar(v[i+1])) > (nchar(v[i]))) {
numberOfTimes <- numberOfTimes + 1
}
}
return(numberOfTimes)
}
When I run it, I get this error:
missing value where TRUE/FALSE needed.
I do not know why this happens.
The error means that the condition inside if() evaluated to a missing value (NA), where R needs TRUE or FALSE. There are two likely reasons for this:
You have NAs in your vector v (I suspect this is not the actual issue).
Your loop runs over 1:length(v), so on the last iteration (i = n) it compares nchar(v[n+1]) with nchar(v[n]). There is no v[n+1], so that value is NA, the comparison is NA, and you get the error.
To remove NAs, try the following code:
v <- na.omit(v)
To improve your loop, try the following code:
for(i in 1:(length(v) -1)) {
if(nchar(v[i + 1]) > nchar(v[i])) {
numberOfTimes <- numberOfTimes + 1
}
}
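Putting the fix back into the original function gives something like the sketch below. It swaps 1:(length(v) - 1) for seq_len(), which is my own substitution rather than part of the answer: for a vector of length 1, 1:0 counts down and still indexes past the end, while seq_len(0) produces an empty loop. The sketch assumes v contains no NAs.

```r
BiggerPairs <- function(v) {
  numberOfTimes <- 0
  # seq_len(0) is empty, so vectors of length 0 or 1 skip the loop entirely
  for (i in seq_len(max(length(v) - 1, 0))) {
    if (nchar(v[i + 1]) > nchar(v[i])) {
      numberOfTimes <- numberOfTimes + 1
    }
  }
  numberOfTimes
}

BiggerPairs(c("a", "bb", "c", "ddd"))  # "bb" beats "a" and "ddd" beats "c"
BiggerPairs("a")                        # no pairs to compare
```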
Here is some example dummy code.
# create 15 random numbers
set.seed(1)
v <- rnorm(15)
# accessing the 16th element produces an NA
v[16]
#[1] NA
# if we add an NA and try to do a comparison, the result is NA,
# which is what makes if() fail
v[10] <- NA
v[10] > v[9]
#[1] NA
# if we remove NAs and limit our loop to N-1, we should get a fair comparison
v <- na.omit(v)
numberOfTimes <- 0
for(i in 1:(length(v) -1)) {
if(nchar(v[i + 1]) > nchar(v[i])) {
numberOfTimes <- numberOfTimes + 1
}
}
numberOfTimes
#[1] 5
Is this what you're after? I don't think there is any need for a for loop.
I'm generating some sample data, since you don't provide any.
# Generate some sample data
set.seed(2017);
v <- sapply(sample(30, 10), function(x)
paste(sample(letters, x, replace = TRUE), collapse = ""))
v;
#[1] "raalmkksyvqjytfxqibgwaifxqdc" "enopfcznbrutnwjq"
#[3] "thfzoxgjptsmec" "qrzrdwzj"
#[5] "knkydwnxgfdejcwqnovdv" "fxexzbfpampbadbyeypk"
#[7] "c" "jiukokceniv"
#[9] "qpfifsftlflxwgfhfbzzszl" "foltth"
The following vector marks the positions with 1 in v where entries have more characters than the previous entry.
# The following vector has the same length as v and
# returns 1 at the index position i where
# nchar(v[i]) > nchar(v[i-1])
idx <- c(0, diff(nchar(v)) > 0);
idx;
# [1] 0 0 0 0 1 0 0 1 1 0
If you're just interested in whether there is any entry with more characters than the previous entry, you can do this:
# If you just want to test if there is any position where
# nchar(v[i+1]) > nchar(v[i]) you can do
any(idx == 1);
#[1] TRUE
Or count the number of occurrences:
sum(idx);
#[1] 3
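Wrapped up as a function, the same idea might look like the sketch below. count_increases is a hypothetical name introduced here, not something from the question.

```r
count_increases <- function(v) {
  # diff() gives nchar(v[i + 1]) - nchar(v[i]) for each adjacent pair;
  # summing the logical result counts the pairs where the length grows
  sum(diff(nchar(v)) > 0)
}

count_increases(c("a", "bb", "c", "ddd"))
```

Because diff() returns an empty vector for inputs of length 0 or 1, the function returns 0 there without any special handling.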
I am working with for-loops in R;
I have a data frame which contains n columns.
I have to build a vector of length n where each element is 1 if the column is a double, 0 otherwise.
this is what I have tried:
y<-rep(0,dim.data.frame(datafr)[2])
attach(datafr)
x<-names(dat)
for (j in 1:length(x)){
for(i in x){
if(is.double(i)){
y[j]<-1
}else{
y[j]<-0
}
}
}
However, it does not work: the returned vector y contains no 1s, just n 0s.
Every value in a data.frame column has the same class, so you only need to check each column once rather than value by value.
A simple approach is to create the vector with sapply which loops (in this case) over the columns of a frame.
datafr <- data.frame(a=1:5, b=1:5 + 0, d=letters[1:5])
sapply(datafr, is.double)
#     a     b     d
# FALSE  TRUE FALSE
If you must use a for loop, this can be unrolled with
y <- integer(ncol(datafr)) # defaults to 0
y
# [1] 0 0 0
for (j in seq_along(datafr)) {
if (is.double(datafr[[j]])) {
y[j] <- 1L
}
}
y
# [1] 0 1 0
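sapply can simplify to different types depending on its input, so a slightly stricter sketch uses vapply, which requires every result to be a single logical value, and then converts to the 0/1 vector the question asks for:

```r
datafr <- data.frame(a = 1:5, b = 1:5 + 0, d = letters[1:5])

# logical(1) declares that is.double must return exactly one TRUE/FALSE per column
y <- as.integer(vapply(datafr, is.double, logical(1)))
y
```

The original attempt returned only zeros because for (i in x) loops over the column names, which are character strings, so is.double(i) is never TRUE.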
I have two vectors with different elements, say x = c(1,3,4) and y = c(2,9).
I want a vector that marks the elements of x with 1 and those of y with 0 once both are combined and sorted, i.e.
(1,2,3,4,9) -----> (1,0,1,1,0)
How can I get the vector of zeros and ones (1,0,1,1,0) in R?
Thanks
The following option surely isn't the most efficient, but it's the simplest and most direct one:
a<-c(1,2,3,4)
b<-c(5,6,7,8)
f <- function(vec0, vec1, inp) {
  out <- rep(NA, length(inp))  # NA if an input element is in neither vector
  for (i in seq_along(inp)) {
    # inp[i] == vec0 gives logical values; sum() counts the matches,
    # and if() treats a nonzero count as TRUE
    if (sum(inp[i] == vec0)) out[i] <- 0
  }
  for (i in seq_along(inp)) {
    if (sum(inp[i] == vec1)) out[i] <- 1
  }
  return(out)
}
Works just fine:
> f(vec0=a,vec1=b,inp=c(1,6,4,8,2,4,8,7,10))
[1] 0 1 0 1 0 0 1 1 NA
First, define a function that does this:
blah <- function( vector,
x=c(1,3,4),
y= c(2,9)){
outVector <- rep(x = NA, times = length(vector))
outVector[vector %in% x] <- 1
outVector[vector %in% y] <- 0
return(outVector)
}
Then you can use the function:
blah(vector = 1:9)
blah(vector = c(1,2,3,4,9))
You can also change the values of x and y:
blah(vector = 1:10,x = c(1:5*2), y = c((1:5*2)-1 ))
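For the exact vectors in the question, a minimal sketch sorts the combined values and tests membership in x with %in%. The names combined and marks are introduced here for illustration.

```r
x <- c(1, 3, 4)
y <- c(2, 9)

combined <- sort(c(x, y))             # 1 2 3 4 9
marks <- as.integer(combined %in% x)  # 1 where the value comes from x, else 0
marks
```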
I know the question is silly but I really can't solve it. I just want to apply different operations to the elements of a data frame depending on their sign. The following code generates a mock data frame:
mock<-data.frame(matrix(NA,ncol=5,nrow=2))
colnames(mock)<-as.vector(c("m","n","1985-02-04","1985-02-05","1985-02-06"))
rownames(mock)<-as.vector(c("fund1","fund2"))
mock
mock[1,]<-c(0.001,0.0045,-0.03,0.25,NA)
mock[2,]<-c(0.004,0.0004,NA,0.12,-0.087)
mock
so it looks like
          m      n 1985-02-04 1985-02-05 1985-02-06
fund1 0.001 0.0045      -0.03       0.25         NA
fund2 0.004 0.0004         NA       0.12     -0.087
For each fund, m and n represent two different ratios, and the last three figures are returns on the given days. I wish to do the following operations:
If the return x on one day is positive, I need (x+m)/(1+n) to replace the corresponding figure in the data frame.
If the return x is negative, I need x+m to replace the corresponding figure in the dataframe.
If it is NA on the day, I will leave it NA.
I tried the following code:
Grossreturn<-function(x){
a<-x[3:5]
m<-x[1]
p<-x[2]
a[a>0]<-(a[a>0]+m)/(1-p)
a[a<0]<-a[a<0]+m
return(a)
}
apply(mock,1,Grossreturn)
and of course it failed and the error message is:
Error in a[a > 0] <- (a[a > 0] + m)/(1 - p) :
NAs are not allowed in subscripted assignments
I'm really stuck here and can't sort it out. Can someone help?
Thanks!
You should just exclude NAs from all your assignments. A sample syntax for doing this is below:
> foo = data.frame(x=runif(3)-0.5, y=runif(3)) #random data frame
> foo[2,1] <- NA #adding an NA
> foo
           x         y
1 -0.4616014 0.4892859
2         NA 0.4730237
3  0.4060813 0.1517448
If you now try to reassign without filtering out NAs, you get your error.
> foo[sign(foo$x)==-1, 1] <- -10
Error in `[<-.data.frame`(`*tmp*`, sign(foo$x) == -1, 1, value = -10) :
missing values are not allowed in subscripted assignments of data frames
But not if you explicitly leave out NAs:
> foo[sign(foo$x)==-1 & !is.na(foo$x), 1] <- -10
> foo
            x         y
1 -10.0000000 0.4892859
2          NA 0.4730237
3   0.4060813 0.1517448
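Another way to leave out the NAs, sketched here with fixed values instead of the random ones above, is to wrap the condition in which(). which() returns only the positions where the condition is TRUE, so NA comparisons are dropped automatically. neg_rows is a name introduced for this example.

```r
foo <- data.frame(x = c(-0.46, NA, 0.41), y = c(0.49, 0.47, 0.15))

neg_rows <- which(foo$x < 0)  # integer positions; the NA comparison is skipped
foo[neg_rows, "x"] <- -10
foo$x
```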
Here is code that solves your problem:
grossreturn <- function(x) {
  m <- x[1]
  n <- x[2]
  # iterate over all date columns and compute new value
  for (i in 3:length(x)) {
    if (is.na(x[i])) {
      # NA remains NA
    } else if (x[i] < 0) {
      x[i] <- x[i] + m                # x + m
    } else {
      # x[i] >= 0, which includes the case where x[i] == 0
      x[i] <- (x[i] + m) / (1 + n)    # (x + m) / (1 + n)
    }
  }
  return(x)
}
# apply() returns one column per row of mock, so the result is transposed
result <- apply(mock, 1, grossreturn)
I wanted to use an apply function to iterate over the numerical columns after extracting m and n, but there do not seem to be any vectorized apply functions that can also take multiple parameters as input (so mapply would not be a vectorized solution).
I assumed that when a return is 0 you want (x + m) / (1 + n). Also, you should check whether R drops the row or column names when you run this code.
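A vectorized variant of the same rule is sketched below. It keeps the (x + m) / (1 + n) formula from the question text and treats a return of exactly 0 like a positive one, following the assumption above. gross_return_vec is a hypothetical name, and the column names d1 to d3 are placeholders for the dates. The positions are computed with which() before any assignment, so NAs are skipped and already-updated values are not tested a second time.

```r
mock <- data.frame(m = c(0.001, 0.004), n = c(0.0045, 0.0004),
                   d1 = c(-0.03, NA), d2 = c(0.25, 0.12), d3 = c(NA, -0.087),
                   row.names = c("fund1", "fund2"))

gross_return_vec <- function(x) {
  m <- x[1]
  n <- x[2]
  a <- x[-(1:2)]            # the daily returns
  neg <- which(a < 0)       # which() drops NA comparisons
  nonneg <- which(a >= 0)
  a[neg] <- a[neg] + m
  a[nonneg] <- (a[nonneg] + m) / (1 + n)
  a
}

# apply() returns one column per fund, so transpose back to one row per fund
result <- t(apply(mock, 1, gross_return_vec))
result
```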
This is my program in R:
mletheta<-function(x)
{
n<-length(x)
temp<-x<=0
if(n==0||sum(temp)>0)
{
stop("ERROR:x must be a vector of positive real values.\n")
}
thetahat<--1*n/sum(log(1-exp(-1*x**2)))
return(thetahat)
}
mletheta(-3)
My problem is that I can't understand how sum(temp) > 0 when x <= 0. With x = -5:-4, sum(temp) should be -9 < 0. I don't understand the logic.
Let's decompose the first part of the function:
x <- -5:-4
n <- length(x)
n
# [1] 2
temp <- x<=0
temp
# [1] TRUE TRUE
sum(temp)
# [1] 2
This triggers the if statement if(n==0||sum(temp)>0): the error message is displayed if the vector is empty (n==0), which is not the case here, or if the sum of temp is greater than 0 (sum(temp)>0). Here, sum(temp) gives 2 because each TRUE counts as 1.
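The key point is that sum() on a logical vector counts the TRUE values, so it can never be negative. The short sketch below contrasts it with summing x itself, and with any(), which expresses the same check more directly:

```r
x <- -5:-4
temp <- x <= 0

sum(temp)  # counts the TRUEs: 2
sum(x)     # sums the values themselves: -9
any(temp)  # TRUE as soon as one element is <= 0
```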
Just a general question:
When I run:
ok<-NULL
for (i in 1:3) {
ok[i]=i^2
i=i+1
}
The loop works (as expected).
> ok
[1] 1 4 9
Now when I try to do something like:
ok<-NULL
for (i in 1:3) {
ok[i]=i^2
x[i]<-ok[i]+1
y[i]<-cbind(ok[i],x)
i=i+1
}
And I want:
y = 1
2
4
5
9
10
Instead I get:
Warning messages:
1: In y[i] <- rbind(ok[i], x) :
number of items to replace is not a multiple of replacement length
2: In y[i] <- rbind(ok[i], x) :
number of items to replace is not a multiple of replacement length
3: In y[i] <- rbind(ok[i], x) :
number of items to replace is not a multiple of replacement length
4: In y[i] <- rbind(ok[i], x) :
number of items to replace is not a multiple of replacement length
5: In y[i] <- rbind(ok[i], x) :
number of items to replace is not a multiple of replacement length
Thanks in advance.
You should read up on R basics before starting to program.
You don't have to increment i in the loop (actually it's quite confusing).
You don't need cbind or rbind to concatenate vectors; those functions bind vectors, matrices, or data frames together as columns or rows. Use c() instead.
y <- NULL
for(i in 1:3){ ok <- i^2; x <- ok + 1; y <- c(y, ok, x) }
or:
as.vector(sapply(1:3, function(i){ ok <- i^2; x <- ok + 1; c(ok, x) }))
With the command y[i]<-cbind(ok[i],x) you attempt to replace one element of the vector with several values. This triggers the warning.
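If you do want to keep an index-based loop, each iteration has to assign to as many positions as it produces values. A minimal sketch, assuming the result vector y is preallocated with two slots per iteration:

```r
n <- 3
y <- numeric(2 * n)  # two slots per iteration
for (i in seq_len(n)) {
  ok <- i^2
  # positions 2i - 1 and 2i receive the square and its successor
  y[(2 * i - 1):(2 * i)] <- c(ok, ok + 1)
}
y
```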
If you want to get 1:3 squared, you would use:
ok <- (1:3)^2
ok
# [1] 1 4 9
If you want to get 1:3 squared, along with the numbers right after them, you might try:
as.vector(rbind(ok, ok+1))
# [1]  1  2  4  5  9 10
for loops in R are often the wrong solution to your problem.