date_to_numeric<- function(x)#function for construction of date
{
strptime(x,format = "%Y-%m-%d")->t
if(is.na(t)==TRUE)
strptime(x,format = "%Y%m%d")->t
as.numeric(format(t, "%Y"))->t1
as.numeric(format(t, "%m"))->t2
as.numeric(format(t, "%d"))->t3
d<-c(0,0.08493150685,0.1616438356,0.2465753425,0.3287671233,0.4136986301,0.495890411,0.5808219178,0.6657534247,0.7479452055,0.8328767123,0.9150684932)
d[t2]->t2
t3<-t3/365
result<-t1+t2+t3
return(result)
}
time(d)->t
t<-date_to_numeric(t)
Warning message:
In if (is.na(t) == TRUE) t <- strptime(x, format = "%yyyy%mm%dd") :
the condition has length > 1 and only the first element will be used
Can someone please explain why I get this warning message? I used the same code in January last year and it worked fine! Any help is highly appreciated!
As @Sotos mentioned, you are receiving this warning because your function uses an if statement, but the object t is likely a vector of dates. Since if is not vectorized, your function only checks whether the first element of t is missing (in if (is.na(t))), which is exactly what the warning says. Note that your code will still run (in R versions before 4.2.0; from 4.2.0 onward a condition of length greater than one is an error), but it probably won't return what you are expecting.
The simplest way to fix this without editing your function is using sapply(). You can do something like this:
t <- time(d)
t2 <- sapply(t, FUN = date_to_numeric)
You can also edit your date_to_numeric function to allow for proper vectorized calculations, which I would recommend for the long run.
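A vectorized version could look roughly like the sketch below. It assumes the same two input formats as the original function and keeps the original arithmetic (month offset plus day/365). The name date_to_numeric_vec and the use of as.Date() instead of strptime() are choices made here for illustration, not part of the question.

```r
# Sketch of a vectorized date_to_numeric(); the function name is hypothetical.
date_to_numeric_vec <- function(x) {
  x <- as.character(x)
  # Parse every element with both formats; as.Date() gives NA where a format does not match.
  dashed  <- as.Date(x, format = "%Y-%m-%d")
  compact <- as.Date(x, format = "%Y%m%d")
  # Fall back to the compact format only where the dashed one failed.
  parsed <- dashed
  parsed[is.na(dashed)] <- compact[is.na(dashed)]

  year  <- as.numeric(format(parsed, "%Y"))
  month <- as.numeric(format(parsed, "%m"))
  day   <- as.numeric(format(parsed, "%d"))

  # Same offsets as the vector d in the question: days before each month, divided by 365.
  month_offset <- cumsum(c(0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30)) / 365
  year + month_offset[month] + day / 365
}

date_to_numeric_vec(c("2020-01-01", "20200101", "2020-03-15"))
```

Because every step operates on whole vectors, no if statement is needed, and elements that match neither format simply come back as NA.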
I am trying to extract data from a website using a custom function:
library(tidyverse)
library(rvest)
url = "https://www.boerse.de/fundamental-analyse/garbage/" # last part does not change outcome, therefore 'garbage'
read_html_tables = function(ISIN){
content <- read_html(paste0(url,ISIN,"#guv")) %>%
html_table(dec = ",") %>%
.[c(5:10)]
return(content)
}
If I run this function with a given ISIN, e.g. US88579Y1010, I get the desired result. A list containing 6 tibbles with the data I want. But if I wrap this function into lapply() with a vector containing a few hundred ISIN, I get the following error:
list_of_all <- lapply(X = df[,2], FUN = read_html_tables)
Error: x must be a string of length 1
Called from: read_xml.character(x, encoding = encoding, ..., as_html = TRUE,
options = options)
If I call which(length(df[,2]) != 1) (the column where the ISINs are), I get integer(0), so there seems to be no issue with the ISIN column in this dataframe. And since it works with a single ISIN as input, the read_html(paste0(url,ISIN)) part seems to work as well.
I have used a very similar function before and wrapped it into lapply(). The earlier function did basically exactly what this function does, but had to do some searching and combining for the correct URL to pass into the read_html(paste0(url,ISIN)) part (on another website).
I am a bit puzzled, since this error did not occur before. But now that it has occurred, I get the same error when I run the earlier function as well (which never happened before).
Maybe there is a more talented R programmer out there who can spot the issue?
Edit: Since a reply suggested the ISIN-list is the issue:
The first two are US88579Y1010 and US8318652091. Passed individually into the function as well as passing it in a vector (c(ISIN1, ISIN2)) and passing the vector to lapply works. But if I point at both ISINs inside the tibble (df[1:2,2]) I get the error from above. What am I missing here?
Solution:
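read_xml.character(), called by read_html(), does not accept a tibble column as input. On a tibble, df[,2] returns a one-column tibble rather than a character vector, so lapply() passes the whole column to the function in a single call instead of one ISIN at a time. Converting the tibble to a data.frame and rerunning gives the desired output.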
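The difference can be reproduced without any web requests. The sketch below uses a base data.frame with drop = FALSE to mimic what a tibble returns for df[, 2] (an assumption made here to avoid depending on the tibble package), and uses length() as a stand-in for read_html_tables().

```r
# Stand-in data: the two ISINs from the question; the id column is made up.
df <- data.frame(id = 1:2,
                 isin = c("US88579Y1010", "US8318652091"),
                 stringsAsFactors = FALSE)

# A tibble's df[, 2] behaves like drop = FALSE: a one-column data frame.
as_column <- df[, 2, drop = FALSE]
# lapply() iterates over COLUMNS here, so the function sees both ISINs at once.
seen_per_call_column <- lapply(as_column, length)

# Extracting the column as a plain vector gives one call per ISIN.
as_vector <- df[[2]]
seen_per_call_vector <- lapply(as_vector, length)

str(seen_per_call_column)
str(seen_per_call_vector)
```

Using df[[2]] or df$isin works the same way for tibbles and data frames, so it avoids the conversion step.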
I am a beginner in R. I am actually trying to code my first function.
I am looking for csv files in a directory on my computer, then I put them into a data frame and then I am asking for the mean of some variable.
I have two variables: sulfate and nitrate.
My function works fine for nitrate but not for sulfate. I really don't know what is wrong. RStudio gave me a clue: In mean.default(directory$suftate, na.rm = TRUE) :
argument is not numeric or logical: returning NA
But I don't know what to do with this information.
My function is :
pollutantmean <- function (directory, polluant = "nitrate", id = 1:332)
directory <- data.frame()
for (i in id)
{directory <- rbind(directory, read.csv(full_files[i]))}
if (polluant == "nitrate"){
mean(directory$nitrate,na.rm = TRUE)}
else if (polluant == "sulfate"){
mean(directory$suftate,na.rm = TRUE)}
else {print("KO")}
}
Can you help me ?
Caroline
An opening curly brace is missing in the very first line.
This function will only work if there is a global variable called full_files; consider passing it to the function explicitly.
There is likely a typo in the else if clause: it should be directory$sulfate, not directory$suftate.
This function never calls return() explicitly; it relies on the value of the if/else being the last expression evaluated. Running it at the top level prints the mean, but that will not always happen, especially when it is run from the command line or called from another function or script. Consider wrapping mean() in print() or, even better, returning the value explicitly, which lets you assign the mean to other variables.
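Putting these points together, a corrected version might look like the sketch below. The files argument replaces the global full_files and is an assumption made here, as are the two small temporary CSV files, which exist only to make the example runnable.

```r
# Sketch: the file list is passed in explicitly instead of relying on a global.
pollutantmean <- function(files, pollutant = "nitrate", id = seq_along(files)) {
  stopifnot(pollutant %in% c("nitrate", "sulfate"))
  # Read the selected files and stack them into one data frame.
  monitors <- do.call(rbind, lapply(files[id], read.csv))
  # The mean is the function's value, so the caller can print or store it.
  mean(monitors[[pollutant]], na.rm = TRUE)
}

# Two small made-up files, only for demonstration.
files <- c(tempfile(fileext = ".csv"), tempfile(fileext = ".csv"))
write.csv(data.frame(sulfate = c(1, NA), nitrate = c(2, 4)), files[1], row.names = FALSE)
write.csv(data.frame(sulfate = 3, nitrate = 6), files[2], row.names = FALSE)

sulfate_mean <- pollutantmean(files, "sulfate")
nitrate_mean <- pollutantmean(files, "nitrate")
```

Selecting the column with monitors[[pollutant]] removes the if/else chain, and with it the place where the suftate typo crept in.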
I have an example function below that reads in a date as a string and returns it as a date object. If it reads a string that it cannot convert to a date, it returns an error.
testFunction <- function (date_in) {
return(as.Date(date_in))
}
testFunction("2010-04-06") # this works fine
testFunction("foo") # this returns an error
Now, I want to use lapply and apply this function over a list of dates:
dates1 = c("2010-04-06", "2010-04-07", "2010-04-08")
lapply(dates1, testFunction) # this works fine
But if I want to apply the function over a list when one string in the middle of two good dates returns an error, what is the best way to deal with this?
dates2 = c("2010-04-06", "foo", "2010-04-08")
lapply(dates2, testFunction)
I presume that I want a try catch in there, but is there a way to catch the error for the "foo" string whilst asking lapply to continue and read the third date?
Use a tryCatch expression around the function that can throw the error message:
testFunction <- function (date_in) {
return(tryCatch(as.Date(date_in), error=function(e) NULL))
}
The nice thing about the tryCatch function is that you can decide what to do in the case of an error (in this case, return NULL).
> lapply(dates2, testFunction)
[[1]]
[1] "2010-04-06"
[[2]]
NULL
[[3]]
[1] "2010-04-08"
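If you also want to know which inputs failed, the NULL entries can be used as a marker afterwards. This is a small sketch built on the tryCatch version above; the variable names failed and good_dates are made up here.

```r
testFunction <- function(date_in) {
  tryCatch(as.Date(date_in), error = function(e) NULL)
}

dates2 <- c("2010-04-06", "foo", "2010-04-08")
results <- lapply(dates2, testFunction)

# TRUE where the handler returned NULL, i.e. where parsing threw an error.
failed <- vapply(results, is.null, logical(1))

dates2[failed]                              # the inputs that could not be parsed
good_dates <- do.call(c, results[!failed])  # the remaining dates as one Date vector
```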
One could try to keep it simple rather than to make it complicated:
Use the vectorised date parsing
R> as.Date( c("2010-04-06", "foo", "2010-04-08") )
[1] "2010-04-06" NA "2010-04-08"
You can trivially wrap na.omit() or whatever around it. Or find the index of NAs and extract accordingly from the initial vector, or use the complement of the NAs to find the parsed dates, or, or, or. It is all here already.
You can make your testFunction() do something. Use the test there -- if the returned (parsed) date is NA, do something.
Add a tryCatch() block or a try() to your date parsing.
The whole thing is a little odd, as you go from a single-type data structure (a character vector) to something else, and you can't easily mix types unless you keep them in a list. So maybe you need to rethink this.
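As a sketch of the vectorised route: without a format argument, as.Date() picks a format from the first non-missing element and raises an error if that element does not parse, so passing the format explicitly is the safer variant when a bad string might come first. The names parsed, bad and good are made up for this example.

```r
dates2 <- c("2010-04-06", "foo", "2010-04-08")

# With an explicit format, unparseable strings become NA instead of raising an error.
parsed <- as.Date(dates2, format = "%Y-%m-%d")

bad  <- dates2[is.na(parsed)]   # original strings that failed to parse
good <- parsed[!is.na(parsed)]  # successfully parsed dates
```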
You can also accomplish this kind of task with the purrr helper functions map and possibly. For example
library(purrr)
map(dates2, possibly(testFunction, NA))
Here possibly() returns NA (or whatever value you specify) if an error occurs.
Assuming the testFunction() is not trivial and/or that one cannot alter it, it can be wrapped in a function of your own, with a tryCatch() block. For example:
> FaultTolerantTestFunction <- function(date_in) {
+ # tryCatch() returns the handler's value on error, so no <<- is needed
+ tryCatch(testFunction(date_in), error = function(e) NA)
+ }
> FaultTolerantTestFunction('bozo')
[1] NA
> FaultTolerantTestFunction('2010-03-21')
[1] "2010-03-21"
testing<-function(formula=NULL,data=NULL){
if(with(data,formula)==T){
print('YESSSS')
}
}
A<-matrix(1:16,4,4)
colnames(A)<-c('x','y','z','gg')
A<-as.data.frame(A)
testing(data=A,formula=(2*x+y==Z))
Error in eval(expr, envir, enclos) : object 'x' not found
##or I can put formula=(x=1)
##reason that I use formula is because my dataset had different location and I would want
##to 'subset' my data into different set
This is the main flow of my code. I did some searching, and it seems either that no one has asked this kind of basic question or that it is not possible to pass a formula into an if statement. Thank you in advance.
If you just want a subset of your data.frame, create a character object representing the formula, like this:
formula="2*x+y==z"
testing<-function(data,formula){with(data = data,expr = eval(parse(text = formula)))}
subset(A,testing(A,formula=formula))
#x y z gg
#2 2 6 10 14
You can change the formula as per your need.
If we need to evaluate it, one option is eval(parse()):
testing<-function(formula=NULL,data=NULL){
data <- deparse(substitute(data))
if(any(eval(parse(text=paste("with(", data, ",",
deparse(substitute(formula)), ")")))))
print("YESSS")
}
testing(data=A,formula=(2*x+y==z))
#[1] "YESSS"
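When you call a function in R, its arguments are evaluated in the environment of the caller, not inside the function (strictly speaking, this happens lazily, the first time the function uses the argument).
For example, in prod(2+2, 3) the expression 2+2 is evaluated to 4 using the caller's variables, so prod() effectively receives prod(4, 3).
Thus, in your code, R tries to evaluate (2*x+y==Z) outside the function as soon as testing() touches formula. It fails because there is no x object outside of the data frame, so the columns of data are never consulted.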
To use your function correctly you should make it clear to R that it is not supposed to calculate (2*x+y==Z). Instead it should pass this information as is. You could do that using the functions expression() and eval().
testing<-function(formula=NULL,data=NULL){
if(with(data,eval(formula))==T){
print('YESSSS')
}
}
A<-matrix(1:16,4,4)
colnames(A)<-c('x','y','z','gg')
A<-as.data.frame(A)
testing(data=A,formula=expression(2*x+y==Z))
However, you will notice that there are other problems with your code.
First, Z is not the same as z: colnames() uses z, but the formula uses Z.
Second, if() only works with a single TRUE or FALSE value. In your case, you get one value for each row of A. When this happens, if() only checks whether the first row meets the criterion (and R 4.2.0 and later raise an error instead).
If your purpose is subsetting, it is much easier to do:
A.subset <- subset(A, 2*A$x+A$y == A$z)
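If you still want a function that accepts an unquoted condition, one common pattern is to capture the expression with substitute() and evaluate it inside the data frame. The sketch below is one way to do that under those assumptions; the argument name condition is made up here, and the function returns the matching rows instead of printing.

```r
testing <- function(data, condition) {
  # Capture the expression as typed by the caller, without evaluating it yet.
  cond <- substitute(condition)
  # Evaluate it with the columns of data in scope; fall back to the caller's variables.
  rows <- eval(cond, data, parent.frame())
  # which() drops NA results, so only rows where the condition is TRUE are kept.
  data[which(rows), , drop = FALSE]
}

A <- as.data.frame(matrix(1:16, 4, 4))
colnames(A) <- c("x", "y", "z", "gg")

hits <- testing(A, 2 * x + y == z)
```

Since the condition is checked for every row at once, this avoids both the if() problem and the need to pass the formula as a string.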
After a discussion with my colleague,
here is a kind of solution
testing<-function(cx,cy,px,py,z,data=NULL){
list<-NULL
for(m in 1:nrow(data)){
if(cx*data$x[m]^px+cy*data$y[m]^py+data$z[m]==0){
print(m)}
}
}
But this can only handle polynomials, and it needs a lot of arguments. I am thinking of a way to generalize it to an arbitrary equation, or maybe this is already the simplest approach.
Stuck on an error in R.
Error in names(x) <- value :
'names' attribute must be the same length as the vector
What does this error mean?
In the spirit of @Chris W, just try to replicate the exact error you are getting. An example would have helped, but maybe you're doing:
x <- c(1,2)
y <- c("a","b","c")
names(x) <- y
Error in names(x) <- y :
'names' attribute [3] must be the same length as the vector [2]
I suspect you're trying to give names to a vector (x) that is shorter than your vector of names (y).
Depending on what you're doing in the loop, the fact that the %in% operator returns a vector might be an issue; consider a simple example:
c1 <- c("one","two","three","more","more")
c2 <- c("seven","five","three")
if(c1%in%c2) {
print("hello")
}
then the following warning is issued:
Warning message:
In if (c1 %in% c2) { :
the condition has length > 1 and only the first element will be used
If something in your if statement depends on a specific number of elements and the lengths don't match, then it is possible to get the error you see.
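To get a single TRUE or FALSE for if(), the vector returned by %in% can be collapsed with any() or all(), depending on what the condition is meant to express. A minimal sketch with the same two vectors:

```r
c1 <- c("one", "two", "three", "more", "more")
c2 <- c("seven", "five", "three")

matches <- c1 %in% c2   # one logical value per element of c1

# any(): at least one element of c1 appears in c2.
if (any(matches)) {
  print("hello")
}

# all(): every element of c1 appears in c2 (not the case here).
all_found <- all(matches)
```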
I have seen this error and solved it. You may have missing values in your data set. The number of observations in every column must also be the same.
I want to explain the error with an example below:
> names(lenses)
[1] "X1..1..1..1..1..3"
names(lenses)=c("ID","Age","Sight","Astigmatism","Tear","Class")
Error in names(lenses) = c("ID", "Age", "Sight", "Astigmatism", "Tear", :
'names' attribute [6] must be the same length as the vector [1]
The error happened because the number of names did not match the number of columns: the data frame has only one column, but I was trying to assign six names. The corrected version is below:
> names(lenses)=c("ID")
> names(lenses)
[1] "ID"
Now there was no error.
I hope this will help!
I had this error, caused by a scaled numeric variable being returned as a matrix rather than as a numeric vector. Convert any transformed variables back with as.numeric() and it should work.
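A quick way to see this is to inspect what scale() returns. The vector below is made up for illustration:

```r
x <- c(1, 2, 3)

scaled <- scale(x)       # centers and scales, but returns a one-column matrix
is.matrix(scaled)        # TRUE
dim(scaled)              # 3 rows, 1 column

flat <- as.numeric(scaled)  # drops the matrix structure and scaling attributes
is.null(dim(flat))          # TRUE: a plain numeric vector again
```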
The mistake that caused this error for me was trying to rename, in a loop, a column that I was no longer selecting in my SQL query. This could also be caused by trying to do the same thing to a column that you were only planning to select. Make sure the column that you are trying to change actually exists.
For me, this error occurred because some of my column titles consisted of two names. I merged each into a single name and all went well.
I encountered the same error for a silly reason, which I think was this:
Working in R Studio, if you try to assign a new object to an existing name, and you currently have an object with the existing name open with View(), it throws this error.
Close the object 'View' panel, and then it works.