I have to run a geocoding script in R. I was given a script and have tried to get it to work, but I keep getting an error. I am pasting my code and the errors I am getting below. If you could point me in the right direction, I would appreciate it.
# initialise a dataframe to hold the results
geocoded <- data.frame()
# find out where to start in the address list (if the script was interrupted before):
startindex <- 1
# if a temp file exists - load it up and count the rows!
tempfilename <- paste0(hw1, '_temp_geocoded.rds')
if (file.exists(tempfilename)){
print("Found temp file - resuming from index:")
geocoded <- readRDS(tempfilename)
startindex <- nrow(geocoded)
print(startindex)
}
## Warning message:
## In if (file.exists(tempfilename)) { :
## the condition has length > 1 and only the first element will be used
The function file.exists(...) returns a logical vector of length equal to the length of its argument. So if you provide a vector of file names, file.exists(...) will return a vector of the same length, where each element is either TRUE or FALSE depending on whether the corresponding file exists.
The problem is that if(...) expects a scalar (vector of length 1). Otherwise, it just uses the first element of the vector and gives the warning you are seeing.
So, I suspect that hw1 is a vector of length > 1, which will make tempfilename a vector of length > 1.
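A quick way to check this, sketched with a made-up vector standing in for hw1 (its real value isn't shown in the question): look at the length of the generated file name, then build the name from a single string instead. Note that from R 4.2.0 onward, a condition of length greater than one in if() is an error rather than a warning.

```r
# hypothetical stand-in for hw1: a vector with more than one element
hw1_example <- c("addresses_a", "addresses_b")
tempfilenames <- paste0(hw1_example, "_temp_geocoded.rds")
length(tempfilenames)                  # one file name per element of hw1_example
exists_flags <- file.exists(tempfilenames)
length(exists_flags)                   # same length, so if() cannot use it directly

# possible fix: build the name from a single string (assumed label, not from the question)
infile <- "addresses_a"
tempfilename <- paste0(infile, "_temp_geocoded.rds")
stopifnot(length(tempfilename) == 1)   # guard against vector input
if (file.exists(tempfilename)) {
  message("Found temp file: ", tempfilename)
}
```

If hw1 is meant to hold the address data itself, a separate length-one string for the file label is probably what the original script expected.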
I am trying hard to finish my assignment in R. I am new to programming, so pardon my ignorance.
I have a dataset with 500 cryptocurrencies and relative data about them. The name of the cryptocurrency is under the column ID, and the value in EUR of one cryptocurrency is under the column current_price. I need to create a new function to convert between cryptocurrencies, with three inputs: the currency I am converting from, the amount of coins I have in that currency, and the currency I am converting to.
This is what I came up with. It does not work, and every time I change it I get a new error.
convert <- function (x,amount,z) {
currency_price <- mdata$current_price
currency_id <- mdata$id
index_currency_id_x <- which( x == currency_id) [[1]]
index_currency_id_z <- which ( z == currency_id) [[1]]
conversion <- currency_price[[index_currency_id_z]] * amount
return (conversion/currency_price[[index_currency_id_z]])
}
If I run every line of code I receive the errors:
Error in which(x == currency_id) : object 'x' not found
Error in which(z == currency_id) : object 'z' not found
Error in currency_price[[index_currency_id_z]] :
attempt to select less than one element in get1index
Error in currency_price[[index_currency_id_z]] :
attempt to select less than one element in get1index
These errors just mean that x and z have no values: running the body line by line skips the function call that would supply them. Call the function with its arguments instead: convert(x=,amount=,z=)
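Separately from the missing arguments, the posted function uses index_currency_id_z in both the multiplication and the division, so it would return amount unchanged. Below is a minimal sketch of one way to write it, assuming a small made-up mdata (the real dataset isn't shown) and that current_price is the EUR value of one coin.

```r
# hypothetical stand-in for the asker's data: prices in EUR per coin
mdata <- data.frame(
  id = c("bitcoin", "ethereum", "litecoin"),
  current_price = c(20000, 1000, 50),
  stringsAsFactors = FALSE
)

convert <- function(x, amount, z, data = mdata) {
  # match() returns NA for an unknown id, so both ids can be checked up front
  idx <- match(c(x, z), data$id)
  if (anyNA(idx)) stop("unknown currency id")
  value_eur <- data$current_price[idx[1]] * amount   # value of the holding in EUR
  value_eur / data$current_price[idx[2]]             # number of coins of the target currency
}

convert(x = "bitcoin", amount = 2, z = "ethereum")   # 2 * 20000 / 1000 = 40
```

Calling the function as a whole, rather than running its body line by line, is what gives x and z their values.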
I have functions that operate on a single vector (for example, a column in a data frame). I want users to be able to use $ to specify the columns that they pass to these functions; for example, I want them to be able to write myFun(df$x), where df is a data frame. But in such cases, I want my functions to detect when x isn't in df. How may I do this?
Here is a minimal illustration of the problem:
myFun <- function (x) sum(x)
data(iris)
myFun(iris$Petal.Width) # returns 179.9
myFun(iris$XXX) # returns 0
I don't want the last line to return 0. I want it to throw an error message, as XXX isn't a column in iris. How may I do this?
One way is to run as.character(match.call()) inside the function. I could then use the parts of the resulting string to determine the name of df, and in turn, I could check for the existence of x. But this seems like a not-so-robust solution.
It won't suffice to throw an error whenever x has length 0: I want to detect whether the vector exists, not whether it has length 0.
I searched for related posts on Stack Overflow, but I didn't find any.
iris$XXX returns NULL, and that NULL is passed to sum:
sum(NULL)
#[1] 0
Note that both iris$XXX and iris[['XXX']] return NULL. If we need an error instead, either subset or dplyr::select gives one:
iris %>%
select(XXX)
Error: Can't subset columns that don't exist.
✖ Column XXX doesn't exist.
Run rlang::last_error() to see where the error occurred.
Or with pull
iris %>%
pull(XXX)
Error: object 'XXX' not found
Run rlang::last_error() to see where the error occurred.
Or with subset
subset(iris, select = XXX)
Error in eval(substitute(select), nl, parent.frame()) :
object 'XXX' not found
We could make the function throw an error if NULL is passed. Because of the way the function takes its argument, it receives only the value and no information about the object the value came from.
myFun <- function (x) {
stopifnot(!is.null(x))
sum(x)
}
However, this would be a non-specific error, because NULL can reach the function in other ways as well, e.g. if the column exists and its value is NULL.
If we need to check whether the column is valid, then the data and the column name should be passed into the function separately:
myFun2 <- function(data, colnm) {
stopifnot(exists(colnm, data))
sum(data[[colnm]])
}
myFun2(iris, 'XXX')
#Error in myFun2(iris, "XXX") : exists(colnm, data) is not TRUE
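The message produced by stopifnot is fairly terse. A possible variant, sketched here under the hypothetical name myFun3, checks the column name against names(data) and reports which column is missing:

```r
# hypothetical variant of myFun2 with a more descriptive error message
myFun3 <- function(data, colnm) {
  if (!colnm %in% names(data)) {
    stop("column '", colnm, "' not found in data", call. = FALSE)
  }
  sum(data[[colnm]])
}

total <- myFun3(iris, "Petal.Width")   # sums the column as before
res <- tryCatch(myFun3(iris, "XXX"), error = function(e) conditionMessage(e))
res                                    # "column 'XXX' not found in data"
```

Like myFun2, this requires the caller to pass the data frame and the column name separately rather than df$x.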
I have a function where, in some iterations, I have to return the value of a function at position 0, which returns
funX[0] = numeric(0)
I understand that R indexing starts from 1. However, if I could convert these outputs to simply zero, life would be easier. I have not found a way around this.
Is there any way to have such results simply converted to 0?
Addition: I tried to set funX[0] <- 0L in the beginning of the function but it doesn't work.
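Assigning to funX[0] has no effect because index 0 selects no elements, so there is nothing to overwrite. One possible workaround, sketched with a hypothetical helper and made-up values standing in for funX, is to test the length of the result and substitute 0 when it is empty:

```r
# hypothetical helper: return 0 when an index yields an empty result
value_or_zero <- function(v, i) {
  out <- v[i]
  if (length(out) == 0) 0 else out
}

funX_example <- c(2.5, 4, 7)                    # made-up values standing in for funX
at_zero <- value_or_zero(funX_example, 0)       # 0 instead of numeric(0)
at_two  <- value_or_zero(funX_example, 2)       # 4, the ordinary second element
```

This only covers the empty-result case; an index beyond the end of the vector still gives NA.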
I have a function in which I'm trying to compare a data frame column to a reference table of type character. I have downloaded some data on popular first names from the Norwegian central statistics office. I want to add a column to my data frame which is a 1 or a 0 depending on whether the name appears in the list (1 being a boy, 0 being a girl). I'm getting the following error with the code:
*Error in match(x, table, nomatch = 0L) : object 'x' not found*
Data frame is train.
Reference data is male_names
male_names <- read.csv("~/R/Functions_Practice/NO/BoysNames_Data.csv", sep=";",as.is = TRUE)[ ,1]
get.sex <- function(x, ref)
for (i in ref)
{
if(x %in% ref)
{return (1)}
}
# set default for column
train$sex <- 2
# Update column if it appears in the names list
train$sex <- sapply(train$sex, FUN=get.sex(x,male_names))
I would then use the function to run the second file, with girls' names, against the table and set the flag for each record to zero where a match occurs.
Can anyone help?
When using sapply, you don't call the function inside the FUN parameter. You pass the function itself and supply any extra arguments by name:
train$sex <- sapply(train$sex, FUN=get.sex,ref = male_names)
It is implied that train$sex is the x argument, and all other parameters are passed after that (in this case, it's just ref) and are explicitly defined.
Edit:
As joran noted, in this case sapply isn't particularly useful, and you can get the result in one line:
train$sex = (train$sex %in% male_names)*1
%in% can be used when the argument on the left is a vector, so you don't have to loop over it. Multiplying the result by one converts logical (boolean) values into integers. 1*TRUE yields 1, and 1*FALSE yields 0.
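To make the two approaches concrete, here is a small sketch on made-up data. The column name `name`, the example names, and the helper is_male are all assumptions for illustration; the question does not show which column of train holds the first names.

```r
# made-up data; the column `name` is an assumption, not taken from the question
train_example <- data.frame(name = c("Ola", "Kari", "Per"), stringsAsFactors = FALSE)
male_names_example <- c("Ola", "Per", "Lars")

# hypothetical per-element function: sapply supplies x, ref is passed by name
is_male <- function(x, ref) if (x %in% ref) 1 else 0

via_sapply <- sapply(train_example$name, FUN = is_male, ref = male_names_example)
via_in <- (train_example$name %in% male_names_example) * 1

unname(via_sapply)   # 1 0 1
via_in               # 1 0 1
```

Both give the same flags; the %in% version avoids the element-by-element loop.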
The default behavior in R when the index exceeds the dimensions of a vector / matrix is to return NA. E.g.
> a = as.matrix(1:10)
> a[11]
[1] NA
This is very inconvenient in many circumstances, since the code keeps running giving wrong results and without even giving a warning.
Does anyone know if it is possible to alter this default behavior in a code, so that in these cases an error or a warning is thrown instead of returning NA when the index exceeds the dimensions of a vector/matrix ?
One solution is for you to use two arguments (row and col) when indexing your matrix with [, which is the more "normal" thing to do with a matrix. That usage will trigger an error:
a[11, 1] <- NA
# Error in `[<-`(`*tmp*`, 11, 1, value = NA) : subscript out of bounds
Another way, assuming that your a[11] is part of a script or function, is to put in your own error check. For example,
for (j in 1:20) {
  # plain if rather than ifelse(): cat() returns NULL, which ifelse() cannot use as a result
  if (j > length(a)) stop("index out of bounds: ", j)
  a[j]
}
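If the same check is needed in many places, it can be wrapped in a small function. The sketch below uses a hypothetical helper name, checked_index, and only handles positive indices. It also shows that `[[` is stricter than `[` for a single element and raises an error by itself.

```r
# hypothetical helper: index a vector or matrix with an explicit bounds check
# (positive indices only; negative or NA indices are rejected as well)
checked_index <- function(x, i) {
  if (any(is.na(i)) || any(i < 1 | i > length(x))) {
    stop("index out of bounds: ", paste(i, collapse = ", "), call. = FALSE)
  }
  x[i]
}

a <- as.matrix(1:10)
last_value <- checked_index(a, 10)   # 10, a valid index
msg <- tryCatch(checked_index(a, 11), error = function(e) conditionMessage(e))
msg                                  # "index out of bounds: 11"

# single-element extraction with [[ errors on an out-of-range index
strict <- tryCatch(a[[11]], error = function(e) "error")
```

This does not change the default behaviour of `[`; it only gives code that opts in a way to fail loudly.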