NA Error min Function in R - r

I am running into an error using R's min() function.
wip <- read.csv("WIP-01-11-11.csv") # Get WIP CSV
wip <- transform(wip, End.Date=as.Date(wip$End.Date,format='%d-%b-%y', na.rm=T))
wip <- transform(wip, Start.Date=as.Date(wip$Start.Date,format='%d-%b-%y', na.rm=T))
wip2 <- transform(wip, duration=ifelse(
round((wip3$End.Date - wip3$Start.Date)/30, digits = 0)==0,
1,
round((wip3$End.Date - wip3$Start.Date)/30, digits = 0)))
# At this point, I get NAs
wip3 <- transform(wip2, monthsRec=min( (
(2011*12+11) - as.numeric(format(wip3$Start.Date, '%Y'))*12 +
as.numeric(format(wip3$Start.Date, '%m'))),
wip3$duration)
)
Why am I getting NAs in the "duration" calculation for wip2 when End.Date and Start.Date contain no NAs?
Thanks,

wip3 = list()
wip3$Start.Date = as.Date('2011-01-01')
wip3$duration = 10
> min(((2011*12+11) - as.numeric(format(wip3$Start.Date, '%Y'))*12+as.numeric(format(wip3$Start.Date, '%m'))),wip3$duration)
[1] 10
Works fine for me. Do you have any NAs in your data? If so, you probably want to pass na.rm = TRUE to min().
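A minimal sketch of how na.rm affects min(), using made-up numbers rather than the asker's data. It also shows that min() collapses all of its arguments into a single value, whereas pmin() compares element by element; whether the asker wanted one value or one value per row is an assumption on my part.

```r
# Hypothetical values, not taken from the WIP file
months_since_start <- c(5, NA, 3)
duration <- 4

# A single NA makes the whole result NA
res_default <- min(months_since_start, duration)

# na.rm = TRUE drops the NA first, but still returns ONE number
res_na_rm <- min(months_since_start, duration, na.rm = TRUE)

# pmin() works element-wise, giving one result per row
res_per_row <- pmin(months_since_start, duration, na.rm = TRUE)

print(res_default)
print(res_na_rm)
print(res_per_row)
```

Inside transform(), a single value from min() would be recycled down the whole column, so pmin() may be the better fit if a per-row minimum is intended.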

I reproduced your problem with @John Colby's example by using the wrong casing for wip3$start.Date:
wip3 = list()
wip3$start.Date = as.Date('2011-01-01')
wip3$duration = 10
min(((2011*12+11) - as.numeric(format(wip3$Start.Date, '%Y'))*12+as.numeric(format(wip3$Start.Date, '%m'))),wip3$duration)
Which produces
[1] NA
Warning messages:
1: NAs introduced by coercion
2: NAs introduced by coercion
I suspect that since you have wip3$duration, you probably have wip3$start.Date too, but you accessed it as wip3$Start.Date in your code. R names are case-sensitive, so that returns NULL, which breaks the rest of the calculation.
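The case-sensitivity point can be checked directly. This is a small sketch reusing the list from the example above; the check itself (names() and is.null()) is my addition, not something from the original answer.

```r
# Element is stored in lower case, as in the example above
wip3 <- list(start.Date = as.Date("2011-01-01"), duration = 10)

# Accessing it with a capital S silently returns NULL instead of an error
wrong_case <- wip3$Start.Date
right_case <- wip3$start.Date

print(is.null(wrong_case))

# Listing the names makes a casing mismatch easy to spot
print(names(wip3))
has_capitalised <- "Start.Date" %in% names(wip3)
```

Because `$` never raises an error for a missing name, comparing names() against the names used in the code is a quick way to catch this kind of typo.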

Related

Having trouble with making K Nearest Neighbors work in R Studio

I'm trying to use the knn function in R, but I keep getting this error message when I run it.
> knn(Taxi_train,Taxi_test,cl,k=100)
Error in knn(Taxi_train, Taxi_test, cl, k = 100) :
NA/NaN/Inf in foreign function call (arg 6)
In addition: Warning messages:
1: In knn(Taxi_train, Taxi_test, cl, k = 100) : NAs introduced by coercion
2: In knn(Taxi_train, Taxi_test, cl, k = 100) : NAs introduced by coercion
I don't know exactly what is wrong with my code, so I need some help getting it to work.
I tried making sure that all the variables are numeric, but that didn't change anything. It may also be an issue with the cl factor in my knn call.
Here is my current code:
date<-chicago_taxi$date
class(date)
Date <- as.Date(date)
class(Date)
Julian <- yday(Date)
class(Julian)
head(Julian)
chicago_taxi <- cbind(chicago_taxi,Julian)
chicago_taxi$seconds <- as.numeric(chicago_taxi$seconds)
set.seed(7777)
train_set <- sample(1:13081,10400,replace = FALSE)
Taxi_train <- chicago_taxi[train_set,]
Taxi_test <- chicago_taxi[-train_set,]
cl <- Taxi_train$payment_type
scale(chicago_taxi$miles)
scale(chicago_taxi$seconds)
scale(chicago_taxi$Julian)
knn(Taxi_train,Taxi_test,cl,k=100)

PCA with result non-interactively in R

I would like to run a PCA in R with the ade4 package.
I have the data "PAYSAGE":
All the variables are numeric, PAYSAGE is a data frame, and there are no NAs or blanks.
But when I do :
require(ade4)
ACP<-dudi.pca(PAYSAGE)
2
I get this error message:
You can reproduce this result non-interactively with:
dudi.pca(df = PAYSAGE, scannf = FALSE, nf = NA)
Error in if (nf <= 0) nf <- 2 : missing value where TRUE/FALSE needed
In addition: Warning message:
In as.dudi(df, col.w, row.w, scannf = scannf, nf = nf, call = match.call(), :
NAs introduced by coercion
I don't understand what that means. Do you have any idea?
Thank you so much
I'd suggest sharing a data set or example that others can access, if possible. This seems data-specific, and with "NAs introduced by coercion" you may want to check the type of your input with typeof(PAYSAGE). The manual for dudi.pca states that it takes a data frame of numeric values as input.
Yes, for example:
ag_div <- c(75362,68795,78384,79087,79120,73155,58558,58444,68795,76223,50696,0,17161,0,0)
canne <- c(rep(0,10),5214,6030,0,0,0)
prairie_el<- c(60, rep(0,13),76985)
sol_nu <- c(18820,25948,13150,9903,12097,21032,35032,35504,25948,20438,12153,33096,15748,33260,44786)
urb_peu_d <- c(448,459,5575,5902,5562,458,6271,6136,459,1850,40,13871,40,13920,28669)
urb_den <- c(rep(0,12),14579,0,0)
veg_arbo <- c(2366,3327,3110,3006,3049,2632,7546,7620,3327,37100,3710,0,181,0,181)
veg_arbu <- c(18704,18526,15768,15527,15675,18886,12971,12790,18526,15975,22216,24257,30962,24001,14523)
eau <- c(rep(0,10),34747,31621,36966,32165,28054)
PAYSAGE<-data.frame(ag_div,canne,prairie_el,sol_nu,urb_peu_d,urb_den,veg_arbo,veg_arbu,eau)
require(ade4)
ACP<-dudi.pca(PAYSAGE)
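The type check suggested in the comment can be sketched with base R alone. Note that typeof() on any data frame reports "list", so a per-column check is more informative; the small PAYSAGE_demo frame below is a hypothetical stand-in built from a few of the values above, not the full data set.

```r
# Hypothetical three-row stand-in for PAYSAGE
PAYSAGE_demo <- data.frame(
  ag_div = c(75362, 68795, 0),
  canne  = c(0, 5214, 6030)
)

# typeof() describes the container, not the columns
container_type <- typeof(PAYSAGE_demo)

# Check each column individually
numeric_cols <- sapply(PAYSAGE_demo, is.numeric)
all_numeric <- all(numeric_cols)

# Confirm there are no missing values
has_na <- any(is.na(PAYSAGE_demo))

print(container_type)
print(numeric_cols)
print(has_na)
```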

addTA - Error in naCheck(x, n) : Series contains non-leading NAs

I recently tried to create my own technical indicator, a simple golden cross indicator (the 50-day EMA minus the 200-day EMA), to add to my chartSeries chart. This worked fine with the code below at first, but after the quantmod package was updated it gives me an error.
Code (stock data is downloaded through the getSymbols function in quantmod):
#20dayEMA - 50dayEMA Technical indicator, Price and Volume
newEMA <- function(x){(removeNA(EMA(p[,6],n=50)-(EMA(p[,6],n=200))))
}
emaTA <- newTA(newEMA)
emaTA(col='lightgoldenrod3', 'Price')
Then it gives me this error message:
Error in naCheck(x, n) : Series contains non-leading NAs
Does anyone know how to remove these non-leading NAs?
You can use na.omit and there is no need to convert to an xts-object as this is the default.
library(quantmod)
getSymbols("VELO.CO")
p <- na.omit(VELO.CO)
newEMA <- function(x) {
EMA(p[,6], n = 20) - (EMA(p[,6], n = 50))
}
emaTA <- newTA(newEMA)
barChart(VELO.CO)
emaTA(col = "lightgoldenrod3", "Price")
I'm not familiar with the quantmod package, but I played around with your code and I think I found a working solution:
library("quantmod")
getSymbols("VELO.CO")
p <- as.xts(c(VELO.CO))
# remove incomplete cases
vec <- which(!complete.cases(p)) # rows 2305 2398
p2 <- p[-vec, ]
newEMA <- function(x) {
EMA(p2[, 6], n = 20) - (EMA(p2[, 6], n = 50))
}
emaTA <- newTA(newEMA)
barChart(VELO.CO)
emaTA(col = "lightgoldenrod3", "Price")
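Both answers remove incomplete rows, one with na.omit() and one by indexing with complete.cases(). The sketch below compares the two on a plain data frame, used here as a stand-in for the xts price object (an assumption made so that the example runs without quantmod); the column names and values are made up.

```r
# Hypothetical price data with two incomplete rows
prices <- data.frame(
  close  = c(10, NA, 12, 13),
  volume = c(100, 200, NA, 400)
)

# Approach 1: drop rows by negative index
vec <- which(!complete.cases(prices))
by_index <- prices[-vec, ]

# Approach 2: na.omit() does the same in one step
by_omit <- na.omit(prices)

# Pitfall of approach 1: with no incomplete rows, vec is empty
# and prices[-vec, ] selects nothing at all
full <- data.frame(close = c(1, 2, 3))
empty_vec <- which(!complete.cases(full))
lost_rows <- full[-empty_vec, , drop = FALSE]

# Logical indexing avoids that pitfall
kept_rows <- full[complete.cases(full), , drop = FALSE]

print(by_omit)
print(nrow(lost_rows))
```

For that reason na.omit(), or logical indexing with complete.cases(), is the safer choice when the data may contain no NAs at all.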

quantmod <- Having trouble writing a formula to extract single day returns without headers

I am attempting to write a function that returns a stock's single-day return, but I believe I'm having trouble with the data type of the periodReturn subset argument.
periodReturn(ticker,period='daily',subset='20161010::20161010')
works but
dayReturn <- function(ticker,date) {
ticker <- c(MSFT)
date <- c(20161010)
dayreturn <- periodReturn(ticker,period='daily',paste("subset='",date,"::",date,"'"))
dayreturn
}
gives an error:
dayReturn(msft,20161010)
daily.returns
Warning messages:
1: In as_numeric(YYYY) : NAs introduced by coercion
2: In as_numeric(MM) : NAs introduced by coercion
3: In as_numeric(DD) : NAs introduced by coercion
>
Thanks in advance for any advice!
You have a couple of syntax errors going on here inside your dayReturn function.
Here is reproducible code extracted from inside your function that will work:
library(quantmod)
getSymbols("MSFT")
ticker <- c(MSFT)
date <- c("20161010")
dayreturn <- periodReturn(ticker, period = 'daily', subset = paste0(date, "::", date))
Your errors:
date should be a string, not a number.
The string for the date range you want to subset is built incorrectly. Use subset = "YYYYMMDD::YYYYMMDD" (or subset = "YYYY-MM-DD::YYYY-MM-DD") inside periodReturn.
Your function would work better like this:
dayReturn <- function(ticker, date1 , date2) {
dayreturn <- periodReturn(ticker, period = 'daily', subset = paste0(date1, "::", date2))
dayreturn
}
dayReturn(MSFT, "20161010", "20161012")
# daily.returns
# 2016-10-10 0.004152284
# 2016-10-11 -0.014645107
# 2016-10-12 -0.001398811
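To see why the original call failed, it helps to print the strings that paste() and paste0() actually build. This sketch uses base R only; the variable names are illustrative.

```r
date <- 20161010

# The question's version: paste() inserts spaces between the pieces,
# and the text "subset=" becomes part of the string itself
asked <- paste("subset='", date, "::", date, "'")

# paste0() joins without separators and yields a clean range string
wanted <- paste0("20161010", "::", "20161010")

print(asked)
print(wanted)
```

The first string is passed as an unnamed positional argument rather than as the subset argument, so the date parser receives text it cannot convert, which is consistent with the "NAs introduced by coercion" warnings.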

Error in table(x, y) : attempt to make a table with >= 2^31 elements

I have a problem plotting my results. About two weeks ago I could use the code below to plot my data, but now I am getting an error.
data<- read.table("my_step.odt", header = FALSE, sep = "", quote="\"'", dec=".", as.is = FALSE, strip.white=FALSE, col.names=c(.......);
mgn_my <- data[1:49999,18]
sim <- data[1:49999, 21]
plot(sim , mgn_my , type="l",xlab="Time (ns)",ylab="mx")
Error:
Error in table(x, y) : attempt to make a table with >= 2^31 elements
Any suggestions?
I have had a similar problem as you before. Based on my response from another post, here's what I would suggest before you run plot:
Option 1: Use droplevels
mgn_my <- droplevels(data[1:49999,18])
Option 2: Use apply. This approach seems "friendlier" if you are familiar with apply-family functions in R. For example:
mgn_my <- data[1:49999,18]
apply(mgn_my,1,plot)
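Option 1 only helps if the column was read in as a factor, which is an assumption here (the read.table call uses as.is = FALSE). A small sketch with made-up values shows what droplevels() changes and how to recover numbers from a factor.

```r
# Hypothetical factor standing in for a column read as text
f <- factor(c("0.1", "0.2", "0.3", "0.4"))

# Subsetting keeps all original levels, even unused ones
sub <- f[1:2]
levels_before <- nlevels(sub)

# droplevels() removes the levels that no longer occur
trimmed <- droplevels(sub)
levels_after <- nlevels(trimmed)

# If the values are really numeric, convert via character first
as_numbers <- as.numeric(as.character(trimmed))

print(levels_before)
print(levels_after)
print(as_numbers)
```

Checking class(data[, 18]) before plotting would show whether the column is a factor in the first place.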
