R Shiny: Read CSV from webpage, incomplete final line found by readTableHeader

I have an issue that I can't seem to find a solution to. I am reading a CSV from a URL on the web which goes somewhat like this (the real one has 4 more columns):
Sample data from the CSV:
Date        Close
2017-10-23  156.17
2017-10-20  156.16
2017-10-19  155.98
2017-10-18  159.76
2017-10-17  160.47
2017-10-16  159.88
2017-10-13  156.99
2017-10-12  156
2017-10-11  156.55
The CSV file goes on like this (one record per day) back to around 1980. In R Shiny, I read the data like this...
stockData <- read.csv(url("https://www.quandl.com/api/v3/datasets/WIKI/FB/data.json?api_key=xTLatSPBnz751sCMECza"), header=T, sep=",")
...and then continue with this code:
plot(stockData$Date, stockData$Close, main="", type="l", las="1",
xlab="Date", ylab="Share Price", panel.first = grid())
points(x=stockData$Date, y=stockData$Close, col='#f44242', type='l', lwd=2)
grid (10,10, lty = 6, col = "lightgray")
...I get these errors:
Warning in read.table(file = file, header = header, sep = sep, quote = quote, :
incomplete final line found by readTableHeader on 'https://www.quandl.com/api/v3/datasets/WIKI/FB/data.json?api_key=xTLatSPBnz751sCMECza'
Warning in min(x) : no non-missing arguments to min; returning Inf
Warning in max(x) : no non-missing arguments to max; returning -Inf
Warning in min(x) : no non-missing arguments to min; returning Inf
Warning in max(x) : no non-missing arguments to max; returning -Inf
Warning: Error in plot.window: need finite 'xlim' values
I don't know which of these messages are related, so it would be great if someone could explain what I'm doing wrong. Is the file too big, and how can I test that? Or is the problem unrelated to size? (Note: when I downloaded the CSV and read the same (big) file locally, it worked.)

The URL you are reading returns JSON, not CSV, so it has to be parsed and then converted to a data frame. Use the code below:
library(jsonlite)
# Parse the JSON response into a nested list
df <- fromJSON("https://www.quandl.com/api/v3/datasets/WIKI/FB/data.json?api_key=xTLatSPBnz751sCMECza", flatten = TRUE)
# Build the data frame without turning text columns into factors
stockData <- data.frame(df$dataset_data$data, stringsAsFactors = FALSE)
names(stockData) <- df$dataset_data$column_names
# Convert the plotted columns from text to proper types
stockData$Date <- as.Date(stockData$Date)
stockData$Close <- as.numeric(stockData$Close)
plot(stockData$Date, stockData$Close, main="", type="l", las=1,
xlab="Date", ylab="Share Price")
points(x=stockData$Date, y=stockData$Close, col='#f44242', type='l', lwd=2)
grid(10, 10, lty = 6, col = "lightgray")
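To see how the warnings in the question relate to each other, here is a minimal base-R sketch. It assumes that read.csv() on a one-line JSON response yields a header-only data frame with no Date or Close column. The column name X.dataset_data and the helper has_plottable_values are hypothetical, used only for illustration.

```r
# Stand-in for what read.csv() may return when handed a one-line JSON document:
# a header-only data frame without Date or Close (hypothetical column name).
stockData <- data.frame(X.dataset_data = character(0))

# Hypothetical helper: TRUE only if a vector holds at least one finite number.
has_plottable_values <- function(v) {
  is.numeric(v) && any(is.finite(v))
}

# The column does not exist, so `$` returns NULL rather than raising an error.
missing_close <- is.null(stockData$Close)

# min()/max() over nothing give Inf/-Inf, which plot.window() cannot use as limits.
empty_range <- suppressWarnings(range(numeric(0)))

# Guard the plot instead of letting it fail with "need finite 'xlim' values".
if (!has_plottable_values(stockData$Close)) {
  message("Nothing to plot: check how the data was parsed")
}
```

Under that assumption, the readTableHeader warning, the min/max warnings, and the final plot.window error all come from the same cause: the data was never parsed into rows.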

Related

ChartSeries AddTA(OBV()) Error [TTR-Quantmod]

I have a ChartSeries error in my production code. Code Below
chartSeries(Stock, theme = chartTheme("white"), TA=c(addTA(ATR(Stock[,c("High","Low","Close")], n=14)), addTA(ADX(Stock[,c("High","Low","Close")])), addTA(OBV(Stock[,"Close"], Stock[,"Total.Trade.Quantity"])), addTA(chaikinAD(Stock[,c("High","Low","Close")], Stock[,"Total.Trade.Quantity"])), addTA(CMF(Stock[,c("High","Low","Close")], Stock[,"Total.Trade.Quantity"])), addRSI(), addSMI(), addMACD(type = "DEMA"), addBBands(), addDEMA(n = 20, on = 1, with.col = Cl, overlay = TRUE, col = "blue")), subset='last 4 months')
Error Code:
Error in seq.default(min(tav * 0.975, na.rm = TRUE), max(tav * 1.05, na.rm = TRUE), :
'from' must be a finite number
In addition: Warning messages:
1: In min(tav * 0.975, na.rm = TRUE) :
no non-missing arguments to min; returning Inf
2: In max(tav * 1.05, na.rm = TRUE) :
no non-missing arguments to max; returning -Inf
Data file info:
So my data file, an xts styled OHLCV (csv), has 1 row of a total of 4718 rows with 3 NA values (on the first row of the file). The rest of the rows are completely filled with no other NA values.
Edit:
I omitted the row containing NA values and still get the same error, so the error must be caused by something else.
Edit 2:
So I found that the error is localized to the addTA(OBV(Stock[,"Close"], Stock[,"Total.Trade.Quantity"])) function/arguments. Any suggestions or tips?
This code resolves your problem:
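library(quantmod)
# Download AAPL prices (example data; substitute your own OHLCV xts object)
getSymbols("AAPL")
Stock <- AAPL["2018-08"]
chartSeries(Stock, theme="white")
addTA(OBV(Cl(Stock), Vo(Stock)))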
(Figure: stock price chart with OBV added.)

Creating a scatterplot for each value of the first column in an r dataframe

Using the file below, I am trying to create 2 scatterplots. One scatterplot compares the 2nd and 3rd columns when the first column is equal to "coat", and the second scatterplot compares the 2nd and 3rd columns when the first column is equal to "hat".
file.txt
clothing,freq,temp
coat,0.3,10
coat,0.9,0
coat,0.1,20
hat,0.5,20
hat,0.3,15
hat,0.1,5
This is the script I have written
script.R
rates = read.csv("file.txt")
for(i in unique(rates[1])){
plot(unlist(rates[2])[rates[1] == toString(i)],unlist(rates[3])[rates[1] == toString(i)])
}
I receive this error when running it
Error in plot.window(...) : need finite 'xlim' values
Calls: plot -> plot.default -> localWindow -> plot.window
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
Execution halted
The script works if I replace "toString(i)" with "hat", but then it can obviously only make one of the scatterplots.
EDIT
I edited my script slightly. It creates a graph for the first iteration through the loop but not for any iteration after the first.
This is my script
rates = read.csv("file.txt")
for(i in unique(rates[,1])){
plot(unlist(rates[2])[rates[1] == toString(i)],unlist(rates[3])[rates[1] == toString(i)])
file.rename("Rplots.pdf", paste(i,".pdf",sep=""))
}
This is what happens when I execute the script
name#server:/directory> ./script.R
Warning message:
In file.rename("Rplots.pdf", paste(i, ".pdf", sep = "")) :
cannot rename file 'Rplots.pdf' to 'hat.pdf', reason 'No such file or directory'
name#server:/directory> ls
coat.pdf file.txt script.R*
Try this:
rates = read.table("file.txt",sep=',',header=TRUE)
cloth_type<-unique(rates[,1])
for (i in 1:length(cloth_type)){
dev.new()
index_included=which(rates[,1]==cloth_type[i])
plot(rates[index_included,2],rates[index_included,3],main=cloth_type[i],
xlab="freq ", ylab="temp ", pch=19)
}
Maybe the dplyr package would be helpful.
To install the package:
install.packages('dplyr')
Then you can use the filter function to generate a separate data frame for each clothing type:
library('dplyr')
rates <- read.csv("file.txt")
cloathTypes <- unique(rates$clothing)
for(cloath in cloathTypes){
d <- filter(rates, clothing == cloath)
plot(d$freq, d$temp, xlab = 'Freq', ylab='Temp', main=cloath)
}
I think your issue arises from calling unique() on a data.frame, which produces another data.frame rather than a vector to iterate over. Provided strings are imported as factors (the default before R 4.0.0; in later versions pass stringsAsFactors = TRUE), you should be able to output the plots side-by-side as follows:
## input data
rates = data.frame(clothing = c(rep("coat", 3), rep("hat", 3)),
freq = c(0.3, 0.9, 0.1, 0.5, 0.3, 0.1),
temp = c(10, 0, 20, 20, 15, 5))
## store original plotting parameters
op = par(no.readonly = TRUE)
## modify plotting parameters to produce side-by-side plots
par(mfrow = c(1, 2))
## output plots
for(i in levels(rates[,1])){
plot(rates[,2][rates[,1] == i], rates[,3][rates[,1] == i])
}
## reset plotting pars
par(op)
If you want to produce separate plots just remove the par lines.
You can do this pretty easily with ggplot
Your data as a data.frame
df <- data.frame(clothing=c(rep("coat",3),rep("hat",3)),
freq=c(0.3,0.9,0.1,0.5,0.3,0.1),
temp=c(10,0,20,20,15,5),
stringsAsFactors=F)
Plotting freq on x, temp on y, and coloring points by clothing
library(ggplot2)
ggplot(df, aes(freq, temp, colour=clothing)) +
geom_point()
Change for(i in unique(rates[1])) to for(i in unique(rates[,1])) and add dev.new() into the for loop
rates = read.csv("file.txt")
for(i in unique(rates[,1])){
dev.new()
plot(unlist(rates[2])[rates[1] == toString(i)],unlist(rates[3])[rates[1] == toString(i)])
file.rename("Rplots.pdf", paste(i,".pdf",sep=""))
}
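As an alternative to renaming Rplots.pdf, the following sketch opens a named PDF device for each group and closes it before moving on. When a script runs non-interactively, the default Rplots.pdf device likely stays open after the first plot, which would explain why the rename fails on the second iteration. The output directory (tempdir()) and the variable names are assumptions made for illustration.

```r
# Same data as file.txt, built inline so the example is self-contained.
rates <- data.frame(
  clothing = c("coat", "coat", "coat", "hat", "hat", "hat"),
  freq     = c(0.3, 0.9, 0.1, 0.5, 0.3, 0.1),
  temp     = c(10, 0, 20, 20, 15, 5),
  stringsAsFactors = FALSE
)

# split() returns one data frame per distinct value of the first column.
groups <- split(rates, rates$clothing)

out_dir   <- tempdir()   # assumed output location; change as needed
out_files <- character(0)

for (name in names(groups)) {
  out_file <- file.path(out_dir, paste0(name, ".pdf"))
  pdf(out_file)                       # open a device for this group only
  plot(groups[[name]]$freq, groups[[name]]$temp,
       main = name, xlab = "freq", ylab = "temp")
  dev.off()                           # close it so the file is complete
  out_files <- c(out_files, out_file)
}
```

Because each file is opened and closed explicitly, no renaming is needed and every group gets its own complete PDF.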

R package "scholar" / getting the citation history of an article

I have a problem with the R package scholar
What works:
get_citation_history(SSalzberg)
What doesn't:
get_article_cite_history(SSalzberg, "any article")
Code:
article <- "Ultrafast and memory-efficient alignment of short DNA sequences to the human genome"
SSalzberg <- "sUVeH-4AAAAJ"  # Google Scholar ID
get_article_cite_history(SSalzberg, article)
Error Message:
Error in min(years):max(years) : result would be too long a vector
In addition: Warning messages:
1: In min(years) : no non-missing arguments to min; returning Inf
2: In max(years) : no non-missing arguments to max; returning -Inf
I do not understand the error message in the context of that function. I also tried another paper by another author, without success. I don't know what I am missing here. Thanks
You have to use an article ID, not the title of the article. Probably the easiest way to get this is to retrieve the full list of pubs, which has a pubid column ...
library(scholar)
SSalzberg <- "sUVeH-4AAAAJ"
all_pubs <- get_publications(SSalzberg)
## next step is cosmetic -- the equivalent of stringsAsFactors=FALSE
all_pubs <- as.data.frame(lapply(all_pubs,
function(x) if (is.factor(x)) as.character(x) else x))
w <-grep("Ultrafast",all_pubs$title) ## publication number 3
all_pubs$title[w]
## [1] Ultrafast and memory-efficient alignment of ...
all_pubs$pubid[w] ## "Tyk-4Ss8FVUC"
ch <- get_article_cite_history(SSalzberg,all_pubs$pubid[w])
plot(cites~year,ch,type="b")

Error while creating a Timeseries plot in R: Error in plot.window(xlim, ylim, log, ...) : need finite 'ylim' values

Here's a sample of my single column data set:
Lines
141,523
146,785
143,667
65,560
88,524
148,422
I read this file as a .csv file, convert it into a ts object and then plot it:
##Read the actual number of lines CSV file
Aclines <- read.csv(file.choose(), header=T, stringsAsFactors = F)
Aclinests <- ts(Aclines[,1], start = c(2013), end = c(2015), frequency = 52)
plot(Aclinests, ylab = "Actual_Lines", xlab = "Time", col = "red")
I get the following error message:
Error in plot.window(xlim, ylim, log, ...) : need finite 'ylim' values
In addition: Warning messages:
1: In xy.coords(x, NULL, log = log) : NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf
I thought this might be because of the "," in the columns and tried to use sapply to take care of that as advised here:
need finite 'ylim' values-error
plot(sapply(Aclinests, function(x)gsub(",",".",x)))
But I got the following error:
Error in plot(sapply(Aclinests, function(x) gsub(",", ".", x))) :
error in evaluating the argument 'x' in selecting a method for function 'plot': Error in sapply(Aclinests, function(x) gsub(",", ".", x)) :
'names' attribute [105] must be the same length as the vector [1]
Here is the head of my original and ts data set if it might help:
> head(Aclines)
Lines
1 141,523
2 146,785
3 143,667
4 65,560
5 88,524
6 148,422
> head(Aclinests)
[1] "141,523" "146,785" "143,667" "65,560" "88,524" "148,422"
Also, if I read the .csv file as:
Aclines <- read.csv(file.choose(), header=T, stringsAsFactors = T)
Then I am able to plot the ts object, but head(Aclinests) gives the output below, which is not consistent with my original data:
> head(Aclinests)
[1] 14 27 17 84 88 36
Please advise on how I can plot this ts object.
The simplest way to avoid this, in my case, was to remove the commas in the Excel file containing the data. This can be done with simple Excel commands, and it worked for me.
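The same cleanup can also be done in R without editing the file. This is a minimal sketch that assumes the commas are thousands separators (so "141,523" means 141523). The sample values are copied from the question and the variable names are illustrative.

```r
# Values as they arrive from read.csv(..., stringsAsFactors = FALSE): character strings.
lines_raw <- c("141,523", "146,785", "143,667", "65,560", "88,524", "148,422")

# Remove the thousands separators, then convert to numeric.
lines_num <- as.numeric(gsub(",", "", lines_raw, fixed = TRUE))

# Build a weekly time series from the cleaned numbers.
Aclinests <- ts(lines_num, start = 2013, frequency = 52)
```

With numeric values, plot(Aclinests) has finite limits to work with. It also explains the odd output when stringsAsFactors = T is used: the series then contains the integer codes of the factor levels, not the original numbers.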

Error when plotting HoltWinters graph

The following R code is giving me an error when trying to plot the HoltWinters graph as done here:
# init X
X11()
# get data
mydata = read.csv("lookup.csv", header=TRUE, stringsAsFactors=FALSE)
# data post-proc
mydata = as.data.frame(mydata)
mydata$Time = as.POSIXlt(mydata$Time, format='%d.%m.%Y %H:%M:%S')
# create time series - hourly data -> 8765 hours/year
dataTimeSeries <- ts(mydata$Close, frequency=8765)
dataForecasts = HoltWinters(dataTimeSeries, beta=FALSE, gamma=FALSE)
# output
plot.ts(dataForecasts)
message("Press Return To Continue")
invisible(readLines("stdin", n=1))
The error I'm getting is:
$ Rscript simple_forecast.R
Error in xy.coords(x, NULL, log = log) :
(list) object cannot be coerced to type 'double'
Calls: plot.ts -> plotts -> xy.coords
Execution halted
I'm quite perplexed, since print(dataForecasts) prints the correct data. I can also plot dataTimeSeries without a problem.
lookup.csv (pastebin)
Generally one should rely upon R to do the dispatch of class-dependent functions, and do notice that the example you cited at Avril Coghlan's page only used plot, not plot.ts.
(m <- HoltWinters(co2))
plot.ts(m)
Error in xy.coords(x, NULL, log = log) :
(list) object cannot be coerced to type 'double'
plot(m) # success
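A short sketch of why the generic works where plot.ts does not, using the built-in co2 series from the answer above. The variable names are illustrative.

```r
# Fit the model on the built-in co2 series, as in the answer above.
m <- HoltWinters(co2)

# The result is a list with class "HoltWinters", not a ts object,
# so plot.ts() cannot coerce it to numeric coordinates.
model_class <- class(m)
model_is_ts <- is.ts(m)

# plot() dispatches on the class and finds the dedicated method in stats.
plot_method <- getS3method("plot", "HoltWinters", optional = TRUE)
has_method  <- is.function(plot_method)
```

Calling plot(m) therefore ends up in the HoltWinters-specific method, which knows how to draw the observed and fitted series.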
