R XBRL package "404 Not Found" error with Ubuntu

The following R code works fine from my Windows 8 laptop:
library(XBRL)
inst <- "https://www.sec.gov/Archives/edgar/data/51143/000104746916010329/ibm-20151231.xml"
options(stringsAsFactors = FALSE)
xbrl.vars <- xbrlDoAll(inst, cache.dir = "XBRLcache", prefix.out = NULL, verbose = TRUE)
However, when I attempt to run it from my Ubuntu 16.04 machine, I receive the following output:
Error in fileFromCache(file) :
Error in download.file(file, cached.file, method = "auto", quiet = !verbose) :
cannot download all files
In addition: Warning message:
In download.file(file, cached.file, method = "auto", quiet = !verbose) :
URL 'https://www.sec.gov/Archives/edgar/data/51143/000104746916010329/ibm-20151231.xsd': status was '404 Not Found'
It finds the initial XML instance file but then cannot download the schema that file references. Any help would be appreciated. Thanks in advance.
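The warning comes from download.file(..., method = "auto"), and "auto" can resolve to different back ends on different platforms. One way to narrow the problem down is to request the schema URL from the warning with each download method explicitly. This is a minimal diagnostic sketch that assumes nothing about the XBRL package's internals; try_download_methods and its downloader argument are hypothetical names introduced here for illustration.

```r
# Hypothetical helper: try one URL with several download.file() methods and
# report which ones succeed. `downloader` is injectable so the logic can be
# exercised without network access.
try_download_methods <- function(url,
                                 methods = c("libcurl", "wget", "curl"),
                                 downloader = utils::download.file) {
  vapply(methods, function(m) {
    dest <- tempfile(fileext = ".xsd")
    on.exit(unlink(dest), add = TRUE)
    ok <- tryCatch(
      # download.file() returns 0 (invisibly) on success
      suppressWarnings(downloader(url, dest, method = m, quiet = TRUE)) == 0,
      error = function(e) FALSE
    )
    isTRUE(ok)
  }, logical(1))
}

# Usage (needs network access), with the URL taken from the warning above:
# try_download_methods(
#   "https://www.sec.gov/Archives/edgar/data/51143/000104746916010329/ibm-20151231.xsd"
# )
```

If every method fails on Ubuntu while the same URL works on Windows, the difference is more likely on the server or network side than in the choice of download back end.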

Related

Trying to install the Performance Analytics in R

I'm trying to install the package 'PerformanceAnalytics' with the R script below:
tryCatch(
  expr = {
    #library('PerformanceAnalytics', verbose = FALSE, quietly = TRUE)
    library("PerformanceAnalytics", lib.loc = .libPaths(), verbose = FALSE, quietly = TRUE)
  },
  error = function(e) {
    print(e)
    install.packages('PerformanceAnalytics', repos = 'https://cran.rstudio.com', verbose = FALSE)
    #library('PerformanceAnalytics', verbose = FALSE, quietly = TRUE)
    library("PerformanceAnalytics", lib.loc = .libPaths(), verbose = FALSE, quietly = TRUE)
  }
)
But I get the following error message (RStudio version 1.2.1335). Does anyone know if I've made a mistake somewhere?
Thanks
unable to access index for repository https://cran.rstudio.com/bin/windows/contrib/3.1
Installing package into ‘C:/Users/xxxxx/Documents/R/win-library/3.1’
(as ‘lib’ is unspecified)
Warning in install.packages :
unable to access index for repository https://cran.rstudio.com/bin/windows/contrib/3.1
Warning in install.packages :
package ‘PerformanceAnalytics’ is not available (as a binary package for R version 3.1.3)
Error in library("PerformanceAnalytics", lib.loc = .libPaths(), verbose = FALSE, :
there is no package called ‘PerformanceAnalytics’
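The log says the repository has no binary of the package for R version 3.1.3, which suggests the script itself is not the main problem and that upgrading R is probably the real fix. For reference, here is a slightly simpler load-or-install pattern, sketched under the assumption that the repository serves the package for the running R version. ensure_package is a hypothetical helper name, and the repository URL is only an example default.

```r
# Hypothetical helper: install a package only if it is missing, then report
# whether it can be loaded. Assumes `repos` serves the running R version.
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # TRUE only if the package is now available
  requireNamespace(pkg, quietly = TRUE)
}

# Show which R version the session is running; binaries are built per version.
print(R.version.string)

# Usage:
# if (ensure_package("PerformanceAnalytics")) library(PerformanceAnalytics)
```

Using requireNamespace() for the check avoids relying on an error handler to trigger the installation.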

Error: In download.file(url = blast_ftp, destfile = dwl_file, mode = "wb")

I am trying to run this command to install an R package's dependencies, but I get the errors below. How can I fix this? Thank you.
buildDependencies(path_to_reference_data = "C:\\Users\\cc\\Desktop\\Data\\Tax4Fun\\Tax4Fun2_ReferenceData_v2", install_suggested_packages = T, use_force=T)
Install Tax4Fun2 dependencies
trying URL 'ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.9.0/ncbi-blast-2.9.0+-x64-win64.tar.gz'
downloaded 91.8 MB
Error in system(command = blast_bin, intern = T) :
'C:\Users\cc\Desktop\Data' not found
In addition: Warning message:
In download.file(url = blast_ftp, destfile = dwl_file, mode = "wb") :
the 'wininet' method is deprecated for ftp:// URLs

How to avoid "Error in cachaca(catalog[i, "full_url"], tf, mode = "wb", filesize_fun = "unzip_verify") " when downloading brazilian census microdata?

When using Anthony Joseph Damico's R infrastructure for survey data (http://asdfree.com/), I have encountered some errors when dealing with Brazilian census data. The website provides the relevant lines of code here (http://asdfree.com/prerequisites.html) and here (http://asdfree.com/brazilian-censo-demografico-censo.html).
I use the following lines of code to begin downloading census data:
install.packages( "devtools" , repos = "http://cran.rstudio.com/" )
library(devtools)
install_github( "ajdamico/lodown" , dependencies = TRUE )
install.packages( "convey" , repos = "http://cran.rstudio.com/" )
install.packages( "srvyr" , repos = "http://cran.rstudio.com/" )
library(lodown)
lodown( "censo" , output_dir = file.path( path.expand( getwd( ) ) , "CENSO" ) )
This yields the following error:
Error in cachaca(catalog[i, "full_url"], tf, mode = "wb", filesize_fun = "unzip_verify") :
download failed after 3 attempts
Warning in for (i in seq_along(cenv$extra)) { :
closing unused connection 7 (C:/Program Files/R/R-4.1.0/library/lodown/extdata/censo/LE_FAMILIAS.sas)
Warning in for (i in seq_along(cenv$extra)) { :
closing unused connection 6 (C:/Program Files/R/R-4.1.0/library/lodown/extdata/censo/LE_DOMIC.sas)
Warning in for (i in seq_along(cenv$extra)) { :
closing unused connection 5 (C:/Program Files/R/R-4.1.0/library/lodown/extdata/censo/LE_PESSOAS.sas)
Warning in for (i in seq_along(cenv$extra)) { :
closing unused connection 4 (C:/Program Files/R/R-4.1.0/library/lodown/extdata/censo/SASinputDom.txt)
Warning in for (i in seq_along(cenv$extra)) { :
closing unused connection 3 (C:/Program Files/R/R-4.1.0/library/lodown/extdata/censo/SASinputPes.txt)
How can I avoid this? When running the lines of code, a "CENSO" folder is created in my directory. It is, however, empty.
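One thing that may be worth ruling out is R's download timeout, which defaults to 60 seconds and can be too short for large archives. The sketch below raises it for the current session before retrying. Whether lodown's cachaca() honours this option is an assumption, not something the source confirms.

```r
# Raise R's download timeout (in seconds) for this session; the default is 60.
# The previous value is kept in `old` so it can be restored afterwards.
old <- options(timeout = max(3600, getOption("timeout")))
print(getOption("timeout"))

# Then retry the download from the question:
# library(lodown)
# lodown("censo", output_dir = file.path(getwd(), "CENSO"))

# Restore the previous setting when finished:
# options(old)
```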

gganimate package in R cannot link to ImageMagick

I'm running the awesome gganimate package, but I can't get the basic example to work.
library(gapminder)
library(ggplot2)
library(gganimate)
theme_set(theme_bw())
p <- ggplot(gapminder, aes(gdpPercap, lifeExp, size = pop, color = continent, frame = year)) + geom_point() + scale_x_log10()
gg_animate(p)
Returns:
sh: C:\Program Files\ImageMagick-6.9.0-Q16\convert.exe: command not found
Error in cmd.fun(sprintf("%s --version", shQuote(ani.options("convert"))), : error in running command
sh: convert: command not found
Error in cmd.fun(sprintf("%s --version", convert), intern = TRUE) :
error in running command
I cannot find ImageMagick with convert = 'convert'
Error in file(file, "rb") : cannot open the connection
In addition: Warning messages:
1: In im.convert(img.files, output = movie.name, convert = convert, :
Please install ImageMagick first or put its bin path into the system PATH variable
2: In normalizePath(movie.name) :
path[1]="file4421bbcfb7d.gif": No such file or directory
3: In file(file, "rb") :
cannot open file '/var/folders/4y/5nw21h2j1dz9p960gfvl16ywlrpqh8/T//RtmpRywpkt/gganimate/file4421bbcfb7d.gif': No such file or directory
I'm almost sure ImageMagick is installed. What is the missing link?
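The output mixes a Windows-style convert path (C:\Program Files\...) with `sh:` messages and a /var/folders temporary directory, which look like a Unix-like system, so the configured path may simply not exist on this machine. A quick way to check what R can actually see on the PATH is sketched below. find_convert is a hypothetical helper, and the candidate names "magick" and "convert" are assumptions about how ImageMagick is installed.

```r
# Hypothetical helper: return the full paths of any ImageMagick executables
# that R can find on the PATH. Sys.which() returns "" for names not found.
find_convert <- function(candidates = c("magick", "convert")) {
  found <- Sys.which(candidates)
  found[nzchar(found)]
}

print(find_convert())

# If something is found, the animation package (whose ani.options("convert")
# appears in the error above) can be pointed at it:
# animation::ani.options(convert = unname(find_convert()[1]))
```

An empty result means the directory holding ImageMagick is not on the PATH that the R session inherits, even if ImageMagick is installed.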

Error trying to read a PDF using readPDF from the tm package

(Windows 7 / R version 3.0.1)
Below the commands and the resulting error:
> library(tm)
> pdf <- readPDF(PdftotextOptions = "-layout")
> dat <- pdf(elem = list(uri = "17214.pdf"), language="de", id="id1")
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
cannot open file 'C:\Users\Raffael\AppData\Local\Temp\RtmpS8Uql1\pdfinfo167c2bc159f8': No such file or directory
How do I solve this issue?
EDIT I
(As suggested by Ben and described here)
I downloaded Xpdf and copied the 32-bit version to
C:\Program Files (x86)\xpdf32
and the 64-bit version to
C:\Program Files\xpdf64
The environment variables pdfinfo and pdftotext point to the respective executables, either the 32-bit ones (tested with 32-bit R) or the 64-bit ones (tested with 64-bit R).
EDIT II
One very confusing observation is that starting from a fresh session (tm not loaded) the last command alone will produce the error:
> dat <- pdf(elem = list(uri = "17214.pdf"), language="de", id="id1")
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
cannot open file 'C:\Users\Raffael\AppData\Local\Temp\RtmpKi5GnL\pdfinfode8283c422f': No such file or directory
I don't understand this at all, because at that point the variable pdf has not yet been assigned by tm's readPDF. Below is the function that pdf refers to "naturally", followed by what readPDF returns:
> pdf
function (elem, language, id)
{
meta <- tm:::pdfinfo(elem$uri)
content <- system2("pdftotext", c(PdftotextOptions, shQuote(elem$uri),
"-"), stdout = TRUE)
PlainTextDocument(content, meta$Author, meta$CreationDate,
meta$Subject, meta$Title, id, meta$Creator, language)
}
<environment: 0x0674bd8c>
> library(tm)
> pdf <- readPDF(PdftotextOptions = "-layout")
> pdf
function (elem, language, id)
{
meta <- tm:::pdfinfo(elem$uri)
content <- system2("pdftotext", c(PdftotextOptions, shQuote(elem$uri),
"-"), stdout = TRUE)
PlainTextDocument(content, meta$Author, meta$CreationDate,
meta$Subject, meta$Title, id, meta$Creator, language)
}
<environment: 0x0c3d7364>
Apparently there is no difference - then why use readPDF at all?
EDIT III
The pdf file is located here: C:\Users\Raffael\Documents
> getwd()
[1] "C:/Users/Raffael/Documents"
EDIT IV
First instruction in pdf() is a call to tm:::pdfinfo() - and there the error is caused within the first few lines:
> outfile <- tempfile("pdfinfo")
> on.exit(unlink(outfile))
> status <- system2("pdfinfo", shQuote(normalizePath("C:/Users/Raffael/Documents/17214.pdf")),
+ stdout = outfile)
> tags <- c("Title", "Subject", "Keywords", "Author", "Creator",
+ "Producer", "CreationDate", "ModDate", "Tagged", "Form",
+ "Pages", "Encrypted", "Page size", "File size", "Optimized",
+ "PDF version")
> re <- sprintf("^(%s)", paste(sprintf("%-16s", sprintf("%s:",
+ tags)), collapse = "|"))
> lines <- readLines(outfile, warn = FALSE)
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
cannot open file 'C:\Users\Raffael\AppData\Local\Temp\RtmpquRYX6\pdfinfo8d419174450': No such file or direc
Apparently tempfile() simply doesn't create a file.
> outfile <- tempfile("pdfinfo")
> outfile
[1] "C:\\Users\\Raffael\\AppData\\Local\\Temp\\RtmpquRYX6\\pdfinfo8d437bd65d9"
The folder C:\Users\Raffael\AppData\Local\Temp\RtmpquRYX6 exists and holds some files but none is named pdfinfo8d437bd65d9.
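For context, tempfile() only returns a unique path and never creates the file. In the snippet above, the file is supposed to be written by system2(..., stdout = outfile), so a missing file suggests that the pdfinfo command itself did not run. That is a likely reading rather than something the source confirms. A minimal sketch of both points:

```r
# tempfile() generates a name only; nothing exists on disk yet.
outfile <- tempfile("pdfinfo")
created_by_tempfile <- file.exists(outfile)
print(created_by_tempfile)

# Sys.which() returns "" when the executable is not on the PATH, in which
# case system2("pdfinfo", ...) has nothing to run and writes no output file.
pdfinfo_path <- Sys.which("pdfinfo")
print(pdfinfo_path)

# The file appears only once something writes to that path.
writeLines("Title: example", outfile)
exists_after_write <- file.exists(outfile)
print(exists_after_write)

unlink(outfile)  # clean up
```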
Interesting: on my machine, after a fresh start, pdf is the grDevices graphics device function, which writes plots to a PDF file:
getAnywhere(pdf)
A single object matching ‘pdf’ was found
It was found in the following places
package:grDevices
namespace:grDevices [etc.]
But back to the problem of reading in PDF files as text: fiddling with the PATH is a bit hit-and-miss (and annoying if you work across several different computers), so I think the simplest and safest method is to call pdftotext using system, as Tony Breyal describes here.
In your case it would be (note the two sets of quotes):
system(paste('"C:/Program Files/xpdf64/pdftotext.exe"',
'"C:/Users/Raffael/Documents/17214.pdf"'), wait=FALSE)
This could easily be extended with an *apply function or loop if you have many PDF files.
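A minimal sketch of that extension, assuming the Xpdf location used above and relying on pdftotext's default behaviour of writing file.txt next to file.pdf when no output name is given. build_pdftotext_cmd and convert_all_pdfs are hypothetical helper names.

```r
# Path to the executable, as used in the answer above (adjust per machine).
pdftotext_exe <- "C:/Program Files/xpdf64/pdftotext.exe"

# Build one command line, double-quoting both paths so spaces survive.
build_pdftotext_cmd <- function(pdf_path, exe = pdftotext_exe) {
  paste(shQuote(exe, type = "cmd"), shQuote(pdf_path, type = "cmd"))
}

# Convert every PDF in `dir` and return the expected .txt paths.
convert_all_pdfs <- function(dir, exe = pdftotext_exe) {
  pdf_files <- list.files(dir, pattern = "\\.pdf$", full.names = TRUE,
                          ignore.case = TRUE)
  cmds <- vapply(pdf_files, build_pdftotext_cmd, character(1), exe = exe)
  for (cmd in cmds) {
    # wait = TRUE so each text file is complete before the next call
    system(cmd, wait = TRUE)
  }
  sub("\\.pdf$", ".txt", pdf_files, ignore.case = TRUE)
}

# Usage:
# txt_files <- convert_all_pdfs("C:/Users/Raffael/Documents")
# texts <- lapply(txt_files, readLines, warn = FALSE)
```

Unlike the single-file call above, this waits for each conversion to finish so the text files can be read straight afterwards.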
