Using the StreamSets Jython (Python 2.7) processor, when I make an API call with the Python requests module:
r = requests.get("https://someurl.com", headers={"Authorization":"Bearer sometokenstring"})
I get an error:
INFO java.util.zip.DataFormatException: invalid code lengths set.
The same code runs without this error in a Linux terminal with Python 2.7. Any ideas on how to resolve it?
This was resolved by adding the request header "Accept-Encoding": "deflate", because the zip-format (compressed) response data was causing the problem. The request now looks like:
r = requests.get("https://someurl.com", headers={"Authorization":"Bearer sometokenstring","Accept-Encoding":"deflate"})
I am trying to decompress a bunch of .ZST files so I can access them, but I do not know what the original files were. All of the .ZSTs return the exact same error and do not get decompressed. The error is Decoding error (36) : Dictionary mismatch. The command used is zstd -d * on Windows 10 x64 using ZSTD v1.4.4 for Win x64.
I have already tried CMD, PowerShell and Bash as different environments to run the command but all return the exact same error. I have tried decompressing a single individual file to see if it was a bulk-operation issue but it didn't work either. My last attempt was to Google for the error but I could not find anything.
Edit: After investigating a little further, I checked the MIME types of my ZST files. Some are reported as application/x-zstd while others are reported as application/octet-stream. I wonder if this could be the issue? Neither MIME type works, though; both return the same error.
Does anyone know how I could fix this error and get to decompress my files?
Here is one of the ZST files for reference: https://mega.nz/#!eV0VTKBQ!WBW_pVIq8Tsn2Rrv3XKmt4DSAH7IHbHtaAuNB9uRTMQ
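One thing worth checking, as a hedged suggestion: "Dictionary mismatch" usually means the frames were compressed with a dictionary that the decoder was not given. If so, the original dictionary file is needed (the zstd CLI takes it via -D). The sketch below reads the start of a file and reports the dictionary ID recorded in the Zstandard frame header, if any. The function name zstd_dictionary_id is made up for illustration, and the layout follows my reading of the Zstandard frame format (magic number, frame header descriptor, optional window descriptor, optional dictionary ID).

```python
import struct

ZSTD_MAGIC = 0xFD2FB528  # stored little-endian on disk: 28 B5 2F FD


def zstd_dictionary_id(header):
    """Return the dictionary ID from a Zstandard frame header, or None if absent.

    `header` is the first bytes of the file (10 bytes is more than enough).
    """
    if len(header) < 5 or struct.unpack_from("<I", header, 0)[0] != ZSTD_MAGIC:
        raise ValueError("not a Zstandard frame")
    descriptor = header[4]
    did_flag = descriptor & 0x03               # bits 0-1: size code of the dictionary ID
    single_segment = (descriptor >> 5) & 0x01  # bit 5: window descriptor is omitted
    did_size = (0, 1, 2, 4)[did_flag]
    if did_size == 0:
        return None
    offset = 5 + (0 if single_segment else 1)  # skip the window descriptor when present
    field = header[offset:offset + did_size]
    if len(field) < did_size:
        raise ValueError("truncated frame header")
    return int.from_bytes(field, "little")


# Usage (hypothetical file name):
# with open("somefile.zst", "rb") as fh:
#     print(zstd_dictionary_id(fh.read(16)))
```

A non-None result would suggest the file cannot be decoded without the matching dictionary. A None result does not rule a dictionary out, since the ID field is optional.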
My attempt to exclude the check for the EOL char on my Windows machine always results in this error message:
>vendor\bin\phpcs.bat --standard=PSR2 --exclude=Generic.Files.LineEndings.InvalidEOLChar src\version.php
ERROR: The specified sniff code "Generic.Files.LineEndings.InvalidEOLChar" is invalid
Run "phpcs --help" for usage information
Can't figure out what I'm doing wrong. I have installed PHP CodeSniffer via composer and am running version 3.4.0.
The --exclude CLI argument accepts 3-part sniff codes, but you've passed in a 4-part error code.
In your case, the sniff code is Generic.Files.LineEndings and that sniff only generates a single error code, so you'll be fine ignoring the entire sniff:
vendor\bin\phpcs.bat --standard=PSR2 --exclude=Generic.Files.LineEndings src\version.php
If you want to exclude individual error codes, or if you just want to lock down a standard for your project, you'll need to use a ruleset.xml file: https://github.com/squizlabs/PHP_CodeSniffer/wiki/Annotated-Ruleset
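To illustrate the difference between the two kinds of code, here is a minimal sketch that trims a 4-part error code down to the 3-part sniff code that --exclude expects. The helper name sniff_code is hypothetical and not part of PHP_CodeSniffer; it only assumes the Standard.Category.Sniff.ErrorCode naming described above.

```python
def sniff_code(code):
    """Reduce a PHP_CodeSniffer code to its 3-part form: Standard.Category.Sniff."""
    parts = code.split(".")
    if len(parts) < 3:
        raise ValueError("expected at least Standard.Category.Sniff, got %r" % code)
    # Anything after the third part is the individual error code within the sniff.
    return ".".join(parts[:3])


print(sniff_code("Generic.Files.LineEndings.InvalidEOLChar"))
# prints: Generic.Files.LineEndings
```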
I have the following script, which works perfectly fine, when I run it on my local PC:
library(RAdwords)
autx <- doAuth()
data <- getData(clientCustomerId='xxx-xxx-xxxx',
google_auth=autx
)
However, when I try to run the very same script on my Unix-Server, then I get this error message:
Error in rjson::fromJSON(RCurl::postForm("https://accounts.google.com/o/oauth2/token", :
STRING_ELT() can only be applied to a 'character vector', not a 'raw'
Question: What could be the reason and how can I fix it?
By the way:
I did copy the files .gitignore and .google.auth.RData from the folder on my local PC, where I had already done this authentication, to the directory on my server.
If I just type doAuth() alone I do not get an error message.
Issue:
getData() calls the function refreshToken(), which updates the authentication token of the Google AdWords API. Within refreshToken(), the RCurl command returns a raw vector instead of a character vector, so rjson::fromJSON fails. Wrapping the result in rawToChar() solves the error.
Solution:
I created a patch of the function and updated the Github development version of RAdwords.
You can install the new package version with:
require(devtools)
install_github('jburkhardt/RAdwords')
I am trying to download a set of NetCDF files from: ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/nwm/prod/nwm.20180425/medium_range/
When I manually download the files I have no issues connecting, but when I use download.file and attempt to connect I get the following error:
Assertion failed!
Program: C:\Program Files\Rstudio\bin\rsession.exe
File: nc4file.c, Line 2771
Expression: 0
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
I have attempted to run the code in R without R studio and got the same result.
My abbreviated code is as follows:
library("ncdf4")
library("ncdf4.helpers")
download.file("ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/nwm/prod/nwm.20180425/medium_range/nwm.t00z.medium_range.channel_rt.f006.conus.nc","c:/users/nt/desktop/nwm.t00z.medium_range.channel_rt.f006.conus.nc")
temp = nc_open("c:/users/nt/desktop/nwm.t00z.medium_range.channel_rt.f006.conus.nc")
Adding mode = 'wb' to the download.file arguments solves the issue for me. I've had the same problem when downloading PDFs.
download.file("ftp://ftpprd.ncep.noaa.gov/pub/data/nccf/com/nwm/prod/nwm.20180425/medium_range/nwm.t00z.medium_range.channel_rt.f006.conus.nc","C:/teste/teste.nc", mode = 'wb')
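A likely reason this works: on Windows, download.file writes in text mode by default for most file types, and a text-mode write turns every line-feed byte into a carriage return plus line feed, which corrupts binary formats such as NetCDF or PDF. The sketch below simulates that translation in Python. The payload is made-up sample data beginning with the HDF5 signature, and text_mode_write is a hypothetical helper, not anything from R.

```python
def text_mode_write(payload):
    """Simulate a Windows text-mode write: every LF byte becomes CR LF."""
    return payload.replace(b"\n", b"\r\n")


# Made-up binary payload: the HDF5 signature followed by bytes that include 0x0A.
payload = b"\x89HDF\r\n\x1a\n" + b"\x00\n\xff\n"
corrupted = text_mode_write(payload)

print(len(payload), len(corrupted))
# prints: 12 16
```

The file grows by one byte for every 0x0A it contains, so offsets inside the binary no longer line up and the reader fails.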
Okay - so here is what I'm trying to do.
I've got this password protected CSV file I'm trying to import into R.
I can import it fine using:
read.csv()
and when I run my code in RStudio everything works perfectly.
However, when I try and run my .R file using a batch file (windows .bat) it doesn't work. I want to use the .BAT file so that I can set up a scheduled task to run my code every morning.
Here is my .BAT file:
"E:\R-3.0.2\bin\x64\R.exe" CMD BATCH "E:\Control Files\download_data.R" "E:\Control Files\DailyEmail.txt"
And here is my .R file:
url <- "http://username:password#www.url.csv"
data <- read.csv(url, skip=1)
** note, I've put my username/password and the exact location of the CSV in my code. I've used generic stuff here, as this is work related and posting usernames and passwords is probably frowned upon.
As I've said, this code works fine when I use it in RStudio. But fails when I use the .BAT file.
I get the following error message:
Error in download.file(url, "E:/data/data.csv") :
cannot open URL 'websiteurl'
In addition: Warning message:
In download.file(url, "E:/data/data.csv") :
unable to resolve 'username'
Execution halted
** above websiteurl is the http above (I can't post links)
So obviously, the .BAT is having trouble with the username/password? Any thoughts?
* EDIT *
I've gone so far as trying this on Linux. Thinking maybe windows was playing silly bugger.
Just from the terminal, I run Rscript -e "download_data.r" and get the EXACT same error message as I did in Windows. So I suspect this may be a problem with where I'm getting the data? Could the provider be blocking requests from the command line, but not from within RStudio?
I have had similar problems which had to do with file permissions. The .bat file somehow does not have the same privileges as you have when running the code directly from RStudio. Try using Rscript (http://stat.ethz.ch/R-manual/R-devel/library/utils/html/Rscript.html) in your .bat file, like this:
Rscript "E:\Control Files\download_data.R"
What is the purpose of the argument "E:\Control Files\DailyEmail.txt"? Is the program supposed to use it in any way?
So, I've found a solution, which is likely not the most practical for most people, but works for me.
What I did was migrate my project over to a Linux system. Running daily scripts is easier on Linux anyway.
The solution makes use of the wget command on Linux.
You can either run wget directly in your shell script, or use the system() function in R to run it.
The code looks like:
wget -O /home/user/.../file.csv --user=userid --password='password' http://www.url.com/file.csv
And you can do something like:
syscommand <- "wget -O /home/.../file.csv --user=userid --password='password' http://www.url.com/file.csv"
system(syscommand)
in R to download the CSV to a location on your hard drive, then grab the CSV using read.csv()
Doing it this way gave me some more insight into the potential root cause of the problem. While the system(syscommand) is running, I get the following output:
Connecting to www.website.com (www.website.com)|ip.ad.re.ss|:80... connected.
HTTP request sent, awaiting response... 401 Unauthorized
Reusing existing connection to www.weburl.com:80.
HTTP request sent, awaiting response... 200 OK
Not sure why it has to send the request twice? And why I'm getting a 401 Unauthorized the first try?
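A probable explanation, offered as a hedged note: by default wget sends the first request without credentials, waits for the server's 401 challenge, and only then retries with an Authorization header, which matches the 401 followed by 200 above. wget's --auth-no-challenge option sends Basic credentials on the first request instead. The sketch below shows how a Basic Authorization header value is built. It assumes the server uses Basic authentication, and basic_auth_header is a name invented for illustration.

```python
import base64


def basic_auth_header(user, password):
    """Build the value of an HTTP Basic 'Authorization' header."""
    credentials = ("%s:%s" % (user, password)).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


print(basic_auth_header("userid", "password"))
# prints: Basic dXNlcmlkOnBhc3N3b3Jk
```

Note that Basic credentials are only encoded, not encrypted, so over plain http they travel in the clear.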