How do you connect and retrieve data from Graphite (whisper) in R?

Is there an R package to connect to graphite (whisper)?

It seems I am looking for the same thing. For now I see only these ways:
Using jsonlite within R to access the Graphite render URL API and get JSON- or CSV-formatted data (a minimal sketch follows below).
Get whisper data directly via whisper-fetch (example usage is described in a Russian IT blog, automatically translated to English by Google).
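A minimal sketch of the first route, assuming a hypothetical Graphite host graphite.example.com and a hypothetical metric servers.web01.loadavg (adjust host, target and time range to your installation); the render API returns, per target, a 'datapoints' array of (value, timestamp) pairs:

library(jsonlite)

# Route 1: query the Graphite render URL API and parse the JSON response
url <- paste0(
  "http://graphite.example.com/render",
  "?target=servers.web01.loadavg",
  "&from=-1h",
  "&format=json"
)
res <- fromJSON(url)

# Each target comes back with a 'datapoints' matrix of (value, timestamp) pairs
dp <- res$datapoints[[1]]
df <- data.frame(
  time  = as.POSIXct(dp[, 2], origin = "1970-01-01", tz = "UTC"),
  value = dp[, 1]
)
head(df)

# Route 2 (run on the Graphite server itself): call whisper-fetch and read its
# plain "timestamp value" output; the .wsp path below is a placeholder
raw <- system("whisper-fetch.py /opt/graphite/storage/whisper/servers/web01/loadavg.wsp",
              intern = TRUE)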

Related

How to connect to a socket.io from R?

I am trying to connect to a socket.io data source using R.
Specifically I am trying to connect to CoinCap https://github.com/CoinCapDev/CoinCap.io.
I started by trying the websockets package from here but I could not get a connection. Maybe it is not socket.io compliant.
The best example appears to be in this post which asks the same question.
It seems the answer was to create a socket.io server as a middleman and then connect to R.
The problem is that I am not nearly as advanced as jeromefroe and have no experience with sockets or javascript, and I do not understand how the server that he created works or how to build or start it.
jeromefroe provides his javascript server code in the post, and I don't know what to do with it.
I am trying to collect data in R and use for analysis.
Can somebody help me get the connection running and/or help me set up the server like jeromefroe did for the connection?
If I understand your question correctly, you are trying to "collect data in R and use for analysis". The website provides REST URLs, so it is a matter of doing an HTTP GET to retrieve the data. An example usage of the httr package follows. The result retrieved is in JSON format, so you need the jsonlite package to convert it into an R data structure.
library(httr)
library(jsonlite)

# GET the list of coins from the CoinCap REST API
resp <- httr::GET("http://coincap.io/coins")

# The response body is raw JSON; convert it into an R data structure
jsonlite::fromJSON(rawToChar(resp$content))

AWS PDF upload through http post

I am new to AWS and I am trying to upload a PDF document to S3 through an AWS API. I am using an HTML form with a POST method. The action of the form is the URL of the deployed API. The API is integrated with a Lambda function. My question is: how can I extract the uploaded file within the Lambda function, so I can perform some processing before uploading to S3? Is it even possible?
I have tried the instructions found in this post:
Passing HTTP Post from AWS API GW to Lambda
However, I return the event from the lambda function and this is what I get:
{file: file.pdf, acl: private, success_action_redirect: http://localhost/, AWSAccessKeyId: my_aws_key}
The file I uploaded is called file.pdf.
Any guidance will be appreciated.
A PDF file is a binary format. API Gateway does not currently support binary data. We know that binary data does not work and there are no workarounds to make it work reliably. A number of customers have requested that we add binary support to API Gateway, and it is prioritized on our backlog.

What is the best way to get (stream) data from BigQuery to R (Rstudio server in Docker)

I have a number of large tables in Google BigQuery, containing data to be processed in R. I am running RStudio via Docker on Google Cloud Platform using the Container Engine.
I have tested a few routes using a table of 38 million rows (three columns) with a size of 862 MB in BigQuery.
The first route I tested was using the R package bigrquery. This option was preferred, as data can be queried directly from BigQuery and data acquisition can be incorporated in R loops. Unfortunately, this option is very slow; it takes close to an hour to complete.
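For reference, a rough sketch of what that route looks like, assuming a hypothetical project "my-project" and table "my_dataset.my_table" (the function shown is the older query_exec() interface; newer bigrquery versions use bq_project_query() plus bq_table_download() instead):

library(bigrquery)

sql <- "SELECT * FROM my_dataset.my_table"

# Runs the query and pages the full result set into an R data frame
df <- query_exec(sql, project = "my-project", max_pages = Inf)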
The second option I tried was exporting the BigQuery table to a csv file on Google Cloud Storage (approx 1 minute), and using the public link to import it into RStudio (another 5 minutes). This route entails quite a bit of manual handling, which is not desirable.
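For completeness, a sketch of the import step of that second route once the table has been exported, assuming a public object at a hypothetical URL (readr can read directly from a URL, which removes at least part of the manual handling):

library(readr)

# Public link to the exported CSV on Google Cloud Storage (placeholder name)
url <- "https://storage.googleapis.com/my-bucket/my_table.csv"
df  <- read_csv(url)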
In Google Cloud Console I noticed VM instances can be granted access to BigQuery. Also, RStudio can be configured to have root access in its Docker container.
So finally my question: Is there a way to use this backdoor to enable fast data transfer from BigQuery into an R data frame in an automated way? Or are there other ways to achieve this goal?
Any help is highly appreciated!
Edit:
I have loaded the same table into a MySQL database hosted in Google Cloud SQL; this time it took only about 20 seconds to load the same amount of data. So some kind of transfer from BigQuery to Cloud SQL is an option too.
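A sketch of reading that Cloud SQL copy from R, assuming hypothetical host and credentials and that the Cloud SQL instance allows connections from the RStudio VM's IP:

library(RMySQL)

# Connect to the Cloud SQL (MySQL) instance; host and credentials are placeholders
con <- dbConnect(MySQL(), host = "xxx.xxx.xxx.xxx",
                 user = "my_user", password = "my_password", dbname = "my_db")
df  <- dbGetQuery(con, "SELECT * FROM my_table")
dbDisconnect(con)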

Is it possible to access elasticsearch's internal stats via Kibana

I can see from querying our Elasticsearch nodes that they contain internal statistics showing, for example, disk, memory and CPU usage (e.g. via the GET _nodes/stats API).
Is there any way to access these in Kibana 4?
Not directly, as Elasticsearch doesn't natively push its internal statistics to an index. However, you could easily set something like this up on a *nix box (a small R polling sketch follows after these steps):
Poll your Elasticsearch box via REST periodically (say, once a minute). The /_status or /_cluster/health endpoints probably contain what you're after.
Pipe these to a log file in a simple CSV format along with a time stamp.
Point logstash to these log files and forward the output to your ElasticSearch box.
Graph your data.
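A small polling sketch in R, assuming Elasticsearch is reachable on localhost:9200; it appends one timestamped row of cluster health to a CSV file that logstash can then pick up (field names come from the /_cluster/health response):

library(httr)
library(jsonlite)

health <- fromJSON(rawToChar(GET("http://localhost:9200/_cluster/health")$content))

row <- data.frame(
  timestamp       = format(Sys.time(), "%Y-%m-%dT%H:%M:%S"),
  status          = health$status,
  number_of_nodes = health$number_of_nodes,
  active_shards   = health$active_shards
)

# Write headers only on the first run, then keep appending rows
write.table(row, "es_health.csv", sep = ",", row.names = FALSE,
            col.names = !file.exists("es_health.csv"), append = TRUE)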

Connect R to POP Email Server (Gmail)

Is it possible to have R connect to Gmail's POP server and read/download the messages in a specific folder of mine? I have been storing emails and would like to go back and start to analyze subject lines, etc.
Basically, I need a way to export a folder in my Gmail account, and I would like to do this programmatically if at all possible.
Thanks in advance!
I am not sure that this can be done via a single command. Maybe there is a package out there that I am not aware of which can accomplish this, but as long as you do not find one, the following process may be a solution ...
Consider got-your-back (http://code.google.com/p/got-your-back/wiki/GettingStarted#Step_4%3a_Performing_A_Backup) which "is a command line tool that backs up and restores your Gmail account".
You can invoke it like this (given that python is available on your machine):
python gyb.py --email foo@bar.com --search "from:pip@pop.com" --folder "mail_from_pip"
After completion you'll find all the emails matching the --search in the specified --folder, along with a sqlite database. (posted by dukedave, Dec 4 '11)
So depending on your OS you should be able to invoke the above command from within R and then access the downloaded mails in the respective folder.
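For example, a rough sketch of driving gyb from R via system() (email address, search query and folder are placeholders, as above):

# Build and run the gyb command shown above
cmd <- paste(
  "python gyb.py",
  "--email foo@bar.com",
  '--search "from:pip@pop.com"',
  '--folder "mail_from_pip"'
)
system(cmd)

# Once the backup has finished, the messages are on disk and can be read from R
list.files("mail_from_pip", recursive = TRUE)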
GotYourBack is a good backup utility, but for downloading metadata for analysis, you might want something that doesn't first require you to fetch the entire content of all your email.
I've recently used the gmailr package to do a similar analysis.
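A rough sketch with gmailr, using the gm_-prefixed interface of recent package versions (the label in the search query is a placeholder); this pulls only message metadata such as subjects rather than a full backup:

library(gmailr)

gm_auth()  # opens an OAuth flow for your Google account

# Fetch ids of up to 100 messages under a given label, then extract subjects
msgs <- gm_messages(search = "label:my-folder", num_results = 100)
ids  <- gm_id(msgs)

subjects <- vapply(ids, function(id) gm_subject(gm_message(id)), character(1))
head(subjects)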
