Execute R Script on AWS via API

I have an R package that I would like to host through Amazon Web Services that will be accessible via an API. The script should take a couple of input values and return the R output in json format. Also, the API should be able to handle multiple requests simultaneously.
So, for example, calling http://sampleapi.com/?location=USA&state=Florida would run the R package and return the output data to the calling application.
Has anyone done this before or know of resources you can point me to that would explain how to do so? Thanks!

Thanks for all the suggestions. I decided to use Ruby for the API with the rinruby and rails-api gems and will host that through AWS Elastic Beanstalk. See this question for how I am setting it up - Ruby API - Accept parameters and execute script
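(For anyone who wants to stay entirely in R rather than going the Ruby/rinruby route, here is a minimal sketch of the same endpoint using the plumber package. The file name, path and port are illustrative; only the location/state parameters come from the question.)

# api.R -- hypothetical plumber endpoint; plumber serializes the return value to JSON
#* @get /run
function(location = "", state = "") {
  # the real package call would go here; echo the inputs for illustration
  list(location = location, state = state)
}

# in a separate session, assuming api.R is in the working directory:
# plumber::plumb("api.R")$run(host = "0.0.0.0", port = 8000)
# then: GET http://localhost:8000/run?location=USA&state=Florida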

Related

How to connect to a socket.io from R?

I am trying to connect to a socket.io data source using R.
Specifically I am trying to connect to CoinCap https://github.com/CoinCapDev/CoinCap.io.
I started by trying the websockets package from here but I could not get a connection. Maybe it is not socket.io compliant.
The best example appears to be in this post, which asks the same question.
It seems the answer was to create a socket.io server as a middleman and then connect it to R.
The problem is that I am not nearly as advanced as jeromefroe, have no experience with sockets or JavaScript, and do not understand how the server he created works or how to build or start it.
jeromefroe provides his JavaScript server code in the post, and I don't know what to do with it.
I am trying to collect data in R and use it for analysis.
Can somebody help me get the connection running and/or help me set up the server like jeromefroe did for the connection?
If I understand your question correctly, you are trying to "collect data in R and use for analysis". The website provides REST URLs, so it is a matter of doing an HTTP GET to retrieve the data. An example using the httr package follows. The result comes back in JSON format, so you need the jsonlite package to convert it into an R data structure.
library(httr)
library(jsonlite)
# Fetch the list of coins from the CoinCap REST endpoint
resp <- httr::GET("http://coincap.io/coins")
# Parse the raw JSON body into an R data structure
jsonlite::fromJSON(rawToChar(resp$content))

Hosting or deploying R REST APIs

We have created REST APIs in R. We are able to run the code using Plumber, but we need to host or deploy the R code on the web (like a web API or web service).
# myfile.R
#' @get /Sample
Sample <- function(samples = 10) {
  print(samples)
}
Note: please suggest options other than Plumber and Shiny.
This is for those who want a comparison of API development options with R.
Concurrent requests are queued by httpuv in plumber, so plumber by itself is not performant under load. The author recommends running multiple Docker containers, but that can be complicated as well as resource-demanding.
There are other technologies, e.g. Rserve and rApache. Rserve forks processes, and rApache can be configured to pre-fork, so both can handle concurrent requests.
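Not from the linked posts, just a rough sketch of what the Rserve route looks like in practice, assuming the default port 6311 and a Unix host (where Rserve forks a worker per connection):

# server side: start the Rserve daemon
library(Rserve)
Rserve(args = "--no-save")

# client side (e.g. from the web layer): evaluate R code remotely
library(RSclient)
conn <- RS.connect(host = "localhost", port = 6311)
RS.eval(conn, rnorm(3))   # runs in a forked worker and returns the result
RS.close(conn)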
See the following posts for comparison
https://www.linkedin.com/pulse/api-development-r-part-i-jaehyeon-kim/
https://www.linkedin.com/pulse/api-development-r-part-ii-jaehyeon-kim/

AWS PDF upload through http post

I am new to AWS and I am trying to upload a PDF document to S3 through an AWS API. I am using an HTML form with a POST method. The action of the form is the URL of the deployed API. The API is integrated with a Lambda function. My question is: how can I extract the uploaded file within the Lambda function, so I can do some processing before uploading to S3? Is it even possible?
I have tried the instructions found in this post:
Passing HTTP Post from AWS API GW to Lambda
However, when I return the event from the Lambda function, this is what I get:
{file: file.pdf , acl:private,
success_action_redirect: http://localhost/, AWSAccessKeyId:my_aws_key}
The file I uploaded is called file.pdf.
Any guidance will be appreciated.
A PDF file is a binary format, and API Gateway does not currently support binary data. We know that binary data does not work and there are no workarounds to make it work reliably. A number of customers have requested that we add binary support to API Gateway, and it is prioritized on our backlog.

Error:1411809D:SSL routines - When trying to make https call from inside R module in AzureML

I have an experiment in AzureML which has an R module at its core. Additionally, I have some .RData files stored in Azure blob storage. The blob container is set as private (no anonymous access).
Now, I am trying to make an https call from inside the R script to the Azure blob storage container in order to download some files. I am using the httr package's GET() function and have properly set up the URL, authentication, etc. The code works in R on my local machine, but the same code gives me the following error when called from inside the R module in the experiment:
error:1411809D:SSL routines:SSL_CHECK_SERVERHELLO_TLSEXT:tls invalid ecpointformat list
Apparently this is an error from the underlying OpenSSL library (which got fixed a while ago). Some suggested workarounds I found here were to set sslversion = 3 and ssl_verifypeer = 1, or to turn off verification with ssl_verifypeer = 0. Both of these approaches returned the same error.
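For clarity, this is roughly how those workarounds translate into httr calls, where url stands for the authenticated blob URL built earlier in the script:

library(httr)
# attempt 1: force SSLv3 while keeping peer verification on
resp <- GET(url, config(sslversion = 3, ssl_verifypeer = 1))
# attempt 2: skip certificate verification entirely
resp <- GET(url, config(ssl_verifypeer = 0))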
I am guessing that this has something to do with the internal Azure certificate / validation...? Or maybe I am missing or overlooking something?
Any help or ideas would be greatly appreciated. Thanks in advance.
Regards
After a while, an answer came back from the support team, so I am going to post the relevant part as an answer here for anyone who lands here with the same problem.
"This is a known issue. The container (a sandbox technology known as "drawbridge" running on top of Azure PaaS VM) executing the Execute R module doesn't support outbound HTTPS traffic. Please try to switch to HTTP and that should work."
As well as that, a solution is on the way:
"We are actively looking at how to fix this bug. "
Here is the original link as a reference.
hth
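For reference, the suggested workaround in code: a minimal httr sketch that switches the blob URL from https to plain http (the storage account, container and file name are placeholders, and the authentication headers from the original script still need to be added):

library(httr)
# placeholder URL; note the http:// scheme required inside the Execute R Script module
url <- "http://<storageaccount>.blob.core.windows.net/<container>/mydata.RData"
resp <- GET(url)                       # add the same auth headers as before
stop_for_status(resp)
# write the body to a temp file and load the R objects it contains
tmp <- tempfile(fileext = ".RData")
writeBin(content(resp, "raw"), tmp)
load(tmp)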

Running AWS commands from commandline on a ShellCommandActivity

My original problem was that I want to increase my DynamoDB write throughput before I run the pipeline, and then decrease it when I'm done uploading (I do this at most once a day, so I'm fine with the limitations on decreasing throughput).
The only way I found to do it is through a shell script that will issue the API commands to alter the throughput. How does that work with my AWS access_key and secret_key when it's a resource that the pipeline creates for me? (I can't log in to set the ~/.aws/config file and don't really want to create an AMI just for this.)
Should I write the script in bash? Can I use the Ruby/Python AWS SDK packages, for example? (I prefer the latter.)
How do I pass my credentials to the script? Are there runtime variables (like #startedDate) that I can pass as arguments to the activity with my key and secret? Is there any other way to authenticate with either the command-line tools or the SDK packages?
If there is another way to solve my original problem, please let me know. I only arrived at the ShellCommandActivity solution because I couldn't find anything else in the documentation/forums.
Thanks!
OK. found it - http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-concepts-roles.html
The resourceRole in the default object of your pipeline is the one assigned to resources (Ec2Resource) that are created as part of the pipeline activation.
The default one is configured to have all your permissions, and the AWS command-line tools and SDK packages automatically look for those credentials, so there is no need to update ~/.aws/config or pass credentials manually.
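Keeping with the R theme of this page (the question itself asked about bash/Ruby/Python), here is a minimal sketch of the SDK route using the paws package: no keys appear anywhere, because the default credential chain resolves the role attached to the Ec2Resource. The table name and capacity values are placeholders.

library(paws)
# no credentials in the code: the default chain picks up the resource role
ddb <- dynamodb()
# bump write throughput before the load (placeholder table name and capacities)
ddb$update_table(
  TableName = "my-table",
  ProvisionedThroughput = list(
    ReadCapacityUnits  = 5,
    WriteCapacityUnits = 500
  )
)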
