"Session-Security" in R-Shiny and AWS Fargate - r

I am currently thinking about the best way to deploy my RShiny app. After trying to host my app on a dedicated server via Shinyproxy, Docker and Nginx - but this solution was (surprise!) not really scalable. The RAM requirement per user was too high for that.
I'm currently considering hosting the app via a Docker image in AWS Fargate, where RAM resources scale up and down as needed.
I'm now wondering about security, though.
Brief background:
My goal is to add my app as a tool to an online store. Here it can and will (hopefully) happen that several users will use the tool at the same time. It's important that users can't mess with each other's data - that's why I thought of ShinyProxy, so that each user gets their "own R session".
Now I am wondering what this looks like with AWS Fargate. Could it be that if multiple users are active in the tool at the same time, there can be mutual interference?
If so, does anyone have any ideas on how to prevent this? Unfortunately, publishing ShinyProxy via Fargate is not possible as far as I know.
I hope I could formulate my question understandably and someone of you can help me.
Thank you and have a nice day!

Brief background: My goal is to add my app as a tool to an online
store. Here it can and will (hopefully) happen that several users will
use the tool at the same time. It's important that users can't mess
with each other's data - that's why I thought of ShinyProxy, so that
each user gets their "own R session".
Probably depends on what you need for your use case.
Shiny actually has no user management per default - in the sense of limiting access to your application for certain groups and requiring authentication (can be done by hosting with Shinyapps.io and others).
But you probably do not really need this anyway - your problem sounds more like a scoping issue.
(you should read this information about it)
Sure, there might only be one R process, but it actually supports multiple client connections (sessions). You can define, what objects these sessions share. This is totally independent from where you host your app.
Everything you put into the shinyServer() function in the server.R file will only be visible within the user session. (every user has it's own session)
If you need to share variables between sessions, you have to put them in the server.R file, but outside of the shinyServer() function.

Related

How do you make an R coded program running on AWS accessible from your website?

In the creation of a simulation for our company, we coded the entire thing in R. It runs on AWS, and the consumers have been given links that route to the AWS page. Our website, however, is currently running off of Wordpress. In order for our customers to be able to access the product, we need to find a way to connect the product to the website. We would hence like to replace the current site with a new one that allows users to access our simulation from the website.
The only option we’ve come up with is to create a separate domain that has the interface built into the R program, and have a link to that domain from the current website. However, we would prefer to have a more direct solution.
Do any of you have any suggestions as to how we might achieve this?
Thanks for your time!
This answer very much depends on your code, but I think you have several options.
Run on external website
Pros:
Full control over code, easy to update without risking changes to main site
Easily accessible either by directly linking to it, or using <iframe> (HTML) on your main website, no Wordpress support required!
Cons:
Separate domain, some extra costs (?)
Shinyapps.io
Pros:
Easily publishable, often free
Cons:
Available for mostly everyone, which might not be ideal in a business situation
Less control over the platform
EDIT: I wanted to add that you can host your own shiny applications, and build the front-end using HTML. This gives you some more control.
AWS
Pros:
You should be able to set up an instance where the simulation is run on a subdomain that is not directly tied to Wordpress, e.g. outside of the main Wordpress folder.
As I said, the ideal solution depends on your code. Does it take user input, does it need to save files often? What kind of access control do you need?

Is there a way to prevent directory traversal attacks in a Shiny App running in Windows?

I'm trying to develop an internal Shiny app for my organization as a test run. The IT department is requiring the app to be safe from Directory Traversal Attacks. Unfortunately, I have to deploy the Shiny app in a Windows machine. (currently using runApp).
I have searched but not found a way to implement the different recommendations of avoiding Directory Traversal Attacks. Can anyone help me out?
Protecting from a traversal attack is two fold. Once in the application and once in the system.
For the application, you will need to make sure that you are cleaning any inputs that point to a hosted file. For example, if your application allows a user to call images/supercool.png youll need to verify that the path is not being changed to something like ../../../../etc/psswd.
For the system, it is a matter of separating privileges. The accounts given access to the runapp files should not also have access to system files(beyond what is absolutely needed.
I would recommend using shiny server or connect to host the files for you, especially if you do not feel prepared to implement the needed security.
Rstudio has done a lot of work and a great job to make a good product and is continuing to add new features including enhancements around security/access.

DTAP storing script variables in a simple and fast way

I am setting up a DTAP environment for Google App Maker. Google App Maker enables working in a singe file very well, however there is one use case that I would like to simplify.
For each deployment I need to "know" certain things in the back end script. Things like the ip address of the SQL server, or usernames and passwords. This information needs to be retrieved fast and often, given the stateless nature of google.script.run.
The best solution so far is a settings form, combined with google drive tables and caching. This works, but it is not simple, and things could fail easily. The other approach is hard coded and linked to the deployment url. This is fast and simple, but also means that all the credentials are in the source.
I am looking for a better solution. Apps Script used to have the script properties. Is there a similar option in App Maker, with a UI to maintain the settings.
There is no built-in UI to manage script properties, but App Maker's runtime (Apps Script) provides API to perform CRUD operations on it:
PropertiesService.getScriptProperties().setProperty('testKey', 'testValue');
...and you can 'easily' build the UI on top of this API. In answer for this question are highlighted major steps to achieve this: Google App Maker how to create Data Source from Google Contacts
Here is a feature request for the first party support. You can up-vote it by giving it a star:
https://issuetracker.google.com/issues/73584947

Understanding the scalability of RShiny apps hosted on ShinyServer

I am building a series of interactive shiny web apps for a project that I am considering turning into a Company. My background is in data science and I don't have a lot of experience on the web app / server side of things, but these are important aspects for me to consider with my project. I currently have an Amazon Linux AMI EC2 instance with ShinyServer (free, open-source) installed, and I am currently hosting early versions of my web apps there. So far everything works fine, but I haven't made the links public yet.
My first question is whether anyone knows if there are certain limitations (scalability limitations, integration with database limitations, security / authentication limitations, etc.) that I will inevitably run into using RShiny apps and ShinyServer? I haven't heard of many successful, super-popular web apps being shiny apps hosted on ShinyServer, but rather my feeling is that ShinyServer is mainly used for hosting RShiny apps that are shared amongst only a small number of people (i.e. shared amongst team members at a company.). Per this thread - Does R-Server or Shiny Server create a new R process/instance for each user? - I am particularly concerned that my app won't be able to handle thousands of users simultaneously since only 1 R process is created for the app regardless of the # of concurrent users of the app. Having 10-20 processes through ShinyServer pro probably doesn't solve the issue either if I ever intend to scale greater than the hundreds or thousands of users. I also noticed that ShinyServer Pro would run me a not-so-negligible $10K per year.
My second question is whether RShiny apps can be deployed using other server technologies, such as Heroku. I came across this github page (https://github.com/virtualstaticvoid/heroku-buildpack-r/tree/heroku-16) but haven't dug too deep into it yet. I've been told that heroku makes it easy to update releases to apps whose code is on github (git push heroku:master), amongst other things.
My third question involves certain specific considerations of mine. In particular, I am currently working on a script that queries data from an API and writes that data to a (not-yet-setup) database of mine. This is the data my apps use, and I'd be interested in having the apps update in real time as the database updates, without requiring the user to refresh the webpage. A buddy of mine suggested AJAX for this type of asynchronous behavior, and it looks like this may be possible in R with something like this (https://github.com/daattali/advanced-shiny/tree/master/api-ajax).
Sorry that this is such a loaded question, but I hope it doesn't get closed down as I think it is fairly educational. Any suggestions / sources / pointing me in the right direction would be greatly appreciated on this.
Canovice,
I'd recommend you take a look at the following RStudio / AWS support articles. To scale a shiny server you'll need to look at using a load balancer:
RStudio
https://shiny.rstudio.com/articles/scaling-and-tuning.html
https://support.rstudio.com/hc/en-us/articles/220546267-Scaling-and-Performance-Tuning-Applications-in-Shiny-Server-Pro
https://support.rstudio.com/hc/en-us/articles/217801438-Can-I-load-balance-across-multiple-nodes-running-Shiny-Server-Pro-
AWS
https://aws.amazon.com/blogs/big-data/running-r-on-aws/
Blog Article:
http://mgritts.github.io/2016/07/08/shiny-aws/
Shiny is a great platform, their support is fabulous. I'd recommend you ring them up - they'll be sure to help answer your questions.
That said if your plan is to create a scalable website that will support thousands or hundreds of thousands of people then my sense would be to recommend you also review and consider using D3.js in conjunction with react.js or Angular.js, not forgetting to mention node.js.
My sense is that you are looking at a backend database connected to a logic engine and visualisation front end. If you are looking for a good overview of usage take a look at the following web page and git repo [A little dated but useful]:
https://anmolkoul.wordpress.com/2015/06/05/interactive-data-visualization-using-d3-js-dc-js-nodejs-and-mongodb/
https://github.com/anmolkoul/node-dc-mongo
I hope the above points you in the right direction.
I'd like to provide some notes related to your second question: Yes, you can use the mentioned buildback to deploy shiny applications on heroku.
I was in a similar situation with you (asking myself about possible ways of serving Shiny applications in a scalable manner) and decided to go the "heroku way".
You may find these hints helpful when deploying your app to heroku using the buildpack mentioned above:
Heroku tries to "guess" how to execute your application. But you can also add a special file, named Procfile, to your application to control the process commands you want to execute for your application. In my case I used web: R -f ~/run.R --gui-none --no-save, where this means that a file named run.R is being passed to the R executable for the web server process
The stack on heroku is based on Ubuntu. If you need additional deb-packages, you can create another special file named Aptfile and add the package names therein, heroku will then automatically install these for you (I needed it for RPostgreSQL)
You can add another special file named init.R and install all R packages as necessary just as you are used to, i.e. with install.packages etc. You can also add initial configuration material within this file.
As a running example, here is an example toy application that I wrote for myself to remember how a "full-stack" shiny app may look like, including compability with heroku.
For a large number of concurrent users, use a load balancer like nginx and enable the autoscaling of your app, e.g. through Kubernetes.
You can deploy your app on Heroku. On the paid tiers it includes NoOps autoscaling of your app. See this tutorial on how to deploy a Shiny app in a Docker container on Heroku: https://medium.com/analytics-vidhya/deploying-an-r-shiny-app-on-heroku-free-tier-b31003858b68
You can query the table last update timestamp in the Shiny server logic with reactivePoll() and rerun your db query if it changed. It is not "real-time" but depending on your application close enough if you set the time interval small.

Installing MeteorJS in Amazon S3 Bucket

I currently manage Web App on a LAMP stack hosted with GreenGeeks. As it has scaled up, I have started learning MeteorJS on my local machine and am thinking about redeveloping the app in Meteor to support more concurrent connections. My questions are:
Can Meteor simply be hosted in a Simple Amazon S3 Bucket with no need for a stack of any kind? Is this smart? When something seems this simple, I get nervous.
Is Meteor as portable as it feels? Migrating a LAMP app from one server to another can be a real pain. This "feels" like it's as simple as zipping up the whole thing and simply dragging it anywhere. Again, feels too simple = nervous.
Is meteor the right choice if I am looking to maximize concurrent connections and reduce the number of times I need to go to the server for information? My app loads about 2 MB of data per user and I'd love a situation where this can be loaded once and the user has it available to interact with without going to the server (unless it changes).
Ok answers to your questions:
Well actually You can deploy your meteor app into an Amazon EC2 instance, the process is pretty easy, take a look to This video
Meteor is incredible portable, actually it was made with nodejs, therefore it inherits its features
You are in the right way, you know meteor is reactive, it acts in real time, also uses mongoDB, which is incredible faster than a regular SQL database, so in general meteor's performance is amazing, in fact, there are lots of packages that improve even more the performance of your app like this one and many others

Resources