We have a separate process which provides data to our R Shiny application. I know I can provide data to Shiny via a file or a database and observe the data source via reactivePoll. This works fine, and I understand it's more or less the recommended way.
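For reference, a minimal sketch of that reactivePoll pattern (the file name and polling interval below are placeholders, not my actual setup):

library(shiny)

server <- function(input, output, session) {
  # Re-read the file only when its modification time changes;
  # "data.csv" and the 1-second interval are placeholders
  data <- reactivePoll(
    intervalMillis = 1000,
    session = session,
    checkFunc = function() file.mtime("data.csv"),
    valueFunc = function() read.csv("data.csv")
  )
  output$table <- renderTable(data())
}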
What I don't like about this approach:
It's hard to send Shiny various types of input (like data and metadata).
There is no feedback from the Shiny application to the data-providing process. I just write a file and hope the Shiny app will pick it up and process it successfully. The data-sourcing process cannot be notified of a failure (invalid data, for example).
I would love to have some two-way protocol, for example sending the data through a websocket (this would have to be a different websocket than the one Shiny uses with the UI, obviously) or a raw socket, and being able to send a response back.
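To illustrate, a rough sketch of what I have in mind using the httpuv package (the port and the ingest() function are made up for the example):

library(httpuv)

# A side-channel websocket server, separate from Shiny's own UI websocket;
# port 9999 is arbitrary and ingest() is a hypothetical data handler
startServer("127.0.0.1", 9999, list(
  onWSOpen = function(ws) {
    ws$onMessage(function(binary, message) {
      ok <- tryCatch({ ingest(message); TRUE },
                     error = function(e) FALSE)
      # The two-way part: tell the data provider whether processing worked
      ws$send(if (ok) "ok" else "error: invalid data")
    })
  }
))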
Surely I could implement some file-based API where I store files under different names, observe them with Shiny, have Shiny write other files back, and observe those with the application that provided the data. But this basically sucks :)
Any tips much appreciated!
Edit: it may or may not be obvious from the above, but the Java and R applications are writing files for each other. The apps are running on the same host, and I can live with this limitation.
I am trying to implement the following design: read data from an XML file at server startup and keep it available as in-memory variables to be used in the backend API for certain calculations. This data never changes, so it only needs to be read once.
I am getting a lot of "module not found" errors. From what I've read, I believe this is because fs functions should only be used on the server side, e.g. in getStaticProps.
But this will trigger the read request every time a client loads the page.
Can someone guide me with a simple example of how to do this, so that the data is read once and is usable in the backend server-side modules for calculations?
Thanks
I want to host a Shiny app on Amazon EC2 which takes an Excel sheet using fileInput(). Then I need to make some API calls for each row in the Excel sheet, which is expected to take 1-2 hours on average for my purposes. So I figured this is what I should do:
Host a Shiny app where one can upload an Excel sheet.
On receiving an Excel sheet from a user, store it on the Amazon server, notify the user that an email will be sent once processing is complete, and trigger another R script (I'm not sure how to do that) which will keep running in the background even if the user closes the browser window, collecting all the information by making the slow API calls.
Once I have all the data, store it in another Excel sheet and email it back to the user.
If it is possible and reasonable to do it this way, or if you have other ideas for accomplishing my task, please help me with how to do it.
Edit: I've found that this is what I can do otherwise:
Get the Excel sheet data and store it in a file.
Call a bash script from the R Shiny app like this: ./<my-script> & disown (see the sketch after this list).
The bash script will call a Python file which makes all the API calls, decodes the relevant data from the JSON output, and stores it in another file on the server.
It finally sends an email to the user with the processed data attached.
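A rough sketch of how step 2 could look from the R side, using system2() with wait = FALSE (the script name, arguments, and log file are placeholders):

# Launch the worker without blocking the Shiny session;
# system2() returns immediately because of wait = FALSE
run_worker <- function(input_file, user_email) {
  system2("bash",
          args = c("my-script.sh", input_file, user_email),
          stdout = "worker.log", stderr = "worker.log",
          wait = FALSE)
}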
I wanted to know if this is an appropriate way to do the job. Thanks a lot.
Try implementing a simple web framework like Django, since you are using Python. Flask may come in handy for creating simple routes. Please comment if you find any issues.
Say we have a Shiny app deployed on Shiny Server. We expect that the app will be used by several users via their web browsers, as usual.
The Shiny app's server.R includes some sparklyr package code which connects to a Spark cluster for classic filter, select, mutate, and arrange operations on data located on HDFS.
Is it mandatory to disconnect from Spark, i.e. to include a spark_disconnect() at the end of the server.R code to free resources? I think we should never disconnect and should instead let Spark handle the load for each arriving and leaving user. Can somebody please help me confirm this?
TL;DR SparkSession and SparkContext are not lightweight resources which can be started on demand.
Putting aside all security considerations related to starting a Spark session directly from a user-facing application, maintaining a SparkSession inside server (starting the session on entry, stopping it on exit) is simply not a viable option.
The server function is executed for every new user session, effectively restarting a whole Spark application each time and rendering the project unusable. And this is only the tip of the iceberg. Since Spark reuses existing sessions (only one context is allowed per JVM), multi-user access could lead to random failures if a reused session has been stopped from another server call.
One possible solution is to register onSessionEnded with spark_disconnect, but I am pretty sure it will be useful only in a single-user environment.
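For completeness, a sketch of that onSessionEnded idea (sc is assumed to come from an earlier spark_connect() call):

server <- function(input, output, session) {
  # Disconnect when this user's session ends; fine for a single user,
  # but with a shared connection this would kill it for everyone else
  session$onSessionEnded(function() {
    spark_disconnect(sc)
  })
  # ... reactive logic using sc ...
}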
Another possible approach is to use a global connection and wrap runApp with a function calling spark_disconnect_all on exit:
runApp <- function() {
  # Register cleanup before the blocking call so it also runs
  # if runApp() exits with an error or an interrupt
  on.exit({
    spark_disconnect_all()
  })
  shiny::runApp()
}
although in practice the resource manager should free resources when the driver disassociates, even without stopping the session explicitly.
I am planning to create an SQLite table in my Android app. The data comes from the server via a web service.
I would like to know the best way to do this.
Should I transfer the data from the web service as an SQLite DB file and merge it, get all the data via a SOAP request and parse it into the table, or use a REST call?
The general size of the data is 2MB with 100 columns.
Please advise on the best approach to get this data quickly, with the least load on the device.
My workflow is:
Download a set of 20,000 addresses and save them to the device's SQLite database. This operation happens only once, when the app is run for the first time or when you want to refresh the whole app's data.
Update these records whenever there is a change on the server.
Now I can get this data from the server either as JSON, XML, or a raw SQLite file. I want to know the fastest way to store this data in the Android database.
I tried all the above methods and found that getting the database file from the server and copying its data into the database is faster than getting the data as XML or JSON and parsing it. Please advise whether I am right or wrong.
If you are planning to use sync adapters, then you will need to implement a content provider (or at least a stub) and an authenticator. Here is a good example that you can follow.
Also, you have not explained the use case of the web service in enough detail to decide what web-service architecture to suggest. But REST is a good style for writing your services, and JSON is advisable over XML for data-format efficiency (or, better yet, give Protocol Buffers a shot).
And yes, sync adapters are better to use, as they already provide a great set of features (e.g., periodic sync, auto sync, exponential backoff) that you would otherwise have to implement yourself in a background service.
To put less load on the device, you can implement a sync adapter backed by a content provider. You serialize/deserialize data when you upload/download it from the server. When you need to persist data from the server, you can use the content provider's bulkInsert() method and persist all your data in a single transaction.
I have built a quiz system using Shiny Server on Amazon Web Services. The system ran reliably when I tested it on one or two devices at home. However, when I used it in the classroom with more than 10 students, the system broke down. The questions and widgets loaded correctly, but when the students tried to submit their answers (after 30-40 minutes looking at them), the data was not handled correctly (results are saved in a CSV file, so I could see that).
I understand that there can be many causes for this, but I would like to know whether one might be that Shiny Server is just not designed to handle many simultaneous requests. That would mean I can forget about using Shiny for my purposes and should look elsewhere. For those who are interested in the system, here is the code:
https://github.com/witusj/CFA-2/tree/master/WK4
Many thanks!
It depends on the complexity of your app and the server you host it on. There is an explanation by one of their developers here, although there are no clear guidelines.
Since you have students you can test on, you may be able to get an estimate of how many users the application can handle correctly, and use this number to set a limit on the number of people who can join. If you look at the manual you will find the "Simple Scheduler" for this. To use the example from the manual, if you want to limit the number of connected students to 5, you would add simple_scheduler to your configuration:
location / {
  # Define the scheduler to use for this location
  simple_scheduler 5;
  ...
}
Since you have more than 5 students, set up multiple copies of the application under a number of different locations. You can extend this using the load-balancing idea of Huidong Tang, or an implementation of that idea by sjewo.
What @FvD said. But additionally, bear in mind that there's shinyapps.io if you want someone else to host your application in a scalable way, or Shiny Server Pro if you want to back a Shiny application with multiple R processes.
Shiny Server itself can certainly handle plenty of requests (we've seen a single Shiny Server instance gracefully handle up to a thousand concurrent users, with plenty of room for more), but as @FvD described, it all comes down to how well your R application scales.
One caveat here: there is a bit of complexity to think through in an application like yours. If you write all your data out to a single .csv file, then you can't safely run multiple instances of the application simultaneously (the processes would overwrite each other's file). Instead, you could consider writing the results out to a number of distinct CSV files which can be aggregated later, or you could look at using something like a relational database to really do this right. This problem is described in more detail here.
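A minimal sketch of the distinct-files idea (the directory and naming scheme are just placeholders):

# Write each submission to its own file so that concurrent sessions
# never touch the same CSV; the files can be aggregated later
save_results <- function(answers, outdir = "responses") {
  dir.create(outdir, showWarnings = FALSE)
  fname <- file.path(outdir, sprintf("result_%s_%s.csv",
                                     format(Sys.time(), "%Y%m%d_%H%M%OS3"),
                                     paste(sample(letters, 8), collapse = "")))
  write.csv(answers, fname, row.names = FALSE)
}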