shinyapps.io app 'disconnected from Server' from querying too big a data.frame? - r

The app works fine locally and is basically just for plotting data - very little to no math/processing is going on. The Shiny input fields just filter a data.frame by date and a handful of other attributes like location, observation type, etc. -- daily numerical values originating in .csv files that go back to Oct '16 and amount to a seemingly manageable ~0.25 GB. The default begin and end dates for the filtering are both in April '19, and the online app filters (via filter()) and plots that month of data correctly on startup, and works fine looking back several days or weeks too.
However, if I extend the start date back by multiple months or years (say, 10/15/16 to now instead of just the past month), the shinyapps.io version pretty abruptly yields "Disconnected from the server". It seems I need to relax a setting or devote more resources to something, but that's just a guess - any ideas?
If you're feeling extremely helpful or curious, here's the working app in one ~40 MB zipped folder, along with app.r, all the data, and the libraries used (specified in libs.r):
timeseries app
In summary, the app works fine locally, but only for short time-series ranges on shinyapps.io. The full data set goes back to 2004 and totals ~2 GB. Ideally I'd like to find a way to have all those years queryable from the hosted app with decent performance.
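For what it's worth, the usual workaround for this class of disconnect is to aggregate before plotting, so the number of points sent to the browser stays roughly constant no matter how long the date range is. A minimal sketch of the idea in Python for illustration (the function name and bucket size are made up here; in the Shiny app the equivalent aggregation would be done in R before plotting):

```python
from datetime import date, timedelta
from statistics import mean

def downsample(points, bucket_days=7):
    """Average daily (date, value) points into fixed-size buckets.

    Plotting thousands of raw daily points for a multi-year range is
    what tends to exhaust a small hosted instance; aggregating first
    keeps the plotted payload small regardless of the range chosen.
    """
    if not points:
        return []
    start = points[0][0]
    buckets = {}
    for d, v in points:
        key = (d - start).days // bucket_days
        buckets.setdefault(key, []).append(v)
    return [(start + timedelta(days=k * bucket_days), mean(vs))
            for k, vs in sorted(buckets.items())]

# ~2.5 years of daily values collapse to ~130 weekly points
daily = [(date(2016, 10, 15) + timedelta(days=i), float(i % 10))
         for i in range(900)]
weekly = downsample(daily, bucket_days=7)
print(len(daily), len(weekly))
```

The same idea scales: for a 15-year range you might bucket by month instead of by week, so the browser never receives more than a few hundred points.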

Related

Azure Time Series Insights Gen2 slower than preview?

We have a couple of environments still running Time Series Insights Preview version. This is really fast and we are really satisfied by it. However, new environments really seem a lot slower with the official release. Warm path extraction is a lot slower, but still doable, while cold path extraction becomes unbearable.
EDIT: We need to add &storeType=WarmStore if we want to query warm data. Cool! This works really fast again! The question about the cold store still stands:
It is hard to compare the different environments, because the datasets are not exactly the same, but for our new environment we have about 4.5 TB sensor data imported in TSI.
The following screenshot shows a query that tries to retrieve one minute of data for one device (each device only sends data every 10 seconds) in the far past of 2018. However, the server returns the call after 30 seconds with a continuation token, saying it couldn't retrieve all 6 values in time. Sometimes it manages to return all 6 values, but it still takes 30 seconds.
My internet download speed while performing the query was over 80 Mb per second, so that shouldn't be an issue either.
Is this something we should be worried about in the new release?
Please submit a support ticket through the Azure portal with all of these details and the product team will investigate.
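As an aside, the continuation-token behaviour described in the question is the standard REST paging pattern: re-issue the query with the returned token until the server stops returning one. A generic sketch, where `query_page` and `fake_query` are hypothetical stand-ins for the real TSI query round-trip (the actual API shape differs):

```python
def fetch_all(query_page):
    """Drain a paged API that returns (values, continuation_token).

    `query_page` is a hypothetical callable standing in for one
    query round-trip; it takes the previous token (None at first)
    and returns a page of values plus the next token, or None
    when the result set is exhausted.
    """
    values, token = [], None
    while True:
        page, token = query_page(token)
        values.extend(page)
        if token is None:
            return values

# stand-in for the server: two values per round-trip, three pages
pages = [([1, 2], "t1"), ([3, 4], "t2"), ([5, 6], None)]
def fake_query(token):
    index = 0 if token is None else int(token[1:])
    return pages[index]

print(fetch_all(fake_query))
```

This guarantees all 6 values are collected even when a single call times out partway, though it doesn't address the underlying 30-second latency.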

Kibana's Timelion not showing data when date range is less than 150 minutes

I have an ELK stack on Windows 10, running Kibana 4.5.1, Elasticsearch 2.3.3, Logstash 2.3.2, and Timelion (I can't find the version).
I have been experiencing some odd behavior.
I am collecting "stats" on a date field from Solr. When viewing this data in Timelion with a date range of 150 minutes or less, no data is displayed. In vanilla Kibana the data is present and displays fine for all sizes of date range. If I use data from a different Logstash instance it works fine. The two Logstash instances are identical other than the URLs being polled, and both have been restarted to see if that affected anything.
Can anyone give me a reason why data from this one instance fails to display when using a date range of less than 150 minutes?
I have found out why it's doing this: the buckets were smaller than half the interval at which the data is collected, so there were two adjacent buckets without any data. I am unsure as to why this breaks things, but oh well.
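The effect can be reproduced with a small bucketing sketch: once the bucket size drops below half the collection interval, empty buckets appear between samples (the function and the numbers here are illustrative, not Timelion's internals):

```python
def bucket_counts(timestamps, bucket_size, span):
    """Count samples per fixed-size time bucket across [0, span)."""
    counts = [0] * (span // bucket_size)
    for t in timestamps:
        counts[t // bucket_size] += 1
    return counts

# data collected every 60 s over 10 minutes
samples = list(range(0, 600, 60))

# buckets wider than the collection interval: every bucket has data
print(bucket_counts(samples, bucket_size=120, span=600))

# buckets under half the interval: empty buckets between samples
print(bucket_counts(samples, bucket_size=20, span=600))
```

With 20-second buckets every sample is followed by two empty buckets, which matches the "two adjacent buckets without any data" observation above.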

Exact time for activities monitored by Microsoft band

I want to know the exact times during the night when I was awake or was experiencing deep sleep.
The data that Microsoft exports in CSV format is summed up, while the data in the graph on the dashboard is not time-stamped.
Is there a way to get the exact times when I was awake from the Microsoft Band?
You may be able to get more raw data about sleep activities by pulling it from the Cloud API, though that's certainly not as convenient as using the dashboard.
You can find samples and documentation about the (preview) API here.

How does Twitter's website calculate how long ago a tweet was posted?

Here's a screenshot of my Twitter feed (as of right now, while writing this question).
Notice how the time is relative to me, right now? (Those time differences are correct, btw.)
How do they calculate that?
The reason I ask is that right now I'm in Melbourne, Australia. When I Remote Desktop to a server in the States and log in to Twitter (using the SAME account), I get the same results!
At first, I thought they were calculating this based upon my account settings for Time Zone (which, btw, is set to GMT+10).
But if that was the case, when I remote desktop to the server (which is in San Francisco, CA) it should be showing different results in that RD session, right?
So how could they have coded this, please?
Twitter more than likely stores the date a tweet was posted in UTC, and it knows the time now in UTC (both on your machine and on the server).
Given that those date-times are in the same timezone (UTC), it's a simple matter of taking the difference between the two.
It's the same thing the Stack Exchange sites do to stamp the times for all the activities that you see.
As long as you're able to convert any representation of a date-time to UTC (which pretty much every date API supports), this value can be computed: Twitter pushes the UTC time down to the clients, which then do the math (or it does the math on the server and passes the differences down). The Time Zone setting you see only comes into play when absolute times are displayed to you and you want them in your local timezone.
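The arithmetic described above can be sketched in a few lines; `relative_time` is a hypothetical helper and the thresholds are illustrative rather than Twitter's actual ones:

```python
from datetime import datetime, timezone

def relative_time(posted, now):
    """Format elapsed time feed-style ('42s', '5m', '2h', '3d').

    Both datetimes are UTC, so the result is identical no matter
    which machine's local timezone the feed is viewed from - which
    is why Melbourne and the San Francisco server agree.
    """
    seconds = int((now - posted).total_seconds())
    if seconds < 60:
        return f"{seconds}s"
    if seconds < 3600:
        return f"{seconds // 60}m"
    if seconds < 86400:
        return f"{seconds // 3600}h"
    return f"{seconds // 86400}d"

posted = datetime(2019, 4, 14, 10, 0, tzinfo=timezone.utc)
now = datetime(2019, 4, 14, 12, 30, tzinfo=timezone.utc)
print(relative_time(posted, now))  # 2h
```

Because only the difference matters, neither machine's clock offset nor the account's Time Zone setting affects the output.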

How do I regenerate statistics in Openx?

Due to faulty hardware, statistics generated over a 2 week period were significantly higher than normal (10000 times higher than normal).
After moving the application to a new server, the problem rectified itself. The issue I have is that there are 2 weeks of stats that are clearly wrong.
I have checked the raw impressions table for the affected fortnight and it seems to be correct (i.e. stats per banner per day match the average for the previous month). Looking at the intermediate & summary impressions tables, the values are inflated.
I understand from the openx forum (link text) it's possible to regenerate stats from the raw data but it will only regenerate stats per hour, meaning regenerating stats for 2 weeks would be very time consuming.
Is there another, more efficient way to regenerate the stats from the raw data for the affected fortnight?
Have a look at this link as it appears to have a solution you may find helpful. The solution is similar to the one you posted in your question, but it appears that this one has been modified to make it easier to use. Other than using regenerateAdServerStatistics.php, I do not know of another option for regenerating the statistics you need.
I understand from the openx forum (link text) it's possible to regenerate stats from the raw data but it will only regenerate stats per hour, meaning regenerating stats for 2 weeks would be very time consuming
We have solved this problem on our installation by creating a wrapper shell script for regenerateAdServerStatistics.php with dateStart & dateEnd arguments for situations like the one you mention. It's used to:
1. regenerate statistics for a specific day (all hours, takes ~2h)
2. run normal maintenance to keep today's stats updated
3. go to step 1 as long as the day processed < dateEnd
To be honest the script is somewhat more complex, as we also need to import raw data from our data warehouse for each day to be processed, because the "live" data are kept in an in-memory database, but that's kinda outside this post's scope.
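The wrapper's day-by-day loop can be sketched as follows; the `--start`/`--end` flags and `maintenance.php` are illustrative placeholders, not OpenX's actual CLI, and only the script name comes from the thread:

```python
from datetime import date, timedelta

def regeneration_commands(date_start, date_end):
    """Build the per-day command list a wrapper script would run.

    One regeneration pass per affected day, interleaved with a
    normal maintenance run so today's stats stay current while the
    multi-hour backfill grinds through the fortnight.
    """
    commands = []
    day = date_start
    while day <= date_end:
        commands.append(
            f"php regenerateAdServerStatistics.php "
            f"--start={day}T00:00:00 --end={day}T23:59:59"
        )
        commands.append("php maintenance.php")  # keep today's stats updated
        day += timedelta(days=1)
    return commands

cmds = regeneration_commands(date(2010, 3, 1), date(2010, 3, 14))
print(len(cmds))  # 14 days x 2 commands
```

Driving the per-hour regeneration a day at a time like this doesn't make it faster, but it makes a two-week backfill restartable and keeps current stats flowing in between.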
