I'm wondering about the difference between the R environment and R options. Specifically, what is the difference between the settings held in this place:
Sys.getenv()
vs. this place
getOption()
When calling each of them, I receive quite different settings but I can't figure out what the difference is on a conceptual level. What is each of these intended for? When deciding in which location to let the user store their API keys, which one is the better (safer, faster, more logical) place to store them in? Does either of these settings persist after closing R?
I am looking to get the values from a reactive function, based on a specific set of inputs, but without the dashboard open or visible. Is this possible?
I have two candidate approaches, but I am not sure whether either is feasible or which of the two would be better. There may also be a better solution I have not thought of.
Start the dashboard at a specific time and write out a CSV with the required information. I don't know how to set the inputs to the right values, though (they would differ from the normal initial settings). Ideally this would run without a visible front end (a headless chromedriver, for example).
Within Shiny, check whether it is a specific time. If so, change all inputs to the right settings and export or otherwise process the required values. I'd rather avoid this, because users will be working in the dashboard at that time and I don't want to disturb their work.
To give a bit more context, I would like to obtain values/dataframes from several dashboards at a specific time. These values are calculated via a chain of multiple reactives and inputs. I cannot trust that these dashboards are already running, so I need to start them.
Is there any way to list the kinds in Google's Datastore that are no longer used by our App Engine app, without having to look into our code and/or logic? : )
I'm not talking about indexes, which I can list by running
gcloud datastore indexes list
and then compare with the datastore-indexes.xml or index.yaml.
I tried to check datastore kinds statistics and other metadata but I could not find anything useful to help me on this matter.
Should I give up on getting useful stats directly from Datastore and instead write something that keeps collecting Datastore statistics (such as data size) over a long period, so that I at least have a clue about which kinds are not being used, and only then look into our app code to see whether the kind's Model was removed?
Example:
select bytes from __Stat_Kind__
Store it somewhere and keep updating it for a period. If a kind's size in bytes does not change, then the kind is probably not being used anymore.
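To make the snapshot idea concrete, here is a minimal Python sketch. It assumes the per-kind byte sizes have already been copied out of __Stat_Kind__ on different days into plain dictionaries; fetching them is not shown, and the function name and the sample kinds are hypothetical.

```python
from typing import Dict, List


def unchanged_kinds(snapshots: List[Dict[str, int]]) -> List[str]:
    """Return kinds whose byte size is identical in every snapshot.

    Each snapshot maps a kind name to its size in bytes.
    """
    if len(snapshots) < 2:
        return []  # a single snapshot cannot show change or the lack of it
    first = snapshots[0]
    stale = []
    for kind, size in first.items():
        # A kind that is missing from a later snapshot counts as changed.
        if all(snap.get(kind) == size for snap in snapshots[1:]):
            stale.append(kind)
    return sorted(stale)


# Hypothetical snapshots, for illustration only.
snapshots = [
    {"Invoice": 1200, "LegacyImport": 560, "User": 9000},
    {"Invoice": 1350, "LegacyImport": 560, "User": 9100},
    {"Invoice": 1500, "LegacyImport": 560, "User": 9100},
]
print(unchanged_kinds(snapshots))  # ['LegacyImport']
```

A constant size only says that nothing was written; a kind that is still read but never updated would also show up here, so the result is a list of candidates to ask around about, not a list of kinds that are safe to delete.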
The idea is to do some cleaning in datastore.
I would like to find which kinds are no longer used, perhaps have not been for a long time, or were created manually to be used once... You know, like a table in Oracle that no one knows the purpose of, where the table statistics would show that it was used only once, 5 years ago. I'm trying to achieve the same in Datastore: I want to know which kinds are not being used anymore or were last used a while ago, then ask around and back up/delete them if no owner is found.
It's an interesting question.
I think your best option is to audit your code and instill an organizational practice that requires this documentation in future as a business|technical pre-prod requirement.
IIRC, Datastore doesn't automatically timestamp Entities and keys (rightly) aren't incremental. So there appears no intrinsic mechanism to track changes short of taking a snapshot (expensive) and comparing your in-flight and backup copies for changes (also expensive and inconclusive).
One challenge with identifying a Kind that appears to be non-changing is that it could be referenced (rarely) by another Kind and so, while it does not change, it is required.
Auditing your code and documenting it for posterity should not only provide you with a definitive answer (and identify owners) but also pay off a significant technical debt that has been incurred, and it avoids this problem and probably future ones (e.g. GDPR-like requirements) as they arise.
Assuming you are referring to records being created/updated, then I can think of the following options
Via the Cloud Console (Datastore > Dashboard) - This lists all your 'Kinds' and the number of records in each Kind. Theoretically, you can take a screenshot periodically and compare the counts to see which Kinds have grown and which have not.
Use of Created/LastModified Date columns - I usually add these 2 columns to most of my datastore tables. If you have them, then you can have a stored function that queries them. For example, you run a query to sort all of your Kinds in descending order of creation (or last modified date) and you only pull the first record from each one. This tells you the last time a record was created or modified.
I would write a function as part of my App, put it behind a page which requires admin privilege (only app creator can run it) and then just clicking a link on my App would give me the information.
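The last-modified approach could look roughly like the sketch below. It is plain Python over in-memory (kind, timestamp) pairs; in a real app each kind's newest timestamp would come from a query sorted descending on your own last-modified property and limited to one record. The function names, kinds, and dates are all made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Tuple


def latest_per_kind(records: Iterable[Tuple[str, datetime]]) -> Dict[str, datetime]:
    """Keep only the newest modification timestamp seen for each kind."""
    latest: Dict[str, datetime] = {}
    for kind, modified in records:
        if kind not in latest or modified > latest[kind]:
            latest[kind] = modified
    return latest


def stale_kinds(latest: Dict[str, datetime], now: datetime,
                max_age: timedelta) -> List[str]:
    """Return kinds whose newest record is older than max_age."""
    return sorted(kind for kind, ts in latest.items() if now - ts > max_age)


# Hypothetical data, for illustration only.
records = [
    ("Invoice", datetime(2023, 11, 2)),
    ("Invoice", datetime(2023, 12, 20)),
    ("LegacyImport", datetime(2019, 3, 5)),
]
now = datetime(2024, 1, 1)
print(stale_kinds(latest_per_kind(records), now, timedelta(days=365)))
# ['LegacyImport']
```

This only works for kinds that actually carry such a timestamp, which is why I add the two columns up front.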
I'm beginning the process of instrumenting a web application, and using StatsD to gather as many relevant metrics as possible. For instance, here are a few examples of the high-level metric names I'm currently using:
http.responseTime
http.status.4xx
http.status.5xx
view.renderTime
oauth.begin.facebook
oauth.complete.facebook
oauth.time.facebook
users.active
...and there are many, many more. What I'm grappling with right now is establishing a consistent hierarchy and set of naming conventions for the various metrics, so that the current ones make sense and that there are logical buckets within which to add future metrics.
My question is twofold:
What relevant metrics are you gathering that you have found indispensable?
What naming structure are you using to categorize metrics?
This is a question that has no definitive answer but here's how we do it at Datadog (we are a hosted monitoring service so we tend to obsess over these things).
1. Which metrics are indispensable? It depends on the beholder. But at a high level, for each team, it is any metric that is as close to their goals as possible (which may not be the easiest to gather).
System metrics (e.g. system load, memory etc.) are trivial to gather but seldom actionable, because it is too hard to reliably connect them to a probable cause.
On the other hand, the number of completed product tours matters to anyone tasked with making sure new users are happy from the first minute they use the product. StatsD makes this kind of stuff trivially easy to collect.
We have also found that the core set of key metrics for any team changes as the product evolves, so there is a continuous editorial process.
Which in turn means that anyone in the company needs to be able to pick and choose which metrics matter to them. No permissions asked, no friction to get to the data.
2. Naming structure: The highest level of hierarchy is the product line or the process. Our web frontend is internally called dogweb, so all the metrics from that component carry the prefix dogweb. (trailing dot included). The next level of hierarchy is the sub-component, e.g. dogweb.db., dogweb.http., etc.
The last level of hierarchy is the thing being measured (e.g. renderTime or responseTime).
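As a rough illustration of the three levels, a small helper can build the names so that every metric follows the same shape. This is only a sketch: the helper name and the validation rules are assumptions, not something StatsD provides.

```python
def metric_name(product: str, component: str, measurement: str) -> str:
    """Join product, sub-component, and measurement into one dotted name."""
    parts = [product, component, measurement]
    for part in parts:
        # A dot inside a part would silently add a hierarchy level.
        if not part or "." in part:
            raise ValueError(f"invalid metric name part: {part!r}")
    return ".".join(parts)


print(metric_name("dogweb", "http", "responseTime"))  # dogweb.http.responseTime
print(metric_name("dogweb", "db", "queryTime"))       # dogweb.db.queryTime
```

Routing every call site through one helper like this keeps new metrics from drifting away from the hierarchy.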
The unresolved issue in Graphite is the encoding of metric metadata in the metric name (and selection using *, e.g. dogweb.http.browser.*.renderTime). It's clever but can get in the way.
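To show what selection by * amounts to, here is a small Python approximation in which the wildcard matches exactly one dot-separated level. It is not Graphite's matcher (which supports more syntax), and the browser names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List


def matches(pattern: str, name: str) -> bool:
    """Match a dotted metric name level by level; * never crosses a dot."""
    pattern_parts = pattern.split(".")
    name_parts = name.split(".")
    if len(pattern_parts) != len(name_parts):
        return False
    return all(fnmatchcase(n, p) for p, n in zip(pattern_parts, name_parts))


def select(pattern: str, names: List[str]) -> List[str]:
    """Return the names that match the pattern, in their original order."""
    return [name for name in names if matches(pattern, name)]


# Hypothetical metric names, for illustration only.
names = [
    "dogweb.http.browser.chrome.renderTime",
    "dogweb.http.browser.firefox.renderTime",
    "dogweb.http.browser.chrome.responseTime",
    "dogweb.db.queryTime",
]
print(select("dogweb.http.browser.*.renderTime", names))
# ['dogweb.http.browser.chrome.renderTime', 'dogweb.http.browser.firefox.renderTime']
```

The browser is metadata, yet it has to occupy a fixed position in the name for the wildcard to find it, and that is the part that gets in the way.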
We ended up implementing explicit metadata in our data model, but this is not in statsd/graphite so I will leave the details out. If you want to know more, contact me directly.
I'm setting up Graphite, and hit a problem with how data is represented on the screen when there aren't enough pixels.
I found this post whose first answer is very close to what I'm looking for:
No what is probably happening is that you're looking at a graph with more datapoints than pixels, which forces Graphite to aggregate the datapoints. The default aggregation method is averaging, but you can change it to summing by applying the cumulative() function to your metrics.
Is there any way to get this cumulative() behavior by default?
I've modified my storage-aggregation.conf to use 'aggregationMethod = sum', but I believe this is for historical data and not for data that's displayed in the UI.
When I apply cumulative() everything is perfect, I'm just wondering if there's a way to get this behavior by default.
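To make the difference concrete, here is a small Python sketch of what happens when there are more datapoints than pixels. It is a simplified model and not Graphite's actual code: the function name and bucket logic are assumptions, and missing (None) values are ignored.

```python
from typing import List


def consolidate(points: List[float], max_points: int,
                method: str = "average") -> List[float]:
    """Squeeze points into at most max_points values, bucket by bucket."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    if len(points) <= max_points:
        return list(points)  # enough pixels, nothing to merge
    bucket_size = -(-len(points) // max_points)  # ceiling division
    result = []
    for start in range(0, len(points), bucket_size):
        bucket = points[start:start + bucket_size]
        if method == "sum":
            result.append(sum(bucket))
        elif method == "average":
            result.append(sum(bucket) / len(bucket))
        else:
            raise ValueError(f"unknown method: {method!r}")
    return result


counts = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]  # six datapoints, room for three
print(consolidate(counts, 3, "average"))  # [3.0, 7.0, 11.0]
print(consolidate(counts, 3, "sum"))      # [6.0, 14.0, 22.0]
```

With averaging, a counter looks smaller the further I zoom out, whereas summing preserves the totals. That is the behavior I would like by default.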
I'm guessing that even though you've modified your storage-aggregation.conf to use 'aggregationMethod = sum', the metrics you've already created have not changed their aggregationMethod. The rules in storage-aggregation.conf only affect new metrics.
To change your existing metrics to be summed instead of averaged, you'll need to use whisper-resize.py. Or you can delete your existing metrics and they'll be recreated with sum.
Here's an example of what you might need to run:
whisper-resize.py --xFilesFactor=0.0 --aggregationMethod=sum /opt/graphite/storage/whisper/stats_counts/path/to/your/metric.wsp 10s:28d 1m:84d 10m:1y 1h:3y
Make sure to run that as the same user who owns the file, or at least make sure the files have the same ownership when you're done, otherwise they won't be writeable for new data.
Another possibility if you're using statsd is that you're just using metrics under stats instead of stats_counts. From the statsd README:
In the legacy setting rates were recorded under stats.counter_name
directly, whereas the absolute count could be found under
stats_count.counter_name. With disabling the legacy namespacing those
values can be found (with default prefixing) under
stats.counters.counter_name.rate and stats.counters.counter_name.count
now.
Basically, metrics are aggregated differently under the different namespaces when using statsd, and you want things that should be summed to live under stats_counts or stats.counters.
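As a quick reference, the mapping can be written down as a small Python helper. This is a sketch that assumes default prefixes and takes stats_counts as the legacy count prefix (the spelling used in the whisper path above); the counter name is hypothetical.

```python
from typing import Dict


def counter_paths(name: str, legacy: bool) -> Dict[str, str]:
    """Return the Graphite paths where a statsd counter's rate and count land."""
    if legacy:
        return {
            "rate": f"stats.{name}",          # per-second rate, fine to average
            "count": f"stats_counts.{name}",  # absolute count, should be summed
        }
    return {
        "rate": f"stats.counters.{name}.rate",
        "count": f"stats.counters.{name}.count",
    }


print(counter_paths("logins", legacy=True)["count"])   # stats_counts.logins
print(counter_paths("logins", legacy=False)["count"])  # stats.counters.logins.count
```

If your graph points at the "rate" path, summing it will give odd numbers; the "count" path is the one to pair with sum aggregation.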
I have a stored procedure that produces a number, let's say 50, which is rendered as an anchor with the number as the text. When the user clicks the number, a popup opens, calls a different stored procedure, and shows 50 rows in an HTML table. The 50 rows are the disaggregation of the number the user clicked. In summary, there are two different aspx pages and two different stored procedures that need to show the same amount: one amount is the aggregate and the other is the disaggregation of the aggregate.
Question: how do I test this code so that I know that, if the numbers do not match, there is an error somewhere?
Note: This is a simplified example, in reality there are 100s of anchor tags on the page.
This kind of testing falls outside of the standard / code level testing paradigm. Here you are explicitly validating the data and it sounds like you need a utility to achieve this.
There are plenty of environments in which to do this and approaches you can take, but here are two possible candidates:
SQL Management Studio: here you can write a simple script that runs through the various combinations from the two stored procedures, ensuring that the number and the rows match up. This will involve some inventive T-SQL but nothing particularly taxing. The main advantage of this approach is that you'll have bare-metal access to the data.
Unit Testing: as mentioned, your problem is somewhat outside of the typical testing scenario, where you would ordinarily mock the data and test your business logic. However, that doesn't mean you cannot write the tests (especially if you are doing any Dataset manipulation prior to this processing). Check out this link and this one for various approaches (note: if you're using VS2008 or above, you get the Testing Projects built in from the Professional Version up).
In order to test what happens when the numbers do not match, I would simply change (temporarily) one of the stored procedures to return the correct amount +1, or always return zero, etc.
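The comparison itself can be sketched in a few lines. The example below uses Python with an in-memory SQLite table as a stand-in for SQL Server and the two stored procedures; the table, the statuses, and the function names are all hypothetical.

```python
import sqlite3
from typing import Callable, Iterable, List, Sequence, Tuple


def aggregate_count(conn: sqlite3.Connection, status: str) -> int:
    """Stand-in for the procedure behind the anchor: returns one number."""
    row = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE status = ?", (status,)
    ).fetchone()
    return row[0]


def detail_rows(conn: sqlite3.Connection, status: str) -> List[tuple]:
    """Stand-in for the procedure behind the popup: returns the rows."""
    return conn.execute(
        "SELECT id, status FROM orders WHERE status = ?", (status,)
    ).fetchall()


def find_mismatches(
    keys: Iterable[str],
    aggregate: Callable[[str], int],
    detail: Callable[[str], Sequence[tuple]],
) -> List[Tuple[str, int, int]]:
    """Return (key, aggregate, row count) wherever the two disagree."""
    mismatches = []
    for key in keys:
        total = aggregate(key)
        rows = len(detail(key))
        if total != rows:
            mismatches.append((key, total, rows))
    return mismatches


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)",
    [("open",)] * 3 + [("closed",)] * 2,
)
statuses = ["open", "closed"]

# Both sides agree, so nothing is reported.
print(find_mismatches(statuses,
                      lambda s: aggregate_count(conn, s),
                      lambda s: detail_rows(conn, s)))  # []

# Deliberately break the aggregate (+1) to confirm the check catches it.
print(find_mismatches(statuses,
                      lambda s: aggregate_count(conn, s) + 1,
                      lambda s: detail_rows(conn, s)))
# [('open', 4, 3), ('closed', 3, 2)]
```

With hundreds of anchors on the page, the list of keys would be every parameter combination the page can produce, and the same loop covers them all.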