how to run aggregated update report? - sbt

If one types update in the sbt console, it runs an aggregated report that typically takes a minute or so for a project.
However, if one programatically runs update for each ProjectRef it is cronically slow (10 minutes to an hour is not unheard of).
How can one programmatically run the same (faster) aggregated update report that the console runs?

If one types update in the sbt console, it runs an aggregated report that typically takes a minute or so for a project.
The implementation of the update task is available here:
https://github.com/sbt/sbt-zero-thirteen/blob/v0.13.9/main/src/main/scala/sbt/Defaults.scala#L1325-L1443
The main thing it adds there is caching based on the input parameters.
Not sure what you mean by aggregated. Do you mean aggregated across the configurations (e.g. Compile and Test?)

Basically this PR is how I ended up doing it
https://github.com/ensime/ensime-sbt/pull/122
which meant setting up an aggregated report in a single task and calling that once, which is referenced later on.

Related

Firebase, how to implement scheduler?

When some information is stored in the firestore, each document is storing some specific time in the future, and according to that time, the event should occur in the user's app.
The first way I could find was the Cloud Function pub sub scheduler. However, I could not use this because the time is fixed.
The second method was to use Cloud Function + Cloud Task. I have referenced this. 
https://medium.com/firebase-developers/how-to-schedule-a-cloud-function-to-run-in-the-future-in-order-to-build-a-firestore-document-ttl-754f9bf3214a
This perfectly performed the function I really wanted, but there was a fatal drawback in the Cloud Task, because I could only save the event within 30 days. In other words, future time exceeding 30 days did not apply to this.
I want this event to be saved over the long term. And I want it to be somewhat smooth for large traffic.
I`m using Flutter/Firebase, how to implement this requirements above?
thank you for reading happy new year
You could check in the function that gets activated on document creation if the task is due in more than 30 days, and if so, store it somewhere else (maybe another document). Then have another process that checks if the task is now within the 30 days range and then have it do the same as the newly created ones. This second process could be run every week or two weeks.

Schedule function in firebase

The problem
I have a firebase application in combination with Ionic. I want the user to create a group and define a time, when the group is about to be deleted automatically. My first idea was to create a setTimeout(), save it and override it whenever the user changes the time. But as I have read, setTimeout() is a bad solution when used for long durations (because of the firebase billing service). Later I have heard about Cron, but as far as I have seen, Cron only allows to call functions at a specific time, not relative to a given time (e.g. 1 hour from now). Ideally, the user can define any given time with a datetime picker.
My idea
So my idea is as following:
User defines the date via native datepicker and the hour via some spinner
The client writes the time into a seperate firebase-database with a reference of following form: /scheduledJobs/{date}/{hour}/{groupId}
Every hour, the Cron task will check all the groups at the given location and delete them
If a user plans to change the time, he will just delete the old value in scheduledJobs and create a new one
My question
What is the best way to schedule the automatic deletion of the group? I am not sure if my approach suits well, since querying for the date may create a very flat and long list in my database. Also, my approach is limited in a way, that only full hours can be taken as the time of deletion and not any given time. Additionally I will need two inputs (date + hour) from the user instead of just using a datetime (which also provides me the minutes).
I believe what you're looking for is node schedule. Basically, it allows you to run serverside cron jobs, it has the ability to take date-time objects and schedule the job at that time. Since I'm assuming you're running a server for this, this would allow you to schedule the deletion at whatever time you wish based on the user input.
An alternative to TheCog's answer (which relies on running a node server) is to use Cloud Functions for Firebase in combination with a third party server (e.g. cron-jobs.org) to schedule their execution. See this video for more or this blog post for an alternative trigger.
In either of these approaches I recommend keeping only upcoming triggers in your database. So delete the jobs after you've processed them. That way you know it won't grow forever, but rather will have some sort of fixed size. In fact, you can query it quite efficiently because you know that you only need to read jobs that are scheduled before the next trigger time.
If you're having problems implementing your approach, I recommend sharing the minimum code that reproduces where you're stuck as it will be easier to give concrete help that way.

Error during a wikidata update

I've created a local version of the wikidata api using the instructions here, and after running munge.sh with the default options, I've run
./runUpdate.sh -n wdq which resulted with the following error message.
ERROR org.wikidata.query.rdf.tool.Update -
RDF store reports the last update time is before the minimum safe poll time.
You will have to reload from scratch or you might have missing data.
What does it mean? Should I munge again before updating?
The default updater can only currently update based on what is in RecentChanges for the wiki.
The default for this is 30 days, so if the dump that you imported is from longer than 30 days ago the updater will fail.
There are options that can now be passed to the updater script to look into the history of RecentChanges for longer periods.
You can also set the last updater triple that the check is performed on.
These options can be seen discussed in https://phabricator.wikimedia.org/T182394 (but im not sure better docs currently exist):
"wikibaseMaxDaysBack" can be used to set the maximum days to look back in RecentChanges
"init" can be used to set the last updated triple

Is there a way to define a trigger that runs reliably at a datetime specified as a field in the updated/created object?

The question
Is it possible (and if so, how) to make it so when an object's field x (that contains a timestamp) is created/updated a specific trigger will be called at the time specified in x (probably calling a serverless function)?
My Specific context
In my specific instance the object can be seen as a task. I want to make it so when the task is created a serverless function tries to complete the task and if it doesn't succeed it updates the record with the partial results and specifies in a field x when the next attempt should happen.
The attempts should not span at a fixed interval. For example, a task may require 10 successive attempts at approximately every 30 seconds, but then it may need to wait 8 hours.
There currently is no way to (re)trigger a Cloud Function on a node after a certain timespan.
The closest you can get is by regularly scheduling a cron job to run on the list of tasks. For more on that, see this sample in the function-samples repo, this blog post by Abe, and this video where Jen explains them.
I admit I never like using this cron-job approach, since you have to query the list to find the items to process. A while ago, I wrote a more efficient solution that runs a priority queue in a node process. My code was a bit messy, so I'm not quite ready to share it, but it wasn't a lot (<100 lines). So if the cron-trigger approach doesn't work for you, I recommend investigating that direction.

Teradata Change data capture

My team is thinking about developing a real time application (a bunch of charts, gauges etc) reading from the database. At the backend we have a high volume Teradata database. We expect some other applications to be constantly feeding in data into this database.
Now we are wondering about how to feed in the changes from the database to the application. Polling from the application would not be a viable option in our case.
Are there any tools that are available within Teradata that would help us achieve this?
Any directions on this would be greatly appreciated
We faced similar requirement. But in our case client asked us to provide daily changes to a purchase orders table. That means we had to run a batch of scripts every day to capture the changes occuring to the table.
So we started to collect data every day and store the data in a sparse history format in another table. So the process is simple here. We collect a purchase order details record in the against first day's date in the history table. And then the next day we compare the next day's feed record against the history record and identify any change in that record. If there is a change in the purchase order record columns we collect that record and keep it in a final reporting table which will be shown to the client.
If you run the batch scripts every day once and there will be more than one change in a day to a record then this method cannot give you the full changes. For that you may need to run the batch scripts more than once every day based on your requirement.
Please let us know if you find any other solution. Hope this helps.
There is a change data capture tool from wisdomforce.
http://www.wisdomforce.com/resources/docs/databasesync/DatabaseSyncBestPracticesforTeradata.pdf
It would it probably work in this case
Are triggers with stored procedures an option?
CREATE TRIGGER dbname.triggername
AFTER INSERT ON db_name.tbl_name
REFERENCING stored_procedure
Theoretically speaking, you can write external stored procedures which may call UDFs written in Java or C/C++ etc which can push the row data to your application in near real time.

Resources