I have two data objects: customers and jobs. Job records are created based on certain fields of the customer record. A job record is created for each service visit to the customer, which happens on a recurring weekly basis.
So I'm considering the best way to create job records. I can either:
Use a server function to create job records on the backend, in batches (say, quarterly), so I'd have job records for 12 weeks ahead. This way I can just query the jobs table for any operation in the presentation layer.
Use fields on the customers table to create jobs in the presentation layer, creating the job record only after some interaction with the presentation layer. This way, jobs are always created from up-to-date data.
I think I should go with the second approach, but it seems like I might be committing a design transgression when it comes to handling the data and presentation layers.
Is there some concept that encapsulates this type of problem?
--
Drawback to the first approach: the server function would have to run after any change to the customer record so that the jobs stay updated. I suppose I could schedule the function to run every night (cron job) so I'm getting updated records every day, but I think there should be a simpler way.
This is kind of an opinion question, and I suspect it might get removed.
But I would go with #2, always. With #1 you're creating a lot of empty data records that hold no value and may or may not get used. #2 also gives you an opportunity to present the data to the user for verification before saving the job.
I am working on an asset tracking system that also manages the concept of "projects". The users of this application perform maintenance activities on their customer's assets, so they need an action log where actions on an asset start life as a task in a project. For example, "Fix broken frame" might be a task where an action would have something like "Used parts a, b, and c to fix the frame" with a completed time and the employee who performed the action.
The conceptual data model for the application starts with a Customer that has multiple locations and each location has multiple assets. Each asset should have an associated action log so it is easy to view previous actions applied to that asset.
To me, that should all go in one table based upon the logical ownership of that data. Customer owns Locations which own Assets which own Actions.
I believe I should have a second table for projects, as this data is tangential to the Customer/Location/Asset data. However, because I've read so much about how it should all be one table, I'm not sure whether this delineation exists only because I've modeled the data incorrectly, since I can't get past the 3NF modeling I've used for my entire career.
Single-table design doesn't forbid you from creating multiple tables. Instead, it encourages you to use only a single table per microservice (meaning: store related data that you want to access together in the same table).
Let's look at some anecdotes from experts:
Rick Houlihan tweeted over a year ago
Using a table per entity in DynamoDB is like deploying a new server for each table in RDBMS. Nobody does that. As soon as you segregate items across tables you can no longer group them on a GSI. Instead you must query each table to get related items. This is slow and expensive.
Alex DeBrie responded to a tweet last August
Think of it as one table per service, not across your whole architecture. Each service should own its own table, just like with other databases. The key around single table is more about not requiring a table per entity like in an RDBMS.
Based on this, you should ask yourself:
How related is the data?
If you were building this on a relational database, would you store it in separate databases?
Are those actually 2 separate micro services, or is it part of the same micro service?
...
Based on the answers to those (and similar) questions, you can argue either for keeping it in one table or for splitting it across two tables.
I have a SQL Azure database on which I need to perform some data archiving operation.
The plan is to move all the irrelevant data from the actual tables into Archive_* tables.
I have tables which have up to 8-9 million records.
One option is to write a stored procedure that inserts the data into the new Archive_* tables and also deletes it from the actual tables.
But this operation is really time-consuming, running for more than 3 hours.
I am in a situation where I can't have more than an hour's downtime.
How can I make this archiving faster?
You can use Azure Automation to schedule execution of a stored procedure every day at the same time, during your maintenance window, where this stored procedure archives only the oldest week or month of data each time it runs. The stored procedure should archive only data older than X weeks/months/years. Please read this article to create the runbook. Within a few days you will have all the old data archived, and the runbook will continue to do the job from then on.
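To make that concrete, here is a rough T-SQL sketch of what such a stored procedure could look like; dbo.Orders, dbo.Archive_Orders and the CreatedDate column are placeholder names, and Archive_Orders is assumed to have the same columns as Orders:

CREATE PROCEDURE dbo.usp_ArchiveOldestWeek
AS
BEGIN
    SET NOCOUNT ON;

    -- Keep 12 months of live data (placeholder retention rule)
    DECLARE @Cutoff date = DATEADD(MONTH, -12, GETUTCDATE());

    -- Archive at most one week of the oldest data per run
    DECLARE @From date = (SELECT MIN(CAST(CreatedDate AS date)) FROM dbo.Orders);
    DECLARE @To   date = DATEADD(WEEK, 1, @From);
    IF @To > @Cutoff SET @To = @Cutoff;

    BEGIN TRANSACTION;
        INSERT INTO dbo.Archive_Orders
        SELECT * FROM dbo.Orders
        WHERE CreatedDate >= @From AND CreatedDate < @To;

        DELETE FROM dbo.Orders
        WHERE CreatedDate >= @From AND CreatedDate < @To;
    COMMIT TRANSACTION;
END;

Scheduled daily from the Azure Automation runbook, each execution stays short because it only ever touches about one week of rows.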
You can't make it faster, but you can make it seamless. The first option is to have a separate task that moves data in portions from the source tables to the archive tables. To prevent table lock escalation and overall performance degradation, I would suggest limiting the size of a single transaction: for example, start a transaction, insert N records into the archive table, delete those records from the source table, and commit the transaction. Continue for a few days until all the necessary data is transferred. The advantage of this approach is that if there is some kind of failure, you can restart the archival process and it will continue from the point of failure.
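A minimal sketch of that loop in T-SQL, with placeholder names (dbo.Orders, dbo.Archive_Orders, CreatedDate) and using the OUTPUT clause so each batch is copied and deleted in a single statement:

DECLARE @BatchSize int = 10000;  -- keep batches small to avoid lock escalation
DECLARE @Moved int = 1;

WHILE @Moved > 0
BEGIN
    BEGIN TRANSACTION;

    -- Delete one batch from the source table and copy the deleted rows
    -- into the archive table in the same statement.
    DELETE TOP (@BatchSize)
    FROM dbo.Orders
    OUTPUT deleted.* INTO dbo.Archive_Orders
    WHERE CreatedDate < '2020-01-01';   -- placeholder cutoff

    SET @Moved = @@ROWCOUNT;            -- capture before COMMIT resets it

    COMMIT TRANSACTION;
END;

You can run this from a job or a plain query window, stop it at any time, and restart it later; it simply carries on with whatever rows are still left in the source table.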
The second option, which does not exclude the first one, really depends on how critical the performance of the source tables is for you and how many updates are happening to them. If that is not a problem, you can write triggers that pour every inserted/updated record into an archive table. Then, when you want a cleanup, all you need to do is delete the obsolete records from the source tables; their copies will already be in the archive tables.
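A rough sketch of such a trigger in T-SQL, again with placeholder names and assuming the archive table has the same columns as the source table:

CREATE TRIGGER dbo.trg_Orders_CopyToArchive
ON dbo.Orders
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Copy every inserted or updated row into the archive table,
    -- so the later cleanup only has to delete from dbo.Orders.
    INSERT INTO dbo.Archive_Orders
    SELECT * FROM inserted;
END;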
In both cases you will not need any downtime.
The problem
I have a Firebase application in combination with Ionic. I want the user to create a group and define a time at which the group is to be deleted automatically. My first idea was to create a setTimeout(), save it and override it whenever the user changes the time. But as I have read, setTimeout() is a bad solution when used for long durations (because of the Firebase billing service). Later I heard about cron, but as far as I have seen, cron only allows calling functions at a specific time, not relative to a given time (e.g. 1 hour from now). Ideally, the user can define any given time with a datetime picker.
My idea
So my idea is as follows:
User defines the date via native datepicker and the hour via some spinner
The client writes the time into a separate Firebase database with a reference of the following form: /scheduledJobs/{date}/{hour}/{groupId}
Every hour, the Cron task will check all the groups at the given location and delete them
If a user plans to change the time, he will just delete the old value in scheduledJobs and create a new one
My question
What is the best way to schedule the automatic deletion of the group? I am not sure whether my approach suits this well, since querying by date may create a very flat and long list in my database. Also, my approach is limited in that only full hours can be used as the time of deletion, not an arbitrary time. Additionally, I will need two inputs (date + hour) from the user instead of just using a datetime (which would also give me the minutes).
I believe what you're looking for is node schedule. Basically, it lets you run server-side cron-like jobs, and it can take date-time objects and schedule a job at that time. Since I'm assuming you're running a server for this, it would allow you to schedule the deletion at whatever time you wish, based on the user input.
An alternative to TheCog's answer (which relies on running a node server) is to use Cloud Functions for Firebase in combination with a third party server (e.g. cron-jobs.org) to schedule their execution. See this video for more or this blog post for an alternative trigger.
In either of these approaches I recommend keeping only upcoming triggers in your database. So delete the jobs after you've processed them. That way you know it won't grow forever, but rather will have some sort of fixed size. In fact, you can query it quite efficiently because you know that you only need to read jobs that are scheduled before the next trigger time.
If you're having problems implementing your approach, I recommend sharing the minimum code that reproduces where you're stuck as it will be easier to give concrete help that way.
I'm designing an ASP.NET application which builds an overview of all the sales per partner in a period of time.
How it works so far:
Select all partnerNo (SQL Server) and add them to a List (ASP.NET)
Select the sales of partnerNo1 over the period of time (SQL Server), summarize them (ASP.NET) and add them to a DataTable (ASP.NET)
Select the sales of partnerNo2 over the period of time, summarize them and add them to the DataTable
Select the sales of partnerNo3 over the period of time, summarize them and add them to the DataTable
and so on
Now here is the problem: if I select only the TOP 100 partnerNo, it takes a while, but I get a result. If I change the TOP to 1000, SQL Server processes the SQL statements (I can see it working in Activity Monitor) and the IIS server keeps feeding it new SQL selects, but after a while IIS terminates the page request from the browser, so no result is shown.
I really hope I have explained it well enough for someone to help me.
With regards
Dirk Th.
That's the RBAR anti-pattern. It should be possible to create one SQL query that returns summarized information from all partners.
That's typically much faster: less data has to go over the line, and less often. A round trip to the database can cost 50 ms; if you do 600 of those, you're already at the 30-second timeout for web pages.
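For example, a single grouped query along these lines (the Sales table and its PartnerNo, Amount and SaleDate columns are placeholders for your actual schema) returns one summarized row per partner in a single round trip:

-- @PeriodStart / @PeriodEnd are parameters supplied from the ASP.NET page
SELECT s.PartnerNo,
       COUNT(*)      AS SaleCount,
       SUM(s.Amount) AS TotalSales
FROM   dbo.Sales AS s
WHERE  s.SaleDate >= @PeriodStart
  AND  s.SaleDate <  @PeriodEnd
GROUP BY s.PartnerNo
ORDER BY s.PartnerNo;

You can then bind that one result set to your DataTable (or DTOs) instead of issuing a separate query per partner.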
If you have .NET Framework 4.5, AND getting the summary data for each partnerNo is independent of the others, you can try parallel tasks.
http://msdn.microsoft.com/en-us/library/dd460720.aspx
Now, that's not a simple subject. But it would allow you to take advantage of multiple processors.
Number one rule. You CANNOT RELY ON SEQUENCE.
........
Option 2, a more "traditional" approach is to hit the database for everything you need.
I would abandon DataTables, and start using DTO or POCO objects.
Then you can author mini "read only properties" that replace your calculated/derived data-table columns.
Go to the database, do not use cursors or looping, and hit the database for all the info you need. After you get it back, stuff it into DTOs/POCOs, relying on read-only properties where you can (for derived values), and then if you have to run some business logic to figure out some derived values, do that.
If you're "stuck" with a DataSet/DataTable for the presentation layer, you can loop over your DTOs/POCOs and stuff them into a DataSet/DataTable.
My team is thinking about developing a real time application (a bunch of charts, gauges etc) reading from the database. At the backend we have a high volume Teradata database. We expect some other applications to be constantly feeding in data into this database.
Now we are wondering about how to feed in the changes from the database to the application. Polling from the application would not be a viable option in our case.
Are there any tools that are available within Teradata that would help us achieve this?
Any directions on this would be greatly appreciated
We faced a similar requirement. But in our case the client asked us to provide daily changes to a purchase orders table. That means we had to run a batch of scripts every day to capture the changes occurring to the table.
So we started to collect data every day and store it in a sparse history format in another table. The process is simple: we store a purchase order detail record against the first day's date in the history table. Then, the next day, we compare that day's feed record against the history record and identify any change in that record. If any of the purchase order record's columns have changed, we collect that record and keep it in a final reporting table which is shown to the client.
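To make the comparison step concrete, here is a rough sketch of the daily diff in SQL; all names are placeholders (po_feed is today's load, po_history is yesterday's snapshot, po_changes is the reporting table, and status/amount stand in for whatever columns you track):

INSERT INTO po_changes (po_number, old_status, new_status, old_amount, new_amount, change_date)
SELECT f.po_number,
       h.status,
       f.status,
       h.amount,
       f.amount,
       CURRENT_DATE
FROM   po_feed f
JOIN   po_history h
  ON   h.po_number = f.po_number
 AND   h.snapshot_date = CURRENT_DATE - 1
WHERE  f.status <> h.status
    OR f.amount <> h.amount;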
If you run the batch scripts only once a day and there is more than one change to a record within that day, this method cannot give you all of the changes. For that you may need to run the batch scripts more than once a day, based on your requirement.
Please let us know if you find any other solution. Hope this helps.
There is a change data capture tool from wisdomforce.
http://www.wisdomforce.com/resources/docs/databasesync/DatabaseSyncBestPracticesforTeradata.pdf
It would probably work in this case.
Are triggers with stored procedures an option?
CREATE TRIGGER db_name.trigger_name
AFTER INSERT ON db_name.tbl_name
REFERENCING NEW AS new_row
FOR EACH ROW
-- Sketch only: whether CALL is allowed inside a trigger depends on your Teradata release
(CALL db_name.stored_procedure(new_row.some_column);)
Theoretically speaking, you can write external stored procedures which may call UDFs written in Java or C/C++, etc., that can push the row data to your application in near real time.