Bulk insert in PingDirectory - pingfederate

I am trying to create and update bulk records in PingDirectory using REST. Is there any way we can achieve bulk update and insert? Single insertion is working fine using a POST request with Postman:
https://directory.*.com/directory/v1/
TIA

The PingDirectory server doesn't have API endpoints that support bulk commits. Doing bulk modifications to a running directory server is computationally expensive due to indexing requirements and the like. While a single bulk batch could in theory "move faster", we've found that you can achieve the same speeds with parallelism.
Switch to a few parallel modifications running at the same time, or switch to an ldapmodify command with an LDIF.
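For illustration, here is a minimal sketch of the parallel approach, assuming the REST endpoint from the question accepts the same JSON body that the single-entry POST already uses; the host, credentials, and payloads below are placeholders, not confirmed PingDirectory specifics.

# Hedged sketch: parallel single-entry POSTs in place of a bulk endpoint.
# Host, credentials, and payload shape are placeholders -- reuse whatever
# worked for your single-entry request in Postman.
import concurrent.futures
import requests

BASE_URL = "https://directory.example.com/directory/v1/"  # placeholder host
AUTH = ("cn=directory manager", "password")               # placeholder credentials

def create_entry(entry):
    """POST one entry; returns the HTTP status code."""
    resp = requests.post(BASE_URL, json=entry, auth=AUTH, timeout=30)
    return resp.status_code

# Stand-ins for the records you want to insert.
entries = [{"uid": "user.%d" % i} for i in range(100)]

# A handful of workers is usually enough; more just adds contention on the server.
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
    for status in pool.map(create_entry, entries):
        print(status)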

Related

How to prevent write transactions while copying the Sqlite3 database file?

Sqlite3 recommends using the Backup API to make a backup but also indicates that copying can be safe as long as there are no transactions in progress:
The best approach to make reliable backup copies of an SQLite database is to make use of the backup API that is part of the SQLite library. Failing that, it is safe to make a copy of an SQLite database file as long as there are no transactions in progress by any process.
I'd like to use the approach of copying the database file, provided I can do this safely.
NB. I understand that the Backup API is the preferred option, but since Sqlite3 advertises atomic commits I believe this approach is viable, and am interested to know how to make this approach as safe as possible.
What SQL command or API can be used to completely lock the database so as to satisfy the "as long as there are no transactions in progress" requirement?
Can an empty exclusive transaction (i.e. BEGIN EXCLUSIVE; with no further statements), be used to achieve this, despite technically being a transaction?
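For what it's worth, here is a minimal sketch of both options using Python's sqlite3 module. It assumes the database is in rollback-journal mode (WAL mode adds a -wal file that a plain file copy would miss), and whether an empty BEGIN EXCLUSIVE truly satisfies the "no transactions in progress" condition is exactly the open question above, so treat this as an illustration rather than a verdict.

# Hedged sketch: two ways to copy a SQLite database from Python.
import shutil
import sqlite3

SRC = "app.db"

# (1) The documented Backup API, exposed as Connection.backup() in Python 3.7+.
with sqlite3.connect(SRC) as src, sqlite3.connect("app-backup.db") as dst:
    src.backup(dst)

# (2) The file-copy approach asked about: hold an exclusive transaction so no
# other connection can write, copy the file, then release the lock.
conn = sqlite3.connect(SRC, isolation_level=None)  # autocommit; we issue BEGIN ourselves
try:
    conn.execute("BEGIN EXCLUSIVE")   # blocks until no other writer holds the database
    shutil.copyfile(SRC, "app-copy.db")
    conn.execute("ROLLBACK")          # nothing changed; this just releases the lock
finally:
    conn.close()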

Asynchronous logging of server-side activities

I have a web server that on every request produces a log that should be persistently saved in a DB.
The request rate is too high to perform a DB operation on every request.
I thought to do the following thing:
Any request to a web server produces a log.
This log is placed somewhere it can be stored quickly (Redis?).
Another service (a cron job?) periodically flushes the data from that place, removes duplicates (yes, there can be duplicates that don't need to be stored in the DB) and makes a single MySQL query to save the data permanently.
What would be the most efficient way to achieve this?
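As a concrete illustration of that pipeline (Redis as the fast buffer, a periodic job that deduplicates and performs one bulk MySQL insert), here is a minimal sketch; the key name, table, and columns are made up, and any Redis/MySQL client libraries would do.

# Hedged sketch: buffer log lines in a Redis list, then periodically drain,
# dedupe, and bulk-insert them into MySQL. Names and columns are illustrative.
import json
import pymysql
import redis

r = redis.Redis()
QUEUE_KEY = "request_logs"

def record_log(entry):
    # Called from the request path: a single O(1) push, no database work.
    r.rpush(QUEUE_KEY, json.dumps(entry, sort_keys=True))

def flush_logs():
    # Called from a cron job or periodic worker.
    pipe = r.pipeline()                 # transactional by default (MULTI/EXEC)
    pipe.lrange(QUEUE_KEY, 0, -1)
    pipe.delete(QUEUE_KEY)
    raw_items, _ = pipe.execute()

    rows = []
    for item in set(raw_items):         # drop exact duplicates
        entry = json.loads(item)
        rows.append((entry["path"], entry["status"]))
    if not rows:
        return

    conn = pymysql.connect(host="localhost", user="app", password="secret", db="logs")
    try:
        with conn.cursor() as cur:
            cur.executemany("INSERT INTO request_log (path, status) VALUES (%s, %s)", rows)
        conn.commit()
    finally:
        conn.close()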
Normally you would use a common logging library (e.g. log4j) which safely manages writing your log statements to a file. However, you should note that log verbosity can still impact application performance.
After the file is on disk, you can do whatever you like with it - it would be completely normal to ingest that file into Splunk for further processing and ad-hoc searches/alerting.
If you want better performance for this operation, you should send your logs to some kind of queue and have a service that reads the queue and writes all the logs to the database.
I found some information about queues in MySQL.
MySQL Insert Statement Queue
regards.

How to implement synchronized Memcached with database

AFAIK, Memcached does not support synchronization with a database (at least not SQL Server or Oracle). We are planning to use Memcached (it is free) with our OLTP database.
In some business processes we do heavy validations which require a lot of data from the database. We cannot keep a static copy of this data as we don't know whether it has been modified, so we fetch it every time, which slows the process down.
One possible solution could be
Write triggers on database to create/update prefixed-postfixed (table-PK1-PK2-PK3-column) files on change of records
Monitor these file changes using FileSystemWatcher and expire the key (table-PK1-PK2-PK3-column) to get updated data
Problem: There would be around 100,000 users using any combination of data for 10 hours. So we will end up having a lot of files e.g. categ1-subcateg5-subcateg-78-data100, categ1-subcateg5-subcateg-78-data250, categ2-subcateg5-subcateg-78-data100, categ1-subcateg5-subcateg-33-data100, etc.
I am expecting at least 5 million files. Now it looks like a pathetic solution :(
Other possibilities are
call a web service asynchronously from the trigger, passing the key to be expired
call an exe from the trigger without waiting for it to finish; the exe would then expire the key. (I have had some success with this approach on SQL Server using xp_cmdshell to call an exe; calling an exe from an Oracle trigger looks a bit more difficult.)
Still sounds pathetic, doesn't it?
Any intelligent suggestions, please?
It's not clear (to me) whether the use of Memcached is mandatory or not. I would personally avoid it and use SqlDependency and OracleDependency instead. Both allow you to pass a DB command and get notified when the data that the command would return changes.
If Memcached is mandatory, you can still use these two classes to trigger the invalidation.
MS SQL Server has a "Change Tracking" feature that may be of use to you. You enable change tracking on the database and configure which tables you wish to track. SQL Server then creates a change record for every update, insert, and delete on a table, and lets you query for the changes made since the last time you checked. This is very useful for syncing changes and is more efficient than using triggers. It's also easier to manage than building your own tracking tables. This has been a feature since SQL Server 2005.
How to: Use SQL Server Change Tracking
Change tracking only captures the primary keys of the tables and lets you query which fields might have been modified. You can then query the tables, joining on those keys, to get the current data. If you want it to capture the data as well, you can use Change Data Capture, but that requires more overhead and at least SQL Server 2008 Enterprise Edition.
Change Data Capture
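To make the change-tracking route concrete for the cache-invalidation problem above, here is a minimal sketch that polls CHANGETABLE for rows changed since the last sync version and expires the matching Memcached keys; the table, primary-key column, cache-key format, and connection string are assumptions, and change tracking must already be enabled on the database and table.

# Hedged sketch: poll SQL Server Change Tracking and expire Memcached keys.
# Table name, PK column, and cache-key format are placeholders.
import memcache   # python-memcached
import pyodbc

mc = memcache.Client(["127.0.0.1:11211"])
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=dbhost;DATABASE=oltp;Trusted_Connection=yes")

def sync_cache(last_version):
    """Expire cache entries for rows changed since last_version; return the new version."""
    cur = conn.cursor()
    cur.execute("SELECT CHANGE_TRACKING_CURRENT_VERSION()")
    current_version = cur.fetchone()[0]

    cur.execute(
        "SELECT ct.CategoryId, ct.SYS_CHANGE_OPERATION "
        "FROM CHANGETABLE(CHANGES dbo.CategoryData, ?) AS ct",
        last_version)
    for category_id, _operation in cur.fetchall():
        mc.delete("categorydata-%s" % category_id)   # key format is an assumption

    return current_version   # persist this and pass it in on the next polling run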
I have no experience with Oracle, but I believe it has some tracking functionality as well. This article might get you started:
20 Using Oracle Streams to Record Table Changes

trigger # Execution plan.. Cartesian product

The idea is that when a user runs a query with a bad Cartesian join whose cost is above a certain threshold, Oracle emails it to me and to the user. I have tried a few things but they don't work at run time. If Toad and SQL Developer can see the execution plan, then I believe the information is there and I just need to find it. Or I may have to adopt another approach.
In general, this is probably not possible.
In theory, if you were really determined, you could generate fine-grained auditing (FGA) triggers for every table in your system that fire for every SELECT, INSERT, UPDATE, and DELETE, get the SQL_ID from V$SESSION, join to V$SQL_PLAN, and implement whatever logic you want. That's technically possible, but it would involve quite a bit of code and you'd be adding a potentially significant amount of overhead to every query in your system. This is probably not practical.
Rather than trying to use a trigger, you could write a procedure scheduled to run every few minutes via the DBMS_JOB or DBMS_SCHEDULER packages that queries V$SESSION for all active sessions, joins to V$SQL_PLAN, and implements whatever logic you want. That eliminates the overhead of running a trigger every time any user executes any statement, but it still involves a decent amount of code.
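For illustration, here is a minimal sketch of that polling job written as an external Python script rather than a PL/SQL procedure; the connection details, cost threshold, and mail settings are placeholders, and the same query could equally be wrapped in a PL/SQL job under DBMS_SCHEDULER.

# Hedged sketch: find active sessions whose current plan contains a Cartesian
# merge join with a cost above a threshold, then email a notification.
# Requires SELECT privileges on the V$ views; all names here are placeholders.
import smtplib
from email.message import EmailMessage

import oracledb   # python-oracledb, the successor to cx_Oracle

COST_THRESHOLD = 100000

SQL = """
SELECT s.username, s.sid, s.sql_id, p.cost
  FROM v$session s
  JOIN v$sql_plan p
    ON p.sql_id = s.sql_id
   AND p.child_number = s.sql_child_number
 WHERE s.status = 'ACTIVE'
   AND p.operation = 'MERGE JOIN'
   AND p.options = 'CARTESIAN'
   AND p.cost > :threshold
"""

def check_for_bad_cartesians():
    with oracledb.connect(user="monitor", password="secret", dsn="dbhost/orclpdb") as db:
        with db.cursor() as cur:
            cur.execute(SQL, threshold=COST_THRESHOLD)
            offenders = cur.fetchall()

    for username, sid, sql_id, cost in offenders:
        msg = EmailMessage()
        msg["Subject"] = "Cartesian join detected: %s (cost %s)" % (sql_id, cost)
        msg["From"] = "dba@example.com"
        msg["To"] = "dba@example.com"   # plus the offending user, if you map usernames to addresses
        msg.set_content("User %s (SID %s) is running SQL_ID %s." % (username, sid, sql_id))
        with smtplib.SMTP("mailhost.example.com") as smtp:
            smtp.send_message(msg)

check_for_bad_cartesians()   # schedule with cron, or port into DBMS_SCHEDULER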
Rather than writing any code, depending on the business problem you're trying to solve, it may be easier to create resource limits on the user's profile to let Oracle enforce limits on the amount of resources any single SQL statement can consume. For example, you could set the user's CPU_PER_CALL, LOGICAL_READS_PER_CALL, or COMPOSITE_LIMIT to limit the amount of CPU, the amount of logical I/O, or the composite limit of CPU and logical I/O that a single statement can do before Oracle kills it.
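As a rough illustration of the profile route (the limit values and user name are arbitrary examples, and enforcement also requires the RESOURCE_LIMIT initialization parameter to be TRUE):

# Hedged sketch: create a profile with per-call resource limits and assign it to a user.
import oracledb

statements = [
    # CPU_PER_CALL is in hundredths of a second; LOGICAL_READS_PER_CALL is in blocks.
    """CREATE PROFILE capped_queries LIMIT
           CPU_PER_CALL 6000
           LOGICAL_READS_PER_CALL 1000000""",
    "ALTER USER report_user PROFILE capped_queries",
    # Profile resource limits are only enforced when RESOURCE_LIMIT = TRUE.
    "ALTER SYSTEM SET RESOURCE_LIMIT = TRUE",
]

with oracledb.connect(user="system", password="secret", dsn="dbhost/orclpdb") as db:
    with db.cursor() as cur:
        for stmt in statements:
            cur.execute(stmt)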
If you want even more control, you could use Oracle Resource Manager. That allows you to do anything from preventing Oracle from running queries from certain users if they are estimated to run too long, to throttling the resources a group of users can consume when there is contention for those resources. Oracle can automatically move long-running queries from specific users to lower-priority groups, kill long-running queries automatically, prevent them from running in the first place, or any combination of those things.

Programmatic Remote Access to the Datastore

I have a requirement to implement a batch processing system that will run outside of Google App Engine (GAE) to batch process data from an RDBMS and insert it into GAE.
The appcfg.py does this from various input files but I would like to do it "by hand" using some API so I can fully control the lifecycle of the process. Is there a public API that is used internally by appcfg.py?
I would write a daemon in Python that runs on my internal server and monitors certain MySQL tables. Under the correct conditions, it would grab data from MySQL, process it, and post it using the GAE RemoteAPI to the GAE application.
Sounds like you already know what to do. In your own words: "grab data from MySQL, process it, and post it using the GAE RemoteAPI." The Remote API docs even have examples that write to the datastore.
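A rough sketch of that daemon's GAE-facing half, based on the legacy Python 2 Remote API from that era's SDK; the app id, model class, and data below are placeholders.

# Hedged sketch (legacy Python 2 GAE SDK): connect through the Remote API and
# write entities to the datastore from an external machine.
import getpass

from google.appengine.ext import db
from google.appengine.ext.remote_api import remote_api_stub

class LogRecord(db.Model):            # hypothetical model mirroring one in the GAE app
    source_id = db.IntegerProperty()
    payload = db.TextProperty()

def auth_func():
    return raw_input('Email: '), getpass.getpass('Password: ')

remote_api_stub.ConfigureRemoteApi(
    None, '/_ah/remote_api', auth_func, 'your-app-id.appspot.com')

# Rows pulled from MySQL would be mapped onto model instances and saved in batches.
rows = [(1, 'first row'), (2, 'second row')]   # stand-in for the MySQL result set
db.put([LogRecord(source_id=r[0], payload=r[1]) for r in rows])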
What you could probably do (if I understand your problem correctly) is use the Task Queue. With that you can define a task that does what you expect it to do.
Let's say you want to insert something into the GAE datastore: prepare the insert file on some server, then go to your application and start an "Insert" task. Clicking that kicks off a background task that reads the file and inserts it into the datastore.
Furthermore, if that task is performed daily, you could invoke the task creation with a cron job.
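A rough sketch of that task-plus-cron idea on the legacy Python runtime; the handler paths, queue defaults, and file location are placeholders.

# Hedged sketch (legacy Python GAE runtime): one handler enqueues the task,
# another runs it in the background; cron.yaml could hit /start-insert on a schedule.
import webapp2
from google.appengine.api import taskqueue

class StartInsertHandler(webapp2.RequestHandler):
    def get(self):
        taskqueue.add(url='/tasks/insert', params={'file': 'imports/batch-001.csv'})
        self.response.write('Insert task queued.')

class InsertTaskHandler(webapp2.RequestHandler):
    def post(self):
        filename = self.request.get('file')
        # ...read the prepared file here and db.put() / ndb.put_multi() the rows...

app = webapp2.WSGIApplication([
    ('/start-insert', StartInsertHandler),
    ('/tasks/insert', InsertTaskHandler),
])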
However, if you could say more about the work you have to perform, it would be easier :-P
