How to: Background Processing of WooCommerce Product Batch API calls - wordpress

I'm running a webshop with 7000 products. Every night I execute a cronjob on my ERP system, pushing all product information into my WooCommerce store through their RESTful API.
This works perfectly... However, I'm looking for ways to improve the performance of this task.
What I've done so far is change my individual update API requests to batch calls. That limits my requests from 7000 to around 140 - I can batch around 50 products per call without running into gateway timeouts and other server-side limitations.
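For illustration, one such batch call might look roughly like the sketch below, which assumes the official automattic/woocommerce PHP client; the store URL, keys, product IDs and prices are placeholders, and the wc/v3 batch endpoint's documented default is up to 100 objects per request.

<?php
// Sketch only: assumes the automattic/woocommerce Composer package is installed.
require __DIR__ . '/vendor/autoload.php';

use Automattic\WooCommerce\Client;

$woocommerce = new Client(
    'https://example-shop.test', // placeholder store URL
    'ck_xxxxxxxx',               // placeholder consumer key
    'cs_xxxxxxxx',               // placeholder consumer secret
    array( 'version' => 'wc/v3' )
);

// One request updates a whole chunk of products instead of one request each.
$payload = array(
    'update' => array(
        array( 'id' => 101, 'regular_price' => '19.99', 'stock_quantity' => 12 ),
        array( 'id' => 102, 'regular_price' => '24.50', 'stock_quantity' => 3 ),
        // ... around 50 products per call
    ),
);

$response = $woocommerce->post( 'products/batch', $payload );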
However, while improving this, I'm wondering if there is a smart way of scheduling these update tasks into a background queue/process. I'm quite familiar with this in Laravel, but I'm wondering if anything exists in WordPress/WooCommerce that supports this out of the box.
What I actually mean is: instead of executing the update batch when the API call comes in, the API should just schedule the task and send a response back to the client saying the task has been added successfully. That way the ERP system doesn't have to wait for WooCommerce to finish the whole batch. If I make the ERP system fire async calls instead, it would probably just overload WooCommerce, which wouldn't be beneficial.
If not, what would be the best approach to accomplish this? I'm thinking of creating a jobs/queue database table that holds my payload, and after pushing all the update data into it, adding another endpoint that tells my WooCommerce store to start working through the list.
Please let me know if there is a good way of achieving this.
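One possible sketch of the "respond now, process later" idea, built from pieces WooCommerce already ships: recent WooCommerce versions bundle Action Scheduler, so a small custom REST endpoint could accept the batch payload, queue it, and return immediately, while a hooked worker applies the updates in the background. The erp/v1 namespace, the hook name and the permission check below are made up for illustration.

<?php
// Illustrative sketch only: assumes WooCommerce's bundled Action Scheduler
// (as_enqueue_async_action) is available; route and hook names are hypothetical.

add_action( 'rest_api_init', function () {
    register_rest_route( 'erp/v1', '/product-batch', array(
        'methods'             => 'POST',
        'permission_callback' => function () {
            return current_user_can( 'manage_woocommerce' );
        },
        'callback'            => function ( WP_REST_Request $request ) {
            // Queue the payload instead of processing it during this request.
            as_enqueue_async_action( 'erp_process_product_batch', array(
                'products' => $request->get_json_params(),
            ) );
            // Respond immediately so the ERP system doesn't have to wait.
            return new WP_REST_Response( array( 'status' => 'queued' ), 202 );
        },
    ) );
} );

// Background worker: runs via Action Scheduler, outside the original request.
add_action( 'erp_process_product_batch', function ( $products ) {
    foreach ( (array) $products as $data ) {
        $product = wc_get_product( $data['id'] );
        if ( $product ) {
            $product->set_props( $data ); // e.g. regular_price, stock_quantity
            $product->save();
        }
    }
} );

This is essentially the jobs-table idea from the question: Action Scheduler persists each queued action in the database and works through it on a later request, so the ERP call returns almost immediately. For very large payloads it is probably safer to store the batch in your own table and queue only a row ID, since the queued arguments are kept in a database column.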

Related

Cloud Tasks - waiting for a result

My application needs front-end searching. It searches an external API, for which I'm limited to a few calls per second.
So, I wanted to keep ALL queries related to this external API on the same Cloud Tasks queue, so I could guarantee the number of calls per second.
That means the user would most likely have to wait a second or two when searching.
However, using Google's const { CloudTasksClient } = require('@google-cloud/tasks') library, I can create a task, but when I go to check its status using .getTask() it says:
The task no longer exists, though a task with this name existed recently.
Is there any way to poll a task until it's complete and retrieve response data? Or any other recommended methods for this? Thanks in advance.
No. GCP Cloud Tasks provides no way to gather information on the body of requests that successfully completed.
(Which is a shame, because it seems quite natural. I just wrote an email to my own GCP account rep asking about this possible feature. If I get an update I'll put it here.)

Firebase Functions - Do something every 10 minutes

I'm building an app using Firebase. There's an admin setting I want to build, that essentially populates one of my nodes with data. It would do this every 10 minutes, and there are 50-80 data points that I'd want to add.
So this would take roughly 13 hours in total; I'd probably want to expand this in the future, though. I would only call this function maybe once a week.
I would simply do this using a setTimeout, but I've heard this can be expensive? Does someone know roughly how expensive it would be?
I'm not that experienced with cron jobs - is there a better way of doing this with Firebase? It's not something I want to have running constantly, and it's not at a specific time, just whenever I need it. I'd also potentially want to have the job running multiple times at the same time, which seems to be super easy with Firebase Functions.
Any ideas? Thank you!
You would have to trigger the functions from an outside source.
One way you could do that is by creating and subscribing to a pub-sub system:
exports.someSubscription = functions.pubsub.topic('some-subscription').onPublish((event) => {
// Your code here
});
Another way you could trigger the functions is by exposing them as HTTP endpoints and hitting those endpoints - see this amazing article in the Firebase documentation for resources on creating HTTP endpoints.
Both of these approaches require some sort of scheduling service to either publish a new event to your topic or hit the HTTP endpoint. You will need to implement such a service yourself, though I am sure there are services out there that can do it for you.

WP-Engine 502 timeout- what options do I have to get around this limitation?

We have a plugin for WordPress that we've been using successfully with many customers - the plugin syncs stock numbers with our warehouse and exports orders to our warehouse.
We recently had a client move to WP Engine, which seems to impose a hard 30-second limit on the length of a running request. Because we sometimes have many orders to export, the script simply hits a 502 Bad Gateway error.
According to WP-Engine documentation, this cannot be turned off on a client by client basis.
https://wpengine.com/support/troubleshooting-502-error/
My question is: what options do I have to get around a host's 30-second timeout limit? Setting set_time_limit has no effect (as expected, since it is the web server killing the request, not PHP). The only thing I can think of is to make heavy modifications to the plugin so that it acts as an API and we simply pull the data from the client's system, but that is a last resort.
The long-process timeout is 60 seconds.
This cannot be turned off on shared plans, only on plans with dedicated servers. You will not be able to get around it by attempting to modify it, as it is enforced directly by Apache, outside of your particular install.
Your options are:
1. 'Chunk' the upload into smaller pieces.
2. Upload the SQL file to your SFTP _wpeprivate folder and have their support import it for you.
3. Optimize the import so the content is imported more efficiently.
I can see three options here.
1. Change the web host (the easy option).
2. Modify the plugin to process the sync in batches (see the sketch after this list). However, even this won't give you a 100% guarantee with a hard script execution time limit - something may get lost in one or more batches and you won't even know.
3. Contact WP Engine and ask them to raise the limit for this particular client.
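As a rough illustration of option 2, the export can be split into self-rescheduling chunks so that no single request gets near the 30-second limit. The sketch below uses core WP-Cron; my_plugin_get_unexported_order_ids() and my_warehouse_export_order() are hypothetical stand-ins for the plugin's real logic.

<?php
// Illustrative sketch only: the two my_* helpers below are hypothetical and
// represent whatever the plugin actually does to find and export orders.

add_action( 'my_plugin_export_chunk', 'my_plugin_export_chunk' );

function my_plugin_export_chunk() {
    // Grab a small, safely sized batch of not-yet-exported orders.
    $order_ids = my_plugin_get_unexported_order_ids( 20 );

    foreach ( $order_ids as $order_id ) {
        $order = wc_get_order( $order_id );
        if ( $order ) {
            my_warehouse_export_order( $order ); // hypothetical export call
        }
    }

    // More left? Schedule the next chunk as a fresh request instead of
    // continuing in this one, so each run stays well under 30 seconds.
    if ( count( $order_ids ) === 20 ) {
        wp_schedule_single_event( time() + 60, 'my_plugin_export_chunk' );
    }
}

Keep in mind that WP-Cron only fires when the site receives traffic, so on a quiet site a real server cron hitting wp-cron.php (or WooCommerce's bundled Action Scheduler) would be more dependable.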

Slow Transactions - WebTransaction taking the hit. What does this mean?

Trying to work out why some of my application servers have crept up to over 1s response times using New Relic. We're using WebApi 2.0 and MVC 5.
As you can see below the bulk of the time is spent under 'WebTransaction'. The throughput figures aren't particularly high - what could be causing this, and what are the steps I can take to reduce it down?
Thanks
EDIT: I added transaction tracing to this function to get some further analysis - see below:
Over 1 second waiting in System.Web.HttpApplication.BeginRequest().
Any insight into this would be appreciated.
Ok - I have now solved the issue.
Cause
One of my logging handlers, which syncs its data to cloud storage, was initializing every time it was instantiated, which also involved a call to Azure Table Storage. As it was passed into the controller in question, every call to the API resulted in this initialization.
It was a blocking call, so it added ~1s to every call. Once I configured this initialization to happen once per server lifecycle, the problem went away.
Observations
As the blocking call was made at the time the controller was being built (due to Unity resolving the dependencies at that point), New Relic reports this as
System.Web.HttpApplication.BeginRequest()
Although I would love to see this reported a little more granularly, as we can see from the transaction trace above it was in fact the 7 calls to table storage (still not quite sure why it was 7) that led me down this path.
Nice tool - my New Relic subscription is starting to pay for itself.
It appears that the bulk of the time is being spent in Account.NewSession, but it is difficult to say without drilling down into your data. If you need some more insight into a block of code, you may want to consider adding Custom Instrumentation.
If you would like us to investigate this in more depth, please reach out to us at support.newrelic.com where we will have your account information on hand.

Updating Coldfusion solr collections with a scheduled task

So I'm pretty new to using the Coldfusion solr search (just moved from a CF8 Mac OS X server to a Linux CF9 server), and I'm wondering what the best way to handle automatically updating the collections is. I know scheduled tasks are meant for this but I haven't been able to find any examples online.
I currently have a scheduled task set up to update all of the collections weekly by getting the list of collections and using the cfindex tag in a loop to run the refresh command. This is pretty processing-intensive, though, and takes about ten minutes to update the four collections I have set up so far. It works when I run it in the browser, but I get the error "The request has exceeded the allowable time limit Tag: CFLOOP" when I run the task from the scheduled tasks administration page.
Is there a better way to handle updating the collections? Would it be better if I made a task to update each collection individually?
Here's my update code.
<!--- Allow this request up to 30 minutes before ColdFusion times it out --->
<cfsetting requesttimeout="1800">
<!--- List all Solr collections, then refresh each one in turn --->
<cfcollection action="list" name="collections" engine="solr">
<cfloop query="collections">
    <cfindex collection="#name#" action="refresh" extensions=".pdf, .html, .htm, .cfml, .cfm" type="path" key="/home/#name#/public_html/" recurse="yes">
</cfloop>
In earlier versions of ColdFusion there was a URL parameter that could be passed on any HTTP request to change the server's timeout for the requested page. You might have guessed from the scheduled task configuration that there's an HTTP request running your task, so it functions just like any other page. In those earlier versions you would have just added &requesttimeout=900 to the URL and that gave the server 15 minutes to process that task.
In later versions they realized that this URL parameter was a security risk but they needed a way to allow developers to declare that an individual HTTP request should still be allowed to take longer than the default page timeout set in the ColdFusion Administrator. So they moved it from the URL parameter to the <cfsetting> tag.
<cfsetting requesttimeout="900" />
You need to put the cfsetting tag at the top of the page, rather than putting it inside your loop, because it's resetting the total allowable time from the beginning of the request, not just since the last cfsetting tag. Ben Nadel wrote a blog article about that here: http://www.bennadel.com/blog/626-CFSetting-RequestTimeout-Updates-Timeouts-It-Does-Not-Set-Them.htm
I'm not sure if there's an upper limit to the request timeout. I do know that in the past when I've had a really long-running task like that, the server has gradually slowed down, in some cases until it crashed. I'm not sure if I would expect reindexing Solr collections to degrade performance so badly, I think my tasks were doing some other things that were probably hogging memory at the time. Anyway if you run into that issue, you may need to divide it up into separate tasks for each collection and just make sure there's enough time between the tasks to allow each one to complete before the next one starts.
EDIT: Oops! I don't know how I missed the cfsetting tag in the original question. D'oh! In any event, when you execute a scheduled task via the CF Administrator, it performs a cfhttp request to execute the task. This is the way scheduled tasks are normally executed, and I suspect it's so the task can execute inside your own application scope, but the effect is that there are two separate requests executing. I don't think there's a cfsetting tag in the CFIDE page, but I suspect a person could add one if they wanted to allow that page longer to wait for the task to complete.
EDIT: Okay, if you wanted to add the cfsetting in the CFIDE, you would first have to decrypt the template and then add your one line of code... which might void your warranty on the server, but is probably not dangerous. ;) For decrypting the template see: Can I get the source of a hacked Coldfusion template? - and the template to edit is /CFIDE/administrator/scheduler/scheduletasks.cfm.
