Plone: 504 Gateway Time-out when renaming a folder

I need to rename a folder (both its title and its short name / id) that contains a lot of content. The operation takes a long time and I get a 504 Gateway Time-out ("The server didn't respond in time").
Is there a way to rename the item quickly, maybe with a script? Can you give me a hint?

This is a common problem with Plone. As Kim said, renaming uncatalogs and re-catalogs every item in the folder, including heavy indexes like SearchableText.
I wrote part of ftw.copymovepatches, which significantly speeds up renaming and moving of large structures.
The idea is not to uncatalog and re-catalog all items, but to reindex only the necessary indexes, such as id, path, allowedRolesAndUsers, etc.
On average, with the patch installed, you gain 70%-90%.
Of course this depends on your setup, but you should feel the difference ;-)

Renaming will reindex the folder's contents recursively. Try doing it over a direct ZEO client connection (e.g. on port 8080) rather than through a web server proxy (nginx / Apache), so the proxy's timeout cannot cut the request off.
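If even a direct request gets cut off on the client side, you can also run the rename completely outside HTTP with a "bin/instance run" script. This is only a minimal sketch: it assumes a standard buildout, a Zope admin user called "admin", a Plone site with the id "Plone", and placeholder paths, ids and titles you would replace with your own. The recursive recataloguing still takes just as long (unless ftw.copymovepatches is installed), but no proxy timeout can interrupt it.

    # rename_big_folder.py -- run with: bin/instance run rename_big_folder.py
    # All ids, paths and titles below are hypothetical placeholders.
    import transaction
    from AccessControl.SecurityManagement import newSecurityManager
    from Testing.makerequest import makerequest
    from zope.component.hooks import setSite

    app = makerequest(app)                    # 'app' is injected by bin/instance run
    admin = app.acl_users.getUser('admin')    # your Zope admin user
    newSecurityManager(None, admin.__of__(app.acl_users))

    site = app.Plone                          # your Plone site id
    setSite(site)

    parent = site.restrictedTraverse('path/to/parent-folder')
    parent.manage_renameObject('old-id', 'new-id')   # changes the short name
    folder = parent['new-id']
    folder.setTitle('New Title')                     # changes the title
    folder.reindexObject(idxs=['Title', 'sortable_title'])

    transaction.commit()

If plone.api is available in your site, api.content.rename(obj=folder, new_id='new-id') wraps the same rename operation.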

Related

Nginx rewrite and (later) load balancer together: is that possible?

I have an old IIS-based site that, for historical reasons, uses a lot of RewriteRule directives via Helicon APE. Now, when we hit the server with many clients, Helicon APE crashes quite frequently. The set of IIS servers (currently 4) is expected to grow, the whole system needs to scale, and a lot of recent work has gone into the webapp to support new features and user growth.
Someone suggested putting NGINX in front of the IIS servers as a load balancer, since it handles increasing traffic much better, and applying those rewrites there, so the URLs are converted to the new format before the requests are load balanced to IIS.
Following that advice, we set up a proof of concept with nginx 1.13 on Linux, with rewrite rules (ported from the APE ones) and proxy_pass to two of the servers. However, we have noticed several issues this way:
the rewrite rules do NOT seem to work the way they should; the regexes are certainly valid (we can verify that by putting them in locations), but the URL does not appear to be rewritten;
proxy_pass usually returns a 400 Bad Request or never reaches the backend servers.
However, if we define several locations with some of the simpler regexes, and inside each put a proxy_pass to the backend server with the new URL pattern, the servers are hit with the right requests. This approach brings its own problems, though: some of our rewrites build on others, so a transformation may take three steps (one changes the first part of the URL, the second changes another part, and the third joins everything into the final URL with a break flag). That kind of chaining is impossible when the rules are split across locations.
We have done a lot of research on Stack Overflow, blogs, support sites and mailing lists, but the suggested solutions often don't work at all (or only partially), and to be honest, after a week on this we are concerned that the architecture we had in mind simply isn't possible.
We have tried this with HAProxy as well, with really odd behaviour (e.g. error messages being attached to the request being load balanced).
As the title summarizes, after the long description above, the question is: can someone confirm that what we are trying to achieve can really be done with nginx? If not, what could we use instead?

Serving static content programmatically from a Servlet - does the spec have anything available, or should I roll a custom one?

I have a DB with the original file names, the location of each file on disk, and metadata such as the user that owns each file, and so on. The files on disk have scrambled names. When a user requests a file, the servlet checks whether they are authorized and then sends the file under its original name.
While researching the subject I've found several cases that touch on this, but nothing specific to mine.
Essentially there are two solutions:
A custom servlet that handles headers and other things that the containers' default servlets don't: http://balusc.omnifaces.org/2009/02/fileservlet-supporting-resume-and.html
Then there is the quick and easy option of just using the default servlet and doing some path remapping. For example, in Undertow you configure the Undertow subsystem and add file handlers in standalone.xml that map http://example.com/content/ to /some/path/on/disk/with/files.
So I am leaning towards solution 1, since solution 2 is a straight path remap and I need to change file names on the fly.
I don't want to reinvent the wheel, and both solutions are non-standard, so if I decide to migrate to an app server other than WildFly it will be problematic. Is there a better way? How would you approach this problem?
While your problem is a fairly common one, there isn't necessarily a standards-based solution for every possible design challenge.
I don't think solution #2 will be sufficient - what if two threads try to manipulate the file at the same time? If someone got hold of the link to the file, could they share it?
I've implemented something very similar to your solution #1 - the key is that even if the link to the file got out, nobody could reuse it, because the request still has to pass the security check; you would just return a 401 or 403 for the resource.
Another possibility depends on how you're hosted. Amazon S3 lets you generate a signed URL with a limited time to live, so your server isn't sending the file directly: it either sends a redirect or returns the URL for the front end to use. Keep the lifetime to something like 15 seconds (depending on your needs), after which the URL is no longer valid.
I believe that the other cloud providers have a similar capability too.
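For illustration only, here is a minimal sketch of that short-lived signed URL idea using boto3; the bucket name, object key and file name are placeholders, AWS credentials are assumed to be configured, and the AWS SDK for Java exposes the same pre-signing capability if the back end stays a servlet.

    # Hypothetical bucket/key/file names; requires configured AWS credentials.
    import boto3

    s3 = boto3.client("s3")

    def short_lived_download_url(bucket, key, original_name, seconds=15):
        """Return a pre-signed GET URL that stops working after 'seconds'."""
        return s3.generate_presigned_url(
            "get_object",
            Params={
                "Bucket": bucket,
                "Key": key,
                # Serve the scrambled object under its original file name.
                "ResponseContentDisposition":
                    'attachment; filename="%s"' % original_name,
            },
            ExpiresIn=seconds,
        )

    # After the authorization check, redirect to (or return) this URL.
    print(short_lived_download_url("my-bucket", "scrambled/0a1b2c3d", "report.pdf"))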

WP Engine 502 timeout - what options do I have to get around this limitation?

We have a WordPress plugin that we've been using successfully with many customers - it syncs stock numbers with our warehouse and exports orders to the warehouse.
We recently had a client move to WP Engine, which seems to impose a hard 30-second limit on the length of a running request. Because we sometimes have many orders to export, the script simply hits a 502 Bad Gateway error.
According to the WP Engine documentation, this cannot be turned off on a client-by-client basis:
https://wpengine.com/support/troubleshooting-502-error/
My question is: what options do I have to get around a host's 30-second timeout? Calling set_time_limit has no effect (as expected, since it is the web server killing the request, not PHP). The only thing I can think of is making heavy modifications to the plugin so that it acts as an API and we simply pull the data from the client's system, but that is a last resort.
The long-process timeout is 60 seconds.
This cannot be turned off on shared plans, only on plans with dedicated servers. You will not be able to get around it by modifying your install, because the limit is enforced directly on Apache, outside of your particular install.
Your options are:
1. 'Chunk' the upload into smaller pieces.
2. Upload the SQL file to your SFTP _wpeprivate folder and have their support import it for you.
3. Optimize the import so the content is imported more efficiently.
I can see three options here:
1. Change the web host (the easy option).
2. Modify the plugin to process the sync in batches. Even this won't give you a 100% guarantee under a hard script execution time limit - something may get lost in one or more batches and you won't even know.
3. Contact WP Engine and ask them to raise the limit for this particular client.

How to prevent vulnerability scanning

I have a web site that emails me a report about every unexpected server-side error.
Quite often (once every 1-2 weeks) somebody launches automated tools that bombard the site with a ton of different URLs:
sometimes they (hackers?) assume my site hosts phpMyAdmin and try to access (I believe) vulnerable PHP pages...
sometimes they try to access pages that don't exist on my site but belong to popular CMSs
last time they tried to inject an invalid ViewState...
It is clearly not search engine spiders, as 100% of the requests that generated errors were requests to invalid pages.
So far they haven't done much harm - the only cost is that I have to delete a ton of server error emails (200-300)... but at some point they might actually find something.
I'm really tired of this and am looking for a solution that will block such 'spiders'.
Is there anything ready to use - any tool, DLLs, etc.? Or should I implement something myself?
In the second case, could you recommend an approach? Should I limit the number of requests per IP (say, no more than 5 per second and no more than 20 per minute)?
P.S. Right now my web site is written using ASP.NET 4.0.
Such bots are not likely to find any vulnerabilities in your system if you just keep the server and software updated. They are generally looking for low-hanging fruit, i.e. systems that have not been patched against known vulnerabilities.
You could build a bot trap to minimise such traffic: as soon as someone requests one of those known non-existent pages, block all further requests from that IP address (with the same user-agent string) for a while, as sketched below.
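The site in question is ASP.NET 4.0, where this would normally be implemented as an HttpModule; the Python sketch below only illustrates the bot-trap logic described above, with made-up trap paths and block duration.

    # Minimal bot-trap sketch: after a client requests a known "trap" path,
    # block that IP / user-agent pair for a while. The paths and the one-hour
    # duration are made-up examples.
    import time

    TRAP_PATHS = {"/phpmyadmin/index.php", "/wp-login.php", "/administrator/"}
    BLOCK_SECONDS = 3600

    _blocked_until = {}  # (ip, user_agent) -> unix timestamp


    def is_blocked(ip, user_agent):
        """True while this client is inside its block window."""
        return time.time() < _blocked_until.get((ip, user_agent), 0)


    def register_request(ip, user_agent, path):
        """Call on every request; start a block when a trap path is hit."""
        if path.lower() in TRAP_PATHS:
            _blocked_until[(ip, user_agent)] = time.time() + BLOCK_SECONDS

    # In the real HttpModule: if is_blocked(...), return 403 immediately;
    # otherwise call register_request(...) and continue handling the request.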
There are a couple of things you can consider...
You can use one of the available Web Application Firewalls. A WAF usually has a set of rules and an analytic engine that detect suspicious activity and react accordingly. In your case, for example, it could automatically block attempts to scan your site because it recognizes them as an attack pattern.
A simpler (but not 100% reliable) approach is to check the Referer URL (see the Wikipedia description of the Referer header) and reject requests that did not originate from one of your own pages (you would probably create an HttpModule for that purpose).
And of course you want to be sure that your site addresses all the known security issues from the OWASP Top 10 list. There is a very comprehensive description of how to do this for ASP.NET (the OWASP Top 10 for .NET book in PDF), and I also recommend reading the blog of the book's author: http://www.troyhunt.com/
There's nothing you can do (reliably) to prevent vulnerability scanning; the only real option is to stay on top of any vulnerabilities and prevent their exploitation.
If your site is only used by a select few from fixed locations, you could perhaps use an IP restriction.

Slow queries when putting files into "LIVE" environment

Here's the situation...
We have a local development server at Location A where we build all our ASPX pages. Our databases are also located at Location A.
When testing on the development server, our queries run quickly - most in under 1 second.
We have just moved our files up to our live server, which is located at Location B (the databases are still at Location A), and the queries now take anywhere between 5 and 10 times longer than on the development server. Location A is in East Anglia and Location B is in London, roughly 100 miles apart.
Also, on both the dev and live servers, the first query run takes a lot longer than the queries thereafter.
Any ideas what may be causing the slowness?
EDIT
I've turned tracing on for a few of the pages and it seems that End Load is taking the longest of all the stages, although I'm unsure why.
Unfortunately I also don't have access to the external server, so I can't install SSMS or Oracle SQL Developer there to test any queries.
"the first query that is run takes a lot longer than the rest of the
queries thereafter."
That's the effect of caching. The first query pays the toll of physical IO. Subsequent queries benefit from finding relevant records already in cache, either the DB Buffer Cache or some other OS or architectural buffer.
As for the difference in performance between the two environments, that's probably down to this:
"roughly 100 miles apart"
It is likely the network connection between the two locations is throttling data transfer. You need to talk to your network admin, assuming it's a private connection. If you're using public infrastructure your options are limited.
"seems that End Load is taking the longest of all the methods"
Okay, so I'm not an ASPX expert (I'm here for the [oracle] tag), but some light searching turns up several threads suggesting it might be "user controls", as these fire just before the End Load event - for instance, this other SO question.
