Puppeteer is downloading unrequested files that cannot be intercepted or stopped - webdriver

I am using Puppeteer along with a proxy service, and after seeing unexplainably high bandwidth usage I used a local proxy server to monitor the requests that were generating it. I discovered that almost 90% of the traffic was being used to request .crx files/updates.
My project requires me to open a few thousand browsers every hour, in order to keep each task with its own cookies and proxy. Every Chromium browser I open eventually downloads ~10-15 MB of files through the proxy that is passed as an argument to puppeteer.launch:
puppeteer.launch({
  headless: false,
  args: [
    `--proxy-server=http://${this.proxy.host}:${this.proxy.port}`
  ]
});
These requests do not appear in the Network section of DevTools and cannot be intercepted using:
await page.setRequestInterception(true);
this.page.on("request", cb);
I started a local proxy server and passed it to Puppeteer via the launch args, in order to monitor the requests Chrome made through it; this is how I found out about these downloads. I blocked the first domain that Chromium was using to download these .crx files, but Chromium simply started downloading them from another domain, and so on. Some of these domains and URLs are:
http://redirector.gvt1.com/edgedl/chromewebstore/L2Nocm9tZV9leHRlbnNpb24vYmxvYnMvYjFkQUFWdmlaXy12MHFUTGhWQUViMUVlUQ/0.57.44.2492_hnimpnehoodheedghdeeijklkeaacbdc.crx
http://dl.google.com/chromewebstore/L2Nocm9tZV9leHRlbnNpb24vYmxvYnMvYjFkQUFWdmlaXy12MHFUTGhWQUVi
https://google.com/dl/something/something.crx
There were even more. Whenever I block one domain, Chromium finds another. These files are downloaded for every new browser launched, consuming expensive proxy bandwidth.
Is there a way to stop these downloads, or at least make Chromium download them only once rather than for every new browser launched? Failing that, can I instruct Chrome to download these files without using the proxy?
This happens for both v5.5.0 and v8.0.0.

After a lot of time spent trying to find out what this extension was that Chrome always had to download, I found out about Chromium Components, which can be inspected via chrome://components. It looks like these are also shipped as .crx files.
In my particular case Chrome was downloading "pnacl". The only way I was able to identify this was by recognising the version number from the first link posted in my question (0.57.44.2492). Using chrome://components in a browser instance launched by Puppeteer with the headless option set to false, I found that pnacl had exactly the same version.
I was able to prevent Chrome from downloading this component using the flag --disable-component-update. This flag is used by default by some webdrivers, but not by the Chromium that Puppeteer (v5.5.0 or v8.0.0) downloads.
If anybody else encounters this problem, yours may be caused by an extension instead of a component, so you may also want a flag to disable extension updates. There is no such flag, so I use --disable-extensions and --disable-default-apps just to be safe.
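For reference, here is a minimal sketch of how I launch browsers now. The proxy object and the task code are placeholders; the relevant part is the extra flags passed through args:

const puppeteer = require("puppeteer");

(async () => {
  const proxy = { host: "127.0.0.1", port: 8080 }; // placeholder proxy config
  const browser = await puppeteer.launch({
    headless: false,
    args: [
      `--proxy-server=http://${proxy.host}:${proxy.port}`,
      "--disable-component-update", // stop component (e.g. pnacl) downloads on every launch
      "--disable-extensions",       // also rule out extension installs/updates
      "--disable-default-apps"
    ]
  });
  // ... run the task for this browser, then:
  await browser.close();
})();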

Related

Downloads timing out after 30 seconds on users with slow connections

We have some files on our portal that aren't that big to me: 50MB-80MB. On my home connection, it takes <10 seconds to download these files. I've had other users test and they experience the same thing.
However, in the office, the connection is terrible. These files don't even finish downloading: once the download time reaches about 30-35 seconds, even though the file is still downloading (just incredibly slowly), a non-descriptive error appears in Developer Tools > Network and the download stops. Nothing in any of the logs indicates why the download is terminated.
The bigger problem is we now have a few end users with crappy internet who are also experiencing the same issue.
So I'm trying to figure out what we can do on our end. Obviously, we can't tell them, "Well, just get better internet service." It seems like something can be done on our end to let the download persist until it completes; what that is, I'm not quite sure, and that is what I'm looking for help on. Maybe it is a default setting in a dependency somewhere in our stack:
ReactJS FE that uses FileSaver.js for downloads
Django BE using native Django downloading
nginx-ingress for traffic ingress controller to the Kubernetes cluster
The FE uses nginx to serve the FE
The BE uses gunicorn to serve the BE
Any suggestions on what I should do to prevent this timeout on downloads?
I'm thinking the issue is somewhere with nginx-ingress, nginx, and/or FileSaver.js, so investigating those.
Per Saurabh, adjusting the timeout did the trick. I now just start the web server with the -t 300 flag and the users that were having issues no longer do.
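For anyone hitting the same thing: the ~30-second cutoff matched gunicorn's default worker timeout in our case. A sketch of the changes, assuming gunicorn serves the Django BE and the standard kubernetes/ingress-nginx annotations are available (the app module name below is a placeholder):

# gunicorn: raise the worker timeout from the 30 s default to 5 minutes
gunicorn myproject.wsgi:application -t 300

# nginx-ingress: raise the proxy timeouts on the Ingress resource as well
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"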

Firebase serve does not update Service Worker

When I do a firebase init at the command line, create a standard web page, run firebase serve, and then open http://localhost:5000, I usually get a web page that I was working on at a previous time. I am almost certain this is served by a previous version of serviceworker.js.
The result also depends on which browser I am using on my Mac (Safari, Chrome, Firefox and Opera); each of them shows something different. My feeling is that there should be a serviceworker.js clear or reset command so that a new serviceworker.js will be picked up. So the question: is there an SW reset command somewhere, ideally at the command line?
Or am I just nuts?
Since the Service Worker registration lives entirely within the browser, there's no way for the firebase serve command to know that you've made changes.
To clear all locally stored data, including Service Worker registrations (in Chrome at least), you can open the web inspector, go to the "Application" tab, and click the "Clear site data" button. Safari does not yet support Service Worker, so you shouldn't see the same behavior there.
If you're working on multiple web apps at the same time, you might want to consider using different ports for each, e.g. firebase serve -p 5001 for a second app.
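If you'd rather script it than click through DevTools, a small snippet using the standard Service Worker API, run in the browser console on the page served at localhost:5000, unregisters every active registration (a sketch; reload afterwards so the fresh worker installs):

// Run in the browser console on http://localhost:5000
navigator.serviceWorker.getRegistrations().then((registrations) => {
  registrations.forEach((registration) => registration.unregister());
  console.log(`Unregistered ${registrations.length} service worker(s); now reload the page.`);
});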

Windows Explorer not refreshing after CreateFolder (new folder)

We have built a WebDAV service with your Engine and have one problem when we create a new folder or file:
The new folder/file is created successfully, but it does not show up in Windows Explorer. Only after pressing F5 does the new folder/file appear (with the name already selected for editing).
This behavior is reproducible even with a blank WebDav Solution.
We can reproduce this on Windows 7 and Windows 8 (8.1) using WebDav .NET Server 3.8 and the latest 3.9.
Is there a way to get around this “refresh-problem”?
I solved this issue by clicking in File Explorer on View > Options, then restoring the defaults, and everything went back to normal.
I assume this issue occurs in Windows Explorer on a single computer. Most likely the WebDAV server-side code is failing with some exception. Here are some ideas on how to detect what is wrong:
Unmount network connections by executing 'net use * /DELETE' in a command prompt; this will unmount WebDAV connections too and simulate a 'clean' environment.
Retry reproducing the issue and examine your WebDAV log file. By default it is located in the /App_Data/WebDAV/Logs/ folder. Are there any exceptions in it?
Use the Fiddler tool or any other debugging proxy to capture and examine the HTTP requests. Are there any failed requests?
In case you are creating a folder/file on one computer using Windows Explorer (the Microsoft Mini-redirector driver) or the IT Hit Ajax File Browser and expect the file list to refresh automatically on another computer, this will not work. The Mini-redirector does not support any notifications from the server, and WebDAV does not send any; you need to refresh the file list manually to see the newly created items.
I found this video on YouTube that explains in great detail how to fix this problem: https://www.youtube.com/watch?v=UUiCPsQquqc
It is a bit lengthy, so I'll just quickly sum it up here:
The reason for these problems is one or more (broken) shell extensions that prevent Windows Explorer from refreshing.
To fix it, open regedit.exe (requires admin privileges) and search the Registry for the value "DontRefresh". If it is "1", set it to "0". There might be multiple matches, so repeat until every match has the value "0".
This might not take effect immediately; you may have to kill and restart your explorer.exe process (easiest via Task Manager), or simply reboot your computer. In my case, it worked immediately.
According to the video, the keys should only be located under HKEY_CLASSES_ROOT\CLSID, but in my case I could only find such keys under HKEY_LOCAL_MACHINE\Classes\Wow6432Node\CLSID.
I figured it makes the most sense to simply search the complete Registry; it does not take very long.
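If you prefer to script this rather than click through regedit, a rough PowerShell sketch along these lines should do the same thing (run as administrator; it only touches values literally named DontRefresh, under the CLSID hive mentioned above):

# Sketch: find CLSID keys whose DontRefresh value is 1 and set it to 0 (run as admin).
Get-ChildItem -Path 'Registry::HKEY_CLASSES_ROOT\CLSID' -Recurse -ErrorAction SilentlyContinue |
  Where-Object { ($_ | Get-ItemProperty -Name DontRefresh -ErrorAction SilentlyContinue).DontRefresh -eq 1 } |
  ForEach-Object { Set-ItemProperty -Path $_.PSPath -Name DontRefresh -Value 0 }

# Restart Explorer so the change takes effect.
Stop-Process -Name explorer -Force
Start-Process explorer.exe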
I tried a lot of hacks, from scanning the system, to recreating the profile, to hacking Registry keys and hives.
Finally what worked for me -
Right click on desktop
Select Personalize
Click Themes
Click Change desktop icons
Click Restore default & OK
And instantly Explorer began to auto-refresh on new folder, rename, delete, copy, etc.

Flash SWF On Solaris Won't Load When Also Loading Apache APR Library in JBoss

UPDATE + SOLUTION ===============================
Sorry to be posting the solution here instead of in a comment, but something about my work's filtering doesn't allow the comment functionality to work for me.
I ended up using the -b 0.0.0.0 property in JBoss to bind to all addresses, so I could try accessing machine A's server from machine B as the client, and vice versa. I found that it always failed to load when the server was running on machine B, whether I connected from A or from B.
I started Wireshark on a Windows machine on the same network and observed the TCP connection that was loading the web page. I saw that in the failing cases the request for the .swf had a content length of around 2 million bytes, yet when I right-clicked the Wireshark logs and selected "view conversation" (or something like that), the size of the whole conversation fetching the .swf file was only about 130,000. Looking at about:cache, that was roughly what had been cached before the page said "Done".
I ended up finding that there is a bug with the useSendFile property (http://community.jboss.org/thread/148651?tstart=0). It causes only part of the file to be sent when you are running low on kernel memory. Setting useSendFile="false" in our server.xml seems to have resolved the problem.
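For anyone looking for where that setting lives: in a JBoss 5.1 install it goes on the HTTP connector in JBoss Web's server.xml (typically under deploy/jbossweb.sar/). A sketch, with the attribute spelled as in the forum post and the other attributes purely illustrative:

<!-- server.xml sketch: only the useSendFile attribute is the point here -->
<Connector protocol="HTTP/1.1" port="8080"
           address="${jboss.bind.address}"
           useSendFile="false" />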
==================================================
Original Problem
I have a JBoss (5.1.0.GA) application server. I am using GraniteDS to connect between the application server and the client. The client side is flash-based.
GraniteDS requires the use of the APR library (the Apache native library), so I am loading it. I see in the JBoss logs that it says it loaded the Apache native library just fine (version 1.18, though I've also tried 1.20).
The issue is that when the APR library loads successfully, the Flash side of the application usually does not load. I have to hit refresh a bunch of times and eventually it will usually load, but normally I see either a black web page that says "Done" or a loading progress bar that never moves. Only by repeatedly hitting refresh will the page load. It loads eventually with enough refreshing, but it is not consistent, and this obviously will not work for our clients, who would have to clear their browser's cache every time.
This problem only exists on Solaris; our application works fine on Windows. We've tried multiple patch levels of Solaris, and have verified with the "ldd" command that the library that needs to be loaded has all its dependencies present.
We've verified it isn't our swf file's size by testing:
1) Our regular SWF (1660 kb).
2) A random large-ish SWF (950 kb).
3) A small SWF with one label component that says "Test" (277 kb).
All three were unable to load when JBoss was also loading the native library, and loaded just fine without it. We need the native library to load successfully for Granite to connect between the client and server, so not loading it isn't an option (unless there's some way to use the NIO connector with JBoss, but that appears to be unsupported)... if there were a way to use the NIO connector, we wouldn't need the APR library.
Has anybody run into this before? Anybody have any ideas or recommendations?
Have you tried the JBoss native libraries for Solaris?
http://www.jboss.org/jbossweb/downloads/jboss-native-2-0-9.html

After working for years, videos in my Flex Application won't play, just "buffering"

An application I wrote for a client almost 2 years ago using Flex 2 has stopped playing its .flv videos. It's been nearly 9 months since I've had to perform any updates to the app, so I don't have the source code on the computer I'm using at the moment. I'm not sure how often the client uses the application, so I can't say exactly when this started.
The videos just display a black screen and do not load the first frame. I believe I used the standard VideoDisplay object. The videos are contained in a folder on the same shared hosting account as the application.
I've checked the application in the latest versions of IE, Firefox and Chrome (running Flash 10), and I've also fired up a virtual machine to test it in IE 7 with various releases of Flash 9 instead of Flash 10.
I checked, and the videos are still present, and I scattered some extra no-security crossdomain.xml files around... but to no avail.
Does anyone have any ideas as to where I should start looking when I get back to my development computer? Could a change on the hosting server cause this?
UPDATE: I remembered another application with video that I had on the site, made more recently but also using Flex 2. That application is a simple shell around a VideoDisplay object that serves up a .flv file in the same directory... and it works just fine.
So the server is serving .flv files. The application I'm having problems with pulls its .flv files from a different folder that sits at the same level as the application's parent folder (the only difference I can see right now).
The somewhat cryptic error message received when using the debug version of the Flash player was:
Error: 1000: No bitrate match
at mx.controls.videoClasses::VideoPlayer/play()
After getting back to my development machine I determined that the XML file containing the URLs of the videos was using an old variant of the domain name from a couple of years ago. That domain name had just been allowed to expire, so the video player was pointing at .flv files that no longer existed. Correcting the domain name resolved the problem.
You said the videos are still present, but are they being served?
A small hosting configuration change might cause files to no longer be served.
I would start there: make sure that both your .swf and .flv files are accessible to the client browsers.
If it's on a new server, make sure it's serving the right MIME type for .flv files, video/x-flv. I've had Flash refuse to play videos without that set. Also, IIS now gives bogus 404 errors on requests for files with an unknown MIME type, so files can be on the server but invisible to clients. http://it.toolbox.com/blogs/rymoore/adding-flv-mime-type-in-iis-4198
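If the host happens to be IIS 7 or later, the mapping can be added in web.config; a sketch (the element names are standard IIS configuration, the rest of the file is omitted):

<!-- web.config sketch for IIS 7+: register the FLV MIME type so requests aren't 404'd -->
<configuration>
  <system.webServer>
    <staticContent>
      <mimeMap fileExtension=".flv" mimeType="video/x-flv" />
    </staticContent>
  </system.webServer>
</configuration>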