Playwright/Chromium: Request stuck pending for localhost:8082 - networking

I just installed Playwright to create some automated tests for my web-app.
My test runs fine against the staging version of my site, but one of the requests hangs when I run it against localhost:
I have Nginx running on :8080 and webpack running on :8082 to serve my JS. The document ("create") is served from :8080 with no problem, but all.js, which is http://localhost:8082/assets/all.js, never finishes.
What's really confusing me is that I can load that URL in a new Chrome tab just fine, I can wget it under WSL, and I can curl it under cmd.exe. So something is going wrong with the networking when the browser instance is created by Playwright, but I don't know how to debug it further. The same thing happens if I set defaultBrowserType: 'firefox'.
What else can I try?
I just found chrome://net-export/ and enabled it during the request. I've got all the CLI flags now:
"clientInfo": {
"cl": "b9c217c128c16f53d12f9a02933fcfdec1bf49af-refs/branch-heads/5195#{#176}",
"command_line": "\"C:\\Users\\Mark\\AppData\\Local\\ms-playwright\\chromium-1019\\chrome-win\\chrome.exe\" --disable-field-trial-config --disable-background-networking --enable-features=NetworkService,NetworkServiceInProcess --disable-background-timer-throttling --disable-backgrounding-occluded-windows --disable-back-forward-cache --disable-breakpad --disable-client-side-phishing-detection --disable-component-extensions-with-background-pages --disable-default-apps --disable-dev-shm-usage --disable-extensions --disable-features=ImprovedCookieControls,LazyFrameLoading,GlobalMediaControls,DestroyProfileOnBrowserClose,MediaRouter,DialMediaRouteProvider,AcceptCHFrame,AutoExpandDetailsElement,CertificateTransparencyComponentUpdater,AvoidUnnecessaryBeforeUnloadCheckSync,Translate --allow-pre-commit-input --disable-hang-monitor --disable-ipc-flooding-protection --disable-popup-blocking --disable-prompt-on-repost --disable-renderer-backgrounding --disable-sync --force-color-profile=srgb --metrics-recording-only --no-first-run --enable-automation --password-store=basic --use-mock-keychain --no-service-autorun --export-tagged-pdf --no-sandbox --auto-open-devtools-for-tabs --deny-permission-prompts --allow-loopback-in-peer-connection --user-data-dir=\"C:\\Users\\Mark\\AppData\\Local\\Temp\\playwright_chromiumdev_profile-5Cth57\" --remote-debugging-pipe --no-startup-window --flag-switches-begin --flag-switches-end --file-url-path-alias=\"/gen=C:\\Users\\Mark\\AppData\\Local\\ms-playwright\\chromium-1019\\chrome-win\\gen\"",
"name": "Chromium",
"official": "unofficial",
"os_type": "Windows NT: 10.0.19044 (x86_64)",
"version": "105.0.5195.19",
"version_mod": ""
},
And a few request details:
{
"params": {
"headers": [
"Host: localhost:8082",
"Connection: keep-alive",
"sec-ch-ua: \"Chromium\";v=\"105\", \"Not)A;Brand\";v=\"8\"",
"Origin: http://localhost:8080",
"Accept-Language: en-CA",
"sec-ch-ua-mobile: ?0",
"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36",
"sec-ch-ua-platform: \"Windows\"",
"Accept: */*",
"Sec-Fetch-Site: same-site",
"Sec-Fetch-Mode: cors",
"Sec-Fetch-Dest: script",
"Referer: http://localhost:8080/",
"Accept-Encoding: gzip, deflate, br"
],
"line": "GET /assets/all.js HTTP/1.1\r\n"
},
"phase": 0,
"source": {
"id": 170,
"start_time": "131696227",
"type": 1
},
"time": "131696228",
"type": 169
},
Nothing really jumps out at me as suspicious though.
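One way to narrow this down is to log the request lifecycle from Playwright itself and compare it with a fetch that bypasses the browser entirely. Below is a minimal sketch using Playwright's Python API (the Node test runner exposes the same events); the :8080/:8082 ports and the /create path are assumptions taken from the description above, so adjust them to the real URLs.

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()

    # Log the full lifecycle of every request to see whether all.js is
    # ever issued, fails outright, or simply never finishes.
    page.on("request", lambda r: print("->", r.method, r.url))
    page.on("requestfailed", lambda r: print("FAILED", r.url, r.failure))
    page.on("requestfinished", lambda r: print("OK", r.url))

    page.goto("http://localhost:8080/create")
    page.wait_for_timeout(10_000)

    # For comparison: this request is made from the Playwright process,
    # not from the browser's network stack.
    resp = page.request.get("http://localhost:8082/assets/all.js")
    print("direct fetch:", resp.status, len(resp.body()))

    browser.close()

If the direct fetch succeeds while the in-page request never emits requestfinished or requestfailed, the problem is specific to the launched browser's network path (for example proxy settings or IPv6 vs IPv4 resolution of localhost) rather than to the test itself.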

Related

How do I scrape a website that ignores my headers?

import requests

test_url = 'https://crimegrade.org/safest-places-in-60629/'
test_headers = {
'accept' : 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'accept-encoding' : 'gzip, deflate, br',
'accept-language' : 'en-US,en;q=0.9',
'cache-control': 'no-cache',
'cookie': '_ga=GA1.2.1384046872.1654177894; _gid=GA1.2.924008640.1654177894',
'pragma': 'no-cache',
'referer' : 'https://crimegrade.org/crime-by-zip-code/',
'sec-fetch-dest': 'document',
'sec-fetch-mode': 'navigate',
'sec-fetch-site': 'same-origin',
'sec-fetch-user': '?1',
'upgrade-insecure-requests' : '1',
'user-agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.63 Safari/537.36'
}
crime_response = requests.get(test_url, headers=test_headers)
print(crime_response.content)
I've managed to scrape other websites with a similar approach before, but for crimegrade.org I haven't been able to get the data I'm after or a clean 200 status code. I think that's why I'm getting this response:
<div class="cf-alert cf-alert-error cf-cookie-error" id="cookie-alert" data-translate="enable_cookies">Please enable cookies.</div>
Do you have any advice on how to solve this?
Through a bit more reading, watching, and hunting on my end, I managed to get around this with the very conventional approach of automating my browsing with Selenium. My code is below.
Note: .page_source gives the HTML, which can be parsed with BeautifulSoup. It is akin to the .content output in my original post, except it contains the information I need.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

crime_url = 'https://crimegrade.org/safest-places-in-73505/'

# Let webdriver_manager fetch a matching chromedriver, then drive a real Chrome
# instance so the site's cookie/JavaScript checks are satisfied.
chrome_driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
chrome_driver.get(crime_url)

# Full rendered HTML of the page, ready to be parsed.
crime_html = chrome_driver.page_source
chrome_driver.quit()
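For the parsing step mentioned above, here is a minimal BeautifulSoup sketch; the h2 selector is just a placeholder, so swap it for whatever elements actually hold the data on the page.

from bs4 import BeautifulSoup

soup = BeautifulSoup(crime_html, 'html.parser')

# Placeholder: print the section headings as a sanity check that the
# rendered HTML contains the content we want.
for heading in soup.find_all('h2'):
    print(heading.get_text(strip=True))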

Jupyter error at saving: "sqlite3.OperationalError: disk I/O error"

I am experiencing an issue when trying to save a file within Jupyter (it worked before, but not anymore). The file is on an NFS mount, which can apparently cause issues: the problem may have appeared while the file was open from two different laptops.
This is the error I get from the terminal:
[E 2022-03-04 16:26:14.418 ServerApp] Error while saving file: Untitled1.ipynb disk I/O error
Traceback (most recent call last):
File "/data/users/pklein/Conda/miniconda3/envs/R-env/lib/python3.10/site-packages/jupyter_server/services/contents/filemanager.py", line 467, in save
self.check_and_sign(nb, path)
File "/data/users/pklein/Conda/miniconda3/envs/R-env/lib/python3.10/site-packages/jupyter_server/services/contents/manager.py", line 515, in check_and_sign
self.notary.sign(nb)
File "/data/users/pklein/Conda/miniconda3/envs/R-env/lib/python3.10/site-packages/nbformat/sign.py", line 452, in sign
self.store.store_signature(signature, self.algorithm)
File "/data/users/pklein/Conda/miniconda3/envs/R-env/lib/python3.10/site-packages/nbformat/sign.py", line 207, in store_signature
if not self.check_signature(digest, algorithm):
File "/data/users/pklein/Conda/miniconda3/envs/R-env/lib/python3.10/site-packages/nbformat/sign.py", line 229, in check_signature
r = self.db.execute("""SELECT id FROM nbsignatures WHERE
sqlite3.OperationalError: disk I/O error
[W 2022-03-04 16:26:14.428 ServerApp] 500 PUT /api/contents/Untitled1.ipynb?1646407574412 (127.0.0.1): Unexpected error while saving file: Untitled1.ipynb disk I/O error
[W 2022-03-04 16:26:14.429 ServerApp] Unexpected error while saving file: Untitled1.ipynb disk I/O error
[E 2022-03-04 16:26:14.429 ServerApp] {
"Host": "localhost:8889",
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0",
"Accept": "*/*",
"Accept-Language": "en-US,en;q=0.5",
"Accept-Encoding": "gzip, deflate",
"Referer": "http://localhost:8889/lab",
"Content-Type": "text/plain;charset=UTF-8",
"Authorization": "token 926a6a5e8c6fead68d1561623988c4416c3d7b92beb20661",
"X-Xsrftoken": "2|0ac8d7f8|44ed8e7e6b965c9a29c545cc381b0ba5|1646214609",
"Origin": "http://localhost:8889",
"Content-Length": "810",
"Connection": "keep-alive",
"Cookie": "username-localhost-8888=\"2|1:0|10:1646320313|23:username-localhost-8888|44:ZTk2NTA4OGQxOWY4NGU3NWFlNDFiMmViYjkwYWIwNWE=|a63dfcc7ce7e59bf90fb0fc2cd39c406bad6fc687673b656f13244bc31053ffb\"; _xsrf=2|0ac8d7f8|44ed8e7e6b965c9a29c545cc381b0ba5|1646214609; username-localhost-8889=\"2|1:0|10:1646407574|23:username-localhost-8889|44:Y2M3ZDU3MDQ5ZGYwNDBmYjhjNjVhZDNkMjZhZmMyNTE=|a66279f706cf27e342645760f638e0b8449cea2720733d0cd8c1a0c9bf000a63\"; username-localhost-8890=\"2|1:0|10:1646314248|23:username-localhost-8890|44:NDNkOWI0NjBkYzhlNDM4MmEwNDA4MGIyNDUwNGM4ZDk=|6ce552a2bb0d687a82ba4e5b4a20953272a3daf2a2d78f411eb990fb608fe3c7\"",
"Sec-Fetch-Dest": "empty",
"Sec-Fetch-Mode": "cors",
"Sec-Fetch-Site": "same-origin",
"Pragma": "no-cache",
"Cache-Control": "no-cache"
}
Following suggestions from other posts, I have already tried:
changing the configuration of ~/.ipython/profile_default/ipython_config.py by setting
c = get_config()  # gets the configuration
c.HistoryManager.hist_file = '/tmp/ipython_hist.sqlite'  # writes the history file to the tmp folder instead
deleting ~/.ipython/profile_default/history.sqlite (I still have space in my HOME directory)
removing ~/.local/share/jupyter/nbsignatures.db
uninstalling and reinstalling jupyter with pip
Any suggestions are welcome, thank you very much!
Best,
Paul
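Since the traceback fails inside nbformat's signature store (sign.py), which keeps its own sqlite database, one thing worth trying is to move that database off the NFS mount entirely. A minimal sketch, assuming the NotebookNotary configurable is what is in play here; the config file path is an assumption, not something from the post above:

# ~/.jupyter/jupyter_server_config.py (or jupyter_notebook_config.py for the classic server)
c = get_config()

# sqlite and NFS do not get along well; ":memory:" skips the on-disk
# nbsignatures.db entirely, and a local path such as /tmp also works.
c.NotebookNotary.db_file = ':memory:'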

curl: How to send HEAD request with request body

I'd like to send a HEAD request with a request body.
So I tried the commands below, but I got some errors.
$ curl -X HEAD http://localhost:8080 -d "test"
Warning: Setting custom HTTP method to HEAD with -X/--request may not work the
Warning: way you want. Consider using -I/--head instead.
curl: (18) transfer closed with 11 bytes remaining to read
or I tried this one:
$ curl -I http://localhost:8080 -d "test"
Warning: You can only select one HTTP request method! You asked for both POST
Warning: (-d, --data) and HEAD (-I, --head).
I think the RFC doesn't prohibit sending a HEAD request with a request body.
How can I send one?
By default, with -d/--data, the "POST" method is used.
With -I/--head you tell curl to use the "HEAD" method.
Which method (POST or HEAD) does your service accept?
I use "https://httpbin.org" for testing.
With cURL, you could use POST like this:
$ curl --silent --include https://httpbin.org/post -d "data=spam_and_eggs"
HTTP/2 200
date: Thu, 30 Sep 2021 18:57:02 GMT
content-type: application/json
content-length: 438
server: gunicorn/19.9.0
access-control-allow-origin: *
access-control-allow-credentials: true
{
"args": {},
"data": "",
"files": {},
"form": {
"data": "spam_and_eggs"
},
"headers": {
"Accept": "*/*",
"Content-Length": "18",
"Content-Type": "application/x-www-form-urlencoded",
"Host": "httpbin.org",
"User-Agent": "curl/7.71.1",
"X-Amzn-Trace-Id": "Root=1-6156087e-6b04f4645dce993909a95b24"
},
"json": null,
"origin": "86.245.210.158",
"url": "https://httpbin.org/post"
}
or the "HEAD" method:
$ curl --silent -X HEAD --include https://httpbin.org/headers -d "data=spam_and_eggs"
HTTP/2 200
date: Thu, 30 Sep 2021 18:58:30 GMT
content-type: application/json
content-length: 260
server: gunicorn/19.9.0
access-control-allow-origin: *
access-control-allow-credentials: true
I checked with strace (over plain HTTP) that the HEAD request and its data are passed to the server:
sendto(5, "HEAD /headers HTTP/1.1\r\nHost: httpbin.org\r\nUser-Agent: curl/7.71.1\r\nAccept: */*\r\nContent-Length: 18\r\nContent-Type: application/x-www-form-urlencoded\r\n\r\ndata=spam_and_eggs", 170, MSG_NOSIGNAL, NULL, 0) = 170
Of course, without the "--silent" option, the warning message appears:
Warning: Setting custom HTTP method to HEAD with -X/--request may not work the
Warning: way you want. Consider using -I/--head instead.
My research is based on this rather old post: https://serverfault.com/questions/140149/difference-between-curl-i-and-curl-x-head
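For completeness, the same HEAD-with-body request can be built outside curl. A minimal sketch with Python's requests library, where http://localhost:8080 stands in for the service from the question:

import requests

# requests lets you pair any method with a body, much like curl -X HEAD -d "test".
# A HEAD response has no body by definition, so only the status code and the
# headers tell you how the server handled it.
resp = requests.request('HEAD', 'http://localhost:8080', data='test')
print(resp.status_code)
print(resp.headers)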

Parsing Custom Nginx access log using telegraf logparser

I have defined a custom Nginx log format using the template below:
log_format main escape=none '"$time_local" client=$remote_addr '
'request="$request" '
'status=$status'
'Req_Header=$req_headers '
'request_body=$req_body '
'response_body=$resp_body '
'referer=$http_referer '
'user_agent="$http_user_agent" '
'upstream_addr=$upstream_addr '
'upstream_status=$upstream_status '
'request_time=$request_time '
'upstream_response_time=$upstream_response_time '
'upstream_connect_time=$upstream_connect_time ';
In return, I get the request logged like this:
"09/Sep/2019:13:28:39 +0530" client=59.152.52.190 request="POST /api/onboard/checkExistence HTTP/1.1"status=200Req_Header=Headers: accept: application/json
host: uat-pwa.abc.com
from: https://uat-pwa.abc.com/onboard/mf/onboard-info_v1.2.15.3
sec-fetch-site: same-origin
accept-language: en-GB,en-US;q=0.9,en;q=0.8
content-type: application/json
connection: keep-alive
content-length: 46
cookie: _ga=GA1.2.51303468.1558948708; _gid=GA1.2.1607663960.1568015582; _gat_UA-144276655-2=1
referer: https://uat-pwa.abc.com/onboard/mf/onboard-info
accept-encoding: gzip, deflate, br
ticket: aW52ZXN0aWNh
businessunit: MF
sec-fetch-mode: cors
userid: Onboarding
origin: https://uat-pwa.abc.com
investorid:
user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36
request_body={"PAN":"ABCDEGH","mobile":null,"both":"no"} response_body={"timestamp":"2019-09-09T13:28:39.132+0530","message":"Client Already Exist. ","details":"Details are in Logger database","payLoad":null,"errorCode":"0050","userId":"Onboarding","investorId":"","sessionUUID":"a2161b89-d2d7-11e9-aa73-3dba15bc0e1c"} referer=https://uat-pwa.abc.com/onboard/mf/onboard-info user_agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36" upstream_addr=[::1]:8080 upstream_status=200 request_time=0.069 upstream_response_time=0.068 upstream_connect_time=0.000
I am having trouble writing a parser rule for the Telegraf logparser section. Once this data is parsed properly, Telegraf can write it into InfluxDB.
I have tried various solutions found online, but I haven't been able to come up with a parsing rule, as I am new to this. Any assistance will be appreciated.
Thanks, and let me know if any further information is required.
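Not a Telegraf rule by itself, but one complication stands out: because of $req_headers and $req_body, each entry spans many lines, which line-oriented grok patterns in the logparser plugin struggle with. A minimal Python sketch to prototype the field extraction on a whole entry first (the log path and the exact field boundaries are assumptions based on the log_format above):

import re

# Read one whole log entry; in the real log you would first need to split
# entries, since each one spans multiple lines.
sample = open('access.log').read()

pattern = re.compile(
    r'"(?P<time_local>[^"]+)" client=(?P<client>\S+) '
    r'request="(?P<request>[^"]+)"'
    r'status=(?P<status>\d+)'            # no space around status in the log_format
    r'Req_Header=(?P<req_headers>.*?)'
    r'request_body=(?P<request_body>.*?) '
    r'response_body=(?P<response_body>.*?) '
    r'referer=(?P<referer>\S*) '
    r'user_agent="(?P<user_agent>[^"]*)" '
    r'upstream_addr=(?P<upstream_addr>\S+) '
    r'upstream_status=(?P<upstream_status>\S+) '
    r'request_time=(?P<request_time>\S+) '
    r'upstream_response_time=(?P<upstream_response_time>\S+) '
    r'upstream_connect_time=(?P<upstream_connect_time>\S+)',
    re.DOTALL,
)

match = pattern.search(sample)
if match:
    for key, value in match.groupdict().items():
        print(key, '=', value[:60])

If the fields come out cleanly, the same boundaries can be translated into a custom grok pattern; an easier alternative is to escape the headers and bodies onto a single line in the log_format so each request stays on one line.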

Unable to start server after Jupyterhub upgrade to 0.8.1

I have recently upgraded JupyterHub from 0.7 to 0.8.1. After the upgrade, I also upgraded the SQLite database as described in the upgrade documents. I'm able to start the JupyterHub service, but after login I'm unable to start the single-user server and get the error below. My server is AD-integrated for login. This was working perfectly before the upgrade. Any idea how this can be resolved?
[I 2019-03-14 15:21:57.698 JupyterHub base:346] User logged in: test
[E 2019-03-14 15:21:57.746 JupyterHub user:427] Unhandled error starting test's server: 'getpwnam(): name not found: test'
[E 2019-03-14 15:21:57.755 JupyterHub web:1590] Uncaught exception POST /hub/login?next= (192.168.0.24)
HTTPServerRequest(protocol='https', host='jupyter2.testing.com', method='POST', uri='/hub/login?next=', version='HTTP/1.1', remote_ip='192.168.0.24', headers={'Content-Type': 'application/x-www-form-urlencoded', 'Accept-Language': 'en-GB,en-US;q=0.9,en;q=0.8', 'Cookie': '_xsrf=2|e7c2dfb6|2e7d6377e8446061ff8be0e64f86210f|1551259887', 'Upgrade-Insecure-Requests': '1', 'Host': 'jupyter2.testing.com', 'X-Forwarded-Proto': 'https', 'Origin': 'https://jupyter2.testing.com', 'X-Real-Ip': '192.168.0.24', 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8', 'Accept-Encoding': 'gzip, deflate, br', 'Content-Length': '42', 'Cache-Control': 'max-age=0', 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.80 Safari/537.36', 'X-Forwarded-Port': '443', 'Referer': 'https://jupyter2.testing.com/hub/login', 'X-Forwarded-Host': 'jupyter2.testing.com', 'X-Forwarded-For': '192.168.0.24,127.0.0.1', 'Connection': 'close', 'X-Nginx-Proxy': 'true'})
Traceback (most recent call last):
File "/usr/local/python3/lib/python3.4/site-packages/tornado/web.py", line 1511, in _execute
result = yield result
File "/usr/local/python3/lib/python3.4/site-packages/jupyterhub/handlers/login.py", line 94, in post
yield self.spawn_single_user(user)
File "/usr/local/python3/lib/python3.4/site-packages/jupyterhub/handlers/base.py", line 475, in spawn_single_user
yield gen.with_timeout(timedelta(seconds=self.slow_spawn_timeout), finish_spawn_future)
File "/usr/local/python3/lib/python3.4/site-packages/jupyterhub/handlers/base.py", line 445, in finish_user_spawn
yield spawn_future
File "/usr/local/python3/lib/python3.4/site-packages/jupyterhub/user.py", line 439, in spawn
raise e
File "/usr/local/python3/lib/python3.4/site-packages/jupyterhub/user.py", line 378, in spawn
ip_port = yield gen.with_timeout(timedelta(seconds=spawner.start_timeout), f)
File "/usr/local/python3/lib/python3.4/site-packages/jupyterhub/spawner.py", line 968, in start
env = self.get_env()
File "/usr/local/python3/lib/python3.4/site-packages/jupyterhub/spawner.py", line 960, in get_env
env = self.user_env(env)
File "/usr/local/python3/lib/python3.4/site-packages/jupyterhub/spawner.py", line 947, in user_env
home = pwd.getpwnam(self.user.name).pw_dir
KeyError: 'getpwnam(): name not found: test'
[E 2019-03-14 15:21:57.756 JupyterHub log:114] {
"Content-Type": "application/x-www-form-urlencoded",
"Accept-Language": "en-GB,en-US;q=0.9,en;q=0.8",
"Cookie": "_xsrf=2|e7c2dfb6|2e7d6377e8446061ff8be0e64f86210f|1551259887",
"Upgrade-Insecure-Requests": "1",
"Host": "jupyter2.testing.com",
"X-Forwarded-Proto": "https",
"Origin": "https://jupyter2.testing.com",
"X-Real-Ip": "192.168.0.24",
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
"Accept-Encoding": "gzip, deflate, br",
"Content-Length": "42",
"Cache-Control": "max-age=0",
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.80 Safari/537.36",
"X-Forwarded-Port": "443",
"Referer": "https://jupyter2.testing.com/hub/login",
"X-Forwarded-Host": "jupyter2.testing.com",
"X-Forwarded-For": "192.168.0.24,127.0.0.1",
"Connection": "close",
"X-Nginx-Proxy": "true"
}
[E 2019-03-14 15:21:57.757 JupyterHub log:122] 500 POST /hub/login?next= (#192.168.0.24) 199.65ms
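The last frame of the traceback is the actual failure: the default spawner resolves the logged-in user's home directory through the local account database (pwd.getpwnam). A minimal check to run on the hub machine, with the username "test" taken from the log above:

import pwd

# spawner.py calls pwd.getpwnam(self.user.name).pw_dir to find the home
# directory. If the AD account is not visible to the operating system
# (e.g. through SSSD/NSS), the lookup raises exactly the KeyError above.
try:
    print(pwd.getpwnam('test').pw_dir)
except KeyError as err:
    print('user is unknown to the local system:', err)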
