clj-http.client get request incompatibility with target - http

I'm trying to make a clj-http.client request to a third-party webpage, but I'm getting the following error response:
clj-http: status 403
{:cached nil,
:request-time 141,
:repeatable? false,
:protocol-version {:name "HTTP", :major 1, :minor 1},
:streaming? true,
:http-client
#object[org.apache.http.impl.client.InternalHttpClient 0x7d5d53f8 "org.apache.http.impl.client.InternalHttpClient#7d5d53f8"],
:chunked? true,
:type :clj-http.client/unexceptional-status,
:reason-phrase "Forbidden",
:headers
{"Server" "cloudflare",
"Content-Type" "text/html; charset=UTF-8",
"X-Frame-Options" "SAMEORIGIN",
"Connection" "close",
"cf-request-id" "02afa1c33a00000cb192b11200000001",
"Transfer-Encoding" "chunked",
"Set-Cookie"
"__cfduid=d5186984dc436aac8989538666ddb21761589373348; expires=Fri, 12-Jun-20 12:35:48 GMT; path=/; domain=.angel.co; HttpOnly; SameSite=Lax",
"Expect-CT"
"max-age=604800, report-uri=\"https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct\"",
"CF-RAY" "592c6be52bb00cb1-EWR",
"Date" "Wed, 13 May 2020 12:35:48 GMT",
"Vary" "Accept-Encoding",
"Cache-Control" "no-cache",
"CF-Chl-Bypass" "1"},
:orig-content-encoding nil,
:status 403,
:length -1,
:body
"<!DOCTYPE html>\n<!--[if lt IE 7]> <html class=\"no-js ie6 oldie\" lang=\"en-US\"> <![endif]-->\n<!--[if IE 7]> <html class=\"no-js ie7 oldie\" lang=\"en-US\"> <![endif]-->\n<!--[if IE 8]> <html class=\"no-js ie8 oldie\" lang=\"en-US\"> <![endif]-->\n<!--[if gt IE 8]><!--> <html class=\"no-js\" lang=\"en-US\"> <!--<![endif]-->\n<head>\n<title>Attention Required! | Cloudflare</title>\n<meta name=\"captcha-bypass\" id=\"captcha-bypass\" />\n<meta charset=\"UTF-8\" />\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\" />\n<meta http-equiv=\"X-UA-Compatible\" content=\"IE=Edge,chrome=1\" />\n<meta name=\"robots\" content=\"noindex, nofollow\" />\n<meta name=\"viewport\" content=\"width=device-width,initial-scale=1\" />\n<link rel=\"stylesheet\" id=\"cf_styles-css\" href=\"/cdn-cgi/styles/cf.errors.css\" type=\"text/css\" media=\"screen,projection\" />\n<!--[if lt IE 9]><link rel=\"stylesheet\" id='cf_styles-ie-css' href=\"/cdn-cgi/styles/cf.errors.ie.css\" type=\"text/css\" media=\"screen,projection\" /><![endif]-->\n<style type=\"text/css\">body{margin:0;padding:0}</style>\n\n\n<!--[if gte IE 10]><!--><script type=\"text/javascript\" src=\"/cdn-cgi/scripts/zepto.min.js\"></script><!--<![endif]-->\n<!--[if gte IE 10]><!--><script type=\"text/javascript\" src=\"/cdn-cgi/scripts/cf.common.js\"></script><!--<![endif]-->\n\n\n\n\n</head>\n<body>\n <div id=\"cf-wrapper\">\n <div class=\"cf-alert cf-alert-error cf-cookie-error\" id=\"cookie-alert\" data-translate=\"enable_cookies\">Please enable cookies.</div>\n <div id=\"cf-error-details\" class=\"cf-error-details-wrapper\">\n <div class=\"cf-wrapper cf-header cf-error-overview\">\n <h1 data-translate=\"challenge_headline\">One more step</h1>\n <h2 class=\"cf-subheadline\"><span data-translate=\"complete_sec_check\">Please complete the security check to access</span> angel.co</h2>\n </div><!-- /.header -->\n \n <div class=\"cf-section cf-highlight cf-captcha-container\">\n <div class=\"cf-wrapper\">\n 
<div class=\"cf-columns two\">\n <div class=\"cf-column\">\n \n <div class=\"cf-highlight-inverse cf-form-stacked\">\n <form class=\"challenge-form\" id=\"challenge-form\" action=\"/?__cf_chl_captcha_tk__=0d94e2e21d2ef06ef34bd2b5b4667f279b690108-1589373348-0-ATT4PiY_pQI1dGVw0_sZDV32_7x4mqtO4RepyD-L4i6zBJiIuml25fVlyJaK8uXNJw5ZWnzGlb6y0jGJJ8HIdEz14sOXRUoHqs_naHtwFEQywa8qZf_rwHsBxIUD5y_FNPph6TDrcfLVnQaN9eyy5VjiznzH4y0yeK8cidNnd-qNGw4OIZbFLfv8299DGhvNnBgsbn3BiQ9bkoGOtE4wANUh5U2LTJVAWhlquAvfhjCu6jHlYRXtN5GdnNvfBbCYwWGwCX0j88J-qCjJFOrSvx1_xraYtpB_Y8PpLHZTob_t8POfE0kJpn9ZYxwjhLQhqAAcIoE8fRe7Lv_50pzummklgMLgTRT2_NJGiE-_jNEogQmoTCvGOOmhNCe28SVYkXop9Ajm-z-6xwgoKQnY7EwekXJZCs-4nwpWJ9Gh3HBgVxZRiuv_wKgcmU0sPlLXSL5G8yOVdbBKBtHhQyqadtmTSg_IC2HV7SiYqPoJMmpJGfxxUm1au7ZS9ZiLpokjI5pQDZLpT2ZG-6jVfnTKvt9w_qmMtUSBhDleXd8mG59r\" method=\"POST\" enctype=\"application/x-www-form-urlencoded\">\n <input type=\"hidden\" name=\"r\" value=\"bfe8db4864c274e3ed80528a0e0ad233279c00b9-1589373348-0-AVWaRjujNq/XSmYrRyYxyBLhp5bbxA92rBX2qiiOx9PVWzas1b/usxApmblw248v1q5iUvP/V/GYHXhQF1UBviAqExhVjGW4upmNgdEf/zdFWHbgQb/s0RdZyMS+rurne8Y7aKD8ppx/WHjY8eSxVTGcHePc+qs/NdCt33voCLk2sGd0inuxibNjFXkBT62qs/JshlzaDsM58mC/jdSBRHZiOoJHmteJ0J1vwDVTVumWM97Qrc9fDyAqvDo72LCfqq0uG6hppWsi/z5jnGhTwzmJ7biqcY3BThvQAABSgD80MH4unfjys3iYhsefX0tfuAm23Rx1BCoKDRrrnWy0//Z9D0vI3petRmLSerLnJUAqCRh6ZoRqahYwNTPr39G+/WBJBsh3UDfB0+PwSmGsczRmL6DDbDu023etpAhehWcdR55ftEcijKiEnnfZE4vyKYm4C835QoKlQ+odT+u7syO/u/PgoyguQxqnNoKdlSSCs4+96s86urmY/yM9T4dvZdB4K4aOVH5cNfRHc8fsqeKpcuxmBbHOmIYIAegjTd5iKB4OQtxPHti1ZQCLeP74OiAxF6UgH+bCBp+h2mfU19CtEXvfcQdxGXPDT/iAPbPZG8c7fubDCKUympyb5nbHzVUcL9IGTlCq1zN7B1pRFj/O6JKOGBRo+q0OEs0nI7l/RFvmDfEtA0FYSC4IGegEs//fUsB165Zdm2SdKk7/cy89Xd4Hy5cedzqmjrtKNw5zjvfjqaNU7FlUL38irfopK/Pyk5Fp/HdV7iMvflIJO1M7GedTWdcNKB/OqPGV9NuJaKYgJbgBrxS4iYtHw9ZZsKWogYCig+eYiU8ty/MSDus9zCE2yRIbLVQ59AFwqTwODgBaV2nJepBDxcXVauCpdHiGbi7Q9M4t1eyGafFUKasv3unzdriRTrFPZ+44ZQb3gYberTMv2f3MwfcryaFgxcgtu43w8Hy5nviA9sOeoLmPYMZtL85QbB+AzKCXJV5DfIGcMvx1aeD/D9QNy
OSTakVv2tAwxnP5UeQj8mJKGHTYrIsOMFDfxSnQ2lVzMRPQYmeEes8KjFvYrGyQ82Io+hGnKYOHX1T1ioi+wh+MGacVaSC1VMfG6rdIauPSxbB9WNxqnJxKz7SxHNiV3Gwm4rgUOs+vN2tSPyfINt12OHU=\">\n <input type=\"hidden\" name=\"cf_captcha_kind\" value=\"h\">\n <script type=\"text/javascript\" src=\"/cdn-cgi/scripts/hcaptcha.challenge.js\" data-type=\"normal\" data-ray=\"592c6be52bb00cb1\" async data-sitekey=\"33f96e6a-38cd-421b-bb68-7806e1764460\"></script>\n <noscript id=\"cf-captcha-bookmark\" class=\"cf-captcha-info\">\n <h1 data-translate=\"turn_on_js\" style=\"color:#bd2426;\">Please turn JavaScript on and reload the page.</h1>\n </noscript>\n <div id=\"trk_captcha_js\" style=\"background-image:url('/cdn-cgi/images/trace/captcha/nojs/h/transparent.gif?ray=592c6be52bb00cb1')\"></div>\n</form>\n\n </div>\n </div>\n\n <div class=\"cf-column\">\n <div class=\"cf-screenshot-container\">\n \n <span class=\"cf-no-screenshot\"></span>\n \n </div>\n </div>\n </div><!-- /.columns -->\n </div>\n </div><!-- /.captcha-container -->\n\n <div class=\"cf-section cf-wrapper\">\n <div class=\"cf-columns two\">\n <div class=\"cf-column\">\n <h2 data-translate=\"why_captcha_headline\">Why do I have to complete a CAPTCHA?</h2>\n \n <p data-translate=\"why_captcha_detail\">Completing the CAPTCHA proves you are a human and gives you temporary access to the web property.</p>\n </div>\n\n <div class=\"cf-column\">\n <h2 data-translate=\"resolve_captcha_headline\">What can I do to prevent this in the future?</h2>\n \n\n <p data-translate=\"resolve_captcha_antivirus\">If you are on a personal connection, like at home, you can run an anti-virus scan on your device to make sure it is not infected with malware.</p>\n\n <p data-translate=\"resolve_captcha_network\">If you are at an office or shared network, you can ask the network administrator to run a scan across the network looking for misconfigured or infected devices.</p>\n \n \n \n </div>\n </div>\n </div><!-- /.section -->\n \n\n <div class=\"cf-error-footer 
cf-wrapper\">\n <p>\n <span class=\"cf-footer-item\">Cloudflare Ray ID: <strong>592c6be52bb00cb1</strong></span>\n <span class=\"cf-footer-separator\">•</span>\n <span class=\"cf-footer-item\"><span>Your IP</span>: 128.151.150.1</span>\n <span class=\"cf-footer-separator\">•</span>\n <span class=\"cf-footer-item\"><span>Performance & security by</span> Cloudflare</span>\n \n </p>\n</div><!-- /.error-footer -->\n\n\n </div><!-- /#cf-error-details -->\n </div><!-- /#cf-wrapper -->\n\n <script type=\"text/javascript\">\n window._cf_translation = {};\n \n \n</script>\n\n\n</body>\n</html>\n",
:trace-redirects []}
The response body mentions "no-js ie6 oldie". How do I fix this?
-- EDIT --
I tried this, but it didn't work either:
curl 'https://angel.co' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:76.0) Gecko/20100101 Firefox/76.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'Connection: keep-alive' -H 'Cookie: __cfduid=d3d3c3a784a6878d27e67563734baeb7e1589375442; _ga=GA1.2.1171200054.1589375443; _gid=GA1.2.1332292087.1589375443; ajs_user_id=null; ajs_group_id=null; ajs_anonymous_id=%226898badd-9a9e-4876-9a75-e92df55c5d4b%22; _hjid=58408190-990c-4a5b-86c8-ac6003d5fe93' -H 'Upgrade-Insecure-Requests: 1'
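For what it's worth, the body quoted above looks like Cloudflare's CAPTCHA challenge page rather than an error raised by clj-http: the status is 403, the Server header is cloudflare, and the HTML contains a cf-captcha-container form. The "no-js ie6 oldie" text is only a CSS class inside an IE conditional comment. Below is a minimal sketch of how a client could recognize such a response before trying to parse it. It is written in Python rather than Clojure, and the function name and marker strings are assumptions taken from the response shown here, not part of any Cloudflare API.

```python
def looks_like_cloudflare_challenge(status, headers, body):
    """Heuristic: True when a response resembles the Cloudflare CAPTCHA page above."""
    # Header names are case-insensitive, so normalize them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    served_by_cloudflare = lowered.get("server", "").lower() == "cloudflare"
    # Marker strings copied from the 403 body quoted in the question (assumption:
    # other Cloudflare pages may use different markup).
    markers = ("cf-captcha-container", "Attention Required! | Cloudflare")
    has_marker = any(marker in body for marker in markers)
    return status == 403 and served_by_cloudflare and has_marker


sample_headers = {"Server": "cloudflare", "CF-Chl-Bypass": "1"}
sample_body = "<title>Attention Required! | Cloudflare</title>"
print(looks_like_cloudflare_challenge(403, sample_headers, sample_body))
```

A check like this only tells you that you hit the challenge; it does not get you past it.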

Related

Need help in reaching content to scrape website behind cloudflare

I am trying to scrape articles from the following website: https://dzkuensel.com using Python.
However, if I use, e.g. requests.get() (or even selenium), I don't reach the contents I need because of this:
Can anyone suggest a workaround?
Setting some cookies and headers seems to solve the problem for me...
import requests
cookies = {
'cf_clearance': 'GpVnDnCl9tkRWCpGeyXaDZRCkyj9ah0oq8bVcJPUd9E-1664729507-0-150',
}
headers = {
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:98.0) Gecko/20100101 Firefox/98.0',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
'Accept-Language': 'en-US,en;q=0.5',
'DNT': '1',
'Connection': 'keep-alive',
'Upgrade-Insecure-Requests': '1',
'Sec-Fetch-Dest': 'document',
'Sec-Fetch-Mode': 'navigate',
'Sec-Fetch-Site': 'cross-site',
'Sec-Fetch-User': '?1',
'Cache-Control': 'max-age=0',
}
response = requests.get('https://dzkuensel.com/', cookies=cookies, headers=headers)
response.text
'<!DOCTYPE html>\n<html lang="en-US" xmlns:og="http://opengraphprotocol.org/schema/" xmlns:fb="http://www.facebook.com/2008/fbml" itemscope itemtype="http://schema.org/Article" prefix="og: http://ogp.me/ns#">\n<head>\n<meta charset="UTF-8" />\n<link rel="profile" href="https://gmpg.org/xfn/11" />\n<link rel="pingback" href="https://dzkuensel.com/xmlrpc.php" />\n<title>ཀུན་གསལ། – རྒྱལ་ཡོངས་ཉིན་བསྟར་གསར་ཤོག།</title>\n<meta property="og:title" content="ཀུན་གསལ། - རྒྱལ་ཡོངས་ཉིན་བསྟར་གསར་ཤོག།"/>\r\n<meta property="og:type" content="website"/>\r\n<meta property="og:description" content=""/>\r\n<meta property="og:url" content="https://dzkuensel.com/"/>\r\n<meta property="og:site_name" content="ཀུན་གསལ། - རྒྱལ་ཡོངས་ཉིན་བསྟར་གསར་ཤོག།"/>\r\n<meta name=\'robots\' content=\'max-image-preview:large\' />\n<link rel=\'dns-prefetch\' href=\'//fonts.googleapis.com\' />\n<link rel=\'dns-prefetch\' href=\'//s.w.org\' />\n<link rel="alternate" type="application/rss+xml" title="ཀུན་གསལ། - རྒྱལ་ཡོངས་ཉིན་བསྟར་གསར་ཤོག། » Feed" href="https://dzkuensel.com/feed/" />\n<link rel="alternate" type="application/rss+xml" title="ཀུན་གསལ། - རྒྱལ་ཡོངས་ཉིན་བསྟར་གསར་ཤོག། » Comments Feed" href="https://dzkuensel.com/comments/feed/" />\n\t\t<script type="text/javascript">\n\t\t\twindow._wpemojiSettings = {"baseUrl":"https:\\/\\/s.w.org\\/images\\/core\\/emoji\\/13.1.0\\/72x72\\/","ext":".png","svgUrl":"https:\\/\\/s.w.org\\/images\\/core\\/emoji\\/13.1.0\\/svg\\/","svgExt":".svg","source":{"concatemoji":"https:\\/\\/dzkuensel.com\\/wp-includes\\/js\\/wp-emoji-release.min.js"}};\n\t\t\t!function(e,a,t){var n,r,o,i=a.createElement("canvas"),p=i.getContext&&i.getContext("2d");function s(e,t){var a=String.fromCharCode;p.clearRect(0,0,i.width,i.height),p.fillText(a.apply(this,e),0,0);e=i.toDataURL();return p.clearRect(0,0,i.width,i.height),p.fillText(a.apply(this,t),0,0),e===i.toDataURL()}function c(e){var 
t=a.createElement("script");t.src=e,t.defer=t.type="text/javascript",a.getElementsByTagName("head")[0].appendChild(t)}for(o=Array("flag","emoji"),t.supports={everything:!0,everythingExceptFlag:!0},r=0;r<o.length;r++)t.supports[o[r]]=function(e){if(!p||!p.fillText)return!1;switch(p.textBaseline="top",p.font="600 32px Arial",e){case"flag":return s([127987,65039,8205,9895,65039],[127987,65039,8203,9895,65039])?!1:!s([55356,56826,55356,56819],[55356,56826,8203,55356,56819])&&!s([55356,57332,56128,56423,56128,56418,56128,56421,56128,56430,56128,56423,56128,56447],[55356,57332,8203,56128,56423,8203,56128,56418,8203,56128,56421,8203,56128,56430,8203,56128,56423,8203,56128,56447]);case"emoji":return!s([10084,65039,8205,55357,56613],[10084,65039,8203,55357,56613])}return!1}(o[r]),t.supports.everything=t.supports.everything&&t.supports[o[r]],"flag"!==o[r]&&(t.supports.everythingExceptFlag=t.supports.everythingExceptFlag&&t.supports[o[r]]);t.supports.everythingExceptFlag=t.supports.everythingExceptFlag&&!t.supports.flag,t.DOMReady=!1,t.readyCallback=function(){t.DOMReady=!0},t.supports.everything||(n=function(){t.readyCallback()},a.addEventListener?(a.addEventListener("DOMContentLoaded",n,!1),e.addEventListener("load",n,!1)):(e.attachEvent("onload",n),a.attachEvent("onreadystatechange",function(){"complete"===a.readyState&&t.readyCallback()})),(n=t.source||{}).concatemoji?c(n.concatemoji):n.wpemoji&&n.twemoji&&(c(n.twemoji),c(n.wpemoji)))}(window,document,window._wpemojiSettings);\n\t\t</script>\n\t\t<style type="text/css">\nimg.wp-smiley,\nimg.emoji {\n\tdisplay: inline !important;\n\tborder: none !important;\n\tbox-shadow: none !important;\n\theight: 1em !important;\n\twidth: 1em !important;\n\tmargin: 0 .07em !important;\n\tvertical-align: -0.1em !important;\n\tbackground: none !important;\n\tpadding: 0 !important;\n}\n</style>\n\t<link rel=\'stylesheet\' id=\'lafs_style-css\' 
href=\'https://dzkuensel.com/wp-content/plugins/follow-subscribe/css/lafs_style.css?lafs_version=1.1\' type=\'text/css\' media=\'all\' />\n<link rel=\'stylesheet\' id=\'tie-insta-style-css\' href=\'https://dzkuensel.com/wp-content/plugins/instanow/assets/style.css\' type=\'text/css\' media=\'all\' />\n<link rel=\'stylesheet\' id=\'categoryposttab-css\' href=\'https://dzkuensel.com/wp-content/plugins/category-and-post-tab/assets/css/categoryposttab.css\' type=\'text/css\' media=\'all\' />\n<link rel=\'stylesheet\' id=\'wp-block-library-css\' href=\'https://dzkuensel.com/wp-includes/css/dist/block-library/style.min.css\' type=\'text/css\' media=\'all\' />\n<link rel=\'stylesheet\' id=\'bwg_frontend-css\' href=\'https://dzkuensel.com/wp-content/plugins/photo-gallery/css/bwg_frontend.css\' type=\'text/css\'

HTML mail sent by Drupal - html tags are displayed as plain text

Drupal 9.3.3
Module MimeMail
MailSystem module settings: MimeMail is used as formatter and sender.
I send a test MimeMail successfully
but the content is not as expected for an HTML mail.
This is the content of the mail as displayed (not the code):
This is a multi-part message in MIME format.
--20dff9a16b5198f0553cfc2d198f7a201cbebed81
Content-Type: multipart/alternative;
boundary="7078940ec78091f01efed9a4d17f1784137cb90a0"
Content-Transfer-Encoding: 8bit
--7078940ec78091f01efed9a4d17f1784137cb90a0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
CONTENT OF PLAIN TEXT FIELD IN TEST MIMEMAIL FORM
--7078940ec78091f01efed9a4d17f1784137cb90a0
Content-Type: text/html; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8Bit
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Mime Mail Example message template</title>
</head>
<body id="mimemail-example-body" class="mimemail-example-test">
<div id="center">
<div id="main">
CONTENT OF HTML FIELD IN TEST MIMEMAIL FORM
</div>
</div>
</body>
</html>
--7078940ec78091f01efed9a4d17f1784137cb90a0--
--20dff9a16b5198f0553cfc2d198f7a201cbebed81--
I am not very familiar with mailing systems.
What could be the reason why html tags are displayed as plain text?
This is the raw source of the mail:
Return-Path: <faoa3352#domain.net>
Delivered-To: myname#free.fr
Received: from capitale.jabatus.fr (mx15-g26.priv.proxad.net [172.20.243.85])
by toaster2-g26.priv.proxad.net (Postfix) with ESMTP id 3415FD8058D
for <myname#free.fr>; Wed, 2 Feb 2022 00:35:01 +0100 (CET)
Received: from capitale.jabatus.fr ([109.234.163.51])
by mx1-g20.free.fr (MXproxy) with ESMTPS for myname#free.fr
(version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256);
Wed, 2 Feb 2022 00:35:01 +0100 (CET)
X-ProXaD-SC: state=HAM score=0
X-ProXaD-Cause: (null)
X-Spam-Status: No
X-MailPropre-MailScanner-From: faoa3352#cantonais.o2switch.net
X-MailPropre-MailScanner-SpamCheck: not spam, SpamAssassin (not cached,
score=0.401, required 5, autolearn=disabled, FREEFR 0.01,
KAM_LOTSOFHASH 0.25, KAM_SHORT 1.00, RCVD_IN_DNSWL_HI 0.01,
SERVINT 0.01, SPF_HELO_NONE 0.00, SPF_PASS -1.00, ST02 0.10,
SUKC_2 0.01, VM03 0.01)
X-MailPropre-MailScanner: Not scanned: please contact your Internet E-Mail Service Provider for details
X-MailPropre-MailScanner-ID: 23D4B100499.AC917
X-MailPropre-MailScanner-Information: Please contact the ISP for more information
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
d=faoa3352.odns.fr; s=default; h=Date:From:Message-Id:MIME-Version:Subject:To
:Sender:Reply-To:Cc:Content-Type:Content-Transfer-Encoding:Content-ID:
Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc
:Resent-Message-ID:In-Reply-To:References:List-Id:List-Help:List-Unsubscribe:
List-Subscribe:List-Post:List-Owner:List-Archive;
bh=YvftJrKmBeNP4umR5N/8QzQNVuXwfXnJ1C9WgM5Bv4Y=; b=AfgO5GvhTm2fuO5iAc0HYy2bs7
u5kXFZ0ZIeFRGIxnwUMePgOERHOo8X3h+8cqJSkYYyRlmV5j8Nzgmj1oFLIJGA23Z2UogCLIXfn/t
szVb+ZAn3RvXLKtby2nkiVya/8/8+z3ohCzupUVMMxFGvjcRBL59uYfLAvOgltnDPsJXinnEsuzMp
ZYe3FdvalT7dQh5anEPhe6LjuMfuCyFDxP6Zz/EAxvfkaSKu1LBnMXsjrKha4PrAYwf7J7bS6ZQhO
ha9iGMsryjl7QDBoI8N4yEoRY/PhPSQi3Vw++4Yi476aIevKymIqECVWO+myfc5xAkyQz8OKrzgeX
e3dgTXPg==;
To: myname#free.fr
Subject: Test Mail
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="6151d4616ca0bcf7ab13e4df7dfeb33596e00b4c1"
Content-Transfer-Encoding: 8Bit
X-Mailer: Drupal
Sender: G-FAMILY <gilbert.admin#iaou.fr>
From: G-FAMILY <gilbert.admin#iaou.fr>
Message-Id: <E1nF2fg-0002U1-Ou#cantonais.o2switch.net>
From: faoa3352#cantonais.o2switch.net
Date: Wed, 02 Feb 2022 00:34:48 +0100
X-AntiAbuse: This header was added to track abuse, please include it with any abuse report
X-AntiAbuse: Primary Hostname - cantonais.o2switch.net
X-AntiAbuse: Original Domain - free.fr
X-AntiAbuse: Originator/Caller UID/GID - [1078 994] / [47 12]
X-AntiAbuse: Sender Address Domain - cantonais.o2switch.net
X-Get-Message-Sender-Via: cantonais.o2switch.net: authenticated_id: faoa3352/primary_hostname/system user
X-Authenticated-Sender: cantonais.o2switch.net: faoa3352
X-Source:
X-Source-Args:
X-Source-Dir: faoa3352.odns.fr:/SITES/gfamily.iaou.fr
This is a multi-part message in MIME format.
--6151d4616ca0bcf7ab13e4df7dfeb33596e00b4c1
Content-Type: multipart/alternative;
boundary="6d0a30d7981a4211f4855c18a1ed7354b284dbac0"
Content-Transfer-Encoding: 8bit
--6d0a30d7981a4211f4855c18a1ed7354b284dbac0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
TEST PLAIN TEXT
--6d0a30d7981a4211f4855c18a1ed7354b284dbac0
Content-Type: text/html; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8Bit
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Mime Mail Example message template</title>
</head>
<body id="mimemail-example-body" class="mimemail-example-test mail">
<div id="center">
<div id="main">
<strong>TEST HTML TEXT </strong>
</div>
</div>
</body>
</html>
--6d0a30d7981a4211f4855c18a1ed7354b284dbac0--
--6151d4616ca0bcf7ab13e4df7dfeb33596e00b4c1--
You must also add the mime formatter into your mail settings.
Go to admin/config/system/mailsystem and configure it, like this:
This is what I did, without success.
In the meantime I found that this issue was coming from a bad MIME format.
Some spaces were added at the beginning of header lines
(see above in the raw mail content, just after the line "MIME-Version: 1.0").
So I moved from "MailSystem + MimeMail" to "SymfonyMailer" and it works.
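To illustrate why leading spaces in header lines can matter: a header line that starts with whitespace is treated as a continuation of the previous header, so a Content-Type line indented this way is no longer seen as its own header and the message falls back to plain text. This is a minimal sketch using Python's standard email parser, not Drupal or MimeMail code, and the two sample messages are made up for the demonstration.

```python
from email import message_from_string

# A small, well-formed message (hypothetical sample).
GOOD = (
    "Subject: Test Mail\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/html; charset=utf-8\n"
    "\n"
    "<strong>TEST HTML TEXT</strong>\n"
)
# The same message with one space in front of the Content-Type header line.
BAD = GOOD.replace("\nContent-Type:", "\n Content-Type:")

good = message_from_string(GOOD)
bad = message_from_string(BAD)

print(good.get_content_type())  # text/html
print(bad.get_content_type())   # text/plain (the parser's default)
print(bad["Content-Type"])      # None: the line was folded into MIME-Version
```

A mail client that parses headers the same way would show the HTML part as raw text, which matches the symptom described in the question.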

CasperJS gets blocked

I'm trying to build a scraper using CasperJS, but it keeps getting blocked. I read several articles saying that this can be avoided by setting the user agent, but even with a user agent set I get blocked.
Here is my current setup:
var casper = require('casper').create({
verbose: true,
logLevel: 'debug',
colorizerType: 'Dummy',
waitTimeout: 30000, // timeout for waits (loading etc.)
exitOnError: true,
pageSettings: {
userAgent: 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.56 Safari/536.5',
javascriptEnabled: true,
loadImages: true,
loadPlugins: true,
},
onError: function(msg, backtrace) {
this.exit();
}
});
casper.start().then(function() {
this.open('https://WEBSITE-URL', {
headers: {
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8'
}
});
casper.viewport(1280, 1024);
});
// Login
casper.then(function() {
this.echo("Waiting for login form to load.");
this.echo(this.getHTML());
});
I receive this HTML after running casper:
<!DOCTYPE html><html><head>
<meta name="ROBOTS" content="NOINDEX, NOFOLLOW">
<meta http-equiv="cache-control" content="max-age=0">
<meta http-equiv="cache-control" content="no-cache">
<meta http-equiv="expires" content="0">
<meta http-equiv="expires" content="Tue, 01 Jan 1980 1:00:00 GMT">
<meta http-equiv="pragma" content="no-cache">
<meta http-equiv="refresh" content="10; url=/distil_r_captcha.html?requestId=972f0bd8-1861-4c7b-8459-ce880b8cf2b6&httpReferrer=%2F">
<script type="text/javascript">
(function(window){
try {
if (typeof sessionStorage !== 'undefined'){
sessionStorage.setItem('distil_referrer', document.referrer);
}
} catch (e){}
})(window);
</script>
<script type="text/javascript" src="/dstltrntmls.js" defer="">
</script>
<style type="text/css">#d__fFH{position:absolute;top:-5000px;left:-5000px}#d__fF{font-family:serif;font-size:200px;visibility:hidden}#ruxctfdwzvsxvuucdvdtdtsufa{display:none!important}</style></head>
<body>
<div id="distilIdentificationBlock"> </div>
<div id="d__fFH" style="position: absolute; top: -5000px; left: -5000px;">
<object id="d_dlg" classid="clsid:3050f819-98b5-11cf-bb82-00aa00bdce0b" width="0px" height="0px"></object>
<span id="d__fF" style="font-family: Courier, serif; font-size: 72px; visibility: hidden;">The quick brown fox jumps over the lazy dog.</span></div></body>
</html>
Is there a way to work around this issue? When I try a simple GET request in Postman it returns the actual HTML, but it doesn't in CasperJS.

Looping through array in PHP to post several multipart form-data

I'm trying in an asp web application to code a function that would loop through a list of files in a multiple upload form and send them one by one.
Is this something that can be done in ASP? I've read some posts about how to attach several files together, but saw nothing about looping through the files. I can easily imagine it in C# via HttpWebRequest or with a socket, but in PHP, I guess there are already functions designed to handle it?
// This is false/pseudo-code :)
for (int index = 0; index < number_of_files; index++)
{
postfile(file[index]);
}
And in each iteration, it should send a multipart form-data POST.
postfile(TheFileInfos) should make a POST like this:
POST /afs.aspx?fn=upload HTTP/1.1
[Header stuff]
Content-Type: multipart/form-data; boundary=----------Ef1Ef1cH2Ij5GI3ae0gL6KM7GI3GI3
[Header stuff]
------------Ef1Ef1cH2Ij5GI3ae0gL6KM7GI3GI3
Content-Disposition: form-data; name="Filename"
myimage1.png
------------Ef1Ef1cH2Ij5GI3ae0gL6KM7GI3GI3
Content-Disposition: form-data; name="fileid"
58e21ede4ead43a5201206101806420000007667212251
------------Ef1Ef1cH2Ij5GI3ae0gL6KM7GI3GI3
Content-Disposition: form-data; name="Filedata"; filename="myimage1.png"
Content-Type: application/octet-stream
[Octet Stream]
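The request layout above can be sketched as a small helper that assembles one multipart/form-data body per file, called once per loop iteration. This is a language-neutral illustration in Python rather than PHP or ASP; the helper name, the sample file list and the generated file ids are hypothetical, and the actual HTTP call is left out.

```python
import uuid


def build_multipart_body(file_name, file_id, file_bytes, boundary):
    """Assemble one multipart/form-data body with the three parts shown above."""
    crlf = b"\r\n"
    delimiter = b"--" + boundary.encode("ascii")
    parts = [
        delimiter,
        b'Content-Disposition: form-data; name="Filename"',
        b"",
        file_name.encode("utf-8"),
        delimiter,
        b'Content-Disposition: form-data; name="fileid"',
        b"",
        file_id.encode("ascii"),
        delimiter,
        b'Content-Disposition: form-data; name="Filedata"; filename="'
        + file_name.encode("utf-8") + b'"',
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        delimiter + b"--",  # closing delimiter ends the body
        b"",
    ]
    return crlf.join(parts)


# Hypothetical in-memory files; a real script would read them from the upload.
files = [("myimage1.png", b"\x89PNG..."), ("myimage2.png", b"\x89PNG...")]
for index, (name, data) in enumerate(files):
    boundary = uuid.uuid4().hex  # a fresh boundary for every request
    body = build_multipart_body(name, "id-%d" % index, data, boundary)
    content_type = "multipart/form-data; boundary=" + boundary
    # One POST per file would be sent here, using `content_type` and `body`.
    print(name, len(body), content_type)
```

Each iteration produces an independent request, which is the "send them one by one" behaviour asked about.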
[Edit]
I'll try it:
<html>
<head>
<title>Untitled Document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>
<form name="form1" enctype="multipart/form-data" method="post" action="processFiles.php">
<?php
// start of dynamic form: number of file inputs requested by the previous page
$uploadNeed = isset($_POST['uploadNeed']) ? (int) $_POST['uploadNeed'] : 0;
for ($x = 0; $x < $uploadNeed; $x++) {
?>
<p>
<input name="uploadFile<?php echo $x; ?>" type="file" id="uploadFile<?php echo $x; ?>">
</p>
<?php
// end of for loop
}
?>
<p><input name="uploadNeed" type="hidden" value="<?php echo $uploadNeed; ?>">
<input type="submit" name="Submit" value="Submit">
</p>
</form>
</body>
</html>

PayPal API Listener Website Payments Standard URI

The PayPal IPN Guide documentation says clearly
Post the request to www.paypal.com or www.sandbox.paypal.com, depending on whether you are going live or testing your listener in the Sandbox.
Wait for a response from PayPal, which is either VERIFIED or INVALID.
Well, I tried that (the Sandbox version), and the response was a full HTML page.
So I glanced at the code sample at https://www.paypal.com/us/cgi-bin/webscr?cmd=p/pdn/ipn-codesamples-pop-outside#php and saw that the suggested URI there was /cgi-bin/webscr. I tried that, and still got a full HTML page. DOCTYPE and everything.
What am I doing wrong? And is it just me, or is PayPal documentation unnecessarily confusing?
Edit to add:
I've tried resetting the URL to a page I control, which simply dumps out $_GET, $_POST and $_SERVER data, and I can see there that I'm sending the correct info. (I'm now putting the information in the $_GET string, as Alex K suggested, instead of in the POST body, but I'm still sending it as a POST request.)
And I'm still getting an HTML reply from the sandbox:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<!--
Script info: script: webscr, cmd: notify-validate, template: p/wel/sandbox-outside, date: Jul 28, 2010 17:09:26 PDT; country: US, language: en_US, xslt server:
web version: 64.0-1430643 branch: UPR_641_int
content version: -
pexml version: 64.0-1434686
page XSL: Merchant/default/en_US/homepage/SandBox-outside.xsl
hostname : DOxxcnld8je7pj0zYHT0DtWhtm4QxXx1WVQNKYCmQt0
rlogid : DOxxcnld8je7pj0zYHT0Do0AouceG%2b49A2fz8FNwI82Hi9r1Lzz7MA%3d%3d_12a42bb271e
-->
<title>Welcome - PayPal</title>
<!--googleoff: all-->
<meta name="description" content="PayPal is the safer, easier way to pay online without revealing your credit card number.">
<!--googleon: all-->
<meta http-equiv="X-UA-Compatible" content="IE=8">
<link media="screen" rel="stylesheet" type="text/css" href="https://www.sandbox.paypal.com/WEBSCR-640-20100726-1/css/core/xptdev.css">
<link media="screen" rel="stylesheet" type="text/css" href="https://www.sandbox.paypal.com/WEBSCR-640-20100726-1/css/core/global.css">
<!--[if IE 8]><link media="screen" rel="stylesheet" type="text/css" href="https://www.sandbox.paypal.com/WEBSCR-640-20100726-1/css/browsers/ie8.css"><![endif]-->
<!--[if IE 7]><link media="screen" rel="stylesheet" type="text/css" href="https://www.sandbox.paypal.com/WEBSCR-640-20100726-1/css/browsers/ie7.css"><![endif]-->
<!--[if lte IE 6]><link media="screen" rel="stylesheet" type="text/css" href="https://www.sandbox.paypal.com/WEBSCR-640-20100726-1/css/browsers/ie6.css"><![endif]-->
<link rel="stylesheet" type="text/css" href="https://www.sandbox.paypal.com/WEBSCR-640-20100726-1/css/sandbox.css">
<link media="print" rel="stylesheet" type="text/css" href="https://www.sandbox.paypal.com/WEBSCR-640-20100726-1/css/core/print.css">
<script type="text/javascript">
if (parent.frames.length > 0){
top.location.replace(document.location);
}</script><script type="text/javascript" src="https://www.sandbox.paypal.com/WEBSCR-640-20100726-1/js/lib/min/global.js"></script><script type="text/javascript">PAYPAL.util.lazyLoadRoot = 'https://www.sandbox.paypal.com/WEBSCR-640-20100726-1';</script><link rel="shortcut icon" href="https://www.sandbox.paypal.com/WEBSCR-640-20100726-1/en_US/i/icon/pp_favicon_x.ico">
<link rel="apple-touch-icon" href="https://www.sandbox.paypal.com/WEBSCR-640-20100726-1/en_US/i/pui/apple-touch-icon.png">
</head>
<body class="xptSandbox">
<noscript><p class="nonjsAlert">NOTE: Many features on the PayPal Web site require Javascript and cookies. You can enable both via your browser's preference settings.</p></noscript>
<div class="" id="page">
<div id="content">
<div id="headline">
<h2 class="accessAid">Welcome</h2>
</div>
<div id="messageBox"></div>
<div id="main"><div class="layout1">
<p><img src="https://www.sandbox.paypal.com/WEBSCR-640-20100726-1/en_US/i/logo/logo_sandbox_clr_289x39.gif" border="0" alt=""></p>
<p align="center"><strong>Please login to use the PayPal Sandbox features.</strong></p>
</div></div>
</div>
<div id="navFull"><ul>
<li class="active">
Home<ul>
<li class="active">
How PayPal Works<ul>
<li>What is PayPal</li>
<li>Getting Started</li>
<li>Managing Your Account</li>
<li>Great Ways to Use PayPal</li>
<li>Top Ten Things to Know About PayPal</li>
<li>How Much It Costs</li>
<li>Account Types</li>
</ul>
</li>
<li>
Pay Online<ul>
<li>Great Deals</li>
<li>PayPal Store Directory</li>
<li>PayPal Plus MasterCard</li>
<li>Shop Via Mobile</li>
</ul>
</li>
<li>
Send Money<ul>
<li>Send Money Online</li>
<li>Internationally</li>
<li>To Your Teen</li>
<li>Via Your Mobile</li>
</ul>
</li>
<li>
Get Paid<ul>
<li>Sell Online</li>
<li>Accept Credit Cards</li>
<li>Request Money</li>
<li>Accept Donations</li>
</ul>
</li>
<li>Products & Services</li>
</ul>
</li>
<li>Personal</li>
<li>Business</li>
<li>Developers</li>
</ul></div>
<script type="text/javascript">if(typeof PAYPAL != 'undefined'){ PAYPAL.core.Navigation.init(); }</script>
</div>
<script type="text/javascript" src="https://www.sandbox.paypal.com/WEBSCR-640-20100726-1/js/lib/min/widgets.js"></script><script type="text/javascript" src="https://www.sandbox.paypal.com/WEBSCR-640-20100726-1/js/pp_naturalsearch.js"></script><script type="text/javascript">mp_landing();</script>
<!-- SiteCatalyst Code
Copyright 1997-2005 Omniture, Inc.
More info available at http://www.omniture.com -->
<script type="text/javascript" src="https://www.sandbox.paypal.com/WEBSCR-640-20100726-1/js/site_catalyst/pp_jscode_paypalsandboxdev.js"></script>
<script type="text/javascript">
s.prop1="p/wel/sandbox-outside";
s.prop7="Unknown";
s.prop8="Unknown";
s.prop9="Unknown";
s.prop10="US";
s.prop14="";
s.prop34="PayPalCredit:Servicing:CO:NoTransactions";
s.pageName="p/wel/sandbox-outside::notify-validate";
s.prop50="en_US";
s.prop18="";
</script>
<script type="text/javascript"><!--
/************* DO NOT ALTER ANYTHING BELOW THIS LINE ! **************/
var s_code=s.t();if(s_code)document.write(s_code);
if(navigator.appVersion.indexOf('MSIE')>=0)document.write(unescape('%3C')+'\!-'+'-')
//-->
</script><noscript><img
src="//paypal.112.2O7.net/b/ss/paypalsandboxdev/1/H.6--NS/0?pageName=NonJavaScript"
height="1" width="1" border="0" alt="" /></noscript>
<!--/DO NOT REMOVE/-->
<!-- End SiteCatalyst Code -->
<script type="text/javascript">
YUE.addListener(window, "load", function() {
PAYPAL.util.lazyLoad("/js/Customer/min/baynote.js", function() {
var searchFormsIDs = ["searchForm", "searchformnew", "searchform"];
YUE.addListener(searchFormsIDs, 'submit', function() {baynote_handleSubmit(this);});
var bn_timeout = setTimeout(function() {
if (typeof baynote_validateSearchBox == 'function') {
baynote_validateSearchBox();
clearTimeout(bn_timeout);
}
}, 200);
});
});
</script>
</body>
</html>
If you were using CURL, there's a catch - you have to form the POST data block by hand, instead of passing an array. If you pass an array, CURL will send it as multipart data, instead of URL-encoded form, like PayPal expects. I had a very similar issue and got it to work with CURL, eventually. Sample in PHP below:
function deq($s) // Removes the dreaded "magic quotes"
{
    if ($s == null)
        return null;
    // get_magic_quotes_gpc() was removed in PHP 8, so check that it exists first
    return (function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc())
        ? stripslashes($s) : $s;
}
function MakeQS($po) // Makes a URL-encoded query string from an associative array
{
    $ps = "";
    foreach ($po as $k => $v)
        $ps .= ($ps == "" ? "" : "&").$k."=".urlencode(deq($v));
    return $ps;
}
$cu = curl_init("https://www.paypal.com/cgi-bin/webscr");
$po = $_POST;
$po["cmd"] = "_notify-validate";
curl_setopt_array($cu, array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HEADER => false,
    CURLOPT_POST => 1,
    CURLOPT_POSTFIELDS => MakeQS($po) // This is the non-obvious bit!
));
$resp = curl_exec($cu);
if (curl_errno($cu) !== 0) {
    // fail...
}
curl_close($cu);
if ($resp != "VERIFIED") {
    // fail...
}
This is working production code.
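The key point of this answer, sending the postback as a URL-encoded string instead of multipart data, can be sketched outside PHP as well. This is a minimal Python illustration; the function name and the sample IPN fields are hypothetical, and the HTTP request itself is omitted.

```python
from urllib.parse import urlencode


def build_validation_body(ipn_params):
    """Return the URL-encoded body to post back, with cmd=_notify-validate added."""
    fields = dict(ipn_params)  # copy, so the caller's data is left untouched
    fields["cmd"] = "_notify-validate"
    return urlencode(fields)


# Hypothetical IPN fields, for illustration only.
received = {"txn_id": "ABC123", "payer_email": "buyer@example.com"}
body = build_validation_body(received)
print(body)
```

The resulting string would be sent as the POST body with Content-Type application/x-www-form-urlencoded, which plays the same role as passing a string (not an array) to CURLOPT_POSTFIELDS.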
When the IPN hits your script you should post back to:
https://www.paypal.com/cgi-bin/webscr?cmd=_notify-validate&<all the junk they posted you>
(or https://www.sandbox.paypal.com/...)
Then check the response body of that for VERIFIED.
If all of that's giving you some HTML, what is it?
I've got this sorted now, by using their provided sample code and tweaking it slightly to put in validation. I'm still not sure exactly what I was doing wrong with the code I was using: I haven't bothered debugging it, now that I've got other code working.
It is a POST request, not GET.
For now, I'm using their sample code, which uses fsockopen instead of cURL, but it works, which is the main thing.
The request goes to a URL on PayPal which is not reserved for IPN calls: it also exists as an HTML page.
If you do anything wrong in your request, PayPal returns the HTML page instead of an IPN response.
https://www.x.com/message/180364
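Since a malformed request gets the HTML page back instead of an IPN response, a listener can treat anything other than the two documented answers as a sign that the request itself was wrong. This is a minimal sketch in Python; the function name and the "UNEXPECTED" label are assumptions for illustration.

```python
def classify_ipn_response(body):
    """Map a postback response body to 'VERIFIED', 'INVALID', or 'UNEXPECTED'."""
    text = body.strip()
    if text in ("VERIFIED", "INVALID"):
        return text
    # Anything else (for example a full HTML page) points at a malformed request.
    return "UNEXPECTED"


print(classify_ipn_response("VERIFIED"))
print(classify_ipn_response("<!DOCTYPE HTML PUBLIC ...><html>...</html>"))
```

Logging the "UNEXPECTED" case separately makes it easier to tell a rejected notification apart from a broken postback.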

Resources