Need help in reaching content to scrape website behind cloudflare - web-scraping

I am trying to scrape articles from the following website: https://dzkuensel.com using Python.
However, when I use, e.g., requests.get() (or even Selenium), I don't reach the content I need because the site sits behind Cloudflare.
Can anyone suggest a workaround?

Setting some cookies and headers seems to solve the problem for me:
import requests
cookies = {
    'cf_clearance': 'GpVnDnCl9tkRWCpGeyXaDZRCkyj9ah0oq8bVcJPUd9E-1664729507-0-150',
}
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:98.0) Gecko/20100101 Firefox/98.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    'DNT': '1',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1',
    'Sec-Fetch-Dest': 'document',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-Site': 'cross-site',
    'Sec-Fetch-User': '?1',
    'Cache-Control': 'max-age=0',
}
response = requests.get('https://dzkuensel.com/', cookies=cookies, headers=headers)
response.text
'<!DOCTYPE html>\n<html lang="en-US" xmlns:og="http://opengraphprotocol.org/schema/" xmlns:fb="http://www.facebook.com/2008/fbml" itemscope itemtype="http://schema.org/Article" prefix="og: http://ogp.me/ns#">\n<head>\n<meta charset="UTF-8" />\n<link rel="profile" href="https://gmpg.org/xfn/11" />\n<link rel="pingback" href="https://dzkuensel.com/xmlrpc.php" />\n<title>ཀུན་གསལ། – རྒྱལ་ཡོངས་ཉིན་བསྟར་གསར་ཤོག།</title>\n<meta property="og:title" content="ཀུན་གསལ། - རྒྱལ་ཡོངས་ཉིན་བསྟར་གསར་ཤོག།"/>\r\n<meta property="og:type" content="website"/>\r\n<meta property="og:description" content=""/>\r\n<meta property="og:url" content="https://dzkuensel.com/"/>\r\n<meta property="og:site_name" content="ཀུན་གསལ། - རྒྱལ་ཡོངས་ཉིན་བསྟར་གསར་ཤོག།"/>\r\n<meta name=\'robots\' content=\'max-image-preview:large\' />\n<link rel=\'dns-prefetch\' href=\'//fonts.googleapis.com\' />\n<link rel=\'dns-prefetch\' href=\'//s.w.org\' />\n<link rel="alternate" type="application/rss+xml" title="ཀུན་གསལ། - རྒྱལ་ཡོངས་ཉིན་བསྟར་གསར་ཤོག། » Feed" href="https://dzkuensel.com/feed/" />\n<link rel="alternate" type="application/rss+xml" title="ཀུན་གསལ། - རྒྱལ་ཡོངས་ཉིན་བསྟར་གསར་ཤོག། » Comments Feed" href="https://dzkuensel.com/comments/feed/" />\n\t\t<script type="text/javascript">\n\t\t\twindow._wpemojiSettings = {"baseUrl":"https:\\/\\/s.w.org\\/images\\/core\\/emoji\\/13.1.0\\/72x72\\/","ext":".png","svgUrl":"https:\\/\\/s.w.org\\/images\\/core\\/emoji\\/13.1.0\\/svg\\/","svgExt":".svg","source":{"concatemoji":"https:\\/\\/dzkuensel.com\\/wp-includes\\/js\\/wp-emoji-release.min.js"}};\n\t\t\t!function(e,a,t){var n,r,o,i=a.createElement("canvas"),p=i.getContext&&i.getContext("2d");function s(e,t){var a=String.fromCharCode;p.clearRect(0,0,i.width,i.height),p.fillText(a.apply(this,e),0,0);e=i.toDataURL();return p.clearRect(0,0,i.width,i.height),p.fillText(a.apply(this,t),0,0),e===i.toDataURL()}function c(e){var 
t=a.createElement("script");t.src=e,t.defer=t.type="text/javascript",a.getElementsByTagName("head")[0].appendChild(t)}for(o=Array("flag","emoji"),t.supports={everything:!0,everythingExceptFlag:!0},r=0;r<o.length;r++)t.supports[o[r]]=function(e){if(!p||!p.fillText)return!1;switch(p.textBaseline="top",p.font="600 32px Arial",e){case"flag":return s([127987,65039,8205,9895,65039],[127987,65039,8203,9895,65039])?!1:!s([55356,56826,55356,56819],[55356,56826,8203,55356,56819])&&!s([55356,57332,56128,56423,56128,56418,56128,56421,56128,56430,56128,56423,56128,56447],[55356,57332,8203,56128,56423,8203,56128,56418,8203,56128,56421,8203,56128,56430,8203,56128,56423,8203,56128,56447]);case"emoji":return!s([10084,65039,8205,55357,56613],[10084,65039,8203,55357,56613])}return!1}(o[r]),t.supports.everything=t.supports.everything&&t.supports[o[r]],"flag"!==o[r]&&(t.supports.everythingExceptFlag=t.supports.everythingExceptFlag&&t.supports[o[r]]);t.supports.everythingExceptFlag=t.supports.everythingExceptFlag&&!t.supports.flag,t.DOMReady=!1,t.readyCallback=function(){t.DOMReady=!0},t.supports.everything||(n=function(){t.readyCallback()},a.addEventListener?(a.addEventListener("DOMContentLoaded",n,!1),e.addEventListener("load",n,!1)):(e.attachEvent("onload",n),a.attachEvent("onreadystatechange",function(){"complete"===a.readyState&&t.readyCallback()})),(n=t.source||{}).concatemoji?c(n.concatemoji):n.wpemoji&&n.twemoji&&(c(n.twemoji),c(n.wpemoji)))}(window,document,window._wpemojiSettings);\n\t\t</script>\n\t\t<style type="text/css">\nimg.wp-smiley,\nimg.emoji {\n\tdisplay: inline !important;\n\tborder: none !important;\n\tbox-shadow: none !important;\n\theight: 1em !important;\n\twidth: 1em !important;\n\tmargin: 0 .07em !important;\n\tvertical-align: -0.1em !important;\n\tbackground: none !important;\n\tpadding: 0 !important;\n}\n</style>\n\t<link rel=\'stylesheet\' id=\'lafs_style-css\' 
href=\'https://dzkuensel.com/wp-content/plugins/follow-subscribe/css/lafs_style.css?lafs_version=1.1\' type=\'text/css\' media=\'all\' />\n<link rel=\'stylesheet\' id=\'tie-insta-style-css\' href=\'https://dzkuensel.com/wp-content/plugins/instanow/assets/style.css\' type=\'text/css\' media=\'all\' />\n<link rel=\'stylesheet\' id=\'categoryposttab-css\' href=\'https://dzkuensel.com/wp-content/plugins/category-and-post-tab/assets/css/categoryposttab.css\' type=\'text/css\' media=\'all\' />\n<link rel=\'stylesheet\' id=\'wp-block-library-css\' href=\'https://dzkuensel.com/wp-includes/css/dist/block-library/style.min.css\' type=\'text/css\' media=\'all\' />\n<link rel=\'stylesheet\' id=\'bwg_frontend-css\' href=\'https://dzkuensel.com/wp-content/plugins/photo-gallery/css/bwg_frontend.css\' type=\'text/css\'
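The HTML returned above advertises RSS feeds through <link rel="alternate" type="application/rss+xml"> tags, which may be an easier source of articles than the rendered pages. The following is a minimal sketch, using only the standard library, of how those feed URLs could be pulled out of response.text. The names FeedLinkParser and find_feeds are hypothetical, the sample string is a shortened stand-in for the real response, and whether the feed itself is reachable without the same cookies is not something the output above confirms.

```python
from html.parser import HTMLParser


class FeedLinkParser(HTMLParser):
    """Collect href values of <link rel="alternate" type="application/rss+xml"> tags."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags are also routed here by HTMLParser.
        if tag != "link":
            return
        attr = dict(attrs)
        if attr.get("rel") == "alternate" and attr.get("type") == "application/rss+xml":
            href = attr.get("href")
            if href:
                self.feeds.append(href)


def find_feeds(html):
    """Return all RSS feed URLs advertised in the given HTML string."""
    parser = FeedLinkParser()
    parser.feed(html)
    return parser.feeds


# Shortened stand-in for response.text (hypothetical sample, not the full page).
sample = (
    '<head><link rel="alternate" type="application/rss+xml" '
    'title="Feed" href="https://dzkuensel.com/feed/" />'
    '<link rel="stylesheet" href="style.css" /></head>'
)
print(find_feeds(sample))
```

In practice find_feeds(response.text) would be called on the real response.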

Related

Adding a post to google blogger with R httr

I want to post a message to the Google blogger API with R, but I am puzzled how to format the request with R, httr and jsonlite. GET works, but the formatting of the POST message and the JSON isn't clear to me.
The example of a POST is shown at the Google developer page:
POST https://www.googleapis.com/blogger/v3/blogs/8070105920543249955/posts/
Authorization: /* OAuth 2.0 token here */
Content-Type: application/json
{
  "kind": "blogger#post",
  "blog": {
    "id": "8070105920543249955"
  },
  "title": "A new post",
  "content": "With <b>exciting</b> content..."
}
Creating the message:
amesg <- list(
  kind = "blogger#post",
  blog = list(id = "..."),
  Title = "A new post",
  content = "With exciting content...")
jsnmesge <- jsonlite::toJSON(amesg, pretty = TRUE, auto_unbox = TRUE)
httr::POST("https://www.googleapis.com/blogger/v3/blogs/.../posts/key=...",
  body = jsnmesge)
Gives as response:
<html lang=en>
<meta charset=utf-8>
<meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
<title>Error 404 (Not Found)!!1</title>
<style>
*{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;p...
</style>
<a href=//www.google.com/><span id=logo aria-label=Google></span></a>
<p><b>404.</b> <ins>That’s an error.</ins>
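Comparing the R call with the documented request suggests a few likely differences: the key is appended to the path ("posts/key=...") rather than passed as a query string ("posts/?key=..."), the field is spelled "Title" instead of "title", and no Content-Type or Authorization header is set. The sketch below is in Python rather than R and only builds the request without sending it, so the shape can be inspected. build_blogger_post and the placeholder values BLOG_ID, API_KEY and OAUTH_TOKEN are hypothetical, and the "Bearer" scheme is an assumption based on how Google APIs usually accept OAuth 2.0 tokens.

```python
import json
import urllib.parse
import urllib.request


def build_blogger_post(blog_id, api_key, token, title, content):
    """Build (but do not send) a POST request that creates a Blogger post."""
    base = f"https://www.googleapis.com/blogger/v3/blogs/{blog_id}/posts/"
    # The key goes into the query string, after a "?".
    url = base + "?" + urllib.parse.urlencode({"key": api_key})
    body = {
        "kind": "blogger#post",
        "blog": {"id": blog_id},
        "title": title,  # lowercase, as in the documented example
        "content": content,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: OAuth 2.0 token sent with the Bearer scheme.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder values; substitute real ones before sending.
req = build_blogger_post(
    "BLOG_ID", "API_KEY", "OAUTH_TOKEN",
    "A new post", "With <b>exciting</b> content...",
)
print(req.get_method(), req.full_url)
```

The same three changes (query-string key, lowercase title, explicit headers) should carry over to the httr call.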

cljs-http.client get request incompatibility with target

I'm trying to make a clj-http.client request to a third-party webpage, but am getting the following error response:
clj-http: status 403
{:cached nil,
:request-time 141,
:repeatable? false,
:protocol-version {:name "HTTP", :major 1, :minor 1},
:streaming? true,
:http-client
#object[org.apache.http.impl.client.InternalHttpClient 0x7d5d53f8 "org.apache.http.impl.client.InternalHttpClient#7d5d53f8"],
:chunked? true,
:type :clj-http.client/unexceptional-status,
:reason-phrase "Forbidden",
:headers
{"Server" "cloudflare",
"Content-Type" "text/html; charset=UTF-8",
"X-Frame-Options" "SAMEORIGIN",
"Connection" "close",
"cf-request-id" "02afa1c33a00000cb192b11200000001",
"Transfer-Encoding" "chunked",
"Set-Cookie"
"__cfduid=d5186984dc436aac8989538666ddb21761589373348; expires=Fri, 12-Jun-20 12:35:48 GMT; path=/; domain=.angel.co; HttpOnly; SameSite=Lax",
"Expect-CT"
"max-age=604800, report-uri=\"https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct\"",
"CF-RAY" "592c6be52bb00cb1-EWR",
"Date" "Wed, 13 May 2020 12:35:48 GMT",
"Vary" "Accept-Encoding",
"Cache-Control" "no-cache",
"CF-Chl-Bypass" "1"},
:orig-content-encoding nil,
:status 403,
:length -1,
:body
"<!DOCTYPE html>\n<!--[if lt IE 7]> <html class=\"no-js ie6 oldie\" lang=\"en-US\"> <![endif]-->\n<!--[if IE 7]> <html class=\"no-js ie7 oldie\" lang=\"en-US\"> <![endif]-->\n<!--[if IE 8]> <html class=\"no-js ie8 oldie\" lang=\"en-US\"> <![endif]-->\n<!--[if gt IE 8]><!--> <html class=\"no-js\" lang=\"en-US\"> <!--<![endif]-->\n<head>\n<title>Attention Required! | Cloudflare</title>\n<meta name=\"captcha-bypass\" id=\"captcha-bypass\" />\n<meta charset=\"UTF-8\" />\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\" />\n<meta http-equiv=\"X-UA-Compatible\" content=\"IE=Edge,chrome=1\" />\n<meta name=\"robots\" content=\"noindex, nofollow\" />\n<meta name=\"viewport\" content=\"width=device-width,initial-scale=1\" />\n<link rel=\"stylesheet\" id=\"cf_styles-css\" href=\"/cdn-cgi/styles/cf.errors.css\" type=\"text/css\" media=\"screen,projection\" />\n<!--[if lt IE 9]><link rel=\"stylesheet\" id='cf_styles-ie-css' href=\"/cdn-cgi/styles/cf.errors.ie.css\" type=\"text/css\" media=\"screen,projection\" /><![endif]-->\n<style type=\"text/css\">body{margin:0;padding:0}</style>\n\n\n<!--[if gte IE 10]><!--><script type=\"text/javascript\" src=\"/cdn-cgi/scripts/zepto.min.js\"></script><!--<![endif]-->\n<!--[if gte IE 10]><!--><script type=\"text/javascript\" src=\"/cdn-cgi/scripts/cf.common.js\"></script><!--<![endif]-->\n\n\n\n\n</head>\n<body>\n <div id=\"cf-wrapper\">\n <div class=\"cf-alert cf-alert-error cf-cookie-error\" id=\"cookie-alert\" data-translate=\"enable_cookies\">Please enable cookies.</div>\n <div id=\"cf-error-details\" class=\"cf-error-details-wrapper\">\n <div class=\"cf-wrapper cf-header cf-error-overview\">\n <h1 data-translate=\"challenge_headline\">One more step</h1>\n <h2 class=\"cf-subheadline\"><span data-translate=\"complete_sec_check\">Please complete the security check to access</span> angel.co</h2>\n </div><!-- /.header -->\n \n <div class=\"cf-section cf-highlight cf-captcha-container\">\n <div class=\"cf-wrapper\">\n 
<div class=\"cf-columns two\">\n <div class=\"cf-column\">\n \n <div class=\"cf-highlight-inverse cf-form-stacked\">\n <form class=\"challenge-form\" id=\"challenge-form\" action=\"/?__cf_chl_captcha_tk__=0d94e2e21d2ef06ef34bd2b5b4667f279b690108-1589373348-0-ATT4PiY_pQI1dGVw0_sZDV32_7x4mqtO4RepyD-L4i6zBJiIuml25fVlyJaK8uXNJw5ZWnzGlb6y0jGJJ8HIdEz14sOXRUoHqs_naHtwFEQywa8qZf_rwHsBxIUD5y_FNPph6TDrcfLVnQaN9eyy5VjiznzH4y0yeK8cidNnd-qNGw4OIZbFLfv8299DGhvNnBgsbn3BiQ9bkoGOtE4wANUh5U2LTJVAWhlquAvfhjCu6jHlYRXtN5GdnNvfBbCYwWGwCX0j88J-qCjJFOrSvx1_xraYtpB_Y8PpLHZTob_t8POfE0kJpn9ZYxwjhLQhqAAcIoE8fRe7Lv_50pzummklgMLgTRT2_NJGiE-_jNEogQmoTCvGOOmhNCe28SVYkXop9Ajm-z-6xwgoKQnY7EwekXJZCs-4nwpWJ9Gh3HBgVxZRiuv_wKgcmU0sPlLXSL5G8yOVdbBKBtHhQyqadtmTSg_IC2HV7SiYqPoJMmpJGfxxUm1au7ZS9ZiLpokjI5pQDZLpT2ZG-6jVfnTKvt9w_qmMtUSBhDleXd8mG59r\" method=\"POST\" enctype=\"application/x-www-form-urlencoded\">\n <input type=\"hidden\" name=\"r\" value=\"bfe8db4864c274e3ed80528a0e0ad233279c00b9-1589373348-0-AVWaRjujNq/XSmYrRyYxyBLhp5bbxA92rBX2qiiOx9PVWzas1b/usxApmblw248v1q5iUvP/V/GYHXhQF1UBviAqExhVjGW4upmNgdEf/zdFWHbgQb/s0RdZyMS+rurne8Y7aKD8ppx/WHjY8eSxVTGcHePc+qs/NdCt33voCLk2sGd0inuxibNjFXkBT62qs/JshlzaDsM58mC/jdSBRHZiOoJHmteJ0J1vwDVTVumWM97Qrc9fDyAqvDo72LCfqq0uG6hppWsi/z5jnGhTwzmJ7biqcY3BThvQAABSgD80MH4unfjys3iYhsefX0tfuAm23Rx1BCoKDRrrnWy0//Z9D0vI3petRmLSerLnJUAqCRh6ZoRqahYwNTPr39G+/WBJBsh3UDfB0+PwSmGsczRmL6DDbDu023etpAhehWcdR55ftEcijKiEnnfZE4vyKYm4C835QoKlQ+odT+u7syO/u/PgoyguQxqnNoKdlSSCs4+96s86urmY/yM9T4dvZdB4K4aOVH5cNfRHc8fsqeKpcuxmBbHOmIYIAegjTd5iKB4OQtxPHti1ZQCLeP74OiAxF6UgH+bCBp+h2mfU19CtEXvfcQdxGXPDT/iAPbPZG8c7fubDCKUympyb5nbHzVUcL9IGTlCq1zN7B1pRFj/O6JKOGBRo+q0OEs0nI7l/RFvmDfEtA0FYSC4IGegEs//fUsB165Zdm2SdKk7/cy89Xd4Hy5cedzqmjrtKNw5zjvfjqaNU7FlUL38irfopK/Pyk5Fp/HdV7iMvflIJO1M7GedTWdcNKB/OqPGV9NuJaKYgJbgBrxS4iYtHw9ZZsKWogYCig+eYiU8ty/MSDus9zCE2yRIbLVQ59AFwqTwODgBaV2nJepBDxcXVauCpdHiGbi7Q9M4t1eyGafFUKasv3unzdriRTrFPZ+44ZQb3gYberTMv2f3MwfcryaFgxcgtu43w8Hy5nviA9sOeoLmPYMZtL85QbB+AzKCXJV5DfIGcMvx1aeD/D9QNy
OSTakVv2tAwxnP5UeQj8mJKGHTYrIsOMFDfxSnQ2lVzMRPQYmeEes8KjFvYrGyQ82Io+hGnKYOHX1T1ioi+wh+MGacVaSC1VMfG6rdIauPSxbB9WNxqnJxKz7SxHNiV3Gwm4rgUOs+vN2tSPyfINt12OHU=\">\n <input type=\"hidden\" name=\"cf_captcha_kind\" value=\"h\">\n <script type=\"text/javascript\" src=\"/cdn-cgi/scripts/hcaptcha.challenge.js\" data-type=\"normal\" data-ray=\"592c6be52bb00cb1\" async data-sitekey=\"33f96e6a-38cd-421b-bb68-7806e1764460\"></script>\n <noscript id=\"cf-captcha-bookmark\" class=\"cf-captcha-info\">\n <h1 data-translate=\"turn_on_js\" style=\"color:#bd2426;\">Please turn JavaScript on and reload the page.</h1>\n </noscript>\n <div id=\"trk_captcha_js\" style=\"background-image:url('/cdn-cgi/images/trace/captcha/nojs/h/transparent.gif?ray=592c6be52bb00cb1')\"></div>\n</form>\n\n </div>\n </div>\n\n <div class=\"cf-column\">\n <div class=\"cf-screenshot-container\">\n \n <span class=\"cf-no-screenshot\"></span>\n \n </div>\n </div>\n </div><!-- /.columns -->\n </div>\n </div><!-- /.captcha-container -->\n\n <div class=\"cf-section cf-wrapper\">\n <div class=\"cf-columns two\">\n <div class=\"cf-column\">\n <h2 data-translate=\"why_captcha_headline\">Why do I have to complete a CAPTCHA?</h2>\n \n <p data-translate=\"why_captcha_detail\">Completing the CAPTCHA proves you are a human and gives you temporary access to the web property.</p>\n </div>\n\n <div class=\"cf-column\">\n <h2 data-translate=\"resolve_captcha_headline\">What can I do to prevent this in the future?</h2>\n \n\n <p data-translate=\"resolve_captcha_antivirus\">If you are on a personal connection, like at home, you can run an anti-virus scan on your device to make sure it is not infected with malware.</p>\n\n <p data-translate=\"resolve_captcha_network\">If you are at an office or shared network, you can ask the network administrator to run a scan across the network looking for misconfigured or infected devices.</p>\n \n \n \n </div>\n </div>\n </div><!-- /.section -->\n \n\n <div class=\"cf-error-footer 
cf-wrapper\">\n <p>\n <span class=\"cf-footer-item\">Cloudflare Ray ID: <strong>592c6be52bb00cb1</strong></span>\n <span class=\"cf-footer-separator\">•</span>\n <span class=\"cf-footer-item\"><span>Your IP</span>: 128.151.150.1</span>\n <span class=\"cf-footer-separator\">•</span>\n <span class=\"cf-footer-item\"><span>Performance & security by</span> Cloudflare</span>\n \n </p>\n</div><!-- /.error-footer -->\n\n\n </div><!-- /#cf-error-details -->\n </div><!-- /#cf-wrapper -->\n\n <script type=\"text/javascript\">\n window._cf_translation = {};\n \n \n</script>\n\n\n</body>\n</html>\n",
:trace-redirects []}
The response body mentions "no-js ie6 oldie". How do I fix this?
-- EDIT --
I tried this, but it didn't work either:
curl 'https://angel.co' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:76.0) Gecko/20100101 Firefox/76.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'Connection: keep-alive' -H 'Cookie: __cfduid=d3d3c3a784a6878d27e67563734baeb7e1589375442; _ga=GA1.2.1171200054.1589375443; _gid=GA1.2.1332292087.1589375443; ajs_user_id=null; ajs_group_id=null; ajs_anonymous_id=%226898badd-9a9e-4876-9a75-e92df55c5d4b%22; _hjid=58408190-990c-4a5b-86c8-ac6003d5fe93' -H 'Upgrade-Insecure-Requests: 1'
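The "no-js ie6 oldie" strings are only the conditional comments at the top of the HTML template. The rest of the response (status 403, "Server" "cloudflare", the "Attention Required! | Cloudflare" title and the challenge form) indicates a Cloudflare CAPTCHA page rather than a browser-compatibility problem. A rough sketch, in Python rather than Clojure, of a heuristic that recognizes this kind of response so a client can at least report it clearly; the function name and the chosen markers are assumptions drawn from the response shown above, not an official detection method.

```python
def looks_like_cloudflare_challenge(status, headers, body):
    """Heuristically decide whether a response is a Cloudflare challenge page.

    status  -- HTTP status code (int)
    headers -- mapping of header names to values
    body    -- response body as a string
    """
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    served_by_cloudflare = lowered.get("server", "").lower() == "cloudflare"
    # Markers taken from the response body shown above.
    markers = ("Attention Required! | Cloudflare", "challenge-form", "cf-captcha-container")
    has_marker = any(marker in body for marker in markers)
    return status == 403 and served_by_cloudflare and has_marker


# Trimmed-down stand-ins for the response above and for a normal page.
blocked = looks_like_cloudflare_challenge(
    403,
    {"Server": "cloudflare", "Content-Type": "text/html; charset=UTF-8"},
    "<title>Attention Required! | Cloudflare</title>",
)
normal = looks_like_cloudflare_challenge(
    200, {"Server": "nginx"}, "<title>Home</title>"
)
print(blocked, normal)
```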

CasperJS gets blocked

I'm trying to build a scraper using CasperJS, but it keeps getting blocked. I read several articles saying that this can be avoided by setting the user agent, but even with a user agent set I get blocked.
Here is my current setup:
var casper = require('casper').create({
    verbose: true,
    logLevel: 'debug',
    colorizerType: 'Dummy',
    waitTimeout: 30000, // timeout for waits (loading etc.)
    exitOnError: true,
    pageSettings: {
        userAgent: 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.56 Safari/536.5',
        javascriptEnabled: true,
        loadImages: true,
        loadPlugins: true,
    },
    onError: function(msg, backtrace) {
        this.exit();
    }
});
casper.start().then(function() {
    this.open('https://WEBSITE-URL', {
        headers: {
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8'
        }
    });
    casper.viewport(1280, 1024);
});
// Login
casper.then(function() {
    this.echo("Waiting for login form to load.");
    this.echo(this.getHTML());
});
I receive this HTML after running casper:
<!DOCTYPE html><html><head>
<meta name="ROBOTS" content="NOINDEX, NOFOLLOW">
<meta http-equiv="cache-control" content="max-age=0">
<meta http-equiv="cache-control" content="no-cache">
<meta http-equiv="expires" content="0">
<meta http-equiv="expires" content="Tue, 01 Jan 1980 1:00:00 GMT">
<meta http-equiv="pragma" content="no-cache">
<meta http-equiv="refresh" content="10; url=/distil_r_captcha.html?requestId=972f0bd8-1861-4c7b-8459-ce880b8cf2b6&httpReferrer=%2F">
<script type="text/javascript">
(function(window){
try {
if (typeof sessionStorage !== 'undefined'){
sessionStorage.setItem('distil_referrer', document.referrer);
}
} catch (e){}
})(window);
</script>
<script type="text/javascript" src="/dstltrntmls.js" defer="">
</script>
<style type="text/css">#d__fFH{position:absolute;top:-5000px;left:-5000px}#d__fF{font-family:serif;font-size:200px;visibility:hidden}#ruxctfdwzvsxvuucdvdtdtsufa{display:none!important}</style></head>
<body>
<div id="distilIdentificationBlock"> </div>
<div id="d__fFH" style="position: absolute; top: -5000px; left: -5000px;">
<object id="d_dlg" classid="clsid:3050f819-98b5-11cf-bb82-00aa00bdce0b" width="0px" height="0px"></object>
<span id="d__fF" style="font-family: Courier, serif; font-size: 72px; visibility: hidden;">The quick brown fox jumps over the lazy dog.</span></div></body>
</html>
Is there a way to work around this issue? When I try a simple GET request in Postman, it returns the actual HTML, but CasperJS does not.
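The HTML received is an interstitial page: its meta refresh tag redirects to /distil_r_captcha.html after 10 seconds, so the real page was never served. A small sketch, in Python and using only the standard library, of how a scraper could detect such a redirect target before trying to parse the page; MetaRefreshParser and refresh_target are hypothetical names, and the sample is a shortened stand-in for the markup above.

```python
from html.parser import HTMLParser


class MetaRefreshParser(HTMLParser):
    """Record the URL of the first <meta http-equiv="refresh"> tag, if any."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attr = dict(attrs)
        if (attr.get("http-equiv") or "").lower() != "refresh":
            return
        # The content attribute looks like "10; url=/some/path".
        _, _, rest = (attr.get("content") or "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            self.target = rest[4:]


def refresh_target(html):
    """Return the meta-refresh URL found in html, or None."""
    parser = MetaRefreshParser()
    parser.feed(html)
    return parser.target


# Shortened stand-in for the page shown above (request id is a placeholder).
sample = (
    '<html><head><meta http-equiv="refresh" '
    'content="10; url=/distil_r_captcha.html?requestId=abc"></head></html>'
)
target = refresh_target(sample)
print(target)
```

A target containing "distil_r_captcha" is a strong hint that the bot check, not the login form, was returned.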

IE8 - Content Not Acceptable Symfony 2 App

I have a website up and running (and I need to support IE8).
Server: Nginx, framework Symfony2/PHP/MySQL
The issue is simple: IE8 (8.0.6) shows an HTTP 406 (Not Acceptable) error on all HTML pages.
Headers (Nginx)
Cache-Control:no-cache
Connection:keep-alive
Content-Encoding:gzip
Content-Type:text/html; charset=UTF-8
Date:Mon, 25 Apr 2016 15:23:46 GMT
Server:nginx/1.6.2
Transfer-Encoding:chunked
X-Debug-Token:d7e68f
HTML (2 versions, not working)
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=9; IE=8; IE=7; IE=EDGE" />
<meta name="robots" content="noindex, nofollow">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Htm 2</title>
<link rel="icon" type="image/x-icon" href="/favicon.ico" />
</head>
<body>
... hi
</body>
</html>
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="robots" content="noindex, nofollow">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Htm</title>
<link rel="icon" type="image/x-icon" href="/favicon.ico" />
</head>
<body>
... hi
</body>
</html>
I have read tons of stuff on that matter but could not find any clue. The previous website version worked on IE8 and ran on Apache 2.
This bug was not related to Nginx but to Symfony 2 and FOSRestBundle, which relies on client headers to negotiate the response content via its format listener.
Solution:
I changed my configuration to disable the FOSRestBundle format listener for HTML pages.
fos_rest:
    format_listener:
        rules:
            - { path: '^/rest', priorities: [ 'json' ], fallback_format: json, prefer_extension: false }
            #- { path: ^/, priorities: [ 'text/html', '*/*' ], fallback_format: html, prefer_extension: true }
            - { path: '^/', stop: true }
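To illustrate why header-based negotiation can end in a 406, here is a simplified sketch in Python of matching a client's Accept header against a server's list of priorities. It shows the general idea only and is not FOSRestBundle's actual algorithm; the function names are hypothetical and the Accept values are made-up test inputs, not captured IE8 traffic.

```python
def parse_accept(header):
    """Return the media ranges in an Accept header that have a non-zero q-value."""
    accepted = []
    for part in header.split(","):
        pieces = [piece.strip() for piece in part.split(";")]
        media = pieces[0].lower()
        if not media:
            continue
        quality = 1.0
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        if quality > 0:
            accepted.append(media)
    return accepted


def negotiate(accept_header, priorities):
    """Return the first server priority the client accepts, or None (a 406 case)."""
    accepted = parse_accept(accept_header)
    for offered in priorities:
        main_type = offered.split("/")[0]
        # A priority matches an exact range, a "type/*" range, or "*/*".
        if any(media in (offered, main_type + "/*", "*/*") for media in accepted):
            return offered
    return None


# A browser-like header that lists text/html explicitly.
print(negotiate("text/html,application/xhtml+xml,*/*;q=0.8", ["text/html"]))
# A header with neither text/html nor a wildcard: nothing matches.
print(negotiate("image/gif, image/jpeg", ["text/html"]))
```

With a rule such as { path: '^/', stop: true }, this kind of matching is skipped for HTML pages, so the client's Accept header no longer decides whether the page is served.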

Why am I getting 'application/json' response instead of 'text/html' using PartialView in ASP.NET MVC 5

The action code:
public ActionResult Visit(VisitModel model)
{
    if (Request.HttpMethod == "GET")
        return PartialView("VisitPostRedirect", visit);
    // some logic...
    return PartialView(visit);
}
The 'VisitPostRedirect' view:
@model VisitModel
<!doctype html>
<html lang="en">
<body onload="javascript: document.getElementById('visitPostRedirectForm').submit()">
    @using (Html.BeginRouteForm("Visit", new
    {
        // some data...
        RedirectUrl = string.Empty
    }, FormMethod.Post,
    new { id="visitPostRedirectForm" }))
    {
        @Html.HiddenFor(m => m.ReturnUrl)
        // some data...
    }
</body>
</html>
The 'Visit' view:
@model VisitModel
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0">
<!-- ... -->
</head>
<body>
<!-- ... -->
</html>
The 'visitPostRedirectForm' form submits correctly, with:
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-US,en;q=0.8,pl;q=0.6
Cache-Control:max-age=0
Connection:keep-alive
Content-Length:69
Content-Type:application/x-www-form-urlencoded
Cookie:ASP.NET_SessionId=vh3kl4zbxkonborzazuafkiw
Host: /*removed*/
Origin:/*removed*/
Referer:/*removed*/
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.114 Safari/537.36
but the response is:
Cache-Control:private
Content-Encoding:gzip
Content-Length:2135
Content-Type:application/json; charset=utf-8
Date:Wed, 04 Jun 2014 09:23:32 GMT
Server:Microsoft-IIS/7.5
Vary:Accept-Encoding
X-AspNet-Version:4.0.30319
X-AspNetMvc-Version:5.0
X-Powered-By:ASP.NET
Does anybody have an idea why the Content-Type of the response is 'application/json'? This causes the browser to render the raw HTML as text, even though the returned HTML itself is correct.

Resources