Good day all!
I am trying to scrape https://guardian.com.my/health.html through its API using Scrapy. In Postman the request URL returns status 200 OK, but I get Crawled (400) Bad Request when scraping with Scrapy.
import scrapy
from scrapy.exceptions import CloseSpider
import json

class GapiSpider(scrapy.Spider):
    name = 'gapi'
    headers = {
        ':authority': 'guardian.com.my',
        ':method': 'GET',
':path': f"/graphql?query=query+GetCategories%28%24id%3AInt%21%24pageSize%3AInt%21%24currentPage%3AInt%21%24filters%3AProductAttributeFilterInput%21%24sort%3AProductAttributeSortInput%29%7Bcategory%28id%3A%24id%29%7Bid+description+name+product_count+meta_title+meta_keywords+meta_description+__typename%7Dproducts%28pageSize%3A%24pageSize+currentPage%3A%24currentPage+filter%3A%24filters+sort%3A%24sort%29%7Bitems%7Bid+name+sku+price%7BregularPrice%7Bamount%7Bcurrency+value+__typename%7D__typename%7D__typename%7Dprice_range%7Bminimum_price%7Bfinal_price%7Bcurrency+value+__typename%7Ddiscount%7Bamount_off+percent_off+__typename%7D__typename%7Dmaximum_price%7Bfinal_price%7Bcurrency+value+__typename%7Ddiscount%7Bamount_off+percent_off+__typename%7D__typename%7D__typename%7Dpromotion_label+promotion_label_name+sales_icon+small_image%7Burl+__typename%7Dstock_status+url_key+url_suffix+__typename%7Dpage_info%7Btotal_pages+__typename%7Dtotal_count+__typename%7D%7D&operationName=GetCategories&variables=%7B%22currentPage%22%3A2%2C%22id%22%3A3047%2C%22filters%22%3A%7B%22category_id%22%3A%7B%22eq%22%3A%223047%22%7D%7D%2C%22pageSize%22%3A60%2C%22sort%22%3A%7B%22position%22%3A%22ASC%22%7D%7D",
        ':scheme': 'https',
        'accept': '*/*',
        'accept-encoding': 'gzip, deflate, br',
        'accept-language': 'en-US,en;q=0.9',
        'cache-control': 'no-cache',
        'content-type': 'application/json',
'cookie': 'lzd_cid=13b492b3-2f4c-46fc-fb4f-98212b95f68d; t_uid=13b492b3-2f4c-46fc-fb4f-98212b95f68d; lzd_sid=1728fd4a8af615419dc17044911406d4; t_fv=1623735402455; hng=MY|en-MY|MYR|458; userLanguageML=en; _bl_uid=7qkOhpLtxynm254s1v6s7X3gjp7n; cna=aipPGYuo6GwCAXOk2DtI3+FS; _gcl_au=1.1.335118801.1623735403; _tb_token_=f311f13b31eb5; age_restriction=over%3B18%3B1; _fbp=fb.2.1623760903320.1705265880; _ga=GA1.3.1886895770.1623760957; _gid=GA1.3.71310232.1623760957; t_sid=VVEk6ELqT4zeevu7Snsvx63eqX0u9De7; utm_origin=https://www.google.com/; utm_channel=SEO; xlly_s=1; EGG_SESS=S_Gs1wHo9OvRHCMp98md7LqI1pVlU7ApMhhrX1Oe_NHHkRPwi6zdBuxVpdbHrc8tccMpabJfEEwLAe7yCNtlESqPvCMPcAfjqwmNQ19bjOPRdxRnSKGABpGFDMYvsTiDFT7FfaLnWnbHJr555QFqdYHSAtp73LRUYDZgaeGlJT4=; isg=BICAfUhD9ImywYiPlm818du3UQhSCWTTedKTF_oRTxsudSGfoh0tYqpEiNW1QByr; l=eBgFfq9IjfrHbUYTBOfaourza779IIRbSuPzaNbMiOCP_-1p5HU1W6OwTtT9CnhNnsgHR3lqRWpDBu8SQyz6Qxv9-egPe9oEndBG.; tfstk=ctjGBQ2yNBGSUZThPlt1IUVci2adZKke6s51Yite4tkyX3IFiWuEzDR4-BfGDq1..; _m_h5_tk=9a0a86b9d6a2af069082c39a003a0087_1623862308738; _m_h5_tk_enc=e8b5adef6a74673cb92ca952c850f019; _uetsid=ab562c30cd9b11eb835b67411ecd1bd6; _uetvid=ab587f00cd9b11eb8eea6906a86bf0ba',
        'pragma': 'no-cache',
        'referer': 'https://guardian.com.my/health.html?page=1',
        'sec-ch-ua': '" Not;A Brand";v="99", "Google Chrome";v="91", "Chromium";v="91"',
        'sec-ch-ua-mobile': '?0',
        'sec-fetch-dest': 'empty',
        'sec-fetch-mode': 'cors',
        'sec-fetch-site': 'same-origin',
        'store': 'default',
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'
    }

    def start_requests(self):
        yield scrapy.Request(
url= f"https://guardian.com.my/graphql?query=query+GetCategories%28%24id%3AInt%21%24pageSize%3AInt%21%24currentPage%3AInt%21%24filters%3AProductAttributeFilterInput%21%24sort%3AProductAttributeSortInput%29%7Bcategory%28id%3A%24id%29%7Bid+description+name+product_count+meta_title+meta_keywords+meta_description+__typename%7Dproducts%28pageSize%3A%24pageSize+currentPage%3A%24currentPage+filter%3A%24filters+sort%3A%24sort%29%7Bitems%7Bid+name+sku+price%7BregularPrice%7Bamount%7Bcurrency+value+__typename%7D__typename%7D__typename%7Dprice_range%7Bminimum_price%7Bfinal_price%7Bcurrency+value+__typename%7Ddiscount%7Bamount_off+percent_off+__typename%7D__typename%7Dmaximum_price%7Bfinal_price%7Bcurrency+value+__typename%7Ddiscount%7Bamount_off+percent_off+__typename%7D__typename%7D__typename%7Dpromotion_label+promotion_label_name+sales_icon+small_image%7Burl+__typename%7Dstock_status+url_key+url_suffix+__typename%7Dpage_info%7Btotal_pages+__typename%7Dtotal_count+__typename%7D%7D&operationName=GetCategories&variables=%7B%22currentPage%22%3A2%2C%22id%22%3A3047%2C%22filters%22%3A%7B%22category_id%22%3A%7B%22eq%22%3A%223047%22%7D%7D%2C%22pageSize%22%3A60%2C%22sort%22%3A%7B%22position%22%3A%22ASC%22%7D%7D",
            headers=self.headers
        )

    def parse(self, response):
        print(response.body)
Since I am using an IP proxy rotator, what am I missing? ROBOTSTXT_OBEY has been set to False in settings.py
Thank you very much!
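A 400 Bad Request response means the server rejected the request because it was not in the expected format; Scrapy only reports that status. The problem can lie in the headers, the payload, or the parameters.
In your case, several header names begin with a colon (':authority', ':method', ':path', ':scheme'). These are HTTP/2 pseudo-headers as displayed by the browser's developer tools, and a name with a leading colon is not a valid header to send yourself. For example, ':authority' should be 'authority', ':method' should be 'method', and so on.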
Code
import scrapy
from scrapy.exceptions import CloseSpider
import json

class GapiSpider(scrapy.Spider):
    name = 'gapi'
    headers = {
        'authority': 'guardian.com.my',
        'method': 'GET',
'path': f"/graphql?query=query+GetCategories%28%24id%3AInt%21%24pageSize%3AInt%21%24currentPage%3AInt%21%24filters%3AProductAttributeFilterInput%21%24sort%3AProductAttributeSortInput%29%7Bcategory%28id%3A%24id%29%7Bid+description+name+product_count+meta_title+meta_keywords+meta_description+__typename%7Dproducts%28pageSize%3A%24pageSize+currentPage%3A%24currentPage+filter%3A%24filters+sort%3A%24sort%29%7Bitems%7Bid+name+sku+price%7BregularPrice%7Bamount%7Bcurrency+value+__typename%7D__typename%7D__typename%7Dprice_range%7Bminimum_price%7Bfinal_price%7Bcurrency+value+__typename%7Ddiscount%7Bamount_off+percent_off+__typename%7D__typename%7Dmaximum_price%7Bfinal_price%7Bcurrency+value+__typename%7Ddiscount%7Bamount_off+percent_off+__typename%7D__typename%7D__typename%7Dpromotion_label+promotion_label_name+sales_icon+small_image%7Burl+__typename%7Dstock_status+url_key+url_suffix+__typename%7Dpage_info%7Btotal_pages+__typename%7Dtotal_count+__typename%7D%7D&operationName=GetCategories&variables=%7B%22currentPage%22%3A2%2C%22id%22%3A3047%2C%22filters%22%3A%7B%22category_id%22%3A%7B%22eq%22%3A%223047%22%7D%7D%2C%22pageSize%22%3A60%2C%22sort%22%3A%7B%22position%22%3A%22ASC%22%7D%7D",
        'scheme': 'https',
        'accept': '*/*',
        'accept-encoding': 'gzip, deflate, br',
        'accept-language': 'en-US,en;q=0.9',
        'cache-control': 'no-cache',
        'content-type': 'application/json',
'cookie': 'lzd_cid=13b492b3-2f4c-46fc-fb4f-98212b95f68d; t_uid=13b492b3-2f4c-46fc-fb4f-98212b95f68d; lzd_sid=1728fd4a8af615419dc17044911406d4; t_fv=1623735402455; hng=MY|en-MY|MYR|458; userLanguageML=en; _bl_uid=7qkOhpLtxynm254s1v6s7X3gjp7n; cna=aipPGYuo6GwCAXOk2DtI3+FS; _gcl_au=1.1.335118801.1623735403; _tb_token_=f311f13b31eb5; age_restriction=over%3B18%3B1; _fbp=fb.2.1623760903320.1705265880; _ga=GA1.3.1886895770.1623760957; _gid=GA1.3.71310232.1623760957; t_sid=VVEk6ELqT4zeevu7Snsvx63eqX0u9De7; utm_origin=https://www.google.com/; utm_channel=SEO; xlly_s=1; EGG_SESS=S_Gs1wHo9OvRHCMp98md7LqI1pVlU7ApMhhrX1Oe_NHHkRPwi6zdBuxVpdbHrc8tccMpabJfEEwLAe7yCNtlESqPvCMPcAfjqwmNQ19bjOPRdxRnSKGABpGFDMYvsTiDFT7FfaLnWnbHJr555QFqdYHSAtp73LRUYDZgaeGlJT4=; isg=BICAfUhD9ImywYiPlm818du3UQhSCWTTedKTF_oRTxsudSGfoh0tYqpEiNW1QByr; l=eBgFfq9IjfrHbUYTBOfaourza779IIRbSuPzaNbMiOCP_-1p5HU1W6OwTtT9CnhNnsgHR3lqRWpDBu8SQyz6Qxv9-egPe9oEndBG.; tfstk=ctjGBQ2yNBGSUZThPlt1IUVci2adZKke6s51Yite4tkyX3IFiWuEzDR4-BfGDq1..; _m_h5_tk=9a0a86b9d6a2af069082c39a003a0087_1623862308738; _m_h5_tk_enc=e8b5adef6a74673cb92ca952c850f019; _uetsid=ab562c30cd9b11eb835b67411ecd1bd6; _uetvid=ab587f00cd9b11eb8eea6906a86bf0ba',
        'pragma': 'no-cache',
        'referer': 'https://guardian.com.my/health.html?page=1',
        'sec-ch-ua': '" Not;A Brand";v="99", "Google Chrome";v="91", "Chromium";v="91"',
        'sec-ch-ua-mobile': '?0',
        'sec-fetch-dest': 'empty',
        'sec-fetch-mode': 'cors',
        'sec-fetch-site': 'same-origin',
        'store': 'default',
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.114 Safari/537.36'
    }

    def start_requests(self):
        yield scrapy.Request(
url= f"https://guardian.com.my/graphql?query=query+GetCategories%28%24id%3AInt%21%24pageSize%3AInt%21%24currentPage%3AInt%21%24filters%3AProductAttributeFilterInput%21%24sort%3AProductAttributeSortInput%29%7Bcategory%28id%3A%24id%29%7Bid+description+name+product_count+meta_title+meta_keywords+meta_description+__typename%7Dproducts%28pageSize%3A%24pageSize+currentPage%3A%24currentPage+filter%3A%24filters+sort%3A%24sort%29%7Bitems%7Bid+name+sku+price%7BregularPrice%7Bamount%7Bcurrency+value+__typename%7D__typename%7D__typename%7Dprice_range%7Bminimum_price%7Bfinal_price%7Bcurrency+value+__typename%7Ddiscount%7Bamount_off+percent_off+__typename%7D__typename%7Dmaximum_price%7Bfinal_price%7Bcurrency+value+__typename%7Ddiscount%7Bamount_off+percent_off+__typename%7D__typename%7D__typename%7Dpromotion_label+promotion_label_name+sales_icon+small_image%7Burl+__typename%7Dstock_status+url_key+url_suffix+__typename%7Dpage_info%7Btotal_pages+__typename%7Dtotal_count+__typename%7D%7D&operationName=GetCategories&variables=%7B%22currentPage%22%3A1%2C%22id%22%3A3047%2C%22filters%22%3A%7B%22category_id%22%3A%7B%22eq%22%3A%223047%22%7D%7D%2C%22pageSize%22%3A60%2C%22sort%22%3A%7B%22position%22%3A%22ASC%22%7D%7D",
            headers=self.headers
        )

    def parse(self, response):
        print(response.body)
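As a variation on the fix above, the colon-prefixed entries can be dropped altogether when headers are copied from the browser's network panel, since an HTTP client normally derives the authority, method, path and scheme from the request URL and method itself. The following is a minimal sketch under that assumption; the helper name strip_pseudo_headers is hypothetical and the sample dictionary is a shortened copy of the headers in the question.

```python
def strip_pseudo_headers(headers):
    """Return a copy of the headers without HTTP/2 pseudo-header entries.

    Pseudo-headers are the names that start with a colon, such as ':authority'.
    """
    return {
        name: value
        for name, value in headers.items()
        if not name.startswith(":")
    }


# Shortened sample of headers copied from the browser (values from the question).
copied = {
    ":authority": "guardian.com.my",
    ":method": "GET",
    ":scheme": "https",
    "accept": "*/*",
    "content-type": "application/json",
}

cleaned = strip_pseudo_headers(copied)
print(cleaned)
```

The cleaned dictionary could then be passed as headers=cleaned to scrapy.Request. Whether the server also needs the cookie or the 'store' header is something to check against the live site.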
I have been through dozens of links, which all give the same solution:
create an object for the parameter in the API call, and pass that exact same object as JSON.
So the API call (which is hit) is:
[System.Web.Http.HttpPost]
public Microsoft.AspNetCore.Mvc.JsonResult SearchItems([FromBody]SearchParams searchParams)
Input object is defined thus:
public class SearchParams
{
    public string searchWord { get; set; }
    public int anid { get; set; }
}
Call is made thus:
let item = {
    searchWord: searchKeyword.value.trim(),
    anid: 1
};
const uri = "../../WebApi/SearchItems";
fetch(uri, {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
        'Accept': 'application/json'
    },
    body: JSON.stringify(item),
    //credentials: 'include',
    mode: 'cors',
})
Look at the network traffic, headers passed are thus:
:authority: localhost:44386
:method: POST
:path: /WebApi/SearchItems
:scheme: https
accept: application/json
accept-encoding: gzip, deflate, br
accept-language: en-GB,en-US;q=0.9,en;q=0.8
content-length: 32
content-type: application/json
cookie: .AspNetCore.Session=CfDJ8LGmdU3gpbZIvSkveH0x8cMT8VibUQlc07yQ8SJOo6DJkNqykRsAz2V6NVuQ5zQhzBNiHjZ2iRJc%2Fno44sQdQJhsVPnktzx8EWu%2Bptg9ONjmErDP3TZ1csme%2FAJ3H5hSgvooxH0snE00och2ov4ZldFCosHYGH6X70ESjL8PbcJg; .AspNetCore.Antiforgery.Q2hy0CiNRlg=CfDJ8LGmdU3gpbZIvSkveH0x8cOOBFRZOyah2508xPXIUjTbV_weFLdM06pME-M-kc2l48FOmSym_5JS9GUHJeciQEKJI9SHBu1D-5wLcVF4de3rYsjKRsI67qGrCado7eBFDBAbeYFOLWEMbXXCQr_0vlA
origin: https://localhost:44386
referer: https://localhost:44386/lb_users/Details/11
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36
Payload is thus:
{"searchWord":"qwerty","anid":1}
Both my VS debugger and Postman show null/0 values for the parameter.
I am relatively new to .NET Core.
So what fundamental thing am I missing here? The dozens of SO questions I've looked at all seem identical to what I've added to my solution.
Thanks in advance.
EDIT:
this works.
const uri = "../../WebApi/SearchItems";
fetch(uri, {
    method: 'POST',
    headers: {
        "Content-type": "application/x-www-form-urlencoded; charset=UTF-8"
    },
    body: "searchWord=" + searchKeyword.value.trim() + "&anid=1",
    //credentials: 'include',
    mode: 'cors',
})
So it would appear that the JSON is the issue.
Is passing JSON objects to API calls no longer supported?
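To compare the two request bodies from the question outside the browser, here is a small Python sketch using only the standard library. It serializes the same data once as compact JSON (as JSON.stringify does) and once as a URL-encoded form. The value "qwerty" is taken from the captured payload above. This only illustrates the two wire formats; which of them the action binds depends on the server-side model binding, which the sketch does not cover.

```python
import json
from urllib.parse import urlencode

# Same data as the captured payload in the question.
item = {"searchWord": "qwerty", "anid": 1}

# Compact separators mimic the output of JSON.stringify (no spaces).
json_body = json.dumps(item, separators=(",", ":"))

# The form variant from the EDIT, built with proper URL encoding.
form_body = urlencode(item)

print(json_body)                       # sent with Content-Type: application/json
print(len(json_body.encode("utf-8")))  # 32, matching the content-length header
print(form_body)                       # sent with application/x-www-form-urlencoded
```

Note that urlencode also escapes characters such as "&" or spaces in the search word, which the string concatenation in the EDIT does not.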
I tested ASP.NET Web API 2 and ASP.NET Core; both work.
Here is a demo using ASP.NET Web API 2:
ApiController:
public class ValuesController : ApiController
{
    [System.Web.Http.HttpPost]
    public string GetVal(SearchParams searchParams)
    {
        return "success";
    }
}
public class SearchParams
{
    public string searchWord { get; set; }
    public int anid { get; set; }
}
ajax:
<input id="searchWord"/>
<button onclick="sendData()">ssss</button>
<script>
    function sendData() {
        let item = {
            searchWord: $("#searchWord").val().trim(),
            anid: 1
        };
        const uri = "../api/Values";
        fetch(uri, {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json',
                'Accept': 'application/json'
            },
            body: JSON.stringify(item),
            //credentials: 'include',
            mode: 'cors',
        })
    }
</script>
headers:
:authority: localhost:44383
:method: POST
:path: /api/Values
:scheme: https
accept: application/json
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7
content-length: 33
content-type: application/json
cookie: .AspNetCore.Culture=c%3Den-US%7Cuic%3Den-US; .AspNetCore.Antiforgery.I5tvYLHG_pU=CfDJ8AkZmG9N6OhEnYVb3Xy31rQgFnXqHPkWTaV4nUodZGM9SfyhvD5jztl-kzo768oHkGPUYV-bOoSNBP5OuTHO_yd08w-IxrsO39HceHJgOIqC5ePYLIxXd0w9cBzYeWEu5amihQhOqiLySw376bp9YdM
origin: https://localhost:44383
referer: https://localhost:44383/Home/Index1
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: same-origin
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36
payload:
{searchWord: "hghgfgf", anid: 1}
You can also try creating a new project and testing it there.
I am using Python 3 and I want to log in to https://competitions.codalab.org/accounts/login/ using Python requests.
This is my example code:
# -*- coding: utf-8 -*-
import requests

url_open = 'https://competitions.codalab.org/accounts/login'
sess = requests.Session()
sess.verify = False
sess.headers = {'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
                'Accept-Encoding': 'gzip, deflate, sdch, br',
                'Accept-Language': 'zh-CN,zh;q=0.8',
                'Cache-Control': 'max-age=0',
                'Connection': 'keep-alive',
                'Host': 'competitions.codalab.org',
                'Referer': 'https',
                'Upgrade-Insecure-Requests': '1',
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                              '(KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36'}
# Fetch the login page first so that the server sets the csrftoken cookie.
page = sess.get(url_open)
csrfToken = page.cookies['csrftoken']
print(csrfToken)
usrname = '***'  # redacted in the original post
passwd = '***'  # redacted in the original post
form_data = {
    'csrfmiddlewaretoken': csrfToken,
    'login': usrname,
    'password': passwd,
}
# Post the credentials together with the CSRF token.
req = sess.post(url_open, data=form_data, cookies={
    'csrftoken': csrfToken
})
print(req.text)
I get the CSRF token first, and then post with it.
But this code fails to log in to CodaLab.
Can anyone tell me why?
The following code works with Django 1.9:
import requests
client = requests.session()
# The initial GET makes Django set the csrftoken cookie.
client.get("http://127.0.0.1:8000/admin/login/")
csrftoken = client.cookies['csrftoken']
login_data = {'username': "admin",
              'password': "pass!",
              'csrfmiddlewaretoken': csrftoken,
              'next': '/admin/'}
r1 = client.post("http://127.0.0.1:8000/admin/login/", data=login_data)
print(r1)
<Response [200]>
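One difference between the local example and the question may matter, assuming the site runs Django's CSRF middleware (the csrfmiddlewaretoken field suggests it does): for HTTPS requests Django also checks the Referer header, and the question sends 'Referer': 'https'. The sketch below only builds the form data and headers for such a login POST, so it runs without network access. The helper name build_login_request is hypothetical; the field names 'login' and 'password' are the ones used in the question, and the token value is a placeholder.

```python
def build_login_request(login_url, csrf_token, username, password):
    """Build form data and headers for a Django-style login POST.

    The Referer is set to the login URL itself so that a same-origin
    referer check on an HTTPS site can pass.
    """
    data = {
        "csrfmiddlewaretoken": csrf_token,
        "login": username,
        "password": password,
    }
    headers = {"Referer": login_url}
    return data, headers


# Placeholder values for illustration only.
data, headers = build_login_request(
    "https://competitions.codalab.org/accounts/login/",
    "token-from-cookie",
    "user",
    "secret",
)
print(data)
print(headers)
```

With a requests session that has already fetched the login page, the result could be sent as sess.post(login_url, data=data, headers=headers); the session keeps the csrftoken cookie on its own, so it does not need to be passed again.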
I'm trying to add headers to my http post request as shown below
import { Injectable } from '@angular/core';
import { Http, Headers, RequestOptions } from '@angular/http';

@Injectable()
export class UserService extends ServiceBase {
    apiUrl: string;
    private contentHeaders = new Headers();

    constructor(private http: Http) {
        super();
        this.apiUrl = appConfig.apiBaseUrl + '/users';
    }

    login(user: User) {
        this.contentHeaders.append('Accept', 'application/json');
        this.contentHeaders.append('Content-Type', 'application/json');
        return this.http.post(
            this.apiUrl+'/sign_in',
            JSON.stringify({user: user}),
            {headers: this.contentHeaders}
        );
    }
}
Headers shown in Chrome DevTools:
OPTIONS /api/v1/users/sign_in HTTP/1.1
Host: offers2win.com
Connection: keep-alive
Pragma: no-cache
Cache-Control: no-cache
Access-Control-Request-Method: POST
Origin: http://evil.com/
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36
Access-Control-Request-Headers: access-control-allow-credentials, content-type
Accept: */*
Referer: http://localhost:3000/?
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
But the headers I add are not shown in the Network panel of Chrome DevTools. What am I missing here?
Try this
constructor(private http: Http) { }

login(username: any, password: any) {
    let body = JSON.stringify({ "email" : username, "password" : password });
    let headers = new Headers({"Content-Type": "application/json"});
    let options = new RequestOptions({headers: headers});
    return this.http.post(APIUrl+"/login", body, options).map(res => res.json());
}
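When reading the capture in the question, it may help to note that it is an OPTIONS request carrying Access-Control-Request-Method, which is how a browser's CORS preflight looks. A preflight only lists the names of the headers the real request intends to send; the headers themselves travel with the POST that follows once the preflight succeeds. The sketch below, written in Python purely for illustration, classifies a captured request this way. Both function names are hypothetical, and the sample values are copied from the capture above.

```python
def is_cors_preflight(method, headers):
    """Return True if the request looks like a CORS preflight."""
    names = {name.lower() for name in headers}
    return (
        method.upper() == "OPTIONS"
        and "origin" in names
        and "access-control-request-method" in names
    )


def requested_headers(headers):
    """List the header names the browser announces for the real request."""
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("access-control-request-headers", "")
    return [part.strip().lower() for part in raw.split(",") if part.strip()]


# Headers copied from the capture in the question.
captured = {
    "Origin": "http://evil.com/",
    "Access-Control-Request-Method": "POST",
    "Access-Control-Request-Headers": "access-control-allow-credentials, content-type",
}

print(is_cors_preflight("OPTIONS", captured))
print(requested_headers(captured))
```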
I am trying to make an authenticated request to a Web API server using Angular.
This is how I make the request:
var config = {headers: {Authorization: "Bearer " + JSON.parse($cookies.token)}};
var url = 'http://localhost:53889/api/Values/5';
$http.get(url, config).success(function(response){
    console.log(response);
})
And this is how I set the authentication:
login: function(credentials){
    return $http({
        method: 'POST',
        url: url + '/Token',
        headers: {'Content-Type' : 'application/x-www-form-urlencoded'},
        transformRequest : function(obj){
            var str = [];
            for(var p in obj){
                str.push(encodeURIComponent(p) + "=" + encodeURIComponent(obj[p]));
            }
            return str.join("&");
        },
        data: credentials
    });
}
var success = function(data){
    AuthorizationFactory.setCurrentUser(credentials.username);
    ApiFactory.init(data["access_token"]);
    $cookieStore.put('token', JSON.stringify(data["access_token"]));
};
The success method is called like this: AuthorizationFactory.login(credentials).success(success).
But every time I try to make an authenticated request, I get a 401 error message.
The headers look like this:
Accept:application/json, text/plain, */*
Accept-Encoding:gzip, deflate, sdch
Accept-Language:ro-RO,ro;q=0.8,en-US;q=0.6,en;q=0.4
Authorization:Bearer "4N_9ScOCtFxTgNs....t"
Cache-Control:no-cache
Connection:keep-alive
Cookie:.AspNet.Cookies=Rx...y0OTHrvE
Host:localhost:53889
Pragma:no-cache
Referer:http://localhost:53889/app/
User-Agent:Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.94 Safari/537.36
X-Access-Token:"\"4N_9ScO...XZ6t\""
What am I doing wrong? What am I missing?
Do not use quotation marks in the Authorization header. It should look like this:
Authorization: Bearer your_token_here
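The quoted values in the captured headers look like a token that was JSON-encoded once or twice before being stored. As a language-neutral illustration of the fix, here is a minimal Python sketch that peels such JSON string layers off before building the header. The function names are hypothetical and "abc123" is a made-up token, not the one from the question.

```python
import json


def unwrap_token(raw):
    """Strip JSON string encoding layers from a stored token.

    For example '"\\"abc\\""' becomes '"abc"' and then 'abc'.
    """
    token = raw.strip()
    while len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        try:
            decoded = json.loads(token)
        except json.JSONDecodeError:
            break  # Not valid JSON; keep the value as it is.
        if not isinstance(decoded, str):
            break
        token = decoded
    return token


def authorization_header(raw):
    """Build the Authorization header without quotation marks."""
    return {"Authorization": "Bearer " + unwrap_token(raw)}


print(authorization_header('"abc123"'))
print(authorization_header('"\\"abc123\\""'))
```

Both calls produce the same header, Bearer abc123, regardless of how many times the token was wrapped. In the AngularJS code the equivalent fix is to make sure the value appended after "Bearer " is the plain token string.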
When initiating an HTTP DELETE request with a Content-Type header and a body, these two elements are:
Received correctly when the request is initiated from within Chrome,
Not received when the request is initiated from within PhantomJS.
Which of the two is behaving according to the standard?
Note that in both cases, the client and the server are the same.
Below are the logs of the request at the server level.
Log of the request sent from within PhantomJS
SERVER: -------------------------------------------------------
SERVER: - R E Q U E S T -
SERVER: -------------------------------------------------------
SERVER: req.url /test/del
SERVER: req.method OPTIONS
SERVER: req.headers { 'access-control-request-method': 'DELETE',
origin: 'http://localhost:9876',
'user-agent': 'Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.9.7 Safari/534.34',
referer: 'http://localhost:9876/context.html',
'access-control-request-headers': 'Content-Type, Accept',
accept: '*/*',
connection: 'Keep-Alive',
'accept-encoding': 'gzip',
'accept-language': 'fr-FR,en,*',
host: 'localhost:9009' }
SERVER: req.query {}
SERVER: req.body {}
SERVER: -------------------------------------------------------
SERVER: -------------------------------------------------------
SERVER: - R E S P O N S E -
SERVER: -------------------------------------------------------
SERVER: res.headers { 'Access-Control-Allow-Headers': 'Origin, X-Requested-With, Content-Type, Accept',
'Access-Control-Allow-Methods': 'PUT, DELETE',
'Access-Control-Allow-Origin': '*' }
SERVER: -------------------------------------------------------
127.0.0.1 - - [Thu, 13 Nov 2014 16:07:08 GMT] "OPTIONS /test/del HTTP/1.1" 200 - "http://localhost:9876/context.html" "Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.9.7 Safari/534.34"
SERVER: -------------------------------------------------------
SERVER: - R E Q U E S T -
SERVER: -------------------------------------------------------
SERVER: req.url /test/del
SERVER: req.method DELETE
SERVER: req.headers { origin: 'http://localhost:9876',
'user-agent': 'Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.9.7 Safari/534.34',
accept: 'application/json, application/json;q=0.8, text/plain;q=0.5, */*;q=0.2',
referer: 'http://localhost:9876/context.html',
connection: 'Keep-Alive',
'accept-encoding': 'gzip',
'accept-language': 'fr-FR,en,*',
host: 'localhost:9009' }
SERVER: req.query {}
SERVER: req.body {}
SERVER: -------------------------------------------------------
SERVER: -------------------------------------------------------
SERVER: - R E Q U E S T -
SERVER: -------------------------------------------------------
SERVER: req.url /del
SERVER: req.method DELETE
SERVER: req.headers { origin: 'http://localhost:9876',
'user-agent': 'Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.9.7 Safari/534.34',
accept: 'application/json, application/json;q=0.8, text/plain;q=0.5, */*;q=0.2',
referer: 'http://localhost:9876/context.html',
connection: 'Keep-Alive',
'accept-encoding': 'gzip',
'accept-language': 'fr-FR,en,*',
host: 'localhost:9009' }
SERVER: req.query {}
SERVER: req.body {}
SERVER: -------------------------------------------------------
SERVER: -------------------------------------------------------
SERVER: - R E S P O N S E -
SERVER: -------------------------------------------------------
SERVER: res.headers { 'Content-type': 'application/json',
'Access-Control-Allow-Origin': '*' }
SERVER: res.body { code: 'Declined',
reason: 'UNEXPECTED CONTENT',
message: 'The content-type "undefined" is unexpected. Please use "application/json".' }
SERVER: -------------------------------------------------------
Log of the request sent from within Chrome
SERVER: -------------------------------------------------------
SERVER: - R E Q U E S T -
SERVER: -------------------------------------------------------
SERVER: req.url /test/del
SERVER: req.method OPTIONS
SERVER: req.headers { host: 'localhost:9009',
connection: 'keep-alive',
'access-control-request-method': 'DELETE',
origin: 'http://localhost:9876',
'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36',
'access-control-request-headers': 'accept, content-type',
accept: '*/*',
referer: 'http://localhost:9876/context.html',
'accept-encoding': 'gzip,deflate,sdch',
'accept-language': 'en-US,en;q=0.8' }
SERVER: req.query {}
SERVER: req.body {}
SERVER: -------------------------------------------------------
SERVER: -------------------------------------------------------
SERVER: - R E S P O N S E -
SERVER: -------------------------------------------------------
SERVER: res.headers { 'Access-Control-Allow-Headers': 'Origin, X-Requested-With, Content-Type, Accept',
'Access-Control-Allow-Methods': 'PUT, DELETE',
'Access-Control-Allow-Origin': '*' }
SERVER: -------------------------------------------------------
127.0.0.1 - - [Thu, 13 Nov 2014 16:15:04 GMT] "OPTIONS /test/del HTTP/1.1" 200 - "http://localhost:9876/context.html" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36"
SERVER: -------------------------------------------------------
SERVER: - R E Q U E S T -
SERVER: -------------------------------------------------------
SERVER: req.url /test/del
SERVER: req.method DELETE
SERVER: req.headers { host: 'localhost:9009',
connection: 'keep-alive',
'content-length': '23',
accept: 'application/json, application/json;q=0.8, text/plain;q=0.5, */*;q=0.2',
origin: 'http://localhost:9876',
'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36',
'content-type': 'application/json',
referer: 'http://localhost:9876/context.html',
'accept-encoding': 'gzip,deflate,sdch',
'accept-language': 'en-US,en;q=0.8' }
SERVER: req.query {}
SERVER: req.body { field1: 1, field2: 2 }
SERVER: -------------------------------------------------------
SERVER: -------------------------------------------------------
SERVER: - R E Q U E S T -
SERVER: -------------------------------------------------------
SERVER: req.url /del
SERVER: req.method DELETE
SERVER: req.headers { host: 'localhost:9009',
connection: 'keep-alive',
'content-length': '23',
accept: 'application/json, application/json;q=0.8, text/plain;q=0.5, */*;q=0.2',
origin: 'http://localhost:9876',
'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36',
'content-type': 'application/json',
referer: 'http://localhost:9876/context.html',
'accept-encoding': 'gzip,deflate,sdch',
'accept-language': 'en-US,en;q=0.8' }
SERVER: req.query {}
SERVER: req.body { field1: 1, field2: 2 }
SERVER: -------------------------------------------------------
SERVER: -------------------------------------------------------
SERVER: - R E S P O N S E -
SERVER: -------------------------------------------------------
SERVER: res.headers { 'Content-type': 'application/json',
'Access-Control-Allow-Origin': '*' }
SERVER: res.body { code: 'Accepted' }
SERVER: -------------------------------------------------------
RFC 2616 (HTTP/1.1) specifies the DELETE method in section 9.7. Unlike the descriptions of POST and PUT, the description of DELETE says nothing about an enclosed entity (message body).
From your logs it is apparent that PhantomJS does not send the message body at all, and therefore has no reason to include the content type; it presumably assumes that a DELETE body will never be used because its meaning is not defined. Chrome, it seems, sends the message body regardless.
There is a draft for an update which includes the following text in section 6.7:
Bodies on DELETE requests have no defined semantics. Note that
sending a body on a DELETE request might cause some existing
implementations to reject the request.
Since PhantomJS 1.x is based on a version of WebKit that is more than three years old (and predates the draft), it behaves this way. Chrome, on the other hand, may have implemented the proposed draft and actively sends the body on DELETE requests. See this question for more information.
If your operation depends on the message body for the DELETE method, you should change the implementation so that the resource to delete is completely identified by the URI.
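One possible way to follow that advice is to move the identifying fields from the body into the query string. The sketch below shows the idea in Python, with the host, path and field names taken from the server logs in the question. The query-string layout and the helper name build_delete_url are assumptions, since the real API may prefer path segments instead.

```python
from urllib.parse import urlencode


def build_delete_url(base_url, fields):
    """Append the identifying fields to the URL as a query string."""
    query = urlencode(fields)
    return (base_url + "?" + query) if query else base_url


# Fields that the Chrome request carried in its DELETE body.
url = build_delete_url(
    "http://localhost:9009/test/del",
    {"field1": 1, "field2": 2},
)
print(url)  # http://localhost:9009/test/del?field1=1&field2=2
```

A DELETE sent to this URL needs neither a body nor a Content-Type header, so Chrome and PhantomJS would send the same request.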