I'm new to Python programming. My problem is that my Python program doesn't seem to pass/encode the parameter properly to the ASP file that I've created. This is my sample code:
import urllib.request
url = 'http://www.sample.com/myASP.asp'
full_url = url + "?data='" + str(sentData).replace("'", '"').replace(" ", "%20").replace('"', "%22") + "'"
print (full_url)
response = urllib.request.urlopen(full_url)
print(response)
the output would give me something like:
http://www.sample.com/myASP.asp?data='{%22mykey%22:%20[{%22idno%22:%20%22id123%22,%20%22name%22:%20%22ej%22}]}'
The ASP file is supposed to insert the acquired query string into a database. But whenever I check my database, no record is saved. However, if I copy and paste the printed output into my browser's URL bar, the record is saved. Any input on this? TIA
Update:
Is it possible that Python calls my ASP File A, but ASP File B is never called? ASP File A is called by Python, while ASP File B is called by ASP File A. Whenever I run the URL in a browser, the saving goes well. But from Python, no database save occurs, even though the data passed from Python is read by ASP File A.
Use Firebug with Firefox and watch the network traffic when the page is loaded. If it is actually an HTTP POST, which I suspect it is, check the POST parameters on that request and do something like this:
from BeautifulSoup import BeautifulSoup  # Python 2 / BeautifulSoup 3 imports
import urllib

post_params = {
    'param1' : 'val1',
    'param2' : 'val2',
    'param3' : 'val3'
}
post_args = urllib.urlencode(post_params)  # form-encode the POST body

url = 'http://www.sample.com/myASP.asp'
fp = urllib.urlopen(url, post_args)  # passing a body makes urlopen do a POST
soup = BeautifulSoup(fp)
If it's actually an HTTP POST, this will work.
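Since the question uses Python 3's urllib.request, here is a rough Python 3 translation of the same POST; the field names are placeholders, so substitute whatever the network traffic actually shows:

import urllib.parse
import urllib.request

post_params = {
    'param1': 'val1',  # placeholder field names; use the ones you see in Firebug
    'param2': 'val2',
    'param3': 'val3'
}
post_args = urllib.parse.urlencode(post_params).encode('utf-8')  # POST body must be bytes

url = 'http://www.sample.com/myASP.asp'
with urllib.request.urlopen(url, post_args) as fp:  # supplying data makes this a POST
    print(fp.read().decode('utf-8', errors='replace'))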
In case anybody stumbles upon this, this is what I've come up with:
py file:
url = "my.url.com"
data = {'sample': 'data'}
encodeddata = urllib.parse.urlencode(data).encode('UTF-8')
req = urllib.request.Request(url, encodeddata)
response = urllib.request.urlopen(req)
and in my ASP file, I used json2.js:
jsondata = request.form("data")
jsondata = replace(jsondata, "'", """")
Set jsondata = JSON.parse(jsondata)
Note: use requests instead. ;)
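To expand on that closing note, the same POST is shorter with requests; a minimal sketch, assuming the same placeholder URL and payload as above:

import requests

url = "http://my.url.com"  # placeholder, as above
data = {'sample': 'data'}
response = requests.post(url, data=data)  # requests form-encodes the body for you
print(response.status_code, response.text)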
First off, I don't know Python.
But from the doc on urllib.request:
the HTTP request will be a POST instead of a GET when the data
parameter is provided
Let me make a really wild guess: you are accessing the form values via Request.QueryString(...) in the ASP page, so your POST won't pass any values. But when you paste the URL into the address bar, it is a GET, and it works.
Just guessing; you could show the .asp page for a further check.
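If the page really does read Request.QueryString, a safer variant on the Python side is to keep the request a GET and let urlencode do the escaping instead of the hand-rolled replace() calls; a sketch using the sample payload from the question:

import json
import urllib.parse
import urllib.request

sentData = {"mykey": [{"idno": "id123", "name": "ej"}]}  # example payload from the question
query = urllib.parse.urlencode({'data': json.dumps(sentData)})
full_url = 'http://www.sample.com/myASP.asp?' + query

response = urllib.request.urlopen(full_url)  # no data argument, so this stays a GET
print(response.status)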
Related
First off, I am not the best programmer, so please excuse me if I ask something stupid.
I have a question about the following code (in R), which I have written in order to get an authentication code for the Withings API:
library(httr)
my_client_id = "..." #deleted because it is secret
my_redirect_uri = "..." #deleted because it is secret
my_scope="user.activity,user.metrics,user.info"
access_url = "https://wbsapi.withings.net/v2/oauth2"
authorize_url = "https://account.withings.com/oauth2_user/authorize2"
my_response_type = "code"
my_state = "..." #deleted because it is secret
httr::BROWSE(authorize_url, query = list(response_type = my_response_type,
                                         client_id = my_client_id,
                                         redirect_uri = my_redirect_uri,
                                         scope = my_scope,
                                         state = my_state))
This code successfully opens the URL
http://%22https://account.withings.com/oauth2_user/account_login?response_type=code&client_id=...&redirect_uri=...&scope=user.activity%2Cuser.metrics%2Cuser.info&state=...&b=authorize2%22
where I can enter my e-mail address and password. After that, it redirects me to the URL
http://.../?code=...&state=...
where the first dots are my redirect URL. This gives me the code I need for getting the access token. I have tested the code, i.e. I tried to get an access token using it, and I was successful.
The problem is that I have to copy/paste the code from the URL (in my browser) into my POST statement (which I use to get the access token) manually, and I would like to automate that. So I would like the URL with the code to be returned to me, so that I can parse it in order to extract the code. I know how to extract the code once I have the URL, but I have no idea how to avoid the copying/pasting, and I am not even sure it is possible. If it is, does anyone have an idea what I could add to my existing code, or how I could change it, in order to get the URL with the code (apart from doing it manually)?
I would be very happy about any help, and I want to say thank you in advance!
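A common way to avoid the copy/paste is to run a throwaway web server on the redirect URI and read the ?code=... parameter from the request the browser makes after login. Below is a minimal sketch of the idea in Python, since the trick itself is language-agnostic (in R, httr's oauth2.0_token() runs such a listener for you); the localhost:1410 address is an assumption and must match your registered redirect URI:

import http.server
import urllib.parse

class RedirectHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # The browser arrives here as http://localhost:1410/?code=...&state=...
        params = urllib.parse.parse_qs(urllib.parse.urlparse(self.path).query)
        print("authorization code:", params.get("code", ["<missing>"])[0])
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"You can close this window now.")

# Serve exactly one request (the redirect), then fall through.
http.server.HTTPServer(("localhost", 1410), RedirectHandler).handle_request()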
So, as I figured out, when I have a form with enctype="multipart/form-data" and I upload a file, I can no longer access the request object. The following error is shown:
Cannot use the generic Request collection after calling BinaryRead.
After checking some resources, I stumbled upon a statement that says: "This is by design". Well, okay, I'm not here to judge design decisions.
To give you a quick overview, let me walk you through the code:
if request("todo") = "add" then
Set Form = New ASPForm
category = request("category")
title = request("title")
if len(Form("upload_file").FileName) > 0 then
filename = Form("upload_file").FileName
DestinationPath = Server.mapPath("personal/allrounder/dokumente/")
Form.Files.Save DestinationPath
end if
end if
Nothing too special here so far. Later, however, when I try to access my request object, the error mentioned above occurs:
<% if request("todo") = "new" then %>
...
My question now is how to get rid of this or fix it. I don't want to open the upload in a popup if there is another way around it; that is the only solution I could think of.
Perfect would be an object which checks both Form and request. Alternatively, maybe a check at the top of the file to determine which object I have to use?
Thanks for any suggestions.
There used to be a very popular ASP class/component that handled ASP file uploads. The site for that component has been taken down, but the code is mirrored here:
https://github.com/romuloalves/free-asp-upload
You can include this ASP page on your own page, and on your page instantiate the class to get access to the files in your form, but also to the form variables. Here is a piece of example code (Upload.Form accesses the form fields):
Dim uploadsDir : uploadsDir = server.mapPath(".") ' whatever you want
Dim Upload, ks, fileKey, mailto
Set Upload = New FreeASPUpload
call Upload.Save(uploadsDir)
ks = Upload.UploadedFiles.keys
for each fileKey in ks
    Response.write(fileKey & " : " & Upload.UploadedFiles(fileKey).FileName & "<br/>")
next
mailto = Upload.form("mailTo")
Set Upload = Nothing
If you want to stick to your own implementation, you can probably figure out how to get at the form variables in a multipart/form-data encoded data stream by having a look at the code they use to do so.
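For a feel of what that parsing involves, here is a small illustration in Python with a made-up request body; each part of the stream carries its field name in a Content-Disposition header, which is the same thing the VBScript component digs out:

import email.parser
import email.policy

# A made-up multipart/form-data body, shaped like what a browser would send.
body = (b"--BOUNDARY\r\n"
        b'Content-Disposition: form-data; name="category"\r\n\r\n'
        b"books\r\n"
        b"--BOUNDARY\r\n"
        b'Content-Disposition: form-data; name="upload_file"; filename="a.txt"\r\n'
        b"Content-Type: text/plain\r\n\r\n"
        b"file contents here\r\n"
        b"--BOUNDARY--\r\n")

# Parse it like a MIME message and read each field's name and value.
msg = email.parser.BytesParser(policy=email.policy.default).parsebytes(
    b"Content-Type: multipart/form-data; boundary=BOUNDARY\r\n\r\n" + body)
for part in msg.iter_parts():
    name = part.get_param("name", header="content-disposition")
    print(name, "=>", part.get_content().strip())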
I have the following part of code:
open System.Net  // WebClient lives in System.Net

let client = new WebClient()
let url = "https://..."
let filename = "data.xls"  // hypothetical local path for the download
client.DownloadFile(url, filename)  // download the response body to disk
client.Dispose()
In this code I am performing an HTTP GET, through which I get an Excel file with some data.
The method executes correctly, because I do get my Excel file.
The problem is that the content of my Excel file is garbled, as shown in the screenshot I attached.
I think it's because I don't pass ContentType: "application/vnd.ms-excel".
So can anyone help me with how to pass that ContentType with my client in F#?
If you want to add HTTP headers to a request made using WebClient, use the Headers property:
open System.Net

let client = new WebClient()
let url = "https://..."
let filename = "data.xls"  // hypothetical local path, as in the question
client.Headers.Add(HttpRequestHeader.Accept, "application/vnd.ms-excel")
client.DownloadFile(url, filename)
In your case, I think you need the Accept header (Content-Type is what the response should contain to tell you what you got).
That said, I'm not sure if this is the problem you are actually having - as noted in the comments, your screenshot shows a different file, so it is hard to tell what's wrong with the file you get from the download (maybe it's just somewhere else? or maybe the encoding is wrong?)
I'm doing some heavy web scraping using Python. In some cases, post data is sent not through a form submit but through some Javascript, which I cannot interact with via this approach. In order to circumvent this, I've been appending names and values for the post requests to the url and then visiting that url.
This method was working fine until I came across a site that used this kind of structure: [sitename].com/?[pagename].do/. I admit total ignorance about this .do extension, though some light searching tells me that it has to do with Struts and a Java-based backend. In this case it seems to be a way of dynamically generating a table; I'm trying to filter the results of that table. What I want to enter is something like [sitename].com/?[pagename].do?[name]=[value]&[name]=[value], but this doesn't work, nor does it even seem like it should work. I attempted it using several variations in syntax. It seems like something I don't quite understand is going on here.
I wish I could direct you to the actual site, but unfortunately I cannot due to the sensitive nature of the project. Let me know, though, if there's any additional information that would be helpful in providing an answer. Thanks in advance.
Edit: This is not really a "my code isn't working" question, as it's the underlying functionality that I would like to emulate in my code which is troubling me, but I'll do my best to get grittier. I'm contractually bound not to share the names of the sites that we're studying, but I will try to model the problem. I am hoping that someone with some familiarity with the back-end activity sending this .do page to the browser will be able to shed light.
import urllib
import urllib2
#
## case 1: a site that i have success in scraping
url = 'http://[sitename]/[pagename]'
values = {'s' : '40', 'pg' : '1'}
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
the_page = response.read()
print the_page #i get the filtered data that i am looking for
#
## case 2: the site that poses a problem for the encoding of post parameters
url = 'http://[sitename]/?[pagename].do/' # this site uses a .do file to generate
# the content i want to filter. note that the page name is preceded by ?.
values = {'s' : '40', 'pg' : '1'}
data = urllib.urlencode(values)
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
the_page = response.read()
print the_page # i am taken back to the root of the site,
# the same result i would get if i entered nonsense
# post parameters that did not correspond with actual control names.
here, also, is an example of some javascript on the page that accomplishes what i'd like to do with my scraper:
function page_next (id) {
    $("#loading").fadeIn("normal");
    $.post("/?dumps.do/", {s: id, pg: 2},
        function( data ) {
            var content = $( data ).find( '#dumps' );
        }
    );
}
I don't know what site you are parsing, but this: [sitename].com/?[pagename].do/ is not something I would call default Struts behaviour, assuming it's indeed a Struts application.
Having a .do extension was indeed something Struts used to use for request mapping, but the URL in that case should be [sitename].com/[pagename].do, not [sitename].com/?[pagename].do/.
In the second form, the action is in fact a parameter in a query string. This is why this syntax is broken: [sitename].com/?[pagename].do?[name]=[value]&[name]=[value]. You want to send a query string to the action but the action itself is a parameter in the query string.
But that's not the issue. The issue is that the site is doing something with that parameter and expects to receive its data in a certain way, a way you were not able to reverse engineer.
Assuming again that this is a Struts application, Struts uses a front controller to intercept all action.do URLs and then use the action to invoke a particular class in the application, a class that is mapped to that particular action. The format for this should be [sitename].com/[pagename].do. That would be similar to having, say, [sitename].com/[pagename].php.
But having the action as a parameter makes me think that the site has a different front controller (not that of Struts) that is taking the parameter from the query string and passing it downstream to the Struts framework.
There could be a lot of reasons for having this funky way of handling requests, including making it harder for others to scrape the site, although this one seems kinda straightforward:
$.post("/?dumps.do/", {s: id, pg: 2}, ...
Have you tried doing a POST to the root of the application with the action in the query string?
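One thing worth checking when you do: jQuery's $.post sends an X-Requested-With: XMLHttpRequest header, and some front controllers answer such requests differently from plain ones. A sketch that mimics the JavaScript call more closely, in the same urllib2 style as the question (host and values are placeholders):

import urllib
import urllib2

url = 'http://[sitename]/?dumps.do/'  # root of the app, action in the query string
data = urllib.urlencode({'s': '40', 'pg': '2'})

req = urllib2.Request(url, data)
req.add_header('X-Requested-With', 'XMLHttpRequest')  # what jQuery's $.post sends
response = urllib2.urlopen(req)
print response.read()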
I have a URL like "http://www.ti.com/lit/gpn/TPS767D318-Q1",
which is a path that eventually gets routed to "http://www.ti.com/lit/ds/symlink/tps767d318-q1.pdf" in the browser (rendering a PDF file). I am processing this URL in a console application in order to fetch the "pdf" filename that you see in the second URL I provided.
I checked the ResponseUri.AbsoluteUri property on the HttpWebResponse object and it says "http://focus.ti.com/general/docs/lit/getliterature.tsp?genericPartNumber=TPS767D318-Q1&fileType=pdf"
It looks like this is a nested virtual path. Can anybody help with where I can get the final URL so I can extract the PDF filename? I did not find it anywhere in the response object. I also checked the response headers, and there is nothing there either.
Any help will be appreciated...Thanks
Not sure about ASP, but at the protocol level the initial request may cause a redirect to be issued by the application/server on the other end. So you can look at the initial HTTP response and check whether it's a redirect code (301, 302, etc.). If so, you can follow the redirects until you hit a 200, and that's the final URL you can use to check the filename.
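A sketch of that loop in Python (the idea is language-agnostic; requests is used here only because turning off automatic redirects is a one-liner):

import requests
from urllib.parse import urljoin

url = "http://www.ti.com/lit/gpn/TPS767D318-Q1"
while True:
    r = requests.get(url, allow_redirects=False)  # inspect each response ourselves
    if r.status_code in (301, 302, 303, 307, 308):
        url = urljoin(url, r.headers["Location"])  # Location may be relative
    else:
        break
print("final URL:", url, "status:", r.status_code)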
Look at the Content-Disposition header; it might look something like Content-Disposition: attachment; filename=tps767d318-q1.pdf. This is a common technique for web services that fetch and "download" files from databases, network shares, etc.
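For example, a rough sketch of reading that header (it assumes the header is present and uses the simple filename= form, without RFC 5987 encoding):

import requests

r = requests.get("http://www.ti.com/lit/gpn/TPS767D318-Q1")
disposition = r.headers.get("Content-Disposition", "")
if "filename=" in disposition:
    print(disposition.split("filename=", 1)[1].strip('"; '))  # e.g. tps767d318-q1.pdf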
It turns out that the URL in my question is actually returning HTML content and doing a "meta tag" redirect. So I had to do the following:
var redirect = Regex.Match(new string(buffer, 0, count), @"\<meta(?=[^>]*http-equiv\W*refresh)[^>]*?content\s*\=[^=>]*url\s*\=\s*(?<Url>[^'"">]+)", RegexOptions.IgnoreCase | RegexOptions.Singleline);
if (redirect.Success)
{
    Uri uri = new Uri(new Uri(externalUrl, UriKind.Absolute), new Uri(redirect.Groups["Url"].Value, UriKind.RelativeOrAbsolute));
    return SaveUrlToTemporaryFile(uri.AbsoluteUri, needsFullDownload);
}
I'm getting the final URL out of the meta tags from the returned HTML content, and calling my download routine again.