download file from remote location - asp.net

Hey, I am in trouble, please help me out. I want to download a file from another website to my own location, and I used the code below:
Dim wc As New System.Net.WebClient
wc.DownloadFile(pathUrl, fileName)
pathUrl and fileName are both correct, I'm 100% sure.
After these two lines execute, my browser's progress bar goes into a wait state, as if something is being retrieved, but the file is not downloaded anywhere. What should I do next?

Not enough rep to leave a comment so:
#AZHAR, the file save location is the second parameter. In your example it is fileName, in NiL's example it is "uploads/myPath.doc"
If you use wc.DownloadFileAsync, make sure to include an AsyncCompletedEventHandler so you know when it's done.

I'm not sure that what you did matches your goal (I don't mean the code is incorrect; it is syntactically correct, otherwise it wouldn't compile).
If you want to retrieve a file from a remote location and save it to your local (client) machine, this is surely the worst way, since this code runs on the server and saves the file there, not in the browser.
If, instead, you want to download the file onto your server, then your problem is patience :)
I mean, the DownloadFile method is blocking and can take hours if you are downloading something like a Blu-ray rip or a Linux ISO, no matter how fast your server is.
You could consider using an asynchronous job in this case...
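A rough illustration of the blocking-versus-asynchronous point, sketched in Python rather than .NET. The function names download_file and download_file_async and the on_done callback are made up for this example; the callback only plays a role similar to a completion handler and is not the WebClient API.

```python
import shutil
import threading
import urllib.request


def download_file(url, dest_path):
    """Blocking download: returns only after the whole body is on disk."""
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path


def download_file_async(url, dest_path, on_done):
    """Run the download on a worker thread.

    on_done(path, error) is called when the download ends; exactly one of
    the two arguments is None.
    """
    def worker():
        try:
            on_done(download_file(url, dest_path), None)
        except Exception as exc:  # report any failure to the caller
            on_done(None, exc)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The caller of download_file_async gets control back immediately and learns about completion (or failure) through the callback, which is the same reason the completion handler matters with an asynchronous download.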

The code you wrote does download the file; I tested it and it works.
DownloadFile is used as follows:
wc.DownloadFile("http://www.domaine.com/uploads/file.doc", "uploads/myPath.doc");
If you are trying to download a big file, you can use:
wc.DownloadFileAsync
which works the same way, except that it takes the address as a Uri and returns before the download has finished.

Related

Goutte / Web Scraping - How to intercept and download a file

Firstly, thanks in advance for your help here, it's really appreciated!
I've successfully managed to get Goutte to authenticate, hit a URL, change a select field and click a submit button.
The page then reloads and as it finishes loading, it downloads a file to the client.
How do I intercept this file within Goutte? I've read as much documentation as I can but can't seem to find an answer.
Depending on the file type, I then want to either traverse it or save it locally.
Thanks :-)

It is not easy to achieve this. In my situation, I open the URL where the file is (after authentication), and the server returns the file as a Page object; you can then get the content of the page.
// $url contains the path to the file.
$session->visit($url);
$page = $session->getPage();
$saved = file_put_contents($targetFilePath, $page->getContent());
In my case, I am downloading a zip file. In your case, you would probably save it to a temporary location, detect the type, and then move it to the desired directory.
Hope this helps.
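The "save to a temporary location, detect the type, then move it" step could look roughly like the Python sketch below. It guesses the type from the leading magic bytes; the signature table is deliberately short, and sniff_type, store_download and the per-type directory layout are assumptions made for illustration, not part of Goutte.

```python
import os
import shutil
import tempfile

# Well-known leading byte signatures; extend as needed.
MAGIC_SIGNATURES = [
    (b"PK\x03\x04", "zip"),
    (b"%PDF-", "pdf"),
    (b"\x1f\x8b", "gzip"),
    (b"\x89PNG\r\n\x1a\n", "png"),
]


def sniff_type(data):
    """Return a short type name based on the first bytes of the content."""
    for signature, name in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return name
    return "unknown"


def store_download(data, base_dir, filename):
    """Write data to a temp file, then move it to base_dir/<type>/filename."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
    kind = sniff_type(data[:16])
    dest_dir = os.path.join(base_dir, kind)
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, filename)
    shutil.move(tmp.name, dest)
    return kind, dest
```

Checking the content itself rather than the URL or file extension helps when the server hands back the file from a generic handler URL.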

Download ASPX page with R

There are a number of fairly detailed answers on SO which cover authenticated login to an aspx site and a download from it. As a complete n00b, I haven't been able to find a simple explanation of how to get data from a web form.
The following MWE is intended as an example only; this question is really meant to teach me how to do this for a wider collection of web pages.
website :
http://data.un.org/Data.aspx?d=SNA&f=group_code%3a101
What I tried (and, obviously, failed with):
test=read.csv('http://data.un.org/Handlers/DownloadHandler.ashx?DataFilter=group_code:101;country_code:826&DataMartId=SNA&Format=csv&c=2,3,4,6,7,8,9,10,11,12,13&s=_cr_engNameOrderBy:asc,fiscal_year:desc,_grIt_code:asc')
This gives me gobbledygook when I run View(test).
Anything that steps me through this or points me in the right direction would be very gratefully received.

The URL you are accessing with read.csv returns a zipped file. You could download it using, say, httr and write the contents to a temp file:
library(httr)
urlUN <- "http://data.un.org/Handlers/DownloadHandler.ashx?DataFilter=group_code:101;country_code:826&DataMartId=SNA&Format=csv&c=2,3,4,6,7,8,9,10,11,12,13&s=_cr_engNameOrderBy:asc,fiscal_year:desc,_grIt_code:asc"
response <- GET(urlUN)
writeBin(content(response, as = "raw"), "temp/temp.zip")
fName <- unzip("temp/temp.zip", list = TRUE)$Name
unzip("temp/temp.zip", exdir = "temp")
read.csv(paste0("temp/", fName))
Alternatively, Hmisc has a useful getZip function:
library(Hmisc)
urlUN <- "http://data.un.org/Handlers/DownloadHandler.ashx?DataFilter=group_code:101;country_code:826&DataMartId=SNA&Format=csv&c=2,3,4,6,7,8,9,10,11,12,13&s=_cr_engNameOrderBy:asc,fiscal_year:desc,_grIt_code:asc"
unData <- read.csv(getZip(urlUN))
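For comparison, the same "the response is a zip, so unzip it before parsing" idea as a minimal Python sketch. It works on bytes you have already downloaded; read_csv_from_zip is a hypothetical helper, and it assumes the archive's first member is the CSV and that it is UTF-8 encoded.

```python
import csv
import io
import zipfile


def read_csv_from_zip(zip_bytes):
    """Parse the first member of an in-memory zip archive as CSV rows."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        first_name = archive.namelist()[0]  # assumption: the CSV is the first entry
        with archive.open(first_name) as member:
            text = io.TextIOWrapper(member, encoding="utf-8", newline="")
            return list(csv.reader(text))
```

Feeding the raw response straight into a CSV parser is what produces the garbage output; the parser sees compressed bytes instead of text.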

The links are being generated dynamically. The other problem is the content isn't actually at that link. You're making a request to a (very odd and poorly documented) API which will eventually return with the zip file. If you look in the Chrome dev tools as you click on that link you'll see the message and response headers.
There are a few ways you can solve this. If you know some JavaScript, you can script a headless WebKit instance like PhantomJS to load these pages, simulate click events, wait for a content response, and then pipe that to something.
Alternatively, you may be able to finagle httr into treating this like a proper RESTful API. I have no idea if that's even remotely possible. :)

Detail procedure to generate a har file from a given url via command line tool

Could anybody advise how to generate a HAR file from a given URL via the command line on Linux? Details of the tools used and guidelines are much appreciated.
Thanks

You can use PhantomJS for this:
phantomjs examples/netsniff.js "some_url" > out.har
or take a look at BrowserMob Proxy.

I have worked with PhantomJS to produce HAR files, but they are not as reliable as the HAR files generated by actual browsers like Chrome and Firefox. Using Selenium and BrowserMob Proxy, you can generate HAR files directly from browsers with a Python script such as this:
from browsermobproxy import Server
from selenium import webdriver
import json
# Start the BrowserMob Proxy server and create a proxy for this session.
server = Server("path/to/browsermob-proxy")
server.start()
proxy = server.create_proxy()
# Route Firefox through the proxy so its traffic is recorded.
profile = webdriver.FirefoxProfile()
profile.set_proxy(proxy.selenium_proxy())
driver = webdriver.Firefox(firefox_profile=profile)
# Begin a new HAR capture, load the page, then dump the recorded HAR as JSON.
proxy.new_har("http://stackoverflow.com", options={'captureHeaders': True})
driver.get("http://stackoverflow.com")
result = json.dumps(proxy.har, ensure_ascii=False)
print(result)
# Shut down the proxy server and the browser.
server.stop()
driver.quit()
If you are looking for a command line tool that headlessly generates HAR and performance data with Chrome and Firefox, have a look at Speedprofile.

PhantomJS's HAR files contain an abbreviated list of assets. When you visit a web page with Chrome or another browser, files load over a period of a few seconds,
but PhantomJS takes an instantaneous snapshot of the site, before all the assets have had time to load.
It also excludes data and image files (because they're not part of the HAR spec).
You can work around this by modifying the netsniff.js example file.
I've forked that project and made those modifications at the link below. Please note that I've set the timer to wait 20 seconds before generating the har. I've also added a bit of error handling to ignore js errors. The error handling bit was added to deal with phantomjs creating invalid har files if it encountered an error. (I also commented out the function that excludes data/image files)
So this may not be exactly what you want. But it's a starting point for you or anyone else looking to use phantomjs.
After these changes, I went from consistently getting four asset files to about 25.
https://github.com/associatedpress/phantomjs/blob/netsniff-timer/examples/netsniff.js
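To check how many assets a given HAR file actually captured, a small Python sketch like this can help. It relies only on the standard HAR layout (log.entries, each with a response.content.mimeType); summarize_har and summarize_har_file are hypothetical helper names.

```python
import json
from collections import Counter


def summarize_har(har):
    """Return (number of entries, {mime type: count}) for a parsed HAR dict."""
    entries = har["log"]["entries"]
    by_type = Counter(
        # Drop parameters such as "; charset=utf-8" from the MIME type.
        entry["response"]["content"].get("mimeType", "unknown").split(";")[0].strip()
        for entry in entries
    )
    return len(entries), dict(by_type)


def summarize_har_file(path):
    """Load a HAR file from disk and summarize it."""
    with open(path, encoding="utf-8") as handle:
        return summarize_har(json.load(handle))
```

Running summarize_har_file("out.har") on the output of both the stock and the modified netsniff.js makes the difference in captured assets easy to compare.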

Bought Magento Extension - weird encryption worried it's malicious

I just bought an extension for Magento. When I checked the files, I saw that some are encrypted in a really weird way I had never seen before, and from some of the function names and includes it looks like in some places it fetches content from external files...
Anyway, I would like to be able to decode this to see whether the extension does anything malicious. I paid $300 for it and I'm a little worried about putting it in my shop if I don't know that the extension is clean.
The code in the encrypted files looks like this:
if (isset ($††††††††††††††††††††††->Items->Item)){if (is_array($††††††††††††††††††††††->Items->Item)){$†††††††††††††††††††††††=$††††††††††††††††††††††->Items->Item;}else {$†††††††††††††††††††††††=array ($††††††††††††††††††††††->Items->Item);}}else {return array (0,0);}self::_getExistingsProducts(chr(97).chr(109).chr(97).chr(122).chr(111).chr(110).chr(105).chr(109).chr(112).chr(111).chr(114).chr(116).chr(98).chr(111).chr(111).chr(107).chr(115));$††††††††††††††††††††††††=array ();$†††††††††††††††††††††††††=array ();foreach ($††††††††††††††††††††††† as $††††††††††††††††††††††††††=>$††††††††††††††){$†††††††††††††††††††††††††††=array (chr(97).chr(115).chr(105).chr(110)=>$††††††††††††††->ASIN,chr(115).chr(107).chr(117)=>self::_getProductSku($††††[chr(115).chr(107).chr(117)],$††††††††††††††),);if (in_array($†††††††††††††††††††††††††††[chr(97).chr(115).chr(105).chr(110)],$††††††††††††††††††††††††)|| in_array($†††††††††††††††††††††††††††[chr(115).chr(107).chr(117)],$†††††††††††††††††††††††††)|| self::_existsProduct($†††††††††††††††††††††††††††,true)){$††††††††††††††††††++ ;continue ;}try {$††††††††††††††††††††††††[]=$†††††††††††††††††††††††††††[chr(97).chr(115).chr(105).chr(110)];$†††††††††††††††††††††††††[]=$†††††††††††††††††††††††††††[chr(115).chr(107).chr(117)];$††††††††††††††††††††††††††††[]=$††††[chr(108).chr(111).chr(99).chr(97).chr(108)];$††††††††††††††††††††††††††††[]=$†††††††††††††††††††††††††††[chr(97).chr(115).chr(105).chr(110)];$††††††††††††††††††††††††††††[]=$†††††††††††††††††††††††††††[chr(115).chr(107).chr(117)];$††††††††††††††††††††††††††††[]=self::getProductCategories($††††††††††††††,true);$††††††††††††††††††††††††††††[]=isset ($††††††††††††††->ItemAttributes->Title)?$††††††††††††††->ItemAttributes->Title:$††††††††††††††->ASIN;$††††††††††††††††††††††††††††[]=self::_getImagesCount($††††††††††††††);$††††††††††††††††††††††††††††[]=$††††††††††††††->DetailPageURL;list 
($†††††††††††††††††††††††††††††,$††††††††††††††††††††††††††††††,$†††††††††††††††††††††††††††††††,$††††††††††††††††††††††††††††††††,$†††††††††††††††††††††††††††††††††)=WP_Amazonimportproducts_Model_Amazonoffer::getOfferInfo($††††††††††††††,$††††[chr(100).chr(101).chr(102).chr(97).chr(117).chr(108).chr(116).chr(80).chr(114).chr(105).chr(99).chr(101)]);$††††††††††††††††††††††††††††[]=$†††††††††††††††††††††††††††††;$††††††††††††††††††††††††††††[]=floatval($††††[chr(100).chr(101).chr(102).chr(97).chr(117).chr(108).chr(116).chr(80).chr(114).chr(105).chr(99).chr(101).chr(80).chr(108).chr(117).chr(115).chr(80).chr(101).chr(114).chr(99).chr(101).chr(110).chr(116)]);$††††††††††††††††††††††††††††[]=floatval($††††[chr(100).chr(101).chr(102).chr(97).chr(117).chr(108).chr(116).chr(80).chr(114).chr(105).chr(99).chr(101).chr(80).chr(108).chr(117).chr(115).chr(85).chr(110).chr(105).chr(116)]);$††††††††††††††††††††††††††††[]=$††††††††††††††††††††††††††††††;$††††††††††††††††††††††††††††[]=$†††††††††††††††††††††††††††††††;$††††††††††††††††††††††††††††[]=$††††††††††††††††††††††††††††††††;$††††††††††††††††††††††††††††[]=$†††††††††††††††††††††††††††††††††;$††††††††††††††††††††††††††††[]=self::_getProductDetail($††††††††††††††);$††††††††††††††††††††††††††††[]=serialize($††††††††††††††);$††††††††††††††††††††††††††††[]=$††††††††††††††††††††;$††††††††††††††††††††††††††††[]=0;$††††††††††††††††††††††††††††[]=date(chr(89).chr(45).chr(109).chr(45).chr(100).chr(32).chr(72).chr(58).chr(105).chr(58).chr(115));$††††††††††††††††††††††††††††[]=$††††[chr(100).chr(101).chr(102).chr(97).chr(117).chr(108).chr(116).chr(80).chr(114).chr(105).chr(99).chr(101)];$††††††††††††††††††††††††††††[]=$††††[chr(100).chr(101).chr(102).chr(97).chr(117).chr(108).chr(116).chr(67).chr(111).chr(110).chr(100).chr(105).chr(116).chr(105).chr(111).chr(110)];$†††††††††††++ ;}catch (Exception 
$†††††††††††††††††){Mage::helper(chr(97).chr(109).chr(97).chr(122).chr(111).chr(110).chr(105).chr(109).chr(112).chr(111).chr(114).chr(116).chr(112).chr(114).chr(111).chr(100).chr(117).chr(99).chr(116).chr(115))->{"\x6c\x6f\x67"}($†††††††††††††††††->{"\x67\x65\x74\x4d\x65\x73\x73\x61\x67\x65"}(),chr(71).chr(101).chr(116).chr(32).chr(66).chr(111).chr(111).chr(107).chr(32).chr(73).chr(110).chr(102).chr(111).chr(32).chr(102).chr(114).chr(111).chr(109).chr(32).chr(65).chr(109).chr(97).chr(122).chr(111).chr(110).chr(32).chr(65).chr(80).chr(73).chr(44).chr(32).chr(82).chr(101).chr(115).chr(112).chr(111).chr(110).chr(115).chr(101).chr(32).chr(69).chr(114).chr(114).chr(111).chr(114),chr(105).chr(109).chr(112).chr(111).chr(114).chr(116));}}
This is only a small excerpt from one of the encrypted files, and I'm looking for a way to display the code unencoded so I can check it. Sorry for the format, I am still trying to figure out how to format huge blocks of code correctly, maybe someone is nice enough to edit it for me?

Well, you will have to work with find-and-replace tools. First, get the result of each PHP function call (php -r "echo chr(107);") and replace all occurrences of chr(107) with the result ("k"). Then do the same for escaped strings such as "\x67\x65\x74\x4d\x65\x73\x73\x61\x67\x65" (php -r 'echo "\x67\x65\x74\x4d\x65\x73\x73\x61\x67\x65";'), which decodes to getMessage. Finally, rename all the †-variables in the methods to human-readable names.
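The find-and-replace passes described above can be automated. Below is a minimal Python sketch, assuming the obfuscation only uses chr(N) chains joined with ".", double-quoted "\xNN" strings, and variables made of † characters, as in the excerpt. The function names and the $vN naming scheme (N is the number of daggers) are invented for this example, and the character codes are assumed to be plain ASCII.

```python
import re

CHR_CHAIN = re.compile(r"\bchr\(\d+\)(?:\.chr\(\d+\))*")
HEX_STRING = re.compile(r'"((?:\\x[0-9a-fA-F]{2})+)"')
DAGGER_VAR = re.compile(r"\$(†+)")


def _quote(text):
    # Emit a single-quoted PHP literal, escaping backslashes and quotes.
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


def _decode_chr_chain(match):
    codes = re.findall(r"chr\((\d+)\)", match.group(0))
    return _quote("".join(chr(int(code)) for code in codes))


def _decode_hex_string(match):
    codes = re.findall(r"\\x([0-9a-fA-F]{2})", match.group(1))
    return _quote("".join(chr(int(code, 16)) for code in codes))


def _rename_variable(match):
    # Variables differ only in length, so the length is a unique name.
    return "$v{}".format(len(match.group(1)))


def deobfuscate(php_source):
    """Apply the three replacement passes to a string of PHP source."""
    decoded = CHR_CHAIN.sub(_decode_chr_chain, php_source)
    decoded = HEX_STRING.sub(_decode_hex_string, decoded)
    return DAGGER_VAR.sub(_rename_variable, decoded)
```

This only makes the code readable; it does not judge whether it is malicious. The output still has to be read by hand, and running it through a PHP pretty-printer afterwards would help with the missing line breaks.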

How to work with hook_nodeapi after image thumbnail creation with ImageCache

A bit of a followup from a previous question.
As I mentioned in that question, my overall goal is to call a Ruby script after ImageCache does its magic with generating thumbnails and whatnot.
Sebi's suggestion from this question involved using hook_nodeapi.
Sadly, my Drupal knowledge of creating modules and/or hacking into existing modules is pretty limited.
So, for this question:
Should I create my own module or attempt to modify the ImageCache module?
How do I go about getting the generated thumbnail path (from ImageCache) to pass into my Ruby script?
edit
I found this question searching through SO...
Is it possible to do something similar in the _imagecache_cache function that would do what I want?
ie
function _imagecache_cache($presetname, $path) {
...
...
// check if deriv exists... (file was created between apaches request handler and reaching this code)
// otherwise try to create the derivative.
if (file_exists($dst) || imagecache_build_derivative($preset['actions'], $src, $dst)) {
imagecache_transfer($dst);
// call ruby script here
call('MY RUBY SCRIPT');
}

Don't hack into ImageCache; remember, every time you hack core/contrib modules, God kills a kitten ;)
You should create a module that implements hook_nodeapi. Look at the API documentation to find the correct entry point for your script; nodeapi is invoked at various stages of the node process, so you have to pick the right one for you (it should become clear when you check the link): http://api.drupal.org/api/function/hook_nodeapi
You won't be able to call the function you've shown because it is private, so you'll have to find another route.
You could try to build the path manually: you should be able to pull out the filename of the uploaded file and append it to the directory structure. It's ugly, but it should work. For example,
if the uploaded file is called test123.jpg, then it should be in /files/imagecache/thumbnails/test123.jpg (or something similar).
Hope it helps.
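The manual path building amounts to something like this Python sketch. The files/imagecache/<preset>/<filename> layout is the guess from the answer above, not a confirmed ImageCache rule, and imagecache_derivative_path, the "thumbnails" preset name and the files_dir default are assumptions for illustration.

```python
import posixpath


def imagecache_derivative_path(original_path, preset="thumbnails", files_dir="files"):
    """Guess where the derivative of an uploaded image would be stored.

    Assumed layout: /<files_dir>/imagecache/<preset>/<original filename>.
    """
    filename = posixpath.basename(original_path)
    return posixpath.join("/", files_dir, "imagecache", preset, filename)
```

The resulting string is what would be passed to the external script as its argument, once the hook has the original upload path in hand.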
