I have a URL in my custom module which runs a long script. If i call url via wget it downloads the page content. It doesn't run the script. How to do it?
I would have thought that even though it downloaded the page it would still run the script.
To run without downloading the file use:
wget -O - -q -t 1 http://example.com/path/to/file.php
From memory:
-O and the hyphen are redirecting the output so it's not saved to a file.
-q is for quiet
-t is the number of attempts.
You can use man wget to look any more options up.
Related
I had been using a proxy for a long time. Now I need to remove it. I have forgotten how I have added the proxy to wget. Can someone please help me get back to the normal wget where it doesn't use any proxy. As of now, I'm using
wget <link> --proxy=none
But I'm facing a problem when I'm installing using a pre-written script. It's painstaking to search all through the scripts and change each command.
Any simpler solution will be very much appreciated.
Thanks
Check your
~/.wgetrc
/etc/wgetrc
and remove proxy settings.
Or use wget --no-proxy command line option to override them.
In case your OS is alpine/busybox then the wget might vary from the one used by #Logu.
There the correct command is
wget --proxy off http://server:port/
Running wget --help outputs:
/ # wget --help
BusyBox v1.31.1 () multi-call binary.
Usage: wget [-c|--continue] [--spider] [-q|--quiet] [-O|--output-document FILE]
[-o|--output-file FILE] [--header 'header: value'] [-Y|--proxy on/off]
[-P DIR] [-S|--server-response] [-U|--user-agent AGENT] [-T SEC] URL...
Retrieve files via HTTP or FTP
--spider Only check URL existence: $? is 0 if exists
-c Continue retrieval of aborted transfer
-q Quiet
-P DIR Save to DIR (default .)
-S Show server response
-T SEC Network read timeout is SEC seconds
-O FILE Save to FILE ('-' for stdout)
-o FILE Log messages to FILE
-U STR Use STR for User-Agent header
-Y on/off Use proxy
I'm using wget to download all jpegs from a website.
I searched a lot and this should be the way:
wget -r -nd -A jpg "http://www.hotelninfea.com"
This should recursively -r download files jpegs -A jpg and store all files in a single directory, without recreating website directory tree -nd
Running this command downloads only the jpegs from the homepage of the website, not the whole jpegs of all the website.
I know that a jpeg file could have different extensions (jpg, jpeg) and so on, but this is not the case, also there aren't any robots.txt restrictions acting.
If I remove the filter from the previous command, it works as expected
wget -r -nd "http://www.hotelninfea.com"
This is happening on Lubuntu 16.04 64bit, wget 1.17.1
Is this a bug or I am misunderstanding something?
I suspect that this is happening because the main page you mention contains links to the other pages in the form http://.../something.php, i.e., there is an explicit extension. Then the option -A jpeg has the "side-effect" of removing those pages from the traversal process.
Perhaps a bit dirty workaround in this particular case would be something like this:
wget -r -nd -A jpg,jpeg,php "http://www.hotelninfea.com" && rm -f *.php
i.e., to download only the necessary extra pages and then delete them if wget successfully terminates.
ewcz anwer pointed me to the right way, the --accept acclist parameter has a dual role, it define the rules of file saving and the rules of following links.
Reading deeply the manual I found this
If ‘--adjust-extension’ was specified, the local filename might have ‘.html’ appended to it. If Wget is invoked with ‘-E -A.php’, a filename such as ‘index.php’ will match be accepted, but upon download will be named ‘index.php.html’, which no longer matches, and so the file will be deleted.
So you can do this
wget -r -nd -E -A jpg,php,asp "http://www.hotelninfea.com"
But of course a webmaster could have been using custom extensions
So I think that the most robust solution would be a bash script, something
like
WEBSITE="http://www.hotelninfea.com"
DEST_DIR="."
image_urls=`wget -nd --spider -r "$WEBSITE" 2>&1 | grep '^--' | awk '{ print $3 }' | grep -i '\.\(jpeg\|jpg\)'`
for image_url in $image_urls; do
DESTFILE="$DEST_DIR/$RANDOM.jpg"
wget "$image_url" -O "$DESTFILE"
done
--spider wget will not download the pages, just check that they are there
$RANDOM asks a random number to the operating system
I use Wget to download all files in a directory of a site. But it was interrupted. How could I make it to continue?
You want wget -c url, where -c stands for continue and url is the URL of the interrupted file.
I'm trying to figure out how to install Pear on my Mac (10.6.6).
Not understanding what they're telling me at pear.php.net, I got some code from http://clickontyler.com/blog/2008/01/how-to-install-pear-in-mac-os-x-leopard/
First, I entered curl http://pear.php.net/go-pear > go-pear.php in my terminal.
It resulted in this output
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 88004 100 88004 0 0 47537 0 0:00:01 0:00:01 --:--:-- 59744
What does that all mean? Am I on the right track?
Next, I entered sudo php -q go-pear.php
and it gave me the long output below. In short I have no idea where I am in the installation process. However, I'm pretty sure that I'm not where I'm supposed to be at following the tutorial at http://clickontyler.com/blog/2008/01/how-to-install-pear-in-mac-os-x-leopard/
because the tutorial tells me to select all the default choices, and I don't see any options to select.
The next line of code is asking me to modify the php.ini files and it requires a password so I'm worried about doing it...Can anyone tell me if I'm on the right track?
sudo cp /etc/php.ini.default /etc/php.ini
Usage: php [options] [-f] <file> [--] [args...]
php [options] -r <code> [--] [args...]
php [options] [-B <begin_code>] -R <code> [-E <end_code>] [--] [args...]
php [options] [-B <begin_code>] -F <file> [-E <end_code>] [--] [args...]
php [options] -- [args...]
php [options] -a
-a Run interactively
-c <path>|<file> Look for php.ini file in this directory
-n No php.ini file will be used
-d foo[=bar] Define INI entry foo with value 'bar'
-e Generate extended information for debugger/profiler
-f <file> Parse and execute <file>.
-h This help
-i PHP information
-l Syntax check only (lint)
-m Show compiled in modules
-r <code> Run PHP <code> without using script tags <?..?>
-B <begin_code> Run PHP <begin_code> before processing input lines
-R <code> Run PHP <code> for every input line
-F <file> Parse and execute <file> for every input line
-E <end_code> Run PHP <end_code> after processing all input lines
-H Hide any passed arguments from external tools.
-s Output HTML syntax highlighted source.
-v Version number
-w Output source with stripped comments and whitespace.
-z <file> Load Zend extension <file>.
args... Arguments passed to script. Use -- args when first argument
starts with - or script is read from stdin
--ini Show configuration file names
--rf <name> Show information about function <name>.
--rc <name> Show information about class <name>.
--re <name> Show information about extension <name>.
--ri <name> Show configuration for extension <name>.
php does not have an argument -q. Its also mentioned in go-pear.php (http://pear.php.net/go-pear) itself, but I dont know, what it wants to tell me. However, try
sudo php go-pear.php
and then follow the instructions.
Update:
-q was used, to start the interpreter in "quiet" mode. It seems, that this option does not exists anymore, because php always starts "quiet", but it should not cause an error, anyway. Now make sure you are in the same directory as the file go-pear.php before you call php go-pear.php.
The first part shows that you successfully downloaded the file to go-pear.php.
The second part is showing that -q isn't a valid option. The third part is asking for the root password, since you're doing 'sudo'.
I used this, though I wasn't installing on Mac:
Getting and installing the PEAR package manager
Greetings to everyone,
I'm on OSX. I use the terminal a lot as a habit from my Linux old days that I never surpassed. I wanted to download the files listed in this http server: http://files.ubuntu-gr.org/ubuntistas/pdfs/
I select them all with the mouse, put them in a txt files and then gave the following command on the terminal:
for i in `cat ../newfile`; do wget http://files.ubuntu-gr.org/ubuntistas/pdfs/$i;done
I guess it's pretty self explanatory.
I was wondering if there's any easier, better, cooler way to download this "linked" pdf files using wget or curl.
Regards
You can do this with one line of wget as follows:
wget -r -nd -A pdf -I /ubuntistas/pdfs/ http://files.ubuntu-gr.org/ubuntistas/pdfs/
Here's what each parameter means:
-r makes wget recursively follow links
-nd avoids creating directories so all files are stored in the current directory
-A restricts the files saved by type
-I restricts by directory (this one is important if you don't want to download the whole internet ;)