Easy way to find failed files using wget -i

I have a list of files (images, PDFs) and I want to download them all via wget (using the -i option). The problem is that I get a different number of files downloaded - some are missing. All files in the list are unique - I already double-checked.
I know that I can compare my list with 'ls' in the folder, but I'm curious whether there is some "wget" way to find the failed operations.
UPDATE:
My problem in steps:
I have list 'list.txt' with URLs
I run command:
wget -i list.txt
For each file I get a long output with a status, e.g.:
--2015-07-07 13:46:34-- http://www.example.com/example.png
Reusing existing connection to www.example.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 261503 (255K) [image/png]
Saving to: 'example.png'
example.png 100%[===================================================>] 255.37K 1.51MB/s in 0.2s
2015-07-07 13:46:34 (1.51 MB/s) - 'example.png' saved [261503/261503]
At the end I get one more message:
FINISHED --2015-07-07 13:46:34--
Total wall clock time: 1m 15s
Downloaded: 109 files, 130M in 1m 11s (1.84 MB/s)
My question is: 'list.txt' has 112 lines, but the result says only 109 files were downloaded, so how can I know which files were not downloaded?
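wget itself doesn't seem to offer a "failed URLs" report, but a small shell loop can diff the list against what actually landed on disk. A minimal sketch, assuming each URL ends in a unique file name and wget saved files under those names:

# print every URL from list.txt whose file is not present in the current directory
while read -r url; do
    [ -e "$(basename "$url")" ] || echo "missing: $url"
done < list.txt

Another option is to simply rerun with wget -nc -i list.txt: -nc (no-clobber) skips files that already exist, so only the missing ones are retried.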

Related

Self-hosted GitLab server with RPi and pitunnel showing HTTP error 413 when trying to push

1. Problem
The git push command returns the following error if one file is larger than ~1MB:
Pushing to http://mygitlabserver.pitunnel.com/root/my_project.git
POST git-receive-pack (1163897 bytes)
error: RPC failed; HTTP 413 curl 22 The requested URL returned error: 413
fatal: the remote end hung up unexpectedly
fatal: the remote end hung up unexpectedly
Everything up-to-date
The server is an RPi 4 with an SSD attached, accessed via pitunnel (standard subscription).
The push fails if one file is larger than 1MB.
The push returns no error even if the commit is 150MB (a lot of small files)
The push returns no error if an mp3 file of multiple MBs gets pushed.
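A client-side setting that often comes up for 413s on push is git's HTTP post buffer: its default is 1 MiB, which may explain the ~1MB threshold, since git switches to chunked transfer encoding above it and some tunnels/proxies mishandle that. It was not the root cause here (see the resolution below), but it is cheap to rule out:

# raise the size below which git sends the pack as one buffered POST (value in bytes)
git config http.postBuffer 524288000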
2. Problem
Not really a problem, but it may be related to the first one.
If a large project that was exported from gitlab.com is imported, it returns the same error:
413 Request Entity Too Large
nginx/1.10.3 (Ubuntu)
But only when connected via pitunnel (link); it works if the project is uploaded on the local network.
Nginx seems to be the problem.
In the gitlab.rb file the following parameters are set, and the GitLab service was restarted according to the GitLab docs:
nginx['enable'] = true
nginx['client_max_body_size'] = '900m'
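For reference, on an Omnibus install changes to gitlab.rb take effect after a reconfigure rather than a plain service restart:

sudo gitlab-ctl reconfigure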
PS: The repo will use git LFS after this problem is solved.
For all with a similar problem:
Pitunnel was the problem.

Is there a way to disable SSL/TLS for GitHub Pages?

I have been looking for a place to store some of my XML schemas publicly without actually having to host them. I decided GitHub Pages would be the ideal platform. I was correct except that I cannot figure out how to turn off SSL/TLS. When I try to fetch my pages with plain old HTTP I get a 301 Moved Permanently response.
So obviously this isn't a big deal. Worst case scenario it takes a little longer to download my schemas, and people generally only use schemas that they've already cached anyway. But is there really no way to turn this off?
From GitHub help:
HTTPS enforcement is required for GitHub Pages sites created after June 15, 2016 and using a github.io domain.
So, you have two solutions:
find a github.io repository older than June 15, 2016
set a custom domain name on your github.io (see the sketch below)
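For the second option, a custom domain is usually attached by committing a CNAME file to the Pages branch. A minimal sketch, where schemas.example.com is a placeholder for your own domain (whose DNS must also point at GitHub Pages):

# tell GitHub Pages to serve the site from a custom domain
echo "schemas.example.com" > CNAME
git add CNAME
git commit -m "Serve pages from a custom domain"
git push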
But is there really no way to turn this off?
No, and a simple curl -L would follow the redirection and get you the page content anyway.
For instance (getting an XML file into a tree structure):
vonc@voncvb C:\test
> curl --create-dirs -L -o .repo/local_manifests/local_manifest.xml https://raw.githubusercontent.com/legaCyMod/android_local_manifest/cm-11.0/local_manifest.xml
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   530  100   530    0     0   1615      0 --:--:-- --:--:-- --:--:--  1743
vonc@voncvb C:\test
> tree /F .
C:\TEST
└───.repo
    └───local_manifests
            local_manifest.xml

IIS: Download large file with wget - connection always fails with Bitrate Throttling

I have a problem making IIS allow file downloads with bitrate throttling. I set the file limit to 100 kB/s. There is no problem without the bitrate limitation, but with the limit there is.
I'm using a code similar to described in this article:
Securing Large Downloads Using C# and IIS 7
I also tried switching off IIS bitrate throttling and controlling the bitrate "by hand", calculating it with TimeSpan and using Thread.Sleep(10) in a while loop...
But all my attempts were useless; I don't get any exceptions.
To test the download I use wget, this way:
wget -t 1 http://db.realestate.ru/yrl/RealEstateExportToYandex.xml
(you can try it with wget for Windows)
This is a 240 MB text file; wget always stops at a random position in the download (5% - 60%) and throws this error message:
Read error at byte ... (Connection reset by peer).
Maybe the problem is not with IIS, because on my localhost it works well, just not online on a highly loaded server.
Solved with these parameters specified in the wget command:
wget -t 1 --header="Keep-Alive: 30000" -nv http://db.realestate.ru/yrl/RealEstateExportToYandex.xml
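As a side note, when a throttled download keeps getting reset, wget can also be told to resume and keep retrying instead of giving up after one attempt (standard wget flags, shown against the same URL):

# -c: resume partial downloads, -t 0: retry indefinitely, with a 60s read timeout
wget -c -t 0 --read-timeout=60 http://db.realestate.ru/yrl/RealEstateExportToYandex.xml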

Exit code for rsync if there is no modification done to the destination folder

I'm using rsync on Solaris and couldn't find an exit code for the case where no file or folder modification, addition, or deletion is done on the destination folder. How can I get the status if rsync doesn't have one?
0 Success
1 Syntax or usage error
2 Protocol incompatibility
3 Errors selecting input/output files, dirs
4 Requested action not supported: an attempt was made to manipulate 64-bit
files on a platform that cannot support them; or an option was specified
that is supported by the client and not by the server.
5 Error starting client-server protocol
6 Daemon unable to append to log-file
10 Error in socket I/O
11 Error in file I/O
12 Error in rsync protocol data stream
13 Errors with program diagnostics
14 Error in IPC code
20 Received SIGUSR1 or SIGINT
21 Some error returned by waitpid()
22 Error allocating core memory buffers
23 Partial transfer due to error
24 Partial transfer due to vanished source files
25 The --max-delete limit stopped deletions
30 Timeout in data send/receive
35 Timeout waiting for daemon connection
Thank you
There is a workaround:
rsync --log-format=%f ...
Note that rsync outputs files anytime any attribute changes, not only if the content of the file is updated.
There is also a -i option (or --log-format=%i) that itemizes all of the changes. See the rsync man page for details of the output format.
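Putting that together, a minimal sketch (src/ and dest/ are placeholder paths): run rsync with itemized output and treat an empty result as "no changes":

# -a: archive mode, -i: itemize changes; with no -v, no output means nothing changed
changes=$(rsync -ai src/ dest/)
if [ -z "$changes" ]; then
    echo "no modification, addition or deletion on the destination"
fi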

Where is the WordPress direct download link?

WordPress is not using direct links for its downloads (it looks like the enterprise-software pattern of generating links dynamically to track installations).
Using wget http://wordpress.org/latest.tar.gz does not get the right file name.
I don't want to save it to my desktop and upload it to the server, because I'm on a slow internet connection.
I fail to see what the problem is:
marc@panic:~$ wget http://wordpress.org/latest.tar.gz
--2011-04-01 11:19:27-- http://wordpress.org/latest.tar.gz
Resolving wordpress.org... 72.233.56.139, 72.233.56.138
Connecting to wordpress.org|72.233.56.139|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-gzip]
Saving to: `latest.tar.gz'
[ <=> ] 2,365,766 1.09M/s
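If the worry is purely the saved file name, wget can also be told explicitly what to call the file (standard -O flag; the name below is just an example):

wget -O wordpress-latest.tar.gz http://wordpress.org/latest.tar.gz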
