rsync - why is it transferring whole file - rsync

I believe my rsync is transferring the entire file each time instead of using its algorithm and only transferring changes. Why is this?
I have a text file called rsynctest. Even if I delete only a single character from the text file on the server end, it appears to be transferring the entire file. The rsync stats show: Total transferred file size is 2.55G and the file size is 2.4G so I believe it transferred the entire file.
Here is the output
/usr/bin/rsync -avrh --progress --compress --stats rsynctest x.x.x.x:/rsynctest
sending incremental file list
test
2.55G 100% 27.56MB/s 0:01:28 (xfer#1, to-check=0/1)
Number of files: 1
Number of files transferred: 1
Total file size: 2.55G bytes
Total transferred file size: 2.55G bytes
Literal data: 50.53K bytes
Matched data: 2.55G bytes
File list size: 39
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 693
Total bytes received: 404.38K
sent 693 bytes received 404.38K bytes 2.54K bytes/sec
total size is 2.55G speedup is 6305.64

It's not actually transferring the whole file. Look at the "Total bytes sent" and "Total bytes received" lines: only 693 bytes went out and 404.38K came back, against a 2.55G file. The "Literal data: 50.53K" and "Matched data: 2.55G" lines say the same thing: almost all of the file was matched against the existing copy rather than sent. What rsync transfers is the checksums of each block of N bytes (where N can vary), and when a given block's checksums turn out identical, that block isn't transferred. But since the file size has changed, it has to check the whole file for differences. And that's what you're seeing in the progress bar: rsync checking the whole file for differences. (Note also the speed of 27.56MB/s. I doubt your Internet connection could actually sustain that. Though if it can, more power to you.)
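To make the block-matching idea concrete, here is a minimal sketch in Python. It is not rsync's actual implementation: real rsync pairs a cheap rolling checksum with a stronger hash, while this toy version simply hashes a window at every offset. The names `block_signatures` and `delta_stats` and the block size are made up for illustration.

```python
import hashlib


def block_signatures(old: bytes, block_size: int) -> set:
    """Digests of every full, aligned block of the old copy of the file."""
    return {
        hashlib.md5(old[i:i + block_size]).digest()
        for i in range(0, len(old) - block_size + 1, block_size)
    }


def delta_stats(old: bytes, new: bytes, block_size: int = 16):
    """Return (matched_bytes, literal_bytes) needed to rebuild `new` from `old`.

    Simplification: a full hash is computed at every offset of `new`,
    which is slow but shows why a small edit only costs a few literal bytes.
    """
    sigs = block_signatures(old, block_size)
    matched = literal = 0
    pos = 0
    while pos < len(new):
        window = new[pos:pos + block_size]
        if len(window) == block_size and hashlib.md5(window).digest() in sigs:
            matched += block_size   # the other side already has this block
            pos += block_size
        else:
            literal += 1            # this byte has to be sent as-is
            pos += 1
    return matched, literal


if __name__ == "__main__":
    # 1024 bytes of distinct 4-byte counters, then delete a single byte.
    old = b"".join(i.to_bytes(4, "big") for i in range(256))
    new = old[:100] + old[101:]
    print(delta_stats(old, new))
```

Deleting one byte shifts everything after it, yet only the bytes around the edit end up as literal data. The whole file still has to be scanned to find that out, which is what the progress line reflects.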

Related

How can I find the first clusters/blocks of a file?

I have a FAT16 drive that contains the following info:
Bytes per sector: 512 bytes (0x200)
Sectors per cluster: 64 (0x40)
Reserved sectors: 6 (0x06)
Number of FATs: 2 (0x02)
Number of root entries: 512 (0x0200)
Total number of sectors: 3805043 (0x3a0f73)
Sectors per file allocation table: 233 (0xE9)
Root directory is located at sector 472 (0x1d8)
I'm looking for a file with the following details:
File name: LOREMI~1
File extension: TXT
File size: 3284 bytes (0x0cd4)
First cluster: 660 (0x294)
I already know that the file's first cluster starts at sector 42616. My problem is: what equation produces 42616?
I have trouble figuring this out, since there is barely any information about it other than a tutorial by Tavi Systems, and the part that covers this is very hard to follow.
Actually, the FAT filesystem is fairly well documented. The official FAT documentation by Microsoft can be found under the filename fatgen103.
The directory entry LOREMI~1.TXT can be found in the root directory and is preceded by the long file name entries (xt, lorem ipsum.t → lorem ipsum.txt). The directory entry is documented in the «FAT Directory Structure» chapter; for FAT16 you are interested in the two bytes at offsets 26 and 27, which hold the cluster address (DIR_FstClusLO): (little endian!) 0x0294, or 660₁₀.
Based on the BPB header information you provided, we can calculate the data sector like this:
data_sector = (cluster-2) * sectors_per_cluster +
(reserved_sectors + (number_of_fats * fat_size) +
first_data_sector)
Why cluster-2? Because cluster numbers 0 and 1 are reserved: the first two FAT entries do not describe data clusters, so the first cluster of the data region is cluster 2. See chapter «FAT Data Structure» in fatgen103.doc.
To solve this, we still need the number of sectors the root directory occupies (named first_data_sector in these formulas; fatgen103 calls it RootDirSectors). For FAT12/16 this can be determined like this:
first_data_sector = ((root_entries * directory_entry_size) +
(bytes_per_sector - 1)) // bytes_per_sector
The directory entry size is always 32 bytes as per specification (see chapter «FAT Directory Structure» in fatgen103.doc), every other value is known by now:
first_data_sector = ((512*32)+(512-1)) // 512 → 32
data_sector = (660-2)*64+(6+(2*233)+32) → 42616
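The same arithmetic as a small Python sketch, using the values from the question. The function name and parameter names are chosen for illustration and are not taken from fatgen103; it only covers FAT12/16, where the root directory has a fixed size.

```python
def cluster_to_sector(cluster, *, bytes_per_sector, sectors_per_cluster,
                      reserved_sectors, number_of_fats, fat_size,
                      root_entries, directory_entry_size=32):
    """First sector of a data cluster on a FAT12/16 volume."""
    # Sectors occupied by the fixed-size root directory, rounded up.
    root_dir_sectors = (
        root_entries * directory_entry_size + bytes_per_sector - 1
    ) // bytes_per_sector
    # The data region starts after the reserved area, the FATs and the root directory.
    data_region_start = (
        reserved_sectors + number_of_fats * fat_size + root_dir_sectors
    )
    # Cluster numbering in the data region starts at 2.
    return data_region_start + (cluster - 2) * sectors_per_cluster


if __name__ == "__main__":
    sector = cluster_to_sector(
        660,
        bytes_per_sector=512,
        sectors_per_cluster=64,
        reserved_sectors=6,
        number_of_fats=2,
        fat_size=233,
        root_entries=512,
    )
    print(sector)  # 42616, the sector given in the question
```

As a cross-check, cluster 2 maps to 6 + 2*233 + 32 = 504, which is the root directory's start sector (472) plus its 32 sectors.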

Appending 0's to make 1MB file with dd

I have a binary file abc.bin that is 512 bytes long, and I need to generate a 1M (1024 x 1024 = 1048576) byte file by appending 0's (0x00) to abc.bin. How can I do that with the dd utility?
For example, abc.bin has 512 bytes of 0x01 ("11 ... 11"), and I need to have a helloos.bin that is 1048576 bytes ("11 ... 11000 ... 000"); the 0 is not '0', but 0x00, and the number of 0x00 is 1048576 - 512.
You can tell dd to seek to the 1M position in the file, which has the effect of making its size at least 1M:
dd if=/dev/null of=abc.bin obs=1M seek=1
If you want to ensure that dd only extends, never truncates the file, add conv=notrunc:
dd if=/dev/null of=abc.bin obs=1M seek=1 conv=notrunc
If you're on a system with GNU coreutils (like, just about any Linux system), you can use the truncate command instead of dd:
truncate --size=1M abc.bin
If you want to be sure the file is only extended, never truncated:
truncate --size=\>1M abc.bin
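I'm assuming you actually mean to allocate 1M of zeroes on the disk, not just have a file whose reported length is 1MiB and reads as zeroes. In that case, append 2047 blocks of 512 zero bytes to the existing 512-byte file (512 + 2047 x 512 = 1048576):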
dd if=/dev/zero count=2047 bs=512 >> abc.bin
This method also works:
Create a 1M file of 0x00 bytes: dd if=/dev/zero of=helloos.bin bs=512 count=2048
Write abc.bin over the start of the created file: dd of=helloos.bin conv=notrunc if=abc.bin
Without the conv=notrunc option, the result is only a 512-byte file. seek=N can also be used to control the start position by skipping N blocks of the output.
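For comparison, here is a minimal Python sketch of the same idea as the `dd if=/dev/zero ... >> abc.bin` approach: it appends real zero bytes until the file reaches the target size. The helper name `pad_with_zeros` is hypothetical, and refusing to shrink a larger file is a choice made for this sketch.

```python
import os


def pad_with_zeros(path, target_size=1024 * 1024):
    """Append 0x00 bytes to `path` until it is `target_size` bytes long.

    Returns the number of bytes appended. Raises ValueError if the file is
    already larger than the target (this sketch never truncates).
    """
    current = os.path.getsize(path)
    if current > target_size:
        raise ValueError(f"{path} is already {current} bytes")
    missing = target_size - current
    with open(path, "ab") as f:       # append mode keeps the existing bytes
        f.write(b"\x00" * missing)
    return missing
```

Calling `pad_with_zeros("abc.bin")` on the 512-byte file from the question would append 1048576 - 512 zero bytes in place.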

Is 1MB = 1000000 bytes OR 1048576 considering file size

I have to place a 1MB size check on my files to be uploaded. In code (using C#) I have to specify the size in bytes. Should I check the size of the uploaded file against
MaxSizeInBytes = 1048576 or MaxSizeInBytes = 1000000?
In most operating systems, files of 1048576 bytes will show as "1 MB" (and more importantly, files of 1000000 bytes will show as "less than 1MB"). So the user will have the expectation to be able to upload such a file if the form says it accepts "1 MB files".

Why a hex file is used in burning program in micro controller?

Whenever we program a microcontroller, we convert the C file into a hex file and then burn that into the controller.
My question is: why only a hex file? Is that hex file a hexadecimal version of the binary executable?
If yes, then why don't we use a binary file instead?
If you are talking about an "Intel HEX" file, the reason is that it is ASCII, which makes it easy to examine and parse. True, it is inefficient in one way, but compared to a raw binary it might be smaller. With a raw binary you have at most one associated address, the starting address, and it is not embedded in the file. An Intel HEX file, or a Motorola S-record (a similar and often used format), carries addresses inside the file. Both the ihex and srec formats are basically lines of ASCII hex numbers that represent a record type, a starting address, a length, data, and a checksum. There are non-data lines in there, but much of it will be data. So if your program has a few bytes at address 0x1000 and a few bytes at 0x80000000, then a .bin file would be at its smallest 0x80000000-0x1000 plus a few bytes, but would typically be 0x80000000 plus a few bytes (right, 2 gigabytes), whereas an ihex or srec would be in the dozens of bytes total. The ihex and srec formats also have built-in checksums to help protect against corrupt files; not perfect of course, but better than nothing at all.
Since then, ELF, COFF, and other formats have become popular. These are also based on blocks of data rather than a complete memory image. They are binary, not ASCII, formats, but they are not just a memory image: chunks of data are provided along with address, type, etc.
Because ihex and srec are so simple to create and parse, they will continue to be used for a long time. It does not take a lot of resources in a bootloader, for example, to handle receiving an ihex or srec file. (The same goes for a binary of course, but the binary has a lot of fill data in it, costing a lot of unnecessary transmission time.)
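To show how little parsing an Intel HEX line takes, here is a minimal Python sketch. It assumes the usual record layout (a colon, then byte count, 16-bit address, record type, data, and a checksum that makes all bytes sum to zero modulo 256). The function name is made up, and the sketch does not interpret the extended address record types.

```python
def parse_ihex_record(line: str):
    """Parse one Intel HEX line into (record_type, address, data)."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])
    if len(raw) < 5:
        raise ValueError("record too short")
    length = raw[0]                              # number of data bytes
    address = int.from_bytes(raw[1:3], "big")    # 16-bit load offset
    record_type = raw[3]                         # 0 = data, 1 = end of file
    if len(raw) != length + 5:
        raise ValueError("length field does not match record size")
    if sum(raw) & 0xFF != 0:                     # checksum byte included
        raise ValueError("checksum mismatch")
    return record_type, address, raw[4:4 + length]


if __name__ == "__main__":
    rtype, addr, data = parse_ihex_record(
        ":10010000214601360121470136007EFE09D2190140"
    )
    print(rtype, hex(addr), len(data))
```

A bootloader would do the same thing line by line, writing `data` at `address` for data records and stopping at the end-of-file record (`:00000001FF`).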

What do the numbers in rsync's output mean?

When I run rsync with the --progress flag, I get information about the transfers as follows.
path/to/file
16 100% 0.01kB/s 0:00:01 (xfer#10857, to-check=427700/441502)
What do the numbers in the second row mean? I know what some of them are, but what do the others mean (marked with ??? below)?
16 ???
100% amount of transfer completed in this file
0.01kB/s speed of current file transfer
0:00:01 time elapsed in current file transfer
10857 count of files transferred
427700 ???
441502 ???
When the file transfer finishes, rsync replaces the progress line with a summary line that looks like this:
1238099 100% 146.38kB/s 0:00:08 (xfer#5, to-check=169/396)
In this example, the file was 1238099 bytes long in total, the average rate of transfer for the whole file was 146.38 kilobytes per second over the 8 seconds that it took to complete, it was the 5th transfer of a regular file during the current rsync session, and there are 169 more files for the receiver to check (to see if they are up-to-date or not) remaining out of the 396 total files in the file-list.
from http://samba.anu.edu.au/ftp/rsync/rsync.html under --progress switch
path/to/file
16 100% 0.01kB/s 0:00:01 (xfer#10857, to-check=427700/441502)
The 16 is the number of bytes of this file transferred so far. The 100% is the percentage of the file transferred. For very short files the kB/s number often comes out a bit weird: small measuring errors cause big differences in the calculated speed. Then there is the elapsed time, then the transfer number. Next come the number of files left to check and the total. In the example given, 441502 - 427700 = 13802 files have been checked so far, and 10857 of them needed to be transferred; based on the modification times, rsync decided that no transfer was needed for the others. Modern rsync implementations build the list that counts towards the "total" on the fly, only adding to it when the unchecked number drops below 1000.
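If you want to pull these numbers out in a script, here is a minimal Python sketch that parses a summary line in the format shown above. The regular expression and field names are my own, and other rsync versions may word this line differently, so treat it as an illustration rather than a robust parser.

```python
import re

# Matches lines like:
#   16 100% 0.01kB/s 0:00:01 (xfer#10857, to-check=427700/441502)
PROGRESS_RE = re.compile(
    r"^\s*(?P<size>\S+)\s+(?P<percent>\d+)%\s+(?P<rate>\S+)\s+"
    r"(?P<elapsed>\d+:\d{2}:\d{2})\s+"
    r"\(xfer#(?P<xfer>\d+), to-check=(?P<to_check>\d+)/(?P<total>\d+)\)"
)


def parse_progress(line: str):
    """Return the fields of an rsync --progress summary line, or None."""
    m = PROGRESS_RE.match(line)
    if m is None:
        return None
    to_check = int(m["to_check"])
    total = int(m["total"])
    return {
        "size": m["size"],                 # bytes of this file transferred
        "percent": int(m["percent"]),
        "rate": m["rate"],
        "elapsed": m["elapsed"],
        "transfer_number": int(m["xfer"]),
        "to_check": to_check,              # files still to be checked
        "total": total,                    # files in the file list so far
        "checked": total - to_check,
    }


if __name__ == "__main__":
    info = parse_progress(
        "16 100% 0.01kB/s 0:00:01 (xfer#10857, to-check=427700/441502)"
    )
    print(info["transfer_number"], info["checked"])
```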
