Using the diff command - unix

So I am trying to compare a binary file I make when I compile with gcc to an sample executable that is provided. So I used the command diff and went like this
diff asgn2 sample-asgn2
Binary files asgn2 and sample-asgn2 differ
Is there any way to see how they differ? Instead of it just displaying that differ.

Do a hex dump of the two binaries using hexdump. Then you can compare the hex dump using your favorite diffing tool, like kdiff3, tkdiff, xxdiff, etc.

Why don't you try Vbindiff? It probably does what you want:
Visual Binary Diff (VBinDiff) displays files in hexadecimal and ASCII (or EBCDIC). It can also display two files at once, and highlight the differences between them. Unlike diff, it works well with large files (up to 4 GB).
Where to get Vbindiff depends on which operating system you are using. If Ubuntu or another Debian derivative, apt-get install vbindiff.

I'm using Linux,in my case,I need a -q option to just show what you got.
diff -q file1 file2
without -q option it will show which line is differ and display that line.
you may check with man diff to see the right option to use in your UNIX.

vbindiff only do byte-to-byte comparison. If there is just one byte addition/deletion, it will mark all subsequent bytes changed...
Another approach is to transform the binary files in text files so they can be compared with the text diff algorithm.
colorbindiff.pl is a simple and open-source perl script which uses this method and show a colored side-by-side comparison, like in a text diff. It highlights byte changes/additions/deletions. It's available on GitHub.

Related

ZSH - print java version in right prompt

I have a daily use case where I need to work with projects on different version of Java (8, 11, ...).
I would like to have it displayed in the right side prompt in my shell (ZSH with Oh-My-Zsh). I know of a dummy way (computationally expensive) to do it (just java --version to var and display it). I would like it to have it cached until I don't source a file (which is a specific project file that sets the new env vars for different java versions).
Do you have any ideas how to do this efficiently?
Br,
Stjepan
The PROMPT and RPROMPT variables can have both static and dynamic parts, so you can set the version there when you source the project file, and it will only be calculated one time. The trick is to get the quoting right.
This line goes in the project file that sets the env variables, somewhere after setting PATH to include the preferred java executable:
RPROMPT="${${=$(java --version)}[1,3]}"
The pieces:
RPROMPT= - variable for the right-side prompt.
"..." - the critical part. Variables in double quotes will be expanded then and there, so the commands within this will only be executed when the project file is sourced.
${...[1,3]} - selects the first three words of the enclosed expression. On my system, java --version returns three lines of data, which is way too big for a prompt; this reduces it to something manageable.
${=...} - splits the enclosed value into words.
$(java --version) - jre version info.

GNAT Metric and RTL files

For running GNAT metric (for Windows, GPL 2017 or CE 2018) I'd like to include the RTL sources as well. There is a "-a" switch but it seems to be ineffective. When I'm forcing visibility of RTL sources, only ada.ads and system.ads are processed. Guessing it is a "crunched name" issue (RTL file names forced to 8 character names) I've tried other tricks without success.
My question is: is there a way to get the RTL source metrics (of the source files actually used) with GNAT Metric?
I'm using the command
gnatmetric -a -xs -nt -j0 -Pmyproj.gpr -U somemain.adb
TIA
In the meantime I've found a workaround by using the gnathtml.pl script.
I've customized the script a bit by removing the H1 headers.
The result is a few hundreds of HTML files with the sources of units actually used: the script does find all dependencies, recursively, through the .ali files - including the RTL.
Then I group the HTML files together, convert them back to text files, pass them through Adalog's Normalize tool for removing comments and empty lines, count lines with the wc command, and the job is done.

Output file numbering in graphicsmagick

I'm trying to convert pdf to images using this command:
gm convert ./file.pdf -scene 1 thumbs/thumb%02d.jpg
Although I specify -scene argument, it does nothing, as I get output files starting from thumb00.jpg. And I need them to start from thumb01.jpg.
I'm using GraphicsMagick 1.3.12.
What am I doing wrong here?
In order to ensure numbered output files, add the +adjoin option like:
gm convert ./file.pdf -scene 1 +adjoin thumbs/thumb%02d.jpg
This additional requirement was added by GraphicsMagick 1.3.15. It is ok to use the same option for all earlier releases.
There is still an inability to specify the starting scene number. This is a known bug.

Compress EACH LINE of a file individually and independently of one another? (or preserve newlines)

I have a very large file (~10 GB) that can be compressed to < 1 GB using gzip. I'm interested in using sort FILE | uniq -c | sort to see how often a single line is repeated, however the 10 GB file is too large to sort and my computer runs out of memory.
Is there a way to compress the file while preserving newlines (or an entirely different method all together) that would reduce the file to a small enough size to sort, yet still leave the file in a condition that's sortable?
Or any other method of finding out / countin how many times each line is repetead inside a large file (a ~10 GB CSV-like file) ?
Thanks for any help!
Are you sure you're running out of the Memory (RAM?) with your sort?
My experience debugging sort problems leads me to believe that you have probably run out of diskspace for sort to create it temporary files. Also recall that diskspace used to sort is usually in /tmp or /var/tmp.
So check out your available disk space with :
df -g
(some systems don't support -g, try -m (megs) -k (kiloB) )
If you have an undersized /tmp partition, do you have another partition with 10-20GB free? If yes, then tell your sort to use that dir with
sort -T /alt/dir
Note that for sort version
sort (GNU coreutils) 5.97
The help says
-T, --temporary-directory=DIR use DIR for temporaries, not $TMPDIR or /tmp;
multiple options specify multiple directories
I'm not sure if this means can combine a bunch of -T=/dr1/ -T=/dr2 ... to get to your 10GB*sortFactor space or not. My experience was that it only used the last dir in the list, so try to use 1 dir that is big enough.
Also, note that you can go to the whatever dir you are using for sort, and you'll see the acctivity of the temporary files used for sorting.
I hope this helps.
As you appear to be a new user here on S.O., allow me to welcome you and remind you of four things we do:
. 1) Read the FAQs
. 2) Please accept the answer that best solves your problem, if any, by pressing the checkmark sign. This gives the respondent with the best answer 15 points of reputation. It is not subtracted (as some people seem to think) from your reputation points ;-)
. 3) When you see good Q&A, vote them up by using the gray triangles, as the credibility of the system is based on the reputation that users gain by sharing their knowledge.
. 4) As you receive help, try to give it too, answering questions in your area of expertise
There are some possible solutions:
1 - use any text processing language (perl, awk) to extract each line and save the line number and a hash for that line, and then compare the hashes
2 - Can / Want to remove the duplicate lines, leaving just one occurence per file? Could use a script (command) like:
awk '!x[$0]++' oldfile > newfile
3 - Why not split the files but with some criteria? Supposing all your lines begin with letters:
- break your original_file in 20 smaller files: grep "^a*$" original_file > a_file
- sort each small file: a_file, b_file, and so on
- verify the duplicates, count them, do whatever you want.

console print w/o scrolling

I see console apps print colors and seen apps such as ffmpeg print text over itself instead of a new line. How do I print over an existing line? I want to display fps in my console app either at the very top or very bottom and have regular printfs go there and scroll normally.
I need this for windows, but this is meant to be cross platform, so I will eventually have a linux and mac implementation.
There is two simple possibilities which work on linux as well as windows, but only for one line:
printf("\b"); will return for one character, so you might count how many character you want to backspace and fire this in a loop, or you know that you only write n numbers and do it likeprintf("\b\b\b\b\b\b\b\b\b\b");
printf("text to be overwritten by next printf\r"); this will return the cursor to the beginning of the line, so any next printf will overwrite it. Make sure to write a string of same length or longer so you overwrite it entirely.
If you want to rewrite several lines, there is nothing so portable as ncurses, there is libs for it on practically every operating system, and you don't have to take care of the ANSI-differences.
edit: added link to ncurses wikipedia page, gives great overview and introduction, as well as link list and maybe a translation to your preferred language
Check out ncurses. It has bindings for most scripting languages.
You can use '\r' instead of '\n'.
The ASCII character number 8 (A.K.A. Ctrl-H, BS or Backspace) lets you back up one character. ASCII Character number 13 (A.K.A Ctrl-M, CR or Carriage Return) returns the cursor at the beggining of the line.
If you are working in C try putchar(8); and putchar(13);
The magic of the colors, cursor locating and bliking and so on are inside ANSI escape codes. Any text console capable of handling ANSI codes can use them just printing them out to console (i.e. by means of echo in a bash script or printf() function in C).
Unix terminals support ANSI escape sequences and Windows world used to support them back in old MS-DOS days, but the multibyte console support put an end to this. There is more information here. However there are other ways out of just ANSI sequences printing available on Windows. Moreover if you have Cygwin installed on your Windows maching ANSI codes work just as great as on any Unix terminal.
Many people mention Ncurses library that is the de-facto standard for any gui-like text based applications. What this library does is to hide all the terminal differences (Windows/Unix flavours) to represent the same information as identical as possible across all the platforms, though from my own experience I tell you this is not always true (i.e. typical text window frames change because the especial chars are not available under all character encodings). The counterpart of using ncurses is that it is a complete API and it is much harder to start out with it than simply writing out some ANSI escape sequences for simple things such as change the font color, cleaning screen or moving back the cursor to a random position.
For the sake of completeness I paste an example of use of ANSI sequence under Linux that changes the prompt to blue and shows the date:
PS1="\[\033[34m\][\$(date +%H%M)][\u#\h:\w]$ "
You can use Ncurses -
ncurses package is a subroutine library for terminal-independent screen-painting and input-event handling which presents a high level screen model to the programmer, hiding differences between terminal types and doing automatic optimization of output to change one screenfull of text into another
Depending on the platform which you are developing on there's probably a more powerful API which you could use, rather than old ASCII control codes.
e.g. If you are working on Win32 you can actually manipulate the console screen buffer directly.
A good place to start might be here
http://msdn.microsoft.com/en-us/library/ms683171(VS.85).aspx
I have been looking for similar functions/API which would allow me to access the console as something other than a stream of text for other platforms. Haven't found anything yet, but then again, I haven't been looking that hard.
Hope it helps.

Resources