Linux file compare binary files

Yeah, you could probably write a small program to read byte-by-byte and point out all of the bytes which are different.

You haven’t written one of those yet? Of all people, I would expect you to be able to provide it. lol This thread might turn into the biggest collection of deviants yet.

I recommend visiting OpenRCE (http://openrce.org). You’re likely to find what you’re looking for there, or find someone who can help you help yourself. 😀

ha ha ha, I have now. It only took a minute to type out and I figure it will come in handy for me too.

Just compile this with a C++ compiler and run it with "./a.out filea fileb" as parameters. It will show the hex dump of file A on the left, the hex dump of file B on the right, and the difference between them in the center. Any bytes that aren’t the same are replaced by "__"; then all you have to do is look at that byte in fileA and fileB for the difference.

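#include <cstdio>
#include <fstream>
#include <iostream>

using namespace std;

// Read unsigned char (byte)
unsigned char readByte(ifstream &thisFile) {
    char buffer = 0;
    thisFile.get(buffer);
    return (unsigned char)buffer;
}

int main(int argc, char** argv)
{
    if (argc < 3) {
        cerr << "usage: ./a.out filea fileb" << endl;
        return 1;
    }
    // Open in binary mode so no byte is translated on the way in
    ifstream binFileA(argv[1], ios::binary);
    ifstream binFileB(argv[2], ios::binary);

    int offset = 0;
    unsigned char byteA;
    unsigned char byteB;

    // The rest of the original listing was lost. This loop is a
    // reconstruction from the description above: one byte per row,
    // file A on the left, file B on the right, "__" in the center
    // wherever the two bytes differ.
    while (true) {
        byteA = readByte(binFileA);
        byteB = readByte(binFileB);
        if (!binFileA || !binFileB) break;   // stop at the end of either file
        if (byteA == byteB) printf("%08x  %02x  %02x  %02x\n", offset, byteA, byteA, byteB);
        else                printf("%08x  %02x  __  %02x\n", offset, byteA, byteB);
        offset++;
    }
    return 0;
}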

Redirect the output, e.g. "./a.out filea fileb > compareFile", and it will send it all to "compareFile".

Hmm. maybe this will turn into a programming challenge thread 😉

lol, well I do a lot of work with binary files too and I’ve never really thought about using a technique like this. Plus it was pretty easy to write, it’s just comparing bytes.

What I need is a hex editor type application that can take two binary files and show the difference between the two files. Does such an application exist for Linux?

You can use cmp with the -l switch to see the results.

For a byte-level diff I tend to abuse the following applications:

hexdump -C file1 > file1.txt
hexdump -C file2 > file2.txt

diff file1.txt file2.txt

Ok, so I have a question. I haven’t really used any of these binary differencing tools/libraries. Consider a case like this:

File 1:
01 45 6a 89 12 46 78 99

File 2:
45 6a 89 12 45 ff 78 99

Do the differencing tools have a method for reporting a "missing" or "added" byte?

Showing how file 2 is different from file 1 would be:
1 -01
5 -46 +45
6 +ff

I don’t think so. However, you do bring up a very interesting feature that could be added to a binary diffing tool. Especially with assembly. That would probably be very helpful!

This inspires me to work on a small toolkit for dealing with binary files.

jblebrun, do you mind if I steal your idea?

Go for it! I stole it from the way text diffing works, anyway. If you do a diff on a text file, it can detect inserted or removed lines. I just transferred that concept to individual bytes. I never come up with my own ideas. I just shuffle other people’s ideas around 😉
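To make the "text diff, but on bytes" idea concrete, here is a minimal sketch in Python using the standard library's difflib.SequenceMatcher on the two example files from the question. The function name byte_diff and the tuple layout are my own choices for illustration, not part of any existing tool.

```python
from difflib import SequenceMatcher

def byte_diff(a: bytes, b: bytes):
    """Return (operation, offset_in_a, removed_hex, added_hex) tuples.

    Hypothetical helper: it treats each byte as a "line" and asks
    difflib for the insert/delete/replace operations that turn a into b.
    """
    ops = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # only report the parts that changed
        ops.append((tag, i1, a[i1:i2].hex(), b[j1:j2].hex()))
    return ops

file1 = bytes.fromhex("01 45 6a 89 12 46 78 99")
file2 = bytes.fromhex("45 6a 89 12 45 ff 78 99")

for op in byte_diff(file1, file2):
    print(op)
```

For the example this reports the dropped leading 01 and the 46 becoming 45 ff, which is the same information as the hand-written diff in the question. SequenceMatcher is quadratic in the worst case, so this is only a sketch for small inputs, not for multi-gigabyte files.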

That problem has a name: "edit distance." The edit distance between two strings is the minimum number of character insertions, deletions, and substitutions required to transform one into the other. IIRC, computational biologists make good use of this to compare DNA sequences.

edit: As a workaround, you can use a combination of diff and hexdump to detect inserted and deleted bytes:

hexdump -e "1 1 \"%02x\n\"" file1 > temp1
hexdump -e "1 1 \"%02x\n\"" file2 > temp2
diff temp1 temp2
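For anyone who wants to see edit distance as a number rather than a diff, here is a rough Python sketch of the textbook dynamic-programming (Levenshtein) computation applied to bytes. The name edit_distance is mine; it is a plain illustration, and its O(n*m) running time makes it unsuitable for large files.

```python
def edit_distance(a: bytes, b: bytes) -> int:
    """Minimum number of byte insertions, deletions and substitutions
    needed to turn a into b (classic Levenshtein distance)."""
    # prev[j] = distance between the bytes of a seen so far and b[:j]
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]  # turning a[:i] into an empty string takes i deletions
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute x by y (free if equal)
            ))
        prev = cur
    return prev[-1]

file1 = bytes.fromhex("01 45 6a 89 12 46 78 99")
file2 = bytes.fromhex("45 6a 89 12 45 ff 78 99")
print(edit_distance(file1, file2))
```

For the two example files the three operations from the question (drop 01, change 46 to 45, add ff) are also the minimum.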

Thanks for giving me the official name for that. Defining a metric like edit distance makes the problem much more clear!

I’m gonna guess that you recalled correctly: if you ask me, it sounds like a problem that’s 100% applicable to DNA.

The answer sort of depends on what you want to know about the files, and how large they are.

If all you care about is whether they differ at all, then use the md5sum program to compute a hash of each file, and compare the hashes.

diff will compare binary files, but like the md5sum approach only tells you if they differ.

cmp will, if called with -l, tell you where and how the files differ.
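If you would rather do the "do they differ at all?" check from a script than with md5sum, something like the following Python sketch should work. The function names file_digest and same_content are made up for this example; hashlib itself is the standard library module.

```python
import hashlib

def file_digest(path, algorithm="md5", chunk_size=1024 * 1024):
    """Hash a file in chunks so large files never have to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def same_content(path_a, path_b, algorithm="md5"):
    """True if both files hash to the same digest."""
    return file_digest(path_a, algorithm) == file_digest(path_b, algorithm)
```

Like md5sum, this only says whether the files differ, not where. Passing algorithm="sha256" is an easy swap if MD5 is not acceptable in your environment.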

I need to compare very large files, which takes several hours. diff doesn’t print any progress, so I wrote a very simple Java program to do the job.

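import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;

public class Test {

    public static void main(String[] args) throws Exception {
        String name = "backup.part1.rar";
        File file1 = new File("/home/bak/1/" + name);
        File file2 = new File("/mnt/" + name);
        try (BufferedInputStream fi1 = new BufferedInputStream(new FileInputStream(file1));
             BufferedInputStream fi2 = new BufferedInputStream(new FileInputStream(file2))) {
            byte[] bs1 = new byte[1024 * 1024];
            byte[] bs2 = new byte[1024 * 1024];
            long pos = 0;
            while (true) {
                System.out.format("%,d\n", pos); // progress: bytes compared so far
                // readNBytes (Java 9+) fills the buffer unless EOF is reached,
                // so equal files always yield equal lengths; plain read() may not.
                int len1 = fi1.readNBytes(bs1, 0, bs1.length);
                int len2 = fi2.readNBytes(bs2, 0, bs2.length);
                if (len1 != len2)
                    throw new Error("files differ in length");
                if (len1 == 0)
                    break; // both files fully read
                // The original listing is cut off here; the rest is a reconstruction.
                for (int i = 0; i < len1; i++) {
                    if (bs1[i] != bs2[i])
                        throw new Error("files differ at byte " + (pos + i));
                }
                pos += len1;
            }
        }
        System.out.println("files are identical");
    }
}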

Edit and compare giant binary files with lfhex

(Edit: After installing 80 MB of dependencies, it still won’t install. I give up.)

My application is two almost-identical .wav files that I created a long time ago and stored in different places. They are not binary-equal, though, so some errors must have occurred in transmission or storage. I want to see if I can figure out which is the original. 🙂

cmp -l Untitled.wav Untitled2.wav
4973185 125 135
93124225 155 145
93212289 361 371
100802177 274 264

So they only differ in four places, by single bits?
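The single-bit guess is easy to check: cmp -l prints the differing bytes in octal, so XOR-ing each pair and counting the set bits tells you how many bits flipped. A small Python sketch using the four lines above (flipped_bits is just a name I picked):

```python
# Output of "cmp -l Untitled.wav Untitled2.wav" from the post above:
# 1-based offset, byte in file 1 (octal), byte in file 2 (octal)
cmp_lines = """4973185 125 135
93124225 155 145
93212289 361 371
100802177 274 264"""

def flipped_bits(line):
    """Return (offset, number_of_differing_bits, xor_mask) for one cmp -l line."""
    offset, a, b = line.split()
    mask = int(a, 8) ^ int(b, 8)   # bits that differ between the two bytes
    return int(offset), bin(mask).count("1"), mask

results = [flipped_bits(line) for line in cmp_lines.splitlines()]
for offset, nbits, mask in results:
    print(f"offset {offset}: {nbits} bit(s) differ, mask 0x{mask:02x}")
```

Every pair XORs to 0x08, so yes: four places, one bit each, and it is the same bit position every time.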

VBINDIFF(1) Christopher J. Madsen VBINDIFF(1)

NAME
vbindiff - hexadecimal file display and comparison

SYNOPSIS
vbindiff file1 [ file2 ]

DESCRIPTION
Visual Binary Diff (VBinDiff) displays files in hexadecimal and ASCII (or EBCDIC). It can also display two files at once, and highlight the differences between them. Unlike diff, it works well with large files (up to 4 GB).

What I need is a hex editor type application that can take two binary files and show the difference between the two files. Does such an application exist for Linux?

Hey. look at this thread. It *might* be what you need.

Have you tried using vbindiff? It’s in the Ubuntu repositories and works very well.

Edit: Looks like it’s already been mentioned. I should really read the whole thread more thoroughly before replying.

Was just looking at a problem that needed this, and found this little command line that gives the number of differing bytes between two files (from commandlinefu):
cmp -l file1 file2 | wc -l

Then just use "ls -l" to get the file size in bytes, and you get your percentage.
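The same count-then-divide calculation can be done in one step. This is only a sketch in Python under the assumption that, like cmp -l, you want to compare the files position by position; percent_different is a hypothetical name and the percentage is taken over the length both files have in common.

```python
def percent_different(path_a, path_b, chunk_size=1 << 20):
    """Percentage of byte positions (over the common length) that differ."""
    differing = 0
    total = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a or not b:
                break  # stop at the end of the shorter file, as cmp does
            differing += sum(x != y for x, y in zip(a, b))
            total += min(len(a), len(b))
    return 100.0 * differing / total if total else 0.0
```

Usage would be something like print(percent_different("file1", "file2")).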

cat f1 | od -x > hex1 && cat f2 | od -x > hex2
diff hex1 hex2

Upon reading this thread, I delved into my old backup discs and found this little utility I wrote while stuck in the world of Windows. It does what cmp -l does, except it also displays the difference in value.

Especially useful if the value change is known, but the location isn’t.

Invoked as follows:
bcmp file1 file2 where file1 and file2 are binary files of equal size.

Its output is something like this for every difference it finds:

Difference at 0x6C14.
file 1: 0x0D 013
file 2: 0xFE 254
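Difference in numbers: +241

And here is the code:

/*
Author: Samuel Rydén, trahojen (at, no abuse) gmail.com
Copyright: 2002, 2009
Purpose: To compare two binary files of equal size, and display
differences in BYTES in a helpful manner for further
processing.

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/

#include <stdio.h>
#include <stdlib.h>

/* Close whichever of the two input files were opened. */
void fp_cleanup(FILE** f)
{
    int c;
    for (c = 0; c < 2; c++)
        if (f[c]) fclose(f[c]);
}

/* NOTE: the middle of the original listing was lost; main() below is a
   reconstruction from the usage and the output format described above. */
int main(int argc, char** argv)
{
    FILE* f[2] = { NULL, NULL };
    int t[2];          /* current byte from file 1 and file 2 */
    unsigned int c;    /* offset of the current byte */
    if (argc != 3) return 1;
    f[0] = fopen(argv[1], "rb");
    f[1] = fopen(argv[2], "rb");
    if (!f[0] || !f[1]) {
        fp_cleanup(f);
        return 1;
    }
    for (c = 0; (t[0] = fgetc(f[0])) != EOF && (t[1] = fgetc(f[1])) != EOF; c++) {
        if (t[0] == t[1])
            continue;
        fprintf(stdout, "Difference at 0x%X.\n", c);
        fprintf(stdout, "file 1: 0x%02X %03d\n", t[0], t[0]);
        fprintf(stdout, "file 2: 0x%02X %03d\n", t[1], t[1]);
        fprintf(stdout, "Difference in numbers: %c%d\n", (t[0] > t[1]) ? '-' : '+', abs(t[0] - t[1]));
        fprintf(stdout, "\n");
    }
    fp_cleanup(f);
    return 0;
}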
Also attached for convenience.

Please note that this is a byte-by-byte comparison, so multi-byte integers and endianness are not taken into account. But if you are big enough to use these tools, you are big enough to do the math. 🙂
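As a small illustration of "the math" for values that span more than one byte, here is a Python sketch. The value 30000 and the helper find_value are invented for the example; the point is only that the same number looks different in a hex dump depending on byte order.

```python
gold = 30000                       # hypothetical value to hunt for (0x7530)
little = gold.to_bytes(2, "little")
big = gold.to_bytes(2, "big")
print(little.hex(), big.hex())     # byte order as it would appear in each dump

def find_value(data: bytes, value: int, width: int = 2, byteorder: str = "little"):
    """Return every offset at which value is stored with the given width/byte order."""
    needle = value.to_bytes(width, byteorder)
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets

# A made-up 8-byte "savefile" with the value stored little-endian at offset 3
savefile = bytes.fromhex("00 11 22 30 75 33 44 55")
print(find_value(savefile, gold))
```

A byte-by-byte tool would report a change in such a value as one or two separate single-byte differences, so you have to reassemble them yourself in the right order.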

I built this tool to cheat in the old Sid Meier’s Civilization, where I searched for the amount of gold in two save files. It has also been used on old games such as Excalibur and on savegames from various emulators, such as VisualBoyAdvance. The output was used, at the time, as an aid for binary editing in applications such as PSPad in hex edit mode.
I find myself sometimes using this as a tool still, when reverse engineering stupid undocumented protocols, for instance. 😡

I am fully aware this makes me a bad person. 🙄

I think this may have been somewhat what the original poster was after, although it was posted in 2007. I hope someone will find it useful. 🙂

edit: Actually, I just used it myself and realised changing the output to a one-liner makes it a whole lot more powerful, for instance in conjunction with grep:

fprintf(stdout, "0x%08X:\t0x%02X -> 0x%02X\t%03d -> %03d\t%c%d\n", c, t[0], t[1], t[0], t[1], (t[0] > t[1]) ? '-' : '+', abs(t[0] - t[1]));

Will produce lines like this:

0x00006C14 0x0D -> 0xFE 013 -> 254 +241
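For comparison, roughly the same one-line-per-difference report can be sketched in Python. This is not the author's tool, just an imitation of its output format under the same assumption of equal-sized inputs; diff_lines is a name chosen for this example.

```python
def diff_lines(a: bytes, b: bytes):
    """Yield one tab-separated line per differing byte:
    offset, hex before -> after, decimal before -> after, signed change."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            sign = "-" if x > y else "+"
            yield (f"0x{offset:08X}:\t0x{x:02X} -> 0x{y:02X}\t"
                   f"{x:03d} -> {y:03d}\t{sign}{abs(x - y)}")

# Tiny made-up inputs: only the byte at offset 1 differs (0x0D vs 0xFE)
old = bytes([0x00, 0x0D, 0x7F])
new = bytes([0x00, 0xFE, 0x7F])
for line in diff_lines(old, new):
    print(line)
```

Because each difference is a single line, the output can be filtered the same way, e.g. by searching for a known change such as "+241".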

Edit and compare giant binary files with lfhex

(Edit: After installing 80 MB of dependencies, it still won’t install. I give up.)

Compiled on Hardy Heron 64 bit:

Install qt3-. libqt3-. packages

With this old version you are also able to compare two files:

$ lfhex -c file1 file2

It works on Jaunty 64bit too.

You’ll find it compiled in the attached files.

Sorry to be reviving such an old thread, but I think I’ve found the best answer to the question. I found all of the other suggestions to be much more annoying than the nice hex editors available in Windows, so I found a way to get one of those to actually work in Linux (Hex Workshop 6.0.1).

It turns out that the program itself runs fine in Wine, but it’s the installer which crashes. So I just installed this thing in Windows XP, then switched to my Ubuntu installation and copied the program files folder over to Wine. Now I just link to HWorks32.exe and it’s all golden. I hope this helps anybody who stumbles on it.

Just wanted to post my version of this:

edit: As a workaround, you can use a combination of diff and hexdump to detect inserted and deleted bytes:

hexdump -e "1 1 \"%02x\n\"" file1 > temp1
hexdump -e "1 1 \"%02x\n\"" file2 > temp2
diff temp1 temp2

It should be noted, that the hexdump manual mentions:

-v Cause hexdump to display all input data. Without the -v option,
any number of groups of output lines, which would be identical to
the immediately preceding group of output lines (except for the
input offsets), are replaced with a line comprised of a single
asterisk.

So, what I end up using is hexdump to generate a textual representation with each byte on a single line, as in the original example by Lux Perpetua, and then meld to visualise the differences. meld seems to handle single-row deletions/insertions well, which translates to handling single-byte insertions/deletions well, unlike, say, vbindiff or dhex (which cannot display single-byte changes well). In a bash one-liner, it looks like this:

You can’t possibly be up to any good using a tool like that. Why would you want to look for changes in the hex of two binary files? 😀

You’re joking right? If not, you should be because I can think of a thousand reasons to compare binary files without any malintent.

6 Best Linux Diff Tools

File comparison compares the contents of computer files, finding their common contents and their differences. The result of the comparison is often known as a diff.

diff is also the name of a famous console based file comparison utility that outputs the differences between two files. The diff utility was developed in the early 1970s on the Unix operating system. Typically, diff is used to show the changes between two versions of the same file. Modern implementations also support binary files.

Linux has many good GUI tools that enable you to clearly see the difference between two files or two versions of the same file. This roundup selects 6 of our favourite GUI diff tools. All of them are open source goodness.

These utilities are an essential software development tool, as they visualize the differences between files or directories, merge files with differences, resolve conflicts and save output to a new file or patch, and assist file changes reviewing and comment production (e.g. approving source code changes before they get merged into a source tree). They help developers work on a file, passing it back and forth between each other. The diff tools are not only useful for showing differences in source code files; they can be used on many text-based file types as well. The visualizations make it easier to compare files.

Here are our software recommendations. DiffPDF is different from the other tools, as it compares two PDF files.

Diff Tools
Tool      Description
Meld      Graphical diff viewer and merge application for the Gnome desktop
Kompare   KDE diff tool supporting a variety of diff formats
Diffuse   Tool for merging and comparing text files
KDiff3    Text difference analyzer for up to 3 input files
DiffPDF   Compare two PDF files
xxdiff    File and directories comparator and merge tool

We’ve covered the best console based diff tools in a separate article available here.
