File comparison utilities (diff tools)

IT 2024. 10. 13. 19:28


- Programs for comparing the differences between files (file diff tools)

//-------------------------------------
Text file comparison (text file diff)

* WinMerge
https://winmerge.org
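
When a GUI is not available, a text diff can also be produced from a script. The following is only a minimal sketch using Python's standard difflib module. The function name text_diff and the labels "file1" / "file2" are made up for this example, and it does not claim to reproduce how WinMerge compares files.

```python
import difflib


def text_diff(text1, text2, name1="file1", name2="file2"):
    """Return a unified diff of two strings as a list of lines."""
    # keepends=True keeps the newline characters so the diff lines
    # can be joined back together without extra formatting.
    lines1 = text1.splitlines(keepends=True)
    lines2 = text2.splitlines(keepends=True)
    return list(difflib.unified_diff(lines1, lines2, fromfile=name1, tofile=name2))


if __name__ == "__main__":
    old = "apple\nbanana\ncherry\n"
    new = "apple\nblueberry\ncherry\n"
    print("".join(text_diff(old, new)))
```

Lines starting with "-" exist only in the first text, lines starting with "+" only in the second. Identical inputs give an empty list.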



//-------------------------------------
Binary file comparison (binary file diff)

* HxD
https://mh-nexus.de/en/

- Usage
Open the two files
-> (menu) Analysis -> Data Comparison -> Compare (Ctrl+K)
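
A similar check can be scripted when only the positions of the differences are needed. The sketch below lists the offsets at which two byte strings differ, printed in hexadecimal as in a hex editor. The function name diff_offsets and the limit parameter are assumptions for this example, not part of HxD, and only the common length of the two inputs is examined.

```python
def diff_offsets(data1, data2, limit=10):
    """Return up to `limit` offsets where the two byte strings differ."""
    offsets = []
    # zip() stops at the shorter input, so only the common length is checked.
    for offset, (b1, b2) in enumerate(zip(data1, data2)):
        if b1 != b2:
            offsets.append(offset)
            if len(offsets) >= limit:
                break
    return offsets


if __name__ == "__main__":
    a = bytes([0x00, 0x11, 0x22, 0x33])
    b = bytes([0x00, 0x1F, 0x22, 0x3F])
    for off in diff_offsets(a, b):
        print(f"offset 0x{off:08X}: {a[off]:02X} != {b[off]:02X}")
```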

 

//-----------------------------------------------------------------------------

Python source code that compares two binary files and counts the number of differing bytes

import sys, os, time



def compare_binary_files(file1_path, file2_path):
    try:
        start_time = time.time()

        with open(file1_path, "rb") as file1, open(file2_path, "rb") as file2:
            # Read entire contents of both files
            content1 = file1.read()
            content2 = file2.read()

        diff_size = abs(len(content1) - len(content2))

        # Track which file is larger (None when the sizes are equal) and keep
        # the shorter content in content1.
        large_file = None
        if len(content1) > len(content2):
            content1, content2 = content2, content1  # swap
            large_file = file1_path
        elif len(content1) < len(content2):
            large_file = file2_path
            
        print(f"Large file: {large_file}")

        # Compare bytes and count differences
        compare_start_time = time.time()
        diff_count = 0

        # Compare block by block: identical blocks are skipped with a single
        # bytes comparison, and only differing blocks are scanned byte by byte.
        BLOCK_SIZE = 1024 * 1024
        total_bytes_compared = min(len(content1), len(content2))

        for i in range(0, total_bytes_compared, BLOCK_SIZE):
            block1 = content1[i : i + BLOCK_SIZE]
            block2 = content2[i : i + BLOCK_SIZE]

            if block1 != block2:
                for b1, b2 in zip(block1, block2):
                    if b1 != b2:
                        diff_count += 1

        compare_time = time.time() - compare_start_time
        total_time = time.time() - start_time

        print(f"Comparison time: {compare_time:.4f} seconds")
        print(f"Total execution time: {total_time:.4f} seconds")

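        print(f"Total bytes compared: {total_bytes_compared:,}")
        if total_bytes_compared > 0:
            print(f"Percentage different: {(diff_count / total_bytes_compared) * 100:.2f}%")
        else:
            print("Nothing to compare: at least one file is empty.")

        # diff_size is the difference in file length; those extra bytes are
        # not included in diff_count.
        print(f"Number of different bytes: {diff_count:,}, diff_size= {diff_size:,}")
        return diff_count

    except IOError as e:
        print(f"Error reading files: {e}")
        return None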


if __name__ == "__main__":
    if len(sys.argv) < 3:
        print("Usage: python script.py <file1> <file2>")
        sys.exit(1)

    file1_path = sys.argv[1]
    file2_path = sys.argv[2]

    print(f'Comparing files: "{file1_path}" and "{file2_path}"')
    if not os.path.exists(file1_path):
        print(f'File "{file1_path}" does not exist')
        sys.exit(1)

    if not os.path.exists(file2_path):
        print(f'File "{file2_path}" does not exist')
        sys.exit(1)

    diff_count = compare_binary_files(file1_path, file2_path)
    # print(f"Final: Number of different bytes: {diff_count}")
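
The script above reads both files fully into memory, which may be a problem for very large files. A possible variant, shown here only as a sketch, reads the two files one block at a time and counts the differing bytes over their common length. The name count_diff_bytes_streaming and its block_size parameter are assumptions for this example. It assumes regular files on disk, where read() returns full blocks until the end of the file.

```python
import os
import tempfile


def count_diff_bytes_streaming(path1, path2, block_size=1024 * 1024):
    """Count differing bytes over the common length, reading block by block."""
    diff_count = 0
    with open(path1, "rb") as f1, open(path2, "rb") as f2:
        while True:
            block1 = f1.read(block_size)
            block2 = f2.read(block_size)
            if not block1 or not block2:
                break  # reached the end of the shorter file
            if block1 != block2:
                # zip() stops at the shorter block, so trailing bytes of the
                # longer file are not counted as differences.
                diff_count += sum(b1 != b2 for b1, b2 in zip(block1, block2))
    return diff_count


if __name__ == "__main__":
    # Small demo with two temporary files of different length.
    with tempfile.TemporaryDirectory() as tmp:
        p1 = os.path.join(tmp, "a.bin")
        p2 = os.path.join(tmp, "b.bin")
        with open(p1, "wb") as f:
            f.write(b"\x00\x01\x02\x03")
        with open(p2, "wb") as f:
            f.write(b"\x00\xff\x02\xff\xff")
        print(count_diff_bytes_streaming(p1, p2))
```

Memory use stays at about two blocks regardless of file size. The size difference between the files still has to be reported separately, for example with os.path.getsize.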

//

Posted by codens