From f63e2f9b335ac3a23e7ee6398a939a7d7f3c3a5c Mon Sep 17 00:00:00 2001
From: repi
Date: Thu, 4 Jun 2026 14:44:47 +0100
Subject: [PATCH] Add project documentation

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)

Co-authored-by: Sisyphus
---
 .gitignore |  1 +
 README.md  | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 77 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 README.md

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..ea8c4bf
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+/target
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..a613982
--- /dev/null
+++ b/README.md
@@ -0,0 +1,76 @@
+# disk-checker
+
+Fast Ubuntu-friendly CLI for scanning folders, checking file sizes, hashing the first chunk of same-size files, and reporting possible duplicates plus symlinks, hard links, special files, and scan errors.
+
+## Install Rust on Ubuntu
+
+```bash
+sudo apt update
+sudo apt install -y build-essential curl
+curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
+source "$HOME/.cargo/env"
+```
+
+## Build
+
+```bash
+cargo build --release
+```
+
+The binary will be at:
+
+```bash
+target/release/disk-checker
+```
+
+## Usage
+
+Scan the current directory:
+
+```bash
+disk-checker
+```
+
+Scan one or more paths:
+
+```bash
+disk-checker ~/Downloads /mnt/shared
+```
+
+Use JSON for scripts:
+
+```bash
+disk-checker ~/Downloads --json
+```
+
+Hash a larger first chunk before grouping possible duplicates:
+
+```bash
+disk-checker ~/Downloads --hash-bytes 8MiB
+```
+
+Follow symlinks while still reporting them separately:
+
+```bash
+disk-checker ~/Downloads --follow-links
+```
+
+Verify possible duplicates with a full-file hash pass:
+
+```bash
+disk-checker ~/Downloads --verify-full
+```
+
+Limit hashing workers:
+
+```bash
+disk-checker ~/Downloads --threads 4
+```
+
+## Notes
+
+- By default, duplicate results are **possible duplicates**: same file size plus same first `1MiB` BLAKE3 hash.
+- This is intentionally fast because it avoids reading whole files unless you pass `--verify-full`.
+- Symlinks are not followed by default to avoid surprises and cycles.
+- Hard link groups are reported separately because they are multiple paths to the same inode, not extra disk copies.
+- Hidden files and gitignored files are included; this is a disk scanner, not a source-code search tool.
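
The two-pass idea in the Notes (group by size first, then hash only the first chunk of same-size files) can be sketched roughly as below. This is an illustration, not the code from this repository: `hash_prefix` and `possible_duplicates` are hypothetical names, and `DefaultHasher` from the standard library stands in for BLAKE3 so the sketch builds without extra crates. It also returns early on the first I/O error, whereas the README says the real tool reports scan errors.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::fs::{self, File};
use std::hash::Hasher;
use std::io::{self, Read};
use std::path::{Path, PathBuf};

/// Hash at most `limit` bytes from the start of `path`.
/// ASSUMPTION: DefaultHasher is a stand-in; the README names BLAKE3.
fn hash_prefix(path: &Path, limit: u64) -> io::Result<u64> {
    let mut prefix = Vec::new();
    // `take` caps the read, so large files are never read in full.
    File::open(path)?.take(limit).read_to_end(&mut prefix)?;
    let mut hasher = DefaultHasher::new();
    hasher.write(&prefix);
    Ok(hasher.finish())
}

/// Group regular files that share both a size and a prefix hash.
/// Hypothetical helper: the name and signature are illustrative only.
fn possible_duplicates(paths: &[PathBuf], hash_bytes: u64) -> io::Result<Vec<Vec<PathBuf>>> {
    // Pass 1: bucket by size using metadata only (no file contents read).
    let mut by_size: HashMap<u64, Vec<PathBuf>> = HashMap::new();
    for path in paths {
        // symlink_metadata does not follow symlinks, so links are skipped here.
        let meta = fs::symlink_metadata(path)?;
        if meta.is_file() {
            by_size.entry(meta.len()).or_default().push(path.clone());
        }
    }

    // Pass 2: hash the first chunk, but only where a size collision exists.
    let mut groups = Vec::new();
    for bucket in by_size.into_values() {
        if bucket.len() < 2 {
            continue; // a unique size cannot have a duplicate
        }
        let mut by_hash: HashMap<u64, Vec<PathBuf>> = HashMap::new();
        for path in bucket {
            let digest = hash_prefix(&path, hash_bytes)?;
            by_hash.entry(digest).or_default().push(path);
        }
        groups.extend(by_hash.into_values().filter(|group| group.len() > 1));
    }
    Ok(groups)
}
```

With a small `hash_bytes`, files of equal size that differ only after the hashed prefix land in the same group, which is why the results are called possible duplicates and why a full-file pass such as `--verify-full` is needed to confirm them.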