X-Git-Url: https://git.xandkar.net/?p=dups.git;a=blobdiff_plain;f=README.md;fp=README.md;h=28ed81abc6156a24746ccb74024811f587149612;hp=04b84ecbbef4b1b95bef57749c4a0ae7a70172d4;hb=f41b9cdf0268213b9d1c911aa7836d9dc9948194;hpb=dbb52e5c345aeafd3b7a2f142ca6bf2039616574

diff --git a/README.md b/README.md
index 04b84ec..28ed81a 100644
--- a/README.md
+++ b/README.md
@@ -1,10 +1,10 @@
 dups
 ====
 
-Find duplicate files in given directory trees. Where "duplicate" is defined as
-having the same (and non-0) file size and MD5 hash digest.
+Find duplicate files in N given directory trees. Where "duplicate" is defined
+as having the same (and non-0) file size and MD5 hash digest.
 
-It is roughly equivalent to the following one-liner:
+It is roughly equivalent to the following one-liner (included as `dups.sh`):
 ```sh
 find . -type f -print0 | xargs -0 -P 6 -I % md5sum % | awk '{digest = $1; sub("^" $1 " +", ""); path = $0; paths[digest, ++cnt[digest]] = path} END {for (digest in cnt) {n = cnt[digest]; if (n > 1) {print(digest, n); for (i=1; i<=n; i++) {printf " %s\n", paths[digest, i]} } } }'
 ```
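
For readers of this diff, the one-liner's grouping logic may be easier to follow when spread over several lines. The following is only a sketch under stated assumptions, not the contents of `dups.sh`: the function name `dups_sketch` is hypothetical, GNU `find`, `xargs`, and `md5sum` are assumed, and the `-size +0c` filter is an addition here that reflects the README's "non-0 file size" condition, which the one-liner itself does not check.

```sh
# Hypothetical helper for illustration; not part of the dups repository.
# Assumes GNU find, xargs, and md5sum.
dups_sketch() {
    # -size +0c skips empty files (an addition; the one-liner omits it).
    # -r stops xargs from running md5sum when find yields nothing.
    find "$@" -type f -size +0c -print0 |
    xargs -0 -r md5sum |
    awk '
        {
            digest = $1
            # md5sum prints "<digest><2 separator chars><path>".
            path = substr($0, length(digest) + 3)
            paths[digest, ++cnt[digest]] = path
        }
        END {
            # Report only digests shared by more than one path.
            for (digest in cnt) {
                n = cnt[digest]
                if (n > 1) {
                    print digest, n
                    for (i = 1; i <= n; i++)
                        printf "    %s\n", paths[digest, i]
                }
            }
        }
    '
}
```

Called as `dups_sketch dir1 dir2`, it prints each shared digest with its count, followed by the matching paths indented beneath it. Group order is unspecified because awk's `for (key in array)` iteration order is not defined.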