X-Git-Url: https://git.xandkar.net/?p=dups.git;a=blobdiff_plain;f=README.md;h=6677b476c5a529dc3ece5dbf5de8c965b339290b;hp=28ed81abc6156a24746ccb74024811f587149612;hb=f289b74bfb797118e32341290319012e3f06f8c1;hpb=f41b9cdf0268213b9d1c911aa7836d9dc9948194

diff --git a/README.md b/README.md
index 28ed81a..6677b47 100644
--- a/README.md
+++ b/README.md
@@ -6,7 +6,7 @@ as having the same (and non-0) file size and MD5 hash digest.
 
 It is roughly equivalent to the following one-liner (included as `dups.sh`):
 ```sh
-find . -type f -print0 | xargs -0 -P 6 -I % md5sum % | awk '{digest = $1; sub("^" $1 " +", ""); path = $0; paths[digest, ++cnt[digest]] = path} END {for (digest in cnt) {n = cnt[digest]; if (n > 1) {print(digest, n); for (i=1; i<=n; i++) {printf " %s\n", paths[digest, i]} } } }'
+find . -type f -print0 | xargs -0 -P $(nproc) -I % md5sum % | awk '{digest = $1; sub("^" $1 " +", ""); path = $0; paths[digest, ++cnt[digest]] = path} END {for (digest in cnt) {n = cnt[digest]; if (n > 1) {print(digest, n); for (i=1; i<=n; i++) {printf " %s\n", paths[digest, i]} } } }'
 ```
 
 which, when indented, looks like:
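
The hunk ends before the indented form appears. As a rough sketch only (not the README's own text), the pipeline from the `+` line could be laid out as below. The `find_dups` wrapper function and its optional directory argument are hypothetical, added here so the pipeline can be called on any directory; `nproc` and `md5sum` are assumed to come from GNU coreutils, and `xargs -0 -P` assumes GNU findutils.

```sh
#!/bin/sh
# Hypothetical wrapper (not part of dups.sh): print groups of files under
# the given directory (default: current directory) that share an MD5 digest.
find_dups() {
    # List regular files, NUL-separated so unusual file names survive.
    find "${1:-.}" -type f -print0 |
    # Hash one file per md5sum process, one worker per available CPU.
    xargs -0 -P "$(nproc)" -I % md5sum % |
    awk '
    {
        digest = $1
        # Strip the digest and separating spaces; the remainder is the path.
        sub("^" $1 " +", "")
        path = $0
        paths[digest, ++cnt[digest]] = path
    }
    END {
        for (digest in cnt) {
            n = cnt[digest]
            # Only digests seen more than once are duplicates.
            if (n > 1) {
                print(digest, n)
                for (i = 1; i <= n; i++) {
                    printf " %s\n", paths[digest, i]
                }
            }
        }
    }'
}
```

Calling `find_dups some/dir` prints, for each duplicated digest, a header line with the digest and the number of matching files, followed by one line per path. Like the one-liner, this sketch does not filter out empty files.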