dups.git
(ref: multi-samples)
Siraaj Khandkar [Tue, 14 May 2019 16:54:07 +0000 (12:54 -0400)] 
Move modules into dedicated files

(ref: master)
Siraaj Khandkar [Fri, 8 Feb 2019 01:34:14 +0000 (20:34 -0500)] 
Skip 0-sized files in the shell version

Siraaj Khandkar [Fri, 30 Nov 2018 19:23:44 +0000 (14:23 -0500)] 
Fix accidental reporting of singletons

Siraaj Khandkar [Wed, 28 Nov 2018 23:30:35 +0000 (18:30 -0500)] 
Handle null-delimited input paths

Siraaj Khandkar [Wed, 28 Nov 2018 22:28:52 +0000 (17:28 -0500)] 
Fix statically-defined number of processes

Siraaj Khandkar [Wed, 28 Nov 2018 22:15:50 +0000 (17:15 -0500)] 
Add shell-equivalent as an executable script

Siraaj Khandkar [Wed, 28 Nov 2018 22:12:13 +0000 (17:12 -0500)] 
Update the shell-equivalent implementation and motivation

Siraaj Khandkar [Wed, 28 Nov 2018 01:26:47 +0000 (20:26 -0500)] 
Slightly refactor lord's and vassal's loops

Siraaj Khandkar [Wed, 28 Nov 2018 01:15:00 +0000 (20:15 -0500)] 
Remove redundant annotations

Siraaj Khandkar [Tue, 27 Nov 2018 01:39:50 +0000 (20:39 -0500)] 
Remove debug statements

Siraaj Khandkar [Tue, 27 Nov 2018 01:36:23 +0000 (20:36 -0500)] 
Parallelize file head sampling

Siraaj Khandkar [Tue, 27 Nov 2018 00:48:54 +0000 (19:48 -0500)] 
Distinguish between processor and wall times

and report both

Siraaj Khandkar [Mon, 26 Nov 2018 20:57:27 +0000 (15:57 -0500)] 
Parallelize file hashing

Siraaj Khandkar [Mon, 26 Nov 2018 05:47:32 +0000 (00:47 -0500)] 
Make ignore-pattern a closure

Siraaj Khandkar [Mon, 26 Nov 2018 04:55:09 +0000 (23:55 -0500)] 
Count redundant data size

Siraaj Khandkar [Fri, 23 Nov 2018 23:57:37 +0000 (18:57 -0500)] 
Improve pipeline and metrics abstractions

Siraaj Khandkar [Thu, 22 Nov 2018 01:13:13 +0000 (20:13 -0500)] 
Expand metrics

Siraaj Khandkar [Wed, 21 Nov 2018 22:17:25 +0000 (17:17 -0500)] 
Add TODO note for better abstractions

Siraaj Khandkar [Tue, 20 Nov 2018 23:51:08 +0000 (18:51 -0500)] 
Use "File", rather than "path", as main abstraction

which will help us accumulate and pass around the collected metadata.

Siraaj Khandkar [Tue, 20 Nov 2018 02:33:30 +0000 (21:33 -0500)] 
Sample file heads and skip unique ones

Siraaj Khandkar [Tue, 20 Nov 2018 01:04:25 +0000 (20:04 -0500)] 
Refactor CLI options gathering

Siraaj Khandkar [Mon, 19 Nov 2018 04:13:29 +0000 (23:13 -0500)] 
Update README

Siraaj Khandkar [Mon, 19 Nov 2018 04:02:36 +0000 (23:02 -0500)] 
Skip files with unique and 0 sizes

Thanks @AeroNotix for the idea!
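
The size-based skip can be illustrated with a minimal Python sketch. It is not taken from the repository: the function name `candidates_by_size` is hypothetical, and the only assumption is that a file with a unique size, or a size of 0, is never reported as a duplicate.

```python
import os
from collections import defaultdict


def candidates_by_size(paths):
    """Group paths by size and keep only groups that may contain duplicates.

    Hypothetical helper for illustration, not the project's code.
    """
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)
    # A file whose size is unique cannot have a duplicate, and
    # 0-sized files are skipped outright, so neither needs hashing.
    return {
        size: group
        for size, group in by_size.items()
        if size > 0 and len(group) > 1
    }
```

Only the files in the surviving groups would then need to be read and compared.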

Siraaj Khandkar [Sun, 18 Nov 2018 17:44:04 +0000 (12:44 -0500)] 
Add option to ignore filepaths matching a pattern

Siraaj Khandkar [Thu, 15 Nov 2018 01:23:20 +0000 (20:23 -0500)] 
Improve names in illustrative script

Siraaj Khandkar [Wed, 14 Nov 2018 20:23:25 +0000 (15:23 -0500)] 
Add README and LICENSE

Siraaj Khandkar [Wed, 14 Nov 2018 20:19:16 +0000 (15:19 -0500)] 
Tidy-up variant label names

Siraaj Khandkar [Wed, 14 Nov 2018 20:17:55 +0000 (15:17 -0500)] 
Support outputting to files in a directory

Siraaj Khandkar [Wed, 14 Nov 2018 19:04:08 +0000 (14:04 -0500)] 
Quote outputted paths

Siraaj Khandkar [Wed, 14 Nov 2018 18:02:24 +0000 (13:02 -0500)] 
Handle multiple root directories

Siraaj Khandkar [Wed, 14 Nov 2018 17:42:56 +0000 (12:42 -0500)] 
Rename Directory to Directory_tree

Siraaj Khandkar [Wed, 14 Nov 2018 17:39:31 +0000 (12:39 -0500)] 
Fix indentation

Siraaj Khandkar [Wed, 14 Nov 2018 17:34:13 +0000 (12:34 -0500)] 
Handle directories with no regular-file children

i.e. keep exploring child directories when no files remain to process

Siraaj Khandkar [Wed, 14 Nov 2018 16:48:33 +0000 (11:48 -0500)] 
Handle root path missing or not a directory

Siraaj Khandkar [Wed, 14 Nov 2018 16:38:59 +0000 (11:38 -0500)] 
Invert abstractions

Siraaj Khandkar [Wed, 14 Nov 2018 16:19:36 +0000 (11:19 -0500)] 
Rename dupfiles to dups

Siraaj Khandkar [Wed, 14 Nov 2018 16:16:32 +0000 (11:16 -0500)] 
Make a clean executable name

Siraaj Khandkar [Wed, 14 Nov 2018 16:06:40 +0000 (11:06 -0500)] 
Report total number of files and execution time

Siraaj Khandkar [Wed, 14 Nov 2018 15:57:56 +0000 (10:57 -0500)] 
Implement recursive directory walk stream

to avoid the issue with newlines when accepting file paths on stdin.

Siraaj Khandkar [Wed, 14 Nov 2018 14:08:34 +0000 (09:08 -0500)] 
Initial prototype

works mostly smoothly, except for one problem: because we treat each
input line as a filename, a filename that contains newline characters
is seen as several filenames, none of which exist.
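
The newline problem described above can be reproduced with a short Python sketch. This is an illustration only, not the prototype's code: the helpers `split_lines` and `split_nul` are hypothetical, and NUL-delimited input is shown merely as one common way around the ambiguity (a later commit in this log, "Handle null-delimited input paths", mentions that approach).

```python
def split_lines(data: bytes) -> list[bytes]:
    """Treat every newline-terminated chunk as one path (the fragile approach)."""
    return [p for p in data.split(b"\n") if p]


def split_nul(data: bytes) -> list[bytes]:
    """Treat every NUL-terminated chunk as one path.

    NUL cannot appear inside a POSIX path, so this split is unambiguous.
    """
    return [p for p in data.split(b"\0") if p]


# A single, legal filename that happens to contain a newline.
name = b"odd\nname.txt"

broken = split_lines(name + b"\n")  # two bogus names: b"odd" and b"name.txt"
intact = split_nul(name + b"\0")    # one name, newline preserved
```

With line-based splitting the single file above turns into two paths that do not exist, which is the failure the commit message describes.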

Siraaj Khandkar [Wed, 14 Nov 2018 13:59:39 +0000 (08:59 -0500)] 
Root commit
