# vim:sw=2:sts=2:
-- [ ] Output formats:
- - [x] text long
- - [x] text short
- - [ ] HTML
- - [ ] JSON
-- [ ] Convert to Typed Racket
- - requires: build executable (otherwise too slow)
-- [x] Build executable
- Implies fix of "collection not found" when executing the built executable
- outside the source directory:
+TODO
+====
- collection-path: collection not found
- collection: "tt"
- in collection directories:
- context...:
- /usr/share/racket/collects/racket/private/collect.rkt:11:53: fail
- /usr/share/racket/collects/setup/getinfo.rkt:17:0: get-info
- /usr/share/racket/collects/racket/contract/private/arrow-val-first.rkt:555:3
- /usr/share/racket/collects/racket/cmdline.rkt:191:51
- '|#%mzc:p
+Legend:
+- [ ] not started
+- [-] in-progress
+- [x] done
+- [~] cancelled
-- [ ] Support redirects
- - should permanent redirects update the feed somehow?
-- [ ] Support time ranges (i.e. reading the timeline between given time points)
-- [x] Implement rfc3339->epoch
-- [x] Remove dependency on rfc3339-old
-- [x] remove dependency on http-client
-- [ ] optional text wrap
-- [ ] write
-- [x] caching (use cache by default, unless explicitly asked for update)
- - [x] value --> cache
- - [x] value <-- cache
- requires: d command
-- [ ] timeline limits
-- [ ] feed set operations (perhaps better done externally?)
-- [ ] timeline as a result of a query (feed set op + filter expressions)
-- [ ] named timelines
-- [ ] config files
-- [ ] parse "following" from feed
- - following = <nick> <uri>
-- [ ] parse mentions:
- - @<source.nick source.url> | @<source.url>
-- [ ] highlight mentions
-- [ ] filter on mentions
-- [ ] highlight hashtags
-- [ ] filter on hashtags
-- [ ] hashtags as channels? initial hashtag special?
-- [ ] query language
-- [ ] console logger colors by level ('error)
-- [ ] file logger ('debug)
+In-progress
+-----------
+- [-] timeline limits
+ - [x] by time range
+ - [ ] by msg count
+ - [ ] per peer
+ - [ ] total
+ Not necessary for short format, because we have Unix head/tail,
+ but may be convinient for long format (because msg spans multiple lines).
+- [-] Convert to Typed Racket
+ - [x] build executable (otherwise too-slow)
+ - [-] add signatures
+ - [x] top-level
+ - [ ] inner
+ - [ ] imports
- [-] commands:
+ - [x] c | crawl
+ Discover new peers mentioned by known peers.
- [x] r | read
- see timeline ops above
- [ ] w | write
- arg or stdin
- nick expand to URI
+ - Watch FIFO for lines, then read, timestamp and append [+ upload].
+ Can be part of a "live" mode, along with background polling and
+ incremental printing. Sort of an ii-like IRC experience.
- [ ] q | query
- see timeline ops above
- see hashtag and channels above
- [x] d | download
+ - [ ] options:
+ - [ ] all - use all known peers
+ - [ ] fast - all except peers known to be slow or unavailable
+ REQUIRES: stats
- [x] u | upload
- - calls user-configured command to upload user's own feed file to their server
+ - calls user-configured command to upload user's own timeline file to their server
Looks like a better CLI parser than "racket/cmdline": https://docs.racket-lang.org/natural-cli/
But it is no longer necessary now that I've figured out how to chain (command-line ..) calls.
-- [ ] Suport immutable timelines
- - store individual messages
- - where?
- - something like DBM or SQLite - faster
- - filesystem - transparent, easily published - probably best
- - [ ] block(chain/tree) of twtxts
- - distributed twtxt.db
- - each twtxt.txt is a ledger
- - peers can verify states of ledgers
- - peers can publish known nick->url mappings
- - peers can vote on nick->url mappings
- - we could break time periods into blocks
- - how to handle the facts that many(most?) twtxt are unseen by peers
- - longest X wins?
-- [ ] Peer discovery
- requires:
- - parse mentions
- - parse following
- rough sketch from late 2019:
-
+- [-] Output formats:
+ - [x] text long
+ - [x] text short
+ - [ ] HTML
+ - [ ] JSON
+- [-] Peer discovery
+ - [-] parse peer refs from peer timelines
+ - [x] mentions from timeline messages
+ - [x] @<source.nick source.url>
+ - [x] @<source.url>
+ - [ ] "following" from timeline comments: # following = <nick> <uri>
+ 1. split file lines in 2 groups: comments and messages
+ 2. dispatch messages parsing as usual
+ 3. dispatch comments parsing for:
+ - # following = <nick> <uri>
+ - what else?
+ - [ ] Parse User-Agent web access logs.
+ - [-] Update peer ref file(s)
+ - [x] peers-all
+ - [x] peers-mentioned
+ - [ ] peers-followed (by others, parsed from comments)
+ - [ ] peers-down (net errors)
+ - [ ] redirects?
+ Rough sketch from late 2019:
let read file =
...
let write file peers =
loop interval peers_all
let () =
loop (Sys.argv.(1)) (read "peers-all.txt")
+
+Backlog
+-------
+- [ ] Support date without time in timestamps
+- [ ] Associate cached object with nick.
+- [ ] Crawl downloaded web access logs
+- [ ] download-command hook to grab the access logs
+
+ (define (parse log-line)
+ (match (regexp-match #px"([^/]+)/([^ ]+) +\\(\\+([a-z]+://[^;]+); *@([^\\)]+)\\)" log-line)
+ [(list _ client version uri nick) (cons nick uri)]
+ [_ #f]))
+
+ (list->set (filter-map parse (file->lines "logs/combined-access.log")))
+
+ (filter (λ (p) (equal? 'file (file-or-directory-type p))) (directory-list logs-dir))
+
+- [ ] user-agent file as CLI option - need to run at least the crawler as another user
+- [ ] Support fetching rsync URIs
+- [ ] Check for peer duplicates:
+ - [ ] same nick for N>1 URIs
+ - [ ] same URI for N>1 nicks
+- [ ] Background polling and incremental timeline updates.
+ We can mark which messages have already been printed and print new ones as
+ they come in.
+ REQUIRES: polling
+- [ ] Polling mode/command, where tt periodically polls peer timelines
+- [ ] nick tiebreaker(s)
+ - [ ] some sort of a hash of URI?
+ - [ ] angry-purple-tiger kind if thingie?
+ - [ ] P2P nick registration?
+ - [ ] Peers vote by claiming to have seen a nick->uri mapping?
+ The inherent race condition would be a feature, since all user name
+ registrations are races.
+ REQUIRES: blockchain
+- [ ] stats
+ - [ ] download times per peer
+- [ ] Support redirects
+ - should permanent redirects update the peer ref somehow?
+- [ ] optional text wrap
+- [ ] write
+- [ ] peer refs set operations (perhaps better done externally?)
+- [ ] timeline as a result of a query (peer ref set op + filter expressions)
+- [ ] config files
+- [ ] highlight mentions
+- [ ] filter on mentions
+- [ ] highlight hashtags
+- [ ] filter on hashtags
+- [ ] hashtags as channels? initial hashtag special?
+- [ ] query language
+- [ ] console logger colors by level ('error)
+- [ ] file logger ('debug)
+- [ ] Suport immutable timelines
+ - store individual messages
+ - where?
+ - something like DBM or SQLite - faster
+ - filesystem - transparent, easily published - probably best
+ - [ ] block(chain/tree) of twtxts
+ - distributed twtxt.db
+ - each twtxt.txt is a ledger
+ - peers can verify states of ledgers
+ - peers can publish known nick->url mappings
+ - peers can vote on nick->url mappings
+ - we could break time periods into blocks
+ - how to handle the facts that many(most?) twtxt are unseen by peers
+ - longest X wins?
+
+Done
+----
+- [x] Crawl all cache/objects/*, not given peers.
+- [x] Support time ranges (i.e. reading the timeline between given time points)
+- [x] Dedup read-in peers before using them.
+- [x] Prevent redundant downloads
+ - [x] Check ETag
+ - [x] Check Last-Modified if no ETag was provided
+ - [x] Parse rfc2822 timestamps
+- [x] caching (use cache by default, unless explicitly asked for update)
+ - [x] value --> cache
+ - [x] value <-- cache
+ REQUIRES: d command
+- [x] Logger sync before exit.
+- [x] Implement rfc3339->epoch
+- [x] Remove dependency on rfc3339-old
+- [x] remove dependency on http-client
+- [x] Build executable
+ Implies fix of "collection not found" when executing the built executable
+ outside the source directory:
+
+ collection-path: collection not found
+ collection: "tt"
+ in collection directories:
+ context...:
+ /usr/share/racket/collects/racket/private/collect.rkt:11:53: fail
+ /usr/share/racket/collects/setup/getinfo.rkt:17:0: get-info
+ /usr/share/racket/collects/racket/contract/private/arrow-val-first.rkt:555:3
+ /usr/share/racket/collects/racket/cmdline.rkt:191:51
+ '|#%mzc:p
+
+
+Cancelled
+---------
+- [~] named timelines/peer-sets
+ REASON: That is basically files of peers, which we already support.