# vim:sw=2:sts=2: TODO ==== Legend: - [ ] not started - [-] in-progress - [x] done - [~] cancelled In-progress ----------- - [-] timeline limits - [x] by time range - [ ] by msg count - [ ] per peer - [ ] total Not necessary for short format, because we have Unix head/tail, but may be convinient for long format (because msg spans multiple lines). - [-] Convert to Typed Racket - [x] build executable (otherwise too-slow) - [-] add signatures - [x] top-level - [ ] inner - [ ] imports - [-] commands: - [x] c | crawl Discover new peers mentioned by known peers. - [x] r | read - see timeline ops above - [ ] w | write - arg or stdin - nick expand to URI - Watch FIFO for lines, then read, timestamp and append [+ upload]. Can be part of a "live" mode, along with background polling and incremental printing. Sort of an ii-like IRC experience. - [ ] q | query - see timeline ops above - see hashtag and channels above - [x] d | download - [ ] options: - [ ] all - use all known peers - [ ] fast - all except peers known to be slow or unavailable REQUIRES: stats - [x] u | upload - calls user-configured command to upload user's own timeline file to their server Looks like a better CLI parser than "racket/cmdline": https://docs.racket-lang.org/natural-cli/ But it is no longer necessary now that I've figured out how to chain (command-line ..) calls. - [-] Output formats: - [x] text long - [x] text short - [ ] HTML - [ ] JSON - [-] Peer discovery - [-] parse peer refs from peer timelines - [x] mentions from timeline messages - [x] @ - [x] @ - [ ] "following" from timeline comments: # following = 1. split file lines in 2 groups: comments and messages 2. dispatch messages parsing as usual 3. dispatch comments parsing for: - # following = - what else? - [ ] Parse User-Agent web access logs. - [-] Update peer ref file(s) - [x] peers-all - [x] peers-mentioned - [ ] peers-followed (by others, parsed from comments) - [ ] peers-down (net errors) - [ ] redirects? Rough sketch from late 2019: let read file = ... let write file peers = ... let fetch peer = (* Fetch could mean either or both of: * - fetch peer's we-are-twtxt.txt * - fetch peer's twtxt.txt and extract mentioned peer URIs * *) ... let test peers = ... let rec discover peers_old = let peers_all = Set.fold peers_old ~init:peers_old ~f:(fun peers p -> match fetch p with | Error _ -> (* TODO: Should p be moved to down set here? *) log_warning ...; peers | Ok peers_fetched -> Set.union peers peers_fetched ) in if Set.empty (Set.diff peers_old peers_all) then peers_all else discover peers_all let rec loop interval peers_old = let peers_all = discover peers_old in let (peers_up, peers_down) = test peers_all in write "peers-all.txt" peers_all; write "peers-up.txt" peers_up; write "peers-down.txt" peers_down; sleep interval; loop interval peers_all let () = loop (Sys.argv.(1)) (read "peers-all.txt") Backlog ------- - [ ] Support date without time in timestamps - [ ] Associate cached object with nick. - [ ] Crawl downloaded web access logs - [ ] download-command hook to grab the access logs (define (parse log-line) (match (regexp-match #px"([^/]+)/([^ ]+) +\\(\\+([a-z]+://[^;]+); *@([^\\)]+)\\)" log-line) [(list _ client version uri nick) (cons nick uri)] [_ #f])) (list->set (filter-map parse (file->lines "logs/combined-access.log"))) (filter (λ (p) (equal? 'file (file-or-directory-type p))) (directory-list logs-dir)) - [ ] user-agent file as CLI option - need to run at least the crawler as another user - [ ] Support fetching rsync URIs - [ ] Check for peer duplicates: - [ ] same nick for N>1 URIs - [ ] same URI for N>1 nicks - [ ] Background polling and incremental timeline updates. We can mark which messages have already been printed and print new ones as they come in. REQUIRES: polling - [ ] Polling mode/command, where tt periodically polls peer timelines - [ ] nick tiebreaker(s) - [ ] some sort of a hash of URI? - [ ] angry-purple-tiger kind if thingie? - [ ] P2P nick registration? - [ ] Peers vote by claiming to have seen a nick->uri mapping? The inherent race condition would be a feature, since all user name registrations are races. REQUIRES: blockchain - [ ] stats - [ ] download times per peer - [ ] Support redirects - should permanent redirects update the peer ref somehow? - [ ] optional text wrap - [ ] write - [ ] peer refs set operations (perhaps better done externally?) - [ ] timeline as a result of a query (peer ref set op + filter expressions) - [ ] config files - [ ] highlight mentions - [ ] filter on mentions - [ ] highlight hashtags - [ ] filter on hashtags - [ ] hashtags as channels? initial hashtag special? - [ ] query language - [ ] console logger colors by level ('error) - [ ] file logger ('debug) - [ ] Suport immutable timelines - store individual messages - where? - something like DBM or SQLite - faster - filesystem - transparent, easily published - probably best - [ ] block(chain/tree) of twtxts - distributed twtxt.db - each twtxt.txt is a ledger - peers can verify states of ledgers - peers can publish known nick->url mappings - peers can vote on nick->url mappings - we could break time periods into blocks - how to handle the facts that many(most?) twtxt are unseen by peers - longest X wins? Done ---- - [x] Crawl all cache/objects/*, not given peers. - [x] Support time ranges (i.e. reading the timeline between given time points) - [x] Dedup read-in peers before using them. - [x] Prevent redundant downloads - [x] Check ETag - [x] Check Last-Modified if no ETag was provided - [x] Parse rfc2822 timestamps - [x] caching (use cache by default, unless explicitly asked for update) - [x] value --> cache - [x] value <-- cache REQUIRES: d command - [x] Logger sync before exit. - [x] Implement rfc3339->epoch - [x] Remove dependency on rfc3339-old - [x] remove dependency on http-client - [x] Build executable Implies fix of "collection not found" when executing the built executable outside the source directory: collection-path: collection not found collection: "tt" in collection directories: context...: /usr/share/racket/collects/racket/private/collect.rkt:11:53: fail /usr/share/racket/collects/setup/getinfo.rkt:17:0: get-info /usr/share/racket/collects/racket/contract/private/arrow-val-first.rkt:555:3 /usr/share/racket/collects/racket/cmdline.rkt:191:51 '|#%mzc:p Cancelled --------- - [~] named timelines/peer-sets REASON: That is basically files of peers, which we already support.