# vim:sw=2:sts=2:
TODO
====

Legend:
- [ ] not started
- [-] in-progress
- [x] done
- [~] cancelled

In-progress
-----------

- [-] Convert to Typed Racket
  - [x] build executable (otherwise too-slow)
  - [-] add signatures
    - [x] top-level
    - [ ] inner
    - [ ] imports
- [-] commands:
  - [x] r | read
    - see timeline ops above
  - [ ] w | write
    - arg or stdin
    - nick expand to URI
  - [ ] q | query
    - see timeline ops above
    - see hashtag and channels above
  - [x] d | download
    - [ ] options:
      - [ ] all - use all known peers
      - [ ] fast - all except peers known to be slow or unavailable
        REQUIRES: stats
  - [x] u | upload
    - calls user-configured command to upload user's own timeline file to their server
  Looks like a better CLI parser than "racket/cmdline": https://docs.racket-lang.org/natural-cli/
  But it is no longer necessary now that I've figured out how to chain (command-line ..) calls.
- [-] Output formats:
  - [x] text long
  - [x] text short
  - [ ] HTML
  - [ ] JSON
- [-] Peer discovery
  - [-] parse peer refs from peer timelines
    - [x] mentions from timeline messages
      - [x] @<source.nick source.url>
      - [x] @<source.url>
    - [x] "following" from timeline comments: # following = <nick> <uri>
  - [ ] Parse User-Agent web access logs.
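
For the "Parse User-Agent web access logs" item, a minimal sketch of the parsing step, written in OCaml only to match the sketch in this file. It assumes the discoverability convention of the reference twtxt client, where the User-Agent looks like `twtxt/<version> (+<source.url>; @<source.nick>)`, and it assumes a combined-format access log in which the User-Agent is the last double-quoted field. All function names here are hypothetical.

```ocaml
(* Assumption: the User-Agent is the last double-quoted field of the line,
   as in the common "combined" access log format. *)
let last_quoted_field (line : string) : string option =
  match String.rindex_opt line '"' with
  | None -> None
  | Some close ->
    (match String.rindex_from_opt line (close - 1) '"' with
     | None -> None
     | Some open_ -> Some (String.sub line (open_ + 1) (close - open_ - 1)))

(* Assumption: a peer announces itself as
     twtxt/<version> (+<source.url>; @<source.nick>)
   Returns (nick, url), or None for any other User-Agent. *)
let peer_of_user_agent (ua : string) : (string * string) option =
  match String.index_opt ua '(', String.rindex_opt ua ')' with
  | Some l, Some r when l < r ->
    let inner = String.sub ua (l + 1) (r - l - 1) in
    (match List.map String.trim (String.split_on_char ';' inner) with
     | [url; nick]
       when String.length url > 1 && url.[0] = '+'
            && String.length nick > 1 && nick.[0] = '@' ->
       let drop_first s = String.sub s 1 (String.length s - 1) in
       Some (drop_first nick, drop_first url)
     | _ -> None)
  | _ -> None

(* Collect the distinct (nick, url) pairs announced in a list of log lines. *)
let peers_of_log_lines (lines : string list) : (string * string) list =
  lines
  |> List.filter_map (fun line ->
       Option.bind (last_quoted_field line) peer_of_user_agent)
  |> List.sort_uniq compare

let () =
  let line =
    "203.0.113.7 - - [01/Jan/2021:00:00:00 +0000] "
    ^ "\"GET /twtxt.txt HTTP/1.1\" 200 512 \"-\" "
    ^ "\"twtxt/1.2.3 (+https://example.com/twtxt.txt; @somebody)\""
  in
  List.iter
    (fun (nick, url) -> Printf.printf "%s %s\n" nick url)
    (peers_of_log_lines [line])
```

Browser and crawler User-Agents fall through to `None`, so the same function doubles as the filter. Clients that announce several followers in one User-Agent are not handled here.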

  Rough sketch from late 2019:

      let read file =
        ...
      let write file peers =
        ...
      let fetch peer =
        (* Fetch could mean either or both of:
         * - fetch peer's we-are-twtxt.txt
         * - fetch peer's twtxt.txt and extract mentioned peer URIs
         * *)
        ...
      let test peers =
        ...
      let rec discover peers_old =
        let peers_all =
          Set.fold peers_old ~init:peers_old ~f:(fun peers p ->
            match fetch p with
            | Error _ ->
                (* TODO: Should p be moved to down set here? *)
                log_warning ...;
                peers
            | Ok peers_fetched ->
                Set.union peers peers_fetched
          )
        in
        (* Stop once a round finds no new peers. peers_all is always a
         * superset of peers_old, so the new ones are peers_all - peers_old. *)
        if Set.is_empty (Set.diff peers_all peers_old) then
          peers_all
        else
          discover peers_all
      let rec loop interval peers_old =
        let peers_all = discover peers_old in
        let (peers_up, peers_down) = test peers_all in
        write "peers-all.txt" peers_all;
        write "peers-up.txt" peers_up;
        write "peers-down.txt" peers_down;
        sleep interval;
        loop interval peers_all
      let () =
        loop (Sys.argv.(1)) (read "peers-all.txt")
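
The fixed-point idea behind `discover` can be tried without any network. The following is only an illustration under stated assumptions: it uses the standard library `Set.Make` rather than the labelled `Set` API of the sketch, and `fetch` is a hypothetical stand-in that looks peers up in an in-memory table of who mentions whom.

```ocaml
module Peers = Set.Make (String)

(* Hypothetical stand-in for a network fetch: a peer that is missing from
   the table counts as unreachable. *)
let fetch (graph : (string * string list) list) (peer : string)
  : (Peers.t, string) result =
  match List.assoc_opt peer graph with
  | Some mentioned -> Ok (Peers.of_list mentioned)
  | None -> Error ("unreachable: " ^ peer)

(* Keep fetching from every known peer until a round adds nobody new. *)
let rec discover graph peers_old =
  let peers_all =
    Peers.fold
      (fun p peers ->
         match fetch graph p with
         | Error _ -> peers (* keep what we have; p may simply be down *)
         | Ok peers_fetched -> Peers.union peers peers_fetched)
      peers_old peers_old
  in
  if Peers.is_empty (Peers.diff peers_all peers_old) then peers_all
  else discover graph peers_all

let () =
  (* "d" is mentioned by "c" but has no entry, so fetching it fails. *)
  let graph = [ ("a", ["b"]); ("b", ["c"; "a"]); ("c", ["d"]) ] in
  discover graph (Peers.singleton "a")
  |> Peers.elements
  |> String.concat " "
  |> print_endline
```

Starting from `a` alone, this reaches `a b c d`: unreachable peers stay in the set, which is what leaves room for the later up/down split done by `test`.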

Backlog
-------
- [ ] nick tiebreaker(s)
  - [ ] some sort of a hash of URI?
  - [ ] angry-purple-tiger kind of thingie?
  - [ ] P2P nick registration?
    - [ ] Peers vote by claiming to have seen a nick->uri mapping?
      The inherent race condition would be a feature, since all user name
      registrations are races.
      REQUIRES: blockchain
- [ ] stats
  - [ ] download times per peer
- [ ] Support redirects
  - should permanent redirects update the peer ref somehow?
- [ ] Support time ranges (i.e. reading the timeline between given time points)
- [ ] optional text wrap
- [ ] write
- [ ] timeline limits
- [ ] peer refs set operations (perhaps better done externally?)
- [ ] timeline as a result of a query (peer ref set op + filter expressions)
- [ ] config files
- [ ] highlight mentions
- [ ] filter on mentions
- [ ] highlight hashtags
- [ ] filter on hashtags
- [ ] hashtags as channels? initial hashtag special?
- [ ] query language
- [ ] console logger colors by level ('error)
- [ ] file logger ('debug)
- [ ] Support immutable timelines
  - store individual messages
    - where?
      - something like DBM or SQLite - faster
      - filesystem - transparent, easily published - probably best
  - [ ] block(chain/tree) of twtxts
    - distributed twtxt.db
    - each twtxt.txt is a ledger
    - peers can verify states of ledgers
    - peers can publish known nick->url mappings
    - peers can vote on nick->url mappings
    - we could break time periods into blocks
    - how to handle the fact that many (most?) twtxts are unseen by peers
    - longest X wins?

Done
----
- [x] caching (use cache by default, unless explicitly asked for update)
  - [x] value --> cache
  - [x] value <-- cache
    REQUIRES: d command
- [x] Logger sync before exit.
- [x] Implement rfc3339->epoch
- [x] Remove dependency on rfc3339-old
- [x] remove dependency on http-client
- [x] Build executable
  Implies fix of "collection not found" when executing the built executable
  outside the source directory:

    collection-path: collection not found
      collection: "tt"
      in collection directories:
      context...:
       /usr/share/racket/collects/racket/private/collect.rkt:11:53: fail
       /usr/share/racket/collects/setup/getinfo.rkt:17:0: get-info
       /usr/share/racket/collects/racket/contract/private/arrow-val-first.rkt:555:3
       /usr/share/racket/collects/racket/cmdline.rkt:191:51
       '|#%mzc:p


Cancelled
---------
- [~] named timelines/peer-sets
  REASON: That is basically files of peers, which we already support.