# vim:sw=2:sts=2:
TODO
====

Legend:
- [ ] not started
- [-] in-progress
- [x] done
- [~] cancelled

In-progress
-----------
- [-] timeline limits
  - [x] by time range
  - [ ] by msg count
    - [ ] per peer
    - [ ] total
    Not necessary for short format, because we have Unix head/tail,
    but may be convenient for long format (because msg spans multiple lines).
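
A minimal sketch of both count limits. It assumes messages are already parsed
into a hypothetical `msg` struct with `nick`, `ts` (epoch seconds) and `text`
fields; tt's actual representation may differ.

```racket
#lang racket

;; Hypothetical message representation; tt's real struct may differ.
(struct msg (nick ts text) #:transparent)

;; Order messages newest-first by timestamp.
(define (newest-first msgs)
  (sort msgs > #:key msg-ts))

;; total: keep at most n newest messages overall.
(define (limit-total msgs n)
  (define sorted (newest-first msgs))
  (take sorted (min n (length sorted))))

;; per peer: keep at most n newest messages from each nick.
(define (limit-per-peer msgs n)
  (append-map (λ (group) (limit-total group n))
              (group-by msg-nick msgs)))
```

The per-peer result is grouped by nick, so it would need one more sort before
printing as a single timeline.
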
- [-] Convert to Typed Racket
  - [x] build executable (otherwise too-slow)
  - [-] add signatures
    - [x] top-level
    - [ ] inner
    - [ ] imports
- [-] commands:
  - [x] c | crawl
    Discover new peers mentioned by known peers.
  - [x] r | read
    - see timeline ops above
  - [ ] w | write
    - arg or stdin
    - nick expand to URI
    - Watch FIFO for lines, then read, timestamp and append [+ upload].
      Can be part of a "live" mode, along with background polling and
      incremental printing. Sort of an ii-like IRC experience.
  - [ ] q | query
    - see timeline ops above
    - see hashtag and channels above
  - [x] d | download
    - [ ] options:
      - [ ] all - use all known peers
      - [ ] fast - all except peers known to be slow or unavailable
        REQUIRES: stats
  - [x] u | upload
    - calls user-configured command to upload user's own timeline file to their server
  Looks like a better CLI parser than "racket/cmdline": https://docs.racket-lang.org/natural-cli/
  But it is no longer necessary now that I've figured out how to chain (command-line ..) calls.
- [-] Output formats:
  - [x] text long
  - [x] text short
  - [ ] HTML
  - [ ] JSON
- [-] Peer discovery
  - [-] parse peer refs from peer timelines
    - [x] mentions from timeline messages
      - [x] @<source.nick source.url>
      - [x] @<source.url>
    - [ ] "following" from timeline comments: # following = <nick> <uri>
      1. split file lines in 2 groups: comments and messages
      2. dispatch messages parsing as usual
      3. dispatch comments parsing for:
        - # following = <nick> <uri>
        - what else?
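
The three steps above could be sketched roughly like this. It assumes comment
lines start with "#" in the first column and that nick and URI contain no
whitespace. The function names are made up for illustration, and message
parsing (step 2) is left to the existing code.

```racket
#lang racket

;; Step 1: split timeline lines into comments (leading #) and messages.
(define (split-lines lines)
  (partition (λ (l) (string-prefix? l "#")) lines))

;; Step 3: parse "# following = <nick> <uri>" into (cons nick uri), else #f.
(define (parse-following comment)
  (match (regexp-match #px"^#\\s*following\\s*=\\s*(\\S+)\\s+(\\S+)\\s*$" comment)
    [(list _ nick uri) (cons nick uri)]
    [_ #f]))

;; Collect all followed peers declared in a timeline's comments.
(define (followed-peers lines)
  (define-values (comments _messages) (split-lines lines))
  (filter-map parse-following comments))
```

Other comment fields would get their own parser next to `parse-following`.
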
  - [ ] Parse User-Agent web access logs.
  - [-] Update peer ref file(s)
    - [x] peers-all
    - [x] peers-mentioned
    - [ ] peers-followed (by others, parsed from comments)
    - [ ] peers-up (no net errors)
    - [ ] peers-down (net errors)
    - [ ] peers-valid (up and parsed at least 1 message)
    - [ ] redirects?
  Rough sketch from late 2019:
    let read file =
      ...
    let write file peers =
      ...
    let fetch peer =
      (* Fetch could mean either or both of:
       * - fetch peer's we-are-twtxt.txt
       * - fetch peer's twtxt.txt and extract mentioned peer URIs
       * *)
      ...
    let test peers =
      ...
    let rec discover peers_old =
      let peers_all =
        Set.fold peers_old ~init:peers_old ~f:(fun peers p ->
          match fetch p with
          | Error _ ->
              (* TODO: Should p be moved to down set here? *)
              log_warning ...;
              peers
          | Ok peers_fetched ->
              Set.union peers peers_fetched
        )
      in
      (* Fixed point: stop once a pass finds no peers beyond peers_old.
       * peers_all is a superset of peers_old, so diff in this order. *)
      if Set.is_empty (Set.diff peers_all peers_old) then
        peers_all
      else
        discover peers_all
    let rec loop interval peers_old =
      let peers_all = discover peers_old in
      let (peers_up, peers_down) = test peers_all in
      write "peers-all.txt" peers_all;
      write "peers-up.txt" peers_up;
      write "peers-down.txt" peers_down;
      sleep interval;
      loop interval peers_all
    let () =
      loop (Sys.argv.(1)) (read "peers-all.txt")
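
The discovery step of that sketch, transcribed to Racket as a rough
illustration. `fetch` is passed in as a parameter because the real fetch is
still unspecified above; here it is assumed to return a set of peers, or #f on
failure. Peers are whatever values compare with `equal?`.

```racket
#lang racket

;; fetch : peer -> (or/c set? #f); #f stands for a failed fetch (assumption).
(define (discover fetch peers-old)
  (define peers-all
    (for/fold ([peers peers-old]) ([p (in-set peers-old)])
      (define fetched (fetch p))
      (if fetched (set-union peers fetched) peers)))
  ;; Fixed point: stop when a full pass adds nothing new.
  (if (set-empty? (set-subtract peers-all peers-old))
      peers-all
      (discover fetch peers-all)))
```

Like the sketch above, this re-fetches every known peer on each pass. Tracking
only the newly found peers would avoid that.
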

Backlog
-------
- [ ] Batch download jobs by domain:
  - at most 1 worker per domain
  - more than 1 domain per worker is OK
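
One possible shape for the batching, as a sketch only. Peer URIs are assumed
to be plain strings, and `group-by-domain` and `assign-to-workers` are
hypothetical names. Each worker's list would then be downloaded sequentially.

```racket
#lang racket
(require net/url)

;; Group peer URIs by host, so each host's downloads stay in one batch.
(define (group-by-domain uris)
  (group-by (λ (u) (url-host (string->url u))) uris))

;; Deal the per-domain batches round-robin onto n workers. A domain never
;; spans two workers, but a worker may receive several domains.
(define (assign-to-workers uris n)
  (define buckets (make-vector n '()))
  (for ([batch (in-list (group-by-domain uris))]
        [i (in-naturals)])
    (define w (modulo i n))
    (vector-set! buckets w (append (vector-ref buckets w) batch)))
  (vector->list buckets))
```

Round-robin ignores batch sizes. Balancing by URI count per domain would be a
refinement.
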
- [ ] Remove mention link noise in read view.
  in short view: just abbreviate @<nick uri> to @nick
  in long view: abbreviate like above AND list the full versions after the text
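
A sketch of both views, assuming nick and URI contain neither spaces nor ">".
Bare @<uri> mentions have no nick to abbreviate to, so they are left as they
are. The function names are made up for illustration.

```racket
#lang racket

;; Matches @<nick uri> (two fields); bare @<uri> mentions do not match.
(define mention-rx #px"@<([^ >]+) +([^ >]+)>")

;; Short view: replace each @<nick uri> with @nick.
(define (abbreviate-mentions text)
  (regexp-replace* mention-rx text "@\\1"))

;; Long view: the abbreviated text plus the full "nick uri" strings,
;; returned as a second value so the caller can list them after the text.
(define (abbreviate-with-refs text)
  (values (abbreviate-mentions text)
          (map (λ (m) (string-join (cdr m) " "))
               (regexp-match* mention-rx text #:match-select values))))
```
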
- [ ] Crawl only valid objects
  REQUIRES: peers-valid ref file update
- [ ] Reduce log noise
- [ ] Parallelize crawling by file
- [ ] Parallelize reading by file
- [ ] Support date without time in timestamps
- [ ] Associate cached object with nick.
- [ ] Crawl downloaded web access logs
- [ ] download-command hook to grab the access logs
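
    ;; Extract (nick . uri) from a User-Agent of the form
    ;; "client/version (+uri; @nick)" found in a log line; #f if absent.
    (define (parse log-line)
      (match (regexp-match #px"([^/]+)/([^ ]+) +\\(\\+([a-z]+://[^;]+); *@([^\\)]+)\\)" log-line)
        [(list _ client version uri nick) (cons nick uri)]
        [_ #f]))

    (list->set (filter-map parse (file->lines "logs/combined-access.log")))

    ;; #:build? #t yields paths prefixed with logs-dir, so the type check
    ;; does not depend on the current directory.
    (filter (λ (p) (equal? 'file (file-or-directory-type p))) (directory-list logs-dir #:build? #t))
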
- [ ] user-agent file as CLI option - need to run at least the crawler as another user
- [ ] Support fetching rsync URIs
- [ ] Check for peer duplicates:
  - [ ] same nick for N>1 URIs
  - [ ] same URI for N>1 nicks
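
Both checks are the same grouping with a different key. A sketch, assuming
peers are (cons nick uri) pairs, which may not match tt's actual peer
representation:

```racket
#lang racket

;; Peers as (cons nick uri) pairs (assumption for this sketch).
;; Returns the groups of distinct peers that share the same key.
(define (duplicates-by key peers)
  (filter (λ (group) (> (length group) 1))
          (group-by key (remove-duplicates peers))))

(define (same-nick-many-uris peers) (duplicates-by car peers))
(define (same-uri-many-nicks peers) (duplicates-by cdr peers))
```

Exact repeats of the same pair are dropped first, so only real conflicts are
reported.
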
- [ ] Background polling and incremental timeline updates.
  We can mark which messages have already been printed and print new ones as
  they come in.
  REQUIRES: polling
- [ ] Polling mode/command, where tt periodically polls peer timelines
- [ ] nick tiebreaker(s)
  - [ ] some sort of a hash of URI?
  - [ ] angry-purple-tiger kind of thingie?
  - [ ] P2P nick registration?
    - [ ] Peers vote by claiming to have seen a nick->uri mapping?
      The inherent race condition would be a feature, since all user name
      registrations are races.
      REQUIRES: blockchain
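
The "hash of URI" option could look something like this. SHA-1, the "-"
separator and the 6-digit default are arbitrary choices for illustration, and
`nick+hash` is a hypothetical name.

```racket
#lang racket
(require file/sha1)

;; Disambiguate a nick by appending the first n hex digits of the SHA-1
;; of the peer's URI, giving a name of the form nick-xxxxxx.
(define (nick+hash nick uri [n 6])
  (format "~a-~a" nick (substring (sha1 (open-input-string uri)) 0 n)))
```

The suffix is stable for a given URI, so two peers claiming the same nick get
distinct, repeatable display names without any coordination.
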
- [ ] stats
  - [ ] download times per peer
- [ ] Support redirects
  - should permanent redirects update the peer ref somehow?
- [ ] optional text wrap
- [ ] write
- [ ] peer refs set operations (perhaps better done externally?)
- [ ] timeline as a result of a query (peer ref set op + filter expressions)
- [ ] config files
- [ ] highlight mentions
- [ ] filter on mentions
- [ ] highlight hashtags
- [ ] filter on hashtags
- [ ] hashtags as channels? initial hashtag special?
- [ ] query language
- [ ] console logger colors by level ('error)
- [ ] file logger ('debug)
- [ ] Support immutable timelines
  - store individual messages
    - where?
      - something like DBM or SQLite - faster
      - filesystem - transparent, easily published - probably best
  - [ ] block(chain/tree) of twtxts
    - distributed twtxt.db
    - each twtxt.txt is a ledger
    - peers can verify states of ledgers
    - peers can publish known nick->url mappings
    - peers can vote on nick->url mappings
    - we could break time periods into blocks
    - how to handle the fact that many (most?) twtxts are unseen by peers
    - longest X wins?

Done
----
- [x] Crawl all cache/objects/*, not given peers.
- [x] Support time ranges (i.e. reading the timeline between given time points)
- [x] Dedup read-in peers before using them.
- [x] Prevent redundant downloads
  - [x] Check ETag
  - [x] Check Last-Modified if no ETag was provided
    - [x] Parse rfc2822 timestamps
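
For reference, one standard way to do such checks is an HTTP conditional
request. Whether tt sends these headers or compares the values itself is not
stated here, so the following is only a sketch with hypothetical names. It
takes the cached validators (either may be #f) and prefers ETag, falling back
to Last-Modified as in the list above.

```racket
#lang racket

;; Build conditional request headers from cached validators.
;; ETag takes precedence; Last-Modified is used only when no ETag is cached.
(define (conditional-headers etag last-modified)
  (cond
    [etag (list (format "If-None-Match: ~a" etag))]
    [last-modified (list (format "If-Modified-Since: ~a" last-modified))]
    [else '()]))

;; A 304 Not Modified response means the cached copy is still current.
(define (use-cache? status-code)
  (= status-code 304))
```
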
- [x] caching (use cache by default, unless explicitly asked for update)
  - [x] value --> cache
  - [x] value <-- cache
  REQUIRES: d command
- [x] Logger sync before exit.
- [x] Implement rfc3339->epoch
- [x] Remove dependency on rfc3339-old
- [x] remove dependency on http-client
- [x] Build executable
  Implies fix of "collection not found" when executing the built executable
  outside the source directory:

    collection-path: collection not found
      collection: "tt"
      in collection directories:
      context...:
       /usr/share/racket/collects/racket/private/collect.rkt:11:53: fail
       /usr/share/racket/collects/setup/getinfo.rkt:17:0: get-info
       /usr/share/racket/collects/racket/contract/private/arrow-val-first.rkt:555:3
       /usr/share/racket/collects/racket/cmdline.rkt:191:51
       '|#%mzc:p


Cancelled
---------
- [~] named timelines/peer-sets
  REASON: That is basically files of peers, which we already support.