Commit | Line | Data |
---|---|---|
a78e83b8 | 1 | # vim:sw=2:sts=2: |
e678174b SK |
2 | TODO |
3 | ==== | |
33cf2848 | 4 | |
e678174b SK |
5 | Legend: |
6 | - [ ] not started | |
7 | - [-] in-progress | |
8 | - [x] done | |
9 | - [~] cancelled | |
33cf2848 | 10 | |
e678174b SK |
11 | In-progress |
12 | ----------- | |
a993cb85 SK |
13 | - [-] timeline limits |
14 | - [x] by time range | |
15 | - [ ] by msg count | |
16 | - [ ] per peer | |
17 | - [ ] total | |
18 | Not necessary for short format, because we have Unix head/tail, | |
19 | but may be convenient for long format (because msg spans multiple lines). | |
e678174b SK |
20 | - [-] Convert to Typed Racket |
21 | - [x] build executable (otherwise too slow) | |
22 | - [-] add signatures | |
23 | - [x] top-level | |
24 | - [ ] inner | |
25 | - [ ] imports | |
24f1f64b | 26 | - [-] commands: |
a60c484e SK |
27 | - [x] c | crawl |
28 | Discover new peers mentioned by known peers. | |
24f1f64b | 29 | - [x] r | read |
d96fa613 | 30 | - see timeline ops above |
24f1f64b | 31 | - [ ] w | write |
d96fa613 SK |
32 | - arg or stdin |
33 | - expand nick to URI | |
9d1b7217 SK |
34 | - Watch FIFO for lines, then read, timestamp and append [+ upload]. |
35 | Can be part of a "live" mode, along with background polling and | |
36 | incremental printing. Sort of an ii-like IRC experience. | |
24f1f64b | 37 | - [ ] q | query |
d96fa613 SK |
38 | - see timeline ops above |
39 | - see hashtag and channels above | |
4214c0f3 | 40 | - [x] d | download |
e678174b SK |
41 | - [ ] options: |
42 | - [ ] all - use all known peers | |
43 | - [ ] fast - all except peers known to be slow or unavailable | |
44 | REQUIRES: stats | |
3a4b2233 | 45 | - [x] u | upload |
54c0807b | 46 | - calls user-configured command to upload user's own timeline file to their server |
24f1f64b SK |
47 | Looks like a better CLI parser than "racket/cmdline": https://docs.racket-lang.org/natural-cli/ |
48 | But it is no longer necessary now that I've figured out how to chain (command-line ..) calls. | |
e678174b SK |
49 | - [-] Output formats: |
50 | - [x] text long | |
51 | - [x] text short | |
52 | - [ ] HTML | |
53 | - [ ] JSON | |
54 | - [-] Peer discovery | |
55 | - [-] parse peer refs from peer timelines | |
56 | - [x] mentions from timeline messages | |
57 | - [x] @<source.nick source.url> | |
58 | - [x] @<source.url> | |
a60c484e | 59 | - [ ] "following" from timeline comments: # following = <nick> <uri> |
8cd862ed | 60 | - [ ] Parse User-Agent web access logs. |
a60c484e SK |
61 | - [-] Update peer ref file(s) |
62 | - [x] peers-all | |
63 | - [x] peers-mentioned | |
64 | - [ ] peers-followed (by others, parsed from comments) | |
65 | - [ ] peers-down (net errors) | |
66 | - [ ] redirects? | |
b06cbfc2 | 67 | Rough sketch from late 2019: |
c91a1ca9 SK |
68 | let read file = |
69 | ... | |
70 | let write file peers = | |
71 | ... | |
72 | let fetch peer = | |
73 | (* Fetch could mean either or both of: | |
74 | * - fetch peer's we-are-twtxt.txt | |
75 | * - fetch peer's twtxt.txt and extract mentioned peer URIs | |
76 | * *) | |
77 | ... | |
78 | let test peers = | |
79 | ... | |
80 | let rec discover peers_old = | |
81 | let peers_all = | |
82 | Set.fold peers_old ~init:peers_old ~f:(fun peers p -> | |
83 | match fetch p with | |
84 | | Error _ -> | |
85 | (* TODO: Should p be moved to down set here? *) | |
86 | log_warning ...; | |
87 | peers | |
88 | | Ok peers_fetched -> | |
89 | Set.union peers peers_fetched | |
90 | ) | |
91 | in | |
92 | if Set.is_empty (Set.diff peers_all peers_old) then | |
93 | peers_all | |
94 | else | |
95 | discover peers_all | |
96 | let rec loop interval peers_old = | |
97 | let peers_all = discover peers_old in | |
98 | let (peers_up, peers_down) = test peers_all in | |
99 | write "peers-all.txt" peers_all; | |
100 | write "peers-up.txt" peers_up; | |
101 | write "peers-down.txt" peers_down; | |
102 | sleep interval; | |
103 | loop interval peers_all | |
104 | let () = | |
105 | loop (Float.of_string Sys.argv.(1)) (read "peers-all.txt") |
e678174b SK |
106 | |
107 | Backlog | |
108 | ------- | |
a993cb85 | 109 | - [ ] Support date without time in timestamps |
d3ac9e11 | 110 | - [ ] Associate cached object with nick. |
7d9f2ab5 SK |
111 | - [ ] Crawl downloaded web access logs |
112 | - [ ] download-command hook to grab the access logs | |
113 | ||
114 | (define (parse log-line) | |
115 | (match (regexp-match #px"([^/]+)/([^ ]+) +\\(\\+([a-z]+://[^;]+); *@([^\\)]+)\\)" log-line) | |
116 | [(list _ client version uri nick) (cons nick uri)] | |
117 | [_ #f])) | |
118 | ||
119 | (list->set (filter-map parse (file->lines "logs/combined-access.log"))) | |
120 | ||
121 | (filter (λ (p) (equal? 'file (file-or-directory-type p))) (directory-list logs-dir #:build? #t)) | |
122 | ||
a60c484e | 123 | - [ ] user-agent file as CLI option - need to run at least the crawler as another user |
9c34c974 | 124 | - [ ] Support fetching rsync URIs |
3231d4b5 SK |
125 | - [ ] Check for peer duplicates: |
126 | - [ ] same nick for N>1 URIs | |
127 | - [ ] same URI for N>1 nicks | |
55da29c0 SK |
128 | - [ ] Background polling and incremental timeline updates. |
129 | We can mark which messages have already been printed and print new ones as | |
130 | they come in. | |
131 | REQUIRES: polling | |
4ffb857c | 132 | - [ ] Polling mode/command, where tt periodically polls peer timelines |
e678174b SK |
133 | - [ ] nick tiebreaker(s) |
134 | - [ ] some sort of a hash of URI? | |
135 | - [ ] angry-purple-tiger kind of thingie? | |
136 | - [ ] P2P nick registration? | |
137 | - [ ] Peers vote by claiming to have seen a nick->uri mapping? | |
138 | The inherent race condition would be a feature, since all user name | |
139 | registrations are races. | |
140 | REQUIRES: blockchain | |
141 | - [ ] stats | |
142 | - [ ] download times per peer | |
143 | - [ ] Support redirects | |
54c0807b | 144 | - should permanent redirects update the peer ref somehow? |
e678174b SK |
145 | - [ ] optional text wrap |
146 | - [ ] write | |
54c0807b SK |
147 | - [ ] peer refs set operations (perhaps better done externally?) |
148 | - [ ] timeline as a result of a query (peer ref set op + filter expressions) | |
e678174b SK |
149 | - [ ] config files |
150 | - [ ] highlight mentions | |
151 | - [ ] filter on mentions | |
152 | - [ ] highlight hashtags | |
153 | - [ ] filter on hashtags | |
154 | - [ ] hashtags as channels? initial hashtag special? | |
155 | - [ ] query language | |
156 | - [ ] console logger colors by level ('error) | |
157 | - [ ] file logger ('debug) | |
158 | - [ ] Support immutable timelines | |
159 | - store individual messages | |
160 | - where? | |
161 | - something like DBM or SQLite - faster | |
162 | - filesystem - transparent, easily published - probably best | |
163 | - [ ] block(chain/tree) of twtxts | |
164 | - distributed twtxt.db | |
165 | - each twtxt.txt is a ledger | |
166 | - peers can verify states of ledgers | |
167 | - peers can publish known nick->url mappings | |
168 | - peers can vote on nick->url mappings | |
169 | - we could break time periods into blocks | |
170 | - how to handle the fact that many (most?) twtxts are unseen by peers | |
171 | - longest X wins? | |
172 | ||
173 | Done | |
174 | ---- | |
d3ac9e11 | 175 | - [x] Crawl all cache/objects/*, not given peers. |
a993cb85 | 176 | - [x] Support time ranges (i.e. reading the timeline between given time points) |
38c9ecd5 | 177 | - [x] Dedup read-in peers before using them. |
9c5e4499 SK |
178 | - [x] Prevent redundant downloads |
179 | - [x] Check ETag | |
180 | - [x] Check Last-Modified if no ETag was provided | |
181 | - [x] Parse rfc2822 timestamps | |
e678174b SK |
182 | - [x] caching (use cache by default, unless explicitly asked for update) |
183 | - [x] value --> cache | |
184 | - [x] value <-- cache | |
185 | REQUIRES: d command | |
186 | - [x] Logger sync before exit. | |
187 | - [x] Implement rfc3339->epoch | |
188 | - [x] Remove dependency on rfc3339-old | |
189 | - [x] remove dependency on http-client | |
190 | - [x] Build executable | |
191 | Implies fix of "collection not found" when executing the built executable | |
192 | outside the source directory: | |
193 | ||
194 | collection-path: collection not found | |
195 | collection: "tt" | |
196 | in collection directories: | |
197 | context...: | |
198 | /usr/share/racket/collects/racket/private/collect.rkt:11:53: fail | |
199 | /usr/share/racket/collects/setup/getinfo.rkt:17:0: get-info | |
200 | /usr/share/racket/collects/racket/contract/private/arrow-val-first.rkt:555:3 | |
201 | /usr/share/racket/collects/racket/cmdline.rkt:191:51 | |
202 | '|#%mzc:p | |
203 | ||
204 | ||
205 | Cancelled | |
206 | --------- | |
207 | - [~] named timelines/peer-sets | |
208 | REASON: That is basically files of peers, which we already support. |
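
Note on the late-2019 discovery sketch under "Peer discovery": the loop is a fixpoint iteration, so it should stop once a pass adds no peers that were not already known. Below is a minimal self-contained sketch of that idea. It assumes the OCaml standard library `Set.Make` rather than the Base-style `Set` used in the notes, and the `mentions` table is a hypothetical in-memory stand-in for actually fetching peer timelines.

```ocaml
module Peers = Set.Make (String)

(* Hypothetical data: which peers each known timeline mentions.
   "d" is mentioned but has no entry, so fetching it fails. *)
let mentions : (string * string list) list =
  [ ("a", [ "b" ]); ("b", [ "c"; "d" ]); ("c", []) ]

(* Stand-in for a network fetch; Error models an unreachable peer. *)
let fetch (p : string) : (Peers.t, string) result =
  match List.assoc_opt p mentions with
  | Some ps -> Ok (Peers.of_list ps)
  | None -> Error ("unreachable: " ^ p)

(* Expand the peer set until a pass discovers nothing new. *)
let rec discover (peers_old : Peers.t) : Peers.t =
  let peers_all =
    Peers.fold
      (fun p acc ->
        match fetch p with
        | Error _ -> acc (* keep the peer, just do not expand it *)
        | Ok fetched -> Peers.union acc fetched)
      peers_old peers_old
  in
  (* peers_all is a superset of peers_old, so the new peers are
     peers_all minus peers_old, not the other way around. *)
  if Peers.is_empty (Peers.diff peers_all peers_old) then peers_all
  else discover peers_all

let () =
  let found = discover (Peers.singleton "a") in
  print_endline (String.concat " " (Peers.elements found))
```

Starting from the single peer "a", this should reach "b", then "c" and "d", and then terminate because the next pass finds nothing new. Whether unreachable peers such as "d" should be moved to a separate down set is left open here, as in the original notes.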