X-Git-Url: https://git.xandkar.net/?a=blobdiff_plain;f=TODO;h=f89176cb77e88fa8becfde5501e36ce782b2794f;hb=eade817510cd03ba31e90099238f06d6c30872aa;hp=2d4efd99a53d512a1cc0833108802bbeb902b817;hpb=d3ac9e11ef7903d22b29241574b106a434f8ddd0;p=tt.git

diff --git a/TODO b/TODO
index 2d4efd9..f89176c 100644
--- a/TODO
+++ b/TODO
@@ -57,12 +57,19 @@ In-progress
   - [x] @
   - [x] @
   - [ ] "following" from timeline comments: # following =
+    1. split file lines in 2 groups: comments and messages
+    2. dispatch messages parsing as usual
+    3. dispatch comments parsing for:
+      - # following =
+      - what else?
 - [ ] Parse User-Agent web access logs.
 - [-] Update peer ref file(s)
   - [x] peers-all
   - [x] peers-mentioned
   - [ ] peers-followed (by others, parsed from comments)
+  - [ ] peers-up (no net errors)
   - [ ] peers-down (net errors)
+  - [ ] peers-valid (up and parsed at least 1 message)
   - [ ] redirects?
   Rough sketch from late 2019:
     let read file =
@@ -106,6 +113,17 @@ In-progress
 
 Backlog
 -------
+- [ ] Batch download jobs by domain:
+  - at most 1 worker per domain
+  - more than 1 domain per worker is OK
+- [ ] Remove mention link noise in read view.
+  in short view: just abbreviate @ to @nick
+  in long view: abbreviate like above AND list the full versions after the text
+- [ ] Crawl only valid objects
+  REQUIRES: peers-valid ref file update
+- [ ] Reduce log noise
+- [ ] Parallelize crawling by file
+- [ ] Parallelize reading by file
 - [ ] Support date without time in timestamps
 - [ ] Associate cached object with nick.
 - [ ] Crawl downloaded web access logs