Programmatically organising news with elisp

elfeed is a programmable feed reader for Emacs, which means you can use code to automatically organise and prioritise news based on your requirements.

I thought it would be fun and interesting to share how I organise my news with the Elisp programming language.

I’m going to discuss how I organise content from news aggregator sites with user commentary, such as Reddit and Hacker News.

The code below is used to organise Reddit content.

I start by tagging content newer than one day as requiring user commentary. For the time being, this content will not show up in the feed reader.

(add-hook 'elfeed-new-entry-hook
          (elfeed-make-tagger :feed-url "www\\.reddit\\.com"
                              :after "1 day ago"
                              :add 'require-commentary))
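
I haven’t included my exact search filter here, but keeping these entries out of view is just a matter of excluding the tag in elfeed’s default filter; a minimal example (the date range and +unread term are placeholders, not my real setup):

;; Hide entries that are still waiting on commentary.
(setq elfeed-search-filter "@1-week-ago +unread -require-commentary")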

Content that is older than one day should have accumulated enough user commentary, so I remove the require-commentary tag.

(add-hook 'elfeed-new-entry-hook
          (elfeed-make-tagger :feed-url "www\\.reddit\\.com"
                              :before "1 day ago"
                              :remove 'require-commentary))

The absence of the “require-commentary” tag means that the content will be processed. Processing the content involves gathering the article link, the link to the comments page on the news aggregator site, and the comment count (which I use to prioritise content), and storing all of this as metadata on the entry.

(defun edin-elfeed/process-reddit (entry)
  (when (and (elfeed-tagged-p 'reddit entry)
	     (not (elfeed-tagged-p 'require-commentary entry))
	     (not (elfeed-tagged-p 'processed entry))) ;; optimisation, to stop re-scraping of entries (e.g. comment count)
    (edin-elfeed/add-reddit-op-link-as-metadata entry)
    (edin-elfeed/add-reddit-comment-link-as-metadata entry)
    (edin-elfeed/add-reddit-comment-count-as-metadata entry)
    (elfeed-tag entry 'processed)))
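
The add-…-as-metadata helpers aren’t reproduced here; all they need to do is store the value returned by the corresponding get-…-from-source function (each defined below) as entry metadata, which elfeed supports with setf on elfeed-meta. A minimal sketch, where the :op-link key name is illustrative while :comment-link and :comment-count are the keys used later on:

(defun edin-elfeed/add-reddit-op-link-as-metadata (entry)
  ;; Extraction can return nil (e.g. when the entry has no content),
  ;; so only store a real link.
  (let ((link (edin-elfeed/get-reddit-op-link-from-source entry)))
    (when link
      (setf (elfeed-meta entry :op-link) link))))

(defun edin-elfeed/add-reddit-comment-link-as-metadata (entry)
  (setf (elfeed-meta entry :comment-link)
        (edin-elfeed/get-reddit-comment-link-from-source entry)))

(defun edin-elfeed/add-reddit-comment-count-as-metadata (entry)
  (setf (elfeed-meta entry :comment-count)
        (edin-elfeed/get-reddit-comment-count-from-source entry)))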

The article link is extracted from the RSS entry’s content, as retrieved from the Reddit RSS feed:

(defun edin-elfeed/get-reddit-op-link-from-source (entry)
  (let ((content (elfeed-deref (elfeed-entry-content entry))))
    (cond ((null content) nil)
	  ((string-match "<a href=\"\\([^\"]+\\)\">\\[link\\]</a>" content) (match-string 1 content))
	  (t nil))))
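
The regular expression keys on the [link] anchor that Reddit includes in the content of link posts (the “submitted by /u/… [link] [comments]” footer). A quick illustration with placeholder URLs:

;; Illustrative only: a trimmed Reddit-style content string and the
;; extraction the regular expression above performs on it.
(let ((content (concat "submitted by /u/someone "
                       "<a href=\"https://example.com/article\">[link]</a> "
                       "<a href=\"https://www.reddit.com/r/emacs/comments/abc123/\">[comments]</a>")))
  (when (string-match "<a href=\"\\([^\"]+\\)\">\\[link\\]</a>" content)
    (match-string 1 content)))
;; => "https://example.com/article"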

In the case of Reddit, the comments page link is just the RSS entry’s own link:

(defun edin-elfeed/get-reddit-comment-link-from-source (entry)
  (elfeed-entry-link entry))

The comment count is acquired by scraping the HTML of the Reddit comments page:

(defun edin-elfeed/get-reddit-comment-count-from-source (entry)
  (let* ((buffer (url-retrieve-synchronously (elfeed-meta entry :comment-link)))
         (html (and buffer
                    (with-current-buffer buffer
                      (prog1 (buffer-string)
                        (kill-buffer)))))) ; don't leak a buffer per scraped entry
    (if (and html (string-match ">\\([0-9]+\\) comments?</a>" html))
        (string-to-number (match-string 1 html))
      0)))

Finally, the content is prioritised based on the number of comments it has received:

(defun edin-elfeed/add-priority-tags (entry)
  (when (and (or (elfeed-tagged-p 'reddit entry)
                 (elfeed-tagged-p 'hn entry))
             (elfeed-tagged-p 'processed entry))
    (let ((comment-count (or (elfeed-meta entry :comment-count) 0)))
      ;; entries with 50-99 comments keep the default priority (no tag)
      (cond ((< comment-count 50) (elfeed-tag entry 'low-priority))
            ((>= comment-count 100) (elfeed-tag entry 'high-priority))))))
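
One way to put the priority tags to work is in elfeed’s search filter, for example (both filters are only illustrative):

;; Jump straight to heavily-discussed items, or hide the quiet ones.
(elfeed-search-set-filter "+unread +high-priority")
(elfeed-search-set-filter "+unread -low-priority")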

Feed readers are a great way to reduce the distractions of the modern web, while elfeed’s programmability gives you the power to define how you want to read the news.