A Recommendation Engine built for the speed of news

The Chartbeat Recommendation Engine is a new suite of APIs that publishers can use to populate a “related news” or “read more” section on their websites. It uses deep learning language models and Chartbeat’s large pageview dataset to suggest relevant, engaging stories that readers are likely to click on next.

We knew before we began building the Recommendation Engine that it had to be fast. It needs to serve article recommendations as soon as a page loads, so if the Recommendation Engine is slow, the overall load time of the page is slow too, which is bad for the reader experience and bad for SEO.

To make the Recommendation Engine fast, we had to think carefully about how to architect it.

Designing for speed

Generating a good recommendation takes time. To recommend articles similar to the one a user is currently reading, we have to scan a pool of tens of thousands of articles. Then, before serving the recommended articles to users, we have to retrieve metrics and metadata for each article, which requires at least one round trip to a database.
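To make that concrete, here’s a rough sketch of what the generation step might look like. It assumes we have precomputed embeddings for the current article and the candidate pool; the names here, including the `fetch_metadata` helper, are illustrative stand-ins rather than our production code:

```python
import numpy as np

def generate_recommendations(current_emb, pool_embs, pool_ids, k=5):
    """Scan the candidate pool for the articles most similar to the current one.

    Assumes all embeddings are L2-normalized, so a dot product is
    equivalent to cosine similarity.
    """
    scores = pool_embs @ current_emb        # one similarity score per candidate
    top_k = np.argsort(scores)[::-1][:k]    # indices of the k best matches
    return [pool_ids[i] for i in top_k]

# Before serving, each recommended article still needs its metrics and
# metadata, costing at least one database round trip:
#   results = [fetch_metadata(article_id) for article_id in recommended_ids]
```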

To speed things up, we considered skipping or simplifying some of these steps, but that would have meant reducing the quality of our recommendations. We wanted a solution that freed us from having to trade the speed of our engine against the quality of our recommendations.

We realized that caching was the solution we needed. Rather than generating a fresh recommendation for every request to our Recommendation Engine, we would generate a recommendation for the first request and save it for later requests. We would save it in a fast, in-memory database that lets us retrieve it in milliseconds the next time we receive a request for recommendations to display below the same article.
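In outline, this is the classic cache-aside pattern. Here’s a minimal sketch of the lookup path, using a plain dict as a stand-in for the in-memory store (in production that role would be played by something like Redis) and a made-up TTL:

```python
import time

CACHE: dict = {}         # stand-in for a fast in-memory store such as Redis
TTL_SECONDS = 15 * 60    # hypothetical freshness window for cached recommendations

def generate_recommendations_for(article_id: str) -> list[str]:
    """Stand-in for the slow generation step sketched earlier."""
    return ["article-123", "article-456"]

def get_recommendations(article_id: str) -> list[str]:
    entry = CACHE.get(article_id)
    if entry is not None:
        recs, stored_at = entry
        if time.time() - stored_at < TTL_SECONDS:
            return recs                    # cache hit: served in milliseconds
    # Cache miss (or stale entry): do the slow work once, then save the result.
    recs = generate_recommendations_for(article_id)
    CACHE[article_id] = (recs, time.time())
    return recs
```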

Caching allowed us to take our time generating quality recommendations while still responding quickly to user requests, but it introduced its own complexities that we needed to design around. The biggest challenge was determining what exactly we would store in our cache.

A hybrid approach to cache management

Caching is a strategy used everywhere in software development and computer science. The basic idea is to save the result of an expensive computation so that the next time you need to perform the same computation, you can reuse the result you saved before.
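In its simplest form this can be a single decorator. A toy Python example, with a trivial stand-in for the expensive work:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def expensive_computation(x: int) -> int:
    print(f"computing for {x}...")    # only prints on the first call per input
    return x * x                      # stand-in for real expensive work

expensive_computation(4)   # computes and caches the result
expensive_computation(4)   # returns the saved result instantly
```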

One subtlety is that you need rules for determining when a given computation is the same as one that has come before. The simplest approach is to consider two computations identical if they share exactly the same inputs. But the APIs that comprise the Recommendation Engine support numerous parameters, such as excluding articles in certain sections or sorting articles by pageviews and other metrics. If we treated every parameter as part of the cache key, we would have to store an enormous number of different recommendations, and the problem would compound with every new parameter we wanted to support in our API.
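To see why, here’s what that naive keying scheme would look like. The parameter names are invented for illustration:

```python
import json

def naive_cache_key(article_id, exclude_sections=(), sort_by="pageviews", limit=5):
    """Build a cache key from the article plus *every* request parameter.

    Every distinct combination of parameters yields a distinct key, so
    each combination gets its own cached recommendation.
    """
    params = {"exclude": sorted(exclude_sections), "sort": sort_by, "limit": limit}
    return f"{article_id}:{json.dumps(params, sort_keys=True)}"

# Two requests for the same article with slightly different parameters miss
# each other's entries entirely, and the key space multiplies with every new
# parameter the API supports.
```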

We chose not to do this. Instead, we took a hybrid approach: we account for all inputs when determining whether two recommendation requests are identical, but we save only a single recommendation per story in our cache. If one of the user-supplied parameters changes, we treat that as a cache miss, regenerate the recommendation, and replace the one saved in our cache.
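Here’s a sketch of that hybrid scheme, using the same invented parameter names as above. The cache holds one entry per article, and the request parameters are stored alongside the recommendation instead of being baked into the key:

```python
CACHE: dict = {}   # one entry per article: article_id -> (params, recommendations)

def generate_recommendations_for(article_id: str, params: dict) -> list[str]:
    """Stand-in for the slow generation step sketched earlier."""
    return ["article-123", "article-456"]

def get_recommendations(article_id: str, params: dict) -> list[str]:
    entry = CACHE.get(article_id)
    if entry is not None:
        cached_params, cached_recs = entry
        if cached_params == params:
            return cached_recs         # hit: same article, same parameters
    # Miss: nothing cached, or the parameters changed, so regenerate and
    # replace the single entry for this article.
    recs = generate_recommendations_for(article_id, params)
    CACHE[article_id] = (params, recs)
    return recs
```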

This approach works because, while we expect our users to make extensive use of a parameter like the one for excluding sections, we do not expect its value to vary much for a given article. Editorial staff might want to pick a few sections to exclude from recommendations for a certain article, but that list of sections will typically be set just once, when the article is published.

What we built

By architecting our system with caching at its heart, we were able to build a suite of APIs that can return recommendations to users in tens of milliseconds. This makes it possible to call the Recommendation Engine during a page load without having to worry about creating a bad reader experience or tempting Google to punish your site with a lower ranking.

Caching solves one problem but creates others; in our case, it added complexity to our system. However, by constraining what we cache, we reduced that complexity to a level that makes the tradeoff worth it for every user of the Chartbeat Recommendation Engine.


Interested in chatting about how you can use our Recommendation Engine? Let us know here.


Sinclair Target is a Senior Software Engineer working on backend systems at Chartbeat. In his free time, he likes to write about computer history and why the software we use every day works the way it does.