Skip to content
This repository has been archived by the owner on Jun 1, 2024. It is now read-only.

[Do not merge] Fetch comments concurrently in a thread pool #2502

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

AdamNiederer
Copy link

Hi there,

I've been trying to improve the speed of offline subbreddit caching, and I figured that serially awaiting each HTTP request was eating the bulk of the time.

While fixing that problem, I found that RedditClient.execute() has some kind of lock which doesn't let me query Reddit concurrently. Although this patch doesn't hurt the wall time of the caching process, I'd recommend not merging this until I can figure out how to get around that limitation. Does reddit have a particularly oppressive rate limiter or something?

While it isn't the immense performance increase I anticipated, it does help
speed up offline caching somewhat.
@ccrama
Copy link
Collaborator

ccrama commented Aug 5, 2017

Thank you for the PR!

Actually, this is done on purpose because Reddit has API rate limitations, and the caching mechanism was designed to not break these limits. Unfortunately don't know if your changes break these limits, but I know that JRAW has a master rate limit built into it (I figured serially downloading would ensure we don't break any limits on our side either). One way we could do this is to execute each as a separate pooled thread, but keeping track of all of those threads is pretty messy and that's why I opted to just have one master async thread. If you have any other suggestions on how to go about this, I'm open to any suggestions!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants