Ratecatcher middleware #3581

Draft pull request; wants to merge 2 commits into base: main

@heckj (Collaborator) commented Jan 4, 2025

Sketch to show WIP

  • adds a middleware that uses cfray and reads/writes Redis to compute a "number of requests over a sliding window"

HOWEVER - this isn't even compiling for me, as the request.redis that I get back is returning me an EventLoopFuture, and not something that I can interact with using async/await - so I'm missing something there. Even still, I thought getting the loose sketch of what I was thinking up and visible might be useful.

(Do we need a slightly newer version of Vapor, or of some of its dependencies, to get a RedisClient that supports async/await?)
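For readers following along, here is a minimal sketch of the "requests over a sliding window" idea, with an in-memory dictionary standing in for Redis. The type names, the `source:bucket` key format, and the bucket parameters are all assumptions made for illustration; this is not the code in this PR.

```swift
// In-memory stand-in for a Redis counter (assumption: the real middleware
// would call an INCR-style operation on a Redis client instead).
struct InMemoryCounterStore {
    var counts: [String: Int] = [:]

    // Mirrors INCR semantics: a missing key starts at 0; returns the new value.
    mutating func increment(_ key: String) -> Int {
        counts[key, default: 0] += 1
        return counts[key, default: 0]
    }

    // Returns the stored count, or 0 if the key was never incremented.
    func value(_ key: String) -> Int {
        counts[key] ?? 0
    }
}

// Hypothetical sliding-window counter: one bucket per `bucketSeconds`,
// summed over the most recent `bucketCount` buckets.
struct SlidingWindowCounter {
    var store = InMemoryCounterStore()
    let bucketSeconds: Int
    let bucketCount: Int

    // Records one request from `source` at `time` (in seconds) and returns
    // how many requests from that source fall inside the current window.
    mutating func record(source: String, at time: Int) -> Int {
        let current = time / bucketSeconds
        _ = store.increment("\(source):\(current)")
        var total = 0
        for bucket in (current - bucketCount + 1)...current {
            total += store.value("\(source):\(bucket)")
        }
        return total
    }
}
```

With `bucketSeconds: 10` and `bucketCount: 6` this approximates a 60-second window. A middleware could compare the returned total against a threshold and reject the request when it is exceeded. A Redis-backed version would also want an expiry on each bucket key so that old buckets are cleaned up.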

heckj self-assigned this Jan 4, 2025
cla-bot (bot) added the cla-signed label Jan 4, 2025

@finestructure (Member) commented Jan 6, 2025

> HOWEVER - this isn't even compiling for me, as the request.redis that I get back is returning me an EventLoopFuture, and not something that I can interact with using async/await - so I'm missing something there.

You just need to tack a .get() onto the EventLoopFuture (ELF) to convert it over into async/await land:

-        let countOverWindow: Int = try await request.redis.increment(combinedKey)
+        let countOverWindow: Int = try await request.redis.increment(combinedKey).get()

But also, we won't have request.redis, because Redis is set up differently via #3582. It will be:

@Dependency(\.redis) var redis
let countOverWindow: Int = try await redis.increment(combinedKey)

(But we'll need to extend the existing RedisClient with increment. I'll deal with that.)
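As a rough illustration of the shape described above (a client that exposes an async `increment` and is passed in rather than read from `request.redis`), here is a self-contained sketch. `CounterClient`, `InMemoryCounterClient`, and `isOverLimit` are hypothetical names invented for this example; they are not the project's RedisClient, nor the `@Dependency` mechanism from #3582.

```swift
// Hypothetical protocol: the only capability the rate check needs is an
// async increment that returns the new count.
protocol CounterClient: Sendable {
    func increment(_ key: String) async throws -> Int
}

// In-memory implementation for illustration and tests. An actor serializes
// access to `counts`, so concurrent increments do not race.
actor InMemoryCounterClient: CounterClient {
    private var counts: [String: Int] = [:]

    func increment(_ key: String) async throws -> Int {
        counts[key, default: 0] += 1
        return counts[key, default: 0]
    }
}

// Increments the counter for `key` and reports whether the new count
// exceeds `limit`. The caller decides what to do with a `true` result.
func isOverLimit<C: CounterClient>(client: C, key: String, limit: Int) async throws -> Bool {
    let count = try await client.increment(key)
    return count > limit
}
```

Because the check depends only on the protocol, a Redis-backed client and an in-memory one are interchangeable, which is roughly what injecting the client as a dependency would buy in tests.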

@finestructure (Member) commented

Ok, I see how this would work, looking great! I haven't found a whole lot of detail on CF's RayID but I suppose it's similar to a source IP and should be a fairly stable value for traffic coming from the same source?

Once we have #3582 merged, it'll be easy to rebase this PR. I can take care of that.

@Sherlouk (Collaborator) commented Jan 6, 2025

I appreciate the only context I have here is an issue from when bots seemingly took down the site, and a draft PR here exploring an interesting idea, but I was curious as to why you're tackling this problem at the service level?

Especially (but not entirely reliant on) when Cloudflare is one of your key sponsors, it doesn't put them in a great light when their service can't protect you. I mean, they can, if configured for it: I have the cheapo plan with most of the protections enabled (and a few allow lists for edge cases), which protects against crawlers I don't want. Was this explored as an option?

Even without them as a sponsor (not sure what your contract allows/doesn't allow), this is what CDNs and similar services at the AWS/Azure top levels provide.

Solving bots at the service level has always felt like an anti-pattern to me personally! But maybe that's because I don't know what I'm doing and would rather leave it to the experts 😂

@finestructure (Member) commented

Yes, we've toggled the easy options and they didn't mitigate the issue unfortunately :(

@Sherlouk (Collaborator) commented Jan 6, 2025

That's mega frustrating then! What settings have you got on, out of interest?

For me, within "Super Bot Fight Mode" I've got Block AI Bots enabled, and managed challenge for "definitely automated" (which Cloudflare describes as bad bots). I allow through the Cloudflare verified bots, which are mostly the SEO crawlers that you want to have.

I do have some other allow lists and such (especially whitelisting things like Google domain verification which for some reason wasn't being allowed).

Looking at my bots dashboard it's detecting and rejecting a lot of clients, but the logs demonstrate none of that traffic is traffic I want to hit my service (i.e. it's working well). Looking at my Google logs shows traffic origins, and again it's all genuine traffic so I'm happy.
