I think a good chunk of it is bot-induced performance problems, yea. Whether that's compute or transfer. And advertisement costs.
Optimization is very much not a solved problem, though. Just look at basically all software ever written: it's written for an optimization priority and to a price point (whether commercial $$ or personal time), and that target's value to its users has shifted rather dramatically.
This is really interesting. I was indeed looking at this problem from the wrong perspective.
I'm working on an open-source tool that could be useful for bot detection, but I'm still not confident that anyone would deploy it on-prem and handle the setup and maintenance instead of just routing traffic through the cloud.
I think you'd definitely find some interest, e.g. anyone who intentionally avoids "the cloud" will want something local. Honestly, I assume some of these already exist, monitoring Apache/nginx/etc. logs. Anubis is arguably similar and has been exploding lately, for example, though I'm not sure whether it auto-updates its rules at all: https://github.com/TecharoHQ/anubis
As to if it'd get enough interest: yea no idea at all. I wish you luck tho! Clearly there's a need for this kind of thing.
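To make the log-monitoring idea a bit more concrete, here's a rough sketch of the simplest possible version: count requests per client IP in an access log and flag the heavy hitters. This is not how Anubis or any particular tool works. The function name `flag_heavy_clients`, the `threshold` parameter, and the assumption that the log uses the common/combined access log format are all mine, purely for illustration.

```python
import re
from collections import Counter

# Assumption: lines follow the common/combined access log layout, i.e.
# client IP, identd, user, [timestamp], "request line", ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "')


def flag_heavy_clients(lines, threshold=100):
    """Return {ip: request_count} for clients with more than `threshold` requests.

    Hypothetical helper: a real detector would also look at time windows,
    user agents, paths, and so on.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match:  # silently skip lines that don't look like access log entries
            counts[match.group(1)] += 1
    return {ip: n for ip, n in counts.items() if n > threshold}
```

You'd feed it an open log file (any iterable of lines works) and then decide what to do with the flagged IPs. A raw count like this will happily flag a busy office NAT too, so treat it as a starting point for monitoring rather than a block list.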
Our team develops a risk-based analytics system that we also use for bot detection. From our perspective, bots shouldn't be blindly blocked, but rather properly monitored and blocked only when necessary. Here is a live demo (1) to give you a general idea.