AI Bots Are a Problem No One Will Fix

My relationship with artificial intelligence (AI) tools is complicated. On one hand, tools like ChatGPT and GitHub Copilot increase my productivity. That’s a boon to my day job as a web developer.

On the other hand, things aren’t so positive. For one, AI models consume energy at alarming rates. MIT Technology Review published a report looking at just how much electricity these tools gulp:

From 2005 to 2017, the amount of electricity going to data centers remained quite flat thanks to increases in efficiency, despite the construction of armies of new data centers to serve the rise of cloud-based online services, from Facebook to Netflix. In 2017, AI began to change everything. Data centers started getting built with energy-intensive hardware designed for AI, which led them to double their electricity consumption by 2023. The latest reports show that 4.4% of all the energy in the US now goes toward data centers.

Source: “We did the math on AI’s energy footprint. Here’s the story you haven’t heard,” MIT Technology Review

Power usage is only one concern. The invasion of AI-powered bots is another issue drawing ire. These automated minions visit websites, scrape their content, and add it to their parent service’s cache of knowledge.

There’s much debate about what AI companies can and should have access to. For example, should they continue to scoop up copyrighted materials to train their models? Authors, artists, musicians, and even researchers are feeding these models, whether they want to or not.

Then there’s the flood of traffic AI bots send to websites. You might imagine they target big sites like Wikipedia, and they do. However, an automated army seems just as likely to descend on a local government or organizational site.

I’ve seen relatively small sites inundated with AI bot traffic. The deluge slowed them to a crawl and earned them warnings from their web hosting providers.

Mind you, this isn’t your typical search engine bot politely checking in when new content is published. Google’s crawler will show up in your analytics reports, but it isn’t hogging your site’s resources. AI scrapers tend to hit websites to the point of exhaustion (yours and your host’s). We’re talking thousands of requests within 24 hours. That’s a lot of traffic for a small site on a modest hosting plan.

I now think of AI bots as virtual mafia thugs. They show up uninvited, take what they want, and leave a mess for you to clean up. No accountability, no manners.

Can Anything Be Done About Excessive AI Bot Traffic?

The incessant nature of the problem has inspired some site owners to set up virtual toll booths. The idea is to make AI companies pay a set fee to access your site. In theory, the fees could offset the added hosting costs that come with bot traffic. That may work for large institutions and publications. Whether companies like OpenAI, Amazon, or Microsoft will pay everyday site owners is a whole other question.

Blocking bots on your own is possible, but imperfect. I wrote a primer on the subject for Speckyboy earlier this year.

The takeaway is that new bots pop up almost daily. You might block OpenAI’s crawler only to find another company’s scraper hammering your site. Much like web security, blocking AI bots is a game of virtual whack-a-mole.
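To give a sense of what DIY blocking looks like, here’s a minimal robots.txt sketch. The user-agent tokens below are real, documented crawler names at the time of writing (GPTBot for OpenAI, CCBot for Common Crawl, ClaudeBot for Anthropic), but treat the list as illustrative rather than exhaustive:

    # Ask known AI crawlers to stay out of the entire site
    # (token names change often; this list is illustrative, not complete)
    User-agent: GPTBot
    Disallow: /

    User-agent: CCBot
    Disallow: /

    User-agent: ClaudeBot
    Disallow: /

The catch is that robots.txt is an honor system. Well-behaved crawlers respect it; nothing forces a scraper to comply. Stricter enforcement means rejecting requests by their User-Agent header (or IP range) at the web server or firewall level, and that list needs the same constant upkeep.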

Users of Cloudflare’s content delivery network (CDN) can opt into a bot-blocking feature. Their AI Labyrinth claims to send bots into (wait for it) an AI-powered black hole. The bots are forced to index bogus content. My experience with the feature is that it greatly reduces bot traffic. However, the shifting nature of the problem means it must keep adapting to stay effective.

Plus, private industry can only do so much. Governing bodies are the only ones who can stop this runaway train.

It would be nice to see legislatures worldwide take on AI companies. Perhaps scraping could be forced to become an opt-in practice for site owners, as opposed to the current free-for-all. I won’t hold my breath.

Governments seem all too happy to provide incentives for AI companies to build data centers. In the USA, states are bending over backwards for the privilege. I haven’t heard much discussion about mitigating these companies’ environmental and societal impacts.

With that, the problem of AI-fueled traffic surges will only continue, if not worsen. The trend has already become a source of frustration for web developers and a budget-buster for website owners. Now, will anyone fix it?
