this post was submitted on 11 Jan 2025

"On Saturday, Triplegangers CEO Oleksandr Tomchuk was alerted that his company’s e-commerce site was down. It looked to be some kind of distributed denial-of-service attack. (tldr.nettime.org)

submitted 1 week ago by remixtures@tldr.nettime.org to c/cybersecurity@fedia.io

"On Saturday, Triplegangers CEO Oleksandr Tomchuk was alerted that his company’s e-commerce site was down. It looked to be some kind of distributed denial-of-service attack.

He soon discovered the culprit was a bot from OpenAI that was relentlessly attempting to scrape his entire, enormous site.

“We have over 65,000 products, each product has a page,” Tomchuk told TechCrunch. “Each page has at least three photos.”

OpenAI was sending “tens of thousands” of server requests trying to download all of it, hundreds of thousands of photos, along with their detailed descriptions.

“OpenAI used 600 IPs to scrape data, and we are still analyzing logs from last week, perhaps it’s way more,” he said of the IP addresses the bot used to attempt to consume his site.

“Their crawlers were crushing our site,” he said. “It was basically a DDoS attack.”

Triplegangers’ website is its business. The seven-employee company has spent over a decade assembling what it calls the largest database of “human digital doubles” on the web, meaning 3D image files scanned from actual human models.

It sells the 3D object files, as well as photos — everything from hands to hair, skin, and full bodies — to 3D artists, video game makers, anyone who needs to digitally recreate authentic human characteristics."

https://techcrunch.com/2025/01/10/how-openais-bot-crushed-this-seven-person-companys-web-site-like-a-ddos-attack/

#CyberSecurity #AI #GenerativeAI #OpenAI #WebScraping #DDoS #AITraining

top 13 comments

[–] bryce@mastodon.brycedixon.dev 1 points 1 week ago

@remixtures@tldr.nettime.org This isn't "like" a DDoS attack; it IS a DDoS attack.

Virtually every early example of a modern computer attack was originally someone just messing around or making a mistake (the first virus, worm, and DoS all come to mind), and to my knowledge all of those people were tried on (and many found guilty of) serious hacking charges, so why shouldn't OpenAI be? They shouldn't get to claim "well, your service should have been able to handle a DDoS" or "we're doing it for gain, though."

[–] wraptile@fosstodon.org 1 points 1 week ago

@remixtures@tldr.nettime.org not cool running so many connections but 65,000 pages isn't really that much for a contemporary website. If you have a CDN then even more so.

[–] iam_sysop@cyberplace.social 1 points 1 week ago

@remixtures@tldr.nettime.org

This has happened to us on several of the over 300 domains we host.

The COSTS of supporting OpenAI's harvesting, the bandwidth, and the rest of the AI bot farms stealing copyrighted content are crushing us.

[–] Jennifer@m.ai6yr.org 1 points 1 week ago

@remixtures@tldr.nettime.org I'm starting to consider AI companies evil.

[–] DrinkyBird@mastodon.org.uk 1 points 1 week ago

@remixtures@tldr.nettime.org I had my own run-in with GPTBot spamming requests, falling into a recursive hole with desktop/mobile view links and sending malformed URLs.

[–] Serpent7776@mastodon.social 1 points 1 week ago

@remixtures@tldr.nettime.org Soon: their business goes down, because most of their data is available directly on ChatGPT.

[–] gimulnautti@mastodon.green 1 points 1 week ago

@remixtures@tldr.nettime.org Block them. Make them pay!

[–] GhostOnTheHalfShell@masto.ai 1 points 1 week ago

@remixtures@tldr.nettime.org

And no one is calling this what it should be: robbery?

[–] TessRants@mastodon.social 1 points 1 week ago

@remixtures@tldr.nettime.org
So, an AI bot was trying to steal all of the product from a legitimate business and, in the process, crashed the business's whole source of income?
We don't need regulations for the scare-the-uninformed version of AI that Sam Altman likes to bloviate about...
We need laws and penalties for the actual theft Altman and his cronies are perpetrating on a daily basis.
They are too big for any small business to fight in court. This is a thing only a government can remedy without violence.

[–] TommyTorty10@infosec.exchange 1 points 1 week ago

@remixtures@tldr.nettime.org I'm neither a lawyer nor a cybersecurity expert, just a fresh computer engineer. I'm curious what would happen if they pursued legal action against OpenAI for the downtime. OpenAI attacked their service and took them offline, causing financial loss. Seriously, why not treat it like a hack? What would a judge say when comparing OpenAI's actions to those of some kids running a DDoS campaign?

[–] dev_ric@fosstodon.org 1 points 1 week ago

@remixtures@tldr.nettime.org GPTBot is the most aggressive content scraper I've come across in decades of server management. It totally ignores any crawl limits that you set in your robots.txt, and they operate on enough IPs to make even nginx-configured rate limiting a bit futile.

You can, though, block them (and others) by their user-agent string. Add this to your .htaccess to block both GPTBot and ClaudeBot, for example:

SetEnvIfNoCase ^User-Agent$ .*(ClaudeBot|GPTBot) BADBOTHAMMER
Deny from env=BADBOTHAMMER
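
On Apache 2.4, `Deny from` is only available through mod_access_compat, so a roughly equivalent rule using the newer `Require` syntax might look like the sketch below. It assumes mod_setenvif and mod_authz_core are loaded and that the host permits these directives in .htaccess; the variable name is simply carried over from the snippet above. Note that this only stops bots that send an honest user-agent string.

```apache
# Sketch for Apache 2.4 (assumes mod_setenvif and mod_authz_core are enabled).
# Flag any request whose User-Agent header contains ClaudeBot or GPTBot,
# matched case-insensitively.
SetEnvIfNoCase User-Agent "(ClaudeBot|GPTBot)" BADBOTHAMMER

# Allow everyone except requests carrying the flag set above.
<RequireAll>
    Require all granted
    Require not env BADBOTHAMMER
</RequireAll>
```

Flagged requests should then get a 403 response instead of the page.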

[–] tisha@htt.social 1 points 1 week ago

@remixtures@tldr.nettime.org I have to deal with the AI scraping problem at work too.

They are the worst scraping bots ever made, not only OpenAI's but those of a dozen AI startups.

[–] holdenweb@freeradical.zone 1 points 1 week ago

@remixtures@tldr.nettime.org the AI world will be lawless. Goodbye, sanity.