Sabot in the Age of AI
January 28, 2025

babble tarpit in operation, generating an endless stream of deterministic bollocks, with plenty of links.
⚠️ Warning
Please note that the following list comprises intentionally malicious approaches designed to cause harm. Do not deploy any of these suggestions unless you are fully cognizant of the potential consequences of your actions. LLM scrapers are persistent and aggressive, and they impose additional strain on your server even when it serves only static content.
📝 Note
This blog post constitutes a continuation of an earlier publication disseminated through our Mastodon account, which can be found under the handle @asrg@tldr.nettime.org. The original post is directly accessible at: https://tldr.nettime.org/@asrg/113867412641585520.
Context
This list records offensive methods and tactics intended to facilitate (algorithmic) sabotage: the deliberate disruption of systems and processes, and the targeted poisoning or corruption of data within the operational workflows of artificial intelligence (AI) systems. These approaches seek to destabilize critical mechanisms, undermine foundational structures, and challenge the reliability, functionality, and integrity of AI-driven frameworks.
Selected Tools and Frameworks
Table 1: Offensive Methods and Strategic Approaches for Facilitating (Algorithmic) Sabotage, Framework Disruption, and Intentional Data Poisoning
🔢 | Tool | License | Repository | Description |
---|---|---|---|---|
0 | Nepenthes | MIT | URL | nepenthes generates endless sequences of pages with dozens of links leading back into a tarpit. Intentional delay is added to prevent crawlers from bogging down your server, in addition to wasting their time. Optional Markov-babble can also be added, giving crawlers something to scrape. |
1 | Iocaine | MIT | URL | iocaine operates on the tarpitting principle, with a focus on data poisoning. Its goal is to create a stable, infinite maze of garbage to trap unwanted crawlers, relying on a reverse proxy. |
2 | Quixotic | MIT | URL | quixotic is a set of tools for static website operators to confuse bots and LLM scrapers using a simple Markov Chain text generator. By default, it modifies around 20% of the content with nonsense and transposes about 40% of images, leaving alt and caption content unchanged, thus incorrectly describing the images. |
3 | Marko | N/A | URL | Implements the Dissociated Press algorithm as a library and CLI tool, accepting a single input string to produce an indefinite amount of output using a character-based or word-based Markov model. |
4 | Poison the WeLLMs | AGPL | URL | A reverse-proxy that serves Dissociated Press style reimaginings of your upstream pages, poisoning any LLMs that scrape your content. |
5 | django-llm-poison | MIT | URL | A pluggable Django app that replaces a subset of text content with nonsense when served to AI crawlers. The app uses the markovify library to generate Markov chains from your content and then uses these chains to replace a subset of the sentences within the {% poison %} tag if the user agent matches a known AI bot. |
6 | konterfAI | AGPL | URL | konterfai is a proof-of-concept model-poisoner for large language models (LLMs), designed to generate nonsense (“bullshit”) content to degrade their performance. The backend queries a small LLM running in Ollama, utilizing a high AI temperature setting to produce hallucinatory content. |
7 | Caddy Defender | MIT | URL | The Caddy Defender plugin is a middleware for Caddy, blocking or manipulating requests based on client IP, useful for preventing unwanted traffic or polluting AI training data with garbage responses. |
8 | Markov Tarpit | AGPL | URL | This software can run as a back-end for a webserver (e.g., nginx) in order to trickle out Markov-chain-generated output. The intended use is tarpitting AI bots while slowly feeding them useless data. |
9 | Spigot | MIT | URL | A simple proof of concept using a Markov Chain to generate an infinitely large website. spigot creates a fake hierarchy of web pages, generating gibberish content that superficially resembles English. |
10 | Pyison | MIT | URL | pyison feeds web crawlers an endless list of links to other pages on its site, trapping them on a single site to endlessly navigate an ever-growing sea of links. |
11 | Infinite* Slop | AGPL-3.0-only | URL | infinite-slop is an enterprise-ready, high-performance slop generation solution designed to waste resources of shitty web crawlers and potentially ruin training sets of unethically-sourced AI projects, using the classic template-based random string generation approach from the ’90s. |
12 | fakejpeg | MIT | URL | Generates files that look like JPEGs but contain random data—useful for feeding aggressive web crawlers with junk. Designed to run very quickly, it trains on existing JPEGs and generates an arbitrary number of fake JPEGs on the fly. |
13 | Antlion | MIT | URL | Antlion is Express.js middleware that allows you to set up dedicated routes on your site as infinitely recursive tar pits designed to trap web scrapers that ignore your robots.txt. Bots that enter Antlion’s pit are locked in an infinitely deep site full of nonsensical, garbled text, loading at the speed of a ’90s dial-up connection. |
14 | Babble | N/A | URL | Junk food for your local LLM. babble is a standalone LLM crawler tarpit binary. Generates an endless stream of deterministic bollocks to be ingested by bots, with plenty of links. |
* Please note that this list was last updated on April 29, 2025. It functions as a dynamic, continuously evolving resource, with periodic updates and revisions undertaken to preserve its accuracy and relevance. It remains aligned with various facets of the expanding spectrum of collective techno-disobedience and algorithmic Luddism, manifested through radically assertive modes of refusal, resistance, and reimagining, which are indispensable principles for opposing and challenging algorithmic politics and practices detrimental to commonality.
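Many of the tools in Table 1 combine two ingredients: a word-based Markov chain that turns a source text into plausible-looking nonsense, and page generation that depends only on the requested path, so that the maze stays stable across visits. The following is a minimal Python sketch of that general idea, not the implementation of any listed tool. The names `build_chain`, `babble`, and `page_for`, the sample corpus, and the link format are all hypothetical and chosen purely for illustration.

```python
import hashlib
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the words that follow it in `text`."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    if not chain:
        raise ValueError("source text is too short for this order")
    return dict(chain)


def babble(chain, rng, n_words=40):
    """Walk the chain for n_words words, restarting at a random state on dead ends."""
    states = sorted(chain)  # fixed ordering keeps the walk reproducible per seed
    state = rng.choice(states)
    out = list(state)
    while len(out) < n_words:
        followers = chain.get(state)
        if not followers:
            # Dead end (e.g. the final words of the corpus): jump elsewhere.
            state = rng.choice(states)
            out.extend(state)
            continue
        nxt = rng.choice(followers)
        out.append(nxt)
        state = state[1:] + (nxt,)
    return " ".join(out[:n_words])


def page_for(path, chain, n_links=5):
    """Return (text, links) that depend only on `path` (hypothetical link scheme)."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    text = babble(chain, rng)
    base = path.rstrip("/")
    links = [f"{base}/{rng.getrandbits(32):08x}" for _ in range(n_links)]
    return text, links


if __name__ == "__main__":
    # Illustrative corpus only; a real deployment would train on far more text.
    corpus = "the quick brown fox jumps over the lazy dog and the quick cat naps"
    text, links = page_for("/maze", build_chain(corpus))
    print(text)
    print(links)
```

Seeding the generator from a hash of the path means that no state has to be stored: every request for the same URL yields the same text and the same outgoing links, while each link leads to a different, equally meaningless page. A deliberate delay before responding, as several of the tools above describe, would be added in the web server layer and is left out of this sketch.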
Contact
For any suggestions, revisions, proposals, or further contributions pertaining to this list, please contact us via email at x7kekmg7@proton.me.
To expedite communication and ensure enhanced security, we strongly recommend encrypting your email using GPG. Our public key is available here. Alternatively, you may retrieve our key from a public key server by executing the following command:
gpg --recv-keys DD4FF0D691C7C8F501C1CD0441CC385A75C16CD7
We kindly ask that you include your public GPG key in your email correspondence to facilitate efficient processing and communication.