Sabot in the Age of AI

Sabot in the Age of AI

January 28, 2025
A captured screenshot showcasing the babble tarpit in operation, generating an endless stream of deterministic bollocks, with plenty of links.

⚠️ Warning

Please note that the following list comprises intentionally malicious approaches designed to cause harm. Do not deploy any of these suggestions unless you are fully cognizant of the potential consequences of your actions. LLM scrapers are persistent and aggressive, imposing additional strain on your server, even when serving only static content.

📝 Note

This blog post constitutes a continuation of an earlier publication disseminated through our Mastodon account, which can be found under the handle @asrg@tldr.nettime.org. The original post is directly accessible at: https://tldr.nettime.org/@asrg/113867412641585520.

Context #

This formulated list diligently records strategically offensive methodologies and purposefully orchestrated tactics intended to facilitate (algorithmic) sabotage, including the deliberate disruption of systems and processes, alongside the targeted poisoning or corruption of data within the operational workflows of artificial intelligence (AI) systems. These approaches seek to destabilize critical mechanisms, undermine foundational structures, and challenge the overall reliability, functionality, and integrity of AI-driven frameworks.

Selected Tools and Frameworks #

Table 1: Offensive Methods and Strategic Approaches for Facilitating (Algorithmic) Sabotage, Framework Disruption, and Intentional Data Poisoning

🔢ToolLicenseRepositoryDescription
0NepenthesMITURLnepenthes generates endless sequences of pages with dozens of links leading back into a tarpit. Intentional delay is added to prevent crawlers from bogging down your server, in addition to wasting their time. Optional Markov-babble can also be added, giving crawlers something to scrape.
1IocaineMITURLiocaine operates on the tarpitting principle, with a focus on data poisoning. Its goal is to create a stable, infinite maze of garbage to trap unwanted crawlers, relying on a reverse proxy.
2QuixoticMITURLquixotic is a set of tools for static website operators to confuse bots and LLM scrapers using a simple Markov Chain text generator. By default, it modifies around 20% of the content with nonsense and transposes about 40% of images, leaving alt and caption content unchanged, thus incorrectly describing the images.
3MarkoN/AURLImplements the Dissociated Press algorithm as a library and CLI tool, accepting a single input string to produce an indefinite amount of output using a character-based or word-based Markov model.
4Poison the WeLLMsAGPLURLA reverse-proxy that serves Dissociated Press style reimaginings of your upstream pages, poisoning any LLMs that scrape your content.
5django-llm-poisonMITURLA pluggable Django app that replaces a subset of text content with nonsense when served to AI crawlers. The app uses the markovify library to generate Markov chains from your content and then uses these chains to replace a subset of the sentences within the {% poison %} tag if the user agent matches a known AI bot.
6konterfAIAGPLURLkonterfai is a proof-of-concept model-poisoner for large language models (LLMs), designed to generate nonsense (“bullshit”) content to degrade their performance. The backend queries a small LLM running in Ollama, utilizing a high AI temperature setting to produce hallucinatory content.
7Caddy DefenderMITURLThe Caddy Defender plugin is a middleware for Caddy, blocking or manipulating requests based on client IP, useful for preventing unwanted traffic or polluting AI training data with garbage responses.
8Markov TarpitAGPLURLThis software can run as a back-end for a webserver (e.g., nginx), in order to trickle out a Markov chain generated output. The intended use is tarpitting AI bots while feeding them, slowly, useless data.
9SpigotMITURLA simple proof of concept using a Markov Chain to generate an infinitely large website. spigot creates a fake hierarchy of web pages, generating gibberish content that superficially resembles English.
10PyisonMITURLpyison feeds web crawlers an endless list of links to other pages on its site, trapping them on a single site to endlessly navigate an ever-growing sea of links.
11Infinite* SlopAGPL-3.0-onlyURLinfinite-slop is an enterprise-ready, high-performance slop generation solution designed to waste resources of shitty web crawlers and potentially ruin training sets of unethically-sourced AI projects, using the classic template-based random string generation approach from the ’90s.
12fakejpegMITURLGenerates files that look like JPEGs but contain random data—useful for feeding aggressive web crawlers with junk. Designed to run very quickly, it trains on existing JPEGs and generates an arbitrary number of fake JPEGs on the fly.
13AntlionMITURLAntlion is Express.js middleware that allows you to set up dedicated routes on your site as infinitely recursive tar pits designed to trap web scrapers that ignore your robots.txt. Bots that enter Antlion’s pit are locked in an infinitely deep site full of nonsensical, garbled text, loading at the speed of a ’90s dial-up connection.
14BabbleN/AURLJunk food for your local LLM. babble is a standalone LLM crawler tarpit binary. Generates an endless stream of deterministic bollocks to be ingested by bots, with plenty of links.
Table 1: This table provides a comprehensive, analytical overview of diverse computational methods and offensive techniques explicitly designed to facilitate (algorithmic) sabotage, deliberate disruption, and targeted poisoning within the operational workflows of artificial intelligence (AI) systems. Each resource delineated herein has been meticulously structured to erode the integrity of AI models, with particular emphasis on destabilizing their data acquisition mechanisms, subverting training pipelines, and circumventing the foundational operational frameworks that underpin their functionality and reliability.

* Please note that this list was last updated on April 29, 2025. It functions as a dynamic, continuously evolving resource, with periodic updates and revisions undertaken to preserve its accuracy, relevance, and alignment with various facets of the expanding spectrum of collective techno-disobedience and algorithmic Luddism, manifested through radically assertive modes of refusal, resistance, and reimagining, which are indispensable principles for mounting opposition to and challenging algorithmic politics and practices detrimental to commonality.


Contact #

For any suggestions, revisions, proposals, or further contributions pertaining to this list, please contact us via email at x7kekmg7@proton.me.

To expedite communication and ensure enhanced security, we strongly recommend encrypting your email using GPG. Our public key can be obtained through the following link here. Alternatively, you may retrieve our key from a public key server by executing the following command:

gpg --recv-keys DD4FF0D691C7C8F501C1CD0441CC385A75C16CD7

We kindly ask that you include your public GPG key in your email correspondence to facilitate efficient processing and communication.