
Case file

Aptitude Test

The acronym is Completely Automated Public Turing test to tell Computers and Humans Apart. On the cryptographic underground, the gate has come to test something else.

Before you read this: what this article describes and at what level

This case concerns defensive infrastructure deployed on hidden services that operate outside the law. The article describes those defences at the level of the mechanism and the trust assumptions they encode. It does not describe how to deploy them, where to acquire them, or how to bypass them. Specific configuration values, file paths, and circuit-killing procedures available in the underlying technical literature are deliberately omitted. What remains is the part a defender on the lawful side of the gate needs to recognise: the shape of the verification, the resource asymmetry it tries to fix, the new failure modes it introduces.

Where the case touches on academic findings, the citations are given so the original work can be read in full. The body of the article is a reconstruction in plain language; the verifying detail lives in the cited papers.

Story opening

The page loads slowly, then not at all, then loads. There is no banner image, no header, no navigation. There is a clock. The clock face is analog, slightly skewed, drawn over a noise pattern of misleading geometric shapes, the hour hand and the minute hand printed in a colour just close enough to the background to make the eye work. A text field below the clock asks the visitor to type the time. The visitor types the time. The page returns an error. The clock changes. The visitor types again. After three or four attempts, sometimes more, a new page appears. It says: You are in the queue. Please do not refresh.

The queue is the product. The clock is the queue. Behind both, on a server the visitor will never reach directly, sits a stack of free and open-source software, configured with unusual care, defending a hidden service against floods of traffic intended to take it down. The stack has a name; it is published; it has changelogs and version numbers. The people who maintain it have written technical documentation. They have argued with each other about default values on a forum where the same people sell, depending on the marketplace, narcotics, stolen credentials, child sexual abuse material, malware, or all of the above.

The clock is the smallest thing in the picture. Above it, in roughly the order the request travelled, was a chain of decisions about who is allowed to ask a question of the server, how often, at what cost, and through which circuit. None of those decisions could be made the way a clearnet provider would make them, because the clearnet's tools (address reputation, geolocation, the protection of a global content delivery network) do not exist in this environment. The visitor's connection arrived through the Tor network as just another anonymous circuit. The verification gate had to be built from what could be measured at that single end of the conversation: the circuit, the time, and the cost of asking.

Case file

The Onion Router (Tor) network was published in 2002 to do exactly one thing: route a request through three relays in such a way that no single relay knows both the source and the destination. Hidden services, the addresses ending in .onion, extend the same property to the server. The first mass-market application of this architecture was The Farmer's Market in 2010, then Silk Road in 2011. Both treated anonymity as the only defensive layer that mattered. Neither needed a CAPTCHA. The traffic was small. The threat was law enforcement, slow and far away.

By 2020 the underground trading ecosystem was generating about 1.7 billion United States dollars in annual revenue, with the Russian marketplace Hydra accounting for 1.3 billion of that figure before its seizure in 2022. Concentration of capital invites contestation. Competitor marketplaces commission distributed denial-of-service (DDoS) attacks against rivals; law enforcement supplements its long-term investigations with disruption operations; researchers, scrapers, and the merely curious add background noise. The hidden service that wanted to remain reachable had to invent its own front door.

The arms race that followed sorted itself into four roughly visible generations. The first, from 2010 to about 2014, used static distorted alphanumeric strings; these fell to commodity optical character recognition (OCR) within a year of being deployed. The second, from 2015 to 2018, replaced them with image-grid challenges of the kind familiar from Google's reCAPTCHA; these fell to better convolutional networks. The third, from 2019 to 2022, introduced interactive puzzles, including the now-iconic analog clock used by Dread and later by Archetyp; the clock fell, as the technical breakdown below records, to a single afternoon's training of a deep residual network. The fourth generation, beginning in 2023, abandoned visual challenges as the primary line of defence and moved the verification down the stack into the protocol itself.

Two artefacts define the current generation. The first is the EndGame framework, an open-source DDoS protection toolset originally developed by the administrators of the Dread forum and White House Market and now deployed, with local modifications, by most large onion-service operators. The second is the Tor Project's own proof-of-work (PoW) defence, introduced in Tor version 0.4.8 and standardised in a public proposal. Both shift the question the gate is asking. The earlier generations asked: are you a human? The current generation asks: are you willing to pay?

The payment is computational. The transaction is between the client's processor and the server's queue. The published research, particularly the 2025 USENIX paper that introduced the attack now called OnionFlation, has already identified the failure mode of the new gate. The arms race has not ended; it has changed denominations.

Technical breakdown

The starting condition of any defence on a hidden service is the absence of a source address. On the clearnet, when an attacker floods a web server, the server sees a list of Internet Protocol (IP) addresses behaving badly and asks an upstream provider to drop them. On Tor, the server sees one address: the address of its own guard relay, repeated for every request. Source filtering, geographical blacklisting, and reputation-based throttling all fail at the first step. The defender has to invent a substitute for the IP address out of whatever information the protocol still exposes.

The substitute the current generation uses is the Tor circuit identifier. In version 3 (V3) onion services, the cryptographic handshake that sets up a client's connection to a hidden service produces, at the service side, a unique identifier for that circuit. The identifier does not reveal who the client is. It does reveal that two requests from the same circuit are from the same client, and that two requests from different circuits, even if they look identical at the application layer, cannot be attributed to the same client by the circuit alone. This is enough to build a rate limiter. The EndGame framework, deployed through the NGINX web server with the OpenResty Lua module, uses the circuit identifier as the key for a sliding-window counter; once a circuit has issued more than a small number of requests in quick succession, additional requests are deferred or dropped, and the underlying circuit can be torn down through a Python controller that speaks to the local Tor daemon.
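The counting idea in that paragraph is ordinary rate limiting with an unusual key. A minimal sketch, assuming an opaque string stands in for the circuit identifier; the class name, the limit, and the window length are illustrative choices and are not taken from EndGame or from any deployed configuration:

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each key."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        # key -> timestamps of the requests that were accepted
        self._hits = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Forget requests that have slid out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over budget: the caller defers or drops the request
        hits.append(now)
        return True


# Illustrative values only: three requests per ten seconds per key.
limiter = SlidingWindowLimiter(limit=3, window=10.0)
decisions = [limiter.allow("circuit-A", t) for t in (0.0, 1.0, 2.0, 3.0, 12.5)]
other = limiter.allow("circuit-B", 3.0)
```

The fourth request on the first key is refused and the fifth is accepted once the earlier ones have aged out; the second key has its own budget throughout. Nothing in the sketch knows or needs to know who is behind either key, which is the property the paragraph describes.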

An obvious counter is to build a new circuit for every request. Tor permits this, although the handshake itself is expensive in network terms. The defender's counter to the counter is a secondary verification: a small cryptographically signed cookie issued to the browser on first contact, refreshed at the user's pace. A client that rebuilds circuits to evade circuit-based rate limiting still presents the same cookie and is throttled on that key instead. A client that discards cookies as it rebuilds circuits is forced to solve the visual CAPTCHA every time, which is the cost the defender wanted to impose. The whole architecture is a series of small, cheap traps that compound; the attacker can defeat any one of them at modest cost, and cannot defeat all of them at any reasonable cost without paying for the gate the same way the user does.
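The fallback from circuit to cookie can be sketched the same way. The example below assumes an HMAC-signed random token as the cookie; the function names, the cookie layout, and the key handling are hypothetical and chosen for illustration, not a description of how any particular stack signs its cookies:

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

SERVER_KEY = secrets.token_bytes(32)  # hypothetical per-deployment secret


def issue_cookie(key: bytes = SERVER_KEY) -> str:
    """Return 'token.signature', where the signature is an HMAC over the token."""
    token = secrets.token_hex(16)
    sig = hmac.new(key, token.encode(), hashlib.sha256).hexdigest()
    return f"{token}.{sig}"


def verify_cookie(cookie: str, key: bytes = SERVER_KEY) -> Optional[str]:
    """Return the token if the signature is valid, otherwise None."""
    token, sep, sig = cookie.partition(".")
    if not sep:
        return None
    expected = hmac.new(key, token.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes, so malformed input cannot raise.
    if hmac.compare_digest(sig.encode(), expected.encode()):
        return token
    return None


def classify(circuit_id: str, cookie: Optional[str]) -> Tuple[str, str]:
    """Decide what to do with a request and which key to count it against."""
    token = verify_cookie(cookie) if cookie else None
    if token is None:
        # No valid cookie: the visitor is sent back to the visual challenge.
        return ("challenge", f"circuit:{circuit_id}")
    # Valid cookie: throttle on the cookie, whichever circuit carried it.
    return ("throttle", f"cookie:{token}")


cookie = issue_cookie()
first = classify("circuit-1", cookie)
second = classify("circuit-2", cookie)  # new circuit, same cookie
forged = classify("circuit-3", "abcd.deadbeef")
```

Here `first` and `second` resolve to the same throttling key even though the circuits differ, and the forged cookie falls through to the challenge. Each trap is cheap on its own; the cost comes from having to pass all of them.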

Why the visual layer keeps moving

The visual CAPTCHA still appears because it remains the cheapest way to delay an attacker by a few seconds while the protocol-level checks complete in the background. What it cannot do, since 2022, is keep a determined attacker out. A team of researchers writing in the Proceedings of the 19th International Conference on Security and Cryptography demonstrated that the analog-clock CAPTCHA used by Dread and Archetyp could be solved with 96.83 per cent accuracy by a fine-tuned ResNet50 image classifier trained on roughly 50,000 captured examples. By the time that paper was published, the clock was already no longer a security feature; it was a queueing primitive that incidentally inconvenienced humans.

The arrival of multimodal large language models (MLLMs) in 2023 and 2024 closed the remaining gap. A 2025 study by researchers at the University of Notre Dame, presented at the 17th International Conference on Agents and Artificial Intelligence, described an automated agent that navigated a marketplace using OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet to solve a three-by-three scrambled-image puzzle. The agent isolated the rows, generated the candidate reconstructions, identified the visually consistent one, and read off the alphanumeric key embedded in the unscrambled image. The full cycle took under a minute and consumed, on average, nineteen application programming interface (API) calls. The puzzle's design had assumed an attacker without access to general-purpose reasoning about scrambled images. That assumption is no longer safe.

The new gate is a price

Faced with a visual layer that machine vision can clear in real time, the Tor Project and the EndGame maintainers have both moved the verification down the stack. The Tor Project's proof-of-work defence, introduced in 2023 with Tor version 0.4.8, works as a network-layer CAPTCHA. When a hidden service detects elevated load, it publishes an updated service descriptor to the hidden service directories, and that descriptor advertises a puzzle. The puzzle uses the Equi-X and HashX libraries, both designed to be memory-hard rather than compute-hard, which prevents attackers from gaining an outsized advantage by deploying graphics processing units (GPUs) or application-specific integrated circuits (ASICs). The client's machine solves the puzzle locally; the solution is attached to the introduction request; the service places the request in a priority queue keyed on the effort the client expended. Higher-effort solutions are served first.
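The queueing step at the end of that description is a plain priority queue. A minimal sketch, assuming each request arrives with an effort value that has already been verified elsewhere; the class and the request labels are hypothetical, and the sketch does not implement or verify the puzzle itself:

```python
import heapq
import itertools


class EffortQueue:
    """Serve the highest-effort request first; break ties by arrival order."""

    def __init__(self) -> None:
        self._heap = []
        self._arrival = itertools.count()  # monotonically increasing tie-breaker

    def push(self, request_id: str, effort: int) -> None:
        # heapq is a min-heap, so the effort is negated to pop the largest first.
        heapq.heappush(self._heap, (-effort, next(self._arrival), request_id))

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


queue = EffortQueue()
for request_id, effort in [("a", 10), ("b", 500), ("c", 10), ("d", 80)]:
    queue.push(request_id, effort)

served = [queue.pop() for _ in range(len(queue))]
```

With these made-up efforts the service order is b, d, a, c: the two equal-effort requests keep their arrival order, and everything else is sorted by what was paid.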

The defence works because it changes the cost ratio. In the original introduction-flooding attack, a client could send an INTRODUCE1 cell, which the introduction point relayed as an INTRODUCE2 cell, which the hidden service had to decrypt asymmetrically and respond to by building a new three-hop circuit to the client's chosen rendezvous point. The decryption and circuit construction were expensive for the service; the request was nearly free for the attacker. Proof-of-work flips the asymmetry: a request without a solved puzzle is dropped almost instantly; a request with one represents a known quantity of computational work the attacker had to perform, and the queue can be sorted in favour of those who paid the most. A flood becomes self-defeating in a way it was not before.

OnionFlation, or how to attack the price

The flaw in the new gate is that the difficulty has to be adjusted dynamically. A hidden service that fixed its puzzle difficulty at a high level when no attack was occurring would simply lock out its own users. A service that fixed it at a low level would be defenceless during a flood. The Tor proof-of-work proposal therefore allows the service to suggest an effort value in its descriptor, and to revise that value upward when the queue lengthens. The revision reaches clients only once the updated descriptor has been published to the hidden service directories and fetched again, so the advertised price always lags the load that produced it.

In 2025, Jinseo Lee, Hobin Kim and Min Suk Kang of the Korea Advanced Institute of Science and Technology (KAIST) presented at the USENIX Security Symposium a paper titled Onions Got Puzzled, which describes the resulting attack. The attacker submits a small number of requests with extremely high effort values. The hidden service interprets this as evidence that the equilibrium price of a connection has risen, and updates its descriptor accordingly. Legitimate clients now arrive expecting to pay a premium rate that their consumer-grade processors cannot meet within the Tor circuit timeout, which is generally sixty seconds. The service has not been flooded; it has been priced out of reach of its own users. The KAIST team demonstrated that the attack could render a major hidden service effectively unreachable for as little as a couple of United States dollars an hour using a cloud computing instance.

The attack works because the system cannot simultaneously be congestion-resistant (raise the price under flood) and inflation-resistant (refuse to raise the price under suspicious spikes). It must choose, and the proposal as written chooses congestion-resistance. The KAIST authors have circulated a proposed mitigation that allows operators to tune the trade-off; at the time of writing it is under review.
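The tension between the two properties can be seen in a toy calculation. The sketch below is not Tor's update rule and not the mitigation proposed in the paper; it assumes, purely for illustration, a price signal computed from the efforts observed in one batch of requests, with made-up numbers and units:

```python
from statistics import mean, median

# Fifty ordinary clients, each paying a modest effort (hypothetical units).
honest = [20] * 50

# The same traffic plus three requests carrying an extreme effort value.
inflated = honest + [100_000] * 3

mean_before, mean_after = mean(honest), mean(inflated)
median_before, median_after = median(honest), median(inflated)

# How far three requests moved each candidate price signal.
mean_shift = mean_after / mean_before
median_shift = median_after / median_before
```

A signal that follows the average is moved by more than two orders of magnitude by three requests out of fifty-three, while the median does not move at all. The robust signal is not free, though: the more a rule discounts unusual payers, the more slowly it reacts when a real surge arrives, which is the choice the paragraph above describes.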

Wedging the clock

A separate, narrower vulnerability is worth noting because it shows how thin the floor of the system is. CVE-2022-33903, disclosed against Tor 0.4.7.x releases prior to 0.4.7.8, affected the round-trip time (RTT) estimator that Tor uses to optimise the latency of established circuits. By transmitting a particular sequence of timing-relevant packets, an unauthenticated remote attacker could drive the estimator into a non-responsive state, causing the affected Tor process to spike its central processing unit (CPU) usage and lose the ability to build new circuits. The hidden service did not need to be flooded at the application layer; the underlying Tor daemon could be made to wedge on its own arithmetic. The bug has been patched, but the class of vulnerability, where the defender's own timing logic is the attack surface, is not unique to that bug. It reappears, with different mechanics, in OnionFlation.

The fake gate

One more variant is worth naming because it travels in the opposite direction along the same channel. On the clearnet over the last two years, a class of malware delivery now widely documented under the name ClickFix has weaponised the visual grammar of the CAPTCHA itself. The user lands on a page that presents a verification screen indistinguishable from a familiar challenge. The instructions are slightly unusual; they ask the user to open the run dialog or a terminal window, paste a string, and press enter. The string is a command. The command fetches and runs a payload. The verification has flipped its polarity. It is no longer the server testing whether the client is human; it is the attacker testing whether the human will follow an instruction because it arrives inside a shape they recognise. The Turing test has been replaced by an aptitude test of a different kind, and what it measures is the visitor's willingness to obey a prompt.

For defenders on the lawful side of the gate, the structural point of all of this is that the CAPTCHA is no longer a humanity check. It is a market mechanism for access. The relevant questions are no longer about distortion algorithms or image grids; they are about the cost curve a given verification imposes on attackers, about the time it takes for the defensive logic to update its prices, and about the second-order failures introduced when the price itself becomes the attack surface.

Core lesson

The convenient story to tell about CAPTCHAs is that they were a security feature, that machine learning broke them, and that we are now in a brief transitional moment before something better arrives. The convenient story is wrong in the way that most convenient stories are wrong: it mistakes the visible thing for the actual thing.

The actual thing is that the CAPTCHA has, for at least a decade, been doing two jobs at once. The visible job, the job the user sees, is a humanity check. The invisible job, the job the protocol cares about, is the imposition of a small cost on the requester to balance the much larger cost a request imposes on the server. The first job was always doomed; humans are not reliably better at distorted text or grid puzzles than the machines built to solve those puzzles, and once the machines reach parity, no amount of further distortion will pull the gap open again without locking out the humans first. The second job is more durable, because it does not depend on the requester being unable to do the task. It depends only on the task being expensive enough to deter the requester from doing it at scale.

What the EndGame stack and the Tor proof-of-work defence have done, in different ways, is make the second job explicit. They have stopped pretending the gate is testing for humanity and started openly charging for entry. The price is denominated in computational work; the queue is sorted by who paid the most; the analog clock on the front page is a vestigial organ kept around because it incidentally helps the queue function in a way users tolerate. The cryptographic underground reached this honesty first, because it had to: the attackers it was defending against had budgets and engineering teams, and pretending otherwise stopped being viable around 2020. The enterprise will reach it next, and is already reaching it now, although the public-facing copy still uses the word verify.

The OnionFlation attack matters because it shows what happens when the price itself becomes contested. Once entry is a market, every attack against the market is also an attack against the service. The defender no longer worries only about being flooded; the defender worries about being priced out of reach of the users on whose behalf the defence was raised in the first place. This is the structure of every market with a manipulable benchmark, and it is now the structure of the Tor onion service access layer. The gatekeeper sets a price; the attacker manipulates the price; the user, who simply wanted to ask a question, finds the question has become unaffordable.

At the start of the case, the gate is a clock the user types the time on. By the end, the gate is a quote the user's machine has to meet. The technology did not fail; it performed exactly as a market is designed to perform, in the only place on the internet where the resource asymmetry between defender and attacker has been allowed, openly and without apology, to be the entire problem.

Glossary

The terms below cover the protocols, attacks, and components named in the technical breakdown. Each is explained in plain English; the precise behaviour is in the breakdown above.

Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA)
A challenge designed to be easy for humans and difficult for automated agents, intended to gate access to a service. The acronym dates from a 2003 paper by Luis von Ahn and colleagues at Carnegie Mellon University; the architecture predates that paper by several years.
The Onion Router (Tor)
An anonymity network published in 2002 in which client traffic is routed through three relays, each of which knows only the previous and next hop. Hidden services, with addresses ending in .onion, extend the same property to the server. Version 3 (V3) onion services, introduced in 2018, replaced the older sixteen-character addresses with fifty-six-character ones and added a number of cryptographic improvements.
Distributed Denial-of-Service (DDoS)
An attack in which a large number of compromised or attacker-controlled hosts simultaneously direct traffic at a target with the intent of exhausting some resource the target needs to serve legitimate users.
Introduction-flooding attack
A specific class of denial-of-service attack against Tor hidden services, in which the attacker exploits the asymmetry between the cost of sending an INTRODUCE1 cell (cheap, at the attacker) and the cost of responding to it with a new three-hop circuit (expensive, at the service).
EndGame framework
An open-source DDoS protection toolset for Tor hidden services, originally developed by the administrators of the Dread forum and White House Market and now deployed in modified form by most large onion-service operators. EndGame uses the NGINX web server with the OpenResty Lua module to perform circuit-based rate limiting, cookie-based secondary verification, and a static waiting-queue page during heavy load.
Proof-of-work (PoW)
A class of cryptographic puzzle in which a client must perform a verifiable amount of computational work before its request is honoured. Proof-of-work is the consensus mechanism of Bitcoin; the Tor Project introduced a memory-hard variant for hidden service denial-of-service defence in Tor version 0.4.8 (2023), built on the Equi-X and HashX libraries.
OnionFlation
An attack on the Tor proof-of-work defence, described by Jinseo Lee, Hobin Kim and Min Suk Kang at the 2025 USENIX Security Symposium. The attacker submits a small number of requests with very high effort values, causing the hidden service to raise its advertised difficulty above the level that consumer hardware can solve within the circuit timeout. Legitimate users are priced out of reach.
Multimodal large language model (MLLM)
A large language model that accepts and reasons over input of more than one modality, typically text and images. OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet are examples; both have been shown, in published research, to solve image-based CAPTCHAs that were previously considered resistant to automation.
Equi-X and HashX
The cryptographic puzzle libraries used by Tor's proof-of-work defence. Both are designed to be memory-hard rather than compute-hard, which prevents attackers from gaining an outsized advantage by switching from general-purpose central processing units (CPUs) to graphics processing units (GPUs) or application-specific integrated circuits (ASICs).
Onionbalance
A tool for distributing the load on a single onion address across multiple backend servers, by publishing a single hidden service descriptor that points to several instances. Onionbalance is the closest equivalent in the Tor ecosystem to a clearnet load balancer.
ClickFix
A class of clearnet social-engineering attack documented since 2024 in which the user lands on a page that presents a verification screen indistinguishable from a familiar CAPTCHA. The instructions ask the user to paste a string into a terminal or run dialog; the string is a command that fetches and runs a malicious payload. The attack reuses the visual grammar of a humanity check to deliver a compliance check.

Further reading

The following sources informed this article and were consulted during drafting. Each has been verified.

  • Jinseo Lee, Hobin Kim and Min Suk Kang, Onions Got Puzzled: On the Challenges of Mitigating Denial-of-Service Problems in Tor Onion Services, USENIX Security Symposium, 2025. Prepublication version.
  • David Audran, Marcus Andersen, Mark Hansen and others, Tick Tock Break the Clock: Breaking CAPTCHAs on the Darkweb, Proceedings of the 19th International Conference on Security and Cryptography, 2022, 357–365.
  • Yichao Wang, Budi Arief and Julio C. Hernandez-Castro, Analysis of Security Mechanisms of Dark Web Markets, EICC '24: Proceedings of the 2024 European Interdisciplinary Cybersecurity Conference, 120–127.
  • Mrunal Vibhute, Neol Gutierrez, Kristina Radivojevic and Paul Brenner, Multimodal Web Agents for Automated (Dark) Web Navigation, Proceedings of the 17th International Conference on Agents and Artificial Intelligence, 2025, 437–444.
  • Yaxin Luo, Zhaoyi Li, Jiacheng Liu, Jiacheng Cui, Xiaohan Zhao and Zhiqiang Shen, Open CaptchaWorld: A Comprehensive Web-Based Platform for Testing and Benchmarking Multimodal LLM Agents, arXiv preprint 2505.24878, May 2025.
  • Isra Mohamed Ali, Maurantonio Caprolu and Roberto Di Pietro, Foundations, Properties, and Security Applications of Puzzles: A Survey, ACM Computing Surveys 53, no. 4 (2021), 1–38.
  • Enze Liu, Elisa Luo, Shawn Shan, Geoffrey M. Voelker, Ben Y. Zhao and Stefan Savage, Somesite I Used To Crawl: Awareness, Agency and Efficacy in Protecting Content Creators From AI Crawlers, preprint, May 2025.
  • Ahmet Sinan Kazmali and Ahmet Sayar, Web Scraping: Legal and Ethical Considerations in General and Local Context, A Review, Procedia Computer Science 259 (2025), 1563–1572.
  • The Tor Project, Onion service DoS guidelines, community documentation. community.torproject.org.
  • CACI DarkBlue research team, Tor Dark Web Browser Introduces Proof-of-Work. caci.com.
  • DarkOwl content team, Cracking the Code: Exploring the Sophistication of CAPTCHAs, April 2024. darkowl.com.
  • MIT News, Shoring up Tor, July 2015. Background on early hidden service denial-of-service research. news.mit.edu.
  • Onion Pass: Token-Based Denial-of-Service Protection for Tor Onion Services, IFIP Networking 2021.
  • Sergey Gribkov and colleagues, A qualitative mapping of Darkweb marketplaces, arXiv preprint 1904.10164.
  • Darkweb research: Past, present, and future trends and mapping to sustainable development goals, Heliyon, 2023. PubMed Central.

Return to the case

The visitor types the time on the clock. The clock changes. The visitor types the time again. Somewhere upstream of the visitor's awareness, a Lua script has counted the circuit, a cookie has been signed, a queue has been entered, a proof of work has been requested and accepted, an effort value has been compared to the value of every other request now in the priority queue. The visitor sees none of this. The visitor sees only the clock, and after the clock, the marketplace.

The clock is not the gate. The clock is the receipt for the gate. The gate is the price the visitor's machine has just paid in computational work, denominated in the units of a market that until very recently the people running it were happy to call a humanity check. They have stopped calling it that on the cryptographic underground, because the people building the attacks against them are no longer making the polite distinction between a human and a paying customer. The polite distinction is, however, still being made on the clearnet, in the verification widgets that gate the rest of the internet. The honesty arrived first in the place where dishonesty cost the most.
