Anna’s Archive, a popular website for pirating books and articles, seems to be square in Google’s sights, according to copyright and digital rights publication TorrentFreak. The search giant is said to have blocked some 749 million Anna’s Archive URLs from showing up in search results, TorrentFreak found, after combing through a recent transparency report.
The removal wasn’t necessarily targeted, as Google regularly delists content at the request of copyright holders. At the time of this writing, Google has taken down links to 15,125,359,564 pages since 2011. But this is the latest in an ongoing, AI-prompted saga that is seeing copyright holders crack down on so-called “shadow libraries,” and the Anna’s Archive removals already represent around 5% of Google’s overall takedowns.
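If you want to check that 5% figure yourself, here is a quick back-of-the-envelope sketch. It uses only the two numbers quoted above, and it assumes the “some 749 million” count is a rounded figure, so the result is approximate. The variable and function names are just illustrative.

```python
# Figures as quoted in the article. The Anna's Archive count is reported as
# "some 749 million", so it is treated here as a rounded approximation.
ANNAS_ARCHIVE_URLS = 749_000_000
GOOGLE_TOTAL_URLS = 15_125_359_564


def share_percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return part / whole * 100


share = share_percent(ANNAS_ARCHIVE_URLS, GOOGLE_TOTAL_URLS)
print(f"Anna's Archive share of all delisted URLs: {share:.2f}%")
```

With those inputs the share works out to just under 5%, which matches the rounded number above.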
Anna’s Archive is a platform for pirated e-books
Personally, I hadn’t heard of Anna’s Archive, which makes sense—it’s a newer player in the field. The platform popped up in 2022, shortly after its predecessor, Z-Library, had its domains seized by the U.S. Department of Justice. Since then, it’s been quietly operating on its own little corner of the internet, serving as an open-source search engine for literary works that links to free publicly available sources when they exist, and pirated uploads when they don’t. Like Z-Library, it’s been blocked by German ISPs and sued in the U.S., but remains operational.
You can think of it as something like the Pirate Bay for literary works, only on a larger scale (impressive given how new it is). TorrentFreak notes that only 4.2 million Pirate Bay URLs have been taken off Google, which is paltry compared to Anna’s Archive’s numbers.
AI scraping could be a factor
That discrepancy could be due to more aggressive takedown filing from publishers and authors: more than 1,000 separate users have issued takedown requests to date, according to the Google data. These include both individuals and larger names like Penguin Random House. Their diligence could be related to Anna’s Archive’s stance on AI, as the site has admitted that it has freely provided access to 30 LLM developers to train on its “illegal archive of books,” and it still openly hosts pages that anyone can access.
Where copyright holders and readers will go from here is still up in the air. It’s important to note that, despite all appearances to the contrary, Google does not own the internet. Removing a site from its search engine does not prevent users from visiting it directly, and all three Anna’s Archive domains—annas-archive.org, annas-archive.se, and annas-archive.li—remain live.
Additionally, Anna’s Archive does not host any pirated content itself, but simply points users to links where they can find it. All of this puts it in a legal gray area, which, combined with the site’s open-source nature and strong commitment to the ideal that “preserving and hosting these files is morally right,” means it’s likely to continue in some form or another for years.
Still, as companies like Meta are found to have used pirated content to train their AI models, it’s likely that actions like Google’s will become more common, and other sites, or even legal entities, might follow suit. Plan accordingly. (And if, like me, you’ve been asking yourself “Who the heck is Anna?” the archive’s FAQ has an answer: “You are Anna.” It’s a nod at the anonymous uploaders who provide it with much of its material.)