Whether scrawled on scrolls or inscribed in stone, the written accounts of ancient cultures capture single moments in history, uncovering truths about forgotten civilizations and bringing their world into sharper focus.
But many of these crucial records have been so damaged by time that they’re close to illegible. Researchers aren’t letting these consequential documents be lost to history. With the computing power of artificial intelligence and the problem-solving ability of machine learning, decaying documents are being rescued from the grave.
Traditionally, the task of deciphering ancient texts has fallen to specialists called epigraphists, who draw on past experience and comparable examples to fill in the blanks. A new collaboration between Google DeepMind and classical scholars promises to make this task faster and more accurate, opening wider windows into a previously mysterious past. Their work culminated in the largest expansion of ancient text sources in 100 years, according to the journal Nature.
The team created a deep neural network called Ithaca, named after the Greek island that the hero Odysseus struggled to return to in Homer’s epic Odyssey. The tool was trained on a digitized dataset of 78,608 inscriptions in ancient Greek, dating between the seventh century BC and the fifth century AD.
Ancient Greek is a highly inflected language, meaning word forms change depending on how they’re used in a sentence, and it has many dialects.
“It was this linguistic complexity that made us interested as it poses an excellent case study for Natural Language Processing and Machine Learning methods,” said DeepMind’s Yannis Assael, who along with Thea Sommerschield authored a paper published in Nature.
The first model of its kind, Ithaca was trained to restore fragmented texts and tease out when and where they were created, all at the same time. It uses pattern recognition to predict missing words, processing the text as characters and words simultaneously. Every small prediction limits the options for subsequent ones, like a puzzle-solver eliminating letters one by one in Wordle, only with many possible answers.
These branching predictions yield multiple candidate restorations, which the model ranks by confidence. It also produces a ranked list of 84 possible regions and a distribution over 10-year date intervals between 800 BC and AD 800. All of this happens in seconds, compared to the hours that human experts need.
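The idea of narrowing options step by step and ranking the surviving candidates can be sketched with beam search, a common technique for this kind of ranked decoding. This is only an illustrative sketch, not Ithaca's actual code: `toy_next_char_probs` is a hypothetical stand-in for a trained model, its probabilities are invented, and Latin letters are used in place of Greek for simplicity.

```python
import math

def toy_next_char_probs(restored_so_far: str) -> dict[str, float]:
    """Hypothetical stand-in for a trained model (invented probabilities).

    Given the characters restored so far, return a probability for each
    candidate next character. Earlier choices change later options.
    """
    if restored_so_far == "":
        return {"a": 0.6, "o": 0.4}
    if restored_so_far.endswith("a"):
        return {"t": 0.7, "n": 0.3}
    return {"n": 0.9, "t": 0.1}

def beam_search(next_char_probs, n_missing: int, beam_width: int = 3):
    """Fill n_missing characters, keeping the beam_width best hypotheses."""
    beams = [(0.0, "")]  # (log-probability, restored text)
    for _ in range(n_missing):
        candidates = []
        for logp, text in beams:
            for ch, p in next_char_probs(text).items():
                candidates.append((logp + math.log(p), text + ch))
        # Keep only the most probable partial restorations.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    # Convert log-probabilities back to confidences, best first.
    return [(text, math.exp(logp)) for logp, text in beams]

ranked = beam_search(toy_next_char_probs, n_missing=2, beam_width=3)
for text, confidence in ranked:
    print(f"{text}: {confidence:.2f}")
```

Each pass extends every surviving hypothesis by one character and discards the weakest, so the final list is a set of complete restorations ordered by confidence, which an expert can then accept or reject.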
“It is observing patterns and learning those patterns at a greater scale and a greater speed than a human could do, and therefore achieving more than a human could,” says Jonathan Prag of Merton College, who collaborated on the project.
In tests, Ithaca restored the fragmented Greek texts with 62% accuracy. When historians incorporated Ithaca’s suggestions into their own restorations, their accuracy rose from 25% to 72%. Ithaca scored 71% on location predictions and dated texts to within 30 years, compared to human experts’ average of 144 years.
Ithaca has already been put to practical use, helping settle a dispute over a group of ancient Athenian decrees. Originally the decrees were thought to have been created before 446 BC, based on specific letterforms that changed around that date. But the dates of many of the decrees seemed to conflict with the accounts of the Athenian historian Thucydides, leading some researchers to propose the decrees had been created around 420 BC.
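To make the dating figures concrete, here is a rough sketch of how a probability distribution over 10-year intervals could be collapsed into a single estimated year and compared with a known date. The probabilities, the true date, and the function names are all invented for illustration, and taking the probability-weighted mean is an assumption rather than a documented detail of Ithaca. Years BC are written as negative numbers, and the missing year zero is ignored.

```python
def expected_year(probs_by_bin_start: dict[int, float], width: int = 10) -> float:
    """Probability-weighted mean of the interval midpoints.

    Keys are the first year of each interval (negative for BC);
    values are the model's probability for that interval.
    """
    total = sum(probs_by_bin_start.values())
    weighted = sum(p * (start + width / 2) for start, p in probs_by_bin_start.items())
    return weighted / total

def format_year(year: float) -> str:
    """Render a signed year as a BC/AD label."""
    return f"{abs(round(year))} {'BC' if year < 0 else 'AD'}"

# Invented example: most probability mass on 430-420 BC.
hypothetical_probs = {-440: 0.2, -430: 0.5, -420: 0.3}
estimate = expected_year(hypothetical_probs)

# Hypothetical ground truth, used only to show how the error is measured.
true_year = -420
error_in_years = abs(estimate - true_year)

print(format_year(estimate), f"(off by {error_in_years:.0f} years)")
```

A dating error reported in years, like the figures above, is this kind of distance between the predicted and the accepted date, averaged over many inscriptions.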
Sure enough, Ithaca predicted a date around 421 BC.
“Although it might seem like a small difference, this date shift has significant implications for our understanding of the political history of Classical Athens,” said DeepMind’s Sommerschield.
Unlocking new findings about civilizations that have been studied and scrutinized for thousands of years is a monumental step for researchers and historians alike.
“We’re really excited to see the new directions Ithaca will take,” said Assael. “Ancient Greece plays an instrumental role in our understanding of the Mediterranean world, but it’s still only one part of a vast global picture of civilizations.”
The team is working on training versions of Ithaca on other ancient writing systems, including Hebrew and Mayan. They have made the code open source and created a free interactive version online.
With the meteoric rise of AI, the technology is in the hands of more people, and the potential for meaningful breakthroughs is only beginning to be realized. Three students illustrated just that, building their own model to decipher a 2,000-year-old scroll that belonged to Caesar’s family and was badly burned during the eruption of Mount Vesuvius.
First, X-rays were used to scan the documents, as they were too scorched and fragile to be unrolled. Then the students’ AI model stepped in, employing pattern recognition to recover lettering from within the charred text. Their methodology is poised to be used on hundreds of papyrus scrolls from the same library, believed to be the only collection of ancient Roman texts to survive the infamous volcanic blast.
“This is the start of a revolution,” Dr Federica, papyrology researcher at the University of Naples, told the BBC. “In that moment, you really think ‘now I’m living something that will be a historical moment for my field.’”
Artificial intelligence has also gone to work deciphering the Epic of Gilgamesh, believed to be the world’s oldest surviving literary work. The piece, written in a now-extinct script on clay tablets that have fragmented over time, was previously decoded entirely by hand. Now, with an AI algorithm on the task, lines from thousands of broken fragments are being pieced together, and all manner of Babylonian writing can be translated more quickly.
“With the help of artificial intelligence, all known variants of a Babylonian text can be quickly analyzed and used appropriately,” Institute of Assyriology professor Enrique Jiménez said in a statement.
Technology is retrieving text that has been inaccessible for millennia. Because of AI, ancient texts are seeing new light.
The potential for this technology to uncover new, overlooked or previously lost chapters of history is enormous. It can quickly process a massive trove of historical records to uncover hidden patterns and potential connections, detect biases in historical narratives, and make sense of long-lost languages and documents. Artificial intelligence is just getting started.
“We believe this is just the start for the development of tools for exploring the potential for collaboration between machine learning and the humanities,” Assael said.
This is an updated version of the article originally published on May 26, 2022.
Editor’s note: Learn more about DeepMind’s PYTHIA, the first ancient text restoration model that recovers missing characters from a damaged text input using deep neural networks. Also learn how Nutanix Enterprise AI software allows IT teams to deploy LLMs and endpoints with secure APIs to run from the edge to public clouds for GenAI apps and agents with standardized and centralized inferencing, including Day 2 operations.
Julian Smith is a contributing writer. He is the executive editor of Atellan Media and the author of Aloha Rodeo and Smokejumper, published by HarperCollins. He writes about green tech, sustainability, adventure, culture and history.
Chase Guttman contributed to this updated version.
© 2025 Nutanix, Inc. All rights reserved.