As big tech companies pour billions of dollars into building larger AI models and sprawling data centres, a startup has claimed to have developed a fundamentally different frontier-level AI model at a fraction of the cost and tokens of traditional large language models (LLMs).
HRM-Text is a one billion-parameter, open-weight AI model developed from scratch by researchers at Sapient Intelligence. While the ongoing AI boom is underpinned by models running on Transformer-based architectures, Sapient said that its new text generation model is based on a highly sample-efficient Hierarchical Reasoning Model (HRM) architecture that the Singapore-based startup first introduced last year.
The HRM architecture decouples computation into slow-evolving strategic and fast-evolving execution layers. While most LLMs are trained on raw text using brute-force autoregressive prediction, HRM-Text has been trained only on instruction-response pairs. This means that the model is specially designed to generate a targeted answer to a specific task. It is available on GitHub for download.
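To give a rough sense of what a two-timescale design means, the toy sketch below runs a fast state that updates at every step and a slow state that updates only once per block of fast steps. It is an illustration only: the function name `hierarchical_loop`, the scalar states, and the numeric weights are all assumptions made for this example and are not taken from Sapient's code.

```python
import math


def hierarchical_loop(x, n_slow=2, n_fast=3):
    """Toy two-timescale recurrence (hypothetical, not Sapient's implementation).

    The fast state is updated on every inner step; the slow state is
    updated once after each block of n_fast inner steps.
    """
    slow, fast = 0.0, 0.0
    trace = []  # records the order in which updates happen
    for s in range(n_slow):
        for f in range(n_fast):
            # Fast "execution" update, conditioned on the input and the
            # current (frozen) slow state. Weights are arbitrary.
            fast = math.tanh(0.5 * fast + 0.3 * slow + x)
            trace.append(("fast", s, f))
        # Slow "strategic" update, driven by where the fast state ended up.
        slow = math.tanh(0.5 * slow + 0.5 * fast)
        trace.append(("slow", s, None))
    return slow, fast, trace


if __name__ == "__main__":
    slow, fast, trace = hierarchical_loop(0.2)
    for kind, outer, inner in trace:
        print(kind, outer, inner)
```

With the default settings the trace shows three fast updates followed by one slow update, repeated twice, which is the update ordering the description above implies.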
The launch of HRM-Text is potentially significant considering that training a foundational LLM from scratch costs millions of dollars and involves scraping internet-scale data. As a result, most enterprises and startups do not have the resources to develop their own in-house LLMs.
“The industry’s scaling addiction says: ‘When the model fails, make it bigger. Add more data. Add more GPUs.’ That has worked, but it is reaching a point of diminishing returns. More scale often means more memorisation, more latency, more infrastructure, and more vendor dependency. It does not necessarily give an enterprise a better reasoning engine,” Guan Wang, CEO of Sapient Intelligence, was quoted as saying by VentureBeat.
“Imagine a hedge fund, insurer, or bank that has highly proprietary data: internal research notes, transaction logic, compliance rules, analyst memos, risk models, portfolio constraints,” Wang said. “They may not want to send that data to an external frontier model, and they may not need a giant general-purpose model that memorised the internet. What they need is a compact reasoning core that can learn their task structure, reason across rules and numbers, and run in a controlled environment,” he added.
Sapient introduced the HRM architecture for AI models in 2025. It marks a fundamental departure from traditional Transformer-based models. Initially, HRM-based models were found to be highly effective at solving controlled, symbolic reasoning problems. However, the researchers ran into a challenge when they applied HRM to the massive, open-ended complexity of general language modelling.
The feedback loops in HRM made it think more efficiently, but those same loops made the model too mathematically volatile to be trained on human language, which is highly diverse. To address this problem in the neural network, Sapient researchers said they came up with two key architectural innovations that led to the HRM-Text model.
They developed a specialised normalisation technique called MagicNorm, designed to keep the internal signals stable no matter how many times the model loops its thought process. In addition, the researchers designed a warm-up method in which the model is only evaluated on short, shallow reasoning loops during early training. As training progresses, the system gradually gives the model deeper and longer reasoning sequences.
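The warm-up idea can be pictured as a schedule that maps the training step to a maximum loop depth. The sketch below assumes a simple linear ramp; the function name `loop_depth_at`, the 1,000-step warm-up, and the depth range of 1 to 8 are hypothetical values chosen for illustration, since the article does not give the actual schedule.

```python
def loop_depth_at(step, warmup_steps=1000, min_depth=1, max_depth=8):
    """Return the reasoning-loop depth to use at a given training step.

    Hypothetical linear ramp: start at min_depth, reach max_depth once
    warmup_steps have elapsed, and stay there afterwards.
    """
    if step >= warmup_steps:
        return max_depth
    frac = step / warmup_steps  # fraction of warm-up completed, in [0, 1)
    return min_depth + int(frac * (max_depth - min_depth))


if __name__ == "__main__":
    for step in (0, 250, 500, 750, 1000, 5000):
        print(step, loop_depth_at(step))
```

Under these assumed settings the model would see only single-loop passes at the very start and the full depth from step 1,000 onwards.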
The HRM-Text model was not trained on trillions of words of raw internet text. Instead, it was trained on a tightly curated dataset of just 40 billion tokens.
The researchers also said that they switched the training objective from next-token prediction to task completion, so that the model is rewarded on its full response rather than on the individual tokens it generates. The training dataset used to develop HRM-Text was changed from raw text to instruction-response pairs only. Thinking tokens were also filtered out of the dataset so that the model relies more on its internal hierarchical architecture than on following step-by-step logic.
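The article does not spell out how these data changes are implemented. As a minimal sketch of two common techniques that fit the description, the code below strips reasoning spans from a response and averages a loss over response positions only, ignoring the instruction. The `<think>` tag format and both function names are assumptions made for this example, not details confirmed by Sapient.

```python
import re

# Assumed tag format for thinking tokens; the real dataset may differ.
THINK_SPAN = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_thinking(response):
    """Remove <think>...</think> spans so only the final answer remains."""
    return THINK_SPAN.sub("", response).strip()


def response_only_loss(token_nll, prompt_len):
    """Average per-token negative log-likelihood over response tokens only.

    token_nll  -- one loss value per token of the instruction + response
    prompt_len -- number of leading tokens that belong to the instruction
    """
    # Mask is 0 for instruction tokens and 1 for response tokens.
    mask = [0.0] * prompt_len + [1.0] * (len(token_nll) - prompt_len)
    count = sum(mask)
    if count == 0:
        raise ValueError("sequence has no response tokens")
    return sum(n * m for n, m in zip(token_nll, mask)) / count


if __name__ == "__main__":
    print(strip_thinking("<think>2+2 is 4</think>\nThe answer is 4."))
    print(response_only_loss([5.0, 5.0, 1.0, 3.0], prompt_len=2))
```

In this toy example the two instruction tokens, each with a loss of 5.0, contribute nothing, so the result reflects only the two response tokens.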
Sapient said that the model was trained on a small cluster of 16 GPUs in just 2 days.
The HRM-Text model was evaluated on a wide range of benchmarks to test knowledge, reasoning, logic, math, and comprehension skills. Sapient claimed that its model achieved performance on par with larger open-weight models on these benchmarks.
The model achieved 60.7 per cent on MMLU, 84.5 per cent on GSM8K, and 56.2 per cent on MATH benchmark tests. With these scores, the one billion-parameter model is on par with and, in some cases, surpasses two billion to seven billion-parameter foundational AI models.
These scores were achieved using 100 to 900 times fewer training tokens and 96 to 432 times less estimated compute than models like Qwen, Gemma, and Llama.
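As a back-of-envelope check on the figures quoted above, the snippet below works out the comparison corpus sizes implied by the 100 to 900 times ratio and the GPU-hours implied by 16 GPUs running for 2 days. It assumes the ratios are measured against the 40 billion token figure and that the GPUs ran around the clock; the helper names are made up for this example.

```python
HRM_TEXT_TOKENS = 40_000_000_000  # 40 billion training tokens, as reported


def implied_baseline_tokens(ratio):
    """Tokens a comparison model would have used at the given ratio."""
    return HRM_TEXT_TOKENS * ratio


def gpu_hours(num_gpus, days):
    """Total GPU-hours, assuming every GPU runs 24 hours a day."""
    return num_gpus * days * 24


if __name__ == "__main__":
    low = implied_baseline_tokens(100)
    high = implied_baseline_tokens(900)
    print(f"Implied baseline corpus: {low / 1e12:.0f} to {high / 1e12:.0f} trillion tokens")
    print(f"Training budget: {gpu_hours(16, 2)} GPU-hours")
```

On these assumptions, the comparison models would have been trained on roughly 4 trillion to 36 trillion tokens, and the HRM-Text run works out to 768 GPU-hours.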
In terms of real-world AI applications, the HRM approach could enable organisations to pretrain their own highly capable reasoning AI models from scratch and pair them with external knowledge stores at affordable prices. This is significant given that pretraining a foundational AI model from scratch typically requires millions of dollars, leaving only tech giants with the resources to do so.
Sapient said that the total estimated cost of . While the HRM architecture has unique advantages, Sapient acknowledged that the initial release of HRM-Text should be seen as a proof-of-concept similar to early GPT releases.
“Honestly, HRM-Text is not yet a plug-and-play ChatGPT replacement. It is a compact foundation language reasoning model. For an enterprise engineering team, the operational work is mainly around templates, mode selection, attention masking, and alignment,” Wang was quoted as saying.



