Foxconn Creates the First Traditional Chinese LLM: FoxBrain

The tension between the US and the East over technology and AI supremacy has intensified, as the Hon Hai Research Institute has launched Taiwan's first Traditional Chinese Large Language Model (LLM).
The model, named FoxBrain, is a new development in Taiwan's AI sector: training was completed in four weeks, and the model is designed to improve manufacturing and supply chain management.
The institute, which operates under Hon Hai Technology Group (commonly known as Foxconn), announced that its new model demonstrates capabilities in reasoning while being optimised for local language patterns.
The model was initially developed for Foxconn's internal systems, handling functions including data analysis, decision support, document collaboration, mathematics, reasoning and problem solving, and code generation. The company has also stated that the model will be made available as open-source technology in the future, allowing wider access to its capabilities.
“In recent months, the deepening of reasoning capabilities and the efficient use of GPUs have gradually become the mainstream development in the field of AI,” says Dr Yung-Hui Li, Director of the AI Research Center at Hon Hai Research Institute.
“Our FoxBrain model adopted a very efficient training strategy, focusing on optimising the training process rather than blindly accumulating computing power.”
Nvidia and Meta's influence on FoxBrain
The training process utilised 120 Nvidia H100 GPUs – connected through Nvidia's Quantum-2 InfiniBand networking technology, which enables high-speed data transfer between computing components.
During the training phase, Nvidia provided support through its Taipei-1 Supercomputer facility and technical consultation, enabling the company to complete model pre-training using Nvidia's NeMo framework for building and customising AI models.
FoxBrain is built on Meta's Llama 3.1 architecture and has 70 billion parameters – the values that an AI system adjusts as it learns from data.
According to Foxconn, the model outperforms Llama-3-Taiwan-70B, another Traditional Chinese language model of comparable size, in numerous categories.
In comparison testing, FoxBrain demonstrated improvements in mathematics performance relative to the base Meta Llama 3.1 model it was built upon.
Foxconn reports progress in mathematical tests compared to Taiwan Llama, which it describes as the top Traditional Chinese language model currently available.
FoxBrain model demonstrates performance in mathematics and reasoning tests
FoxBrain shows particular strength in mathematics and logical reasoning capabilities, based on testing using the TMMLU+ benchmark, which measures performance across various knowledge domains.
The training process involved establishing data augmentation methods – techniques to expand and enhance training data – across 24 topic categories, generating 98 billion tokens of pre-training data for Traditional Chinese.
Tokens are the units of text that the AI system processes, typically words or parts of words. The model features a context window of 128,000 tokens, which determines how much information it can consider at once. This lets the system keep track of longer conversation histories or documents than models with smaller context windows.
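To make the context window figure concrete, the following is a minimal sketch in Python of how an application might check whether a request fits within 128,000 tokens, or split a longer token sequence into window-sized pieces. The function names and the `reply_budget` parameter are hypothetical and for illustration only; they are not part of FoxBrain or any Foxconn tooling, and real token counts depend on the model's own tokenizer.

```python
# Illustrative only: helper names are hypothetical, not a FoxBrain API.
CONTEXT_WINDOW = 128_000  # tokens, the context window reported for FoxBrain


def fits_in_context(prompt_tokens: int, reply_budget: int = 0,
                    window: int = CONTEXT_WINDOW) -> bool:
    """Return True if the prompt plus the tokens reserved for a reply fit in the window."""
    return prompt_tokens + reply_budget <= window


def split_into_windows(token_ids: list[int],
                       window: int = CONTEXT_WINDOW) -> list[list[int]]:
    """Split a long token sequence into consecutive chunks of at most `window` tokens."""
    return [token_ids[i:i + window] for i in range(0, len(token_ids), window)]


if __name__ == "__main__":
    print(fits_in_context(100_000, reply_budget=20_000))
    print(fits_in_context(120_000, reply_budget=10_000))
    print(len(split_into_windows(list(range(300_000)))))
```

A larger window simply means fewer such splits are needed before the model loses sight of earlier material.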
Key steps in FoxBrain's development:
- Developed proprietary data augmentation and quality assessment techniques for 24 topic categories
- Trained the model using 120 Nvidia H100 GPUs over a total of 2,688 GPU days
- Implemented a multi-node parallel training framework to ensure optimal performance and system stability
- Introduced an Adaptive Reasoning Reflection method to enhance the model's autonomous reasoning capabilities
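The GPU figures in the list above can be sanity-checked with simple arithmetic. The sketch below, in Python, converts the reported 2,688 GPU days on 120 GPUs into elapsed time, assuming all 120 GPUs ran concurrently for the whole job (an assumption, since the article does not describe scheduling). Variable and function names are illustrative.

```python
# Back-of-the-envelope check using figures reported in the article.
NUM_GPUS = 120           # Nvidia H100 GPUs
TOTAL_GPU_DAYS = 2_688   # total GPU days reported


def wall_clock_days(total_gpu_days: float, num_gpus: int) -> float:
    """Elapsed days, assuming every GPU is busy for the full duration."""
    return total_gpu_days / num_gpus


days = wall_clock_days(TOTAL_GPU_DAYS, NUM_GPUS)
weeks = days / 7
gpu_hours = TOTAL_GPU_DAYS * 24

print(f"{days:.1f} days (~{weeks:.1f} weeks), {gpu_hours:,} GPU hours")
# prints: 22.4 days (~3.2 weeks), 64,512 GPU hours
```

Under that assumption the result is about 22 days of continuous computation, which is broadly in line with the four-week training period cited earlier once setup and other overheads are allowed for.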
“Through carefully designed training methods and resource optimisation, we have successfully built a local AI model with powerful reasoning capabilities,” Dr Yung-Hui Li says.
Foxconn moves towards open-source AI contributions
Foxconn notes that while there remains a performance gap compared to DeepSeek's distillation model – another AI system focused on efficient knowledge transfer – FoxBrain's performance approaches what it terms “world-leading standards.”
The development process encompassed data collection, cleaning and augmentation, followed by continual pre-training, supervised fine-tuning, reinforcement learning from AI feedback (RLAIF) and a technique that Foxconn calls “Adaptive Reasoning Reflection.”
Though originally conceived for applications within Foxconn, the company plans to collaborate with technology partners to expand FoxBrain's applications and promote AI in manufacturing, supply chain management and decision-making processes.
Foxconn also plans to present FoxBrain's results at the Nvidia GTC 2025 conference in a session titled ‘From Open Source to Frontier AI: Build, Customise and Extend Foundation Models’ scheduled for 20 March.
Technology Magazine is a BizClik brand


