Nvidia: Behind DeepSeek's 'Excellent AI Advancement'

The race for AI dominance between China and the US is taking an unexpected turn. While US companies have focused on pushing raw computational power, Chinese startup DeepSeek has taken a different approach: its R1 model claims to match leading AI systems while using fewer high-end chips, challenging the assumption that more computing power equals better AI.
The implications hit the market hard. Nvidia, whose chips power most advanced AI systems, saw US$600bn wiped from its value on Monday. It's a particular blow given the US's recent restrictions on chip exports to China, which were meant to slow Chinese AI development.
DeepSeek’s efficient approach raises questions about the future of AI development. If the company’s claims hold up, companies might need to rethink their reliance on increasingly expensive semiconductor hardware. For the global chip market, this shift in thinking could reshape how high-performance computing evolves.
As the AI race hots up, the key question isn’t just about US-China competition – but about whether DeepSeek’s rise signals a phase where efficiency matters more than raw power.
What sets DeepSeek apart from the rest?
DeepSeek operates as an AI-powered chatbot much like ChatGPT and is reportedly as powerful as OpenAI’s o1 model in tasks including mathematics and coding.
Much like o1, R1 is a 'reasoning' model: it works through responses incrementally, step by step, mirroring human cognitive patterns.
This reduces memory usage compared with market competitors, translating to lower operational costs.
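DeepSeek has not published its implementation, but the step-by-step approach described above can be illustrated with a minimal Python sketch. Here `solve_step` and `reason_step_by_step` are hypothetical names for demonstration: the model builds an answer from explicit intermediate steps rather than producing one monolithic response.

```python
def reason_step_by_step(question, solve_step, max_steps=8):
    """Illustrative loop: accumulate intermediate 'thoughts' until the
    step function signals it is done. `solve_step` is a hypothetical
    callable returning (thought, done)."""
    trace = []
    for _ in range(max_steps):
        thought, done = solve_step(question, trace)
        trace.append(thought)
        if done:
            break
    return trace

# Toy step function: sum 2 + 3 + 4 one number at a time,
# carrying the running total forward as the previous 'thought'.
def add_step(question, trace):
    nums = [2, 3, 4]
    i = len(trace)
    running = (int(trace[-1]) if trace else 0) + nums[i]
    return str(running), i == len(nums) - 1
```

Calling `reason_step_by_step("sum 2 + 3 + 4", add_step)` produces the trace `["2", "5", "9"]`: each intermediate result is kept explicitly, which is the pattern that lets such models reuse partial work instead of holding one large context in memory.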
DeepSeek says it has been able to train its model at a much lower cost – claiming R1 cost US$6m to train, a fraction of the “over $100m” referred to by OpenAI CEO Sam Altman when discussing GPT-4.
On Monday, Altman posted congratulations to DeepSeek on X, saying it had developed “an impressive model, particularly around what they’re able to deliver for the price.”
Yet despite itself being accused of using content to which it did not have the rights – accusations that have made it the subject of multiple lawsuits – OpenAI has accused DeepSeek of “inappropriately” taking data from its model to create its own AI chatbot.
According to Bloomberg News, Microsoft and OpenAI are probing if data output from the ChatGPT maker's technology was obtained in an unauthorised manner by a group linked to DeepSeek.
How Nvidia’s semiconductor chips impacted DeepSeek’s rise
The market response to DeepSeek's announcement led to Nvidia's stock price declining by 17% on Monday 27th January, though it recovered 4% by midday Tuesday.
The semiconductor manufacturer, which produces chips essential for AI model training, moved from first to third position in global market capitalisation, falling behind Apple and Microsoft.
At the core of DeepSeek’s strategy is a clever workaround to US export restrictions. According to the BBC, the company’s founder, Liang Wenfeng, had accumulated an estimated 50,000 Nvidia A100 chips before they were banned from export to China in September 2022. Rather than relying solely on these high-end processors, experts suggest DeepSeek has paired them with less expensive chips to create its AI model.
DeepSeek’s release of R1 on 20th January attracted attention from AI researchers before gaining widespread industry recognition. Even President Trump, who has set out plans to advance AI development in the US, characterised the development as a “wake-up call” for US companies.
How test-time scaling drives DeepSeek efficiency gains
DeepSeek’s breakthrough centres on test-time scaling, a technique that dynamically manages computing resources during operation, rather than relying on fixed training parameters. At CES 2025, Nvidia CEO Jensen Huang highlighted this approach as one of three key scaling methods shaping AI development, alongside pre-training and post-training scaling.
Pre-training scaling follows the traditional path: more data, larger models and increased computing power. Post-training scaling refines AI capabilities through reinforcement learning and human feedback. Test-time scaling, DeepSeek's focus, takes a different route by optimising resource allocation during the inference process itself.
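One common form of inference-time resource allocation (a sketch of the general technique, not DeepSeek's or Nvidia's specific method) is sampling several candidate answers and stopping as soon as a clear majority emerges, so easy queries consume little compute while hard ones get more. The function and mock names below are hypothetical, introduced only for this illustration.

```python
from collections import Counter
from itertools import cycle

def answer_with_test_time_scaling(generate, max_samples=16, threshold=0.75):
    """Sample candidate answers one at a time and stop as soon as the
    leading answer holds a clear majority vote: the compute spent at
    inference adapts to the difficulty of the question."""
    votes = Counter()
    for n in range(1, max_samples + 1):
        votes[generate()] += 1          # draw one more candidate answer
        best, count = votes.most_common(1)[0]
        if n >= 4 and count / n >= threshold:  # confident enough: stop early
            return best, n
    return votes.most_common(1)[0][0], max_samples

# Hypothetical stand-in for an LLM call: cycles through canned answers.
mock_generate = cycle(["42", "42", "41", "42"]).__next__
```

With the canned answers above, the loop reaches a 3-out-of-4 majority for "42" after four samples and returns early rather than spending the full budget of 16 calls.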
A Nvidia spokesperson told Technology Magazine: “We now have three scaling laws: pre-training and post-training, which continue, and new test-time scaling.
“DeepSeek is an excellent AI advancement and a perfect example of Test Time Scaling.
“DeepSeek’s work illustrates how new models can be created using that technique, leveraging widely-available models and compute that is fully export control compliant.
“Inference requires significant numbers of Nvidia GPUs and high-performance networking.”
DeepSeek’s rise meets security challenges
As DeepSeek sent ripples throughout the technology sector, technical challenges emerged. Following the app’s rise to the top position on Apple’s App Store in the US, DeepSeek reported “large-scale malicious attacks”, forcing temporary restrictions on new registrations and causing website outages.
The incident highlighted the security challenges facing emerging AI platforms. “It can act as a huge honeypot for cybercriminals,” Jake Moore, Global Cybersecurity Advisor at ESET, told Cyber Magazine. “This is typical for any new platform that dominates the media and can attract multiple groups of threat actors looking for any potential vulnerability to exploit.”
DeepSeek: Global implications for the AI chip market
Liang, who holds degrees in electronic information engineering and computer science from Zhejiang University, maintains perspective on the sector's development trajectory.
In 2019, as CEO of High-Flyer, which became China's first quantitative hedge fund to surpass 100 billion yuan in assets, he asked: "If the US can develop its quantitative trading sector, why not China?"
Reflecting on the current state of AI development in an interview last year, Liang said: “Often, we say there’s a one or two-year gap between Chinese and American AI, but the real gap is between originality and imitation. If this doesn’t change, China will always be a follower.” China’s AI sector, he added, “cannot remain a follower forever.”
Technology Magazine is a BizClik brand