LLMs vs SLMs: How AI Models Impact Sustainability

AI's appetite for resources – particularly energy and water – is immense
Are SLMs the future of sustainable AI? We dive into new models by Microsoft and IBM and ask – can AI be sustainable?

The relentless integration of AI into corporate strategies has driven advancements in computing power, leading to new technological developments in digital infrastructures.

Large Language Models (LLMs), such as those from OpenAI and Google, have garnered significant attention for their robust capabilities in natural language understanding and generation, powering various applications from business chatbots to intricate data analysis tools.

Despite their success, LLMs are notoriously resource-intensive, consuming vast amounts of energy and water.

A query to ChatGPT, for example, can use up to 10 times the electricity of a standard Google search.

Training these models also involves data centres that require enormous water quantities for cooling purposes.

Training GPT-3, for instance, was estimated to use 1,287MWh of electricity, roughly the annual electricity consumption of 120 American homes.
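As a rough check on that comparison, the arithmetic can be sketched in a few lines of Python. The per-home figure below is simply what the two quoted numbers imply when divided; it is not an independently sourced statistic, and the function name is illustrative.

```python
# Figures quoted in the article.
GPT3_TRAINING_MWH = 1_287  # estimated electricity used to train GPT-3
HOMES = 120                # American homes in the comparison


def implied_mwh_per_home(total_mwh: float, homes: int) -> float:
    """Annual electricity per home implied by the comparison."""
    return total_mwh / homes


per_home = implied_mwh_per_home(GPT3_TRAINING_MWH, HOMES)
print(f"{per_home:.1f} MWh per home per year")  # prints "10.7 MWh per home per year"
```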

Microsoft’s operations reflect this trend, with a 34% rise in water consumption during 2022, attributed mainly to AI activities.

In response to these resource-heavy models, Small Language Models (SLMs) have emerged as a viable alternative, offering similar functionality but with reduced environmental impact.

Read the full story in the August 2025 edition of Sustainability Magazine.

These models, with parameters ranging from millions to 10 billion, offer substantial efficiency in energy, memory and storage use.
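To give a sense of why parameter count matters for memory and storage, here is a back-of-the-envelope sketch. It assumes that weights dominate the footprint and ignores activations, caches and runtime overhead; the precision names and byte sizes are common conventions chosen for illustration, not figures from Microsoft or IBM.

```python
# Assumed storage cost per parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate memory for model weights only, in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# A model at the upper end of the SLM range described above: 10 billion parameters.
for precision in BYTES_PER_PARAM:
    print(precision, weight_memory_gb(10e9, precision), "GB")
```

Under these assumptions a 10-billion-parameter model needs about 20GB for its weights at 16-bit precision and about 5GB at 4-bit, which is the difference between data-centre hardware and a well-equipped laptop.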

Unlike massive LLMs, SLMs use transformer architectures optimised through techniques like knowledge distillation and quantisation, enabling them to deliver task-specific performance with minimal resources, suitable for specific needs like custom email summarisation and customer service solutions.
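Quantisation, one of the techniques mentioned above, can be illustrated with a minimal sketch. This is a simplified symmetric 8-bit scheme with one shared scale for a whole list of weights, written for illustration only; it is not the method used by any particular vendor, and the function names and sample weights are hypothetical.

```python
def quantise_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round each weight to the nearest integer step and clamp to the int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantise(q, scale):
    """Recover approximate float weights from the stored integers."""
    return [v * scale for v in q]


weights = [0.52, -1.27, 0.003, 0.9]  # made-up example values
q, scale = quantise_int8(weights)
restored = dequantise(q, scale)
print(q)
print(restored)
```

Each weight is stored in one byte instead of four, at the cost of a small rounding error bounded by half the scale. Very small weights, such as 0.003 here, collapse to zero.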

What are SLMs?

SLMs, unlike their larger counterparts, offer significant advantages in sustainability efforts.


Because they are smaller, these models need considerably less energy to train and run, helping organisations meet emissions targets while still benefiting from AI-driven automation and intelligence without the typical resource expenditure.

SLMs facilitate deployment on edge devices or local infrastructure, reducing reliance on large, energy-consuming data centres. Aligning with green AI principles that prioritise efficiency and environmental responsibility, SLMs present a cost-effective solution for organisations expanding their AI capabilities.


Lowered infrastructure costs, expedited optimisation and diminished GPU needs make SLMs appealing for widespread adoption.

Their more straightforward designs also allow for enhanced auditability, speeding up explanation, debugging and risk mitigation, which is crucial in highly regulated sectors like finance and healthcare where model transparency is legally required.

Phi-4: Microsoft’s latest SLM

Microsoft has been at the forefront of promoting energy-efficient AI models through its Phi-4 series.

“The energy intensity of advanced cloud and AI services has driven us to accelerate our efforts to drive efficiencies and energy reductions,” says Melanie Nakagawa, Microsoft’s Chief Sustainability Officer.


“As AI scenarios increase in complexity, we’re empowering developers to build and optimize AI models that can achieve similar outcomes while requiring fewer resources.”

Phi-4 statistics
  • 5.6B - Parameters in the Phi-4-multimodal model, fewer than most competing multimodal systems
  • 6.14% - Word error rate on the Hugging Face OpenASR leaderboard, representing a new benchmark record
  • 128,000 - Maximum token sequence length supported by the Phi-4-mini model, enabling processing of extensive text

The Phi-4-multimodal model encompasses speech, vision and text functionalities, with impressive performances in speech recognition and translation tasks.

It is noteworthy as Microsoft’s first multimodal language model, incorporating sophisticated cross-modality learning methods for improved device interactions.

Phi-4-mini, on the other hand, caters to swift, precise text functions like mathematical reasoning and code generation, supporting extensive token sequences for thorough document processing.

“Phi-4-multimodal marks a new milestone in Microsoft’s AI development as our first multimodal language model,” says Weizhu Chen, Technical Fellow, CVP, Gen AI at Microsoft.


“By leveraging advanced cross-modal learning techniques, this model enables more natural and context-aware interactions, allowing devices to understand and reason across multiple input modalities simultaneously.

“Whether interpreting spoken language, analysing images, or processing textual information, it delivers highly efficient, low-latency inference – all while optimising for on-device execution and reduced computational overhead.”

IBM’s Granite 3.2 models

Similarly, IBM has unveiled its Granite 3.2 model series, which emphasises AI accessibility and efficiency for corporate applications.

Features like “chain of thought” reasoning, which can be switched on or off, conserve resources on less demanding tasks while reserving advanced logic for the tasks that need it.

With models like the Granite Vision 3.2 2B developed for enterprise-level document processing, IBM upholds its commitment to delivering competitive model performance with a smaller resource footprint. Its deployment in document classification and reasoning tasks exemplifies its competence against larger models, delivering meticulous processing power with minimal impact.

“The next era of AI is about efficiency, integration and real-world impact – where enterprises can achieve powerful outcomes without excessive spend on compute,” says Sriram Raghavan, Vice President of IBM AI Research.


“IBM's latest Granite developments focus on open solutions demonstrate another step forward in making AI more accessible, cost-effective and valuable for modern enterprises.”

With additional innovations like the Granite Guardian 3.2 safety model (now 30% smaller but still highly effective) and the long-range TinyTimeMixers forecaster, IBM demonstrates that sustainable, high-performing AI is not only feasible but also accessible to modern enterprises.

The future of language models: infrastructure, scale and sustainability

As both LLMs and SLMs mature, the technological future points towards strategic combinations of broad LLM capabilities with the precision and sustainability of SLMs.

This hybrid approach optimises the balance between efficacy, environmental commitment and necessary resource allocation, supporting diverse AI workloads tailored to organisational needs from the cloud to the edge.

Businesses will increasingly assess AI achievements not solely on functional prowess but significantly on sustainable deployment and responsible innovation.

As Melanie says: “Sustainability is good business. Sustainable business practices drive innovation.”

This transition towards environmentally considerate deployment strategies signifies a critical evolution in AI technology, marking a pivotal shift in both operational execution and strategic technological foresight.
