IBM Unveils Smaller, Smarter AI Models for Business Use

Technology giant IBM expands its Granite family with multimodal capabilities aimed at enterprise applications, under permissive licensing terms

The race to build AI has largely centred on scale, with many companies competing to create ever-larger models. Yet while some firms grab headlines with models requiring vast computational resources, many businesses find themselves seeking more practical solutions.

IBM has positioned itself to address this market need with its latest announcement. The global technology and consulting firm has unveiled the next generation of its Granite large language model family, focusing on compact and efficient systems designed for real-world business applications.

The company’s Granite 3.2 models continue IBM’s strategy of creating smaller models that can deliver specific capabilities without demanding excessive computing resources.

The models are available under the Apache 2.0 licence on Hugging Face, a platform for sharing machine learning models. Selected versions can be accessed through IBM’s watsonx.ai platform, Ollama, Replicate and LM Studio, with plans to integrate them into Red Hat Enterprise Linux AI 1.5 in the coming months.

IBM Granite vision model aims to transform document processing tasks

A key component of the release is a new vision language model that handles document understanding tasks. According to IBM’s benchmark testing, this model performs at a level matching or exceeding much larger competitors, including Meta’s Llama 3.2 11B model and Pixtral 12B, on enterprise-relevant tests.

To build this capability, IBM utilised its open-source Docling toolkit to process 85 million PDF documents and generated 26 million synthetic question-answer pairs. This extensive preparation helps the model tackle the document-heavy workflows that characterise many enterprise environments, from finance to healthcare.

Key facts
  • 85 million - Number of PDF documents processed using IBM's Docling toolkit to train the new vision model
  • 30% - Size reduction achieved in Granite Guardian safety models while maintaining performance
  • 2 years - Maximum forecast range of IBM's TinyTimeMixers models with fewer than 10 million parameters

The company has also integrated what it terms “chain of thought” reasoning into the 2B and 8B parameter versions of Granite 3.2. This feature enables the models to approach problems methodically, breaking them down into steps similar to human reasoning. Importantly, users can activate or deactivate this capability depending on the complexity of the task at hand.

With these enhancements, the 8B model demonstrates significant improvements in instruction-following benchmarks compared to previous versions. IBM reports that, using “inference scaling” methods, even this relatively small model can compete with much larger systems such as Anthropic’s Claude 3.5 Sonnet or OpenAI’s GPT-4o on mathematics reasoning benchmarks.

The Granite Guardian safety models have also received updates, reducing their size by 30% while maintaining performance levels. These models now include a feature called “verbalised confidence” that provides more nuanced risk assessment by acknowledging degrees of uncertainty in safety monitoring.
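How an application might act on a verdict that comes with a stated confidence can be sketched as follows. This is an illustration only: the `Verdict` type, the "high"/"low" confidence values and the `triage` policy are assumptions made for this example, not Granite Guardian's actual output format or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    """Hypothetical result of a safety check (not a real Guardian type)."""
    risky: bool       # assumed: did the safety model flag the content?
    confidence: str   # assumed: "high" or "low", as stated by the model


def triage(verdict: Verdict) -> str:
    """Map a verdict and its stated confidence to an action.

    Policy (an assumption for this sketch): act automatically only on
    high-confidence verdicts, and send uncertain ones to a human reviewer.
    """
    if verdict.confidence not in ("high", "low"):
        raise ValueError(f"unexpected confidence: {verdict.confidence!r}")
    if verdict.confidence == "low":
        return "review"
    return "block" if verdict.risky else "allow"


if __name__ == "__main__":
    for v in (Verdict(True, "high"), Verdict(True, "low"), Verdict(False, "high")):
        print(v, "->", triage(v))
```

The point of the sketch is that a confidence signal lets a pipeline separate clear-cut cases from borderline ones, rather than treating every flag the same way.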

IBM TinyTimeMixers target long-range forecasting for business planning

Alongside the Granite updates, IBM has released the next generation of its TinyTimeMixers models. Despite containing fewer than 10 million parameters – tiny by industry standards – these specialised models can forecast time series data up to two years into the future.

Such capabilities prove particularly valuable for financial trend analysis, supply chain planning, and retail inventory management – all areas where businesses need to make decisions based on long-term projections.

The ability to toggle reasoning capabilities addresses a practical challenge in AI implementation. Step-by-step reasoning approaches require substantial computing power that isn't necessary for every task. By making this feature optional, IBM allows organisations to reduce computing costs for simpler tasks while preserving advanced reasoning for more complex problems.

This approach reflects IBM’s understanding of real-world business constraints, where efficiency often matters as much as raw performance.
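The cost argument for optional reasoning can be made concrete with a small sketch. Everything here is hypothetical: the `Request` type, the `needs_reasoning` flag and the token figures are invented purely to make the arithmetic visible, and are not measurements of Granite models or part of any IBM API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Request:
    """Hypothetical unit of work sent to a model."""
    prompt: str
    needs_reasoning: bool  # assumed: set by the caller or a simple classifier


# Made-up output sizes: a step-by-step answer is assumed to be much longer.
DIRECT_TOKENS = 100
REASONING_TOKENS = 800


def total_tokens(requests: Iterable[Request], always_reason: bool = False) -> int:
    """Estimate generated tokens for a batch.

    With always_reason=True every request pays the reasoning cost;
    otherwise only requests flagged as needing reasoning do.
    """
    total = 0
    for req in requests:
        use_reasoning = always_reason or req.needs_reasoning
        total += REASONING_TOKENS if use_reasoning else DIRECT_TOKENS
    return total


# Nine simple tasks and one complex task.
batch = [Request("Classify this support ticket", False)] * 9
batch.append(Request("Reconcile these two ledgers and explain the gap", True))

if __name__ == "__main__":
    print("selective:", total_tokens(batch))
    print("always on:", total_tokens(batch, always_reason=True))
```

With these invented figures, reasoning on demand generates 1,700 tokens for the batch against 8,000 when reasoning is always on. The real ratio depends on the workload mix and on how much longer reasoning outputs actually are.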

The company’s strategy appears to be gaining traction. The previous Granite 3.1 8B model recently performed strongly on the Salesforce LLM Benchmark for Customer Relationship Management, suggesting that smaller, specialised models can indeed meet specific business needs effectively.

Sriram Raghavan, Vice President of IBM AI Research, explains the company’s philosophy: “The next era of AI is about efficiency, integration, and real-world impact – where enterprises can achieve powerful outcomes without excessive spend on compute. IBM's latest Granite developments focus on open solutions [and] demonstrate another step forward in making AI more accessible, cost-effective and valuable for modern enterprises.”

