LG AI Research taps NVIDIA challenger FuriosaAI

The biggest barrier to scaling AI is the unsustainable power consumption and costs associated with traditional GPU hardware.

FuriosaAI’s RNGD (pronounced ‘Renegade’) accelerator has successfully passed rigorous performance tests with LG AI Research’s EXAONE models.

Following this successful deployment, which delivered high performance, met low-latency service requirements, and achieved significant improvements in energy efficiency over previous GPU solutions, the RNGD Server solution is now available to enterprise customers leveraging LLMs. These include the diverse spectrum of LG businesses across electronics, chemicals, and telecommunications. The deployment showcases real-world enterprise GenAI in production and provides a powerful reference use case for other global enterprises.

“After extensively testing a wide range of options, we found RNGD to be a highly effective solution for deploying EXAONE models. RNGD provides a compelling combination of benefits: excellent real-world performance, a dramatic reduction in our total cost of ownership, and a surprisingly straightforward integration,” said Kijeong Jeon, Product Unit Leader at LG AI Research. “For a project of this scale and ambition, the entire process was quite impressive.”

Testing power consumption as well as performance

LG AI Research first announced plans two years ago to evaluate RNGD, assess the accelerator's efficiency, and, if successful, integrate it into various EXAONE-based services across LG. We unveiled RNGD last summer at Hot Chips and began sampling with customers last fall; its unique Tensor Contraction Processor chip architecture delivers up to 512 TFLOPS of FP8 performance within a Thermal Design Power (TDP) of just 180W.

RNGD Server aggregates the power of eight RNGD accelerators into a single, air-cooled 4U chassis, enabling high compute density. Up to five RNGD Server Systems can be deployed within a single, standard 15kW air-cooled rack.
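As a quick sanity check on that density claim, the figures quoted in this article (180W TDP per card, eight cards per server, five servers per rack) stay comfortably within the 15kW budget. Note this rough estimate counts accelerator TDP only; chassis overhead such as host CPUs, fans, and NICs is not included.

```python
# Rough accelerator power budget for a fully populated RNGD rack,
# using the figures cited in the article. Chassis overhead (host
# CPUs, fans, NICs) is deliberately excluded from this estimate.
CARD_TDP_W = 180        # per-card Thermal Design Power
CARDS_PER_SERVER = 8    # cards per 4U RNGD Server
SERVERS_PER_RACK = 5    # servers per standard air-cooled rack
RACK_BUDGET_W = 15_000  # 15 kW rack power budget

server_w = CARD_TDP_W * CARDS_PER_SERVER    # accelerator watts per server
rack_w = server_w * SERVERS_PER_RACK        # accelerator watts per rack

print(f"Accelerator power per rack: {rack_w} W of {RACK_BUDGET_W} W budget")
```

At 7,200W of accelerator power, roughly half the rack budget remains for host systems and cooling headroom.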

LG AI Research chose RNGD for its power efficiency, cost-effectiveness, and scalability in delivering LLM services. It evaluated RNGD against demanding, real-world benchmarks using the 7.8-billion-parameter and 32-billion-parameter versions of EXAONE 3.5, both available with 4K and 32K context windows.

Performance and efficiency results

LG AI Research’s direct, real-world comparison demonstrates a fundamental leap in the economics of high-performance AI inference.

  • RNGD achieved 2.25x better performance per watt for LLMs compared to a GPU-based solution
  • An RNGD-powered rack can generate 3.75x more tokens for EXAONE models compared to a GPU rack operating within the same power constraints
  • Using a single server with four RNGD cards and a batch size of one, LG AI Research ran the EXAONE 3.5 32B model and achieved 60 tokens/second with a 4K context window and 50 tokens/second with a 32K context window

Deployment and integration

After installing RNGD hardware at its Koreit Tower data centre, LG AI Research collaborated with our team to launch an enterprise-ready solution. We successfully optimised and scaled EXAONE 3.0, 3.5, and now 4.0 models, progressing from a single card to two-card, four-card, and then eight-card server configurations. To achieve this, we applied tensor parallelism not only across multiple processing elements but also across multiple RNGD cards.

To maximise the performance of tensor parallelism, we optimised PCIe paths for peer-to-peer (P2P) communication, tuned communication scheduling, and applied compiler tactics that overlap inter-chip DMA operations with computation. In addition, we utilised the global optimisation capabilities of Furiosa's compiler to maximise SRAM reuse between transformer blocks.
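The tensor-parallel scheme described above can be sketched in miniature: a linear layer's weight matrix is split column-wise across N cards, each card computes its shard of the output, and the shards are gathered back together. This is a pure-Python illustration of the general technique, not Furiosa's implementation; matrix sizes and the card count are arbitrary, and a real deployment overlaps the inter-card communication with compute rather than running it serially as here.

```python
# Illustrative sketch of column-wise tensor parallelism for one linear
# layer: shard the weight matrix across "cards", compute per-card
# partial outputs, then concatenate (the gather step that would travel
# over PCIe P2P in an RNGD server). Nested lists stand in for tensors.

def matmul(x, w):
    """Multiply x (rows) by w (rows), returning x @ w as nested lists."""
    cols = list(zip(*w))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]

def split_columns(w, n_cards):
    """Shard weight matrix w column-wise into n_cards equal pieces."""
    cols = list(zip(*w))
    per = len(cols) // n_cards
    shards = [cols[i * per:(i + 1) * per] for i in range(n_cards)]
    return [[list(row) for row in zip(*s)] for s in shards]

def parallel_linear(x, w, n_cards):
    """Run the layer as n_cards independent matmuls, then gather."""
    partials = [matmul(x, shard) for shard in split_columns(w, n_cards)]
    # Concatenate per-card outputs along the feature dimension.
    return [sum((p[i] for p in partials), []) for i in range(len(x))]

x = [[1.0, 2.0]]
w = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]]
assert parallel_linear(x, w, n_cards=2) == matmul(x, w)  # sharded == unsharded
```

The key property is that the sharded computation is numerically identical to the unsharded one; what parallelism buys is that each card holds and multiplies only a fraction of the weights.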

This successful integration highlights the maturity and ease-of-use of the software stack, including the vLLM-compatible Furiosa-LLM serving framework. The migration demonstrates the platform’s programmability and simplified optimisation process.

It also showcases key advantages required in real-world service environments, such as support for an OpenAI-compatible API server, monitoring with Prometheus metrics, Kubernetes integration for large-scale deployment in cloud-native environments, and easy deployment through a publicly available SDK.
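For illustration, a client written against the standard OpenAI chat-completions wire format needs only a base URL change to target such a server. The endpoint address and model identifier below are placeholders, not actual Furiosa values; substitute the details of your own deployment.

```python
# Sketch of a client for an OpenAI-compatible chat-completions endpoint,
# using only the standard library. BASE_URL and MODEL are placeholders
# for a hypothetical local Furiosa-LLM deployment.
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # placeholder server address
MODEL = "exaone-3.5-32b"               # placeholder model id

def build_payload(prompt: str) -> dict:
    """Assemble a standard chat-completions request body."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }

def chat(prompt: str) -> str:
    """POST the prompt and return the first choice's message text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the wire format matches OpenAI's, existing tooling built for that API (client SDKs, gateways, evaluation harnesses) can point at the server without code changes beyond the base URL.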

LG AI Research’s ChatEXAONE, an EXAONE-powered Enterprise AI Agent, provides robust capabilities including document analysis, deep research, data analysis, and Retrieval-Augmented Generation (RAG).

Moving forward, LG AI Research plans to expand ChatEXAONE’s availability to external clients, utilising RNGD to facilitate this expansion.

Next steps for RNGD and LG AI Research

Furiosa and LG AI Research are committed to enabling businesses to deploy advanced models and agentic AI sustainably, scalably, and economically.

Furiosa has added support for EXAONE 4.0 and is working with LG AI Research to develop new software features, expand to additional customers and markets, and provide a powerful and sustainable AI infrastructure stack for advanced AI applications.

LG AI Research aims to continue optimising RNGD software and hardware for EXAONE models around specific business use cases.
