LLM Inference Optimization

New LLM optimization technique slashes memory costs up to 75%

Researchers at the Tokyo-based startup Sakana AI have developed a new technique that enables language models to use memory more efficiently, helping enterprises cut the costs of building applications ...

Business Wire

MangoBoost Launches Mango LLMBoost™: AI Inference Optimization Software with Up to 12.6x Relative Performance Improvement and 92% Cost Savings

BELLEVUE, Wash.--(BUSINESS WIRE)--MangoBoost, a provider of cutting-edge system solutions designed to maximize AI data center efficiency, is announcing the launch of Mango LLMBoost™, system ...

Semiconductor Engineering

HW-SW Co-Designed System With 3 Core Optimization Pathways For Long-Context Agentic LLM Inference (Cambridge, ICL)

A new technical paper titled “Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference” was published by researchers at University of Cambridge, Imperial College London ...

NextBigFuture

Defeating Nondeterminism in LLM Inference by Thinking Machines

A research article by Horace He and the Thinking Machines Lab (founded by ex-OpenAI CTO Mira Murati) addresses a long-standing issue in large language models (LLMs). Even with greedy decoding by setting ...

SDxCentral

AI inference crisis: Google engineers on why network latency and memory trump compute

Google researchers have warned that large language model (LLM) inference is hitting a wall because of fundamental problems with memory and networking, not compute. In a paper authored by ...

Business Wire

ASC24 Finals Set for April in Shanghai: Focus on Cutting-Edge Large Language Model Inference and Seepage Simulation!

BEIJING--(BUSINESS WIRE)--On January 4th, the inaugural ceremony for the 2024 ASC Student Supercomputer Challenge (ASC24) unfolded in Beijing. With a global interest, ASC24 has garnered the ...

TechRepublic

NVIDIA Boosts LLM Inference Performance With New TensorRT-LLM Software Library

TensorRT-LLM provides 8x higher performance for AI inferencing on NVIDIA hardware. As companies like d-Matrix squeeze into the lucrative artificial intelligence market with coveted inferencing ...

Forbes

The New Frontier Of LLM Inference: Where The Next Tenfold Gains Will Come From

Shakti P. Singh, Principal Engineer at Intuit and former OCI model inference lead, specializing in scalable AI systems and LLM inference. Generative models are rapidly making inroads into enterprise ...

Semiconductor Engineering

Vision-Language-Action Models Arrive

A vision-language-action model is an end-to-end neural network that takes sensor inputs—camera images, joint positions, ...

VentureBeat

Meta's LLM Compiler is the latest AI breakthrough to change the way we code

Meta has unveiled the Meta Large Language Model (LLM) Compiler, a suite of robust, open-source models designed to optimize code and revolutionize compiler design. This innovation has the potential to ...
