LLM Inference Infrastructure

20d

Lumai Launches the World’s First Optical Computing System for Real-Time, Billion-Parameter LLM Inference

Lumai, the optical compute company addressing scalable AI, today announced its Lumai Iris inference server – the world’s first optical computing system to successfully run billion-parameter large ...

Forbes

AI Infrastructure Evolution: How Better Hardware Powers The LLM Era

The launch of ChatGPT in November 2022 marked the beginning of a new chapter in AI. Most of the industry’s attention had focused on the training of increasingly large models to improve accuracy. The ...

Velda Launches Serverless GPU Job Platform That Eliminates Infrastructure Overhead for Machine Learning Teams

Execute GPU jobs instantly from your terminal with zero setup. No manifests, no environment drift, and per-second ...

2UrbanGirls on MSN

The AI infrastructure imperative: Building the backbone of tomorrow's intelligence

As artificial intelligence moves from experimental to essential, the physical and logical infrastructure that carries it ...

SiliconANGLE

Akamai distributes AI inference across the globe, promising lower latency and higher throughput

Akamai Technologies Inc. is expanding its developer-focused cloud infrastructure platform with the launch of Akamai Cloud Inference, a highly distributed foundation for running large language models ...

Chosunbiz

Joo-Young Kim wins Korea ICT honor for LLM chip breakthroughs at HyperAccel

Joo-Young Kim, CEO of AI semiconductor startup HyperAccel, received a decoration in the commendations for "Information and ...

Computer Weekly

Red Hat launches llm-d community & project

The latest trends and issues around the use of open source software in the enterprise. Red Hat has announced the launch of llm-d, a new open source project designed to address generative AI’s future ...

10d

5% GPU utilization: The $401 billion AI infrastructure problem enterprises can't keep ignoring

Enterprises locked in GPU capacity during the AI scramble. Now utilization sits at 5% and the bill is due. Here's what the ...

12d

DIGITIMES Report: Enterprise AI Enters Deployment Phase, Shifting Compute Architectures Toward Inference

As enterprise adoption of generative AI accelerates, a new phase of infrastructure demand is beginning to take shape.

10d

OrcaRouter Launches the Open LLM API Router -- Zero Markup, MIT-Licensed, 100+ Models

Today, Continuum AI released OrcaRouter and OrcaRouter Lite — a unified inference layer that routes across 200+ frontier and open-source language models, with zero markup on BYOK traffic.

XDA Developers on MSN

My local LLM can call Claude when it's stuck, and it changed everything about my local-first setup

Local LLMs aren't very good on their own ...

Network World

Crooks are hijacking and reselling AI infrastructure: Report

Researchers at Pillar Security say threat actors are accessing unprotected LLMs and MCP endpoints for profit. Here’s how CSOs can lower the risk. For years, CSOs have worried about their IT ...
