Solidigm: Storage Must Be Reengineered for Liquid-Cooled AI Racks
Context and chronology
Racks built for modern AI workloads are migrating away from fan-driven airflow toward integrated liquid loops, and that architectural change is exposing storage as a critical thermal variable rather than a passive backend. Solidigm frames the problem operationally: partially retrofitting racks with liquid cold plates creates parallel cooling stacks that raise cost and complexity without delivering system-level efficiency. Hardeep Singh, thermal-mechanical hardware team manager at Solidigm, warns operators that hoses and cold plates block airflow pathways and concentrate thermal stress on fan‑dependent parts such as NVMe drives, DRAM and NIC modules, producing throttling that reduces model-serving throughput.
Operational impact and constraints
Thermal consequences are concrete: evaporative cooling chains can consume multi-million-gallon volumes across large fleets over typical deployment horizons, while raising rack loop setpoints to around 45°C can eliminate chillers and materially improve PUE, but only if every component is thermally compatible with the shared loop. That compatibility requires SSDs and other storage modules to conduct heat to a single-side cold plate, support hot-swap serviceability without fluid exposure, and pass validated leak tests. The engineering constraints are real: lowering thermal interface resistance and creating reliable fluid seals forces changes in PCB layout, component placement, and connector design, and those changes dictate new SKUs and procurement cycles.
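For intuition on the PUE claim, here is a minimal back-of-the-envelope sketch in Python; the load and overhead figures are illustrative assumptions for the sketch, not Solidigm or operator data.

```python
# Illustrative PUE comparison: chilled-water cooling vs. a warm-water (~45 C)
# liquid loop that needs no chillers. All figures below are assumed.

def pue(it_kw: float, cooling_kw: float, other_overhead_kw: float) -> float:
    """PUE = total facility power / IT equipment power."""
    return (it_kw + cooling_kw + other_overhead_kw) / it_kw

IT_LOAD_KW = 1_000.0  # assumed IT load for one rack row

# Assumed overheads: chillers plus CRAH fans vs. dry coolers and pumps only.
chilled = pue(IT_LOAD_KW, cooling_kw=350.0, other_overhead_kw=80.0)
warm_loop = pue(IT_LOAD_KW, cooling_kw=80.0, other_overhead_kw=80.0)

print(f"chilled-water PUE ~ {chilled:.2f}")   # ~1.43
print(f"warm-loop PUE    ~ {warm_loop:.2f}")  # ~1.16
```

The improvement only materializes if every device in the rack, storage included, can shed heat into the warm loop; any component that still needs chilled air reintroduces the parallel cooling stack described above.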
Standards, collaborations, and the roadmap
To avoid a fragmented liquid ecosystem, vendors are shifting from ad hoc fixtures to standards-aligned designs; bodies such as SNIA and the Open Compute Project are working on form-factor and interface evolutions (including SFF and E1.S iterations) to enable interoperable, production-ready liquid SSDs. Scott Shadley, director of leadership narrative and evangelist at Solidigm, says co-design with hyperscalers and GPU platform partners is core to product roadmaps — Solidigm has engaged with NVIDIA on hot-swap and single-side cooling tests to preserve GPU coolant budgets — and that full-rack thermal integration, not piecemeal upgrades, will determine future GPU utilization, reliability, and datacenter sustainability.
Why storage redesign matters beyond thermals
Complementing the thermal argument, an emerging architecture pattern treats storage as an active compute surface — persisting optimized representations, projection caches, and queryable artifacts so accelerators receive ready-to-run inputs. That compute-in-storage trend reduces repeated data transfers and duplicate transformations that otherwise waste GPU hours. When storage performs more work near the data, it changes where thermal and performance constraints appear: denser, hotter storage devices become first‑order infrastructure elements that must be incorporated into the rack cooling budget and mechanical design.
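As a rough illustration of the projection-cache idea (not Solidigm's or any vendor's actual API; the path, names, and the placeholder transform are hypothetical), the sketch below persists a derived representation keyed by a content hash so repeated reads skip the accelerator-side transformation.

```python
# Hypothetical projection cache: persist a derived representation next to the
# data so accelerators receive ready-to-run inputs instead of re-deriving them.
import hashlib
from pathlib import Path

import numpy as np

CACHE_DIR = Path("/mnt/storage/projection_cache")  # assumed storage-side path


def _key(raw: bytes, transform_version: str) -> str:
    # Content hash plus transform version, so stale projections are never reused.
    return hashlib.sha256(raw + transform_version.encode()).hexdigest()


def get_projection(raw: bytes, transform_version: str = "v1") -> np.ndarray:
    """Return the cached projection if present; otherwise compute and persist it."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    path = CACHE_DIR / f"{_key(raw, transform_version)}.npy"
    if path.exists():
        return np.load(path)  # hit: no repeated transformation

    # Miss: run the expensive transform once (a trivial placeholder here),
    # then persist the result where the accelerator can stream it directly.
    vec = np.frombuffer(raw, dtype=np.uint8).astype(np.float32)
    projection = vec[:256] if vec.size >= 256 else np.pad(vec, (0, 256 - vec.size))
    np.save(path, projection)
    return projection
```

The design point is that the cache lives on the storage tier and is keyed by content plus transform version, which is exactly what shifts work, and therefore heat, toward the drives.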
Trade-offs and the path forward
The market will see heterogeneous responses. Some operators will adopt full rack-level liquid integration with standards-aligned, leak-proof, hot-swap SSDs; others will pursue hybrid colocation (projection-first caches or on-storage primitives near inference nodes), avoiding an immediate, wholesale conversion of existing fleets. This creates a tension: hybrid approaches can defer capital expenditure and preserve flexibility in the short term but risk creating persistent fragmentation and duplicated cooling infrastructures that degrade system-level TCO over time. Winners will be vendors that combine validated liquid thermal hardware with software primitives for on-storage compute and clear unit-economics for inference workloads.
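To make that TCO tension concrete, a toy model follows; every figure is an invented placeholder, not market or vendor data.

```python
# Toy TCO comparison: one-time full liquid retrofit vs. a hybrid stance that
# defers capex but carries a duplicated-cooling penalty. Figures are invented.

def cumulative_cost(upfront: float, annual_opex: float, years: int) -> float:
    return upfront + annual_opex * years

FULL_RETROFIT_CAPEX = 4.0e6   # assumed one-time conversion cost per site
FULL_ANNUAL_OPEX    = 1.0e6   # single shared loop, lower energy overhead
HYBRID_CAPEX        = 1.0e6   # partial cold plates only
HYBRID_ANNUAL_OPEX  = 1.7e6   # parallel air + liquid stacks, throttling losses

for years in (2, 5, 8):
    full = cumulative_cost(FULL_RETROFIT_CAPEX, FULL_ANNUAL_OPEX, years)
    hybrid = cumulative_cost(HYBRID_CAPEX, HYBRID_ANNUAL_OPEX, years)
    print(f"{years} yr: full ${full/1e6:.1f}M  vs  hybrid ${hybrid/1e6:.1f}M")

# With these placeholders the hybrid path is cheaper in year 2 and the full
# retrofit pulls ahead after roughly year 4: the crossover described above.
```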
Recommended for you
Smart storage reshapes the storage–compute divide
Smart storage will force architects to colocate compute and data for repeated‑read AI workloads, improving GPU utilization and lowering TCO — but the near-term outcome is a heterogeneous mix: on‑storage compute, projection‑first caches and edge/private inference will all coexist as teams optimize for latency, cost, and governance.

Akash Systems Debuts Diamond-Cooled AI Servers with AMD Instinct MI350X
Akash Systems launched production Diamond-Cooled AI servers built with AMD Instinct MI350X GPUs and manufactured by MiTAC, backed by a reported $300M initial order. The systems claim multi-percent efficiency and throughput gains that could shift data center density economics, but delivery timing and realized ROI will hinge on component supply, packaging capacity and site-level integration.
Phaidra's Nvidia-Backed Cooling Strategy Targets Data Centers
Phaidra announced collaborations with Nvidia, CoreWeave and Applied Digital to pilot a telemetry-driven, power-as-early-warning cooling workflow that aims to cut wasted utility use while preserving usable GPU compute hours. Placed alongside parallel industry moves — server-level diamond cooling from Akash/AMD/MiTAC and NVIDIA's design work with AtkinsRéalis on firm power for AI campuses — the effort highlights a short-to-long-term spectrum of fixes (software controls, hardware thermal modules, and power-supply engineering) operators will combine to raise usable capacity.

Nvidia Vera Rubin: Rack-Scale Leap Rewrites Data-Center Economics
Nvidia’s Vera Rubin rack platform targets roughly tenfold gains in performance per watt while shifting installations to fully liquid-cooled, modular racks. A concurrent multiyear supply pact with Meta — a demand signal analysts peg near $50 billion — amplifies near-term pressure on HBM, packaging and foundry capacity, raising execution and geopolitical risks even as per-rack economics improve.
NVIDIA Unveils Rack That Supports Rival AI Accelerators
NVIDIA announced a rack‑scale platform designed to accept third‑party accelerator cards while retaining NVIDIA’s networking, telemetry and management stack. The move increases buyer leverage and accelerates heterogeneous deployments, but real‑world impact will be shaped by supplier deals, HBM and packaging constraints, and whether openness coexists with NVIDIA’s operational control.
Cisco launches Silicon One G300 and liquid-cooled N9000/8000 systems to accelerate AI data centers
Cisco introduced the Silicon One G300 switching silicon and high‑density N9000/8000 platforms — with liquid‑cooled options, denser optics and unified fabric management — and paired the hardware roadmap with expanded AI governance, observability and automation capabilities to make large AI deployments more efficient and secure. The combined hardware and software push targets higher GPU utilization, shorter job times, energy savings and operational controls for AI agent and model risk in production.
AI-driven memory squeeze reshapes GPU and storage markets as prices surge
A surge in demand for memory driven by AI workloads has pushed standalone RAM prices up several hundred percent, and signs now show those costs bleeding into GPUs and high-capacity storage. Manufacturers are reallocating scarce memory to higher-margin products, forcing lineup changes, higher street prices for certain GPUs, and a wider cascade of pricing pressure across components.

Google and NVIDIA Back New Memory Fabric That Reconfigures Servers
Google and NVIDIA have moved a coherent, pooled memory fabric from prototype toward productization, prompting hyperscalers to redesign node roles and procurement specs. Upstream supply shocks—large DRAM price moves, HBM prioritization and tooling partnerships—both accelerate the rationale for fabrics and complicate near‑term deployment and component availability.