Building multi-physics simulations with LLMs
LLMs don't solve Navier-Stokes. But they're remarkably good at figuring out which solver should, and how to set it up. That distinction matters more than you'd think.
Last year we tried something that shouldn't have worked as well as it did.
We gave an LLM a one-paragraph description of a heat sink design problem—250W thermal load, forced convection cooling, aluminum body, can't exceed 85°C junction temperature—and asked it to plan a coupled thermal-CFD simulation. Not run it. Just plan it: which solvers, what mesh strategy, which boundary conditions, what convergence criteria.
The plan it produced was... good. Not perfect—we'll get to that—but better than what a junior analyst would write, and about 80% of what a senior analyst would specify. The remaining 20% was exactly the kind of stuff you'd expect: a slightly too-coarse mesh recommendation near the fin tips, and a turbulence model choice that was defensible but not optimal for this specific geometry.
That result convinced us that LLMs have a real role in multi-physics simulation—not as physics engines, but as orchestration layers that bridge the gap between what engineers say and what solvers need.
Why multi-physics is different
Single-physics simulation is hard enough. But multi-physics is a different beast entirely, because the complexity isn't just additive—it's multiplicative.
Take a turbine blade. You need:
- CFD for the gas flow around the blade (external) and cooling air through internal channels
- Conjugate heat transfer to compute temperature distribution through the solid blade
- Structural FEA for centrifugal and thermal stress
- Creep analysis for long-term deformation at high temperature
Each of these domains has its own solver, its own meshing requirements, its own time-stepping approach. And they're coupled: the temperature field from the thermal solve feeds into the structural solve as thermal loading. The structural deformation changes the flow passage geometry. The flow changes the heat transfer coefficients. Everything affects everything.
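One way to make that coupling concrete is to write it down as a directed graph and check it for loops. The sketch below is illustrative only: the solver names and the `find_cycle` helper are hypothetical, and the three edges are just the couplings described above. A cycle in the graph means the solvers cannot simply be run once in sequence; they have to be iterated together.

```python
# Hypothetical representation; only the three couplings come from the text.
# Each entry maps a solver to (downstream solver, data handed over).
COUPLINGS = {
    "thermal": [("structural", "temperature field as thermal loading")],
    "structural": [("cfd", "deformed flow passage geometry")],
    "cfd": [("thermal", "heat transfer coefficients")],
}


def find_cycle(graph):
    """Return one cycle as a list of node names, or None if the graph is acyclic."""
    visiting = []  # nodes on the current depth-first path
    done = set()   # nodes whose descendants are fully explored

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Reached a node already on the current path: that is a loop.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for target, _payload in graph.get(node, []):
            cycle = visit(target)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for start in graph:
        cycle = visit(start)
        if cycle:
            return cycle
    return None


cycle = find_cycle(COUPLINGS)
if cycle:
    print("Two-way coupling, iterate these solvers together:", " -> ".join(cycle))
else:
    print("One-way coupling, a single sequential pass is enough")
```

Dropping the structural-to-CFD edge (negligible deformation) removes the cycle, which is the same reasoning behind choosing one-way coupling when deformation can be ignored.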
Setting this up manually is a multi-week endeavor even for experienced teams. You need someone who understands CFD mesh requirements (boundary layer resolution, y+ targets), structural mesh requirements (stress concentration refinement), and how to map data between dissimilar meshes at the coupling interfaces. These people exist, but there aren't many of them.
Figure: Multi-physics coupling: what an LLM orchestrates
What the LLM actually does
Let me be specific about the LLM's role, because the hype around “AI for simulation” tends to be frustratingly vague.
The LLM does not solve any physics equations. It cannot and should not. Physics solvers (ANSYS, OpenFOAM, Abaqus, COMSOL) are the result of decades of numerical methods research. Asking an LLM to generate a Navier-Stokes solution from a prompt would be like asking a project manager to write the code. That's not their job.
What the LLM does is project management for physics. It:
- Parses the problem. “Heat sink, 250W, forced air, aluminum, max 85°C” → This requires conjugate heat transfer (CHT) coupling CFD with conduction. The flow regime (compute Reynolds number from expected velocities and characteristic length) determines laminar vs. turbulent treatment.
- Selects the approach. Steady-state is probably fine unless there are transient thermal events. k-ε or k-ω SST for turbulence (the LLM picks k-ω SST because internal fin channels have separated flow regions). One-way coupling is sufficient if fin deformation is negligible at these temperatures.
- Specifies the mesh. 15-20 boundary layer elements on wetted surfaces with first cell height targeting y+ ≈ 1 for the SST model. Refinement in the fin channel inlets where flow develops. Thermal mesh can be coarser in the solid but needs to match at the fluid-solid interface.
- Configures the solve. Initialize with a rough thermal estimate. Run CFD to semi-convergence. Map heat transfer coefficients to the thermal model. Iterate until the interface temperature change between iterations is below 0.5°C.
- Plans validation. Check energy balance (heat in = heat out ± convergence tolerance). Compare bulk thermal resistance against analytical correlations for finned heat sinks. Flag any local temperatures above 85°C.
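Two of the numbers in the steps above can be estimated before any solver runs: the Reynolds number behind the laminar-vs-turbulent decision, and the first cell height for a y+ target. What follows is a minimal sketch, assuming a flat-plate skin-friction correlation (Cf ≈ 0.058 Re^-0.2), which is a common rough estimate and not tied to any particular solver. The function names, the 2300 transition threshold (the usual textbook value for internal flow based on hydraulic diameter), and the example air properties and velocity are all assumptions, not values from the plan the LLM produced.

```python
import math


def reynolds_number(density, velocity, length, viscosity):
    """Re = rho * U * L / mu, all quantities in SI units."""
    return density * velocity * length / viscosity


def flow_regime(re, transition=2300.0):
    """Classify the flow; the threshold is an assumed internal-flow value."""
    return "turbulent" if re > transition else "laminar"


def first_cell_height(y_plus, density, velocity, length, viscosity):
    """Estimate the wall-adjacent cell height for a target y+.

    Uses a flat-plate skin-friction correlation, so treat the result as a
    starting point to be checked against the solved y+ field.
    """
    re = reynolds_number(density, velocity, length, viscosity)
    cf = 0.058 * re ** -0.2                   # skin-friction coefficient
    tau_wall = 0.5 * cf * density * velocity ** 2
    u_tau = math.sqrt(tau_wall / density)     # friction velocity
    return y_plus * viscosity / (density * u_tau)


# Assumed example values: air near room temperature, 10 m/s, 0.1 m length.
RHO, MU, U, L = 1.16, 1.85e-5, 10.0, 0.1

re = reynolds_number(RHO, U, L, MU)
print(f"Re = {re:.0f} ({flow_regime(re)})")
print(f"first cell height for y+ = 1: {first_cell_height(1.0, RHO, U, L, MU):.2e} m")
```

Which length to use matters: the regime check for a fin channel normally uses the hydraulic diameter, while the flat-plate estimate uses a streamwise length, so a real setup would likely call these with different lengths.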
Each of those steps involves dozens of specific numerical choices that an analyst would normally make one at a time, over hours or days. The LLM makes them in seconds—drawing on patterns from solver documentation, textbooks, and thousands of similar problems in its training data.
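The "configures the solve" step is essentially a fixed-point iteration between two solvers. Here is a sketch of that loop, with the 0.5°C interface tolerance taken from the text and everything else assumed: the `run_coupled` name, the under-relaxation factor, and especially the two toy functions, which are algebraic stand-ins for real CFD and thermal solves and carry no physical authority.

```python
def run_coupled(solve_flow, solve_thermal, t_initial_c,
                tol_c=0.5, relax=0.5, max_iters=50):
    """Alternate flow and thermal solves until the interface temperature settles.

    solve_flow(t_interface) returns a heat transfer coefficient;
    solve_thermal(h) returns a new interface temperature.
    """
    t_interface = t_initial_c
    for iteration in range(1, max_iters + 1):
        h = solve_flow(t_interface)
        t_new = solve_thermal(h)
        change = abs(t_new - t_interface)
        # Under-relax the update to damp oscillation between the solvers.
        t_interface += relax * (t_new - t_interface)
        if change < tol_c:
            return t_interface, iteration
    raise RuntimeError(f"no convergence after {max_iters} coupling iterations")


POWER_W = 250.0    # thermal load from the problem statement
AMBIENT_C = 25.0   # assumed ambient temperature
AREA_M2 = 0.2      # assumed wetted area
H_REF = 30.0       # assumed reference heat transfer coefficient, W/(m^2 K)


def toy_flow(t_interface_c):
    """Stand-in for a CFD solve: h rises weakly with wall temperature."""
    return H_REF * (1.0 + 0.001 * (t_interface_c - AMBIENT_C))


def toy_thermal(h):
    """Stand-in for a conduction solve: lumped convective resistance."""
    return AMBIENT_C + POWER_W / (h * AREA_M2)


t_final, iterations = run_coupled(toy_flow, toy_thermal, t_initial_c=AMBIENT_C)
print(f"interface temperature {t_final:.1f} C after {iterations} iterations")
```

The convergence test here uses the unrelaxed change, which is the stricter choice; testing the relaxed change would report convergence earlier for the same tolerance.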
Where it breaks down
I want to be honest about the failure modes, because this technology is impressive enough without overselling it.
Novel physics combinations. If you're coupling magnetohydrodynamics with radiation heat transfer in a molten salt reactor, the LLM has probably seen very few examples of that specific combination. Its recommendations will be more generic and less reliable. You still need an expert.
Convergence debugging. When a coupled simulation diverges, the cause can be subtle: maybe the relaxation factor is too aggressive, maybe the mesh is deforming past validity, maybe there's an energy imbalance at a coupling interface. LLMs can suggest common fixes, but they can't yet reliably diagnose which of the 15 possible causes is actually the problem. This requires the kind of interactive hypothesis-testing that current AI struggles with.
Confident errors. This is the dangerous one. An LLM might recommend a mesh density that's too coarse for a specific geometry feature and not flag any uncertainty about it. Unlike a human analyst who might think “hmm, this thin wall might need more elements,” the LLM doesn't have the same mechanism for doubt. This is why human review of the setup is non-negotiable.
The system around the model
The LLM is the reasoning core, but it's not the whole system. What makes this work in practice is the infrastructure around it:
A retrieval layer that gives the LLM access to material databases (MatWeb, MMPDS), solver documentation, and your company's past analyses. The LLM's training data has a lot of general engineering knowledge, but your specific Al 7075-T73 allowables at 150°C aren't in the model's weights. They need to be retrieved.
Validation hooks that run physics-based sanity checks on every LLM-generated configuration before it goes to the solver. Is the total heat flux consistent with the power input? Does the mesh have adequate resolution for the expected gradients? Are the material properties within valid temperature ranges? These are fast, deterministic checks that catch most LLM errors.
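To give a sense of what such hooks might look like, here is a small sketch covering two of the checks named above: heat flux against power input, and material validity ranges. The configuration layout, key names, tolerance, and temperature range are all hypothetical; a real system would read these from its own solver input format and material database.

```python
def check_heat_balance(config, rel_tol=0.02):
    """The heat applied at boundary patches should match the stated power input."""
    applied = sum(p["area_m2"] * p["flux_w_per_m2"]
                  for p in config["heat_flux_patches"])
    expected = config["power_input_w"]
    if abs(applied - expected) > rel_tol * expected:
        return [f"heat flux totals {applied:.1f} W, power input is {expected:.1f} W"]
    return []


def check_material_ranges(config):
    """Expected temperatures should sit inside each material's valid data range."""
    issues = []
    t_max = config["max_expected_temp_c"]
    for name, (t_lo, t_hi) in config["material_valid_range_c"].items():
        if not t_lo <= t_max <= t_hi:
            issues.append(f"{name}: {t_max} C is outside valid range {t_lo}..{t_hi} C")
    return issues


def validate(config):
    """Run every check and collect the issues; an empty list means 'pass'."""
    issues = []
    for check in (check_heat_balance, check_material_ranges):
        issues.extend(check(config))
    return issues


# Hypothetical LLM-generated setup for the 250 W heat sink.
good_config = {
    "power_input_w": 250.0,
    "heat_flux_patches": [{"area_m2": 0.0016, "flux_w_per_m2": 156250.0}],
    "max_expected_temp_c": 85.0,
    "material_valid_range_c": {"aluminum": (-50.0, 200.0)},  # assumed range
}

# Same setup with a wrong flux and an out-of-range temperature.
bad_config = dict(
    good_config,
    heat_flux_patches=[{"area_m2": 0.0016, "flux_w_per_m2": 100000.0}],
    max_expected_temp_c=250.0,
)

for label, cfg in (("good", good_config), ("bad", bad_config)):
    print(label, validate(cfg) or "ok")
```

Returning a list of issues instead of raising on the first failure lets the system hand the LLM every problem at once when it asks for a corrected configuration.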
A feedback loop where solver results (convergence behavior, energy balance, result quality metrics) feed back into the LLM to refine subsequent runs. If the first mesh was too coarse (detected by a large discretization error estimate from Richardson extrapolation), that finding is fed back so the LLM recommends finer meshes for similar geometries.
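The mesh-coarseness signal can be computed from the same quantity solved on three systematically refined meshes. Below is a minimal sketch of standard Richardson extrapolation with a constant refinement ratio; the function names and the three junction temperatures are hypothetical and are not results from any run described here.

```python
import math


def observed_order(f_fine, f_medium, f_coarse, ratio):
    """Observed order of convergence from three meshes with a constant ratio."""
    num = f_coarse - f_medium
    den = f_medium - f_fine
    if den == 0 or num / den <= 1:
        raise ValueError("solutions are not in the asymptotic convergence range")
    return math.log(num / den) / math.log(ratio)


def richardson_estimate(f_fine, f_medium, f_coarse, ratio):
    """Return (extrapolated value, observed order, relative error of fine mesh)."""
    p = observed_order(f_fine, f_medium, f_coarse, ratio)
    f_extrap = f_fine + (f_fine - f_medium) / (ratio ** p - 1.0)
    rel_error = abs((f_fine - f_extrap) / f_extrap)
    return f_extrap, p, rel_error


# Hypothetical junction temperatures (C) on fine, medium, and coarse meshes,
# each mesh refined by a factor of 2 in cell size.
t_extrap, order, rel_err = richardson_estimate(78.2, 77.6, 75.9, ratio=2.0)

ERROR_LIMIT = 0.01  # assumed acceptance threshold of 1%
print(f"extrapolated {t_extrap:.2f} C, observed order {order:.2f}")
print("mesh too coarse" if rel_err > ERROR_LIMIT else "mesh acceptable")
```

The `ValueError` matters as much as the estimate: if the three solutions do not converge monotonically, the extrapolation is meaningless and the right feedback is "refine further", not a number.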
Where we think this goes
Short-term (now): LLMs handle the orchestration and setup for well-understood multi-physics problems—CHT, thermal-structural, basic FSI. The engineer defines the problem and reviews the setup. Time savings: 60-80% on setup.
Medium-term (1-2 years): The feedback loop gets tight enough that the system can iterate on its own—running a simulation, evaluating results, adjusting parameters, re-running. The engineer reviews converged results rather than intermediate setups. This opens the door to automated design optimization across multiple physics.
Longer-term: Physics-aware foundation models that understand conservation laws, constitutive relationships, and numerical stability at a deeper level than today's LLMs. Not replacing solvers, but becoming dramatically better at driving them.
We're not waiting for the longer term to be useful. The short-term capability (LLMs that can plan and configure coupled simulations from natural language descriptions) is already saving teams real time on real products.
This is the core of what we're building at Zeta Nexus: LLM-orchestrated simulation workflows that handle the multi-physics complexity so you don't have to configure it by hand.