GPU Boost for Ansys Fluent

Harnessing the computing power of graphics processing units (GPUs) opens up new possibilities for CFD simulations in terms of computing time and performance. Ansys Fluent supports GPU computing natively. Dr. Thomas Zeiser from the Erlangen National High Performance Computing Center (NHR@FAU) has taken a closer look at how well it performs.

Flow simulations are computationally intensive, not least because in computational fluid dynamics (CFD) the level of detail of the model plays a key role in the quality of the results. Transient models with many small time steps lead to complex calculations that are difficult to process on conventional hardware within a reasonable time frame. With High Performance Computing (HPC), common CFD tasks can be solved overnight or over the weekend on CPUs.

This process is even faster when the computing power of graphics processors is used. Ansys recognized the potential of GPUs early on, and because speed matters, GPU technology is one of its top development topics. Since 2022, Ansys Fluent has been reworked so that the solver can run on GPUs and deliver results faster. The gain is evident in benchmarks that have been confirmed by independent institutions such as the Erlangen National High Performance Computing Center (NHR@FAU) at the University of Erlangen-Nuremberg (FAU). Reduced energy consumption and lower investment costs are welcome side effects.

Why are GPUs so fast?

The design and architecture of GPUs are inherently optimized for parallel data processing. Their thousands of computing cores are particularly well suited to operations that can easily be divided into small, independent work packages. Besides the graphics applications for which GPUs were developed, this also applies to numerical simulations. Additional acceleration comes from their high memory bandwidth and very fast cache.

Benchmarks at the Erlangen National High Performance Computing Center

“Fritz” and “Alex”, the two supercomputers at the University of Erlangen-Nuremberg | © MEGWARE Computer Vertrieb und Service GmbH

Dr. Thomas Zeiser is keen to point out that he is not an Ansys or simulation expert. As head of the System & Services department at NHR@FAU, however, he is an expert on computing power. This gives him an objective view of what computationally intensive software such as Ansys Fluent can achieve under HPC conditions and of how CPU and GPU performance compare.

The HPC computing power of his clusters, which are named “Fritz” and “Alex” to match the university’s name, is available to FAU researchers and scientists from all over Germany. FAU is also equipped with a large Ansys Academic Multiphysics and HPC license pool.

The performance explosion of Ansys Fluent using GPUs - not as a support for the CPU, but rather natively and standalone - made Thomas Zeiser curious. He subjected it to comprehensive benchmarking and compared it with the “classic” CPU-based approach. His conclusion: “Fluent with GPUs actually performs as well as the manufacturer, Ansys, claims! And it doesn't have to be the most expensive and apparently most powerful GPU; in a single precision application, a standard product is sufficient for the performance promised.”

GPU vs. CPU: Performance Study

Thomas Zeiser looks at GPU performance as an IT specialist and hardware expert. For comparability, he applies the basic physical principle of “work per time” and uses “MIUPS” (Million Cell Iteration Updates Per Second) as the evaluation criterion. MIUPS measures hardware performance during the pure solution time: the number of cells multiplied by the number of iterations performed, divided by the solver time in seconds and expressed in millions. Because the cell count is part of the metric, the performance figure is largely independent of the size of the computational mesh.
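The MIUPS metric can be written down in a few lines. The following is a minimal sketch based on the definition above, not code from the NHR@FAU study; the function name `miups` and the example run values are hypothetical.

```python
def miups(num_cells: int, iterations: int, solve_seconds: float) -> float:
    """Million cell-iteration updates per second for the pure solver phase.

    Start-up, partitioning and post-processing time should be excluded
    from solve_seconds, since MIUPS refers to the solution time only.
    """
    if solve_seconds <= 0:
        raise ValueError("solve_seconds must be positive")
    return num_cells * iterations / solve_seconds / 1e6


# Hypothetical run (not a measured value): 250 million cells,
# 100 iterations, 500 seconds of pure solver time.
print(miups(250_000_000, 100, 500.0))  # 50.0
```

Because the cell count is in the numerator, a mesh twice as large that takes twice as long per iteration yields the same MIUPS, which is what makes differently sized benchmark cases comparable.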

In his performance comparison, Thomas Zeiser looked at a cross-section of examples from Ansys, including vehicle external flows, combustion processes and mixing processes. It is important to note that the studies are based on benchmarks and were therefore carried out under “laboratory conditions” and in a single precision environment - pure computing power is what counts. Additional evaluation steps, animations or reports were switched off. The results and their comparability therefore address pure hardware technology.

Performance comparison of GPU and CPU benchmarks  | © NHR / CADFEM 

Calculations were first performed on CPUs on the “Fritz” cluster, using 1 to 8 nodes, each with two Intel Xeon Platinum 8470 (“Sapphire Rapids”) processors, which corresponds to 104 to 832 CPU cores. For the comparison, the curves that flatten out at the upper end, i.e. reach performance saturation, are also important. The models behind them are simply too small for such large computing resources and cannot be parallelized efficiently at that scale.
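Saturation of this kind can be made visible by computing the parallel efficiency, i.e. the measured speedup divided by the ideal linear speedup. The sketch below is an illustration only: the function name `parallel_efficiency` is hypothetical, and the MIUPS values are made-up placeholders, not results from the benchmark.

```python
def parallel_efficiency(nodes, miups_values):
    """Efficiency of each run relative to ideal linear scaling from the first run.

    A value of 1.0 means perfect scaling; values well below 1.0 indicate
    that the model is too small to keep the additional hardware busy.
    """
    base_nodes, base_perf = nodes[0], miups_values[0]
    return [
        (perf / base_perf) / (n / base_nodes)
        for n, perf in zip(nodes, miups_values)
    ]


# Placeholder numbers for a small model (NOT measurements from the study).
nodes = [1, 2, 4, 8]
small_case_miups = [10.0, 18.0, 27.0, 30.0]

for n, eff in zip(nodes, parallel_efficiency(nodes, small_case_miups)):
    print(f"{n} node(s): efficiency {eff:.2f}")
```

In this invented example the efficiency falls below 40% on 8 nodes, which corresponds to a performance curve that has flattened out.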

The GPU results on the Nvidia A40 show similar curves and are almost always better than the CPU cluster by a factor of around 2. The A40 is a card optimized for single precision, which is rather the exception in industrial CFD practice, where double precision is usually used for accuracy reasons. In this comparison, the number of GPU cards is set against the number of compute nodes, each with 104 CPU cores. With the Nvidia A100, performance improves by a further 80%.

About NHR and FAU

Germany has a total of nine national high-performance computing (NHR) centers based at universities. They serve researchers and scientists and are funded by the Federal Ministry of Education and Research and the respective state ministries. NHR@FAU is part of the University of Erlangen-Nuremberg, and its two supercomputers are called “Fritz” and “Alex”.

Benchmark External Flow

The Ansys external flow benchmark is examined in more detail. In this CFD simulation with around 250 million cells, the comparison covers not only computing time but also energy consumption and investment costs.
The classic CPU-based calculation serves as the reference. It runs on the “Fritz” cluster with 8 compute nodes, each with two Intel Xeon Platinum 8470 (“Sapphire Rapids”) processors and thus 104 CPU cores per node, or 832 cores in total.

Result: including Fluent's start-up, partitioning and initialization times, 8 Nvidia A100 graphics cards complete the same calculation in only 12% of the CPU time; the Nvidia A40 configuration needs 20%. Energy consumption drops from 16 kWh on the CPU nodes to 2.5 kWh with the A100 and 3.8 kWh with the A40. The A40 configuration turned out to be the most cost-effective option, although as a single-precision card it is only partially suitable for CFD use, where double precision is often recommended. The advantage of the considerably more expensive A100 (or the current H100) is that it is also fast in double precision.
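The percentages above translate directly into speedup factors and energy savings. The following short sketch does that arithmetic using only the figures quoted in this article; the function name `summarize` and the dictionary layout are hypothetical choices for illustration.

```python
CPU_ENERGY_KWH = 16.0  # CPU reference run on 8 "Fritz" nodes, as quoted in the article

# (fraction of the CPU wall time, energy in kWh), both as quoted in the article
gpu_runs = {
    "A100": (0.12, 2.5),
    "A40": (0.20, 3.8),
}


def summarize(time_fraction, energy_kwh, cpu_energy_kwh=CPU_ENERGY_KWH):
    """Return (speedup factor, relative energy saving) versus the CPU reference."""
    speedup = 1.0 / time_fraction
    energy_saving = 1.0 - energy_kwh / cpu_energy_kwh
    return speedup, energy_saving


for name, (fraction, kwh) in gpu_runs.items():
    speedup, saving = summarize(fraction, kwh)
    print(f"{name}: {speedup:.1f}x faster, {saving:.0%} less energy")
```

Read this way, the A100 run is roughly 8.3 times faster and uses about 84% less energy than the CPU reference, while the A40 run is 5 times faster and uses about 76% less energy.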

Comparison of energy requirements and costs for GPU and CPU use | © NHR / CADFEM 

Using CADFEM IT services for the optimal computing solution

The NHR results show that there is usually no single ideal hardware solution for Ansys Fluent simulation tasks (or other Ansys applications). In general, CFD calculations with Ansys Fluent are massively accelerated by using GPUs, which means that hardly any compromises need to be made in terms of the level of detail or the use of AI methods. Which configuration is ultimately ideal in practice depends on several factors, including the task itself, the mesh quality, the available infrastructure and other resources, as well as time and budget constraints.

This is precisely where the CADFEM IT service comes into play. Our IT experts, certified by both the hardware manufacturers and Ansys, support CADFEM customers in selecting and implementing the optimal computing solution, taking all influencing factors into account.

Author

Alexander Kunz

CADFEM Germany GmbH

+49 (0)8092 7005-889
akunz@cadfem.de

Editor

Klaus Kuboth

CADFEM Germany GmbH

+49 (0)8092 7005-279
kkuboth@cadfem.de