SEMCAD X sets new standards in computational electromagnetics (CEM) by offering some of the fastest finite-difference time-domain (FDTD) solvers and enhancements on the market. With substantial speedups, excellent RAM efficiency, and convenient features such as auto-termination, solving very large problems has never been easier.
SEMCAD X was the first FDTD toolkit on the market to offer hardware acceleration. Hardware acceleration in SEMCAD X is achieved with the Acceleware library or, alternatively, with the new CUDA library available from SPEAG. With simulation speeds of 700 – 3000+ Mcells/s, SEMCAD X is in a league of its own in solver performance.
Please contact us for performance information related to your specific applications. The following examples highlight the performance of SEMCAD X simulations of very large, very complex geometries, performed with the Acceleware library 11.0.3 (Floriana) and the new SPEAG CUDA solver:
| SEMCAD X Simulation | Benchmark 1 | Benchmark 2 | Benchmark 3 |
| --- | --- | --- | --- |
| Model | car, driver, phone | head with implant | — |
| Frequency (MHz) | — | 64 | 500 – 2000 |
| No. of time steps | 5896 | 406435 | 404530 |
| Computational domain (million cells) | — | — | — |
| ABC | UPML, 8 layers | UPML, 11 layers | UPML, 9 layers |
| Solver Performance | Benchmark 1 | Benchmark 2 | Benchmark 3 |
| --- | --- | --- | --- |
| Acceleware GPU solver speed (Mcells/s)* | 3075 | 2815 | 2373 |
| Acceleware GPU solver time (hh:mm) | 00:59 | 28:40 | 120:11 |
| SPEAG CUDA GPU solver speed (Mcells/s)* | 2990 | 2782 | 2723 |
| SPEAG CUDA GPU solver time (hh:mm) | 01:03 | 29:03 | 104:44 |
| CPU solver speed (Mcells/s)** | 27 | 19.8 | 15.6 |
| CPU solver time (hh:mm) | 52:21 | 2043:00 | 280:00 |
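The solver times above follow directly from throughput: runtime ≈ (cells per time step × number of steps) ÷ cell-update rate. A minimal sketch of this relationship, using illustrative domain sizes and speeds rather than the benchmark values:

```python
# Sketch: how FDTD wall-clock time relates to solver throughput.
# Domain size, step count, and speeds below are illustrative, not benchmark data.

def solver_time(mcells: float, steps: int, speed_mcells_s: float) -> str:
    """Return wall-clock time as hh:mm for an FDTD run."""
    seconds = mcells * steps / speed_mcells_s  # Mcells cancel: (Mcell*steps)/(Mcell/s)
    minutes = round(seconds / 60)
    return f"{minutes // 60:02d}:{minutes % 60:02d}"

# A hypothetical 100-Mcell domain stepped 10,000 times:
print(solver_time(100, 10_000, 2500))  # GPU-class throughput
print(solver_time(100, 10_000, 20))    # CPU-class throughput
```

The same arithmetic explains why the GPU/CPU time ratios in the tables track the speed ratios.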
SEMCAD X also performs strongly on smaller simulation domains that fit on a single GPU. The examples in the following table were executed on a single NVIDIA Tesla C2070/C2075 card. For a comparison with dual- and quad-core architectures, see Fig. 1.
| SEMCAD X Simulation | Benchmark 4 | Benchmark 5 | Benchmark 6 |
| --- | --- | --- | --- |
| Model | cell phone | birdcage with human male model | car, driver, Bluetooth antenna |
| Frequency (MHz) | 1130 – 2630 | 64 | 2100 |
| No. of time steps | 66622 | 31079 | 9600 |
| Computational domain (million cells) | — | — | — |
| ABC | UPML, 8 layers | UPML, 8 layers | UPML, 6 layers |
| Solver Performance | Benchmark 4 | Benchmark 5 | Benchmark 6 |
| --- | --- | --- | --- |
| Acceleware GPU solver speed (Mcells/s) | 469 | 519 | 503 |
| Acceleware GPU solver time (hh:mm) | 00:27 | 00:21 | 00:07 |
| SPEAG CUDA GPU solver speed (Mcells/s) | 593 | 651 | 565 |
| SPEAG CUDA GPU solver time (hh:mm) | 00:20 | 00:18 | 00:06 |
| CPU solver speed (Mcells/s) | 15.5 | 20 | 17.2 |
| CPU solver time (hh:mm) | 13:34 | 09:29 | 03:39 |
NVIDIA's Fermi architecture adds ECC support and improves double-precision throughput and overall performance. In addition, multiple cards can be run in parallel for higher performance and larger domain sizes. The Tesla 20-Series GPU markedly outperforms the previous 10-Series and eclipses the solver speed of CPU-only systems: solving large-scale, high-resolution problems with CPU-based software is too slow to be practical, as demonstrated in Fig. 1.
SEMCAD X's numerical solver allows multiple computers networked in a Beowulf-style cluster to run a single FDTD simulation with the Acceleware Cluster library. The simulation space is partitioned across the compute nodes, with Message Passing Interface (MPI) processes synchronized at every time step. This CPU cluster solution is now available for 64-bit Linux architectures.
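The partitioning scheme can be sketched in miniature. The following is an illustrative stand-in, not SEMCAD X code: a 1D nearest-neighbor stencil takes the place of the 3D FDTD update, and plain Python takes the place of the MPI halo exchange (e.g. `MPI_Sendrecv`) that would synchronize ranks at each time step.

```python
# Sketch of domain partitioning with per-step ghost-cell (halo) exchange,
# the pattern used when one time-stepped simulation spans multiple MPI ranks.

def step(u):
    """One explicit nearest-neighbor stencil update on the interior of u."""
    return [u[i] + 0.25 * (u[i-1] - 2*u[i] + u[i+1]) for i in range(1, len(u) - 1)]

def run_monolithic(u, steps):
    """Reference: the whole domain updated by a single process."""
    for _ in range(steps):
        u = [u[0]] + step(u) + [u[-1]]  # outer boundary values held fixed
    return u

def run_partitioned(u, steps, split):
    """Two 'ranks' each own a slice; edge values are exchanged every step."""
    left, right = u[:split], u[split:]
    for _ in range(steps):
        # Halo exchange: each rank pads its slice with the neighbor's edge
        # value (this is where MPI would synchronize the time step).
        lpad = left + [right[0]]
        rpad = [left[-1]] + right
        left = [left[0]] + step(lpad)
        right = step(rpad) + [right[-1]]
    return left + right

grid = [0.0] * 8 + [1.0] + [0.0] * 8
# Partitioned and monolithic runs perform identical arithmetic, so they agree:
print(run_partitioned(grid, 20, 5) == run_monolithic(grid, 20))
```

Because each rank only needs one layer of ghost cells per face per step, communication volume scales with the partition surface while compute scales with its volume, which is why the scheme works well on commodity cluster interconnects.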
The Ben Arabi cluster at FPC Murcia, Spain, consists of 102 nodes, for a total of 816 cores (quad-core Intel Xeon E5450 @ 3 GHz) and 1072 GB of distributed memory. As a benchmark, a generic phone model gridded at 195 Mcells was simulated on the Ben Arabi cluster; solver speed as a function of the number of cores executing the simulation is shown in Fig. 2.
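Strong-scaling curves like the one in Fig. 2 are often summarized with Amdahl's law, where a small non-parallelizable fraction caps the achievable speedup. A minimal sketch; the single-core speed and serial fraction below are hypothetical, not fitted to the Ben Arabi measurements:

```python
# Sketch: Amdahl's-law model of solver speed vs. core count.
# base speed and serial fraction are illustrative assumptions, not measured data.

def scaled_speed(base_mcells_s: float, cores: int, serial_frac: float) -> float:
    """Modeled solver speed on `cores` cores under Amdahl's law."""
    speedup = 1.0 / (serial_frac + (1.0 - serial_frac) / cores)
    return base_mcells_s * speedup

# Hypothetical 20 Mcells/s single-core speed, 2% serial fraction:
for n in (1, 8, 64, 512):
    print(n, round(scaled_speed(20.0, n, 0.02), 1))
```

In practice, per-step halo-exchange latency adds a further core-count-dependent cost on top of this model, which is why measured curves flatten sooner than the ideal one.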