Three characteristics become apparent as you explore the Kautz graph visualizations:
1. Performance remains very high even as the number of nodes approaches a thousand.
2. The “response curve” of the network is essentially flat, meaning that reaching the most distant node takes almost the same time as reaching the closest. (Each hop costs only 30 ns.)
3. The Kautz graph’s enormous bisection bandwidth allows it to power through collective operations such as broadcast without degradation.
All three of these characteristics are essential to the high-processor-count applications that will dominate technical computing in the years to come.
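To make the flat response curve concrete, the sketch below builds a Kautz graph and measures how many hops separate one node from every other. It is an illustration under stated assumptions, not a description of the SiCortex hardware: the degree and string length (3 and 6, chosen because they give 972 nodes, just under a thousand) and every function name are hypothetical, and the only figure taken from the text is the 30 ns per-hop cost.

```python
from collections import deque
from itertools import product

HOP_NS = 30  # per-hop cost quoted in the text


def kautz_nodes(degree, length):
    """All strings of `length` symbols over degree+1 letters
    in which no two adjacent symbols are equal."""
    alphabet = range(degree + 1)
    return [s for s in product(alphabet, repeat=length)
            if all(a != b for a, b in zip(s, s[1:]))]


def successors(node, degree):
    """Shift left by one symbol, then append any symbol that
    differs from the current last symbol."""
    return [node[1:] + (x,) for x in range(degree + 1) if x != node[-1]]


def hop_counts(source, degree):
    """Breadth-first search: minimum hops from `source` to each node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in successors(u, degree):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Assumed parameters (not from the text): (3 + 1) * 3**5 = 972 nodes.
DEGREE, LENGTH = 3, 6
nodes = kautz_nodes(DEGREE, LENGTH)
dist = hop_counts(nodes[0], DEGREE)
farthest = max(dist.values())
print(len(nodes), "nodes; farthest node is", farthest, "hops,",
      farthest * HOP_NS, "ns")
```

Under these assumptions the farthest of the 972 nodes is 6 hops away, or 180 ns of hop cost at 30 ns per hop. The sketch counts hops only; it does not model software overhead, contention, or link bandwidth.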
The first key to increasing the performance of any Linux/MPI application is to run it on more processors, provided the network scales cost-effectively. The SiCortex Kautz graph achieves this goal by integrating the switching within processor nodes and putting the wires in a backplane.
A second key to increasing performance is often to use more sophisticated, and perhaps adaptive, data structures. That is why so many mainstream applications are now moving from fixed grids to dynamic ones. The flat response curve of the Kautz network ensures that performance remains high even as the data moves around unpredictably in memory.
A third key is the performance of collectives, which computational scientists widely recognize as a frequent limiting factor in scalability. The Kautz graph addresses this issue by offering enormous bisection bandwidth and supporting microcode-accelerated collectives.
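As a rough illustration of why a broadcast completes quickly on this topology, the following sketch floods a message outward from one root and counts the forwarding rounds. It is a deliberately simplified model and not the SiCortex microcode-accelerated collective: it assumes a degree-3 Kautz graph with 6-symbol node labels (972 nodes), it lets every node forward to all of its out-neighbors in one round, and it ignores bandwidth and contention. The function names and the choice of root are hypothetical.

```python
def kautz_successors(node, degree):
    """Out-neighbors in a Kautz graph: drop the first symbol and
    append any symbol that differs from the current last one."""
    return [node[1:] + (x,) for x in range(degree + 1) if x != node[-1]]


def flood_rounds(root, degree):
    """Flood from `root`; return how many nodes are newly reached
    in each forwarding round."""
    reached = {root}
    frontier = [root]
    newly_reached = []
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in kautz_successors(u, degree):
                if v not in reached:
                    reached.add(v)
                    next_frontier.append(v)
        if next_frontier:
            newly_reached.append(len(next_frontier))
        frontier = next_frontier
    return newly_reached


DEGREE = 3                  # assumed degree, not taken from the text
root = (0, 1, 2, 0, 1, 2)   # arbitrary valid 6-symbol label
per_round = flood_rounds(root, DEGREE)
print("rounds:", len(per_round), "nodes reached:", 1 + sum(per_round))
```

In this model the message reaches all 972 nodes in 6 rounds. How a real collective behaves also depends on link bandwidth and on the implementation, which the sketch leaves out.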
Ultimately, the payoff for applications is that you can stop thinking about your SiCortex system’s Kautz graph and implement your application in the way that best fits the algorithm. The Kautz graph will perform well regardless.