Algorithm Comparison
|                     | T42L1   | T21L2   | T42L2   | T85L2   | T85L1   | T85L4   |
|                     | P=32    | P=16    | P=8     | P=32    | P=16    | P=8     |
| optimal algorithm   | halfsum | halfsum | halfsum | ringsum | halfsum | ringsum |
| (allreduce-min)/min | 0.885   | 0.648   | 0.289   | 1.797   | 0.604   | 0.429   |
| (generic-min)/min   | 0.968   | 0.409   | 0.064   | 0.057   | 0.067   | 0.003   |
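
The last two rows appear to be relative excesses over the best measured cost. Assuming "min" denotes the cost of the optimal algorithm in that column, an entry of 0.885 would mean the allreduce variant costs about 88.5% more than the optimum. A minimal sketch of that arithmetic follows. The function name `relative_excess` and the timings are hypothetical and are not taken from the table.

```python
def relative_excess(t_alg: float, t_min: float) -> float:
    """Return (t_alg - t_min) / t_min, the fractional excess over the best cost."""
    if t_min <= 0:
        raise ValueError("t_min must be positive")
    return (t_alg - t_min) / t_min


# Hypothetical timings in seconds, chosen only for illustration.
t_min = 2.0        # assumed cost of the optimal algorithm
t_allreduce = 3.0  # assumed cost of the allreduce variant

print(relative_excess(t_allreduce, t_min))  # 0.5, i.e. 50% more than the optimum
```

Under this reading, a value near zero (such as 0.003 in the last column) means the alternative is nearly as good as the optimum, and a value above 1 means it costs more than twice as much.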