Progress in increasing the number of particles in $N$-body simulations
of star clusters has been slow. In the sixties, $N$ was measured in
the tens; in the seventies, in the hundreds; and in the eighties, in the
thousands.
Peanuts indeed, compared to cosmological simulations with $10^6$
and more particles! Why are these numbers so different?
The main reason is the fact that star clusters evolve on the time
scale of the two-body relaxation time at the half-mass radius,
$t_{rh}$. In terms of the half-mass crossing time $t_{hm}$,
$t_{rh} \propto (N/\ln N)\, t_{hm}$. In addition, the use of tree codes or other codes
based on potential solvers is not practical, given the high accuracy
required. This follows from the fact that an evolving star cluster is
nearly always close to thermal equilibrium. Therefore, small errors
in the orbits of individual particles can give rise to relatively much
larger errors in the heat flux through the system.
As a result, the cost of a simulation of star cluster evolution scales
roughly $\propto N^3$, since $\sim N/\ln N$ crossing times are needed for each
relaxation time, and all $N(N-1)/2$ gravitational interactions between the $N$
stars have to be evaluated many times per crossing time. In contrast,
most simulations in galactic dynamics span a fixed number of crossing
times, and can employ an algorithm with a computational cost scaling
$\propto N \log N$. It is this difference, of roughly a factor $N^2$,
which has kept cluster modeling stuck in such a modest range of
$N$-values.
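To make these scalings explicit in one place (a back-of-the-envelope summary of the estimates above, with all coefficients of order unity suppressed):
\[
  \frac{t_{rh}}{t_{hm}} \propto \frac{N}{\ln N}\,, \qquad
  \mbox{cost per crossing time} \propto N^2
  \;\Longrightarrow\;
  \mbox{total cost} \propto N^2 \times \frac{N}{\ln N} \approx N^3 ,
\]
to be compared with the $\propto N \log N$ cost of a tree-code run spanning a fixed number of crossing times.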
What it Takes
The above rough estimates, although suggestive, are not very reliable.
In an evolving star cluster, large density gradients will be formed,
and sophisticated integration schemes can employ individual time steps
and various regularization mechanisms to make the codes vastly more
efficient (see the next section). One might have hoped that the above
scaling estimates would turn out to be too pessimistic.
Indeed, empirical evidence based on runs in the then-accessible range
of $N$ seemed to indicate a scaling of the computational cost per
crossing time milder than the naive $\propto N^2$ (Aarseth 1985).
However, a detailed analysis of the scaling behavior of various
integration schemes (Makino & Hut 1988) showed that the relatively
mild empirical scaling must have been an artifact of the still
rather small $N$ values it was based on. Instead, the best asymptotic
scaling reported by Makino & Hut, based on a lengthy analysis of
two-timescale methods, resulted in a computational cost $\propto N^{7/3}$
per crossing time in a homogeneous system, as well as in an
isothermal density distribution. For steeper density distributions,
with the velocity dispersion increasing towards the center, they found
even slightly higher cost estimates.
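A heuristic way to see why the cost per crossing time rises above $N^2$ (this is only a plausibility sketch, not the two-timescale analysis of Makino & Hut) is that the typical interparticle separation shrinks as $N^{-1/3}$, forcing correspondingly shorter time steps:
\[
  \Delta t \propto N^{-1/3}\, t_{hm}
  \;\Longrightarrow\;
  \mbox{cost per crossing time} \propto N^{1/3} \times N^2 = N^{7/3} ,
\]
so that the total cost over the $\propto N/\ln N$ crossing times of a relaxation time grows slightly faster than $N^3$.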
With this staggering scaling of the computational cost, slightly worse
than $N^3$, it would seem worthwhile to look for some type of
approximate method, in which part of the star cluster is modeled in a
statistical fashion. Indeed, such a hybrid approach, using a
combination of Fokker-Planck methods and direct
$N$-body integration,
was carried out by McMillan and Lightman (1984ab) and McMillan (1985,
1986). This approach proved to be successful in modeling a star
cluster around the moment of core-collapse. However, these
simulations could not be extended significantly beyond core collapse,
for lack of computer time.
When Hut, Makino & McMillan (1988) applied the analysis by Makino & Hut to McMillan's hybrid code, as well as to a variety of other
algorithms, they reached a pessimistic conclusion. Modeling
even a modest globular cluster with $10^5$ stars, even with the
theoretically most efficient integration scheme, would carry a cost of
several Teraflops-days (a Teraflops equals $10^{12}$ floating point
operations per second, and a Teraflops-day therefore corresponds to
nearly $10^{17}$ floating point operations). More realistically, they
concluded, for a relatively simple and vectorizable or parallelizable
algorithm, the computational cost would lie in the range of
Teraflops-months.
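To see how such numbers arise, here is a back-of-the-envelope estimate in the spirit of the scalings above; all parameter values in the sketch (particle number, steps per crossing time, flops per pairwise force, number of relaxation times to core collapse) are illustrative round numbers, not the figures adopted by Hut, Makino & McMillan:
\begin{verbatim}
# Back-of-the-envelope flop count for direct N-body evolution of a
# small globular cluster to core collapse.  Every number below is an
# illustrative assumption, not a value from Hut, Makino & McMillan (1988).
import math

N = 1.0e5                          # assumed number of stars
t_rh_to_collapse = 15              # assumed relaxation times to core collapse
crossings_per_t_rh = 0.1 * N / math.log(N)  # crossing times per relaxation time
steps_per_crossing = 100           # assumed integration steps per crossing time
flops_per_pair = 60                # assumed flops per pairwise force evaluation

total_flops = (t_rh_to_collapse * crossings_per_t_rh *
               steps_per_crossing * N**2 * flops_per_pair)

tflops_day = 1.0e12 * 86400        # flops in one Teraflops-day
print(f"~{total_flops:.1e} flops = {total_flops / tflops_day:.0f} Teraflops-days")
\end{verbatim}
With these deliberately crude inputs the estimate lands between the `several Teraflops-days' and `Teraflops-months' figures quoted above.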
How to Get There
Available computer speed increases by a factor of nearly two each
year, or a factor of $\sim 1000$ per decade. Ten years ago, the
speed of personal computers (or the per-person-averaged speed of a VAX)
was measured in kflops, while current workstations are measured in
terms of Mflops. Extrapolating, by the year 2013 we can expect each
of us to have a multi-Teraflop machine on our desk, enabling us to
core-collapse a globular cluster in a week or so.
If we look at the speed of the fastest supercomputers available, a similar scaling holds. Ten years ago, astrophysicists could have occasional access to a supercomputer, but when averaged to a sustained speed, it would boil down to only a fraction of a Mflops. And indeed, by now the best one can hope to obtain from an NSF supercomputer center is an average speed of a fraction of a Gflops. It would seem, then, that modeling the evolution of even a modest globular cluster will not be possible until some time around the year 2003.
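The extrapolation behind these dates is simple enough to spell out; the sketch below assumes an exact doubling every year and takes the 1993 baselines as round numbers ($\sim 1$ Mflops on a desk, a fraction of a Gflops sustained on a supercomputer):
\begin{verbatim}
# Extrapolate available computing speed, assuming a doubling every year.
def speed(start_flops, start_year, year):
    """Speed in flops at `year`, doubling annually from a baseline."""
    return start_flops * 2.0 ** (year - start_year)

# Desktop: ~1 Mflops in 1993 grows to ~1e12 flops (a Teraflops) by 2013.
print(f"desktop, 2013:   {speed(1e6, 1993, 2013):.1e} flops")

# Sustained supercomputer access: ~0.1 Gflops in 1993 grows to
# ~1e11 flops (a tenth of a Teraflops) by 2003.
print(f"sustained, 2003: {speed(1e8, 1993, 2003):.1e} flops")
\end{verbatim}
Starting from a few Mflops instead of one reproduces the multi-Teraflops desktop figure quoted above.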
Fortunately, we do not have to wait that long. Following the rather
pessimistic conclusions of Hut et al. (1988), a project was started to
construct special-purpose hardware for $N$-body calculations, by a
small group of astrophysicists at Tokyo University (Sugimoto et al.
1990). This GRAPE project (from `GRAvity PipE') centers on the
development of parallel Newtonian-force-accelerators, in analogy to
the idea of using floating-point accelerators to speed up
workstations. During the last three years, a variety of GRAPE
versions has been completed (and some of them distributed to other
locations outside Japan), and used successfully for several
astrophysical calculations (see, e.g., Funato, Makino & Ebisuzaki
1992).
The next major step in this project will be the development of a Teraflops-speed, high-accuracy special-purpose computer (the HARP, for `Hermite AcceleratoR Pipeline'; Makino, Kokubo & Taiji 1993), in the form of a set of a few thousand specially designed chips, each with a speed of several hundred Mflops. This machine is expected to be available some time next year.