Setup
In what follows, “write and implement” means that we suggest that you write the function in pseudocode first, and then implement your function.
Download the Barnes-Hut template from the course code repository and unzip it into your python/src directory.
If you are logged in to our course on Cogniterra, you can find the practice problems for this chapter linked in the window below, where you can enter your code into our autograders and get real-time feedback. Beneath this window, we provide a full description of all this chapter’s exercises.
[LINK COMING SOON]
An introduction to the Barnes-Hut algorithm
In this assignment, we will scale up our gravity simulator to handle the movements of thousands of stars over a much longer period of time. For example, in a few billion years, the Andromeda and Milky Way Galaxies will collide. We would like to build a simulation like the one shown below to illustrate a galaxy collision.
If we were to use the brute-force gravity simulator we built earlier, the most time-consuming step would be computing the force between every pair of heavenly bodies — requiring on the order of n2 comparisons, where n is the number of bodies. As n grows into the thousands or millions, this becomes impractical.
Reflecting on the realities of gravity gives us hope. As you read this, you are under the gravitational influence of every molecule in every star in the farthest stretches of the universe — but their effect on you is negligible. In other words, gravity is effectively a local force, and we can develop a heuristic that only computes the force between nearby objects. In this assignment, you will implement the Barnes-Hut algorithm, a landmark method from 1986 that exploits this insight using a clever data structure called a quadtree.
The Barnes-Hut algorithm
Clustering nearby points
A critical insight is that if a collection of bodies are very close to each other relative to a distant body X, their individual forces on X can be approximated by a single “dummy” body whose mass equals the combined mass of the group and whose position is the center of gravity of the group.
The center of gravity of a collection of bodies with positions (xi, yi) and masses mi is the weighted average of their coordinates:
\text{Center of Gravity} = \left( \frac{\sum_{i=1}^{n} m_i x_i}{\sum_{i=1}^{n} m_i},\ \frac{\sum_{i=1}^{n} m_i y_i}{\sum_{i=1}^{n} m_i} \right)For example, if three stars have equal mass and positions (−4, 7), (0, 5), and (10, 3), their center of gravity is:
\left(\frac{-4 + 0 + 10}{3},\ \frac{7 + 5 + 3}{3}\right) = (2,\ 5)If the same stars have masses in the ratio 4:2:1, then their center of gravity is:
\left(\frac{-4 \cdot 4 + 0 \cdot 2 + 10 \cdot 1}{4 + 2 + 1},\ \frac{7 \cdot 4 + 5 \cdot 2 + 3 \cdot 1}{4 + 2 + 1}\right) = \left(-\frac{6}{7},\ \frac{41}{7}\right)The idea of aggregating nearby bodies is only useful if we can avoid computing the distance between every pair of bodies — the very problem we started with. We will solve this using a data structure that cleverly organizes the n bodies before making any force computations.
Forming a quadtree
In the past, we have worked with binary trees, in which every node has 0, 1, or 2 children. In a quadtree, nodes can have up to four children. The quadtree of a collection of bodies is built as follows, illustrated using the five bodies shown below.
First, form a root node with four children corresponding to the four quadrants of the two-dimensional space: NW, NE, SW, and SE (indexed 0, 1, 2, and 3).
We divide the space into four equally sized quadrants and range through the bodies, assigning each to the leaf node corresponding to its quadrant — with the constraint that no two bodies occupy the same leaf. Body A lies in the NW quadrant and body B lies in the NE quadrant, so we assign them to the corresponding leaves. As we do so, we create a “dummy” star at the root whose position is the center of gravity of A and B and whose mass is their combined mass.
Body C also belongs to the NE quadrant. Since we require each leaf to contain only one body, rather than assigning C to the node already containing B, we subdivide that quadrant into four equally sized subquadrants. B falls in the NW subquadrant and C in the SW subquadrant. As we add C, we update the mass and center of gravity of the dummy star at each internal node on the path from the root to C.
Note: If two bodies land in the same subquadrant, we continue to subdivide that subquadrant until the two bodies can be assigned to distinct children.
Body D belongs to the unoccupied SE subquadrant within the already-subdivided NE quadrant, so we assign it directly to the corresponding leaf and update all ancestor nodes.
Finally, body E is the sole occupant of the SE quadrant of the root, so we assign the SE child of the root to point to E and update the root’s mass and center of gravity to include all five bodies.
Note: The Barnes-Hut figures above are borrowed from http://arborjs.org/docs/barnes-hut.
Using the quadtree to compute the force of gravity
Now that we have built the quadtree, we can use it to approximate the net force of gravity acting on any given body. Recall the key insight: if a group of bodies is close together relative to a distant body X, their combined effect on X can be approximated by a single dummy body. In terms of the quadtree, “close together” means the sector containing them has a small width s, and “far from X” means the distance d from their center of gravity to X is large.
We combine these into a single statistic s/d: if s/d is small, the bodies are far enough away that we can treat them as one; if s/d is large, we should compute forces individually. Given a parameter θ, we traverse the quadtree and, whenever we reach an internal node with s/d < θ, we stop exploring that subtree and instead treat the dummy body at that node as a single body.
The following figures illustrate this heuristic using a quadtree of six bodies to approximate the net force acting on body A with θ = 0.5.
In summary, when using a quadtree to estimate the net force acting on a body b, we traverse the nodes of the tree and at each node take the following actions:
- If the current node is a leaf:
- If the leaf has no body assigned to it, it does not contribute to the net force on b.
- If the leaf has a body assigned to it: if the body is b, take no action; otherwise, compute the gravitational force of this body on b and add it to the net force.
- If the current node is an internal node, compute s/d.
- If s/d > θ, visit the node’s children.
- If s/d ≤ θ, treat this internal node as a single body (using its precomputed total mass and center of gravity) and compute the force it exerts on b.
Every time we update the universe, we generate a new quadtree and use this heuristic to approximate the net force acting on each body. The resulting algorithm — alternating between quadtree construction and heuristic force computations — is the Barnes-Hut algorithm. If you’d like to read the original 1986 paper, you can find it here.
Analyzing the runtime of Barnes-Hut
It can be shown that if n is the number of bodies, then constructing the quadtree takes on the order of n log n steps, and using it to approximate the net forces on all bodies likewise takes on the order of n log n steps. This is a significant improvement over the brute-force approach, which requires on the order of n2 comparisons.
The speedup factor is on the order of n/log n. For a simulation with n = 10 billion stars (about ten percent of the Milky Way), log n ≈ 10, giving a speedup of roughly 1 billion. Notably, the larger n grows, the larger the speedup — a perfect illustration of why efficient algorithms matter more than ever as scientific datasets grow.
The runtime is also heavily dependent on the choice of θ. A smaller value of θ makes the heuristic more stringent — fewer groups of bodies are consolidated — and the algorithm takes longer but yields more accurate results. A larger θ consolidates more aggressively and runs faster at the cost of accuracy.
Learning objectives
By completing this assignment, you will be able to:
- Apply object-oriented programming concepts, including methods and object references.
- Appreciate the importance of well-organized, readable, modular code as the size of a program grows.
- Implement a medium-scale programming project by extending an existing codebase.
- Design your own tests to verify that individual functions are correct before integrating them.
- Recognize that ideas like trees and recursion are fundamental to computer science far beyond counting and phylogenetics.
- Appreciate that tuning a parameter (here, θ) often represents a trade-off between efficiency and accuracy — a ubiquitous concept in computational science.
An object-oriented representation of the Barnes-Hut system
The following object-oriented design is provided in the starter code; you should not edit it. All core data structures are contained in datatypes.py.
OrderedPair
A lightweight class that stores two numbers, x and y. Used throughout the simulation to represent positions, velocities, and accelerations.
@dataclass
class OrderedPair:
x: float = 0.0
y: float = 0.0
Star
Represents a single celestial body. Each Star has a position, velocity, and acceleration (all OrderedPair objects), physical attributes such as mass and radius, and optional RGB color values for visualization.
@dataclass
class Star:
position: OrderedPair | None = None
velocity: OrderedPair | None = None
acceleration: OrderedPair | None = None
mass: float = 0.0
radius: float = 0.0
red: int = 255
green: int = 255
blue: int = 255
Universe
Represents the simulated world at a single moment in time: a square region of a given width (in meters) containing a list of stars. The method in_field() tests whether a point lies inside the universe’s boundaries.
@dataclass
class Universe:
width: float = 0.0
stars: list[Star] = None
def in_field(self, p: OrderedPair) -> bool:
return 0 <= p.x <= self.width and 0 <= p.y <= self.width
Quadrant
Represents a square subregion of the universe defined by its lower-left corner (x, y) and its width. Quadrants are used to subdivide space when building the quadtree.
@dataclass
class Quadrant:
x: float = 0.0
y: float = 0.0
width: float = 0.0
Node
Represents one node in a quadtree. Each node corresponds to a Quadrant and can contain a single real Star (for leaves) or a dummy Star representing the combined center of mass of its children (for internal nodes). Child quadrants are always ordered [NW, NE, SW, SE]. The method is_leaf() returns True if the node has no children.
@dataclass
class Node:
sector: Quadrant | None = None
children: list["Node"] | None = None
star: Star | None = None
def is_leaf(self) -> bool:
return self.children is None or len(self.children) == 0
QuadTree
A wrapper around the root Node of the quadtree. From this root, the entire tree can be traversed. The tree is rebuilt at every time step to calculate gravitational forces efficiently.
@dataclass
class QuadTree:
root: Node | None = None
Galaxy
An alias defined in initialization.py representing a collection of Star objects.
Galaxy = list[Star]
Function hierarchy
Your Barnes-Hut galaxy simulation is organized across several Python files. Below is a high-level overview of how all major functions connect to one another.
main()
├── initialize_galaxy() × 2
├── push()
├── initialize_universe()
├── barnes_hut()
│ └── update_universe()
│ ├── generate_quadtree()
│ │ └── Node.insert()
│ │ ├── center_of_gravity()
│ │ ├── Node.create_children()
│ │ └── Node.find_child()
│ ├── Node.calculate_net_force()
│ ├── update_velocity()
│ └── update_position()
└── animate_system()
└── draw_to_canvas()
The highest-level function you will implement is barnes_hut():
def barnes_hut(
initial_universe: Universe,
num_gens: int,
time: float,
theta: float
) -> list[Universe]:
We also reproduce our earlier pseudocode for simulate_gravity() and update_universe(), since barnes_hut() follows the same top-level pattern:
SimulateGravity(initialUniverse, numGens, time)
timePoints ← array of numGens + 1 universes
timePoints[0] ← initialUniverse
for every integer i between 1 and numGens
timePoints[i] ← UpdateUniverse(timePoints[i-1], time)
return timePoints
UpdateUniverse(currentUniverse, time, theta)
newUniverse ← CopyUniverse(currentUniverse)
tree ← GenerateQuadtree(newUniverse)
for every star s in newUniverse
oldA ← s.acceleration
oldV ← s.velocity
s.acceleration ← UpdateAcceleration(s, tree, theta)
s.velocity ← UpdateVelocity(s, oldA, time)
s.position ← UpdatePosition(s, oldA, oldV, time)
return newUniverse
Running your program
Once you have implemented all functions, you can run your simulation from the command line (use python instead of python3 on Windows):
python3 main.py <num_stars> <num_gens> <time_interval> <theta> <canvas_width> <frequency>
Here is an example that should run in around 30 seconds:
python3 main.py 100 1000 4e16 1.0 1000 10
After running the simulation, your program will produce a video named galaxy.mp4. Once you have a working simulation, try adding more stars, increasing the number of steps, and experimenting with the time interval and θ parameter.
Note: You may encounter a"maximum recursion depth exceeded"error if stars get very close together, causing the quadtree to subdivide to an extreme depth. To mitigate this, try reducing the number of stars. You can also increase Python’s recursion limit by addingsys.setrecursionlimit(10000)at the top of your code.
Important notes
You should view the origin of the coordinate system as the bottom-left corner of the Universe object, as we did when building the gravity simulator (i.e., use Cartesian coordinates).
Unlike in the gravity simulator, when a star leaves the boundary of the Universe, you should remove it from the next Universe, because the quadtree is only defined for stars that lie within the universe’s boundaries.
Round-off errors will accumulate over time, arising both from the discrete time steps and from the approximations made by the Barnes-Hut heuristic. You may notice that it is difficult to form a galaxy with stable orbits over a long simulation; this is partly because real galactic orbits have had billions of years to stabilize, and partly because computing realistic initial velocity distributions is a problem in astrophysics in its own right. Do not worry if your galaxies occasionally lose stars.
Utility functions
An important step in Barnes-Hut is computing the center of gravity of a collection of stars, which may have different masses. Recall the formula:
\text{Center of Gravity} = \left( \frac{\sum_{i=1}^{n} m_i x_i}{\sum_{i=1}^{n} m_i},\ \frac{\sum_{i=1}^{n} m_i y_i}{\sum_{i=1}^{n} m_i} \right)This is a perfect task for a variadic function, since we would like to take an arbitrary number of Star objects as input.
Code Challenge: Write and implement a variadic functioncenter_of_gravity(*stars: Star) -> OrderedPairthat takes an arbitrary number ofStarobjects as input and returns anOrderedPairrepresenting the mass-weighted center of gravity of the input stars.
Building the quadtree
Each Node in the Barnes-Hut tree represents a square sector of space. When a node must be subdivided, it splits into four smaller square regions resulting in four child nodes, ordered [NW, NE, SW, SE] and indexed 0 through 3.
Code Challenge: Write and implement aNodemethodcreate_children(self)that modifies the node in place by assigning four childNodeobjects toself.children, each corresponding to one quadrant (NW, NE, SW, SE) of the current node’s sector. The new child nodes should have theirstarandchildrenattributes left asNone.
Code Challenge: Write and implement aNodemethodfind_child(self, s: Star) -> Nodethat returns the childNodeofselfwhose sector would contain starsbased on the star’s position.
The figure below shows an example tree to use when testing find_child().
Now we can implement the core recursive operation: inserting a star into the quadtree. As you work through this function, think carefully about the different cases that can arise depending on whether the current node is an empty leaf, an occupied leaf, or an internal node.
Code Challenge: Write and implement aNodemethodinsert(self, s: Star)that inserts starsinto the appropriate location in the quadtree rooted atself. If the current node is an occupied leaf, it must be subdivided, and the existing star must be re-inserted into the correct child before insertings. All internal nodes on the path from the root to the inserted star must have their mass and center of gravity updated accordingly.
Hints: (1) You will find yourself on the best path to solving this problem if you enumerate the different cases for inserting a star — empty leaf, occupied leaf, and internal node — and work through an example by hand before writing any code. (2) Prefer using the built-inis_leaf()method over checkingself.children == Nonedirectly, sinceself.childrencould also be an empty list.
Now that we can insert individual stars, we can build a quadtree for an entire Universe.
Code Challenge: Write and implement a functiongenerate_quadtree(current_universe: Universe) -> QuadTreethat builds and returns aQuadTreecontaining all stars incurrent_universe. The root node should correspond to the full universe sector, and stars should be inserted in the order they appear incurrent_universe.stars.
The figures below show example quadtrees produced by generate_quadtree() for several input universes, displayed as preorder traversals of the tree. Each node is shown as a triplet of (x-position, y-position, mass) for the star at that node, or nil if the node is empty.
generate_quadtree().generate_quadtree().generate_quadtree().generate_quadtree().Computing the net force
With the quadtree in hand, we can implement the core Barnes-Hut heuristic for approximating the net gravitational force acting on a given star.
Code Challenge: Write and implement aNodemethodcalculate_net_force(self, star: Star, theta: float) -> OrderedPairthat returns the net gravitational force acting onstarfrom all bodies in the subtree rooted atself, using the Barnes-Hut heuristic with parametertheta.
Note: Empty quadrant nodes still exist in the tree as placeholders but have no star assigned to them. Make sure to checknode.star is not Nonebefore accessingnode.starto avoid errors.
Note: When computing distances, remember to check fordistance == 0to avoid dividing by zero, which would occur when computing the force of a star on itself.
Note: The following subroutines are provided for you in the starter code:compute_force(s1: Star, s2: Star) -> OrderedPair,distance(p1: OrderedPair, p2: OrderedPair) -> float, andcenter_of_gravity(*stars: Star) -> OrderedPair.
The figures below show example input quadtrees for testing calculate_net_force().
calculate_net_force().calculate_net_force().calculate_net_force().calculate_net_force().Simulating the universe
We now have all the pieces needed to assemble the full simulation. Note that when updating the universe, all stars must be updated simultaneously — you should compute the new acceleration, velocity, and position of every star using the current universe before writing any updated values back.
Code Challenge: Write and implement a functionupdate_universe(current_universe: Universe, time: float, theta: float) -> Universethat takes aUniverseobject and returns a newUniverserepresenting the system after one time step of the Barnes-Hut simulation, using a time interval oftimeand heuristic parametertheta. Stars that leave the universe’s boundaries should be removed from the returned universe.
Note: The following subroutines are provided for you in the starter code:update_acceleration(star: Star, q: QuadTree, theta: float) -> OrderedPair,update_velocity(star: Star, time: float, old_accel: OrderedPair) -> OrderedPair,update_position(star: Star, time: float, old_accel: OrderedPair, old_vel: OrderedPair) -> OrderedPair,copy_universe(current_universe: Universe) -> Universe, andgenerate_quadtree(universe: Universe) -> QuadTree.
If you made it this far, congratulations! Our highest-level function follows directly from update_universe().
Note: In addition to the subroutines available forupdate_universe(), a functionupdate_universe(current_universe: Universe, time: float, theta: float) -> Universeis also provided for you in the starter code.