Life Universe: A Technical Explanation

This is a translation of the original article. It was mostly processed by ChatGPT and DeepL and then I fixed the details.

It’s been a while since it was released, but I’ll explain the technical details of "Life Universe".

About Life Universe

First, there are some things you should know before I explain.

About OTCA Metapixel

I omit the explanation of Conway’s Game of Life. Since Life is Turing complete, various patterns exist, including a special pattern called OTCA Metapixel. OTCA Metapixel is a pattern that visually shows the on and off states of (meta) cells, and can reproduce all 2D cellular automata[1] that work under outer totalistic rules[2], not just Life.

In other words, you can see "Life within Life".

This is a famous video, but such calculations can actually be performed.

About Hashlife

The effective size of OTCA Metapixel[3] is 2048×2048 cells. In other words, moving one metacell requires the calculation of over 4 million cells, and if you want to create patterns with metacells, you need to move billions of cells. It’s impossible to move them in real time with normal calculations.

Therefore (although the order is reversed[4]), we use a technique called Hashlife, which allows for ultra-fast computation of Life. Hashlife represents the entire field as a quadtree, collapses the children with the same state into a DAG, and uses caching and doubling to compute the next generation at an explosive speed.

With Hashlife, OTCA Metapixel can be computed very efficiently, and on modern machines, scenes like using OTCA Metapixel to move another OTCA Metapixel (meta-metapixel) can be calculated as well.

However, Hashlife also has its limits, and even with its efficiency, memory usage and computation time increase exponentially as the hierarchy increases, so constructing patterns using meta-meta-metapixels is not practical.

Ideas on Life Universe

One day, I had a thought: "Don’t you wanna make this infinite?"

While it may seem impossible, at least mathematically, the existence of OTCA Metapixels makes it possible to create such a thing. To do this, we simply need to consider the minimum fixed point of the mapping that constructs the next level of OTCA Metapixels using a certain level of cells.

Furthermore, we also realize that we can cheat various details and make it computable in real-time.

For example, we can calculate up to the level of meta-metapixels and then swap them with normal cells and metapixels once the smallest metapixel inside them is crushed. This allows us to "display" a Life that can be zoom in and out infinitely while keeping the amount of computation virtually constant.

However, there are two major issues with this approach.

Consistency with time

The first issue is that "computation fails when time progresses". When we compute at the level of meta-metapixels, we are not considering the meta-meta-metapixels, but the blinking pattern of a meta-metapixel depends on its position in the higher-level meta-meta-metapixel.

Moreover, to determine the blinking pattern of a meta-meta-metapixel, we need to know where it is located in the meta-meta-meta-metapixel, and so on, ad infinitum. Therefore, if we really want to perform correct calculations for an infinite number of levels, we cannot stop computation at some intermediate level.

Consistency with position

The second issue is that "it is possible not to return to the original location when zoomed in and out". When we zoom in, we discard the information of the largest meta-metapixel, so it is not always possible to reconstruct the same scene when we zoom out again.

Since zooming in and out can be continued indefinitely, we also realize that we need infinite information to accurately represent "our current position".

In the end, we cannot escape from infinity in terms of both computational complexity and memory consumption if we pursue "accuracy".

Life Universe

The work that achieved real-time calculation with complete accuracy and solved the two aforementioned problems is Life Universe.

Initially, I thought to myself "I made something amazing," but I was extremely surprised to receive more feedback than I had anticipated from all over the world.

Thank you very much.

Consideration of partial structures

It is important to break down and consider the structure of the target to be realized. First, let’s consider the structure of OTCA Metapixel (hereafter referred to as OTCAMP).

Structure and number of states of OTCAMP

The effective size of OTCAMP is 2048×2048, and since it is assumed to be infinitely tiled, we do not consider the cells at the edges. In addition, the pattern has a period of 35328, so advancing 35328 generations will cause the generation at the meta level to advance by 1. Also, since OTCAMP itself can be turned on or off, the state of OTCAMP at a certain time seems to be determined by 2 (on/off) x 35328 (time). However, this is not enough.

The tiled OTCAMP communicates with the surrounding OTCAMPs via gliders. Therefore, the state of the central OTCAMP changes slightly depending on the state of the surrounding OTCAMPs. Since there are 8 adjacent cells, additional information of 2^8 is required.

In fact, this is still not enough, and the previous state of itself is required. The internal state of OTCAMP and the displayed state are slightly out of sync in time, and the display is updated shortly after updating the internal state. Therefore, during one cycle, updates such as ON→OFF, OFF→ON, ON→ON, and OFF→OFF of the display can be seen.

In the end, to identify the state of OTCAMP at a certain moment, you need the following information:

  • The state of the 9 cells including itself 2^9
  • The previous state of itself 2
  • The time within the cycle 35328

If all of these patterns are prepared in advance, then the correct display can be obtained by performing an appropriate tiling in theory.

Now, how much capacity will this require? Assuming no compression, using 4194304 bits for one state, and totaling 2^9 \times 2 \times 35328 \times 4194304 = 151732604633088 bits, or 138 TiB. This is an incredible amount, but assuming that all of these patterns can be pre-calculated, we will consider how to tile the patterns.

When we only zoom in

Now that we have figured out how to identify the state of a single OTCAMP, let’s consider creating a hierarchical structure. First, let’s consider the case where at some point, all OTCAMPs on the same level are off as we continue to zoom out. In this case, we can only zoom in because there is nothing left to see. We need to think about how to make the correct calculations in this case.

We assign levels to the hierarchy, where level 0 is the layer where the off OTCAMPs are tiled, level 1 is the OTCAMPs that make up the cells inside those off OTCAMPs, level 2 is the OTCAMPs that make up the cells inside level 1 OTCAMPs, and so on. As we zoom in, we see higher level numbers.

Next, we introduce the concept of coordinates to identify our current position. Let’s say we are focusing on a single fixed OTCAMP state. In order to shift our focus to one cell that makes up that OTCAMP, we need to determine:

  • The location of the cell we want to focus on
  • The time within the OTCAMP’s period when we view that cell as an OTCAMP

We represent this location and time as a triple of integers (x,y,t), where 0\leq x,y\lt2048 and 0\leq t\lt35328. This determines the state of the next OTCAMP we will focus on. We simply call this triple the focus.

For example, let’s say we are focusing on one of the OTCAMPs at level 0. Since all level 0 OTCAMPs are in the repeated off state, we can determine the state by just the period time t_0, and we can represent this state as (*,*,t_0) (the location of the cell does not matter, and the state is determined only by the time within the period).

Similarly, by focusing on a cell at the x_1-th position from the left and the y_1-th position from the top, and viewing it as an OTCAMP at period time t_1, we can focus on a specific level 1 OTCAMP. We represent this by (*,*,t_0)\to(x_1,y_1,t_1).

By repeating this process, we can focus on a specific level n OTCAMP using the following chain of focus:

(*,*,t_0)\to(x_1,y_1,t_1)\to\dots\to(x_n,y_n,t_n)

We define this as the coordinate of the corresponding OTCAMP.

Coordinates and cell states

What we want to discuss now is how to calculate the state of the corresponding OTCAMP when a certain coordinate is given, as this will enable us to "observe" the OTCAMP of interest.

This can be done top-down. For example, suppose we want to determine the state of the OTCAMP corresponding to the coordinate

(*,*,t_0)\to(x_1,y_1,t_1)

At level 0, we know that there are off OTCAMPs arranged. Therefore, the state of the OTCAMP can be determined solely from the time within the period t_0. To determine the state of the OTCAMP at level 1, we need:

  • The states of the surrounding 9 cells
  • The state of the previous cell
  • The time within the period

Out of these, the states of the surrounding 9 cells can be determined by looking at the cell at (x_1+i,y_1+j) in the OTCAMP represented by the coordinate (*,*,t_0), where -1\leq i,j\leq1. Also, the state of the previous cell can be determined by looking at the cell at (x_1,y_1) in the OTCAMP represented by the coordinate (*,*,t_0-1). The time within the period is determined by t_1.

This enables us to determine the state of the level 1 OTCAMP represented by the coordinate (*,*,t_0)\to(x_1,y_1,t_1).

Similarly, we can determine the state of cells at level 2 and beyond using the states of cells up to level 1, and we can determine the state of the level n OTCAMP corresponding to any coordinate

(*,*,t_0)\to(x_1,y_1,t_1)\to\dots\to(x_n,y_n,t_n)

in a finite number of calculations.

Time evolution and scrolling

If we can determine the state from the coordinates, time evolution and scrolling are not that difficult.

Since the period of OTCAMP is 35328, when the time advances by 35328 at a certain level, the time at the parent level advances by 1. In other words, the overall time can be treated as a 35328-base number when viewed a level as a digit. For example, if we advance the time by 1 at level 1 at the coordinates (*,*,0)\to(0,0,35327), the coordinates change to (*,*,1)\to(0,0,0).

Similarly, the coordinates can be regarded as a 2048-base number for each x and y. For example, if we move one cell to the right (+x direction) from the coordinates (*,*,0)\to(0,0,0)\to(2047,0,0) to level 2, the coordinates change to (*,*,0)\to(1,0,0)\to(0,0,0). Then, if we move to the cell above (-y direction), the coordinates change to (*,*,0)\to(1,2047,0)\to(0,2047,0).

In this way, we can freely move around inside the hierarchical structure. If the focus chain is finite, the coordinates only hold limited information, so memory usage is also limited.

Analysis of the overall structure

We learned that if we only zoom in on the grid, the coordinates are represented by a finite length, and the state can also be determined with finite calculations in a top-down manner.

Now, let’s consider the case where the hierarchical structure does not end at level 0, but continues infinitely to level -1, level -2, and so on.

Suppose we want to know the state of the cell at coordinates (0,0,0) at level 0. Since there is no assumption that all OTCAMPs at level 0 are off, we need information about which state of OTCAMP and where it belongs to in level -1 in order to determine the state of a specific OTCAMP at level 0. Furthermore, to determine the state of an OTCAMP at level -1, information from level -2 is necessary, and similarly, information from infinitely previous levels is necessary. This means that the calculations will never end.

In addition, coordinate expansion will also be necessary. So far, since the state at level 0 was clear, the state could be identified with an attention column based on level 0, but now we need to use a coordinate system that extends from infinity. That is, the coordinates must be expressed as a focus chain extending infinitely to the left:

\cdots\to(x_{-1},y_{-1},t_{-1})\to(x_0,y_0,t_0)\to\cdots\to(x_n,y_n,t_n)

This means that an infinite amount of memory capacity is required.

Furthermore, upon closer examination, it turns out that such coordinates may not necessarily uniquely determine the state, but this problem is too complicated to address here.

Making the impossible possible

Now that we’ve realized that trying to calculate it perfectly is impossible, we want to find a way to make it possible.

The reason the calculation doesn’t stop is that to get the state of a certain OTCAMP, the state of the upper-level OTCAMP is needed. This creates a structure similar to a non-stop recursive function. Therefore, we consider forcibly stopping this recursion in the middle. In other words, when it reaches a certain level, it decides its own state based on the information it has without asking for the state of the higher level. This is similar to assuming that "OTCAMPs that are off are arranged in level 0". This way, the calculation for determining the state will stop after a finite number of times, and the coordinates can be represented by a finite sequence of attention with a fixed level as the starting point.

The problem here is what to do when it reaches the level where it has been fixed. We are not considering the levels above that, so if we do nothing, the universe will end there. What should we do?

 

The answer is to "dynamically construct a parent that fits the consistency". This is the biggest key to realizing this work.

 

Dynamically constructing parents

For example, let’s consider an OTCAMP where all the surrounding cells, including itself, are off, its one previous state was on, and the time within the cycle is t. The parent of this OTCAMP can be a state of OTCAMP with a 3×3 pattern transition shown below at a certain time t' and location (x,y):

  ? ? ?             □ □ □
  ? ■ ?     -->     □ □ □
  ? ? ?             □ □ □

time = t          time = t+1

■: ON
□: OFF
?: ON or OFF

We represent the coordinates of this OTCAMP as (?,?,t') since we do not know the exact location of this OTCAMP and its parent.

Then, the aforementioned OTCAMP can be represented by the coordinates (?,?,t')\to(x,y,t). If we can find a parent OTCAMP for this OTCAMP, we can further extend the coordinates to the left as (?,?,t'')\to(x',y',t')\to(x,y,t).

However, if an OTCAMP in a strange state is selected, there may no longer be an OTCAMP that would be its parent, so the state to be fixed and the parent that satisfies the condition must be selected carefully.

In the actual work, to add diversity to the scenery seen when zooming out, we pre-calculate and enumerate the states of the OTCAMP and its parent to be used at the time of calculation termination. When an actual parent is needed, a parent state is randomly selected from the possible options [5].

How parents are expanded

Using an approach like lazy evaluation, recursive calculations can be terminated and it becomes possible to compute a completely consistent structure in a finite amount of time.

Let’s take an example. First, choose the root OTCAMP state. The state can be chosen in any way, but choose a state where its parent always exists. If we are looking at the state of time 10000 within this OTCAMP cycle, its coordinate is represented as

(?,?,10000)

Now, let’s move forward in time, and assume that the coordinate has become (?,?,35327). When we advance the time further, there is a carryover in time. At this point, it becomes necessary to actually determine the parent of this OTCAMP. That is, for example, by extending the coordinate to the left like

(?,?,500)\to(50,100,35327)

and then calculating the carryover like

(?,?,501)\to(50,100,0)

to determine the next state. The same process is applied to scrolls as well. For example, if we shift our focus to the cell one to the left of this state, the coordinate becomes

(?,?,501)\to(49,100,0)

but as we continue to do so,

(?,?,501)\to(0,100,0)

when we shift our focus to the next cell to the left, the parent expansion is required[6].

(?,?,2000)\to(20,30,501)\to(0,100,0)

Expand the parent and determine the number for ?, and then update the coordinate by calculating the carryover.

(?,?,2000)\to(19,30,501)\to(2047,100,0)

In this way, parents are expanded as needed. An important property is that every time one parent is expanded, it becomes possible to calculate up to 2048 times the movement and 35328 times the time elapsed, so even if you move at an unrealistically high speed or advance time at an ultra-high speed, the number of expanded parents will be only a few. For example, even if we advance the time by 5000 trillion steps, the number of expanded parents will be at most 4. Exponential functions are terrifying.

When actually computing the OTCAMP state, if it is implemented naively using recursive calculations, it will take exponentially long computation time with respect to the length of the coordinate. Therefore, I used a memoization technique to reduce the order to linear or less.

Preparing the animation

Now, how much capacity will this require? Assuming no compression, …… or 138 TiB. This is an incredible amount, but assuming that all of these patterns can be pre-calculated, we will consider how to tile the patterns.

We’ve solved the calculation part of laying out the patterns, but we still have the assumption that this needs to be compressed to a size that can be published on the web.

One way to reduce the space without compression is to record only part of it and calculate the rest during playback. However, because we can’t access the random time, we don’t think this is suitable for displaying a multi-level OTCAMP, so we decided to record all frames completely.

First of all, Hashlife

In order to start talking about the entire OTCAMP, I implemented Hashlife and performed calculations for 35,328 generations for all 1024 arrangements of OTCAMP. I then performed an output of all animation data.

The advantage of using Hashlife is that it saves the same state quadtree nodes in memory by compacting them into a DAG, so it can save data in a much smaller size than saving images, etc. There are two types of data that need to be saved, the structure of the DAG containing all necessary nodes, and the animation data that lists the node numbers referenced by each state. Especially the latter requires a large capacity, and I remember that it became more than 4 GiB in total. Compared to 138 TiB, it has become much smaller, but it is still not a size that can be distributed.

Making data easier to compress

The output file contained data for each of the 1024 possible OTCAMP cell configurations, written over 35328 frames. I rearranged this data so that it becomes data for each of the 1024 cell configurations for each of the 35328 frames. This makes the data much easier to compress. Why is that?

The reason is the similarity of data at the same time for each cell configuration. For example, let’s compare the OTCAMP at time t and time t+1 for a certain cell configuration[7] A. Since the generations are adjacent, there are no major differences at a glance. However, since Life contains many small oscillators, if you look closely, you will see that many small changes are occurring in various places. As a result, many different reference node numbers are included.

On the other hand, when we compare the OTCAMP at time t for cell configuration A and the OTCAMP at time t for another cell configuration B, these are surprisingly well-matched in detail. This is because, since the times are aligned, most of the patterns of each oscillator match.

In many compression algorithms, the more similar patterns appear near the data, the higher the compression rate, so simply rearranging this data can result in a significantly higher compression rate.

LZSS

As a result of the above rearrangement, we needed to compress 1024 sets of 35328 number sequences. We apply the LZSS algorithm to each of these number sequences to compress the data. Since the number sequence is only 1024 numbers, we can minimize the compressed capacity through exhaustive search.

The LZSS algorithm is simple yet very powerful as it can eliminate most of the commonly occurring repetitive patterns. In addition, one of the features of LZSS is that the expansion is fast, so the compressed data can be saved in memory and extracted and expanded when needed during real-time calculations.

This compression reduced the animation data to about 20 MiB. Although it might be possible to publish it on the web, I would like to make it even smaller if possible.

Patternization of LZSS results

Looking at the data compressed by LZSS, you can see that similar patterns appear in compressed data from different times. For example,

A B C 1 2 A 1 3 A 4 5 B ...

and

X Y Z 1 2 X 1 3 X 4 5 Y ...

show that the data is scattered at different times. Therefore, I decomposed these data into a fill pattern + a number sequence to use, such as

$1 $2 $3 1 2 $1 1 3 $1 4 5 $2 ...
A B C ...
$1 $2 $3 1 2 $1 1 3 $1 4 5 $2 ...
X Y Z ...

and reused the same fill pattern. With this, the size became about 10 MiB.

Conversion to PNG

This is a final touch. The image format is easy to load in JavaScript, and since PNG itself has Deflate, we can expect further reduction in size in addition to binary conversion.

Since all the numbers contained in the data to be encoded fit within 19 bits, I performed run-length encoding using the remaining 5 bits of the available 24 bits in an alpha-less PNG. There was some effect.

In addition, I used a tool called OptiPNG to attempt to reduce the size of the PNG file itself, which also had some effect.

Finally, I were able to compress all DAG data and animation data to about 4 MiB. This is a size that is not particularly problematic for publication on the web.

This is an excerpt of the data. Looking at it, there seems to be a lot of color bias, but I want to believe that this is being processed by Deflate in PNG itself.

Drawing

Drawing is performed by writing specialized shaders, but… actually, it was quite difficult.

Data transfer

In order to draw the OTCAMP data obtained from state calculation on the GPU, it is necessary to transfer the data, but since the size of one OTCAMP is equivalent to a huge texture of 2048×2048, it would be too slow to transfer the data every frame. Therefore, just like saving the entire DAG data as an image, the entire graph data is pre-transferred to the GPU as a texture. This can be done in one time at startup.

Then, in order to draw the OTCAMP of a specific state, it is enough to send the corresponding node number of the graph, and the shader will expand it internally while looking at the entire DAG data and determine the state of the cell.

Furthermore, the maximum number of layers to be drawn is at most 2 layers[8], and the time within the cycle matches in the same layer. Therefore, it is sufficient to transfer the node numbers of 2048 OTCAMPs in total. Even if it is transferred every frame, it is not a significant burden.

Two-Stage drawing

In order to determine the state of the OTCAMP, it is necessary to know the state of the surrounding cells, so if you try to draw two layers at once, the drawing load of the second layer will be more than 9 times the first layer.

Therefore, an intermediate texture is prepared, and the result of the first layer drawing is stored in the texture to reduce the drawing load of the second layer.

Cell density

Since we are using OTCAMP, we want to assign an on pixel if it is on and an off pixel if it is off when the OTCAMP is zoomed out. To do this, I used the number of on cells recorded at each node in the quadtree.

I calculated a real number that becomes 1 when the cell in the OTCAMP is in the on state and 0 when it is in the off state, and used it in color calculation.

Reducing Moiré patterns

When colors of pixels are determined using only ON/OFF binary values, strong Moiré patterns occur because scaling can be done in non-integer multiples.

Ideally, the number of ON cells present in the area occupied by each pixel should be counted to determine the color. However, because the data structure is in a quadtree, there is a significant cost to integration at intermediate positions. Therefore, the colors are gradually changed in response to density changes to reduce Moiré patterns, similar to combining Nearest Neighbor filtering with Mipmapping. The result is reasonably good.

Adding colors

I wanted to make it rainbow-colored.

As you may notice upon closer inspection, the color structure is recursive as well. A large OTCAMP is assigned a distribution of rainbow colors, and as it is magnified, smaller OTCAMPs are each painted with rainbow colors that change gradually.


Overall pattern of colors


Magnified bottom right corner


Individual OTCAMPs become visible


Rainbow colors are assigned to each one again

Conclusion

There are many other small tricks and adjustments that I couldn’t write about here.

  • Be careful that the timing of OTCAMP’s period is not aligned with the switching time of OTCAMP’s appearance.
  • Make the frame less visible when zooming out on an off OTCAMP.
  • Slightly shift the center when continuing to zoom in.
  • Non-recursively modify functions to prevent Stack Overflow even when scaling up to tens of thousands of layers.

When making something, I usually start with something that is known to be achievable from the outset. However, this time, it was a rare pattern that I realized was completely computable in the middle of the process. I think that the effort paid off and I was able to create a content that could be experienced for the first time in the world.

I consulted with @phi16_ during the development and would like to thank him for his help.

The source code is available in this repository.


  1. The shape cannot be changed, so it is limited to grid-like shapes. ↩︎
  2. A type of rule in which the next state of oneself is determined only by its own state and the number of living cells in its neighborhood. ↩︎
  3. The actual bounding box is slightly larger because there is a communication part with neighboring cells. ↩︎
  4. OTCA Metapixel is designed to be able to be calculated quickly using the Hashlife algorithm. ↩︎
  5. At this time, it is necessary to ensure ergodicity when transitioning from parent to parent. Otherwise, even if diverse candidates are selected, the diversity of the scenery may be lost by getting caught in a short cycle. ↩︎
  6. Since the states of adjacent cells are necessary to identify the state of OTCAMP, the parent expansion is already necessary at this point. ↩︎
  7. It refers to the state of the 9 cells in the neighborhood, including itself, and its previous state. ↩︎
  8. For the first layer, the data of up to four maximum OTCAMPs are transferred, and for the second layer, OTCAMPs of 2×2 or more resolutions are drawn, so when the display resolution exceeds 16000×16000, drawing may not work well. ↩︎

このエントリーをはてなブックマークに追加