Arbitrary data are represented in TON Blockchain by trees of cells. Such a tree of cells is transformed into a DAG of cells by identifying cells in the tree that have the same hash. After that, each of the references of each cell might be replaced by the 32-byte representation hash of the cell referred to. Thus a bag of cells (BoC) is obtained.
🔵 Roots: [0x1111]

0x1111
  Refs: [0x2222, 0x3333]
0x2222
  Refs: [0x4444]
0x3333
  Refs: [0x4444]
0x4444
  Refs: []
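As a rough illustration of how identical subtrees collapse into shared cells, here is a minimal TypeScript sketch. The TreeCell and DagCell types and the structuralKey helper are hypothetical names introduced for this example, and the string key is only a stand-in for the real 32-byte representation hash.

```typescript
// A simplified cell: hex data plus ordered references (hypothetical type).
interface TreeCell {
  data: string;
  refs: TreeCell[];
}

// A cell of the resulting DAG: references are positions in the output array.
interface DagCell {
  data: string;
  refs: number[];
}

// Stand-in for the representation hash (an assumption of this sketch):
// a string built from the cell's data and the keys of its children.
// Equal subtrees produce equal keys.
function structuralKey(cell: TreeCell): string {
  return cell.data + "(" + cell.refs.map(structuralKey).join(",") + ")";
}

// Collapse a tree into a DAG by keeping one copy of every distinct cell.
function treeToDag(root: TreeCell): DagCell[] {
  const seen: { [key: string]: number } = {};
  const cells: DagCell[] = [];

  function visit(cell: TreeCell): number {
    const key = structuralKey(cell);
    const known = seen[key];
    if (known !== undefined) {
      return known; // identical cell already stored, reuse it
    }
    const entry: DagCell = { data: cell.data, refs: [] };
    seen[key] = cells.length;
    cells.push(entry);
    entry.refs = cell.refs.map(visit);
    return seen[key] as number;
  }

  visit(root);
  return cells;
}

// The tree behind the diagram above: cell 4444 occurs twice in the tree.
const leaf = (): TreeCell => ({ data: "4444", refs: [] });
const tree: TreeCell = {
  data: "1111",
  refs: [
    { data: "2222", refs: [leaf()] },
    { data: "3333", refs: [leaf()] },
  ],
};

const dag = treeToDag(tree);
console.log(dag.length); // 4 distinct cells instead of 5 tree nodes
```

A real implementation would key the lookup on the representation hash instead of rebuilding a string for every subtree.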
In general, a BoC can be obtained from several trees of cells, thus forming a forest. By convention, the roots of the original trees of cells are marked elements of the resulting bag of cells, so that anybody receiving this bag of cells and knowing the marked elements can reconstruct the original forest. However, this BoC needs to be serialized into a file, suitable for disk storage or network transfer.
🔵 Roots: [0x1111, 0x2222]

0x1111
  Refs: [0x3333]
0x2222
  Refs: [0x3333]
0x3333
  Refs: []
There may be many different ways to serialize such a data structure, each of which has its own goals and is convenient for specific cases. This page provides a general serialization algorithm and specification of the corresponding TL-B schemes, followed by the example and specific implementation used in the TON Blockchain.
Even though the syntax looks very much like TL-B, it cannot be used in most of the TL-B tooling. Unlike in real TL-B, these schemas serialize to a bitstring with no 1023 bit length limit, and without any refs.

General scheme

Internal references, absent cells, and complete BoCs

For an arbitrary cell c in a given BoC, references to it can be either:
  • internal if the cell corresponding to the reference is also represented in the BoC,
  • external if it is not. Such a cell c is called absent from this BoC.
A BoC is called complete if it does not contain any external references. Most real-world BoCs are complete.
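The definitions above can be checked mechanically. The following sketch assumes a hypothetical in-memory model in which a bag maps each stored cell's hash to the hashes it references; the Bag type and both function names are illustrative only.

```typescript
// Hypothetical model: each stored cell's hash (hex string) maps to the hashes it references.
type Bag = { [hash: string]: string[] };

// Collect every referenced hash that has no corresponding cell in the bag.
function absentCells(bag: Bag): string[] {
  const absent: string[] = [];
  for (const hash of Object.keys(bag)) {
    for (const ref of bag[hash] || []) {
      const isInternal = Object.prototype.hasOwnProperty.call(bag, ref);
      if (!isInternal && absent.indexOf(ref) === -1) {
        absent.push(ref); // external reference
      }
    }
  }
  return absent;
}

// A BoC is complete when it contains no external references.
function isComplete(bag: Bag): boolean {
  return absentCells(bag).length === 0;
}

const completeBag: Bag = {
  "0x1111": ["0x2222", "0x3333"],
  "0x2222": ["0x4444"],
  "0x3333": ["0x4444"],
  "0x4444": [],
};

// The same bag with cell 0x4444 left out: references to it become external.
const prunedBag: Bag = {
  "0x1111": ["0x2222", "0x3333"],
  "0x2222": ["0x4444"],
  "0x3333": ["0x4444"],
};

console.log(isComplete(completeBag)); // true
console.log(absentCells(prunedBag)); // [ '0x4444' ]
```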

Assigning indices to the cells from a bag of cells

The assignment of indices of its cells plays an important role. Let c1, ..., cn be the n distinct cells belonging to a bag of cells B. The most used options are:
  • Order cells by their representation hash. Thus Hash(ci) < Hash(cj) whenever i < j.
  • Topological order.
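Both orderings can be sketched in a few lines. The example below again assumes a hypothetical map from each cell's hash to the hashes it references; CellGraph, orderByHash, and topologicalOrder are names made up for illustration. The topological variant only guarantees that every cell precedes the cells it references; it does not claim to reproduce the exact order chosen by any particular implementation.

```typescript
// Hypothetical model: each cell's hash (hex string) maps to the hashes it references.
type CellGraph = { [hash: string]: string[] };

// Option 1: order by representation hash. For equal-length lowercase hex
// strings, lexicographic order coincides with numeric order.
function orderByHash(graph: CellGraph): string[] {
  return Object.keys(graph).sort();
}

// Option 2: topological order via depth-first search. Reversing the
// post-order places every cell before all cells it references.
function topologicalOrder(graph: CellGraph, roots: string[]): string[] {
  const visited: { [hash: string]: boolean } = {};
  const postOrder: string[] = [];

  function visit(hash: string): void {
    if (visited[hash]) {
      return;
    }
    visited[hash] = true;
    for (const ref of graph[hash] || []) {
      visit(ref);
    }
    postOrder.push(hash); // emitted only after everything it references
  }

  for (const root of roots) {
    visit(root);
  }
  return postOrder.reverse();
}

const cellGraph: CellGraph = {
  "0x1111": ["0x2222", "0x3333"],
  "0x2222": ["0x4444"],
  "0x3333": ["0x4444"],
  "0x4444": [],
};

const order = topologicalOrder(cellGraph, ["0x1111"]);
console.log(order); // [ '0x1111', '0x3333', '0x2222', '0x4444' ]
console.log(orderByHash(cellGraph)); // [ '0x1111', '0x2222', '0x3333', '0x4444' ]
```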

Outline of serialization process

This and the following paragraphs provide a textual description of the BoC serialization process and the specifications of the TL-B schemas associated with it. The specific implementation of the serialization and TL-B schemes is left to the choice of developers. For a specific example of a TL-B schema and pseudocode of the related cell serialization, see the section A TL-B scheme.
The serialization process of a BoC B consisting of n cells can be outlined as follows.
  • List the cells from B in a chosen order: c1, ..., cn (with c1, ..., ck as root cells, if B is a forest).
  • Choose an integer number s, such that n ≤ 2^s. Represent each cell ci by an integral number of bytes as in standard representation cell algorithm, but:
    • d1 = r + 8s + 16h + 32l, where s in this formula is the exotic-cell flag of the standard representation (not the index width s chosen above), and h = 1 if the cell’s hashes are explicitly included into the serialization; otherwise, h = 0 (when r = 7, h must be 1);
    • if h = 1, after bytes d1 and d2 the serialization is continued by l + 1 32-byte higher hashes of c;
    • an unsigned big-endian s-bit integer j is used instead of the hash Hash(cj) to represent internal references to cell cj.
  • Concatenate the representations of cells ci thus obtained in the increasing order of i.
  • Optionally, an index can be constructed that consists of n + 1 t-bit integer entries L0, ..., Ln, where Li is the total length (in bytes) of the representations of cells cj with j ≤ i (so L0 = 0), and integer t ≥ 0 is chosen so that Ln ≤ 2^t. If the index is included, any cell ci in the serialized bag of cells may be easily accessed by its index i without deserializing all other cells, or even without loading the entire serialized bag of cells in memory.
  • The serialization of the bag of cells now consists of a magic number indicating the precise format of the serialization, followed by integers s ≥ 0, t ≥ 0, n ≤ 2^s, an optional index consisting of ⌈(n + 1)t / 8⌉ bytes, and Ln bytes with the cell representations.
  • An optional CRC32C may be appended to the serialization for integrity verification purposes.
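To make the role of the index concrete, the sketch below computes cumulative end offsets from the byte lengths of the cell representations and uses them to locate one cell without touching the others. It stores only the n end offsets of the cells, as the index field of the serialized_boc scheme on this page does. The function names and the sample lengths are assumptions made for illustration.

```typescript
// Smallest number of whole bytes able to hold `value` (at least 1).
// Used here to pick s / 8 for cell indices and t / 8 for offsets.
function bytesNeeded(value: number): number {
  let bytes = 1;
  while (value >= Math.pow(256, bytes)) {
    bytes++;
  }
  return bytes;
}

// Cumulative end offsets of the cells, computed from the byte lengths
// of the individual cell representations.
function buildIndex(lengths: number[]): number[] {
  const offsets: number[] = [];
  let total = 0;
  for (const length of lengths) {
    total += length;
    offsets.push(total);
  }
  return offsets;
}

// Byte range [start, end) of the cell at zero-based position i
// inside the concatenated cell data.
function cellRange(offsets: number[], i: number): [number, number] {
  const end = offsets[i];
  if (end === undefined) {
    throw new RangeError("no cell at position " + i);
  }
  const start = i > 0 ? offsets[i - 1] || 0 : 0;
  return [start, end];
}

// Hypothetical lengths of three serialized cells, in bytes.
const cellLengths = [5, 4, 5];
const cellOffsets = buildIndex(cellLengths);

console.log(cellOffsets); // [ 5, 9, 14 ]
console.log(bytesNeeded(cellLengths.length)); // 1 byte per cell index
console.log(bytesNeeded(14)); // 1 byte per offset
console.log(cellRange(cellOffsets, 1)); // [ 5, 9 ]
```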

A classification of serialization schemes for bags of cells

Each TL-B scheme for a bag of cells must specify the following parameters.
  • The 4-byte magic number (name of TL-B constructor) prepended to the serialization.
  • The number of bits s used to represent cell indices. Usually s is a multiple of eight.
  • The number of bits t used to represent offsets of cell serializations. Usually t is also a multiple of eight.
  • A flag indicating whether an index with offsets L1, ..., Ln of cell serializations is present. This flag may be combined with t by setting t = 0 when the index is absent.
  • A flag indicating whether the CRC32C of the whole serialization is appended to it for integrity verification purposes.
  • The total number of cells n present in the serialization.
  • The number of root cells k ≤ n present in the serialization. The root cells themselves are c1, ..., ck. All other cells present in the bag of cells are expected to be reachable by chains of references starting from the root cells.
  • The number of absent cells l ≤ n − k, which represent cells that are absent from this bag of cells, but are referred to from it. The absent cells themselves are represented by c_{n−l+1}, ..., cn, and only these cells may (and also must) have r = 7. Complete bags of cells have l = 0.
  • The total length in bytes Ln of the serialization of all cells. If the index is present, Ln might not be stored explicitly since it can be recovered as the last entry of the index.
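For readers unfamiliar with the checksum mentioned in the list above, CRC32C is the CRC-32 variant with the Castagnoli polynomial. The following is a minimal bitwise sketch of it, not the routine used by any TON library; it trades speed for brevity, and a production implementation would typically use a lookup table or hardware instructions.

```typescript
// Bitwise CRC32C (Castagnoli): reflected polynomial 0x82F63B78,
// initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF.
function crc32c(bytes: number[]): number {
  let crc = 0xffffffff;
  for (const byte of bytes) {
    crc ^= byte & 0xff;
    for (let bit = 0; bit < 8; bit++) {
      // Shift right by one bit; XOR in the polynomial if a 1 fell out.
      crc = (crc & 1) !== 0 ? (crc >>> 1) ^ 0x82f63b78 : crc >>> 1;
    }
  }
  return (crc ^ 0xffffffff) >>> 0; // force an unsigned 32-bit result
}

// Standard check input for CRC algorithms: the ASCII string "123456789".
const checkInput = "123456789".split("").map((ch) => ch.charCodeAt(0));
console.log(crc32c(checkInput).toString(16)); // e3069283
```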

A TL-B scheme

Only one serialization scheme of BoCs is used in TON Blockchain (there are also two outdated BoC serialization schemes in the file):
serialized_boc#b5ee9c72 has_idx:(## 1) has_crc32c:(## 1) 
  has_cache_bits:(## 1) flags:(## 2) { flags = 0 }
  size:(## 3) { size <= 4 }
  off_bytes:(## 8) { off_bytes <= 8 } 
  cells:(##(size * 8)) 
  roots:(##(size * 8)) { roots >= 1 }
  absent:(##(size * 8)) { roots + absent <= cells }
  tot_cells_size:(##(off_bytes * 8))
  root_list:(roots * ##(size * 8))
  index:has_idx?(cells * ##(off_bytes * 8))
  cell_data:(tot_cells_size * [ uint8 ])
  crc32c:has_crc32c?uint32
  = BagOfCells;
Field cells is n, roots is k, absent is l, and tot_cells_size is Ln (the total size of the serialization of all cells in bytes). If an index is present, parameters s/8 and t/8 are serialized separately as size and off_bytes, respectively, and the flag has_idx is set. The index itself is contained in index, present only if has_idx is set. The field root_list contains the (zero-based) indices of the root nodes of the bag of cells. Finally, cell_data is a sequence of bytes obtained by concatenating the cell representations:
// d1
uint3 level_mask;
uint1 has_hashes;
uint1 is_exotic;
uint3 ref_count;

// d2
if ref_count < 7 {
    uint7 byte_size;
    uint1 has_padding;
}
        
if has_hashes {
    uint256 hashes[GetLevel(level_mask) + 1];
    uint16 depths[GetLevel(level_mask) + 1];
}
        
if ref_count < 7 {
    if byte_size + has_padding != 0 {
        uint8 data_bits[byte_size + has_padding];
    }
    ref refs[ref_count];
}
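The header layout of the serialized_boc scheme can be read back with a small parser. The sketch below is an illustration under stated assumptions rather than a reference implementation: BocHeader, hexToBytes, and parseBocHeader are hypothetical names, the CRC32C is not verified, the cells themselves are not decoded, and offsets are held in JavaScript numbers, so values above 2^53 are not supported.

```typescript
interface BocHeader {
  hasIdx: boolean;
  hasCrc32c: boolean;
  hasCacheBits: boolean;
  size: number; // bytes per cell index (s / 8)
  offBytes: number; // bytes per offset (t / 8)
  cells: number;
  roots: number;
  absent: number;
  totCellsSize: number;
  rootList: number[];
  index: number[];
  cellDataOffset: number; // position of cell_data in the input
}

function hexToBytes(hex: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < hex.length; i += 2) {
    out.push(parseInt(hex.slice(i, i + 2), 16));
  }
  return out;
}

function parseBocHeader(bytes: number[]): BocHeader {
  let pos = 0;

  // Read an unsigned big-endian integer of `len` bytes and advance.
  function readUint(len: number): number {
    let value = 0;
    for (let i = 0; i < len; i++) {
      const byte = bytes[pos++];
      if (byte === undefined) throw new Error("unexpected end of input");
      value = value * 256 + byte;
    }
    return value;
  }

  if (readUint(4) !== 0xb5ee9c72) throw new Error("unknown magic number");

  // One byte holds has_idx, has_crc32c, has_cache_bits, flags (2 bits), size (3 bits).
  const flagsByte = readUint(1);
  if ((flagsByte & 0x18) !== 0) throw new Error("flags must be 0");
  const size = flagsByte & 0x07;
  if (size > 4) throw new Error("size must be <= 4");
  const offBytes = readUint(1);
  if (offBytes > 8) throw new Error("off_bytes must be <= 8");

  const cells = readUint(size);
  const roots = readUint(size);
  const absent = readUint(size);
  if (roots < 1 || roots + absent > cells) throw new Error("inconsistent cell counts");
  const totCellsSize = readUint(offBytes);

  const rootList: number[] = [];
  for (let i = 0; i < roots; i++) rootList.push(readUint(size));

  const hasIdx = (flagsByte & 0x80) !== 0;
  const index: number[] = [];
  if (hasIdx) {
    for (let i = 0; i < cells; i++) index.push(readUint(offBytes));
  }

  return {
    hasIdx,
    hasCrc32c: (flagsByte & 0x40) !== 0,
    hasCacheBits: (flagsByte & 0x20) !== 0,
    size, offBytes, cells, roots, absent, totCellsSize,
    rootList, index, cellDataOffset: pos,
  };
}

// A hand-built BoC holding one empty cell: no index, no CRC32C,
// and cell data 0x0000 (d1 = 0, d2 = 0).
const emptyCellBoc = hexToBytes("b5ee9c72010101010002000000");
const header = parseBocHeader(emptyCellBoc);
console.log(header.cells, header.roots, header.totCellsSize, header.cellDataOffset);
// 1 1 2 11
```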

An example of a manual serialization

Let’s consider the following example of a tree of cells:
01
  0aaaaa
  fe
    0aaaaa
So, there is a 2-bit root cell that references two other cells:
  • The first is a 24-bit cell.
  • The second is an 8-bit cell that itself references a 24-bit cell.
After identifying the unique cells, we have the following:
01
0aaaaa
fe
Next, the unique cells are arranged in a topological order:
01     -> index 0 (root cell)
fe     -> index 1
0aaaaa -> index 2
Now, let’s calculate the descriptor bytes d1 and d2 for each of the three unique cells. So, we obtain:
01     -> 0201
fe     -> 0102
0aaaaa -> 0006
Then the data bits are serialized as ⌈b/8⌉ bytes. Remember, if b is not a multiple of eight, a binary 1 and up to six binary 0s are appended to the data bits. After that, the data is split into ⌈b/8⌉ 8-bit groups.
01     -> 01100000 = 0x60
fe     -> do not change (full groups)
0aaaaa -> do not change (full groups)
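The descriptor and padding rules used in the last two steps can be sketched as follows. The function names are hypothetical, and the sketch assumes ordinary level-0 cells whose hashes are not stored, so d1 reduces to the number of references.

```typescript
// Descriptor bytes for an ordinary level-0 cell without stored hashes.
// d1 = r + 8 * exotic + 16 * h + 32 * l, where only r is non-zero here.
// d2 = floor(b / 8) + ceil(b / 8) for a cell with b data bits.
function descriptorBytes(refCount: number, bitLength: number): [number, number] {
  const d1 = refCount;
  const d2 = Math.floor(bitLength / 8) + Math.ceil(bitLength / 8);
  return [d1, d2];
}

// Pad a string of "0"/"1" characters to whole bytes: if the length is not
// a multiple of eight, append a single 1 and then 0s up to the next byte.
function padBits(bits: string): number[] {
  let padded = bits;
  if (padded.length % 8 !== 0) {
    padded += "1";
    while (padded.length % 8 !== 0) {
      padded += "0";
    }
  }
  const out: number[] = [];
  for (let i = 0; i < padded.length; i += 8) {
    out.push(parseInt(padded.slice(i, i + 8), 2));
  }
  return out;
}

console.log(descriptorBytes(2, 2)); // [ 2, 1 ]  -> 0201 for the root cell
console.log(descriptorBytes(1, 8)); // [ 1, 2 ]  -> 0102 for the cell fe
console.log(descriptorBytes(0, 24)); // [ 0, 6 ] -> 0006 for the cell 0aaaaa
console.log(padBits("01")); // [ 96 ], that is 0x60
console.log(padBits("11111110")); // [ 254 ], that is 0xfe, unchanged
```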
None of these cells stores its hashes (has_hashes = 0 in d1), so no hashes or depths are written. Now specify which cells each one references:
0: 01     -> 0201: refers to the cells with indexes 2 and 1
1: fe     -> 02: refers to the cell with index 2
2: 0aaaaa -> no refs
For each cell we have its hexadecimal representation (descriptor bytes, data, reference indexes):
01     -> 0201600201
fe     -> 0102fe02
0aaaaa -> 00060aaaaa
Finally, we concatenate all parts into a single hexadecimal array: 0x02016002010102fe0200060aaaaa. Now that we’ve serialized our cells into a flat 14-byte array, it’s time to pack them into a complete BoC format.
0xb5ee9c72                     -> TL-B id of the BoC structure
0b1                            -> has indexes
0b0                            -> does not have CRC32C
0b0                            -> does not have cache bits
0b00                           -> flags are 0
0b001                          -> the number of bytes needed to store the number of cells is 1
0b00000001                     -> the number of bytes used to represent the offset of a serialization is 1
0b00000011                     -> the number of cells is 3
0b00000001                     -> the number of roots is 1
0b00000000                     -> the number of absent cells is 0
0b00001110                     -> tot_cells_size is 14 bytes
0b00000000                     -> the root list; we have one root with number 0 after the topological sort
0b000001010000100100001110     -> the three 8-bit index entries (5, 9, 14): the end offsets of the cells in topological order
0x02016002010102fe0200060aaaaa -> the cells data
_                              -> CRC32C is not serialized
By combining everything into a single bit string, we get the result of serialization: 0xb5ee9c7281010301000e0005090e02016002010102fe0200060aaaaa.

SDK from @ton/core

The @ton/core SDK implements serialization and parsing of BoCs according to the TL-B scheme above. Only serialization of BoCs with one root and no absent cells is supported. There are two main functions:
  • serializeBoc for serialization. It takes two parameters: the root cell and an options object with two boolean flags, idx and crc32, which indicate whether the index and the CRC32C are included in the serialization. The output is a Buffer with the serialized BoC.
  • deserializeBoc for parsing. It takes one parameter: src, a Buffer that contains a serialized BoC. The output is the list of root cells of the given BoC.
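import { beginCell, serializeBoc, deserializeBoc } from "@ton/core";

// A leaf cell holding one 16-bit unsigned integer.
const innerCell = beginCell().storeUint(456, 16).endCell();

// The root cell: 64 zero bits plus a reference to innerCell.
const rootCell = beginCell().storeUint(0, 64).storeRef(innerCell).endCell();

// Serialize without the index and without the CRC32C.
const serialized_boc = serializeBoc(rootCell, { idx: false, crc32: false });

// Serialize the same cell tree with both the index and the CRC32C included.
const serialized_boc_with_indexes_and_crc32 = serializeBoc(rootCell, {
    idx: true,
    crc32: true,
});

// Parse both buffers back; each call returns the list of root cells.
const roots_of_first_boc = deserializeBoc(serialized_boc);

const roots_of_second_boc = deserializeBoc(
    serialized_boc_with_indexes_and_crc32,
);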