Arbitrary data are represented in the TON Blockchain by trees of cells. Such a tree of cells is transformed into a DAG of cells by identifying identical cells in the tree. After that, each of the references of each cell might be replaced by the 32-byte representation hash of the cell referred to. Thus a bag of cells (BoC) is obtained. In general, a BoC can be obtained from several trees of cells, thus forming a forest. By convention, the root of the original tree of cells is a marked element of the resulting bag of cells, so that anybody receiving this bag of cells and knowing the marked element can reconstruct the original DAG of cells, hence also the original tree of cells. However, this BoC needs to be serialized into a file, suitable for disk storage or network transfer. There may be many different ways to serialize such a data structure, each of which has its own goals and is convenient for specific cases. This page provides a general serialization algorithm and specification of corresponding TL-B schemes, followed by the specific implementation used in the TON Blockchain.
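
The deduplication step described above can be illustrated with a minimal sketch. It assumes a hypothetical `TreeNode` type and uses a canonical string key as a stand-in for the 32-byte representation hash; a real implementation would hash the standard cell representation instead.

```typescript
// Hypothetical in-memory tree node: data bits as a string of "0"/"1",
// plus child references. Not part of any TON library.
interface TreeNode {
  bits: string;
  refs: TreeNode[];
}

// Stand-in for the representation hash (an assumption of this sketch):
// a canonical key built from the node's own bits and its children's keys.
// Identical subtrees produce identical keys.
function structuralKey(node: TreeNode, memo: Map<TreeNode, string>): string {
  const cached = memo.get(node);
  if (cached !== undefined) return cached;
  const childKeys = node.refs.map((r) => structuralKey(r, memo)).join(",");
  const key = `${node.bits}(${childKeys})`;
  memo.set(node, key);
  return key;
}

// Collect the distinct cells of a tree: one entry per distinct key.
function distinctCells(root: TreeNode): Map<string, TreeNode> {
  const memo = new Map<TreeNode, string>();
  const bag = new Map<string, TreeNode>();
  const visit = (node: TreeNode): void => {
    const key = structuralKey(node, memo);
    if (bag.has(key)) return; // an identical cell is already in the bag
    bag.set(key, node);
    node.refs.forEach(visit);
  };
  visit(root);
  return bag;
}

// A tree in which the same leaf content appears twice as separate objects.
const leafA: TreeNode = { bits: "1010", refs: [] };
const leafB: TreeNode = { bits: "1010", refs: [] };
const treeRoot: TreeNode = { bits: "0", refs: [leafA, leafB] };
const bagSize = distinctCells(treeRoot).size; // 2: the root and one shared leaf
```

The tree has three nodes, but the two leaves are identical, so the resulting bag holds only two cells.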

General scheme

Internal references, absent cells, and complete BoCs

Consider an arbitrary cell c in a given BoC. A reference of c is called internal if the cell it refers to is also represented in the BoC. Otherwise, the reference is called external, and the corresponding cell is called absent from that BoC. In turn, a BoC is called complete if it does not contain any external references. Although most real-world cases deal only with complete BoCs, absent cells are serialized differently from included cells. Therefore, a serializer must be able to tell internal references from external ones.
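
As a rough illustration of these definitions, the sketch below classifies references and checks completeness. The `BagView` type and the short hex strings standing in for hashes are assumptions made for this example only.

```typescript
// Hypothetical view of a bag: each cell is identified by its hash (here a
// short hex string) and lists the hashes of the cells it references.
type BagView = Map<string, string[]>;

// Returns the hashes that are referenced from the bag but not represented
// in it, i.e. the targets of external references (the absent cells).
function externalReferences(bag: BagView): Set<string> {
  const external = new Set<string>();
  bag.forEach((refs) => {
    refs.forEach((ref) => {
      if (!bag.has(ref)) external.add(ref);
    });
  });
  return external;
}

// A bag is complete when it has no external references.
function isComplete(bag: BagView): boolean {
  return externalReferences(bag).size === 0;
}

const completeBag: BagView = new Map<string, string[]>([
  ["aa", ["bb"]],
  ["bb", []],
]);

// Cell "cc" is referenced by "aa" but is not part of the bag.
const prunedBag: BagView = new Map<string, string[]>([
  ["aa", ["bb", "cc"]],
  ["bb", []],
]);
```

Here `isComplete(completeBag)` holds, while `prunedBag` has one external reference, to the absent cell `cc`.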

Assigning indices to the cells from a bag of cells

In the process of BoC serialization, the assignment of indices to its cells plays an important role. Let c1, ..., cn be the n distinct cells belonging to a bag of cells B. The most commonly used orderings are:
  • Order cells by their representation hash. Thus Hash(ci) < Hash(cj) whenever i < j.
  • Topological order.
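
A topological order can be sketched with a depth-first traversal. The example below is one possible variant, in which every cell receives a smaller index than any cell it references; the `DagNode` type and its `id` labels are hypothetical and serve only to make the result readable.

```typescript
// Hypothetical DAG node; `id` is a label used only for readability.
interface DagNode {
  id: string;
  refs: DagNode[];
}

// Reverse post-order depth-first traversal: a cell is emitted only after all
// cells reachable from it, so reversing the list puts every cell before the
// cells it references.
function topologicalOrder(rootCell: DagNode): DagNode[] {
  const visited = new Set<DagNode>();
  const postOrder: DagNode[] = [];
  const visit = (node: DagNode): void => {
    if (visited.has(node)) return;
    visited.add(node);
    node.refs.forEach(visit);
    postOrder.push(node);
  };
  visit(rootCell);
  return postOrder.reverse();
}

// A small DAG in which two cells share the same child.
const sharedCell: DagNode = { id: "shared", refs: [] };
const leftCell: DagNode = { id: "left", refs: [sharedCell] };
const rightCell: DagNode = { id: "right", refs: [sharedCell] };
const topCell: DagNode = { id: "top", refs: [leftCell, rightCell] };

const orderedIds = topologicalOrder(topCell).map((c) => c.id);
// ["top", "right", "left", "shared"]
```

With this ordering the root comes first and the shared cell comes last, so every reference points to a cell with a larger index.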

Outline of serialization process

The serialization process of a BoC B consisting of n cells can be outlined as follows.
  • List the cells from B in a chosen order: c1, ..., cn (with c1, ..., ck as root cells).
  • Choose an integer number s, such that n ≤ 2^s. Represent each cell ci by an integral number of bytes as in the standard cell representation algorithm, but using an unsigned big-endian s-bit integer j instead of the hash Hash(cj) to represent internal references to cell cj. More precisely, each individual cell c is serialized as follows, provided s is a multiple of eight.
    • Two descriptor bytes d1 and d2 are computed by setting d1 = r + 8x + 16h + 32l and $d_2 = \lfloor \frac{b}{8} \rfloor + \lceil \frac{b}{8} \rceil$ (for absent cells, only d1 is present and always equals 7 + 16 + 32l), where:
      • 0 ≤ r ≤ 4 is the number of cell references present in cell c; if c is absent from the bag of cells being serialized and is represented by its hashes only, then r is set to 7;
      • 0 ≤ b ≤ 1023 is the number of data bits in cell c;
      • 0 ≤ l ≤ 3 is the level of cell c;
      • x = 1 for exotic cells and x = 0 for ordinary cells;
      • h = 1 if the cell’s hashes are explicitly included into the serialization; otherwise, h = 0 (when r = 7, h must be 1).
    • Two bytes d1 and d2 (if r < 7) or one byte d1 (if r = 7) begin the serialization of c.
    • If h = 1, the serialization is continued by l + 1 32-byte higher hashes of c: $Hash_{1}(c), \ldots, Hash_{l+1}(c) = RepHash(c)$.
    • After that, $\lceil\frac{b}{8}\rceil$ data bytes are serialized, by splitting b data bits into 8-bit groups and interpreting each group as a big-endian integer in the range 0 ... 255. If b is not divisible by 8, then the data bits are first augmented by one binary 1 and up to six binary 0s, so as to make the number of data bits divisible by eight.
    • Finally, r cell references to cells $c_{j_1}, \ldots, c_{j_r}$ are encoded by means of r s-bit big-endian integers $j_1, \ldots, j_r$.
  • Concatenate the representations of cells ci thus obtained in the increasing order of i.
  • Optionally, an index can be constructed that consists of n + 1 t-bit integer entries L1, ..., Ln, where Li is the total length (in bytes) of the representations of cells cj with j ≤ i, and integer t ≥ 0 is chosen so that Ln ≤ 2^t. If the index is included, any cell ci in the serialized bag of cells may be easily accessed by its index i without deserializing all other cells, or even without loading the entire serialized bag of cells into memory.
  • The serialization of the bag of cells now consists of a magic number indicating the precise format of the serialization, followed by integers s ≥ 0, t ≥ 0, n ≤ 2^s, an optional index consisting of $\lceil\frac{(n+1) \cdot t}{8}\rceil$ bytes, and Ln bytes with the cell representations.
  • An optional CRC32C may be appended to the serialization for integrity verification purposes.
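
The per-cell part of this outline can be sketched as follows. This is a minimal illustration, not the reference implementation: it covers only ordinary level-0 cells that are present in the bag and whose hashes are not stored (x = 0, h = 0, l = 0), and the `PlainCell` type and all function names are hypothetical.

```typescript
// Hypothetical description of an ordinary, level-0 cell present in the bag.
interface PlainCell {
  bits: string;         // data bits as a string of "0"/"1", at most 1023 characters
  refIndices: number[]; // indices j of the referenced cells, at most 4
}

// Descriptor bytes: d1 = r + 8x + 16h + 32l, d2 = floor(b/8) + ceil(b/8).
function descriptors(r: number, b: number, x = 0, h = 0, l = 0): [number, number] {
  const d1 = r + 8 * x + 16 * h + 32 * l;
  const d2 = Math.floor(b / 8) + Math.ceil(b / 8);
  return [d1, d2];
}

// If the bit count is not a multiple of 8, append a single 1 and then 0s up
// to the next byte boundary; then pack each 8-bit group into one byte.
function packBits(bits: string): number[] {
  let padded = bits;
  if (padded.length % 8 !== 0) {
    padded += "1";
    while (padded.length % 8 !== 0) padded += "0";
  }
  const out: number[] = [];
  for (let i = 0; i < padded.length; i += 8) {
    out.push(parseInt(padded.slice(i, i + 8), 2));
  }
  return out;
}

// Serialize one cell: descriptors, data bytes, then each reference as an
// s-bit big-endian index (sBytes = s / 8).
function serializePlainCell(cell: PlainCell, sBytes: number): Uint8Array {
  const [d1, d2] = descriptors(cell.refIndices.length, cell.bits.length);
  const out: number[] = [d1, d2, ...packBits(cell.bits)];
  for (const j of cell.refIndices) {
    for (let shift = (sBytes - 1) * 8; shift >= 0; shift -= 8) {
      out.push(Math.floor(j / Math.pow(2, shift)) & 0xff);
    }
  }
  return Uint8Array.from(out);
}

// A cell holding the 16-bit value 456 (0x01C8) and no references.
const innerBytes = serializePlainCell({ bits: "0000000111001000", refIndices: [] }, 1);
// bytes: 0x00 0x04 0x01 0xC8

// A cell holding 64 zero bits and one reference to the cell with index 1.
const rootBytes = serializePlainCell({ bits: "0".repeat(64), refIndices: [1] }, 1);
// bytes: 0x01 0x10, eight 0x00 bytes, then 0x01

// A cell with 3 data bits: "101" is padded to "10110000".
const oddBytes = serializePlainCell({ bits: "101", refIndices: [] }, 1);
// bytes: 0x00 0x01 0xB0
```

Note that d2 is odd exactly when the number of data bits is not a multiple of eight, which tells a parser that the last data byte carries the padding.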

A classification of serialization schemes for bags of cells

Each TL-B scheme for a bag of cells must specify the following parameters.
  • The 4-byte magic number (name of TL-B constructor) prepended to the serialization.
  • The number of bits s used to represent cell indices. Usually s is a multiple of eight.
  • The number of bits t used to represent offsets of cell serializations. Usually t is also a multiple of eight.
  • A flag indicating whether an index with offsets L1, ..., Ln of cell serializations is present. This flag may be combined with t by setting t = 0 when the index is absent.
  • A flag indicating whether the CRC32C of the whole serialization is appended to it for integrity verification purposes.
  • The total number of cells n present in the serialization.
  • The number of root cells k ≤ n present in the serialization. The root cells themselves are $c_1, \ldots, c_k$. All other cells present in the bag of cells are expected to be reachable by chains of references starting from the root cells.
  • The number of absent cells l ≤ n − k, which represent cells that are actually absent from this bag of cells, but are referred to from it. The absent cells themselves are represented by $c_{n-l+1}, \ldots, c_n$, and only these cells may (and also must) have r = 7. Complete bags of cells have l = 0.
  • The total length in bytes Ln of the serialization of all cells. If the index is present, Ln might not be stored explicitly since it can be recovered as the last entry of the index.
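
To make these parameters concrete, the sketch below derives them from the byte lengths of already serialized cells. The `BocParams` type and the function names are hypothetical, and the byte widths are simply the smallest ones able to hold the values, which is an assumption of this example rather than a rule stated by the specification.

```typescript
// Smallest number of bytes able to hold the unsigned integer `value` (at least 1).
function bytesNeeded(value: number): number {
  let bytes = 1;
  while (value >= Math.pow(2, 8 * bytes)) bytes++;
  return bytes;
}

// Hypothetical record of the parameters listed above.
interface BocParams {
  cells: number;        // n
  roots: number;        // k
  absent: number;       // l
  sizeBytes: number;    // s / 8
  offBytes: number;     // t / 8
  totCellsSize: number; // Ln
  offsets: number[];    // L1, ..., Ln: cumulative lengths of the cell representations
}

function deriveParams(cellLengths: number[], roots: number, absent: number): BocParams {
  const cells = cellLengths.length;
  // Constraints from the list above: k <= n and l <= n - k.
  if (roots > cells || absent > cells - roots) {
    throw new Error("expected k <= n and l <= n - k");
  }
  const offsets: number[] = [];
  let total = 0;
  for (const len of cellLengths) {
    total += len;
    offsets.push(total); // Li = total length of cells c1..ci
  }
  return {
    cells,
    roots,
    absent,
    sizeBytes: bytesNeeded(cells),
    offBytes: bytesNeeded(total),
    totCellsSize: total,
    offsets,
  };
}

// Two cells of 11 and 4 bytes, one root, no absent cells.
const exampleParams = deriveParams([11, 4], 1, 0);
// cells = 2, sizeBytes = 1, offBytes = 1, totCellsSize = 15, offsets = [11, 15]
```

The last index entry equals the total length, which is why Ln does not have to be stored separately when the index is present.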

A TL-B scheme

Only one serialization scheme of BoCs is used in TON Blockchain:
serialized_boc#b5ee9c72 has_idx:(## 1) has_crc32c:(## 1) 
  has_cache_bits:(## 1) flags:(## 2) { flags = 0 }
  size:(## 3) { size <= 4 }
  off_bytes:(## 8) { off_bytes <= 8 } 
  cells:(##(size * 8)) 
  roots:(##(size * 8)) { roots >= 1 }
  absent:(##(size * 8)) { roots + absent <= cells }
  tot_cells_size:(##(off_bytes * 8))
  root_list:(roots * ##(size * 8))
  index:has_idx?(cells * ##(off_bytes * 8))
  cell_data:(tot_cells_size * [ uint8 ])
  crc32c:has_crc32c?uint32
  = BagOfCells;
Field cells is n, roots is k, absent is l, and tot_cells_size is Ln (the total size of the serialization of all cells in bytes). If an index is present, parameters s/8 and t/8 are serialized separately as size and off_bytes, respectively, and the flag has_idx is set. The index itself is contained in index, present only if has_idx is set. The field root_list contains the (zero-based) indices of the root nodes of the bag of cells. There are also two outdated BoC serialization schemes in the same file.
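
The fixed part of this scheme can be read with a few lines of code. The sketch below parses only the header fields up to and including root_list and does not decode the index, the cells, or the checksum; `BocHeader` and `parseBocHeader` are hypothetical names, and the sample bytes are a single empty cell assembled by hand from the scheme.

```typescript
// Hypothetical record mirroring the header fields of the scheme above.
interface BocHeader {
  hasIdx: boolean;
  hasCrc32c: boolean;
  hasCacheBits: boolean;
  size: number;         // bytes per cell index (s / 8)
  offBytes: number;     // bytes per offset (t / 8)
  cells: number;        // n
  roots: number;        // k
  absent: number;       // l
  totCellsSize: number; // Ln
  rootList: number[];   // zero-based indices of the root cells
}

function parseBocHeader(data: Uint8Array): BocHeader {
  let pos = 0;
  // Read an unsigned big-endian integer of `n` bytes.
  // Values above 2^53 lose precision in a JavaScript number.
  const readUint = (n: number): number => {
    if (pos + n > data.length) throw new Error("unexpected end of data");
    let value = 0;
    for (let i = 0; i < n; i++) value = value * 256 + data[pos++];
    return value;
  };

  if (readUint(4) !== 0xb5ee9c72) throw new Error("unknown magic number");

  // One byte holds has_idx, has_crc32c, has_cache_bits, flags (2 bits), size (3 bits).
  const flagsByte = readUint(1);
  const hasIdx = (flagsByte & 0x80) !== 0;
  const hasCrc32c = (flagsByte & 0x40) !== 0;
  const hasCacheBits = (flagsByte & 0x20) !== 0;
  if ((flagsByte & 0x18) !== 0) throw new Error("flags must be 0");
  const size = flagsByte & 0x07;
  if (size > 4) throw new Error("size must be <= 4");

  const offBytes = readUint(1);
  if (offBytes > 8) throw new Error("off_bytes must be <= 8");

  const cells = readUint(size);
  const roots = readUint(size);
  const absent = readUint(size);
  if (roots < 1 || roots + absent > cells) throw new Error("invalid roots/absent counts");

  const totCellsSize = readUint(offBytes);
  const rootList: number[] = [];
  for (let i = 0; i < roots; i++) rootList.push(readUint(size));

  return { hasIdx, hasCrc32c, hasCacheBits, size, offBytes, cells, roots, absent, totCellsSize, rootList };
}

// A BoC with one empty cell: magic, flags/size = 0x01, off_bytes = 1,
// cells = 1, roots = 1, absent = 0, tot_cells_size = 2, root_list = [0],
// followed by the cell data (d1 = 0, d2 = 0).
const emptyCellBoc = Uint8Array.from([
  0xb5, 0xee, 0x9c, 0x72, 0x01, 0x01, 0x01, 0x01, 0x00, 0x02, 0x00, 0x00, 0x00,
]);
const emptyHeader = parseBocHeader(emptyCellBoc);
```

For this input the parser reports one cell, one root with index 0, no index, no checksum, and a cell data section of two bytes.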

SDK from @ton/core

The @ton/core SDK serializes and parses BoCs according to the TL-B scheme above. Only serialization of BoCs with one root and no absent cells is supported.
import { beginCell, serializeBoc } from "@ton/core";
// serializeBoc has two arguments:
// root: Cell. A root cell of a given tree of cells
// opt: { idx: boolean, crc32: boolean }. Two flags indicating whether indexes and CRC32C will be included in serialization

const innerCell = beginCell().storeUint(456, 16).endCell();

const rootCell = beginCell().storeUint(0, 64).storeRef(innerCell).endCell();

const serialized_boc = serializeBoc(rootCell, { idx: false, crc32: false });

const serialized_boc_with_indexes_and_crc32 = serializeBoc(rootCell, { idx: true, crc32: true });