Engineering
A three-stage pipeline that produces standardized metrics for 3,000+ tokenized assets, sitting on top of the full onchain data stack we operate in-house.
Token Terminal
Our asset metrics pipeline tracks 3,000+ tokenized assets across 50 blockchains. These assets fall into four categories:
- Stablecoins are tokens pegged to a reference asset like the U.S. dollar, including USDC, USDT, and DAI.
- Tokenized funds are onchain representations of shares in regulated investment vehicles, including BlackRock's BUIDL, Franklin Templeton's BENJI, and Ondo's OUSG.
- Tokenized commodities are tokens backed by physical commodities like gold, including PAXG and XAUT.
- Tokenized stocks are tokens tracking shares of publicly traded companies, issued by firms like Robinhood, Dinari, and Backed.
Each is a smart contract that mints, transfers, and redeems a token tied to dollars, fund shares, commodities, or stocks. All of them run through one pipeline that produces the same standardized metrics for every asset on every chain.
The pipeline has three stages:
- Asset curation decides what counts as an asset.
- Chain-specific transformations compute per-chain activity for every asset in the registry.
- Cross-chain metric aggregation combines the chain tables and applies pricing to produce the final metrics.
Curating a registry of assets
Any address can deploy a token named "USDC" with the symbol "USDC" and 6 decimals. The same is true for PAXG, BUIDL, and every tokenized equity ticker. A pipeline that discovered assets automatically would fill the dataset with scam tokens within hours. Our registry is the source of truth for which address is Circle's USDC on which chain.
The registry also solves cross-chain aggregation. USDC on Ethereum and USDC on Solana are not the same contract, but they are the same asset. A user asking for USDC's circulating market cap wants the sum across every chain, not a per-chain number. Each registry entry maps one human-readable name to the token's addresses on every chain it exists on.
A single entry looks like this:
```yaml
asset_id: usdc
asset_type: stablecoin
reference_asset: usd
issuer: circle
addresses:
  ethereum:
    address: "0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48"
    bridged: false
  solana:
    address: "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v"
    bridged: false
  arbitrum:
    address: "0xaf88d065e77c8cc2239327c5edb3a432268e5831"
    bridged: false
  # 17 more chains
```
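To see how the registry drives cross-chain aggregation, the entry above can be modeled as a plain mapping and used to sum per-chain supply into one asset-level number. This is an illustrative Python sketch, not our production code: the field names mirror the YAML, and the supply figures are made up.

```python
# Hypothetical in-memory form of the registry entry above.
# Field names mirror the YAML; supply numbers are invented for illustration.
REGISTRY = {
    "usdc": {
        "asset_type": "stablecoin",
        "issuer": "circle",
        "addresses": {
            "ethereum": {"address": "0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48", "bridged": False},
            "solana": {"address": "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v", "bridged": False},
            "arbitrum": {"address": "0xaf88d065e77c8cc2239327c5edb3a432268e5831", "bridged": False},
        },
    }
}

# Per-chain supply keyed by (chain, token_address) -- illustrative values only.
SUPPLY = {
    ("ethereum", "0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48"): 25_000_000_000,
    ("solana", "EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v"): 8_000_000_000,
    ("arbitrum", "0xaf88d065e77c8cc2239327c5edb3a432268e5831"): 2_000_000_000,
}

def circulating_supply(asset_id: str) -> int:
    """Sum an asset's supply across every chain listed in its registry entry."""
    entry = REGISTRY[asset_id]
    return sum(
        SUPPLY.get((chain, info["address"]), 0)
        for chain, info in entry["addresses"].items()
    )

print(circulating_supply("usdc"))  # 35000000000
```

The point of the structure: a user asks for "usdc" and gets one number, while the registry resolves which contract that means on each chain.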
Assets live as their own entries, not as properties of an issuer. That matters because the same asset often has more than one party that could plausibly be called its issuer:
- USDC has one clear issuer: Circle.
- BUIDL has two. BlackRock manages the fund and the brand; Securitize runs the tokenization platform that actually issues the token onchain.
Keeping the registry accurate as tokens proliferate takes both automation and review. An agent watches factory contracts across chains, drafts registry entries for every new token it finds, and flags name collisions against the existing registry. A reviewer approves every entry before it reaches production, and CI blocks anything that fails linting or schema validation.
When Dinari deployed close to 300 stock tokens across four chains, the agent matched most against existing assets and drafted the rest as new entries in a single pull request. One reviewer moved the PR through, and the metrics went live on the next run.
Robinhood's launch on Arbitrum was the larger version. Their factory deployed roughly 2,000 ERC-20 proxy tokens. The agent drafted a registry entry for each, the reviewer approved them in a single pull request, and the metrics went live on the next run.
Reading the chain
Chain-specific transformations compute the per-chain daily tables for every asset in the registry: who held the asset at the end of each day, how much moved between addresses, how many tokens were minted, and how many were redeemed. They read directly from our normalized blockchain data, our decoded contract data, and our in-house price feed.
About 80% of the registry conforms to a chain standard:
- ERC-20 is the token standard on Ethereum and EVM-compatible chains.
- SPL is the standard on Solana.
- FA is the standard on Aptos.
A token that conforms to its chain's standard emits the same event for every mint, burn, and transfer. On EVM chains the event is:
```solidity
event Transfer(address indexed from, address indexed to, uint256 value);
```
A transfer moves tokens from one address to another. Most of the time both addresses are regular wallets. The standard also reserves one address that nobody controls: 0x0000000000000000000000000000000000000000, the null address. When it shows up in a transfer, tokens are being created or destroyed instead of moved.
Three operations derive from a single Transfer event:
- A transfer moves tokens between two non-null addresses.
- A mint is a transfer from the null address.
- A burn is a transfer to the null address.
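The three operations reduce to a small classifier over a decoded Transfer event. A minimal sketch in Python, with illustrative field names:

```python
# The ERC-20 null address: transfers from it are mints, transfers to it are burns.
NULL_ADDRESS = "0x0000000000000000000000000000000000000000"

def classify(from_address: str, to_address: str) -> str:
    """Derive the operation type from a decoded Transfer event's addresses."""
    if from_address == NULL_ADDRESS:
        return "mint"      # tokens created
    if to_address == NULL_ADDRESS:
        return "burn"      # tokens destroyed
    return "transfer"      # tokens moved between holders

print(classify(NULL_ADDRESS, "0xabc"))  # mint
print(classify("0xabc", NULL_ADDRESS))  # burn
print(classify("0xabc", "0xdef"))       # transfer
```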
A simplified per-chain model reads raw transfers and aggregates them into daily activity per asset:
```sql
-- Simplified per-chain asset model (ethereum)
select
  date(block_timestamp) as date,
  token_address,
  sum(case when from_address = '0x000...' then value end) as mint_volume,
  sum(case when to_address = '0x000...' then value end) as burn_volume,
  sum(value) as transfer_volume
from ethereum.token_transfers
where token_address in (select address from registry where chain = 'ethereum')
group by date, token_address
```
For this 80%, a code generator writes one of these per chain. All of them produce the same output schema. Non-EVM chains use dedicated model patterns that account for different transfer semantics, but the output columns match.
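A toy version of such a generator is just string templating over a chain list. The template and chain names below are illustrative, not our production setup:

```python
# Toy code generator: stamps out one per-chain SQL model from a shared template,
# so every chain's model produces the same output schema. Illustrative only.
TEMPLATE = """\
-- Simplified per-chain asset model ({chain})
select
  date(block_timestamp) as date,
  token_address,
  sum(case when from_address = '{null_address}' then value end) as mint_volume,
  sum(case when to_address = '{null_address}' then value end) as burn_volume,
  sum(value) as transfer_volume
from {chain}.token_transfers
where token_address in (select address from registry where chain = '{chain}')
group by date, token_address
"""

NULL_ADDRESS = "0x" + "0" * 40

def generate_models(chains: list[str]) -> dict[str, str]:
    """Render one SQL model per EVM chain, all sharing the same output columns."""
    return {chain: TEMPLATE.format(chain=chain, null_address=NULL_ADDRESS) for chain in chains}

models = generate_models(["ethereum", "arbitrum", "base"])
print(models["arbitrum"].splitlines()[0])  # -- Simplified per-chain asset model (arbitrum)
```

One template, many chains: a schema change is made once and propagates to every generated model.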
The remaining 20% do not conform. Major stablecoin issuers use their own events for supply changes:
```solidity
// Circle (USDC)
event Mint(address minter, address to, uint256 amount);

// Tether (USDT)
event Issue(uint256 amount);

// Paxos (USDP)
event SupplyIncreased(address to, uint256 value);
```
Each issuer gets a dedicated override model that reads the issuer's event and produces the same output columns as the standard path. Some assets need more than event decoding:
- Rebase tokens like Aave's aTokens grow the holder's balance continuously as interest accrues, with no mint event ever fired for the change.
- Cross-chain burn-and-mint like Circle's CCTP pairs a burn on one chain with a mint on another, so counting either side alone would misstate supply.
Each of these cases gets purpose-built logic.
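For the burn-and-mint case, the core idea is to treat a matched burn and mint as a move, not a supply change. The sketch below illustrates that netting logic in Python; the event schema and pairing key are invented for the example (real CCTP matching uses message nonces and attestations, which we elide here):

```python
def net_supply_change(events):
    """
    events: list of (chain, kind, amount, transfer_id) tuples, where kind is
    'mint' or 'burn' and transfer_id links the two legs of a cross-chain move
    (None for an ordinary issuance or redemption). Illustrative schema.
    """
    paired = {}
    net = 0
    for chain, kind, amount, transfer_id in events:
        if transfer_id is not None:
            # Cross-chain leg: a matched burn + mint pair nets to zero.
            paired.setdefault(transfer_id, 0)
            paired[transfer_id] += amount if kind == "mint" else -amount
        else:
            net += amount if kind == "mint" else -amount
    # Unmatched legs (transfers still in flight) do affect observed supply.
    return net + sum(paired.values())

events = [
    ("ethereum", "mint", 1_000, None),    # real issuance: +1000
    ("ethereum", "burn", 200, "cctp-1"),  # leg 1 of a cross-chain move
    ("arbitrum", "mint", 200, "cctp-1"),  # leg 2: the pair nets to zero
]
print(net_supply_change(events))  # 1000
```

Counting either leg alone would report supply as 800 or 1,200; netting the pair keeps it at 1,000.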
The output of this stage is one unified schema per asset, per chain, per day. USDC and USDT, two of the largest assets by supply, require overrides. Many smaller assets follow the standard. The pipeline supports both with the same fidelity.
Aggregating across chains
Assets do not live on one chain. USDC has native issuance on 20+ chains. BUIDL has supply on Ethereum and several other chains. A user asking for holder count or market cap wants a single number that aggregates every deployment. Per-chain tables alone cannot answer that question.
A single model combines every per-chain table for an asset and aggregates per metric:
```sql
-- Union per-chain tables, then aggregate
with all_chains as (
  select * from asset_metrics_ethereum
  union all select * from asset_metrics_solana
  union all select * from asset_metrics_arbitrum
  -- 47 more chains
)
select
  asset_id,
  date,
  count(distinct holder_address) as holders,
  sum(transfer_volume) as transfer_volume,
  sum(mint_volume) as mints,
  sum(burn_volume) as redemptions
from all_chains
group by asset_id, date
```
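The union-and-aggregate step is, at heart, a group-by over the concatenated per-chain rows. A minimal Python equivalent, with an illustrative row schema and made-up values:

```python
from collections import defaultdict

def aggregate(per_chain_rows):
    """
    per_chain_rows: iterable of dicts with asset_id, date, chain, holder_address,
    transfer_volume, mint_volume, and burn_volume (an illustrative schema).
    Returns one record per (asset_id, date), summed across every chain.
    """
    out = defaultdict(lambda: {"holders": set(), "transfer_volume": 0, "mints": 0, "redemptions": 0})
    for r in per_chain_rows:
        agg = out[(r["asset_id"], r["date"])]
        agg["holders"].add(r["holder_address"])   # distinct count, like the SQL
        agg["transfer_volume"] += r["transfer_volume"]
        agg["mints"] += r["mint_volume"]
        agg["redemptions"] += r["burn_volume"]
    return {k: {**v, "holders": len(v["holders"])} for k, v in out.items()}

rows = [
    {"asset_id": "usdc", "date": "2025-01-01", "chain": "ethereum",
     "holder_address": "0xa", "transfer_volume": 100, "mint_volume": 10, "burn_volume": 0},
    {"asset_id": "usdc", "date": "2025-01-01", "chain": "solana",
     "holder_address": "0xa", "transfer_volume": 50, "mint_volume": 0, "burn_volume": 5},
]
result = aggregate(rows)
print(result[("usdc", "2025-01-01")]["transfer_volume"])  # 150
```

Note the holder dedup: an address appearing on two chains counts once, mirroring `count(distinct holder_address)` in the SQL above (in practice EVM and Solana addresses differ; the repeated "0xa" is only to show the dedup).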
The output is one row per asset per day across twelve metrics in three groups:
- Flows cover transfer volume, transfer count, mints, and redemptions.
- State covers circulating market cap, holders, price, APY, and off-peg.
- Activity covers senders at daily, weekly, and monthly windows.
Pricing comes from our in-house onchain price feed, which covers 300,000+ tokens and is built entirely from onchain state. More on that layer in "Why we built our in-house onchain price feed."
What the full stack makes possible
The pipeline looks simple from the outside. The input is a YAML file. The output is a metric.
That simplicity rests on years of infrastructure beneath it.
Chain-specific transformations depend on the ingestion layer pulling raw blockchain data from 100+ chains and the normalization layer turning it into consistent tables.
Override models depend on the decoding layer resolving every contract event our pipeline touches.
Cross-chain aggregation depends on the pricing layer covering 300,000+ tokens.
Each of those layers took years to build. Together, they let one pipeline cover the entire tokenized asset market.
If your team needs access to this dataset, reach out at sales@tokenterminal.xyz.
The authors of this content, or members, affiliates, or stakeholders of Token Terminal may be participating or are invested in protocols or tokens mentioned herein. The foregoing statement acts as a disclosure of potential conflicts of interest and is not a recommendation to purchase or invest in any token or participate in any protocol. Token Terminal does not recommend any particular course of action in relation to any token or protocol. The content herein is meant purely for educational and informational purposes only, and should not be relied upon as financial, investment, legal, tax or any other professional or other advice. None of the content and information herein is presented to induce or to attempt to induce any reader or other person to buy, sell or hold any token or participate in any protocol or enter into, or offer to enter into, any agreement for or with a view to buying or selling any token or participating in any protocol. Statements made herein (including statements of opinion, if any) are wholly generic and not tailored to take into account the personal needs and unique circumstances of any reader or any other person. Readers are strongly urged to exercise caution and have regard to their own personal needs and circumstances before making any decision to buy or sell any token or participate in any protocol. Observations and views expressed herein may be changed by Token Terminal at any time without notice. Token Terminal accepts no liability whatsoever for any losses or liabilities arising from the use of or reliance on any of this content.