
Why onchain data needs an entity taxonomy for proper metric standardization

How we built an entity taxonomy and metric standardization to make every onchain metric Token Terminal publishes comparable.

Token Terminal

When you analyze a public company, the taxonomy for the analysis exists outside the company itself. SEC filings (10-K) standardize the disclosure. GAAP defines the line items. GICS classifies the sector. CUSIP identifies the security. The categories you use to compare two companies are provided by regulators, accounting standards bodies, exchanges, and index providers, not by the companies themselves.

Onchain has none of that built in. The chain emits blocks, transactions, logs, and traces. None of those tell you what business activity a contract represents, what market sector a project belongs to, or how its numbers compare to a peer on a different chain.

Most crypto data providers built their entity taxonomy on top of the technical primitives: blockchains, accounts, smart contracts, tokens. If the goal is producing comparable financial and usage metrics, this is the wrong starting point.

Modeling crypto protocols as businesses

Our thesis from the start was simple: analyze blockchains, applications, and asset issuers the way investors analyze companies. Does the protocol have users? What is its business model? What is its gross merchandise value (GMV)? What is its take rate? These are the questions an analyst asks of any business, and we wanted to answer them for every onchain project.

The technical taxonomy could not answer them. We needed an entity model that matched how a financial analyst already thinks.

Six entity types form the skeleton:

  • Project: a business (crypto-native or TradFi) operating onchain, e.g. Ethereum, Aave, Uniswap, Circle, BlackRock.
  • Chain: the blockchain a project operates on, e.g. Ethereum, Solana, Base.
  • Market sector: the category of business activity a project falls into, e.g. Lending, Exchanges, Stablecoin issuers, RWA issuers.
  • Product: a business line within a project, mapping to a market sector, e.g. Aave's V3 lending protocol, Uniswap's V4 DEX protocol and Unichain L2, Circle's USDC and EURC stablecoins.
  • Asset: a token, native currency, or tokenized real-world instrument, e.g. aUSDC, sUSDS, stETH, BUIDL.
  • Metric: a quantified property of any entity above, e.g. TVL, GMV, Fees, Revenue for projects; Market cap, Transfer volume, APY, Senders for assets.
  Market sector                              Chain
(Lending, Exchanges,                  (Ethereum, Solana, Base)
 Stablecoin / RWA issuers)                     ▲
           │                                   │
           │ categorizes                       │
           ▼                                   │
        Project ────── deployed on ────────────┤
  (Ethereum, Aave, Uniswap,                    │
   Circle, BlackRock)                          │
           │                                   │ deployed on
           │ has products                      │
           ▼                                   │
        Product ────── linked to ──────► Asset ┘
       (Aave V3,                    (aUSDC, sUSDS,
        Uniswap V4,                  stETH, BUIDL)
        Circle USDC)
           │                                │
           │ has metrics                    │ has metrics
           ▼                                ▼
 ┌──────────────────┐              ┌──────────────────┐
 │ Project metrics  │              │  Asset metrics   │
 │                  │              │                  │
 │ TVL              │              │ Market cap       │
 │ GMV              │              │ Transfer volume  │
 │ Fees             │              │ APY              │
 │ Revenue          │              │ Senders          │
 └──────────────────┘              └──────────────────┘

The six entity types and how they connect. Each metric is anchored to either a project or an asset.
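The relationships in the diagram can be sketched as plain data types. This is a minimal illustration, not Token Terminal's actual schema: the class and field names are assumptions, and the choice to store the market sector on the product and derive it for the project is just one way to express "a product maps to a market sector."

```python
from dataclasses import dataclass
from enum import Enum


class Anchor(Enum):
    """The two entity types a metric can be anchored to."""
    PROJECT = "project"
    ASSET = "asset"


@dataclass(frozen=True)
class Chain:
    name: str


@dataclass(frozen=True)
class MarketSector:
    name: str


@dataclass(frozen=True)
class Asset:
    symbol: str
    chains: tuple[Chain, ...] = ()  # chains the asset is deployed on


@dataclass(frozen=True)
class Product:
    name: str
    sector: MarketSector            # each product maps to one market sector
    assets: tuple[Asset, ...] = ()  # assets linked to the product


@dataclass(frozen=True)
class Project:
    name: str
    products: tuple[Product, ...] = ()
    chains: tuple[Chain, ...] = ()

    def sectors(self) -> set[str]:
        """Market sectors a project falls into, derived from its products."""
        return {product.sector.name for product in self.products}


@dataclass(frozen=True)
class Metric:
    name: str
    anchor: Anchor


# Example wiring, using names from the list above.
ethereum = Chain("Ethereum")
lending = MarketSector("Lending")
ausdc = Asset("aUSDC", chains=(ethereum,))
aave_v3 = Product("Aave V3", sector=lending, assets=(ausdc,))
aave = Project("Aave", products=(aave_v3,), chains=(ethereum,))
tvl = Metric("TVL", Anchor.PROJECT)
market_cap = Metric("Market cap", Anchor.ASSET)
```

Keeping `Metric.anchor` restricted to two values mirrors the rule that every metric hangs off either a project or an asset.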

How we standardize a metric

Each project has its own contracts and terminology for what it does. Our job is to map all of that to one canonical definition per metric, so that each metric means the same thing whether the project is Aave, Uniswap, Circle, BlackRock, or any other project we cover.

Every metric we standardize follows the same three-step process per project.

1. The canonical definition. A definition modeled on traditional finance, written in vocabulary an analyst already uses and applied identically across every project we cover.

2. The project-specific definition. What the canonical metric means for a particular project.

3. The mapping from business model to onchain components. Map the project's business model and how value flows through it: who supplies capital, where it accumulates, how fees are paid, where revenue accrues. With the value flow mapped, the smart contracts and events at each step become identifiable, and each step corresponds to one of the canonical metrics.
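One way to picture the three steps is as a single record per metric and project. The sketch below is an assumption about shape only: `MetricStandardization` and `canonical_is_consistent` are hypothetical names, and the definition strings are taken from the worked examples in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricStandardization:
    metric: str
    project: str
    canonical_definition: str            # step 1: identical for every project
    project_definition: str              # step 2: what it means for this project
    onchain_components: tuple[str, ...]  # step 3: contracts, events, state


CANONICAL_REVENUE = (
    "The portion of fees that a project retains, as determined by its take rate."
)

aave_revenue = MetricStandardization(
    metric="Revenue",
    project="Aave",
    canonical_definition=CANONICAL_REVENUE,
    project_definition="Reserve factor share of borrower interest, retained by the Aave DAO.",
    onchain_components=("reserve factor", "Aave Collector contract"),
)

uniswap_revenue = MetricStandardization(
    metric="Revenue",
    project="Uniswap",
    canonical_definition=CANONICAL_REVENUE,
    project_definition="The share of swap fees retained when the V4 protocol switch is on.",
    onchain_components=("setProtocolFee",),
)


def canonical_is_consistent(records: list[MetricStandardization]) -> bool:
    """True if every metric has exactly one canonical definition across projects."""
    definitions: dict[str, set[str]] = {}
    for record in records:
        definitions.setdefault(record.metric, set()).add(record.canonical_definition)
    return all(len(found) == 1 for found in definitions.values())
```

A check like `canonical_is_consistent` captures the invariant behind step 1: the project-specific definition and the onchain mapping may differ, but the canonical definition may not.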

Stress-testing the framework

To stress-test the framework, we walk through how it applies to three architecturally different businesses. The metrics we've chosen describe the financial picture of any onchain project: capital held, activity flowing, fees paid, revenue retained.

Defined canonically:

  • TVL: measures the total value of capital a project holds, onchain or tokenized from offchain.
  • GMV: measures the gross dollar value of activity through a project over a given period. Each market sector's variant carries its own name:
    • Active loans (Lending): measures the value of capital actively lent to borrowers.
    • Trading volume (Exchanges): measures the total value of trades executed through a project.
    • Outstanding supply (Stablecoin issuers): measures the circulating supply of stablecoins issued by a project.
  • Fees: measures the total value users pay to use a project.
  • Revenue: measures the portion of fees that a project retains, as determined by its take rate.
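The relationship between Fees, Revenue, and the take rate is simple arithmetic. The sketch below assumes the take rate is expressed as a fraction between 0 and 1 and calls the remainder supply-side fees, the label used in the diagrams later in this article; the function names are illustrative.

```python
from decimal import Decimal


def split_fees(fees: Decimal, take_rate: Decimal) -> tuple[Decimal, Decimal]:
    """Split total fees into (revenue, supply-side fees) at the take rate."""
    if not Decimal(0) <= take_rate <= Decimal(1):
        raise ValueError("take_rate must be between 0 and 1")
    revenue = fees * take_rate
    supply_side_fees = fees - revenue
    return revenue, supply_side_fees


def implied_take_rate(fees: Decimal, revenue: Decimal) -> Decimal:
    """Back out the take rate when fees and revenue are both observed."""
    if fees <= 0:
        raise ValueError("fees must be positive")
    return revenue / fees


# 1,000 in fees at a 10% take rate: 100 retained, 900 passed to the supply side.
revenue, supply_side = split_fees(Decimal("1000"), Decimal("0.10"))
```

Using `Decimal` rather than floats avoids rounding drift when the same split is summed over many periods.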

We apply these to Aave (Lending), Uniswap (Exchanges), and Circle (Stablecoin issuers). Most projects operate more than one business line; for the worked examples below we walk through a single product per project to keep the comparison clean. For each, we map the business model and trace how value flows through it.

Aave (Lending)

We walk through Aave V3. Flash loans, liquidations, and ecosystem fees (GHO, Chainlink SVR, Velora) are part of Aave's broader value flow but out of scope for this example.

A depositor supplies an asset (USDC, ETH, WBTC, and so on) to Aave's lending pool and receives an aToken receipt in return. A borrower posts collateral and draws an asset from the same pool, receiving a debt token that tracks what they owe. Interest accrues continuously through Aave's per-asset liquidity index. The reserve factor, set per asset by Aave governance, sends a share of borrower interest to the Aave treasury; the rest accrues to depositors via growing aToken balances.

Project-specific definitions:

  • TVL: total value of assets deposited into Aave V3 (idle liquidity plus active loans), at fair value.
  • Active loans: total value of assets currently borrowed (outstanding principal plus accrued interest).
  • Fees: borrower interest paid on outstanding loans.
  • Revenue: reserve factor share of borrower interest, retained by the Aave DAO.
   Depositor                          Borrower
      │                                  │
      │ supplies USDC                    │ borrows USDC
      │ (aUSDC minted)                   │ (vUSDC minted)
      ▼                                  ▼
┌──────────────────────────────────────────────┐
│              Aave Reserve                    │
│   ┌────────────────┐  ┌──────────────────┐   │
│   │ Idle liquidity │  │  Active loans    │   │
│   └────────────────┘  └──────────────────┘   │
│        └──────── TVL ─────────┘              │
└──────────────────────────────────────────────┘
                       │
                       │ Borrowers pay interest
                       │ (total interest = Fees)
                       ▼
                ┌─── Splits at ───┐
                │  reserve factor │
           (1 − rf)              (rf)
                │                 │
                ▼                 ▼
         ┌──────────────┐  ┌──────────────┐
         │  Depositors  │  │   Treasury   │
         │              │  │              │
         │ = Supply-    │  │  = Revenue   │
         │   side fees  │  │              │
         └──────────────┘  └──────────────┘

Aave V3: borrower interest splits between depositors and the Aave DAO at the reserve factor.

Mapping to onchain components:

  • TVL: ERC-20 Transfer events on aToken contracts (idle liquidity), plus VToken Mint/Burn events scaled by the variable borrow index (active loans).
  • Active loans: Mint and Burn events from VToken contracts, scaled by the variable borrow index from ReserveDataUpdated.
  • Fees: borrower interest, derived from accruedToTreasury state and MintedToTreasury events on the Pool contract (named LendingPool in Aave V2).
  • Revenue: reserve factor applied to borrower interest. Treasury accruals land at the Aave Collector contract.
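The scaling step in the mapping above can be sketched with integer arithmetic. This is a simplified illustration under several assumptions: events are reduced to hypothetical `(kind, amount, index)` tuples rather than the real event payloads, rounding uses plain floor division instead of Aave's ray math, and `fees_from_treasury_accrual` assumes the treasury receives exactly the reserve factor share of borrower interest.

```python
RAY = 10**27          # Aave expresses indexes with 27 decimals
BPS = 10_000          # reserve factor expressed in basis points


def scaled_debt(events: list[tuple[str, int, int]]) -> int:
    """Accumulate scaled variable debt from simplified Mint/Burn events.

    Each event is (kind, amount, index): the amount in underlying units
    and the variable borrow index in effect when the event fired.
    """
    scaled = 0
    for kind, amount, index in events:
        delta = amount * RAY // index  # convert to scaled units
        scaled += delta if kind == "mint" else -delta
    return scaled


def active_loans(scaled: int, current_index: int) -> int:
    """Outstanding principal plus accrued interest, in underlying units."""
    return scaled * current_index // RAY


def fees_from_treasury_accrual(treasury_accrual: int, reserve_factor_bps: int) -> int:
    """Back out total borrower interest from the treasury's share (assumption)."""
    if reserve_factor_bps <= 0:
        raise ValueError("reserve factor must be positive to infer fees")
    return treasury_accrual * BPS // reserve_factor_bps


events = [
    ("mint", 1_000_000, RAY),            # borrow at index 1.00
    ("mint", 500_000, 125 * 10**25),     # borrow at index 1.25
    ("burn", 250_000, 125 * 10**25),     # repay at index 1.25
]
scaled = scaled_debt(events)
loans_now = active_loans(scaled, 15 * 10**26)  # value at index 1.50
```

Storing debt in scaled units is what lets interest accrue without a transaction per borrower: only the index moves, and the current value is recovered by one multiplication.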

Full methodology: TVL, Active loans, Fees, Revenue.

Uniswap (Exchanges)

We walk through Uniswap V4. V2, V3, and Unichain L2 are part of Uniswap's broader value flow but out of scope for this example.

A liquidity provider supplies a pair of tokens (USDC + ETH, for example) to a Uniswap pool and receives an LP position representing their share. Traders swap one token for the other against the pool, paying a swap fee per trade. In V4, a configurable protocol fee can divert a share of swap fees to the Uniswap treasury, though the switch is off by default in most pools.

Project-specific definitions:

  • TVL: aggregate USD value of assets in Uniswap V4 liquidity pools, priced end-of-day.
  • Trading volume: total USD value of swaps in Uniswap V4 pools.
  • Fees: swap fees collected by Uniswap V4 pools.
  • Revenue: the share of swap fees retained when the V4 protocol switch is on.
  Liquidity provider                  Trader
      │                                  │
      │ supplies USDC + ETH              │ swaps USDC for ETH
      │ (LP position minted)             │ (pays swap fee)
      ▼                                  ▼
┌──────────────────────────────────────────────┐
│      Uniswap V4 PoolManager (singleton)      │
│   ┌────────────────┐  ┌──────────────────┐   │
│   │ Token reserves │  │  Trading volume  │   │
│   │ (LP positions) │  │  (per period)    │   │
│   └────────────────┘  └──────────────────┘   │
│        └──── TVL ────┘                       │
└──────────────────────────────────────────────┘
                       │
                       │ Traders pay swap fees
                       │ (total swap fees = Fees)
                       ▼
                ┌── Splits at ───┐
                │  protocol fee  │
           (1 − pf)             (pf)
                │                 │
                ▼                 ▼
         ┌──────────────┐  ┌──────────────┐
         │  LPs         │  │   Treasury   │
         │              │  │              │
         │ = Supply-    │  │  = Revenue   │
         │   side fees  │  │  (V3/V4 only)│
         └──────────────┘  └──────────────┘

Uniswap V4: swap fees go to LPs by default; with the V4 protocol fee switch on, a share accrues to the Uniswap treasury.

Mapping to onchain components:

  • TVL: ERC-20 and native asset transfers in and out of PoolManager (singleton) and vault contracts, summed as running balances per asset and priced end-of-day.
  • Trading volume: Swap event amounts from V4's singleton PoolManager.
  • Fees: Swap event amounts multiplied by the pool's fee tier (from PoolKey.fee).
  • Revenue: protocol fee share from setProtocolFee, when the switch is on. The share accrues to the configured recipient address.
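The Fees line above is a per-swap multiplication. The sketch below assumes swaps have already been priced in USD and that each pool uses a static fee stored in hundredths of a basis point (so 3000 means 0.30%); dynamic-fee pools are not handled. Following the diagram, it treats the protocol fee as a simple share of swap fees, which is a simplification of the contract's actual fee arithmetic. All function and field names are illustrative.

```python
from decimal import Decimal

PIPS = Decimal(1_000_000)  # fee units: hundredths of a basis point


def swap_fee_usd(volume_usd: Decimal, fee_pips: int) -> Decimal:
    """Fee paid on one swap, given its USD size and the pool's fee tier."""
    return volume_usd * Decimal(fee_pips) / PIPS


def daily_metrics(
    swaps: list[tuple[Decimal, int]],
    protocol_share: Decimal = Decimal(0),  # 0 models the fee switch being off
) -> dict[str, Decimal]:
    """Aggregate (volume_usd, fee_pips) swaps into the four fee-related metrics."""
    volume = sum((v for v, _ in swaps), Decimal(0))
    fees = sum((swap_fee_usd(v, f) for v, f in swaps), Decimal(0))
    revenue = fees * protocol_share
    return {
        "trading_volume": volume,
        "fees": fees,
        "revenue": revenue,
        "supply_side_fees": fees - revenue,
    }


swaps = [
    (Decimal("10000"), 3000),  # 0.30% pool
    (Decimal("50000"), 500),   # 0.05% pool
]
switch_off = daily_metrics(swaps)
switch_on = daily_metrics(swaps, protocol_share=Decimal("0.2"))
```

With the switch off, Revenue is zero and all fees are supply-side fees, which matches the default described for most V4 pools.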

Full methodology: TVL, Trading volume, Fees, Revenue.

Circle (Stablecoin issuers)

We walk through Circle's USDC stablecoin. CCTP cross-chain transfer fees and Circle's other stablecoins (EURC, USYC) are part of Circle's broader value flow but out of scope for this example.

A user deposits fiat with Circle and receives newly minted USDC at a 1:1 rate. Circle holds the fiat in regulated custody and invests it in short-duration US Treasuries. The Treasuries earn yield. USDC itself is non-yielding, so all of the yield accrues to Circle. The supply outstanding is Circle's "managed capital," and the reserves backing it are the project's TVL.

Project-specific definitions:

  • TVL: total USD value of outstanding USDC across all supported chains.
  • Outstanding supply: circulating USDC across all supported chains, excluding tokens held by Circle.
  • Fees: yield earned on USDC reserve assets.
  • Revenue: essentially all of Fees. USDC is non-yielding to holders, so Circle retains the reserve yield.
   USDC user                       US Treasury
                                      market
      │                                  │
      │ deposits fiat                    │ pays
      │ (USDC minted 1:1)                │ Treasury yield
      ▼                                  ▼
┌──────────────────────────────────────────────┐
│              Circle Reserve                  │
│   ┌────────────────┐  ┌──────────────────┐   │
│   │ Fiat custody   │  │  US Treasuries   │   │
│   └────────────────┘  └──────────────────┘   │
│        └──────── TVL ─────────┘              │
│     ( 1:1 backing → Outstanding supply )     │
└──────────────────────────────────────────────┘
                       │
                       │ Reserves earn yield
                       │ (total yield = Fees)
                       ▼
                ┌── Splits at ───┐
                │  USDC holders  │
                │  get 0% yield  │
            (0%)               (100%)
                │                 │
                ▼                 ▼
         ┌──────────────┐  ┌──────────────┐
         │ USDC holders │  │   Circle     │
         │              │  │              │
         │ = Supply-    │  │  = Revenue   │
         │   side fees  │  │              │
         │   (= 0)      │  │              │
         └──────────────┘  └──────────────┘

Circle USDC: reserves earn Treasury yield, holders earn nothing, so all yield accrues to Circle.

Mapping to onchain components:

  • TVL: USDC outstanding supply across all supported chains, derived from Mint and Burn events. Acts as a 1:1 proxy for the offchain reserves Circle attests to monthly.
  • Outstanding supply: Mint and Burn events from USDC contracts across all supported chains, excluding tokens held in Circle-owned accounts.
  • Fees: reserve yield from Circle's transparency-reported reserve holdings (offchain).
  • Revenue: same as Fees, since the supply-side share is zero.
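The supply-based metrics above reduce to a running sum over Mint and Burn events. The sketch below assumes events have been flattened to hypothetical `(chain, kind, amount)` tuples in whole USDC and that balances of Circle-owned accounts are known per chain; both shapes are illustrative.

```python
from collections import defaultdict


def usdc_supply_metrics(
    events: list[tuple[str, str, int]],
    circle_held: dict[str, int],
) -> dict[str, int]:
    """Derive TVL and Outstanding supply from Mint/Burn events.

    events: (chain, "mint" | "burn", amount) tuples across all chains.
    circle_held: USDC sitting in Circle-owned accounts, per chain.
    """
    supply_by_chain: dict[str, int] = defaultdict(int)
    for chain, kind, amount in events:
        supply_by_chain[chain] += amount if kind == "mint" else -amount

    total_supply = sum(supply_by_chain.values())
    return {
        # Total minted minus burned: a 1:1 proxy for the offchain reserves.
        "tvl": total_supply,
        # Same figure, excluding tokens held by Circle itself.
        "outstanding_supply": total_supply - sum(circle_held.values()),
    }


events = [
    ("Ethereum", "mint", 1_000),
    ("Solana", "mint", 500),
    ("Ethereum", "burn", 200),
]
metrics = usdc_supply_metrics(events, circle_held={"Ethereum": 100})
```

Fees and Revenue are not computed here because, per the mapping, reserve yield comes from offchain reporting rather than from events.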

Full methodology: TVL, Outstanding supply, Fees, Revenue.

Extending the framework to new metrics and datasets

The standardization process is enabled by owning the full data pipeline. We scrape raw blockchain data from our own RPC infrastructure, decode smart contract events from contracts we label in-house, and run our own price feed. The entity taxonomy and the canonical metric definitions sit on top of all of it.

That control is what lets us extend the same process to new metric families without rebuilding the stack. This year alone we have shipped three new datasets on the same infrastructure:

  • Tokenized assets. Over 3,000 tokenized real-world instruments (e.g. stablecoins, RWAs, T-bill funds, credit funds, stocks, ETFs).
  • APY across yield-bearing assets. Yield normalized across rebasing tokens, ERC-4626 vaults, scaled-balance designs, and tokenized funds backed by yield-bearing instruments (e.g. T-bills), all annualized the same way.
  • Agentic payments. Onchain payments by AI agents, with transaction volumes and sender/recipient breakdowns.
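To illustrate what annualizing "the same way" could look like for the APY dataset, the sketch below reduces every yield design to one observation: the value of a single unit at the start and end of a window. For a rebasing token that would be the balance of one initial unit, for an ERC-4626 vault the assets per share, for a scaled-balance token the liquidity index, and for a tokenized fund the NAV per share. The compounding formula and the 365-day year are assumptions for illustration, not Token Terminal's published methodology.

```python
def annualized_yield(start_value: float, end_value: float, days: float) -> float:
    """Compound the growth observed over `days` out to a 365-day year."""
    if start_value <= 0 or days <= 0:
        raise ValueError("start_value and days must be positive")
    growth = end_value / start_value
    return growth ** (365 / days) - 1


# Same function, different sources for the per-unit value:
vault_apy = annualized_yield(1.00, 1.01, days=30)       # assets per vault share
index_apy = annualized_yield(1.250, 1.255, days=30)     # liquidity index
fund_apy = annualized_yield(100.0, 100.4, days=30)      # NAV per fund share
```

Normalizing to a per-unit value first is what makes the resulting numbers comparable: deposits and withdrawals change total balances but not the value of one unit.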

Each dataset adds entities to the same taxonomy rather than forking it:

  1. From a tokenized asset like BUIDL you can navigate to its tokenization platform (Securitize) or fund issuer (BlackRock) and see project metrics, the rest of their tokenized asset catalogs, and comparisons to other RWA issuers.
  2. From a yield-bearing asset like aEthWETH you can navigate to the protocol producing the yield (Aave) and see Aave's project-level revenue, market sector context, and the rest of its assets.
  3. From the agentic payments dataset you can navigate to the chain where payments settle (Solana) and see payment activity in the context of Solana's broader economic and usage metrics.

In practice, a user can ask any question that crosses entity types, get one answer, and trust it. "Which RWA issuer has the highest revenue per AUM?" is one query across the joined tokenized asset and project metric tables. "Compare yields across protocols that issue receipt tokens" is one query across the asset and project entities.
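As a rough illustration of the first question, the sketch below runs one joined query against an in-memory SQLite database. The table names, column names, issuers, and figures are all made up for the example and do not reflect Token Terminal's schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE projects (project_id TEXT PRIMARY KEY, market_sector TEXT);
    CREATE TABLE project_metrics (project_id TEXT, metric TEXT, value_usd REAL);
    CREATE TABLE tokenized_assets (
        asset_id TEXT PRIMARY KEY, issuer_project_id TEXT, aum_usd REAL
    );
    """
)

# Hypothetical issuers and figures, for illustration only.
conn.executemany(
    "INSERT INTO projects VALUES (?, ?)",
    [("issuer_a", "RWA issuers"), ("issuer_b", "RWA issuers"), ("lender_c", "Lending")],
)
conn.executemany(
    "INSERT INTO project_metrics VALUES (?, ?, ?)",
    [("issuer_a", "revenue", 50.0), ("issuer_b", "revenue", 30.0), ("lender_c", "revenue", 90.0)],
)
conn.executemany(
    "INSERT INTO tokenized_assets VALUES (?, ?, ?)",
    [("fund_1", "issuer_a", 1000.0), ("fund_2", "issuer_a", 1000.0), ("fund_3", "issuer_b", 600.0)],
)

# One query: project revenue divided by the summed AUM of the issuer's assets.
top = conn.execute(
    """
    SELECT p.project_id, m.value_usd / SUM(a.aum_usd) AS revenue_per_aum
    FROM projects AS p
    JOIN project_metrics AS m
      ON m.project_id = p.project_id AND m.metric = 'revenue'
    JOIN tokenized_assets AS a
      ON a.issuer_project_id = p.project_id
    WHERE p.market_sector = 'RWA issuers'
    GROUP BY p.project_id, m.value_usd
    ORDER BY revenue_per_aum DESC
    LIMIT 1
    """
).fetchone()
```

The query stays short because the asset table carries the issuer's project ID and the project table carries the market sector; without those shared keys, each join would need its own reconciliation step.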

What this enables

Any data provider can build one-off dashboards for a handful of projects. The moment you want to do that at scale, you need an entity taxonomy to tie similar-sounding concepts together across projects, chains, and market sectors. Networks and blockchains, protocols and dapps, tokens and currencies: different words mean the same thing in different dashboards.

The cost of not having that mapping falls on whoever is interpreting the data. A human analyst has to audit every cross-protocol comparison, reconciling vocabulary by hand. An AI agent has to reconstruct the semantic layer from raw tables on every query, which is a common source of hallucinations.

Recent benchmarks from dbt put numbers on this, reporting text-to-SQL accuracy across three settings:

  • ~65% accuracy on raw, unmodeled enterprise data. The agent sees raw tables, guesses at joins, and disambiguates business terms on its own. Roughly a third of answers are wrong, and the wrong ones look as confident as the right ones.
  • 85-90% accuracy once the data is properly modeled. Entities are typed, metrics are defined once, and the agent reads from the model instead of the raw tables.
  • 98-100% accuracy with a semantic layer on top. The schema, the entities, and the metric definitions all line up with how a human analyst would ask the question.

Onchain data is moving from analysts to agents. Every such workflow is only as right as the data layer underneath. Without an entity taxonomy and standardized metrics, the agent guesses at joins, reconciles vocabulary on the fly, and produces confident-sounding wrong answers.

Token Terminal has spent five years building that layer. Try our MCP server to query accurate, standardized, and comparable onchain data through any LLM-powered tool. For a trial or to discuss integrations, reach out at sales@tokenterminal.xyz.

The authors of this content, or members, affiliates, or stakeholders of Token Terminal may be participating or are invested in protocols or tokens mentioned herein. The foregoing statement acts as a disclosure of potential conflicts of interest and is not a recommendation to purchase or invest in any token or participate in any protocol. Token Terminal does not recommend any particular course of action in relation to any token or protocol. The content herein is meant purely for educational and informational purposes only, and should not be relied upon as financial, investment, legal, tax or any other professional or other advice. None of the content and information herein is presented to induce or to attempt to induce any reader or other person to buy, sell or hold any token or participate in any protocol or enter into, or offer to enter into, any agreement for or with a view to buying or selling any token or participating in any protocol. Statements made herein (including statements of opinion, if any) are wholly generic and not tailored to take into account the personal needs and unique circumstances of any reader or any other person. Readers are strongly urged to exercise caution and have regard to their own personal needs and circumstances before making any decision to buy or sell any token or participate in any protocol. Observations and views expressed herein may be changed by Token Terminal at any time without notice. Token Terminal accepts no liability whatsoever for any losses or liabilities arising from the use of or reliance on any of this content.

© 2026 Token Terminal