Engineering

How we apply dimensional modeling to onchain data

A run-through of Kimball's dimensional modeling applied to Sky's raw smart contract events.

Token Terminal

How we apply dimensional modeling to onchain data

The previous post introduced our entity taxonomy and how it underpins metric standardization. In this post we show how the standardization actually works from raw smart contract events to a queryable metric.

Onchain is an open design space. Any developer can build any smart contract architecture, choose any design pattern, and name their events and functions however they want. That freedom makes standardizing comparable financial and usage metrics across 100+ blockchains, 1,200+ protocols, and 3,000+ tokenized assets hard.

Traditional analytics has faced the same problem for decades. Dimensional modeling has been the standard answer for thirty years. Ralph Kimball codified it in 1996 in The Data Warehouse Toolkit.

In short, the process follows these steps:

  1. Choose the business process being modeled: the activity to measure.
  2. Declare the grain of the data: what a single row represents (one event, one daily snapshot).
  3. Identify the dimensions: the descriptive context queries filter and group by.
  4. Identify the facts: the numeric measurements being aggregated.

While the process itself is simple, applying it well to onchain data is the hard part. We've chosen Sky as the worked example: it has a complex smart contract architecture and a deliberately strange vocabulary, both of which make it a good stress test for the dimensional model.

1. Choose the business process

Sky issues USDS, its decentralized stablecoin backed by a mix of onchain collateral and tokenized real-world assets.

The protocol's key business components are:

  • Stablecoin issuance (Vaults). Users lock collateral (ETH, USDC, RWA tokens) and mint USDS. Borrowers pay a stability fee, an interest rate set per collateral type by governance.
  • Sky Savings Rate (SSR). USDS holders deposit into a savings contract and earn yield, paid from collected stability fees.
  • Peg Stability Module (PSM). The PSM swaps approved stablecoins (e.g. USDC) for USDS one-to-one. A small fee per swap accrues to the protocol.
  • Direct Deposit Module (D3M). Sky deploys USDS into external lending venues (Aave, Morpho) and into semi-autonomous sub-DAOs within its own ecosystem (Spark, Grove) to put idle reserves to work. The portion actively borrowed earns yield that flows to the surplus. Whatever sits unlent counts as idle.

They all feed the protocol surplus through their own paths:

   Vault user                                     USDS holder   
      │                                                  │      
      │ locks ETH/USDC/RWA                               │ deposits USDS
      │ mints USDS                                       │ into SSR
      ▼                                                  ▼      
┌──────────────────────────────────────────────────────────────┐
│                          Sky Vaults                          │
│      ┌────────────────┐    ┌────────────────────────────┐    │
│      │  Collateral    │    │  USDS minted               │    │
│      │  locked        │    │  → in circulation          │    │
│      │  (= TVL)       │    │  (= Outstanding supply)    │    │
│      └────────────────┘    └────────────────────────────┘    │
└──────────────────────────────────────────────────────────────┘
                              │
                              │ Stability fees (= Fees)
                              ▼
                       ┌── Splits at ──┐
                       │   SSR rate     │
                  (paid out)        (retained)
                       │                │
                       ▼                ▼
                   USDS           ┌────────────────────────────┐
                  holders         │      Sky Surplus Buffer    │
                  (savings        │      (= Revenue)           │
                  yield)          └────────────────────────────┘
                                       ▲       ▲
                                       │       │
                                  PSM fee   D3M income
                                  (per-     (lending
                                   swap)    yield)

Stability fees from vault borrowers split between savings depositors and the surplus buffer. PSM swap fees and D3M lending yield flow directly to the surplus.

Sky inherits a deliberately strange vocabulary from MakerDAO's original contract design. The terms used in the rest of this post:

  1. The vault accounting contract is the Vat. All collateral and debt state lives here.
  2. Collateral types are ilks (e.g. ETH-A, USDC-A).
  3. A single user's vault state is an urn.
  4. ink is the collateral balance in an urn (raw token units before USD conversion).
  5. art is the debt amount in an urn, before rate scaling.
  6. Each ilk has a rate, a per-block debt multiplier. The actual debt equals art × rate.
  7. Each ilk has a gem, the underlying ERC-20 token address.
  8. Vault state changes happen through Vat function calls: frob (open or modify a vault), grab (liquidate), fold (rate update).

The protocol invented these terms on purpose: each is an idiosyncratic word that means exactly one thing inside the protocol, so contributors reasoning about contract behavior cannot confuse them with anything else.

Sky's architecture and vocabulary are dense. To go deep on the dimensional model without overloading the post, we focus on one metric from here: TVL.

2. Declare the grain

Facts capture quantitative state at a defined level of detail. Kimball calls this the grain.

For Sky's TVL, the grain is per-vault, per-day: each row in fct_sky_vault_collaterals is one vault's collateral balance on one day.

timestamp     vault_id   ilk        ink         token_address
─────────────────────────────────────────────────────────────────
2024-12-01    1234       ETH-A      5.0         0xC02a...  (WETH)
2024-12-01    1235       USDC-A     50,000      0xA0b8...  (USDC)
2024-12-01    1236       WBTC-A     0.15        0x2260...  (WBTC)

Each row is one vault's collateral balance on one day. ink is the onchain balance from Sky's Vat state; ilk identifies the collateral type; token_address is the underlying token used for USD conversion.

3. Identify the dimensions

A dimension is the descriptive context attached to a fact. In Kimball's framework, dimensions answer the who, what, where, when, why, and how of an event. Queries filter and group on them. Each dimension table sits as a point of a star around the central fact table; every query joins facts to one or more dimensions.

Inside a project's transformation, the dimensions that matter are the protocol-specific ones. For Sky, that's the collateral type backing each vault.

Sky's TVL uses one dimension:

  • dim_sky_collateral_types holds one row per ilk, with the bytes32 identifier decoded to a human-readable name. The table auto-populates from Vat.frob, Vat.grab, and Vat.fold calls.

Here's how it's built:

with collateral_types as (
  select distinct
    'ethereum' as chain,
    in_i as ilk_id,
    safe_cast(from_hex(replace(replace(in_i, '0x', ''), '00', '')) as string format 'UTF8') as ilk_name
  from {{ source('sky_v1_ethereum', 'Vat_frob_function') }}
  union all
  select distinct 'ethereum' as chain, in_i as ilk_id,
    safe_cast(from_hex(replace(replace(in_i, '0x', ''), '00', '')) as string format 'UTF8') as ilk_name
  from {{ source('sky_v1_ethereum', 'Vat_grab_function') }}
  union all
  select distinct 'ethereum' as chain, in_i as ilk_id,
    safe_cast(from_hex(replace(replace(in_i, '0x', ''), '00', '')) as string format 'UTF8') as ilk_name
  from {{ source('sky_v1_ethereum', 'Vat_fold_function') }}
)
select distinct chain, ilk_id, ilk_name from collateral_types

Each row is one Sky collateral type, with the ilk identifier and decoded ilk name. When Sky onboards a new collateral, the dimension picks it up automatically.

4. Identify the facts

A fact is a measurement of something that happened. In Kimball's framework, facts are numeric, timestamped, and aggregatable. They sit at the center of the dimensional model.

Sky's TVL is built from the following fact tables, each capturing a different kind of collateral position:

  • fct_sky_vault_collaterals holds per-vault ink (collateral) balances per ilk, read from Vat state.
  • fct_sky_legacyrwa_collaterals holds legacy real-world asset positions, with value tracked onchain as art × rate.
  • fct_sky_nonvault_collaterals covers PSM and D3M-style ilks where vault accounting doesn't apply and collateral value equals debt.
  • fct_sky_lockstake_sky_supplies tracks Lockstake Sky token supply, derived from mint and burn Transfer events.

Each table is its own fact in the dimensional model, derived from upstream source fact tables that capture state at the event level. To make one concrete, here's fct_sky_vault_collaterals:

-- fct_sky_vault_collaterals: per-vault ink balances per ilk, daily.

-- Read raw ink from Vat state, joined with the ilk registry to attach
-- the underlying token address for each collateral type. Filter out
-- collateral types with no join contract (inactive).
with unfilled as (
  select
    inks.timestamp,
    inks.vault_id,
    inks.ilk,
    inks.urn,
    inks.chain,
    lower(json_value(infos.result, '$.gem')) as token_address,
    cast(json_value(inks.result, '$.ink') as numeric) as ink
  from {{ source('states_sky', 'fct_sky_vat_changed_inks') }} as inks
  inner join {{ source('states_sky', 'fct_sky_ilk_registry_infos') }} as infos
    on inks.ilk = infos.ilk
  where json_value(infos.result, '$.join') != '0x0000000000000000000000000000000000000000'
),

-- Forward-fill balances across days with no change events, so every
-- vault has a balance row for every day.
filled as (
  {{ fill_forward(
    table_name = 'unfilled',
    timestamp = 'timestamp',
    dimensions = ['vault_id', 'ilk', 'urn', 'chain', 'token_address'],
    expression = 'ink'
  ) }}
),

-- Read the set of currently active ilks from the ilk registry.
active_ilks as (
  select distinct
    timestamp,
    json_value(ilk, '$') as ilk
  from {{ source('states_sky', 'fct_sky_ilk_registry_ilks') }} as ilks,
  unnest(json_extract_array(ilks.raw_state_data)) as ilk
  where
    json_query(raw_state_data, '$.error') is null
    and raw_state_data is not null
),

-- Keep only balances for ilks that are still active on that day.
final as (
  select
    filled.timestamp,
    filled.vault_id,
    filled.ilk,
    filled.urn,
    filled.chain,
    filled.token_address,
    filled.ink
  from filled
  inner join active_ilks
    on filled.ilk = active_ilks.ilk
    and filled.timestamp = active_ilks.timestamp
)
select * from final

Read raw ink balances from Sky's Vat state via the upstream source fact tables, attach the underlying token address from the ilk registry, forward-fill days without change events, and keep only currently active ilks. The output is per-vault ink for every active ilk on every day, ready for TVL to aggregate.

5. Build the standardized metric

Sky's TVL measures the total USD value of all collateral assets deposited in the protocol. It represents the aggregate value of assets backing USDS across vault positions, non-vault collaterals, legacy real-world assets, and Lockstake Sky tokens.

TVL aggregates the four collateral facts into a single per-day USD figure, attributed via dim_sky_collateral_types:

  • Vault collaterals. Read ink balances per vault and ilk from Sky's Vat state. Convert raw ink from 18-decimal precision to token amounts, then to USD using end-of-day prices.
  • Non-vault collaterals. For special ilks where vault accounting doesn't apply (some PSM and direct-deposit structures), collateral value is computed as art × rate: debt amount multiplied by the per-ilk debt multiplier. This treats collateral value as equal to debt value for these positions.
  • Legacy RWA collaterals. Real-world asset ilks (identified by names beginning with the RWA prefix) use the same art × rate convention as non-vault collaterals. These represent older offchain assets with values tracked onchain.
  • Lockstake Sky collaterals. Track Lockstake Sky token supply by replaying Transfer events (mints from the zero address, burns to the zero address) and maintaining a running cumulative supply. Value the supply at the Sky token's end-of-day price.

To make the per-component logic concrete, here are the aggregations for each collateral type:

-- Vault deposits: sum ink across vaults per ilk, per day.
vault_collaterals_token as (
  select
    timestamp, vault_id, ilk, chain as chain_id, token_address,
    sum(ink / 1e18) as tvl
  from {{ ref('fct_sky_vault_collaterals') }}
  group by 1, 2, 3, 4, 5
),

-- Non-vault: collateral value equals debt, already in USD.
nonvault_collaterals_usd as (
  select
    timestamp, -1 as vault_id, ilk, chain as chain_id, token_address,
    sum(debt) as tvl
  from {{ ref('fct_sky_nonvault_collaterals') }}
  group by 1, 2, 3, 4, 5
),

-- Legacy RWA: same collateral = debt convention.
legacyrwa_collaterals_usd as (
  select
    timestamp, -1 as vault_id, ilk, chain as chain_id, token_address,
    sum(debt) as tvl
  from {{ ref('fct_sky_legacyrwa_collaterals') }}
  group by 1, 2, 3, 4, 5
),

-- Lockstake Sky: sum token supply per ilk, per day.
lockstake_sky_collaterals as (
  select
    timestamp, -1 as vault_id, ilk, chain as chain_id, token_address,
    sum(supply) as tvl
  from {{ ref('fct_sky_lockstake_sky_supplies') }}
  group by 1, 2, 3, 4, 5
)

Vault collaterals read ink in token units (converted to USD downstream); non-vault and legacy RWA take collateral value as equal to debt, already in USD; Lockstake Sky sums token supply, valued at the Sky token's end-of-day price. Finally (not shown), we combine and sum all four by day to produce the TVL number.

6. What this enables

What we walked through for Sky's TVL is the same process every metric runs through, across 100+ blockchains, 1,200+ protocols, and 3,000+ tokenized assets. It serves three audiences:

  • Our team. Every data and analytics engineer transforms data through the same fact and dimensional model. When the engineer who originally listed a project starts working on something else, the next one picks up the work with minimal context and handover.
  • Our users. We run the full pipeline from raw smart contract events to final metrics, and the dimensional model is the discipline applied at every step. Together they give every financial and usage metric a full audit trail back to the underlying contracts. The Sky TVL methodology is one example of what that looks like in production.
  • Our agents. An AI agent queries, edits, and extends the system the way a human does, with the schema as scaffolding. Combined with our knowledge base, this is already producing an order-of-magnitude improvement in the efficiency, accuracy, and speed of integrating new blockchains, protocols, and tokenized assets.

While crypto's design space is open, the data we publish from it stays standardized and comparable, on the dimensional model institutional finance has used since the 1990s.

The authors of this content, or members, affiliates, or stakeholders of Token Terminal may be participating or are invested in protocols or tokens mentioned herein. The foregoing statement acts as a disclosure of potential conflicts of interest and is not a recommendation to purchase or invest in any token or participate in any protocol. Token Terminal does not recommend any particular course of action in relation to any token or protocol. The content herein is meant purely for educational and informational purposes only, and should not be relied upon as financial, investment, legal, tax or any other professional or other advice. None of the content and information herein is presented to induce or to attempt to induce any reader or other person to buy, sell or hold any token or participate in any protocol or enter into, or offer to enter into, any agreement for or with a view to buying or selling any token or participating in any protocol. Statements made herein (including statements of opinion, if any) are wholly generic and not tailored to take into account the personal needs and unique circumstances of any reader or any other person. Readers are strongly urged to exercise caution and have regard to their own personal needs and circumstances before making any decision to buy or sell any token or participate in any protocol. Observations and views expressed herein may be changed by Token Terminal at any time without notice. Token Terminal accepts no liability whatsoever for any losses or liabilities arising from the use of or reliance on any of this content.

Stay in the loop

Join our mailing list to get the latest insights!

Subscribe to our weekly newsletter
Actionable insights you can’t get elsewhere.
© 2026 Token Terminal