Engineering
A run-through of Kimball's dimensional modeling applied to Sky's raw smart contract events.
Token Terminal
•
The previous post introduced our entity taxonomy and how it underpins metric standardization. In this post we show how the standardization actually works from raw smart contract events to a queryable metric.
Onchain is an open design space. Any developer can build any smart contract architecture, choose any design pattern, and name their events and functions however they want. That freedom makes standardizing comparable financial and usage metrics across 100+ blockchains, 1,200+ protocols, and 3,000+ tokenized assets hard.
Traditional analytics has faced the same problem for decades. Dimensional modeling has been the standard answer for thirty years. Ralph Kimball codified it in 1996 in The Data Warehouse Toolkit.
In short, the process follows these steps:
- Choose the business process being modeled: the activity to measure.
- Declare the grain of the data: what a single row represents (one event, one daily snapshot).
- Identify the dimensions: the descriptive context queries filter and group by.
- Identify the facts: the numeric measurements being aggregated.
While the process itself is simple, applying it well to onchain data is the hard part. We've chosen Sky as the worked example: it has a complex smart contract architecture and a deliberately strange vocabulary, both of which make it a good stress test for the dimensional model.
1. Choose the business process
Sky issues USDS, its decentralized stablecoin backed by a mix of onchain collateral and tokenized real-world assets.
The protocol's key business components are:
- Stablecoin issuance (Vaults). Users lock collateral (ETH, USDC, RWA tokens) and mint USDS. Borrowers pay a stability fee, an interest rate set per collateral type by governance.
- Sky Savings Rate (SSR). USDS holders deposit into a savings contract and earn yield, paid from collected stability fees.
- Peg Stability Module (PSM). The PSM swaps approved stablecoins (e.g. USDC) for USDS one-to-one. A small fee per swap accrues to the protocol.
- Direct Deposit Module (D3M). Sky deploys USDS into external lending venues (Aave, Morpho) and into semi-autonomous sub-DAOs within its own ecosystem (Spark, Grove) to put idle reserves to work. The portion actively borrowed earns yield that flows to the surplus. Whatever sits unlent counts as idle.
They all feed the protocol surplus through their own paths:
Vault user USDS holder
│ │
│ locks ETH/USDC/RWA │ deposits USDS
│ mints USDS │ into SSR
▼ ▼
┌──────────────────────────────────────────────────────────────┐
│ Sky Vaults │
│ ┌────────────────┐ ┌────────────────────────────┐ │
│ │ Collateral │ │ USDS minted │ │
│ │ locked │ │ → in circulation │ │
│ │ (= TVL) │ │ (= Outstanding supply) │ │
│ └────────────────┘ └────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
│
│ Stability fees (= Fees)
▼
┌── Splits at ──┐
│ SSR rate │
(paid out) (retained)
│ │
▼ ▼
USDS ┌────────────────────────────┐
holders │ Sky Surplus Buffer │
(savings │ (= Revenue) │
yield) └────────────────────────────┘
▲ ▲
│ │
PSM fee D3M income
(per- (lending
swap) yield)
Stability fees from vault borrowers split between savings depositors and the surplus buffer. PSM swap fees and D3M lending yield flow directly to the surplus.
Sky inherits a deliberately strange vocabulary from MakerDAO's original contract design. The terms used in the rest of this post:
- The vault accounting contract is the
Vat. All collateral and debt state lives here. - Collateral types are
ilks(e.g.ETH-A,USDC-A). - A single user's vault state is an
urn. inkis the collateral balance in an urn (raw token units before USD conversion).artis the debt amount in an urn, before rate scaling.- Each ilk has a
rate, a per-block debt multiplier. The actual debt equalsart × rate. - Each ilk has a
gem, the underlying ERC-20 token address. - Vault state changes happen through
Vatfunction calls:frob(open or modify a vault),grab(liquidate),fold(rate update).
The protocol invented these terms on purpose: each is an idiosyncratic word that means exactly one thing inside the protocol, so contributors reasoning about contract behavior cannot confuse them with anything else.
Sky's architecture and vocabulary are dense. To go deep on the dimensional model without overloading the post, we focus on one metric from here: TVL.
2. Declare the grain
Facts capture quantitative state at a defined level of detail. Kimball calls this the grain.
For Sky's TVL, the grain is per-vault, per-day: each row in fct_sky_vault_collaterals is one vault's collateral balance on one day.
timestamp vault_id ilk ink token_address
─────────────────────────────────────────────────────────────────
2024-12-01 1234 ETH-A 5.0 0xC02a... (WETH)
2024-12-01 1235 USDC-A 50,000 0xA0b8... (USDC)
2024-12-01 1236 WBTC-A 0.15 0x2260... (WBTC)
Each row is one vault's collateral balance on one day. ink is the onchain balance from Sky's Vat state; ilk identifies the collateral type; token_address is the underlying token used for USD conversion.
3. Identify the dimensions
A dimension is the descriptive context attached to a fact. In Kimball's framework, dimensions answer the who, what, where, when, why, and how of an event. Queries filter and group on them. Each dimension table sits as a point of a star around the central fact table; every query joins facts to one or more dimensions.
Inside a project's transformation, the dimensions that matter are the protocol-specific ones. For Sky, that's the collateral type backing each vault.
Sky's TVL uses one dimension:
dim_sky_collateral_typesholds one row per ilk, with the bytes32 identifier decoded to a human-readable name. The table auto-populates fromVat.frob,Vat.grab, andVat.foldcalls.
Here's how it's built:
with collateral_types as (
select distinct
'ethereum' as chain,
in_i as ilk_id,
safe_cast(from_hex(replace(replace(in_i, '0x', ''), '00', '')) as string format 'UTF8') as ilk_name
from {{ source('sky_v1_ethereum', 'Vat_frob_function') }}
union all
select distinct 'ethereum' as chain, in_i as ilk_id,
safe_cast(from_hex(replace(replace(in_i, '0x', ''), '00', '')) as string format 'UTF8') as ilk_name
from {{ source('sky_v1_ethereum', 'Vat_grab_function') }}
union all
select distinct 'ethereum' as chain, in_i as ilk_id,
safe_cast(from_hex(replace(replace(in_i, '0x', ''), '00', '')) as string format 'UTF8') as ilk_name
from {{ source('sky_v1_ethereum', 'Vat_fold_function') }}
)
select distinct chain, ilk_id, ilk_name from collateral_types
Each row is one Sky collateral type, with the ilk identifier and decoded ilk name. When Sky onboards a new collateral, the dimension picks it up automatically.
4. Identify the facts
A fact is a measurement of something that happened. In Kimball's framework, facts are numeric, timestamped, and aggregatable. They sit at the center of the dimensional model.
Sky's TVL is built from the following fact tables, each capturing a different kind of collateral position:
fct_sky_vault_collateralsholds per-vaultink(collateral) balances per ilk, read fromVatstate.fct_sky_legacyrwa_collateralsholds legacy real-world asset positions, with value tracked onchain asart × rate.fct_sky_nonvault_collateralscovers PSM and D3M-style ilks where vault accounting doesn't apply and collateral value equals debt.fct_sky_lockstake_sky_suppliestracks Lockstake Sky token supply, derived from mint and burnTransferevents.
Each table is its own fact in the dimensional model, derived from upstream source fact tables that capture state at the event level. To make one concrete, here's fct_sky_vault_collaterals:
-- fct_sky_vault_collaterals: per-vault ink balances per ilk, daily.
-- Read raw ink from Vat state, joined with the ilk registry to attach
-- the underlying token address for each collateral type. Filter out
-- collateral types with no join contract (inactive).
with unfilled as (
select
inks.timestamp,
inks.vault_id,
inks.ilk,
inks.urn,
inks.chain,
lower(json_value(infos.result, '$.gem')) as token_address,
cast(json_value(inks.result, '$.ink') as numeric) as ink
from {{ source('states_sky', 'fct_sky_vat_changed_inks') }} as inks
inner join {{ source('states_sky', 'fct_sky_ilk_registry_infos') }} as infos
on inks.ilk = infos.ilk
where json_value(infos.result, '$.join') != '0x0000000000000000000000000000000000000000'
),
-- Forward-fill balances across days with no change events, so every
-- vault has a balance row for every day.
filled as (
{{ fill_forward(
table_name = 'unfilled',
timestamp = 'timestamp',
dimensions = ['vault_id', 'ilk', 'urn', 'chain', 'token_address'],
expression = 'ink'
) }}
),
-- Read the set of currently active ilks from the ilk registry.
active_ilks as (
select distinct
timestamp,
json_value(ilk, '$') as ilk
from {{ source('states_sky', 'fct_sky_ilk_registry_ilks') }} as ilks,
unnest(json_extract_array(ilks.raw_state_data)) as ilk
where
json_query(raw_state_data, '$.error') is null
and raw_state_data is not null
),
-- Keep only balances for ilks that are still active on that day.
final as (
select
filled.timestamp,
filled.vault_id,
filled.ilk,
filled.urn,
filled.chain,
filled.token_address,
filled.ink
from filled
inner join active_ilks
on filled.ilk = active_ilks.ilk
and filled.timestamp = active_ilks.timestamp
)
select * from final
Read raw ink balances from Sky's Vat state via the upstream source fact tables, attach the underlying token address from the ilk registry, forward-fill days without change events, and keep only currently active ilks. The output is per-vault ink for every active ilk on every day, ready for TVL to aggregate.
5. Build the standardized metric
Sky's TVL measures the total USD value of all collateral assets deposited in the protocol. It represents the aggregate value of assets backing USDS across vault positions, non-vault collaterals, legacy real-world assets, and Lockstake Sky tokens.
TVL aggregates the four collateral facts into a single per-day USD figure, attributed via dim_sky_collateral_types:
- Vault collaterals. Read
inkbalances per vault and ilk from Sky'sVatstate. Convert rawinkfrom 18-decimal precision to token amounts, then to USD using end-of-day prices. - Non-vault collaterals. For special ilks where vault accounting doesn't apply (some PSM and direct-deposit structures), collateral value is computed as
art × rate: debt amount multiplied by the per-ilk debt multiplier. This treats collateral value as equal to debt value for these positions. - Legacy RWA collaterals. Real-world asset ilks (identified by names beginning with the
RWAprefix) use the sameart × rateconvention as non-vault collaterals. These represent older offchain assets with values tracked onchain. - Lockstake Sky collaterals. Track Lockstake Sky token supply by replaying
Transferevents (mints from the zero address, burns to the zero address) and maintaining a running cumulative supply. Value the supply at the Sky token's end-of-day price.
To make the per-component logic concrete, here are the aggregations for each collateral type:
-- Vault deposits: sum ink across vaults per ilk, per day.
vault_collaterals_token as (
select
timestamp, vault_id, ilk, chain as chain_id, token_address,
sum(ink / 1e18) as tvl
from {{ ref('fct_sky_vault_collaterals') }}
group by 1, 2, 3, 4, 5
),
-- Non-vault: collateral value equals debt, already in USD.
nonvault_collaterals_usd as (
select
timestamp, -1 as vault_id, ilk, chain as chain_id, token_address,
sum(debt) as tvl
from {{ ref('fct_sky_nonvault_collaterals') }}
group by 1, 2, 3, 4, 5
),
-- Legacy RWA: same collateral = debt convention.
legacyrwa_collaterals_usd as (
select
timestamp, -1 as vault_id, ilk, chain as chain_id, token_address,
sum(debt) as tvl
from {{ ref('fct_sky_legacyrwa_collaterals') }}
group by 1, 2, 3, 4, 5
),
-- Lockstake Sky: sum token supply per ilk, per day.
lockstake_sky_collaterals as (
select
timestamp, -1 as vault_id, ilk, chain as chain_id, token_address,
sum(supply) as tvl
from {{ ref('fct_sky_lockstake_sky_supplies') }}
group by 1, 2, 3, 4, 5
)
Vault collaterals read ink in token units (converted to USD downstream); non-vault and legacy RWA take collateral value as equal to debt, already in USD; Lockstake Sky sums token supply, valued at the Sky token's end-of-day price. Finally (not shown), we combine and sum all four by day to produce the TVL number.
6. What this enables
What we walked through for Sky's TVL is the same process every metric runs through, across 100+ blockchains, 1,200+ protocols, and 3,000+ tokenized assets. It serves three audiences:
- Our team. Every data and analytics engineer transforms data through the same fact and dimensional model. When the engineer who originally listed a project starts working on something else, the next one picks up the work with minimal context and handover.
- Our users. We run the full pipeline from raw smart contract events to final metrics, and the dimensional model is the discipline applied at every step. Together they give every financial and usage metric a full audit trail back to the underlying contracts. The Sky TVL methodology is one example of what that looks like in production.
- Our agents. An AI agent queries, edits, and extends the system the way a human does, with the schema as scaffolding. Combined with our knowledge base, this is already producing an order-of-magnitude improvement in the efficiency, accuracy, and speed of integrating new blockchains, protocols, and tokenized assets.
While crypto's design space is open, the data we publish from it stays standardized and comparable, on the dimensional model institutional finance has used since the 1990s.
The authors of this content, or members, affiliates, or stakeholders of Token Terminal may be participating or are invested in protocols or tokens mentioned herein. The foregoing statement acts as a disclosure of potential conflicts of interest and is not a recommendation to purchase or invest in any token or participate in any protocol. Token Terminal does not recommend any particular course of action in relation to any token or protocol. The content herein is meant purely for educational and informational purposes only, and should not be relied upon as financial, investment, legal, tax or any other professional or other advice. None of the content and information herein is presented to induce or to attempt to induce any reader or other person to buy, sell or hold any token or participate in any protocol or enter into, or offer to enter into, any agreement for or with a view to buying or selling any token or participating in any protocol. Statements made herein (including statements of opinion, if any) are wholly generic and not tailored to take into account the personal needs and unique circumstances of any reader or any other person. Readers are strongly urged to exercise caution and have regard to their own personal needs and circumstances before making any decision to buy or sell any token or participate in any protocol. Observations and views expressed herein may be changed by Token Terminal at any time without notice. Token Terminal accepts no liability whatsoever for any losses or liabilities arising from the use of or reliance on any of this content.
Stay in the loop
Join our mailing list to get the latest insights!