Token identity & enrichment
Summary
Ingesters frequently see only partial token identity (an Etherscan
contract address, a CoinGecko slug, a Kraken asset code) and need a
canonical token row to attach a holding, transaction, or observation
to.
TokenIdentityService.findOrCreateByIdentity() is the federated flow
that resolves any partial identity into a fully-materialised token,
enriching it across every registered provider in parallel and
persisting the merged result. A weekly backfill-token-identity job
re-enriches stale rows so provider metadata stays current as
upstream definitions evolve.
The resolution flow
For an incoming identity (any subset of {symbol, type, market segment, chainId, contractAddress, coingeckoId, krakenAsset, ...}):
- EVM contract lookup by (chainId, contractAddress) against the providerMetadata.etherscan jsonb path. If found, return.
- (symbol, typeId, marketSegment) lookup. If found, return.
- Parallel enrichment via every registered TokenIdentityProvider:
  - CoinGecko: adds providerMetadata.coingecko.
  - DeFiLlama: adds providerMetadata.defillama (the "chain:address" coin spec used by DeFiLlama’s pricing API).
  - Etherscan: adds providerMetadata.etherscan (chain + contract).
  - Kraken: adds providerMetadata.kraken (raw asset code).
  - Finnhub: adds providerMetadata.finnhub (stock symbol + exchange).
  - Solana: adds providerMetadata.solana (SPL mint address).
  - …new providers plug in here.
- Persist the new token row with fully-enriched providerMetadata.
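The lookup order and the enrichment fan-out above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in TokenIdentityService.ts: the type shapes, the in-memory token list, the helper names findExisting and enrichAll, the case-insensitive address comparison, and the choice to ignore a failing provider are all hypothetical.

```typescript
// Hypothetical shapes for illustration; the real types may differ.
type PartialIdentity = {
  symbol?: string;
  typeId?: string;
  marketSegment?: string;
  chainId?: number;
  contractAddress?: string;
};

type ProviderMetadata = { [key: string]: unknown };

type Token = {
  id: number;
  symbol?: string;
  typeId?: string;
  marketSegment?: string;
  providerMetadata: ProviderMetadata;
};

type EtherscanMetadata = { chainId?: number; contractAddress?: string };

interface TokenIdentityProvider {
  namespace: string; // e.g. "coingecko", "kraken"
  enrich(identity: PartialIdentity): Promise<unknown>;
}

// The two lookups, tried in order against an in-memory list (a stand-in for the database).
function findExisting(tokens: Token[], identity: PartialIdentity): Token | undefined {
  const { chainId, contractAddress, symbol, typeId, marketSegment } = identity;

  if (chainId !== undefined && contractAddress !== undefined) {
    // Assumption: addresses are compared case-insensitively.
    const wanted = contractAddress.toLowerCase();
    const byContract = tokens.find((t) => {
      const meta = t.providerMetadata["etherscan"] as EtherscanMetadata | undefined;
      return (
        meta !== undefined &&
        meta.chainId === chainId &&
        typeof meta.contractAddress === "string" &&
        meta.contractAddress.toLowerCase() === wanted
      );
    });
    if (byContract !== undefined) return byContract;
  }

  if (symbol !== undefined && typeId !== undefined && marketSegment !== undefined) {
    return tokens.find(
      (t) => t.symbol === symbol && t.typeId === typeId && t.marketSegment === marketSegment,
    );
  }
  return undefined;
}

// The enrichment step: ask every provider at once and key each answer by its namespace.
async function enrichAll(
  providers: TokenIdentityProvider[],
  identity: PartialIdentity,
): Promise<ProviderMetadata> {
  const results = await Promise.all(
    providers.map(async (p) => {
      // Assumption: a provider that throws simply contributes no namespace.
      const value = await p.enrich(identity).catch(() => undefined);
      return { namespace: p.namespace, value };
    }),
  );
  const merged: ProviderMetadata = {};
  for (const r of results) {
    if (r.value !== undefined) merged[r.namespace] = r.value;
  }
  return merged;
}
```

A caller would run findExisting first and only fall through to enrichAll, followed by the insert, when neither lookup matches.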
The flow is deterministic per identity input — if two ingest paths discover the same contract concurrently, both produce the same row (the unique constraint guarantees one wins; the other reads the winner).
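The concurrent case can be illustrated with an in-memory stand-in for the unique constraint. This is only a sketch: InMemoryTokenTable, insertIfAbsent, and findOrCreate are hypothetical names, and a real implementation would presumably rely on the database's unique index rather than a Map.

```typescript
type TokenRow = { id: number; uniqueKey: string; createdBy: string };

// Stand-in for a table with a unique index on uniqueKey (hypothetical).
class InMemoryTokenTable {
  private rows = new Map<string, TokenRow>();
  private nextId = 1;

  // Returns the new row, or undefined when the key is already taken
  // (the analogue of a unique-constraint violation).
  insertIfAbsent(uniqueKey: string, createdBy: string): TokenRow | undefined {
    if (this.rows.has(uniqueKey)) return undefined;
    const row: TokenRow = { id: this.nextId++, uniqueKey, createdBy };
    this.rows.set(uniqueKey, row);
    return row;
  }

  get(uniqueKey: string): TokenRow | undefined {
    return this.rows.get(uniqueKey);
  }
}

// Try to insert; on a collision, read back the row the other writer created.
function findOrCreate(table: InMemoryTokenTable, uniqueKey: string, createdBy: string): TokenRow {
  const inserted = table.insertIfAbsent(uniqueKey, createdBy);
  if (inserted !== undefined) return inserted; // this caller won the race
  const winner = table.get(uniqueKey);
  if (winner === undefined) throw new Error(`row for ${uniqueKey} disappeared`);
  return winner;
}
```

Both callers end up holding the same row id, which is the property the paragraph above describes.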
First-writer-wins per namespace
providerMetadata is a single jsonb with one key per provider. When
two providers disagree on the same namespace (rare — usually means
the upstreams genuinely disagree), the first to populate that
namespace wins; the conflict is logged.
This is why adding a new provider is a small change: the new provider tags its own namespace, and existing providers’ data is untouched.
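A first-writer-wins merge over namespaces might look like the sketch below. The function name mergeFirstWriterWins, the returned conflicts list (standing in for the log entry), and the JSON-based equality check are assumptions made for illustration, and the sample values in the usage note are made up.

```typescript
type NamespacedMetadata = { [key: string]: unknown };

type MergeOutcome = { merged: NamespacedMetadata; conflicts: string[] };

function mergeFirstWriterWins(
  existing: NamespacedMetadata,
  incoming: NamespacedMetadata,
): MergeOutcome {
  const merged: NamespacedMetadata = { ...existing };
  const conflicts: string[] = [];

  for (const key of Object.keys(incoming)) {
    if (!Object.prototype.hasOwnProperty.call(merged, key)) {
      // Namespace was empty: the incoming value is the first writer.
      merged[key] = incoming[key];
    } else if (JSON.stringify(merged[key]) !== JSON.stringify(incoming[key])) {
      // Namespace already populated with a different value: keep the
      // existing one and report the disagreement. Note that this
      // comparison is sensitive to object key order.
      conflicts.push(key);
    }
  }
  return { merged, conflicts };
}
```

For example, merging `{ kraken: { asset: "XXBT" } }` with `{ kraken: { asset: "XBT" }, coingecko: { id: "bitcoin" } }` keeps the existing kraken entry, adds the coingecko entry, and reports kraken as a conflict.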
The weekly re-enrichment cron
backfill-token-identity runs weekly on Sunday at 02:00 UTC. It
picks tokens that haven’t been touched by an ingester recently and
re-runs the parallel enrichment pass. The job is the safety net for:
- Providers whose definitions evolve (a stock listing moves, a token migrates contract addresses).
- Tokens initially materialised with only one provider’s data — the weekly pass picks up new providers that didn’t exist or weren’t configured at original creation.
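The schedule and the stale-row selection could be expressed along these lines. The five-field cron string is the standard encoding of "Sunday at 02:00" and assumes a scheduler that evaluates it in UTC; the seven-day staleness window, the lastTouchedAt field, and selectStaleTokens are hypothetical, since this page does not state how "recently" is defined.

```typescript
// Standard 5-field cron: minute hour day-of-month month day-of-week (0 = Sunday).
const WEEKLY_SUNDAY_0200 = "0 2 * * 0";

// Hypothetical staleness window; the real job's cutoff is not documented here.
const STALE_AFTER_MS = 7 * 24 * 60 * 60 * 1000;

type EnrichableToken = { id: number; lastTouchedAt: Date };

// Pick tokens that no ingester has touched within the staleness window.
function selectStaleTokens(
  tokens: EnrichableToken[],
  now: Date,
  staleAfterMs: number = STALE_AFTER_MS,
): EnrichableToken[] {
  const cutoff = now.getTime() - staleAfterMs;
  return tokens.filter((t) => t.lastTouchedAt.getTime() < cutoff);
}
```

Each selected token would then go through the same parallel enrichment pass used at creation time.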
Scam / unpriceable flags
Two flags on tokens are managed alongside identity:
- isScamProbability: float 0–1. Tokens above the threshold (SCAM_PROBABILITY_THRESHOLD) are excluded from totals by the inclusion rule. The score is populated by enrichment passes that consult provider hints (CoinGecko’s is_scam, DeFiLlama’s blacklist, …).
- unpriceableUntil: timestamp. Set when the historical-price backfill has tried and failed to find prices, so the next pass skips this token instead of re-asking the same providers. Cleared on the next successful price write.
Both flags follow an established-then-trusted model: a single missing price doesn’t flag the token, but a sustained inability does.
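How the two flags gate behaviour can be sketched as below. The threshold value of 0.5 is a placeholder (the real SCAM_PROBABILITY_THRESHOLD is not given on this page), the nullable field types are assumed, and the three helper names are hypothetical.

```typescript
// Placeholder value; the actual threshold is defined elsewhere in the codebase.
const SCAM_PROBABILITY_THRESHOLD = 0.5;

type FlaggedToken = {
  isScamProbability: number | null; // 0 to 1, null when not yet scored (assumed)
  unpriceableUntil: Date | null; // null when the token is considered priceable (assumed)
};

// Tokens strictly above the threshold are left out of totals.
function isExcludedFromTotals(token: FlaggedToken): boolean {
  return token.isScamProbability !== null && token.isScamProbability > SCAM_PROBABILITY_THRESHOLD;
}

// Assumption: the price backfill skips a token while "now" is before unpriceableUntil.
function shouldSkipPriceBackfill(token: FlaggedToken, now: Date): boolean {
  return token.unpriceableUntil !== null && now.getTime() < token.unpriceableUntil.getTime();
}

// A successful price write clears the unpriceable marker.
function onSuccessfulPriceWrite(token: FlaggedToken): FlaggedToken {
  return { ...token, unpriceableUntil: null };
}
```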
Where to look
| File | Role |
|---|---|
| packages/business/domain/src/services/tokens/TokenIdentityService.ts | The federated resolution flow. |
| packages/clients/providers/src/ (per-provider directories) | Each provider’s TokenIdentityProvider adapter. |
| packages/business/jobs/src/scheduled-jobs/backfill-token-identity.ts | Weekly cron descriptor. |