5. Verification Engine Execution Model
5.1 Seven-Phase Evaluation Pipeline
The Verification Engine processes claims through seven sequential phases:
Phase 1: CLAIM_VALIDATION
- Parse and validate claim schema
- Verify predicate exists in registry
- Verify subject type compatibility
- Verify context refers to finalized blocks
- Output: Validated claim or rejection
Phase 2: SUBJECT_RESOLUTION
- Resolve subject to on-chain entities
- For inferred subjects: verify the inference method produced this subject
- Retrieve subject metadata (token name, decimals, program owner)
- Output: Resolved subject with canonical identifiers
Phase 3: EVIDENCE_COLLECTION
- Determine admissible evidence sources for this predicate
- Retrieve evidence units from chain RPC providers
- Apply BCE canonicalization to each evidence unit
- Verify evidence provenance and integrity
- Output: Set of admissible, canonicalized evidence units
Phase 4: GRAPH_CONSTRUCTION
- Build the Evidence Graph from collected evidence (Section 4.4)
- Enforce scope constraints
- Annotate graph with entity labels and derived properties
- Output: Scoped Evidence Graph
Phase 5: EVALUATION
- Execute evaluation logic for the claim's predicate
- For DETERMINISTIC predicates: compute the result directly from graph state
- For INFERENTIAL predicates: apply versioned inference methods (Section 5.3)
- Output: Raw evaluation result with confidence and bounds
Phase 6: QUALIFICATION
- Determine qualification state based on evaluation result (Section 6.2)
- Apply qualification rules mechanically
- Output: Qualification (VERIFIED | INFERRED | OBSERVED | INCONCLUSIVE | UNQUALIFIED)
Phase 7: ASSEMBLY
- Construct the Verification Object (Section 6.1)
- Compute content-addressed voId
- Attach evidence references, method versions, timestamps
- Output: Complete Verification Object
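The seven phases above can be sketched as a simple sequential pipeline. This is a minimal illustration, not the engine implementation: the `Phase` enum and `run_pipeline` helper are hypothetical names, and the pass-through handlers stand in for the real phase logic.

```python
from enum import Enum

class Phase(Enum):
    CLAIM_VALIDATION = 1
    SUBJECT_RESOLUTION = 2
    EVIDENCE_COLLECTION = 3
    GRAPH_CONSTRUCTION = 4
    EVALUATION = 5
    QUALIFICATION = 6
    ASSEMBLY = 7

def run_pipeline(claim, handlers):
    """Thread the claim through each phase in definition order.

    A handler returning None models the rejection path (e.g. a schema
    failure in Phase 1 short-circuits the remaining phases).
    """
    state = claim
    for phase in Phase:
        state = handlers[phase](state)
        if state is None:
            return {"rejectedAt": phase.name}
    return state

# Hypothetical pass-through handlers for illustration
handlers = {p: (lambda s: s) for p in Phase}
result = run_pipeline({"claimId": "c1"}, handlers)
```

A real engine would replace each handler with the phase logic described above; the short-circuit on `None` captures the "Validated claim or rejection" output of Phase 1.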
5.2 Deterministic Evaluation
Deterministic evaluations yield the same output for any verifier executing the same engine version against the same historical anchors.
Specification per Predicate:
supply_concentration
INPUT: Token mint, context block range, N (top holder count)
METHOD:
1. Retrieve all token accounts at context.blockTo
2. Sort by balance descending
3. Compute total_supply from token supply at context.blockTo
4. Compute top_n_pct = sum(top_n_balances) / total_supply * 100
OUTPUT: { topN, concentration: top_n_pct, threshold: claim_threshold, result: PASS | FAIL }
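The top-N computation can be written in a few lines of integer arithmetic. A minimal sketch, assuming raw integer balances retrieved at `context.blockTo` and the fixedpoint64 scale of 10^8 with truncating division; the function name is illustrative.

```python
def supply_concentration(balances, total_supply, n, scale=10**8):
    """Top-N holder concentration as a fixedpoint64 percentage (scale 10^8).

    balances: raw integer token-account balances at context.blockTo.
    Truncating integer division, per the spec's fixed-point rules.
    """
    top_n = sorted(balances, reverse=True)[:n]
    return sum(top_n) * 100 * scale // total_supply

# 10 holders, total supply 200; top 3 hold 50 + 40 + 30 = 120 → 60%
pct = supply_concentration([50, 30, 20, 40, 25, 15, 10, 5, 3, 2], 200, 3)
# pct == 60_00000000
```

The `result: PASS | FAIL` field then falls out of a single integer comparison against the claim threshold at the same scale.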
liquidity_depth
INPUT: Pool address, sell amount (denominated in tokenB, e.g., SOL/ETH), max price impact %
DEFINITIONS:
reserveA = pool reserve of tokenA (the token being evaluated)
reserveB = pool reserve of tokenB (the quote asset: SOL or ETH)
sellAmount = amount of tokenB being sold into the pool (buying tokenA)
k = reserveA * reserveB (constant product invariant)
METHOD:
1. Retrieve pool reserves (reserveA, reserveB) at context.blockTo
2. Compute tokens received using constant product formula:
amountOut = reserveA * sellAmount / (reserveB + sellAmount)
(This is the actual tokenA received. Derived from: k = reserveA * reserveB,
new_reserveA = k / (reserveB + sellAmount), amountOut = reserveA - new_reserveA)
3. Compute expected amount at current spot price:
expectedOut = reserveA * sellAmount / reserveB
4. Compute execution price impact (slippage vs. spot):
price_impact_pct = (expectedOut - amountOut) * 100 * SCALE / expectedOut
Simplified: price_impact_pct = sellAmount * 100 * SCALE / (reserveB + sellAmount)
(fixedpoint64, scale 10^8; truncating integer division; see Appendix B, Section B.2)
5. Compare computed impact to claim threshold
OUTPUT: {
sellAmount,
amountOut,
expectedOut,
priceImpact: price_impact_pct, // fixedpoint64 (e.g., 9_09090909 = 9.09%)
threshold: max_impact,
result: PASS | FAIL
}
Note: This computes execution price impact (slippage between the effective trade price and the current spot price), NOT spot price change (which measures how the marginal price moves after the trade). Execution price impact is the standard metric used in DeFi interfaces and is what a trader actually experiences. For reference: spot price change for the same trade would be sellAmount * (2 * reserveB + sellAmount) * SCALE / (reserveB * (reserveB + sellAmount)), which is approximately 2x the execution price impact for small trades.
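The constant-product math above reduces to two integer expressions. A sketch in Python (one of the implementation languages named in Section 5.4), using the simplified form from step 4 and the fixedpoint64 conventions; function names are illustrative.

```python
SCALE = 10**8  # fixedpoint64 scale (Appendix B, Section B.2)

def amount_out(reserve_a, reserve_b, sell_amount):
    """Constant-product output: tokenA received for sellAmount of tokenB.

    amountOut = reserveA * sellAmount / (reserveB + sellAmount), truncating.
    """
    return reserve_a * sell_amount // (reserve_b + sell_amount)

def price_impact_pct(reserve_b, sell_amount):
    """Execution price impact, simplified form from step 4:

    price_impact_pct = sellAmount * 100 * SCALE / (reserveB + sellAmount)
    """
    return sell_amount * 100 * SCALE // (reserve_b + sell_amount)

# Selling 100 SOL into a pool with reserveB = 1000 SOL: impact = 100/1100
impact = price_impact_pct(1000, 100)
# impact == 9_09090909, i.e. 9.0909...%, matching the OUTPUT example above
```

Note that the simplified impact formula depends only on `reserveB` and `sellAmount`; `reserveA` cancels out of the ratio `(expectedOut - amountOut) / expectedOut`.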
holder_distribution
INPUT: Token mint, context block range
METHOD:
1. Retrieve all token accounts at context.blockTo
2. Compute Gini coefficient: G = 1 - 2 * integral(Lorenz_curve)
3. Compute HHI: H = sum(share_i^2) for all holders
4. Compute Nakamoto coefficient: min holders for 51% control
OUTPUT: { gini, hhi, nakamoto, holderCount, top1Pct, top10Pct }
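The three distribution metrics can each be computed with integer-only arithmetic. A minimal sketch, assuming raw integer balances: the Gini uses a discrete form consistent with G = 1 - 2 * integral(Lorenz_curve), shares are taken at scale 10^8 with truncating division, and the Nakamoto loop follows the spec's "51% control" wording. Function names are illustrative.

```python
SCALE = 10**8  # fixedpoint64 scale

def gini(balances):
    """Discrete Gini: (2 * sum(i * x_i) - (n+1) * sum(x)) / (n * sum(x)),
    balances sorted ascending and 1-indexed, result at scale 10^8."""
    xs = sorted(balances)
    n, total = len(xs), sum(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted - (n + 1) * total) * SCALE // (n * total)

def hhi(balances):
    """HHI = sum(share_i^2); shares at scale 10^8, result at scale 10^8."""
    total = sum(balances)
    return sum((b * SCALE // total) ** 2 for b in balances) // SCALE

def nakamoto(balances, total_supply):
    """Minimum number of holders whose combined balance reaches 51% of supply."""
    acc = 0
    for i, b in enumerate(sorted(balances, reverse=True), start=1):
        acc += b
        if acc * 100 >= 51 * total_supply:
            return i
    return len(balances)
```

For a sanity check: four equal holders give a Gini of 0, and a single holder of the entire supply among four accounts gives (n-1)/n = 0.75.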
contract_security
INPUT: Contract/program address
METHOD:
1. Retrieve program authority status (Solana: upgrade authority)
2. Check mint authority (disabled = safer)
3. Check freeze authority (disabled = safer)
4. For EVM: check ownership renounced, proxy patterns, hidden functions
OUTPUT: {
mintAuthority: ACTIVE | DISABLED,
freezeAuthority: ACTIVE | DISABLED,
upgradeAuthority: ACTIVE | DISABLED | RENOUNCED,
flags: string[]
}
buy_sell_pressure
INPUT: Token mint, context block range
METHOD:
1. Retrieve all swap/trade events within context window
2. Classify as BUY or SELL based on token flow direction
3. Compute: buy_count, sell_count, buy_volume_sol, sell_volume_sol
4. Compute ratios: buy_sell_ratio = buy_count / (buy_count + sell_count)
OUTPUT: { buyCount, sellCount, buyVolume, sellVolume, ratio, uniqueTraders }
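The classification and ratio steps can be sketched directly against a list of trade events. This is an illustrative shape, assuming events are already classified as BUY or SELL by token flow direction and reduced to `(side, sol_amount, trader)` tuples; the ratio uses truncating division at scale 10^8 with the zero-trade case yielding 0.

```python
SCALE = 10**8

def buy_sell_pressure(events):
    """events: iterable of (side, sol_amount, trader), side in {"BUY", "SELL"}.

    ratio = buy_count / (buy_count + sell_count), fixedpoint64 at scale 10^8.
    """
    buys = [e for e in events if e[0] == "BUY"]
    sells = [e for e in events if e[0] == "SELL"]
    total = len(buys) + len(sells)
    return {
        "buyCount": len(buys),
        "sellCount": len(sells),
        "buyVolume": sum(e[1] for e in buys),
        "sellVolume": sum(e[1] for e in sells),
        "ratio": len(buys) * SCALE // total if total else 0,
        "uniqueTraders": len({e[2] for e in events}),
    }

stats = buy_sell_pressure([("BUY", 5, "w1"), ("BUY", 3, "w2"), ("SELL", 4, "w1")])
# ratio = 2/3 at scale 10^8 → 66666666 (truncating)
```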
5.3 Inferential Evaluation Methods
Inference methods are versioned and disclosed. Each method version MUST produce byte-identical output for the same input evidence graph, regardless of implementation language or hardware.
Determinism requirements for all inference methods:
- Algorithm choice is pinned per method version — no "or" alternatives
- All node/edge iteration MUST follow canonical ordering: lexicographic by nodeId (UTF-8 byte order), matching BFS ordering from Section 4.5
- All arithmetic MUST use fixedpoint64 integer operations (Appendix B, Section B.2) — no IEEE 754 floating-point
- Confidence values are computed by exact formulas, not ranges
- Random or stochastic algorithms are prohibited; all methods must be fully deterministic given the input graph
Method: wallet_clustering
Version: 1.0.0 Type: Graph-based community detection (Louvain, deterministic variant)
Algorithm:
- Construct funding graph from Evidence Graph (FUNDING edges only)
- Apply heuristic layer:
  - Common funder: wallets funded by same source within 24h → merge
  - Timing correlation: transactions within same block → flag
  - Amount similarity: transfer amounts within 1% variance (fixedpoint64 comparison) → flag
  - Gas/priority pattern: identical gas settings → flag
- Apply Louvain community detection (deterministic variant):
  - Phase 1 (Local moves): Process nodes in lexicographic nodeId order. For each node, evaluate community reassignment by computing the modularity delta for each neighboring community. Select the community with the highest positive modularity delta. Ties are broken by selecting the community whose canonical ID (smallest nodeId among members) is lexicographically first. If no positive delta exists, the node stays in its current community.
  - Phase 2 (Aggregation): Collapse each community into a super-node. The super-node ID is the lexicographically smallest nodeId of its members. Edge weights between super-nodes are summed.
  - Repeat Phase 1 and Phase 2 until no node changes community in a full pass of Phase 1.
  - Resolution parameter: γ = 1.0 (standard modularity). This is fixed per method version and MUST NOT be configurable at evaluation time.
  - Edge weights: Each FUNDING edge has weight 1. Multiple funding edges between the same pair are summed.
- Merge heuristic flags with Louvain community results: each Louvain community is one cluster; heuristic merges that span Louvain communities combine those communities
- For each cluster: compute confidence from heuristic count using the formula below
- Output: list of clusters with member wallets (sorted by nodeId), cluster size, confidence
Confidence Formula:
heuristic_count = count of distinct heuristic types matched for this cluster (0-4)
confidence = min(100, 30 + (heuristic_count - 1) * 20 + heuristic_count * 5)   // for heuristic_count >= 1
confidence = 20                                                                // for heuristic_count = 0 (Louvain-only cluster)
| Heuristics Matched | Confidence (exact) |
|---|---|
| 4 heuristics | 100 |
| 3 heuristics | 85 |
| 2 heuristics | 60 |
| 1 heuristic | 35 |
| 0 heuristics (Louvain-only cluster) | 20 |
All confidence values are stored as fixedpoint64 with scale factor 10^8.
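The confidence assignment reduces to a one-branch function reproducing the table. A minimal sketch (the function name is illustrative); note the Louvain-only row is a fixed value of 20, since applying the `min()` formula at `heuristic_count = 0` would yield 10 rather than the table's 20.

```python
def cluster_confidence(heuristic_count):
    """Cluster confidence per the table: the formula for >= 1 heuristic,
    a fixed 20 for Louvain-only clusters (0 heuristics matched)."""
    if heuristic_count == 0:
        return 20
    return min(100, 30 + (heuristic_count - 1) * 20 + heuristic_count * 5)

# Reproduces the table: 0 → 20, 1 → 35, 2 → 60, 3 → 85, 4 → 100
```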
Method: sniper_detection
Version: 1.0.0 Type: Temporal pattern matching + known address lookup
Algorithm:
- Identify all BUY events in first N blocks after token creation (N defined per predicate scope)
- For each buyer (processed in lexicographic address order): check against known sniper/MEV bot address database (version-pinned, referenced by content hash in method version metadata)
- Compute gas_premium = (paid_priority_fee - base_fee) * SCALE / base_fee (fixedpoint64, scale 10^8, truncating division; if base_fee = 0, gas_premium = MAX_I64)
- Flag buyers with: gas_premium > 10_00000000 (10x at scale 10^8) OR known_bot = true OR block_0_purchase = true
- Compute sniper_share = sum(flagged_buyer_balances) * SCALE / total_supply (fixedpoint64, scale 10^8)
Confidence Formula:
base_confidence = max signal confidence among: known_bot(95), gas_block_0(85), gas_block_1_5(70), timing_only(50)
confidence = base_confidence // highest applicable signal, no interpolation
| Signal | Confidence (exact) |
|---|---|
| Known bot address match | 95 |
| Gas premium > 10x in block 0 | 85 |
| Gas premium > 5x in blocks 1-5 | 70 |
| Timing-only flag (block 0 purchase, no gas/bot signal) | 50 |
When multiple signals match, the highest confidence applies. All confidence values stored as fixedpoint64 with scale 10^8.
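The gas-premium computation and flagging rule above can be expressed in a few lines. A sketch using the spec's fixed-point conventions, including the `base_fee = 0` sentinel; function names are illustrative.

```python
SCALE = 10**8
MAX_I64 = 2**63 - 1

def gas_premium(paid_priority_fee, base_fee):
    """gas_premium = (paid_priority_fee - base_fee) * SCALE / base_fee,
    truncating division; MAX_I64 when base_fee = 0, per the method spec."""
    if base_fee == 0:
        return MAX_I64
    return (paid_priority_fee - base_fee) * SCALE // base_fee

def is_flagged(premium, known_bot, block_0_purchase):
    """Flag when premium > 10x (10_00000000 at scale 10^8), or the buyer
    is a known bot, or the purchase landed in block 0."""
    return premium > 10 * SCALE or known_bot or block_0_purchase

# A buyer paying 12x the base fee has an 11x premium over base → flagged
p = gas_premium(12_000, 1_000)
# p == 11_00000000
```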
Method: wash_trading_detection
Version: 1.0.0 Type: Statistical testing + graph analysis
Algorithm:
- Benford's Law conformity test on trade amounts:
  - Extract first significant digit from all trade amounts (in lamports). For amounts < 10, skip (insufficient digits).
  - Expected distribution: E(d) = total_trades * BENFORD[d], where BENFORD[1..9] is a fixed lookup table of pre-computed fixedpoint64 values at scale 10^8: [30103000, 17609126, 12493874, 9691001, 7918125, 6694679, 5799195, 5115252, 4575749] (i.e., log10(1 + 1/d) * 10^8, truncated)
  - Chi-squared statistic (fixedpoint64, scale 10^8): X2 = sum_d( (O(d) - E(d))^2 * SCALE / E(d) ), where O(d) is the observed count for digit d. Division uses truncating integer division. If E(d) = 0, that digit is excluded.
  - Flag as non-conforming if X2 > 16_92000000 (16.92, the pinned p < 0.05 threshold, at scale 10^8)
- Strongly Connected Components (SCC) detection:
  - Build directed graph: nodes = wallet addresses, edges = trade events
  - Apply Tarjan's SCC algorithm with deterministic node visitation order: nodes are visited in lexicographic nodeId order (UTF-8 byte order). The DFS neighbor iteration for each node also follows lexicographic order of target nodeId.
  - Flag SCCs where abs(net_position_sum) < 1% of gross_volume (fixedpoint64 comparison at scale 10^8): abs(net) * 100 * SCALE < gross * SCALE * 1
- Compute wash_volume_ratio = wash_volume * SCALE / total_volume (fixedpoint64, scale 10^8; if total_volume = 0, result = 0)
- Compute organic_volume_pct = 100 * SCALE - wash_volume_ratio * 100 (fixedpoint64, scale 10^8)
Confidence Formula:
confidence = 90 if benford_flag AND scc_flag
= 75 if scc_flag only
= 60 if benford_flag only
= 0 if neither
| Detection Method | Confidence (exact) |
|---|---|
| Both Benford + SCC flag | 90 |
| SCC only | 75 |
| Benford only | 60 |
All confidence values stored as fixedpoint64 with scale 10^8.
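The Benford step can be sketched in integer arithmetic. This is one reading of the formula, assuming O(d) is scaled to 10^8 to match E(d) before the squared-difference division, and that `total_trades` counts only the usable (>= 10) amounts; function names are illustrative.

```python
SCALE = 10**8
# Pinned BENFORD table from the method spec: log10(1 + 1/d) * 10^8, truncated
BENFORD = [30103000, 17609126, 12493874, 9691001, 7918125,
           6694679, 5799195, 5115252, 4575749]

def first_digit(amount):
    """First significant digit of a positive integer amount."""
    while amount >= 10:
        amount //= 10
    return amount

def benford_chi_squared(amounts):
    """X2 = sum_d (O(d) - E(d))^2 * SCALE / E(d), fixedpoint64 at scale 10^8.

    Amounts < 10 are skipped; digits with E(d) = 0 are excluded, per the spec.
    """
    usable = [a for a in amounts if a >= 10]
    observed = [0] * 10
    for a in usable:
        observed[first_digit(a)] += 1
    x2 = 0
    for d in range(1, 10):
        e = len(usable) * BENFORD[d - 1]   # E(d), at scale 10^8
        o = observed[d] * SCALE            # O(d), scaled to match E(d)
        if e == 0:
            continue
        x2 += (o - e) ** 2 // e            # truncating division, scale 10^8
    return x2
```

A degenerate sample where every amount starts with the digit 1 produces a statistic far above any p < 0.05 critical value, which is the expected behavior for wash-trade-like uniformity.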
Method: kol_impact_scoring
Version: 1.0.0 Type: Causal impact analysis
Algorithm:
- Track KOL mention timestamps from social content evidence (processed in chronological order; ties broken by evidenceId lexicographic order)
- Measure price/volume change at +1h, +4h, +24h, +7d post-mention (all as fixedpoint64 percentage change at scale 10^8)
- Compute KOL quality score (all fixedpoint64, scale 10^8):
  win_rate = mentions_with_positive_7d_return * SCALE / total_mentions
  follower_quality = SCALE - bot_ratio_among_followers
  log_followers = ILOG2(followers)   // integer log base 2 (floor), result as fixedpoint64 at scale 10^8
  kol_score = win_rate * follower_quality / SCALE * log_followers / SCALE
  Note: ILOG2 (integer log base 2, floor) replaces log to avoid floating-point non-determinism. ILOG2(0) = 0. The multiplication chain uses i128 intermediates to avoid overflow.
- Weight by: final_score = kol_score * accuracy_weight / SCALE * authenticity_weight / SCALE * timing_weight / SCALE, where weights are pre-defined per KOL tier (stored in method version metadata as fixedpoint64 values)
Confidence Formula:
sample_size = total_mentions for this KOL
confidence = min(95, 20 + sample_size * 5) // capped at 95, minimum 25 (at sample_size=1)
If sample_size = 0, confidence = 0 and the KOL is excluded from scoring.
All values stored as fixedpoint64 with scale 10^8.
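The score and confidence formulas above can be sketched as follows. Python integers are arbitrary-precision, so they stand in for the i128 intermediates the spec requires; `ILOG2` is implemented via bit length, and function names are illustrative.

```python
SCALE = 10**8

def ilog2(n):
    """ILOG2: floor integer log base 2, as fixedpoint64 at scale 10^8.
    ILOG2(0) = 0 per the method spec."""
    return (n.bit_length() - 1) * SCALE if n > 0 else 0

def kol_score(wins, total_mentions, bot_ratio, followers):
    """kol_score = win_rate * follower_quality / SCALE * log_followers / SCALE.

    bot_ratio is already fixedpoint64 at scale 10^8; all divisions truncate.
    """
    if total_mentions == 0:
        return 0
    win_rate = wins * SCALE // total_mentions
    follower_quality = SCALE - bot_ratio
    log_followers = ilog2(followers)
    return win_rate * follower_quality // SCALE * log_followers // SCALE

def kol_confidence(sample_size):
    """confidence = min(95, 20 + sample_size * 5); 0 when there are no mentions."""
    return 0 if sample_size == 0 else min(95, 20 + sample_size * 5)

# 6 wins in 10 mentions, 20% bot followers, 1024 followers (ILOG2 = 10):
# 0.6 * 0.8 * 10 = 4.8 → 4_80000000 at scale 10^8
score = kol_score(6, 10, 20_000_000, 1024)
```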
5.4 Feature Engineering
The Verification Engine computes features from the Evidence Graph to feed both deterministic evaluations and inference methods.
Standard Feature Set (26 features, aligned with real-time monitoring):
| Category | Feature | Computation |
|---|---|---|
| Time | age_seconds | blockTo.time - creation.time |
| Time | trade_frequency | total_trades / age_seconds |
| Time | time_to_first_trade | first_trade.time - creation.time |
| Volume | total_sol_volume | sum(all trade SOL amounts) |
| Volume | buy_sol_volume | sum(buy trade SOL amounts) |
| Volume | sell_sol_volume | sum(sell trade SOL amounts) |
| Volume | buy_sell_sol_ratio | buy_sol_volume / sell_sol_volume |
| Volume | avg_trade_size_sol | total_sol_volume / total_trades |
| Volume | max_trade_size_sol | max(individual trade SOL amounts) |
| Count | total_trades | count(all trade events) |
| Count | buy_count | count(buy events) |
| Count | sell_count | count(sell events) |
| Count | buy_sell_count_ratio | buy_count / sell_count |
| Trader | unique_traders | count(distinct trader addresses) |
| Trader | unique_buyers | count(distinct buyer addresses) |
| Trader | unique_sellers | count(distinct seller addresses) |
| Trader | repeat_buyer_ratio | repeat_buyers / unique_buyers |
| Trader | trader_concentration | top_5_trader_volume / total_volume |
| Price | price_at_snapshot | latest trade price |
| Price | price_change_pct | (latest_price - first_price) / first_price * 100 |
| Price | price_velocity | price_change_pct / age_seconds |
| Price | price_volatility | std_dev(trade_prices) |
| Price | max_drawdown_pct | max peak-to-trough decline |
| Pattern | buy_cluster_score | temporal clustering of buy events |
| Pattern | whale_buy_ratio | large_buy_volume / total_buy_volume |
| Pattern | dex_count | count(distinct DEX sources) |
Fixed-Point Representation (Cross-Language Determinism)
All 26 features MUST be computed and stored as fixedpoint64 integers (see Appendix B, Section B.2) — IEEE 754 floating-point is not permitted in the canonical feature vector. This ensures that TypeScript, Rust, Python, and any other implementation produce byte-identical feature vectors for the same input evidence.
| Feature Category | Scale Factor | Notes |
|---|---|---|
| Time (age_seconds, time_to_first_trade) | 10^0 | Exact integer seconds |
| Counts (total_trades, buy_count, unique_traders, dex_count, etc.) | 10^0 | Exact integers |
| SOL volumes (total_sol_volume, buy_sol_volume, max_trade_size_sol, etc.) | 10^9 | Native lamports |
| Ratios (buy_sell_sol_ratio, buy_sell_count_ratio, repeat_buyer_ratio, etc.) | 10^8 | Truncating integer division |
| Rates (trade_frequency, price_velocity) | 10^12 | High precision for small per-second values |
| Percentages (price_change_pct, max_drawdown_pct, trader_concentration) | 10^8 | e.g., 42.5% = 4_250_000_000 |
| Prices (price_at_snapshot) | 10^9 | Lamport-scale price |
| Scores (price_volatility, buy_cluster_score) | 10^8 | std_dev via integer square root |
Division by zero (e.g., sell_count = 0 for buy_sell_count_ratio) MUST yield 0, not an error or special value.
These features are computed identically in all implementation languages using integer-only arithmetic to ensure byte-level consistency between real-time scoring and batch verification.
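The ratio and rate conventions above can be captured in two small helpers. A minimal sketch of the scale and zero-division rules; the function names are illustrative, not part of the spec.

```python
def ratio_feature(numerator, denominator, scale=10**8):
    """Ratio features: truncating integer division at scale 10^8.
    Division by zero MUST yield 0, per the spec."""
    return numerator * scale // denominator if denominator else 0

def rate_feature(numerator, denominator, scale=10**12):
    """Rate features (trade_frequency, price_velocity): scale 10^12
    to preserve precision for small per-second values."""
    return numerator * scale // denominator if denominator else 0

# buy_sell_count_ratio with 7 buys, 3 sells → 2.33333333 at scale 10^8
assert ratio_feature(7, 3) == 2_33333333
# sell_count = 0 → 0, not an error
assert ratio_feature(7, 0) == 0
# trade_frequency: 150 trades over 3600 s → 0.041666... at scale 10^12
assert rate_feature(150, 3600) == 41_666_666_666
```

Because every implementation performs the same integer truncation, these helpers produce byte-identical values in TypeScript, Rust, or Python.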
5.5 Engine Versioning
Engine versions follow semantic versioning: MAJOR.MINOR.PATCH
- MAJOR: Breaking changes to evaluation logic, qualification rules, claim schema, or introduction of a predicate MAJOR version (Section 2.5)
- MINOR: New predicate MINOR/PATCH versions, new inference methods, new evidence source types, new predicates added to the manifest
- PATCH: Bug fixes, performance improvements, documentation updates
Each engine version includes a Predicate Manifest — a frozen mapping of predicateId to predicate version. This manifest is immutable for a given engine version and is the authoritative record of what each predicate means within that engine. See Section 2.5 for the full predicate versioning rules and binding semantics.
Version Lifecycle:
| State | Description |
|---|---|
| CURRENT | Latest stable version, used for new evaluations |
| SUPPORTED | Previous versions that can still be used for replay |
| DEPRECATED | Versions that produce warnings but still function |
| RETIRED | Versions that can no longer be executed |
Deprecation Policy:
- Engine versions MUST remain replayable for at least 12 months after deprecation
- Deprecation MUST be announced at least 3 months in advance
- Retired versions MUST have all Verification Objects archived before retirement
- A predicate version reaches RETIRED only when ALL engine versions referencing it are RETIRED (Section 2.5)