11. Reference Implementation Notes

11.1 Architecture Overview

The reference implementation comprises four primary codebases:

Data Ingestion (pumpfun-monitor)

  • Node.js/TypeScript real-time transaction monitor
  • WebSocket subscriptions to 9+ Solana DEX programs
  • Borsh/Anchor event parsing from transaction logs
  • 26-feature extraction pipeline (TypeScript)
  • AES-256-GCM encrypted NDJSON logging
  • PostgreSQL persistence (events, pairs, tokens, ML scores)
  • ML prediction integration via HTTP to Python FastAPI service
  • Web dashboard (HTML5 + WebSocket) for live visualization

Verification & Product API (baseline-api)

  • Monorepo: @baseline/core, @baseline/api, @baseline/worker, @baseline/admin
  • Fastify-based REST API with Swagger documentation
  • PostgreSQL with read/write replica separation
  • Background worker with browser-based scraping (Puppeteer)
  • Twitter scraper for social content collection
  • DexScreener scraper for market data
  • JWT authentication (Google OAuth, Apple OAuth, wallet signing)
  • PM2 process management (cluster mode for API, fork for worker)

@baseline/core Public Interface:

The @baseline/core package is the typed SDK for integrators. It exports all protocol types (claim, evidence, VO, attestation, error) and a client factory. The complete type definitions are specified in Appendix F.

Module | Exports | Description
@baseline/core | createBaselineClient | Factory function returning a typed BaselineClient instance
@baseline/core/types | Claim, ClaimSubmission, Subject, Context, Scope, etc. | All claim-related types (Section 2)
@baseline/core/types | EvidenceUnit, EvidenceReference, EvidenceSourceType, etc. | All evidence types (Section 3)
@baseline/core/types | VerificationObject, QualificationType, LineStatus, etc. | All VO types (Section 6)
@baseline/core/types | Attestation, ValidatorInfo | Attestation types (Section 7)
@baseline/core/types | BaselineError, ErrorCode, ClaimValidationError | Error types
@baseline/core/types | PaginatedResponse, PaginationParams | Pagination generics

Product Frontend (baseline-web)

  • React 18 + Vite + TailwindCSS
  • TypeScript types aligned with Verification Object schema
  • Social feed rendering, coin scoring dashboard, bot arena
  • Baseline Feed (AI-verified aggregated insights)
  • Watchlist management with auth context
  • Radix UI primitives, Recharts for data visualization, Framer Motion

AI Agent Layer (Ghost / clawd)

  • Claude-powered AI analyst agent ("Ghost")
  • Ghost Scoring System v1 (manual) and v2 (automated, 12 categories)
  • ML/AI Framework: XGBoost regression, GNN entity resolution, LSTM autoencoders, LLM contract interpretation
  • Trading bots (paper trading + mainnet via Jupiter)
  • KOL scraping and quality scoring
  • PostgreSQL database with coins, social content, trading history
  • Cron-scheduled automated analysis and reporting
  • Memory persistence via markdown files (daily notes, long-term memory)

11.2 ML Model Stack

Model Type | Framework | Use Case | Inference Latency
CatBoost (default) | catboost | Pump/dump prediction | < 10 ms
CatBoost Deep | catboost | Enhanced regularization | < 10 ms
XGBoost | xgboost | Alternative GBDT | < 10 ms
Ensemble | catboost + xgboost | Soft voting (best precision) | < 20 ms
TCN | PyTorch | Temporal event sequences | < 50 ms
LightGBM | lightgbm | Sybil detection | < 10 ms
GNN (GAT) | PyTorch Geometric | Wallet clustering | < 500 ms
LSTM Autoencoder | PyTorch | Distribution anomaly | < 50 ms
FinBERT | transformers | Social sentiment | < 200 ms
LLM (Llama/Mistral) | transformers | Contract interpretation | 2-5 s
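
The Ensemble row combines the CatBoost and XGBoost models by soft voting, that is, by averaging their predicted probabilities rather than their hard class labels. The following is a minimal sketch of that idea only. The function name soft_vote, the equal default weights, and the sample probabilities are illustrative assumptions; the source does not state how the reference implementation weights the two models.

```python
from typing import Optional, Sequence


def soft_vote(
    prob_sets: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> list:
    """Weighted average of per-sample positive-class probabilities.

    prob_sets holds one probability list per model; every list must
    score the same samples in the same order.
    """
    if not prob_sets:
        raise ValueError("need probabilities from at least one model")
    n_samples = len(prob_sets[0])
    if any(len(p) != n_samples for p in prob_sets):
        raise ValueError("all models must score the same samples")
    if weights is None:
        # Assumption: equal weights. The source does not specify them.
        weights = [1.0] * len(prob_sets)
    total = sum(weights)
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_sets)) / total
        for i in range(n_samples)
    ]


# Illustrative probabilities, not real model output.
catboost_probs = [0.90, 0.20, 0.60]
xgboost_probs = [0.70, 0.40, 0.50]
ensemble_probs = soft_vote([catboost_probs, xgboost_probs])
```

The averaged probabilities are then compared against a decision threshold, so the ensemble can be tuned with the same threshold strategies as the single models.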

Training Pipeline:

  1. Collect data via WebSocket monitor (24-48 hours)
  2. Build dataset: decrypt logs → compute features → label tokens
  3. Train: model selection, hyperparameter optimization (Optuna), SMOTE oversampling for class imbalance
  4. Evaluate: temporal backtest (80/20 chronological split)
  5. Deploy: FastAPI inference server with hot model reload
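
Step 4 evaluates on a chronological split rather than a random one, so the model is always tested on tokens that appear later than anything it was trained on. A minimal sketch of such a split is shown below. The function chronological_split, the created_at field, and the sample rows are hypothetical names chosen for illustration and are not taken from the reference codebase.

```python
from typing import Any, Callable, Sequence, Tuple


def chronological_split(
    rows: Sequence[Any],
    timestamp_key: Callable[[Any], float],
    train_fraction: float = 0.8,
) -> Tuple[list, list]:
    """Sort rows by time, then cut once: oldest part trains, newest part tests."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    ordered = sorted(rows, key=timestamp_key)
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Illustrative rows; created_at is a hypothetical timestamp field.
tokens = [
    {"mint": f"token{i}", "created_at": ts}
    for i, ts in enumerate([50, 10, 40, 20, 30, 70, 60, 90, 80, 100])
]
train, test = chronological_split(tokens, lambda row: row["created_at"])
```

Because the cut is made after sorting, no test row predates a training row. Any resampling such as SMOTE would normally be applied to the training portion only, so that the test portion keeps the original class balance.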

Threshold Optimization:

Strategy | Description
F0.5 (default) | Favors precision while still accounting for recall
F1 | Balanced precision/recall
F2 | Favors recall (catch more events)
Precision | Maximum precision (strict)
Recall | Maximum recall (comprehensive), subject to precision >= 30%
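
The F0.5, F1, and F2 strategies all use the same score, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), where beta < 1 weights precision more heavily and beta > 1 weights recall more heavily. The sketch below sweeps candidate thresholds and keeps the one with the highest F_beta. The function names and the sample labels and scores are illustrative assumptions, not the reference implementation's optimizer.

```python
from typing import Sequence, Tuple


def precision_recall(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when scores >= threshold are predicted positive."""
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    fn = sum(1 for y, s in zip(labels, scores) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f_beta(precision: float, recall: float, beta: float) -> float:
    """F-beta score; returns 0.0 when precision and recall are both zero."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


def best_threshold(
    labels: Sequence[int], scores: Sequence[float], beta: float = 0.5
) -> float:
    """Try each distinct score as a threshold and keep the best F-beta.

    Ties resolve to the lowest threshold, because max() keeps the first
    maximum and the candidates are sorted in ascending order.
    """
    candidates = sorted(set(scores))
    return max(
        candidates,
        key=lambda t: f_beta(*precision_recall(labels, scores, t), beta),
    )


# Illustrative validation data, not real model output.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
precision_first = best_threshold(labels, scores, beta=0.5)
recall_first = best_threshold(labels, scores, beta=2.0)
```

On this sample, beta = 0.5 selects the higher threshold 0.7 (precision 1.0, recall about 0.67), while beta = 2.0 selects the lower threshold 0.4 (precision 0.75, recall 1.0), which is the trade-off the table describes.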

11.3 Database Architecture

Primary Database: PostgreSQL

Operational Tables

Table | Description
coins | Token registry (contract address, ticker, metadata)
coin_social | Social media links per coin
coin_social_content | Scraped social posts (JSONB info field)
coin_onchain | On-chain snapshots (rich list, snipers, etc.)
coin_history | Audit trail of coin status changes (trigger-based)
bot_report | AI-generated analysis reports
bot_report_history | Archived report versions (trigger-based)

Trading Tables

Table | Description
trading_account | Virtual/real trading accounts with balance
trading_position | Current token holdings per account
trading_history | Buy/sell trade log
trading_account_pnl_history | Portfolio snapshots (per-trade and periodic)

Monitoring Tables

Table | Description
DEXTRACKER_EVENTS | Raw DEX trade events
DEXTRACKER_PAIRS | Aggregated pair statistics
DEXTRACKER_TOKENS | Token creation events
DEXTRACKER_GLOBAL_STATS | System-wide statistics
DEXTRACKER_ML_SCORES | ML prediction results

Cron Tables

Table | Description
openclaw_cron_runs | Cron job execution history
