2 changes: 2 additions & 0 deletions .dockerignore
@@ -18,3 +18,5 @@ node_modules/
# Test
test/
coverage/
.claude/
logs/
61 changes: 59 additions & 2 deletions CLAUDE.md
@@ -1,12 +1,50 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

AR.IO Node — Arweave gateway for accessing and indexing blockchain data, with
caching, ANS-104 bundle unbundling, and multi-source data retrieval.

## Tech stack

- Node.js v20 (see `.nvmrc`), TypeScript strict mode, ESM (`"type": "module"`)
- Test framework: **Node.js native `node:test`** (not Jest/Mocha/Vitest)
- Transpiler: SWC (via ts-node)
- Databases: SQLite (primary) + ClickHouse (analytics/GQL)
- Caching: Redis, LMDB, LRU in-memory
- HTTP: Express
- Observability: OpenTelemetry + Prometheus + Winston

## Commands

```bash
# Development
yarn start # Start service (requires .env file)
yarn watch # Start with nodemon (auto-restart on changes)
yarn build # Clean + compile TypeScript (prod)

# Testing
yarn test # Run all unit tests
yarn test:file src/path/to/file.test.ts # Run a single test file
yarn test:e2e # Run end-to-end tests (in test/ directory)
yarn test:coverage # Run tests with coverage report

# Linting & quality
yarn lint:check # ESLint check
yarn lint:fix # ESLint auto-fix
yarn duplicate:check # Detect code duplication (jscpd)
yarn deps:check # Detect circular dependencies (madge)

# Database
yarn db:migrate # Run SQLite migrations
yarn db:dump-test-schemas # Regenerate test SQL schemas after migrations

# Service management (systemd-based)
yarn service:start / stop / restart / status / logs
```

## Discovery points

- Commands — `package.json` scripts (dev, build, service, test, lint,
migrations, duplicate/deps checks)
- Documentation index — `docs/INDEX.md`
- Env vars — `docs/envs.md` (keep this and `docker-compose.yaml` in sync when
adding or removing env vars)
@@ -20,6 +58,8 @@ caching, ANS-104 bundle unbundling, and multi-source data retrieval.

- `src/system.ts` is the central DI wiring — all services, workers, data
sources, resolvers, and lifecycle cleanup handlers are constructed here.
- `src/config.ts` parses all environment variables and exports typed
constants — this is where new env vars are added.
- `src/data/` uses composite sources with fallback chains
(cache → S3 → AR.IO peers → trusted gateways → Arweave nodes). Retrieval
order is configurable via `ON_DEMAND_RETRIEVAL_ORDER` and
@@ -30,6 +70,11 @@ caching, ANS-104 bundle unbundling, and multi-source data retrieval.
- Filters (`ANS104_UNBUNDLE_FILTER`, `ANS104_INDEX_FILTER`,
`WEBHOOK_INDEX_FILTER`) share a composable JSON filter system — see
`docs/filters.md`.
- Background workers (`src/workers/`) handle block importing, data importing,
bundle unbundling, verification, and webhooks. Controlled by `START_WRITERS`.
- IPFS serving (`src/ipfs/`) is opt-in via `IPFS_ENABLED`. Uses a Kubo sidecar
for content retrieval with its own cache, rate limiter, and blocklist. Routes
mount before ArNS in `app.ts`. See `docs/ipfs-integration.md`.
- Responses include trust headers indicating verification status.
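
The composite-source fallback described above can be sketched as follows. This is a simplified illustration of the pattern, not the actual `src/data/` implementation (which adds streaming, verification, and metrics); the names `DataSource` and `SequentialDataSource` are assumptions for this sketch:

```typescript
// Sketch of a composite data source with fallback: try each source in
// configured order, falling through on a miss or an error.
interface DataSource {
  name: string;
  getData(id: string): Promise<string | undefined>;
}

class SequentialDataSource implements DataSource {
  name = 'sequential';
  constructor(private sources: DataSource[]) {}

  async getData(id: string): Promise<string | undefined> {
    for (const source of this.sources) {
      try {
        const data = await source.getData(id);
        if (data !== undefined) return data;
      } catch {
        // A failing source is skipped, not fatal — the next one is tried.
      }
    }
    return undefined;
  }
}
```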

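For instance, a filter restricting unbundling to items carrying a particular tag might be configured like this. The value shown is hypothetical — the real composable JSON filter schema is documented in `docs/filters.md`:

```shell
# Hypothetical filter value — see docs/filters.md for the actual schema.
ANS104_UNBUNDLE_FILTER='{"tags": [{"name": "App-Name", "value": "MyApp"}]}'
```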
## Gotchas
@@ -52,6 +97,18 @@ Always use `createTestLogger()` from `test/test-logger.ts` in test files —
never `winston.createLogger({ silent: true })`. Test output is written to
`logs/test.log` (overwritten each run), not the console.

### Test imports

Tests use `node:test` and `node:assert`:

```typescript
import { describe, it, before, after, mock } from 'node:test';
import { strict as assert } from 'node:assert';
```

Common test stubs are in `test/stubs.ts`, SQLite helpers in
`test/sqlite-helpers.ts`.

### Adding a database method

Five coordinated edits are required:
29 changes: 29 additions & 0 deletions docker-compose.yaml
@@ -53,6 +53,7 @@ services:
- ${HEADERS_DATA_PATH:-./data/headers}:/app/data/headers
- ${SQLITE_DATA_PATH:-./data/sqlite}:/app/data/sqlite
- ${DUCKDB_DATA_PATH:-./data/duckdb}:/app/data/duckdb
- ${IPFS_CACHE_DATA_PATH:-./data/ipfs-cache}:/app/data/ipfs-cache
- ${TEMP_DATA_PATH:-./data/tmp}:/app/data/tmp
- ${LMDB_DATA_PATH:-./data/lmdb}:/app/data/lmdb
- ${PARQUET_DATA_PATH:-./data/parquet}:/app/data/parquet
@@ -124,6 +125,18 @@ services:
- RATE_LIMITER_IP_REFILL_PER_SEC=${RATE_LIMITER_IP_REFILL_PER_SEC:-}
- RATE_LIMITER_IPS_AND_CIDRS_ALLOWLIST=${RATE_LIMITER_IPS_AND_CIDRS_ALLOWLIST:-}
- RATE_LIMITER_ARNS_ALLOWLIST=${RATE_LIMITER_ARNS_ALLOWLIST:-}
- IPFS_ENABLED=${IPFS_ENABLED:-false}
- IPFS_KUBO_URL=${IPFS_KUBO_URL:-http://kubo:8080}
- IPFS_KUBO_REQUEST_TIMEOUT_MS=${IPFS_KUBO_REQUEST_TIMEOUT_MS:-}
- IPFS_STREAM_STALL_TIMEOUT_MS=${IPFS_STREAM_STALL_TIMEOUT_MS:-}
- IPFS_CACHE_PATH=${IPFS_CACHE_PATH:-}
- IPFS_CACHE_MAX_SIZE_BYTES=${IPFS_CACHE_MAX_SIZE_BYTES:-}
- IPFS_CACHE_CLEANUP_THRESHOLD_SECONDS=${IPFS_CACHE_CLEANUP_THRESHOLD_SECONDS:-}
- IPFS_RATE_LIMITER_IP_TOKENS_PER_BUCKET=${IPFS_RATE_LIMITER_IP_TOKENS_PER_BUCKET:-}
- IPFS_RATE_LIMITER_IP_REFILL_PER_SEC=${IPFS_RATE_LIMITER_IP_REFILL_PER_SEC:-}
- IPFS_RATE_LIMITER_RESOURCE_TOKENS_PER_BUCKET=${IPFS_RATE_LIMITER_RESOURCE_TOKENS_PER_BUCKET:-}
- IPFS_RATE_LIMITER_RESOURCE_REFILL_PER_SEC=${IPFS_RATE_LIMITER_RESOURCE_REFILL_PER_SEC:-}
- IPFS_MAX_RESPONSE_SIZE_BYTES=${IPFS_MAX_RESPONSE_SIZE_BYTES:-}
- NODE_MAX_OLD_SPACE_SIZE=${NODE_MAX_OLD_SPACE_SIZE:-}
- ENABLE_FS_HEADER_CACHE_CLEANUP=${ENABLE_FS_HEADER_CACHE_CLEANUP:-}
- ON_DEMAND_RETRIEVAL_ORDER=${ON_DEMAND_RETRIEVAL_ORDER:-}
@@ -578,6 +591,22 @@ services:
networks:
- ar-io-network

kubo:
image: ipfs/kubo:${KUBO_IMAGE_TAG:-v0.32.1}
profiles:
- ipfs
restart: unless-stopped
ports:
- '${IPFS_SWARM_PORT:-4001}:4001/tcp'
- '${IPFS_SWARM_PORT:-4001}:4001/udp'
environment:
- IPFS_PROFILE=${IPFS_PROFILE:-server}
volumes:
- ${IPFS_DATA_PATH:-./data/ipfs}:/data/ipfs
networks:
- ar-io-network
command: ['daemon', '--enable-gc']

autoheal:
image: willfarrell/autoheal@sha256:fd2c5500ab9210be9fa0d365162301eb0d16923f1d9a36de887f5d1751c6eb8c
network_mode: none
6 changes: 6 additions & 0 deletions docs/INDEX.md
@@ -23,6 +23,12 @@ Fast, offline lookups for data item to root transaction mappings.
| [CDB64 Tools Reference](cdb64-tools.md) | CLI tools for creating indexes |
| [CDB64 Format Specification](cdb64-format.md) | Technical file format details |

### IPFS Integration

| Document | Description |
|----------|-------------|
| [IPFS Integration](ipfs-integration.md) | Architecture, deployment, and configuration for IPFS CID serving |

### Rate Limiting & Payments

| Document | Description |
22 changes: 22 additions & 0 deletions docs/envs.md
@@ -357,3 +357,25 @@ ingestion may be partial) for SQLite.
| CLICKHOUSE_SQLITE_MIN_HEIGHT_ENABLED | Boolean | false | When true, restrict the SQLite fallback to heights above (ClickHouse max height - buffer) |
| CLICKHOUSE_SQLITE_MIN_HEIGHT_BUFFER | Number | 10 | Heights reserved for SQLite near the ClickHouse tip, to guard against partially ingested recent blocks |
| CLICKHOUSE_MAX_HEIGHT_CACHE_TTL_SECONDS | Number | 60 | TTL for the cached ClickHouse max-height lookup used by the boundary optimization |

## IPFS

When enabled, the gateway can serve IPFS content via `/ipfs/{CID}` path routes
and `{CID}.{root_host}` subdomain routes (same level as ArNS, works with
standard `*.{host}` wildcard TLS certs). Requires a Kubo IPFS node (available
as a Docker Compose sidecar via the `ipfs` profile).
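
Given the `ipfs` compose profile, opting in could look like the following sketch (adjust paths and limits to your deployment; the defaults in the table below apply otherwise):

```shell
# .env — opt in to IPFS serving
IPFS_ENABLED=true

# Start the gateway together with the Kubo sidecar
# (the sidecar only starts when the "ipfs" profile is selected):
# docker compose --profile ipfs up -d
```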

| ENV_NAME | TYPE | DEFAULT_VALUE | DESCRIPTION |
| ----------------------------------------- | ------- | ------------------- | ------------------------------------------------------------------- |
| IPFS_ENABLED | Boolean | false | Enable IPFS content serving |
| IPFS_KUBO_URL | String | http://kubo:8080 | Kubo HTTP gateway URL |
| IPFS_KUBO_REQUEST_TIMEOUT_MS | Number | 30000 | Connection timeout for Kubo requests (ms) |
| IPFS_STREAM_STALL_TIMEOUT_MS | Number | 30000 | Stall timeout — max time with no data before aborting stream (ms) |
| IPFS_CACHE_PATH | String | data/ipfs-cache | Directory for cached IPFS content |
| IPFS_CACHE_MAX_SIZE_BYTES | Number | 10737418240 (10 GB) | Maximum cache size before LRU eviction |
| IPFS_CACHE_CLEANUP_THRESHOLD_SECONDS | Number | 3600 | Age in seconds before cached files become eviction candidates |
| IPFS_RATE_LIMITER_IP_TOKENS_PER_BUCKET | Number | 100000 | IPFS rate limiter: max tokens per IP bucket |
| IPFS_RATE_LIMITER_IP_REFILL_PER_SEC | Number | 20 | IPFS rate limiter: token refill rate per second (IP bucket) |
| IPFS_RATE_LIMITER_RESOURCE_TOKENS_PER_BUCKET | Number | 1000000 | IPFS rate limiter: max tokens per resource bucket |
| IPFS_RATE_LIMITER_RESOURCE_REFILL_PER_SEC | Number | 100 | IPFS rate limiter: token refill rate per second (resource bucket) |
| IPFS_MAX_RESPONSE_SIZE_BYTES | Number | 1073741824 (1 GB) | Maximum IPFS content size the gateway will serve |
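
The two rate-limiter families above (per-IP and per-resource buckets) follow standard token-bucket semantics: each bucket holds up to `*_TOKENS_PER_BUCKET` tokens and refills at `*_REFILL_PER_SEC`. The sketch below illustrates the concept only — it is not the gateway's actual limiter code:

```typescript
// Token-bucket sketch: capacity maps to *_TOKENS_PER_BUCKET,
// refillPerSec to *_REFILL_PER_SEC.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;

  constructor(
    private capacity: number,
    private refillPerSec: number,
    nowMs: number = Date.now(),
  ) {
    this.tokens = capacity;
    this.lastRefillMs = nowMs;
  }

  // Returns true if `cost` tokens were available (request allowed).
  tryConsume(cost: number, nowMs: number = Date.now()): boolean {
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSec * this.refillPerSec,
    );
    this.lastRefillMs = nowMs;
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }
}
```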