HonestRoles is a deterministic, config-driven pipeline runtime for job data, built on Polars with explicit plugin manifests.
Use the HonestRoles app first: honestroles.com.
- Launch app: https://honestroles.com
- App guide: App Quickstart
- App users: start in the browser at honestroles.com
- Developers and integrators: use the CLI/SDK sections below
$ python -m venv .venv
$ . .venv/bin/activate
$ python -m pip install --upgrade pip
$ pip install honestroles
From the repository root:
$ python examples/create_sample_dataset.py
$ honestroles run --pipeline-config examples/sample_pipeline.toml --plugins examples/sample_plugins.toml
$ ls -lh examples/jobs_scored.parquet
Expected CLI diagnostics include stage_rows, plugin_counts, and final_rows.
$ honestroles ingest sync --source greenhouse --source-ref stripe --quality-policy ingest_quality.toml --strict-quality --merge-policy updated_hash --retain-snapshots 30 --prune-inactive-days 90 --format table
$ honestroles ingest validate --source greenhouse --source-ref stripe --quality-policy ingest_quality.toml --strict-quality --format table
$ honestroles ingest sync-all --manifest ingest.toml --format table
$ honestroles recommend build-index --input-parquet dist/ingest/greenhouse/stripe/jobs.parquet --policy recommendation.toml --format table
$ honestroles recommend match --index-dir dist/recommend/index/<index_id> --candidate-json examples/candidate.json --top-k 25 --include-excluded --format table
$ honestroles recommend evaluate --index-dir dist/recommend/index/<index_id> --golden-set examples/recommend_golden_set.json --thresholds recommend_eval.toml --format table
$ honestroles recommend feedback add --profile-id jane_doe --job-id 12345 --event interviewed --format table
$ honestroles publish neondb migrate --database-url-env NEON_DATABASE_URL --schema honestroles_api --format table
$ honestroles publish neondb sync --database-url-env NEON_DATABASE_URL --schema honestroles_api --jobs-parquet dist/ingest/greenhouse/stripe/jobs.parquet --index-dir dist/recommend/index/<index_id> --sync-report dist/ingest/greenhouse/stripe/sync_report.json --require-quality-pass --format table
$ honestroles publish neondb verify --database-url-env NEON_DATABASE_URL --schema honestroles_api --format table
$ honestroles init --input-parquet data/jobs.parquet --pipeline-config pipeline.toml --plugins-manifest plugins.toml
$ honestroles doctor --pipeline-config pipeline.toml --plugins plugins.toml --format table
$ honestroles reliability check --pipeline-config pipeline.toml --plugins plugins.toml --strict --format table
$ honestroles run --pipeline-config pipeline.toml --plugins plugins.toml
$ honestroles plugins validate --manifest plugins.toml
$ honestroles config validate --pipeline pipeline.toml
$ honestroles report-quality --pipeline-config pipeline.toml
$ honestroles runs list --limit 10 --command ingest.sync --format table
$ honestroles scaffold-plugin --name my-plugin --output-dir .
from honestroles import (
HonestRolesRuntime,
build_retrieval_index,
evaluate_relevance,
migrate_neondb,
match_jobs,
publish_neondb_sync,
record_feedback_event,
sync_source,
sync_sources_from_manifest,
summarize_feedback,
validate_ingestion_source,
verify_neondb_contract,
)
ingest = sync_source(
source="greenhouse",
source_ref="stripe",
quality_policy_file="ingest_quality.toml",
strict_quality=False,
merge_policy="updated_hash",
retain_snapshots=30,
prune_inactive_days=90,
)
print(ingest.rows_written, ingest.output_parquet)
validation = validate_ingestion_source(
source="greenhouse",
source_ref="stripe",
quality_policy_file="ingest_quality.toml",
strict_quality=True,
)
print(validation.report.status, validation.rows_evaluated)
batch = sync_sources_from_manifest(manifest_path="ingest.toml")
print(batch.status, batch.total_sources, batch.fail_count)
index = build_retrieval_index(
input_parquet="dist/ingest/greenhouse/stripe/jobs.parquet",
policy_file="recommendation.toml",
)
matches = match_jobs(
index_dir=index.index_dir,
candidate_json="examples/candidate.json",
top_k=25,
include_excluded=True,
)
print(matches.status, len(matches.results))
evaluation = evaluate_relevance(
index_dir=index.index_dir,
golden_set="examples/recommend_golden_set.json",
thresholds_file="recommend_eval.toml",
)
print(evaluation.status, evaluation.metrics)
record_feedback_event(profile_id="jane_doe", job_id="12345", event="interviewed")
print(summarize_feedback(profile_id="jane_doe").weights)
print(migrate_neondb(database_url_env="NEON_DATABASE_URL").status)
publish_result = publish_neondb_sync(
database_url_env="NEON_DATABASE_URL",
jobs_parquet="dist/ingest/greenhouse/stripe/jobs.parquet",
index_dir=index.index_dir,
sync_report="dist/ingest/greenhouse/stripe/sync_report.json",
)
print(publish_result.batch_id, verify_neondb_contract(database_url_env="NEON_DATABASE_URL").status)
runtime = HonestRolesRuntime.from_configs(
pipeline_config_path="pipeline.toml",
plugin_manifest_path="plugins.toml",
)
result = runtime.run()
print(result.diagnostics)
print(result.dataset.to_polars().head())
print(result.application_plan[:3])
- App home: https://honestroles.com
- Docs home: https://honestroles.com/docs/
- Local docs source: docs/
- Start here in docs: docs/index.md
$ pip install -e ".[dev,docs]"
$ pytest -q
$ pytest tests/docs -q
$ bash scripts/check_docs_refs.sh
# Optional live connector smoke (requires refs):
# HONESTROLES_SMOKE_GREENHOUSE_REF, HONESTROLES_SMOKE_LEVER_REF,
# HONESTROLES_SMOKE_ASHBY_REF, HONESTROLES_SMOKE_WORKABLE_REF
$ bash scripts/run_ingest_smoke.sh
# Optional Neon DB smoke (requires NEON_DATABASE_URL):
$ PYTHON_BIN=.venv/bin/python DATABASE_URL_ENV=NEON_DATABASE_URL SCHEMA=honestroles_api bash scripts/run_neondb_smoke.sh
For local profiling data, keep large parquet inputs under data/ and write generated artifacts under dist/ (both are ignored by git).
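That directory convention can be set up once per checkout. A minimal sketch (the `.gitignore` handling is an assumption; the repository may already ship these entries):

```shell
# Create the git-ignored working directories used throughout the examples.
mkdir -p data dist

# If your checkout does not already ignore them, add local entries.
grep -qx 'data/' .gitignore 2>/dev/null || echo 'data/' >> .gitignore
grep -qx 'dist/' .gitignore 2>/dev/null || echo 'dist/' >> .gitignore
```

Keeping inputs in data/ and generated artifacts in dist/ means `git status` stays clean even after repeated ingest and index builds.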
- PyPI publishing is manual and token-based via bash scripts/publish_pypi.sh.
- The script reads PYPI_API_KEY (or PYPI_API_TOKEN) from env/.env.
- The GitHub Release workflow is manual (workflow_dispatch) only.
- Before publishing, run the deterministic gate:
$ PYTHON_BIN=.venv/bin/python bash scripts/run_coverage.sh
- Full maintainer runbook: docs/for-maintainers/release-and-pypi.md.
MIT