
HonestRoles

HonestRoles is a deterministic, config-driven pipeline runtime for job data, built on Polars and wired together through explicit plugin manifests.
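The core ideas, deterministic stages, config-driven ordering, and an explicit plugin registry, can be sketched generically. This is an illustrative toy model of the pattern, not the HonestRoles internals:

```python
from typing import Callable

# Illustrative only: a config-driven pipeline where every stage is an
# explicitly registered function, so a given config always produces the
# same run order and the same output.
Row = dict
Stage = Callable[[list], list]

REGISTRY: dict[str, Stage] = {}

def plugin(name: str):
    """Register a stage under an explicit name (a toy plugin manifest)."""
    def wrap(fn: Stage) -> Stage:
        REGISTRY[name] = fn
        return fn
    return wrap

@plugin("drop_empty_titles")
def drop_empty_titles(rows):
    return [r for r in rows if r.get("title")]

@plugin("score_remote")
def score_remote(rows):
    return [{**r, "score": 1.0 if r.get("remote") else 0.5} for r in rows]

def run(pipeline: list, rows: list) -> list:
    # Stages run in exactly the order the config names them, never implicitly.
    for name in pipeline:
        rows = REGISTRY[name](rows)
    return rows

config = ["drop_empty_titles", "score_remote"]  # stands in for pipeline.toml
jobs = [{"title": "Data Engineer", "remote": True}, {"title": ""}]
print(run(config, jobs))
```

Because the registry is explicit and the stage order comes from config, swapping or reordering stages is a config change, not a code change.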

Start With the App

Use the HonestRoles app first: honestroles.com.

Choose Your Path

  • App users: start in the browser at honestroles.com
  • Developers and integrators: use the CLI/SDK sections below

Install (Developer)

$ python -m venv .venv
$ . .venv/bin/activate
$ python -m pip install --upgrade pip
$ pip install honestroles

5-Minute First Run (Developer)

From the repository root:

$ python examples/create_sample_dataset.py
$ honestroles run --pipeline-config examples/sample_pipeline.toml --plugins examples/sample_plugins.toml
$ ls -lh examples/jobs_scored.parquet

Expected CLI diagnostics include stage_rows, plugin_counts, and final_rows.
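The diagnostic names above suggest a per-run summary of row counts by stage and by plugin. As a hypothetical illustration (the payload shape and all values here are invented; the real format is whatever the CLI emits), one sanity check you might run over them:

```python
# Hypothetical diagnostics payload; field names come from the CLI docs above,
# the structure and values are invented for illustration.
diagnostics = {
    "stage_rows": {"ingest": 1200, "filter": 950, "score": 950},
    "plugin_counts": {"drop_empty_titles": 250},
    "final_rows": 950,
}

# Sanity check: the last stage's row count should match final_rows.
last_stage_rows = list(diagnostics["stage_rows"].values())[-1]
assert last_stage_rows == diagnostics["final_rows"]
print("rows retained:", diagnostics["final_rows"])
```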

CLI

# Ingest
$ honestroles ingest sync --source greenhouse --source-ref stripe --quality-policy ingest_quality.toml --strict-quality --merge-policy updated_hash --retain-snapshots 30 --prune-inactive-days 90 --format table
$ honestroles ingest validate --source greenhouse --source-ref stripe --quality-policy ingest_quality.toml --strict-quality --format table
$ honestroles ingest sync-all --manifest ingest.toml --format table

# Recommend
$ honestroles recommend build-index --input-parquet dist/ingest/greenhouse/stripe/jobs.parquet --policy recommendation.toml --format table
$ honestroles recommend match --index-dir dist/recommend/index/<index_id> --candidate-json examples/candidate.json --top-k 25 --include-excluded --format table
$ honestroles recommend evaluate --index-dir dist/recommend/index/<index_id> --golden-set examples/recommend_golden_set.json --thresholds recommend_eval.toml --format table
$ honestroles recommend feedback add --profile-id jane_doe --job-id 12345 --event interviewed --format table

# Publish (NeonDB)
$ honestroles publish neondb migrate --database-url-env NEON_DATABASE_URL --schema honestroles_api --format table
$ honestroles publish neondb sync --database-url-env NEON_DATABASE_URL --schema honestroles_api --jobs-parquet dist/ingest/greenhouse/stripe/jobs.parquet --index-dir dist/recommend/index/<index_id> --sync-report dist/ingest/greenhouse/stripe/sync_report.json --require-quality-pass --format table
$ honestroles publish neondb verify --database-url-env NEON_DATABASE_URL --schema honestroles_api --format table

# Pipeline and tooling
$ honestroles init --input-parquet data/jobs.parquet --pipeline-config pipeline.toml --plugins-manifest plugins.toml
$ honestroles doctor --pipeline-config pipeline.toml --plugins plugins.toml --format table
$ honestroles reliability check --pipeline-config pipeline.toml --plugins plugins.toml --strict --format table
$ honestroles run --pipeline-config pipeline.toml --plugins plugins.toml
$ honestroles plugins validate --manifest plugins.toml
$ honestroles config validate --pipeline pipeline.toml
$ honestroles report-quality --pipeline-config pipeline.toml
$ honestroles runs list --limit 10 --command ingest.sync --format table
$ honestroles scaffold-plugin --name my-plugin --output-dir .

Python API

from honestroles import (
    HonestRolesRuntime,
    build_retrieval_index,
    evaluate_relevance,
    migrate_neondb,
    match_jobs,
    publish_neondb_sync,
    record_feedback_event,
    sync_source,
    sync_sources_from_manifest,
    summarize_feedback,
    validate_ingestion_source,
    verify_neondb_contract,
)

# Ingest one source and write a local parquet snapshot.
ingest = sync_source(
    source="greenhouse",
    source_ref="stripe",
    quality_policy_file="ingest_quality.toml",
    strict_quality=False,
    merge_policy="updated_hash",
    retain_snapshots=30,
    prune_inactive_days=90,
)
print(ingest.rows_written, ingest.output_parquet)

# Validate the same source against the quality policy.
validation = validate_ingestion_source(
    source="greenhouse",
    source_ref="stripe",
    quality_policy_file="ingest_quality.toml",
    strict_quality=True,
)
print(validation.report.status, validation.rows_evaluated)

batch = sync_sources_from_manifest(manifest_path="ingest.toml")
print(batch.status, batch.total_sources, batch.fail_count)

# Build a retrieval index and match a candidate profile against it.
index = build_retrieval_index(
    input_parquet="dist/ingest/greenhouse/stripe/jobs.parquet",
    policy_file="recommendation.toml",
)
matches = match_jobs(
    index_dir=index.index_dir,
    candidate_json="examples/candidate.json",
    top_k=25,
    include_excluded=True,
)
print(matches.status, len(matches.results))

# Score retrieval quality against a golden set.
evaluation = evaluate_relevance(
    index_dir=index.index_dir,
    golden_set="examples/recommend_golden_set.json",
    thresholds_file="recommend_eval.toml",
)
print(evaluation.status, evaluation.metrics)

record_feedback_event(profile_id="jane_doe", job_id="12345", event="interviewed")
print(summarize_feedback(profile_id="jane_doe").weights)

# Apply migrations, publish the latest artifacts, and verify the contract.
print(migrate_neondb(database_url_env="NEON_DATABASE_URL").status)
publish_result = publish_neondb_sync(
    database_url_env="NEON_DATABASE_URL",
    jobs_parquet="dist/ingest/greenhouse/stripe/jobs.parquet",
    index_dir=index.index_dir,
    sync_report="dist/ingest/greenhouse/stripe/sync_report.json",
)
print(publish_result.batch_id, verify_neondb_contract(database_url_env="NEON_DATABASE_URL").status)

# Run the full pipeline from config files.
runtime = HonestRolesRuntime.from_configs(
    pipeline_config_path="pipeline.toml",
    plugin_manifest_path="plugins.toml",
)
result = runtime.run()

print(result.diagnostics)
print(result.dataset.to_polars().head())
print(result.application_plan[:3])

Documentation

Development

$ pip install -e ".[dev,docs]"
$ pytest -q
$ pytest tests/docs -q
$ bash scripts/check_docs_refs.sh
# Optional live connector smoke (requires refs):
# HONESTROLES_SMOKE_GREENHOUSE_REF, HONESTROLES_SMOKE_LEVER_REF,
# HONESTROLES_SMOKE_ASHBY_REF, HONESTROLES_SMOKE_WORKABLE_REF
$ bash scripts/run_ingest_smoke.sh
# Optional Neon DB smoke (requires NEON_DATABASE_URL):
$ PYTHON_BIN=.venv/bin/python DATABASE_URL_ENV=NEON_DATABASE_URL SCHEMA=honestroles_api bash scripts/run_neondb_smoke.sh

For local profiling, keep large parquet inputs under data/ and write generated artifacts under dist/ (both directories are git-ignored).
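A minimal local layout matching that convention (directory names come from the paragraph above; the rest is just illustrative shell):

```shell
# Inputs under data/, generated artifacts under dist/.
mkdir -p data dist
# Both paths are git-ignored, so large parquet files never enter history.
ls -d data dist
```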

Maintainer Notes

  • PyPI publishing is manual and token-based via bash scripts/publish_pypi.sh.
  • The script reads PYPI_API_KEY (or PYPI_API_TOKEN) from env/.env.
  • The GitHub Release workflow is manual (workflow_dispatch) only.
  • Before publishing, run the deterministic gate:
$ PYTHON_BIN=.venv/bin/python bash scripts/run_coverage.sh
  • Full maintainer runbook: docs/for-maintainers/release-and-pypi.md.

License

MIT

About

Clean, filter, label, and rate job description data
