Skip to content

mokarimi19/pymetabase

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

PyMetabase

The most complete Python API wrapper and CLI for Metabase

PyPI Python License Tests Downloads


Export millions of rows, manage cards, dashboards, collections, execute parameterized queries, and access every Metabase API endpoint — from Python or the command line.

🔍 Why PyMetabase?

Feature PyMetabase Other Libraries
Full Metabase API coverage Partial
CLI tool
Automatic chunking for large exports
Parameterized queries ({{param}})
Session token persistence & reuse
Resume interrupted exports
rclone-style remote management
Cards, Dashboards, Collections API Rare
Generic API access (any endpoint) Rare
Search, Users, Tables/Fields API

📦 Installation

pip install pymetabase

Optional extras:

pip install pymetabase[excel]    # Excel export (pandas + openpyxl)
pip install pymetabase[parquet]  # Parquet export (pyarrow)

⚡ Quick Start

Python — Export your first dataset in 4 lines

from pymetabase import Metabase

with Metabase(url="https://metabase.example.com", username="analyst@company.com", password="pass") as mb:
    result = mb.export(database="Production", query="SELECT * FROM customers", output="customers.jsonl")
    print(f"Exported {result.total_rows:,} rows in {result.duration_seconds:.1f}s")

CLI — Zero code, full power

# Step 1: Configure your Metabase server (one-time setup)
pymetabase config

# Step 2: Start working
pymetabase export -d Production -q "SELECT * FROM customers" -o customers.jsonl
pymetabase cards                    # see all saved questions
pymetabase search "revenue"         # find anything across your instance
pymetabase query -d Production -q "SELECT COUNT(*) FROM orders WHERE status = {{status}}" -p status=completed

📖 Features

🔌 Generic API Access

Hit any Metabase endpoint — no wrapper method needed:

with Metabase(url=url, username=user, password=pwd) as mb:
    # Fetch all cards
    all_cards = mb.get("/api/card")

    # Create a new question
    mb.post("/api/card", json={
        "name": "Daily Active Users",
        "dataset_query": {"database": 1, "type": "native", "native": {"query": "SELECT ..."}},
        "display": "line",
        "visualization_settings": {},
    })

    # Update and delete
    mb.put("/api/card/1", json={"name": "Weekly Active Users"})
    mb.delete_resource("/api/card/1")
# Same from the command line
pymetabase api get /api/card
pymetabase api post /api/card --json-body '{"name": "New Question", ...}'

📋 Cards (Saved Questions)

with Metabase(...) as mb:
    # Browse saved questions
    cards = mb.list_cards()
    card = mb.get_card(42)

    # Run a saved question with parameters
    result = mb.execute_card(42, parameters={"date_filter": "2024-01-01"})

    # Create a new question programmatically
    mb.create_card(
        name="Monthly Revenue by Region",
        dataset_query={
            "database": 1,
            "type": "native",
            "native": {"query": "SELECT region, SUM(amount) FROM sales GROUP BY region"},
        },
        collection_id=5,
    )

    # Housekeeping
    mb.update_card(42, name="Monthly Revenue by Region (v2)")
    mb.delete_card(99)

📊 Dashboards

with Metabase(...) as mb:
    dashboards = mb.list_dashboards()
    dash = mb.get_dashboard(1)   # includes all cards on the dashboard

    mb.create_dashboard("Q4 Executive Report", collection_id=3, description="KPIs and trends")
    mb.update_dashboard(1, name="Q4 Executive Report — Final")
    mb.delete_dashboard(1)

📁 Collections

with Metabase(...) as mb:
    collections = mb.list_collections()
    items = mb.get_collection_items("root")   # list everything at the top level
    items = mb.get_collection_items(7)         # or inside a specific collection

    mb.create_collection("Data Engineering", parent_id=1, color="#509EE3")

🔎 Search

Find cards, dashboards, collections, and tables in one call:

results = mb.search("revenue", models=["card", "dashboard"])
pymetabase search "revenue"

👤 Users

me = mb.get_current_user()               # who am I?
user = mb.get_user(1)                     # specific user
users = mb.list_users()                   # all users (admin only)
pymetabase whoami
pymetabase users

🗄️ Tables & Fields

Explore your database schema without SQL:

tables = mb.list_tables("Production")     # all tables in a database
table = mb.get_table(1)                   # table details
metadata = mb.get_table_metadata(1)       # columns, types, FKs
field = mb.get_field(42)                  # single field info
values = mb.get_field_values(42)          # distinct values (great for filters)

🧩 Parameterized Queries

Use {{param}} placeholders in SQL — PyMetabase handles the Metabase template-tag wiring for you:

with Metabase(...) as mb:
    # Filter orders by status and minimum amount
    rows = mb.query(
        "SELECT * FROM orders WHERE status = {{status}} AND total > {{min_total}}",
        database="Production",
        parameters={"status": "shipped", "min_total": 500},
    )

    # Export with parameters — same syntax
    mb.export(
        database="Production",
        query="SELECT * FROM signups WHERE created_at > {{since}}",
        output="recent_signups.jsonl",
        parameters={"since": "2024-01-01"},
    )
# CLI — pass parameters with -p key=value
pymetabase query -d Production \
    -q "SELECT * FROM orders WHERE status = {{status}}" \
    -p status=shipped

pymetabase export -d Production \
    -q "SELECT * FROM users WHERE role = {{role}}" \
    -o admins.jsonl -p role=admin

📤 Export

Large Dataset Export

Automatic chunking for datasets with millions of rows. No memory issues, no timeouts:

with Metabase(...) as mb:
    result = mb.export(
        database="Production",
        query="SELECT * FROM events",       # 10M+ rows? No problem
        output="events.jsonl",
        chunk_size=500_000,                  # fetch 500K rows at a time
        checkpoint_file="progress.json",     # resume if interrupted
        progress_callback=lambda cur, total, rate: print(f"{cur:,}/{total:,} rows ({rate:.0f}/s)"),
    )
    print(f"Done — {result.total_rows:,} rows, {result.chunks} chunks, {result.duration_seconds:.0f}s")

How it works under the hood:

  1. Counts total rows with SELECT COUNT(*)
  2. Wraps your query with ROW_NUMBER() OVER () and fetches in offset chunks
  3. Retries failed chunks with exponential backoff
  4. Saves progress to a checkpoint file so you can resume after crashes
  5. Works with any SQL — CTEs, JOINs, subqueries, window functions

Export Formats

Output format is detected from the file extension:

pymetabase export -d DB -q "SELECT * FROM users" -o users.jsonl   # JSON Lines (default)
pymetabase export -d DB -q "SELECT * FROM users" -o users.json    # JSON array
pymetabase export -d DB -q "SELECT * FROM users" -o users.csv     # CSV

Table Export with Filtering

Export a table directly — no SQL needed:

with Metabase(...) as mb:
    mb.export_table(
        database="Production",
        table="users",
        output="active_users.csv",
        columns=["id", "name", "email", "plan"],
        where="active = true AND plan = 'pro'",
        order_by="created_at DESC",
        limit=10_000,
    )

🌐 Remote Management

Manage multiple Metabase servers like rclone manages cloud storage:

# Configure a remote (interactive)
pymetabase config

# List all configured remotes
pymetabase listremotes
# Configured remotes (2):
#   - production (default): https://metabase.company.com
#   - staging: https://staging-metabase.company.com

# Use a specific remote for any command
pymetabase -r staging export -d DB -q "SELECT 1" -o test.jsonl

# Manage remotes
pymetabase testremote production     # verify connection
pymetabase showremote production     # show config details
pymetabase selectremote staging      # change default
pymetabase deleteremote old-server   # remove a remote

# Quick identity check
pymetabase whoami

🔑 Session tokens are persisted — PyMetabase caches your auth token and reuses it across runs. No re-login on every command.

Config location:

  • macOS / Linux: ~/.config/pymetabase/config.json
  • Windows: %APPDATA%\pymetabase\config.json

⚙️ Configuration

PyMetabase supports multiple configuration methods. Highest priority wins:

1. Direct Parameters

mb = Metabase(url="https://metabase.company.com", username="analyst@company.com", password="...")

2. Remotes (recommended for CLI)

pymetabase config           # interactive setup
pymetabase -r myserver export ...

3. Credentials File

[{"SERVER_NAME": "https://metabase.company.com", "USERNAME": "analyst@company.com", "PASSWORD": "..."}]
mb = Metabase(credentials_file="credentials.json")

4. Config File (YAML)

metabase:
  url: https://metabase.company.com
  username: analyst@company.com
  password: secret

defaults:
  chunk_size: 500000
  format: jsonl

retry:
  max_retries: 3
  delay: 1.0
  timeout: 600

5. Environment Variables

export METABASE_URL=https://metabase.company.com
export METABASE_USERNAME=analyst@company.com
export METABASE_PASSWORD=secret

📘 CLI Reference

Command Description
export Export SQL query results to file
export-table Export a table with optional column/filter selection
query Execute SQL and print/save results (-f table|json|jsonl|csv -o file)
cards List all saved questions
card <id> Show card details
execute-card <id> Run a saved question (with optional -p params)
export-card <id> Export a saved question to file (-o file)
dashboards List all dashboards
dashboard <id> Show dashboard details and its cards
collections List all collections
collection <id> Show items inside a collection
search <query> Search cards, dashboards, collections, tables
users List all users (admin only)
whoami Show current authenticated user
list-databases List available databases
list-tables List tables in a database
api <method> <endpoint> Raw API request (GET/POST/PUT/DELETE)
config Add, edit, or delete remotes
listremotes List configured remotes
selectremote Set the default remote
showremote <name> Show remote configuration
deleteremote <name> Delete a remote
testremote [name] Test connection to a remote

📘 Python API Reference

Metabase Client

Metabase(
    url: str = None,
    username: str = None,
    password: str = None,
    config_file: str = None,
    credentials_file: str = None,
    remote_name: str = None,     # for token persistence
    persist_token: bool = True,  # reuse sessions across runs
)

Methods

Category Methods
Generic API get(endpoint) post(endpoint, **kwargs) put(endpoint, **kwargs) delete_resource(endpoint)
Cards list_cards() get_card(id) create_card(...) update_card(id, ...) delete_card(id) execute_card(id, parameters=)
Dashboards list_dashboards() get_dashboard(id) create_dashboard(...) update_dashboard(id, ...) delete_dashboard(id)
Collections list_collections() get_collection(id) get_collection_items(id) create_collection(...)
Search search(query, models=, limit=)
Users get_current_user() list_users() get_user(id)
Tables/Fields list_tables(db) get_table(id) get_table_metadata(id) get_field(id) get_field_values(id)
Queries query(sql, database, parameters=)
Export export(query, output, database, parameters=, chunk_size=, ...) export_table(table, output, ...)
Connection connect() disconnect()

ExportResult

result.total_rows         # int — total rows exported
result.chunks             # int — number of chunks used
result.duration_seconds   # float — wall-clock time
result.output_file        # str — path to output file
result.rate_per_second    # float — rows/sec throughput
result.format             # str — "jsonl", "json", or "csv"

🛠️ Requirements

  • Python 3.9+
  • requests sqlglot pyyaml

🤝 Contributing

Contributions are welcome! Please open an issue or submit a pull request.

git clone https://github.com/mokarimi19/pymetabase.git
cd pymetabase
pip install -e ".[dev]"
pytest

📄 License

MIT

About

The most complete Python API wrapper and CLI for Metabase. Export millions of rows, manage cards, dashboards, collections, and more.

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages