Add Exa Search API support as internet search tool #1846
maxwbuckley wants to merge 9 commits into NVIDIA:develop
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the CodeRabbit settings. Use the following commands to manage reviews, or the checkboxes below for quick actions:
Walkthrough
Adds an Exa-backed LangChain internet search tool: new config, async tool implementation with retries and result formatting, automatic registration, dependency addition, docs for Exa usage, and unit tests for the config.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Agent
    participant ExaTool as ExaInternetSearchTool
    participant Builder
    participant ExaClient
    participant ExaAPI
    Agent->>ExaTool: request internet_search(query)
    ExaTool->>Builder: resolve tool config & secrets
    ExaTool->>ExaClient: instantiate AsyncExa (use API key)
    ExaTool->>ExaClient: search_and_contents(query, params)
    ExaClient->>ExaAPI: HTTP request
    ExaAPI-->>ExaClient: search results
    ExaClient-->>ExaTool: results
    ExaTool-->>Agent: formatted <Document/> blocks or error message
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 3
🧹 Nitpick comments (2)
packages/nvidia_nat_langchain/pyproject.toml (1)

65-66: Keep dependency entries sorted to match the local file contract. The new `langchain-exa` entry breaks the declared "Keep sorted!!!" ordering in this dependency block. Please move it after `langchain-core` to preserve deterministic diffs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/pyproject.toml` around lines 65-66: The dependency list is out of sorted order: move the "langchain-exa>=1.1.0,<2.0.0" entry so it appears after "langchain-core>=1.2.6,<2.0.0" to restore the declared "Keep sorted!!!" ordering; ensure the two entries remain otherwise unchanged and the block stays alphabetically sorted.

packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py (1)
46-47: Add an explicit return type to the public registration function. The async registration function should declare its yielded type for API clarity and static checks.

As per coding guidelines: "All public APIs require Python 3.11+ type hints on parameters and return values".

Proposed fix

```diff
+from collections.abc import AsyncGenerator
 ...
-async def exa_internet_search(tool_config: ExaInternetSearchToolConfig, builder: Builder):
+async def exa_internet_search(
+    tool_config: ExaInternetSearchToolConfig,
+    builder: Builder,
+) -> AsyncGenerator[FunctionInfo, None]:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 46 - 47, The public async registration function exa_internet_search is missing an explicit return type; update its signature to include a typed async generator return annotation (e.g., -> AsyncGenerator[Tool, None]) and add the necessary import from typing (AsyncGenerator) and the Tool type used by the registration system so the signature reads like: async def exa_internet_search(tool_config: ExaInternetSearchToolConfig, builder: Builder) -> AsyncGenerator[Tool, None]: ensuring the yielded type matches the actual yielded objects in the function body.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Line 37: The model field max_retries can be <= 0 which causes the retry loop
in _exa_internet_search to be skipped and the function to implicitly return
None; add a guard and ensure a non-None return: validate/normalize max_retries
on model initialization (e.g., enforce min 1 or coerce negatives to 0) and
modify _exa_internet_search so that when the retry loop is skipped or all
attempts fail it explicitly returns an empty list (or other documented default)
instead of None; update references to max_retries and the retry loop inside
_exa_internet_search to use the validated value and always return a concrete
value.
- Around line 53-58: The code mutates process-wide environment EXA_API_KEY
during tool setup; remove the conditional that sets os.environ["EXA_API_KEY"]
and instead rely solely on the explicit api_key argument (falling back to
os.environ.get("EXA_API_KEY") only when constructing ExaSearchResults). Update
the ExaSearchResults instantiation (ExaSearchResults(exa_api_key=...)) to use
api_key or os.environ.get(...) but do not write to os.environ anywhere in this
module (remove the block that assigns os.environ["EXA_API_KEY"]).
- Around line 38-43: Replace the loose string types for the config fields with
enum-like types so invalid values fail at parse time: change the annotations for
search_type and livecrawl to constrained types (e.g., from typing import Literal
and use search_type: Literal["neural","keyword","auto"] and livecrawl:
Literal["always","fallback","never"] or define enums via class SearchType(Enum)
and class Livecrawl(Enum) and use those types), keep the Field(...) calls for
defaults/description but update the defaults to one of the allowed values and
add the necessary imports (Literal or Enum) so pydantic validates inputs when
parsing the model.
---
Nitpick comments:
In `@packages/nvidia_nat_langchain/pyproject.toml`:
- Around line 65-66: The dependency list is out of sorted order: move the
"langchain-exa>=1.1.0,<2.0.0" entry so it appears after
"langchain-core>=1.2.6,<2.0.0" to restore the declared "Keep sorted!!!"
ordering; ensure the two entries remain otherwise unchanged and the block stays
alphabetically sorted.
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Around line 46-47: The public async registration function exa_internet_search
is missing an explicit return type; update its signature to include a typed
async generator return annotation (e.g., -> AsyncGenerator[Tool, None]) and add
the necessary import from typing (AsyncGenerator) and the Tool type used by the
registration system so the signature reads like: async def
exa_internet_search(tool_config: ExaInternetSearchToolConfig, builder: Builder)
-> AsyncGenerator[Tool, None]: ensuring the yielded type matches the actual
yielded objects in the function body.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 71f86120-bb29-4abf-929b-468819a50794
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- docs/source/get-started/tutorials/add-tools-to-a-workflow.md
- packages/nvidia_nat_langchain/pyproject.toml
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/register.py
- packages/nvidia_nat_langchain/tests/test_exa_internet_search.py
Actionable comments posted: 1
♻️ Duplicate comments (3)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py (3)
37-37: ⚠️ Potential issue | 🟠 Major — Guard retry bounds to avoid implicit `None` returns.

If `max_retries <= 0`, the loop at Line 73 is skipped and `_exa_internet_search` can return `None` implicitly.

Suggested fix

```diff
-    max_retries: int = Field(default=3, description="Maximum number of retries for the search request")
+    max_retries: int = Field(default=3, ge=1, description="Maximum number of retries for the search request")
 ...
     for attempt in range(tool_config.max_retries):
         try:
             ...
         except Exception:
             ...
             await asyncio.sleep(2**attempt)
+    return f"Web search failed after {tool_config.max_retries} attempts for: {question}"
```

Also applies to: 73-96
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` at line 37, The max_retries Field can be zero or negative causing the retry loop in _exa_internet_search to be skipped and the function to implicitly return None; add a guard in either the Field validation or at start of _exa_internet_search to coerce/validate max_retries to a positive integer (e.g., if max_retries is None or <=0 set to 1 or raise ValueError), and ensure _exa_internet_search always returns an explicit value (like an empty list or a standardized error result) rather than None so callers don’t get implicit None returns.
38-43: ⚠️ Potential issue | 🟠 Major — Constrain `search_type` and `livecrawl` at config-parse time.

Right now, invalid strings pass validation and fail only at runtime. Use enum-like typing (`Literal`) so bad values are rejected early.

Suggested fix

```diff
+from typing import Literal
 ...
-    search_type: str = Field(
+    search_type: Literal["auto", "neural", "keyword"] = Field(
         default="auto", description="Type of search to perform - 'neural', 'keyword', or 'auto'")
-    livecrawl: str = Field(
+    livecrawl: Literal["always", "fallback", "never"] = Field(
         default="fallback", description="Livecrawl behavior - 'always', 'fallback', or 'never'")
```

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 38 - 43, Replace the loose str types for search_type and livecrawl so invalid values are rejected during config parsing: change the type annotations on the model fields search_type and livecrawl from str to Literal types (e.g., Literal["neural","keyword","auto"] for search_type and Literal["always","fallback","never"] for livecrawl) or use an Enum, and keep the existing Field(...) defaults and descriptions; this ensures pydantic/schema validation fails at parse time instead of letting bad strings slip through to runtime in functions that rely on these fields.
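A minimal sketch of such a constrained config model, assuming pydantic v2 is available. The field names mirror the review; the class name and defaults are illustrative:

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class ExaSearchConfig(BaseModel):
    # ge=1 rejects zero/negative retry counts when the model is parsed.
    max_retries: int = Field(default=3, ge=1, description="Maximum retries")
    # Literal makes pydantic reject unknown strings during validation,
    # instead of letting them reach the search API at runtime.
    search_type: Literal["auto", "neural", "keyword"] = Field(default="auto")
    livecrawl: Literal["always", "fallback", "never"] = Field(default="fallback")
```

With this model, `ExaSearchConfig(search_type="fuzzy")` or `ExaSearchConfig(max_retries=0)` raises `ValidationError` immediately at parse time.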
53-58: ⚠️ Potential issue | 🔴 Critical — Do not mutate process-wide `EXA_API_KEY` during tool setup.

Writing to `os.environ` here creates shared global state and can leak or cross-wire credentials under concurrency. Resolve the key locally and pass it directly to `Exa(...)`.

Suggested fix

```diff
-    if not os.environ.get("EXA_API_KEY"):
-        if api_key:
-            os.environ["EXA_API_KEY"] = api_key
-    # This Exa tool requires an API Key and it must be set as an environment variable (EXA_API_KEY)
-
-    exa_client = Exa(api_key=api_key or os.environ.get("EXA_API_KEY", ""))
+    resolved_api_key = api_key or os.environ.get("EXA_API_KEY", "")
+    exa_client = Exa(api_key=resolved_api_key)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 53 - 58, Do not write to process-wide os.environ; instead resolve the key locally and pass it into the Exa constructor: compute a local variable (e.g., resolved_api_key = api_key or os.environ.get("EXA_API_KEY", "")) and instantiate exa_client = Exa(api_key=resolved_api_key) without assigning to os.environ; remove the branch that mutates EXA_API_KEY and optionally validate resolved_api_key and raise/handle missing key near where exa_client is created.
🧹 Nitpick comments (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py (1)
46-47: Add public API docstring and explicit return type annotation.

`exa_internet_search(...)` is a public registered function and should include a Google-style docstring plus an explicit return type (`AsyncGenerator[FunctionInfo, None]`).

As per coding guidelines, "Provide Google-style docstrings for every public module, class, function and CLI command" and "All public APIs require Python 3.11+ type hints on parameters and return values".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 46 - 47, Add a Google-style docstring to the public registered function exa_internet_search describing its purpose, parameters (tool_config: ExaInternetSearchToolConfig, builder: Builder), and yield behavior, and add an explicit return type annotation AsyncGenerator[FunctionInfo, None] to the function signature; ensure imports/types needed for AsyncGenerator and FunctionInfo are available and reference the registration via register_function so tooling recognizes the API.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Around line 75-81: The call to exa_client.search_and_contents is synchronous
inside async code; replace the blocking Exa usage by importing AsyncExa (change
`from exa_py import Exa` to `from exa_py import AsyncExa`), instantiate the
async client (replace where `exa_client = Exa(...)` is created) and call its
async method with await (use `await exa_client.search_and_contents(...)`),
ensuring any surrounding function is async and errors are awaited/handled; keep
the same arguments (question, num_results, type, livecrawl, text) and update any
teardown/close calls to the async client equivalents.
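The hazard behind this comment is generic: calling a synchronous SDK inside a coroutine blocks the event loop. Below is a sketch of the fallback pattern for when only a sync client exists — the review's preferred fix is `exa_py.AsyncExa` with `await`, which the stub here merely imitates:

```python
import asyncio


class SyncClientStub:
    """Stand-in for a blocking SDK client such as exa_py.Exa."""

    def search_and_contents(self, query: str, num_results: int = 3) -> list[str]:
        # Imagine a blocking HTTP round-trip here.
        return [f"result {i} for {query}" for i in range(num_results)]


async def search(client: SyncClientStub, query: str) -> list[str]:
    # asyncio.to_thread moves the blocking call off the event loop, so
    # other coroutines keep running while the request is in flight.
    return await asyncio.to_thread(client.search_and_contents, query, num_results=2)
```

With a real async client the `to_thread` wrapper disappears and the call becomes `await exa_client.search_and_contents(...)` directly, as the comment suggests.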
---
Duplicate comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Line 37: The max_retries Field can be zero or negative causing the retry loop
in _exa_internet_search to be skipped and the function to implicitly return
None; add a guard in either the Field validation or at start of
_exa_internet_search to coerce/validate max_retries to a positive integer (e.g.,
if max_retries is None or <=0 set to 1 or raise ValueError), and ensure
_exa_internet_search always returns an explicit value (like an empty list or a
standardized error result) rather than None so callers don’t get implicit None
returns.
- Around line 38-43: Replace the loose str types for search_type and livecrawl
so invalid values are rejected during config parsing: change the type
annotations on the model fields search_type and livecrawl from str to Literal
types (e.g., Literal["neural","keyword","auto"] for search_type and
Literal["always","fallback","never"] for livecrawl) or use an Enum, and keep the
existing Field(...) defaults and descriptions; this ensures pydantic/schema
validation fails at parse time instead of letting bad strings slip through to
runtime in functions that rely on these fields.
- Around line 53-58: Do not write to process-wide os.environ; instead resolve
the key locally and pass it into the Exa constructor: compute a local variable
(e.g., resolved_api_key = api_key or os.environ.get("EXA_API_KEY", "")) and
instantiate exa_client = Exa(api_key=resolved_api_key) without assigning to
os.environ; remove the branch that mutates EXA_API_KEY and optionally validate
resolved_api_key and raise/handle missing key near where exa_client is created.
---
Nitpick comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Around line 46-47: Add a Google-style docstring to the public registered
function exa_internet_search describing its purpose, parameters (tool_config:
ExaInternetSearchToolConfig, builder: Builder), and yield behavior, and add an
explicit return type annotation AsyncGenerator[FunctionInfo, None] to the
function signature; ensure imports/types needed for AsyncGenerator and
FunctionInfo are available and reference the registration via register_function
so tooling recognizes the API.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e896f727-5c3f-4946-8c76-57f8bc3fe183
📒 Files selected for processing (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py
Force-pushed b8da77f to a69e612 (Compare)
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Line 35: The max_results field currently allows zero/negative values; update
the config model's declaration of max_results to enforce a minimum of 1 (e.g.,
replace "max_results: int = 3" with a pydantic constrained field like
"max_results: int = Field(3, ge=1)" or use conint(ge=1)), and add the necessary
import from pydantic (Field or conint) so model-parse time validation prevents
invalid values; ensure this change is applied to the config class that defines
max_results in exa_internet_search.py.
- Around line 57-70: In _exa_internet_search, add a fast-fail check at the top
to immediately return an error (or raise) when no Exa API key is available:
check both the configured key (tool_config.exa_api_key) and the environment
(os.environ.get('EXA_API_KEY')) and if both are empty, return/raise immediately
instead of proceeding into the retry/backoff loop that uses
tool_config.max_retries; place this check before the question truncation and the
for attempt in range(tool_config.max_retries) loop so unnecessary
retries/backoff are avoided.
- Around line 87-92: The except block in the web-search retry logic in
exa_internet_search.py currently catches all exceptions silently; change this to
import logging and create a module-level logger, narrow the except to retryable
exceptions (e.g., httpx.RequestError, httpx.ReadTimeout, asyncio.TimeoutError)
and call logger.exception(...) before each retry, and separately handle
non-retryable errors (e.g., ValueError, httpx.HTTPStatusError with 401/403) to
fail fast (log with logger.exception and return the fallback message
immediately). Ensure the final fallback return still logs the last exception
with logger.exception so the full stack trace is captured when giving up after
tool_config.max_retries.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e72dc882-36a7-402a-b00f-37853e2058f5
📒 Files selected for processing (2)
- packages/nvidia_nat_langchain/pyproject.toml
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py
✅ Files skipped from review due to trivial changes (1)
- packages/nvidia_nat_langchain/pyproject.toml
Actionable comments posted: 2
♻️ Duplicate comments (2)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py (2)
87-92: ⚠️ Potential issue | 🟠 Major — Replace blind catch with logged, selective retry handling.

Current handling catches everything silently; this loses stack traces and may retry non-retryable failures.

🔧 Suggested direction

```diff
+import logging
 @@
+logger = logging.getLogger(__name__)
 @@
-        except Exception:
+        except Exception:
+            logger.exception("Exa search attempt %s/%s failed", attempt + 1, tool_config.max_retries)
             # Return a graceful message instead of raising, so the agent can
             # continue reasoning without web search rather than failing entirely.
             if attempt == tool_config.max_retries - 1:
                 return f"Web search failed after {tool_config.max_retries} attempts for: {question}"
             await asyncio.sleep(2**attempt)
```

In exa-py versions compatible with langchain-exa>=1.1.0,<2.0.0, which exception classes can AsyncExa.search_and_contents raise for transient network/server failures versus auth/configuration failures?

As per coding guidelines: "When catching and logging exceptions without re-raising: always use `logger.exception()` to capture the full stack trace information."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 87 - 92, Replace the blind except in the retry loop around AsyncExa.search_and_contents with selective handling: catch transient/network/server exceptions thrown by AsyncExa.search_and_contents (e.g., connection/timeouts/retryable HTTP errors) and on those call logger.exception(...) to record the stack trace, perform the exponential backoff (await asyncio.sleep(2**attempt)), and only return the graceful failure message after exhausting tool_config.max_retries; for non-retryable errors (authentication/configuration errors) re-raise or return immediately so they are not retried. Locate the retry block in exa_internet_search.py around the AsyncExa.search_and_contents call and replace the broad except Exception with specific exception classes and logger.exception usage while preserving the existing max_retries/attempt logic.
53-71: ⚠️ Potential issue | 🟠 Major — Fail fast when no Exa API key is configured.

If both config and env are empty, the tool still enters retries and backoff, adding avoidable latency.

🔧 Suggested fix

```diff
 async def _exa_internet_search(question: str) -> str:
     """This tool retrieves relevant contexts from web search (using Exa) for the given question.
 @@
     Returns:
         str: The web search results.
     """
+    if not resolved_api_key:
+        return "Web search is unavailable: `EXA_API_KEY` is not configured."
+
     # Exa API supports longer queries than Tavily but truncate at a reasonable limit
     if len(question) > 2000:
         question = question[:1997] + "..."
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 53 - 71, The function _exa_internet_search currently creates exa_client with resolved_api_key and then enters retry/backoff loop even when no key is configured; change it to fail fast by checking resolved_api_key (or api_key) before creating/using AsyncExa and raise/log a clear error or return immediately if it's empty so you don't enter the for attempt in range(tool_config.max_retries) loop; update the early check near where resolved_api_key/api_key and exa_client are set (and before the loop that uses tool_config.max_retries) to short-circuit execution when no API key is present.
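The short-circuit can be shown in isolation. `guarded_search`, `do_search`, and the message text are placeholders for the PR's actual code:

```python
import asyncio


async def guarded_search(question: str, api_key: str, do_search, max_retries: int = 3) -> str:
    # Fail fast: with no key, every attempt is guaranteed to fail, so
    # skip the retry/backoff loop entirely instead of adding latency.
    if not api_key:
        return "Web search is unavailable: EXA_API_KEY is not configured."
    for attempt in range(max_retries):
        try:
            return await do_search(question)
        except Exception:
            if attempt == max_retries - 1:
                return f"Web search failed after {max_retries} attempts for: {question}"
            await asyncio.sleep(0)  # real code: 2**attempt backoff
    return ""  # unreachable when max_retries >= 1
```

Placing the guard before the loop means a misconfigured deployment reports the problem immediately rather than after `max_retries` rounds of exponential backoff.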
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/source/get-started/tutorials/add-tools-to-a-workflow.md`:
- Around line 170-187: The Exa subsection shows configuring
functions.internet_search with _type: exa_internet_search but omits wiring the
tool into the workflow; update the docs to add an explicit workflow block that
sets workflow.tool_names to include internet_search and current_datetime (and
use the correct workflow._type, e.g., react_agent) so the example demonstrates
both function registration (functions.internet_search / current_datetime) and
adding those names to workflow.tool_names.
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Around line 82-84: The generated web_search_results string incorrectly uses a
self-closing opening tag plus a separate closing tag; update the formatting
where web_search_results is built (iterating over search_response.results and
using doc.url and doc.text) to use a proper opening tag with href (e.g.,
<Document href="...">) followed by the document text and then the closing
</Document> tag so the XML/HTML is well-formed.
---
Duplicate comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Around line 87-92: Replace the blind except in the retry loop around
AsyncExa.search_and_contents with selective handling: catch
transient/network/server exceptions thrown by AsyncExa.search_and_contents
(e.g., connection/timeouts/retryable HTTP errors) and on those call
logger.exception(...) to record the stack trace, perform the exponential backoff
(await asyncio.sleep(2**attempt)), and only return the graceful failure message
after exhausting tool_config.max_retries; for non-retryable errors
(authentication/configuration errors) re-raise or return immediately so they are
not retried. Locate the retry block in exa_internet_search.py around the
AsyncExa.search_and_contents call and replace the broad except Exception with
specific exception classes and logger.exception usage while preserving the
existing max_retries/attempt logic.
- Around line 53-71: The function _exa_internet_search currently creates
exa_client with resolved_api_key and then enters retry/backoff loop even when no
key is configured; change it to fail fast by checking resolved_api_key (or
api_key) before creating/using AsyncExa and raise/log a clear error or return
immediately if it's empty so you don't enter the for attempt in
range(tool_config.max_retries) loop; update the early check near where
resolved_api_key/api_key and exa_client are set (and before the loop that uses
tool_config.max_retries) to short-circuit execution when no API key is present.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: db1cc117-034d-4644-b824-f2cfbfcbf8e9
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- docs/source/get-started/tutorials/add-tools-to-a-workflow.md
- packages/nvidia_nat_langchain/pyproject.toml
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/register.py
- packages/nvidia_nat_langchain/tests/test_exa_internet_search.py
✅ Files skipped from review due to trivial changes (3)
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/register.py
- packages/nvidia_nat_langchain/pyproject.toml
- packages/nvidia_nat_langchain/tests/test_exa_internet_search.py
♻️ Duplicate comments (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py (1)
85-88: ⚠️ Potential issue | 🟡 Minor — Fix malformed XML tag syntax.

The `<Document>` wrapper uses a self-closing opening tag (`/>`) combined with a separate closing tag, which is inconsistent XML.

Proposed fix

```diff
     web_search_results = "\n\n---\n\n".join([
-        f'<Document href="{doc.url}"/>\n{doc.text}\n</Document>'
+        f'<Document href="{doc.url}">\n{doc.text}\n</Document>'
         for doc in search_response.results if doc.text
     ])
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 85 - 88, The XML wrapper for search results is malformed: the opening tag in the web_search_results join uses a self-closing form ('<Document href="..."/>') but then adds a separate closing tag; update the string construction inside web_search_results (the list comprehension iterating over search_response.results and using doc.url/doc.text) so the opening tag is a proper start tag (e.g., '<Document href="...">') paired with the existing '</Document>' closing tag to produce well-formed XML.
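The corrected formatting, extracted into a small testable helper; `SimpleNamespace` stands in for Exa's result objects and the helper name is illustrative:

```python
from types import SimpleNamespace


def format_results(results) -> str:
    # Proper open/close pair — <Document href="...">text</Document> —
    # with empty-text results filtered out, joined by a "---" separator.
    return "\n\n---\n\n".join(
        f'<Document href="{doc.url}">\n{doc.text}\n</Document>'
        for doc in results
        if doc.text
    )
```

The output is now well-formed markup: every `<Document ...>` start tag is matched by its `</Document>` close, and no stray `/>` appears.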
🧹 Nitpick comments (1)
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py (1)
90-96: Add logging for exception handling.

The code catches all exceptions silently without logging, making production debugging difficult. Per coding guidelines, use `logger.exception()` when catching exceptions without re-raising. Additionally, not all exceptions warrant retry (e.g., auth failures from 401/403 should fail fast).

Proposed improvement

```diff
+import logging
+
+logger = logging.getLogger(__name__)
+
 # ... in the function ...
         except Exception:
+            logger.exception("Exa search attempt %d failed", attempt + 1)
             # Return a graceful message instead of raising, so the agent can
             # continue reasoning without web search rather than failing entirely.
             if attempt == tool_config.max_retries - 1:
                 return f"Web search failed after {tool_config.max_retries} attempts for: {question}"
             await asyncio.sleep(2**attempt)
```

As per coding guidelines: "When catching and logging exceptions without re-raising, always use `logger.exception()` to capture the full stack trace."
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py` around lines 90 - 96, The except block in the web search retry loop swallows all exceptions—update it to catch Exception as e and call logger.exception(...) to log the full stack trace and context (include question and attempt), and add a fast-fail for authorization errors by checking the exception for HTTP status 401/403 (e.g., inspect e.response.status or isinstance checks for HTTPError) and immediately return a clear failure string in that case; for other exceptions continue the existing exponential backoff (await asyncio.sleep(2**attempt)) and only return the final failure after tool_config.max_retries attempts. Reference the existing variables/methods: attempt, tool_config.max_retries, question, and logger/asyncio.sleep in exa_internet_search.py.
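A sketch of the selective handling described above. Since the review itself notes that exa-py's concrete exception classes aren't pinned down, the two exception classes here are placeholders:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class AuthError(Exception):
    """Placeholder for a non-retryable 401/403-style failure."""


class TransientError(Exception):
    """Placeholder for a retryable network/timeout failure."""


async def search_with_selective_retry(question: str, do_search, max_retries: int = 3) -> str:
    for attempt in range(max_retries):
        try:
            return await do_search(question)
        except AuthError:
            # Non-retryable: log the full stack trace and fail fast.
            logger.exception("Exa auth failure; not retrying")
            return f"Web search failed (authorization) for: {question}"
        except TransientError:
            # Retryable: logger.exception keeps the stack trace for each
            # attempt, then back off before retrying.
            logger.exception("Exa attempt %d/%d failed", attempt + 1, max_retries)
            if attempt == max_retries - 1:
                return f"Web search failed after {max_retries} attempts for: {question}"
            await asyncio.sleep(0)  # real code: 2**attempt backoff
    return ""  # unreachable when max_retries >= 1
```

Splitting the handlers this way means an invalid API key surfaces on the first attempt, while flaky-network errors still get the backoff-and-retry treatment.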
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Around line 85-88: The XML wrapper for search results is malformed: the
opening tag in the web_search_results join uses a self-closing form ('<Document
href="..."/>') but then adds a separate closing tag; update the string
construction inside web_search_results (the list comprehension iterating over
search_response.results and using doc.url/doc.text) so the opening tag is a
proper start tag (e.g., '<Document href="...">') paired with the existing
'</Document>' closing tag to produce well-formed XML.
---
Nitpick comments:
In
`@packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py`:
- Around line 90-96: The except block in the web search retry loop swallows all
exceptions—update it to catch Exception as e and call logger.exception(...) to
log the full stack trace and context (include question and attempt), and add a
fast-fail for authorization errors by checking the exception for HTTP status
401/403 (e.g., inspect e.response.status or isinstance checks for HTTPError) and
immediately return a clear failure string in that case; for other exceptions
continue the existing exponential backoff (await asyncio.sleep(2**attempt)) and
only return the final failure after tool_config.max_retries attempts. Reference
the existing variables/methods: attempt, tool_config.max_retries, question, and
logger/asyncio.sleep in exa_internet_search.py.
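The retry behavior the nitpick asks for can be sketched as a standalone coroutine. This is a minimal illustration, not the tool's actual implementation: `AuthError`, `search_with_retries`, and `_always_fails` are hypothetical names standing in for the real config, client, and error types.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class AuthError(Exception):
    """Hypothetical stand-in for an HTTP 401/403 from the search backend."""


async def search_with_retries(question, do_search, max_retries=3):
    """Retry with exponential backoff; fail fast on authorization errors."""
    for attempt in range(max_retries):
        try:
            return await do_search(question)
        except AuthError:
            # Auth failures will not succeed on retry, so surface them at once.
            logger.exception("Authorization failure on attempt %d", attempt + 1)
            return f"Web search failed (authorization error) for: {question}"
        except Exception:
            # logger.exception() captures the full stack trace without re-raising.
            logger.exception("Search attempt %d failed", attempt + 1)
            if attempt == max_retries - 1:
                return f"Web search failed after {max_retries} attempts for: {question}"
            await asyncio.sleep(2 ** attempt)
    return f"Web search failed after {max_retries} attempts for: {question}"


async def _always_fails(_question):
    raise RuntimeError("simulated network error")


result = asyncio.run(search_with_retries("nvidia", _always_fails, max_retries=2))
print(result)
```

Swapping the blanket `except Exception` for a dedicated auth branch keeps transient network errors retryable while surfacing misconfigured keys immediately.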
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 79ae3df7-b586-4594-bb27-53cf77807065
📒 Files selected for processing (2)
- docs/source/get-started/tutorials/add-tools-to-a-workflow.md
- packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py
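The well-formed-tag fix flagged in the duplicate comment above can be sketched in isolation. `FakeResult` and `format_results` are illustrative stand-ins, not the exa_py result type or the tool's actual formatting code.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Illustrative stand-in for a search result (assumed .url and .text)."""
    url: str
    text: str


def format_results(results):
    # A proper start tag paired with a closing tag; a self-closing
    # <Document .../> followed by a stray </Document> would be malformed XML.
    return "\n\n".join(
        f'<Document href="{doc.url}">\n{doc.text}\n</Document>' for doc in results
    )


formatted = format_results([FakeResult("https://example.com", "Example content")])
print(formatted)
```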
|
Hi @maxwbuckley, appreciate your interest in contributing to NAT! Can you raise an issue describing the use case and the argument for including Exa integration with NeMo Agent Toolkit? Why does this need to be a built-in tool directly in nat-langchain? When you raise the issue we can get product involved to consider it as an RFR. Adding DO NOT MERGE to this PR until the issue is raised and approved |
|
Thanks @bbednarski9! Filed #1848 with the use case and rationale. Happy to iterate on it. |
packages/nvidia_nat_langchain/src/nat/plugins/langchain/tools/exa_internet_search.py (outdated review threads, resolved)
|
@maxwbuckley thanks for filing the issue. Did a first pass of the code. Would you mind addressing the comments above and I'll take another look? -Bryan |
|
@bbednarski9 thanks for the detailed review! All six comments have been addressed across the follow-up commits (a69e612, 2e43508, 27786c9). Summary of changes:
Ready for another pass whenever you have a moment. Thanks! |
|
Ran through the CodeRabbit comments — summary of resolution: Already addressed in earlier commits:
Addressed in 78dbf08:
Intentionally not changed:
All 12 unit tests still pass and |
|
Hey @maxwbuckley, our repository requires DCO signoffs for each commit (Signed-off-by:). Can you amend your previous commits to meet these criteria so we can move forward with a merge? use: If you haven't done this before, Claude is pretty good at it. In the meantime I'll review your last commits and close out old comments |
Force-pushed 78dbf08 to e2c3e8d
|
@bbednarski9 good catch — only the most recent commit ( |
|
/ok to test 2f57377 |
|
@maxwbuckley almost there. Can you do the following:
Looks good otherwise |
Add `exa_internet_search` tool using the langchain-exa integration, mirroring the existing tavily_internet_search tool. Includes config class, tool registration, unit tests, dependency, and documentation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Max Buckley <maxwbuckley@gmail.com>
The langchain_exa ExaSearchResults wrapper doesn't pass num_results and other params through its .run() method. Use the exa_py.Exa client directly for correct behavior (max_results, search_type, livecrawl). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Max Buckley <maxwbuckley@gmail.com>
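As a rough illustration of what this commit describes — passing `max_results`, `search_type`, and `livecrawl` straight through to the client — a config-to-kwargs mapping might look like the sketch below. The parameter names follow exa_py's `search_and_contents` as an assumption, not a verified signature, and `build_search_kwargs` is a hypothetical helper.

```python
def build_search_kwargs(max_results=5, search_type="auto", livecrawl="fallback"):
    """Map tool-config fields onto the keyword arguments a direct Exa client
    call would receive (assumed names; not the tool's actual code)."""
    return {
        "num_results": max_results,  # exa_py calls this num_results, not max_results
        "type": search_type,
        "livecrawl": livecrawl,
        "text": True,  # request page text alongside each result
    }


kwargs = build_search_kwargs(max_results=3, search_type="neural")
print(kwargs)
```

Centralizing the mapping makes the "wrapper drops my parameters" bug easy to test for, since the kwargs can be asserted on without a live API call.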
- Use AsyncExa instead of sync Exa to avoid blocking the event loop - Remove os.environ mutation; resolve API key locally - Use Literal types for search_type and livecrawl config validation - Add ge=1 constraint on max_retries to prevent implicit None returns - Add explicit return after retry loop as safety fallback - Fix dependency sort order: langchain-core before langchain-exa Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Max Buckley <maxwbuckley@gmail.com>
- Add ge=1 constraint on max_results field - Fail fast when no EXA_API_KEY is configured - Add workflow.tool_names example to Exa docs section Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Max Buckley <maxwbuckley@gmail.com>
- Fix copyright year to 2026 - Use langchain_exa.ExaSearchResults instead of exa_py directly to match the declared dependency - Lazily instantiate client inside invocation path, only if key exists - Add configurable max_query_length field (default 2000) with truncation warning log - Expand test coverage: retries, truncation, empty results, empty key, config validation for invalid search_type/livecrawl/max_retries/max_results Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Max Buckley <maxwbuckley@gmail.com>
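A minimal sketch of the truncation-with-warning behavior this commit describes, assuming the stated default of 2000 characters; `truncate_query` is an illustrative helper, not the tool's actual code.

```python
import logging

logger = logging.getLogger(__name__)

MAX_QUERY_LENGTH = 2000  # default from the commit message


def truncate_query(query, max_query_length=MAX_QUERY_LENGTH):
    """Clamp over-long queries, logging a warning when truncation occurs."""
    if len(query) > max_query_length:
        logger.warning(
            "Query truncated from %d to %d characters", len(query), max_query_length
        )
        return query[:max_query_length]
    return query


clamped = truncate_query("x" * 3000)
print(len(clamped))
```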
Address CodeRabbit feedback to surface failures instead of swallowing them silently in the retry loop. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Max Buckley <maxwbuckley@gmail.com>
Address reviewer feedback to surface all configurable fields in the tutorial's Exa configuration example. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Max Buckley <maxwbuckley@gmail.com>
Force-pushed 2f57377 to 9ba9b3f
|
@bbednarski9 both done:
Both changes are in a single signed commit |
|
/ok to test 9ba9b3f |
|
@maxwbuckley I think there was an issue with the uv lock generation. Perhaps it didn't go through cleanly? Can you try again, or add me as a contributor on your fork and I'll check it? Expected: Got: |
Regenerate lockfile after rebase onto develop and add 'Exa' to the accepted Vale vocabulary so the docs linter accepts the product name. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Max Buckley <maxwbuckley@gmail.com>
Force-pushed 9ba9b3f to 209182f
|
@bbednarski9 you were right — the previous regen started from our feature branch's stale lockfile, which caused the mismatch. Fixed and force-pushed. |
exa-py defines SearchType = Literal["auto", "fast", "deep", "neural", "instant"]. The previous config exposed "keyword" which the Exa API rejects, making any workflow that set search_type=keyword fail at runtime. Drop "keyword", add the valid "fast", "deep", and "instant" options, and update the docs example comment accordingly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Max Buckley <maxwbuckley@gmail.com>
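A minimal sketch of the resulting validation, using the literal values named in this commit. `validate_search_type` is an illustrative helper rather than the Pydantic `Literal` field the config actually uses.

```python
# Valid values per exa-py's SearchType literal, as quoted in the commit message.
VALID_SEARCH_TYPES = {"auto", "fast", "deep", "neural", "instant"}


def validate_search_type(value):
    """Reject values outside exa-py's SearchType literal, including the
    previously exposed 'keyword' that the Exa API refuses."""
    if value not in VALID_SEARCH_TYPES:
        raise ValueError(f"invalid search_type: {value!r}")
    return value


print(validate_search_type("fast"))
try:
    validate_search_type("keyword")
except ValueError as err:
    print(err)
```

Rejecting the invalid value at config-validation time turns a confusing runtime API error into an immediate, actionable message.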
bbednarski9
left a comment
Looks good, just need a license review of the diffs below before merge:
Added packages:
- exa-py 1.16.1 MIT
- langchain-exa 1.1.0 MIT
Changed packages:
- langchain-core 1.2.16 -> 1.2.28
Salonijain27
left a comment
Approved from a dependency point of view.
|
/ok to test 5f98c2c |
|
Hey @maxwbuckley, it's approved on our end; the final step is just passing CI. You can run the following to make sure it will pass the pre-commit hooks first. Also, let's merge/rebase
After it passes CI ill merge. |
Summary
- Add `exa_internet_search` tool using `langchain_exa.ExaSearchResults`, mirroring the existing `tavily_internet_search` tool
- Add `ExaInternetSearchToolConfig` with configurable `max_results`, `search_type` (`Literal["auto", "fast", "deep", "neural", "instant"]`), `livecrawl` (`Literal["always", "fallback", "never"]`), `max_query_length`, and `api_key` (via config or the `EXA_API_KEY` env var)
- Add the `langchain-exa>=1.1.0,<2.0.0` dependency to `nvidia-nat-langchain`

Closes #1848
Test plan
- Unit tests (`test_exa_internet_search.py` — config validation, retries, truncation, empty results, empty key)
- Tool registers with the `GlobalTypeRegistry` and appears in `nat info components -t function`
- `ruff check` passes on all new/modified files
- Manually tested with `EXA_API_KEY` against the live Exa API

🤖 Generated with Claude Code