DESCRIPTION:
The current implementation of PangaeaDataset.search_studies() relies on PanQuery for retrieving datasets from PANGAEA. However, unlike NOAA, PANGAEA's search API does not support filtering datasets by temporal coverage (Age).
According to the PANGAEA search documentation:
- The available "Year" field refers to publication year, not the temporal extent of the dataset.
- There is no direct support for filtering datasets by Age (e.g., BP, CE) at the query level.
PROBLEM:
This creates a mismatch with the unified PyleoTUPS interface, where users expect:
ds.search_studies(earliest_year=..., latest_year=...)
to behave consistently across datasets (NOAA + PANGAEA).
Currently:
- Temporal filters cannot be applied
- Even if some looped search is run, results may include datasets outside the requested temporal range
- This breaks consistency and user expectations
Potential Approach (Not Implemented)
One possible direction to approximate temporal filtering:
- Perform initial search using available filters
- Extract temporal coverage from dataset metadata/tables
- Apply post-search filtering based on: earliest_year, latest_year, time_format (CE/BP)
- Iteratively fetch more results (via offset) until desired limit is reached
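The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the PyleoTUPS or pangaeapy API: `fetch_page` is a hypothetical callable standing in for a paged PanQuery search, the `earliest_ce` / `latest_ce` keys are hypothetical fields that would have to be extracted from dataset metadata beforehand, and BP ages are assumed to be relative to 1950 CE.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[float]]


def to_ce(year: float, time_format: str) -> float:
    """Convert a year to CE. Assumes BP means 'years before 1950 CE'."""
    if time_format == "CE":
        return year
    if time_format == "BP":
        return 1950 - year
    raise ValueError(f"Unsupported time_format: {time_format!r}")


def overlaps(record: Record, start_ce: float, end_ce: float) -> bool:
    """True if the record's coverage intersects [start_ce, end_ce]."""
    lo, hi = record.get("earliest_ce"), record.get("latest_ce")
    if lo is None or hi is None:
        # Coverage could not be extracted: drop the record rather than guess.
        return False
    return lo <= end_ce and hi >= start_ce


def filtered_search(
    fetch_page: Callable[[int, int], List[Record]],  # hypothetical: (offset, limit) -> records
    earliest_year: float,
    latest_year: float,
    time_format: str = "CE",
    limit: int = 10,
    page_size: int = 50,
    max_pages: int = 5,
) -> List[Record]:
    """Post-filter paged search results by temporal coverage."""
    # Sort the converted bounds so BP input (older = larger number) also works.
    start_ce, end_ce = sorted(
        (to_ce(earliest_year, time_format), to_ce(latest_year, time_format))
    )
    kept: List[Record] = []
    for page in range(max_pages):  # cap the number of extra API calls
        batch = fetch_page(page * page_size, page_size)
        if not batch:
            break  # no more results available
        kept.extend(r for r in batch if overlaps(r, start_ce, end_ce))
        if len(kept) >= limit:
            break
    return kept[:limit]
```

The `max_pages` cap is one way to bound the extra API calls; records without extractable coverage are dropped here, although returning them with a flag would be an equally reasonable choice.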
Challenges / Considerations
- Temporal metadata is often incomplete or inconsistent in PANGAEA datasets
- Extraction relies on parsing dataset tables, which is not always reliable
- Additional API calls may impact performance
- Results would be approximate, not guaranteed accurate
Additional Notes
- This is a known limitation of PANGAEA search, not a bug in pyleotups
- The goal is to implement NOAA-like behavior while maintaining transparency about limitations