44 changes: 41 additions & 3 deletions CHANGELOG.md
@@ -1,7 +1,45 @@
# UNRELEASED
# Changelog

# 0.1.2 (January 6th, 2022)
All notable changes to this workspace will be documented in this file.

FEATURES
## Unreleased

## 0.2.0 - 2026-04-08

### `soundevents`

- Added `predict_raw_scores_batch_flat` and `predict_raw_scores_batch_into` for lower-allocation batched raw-score access.
- Expanded batched inference coverage with regression tests that verify flat and buffer-reuse paths against sequential inference.
- Removed redundant input validation in `classify_batch` while preserving the existing error behavior for invalid batches.
- Tightened crate metadata and docs.rs configuration so feature-gated APIs, including `Classifier::tiny`, render correctly on published docs.
- Added packaged third-party notices for bundled CED model artifacts.

### `soundevents-dataset`

- Packaged the dual-license texts with the published crate and aligned crate metadata for docs.rs and crates.io discovery.
- Kept the crate on its Rust 1.59 / edition 2021 compatibility track while removing the in-source `deny(warnings)` footgun.
- Added packaged third-party notices for bundled AudioSet ontology and label metadata.

### Workspace

- Included license files in published package contents for both crates.
- Upgraded README examples from ignored snippets to compile-checked doctests across the workspace.

## 0.1.0 - 2026-04-08

### `soundevents`

- Initial public release of the ONNX Runtime wrapper for CED AudioSet classifiers.
- Added file, memory, and bundled-model loading paths plus configurable graph optimization.
- Added ranked top-k helpers, raw-score accessors, and chunked inference with mean/max aggregation.
- Added equal-length batch APIs for clip inference and chunked window batching for higher-throughput services.

Comment on lines +32 to +36

Copilot AI Apr 8, 2026


The 0.1.0 changelog entry claims “Added equal-length batch APIs…”, but those APIs are introduced in this PR alongside the 0.2.0 bump. This makes the release notes misleading; consider moving that bullet into 0.2.0 (or removing it from 0.1.0) so each version’s section reflects what shipped in that release.

### `soundevents-dataset`

- Initial public release of the typed AudioSet dataset companion crate.
- Included both the 527-class rated label set and the full 632-entry ontology as `&'static` generated data.
- Kept the crate `no_std`-friendly, allocation-free at runtime, and compatible with Rust 1.59.

### `xtask`

- Added code generation for the rated label set and ontology modules from upstream AudioSet source data.
7 changes: 6 additions & 1 deletion Cargo.toml
@@ -14,7 +14,12 @@ resolver = "3"
thiserror = { version = "2", default-features = false }
serde = "1"

soundevents-dataset = { version = "0.1", path = "soundevents-dataset", default-features = false }
soundevents-dataset = { version = "0.2", path = "soundevents-dataset", default-features = false }

[workspace.package]
license = "MIT OR Apache-2.0"
repository = "https://github.com/findit-ai/soundevents"
homepage = "https://github.com/findit-ai/soundevents"

[profile.bench]
opt-level = 3
2 changes: 1 addition & 1 deletion LICENSE-APACHE
@@ -186,7 +186,7 @@ APPENDIX: How to apply the Apache License to your work.
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright [yyyy] [name of copyright owner]
Copyright (c) 2026 The FinDIT studio developers

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
2 changes: 1 addition & 1 deletion LICENSE-MIT
@@ -1,4 +1,4 @@
Copyright (c) 2015 The Rust Project Developers
Copyright (c) 2026 The FinDIT studio developers

Permission is hereby granted, free of charge, to any
person obtaining a copy of this software and associated
38 changes: 32 additions & 6 deletions README.md
@@ -22,20 +22,25 @@ Production-oriented Rust inference for [CED](https://arxiv.org/abs/2308.11957) A
- **Drop-in CED inference** — load any [CED](https://arxiv.org/abs/2308.11957) AudioSet ONNX model (or use the bundled `tiny` variant) and run it directly on `&[f32]` PCM samples. No Python, no preprocessing pipeline.
- **Typed labels, not bare integers** — every prediction comes back as an [`EventPrediction`] carrying a `&'static RatedSoundEvent` from [`soundevents-dataset`](./soundevents-dataset), so you get the canonical AudioSet name, the `/m/...` id, the model class index, and the confidence in one struct.
- **Compile-time class-count guarantee** — the `NUM_CLASSES = 527` constant comes from the rated dataset at codegen time. If a model returns the wrong number of classes you get a typed [`ClassifierError::UnexpectedClassCount`] instead of a silent mismatch.
- **Long-clip chunking built in** — `classify_chunked` / `classify_all_chunked` window the input at a configurable hop, run inference on each chunk, and aggregate the per-chunk confidences with either `Mean` or `Max`. Defaults match CED's 10 s training window (160 000 samples at 16 kHz).
- **Long-clip chunking built in** — `classify_chunked` / `classify_all_chunked` window the input at a configurable hop, run inference on each chunk, and aggregate the per-chunk confidences with either `Mean` or `Max`. Defaults match CED's 10 s training window (160 000 samples at 16 kHz), and fixed-size chunk batches can now be packed into one model call.
- **Top-k via a tiny min-heap** — `classify(samples, k)` does not allocate a full 527-element scores vector to find the top results.
- **Batch-ready low-level API** — `predict_raw_scores_batch`, `predict_raw_scores_batch_flat`, `predict_raw_scores_batch_into`, `classify_all_batch`, and `classify_batch` accept equal-length clip batches for service-layer batching.
- **Bring-your-own model or bundle one** — load from a path, from in-memory bytes, or enable the `bundled-tiny` feature to embed `models/tiny.onnx` directly into your binary.
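The min-heap top-k idea behind `classify(samples, k)` can be sketched in plain std Rust. This is an illustrative, self-contained version, not the crate's actual internals; the `Score` wrapper and `top_k` helper are hypothetical names:

```rust
use std::cmp::{Ordering, Reverse};
use std::collections::BinaryHeap;

// f32 is not `Ord`, so wrap it with a total order for heap use.
#[derive(PartialEq)]
struct Score(f32);
impl Eq for Score {}
impl PartialOrd for Score {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for Score {
    fn cmp(&self, other: &Self) -> Ordering {
        self.0.total_cmp(&other.0)
    }
}

/// Return `(class_index, score)` pairs for the `k` highest scores,
/// keeping at most `k + 1` heap entries instead of materializing and
/// sorting a full 527-element vector.
fn top_k(scores: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut heap: BinaryHeap<Reverse<(Score, usize)>> = BinaryHeap::with_capacity(k + 1);
    for (i, &s) in scores.iter().enumerate() {
        heap.push(Reverse((Score(s), i)));
        if heap.len() > k {
            // The root of the Reverse-heap is the smallest candidate;
            // evict it so only the current top-k survive.
            heap.pop();
        }
    }
    let mut out: Vec<(usize, f32)> = heap
        .into_iter()
        .map(|Reverse((Score(s), i))| (i, s))
        .collect();
    out.sort_by(|a, b| b.1.total_cmp(&a.1)); // highest confidence first
    out
}
```

The O(n log k) scan plus a final sort of k items is why small-k queries stay cheap even over hundreds of classes.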

## Quick start

```toml
[dependencies]
soundevents = "0.1"
soundevents = "0.2"
```

```rust,ignore
```rust,no_run
use soundevents::{Classifier, Options};

fn load_mono_16k_audio(_: &str) -> Result<Vec<f32>, Box<dyn std::error::Error>> {
Ok(vec![0.0; 16_000])
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut classifier = Classifier::from_file("soundevents/models/tiny.onnx")?;

Copilot AI Apr 8, 2026


The README examples use a repo-relative path ("soundevents/models/tiny.onnx") when calling Classifier::from_file. For crates.io users this path typically won’t exist at runtime, so the snippet is likely to fail when copied. Consider switching the example to a placeholder like "path/to/model.onnx" (and/or pointing to Classifier::tiny behind bundled-tiny) so the quick start is correct outside of a git checkout.

Suggested change
let mut classifier = Classifier::from_file("soundevents/models/tiny.onnx")?;
let mut classifier = Classifier::from_file("path/to/model.onnx")?;


@@ -61,16 +66,22 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {

`Classifier::classify_chunked` slides a window over the input and aggregates each chunk's per-class confidences. The defaults (10 s window, 10 s hop, mean aggregation) match CED's training setup; tune them for overlap or peak-pooling.

```rust,ignore
```rust,no_run
use soundevents::{ChunkAggregation, ChunkingOptions, Classifier};

fn load_long_clip() -> Result<Vec<f32>, Box<dyn std::error::Error>> {
Ok(vec![0.0; 320_000])
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut classifier = Classifier::from_file("soundevents/models/tiny.onnx")?;
let samples: Vec<f32> = load_long_clip()?;

let opts = ChunkingOptions::default()
// 5 s overlap (50%) between adjacent windows
.with_hop_samples(80_000)
// Batch up to 4 equal-length windows per session.run()
.with_batch_size(4)
// Keep the loudest detection in any window instead of averaging
.with_aggregation(ChunkAggregation::Max);

@@ -111,18 +122,29 @@ If upstream releases new weights, or you cloned without the model files, refetch

The script downloads the `*.onnx` artifact from each `mispeech/ced-*` Hugging Face repo and writes it as `soundevents/models/<variant>.onnx`.

See [THIRD_PARTY_NOTICES.md](THIRD_PARTY_NOTICES.md) for upstream model
sources and attribution details.

### Bundled tiny model

Enable the `bundled-tiny` feature to embed `models/tiny.onnx` into your binary — useful for CLI tools and self-contained services where you don't want to ship a separate model file.

```toml
soundevents = { version = "0.1", features = ["bundled-tiny"] }
soundevents = { version = "0.2", features = ["bundled-tiny"] }
```

```rust,ignore
```rust
# #[cfg(feature = "bundled-tiny")]
use soundevents::{Classifier, Options};

# fn main() -> Result<(), Box<dyn std::error::Error>> {
# #[cfg(feature = "bundled-tiny")]
# {
let mut classifier = Classifier::tiny(Options::default())?;
# let _ = &mut classifier;
# }
# Ok(())
# }
```

## Features
@@ -139,6 +161,8 @@ The full input/output contract:
| `DEFAULT_CHUNK_SAMPLES` | `160_000` | Default 10 s window/hop for chunked inference. |
| `NUM_CLASSES` | `527` | Number of CED output classes — derived at compile time from `RatedSoundEvent::events().len()`. |
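The compile-time derivation of the class count can be sketched in plain Rust; the names below (`EVENTS`, `check_output_len`) are illustrative stand-ins, not the crate's generated items:

```rust
// Stand-in for the generated rated label set (the real table has 527 entries).
const EVENTS: [&str; 3] = ["Speech", "Music", "Dog"];

// Array `len()` is a const fn, so the class count is fixed at compile
// time rather than checked at startup.
const NUM_CLASSES: usize = EVENTS.len();

/// Reject model outputs whose length disagrees with the dataset,
/// returning the unexpected length instead of silently mis-indexing.
fn check_output_len(scores: &[f32]) -> Result<(), usize> {
    if scores.len() == NUM_CLASSES {
        Ok(())
    } else {
        Err(scores.len())
    }
}
```

Deriving the constant from the same generated data that names the classes is what makes an `UnexpectedClassCount`-style error possible instead of an out-of-bounds index at lookup time.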

For low-level batching, every clip passed to `predict_raw_scores_batch*` / `classify_*_batch` must be non-empty and share the same sample count. `predict_raw_scores_batch_flat` returns one row-major `Vec<f32>`, and `predict_raw_scores_batch_into` lets callers reuse their own output buffer to avoid per-call result allocations. `classify_chunked` applies the same equal-length restriction internally when `ChunkingOptions::batch_size() > 1`; fixed-size windows satisfy it naturally, and the final short tail chunk automatically falls through to a smaller batch.
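The row-major layout and buffer-reuse conventions can be sketched independently of the crate. Both helpers below are hypothetical illustrations of the contract, not the crate's API, and the "inference" is a trivial stand-in:

```rust
/// Index a flat row-major score buffer of the kind a
/// `predict_raw_scores_batch_flat`-style API returns:
/// scores for clip `clip`, class `class` live at `clip * num_classes + class`.
fn batch_score(flat: &[f32], num_classes: usize, clip: usize, class: usize) -> f32 {
    debug_assert_eq!(flat.len() % num_classes, 0);
    flat[clip * num_classes + class]
}

/// Mirror the `*_into` pattern: clear and refill a caller-owned buffer
/// so repeated calls reuse one allocation. The per-clip "scores" here
/// are just the clip mean repeated, standing in for real inference.
fn fill_scores_into(out: &mut Vec<f32>, batch: &[&[f32]], num_classes: usize) {
    out.clear();
    out.reserve(batch.len() * num_classes);
    for clip in batch {
        let mean = clip.iter().sum::<f32>() / clip.len() as f32;
        out.extend(std::iter::repeat(mean).take(num_classes));
    }
}
```

A service loop that calls `fill_scores_into` with the same `Vec` each tick pays for the output allocation once, which is the point of the `_into` variant.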

## Development

Regenerate the dataset from upstream sources:
@@ -162,6 +186,8 @@ cargo test
Apache License (Version 2.0).

See [LICENSE-APACHE](LICENSE-APACHE), [LICENSE-MIT](LICENSE-MIT) for details.
Bundled third-party model attributions and source licenses are documented in
[THIRD_PARTY_NOTICES.md](THIRD_PARTY_NOTICES.md).

Copyright (c) 2026 FinDIT studio authors.

15 changes: 10 additions & 5 deletions soundevents-dataset/Cargo.toml
@@ -1,12 +1,17 @@
[package]
name = "soundevents-dataset"
version = "0.1.0"
version = "0.2.0"
# Intentionally kept on edition 2021 / MSRV 1.59 so this no_std static-data
# crate remains usable from older toolchains.
edition = "2021"
repository = "https://github.com/findit-ai/soundevents"
homepage = "https://github.com/findit-ai/soundevents"
documentation = "https://docs.rs/soundevents"
documentation = "https://docs.rs/soundevents-dataset"
description = "Audio Set Ontology aims to provide a comprehensive set of categories to describe sound events."
license = "MIT OR Apache-2.0"
license.workspace = true
repository.workspace = true
homepage.workspace = true
readme = "README.md"
keywords = ["audioset", "sound-events", "ontology", "dataset", "no-std"]
categories = ["data-structures", "multimedia::audio", "no-std", "no-std::no-alloc"]
rust-version = "1.59.0"

[features]