View filtered by Vector content #319

@h4-h

Hi! BonsaiDB seems like a really cool project. However, I'm having trouble finding examples of how to filter a View by the contents of a Vec<T> field. The book has an example of filtering by a single string, but that doesn't fit my use case. I want to query my collection for posts that include a specific tag (or any of several tags) and get the matching posts themselves back, not a reduced value such as a count of posts.

SQL example:

SELECT DISTINCT p.post_id, p.text_content, GROUP_CONCAT(t.tag_name) AS tags
FROM posts p
JOIN post_tags pt ON p.post_id = pt.post_id
JOIN tags t ON pt.tag_id = t.tag_id
WHERE t.tag_name IN ('Rust')
GROUP BY p.post_id, p.text_content;

Rust filtering example:

// <-- ... code ... -->

struct Post {
  // <-- ... code ... -->
  tags: Vec<String>,
  // <-- ... code ... -->
}

// <-- ... code ... -->

let rust_posts = Post::all(&db)
            .query()?
            .into_iter()
            .filter(|post| post.contents.tags.contains(&"Rust".to_string()))
            .collect::<Vec<_>>();

// <-- ... code ... -->
Working code example without view
use bonsaidb::{core::schema::{Collection, Schema, SerializedCollection}, local::{config::{Builder, StorageConfiguration}, Database}};
use serde::{Deserialize, Serialize};

const DB_PATH: &str = "./data.bdb";

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let configuration = StorageConfiguration::new(DB_PATH);
    let database = Database::open::<DataStore>(configuration)?;

    let _first = Post::new("Async in Rust", vec!["Rust", "Learning"]).push_into(&database)?;
    let _second = Post::new("What does zero cost mean in Rust", vec!["Rust", "Learning"]).push_into(&database)?;
    let _third = Post::new("Go vs Rust", vec!["Go", "Rust"]).push_into(&database)?;
    let _fourth = Post::new("Goroutines in a nutshell", vec!["Go"]).push_into(&database)?;

    { // Filter all by tags.
        let all_rust_posts = Post::all(&database)
            .query()?
            .into_iter()
            .filter(|post| post.contents.tags.contains(&"Rust".to_string()))
            .collect::<Vec<_>>();

        println!("rusty posts:");
        all_rust_posts.iter().for_each(|post| println!("  {:?}", post.contents));

        // rusty posts:
        //   Post { content: "Async in Rust", tags: ["Rust", "Learning"] }
        //   Post { content: "What does zero cost mean in Rust", tags: ["Rust", "Learning"] }
        //   Post { content: "Go vs Rust", tags: ["Go", "Rust"] }
    }

    std::fs::remove_dir_all(DB_PATH).map_err(Into::into)
}

#[derive(Debug, Schema)]
#[schema(name = "data", collections = [Post])]
struct DataStore;

#[derive(Debug, Serialize, Deserialize, Collection)]
#[collection(name = "posts")]
struct Post {
    pub content: String,
    pub tags: Vec<String>,
}

impl Post {
    pub fn new(
        content: impl Into<String>,
        tags: Vec<impl Into<String>>,
    ) -> Self {
        Self {
            content: content.into(),
            tags: tags.into_iter().map(Into::into).collect(),
        }
    }
}
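
The filter in the working example reads every post and checks its tags in memory. The lookup being asked for is closer to an index with one entry per (tag, post) pair. Below is a minimal sketch of that idea using only the standard library. It is not BonsaiDB's view API and makes no claims about it; `TagIndex`, `insert`, `with_any_tag`, and the `u64` ids are hypothetical names chosen for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone, PartialEq)]
struct Post {
    content: String,
    tags: Vec<String>,
}

/// Hypothetical in-memory stand-in for a "posts by tag" index:
/// each tag maps to the ids of the posts that carry it.
#[derive(Default)]
struct TagIndex {
    posts: BTreeMap<u64, Post>,
    by_tag: BTreeMap<String, BTreeSet<u64>>,
}

impl TagIndex {
    /// Stores the post and records one index entry per tag.
    fn insert(&mut self, id: u64, post: Post) {
        for tag in &post.tags {
            self.by_tag.entry(tag.clone()).or_default().insert(id);
        }
        self.posts.insert(id, post);
    }

    /// Returns the posts that carry at least one of `tags`, in id order.
    /// Collecting ids into a set plays the role of SQL's DISTINCT: a post
    /// matching several tags is returned once.
    fn with_any_tag(&self, tags: &[&str]) -> Vec<&Post> {
        let ids: BTreeSet<u64> = tags
            .iter()
            .filter_map(|tag| self.by_tag.get(*tag))
            .flatten()
            .copied()
            .collect();
        ids.iter().filter_map(|id| self.posts.get(id)).collect()
    }
}
```

With the four posts from the example inserted under ids 1 to 4, `with_any_tag(&["Rust"])` would return the first three posts without inspecting the Go-only one, which is the behaviour the SQL query above describes.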

P.S.

I'm trying to figure out how to go through the items in a Collection one at a time without loading everything at once. Is there a way to get an Iter directly, instead of using all() to load everything and then converting it to Iter?
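
To make the P.S. concrete: what I have in mind is cursor-style iteration, where only a small batch of documents is held in memory at a time. The sketch below models that with a `BTreeMap` standing in for a collection ordered by id. It is an assumption-laden illustration, not BonsaiDB code; `next_batch` and `for_each_batched` are hypothetical helpers.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical helper: returns at most `limit` entries whose id is
/// strictly greater than `after` (or from the start when `after` is None).
fn next_batch<'a, V>(
    store: &'a BTreeMap<u64, V>,
    after: Option<u64>,
    limit: usize,
) -> Vec<(u64, &'a V)> {
    let lower: Bound<u64> = match after {
        Some(id) => Bound::Excluded(id),
        None => Bound::Unbounded,
    };
    let upper: Bound<u64> = Bound::Unbounded;
    store
        .range::<u64, _>((lower, upper))
        .take(limit)
        .map(|(id, value)| (*id, value))
        .collect()
}

/// Visits every entry in id order while materializing at most
/// `batch_size` entries at any one time.
fn for_each_batched<V>(
    store: &BTreeMap<u64, V>,
    batch_size: usize,
    mut visit: impl FnMut(u64, &V),
) {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut cursor = None;
    loop {
        let batch = next_batch(store, cursor, batch_size);
        // An empty batch means the cursor has passed the last entry.
        let last_id = match batch.last() {
            Some((id, _)) => *id,
            None => break,
        };
        for (id, value) in batch {
            visit(id, value);
        }
        cursor = Some(last_id);
    }
}
```

The design choice here is to resume from the last seen id rather than from an offset, so each batch is a bounded range read and entries are never skipped or repeated when the batch size changes.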
