Is this a new bug?
Current Behavior
When I encode the same piece of text with encode_documents and encode_queries to get sparse vectors, the two methods return different values.
Piece of text: "the lazy dog"
encode_documents values: 0.58, 0.58
encode_queries values: 0.5, 0.5
Expected Behavior
encode_documents and encode_queries return different values for the same piece of text. I expected both to return 0.5 for each term, but the values differ by ~0.09. Am I missing something?
Steps To Reproduce
from pinecone_text.sparse import BM25Encoder
corpus = ["The quick brown fox jumps over the lazy dog", "The lazy dog is brown"]
bm25 = BM25Encoder()
bm25.fit(corpus)
print(bm25.encode_documents("the lazy dog"))
### Output: {'indices': [226376294, 2982218203], 'values': [0.5882352941176472, 0.5882352941176472]}
print(bm25.encode_queries("the lazy dog"))
### Output: {'indices': [226376294, 2982218203], 'values': [0.5, 0.5]}
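For what it's worth, the two outputs can be reproduced by hand if the document side applies BM25 term-frequency saturation and the query side applies normalized IDF. The sketch below is only a guess at what might be happening, not the library's confirmed implementation. It assumes k1 = 1.2 and b = 0.75, assumes the corpus documents shrink to 6 and 3 tokens after stop-word removal, and assumes "the lazy dog" shrinks to 2 tokens. The helper names `doc_weight` and `query_weights` are hypothetical and do not come from pinecone-text.

```python
import math

# Assumed BM25 parameters; the report does not state them.
K1 = 1.2
B = 0.75


def doc_weight(tf, doc_len, avg_doc_len, k1=K1, b=B):
    """BM25 term-frequency saturation for one term in one document."""
    return tf / (k1 * (1.0 - b + b * doc_len / avg_doc_len) + tf)


def query_weights(doc_freqs, n_docs):
    """IDF for each query term, normalized so the weights sum to 1."""
    idf = [math.log((n_docs + 1) / (df + 0.5)) for df in doc_freqs]
    total = sum(idf)
    return [value / total for value in idf]


# Assumed token counts after stop-word removal: 6 and 3 tokens.
avg_len = (6 + 3) / 2

# "the lazy dog" is assumed to reduce to 2 tokens, each occurring once.
doc_vals = [doc_weight(tf=1, doc_len=2, avg_doc_len=avg_len) for _ in range(2)]

# Both remaining terms are assumed to appear in both corpus documents.
query_vals = query_weights(doc_freqs=[2, 2], n_docs=2)

print(doc_vals)
print(query_vals)
```

Under these assumptions the document-side values come out near 0.588 and the query-side values come out at 0.5, which matches the output above. If that is the cause, the two methods would be returning different quantities by design (a length-normalized term frequency versus a normalized IDF), so equal values would not be expected.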
Relevant log output
No response
Environment
OS: Ubuntu 20.04
Python 3.9.12
pinecone-text==0.9.0
Additional Context
No response