Embeddings npu3, embeddings_calculator_ov#3933
Merged
dkalinowski
reviewed
Feb 2, 2026
outputTensorName = inferRequest.get_compiled_model().outputs().begin()->get_any_name();
SPDLOG_LOGGER_DEBUG(embeddings_calculator_logger, "Single embedding model output found with name {}", outputTensorName);
}
embeddingsTensors.push_back(inferRequest.get_tensor(outputTensorName.c_str()));
Collaborator
The tensor you push to embeddingsTensors is still tied to inferRequest, and the data underneath will be overwritten if you perform another inference on it. You do exactly that in the next for-loop iteration. Please correct me if I'm wrong here.
Did you run any stress tests, high-load tests, and accuracy tests with different concurrency/queue sizes? These would showcase the issue I am pointing out.
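The aliasing concern above can be illustrated with a minimal sketch. FakeInferRequest, collectViews, and collectCopies are hypothetical names invented for this illustration; the class only mimics a request that reuses one output buffer and is not the OpenVINO API. Whether the real inferRequest behaves this way is exactly what the comment asks to be verified.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an inference request that reuses one output buffer.
// It is NOT the OpenVINO API; it only mimics the aliasing behaviour in question.
class FakeInferRequest {
public:
    explicit FakeInferRequest(std::size_t outputSize) : output_(outputSize, 0.0f) {}

    // Each call overwrites the same internal buffer with the given value.
    void infer(float value) {
        for (float& x : output_) {
            x = value;
        }
    }

    // Returns a view into the internal buffer, not a copy.
    const std::vector<float>& outputView() const { return output_; }

private:
    std::vector<float> output_;
};

// Unsafe pattern: stores pointers to the shared buffer, so every entry
// ends up observing the result of the last inference.
std::vector<const std::vector<float>*> collectViews(FakeInferRequest& request,
                                                    const std::vector<float>& inputs) {
    std::vector<const std::vector<float>*> results;
    for (float input : inputs) {
        request.infer(input);
        results.push_back(&request.outputView());
    }
    return results;
}

// Safe pattern: copies the output after each inference, so each entry
// owns its own data.
std::vector<std::vector<float>> collectCopies(FakeInferRequest& request,
                                              const std::vector<float>& inputs) {
    std::vector<std::vector<float>> results;
    for (float input : inputs) {
        request.infer(input);
        results.push_back(request.outputView());  // copy-constructs a new vector
    }
    return results;
}
```

If the concern holds for the real code, the analogous fix would be to allocate a separate ov::Tensor with the same element type and shape and copy the output into it (for example with ov::Tensor::copy_to) before the next inference runs.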
dkalinowski
reviewed
Feb 2, 2026
namespace mediapipe {

void printTensor(const ov::Tensor& tensor) {
Collaborator
This should probably be removed?
SPDLOG_LOGGER_DEBUG(embeddings_calculator_logger, "Input size {} exceeds max_context_length {}", input_ids_size, max_context_length);
return absl::InvalidArgumentError("Input length " + std::to_string(input_ids_size) + " longer than allowed " + std::to_string(max_context_length));
size_t inputIdsSize = tokens.input_ids.get_shape()[1];
if (inputIdsSize > maxContextLength) {
Collaborator
Maybe you could create a method for this check and also use it in line 256?
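One possible shape for such a shared check is sketched below. The name checkContextLength and the std::optional return type are assumptions made to keep the example self-contained; in the calculator itself the helper would more likely return absl::Status so that both call sites can propagate absl::InvalidArgumentError directly.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper: returns an error message when the tokenized input is
// longer than the allowed context, or std::nullopt when it fits.
std::optional<std::string> checkContextLength(std::size_t inputIdsSize,
                                              std::size_t maxContextLength) {
    if (inputIdsSize > maxContextLength) {
        return "Input length " + std::to_string(inputIdsSize) +
               " longer than allowed " + std::to_string(maxContextLength);
    }
    return std::nullopt;
}
```

Both places that currently compare the input size against the maximum context length could then call this one function and share the same error message.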
michalkulakowski
approved these changes
Feb 5, 2026
dkalinowski
reviewed
Feb 5, 2026
dkalinowski
reviewed
Feb 6, 2026
src/test/test_utils.cpp
Outdated
if (elementType == ov::element::f32) {
const float* data = static_cast<const float*>(dataPtr);
std::cout << "Tensor data (f32): ";
for (size_t i = 0; i < tensor.get_size(); ++i) {
Collaborator
I don't think it's a good idea to flood the console. Keep the limit at 20 elements, but don't introduce segfaults.
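A bounded loop along these lines would satisfy both constraints: it never prints more than 20 values and never reads past the end of a shorter tensor. The function name formatFirstElements and the raw pointer-plus-size interface are assumptions for this sketch; in the test utility the pointer and size would come from the tensor itself.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: formats at most maxElements values from a buffer.
// The element count is clamped to the buffer size, so short buffers are
// never read out of bounds.
std::string formatFirstElements(const float* data, std::size_t size,
                                std::size_t maxElements = 20) {
    std::ostringstream out;
    const std::size_t count = std::min(size, maxElements);
    for (std::size_t i = 0; i < count; ++i) {
        if (i > 0) {
            out << " ";
        }
        out << data[i];
    }
    // Mark truncated output so the reader knows more data exists.
    if (size > count) {
        out << " ...";
    }
    return out.str();
}
```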
dkalinowski
approved these changes
Feb 6, 2026
dtrawins
approved these changes
Feb 6, 2026
🛠 Summary
JIRA CVS-179110