Distance computation: Fix SSE instructions #672

Open
twuebker wants to merge 1 commit into microsoft:cpp_main from twuebker:fix-distance

@twuebker

  • Does this PR have a descriptive title that could go in our release notes?
  • Does this PR add any new dependencies?
  • Does this PR modify any existing APIs?
    • Is the change to the API backwards compatible?
  • Should this result in any changes to our documentation, either updating existing docs or adding new ones?

Reference Issues/PRs

What does this implement/fix? Briefly explain your changes.

Currently, the SSE code path uses intrinsic names that do not exist, such as _mm128_loadu_ps. This causes DiskANN compilation to fail on machines that support SSE but not AVX. The correct name is _mm_loadu_ps.

This PR replaces the invalid names with the correct intrinsics.
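
For illustration, here is a minimal sketch of how the corrected intrinsics fit together in an SSE dot-product step. The macro name `SSE_DOT_STEP` and the helper `sse_dot` are hypothetical and not taken from DiskANN. Only the four statements in the macro body follow the lines touched by this PR. The sketch assumes an x86 compiler with SSE enabled.

```cpp
#include <cstddef>
#include <xmmintrin.h> // SSE: __m128, _mm_loadu_ps, _mm_mul_ps, _mm_add_ps

// Hypothetical macro name; the body mirrors the corrected statements.
#define SSE_DOT_STEP(addr1, addr2, dest, tmp1, tmp2) \
    tmp1 = _mm_loadu_ps(addr1);                      \
    tmp2 = _mm_loadu_ps(addr2);                      \
    tmp1 = _mm_mul_ps(tmp1, tmp2);                   \
    dest = _mm_add_ps(dest, tmp1);

// Illustrative helper (an assumption, not DiskANN code):
// dot product of two float arrays of length n.
inline float sse_dot(const float *a, const float *b, std::size_t n)
{
    __m128 sum = _mm_setzero_ps();
    __m128 t1, t2;
    std::size_t i = 0;
    // Process four floats per iteration with unaligned loads.
    for (; i + 4 <= n; i += 4)
    {
        SSE_DOT_STEP(a + i, b + i, sum, t1, t2)
    }
    // Horizontal sum of the four accumulator lanes.
    float lanes[4];
    _mm_storeu_ps(lanes, sum);
    float result = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    // Scalar tail for lengths that are not a multiple of four.
    for (; i < n; ++i)
        result += a[i] * b[i];
    return result;
}
```

All intrinsics used here are declared in `<xmmintrin.h>` and need only SSE, so the sketch should build without any AVX flags.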

Contributor

Copilot AI left a comment

Pull request overview

Fixes compilation on SSE-only (non-AVX) builds by replacing invalid/nonexistent SSE intrinsic names in the distance computation kernels, keeping the SSE2 fallback path functional.

Changes:

  • Replace _mm128_loadu_ps/_mm128_mul_ps/_mm128_add_ps with the correct _mm_loadu_ps/_mm_mul_ps/_mm_add_ps in the SSE dot-product macro.
  • Replace _mm128_* intrinsics with correct _mm_* intrinsics in the SSE L2-norm macro.
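
The L2 macro itself is not shown in this conversation, so the following is only a sketch of what an SSE squared-L2 accumulation step could look like with valid `_mm_*` intrinsics. The names `SSE_L2SQR_STEP` and `sse_l2_squared` are hypothetical, and the assumption that the kernel accumulates squared differences of two vectors is mine, not the PR's.

```cpp
#include <cstddef>
#include <xmmintrin.h> // SSE: _mm_loadu_ps, _mm_sub_ps, _mm_mul_ps, _mm_add_ps

// Hypothetical macro: accumulate squared differences of four floats into dest.
#define SSE_L2SQR_STEP(addr1, addr2, dest, tmp1, tmp2) \
    tmp1 = _mm_loadu_ps(addr1);                        \
    tmp2 = _mm_loadu_ps(addr2);                        \
    tmp1 = _mm_sub_ps(tmp1, tmp2);                     \
    tmp1 = _mm_mul_ps(tmp1, tmp1);                     \
    dest = _mm_add_ps(dest, tmp1);

// Illustrative helper (an assumption, not DiskANN code):
// squared Euclidean distance between two float arrays of length n.
inline float sse_l2_squared(const float *a, const float *b, std::size_t n)
{
    __m128 sum = _mm_setzero_ps();
    __m128 t1, t2;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
    {
        SSE_L2SQR_STEP(a + i, b + i, sum, t1, t2)
    }
    // Horizontal sum of the four accumulator lanes.
    float lanes[4];
    _mm_storeu_ps(lanes, sum);
    float result = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    // Scalar tail for the remaining elements.
    for (; i < n; ++i)
    {
        float d = a[i] - b[i];
        result += d * d;
    }
    return result;
}
```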

- tmp2 = _mm128_loadu_ps(addr2); \
- tmp1 = _mm128_mul_ps(tmp1, tmp2); \
- dest = _mm128_add_ps(dest, tmp1);
+ tmp1 = _mm_loadu_ps(addr1); \
@harsha-simhadri (Contributor) commented on Mar 3, 2026

@copilot check that the new instructions do run on SSE and are correct equivalents
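
One way such a check could be approached is sketched below: run a single load, multiply, add step with the `_mm_*` intrinsics and compare every lane against the scalar computation. The function name `sse_step_matches_scalar` and the tolerance are hypothetical choices for illustration, not part of the PR. `_mm_loadu_ps`, `_mm_mul_ps`, `_mm_add_ps` and `_mm_storeu_ps` are SSE intrinsics declared in `<xmmintrin.h>`, so this should compile and run without AVX support.

```cpp
#include <cmath>
#include <xmmintrin.h> // SSE only; no AVX headers or flags required

// Illustrative check (not part of the PR): returns true when one SSE
// multiply-accumulate step agrees with the scalar result in all four lanes.
// x, y and acc must each point to at least four readable floats and may be
// unaligned, because _mm_loadu_ps does not require 16-byte alignment.
inline bool sse_step_matches_scalar(const float *x, const float *y, const float *acc)
{
    __m128 vx = _mm_loadu_ps(x);
    __m128 vy = _mm_loadu_ps(y);
    __m128 va = _mm_loadu_ps(acc);
    __m128 vr = _mm_add_ps(va, _mm_mul_ps(vx, vy));

    float out[4];
    _mm_storeu_ps(out, vr);

    for (int i = 0; i < 4; ++i)
    {
        float expected = acc[i] + x[i] * y[i];
        // Small relative tolerance (an arbitrary choice) in case the compiler
        // fuses or widens the scalar multiply-add.
        float scale = std::fmax(1.0f, std::fabs(expected));
        if (std::fabs(out[i] - expected) > 1e-6f * scale)
            return false;
    }
    return true;
}
```

Calling it with pointers offset by one float into a buffer, for example `sse_step_matches_scalar(buf + 1, buf + 6, buf + 11)` on a 16-element array, also exercises the unaligned-load behavior.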


3 participants