Skip to content

Enable CI lint gh action on ROCm#547

Open
VeeraRajasekhar wants to merge 7 commits intodevfrom
veergopu/ci-rocm-lint-enablement
Open

Enable CI lint gh action on ROCm#547
VeeraRajasekhar wants to merge 7 commits intodevfrom
veergopu/ci-rocm-lint-enablement

Conversation

@VeeraRajasekhar
Copy link
Copy Markdown
Contributor

Description

Please include a brief summary of the changes, relevant motivation and context.

Fixes # (issue)

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

Please list the changes introduced in this PR:

  • Change A
  • Change B

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

transformer_engine/common/amd_detail/hip_float8.h
  -Host constructor: multi-statement if/else now uses braces (readability/braces).

transformer_engine/common/cast/mxfp8/rocm_quantize_mxfp8.cuh
  -Include <cstdint>; typedef for gfx950 vector type uses int16_t instead of
  short (runtime/int).

transformer_engine/common/ck_fused_attn/src/ck_fused_attn_utils.cpp
  -dladdr: avoid ill-formed function-pointer-to-void* cast via a small union
  (readability/casting / portable POSIX).
  -get_ck_log_stream: else branch restructured with nested if so else/brace
  pairing satisfies cpplint (readability/braces).

transformer_engine/common/fused_attn_rocm/fused_attn.cpp
  -check_set_window_size: replace std::make_pair<int64_t,int64_t>(...) with
  std::pair<int64_t,int64_t>(...) (build/explicit_make_pair).
  -Replace alternative tokens `or` with || (readability/alt_tokens).
  -log_fused_attn_config: same for sliding-window condition.

transformer_engine/common/gemm/rocm_gemm.cu
  -ObjCache / NameMapper: mark single-argument constructors explicit
  (runtime/explicit).
  -HIPBLASLT scaling_mode check: split #if/#else branches so each if has its
  own braced body; use static_cast<int> instead of C-style cast
  (readability/braces, readability/casting).
  -Debug logging: (int) casts -> static_cast<int> for hipDataType fields
  (readability/casting).
  -ServiceStreamKey: use std::uint64_t alias instead of unsigned long long
  (runtime/int).

transformer_engine/common/normalization/common.cpp
  -getNormalizationPlan: after optional CUDNN plan, use if (!plan) { ... } for
  TE plans instead of } else #endif if (readability/braces across preprocessor).

transformer_engine/common/normalization/layernorm/ln_api.cpp
  -Forward/backward: default norm_backend to Te; optional CUDNN path only under
  #ifndef __HIP_PLATFORM_AMD__; set is_aligned only when backend is Te, so
  preprocessor does not split if/else from its braces (readability/braces).

transformer_engine/common/normalization/rmsnorm/rmsnorm_api.cpp
  -Same pattern as ln_api for forward (including HIP constexpr
  gamma_in_weight_dtype) and backward cudnn vs Te (readability/braces).

transformer_engine/common/permutation/permutation.cu
  -MoE unpermute kernel: functional-style float(...) casts replaced with
  static_cast<float>(...) (readability/casting).

transformer_engine/common/util/logging.h
  -NVTE_CHECK_HIPBLASLT macro: std::to_string((int)status) ->
  std::to_string(static_cast<int>(status)) (readability/casting).

transformer_engine/pytorch/csrc/extensions/gemm.cpp
  -Comm overlap RS path: HIP p2p vs split_overlap_rs restructured with proper
  #else for non-HIP so } else #endif { does not confuse brace rules
  (readability/braces).
@VeeraRajasekhar VeeraRajasekhar self-assigned this Apr 17, 2026
Copy link
Copy Markdown
Contributor

@Micky774 Micky774 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this cause divergence from upstream's source code in inherited files? Also, why do we need linting as a CI action instead of e.g. a pre-commit?

Comment on lines +79 to +85
// dladdr expects void*; avoid reinterpret_cast<void*>(fn) (not ISO C++).
union {
void (*fn)();
void *addr;
} sym{};
sym.fn = set_aiter_asm_dir;
dladdr(sym.addr, &info);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is unnecessary. Yes, it quiets the warning, but the warning is irrelevant for us granted our support is POSIX focused to begin with. From the dlopen man page:

           /* According to the ISO C standard, casting between function
              pointers and 'void *', as done above, produces undefined results.
              POSIX.1-2001 and POSIX.1-2008 accepted this state of affairs and
              proposed the following workaround:

                  *(void **) &cosine = dlsym(handle, "cos");

              This (clumsy) cast conforms with the ISO C standard and will
              avoid any compiler warnings.

              The 2013 Technical Corrigendum 1 to POSIX.1-2008 improved matters
              by requiring that conforming implementations support casting
              'void *' to a function pointer.  Nevertheless, some compilers
              (e.g., gcc with the '-pedantic' option) may complain about the
              cast used in this program.  */

the union trick here provides no additional safety -- it's still undefined behavior technically speaking -- and will break in the same circumstances (non-POSIX risk).

All things considered, I'd rather we keep things as-is, and if we really want to deal with the warning, we can make a small utility to use pragmas to suppress the warnings locally around the cast.

}

ObjCache(void (*a_offload)(const Data&)): offload(a_offload) {}
explicit ObjCache(void (*a_offload)(const Data&)): offload(a_offload) {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want these constructors to be explicit? Are they even used implicitly anywhere in our codebase?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they aren’t used implicitly anywhere; the only constructions are direct ObjCache<T,K>(nullptr) from ObjPool and direct init of service_stream_cache with a lambda. explicit was added only to satisfy cpplint runtime/explicit. If we prefer not to mark these callbacks as explicit, we can drop explicit and suppress that line with NOLINT for cpplint instead.

There are no APIs that take an ObjCache by value. So explicit is not required for correctness, only for style / tooling.

Comment thread commit.txt Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming this won't be part of the final PR?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, This is for me to keep track of the issues fixed, will remove this.

- common: trailing space, encoding, docstrings, wheel file handle naming
- recipe: Format helper docstrings, whitespace
- jax/util: PEP8, module docstring, subprocess check, def over lambdas
- jax/setup: group build_tools.hipify imports before pybind11
- jax/quantize/helper: tejax + CUDA/ROCm helpers, no-else-return
- cpp_extensions: attention/normalization lazy SdyShardingRule; base
  is_hip_extension(); gemm conditional cGEMM imports + stubs
- pylintrc: align disables (e.g. wrong-import-position) with CI.
@VeeraRajasekhar VeeraRajasekhar marked this pull request as ready for review April 21, 2026 22:47
@VeeraRajasekhar VeeraRajasekhar added the ci-level 3 CI test level 3 label Apr 21, 2026
if (log_dir_str == "1") {
log_stream = &std::cout;
}
else if (open_ck_fused_attn_log_file(log_file, "ck_fused_attn", log_dir_str)) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is a warning for if-else if? I think it is used a lot in our code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The category was readability/braces. The message was along the lines of: “If an else has a brace on one side, it should have it on both.”

So the warning showed up because cpplint’s readability/braces heuristic fired on this if / else if layout, not because else if is forbidden. Nesting as else { if (...) { ... } } makes the structure obvious to the linter and cleared the warning.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move them to single line then but do not create nested ifs

}

#if HIPBLASLT_VERSION_MAJOR > 0 || HIPBLASLT_VERSION_MINOR >= 15
if (cfg.scaling_mode < 0 || cfg.scaling_mode >= (int)HIPBLASLT_MATMUL_MATRIX_SCALE_END)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line length and { at the end of the line are understood but lint does not require duplicate body of If.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the duplicate was to fix the readability/braces warning, I have restructured it to compute a bool in preprocessor-only branches, then use one if body.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there is discrepancy here. .clang-format sets line width 100 and this is what IDE uses. So cpplint should be configured accorfingly

zero_centered_gamma, mode, training);
} else
}
#endif
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get rid of nested ifs. If splitting else-if does not work, better add dummy 'if (false) {' for ROCm instead of 'if (NOrmBAckend...'

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped the nested if (!plan) / inner if (Forward) structure. and added the following more clear structure:

CUDA: if (Cudnn) … else if (Forward TE) … else (backward TE) in one #ifndef HIP_PLATFORM_AMD block.
ROCm: if (Forward TE) … else (backward TE) in #else, with mode/training on TE constructors.

return False

with te_wheel_file.open("r") as f:
for line in f:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is upstream code. Why change it?

from packaging import version
from typing import Optional, Tuple

from packaging import version
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it needed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used to check the following

if version.parse(jax.__version__) >= version.parse("0.5.0"):
    from jax.experimental.custom_partitioning import SdyShardingRule

I will update the codebase to remove all these checks as we no longer support Jax<0.5.0 as mentioned in #547 (comment)

from jax.experimental.custom_partitioning import BATCHING
from jax.interpreters.mlir import ir
from jax.sharding import PartitionSpec
from .misc import is_hip_extension
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the file have ROCm code?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, jax version checks were added.

import jax.numpy as jnp
from jax import dtypes, ffi
from jax.experimental.custom_partitioning import SdyShardingRule, BATCHING
if version.parse(jax.__version__) >= version.parse("0.5.0"):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IT is not needed, we don;t actually support JAX < 0.5

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

from flax.core.frozen_dict import FrozenDict

from ..util import is_hip_extension, get_jnp_float8_e4m3_type, get_jnp_float8_e5m2_type
import transformer_engine_jax as tejax
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All those changes will result in IFU conflicts

from build_tools.te_version import te_version
from build_tools.jax import setup_jax_extension, install_requirements, test_requirements

from pybind11.setup_helpers import build_ext as BuildExtension
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it needed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted and added # pylint: disable=ungrouped-imports

Copy link
Copy Markdown
Contributor

@Micky774 Micky774 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VeeraRajasekhar I still have questions from my initial review unanswered.

The main one which @ipanfilo has also asked is: why such divergence? It will be a huge maintenance burden to reconcile against every subsequent IFU, unless NV upstream implements these exact same decisions (afaik they don't?).

Enforcing linting on ROCm-only files is fine, but we have constraints regarding minimizing diff with upstream even if it means letting some things go "un-linted".


# Loop through tiles of current MM problem.
while tile >= last_mm_tile and tile < last_mm_tile + num_tiles:
while last_mm_tile <= tile < last_mm_tile + num_tiles:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Triton doesn't support this multi-comparison -- it looks like the linter is trying to enforce python semantics on it

if (log_dir_str == "1") {
log_stream = &std::cout;
}
else if (open_ck_fused_attn_log_file(log_file, "ck_fused_attn", log_dir_str)) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move them to single line then but do not create nested ifs

}

#if HIPBLASLT_VERSION_MAJOR > 0 || HIPBLASLT_VERSION_MINOR >= 15
if (cfg.scaling_mode < 0 || cfg.scaling_mode >= (int)HIPBLASLT_MATMUL_MATRIX_SCALE_END)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there is discrepancy here. .clang-format sets line width 100 and this is what IDE uses. So cpplint should be configured accorfingly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci-level 3 CI test level 3

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants