Material, sampling, and denoiser improvements for the rtx device #288

Merged
jeffamstutz merged 16 commits into NVIDIA:next_release from tarcila:misc-enhancements on Apr 28, 2026

Collaborator

tarcila commented Apr 28, 2026

PhysicallyBased material (glTF 2.0 parity)

  • Adds glTF 2.0 occlusion and the following KHR_materials_* extensions: specular (specular/specularColor), clearcoat, sheen, iridescence, and volume (thickness and attenuation).
  • Base color default corrected to (1, 1, 1) per spec.
  • Reflections use GGX visible-normal importance sampling.
  • Matte material now honours opacity in the Quality renderer.

Sampling fixes

  • HDRI NEE pdf normalization corrected.
  • Light-pick probability correctly factored into the MIS pdf.
  • Transmission-to-alpha improved for transparent scenes over image backgrounds.

Denoiser

  • Dense per-pixel albedo/normal/color estimates fed to the denoiser at every frame, eliminating checkerboard flicker.

Internals

  • Background compositing, format conversion, and tonemap/inverse-tonemap extracted into dedicated helpers; premultiplyBackground renamed to premultipliedAlpha. These are
    prerequisites for the denoiser work above.

tarcila added 15 commits April 27, 2026 19:28
…mantics

Renames both the GPU struct field (RendererGPUData::premultiplyBackground)
and the public ANARI parameter, and updates its description from
"pre-multiply alpha channel with background color" to "pre-multiply RGB
by alpha in the composited output pixel". Also exposes the parameter on
the third renderer that previously lacked it.

Drops the unconditional getBackground/bg.a blend on path miss and
replaces it with HDRI-only accumulation via getBackgroundLight, gated
on !volumeSample.didScatter so a scattered volume sample doesn't
re-add the sky.

Introduces a CUDA kernel (compositeBackground) and its launcher that
composite the rendered accumulator over the background and emit the
final pixel format (FLOAT, UINT, or SRGB). Hooks it into both the
denoise and non-denoise paths in Frame::renderFrame().
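
As a rough host-side illustration of the compositing step described above, the following C++ sketch shows a standard "over" blend, an optional un-premultiply, and 8-bit packing. It is not the actual compositeBackground kernel: the Vec4 type, the function names compositePixel, linearToSrgb, and packRgba8, and the assumption that the accumulator holds coverage-weighted RGB are all hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical RGBA type, for illustration only.
struct Vec4
{
  float r, g, b, a;
};

// Composite an accumulated pixel over a straight-alpha background.
// Assumption: accum RGB is already weighted by its coverage accum.a.
inline Vec4 compositePixel(Vec4 accum, Vec4 background, bool premultipliedAlpha)
{
  const float t = 1.0f - accum.a; // fraction of the background still visible
  const float bgWeight = background.a * t;
  Vec4 out{accum.r + background.r * bgWeight,
      accum.g + background.g * bgWeight,
      accum.b + background.b * bgWeight,
      accum.a + bgWeight};
  // The blend above yields premultiplied RGB; divide alpha out on request.
  if (!premultipliedAlpha && out.a > 0.0f) {
    out.r /= out.a;
    out.g /= out.a;
    out.b /= out.a;
  }
  return out;
}

// Standard linear-to-sRGB transfer function on a clamped input.
inline float linearToSrgb(float c)
{
  c = std::min(std::max(c, 0.0f), 1.0f);
  return c <= 0.0031308f ? 12.92f * c
                         : 1.055f * std::pow(c, 1.0f / 2.4f) - 0.055f;
}

// Pack to 8 bits per channel; alpha is never sRGB-encoded.
inline std::uint32_t packRgba8(Vec4 c, bool srgb)
{
  auto quantize = [](float v) {
    v = std::min(std::max(v, 0.0f), 1.0f);
    return static_cast<std::uint32_t>(v * 255.0f + 0.5f);
  };
  const float r = srgb ? linearToSrgb(c.r) : c.r;
  const float g = srgb ? linearToSrgb(c.g) : c.g;
  const float b = srgb ? linearToSrgb(c.b) : c.b;
  return quantize(r) | (quantize(g) << 8) | (quantize(b) << 16)
      | (quantize(c.a) << 24);
}
```

Keeping the blend and the format conversion in one place means the raygen programs only need to accumulate, which matches the direction of the refactor described in this PR.
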
Sparse accumulator writes under checkerboarding caused flicker; the
2x2 dilation hack papered over gaps instead of fixing them. Two new
post-launch kernels (prepareDenoiseInput, prepareDenoiseGuides) cover
every pixel using a resolveSample helper that redirects non-rendered
checker tiles to their source accumulator.

Side effects: accumPixelSample simplifies to pure accumulation (no
output-buffer writes, no frameIDOffset); writeOutputColor is deleted;
outColorVec4, outColorUint, and FrameFormat are removed from
FramebufferGPUData.
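
The idea of covering every pixel by redirecting non-rendered checker cells can be sketched on the host as follows. This is only a guess at the shape of the logic: the per-pixel parity pattern, the choice of the horizontal neighbour as the "source", and the signatures of resolveSample and prepareDenoiseInput are assumptions made for illustration, not the code in Frame.cu.

```cpp
#include <vector>

// Hypothetical pixel coordinate type.
struct PixelCoord
{
  int x;
  int y;
};

// Assumed checkerboard rule: a pixel is rendered when the parity of
// x + y matches the parity of the frame index.
inline bool renderedThisFrame(int x, int y, int frameID)
{
  return ((x + y + frameID) & 1) == 0;
}

// Redirect a non-rendered pixel to a horizontally adjacent pixel, which
// always has the opposite parity and so was rendered. Assumes width >= 2.
inline PixelCoord resolveSample(int x, int y, int width, int frameID)
{
  if (renderedThisFrame(x, y, frameID))
    return {x, y};
  const int nx = (x + 1 < width) ? x + 1 : x - 1;
  return {nx, y};
}

// Build a dense buffer from a sparse accumulator so that every pixel
// handed to the denoiser carries a valid value.
inline std::vector<float> prepareDenoiseInput(
    const std::vector<float> &accum, int width, int height, int frameID)
{
  std::vector<float> dense(accum.size());
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      const PixelCoord src = resolveSample(x, y, width, frameID);
      dense[y * width + x] = accum[src.y * width + src.x];
    }
  }
  return dense;
}
```

Because the dense buffer is rebuilt after every launch, the denoiser never sees the gaps, which is the property the commit message attributes to the new kernels.
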
Each iteration of the depth loop runs at most one of the volume-scatter,
surface-hit, or no-hit branches, so on the first bounce the
!firstHitAssigned check is always true. The flag and its assignments
can go.

Helps rendering a transparent scene over an image background.

Adds a NEXT_RAY_CONTINUES_THROUGH_SURFACE flag and a
continuesThroughSurface(NextRay) helper so callers (Quality_ptx,
PhysicallyBasedShader, MatteShader) can distinguish a real material
sample from an alpha-driven pass-through. NextRay default values
collapse from vec4 to vec3 to drop a stale alpha lane. The per-hit
accumulateValue(sample.opacity, materialOpacity, ...) on the surface
branch in Quality_ptx is removed; opacity accumulation is now driven
by whether the chosen next ray continues through the surface.
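
A minimal sketch of how such a flag and helper might look is shown below. Only the names NEXT_RAY_CONTINUES_THROUGH_SURFACE, continuesThroughSurface, NextRay, and its flags field come from the commit message; the struct layout, the NEXT_RAY_NONE value, the bit position, and the chooseNextRay helper with its stochastic opacity test are assumptions for illustration.

```cpp
#include <cstdint>

// Hypothetical vector type.
struct Vec3
{
  float x, y, z;
};

// Flag values are assumed; only the flag name comes from the PR.
enum NextRayFlags : std::uint32_t
{
  NEXT_RAY_NONE = 0u,
  NEXT_RAY_CONTINUES_THROUGH_SURFACE = 1u << 0
};

// Assumed layout: vec3 direction and weight plus a flags word.
struct NextRay
{
  Vec3 direction;
  Vec3 contributionWeight;
  std::uint32_t flags;
};

inline bool continuesThroughSurface(const NextRay &ray)
{
  return (ray.flags & NEXT_RAY_CONTINUES_THROUGH_SURFACE) != 0u;
}

// Hypothetical shader-side choice: with probability (1 - opacity) the ray
// passes straight through unchanged, otherwise a real material sample is
// returned. `u` is a uniform random number in [0, 1).
inline NextRay chooseNextRay(
    Vec3 incomingDir, Vec3 bounceDir, Vec3 bounceWeight, float opacity, float u)
{
  if (u >= opacity) {
    return NextRay{incomingDir,
        Vec3{1.0f, 1.0f, 1.0f},
        NEXT_RAY_CONTINUES_THROUGH_SURFACE};
  }
  return NextRay{bounceDir, bounceWeight, NEXT_RAY_NONE};
}
```

A caller can then branch on continuesThroughSurface(next) to decide, for example, which side of the surface to offset the new ray origin to, or whether the hit should contribute to accumulated opacity.
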
Fix two HDRI importance sampling bugs:
- The per-pixel area was wrong when computing the pdf weight.
- Pole bias was accounted for twice: once in the precomputed luminance
  and again in sample light.
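
To make the two bugs concrete, here is a small sketch of the textbook pdf for an equirectangular map, assuming the sampling table stores luminance times sin θ per texel. With that convention the sin θ in the table cancels the 1/sin θ Jacobian of the (u, v) to solid-angle mapping, so sin θ must appear only once, and the per-texel area enters as the width × height factor. The EnvMapTable type and all function names are hypothetical, and this is not the code in CDF.cu or sampleLight.h.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

const double kPi = 3.14159265358979323846;

// Hypothetical sampling table for an equirectangular environment map.
struct EnvMapTable
{
  int width;
  int height;
  std::vector<double> luminance; // per-texel luminance, row-major
  double weightSum;              // sum of luminance * sin(theta) over texels
};

// sin(theta) at the centre of row y, with theta in (0, pi).
inline double rowSinTheta(int y, int height)
{
  return std::sin(kPi * (y + 0.5) / height);
}

inline EnvMapTable buildEnvMapTable(
    int width, int height, const std::vector<double> &luminance)
{
  EnvMapTable t{width, height, luminance, 0.0};
  for (int y = 0; y < height; ++y)
    for (int x = 0; x < width; ++x)
      t.weightSum += luminance[y * width + x] * rowSinTheta(y, height);
  return t;
}

// Solid angle covered by one texel in row y.
inline double texelSolidAngle(const EnvMapTable &t, int y)
{
  return (2.0 * kPi / t.width) * (kPi / t.height) * rowSinTheta(y, t.height);
}

// Solid-angle pdf of picking texel (x, y):
//   P(texel) = L * sin(theta) / weightSum
//   p(u, v)  = P(texel) * width * height
//   p(omega) = p(u, v) / (2 * pi^2 * sin(theta))
// The sin(theta) terms cancel, leaving the expression below.
inline double pdfSolidAngle(const EnvMapTable &t, int x, int y)
{
  if (t.weightSum <= 0.0)
    return 0.0;
  const double L = t.luminance[y * t.width + x];
  return L * t.width * t.height / (t.weightSum * 2.0 * kPi * kPi);
}
```

A quick sanity check for any such pdf is that summing pdfSolidAngle times texelSolidAngle over all texels gives 1, and that a constant map yields roughly 1/(4π).
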
Move the light probability weight into pdf instead of returned radiance
so MIS consumers see the correct joint density.
Wire up the glTF 2.0 KHR material extensions on the PhysicallyBased
material — specular, clearcoat, sheen, iridescence, occlusion, and
volume (thickness, attenuation distance and color) — with the usual
constant / attribute / sampler routing on host and GPU.

Rework the shader so direct lighting and next-ray sampling use a
proper microfacet model and compose the new lobes (base + clearcoat
+ sheen, iridescent Fresnel). Drop the cone-around-mirror reflection
sample and the hand-tuned 0.85 transmission tint.
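
For readers unfamiliar with visible-normal sampling, the sketch below follows the widely used routine from Heitz's "Sampling the GGX Distribution of Visible Normals", restricted to isotropic roughness. Whether the shader uses this exact variant is an assumption, and the Vec3 type and helper names are hypothetical. The view direction Ve is expected in the local shading frame with +z along the surface normal.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical vector type and helpers.
struct Vec3
{
  float x, y, z;
};

inline Vec3 operator+(Vec3 a, Vec3 b)
{
  return {a.x + b.x, a.y + b.y, a.z + b.z};
}

inline Vec3 operator*(float s, Vec3 v)
{
  return {s * v.x, s * v.y, s * v.z};
}

inline float dot(Vec3 a, Vec3 b)
{
  return a.x * b.x + a.y * b.y + a.z * b.z;
}

inline Vec3 cross(Vec3 a, Vec3 b)
{
  return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

inline Vec3 normalize(Vec3 v)
{
  return (1.0f / std::sqrt(dot(v, v))) * v;
}

// Sample a microfacet normal visible from Ve for isotropic GGX roughness
// alpha, using uniform random numbers u1, u2 in [0, 1).
inline Vec3 sampleGGXVNDF(Vec3 Ve, float alpha, float u1, float u2)
{
  const float kPi = 3.14159265358979323846f;
  // Stretch the view direction into the unit-hemisphere configuration.
  const Vec3 Vh = normalize(Vec3{alpha * Ve.x, alpha * Ve.y, Ve.z});
  // Build an orthonormal basis around Vh.
  const float lensq = Vh.x * Vh.x + Vh.y * Vh.y;
  const Vec3 T1 = lensq > 0.0f
      ? (1.0f / std::sqrt(lensq)) * Vec3{-Vh.y, Vh.x, 0.0f}
      : Vec3{1.0f, 0.0f, 0.0f};
  const Vec3 T2 = cross(Vh, T1);
  // Sample the projected area of the visible hemisphere.
  const float r = std::sqrt(u1);
  const float phi = 2.0f * kPi * u2;
  const float t1 = r * std::cos(phi);
  float t2 = r * std::sin(phi);
  const float s = 0.5f * (1.0f + Vh.z);
  t2 = (1.0f - s) * std::sqrt(1.0f - t1 * t1) + s * t2;
  // Reproject onto the hemisphere.
  const float t3 = std::sqrt(std::max(0.0f, 1.0f - t1 * t1 - t2 * t2));
  const Vec3 Nh = t1 * T1 + t2 * T2 + t3 * Vh;
  // Unstretch back to the ellipsoid configuration.
  return normalize(Vec3{alpha * Nh.x, alpha * Nh.y, std::max(0.0f, Nh.z)});
}
```

Compared with sampling a cone around the mirror direction, this draws normals in proportion to the visible part of the microfacet distribution, so every sampled normal faces the viewer and fewer samples are wasted at grazing angles.
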

Add a NextRay.flags field to distinguish opacity pass-through from a
real bounce; Quality and Interactive use it.
Contributor

Copilot AI left a comment

Pull request overview

This PR enhances the RTX device renderer to improve glTF2-style physically based materials, fix several light sampling/PDF/MIS issues, and stabilize denoising by feeding dense per-pixel guide buffers every frame. It also refactors accumulation/output so background compositing and output formatting are handled in dedicated helpers/kernels (including a rename of the premultiply parameter).

Changes:

  • Expand PBR material support toward glTF 2.0 parity (KHR_materials_* extensions), update defaults, and improve reflection sampling (GGX VNDF).
  • Correct HDRI/light sampling PDFs and MIS integration details (including uniform light-pick probability handling).
  • Rework denoiser inputs and output compositing: accumulate-only in raygen, then composite background/tonemap/format-convert in a dedicated kernel; introduce premultipliedAlpha.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| devices/rtx/device/visrtx_device.json | Rename/introduce premultipliedAlpha renderer parameter across renderer types. |
| devices/rtx/device/renderer/Renderer.h | Rename renderer member to m_premultipliedAlpha. |
| devices/rtx/device/renderer/Renderer.cpp | Read premultipliedAlpha param and pass through to GPU frame data. |
| devices/rtx/device/renderer/Quality_ptx.cu | Fix MIS light-pick PDF handling; adjust first-hit bookkeeping; background via HDRI helper; opacity handling tweaks. |
| devices/rtx/device/renderer/Interactive_ptx.cu | Use continuesThroughSurface() for ray offsets; background via HDRI helper; minor cleanup. |
| devices/rtx/device/renderer/Debug_ptx.cu | Background via HDRI helper (instead of background image/color). |
| devices/rtx/device/material/shaders/PhysicallyBasedShader_ptx.cu | Major PBR shading/sampling upgrade: KHR extensions, GGX VNDF, clearcoat/sheen/iridescence/volume attenuation, revised next-ray selection. |
| devices/rtx/device/material/shaders/MatteShader_ptx.cu | Make matte honor opacity by allowing pass-through rays. |
| devices/rtx/device/material/shaders/MDLShader_ptx.cu | Add NextRay flags for “continues through surface”; remove legacy 0.85 transmission scaling. |
| devices/rtx/device/material/PBR.h | Update default base color to (1,1,1,1); add parameters/samplers for KHR extensions. |
| devices/rtx/device/material/PBR.cpp | Commit/gpuData wiring for new KHR extension parameters/samplers. |
| devices/rtx/device/light/sampling/CDF.cu | Adjust HDRI pdf normalization weight for equirect sampling; formatting tweaks. |
| devices/rtx/device/gpu/shadingState.h | Add NextRayFlags + NextRay::flags; extend PBR shading state with KHR fields. |
| devices/rtx/device/gpu/sampleLight.h | Remove extra sinθ factor from HDRI pdf (since sinθ folded into CDF); formatting tweaks. |
| devices/rtx/device/gpu/renderer/raygen_helpers.h | Update opacity/transmission relation; treat HDRI sky as opaque to avoid bleed; remove per-sample output writes. |
| devices/rtx/device/gpu/gpu_util.h | Add continuesThroughSurface() helper; remove background/output-writing helpers; make accumulation write-only; move tonemap helpers out. |
| devices/rtx/device/gpu/gpu_tonemap.h | New shared tonemap/inverse-tonemap utilities usable outside PTX-only headers. |
| devices/rtx/device/gpu/gpu_objects.h | Rename GPU renderer flag to premultipliedAlpha; remove output buffers + fb.format from GPU framebuffer data. |
| devices/rtx/device/gpu/evalShading.h | Signature/format cleanup; update default NextRay init for new struct shape. |
| devices/rtx/device/frame/Frame.h | Add dedicated denoiser input/guide buffers to avoid checkerboard feedback artifacts. |
| devices/rtx/device/frame/Frame.cu | Add kernels to prepare denoise inputs/guides and to composite background + format-convert output. |
| devices/rtx/device/frame/Denoiser.h | Update setup signature (explicit input/albedo/normal buffers) and add convertOutput(). |
| devices/rtx/device/frame/Denoiser.cu | Wire denoiser to separate input/output; split output conversion into convertOutput(). |

Comment thread devices/rtx/device/light/sampling/CDF.cu Outdated
Comment thread devices/rtx/device/light/sampling/CDF.cu Outdated
Comment thread devices/rtx/device/renderer/Renderer.cpp Outdated
Comment thread devices/rtx/device/gpu/renderer/raygen_helpers.h
Comment thread devices/rtx/device/gpu/evalShading.h Outdated
Comment thread devices/rtx/device/light/sampling/CDF.cu Outdated
Collaborator

jeffamstutz left a comment

LGTM, thanks!

jeffamstutz merged commit d256bae into NVIDIA:next_release on Apr 28, 2026
8 checks passed