I cannot use TextureSample with custom HLSL gaussian blur function

Beefyhead · June 19, 2021, 12:13pm

I am relatively new to HLSL and am trying to implement a Gaussian blur for my new dynamic shadow material project. I can input a Texture Object but not a TextureSample; with the latter I get an error. Is there another function for TextureSample? If not, how do I get the alpha from a Texture Object? I am totally confused.

Error with TextureSample: error X3004: undeclared identifier 'TexSampler'

With Texture Object

With TextureSample

Source:

float3 blur = Texture2DSample(Tex, TexSampler, UV);

for (int i = 0; i < r; i++) {
    blur += Texture2DSample(Tex, TexSampler, UV + float2(i * dist, 0));
    blur += Texture2DSample(Tex, TexSampler, UV - float2(i * dist, 0));
}

for (int j = 0; j < r; j++) {
    blur += Texture2DSample(Tex, TexSampler, UV + float2(0, j * dist));
    blur += Texture2DSample(Tex, TexSampler, UV - float2(0, j * dist));
}

blur /= 2*(2*r)+1;
return blur;

Source reference: Custom Expressions | Unreal Engine Documentation

Jeonguk_Choi · June 22, 2021, 12:12am

To use the alpha channel, change the type to float4.

Beefyhead · January 26, 2022, 1:16pm

Thanks for replying, buddy, you made my day. But I wonder, which version of Unreal Engine are you using? Unfortunately, I cannot set an output variable name on the Custom node.

Jeonguk_Choi · June 22, 2021, 9:48am

I am using version 4.26.
In previous versions, use the Component Mask node.

Beefyhead · June 22, 2021, 3:13pm

But the Alpha variable is causing the problem: it is reported as an undeclared variable. How is that possible without naming the variable? I am sorry if I am missing something.

Jeonguk_Choi · June 22, 2021, 6:07pm

You don’t need an alpha variable.

ozzmeister00 · June 25, 2021, 4:00pm

You need to pass a Texture Object into the Tex input of your custom node (you can see that Jeonguk is doing this in their example). In your material, you’re using a Texture Sample node and feeding its RGB output into the Tex input of your custom node, so the custom node sees the Tex value as a float3.
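
As a minimal sketch of that setup, assuming the Custom node has an input named Tex wired to a Texture Object node, an input named UV, and an Output Type of CMOT Float 1, the alpha can be read from the float4 that the sample returns. TexSampler is assumed to be declared by the material compiler alongside the Tex texture input, which would explain why it is undeclared when Tex receives a float3:

```hlsl
// Custom node body. Assumed inputs: Tex (Texture Object), UV (float2).
// Texture2DSample returns a float4, so keep all four channels.
float4 color = Texture2DSample(Tex, TexSampler, UV);

// Return only the alpha channel (Output Type: CMOT Float 1).
return color.a;
```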

JonMicheelsen · June 28, 2021, 9:56am

The syntax you need to use is Tex.Sample(TexSampler, UV); instead of Texture2DSample(Tex, TexSampler, UV);

Also, in Unreal, you can use the globally defined clamped and wrapped samplers, found in the View struct: View.MaterialTextureBilinearClampedSampler or View.MaterialTextureBilinearWrapedSampler
Like this: Tex.Sample(View.MaterialTextureBilinearClampedSampler, UV); It’s what setting the shared Sampler Source to Shared: Wrapped or Clamped does under the hood.

Also be aware that a brute force Gaussian blur like the one you are doing there becomes very expensive as the kernel size increases!

Beefyhead · June 29, 2021, 4:28am

Indeed, about 120 instructions. Do you have any alternative options to share?

JonMicheelsen · July 2, 2021, 11:03am

Well, the instruction count printed by Unreal is not correct[0] for what that snippet of code is actually doing. Unreal’s pipeline is built around the node graph, not code nodes. The node graph doesn’t have proper flow control, so it’s not designed to evaluate instruction costs for loops (flow); indeed it can’t if the iteration count is dynamic. So, if you put a loop in there, it will ignore it and just count what’s in it once.
You are sampling a texture (r * 2) ^ 2 + 1 times so an r of 5 is 100 + 1. An r of 10 is 400 + 1 etc. Texture samples are much more expensive than single instructions, so to get an actual idea of what this would cost you would have to plop out 401 texture sampler nodes - and already there it’s obvious that what you are doing ain’t cheap. You also do (r * 6) ^ 2 additions, (r * 2) ^ 2 subtractions and (r * 4) ^ 2 multiplications, ignoring here the final bit outside the loop. Each of those additions, subtractions and multiplications counts as an instruction[1].
If you hardcode the r as a const int in the code and put an [unroll] before each of those loops, Unreal may actually show you the real instruction count, emphasis on the may.
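
A sketch of what that could look like, assuming Custom node inputs named Tex (Texture Object), UV and dist, and a Float4 output type. The value 5 for r is only an example, and the loops start at 1 here so the centre texel is not sampled again (the original loops start at 0):

```hlsl
// Custom node body. Assumed inputs: Tex (Texture Object), UV (float2), dist (float).
const int r = 5; // hardcoded so the compiler knows the iteration count

float4 blur = Texture2DSample(Tex, TexSampler, UV);

[unroll]
for (int i = 1; i <= r; i++) {
    blur += Texture2DSample(Tex, TexSampler, UV + float2(i * dist, 0));
    blur += Texture2DSample(Tex, TexSampler, UV - float2(i * dist, 0));
}

[unroll]
for (int j = 1; j <= r; j++) {
    blur += Texture2DSample(Tex, TexSampler, UV + float2(0, j * dist));
    blur += Texture2DSample(Tex, TexSampler, UV - float2(0, j * dist));
}

// One centre tap plus four taps per step, the same divisor as 2*(2*r)+1.
return blur / (4 * r + 1);
```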

Either way, what you have is a brute force blur. It’s actually a box blur, not a Gaussian blur, since you’re not calculating a Gaussian weight per sample. Normally you would do a thing like this in two passes: first blur r steps along the x axis and save that to a texture, then blur r steps along the y axis of that texture. This is how bloom is often done. That would change the cost from ^ 2 to * 2. I don’t know if that can be done in the context of what you’re doing, though.
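
For illustration, here is an untested sketch of a single horizontal pass with a Gaussian weight per sample. The input names Tex, UV and dist are assumptions, as is the choice of sigma; r is hardcoded so the loop can unroll:

```hlsl
// Custom node body, horizontal pass only.
// Assumed inputs: Tex (Texture Object), UV (float2), dist (float).
const int r = 5;
const float sigma = 0.5 * r; // assumption: radius spans two standard deviations

float4 sum = 0;
float weightSum = 0;

[unroll]
for (int i = -r; i <= r; i++) {
    // Gaussian weight for a tap i steps from the centre.
    float w = exp(-(i * i) / (2.0 * sigma * sigma));
    sum += w * Tex.Sample(TexSampler, UV + float2(i * dist, 0));
    weightSum += w;
}

// Normalise so the weights sum to 1 and brightness is preserved.
return sum / weightSum;
```

The vertical pass would use float2(0, i * dist) and read from the texture that stores the result of this pass. How that intermediate texture is produced (for example a render target) is outside this sketch.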

All that said. If you have mipmaps generated for the texture, you can use those. Mips are naturally blurred and cheaper to sample[2]. You might be able to just sample maybe 4-5 mip biases down from the highest and sum those to get more or less the same thing for a fraction of the cost.

Here’s a rough untested example of a much more performant solution.
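float r = 5;    // number of samples to sum (replace this with an input)
float blur = 2; // mip bias step per sample (replace this with an input)

float4 sum = 0;
for (int i = 0; i < r; i++)
{
    // Tex is the Texture Object input; HLSL identifiers are case-sensitive.
    sum += Tex.SampleBias(View.MaterialTextureBilinearClampedSampler, UV, i * blur);
}
return sum / r;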

You wrote you’re new to HLSL and shaders. Shaders are fun, have fun! But I would always recommend trying to understand the code snippets you’re trying out and their implications. The best way to learn is to try things out; if you notice clear performance drops when your test is taking up the whole screen, you are doing too much work in the shader[3]. The more you understand it, the better you can learn to evaluate whether there is a better alternative or a smarter trick to get to what you want. Compilers are usually real smart and can optimize a lot of things away, but they can’t replace whole methods with more performant ones.

The online directX HLSL intrinsic function manual is really informative too.
SampleBias is explained here: SampleBias (DirectX HLSL Texture Object) - Win32 apps | Microsoft Docs

[0]: An empty default material is 116 instructions - you’re not blurring for just 4, I can guarantee you that :o)
[1]: A compiler will remove some of the duplicates, like (r * 4) ^ 2 probably becomes r ^ 2 since the result is the same for all 4 cases - it’s still a large number if r is high.
[2]: Mips are smaller, so it’s smaller blocks of memory for the sampler to handle - another source of performance issues: “oversampling”. Tile a texture on a huge plane forcing it to always sample mip 0 and even a high end graphics card may stall significantly, when several full res tiles cover just a few pixels on the horizon.
[3]: “stat unit” console command will show you the ms timers, if the gpu ms timer goes up by a lot compared to looking at a grey box, you are doing something expensive. This is by far the best indicator of a specific shader being expensive, without actually digging deeper in a real gpu frame capture.

Beefyhead · July 4, 2021, 12:30pm

Great, that’s new information for me: loops aren’t counted in the UE4 material editor. I haven’t generated mips. Mips can even look pixelated, I guess. I am looking for a dynamic shadow projection that uses a diffuse map to cut down the cost. The brute force approach is kind of linear: it blurs along a single direction, and the rest of the areas appear much sharper. Thanks for your comment, let me try your method.