Commit graph

52 commits

Author SHA1 Message Date
ReinUsesLisp
d4ba9c4fe5 shader/other: Implement thread comparisons (NV_shader_thread_group)
Hardware S2R special registers match gl_Thread*MaskNV. We can trivially
implement these using Nvidia's extension on OpenGL or naively stubbing
them with the ARB instructions to match. This might cause issues if the
host device warp size doesn't match Nvidia's. That said, this is
unlikely on proper shaders.

Refer to the attached url for more documentation about these flags.
https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt
2020-05-21 23:18:37 -03:00
ReinUsesLisp
d8598a8717 shader_ir: Separate float-point comparisons in ordered and unordered
This allows us to use native SPIR-V instructions without having to
manually check for NAN.
2020-05-09 04:55:15 -03:00
bunnei
33a218bea2 Merge pull request #3693 from ReinUsesLisp/clean-samplers
shader/texture: Support multiple unknown sampler properties
2020-05-02 00:45:41 -04:00
ReinUsesLisp
7e8f51273c shader/arithmetic_integer: Implement CC for IADD 2020-04-25 22:55:26 -03:00
ReinUsesLisp
c9b4c56d69 shader_ir: Turn classes into data structures 2020-04-23 18:00:06 -03:00
ReinUsesLisp
d222e63967 shader/memory: Implement RED.E.ADD
Implements a reduction operation. It's an atomic operation that doesn't
return a value.

This commit introduces another primitive because some shading languages
might have a primitive for reduction operations.
2020-04-06 02:24:47 -03:00
Nguyen Dac Nam
cf457eafff shader: node - update correct comment 2020-03-30 17:44:44 +07:00
Nguyen Dac Nam
407064c658 shader_decode: add Atomic op for common usage 2020-03-30 17:44:44 +07:00
ReinUsesLisp
043d94e858 shader: Simplify indexed sampler usages 2020-02-24 01:26:07 -03:00
ReinUsesLisp
4f5791e529 shader: Remove curly braces initializers on shared pointers 2020-02-01 22:52:10 -03:00
bunnei
1de7f0beeb Merge pull request #3282 from FernandoS27/indexed-samplers
Partially implement Indexed samplers in general and specific code in GLSL
2020-02-01 20:41:40 -05:00
ReinUsesLisp
0d8f0ad3b3 shader/memory: Implement ATOM.ADD
ATOM operates atomically on global memory. For now only add ATOM.ADD
since that's what was found in commercial games.

This asserts for ATOM.ADD.S32 (handling the others as unimplemented),
although ATOM.ADD.U32 shouldn't be any different.

This change forces us to change the default type on SPIR-V storage
buffers from float to uint. We could also alias the buffers, but it's
simpler for now to just use uint. While we are at it, abstract the code
to avoid repetition.
2020-01-26 01:54:24 -03:00
Fernando Sahmkow
2e6a1b965d Shader_IR: Address feedback. 2020-01-25 09:04:59 -04:00
Fernando Sahmkow
b6d3153e7e Shader_IR: Propagate bindless index into the GL compiler. 2020-01-24 16:44:47 -04:00
Fernando Sahmkow
c066e472b9 Shader_IR: Implement Injectable Custom Variables to the IR. 2020-01-24 16:43:31 -04:00
Fernando Sahmkow
123c7cf307 Shader_IR: deduce size of indexed samplers 2020-01-24 16:43:31 -04:00
Fernando Sahmkow
f93bff419e Shader_IR: Implement initial code for tracking indexed samplers. 2020-01-24 16:43:30 -04:00
ReinUsesLisp
c4fd02b47f shader/memory: Implement ATOMS.ADD.U32 2020-01-16 17:30:55 -03:00
Fernando Sahmkow
591d53e1c3 Shader_IR: Address Feedback 2020-01-04 14:40:57 -04:00
Fernando Sahmkow
a4d446291d Shader_IR: add the ability to amend code in the shader ir.
This commit introduces a mechanism by which shader IR code can be
amended and extended. This useful for track algorithms where certain
information can derived from before the track such as indexes to array
samplers.
2019-12-30 15:31:48 -04:00
ReinUsesLisp
ac847a8cca shader/texture: Implement TLD4.PTP 2019-12-16 04:09:24 -03:00
ReinUsesLisp
6e95568616 shader: Implement MEMBAR.GL
Implement using memoryBarrier in GLSL and OpMemoryBarrier on SPIR-V.
2019-12-10 16:45:03 -03:00
ReinUsesLisp
72b999d789 shader_ir/other: Implement S2R InvocationId 2019-12-09 23:52:28 -03:00
ReinUsesLisp
243a33aba9 shader_ir/memory: Implement patch stores 2019-12-09 23:25:21 -03:00
bunnei
2b4786f709 Merge pull request #3109 from FernandoS27/new-instr
Implement FLO & TXD Instructions on GPU Shaders
2019-12-06 18:18:16 -05:00
ReinUsesLisp
77f86f48ac shader/texture: Deduce texture buffers from locker
Instead of specializing shaders to separate texture buffers from 1D
textures, use the locker to deduce them while they are being decoded.
2019-11-22 21:28:47 -03:00
Fernando Sahmkow
206d13c987 Shader_IR: Implement TXD instruction. 2019-11-14 11:15:27 -04:00
Fernando Sahmkow
6267529837 Shader_IR: Implement FLO instruction. 2019-11-14 11:15:27 -04:00
ReinUsesLisp
bb94bcc991 shader_ir/warp: Implement FSWZADD 2019-11-07 20:08:41 -03:00
ReinUsesLisp
5fc04875a1 gl_shader_decompiler: Reimplement shuffles with platform agnostic intrinsics 2019-11-07 20:08:41 -03:00
ReinUsesLisp
1589a146ed shader/node: Unpack bindless texture encoding
Bindless textures were using u64 to pack the buffer and offset from
where they come from. Drop this in favor of separated entries in the
struct.

Remove the usage of std::set in favor of std::list (it's not std::vector
to avoid reference invalidations) for samplers and images.
2019-10-29 20:53:48 -03:00
Lioncash
94855ef1a8 shader/node: std::move Meta instance within OperationNode constructor
Allows usages of the constructor to avoid an unnecessary copy.
2019-10-15 18:21:59 -04:00
ReinUsesLisp
79a7463f4c gl_shader_decompiler: Use uint for images and fix SUATOM
In the process remove implementation of SUATOM.MIN and SUATOM.MAX as
these require a distinction between U32 and S32. These have to be
implemented with imageCompSwap loop.
2019-09-21 17:33:52 -03:00
ReinUsesLisp
331d140bb4 shader/image: Implement SULD and remove irrelevant code
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-21 17:32:48 -03:00
bunnei
beabee3696 Merge pull request #2855 from ReinUsesLisp/shfl
shader_ir/warp: Implement SHFL for Nvidia devices
2019-09-20 17:10:42 -04:00
bunnei
47a8e03f14 Merge pull request #2784 from ReinUsesLisp/smem
shader_ir: Implement shared memory
2019-09-18 16:26:05 -04:00
ReinUsesLisp
42815d1d24 shader_ir/warp: Implement SHFL 2019-09-17 17:44:07 -03:00
ReinUsesLisp
2e6bebb3d2 shader/image: Implement SUATOM and fix SUST 2019-09-10 20:22:31 -03:00
ReinUsesLisp
e2aad88d51 gl_shader_decompiler: Keep track of written images and mark them as modified 2019-09-05 23:26:05 -03:00
ReinUsesLisp
9fb31b1b23 kepler_compute: Implement texture queries 2019-09-05 20:35:51 -03:00
ReinUsesLisp
df0203dd87 shader_ir: Implement ST_S
This instruction writes to a memory buffer shared with threads within
the same work group. It is known as "shared" memory in GLSL.
2019-09-05 01:38:37 -03:00
ReinUsesLisp
67f47b2f6a shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't match exactly to Nvidia's code

VOTE.ALL Rd, PT, PT;

Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
Fernando Sahmkow
9a0fa90be2 Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
This commit takes care of implementing the F16 Variants of the 
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
ReinUsesLisp
2f76aafca9 shader/half_set_predicate: Fix HSETP2 implementation 2019-07-19 22:21:22 -03:00
Fernando Sahmkow
82efa35683 shader_ir: Unify blocks in decompiled shaders. 2019-07-09 08:14:39 -04:00
Fernando Sahmkow
d5d4cc30ec shader_ir: Implement BRX & BRA.CC 2019-07-09 08:14:37 -04:00
Fernando Sahmkow
b62d22a77d texture_cache: Style and Corrections 2019-06-20 21:24:47 -04:00
ReinUsesLisp
7e4a7929f8 shader: Implement bindless images 2019-06-20 21:38:33 -03:00
ReinUsesLisp
224e4e174d shader: Decode SUST and implement backing image functionality 2019-06-20 21:38:33 -03:00
ReinUsesLisp
e8bd976b4d shader: Split SSY and PBK stack
Hardware testing revealed that SSY and PBK push to a different stack,
allowing code like this:

        SSY label1;
        PBK label2;
        SYNC;
label1: PBK;
label2: EXIT;
2019-06-07 02:18:27 -03:00