ReinUsesLisp
9b001821d9
shader/shift: Implement SHR wrapped and clamped variants
...
Nvidia defaults to wrapped shifts, but this is undefined behaviour in
OpenGL's spec. Explicitly mask/clamp according to what the guest shader
requires.
2019-09-04 01:55:24 -03:00
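A minimal sketch of the difference between the two SHR variants described above, written as host-side C++ rather than the emitted GLSL. The helper names `ShrWrapped` and `ShrClamped` are hypothetical, and the mask (31) and clamp bound (31) are assumptions chosen for 32-bit operands, not values taken from the commit:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: wrapped variant. The shift amount is masked to the
// low 5 bits, so shifting by 33 behaves like shifting by 1.
inline std::uint32_t ShrWrapped(std::uint32_t value, std::uint32_t amount) {
    return value >> (amount & 31u);
}

// Hypothetical helper: clamped variant. The shift amount saturates at 31
// (assumed bound), so the host shift never reaches undefined territory.
inline std::uint32_t ShrClamped(std::uint32_t value, std::uint32_t amount) {
    return value >> std::min<std::uint32_t>(amount, 31u);
}
```

Either way the shift amount handed to the host is always in [0, 31], which sidesteps the undefined out-of-range case mentioned in the commit message.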
bunnei
3df0f440fd
Merge pull request #2812 from ReinUsesLisp/f2i-selector
...
shader_ir/conversion: Implement F2I and F2F F16 selector
2019-09-03 22:35:33 -04:00
bunnei
4ae7f81090
Merge pull request #2811 from ReinUsesLisp/fsetp-fix
...
float_set_predicate: Add missing negation bit for the second operand
2019-09-03 22:34:34 -04:00
ReinUsesLisp
6f134adf2a
shader_ir/conversion: Split int and float selector and implement F2F H1
2019-08-28 16:09:33 -03:00
ReinUsesLisp
d9ad389777
shader_ir/conversion: Implement F2I F16 Ra.H1
2019-08-27 23:40:40 -03:00
ReinUsesLisp
d490cc5285
float_set_predicate: Add missing negation bit for the second operand
2019-08-27 21:57:43 -03:00
ReinUsesLisp
67f47b2f6a
shader_ir: Implement VOTE
...
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics
Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.
To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:
* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true
ballotARB, also known as "uint64_t(activeThreadsNV())", emits
VOTE.ANY Rd, PT, PT;
on nouveau's compiler. This doesn't exactly match Nvidia's code
VOTE.ALL Rd, PT, PT;
which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter, since .ANY, .ALL and .EQ affect the predicates
(set to PT in those cases) and not the registers.
2019-08-21 14:50:38 -03:00
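The warp-size-one stub above can be checked against a plain model of the three vote operations. This is a sketch in host-side C++, not the GLSL the decompiler emits; the function names `AnyThread`, `AllThreads` and `AllThreadsEqual` are hypothetical, and a warp is modelled as a vector of per-lane booleans:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical model: true if at least one lane holds true.
inline bool AnyThread(const std::vector<bool>& lanes) {
    return std::any_of(lanes.begin(), lanes.end(), [](bool v) { return v; });
}

// Hypothetical model: true if every lane holds true.
inline bool AllThreads(const std::vector<bool>& lanes) {
    return std::all_of(lanes.begin(), lanes.end(), [](bool v) { return v; });
}

// Hypothetical model: true if all lanes agree (all true or all false).
inline bool AllThreadsEqual(const std::vector<bool>& lanes) {
    return AllThreads(lanes) || !AnyThread(lanes);
}
```

With a single lane holding `value`, `AnyThread` and `AllThreads` both reduce to `value` and `AllThreadsEqual` is always true, which is the mapping listed in the commit message.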
bunnei
0d754d7a75
Merge pull request #2753 from FernandoS27/float-convert
...
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
2019-08-21 10:27:57 -04:00
ReinUsesLisp
b6272eb8e2
shader_ir: Implement NOP
2019-08-04 03:02:55 -03:00
Fernando Sahmkow
9a0fa90be2
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
...
This commit takes care of implementing the F16 Variants of the
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
ReinUsesLisp
edc43b2509
shader/half_set_predicate: Implement missing HSETP2 variants
2019-07-19 22:20:47 -03:00
Fernando Sahmkow
e221290cb7
Merge pull request #2695 from ReinUsesLisp/layer-viewport
...
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
2019-07-15 16:28:07 -04:00
Fernando Sahmkow
43662e376e
Merge pull request #2692 from ReinUsesLisp/tlds-f16
...
shader/texture: Add F16 support for TLDS
2019-07-14 08:44:38 -04:00
Fernando Sahmkow
d5d4cc30ec
shader_ir: Implement BRX & BRA.CC
2019-07-09 08:14:37 -04:00
ReinUsesLisp
a650406899
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
...
This commit implements gl_ViewportIndex and gl_Layer in vertex and
geometry shaders. In the case it's used in a vertex shader, it requires
ARB_shader_viewport_layer_array. This extension is available on AMD and
Nvidia devices (mesa and proprietary drivers), but not available on
Intel on any platform. At the time of writing this description, I don't
know whether this is a hardware limitation or a driver limitation.
In the case that ARB_shader_viewport_layer_array is not available,
writes to these registers on a vertex shader are ignored, with the
appropriate logging.
2019-07-07 20:42:55 -03:00
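The fallback described above could be sketched roughly as follows. Everything here is hypothetical and for illustration only: the `Stage` enum, the `EmitLayerWrite` helper and its parameters are not taken from the decompiler, and an empty return string stands in for "write ignored, caller logs it":

```cpp
#include <string>

// Hypothetical shader stage tag.
enum class Stage { Vertex, Geometry };

// Hypothetical helper: returns the GLSL statement for a gl_Layer write, or an
// empty string when the write has to be dropped (vertex stage without
// ARB_shader_viewport_layer_array). The caller is expected to log the drop.
inline std::string EmitLayerWrite(Stage stage, bool has_viewport_layer_array,
                                  const std::string& value) {
    if (stage == Stage::Vertex && !has_viewport_layer_array) {
        return std::string();
    }
    return "gl_Layer = " + value + ";";
}
```

Geometry shaders are unaffected by the extension check in this sketch, matching the commit's statement that the extension is only required when the write happens in a vertex shader.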
ReinUsesLisp
48d485d6df
shader/texture: Add F16 support for TLDS
2019-07-07 16:05:56 -03:00
ReinUsesLisp
7eed876cfb
shader_bytecode: Include missing <array>
2019-06-24 01:51:02 -03:00
ReinUsesLisp
224e4e174d
shader: Decode SUST and implement backing image functionality
2019-06-20 21:38:33 -03:00
ReinUsesLisp
27cd63a05a
shader: Implement texture buffers
2019-06-20 21:36:12 -03:00
Fernando Sahmkow
a8250f511b
shader_bytecode: Mark EXIT as flow instruction
2019-06-04 12:18:35 -04:00
ReinUsesLisp
68af52d525
shader/memory: Implement ST (generic memory)
2019-05-20 22:41:53 -03:00
ReinUsesLisp
71ded7da4e
shader/memory: Implement LD (generic memory)
2019-05-20 22:38:59 -03:00
ReinUsesLisp
5bf7324068
shader_ir/other: Implement IPA.IDX
2019-05-02 21:46:37 -03:00
ReinUsesLisp
f96020b2ae
shader_ir/memory: Implement physical input attributes
2019-05-02 21:46:25 -03:00
ReinUsesLisp
9a9902214e
shader_bytecode: Add AL2P decoding
2019-05-02 21:46:25 -03:00
bunnei
7fc67a06bb
Merge pull request #2407 from FernandoS27/f2f
...
Do some corrections in conversion shader instructions.
2019-04-20 00:42:34 -04:00
bunnei
c1c43bde80
Merge pull request #2348 from FernandoS27/guest-bindless
...
Implement Bindless Textures on Shader Decompiler and GL backend
2019-04-17 20:59:49 -04:00
bunnei
d4b42f6bc6
Merge pull request #2315 from ReinUsesLisp/severity-decompiler
...
shader_ir/decode: Reduce the severity of common assertions
2019-04-16 22:21:19 -04:00
Fernando Sahmkow
73f925a949
Do some corrections in conversion shader instructions.
...
Corrects encodings for I2F, F2F, I2I and F2I.
Implements immediate variants of all four conversion types.
Adds assertions for unimplemented features.
2019-04-15 19:16:27 -04:00
ReinUsesLisp
79e7fb6d6f
shader_ir: Implement STG, keep track of global memory usage and flush
2019-04-14 00:25:32 -03:00
bunnei
dd5989d907
Merge pull request #2366 from FernandoS27/xmad-fix
...
Correct XMAD mode, psl and high_b on different encodings.
2019-04-09 19:15:01 -04:00
Fernando Sahmkow
25e6fb72eb
Correct LOP_IMN encoding
2019-04-08 13:39:12 -04:00
Fernando Sahmkow
34b15b69df
Correct XMAD mode, psl and high_b on different encodings.
2019-04-08 13:01:17 -04:00
Fernando Sahmkow
f5792ffeab
Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format.
2019-04-08 11:36:11 -04:00
Fernando Sahmkow
2f456841b0
Implement TXQ_B
2019-04-08 11:29:52 -04:00
Fernando Sahmkow
8bb9877b70
Corrections to TEX_B
2019-04-08 11:28:44 -04:00
Fernando Sahmkow
ee9b2e3cdc
Implement Bindless Samplers and TEX_B in the IR.
2019-04-08 11:23:42 -04:00
ReinUsesLisp
f725007975
shader_ir/memory: Reduce severity of LD_L cache management and log it
2019-04-03 17:12:44 -03:00
ReinUsesLisp
c2ea1d5263
shader_ir/memory: Reduce severity of ST_L cache management and log it
2019-04-03 17:12:44 -03:00
bunnei
11ac277646
Merge pull request #2147 from ReinUsesLisp/texture-clean
...
shader_ir: Remove "extras" from the MetaTexture
2019-03-10 17:28:36 -04:00
Lioncash
f596ce7887
video_core/engines: Remove unnecessary includes
...
Removes a few unnecessary dependencies on core-related machinery, such
as core.h and memory.h, which reduces the amount of rebuilding
necessary if those files change.
This also uncovered some indirect dependencies within other source
files, which are fixed as well.
2019-03-05 20:35:32 -05:00
ReinUsesLisp
3b01587ca4
shader/decode: Remove extras from MetaTexture
2019-02-26 00:11:30 -03:00
ReinUsesLisp
8a7efd22ec
shader/decode: Split memory and texture instructions decoding
2019-02-26 00:11:30 -03:00
Fernando Sahmkow
e29f546bb7
shader_decompiler: Improve Accuracy of Attribute Interpolation.
2019-02-14 03:25:07 -04:00
Fernando Sahmkow
0f8f14a732
Corrected F2I None mode to RoundEven.
2019-02-11 18:46:45 -04:00
bunnei
38df722dc7
Merge pull request #2081 from ReinUsesLisp/lmem-64
...
shader_ir/memory: Add LD_L 64 bits loads
2019-02-05 09:17:48 -05:00
bunnei
66514e4190
Merge pull request #2082 from FernandoS27/txq-stl
...
Fix TXQ not using the component mask.
2019-02-04 20:22:32 -05:00
Mat M
6506dbc577
Update src/video_core/engines/shader_bytecode.h
...
Co-Authored-By: FernandoS27 <fsahmkow27@gmail.com>
2019-02-03 21:27:26 -04:00
Fernando Sahmkow
4133c86d71
Fix TXQ not using the component mask.
2019-02-03 18:17:18 -04:00
ReinUsesLisp
5ae8a056fe
shader_bytecode: Rename BytesN enums to BitsN
2019-02-03 00:25:40 -03:00