Commit graph

244 commits

Author SHA1 Message Date
ReinUsesLisp
9b001821d9 shader/shift: Implement SHR wrapped and clamped variants
Nvidia defaults to wrapped shifts, but this is undefined behaviour on
OpenGL's spec. Explicitly mask/clamp according to what the guest shader
requires.
2019-09-04 01:55:24 -03:00
bunnei
3df0f440fd Merge pull request #2812 from ReinUsesLisp/f2i-selector
shader_ir/conversion: Implement F2I and F2F F16 selector
2019-09-03 22:35:33 -04:00
bunnei
4ae7f81090 Merge pull request #2811 from ReinUsesLisp/fsetp-fix
float_set_predicate: Add missing negation bit for the second operand
2019-09-03 22:34:34 -04:00
ReinUsesLisp
6f134adf2a shader_ir/conversion: Split int and float selector and implement F2F H1 2019-08-28 16:09:33 -03:00
ReinUsesLisp
d9ad389777 shader_ir/conversion: Implement F2I F16 Ra.H1 2019-08-27 23:40:40 -03:00
ReinUsesLisp
d490cc5285 float_set_predicate: Add missing negation bit for the second operand 2019-08-27 21:57:43 -03:00
ReinUsesLisp
67f47b2f6a shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't match exactly to Nvidia's code

VOTE.ALL Rd, PT, PT;

Which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT on those cases) and not the registers.
2019-08-21 14:50:38 -03:00
bunnei
0d754d7a75 Merge pull request #2753 from FernandoS27/float-convert
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
2019-08-21 10:27:57 -04:00
ReinUsesLisp
b6272eb8e2 shader_ir: Implement NOP 2019-08-04 03:02:55 -03:00
Fernando Sahmkow
9a0fa90be2 Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
This commit takes care of implementing the F16 Variants of the 
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
ReinUsesLisp
edc43b2509 shader/half_set_predicate: Implement missing HSETP2 variants 2019-07-19 22:20:47 -03:00
Fernando Sahmkow
e221290cb7 Merge pull request #2695 from ReinUsesLisp/layer-viewport
gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
2019-07-15 16:28:07 -04:00
Fernando Sahmkow
43662e376e Merge pull request #2692 from ReinUsesLisp/tlds-f16
shader/texture: Add F16 support for TLDS
2019-07-14 08:44:38 -04:00
Fernando Sahmkow
d5d4cc30ec shader_ir: Implement BRX & BRA.CC 2019-07-09 08:14:37 -04:00
ReinUsesLisp
a650406899 gl_shader_decompiler: Implement gl_ViewportIndex and gl_Layer in vertex shaders
This commit implements gl_ViewportIndex and gl_Layer in vertex and
geometry shaders. In the case it's used in a vertex shader, it requires
ARB_shader_viewport_layer_array. This extension is available on AMD and
Nvidia devices (mesa and proprietary drivers), but not available on
Intel on any platform. At the moment of writing this description I don't
know if this is a hardware limitation or a driver limitation.

In the case that ARB_shader_viewport_layer_array is not available,
writes to these registers on a vertex shader are ignored, with the
appropriate logging.
2019-07-07 20:42:55 -03:00
ReinUsesLisp
48d485d6df shader/texture: Add F16 support for TLDS 2019-07-07 16:05:56 -03:00
ReinUsesLisp
7eed876cfb shader_bytecode: Include missing <array> 2019-06-24 01:51:02 -03:00
ReinUsesLisp
224e4e174d shader: Decode SUST and implement backing image functionality 2019-06-20 21:38:33 -03:00
ReinUsesLisp
27cd63a05a shader: Implement texture buffers 2019-06-20 21:36:12 -03:00
Fernando Sahmkow
a8250f511b shader_bytecode: Mark EXIT as flow instruction 2019-06-04 12:18:35 -04:00
ReinUsesLisp
68af52d525 shader/memory: Implement ST (generic memory) 2019-05-20 22:41:53 -03:00
ReinUsesLisp
71ded7da4e shader/memory: Implement LD (generic memory) 2019-05-20 22:38:59 -03:00
ReinUsesLisp
5bf7324068 shader_ir/other: Implement IPA.IDX 2019-05-02 21:46:37 -03:00
ReinUsesLisp
f96020b2ae shader_ir/memory: Implement physical input attributes 2019-05-02 21:46:25 -03:00
ReinUsesLisp
9a9902214e shader_bytecode: Add AL2P decoding 2019-05-02 21:46:25 -03:00
bunnei
7fc67a06bb Merge pull request #2407 from FernandoS27/f2f
Do some corrections in conversion shader instructions.
2019-04-20 00:42:34 -04:00
bunnei
c1c43bde80 Merge pull request #2348 from FernandoS27/guest-bindless
Implement Bindless Textures on Shader Decompiler and GL backend
2019-04-17 20:59:49 -04:00
bunnei
d4b42f6bc6 Merge pull request #2315 from ReinUsesLisp/severity-decompiler
shader_ir/decode: Reduce the severity of common assertions
2019-04-16 22:21:19 -04:00
Fernando Sahmkow
73f925a949 Do some corrections in conversion shader instructions.
Corrects encodings for I2F, F2F, I2I and F2I
Implements Immediate variants of all four conversion types.
Add assertions to unimplemented stuffs.
2019-04-15 19:16:27 -04:00
ReinUsesLisp
79e7fb6d6f shader_ir: Implement STG, keep track of global memory usage and flush 2019-04-14 00:25:32 -03:00
bunnei
dd5989d907 Merge pull request #2366 from FernandoS27/xmad-fix
Correct XMAD mode, psl and high_b on different encodings.
2019-04-09 19:15:01 -04:00
Fernando Sahmkow
25e6fb72eb Correct LOP_IMN encoding 2019-04-08 13:39:12 -04:00
Fernando Sahmkow
34b15b69df Correct XMAD mode, psl and high_b on different encodings. 2019-04-08 13:01:17 -04:00
Fernando Sahmkow
f5792ffeab Move ConstBufferAccessor to Maxwell3d, correct mistakes and clang format. 2019-04-08 11:36:11 -04:00
Fernando Sahmkow
2f456841b0 Implement TXQ_B 2019-04-08 11:29:52 -04:00
Fernando Sahmkow
8bb9877b70 Corrections to TEX_B 2019-04-08 11:28:44 -04:00
Fernando Sahmkow
ee9b2e3cdc Implement Bindless Samplers and TEX_B in the IR. 2019-04-08 11:23:42 -04:00
ReinUsesLisp
f725007975 shader_ir/memory: Reduce severity of LD_L cache management and log it 2019-04-03 17:12:44 -03:00
ReinUsesLisp
c2ea1d5263 shader_ir/memory: Reduce severity of ST_L cache management and log it 2019-04-03 17:12:44 -03:00
bunnei
11ac277646 Merge pull request #2147 from ReinUsesLisp/texture-clean
shader_ir: Remove "extras" from the MetaTexture
2019-03-10 17:28:36 -04:00
Lioncash
f596ce7887 video_core/engines: Remove unnecessary includes
Removes a few unnecessary dependencies on core-related machinery, such
as the core.h and memory.h, which reduces the amount of rebuilding
necessary if those files change.

This also uncovered some indirect dependencies within other source
files. This also fixes those.
2019-03-05 20:35:32 -05:00
ReinUsesLisp
3b01587ca4 shader/decode: Remove extras from MetaTexture 2019-02-26 00:11:30 -03:00
ReinUsesLisp
8a7efd22ec shader/decode: Split memory and texture instructions decoding 2019-02-26 00:11:30 -03:00
Fernando Sahmkow
e29f546bb7 shader_decompiler: Improve Accuracy of Attribute Interpolation. 2019-02-14 03:25:07 -04:00
Fernando Sahmkow
0f8f14a732 Corrected F2I None mode to RoundEven. 2019-02-11 18:46:45 -04:00
bunnei
38df722dc7 Merge pull request #2081 from ReinUsesLisp/lmem-64
shader_ir/memory: Add LD_L 64 bits loads
2019-02-05 09:17:48 -05:00
bunnei
66514e4190 Merge pull request #2082 from FernandoS27/txq-stl
Fix TXQ not using the component mask.
2019-02-04 20:22:32 -05:00
Mat M
6506dbc577 Update src/video_core/engines/shader_bytecode.h
Co-Authored-By: FernandoS27 <fsahmkow27@gmail.com>
2019-02-03 21:27:26 -04:00
Fernando Sahmkow
4133c86d71 Fix TXQ not using the component mask. 2019-02-03 18:17:18 -04:00
ReinUsesLisp
5ae8a056fe shader_bytecode: Rename BytesN enums to BitsN 2019-02-03 00:25:40 -03:00