Commit graph

49 commits

Author SHA1 Message Date
MrPurple666
2d2e9208d2 Unified torzu and sudachi friend.cpp + fix android build on dma_pusher 2025-04-04 03:40:49 +02:00
Zephyron
0071e980b8 video_core: Enforce safe memory reads for compute dispatch
- Modify DmaPusher to use safe memory reads when handling compute
  operations at High GPU accuracy
- Prevent potential memory corruption issues that could lead to
  invalid dispatch parameters
- Previously, unsafe reads could result in corrupted launch_description
  data in KeplerCompute::ProcessLaunch, causing invalid vkCmdDispatch
  calls
- By enforcing safe reads specifically for compute operations, we
  maintain performance for other GPU tasks while ensuring compute
  dispatch stability

This change requires >= High GPU accuracy level to take effect.
2025-04-04 03:40:49 +02:00
Fernando Sahmkow
b206089ea7 Core: Clang format and other small issues. 2024-01-18 21:12:30 -05:00
Fernando Sahmkow
9db159da71 SMMU: Initial adaptation to video_core. 2024-01-18 21:12:30 -05:00
Fernando Sahmkow
94dd857cda VideoCore: Implement DispatchIndirect 2023-08-27 04:26:22 +02:00
Fernando Sahmkow
8208becc49 DMA Pusher: Fix regression caused by guest memory optimizations 2023-08-26 22:00:43 +02:00
Kelebek1
42638691b5 Use spans over guest memory where possible instead of copying data. 2023-07-02 23:09:48 +01:00
Fernando Sahmkow
4bf1ee5bdc DMAPusher: Improve collection of non executing methods 2023-01-01 16:43:57 -05:00
Fernando Sahmkow
d2643a61c3 Revert Buffer cache changes and setup additional macros. 2023-01-01 16:43:57 -05:00
Fernando Sahmkow
12a76465b9 MacroHLE: Reduce massive calculations on sizing estimation. 2023-01-01 16:43:57 -05:00
Fernando Sahmkow
b4fcb0b2b2 MacroHLE: Refactor MacroHLE system. 2023-01-01 16:43:57 -05:00
Fernando Sahmkow
b5b0ec9429 MacroHLE: Implement DrawIndexedIndirect & DrawArraysIndirect. 2023-01-01 16:43:57 -05:00
Fernando Sahmkow
f2f2784817 MacroHLE: Add MultidrawIndirect HLE Macro. 2023-01-01 16:43:57 -05:00
ameerj
4d5adfb3c9 scratch_buffer: Explicitly defining resize and resize_destructive functions
resize keeps previous data intact when the buffer grows
resize_destructive destroys the previous data when the buffer grows
2022-12-19 22:40:50 -05:00
ameerj
284582a0b2 dma_pusher: Rework command_headers usage
Uses ScratchBuffer and avoids overwriting the command_headers buffer with the prefetch_command_list
2022-12-19 18:08:04 -05:00
Fernando Sahmkow
42ef10060a VideoCore: Refactor fencing system. 2022-10-06 21:00:52 +02:00
Fernando Sahmkow
8847b6645c VideoCore: implement channels on gpu caches. 2022-10-06 21:00:51 +02:00
Morph
2b87305d31 general: Convert source file copyright comments over to SPDX
This formats all copyright comments according to SPDX formatting guidelines.
Additionally, this resolves the remaining GPLv2 only licensed files by relicensing them to GPLv2.0-or-later.
2022-04-23 05:55:32 -04:00
ameerj
b837219423 video_core: Reduce unused includes 2022-03-19 15:01:31 -04:00
Fernando Sahmkow
d9fc759460 BufferCache: Additional download fixes. 2021-07-09 22:20:36 +02:00
ReinUsesLisp
2dfce2fca6 video_core: Reimplement the buffer cache
Reimplement the buffer cache using cached bindings and page level
granularity for modification tracking. This also drops the usage of
shared pointers and virtual functions from the cache.

- Bindings are cached, allowing to skip work when the game changes few
  bits between draws.
- OpenGL Assembly shaders no longer copy when a region has been modified
  from the GPU to emulate constant buffers, instead GL_EXT_memory_object
  is used to alias sub-buffers within the same allocation.
- OpenGL Assembly shaders stream constant buffer data using
  glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In
  theory this should save one hash table resolve inside the driver
  compared to glBufferSubData.
- A new OpenGL stream buffer is implemented based on fences for drivers
  that are not Nvidia's proprietary, due to their low performance on
  partial glBufferSubData calls synchronized with 3D rendering (that
  some games use a lot).
- Most optimizations are shared between APIs now, allowing Vulkan to
  cache more bindings than before, skipping unnecessary work.

This commit adds the necessary infrastructure to use Vulkan objects from
OpenGL. Overall, it improves performance and fixes some bugs present in
the old cache. There are still some edge cases hit by some games that
harm performance on some vendors; these are planned to be fixed in later
commits.
2021-02-13 02:17:22 -03:00
Lioncash
2f181b6a90 video_core: Resolve more variable shadowing scenarios
Resolves variable shadowing scenarios up to the end of the OpenGL code
to make it nicer to review. The rest will be resolved in a following
commit.
2020-12-04 16:19:09 -05:00
bunnei
0b6324b3a6 video_core: dma_pusher: Remove integrity check on command lists.
- This seems to cause softlocks in Breath of the Wild.
2020-11-07 00:08:19 -08:00
bunnei
af7ab45b45 video_core: dma_pusher: Add support for integrity checks.
- Log corrupted command lists, rather than crash.
2020-11-01 01:52:38 -07:00
bunnei
69f4a66d23 video_core: dma_pusher: Add support for prefetched command lists. 2020-11-01 01:52:38 -07:00
David Marcec
67d7c0f45e DmaPusher: Remove dead code in step 2020-05-16 12:42:27 +10:00
Fernando Sahmkow
4c11487d1e VideoCore/GPU: Delegate subchannel engines to the dma pusher. 2020-04-27 22:07:21 -04:00
Fernando Sahmkow
ef3a0ae64a DMAPusher: Propagate multimethod writes into the engines. 2020-04-23 08:52:55 -04:00
Fernando Sahmkow
fda21f5a93 GPU: Delay Fences. 2020-04-22 11:36:08 -04:00
Fernando Sahmkow
de53bc96c0 BufferCache: Implement OnCPUWrite and SyncGuestHost 2020-04-22 11:36:07 -04:00
Fernando Sahmkow
c689dc6804 GPU: Refactor synchronization on Async GPU 2020-04-22 11:36:06 -04:00
Lioncash
8a37c63b9e dma_pusher: Remove reliance on the global system instance
With this, the video core now has no calls to the global system
instance at all.
2020-04-19 16:12:08 -04:00
ReinUsesLisp
005f5ca883 video_core: Reintroduce dirty flags infrastructure 2020-02-28 17:56:41 -03:00
ReinUsesLisp
c2d3732176 gl_rasterizer: Remove dirty flags 2020-02-28 16:39:27 -03:00
Fernando Sahmkow
e82d641357 GPU: Flush commands on every dma pusher step.
This commit ensures that the host gpu is constantly fed with commands to
work with, while the guest gpu keeps producing the rest of the commands.
This reduces syncing time between host and guest gpu.
2019-07-26 16:54:22 -04:00
Fernando Sahmkow
7c50842226 Maxwell3D: Rework the dirty system to be more consistent and scalable 2019-07-17 17:29:49 -04:00
Fernando Sahmkow
fc9a1b81cb Dma_pusher: ASSERT on empty command_list
This is a measure to avoid crashes on command list reading as an empty
command_list is considered a NOP.
2019-05-19 10:48:31 -04:00
bunnei
673cfd89c1 Merge pull request #2322 from ReinUsesLisp/wswitch
video_core: Silence -Wswitch warnings
2019-04-28 22:24:58 -04:00
ReinUsesLisp
7a56d07632 video_core: Silence -Wswitch warnings 2019-04-18 15:54:39 -03:00
Fernando Sahmkow
994393bd02 Use ReadBlockUnsafe for fetching DMA CommandLists 2019-04-16 11:22:34 -04:00
Lioncash
44d91d561a video_core/texures/texture: Remove unnecessary includes
Nothing in this header relies on common_funcs or the memory manager.

This gets rid of reliance on indirect inclusions in the OpenGL caches.
2019-04-06 00:03:35 -04:00
bunnei
d3f26c1546 video_core: Refactor to use MemoryManager interface for all memory access.
# Conflicts:
#	src/video_core/engines/kepler_memory.cpp
#	src/video_core/engines/maxwell_3d.cpp
#	src/video_core/morton.cpp
#	src/video_core/morton.h
#	src/video_core/renderer_opengl/gl_global_cache.cpp
#	src/video_core/renderer_opengl/gl_global_cache.h
#	src/video_core/renderer_opengl/gl_rasterizer_cache.cpp
2019-03-16 00:38:48 -04:00
ReinUsesLisp
81ff2a51ad dma_pusher: Store command_list_header by copy
Instead of holding a reference that will get invalidated by
dma_pushbuffer.pop(), hold it as a copy. This doesn't have any
performance cost since CommandListHeader is 8 bytes long.
2019-03-08 04:06:54 -03:00
Markus Wick
00fa708e04 video_core/dma_pusher: Simplify Step() logic.
As fetching command list headers and the list of command headers is a fixed 1:1 relation now, they can be implemented within a single call.
This cleans up the Step() logic quite a bit.
2019-02-19 10:28:42 +01:00
Markus Wick
0faab8fe2c video_core/dma_pusher: Fetch the full list of headers at once.
Fetching every u32 from memory leads to a big overhead. So let's fetch all of them as a block if possible.
This reduces the Memory::* calls by the dma_pusher by a factor of 10.
2019-02-19 09:58:38 +01:00
ReinUsesLisp
af1543712d video_core: Assert on invalid GPU to CPU address queries 2019-02-03 04:58:40 -03:00
bunnei
a86364480f dma_pushbuffer: Optimize to avoid loop and copy on Push. 2018-11-27 19:17:33 -05:00
bunnei
9266f76fb2 gpu: Move command list profiling to DmaPusher::DispatchCalls. 2018-11-27 18:42:21 -05:00
bunnei
f8b215e361 gpu: Rewrite GPU command list processing with DmaPusher class.
- More accurate impl., fixes Undertale (among other games).
2018-11-26 23:14:01 -05:00