Commit graph

87 commits

Author SHA1 Message Date
ameerj
899cf73819 vk_blit_screen: Fix non-accelerated texture size calculation
Addresses the potential OOB access in UnswizzleTexture.
2021-08-16 14:28:10 -04:00
yzct12345
46e4e6707f decoders: Optimize swizzle copy performance (#6790)
This makes UnswizzleTexture up to two times faster. It is the main bottleneck in NVDEC video decoding.
2021-08-02 11:18:58 -04:00
lat9nq
8aed753c16 decoders: Break instead of continue
continue causes a memory leak in A Hat in Time.
2021-06-04 05:12:14 -04:00
lat9nq
54f2a92203 decoders: Avoid out-of-bounds access
This is not a real fix, so assert here and continue before crashing.
2021-06-04 05:03:54 -04:00
ameerj
cac341dbc7 renderer_vulkan: Accelerate ASTC decoding
Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2021-03-13 12:16:03 -05:00
ReinUsesLisp
4e4056f581 common/alignment: Rename AlignBits to AlignUpLog2
AlignUpLog2 describes what the function does better than AlignBits.
2021-01-15 04:13:33 -03:00
ReinUsesLisp
d25b097e84 video_core: Rewrite the texture cache
The current texture cache has several points that hurt maintainability
and performance. It's easy to break unrelated parts of the cache
when doing minor changes. The cache can easily forget valuable
information about the cached textures by CPU writes or simply by its
normal usage.The current texture cache has several points that hurt
maintainability and performance. It's easy to break unrelated parts
of the cache when doing minor changes. The cache can easily forget
valuable information about the cached textures by CPU writes or simply
by its normal usage.

This commit aims to address those issues.
2020-12-30 03:38:50 -03:00
ReinUsesLisp
9a83f8794b textures/decoders: Fix block linear to pitch copies
There were two issues with block linear copies. First the swizzling was
wrong and this commit reimplements them.

The other issue was that these copies are generally used to download
render targets from the GPU and yuzu was not downloading them from
host GPU memory unless the extreme GPU accuracy setting was selected.
This commit enables cached memory reads for all accuracy levels.

- Fixes level thumbnails in Super Mario Maker 2.
2020-08-10 20:45:03 -03:00
bunnei
2e4a5d2110 Merge pull request #4324 from ReinUsesLisp/formats
video_core: Fix, add and rename pixel formats
2020-07-21 00:13:04 -04:00
ReinUsesLisp
a068ce4c32 video_core: Rearrange pixel format names
Normalizes pixel format names to match Vulkan names. Previous to this
commit pixel formats had no convention, leading to confusion and
potential bugs.
2020-07-13 01:44:23 -03:00
ReinUsesLisp
2691f5e6b9 video_core/textures: Add and use SwizzleSliceToVoxel, and minor style changes
Change GOB sizes from free-functions to constexpr constants.

Add SwizzleSliceToVoxel, a function that swizzles a 2D array of pixels
into a 3D texture and use it for 3D copies.
2020-07-10 04:09:32 -03:00
Fernando Sahmkow
4d2b23d3c5 MaxwellDMA: Optimize micro copies. 2020-04-28 13:44:14 -04:00
Lioncash
eaeb4520f7 General: Resolve warnings related to missing declarations 2020-04-16 23:43:34 -04:00
Fernando Sahmkow
f1adfe6591 MaxwellDMA: Fixes, corrections and relaxations.
This commit fixes offsets on Linear -> Tiled copies, corrects z pos
fortiled->linear copies, corrects bytes_per_pixel calculation in tiled
-> linear copies and relaxes some limitations set by latest dma fixes
refactors.
2019-07-25 20:41:42 -04:00
Fernando Sahmkow
b62d22a77d texture_cache: Style and Corrections 2019-06-20 21:24:47 -04:00
Fernando Sahmkow
18322c1369 decoders: correct block calculation 2019-06-20 21:38:34 -03:00
ReinUsesLisp
1d10810d2b video_core: Use un-shifted block sizes to avoid integer divisions
Instead of storing all block width, height and depths in their shifted
form:

block_width = 1U << block_shift;

Store them like they are provided by the emulated hardware (their
block_shift form). This way we can avoid doing the costly
Common::AlignUp operation to align texture sizes and drop CPU integer
divisions with bitwise logic (defined in Common::AlignBits).
2019-06-20 21:36:12 -03:00
ReinUsesLisp
f46a979f19 gl_texture_cache: Add fast copy path 2019-06-20 21:36:11 -03:00
ReinUsesLisp
3b430b5605 gl_texture_cache: Initial implementation 2019-06-20 21:36:11 -03:00
Fernando Sahmkow
56c2b0ea86 Apply Const correctness to SwizzleKepler and replace u32 for size_t on iterators. 2019-04-16 12:00:46 -04:00
Fernando Sahmkow
15368c6070 Implement Block Linear copies in Kepler Memory. 2019-04-15 21:22:16 -04:00
bunnei
d3f26c1546 video_core: Refactor to use MemoryManager interface for all memory access.
# Conflicts:
#	src/video_core/engines/kepler_memory.cpp
#	src/video_core/engines/maxwell_3d.cpp
#	src/video_core/morton.cpp
#	src/video_core/morton.h
#	src/video_core/renderer_opengl/gl_global_cache.cpp
#	src/video_core/renderer_opengl/gl_global_cache.h
#	src/video_core/renderer_opengl/gl_rasterizer_cache.cpp
2019-03-16 00:38:48 -04:00
ReinUsesLisp
3989075e5f gl_rasterizer_cache: Move format conversion to its own file 2019-02-26 20:08:27 -03:00
ReinUsesLisp
64612bf940 decoders: Minor style changes 2019-02-26 20:08:27 -03:00
David Marcec
1dfb0a513a Fixed uninitialized memory due to missing returns in canary
Functions which are suppose to crash on non canary builds usually don't return anything which lead to uninitialized memory being used.
2018-12-19 12:52:32 +11:00
FernandoS27
b509890e4c Implemented Tile Width Spacing 2018-11-26 09:05:12 -04:00
bunnei
e84cceb645 Merge pull request #1717 from FreddyFunk/swizzle-gob
textures/decoders: Replace magic numbers
2018-11-18 20:13:00 -08:00
Frederic L
d2dd9cfc1d Eliminated unnessessary memory allocation and copy (#1702) 2018-11-18 19:53:03 -08:00
Frederic Laing
9d588877ef textures/decoders: Replace magic numbers 2018-11-17 01:55:28 +01:00
Frederic Laing
8dcfc75e4e textures/decoders: Minor cleanup 2018-11-15 21:04:17 +01:00
greggameplayer
ec188ec832 Implement ASTC_2D_10X8 & ASTC_2D_10X8_SRGB (#1666)
* Implement ASTC_2D_10X8 & ASTC_2D_10X8_SRGB
( needed by Mario+Rabbids Kingdom Battle )

* Small placement correction
2018-11-12 18:34:54 -08:00
FernandoS27
fe596b4c6e Fix ASTC formats 2018-11-01 13:08:19 -04:00
bunnei
6bf7e0d83d Merge pull request #1524 from FernandoS27/layers-fix
rasterizer: Fix Layered Textures Loading and Cubemaps
2018-10-25 00:29:18 -04:00
Lioncash
a3c2defab1 decoders: Remove unused variable within SwizzledData() 2018-10-23 23:51:13 -04:00
FernandoS27
88ff802e1a Fixed Layered Textures Loading and Cubemaps 2018-10-23 14:27:36 -04:00
bunnei
fa24c17b95 decoders: Introduce functions for un/swizzling subrects. 2018-10-18 22:41:43 -04:00
bunnei
7ce1c004c4 Merge pull request #1488 from Hexagon12/astc-types
video_core: Added ASTC 5x4; 8x5 types
2018-10-14 14:44:24 -04:00
FernandoS27
39ba6950b8 Shorten the implementation of 3D swizzle to only 3 functions 2018-10-13 20:58:00 -04:00
FernandoS27
92e9faba25 Fix a Crash on Zelda BotW and Splatoon 2, and simplified LoadGLBuffer 2018-10-13 16:11:11 -04:00
FernandoS27
1a70753709 Propagate depth and depth_block on modules using decoders 2018-10-13 15:25:18 -04:00
FernandoS27
8b1e913058 Remove old Swizzle algorithms and use 3d Swizzle 2018-10-13 15:25:17 -04:00
FernandoS27
2650e33c48 Implement Precise 3D Swizzle 2018-10-13 15:25:16 -04:00
FernandoS27
8b32bd526b Implement Fast 3D Swizzle 2018-10-13 15:25:15 -04:00
Hexagon12
f50514b25b Added ASTC 5x4; 8x5 2018-10-13 17:10:26 +03:00
FernandoS27
eec2311ec1 Implemented helper function to correctly calculate a texture's size 2018-10-12 14:21:53 -04:00
FernandoS27
9acd217fed Reverse stride align restriction on FastSwizzle due to lost performance 2018-09-21 12:09:59 -04:00
FernandoS27
ec51077131 Join both Swizzle methods within one interface function 2018-09-21 11:42:34 -04:00
FernandoS27
dc98e08d51 Standarized Legacy Swizzle to look alike FastSwizzle and use a Swizzling Table instead 2018-09-21 11:34:54 -04:00
FernandoS27
55543db875 Remove same output bpp restriction on FastSwizzle 2018-09-21 11:10:44 -04:00
FernandoS27
0bd735d574 Improved Legacy Swizzler to be better documented and work better 2018-09-21 10:57:12 -04:00