11249 commits

Author SHA1 Message Date
Lioncash
46a7c8826b common/swap: Improve codegen of the default swap fallbacks
Uses arithmetic that compilers can more readily recognize and optimize.
For example, rather than shifting the halves of the value and then
swapping and combining them, we can swap them in place.

For example, with the original swap32 code on x86-64, clang 8.0 would generate:

    mov     ecx, edi
    rol     cx, 8
    shl     ecx, 16
    shr     edi, 16
    rol     di, 8
    movzx   eax, di
    or      eax, ecx
    ret

while GCC 8.3 would generate the ideal:

    mov     eax, edi
    bswap   eax
    ret

Now both generate the same optimal output.

MSVC used to generate the following with the old code:

    mov     eax, ecx
    rol     cx, 8
    shr     eax, 16
    rol     ax, 8
    movzx   ecx, cx
    movzx   eax, ax
    shl     ecx, 16
    or      eax, ecx
    ret     0

Now MSVC also generates a result that differs slightly from, but is just as optimal as, that of clang/GCC:

    bswap   ecx
    mov     eax, ecx
    ret     0

====

In the swap64 case, for the original code, clang 8.0 would generate:

    mov     eax, edi
    bswap   eax
    shl     rax, 32
    shr     rdi, 32
    bswap   edi
    or      rax, rdi
    ret

(almost there, but still missing the mark)

while, again, GCC 8.3 would generate the more ideal:

    mov     rax, rdi
    bswap   rax
    ret

Now clang also generates the optimal sequence for this fallback.

This is a case where MSVC unfortunately falls short. Despite the new
code, it still generates a doozy of an output:

    mov     r8, rcx
    mov     r9, rcx
    mov     rax, 71776119061217280
    mov     rdx, r8
    and     r9, rax
    and     edx, 65280
    mov     rax, rcx
    shr     rax, 16
    or      r9, rax
    mov     rax, rcx
    shr     r9, 16
    mov     rcx, 280375465082880
    and     rax, rcx
    mov     rcx, 1095216660480
    or      r9, rax
    mov     rax, r8
    and     rax, rcx
    shr     r9, 16
    or      r9, rax
    mov     rcx, r8
    mov     rax, r8
    shr     r9, 8
    shl     rax, 16
    and     ecx, 16711680
    or      rdx, rax
    mov     eax, -16777216
    and     rax, r8
    shl     rdx, 16
    or      rdx, rcx
    shl     rdx, 16
    or      rax, rdx
    shl     rax, 8
    or      rax, r9
    ret     0

which is pretty unfortunate.
2019-04-12 00:07:39 -04:00
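
The fallback style described in 46a7c8826b, where each byte is masked and moved directly to its final position, can be sketched roughly as follows. This is an illustrative sketch and not necessarily the exact code from the commit; the `*_fallback` names are hypothetical.

```cpp
#include <cstdint>

// Hypothetical names: a mask-and-shift byte swap in which every byte is
// moved straight to its destination, a pattern compilers tend to
// recognize as a single bswap.
constexpr std::uint16_t swap16_fallback(std::uint16_t v) noexcept {
    return static_cast<std::uint16_t>((v >> 8) | (v << 8));
}

constexpr std::uint32_t swap32_fallback(std::uint32_t v) noexcept {
    return ((v & 0xFF000000U) >> 24) | ((v & 0x00FF0000U) >> 8) |
           ((v & 0x0000FF00U) << 8) | ((v & 0x000000FFU) << 24);
}

constexpr std::uint64_t swap64_fallback(std::uint64_t v) noexcept {
    return ((v & 0xFF00000000000000ULL) >> 56) |
           ((v & 0x00FF000000000000ULL) >> 40) |
           ((v & 0x0000FF0000000000ULL) >> 24) |
           ((v & 0x000000FF00000000ULL) >> 8) |
           ((v & 0x00000000FF000000ULL) << 8) |
           ((v & 0x0000000000FF0000ULL) << 24) |
           ((v & 0x000000000000FF00ULL) << 40) |
           ((v & 0x00000000000000FFULL) << 56);
}
```

Whether a given compiler collapses this into a single instruction depends on its version and optimization level, as the listings in the commit message show.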
Lioncash
0d2300473a common/swap: Mark byte swapping free functions with [[nodiscard]] and noexcept
Allows the compiler to warn when the result of a swap function is
ignored (which is 100% a bug in all usage scenarios). We also mark
them noexcept so that functions using them can themselves be marked
noexcept and play nicely with things that potentially inspect
"nothrowability".
2019-04-11 20:42:44 -04:00
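
As a rough illustration of what 0d2300473a describes, the sketch below marks a swap function with both specifiers. The function names are hypothetical and the body is a plain fallback, not the project's code.

```cpp
#include <cstdint>

// Hypothetical example function. [[nodiscard]] asks the compiler to
// diagnose calls whose result is dropped; noexcept states that the
// function cannot throw.
[[nodiscard]] constexpr std::uint16_t swap16_example(std::uint16_t v) noexcept {
    return static_cast<std::uint16_t>((v >> 8) | (v << 8));
}

// Because swap16_example is noexcept, a function built on top of it can
// honestly be declared noexcept as well.
[[nodiscard]] constexpr std::uint16_t read_swapped_example(std::uint16_t raw) noexcept {
    return swap16_example(raw);
}

// Code that inspects "nothrowability" can observe the guarantee.
static_assert(noexcept(swap16_example(std::uint16_t{0})),
              "swap16_example is expected to be noexcept");

// A statement such as `swap16_example(value);` on its own would normally
// draw a discarded-result warning on compilers that implement the attribute.
```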
Lioncash
d9fa38ef42 common/swap: Simplify swap function ifdefs
Including every OS's own built-in byte swapping functions is kind of
undesirable, since each one adds yet another build path on which we
have to ensure compilation succeeds.

Given we only support clang, GCC, and MSVC for the time being, we can
utilize their built-in functions directly instead of going through the
OS's API functions.

This shrinks the overall code down to just:

if (msvc)
  use msvc's functions
else if (clang or gcc)
  use clang/gcc's builtins
else
  use the slow path
2019-04-11 20:36:19 -04:00
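
The three-way selection outlined in d9fa38ef42 might look something like the following for the 32-bit case. This is a sketch under assumptions: the function name is hypothetical, and the exact macros tested in the real header may differ. `_byteswap_ulong` and `__builtin_bswap32` are the MSVC intrinsic and the GCC/clang builtin respectively.

```cpp
#include <cstdint>

#if defined(_MSC_VER)
#include <cstdlib>  // declares _byteswap_ulong on MSVC
#endif

// Hypothetical name; selects a byte swap implementation per compiler.
[[nodiscard]] inline std::uint32_t swap32_example(std::uint32_t v) noexcept {
#if defined(_MSC_VER)
    // MSVC's intrinsic; unsigned long is 32 bits wide on this compiler.
    return _byteswap_ulong(v);
#elif defined(__clang__) || defined(__GNUC__)
    // Builtin shared by clang and GCC.
    return __builtin_bswap32(v);
#else
    // Slow path: portable mask-and-shift fallback.
    return ((v & 0xFF000000U) >> 24) | ((v & 0x00FF0000U) >> 8) |
           ((v & 0x0000FF00U) << 8) | ((v & 0x000000FFU) << 24);
#endif
}
```

Keying on the compiler rather than the OS means only three paths need to build, and the last one is plain standard C++.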
Lioncash
424250354c common/swap: Remove 32-bit ARM path
We don't plan to support host 32-bit ARM execution environments, so this
is essentially dead code.
2019-04-11 20:15:47 -04:00
bunnei
1302d026a1 Merge pull request #2354 from lioncash/header
video_core/texures/texture: Remove unnecessary includes
2019-04-09 19:19:41 -04:00
bunnei
53c9e7aab2 Merge pull request #1957 from DarkLordZach/title-provider
file_sys: Provide generic interface for accessing game data
2019-04-09 19:16:37 -04:00
bunnei
dd5989d907 Merge pull request #2366 from FernandoS27/xmad-fix
Correct XMAD mode, psl and high_b on different encodings.
2019-04-09 19:15:01 -04:00
bunnei
4eeae8de2e Merge pull request #2132 from FearlessTobi/port-4437
Port citra-emu/citra#4437: "citra-qt: Make hotkeys configurable via the GUI (Attempt 2)"
2019-04-09 18:08:30 -04:00
bunnei
dd10b8d841 Merge pull request #2370 from lioncash/qt-warn
yuzu/loading_screen: Resolve runtime Qt string formatting warnings
2019-04-09 17:21:18 -04:00
bunnei
c26108eca5 Merge pull request #2369 from FernandoS27/mip-align
gl_backend: Align Pixel Storage
2019-04-09 17:20:43 -04:00
bunnei
0e344dddc0 Merge pull request #2368 from FernandoS27/fix-lop
Correct LOP_IMM encoding
2019-04-09 17:19:56 -04:00
Hexagon12
cd4e6af512 Merge pull request #2371 from lioncash/pagetable
kernel/process: Set page table when page table resizes occur.
2019-04-09 20:13:37 +03:00
Lioncash
9e3d4595b7 kernel/process: Set page table when page table resizes occur.
We need to ensure dynarmic gets a valid pointer if the page table is
resized, since the pointers it previously held would be invalidated.

The page table can be resized depending on what kind of address space
is specified within the NPDM metadata (if it's present).
2019-04-09 13:00:56 -04:00
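
The hazard addressed in 9e3d4595b7 can be illustrated with a simplified model. Everything below is hypothetical (the types, the member names, and the assumption that the JIT caches a raw pointer into vector-backed storage); it only shows why a consumer has to be given the pointer again after a resize.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a page table backed by resizable storage.
struct PageTableExample {
    std::vector<std::uint8_t*> pointers;

    void Resize(std::size_t num_pages) {
        // Resizing may reallocate, invalidating earlier data() pointers.
        pointers.resize(num_pages);
    }
};

// Hypothetical stand-in for a JIT that caches a raw pointer to the table.
struct JitExample {
    std::uint8_t** page_table = nullptr;

    void SetPageTable(PageTableExample& table) {
        page_table = table.pointers.data();
    }
};

// Resize first, then hand the (possibly new) pointer to the consumer.
inline void ResizeAndRebind(PageTableExample& table, JitExample& jit,
                            std::size_t num_pages) {
    table.Resize(num_pages);
    jit.SetPageTable(table);
}
```

Pairing the resize and the rebind in one function makes it hard to perform one without the other.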
Lioncash
af836c2968 yuzu/loading_screen: Resolve runtime Qt string formatting warnings
In our error console, when loading a game, the strings:

QString::arg: Argument missing: "Loading...", 0
QString::arg: Argument missing: "Launching...", 0

would occasionally pop up when the loading screen was running. This was
because all of the strings were assumed to contain formatting
indicators, whereas only two of the four strings actually have them.

This change applies the arguments only to the strings that have
formatting specifiers, which prevents these warnings from occurring.
2019-04-09 10:49:38 -04:00
Fernando Sahmkow
7f9e792814 gl_backend: Align Pixel Storage
This commit makes sure GL reads use the correct pack size for the
respective texture buffer.
2019-04-08 17:16:02 -04:00
Fernando Sahmkow
25e6fb72eb Correct LOP_IMN encoding
2019-04-08 13:39:12 -04:00
Fernando Sahmkow
34b15b69df Correct XMAD mode, psl and high_b on different encodings.
2019-04-08 13:01:17 -04:00
bunnei
74386a009b Merge pull request #2300 from FernandoS27/null-shader
shader_cache: Permit a Null Shader in case of a bad host_ptr.
2019-04-07 17:58:27 -04:00
bunnei
233ce811cf Merge pull request #2355 from ReinUsesLisp/sync-point
maxwell_3d: Reduce severity of ProcessSyncPoint
2019-04-07 17:56:11 -04:00
bunnei
b0512f8cf5 Merge pull request #2359 from FearlessTobi/port-2-prs
Port citra-emu/citra#4718: "fix clang-format target when using a path with spaces on windows"
2019-04-07 17:54:57 -04:00
bunnei
63dfb003f3 Merge pull request #2306 from ReinUsesLisp/aoffi
shader_ir: Implement AOFFI for TEX and TLD4
2019-04-07 17:52:30 -04:00
bunnei
d0146b9856 Merge pull request #2361 from lioncash/pagetable
core/memory: Minor simplifications to page table management
2019-04-07 17:50:31 -04:00
bunnei
822d91bc35 Merge pull request #2321 from ReinUsesLisp/gl-state-rework
gl_state: Rework to enable individual applies
2019-04-07 17:50:07 -04:00
bunnei
490390548a Merge pull request #2098 from FreddyFunk/disk-cache-zstd
gl_shader_disk_cache: Use Zstandard for compression
2019-04-07 17:48:33 -04:00
bunnei
be75b51381 Merge pull request #2356 from lioncash/pair
kernel/{server_port, server_session}: Return pairs instead of tuples from pair creation functions
2019-04-07 17:48:00 -04:00
bunnei
16f575a536 Merge pull request #2362 from lioncash/enum
core/memory: Remove unused enum constants
2019-04-07 17:46:09 -04:00
bunnei
01e7a22754 Merge pull request #2352 from bunnei/mem-manager-fixes
memory_manager: Improved implementation of read/write/copy block.
2019-04-07 17:44:59 -04:00
Fernando Sahmkow
a576cd4a8c Permit a Null Shader in case of a bad host_ptr.
2019-04-07 07:52:01 -04:00
Lioncash
024bdfdc08 core/memory: Remove unused enum constants
These are holdovers from Citra and can be removed.
2019-04-07 03:04:55 -04:00
Lioncash
08424ab57f core/memory: Remove GetCurrentPageTable()
Now that nothing actually touches the internal page table aside from the
memory subsystem itself, we can remove the accessor to it.
2019-04-07 02:47:37 -04:00
Lioncash
c1a788780d arm/arm_dynarmic: Remove unnecessary current_page_table member
Given the page table will always be guaranteed to be that of whatever
the current process is, we no longer need to keep this around.
2019-04-07 02:43:51 -04:00
Lioncash
6a929c3a2c kernel: Handle page table switching within MakeCurrentProcess()
Centralizes the page table switching to one spot, rather than making
calling code deal with it everywhere.
2019-04-07 01:12:54 -04:00
khang06
00dd963ee6 fix clang-format target when using a path with spaces on windows
2019-04-07 02:10:01 +02:00
Lioncash
bfbadb38be kernel/server_session: Return a std::pair from CreateSessionPair()
Keeps the return type consistent with the function name. While we're at
it, we can also reduce the amount of boilerplate involved with handling
these by using structured bindings.
2019-04-06 01:42:03 -04:00
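
The pattern mentioned in bfbadb38be, returning a `std::pair` and unpacking it with structured bindings, can be sketched as follows. The types and the function here are hypothetical placeholders, not the kernel's actual classes or signatures.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical placeholder types for the two ends of a session.
struct ServerEndExample {
    std::string name;
};
struct ClientEndExample {
    std::string name;
};

using SessionPairExample = std::pair<std::shared_ptr<ServerEndExample>,
                                     std::shared_ptr<ClientEndExample>>;

// Hypothetical factory: the return type matches the "Pair" in its name.
inline SessionPairExample CreateSessionPairExample(const std::string& name) {
    auto server = std::make_shared<ServerEndExample>(ServerEndExample{name + "_Server"});
    auto client = std::make_shared<ClientEndExample>(ClientEndExample{name + "_Client"});
    return {std::move(server), std::move(client)};
}
```

With C++17, a caller can write `auto [server, client] = CreateSessionPairExample("demo");` instead of unpacking with `std::get<0>` and `std::get<1>`.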
Lioncash
05243b3041 kernel/server_port: Return a std::pair from CreatePortPair()
Returns the same type that the function name describes.
2019-04-06 01:36:53 -04:00
ReinUsesLisp
8092d3fad0 maxwell_3d: Reduce severity of ProcessSyncPoint
2019-04-06 02:18:20 -03:00
Lioncash
053aae66c1 video_core/textures/convert: Replace include with a forward declaration
Avoids dragging in a direct dependency in a header.
2019-04-06 00:14:36 -04:00
Lioncash
44d91d561a video_core/texures/texture: Remove unnecessary includes
Nothing in this header relies on common_funcs or the memory manager.

This gets rid of reliance on indirect inclusions in the OpenGL caches.
2019-04-06 00:03:35 -04:00
bunnei
9d8fa5f6e3 Merge pull request #2317 from FernandoS27/sync
Implement SyncPoint Register in the GPU.
2019-04-05 23:50:54 -04:00
bunnei
eccdc91fe0 Merge pull request #2325 from lioncash/name
kernel/server_session: Provide a GetName() override
2019-04-05 23:48:13 -04:00
bunnei
30218eab18 Merge pull request #2342 from lioncash/warning
common/multi_level_queue: Silence truncation warnings
2019-04-05 23:47:27 -04:00
bunnei
851f7f9e85 Merge pull request #2240 from FearlessTobi/port-4651
Port citra-emu/citra#4651: "gdbstub: Fix some bugs in IsMemoryBreak() and ServeBreak. Add workaround to let watchpoints break into GDB."
2019-04-05 23:46:37 -04:00
bunnei
031f36e6ac Merge pull request #2346 from lioncash/header
video_core/engines: Remove unnecessary inclusions where applicable
2019-04-05 23:44:27 -04:00
bunnei
2a4a454793 memory_manager: Improved implementation of read/write/copy block.
- Fixes graphical issues with Chocobo's Mystery Dungeon EVERY BUDDY!
- Fixes a crash with Mario Tennis Aces
2019-04-05 23:43:34 -04:00
bunnei
22fe3a5545 Merge pull request #2350 from lioncash/vmem
video_core/memory_manager: Mark a few member functions with the const qualifier
2019-04-05 23:40:54 -04:00
bunnei
2ad085e283 Merge pull request #2340 from lioncash/view
file_sys/fsmitm_romfsbuild: Utilize a string_view in romfs_calc_path_hash
2019-04-05 23:40:16 -04:00
bunnei
21b4a904f4 Merge pull request #2334 from lioncash/override
core: Add missing override specifiers where applicable
2019-04-05 23:39:52 -04:00
bunnei
5df7110b7d Merge pull request #2347 from lioncash/trunc
video_core/gpu_thread: Silence truncation warning in ThreadManager's constructor
2019-04-05 23:39:31 -04:00
bunnei
980c16b58f Merge pull request #2341 from lioncash/compare
file_sys/nca_metadata: Remove unnecessary comparison operators for TitleType
2019-04-05 23:38:37 -04:00
bunnei
41cc5be7b8 Merge pull request #2339 from lioncash/rank
service/fsp_srv: Update SaveDataInfo and SaveDataDescriptor structs
2019-04-05 23:36:46 -04:00