POWER - Performance Optimizations With Extensive Ramifications (#2286)

* Refactoring of KMemoryManager class

* Replace some trivial uses of DRAM address with VA

* Get rid of GetDramAddressFromVa

* Abstracting more operations on derived page table class

* Run auto-format on KPageTableBase

* Managed to make TryConvertVaToPa private, few uses remain now

* Implement guest physical pages ref counting, remove manual freeing

* Make DoMmuOperation private and call new abstract methods only from the base class

* Pass pages count rather than size on Map/UnmapMemory

* Change memory managers to take host pointers

* Fix a guest memory leak and simplify KPageTable

* Expose new methods for host range query and mapping

* Some refactoring of MapPagesFromClientProcess to allow proper page ref counting and mapping without KPageLists

* Remove more uses of AddVaRangeToPageList, now only one remains (shared memory page checking)

* Add a SharedMemoryStorage class, will be useful for host mapping

* Sayonara AddVaRangeToPageList, you served us well

* Start to implement host memory mapping (WIP)

* Support memory tracking through host exception handling

* Fix some access violations from HLE service guest memory access and CPU

* Fix memory tracking

* Fix mapping list bugs, including a race and an error adding mapping ranges

* Simple page table for memory tracking

* Simple "volatile" region handle mode

* Update UBOs directly (experimental, rough)

* Fix the overlap check

* Only set non-modified buffers as volatile

* Fix some memory tracking issues

* Fix possible race in MapBufferFromClientProcess (block list updates were not locked)

* Write uniform update to memory immediately, only defer the buffer set.

* Fix some memory tracking issues

* Pass correct pages count on shared memory unmap

* ARMeilleure Signal Handler v1 + Unix changes

Unix currently behaves like Windows, rather than remapping physical

* Actually check if the host platform is Unix

* Fix decommit on Linux.

* Implement Windows 10 placeholder shared memory, fix a buffer issue.

* Make PTC version something that will never match with master

* Remove testing variable for block count

* Add reference count for memory manager, fix dispose

Can still deadlock with OpenAL

* Add address validation, use page table for mapped check, add docs

Might clean up the page table traversing routines.

* Implement batched mapping/tracking.

* Move documentation, fix tests.

* Cleanup uniform buffer update stuff.

* Remove unnecessary assignment.

* Add unsafe host mapped memory switch

On by default. Would be good to turn this off for untrusted code (homebrew, exefs mods) and give the user the option to turn it on manually, though that requires some UI work.

* Remove C# exception handlers

They have issues due to current .NET limitations, so the meilleure one fully replaces them for now.

* Fix MapPhysicalMemory on the software MemoryManager.

* Null check for GetHostAddress, docs

* Add configuration for setting memory manager mode (not in UI yet)

* Add config to UI

* Fix type mismatch on Unix signal handler code emit

* Fix 6GB DRAM mode.

The size can be greater than `uint.MaxValue` when the DRAM is >4GB.

* Address some feedback.

* More detailed error if backing memory cannot be mapped.

* SetLastError on all OS functions for consistency

* Force pages dirty with UBO update instead of setting them directly.

Seems to be much faster across a few games. Needs retesting.

* Rebase, configuration rework, fix mem tracking regression

* Fix race in FreePages

* Set memory managers null after decrementing ref count

* Remove readonly keyword, as this is now modified.

* Use a local variable for the signal handler rather than a register.

* Fix bug with buffer resize, and index/uniform buffer binding.

Should fix flickering in games.

* Add InvalidAccessHandler to MemoryTracking

Doesn't do anything yet

* Call invalid access handler on unmapped read/write.

Same rules as the regular memory manager.

* Make unsafe mapped memory its own MemoryManagerType

* Move FlushUboDirty into UpdateState.

* Buffer dirty cache, rather than ubo cache

Much cleaner, may be reusable for Inline2Memory updates.

* This doesn't return anything anymore.

* Add sigaction remove methods, correct a few function signatures.

* Return empty list of physical regions for size 0.

* Also on AddressSpaceManager
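
The "Fix 6GB DRAM mode" entry above comes down to integer width. As a minimal sketch of the arithmetic (the variable names here are hypothetical and do not come from the commit), a 6 GiB size does not fit in a `uint`, and a cast to `uint` keeps only the low 32 bits:

```csharp
using System;

// Hypothetical value for illustration; the commit does not show the actual DRAM size constant.
ulong dramSize = 6UL * 1024 * 1024 * 1024; // 6 GiB = 6442450944 bytes

// uint.MaxValue is 4294967295, so the size cannot be stored in 32 bits.
bool fitsInUInt = dramSize <= uint.MaxValue;

// An unchecked cast silently drops the upper bits.
uint truncated = unchecked((uint)dramSize);

Console.WriteLine($"size = {dramSize}, fits in uint: {fitsInUInt}, truncated = {truncated}");
```

Keeping sizes as `ulong` until they are known to fit avoids this kind of silent truncation.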

Co-authored-by: gdkchan <gab.dark.100@gmail.com>
riperiperi 2021-05-24 21:52:44 +01:00 committed by GitHub
parent fb65f392d1
commit 54ea2285f0
107 changed files with 8309 additions and 4183 deletions


@@ -1,4 +1,7 @@
-using System;
+using Ryujinx.Memory.Range;
+using System;
+using System.Collections.Generic;
+using System.Linq;
 using System.Runtime.CompilerServices;
 using System.Runtime.InteropServices;
@@ -10,15 +13,9 @@ namespace Ryujinx.Memory
 /// </summary>
 public sealed class AddressSpaceManager : IVirtualMemoryManager, IWritableBlock
 {
-public const int PageBits = 12;
-public const int PageSize = 1 << PageBits;
-public const int PageMask = PageSize - 1;
-private const int PtLevelBits = 9; // 9 * 4 + 12 = 48 (max address space size)
-private const int PtLevelSize = 1 << PtLevelBits;
-private const int PtLevelMask = PtLevelSize - 1;
-private const ulong Unmapped = ulong.MaxValue;
+public const int PageBits = PageTable<nuint>.PageBits;
+public const int PageSize = PageTable<nuint>.PageSize;
+public const int PageMask = PageTable<nuint>.PageMask;
 /// <summary>
 /// Address space width in bits.
@@ -27,16 +24,14 @@ namespace Ryujinx.Memory
 private readonly ulong _addressSpaceSize;
-private readonly MemoryBlock _backingMemory;
-private readonly ulong[][][][] _pageTable;
+private readonly PageTable<nuint> _pageTable;
 /// <summary>
 /// Creates a new instance of the memory manager.
 /// </summary>
-/// <param name="backingMemory">Physical backing memory where virtual memory will be mapped to</param>
 /// <param name="addressSpaceSize">Size of the address space</param>
-public AddressSpaceManager(MemoryBlock backingMemory, ulong addressSpaceSize)
+public AddressSpaceManager(ulong addressSpaceSize)
 {
 ulong asSize = PageSize;
 int asBits = PageBits;
@@ -49,8 +44,7 @@ namespace Ryujinx.Memory
 AddressSpaceBits = asBits;
 _addressSpaceSize = asSize;
-_backingMemory = backingMemory;
-_pageTable = new ulong[PtLevelSize][][][];
+_pageTable = new PageTable<nuint>();
 }
 /// <summary>
@@ -60,18 +54,18 @@ namespace Ryujinx.Memory
 /// Addresses and size must be page aligned.
 /// </remarks>
 /// <param name="va">Virtual memory address</param>
-/// <param name="pa">Physical memory address</param>
+/// <param name="hostAddress">Physical memory address</param>
 /// <param name="size">Size to be mapped</param>
-public void Map(ulong va, ulong pa, ulong size)
+public void Map(ulong va, nuint hostAddress, ulong size)
 {
 AssertValidAddressAndSize(va, size);
 while (size != 0)
 {
-PtMap(va, pa);
+_pageTable.Map(va, hostAddress);
 va += PageSize;
-pa += PageSize;
+hostAddress += PageSize;
 size -= PageSize;
 }
 }
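
The mapping loop above stores one entry per page, and translation later adds the in-page offset back (as `GetHostAddress` does further down). A minimal sketch of that idea, assuming a plain `Dictionary` as a hypothetical stand-in for `PageTable<nuint>` (whose real implementation is not shown in this hunk):

```csharp
using System;
using System.Collections.Generic;

const int PageBits = 12;
const ulong PageSize = 1UL << PageBits;
const ulong PageMask = PageSize - 1;

// Hypothetical stand-in for PageTable<nuint>: page-aligned VA -> host address of that page.
var table = new Dictionary<ulong, nuint>();

// Same shape as the loop in the diff: one table entry per page of the range.
void Map(ulong va, nuint hostAddress, ulong size)
{
    while (size != 0)
    {
        table[va] = hostAddress;
        va += PageSize;
        hostAddress += (nuint)PageSize;
        size -= PageSize;
    }
}

// Translation: page base from the table plus the offset inside the page.
nuint GetHostAddress(ulong va) => table[va & ~PageMask] + (nuint)(va & PageMask);

Map(0x10000, (nuint)0x7000_0000, 3 * PageSize);
Console.WriteLine($"0x{(ulong)GetHostAddress(0x11234):X}"); // prints 0x70001234
```

The host addresses in this sketch are made-up numbers and are never dereferenced.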
@@ -87,7 +81,7 @@ namespace Ryujinx.Memory
 while (size != 0)
 {
-PtUnmap(va);
+_pageTable.Unmap(va);
 va += PageSize;
 size -= PageSize;
@@ -146,7 +140,7 @@ namespace Ryujinx.Memory
 if (IsContiguousAndMapped(va, data.Length))
 {
-data.CopyTo(_backingMemory.GetSpan(GetPhysicalAddressInternal(va), data.Length));
+data.CopyTo(GetHostSpanContiguous(va, data.Length));
 }
 else
 {
@@ -154,22 +148,18 @@ namespace Ryujinx.Memory
 if ((va & PageMask) != 0)
 {
-ulong pa = GetPhysicalAddressInternal(va);
 size = Math.Min(data.Length, PageSize - (int)(va & PageMask));
-data.Slice(0, size).CopyTo(_backingMemory.GetSpan(pa, size));
+data.Slice(0, size).CopyTo(GetHostSpanContiguous(va, size));
 offset += size;
 }
 for (; offset < data.Length; offset += size)
 {
-ulong pa = GetPhysicalAddressInternal(va + (ulong)offset);
 size = Math.Min(data.Length - offset, PageSize);
-data.Slice(offset, size).CopyTo(_backingMemory.GetSpan(pa, size));
+data.Slice(offset, size).CopyTo(GetHostSpanContiguous(va + (ulong)offset, size));
 }
 }
 }
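
The non-contiguous write path above copies one page-bounded chunk at a time: first the partial chunk up to the next page boundary, then whole pages, then the remainder. A minimal sketch of just the chunking arithmetic, with a hypothetical `SplitByPage` helper that is not part of the commit:

```csharp
using System;
using System.Collections.Generic;

const int PageSize = 4096;
const int PageMask = PageSize - 1;

// Hypothetical helper: splits [va, va + length) into chunks that never cross a page
// boundary, using the same offset/size bookkeeping as the loop in the diff.
List<(ulong va, int size)> SplitByPage(ulong va, int length)
{
    var result = new List<(ulong va, int size)>();
    int offset = 0, size;

    // Unaligned start: take only the bytes up to the next page boundary.
    if ((va & PageMask) != 0)
    {
        size = Math.Min(length, PageSize - (int)(va & PageMask));
        result.Add((va, size));
        offset += size;
    }

    // Remaining bytes: at most one full page per chunk.
    for (; offset < length; offset += size)
    {
        size = Math.Min(length - offset, PageSize);
        result.Add((va + (ulong)offset, size));
    }

    return result;
}

var chunks = SplitByPage(0x1F00, 0x1200);
foreach (var (chunkVa, chunkSize) in chunks)
{
    Console.WriteLine($"va=0x{chunkVa:X} size=0x{chunkSize:X}");
}
```

For the sample call this gives three chunks: 0x100 bytes at 0x1F00, a full page at 0x2000, and 0x100 bytes at 0x3000.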
@@ -195,7 +185,7 @@ namespace Ryujinx.Memory
 if (IsContiguousAndMapped(va, size))
 {
-return _backingMemory.GetSpan(GetPhysicalAddressInternal(va), size);
+return GetHostSpanContiguous(va, size);
 }
 else
 {
@@ -219,7 +209,7 @@ namespace Ryujinx.Memory
 /// <param name="size">Size of the data</param>
 /// <returns>A writable region of memory containing the data</returns>
 /// <exception cref="InvalidMemoryRegionException">Throw for unhandled invalid or unmapped memory accesses</exception>
-public WritableRegion GetWritableRegion(ulong va, int size)
+public unsafe WritableRegion GetWritableRegion(ulong va, int size)
 {
 if (size == 0)
 {
@@ -228,7 +218,7 @@ namespace Ryujinx.Memory
 if (IsContiguousAndMapped(va, size))
 {
-return new WritableRegion(null, va, _backingMemory.GetMemory(GetPhysicalAddressInternal(va), size));
+return new WritableRegion(null, va, new NativeMemoryManager<byte>((byte*)GetHostAddress(va), size).Memory);
 }
 else
 {
@@ -250,14 +240,14 @@ namespace Ryujinx.Memory
 /// <param name="va">Virtual address of the data</param>
 /// <returns>A reference to the data in memory</returns>
 /// <exception cref="MemoryNotContiguousException">Throw if the specified memory region is not contiguous in physical memory</exception>
-public ref T GetRef<T>(ulong va) where T : unmanaged
+public unsafe ref T GetRef<T>(ulong va) where T : unmanaged
 {
 if (!IsContiguous(va, Unsafe.SizeOf<T>()))
 {
 ThrowMemoryNotContiguous();
 }
-return ref _backingMemory.GetRef<T>(GetPhysicalAddressInternal(va));
+return ref *(T*)GetHostAddress(va);
 }
 /// <summary>
@@ -299,7 +289,7 @@ namespace Ryujinx.Memory
 return false;
 }
-if (GetPhysicalAddressInternal(va) + PageSize != GetPhysicalAddressInternal(va + PageSize))
+if (GetHostAddress(va) + PageSize != GetHostAddress(va + PageSize))
 {
 return false;
 }
@@ -317,9 +307,48 @@ namespace Ryujinx.Memory
 /// <param name="va">Virtual address of the range</param>
 /// <param name="size">Size of the range</param>
 /// <returns>Array of physical regions</returns>
-public (ulong address, ulong size)[] GetPhysicalRegions(ulong va, ulong size)
+public IEnumerable<HostMemoryRange> GetPhysicalRegions(ulong va, ulong size)
 {
-throw new NotImplementedException();
+if (size == 0)
+{
+return Enumerable.Empty<HostMemoryRange>();
+}
+if (!ValidateAddress(va) || !ValidateAddressAndSize(va, size))
+{
+return null;
+}
+int pages = GetPagesCount(va, (uint)size, out va);
+var regions = new List<HostMemoryRange>();
+nuint regionStart = GetHostAddress(va);
+ulong regionSize = PageSize;
+for (int page = 0; page < pages - 1; page++)
+{
+if (!ValidateAddress(va + PageSize))
+{
+return null;
+}
+nuint newHostAddress = GetHostAddress(va + PageSize);
+if (GetHostAddress(va) + PageSize != newHostAddress)
+{
+regions.Add(new HostMemoryRange(regionStart, regionSize));
+regionStart = newHostAddress;
+regionSize = 0;
+}
+va += PageSize;
+regionSize += PageSize;
+}
+regions.Add(new HostMemoryRange(regionStart, regionSize));
+return regions;
 }
 private void ReadImpl(ulong va, Span<byte> data)
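
The new `GetPhysicalRegions` body walks the range page by page and starts a new region whenever the next page's host address is not exactly one page after the current one. A minimal sketch of that coalescing step in isolation, assuming the per-page host addresses are already available as a list (the `Coalesce` helper and the sample addresses are hypothetical):

```csharp
using System;
using System.Collections.Generic;

const ulong PageSize = 4096;

// hostPages[i] is the (hypothetical) host address backing the i-th page of a virtual range.
List<(ulong address, ulong size)> Coalesce(IReadOnlyList<ulong> hostPages)
{
    var regions = new List<(ulong address, ulong size)>();

    if (hostPages.Count == 0)
    {
        return regions;
    }

    ulong regionStart = hostPages[0];
    ulong regionSize = PageSize;

    for (int page = 1; page < hostPages.Count; page++)
    {
        // A gap in host addresses closes the current region and opens a new one.
        if (hostPages[page - 1] + PageSize != hostPages[page])
        {
            regions.Add((regionStart, regionSize));
            regionStart = hostPages[page];
            regionSize = 0;
        }

        regionSize += PageSize;
    }

    regions.Add((regionStart, regionSize));

    return regions;
}

var merged = Coalesce(new ulong[] { 0x10000, 0x11000, 0x40000, 0x41000, 0x42000 });
foreach (var (start, length) in merged)
{
    Console.WriteLine($"0x{start:X} + 0x{length:X}");
}
```

The five sample pages collapse into two regions, one of two pages at 0x10000 and one of three pages at 0x40000.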
@@ -335,22 +364,18 @@ namespace Ryujinx.Memory
 if ((va & PageMask) != 0)
 {
-ulong pa = GetPhysicalAddressInternal(va);
 size = Math.Min(data.Length, PageSize - (int)(va & PageMask));
-_backingMemory.GetSpan(pa, size).CopyTo(data.Slice(0, size));
+GetHostSpanContiguous(va, size).CopyTo(data.Slice(0, size));
 offset += size;
 }
 for (; offset < data.Length; offset += size)
 {
-ulong pa = GetPhysicalAddressInternal(va + (ulong)offset);
 size = Math.Min(data.Length - offset, PageSize);
-_backingMemory.GetSpan(pa, size).CopyTo(data.Slice(offset, size));
+GetHostSpanContiguous(va + (ulong)offset, size).CopyTo(data.Slice(offset, size));
 }
 }
@@ -367,7 +392,7 @@ namespace Ryujinx.Memory
 return false;
 }
-return PtRead(va) != Unmapped;
+return _pageTable.Read(va) != 0;
 }
 /// <summary>
@@ -434,28 +459,14 @@ namespace Ryujinx.Memory
 }
 }
-/// <summary>
-/// Performs address translation of the address inside a mapped memory range.
-/// </summary>
-/// <remarks>
-/// If the address is invalid or unmapped, -1 will be returned.
-/// </remarks>
-/// <param name="va">Virtual address to be translated</param>
-/// <returns>The physical address</returns>
-public ulong GetPhysicalAddress(ulong va)
+private unsafe Span<byte> GetHostSpanContiguous(ulong va, int size)
 {
-// We return -1L if the virtual address is invalid or unmapped.
-if (!ValidateAddress(va) || !IsMapped(va))
-{
-return ulong.MaxValue;
-}
-return GetPhysicalAddressInternal(va);
+return new Span<byte>((void*)GetHostAddress(va), size);
 }
-private ulong GetPhysicalAddressInternal(ulong va)
+private nuint GetHostAddress(ulong va)
 {
-return PtRead(va) + (va & PageMask);
+return _pageTable.Read(va) + (nuint)(va & PageMask);
 }
 /// <summary>
@@ -469,132 +480,6 @@ namespace Ryujinx.Memory
 throw new NotImplementedException();
 }
-private ulong PtRead(ulong va)
-{
-int l3 = (int)(va >> PageBits) & PtLevelMask;
-int l2 = (int)(va >> (PageBits + PtLevelBits)) & PtLevelMask;
-int l1 = (int)(va >> (PageBits + PtLevelBits * 2)) & PtLevelMask;
-int l0 = (int)(va >> (PageBits + PtLevelBits * 3)) & PtLevelMask;
-if (_pageTable[l0] == null)
-{
-return Unmapped;
-}
-if (_pageTable[l0][l1] == null)
-{
-return Unmapped;
-}
-if (_pageTable[l0][l1][l2] == null)
-{
-return Unmapped;
-}
-return _pageTable[l0][l1][l2][l3];
-}
-private void PtMap(ulong va, ulong value)
-{
-int l3 = (int)(va >> PageBits) & PtLevelMask;
-int l2 = (int)(va >> (PageBits + PtLevelBits)) & PtLevelMask;
-int l1 = (int)(va >> (PageBits + PtLevelBits * 2)) & PtLevelMask;
-int l0 = (int)(va >> (PageBits + PtLevelBits * 3)) & PtLevelMask;
-if (_pageTable[l0] == null)
-{
-_pageTable[l0] = new ulong[PtLevelSize][][];
-}
-if (_pageTable[l0][l1] == null)
-{
-_pageTable[l0][l1] = new ulong[PtLevelSize][];
-}
-if (_pageTable[l0][l1][l2] == null)
-{
-_pageTable[l0][l1][l2] = new ulong[PtLevelSize];
-for (int i = 0; i < _pageTable[l0][l1][l2].Length; i++)
-{
-_pageTable[l0][l1][l2][i] = Unmapped;
-}
-}
-_pageTable[l0][l1][l2][l3] = value;
-}
-private void PtUnmap(ulong va)
-{
-int l3 = (int)(va >> PageBits) & PtLevelMask;
-int l2 = (int)(va >> (PageBits + PtLevelBits)) & PtLevelMask;
-int l1 = (int)(va >> (PageBits + PtLevelBits * 2)) & PtLevelMask;
-int l0 = (int)(va >> (PageBits + PtLevelBits * 3)) & PtLevelMask;
-if (_pageTable[l0] == null)
-{
-return;
-}
-if (_pageTable[l0][l1] == null)
-{
-return;
-}
-if (_pageTable[l0][l1][l2] == null)
-{
-return;
-}
-_pageTable[l0][l1][l2][l3] = Unmapped;
-bool empty = true;
-for (int i = 0; i < _pageTable[l0][l1][l2].Length; i++)
-{
-empty &= (_pageTable[l0][l1][l2][i] == Unmapped);
-}
-if (empty)
-{
-_pageTable[l0][l1][l2] = null;
-RemoveIfAllNull(l0, l1);
-}
-}
-private void RemoveIfAllNull(int l0, int l1)
-{
-bool empty = true;
-for (int i = 0; i < _pageTable[l0][l1].Length; i++)
-{
-empty &= (_pageTable[l0][l1][i] == null);
-}
-if (empty)
-{
-_pageTable[l0][l1] = null;
-RemoveIfAllNull(l0);
-}
-}
-private void RemoveIfAllNull(int l0)
-{
-bool empty = true;
-for (int i = 0; i < _pageTable[l0].Length; i++)
-{
-empty &= (_pageTable[l0][i] == null);
-}
-if (empty)
-{
-_pageTable[l0] = null;
-}
-}
 public void SignalMemoryTracking(ulong va, ulong size, bool write)
 {
 // Only the ARM Memory Manager has tracking for now.
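
The removed `PtRead`, `PtMap` and `PtUnmap` methods all begin by splitting a virtual address into four 9-bit table indices above a 12-bit page offset (9 * 4 + 12 = 48 bits). A minimal sketch of that decomposition on its own, using a hypothetical `Split` helper and a sample address built so that each index is easy to read off:

```csharp
using System;

const int PageBits = 12;
const int PtLevelBits = 9; // 9 * 4 + 12 = 48 address bits
const int PtLevelMask = (1 << PtLevelBits) - 1;

// Hypothetical helper: same shifts and masks as the removed Pt* methods.
(int l0, int l1, int l2, int l3) Split(ulong va)
{
    int l3 = (int)(va >> PageBits) & PtLevelMask;
    int l2 = (int)(va >> (PageBits + PtLevelBits)) & PtLevelMask;
    int l1 = (int)(va >> (PageBits + PtLevelBits * 2)) & PtLevelMask;
    int l0 = (int)(va >> (PageBits + PtLevelBits * 3)) & PtLevelMask;

    return (l0, l1, l2, l3);
}

// Sample address: index 1 at level 0, 2 at level 1, 3 at level 2, 4 at level 3, offset 0x567.
ulong sample = (1UL << 39) | (2UL << 30) | (3UL << 21) | (4UL << 12) | 0x567;
var parts = Split(sample);

Console.WriteLine($"l0={parts.l0} l1={parts.l1} l2={parts.l2} l3={parts.l3} offset=0x{sample & 0xFFF:X}");
```

Each index selects one entry of a 512-entry level, which is why the removed code allocated `ulong[PtLevelSize]` arrays lazily at each level.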