CPU (A64): Add Pmull_V Inst. with Clmul fast path for the "1/2D -> 1Q" variant & Sse fast path and slow path for both the "8/16B -> 8H" and "1/2D -> 1Q" variants; with Test. (#1817)

* Add Pmull_V Sse fast path only, both "8/16B -> 8H" and "1/2D -> 1Q" variants; with Test.

* Add Clmul fast path for the 128 bits variant.

* Small optimisation (save 60 instructions) for the Sse fast path about the 128 bits variant.

* Add slow path, both variants. Fix V128 Shl/Shr when shift = 0.

* A32: Add Vmull_I P64 variant (slow path); not tested.

* A32: Add Vmull_I_P8_P64 Test and fix P64 variant.
This commit is contained in:
LDj3SNuD 2021-01-04 23:45:54 +01:00 committed by GitHub
parent a03ab0c4a0
commit 430ba6da65
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
11 changed files with 264 additions and 25 deletions

View file

@ -22,7 +22,7 @@ namespace ARMeilleure.Translation.PTC
{
private const string HeaderMagic = "PTChd";
private const int InternalVersion = 1814; //! To be incremented manually for each change to the ARMeilleure project.
private const int InternalVersion = 1817; //! To be incremented manually for each change to the ARMeilleure project.
private const string ActualDir = "0";
private const string BackupDir = "1";