[WIP][AMDGPU] Improve the handling of `inreg` arguments
When SGPRs available for `inreg` argument passing run out, the compiler silently
falls back to using whole VGPRs to pass those arguments. Ideally, instead of
using whole VGPRs, we should pack `inreg` arguments into individual lanes of
VGPRs.
This PR introduces `InregVGPRSpiller`, which handles this packing. It uses
`v_writelane` at the call site to place `inreg` arguments into specific VGPR
lanes, and then extracts them in the callee using `v_readlane`.
Fixes #130443 and #129071.
Revert "[Metadata] Preserve MD_prof when merging instructions when one is missing." (#134200)
Reverts llvm/llvm-project#132433
I suspect this change caused a failure in the bolt build bot.
https://lab.llvm.org/buildbot/#/builders/113/builds/6621
```
!9185 = !{!"branch_weights", i32 3912, i32 802}
Wrong number of operands
!9185 = !{!"branch_weights", i32 3912, i32 802}
fatal error: error in backend: Broken module found, compilation aborted!
```
[RISCV] Don't allow '-' after 'ra' in Zcmp/Xqccmp register list. (#134182)
Move the parsing of '-' under the check that we parsed a comma.
Unfortunately, this leads to a poor error, but I still have more known
issues in this code and may end up with an overall restructuring and
want to think about wording.
[RISCV] Check S0 register list check for qc.cm.pushfp to after we parsed the whole register list. (#134180)
This is more of a semantic check. The diagnostic location to has been
changed to point at the register list start instead of the
closing brace or whatever character might be there instead of a brace
if its malformed.
Reland [RISCV] Add Xqci Insn Formats (#134134)
This adds the following instruction formats from the Xqci Spec:
- QC.EAI
- QC.EI
- QC.EB
- QC.EJ
- QC.ES
The update to the THead test is because the largest number of operands
for a valid instruction has been bumped by this change.
This reverts commit 68fb7a5a1d203dde7badf67031bdd9eb650eef5d. This
relands commit 0cfabd37df9940346f3bf8a4d74c19e1f48a00e9.
openssh: Request the OpenSSL 1.1 API
Upstream OpenSSH commit f51423bda ("request 1.1x API compatibility for
OpenSSL >=3.x") requests OPENSSL_API_COMPAT version 0x10100000L (OpenSSL
1.1.0), in order to avoid warnings about deprecated functions.
Do the same here, to avoid getting those warnings.
Reviewed by: emaste
Approved by: emaste (mentor)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D49517
(cherry picked from commit d4f438357e90ee1cb12819d092913fdbce813626)
Reapply "[cmake] Refactor clang unittest cmake" (#134195)
This reapplies 5ffd9bdb50b57 (#133545) with fixes.
The BUILD_SHARED_LIBS=ON build was fixed by adding missing LLVM
dependencies to the InterpTests binary in
unittests/AST/ByteCode/CMakeLists.txt .
llvm-reduce: Remove unsupported from bitcode uselistorder test (#134185)
This was disabled due to flakiness but I'm currently unable to
reproduce.
I'm nervous the original issue still exists. However, I downgraded the
tripped
assert in 8c18c25b1b22ea710edb40a4f167a6a8bfe6ff9d to a warning since
the same
assert can trigger for illegitimate reasons.
Fixes #64157
Fix initial user setup
This commit ensures that setting up initial account will properly
set the last_password_change value so as to not have the admin
account immediately locked.
[lldb][debugserver] Save and restore the SVE/SME register state (#134184)
debugserver isn't saving and restoring the SVE/SME register state around
inferior function calls.
Making arbitrary function calls while in Streaming SVE mode is generally
a poor idea because a NEON instruction can be hit and crash the
expression execution, which is how I missed this, but they should be
handled correctly if the user knows it is safe to do.
rdar://146886210
Ensure KnownBits passed when calculating from range md has right size (#132985)
KnownBits passed to computeKnownBitsFromRangeMetadata must have the same
bit width as the range metadata bit width. Otherwise the calculated
results will be incorrect.
---------
Signed-off-by: John Lu <John.Lu at amd.com>
[Clang] Fix a lambda pattern comparison mismatch after ecc7e6ce4 (#133863)
In ecc7e6ce4, we tried to inspect the `LambdaScopeInfo` on stack to
recover the instantiating lambda captures. However, there was a mismatch
in how we compared the pattern declarations of lambdas: the constraint
instantiation used a tailored `getPatternFunctionDecl()` which is
localized in SemaLambda that finds the very primal template declaration
of a lambda, while `FunctionDecl::getTemplateInstantiationPattern` finds
the latest template pattern of a lambda. This difference causes issues
when lambdas are nested, as we always want the primary template
declaration.
This corrects that by moving `Sema::addInstantiatedCapturesToScope` from
SemaConcept to SemaLambda, allowing it to use the localized version of
`getPatternFunctionDecl`.
It is also worth exploring to coalesce the implementation of
`getPatternFunctionDecl` with
`FunctionDecl::getTemplateInstantiationPattern`. But I’m leaving that
[4 lines not shown]
[TableGen] Emit `llvm::is_contained` for `CheckOpcode` predicate (#134057)
When the list is large, using `llvm::is_contained` is of higher
performance than a sequence of comparisons. When the list is small,
the `llvm::is_contained` can be inlined and unrolled, which has the
same effect as using a sequence of comparisons.
And the generated code is more readable.
Revert "mccomphy: add support for YT8531"
The new code makes use of FDT/OFW types and interfaces, and obviously
fails to build on amd64. Revert to fix.
Pointy-hat-to: mhorne
This reverts commit e69623451ea62d2c3c76e0d0e775aa3f7317f2eb.