[WIP][AMDGPU] Improve the handling of `inreg` arguments
When SGPRs available for `inreg` argument passing run out, the compiler silently
falls back to using whole VGPRs to pass those arguments. Ideally, instead of
using whole VGPRs, we should pack `inreg` arguments into individual lanes of
VGPRs.
This PR introduces `InregVGPRSpiller`, which handles this packing. It uses
`v_writelane` at the call site to place `inreg` arguments into specific VGPR
lanes, and then extracts them in the callee using `v_readlane`.
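As a rough illustration of the lane-packing idea (not the code in this PR), the sketch below models one VGPR as an array of per-lane values. The names `VGPR`, `writeLane`, `readLane`, and `packInregArgs` are hypothetical, and a 32-lane wave is an assumption made only for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy model of a single 32-bit VGPR: one value per lane.
// Assumption for this sketch: a wave has 32 lanes.
constexpr unsigned kWaveSize = 32;
using VGPR = std::array<uint32_t, kWaveSize>;

// Models the effect of v_writelane: store a uniform value into one lane.
inline void writeLane(VGPR &Reg, unsigned Lane, uint32_t Value) {
  assert(Lane < kWaveSize && "lane index out of range");
  Reg[Lane] = Value;
}

// Models the effect of v_readlane: read one lane back as a scalar.
inline uint32_t readLane(const VGPR &Reg, unsigned Lane) {
  assert(Lane < kWaveSize && "lane index out of range");
  return Reg[Lane];
}

// Call-site side: place argument I into lane I of a single VGPR
// instead of spending one whole VGPR per argument.
template <std::size_t N>
VGPR packInregArgs(const std::array<uint32_t, N> &Args) {
  static_assert(N <= kWaveSize, "more arguments than lanes in one VGPR");
  VGPR Reg{}; // all lanes start at zero
  for (std::size_t I = 0; I < N; ++I)
    writeLane(Reg, static_cast<unsigned>(I), Args[I]);
  return Reg;
}
```

In this model the callee recovers argument `I` with `readLane(Reg, I)`, so up to `kWaveSize` overflowed `inreg` arguments share a single register.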
Fixes #130443 and #129071.
Revert "[Metadata] Preserve MD_prof when merging instructions when one is missing." (#134200)
Reverts llvm/llvm-project#132433
I suspect this change caused a failure in the bolt build bot.
https://lab.llvm.org/buildbot/#/builders/113/builds/6621
```
!9185 = !{!"branch_weights", i32 3912, i32 802}
Wrong number of operands
!9185 = !{!"branch_weights", i32 3912, i32 802}
fatal error: error in backend: Broken module found, compilation aborted!
```
[RISCV] Don't allow '-' after 'ra' in Zcmp/Xqccmp register list. (#134182)
Move the parsing of '-' under the check that we parsed a comma.
Unfortunately, this leads to a poor error message, but there are more
known issues in this code; I may end up restructuring it overall and
want to think about the wording.
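To illustrate the control flow only, here is a deliberately simplified checker, not the LLVM assembly parser. The function `isValidRegList` is hypothetical, handles ABI register names only, and does not validate register numbers; the point is that '-' is looked for only after a comma has been parsed.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical, simplified checker for lists such as "{ra, s0-s3}".
// It does not check that the register numbers form a legal range.
inline bool isValidRegList(const std::string &Text) {
  // Drop whitespace so the grammar below stays small.
  std::string S;
  for (char C : Text)
    if (!std::isspace(static_cast<unsigned char>(C)))
      S.push_back(C);

  std::size_t Pos = 0;

  // Consume an exact token at the current position, if present.
  auto Consume = [&](const std::string &Tok) {
    if (S.compare(Pos, Tok.size(), Tok) != 0)
      return false;
    Pos += Tok.size();
    return true;
  };

  // Consume 's' followed by one or more digits.
  auto ConsumeSReg = [&]() {
    if (Pos >= S.size() || S[Pos] != 's')
      return false;
    std::size_t End = Pos + 1;
    while (End < S.size() && std::isdigit(static_cast<unsigned char>(S[End])))
      ++End;
    if (End == Pos + 1)
      return false;
    Pos = End;
    return true;
  };

  if (!Consume("{") || !Consume("ra"))
    return false;

  // '-' is only considered once a comma and a first s-register were parsed,
  // so "{ra-s0}" falls through to the closing-brace check and is rejected.
  if (Consume(",")) {
    if (!ConsumeSReg())
      return false;
    if (Consume("-") && !ConsumeSReg())
      return false;
  }
  return Consume("}") && Pos == S.size();
}
```

With this structure, `{ra-s0}` fails at the closing-brace check, which mirrors why the resulting diagnostic is generic rather than specific to the stray '-'.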
[RISCV] Move S0 register list check for qc.cm.pushfp to after we parsed the whole register list. (#134180)
This is more of a semantic check. The diagnostic location has been
changed to point at the start of the register list instead of the
closing brace, or whatever character appears in its place if the list
is malformed.
Reland [RISCV] Add Xqci Insn Formats (#134134)
This adds the following instruction formats from the Xqci Spec:
- QC.EAI
- QC.EI
- QC.EB
- QC.EJ
- QC.ES
The update to the THead test is because the largest number of operands
for a valid instruction has been bumped by this change.
This reverts commit 68fb7a5a1d203dde7badf67031bdd9eb650eef5d. This
relands commit 0cfabd37df9940346f3bf8a4d74c19e1f48a00e9.
Reapply "[cmake] Refactor clang unittest cmake" (#134195)
This reapplies 5ffd9bdb50b57 (#133545) with fixes.
The BUILD_SHARED_LIBS=ON build was fixed by adding missing LLVM
dependencies to the InterpTests binary in
unittests/AST/ByteCode/CMakeLists.txt .
llvm-reduce: Remove unsupported from bitcode uselistorder test (#134185)
This was disabled due to flakiness but I'm currently unable to
reproduce.
I'm nervous the original issue still exists. However, I downgraded the
tripped assert in 8c18c25b1b22ea710edb40a4f167a6a8bfe6ff9d to a warning
since the same assert can trigger for illegitimate reasons.
Fixes #64157
[lldb][debugserver] Save and restore the SVE/SME register state (#134184)
debugserver isn't saving and restoring the SVE/SME register state around
inferior function calls.
Making arbitrary function calls while in Streaming SVE mode is generally
a poor idea because a NEON instruction can be hit and crash the
expression execution, which is how I missed this, but they should be
handled correctly if the user knows it is safe to do.
rdar://146886210
Ensure KnownBits passed when calculating from range md has right size (#132985)
KnownBits passed to computeKnownBitsFromRangeMetadata must have the same
bit width as the range metadata bit width. Otherwise the calculated
results will be incorrect.
---------
Signed-off-by: John Lu <John.Lu at amd.com>
[Clang] Fix a lambda pattern comparison mismatch after ecc7e6ce4 (#133863)
In ecc7e6ce4, we tried to inspect the `LambdaScopeInfo` on stack to
recover the instantiating lambda captures. However, there was a mismatch
in how we compared the pattern declarations of lambdas: the constraint
instantiation used a tailored `getPatternFunctionDecl()`, local to
SemaLambda, that finds the primary template declaration
of a lambda, while `FunctionDecl::getTemplateInstantiationPattern` finds
the latest template pattern of a lambda. This difference causes issues
when lambdas are nested, as we always want the primary template
declaration.
This corrects that by moving `Sema::addInstantiatedCapturesToScope` from
SemaConcept to SemaLambda, allowing it to use the localized version of
`getPatternFunctionDecl`.
It is also worth exploring whether to coalesce the implementation of
`getPatternFunctionDecl` with
`FunctionDecl::getTemplateInstantiationPattern`. But I’m leaving that
[4 lines not shown]
[TableGen] Emit `llvm::is_contained` for `CheckOpcode` predicate (#134057)
When the list is large, using `llvm::is_contained` is faster than a
sequence of comparisons. When the list is small, `llvm::is_contained`
can be inlined and unrolled, which has the same effect as a sequence
of comparisons.
The generated code is also more readable.
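A minimal sketch of the two predicate styles, for comparison. The opcode values are made up, and `isContained` is a local stand-in with the same shape as `llvm::is_contained` over an initializer list, so the example builds without LLVM headers; it is not the code TableGen emits.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

// Hypothetical opcode values, for illustration only.
enum Opcode : unsigned { ADD = 1, SUB, MUL, DIV, LOAD };

// Local stand-in for llvm::is_contained on an initializer list.
template <typename T>
bool isContained(std::initializer_list<T> Set, const T &Value) {
  return std::find(Set.begin(), Set.end(), Value) != Set.end();
}

// Style 1: a sequence of comparisons.
inline bool isArithChain(unsigned Opc) {
  return Opc == ADD || Opc == SUB || Opc == MUL || Opc == DIV;
}

// Style 2: a single membership test over the opcode list.
inline bool isArithContained(unsigned Opc) {
  return isContained<unsigned>({ADD, SUB, MUL, DIV}, Opc);
}
```

Both functions return the same result for every opcode; the second keeps the opcode list in one place, which is easier to read as the list grows.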
[lldb-dap] Add a -v/--version command line argument (#134114)
Add a -v/--version command line argument to print the version of both
the lldb-dap binary and the liblldb it's linked against.
This is motivated by me trying to figure out which lldb-dap I had in my
PATH.
[ctxprof] Option to move a whole tree to its own module (#133992)
Modules may contain a mix of functions that do and do not participate in callgraphs covered by a contextual profile. So far we have imported all the functions under a context root into the module defining that root, but if the other functions there are covered by flat profiles, the result is difficult to reason about.
This patch allows moving everything under a context root (and the root itself) into its own module. For now, we expect a module with a filename matching the GUID of the function to be present in the set of modules known to the linker. This mechanism can be improved in a later patch.
Subsequent patches will implement "move" instead of "import" semantics for the root function, because we want only one version of the root to exist, so that the optimizations we perform are the ones actually observed at runtime.
[LLVM] Only build the GPU loader utility if it has LLVM-libc (#134141)
Summary:
There were some discussions about this being included by default. I need
to fix this up and codify the use of LLVM libc inside of LLVM. For now,
just turn it off unless the user requested the `libc` GPU stuff. This
matches the old behavior.