LLVM/project c659e3allvm/docs/TableGen ProgRef.rst

[TableGen][Docs] Fix `!range` markup (#95540)

DeltaFile
+1-1llvm/docs/TableGen/ProgRef.rst
+1-11 files

LLVM/project 4cf1a19llvm/test/CodeGen/AMDGPU global-atomicrmw-fadd.ll, llvm/test/Transforms/AtomicExpand/AMDGPU expand-atomic-v2bf16-agent.ll expand-atomic-v2bf16-system.ll

Reapply "AMDGPU: Handle legal v2f16/v2bf16 atomicrmw fadd for global/flat (#95394)"

This reverts commit 95b77d90aae10725ea692e120aac083ef1c1297d.
DeltaFile
+1,338-202llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-v2bf16-agent.ll
+1,338-202llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-v2bf16-system.ll
+1,280-240llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-rmw-fadd.ll
+1,196-202llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-v2f16-agent.ll
+1,196-202llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-v2f16-system.ll
+48-924llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd.ll
+6,396-1,9726 files not shown
+6,549-2,90112 files

LLVM/project a1bdb01llvm/lib/Transforms/Vectorize VectorCombine.cpp, llvm/test/Transforms/VectorCombine/AArch64 shuffletoidentity.ll

[VectorCombine] Change shuffleToIdentity to use Use. NFC

When looking up through shuffles, a Value can be multiple different leaf types
(for example an identity from one position, a splat from another). We currently
detect this by recalculating which type of leaf it is when generating, but as
more types of leaves are added (#94954) this doesn't scale very well.

This patch switches it to use Use, not Value, to more accurately detect which
type of leaf each Use should have.
DeltaFile
+51-55llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+13-0llvm/test/Transforms/VectorCombine/AArch64/shuffletoidentity.ll
+64-552 files

LLVM/project 7a75779clang/lib/AST ExprConstant.cpp, clang/lib/AST/Interp Interp.h ByteCodeExprGen.cpp

Merge branch 'main' into users/Enna1/BPI-use-isEHPad
DeltaFile
+116-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.struct.ptr.buffer.atomic.fadd.v2bf16.ll
+57-49clang/lib/AST/ExprConstant.cpp
+43-14llvm/test/Transforms/InstCombine/trunc.ll
+57-0clang/lib/AST/Interp/Interp.h
+56-0flang/lib/Frontend/FrontendActions.cpp
+45-10clang/lib/AST/Interp/ByteCodeExprGen.cpp
+374-7314 files not shown
+494-8120 files

LLVM/project 57b8be4mlir/include/mlir/Target/LLVM/ROCDL Utils.h, mlir/lib/Dialect/GPU CMakeLists.txt

Revert [mlir][Target] Improve ROCDL gpu serialization API (#95790)

Reverts llvm/llvm-project#95456
DeltaFile
+119-160mlir/lib/Target/LLVM/ROCDL/Target.cpp
+10-31mlir/include/mlir/Target/LLVM/ROCDL/Utils.h
+6-1mlir/lib/Target/LLVM/CMakeLists.txt
+1-1mlir/lib/Dialect/GPU/CMakeLists.txt
+136-1934 files

LLVM/project 4bf160eclang/lib/AST ExprConstant.cpp ExprConstShared.h, clang/lib/AST/Interp Interp.h ByteCodeExprGen.cpp

[clang][Interp] Implement Complex-complex multiplication (#94891)

Share the implementation for floating-point complex-complex
multiplication with the current interpreter. This means we need a new
opcode for this, but there's no good way around that.
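
For reference, the textbook product behind complex-complex multiplication can be sketched as follows. This is only a sketch under stated assumptions, not the code shared between ExprConstant.cpp and the interpreter: `mulComplex` is a hypothetical name, and the sketch ignores the infinity/NaN cases that a production implementation also has to handle.

```cpp
#include <utility>

// Hypothetical helper (not part of clang): multiplies (a + bi) by (c + di)
// using the textbook formula and returns {real, imaginary}.
// No special handling of infinities or NaNs.
std::pair<double, double> mulComplex(double a, double b, double c, double d) {
  double real = a * c - b * d;
  double imag = a * d + b * c;
  return {real, imag};
}
```

For example, `mulComplex(1, 2, 3, 4)` evaluates (1 + 2i)(3 + 4i) and yields the pair {-5, 10}.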
DeltaFile
+57-49clang/lib/AST/ExprConstant.cpp
+57-0clang/lib/AST/Interp/Interp.h
+45-10clang/lib/AST/Interp/ByteCodeExprGen.cpp
+31-0clang/test/AST/Interp/complex.cpp
+7-0clang/lib/AST/ExprConstShared.h
+4-0clang/lib/AST/Interp/Opcodes.td
+201-596 files

LLVM/project 954cb5fmlir/include/mlir/Target/LLVM/ROCDL Utils.h, mlir/lib/Dialect/GPU CMakeLists.txt

[mlir][Target] Improve ROCDL gpu serialization API (#95456)

This patch improves the ROCDL gpu serialization API by:
- Introducing the enum `AMDGCNLibraries` for specifying the AMD GCN
device code libraries to use during linking.
- Removing `getCommonBitcodeLibs` in favor of `AMDGCNLibraries`.
Previously `getCommonBitcodeLibs` would try to load all AMD GCN bitcode
libraries; now it will only load the requested libraries.
- Exposing the `compileToBinary` method and making it virtual, allowing
downstream users to re-use this method.
- Exposing `moduleToObjectImpl`. This method provides a prototype flow
for compiling to binary, allowing downstream users to re-use it.
- Avoiding construction of the control variables if no device
libraries are being used.
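
The idea of requesting only specific device libraries through an enum can be sketched roughly as below. Everything here is an assumption made for illustration: `DeviceLibs`, its enumerators, `hasLib`, `librariesToLoad`, and the file names are hypothetical and are not the actual `AMDGCNLibraries` definition or the MLIR API.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a library-selection enum; the real
// `AMDGCNLibraries` enumerators and values may differ.
enum class DeviceLibs : uint32_t { None = 0, Ockl = 1, Ocml = 2, Hip = 4 };

constexpr DeviceLibs operator|(DeviceLibs a, DeviceLibs b) {
  return static_cast<DeviceLibs>(static_cast<uint32_t>(a) |
                                 static_cast<uint32_t>(b));
}

constexpr bool hasLib(DeviceLibs set, DeviceLibs lib) {
  return (static_cast<uint32_t>(set) & static_cast<uint32_t>(lib)) != 0;
}

// Returns only the requested libraries (file names are illustrative).
std::vector<std::string> librariesToLoad(DeviceLibs requested) {
  std::vector<std::string> out;
  if (hasLib(requested, DeviceLibs::Ockl)) out.push_back("ockl.bc");
  if (hasLib(requested, DeviceLibs::Ocml)) out.push_back("ocml.bc");
  if (hasLib(requested, DeviceLibs::Hip)) out.push_back("hip.bc");
  return out;
}
```

With a bitmask like this, passing `DeviceLibs::None` loads nothing, which is also the case where no control variables would be needed.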

This patch also changes the behavior of the CMake flag
`DEFAULT_ROCM_PATH`. Before it would fall back to a default value of
`/opt/rocm` if not specified. However, that default value causes fragile

    [2 lines not shown]
DeltaFile
+160-119mlir/lib/Target/LLVM/ROCDL/Target.cpp
+31-10mlir/include/mlir/Target/LLVM/ROCDL/Utils.h
+1-6mlir/lib/Target/LLVM/CMakeLists.txt
+1-1mlir/lib/Dialect/GPU/CMakeLists.txt
+193-1364 files

LLVM/project 534f856llvm/lib/Transforms/InstCombine InstCombineCasts.cpp, llvm/test/Transforms/InstCombine trunc.ll

[InstCombine] Don't preserve context across div

We can't preserve the context across a non-speculatable instruction,
as this might introduce a trap. Alternatively, we could also
insert all the replacement instructions at the use-site, but that
would be a more intrusive change for the sake of this edge case.

Fixes https://github.com/llvm/llvm-project/issues/95547.
DeltaFile
+6-4llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
+4-4llvm/test/Transforms/InstCombine/trunc.ll
+10-82 files

LLVM/project 7767f0dllvm/test/Transforms/InstCombine trunc.ll

[InstCombine] Add test for #95547 (NFC)
DeltaFile
+43-14llvm/test/Transforms/InstCombine/trunc.ll
+43-141 files

LLVM/project b75e7c6clang/include/clang/Driver Options.td, flang/include/flang/Frontend FrontendActions.h CodeGenOptions.h

[flang] Add -mlink-builtin-bitcode option to fc1 (#94763)

This patch enables the -mlink-builtin-bitcode flag in fc1 so that
bitcode libraries can be linked in. This is needed for OpenMP offloading
libraries.
DeltaFile
+56-0flang/lib/Frontend/FrontendActions.cpp
+15-0flang/test/Driver/mlink-builtin-bc.f90
+6-3clang/include/clang/Driver/Options.td
+4-1flang/include/flang/Frontend/FrontendActions.h
+5-0flang/lib/Frontend/CompilerInvocation.cpp
+4-0flang/include/flang/Frontend/CodeGenOptions.h
+90-41 files not shown
+94-47 files

LLVM/project 0cfdce8libcxx/include vector

[libc++] Guard transitive include of `<locale>` with availability macro (#95686)

This is a follow-up to https://github.com/llvm/llvm-project/pull/80282.
The transitive includes of `<locale>` in `<vector>` were all guarded by
the availability macro -- the new include should also be guarded,
otherwise any users who compile with localization disabled will start
getting errors trying to include `<vector>`.
DeltaFile
+2-0libcxx/include/vector
+2-01 files

LLVM/project fb5e46dllvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU SIISelLowering.cpp AMDGPULegalizerInfo.cpp

AMDGPU: Remove .v2bf16 buffer atomic fadd intrinsics

These are redundant with the unsuffixed versions, and have a name
collision with surprising behavior when the base intrinsic is used with
v2bf16.

The global and flat variants should be removed too, but those are complicated
due to using v2i16 in place of the natural v2bf16. Those cases can soon be
completely deleted in favor of atomicrmw.

The GlobalISel codegen change is broken and substitutes handling as bf16
for handling as f16, but it's a bug that this passed the IRTranslator in the first
place.
DeltaFile
+2-42llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+6-6llvm/test/CodeGen/AMDGPU/fp-atomics-gfx1200.ll
+0-9llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+0-9llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+0-4llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td
+1-1llvm/lib/Target/AMDGPU/BUFInstructions.td
+9-716 files not shown
+9-7812 files

LLVM/project 405882dllvm/lib/Target/AMDGPU AMDGPULegalizerInfo.cpp SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.struct.ptr.buffer.atomic.fadd.v2bf16.ll llvm.amdgcn.ptr.buffer.atomic.fadd_rtn_errors.ll

AMDGPU: Fix legalization for llvm.amdgcn.struct.buffer.atomic.fadd.v2bf16
DeltaFile
+116-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.struct.ptr.buffer.atomic.fadd.v2bf16.ll
+29-0llvm/test/CodeGen/AMDGPU/llvm.amdgcn.ptr.buffer.atomic.fadd_rtn_errors.ll
+2-0llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+1-0llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+148-04 files

LLVM/project 264c664llvm/lib/Analysis BranchProbabilityInfo.cpp

[BPI] Use BasicBlock::isEHPad() to check exception handling block.

There is no need to iterate over all predecessors of the current block to
check whether it is the invoke unwind destination of any predecessor.
We can directly call `BasicBlock::isEHPad()` to check if the current block
is an exception handling block.
DeltaFile
+3-6llvm/lib/Analysis/BranchProbabilityInfo.cpp
+3-61 files

LLVM/project 96e8d0fllvm/lib/Target/AArch64 AArch64ISelLowering.cpp

[AArch64] Refactor creation of a shuffle mask for TBL (NFC) (#92529)

... in preparation for https://github.com/llvm/llvm-project/pull/92528
DeltaFile
+47-37llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+47-371 files

LLVM/project 0432221flang/lib/Optimizer/Transforms DebugTypeGenerator.cpp DebugTypeGenerator.h, flang/test/Integration debug-allocatable-1.f90

[flang][debug] Support allocatables. (#95557)

This PR adds debug support for allocatables. Allocatable arrays use
the existing functionality to read the array information from the
descriptor. A scalar allocatable shows up as a pointer to the
scalar.

While testing this, I noticed that the values of the allocated and associated
flags were swapped. This is also fixed in this PR.

Here is what debugging an allocatable looks like with this patch
in place.

integer, allocatable :: ar1(:, :)
real, allocatable :: sc

allocate(sc)
allocate(ar1(3, 4))


    [6 lines not shown]
DeltaFile
+30-2flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp
+26-0flang/test/Transforms/debug-allocatable-1.fir
+24-0flang/test/Integration/debug-allocatable-1.f90
+7-0flang/lib/Optimizer/Transforms/DebugTypeGenerator.h
+87-24 files

LLVM/project 457e895llvm/lib/CodeGen MachineLICM.cpp RegUsageInfoCollector.cpp

[CodeGen] Do not include $noreg in any regmask operands. NFCI. (#95775)

Saying that a call preserves $noreg seems weird and required a
workaround in MachineLICM.
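
As background, a register mask can be modelled as a bit vector with one bit per physical register, where a set bit means the call preserves that register and register number 0 stands for `$noreg`. The sketch below builds such a mask while leaving bit 0 clear. It is a toy model: `buildRegMask` and `isPreserved` are hypothetical helpers, not the code in RegUsageInfoCollector.cpp.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: builds a mask with one bit per register, packed into
// 32-bit words. A set bit means "preserved across the call".
std::vector<uint32_t> buildRegMask(unsigned numRegs,
                                   const std::vector<unsigned> &preserved) {
  std::vector<uint32_t> mask((numRegs + 31) / 32, 0);
  for (unsigned reg : preserved) {
    // Skip register 0 ($noreg) and any out-of-range register number.
    if (reg == 0 || reg >= numRegs)
      continue;
    mask[reg / 32] |= uint32_t{1} << (reg % 32);
  }
  return mask;
}

// Hypothetical query: true if `reg` is marked as preserved in `mask`.
bool isPreserved(const std::vector<uint32_t> &mask, unsigned reg) {
  return ((mask[reg / 32] >> (reg % 32)) & 1u) != 0;
}
```

Because bit 0 is never set, a consumer of the mask does not need a special case for `$noreg`.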
DeltaFile
+1-2llvm/lib/CodeGen/MachineLICM.cpp
+3-0llvm/lib/CodeGen/RegUsageInfoCollector.cpp
+4-22 files

LLVM/project fb59d9bllvm/test/DebugInfo/Generic sroa-extract-bits.ll

[DebugInfo] Update sroa-extract-bits.ll test (#95774)

Update test due to #91724
DeltaFile
+17-17llvm/test/DebugInfo/Generic/sroa-extract-bits.ll
+17-171 files

LLVM/project 70de505llvm/lib/Analysis BranchProbabilityInfo.cpp

[BPI] Use BasicBlock::isEHPad() to check exception handling block.

There is no need to iterate over all predecessors of the current block to
check whether it is the invoke unwind destination of any predecessor.
We can directly call `BasicBlock::isEHPad()` to check if the current block
is an exception handling block.
DeltaFile
+3-6llvm/lib/Analysis/BranchProbabilityInfo.cpp
+3-61 files

LLVM/project 3cead57mlir/include/mlir/Dialect/EmitC/IR EmitCTypes.td, mlir/include/mlir/Dialect/EmitC/Transforms TypeConversions.h

[mlir][emitc] Add EmitC index types (#93155)

This commit adds `emitc.size_t`, `emitc.ssize_t` and `emitc.ptrdiff_t`
types to the EmitC dialect. These are used to map `index` types to C/C++
types with an explicit signedness, and are emitted in C/C++ as `size_t`,
`ssize_t` and `ptrdiff_t`.
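
The signedness distinction between the emitted C/C++ types can be illustrated with a small sketch. The function names `elementCount` and `signedDistance` are hypothetical; `ssize_t` is left out because it comes from POSIX rather than the C++ standard library.

```cpp
#include <cstddef>
#include <type_traits>

static_assert(std::is_unsigned<std::size_t>::value, "size_t is unsigned");
static_assert(std::is_signed<std::ptrdiff_t>::value, "ptrdiff_t is signed");

// Unsigned index type: natural for sizes and element counts.
// Assumes `end` does not precede `begin`.
std::size_t elementCount(const int *begin, const int *end) {
  return static_cast<std::size_t>(end - begin);
}

// Signed index type: can express a negative distance.
std::ptrdiff_t signedDistance(const int *from, const int *to) {
  return to - from;
}
```

Choosing an explicitly signed or unsigned type matters because an `index` value on its own does not say which interpretation the generated C/C++ code should use.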
DeltaFile
+64-0mlir/lib/Dialect/EmitC/Transforms/TypeConversions.cpp
+30-2mlir/include/mlir/Dialect/EmitC/IR/EmitCTypes.td
+26-0mlir/include/mlir/Dialect/EmitC/Transforms/TypeConversions.h
+15-6mlir/lib/Dialect/EmitC/IR/EmitC.cpp
+19-1mlir/test/Dialect/EmitC/types.mlir
+12-0mlir/test/Target/Cpp/types.mlir
+166-98 files not shown
+196-1714 files

LLVM/project 770393bllvm/lib/CodeGen MachineLICM.cpp, llvm/test/CodeGen/AMDGPU indirect-call.ll

[MachineLICM] Correctly Apply Register Masks (#95746)

Fix regression introduced in d4b8b72
DeltaFile
+13-22llvm/lib/CodeGen/MachineLICM.cpp
+2-2llvm/test/CodeGen/AMDGPU/indirect-call.ll
+15-242 files

LLVM/project c2d9f25clang/lib/CodeGen CGDeclCXX.cpp, clang/test/CodeGenCXX init-invariant.cpp

[clang][CodeGen] Fix EmitInvariantStart for non-zero addrspace (#94346)

The `llvm.invariant.start` intrinsic is already overloaded to work with
memory objects in any address space. We simply instantiate the intrinsic
with the appropriate pointer type.

Fixes #94345.

Co-authored-by: Vito Kortbeek <kortbeek at synopsys.com>
DeltaFile
+6-0clang/test/CodeGenCXX/init-invariant.cpp
+2-1clang/lib/CodeGen/CGDeclCXX.cpp
+8-12 files

LLVM/project 6d973b4clang/lib/CodeGen CGExprAgg.cpp CGExprScalar.cpp, clang/lib/CodeGen/Targets AArch64.cpp X86.cpp

[clang][CodeGen] Return RValue from `EmitVAArg` (#94635)

This should simplify handling of the resulting value by the callers.
DeltaFile
+37-34clang/lib/CodeGen/Targets/AArch64.cpp
+27-25clang/lib/CodeGen/Targets/X86.cpp
+21-24clang/lib/CodeGen/Targets/PPC.cpp
+16-14clang/lib/CodeGen/CGExprAgg.cpp
+13-16clang/lib/CodeGen/Targets/Mips.cpp
+2-21clang/lib/CodeGen/CGExprScalar.cpp
+116-13430 files not shown
+282-28036 files

LLVM/project 52d87dellvm/lib/Target/Xtensa/AsmParser XtensaAsmParser.cpp, llvm/test/MC/Xtensa/Core registers.s

[Xtensa] Fix register asm parsing. (#95551)

Fix passing a temporary string object to the StringRef constructor in
the "parseRegister" function, which causes errors in the
test "llvm/test/MC/Xtensa/Core/processor-control.s".
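
The general hazard is a non-owning string view that outlives the temporary string it points into. The sketch below shows the same pattern with `std::string_view` standing in for `StringRef`. It is not the Xtensa parser code: `toLower` and `isStackPointerName` are hypothetical helpers, and the register names are only illustrative.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical helper that returns a temporary std::string.
std::string toLower(std::string_view s) {
  std::string out(s);
  std::transform(out.begin(), out.end(), out.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return out;
}

bool isStackPointerName(std::string_view name) {
  // Dangling (do not do this):
  //   std::string_view v = toLower(name); // temporary dies at the ';'
  std::string owned = toLower(name); // keep the owning string alive
  std::string_view v = owned;        // the view refers to live storage
  return v == "sp" || v == "a1";
}
```

Binding the result to a named `std::string` first keeps the storage alive for as long as the view is used.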
DeltaFile
+14-0llvm/test/MC/Xtensa/Core/registers.s
+1-1llvm/lib/Target/Xtensa/AsmParser/XtensaAsmParser.cpp
+15-12 files

LLVM/project f06d969llvm/test/CodeGen/AArch64 sve-streaming-mode-fixed-length-int-rem.ll sve-streaming-mode-fixed-length-int-div.ll

[LLVM][DAGCombiner] Extend coverage for insert_subv(undef, extract_subv(A, 0), 0) (#95242)

There is an existing combine to remove the need for extract_subv that
requires matching vector types (all fixed or all scalable).

The combine doesn't need this restriction and so I've changed it to use
ValueType's "knownBits??" interface that supports mixed vector types. In
doing so we also need extra guards to prevent invalid operations (e.g.
extracting a scalable vector from a fixed length vector).
DeltaFile
+172-174llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-rem.ll
+146-168llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-div.ll
+310-0llvm/test/CodeGen/AArch64/vector-insert-dag-combines.ll
+132-130llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-fp-to-int.ll
+68-70llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-to-fp.ll
+14-18llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-int-extends.ll
+842-5602 files not shown
+856-5748 files

LLVM/project 1ba8ed0libcxx/include __split_buffer deque, libcxx/include/__memory shared_ptr.h

[libc++] Mark more types as trivially relocatable (#89724)

Co-authored-by: Louis Dionne <ldionne.2 at gmail.com>
DeltaFile
+119-0libcxx/test/libcxx/type_traits/is_trivially_relocatable.compile.pass.cpp
+11-0libcxx/include/__split_buffer
+10-0libcxx/include/deque
+9-0libcxx/include/vector
+8-0libcxx/include/__memory/shared_ptr.h
+6-0libcxx/include/__utility/pair.h
+163-015 files not shown
+202-021 files

LLVM/project f84056cllvm/include/llvm/IR DebugInfoMetadata.h, llvm/lib/IR DebugInfoMetadata.cpp

[DebugInfo] Handle DW_OP_LLVM_extract_bits in SROA (#94638)

This doesn't need any work to be done in SROA itself, but rather in
functions that it uses. Specifically:
* DIExpression::createFragmentExpression is made to understand
DW_OP_LLVM_extract_bits
* valueCoversEntireFragment is made to check the active bits instead of
the fragment size, so that it handles extract_bits correctly
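
To make the bit-range idea concrete, here is a minimal sketch of extracting `size` bits starting at bit `offset` from a wider value, with zero- or sign-extension. This assumes that is the kind of operation an extract-bits expression describes; `extractBitsZExt` and `extractBitsSExt` are hypothetical helpers, not functions from DebugInfoMetadata.cpp.

```cpp
#include <cstdint>

// Hypothetical helper: zero-extending extraction.
// Preconditions: 1 <= size <= 64 and offset < 64.
uint64_t extractBitsZExt(uint64_t value, unsigned offset, unsigned size) {
  uint64_t mask = size >= 64 ? ~uint64_t{0} : ((uint64_t{1} << size) - 1);
  return (value >> offset) & mask;
}

// Hypothetical helper: sign-extending extraction, same preconditions.
int64_t extractBitsSExt(uint64_t value, unsigned offset, unsigned size) {
  uint64_t raw = extractBitsZExt(value, offset, size);
  uint64_t sign = uint64_t{1} << (size - 1);
  // Flip the sign bit, then subtract it to propagate it upwards.
  return static_cast<int64_t>((raw ^ sign) - sign);
}
```

Only `size` bits of the result carry information, which is why a check based on the active bits rather than on the full fragment size fits this kind of expression.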
DeltaFile
+205-0llvm/test/DebugInfo/Generic/sroa-extract-bits.ll
+68-3llvm/lib/IR/DebugInfoMetadata.cpp
+6-0llvm/include/llvm/IR/DebugInfoMetadata.h
+4-2llvm/lib/Transforms/Utils/Local.cpp
+283-54 files

LLVM/project 7e4f7fcflang/lib/Optimizer/Transforms DebugTypeGenerator.cpp DebugTypeGenerator.h, flang/test/Integration debug-char-type-1.f90

[flang][debug] Support fixed size character type. (#95462)

This PR adds debug support for fixed size character type. The character
type gets translated to DIStringType.

As I have noted in comments, DIStringType currently does not have a
way to represent the underlying character type of the string. This
restricts our ability to represent wide strings. As an example, this is
how the debugger shows two different types of strings. Note that non-ASCII
characters work fine with the default kind string.

  character(kind=4, len=5) :: str1
  character(len=16) :: str2
  str1 = 'hello'
  str2 = 'π = 3.14'

(gdb) p str1
$1 = 'h\000\000\000e\000\000\000l\000\000\000l\000\000\000o\000\000\000'

(gdb) p str2
$2 = 'π = 3.14       '
DeltaFile
+29-0flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp
+21-0flang/test/Integration/debug-char-type-1.f90
+19-0flang/test/Transforms/debug-char-type-1.fir
+4-0flang/lib/Optimizer/Transforms/DebugTypeGenerator.h
+73-04 files

LLVM/project f838f08lldb/include/lldb/Target RegisterFlags.h, lldb/source/Target RegisterFlags.cpp

[lldb] Add register field enum class (#90063)

This represents the enum type that can be assigned to a field using the
`<enum>` element in the target XML.

https://sourceware.org/gdb/current/onlinedocs/gdb.html/Enum-Target-Types.html

Each enumerator has:
* A non-empty name
* A value that is within the range of the field it's applied to
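
The two enumerator rules above can be sketched as a small validity check. This is an illustration only: the `Enumerator` struct and `isValidEnumerator` below are hypothetical stand-ins and do not reproduce the class added to RegisterFlags.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for an enumerator: a value plus its name.
struct Enumerator {
  uint64_t value;
  std::string name;
};

// Checks both rules: the name is non-empty and the value fits in a field
// that is `fieldBits` wide (1..64 bits).
bool isValidEnumerator(const Enumerator &e, unsigned fieldBits) {
  if (e.name.empty())
    return false;
  if (fieldBits == 0 || fieldBits > 64)
    return false;
  uint64_t maxValue =
      fieldBits == 64 ? ~uint64_t{0} : ((uint64_t{1} << fieldBits) - 1);
  return e.value <= maxValue;
}
```

For a 2-bit field the largest acceptable value is 3, so an enumerator with value 4 would be rejected.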

The XML includes a "size" but we don't need that for anything and it's a
pain to verify so I've left it out of our internal structures. When
emitting XML we'll set size to the size of the register using the enum.

An Enumerator class is added to RegisterFlags and hooked up to the
existing ToXML so lldb-server can use it to emit enums as well.

As enums are elements on the same level as flags, when emitting XML

    [5 lines not shown]
DeltaFile
+196-2lldb/source/Target/RegisterFlags.cpp
+174-1lldb/unittests/Target/RegisterFlagsTest.cpp
+65-7lldb/include/lldb/Target/RegisterFlags.h
+435-103 files

LLVM/project e843f02mlir/include/mlir/Dialect/Mesh/IR MeshOps.h, mlir/lib/Conversion/LLVMCommon MemRefBuilder.cpp

mlir: fix incorrect usages of divideCeilSigned (#95680)

Follow up on #95087 to fix incorrect usage instances of
divideCeilSigned.
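
For context, ceiling division on signed integers needs care because C++ integer division truncates toward zero. The sketch below shows one way to write it. `ceilDivSigned` is a hypothetical helper, not LLVM's `divideCeilSigned`, and it assumes a non-zero denominator and no overflow.

```cpp
#include <cstdint>

// Hypothetical helper: ceil(n / d) for signed integers.
// Assumes d != 0 and that n / d does not overflow.
constexpr int64_t ceilDivSigned(int64_t n, int64_t d) {
  // n / d truncates toward zero. When the remainder is non-zero and has the
  // same sign as d, the exact quotient is positive and was rounded down,
  // so bump the result by one.
  return n / d + ((n % d != 0 && ((n % d < 0) == (d < 0))) ? 1 : 0);
}
```

For example, `ceilDivSigned(7, 2)` is 4 while `ceilDivSigned(-7, 2)` is -3, since the ceiling of -3.5 is -3.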
DeltaFile
+2-3mlir/lib/Conversion/LLVMCommon/MemRefBuilder.cpp
+1-1mlir/include/mlir/Dialect/Mesh/IR/MeshOps.h
+1-1mlir/lib/Conversion/MemRefToLLVM/MemRefToLLVM.cpp
+4-53 files