LLVM/project 455b3d6 lld/COFF DriverUtils.cpp SymbolTable.cpp, lld/test/COFF arm64x-export.test

[LLD][COFF] Separate EC and native exports for ARM64X (#123652)

Store exports in SymbolTable instead of Configuration.
Delta File
+0 -136 lld/COFF/DriverUtils.cpp
+135 -0 lld/COFF/SymbolTable.cpp
+121 -0 lld/test/COFF/arm64x-export.test
+32 -27 lld/COFF/Driver.cpp
+18 -16 lld/COFF/DLL.cpp
+8 -8 lld/COFF/Writer.cpp
+314 -187 5 files not shown
+324 -194 11 files

LLVM/project 2cd8e6e .ci monolithic-windows.sh, .github/workflows premerge.yaml

test
Delta File
+11 -4 .github/workflows/premerge.yaml
+4 -0 .ci/monolithic-windows.sh
+1 -1 llvm/CMakeLists.txt
+16 -5 3 files

LLVM/project ebc5020 llvm/lib/Target/AMDGPU/Utils AMDGPUPALMetadata.cpp, llvm/test/CodeGen/AMDGPU pal-metadata-3.0.ll amdpal.ll

[AMDGPU] Update entry point name for PAL metadata (#123581)

The old entry-point metadata name is updated. Nothing is required
to account for deprecation, as nothing uses the old style.
Delta File
+4 -4 llvm/test/CodeGen/AMDGPU/pal-metadata-3.0.ll
+2 -1 llvm/lib/Target/AMDGPU/Utils/AMDGPUPALMetadata.cpp
+1 -1 llvm/test/CodeGen/AMDGPU/amdpal.ll
+1 -1 llvm/test/CodeGen/AMDGPU/elf-notes.ll
+1 -1 llvm/test/CodeGen/AMDGPU/wave_dispatch_regs.ll
+1 -1 llvm/test/CodeGen/AMDGPU/amdpal-vs.ll
+10 -9 6 files not shown
+16 -15 12 files

LLVM/project 67b9d3f mlir/lib/Dialect/Linalg/Utils Utils.cpp, mlir/test/Dialect/Linalg tile-offset.mlir

[mlir] computeSliceParameters: Fix offset when m(0) != 0 (#122492)

For affine maps where `m(0) != 0`,
like `affine_map<(d0) -> (d0 + 3)` in
```
  %generic = linalg.generic
    {indexing_maps = [affine_map<(d0) -> (d0 + 3)>,
                      affine_map<(d0) -> (d0)>],
     iterator_types = ["parallel"]} ins(%arg0: tensor<9xf32>) outs(%empty : tensor<6xf32>) {
    ^bb0(%in : f32, %out: f32):
      linalg.yield %in : f32
    } -> tensor<6xf32>
```
tiling currently computes the wrong slice offsets. When tiling the above
example with a size of 3, it would compute
```
scf.for %i = ...
  %slice = tensor.extract_slice %arg0[%i + 3] [6] [1]
  linalg.generic

    [8 lines not shown]
Delta File
+25 -0 mlir/test/Dialect/Linalg/tile-offset.mlir
+10 -1 mlir/lib/Dialect/Linalg/Utils/Utils.cpp
+35 -1 2 files

LLVM/project 2a8c12b clang/lib/Analysis UnsafeBufferUsage.cpp, clang/test/SemaCXX warn-unsafe-buffer-usage-array.cpp

"Reland "[Wunsafe-buffer-usage] Fix false positive when const sized array is indexed by const evaluatable expressions (#119340)"" (#123713)

This reverts commit 7dd34baf5505d689161c3a8678322a394d7a2929.

Fixed the assertion violation reported by
7dd34baf5505d689161c3a8678322a394d7a2929

Co-authored-by: MalavikaSamak <malavika2 at apple.com>
Delta File
+32 -0 clang/test/SemaCXX/warn-unsafe-buffer-usage-array.cpp
+7 -2 clang/lib/Analysis/UnsafeBufferUsage.cpp
+39 -2 2 files

LLVM/project 9f6ed07 .ci monolithic-windows.sh, .github/workflows premerge.yaml

test
Delta File
+13 -3 .github/workflows/premerge.yaml
+4 -0 .ci/monolithic-windows.sh
+1 -1 llvm/CMakeLists.txt
+18 -4 3 files

LLVM/project 5658bc4 lldb/source/Plugins/Process/Utility LinuxSignals.cpp, lldb/test/API/linux/aarch64/gcs TestAArch64LinuxGCS.py main.c

[lldb][Linux] Add Control Protection Fault signal (#122917)

This will be sent by Arm's Guarded Control Stack extension when an
invalid return is executed.

The signal does have an address we could show, but it's the PC at which
the fault occurred. The debugger has plenty of ways to show you that
already, so I've left it out.

```
(lldb) c
Process 460 resuming
Process 460 stopped
* thread #1, name = 'test', stop reason = signal SIGSEGV: control protection fault
    frame #0: 0x0000000000400784 test`main at main.c:57:1
   54     afunc();
   55     printf("return from main\n");
   56     return 0;
-> 57   }

    [9 lines not shown]
Delta File
+22 -0 lldb/test/API/linux/aarch64/gcs/TestAArch64LinuxGCS.py
+16 -1 lldb/test/API/linux/aarch64/gcs/main.c
+4 -0 lldb/source/Plugins/Process/Utility/LinuxSignals.cpp
+42 -1 3 files

LLVM/project 547bfda clang/lib/CodeGen CGBuiltin.cpp, clang/test/CodeGen arm-bf16-convert-intrinsics.c

[AArch64] Improve bcvtn2 and remove aarch64_neon_bfcvt intrinsics (#120363)

This started out as trying to combine bf16 fpround to BFCVT2
instructions, but ended up removing the aarch64.neon.bfcvt intrinsics in
favour of generating fpround instructions directly. This simplifies the
patterns and can lead to other optimizations. The BFCVT2 instruction is
adjusted to make sure the types are valid, and a bfcvt2 is now
generated in more places. The old intrinsics are auto-upgraded to fptrunc
instructions too.
Delta File
+58 -74 llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+56 -20 llvm/lib/IR/AutoUpgrade.cpp
+38 -3 clang/lib/CodeGen/CGBuiltin.cpp
+13 -11 llvm/lib/Target/AArch64/AArch64InstrInfo.td
+11 -12 clang/test/CodeGen/arm-bf16-convert-intrinsics.c
+0 -14 llvm/test/CodeGen/AArch64/bf16-v4-instructions.ll
+176 -134 4 files not shown
+185 -160 10 files

LLVM/project c22364a llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 csel-cmp-cse.ll

[AArch64] Eliminate Common SUBS by Reassociating Non-Constants (#123344)

Commit 1eed46960c217f9480865702f06fb730c7521e61 added logic to
reassociate a (add (add x y) -c) operand to a CSEL instruction with a
comparison involving x and c (or a similar constant) in order to obtain
a common (SUBS x c) instruction.
    
This commit extends this logic to non-constants. In this way, we also
reassociate a (sub (add x y) z) operand of a CSEL instruction to
(add (sub x z) y) if the CSEL compares x and z, for example.
    
Alive proof: https://alive2.llvm.org/ce/z/SEVpR
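
The reassociation described above relies on (sub (add x y) z) and
(add (sub x z) y) producing the same value. The following is only a
minimal sketch of that identity, assuming 64-bit wraparound arithmetic
modelled with uint64_t; the function names are hypothetical and this is
not the code from AArch64ISelLowering.cpp.

```cpp
#include <cstdint>

// Original operand shape: (sub (add x y) z).
constexpr std::uint64_t original(std::uint64_t x, std::uint64_t y,
                                 std::uint64_t z) {
  return (x + y) - z;
}

// Reassociated shape: (add (sub x z) y). The inner (x - z) is the value a
// comparison of x and z would also compute, so the two can share one SUBS.
constexpr std::uint64_t reassociated(std::uint64_t x, std::uint64_t y,
                                     std::uint64_t z) {
  return (x - z) + y;
}

// Unsigned arithmetic wraps modulo 2^64, so both shapes agree even when an
// intermediate result wraps around.
static_assert(original(5, 7, 9) == reassociated(5, 7, 9),
              "shapes must agree when x - z wraps");
static_assert(original(0, 1, UINT64_MAX) == reassociated(0, 1, UINT64_MAX),
              "shapes must agree at the extremes");
```

The static_asserts only spot-check two inputs; the Alive proof linked
above is the actual correctness argument.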
Delta File
+376 -16 llvm/test/CodeGen/AArch64/csel-cmp-cse.ll
+42 -16 llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+418 -32 2 files

LLVM/project c97393d .ci monolithic-windows.sh, .github/workflows premerge.yaml

test
Delta File
+12 -3 .github/workflows/premerge.yaml
+4 -0 .ci/monolithic-windows.sh
+1 -1 llvm/CMakeLists.txt
+17 -4 3 files

LLVM/project ad372af lld/test/ELF loongarch-relax-tlsdesc.s

Modify loongarch-relax-tlsdesc.s.
Delta File
+16 -29 lld/test/ELF/loongarch-relax-tlsdesc.s
+16 -29 1 files

LLVM/project ddb64ee lld/ELF/Arch LoongArch.cpp

Support relaxation during TLSDESC GD/LD to IE/LE conversion.

Complement https://. When relaxation is enabled, remove redundant NOPs.
Delta File
+29 -3 lld/ELF/Arch/LoongArch.cpp
+29 -3 1 files

LLVM/project 30ff9a9 .github/workflows premerge.yaml, llvm CMakeLists.txt

test
Delta File
+12 -3 .github/workflows/premerge.yaml
+1 -1 llvm/CMakeLists.txt
+13 -4 2 files

LLVM/project 653281c lld/ELF Relocations.cpp

Add comments.
Delta File
+3 -0 lld/ELF/Relocations.cpp
+3 -0 1 files

LLVM/project 4740e09 clang-tools-extra/clangd XRefs.cpp FindTarget.cpp, clang/include/clang/Sema HeuristicResolver.h

[clang][Sema] Respect qualification of methods in heuristic results (#123551)

Fixes https://github.com/llvm/llvm-project/issues/123549
Delta File
+49 -38 clang/lib/Sema/HeuristicResolver.cpp
+20 -0 clang/unittests/Sema/HeuristicResolverTest.cpp
+4 -3 clang-tools-extra/clangd/XRefs.cpp
+2 -2 clang/include/clang/Sema/HeuristicResolver.h
+1 -2 clang-tools-extra/clangd/FindTarget.cpp
+76 -45 5 files

LLVM/project 97d691b llvm/unittests/CodeGen LowLevelTypeTest.cpp, llvm/unittests/FuzzMutate RandomIRBuilderTest.cpp OperationsTest.cpp

[IR][unittests] Replace PointerType::get(Type) with opaque version (NFC) (#123621)

In accordance with https://github.com/llvm/llvm-project/issues/123569
Delta File
+66 -89 llvm/unittests/IR/InstructionsTest.cpp
+4 -6 llvm/unittests/FuzzMutate/RandomIRBuilderTest.cpp
+3 -3 llvm/unittests/IR/ConstantsTest.cpp
+3 -3 llvm/unittests/Transforms/Vectorize/VPlanTest.cpp
+2 -3 llvm/unittests/CodeGen/LowLevelTypeTest.cpp
+1 -2 llvm/unittests/FuzzMutate/OperationsTest.cpp
+79 -106 2 files not shown
+81 -108 8 files

LLVM/project 2ce5c0f .github/workflows premerge.yaml, llvm CMakeLists.txt

test
Delta File
+10 -3 .github/workflows/premerge.yaml
+1 -1 llvm/CMakeLists.txt
+11 -4 2 files

LLVM/project b85cc52 llvm/lib/Target/AMDGPU SILowerWWMCopies.cpp SILowerWWMCopies.h, llvm/test/CodeGen/AMDGPU si-lower-wwm-copies.mir

[AMDGPU][NewPM] Port SILowerWWMCopies to NPM
Delta File
+59 -30 llvm/lib/Target/AMDGPU/SILowerWWMCopies.cpp
+43 -0 llvm/test/CodeGen/AMDGPU/si-lower-wwm-copies.mir
+22 -0 llvm/lib/Target/AMDGPU/SILowerWWMCopies.h
+4 -3 llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+2 -2 llvm/lib/Target/AMDGPU/AMDGPU.h
+1 -0 llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
+131 -35 6 files

LLVM/project 7b4cefb llvm/lib/Target/AMDGPU SILowerWWMCopies.cpp

[AMDGPU][CodeGen] Add used analyses to SILowerWWMCopies
Delta File
+3 -0 llvm/lib/Target/AMDGPU/SILowerWWMCopies.cpp
+3 -0 1 files

LLVM/project 3529cc4 llvm/test/CodeGen/AMDGPU shufflevector.v4i32.v4i32.ll shufflevector.v4p3.v4p3.ll

DAG: Avoid breaking legal vector_shuffle with multiple uses

Previously this combine would undo AMDGPU's new custom legalization of
wide vector shuffles into 2 element pieces. The comment also
states that this combine is only done before legalization,
but the case with a build_vector source was unconditional.

We probably don't want to do this if the multiple uses are full
scalarization of the vector, but this seems to work well enough.
Scalarizing extracts should have folded out pre-legalize.
Delta File
+345 -468 llvm/test/CodeGen/AMDGPU/shufflevector.v4i32.v4i32.ll
+345 -468 llvm/test/CodeGen/AMDGPU/shufflevector.v4p3.v4p3.ll
+345 -468 llvm/test/CodeGen/AMDGPU/shufflevector.v4f32.v4f32.ll
+122 -174 llvm/test/CodeGen/AMDGPU/shufflevector.v4i32.v2i32.ll
+122 -174 llvm/test/CodeGen/AMDGPU/shufflevector.v4f32.v2f32.ll
+122 -174 llvm/test/CodeGen/AMDGPU/shufflevector.v4p3.v2p3.ll
+1,401 -1,926 4 files not shown
+1,805 -2,367 10 files

LLVM/project 1c0f60a llvm/lib/Target/AMDGPU SIISelLowering.cpp

AMDGPU: Custom lower 32-bit element shuffles

This is so we can try to make use of v_pk_mov_b32 when available.
Note this currently has little observable effect. The combiner
will undo the common extract of shuffle pattern. The lack
of test changes should demonstrate this change is minimally
correct.

We should probably try to make better use of wider extracts in
even aligned cases, but I'm trying to avoid some really ugly
regalloc regressions in some MFMA tests. The DAG scheduler ends
up doing a worse job if we use vector extracts, resulting
in failure to do 3 address conversion of MFMAs.
Delta File
+80 -5 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+80 -5 1 files

LLVM/project e696bc4 llvm/include/llvm/Support YAMLTraits.h, llvm/lib/Support YAMLTraits.cpp

[NFC][YAML] Add `IO::error()`



Pull Request: https://github.com/llvm/llvm-project/pull/123475
Delta File
+3 -1 llvm/include/llvm/Support/YAMLTraits.h
+2 -0 llvm/lib/Support/YAMLTraits.cpp
+5 -1 2 files

LLVM/project 0caa47f llvm/lib/ObjectYAML MachOYAML.cpp ELFYAML.cpp, llvm/test/ObjectYAML/MachO section_data.yaml

[YAML] Don't validate `Fill::Size` after error

Size is required, so we don't know whether it is in an
uninitialized state after the previous error.

Triggers msan on llvm/test/tools/yaml2obj/ELF/custom-fill.yaml

Pull Request: https://github.com/llvm/llvm-project/pull/123280
Delta File
+42 -0 llvm/test/ObjectYAML/MachO/section_data.yaml
+4 -1 llvm/lib/ObjectYAML/MachOYAML.cpp
+3 -1 llvm/lib/ObjectYAML/ELFYAML.cpp
+3 -1 llvm/test/tools/yaml2obj/ELF/custom-fill.yaml
+52 -3 4 files

LLVM/project fa1b0c6 llvm/lib/CodeGen MachineLICM.cpp

[MachineLICM] Disabling hoisting unlikely MI by default

Stop hoisting computations out of unlikely blocks in loops.
Delta File
+10 -11 llvm/lib/CodeGen/MachineLICM.cpp
+10 -11 1 files

LLVM/project 0f9e913 mlir/include/mlir/Dialect/LLVMIR NVVMOps.td, mlir/test/Target/LLVMIR/nvvm tma_bulk_copy.mlir

[MLIR][NVVM] Add TMA Bulk Copy Ops (#123186)

PR #122344 adds intrinsics for Bulk Async Copy
(non-tensor variants) using TMA. This patch
adds the corresponding NVVM Dialect Ops.

lit tests are added to verify the lowering to all
variants of the intrinsics.

Signed-off-by: Durgadoss R <durgadossr at nvidia.com>
Delta File
+144 -0 mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+35 -0 mlir/test/Target/LLVMIR/nvvm/tma_bulk_copy.mlir
+179 -0 2 files

LLVM/project a588e20 llvm/lib/CodeGen/SelectionDAG StatepointLowering.cpp

[SelectionDAG] Avoid repeated hash lookups (NFC) (#123697)
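
The "avoid repeated hash lookups" cleanups in this and the following
commits share a common shape. The following is only a minimal sketch of
that pattern, assuming std::unordered_map rather than LLVM's own
containers; the function names are hypothetical and do not come from the
patches themselves.

```cpp
#include <string>
#include <unordered_map>

// Before: find() hashes the key once, and operator[] hashes it again on a miss.
int getOrAssignIdTwoLookups(std::unordered_map<std::string, int> &ids,
                            const std::string &key) {
  auto it = ids.find(key);
  if (it != ids.end())
    return it->second;
  int id = static_cast<int>(ids.size());
  ids[key] = id;
  return id;
}

// After: try_emplace performs a single lookup. It inserts only when the key
// is absent and returns an iterator to the entry in either case.
int getOrAssignIdOneLookup(std::unordered_map<std::string, int> &ids,
                           const std::string &key) {
  // ids.size() is evaluated before try_emplace runs, so a newly inserted
  // key receives the next unused id.
  auto result = ids.try_emplace(key, static_cast<int>(ids.size()));
  return result.first->second;
}
```

Both functions return the same ids for the same sequence of keys; only
the number of hash computations on a miss differs.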

Delta File
+3 -2 llvm/lib/CodeGen/SelectionDAG/StatepointLowering.cpp
+3 -2 1 files

LLVM/project 671088b clang/lib/Frontend/Rewrite RewriteModernObjC.cpp

[Rewrite] Avoid repeated hash lookups (NFC) (#123696)

Delta File
+2 -2 clang/lib/Frontend/Rewrite/RewriteModernObjC.cpp
+2 -2 1 files

LLVM/project 1714fac llvm/utils/TableGen/Basic VTEmitter.cpp

[TableGen] Avoid repeated map lookups (NFC) (#123699)

Delta File
+7 -6 llvm/utils/TableGen/Basic/VTEmitter.cpp
+7 -6 1 files

LLVM/project 73beb15 llvm/lib/MC WasmObjectWriter.cpp

[MC] Avoid repeated hash lookups (NFC) (#123698)

Delta File
+3 -2 llvm/lib/MC/WasmObjectWriter.cpp
+3 -2 1 files

LLVM/project 26b87aa llvm/lib/Target/Mips MipsISelLowering.h MipsISelLowering.cpp, llvm/lib/Target/Mips/MCTargetDesc MipsBaseInfo.h

[Mips] Handle declspec(dllimport) on mipsel-windows-* triples (#120912)

On Windows, imported symbols must be searched with '__imp_' prefix.
Support imported global variables and imported functions.
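
The '__imp_' naming convention mentioned above can be illustrated as
follows. This is only a sketch of how the import symbol name is formed;
the helper name is hypothetical, and the actual lowering in the Mips
backend is not shown here.

```cpp
#include <string>

// Hypothetical helper: returns the name under which a dllimport'ed symbol
// is looked up on Windows, i.e. the original name with an '__imp_' prefix.
std::string importSymbolName(const std::string &name) {
  return "__imp_" + name;
}
```

For example, a reference to an imported function foo would be resolved
through the symbol __imp_foo.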
Delta File
+55 -0 llvm/test/CodeGen/Mips/dllimport.ll
+27 -0 llvm/lib/Target/Mips/MipsISelLowering.h
+16 -1 llvm/lib/Target/Mips/MipsISelLowering.cpp
+12 -2 llvm/lib/Target/Mips/MipsMCInstLower.cpp
+11 -0 llvm/test/MC/Mips/coff-relocs-dllimport.ll
+6 -1 llvm/lib/Target/Mips/MCTargetDesc/MipsBaseInfo.h
+127 -4 1 files not shown
+129 -4 7 files