LLVM/project 6e535a9 llvm/lib/Target/AArch64 AArch64InstrFormats.td AArch64InstrInfo.td, llvm/test/MC/AArch64 armv9.6a-lsui.s armv9.6a-srmask.s

[LLVM][MC][AArch64] Assembler support for Armv9.6-A memory systems extensions (#112341)

Add support for the following Armv9.6-A memory systems extensions:
  FEAT_LSUI      - Unprivileged Load Store
  FEAT_OCCMO     - Outer Cacheable Cache Maintenance Operation
  FEAT_PCDPHINT  - Producer-Consumer Data Placement Hints
  FEAT_SRMASK    - Bitwise System Register Write Masks

as documented here:

https://developer.arm.com/documentation/109697/2024_09/Feature-descriptions/The-Armv9-6-architecture-extension

Co-authored-by: Jonathan Thackray <jonathan.thackray at arm.com>
Delta  File
+486 -0  llvm/test/MC/AArch64/armv9.6a-lsui.s
+324 -0  llvm/test/MC/Disassembler/AArch64/armv9.6a-lsui.txt
+304 -0  llvm/lib/Target/AArch64/AArch64InstrFormats.td
+153 -29  llvm/lib/Target/AArch64/AArch64InstrInfo.td
+102 -0  llvm/test/MC/AArch64/armv9.6a-srmask.s
+102 -0  llvm/test/MC/Disassembler/AArch64/armv9.6a-srmask.txt
+1,471 -29  17 files not shown
+1,797 -38  23 files

LLVM/project a18826d llvm/lib/Target/AMDGPU/MCTargetDesc AMDGPUMCExpr.cpp, llvm/test/CodeGen/AMDGPU mcexpr-knownbits-assign-crash-gh-issue-110930.ll

[AMDGPU] Create local KnownBits in case DenseMap gets invalidated (#111568)

KnownBits retrieved from a DenseMap may be invalidated if an insertion
requires a (re)growth.
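
The hazard can be sketched with standard containers. This is a minimal illustration and not the code from the patch: `std::vector` stands in for `DenseMap` (both may move their elements when they grow), and `Bits` and `appendCombined` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a value cached in a growing container.
struct Bits {
  unsigned Zero;
  unsigned One;
};

// Appends a new entry derived from entries A and B and returns its index.
// Binding `const Bits &LHS = Table[A];` and reading LHS after push_back would
// be unsafe: push_back may reallocate and leave the reference dangling.
std::size_t appendCombined(std::vector<Bits> &Table, std::size_t A,
                           std::size_t B) {
  // Local copies stay valid no matter how the container grows.
  Bits LHS = Table[A];
  Bits RHS = Table[B];
  Table.push_back(Bits{});     // may reallocate the underlying storage
  Bits &Result = Table.back(); // reference taken after the growth
  Result.Zero = LHS.Zero & RHS.Zero;
  Result.One = LHS.One | RHS.One;
  return Table.size() - 1;
}
```

Copying the inputs costs a few bytes per call but removes any dependence on when the container decides to grow.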

Fixes https://github.com/llvm/llvm-project/issues/110930
Delta  File
+333 -0  llvm/test/CodeGen/AMDGPU/mcexpr-knownbits-assign-crash-gh-issue-110930.ll
+6 -2  llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp
+339 -2  2 files

LLVM/project d8d144a llvm/lib/Target/AArch64 SVEInstrFormats.td, llvm/test/MC/AArch64/SVE2 bfscale.s bfscale-diagnostics.s

[LLVM][AArch64] Add assembly/disassembly of SVE BFSCALE instruction (#113168)

This patch adds assembly/disassembly support and tests for the SVE
BFSCALE instruction, according to
https://developer.arm.com/documentation/ddi0602.
Delta  File
+50 -0  llvm/test/MC/AArch64/SVE2/bfscale.s
+43 -0  llvm/test/MC/AArch64/SVE2/bfscale-diagnostics.s
+6 -0  llvm/test/MC/AArch64/SVE2/directive-cpu-negative.s
+6 -0  llvm/test/MC/AArch64/SVE2/directive-arch-negative.s
+6 -0  llvm/test/MC/AArch64/SVE2/directive-arch_extension-negative.s
+5 -0  llvm/lib/Target/AArch64/SVEInstrFormats.td
+116 -0  4 files not shown
+131 -1  10 files

LLVM/project c5ea7b8 mlir/lib/Pass IRPrinting.cpp

[mlir] Avoid repeated hash lookups (NFC) (#113249)
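
The title does not say which lookups were merged, so the following is only a generic sketch of the idiom. It uses `std::unordered_map` rather than LLVM's own containers, and `countWord` is a hypothetical helper.

```cpp
#include <string>
#include <unordered_map>

// Repeated lookups, each hashing the key again:
//   if (Counts.find(Word) == Counts.end()) Counts[Word] = 0;
//   ++Counts[Word];
//
// Single lookup: try_emplace returns an iterator to the existing or newly
// inserted element, so the key is hashed only once.
int countWord(std::unordered_map<std::string, int> &Counts,
              const std::string &Word) {
  auto Result = Counts.try_emplace(Word, 0);
  return ++Result.first->second;
}
```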

Delta  File
+1 -4  mlir/lib/Pass/IRPrinting.cpp
+1 -4  1 files

LLVM/project 5dbfb49 lldb/source/Plugins/DynamicLoader/FreeBSD-Kernel DynamicLoaderFreeBSDKernel.cpp

[lldb] Avoid repeated hash lookups (NFC) (#113248)

Delta  File
+3 -3  lldb/source/Plugins/DynamicLoader/FreeBSD-Kernel/DynamicLoaderFreeBSDKernel.cpp
+3 -3  1 files

LLVM/project 0690a42 llvm/lib/Target/BPF BTFDebug.cpp

[BPF] Avoid repeated map lookups (NFC) (#113247)

Delta  File
+8 -12  llvm/lib/Target/BPF/BTFDebug.cpp
+8 -12  1 files

LLVM/project da66f6a llvm/tools/llvm-jitlink llvm-jitlink.cpp llvm-jitlink.h

[llvm-jitlink] Use heterogenous lookups with std::map (NFC) (#113245)
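
For readers unfamiliar with the term, heterogeneous lookup lets `std::map::find` accept a key-like type without first building a key object. A minimal sketch follows; `SymbolTable` and `lookup` are hypothetical names and are not taken from llvm-jitlink.

```cpp
#include <functional>
#include <map>
#include <string>
#include <string_view>

// std::less<> is a transparent comparator, so find() accepts any type that is
// comparable with the key, without building a temporary std::string.
using SymbolTable = std::map<std::string, int, std::less<>>;

// Returns the mapped value, or -1 if Name is absent.
int lookup(const SymbolTable &Table, std::string_view Name) {
  auto It = Table.find(Name); // no std::string is constructed from Name
  return It == Table.end() ? -1 : It->second;
}
```

With the default `std::less<std::string>` comparator the same call would not compile for a `std::string_view` argument, and a caller would have to construct a `std::string` for every lookup.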

Delta  File
+1 -1  llvm/tools/llvm-jitlink/llvm-jitlink.cpp
+1 -1  llvm/tools/llvm-jitlink/llvm-jitlink.h
+2 -2  2 files

LLVM/project ac1a01f llvm/lib/CodeGen CFIFixup.cpp

Reland [CFIFixup] Factor CFI remember/restore insertion into a helper (NFC) (#113328)

The previous submission appeared to trigger a build failure
(https://lab.llvm.org/buildbot/#/builders/17/builds/3116), but that
seems to have been a spurious failure caused by a flaky test.
Delta  File
+30 -16  llvm/lib/CodeGen/CFIFixup.cpp
+30 -16  1 files

LLVM/project 1004865 mlir/lib/Dialect/Vector/Transforms LowerVectorTransfer.cpp, mlir/test/Conversion/VectorToSCF vector-to-scf.mlir

[mlir][Vector] Support 0-d vectors natively in TransferOpReduceRank (#112907)

Since
https://github.com/llvm/llvm-project/commit/ddf2d62c7dddf1e4a9012d96819ff1ed005fbb05,
0-d vectors are supported in VectorType. This patch removes the handling
of 0-d vectors as scalars from the TransferOpReduceRank pattern. This
pattern specifically introduces tensor.extract_slice during
vectorization, which prevents vectorization from folding
transfer_read/transfer_write slices properly. The changes in the
vectorization test files reflect this.

Other lowering patterns still side-step proper handling of 0-d vectors
by turning them into scalars, but this patch focuses only on the
vector.transfer_x patterns.
Delta  File
+0 -21  mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp
+10 -10  mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir
+2 -2  mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir
+2 -2  mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir
+14 -35  4 files

LLVM/project 4c697f7 llvm/lib/Transforms/Utils LowerMemIntrinsics.cpp, llvm/test/CodeGen/AMDGPU memmove-var-size.ll lower-mem-intrinsics.ll

[LowerMemIntrinsics] Use i8 GEPs in memcpy/memmove lowering (#112707)

The IR lowering of memcpy/memmove intrinsics uses a target-specific type
for its load/store operations. So far, the loaded and stored addresses
are computed with GEPs based on this type. That is wrong if the
allocation size of the type differs from its store size: The width of
the accesses is determined by the store size, while the GEP stride is
determined by the allocation size. If the allocation size is greater
than the store size, some bytes are not copied/moved.

This patch changes the GEPs to use i8 addressing, with offsets based on
the type's store size. The correctness of the lowering therefore no
longer depends on the type's allocation size.
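
The effect of the stride mismatch can be modelled in a few lines. This is a simplified sketch, not the lowering code: `countMissedBytes` is a hypothetical helper, the iteration count is assumed to be the length divided by the store size, and the sizes used in the note below (store size 6, allocation size 8) are made-up values chosen only to show the gap.

```cpp
#include <cstddef>
#include <vector>

// Models a copy loop over Len bytes that accesses StoreSize bytes per
// iteration and advances the address by Stride bytes each time. Returns how
// many of the Len bytes are never touched.
std::size_t countMissedBytes(std::size_t Len, std::size_t StoreSize,
                             std::size_t Stride) {
  std::vector<bool> Copied(Len, false);
  const std::size_t Iterations = Len / StoreSize;
  for (std::size_t I = 0; I < Iterations; ++I) {
    for (std::size_t B = 0; B < StoreSize; ++B) {
      const std::size_t Offset = I * Stride + B;
      if (Offset < Len) // the model ignores accesses past the end
        Copied[Offset] = true;
    }
  }
  std::size_t Missed = 0;
  for (bool C : Copied)
    Missed += C ? 0 : 1;
  return Missed;
}
```

In this model, `countMissedBytes(24, 6, 8)` (stride taken from the allocation size) leaves 6 bytes uncopied, while `countMissedBytes(24, 6, 6)` (byte offsets based on the store size) leaves none.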

This is in support of PR #112332, which allows adjusting the memcpy loop
lowering type through a command line argument in the AMDGPU backend.
Delta  File
+727 -736  llvm/test/CodeGen/AMDGPU/memmove-var-size.ll
+471 -498  llvm/test/CodeGen/AMDGPU/lower-mem-intrinsics.ll
+108 -107  llvm/lib/Transforms/Utils/LowerMemIntrinsics.cpp
+93 -111  llvm/test/CodeGen/AMDGPU/memcpy-crash-issue63986.ll
+77 -82  llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.memcpy.ll
+1,476 -1,534  5 files

LLVM/project 4275a73 libc/src/__support/macros/properties types.h

[libc] Fix long double is double double const (#113258)

It turns out that LDBL_MANT_DIG == 106 for double double. This patch
fixes the constant. Should fix the ppc buildbot.
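
As a rough illustration of why the exact value matters, a mantissa width can be mapped to the `long double` format it implies. This is a hedged sketch with hypothetical names (`LongDoubleFormat`, `classifyLongDouble`), not the code in `types.h`.

```cpp
#include <cfloat>

// Formats commonly used for long double, keyed by mantissa width.
enum class LongDoubleFormat {
  Binary64,     // 53 bits: long double is the same as double
  X87Extended,  // 64 bits: x87 80-bit extended precision
  DoubleDouble, // 106 bits: a pair of doubles (2 * 53)
  Binary128,    // 113 bits: IEEE quad precision
  Unknown
};

// Maps a mantissa width such as LDBL_MANT_DIG to the format it implies.
constexpr LongDoubleFormat classifyLongDouble(int MantDig) {
  return MantDig == 53    ? LongDoubleFormat::Binary64
         : MantDig == 64  ? LongDoubleFormat::X87Extended
         : MantDig == 106 ? LongDoubleFormat::DoubleDouble
         : MantDig == 113 ? LongDoubleFormat::Binary128
                          : LongDoubleFormat::Unknown;
}

// The format of long double on the target this is compiled for.
constexpr LongDoubleFormat HostLongDouble = classifyLongDouble(LDBL_MANT_DIG);
```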

Previously:
https://github.com/llvm/llvm-project/pull/113235
https://github.com/llvm/llvm-project/issues/113237
https://github.com/llvm/llvm-project/pull/91651
Delta  File
+1 -1  libc/src/__support/macros/properties/types.h
+1 -1  1 files

LLVM/project 6512a8d llvm/lib/Target/SystemZ/MCTargetDesc SystemZInstPrinter.cpp SystemZInstPrinterCommon.cpp

[SystemZ] Split SystemZInstPrinter to two classes based on Asm dialect (#112975)

In preparation for future work on separating the output of the GNU and
HLASM ASM dialects, we first split the SystemZInstPrinter class into two
versions, one for each ASM dialect.

The common code now lives in a SystemZInstPrinterCommon class.

---------

Co-authored-by: Tony Tao <tonytao at ca.ibm.com>
Delta  File
+0 -266  llvm/lib/Target/SystemZ/MCTargetDesc/SystemZInstPrinter.cpp
+246 -0  llvm/lib/Target/SystemZ/MCTargetDesc/SystemZInstPrinterCommon.cpp
+0 -95  llvm/lib/Target/SystemZ/MCTargetDesc/SystemZInstPrinter.h
+88 -0  llvm/lib/Target/SystemZ/MCTargetDesc/SystemZInstPrinterCommon.h
+46 -0  llvm/lib/Target/SystemZ/MCTargetDesc/SystemZGNUInstPrinter.h
+45 -0  llvm/lib/Target/SystemZ/MCTargetDesc/SystemZHLASMInstPrinter.h
+425 -361  8 files not shown
+531 -373  14 files

LLVM/project 6761b24 clang-tools-extra/docs/clang-tidy/checks/bugprone unchecked-optional-access.rst, clang/include/clang/Analysis/FlowSensitive/Models UncheckedOptionalAccessModel.h

[clang][dataflow] Cache accessors for bugprone-unchecked-optional-access (#112605)

Treat calls to zero-param const methods as having stable return values
(with a cache) to address issue #58510. The cache is invalidated when
non-const methods are called. This uses the infrastructure from PR
#111006.

For now we cache methods returning:
- ref to optional
- optional by value
- booleans

We can extend that to pointers to optional in a follow-up change.
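
A minimal sketch of the kind of code this affects, assuming a hypothetical `Config` class with a zero-parameter const accessor that returns a reference to an optional. If the two `name()` calls are treated as returning the same stable value, the `has_value()` check guards the dereference.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical class with a zero-parameter const accessor.
class Config {
public:
  explicit Config(std::optional<std::string> Name) : Name(std::move(Name)) {}
  const std::optional<std::string> &name() const { return Name; }

private:
  std::optional<std::string> Name;
};

std::string nameOrDefault(const Config &C) {
  // Two separate calls to name(): the has_value() check protects the
  // dereference only if both calls are assumed to return the same optional.
  if (C.name().has_value())
    return *C.name();
  return "unnamed";
}
```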
Delta  File
+196 -1  clang/unittests/Analysis/FlowSensitive/UncheckedOptionalAccessModelTest.cpp
+131 -5  clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp
+12 -5  clang/include/clang/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.h
+10 -0  clang-tools-extra/docs/clang-tidy/checks/bugprone/unchecked-optional-access.rst
+349 -11  4 files

LLVM/project b81d8e9 clang/lib/Parse ParseDeclCXX.cpp

[NFC][Clang] Fix enumerated mismatch warning (#112816)

This is one of the many PRs to fix errors with LLVM_ENABLE_WERROR=on.
Built with GCC 11.

```
Fix warning:
llvm-project/clang/lib/Parse/ParseDeclCXX.cpp:3153:14: error: enumerated mismatch in conditional expression: ‘clang::diag::<unnamed enum>’ vs ‘clang::diag::<unnamed enum>’ [-Werror=enum-compare]
 3152 |          DS.isFriendSpecified() || NextToken().is(tok::kw_friend)
      |          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 3153 |              ? diag::err_friend_concept
      |              ^~~~~~~~~~~~~~~~~~~~~~~~~~
 3154 |              : diag::
      |              ~~~~~~~~
 3155 |                    err_concept_decls_may_only_appear_in_global_namespace_scope);

```
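
One common way to silence this class of warning is to convert both arms of the conditional to a shared integer type. The sketch below uses hypothetical namespaces and values, and is not necessarily the fix applied in this patch.

```cpp
// Two unrelated unnamed enums, standing in for separately generated
// diagnostic ID lists (hypothetical names and values).
namespace parse_diag { enum { err_friend = 10 }; }
namespace sema_diag { enum { err_scope = 20 }; }

// `Cond ? parse_diag::err_friend : sema_diag::err_scope` mixes two enum types
// and triggers -Wenum-compare on GCC. Converting both arms to a common
// integer type first avoids the mismatch.
unsigned selectDiag(bool Cond) {
  return Cond ? static_cast<unsigned>(parse_diag::err_friend)
              : static_cast<unsigned>(sema_diag::err_scope);
}
```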

---------

    [2 lines not shown]
Delta  File
+7 -5  clang/lib/Parse/ParseDeclCXX.cpp
+7 -5  1 files

LLVM/project d6e714b libcxx/test/libcxx transitive_includes_to_csv.py headers_in_modulemap.sh.py, libcxx/test/libcxx/transitive_includes to_csv.py

[libc++] Rewrite the transitive header checking machinery (#110554)

Since we don't generate a full dependency graph of headers, we can
greatly simplify the script that parses the result of --trace-includes.

At the same time, we also unify the mechanism for detecting whether a
header is a public/C compat/internal/etc header with the existing
mechanism in header_information.py.

As a drive-by this fixes the headers_in_modulemap.sh.py test which had
been disabled by mistake because it used its own way of determining
the list of libc++ headers. By consistently using header_information.py
to get that information, problems like this shouldn't happen anymore.

This should also unblock #110303, which was blocked because of
a brittle implementation of the transitive includes check which broke
when the repository was cloned at a path like /path/__something/more.
Delta  File
+160 -116  libcxx/utils/libcxx/header_information.py
+0 -147  libcxx/test/libcxx/transitive_includes_to_csv.py
+120 -0  libcxx/test/libcxx/transitive_includes/to_csv.py
+6 -16  libcxx/test/libcxx/headers_in_modulemap.sh.py
+3 -11  libcxx/utils/generate_libcxx_cppm_in.py
+6 -6  libcxx/test/libcxx/transitive_includes.gen.py
+295 -296  2 files not shown
+297 -298  8 files

LLVM/project c623df3 libcxx/test/std/atomics/atomics.lockfree is_always_lock_free.pass.cpp

[libc++] Fix typo in is_always_lock_free test (#113169)

Delta  File
+1 -1  libcxx/test/std/atomics/atomics.lockfree/is_always_lock_free.pass.cpp
+1 -1  1 files

LLVM/project 5bb3480 clang/test/CodeGen arm-bf16-convert-intrinsics.c aarch64-sve-vls-bitwise-ops.c, llvm/test/Analysis/CostModel/SystemZ divrem-pow2.ll

[NFC] Migrate tests to use autoupdate for CHECK lines.
Delta  File
+641 -308  llvm/test/CodeGen/AArch64/arm64-codegen-prepare-extload.ll
+184 -184  clang/test/CodeGen/arm-bf16-convert-intrinsics.c
+241 -60  llvm/test/Analysis/CostModel/SystemZ/divrem-pow2.ll
+81 -81  clang/test/CodeGen/aarch64-sve-vls-bitwise-ops.c
+62 -38  llvm/test/Instrumentation/MemorySanitizer/reduce.ll
+60 -26  llvm/test/Instrumentation/MemorySanitizer/vector_arith.ll
+1,269 -697  7 files not shown
+1,461 -782  13 files

LLVM/project 56f75b5 clang/test/SemaCXX attr-lifetimebound.cpp

Fix the broken test
Delta  File
+2 -2  clang/test/SemaCXX/attr-lifetimebound.cpp
+2 -2  1 files

LLVM/project 91c1157 mlir/lib/Dialect/Bufferization/Transforms OneShotModuleBufferize.cpp, mlir/test/Dialect/Bufferization/Transforms transform-ops.mlir

Revert "[MLIR] Make `OneShotModuleBufferize` use `OpInterface` (#110322)" (#113124)

This reverts commit 2026501cf107fcb3cbd51026ba25fda3af823941.

Failing bot:
  * https://lab.llvm.org/staging/#/builders/125/builds/389
Delta  File
+62 -80  mlir/test/Dialect/Bufferization/Transforms/transform-ops.mlir
+60 -64  mlir/test/Dialect/Linalg/pad-to-specific-memory-space.mlir
+56 -56  mlir/lib/Dialect/Bufferization/Transforms/OneShotModuleBufferize.cpp
+39 -45  mlir/test/Dialect/Vector/transform-vector.mlir
+24 -28  mlir/test/Dialect/Linalg/matmul-shared-memory-padding.mlir
+10 -12  mlir/test/Dialect/LLVM/transform-e2e.mlir
+251 -285  5 files not shown
+269 -304  11 files

LLVM/project a6d6c00 clang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/Sema CheckExprLifetime.cpp

[clang] Lifetimebound in assignment operator should work for non-gsl annotated types. (#113180)

This issue was identified during the discussion of [this
comment](https://github.com/llvm/llvm-project/issues/112234#issuecomment-2426102198).

There will be no release note for this fix as it is a follow-up to
[#106997](https://github.com/llvm/llvm-project/pull/106997).
Delta  File
+15 -7  clang/lib/Sema/CheckExprLifetime.cpp
+3 -3  clang/test/SemaCXX/attr-lifetimebound.cpp
+1 -1  clang/include/clang/Basic/DiagnosticSemaKinds.td
+19 -11  3 files

LLVM/project deecfa9 utils/bazel/llvm-project-overlay/clang BUILD.bazel

bazelbuild: adapt for commit b735c66da9 (#113302)

Delta  File
+4 -0  utils/bazel/llvm-project-overlay/clang/BUILD.bazel
+4 -0  1 files

LLVM/project a85e452 llvm/lib/CodeGen TailDuplication.cpp

Rename to TailDuplicateBaseLegacy and honor optnone
Delta  File
+12 -9  llvm/lib/CodeGen/TailDuplication.cpp
+12 -9  1 files

LLVM/project f1ade1f llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 sve-intrinsics-while.ll

[LLVM][CodeGen][AArch64] while_le(#,max_int) -> all_active (#111183)

When the second operand of an incrementing while instruction is the
maximum value, comparisons that include equality can never fail.
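
The reasoning can be checked with a simplified scalar model of an incrementing signed "while less than or equal" predicate. This is an illustrative sketch with a hypothetical `whileLE` helper, not the SVE intrinsic or the lowering code.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified model: lane I is active if all earlier lanes are active and the
// (wrapping) counter Start + I is <= Limit.
std::vector<bool> whileLE(std::int32_t Start, std::int32_t Limit,
                          unsigned Lanes) {
  std::vector<bool> Pred(Lanes, false);
  std::uint32_t Counter;
  std::memcpy(&Counter, &Start, sizeof(Counter));
  bool Active = true;
  for (unsigned I = 0; I < Lanes; ++I) {
    std::int32_t Current;
    std::memcpy(&Current, &Counter, sizeof(Current)); // reinterpret the bits
    Active = Active && (Current <= Limit);
    Pred[I] = Active;
    ++Counter; // unsigned increment wraps instead of overflowing
  }
  return Pred;
}
```

With `Limit` set to `INT32_MAX`, every 32-bit value compares less than or equal to it, so every lane stays active even when the counter wraps. A smaller limit such as `whileLE(0, 2, 4)` deactivates the last lane.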
Delta  File
+8 -10  llvm/test/CodeGen/AArch64/sve-intrinsics-while.ll
+7 -0  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+15 -10  2 files

LLVM/project cd290a6 llvm/docs CompileCudaWithLLVM.rst

[Doc] Update a broken link in CompileCudaWithLLVM (#113282)

Delta  File
+1 -1  llvm/docs/CompileCudaWithLLVM.rst
+1 -1  1 files

LLVM/project b3acb25 llvm/lib/Target/AMDGPU VOP2Instructions.td

[AMDGPU] Don't rely on !eq comparing int with bits<5>. NFC. (#113279)

Tweak VOP2eInst_Base so that it does not rely on !eq comparing an int
value (-1) with a bits<5> value. This avoids a change in behaviour when
#112904 lands: that bug fix has the side effect of implicitly casting
template arguments to the declared template parameter type.
Delta  File
+2 -2  llvm/lib/Target/AMDGPU/VOP2Instructions.td
+2 -2  1 files

LLVM/project 6e0b003 clang/test/CodeGenOpenCL amdgpu-enqueue-kernel.cl builtins-alloca.cl

[clang][OpenCL][CodeGen][AMDGPU] Do not use `private` as the default AS for when `generic` is available (#112442)

Currently, for AMDGPU, when compiling for OpenCL, we unconditionally use
`private` as the default address space. This is wrong for cases where
the `generic` address space is available, and is corrected via this
patch. In general, this AS map abuse is a bad hack and we should re-work
it altogether, but at least after this patch we will stop being
incorrect for e.g. OpenCL 2.0.
Delta  File
+275 -220  clang/test/CodeGenOpenCL/amdgpu-enqueue-kernel.cl
+428 -4  clang/test/CodeGenOpenCL/builtins-alloca.cl
+164 -118  clang/test/CodeGenOpenCL/amdgpu-abi-struct-arg-byref.cl
+99 -70  clang/test/CodeGenOpenCL/addr-space-struct-arg.cl
+93 -62  clang/test/CodeGenOpenCL/builtins-amdgcn-gfx12.cl
+20 -16  clang/test/CodeGenOpenCL/amdgcn-automatic-variable.cl
+1,079 -490  14 files not shown
+1,154 -549  20 files

LLVM/project aea60ab clang/lib/Format UnwrappedLineParser.cpp, clang/unittests/Format TokenAnnotatorTest.cpp

[clang-format] Make bitwise and imply requires clause (#110942)

This patch adjusts the requires clause/expression parser to infer a
requires clause when `requires` is preceded by an `&` token, which is
then assumed to be a reference qualifier rather than a bitwise AND. The
justification is that bitwise operations should not be used for requires
expressions.

This is a band-aid fix. The real problems lie in the lookahead heuristic
in the same method. It may be worth it to rewrite that whole heuristic
to track more state in the future, instead of just blindly marching
forward across multiple unrelated definitions, since right now, the
definition following the one with the requires clause can influence
whether the heuristic chooses clause or expression.

Fixes https://github.com/llvm/llvm-project/issues/110485
Delta  File
+9 -0  clang/unittests/Format/TokenAnnotatorTest.cpp
+1 -1  clang/lib/Format/UnwrappedLineParser.cpp
+10 -1  2 files

LLVM/project c829f91 flang/include/flang/Common fast-int-set.h

Re-apply: use Fortran::common::optional for CUDA build
Delta  File
+3 -3  flang/include/flang/Common/fast-int-set.h
+3 -3  1 files

LLVM/project 1dc08d5 llvm/include/llvm InitializePasses.h, llvm/include/llvm/CodeGen TailDuplication.h Passes.h

[AMDGPU][NewPM] Port TailDuplicate pass to NPM
Delta  File
+46 -12  llvm/lib/CodeGen/TailDuplication.cpp
+47 -0  llvm/include/llvm/CodeGen/TailDuplication.h
+4 -4  llvm/lib/CodeGen/TargetPassConfig.cpp
+2 -2  llvm/lib/CodeGen/CodeGen.cpp
+2 -2  llvm/include/llvm/CodeGen/Passes.h
+2 -2  llvm/include/llvm/InitializePasses.h
+103 -22  8 files not shown
+113 -26  14 files

LLVM/project 2c5be30 flang-rt/cmake/modules AddFlangRTOffload.cmake, flang-rt/lib/flang_rt CMakeLists.txt

Fix OpenMP linking
Delta  File
+2 -2  flang-rt/lib/flang_rt/CMakeLists.txt
+2 -1  flang-rt/cmake/modules/AddFlangRTOffload.cmake
+4 -3  2 files