LLVM/project 0068078llvm/lib/Target/NVPTX NVPTXInstrInfo.td NVPTXISelLowering.cpp, llvm/test/CodeGen/NVPTX i128.ll combine-mad.ll

[NVPTX] Remove `NVPTX::IMAD` opcode, and rely on instruction selection only (#121724)

I noticed that NVPTX will sometimes emit `mad.lo` to multiply by 1, e.g.
in https://gcc.godbolt.org/z/4j47Y9W4c.

This happens when DAGCombiner operates on the add before the mul, so the
imad contraction happens regardless of whether the mul could have been
simplified.

To fix this, I remove `NVPTXISD::IMAD` and only combine to mad during
selection. This allows the default DAGCombiner patterns to simplify
the graph without any NVPTX-specific intervention.
DeltaFile
+94-98llvm/test/CodeGen/NVPTX/i128.ll
+31-64llvm/lib/Target/NVPTX/NVPTXInstrInfo.td
+55-0llvm/test/CodeGen/NVPTX/combine-mad.ll
+10-13llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+1-1llvm/test/CodeGen/NVPTX/dynamic_stackalloc.ll
+0-1llvm/lib/Target/NVPTX/NVPTXISelLowering.h
+191-1776 files

LLVM/project 5a90168llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/FunctionAttrs phi_cycle.ll

[ValueTracking] Provide getUnderlyingObjectAggressive fallback (#123019)

This callsite assumes `getUnderlyingObjectAggressive` returns a non-null
pointer:

https://github.com/llvm/llvm-project/blob/273a94b3d5a78cd9122c7b3bbb5d5a87147735d2/llvm/lib/Transforms/IPO/FunctionAttrs.cpp#L124

But it can return null when there are cycles in the value chain, so that
no `Worklist` items remain to explore; in that case it just returns
`Object` at the end of the function without ever setting it:
https://github.com/llvm/llvm-project/blob/9b5857a68381652dbea2a0c9efa734b6c4cf38c9/llvm/lib/Analysis/ValueTracking.cpp#L6866-L6867
https://github.com/llvm/llvm-project/blob/9b5857a68381652dbea2a0c9efa734b6c4cf38c9/llvm/lib/Analysis/ValueTracking.cpp#L6889

`getUnderlyingObject` does not seem to return null either judging by
looking at its code and its callsites, so I think it is not likely to be
the author's intention that `getUnderlyingObjectAggressive` returns
null.

So this checks whether `Object` is null at the end, and if so, falls

    [29 lines not shown]
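
The failure mode described above can be sketched on a toy value graph. This is a minimal illustration, not the LLVM implementation: `Node`, `isBase`, and `underlyingAggressive` are hypothetical names, and falling back to the starting value is an assumption of the sketch, since the truncated message does not show what the patch falls back to.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical toy value: either a "base" object or a phi/select-like
// node that forwards one of its inputs.
struct Node {
  bool isBase = false;
  std::vector<const Node *> inputs;
};

// Walks through forwarding nodes looking for a single base object.
// If the walk only ever revisits nodes (a cycle), `object` is never set,
// so the function falls back to `start` instead of returning null.
const Node *underlyingAggressive(const Node *start) {
  std::unordered_set<const Node *> visited;
  std::vector<const Node *> worklist{start};
  const Node *object = nullptr;
  while (!worklist.empty()) {
    const Node *n = worklist.back();
    worklist.pop_back();
    if (!visited.insert(n).second)
      continue; // already explored; this is what terminates cycles
    if (n->isBase) {
      if (object && object != n)
        return start; // two distinct bases: give up (sketch assumption)
      object = n;
      continue;
    }
    for (const Node *in : n->inputs)
      worklist.push_back(in);
  }
  return object ? object : start; // null check at the end, then fallback
}
```

With two nodes that only refer to each other, the worklist drains without finding a base, and the final check is what keeps the result non-null.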
DeltaFile
+52-0llvm/test/Transforms/FunctionAttrs/phi_cycle.ll
+1-1llvm/lib/Analysis/ValueTracking.cpp
+53-12 files

LLVM/project 6655c53llvm/cmake/modules CrossCompile.cmake LLVMExternalProjectUtils.cmake

[cmake] Serialize native builds for Make generator (#121021)

The build system is fragile because it allows multiple concurrent
invocations of subprocess builds in the native folder for the Make
generator.

For example, during sub-invocation of the native llvm-config,
llvm-min-tblgen is also built. If there is another sub-invocation of the
native llvm-min-tblgen build running in parallel, they may overwrite
each other's build results, and may lead to errors like "Text file
busy".

This patch adds a cmake script that uses a file lock to serialize all
native builds for the Make generator.
DeltaFile
+13-2llvm/cmake/modules/CrossCompile.cmake
+8-4llvm/cmake/modules/LLVMExternalProjectUtils.cmake
+9-0llvm/cmake/modules/FileLock.cmake
+30-63 files

LLVM/project 5f35883mlir/lib/Dialect/Vector/Transforms VectorLinearize.cpp

code format
DeltaFile
+7-5mlir/lib/Dialect/Vector/Transforms/VectorLinearize.cpp
+7-51 files

LLVM/project 6ffc445compiler-rt/test/profile/ContinuousSyncMode online-merging.c

[PGO][AIX] Disable multi-process continuous mode test in 32-bit

In PGO continuous mode, we mmap the profile file into shared memory, which
allows multiple processes to be updating the same memory.

The -fprofile-update=atomic option forces the counter increments to be atomic,
but the counter size is always 64-bit (in -m32 and -m64), so in 32-bit mode the
atomic operations are function calls to libatomic.a and these function calls use
locks.

The lock based libatomic.a functions are per-process, so two processes will race
on the same shared memory because each will acquire their own lock.
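
Whether a 64-bit counter increment is lock-free on a given target can be checked from C++. The sketch below is illustrative only; `counterIsLockFree`, `kCounterAlwaysLockFree`, and `bump` are hypothetical names, not part of compiler-rt. When the answer is false, the increment goes through a library lock, which is the situation the message describes for 32-bit mode.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Compile-time answer (C++17): true only if 64-bit atomics never need a lock
// on this target.
constexpr bool kCounterAlwaysLockFree =
    std::atomic<std::uint64_t>::is_always_lock_free;

// Run-time answer for one particular counter object.
bool counterIsLockFree() {
  std::atomic<std::uint64_t> counter{0};
  return counter.is_lock_free();
}

// An atomic counter increment. If the type is not lock-free, this call is
// implemented with a lock that other processes sharing the memory do not see.
std::uint64_t bump(std::atomic<std::uint64_t> &counter) {
  return counter.fetch_add(1, std::memory_order_relaxed) + 1;
}
```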
DeltaFile
+1-0compiler-rt/test/profile/ContinuousSyncMode/online-merging.c
+1-01 files

LLVM/project 8fa826allvm/lib/Analysis CtxProfAnalysis.cpp, llvm/lib/ProfileData PGOCtxProfReader.cpp

[ctxprof] use yaml for serialization (for testing)
DeltaFile
+88-0llvm/lib/ProfileData/PGOCtxProfReader.cpp
+30-40llvm/unittests/Transforms/Utils/CallPromotionUtilsTest.cpp
+14-48llvm/test/Analysis/CtxProfAnalysis/full-cycle.ll
+6-45llvm/lib/Analysis/CtxProfAnalysis.cpp
+8-25llvm/test/Analysis/CtxProfAnalysis/load.ll
+16-17llvm/test/Analysis/CtxProfAnalysis/inline.ll
+162-1755 files not shown
+174-20011 files

LLVM/project d067a5amlir/lib/Dialect/Vector/Transforms VectorLinearize.cpp, mlir/test/Dialect/Vector linearize.mlir

add linearize pattern for bitcast
DeltaFile
+34-3mlir/lib/Dialect/Vector/Transforms/VectorLinearize.cpp
+20-1mlir/test/Dialect/Vector/linearize.mlir
+54-42 files

LLVM/project f325e4bclang/cmake/caches hexagon-unknown-linux-musl-clang-cross.cmake

[Hexagon] Add default clang symlinks to CLANG_LINKS_TO_CREATE (#123011)

Since this cache value overrides the defaults, we end up with `clang`
linked to `clang-20`, and some `${triple}-clang*` links, but we're
missing `clang++`. This makes for a toolchain with inconsistent behavior
when used in someone's `$PATH`.

We'll add the default symlinks to our list so that C and C++ programs
are both built as expected when `clang` and `clang++` are invoked.
DeltaFile
+3-0clang/cmake/caches/hexagon-unknown-linux-musl-clang-cross.cmake
+3-01 files

LLVM/project feb7872clang/lib/Sema SemaTemplate.cpp, clang/test/APINotes templates.cpp

[APINotes] Avoid duplicated attributes for class template instantiations

If a C++ class template is annotated via API Notes, the instantiations
had the attributes repeated twice. This is because Clang was adding the
attribute twice while processing the same class template. This change
makes sure we don't try to add attributes from API Notes twice.

There is currently no way to annotate specific instantiations using API
Notes.

rdar://142539959
DeltaFile
+3-0clang/test/APINotes/templates.cpp
+0-1clang/lib/Sema/SemaTemplate.cpp
+1-0clang/test/APINotes/Inputs/Headers/Templates.h
+4-13 files

LLVM/project 1a461cdcompiler-rt/lib/sanitizer_common/symbolizer/scripts build_symbolizer.sh

[compiler-rt] Install libc++ and libc++abi in build_symbolizer.sh

This ensures that the directory layout of the libc++/libc++abi matches exactly what we would get on a real installation.
Currently the build directory happens to match the install directory layout, but this will no longer be true in the future.
DeltaFile
+7-5compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh
+7-51 files

LLVM/project 9966972clang/lib/Driver OffloadBundler.cpp, clang/test/Driver clang-offload-bundler.c clang-offload-bundler-standardize.c

[OffloadBundler] Rework the ctor of `OffloadTargetInfo` to support generic target

The current parsing of the target string assumes the form
`kind-triple-targetid:feature`, such as
`hipv4-amdgcn-amd-amdhsa-gfx1030:+xnack`. Specifically, it assumes the target
id does not contain any `-`, which is not the case for a generic target: a
generic target id may contain one or more `-`, such as `gfx10-3-generic` and
`gfx12-generic`. As a result, we can no longer depend on `rstrip` to get things
right. This patch reworks the target string parsing to make it more robust and
to support generic targets.
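
One way to parse such strings without splitting from the right is to count components from the left. The sketch below is not the code from this patch: `ParsedTarget`, `parseTarget`, and the assumption that the triple has a fixed number of `-`-separated parts (three by default) are all hypothetical, chosen only to show why a target id containing `-` needs a different strategy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ParsedTarget {
  std::string kind, triple, targetId, feature;
};

// Splits `s` on `sep`, keeping empty components.
static std::vector<std::string> splitAll(const std::string &s, char sep) {
  std::vector<std::string> parts;
  std::size_t begin = 0;
  while (true) {
    std::size_t end = s.find(sep, begin);
    if (end == std::string::npos) {
      parts.push_back(s.substr(begin));
      return parts;
    }
    parts.push_back(s.substr(begin, end - begin));
    begin = end + 1;
  }
}

// Joins parts[first, last) with '-'.
static std::string joinRange(const std::vector<std::string> &parts,
                             std::size_t first, std::size_t last) {
  std::string out;
  for (std::size_t i = first; i < last; ++i) {
    if (i != first)
      out += '-';
    out += parts[i];
  }
  return out;
}

// Assumption: the triple occupies exactly `tripleParts` components after the
// kind; everything left before ':' is the target id, dashes included.
ParsedTarget parseTarget(const std::string &target,
                         std::size_t tripleParts = 3) {
  ParsedTarget out;
  std::size_t colon = target.find(':');
  if (colon != std::string::npos)
    out.feature = target.substr(colon + 1);
  std::vector<std::string> parts = splitAll(target.substr(0, colon), '-');
  out.kind = parts[0];
  std::size_t tripleEnd = std::min(parts.size(), 1 + tripleParts);
  out.triple = joinRange(parts, 1, tripleEnd);
  out.targetId = joinRange(parts, tripleEnd, parts.size());
  return out;
}
```

Under that assumption, `hipv4-amdgcn-amd-amdhsa-gfx10-3-generic:+xnack` keeps `gfx10-3-generic` intact as the target id, whereas taking only the last `-`-separated component would yield `generic`.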
DeltaFile
+24-24clang/test/Driver/clang-offload-bundler.c
+23-24clang/lib/Driver/OffloadBundler.cpp
+5-13clang/test/Driver/clang-offload-bundler-standardize.c
+7-7clang/test/Driver/clang-offload-bundler-asserts-on.c
+6-6clang/test/Driver/clang-offload-bundler-zstd.c
+6-6clang/test/Driver/clang-offload-bundler-zlib.c
+71-8010 files not shown
+113-11016 files

LLVM/project c4443a1compiler-rt/lib/rtsan rtsan_interceptors_posix.cpp, compiler-rt/lib/rtsan/tests rtsan_test_interceptors_posix.cpp

[compiler-rt][rtsan] fseek api interception. (#122163)

DeltaFile
+100-0compiler-rt/lib/rtsan/rtsan_interceptors_posix.cpp
+74-0compiler-rt/lib/rtsan/tests/rtsan_test_interceptors_posix.cpp
+174-02 files

LLVM/project 2bc422dutils/bazel/llvm-project-overlay/clang BUILD.bazel

[bazel] Remove internal headers from `hdrs` in //clang:format (#122987)

They are already included in `srcs`, as they should be.
DeltaFile
+1-7utils/bazel/llvm-project-overlay/clang/BUILD.bazel
+1-71 files

LLVM/project 1a56360llvm/include/llvm/IR InstrTypes.h, llvm/lib/Transforms/IPO FunctionAttrs.cpp

[IR] Treat calls with byval ptrs as read-only (#122961)

DeltaFile
+16-0llvm/test/Transforms/SROA/readonlynocapture.ll
+1-4llvm/lib/Transforms/IPO/FunctionAttrs.cpp
+0-5llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
+5-0llvm/include/llvm/IR/InstrTypes.h
+4-0llvm/test/Analysis/BasicAA/call-attrs.ll
+1-1llvm/test/Analysis/BasicAA/tail-byval.ll
+27-106 files

LLVM/project e19bc76llvm/test/CodeGen/RISCV/rvv splat-vectors.ll

[RISCV] Precommit test coverage for pr118873
DeltaFile
+144-0llvm/test/CodeGen/RISCV/rvv/splat-vectors.ll
+144-01 files

LLVM/project 1865048clang/lib/Serialization ASTWriter.cpp

[clang][Serialization] Add the missing block info (#122976)

HEADER_SEARCH_ENTRY_USAGE and VFS_USAGE were missing from the block info
block. Add the missing info so `llvm-bcanalyzer` can read them
correctly.
DeltaFile
+2-0clang/lib/Serialization/ASTWriter.cpp
+2-01 files

LLVM/project ab6e63amlir/include/mlir/IR TypeRange.h ValueRange.h, mlir/lib/IR TypeRange.cpp OperationSupport.cpp

[mlir] Make single value `ValueRange`s memory safer (#121996)

A very common mistake users (and yours truly) make when using
`ValueRange`s is assigning a temporary `Value` to it. Example:
```cpp
ValueRange values = op.getOperand();
apiThatUsesValueRange(values);
```

The issue is caused by the implicit `const Value&` constructor: as per
C++ rules, a const reference can be constructed from a temporary and the
address of it taken. After the statement, the temporary goes out of
scope and a `stack-use-after-free` error occurs.

This PR fixes that issue by making `ValueRange` capable of owning a
single `Value` instance for that case specifically. While technically a
departure from the other owner types that are non-owning, I'd argue that
this behavior is more intuitive for the majority of users that usually
don't need to care about the lifetime of `Value` instances.
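
The idea of a range that owns its element only in the single-value case can be sketched with a toy type. This is not MLIR's `ValueRange`; `IntRange` and `makeTemp` are hypothetical names, and `int` stands in for `Value`.

```cpp
#include <cassert>
#include <cstddef>

// Toy range: a non-owning view in general, but an owner when built from
// exactly one element.
class IntRange {
public:
  // Non-owning view over caller-managed storage.
  IntRange(const int *data, std::size_t size) : data_(data), size_(size) {}
  // Single-element case: copy the element so a temporary is safe to pass.
  IntRange(int single) : single_(single), data_(nullptr), size_(1) {}

  std::size_t size() const { return size_; }
  // Computed on each call, so copies of the range point at their own element.
  const int *begin() const { return data_ ? data_ : &single_; }
  const int *end() const { return begin() + size_; }
  int operator[](std::size_t i) const {
    assert(i < size_);
    return begin()[i];
  }

private:
  int single_ = 0;
  const int *data_ = nullptr;
  std::size_t size_ = 0;
};

// Returns a temporary, like an accessor returning a value by copy.
int makeTemp() { return 42; }
```

Here `IntRange r = makeTemp();` stays valid after the statement, because the range holds a copy of the element instead of the address of the temporary.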

    [2 lines not shown]
DeltaFile
+13-8mlir/include/mlir/IR/TypeRange.h
+17-0mlir/unittests/IR/OperationSupportTest.cpp
+8-8mlir/include/mlir/IR/ValueRange.h
+15-0mlir/lib/IR/TypeRange.cpp
+13-0mlir/lib/IR/OperationSupport.cpp
+66-165 files

LLVM/project 1e53f95llvm/cmake config-ix.cmake, llvm/include/llvm/Config config.h.cmake

[CMake] Remove some always-true HAVE_XXX_H

These are unneeded even on AIX, PURE_WINDOWS, and ZOS (per #104706)

* HAVE_ERRNO_H: introduced by 1a93330ffa2ae2aa0b49461f05e6f0d51e8443f8 (2009) but unneeded.
  The guarded ABI is unconditionally used by lldb.
* HAVE_FCNTL_H
* HAVE_FENV_H
* HAVE_SYS_STAT_H

Pull Request: https://github.com/llvm/llvm-project/pull/123087
DeltaFile
+0-12utils/bazel/llvm_configs/config.h.cmake
+0-12llvm/include/llvm/Config/config.h.cmake
+0-12utils/bazel/llvm-project-overlay/llvm/include/llvm/Config/config.h
+3-5llvm/lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp
+0-5llvm/lib/Support/Errno.cpp
+0-4llvm/cmake/config-ix.cmake
+3-509 files not shown
+5-7615 files

LLVM/project ac2165fllvm/lib/MC WinCOFFObjectWriter.cpp

[coff] Don't try to write the obj if the assembler has errors (#123007)

The ASAN and MSAN tests have been failing after #122777 because some
fields are now set in `executePostLayoutBinding`, which the assembler
skips if it had errors, but are still read in `writeObject`.

Since the compilation has failed anyway, skip `writeObject` if the
assembler had errors.
DeltaFile
+5-0llvm/lib/MC/WinCOFFObjectWriter.cpp
+5-01 files

LLVM/project 943b212llvm/utils/TableGen DecoderEmitter.cpp

[TableGen] Use `std::move` to avoid copy (#123088)

DeltaFile
+1-1llvm/utils/TableGen/DecoderEmitter.cpp
+1-11 files

LLVM/project b25902allvm/include/llvm/Config config.h.cmake, llvm/lib/Support/Unix Process.inc Program.inc

remove HAVE_FCNTL_H

Created using spr 1.3.5-bogner
DeltaFile
+0-9utils/bazel/llvm_configs/config.h.cmake
+0-9utils/bazel/llvm-project-overlay/llvm/include/llvm/Config/config.h
+0-6llvm/include/llvm/Config/config.h.cmake
+0-2llvm/lib/Support/Unix/Process.inc
+0-2llvm/lib/Support/Unix/Program.inc
+0-2llvm/lib/Support/Unix/Unix.h
+0-304 files not shown
+0-3710 files

LLVM/project 06499f3llvm/include/llvm/Analysis CmpInstAnalysis.h, llvm/lib/Analysis CmpInstAnalysis.cpp

[InstCombine] Prepare foldLogOpOfMaskedICmps to handle trunc to i1. (NFC) (#122179)

DeltaFile
+66-55llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+14-0llvm/lib/Analysis/CmpInstAnalysis.cpp
+6-0llvm/include/llvm/Analysis/CmpInstAnalysis.h
+86-553 files

LLVM/project ffbc2f6llvm/include/llvm/Config config.h.cmake, llvm/lib/Analysis ConstantFolding.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.5-bogner
DeltaFile
+3-5llvm/lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp
+0-6llvm/include/llvm/Config/config.h.cmake
+0-5llvm/lib/Support/Errno.cpp
+2-2llvm/lib/Analysis/ConstantFolding.cpp
+0-3llvm/utils/gn/secondary/llvm/include/llvm/Config/BUILD.gn
+0-3utils/bazel/llvm-project-overlay/llvm/include/llvm/Config/config.h
+5-247 files not shown
+5-3913 files

LLVM/project 1c5f874clang/include/clang/Driver Driver.h

[Driver] Fix a warning

This patch fixes:

  clang/include/clang/Driver/Driver.h:82:3: error: definition of
  implicit copy assignment operator for 'CUIDOptions' is deprecated
  because it has a user-declared copy constructor
  [-Werror,-Wdeprecated-copy]
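
For context, this diagnostic fires when a class declares a copy constructor but relies on the implicitly generated copy assignment operator. The sketch below is a generic illustration, not the contents of `Driver.h`: `OptionsBefore`, `OptionsAfter`, and `copyOf` are hypothetical, and removing the declaration is only one common way to resolve the warning.

```cpp
#include <cassert>
#include <string>

// Declaring the copy constructor (even as defaulted) makes the implicit copy
// assignment operator deprecated; assigning one OptionsBefore to another is
// what would trigger -Wdeprecated-copy.
struct OptionsBefore {
  OptionsBefore() = default;
  OptionsBefore(const OptionsBefore &) = default;
  std::string id;
};

// With no user-declared copy operations, both copy construction and copy
// assignment are implicit and not deprecated.
struct OptionsAfter {
  OptionsAfter() = default;
  std::string id;
};

// Uses copy assignment on the fixed type.
OptionsAfter copyOf(const OptionsAfter &src) {
  OptionsAfter dst;
  dst = src;
  return dst;
}
```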
DeltaFile
+0-1clang/include/clang/Driver/Driver.h
+0-11 files

LLVM/project 0360f81lld/COFF Driver.cpp, lld/test/COFF subsystem-arm64x.test

[LLD][COFF] Infer subsystem from EC symbol table for ARM64X (#122838)

DeltaFile
+41-0lld/test/COFF/subsystem-arm64x.test
+1-1lld/COFF/Driver.cpp
+42-12 files

LLVM/project 80084e9lld/COFF Driver.cpp, lld/test/COFF arm64x-loadconfig.s

[LLD][COFF] Pull _load_config_used symbol from both symbol tables on ARM64X (#122837)

DeltaFile
+6-3lld/COFF/Driver.cpp
+5-0lld/test/COFF/arm64x-loadconfig.s
+11-32 files

LLVM/project 3bb969fflang/include/flang/Optimizer/Builder FIRBuilder.h, flang/include/flang/Optimizer/HLFIR Passes.td

[flang] Inline hlfir.matmul[_transpose]. (#122821)

Inlining `hlfir.matmul` as `hlfir.eval_in_mem` does not get rid of
the temporary array in many cases, but it may still be much better
because it allows us to:
  * Get rid of any overhead related to calling runtime MATMUL
    (such as descriptors creation).
  * Use CPU-specific vectorization cost model for matmul loops,
    which Fortran runtime cannot currently do.
  * Optimize matmul of known-size arrays by complete unrolling.

One of the drawbacks of `hlfir.eval_in_mem` inlining is that
the ops inside it with store memory effects block the current
MLIR CSE, so I decided to run this inlining late in the pipeline.
There is a source comment explaining the CSE issue in more detail.

Straightforward inlining of `hlfir.matmul` as an `hlfir.elemental`
is not good for performance, and I got performance regressions
with it compared to the Fortran runtime implementation. I put it

    [6 lines not shown]
DeltaFile
+660-0flang/test/HLFIR/simplify-hlfir-intrinsics-matmul.fir
+456-0flang/lib/Optimizer/HLFIR/Transforms/SimplifyHLFIRIntrinsics.cpp
+14-3flang/lib/Optimizer/Builder/HLFIRTools.cpp
+14-0flang/lib/Optimizer/Builder/FIRBuilder.cpp
+11-0flang/include/flang/Optimizer/HLFIR/Passes.td
+9-0flang/include/flang/Optimizer/Builder/FIRBuilder.h
+1,164-34 files not shown
+1,183-310 files

LLVM/project 2bfa7bcflang/lib/Optimizer/HLFIR/IR HLFIROps.cpp, flang/test/HLFIR mul_transpose.f90

[flang] Propagate fastmath flags to matmul_transpose. (#122842)

DeltaFile
+2-1flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp
+1-1flang/test/HLFIR/mul_transpose.f90
+3-22 files

LLVM/project 07a1847clang/lib/Basic TargetInfo.cpp, clang/test/CodeGenHLSL Bool.hlsl

[HLSL] Make bool in hlsl i32 (#122977)

make a bool's memory representation i32 in hlsl
add new test
fix broken test
Closes #122932
DeltaFile
+12-0clang/test/CodeGenHLSL/Bool.hlsl
+0-9clang/test/SemaHLSL/BuiltIns/asfloat-errors.hlsl
+1-0clang/lib/Basic/TargetInfo.cpp
+13-93 files

LLVM/project 2570e35llvm/lib/Transforms/InstCombine InstCombineCalls.cpp, llvm/test/Transforms/InstCombine assume.ll

[InstCombine] Handle trunc to i1 in align assume. (#122949)

proof: https://alive2.llvm.org/ce/z/EyAUA4
DeltaFile
+12-7llvm/test/Transforms/InstCombine/assume.ll
+6-5llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+18-122 files