[SLP]Vectorize gathered loads
Final gather/buildvector nodes may have scalar loads, which are not
vectorized (since they are part of the gather nodes) but may form full
vector loads, being combined. This patch walks over all gather nodes,
"gathering" and sorting gathered scalar loads and then tries to build
vector loads, which later are reshuffled between the gather nodes.
It allows later to add support for segmented loads (kind of AOS to SOA
load kind for RISC-V RVV) and may help with the removal of the alternat
e opcodes support.
Currently, alternate nodes may depend on each other because of the
consecutive loads between their operands. Because of that we cannot
simply remove alternate vectorization. But this approach may help to
remove most of the stuff for it, since we'll be able to vectorize loads
in between lanes.
Metric: size..text, AVX512
Program size..text
[241 lines not shown]
[AArch64] Fold UBFMXri to UBFMWri when it's an LSR or LSL alias (#106968)
Using the LSR or LSL aliases of UBFM can be faster on some CPUs, so it
is worth changing 64 bit UBFM instructions, that are equivalent to 32
bit LSR/LSL operations, to 32 bit variants.
This change folds the following patterns:
* If `Imms == 31` and `Immr <= Imms`:
`UBFMXri %0, Immr, Imms` -> `UBFMWri %0.sub_32, Immr, Imms`
* If `Immr == Imms + 33`:
`UBFMXri %0, Immr, Imms` -> `UBFMWri %0.sub_32, Immr - 32, Imms`
[flang][Semantics][OpenMP] Don't privatise associate names (#108856)
The associate name preserves the association with the selector
established in the associate statement. Therefore it is incorrect to
change the data-sharing attribute of the name.
Closes #58041
[flang][debug] Generate correct subroutine type. (#108605)
We pass a list of types when creating a subroutine type. The first one
is supposed to be return type and the rest are the argument types. A
subroutine does not have a return type so an argument type could be
confused as a return type. To fix this, if there is no return type, we
generate a null type as a place holder.
Fixes #108564.
Merge tag 'soc-arm-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC ARM platform updates from Arnd Bergmann:
"Most of these updates are for removing dead code on the Samsung S3C,
NXP i.MX, TI OMAP and TI DaVinci platforms, though this appears to be
a coincidence.
There are also cleanups for the Marvell Orion family and the Arm
integrator series and a Kconfig change for Broadcom"
* tag 'soc-arm-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: dove: Drop a write-only variable
ARM: orion5x: Switch to new sys-off handler API
ARM: mvebu: Warn about memory chunks too small for DDR training
ARM: imx: Annotate imx7d_enet_init() as __init
ARM: OMAP1: Remove unused declarations in arch/arm/mach-omap1/pm.h
ARM: s3c: remove unused s3c2410_cpu_suspend() declaration
ARM: s3c: remove unused declarations for s3c6400
ARM: s3c: Remove unused s3c_init_uart_irqs() declaration
[9 lines not shown]
[MLIR][OpenMP][Docs] Document operand structures (NFC) (#108824)
This patch updates the OpenMP dialect top-level documentation to
describe the operand structures, when they can be used and how they are
automatically generated.