LLVM / project - FreshBSD

LLVM/project 7b3bbd8 — llvm/test/CodeGen/X86 vector-interleaved-store-i64-stride-7.ll vector-interleaved-load-i16-stride-7.ll

Oct 9, 2023 by Jay Foad on ⎇ main

Revert "[CodeGen] Really renumber slot indexes before register allocation (#67038)"

This reverts commit 2501ae58e3bb9a70d279a56d7b3a0ed70a8a852c.

Reverted due to various buildbot failures.

Delta		File
+12,000	-11,992	llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-7.ll
+11,074	-11,060	llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-7.ll
+9,482	-9,377	llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-7.ll
+9,170	-9,196	llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-8.ll
+8,655	-8,552	llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-7.ll
+9,509	-5,423	llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-6.ll
+59,890	-55,600	730 files not shown
+267,439	-259,303	736 files

LLVM/project 2501ae5 — llvm/test/CodeGen/X86 vector-interleaved-store-i64-stride-7.ll vector-interleaved-load-i16-stride-7.ll

Oct 9, 2023 by Jay Foad via GitHub on ⎇ main

[CodeGen] Really renumber slot indexes before register allocation (#67038)

PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries for instructions which
had been erased. Fix this to make the register allocator's live range
length heuristics even less dependent on the history of how instructions
have been added to and removed from SlotIndexes's maps.
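
As a rough illustration of the effect described above, here is a toy model of index numbering. It is not LLVM's SlotIndexes implementation: the `renumber` and `live_range_length` helpers, the list-of-entries layout, and the stride of 16 are all assumptions made for this sketch. It only shows how entries left behind by erased instructions can stretch the apparent length of a live range unless they are dropped before numbering.

```python
# Toy model of slot index numbering; not LLVM's SlotIndexes.
INSTR_DIST = 16  # assumed gap between consecutive indexes


def renumber(entries, skip_erased):
    """Map each remaining instruction name to an index.

    `entries` is an ordered list of instruction names, where None stands
    for a list entry whose instruction has been erased.
    """
    indexes = {}
    next_index = 0
    for name in entries:
        if name is None and skip_erased:
            continue  # an erased entry consumes no index
        if name is not None:
            indexes[name] = next_index
        next_index += INSTR_DIST
    return indexes


def live_range_length(indexes, start, end):
    """Distance between the defining and the using instruction."""
    return indexes[end] - indexes[start]


# Two instructions were erased between the def and the use.
entries = ["def", None, None, "use"]
stale = renumber(entries, skip_erased=False)
clean = renumber(entries, skip_erased=True)

print(live_range_length(stale, "def", "use"))  # 48
print(live_range_length(clean, "def", "use"))  # 16
```

With the stale entries counted, the same two instructions appear three times further apart, which is the kind of history dependence a length-based heuristic would pick up.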

Delta		File
+11,968	-11,976	llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-7.ll
+11,080	-11,094	llvm/test/CodeGen/X86/vector-interleaved-load-i16-stride-7.ll
+9,435	-9,540	llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-7.ll
+9,201	-9,175	llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-8.ll
+8,567	-8,670	llvm/test/CodeGen/X86/vector-interleaved-load-i8-stride-7.ll
+5,428	-9,514	llvm/test/CodeGen/X86/vector-interleaved-store-i64-stride-6.ll
+55,679	-59,969	730 files not shown
+259,910	-268,046	736 files

LLVM/project df017ba — llvm/test/CodeGen/ARM fptosi-sat-scalar.ll fptoui-sat-scalar.ll, llvm/test/CodeGen/RISCV half-convert.ll half-round-conv-sat.ll

Apr 29, 2023 by Craig Topper on ⎇ main

[TargetLowering] Don't use ISD::SELECT_CC in expandFP_TO_INT_SAT.

This function gets called for vectors and ISD::SELECT_CC was never
intended to support vectors. Some updates were made to support
it when this function started getting used for vectors.

Overall, using separate ISD::SETCC and ISD::SELECT looks like an
improvement even for scalar.
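
The separate compare-then-select shape can be sketched lane by lane as below. This is a simplified model, not the code in expandFP_TO_INT_SAT: the helper names `setcc`, `select`, and `fp_to_sint_sat_lanes` are hypothetical, Python lists stand in for vectors, and the bounds are assumed to be exactly representable as floats.

```python
def setcc(lhs, rhs, pred):
    """Lane-wise compare: returns one boolean per lane (a mask)."""
    return [pred(a, b) for a, b in zip(lhs, rhs)]


def select(mask, if_true, if_false):
    """Lane-wise select driven by a mask produced by setcc."""
    return [t if m else f for m, t, f in zip(mask, if_true, if_false)]


def fp_to_sint_sat_lanes(src, bits):
    """Saturating float-to-signed-int conversion built from setcc + select."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    n = len(src)

    def splat(value):
        return [value] * n

    # Plain conversion (truncation toward zero). Lanes that are out of range
    # or NaN get a placeholder 0 here and are overwritten by the selects.
    raw = [int(x) if lo <= x <= hi else 0 for x in src]

    too_small = setcc(src, splat(float(lo)), lambda a, b: a < b)
    too_big = setcc(src, splat(float(hi)), lambda a, b: a > b)
    is_nan = setcc(src, src, lambda a, b: a != b)  # only NaN differs from itself

    result = select(too_small, splat(lo), raw)
    result = select(too_big, splat(hi), result)
    return select(is_nan, splat(0), result)


print(fp_to_sint_sat_lanes([1.5, -1e10, 1e10, float("nan")], 8))
# [1, -128, 127, 0]
```

Because the mask is an ordinary per-lane value, the same three selects work unchanged for one lane or for many, which is the property a combined compare-and-select node does not naturally offer.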

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D149481

Delta		File
+733	-918	llvm/test/CodeGen/Thumb2/mve-fptosi-sat-vector.ll
+584	-849	llvm/test/CodeGen/ARM/fptosi-sat-scalar.ll
+546	-702	llvm/test/CodeGen/Thumb2/mve-fptoui-sat-vector.ll
+464	-651	llvm/test/CodeGen/ARM/fptoui-sat-scalar.ll
+287	-332	llvm/test/CodeGen/RISCV/half-convert.ll
+160	-160	llvm/test/CodeGen/RISCV/half-round-conv-sat.ll
+2,774	-3,612	7 files not shown
+3,257	-4,191	13 files

LLVM/project f0dd12e — llvm/test/CodeGen/X86 tls.ll unfold-masked-merge-vector-variablemask.ll

Jul 20, 2022 by Sanjay Patel on ⎇ main

[x86] use zero-extending load of a byte outside of loops too (2nd try)

The first attempt missed changing test files for tools
(update_llc_test_checks.py).

Original commit message:

This implements the main suggested change from issue #56498.
Using the shorter (non-extending) instruction with only
-Oz ("minsize") rather than -Os ("optsize") is left as a
possible follow-up.

As noted in the bug report, the zero-extending load may have
shorter latency/better throughput across a wide range of x86
micro-arches, and it avoids a potential false dependency.
The cost is an extra instruction byte.

This could cause perf ups and downs from secondary effects,
but I don't think it is possible to account for those in

    [6 lines not shown]

Delta		File
+754	-305	llvm/test/CodeGen/X86/tls.ll
+522	-522	llvm/test/CodeGen/X86/unfold-masked-merge-vector-variablemask.ll
+298	-298	llvm/test/CodeGen/X86/avx512vl-intrinsics-fast-isel.ll
+243	-243	llvm/test/CodeGen/X86/extract-bits.ll
+188	-188	llvm/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
+180	-180	llvm/test/CodeGen/X86/avx512-calling-conv.ll
+2,185	-1,736	205 files not shown
+3,834	-3,292	211 files

LLVM/project 95401b0 — llvm/test/CodeGen/X86 tls.ll unfold-masked-merge-vector-variablemask.ll

Jul 19, 2022 by Sanjay Patel on ⎇ main

Revert "[x86] use zero-extending load of a byte outside of loops too"

This reverts commit 9d1ea1774c51c44ddf0b5065bf600919988d7015.
There are tests of update_llc_test_checks.py that missed being updated.

Delta		File
+305	-754	llvm/test/CodeGen/X86/tls.ll
+522	-522	llvm/test/CodeGen/X86/unfold-masked-merge-vector-variablemask.ll
+298	-298	llvm/test/CodeGen/X86/avx512vl-intrinsics-fast-isel.ll
+243	-243	llvm/test/CodeGen/X86/extract-bits.ll
+188	-188	llvm/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
+180	-180	llvm/test/CodeGen/X86/avx512-calling-conv.ll
+1,736	-2,185	203 files not shown
+3,284	-3,826	209 files

LLVM/project 9d1ea17 — llvm/test/CodeGen/X86 tls.ll unfold-masked-merge-vector-variablemask.ll

Jul 19, 2022 by Sanjay Patel on ⎇ main

[x86] use zero-extending load of a byte outside of loops too

This implements the main suggested change from issue #56498.
Using the shorter (non-extending) instruction with only
-Oz ("minsize") rather than -Os ("optsize") is left as a
possible follow-up.

As noted in the bug report, the zero-extending load may have
shorter latency/better throughput across a wide range of x86
micro-arches, and it avoids a potential false dependency.
The cost is an extra instruction byte.
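
The trade-off in the paragraph above can be modeled with a small sketch. It is a toy 32-bit register model, not a statement about any particular micro-architecture: the function names are hypothetical, and the byte sequences shown are the encodings of the specific `(%rdi)` to `%al`/`%eax` forms only.

```python
# Encodings for one concrete addressing form, to show the one-byte size cost.
MOVB_ENCODING = bytes([0x8A, 0x07])          # movb   (%rdi), %al
MOVZBL_ENCODING = bytes([0x0F, 0xB6, 0x07])  # movzbl (%rdi), %eax


def movb(old_reg, loaded_byte):
    """Non-extending byte load: only the low 8 bits are replaced, so the
    result is a function of the previous register contents as well."""
    return (old_reg & 0xFFFFFF00) | (loaded_byte & 0xFF)


def movzbl(old_reg, loaded_byte):
    """Zero-extending byte load: the whole register is overwritten, so the
    result does not depend on old_reg at all."""
    return loaded_byte & 0xFF


print(hex(movb(0x12345678, 0xAB)))    # 0x123456ab
print(hex(movzbl(0x12345678, 0xAB)))  # 0xab
print(len(MOVZBL_ENCODING) - len(MOVB_ENCODING))  # 1
```

The dependence of `movb` on `old_reg` is what can show up as a false dependency in hardware; whether it actually costs anything depends on how a given core handles partial register writes.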

This could cause perf ups and downs from secondary effects,
but I don't think it is possible to account for those in
advance, and that will likely also depend on exact micro-arch.
This does bring LLVM x86 codegen more in line with existing
gcc codegen, so if problems are exposed they are more likely
to occur for both compilers.

Differential Revision: https://reviews.llvm.org/D129775

Delta		File
+754	-305	llvm/test/CodeGen/X86/tls.ll
+522	-522	llvm/test/CodeGen/X86/unfold-masked-merge-vector-variablemask.ll
+298	-298	llvm/test/CodeGen/X86/avx512vl-intrinsics-fast-isel.ll
+243	-243	llvm/test/CodeGen/X86/extract-bits.ll
+188	-188	llvm/test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
+180	-180	llvm/test/CodeGen/X86/avx512-calling-conv.ll
+2,185	-1,736	203 files not shown
+3,826	-3,284	209 files

LLVM/project 655ba9c — llvm/test/CodeGen/X86 frem.ll fpclamptosat_vec.ll

Jun 17, 2022 by Phoebe Wang on ⎇ main

Reland "Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""""

This resolves problems reported in commit 1a20252978c76cf2518aa45b175a9e5d6d36c4f0.
1. Promote to float lowering for nodes XINT_TO_FP
2. Bail out of shuffle combine for f16 because the vector type is not legal in the version

Delta		File
+845	-737	llvm/test/CodeGen/X86/frem.ll
+489	-715	llvm/test/CodeGen/X86/fpclamptosat_vec.ll
+669	-434	llvm/test/CodeGen/X86/half.ll
+410	-629	llvm/test/CodeGen/X86/vector-half-conversions.ll
+475	-375	llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll
+380	-372	llvm/test/CodeGen/X86/fptoui-sat-vector-128.ll
+3,268	-3,262	44 files not shown
+5,160	-4,633	50 files

LLVM/project 1a20252 — llvm/test/CodeGen/X86 frem.ll fpclamptosat_vec.ll

Jun 17, 2022 by Benjamin Kramer on ⎇ main

Revert "Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""""

This reverts commit 04a3d5f3a1193fb87576425a385aa0a6115b1e7c.

I see two more issues:

- uitofp/sitofp from i32/i64 to half now generates
  __floatsihf/__floatdihf, which exists in neither compiler-rt nor
  libgcc

- This crashes when legalizing the bitcast:
```
; RUN: llc < %s -mcpu=skx
define void @main.45(ptr nocapture readnone %retval, ptr noalias nocapture readnone %run_options, ptr noalias nocapture readnone %params, ptr noalias nocapture readonly %buffer_table, ptr noalias nocapture readnone %status, ptr noalias nocapture readnone %prof_counters) local_unnamed_addr {
entry:
  %fusion = load ptr, ptr %buffer_table, align 8
  %0 = getelementptr inbounds ptr, ptr %buffer_table, i64 1
  %Arg_1.2 = load ptr, ptr %0, align 8
  %1 = getelementptr inbounds ptr, ptr %buffer_table, i64 2

    [38 lines not shown]

Delta		File
+829	-937	llvm/test/CodeGen/X86/frem.ll
+661	-435	llvm/test/CodeGen/X86/fpclamptosat_vec.ll
+485	-571	llvm/test/CodeGen/X86/half.ll
+637	-418	llvm/test/CodeGen/X86/vector-half-conversions.ll
+373	-473	llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll
+369	-377	llvm/test/CodeGen/X86/fptoui-sat-vector-128.ll
+3,354	-3,211	44 files not shown
+4,799	-5,136	50 files

LLVM/project 04a3d5f — llvm/test/CodeGen/X86 frem.ll fpclamptosat_vec.ll

Jun 17, 2022 by Phoebe Wang on ⎇ main

Reland "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"""

Fix the crash on lowering X86ISD::FCMP.

Delta		File
+845	-737	llvm/test/CodeGen/X86/frem.ll
+489	-715	llvm/test/CodeGen/X86/fpclamptosat_vec.ll
+571	-485	llvm/test/CodeGen/X86/half.ll
+410	-629	llvm/test/CodeGen/X86/vector-half-conversions.ll
+475	-375	llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll
+380	-372	llvm/test/CodeGen/X86/fptoui-sat-vector-128.ll
+3,170	-3,313	44 files not shown
+5,036	-4,699	50 files

LLVM/project 3cd5696 — llvm/test/CodeGen/X86 frem.ll fpclamptosat_vec.ll

Jun 15, 2022 by Frederik Gossen on ⎇ main

Revert "Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"""

This reverts commit e1c5afa47d37012499467b5061fc42e50884d129.

This introduces crashes in the JAX backend on CPU. A reproducer in LLVM is
below. Let me know if you have trouble reproducing this.

; ModuleID = '__compute_module'
source_filename = "__compute_module"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"

@0 = private unnamed_addr constant [4 x i8] c"\00\00\00?"
@1 = private unnamed_addr constant [4 x i8] c"\1C}\908"
@2 = private unnamed_addr constant [4 x i8] c"?\00\\4"
@3 = private unnamed_addr constant [4 x i8] c"%ci1"
@4 = private unnamed_addr constant [4 x i8] zeroinitializer
@5 = private unnamed_addr constant [4 x i8] c"\00\00\00\C0"
@6 = private unnamed_addr constant [4 x i8] c"\00\00\00B"

    [205 lines not shown]

Delta		File
+829	-937	llvm/test/CodeGen/X86/frem.ll
+661	-435	llvm/test/CodeGen/X86/fpclamptosat_vec.ll
+637	-418	llvm/test/CodeGen/X86/vector-half-conversions.ll
+485	-493	llvm/test/CodeGen/X86/half.ll
+373	-473	llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll
+369	-377	llvm/test/CodeGen/X86/fptoui-sat-vector-128.ll
+3,354	-3,133	44 files not shown
+4,798	-5,056	50 files

LLVM/project e1c5afa — llvm/test/CodeGen/X86 frem.ll fpclamptosat_vec.ll

Jun 15, 2022 by Phoebe Wang on ⎇ main

Reland "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""

Fixed the missing SQRT promotion. Adding several missing operations too.

Delta		File
+845	-737	llvm/test/CodeGen/X86/frem.ll
+489	-715	llvm/test/CodeGen/X86/fpclamptosat_vec.ll
+410	-629	llvm/test/CodeGen/X86/vector-half-conversions.ll
+493	-485	llvm/test/CodeGen/X86/half.ll
+475	-375	llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll
+380	-372	llvm/test/CodeGen/X86/fptoui-sat-vector-128.ll
+3,092	-3,313	44 files not shown
+4,956	-4,698	50 files

LLVM/project 37455b1 — llvm/test/CodeGen/X86 frem.ll fpclamptosat_vec.ll

Jun 15, 2022 by Thomas Joerg via Benjamin Kramer on ⎇ main

Revert "Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI""

This reverts commit 6e02e27536b9de25a651cfc9c2966ce471169355.

This introduces a crash in the backend. Reproducer in MLIR's LLVM
dialect follows. Let me know if you have trouble reproducing this.

module {
  llvm.func @malloc(i64) -> !llvm.ptr<i8>
  llvm.func @_mlir_ciface_tf_report_error(!llvm.ptr<i8>, i32, !llvm.ptr<i8>)
  llvm.mlir.global internal constant @error_message_2208944672953921889("failed to allocate memory at loc(\22-\22:3:8)\00")
  llvm.func @_mlir_ciface_tf_alloc(!llvm.ptr<i8>, i64, i64, i32, i32, !llvm.ptr<i32>) -> !llvm.ptr<i8>
  llvm.func @Rsqrt_CPU_DT_HALF_DT_HALF(%arg0: !llvm.ptr<i8>, %arg1: i64, %arg2: !llvm.ptr<i8>) -> !llvm.struct<(i64, ptr<i8>)> attributes {llvm.emit_c_interface, tf_entry} {
    %0 = llvm.mlir.constant(8 : i32) : i32
    %1 = llvm.mlir.constant(8 : index) : i64
    %2 = llvm.mlir.constant(2 : index) : i64
    %3 = llvm.mlir.constant(dense<0.000000e+00> : vector<4xf16>) : vector<4xf16>
    %4 = llvm.mlir.constant(dense<[0, 1, 2, 3]> : vector<4xi32>) : vector<4xi32>
    %5 = llvm.mlir.constant(dense<1.000000e+00> : vector<4xf16>) : vector<4xf16>

    [180 lines not shown]

Delta		File
+829	-937	llvm/test/CodeGen/X86/frem.ll
+661	-435	llvm/test/CodeGen/X86/fpclamptosat_vec.ll
+637	-418	llvm/test/CodeGen/X86/vector-half-conversions.ll
+485	-446	llvm/test/CodeGen/X86/half.ll
+373	-473	llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll
+369	-377	llvm/test/CodeGen/X86/fptoui-sat-vector-128.ll
+3,354	-3,086	44 files not shown
+4,798	-4,997	50 files

LLVM/project 6e02e27 — llvm/test/CodeGen/X86 frem.ll fpclamptosat_vec.ll

Jun 15, 2022 by Phoebe Wang on ⎇ main

Reland "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"

Disabled 2 mlir tests because the runtime doesn't support `_Float16`, see
the issue here https://github.com/llvm/llvm-project/issues/55992

Delta		File
+845	-737	llvm/test/CodeGen/X86/frem.ll
+489	-715	llvm/test/CodeGen/X86/fpclamptosat_vec.ll
+410	-629	llvm/test/CodeGen/X86/vector-half-conversions.ll
+446	-485	llvm/test/CodeGen/X86/half.ll
+475	-375	llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll
+380	-372	llvm/test/CodeGen/X86/fptoui-sat-vector-128.ll
+3,045	-3,313	44 files not shown
+4,897	-4,698	50 files

LLVM/project 5d8298a — llvm/test/CodeGen/X86 frem.ll fpclamptosat_vec.ll

Jun 12, 2022 by Mehdi Amini on ⎇ main

Revert "[X86][RFC] Enable `_Float16` type support on X86 following the psABI"

This reverts commit 2d2da259c8726fd5c974c01122a9689981a12196.

This breaks MLIR integration test (JIT crashing), reverting in the
meantime.

Delta		File
+829	-937	llvm/test/CodeGen/X86/frem.ll
+661	-435	llvm/test/CodeGen/X86/fpclamptosat_vec.ll
+637	-418	llvm/test/CodeGen/X86/vector-half-conversions.ll
+485	-446	llvm/test/CodeGen/X86/half.ll
+373	-473	llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll
+369	-377	llvm/test/CodeGen/X86/fptoui-sat-vector-128.ll
+3,354	-3,086	42 files not shown
+4,811	-4,991	48 files

LLVM/project 2d2da25 — llvm/test/CodeGen/X86 frem.ll fpclamptosat_vec.ll

Jun 12, 2022 by Phoebe Wang on ⎇ main

[X86][RFC] Enable `_Float16` type support on X86 following the psABI

GCC and Clang/LLVM will support `_Float16` on X86 in C/C++, following
the latest X86 psABI. (https://gitlab.com/x86-psABIs)

_Float16 arithmetic will be performed using native half-precision. If
native arithmetic instructions are not available, it will be performed
at a higher precision (currently always float) and then truncated down
to _Float16 immediately after each single arithmetic operation.
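
The "promote, operate, truncate after every operation" rule can be illustrated with a small model. This is a sketch using Python's `struct` module to round values to half and single precision; the helper names are made up for the example, and it models addition only.

```python
import struct


def to_f16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_f32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def add_f16_promoted(a, b):
    """_Float16 add without native support: promote both operands to float,
    add in float, then truncate the result back to half right away."""
    return to_f16(to_f32(to_f32(a) + to_f32(b)))


# Truncating after each single operation, as the message describes.
step_by_step = add_f16_promoted(add_f16_promoted(2048.0, 1.0), 1.0)

# For contrast: keeping the intermediate in float and truncating once.
truncated_once = to_f16(to_f32(2048.0 + 1.0 + 1.0))

print(step_by_step)    # 2048.0
print(truncated_once)  # 2050.0
```

Half precision cannot represent 2049, so each `+ 1.0` is rounded away when the result is truncated immediately. Truncating after every operation is what makes the promoted path give the same answers as native half-precision arithmetic would.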

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D107082

Delta		File
+845	-737	llvm/test/CodeGen/X86/frem.ll
+489	-715	llvm/test/CodeGen/X86/fpclamptosat_vec.ll
+410	-629	llvm/test/CodeGen/X86/vector-half-conversions.ll
+446	-485	llvm/test/CodeGen/X86/half.ll
+475	-375	llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll
+380	-372	llvm/test/CodeGen/X86/fptoui-sat-vector-128.ll
+3,045	-3,313	42 files not shown
+4,891	-4,711	48 files

LLVM/project 4a36e96 — llvm/test/CodeGen/AMDGPU load-constant-i16.ll amdgpu-codegenprepare-idiv.ll, llvm/test/CodeGen/AMDGPU/GlobalISel srem.i64.ll sdiv.i64.ll

Sep 15, 2021 by Matt Arsenault on ⎇ release/14.x

RegAllocGreedy: Account for reserved registers in num regs heuristic

This simple heuristic uses the estimated live range length combined
with the number of registers in the class to switch which heuristic to
use. This was taking the raw number of registers in the class, even
though not all of them may be available. AMDGPU heavily relies on
dynamically reserved numbers of registers based on user attributes to
satisfy occupancy constraints, so the raw number is highly misleading.
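
A minimal sketch of why the raw register count is misleading follows. Everything specific in it is assumed for illustration: the function names, the stride of 16, the `2 *` threshold, and the register names are hypothetical and are not taken from RegAllocGreedy.

```python
INSTR_DIST = 16  # assumed distance between consecutive instruction indexes


def num_allocatable(class_regs, reserved):
    """Registers of the class that are not reserved."""
    return sum(1 for reg in class_regs if reg not in reserved)


def live_range_is_long(live_range_size, num_regs):
    """Hypothetical switch: treat a live range as long when it spans more
    instructions than twice the number of registers available."""
    return live_range_size // INSTR_DIST > 2 * num_regs


# A class with 256 registers, of which all but 24 are reserved
# (for example, to satisfy an occupancy constraint).
class_regs = [f"v{i}" for i in range(256)]
reserved = {f"v{i}" for i in range(24, 256)}

size = 100 * INSTR_DIST  # a live range spanning about 100 instructions

print(live_range_is_long(size, len(class_regs)))                     # False
print(live_range_is_long(size, num_allocatable(class_regs, reserved)))  # True
```

With the raw count the range looks short relative to the class; with the reserved registers excluded, the same range crosses the threshold and the other heuristic would be chosen.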

There are still a few problems here. In the original testcase that
made me notice this, the live range size is incorrect after the
scheduler rearranges instructions, since the instructions don't have
the original InstrDist offsets. Additionally, I think it would be more
appropriate to use the number of disjointly allocatable registers in
the class. For the AMDGPU register tuples, there are a large number of
registers in each tuple class, but only a small fraction can actually
be allocated at the same time since they all overlap with each
other. It seems we do not have a query that corresponds to the number
of independently allocatable registers. Relatedly, I'm still debugging

    [10 lines not shown]

Delta		File
+2,764	-2,764	llvm/test/CodeGen/AMDGPU/GlobalISel/srem.i64.ll
+1,141	-1,138	llvm/test/CodeGen/AMDGPU/load-constant-i16.ll
+1,023	-1,023	llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll
+752	-740	llvm/test/CodeGen/AMDGPU/load-global-i16.ll
+644	-644	llvm/test/CodeGen/AMDGPU/GlobalISel/sdiv.i64.ll
+341	-341	llvm/test/CodeGen/RISCV/rvv/fixed-vectors-cttz.ll
+6,665	-6,650	145 files not shown
+13,666	-13,423	151 files

LLVM/project 0aef747 — llvm/test/CodeGen/X86 vector-popcnt-128-ult-ugt.ll vector-popcnt-512-ult-ugt.ll

Jun 11, 2021 by Roman Lebedev on ⎇ release/13.x

[NFC][X86][Codegen] Megacommit: mass-regenerate all check lines that were already autogenerated

The motivation is that the update script has at least two deviations
(`<...>@GOT`/`<...>@PLT` and not hiding pointer arithmetic) from
what pretty much all the checklines were generated with,
and most of the tests are still not updated, so each time one of the
non-up-to-date tests is updated to see the effect of the code change,
there is a lot of noise. Instead of having to deal with that each
time, let's just deal with everything at once.

This has been done via:
```
cd llvm-project/llvm/test/CodeGen/X86
grep -rl "; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py" | xargs -L1 <...>/llvm-project/llvm/utils/update_llc_test_checks.py --llc-binary <...>/llvm-project/build/bin/llc
```

Not all tests were regenerated, however.

Delta		File
+2,310	-2,310	llvm/test/CodeGen/X86/vector-popcnt-128-ult-ugt.ll
+836	-836	llvm/test/CodeGen/X86/vector-popcnt-512-ult-ugt.ll
+493	-493	llvm/test/CodeGen/X86/vector-popcnt-256-ult-ugt.ll
+474	-474	llvm/test/CodeGen/X86/srem-seteq-vec-nonsplat.ll
+371	-371	llvm/test/CodeGen/X86/urem-seteq-vec-nonsplat.ll
+549	-0	llvm/test/CodeGen/X86/umul-with-overflow.ll
+5,033	-4,484	666 files not shown
+14,147	-13,397	672 files

LLVM/project 0248e24 — llvm/test/CodeGen/X86 vector_splat-const-shift-of-constmasked.ll avx2-intrinsics-x86.ll

Mar 28, 2021 by Craig Topper on ⎇ release/13.x

[X86][update_llc_test_checks] Use a less greedy regular expression for replacing constant pool labels in tests.

While working on D97208 I noticed that these greedy regular
expressions prevent tests from failing when (%rip) appears after
a constant pool label when it didn't before.
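
The problem with the greedy form can be reproduced with ordinary regular expressions. The sketch below uses Python's `re` module as a stand-in for the check-line matching; the two patterns and the sample instructions are assumptions chosen for illustration, not the exact strings the script emits.

```python
import re

# A check pattern whose label part greedily matches anything after ".LCPI".
LOOSE = re.compile(r"movl \.LCPI.*, %eax")

# A tighter pattern that matches only the label itself.
TIGHT = re.compile(r"movl \.LCPI[0-9]+_[0-9]+, %eax")

before = "movl .LCPI0_0, %eax"        # output the check was generated from
after = "movl .LCPI0_0(%rip), %eax"   # output after a codegen change


def passes(pattern, line):
    """True when the whole line matches the check pattern."""
    return pattern.fullmatch(line) is not None


print(passes(LOOSE, before), passes(LOOSE, after))  # True True
print(passes(TIGHT, before), passes(TIGHT, after))  # True False
```

The loose pattern lets `.*` absorb the new `(%rip)` suffix, so the check keeps passing even though the addressing mode changed; the tight pattern fails, as a regression test should.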

Reviewed By: RKSimon, pengfei

Differential Revision: https://reviews.llvm.org/D99460

Delta		File
+196	-196	llvm/test/CodeGen/X86/vector_splat-const-shift-of-constmasked.ll
+168	-168	llvm/test/CodeGen/X86/avx2-intrinsics-x86.ll
+124	-124	llvm/test/CodeGen/X86/fptosi-sat-scalar.ll
+94	-94	llvm/test/CodeGen/X86/limited-prec.ll
+73	-73	llvm/test/CodeGen/X86/fptoui-sat-scalar.ll
+72	-72	llvm/test/CodeGen/X86/cmov-fp.ll
+727	-727	138 files not shown
+1,680	-1,680	144 files

LLVM/project 07605ea — llvm/lib/Target/X86 X86ISelLowering.cpp X86ISelLowering.h, llvm/test/CodeGen/X86 fptosi-sat-scalar.ll fptoui-sat-scalar.ll

Jan 12, 2021 by Bevin Hansson via Bjorn Pettersson on ⎇ release/12.x

[X86] Improved lowering for saturating float to int.

Adapted from D54696 by @nikic.

This patch improves lowering of saturating float to
int conversions, FP_TO_[SU]INT_SAT, for X86.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D86079

Delta		File
+153	-342	llvm/test/CodeGen/X86/fptosi-sat-scalar.ll
+139	-302	llvm/test/CodeGen/X86/fptoui-sat-scalar.ll
+164	-0	llvm/lib/Target/X86/X86ISelLowering.cpp
+1	-0	llvm/lib/Target/X86/X86ISelLowering.h
+457	-644	4 files

LLVM/project 75c0432 — llvm/test/CodeGen/X86 lit.local.cfg vector-pack-128.ll

Jan 9, 2021 by Mircea Trofin on ⎇ release/12.x

[NFC] Disallow unused prefixes in CodeGen/X86 tests.

Also fixed remaining tests that featured unused prefixes.

Differential Revision: https://reviews.llvm.org/D94330

Delta		File
+8	-0	llvm/test/CodeGen/X86/lit.local.cfg
+2	-2	llvm/test/CodeGen/X86/vector-pack-128.ll
+2	-2	llvm/test/CodeGen/X86/fptosi-sat-scalar.ll
+2	-2	llvm/test/CodeGen/X86/fptoui-sat-scalar.ll
+14	-6	4 files

LLVM/project a89d751 — llvm/test/CodeGen/AArch64 fptosi-sat-vector.ll fptoui-sat-vector.ll, llvm/test/CodeGen/ARM fptosi-sat-scalar.ll

Dec 18, 2020 by Bjorn Pettersson on ⎇ release/12.x

Add intrinsics for saturating float to int casts

This patch adds support for the fptoui.sat and fptosi.sat intrinsics,
which provide basically the same functionality as the existing fptoui
and fptosi instructions, but will saturate (or return 0 for NaN) on
values unrepresentable in the target type, instead of returning
poison. Related mailing list discussion can be found at:
https://groups.google.com/d/msg/llvm-dev/cgDFaBmCnDQ/CZAIMj4IBAAJ

The intrinsics have overloaded source and result type and support
vector operands:

    i32 @llvm.fptoui.sat.i32.f32(float %f)
    i100 @llvm.fptoui.sat.i100.f64(double %f)
    <4 x i32> @llvm.fptoui.sat.v4i32.v4f16(<4 x half> %f)
    // etc
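
The saturating behaviour described above (clamp to the representable range, return 0 for NaN) can be written as a small scalar reference model. This is a sketch of the stated semantics only; the Python function names mirror the intrinsic names for readability and are not an LLVM API.

```python
import math


def fptoui_sat(x, bits):
    """Saturating float -> unsigned integer of the given bit width."""
    if math.isnan(x):
        return 0
    hi = (1 << bits) - 1
    if x <= 0:
        return 0
    if x >= hi:
        return hi
    return math.trunc(x)  # in range: round toward zero


def fptosi_sat(x, bits):
    """Saturating float -> signed integer of the given bit width."""
    if math.isnan(x):
        return 0
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if x <= lo:
        return lo
    if x >= hi:
        return hi
    return math.trunc(x)  # in range: round toward zero


print(fptoui_sat(300.7, 8))          # 255
print(fptoui_sat(-5.0, 8))           # 0
print(fptosi_sat(-1e30, 32))         # -2147483648
print(fptosi_sat(float("nan"), 32))  # 0
```

The bit width is a parameter because the intrinsics are overloaded on the result type, including unusual widths such as the `i100` example above.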

On the SelectionDAG layer two new ISD opcodes are added,
FP_TO_UINT_SAT and FP_TO_SINT_SAT. These opcodes have two operands

    [43 lines not shown]

Delta		File
+4,711	-0	llvm/test/CodeGen/X86/fptosi-sat-scalar.ll
+4,300	-0	llvm/test/CodeGen/X86/fptoui-sat-scalar.ll
+2,812	-0	llvm/test/CodeGen/ARM/fptosi-sat-scalar.ll
+2,807	-0	llvm/test/CodeGen/AArch64/fptosi-sat-vector.ll
+2,196	-0	llvm/test/CodeGen/AArch64/fptoui-sat-vector.ll
+676	-0	llvm/test/CodeGen/AArch64/fptosi-sat-scalar.ll
+17,502	-0	16 files not shown
+18,535	-2	22 files