Merge ~fheimes/ubuntu/+source/valgrind:valgrind-lp1825343-hirsute into ubuntu/+source/valgrind:ubuntu/hirsute-devel

Proposed by Frank Heimes
Status: Merged
Approved by: Christian Ehrhardt 
Approved revision: 9ce66c4fc97a353855c5cd5bf496d03e42867fda
Merged at revision: 9ce66c4fc97a353855c5cd5bf496d03e42867fda
Proposed branch: ~fheimes/ubuntu/+source/valgrind:valgrind-lp1825343-hirsute
Merge into: ubuntu/+source/valgrind:ubuntu/hirsute-devel
Diff against target: 3232 lines (+3198/-0)
5 files modified
debian/changelog (+9/-0)
debian/patches/lp-1825343-Bug-404076-s390x-Implement-z14-vector-instructions.patch (+2986/-0)
debian/patches/lp-1825343-Bug-428648-s390x-Force-12-bit-amode-for-vector-loads.patch (+45/-0)
debian/patches/lp-1825343-Bug-429864-s390-Use-Iop_CasCmp-to-fix-memcheck-false.patch (+155/-0)
debian/patches/series (+3/-0)
Reviewer: Christian Ehrhardt (community)
Review status: Approve
Review via email: mp+397860@code.launchpad.net

Description of the change

valgrind-lp1825343-hirsute
  add support for IBM z14 instructions to Valgrind
  debian/patches/lp-1825343-Bug-404076-s390*.patches
  debian/changelog
  backported three commits from valgrind > v3.16.1
  Thanks to Andreas Arnez (LP: #1825343)

One patch needed to be modified to skip the following two files:
  - docs/internals/s390-opcodes.csv
  - auxprogs/s390-check-opcodes.pl
since these files are not included in the upstream 3.16.1 release tarball and are therefore also absent from the Ubuntu package '3.16.1-1ubuntu1'.

Test build is available here:
https://launchpad.net/~fheimes/+archive/ubuntu/lp1825343
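
For orientation, one small and easily checked piece of the first backported patch is the change to the hardware capabilities that Valgrind advertises to the guest on s390x. The following is only a minimal sketch of that masking expression, not the Valgrind code itself: the three constant values are taken from the patch's include/vki/vki-s390x-linux.h hunk, while the `UL` suffixes, the macro `HWCAP_ADVERTISED_MASK`, and the helper `advertised_hwcap` are assumptions introduced purely for illustration.

```c
/* Constant values as they appear in the patch (vki-s390x-linux.h);
   the UL suffixes are added here for the sketch. */
#define VKI_HWCAP_S390_TE       1024UL
#define VKI_HWCAP_S390_VXRS     2048UL
#define VKI_HWCAP_S390_VXRS_EXT 8192UL

/* Hypothetical name: every bit "below" TE, plus the bits that are
   explicitly ORed in. TE itself (1024) is deliberately left out. */
#define HWCAP_ADVERTISED_MASK ((VKI_HWCAP_S390_TE - 1)      \
                               | VKI_HWCAP_S390_VXRS        \
                               | VKI_HWCAP_S390_VXRS_EXT)

/* Hypothetical helper: reduce the host's AT_HWCAP value to the subset
   that would be advertised to the guest program. */
static inline unsigned long advertised_hwcap(unsigned long host_hwcap)
{
    return host_hwcap & HWCAP_ADVERTISED_MASK;
}
```

With these values, a host that reports both TE and VXRS_EXT would end up advertising VXRS_EXT but not TE, which matches the comment in the patched initimg-linux.c.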

Christian Ehrhardt  (paelzer) wrote :

Changelog:
- [✓] changelog entry correct version and targeted codename
- [✓] changelog entries correct
- [✓] update-maintainer has been run

New Delta:
- [✓] new patches are good or match what was merged upstream
- [✓] new patches correctly included in debian/patches/series
- [✓] new patches have correct DEP3 metadata

Build/Test:
- [✓] build is ok (the one warning is not related to the changed code)
- [✓] verified PPA package installs/uninstalls
- [✓] sanity checks test fine

Some code changes are too complex to fully sign off on, e.g. lp-1825343-Bug-404076-s390x-Implement-z14-vector-instructions.patch. For those I have verified that they closely match what went upstream, and I have to trust upstream as the subject matter experts.

Overall this looks like a good pre-FF feature addition, pulled forward from the upcoming upstream version.
+1

review: Approve
Christian Ehrhardt  (paelzer) wrote :

To ssh://git.launchpad.net/ubuntu/+source/valgrind
 * [new tag] upload/1%3.16.1-1ubuntu2 -> upload/1%3.16.1-1ubuntu2

Uploading to ubuntu (via ftp to upload.ubuntu.com):
  Uploading valgrind_3.16.1-1ubuntu2.dsc: done.
  Uploading valgrind_3.16.1-1ubuntu2.debian.tar.xz: done.
  Uploading valgrind_3.16.1-1ubuntu2_source.buildinfo: done.
  Uploading valgrind_3.16.1-1ubuntu2_source.changes: done.
Successfully uploaded packages.

Frank Heimes (fheimes) wrote :

Many thx for reviewing, commenting, sponsoring, uploading and your overall support on this!

Preview Diff

diff --git a/debian/changelog b/debian/changelog
index c669e48..9b0d8a8 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,12 @@
+valgrind (1:3.16.1-1ubuntu2) hirsute; urgency=medium
+
+  * debian/patches/lp-1825343-Bug-404076-s390*.patches
+    adding support for IBM z14 instructions to Valgrind
+    backported three commits from valgrind > v3.16.1
+    Thanks to Andreas Arnez (LP: #1825343)
+
+ -- Frank Heimes <frank.heimes@canonical.com>  Wed, 10 Feb 2021 20:10:24 +0100
+
 valgrind (1:3.16.1-1ubuntu1) groovy; urgency=low
 
   * Merge from Debian unstable. Remaining changes:
diff --git a/debian/patches/lp-1825343-Bug-404076-s390x-Implement-z14-vector-instructions.patch b/debian/patches/lp-1825343-Bug-404076-s390x-Implement-z14-vector-instructions.patch
new file mode 100644
index 0000000..fa985b9
--- /dev/null
+++ b/debian/patches/lp-1825343-Bug-404076-s390x-Implement-z14-vector-instructions.patch
@@ -0,0 +1,2986 @@
1From 159f132289160ab1a5a5cf4da14fb57ecdb248ca Mon Sep 17 00:00:00 2001
2From: Andreas Arnez <arnez@linux.ibm.com>
3Date: Mon, 7 Dec 2020 20:01:26 +0100
4Subject: [PATCH] Bug 404076 - s390x: Implement z14 vector instructions
5
6Implement the new instructions/features that were added to z/Architecture
7with the vector-enhancements facility 1. Also cover the instructions from
8the vector-packed-decimal facility that are defined outside the chapter
9"Vector Decimal Instructions", but not the ones from that chapter itself.
10
11For a detailed list of newly supported instructions see the updates to
12`docs/internals/s390-opcodes.csv'.
13
14Since the miscellaneous instruction extensions facility 2 was already
15addressed by Bug 404406, this completes the support necessary to run
16general programs built with `--march=z14' under Valgrind. The
17vector-packed-decimal facility is currently not exploited by the standard
18toolchain and libraries.
19
20Author: Andreas Arnez <arnez@linux.ibm.com>
21Origin: backport, https://sourceware.org/git/?p=valgrind.git;a=commit;h=159f13228
22Bug-IBM: IBM Bugzilla 163660
23Bug-Ubuntu: https://bugs.launchpad.net/bugs/1825343
24Applied-Upstream: > v3.16.1
25Reviewed-by: Frank Heimes <frank.heimes@canonical.com>
26Last-Update: 2021-02-10
27
28---
29--- a/coregrind/m_initimg/initimg-linux.c
30+++ b/coregrind/m_initimg/initimg-linux.c
31@@ -697,9 +697,13 @@
32 }
33 # elif defined(VGP_s390x_linux)
34 {
35- /* Advertise hardware features "below" TE and VXRS. TE itself
36- and anything above VXRS is not supported by Valgrind. */
37- auxv->u.a_val &= (VKI_HWCAP_S390_TE - 1) | VKI_HWCAP_S390_VXRS;
38+ /* Out of the hardware features available on the platform,
39+ advertise those "below" TE, as well as the ones explicitly
40+ ORed in the expression below. Anything else, such as TE
41+ itself, is not supported by Valgrind. */
42+ auxv->u.a_val &= ((VKI_HWCAP_S390_TE - 1)
43+ | VKI_HWCAP_S390_VXRS
44+ | VKI_HWCAP_S390_VXRS_EXT);
45 }
46 # elif defined(VGP_arm64_linux)
47 {
48--- a/coregrind/m_machine.c
49+++ b/coregrind/m_machine.c
50@@ -1544,6 +1544,7 @@
51 { False, S390_FAC_MSA5, VEX_HWCAPS_S390X_MSA5, "MSA5" },
52 { False, S390_FAC_MI2, VEX_HWCAPS_S390X_MI2, "MI2" },
53 { False, S390_FAC_LSC2, VEX_HWCAPS_S390X_LSC2, "LSC2" },
54+ { False, S390_FAC_VXE, VEX_HWCAPS_S390X_VXE, "VXE" },
55 };
56
57 /* Set hwcaps according to the detected facilities */
58--- a/include/vki/vki-s390x-linux.h
59+++ b/include/vki/vki-s390x-linux.h
60@@ -806,6 +806,7 @@
61
62 #define VKI_HWCAP_S390_TE 1024
63 #define VKI_HWCAP_S390_VXRS 2048
64+#define VKI_HWCAP_S390_VXRS_EXT 8192
65
66
67 //----------------------------------------------------------------------
68--- a/NEWS
69+++ b/NEWS
70@@ -2,6 +2,7 @@
71 428648 s390_emit_load_mem panics due to 20-bit offset for vector load
72 429864 s390x: C++ atomic test_and_set yields false-positive memcheck
73 diagnostics
74+404076 s390x: z14 vector instructions not implemented
75
76 Release 3.16.1 (22 June 2020)
77 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
78--- a/none/tests/s390x/vector_float.c
79+++ b/none/tests/s390x/vector_float.c
80@@ -114,50 +114,59 @@
81 test_with_selective_printing(vldeb, (V128_V_RES_AS_FLOAT64 |
82 V128_V_ARG1_AS_FLOAT64));
83 test_with_selective_printing(wldeb, (V128_V_RES_AS_FLOAT64 |
84- V128_V_ARG1_AS_FLOAT64));
85+ V128_V_ARG1_AS_FLOAT64 |
86+ V128_V_RES_ZERO_ONLY));
87
88 test_with_selective_printing(vflcdb, (V128_V_RES_AS_FLOAT64 |
89 V128_V_ARG1_AS_FLOAT64));
90 test_with_selective_printing(wflcdb, (V128_V_RES_AS_FLOAT64 |
91- V128_V_ARG1_AS_FLOAT64));
92+ V128_V_ARG1_AS_FLOAT64 |
93+ V128_V_RES_ZERO_ONLY));
94 test_with_selective_printing(vflndb, (V128_V_RES_AS_FLOAT64 |
95 V128_V_ARG1_AS_FLOAT64));
96 test_with_selective_printing(wflndb, (V128_V_RES_AS_FLOAT64 |
97- V128_V_ARG1_AS_FLOAT64));
98+ V128_V_ARG1_AS_FLOAT64 |
99+ V128_V_RES_ZERO_ONLY));
100 test_with_selective_printing(vflpdb, (V128_V_RES_AS_FLOAT64 |
101 V128_V_ARG1_AS_FLOAT64));
102 test_with_selective_printing(wflpdb, (V128_V_RES_AS_FLOAT64 |
103- V128_V_ARG1_AS_FLOAT64));
104+ V128_V_ARG1_AS_FLOAT64 |
105+ V128_V_RES_ZERO_ONLY));
106
107 test_with_selective_printing(vfadb, (V128_V_RES_AS_FLOAT64 |
108 V128_V_ARG1_AS_FLOAT64 |
109 V128_V_ARG2_AS_FLOAT64));
110 test_with_selective_printing(wfadb, (V128_V_RES_AS_FLOAT64 |
111 V128_V_ARG1_AS_FLOAT64 |
112- V128_V_ARG2_AS_FLOAT64));
113+ V128_V_ARG2_AS_FLOAT64 |
114+ V128_V_RES_ZERO_ONLY));
115 test_with_selective_printing(vfsdb, (V128_V_RES_AS_FLOAT64 |
116 V128_V_ARG1_AS_FLOAT64 |
117 V128_V_ARG2_AS_FLOAT64));
118 test_with_selective_printing(wfsdb, (V128_V_RES_AS_FLOAT64 |
119 V128_V_ARG1_AS_FLOAT64 |
120- V128_V_ARG2_AS_FLOAT64));
121+ V128_V_ARG2_AS_FLOAT64 |
122+ V128_V_RES_ZERO_ONLY));
123 test_with_selective_printing(vfmdb, (V128_V_RES_AS_FLOAT64 |
124 V128_V_ARG1_AS_FLOAT64 |
125 V128_V_ARG2_AS_FLOAT64));
126 test_with_selective_printing(wfmdb, (V128_V_RES_AS_FLOAT64 |
127 V128_V_ARG1_AS_FLOAT64 |
128- V128_V_ARG2_AS_FLOAT64));
129+ V128_V_ARG2_AS_FLOAT64 |
130+ V128_V_RES_ZERO_ONLY));
131 test_with_selective_printing(vfddb, (V128_V_RES_AS_FLOAT64 |
132 V128_V_ARG1_AS_FLOAT64 |
133 V128_V_ARG2_AS_FLOAT64));
134 test_with_selective_printing(wfddb, (V128_V_RES_AS_FLOAT64 |
135 V128_V_ARG1_AS_FLOAT64 |
136- V128_V_ARG2_AS_FLOAT64));
137+ V128_V_ARG2_AS_FLOAT64 |
138+ V128_V_RES_ZERO_ONLY));
139
140 test_with_selective_printing(vfsqdb, (V128_V_RES_AS_FLOAT64 |
141 V128_V_ARG1_AS_FLOAT64));
142 test_with_selective_printing(wfsqdb, (V128_V_RES_AS_FLOAT64 |
143- V128_V_ARG1_AS_FLOAT64));
144+ V128_V_ARG1_AS_FLOAT64 |
145+ V128_V_RES_ZERO_ONLY));
146
147 test_with_selective_printing(vfmadb, (V128_V_RES_AS_FLOAT64 |
148 V128_V_ARG1_AS_FLOAT64 |
149@@ -166,7 +175,8 @@
150 test_with_selective_printing(wfmadb, (V128_V_RES_AS_FLOAT64 |
151 V128_V_ARG1_AS_FLOAT64 |
152 V128_V_ARG2_AS_FLOAT64 |
153- V128_V_ARG3_AS_FLOAT64));
154+ V128_V_ARG3_AS_FLOAT64 |
155+ V128_V_RES_ZERO_ONLY));
156 test_with_selective_printing(vfmsdb, (V128_V_RES_AS_FLOAT64 |
157 V128_V_ARG1_AS_FLOAT64 |
158 V128_V_ARG2_AS_FLOAT64 |
159@@ -174,21 +184,25 @@
160 test_with_selective_printing(wfmsdb, (V128_V_RES_AS_FLOAT64 |
161 V128_V_ARG1_AS_FLOAT64 |
162 V128_V_ARG2_AS_FLOAT64 |
163- V128_V_ARG3_AS_FLOAT64));
164+ V128_V_ARG3_AS_FLOAT64 |
165+ V128_V_RES_ZERO_ONLY));
166
167 test_with_selective_printing(wfcdb, (V128_V_ARG1_AS_FLOAT64 |
168 V128_V_ARG2_AS_FLOAT64 |
169- V128_R_RES));
170+ V128_R_RES |
171+ V128_V_RES_ZERO_ONLY));
172 test_with_selective_printing(wfkdb, (V128_V_ARG1_AS_FLOAT64 |
173 V128_V_ARG2_AS_FLOAT64 |
174- V128_R_RES));
175+ V128_R_RES |
176+ V128_V_RES_ZERO_ONLY));
177
178 test_with_selective_printing(vfcedb, (V128_V_RES_AS_INT |
179 V128_V_ARG1_AS_FLOAT64 |
180 V128_V_ARG2_AS_FLOAT64));
181 test_with_selective_printing(wfcedb, (V128_V_RES_AS_INT |
182 V128_V_ARG1_AS_FLOAT64 |
183- V128_V_ARG2_AS_FLOAT64));
184+ V128_V_ARG2_AS_FLOAT64 |
185+ V128_V_RES_ZERO_ONLY));
186 test_with_selective_printing(vfcedbs, (V128_V_RES_AS_INT |
187 V128_V_ARG1_AS_FLOAT64 |
188 V128_V_ARG2_AS_FLOAT64 |
189@@ -196,14 +210,16 @@
190 test_with_selective_printing(wfcedbs, (V128_V_RES_AS_INT |
191 V128_V_ARG1_AS_FLOAT64 |
192 V128_V_ARG2_AS_FLOAT64 |
193- V128_R_RES));
194+ V128_R_RES |
195+ V128_V_RES_ZERO_ONLY));
196
197 test_with_selective_printing(vfchdb, (V128_V_RES_AS_INT |
198 V128_V_ARG1_AS_FLOAT64 |
199 V128_V_ARG2_AS_FLOAT64));
200 test_with_selective_printing(wfchdb, (V128_V_RES_AS_INT |
201 V128_V_ARG1_AS_FLOAT64 |
202- V128_V_ARG2_AS_FLOAT64));
203+ V128_V_ARG2_AS_FLOAT64 |
204+ V128_V_RES_ZERO_ONLY));
205 test_with_selective_printing(vfchdbs, (V128_V_RES_AS_INT |
206 V128_V_ARG1_AS_FLOAT64 |
207 V128_V_ARG2_AS_FLOAT64 |
208@@ -211,14 +227,16 @@
209 test_with_selective_printing(wfchdbs, (V128_V_RES_AS_INT |
210 V128_V_ARG1_AS_FLOAT64 |
211 V128_V_ARG2_AS_FLOAT64 |
212- V128_R_RES));
213+ V128_R_RES |
214+ V128_V_RES_ZERO_ONLY));
215
216 test_with_selective_printing(vfchedb, (V128_V_RES_AS_INT |
217 V128_V_ARG1_AS_FLOAT64 |
218 V128_V_ARG2_AS_FLOAT64));
219 test_with_selective_printing(wfchedb, (V128_V_RES_AS_INT |
220 V128_V_ARG1_AS_FLOAT64 |
221- V128_V_ARG2_AS_FLOAT64));
222+ V128_V_ARG2_AS_FLOAT64 |
223+ V128_V_RES_ZERO_ONLY));
224 test_with_selective_printing(vfchedbs, (V128_V_RES_AS_INT |
225 V128_V_ARG1_AS_FLOAT64 |
226 V128_V_ARG2_AS_FLOAT64 |
227@@ -226,7 +244,8 @@
228 test_with_selective_printing(wfchedbs, (V128_V_RES_AS_INT |
229 V128_V_ARG1_AS_FLOAT64 |
230 V128_V_ARG2_AS_FLOAT64 |
231- V128_R_RES));
232+ V128_R_RES |
233+ V128_V_RES_ZERO_ONLY));
234
235 test_with_selective_printing(vftcidb0, (V128_V_RES_AS_INT |
236 V128_V_ARG1_AS_FLOAT64 |
237--- a/none/tests/s390x/vector_float.stdout.exp
238+++ b/none/tests/s390x/vector_float.stdout.exp
239@@ -419,88 +419,88 @@
240 v_result = 7fffffffffffffff | 7fffffffffffffff
241 v_arg1 = 0x1.fed2f087c21p+341 | 0x1.180e4c1d87fc4p+682
242 insn wcgdb00:
243- v_result = 7fffffffffffffff | 0000000000000000
244+ v_result = 7fffffffffffffff | --
245 v_arg1 = 0x1.d7fd9222e8b86p+670 | 0x1.c272612672a3p+798
246 insn wcgdb00:
247- v_result = 0000000000000000 | 0000000000000000
248+ v_result = 0000000000000000 | --
249 v_arg1 = 0x1.745cd360987e5p-496 | -0x1.f3b404919f358p-321
250 insn wcgdb00:
251- v_result = 8000000000000000 | 0000000000000000
252+ v_result = 8000000000000000 | --
253 v_arg1 = -0x1.9523565cd92d5p+643 | 0x1.253677d6d3be2p-556
254 insn wcgdb00:
255- v_result = 7fffffffffffffff | 0000000000000000
256+ v_result = 7fffffffffffffff | --
257 v_arg1 = 0x1.b6eb576ec3e6ap+845 | -0x1.c7e102c503d91p+266
258 insn wcgdb01:
259- v_result = 0000000000000000 | 0000000000000000
260+ v_result = 0000000000000000 | --
261 v_arg1 = -0x1.3d4319841f4d6p-1011 | -0x1.2feabf7dfc506p-680
262 insn wcgdb01:
263- v_result = 0000000000000000 | 0000000000000000
264+ v_result = 0000000000000000 | --
265 v_arg1 = -0x1.6fb8d1cd8b32cp-843 | -0x1.50f6a6922f97ep+33
266 insn wcgdb01:
267- v_result = 0000000000000000 | 0000000000000000
268+ v_result = 0000000000000000 | --
269 v_arg1 = -0x1.64a673daccf1ap-566 | -0x1.69ef9b1d01499p+824
270 insn wcgdb01:
271- v_result = 8000000000000000 | 0000000000000000
272+ v_result = 8000000000000000 | --
273 v_arg1 = -0x1.3e2ddd862b4adp+1005 | -0x1.312466410271p+184
274 insn wcgdb03:
275- v_result = 0000000000000001 | 0000000000000000
276+ v_result = 0000000000000001 | --
277 v_arg1 = 0x1.d594c3412a11p-953 | -0x1.a07393d34d77cp-224
278 insn wcgdb03:
279- v_result = 8000000000000000 | 0000000000000000
280+ v_result = 8000000000000000 | --
281 v_arg1 = -0x1.f7a0dbcfd6e4cp+104 | -0x1.40f7cde7f2214p-702
282 insn wcgdb03:
283- v_result = 8000000000000000 | 0000000000000000
284+ v_result = 8000000000000000 | --
285 v_arg1 = -0x1.40739c1574808p+560 | -0x1.970328ddf1b6ep-374
286 insn wcgdb03:
287- v_result = 0000000000000001 | 0000000000000000
288+ v_result = 0000000000000001 | --
289 v_arg1 = 0x1.477653afd7048p-38 | 0x1.1eac2f8b2a93cp-384
290 insn wcgdb04:
291- v_result = ffffffffe9479a7d | 0000000000000000
292+ v_result = ffffffffe9479a7d | --
293 v_arg1 = -0x1.6b865833eff3p+28 | 0x1.06e8cf1834d0ep-722
294 insn wcgdb04:
295- v_result = 0000000000000000 | 0000000000000000
296+ v_result = 0000000000000000 | --
297 v_arg1 = 0x1.eef0b2294a5cp-544 | -0x1.8e8b133ccda15p+752
298 insn wcgdb04:
299- v_result = 0000000000000000 | 0000000000000000
300+ v_result = 0000000000000000 | --
301 v_arg1 = -0x1.f34e77e6b6698p-894 | -0x1.9f7ce1cb53bddp-896
302 insn wcgdb04:
303- v_result = 7fffffffffffffff | 0000000000000000
304+ v_result = 7fffffffffffffff | --
305 v_arg1 = 0x1.95707a6d75db5p+1018 | -0x1.3b0c072d23011p-224
306 insn wcgdb05:
307- v_result = 0000000000000000 | 0000000000000000
308+ v_result = 0000000000000000 | --
309 v_arg1 = -0x1.a9fb71160793p-968 | 0x1.05f601fe8123ap-986
310 insn wcgdb05:
311- v_result = 8000000000000000 | 0000000000000000
312+ v_result = 8000000000000000 | --
313 v_arg1 = -0x1.0864159b94305p+451 | -0x1.d4647f5a78b7ep-599
314 insn wcgdb05:
315- v_result = 7fffffffffffffff | 0000000000000000
316+ v_result = 7fffffffffffffff | --
317 v_arg1 = 0x1.37eadff8397c8p+432 | -0x1.15d896b6f6063p+464
318 insn wcgdb05:
319- v_result = 0000000000000000 | 0000000000000000
320+ v_result = 0000000000000000 | --
321 v_arg1 = 0x1.eb0812b0d677p-781 | 0x1.3117c5e0e288cp-202
322 insn wcgdb06:
323- v_result = 0000000000000001 | 0000000000000000
324+ v_result = 0000000000000001 | --
325 v_arg1 = 0x1.6b88069167c0fp-662 | -0x1.70571d27e1279p+254
326 insn wcgdb06:
327- v_result = 7fffffffffffffff | 0000000000000000
328+ v_result = 7fffffffffffffff | --
329 v_arg1 = 0x1.f6a6d6e883596p+260 | 0x1.0d578afaaa34ap+604
330 insn wcgdb06:
331- v_result = 0000000000000001 | 0000000000000000
332+ v_result = 0000000000000001 | --
333 v_arg1 = 0x1.d91c7d13c4694p-475 | -0x1.ecf1f8529767bp+830
334 insn wcgdb06:
335- v_result = 0000000000000001 | 0000000000000000
336+ v_result = 0000000000000001 | --
337 v_arg1 = 0x1.fac8dd3bb7af6p-101 | 0x1.fb8324a00fba8p+959
338 insn wcgdb07:
339- v_result = 7fffffffffffffff | 0000000000000000
340+ v_result = 7fffffffffffffff | --
341 v_arg1 = 0x1.4b0fa18fa73c7p+111 | -0x1.08e7b17633a49p+61
342 insn wcgdb07:
343- v_result = e636b693e39a1100 | 0000000000000000
344+ v_result = e636b693e39a1100 | --
345 v_arg1 = -0x1.9c9496c1c65efp+60 | 0x1.c4182ee728d76p-572
346 insn wcgdb07:
347- v_result = ffffffffffffffff | 0000000000000000
348+ v_result = ffffffffffffffff | --
349 v_arg1 = -0x1.819718032dff7p-303 | 0x1.a784c77ff6aa2p-622
350 insn wcgdb07:
351- v_result = 7fffffffffffffff | 0000000000000000
352+ v_result = 7fffffffffffffff | --
353 v_arg1 = 0x1.978e8abfd83c2p+152 | 0x1.2531ebf451762p+315
354 insn vclgdb00:
355 v_result = 0000000000000000 | 0000000000000000
356@@ -587,88 +587,88 @@
357 v_result = 0000000000000000 | 0000000000000000
358 v_arg1 = -0x1.137bbb51f08bdp+306 | 0x1.18d2a1063356p-795
359 insn wclgdb00:
360- v_result = 0000000000000000 | 0000000000000000
361+ v_result = 0000000000000000 | --
362 v_arg1 = -0x1.e66f55dcc2639p-1013 | -0x1.733ee56929f3bp-304
363 insn wclgdb00:
364- v_result = 0000000000000000 | 0000000000000000
365+ v_result = 0000000000000000 | --
366 v_arg1 = 0x1.8802fd9ab740cp-986 | -0x1.64d4d2c7c145fp-1015
367 insn wclgdb00:
368- v_result = 0000000000000000 | 0000000000000000
369+ v_result = 0000000000000000 | --
370 v_arg1 = 0x1.a67209b8c407bp-645 | -0x1.6410ff9b1c801p+487
371 insn wclgdb00:
372- v_result = 0000000000000000 | 0000000000000000
373+ v_result = 0000000000000000 | --
374 v_arg1 = -0x1.cb2febaefeb2dp+49 | 0x1.dee368b2ec375p-502
375 insn wclgdb01:
376- v_result = 0000000000000000 | 0000000000000000
377+ v_result = 0000000000000000 | --
378 v_arg1 = 0x1.5703db3c1b0e2p-728 | 0x1.068c4d51ea4ebp+617
379 insn wclgdb01:
380- v_result = 0000000000000000 | 0000000000000000
381+ v_result = 0000000000000000 | --
382 v_arg1 = -0x1.ae350291e5b3ep+291 | 0x1.1b87bb09b6032p+376
383 insn wclgdb01:
384- v_result = ffffffffffffffff | 0000000000000000
385+ v_result = ffffffffffffffff | --
386 v_arg1 = 0x1.c4666a710127ep+424 | -0x1.19e969b6c0076p+491
387 insn wclgdb01:
388- v_result = ffffffffffffffff | 0000000000000000
389+ v_result = ffffffffffffffff | --
390 v_arg1 = 0x1.c892c5a4d103fp+105 | -0x1.d4f937cc76704p+749
391 insn wclgdb03:
392- v_result = 0000000000000001 | 0000000000000000
393+ v_result = 0000000000000001 | --
394 v_arg1 = 0x1.81090d8fc663dp-111 | 0x1.337ec5e0f0904p+1
395 insn wclgdb03:
396- v_result = 0000000000000000 | 0000000000000000
397+ v_result = 0000000000000000 | --
398 v_arg1 = -0x1.e787adc70b91p-593 | 0x1.db8d83196b53cp-762
399 insn wclgdb03:
400- v_result = ffffffffffffffff | 0000000000000000
401+ v_result = ffffffffffffffff | --
402 v_arg1 = 0x1.6529307e907efp+389 | -0x1.3ea0d8d5b4dd2p+589
403 insn wclgdb03:
404- v_result = 0000000000000000 | 0000000000000000
405+ v_result = 0000000000000000 | --
406 v_arg1 = -0x1.be701a158637p-385 | 0x1.c5a7f70cb8a09p+107
407 insn wclgdb04:
408- v_result = 0000000000000000 | 0000000000000000
409+ v_result = 0000000000000000 | --
410 v_arg1 = -0x1.2f328571ab445p+21 | -0x1.dcc21fc82ba01p-930
411 insn wclgdb04:
412- v_result = 0000000000000000 | 0000000000000000
413+ v_result = 0000000000000000 | --
414 v_arg1 = -0x1.06b69fcbb7bffp-415 | 0x1.6f9a13a0a827ap+915
415 insn wclgdb04:
416- v_result = 0000000000000000 | 0000000000000000
417+ v_result = 0000000000000000 | --
418 v_arg1 = -0x1.738e549b38bcdp+479 | 0x1.a522edb999c9p-45
419 insn wclgdb04:
420- v_result = 0000000000000000 | 0000000000000000
421+ v_result = 0000000000000000 | --
422 v_arg1 = 0x1.7f9399d2bcf3bp-215 | -0x1.7bc35f2d69a7fp+818
423 insn wclgdb05:
424- v_result = ffffffffffffffff | 0000000000000000
425+ v_result = ffffffffffffffff | --
426 v_arg1 = 0x1.fc542bdb707f6p+880 | -0x1.8521ebc93a25fp-969
427 insn wclgdb05:
428- v_result = 1ce8d9951b8c8600 | 0000000000000000
429+ v_result = 1ce8d9951b8c8600 | --
430 v_arg1 = 0x1.ce8d9951b8c86p+60 | 0x1.92712589230e7p+475
431 insn wclgdb05:
432- v_result = 0000000000000000 | 0000000000000000
433+ v_result = 0000000000000000 | --
434 v_arg1 = -0x1.8a297f60a0811p-156 | 0x1.102b79043d82cp-204
435 insn wclgdb05:
436- v_result = 0000000000000000 | 0000000000000000
437+ v_result = 0000000000000000 | --
438 v_arg1 = 0x1.beb9057e1401dp-196 | -0x1.820f18f830262p+15
439 insn wclgdb06:
440- v_result = 0000000000000001 | 0000000000000000
441+ v_result = 0000000000000001 | --
442 v_arg1 = 0x1.c321a966ecb4dp-430 | -0x1.2f6a1a95ead99p-943
443 insn wclgdb06:
444- v_result = 0000000000000000 | 0000000000000000
445+ v_result = 0000000000000000 | --
446 v_arg1 = -0x1.f1a86b4aed821p-56 | -0x1.1ee6717cc2d7fp-899
447 insn wclgdb06:
448- v_result = 0000000000000000 | 0000000000000000
449+ v_result = 0000000000000000 | --
450 v_arg1 = -0x1.73ce49d89ecb9p-302 | 0x1.52663b975ed23p-716
451 insn wclgdb06:
452- v_result = 0000000000000000 | 0000000000000000
453+ v_result = 0000000000000000 | --
454 v_arg1 = -0x1.3e9c2de97a292p+879 | 0x1.d34eed36f2eafp+960
455 insn wclgdb07:
456- v_result = 0000000000000000 | 0000000000000000
457+ v_result = 0000000000000000 | --
458 v_arg1 = -0x1.4e6ec6ddc6a45p-632 | -0x1.6e564d0fec72bp+369
459 insn wclgdb07:
460- v_result = ffffffffffffffff | 0000000000000000
461+ v_result = ffffffffffffffff | --
462 v_arg1 = 0x1.42e2c658e4c4dp+459 | -0x1.9f9dc0252e44p+85
463 insn wclgdb07:
464- v_result = 0000000000000000 | 0000000000000000
465+ v_result = 0000000000000000 | --
466 v_arg1 = -0x1.fb40ac8cda3c1p-762 | 0x1.0e9ed614bc8f1p-342
467 insn wclgdb07:
468- v_result = 0000000000000000 | 0000000000000000
469+ v_result = 0000000000000000 | --
470 v_arg1 = -0x1.c1f8b3c68e214p+118 | -0x1.1a26a49368b61p+756
471 insn vfidb00:
472 v_arg1 = -0x1.38df4cf9d52dbp-545 | -0x1.049253d90dd92p+94
473@@ -1020,16 +1020,16 @@
474 v_result = -0x1.6f5fb2p+70 | -0x1.0d2df6p-107
475 insn wldeb:
476 v_arg1 = -0x1.d26169729db2ap-435 | 0x1.d6fd080793e8cp+767
477- v_result = -0x1.9a4c2cp-54 | 0x0p+0
478+ v_result = -0x1.9a4c2cp-54 | --
479 insn wldeb:
480 v_arg1 = -0x1.f4b59107fce61p-930 | 0x1.cdf2816e253f4p-168
481- v_result = -0x1.be96b2p-116 | 0x0p+0
482+ v_result = -0x1.be96b2p-116 | --
483 insn wldeb:
484 v_arg1 = -0x1.9603a2997928cp-441 | -0x1.aada85e355a11p-767
485- v_result = -0x1.d2c074p-55 | 0x0p+0
486+ v_result = -0x1.d2c074p-55 | --
487 insn wldeb:
488 v_arg1 = 0x1.25ccf5bd0e83p+620 | 0x1.e1635864ebb17p-88
489- v_result = 0x1.64b99ep+78 | 0x0p+0
490+ v_result = 0x1.64b99ep+78 | --
491 insn vflcdb:
492 v_arg1 = 0x1.0ae6d82f76afp-166 | -0x1.e8fb1e03a7415p-191
493 v_result = -0x1.0ae6d82f76afp-166 | 0x1.e8fb1e03a7415p-191
494@@ -1044,16 +1044,16 @@
495 v_result = -0x1.19520153d35b4p-301 | -0x1.ac5325cd23253p+396
496 insn wflcdb:
497 v_arg1 = 0x1.ffd3eecfd54d7p-831 | -0x1.97854fa523a77p+146
498- v_result = -0x1.ffd3eecfd54d7p-831 | 0x0p+0
499+ v_result = -0x1.ffd3eecfd54d7p-831 | --
500 insn wflcdb:
501 v_arg1 = -0x1.508ea45606447p-442 | 0x1.ae7f0e6cf9d2bp+583
502- v_result = 0x1.508ea45606447p-442 | 0x0p+0
503+ v_result = 0x1.508ea45606447p-442 | --
504 insn wflcdb:
505 v_arg1 = 0x1.da8ab2188c21ap+94 | 0x1.78a9c152aa074p-808
506- v_result = -0x1.da8ab2188c21ap+94 | 0x0p+0
507+ v_result = -0x1.da8ab2188c21ap+94 | --
508 insn wflcdb:
509 v_arg1 = -0x1.086882645e0c5p-1001 | -0x1.54e2de5af5a74p-262
510- v_result = 0x1.086882645e0c5p-1001 | 0x0p+0
511+ v_result = 0x1.086882645e0c5p-1001 | --
512 insn vflndb:
513 v_arg1 = -0x1.5bec561d407dcp+819 | -0x1.a5773dadb7a2dp+935
514 v_result = -0x1.5bec561d407dcp+819 | -0x1.a5773dadb7a2dp+935
515@@ -1068,16 +1068,16 @@
516 v_result = -0x1.c5bc39a06d4e2p-259 | -0x1.c5e61ad849e77p-833
517 insn wflndb:
518 v_arg1 = -0x1.e9f3e6d1beffap-117 | -0x1.d58cc8bf123b3p-714
519- v_result = -0x1.e9f3e6d1beffap-117 | 0x0p+0
520+ v_result = -0x1.e9f3e6d1beffap-117 | --
521 insn wflndb:
522 v_arg1 = -0x1.3fc4ef2e7485ep-691 | 0x1.eb328986081efp-775
523- v_result = -0x1.3fc4ef2e7485ep-691 | 0x0p+0
524+ v_result = -0x1.3fc4ef2e7485ep-691 | --
525 insn wflndb:
526 v_arg1 = -0x1.7146c5afdec16p+23 | -0x1.597fcfa1fab2p-708
527- v_result = -0x1.7146c5afdec16p+23 | 0x0p+0
528+ v_result = -0x1.7146c5afdec16p+23 | --
529 insn wflndb:
530 v_arg1 = 0x1.03f8d7e9afe84p-947 | 0x1.9a10c3feb6b57p-118
531- v_result = -0x1.03f8d7e9afe84p-947 | 0x0p+0
532+ v_result = -0x1.03f8d7e9afe84p-947 | --
533 insn vflpdb:
534 v_arg1 = 0x1.64ae59b6c762ep-407 | -0x1.fa7191ab21e86p+533
535 v_result = 0x1.64ae59b6c762ep-407 | 0x1.fa7191ab21e86p+533
536@@ -1092,16 +1092,16 @@
537 v_result = 0x1.85fa2de1d492ap+170 | 0x1.ac36828822c11p-968
538 insn wflpdb:
539 v_arg1 = 0x1.a6cf677640a73p-871 | 0x1.b6f1792385922p-278
540- v_result = 0x1.a6cf677640a73p-871 | 0x0p+0
541+ v_result = 0x1.a6cf677640a73p-871 | --
542 insn wflpdb:
543 v_arg1 = -0x1.b886774f6d888p-191 | -0x1.6a2b08d735d22p-643
544- v_result = 0x1.b886774f6d888p-191 | 0x0p+0
545+ v_result = 0x1.b886774f6d888p-191 | --
546 insn wflpdb:
547 v_arg1 = 0x1.5045d37d46f5fp+943 | -0x1.333a86ef2dcf6p-1013
548- v_result = 0x1.5045d37d46f5fp+943 | 0x0p+0
549+ v_result = 0x1.5045d37d46f5fp+943 | --
550 insn wflpdb:
551 v_arg1 = 0x1.1e7bec6ada14dp+252 | 0x1.a70b3f3e24dap-153
552- v_result = 0x1.1e7bec6ada14dp+252 | 0x0p+0
553+ v_result = 0x1.1e7bec6ada14dp+252 | --
554 insn vfadb:
555 v_arg1 = 0x1.5b1ad8e9f17c6p-294 | -0x1.ddd8300a0bf02p+122
556 v_arg2 = -0x1.9b49c31ca8ac6p+926 | 0x1.fdbc992926268p+677
557@@ -1121,19 +1121,19 @@
558 insn wfadb:
559 v_arg1 = 0x1.3c5466cb80722p+489 | -0x1.11e1770053ca2p+924
560 v_arg2 = 0x1.d876cd721a726p-946 | 0x1.5c04ceb79c9bcp+1001
561- v_result = 0x1.3c5466cb80722p+489 | 0x0p+0
562+ v_result = 0x1.3c5466cb80722p+489 | --
563 insn wfadb:
564 v_arg1 = 0x1.b0b142d6b76a3p+577 | 0x1.3146824e993a2p+432
565 v_arg2 = -0x1.f7f3b7582925fp-684 | -0x1.9700143c2b935p-837
566- v_result = 0x1.b0b142d6b76a2p+577 | 0x0p+0
567+ v_result = 0x1.b0b142d6b76a2p+577 | --
568 insn wfadb:
569 v_arg1 = -0x1.8d65e15edabd6p+244 | 0x1.3be7fd08492d6p-141
570 v_arg2 = -0x1.5eef86490fb0ap+481 | 0x1.7b26c897cb6dfp+810
571- v_result = -0x1.5eef86490fb0ap+481 | 0x0p+0
572+ v_result = -0x1.5eef86490fb0ap+481 | --
573 insn wfadb:
574 v_arg1 = -0x1.2dffa5b5f29p+34 | 0x1.71a026274602fp-881
575 v_arg2 = 0x1.4dad707287289p+756 | -0x1.1500d55807247p-616
576- v_result = 0x1.4dad707287288p+756 | 0x0p+0
577+ v_result = 0x1.4dad707287288p+756 | --
578 insn vfsdb:
579 v_arg1 = 0x1.054fd9c4d4883p+644 | 0x1.45c90ed85bd7fp-780
580 v_arg2 = 0x1.f3bc7a611dadap+494 | -0x1.7c9e1e858ba5bp-301
581@@ -1153,19 +1153,19 @@
582 insn wfsdb:
583 v_arg1 = 0x1.9090dabf846e7p-648 | 0x1.1c4ab843a2d15p+329
584 v_arg2 = -0x1.a7ceb293690dep+316 | 0x1.22245954a20cp+42
585- v_result = 0x1.a7ceb293690dep+316 | 0x0p+0
586+ v_result = 0x1.a7ceb293690dep+316 | --
587 insn wfsdb:
588 v_arg1 = 0x1.4e5347c27819p-933 | -0x1.56a30bda28351p-64
589 v_arg2 = -0x1.dedb9f3935b56p-155 | 0x1.8c5b6ed76816cp-522
590- v_result = 0x1.dedb9f3935b56p-155 | 0x0p+0
591+ v_result = 0x1.dedb9f3935b56p-155 | --
592 insn wfsdb:
593 v_arg1 = 0x1.0ec4e562a015bp-491 | 0x1.3996381b52d9fp-686
594 v_arg2 = 0x1.1dcce4e81819p+960 | -0x1.32fa425e8fc08p-263
595- v_result = -0x1.1dcce4e81818fp+960 | 0x0p+0
596+ v_result = -0x1.1dcce4e81818fp+960 | --
597 insn wfsdb:
598 v_arg1 = -0x1.587229f90f77dp-19 | 0x1.100d8eb8105e4p-784
599 v_arg2 = -0x1.afb4cce4c43ddp+530 | -0x1.6da7f05e7f512p-869
600- v_result = 0x1.afb4cce4c43dcp+530 | 0x0p+0
601+ v_result = 0x1.afb4cce4c43dcp+530 | --
602 insn vfmdb:
603 v_arg1 = 0x1.892b425556c47p-124 | 0x1.38222404079dfp-656
604 v_arg2 = 0x1.af612ed2c342dp-267 | -0x1.1f735fd6ce768p-877
605@@ -1185,19 +1185,19 @@
606 insn wfmdb:
607 v_arg1 = -0x1.b992d950126a1p-683 | -0x1.9c1b22eb58c59p-497
608 v_arg2 = 0x1.b557a7d8e32c3p-25 | -0x1.f746b2ddafccep+227
609- v_result = -0x1.792f6fb13894ap-707 | 0x0p+0
610+ v_result = -0x1.792f6fb13894ap-707 | --
611 insn wfmdb:
612 v_arg1 = -0x1.677a8c20a5a2fp+876 | 0x1.c03e7b97e8c0dp-645
613 v_arg2 = 0x1.dab44be430937p-1011 | -0x1.3f51352c67be9p-916
614- v_result = -0x1.4d4b0a1827064p-134 | 0x0p+0
615+ v_result = -0x1.4d4b0a1827064p-134 | --
616 insn wfmdb:
617 v_arg1 = -0x1.da60f596ad0cep+254 | 0x1.52332e0650e33p+966
618 v_arg2 = 0x1.a042c52ed993cp+215 | 0x1.8f380c84aa133p+204
619- v_result = -0x1.81aca4bbcbd24p+470 | 0x0p+0
620+ v_result = -0x1.81aca4bbcbd24p+470 | --
621 insn wfmdb:
622 v_arg1 = -0x1.83d17f11f6aa3p-469 | -0x1.98117efe89b9ep-361
623 v_arg2 = 0x1.8c445fd46d214p-701 | -0x1.f98118821821cp+596
624- v_result = -0x0p+0 | 0x0p+0
625+ v_result = -0x0p+0 | --
626 insn vfddb:
627 v_arg1 = -0x1.ecbb48899e0f1p+969 | 0x1.caf175ab352p-20
628 v_arg2 = -0x1.9455d67f9f79dp+208 | 0x1.bc4a431b04a6fp+482
629@@ -1217,19 +1217,19 @@
630 insn wfddb:
631 v_arg1 = 0x1.bd48489b60731p-114 | 0x1.a760dcf57b74fp-51
632 v_arg2 = -0x1.171f83409eeb6p-402 | -0x1.e159d1409bdc6p-972
633- v_result = -0x1.9864f1511f8cp+288 | 0x0p+0
634+ v_result = -0x1.9864f1511f8cp+288 | --
635 insn wfddb:
636 v_arg1 = -0x1.120505ef4606p-637 | -0x1.83f6f775c0eb7p+272
637 v_arg2 = -0x1.d18ba3872fde1p+298 | 0x1.c60f8d191068cp-454
638- v_result = 0x1.2d5cdb15a686cp-936 | 0x0p+0
639+ v_result = 0x1.2d5cdb15a686cp-936 | --
640 insn wfddb:
641 v_arg1 = 0x1.f637f7f8c790fp-97 | -0x1.7bdce4d74947p+189
642 v_arg2 = -0x1.1c8f2d1b3a2edp-218 | -0x1.55fdfd1840241p-350
643- v_result = -0x1.c3d0799c1420fp+121 | 0x0p+0
644+ v_result = -0x1.c3d0799c1420fp+121 | --
645 insn wfddb:
646 v_arg1 = -0x1.c63b7b2eee253p+250 | 0x1.dfd9dcd8b823fp-125
647 v_arg2 = 0x1.094a1f1f87e0cp+629 | 0x1.eeaa23c0d7843p-814
648- v_result = -0x1.b653a10ebdeccp-379 | 0x0p+0
649+ v_result = -0x1.b653a10ebdeccp-379 | --
650 insn vfsqdb:
651 v_arg1 = 0x1.f60db25f7066p-703 | -0x1.d43509abca8c3p+631
652 v_result = 0x1.fb009ab25ec11p-352 | nan
653@@ -1244,16 +1244,16 @@
654 v_result = 0x1.833dba0954bccp+249 | nan
655 insn wfsqdb:
656 v_arg1 = 0x1.71af4e7f64978p+481 | -0x1.3429dc60011d7p-879
657- v_result = 0x1.b30fc65551133p+240 | 0x0p+0
658+ v_result = 0x1.b30fc65551133p+240 | --
659 insn wfsqdb:
660 v_arg1 = 0x1.5410db1c5f403p+173 | 0x1.97fa6581e692fp+108
661- v_result = 0x1.a144f43a592c1p+86 | 0x0p+0
662+ v_result = 0x1.a144f43a592c1p+86 | --
663 insn wfsqdb:
664 v_arg1 = -0x1.5838027725afep+6 | 0x1.ac61529c11f38p+565
665- v_result = nan | 0x0p+0
666+ v_result = nan | --
667 insn wfsqdb:
668 v_arg1 = -0x1.159e341dcc06ep-439 | 0x1.ed54ce5481ba5p-574
669- v_result = nan | 0x0p+0
670+ v_result = nan | --
671 insn vfmadb:
672 v_arg1 = -0x1.eb00a5c503d75p+538 | 0x1.89fae603ddc07p+767
673 v_arg2 = -0x1.71c72712c3957p+715 | 0x1.1bd5773442feap+762
674@@ -1278,22 +1278,22 @@
675 v_arg1 = 0x1.1cc5b10a14d54p+668 | -0x1.686407390f7d1p+616
676 v_arg2 = -0x1.bf34549e73246p+676 | -0x1.dc5a34cc470f3p+595
677 v_arg3 = -0x1.95e0fdcf13974p-811 | -0x1.79c7cc1a8ec83p-558
678- v_result = -0x1.fffffffffffffp+1023 | 0x0p+0
679+ v_result = -0x1.fffffffffffffp+1023 | --
680 insn wfmadb:
681 v_arg1 = 0x1.138bc1a5d75f8p+713 | -0x1.e226ebba2fe54p+381
682 v_arg2 = -0x1.081ebb7cc3414p-772 | 0x1.369d99e174fc3p+922
683 v_arg3 = -0x1.0671c682a5d0cp-1016 | 0x1.03c9530dd0377p+378
684- v_result = -0x1.1c4933e117d95p-59 | 0x0p+0
685+ v_result = -0x1.1c4933e117d95p-59 | --
686 insn wfmadb:
687 v_arg1 = -0x1.166f0b1fad67bp+64 | -0x1.e9ee8d32e1069p-452
688 v_arg2 = -0x1.4a235bdd109e2p-65 | 0x1.bacaa96fc7e81p-403
689 v_arg3 = -0x1.d2e19acf7c4bdp+99 | 0x1.f901130f685adp-963
690- v_result = -0x1.d2e19acf7c4bcp+99 | 0x0p+0
691+ v_result = -0x1.d2e19acf7c4bcp+99 | --
692 insn wfmadb:
693 v_arg1 = -0x1.77d7bfec863d2p-988 | -0x1.b68029700c6b1p-206
694 v_arg2 = -0x1.aca05ad00aec1p+737 | 0x1.ac746bd7e216bp+51
695 v_arg3 = 0x1.17342292078b4p+188 | -0x1.49efaf9392301p+555
696- v_result = 0x1.17342292078b4p+188 | 0x0p+0
697+ v_result = 0x1.17342292078b4p+188 | --
698 insn vfmsdb:
699 v_arg1 = -0x1.a1b218e84e61p+34 | 0x1.b220f0d144daep-111
700 v_arg2 = 0x1.564fcc2527961p-265 | 0x1.ea85a4154721ep+733
701@@ -1318,22 +1318,22 @@
702 v_arg1 = -0x1.7499a639673a6p-100 | -0x1.2a0d737e6cb1cp-207
703 v_arg2 = -0x1.01ad4670a7aa3p-911 | 0x1.f94385e1021e8p+317
704 v_arg3 = 0x1.aa42b2bb17af9p+982 | 0x1.c550e471711p+786
705- v_result = -0x1.aa42b2bb17af8p+982 | 0x0p+0
706+ v_result = -0x1.aa42b2bb17af8p+982 | --
707 insn wfmsdb:
708 v_arg1 = 0x1.76840f99b431ep+500 | -0x1.989a500c92c08p+594
709 v_arg2 = 0x1.33c657cb8385cp-84 | -0x1.2c795ad92ce17p+807
710 v_arg3 = -0x1.ee58a39f02d54p-351 | -0x1.18695ed9a280ap+48
711- v_result = 0x1.c242894a0068p+416 | 0x0p+0
712+ v_result = 0x1.c242894a0068p+416 | --
713 insn wfmsdb:
714 v_arg1 = -0x1.16db07e054a65p-469 | -0x1.3a627ab99c6e4p+689
715 v_arg2 = 0x1.17872eae826e5p-538 | 0x1.44ed513fb5873p-929
716 v_arg3 = 0x1.5ca912008e077p-217 | -0x1.982a6f7359876p-23
717- v_result = -0x1.5ca912008e077p-217 | 0x0p+0
718+ v_result = -0x1.5ca912008e077p-217 | --
719 insn wfmsdb:
720 v_arg1 = -0x1.d315f4a932c6p+122 | 0x1.616a04493e143p+513
721 v_arg2 = -0x1.cf1cd3516f23fp+552 | 0x1.7121749c3932cp-750
722 v_arg3 = 0x1.dc26d92304d7fp-192 | -0x1.1fc3cca9ec20ep+371
723- v_result = 0x1.a67ca6ba395bcp+675 | 0x0p+0
724+ v_result = 0x1.a67ca6ba395bcp+675 | --
725 insn wfcdb:
726 v_arg1 = 0x1.302001b736011p-633 | -0x1.72d5300225c97p-468
727 v_arg2 = -0x1.8c007c5aba108p-17 | -0x1.bb3f9ae136acdp+569
728@@ -1383,19 +1383,19 @@
729 v_arg1 = 0x1.d8e5c9930c19dp+623 | -0x1.cf1facff4e194p-605
730 v_arg2 = -0x1.ed6ba02646d0dp+441 | -0x1.2d677e710620bp+810
731 insn wfcedb:
732- v_result = 0000000000000000 | 0000000000000000
733+ v_result = 0000000000000000 | --
734 v_arg1 = -0x1.a252009e1a12cp-442 | 0x1.4dc608268bb29p-513
735 v_arg2 = -0x1.81020aa1a36e6p-687 | -0x1.300e64ce414f1p-899
736 insn wfcedb:
737- v_result = 0000000000000000 | 0000000000000000
738+ v_result = 0000000000000000 | --
739 v_arg1 = 0x1.cec439a8d4781p-175 | -0x1.d20e3b281d599p+893
740 v_arg2 = 0x1.ca17cf16cf0aap-879 | 0x1.61506f8596092p+545
741 insn wfcedb:
742- v_result = 0000000000000000 | 0000000000000000
743+ v_result = 0000000000000000 | --
744 v_arg1 = 0x1.0659f5f24a004p+877 | 0x1.fc46867ed0338p-680
745 v_arg2 = -0x1.1d6849587155ep-1010 | -0x1.f68171edc235fp+575
746 insn wfcedb:
747- v_result = 0000000000000000 | 0000000000000000
748+ v_result = 0000000000000000 | --
749 v_arg1 = 0x1.dc88a0d46ad79p-816 | 0x1.245140dcaed79p+851
750 v_arg2 = 0x1.b33e977c7b3ep-818 | -0x1.04319d7c69367p+787
751 insn vfcedbs:
752@@ -1419,22 +1419,22 @@
753 v_arg2 = 0x1.ae2c06ea88ff4p+332 | -0x1.f668ce4f8ef9ap+821
754 r_result = 0000000000000003
755 insn wfcedbs:
756- v_result = 0000000000000000 | 0000000000000000
757+ v_result = 0000000000000000 | --
758 v_arg1 = 0x1.645261bf86b1fp-996 | 0x1.abd13c95397aap+992
759 v_arg2 = -0x1.ba09e8fc66a8cp+113 | 0x1.75dbfe92c16c4p-786
760 r_result = 0000000000000003
761 insn wfcedbs:
762- v_result = 0000000000000000 | 0000000000000000
763+ v_result = 0000000000000000 | --
764 v_arg1 = -0x1.d02831d003e7dp+415 | -0x1.611a9dfd10f36p-80
765 v_arg2 = -0x1.10bda62f4647p+723 | 0x1.cc47af6653378p-614
766 r_result = 0000000000000003
767 insn wfcedbs:
768- v_result = 0000000000000000 | 0000000000000000
769+ v_result = 0000000000000000 | --
770 v_arg1 = 0x1.f168f32f84178p-321 | -0x1.79a2a0b9549d1p-136
771 v_arg2 = 0x1.41e19d1cfa692p+11 | -0x1.2a0ed6e7fd517p-453
772 r_result = 0000000000000003
773 insn wfcedbs:
774- v_result = 0000000000000000 | 0000000000000000
775+ v_result = 0000000000000000 | --
776 v_arg1 = -0x1.76a9144ee26c5p+188 | -0x1.386aaea2d9cddp-542
777 v_arg2 = 0x1.810fcf222efc4p-999 | -0x1.ce90a9a43e2a1p+80
778 r_result = 0000000000000003
779@@ -1455,19 +1455,19 @@
780 v_arg1 = 0x1.82be31fb88a2dp+946 | -0x1.7ca9e9ff31953p-931
781 v_arg2 = 0x1.fe75a1052beccp+490 | 0x1.179d18543d678p-255
782 insn wfchdb:
783- v_result = ffffffffffffffff | 0000000000000000
784+ v_result = ffffffffffffffff | --
785 v_arg1 = 0x1.0af85d8d8d609p-464 | -0x1.9f639a686e0fep+203
786 v_arg2 = -0x1.3142b77b55761p-673 | 0x1.ca9c474339da1p+472
787 insn wfchdb:
788- v_result = ffffffffffffffff | 0000000000000000
789+ v_result = ffffffffffffffff | --
790 v_arg1 = -0x1.6cf16959a022bp+213 | 0x1.445606e4363e1p+942
791 v_arg2 = -0x1.8c343201bbd2p+939 | -0x1.e5095ad0c37a4p-434
792 insn wfchdb:
793- v_result = ffffffffffffffff | 0000000000000000
794+ v_result = ffffffffffffffff | --
795 v_arg1 = 0x1.36b4fc9cf5bdap-52 | -0x1.f1fd95cbcd533p+540
796 v_arg2 = 0x1.5a2362891c9edp-175 | -0x1.e1f68c319e5d2p+58
797 insn wfchdb:
798- v_result = ffffffffffffffff | 0000000000000000
799+ v_result = ffffffffffffffff | --
800 v_arg1 = 0x1.11c6489f544bbp+811 | 0x1.262a740ec3d47p+456
801 v_arg2 = -0x1.d9394d354e989p-154 | 0x1.cc21b3094391ap-972
802 insn vfchdbs:
803@@ -1491,22 +1491,22 @@
804 v_arg2 = 0x1.e426748435a76p+370 | 0x1.8702527d17783p-871
805 r_result = 0000000000000003
806 insn wfchdbs:
807- v_result = ffffffffffffffff | 0000000000000000
808+ v_result = ffffffffffffffff | --
809 v_arg1 = 0x1.6c51b9f6442c8p+639 | 0x1.1e6b37adff703p+702
810 v_arg2 = 0x1.0cba9c1c75e43p+520 | -0x1.145d44ed90967p+346
811 r_result = 0000000000000000
812 insn wfchdbs:
813- v_result = ffffffffffffffff | 0000000000000000
814+ v_result = ffffffffffffffff | --
815 v_arg1 = 0x1.7b3dd643bf36bp+816 | -0x1.61ce7bfb9307ap-683
816 v_arg2 = -0x1.f2c998dc15c9ap-776 | 0x1.e16397f2dcdf5p+571
817 r_result = 0000000000000000
818 insn wfchdbs:
819- v_result = ffffffffffffffff | 0000000000000000
820+ v_result = ffffffffffffffff | --
821 v_arg1 = 0x1.cc3be81884e0ap-865 | -0x1.8b353bd41064p+820
822 v_arg2 = -0x1.2c1bafaafdd4ep-34 | -0x1.24666808ab16ep-435
823 r_result = 0000000000000000
824 insn wfchdbs:
825- v_result = ffffffffffffffff | 0000000000000000
826+ v_result = ffffffffffffffff | --
827 v_arg1 = 0x1.c3de33d3b673ap+554 | 0x1.d39ed71e53096p-798
828 v_arg2 = -0x1.c1e8f7b3c001p-828 | 0x1.22e2cf797fabp-787
829 r_result = 0000000000000000
830@@ -1527,19 +1527,19 @@
831 v_arg1 = -0x1.6c5599e7ba923p+829 | -0x1.5d1a1191ed6eap-994
832 v_arg2 = -0x1.555c8775bc4d2p-478 | -0x1.4aa6a2c82319cp+493
833 insn wfchedb:
834- v_result = ffffffffffffffff | 0000000000000000
835+ v_result = ffffffffffffffff | --
836 v_arg1 = 0x1.ae6cad07b0f3ep-232 | -0x1.2ed61a43f3b99p-74
837 v_arg2 = -0x1.226f7cddbde13p-902 | -0x1.790d1d6febbf8p+336
838 insn wfchedb:
839- v_result = ffffffffffffffff | 0000000000000000
840+ v_result = ffffffffffffffff | --
841 v_arg1 = 0x1.20eb8eac3711dp-385 | 0x1.ef71d3312d7e1p+739
842 v_arg2 = 0x1.7a3ba08c5a0bdp-823 | -0x1.a7845ccaa544dp-129
843 insn wfchedb:
844- v_result = 0000000000000000 | 0000000000000000
845+ v_result = 0000000000000000 | --
846 v_arg1 = -0x1.97ebdbc057be8p+824 | 0x1.2b7798b063cd6p+237
847 v_arg2 = 0x1.cdb87a6074294p-81 | -0x1.074c902b19bccp-416
848 insn wfchedb:
849- v_result = 0000000000000000 | 0000000000000000
850+ v_result = 0000000000000000 | --
851 v_arg1 = -0x1.82deebf9ff023p+937 | 0x1.56c5adcf9d4abp-672
852 v_arg2 = -0x1.311ce49bc9439p+561 | 0x1.c8e1c512d8544p+103
853 insn vfchedbs:
854@@ -1563,22 +1563,22 @@
855 v_arg2 = -0x1.47f5dfc7a5bcp-569 | 0x1.5877ef33664a3p-758
856 r_result = 0000000000000003
857 insn wfchedbs:
858- v_result = 0000000000000000 | 0000000000000000
859+ v_result = 0000000000000000 | --
860 v_arg1 = -0x1.a7370ccfd9e49p+505 | 0x1.c6b2385850ca2p-591
861 v_arg2 = 0x1.984f4fcd338b1p+675 | -0x1.feb996c821232p-39
862 r_result = 0000000000000003
863 insn wfchedbs:
864- v_result = ffffffffffffffff | 0000000000000000
865+ v_result = ffffffffffffffff | --
866 v_arg1 = 0x1.641878612dd2p+207 | 0x1.b35e3292db7f6p+567
867 v_arg2 = -0x1.18a87f209e96bp+299 | -0x1.3d598f3612d8ap+1016
868 r_result = 0000000000000000
869 insn wfchedbs:
870- v_result = ffffffffffffffff | 0000000000000000
871+ v_result = ffffffffffffffff | --
872 v_arg1 = 0x1.cfc2cda244153p+404 | 0x1.d8b2b28e9d8d7p+276
873 v_arg2 = 0x1.3517b8c7a59a1p-828 | 0x1.6096fab7003ccp-415
874 r_result = 0000000000000000
875 insn wfchedbs:
876- v_result = 0000000000000000 | 0000000000000000
877+ v_result = 0000000000000000 | --
878 v_arg1 = -0x1.54d656f033e56p-603 | -0x1.95ad0e2088967p+254
879 v_arg2 = 0x1.4cb319db206e4p-614 | 0x1.b41cd9e3739b6p-862
880 r_result = 0000000000000003
881--- a/none/tests/s390x/vector.h
882+++ b/none/tests/s390x/vector.h
883@@ -86,6 +86,13 @@
884 printf("%016lx | %016lx\n", value.u64[0], value.u64[1]);
885 }
886
887+void print_hex64(const V128 value, int zero_only) {
888+ if (zero_only)
889+ printf("%016lx | --\n", value.u64[0]);
890+ else
891+ printf("%016lx | %016lx\n", value.u64[0], value.u64[1]);
892+}
893+
894 void print_f32(const V128 value, int even_only, int zero_only) {
895 if (zero_only)
896 printf("%a | -- | -- | --\n", value.f32[0]);
897@@ -222,8 +229,10 @@
898 {printf(" v_arg2 = "); print_hex(v_arg2);} \
899 if (info & V128_V_ARG3_AS_INT) \
900 {printf(" v_arg3 = "); print_hex(v_arg3);} \
901- if (info & V128_V_RES_AS_INT) \
902- {printf(" v_result = "); print_hex(v_result);} \
903+ if (info & V128_V_RES_AS_INT) { \
904+ printf(" v_result = "); \
905+ print_hex64(v_result, info & V128_V_RES_ZERO_ONLY); \
906+ } \
907 \
908 if (info & V128_V_ARG1_AS_FLOAT64) \
909 {printf(" v_arg1 = "); print_f64(v_arg1, 0);} \
910--- a/VEX/priv/guest_s390_defs.h
911+++ b/VEX/priv/guest_s390_defs.h
912@@ -8,7 +8,7 @@
913 This file is part of Valgrind, a dynamic binary instrumentation
914 framework.
915
916- Copyright IBM Corp. 2010-2017
917+ Copyright IBM Corp. 2010-2020
918
919 This program is free software; you can redistribute it and/or
920 modify it under the terms of the GNU General Public License as
921@@ -263,26 +263,27 @@
922 before S390_VEC_OP_LAST. */
923 typedef enum {
924 S390_VEC_OP_INVALID = 0,
925- S390_VEC_OP_VPKS = 1,
926- S390_VEC_OP_VPKLS = 2,
927- S390_VEC_OP_VFAE = 3,
928- S390_VEC_OP_VFEE = 4,
929- S390_VEC_OP_VFENE = 5,
930- S390_VEC_OP_VISTR = 6,
931- S390_VEC_OP_VSTRC = 7,
932- S390_VEC_OP_VCEQ = 8,
933- S390_VEC_OP_VTM = 9,
934- S390_VEC_OP_VGFM = 10,
935- S390_VEC_OP_VGFMA = 11,
936- S390_VEC_OP_VMAH = 12,
937- S390_VEC_OP_VMALH = 13,
938- S390_VEC_OP_VCH = 14,
939- S390_VEC_OP_VCHL = 15,
940- S390_VEC_OP_VFCE = 16,
941- S390_VEC_OP_VFCH = 17,
942- S390_VEC_OP_VFCHE = 18,
943- S390_VEC_OP_VFTCI = 19,
944- S390_VEC_OP_LAST = 20 // supposed to be the last element in enum
945+ S390_VEC_OP_VPKS,
946+ S390_VEC_OP_VPKLS,
947+ S390_VEC_OP_VFAE,
948+ S390_VEC_OP_VFEE,
949+ S390_VEC_OP_VFENE,
950+ S390_VEC_OP_VISTR,
951+ S390_VEC_OP_VSTRC,
952+ S390_VEC_OP_VCEQ,
953+ S390_VEC_OP_VTM,
954+ S390_VEC_OP_VGFM,
955+ S390_VEC_OP_VGFMA,
956+ S390_VEC_OP_VMAH,
957+ S390_VEC_OP_VMALH,
958+ S390_VEC_OP_VCH,
959+ S390_VEC_OP_VCHL,
960+ S390_VEC_OP_VFTCI,
961+ S390_VEC_OP_VFMIN,
962+ S390_VEC_OP_VFMAX,
963+ S390_VEC_OP_VBPERM,
964+ S390_VEC_OP_VMSL,
965+ S390_VEC_OP_LAST // supposed to be the last element in enum
966 } s390x_vec_op_t;
967
968 /* Arguments of s390x_dirtyhelper_vec_op(...) which are packed into one
969--- a/VEX/priv/guest_s390_helpers.c
970+++ b/VEX/priv/guest_s390_helpers.c
971@@ -8,7 +8,7 @@
972 This file is part of Valgrind, a dynamic binary instrumentation
973 framework.
974
975- Copyright IBM Corp. 2010-2017
976+ Copyright IBM Corp. 2010-2020
977
978 This program is free software; you can redistribute it and/or
979 modify it under the terms of the GNU General Public License as
980@@ -314,20 +314,11 @@
981 /*--- Dirty helper for Store Facility instruction ---*/
982 /*------------------------------------------------------------*/
983 #if defined(VGA_s390x)
984-static void
985-s390_set_facility_bit(ULong *addr, UInt bitno, UInt value)
986-{
987- addr += bitno / 64;
988- bitno = bitno % 64;
989-
990- ULong mask = 1;
991- mask <<= (63 - bitno);
992
993- if (value == 1) {
994- *addr |= mask; // set
995- } else {
996- *addr &= ~mask; // clear
997- }
998+static ULong
999+s390_stfle_range(UInt lo, UInt hi)
1000+{
1001+ return ((1UL << (hi + 1 - lo)) - 1) << (63 - (hi % 64));
1002 }
1003
1004 ULong
1005@@ -336,6 +327,77 @@
1006 ULong hoststfle[S390_NUM_FACILITY_DW], cc, num_dw, i;
1007 register ULong reg0 asm("0") = guest_state->guest_r0 & 0xF; /* r0[56:63] */
1008
1009+ /* Restrict to facilities that we know about and that we assume to be
1010+ compatible with Valgrind. Of course, in this way we may reject features
1011+ that Valgrind is not really involved in (and thus would be compatible
1012+ with), but querying for such features doesn't seem like a typical use
1013+ case. */
1014+ ULong accepted_facility[S390_NUM_FACILITY_DW] = {
1015+ /* === 0 .. 63 === */
1016+ (s390_stfle_range(0, 16)
1017+ /* 17: message-security-assist, not supported */
1018+ | s390_stfle_range(18, 19)
1019+ /* 20: HFP-multiply-and-add/subtract, not supported */
1020+ | s390_stfle_range(21, 22)
1021+ /* 23: HFP-unnormalized-extension, not supported */
1022+ | s390_stfle_range(24, 25)
1023+ /* 26: parsing-enhancement, not supported */
1024+ | s390_stfle_range(27, 28)
1025+ /* 29: unassigned */
1026+ | s390_stfle_range(30, 30)
1027+ /* 31: extract-CPU-time, not supported */
1028+ | s390_stfle_range(32, 41)
1029+ /* 42-43: DFP, not fully supported */
1030+ /* 44: PFPO, not fully supported */
1031+ | s390_stfle_range(45, 47)
1032+ /* 48: DFP zoned-conversion, not supported */
1033+ /* 49: includes PPA, not supported */
1034+ /* 50: constrained transactional-execution, not supported */
1035+ | s390_stfle_range(51, 55)
1036+ /* 56: unassigned */
1037+ /* 57: MSA5, not supported */
1038+ | s390_stfle_range(58, 60)
1039+ /* 61: miscellaneous-instruction 3, not supported */
1040+ | s390_stfle_range(62, 63)),
1041+
1042+ /* === 64 .. 127 === */
1043+ (s390_stfle_range(64, 72)
1044+ /* 73: transactional-execution, not supported */
1045+ | s390_stfle_range(74, 75)
1046+ /* 76: MSA3, not supported */
1047+ /* 77: MSA4, not supported */
1048+ | s390_stfle_range(78, 78)
1049+ /* 80: DFP packed-conversion, not supported */
1050+ /* 81: PPA-in-order, not supported */
1051+ | s390_stfle_range(82, 82)
1052+ /* 83-127: unassigned */ ),
1053+
1054+ /* === 128 .. 191 === */
1055+ (s390_stfle_range(128, 131)
1056+ /* 132: unassigned */
1057+ /* 133: guarded-storage, not supported */
1058+ /* 134: vector packed decimal, not supported */
1059+ | s390_stfle_range(135, 135)
1060+ /* 136: unassigned */
1061+ /* 137: unassigned */
1062+ | s390_stfle_range(138, 142)
1063+ /* 143: unassigned */
1064+ | s390_stfle_range(144, 145)
1065+ /* 146: MSA8, not supported */
1066+ | s390_stfle_range(147, 147)
1067+ /* 148: vector-enhancements 2, not supported */
1068+ | s390_stfle_range(149, 149)
1069+ /* 150: unassigned */
1070+ /* 151: DEFLATE-conversion, not supported */
1071+ /* 153: unassigned */
1072+ /* 154: unassigned */
1073+ /* 155: MSA9, not supported */
1074+ | s390_stfle_range(156, 156)
1075+ /* 157-167: unassigned */
1076+ | s390_stfle_range(168, 168)
1077+ /* 169-191: unassigned */ ),
1078+ };
1079+
1080 /* We cannot store more than S390_NUM_FACILITY_DW
1081 (and it makes not much sense to do so anyhow) */
1082 if (reg0 > S390_NUM_FACILITY_DW - 1)
1083@@ -351,35 +413,9 @@
1084 /* Update guest register 0 with what STFLE set r0 to */
1085 guest_state->guest_r0 = reg0;
1086
1087- /* Set default: VM facilities = host facilities */
1088+ /* VM facilities = host facilities, filtered by acceptance */
1089 for (i = 0; i < num_dw; ++i)
1090- addr[i] = hoststfle[i];
1091-
1092- /* Now adjust the VM facilities according to what the VM supports */
1093- s390_set_facility_bit(addr, S390_FAC_LDISP, 1);
1094- s390_set_facility_bit(addr, S390_FAC_EIMM, 1);
1095- s390_set_facility_bit(addr, S390_FAC_ETF2, 1);
1096- s390_set_facility_bit(addr, S390_FAC_ETF3, 1);
1097- s390_set_facility_bit(addr, S390_FAC_GIE, 1);
1098- s390_set_facility_bit(addr, S390_FAC_EXEXT, 1);
1099- s390_set_facility_bit(addr, S390_FAC_HIGHW, 1);
1100- s390_set_facility_bit(addr, S390_FAC_LSC2, 1);
1101-
1102- s390_set_facility_bit(addr, S390_FAC_HFPMAS, 0);
1103- s390_set_facility_bit(addr, S390_FAC_HFPUNX, 0);
1104- s390_set_facility_bit(addr, S390_FAC_XCPUT, 0);
1105- s390_set_facility_bit(addr, S390_FAC_MSA, 0);
1106- s390_set_facility_bit(addr, S390_FAC_PENH, 0);
1107- s390_set_facility_bit(addr, S390_FAC_DFP, 0);
1108- s390_set_facility_bit(addr, S390_FAC_PFPO, 0);
1109- s390_set_facility_bit(addr, S390_FAC_DFPZC, 0);
1110- s390_set_facility_bit(addr, S390_FAC_MISC, 0);
1111- s390_set_facility_bit(addr, S390_FAC_CTREXE, 0);
1112- s390_set_facility_bit(addr, S390_FAC_TREXE, 0);
1113- s390_set_facility_bit(addr, S390_FAC_MSA4, 0);
1114- s390_set_facility_bit(addr, S390_FAC_VXE, 0);
1115- s390_set_facility_bit(addr, S390_FAC_VXE2, 0);
1116- s390_set_facility_bit(addr, S390_FAC_DFLT, 0);
1117+ addr[i] = hoststfle[i] & accepted_facility[i];
1118
1119 return cc;
1120 }
1121@@ -2500,25 +2536,26 @@
1122 vassert(d->op > S390_VEC_OP_INVALID && d->op < S390_VEC_OP_LAST);
1123 static const UChar opcodes[][2] = {
1124 {0x00, 0x00}, /* invalid */
1125- {0xe7, 0x97}, /* VPKS */
1126- {0xe7, 0x95}, /* VPKLS */
1127- {0xe7, 0x82}, /* VFAE */
1128- {0xe7, 0x80}, /* VFEE */
1129- {0xe7, 0x81}, /* VFENE */
1130- {0xe7, 0x5c}, /* VISTR */
1131- {0xe7, 0x8a}, /* VSTRC */
1132- {0xe7, 0xf8}, /* VCEQ */
1133- {0xe7, 0xd8}, /* VTM */
1134- {0xe7, 0xb4}, /* VGFM */
1135- {0xe7, 0xbc}, /* VGFMA */
1136- {0xe7, 0xab}, /* VMAH */
1137- {0xe7, 0xa9}, /* VMALH */
1138- {0xe7, 0xfb}, /* VCH */
1139- {0xe7, 0xf9}, /* VCHL */
1140- {0xe7, 0xe8}, /* VFCE */
1141- {0xe7, 0xeb}, /* VFCH */
1142- {0xe7, 0xea}, /* VFCHE */
1143- {0xe7, 0x4a} /* VFTCI */
1144+ [S390_VEC_OP_VPKS] = {0xe7, 0x97},
1145+ [S390_VEC_OP_VPKLS] = {0xe7, 0x95},
1146+ [S390_VEC_OP_VFAE] = {0xe7, 0x82},
1147+ [S390_VEC_OP_VFEE] = {0xe7, 0x80},
1148+ [S390_VEC_OP_VFENE] = {0xe7, 0x81},
1149+ [S390_VEC_OP_VISTR] = {0xe7, 0x5c},
1150+ [S390_VEC_OP_VSTRC] = {0xe7, 0x8a},
1151+ [S390_VEC_OP_VCEQ] = {0xe7, 0xf8},
1152+ [S390_VEC_OP_VTM] = {0xe7, 0xd8},
1153+ [S390_VEC_OP_VGFM] = {0xe7, 0xb4},
1154+ [S390_VEC_OP_VGFMA] = {0xe7, 0xbc},
1155+ [S390_VEC_OP_VMAH] = {0xe7, 0xab},
1156+ [S390_VEC_OP_VMALH] = {0xe7, 0xa9},
1157+ [S390_VEC_OP_VCH] = {0xe7, 0xfb},
1158+ [S390_VEC_OP_VCHL] = {0xe7, 0xf9},
1159+ [S390_VEC_OP_VFTCI] = {0xe7, 0x4a},
1160+ [S390_VEC_OP_VFMIN] = {0xe7, 0xee},
1161+ [S390_VEC_OP_VFMAX] = {0xe7, 0xef},
1162+ [S390_VEC_OP_VBPERM]= {0xe7, 0x85},
1163+ [S390_VEC_OP_VMSL] = {0xe7, 0xb8},
1164 };
1165
1166 union {
1167@@ -2612,6 +2649,7 @@
1168 case S390_VEC_OP_VGFMA:
1169 case S390_VEC_OP_VMAH:
1170 case S390_VEC_OP_VMALH:
1171+ case S390_VEC_OP_VMSL:
1172 the_insn.VRRd.v1 = 1;
1173 the_insn.VRRd.v2 = 2;
1174 the_insn.VRRd.v3 = 3;
1175@@ -2621,9 +2659,9 @@
1176 the_insn.VRRd.m6 = d->m5;
1177 break;
1178
1179- case S390_VEC_OP_VFCE:
1180- case S390_VEC_OP_VFCH:
1181- case S390_VEC_OP_VFCHE:
1182+ case S390_VEC_OP_VFMIN:
1183+ case S390_VEC_OP_VFMAX:
1184+ case S390_VEC_OP_VBPERM:
1185 the_insn.VRRc.v1 = 1;
1186 the_insn.VRRc.v2 = 2;
1187 the_insn.VRRc.v3 = 3;
1188--- a/VEX/priv/guest_s390_toIR.c
1189+++ b/VEX/priv/guest_s390_toIR.c
1190@@ -8,7 +8,7 @@
1191 This file is part of Valgrind, a dynamic binary instrumentation
1192 framework.
1193
1194- Copyright IBM Corp. 2010-2017
1195+ Copyright IBM Corp. 2010-2020
1196
1197 This program is free software; you can redistribute it and/or
1198 modify it under the terms of the GNU General Public License as
1199@@ -248,6 +248,13 @@
1200 #define VRS_d2(insn) (((insn) >> 32) & 0xfff)
1201 #define VRS_m4(insn) (((insn) >> 28) & 0xf)
1202 #define VRS_rxb(insn) (((insn) >> 24) & 0xf)
1203+#define VRSd_v1(insn) (((insn) >> 28) & 0xf)
1204+#define VRSd_r3(insn) (((insn) >> 48) & 0xf)
1205+#define VSI_i3(insn) (((insn) >> 48) & 0xff)
1206+#define VSI_b2(insn) (((insn) >> 44) & 0xf)
1207+#define VSI_d2(insn) (((insn) >> 32) & 0xfff)
1208+#define VSI_v1(insn) (((insn) >> 28) & 0xf)
1209+#define VSI_rxb(insn) (((insn) >> 24) & 0xf)
1210
1211
1212 /*------------------------------------------------------------*/
1213@@ -1937,6 +1944,26 @@
1214 return results[m];
1215 }
1216
1217+/* Determine IRType from instruction's floating-point format field */
1218+static IRType
1219+s390_vr_get_ftype(const UChar m)
1220+{
1221+ static const IRType results[] = {Ity_F32, Ity_F64, Ity_F128};
1222+ if (m >= 2 && m <= 4)
1223+ return results[m - 2];
1224+ return Ity_INVALID;
1225+}
1226+
1227+/* Determine number of elements from instruction's floating-point format
1228+ field */
1229+static UChar
1230+s390_vr_get_n_elem(const UChar m)
1231+{
1232+ if (m >= 2 && m <= 4)
1233+ return 1 << (4 - m);
1234+ return 0;
1235+}
1236+
1237 /* Determine if Condition Code Set (CS) flag is set in m field */
1238 #define s390_vr_is_cs_set(m) (((m) & 0x1) != 0)
1239
1240@@ -2191,12 +2218,15 @@
1241 goto invalidIndex;
1242 }
1243 return vr_offset(archreg) + sizeof(ULong) * index;
1244+
1245 case Ity_V128:
1246+ case Ity_F128:
1247 if(index == 0) {
1248 return vr_qw_offset(archreg);
1249 } else {
1250 goto invalidIndex;
1251 }
1252+
1253 default:
1254 vpanic("s390_vr_offset_by_index: unknown type");
1255 }
1256@@ -2214,7 +2244,14 @@
1257 UInt offset = s390_vr_offset_by_index(archreg, type, index);
1258 vassert(typeOfIRExpr(irsb->tyenv, expr) == type);
1259
1260- stmt(IRStmt_Put(offset, expr));
1261+ if (type == Ity_F128) {
1262+ IRTemp val = newTemp(Ity_F128);
1263+ assign(val, expr);
1264+ stmt(IRStmt_Put(offset, unop(Iop_F128HItoF64, mkexpr(val))));
1265+ stmt(IRStmt_Put(offset + 8, unop(Iop_F128LOtoF64, mkexpr(val))));
1266+ } else {
1267+ stmt(IRStmt_Put(offset, expr));
1268+ }
1269 }
1270
1271 /* Read type sized part specified by index of a vr register. */
1272@@ -2222,6 +2259,11 @@
1273 get_vr(UInt archreg, IRType type, UChar index)
1274 {
1275 UInt offset = s390_vr_offset_by_index(archreg, type, index);
1276+ if (type == Ity_F128) {
1277+ return binop(Iop_F64HLtoF128,
1278+ IRExpr_Get(offset, Ity_F64),
1279+ IRExpr_Get(offset + 8, Ity_F64));
1280+ }
1281 return IRExpr_Get(offset, type);
1282 }
1283
1284@@ -2297,11 +2339,11 @@
1285 return mkexpr(output);
1286 }
1287
1288-/* Load bytes into v1.
1289- maxIndex specifies max index to load and must be Ity_I32.
1290- If maxIndex >= 15, all 16 bytes are loaded.
1291- All bytes after maxIndex are zeroed. */
1292-static void s390_vr_loadWithLength(UChar v1, IRTemp addr, IRExpr *maxIndex)
1293+/* Starting from addr, load at most maxIndex + 1 bytes into v1. Fill the
1294+ leftmost or rightmost bytes of v1, depending on whether `rightmost' is set.
1295+ If maxIndex >= 15, load all 16 bytes; otherwise clear the remaining bytes. */
1296+static void
1297+s390_vr_loadWithLength(UChar v1, IRTemp addr, IRExpr *maxIndex, Bool rightmost)
1298 {
1299 IRTemp maxIdx = newTemp(Ity_I32);
1300 IRTemp cappedMax = newTemp(Ity_I64);
1301@@ -2314,8 +2356,8 @@
1302 crossed if and only if the real insn would have crossed it as well.
1303 Thus, if the bytes to load are fully contained in an aligned 16-byte
1304 chunk, load the whole 16-byte aligned chunk, and otherwise load 16 bytes
1305- from the unaligned address. Then shift the loaded data left-aligned
1306- into the target vector register. */
1307+ from the unaligned address. Then shift the loaded data left- or
1308+ right-aligned into the target vector register. */
1309
1310 assign(maxIdx, maxIndex);
1311 assign(cappedMax, mkite(binop(Iop_CmpLT32U, mkexpr(maxIdx), mkU32(15)),
1312@@ -2328,20 +2370,60 @@
1313 assign(back, mkite(binop(Iop_CmpLE64U, mkexpr(offset), mkexpr(zeroed)),
1314 mkexpr(offset), mkU64(0)));
1315
1316- /* How much to shift the loaded 16-byte vector to the right, and then to
1317- the left. Since both 'zeroed' and 'back' range from 0 to 15, the shift
1318- amounts range from 0 to 120. */
1319- IRExpr *shrAmount = binop(Iop_Shl64,
1320- binop(Iop_Sub64, mkexpr(zeroed), mkexpr(back)),
1321- mkU8(3));
1322- IRExpr *shlAmount = binop(Iop_Shl64, mkexpr(zeroed), mkU8(3));
1323-
1324- put_vr_qw(v1, binop(Iop_ShlV128,
1325- binop(Iop_ShrV128,
1326- load(Ity_V128,
1327- binop(Iop_Sub64, mkexpr(addr), mkexpr(back))),
1328- unop(Iop_64to8, shrAmount)),
1329- unop(Iop_64to8, shlAmount)));
1330+ IRExpr* chunk = load(Ity_V128, binop(Iop_Sub64, mkexpr(addr), mkexpr(back)));
1331+
1332+ /* Shift the loaded 16-byte vector to the right, then to the left, or vice
1333+ versa, where each shift amount ranges from 0 to 120. */
1334+ IRExpr* shift1;
1335+ IRExpr* shift2 = unop(Iop_64to8, binop(Iop_Shl64, mkexpr(zeroed), mkU8(3)));
1336+
1337+ if (rightmost) {
1338+ shift1 = unop(Iop_64to8, binop(Iop_Shl64, mkexpr(back), mkU8(3)));
1339+ put_vr_qw(v1, binop(Iop_ShrV128,
1340+ binop(Iop_ShlV128, chunk, shift1),
1341+ shift2));
1342+ } else {
1343+ shift1 = unop(Iop_64to8,
1344+ binop(Iop_Shl64,
1345+ binop(Iop_Sub64, mkexpr(zeroed), mkexpr(back)),
1346+ mkU8(3)));
1347+ put_vr_qw(v1, binop(Iop_ShlV128,
1348+ binop(Iop_ShrV128, chunk, shift1),
1349+ shift2));
1350+ }
1351+}
1352+
1353+/* Store at most maxIndex + 1 bytes from v1 to addr. Store the leftmost or
1354+ rightmost bytes of v1, depending on whether `rightmost' is set. If maxIndex
1355+ >= 15, store all 16 bytes. */
1356+static void
1357+s390_vr_storeWithLength(UChar v1, IRTemp addr, IRExpr *maxIndex, Bool rightmost)
1358+{
1359+ IRTemp maxIdx = newTemp(Ity_I32);
1360+ IRTemp cappedMax = newTemp(Ity_I64);
1361+ IRTemp counter = newTemp(Ity_I64);
1362+ IRExpr* offset;
1363+
1364+ assign(maxIdx, maxIndex);
1365+ assign(cappedMax, mkite(binop(Iop_CmpLT32U, mkexpr(maxIdx), mkU32(15)),
1366+ unop(Iop_32Uto64, mkexpr(maxIdx)), mkU64(15)));
1367+
1368+ assign(counter, get_counter_dw0());
1369+
1370+ if (rightmost)
1371+ offset = binop(Iop_Add64,
1372+ binop(Iop_Sub64, mkU64(15), mkexpr(cappedMax)),
1373+ mkexpr(counter));
1374+ else
1375+ offset = mkexpr(counter);
1376+
1377+ store(binop(Iop_Add64, mkexpr(addr), mkexpr(counter)),
1378+ binop(Iop_GetElem8x16, get_vr_qw(v1), unop(Iop_64to8, offset)));
1379+
1380+ /* Check for end of field */
1381+ put_counter_dw0(binop(Iop_Add64, mkexpr(counter), mkU64(1)));
1382+ iterate_if(binop(Iop_CmpNE64, mkexpr(counter), mkexpr(cappedMax)));
1383+ put_counter_dw0(mkU64(0));
1384 }
1385
1386 /* Bitwise vCond ? v1 : v2
1387@@ -3752,6 +3834,28 @@
1388 s390_disasm(ENC5(MNM, GPR, UDXB, VR, UINT), mnm, r1, d2, 0, b2, v3, m4);
1389 }
1390
1391+static void
1392+s390_format_VRS_RRDV(const HChar *(*irgen)(UChar v1, UChar r3, IRTemp op2addr),
1393+ UChar v1, UChar r3, UChar b2, UShort d2, UChar rxb)
1394+{
1395+ const HChar *mnm;
1396+ IRTemp op2addr = newTemp(Ity_I64);
1397+
1398+ if (! s390_host_has_vx) {
1399+ emulation_failure(EmFail_S390X_vx);
1400+ return;
1401+ }
1402+
1403+ assign(op2addr, binop(Iop_Add64, mkU64(d2), b2 != 0 ? get_gpr_dw0(b2) :
1404+ mkU64(0)));
1405+
1406+ v1 = s390_vr_getVRindex(v1, 4, rxb);
1407+ mnm = irgen(v1, r3, op2addr);
1408+
1409+ if (UNLIKELY(vex_traceflags & VEX_TRACE_FE))
1410+ s390_disasm(ENC4(MNM, VR, GPR, UDXB), mnm, v1, r3, d2, 0, b2);
1411+}
1412+
1413
1414 static void
1415 s390_format_VRS_VRDVM(const HChar *(*irgen)(UChar v1, IRTemp op2addr, UChar v3,
1416@@ -4084,6 +4188,29 @@
1417 mnm, v1, v2, v3, m4, m5, m6);
1418 }
1419
1420+static void
1421+s390_format_VSI_URDV(const HChar *(*irgen)(UChar v1, IRTemp op2addr, UChar i3),
1422+ UChar v1, UChar b2, UChar d2, UChar i3, UChar rxb)
1423+{
1424+ const HChar *mnm;
1425+ IRTemp op2addr = newTemp(Ity_I64);
1426+
1427+ if (!s390_host_has_vx) {
1428+ emulation_failure(EmFail_S390X_vx);
1429+ return;
1430+ }
1431+
1432+ v1 = s390_vr_getVRindex(v1, 4, rxb);
1433+
1434+ assign(op2addr, binop(Iop_Add64, mkU64(d2), b2 != 0 ? get_gpr_dw0(b2) :
1435+ mkU64(0)));
1436+
1437+ mnm = irgen(v1, op2addr, i3);
1438+
1439+ if (vex_traceflags & VEX_TRACE_FE)
1440+ s390_disasm(ENC4(MNM, VR, UDXB, UINT), mnm, v1, d2, 0, b2, i3);
1441+}
1442+
1443 /*------------------------------------------------------------*/
1444 /*--- Build IR for opcodes ---*/
1445 /*------------------------------------------------------------*/
1446@@ -16183,7 +16310,9 @@
1447 static const HChar *
1448 s390_irgen_VLLEZ(UChar v1, IRTemp op2addr, UChar m3)
1449 {
1450- IRType type = s390_vr_get_type(m3);
1451+ s390_insn_assert("vllez", m3 <= 3 || m3 == 6);
1452+
1453+ IRType type = s390_vr_get_type(m3 & 3);
1454 IRExpr* op2 = load(type, mkexpr(op2addr));
1455 IRExpr* op2as64bit;
1456 switch (type) {
1457@@ -16203,7 +16332,13 @@
1458 vpanic("s390_irgen_VLLEZ: unknown type");
1459 }
1460
1461- put_vr_dw0(v1, op2as64bit);
1462+ if (m3 == 6) {
1463+ /* left-aligned */
1464+ put_vr_dw0(v1, binop(Iop_Shl64, op2as64bit, mkU8(32)));
1465+ } else {
1466+ /* right-aligned */
1467+ put_vr_dw0(v1, op2as64bit);
1468+ }
1469 put_vr_dw1(v1, mkU64(0));
1470 return "vllez";
1471 }
1472@@ -16612,7 +16747,7 @@
1473 s390_getCountToBlockBoundary(addr, m3),
1474 mkU32(1));
1475
1476- s390_vr_loadWithLength(v1, addr, maxIndex);
1477+ s390_vr_loadWithLength(v1, addr, maxIndex, False);
1478
1479 return "vlbb";
1480 }
1481@@ -16620,42 +16755,51 @@
1482 static const HChar *
1483 s390_irgen_VLL(UChar v1, IRTemp addr, UChar r3)
1484 {
1485- s390_vr_loadWithLength(v1, addr, get_gpr_w1(r3));
1486+ s390_vr_loadWithLength(v1, addr, get_gpr_w1(r3), False);
1487
1488 return "vll";
1489 }
1490
1491 static const HChar *
1492-s390_irgen_VSTL(UChar v1, IRTemp addr, UChar r3)
1493+s390_irgen_VLRL(UChar v1, IRTemp addr, UChar i3)
1494 {
1495- IRTemp counter = newTemp(Ity_I64);
1496- IRTemp maxIndexToStore = newTemp(Ity_I64);
1497- IRTemp gpr3 = newTemp(Ity_I64);
1498+ s390_insn_assert("vlrl", (i3 & 0xf0) == 0);
1499+ s390_vr_loadWithLength(v1, addr, mkU32((UInt) i3), True);
1500
1501- assign(gpr3, unop(Iop_32Uto64, get_gpr_w1(r3)));
1502- assign(maxIndexToStore, mkite(binop(Iop_CmpLE64U,
1503- mkexpr(gpr3),
1504- mkU64(16)
1505- ),
1506- mkexpr(gpr3),
1507- mkU64(16)
1508- )
1509- );
1510-
1511- assign(counter, get_counter_dw0());
1512+ return "vlrl";
1513+}
1514
1515- store(binop(Iop_Add64, mkexpr(addr), mkexpr(counter)),
1516- binop(Iop_GetElem8x16, get_vr_qw(v1), unop(Iop_64to8, mkexpr(counter))));
1517+static const HChar *
1518+s390_irgen_VLRLR(UChar v1, UChar r3, IRTemp addr)
1519+{
1520+ s390_vr_loadWithLength(v1, addr, get_gpr_w1(r3), True);
1521
1522- /* Check for end of field */
1523- put_counter_dw0(binop(Iop_Add64, mkexpr(counter), mkU64(1)));
1524- iterate_if(binop(Iop_CmpNE64, mkexpr(counter), mkexpr(maxIndexToStore)));
1525- put_counter_dw0(mkU64(0));
1526+ return "vlrlr";
1527+}
1528
1529+static const HChar *
1530+s390_irgen_VSTL(UChar v1, IRTemp addr, UChar r3)
1531+{
1532+ s390_vr_storeWithLength(v1, addr, get_gpr_w1(r3), False);
1533 return "vstl";
1534 }
1535
1536 static const HChar *
1537+s390_irgen_VSTRL(UChar v1, IRTemp addr, UChar i3)
1538+{
1539+ s390_insn_assert("vstrl", (i3 & 0xf0) == 0);
1540+ s390_vr_storeWithLength(v1, addr, mkU32((UInt) i3), True);
1541+ return "vstrl";
1542+}
1543+
1544+static const HChar *
1545+s390_irgen_VSTRLR(UChar v1, UChar r3, IRTemp addr)
1546+{
1547+ s390_vr_storeWithLength(v1, addr, get_gpr_w1(r3), True);
1548+ return "vstrlr";
1549+}
1550+
1551+static const HChar *
1552 s390_irgen_VX(UChar v1, UChar v2, UChar v3)
1553 {
1554 put_vr_qw(v1, binop(Iop_XorV128, get_vr_qw(v2), get_vr_qw(v3)));
1555@@ -16680,6 +16824,24 @@
1556 }
1557
1558 static const HChar *
1559+s390_irgen_VOC(UChar v1, UChar v2, UChar v3)
1560+{
1561+ put_vr_qw(v1, binop(Iop_OrV128, get_vr_qw(v2),
1562+ unop(Iop_NotV128, get_vr_qw(v3))));
1563+
1564+ return "voc";
1565+}
1566+
1567+static const HChar *
1568+s390_irgen_VNN(UChar v1, UChar v2, UChar v3)
1569+{
1570+ put_vr_qw(v1, unop(Iop_NotV128,
1571+ binop(Iop_AndV128, get_vr_qw(v2), get_vr_qw(v3))));
1572+
1573+ return "vnn";
1574+}
1575+
1576+static const HChar *
1577 s390_irgen_VNO(UChar v1, UChar v2, UChar v3)
1578 {
1579 put_vr_qw(v1, unop(Iop_NotV128,
1580@@ -16689,6 +16851,15 @@
1581 }
1582
1583 static const HChar *
1584+s390_irgen_VNX(UChar v1, UChar v2, UChar v3)
1585+{
1586+ put_vr_qw(v1, unop(Iop_NotV128,
1587+ binop(Iop_XorV128, get_vr_qw(v2), get_vr_qw(v3))));
1588+
1589+ return "vnx";
1590+}
1591+
1592+static const HChar *
1593 s390_irgen_LZRF(UChar r1, IRTemp op2addr)
1594 {
1595 IRTemp op2 = newTemp(Ity_I32);
1596@@ -17496,9 +17667,19 @@
1597 static const HChar *
1598 s390_irgen_VPOPCT(UChar v1, UChar v2, UChar m3)
1599 {
1600- vassert(m3 == 0);
1601+ s390_insn_assert("vpopct", m3 <= 3);
1602
1603- put_vr_qw(v1, unop(Iop_Cnt8x16, get_vr_qw(v2)));
1604+ IRExpr* cnt = unop(Iop_Cnt8x16, get_vr_qw(v2));
1605+
1606+ if (m3 >= 1) {
1607+ cnt = unop(Iop_PwAddL8Ux16, cnt);
1608+ if (m3 >= 2) {
1609+ cnt = unop(Iop_PwAddL16Ux8, cnt);
1610+ if (m3 == 3)
1611+ cnt = unop(Iop_PwAddL32Ux4, cnt);
1612+ }
1613+ }
1614+ put_vr_qw(v1, cnt);
1615
1616 return "vpopct";
1617 }
1618@@ -18332,12 +18513,53 @@
1619 return "vmalh";
1620 }
1621
1622+static const HChar *
1623+s390_irgen_VMSL(UChar v1, UChar v2, UChar v3, UChar v4, UChar m5, UChar m6)
1624+{
1625+ s390_insn_assert("vmsl", m5 == 3 && (m6 & 3) == 0);
1626+
1627+ IRDirty* d;
1628+ IRTemp cc = newTemp(Ity_I64);
1629+
1630+ s390x_vec_op_details_t details = { .serialized = 0ULL };
1631+ details.op = S390_VEC_OP_VMSL;
1632+ details.v1 = v1;
1633+ details.v2 = v2;
1634+ details.v3 = v3;
1635+ details.v4 = v4;
1636+ details.m4 = m5;
1637+ details.m5 = m6;
1638+
1639+ d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op",
1640+ &s390x_dirtyhelper_vec_op,
1641+ mkIRExprVec_2(IRExpr_GSPTR(),
1642+ mkU64(details.serialized)));
1643+
1644+ d->nFxState = 4;
1645+ vex_bzero(&d->fxState, sizeof(d->fxState));
1646+ d->fxState[0].fx = Ifx_Read;
1647+ d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128);
1648+ d->fxState[0].size = sizeof(V128);
1649+ d->fxState[1].fx = Ifx_Read;
1650+ d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v3 * sizeof(V128);
1651+ d->fxState[1].size = sizeof(V128);
1652+ d->fxState[2].fx = Ifx_Read;
1653+ d->fxState[2].offset = S390X_GUEST_OFFSET(guest_v0) + v4 * sizeof(V128);
1654+ d->fxState[2].size = sizeof(V128);
1655+ d->fxState[3].fx = Ifx_Write;
1656+ d->fxState[3].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128);
1657+ d->fxState[3].size = sizeof(V128);
1658+
1659+ stmt(IRStmt_Dirty(d));
1660+
1661+ return "vmsl";
1662+}
1663+
1664 static void
1665-s390_vector_fp_convert(IROp op, IRType fromType, IRType toType,
1666+s390_vector_fp_convert(IROp op, IRType fromType, IRType toType, Bool rounding,
1667 UChar v1, UChar v2, UChar m3, UChar m4, UChar m5)
1668 {
1669 Bool isSingleElementOp = s390_vr_is_single_element_control_set(m4);
1670- UChar maxIndex = isSingleElementOp ? 0 : 1;
1671
1672 /* For Iop_F32toF64 we do this:
1673 f32[0] -> f64[0]
1674@@ -18350,14 +18572,21 @@
1675 The magic below with scaling factors is used to achieve the logic
1676 described above.
1677 */
1678- const UChar sourceIndexScaleFactor = (op == Iop_F32toF64) ? 2 : 1;
1679- const UChar destinationIndexScaleFactor = (op == Iop_F64toF32) ? 2 : 1;
1680+ Int size_diff = sizeofIRType(toType) - sizeofIRType(fromType);
1681+ const UChar sourceIndexScaleFactor = size_diff > 0 ? 2 : 1;
1682+ const UChar destinationIndexScaleFactor = size_diff < 0 ? 2 : 1;
1683+ UChar n_elem = (isSingleElementOp ? 1 :
1684+ 16 / (size_diff > 0 ?
1685+ sizeofIRType(toType) : sizeofIRType(fromType)));
1686
1687- const Bool isUnary = (op == Iop_F32toF64);
1688- for (UChar i = 0; i <= maxIndex; i++) {
1689+ for (UChar i = 0; i < n_elem; i++) {
1690 IRExpr* argument = get_vr(v2, fromType, i * sourceIndexScaleFactor);
1691 IRExpr* result;
1692- if (!isUnary) {
1693+ if (rounding) {
1694+ if (!s390_host_has_fpext && m5 != S390_BFP_ROUND_PER_FPC) {
1695+ emulation_warning(EmWarn_S390X_fpext_rounding);
1696+ m5 = S390_BFP_ROUND_PER_FPC;
1697+ }
1698 result = binop(op,
1699 mkexpr(encode_bfp_rounding_mode(m5)),
1700 argument);
1701@@ -18366,10 +18595,6 @@
1702 }
1703 put_vr(v1, toType, i * destinationIndexScaleFactor, result);
1704 }
1705-
1706- if (isSingleElementOp) {
1707- put_vr_dw1(v1, mkU64(0));
1708- }
1709 }
1710
1711 static const HChar *
1712@@ -18377,12 +18602,8 @@
1713 {
1714 s390_insn_assert("vcdg", m3 == 3);
1715
1716- if (!s390_host_has_fpext && m5 != S390_BFP_ROUND_PER_FPC) {
1717- emulation_warning(EmWarn_S390X_fpext_rounding);
1718- m5 = S390_BFP_ROUND_PER_FPC;
1719- }
1720-
1721- s390_vector_fp_convert(Iop_I64StoF64, Ity_I64, Ity_F64, v1, v2, m3, m4, m5);
1722+ s390_vector_fp_convert(Iop_I64StoF64, Ity_I64, Ity_F64, True,
1723+ v1, v2, m3, m4, m5);
1724
1725 return "vcdg";
1726 }
1727@@ -18392,12 +18613,8 @@
1728 {
1729 s390_insn_assert("vcdlg", m3 == 3);
1730
1731- if (!s390_host_has_fpext && m5 != S390_BFP_ROUND_PER_FPC) {
1732- emulation_warning(EmWarn_S390X_fpext_rounding);
1733- m5 = S390_BFP_ROUND_PER_FPC;
1734- }
1735-
1736- s390_vector_fp_convert(Iop_I64UtoF64, Ity_I64, Ity_F64, v1, v2, m3, m4, m5);
1737+ s390_vector_fp_convert(Iop_I64UtoF64, Ity_I64, Ity_F64, True,
1738+ v1, v2, m3, m4, m5);
1739
1740 return "vcdlg";
1741 }
1742@@ -18407,12 +18624,8 @@
1743 {
1744 s390_insn_assert("vcgd", m3 == 3);
1745
1746- if (!s390_host_has_fpext && m5 != S390_BFP_ROUND_PER_FPC) {
1747- emulation_warning(EmWarn_S390X_fpext_rounding);
1748- m5 = S390_BFP_ROUND_PER_FPC;
1749- }
1750-
1751- s390_vector_fp_convert(Iop_F64toI64S, Ity_F64, Ity_I64, v1, v2, m3, m4, m5);
1752+ s390_vector_fp_convert(Iop_F64toI64S, Ity_F64, Ity_I64, True,
1753+ v1, v2, m3, m4, m5);
1754
1755 return "vcgd";
1756 }
1757@@ -18422,12 +18635,8 @@
1758 {
1759 s390_insn_assert("vclgd", m3 == 3);
1760
1761- if (!s390_host_has_fpext && m5 != S390_BFP_ROUND_PER_FPC) {
1762- emulation_warning(EmWarn_S390X_fpext_rounding);
1763- m5 = S390_BFP_ROUND_PER_FPC;
1764- }
1765-
1766- s390_vector_fp_convert(Iop_F64toI64U, Ity_F64, Ity_I64, v1, v2, m3, m4, m5);
1767+ s390_vector_fp_convert(Iop_F64toI64U, Ity_F64, Ity_I64, True,
1768+ v1, v2, m3, m4, m5);
1769
1770 return "vclgd";
1771 }
1772@@ -18435,246 +18644,262 @@
1773 static const HChar *
1774 s390_irgen_VFI(UChar v1, UChar v2, UChar m3, UChar m4, UChar m5)
1775 {
1776- s390_insn_assert("vfi", m3 == 3);
1777+ s390_insn_assert("vfi",
1778+ (m3 == 3 || (s390_host_has_vxe && m3 >= 2 && m3 <= 4)));
1779
1780- if (!s390_host_has_fpext && m5 != S390_BFP_ROUND_PER_FPC) {
1781- emulation_warning(EmWarn_S390X_fpext_rounding);
1782- m5 = S390_BFP_ROUND_PER_FPC;
1783+ switch (m3) {
1784+ case 2: s390_vector_fp_convert(Iop_RoundF32toInt, Ity_F32, Ity_F32, True,
1785+ v1, v2, m3, m4, m5); break;
1786+ case 3: s390_vector_fp_convert(Iop_RoundF64toInt, Ity_F64, Ity_F64, True,
1787+ v1, v2, m3, m4, m5); break;
1788+ case 4: s390_vector_fp_convert(Iop_RoundF128toInt, Ity_F128, Ity_F128, True,
1789+ v1, v2, m3, m4, m5); break;
1790 }
1791
1792- s390_vector_fp_convert(Iop_RoundF64toInt, Ity_F64, Ity_F64,
1793- v1, v2, m3, m4, m5);
1794-
1795- return "vcgld";
1796+ return "vfi";
1797 }
1798
1799 static const HChar *
1800-s390_irgen_VLDE(UChar v1, UChar v2, UChar m3, UChar m4, UChar m5)
1801+s390_irgen_VFLL(UChar v1, UChar v2, UChar m3, UChar m4, UChar m5)
1802 {
1803- s390_insn_assert("vlde", m3 == 2);
1804+ s390_insn_assert("vfll", m3 == 2 || (s390_host_has_vxe && m3 == 3));
1805
1806- s390_vector_fp_convert(Iop_F32toF64, Ity_F32, Ity_F64, v1, v2, m3, m4, m5);
1807+ if (m3 == 2)
1808+ s390_vector_fp_convert(Iop_F32toF64, Ity_F32, Ity_F64, False,
1809+ v1, v2, m3, m4, m5);
1810+ else
1811+ s390_vector_fp_convert(Iop_F64toF128, Ity_F64, Ity_F128, False,
1812+ v1, v2, m3, m4, m5);
1813
1814- return "vlde";
1815+ return "vfll";
1816 }
1817
1818 static const HChar *
1819-s390_irgen_VLED(UChar v1, UChar v2, UChar m3, UChar m4, UChar m5)
1820+s390_irgen_VFLR(UChar v1, UChar v2, UChar m3, UChar m4, UChar m5)
1821 {
1822- s390_insn_assert("vled", m3 == 3);
1823+ s390_insn_assert("vflr", m3 == 3 || (s390_host_has_vxe && m3 == 2));
1824
1825- if (!s390_host_has_fpext && m5 != S390_BFP_ROUND_PER_FPC) {
1826- m5 = S390_BFP_ROUND_PER_FPC;
1827- }
1828-
1829- s390_vector_fp_convert(Iop_F64toF32, Ity_F64, Ity_F32, v1, v2, m3, m4, m5);
1830+ if (m3 == 3)
1831+ s390_vector_fp_convert(Iop_F64toF32, Ity_F64, Ity_F32, True,
1832+ v1, v2, m3, m4, m5);
1833+ else
1834+ s390_vector_fp_convert(Iop_F128toF64, Ity_F128, Ity_F64, True,
1835+ v1, v2, m3, m4, m5);
1836
1837- return "vled";
1838+ return "vflr";
1839 }
1840
1841 static const HChar *
1842 s390_irgen_VFPSO(UChar v1, UChar v2, UChar m3, UChar m4, UChar m5)
1843 {
1844- s390_insn_assert("vfpso", m3 == 3);
1845+ s390_insn_assert("vfpso", m5 <= 2 &&
1846+ (m3 == 3 || (s390_host_has_vxe && m3 >= 2 && m3 <= 4)));
1847
1848- IRExpr* result;
1849- switch (m5) {
1850- case 0: {
1851- /* Invert sign */
1852- if (!s390_vr_is_single_element_control_set(m4)) {
1853- result = unop(Iop_Neg64Fx2, get_vr_qw(v2));
1854- }
1855- else {
1856- result = binop(Iop_64HLtoV128,
1857- unop(Iop_ReinterpF64asI64,
1858- unop(Iop_NegF64, get_vr(v2, Ity_F64, 0))),
1859- mkU64(0));
1860- }
1861- break;
1862- }
1863+ Bool single = s390_vr_is_single_element_control_set(m4) || m3 == 4;
1864+ IRType type = single ? s390_vr_get_ftype(m3) : Ity_V128;
1865+ int idx = 2 * (m3 - 2) + (single ? 0 : 1);
1866+
1867+ static const IROp negate_ops[] = {
1868+ Iop_NegF32, Iop_Neg32Fx4,
1869+ Iop_NegF64, Iop_Neg64Fx2,
1870+ Iop_NegF128
1871+ };
1872+ static const IROp abs_ops[] = {
1873+ Iop_AbsF32, Iop_Abs32Fx4,
1874+ Iop_AbsF64, Iop_Abs64Fx2,
1875+ Iop_AbsF128
1876+ };
1877
1878- case 1: {
1879+ if (m5 == 1) {
1880 /* Set sign to negative */
1881- IRExpr* highHalf = mkU64(0x8000000000000000ULL);
1882- if (!s390_vr_is_single_element_control_set(m4)) {
1883- IRExpr* lowHalf = highHalf;
1884- IRExpr* mask = binop(Iop_64HLtoV128, highHalf, lowHalf);
1885- result = binop(Iop_OrV128, get_vr_qw(v2), mask);
1886- }
1887- else {
1888- result = binop(Iop_64HLtoV128,
1889- binop(Iop_Or64, get_vr_dw0(v2), highHalf),
1890- mkU64(0ULL));
1891- }
1892-
1893- break;
1894+ put_vr(v1, type, 0,
1895+ unop(negate_ops[idx],
1896+ unop(abs_ops[idx], get_vr(v2, type, 0))));
1897+ } else {
1898+ /* m5 == 0: invert sign; m5 == 2: set sign to positive */
1899+ const IROp *ops = m5 == 2 ? abs_ops : negate_ops;
1900+ put_vr(v1, type, 0, unop(ops[idx], get_vr(v2, type, 0)));
1901 }
1902
1903- case 2: {
1904- /* Set sign to positive */
1905- if (!s390_vr_is_single_element_control_set(m4)) {
1906- result = unop(Iop_Abs64Fx2, get_vr_qw(v2));
1907- }
1908- else {
1909- result = binop(Iop_64HLtoV128,
1910- unop(Iop_ReinterpF64asI64,
1911- unop(Iop_AbsF64, get_vr(v2, Ity_F64, 0))),
1912- mkU64(0));
1913- }
1914+ return "vfpso";
1915+}
1916
1917- break;
1918- }
1919+static const HChar *
1920+s390x_vec_fp_binary_op(const HChar* mnm, const IROp ops[],
1921+ UChar v1, UChar v2, UChar v3,
1922+ UChar m4, UChar m5)
1923+{
1924+ s390_insn_assert(mnm, (m5 & 7) == 0 &&
1925+ (m4 == 3 || (s390_host_has_vxe && m4 >= 2 && m4 <= 4)));
1926
1927- default:
1928- vpanic("s390_irgen_VFPSO: Invalid m5 value");
1929- }
1930+ int idx = 2 * (m4 - 2);
1931
1932- put_vr_qw(v1, result);
1933- if (s390_vr_is_single_element_control_set(m4)) {
1934- put_vr_dw1(v1, mkU64(0ULL));
1935+ if (m4 == 4 || s390_vr_is_single_element_control_set(m5)) {
1936+ IRType type = s390_vr_get_ftype(m4);
1937+ put_vr(v1, type, 0,
1938+ triop(ops[idx], get_bfp_rounding_mode_from_fpc(),
1939+ get_vr(v2, type, 0), get_vr(v3, type, 0)));
1940+ } else {
1941+ put_vr_qw(v1, triop(ops[idx + 1], get_bfp_rounding_mode_from_fpc(),
1942+ get_vr_qw(v2), get_vr_qw(v3)));
1943 }
1944
1945- return "vfpso";
1946+ return mnm;
1947 }
1948
1949-static void s390x_vec_fp_binary_op(IROp generalOp, IROp singleElementOp,
1950- UChar v1, UChar v2, UChar v3, UChar m4,
1951- UChar m5)
1952+static const HChar *
1953+s390x_vec_fp_unary_op(const HChar* mnm, const IROp ops[],
1954+ UChar v1, UChar v2, UChar m3, UChar m4)
1955 {
1956- IRExpr* result;
1957- if (!s390_vr_is_single_element_control_set(m5)) {
1958- result = triop(generalOp, get_bfp_rounding_mode_from_fpc(),
1959- get_vr_qw(v2), get_vr_qw(v3));
1960- } else {
1961- IRExpr* highHalf = triop(singleElementOp,
1962- get_bfp_rounding_mode_from_fpc(),
1963- get_vr(v2, Ity_F64, 0),
1964- get_vr(v3, Ity_F64, 0));
1965- result = binop(Iop_64HLtoV128, unop(Iop_ReinterpF64asI64, highHalf),
1966- mkU64(0ULL));
1967- }
1968+ s390_insn_assert(mnm, (m4 & 7) == 0 &&
1969+ (m3 == 3 || (s390_host_has_vxe && m3 >= 2 && m3 <= 4)));
1970
1971- put_vr_qw(v1, result);
1972-}
1973+ int idx = 2 * (m3 - 2);
1974
1975-static void s390x_vec_fp_unary_op(IROp generalOp, IROp singleElementOp,
1976- UChar v1, UChar v2, UChar m3, UChar m4)
1977-{
1978- IRExpr* result;
1979- if (!s390_vr_is_single_element_control_set(m4)) {
1980- result = binop(generalOp, get_bfp_rounding_mode_from_fpc(),
1981- get_vr_qw(v2));
1982+ if (m3 == 4 || s390_vr_is_single_element_control_set(m4)) {
1983+ IRType type = s390_vr_get_ftype(m3);
1984+ put_vr(v1, type, 0,
1985+ binop(ops[idx], get_bfp_rounding_mode_from_fpc(),
1986+ get_vr(v2, type, 0)));
1987 }
1988 else {
1989- IRExpr* highHalf = binop(singleElementOp,
1990- get_bfp_rounding_mode_from_fpc(),
1991- get_vr(v2, Ity_F64, 0));
1992- result = binop(Iop_64HLtoV128, unop(Iop_ReinterpF64asI64, highHalf),
1993- mkU64(0ULL));
1994+ put_vr_qw(v1, binop(ops[idx + 1], get_bfp_rounding_mode_from_fpc(),
1995+ get_vr_qw(v2)));
1996 }
1997
1998- put_vr_qw(v1, result);
1999+ return mnm;
2000 }
2001
2002
2003-static void
2004-s390_vector_fp_mulAddOrSub(IROp singleElementOp,
2005- UChar v1, UChar v2, UChar v3, UChar v4,
2006- UChar m5, UChar m6)
2007+static const HChar *
2008+s390_vector_fp_mulAddOrSub(UChar v1, UChar v2, UChar v3, UChar v4,
2009+ UChar m5, UChar m6,
2010+ const HChar* mnm, const IROp single_ops[],
2011+ Bool negate)
2012 {
2013- Bool isSingleElementOp = s390_vr_is_single_element_control_set(m5);
2014+ s390_insn_assert(mnm, m6 == 3 || (s390_host_has_vxe && m6 >= 2 && m6 <= 4));
2015+
2016+ static const IROp negate_ops[] = { Iop_NegF32, Iop_NegF64, Iop_NegF128 };
2017+ IRType type = s390_vr_get_ftype(m6);
2018+ Bool single = s390_vr_is_single_element_control_set(m5) || m6 == 4;
2019+ UChar n_elem = single ? 1 : s390_vr_get_n_elem(m6);
2020 IRTemp irrm_temp = newTemp(Ity_I32);
2021 assign(irrm_temp, get_bfp_rounding_mode_from_fpc());
2022 IRExpr* irrm = mkexpr(irrm_temp);
2023- IRExpr* result;
2024- IRExpr* highHalf = qop(singleElementOp,
2025- irrm,
2026- get_vr(v2, Ity_F64, 0),
2027- get_vr(v3, Ity_F64, 0),
2028- get_vr(v4, Ity_F64, 0));
2029-
2030- if (isSingleElementOp) {
2031- result = binop(Iop_64HLtoV128, unop(Iop_ReinterpF64asI64, highHalf),
2032- mkU64(0ULL));
2033- } else {
2034- IRExpr* lowHalf = qop(singleElementOp,
2035- irrm,
2036- get_vr(v2, Ity_F64, 1),
2037- get_vr(v3, Ity_F64, 1),
2038- get_vr(v4, Ity_F64, 1));
2039- result = binop(Iop_64HLtoV128, unop(Iop_ReinterpF64asI64, highHalf),
2040- unop(Iop_ReinterpF64asI64, lowHalf));
2041- }
2042
2043- put_vr_qw(v1, result);
2044+ for (UChar idx = 0; idx < n_elem; idx++) {
2045+ IRExpr* result = qop(single_ops[m6 - 2],
2046+ irrm,
2047+ get_vr(v2, type, idx),
2048+ get_vr(v3, type, idx),
2049+ get_vr(v4, type, idx));
2050+ put_vr(v1, type, idx, negate ? unop(negate_ops[m6 - 2], result) : result);
2051+ }
2052+ return mnm;
2053 }
2054
2055 static const HChar *
2056 s390_irgen_VFA(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5)
2057 {
2058- s390_insn_assert("vfa", m4 == 3);
2059- s390x_vec_fp_binary_op(Iop_Add64Fx2, Iop_AddF64, v1, v2, v3, m4, m5);
2060- return "vfa";
2061+ static const IROp vfa_ops[] = {
2062+ Iop_AddF32, Iop_Add32Fx4,
2063+ Iop_AddF64, Iop_Add64Fx2,
2064+ Iop_AddF128,
2065+ };
2066+ return s390x_vec_fp_binary_op("vfa", vfa_ops, v1, v2, v3, m4, m5);
2067 }
2068
2069 static const HChar *
2070 s390_irgen_VFS(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5)
2071 {
2072- s390_insn_assert("vfs", m4 == 3);
2073- s390x_vec_fp_binary_op(Iop_Sub64Fx2, Iop_SubF64, v1, v2, v3, m4, m5);
2074- return "vfs";
2075+ static const IROp vfs_ops[] = {
2076+ Iop_SubF32, Iop_Sub32Fx4,
2077+ Iop_SubF64, Iop_Sub64Fx2,
2078+ Iop_SubF128,
2079+ };
2080+ return s390x_vec_fp_binary_op("vfs", vfs_ops, v1, v2, v3, m4, m5);
2081 }
2082
2083 static const HChar *
2084 s390_irgen_VFM(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5)
2085 {
2086- s390_insn_assert("vfm", m4 == 3);
2087- s390x_vec_fp_binary_op(Iop_Mul64Fx2, Iop_MulF64, v1, v2, v3, m4, m5);
2088- return "vfm";
2089+ static const IROp vfm_ops[] = {
2090+ Iop_MulF32, Iop_Mul32Fx4,
2091+ Iop_MulF64, Iop_Mul64Fx2,
2092+ Iop_MulF128,
2093+ };
2094+ return s390x_vec_fp_binary_op("vfm", vfm_ops, v1, v2, v3, m4, m5);
2095 }
2096
2097 static const HChar *
2098 s390_irgen_VFD(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5)
2099 {
2100- s390_insn_assert("vfd", m4 == 3);
2101- s390x_vec_fp_binary_op(Iop_Div64Fx2, Iop_DivF64, v1, v2, v3, m4, m5);
2102- return "vfd";
2103+ static const IROp vfd_ops[] = {
2104+ Iop_DivF32, Iop_Div32Fx4,
2105+ Iop_DivF64, Iop_Div64Fx2,
2106+ Iop_DivF128,
2107+ };
2108+ return s390x_vec_fp_binary_op("vfd", vfd_ops, v1, v2, v3, m4, m5);
2109 }
2110
2111 static const HChar *
2112 s390_irgen_VFSQ(UChar v1, UChar v2, UChar m3, UChar m4)
2113 {
2114- s390_insn_assert("vfsq", m3 == 3);
2115- s390x_vec_fp_unary_op(Iop_Sqrt64Fx2, Iop_SqrtF64, v1, v2, m3, m4);
2116-
2117- return "vfsq";
2118+ static const IROp vfsq_ops[] = {
2119+ Iop_SqrtF32, Iop_Sqrt32Fx4,
2120+ Iop_SqrtF64, Iop_Sqrt64Fx2,
2121+ Iop_SqrtF128
2122+ };
2123+ return s390x_vec_fp_unary_op("vfsq", vfsq_ops, v1, v2, m3, m4);
2124 }
2125
2126+static const IROp FMA_single_ops[] = {
2127+ Iop_MAddF32, Iop_MAddF64, Iop_MAddF128
2128+};
2129+
2130 static const HChar *
2131 s390_irgen_VFMA(UChar v1, UChar v2, UChar v3, UChar v4, UChar m5, UChar m6)
2132 {
2133- s390_insn_assert("vfma", m6 == 3);
2134- s390_vector_fp_mulAddOrSub(Iop_MAddF64, v1, v2, v3, v4, m5, m6);
2135- return "vfma";
2136+ return s390_vector_fp_mulAddOrSub(v1, v2, v3, v4, m5, m6,
2137+ "vfma", FMA_single_ops, False);
2138 }
2139
2140 static const HChar *
2141+s390_irgen_VFNMA(UChar v1, UChar v2, UChar v3, UChar v4, UChar m5, UChar m6)
2142+{
2143+ return s390_vector_fp_mulAddOrSub(v1, v2, v3, v4, m5, m6,
2144+ "vfnma", FMA_single_ops, True);
2145+}
2146+
2147+static const IROp FMS_single_ops[] = {
2148+ Iop_MSubF32, Iop_MSubF64, Iop_MSubF128
2149+};
2150+
2151+static const HChar *
2152 s390_irgen_VFMS(UChar v1, UChar v2, UChar v3, UChar v4, UChar m5, UChar m6)
2153 {
2154- s390_insn_assert("vfms", m6 == 3);
2155- s390_vector_fp_mulAddOrSub(Iop_MSubF64, v1, v2, v3, v4, m5, m6);
2156- return "vfms";
2157+ return s390_vector_fp_mulAddOrSub(v1, v2, v3, v4, m5, m6,
2158+ "vfms", FMS_single_ops, False);
2159+}
2160+
2161+static const HChar *
2162+s390_irgen_VFNMS(UChar v1, UChar v2, UChar v3, UChar v4, UChar m5, UChar m6)
2163+{
2164+ return s390_vector_fp_mulAddOrSub(v1, v2, v3, v4, m5, m6,
2165+ "vfnms", FMS_single_ops, True);
2166 }
2167
2168 static const HChar *
2169 s390_irgen_WFC(UChar v1, UChar v2, UChar m3, UChar m4)
2170 {
2171- s390_insn_assert("wfc", m3 == 3);
2172- s390_insn_assert("wfc", m4 == 0);
2173+ s390_insn_assert("wfc", m4 == 0 &&
2174+ (m3 == 3 || (s390_host_has_vxe && m3 >= 2 && m3 <= 4)));
2175+
2176+ static const IROp ops[] = { Iop_CmpF32, Iop_CmpF64, Iop_CmpF128 };
2177+ IRType type = s390_vr_get_ftype(m3);
2178
2179 IRTemp cc_vex = newTemp(Ity_I32);
2180- assign(cc_vex, binop(Iop_CmpF64,
2181- get_vr(v1, Ity_F64, 0), get_vr(v2, Ity_F64, 0)));
2182+ assign(cc_vex, binop(ops[m3 - 2], get_vr(v1, type, 0), get_vr(v2, type, 0)));
2183
2184 IRTemp cc_s390 = newTemp(Ity_I32);
2185 assign(cc_s390, convert_vex_bfpcc_to_s390(cc_vex));
2186@@ -18692,213 +18917,253 @@
2187 }
2188
2189 static const HChar *
2190-s390_irgen_VFCE(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6)
2191-{
2192- s390_insn_assert("vfce", m4 == 3);
2193+s390_irgen_VFCx(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6,
2194+ const HChar *mnem, IRCmpFResult cmp, Bool equal_ok,
2195+ IROp cmp32, IROp cmp64)
2196+{
2197+ s390_insn_assert(mnem, (m5 & 3) == 0 && (m6 & 14) == 0 &&
2198+ (m4 == 3 || (s390_host_has_vxe && m4 >= 2 && m4 <= 4)));
2199+
2200+ Bool single = s390_vr_is_single_element_control_set(m5) || m4 == 4;
2201+
2202+ if (single) {
2203+ static const IROp ops[] = { Iop_CmpF32, Iop_CmpF64, Iop_CmpF128 };
2204+ IRType type = s390_vr_get_ftype(m4);
2205+ IRTemp result = newTemp(Ity_I32);
2206+ IRTemp cond = newTemp(Ity_I1);
2207
2208- Bool isSingleElementOp = s390_vr_is_single_element_control_set(m5);
2209- if (!s390_vr_is_cs_set(m6)) {
2210- if (!isSingleElementOp) {
2211- put_vr_qw(v1, binop(Iop_CmpEQ64Fx2, get_vr_qw(v2), get_vr_qw(v3)));
2212+ assign(result, binop(ops[m4 - 2],
2213+ get_vr(v2, type, 0), get_vr(v3, type, 0)));
2214+ if (equal_ok) {
2215+ assign(cond,
2216+ binop(Iop_Or1,
2217+ binop(Iop_CmpEQ32, mkexpr(result), mkU32(cmp)),
2218+ binop(Iop_CmpEQ32, mkexpr(result), mkU32(Ircr_EQ))));
2219 } else {
2220- IRExpr* comparisonResult = binop(Iop_CmpF64, get_vr(v2, Ity_F64, 0),
2221- get_vr(v3, Ity_F64, 0));
2222- IRExpr* result = mkite(binop(Iop_CmpEQ32, comparisonResult,
2223- mkU32(Ircr_EQ)),
2224- mkU64(0xffffffffffffffffULL),
2225- mkU64(0ULL));
2226- put_vr_qw(v1, binop(Iop_64HLtoV128, result, mkU64(0ULL)));
2227+ assign(cond, binop(Iop_CmpEQ32, mkexpr(result), mkU32(cmp)));
2228+ }
2229+ put_vr_qw(v1, mkite(mkexpr(cond),
2230+ IRExpr_Const(IRConst_V128(0xffff)),
2231+ IRExpr_Const(IRConst_V128(0))));
2232+ if (s390_vr_is_cs_set(m6)) {
2233+ IRTemp cc = newTemp(Ity_I64);
2234+ assign(cc, mkite(mkexpr(cond), mkU64(0), mkU64(3)));
2235+ s390_cc_set(cc);
2236 }
2237 } else {
2238- IRDirty* d;
2239- IRTemp cc = newTemp(Ity_I64);
2240+ IRTemp result = newTemp(Ity_V128);
2241
2242- s390x_vec_op_details_t details = { .serialized = 0ULL };
2243- details.op = S390_VEC_OP_VFCE;
2244- details.v1 = v1;
2245- details.v2 = v2;
2246- details.v3 = v3;
2247- details.m4 = m4;
2248- details.m5 = m5;
2249- details.m6 = m6;
2250+ assign(result, binop(m4 == 2 ? cmp32 : cmp64,
2251+ get_vr_qw(v2), get_vr_qw(v3)));
2252+ put_vr_qw(v1, mkexpr(result));
2253+ if (s390_vr_is_cs_set(m6)) {
2254+ IRTemp cc = newTemp(Ity_I64);
2255+ assign(cc,
2256+ mkite(binop(Iop_CmpEQ64,
2257+ binop(Iop_And64,
2258+ unop(Iop_V128to64, mkexpr(result)),
2259+ unop(Iop_V128HIto64, mkexpr(result))),
2260+ mkU64(-1ULL)),
2261+ mkU64(0), /* all comparison results are true */
2262+ mkite(binop(Iop_CmpEQ64,
2263+ binop(Iop_Or64,
2264+ unop(Iop_V128to64, mkexpr(result)),
2265+ unop(Iop_V128HIto64, mkexpr(result))),
2266+ mkU64(0)),
2267+ mkU64(3), /* all false */
2268+ mkU64(1)))); /* mixed true/false */
2269+ s390_cc_set(cc);
2270+ }
2271+ }
2272
2273- d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op",
2274- &s390x_dirtyhelper_vec_op,
2275- mkIRExprVec_2(IRExpr_GSPTR(),
2276- mkU64(details.serialized)));
2277+ return mnem;
2278+}
2279
2280- const UChar elementSize = isSingleElementOp ? sizeof(ULong) : sizeof(V128);
2281- d->nFxState = 3;
2282- vex_bzero(&d->fxState, sizeof(d->fxState));
2283- d->fxState[0].fx = Ifx_Read;
2284- d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128);
2285- d->fxState[0].size = elementSize;
2286- d->fxState[1].fx = Ifx_Read;
2287- d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v3 * sizeof(V128);
2288- d->fxState[1].size = elementSize;
2289- d->fxState[2].fx = Ifx_Write;
2290- d->fxState[2].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128);
2291- d->fxState[2].size = sizeof(V128);
2292+static const HChar *
2293+s390_irgen_VFCE(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6)
2294+{
2295+ return s390_irgen_VFCx(v1, v2, v3, m4, m5, m6, "vfce", Ircr_EQ,
2296+ False, Iop_CmpEQ32Fx4, Iop_CmpEQ64Fx2);
2297+}
2298
2299- stmt(IRStmt_Dirty(d));
2300- s390_cc_set(cc);
2301- }
2302+static const HChar *
2303+s390_irgen_VFCH(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6)
2304+{
2305+ /* Swap arguments and compare "low" instead. */
2306+ return s390_irgen_VFCx(v1, v3, v2, m4, m5, m6, "vfch", Ircr_LT,
2307+ False, Iop_CmpLT32Fx4, Iop_CmpLT64Fx2);
2308+}
2309
2310- return "vfce";
2311+static const HChar *
2312+s390_irgen_VFCHE(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6)
2313+{
2314+ /* Swap arguments and compare "low or equal" instead. */
2315+ return s390_irgen_VFCx(v1, v3, v2, m4, m5, m6, "vfche", Ircr_LT,
2316+ True, Iop_CmpLE32Fx4, Iop_CmpLE64Fx2);
2317 }
2318
2319 static const HChar *
2320-s390_irgen_VFCH(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6)
2321+s390_irgen_VFTCI(UChar v1, UChar v2, UShort i3, UChar m4, UChar m5)
2322 {
2323- vassert(m4 == 3);
2324+ s390_insn_assert("vftci",
2325+ (m4 == 3 || (s390_host_has_vxe && m4 >= 2 && m4 <= 4)));
2326
2327 Bool isSingleElementOp = s390_vr_is_single_element_control_set(m5);
2328- if (!s390_vr_is_cs_set(m6)) {
2329- if (!isSingleElementOp) {
2330- put_vr_qw(v1, binop(Iop_CmpLE64Fx2, get_vr_qw(v3), get_vr_qw(v2)));
2331- } else {
2332- IRExpr* comparisonResult = binop(Iop_CmpF64, get_vr(v2, Ity_F64, 0),
2333- get_vr(v3, Ity_F64, 0));
2334- IRExpr* result = mkite(binop(Iop_CmpEQ32, comparisonResult,
2335- mkU32(Ircr_GT)),
2336- mkU64(0xffffffffffffffffULL),
2337- mkU64(0ULL));
2338- put_vr_qw(v1, binop(Iop_64HLtoV128, result, mkU64(0ULL)));
2339- }
2340- }
2341- else {
2342- IRDirty* d;
2343- IRTemp cc = newTemp(Ity_I64);
2344
2345- s390x_vec_op_details_t details = { .serialized = 0ULL };
2346- details.op = S390_VEC_OP_VFCH;
2347- details.v1 = v1;
2348- details.v2 = v2;
2349- details.v3 = v3;
2350- details.m4 = m4;
2351- details.m5 = m5;
2352- details.m6 = m6;
2353+ IRDirty* d;
2354+ IRTemp cc = newTemp(Ity_I64);
2355
2356- d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op",
2357- &s390x_dirtyhelper_vec_op,
2358- mkIRExprVec_2(IRExpr_GSPTR(),
2359- mkU64(details.serialized)));
2360+ s390x_vec_op_details_t details = { .serialized = 0ULL };
2361+ details.op = S390_VEC_OP_VFTCI;
2362+ details.v1 = v1;
2363+ details.v2 = v2;
2364+ details.i3 = i3;
2365+ details.m4 = m4;
2366+ details.m5 = m5;
2367
2368- const UChar elementSize = isSingleElementOp ? sizeof(ULong) : sizeof(V128);
2369- d->nFxState = 3;
2370- vex_bzero(&d->fxState, sizeof(d->fxState));
2371- d->fxState[0].fx = Ifx_Read;
2372- d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128);
2373- d->fxState[0].size = elementSize;
2374- d->fxState[1].fx = Ifx_Read;
2375- d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v3 * sizeof(V128);
2376- d->fxState[1].size = elementSize;
2377- d->fxState[2].fx = Ifx_Write;
2378- d->fxState[2].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128);
2379- d->fxState[2].size = sizeof(V128);
2380+ d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op",
2381+ &s390x_dirtyhelper_vec_op,
2382+ mkIRExprVec_2(IRExpr_GSPTR(),
2383+ mkU64(details.serialized)));
2384
2385- stmt(IRStmt_Dirty(d));
2386- s390_cc_set(cc);
2387- }
2388+ const UChar elementSize = isSingleElementOp ?
2389+ sizeofIRType(s390_vr_get_ftype(m4)) : sizeof(V128);
2390+ d->nFxState = 2;
2391+ vex_bzero(&d->fxState, sizeof(d->fxState));
2392+ d->fxState[0].fx = Ifx_Read;
2393+ d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128);
2394+ d->fxState[0].size = elementSize;
2395+ d->fxState[1].fx = Ifx_Write;
2396+ d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128);
2397+ d->fxState[1].size = sizeof(V128);
2398+
2399+ stmt(IRStmt_Dirty(d));
2400+ s390_cc_set(cc);
2401
2402- return "vfch";
2403+ return "vftci";
2404 }
2405
2406 static const HChar *
2407-s390_irgen_VFCHE(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6)
2408+s390_irgen_VFMIN(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6)
2409 {
2410- s390_insn_assert("vfche", m4 == 3);
2411+ s390_insn_assert("vfmin",
2412+ (m4 == 3 || (s390_host_has_vxe && m4 >= 2 && m4 <= 4)));
2413
2414 Bool isSingleElementOp = s390_vr_is_single_element_control_set(m5);
2415- if (!s390_vr_is_cs_set(m6)) {
2416- if (!isSingleElementOp) {
2417- put_vr_qw(v1, binop(Iop_CmpLT64Fx2, get_vr_qw(v3), get_vr_qw(v2)));
2418- }
2419- else {
2420- IRExpr* comparisonResult = binop(Iop_CmpF64, get_vr(v3, Ity_F64, 0),
2421- get_vr(v2, Ity_F64, 0));
2422- IRExpr* result = mkite(binop(Iop_CmpEQ32, comparisonResult,
2423- mkU32(Ircr_LT)),
2424- mkU64(0xffffffffffffffffULL),
2425- mkU64(0ULL));
2426- put_vr_qw(v1, binop(Iop_64HLtoV128, result, mkU64(0ULL)));
2427- }
2428- }
2429- else {
2430- IRDirty* d;
2431- IRTemp cc = newTemp(Ity_I64);
2432-
2433- s390x_vec_op_details_t details = { .serialized = 0ULL };
2434- details.op = S390_VEC_OP_VFCHE;
2435- details.v1 = v1;
2436- details.v2 = v2;
2437- details.v3 = v3;
2438- details.m4 = m4;
2439- details.m5 = m5;
2440- details.m6 = m6;
2441+ IRDirty* d;
2442+ IRTemp cc = newTemp(Ity_I64);
2443
2444- d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op",
2445- &s390x_dirtyhelper_vec_op,
2446- mkIRExprVec_2(IRExpr_GSPTR(),
2447- mkU64(details.serialized)));
2448+ s390x_vec_op_details_t details = { .serialized = 0ULL };
2449+ details.op = S390_VEC_OP_VFMIN;
2450+ details.v1 = v1;
2451+ details.v2 = v2;
2452+ details.v3 = v3;
2453+ details.m4 = m4;
2454+ details.m5 = m5;
2455+ details.m6 = m6;
2456
2457- const UChar elementSize = isSingleElementOp ? sizeof(ULong) : sizeof(V128);
2458- d->nFxState = 3;
2459- vex_bzero(&d->fxState, sizeof(d->fxState));
2460- d->fxState[0].fx = Ifx_Read;
2461- d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128);
2462- d->fxState[0].size = elementSize;
2463- d->fxState[1].fx = Ifx_Read;
2464- d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v3 * sizeof(V128);
2465- d->fxState[1].size = elementSize;
2466- d->fxState[2].fx = Ifx_Write;
2467- d->fxState[2].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128);
2468- d->fxState[2].size = sizeof(V128);
2469+ d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op",
2470+ &s390x_dirtyhelper_vec_op,
2471+ mkIRExprVec_2(IRExpr_GSPTR(),
2472+ mkU64(details.serialized)));
2473
2474- stmt(IRStmt_Dirty(d));
2475- s390_cc_set(cc);
2476- }
2477+ const UChar elementSize = isSingleElementOp ?
2478+ sizeofIRType(s390_vr_get_ftype(m4)) : sizeof(V128);
2479+ d->nFxState = 3;
2480+ vex_bzero(&d->fxState, sizeof(d->fxState));
2481+ d->fxState[0].fx = Ifx_Read;
2482+ d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128);
2483+ d->fxState[0].size = elementSize;
2484+ d->fxState[1].fx = Ifx_Read;
2485+ d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v3 * sizeof(V128);
2486+ d->fxState[1].size = elementSize;
2487+ d->fxState[2].fx = Ifx_Write;
2488+ d->fxState[2].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128);
2489+ d->fxState[2].size = sizeof(V128);
2490
2491- return "vfche";
2492+ stmt(IRStmt_Dirty(d));
2493+ s390_cc_set(cc);
2494+ return "vfmin";
2495 }
2496
2497 static const HChar *
2498-s390_irgen_VFTCI(UChar v1, UChar v2, UShort i3, UChar m4, UChar m5)
2499+s390_irgen_VFMAX(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6)
2500 {
2501- s390_insn_assert("vftci", m4 == 3);
2502+ s390_insn_assert("vfmax",
2503+ (m4 == 3 || (s390_host_has_vxe && m4 >= 2 && m4 <= 4)));
2504
2505 Bool isSingleElementOp = s390_vr_is_single_element_control_set(m5);
2506-
2507 IRDirty* d;
2508 IRTemp cc = newTemp(Ity_I64);
2509
2510 s390x_vec_op_details_t details = { .serialized = 0ULL };
2511- details.op = S390_VEC_OP_VFTCI;
2512+ details.op = S390_VEC_OP_VFMAX;
2513 details.v1 = v1;
2514 details.v2 = v2;
2515- details.i3 = i3;
2516+ details.v3 = v3;
2517 details.m4 = m4;
2518 details.m5 = m5;
2519+ details.m6 = m6;
2520
2521 d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op",
2522 &s390x_dirtyhelper_vec_op,
2523 mkIRExprVec_2(IRExpr_GSPTR(),
2524 mkU64(details.serialized)));
2525
2526- const UChar elementSize = isSingleElementOp ? sizeof(ULong) : sizeof(V128);
2527- d->nFxState = 2;
2528+ const UChar elementSize = isSingleElementOp ?
2529+ sizeofIRType(s390_vr_get_ftype(m4)) : sizeof(V128);
2530+ d->nFxState = 3;
2531 vex_bzero(&d->fxState, sizeof(d->fxState));
2532 d->fxState[0].fx = Ifx_Read;
2533 d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128);
2534 d->fxState[0].size = elementSize;
2535- d->fxState[1].fx = Ifx_Write;
2536- d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128);
2537- d->fxState[1].size = sizeof(V128);
2538+ d->fxState[1].fx = Ifx_Read;
2539+ d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v3 * sizeof(V128);
2540+ d->fxState[1].size = elementSize;
2541+ d->fxState[2].fx = Ifx_Write;
2542+ d->fxState[2].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128);
2543+ d->fxState[2].size = sizeof(V128);
2544
2545 stmt(IRStmt_Dirty(d));
2546 s390_cc_set(cc);
2547+ return "vfmax";
2548+}
2549
2550- return "vftci";
2551+static const HChar *
2552+s390_irgen_VBPERM(UChar v1, UChar v2, UChar v3)
2553+{
2554+ IRDirty* d;
2555+ IRTemp cc = newTemp(Ity_I64);
2556+
2557+ s390x_vec_op_details_t details = { .serialized = 0ULL };
2558+ details.op = S390_VEC_OP_VBPERM;
2559+ details.v1 = v1;
2560+ details.v2 = v2;
2561+ details.v3 = v3;
2562+ details.m4 = 0;
2563+ details.m5 = 0;
2564+ details.m6 = 0;
2565+
2566+ d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op",
2567+ &s390x_dirtyhelper_vec_op,
2568+ mkIRExprVec_2(IRExpr_GSPTR(),
2569+ mkU64(details.serialized)));
2570+
2571+ d->nFxState = 3;
2572+ vex_bzero(&d->fxState, sizeof(d->fxState));
2573+ d->fxState[0].fx = Ifx_Read;
2574+ d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128);
2575+ d->fxState[0].size = sizeof(V128);
2576+ d->fxState[1].fx = Ifx_Read;
2577+ d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v3 * sizeof(V128);
2578+ d->fxState[1].size = sizeof(V128);
2579+ d->fxState[2].fx = Ifx_Write;
2580+ d->fxState[2].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128);
2581+ d->fxState[2].size = sizeof(V128);
2582+
2583+ stmt(IRStmt_Dirty(d));
2584+ s390_cc_set(cc);
2585+ return "vbperm";
2586 }
2587
2588 /* New insns are added here.
2589@@ -20486,11 +20751,23 @@
2590 RXY_dl2(ovl),
2591 RXY_dh2(ovl)); goto ok;
2592 case 0xe60000000034ULL: /* VPKZ */ goto unimplemented;
2593- case 0xe60000000035ULL: /* VLRL */ goto unimplemented;
2594- case 0xe60000000037ULL: /* VLRLR */ goto unimplemented;
2595+ case 0xe60000000035ULL: s390_format_VSI_URDV(s390_irgen_VLRL, VSI_v1(ovl),
2596+ VSI_b2(ovl), VSI_d2(ovl),
2597+ VSI_i3(ovl),
2598+ VSI_rxb(ovl)); goto ok;
2599+ case 0xe60000000037ULL: s390_format_VRS_RRDV(s390_irgen_VLRLR, VRSd_v1(ovl),
2600+ VRSd_r3(ovl), VRS_b2(ovl),
2601+ VRS_d2(ovl),
2602+ VRS_rxb(ovl)); goto ok;
2603 case 0xe6000000003cULL: /* VUPKZ */ goto unimplemented;
2604- case 0xe6000000003dULL: /* VSTRL */ goto unimplemented;
2605- case 0xe6000000003fULL: /* VSTRLR */ goto unimplemented;
2606+ case 0xe6000000003dULL: s390_format_VSI_URDV(s390_irgen_VSTRL, VSI_v1(ovl),
2607+ VSI_b2(ovl), VSI_d2(ovl),
2608+ VSI_i3(ovl),
2609+ VSI_rxb(ovl)); goto ok;
2610+ case 0xe6000000003fULL: s390_format_VRS_RRDV(s390_irgen_VSTRLR, VRSd_v1(ovl),
2611+ VRSd_r3(ovl), VRS_b2(ovl),
2612+ VRS_d2(ovl),
2613+ VRS_rxb(ovl)); goto ok;
2614 case 0xe60000000049ULL: /* VLIP */ goto unimplemented;
2615 case 0xe60000000050ULL: /* VCVB */ goto unimplemented;
2616 case 0xe60000000052ULL: /* VCVBG */ goto unimplemented;
2617@@ -20688,12 +20965,18 @@
2618 case 0xe7000000006bULL: s390_format_VRR_VVV(s390_irgen_VNO, VRR_v1(ovl),
2619 VRR_v2(ovl), VRR_r3(ovl),
2620 VRR_rxb(ovl)); goto ok;
2621- case 0xe7000000006cULL: /* VNX */ goto unimplemented;
2622+ case 0xe7000000006cULL: s390_format_VRR_VVV(s390_irgen_VNX, VRR_v1(ovl),
2623+ VRR_v2(ovl), VRR_r3(ovl),
2624+ VRR_rxb(ovl)); goto ok;
2625 case 0xe7000000006dULL: s390_format_VRR_VVV(s390_irgen_VX, VRR_v1(ovl),
2626 VRR_v2(ovl), VRR_r3(ovl),
2627 VRR_rxb(ovl)); goto ok;
2628- case 0xe7000000006eULL: /* VNN */ goto unimplemented;
2629- case 0xe7000000006fULL: /* VOC */ goto unimplemented;
2630+ case 0xe7000000006eULL: s390_format_VRR_VVV(s390_irgen_VNN, VRR_v1(ovl),
2631+ VRR_v2(ovl), VRR_r3(ovl),
2632+ VRR_rxb(ovl)); goto ok;
2633+ case 0xe7000000006fULL: s390_format_VRR_VVV(s390_irgen_VOC, VRR_v1(ovl),
2634+ VRR_v2(ovl), VRR_r3(ovl),
2635+ VRR_rxb(ovl)); goto ok;
2636 case 0xe70000000070ULL: s390_format_VRR_VVVM(s390_irgen_VESLV, VRR_v1(ovl),
2637 VRR_v2(ovl), VRR_r3(ovl),
2638 VRR_m4(ovl), VRR_rxb(ovl)); goto ok;
2639@@ -20746,7 +21029,9 @@
2640 case 0xe70000000084ULL: s390_format_VRR_VVVM(s390_irgen_VPDI, VRR_v1(ovl),
2641 VRR_v2(ovl), VRR_r3(ovl),
2642 VRR_m4(ovl), VRR_rxb(ovl)); goto ok;
2643- case 0xe70000000085ULL: /* VBPERM */ goto unimplemented;
2644+ case 0xe70000000085ULL: s390_format_VRR_VVV(s390_irgen_VBPERM, VRR_v1(ovl),
2645+ VRR_v2(ovl), VRR_r3(ovl),
2646+ VRR_rxb(ovl)); goto ok;
2647 case 0xe7000000008aULL: s390_format_VRR_VVVVMM(s390_irgen_VSTRC, VRRd_v1(ovl),
2648 VRRd_v2(ovl), VRRd_v3(ovl),
2649 VRRd_v4(ovl), VRRd_m5(ovl),
2650@@ -20777,8 +21062,16 @@
2651 case 0xe70000000097ULL: s390_format_VRR_VVVMM(s390_irgen_VPKS, VRR_v1(ovl),
2652 VRR_v2(ovl), VRR_r3(ovl),
2653 VRR_m4(ovl), VRR_m5(ovl), VRR_rxb(ovl)); goto ok;
2654- case 0xe7000000009eULL: /* VFNMS */ goto unimplemented;
2655- case 0xe7000000009fULL: /* VFNMA */ goto unimplemented;
2656+ case 0xe7000000009eULL: s390_format_VRR_VVVVMM(s390_irgen_VFNMS, VRRe_v1(ovl),
2657+ VRRe_v2(ovl), VRRe_v3(ovl),
2658+ VRRe_v4(ovl), VRRe_m5(ovl),
2659+ VRRe_m6(ovl),
2660+ VRRe_rxb(ovl)); goto ok;
2661+ case 0xe7000000009fULL: s390_format_VRR_VVVVMM(s390_irgen_VFNMA, VRRe_v1(ovl),
2662+ VRRe_v2(ovl), VRRe_v3(ovl),
2663+ VRRe_v4(ovl), VRRe_m5(ovl),
2664+ VRRe_m6(ovl),
2665+ VRRe_rxb(ovl)); goto ok;
2666 case 0xe700000000a1ULL: s390_format_VRR_VVVM(s390_irgen_VMLH, VRR_v1(ovl),
2667 VRR_v2(ovl), VRR_r3(ovl),
2668 VRR_m4(ovl), VRR_rxb(ovl)); goto ok;
2669@@ -20831,7 +21124,11 @@
2670 case 0xe700000000b4ULL: s390_format_VRR_VVVM(s390_irgen_VGFM, VRR_v1(ovl),
2671 VRR_v2(ovl), VRR_r3(ovl),
2672 VRR_m4(ovl), VRR_rxb(ovl)); goto ok;
2673- case 0xe700000000b8ULL: /* VMSL */ goto unimplemented;
2674+ case 0xe700000000b8ULL: s390_format_VRR_VVVVMM(s390_irgen_VMSL, VRRd_v1(ovl),
2675+ VRRd_v2(ovl), VRRd_v3(ovl),
2676+ VRRd_v4(ovl), VRRd_m5(ovl),
2677+ VRRd_m6(ovl),
2678+ VRRd_rxb(ovl)); goto ok;
2679 case 0xe700000000b9ULL: s390_format_VRRd_VVVVM(s390_irgen_VACCC, VRRd_v1(ovl),
2680 VRRd_v2(ovl), VRRd_v3(ovl),
2681 VRRd_v4(ovl), VRRd_m5(ovl),
2682@@ -20868,11 +21165,11 @@
2683 VRRa_v2(ovl), VRRa_m3(ovl),
2684 VRRa_m4(ovl), VRRa_m5(ovl),
2685 VRRa_rxb(ovl)); goto ok;
2686- case 0xe700000000c4ULL: s390_format_VRRa_VVMMM(s390_irgen_VLDE, VRRa_v1(ovl),
2687+ case 0xe700000000c4ULL: s390_format_VRRa_VVMMM(s390_irgen_VFLL, VRRa_v1(ovl),
2688 VRRa_v2(ovl), VRRa_m3(ovl),
2689 VRRa_m4(ovl), VRRa_m5(ovl),
2690 VRRa_rxb(ovl)); goto ok;
2691- case 0xe700000000c5ULL: s390_format_VRRa_VVMMM(s390_irgen_VLED, VRRa_v1(ovl),
2692+ case 0xe700000000c5ULL: s390_format_VRRa_VVMMM(s390_irgen_VFLR, VRRa_v1(ovl),
2693 VRRa_v2(ovl), VRRa_m3(ovl),
2694 VRRa_m4(ovl), VRRa_m5(ovl),
2695 VRRa_rxb(ovl)); goto ok;
2696@@ -20953,8 +21250,16 @@
2697 VRRa_m3(ovl), VRRa_m4(ovl),
2698 VRRa_m5(ovl),
2699 VRRa_rxb(ovl)); goto ok;
2700- case 0xe700000000eeULL: /* VFMIN */ goto unimplemented;
2701- case 0xe700000000efULL: /* VFMAX */ goto unimplemented;
2702+ case 0xe700000000eeULL: s390_format_VRRa_VVVMMM(s390_irgen_VFMIN, VRRa_v1(ovl),
2703+ VRRa_v2(ovl), VRRa_v3(ovl),
2704+ VRRa_m3(ovl), VRRa_m4(ovl),
2705+ VRRa_m5(ovl),
2706+ VRRa_rxb(ovl)); goto ok;
2707+ case 0xe700000000efULL: s390_format_VRRa_VVVMMM(s390_irgen_VFMAX, VRRa_v1(ovl),
2708+ VRRa_v2(ovl), VRRa_v3(ovl),
2709+ VRRa_m3(ovl), VRRa_m4(ovl),
2710+ VRRa_m5(ovl),
2711+ VRRa_rxb(ovl)); goto ok;
2712 case 0xe700000000f0ULL: s390_format_VRR_VVVM(s390_irgen_VAVGL, VRR_v1(ovl),
2713 VRR_v2(ovl), VRR_r3(ovl),
2714 VRR_m4(ovl), VRR_rxb(ovl)); goto ok;
2715--- a/VEX/priv/host_s390_defs.c
2716+++ b/VEX/priv/host_s390_defs.c
2717@@ -8,7 +8,7 @@
2718 This file is part of Valgrind, a dynamic binary instrumentation
2719 framework.
2720
2721- Copyright IBM Corp. 2010-2017
2722+ Copyright IBM Corp. 2010-2020
2723 Copyright (C) 2012-2017 Florian Krohm (britzel@acm.org)
2724
2725 This program is free software; you can redistribute it and/or
2726@@ -684,6 +684,8 @@
2727 switch (hregClass(from)) {
2728 case HRcInt64:
2729 return s390_insn_move(sizeofIRType(Ity_I64), to, from);
2730+ case HRcFlt64:
2731+ return s390_insn_move(sizeofIRType(Ity_F64), to, from);
2732 case HRcVec128:
2733 return s390_insn_move(sizeofIRType(Ity_V128), to, from);
2734 default:
2735@@ -7870,6 +7872,10 @@
2736 op = "v-vfloatabs";
2737 break;
2738
2739+ case S390_VEC_FLOAT_NABS:
2740+ op = "v-vfloatnabs";
2741+ break;
2742+
2743 default:
2744 goto fail;
2745 }
2746@@ -9439,21 +9445,28 @@
2747
2748 case S390_VEC_FLOAT_NEG: {
2749 vassert(insn->variant.unop.src.tag == S390_OPND_REG);
2750- vassert(insn->size == 8);
2751+ vassert(insn->size >= 4);
2752 UChar v1 = hregNumber(insn->variant.unop.dst);
2753 UChar v2 = hregNumber(insn->variant.unop.src.variant.reg);
2754 return s390_emit_VFPSO(buf, v1, v2, s390_getM_from_size(insn->size), 0, 0);
2755 }
2756 case S390_VEC_FLOAT_ABS: {
2757 vassert(insn->variant.unop.src.tag == S390_OPND_REG);
2758- vassert(insn->size == 8);
2759+ vassert(insn->size >= 4);
2760 UChar v1 = hregNumber(insn->variant.unop.dst);
2761 UChar v2 = hregNumber(insn->variant.unop.src.variant.reg);
2762 return s390_emit_VFPSO(buf, v1, v2, s390_getM_from_size(insn->size), 0, 2);
2763 }
2764+ case S390_VEC_FLOAT_NABS: {
2765+ vassert(insn->variant.unop.src.tag == S390_OPND_REG);
2766+ vassert(insn->size >= 4);
2767+ UChar v1 = hregNumber(insn->variant.unop.dst);
2768+ UChar v2 = hregNumber(insn->variant.unop.src.variant.reg);
2769+ return s390_emit_VFPSO(buf, v1, v2, s390_getM_from_size(insn->size), 0, 1);
2770+ }
2771 case S390_VEC_FLOAT_SQRT: {
2772 vassert(insn->variant.unop.src.tag == S390_OPND_REG);
2773- vassert(insn->size == 8);
2774+ vassert(insn->size >= 4);
2775 UChar v1 = hregNumber(insn->variant.unop.dst);
2776 UChar v2 = hregNumber(insn->variant.unop.src.variant.reg);
2777 return s390_emit_VFSQ(buf, v1, v2, s390_getM_from_size(insn->size), 0);
2778--- a/VEX/priv/host_s390_defs.h
2779+++ b/VEX/priv/host_s390_defs.h
2780@@ -8,7 +8,7 @@
2781 This file is part of Valgrind, a dynamic binary instrumentation
2782 framework.
2783
2784- Copyright IBM Corp. 2010-2017
2785+ Copyright IBM Corp. 2010-2020
2786
2787 This program is free software; you can redistribute it and/or
2788 modify it under the terms of the GNU General Public License as
2789@@ -205,6 +205,7 @@
2790 S390_VEC_COUNT_ONES,
2791 S390_VEC_FLOAT_NEG,
2792 S390_VEC_FLOAT_ABS,
2793+ S390_VEC_FLOAT_NABS,
2794 S390_VEC_FLOAT_SQRT,
2795 S390_UNOP_T_INVALID
2796 } s390_unop_t;
2797@@ -931,6 +932,8 @@
2798 (s390_host_hwcaps & (VEX_HWCAPS_S390X_MSA5))
2799 #define s390_host_has_lsc2 \
2800 (s390_host_hwcaps & (VEX_HWCAPS_S390X_LSC2))
2801+#define s390_host_has_vxe \
2802+ (s390_host_hwcaps & (VEX_HWCAPS_S390X_VXE))
2803 #endif /* ndef __VEX_HOST_S390_DEFS_H */
2804
2805 /*---------------------------------------------------------------*/
2806--- a/VEX/priv/host_s390_isel.c
2807+++ b/VEX/priv/host_s390_isel.c
2808@@ -8,7 +8,7 @@
2809 This file is part of Valgrind, a dynamic binary instrumentation
2810 framework.
2811
2812- Copyright IBM Corp. 2010-2017
2813+ Copyright IBM Corp. 2010-2020
2814 Copyright (C) 2012-2017 Florian Krohm (britzel@acm.org)
2815
2816 This program is free software; you can redistribute it and/or
2817@@ -2362,9 +2362,10 @@
2818 case Iop_NegF128:
2819 if (left->tag == Iex_Unop &&
2820 (left->Iex.Unop.op == Iop_AbsF32 ||
2821- left->Iex.Unop.op == Iop_AbsF64))
2822+ left->Iex.Unop.op == Iop_AbsF64)) {
2823 bfpop = S390_BFP_NABS;
2824- else
2825+ left = left->Iex.Unop.arg;
2826+ } else
2827 bfpop = S390_BFP_NEG;
2828 goto float128_opnd;
2829 case Iop_AbsF128: bfpop = S390_BFP_ABS; goto float128_opnd;
2830@@ -2726,9 +2727,10 @@
2831 case Iop_NegF64:
2832 if (left->tag == Iex_Unop &&
2833 (left->Iex.Unop.op == Iop_AbsF32 ||
2834- left->Iex.Unop.op == Iop_AbsF64))
2835+ left->Iex.Unop.op == Iop_AbsF64)) {
2836 bfpop = S390_BFP_NABS;
2837- else
2838+ left = left->Iex.Unop.arg;
2839+ } else
2840 bfpop = S390_BFP_NEG;
2841 break;
2842
2843@@ -3944,11 +3946,27 @@
2844 vec_unop = S390_VEC_COUNT_ONES;
2845 goto Iop_V_wrk;
2846
2847+ case Iop_Neg32Fx4:
2848+ size = 4;
2849+ vec_unop = S390_VEC_FLOAT_NEG;
2850+ if (arg->tag == Iex_Unop && arg->Iex.Unop.op == Iop_Abs32Fx4) {
2851+ vec_unop = S390_VEC_FLOAT_NABS;
2852+ arg = arg->Iex.Unop.arg;
2853+ }
2854+ goto Iop_V_wrk;
2855 case Iop_Neg64Fx2:
2856 size = 8;
2857 vec_unop = S390_VEC_FLOAT_NEG;
2858+ if (arg->tag == Iex_Unop && arg->Iex.Unop.op == Iop_Abs64Fx2) {
2859+ vec_unop = S390_VEC_FLOAT_NABS;
2860+ arg = arg->Iex.Unop.arg;
2861+ }
2862 goto Iop_V_wrk;
2863
2864+ case Iop_Abs32Fx4:
2865+ size = 4;
2866+ vec_unop = S390_VEC_FLOAT_ABS;
2867+ goto Iop_V_wrk;
2868 case Iop_Abs64Fx2:
2869 size = 8;
2870 vec_unop = S390_VEC_FLOAT_ABS;
2871@@ -4474,17 +4492,29 @@
2872 vec_binop = S390_VEC_ELEM_ROLL_V;
2873 goto Iop_VV_wrk;
2874
2875+ case Iop_CmpEQ32Fx4:
2876+ size = 4;
2877+ vec_binop = S390_VEC_FLOAT_COMPARE_EQUAL;
2878+ goto Iop_VV_wrk;
2879 case Iop_CmpEQ64Fx2:
2880 size = 8;
2881 vec_binop = S390_VEC_FLOAT_COMPARE_EQUAL;
2882 goto Iop_VV_wrk;
2883
2884+ case Iop_CmpLE32Fx4:
2885+ size = 4;
2886+ vec_binop = S390_VEC_FLOAT_COMPARE_LESS_OR_EQUAL;
2887+ goto Iop_VV_wrk;
2888 case Iop_CmpLE64Fx2: {
2889 size = 8;
2890 vec_binop = S390_VEC_FLOAT_COMPARE_LESS_OR_EQUAL;
2891 goto Iop_VV_wrk;
2892 }
2893
2894+ case Iop_CmpLT32Fx4:
2895+ size = 4;
2896+ vec_binop = S390_VEC_FLOAT_COMPARE_LESS;
2897+ goto Iop_VV_wrk;
2898 case Iop_CmpLT64Fx2: {
2899 size = 8;
2900 vec_binop = S390_VEC_FLOAT_COMPARE_LESS;
2901@@ -4671,20 +4701,41 @@
2902 dst, reg1, reg2, reg3));
2903 return dst;
2904
2905+ case Iop_Add32Fx4:
2906+ size = 4;
2907+ vec_binop = S390_VEC_FLOAT_ADD;
2908+ goto Iop_irrm_VV_wrk;
2909+
2910 case Iop_Add64Fx2:
2911 size = 8;
2912 vec_binop = S390_VEC_FLOAT_ADD;
2913 goto Iop_irrm_VV_wrk;
2914
2915+ case Iop_Sub32Fx4:
2916+ size = 4;
2917+ vec_binop = S390_VEC_FLOAT_SUB;
2918+ goto Iop_irrm_VV_wrk;
2919+
2920 case Iop_Sub64Fx2:
2921 size = 8;
2922 vec_binop = S390_VEC_FLOAT_SUB;
2923 goto Iop_irrm_VV_wrk;
2924
2925+ case Iop_Mul32Fx4:
2926+ size = 4;
2927+ vec_binop = S390_VEC_FLOAT_MUL;
2928+ goto Iop_irrm_VV_wrk;
2929+
2930 case Iop_Mul64Fx2:
2931 size = 8;
2932 vec_binop = S390_VEC_FLOAT_MUL;
2933 goto Iop_irrm_VV_wrk;
2934+
2935+ case Iop_Div32Fx4:
2936+ size = 4;
2937+ vec_binop = S390_VEC_FLOAT_DIV;
2938+ goto Iop_irrm_VV_wrk;
2939+
2940 case Iop_Div64Fx2:
2941 size = 8;
2942 vec_binop = S390_VEC_FLOAT_DIV;
2943--- a/VEX/priv/main_main.c
2944+++ b/VEX/priv/main_main.c
2945@@ -1792,6 +1792,7 @@
2946 { VEX_HWCAPS_S390X_MSA5, "msa5" },
2947 { VEX_HWCAPS_S390X_MI2, "mi2" },
2948 { VEX_HWCAPS_S390X_LSC2, "lsc2" },
2949+ { VEX_HWCAPS_S390X_LSC2, "vxe" },
2950 };
2951 /* Allocate a large enough buffer */
2952 static HChar buf[sizeof prefix +
2953--- a/VEX/pub/libvex_emnote.h
2954+++ b/VEX/pub/libvex_emnote.h
2955@@ -124,6 +124,10 @@
2956 /* ppno insn is not supported on this host */
2957 EmFail_S390X_ppno,
2958
2959+ /* insn needs vector-enhancements facility which is not available on this
2960+ host */
2961+ EmFail_S390X_vxe,
2962+
2963 EmNote_NUMBER
2964 }
2965 VexEmNote;
2966--- a/VEX/pub/libvex.h
2967+++ b/VEX/pub/libvex.h
2968@@ -167,7 +167,7 @@
2969 #define VEX_HWCAPS_S390X_MSA5 (1<<19) /* message security assistance facility */
2970 #define VEX_HWCAPS_S390X_MI2 (1<<20) /* miscellaneous-instruction-extensions facility 2 */
2971 #define VEX_HWCAPS_S390X_LSC2 (1<<21) /* Conditional load/store facility2 */
2972-
2973+#define VEX_HWCAPS_S390X_VXE (1<<22) /* Vector-enhancements facility */
2974
2975 /* Special value representing all available s390x hwcaps */
2976 #define VEX_HWCAPS_S390X_ALL (VEX_HWCAPS_S390X_LDISP | \
2977@@ -185,7 +185,8 @@
2978 VEX_HWCAPS_S390X_VX | \
2979 VEX_HWCAPS_S390X_MSA5 | \
2980 VEX_HWCAPS_S390X_MI2 | \
2981- VEX_HWCAPS_S390X_LSC2)
2982+ VEX_HWCAPS_S390X_LSC2 | \
2983+ VEX_HWCAPS_S390X_VXE)
2984
2985 #define VEX_HWCAPS_S390X(x) ((x) & ~VEX_S390X_MODEL_MASK)
2986 #define VEX_S390X_MODEL(x) ((x) & VEX_S390X_MODEL_MASK)
diff --git a/debian/patches/lp-1825343-Bug-428648-s390x-Force-12-bit-amode-for-vector-loads.patch b/debian/patches/lp-1825343-Bug-428648-s390x-Force-12-bit-amode-for-vector-loads.patch
new file mode 100644
index 0000000..a62098a
--- /dev/null
+++ b/debian/patches/lp-1825343-Bug-428648-s390x-Force-12-bit-amode-for-vector-loads.patch
@@ -0,0 +1,45 @@
1From ba73f8d2ebe4b5fe8163ee5ab806f0e50961ebdf Mon Sep 17 00:00:00 2001
2From: Andreas Arnez <arnez@linux.ibm.com>
3Date: Tue, 3 Nov 2020 18:17:30 +0100
4Subject: [PATCH] Bug 428648 - s390x: Force 12-bit amode for vector loads in isel
5
6Similar to Bug 417452, where the instruction selector sometimes attempted
7to generate vector stores with a 20-bit displacement, the same problem has
8now been reported with vector loads.
9
10The problem is caused in s390_isel_vec_expr_wrk(), where the addressing
11mode is generated with s390_isel_amode() instead of
12s390_isel_amode_short(). This is fixed.
13
14Author: Andreas Arnez <arnez@linux.ibm.com>
15Origin: backport, https://sourceware.org/git/?p=valgrind.git;a=commit;h=ba73f8d2e
16Bug-IBM: IBM Bugzilla 163660
17Bug-Ubuntu: https://bugs.launchpad.net/bugs/1825343
18Applied-Upstream: > v3.16.1
19Reviewed-by: Frank Heimes <frank.heimes@canonical.com>
20Last-Update: 2021-02-10
21
22---
23 NEWS | 1 +
24 VEX/priv/host_s390_isel.c | 2 +-
25 2 files changed, 3 insertions(+), 1 deletion(-)
26--- a/NEWS
27+++ b/NEWS
28@@ -1,4 +1,6 @@
29
30+428648 s390_emit_load_mem panics due to 20-bit offset for vector load
31+
32 Release 3.16.1 (22 June 2020)
33 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
34
35--- a/VEX/priv/host_s390_isel.c
36+++ b/VEX/priv/host_s390_isel.c
37@@ -3741,7 +3741,7 @@
38 /* --------- LOAD --------- */
39 case Iex_Load: {
40 HReg dst = newVRegV(env);
41- s390_amode *am = s390_isel_amode(env, expr->Iex.Load.addr);
42+ s390_amode *am = s390_isel_amode_short(env, expr->Iex.Load.addr);
43
44 if (expr->Iex.Load.end != Iend_BE)
45 goto irreducible;
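
The commit message above turns on the difference between 12-bit and 20-bit displacements. As a rough illustration only (this is not Valgrind code, and the helper names `fits_disp12`, `fits_disp20` and `needs_short_amode_fixup` are hypothetical), the sketch below assumes the usual z/Architecture convention that short displacements are unsigned 12-bit values and long displacements are signed 20-bit values. It shows the kind of offset that a long-displacement addressing mode accepts but a vector load cannot encode directly:

```c
#include <stdbool.h>

/* Hypothetical helpers for illustration; they are not part of VEX. */

/* Short displacement: unsigned 12-bit, i.e. 0..4095. */
static bool fits_disp12(long d)
{
   return d >= 0 && d <= 0xFFF;
}

/* Long displacement: signed 20-bit, i.e. -524288..524287. */
static bool fits_disp20(long d)
{
   return d >= -0x80000L && d <= 0x7FFFFL;
}

/* True when an offset is encodable only as a long displacement.
   This is the case that must not reach an instruction that only
   has a 12-bit displacement field. */
static bool needs_short_amode_fixup(long d)
{
   return fits_disp20(d) && !fits_disp12(d);
}
```

Under these assumptions an offset of 4096, or any negative offset, passes the 20-bit check but fails the 12-bit one, which matches the class of failure the patch describes for vector loads.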
diff --git a/debian/patches/lp-1825343-Bug-429864-s390-Use-Iop_CasCmp-to-fix-memcheck-false.patch b/debian/patches/lp-1825343-Bug-429864-s390-Use-Iop_CasCmp-to-fix-memcheck-false.patch
new file mode 100644
index 0000000..94c81f8
--- /dev/null
+++ b/debian/patches/lp-1825343-Bug-429864-s390-Use-Iop_CasCmp-to-fix-memcheck-false.patch
@@ -0,0 +1,155 @@
1From 5adeafad7a60b63786d9545e6980de26c17cb0a6 Mon Sep 17 00:00:00 2001
2From: Andreas Arnez <arnez@linux.ibm.com>
3Date: Thu, 3 Dec 2020 18:32:45 +0100
4Subject: [PATCH] Bug 429864 - s390: Use Iop_CasCmp* to fix memcheck false
5 positives
6
7Compare-and-swap instructions can cause memcheck false positives when
8operating on partially uninitialized data. An example is where a 1-byte
9lock is allocated on the stack and then manipulated using CS on the
10surrounding word. This is correct, and the uninitialized data has no
11influence on the result, but memcheck still complains.
12
13This is caused by logic in the s390 backend, where the expected and actual
14memory values are compared using Iop_Sub32. Fix this by using
15Iop_CasCmpNE32 instead.
16
17Author: Andreas Arnez <arnez@linux.ibm.com>
18Origin: backport, https://sourceware.org/git/?p=valgrind.git;a=commit;h=5adeafad7
19Bug-IBM: IBM Bugzilla 163660
20Bug-Ubuntu: https://bugs.launchpad.net/bugs/1825343
21Applied-Upstream: > v3.16.1
22Reviewed-by: Frank Heimes <frank.heimes@canonical.com>
23Last-Update: 2021-02-10
24
25---
26 NEWS | 2 ++
27 VEX/priv/guest_s390_toIR.c | 31 ++++++++++++++-----------------
28 2 files changed, 16 insertions(+), 17 deletions(-)
29
30--- a/NEWS
31+++ b/NEWS
32@@ -1,5 +1,7 @@
33
34 428648 s390_emit_load_mem panics due to 20-bit offset for vector load
35+429864 s390x: C++ atomic test_and_set yields false-positive memcheck
36+ diagnostics
37
38 Release 3.16.1 (22 June 2020)
39 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
40--- a/VEX/priv/guest_s390_toIR.c
41+++ b/VEX/priv/guest_s390_toIR.c
42@@ -742,6 +742,9 @@
43 case Ity_I8:
44 expr = unop(sign_extend ? Iop_8Sto64 : Iop_8Uto64, expr);
45 break;
46+ case Ity_I1:
47+ expr = unop(sign_extend ? Iop_1Sto64 : Iop_1Uto64, expr);
48+ break;
49 default:
50 vpanic("s390_cc_widen");
51 }
52@@ -7417,7 +7420,7 @@
53
54 /* If old_mem contains the expected value, then the CAS succeeded.
55 Otherwise, it did not */
56- yield_if(binop(Iop_CmpNE32, mkexpr(old_mem), mkexpr(op2)));
57+ yield_if(binop(Iop_CasCmpNE32, mkexpr(old_mem), mkexpr(op2)));
58 put_gpr_w1(r1, mkexpr(old_mem));
59 }
60
61@@ -7451,7 +7454,7 @@
62
63 /* If old_mem contains the expected value, then the CAS succeeded.
64 Otherwise, it did not */
65- yield_if(binop(Iop_CmpNE64, mkexpr(old_mem), mkexpr(op2)));
66+ yield_if(binop(Iop_CasCmpNE64, mkexpr(old_mem), mkexpr(op2)));
67 put_gpr_dw0(r1, mkexpr(old_mem));
68 }
69
70@@ -7481,7 +7484,7 @@
71
72 /* If old_mem contains the expected value, then the CAS succeeded.
73 Otherwise, it did not */
74- yield_if(binop(Iop_CmpNE32, mkexpr(old_mem), mkexpr(op2)));
75+ yield_if(binop(Iop_CasCmpNE32, mkexpr(old_mem), mkexpr(op2)));
76 put_gpr_w1(r1, mkexpr(old_mem));
77 }
78
79@@ -13864,7 +13867,6 @@
80 IRTemp op1 = newTemp(Ity_I32);
81 IRTemp old_mem = newTemp(Ity_I32);
82 IRTemp op3 = newTemp(Ity_I32);
83- IRTemp result = newTemp(Ity_I32);
84 IRTemp nequal = newTemp(Ity_I1);
85
86 assign(op1, get_gpr_w1(r1));
87@@ -13879,12 +13881,11 @@
88 stmt(IRStmt_CAS(cas));
89
90 /* Set CC. Operands compared equal -> 0, else 1. */
91- assign(result, binop(Iop_Sub32, mkexpr(op1), mkexpr(old_mem)));
92- s390_cc_thunk_put1(S390_CC_OP_BITWISE, result, False);
93+ assign(nequal, binop(Iop_CasCmpNE32, mkexpr(op1), mkexpr(old_mem)));
94+ s390_cc_thunk_put1(S390_CC_OP_BITWISE, nequal, True);
95
96 /* If operands were equal (cc == 0) just store the old value op1 in r1.
97 Otherwise, store the old_value from memory in r1 and yield. */
98- assign(nequal, binop(Iop_CmpNE32, s390_call_calculate_cc(), mkU32(0)));
99 put_gpr_w1(r1, mkite(mkexpr(nequal), mkexpr(old_mem), mkexpr(op1)));
100 yield_if(mkexpr(nequal));
101 }
102@@ -13912,7 +13913,6 @@
103 IRTemp op1 = newTemp(Ity_I64);
104 IRTemp old_mem = newTemp(Ity_I64);
105 IRTemp op3 = newTemp(Ity_I64);
106- IRTemp result = newTemp(Ity_I64);
107 IRTemp nequal = newTemp(Ity_I1);
108
109 assign(op1, get_gpr_dw0(r1));
110@@ -13927,12 +13927,11 @@
111 stmt(IRStmt_CAS(cas));
112
113 /* Set CC. Operands compared equal -> 0, else 1. */
114- assign(result, binop(Iop_Sub64, mkexpr(op1), mkexpr(old_mem)));
115- s390_cc_thunk_put1(S390_CC_OP_BITWISE, result, False);
116+ assign(nequal, binop(Iop_CasCmpNE64, mkexpr(op1), mkexpr(old_mem)));
117+ s390_cc_thunk_put1(S390_CC_OP_BITWISE, nequal, True);
118
119 /* If operands were equal (cc == 0) just store the old value op1 in r1.
120 Otherwise, store the old_value from memory in r1 and yield. */
121- assign(nequal, binop(Iop_CmpNE32, s390_call_calculate_cc(), mkU32(0)));
122 put_gpr_dw0(r1, mkite(mkexpr(nequal), mkexpr(old_mem), mkexpr(op1)));
123 yield_if(mkexpr(nequal));
124
125@@ -13950,7 +13949,6 @@
126 IRTemp old_mem_low = newTemp(Ity_I32);
127 IRTemp op3_high = newTemp(Ity_I32);
128 IRTemp op3_low = newTemp(Ity_I32);
129- IRTemp result = newTemp(Ity_I32);
130 IRTemp nequal = newTemp(Ity_I1);
131
132 assign(op1_high, get_gpr_w1(r1));
133@@ -13967,18 +13965,17 @@
134 stmt(IRStmt_CAS(cas));
135
136 /* Set CC. Operands compared equal -> 0, else 1. */
137- assign(result, unop(Iop_1Uto32,
138- binop(Iop_CmpNE32,
139+ assign(nequal,
140+ binop(Iop_CasCmpNE32,
141 binop(Iop_Or32,
142 binop(Iop_Xor32, mkexpr(op1_high), mkexpr(old_mem_high)),
143 binop(Iop_Xor32, mkexpr(op1_low), mkexpr(old_mem_low))),
144- mkU32(0))));
145+ mkU32(0)));
146
147- s390_cc_thunk_put1(S390_CC_OP_BITWISE, result, False);
148+ s390_cc_thunk_put1(S390_CC_OP_BITWISE, nequal, True);
149
150 /* If operands were equal (cc == 0) just store the old value op1 in r1.
151 Otherwise, store the old_value from memory in r1 and yield. */
152- assign(nequal, binop(Iop_CmpNE32, s390_call_calculate_cc(), mkU32(0)));
153 put_gpr_w1(r1, mkite(mkexpr(nequal), mkexpr(old_mem_high), mkexpr(op1_high)));
154 put_gpr_w1(r1+1, mkite(mkexpr(nequal), mkexpr(old_mem_low), mkexpr(op1_low)));
155 yield_if(mkexpr(nequal));
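
The pattern described in the commit message above, a 1-byte lock updated through a compare-and-swap on the surrounding word, might look roughly like the following C11 sketch. It is an illustration only: `try_lock_byte`, `LOCK_MASK` and `LOCK_HELD` are made-up names, the position of the lock byte inside the word is an assumption, and whether a compiler emits CS for it depends on the target. In the scenario from the bug report the three neighbouring bytes would be uninitialized; here the caller is expected to initialize the whole word so that the example stays well defined.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: the lock occupies the top byte of a 32-bit word;
   the remaining three bytes belong to unrelated data. */
#define LOCK_MASK 0xFF000000u
#define LOCK_HELD 0x01000000u

/* Try to take the byte-sized lock with a word-sized compare-and-swap.
   Returns true if this call acquired the lock. */
static bool try_lock_byte(_Atomic uint32_t *word)
{
   uint32_t expected = atomic_load(word);

   if (expected & LOCK_MASK)
      return false;                      /* lock byte already set */

   uint32_t desired = expected | LOCK_HELD;

   /* Succeeds only if the whole word still equals 'expected'. The three
      non-lock bytes are carried over unchanged, so they take part in the
      comparison without deciding whether the lock byte was free. */
   return atomic_compare_exchange_strong(word, &expected, desired);
}
```

A caller would initialize the word, call `try_lock_byte` once to acquire the lock, and see a second call return false while the lock byte stays set.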
diff --git a/debian/patches/series b/debian/patches/series
index bc89f83..36cbddd 100644
--- a/debian/patches/series
+++ b/debian/patches/series
@@ -9,3 +9,6 @@
 13_fix-path-to-vgdb.patch
 14_fix-debuginfo-section-duplicates-a-section-in-the-main-ELF-file.patch
 armv7-illegal-opcode.patch
+lp-1825343-Bug-428648-s390x-Force-12-bit-amode-for-vector-loads.patch
+lp-1825343-Bug-429864-s390-Use-Iop_CasCmp-to-fix-memcheck-false.patch
+lp-1825343-Bug-404076-s390x-Implement-z14-vector-instructions.patch
