Merge ~fheimes/ubuntu/+source/valgrind:valgrind-lp1825343-hirsute into ubuntu/+source/valgrind:ubuntu/hirsute-devel
Status: | Merged |
---|---|
Approved by: | Christian Ehrhardt |
Approved revision: | 9ce66c4fc97a353855c5cd5bf496d03e42867fda |
Merged at revision: | 9ce66c4fc97a353855c5cd5bf496d03e42867fda |
Proposed branch: | ~fheimes/ubuntu/+source/valgrind:valgrind-lp1825343-hirsute |
Merge into: | ubuntu/+source/valgrind:ubuntu/hirsute-devel |
Diff against target: | 3232 lines (+3198/-0), 5 files modified |
 | debian/changelog (+9/-0) |
 | debian/patches/lp-1825343-Bug-404076-s390x-Implement-z14-vector-instructions.patch (+2986/-0) |
 | debian/patches/lp-1825343-Bug-428648-s390x-Force-12-bit-amode-for-vector-loads.patch (+45/-0) |
 | debian/patches/lp-1825343-Bug-429864-s390-Use-Iop_CasCmp-to-fix-memcheck-false.patch (+155/-0) |
 | debian/patches/series (+3/-0) |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Christian Ehrhardt (community) | Approve | ||
Review via email: mp+397860@code.launchpad.net |
Commit message
Description of the change
valgrind-
add support for IBM z14 instructions to Valgrind
debian/
debian/changelog
backported three commits from valgrind > v3.16.1
Thanks to Andreas Arnez (LP: #1825343)
One patch needed to be modified to skip the following two files:
- docs/internals/
- auxprogs/
since these files are not included in the upstream 3.16.1 release tarball and are therefore also absent from the Ubuntu package '3.16.1-1ubuntu1'.
Test build is available here:
https:/
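For orientation, the hwcap change that the first backported patch makes in coregrind/m_initimg/initimg-linux.c can be sketched in isolation as below. This is only an illustration, not Valgrind code: the three constants are copied from the patched include/vki/vki-s390x-linux.h, while the helper name `filter_s390x_hwcap` and its standalone form are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in include/vki/vki-s390x-linux.h after the patch. */
#define VKI_HWCAP_S390_TE        1024
#define VKI_HWCAP_S390_VXRS      2048
#define VKI_HWCAP_S390_VXRS_EXT  8192

/* Hypothetical helper (not part of Valgrind): given the hwcap value
   reported by the host, return the value that would be advertised
   to the guest program. */
static uint64_t filter_s390x_hwcap(uint64_t host_hwcap)
{
   /* TE is a single bit, so (TE - 1) selects every bit below it. */
   assert((VKI_HWCAP_S390_TE & (VKI_HWCAP_S390_TE - 1)) == 0);

   /* Keep everything below TE plus the explicitly listed vector bits;
      TE itself and any other higher bit is masked out. */
   const uint64_t allowed = (VKI_HWCAP_S390_TE - 1)
                            | VKI_HWCAP_S390_VXRS
                            | VKI_HWCAP_S390_VXRS_EXT;
   return host_hwcap & allowed;
}
```

With this mask, a host value that has TE (1024), VXRS (2048), the bit with value 4096 and VXRS_EXT (8192) all set is reduced to VXRS | VXRS_EXT. The 4096 bit is dropped because the patch does not list it, which matches the commit message's statement that the vector-packed-decimal facility is only partly covered.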
Christian Ehrhardt (paelzer) wrote:
To ssh://git.
* [new tag] upload/
Uploading to ubuntu (via ftp to upload.ubuntu.com):
Uploading valgrind_
Uploading valgrind_
Uploading valgrind_
Uploading valgrind_
Successfully uploaded packages.
Frank Heimes (fheimes) wrote:
Many thx for reviewing, commenting, sponsoring, uploading and your overall support on this!
Preview Diff
1 | diff --git a/debian/changelog b/debian/changelog |
2 | index c669e48..9b0d8a8 100644 |
3 | --- a/debian/changelog |
4 | +++ b/debian/changelog |
5 | @@ -1,3 +1,12 @@ |
6 | +valgrind (1:3.16.1-1ubuntu2) hirsute; urgency=medium |
7 | + |
8 | + * debian/patches/lp-1825343-Bug-404076-s390*.patches |
9 | + adding support for IBM z14 instructions to Valgrind |
10 | + backported three commits from valgrind > v3.16.1 |
11 | + Thanks to Andreas Arnez (LP: #1825343) |
12 | + |
13 | + -- Frank Heimes <frank.heimes@canonical.com> Wed, 10 Feb 2021 20:10:24 +0100 |
14 | + |
15 | valgrind (1:3.16.1-1ubuntu1) groovy; urgency=low |
16 | |
17 | * Merge from Debian unstable. Remaining changes: |
18 | diff --git a/debian/patches/lp-1825343-Bug-404076-s390x-Implement-z14-vector-instructions.patch b/debian/patches/lp-1825343-Bug-404076-s390x-Implement-z14-vector-instructions.patch |
19 | new file mode 100644 |
20 | index 0000000..fa985b9 |
21 | --- /dev/null |
22 | +++ b/debian/patches/lp-1825343-Bug-404076-s390x-Implement-z14-vector-instructions.patch |
23 | @@ -0,0 +1,2986 @@ |
24 | +From 159f132289160ab1a5a5cf4da14fb57ecdb248ca Mon Sep 17 00:00:00 2001 |
25 | +From: Andreas Arnez <arnez@linux.ibm.com> |
26 | +Date: Mon, 7 Dec 2020 20:01:26 +0100 |
27 | +Subject: [PATCH] Bug 404076 - s390x: Implement z14 vector instructions |
28 | + |
29 | +Implement the new instructions/features that were added to z/Architecture |
30 | +with the vector-enhancements facility 1. Also cover the instructions from |
31 | +the vector-packed-decimal facility that are defined outside the chapter |
32 | +"Vector Decimal Instructions", but not the ones from that chapter itself. |
33 | + |
34 | +For a detailed list of newly supported instructions see the updates to |
35 | +`docs/internals/s390-opcodes.csv'. |
36 | + |
37 | +Since the miscellaneous instruction extensions facility 2 was already |
38 | +addressed by Bug 404406, this completes the support necessary to run |
39 | +general programs built with `--march=z14' under Valgrind. The |
40 | +vector-packed-decimal facility is currently not exploited by the standard |
41 | +toolchain and libraries. |
42 | + |
43 | +Author: Andreas Arnez <arnez@linux.ibm.com> |
44 | +Origin: backport, https://sourceware.org/git/?p=valgrind.git;a=commit;h=159f13228 |
45 | +Bug-IBM: IBM Bugzilla 163660 |
46 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1825343 |
47 | +Applied-Upstream: > v3.16.1 |
48 | +Reviewed-by: Frank Heimes <frank.heimes@canonical.com> |
49 | +Last-Update: 2021-02-10 |
50 | + |
51 | +--- |
52 | +--- a/coregrind/m_initimg/initimg-linux.c |
53 | ++++ b/coregrind/m_initimg/initimg-linux.c |
54 | +@@ -697,9 +697,13 @@ |
55 | + } |
56 | + # elif defined(VGP_s390x_linux) |
57 | + { |
58 | +- /* Advertise hardware features "below" TE and VXRS. TE itself |
59 | +- and anything above VXRS is not supported by Valgrind. */ |
60 | +- auxv->u.a_val &= (VKI_HWCAP_S390_TE - 1) | VKI_HWCAP_S390_VXRS; |
61 | ++ /* Out of the hardware features available on the platform, |
62 | ++ advertise those "below" TE, as well as the ones explicitly |
63 | ++ ORed in the expression below. Anything else, such as TE |
64 | ++ itself, is not supported by Valgrind. */ |
65 | ++ auxv->u.a_val &= ((VKI_HWCAP_S390_TE - 1) |
66 | ++ | VKI_HWCAP_S390_VXRS |
67 | ++ | VKI_HWCAP_S390_VXRS_EXT); |
68 | + } |
69 | + # elif defined(VGP_arm64_linux) |
70 | + { |
71 | +--- a/coregrind/m_machine.c |
72 | ++++ b/coregrind/m_machine.c |
73 | +@@ -1544,6 +1544,7 @@ |
74 | + { False, S390_FAC_MSA5, VEX_HWCAPS_S390X_MSA5, "MSA5" }, |
75 | + { False, S390_FAC_MI2, VEX_HWCAPS_S390X_MI2, "MI2" }, |
76 | + { False, S390_FAC_LSC2, VEX_HWCAPS_S390X_LSC2, "LSC2" }, |
77 | ++ { False, S390_FAC_VXE, VEX_HWCAPS_S390X_VXE, "VXE" }, |
78 | + }; |
79 | + |
80 | + /* Set hwcaps according to the detected facilities */ |
81 | +--- a/include/vki/vki-s390x-linux.h |
82 | ++++ b/include/vki/vki-s390x-linux.h |
83 | +@@ -806,6 +806,7 @@ |
84 | + |
85 | + #define VKI_HWCAP_S390_TE 1024 |
86 | + #define VKI_HWCAP_S390_VXRS 2048 |
87 | ++#define VKI_HWCAP_S390_VXRS_EXT 8192 |
88 | + |
89 | + |
90 | + //---------------------------------------------------------------------- |
91 | +--- a/NEWS |
92 | ++++ b/NEWS |
93 | +@@ -2,6 +2,7 @@ |
94 | + 428648 s390_emit_load_mem panics due to 20-bit offset for vector load |
95 | + 429864 s390x: C++ atomic test_and_set yields false-positive memcheck |
96 | + diagnostics |
97 | ++404076 s390x: z14 vector instructions not implemented |
98 | + |
99 | + Release 3.16.1 (22 June 2020) |
100 | + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
101 | +--- a/none/tests/s390x/vector_float.c |
102 | ++++ b/none/tests/s390x/vector_float.c |
103 | +@@ -114,50 +114,59 @@ |
104 | + test_with_selective_printing(vldeb, (V128_V_RES_AS_FLOAT64 | |
105 | + V128_V_ARG1_AS_FLOAT64)); |
106 | + test_with_selective_printing(wldeb, (V128_V_RES_AS_FLOAT64 | |
107 | +- V128_V_ARG1_AS_FLOAT64)); |
108 | ++ V128_V_ARG1_AS_FLOAT64 | |
109 | ++ V128_V_RES_ZERO_ONLY)); |
110 | + |
111 | + test_with_selective_printing(vflcdb, (V128_V_RES_AS_FLOAT64 | |
112 | + V128_V_ARG1_AS_FLOAT64)); |
113 | + test_with_selective_printing(wflcdb, (V128_V_RES_AS_FLOAT64 | |
114 | +- V128_V_ARG1_AS_FLOAT64)); |
115 | ++ V128_V_ARG1_AS_FLOAT64 | |
116 | ++ V128_V_RES_ZERO_ONLY)); |
117 | + test_with_selective_printing(vflndb, (V128_V_RES_AS_FLOAT64 | |
118 | + V128_V_ARG1_AS_FLOAT64)); |
119 | + test_with_selective_printing(wflndb, (V128_V_RES_AS_FLOAT64 | |
120 | +- V128_V_ARG1_AS_FLOAT64)); |
121 | ++ V128_V_ARG1_AS_FLOAT64 | |
122 | ++ V128_V_RES_ZERO_ONLY)); |
123 | + test_with_selective_printing(vflpdb, (V128_V_RES_AS_FLOAT64 | |
124 | + V128_V_ARG1_AS_FLOAT64)); |
125 | + test_with_selective_printing(wflpdb, (V128_V_RES_AS_FLOAT64 | |
126 | +- V128_V_ARG1_AS_FLOAT64)); |
127 | ++ V128_V_ARG1_AS_FLOAT64 | |
128 | ++ V128_V_RES_ZERO_ONLY)); |
129 | + |
130 | + test_with_selective_printing(vfadb, (V128_V_RES_AS_FLOAT64 | |
131 | + V128_V_ARG1_AS_FLOAT64 | |
132 | + V128_V_ARG2_AS_FLOAT64)); |
133 | + test_with_selective_printing(wfadb, (V128_V_RES_AS_FLOAT64 | |
134 | + V128_V_ARG1_AS_FLOAT64 | |
135 | +- V128_V_ARG2_AS_FLOAT64)); |
136 | ++ V128_V_ARG2_AS_FLOAT64 | |
137 | ++ V128_V_RES_ZERO_ONLY)); |
138 | + test_with_selective_printing(vfsdb, (V128_V_RES_AS_FLOAT64 | |
139 | + V128_V_ARG1_AS_FLOAT64 | |
140 | + V128_V_ARG2_AS_FLOAT64)); |
141 | + test_with_selective_printing(wfsdb, (V128_V_RES_AS_FLOAT64 | |
142 | + V128_V_ARG1_AS_FLOAT64 | |
143 | +- V128_V_ARG2_AS_FLOAT64)); |
144 | ++ V128_V_ARG2_AS_FLOAT64 | |
145 | ++ V128_V_RES_ZERO_ONLY)); |
146 | + test_with_selective_printing(vfmdb, (V128_V_RES_AS_FLOAT64 | |
147 | + V128_V_ARG1_AS_FLOAT64 | |
148 | + V128_V_ARG2_AS_FLOAT64)); |
149 | + test_with_selective_printing(wfmdb, (V128_V_RES_AS_FLOAT64 | |
150 | + V128_V_ARG1_AS_FLOAT64 | |
151 | +- V128_V_ARG2_AS_FLOAT64)); |
152 | ++ V128_V_ARG2_AS_FLOAT64 | |
153 | ++ V128_V_RES_ZERO_ONLY)); |
154 | + test_with_selective_printing(vfddb, (V128_V_RES_AS_FLOAT64 | |
155 | + V128_V_ARG1_AS_FLOAT64 | |
156 | + V128_V_ARG2_AS_FLOAT64)); |
157 | + test_with_selective_printing(wfddb, (V128_V_RES_AS_FLOAT64 | |
158 | + V128_V_ARG1_AS_FLOAT64 | |
159 | +- V128_V_ARG2_AS_FLOAT64)); |
160 | ++ V128_V_ARG2_AS_FLOAT64 | |
161 | ++ V128_V_RES_ZERO_ONLY)); |
162 | + |
163 | + test_with_selective_printing(vfsqdb, (V128_V_RES_AS_FLOAT64 | |
164 | + V128_V_ARG1_AS_FLOAT64)); |
165 | + test_with_selective_printing(wfsqdb, (V128_V_RES_AS_FLOAT64 | |
166 | +- V128_V_ARG1_AS_FLOAT64)); |
167 | ++ V128_V_ARG1_AS_FLOAT64 | |
168 | ++ V128_V_RES_ZERO_ONLY)); |
169 | + |
170 | + test_with_selective_printing(vfmadb, (V128_V_RES_AS_FLOAT64 | |
171 | + V128_V_ARG1_AS_FLOAT64 | |
172 | +@@ -166,7 +175,8 @@ |
173 | + test_with_selective_printing(wfmadb, (V128_V_RES_AS_FLOAT64 | |
174 | + V128_V_ARG1_AS_FLOAT64 | |
175 | + V128_V_ARG2_AS_FLOAT64 | |
176 | +- V128_V_ARG3_AS_FLOAT64)); |
177 | ++ V128_V_ARG3_AS_FLOAT64 | |
178 | ++ V128_V_RES_ZERO_ONLY)); |
179 | + test_with_selective_printing(vfmsdb, (V128_V_RES_AS_FLOAT64 | |
180 | + V128_V_ARG1_AS_FLOAT64 | |
181 | + V128_V_ARG2_AS_FLOAT64 | |
182 | +@@ -174,21 +184,25 @@ |
183 | + test_with_selective_printing(wfmsdb, (V128_V_RES_AS_FLOAT64 | |
184 | + V128_V_ARG1_AS_FLOAT64 | |
185 | + V128_V_ARG2_AS_FLOAT64 | |
186 | +- V128_V_ARG3_AS_FLOAT64)); |
187 | ++ V128_V_ARG3_AS_FLOAT64 | |
188 | ++ V128_V_RES_ZERO_ONLY)); |
189 | + |
190 | + test_with_selective_printing(wfcdb, (V128_V_ARG1_AS_FLOAT64 | |
191 | + V128_V_ARG2_AS_FLOAT64 | |
192 | +- V128_R_RES)); |
193 | ++ V128_R_RES | |
194 | ++ V128_V_RES_ZERO_ONLY)); |
195 | + test_with_selective_printing(wfkdb, (V128_V_ARG1_AS_FLOAT64 | |
196 | + V128_V_ARG2_AS_FLOAT64 | |
197 | +- V128_R_RES)); |
198 | ++ V128_R_RES | |
199 | ++ V128_V_RES_ZERO_ONLY)); |
200 | + |
201 | + test_with_selective_printing(vfcedb, (V128_V_RES_AS_INT | |
202 | + V128_V_ARG1_AS_FLOAT64 | |
203 | + V128_V_ARG2_AS_FLOAT64)); |
204 | + test_with_selective_printing(wfcedb, (V128_V_RES_AS_INT | |
205 | + V128_V_ARG1_AS_FLOAT64 | |
206 | +- V128_V_ARG2_AS_FLOAT64)); |
207 | ++ V128_V_ARG2_AS_FLOAT64 | |
208 | ++ V128_V_RES_ZERO_ONLY)); |
209 | + test_with_selective_printing(vfcedbs, (V128_V_RES_AS_INT | |
210 | + V128_V_ARG1_AS_FLOAT64 | |
211 | + V128_V_ARG2_AS_FLOAT64 | |
212 | +@@ -196,14 +210,16 @@ |
213 | + test_with_selective_printing(wfcedbs, (V128_V_RES_AS_INT | |
214 | + V128_V_ARG1_AS_FLOAT64 | |
215 | + V128_V_ARG2_AS_FLOAT64 | |
216 | +- V128_R_RES)); |
217 | ++ V128_R_RES | |
218 | ++ V128_V_RES_ZERO_ONLY)); |
219 | + |
220 | + test_with_selective_printing(vfchdb, (V128_V_RES_AS_INT | |
221 | + V128_V_ARG1_AS_FLOAT64 | |
222 | + V128_V_ARG2_AS_FLOAT64)); |
223 | + test_with_selective_printing(wfchdb, (V128_V_RES_AS_INT | |
224 | + V128_V_ARG1_AS_FLOAT64 | |
225 | +- V128_V_ARG2_AS_FLOAT64)); |
226 | ++ V128_V_ARG2_AS_FLOAT64 | |
227 | ++ V128_V_RES_ZERO_ONLY)); |
228 | + test_with_selective_printing(vfchdbs, (V128_V_RES_AS_INT | |
229 | + V128_V_ARG1_AS_FLOAT64 | |
230 | + V128_V_ARG2_AS_FLOAT64 | |
231 | +@@ -211,14 +227,16 @@ |
232 | + test_with_selective_printing(wfchdbs, (V128_V_RES_AS_INT | |
233 | + V128_V_ARG1_AS_FLOAT64 | |
234 | + V128_V_ARG2_AS_FLOAT64 | |
235 | +- V128_R_RES)); |
236 | ++ V128_R_RES | |
237 | ++ V128_V_RES_ZERO_ONLY)); |
238 | + |
239 | + test_with_selective_printing(vfchedb, (V128_V_RES_AS_INT | |
240 | + V128_V_ARG1_AS_FLOAT64 | |
241 | + V128_V_ARG2_AS_FLOAT64)); |
242 | + test_with_selective_printing(wfchedb, (V128_V_RES_AS_INT | |
243 | + V128_V_ARG1_AS_FLOAT64 | |
244 | +- V128_V_ARG2_AS_FLOAT64)); |
245 | ++ V128_V_ARG2_AS_FLOAT64 | |
246 | ++ V128_V_RES_ZERO_ONLY)); |
247 | + test_with_selective_printing(vfchedbs, (V128_V_RES_AS_INT | |
248 | + V128_V_ARG1_AS_FLOAT64 | |
249 | + V128_V_ARG2_AS_FLOAT64 | |
250 | +@@ -226,7 +244,8 @@ |
251 | + test_with_selective_printing(wfchedbs, (V128_V_RES_AS_INT | |
252 | + V128_V_ARG1_AS_FLOAT64 | |
253 | + V128_V_ARG2_AS_FLOAT64 | |
254 | +- V128_R_RES)); |
255 | ++ V128_R_RES | |
256 | ++ V128_V_RES_ZERO_ONLY)); |
257 | + |
258 | + test_with_selective_printing(vftcidb0, (V128_V_RES_AS_INT | |
259 | + V128_V_ARG1_AS_FLOAT64 | |
260 | +--- a/none/tests/s390x/vector_float.stdout.exp |
261 | ++++ b/none/tests/s390x/vector_float.stdout.exp |
262 | +@@ -419,88 +419,88 @@ |
263 | + v_result = 7fffffffffffffff | 7fffffffffffffff |
264 | + v_arg1 = 0x1.fed2f087c21p+341 | 0x1.180e4c1d87fc4p+682 |
265 | + insn wcgdb00: |
266 | +- v_result = 7fffffffffffffff | 0000000000000000 |
267 | ++ v_result = 7fffffffffffffff | -- |
268 | + v_arg1 = 0x1.d7fd9222e8b86p+670 | 0x1.c272612672a3p+798 |
269 | + insn wcgdb00: |
270 | +- v_result = 0000000000000000 | 0000000000000000 |
271 | ++ v_result = 0000000000000000 | -- |
272 | + v_arg1 = 0x1.745cd360987e5p-496 | -0x1.f3b404919f358p-321 |
273 | + insn wcgdb00: |
274 | +- v_result = 8000000000000000 | 0000000000000000 |
275 | ++ v_result = 8000000000000000 | -- |
276 | + v_arg1 = -0x1.9523565cd92d5p+643 | 0x1.253677d6d3be2p-556 |
277 | + insn wcgdb00: |
278 | +- v_result = 7fffffffffffffff | 0000000000000000 |
279 | ++ v_result = 7fffffffffffffff | -- |
280 | + v_arg1 = 0x1.b6eb576ec3e6ap+845 | -0x1.c7e102c503d91p+266 |
281 | + insn wcgdb01: |
282 | +- v_result = 0000000000000000 | 0000000000000000 |
283 | ++ v_result = 0000000000000000 | -- |
284 | + v_arg1 = -0x1.3d4319841f4d6p-1011 | -0x1.2feabf7dfc506p-680 |
285 | + insn wcgdb01: |
286 | +- v_result = 0000000000000000 | 0000000000000000 |
287 | ++ v_result = 0000000000000000 | -- |
288 | + v_arg1 = -0x1.6fb8d1cd8b32cp-843 | -0x1.50f6a6922f97ep+33 |
289 | + insn wcgdb01: |
290 | +- v_result = 0000000000000000 | 0000000000000000 |
291 | ++ v_result = 0000000000000000 | -- |
292 | + v_arg1 = -0x1.64a673daccf1ap-566 | -0x1.69ef9b1d01499p+824 |
293 | + insn wcgdb01: |
294 | +- v_result = 8000000000000000 | 0000000000000000 |
295 | ++ v_result = 8000000000000000 | -- |
296 | + v_arg1 = -0x1.3e2ddd862b4adp+1005 | -0x1.312466410271p+184 |
297 | + insn wcgdb03: |
298 | +- v_result = 0000000000000001 | 0000000000000000 |
299 | ++ v_result = 0000000000000001 | -- |
300 | + v_arg1 = 0x1.d594c3412a11p-953 | -0x1.a07393d34d77cp-224 |
301 | + insn wcgdb03: |
302 | +- v_result = 8000000000000000 | 0000000000000000 |
303 | ++ v_result = 8000000000000000 | -- |
304 | + v_arg1 = -0x1.f7a0dbcfd6e4cp+104 | -0x1.40f7cde7f2214p-702 |
305 | + insn wcgdb03: |
306 | +- v_result = 8000000000000000 | 0000000000000000 |
307 | ++ v_result = 8000000000000000 | -- |
308 | + v_arg1 = -0x1.40739c1574808p+560 | -0x1.970328ddf1b6ep-374 |
309 | + insn wcgdb03: |
310 | +- v_result = 0000000000000001 | 0000000000000000 |
311 | ++ v_result = 0000000000000001 | -- |
312 | + v_arg1 = 0x1.477653afd7048p-38 | 0x1.1eac2f8b2a93cp-384 |
313 | + insn wcgdb04: |
314 | +- v_result = ffffffffe9479a7d | 0000000000000000 |
315 | ++ v_result = ffffffffe9479a7d | -- |
316 | + v_arg1 = -0x1.6b865833eff3p+28 | 0x1.06e8cf1834d0ep-722 |
317 | + insn wcgdb04: |
318 | +- v_result = 0000000000000000 | 0000000000000000 |
319 | ++ v_result = 0000000000000000 | -- |
320 | + v_arg1 = 0x1.eef0b2294a5cp-544 | -0x1.8e8b133ccda15p+752 |
321 | + insn wcgdb04: |
322 | +- v_result = 0000000000000000 | 0000000000000000 |
323 | ++ v_result = 0000000000000000 | -- |
324 | + v_arg1 = -0x1.f34e77e6b6698p-894 | -0x1.9f7ce1cb53bddp-896 |
325 | + insn wcgdb04: |
326 | +- v_result = 7fffffffffffffff | 0000000000000000 |
327 | ++ v_result = 7fffffffffffffff | -- |
328 | + v_arg1 = 0x1.95707a6d75db5p+1018 | -0x1.3b0c072d23011p-224 |
329 | + insn wcgdb05: |
330 | +- v_result = 0000000000000000 | 0000000000000000 |
331 | ++ v_result = 0000000000000000 | -- |
332 | + v_arg1 = -0x1.a9fb71160793p-968 | 0x1.05f601fe8123ap-986 |
333 | + insn wcgdb05: |
334 | +- v_result = 8000000000000000 | 0000000000000000 |
335 | ++ v_result = 8000000000000000 | -- |
336 | + v_arg1 = -0x1.0864159b94305p+451 | -0x1.d4647f5a78b7ep-599 |
337 | + insn wcgdb05: |
338 | +- v_result = 7fffffffffffffff | 0000000000000000 |
339 | ++ v_result = 7fffffffffffffff | -- |
340 | + v_arg1 = 0x1.37eadff8397c8p+432 | -0x1.15d896b6f6063p+464 |
341 | + insn wcgdb05: |
342 | +- v_result = 0000000000000000 | 0000000000000000 |
343 | ++ v_result = 0000000000000000 | -- |
344 | + v_arg1 = 0x1.eb0812b0d677p-781 | 0x1.3117c5e0e288cp-202 |
345 | + insn wcgdb06: |
346 | +- v_result = 0000000000000001 | 0000000000000000 |
347 | ++ v_result = 0000000000000001 | -- |
348 | + v_arg1 = 0x1.6b88069167c0fp-662 | -0x1.70571d27e1279p+254 |
349 | + insn wcgdb06: |
350 | +- v_result = 7fffffffffffffff | 0000000000000000 |
351 | ++ v_result = 7fffffffffffffff | -- |
352 | + v_arg1 = 0x1.f6a6d6e883596p+260 | 0x1.0d578afaaa34ap+604 |
353 | + insn wcgdb06: |
354 | +- v_result = 0000000000000001 | 0000000000000000 |
355 | ++ v_result = 0000000000000001 | -- |
356 | + v_arg1 = 0x1.d91c7d13c4694p-475 | -0x1.ecf1f8529767bp+830 |
357 | + insn wcgdb06: |
358 | +- v_result = 0000000000000001 | 0000000000000000 |
359 | ++ v_result = 0000000000000001 | -- |
360 | + v_arg1 = 0x1.fac8dd3bb7af6p-101 | 0x1.fb8324a00fba8p+959 |
361 | + insn wcgdb07: |
362 | +- v_result = 7fffffffffffffff | 0000000000000000 |
363 | ++ v_result = 7fffffffffffffff | -- |
364 | + v_arg1 = 0x1.4b0fa18fa73c7p+111 | -0x1.08e7b17633a49p+61 |
365 | + insn wcgdb07: |
366 | +- v_result = e636b693e39a1100 | 0000000000000000 |
367 | ++ v_result = e636b693e39a1100 | -- |
368 | + v_arg1 = -0x1.9c9496c1c65efp+60 | 0x1.c4182ee728d76p-572 |
369 | + insn wcgdb07: |
370 | +- v_result = ffffffffffffffff | 0000000000000000 |
371 | ++ v_result = ffffffffffffffff | -- |
372 | + v_arg1 = -0x1.819718032dff7p-303 | 0x1.a784c77ff6aa2p-622 |
373 | + insn wcgdb07: |
374 | +- v_result = 7fffffffffffffff | 0000000000000000 |
375 | ++ v_result = 7fffffffffffffff | -- |
376 | + v_arg1 = 0x1.978e8abfd83c2p+152 | 0x1.2531ebf451762p+315 |
377 | + insn vclgdb00: |
378 | + v_result = 0000000000000000 | 0000000000000000 |
379 | +@@ -587,88 +587,88 @@ |
380 | + v_result = 0000000000000000 | 0000000000000000 |
381 | + v_arg1 = -0x1.137bbb51f08bdp+306 | 0x1.18d2a1063356p-795 |
382 | + insn wclgdb00: |
383 | +- v_result = 0000000000000000 | 0000000000000000 |
384 | ++ v_result = 0000000000000000 | -- |
385 | + v_arg1 = -0x1.e66f55dcc2639p-1013 | -0x1.733ee56929f3bp-304 |
386 | + insn wclgdb00: |
387 | +- v_result = 0000000000000000 | 0000000000000000 |
388 | ++ v_result = 0000000000000000 | -- |
389 | + v_arg1 = 0x1.8802fd9ab740cp-986 | -0x1.64d4d2c7c145fp-1015 |
390 | + insn wclgdb00: |
391 | +- v_result = 0000000000000000 | 0000000000000000 |
392 | ++ v_result = 0000000000000000 | -- |
393 | + v_arg1 = 0x1.a67209b8c407bp-645 | -0x1.6410ff9b1c801p+487 |
394 | + insn wclgdb00: |
395 | +- v_result = 0000000000000000 | 0000000000000000 |
396 | ++ v_result = 0000000000000000 | -- |
397 | + v_arg1 = -0x1.cb2febaefeb2dp+49 | 0x1.dee368b2ec375p-502 |
398 | + insn wclgdb01: |
399 | +- v_result = 0000000000000000 | 0000000000000000 |
400 | ++ v_result = 0000000000000000 | -- |
401 | + v_arg1 = 0x1.5703db3c1b0e2p-728 | 0x1.068c4d51ea4ebp+617 |
402 | + insn wclgdb01: |
403 | +- v_result = 0000000000000000 | 0000000000000000 |
404 | ++ v_result = 0000000000000000 | -- |
405 | + v_arg1 = -0x1.ae350291e5b3ep+291 | 0x1.1b87bb09b6032p+376 |
406 | + insn wclgdb01: |
407 | +- v_result = ffffffffffffffff | 0000000000000000 |
408 | ++ v_result = ffffffffffffffff | -- |
409 | + v_arg1 = 0x1.c4666a710127ep+424 | -0x1.19e969b6c0076p+491 |
410 | + insn wclgdb01: |
411 | +- v_result = ffffffffffffffff | 0000000000000000 |
412 | ++ v_result = ffffffffffffffff | -- |
413 | + v_arg1 = 0x1.c892c5a4d103fp+105 | -0x1.d4f937cc76704p+749 |
414 | + insn wclgdb03: |
415 | +- v_result = 0000000000000001 | 0000000000000000 |
416 | ++ v_result = 0000000000000001 | -- |
417 | + v_arg1 = 0x1.81090d8fc663dp-111 | 0x1.337ec5e0f0904p+1 |
418 | + insn wclgdb03: |
419 | +- v_result = 0000000000000000 | 0000000000000000 |
420 | ++ v_result = 0000000000000000 | -- |
421 | + v_arg1 = -0x1.e787adc70b91p-593 | 0x1.db8d83196b53cp-762 |
422 | + insn wclgdb03: |
423 | +- v_result = ffffffffffffffff | 0000000000000000 |
424 | ++ v_result = ffffffffffffffff | -- |
425 | + v_arg1 = 0x1.6529307e907efp+389 | -0x1.3ea0d8d5b4dd2p+589 |
426 | + insn wclgdb03: |
427 | +- v_result = 0000000000000000 | 0000000000000000 |
428 | ++ v_result = 0000000000000000 | -- |
429 | + v_arg1 = -0x1.be701a158637p-385 | 0x1.c5a7f70cb8a09p+107 |
430 | + insn wclgdb04: |
431 | +- v_result = 0000000000000000 | 0000000000000000 |
432 | ++ v_result = 0000000000000000 | -- |
433 | + v_arg1 = -0x1.2f328571ab445p+21 | -0x1.dcc21fc82ba01p-930 |
434 | + insn wclgdb04: |
435 | +- v_result = 0000000000000000 | 0000000000000000 |
436 | ++ v_result = 0000000000000000 | -- |
437 | + v_arg1 = -0x1.06b69fcbb7bffp-415 | 0x1.6f9a13a0a827ap+915 |
438 | + insn wclgdb04: |
439 | +- v_result = 0000000000000000 | 0000000000000000 |
440 | ++ v_result = 0000000000000000 | -- |
441 | + v_arg1 = -0x1.738e549b38bcdp+479 | 0x1.a522edb999c9p-45 |
442 | + insn wclgdb04: |
443 | +- v_result = 0000000000000000 | 0000000000000000 |
444 | ++ v_result = 0000000000000000 | -- |
445 | + v_arg1 = 0x1.7f9399d2bcf3bp-215 | -0x1.7bc35f2d69a7fp+818 |
446 | + insn wclgdb05: |
447 | +- v_result = ffffffffffffffff | 0000000000000000 |
448 | ++ v_result = ffffffffffffffff | -- |
449 | + v_arg1 = 0x1.fc542bdb707f6p+880 | -0x1.8521ebc93a25fp-969 |
450 | + insn wclgdb05: |
451 | +- v_result = 1ce8d9951b8c8600 | 0000000000000000 |
452 | ++ v_result = 1ce8d9951b8c8600 | -- |
453 | + v_arg1 = 0x1.ce8d9951b8c86p+60 | 0x1.92712589230e7p+475 |
454 | + insn wclgdb05: |
455 | +- v_result = 0000000000000000 | 0000000000000000 |
456 | ++ v_result = 0000000000000000 | -- |
457 | + v_arg1 = -0x1.8a297f60a0811p-156 | 0x1.102b79043d82cp-204 |
458 | + insn wclgdb05: |
459 | +- v_result = 0000000000000000 | 0000000000000000 |
460 | ++ v_result = 0000000000000000 | -- |
461 | + v_arg1 = 0x1.beb9057e1401dp-196 | -0x1.820f18f830262p+15 |
462 | + insn wclgdb06: |
463 | +- v_result = 0000000000000001 | 0000000000000000 |
464 | ++ v_result = 0000000000000001 | -- |
465 | + v_arg1 = 0x1.c321a966ecb4dp-430 | -0x1.2f6a1a95ead99p-943 |
466 | + insn wclgdb06: |
467 | +- v_result = 0000000000000000 | 0000000000000000 |
468 | ++ v_result = 0000000000000000 | -- |
469 | + v_arg1 = -0x1.f1a86b4aed821p-56 | -0x1.1ee6717cc2d7fp-899 |
470 | + insn wclgdb06: |
471 | +- v_result = 0000000000000000 | 0000000000000000 |
472 | ++ v_result = 0000000000000000 | -- |
473 | + v_arg1 = -0x1.73ce49d89ecb9p-302 | 0x1.52663b975ed23p-716 |
474 | + insn wclgdb06: |
475 | +- v_result = 0000000000000000 | 0000000000000000 |
476 | ++ v_result = 0000000000000000 | -- |
477 | + v_arg1 = -0x1.3e9c2de97a292p+879 | 0x1.d34eed36f2eafp+960 |
478 | + insn wclgdb07: |
479 | +- v_result = 0000000000000000 | 0000000000000000 |
480 | ++ v_result = 0000000000000000 | -- |
481 | + v_arg1 = -0x1.4e6ec6ddc6a45p-632 | -0x1.6e564d0fec72bp+369 |
482 | + insn wclgdb07: |
483 | +- v_result = ffffffffffffffff | 0000000000000000 |
484 | ++ v_result = ffffffffffffffff | -- |
485 | + v_arg1 = 0x1.42e2c658e4c4dp+459 | -0x1.9f9dc0252e44p+85 |
486 | + insn wclgdb07: |
487 | +- v_result = 0000000000000000 | 0000000000000000 |
488 | ++ v_result = 0000000000000000 | -- |
489 | + v_arg1 = -0x1.fb40ac8cda3c1p-762 | 0x1.0e9ed614bc8f1p-342 |
490 | + insn wclgdb07: |
491 | +- v_result = 0000000000000000 | 0000000000000000 |
492 | ++ v_result = 0000000000000000 | -- |
493 | + v_arg1 = -0x1.c1f8b3c68e214p+118 | -0x1.1a26a49368b61p+756 |
494 | + insn vfidb00: |
495 | + v_arg1 = -0x1.38df4cf9d52dbp-545 | -0x1.049253d90dd92p+94 |
496 | +@@ -1020,16 +1020,16 @@ |
497 | + v_result = -0x1.6f5fb2p+70 | -0x1.0d2df6p-107 |
498 | + insn wldeb: |
499 | + v_arg1 = -0x1.d26169729db2ap-435 | 0x1.d6fd080793e8cp+767 |
500 | +- v_result = -0x1.9a4c2cp-54 | 0x0p+0 |
501 | ++ v_result = -0x1.9a4c2cp-54 | -- |
502 | + insn wldeb: |
503 | + v_arg1 = -0x1.f4b59107fce61p-930 | 0x1.cdf2816e253f4p-168 |
504 | +- v_result = -0x1.be96b2p-116 | 0x0p+0 |
505 | ++ v_result = -0x1.be96b2p-116 | -- |
506 | + insn wldeb: |
507 | + v_arg1 = -0x1.9603a2997928cp-441 | -0x1.aada85e355a11p-767 |
508 | +- v_result = -0x1.d2c074p-55 | 0x0p+0 |
509 | ++ v_result = -0x1.d2c074p-55 | -- |
510 | + insn wldeb: |
511 | + v_arg1 = 0x1.25ccf5bd0e83p+620 | 0x1.e1635864ebb17p-88 |
512 | +- v_result = 0x1.64b99ep+78 | 0x0p+0 |
513 | ++ v_result = 0x1.64b99ep+78 | -- |
514 | + insn vflcdb: |
515 | + v_arg1 = 0x1.0ae6d82f76afp-166 | -0x1.e8fb1e03a7415p-191 |
516 | + v_result = -0x1.0ae6d82f76afp-166 | 0x1.e8fb1e03a7415p-191 |
517 | +@@ -1044,16 +1044,16 @@ |
518 | + v_result = -0x1.19520153d35b4p-301 | -0x1.ac5325cd23253p+396 |
519 | + insn wflcdb: |
520 | + v_arg1 = 0x1.ffd3eecfd54d7p-831 | -0x1.97854fa523a77p+146 |
521 | +- v_result = -0x1.ffd3eecfd54d7p-831 | 0x0p+0 |
522 | ++ v_result = -0x1.ffd3eecfd54d7p-831 | -- |
523 | + insn wflcdb: |
524 | + v_arg1 = -0x1.508ea45606447p-442 | 0x1.ae7f0e6cf9d2bp+583 |
525 | +- v_result = 0x1.508ea45606447p-442 | 0x0p+0 |
526 | ++ v_result = 0x1.508ea45606447p-442 | -- |
527 | + insn wflcdb: |
528 | + v_arg1 = 0x1.da8ab2188c21ap+94 | 0x1.78a9c152aa074p-808 |
529 | +- v_result = -0x1.da8ab2188c21ap+94 | 0x0p+0 |
530 | ++ v_result = -0x1.da8ab2188c21ap+94 | -- |
531 | + insn wflcdb: |
532 | + v_arg1 = -0x1.086882645e0c5p-1001 | -0x1.54e2de5af5a74p-262 |
533 | +- v_result = 0x1.086882645e0c5p-1001 | 0x0p+0 |
534 | ++ v_result = 0x1.086882645e0c5p-1001 | -- |
535 | + insn vflndb: |
536 | + v_arg1 = -0x1.5bec561d407dcp+819 | -0x1.a5773dadb7a2dp+935 |
537 | + v_result = -0x1.5bec561d407dcp+819 | -0x1.a5773dadb7a2dp+935 |
538 | +@@ -1068,16 +1068,16 @@ |
539 | + v_result = -0x1.c5bc39a06d4e2p-259 | -0x1.c5e61ad849e77p-833 |
540 | + insn wflndb: |
541 | + v_arg1 = -0x1.e9f3e6d1beffap-117 | -0x1.d58cc8bf123b3p-714 |
542 | +- v_result = -0x1.e9f3e6d1beffap-117 | 0x0p+0 |
543 | ++ v_result = -0x1.e9f3e6d1beffap-117 | -- |
544 | + insn wflndb: |
545 | + v_arg1 = -0x1.3fc4ef2e7485ep-691 | 0x1.eb328986081efp-775 |
546 | +- v_result = -0x1.3fc4ef2e7485ep-691 | 0x0p+0 |
547 | ++ v_result = -0x1.3fc4ef2e7485ep-691 | -- |
548 | + insn wflndb: |
549 | + v_arg1 = -0x1.7146c5afdec16p+23 | -0x1.597fcfa1fab2p-708 |
550 | +- v_result = -0x1.7146c5afdec16p+23 | 0x0p+0 |
551 | ++ v_result = -0x1.7146c5afdec16p+23 | -- |
552 | + insn wflndb: |
553 | + v_arg1 = 0x1.03f8d7e9afe84p-947 | 0x1.9a10c3feb6b57p-118 |
554 | +- v_result = -0x1.03f8d7e9afe84p-947 | 0x0p+0 |
555 | ++ v_result = -0x1.03f8d7e9afe84p-947 | -- |
556 | + insn vflpdb: |
557 | + v_arg1 = 0x1.64ae59b6c762ep-407 | -0x1.fa7191ab21e86p+533 |
558 | + v_result = 0x1.64ae59b6c762ep-407 | 0x1.fa7191ab21e86p+533 |
559 | +@@ -1092,16 +1092,16 @@ |
560 | + v_result = 0x1.85fa2de1d492ap+170 | 0x1.ac36828822c11p-968 |
561 | + insn wflpdb: |
562 | + v_arg1 = 0x1.a6cf677640a73p-871 | 0x1.b6f1792385922p-278 |
563 | +- v_result = 0x1.a6cf677640a73p-871 | 0x0p+0 |
564 | ++ v_result = 0x1.a6cf677640a73p-871 | -- |
565 | + insn wflpdb: |
566 | + v_arg1 = -0x1.b886774f6d888p-191 | -0x1.6a2b08d735d22p-643 |
567 | +- v_result = 0x1.b886774f6d888p-191 | 0x0p+0 |
568 | ++ v_result = 0x1.b886774f6d888p-191 | -- |
569 | + insn wflpdb: |
570 | + v_arg1 = 0x1.5045d37d46f5fp+943 | -0x1.333a86ef2dcf6p-1013 |
571 | +- v_result = 0x1.5045d37d46f5fp+943 | 0x0p+0 |
572 | ++ v_result = 0x1.5045d37d46f5fp+943 | -- |
573 | + insn wflpdb: |
574 | + v_arg1 = 0x1.1e7bec6ada14dp+252 | 0x1.a70b3f3e24dap-153 |
575 | +- v_result = 0x1.1e7bec6ada14dp+252 | 0x0p+0 |
576 | ++ v_result = 0x1.1e7bec6ada14dp+252 | -- |
577 | + insn vfadb: |
578 | + v_arg1 = 0x1.5b1ad8e9f17c6p-294 | -0x1.ddd8300a0bf02p+122 |
579 | + v_arg2 = -0x1.9b49c31ca8ac6p+926 | 0x1.fdbc992926268p+677 |
580 | +@@ -1121,19 +1121,19 @@ |
581 | + insn wfadb: |
582 | + v_arg1 = 0x1.3c5466cb80722p+489 | -0x1.11e1770053ca2p+924 |
583 | + v_arg2 = 0x1.d876cd721a726p-946 | 0x1.5c04ceb79c9bcp+1001 |
584 | +- v_result = 0x1.3c5466cb80722p+489 | 0x0p+0 |
585 | ++ v_result = 0x1.3c5466cb80722p+489 | -- |
586 | + insn wfadb: |
587 | + v_arg1 = 0x1.b0b142d6b76a3p+577 | 0x1.3146824e993a2p+432 |
588 | + v_arg2 = -0x1.f7f3b7582925fp-684 | -0x1.9700143c2b935p-837 |
589 | +- v_result = 0x1.b0b142d6b76a2p+577 | 0x0p+0 |
590 | ++ v_result = 0x1.b0b142d6b76a2p+577 | -- |
591 | + insn wfadb: |
592 | + v_arg1 = -0x1.8d65e15edabd6p+244 | 0x1.3be7fd08492d6p-141 |
593 | + v_arg2 = -0x1.5eef86490fb0ap+481 | 0x1.7b26c897cb6dfp+810 |
594 | +- v_result = -0x1.5eef86490fb0ap+481 | 0x0p+0 |
595 | ++ v_result = -0x1.5eef86490fb0ap+481 | -- |
596 | + insn wfadb: |
597 | + v_arg1 = -0x1.2dffa5b5f29p+34 | 0x1.71a026274602fp-881 |
598 | + v_arg2 = 0x1.4dad707287289p+756 | -0x1.1500d55807247p-616 |
599 | +- v_result = 0x1.4dad707287288p+756 | 0x0p+0 |
600 | ++ v_result = 0x1.4dad707287288p+756 | -- |
601 | + insn vfsdb: |
602 | + v_arg1 = 0x1.054fd9c4d4883p+644 | 0x1.45c90ed85bd7fp-780 |
603 | + v_arg2 = 0x1.f3bc7a611dadap+494 | -0x1.7c9e1e858ba5bp-301 |
604 | +@@ -1153,19 +1153,19 @@ |
605 | + insn wfsdb: |
606 | + v_arg1 = 0x1.9090dabf846e7p-648 | 0x1.1c4ab843a2d15p+329 |
607 | + v_arg2 = -0x1.a7ceb293690dep+316 | 0x1.22245954a20cp+42 |
608 | +- v_result = 0x1.a7ceb293690dep+316 | 0x0p+0 |
609 | ++ v_result = 0x1.a7ceb293690dep+316 | -- |
610 | + insn wfsdb: |
611 | + v_arg1 = 0x1.4e5347c27819p-933 | -0x1.56a30bda28351p-64 |
612 | + v_arg2 = -0x1.dedb9f3935b56p-155 | 0x1.8c5b6ed76816cp-522 |
613 | +- v_result = 0x1.dedb9f3935b56p-155 | 0x0p+0 |
614 | ++ v_result = 0x1.dedb9f3935b56p-155 | -- |
615 | + insn wfsdb: |
616 | + v_arg1 = 0x1.0ec4e562a015bp-491 | 0x1.3996381b52d9fp-686 |
617 | + v_arg2 = 0x1.1dcce4e81819p+960 | -0x1.32fa425e8fc08p-263 |
618 | +- v_result = -0x1.1dcce4e81818fp+960 | 0x0p+0 |
619 | ++ v_result = -0x1.1dcce4e81818fp+960 | -- |
620 | + insn wfsdb: |
621 | + v_arg1 = -0x1.587229f90f77dp-19 | 0x1.100d8eb8105e4p-784 |
622 | + v_arg2 = -0x1.afb4cce4c43ddp+530 | -0x1.6da7f05e7f512p-869 |
623 | +- v_result = 0x1.afb4cce4c43dcp+530 | 0x0p+0 |
624 | ++ v_result = 0x1.afb4cce4c43dcp+530 | -- |
625 | + insn vfmdb: |
626 | + v_arg1 = 0x1.892b425556c47p-124 | 0x1.38222404079dfp-656 |
627 | + v_arg2 = 0x1.af612ed2c342dp-267 | -0x1.1f735fd6ce768p-877 |
628 | +@@ -1185,19 +1185,19 @@ |
629 | + insn wfmdb: |
630 | + v_arg1 = -0x1.b992d950126a1p-683 | -0x1.9c1b22eb58c59p-497 |
631 | + v_arg2 = 0x1.b557a7d8e32c3p-25 | -0x1.f746b2ddafccep+227 |
632 | +- v_result = -0x1.792f6fb13894ap-707 | 0x0p+0 |
633 | ++ v_result = -0x1.792f6fb13894ap-707 | -- |
634 | + insn wfmdb: |
635 | + v_arg1 = -0x1.677a8c20a5a2fp+876 | 0x1.c03e7b97e8c0dp-645 |
636 | + v_arg2 = 0x1.dab44be430937p-1011 | -0x1.3f51352c67be9p-916 |
637 | +- v_result = -0x1.4d4b0a1827064p-134 | 0x0p+0 |
638 | ++ v_result = -0x1.4d4b0a1827064p-134 | -- |
639 | + insn wfmdb: |
640 | + v_arg1 = -0x1.da60f596ad0cep+254 | 0x1.52332e0650e33p+966 |
641 | + v_arg2 = 0x1.a042c52ed993cp+215 | 0x1.8f380c84aa133p+204 |
642 | +- v_result = -0x1.81aca4bbcbd24p+470 | 0x0p+0 |
643 | ++ v_result = -0x1.81aca4bbcbd24p+470 | -- |
644 | + insn wfmdb: |
645 | + v_arg1 = -0x1.83d17f11f6aa3p-469 | -0x1.98117efe89b9ep-361 |
646 | + v_arg2 = 0x1.8c445fd46d214p-701 | -0x1.f98118821821cp+596 |
647 | +- v_result = -0x0p+0 | 0x0p+0 |
648 | ++ v_result = -0x0p+0 | -- |
649 | + insn vfddb: |
650 | + v_arg1 = -0x1.ecbb48899e0f1p+969 | 0x1.caf175ab352p-20 |
651 | + v_arg2 = -0x1.9455d67f9f79dp+208 | 0x1.bc4a431b04a6fp+482 |
652 | +@@ -1217,19 +1217,19 @@ |
653 | + insn wfddb: |
654 | + v_arg1 = 0x1.bd48489b60731p-114 | 0x1.a760dcf57b74fp-51 |
655 | + v_arg2 = -0x1.171f83409eeb6p-402 | -0x1.e159d1409bdc6p-972 |
656 | +- v_result = -0x1.9864f1511f8cp+288 | 0x0p+0 |
657 | ++ v_result = -0x1.9864f1511f8cp+288 | -- |
658 | + insn wfddb: |
659 | + v_arg1 = -0x1.120505ef4606p-637 | -0x1.83f6f775c0eb7p+272 |
660 | + v_arg2 = -0x1.d18ba3872fde1p+298 | 0x1.c60f8d191068cp-454 |
661 | +- v_result = 0x1.2d5cdb15a686cp-936 | 0x0p+0 |
662 | ++ v_result = 0x1.2d5cdb15a686cp-936 | -- |
663 | + insn wfddb: |
664 | + v_arg1 = 0x1.f637f7f8c790fp-97 | -0x1.7bdce4d74947p+189 |
665 | + v_arg2 = -0x1.1c8f2d1b3a2edp-218 | -0x1.55fdfd1840241p-350 |
666 | +- v_result = -0x1.c3d0799c1420fp+121 | 0x0p+0 |
667 | ++ v_result = -0x1.c3d0799c1420fp+121 | -- |
668 | + insn wfddb: |
669 | + v_arg1 = -0x1.c63b7b2eee253p+250 | 0x1.dfd9dcd8b823fp-125 |
670 | + v_arg2 = 0x1.094a1f1f87e0cp+629 | 0x1.eeaa23c0d7843p-814 |
671 | +- v_result = -0x1.b653a10ebdeccp-379 | 0x0p+0 |
672 | ++ v_result = -0x1.b653a10ebdeccp-379 | -- |
673 | + insn vfsqdb: |
674 | + v_arg1 = 0x1.f60db25f7066p-703 | -0x1.d43509abca8c3p+631 |
675 | + v_result = 0x1.fb009ab25ec11p-352 | nan |
676 | +@@ -1244,16 +1244,16 @@ |
677 | + v_result = 0x1.833dba0954bccp+249 | nan |
678 | + insn wfsqdb: |
679 | + v_arg1 = 0x1.71af4e7f64978p+481 | -0x1.3429dc60011d7p-879 |
680 | +- v_result = 0x1.b30fc65551133p+240 | 0x0p+0 |
681 | ++ v_result = 0x1.b30fc65551133p+240 | -- |
682 | + insn wfsqdb: |
683 | + v_arg1 = 0x1.5410db1c5f403p+173 | 0x1.97fa6581e692fp+108 |
684 | +- v_result = 0x1.a144f43a592c1p+86 | 0x0p+0 |
685 | ++ v_result = 0x1.a144f43a592c1p+86 | -- |
686 | + insn wfsqdb: |
687 | + v_arg1 = -0x1.5838027725afep+6 | 0x1.ac61529c11f38p+565 |
688 | +- v_result = nan | 0x0p+0 |
689 | ++ v_result = nan | -- |
690 | + insn wfsqdb: |
691 | + v_arg1 = -0x1.159e341dcc06ep-439 | 0x1.ed54ce5481ba5p-574 |
692 | +- v_result = nan | 0x0p+0 |
693 | ++ v_result = nan | -- |
694 | + insn vfmadb: |
695 | + v_arg1 = -0x1.eb00a5c503d75p+538 | 0x1.89fae603ddc07p+767 |
696 | + v_arg2 = -0x1.71c72712c3957p+715 | 0x1.1bd5773442feap+762 |
697 | +@@ -1278,22 +1278,22 @@ |
698 | + v_arg1 = 0x1.1cc5b10a14d54p+668 | -0x1.686407390f7d1p+616 |
699 | + v_arg2 = -0x1.bf34549e73246p+676 | -0x1.dc5a34cc470f3p+595 |
700 | + v_arg3 = -0x1.95e0fdcf13974p-811 | -0x1.79c7cc1a8ec83p-558 |
701 | +- v_result = -0x1.fffffffffffffp+1023 | 0x0p+0 |
702 | ++ v_result = -0x1.fffffffffffffp+1023 | -- |
703 | + insn wfmadb: |
704 | + v_arg1 = 0x1.138bc1a5d75f8p+713 | -0x1.e226ebba2fe54p+381 |
705 | + v_arg2 = -0x1.081ebb7cc3414p-772 | 0x1.369d99e174fc3p+922 |
706 | + v_arg3 = -0x1.0671c682a5d0cp-1016 | 0x1.03c9530dd0377p+378 |
707 | +- v_result = -0x1.1c4933e117d95p-59 | 0x0p+0 |
708 | ++ v_result = -0x1.1c4933e117d95p-59 | -- |
709 | + insn wfmadb: |
710 | + v_arg1 = -0x1.166f0b1fad67bp+64 | -0x1.e9ee8d32e1069p-452 |
711 | + v_arg2 = -0x1.4a235bdd109e2p-65 | 0x1.bacaa96fc7e81p-403 |
712 | + v_arg3 = -0x1.d2e19acf7c4bdp+99 | 0x1.f901130f685adp-963 |
713 | +- v_result = -0x1.d2e19acf7c4bcp+99 | 0x0p+0 |
714 | ++ v_result = -0x1.d2e19acf7c4bcp+99 | -- |
715 | + insn wfmadb: |
716 | + v_arg1 = -0x1.77d7bfec863d2p-988 | -0x1.b68029700c6b1p-206 |
717 | + v_arg2 = -0x1.aca05ad00aec1p+737 | 0x1.ac746bd7e216bp+51 |
718 | + v_arg3 = 0x1.17342292078b4p+188 | -0x1.49efaf9392301p+555 |
719 | +- v_result = 0x1.17342292078b4p+188 | 0x0p+0 |
720 | ++ v_result = 0x1.17342292078b4p+188 | -- |
721 | + insn vfmsdb: |
722 | + v_arg1 = -0x1.a1b218e84e61p+34 | 0x1.b220f0d144daep-111 |
723 | + v_arg2 = 0x1.564fcc2527961p-265 | 0x1.ea85a4154721ep+733 |
724 | +@@ -1318,22 +1318,22 @@ |
725 | + v_arg1 = -0x1.7499a639673a6p-100 | -0x1.2a0d737e6cb1cp-207 |
726 | + v_arg2 = -0x1.01ad4670a7aa3p-911 | 0x1.f94385e1021e8p+317 |
727 | + v_arg3 = 0x1.aa42b2bb17af9p+982 | 0x1.c550e471711p+786 |
728 | +- v_result = -0x1.aa42b2bb17af8p+982 | 0x0p+0 |
729 | ++ v_result = -0x1.aa42b2bb17af8p+982 | -- |
730 | + insn wfmsdb: |
731 | + v_arg1 = 0x1.76840f99b431ep+500 | -0x1.989a500c92c08p+594 |
732 | + v_arg2 = 0x1.33c657cb8385cp-84 | -0x1.2c795ad92ce17p+807 |
733 | + v_arg3 = -0x1.ee58a39f02d54p-351 | -0x1.18695ed9a280ap+48 |
734 | +- v_result = 0x1.c242894a0068p+416 | 0x0p+0 |
735 | ++ v_result = 0x1.c242894a0068p+416 | -- |
736 | + insn wfmsdb: |
737 | + v_arg1 = -0x1.16db07e054a65p-469 | -0x1.3a627ab99c6e4p+689 |
738 | + v_arg2 = 0x1.17872eae826e5p-538 | 0x1.44ed513fb5873p-929 |
739 | + v_arg3 = 0x1.5ca912008e077p-217 | -0x1.982a6f7359876p-23 |
740 | +- v_result = -0x1.5ca912008e077p-217 | 0x0p+0 |
741 | ++ v_result = -0x1.5ca912008e077p-217 | -- |
742 | + insn wfmsdb: |
743 | + v_arg1 = -0x1.d315f4a932c6p+122 | 0x1.616a04493e143p+513 |
744 | + v_arg2 = -0x1.cf1cd3516f23fp+552 | 0x1.7121749c3932cp-750 |
745 | + v_arg3 = 0x1.dc26d92304d7fp-192 | -0x1.1fc3cca9ec20ep+371 |
746 | +- v_result = 0x1.a67ca6ba395bcp+675 | 0x0p+0 |
747 | ++ v_result = 0x1.a67ca6ba395bcp+675 | -- |
748 | + insn wfcdb: |
749 | + v_arg1 = 0x1.302001b736011p-633 | -0x1.72d5300225c97p-468 |
750 | + v_arg2 = -0x1.8c007c5aba108p-17 | -0x1.bb3f9ae136acdp+569 |
751 | +@@ -1383,19 +1383,19 @@ |
752 | + v_arg1 = 0x1.d8e5c9930c19dp+623 | -0x1.cf1facff4e194p-605 |
753 | + v_arg2 = -0x1.ed6ba02646d0dp+441 | -0x1.2d677e710620bp+810 |
754 | + insn wfcedb: |
755 | +- v_result = 0000000000000000 | 0000000000000000 |
756 | ++ v_result = 0000000000000000 | -- |
757 | + v_arg1 = -0x1.a252009e1a12cp-442 | 0x1.4dc608268bb29p-513 |
758 | + v_arg2 = -0x1.81020aa1a36e6p-687 | -0x1.300e64ce414f1p-899 |
759 | + insn wfcedb: |
760 | +- v_result = 0000000000000000 | 0000000000000000 |
761 | ++ v_result = 0000000000000000 | -- |
762 | + v_arg1 = 0x1.cec439a8d4781p-175 | -0x1.d20e3b281d599p+893 |
763 | + v_arg2 = 0x1.ca17cf16cf0aap-879 | 0x1.61506f8596092p+545 |
764 | + insn wfcedb: |
765 | +- v_result = 0000000000000000 | 0000000000000000 |
766 | ++ v_result = 0000000000000000 | -- |
767 | + v_arg1 = 0x1.0659f5f24a004p+877 | 0x1.fc46867ed0338p-680 |
768 | + v_arg2 = -0x1.1d6849587155ep-1010 | -0x1.f68171edc235fp+575 |
769 | + insn wfcedb: |
770 | +- v_result = 0000000000000000 | 0000000000000000 |
771 | ++ v_result = 0000000000000000 | -- |
772 | + v_arg1 = 0x1.dc88a0d46ad79p-816 | 0x1.245140dcaed79p+851 |
773 | + v_arg2 = 0x1.b33e977c7b3ep-818 | -0x1.04319d7c69367p+787 |
774 | + insn vfcedbs: |
775 | +@@ -1419,22 +1419,22 @@ |
776 | + v_arg2 = 0x1.ae2c06ea88ff4p+332 | -0x1.f668ce4f8ef9ap+821 |
777 | + r_result = 0000000000000003 |
778 | + insn wfcedbs: |
779 | +- v_result = 0000000000000000 | 0000000000000000 |
780 | ++ v_result = 0000000000000000 | -- |
781 | + v_arg1 = 0x1.645261bf86b1fp-996 | 0x1.abd13c95397aap+992 |
782 | + v_arg2 = -0x1.ba09e8fc66a8cp+113 | 0x1.75dbfe92c16c4p-786 |
783 | + r_result = 0000000000000003 |
784 | + insn wfcedbs: |
785 | +- v_result = 0000000000000000 | 0000000000000000 |
786 | ++ v_result = 0000000000000000 | -- |
787 | + v_arg1 = -0x1.d02831d003e7dp+415 | -0x1.611a9dfd10f36p-80 |
788 | + v_arg2 = -0x1.10bda62f4647p+723 | 0x1.cc47af6653378p-614 |
789 | + r_result = 0000000000000003 |
790 | + insn wfcedbs: |
791 | +- v_result = 0000000000000000 | 0000000000000000 |
792 | ++ v_result = 0000000000000000 | -- |
793 | + v_arg1 = 0x1.f168f32f84178p-321 | -0x1.79a2a0b9549d1p-136 |
794 | + v_arg2 = 0x1.41e19d1cfa692p+11 | -0x1.2a0ed6e7fd517p-453 |
795 | + r_result = 0000000000000003 |
796 | + insn wfcedbs: |
797 | +- v_result = 0000000000000000 | 0000000000000000 |
798 | ++ v_result = 0000000000000000 | -- |
799 | + v_arg1 = -0x1.76a9144ee26c5p+188 | -0x1.386aaea2d9cddp-542 |
800 | + v_arg2 = 0x1.810fcf222efc4p-999 | -0x1.ce90a9a43e2a1p+80 |
801 | + r_result = 0000000000000003 |
802 | +@@ -1455,19 +1455,19 @@ |
803 | + v_arg1 = 0x1.82be31fb88a2dp+946 | -0x1.7ca9e9ff31953p-931 |
804 | + v_arg2 = 0x1.fe75a1052beccp+490 | 0x1.179d18543d678p-255 |
805 | + insn wfchdb: |
806 | +- v_result = ffffffffffffffff | 0000000000000000 |
807 | ++ v_result = ffffffffffffffff | -- |
808 | + v_arg1 = 0x1.0af85d8d8d609p-464 | -0x1.9f639a686e0fep+203 |
809 | + v_arg2 = -0x1.3142b77b55761p-673 | 0x1.ca9c474339da1p+472 |
810 | + insn wfchdb: |
811 | +- v_result = ffffffffffffffff | 0000000000000000 |
812 | ++ v_result = ffffffffffffffff | -- |
813 | + v_arg1 = -0x1.6cf16959a022bp+213 | 0x1.445606e4363e1p+942 |
814 | + v_arg2 = -0x1.8c343201bbd2p+939 | -0x1.e5095ad0c37a4p-434 |
815 | + insn wfchdb: |
816 | +- v_result = ffffffffffffffff | 0000000000000000 |
817 | ++ v_result = ffffffffffffffff | -- |
818 | + v_arg1 = 0x1.36b4fc9cf5bdap-52 | -0x1.f1fd95cbcd533p+540 |
819 | + v_arg2 = 0x1.5a2362891c9edp-175 | -0x1.e1f68c319e5d2p+58 |
820 | + insn wfchdb: |
821 | +- v_result = ffffffffffffffff | 0000000000000000 |
822 | ++ v_result = ffffffffffffffff | -- |
823 | + v_arg1 = 0x1.11c6489f544bbp+811 | 0x1.262a740ec3d47p+456 |
824 | + v_arg2 = -0x1.d9394d354e989p-154 | 0x1.cc21b3094391ap-972 |
825 | + insn vfchdbs: |
826 | +@@ -1491,22 +1491,22 @@ |
827 | + v_arg2 = 0x1.e426748435a76p+370 | 0x1.8702527d17783p-871 |
828 | + r_result = 0000000000000003 |
829 | + insn wfchdbs: |
830 | +- v_result = ffffffffffffffff | 0000000000000000 |
831 | ++ v_result = ffffffffffffffff | -- |
832 | + v_arg1 = 0x1.6c51b9f6442c8p+639 | 0x1.1e6b37adff703p+702 |
833 | + v_arg2 = 0x1.0cba9c1c75e43p+520 | -0x1.145d44ed90967p+346 |
834 | + r_result = 0000000000000000 |
835 | + insn wfchdbs: |
836 | +- v_result = ffffffffffffffff | 0000000000000000 |
837 | ++ v_result = ffffffffffffffff | -- |
838 | + v_arg1 = 0x1.7b3dd643bf36bp+816 | -0x1.61ce7bfb9307ap-683 |
839 | + v_arg2 = -0x1.f2c998dc15c9ap-776 | 0x1.e16397f2dcdf5p+571 |
840 | + r_result = 0000000000000000 |
841 | + insn wfchdbs: |
842 | +- v_result = ffffffffffffffff | 0000000000000000 |
843 | ++ v_result = ffffffffffffffff | -- |
844 | + v_arg1 = 0x1.cc3be81884e0ap-865 | -0x1.8b353bd41064p+820 |
845 | + v_arg2 = -0x1.2c1bafaafdd4ep-34 | -0x1.24666808ab16ep-435 |
846 | + r_result = 0000000000000000 |
847 | + insn wfchdbs: |
848 | +- v_result = ffffffffffffffff | 0000000000000000 |
849 | ++ v_result = ffffffffffffffff | -- |
850 | + v_arg1 = 0x1.c3de33d3b673ap+554 | 0x1.d39ed71e53096p-798 |
851 | + v_arg2 = -0x1.c1e8f7b3c001p-828 | 0x1.22e2cf797fabp-787 |
852 | + r_result = 0000000000000000 |
853 | +@@ -1527,19 +1527,19 @@ |
854 | + v_arg1 = -0x1.6c5599e7ba923p+829 | -0x1.5d1a1191ed6eap-994 |
855 | + v_arg2 = -0x1.555c8775bc4d2p-478 | -0x1.4aa6a2c82319cp+493 |
856 | + insn wfchedb: |
857 | +- v_result = ffffffffffffffff | 0000000000000000 |
858 | ++ v_result = ffffffffffffffff | -- |
859 | + v_arg1 = 0x1.ae6cad07b0f3ep-232 | -0x1.2ed61a43f3b99p-74 |
860 | + v_arg2 = -0x1.226f7cddbde13p-902 | -0x1.790d1d6febbf8p+336 |
861 | + insn wfchedb: |
862 | +- v_result = ffffffffffffffff | 0000000000000000 |
863 | ++ v_result = ffffffffffffffff | -- |
864 | + v_arg1 = 0x1.20eb8eac3711dp-385 | 0x1.ef71d3312d7e1p+739 |
865 | + v_arg2 = 0x1.7a3ba08c5a0bdp-823 | -0x1.a7845ccaa544dp-129 |
866 | + insn wfchedb: |
867 | +- v_result = 0000000000000000 | 0000000000000000 |
868 | ++ v_result = 0000000000000000 | -- |
869 | + v_arg1 = -0x1.97ebdbc057be8p+824 | 0x1.2b7798b063cd6p+237 |
870 | + v_arg2 = 0x1.cdb87a6074294p-81 | -0x1.074c902b19bccp-416 |
871 | + insn wfchedb: |
872 | +- v_result = 0000000000000000 | 0000000000000000 |
873 | ++ v_result = 0000000000000000 | -- |
874 | + v_arg1 = -0x1.82deebf9ff023p+937 | 0x1.56c5adcf9d4abp-672 |
875 | + v_arg2 = -0x1.311ce49bc9439p+561 | 0x1.c8e1c512d8544p+103 |
876 | + insn vfchedbs: |
877 | +@@ -1563,22 +1563,22 @@ |
878 | + v_arg2 = -0x1.47f5dfc7a5bcp-569 | 0x1.5877ef33664a3p-758 |
879 | + r_result = 0000000000000003 |
880 | + insn wfchedbs: |
881 | +- v_result = 0000000000000000 | 0000000000000000 |
882 | ++ v_result = 0000000000000000 | -- |
883 | + v_arg1 = -0x1.a7370ccfd9e49p+505 | 0x1.c6b2385850ca2p-591 |
884 | + v_arg2 = 0x1.984f4fcd338b1p+675 | -0x1.feb996c821232p-39 |
885 | + r_result = 0000000000000003 |
886 | + insn wfchedbs: |
887 | +- v_result = ffffffffffffffff | 0000000000000000 |
888 | ++ v_result = ffffffffffffffff | -- |
889 | + v_arg1 = 0x1.641878612dd2p+207 | 0x1.b35e3292db7f6p+567 |
890 | + v_arg2 = -0x1.18a87f209e96bp+299 | -0x1.3d598f3612d8ap+1016 |
891 | + r_result = 0000000000000000 |
892 | + insn wfchedbs: |
893 | +- v_result = ffffffffffffffff | 0000000000000000 |
894 | ++ v_result = ffffffffffffffff | -- |
895 | + v_arg1 = 0x1.cfc2cda244153p+404 | 0x1.d8b2b28e9d8d7p+276 |
896 | + v_arg2 = 0x1.3517b8c7a59a1p-828 | 0x1.6096fab7003ccp-415 |
897 | + r_result = 0000000000000000 |
898 | + insn wfchedbs: |
899 | +- v_result = 0000000000000000 | 0000000000000000 |
900 | ++ v_result = 0000000000000000 | -- |
901 | + v_arg1 = -0x1.54d656f033e56p-603 | -0x1.95ad0e2088967p+254 |
902 | + v_arg2 = 0x1.4cb319db206e4p-614 | 0x1.b41cd9e3739b6p-862 |
903 | + r_result = 0000000000000003 |
904 | +--- a/none/tests/s390x/vector.h |
905 | ++++ b/none/tests/s390x/vector.h |
906 | +@@ -86,6 +86,13 @@ |
907 | + printf("%016lx | %016lx\n", value.u64[0], value.u64[1]); |
908 | + } |
909 | + |
910 | ++void print_hex64(const V128 value, int zero_only) { |
911 | ++ if (zero_only) |
912 | ++ printf("%016lx | --\n", value.u64[0]); |
913 | ++ else |
914 | ++ printf("%016lx | %016lx\n", value.u64[0], value.u64[1]); |
915 | ++} |
916 | ++ |
917 | + void print_f32(const V128 value, int even_only, int zero_only) { |
918 | + if (zero_only) |
919 | + printf("%a | -- | -- | --\n", value.f32[0]); |
920 | +@@ -222,8 +229,10 @@ |
921 | + {printf(" v_arg2 = "); print_hex(v_arg2);} \ |
922 | + if (info & V128_V_ARG3_AS_INT) \ |
923 | + {printf(" v_arg3 = "); print_hex(v_arg3);} \ |
924 | +- if (info & V128_V_RES_AS_INT) \ |
925 | +- {printf(" v_result = "); print_hex(v_result);} \ |
926 | ++ if (info & V128_V_RES_AS_INT) { \ |
927 | ++ printf(" v_result = "); \ |
928 | ++ print_hex64(v_result, info & V128_V_RES_ZERO_ONLY); \ |
929 | ++ } \ |
930 | + \ |
931 | + if (info & V128_V_ARG1_AS_FLOAT64) \ |
932 | + {printf(" v_arg1 = "); print_f64(v_arg1, 0);} \ |
933 | +--- a/VEX/priv/guest_s390_defs.h |
934 | ++++ b/VEX/priv/guest_s390_defs.h |
935 | +@@ -8,7 +8,7 @@ |
936 | + This file is part of Valgrind, a dynamic binary instrumentation |
937 | + framework. |
938 | + |
939 | +- Copyright IBM Corp. 2010-2017 |
940 | ++ Copyright IBM Corp. 2010-2020 |
941 | + |
942 | + This program is free software; you can redistribute it and/or |
943 | + modify it under the terms of the GNU General Public License as |
944 | +@@ -263,26 +263,27 @@ |
945 | + before S390_VEC_OP_LAST. */ |
946 | + typedef enum { |
947 | + S390_VEC_OP_INVALID = 0, |
948 | +- S390_VEC_OP_VPKS = 1, |
949 | +- S390_VEC_OP_VPKLS = 2, |
950 | +- S390_VEC_OP_VFAE = 3, |
951 | +- S390_VEC_OP_VFEE = 4, |
952 | +- S390_VEC_OP_VFENE = 5, |
953 | +- S390_VEC_OP_VISTR = 6, |
954 | +- S390_VEC_OP_VSTRC = 7, |
955 | +- S390_VEC_OP_VCEQ = 8, |
956 | +- S390_VEC_OP_VTM = 9, |
957 | +- S390_VEC_OP_VGFM = 10, |
958 | +- S390_VEC_OP_VGFMA = 11, |
959 | +- S390_VEC_OP_VMAH = 12, |
960 | +- S390_VEC_OP_VMALH = 13, |
961 | +- S390_VEC_OP_VCH = 14, |
962 | +- S390_VEC_OP_VCHL = 15, |
963 | +- S390_VEC_OP_VFCE = 16, |
964 | +- S390_VEC_OP_VFCH = 17, |
965 | +- S390_VEC_OP_VFCHE = 18, |
966 | +- S390_VEC_OP_VFTCI = 19, |
967 | +- S390_VEC_OP_LAST = 20 // supposed to be the last element in enum |
968 | ++ S390_VEC_OP_VPKS, |
969 | ++ S390_VEC_OP_VPKLS, |
970 | ++ S390_VEC_OP_VFAE, |
971 | ++ S390_VEC_OP_VFEE, |
972 | ++ S390_VEC_OP_VFENE, |
973 | ++ S390_VEC_OP_VISTR, |
974 | ++ S390_VEC_OP_VSTRC, |
975 | ++ S390_VEC_OP_VCEQ, |
976 | ++ S390_VEC_OP_VTM, |
977 | ++ S390_VEC_OP_VGFM, |
978 | ++ S390_VEC_OP_VGFMA, |
979 | ++ S390_VEC_OP_VMAH, |
980 | ++ S390_VEC_OP_VMALH, |
981 | ++ S390_VEC_OP_VCH, |
982 | ++ S390_VEC_OP_VCHL, |
983 | ++ S390_VEC_OP_VFTCI, |
984 | ++ S390_VEC_OP_VFMIN, |
985 | ++ S390_VEC_OP_VFMAX, |
986 | ++ S390_VEC_OP_VBPERM, |
987 | ++ S390_VEC_OP_VMSL, |
988 | ++ S390_VEC_OP_LAST // supposed to be the last element in enum |
989 | + } s390x_vec_op_t; |
990 | + |
991 | + /* Arguments of s390x_dirtyhelper_vec_op(...) which are packed into one |
992 | +--- a/VEX/priv/guest_s390_helpers.c |
993 | ++++ b/VEX/priv/guest_s390_helpers.c |
994 | +@@ -8,7 +8,7 @@ |
995 | + This file is part of Valgrind, a dynamic binary instrumentation |
996 | + framework. |
997 | + |
998 | +- Copyright IBM Corp. 2010-2017 |
999 | ++ Copyright IBM Corp. 2010-2020 |
1000 | + |
1001 | + This program is free software; you can redistribute it and/or |
1002 | + modify it under the terms of the GNU General Public License as |
1003 | +@@ -314,20 +314,11 @@ |
1004 | + /*--- Dirty helper for Store Facility instruction ---*/ |
1005 | + /*------------------------------------------------------------*/ |
1006 | + #if defined(VGA_s390x) |
1007 | +-static void |
1008 | +-s390_set_facility_bit(ULong *addr, UInt bitno, UInt value) |
1009 | +-{ |
1010 | +- addr += bitno / 64; |
1011 | +- bitno = bitno % 64; |
1012 | +- |
1013 | +- ULong mask = 1; |
1014 | +- mask <<= (63 - bitno); |
1015 | + |
1016 | +- if (value == 1) { |
1017 | +- *addr |= mask; // set |
1018 | +- } else { |
1019 | +- *addr &= ~mask; // clear |
1020 | +- } |
1021 | ++static ULong |
1022 | ++s390_stfle_range(UInt lo, UInt hi) |
1023 | ++{ |
1024 | ++ return ((1UL << (hi + 1 - lo)) - 1) << (63 - (hi % 64)); |
1025 | + } |
1026 | + |
1027 | + ULong |
1028 | +@@ -336,6 +327,77 @@ |
1029 | + ULong hoststfle[S390_NUM_FACILITY_DW], cc, num_dw, i; |
1030 | + register ULong reg0 asm("0") = guest_state->guest_r0 & 0xF; /* r0[56:63] */ |
1031 | + |
1032 | ++ /* Restrict to facilities that we know about and that we assume to be |
1033 | ++ compatible with Valgrind. Of course, in this way we may reject features |
1034 | ++ that Valgrind is not really involved in (and thus would be compatible |
1035 | ++ with), but querying for such features doesn't seem like a typical use |
1036 | ++ case. */ |
1037 | ++ ULong accepted_facility[S390_NUM_FACILITY_DW] = { |
1038 | ++ /* === 0 .. 63 === */ |
1039 | ++ (s390_stfle_range(0, 16) |
1040 | ++ /* 17: message-security-assist, not supported */ |
1041 | ++ | s390_stfle_range(18, 19) |
1042 | ++ /* 20: HFP-multiply-and-add/subtract, not supported */ |
1043 | ++ | s390_stfle_range(21, 22) |
1044 | ++ /* 23: HFP-unnormalized-extension, not supported */ |
1045 | ++ | s390_stfle_range(24, 25) |
1046 | ++ /* 26: parsing-enhancement, not supported */ |
1047 | ++ | s390_stfle_range(27, 28) |
1048 | ++ /* 29: unassigned */ |
1049 | ++ | s390_stfle_range(30, 30) |
1050 | ++ /* 31: extract-CPU-time, not supported */ |
1051 | ++ | s390_stfle_range(32, 41) |
1052 | ++ /* 42-43: DFP, not fully supported */ |
1053 | ++ /* 44: PFPO, not fully supported */ |
1054 | ++ | s390_stfle_range(45, 47) |
1055 | ++ /* 48: DFP zoned-conversion, not supported */ |
1056 | ++ /* 49: includes PPA, not supported */ |
1057 | ++ /* 50: constrained transactional-execution, not supported */ |
1058 | ++ | s390_stfle_range(51, 55) |
1059 | ++ /* 56: unassigned */ |
1060 | ++ /* 57: MSA5, not supported */ |
1061 | ++ | s390_stfle_range(58, 60) |
1062 | ++ /* 61: miscellaneous-instruction 3, not supported */ |
1063 | ++ | s390_stfle_range(62, 63)), |
1064 | ++ |
1065 | ++ /* === 64 .. 127 === */ |
1066 | ++ (s390_stfle_range(64, 72) |
1067 | ++ /* 73: transactional-execution, not supported */ |
1068 | ++ | s390_stfle_range(74, 75) |
1069 | ++ /* 76: MSA3, not supported */ |
1070 | ++ /* 77: MSA4, not supported */ |
1071 | ++ | s390_stfle_range(78, 78) |
1072 | ++ /* 80: DFP packed-conversion, not supported */ |
1073 | ++ /* 81: PPA-in-order, not supported */ |
1074 | ++ | s390_stfle_range(82, 82) |
1075 | ++ /* 83-127: unassigned */ ), |
1076 | ++ |
1077 | ++ /* === 128 .. 191 === */ |
1078 | ++ (s390_stfle_range(128, 131) |
1079 | ++ /* 132: unassigned */ |
1080 | ++ /* 133: guarded-storage, not supported */ |
1081 | ++ /* 134: vector packed decimal, not supported */ |
1082 | ++ | s390_stfle_range(135, 135) |
1083 | ++ /* 136: unassigned */ |
1084 | ++ /* 137: unassigned */ |
1085 | ++ | s390_stfle_range(138, 142) |
1086 | ++ /* 143: unassigned */ |
1087 | ++ | s390_stfle_range(144, 145) |
1088 | ++ /* 146: MSA8, not supported */ |
1089 | ++ | s390_stfle_range(147, 147) |
1090 | ++ /* 148: vector-enhancements 2, not supported */ |
1091 | ++ | s390_stfle_range(149, 149) |
1092 | ++ /* 150: unassigned */ |
1093 | ++ /* 151: DEFLATE-conversion, not supported */ |
1094 | ++ /* 153: unassigned */ |
1095 | ++ /* 154: unassigned */ |
1096 | ++ /* 155: MSA9, not supported */ |
1097 | ++ | s390_stfle_range(156, 156) |
1098 | ++ /* 157-167: unassigned */ |
1099 | ++ | s390_stfle_range(168, 168) |
1100 | ++ /* 169-191: unassigned */ ), |
1101 | ++ }; |
1102 | ++ |
1103 | + /* We cannot store more than S390_NUM_FACILITY_DW |
1104 | + (and it makes not much sense to do so anyhow) */ |
1105 | + if (reg0 > S390_NUM_FACILITY_DW - 1) |
1106 | +@@ -351,35 +413,9 @@ |
1107 | + /* Update guest register 0 with what STFLE set r0 to */ |
1108 | + guest_state->guest_r0 = reg0; |
1109 | + |
1110 | +- /* Set default: VM facilities = host facilities */ |
1111 | ++ /* VM facilities = host facilities, filtered by acceptance */ |
1112 | + for (i = 0; i < num_dw; ++i) |
1113 | +- addr[i] = hoststfle[i]; |
1114 | +- |
1115 | +- /* Now adjust the VM facilities according to what the VM supports */ |
1116 | +- s390_set_facility_bit(addr, S390_FAC_LDISP, 1); |
1117 | +- s390_set_facility_bit(addr, S390_FAC_EIMM, 1); |
1118 | +- s390_set_facility_bit(addr, S390_FAC_ETF2, 1); |
1119 | +- s390_set_facility_bit(addr, S390_FAC_ETF3, 1); |
1120 | +- s390_set_facility_bit(addr, S390_FAC_GIE, 1); |
1121 | +- s390_set_facility_bit(addr, S390_FAC_EXEXT, 1); |
1122 | +- s390_set_facility_bit(addr, S390_FAC_HIGHW, 1); |
1123 | +- s390_set_facility_bit(addr, S390_FAC_LSC2, 1); |
1124 | +- |
1125 | +- s390_set_facility_bit(addr, S390_FAC_HFPMAS, 0); |
1126 | +- s390_set_facility_bit(addr, S390_FAC_HFPUNX, 0); |
1127 | +- s390_set_facility_bit(addr, S390_FAC_XCPUT, 0); |
1128 | +- s390_set_facility_bit(addr, S390_FAC_MSA, 0); |
1129 | +- s390_set_facility_bit(addr, S390_FAC_PENH, 0); |
1130 | +- s390_set_facility_bit(addr, S390_FAC_DFP, 0); |
1131 | +- s390_set_facility_bit(addr, S390_FAC_PFPO, 0); |
1132 | +- s390_set_facility_bit(addr, S390_FAC_DFPZC, 0); |
1133 | +- s390_set_facility_bit(addr, S390_FAC_MISC, 0); |
1134 | +- s390_set_facility_bit(addr, S390_FAC_CTREXE, 0); |
1135 | +- s390_set_facility_bit(addr, S390_FAC_TREXE, 0); |
1136 | +- s390_set_facility_bit(addr, S390_FAC_MSA4, 0); |
1137 | +- s390_set_facility_bit(addr, S390_FAC_VXE, 0); |
1138 | +- s390_set_facility_bit(addr, S390_FAC_VXE2, 0); |
1139 | +- s390_set_facility_bit(addr, S390_FAC_DFLT, 0); |
1140 | ++ addr[i] = hoststfle[i] & accepted_facility[i]; |
1141 | + |
1142 | + return cc; |
1143 | + } |
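Review note on the hunk above: the mask arithmetic in `s390_stfle_range` is easier to check with concrete numbers. The following is a minimal standalone sketch, not part of the patch. It uses `uint64_t` instead of `ULong`, and `stfle_range` and `facility_accepted` are hypothetical names introduced here for illustration. It assumes the STFLE convention visible in the patch, where facility bit 0 is the most significant bit of the first doubleword.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as s390_stfle_range in the patch, written with uint64_t.
   Facility bits are numbered from the most significant bit of each
   doubleword, so bit 0 corresponds to 0x8000000000000000.
   Assumptions checked here: lo and hi lie in the same doubleword, and the
   range does not cover all 64 bits, because shifting a 64-bit value by 64
   is undefined behaviour in C. */
static uint64_t stfle_range(unsigned lo, unsigned hi)
{
   assert(lo <= hi && lo / 64 == hi / 64 && hi - lo < 63);
   return ((UINT64_C(1) << (hi + 1 - lo)) - 1) << (63 - (hi % 64));
}

/* Hypothetical helper: returns 1 if facility bit `bitno` is set in a
   facility list stored as an array of doublewords, 0 otherwise. */
static int facility_accepted(const uint64_t *list, unsigned bitno)
{
   return (int)((list[bitno / 64] >> (63 - bitno % 64)) & 1);
}
```

With these definitions, `stfle_range(0, 16)` yields `0xFFFF800000000000` and `stfle_range(128, 131)` yields `0xF000000000000000`. ANDing a host doubleword with such a mask, as the new loop does, clears every facility that is not listed, including ones the old set/clear list never mentioned. None of the ranges used in the patch spans a full doubleword, so the shift-by-64 case does not seem to arise there.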
1144 | +@@ -2500,25 +2536,26 @@ |
1145 | + vassert(d->op > S390_VEC_OP_INVALID && d->op < S390_VEC_OP_LAST); |
1146 | + static const UChar opcodes[][2] = { |
1147 | + {0x00, 0x00}, /* invalid */ |
1148 | +- {0xe7, 0x97}, /* VPKS */ |
1149 | +- {0xe7, 0x95}, /* VPKLS */ |
1150 | +- {0xe7, 0x82}, /* VFAE */ |
1151 | +- {0xe7, 0x80}, /* VFEE */ |
1152 | +- {0xe7, 0x81}, /* VFENE */ |
1153 | +- {0xe7, 0x5c}, /* VISTR */ |
1154 | +- {0xe7, 0x8a}, /* VSTRC */ |
1155 | +- {0xe7, 0xf8}, /* VCEQ */ |
1156 | +- {0xe7, 0xd8}, /* VTM */ |
1157 | +- {0xe7, 0xb4}, /* VGFM */ |
1158 | +- {0xe7, 0xbc}, /* VGFMA */ |
1159 | +- {0xe7, 0xab}, /* VMAH */ |
1160 | +- {0xe7, 0xa9}, /* VMALH */ |
1161 | +- {0xe7, 0xfb}, /* VCH */ |
1162 | +- {0xe7, 0xf9}, /* VCHL */ |
1163 | +- {0xe7, 0xe8}, /* VFCE */ |
1164 | +- {0xe7, 0xeb}, /* VFCH */ |
1165 | +- {0xe7, 0xea}, /* VFCHE */ |
1166 | +- {0xe7, 0x4a} /* VFTCI */ |
1167 | ++ [S390_VEC_OP_VPKS] = {0xe7, 0x97}, |
1168 | ++ [S390_VEC_OP_VPKLS] = {0xe7, 0x95}, |
1169 | ++ [S390_VEC_OP_VFAE] = {0xe7, 0x82}, |
1170 | ++ [S390_VEC_OP_VFEE] = {0xe7, 0x80}, |
1171 | ++ [S390_VEC_OP_VFENE] = {0xe7, 0x81}, |
1172 | ++ [S390_VEC_OP_VISTR] = {0xe7, 0x5c}, |
1173 | ++ [S390_VEC_OP_VSTRC] = {0xe7, 0x8a}, |
1174 | ++ [S390_VEC_OP_VCEQ] = {0xe7, 0xf8}, |
1175 | ++ [S390_VEC_OP_VTM] = {0xe7, 0xd8}, |
1176 | ++ [S390_VEC_OP_VGFM] = {0xe7, 0xb4}, |
1177 | ++ [S390_VEC_OP_VGFMA] = {0xe7, 0xbc}, |
1178 | ++ [S390_VEC_OP_VMAH] = {0xe7, 0xab}, |
1179 | ++ [S390_VEC_OP_VMALH] = {0xe7, 0xa9}, |
1180 | ++ [S390_VEC_OP_VCH] = {0xe7, 0xfb}, |
1181 | ++ [S390_VEC_OP_VCHL] = {0xe7, 0xf9}, |
1182 | ++ [S390_VEC_OP_VFTCI] = {0xe7, 0x4a}, |
1183 | ++ [S390_VEC_OP_VFMIN] = {0xe7, 0xee}, |
1184 | ++ [S390_VEC_OP_VFMAX] = {0xe7, 0xef}, |
1185 | ++ [S390_VEC_OP_VBPERM]= {0xe7, 0x85}, |
1186 | ++ [S390_VEC_OP_VMSL] = {0xe7, 0xb8}, |
1187 | + }; |
1188 | + |
1189 | + union { |
1190 | +@@ -2612,6 +2649,7 @@ |
1191 | + case S390_VEC_OP_VGFMA: |
1192 | + case S390_VEC_OP_VMAH: |
1193 | + case S390_VEC_OP_VMALH: |
1194 | ++ case S390_VEC_OP_VMSL: |
1195 | + the_insn.VRRd.v1 = 1; |
1196 | + the_insn.VRRd.v2 = 2; |
1197 | + the_insn.VRRd.v3 = 3; |
1198 | +@@ -2621,9 +2659,9 @@ |
1199 | + the_insn.VRRd.m6 = d->m5; |
1200 | + break; |
1201 | + |
1202 | +- case S390_VEC_OP_VFCE: |
1203 | +- case S390_VEC_OP_VFCH: |
1204 | +- case S390_VEC_OP_VFCHE: |
1205 | ++ case S390_VEC_OP_VFMIN: |
1206 | ++ case S390_VEC_OP_VFMAX: |
1207 | ++ case S390_VEC_OP_VBPERM: |
1208 | + the_insn.VRRc.v1 = 1; |
1209 | + the_insn.VRRc.v2 = 2; |
1210 | + the_insn.VRRc.v3 = 3; |
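Review note on the opcode table change in this file: switching to designated initializers ties each table row to its enumerator, so the table stays correct when enumerators are inserted, removed, or reordered, as happens in `s390x_vec_op_t` in this patch. The following is a reduced sketch of that pattern, not the patch's code. The type and function names (`op_t`, `op_bytes`, `op_second_byte`) are hypothetical, and only three opcode pairs are copied from the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of the enum; values are assigned implicitly. */
typedef enum {
   OP_INVALID = 0,
   OP_VFMIN,
   OP_VFMAX,
   OP_VMSL,
   OP_LAST   /* must stay the last enumerator */
} op_t;

/* Each row is bound to its enumerator, so the order of the initializers
   does not have to match the order of the enum. */
static const unsigned char op_bytes[][2] = {
   [OP_INVALID] = {0x00, 0x00},
   [OP_VMSL]    = {0xe7, 0xb8},
   [OP_VFMIN]   = {0xe7, 0xee},
   [OP_VFMAX]   = {0xe7, 0xef},
};

/* Hypothetical accessor: second opcode byte of a valid operation. */
static unsigned char op_second_byte(op_t op)
{
   assert(op > OP_INVALID && op < OP_LAST);
   return op_bytes[op][1];
}
```

One consequence worth keeping in mind: an enumerator without a matching initializer silently gets an all-zero row, so a size check such as `sizeof op_bytes / sizeof op_bytes[0] == OP_LAST` only catches a missing row at the end of the enum.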
1211 | +--- a/VEX/priv/guest_s390_toIR.c |
1212 | ++++ b/VEX/priv/guest_s390_toIR.c |
1213 | +@@ -8,7 +8,7 @@ |
1214 | + This file is part of Valgrind, a dynamic binary instrumentation |
1215 | + framework. |
1216 | + |
1217 | +- Copyright IBM Corp. 2010-2017 |
1218 | ++ Copyright IBM Corp. 2010-2020 |
1219 | + |
1220 | + This program is free software; you can redistribute it and/or |
1221 | + modify it under the terms of the GNU General Public License as |
1222 | +@@ -248,6 +248,13 @@ |
1223 | + #define VRS_d2(insn) (((insn) >> 32) & 0xfff) |
1224 | + #define VRS_m4(insn) (((insn) >> 28) & 0xf) |
1225 | + #define VRS_rxb(insn) (((insn) >> 24) & 0xf) |
1226 | ++#define VRSd_v1(insn) (((insn) >> 28) & 0xf) |
1227 | ++#define VRSd_r3(insn) (((insn) >> 48) & 0xf) |
1228 | ++#define VSI_i3(insn) (((insn) >> 48) & 0xff) |
1229 | ++#define VSI_b2(insn) (((insn) >> 44) & 0xf) |
1230 | ++#define VSI_d2(insn) (((insn) >> 32) & 0xfff) |
1231 | ++#define VSI_v1(insn) (((insn) >> 28) & 0xf) |
1232 | ++#define VSI_rxb(insn) (((insn) >> 24) & 0xf) |
1233 | + |
1234 | + |
1235 | + /*------------------------------------------------------------*/ |
1236 | +@@ -1937,6 +1944,26 @@ |
1237 | + return results[m]; |
1238 | + } |
1239 | + |
1240 | ++/* Determine IRType from instruction's floating-point format field */ |
1241 | ++static IRType |
1242 | ++s390_vr_get_ftype(const UChar m) |
1243 | ++{ |
1244 | ++ static const IRType results[] = {Ity_F32, Ity_F64, Ity_F128}; |
1245 | ++ if (m >= 2 && m <= 4) |
1246 | ++ return results[m - 2]; |
1247 | ++ return Ity_INVALID; |
1248 | ++} |
1249 | ++ |
1250 | ++/* Determine number of elements from instruction's floating-point format |
1251 | ++ field */ |
1252 | ++static UChar |
1253 | ++s390_vr_get_n_elem(const UChar m) |
1254 | ++{ |
1255 | ++ if (m >= 2 && m <= 4) |
1256 | ++ return 1 << (4 - m); |
1257 | ++ return 0; |
1258 | ++} |
1259 | ++ |
1260 | + /* Determine if Condition Code Set (CS) flag is set in m field */ |
1261 | + #define s390_vr_is_cs_set(m) (((m) & 0x1) != 0) |
1262 | + |
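Review note on `s390_vr_get_ftype` and `s390_vr_get_n_elem` above: the floating-point format codes 2, 3 and 4 select 32-bit, 64-bit and 128-bit elements, which is why the element count is `1 << (4 - m)` for a 16-byte vector register. The following sketch restates that relationship with plain integers. It is illustrative only; `fp_elem_bytes`, `fp_n_elem` and `check_fp_formats` are hypothetical names and do not appear in the patch.

```c
#include <assert.h>

/* Element size in bytes for format code m (2 = 32-bit, 3 = 64-bit,
   4 = 128-bit); 0 for any other code. */
static unsigned fp_elem_bytes(unsigned m)
{
   return (m >= 2 && m <= 4) ? 1u << m : 0;
}

/* Mirrors the arithmetic of s390_vr_get_n_elem: number of elements of
   format m that fit into one 16-byte vector register; 0 if m is invalid. */
static unsigned fp_n_elem(unsigned m)
{
   return (m >= 2 && m <= 4) ? 1u << (4 - m) : 0;
}

/* For every valid format, element size times element count is 16 bytes. */
static void check_fp_formats(void)
{
   for (unsigned m = 2; m <= 4; m++)
      assert(fp_elem_bytes(m) * fp_n_elem(m) == 16);
}
```

So format 2 gives four elements, format 3 gives two, and format 4 gives one, matching the `Ity_F32`, `Ity_F64`, `Ity_F128` table in the hunk.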
1263 | +@@ -2191,12 +2218,15 @@ |
1264 | + goto invalidIndex; |
1265 | + } |
1266 | + return vr_offset(archreg) + sizeof(ULong) * index; |
1267 | ++ |
1268 | + case Ity_V128: |
1269 | ++ case Ity_F128: |
1270 | + if(index == 0) { |
1271 | + return vr_qw_offset(archreg); |
1272 | + } else { |
1273 | + goto invalidIndex; |
1274 | + } |
1275 | ++ |
1276 | + default: |
1277 | + vpanic("s390_vr_offset_by_index: unknown type"); |
1278 | + } |
1279 | +@@ -2214,7 +2244,14 @@ |
1280 | + UInt offset = s390_vr_offset_by_index(archreg, type, index); |
1281 | + vassert(typeOfIRExpr(irsb->tyenv, expr) == type); |
1282 | + |
1283 | +- stmt(IRStmt_Put(offset, expr)); |
1284 | ++ if (type == Ity_F128) { |
1285 | ++ IRTemp val = newTemp(Ity_F128); |
1286 | ++ assign(val, expr); |
1287 | ++ stmt(IRStmt_Put(offset, unop(Iop_F128HItoF64, mkexpr(val)))); |
1288 | ++ stmt(IRStmt_Put(offset + 8, unop(Iop_F128LOtoF64, mkexpr(val)))); |
1289 | ++ } else { |
1290 | ++ stmt(IRStmt_Put(offset, expr)); |
1291 | ++ } |
1292 | + } |
1293 | + |
1294 | + /* Read type sized part specified by index of a vr register. */ |
1295 | +@@ -2222,6 +2259,11 @@ |
1296 | + get_vr(UInt archreg, IRType type, UChar index) |
1297 | + { |
1298 | + UInt offset = s390_vr_offset_by_index(archreg, type, index); |
1299 | ++ if (type == Ity_F128) { |
1300 | ++ return binop(Iop_F64HLtoF128, |
1301 | ++ IRExpr_Get(offset, Ity_F64), |
1302 | ++ IRExpr_Get(offset + 8, Ity_F64)); |
1303 | ++ } |
1304 | + return IRExpr_Get(offset, type); |
1305 | + } |
1306 | + |
1307 | +@@ -2297,11 +2339,11 @@ |
1308 | + return mkexpr(output); |
1309 | + } |
1310 | + |
1311 | +-/* Load bytes into v1. |
1312 | +- maxIndex specifies max index to load and must be Ity_I32. |
1313 | +- If maxIndex >= 15, all 16 bytes are loaded. |
1314 | +- All bytes after maxIndex are zeroed. */ |
1315 | +-static void s390_vr_loadWithLength(UChar v1, IRTemp addr, IRExpr *maxIndex) |
1316 | ++/* Starting from addr, load at most maxIndex + 1 bytes into v1. Fill the |
1317 | ++ leftmost or rightmost bytes of v1, depending on whether `rightmost' is set. |
1318 | ++ If maxIndex >= 15, load all 16 bytes; otherwise clear the remaining bytes. */ |
1319 | ++static void |
1320 | ++s390_vr_loadWithLength(UChar v1, IRTemp addr, IRExpr *maxIndex, Bool rightmost) |
1321 | + { |
1322 | + IRTemp maxIdx = newTemp(Ity_I32); |
1323 | + IRTemp cappedMax = newTemp(Ity_I64); |
1324 | +@@ -2314,8 +2356,8 @@ |
1325 | + crossed if and only if the real insn would have crossed it as well. |
1326 | + Thus, if the bytes to load are fully contained in an aligned 16-byte |
1327 | + chunk, load the whole 16-byte aligned chunk, and otherwise load 16 bytes |
1328 | +- from the unaligned address. Then shift the loaded data left-aligned |
1329 | +- into the target vector register. */ |
1330 | ++ from the unaligned address. Then shift the loaded data left- or |
1331 | ++ right-aligned into the target vector register. */ |
1332 | + |
1333 | + assign(maxIdx, maxIndex); |
1334 | + assign(cappedMax, mkite(binop(Iop_CmpLT32U, mkexpr(maxIdx), mkU32(15)), |
1335 | +@@ -2328,20 +2370,60 @@ |
1336 | + assign(back, mkite(binop(Iop_CmpLE64U, mkexpr(offset), mkexpr(zeroed)), |
1337 | + mkexpr(offset), mkU64(0))); |
1338 | + |
1339 | +- /* How much to shift the loaded 16-byte vector to the right, and then to |
1340 | +- the left. Since both 'zeroed' and 'back' range from 0 to 15, the shift |
1341 | +- amounts range from 0 to 120. */ |
1342 | +- IRExpr *shrAmount = binop(Iop_Shl64, |
1343 | +- binop(Iop_Sub64, mkexpr(zeroed), mkexpr(back)), |
1344 | +- mkU8(3)); |
1345 | +- IRExpr *shlAmount = binop(Iop_Shl64, mkexpr(zeroed), mkU8(3)); |
1346 | +- |
1347 | +- put_vr_qw(v1, binop(Iop_ShlV128, |
1348 | +- binop(Iop_ShrV128, |
1349 | +- load(Ity_V128, |
1350 | +- binop(Iop_Sub64, mkexpr(addr), mkexpr(back))), |
1351 | +- unop(Iop_64to8, shrAmount)), |
1352 | +- unop(Iop_64to8, shlAmount))); |
1353 | ++ IRExpr* chunk = load(Ity_V128, binop(Iop_Sub64, mkexpr(addr), mkexpr(back))); |
1354 | ++ |
1355 | ++ /* Shift the loaded 16-byte vector to the right, then to the left, or vice |
1356 | ++ versa, where each shift amount ranges from 0 to 120. */ |
1357 | ++ IRExpr* shift1; |
1358 | ++ IRExpr* shift2 = unop(Iop_64to8, binop(Iop_Shl64, mkexpr(zeroed), mkU8(3))); |
1359 | ++ |
1360 | ++ if (rightmost) { |
1361 | ++ shift1 = unop(Iop_64to8, binop(Iop_Shl64, mkexpr(back), mkU8(3))); |
1362 | ++ put_vr_qw(v1, binop(Iop_ShrV128, |
1363 | ++ binop(Iop_ShlV128, chunk, shift1), |
1364 | ++ shift2)); |
1365 | ++ } else { |
1366 | ++ shift1 = unop(Iop_64to8, |
1367 | ++ binop(Iop_Shl64, |
1368 | ++ binop(Iop_Sub64, mkexpr(zeroed), mkexpr(back)), |
1369 | ++ mkU8(3))); |
1370 | ++ put_vr_qw(v1, binop(Iop_ShlV128, |
1371 | ++ binop(Iop_ShrV128, chunk, shift1), |
1372 | ++ shift2)); |
1373 | ++ } |
1374 | ++} |
1375 | ++ |
1376 | ++/* Store at most maxIndex + 1 bytes from v1 to addr. Store the leftmost or |
1377 | ++ rightmost bytes of v1, depending on whether `rightmost' is set. If maxIndex |
1378 | ++ >= 15, store all 16 bytes. */ |
1379 | ++static void |
1380 | ++s390_vr_storeWithLength(UChar v1, IRTemp addr, IRExpr *maxIndex, Bool rightmost) |
1381 | ++{ |
1382 | ++ IRTemp maxIdx = newTemp(Ity_I32); |
1383 | ++ IRTemp cappedMax = newTemp(Ity_I64); |
1384 | ++ IRTemp counter = newTemp(Ity_I64); |
1385 | ++ IRExpr* offset; |
1386 | ++ |
1387 | ++ assign(maxIdx, maxIndex); |
1388 | ++ assign(cappedMax, mkite(binop(Iop_CmpLT32U, mkexpr(maxIdx), mkU32(15)), |
1389 | ++ unop(Iop_32Uto64, mkexpr(maxIdx)), mkU64(15))); |
1390 | ++ |
1391 | ++ assign(counter, get_counter_dw0()); |
1392 | ++ |
1393 | ++ if (rightmost) |
1394 | ++ offset = binop(Iop_Add64, |
1395 | ++ binop(Iop_Sub64, mkU64(15), mkexpr(cappedMax)), |
1396 | ++ mkexpr(counter)); |
1397 | ++ else |
1398 | ++ offset = mkexpr(counter); |
1399 | ++ |
1400 | ++ store(binop(Iop_Add64, mkexpr(addr), mkexpr(counter)), |
1401 | ++ binop(Iop_GetElem8x16, get_vr_qw(v1), unop(Iop_64to8, offset))); |
1402 | ++ |
1403 | ++ /* Check for end of field */ |
1404 | ++ put_counter_dw0(binop(Iop_Add64, mkexpr(counter), mkU64(1))); |
1405 | ++ iterate_if(binop(Iop_CmpNE64, mkexpr(counter), mkexpr(cappedMax))); |
1406 | ++ put_counter_dw0(mkU64(0)); |
1407 | + } |
1408 | + |
1409 | + /* Bitwise vCond ? v1 : v2 |
1410 | +@@ -3752,6 +3834,28 @@ |
1411 | + s390_disasm(ENC5(MNM, GPR, UDXB, VR, UINT), mnm, r1, d2, 0, b2, v3, m4); |
1412 | + } |
1413 | + |
1414 | ++static void |
1415 | ++s390_format_VRS_RRDV(const HChar *(*irgen)(UChar v1, UChar r3, IRTemp op2addr), |
1416 | ++ UChar v1, UChar r3, UChar b2, UShort d2, UChar rxb) |
1417 | ++{ |
1418 | ++ const HChar *mnm; |
1419 | ++ IRTemp op2addr = newTemp(Ity_I64); |
1420 | ++ |
1421 | ++ if (! s390_host_has_vx) { |
1422 | ++ emulation_failure(EmFail_S390X_vx); |
1423 | ++ return; |
1424 | ++ } |
1425 | ++ |
1426 | ++ assign(op2addr, binop(Iop_Add64, mkU64(d2), b2 != 0 ? get_gpr_dw0(b2) : |
1427 | ++ mkU64(0))); |
1428 | ++ |
1429 | ++ v1 = s390_vr_getVRindex(v1, 4, rxb); |
1430 | ++ mnm = irgen(v1, r3, op2addr); |
1431 | ++ |
1432 | ++ if (UNLIKELY(vex_traceflags & VEX_TRACE_FE)) |
1433 | ++ s390_disasm(ENC4(MNM, VR, GPR, UDXB), mnm, v1, r3, d2, 0, b2); |
1434 | ++} |
1435 | ++ |
1436 | + |
1437 | + static void |
1438 | + s390_format_VRS_VRDVM(const HChar *(*irgen)(UChar v1, IRTemp op2addr, UChar v3, |
1439 | +@@ -4084,6 +4188,29 @@ |
1440 | + mnm, v1, v2, v3, m4, m5, m6); |
1441 | + } |
1442 | + |
1443 | ++static void |
1444 | ++s390_format_VSI_URDV(const HChar *(*irgen)(UChar v1, IRTemp op2addr, UChar i3), |
1445 | ++ UChar v1, UChar b2, UChar d2, UChar i3, UChar rxb) |
1446 | ++{ |
1447 | ++ const HChar *mnm; |
1448 | ++ IRTemp op2addr = newTemp(Ity_I64); |
1449 | ++ |
1450 | ++ if (!s390_host_has_vx) { |
1451 | ++ emulation_failure(EmFail_S390X_vx); |
1452 | ++ return; |
1453 | ++ } |
1454 | ++ |
1455 | ++ v1 = s390_vr_getVRindex(v1, 4, rxb); |
1456 | ++ |
1457 | ++ assign(op2addr, binop(Iop_Add64, mkU64(d2), b2 != 0 ? get_gpr_dw0(b2) : |
1458 | ++ mkU64(0))); |
1459 | ++ |
1460 | ++ mnm = irgen(v1, op2addr, i3); |
1461 | ++ |
1462 | ++ if (vex_traceflags & VEX_TRACE_FE) |
1463 | ++ s390_disasm(ENC4(MNM, VR, UDXB, UINT), mnm, v1, d2, 0, b2, i3); |
1464 | ++} |
1465 | ++ |
1466 | + /*------------------------------------------------------------*/ |
1467 | + /*--- Build IR for opcodes ---*/ |
1468 | + /*------------------------------------------------------------*/ |
1469 | +@@ -16183,7 +16310,9 @@ |
1470 | + static const HChar * |
1471 | + s390_irgen_VLLEZ(UChar v1, IRTemp op2addr, UChar m3) |
1472 | + { |
1473 | +- IRType type = s390_vr_get_type(m3); |
1474 | ++ s390_insn_assert("vllez", m3 <= 3 || m3 == 6); |
1475 | ++ |
1476 | ++ IRType type = s390_vr_get_type(m3 & 3); |
1477 | + IRExpr* op2 = load(type, mkexpr(op2addr)); |
1478 | + IRExpr* op2as64bit; |
1479 | + switch (type) { |
1480 | +@@ -16203,7 +16332,13 @@ |
1481 | + vpanic("s390_irgen_VLLEZ: unknown type"); |
1482 | + } |
1483 | + |
1484 | +- put_vr_dw0(v1, op2as64bit); |
1485 | ++ if (m3 == 6) { |
1486 | ++ /* left-aligned */ |
1487 | ++ put_vr_dw0(v1, binop(Iop_Shl64, op2as64bit, mkU8(32))); |
1488 | ++ } else { |
1489 | ++ /* right-aligned */ |
1490 | ++ put_vr_dw0(v1, op2as64bit); |
1491 | ++ } |
1492 | + put_vr_dw1(v1, mkU64(0)); |
1493 | + return "vllez"; |
1494 | + } |
1495 | +@@ -16612,7 +16747,7 @@ |
1496 | + s390_getCountToBlockBoundary(addr, m3), |
1497 | + mkU32(1)); |
1498 | + |
1499 | +- s390_vr_loadWithLength(v1, addr, maxIndex); |
1500 | ++ s390_vr_loadWithLength(v1, addr, maxIndex, False); |
1501 | + |
1502 | + return "vlbb"; |
1503 | + } |
1504 | +@@ -16620,42 +16755,51 @@ |
1505 | + static const HChar * |
1506 | + s390_irgen_VLL(UChar v1, IRTemp addr, UChar r3) |
1507 | + { |
1508 | +- s390_vr_loadWithLength(v1, addr, get_gpr_w1(r3)); |
1509 | ++ s390_vr_loadWithLength(v1, addr, get_gpr_w1(r3), False); |
1510 | + |
1511 | + return "vll"; |
1512 | + } |
1513 | + |
1514 | + static const HChar * |
1515 | +-s390_irgen_VSTL(UChar v1, IRTemp addr, UChar r3) |
1516 | ++s390_irgen_VLRL(UChar v1, IRTemp addr, UChar i3) |
1517 | + { |
1518 | +- IRTemp counter = newTemp(Ity_I64); |
1519 | +- IRTemp maxIndexToStore = newTemp(Ity_I64); |
1520 | +- IRTemp gpr3 = newTemp(Ity_I64); |
1521 | ++ s390_insn_assert("vlrl", (i3 & 0xf0) == 0); |
1522 | ++ s390_vr_loadWithLength(v1, addr, mkU32((UInt) i3), True); |
1523 | + |
1524 | +- assign(gpr3, unop(Iop_32Uto64, get_gpr_w1(r3))); |
1525 | +- assign(maxIndexToStore, mkite(binop(Iop_CmpLE64U, |
1526 | +- mkexpr(gpr3), |
1527 | +- mkU64(16) |
1528 | +- ), |
1529 | +- mkexpr(gpr3), |
1530 | +- mkU64(16) |
1531 | +- ) |
1532 | +- ); |
1533 | +- |
1534 | +- assign(counter, get_counter_dw0()); |
1535 | ++ return "vlrl"; |
1536 | ++} |
1537 | + |
1538 | +- store(binop(Iop_Add64, mkexpr(addr), mkexpr(counter)), |
1539 | +- binop(Iop_GetElem8x16, get_vr_qw(v1), unop(Iop_64to8, mkexpr(counter)))); |
1540 | ++static const HChar * |
1541 | ++s390_irgen_VLRLR(UChar v1, UChar r3, IRTemp addr) |
1542 | ++{ |
1543 | ++ s390_vr_loadWithLength(v1, addr, get_gpr_w1(r3), True); |
1544 | + |
1545 | +- /* Check for end of field */ |
1546 | +- put_counter_dw0(binop(Iop_Add64, mkexpr(counter), mkU64(1))); |
1547 | +- iterate_if(binop(Iop_CmpNE64, mkexpr(counter), mkexpr(maxIndexToStore))); |
1548 | +- put_counter_dw0(mkU64(0)); |
1549 | ++ return "vlrlr"; |
1550 | ++} |
1551 | + |
1552 | ++static const HChar * |
1553 | ++s390_irgen_VSTL(UChar v1, IRTemp addr, UChar r3) |
1554 | ++{ |
1555 | ++ s390_vr_storeWithLength(v1, addr, get_gpr_w1(r3), False); |
1556 | + return "vstl"; |
1557 | + } |
1558 | + |
1559 | + static const HChar * |
1560 | ++s390_irgen_VSTRL(UChar v1, IRTemp addr, UChar i3) |
1561 | ++{ |
1562 | ++ s390_insn_assert("vstrl", (i3 & 0xf0) == 0); |
1563 | ++ s390_vr_storeWithLength(v1, addr, mkU32((UInt) i3), True); |
1564 | ++ return "vstrl"; |
1565 | ++} |
1566 | ++ |
1567 | ++static const HChar * |
1568 | ++s390_irgen_VSTRLR(UChar v1, UChar r3, IRTemp addr) |
1569 | ++{ |
1570 | ++ s390_vr_storeWithLength(v1, addr, get_gpr_w1(r3), True); |
1571 | ++ return "vstrlr"; |
1572 | ++} |
1573 | ++ |
1574 | ++static const HChar * |
1575 | + s390_irgen_VX(UChar v1, UChar v2, UChar v3) |
1576 | + { |
1577 | + put_vr_qw(v1, binop(Iop_XorV128, get_vr_qw(v2), get_vr_qw(v3))); |
1578 | +@@ -16680,6 +16824,24 @@ |
1579 | + } |
1580 | + |
1581 | + static const HChar * |
1582 | ++s390_irgen_VOC(UChar v1, UChar v2, UChar v3) |
1583 | ++{ |
1584 | ++ put_vr_qw(v1, binop(Iop_OrV128, get_vr_qw(v2), |
1585 | ++ unop(Iop_NotV128, get_vr_qw(v3)))); |
1586 | ++ |
1587 | ++ return "voc"; |
1588 | ++} |
1589 | ++ |
1590 | ++static const HChar * |
1591 | ++s390_irgen_VNN(UChar v1, UChar v2, UChar v3) |
1592 | ++{ |
1593 | ++ put_vr_qw(v1, unop(Iop_NotV128, |
1594 | ++ binop(Iop_AndV128, get_vr_qw(v2), get_vr_qw(v3)))); |
1595 | ++ |
1596 | ++ return "vnn"; |
1597 | ++} |
1598 | ++ |
1599 | ++static const HChar * |
1600 | + s390_irgen_VNO(UChar v1, UChar v2, UChar v3) |
1601 | + { |
1602 | + put_vr_qw(v1, unop(Iop_NotV128, |
1603 | +@@ -16689,6 +16851,15 @@ |
1604 | + } |
1605 | + |
1606 | + static const HChar * |
1607 | ++s390_irgen_VNX(UChar v1, UChar v2, UChar v3) |
1608 | ++{ |
1609 | ++ put_vr_qw(v1, unop(Iop_NotV128, |
1610 | ++ binop(Iop_XorV128, get_vr_qw(v2), get_vr_qw(v3)))); |
1611 | ++ |
1612 | ++ return "vnx"; |
1613 | ++} |
1614 | ++ |
1615 | ++static const HChar * |
1616 | + s390_irgen_LZRF(UChar r1, IRTemp op2addr) |
1617 | + { |
1618 | + IRTemp op2 = newTemp(Ity_I32); |
1619 | +@@ -17496,9 +17667,19 @@ |
1620 | + static const HChar * |
1621 | + s390_irgen_VPOPCT(UChar v1, UChar v2, UChar m3) |
1622 | + { |
1623 | +- vassert(m3 == 0); |
1624 | ++ s390_insn_assert("vpopct", m3 <= 3); |
1625 | + |
1626 | +- put_vr_qw(v1, unop(Iop_Cnt8x16, get_vr_qw(v2))); |
1627 | ++ IRExpr* cnt = unop(Iop_Cnt8x16, get_vr_qw(v2)); |
1628 | ++ |
1629 | ++ if (m3 >= 1) { |
1630 | ++ cnt = unop(Iop_PwAddL8Ux16, cnt); |
1631 | ++ if (m3 >= 2) { |
1632 | ++ cnt = unop(Iop_PwAddL16Ux8, cnt); |
1633 | ++ if (m3 == 3) |
1634 | ++ cnt = unop(Iop_PwAddL32Ux4, cnt); |
1635 | ++ } |
1636 | ++ } |
1637 | ++ put_vr_qw(v1, cnt); |
1638 | + |
1639 | + return "vpopct"; |
1640 | + } |
1641 | +@@ -18332,12 +18513,53 @@ |
1642 | + return "vmalh"; |
1643 | + } |
1644 | + |
1645 | ++static const HChar * |
1646 | ++s390_irgen_VMSL(UChar v1, UChar v2, UChar v3, UChar v4, UChar m5, UChar m6) |
1647 | ++{ |
1648 | ++ s390_insn_assert("vmsl", m5 == 3 && (m6 & 3) == 0); |
1649 | ++ |
1650 | ++ IRDirty* d; |
1651 | ++ IRTemp cc = newTemp(Ity_I64); |
1652 | ++ |
1653 | ++ s390x_vec_op_details_t details = { .serialized = 0ULL }; |
1654 | ++ details.op = S390_VEC_OP_VMSL; |
1655 | ++ details.v1 = v1; |
1656 | ++ details.v2 = v2; |
1657 | ++ details.v3 = v3; |
1658 | ++ details.v4 = v4; |
1659 | ++ details.m4 = m5; |
1660 | ++ details.m5 = m6; |
1661 | ++ |
1662 | ++ d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op", |
1663 | ++ &s390x_dirtyhelper_vec_op, |
1664 | ++ mkIRExprVec_2(IRExpr_GSPTR(), |
1665 | ++ mkU64(details.serialized))); |
1666 | ++ |
1667 | ++ d->nFxState = 4; |
1668 | ++ vex_bzero(&d->fxState, sizeof(d->fxState)); |
1669 | ++ d->fxState[0].fx = Ifx_Read; |
1670 | ++ d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128); |
1671 | ++ d->fxState[0].size = sizeof(V128); |
1672 | ++ d->fxState[1].fx = Ifx_Read; |
1673 | ++ d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v3 * sizeof(V128); |
1674 | ++ d->fxState[1].size = sizeof(V128); |
1675 | ++ d->fxState[2].fx = Ifx_Read; |
1676 | ++ d->fxState[2].offset = S390X_GUEST_OFFSET(guest_v0) + v4 * sizeof(V128); |
1677 | ++ d->fxState[2].size = sizeof(V128); |
1678 | ++ d->fxState[3].fx = Ifx_Write; |
1679 | ++ d->fxState[3].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128); |
1680 | ++ d->fxState[3].size = sizeof(V128); |
1681 | ++ |
1682 | ++ stmt(IRStmt_Dirty(d)); |
1683 | ++ |
1684 | ++ return "vmsl"; |
1685 | ++} |
1686 | ++ |
1687 | + static void |
1688 | +-s390_vector_fp_convert(IROp op, IRType fromType, IRType toType, |
1689 | ++s390_vector_fp_convert(IROp op, IRType fromType, IRType toType, Bool rounding, |
1690 | + UChar v1, UChar v2, UChar m3, UChar m4, UChar m5) |
1691 | + { |
1692 | + Bool isSingleElementOp = s390_vr_is_single_element_control_set(m4); |
1693 | +- UChar maxIndex = isSingleElementOp ? 0 : 1; |
1694 | + |
1695 | + /* For Iop_F32toF64 we do this: |
1696 | + f32[0] -> f64[0] |
1697 | +@@ -18350,14 +18572,21 @@ |
1698 | + The magic below with scaling factors is used to achieve the logic |
1699 | + described above. |
1700 | + */ |
1701 | +- const UChar sourceIndexScaleFactor = (op == Iop_F32toF64) ? 2 : 1; |
1702 | +- const UChar destinationIndexScaleFactor = (op == Iop_F64toF32) ? 2 : 1; |
1703 | ++ Int size_diff = sizeofIRType(toType) - sizeofIRType(fromType); |
1704 | ++ const UChar sourceIndexScaleFactor = size_diff > 0 ? 2 : 1; |
1705 | ++ const UChar destinationIndexScaleFactor = size_diff < 0 ? 2 : 1; |
1706 | ++ UChar n_elem = (isSingleElementOp ? 1 : |
1707 | ++ 16 / (size_diff > 0 ? |
1708 | ++ sizeofIRType(toType) : sizeofIRType(fromType))); |
1709 | + |
1710 | +- const Bool isUnary = (op == Iop_F32toF64); |
1711 | +- for (UChar i = 0; i <= maxIndex; i++) { |
1712 | ++ for (UChar i = 0; i < n_elem; i++) { |
1713 | + IRExpr* argument = get_vr(v2, fromType, i * sourceIndexScaleFactor); |
1714 | + IRExpr* result; |
1715 | +- if (!isUnary) { |
1716 | ++ if (rounding) { |
1717 | ++ if (!s390_host_has_fpext && m5 != S390_BFP_ROUND_PER_FPC) { |
1718 | ++ emulation_warning(EmWarn_S390X_fpext_rounding); |
1719 | ++ m5 = S390_BFP_ROUND_PER_FPC; |
1720 | ++ } |
1721 | + result = binop(op, |
1722 | + mkexpr(encode_bfp_rounding_mode(m5)), |
1723 | + argument); |
1724 | +@@ -18366,10 +18595,6 @@ |
1725 | + } |
1726 | + put_vr(v1, toType, i * destinationIndexScaleFactor, result); |
1727 | + } |
1728 | +- |
1729 | +- if (isSingleElementOp) { |
1730 | +- put_vr_dw1(v1, mkU64(0)); |
1731 | +- } |
1732 | + } |
1733 | + |
1734 | + static const HChar * |
1735 | +@@ -18377,12 +18602,8 @@ |
1736 | + { |
1737 | + s390_insn_assert("vcdg", m3 == 3); |
1738 | + |
1739 | +- if (!s390_host_has_fpext && m5 != S390_BFP_ROUND_PER_FPC) { |
1740 | +- emulation_warning(EmWarn_S390X_fpext_rounding); |
1741 | +- m5 = S390_BFP_ROUND_PER_FPC; |
1742 | +- } |
1743 | +- |
1744 | +- s390_vector_fp_convert(Iop_I64StoF64, Ity_I64, Ity_F64, v1, v2, m3, m4, m5); |
1745 | ++ s390_vector_fp_convert(Iop_I64StoF64, Ity_I64, Ity_F64, True, |
1746 | ++ v1, v2, m3, m4, m5); |
1747 | + |
1748 | + return "vcdg"; |
1749 | + } |
1750 | +@@ -18392,12 +18613,8 @@ |
1751 | + { |
1752 | + s390_insn_assert("vcdlg", m3 == 3); |
1753 | + |
1754 | +- if (!s390_host_has_fpext && m5 != S390_BFP_ROUND_PER_FPC) { |
1755 | +- emulation_warning(EmWarn_S390X_fpext_rounding); |
1756 | +- m5 = S390_BFP_ROUND_PER_FPC; |
1757 | +- } |
1758 | +- |
1759 | +- s390_vector_fp_convert(Iop_I64UtoF64, Ity_I64, Ity_F64, v1, v2, m3, m4, m5); |
1760 | ++ s390_vector_fp_convert(Iop_I64UtoF64, Ity_I64, Ity_F64, True, |
1761 | ++ v1, v2, m3, m4, m5); |
1762 | + |
1763 | + return "vcdlg"; |
1764 | + } |
1765 | +@@ -18407,12 +18624,8 @@ |
1766 | + { |
1767 | + s390_insn_assert("vcgd", m3 == 3); |
1768 | + |
1769 | +- if (!s390_host_has_fpext && m5 != S390_BFP_ROUND_PER_FPC) { |
1770 | +- emulation_warning(EmWarn_S390X_fpext_rounding); |
1771 | +- m5 = S390_BFP_ROUND_PER_FPC; |
1772 | +- } |
1773 | +- |
1774 | +- s390_vector_fp_convert(Iop_F64toI64S, Ity_F64, Ity_I64, v1, v2, m3, m4, m5); |
1775 | ++ s390_vector_fp_convert(Iop_F64toI64S, Ity_F64, Ity_I64, True, |
1776 | ++ v1, v2, m3, m4, m5); |
1777 | + |
1778 | + return "vcgd"; |
1779 | + } |
1780 | +@@ -18422,12 +18635,8 @@ |
1781 | + { |
1782 | + s390_insn_assert("vclgd", m3 == 3); |
1783 | + |
1784 | +- if (!s390_host_has_fpext && m5 != S390_BFP_ROUND_PER_FPC) { |
1785 | +- emulation_warning(EmWarn_S390X_fpext_rounding); |
1786 | +- m5 = S390_BFP_ROUND_PER_FPC; |
1787 | +- } |
1788 | +- |
1789 | +- s390_vector_fp_convert(Iop_F64toI64U, Ity_F64, Ity_I64, v1, v2, m3, m4, m5); |
1790 | ++ s390_vector_fp_convert(Iop_F64toI64U, Ity_F64, Ity_I64, True, |
1791 | ++ v1, v2, m3, m4, m5); |
1792 | + |
1793 | + return "vclgd"; |
1794 | + } |
1795 | +@@ -18435,246 +18644,262 @@ |
1796 | + static const HChar * |
1797 | + s390_irgen_VFI(UChar v1, UChar v2, UChar m3, UChar m4, UChar m5) |
1798 | + { |
1799 | +- s390_insn_assert("vfi", m3 == 3); |
1800 | ++ s390_insn_assert("vfi", |
1801 | ++ (m3 == 3 || (s390_host_has_vxe && m3 >= 2 && m3 <= 4))); |
1802 | + |
1803 | +- if (!s390_host_has_fpext && m5 != S390_BFP_ROUND_PER_FPC) { |
1804 | +- emulation_warning(EmWarn_S390X_fpext_rounding); |
1805 | +- m5 = S390_BFP_ROUND_PER_FPC; |
1806 | ++ switch (m3) { |
1807 | ++ case 2: s390_vector_fp_convert(Iop_RoundF32toInt, Ity_F32, Ity_F32, True, |
1808 | ++ v1, v2, m3, m4, m5); break; |
1809 | ++ case 3: s390_vector_fp_convert(Iop_RoundF64toInt, Ity_F64, Ity_F64, True, |
1810 | ++ v1, v2, m3, m4, m5); break; |
1811 | ++ case 4: s390_vector_fp_convert(Iop_RoundF128toInt, Ity_F128, Ity_F128, True, |
1812 | ++ v1, v2, m3, m4, m5); break; |
1813 | + } |
1814 | + |
1815 | +- s390_vector_fp_convert(Iop_RoundF64toInt, Ity_F64, Ity_F64, |
1816 | +- v1, v2, m3, m4, m5); |
1817 | +- |
1818 | +- return "vcgld"; |
1819 | ++ return "vfi"; |
1820 | + } |
1821 | + |
1822 | + static const HChar * |
1823 | +-s390_irgen_VLDE(UChar v1, UChar v2, UChar m3, UChar m4, UChar m5) |
1824 | ++s390_irgen_VFLL(UChar v1, UChar v2, UChar m3, UChar m4, UChar m5) |
1825 | + { |
1826 | +- s390_insn_assert("vlde", m3 == 2); |
1827 | ++ s390_insn_assert("vfll", m3 == 2 || (s390_host_has_vxe && m3 == 3)); |
1828 | + |
1829 | +- s390_vector_fp_convert(Iop_F32toF64, Ity_F32, Ity_F64, v1, v2, m3, m4, m5); |
1830 | ++ if (m3 == 2) |
1831 | ++ s390_vector_fp_convert(Iop_F32toF64, Ity_F32, Ity_F64, False, |
1832 | ++ v1, v2, m3, m4, m5); |
1833 | ++ else |
1834 | ++ s390_vector_fp_convert(Iop_F64toF128, Ity_F64, Ity_F128, False, |
1835 | ++ v1, v2, m3, m4, m5); |
1836 | + |
1837 | +- return "vlde"; |
1838 | ++ return "vfll"; |
1839 | + } |
1840 | + |
1841 | + static const HChar * |
1842 | +-s390_irgen_VLED(UChar v1, UChar v2, UChar m3, UChar m4, UChar m5) |
1843 | ++s390_irgen_VFLR(UChar v1, UChar v2, UChar m3, UChar m4, UChar m5) |
1844 | + { |
1845 | +- s390_insn_assert("vled", m3 == 3); |
1846 | ++ s390_insn_assert("vflr", m3 == 3 || (s390_host_has_vxe && m3 == 2)); |
1847 | + |
1848 | +- if (!s390_host_has_fpext && m5 != S390_BFP_ROUND_PER_FPC) { |
1849 | +- m5 = S390_BFP_ROUND_PER_FPC; |
1850 | +- } |
1851 | +- |
1852 | +- s390_vector_fp_convert(Iop_F64toF32, Ity_F64, Ity_F32, v1, v2, m3, m4, m5); |
1853 | ++ if (m3 == 3) |
1854 | ++ s390_vector_fp_convert(Iop_F64toF32, Ity_F64, Ity_F32, True, |
1855 | ++ v1, v2, m3, m4, m5); |
1856 | ++ else |
1857 | ++ s390_vector_fp_convert(Iop_F128toF64, Ity_F128, Ity_F64, True, |
1858 | ++ v1, v2, m3, m4, m5); |
1859 | + |
1860 | +- return "vled"; |
1861 | ++ return "vflr"; |
1862 | + } |
1863 | + |
1864 | + static const HChar * |
1865 | + s390_irgen_VFPSO(UChar v1, UChar v2, UChar m3, UChar m4, UChar m5) |
1866 | + { |
1867 | +- s390_insn_assert("vfpso", m3 == 3); |
1868 | ++ s390_insn_assert("vfpso", m5 <= 2 && |
1869 | ++ (m3 == 3 || (s390_host_has_vxe && m3 >= 2 && m3 <= 4))); |
1870 | + |
1871 | +- IRExpr* result; |
1872 | +- switch (m5) { |
1873 | +- case 0: { |
1874 | +- /* Invert sign */ |
1875 | +- if (!s390_vr_is_single_element_control_set(m4)) { |
1876 | +- result = unop(Iop_Neg64Fx2, get_vr_qw(v2)); |
1877 | +- } |
1878 | +- else { |
1879 | +- result = binop(Iop_64HLtoV128, |
1880 | +- unop(Iop_ReinterpF64asI64, |
1881 | +- unop(Iop_NegF64, get_vr(v2, Ity_F64, 0))), |
1882 | +- mkU64(0)); |
1883 | +- } |
1884 | +- break; |
1885 | +- } |
1886 | ++ Bool single = s390_vr_is_single_element_control_set(m4) || m3 == 4; |
1887 | ++ IRType type = single ? s390_vr_get_ftype(m3) : Ity_V128; |
1888 | ++ int idx = 2 * (m3 - 2) + (single ? 0 : 1); |
1889 | ++ |
1890 | ++ static const IROp negate_ops[] = { |
1891 | ++ Iop_NegF32, Iop_Neg32Fx4, |
1892 | ++ Iop_NegF64, Iop_Neg64Fx2, |
1893 | ++ Iop_NegF128 |
1894 | ++ }; |
1895 | ++ static const IROp abs_ops[] = { |
1896 | ++ Iop_AbsF32, Iop_Abs32Fx4, |
1897 | ++ Iop_AbsF64, Iop_Abs64Fx2, |
1898 | ++ Iop_AbsF128 |
1899 | ++ }; |
1900 | + |
1901 | +- case 1: { |
1902 | ++ if (m5 == 1) { |
1903 | + /* Set sign to negative */ |
1904 | +- IRExpr* highHalf = mkU64(0x8000000000000000ULL); |
1905 | +- if (!s390_vr_is_single_element_control_set(m4)) { |
1906 | +- IRExpr* lowHalf = highHalf; |
1907 | +- IRExpr* mask = binop(Iop_64HLtoV128, highHalf, lowHalf); |
1908 | +- result = binop(Iop_OrV128, get_vr_qw(v2), mask); |
1909 | +- } |
1910 | +- else { |
1911 | +- result = binop(Iop_64HLtoV128, |
1912 | +- binop(Iop_Or64, get_vr_dw0(v2), highHalf), |
1913 | +- mkU64(0ULL)); |
1914 | +- } |
1915 | +- |
1916 | +- break; |
1917 | ++ put_vr(v1, type, 0, |
1918 | ++ unop(negate_ops[idx], |
1919 | ++ unop(abs_ops[idx], get_vr(v2, type, 0)))); |
1920 | ++ } else { |
1921 | ++ /* m5 == 0: invert sign; m5 == 2: set sign to positive */ |
1922 | ++ const IROp *ops = m5 == 2 ? abs_ops : negate_ops; |
1923 | ++ put_vr(v1, type, 0, unop(ops[idx], get_vr(v2, type, 0))); |
1924 | + } |
1925 | + |
1926 | +- case 2: { |
1927 | +- /* Set sign to positive */ |
1928 | +- if (!s390_vr_is_single_element_control_set(m4)) { |
1929 | +- result = unop(Iop_Abs64Fx2, get_vr_qw(v2)); |
1930 | +- } |
1931 | +- else { |
1932 | +- result = binop(Iop_64HLtoV128, |
1933 | +- unop(Iop_ReinterpF64asI64, |
1934 | +- unop(Iop_AbsF64, get_vr(v2, Ity_F64, 0))), |
1935 | +- mkU64(0)); |
1936 | +- } |
1937 | ++ return "vfpso"; |
1938 | ++} |
1939 | + |
1940 | +- break; |
1941 | +- } |
1942 | ++static const HChar * |
1943 | ++s390x_vec_fp_binary_op(const HChar* mnm, const IROp ops[], |
1944 | ++ UChar v1, UChar v2, UChar v3, |
1945 | ++ UChar m4, UChar m5) |
1946 | ++{ |
1947 | ++ s390_insn_assert(mnm, (m5 & 7) == 0 && |
1948 | ++ (m4 == 3 || (s390_host_has_vxe && m4 >= 2 && m4 <= 4))); |
1949 | + |
1950 | +- default: |
1951 | +- vpanic("s390_irgen_VFPSO: Invalid m5 value"); |
1952 | +- } |
1953 | ++ int idx = 2 * (m4 - 2); |
1954 | + |
1955 | +- put_vr_qw(v1, result); |
1956 | +- if (s390_vr_is_single_element_control_set(m4)) { |
1957 | +- put_vr_dw1(v1, mkU64(0ULL)); |
1958 | ++ if (m4 == 4 || s390_vr_is_single_element_control_set(m5)) { |
1959 | ++ IRType type = s390_vr_get_ftype(m4); |
1960 | ++ put_vr(v1, type, 0, |
1961 | ++ triop(ops[idx], get_bfp_rounding_mode_from_fpc(), |
1962 | ++ get_vr(v2, type, 0), get_vr(v3, type, 0))); |
1963 | ++ } else { |
1964 | ++ put_vr_qw(v1, triop(ops[idx + 1], get_bfp_rounding_mode_from_fpc(), |
1965 | ++ get_vr_qw(v2), get_vr_qw(v3))); |
1966 | + } |
1967 | + |
1968 | +- return "vfpso"; |
1969 | ++ return mnm; |
1970 | + } |
1971 | + |
1972 | +-static void s390x_vec_fp_binary_op(IROp generalOp, IROp singleElementOp, |
1973 | +- UChar v1, UChar v2, UChar v3, UChar m4, |
1974 | +- UChar m5) |
1975 | ++static const HChar * |
1976 | ++s390x_vec_fp_unary_op(const HChar* mnm, const IROp ops[], |
1977 | ++ UChar v1, UChar v2, UChar m3, UChar m4) |
1978 | + { |
1979 | +- IRExpr* result; |
1980 | +- if (!s390_vr_is_single_element_control_set(m5)) { |
1981 | +- result = triop(generalOp, get_bfp_rounding_mode_from_fpc(), |
1982 | +- get_vr_qw(v2), get_vr_qw(v3)); |
1983 | +- } else { |
1984 | +- IRExpr* highHalf = triop(singleElementOp, |
1985 | +- get_bfp_rounding_mode_from_fpc(), |
1986 | +- get_vr(v2, Ity_F64, 0), |
1987 | +- get_vr(v3, Ity_F64, 0)); |
1988 | +- result = binop(Iop_64HLtoV128, unop(Iop_ReinterpF64asI64, highHalf), |
1989 | +- mkU64(0ULL)); |
1990 | +- } |
1991 | ++ s390_insn_assert(mnm, (m4 & 7) == 0 && |
1992 | ++ (m3 == 3 || (s390_host_has_vxe && m3 >= 2 && m3 <= 4))); |
1993 | + |
1994 | +- put_vr_qw(v1, result); |
1995 | +-} |
1996 | ++ int idx = 2 * (m3 - 2); |
1997 | + |
1998 | +-static void s390x_vec_fp_unary_op(IROp generalOp, IROp singleElementOp, |
1999 | +- UChar v1, UChar v2, UChar m3, UChar m4) |
2000 | +-{ |
2001 | +- IRExpr* result; |
2002 | +- if (!s390_vr_is_single_element_control_set(m4)) { |
2003 | +- result = binop(generalOp, get_bfp_rounding_mode_from_fpc(), |
2004 | +- get_vr_qw(v2)); |
2005 | ++ if (m3 == 4 || s390_vr_is_single_element_control_set(m4)) { |
2006 | ++ IRType type = s390_vr_get_ftype(m3); |
2007 | ++ put_vr(v1, type, 0, |
2008 | ++ binop(ops[idx], get_bfp_rounding_mode_from_fpc(), |
2009 | ++ get_vr(v2, type, 0))); |
2010 | + } |
2011 | + else { |
2012 | +- IRExpr* highHalf = binop(singleElementOp, |
2013 | +- get_bfp_rounding_mode_from_fpc(), |
2014 | +- get_vr(v2, Ity_F64, 0)); |
2015 | +- result = binop(Iop_64HLtoV128, unop(Iop_ReinterpF64asI64, highHalf), |
2016 | +- mkU64(0ULL)); |
2017 | ++ put_vr_qw(v1, binop(ops[idx + 1], get_bfp_rounding_mode_from_fpc(), |
2018 | ++ get_vr_qw(v2))); |
2019 | + } |
2020 | + |
2021 | +- put_vr_qw(v1, result); |
2022 | ++ return mnm; |
2023 | + } |
2024 | + |
2025 | + |
2026 | +-static void |
2027 | +-s390_vector_fp_mulAddOrSub(IROp singleElementOp, |
2028 | +- UChar v1, UChar v2, UChar v3, UChar v4, |
2029 | +- UChar m5, UChar m6) |
2030 | ++static const HChar * |
2031 | ++s390_vector_fp_mulAddOrSub(UChar v1, UChar v2, UChar v3, UChar v4, |
2032 | ++ UChar m5, UChar m6, |
2033 | ++ const HChar* mnm, const IROp single_ops[], |
2034 | ++ Bool negate) |
2035 | + { |
2036 | +- Bool isSingleElementOp = s390_vr_is_single_element_control_set(m5); |
2037 | ++ s390_insn_assert(mnm, m6 == 3 || (s390_host_has_vxe && m6 >= 2 && m6 <= 4)); |
2038 | ++ |
2039 | ++ static const IROp negate_ops[] = { Iop_NegF32, Iop_NegF64, Iop_NegF128 }; |
2040 | ++ IRType type = s390_vr_get_ftype(m6); |
2041 | ++ Bool single = s390_vr_is_single_element_control_set(m5) || m6 == 4; |
2042 | ++ UChar n_elem = single ? 1 : s390_vr_get_n_elem(m6); |
2043 | + IRTemp irrm_temp = newTemp(Ity_I32); |
2044 | + assign(irrm_temp, get_bfp_rounding_mode_from_fpc()); |
2045 | + IRExpr* irrm = mkexpr(irrm_temp); |
2046 | +- IRExpr* result; |
2047 | +- IRExpr* highHalf = qop(singleElementOp, |
2048 | +- irrm, |
2049 | +- get_vr(v2, Ity_F64, 0), |
2050 | +- get_vr(v3, Ity_F64, 0), |
2051 | +- get_vr(v4, Ity_F64, 0)); |
2052 | +- |
2053 | +- if (isSingleElementOp) { |
2054 | +- result = binop(Iop_64HLtoV128, unop(Iop_ReinterpF64asI64, highHalf), |
2055 | +- mkU64(0ULL)); |
2056 | +- } else { |
2057 | +- IRExpr* lowHalf = qop(singleElementOp, |
2058 | +- irrm, |
2059 | +- get_vr(v2, Ity_F64, 1), |
2060 | +- get_vr(v3, Ity_F64, 1), |
2061 | +- get_vr(v4, Ity_F64, 1)); |
2062 | +- result = binop(Iop_64HLtoV128, unop(Iop_ReinterpF64asI64, highHalf), |
2063 | +- unop(Iop_ReinterpF64asI64, lowHalf)); |
2064 | +- } |
2065 | + |
2066 | +- put_vr_qw(v1, result); |
2067 | ++ for (UChar idx = 0; idx < n_elem; idx++) { |
2068 | ++ IRExpr* result = qop(single_ops[m6 - 2], |
2069 | ++ irrm, |
2070 | ++ get_vr(v2, type, idx), |
2071 | ++ get_vr(v3, type, idx), |
2072 | ++ get_vr(v4, type, idx)); |
2073 | ++ put_vr(v1, type, idx, negate ? unop(negate_ops[m6 - 2], result) : result); |
2074 | ++ } |
2075 | ++ return mnm; |
2076 | + } |
2077 | + |
2078 | + static const HChar * |
2079 | + s390_irgen_VFA(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5) |
2080 | + { |
2081 | +- s390_insn_assert("vfa", m4 == 3); |
2082 | +- s390x_vec_fp_binary_op(Iop_Add64Fx2, Iop_AddF64, v1, v2, v3, m4, m5); |
2083 | +- return "vfa"; |
2084 | ++ static const IROp vfa_ops[] = { |
2085 | ++ Iop_AddF32, Iop_Add32Fx4, |
2086 | ++ Iop_AddF64, Iop_Add64Fx2, |
2087 | ++ Iop_AddF128, |
2088 | ++ }; |
2089 | ++ return s390x_vec_fp_binary_op("vfa", vfa_ops, v1, v2, v3, m4, m5); |
2090 | + } |
2091 | + |
2092 | + static const HChar * |
2093 | + s390_irgen_VFS(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5) |
2094 | + { |
2095 | +- s390_insn_assert("vfs", m4 == 3); |
2096 | +- s390x_vec_fp_binary_op(Iop_Sub64Fx2, Iop_SubF64, v1, v2, v3, m4, m5); |
2097 | +- return "vfs"; |
2098 | ++ static const IROp vfs_ops[] = { |
2099 | ++ Iop_SubF32, Iop_Sub32Fx4, |
2100 | ++ Iop_SubF64, Iop_Sub64Fx2, |
2101 | ++ Iop_SubF128, |
2102 | ++ }; |
2103 | ++ return s390x_vec_fp_binary_op("vfs", vfs_ops, v1, v2, v3, m4, m5); |
2104 | + } |
2105 | + |
2106 | + static const HChar * |
2107 | + s390_irgen_VFM(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5) |
2108 | + { |
2109 | +- s390_insn_assert("vfm", m4 == 3); |
2110 | +- s390x_vec_fp_binary_op(Iop_Mul64Fx2, Iop_MulF64, v1, v2, v3, m4, m5); |
2111 | +- return "vfm"; |
2112 | ++ static const IROp vfm_ops[] = { |
2113 | ++ Iop_MulF32, Iop_Mul32Fx4, |
2114 | ++ Iop_MulF64, Iop_Mul64Fx2, |
2115 | ++ Iop_MulF128, |
2116 | ++ }; |
2117 | ++ return s390x_vec_fp_binary_op("vfm", vfm_ops, v1, v2, v3, m4, m5); |
2118 | + } |
2119 | + |
2120 | + static const HChar * |
2121 | + s390_irgen_VFD(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5) |
2122 | + { |
2123 | +- s390_insn_assert("vfd", m4 == 3); |
2124 | +- s390x_vec_fp_binary_op(Iop_Div64Fx2, Iop_DivF64, v1, v2, v3, m4, m5); |
2125 | +- return "vfd"; |
2126 | ++ static const IROp vfd_ops[] = { |
2127 | ++ Iop_DivF32, Iop_Div32Fx4, |
2128 | ++ Iop_DivF64, Iop_Div64Fx2, |
2129 | ++ Iop_DivF128, |
2130 | ++ }; |
2131 | ++ return s390x_vec_fp_binary_op("vfd", vfd_ops, v1, v2, v3, m4, m5); |
2132 | + } |
2133 | + |
2134 | + static const HChar * |
2135 | + s390_irgen_VFSQ(UChar v1, UChar v2, UChar m3, UChar m4) |
2136 | + { |
2137 | +- s390_insn_assert("vfsq", m3 == 3); |
2138 | +- s390x_vec_fp_unary_op(Iop_Sqrt64Fx2, Iop_SqrtF64, v1, v2, m3, m4); |
2139 | +- |
2140 | +- return "vfsq"; |
2141 | ++ static const IROp vfsq_ops[] = { |
2142 | ++ Iop_SqrtF32, Iop_Sqrt32Fx4, |
2143 | ++ Iop_SqrtF64, Iop_Sqrt64Fx2, |
2144 | ++ Iop_SqrtF128 |
2145 | ++ }; |
2146 | ++ return s390x_vec_fp_unary_op("vfsq", vfsq_ops, v1, v2, m3, m4); |
2147 | + } |
2148 | + |
2149 | ++static const IROp FMA_single_ops[] = { |
2150 | ++ Iop_MAddF32, Iop_MAddF64, Iop_MAddF128 |
2151 | ++}; |
2152 | ++ |
2153 | + static const HChar * |
2154 | + s390_irgen_VFMA(UChar v1, UChar v2, UChar v3, UChar v4, UChar m5, UChar m6) |
2155 | + { |
2156 | +- s390_insn_assert("vfma", m6 == 3); |
2157 | +- s390_vector_fp_mulAddOrSub(Iop_MAddF64, v1, v2, v3, v4, m5, m6); |
2158 | +- return "vfma"; |
2159 | ++ return s390_vector_fp_mulAddOrSub(v1, v2, v3, v4, m5, m6, |
2160 | ++ "vfma", FMA_single_ops, False); |
2161 | + } |
2162 | + |
2163 | + static const HChar * |
2164 | ++s390_irgen_VFNMA(UChar v1, UChar v2, UChar v3, UChar v4, UChar m5, UChar m6) |
2165 | ++{ |
2166 | ++ return s390_vector_fp_mulAddOrSub(v1, v2, v3, v4, m5, m6, |
2167 | ++ "vfnma", FMA_single_ops, True); |
2168 | ++} |
2169 | ++ |
2170 | ++static const IROp FMS_single_ops[] = { |
2171 | ++ Iop_MSubF32, Iop_MSubF64, Iop_MSubF128 |
2172 | ++}; |
2173 | ++ |
2174 | ++static const HChar * |
2175 | + s390_irgen_VFMS(UChar v1, UChar v2, UChar v3, UChar v4, UChar m5, UChar m6) |
2176 | + { |
2177 | +- s390_insn_assert("vfms", m6 == 3); |
2178 | +- s390_vector_fp_mulAddOrSub(Iop_MSubF64, v1, v2, v3, v4, m5, m6); |
2179 | +- return "vfms"; |
2180 | ++ return s390_vector_fp_mulAddOrSub(v1, v2, v3, v4, m5, m6, |
2181 | ++ "vfms", FMS_single_ops, False); |
2182 | ++} |
2183 | ++ |
2184 | ++static const HChar * |
2185 | ++s390_irgen_VFNMS(UChar v1, UChar v2, UChar v3, UChar v4, UChar m5, UChar m6) |
2186 | ++{ |
2187 | ++ return s390_vector_fp_mulAddOrSub(v1, v2, v3, v4, m5, m6, |
2188 | ++ "vfnms", FMS_single_ops, True); |
2189 | + } |
2190 | + |
2191 | + static const HChar * |
2192 | + s390_irgen_WFC(UChar v1, UChar v2, UChar m3, UChar m4) |
2193 | + { |
2194 | +- s390_insn_assert("wfc", m3 == 3); |
2195 | +- s390_insn_assert("wfc", m4 == 0); |
2196 | ++ s390_insn_assert("wfc", m4 == 0 && |
2197 | ++ (m3 == 3 || (s390_host_has_vxe && m3 >= 2 && m3 <= 4))); |
2198 | ++ |
2199 | ++ static const IROp ops[] = { Iop_CmpF32, Iop_CmpF64, Iop_CmpF128 }; |
2200 | ++ IRType type = s390_vr_get_ftype(m3); |
2201 | + |
2202 | + IRTemp cc_vex = newTemp(Ity_I32); |
2203 | +- assign(cc_vex, binop(Iop_CmpF64, |
2204 | +- get_vr(v1, Ity_F64, 0), get_vr(v2, Ity_F64, 0))); |
2205 | ++ assign(cc_vex, binop(ops[m3 - 2], get_vr(v1, type, 0), get_vr(v2, type, 0))); |
2206 | + |
2207 | + IRTemp cc_s390 = newTemp(Ity_I32); |
2208 | + assign(cc_s390, convert_vex_bfpcc_to_s390(cc_vex)); |
2209 | +@@ -18692,213 +18917,253 @@ |
2210 | + } |
2211 | + |
2212 | + static const HChar * |
2213 | +-s390_irgen_VFCE(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6) |
2214 | +-{ |
2215 | +- s390_insn_assert("vfce", m4 == 3); |
2216 | ++s390_irgen_VFCx(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6, |
2217 | ++ const HChar *mnem, IRCmpFResult cmp, Bool equal_ok, |
2218 | ++ IROp cmp32, IROp cmp64) |
2219 | ++{ |
2220 | ++ s390_insn_assert(mnem, (m5 & 3) == 0 && (m6 & 14) == 0 && |
2221 | ++ (m4 == 3 || (s390_host_has_vxe && m4 >= 2 && m4 <= 4))); |
2222 | ++ |
2223 | ++ Bool single = s390_vr_is_single_element_control_set(m5) || m4 == 4; |
2224 | ++ |
2225 | ++ if (single) { |
2226 | ++ static const IROp ops[] = { Iop_CmpF32, Iop_CmpF64, Iop_CmpF128 }; |
2227 | ++ IRType type = s390_vr_get_ftype(m4); |
2228 | ++ IRTemp result = newTemp(Ity_I32); |
2229 | ++ IRTemp cond = newTemp(Ity_I1); |
2230 | + |
2231 | +- Bool isSingleElementOp = s390_vr_is_single_element_control_set(m5); |
2232 | +- if (!s390_vr_is_cs_set(m6)) { |
2233 | +- if (!isSingleElementOp) { |
2234 | +- put_vr_qw(v1, binop(Iop_CmpEQ64Fx2, get_vr_qw(v2), get_vr_qw(v3))); |
2235 | ++ assign(result, binop(ops[m4 - 2], |
2236 | ++ get_vr(v2, type, 0), get_vr(v3, type, 0))); |
2237 | ++ if (equal_ok) { |
2238 | ++ assign(cond, |
2239 | ++ binop(Iop_Or1, |
2240 | ++ binop(Iop_CmpEQ32, mkexpr(result), mkU32(cmp)), |
2241 | ++ binop(Iop_CmpEQ32, mkexpr(result), mkU32(Ircr_EQ)))); |
2242 | + } else { |
2243 | +- IRExpr* comparisonResult = binop(Iop_CmpF64, get_vr(v2, Ity_F64, 0), |
2244 | +- get_vr(v3, Ity_F64, 0)); |
2245 | +- IRExpr* result = mkite(binop(Iop_CmpEQ32, comparisonResult, |
2246 | +- mkU32(Ircr_EQ)), |
2247 | +- mkU64(0xffffffffffffffffULL), |
2248 | +- mkU64(0ULL)); |
2249 | +- put_vr_qw(v1, binop(Iop_64HLtoV128, result, mkU64(0ULL))); |
2250 | ++ assign(cond, binop(Iop_CmpEQ32, mkexpr(result), mkU32(cmp))); |
2251 | ++ } |
2252 | ++ put_vr_qw(v1, mkite(mkexpr(cond), |
2253 | ++ IRExpr_Const(IRConst_V128(0xffff)), |
2254 | ++ IRExpr_Const(IRConst_V128(0)))); |
2255 | ++ if (s390_vr_is_cs_set(m6)) { |
2256 | ++ IRTemp cc = newTemp(Ity_I64); |
2257 | ++ assign(cc, mkite(mkexpr(cond), mkU64(0), mkU64(3))); |
2258 | ++ s390_cc_set(cc); |
2259 | + } |
2260 | + } else { |
2261 | +- IRDirty* d; |
2262 | +- IRTemp cc = newTemp(Ity_I64); |
2263 | ++ IRTemp result = newTemp(Ity_V128); |
2264 | + |
2265 | +- s390x_vec_op_details_t details = { .serialized = 0ULL }; |
2266 | +- details.op = S390_VEC_OP_VFCE; |
2267 | +- details.v1 = v1; |
2268 | +- details.v2 = v2; |
2269 | +- details.v3 = v3; |
2270 | +- details.m4 = m4; |
2271 | +- details.m5 = m5; |
2272 | +- details.m6 = m6; |
2273 | ++ assign(result, binop(m4 == 2 ? cmp32 : cmp64, |
2274 | ++ get_vr_qw(v2), get_vr_qw(v3))); |
2275 | ++ put_vr_qw(v1, mkexpr(result)); |
2276 | ++ if (s390_vr_is_cs_set(m6)) { |
2277 | ++ IRTemp cc = newTemp(Ity_I64); |
2278 | ++ assign(cc, |
2279 | ++ mkite(binop(Iop_CmpEQ64, |
2280 | ++ binop(Iop_And64, |
2281 | ++ unop(Iop_V128to64, mkexpr(result)), |
2282 | ++ unop(Iop_V128HIto64, mkexpr(result))), |
2283 | ++ mkU64(-1ULL)), |
2284 | ++ mkU64(0), /* all comparison results are true */ |
2285 | ++ mkite(binop(Iop_CmpEQ64, |
2286 | ++ binop(Iop_Or64, |
2287 | ++ unop(Iop_V128to64, mkexpr(result)), |
2288 | ++ unop(Iop_V128HIto64, mkexpr(result))), |
2289 | ++ mkU64(0)), |
2290 | ++ mkU64(3), /* all false */ |
2291 | ++ mkU64(1)))); /* mixed true/false */ |
2292 | ++ s390_cc_set(cc); |
2293 | ++ } |
2294 | ++ } |
2295 | + |
2296 | +- d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op", |
2297 | +- &s390x_dirtyhelper_vec_op, |
2298 | +- mkIRExprVec_2(IRExpr_GSPTR(), |
2299 | +- mkU64(details.serialized))); |
2300 | ++ return mnem; |
2301 | ++} |
2302 | + |
2303 | +- const UChar elementSize = isSingleElementOp ? sizeof(ULong) : sizeof(V128); |
2304 | +- d->nFxState = 3; |
2305 | +- vex_bzero(&d->fxState, sizeof(d->fxState)); |
2306 | +- d->fxState[0].fx = Ifx_Read; |
2307 | +- d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128); |
2308 | +- d->fxState[0].size = elementSize; |
2309 | +- d->fxState[1].fx = Ifx_Read; |
2310 | +- d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v3 * sizeof(V128); |
2311 | +- d->fxState[1].size = elementSize; |
2312 | +- d->fxState[2].fx = Ifx_Write; |
2313 | +- d->fxState[2].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128); |
2314 | +- d->fxState[2].size = sizeof(V128); |
2315 | ++static const HChar * |
2316 | ++s390_irgen_VFCE(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6) |
2317 | ++{ |
2318 | ++ return s390_irgen_VFCx(v1, v2, v3, m4, m5, m6, "vfce", Ircr_EQ, |
2319 | ++ False, Iop_CmpEQ32Fx4, Iop_CmpEQ64Fx2); |
2320 | ++} |
2321 | + |
2322 | +- stmt(IRStmt_Dirty(d)); |
2323 | +- s390_cc_set(cc); |
2324 | +- } |
2325 | ++static const HChar * |
2326 | ++s390_irgen_VFCH(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6) |
2327 | ++{ |
2328 | ++ /* Swap arguments and compare "low" instead. */ |
2329 | ++ return s390_irgen_VFCx(v1, v3, v2, m4, m5, m6, "vfch", Ircr_LT, |
2330 | ++ False, Iop_CmpLT32Fx4, Iop_CmpLT64Fx2); |
2331 | ++} |
2332 | + |
2333 | +- return "vfce"; |
2334 | ++static const HChar * |
2335 | ++s390_irgen_VFCHE(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6) |
2336 | ++{ |
2337 | ++ /* Swap arguments and compare "low or equal" instead. */ |
2338 | ++ return s390_irgen_VFCx(v1, v3, v2, m4, m5, m6, "vfche", Ircr_LT, |
2339 | ++ True, Iop_CmpLE32Fx4, Iop_CmpLE64Fx2); |
2340 | + } |
2341 | + |
2342 | + static const HChar * |
2343 | +-s390_irgen_VFCH(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6) |
2344 | ++s390_irgen_VFTCI(UChar v1, UChar v2, UShort i3, UChar m4, UChar m5) |
2345 | + { |
2346 | +- vassert(m4 == 3); |
2347 | ++ s390_insn_assert("vftci", |
2348 | ++ (m4 == 3 || (s390_host_has_vxe && m4 >= 2 && m4 <= 4))); |
2349 | + |
2350 | + Bool isSingleElementOp = s390_vr_is_single_element_control_set(m5); |
2351 | +- if (!s390_vr_is_cs_set(m6)) { |
2352 | +- if (!isSingleElementOp) { |
2353 | +- put_vr_qw(v1, binop(Iop_CmpLE64Fx2, get_vr_qw(v3), get_vr_qw(v2))); |
2354 | +- } else { |
2355 | +- IRExpr* comparisonResult = binop(Iop_CmpF64, get_vr(v2, Ity_F64, 0), |
2356 | +- get_vr(v3, Ity_F64, 0)); |
2357 | +- IRExpr* result = mkite(binop(Iop_CmpEQ32, comparisonResult, |
2358 | +- mkU32(Ircr_GT)), |
2359 | +- mkU64(0xffffffffffffffffULL), |
2360 | +- mkU64(0ULL)); |
2361 | +- put_vr_qw(v1, binop(Iop_64HLtoV128, result, mkU64(0ULL))); |
2362 | +- } |
2363 | +- } |
2364 | +- else { |
2365 | +- IRDirty* d; |
2366 | +- IRTemp cc = newTemp(Ity_I64); |
2367 | + |
2368 | +- s390x_vec_op_details_t details = { .serialized = 0ULL }; |
2369 | +- details.op = S390_VEC_OP_VFCH; |
2370 | +- details.v1 = v1; |
2371 | +- details.v2 = v2; |
2372 | +- details.v3 = v3; |
2373 | +- details.m4 = m4; |
2374 | +- details.m5 = m5; |
2375 | +- details.m6 = m6; |
2376 | ++ IRDirty* d; |
2377 | ++ IRTemp cc = newTemp(Ity_I64); |
2378 | + |
2379 | +- d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op", |
2380 | +- &s390x_dirtyhelper_vec_op, |
2381 | +- mkIRExprVec_2(IRExpr_GSPTR(), |
2382 | +- mkU64(details.serialized))); |
2383 | ++ s390x_vec_op_details_t details = { .serialized = 0ULL }; |
2384 | ++ details.op = S390_VEC_OP_VFTCI; |
2385 | ++ details.v1 = v1; |
2386 | ++ details.v2 = v2; |
2387 | ++ details.i3 = i3; |
2388 | ++ details.m4 = m4; |
2389 | ++ details.m5 = m5; |
2390 | + |
2391 | +- const UChar elementSize = isSingleElementOp ? sizeof(ULong) : sizeof(V128); |
2392 | +- d->nFxState = 3; |
2393 | +- vex_bzero(&d->fxState, sizeof(d->fxState)); |
2394 | +- d->fxState[0].fx = Ifx_Read; |
2395 | +- d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128); |
2396 | +- d->fxState[0].size = elementSize; |
2397 | +- d->fxState[1].fx = Ifx_Read; |
2398 | +- d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v3 * sizeof(V128); |
2399 | +- d->fxState[1].size = elementSize; |
2400 | +- d->fxState[2].fx = Ifx_Write; |
2401 | +- d->fxState[2].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128); |
2402 | +- d->fxState[2].size = sizeof(V128); |
2403 | ++ d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op", |
2404 | ++ &s390x_dirtyhelper_vec_op, |
2405 | ++ mkIRExprVec_2(IRExpr_GSPTR(), |
2406 | ++ mkU64(details.serialized))); |
2407 | + |
2408 | +- stmt(IRStmt_Dirty(d)); |
2409 | +- s390_cc_set(cc); |
2410 | +- } |
2411 | ++ const UChar elementSize = isSingleElementOp ? |
2412 | ++ sizeofIRType(s390_vr_get_ftype(m4)) : sizeof(V128); |
2413 | ++ d->nFxState = 2; |
2414 | ++ vex_bzero(&d->fxState, sizeof(d->fxState)); |
2415 | ++ d->fxState[0].fx = Ifx_Read; |
2416 | ++ d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128); |
2417 | ++ d->fxState[0].size = elementSize; |
2418 | ++ d->fxState[1].fx = Ifx_Write; |
2419 | ++ d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128); |
2420 | ++ d->fxState[1].size = sizeof(V128); |
2421 | ++ |
2422 | ++ stmt(IRStmt_Dirty(d)); |
2423 | ++ s390_cc_set(cc); |
2424 | + |
2425 | +- return "vfch"; |
2426 | ++ return "vftci"; |
2427 | + } |
2428 | + |
2429 | + static const HChar * |
2430 | +-s390_irgen_VFCHE(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6) |
2431 | ++s390_irgen_VFMIN(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6) |
2432 | + { |
2433 | +- s390_insn_assert("vfche", m4 == 3); |
2434 | ++ s390_insn_assert("vfmin", |
2435 | ++ (m4 == 3 || (s390_host_has_vxe && m4 >= 2 && m4 <= 4))); |
2436 | + |
2437 | + Bool isSingleElementOp = s390_vr_is_single_element_control_set(m5); |
2438 | +- if (!s390_vr_is_cs_set(m6)) { |
2439 | +- if (!isSingleElementOp) { |
2440 | +- put_vr_qw(v1, binop(Iop_CmpLT64Fx2, get_vr_qw(v3), get_vr_qw(v2))); |
2441 | +- } |
2442 | +- else { |
2443 | +- IRExpr* comparisonResult = binop(Iop_CmpF64, get_vr(v3, Ity_F64, 0), |
2444 | +- get_vr(v2, Ity_F64, 0)); |
2445 | +- IRExpr* result = mkite(binop(Iop_CmpEQ32, comparisonResult, |
2446 | +- mkU32(Ircr_LT)), |
2447 | +- mkU64(0xffffffffffffffffULL), |
2448 | +- mkU64(0ULL)); |
2449 | +- put_vr_qw(v1, binop(Iop_64HLtoV128, result, mkU64(0ULL))); |
2450 | +- } |
2451 | +- } |
2452 | +- else { |
2453 | +- IRDirty* d; |
2454 | +- IRTemp cc = newTemp(Ity_I64); |
2455 | +- |
2456 | +- s390x_vec_op_details_t details = { .serialized = 0ULL }; |
2457 | +- details.op = S390_VEC_OP_VFCHE; |
2458 | +- details.v1 = v1; |
2459 | +- details.v2 = v2; |
2460 | +- details.v3 = v3; |
2461 | +- details.m4 = m4; |
2462 | +- details.m5 = m5; |
2463 | +- details.m6 = m6; |
2464 | ++ IRDirty* d; |
2465 | ++ IRTemp cc = newTemp(Ity_I64); |
2466 | + |
2467 | +- d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op", |
2468 | +- &s390x_dirtyhelper_vec_op, |
2469 | +- mkIRExprVec_2(IRExpr_GSPTR(), |
2470 | +- mkU64(details.serialized))); |
2471 | ++ s390x_vec_op_details_t details = { .serialized = 0ULL }; |
2472 | ++ details.op = S390_VEC_OP_VFMIN; |
2473 | ++ details.v1 = v1; |
2474 | ++ details.v2 = v2; |
2475 | ++ details.v3 = v3; |
2476 | ++ details.m4 = m4; |
2477 | ++ details.m5 = m5; |
2478 | ++ details.m6 = m6; |
2479 | + |
2480 | +- const UChar elementSize = isSingleElementOp ? sizeof(ULong) : sizeof(V128); |
2481 | +- d->nFxState = 3; |
2482 | +- vex_bzero(&d->fxState, sizeof(d->fxState)); |
2483 | +- d->fxState[0].fx = Ifx_Read; |
2484 | +- d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128); |
2485 | +- d->fxState[0].size = elementSize; |
2486 | +- d->fxState[1].fx = Ifx_Read; |
2487 | +- d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v3 * sizeof(V128); |
2488 | +- d->fxState[1].size = elementSize; |
2489 | +- d->fxState[2].fx = Ifx_Write; |
2490 | +- d->fxState[2].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128); |
2491 | +- d->fxState[2].size = sizeof(V128); |
2492 | ++ d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op", |
2493 | ++ &s390x_dirtyhelper_vec_op, |
2494 | ++ mkIRExprVec_2(IRExpr_GSPTR(), |
2495 | ++ mkU64(details.serialized))); |
2496 | + |
2497 | +- stmt(IRStmt_Dirty(d)); |
2498 | +- s390_cc_set(cc); |
2499 | +- } |
2500 | ++ const UChar elementSize = isSingleElementOp ? |
2501 | ++ sizeofIRType(s390_vr_get_ftype(m4)) : sizeof(V128); |
2502 | ++ d->nFxState = 3; |
2503 | ++ vex_bzero(&d->fxState, sizeof(d->fxState)); |
2504 | ++ d->fxState[0].fx = Ifx_Read; |
2505 | ++ d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128); |
2506 | ++ d->fxState[0].size = elementSize; |
2507 | ++ d->fxState[1].fx = Ifx_Read; |
2508 | ++ d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v3 * sizeof(V128); |
2509 | ++ d->fxState[1].size = elementSize; |
2510 | ++ d->fxState[2].fx = Ifx_Write; |
2511 | ++ d->fxState[2].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128); |
2512 | ++ d->fxState[2].size = sizeof(V128); |
2513 | + |
2514 | +- return "vfche"; |
2515 | ++ stmt(IRStmt_Dirty(d)); |
2516 | ++ s390_cc_set(cc); |
2517 | ++ return "vfmin"; |
2518 | + } |
2519 | + |
2520 | + static const HChar * |
2521 | +-s390_irgen_VFTCI(UChar v1, UChar v2, UShort i3, UChar m4, UChar m5) |
2522 | ++s390_irgen_VFMAX(UChar v1, UChar v2, UChar v3, UChar m4, UChar m5, UChar m6) |
2523 | + { |
2524 | +- s390_insn_assert("vftci", m4 == 3); |
2525 | ++ s390_insn_assert("vfmax", |
2526 | ++ (m4 == 3 || (s390_host_has_vxe && m4 >= 2 && m4 <= 4))); |
2527 | + |
2528 | + Bool isSingleElementOp = s390_vr_is_single_element_control_set(m5); |
2529 | +- |
2530 | + IRDirty* d; |
2531 | + IRTemp cc = newTemp(Ity_I64); |
2532 | + |
2533 | + s390x_vec_op_details_t details = { .serialized = 0ULL }; |
2534 | +- details.op = S390_VEC_OP_VFTCI; |
2535 | ++ details.op = S390_VEC_OP_VFMAX; |
2536 | + details.v1 = v1; |
2537 | + details.v2 = v2; |
2538 | +- details.i3 = i3; |
2539 | ++ details.v3 = v3; |
2540 | + details.m4 = m4; |
2541 | + details.m5 = m5; |
2542 | ++ details.m6 = m6; |
2543 | + |
2544 | + d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op", |
2545 | + &s390x_dirtyhelper_vec_op, |
2546 | + mkIRExprVec_2(IRExpr_GSPTR(), |
2547 | + mkU64(details.serialized))); |
2548 | + |
2549 | +- const UChar elementSize = isSingleElementOp ? sizeof(ULong) : sizeof(V128); |
2550 | +- d->nFxState = 2; |
2551 | ++ const UChar elementSize = isSingleElementOp ? |
2552 | ++ sizeofIRType(s390_vr_get_ftype(m4)) : sizeof(V128); |
2553 | ++ d->nFxState = 3; |
2554 | + vex_bzero(&d->fxState, sizeof(d->fxState)); |
2555 | + d->fxState[0].fx = Ifx_Read; |
2556 | + d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128); |
2557 | + d->fxState[0].size = elementSize; |
2558 | +- d->fxState[1].fx = Ifx_Write; |
2559 | +- d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128); |
2560 | +- d->fxState[1].size = sizeof(V128); |
2561 | ++ d->fxState[1].fx = Ifx_Read; |
2562 | ++ d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v3 * sizeof(V128); |
2563 | ++ d->fxState[1].size = elementSize; |
2564 | ++ d->fxState[2].fx = Ifx_Write; |
2565 | ++ d->fxState[2].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128); |
2566 | ++ d->fxState[2].size = sizeof(V128); |
2567 | + |
2568 | + stmt(IRStmt_Dirty(d)); |
2569 | + s390_cc_set(cc); |
2570 | ++ return "vfmax"; |
2571 | ++} |
2572 | + |
2573 | +- return "vftci"; |
2574 | ++static const HChar * |
2575 | ++s390_irgen_VBPERM(UChar v1, UChar v2, UChar v3) |
2576 | ++{ |
2577 | ++ IRDirty* d; |
2578 | ++ IRTemp cc = newTemp(Ity_I64); |
2579 | ++ |
2580 | ++ s390x_vec_op_details_t details = { .serialized = 0ULL }; |
2581 | ++ details.op = S390_VEC_OP_VBPERM; |
2582 | ++ details.v1 = v1; |
2583 | ++ details.v2 = v2; |
2584 | ++ details.v3 = v3; |
2585 | ++ details.m4 = 0; |
2586 | ++ details.m5 = 0; |
2587 | ++ details.m6 = 0; |
2588 | ++ |
2589 | ++ d = unsafeIRDirty_1_N(cc, 0, "s390x_dirtyhelper_vec_op", |
2590 | ++ &s390x_dirtyhelper_vec_op, |
2591 | ++ mkIRExprVec_2(IRExpr_GSPTR(), |
2592 | ++ mkU64(details.serialized))); |
2593 | ++ |
2594 | ++ d->nFxState = 3; |
2595 | ++ vex_bzero(&d->fxState, sizeof(d->fxState)); |
2596 | ++ d->fxState[0].fx = Ifx_Read; |
2597 | ++ d->fxState[0].offset = S390X_GUEST_OFFSET(guest_v0) + v2 * sizeof(V128); |
2598 | ++ d->fxState[0].size = sizeof(V128); |
2599 | ++ d->fxState[1].fx = Ifx_Read; |
2600 | ++ d->fxState[1].offset = S390X_GUEST_OFFSET(guest_v0) + v3 * sizeof(V128); |
2601 | ++ d->fxState[1].size = sizeof(V128); |
2602 | ++ d->fxState[2].fx = Ifx_Write; |
2603 | ++ d->fxState[2].offset = S390X_GUEST_OFFSET(guest_v0) + v1 * sizeof(V128); |
2604 | ++ d->fxState[2].size = sizeof(V128); |
2605 | ++ |
2606 | ++ stmt(IRStmt_Dirty(d)); |
2607 | ++ s390_cc_set(cc); |
2608 | ++ return "vbperm"; |
2609 | + } |
2610 | + |
2611 | + /* New insns are added here. |
2612 | +@@ -20486,11 +20751,23 @@ |
2613 | + RXY_dl2(ovl), |
2614 | + RXY_dh2(ovl)); goto ok; |
2615 | + case 0xe60000000034ULL: /* VPKZ */ goto unimplemented; |
2616 | +- case 0xe60000000035ULL: /* VLRL */ goto unimplemented; |
2617 | +- case 0xe60000000037ULL: /* VLRLR */ goto unimplemented; |
2618 | ++ case 0xe60000000035ULL: s390_format_VSI_URDV(s390_irgen_VLRL, VSI_v1(ovl), |
2619 | ++ VSI_b2(ovl), VSI_d2(ovl), |
2620 | ++ VSI_i3(ovl), |
2621 | ++ VSI_rxb(ovl)); goto ok; |
2622 | ++ case 0xe60000000037ULL: s390_format_VRS_RRDV(s390_irgen_VLRLR, VRSd_v1(ovl), |
2623 | ++ VRSd_r3(ovl), VRS_b2(ovl), |
2624 | ++ VRS_d2(ovl), |
2625 | ++ VRS_rxb(ovl)); goto ok; |
2626 | + case 0xe6000000003cULL: /* VUPKZ */ goto unimplemented; |
2627 | +- case 0xe6000000003dULL: /* VSTRL */ goto unimplemented; |
2628 | +- case 0xe6000000003fULL: /* VSTRLR */ goto unimplemented; |
2629 | ++ case 0xe6000000003dULL: s390_format_VSI_URDV(s390_irgen_VSTRL, VSI_v1(ovl), |
2630 | ++ VSI_b2(ovl), VSI_d2(ovl), |
2631 | ++ VSI_i3(ovl), |
2632 | ++ VSI_rxb(ovl)); goto ok; |
2633 | ++ case 0xe6000000003fULL: s390_format_VRS_RRDV(s390_irgen_VSTRLR, VRSd_v1(ovl), |
2634 | ++ VRSd_r3(ovl), VRS_b2(ovl), |
2635 | ++ VRS_d2(ovl), |
2636 | ++ VRS_rxb(ovl)); goto ok; |
2637 | + case 0xe60000000049ULL: /* VLIP */ goto unimplemented; |
2638 | + case 0xe60000000050ULL: /* VCVB */ goto unimplemented; |
2639 | + case 0xe60000000052ULL: /* VCVBG */ goto unimplemented; |
2640 | +@@ -20688,12 +20965,18 @@ |
2641 | + case 0xe7000000006bULL: s390_format_VRR_VVV(s390_irgen_VNO, VRR_v1(ovl), |
2642 | + VRR_v2(ovl), VRR_r3(ovl), |
2643 | + VRR_rxb(ovl)); goto ok; |
2644 | +- case 0xe7000000006cULL: /* VNX */ goto unimplemented; |
2645 | ++ case 0xe7000000006cULL: s390_format_VRR_VVV(s390_irgen_VNX, VRR_v1(ovl), |
2646 | ++ VRR_v2(ovl), VRR_r3(ovl), |
2647 | ++ VRR_rxb(ovl)); goto ok; |
2648 | + case 0xe7000000006dULL: s390_format_VRR_VVV(s390_irgen_VX, VRR_v1(ovl), |
2649 | + VRR_v2(ovl), VRR_r3(ovl), |
2650 | + VRR_rxb(ovl)); goto ok; |
2651 | +- case 0xe7000000006eULL: /* VNN */ goto unimplemented; |
2652 | +- case 0xe7000000006fULL: /* VOC */ goto unimplemented; |
2653 | ++ case 0xe7000000006eULL: s390_format_VRR_VVV(s390_irgen_VNN, VRR_v1(ovl), |
2654 | ++ VRR_v2(ovl), VRR_r3(ovl), |
2655 | ++ VRR_rxb(ovl)); goto ok; |
2656 | ++ case 0xe7000000006fULL: s390_format_VRR_VVV(s390_irgen_VOC, VRR_v1(ovl), |
2657 | ++ VRR_v2(ovl), VRR_r3(ovl), |
2658 | ++ VRR_rxb(ovl)); goto ok; |
2659 | + case 0xe70000000070ULL: s390_format_VRR_VVVM(s390_irgen_VESLV, VRR_v1(ovl), |
2660 | + VRR_v2(ovl), VRR_r3(ovl), |
2661 | + VRR_m4(ovl), VRR_rxb(ovl)); goto ok; |
2662 | +@@ -20746,7 +21029,9 @@ |
2663 | + case 0xe70000000084ULL: s390_format_VRR_VVVM(s390_irgen_VPDI, VRR_v1(ovl), |
2664 | + VRR_v2(ovl), VRR_r3(ovl), |
2665 | + VRR_m4(ovl), VRR_rxb(ovl)); goto ok; |
2666 | +- case 0xe70000000085ULL: /* VBPERM */ goto unimplemented; |
2667 | ++ case 0xe70000000085ULL: s390_format_VRR_VVV(s390_irgen_VBPERM, VRR_v1(ovl), |
2668 | ++ VRR_v2(ovl), VRR_r3(ovl), |
2669 | ++ VRR_rxb(ovl)); goto ok; |
2670 | + case 0xe7000000008aULL: s390_format_VRR_VVVVMM(s390_irgen_VSTRC, VRRd_v1(ovl), |
2671 | + VRRd_v2(ovl), VRRd_v3(ovl), |
2672 | + VRRd_v4(ovl), VRRd_m5(ovl), |
2673 | +@@ -20777,8 +21062,16 @@ |
2674 | + case 0xe70000000097ULL: s390_format_VRR_VVVMM(s390_irgen_VPKS, VRR_v1(ovl), |
2675 | + VRR_v2(ovl), VRR_r3(ovl), |
2676 | + VRR_m4(ovl), VRR_m5(ovl), VRR_rxb(ovl)); goto ok; |
2677 | +- case 0xe7000000009eULL: /* VFNMS */ goto unimplemented; |
2678 | +- case 0xe7000000009fULL: /* VFNMA */ goto unimplemented; |
2679 | ++ case 0xe7000000009eULL: s390_format_VRR_VVVVMM(s390_irgen_VFNMS, VRRe_v1(ovl), |
2680 | ++ VRRe_v2(ovl), VRRe_v3(ovl), |
2681 | ++ VRRe_v4(ovl), VRRe_m5(ovl), |
2682 | ++ VRRe_m6(ovl), |
2683 | ++ VRRe_rxb(ovl)); goto ok; |
2684 | ++ case 0xe7000000009fULL: s390_format_VRR_VVVVMM(s390_irgen_VFNMA, VRRe_v1(ovl), |
2685 | ++ VRRe_v2(ovl), VRRe_v3(ovl), |
2686 | ++ VRRe_v4(ovl), VRRe_m5(ovl), |
2687 | ++ VRRe_m6(ovl), |
2688 | ++ VRRe_rxb(ovl)); goto ok; |
2689 | + case 0xe700000000a1ULL: s390_format_VRR_VVVM(s390_irgen_VMLH, VRR_v1(ovl), |
2690 | + VRR_v2(ovl), VRR_r3(ovl), |
2691 | + VRR_m4(ovl), VRR_rxb(ovl)); goto ok; |
2692 | +@@ -20831,7 +21124,11 @@ |
2693 | + case 0xe700000000b4ULL: s390_format_VRR_VVVM(s390_irgen_VGFM, VRR_v1(ovl), |
2694 | + VRR_v2(ovl), VRR_r3(ovl), |
2695 | + VRR_m4(ovl), VRR_rxb(ovl)); goto ok; |
2696 | +- case 0xe700000000b8ULL: /* VMSL */ goto unimplemented; |
2697 | ++ case 0xe700000000b8ULL: s390_format_VRR_VVVVMM(s390_irgen_VMSL, VRRd_v1(ovl), |
2698 | ++ VRRd_v2(ovl), VRRd_v3(ovl), |
2699 | ++ VRRd_v4(ovl), VRRd_m5(ovl), |
2700 | ++ VRRd_m6(ovl), |
2701 | ++ VRRd_rxb(ovl)); goto ok; |
2702 | + case 0xe700000000b9ULL: s390_format_VRRd_VVVVM(s390_irgen_VACCC, VRRd_v1(ovl), |
2703 | + VRRd_v2(ovl), VRRd_v3(ovl), |
2704 | + VRRd_v4(ovl), VRRd_m5(ovl), |
2705 | +@@ -20868,11 +21165,11 @@ |
2706 | + VRRa_v2(ovl), VRRa_m3(ovl), |
2707 | + VRRa_m4(ovl), VRRa_m5(ovl), |
2708 | + VRRa_rxb(ovl)); goto ok; |
2709 | +- case 0xe700000000c4ULL: s390_format_VRRa_VVMMM(s390_irgen_VLDE, VRRa_v1(ovl), |
2710 | ++ case 0xe700000000c4ULL: s390_format_VRRa_VVMMM(s390_irgen_VFLL, VRRa_v1(ovl), |
2711 | + VRRa_v2(ovl), VRRa_m3(ovl), |
2712 | + VRRa_m4(ovl), VRRa_m5(ovl), |
2713 | + VRRa_rxb(ovl)); goto ok; |
2714 | +- case 0xe700000000c5ULL: s390_format_VRRa_VVMMM(s390_irgen_VLED, VRRa_v1(ovl), |
2715 | ++ case 0xe700000000c5ULL: s390_format_VRRa_VVMMM(s390_irgen_VFLR, VRRa_v1(ovl), |
2716 | + VRRa_v2(ovl), VRRa_m3(ovl), |
2717 | + VRRa_m4(ovl), VRRa_m5(ovl), |
2718 | + VRRa_rxb(ovl)); goto ok; |
2719 | +@@ -20953,8 +21250,16 @@ |
2720 | + VRRa_m3(ovl), VRRa_m4(ovl), |
2721 | + VRRa_m5(ovl), |
2722 | + VRRa_rxb(ovl)); goto ok; |
2723 | +- case 0xe700000000eeULL: /* VFMIN */ goto unimplemented; |
2724 | +- case 0xe700000000efULL: /* VFMAX */ goto unimplemented; |
2725 | ++ case 0xe700000000eeULL: s390_format_VRRa_VVVMMM(s390_irgen_VFMIN, VRRa_v1(ovl), |
2726 | ++ VRRa_v2(ovl), VRRa_v3(ovl), |
2727 | ++ VRRa_m3(ovl), VRRa_m4(ovl), |
2728 | ++ VRRa_m5(ovl), |
2729 | ++ VRRa_rxb(ovl)); goto ok; |
2730 | ++ case 0xe700000000efULL: s390_format_VRRa_VVVMMM(s390_irgen_VFMAX, VRRa_v1(ovl), |
2731 | ++ VRRa_v2(ovl), VRRa_v3(ovl), |
2732 | ++ VRRa_m3(ovl), VRRa_m4(ovl), |
2733 | ++ VRRa_m5(ovl), |
2734 | ++ VRRa_rxb(ovl)); goto ok; |
2735 | + case 0xe700000000f0ULL: s390_format_VRR_VVVM(s390_irgen_VAVGL, VRR_v1(ovl), |
2736 | + VRR_v2(ovl), VRR_r3(ovl), |
2737 | + VRR_m4(ovl), VRR_rxb(ovl)); goto ok; |
2738 | +--- a/VEX/priv/host_s390_defs.c |
2739 | ++++ b/VEX/priv/host_s390_defs.c |
2740 | +@@ -8,7 +8,7 @@ |
2741 | + This file is part of Valgrind, a dynamic binary instrumentation |
2742 | + framework. |
2743 | + |
2744 | +- Copyright IBM Corp. 2010-2017 |
2745 | ++ Copyright IBM Corp. 2010-2020 |
2746 | + Copyright (C) 2012-2017 Florian Krohm (britzel@acm.org) |
2747 | + |
2748 | + This program is free software; you can redistribute it and/or |
2749 | +@@ -684,6 +684,8 @@ |
2750 | + switch (hregClass(from)) { |
2751 | + case HRcInt64: |
2752 | + return s390_insn_move(sizeofIRType(Ity_I64), to, from); |
2753 | ++ case HRcFlt64: |
2754 | ++ return s390_insn_move(sizeofIRType(Ity_F64), to, from); |
2755 | + case HRcVec128: |
2756 | + return s390_insn_move(sizeofIRType(Ity_V128), to, from); |
2757 | + default: |
2758 | +@@ -7870,6 +7872,10 @@ |
2759 | + op = "v-vfloatabs"; |
2760 | + break; |
2761 | + |
2762 | ++ case S390_VEC_FLOAT_NABS: |
2763 | ++ op = "v-vfloatnabs"; |
2764 | ++ break; |
2765 | ++ |
2766 | + default: |
2767 | + goto fail; |
2768 | + } |
2769 | +@@ -9439,21 +9445,28 @@ |
2770 | + |
2771 | + case S390_VEC_FLOAT_NEG: { |
2772 | + vassert(insn->variant.unop.src.tag == S390_OPND_REG); |
2773 | +- vassert(insn->size == 8); |
2774 | ++ vassert(insn->size >= 4); |
2775 | + UChar v1 = hregNumber(insn->variant.unop.dst); |
2776 | + UChar v2 = hregNumber(insn->variant.unop.src.variant.reg); |
2777 | + return s390_emit_VFPSO(buf, v1, v2, s390_getM_from_size(insn->size), 0, 0); |
2778 | + } |
2779 | + case S390_VEC_FLOAT_ABS: { |
2780 | + vassert(insn->variant.unop.src.tag == S390_OPND_REG); |
2781 | +- vassert(insn->size == 8); |
2782 | ++ vassert(insn->size >= 4); |
2783 | + UChar v1 = hregNumber(insn->variant.unop.dst); |
2784 | + UChar v2 = hregNumber(insn->variant.unop.src.variant.reg); |
2785 | + return s390_emit_VFPSO(buf, v1, v2, s390_getM_from_size(insn->size), 0, 2); |
2786 | + } |
2787 | ++ case S390_VEC_FLOAT_NABS: { |
2788 | ++ vassert(insn->variant.unop.src.tag == S390_OPND_REG); |
2789 | ++ vassert(insn->size >= 4); |
2790 | ++ UChar v1 = hregNumber(insn->variant.unop.dst); |
2791 | ++ UChar v2 = hregNumber(insn->variant.unop.src.variant.reg); |
2792 | ++ return s390_emit_VFPSO(buf, v1, v2, s390_getM_from_size(insn->size), 0, 1); |
2793 | ++ } |
2794 | + case S390_VEC_FLOAT_SQRT: { |
2795 | + vassert(insn->variant.unop.src.tag == S390_OPND_REG); |
2796 | +- vassert(insn->size == 8); |
2797 | ++ vassert(insn->size >= 4); |
2798 | + UChar v1 = hregNumber(insn->variant.unop.dst); |
2799 | + UChar v2 = hregNumber(insn->variant.unop.src.variant.reg); |
2800 | + return s390_emit_VFSQ(buf, v1, v2, s390_getM_from_size(insn->size), 0); |
2801 | +--- a/VEX/priv/host_s390_defs.h |
2802 | ++++ b/VEX/priv/host_s390_defs.h |
2803 | +@@ -8,7 +8,7 @@ |
2804 | + This file is part of Valgrind, a dynamic binary instrumentation |
2805 | + framework. |
2806 | + |
2807 | +- Copyright IBM Corp. 2010-2017 |
2808 | ++ Copyright IBM Corp. 2010-2020 |
2809 | + |
2810 | + This program is free software; you can redistribute it and/or |
2811 | + modify it under the terms of the GNU General Public License as |
2812 | +@@ -205,6 +205,7 @@ |
2813 | + S390_VEC_COUNT_ONES, |
2814 | + S390_VEC_FLOAT_NEG, |
2815 | + S390_VEC_FLOAT_ABS, |
2816 | ++ S390_VEC_FLOAT_NABS, |
2817 | + S390_VEC_FLOAT_SQRT, |
2818 | + S390_UNOP_T_INVALID |
2819 | + } s390_unop_t; |
2820 | +@@ -931,6 +932,8 @@ |
2821 | + (s390_host_hwcaps & (VEX_HWCAPS_S390X_MSA5)) |
2822 | + #define s390_host_has_lsc2 \ |
2823 | + (s390_host_hwcaps & (VEX_HWCAPS_S390X_LSC2)) |
2824 | ++#define s390_host_has_vxe \ |
2825 | ++ (s390_host_hwcaps & (VEX_HWCAPS_S390X_VXE)) |
2826 | + #endif /* ndef __VEX_HOST_S390_DEFS_H */ |
2827 | + |
2828 | + /*---------------------------------------------------------------*/ |
2829 | +--- a/VEX/priv/host_s390_isel.c |
2830 | ++++ b/VEX/priv/host_s390_isel.c |
2831 | +@@ -8,7 +8,7 @@ |
2832 | + This file is part of Valgrind, a dynamic binary instrumentation |
2833 | + framework. |
2834 | + |
2835 | +- Copyright IBM Corp. 2010-2017 |
2836 | ++ Copyright IBM Corp. 2010-2020 |
2837 | + Copyright (C) 2012-2017 Florian Krohm (britzel@acm.org) |
2838 | + |
2839 | + This program is free software; you can redistribute it and/or |
2840 | +@@ -2362,9 +2362,10 @@ |
2841 | + case Iop_NegF128: |
2842 | + if (left->tag == Iex_Unop && |
2843 | + (left->Iex.Unop.op == Iop_AbsF32 || |
2844 | +- left->Iex.Unop.op == Iop_AbsF64)) |
2845 | ++ left->Iex.Unop.op == Iop_AbsF64)) { |
2846 | + bfpop = S390_BFP_NABS; |
2847 | +- else |
2848 | ++ left = left->Iex.Unop.arg; |
2849 | ++ } else |
2850 | + bfpop = S390_BFP_NEG; |
2851 | + goto float128_opnd; |
2852 | + case Iop_AbsF128: bfpop = S390_BFP_ABS; goto float128_opnd; |
2853 | +@@ -2726,9 +2727,10 @@ |
2854 | + case Iop_NegF64: |
2855 | + if (left->tag == Iex_Unop && |
2856 | + (left->Iex.Unop.op == Iop_AbsF32 || |
2857 | +- left->Iex.Unop.op == Iop_AbsF64)) |
2858 | ++ left->Iex.Unop.op == Iop_AbsF64)) { |
2859 | + bfpop = S390_BFP_NABS; |
2860 | +- else |
2861 | ++ left = left->Iex.Unop.arg; |
2862 | ++ } else |
2863 | + bfpop = S390_BFP_NEG; |
2864 | + break; |
2865 | + |
2866 | +@@ -3944,11 +3946,27 @@ |
2867 | + vec_unop = S390_VEC_COUNT_ONES; |
2868 | + goto Iop_V_wrk; |
2869 | + |
2870 | ++ case Iop_Neg32Fx4: |
2871 | ++ size = 4; |
2872 | ++ vec_unop = S390_VEC_FLOAT_NEG; |
2873 | ++ if (arg->tag == Iex_Unop && arg->Iex.Unop.op == Iop_Abs32Fx4) { |
2874 | ++ vec_unop = S390_VEC_FLOAT_NABS; |
2875 | ++ arg = arg->Iex.Unop.arg; |
2876 | ++ } |
2877 | ++ goto Iop_V_wrk; |
2878 | + case Iop_Neg64Fx2: |
2879 | + size = 8; |
2880 | + vec_unop = S390_VEC_FLOAT_NEG; |
2881 | ++ if (arg->tag == Iex_Unop && arg->Iex.Unop.op == Iop_Abs64Fx2) { |
2882 | ++ vec_unop = S390_VEC_FLOAT_NABS; |
2883 | ++ arg = arg->Iex.Unop.arg; |
2884 | ++ } |
2885 | + goto Iop_V_wrk; |
2886 | + |
2887 | ++ case Iop_Abs32Fx4: |
2888 | ++ size = 4; |
2889 | ++ vec_unop = S390_VEC_FLOAT_ABS; |
2890 | ++ goto Iop_V_wrk; |
2891 | + case Iop_Abs64Fx2: |
2892 | + size = 8; |
2893 | + vec_unop = S390_VEC_FLOAT_ABS; |
2894 | +@@ -4474,17 +4492,29 @@ |
2895 | + vec_binop = S390_VEC_ELEM_ROLL_V; |
2896 | + goto Iop_VV_wrk; |
2897 | + |
2898 | ++ case Iop_CmpEQ32Fx4: |
2899 | ++ size = 4; |
2900 | ++ vec_binop = S390_VEC_FLOAT_COMPARE_EQUAL; |
2901 | ++ goto Iop_VV_wrk; |
2902 | + case Iop_CmpEQ64Fx2: |
2903 | + size = 8; |
2904 | + vec_binop = S390_VEC_FLOAT_COMPARE_EQUAL; |
2905 | + goto Iop_VV_wrk; |
2906 | + |
2907 | ++ case Iop_CmpLE32Fx4: |
2908 | ++ size = 4; |
2909 | ++ vec_binop = S390_VEC_FLOAT_COMPARE_LESS_OR_EQUAL; |
2910 | ++ goto Iop_VV_wrk; |
2911 | + case Iop_CmpLE64Fx2: { |
2912 | + size = 8; |
2913 | + vec_binop = S390_VEC_FLOAT_COMPARE_LESS_OR_EQUAL; |
2914 | + goto Iop_VV_wrk; |
2915 | + } |
2916 | + |
2917 | ++ case Iop_CmpLT32Fx4: |
2918 | ++ size = 4; |
2919 | ++ vec_binop = S390_VEC_FLOAT_COMPARE_LESS; |
2920 | ++ goto Iop_VV_wrk; |
2921 | + case Iop_CmpLT64Fx2: { |
2922 | + size = 8; |
2923 | + vec_binop = S390_VEC_FLOAT_COMPARE_LESS; |
2924 | +@@ -4671,20 +4701,41 @@ |
2925 | + dst, reg1, reg2, reg3)); |
2926 | + return dst; |
2927 | + |
2928 | ++ case Iop_Add32Fx4: |
2929 | ++ size = 4; |
2930 | ++ vec_binop = S390_VEC_FLOAT_ADD; |
2931 | ++ goto Iop_irrm_VV_wrk; |
2932 | ++ |
2933 | + case Iop_Add64Fx2: |
2934 | + size = 8; |
2935 | + vec_binop = S390_VEC_FLOAT_ADD; |
2936 | + goto Iop_irrm_VV_wrk; |
2937 | + |
2938 | ++ case Iop_Sub32Fx4: |
2939 | ++ size = 4; |
2940 | ++ vec_binop = S390_VEC_FLOAT_SUB; |
2941 | ++ goto Iop_irrm_VV_wrk; |
2942 | ++ |
2943 | + case Iop_Sub64Fx2: |
2944 | + size = 8; |
2945 | + vec_binop = S390_VEC_FLOAT_SUB; |
2946 | + goto Iop_irrm_VV_wrk; |
2947 | + |
2948 | ++ case Iop_Mul32Fx4: |
2949 | ++ size = 4; |
2950 | ++ vec_binop = S390_VEC_FLOAT_MUL; |
2951 | ++ goto Iop_irrm_VV_wrk; |
2952 | ++ |
2953 | + case Iop_Mul64Fx2: |
2954 | + size = 8; |
2955 | + vec_binop = S390_VEC_FLOAT_MUL; |
2956 | + goto Iop_irrm_VV_wrk; |
2957 | ++ |
2958 | ++ case Iop_Div32Fx4: |
2959 | ++ size = 4; |
2960 | ++ vec_binop = S390_VEC_FLOAT_DIV; |
2961 | ++ goto Iop_irrm_VV_wrk; |
2962 | ++ |
2963 | + case Iop_Div64Fx2: |
2964 | + size = 8; |
2965 | + vec_binop = S390_VEC_FLOAT_DIV; |
2966 | +--- a/VEX/priv/main_main.c |
2967 | ++++ b/VEX/priv/main_main.c |
2968 | +@@ -1792,6 +1792,7 @@ |
2969 | + { VEX_HWCAPS_S390X_MSA5, "msa5" }, |
2970 | + { VEX_HWCAPS_S390X_MI2, "mi2" }, |
2971 | + { VEX_HWCAPS_S390X_LSC2, "lsc2" }, |
2972 | ++ { VEX_HWCAPS_S390X_LSC2, "vxe" }, |
2973 | + }; |
2974 | + /* Allocate a large enough buffer */ |
2975 | + static HChar buf[sizeof prefix + |
2976 | +--- a/VEX/pub/libvex_emnote.h |
2977 | ++++ b/VEX/pub/libvex_emnote.h |
2978 | +@@ -124,6 +124,10 @@ |
2979 | + /* ppno insn is not supported on this host */ |
2980 | + EmFail_S390X_ppno, |
2981 | + |
2982 | ++ /* insn needs vector-enhancements facility which is not available on this |
2983 | ++ host */ |
2984 | ++ EmFail_S390X_vxe, |
2985 | ++ |
2986 | + EmNote_NUMBER |
2987 | + } |
2988 | + VexEmNote; |
2989 | +--- a/VEX/pub/libvex.h |
2990 | ++++ b/VEX/pub/libvex.h |
2991 | +@@ -167,7 +167,7 @@ |
2992 | + #define VEX_HWCAPS_S390X_MSA5 (1<<19) /* message security assistance facility */ |
2993 | + #define VEX_HWCAPS_S390X_MI2 (1<<20) /* miscellaneous-instruction-extensions facility 2 */ |
2994 | + #define VEX_HWCAPS_S390X_LSC2 (1<<21) /* Conditional load/store facility2 */ |
2995 | +- |
2996 | ++#define VEX_HWCAPS_S390X_VXE (1<<22) /* Vector-enhancements facility */ |
2997 | + |
2998 | + /* Special value representing all available s390x hwcaps */ |
2999 | + #define VEX_HWCAPS_S390X_ALL (VEX_HWCAPS_S390X_LDISP | \ |
3000 | +@@ -185,7 +185,8 @@ |
3001 | + VEX_HWCAPS_S390X_VX | \ |
3002 | + VEX_HWCAPS_S390X_MSA5 | \ |
3003 | + VEX_HWCAPS_S390X_MI2 | \ |
3004 | +- VEX_HWCAPS_S390X_LSC2) |
3005 | ++ VEX_HWCAPS_S390X_LSC2 | \ |
3006 | ++ VEX_HWCAPS_S390X_VXE) |
3007 | + |
3008 | + #define VEX_HWCAPS_S390X(x) ((x) & ~VEX_S390X_MODEL_MASK) |
3009 | + #define VEX_S390X_MODEL(x) ((x) & VEX_S390X_MODEL_MASK) |
3010 | diff --git a/debian/patches/lp-1825343-Bug-428648-s390x-Force-12-bit-amode-for-vector-loads.patch b/debian/patches/lp-1825343-Bug-428648-s390x-Force-12-bit-amode-for-vector-loads.patch |
3011 | new file mode 100644 |
3012 | index 0000000..a62098a |
3013 | --- /dev/null |
3014 | +++ b/debian/patches/lp-1825343-Bug-428648-s390x-Force-12-bit-amode-for-vector-loads.patch |
3015 | @@ -0,0 +1,45 @@ |
3016 | +From ba73f8d2ebe4b5fe8163ee5ab806f0e50961ebdf Mon Sep 17 00:00:00 2001 |
3017 | +From: Andreas Arnez <arnez@linux.ibm.com> |
3018 | +Date: Tue, 3 Nov 2020 18:17:30 +0100 |
3019 | +Subject: [PATCH] Bug 428648 - s390x: Force 12-bit amode for vector loads in isel |
3020 | + |
3021 | +Similar to Bug 417452, where the instruction selector sometimes attempted |
3022 | +to generate vector stores with a 20-bit displacement, the same problem has |
3023 | +now been reported with vector loads. |
3024 | + |
3025 | +The problem is caused in s390_isel_vec_expr_wrk(), where the addressing |
3026 | +mode is generated with s390_isel_amode() instead of |
3027 | +s390_isel_amode_short(). This is fixed. |
3028 | + |
3029 | +Author: Andreas Arnez <arnez@linux.ibm.com> |
3030 | +Origin: backport, https://sourceware.org/git/?p=valgrind.git;a=commit;h=ba73f8d2e |
3031 | +Bug-IBM: IBM Bugzilla 163660 |
3032 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1825343 |
3033 | +Applied-Upstream: > v3.16.1 |
3034 | +Reviewed-by: Frank Heimes <frank.heimes@canonical.com> |
3035 | +Last-Update: 2021-02-10 |
3036 | + |
3037 | +--- |
3038 | + NEWS | 1 + |
3039 | + VEX/priv/host_s390_isel.c | 2 +- |
3040 | + 2 files changed, 3 insertions(+), 1 deletion(-) |
3041 | +--- a/NEWS |
3042 | ++++ b/NEWS |
3043 | +@@ -1,4 +1,6 @@ |
3044 | + |
3045 | ++428648 s390_emit_load_mem panics due to 20-bit offset for vector load |
3046 | ++ |
3047 | + Release 3.16.1 (22 June 2020) |
3048 | + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
3049 | + |
3050 | +--- a/VEX/priv/host_s390_isel.c |
3051 | ++++ b/VEX/priv/host_s390_isel.c |
3052 | +@@ -3741,7 +3741,7 @@ |
3053 | + /* --------- LOAD --------- */ |
3054 | + case Iex_Load: { |
3055 | + HReg dst = newVRegV(env); |
3056 | +- s390_amode *am = s390_isel_amode(env, expr->Iex.Load.addr); |
3057 | ++ s390_amode *am = s390_isel_amode_short(env, expr->Iex.Load.addr); |
3058 | + |
3059 | + if (expr->Iex.Load.end != Iend_BE) |
3060 | + goto irreducible; |
3061 | diff --git a/debian/patches/lp-1825343-Bug-429864-s390-Use-Iop_CasCmp-to-fix-memcheck-false.patch b/debian/patches/lp-1825343-Bug-429864-s390-Use-Iop_CasCmp-to-fix-memcheck-false.patch |
3062 | new file mode 100644 |
3063 | index 0000000..94c81f8 |
3064 | --- /dev/null |
3065 | +++ b/debian/patches/lp-1825343-Bug-429864-s390-Use-Iop_CasCmp-to-fix-memcheck-false.patch |
3066 | @@ -0,0 +1,155 @@ |
3067 | +From 5adeafad7a60b63786d9545e6980de26c17cb0a6 Mon Sep 17 00:00:00 2001 |
3068 | +From: Andreas Arnez <arnez@linux.ibm.com> |
3069 | +Date: Thu, 3 Dec 2020 18:32:45 +0100 |
3070 | +Subject: [PATCH] Bug 429864 - s390: Use Iop_CasCmp* to fix memcheck false |
3071 | + positives |
3072 | + |
3073 | +Compare-and-swap instructions can cause memcheck false positives when |
3074 | +operating on partially uninitialized data. An example is where a 1-byte |
3075 | +lock is allocated on the stack and then manipulated using CS on the |
3076 | +surrounding word. This is correct, and the uninitialized data has no |
3077 | +influence on the result, but memcheck still complains. |
3078 | + |
3079 | +This is caused by logic in the s390 backend, where the expected and actual |
3080 | +memory values are compared using Iop_Sub32. Fix this by using |
3081 | +Iop_CasCmpNE32 instead. |
3082 | + |
3083 | +Author: Andreas Arnez <arnez@linux.ibm.com> |
3084 | +Origin: backport, https://sourceware.org/git/?p=valgrind.git;a=commit;h=5adeafad7 |
3085 | +Bug-IBM: IBM Bugzilla 163660 |
3086 | +Bug-Ubuntu: https://bugs.launchpad.net/bugs/1825343 |
3087 | +Applied-Upstream: > v3.16.1 |
3088 | +Reviewed-by: Frank Heimes <frank.heimes@canonical.com> |
3089 | +Last-Update: 2021-02-10 |
3090 | + |
3091 | +--- |
3092 | + NEWS | 2 ++ |
3093 | + VEX/priv/guest_s390_toIR.c | 31 ++++++++++++++----------------- |
3094 | + 2 files changed, 16 insertions(+), 17 deletions(-) |
3095 | + |
3096 | +--- a/NEWS |
3097 | ++++ b/NEWS |
3098 | +@@ -1,5 +1,7 @@ |
3099 | + |
3100 | + 428648 s390_emit_load_mem panics due to 20-bit offset for vector load |
3101 | ++429864 s390x: C++ atomic test_and_set yields false-positive memcheck |
3102 | ++ diagnostics |
3103 | + |
3104 | + Release 3.16.1 (22 June 2020) |
3105 | + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
3106 | +--- a/VEX/priv/guest_s390_toIR.c |
3107 | ++++ b/VEX/priv/guest_s390_toIR.c |
3108 | +@@ -742,6 +742,9 @@ |
3109 | + case Ity_I8: |
3110 | + expr = unop(sign_extend ? Iop_8Sto64 : Iop_8Uto64, expr); |
3111 | + break; |
3112 | ++ case Ity_I1: |
3113 | ++ expr = unop(sign_extend ? Iop_1Sto64 : Iop_1Uto64, expr); |
3114 | ++ break; |
3115 | + default: |
3116 | + vpanic("s390_cc_widen"); |
3117 | + } |
3118 | +@@ -7417,7 +7420,7 @@ |
3119 | + |
3120 | + /* If old_mem contains the expected value, then the CAS succeeded. |
3121 | + Otherwise, it did not */ |
3122 | +- yield_if(binop(Iop_CmpNE32, mkexpr(old_mem), mkexpr(op2))); |
3123 | ++ yield_if(binop(Iop_CasCmpNE32, mkexpr(old_mem), mkexpr(op2))); |
3124 | + put_gpr_w1(r1, mkexpr(old_mem)); |
3125 | + } |
3126 | + |
3127 | +@@ -7451,7 +7454,7 @@ |
3128 | + |
3129 | + /* If old_mem contains the expected value, then the CAS succeeded. |
3130 | + Otherwise, it did not */ |
3131 | +- yield_if(binop(Iop_CmpNE64, mkexpr(old_mem), mkexpr(op2))); |
3132 | ++ yield_if(binop(Iop_CasCmpNE64, mkexpr(old_mem), mkexpr(op2))); |
3133 | + put_gpr_dw0(r1, mkexpr(old_mem)); |
3134 | + } |
3135 | + |
3136 | +@@ -7481,7 +7484,7 @@ |
3137 | + |
3138 | + /* If old_mem contains the expected value, then the CAS succeeded. |
3139 | + Otherwise, it did not */ |
3140 | +- yield_if(binop(Iop_CmpNE32, mkexpr(old_mem), mkexpr(op2))); |
3141 | ++ yield_if(binop(Iop_CasCmpNE32, mkexpr(old_mem), mkexpr(op2))); |
3142 | + put_gpr_w1(r1, mkexpr(old_mem)); |
3143 | + } |
3144 | + |
3145 | +@@ -13864,7 +13867,6 @@ |
3146 | + IRTemp op1 = newTemp(Ity_I32); |
3147 | + IRTemp old_mem = newTemp(Ity_I32); |
3148 | + IRTemp op3 = newTemp(Ity_I32); |
3149 | +- IRTemp result = newTemp(Ity_I32); |
3150 | + IRTemp nequal = newTemp(Ity_I1); |
3151 | + |
3152 | + assign(op1, get_gpr_w1(r1)); |
3153 | +@@ -13879,12 +13881,11 @@ |
3154 | + stmt(IRStmt_CAS(cas)); |
3155 | + |
3156 | + /* Set CC. Operands compared equal -> 0, else 1. */ |
3157 | +- assign(result, binop(Iop_Sub32, mkexpr(op1), mkexpr(old_mem))); |
3158 | +- s390_cc_thunk_put1(S390_CC_OP_BITWISE, result, False); |
3159 | ++ assign(nequal, binop(Iop_CasCmpNE32, mkexpr(op1), mkexpr(old_mem))); |
3160 | ++ s390_cc_thunk_put1(S390_CC_OP_BITWISE, nequal, True); |
3161 | + |
3162 | + /* If operands were equal (cc == 0) just store the old value op1 in r1. |
3163 | + Otherwise, store the old_value from memory in r1 and yield. */ |
3164 | +- assign(nequal, binop(Iop_CmpNE32, s390_call_calculate_cc(), mkU32(0))); |
3165 | + put_gpr_w1(r1, mkite(mkexpr(nequal), mkexpr(old_mem), mkexpr(op1))); |
3166 | + yield_if(mkexpr(nequal)); |
3167 | + } |
3168 | +@@ -13912,7 +13913,6 @@ |
3169 | + IRTemp op1 = newTemp(Ity_I64); |
3170 | + IRTemp old_mem = newTemp(Ity_I64); |
3171 | + IRTemp op3 = newTemp(Ity_I64); |
3172 | +- IRTemp result = newTemp(Ity_I64); |
3173 | + IRTemp nequal = newTemp(Ity_I1); |
3174 | + |
3175 | + assign(op1, get_gpr_dw0(r1)); |
3176 | +@@ -13927,12 +13927,11 @@ |
3177 | + stmt(IRStmt_CAS(cas)); |
3178 | + |
3179 | + /* Set CC. Operands compared equal -> 0, else 1. */ |
3180 | +- assign(result, binop(Iop_Sub64, mkexpr(op1), mkexpr(old_mem))); |
3181 | +- s390_cc_thunk_put1(S390_CC_OP_BITWISE, result, False); |
3182 | ++ assign(nequal, binop(Iop_CasCmpNE64, mkexpr(op1), mkexpr(old_mem))); |
3183 | ++ s390_cc_thunk_put1(S390_CC_OP_BITWISE, nequal, True); |
3184 | + |
3185 | + /* If operands were equal (cc == 0) just store the old value op1 in r1. |
3186 | + Otherwise, store the old_value from memory in r1 and yield. */ |
3187 | +- assign(nequal, binop(Iop_CmpNE32, s390_call_calculate_cc(), mkU32(0))); |
3188 | + put_gpr_dw0(r1, mkite(mkexpr(nequal), mkexpr(old_mem), mkexpr(op1))); |
3189 | + yield_if(mkexpr(nequal)); |
3190 | + |
3191 | +@@ -13950,7 +13949,6 @@ |
3192 | + IRTemp old_mem_low = newTemp(Ity_I32); |
3193 | + IRTemp op3_high = newTemp(Ity_I32); |
3194 | + IRTemp op3_low = newTemp(Ity_I32); |
3195 | +- IRTemp result = newTemp(Ity_I32); |
3196 | + IRTemp nequal = newTemp(Ity_I1); |
3197 | + |
3198 | + assign(op1_high, get_gpr_w1(r1)); |
3199 | +@@ -13967,18 +13965,17 @@ |
3200 | + stmt(IRStmt_CAS(cas)); |
3201 | + |
3202 | + /* Set CC. Operands compared equal -> 0, else 1. */ |
3203 | +- assign(result, unop(Iop_1Uto32, |
3204 | +- binop(Iop_CmpNE32, |
3205 | ++ assign(nequal, |
3206 | ++ binop(Iop_CasCmpNE32, |
3207 | + binop(Iop_Or32, |
3208 | + binop(Iop_Xor32, mkexpr(op1_high), mkexpr(old_mem_high)), |
3209 | + binop(Iop_Xor32, mkexpr(op1_low), mkexpr(old_mem_low))), |
3210 | +- mkU32(0)))); |
3211 | ++ mkU32(0))); |
3212 | + |
3213 | +- s390_cc_thunk_put1(S390_CC_OP_BITWISE, result, False); |
3214 | ++ s390_cc_thunk_put1(S390_CC_OP_BITWISE, nequal, True); |
3215 | + |
3216 | + /* If operands were equal (cc == 0) just store the old value op1 in r1. |
3217 | + Otherwise, store the old_value from memory in r1 and yield. */ |
3218 | +- assign(nequal, binop(Iop_CmpNE32, s390_call_calculate_cc(), mkU32(0))); |
3219 | + put_gpr_w1(r1, mkite(mkexpr(nequal), mkexpr(old_mem_high), mkexpr(op1_high))); |
3220 | + put_gpr_w1(r1+1, mkite(mkexpr(nequal), mkexpr(old_mem_low), mkexpr(op1_low))); |
3221 | + yield_if(mkexpr(nequal)); |
3222 | diff --git a/debian/patches/series b/debian/patches/series |
3223 | index bc89f83..36cbddd 100644 |
3224 | --- a/debian/patches/series |
3225 | +++ b/debian/patches/series |
3226 | @@ -9,3 +9,6 @@ |
3227 | 13_fix-path-to-vgdb.patch |
3228 | 14_fix-debuginfo-section-duplicates-a-section-in-the-main-ELF-file.patch |
3229 | armv7-illegal-opcode.patch |
3230 | +lp-1825343-Bug-428648-s390x-Force-12-bit-amode-for-vector-loads.patch |
3231 | +lp-1825343-Bug-429864-s390-Use-Iop_CasCmp-to-fix-memcheck-false.patch |
3232 | +lp-1825343-Bug-404076-s390x-Implement-z14-vector-instructions.patch |
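Note on the addressing-mode fix (Bug 428648) in the diff above: the patch description says the instruction selector could generate a vector load with a 20-bit displacement where only a 12-bit one is allowed. A minimal sketch of the two ranges follows, assuming the usual z/Architecture encoding in which 12-bit displacements are unsigned and 20-bit displacements are signed. The helper names are hypothetical and are not Valgrind functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of Valgrind. */

/* Short form: 12-bit unsigned displacement, 0..4095. */
static inline bool fits_disp12(int64_t d)
{
    return d >= 0 && d <= 0xFFF;
}

/* Long form: 20-bit signed displacement, -524288..524287. */
static inline bool fits_disp20(int64_t d)
{
    return d >= -(1 << 19) && d <= (1 << 19) - 1;
}

/* True for offsets that a long-displacement amode can encode but a
   short one cannot, i.e. the case that must be avoided for vector loads. */
static inline bool needs_long_form(int64_t d)
{
    return fits_disp20(d) && !fits_disp12(d);
}
```

Under these assumptions, an offset of 4096 is encodable as a long displacement but not as a short one, which is the kind of mismatch that switching to `s390_isel_amode_short()` is meant to rule out.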
Changelog:
- [✓] changelog entry correct version and targeted codename
- [✓] changelog entries correct
- [✓] update-maintainer has been run
New Delta: patches/ series
- [✓] new patches are good or match what was merged upstream
- [✓] new patches correctly included in debian/
- [✓] new patches have correct DEP3 metadata
Build/Test:
- [✓] build is ok (the one warning is not related to the changed code)
- [✓] verified PPA package installs/uninstalls
- [✓] sanity checks test fine
Some code changes are too complex to fully sign off on, e.g. lp-1825343-Bug-404076-s390x-Implement-z14-vector-instructions.patch. There I have verified that it closely matches what went upstream, and I have to trust upstream as the subject matter experts.
Overall, this looks like a good pre-FF feature addition pulled forward from the upcoming version.
+1
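As a footnote on Bug 429864: the NEWS entry attributes the false positive to an atomic test_and_set, and the patch description mentions a 1-byte lock on the stack that is updated with CS on the surrounding word. A minimal C11 sketch of that pattern is below. The function name is hypothetical, and whether the compiler actually emits a word-sized CS here depends on the target and compiler, so treat it as an illustration and not as a verified reproducer.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical example function, not taken from the bug report. */
static inline bool take_fresh_lock(void)
{
    /* Small lock object on the stack; neighbouring stack bytes may be
       uninitialized from memcheck's point of view. */
    atomic_flag lock = ATOMIC_FLAG_INIT;

    /* test_and_set returns the previous state: false for a clear flag,
       so the first attempt acquires the lock. */
    return !atomic_flag_test_and_set(&lock);
}
```

On s390x, running a small program that calls such a function under memcheck would be one plausible way to compare behaviour with and without the patch.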