Merge ~mkukri/ubuntu/+source/zlib:merge into ubuntu/+source/zlib:debian/sid
Status: | Merged |
---|---|
Merge reported by: | Mate Kukri |
Merged at revision: | 515581d841bd3732d669f9806966080208c840b8 |
Proposed branch: | ~mkukri/ubuntu/+source/zlib:merge |
Merge into: | ubuntu/+source/zlib:debian/sid |
Diff against target: | 6023 lines (+5732/-19), 17 files modified |
debian/changelog (+246/-0)
debian/control (+24/-1)
debian/libx32z1-dev.dirs (+1/-0)
debian/libx32z1-dev.install (+2/-0)
debian/libx32z1.dirs (+1/-0)
debian/libx32z1.install (+1/-0)
debian/libx32z1.symbols (+3/-0)
debian/patches/power/add-optimized-crc32.patch (+2539/-0)
debian/patches/power/fix-clang7-builtins.patch (+62/-0)
debian/patches/power/indirect-func-macros.patch (+295/-0)
debian/patches/s390x/add-accel-deflate.patch (+2043/-0)
debian/patches/s390x/add-vectorized-crc32.patch (+426/-0)
debian/patches/series (+5/-0)
debian/rules (+39/-5)
debian/upstream/signing-key.asc (+30/-0)
debian/watch (+2/-0)
debian/zlib-core.symbols (+13/-13)
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Lukas Märdian (community) | Approve | ||
Frank Heimes (community) | Approve | ||
Steve Langasek (community) | Abstain | ||
Ubuntu Sponsors | Pending | ||
git-ubuntu import | Pending | ||
Review via email: mp+456176@code.launchpad.net |
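The per-file counts in the merge header can be cross-checked against the stated totals (+5732/-19 across 17 files). The snippet below is only an illustrative sketch: the `tally` helper is a hypothetical name, and the input is simply the file list copied from the header.

```python
import re

# Per-file counts copied from the "Diff against target" summary in the header.
DIFFSTAT = """
debian/changelog (+246/-0) debian/control (+24/-1)
debian/libx32z1-dev.dirs (+1/-0) debian/libx32z1-dev.install (+2/-0)
debian/libx32z1.dirs (+1/-0) debian/libx32z1.install (+1/-0)
debian/libx32z1.symbols (+3/-0)
debian/patches/power/add-optimized-crc32.patch (+2539/-0)
debian/patches/power/fix-clang7-builtins.patch (+62/-0)
debian/patches/power/indirect-func-macros.patch (+295/-0)
debian/patches/s390x/add-accel-deflate.patch (+2043/-0)
debian/patches/s390x/add-vectorized-crc32.patch (+426/-0)
debian/patches/series (+5/-0) debian/rules (+39/-5)
debian/upstream/signing-key.asc (+30/-0) debian/watch (+2/-0)
debian/zlib-core.symbols (+13/-13)
"""


def tally(diffstat: str) -> tuple[int, int, int]:
    """Return (files, added, removed) summed over all '(+A/-R)' entries."""
    counts = re.findall(r"\(\+(\d+)/-(\d+)\)", diffstat)
    added = sum(int(a) for a, _ in counts)
    removed = sum(int(r) for _, r in counts)
    return len(counts), added, removed


if __name__ == "__main__":
    print(tally(DIFFSTAT))
```

The result agrees with the header: 17 files, 5732 added lines and 19 removed lines. Most of the added lines come from the two large IBM patches.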
Commit message
Merge zlib with Debian unstable.
This needed some TLC:
- Split the previous diff with git ubuntu
- Replaced the POWER and s390x patches with the newest ones from IBM rebased on Debian
- Removed the superseded bugfix patches (now included in the above)
Description of the change
Mate Kukri (mkukri) wrote:
> I'm off until end of year so I think you should grab a different reviewer for
> this
Understood. I saw your name and Frank Heimes's on the last changelog entries; that's what I based this on.
Do you have anyone in mind who has touched this package before and might be willing to review this?
Steve Langasek (vorlon) wrote:
On Thu, Nov 23, 2023 at 01:30:54PM -0000, Mate Kukri wrote:
> > I'm off until end of year so I think you should grab a different reviewer for
> > this
> Understood. I saw your name and Frank Heimes's on the last changelog
> entries; that's what I based this on.
>
> Do you have anyone in mind who has touched this package before and
> might be willing to review this?
I don't think "touched this package" is a relevant criterion and you should
ask around in Foundations (or just ask ~canonical-
- b2a9df2... by Mate Kukri: merge-changelogs
- 87e1e2b... by Mate Kukri: reconstruct-changelog
Mate Kukri (mkukri) wrote:
Now based on 1:1.3.dfsg-3
- 515581d... by Mate Kukri: update-maintainer
Frank Heimes (fheimes) wrote:
I think this looks good, and is a nice clean-up.
Since this is being merged into the noble development release quite early, there should be some time to ask the IBM s390x people to give it a try (I remember that Ilya Leoshkevich had some test code).
Once I see that this has landed, I would like to ask Ilya (no need for you to do anything, but that would let us make sure that the changed s390x optimization patches work fine ...).
Mate Kukri (mkukri) wrote:
Are you also able to upload this, or should I ask someone else?
Frank Heimes (fheimes) wrote:
Hi Mate,
I'm sorry, you would need a coredev for uploading, since it's a main
package - and I am only MOTU (working on coredev ;-).
IIRC schopin sponsored my zlib uploads in the past ...
Bye, Frank
Frank Heimes (fheimes) wrote:
Btw., I haven't seen an LP bug reference in the changelog. Are you doing this
merge based on an LP bug (which is what I assume)? If so, please don't forget
to reference that LP bug in d/changelog.
Mate Kukri (mkukri) wrote:
I don't think there is an LP bug for this. Maybe I should have created one, but this is tracked internally in the Foundations Jira.
Frank Heimes (fheimes) wrote:
I think the Wiki page for merging recommends doing so:
https:/
"FILE A MERGE BUG"
Lukas Märdian (slyon) wrote:
Thank you Mate, that's indeed a really nice cleanup!
The new patches are nicely structured and provide clean patch headers. I confirmed they match the patches from Ilya (iii-i/zlib/dfltcc) on GitHub. Besides the new patches, the delta looks very similar to our previous delta, but this time as clean git-ubuntu commits. Kudos!
@Frank: you mention there might be some test code available. I wonder if we could somehow integrate that into the package, because unfortunately there doesn't seem to be any dh_auto_test or autopkgtest. :(
Either way, we should definitely ask IBM/Ilya to verify that the new patches work as intended.
@Mate: We should also consider upstreaming the d/watch delta to Debian; I think that could be useful and doesn't need to be part of the delta.
Test build passed in a PPA:
https:/
LGTM. Sponsoring.
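A check like the one described in this review (packaged patches matching the upstream ones) could be approximated as below. This is only a sketch of one possible approach, not the procedure actually used in the review. It assumes two patches count as matching when everything from the first `diff --git` line onward is identical once `index` lines are ignored. `patch_body`, `same_change` and `explain` are hypothetical helper names.

```python
import difflib


def patch_body(text: str) -> list[str]:
    """Return the diff part of a patch, dropping headers and 'index' lines."""
    lines = text.splitlines()
    # Skip the mail-style / DEP-3 header: everything before the first file diff.
    start = next(
        (i for i, line in enumerate(lines) if line.startswith("diff --git ")), 0
    )
    return [line for line in lines[start:] if not line.startswith("index ")]


def same_change(packaged: str, upstream: str) -> bool:
    """True if both patches carry the same file diffs and hunks."""
    return patch_body(packaged) == patch_body(upstream)


def explain(packaged: str, upstream: str) -> str:
    """Unified diff between the two patch bodies (empty string if they match)."""
    return "\n".join(
        difflib.unified_diff(
            patch_body(upstream), patch_body(packaged),
            "upstream", "packaged", lineterm="",
        )
    )


if __name__ == "__main__":
    # Tiny made-up patches that differ only in their headers and index lines.
    hunk = "--- a/f\n+++ b/f\n@@ -1 +1 @@\n-a\n+b\n"
    upstream = "Subject: demo\n\ndiff --git a/f b/f\nindex 111..222 100644\n" + hunk
    packaged = "Origin: demo\n---\ndiff --git a/f b/f\nindex 333..444 100644\n" + hunk
    print(same_change(packaged, upstream))
    print(explain(packaged, upstream) or "(no differences)")
```

Ignoring the header is a deliberate choice here: a rebased patch normally gains packaging metadata such as an `Origin:` field, so only the hunks are compared.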
Frank Heimes (fheimes) wrote:
From what I remember, 'iii' has just a few roughly coded C programs that
test the s390x optimizations and verify some bugs that popped up in the past.
Unfortunately, I assume the code is not in a shape to be integrated as a
standard test, and it is s390x-specific anyway ... :-/
I was thinking more of using these as a kind of regression test for the
s390x-specific bits and pieces.
But I'll ask; maybe there has been some more work on it that I am not aware
of ...
Frank Heimes (fheimes) wrote:
So Ilya was pretty quick. He tested the package in a mantic environment
(which is still close to noble) and all his tests passed!
As assumed, his tests are s390x-specific, so not very useful for a more
generic autopkgtest.
Anyway, glad that he could give it a try and came back with a :thumbs up:
Mate Kukri (mkukri) wrote:
@fheimes That is good news.
If the test code is in a publishable state, it might still be worth a shot to integrate it as an s390x-specific autopkgtest.
That and the POWER crc32 are our only significant delta over Debian, so I think it would still help give more confidence in these merges.
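To illustrate what an architecture-independent smoke test for this delta might look like, here is a minimal sketch using Python's standard `zlib` module. It is not Ilya's test code, which is not shown in this thread. It assumes the interpreter is linked against the system libz, so the packaged library is what gets exercised; it also cannot tell whether a hardware-accelerated path was actually taken. The function names are hypothetical. Level 6 and raw deflate streams are covered because the changelog mentions both.

```python
import os
import zlib


def check_crc32() -> None:
    """Check crc32 against the standard check value and in chunked mode."""
    # 0xCBF43926 is the well-known CRC-32 check value for b"123456789".
    assert zlib.crc32(b"123456789") == 0xCBF43926
    # Feeding the data in odd-sized chunks must give the one-shot result.
    data = os.urandom(1 << 16)
    running = 0
    for off in range(0, len(data), 4093):
        running = zlib.crc32(data[off:off + 4093], running)
    assert running == zlib.crc32(data)


def check_roundtrip(level: int = 6) -> None:
    """Round-trip a mixed payload through zlib-wrapped and raw deflate."""
    payload = b"zlib smoke test " * 4096 + os.urandom(4096)
    # zlib-wrapped stream (header plus adler32 trailer).
    assert zlib.decompress(zlib.compress(payload, level)) == payload
    # Raw deflate stream: negative wbits means no header and no trailer.
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    raw = comp.compress(payload) + comp.flush()
    assert zlib.decompress(raw, -15) == payload


if __name__ == "__main__":
    check_crc32()
    check_roundtrip()
    print("zlib smoke test passed against runtime", zlib.ZLIB_RUNTIME_VERSION)
```

A test along these lines would run on every architecture, while the optimized code paths would only be reached on POWER and s390x builds.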
Preview Diff
diff --git a/debian/changelog b/debian/changelog
index 92d84a0..d52ce34 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,25 @@
+zlib (1:1.3.dfsg-3ubuntu1) noble; urgency=medium
+
+ * Merge with Debian unstable. Remaining changes:
+ - Build x32 packages
+ - Add watch file, with GPG tarball checking, and version mangling
+ - d/rules: Compile with DFLTCC enabled on s390x and hardware
+ compression at level 6
+ - d/zlib-core.symbols: Drop dfsg suffix from version
+ * New patches rebased from iii-i/zlib/dfltcc on GitHub:
+ - d/p/power/*: Add optimized crc32 for POWER8+
+ - d/p/s390x/*: Add optimized crc32 and hardware deflate
+ * Patches superseded by the above:
+ - d/p/410.patch: Add support for IBM Z hardware-accelerated deflate
+ - d/p/478.patch: Add optimized crc32 for Power 8+ processors
+ - d/p/s390x-vectorize-crc32.patch: Add s390x vectorized crc32 support
+ - d/p/1390.patch: Don't update strm.adler for raw streams on s390x
+ (DFLTCC), otherwise libxml2 gets broken on s390x. LP #2002511
+ - d/p/lp-2018293-fix-crash-in-deflateBound-if-called-before-deflateInt
+ .patch: Avoid potential deflateBound() function crash on s390x
+
+ -- Mate Kukri <mate.kukri@canonical.com> Fri, 24 Nov 2023 08:22:52 +0000
+
 zlib (1:1.3.dfsg-3) unstable; urgency=low

 * Update the version of texlive-binaries we break since they still had
@@ -34,6 +56,74 @@ zlib (1:1.2.13.dfsg-2) unstable; urgency=low

 -- Mark Brown <broonie@debian.org> Tue, 15 Aug 2023 00:28:42 +0100

+zlib (1:1.2.13.dfsg-1ubuntu5) mantic; urgency=medium
+
+ * Add
+ d/p/lp-2018293-fix-crash-in-deflateBound-if-called-before-deflateInt.patch
+ to avoid potential deflateBound() function crash on s390x.
+ * Clean-up and remove
+ d/p/lp1932010-ibm-z-add-vectorized-crc32-implementation.patch since it was
+ replaced by d/p/s390x-vectorize-crc32.patch with 1.2.13.dfsg-1ubuntu3
+ but was still in d/p/ (but not in d/p/series).
+
+ -- Frank Heimes <frank.heimes@canonical.com> Wed, 02 Aug 2023 13:22:26 +0200
+
+zlib (1:1.2.13.dfsg-1ubuntu4) lunar; urgency=medium
+
+ * Add d/p/1390.patch to not update strm.adler for raw streams on s390x
+ (DFLTCC), otherwise libxml2 gets broken on s390x. LP: #2002511
+
+ -- Frank Heimes <frank.heimes@canonical.com> Wed, 11 Jan 2023 18:02:34 +0100
+
+zlib (1:1.2.13.dfsg-1ubuntu3) lunar; urgency=medium
+
+ * Re-add vectorized crc32 support for s390x by adding
+ d/p/s390x-vectorize-crc32.patch
+ (crc32vx-v4: s390x: vectorize crc32). (LP: #1998470)
+ This replaces the previously dropped patch:
+ lp1932010-ibm-z-add-vectorized-crc32-implementation.patch
+ * Remove option '--crc32-vx' for s390x in d/rules, that was previously just
+ commented out, since it's no longer needed with the new s390x crc32 code.
+ * Update d/p/410.patch to version 26f2c0a4e17e5558d779797d713aa37ebaeef390
+ due to unused "const char *endptr;".
+
+ -- Frank Heimes <frank.heimes@canonical.com> Mon, 21 Nov 2022 20:28:58 +0100
+
+zlib (1:1.2.13.dfsg-1ubuntu2) lunar; urgency=medium
+
+ * Comment out use of --crc32-vx on s390x, since this is currently not
+ implemented due to the dropped patch that needs porting.
+
+ -- Steve Langasek <steve.langasek@ubuntu.com> Tue, 15 Nov 2022 17:06:45 +0000
+
+zlib (1:1.2.13.dfsg-1ubuntu1) lunar; urgency=low
+
+ * Merge from Debian unstable. Remaining changes:
+ - Build x32 packages
+ - debian/zlib-core.symbols: Drop dfsg suffix from version
+ - Add watch file, with GPG tarball checking, and version mangling
+ - Cherrypick PR#410 to enable hardware-accelerated deflate.
+ - Copmile with DFLTCC enabled on s390x.
+ - Enable hardware compression on s390x at level 6.
+ - d/rules: use configure options for dfltcc instead of hardcoding
+ the CFLAGS
+ * Dropped changes, included upstream:
+ - Cherry-pick Permit-a-deflateParams-parameter-change-asap.patch
+ - debian/patches/CVE-2018-25032-2.patch: assure that the number of bits
+ for deflatePrime() is valid in deflate.c.
+ * Pull rebased 410.patch from https://github.com/madler/zlib/pull/410.
+ * Drop d/p/410-lp1961427.patch, included in the above rebase.
+ * Replace 335.patch for ppc64el (P8) crc32 performance with 478.patch which
+ supersedes it (https://github.com/madler/zlib/pull/478).
+ * Forward-port lp1932010-ibm-z-add-vectorized-crc32-implementation.patch.
+ * Dropped changes:
+ - d/p/lp1932010-ibm-z-add-vectorized-crc32-implementation.patch: this
+ patch depends on zlib upstream PR 335 which has been superseded by
+ upstream PR 478 with significant refactoring. Drop this patch,
+ pending a port from IBM.
+
+ -- Steve Langasek <steve.langasek@ubuntu.com> Mon, 07 Nov 2022 15:57:28 -0800
+
 zlib (1:1.2.13.dfsg-1) unstable; urgency=low

 * New upstream release.
@@ -42,6 +132,38 @@ zlib (1:1.2.13.dfsg-1) unstable; urgency=low

 -- Mark Brown <broonie@debian.org> Sat, 05 Nov 2022 12:24:46 +0000

+zlib (1:1.2.11.dfsg-4.1ubuntu1) kinetic; urgency=low
+
+ * Merge from Debian unstable. Remaining changes:
+ - Build x32 packages
+ - debian/zlib-core.symbols: Drop dfsg suffix from version
+ - Add watch file, with GPG tarball checking, and version mangling
+ - Cherry-pick Permit-a-deflateParams-parameter-change-asap.patch:
+ - Cherrypick PR#410 to enable hardware-accelerated deflate.
+ - Copmile with DFLTCC enabled on s390x.
+ - Improve crc32 performance on P8, proposed upstream patch.
+ - Enable hardware compression on s390x at level 6.
+ - Cherrypick update of s390x hw acceleration #410 pull request patch,
+ which corrects inflateSyncPoint() return value to always gracefully
+ fail when hw acceleration is in use.
+ - d/rules: use configure options for dfltcc instead of hardcoding
+ the CFLAGS
+ - d/p/lp1932010-ibm-z-add-vectorized-crc32-implementation.patch
+ ported from zlib-ng #912, adding a vectorized implementation
+ of CRC32 on s390x architectures based on kernel code.
+ - d/p/lp1932010-ibm-z-add-vectorized-crc32-implementation.patch: adjust
+ to not make a PLT call in an ifunc on s390/s390x.
+ - debian/patches/CVE-2018-25032-2.patch: assure that the number of bits
+ for deflatePrime() is valid in deflate.c.
+ - d/p/410-lp1961427.patch ported from zlib #410, fixing
+ compressBound() with hw acceleration.
+ * Dropped changes, included in Debian:
+ - debian/patches/CVE-2018-25032-1.patch: fix a bug that can crash
+ deflate on some input when using Z_FIXED in deflate.c, deflate.h.
+ * Refresh 410.patch for upstream changes.
+
+ -- Steve Langasek <steve.langasek@ubuntu.com> Thu, 18 Aug 2022 09:09:22 -0700
+
 zlib (1:1.2.11.dfsg-4.1) unstable; urgency=medium

 * Non-maintainer upload.
@@ -69,6 +191,89 @@ zlib (1:1.2.11.dfsg-3) unstable; urgency=low

 -- Mark Brown <broonie@debian.org> Fri, 18 Mar 2022 00:21:37 +0000

+zlib (1:1.2.11.dfsg-2ubuntu10) kinetic; urgency=medium
+
+ * d/p/410-lp1961427.patch ported from zlib #410, fixing
+ compressBound() with hw acceleration. LP: #1961427
+ Thanks to Ilya Leoshkevich <iii@linux.ibm.com>.
+ In addition a patch is needed for bedtools.
+
+ -- Frank Heimes <frank.heimes@canonical.com> Thu, 21 Jul 2022 09:30:05 +0100
+
+zlib (1:1.2.11.dfsg-2ubuntu9) jammy; urgency=medium
+
+ * SECURITY UPDATE: memory corruption when deflating
+ - debian/patches/CVE-2018-25032-1.patch: fix a bug that can crash
+ deflate on some input when using Z_FIXED in deflate.c, deflate.h.
+ - debian/patches/CVE-2018-25032-2.patch: assure that the number of bits
+ for deflatePrime() is valid in deflate.c.
+ - CVE-2018-25032
+
+ -- Marc Deslauriers <marc.deslauriers@ubuntu.com> Fri, 25 Mar 2022 08:06:31 -0400
+
+zlib (1:1.2.11.dfsg-2ubuntu7) impish; urgency=medium
+
+ [ Simon Chopin ]
+ * d/rules: use configure options for dfltcc instead of hardcoding
+ the CFLAGS
+ * d/p/lp1932010-ibm-z-add-vectorized-crc32-implementation.patch
+ ported from zlib-ng #912, adding a vectorized implementation
+ of CRC32 on s390x architectures based on kernel code. LP: #1932010
+
+ [ Michael Hudson-Doyle ]
+ * d/p/lp1932010-ibm-z-add-vectorized-crc32-implementation.patch: adjust to
+ not make a PLT call in an ifunc on s390/s390x.
+
+ -- Simon Chopin <simon.chopin@canonical.com> Thu, 12 Aug 2021 15:45:49 +1200
+
+zlib (1:1.2.11.dfsg-2ubuntu6) hirsute; urgency=medium
+
+ * No-change rebuild to build with lto.
+
+ -- Matthias Klose <doko@ubuntu.com> Sun, 28 Mar 2021 09:10:07 +0200
+
+zlib (1:1.2.11.dfsg-2ubuntu5) hirsute; urgency=medium
+
+ * No-change rebuild to drop the udeb package.
+
+ -- Matthias Klose <doko@ubuntu.com> Mon, 22 Feb 2021 10:36:58 +0100
+
+zlib (1:1.2.11.dfsg-2ubuntu4) groovy; urgency=medium
+
+ * Cherrypick update of s390x hw acceleration #410 pull request patch,
+ which corrects inflateSyncPoint() return value to always gracefully
+ fail when hw acceleration is in use. This fixes rsync failure with
+ zlib compression on hw accelerated s390x. LP: #1899621
+
+ -- Dimitri John Ledkov <xnox@ubuntu.com> Thu, 15 Oct 2020 11:01:38 +0100
+
+zlib (1:1.2.11.dfsg-2ubuntu3) groovy; urgency=medium
+
+ * Enable hardware compression on s390x at level 6. LP: #1884514
+
+ -- Michael Hudson-Doyle <michael.hudson@ubuntu.com> Thu, 24 Sep 2020 08:44:35 +1200
+
+zlib (1:1.2.11.dfsg-2ubuntu2) groovy; urgency=medium
+
+ * Update d/patches/410.patch to current state. LP: #1882494, #1889059, #1893170
+
+ -- Michael Hudson-Doyle <michael.hudson@ubuntu.com> Thu, 20 Aug 2020 11:52:59 +1200
+
+zlib (1:1.2.11.dfsg-2ubuntu1) focal; urgency=medium
+
+ * Merge with Debian; remaining changes:
+ - Build x32 packages
+ - debian/zlib-core.symbols: Drop dfsg suffix from version
+ - Add watch file, with GPG tarball checking, and version mangling
+ - Drop unused patches
+ - Cherry-pick Permit-a-deflateParams-parameter-change-asap.patch:
+ (LP: #1692870)
+ - Cherrypick PR#410 to enable hardware-accelerated deflate.
+ - Copmile with DFLTCC enabled on s390x. LP: #1823157
+ - Improve crc32 performance on P8, proposed upstream patch. LP: #1742941.
+
+ -- Matthias Klose <doko@ubuntu.com> Tue, 25 Feb 2020 16:59:52 +0100
+
 zlib (1:1.2.11.dfsg-2) unstable; urgency=low

 * Acknowledge previous NMUs (closes: #949388).
@@ -80,6 +285,21 @@ zlib (1:1.2.11.dfsg-2) unstable; urgency=low

 -- Mark Brown <broonie@debian.org> Mon, 24 Feb 2020 21:07:12 +0000

+zlib (1:1.2.11.dfsg-1.2ubuntu1) focal; urgency=medium
+
+ * Merge with Debian; remaining changes:
+ - Build x32 packages
+ - debian/zlib-core.symbols: Drop dfsg suffix from version
+ - Add watch file, with GPG tarball checking, and version mangling
+ - Drop unused patches
+ - Cherry-pick Permit-a-deflateParams-parameter-change-asap.patch:
+ (LP: #1692870)
+ - Cherrypick PR#410 to enable hardware-accelerated deflate.
+ - Copmile with DFLTCC enabled on s390x. LP: #1823157
+ * Improve crc32 performance on P8, proposed upstream patch. LP: #1742941.
+
+ -- Matthias Klose <doko@ubuntu.com> Mon, 24 Feb 2020 12:57:03 +0100
+
 zlib (1:1.2.11.dfsg-1.2) unstable; urgency=medium

 * Non-maintainer upload.
@@ -97,6 +317,31 @@ zlib (1:1.2.11.dfsg-1.1) unstable; urgency=medium

 -- YunQiang Su <syq@debian.org> Tue, 28 Jan 2020 19:55:38 +0800

+zlib (1:1.2.11.dfsg-1ubuntu3) eoan; urgency=medium
+
+ * Cherrypick PR#410 to enable hardware-accelerated deflate.
+ * Copmile with DFLTCC enabled on s390x. LP: #1823157
+
+ -- Dimitri John Ledkov <xnox@ubuntu.com> Mon, 19 Aug 2019 19:51:09 +0100
+
+zlib (1:1.2.11.dfsg-1ubuntu2) disco; urgency=medium
+
+ * debian/zlib-core.symbols: fix mistake introduced in the merge
+
+ -- Jeremy Bicha <jbicha@debian.org> Thu, 24 Jan 2019 12:56:53 -0500
+
+zlib (1:1.2.11.dfsg-1ubuntu1) disco; urgency=medium
+
+ * Sync with Debian. Remaining changes:
+ - Build x32 packages
+ - debian/zlib-core.symbols: Drop dfsg suffix from version
+ - Add watch file, with GPG tarball checking, and version mangling
+ - Drop unused patches
+ - Cherry-pick Permit-a-deflateParams-parameter-change-asap.patch:
+ (LP: #1692870)
+
+ -- Jeremy Bicha <jbicha@debian.org> Wed, 23 Jan 2019 17:22:17 -0500
+
 zlib (1:1.2.11.dfsg-1) unstable; urgency=low

 * New upstream release (closes: #883180).
@@ -1072,3 +1317,4 @@ zlib (1.0.4-1) unstable; urgency=low
 * Moved to new source packaging format.

 -- Michael Alan Dorman <mdorman@calder.med.miami.edu> Thu, 12 Sep 1996 15:19:35 -0400
+
diff --git a/debian/control b/debian/control
index 3b4ff22..f365460 100644
--- a/debian/control
+++ b/debian/control
@@ -1,7 +1,8 @@
 Source: zlib
 Section: libs
 Priority: optional
-Maintainer: Mark Brown <broonie@debian.org>
+Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
+XSBC-Original-Maintainer: Mark Brown <broonie@debian.org>
 Standards-Version: 4.6.1
 Homepage: http://zlib.net/
 Build-Depends: debhelper (>= 13), gcc-multilib [amd64 i386 kfreebsd-amd64 mips mipsel powerpc ppc64 s390 sparc s390x mipsn32 mipsn32el mipsr6 mipsr6el mipsn32r6 mipsn32r6el mips64 mips64el mips64r6 mips64r6el x32] <!nobiarch>, dpkg-dev (>= 1.16.1), autoconf
@@ -119,6 +120,28 @@ Description: compression library - n32 - DO NOT USE EXCEPT FOR PACKAGING
 not need to build packages should use multiarch to install the relevant
 runtime.

+Package: libx32z1
+Architecture: amd64 i386
+Depends: ${shlibs:Depends}, ${misc:Depends}
+Description: compression library - x32 runtime
+ zlib is a library implementing the deflate compression method found
+ in gzip and PKZIP. This package includes a n32 version of the shared
+ library.
+
+Package: libx32z1-dev
+Section: libdevel
+Architecture: amd64 i386
+Depends: libx32z1 (= ${binary:Version}), zlib1g-dev (= ${binary:Version}), libc6-dev-x32, ${misc:Depends}
+Provides: libx32z-dev
+Description: compression library - x32 - DO NOT USE EXCEPT FOR PACKAGING
+ zlib is a library implementing the deflate compression method found
+ in gzip and PKZIP. This package includes the development support
+ files for building n32 applications.
+ .
+ This package should ONLY be used for building packages, users who do
+ not need to build packages should use multiarch to install the relevant
+ runtime.
+
 Package: minizip
 Section: utils
 Architecture: any
diff --git a/debian/libx32z1-dev.dirs b/debian/libx32z1-dev.dirs
new file mode 100644
index 0000000..5447591
--- /dev/null
+++ b/debian/libx32z1-dev.dirs
@@ -0,0 +1 @@
+usr/libx32
diff --git a/debian/libx32z1-dev.install b/debian/libx32z1-dev.install
new file mode 100644
index 0000000..a865054
--- /dev/null
+++ b/debian/libx32z1-dev.install
@@ -0,0 +1,2 @@
+usr/libx32/libz.a
+usr/libx32/libz.so
diff --git a/debian/libx32z1.dirs b/debian/libx32z1.dirs
new file mode 100644
index 0000000..5447591
--- /dev/null
+++ b/debian/libx32z1.dirs
@@ -0,0 +1 @@
+usr/libx32
diff --git a/debian/libx32z1.install b/debian/libx32z1.install
new file mode 100644
index 0000000..3ff82f2
--- /dev/null
+++ b/debian/libx32z1.install
@@ -0,0 +1 @@
+usr/libx32/libz.so.*
diff --git a/debian/libx32z1.symbols b/debian/libx32z1.symbols
new file mode 100644
index 0000000..a87cfdc
--- /dev/null
+++ b/debian/libx32z1.symbols
@@ -0,0 +1,3 @@
+libz.so.1 libx32z1 #MINVER#
+#include "zlib-core.symbols"
+#include "zlib-64.symbols"
diff --git a/debian/patches/power/add-optimized-crc32.patch b/debian/patches/power/add-optimized-crc32.patch
new file mode 100644
index 0000000..b057b57
--- /dev/null
+++ b/debian/patches/power/add-optimized-crc32.patch
@@ -0,0 +1,2539 @@
+From: Manjunath S Matti <mmatti@linux.ibm.com>
+Date: Thu, 14 Sep 2023 06:43:11 -0500
+Subject: Add Power8+ optimized crc32
+
+This commit adds an optimized version for the crc32 function based
+on crc32-vpmsum from https://github.com/antonblanchard/crc32-vpmsum/
+
+This is the C implementation created by Rogerio Alves
+<rogealve@br.ibm.com>
+
+It makes use of vector instructions to speed up CRC32 algorithm.
+
+Author: Rogerio Alves <rcardoso@linux.ibm.com>
+Signed-off-by: Manjunath Matti <mmatti@linux.ibm.com>
+
+Origin: i-iii/zlib,https://github.com/iii-i/zlib/commit/6879bc81b111247939b4924b08c5993fd0482b1a
+---
+ .gitignore | 29 +
+ CMakeLists.txt | 7 +-
+ Makefile.in | 43 +-
+ configure | 7 +-
+ contrib/README.contrib | 3 +-
+ contrib/power/clang_workaround.h | 82 +++
+ contrib/power/crc32_constants.h | 1206 ++++++++++++++++++++++++++++++++++++++
+ contrib/power/crc32_z_power8.c | 679 +++++++++++++++++++++
+ contrib/power/crc32_z_resolver.c | 15 +
+ contrib/power/power.h | 4 +
+ crc32.c | 12 +
+ test/crc32_test.c | 205 +++++++
+ 12 files changed, 2278 insertions(+), 14 deletions(-)
+ create mode 100644 .gitignore
+ create mode 100644 contrib/power/clang_workaround.h
+ create mode 100644 contrib/power/crc32_constants.h
+ create mode 100644 contrib/power/crc32_z_power8.c
+ create mode 100644 contrib/power/crc32_z_resolver.c
+ create mode 100644 test/crc32_test.c
+
+diff --git a/.gitignore b/.gitignore
+new file mode 100644
+index 0000000..e324531
+--- /dev/null
++++ b/.gitignore
+@@ -0,0 +1,29 @@
++*.diff
++*.patch
++*.orig
++*.rej
++
++*~
++*.a
++*.lo
++*.o
++*.dylib
++
++*.gcda
++*.gcno
++*.gcov
++
++/crc32_test
++/crc32_test64
++/crc32_testsh
++/example
++/example64
++/examplesh
++/libz.so*
++/minigzip
++/minigzip64
++/minigzipsh
++/zlib.pc
++/configure.log
++
++.DS_Store
+diff --git a/CMakeLists.txt b/CMakeLists.txt
+index 4456cd7..0464ba3 100644
+--- a/CMakeLists.txt
++++ b/CMakeLists.txt
+@@ -172,7 +172,8 @@ if(CMAKE_COMPILER_IS_GNUCC)
+
+ if(POWER8)
+ add_definitions(-DZ_POWER8)
+- set(ZLIB_POWER8 )
++ set(ZLIB_POWER8
++ contrib/power/crc32_z_power8.c)
+
+ set_source_files_properties(
+ ${ZLIB_POWER8}
+@@ -269,6 +270,10 @@ add_executable(example test/example.c)
+ target_link_libraries(example zlib)
+ add_test(example example)
+
++add_executable(crc32_test test/crc32_test.c)
++target_link_libraries(crc32_test zlib)
++add_test(crc32_test crc32_test)
++
+ add_executable(minigzip test/minigzip.c)
+ target_link_libraries(minigzip zlib)
+
+diff --git a/Makefile.in b/Makefile.in
+index 34d3cd7..2dbb20a 100644
+--- a/Makefile.in
++++ b/Makefile.in
+@@ -71,11 +71,11 @@ PIC_OBJS = $(PIC_OBJC) $(PIC_OBJA)
+
+ all: static shared
+
+-static: example$(EXE) minigzip$(EXE)
++static: crc32_test$(EXE) example$(EXE) minigzip$(EXE)
+
+-shared: examplesh$(EXE) minigzipsh$(EXE)
++shared: crc32_testsh$(EXE) examplesh$(EXE) minigzipsh$(EXE)
+
+-all64: example64$(EXE) minigzip64$(EXE)
++all64: crc32_test64$(EXE) example64$(EXE) minigzip64$(EXE)
+
+ check: test
+
+@@ -83,7 +83,7 @@ test: all teststatic testshared
+
+ teststatic: static
+ @TMPST=tmpst_$$; \
+- if echo hello world | ${QEMU_RUN} ./minigzip | ${QEMU_RUN} ./minigzip -d && ${QEMU_RUN} ./example $$TMPST ; then \
++ if echo hello world | ${QEMU_RUN} ./minigzip | ${QEMU_RUN} ./minigzip -d && ${QEMU_RUN} ./example $$TMPST && ${QEMU_RUN} ./crc32_test; then \
503 | + echo ' *** zlib test OK ***'; \ |
504 | + else \ |
505 | + echo ' *** zlib test FAILED ***'; false; \ |
506 | +@@ -96,7 +96,7 @@ testshared: shared |
507 | + DYLD_LIBRARY_PATH=`pwd`:$(DYLD_LIBRARY_PATH) ; export DYLD_LIBRARY_PATH; \ |
508 | + SHLIB_PATH=`pwd`:$(SHLIB_PATH) ; export SHLIB_PATH; \ |
509 | + TMPSH=tmpsh_$$; \ |
510 | +- if echo hello world | ${QEMU_RUN} ./minigzipsh | ${QEMU_RUN} ./minigzipsh -d && ${QEMU_RUN} ./examplesh $$TMPSH; then \ |
511 | ++ if echo hello world | ${QEMU_RUN} ./minigzipsh | ${QEMU_RUN} ./minigzipsh -d && ${QEMU_RUN} ./examplesh $$TMPSH && ${QEMU_RUN} ./crc32_testsh; then \ |
512 | + echo ' *** zlib shared test OK ***'; \ |
513 | + else \ |
514 | + echo ' *** zlib shared test FAILED ***'; false; \ |
515 | +@@ -105,7 +105,7 @@ testshared: shared |
516 | + |
517 | + test64: all64 |
518 | + @TMP64=tmp64_$$; \ |
519 | +- if echo hello world | ${QEMU_RUN} ./minigzip64 | ${QEMU_RUN} ./minigzip64 -d && ${QEMU_RUN} ./example64 $$TMP64; then \ |
520 | ++ if echo hello world | ${QEMU_RUN} ./minigzip64 | ${QEMU_RUN} ./minigzip64 -d && ${QEMU_RUN} ./example64 $$TMP64 && ${QEMU_RUN} ./crc32_test64; then \ |
521 | + echo ' *** zlib 64-bit test OK ***'; \ |
522 | + else \ |
523 | + echo ' *** zlib 64-bit test FAILED ***'; false; \ |
524 | +@@ -139,12 +139,18 @@ match.lo: match.S |
525 | + mv _match.o match.lo |
526 | + rm -f _match.s |
527 | + |
528 | ++crc32_test.o: $(SRCDIR)test/crc32_test.c $(SRCDIR)zlib.h zconf.h |
529 | ++ $(CC) $(CFLAGS) $(ZINCOUT) -c -o $@ $(SRCDIR)test/crc32_test.c |
530 | ++ |
531 | + example.o: $(SRCDIR)test/example.c $(SRCDIR)zlib.h zconf.h |
532 | + $(CC) $(CFLAGS) $(ZINCOUT) -c -o $@ $(SRCDIR)test/example.c |
533 | + |
534 | + minigzip.o: $(SRCDIR)test/minigzip.c $(SRCDIR)zlib.h zconf.h |
535 | + $(CC) $(CFLAGS) $(ZINCOUT) -c -o $@ $(SRCDIR)test/minigzip.c |
536 | + |
537 | ++crc32_test64.o: $(SRCDIR)test/crc32_test.c $(SRCDIR)zlib.h zconf.h |
538 | ++ $(CC) $(CFLAGS) $(ZINCOUT) -D_FILE_OFFSET_BITS=64 -c -o $@ $(SRCDIR)test/crc32_test.c |
539 | ++ |
540 | + example64.o: $(SRCDIR)test/example.c $(SRCDIR)zlib.h zconf.h |
541 | + $(CC) $(CFLAGS) $(ZINCOUT) -D_FILE_OFFSET_BITS=64 -c -o $@ $(SRCDIR)test/example.c |
542 | + |
543 | +@@ -158,6 +164,9 @@ adler32.o: $(SRCDIR)adler32.c |
544 | + crc32.o: $(SRCDIR)crc32.c |
545 | + $(CC) $(CFLAGS) $(ZINC) -c -o $@ $(SRCDIR)crc32.c |
546 | + |
547 | ++crc32_z_power8.o: $(SRCDIR)contrib/power/crc32_z_power8.c |
548 | ++ $(CC) $(CFLAGS) -mcpu=power8 $(ZINC) -c -o $@ $(SRCDIR)contrib/power/crc32_z_power8.c |
549 | ++ |
550 | + deflate.o: $(SRCDIR)deflate.c |
551 | + $(CC) $(CFLAGS) $(ZINC) -c -o $@ $(SRCDIR)deflate.c |
552 | + |
553 | +@@ -208,6 +217,11 @@ crc32.lo: $(SRCDIR)crc32.c |
554 | + $(CC) $(SFLAGS) $(ZINC) -DPIC -c -o objs/crc32.o $(SRCDIR)crc32.c |
555 | + -@mv objs/crc32.o $@ |
556 | + |
557 | ++crc32_z_power8.lo: $(SRCDIR)contrib/power/crc32_z_power8.c |
558 | ++ -@mkdir objs 2>/dev/null || test -d objs |
559 | ++ $(CC) $(SFLAGS) -mcpu=power8 $(ZINC) -DPIC -c -o objs/crc32_z_power8.o $(SRCDIR)contrib/power/crc32_z_power8.c |
560 | ++ -@mv objs/crc32_z_power8.o $@ |
561 | ++ |
562 | + deflate.lo: $(SRCDIR)deflate.c |
563 | + -@mkdir objs 2>/dev/null || test -d objs |
564 | + $(CC) $(SFLAGS) $(ZINC) -DPIC -c -o objs/deflate.o $(SRCDIR)deflate.c |
565 | +@@ -281,18 +295,27 @@ placebo $(SHAREDLIBV): $(PIC_OBJS) libz.a |
566 | + ln -s $@ $(SHAREDLIBM) |
567 | + -@rmdir objs |
568 | + |
569 | ++crc32_test$(EXE): crc32_test.o $(STATICLIB) |
570 | ++ $(CC) $(CFLAGS) -o $@ crc32_test.o $(TEST_LDFLAGS) |
571 | ++ |
572 | + example$(EXE): example.o $(STATICLIB) |
573 | + $(CC) $(CFLAGS) -o $@ example.o $(TEST_LDFLAGS) |
574 | + |
575 | + minigzip$(EXE): minigzip.o $(STATICLIB) |
576 | + $(CC) $(CFLAGS) -o $@ minigzip.o $(TEST_LDFLAGS) |
577 | + |
578 | ++crc32_testsh$(EXE): crc32_test.o $(SHAREDLIBV) |
579 | ++ $(CC) $(CFLAGS) -o $@ crc32_test.o -L. $(SHAREDLIBV) |
580 | ++ |
581 | + examplesh$(EXE): example.o $(SHAREDLIBV) |
582 | + $(CC) $(CFLAGS) -o $@ example.o $(LDFLAGS) -L. $(SHAREDLIBV) |
583 | + |
584 | + minigzipsh$(EXE): minigzip.o $(SHAREDLIBV) |
585 | + $(CC) $(CFLAGS) -o $@ minigzip.o $(LDFLAGS) -L. $(SHAREDLIBV) |
586 | + |
587 | ++crc32_test64$(EXE): crc32_test64.o $(STATICLIB) |
588 | ++ $(CC) $(CFLAGS) -o $@ crc32_test64.o $(TEST_LDFLAGS) |
589 | ++ |
590 | + example64$(EXE): example64.o $(STATICLIB) |
591 | + $(CC) $(CFLAGS) -o $@ example64.o $(TEST_LDFLAGS) |
592 | + |
593 | +@@ -368,8 +391,8 @@ minizip-clean: |
594 | + mostlyclean: clean |
595 | + clean: minizip-clean |
596 | + rm -f *.o *.lo *~ \ |
597 | +- example$(EXE) minigzip$(EXE) examplesh$(EXE) minigzipsh$(EXE) \ |
598 | +- example64$(EXE) minigzip64$(EXE) \ |
599 | ++ crc32_test$(EXE) example$(EXE) minigzip$(EXE) crc32_testsh$(EXE) examplesh$(EXE) minigzipsh$(EXE) \ |
600 | ++ crc32_test64$(EXE) example64$(EXE) minigzip64$(EXE) \ |
601 | + infcover \ |
602 | + libz.* foo.gz so_locations \ |
603 | + _match.s maketree contrib/infback9/*.o |
604 | +@@ -391,7 +414,7 @@ tags: |
605 | + |
606 | + adler32.o zutil.o: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h |
607 | + gzclose.o gzlib.o gzread.o gzwrite.o: $(SRCDIR)zlib.h zconf.h $(SRCDIR)gzguts.h |
608 | +-compress.o example.o minigzip.o uncompr.o: $(SRCDIR)zlib.h zconf.h |
609 | ++compress.o crc32_test.o example.o minigzip.o uncompr.o: $(SRCDIR)zlib.h zconf.h |
610 | + crc32.o: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)crc32.h |
611 | + deflate.o: $(SRCDIR)deflate.h $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h |
612 | + infback.o inflate.o: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)inftrees.h $(SRCDIR)inflate.h $(SRCDIR)inffast.h $(SRCDIR)inffixed.h |
613 | +@@ -401,7 +424,7 @@ trees.o: $(SRCDIR)deflate.h $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)tr |
614 | + |
615 | + adler32.lo zutil.lo: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h |
616 | + gzclose.lo gzlib.lo gzread.lo gzwrite.lo: $(SRCDIR)zlib.h zconf.h $(SRCDIR)gzguts.h |
617 | +-compress.lo example.lo minigzip.lo uncompr.lo: $(SRCDIR)zlib.h zconf.h |
618 | ++compress.lo crc32_test.lo example.lo minigzip.lo uncompr.lo: $(SRCDIR)zlib.h zconf.h |
619 | + crc32.lo: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)crc32.h |
620 | + deflate.lo: $(SRCDIR)deflate.h $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h |
621 | + infback.lo inflate.lo: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)inftrees.h $(SRCDIR)inflate.h $(SRCDIR)inffast.h $(SRCDIR)inffixed.h |
622 | +diff --git a/configure b/configure |
623 | +index e307a8d..b96ed4a 100755 |
624 | +--- a/configure |
625 | ++++ b/configure |
626 | +@@ -864,6 +864,9 @@ cat > $test.c <<EOF |
627 | + #ifndef _ARCH_PPC |
628 | + #error "Target is not Power" |
629 | + #endif |
630 | ++#if !(defined(__PPC64__) || defined(__powerpc64__)) |
631 | ++ #error "Target is not 64 bits" |
632 | ++#endif |
633 | + #ifndef HAVE_IFUNC |
634 | + #error "Target doesn't support ifunc" |
635 | + #endif |
636 | +@@ -877,8 +880,8 @@ if tryboth $CC -c $CFLAGS $test.c; then |
637 | + |
638 | + if tryboth $CC -c $CFLAGS -mcpu=power8 $test.c; then |
639 | + POWER8="-DZ_POWER8" |
640 | +- PIC_OBJC="${PIC_OBJC}" |
641 | +- OBJC="${OBJC}" |
642 | ++ PIC_OBJC="${PIC_OBJC} crc32_z_power8.lo" |
643 | ++ OBJC="${OBJC} crc32_z_power8.o" |
644 | + echo "Checking for -mcpu=power8 support... Yes." | tee -a configure.log |
645 | + else |
646 | + echo "Checking for -mcpu=power8 support... No." | tee -a configure.log |
647 | +diff --git a/contrib/README.contrib b/contrib/README.contrib |
648 | +index c57b520..90170df 100644 |
649 | +--- a/contrib/README.contrib |
650 | ++++ b/contrib/README.contrib |
651 | +@@ -46,7 +46,8 @@ minizip/ by Gilles Vollant <info@winimage.com> |
652 | + pascal/ by Bob Dellaca <bobdl@xtra.co.nz> et al. |
653 | + Support for Pascal |
654 | + |
655 | +-power/ by Matheus Castanho <msc@linux.ibm.com> |
656 | ++power/ by Daniel Black <daniel@linux.ibm.com> |
657 | ++ Matheus Castanho <msc@linux.ibm.com> |
658 | + and Rogerio Alves <rcardoso@linux.ibm.com> |
659 | + Optimized functions for Power processors |
660 | + |
661 | +diff --git a/contrib/power/clang_workaround.h b/contrib/power/clang_workaround.h |
662 | +new file mode 100644 |
663 | +index 0000000..b5e7dae |
664 | +--- /dev/null |
665 | ++++ b/contrib/power/clang_workaround.h |
666 | +@@ -0,0 +1,82 @@ |
667 | ++#ifndef CLANG_WORKAROUNDS_H |
668 | ++#define CLANG_WORKAROUNDS_H |
669 | ++ |
670 | ++/* |
671 | ++ * These stubs fix clang incompatibilities with GCC builtins. |
672 | ++ */ |
673 | ++ |
674 | ++#ifndef __builtin_crypto_vpmsumw |
675 | ++#define __builtin_crypto_vpmsumw __builtin_crypto_vpmsumb |
676 | ++#endif |
677 | ++#ifndef __builtin_crypto_vpmsumd |
678 | ++#define __builtin_crypto_vpmsumd __builtin_crypto_vpmsumb |
679 | ++#endif |
680 | ++ |
681 | ++static inline |
682 | ++__vector unsigned long long __attribute__((overloadable)) |
683 | ++vec_ld(int __a, const __vector unsigned long long* __b) |
684 | ++{ |
685 | ++ return (__vector unsigned long long)__builtin_altivec_lvx(__a, __b); |
686 | ++} |
687 | ++ |
688 | ++/* |
689 | ++ * GCC __builtin_pack_vector_int128 returns a vector __int128_t but Clang |
690 | ++ * does not recognize this type. On GCC this builtin is translated to a |
691 | ++ * xxpermdi instruction that only moves the registers __a, __b instead generates |
692 | ++ * a load. |
693 | ++ * |
694 | ++ * Clang has vec_xxpermdi intrinsics. It was implemented in 4.0.0. |
695 | ++ */ |
696 | ++static inline |
697 | ++__vector unsigned long long __builtin_pack_vector (unsigned long __a, |
698 | ++ unsigned long __b) |
699 | ++{ |
700 | ++ #if defined(__BIG_ENDIAN__) |
701 | ++ __vector unsigned long long __v = {__a, __b}; |
702 | ++ #else |
703 | ++ __vector unsigned long long __v = {__b, __a}; |
704 | ++ #endif |
705 | ++ return __v; |
706 | ++} |
707 | ++ |
708 | ++#ifndef vec_xxpermdi |
709 | ++ |
710 | ++static inline |
711 | ++unsigned long __builtin_unpack_vector (__vector unsigned long long __v, |
712 | ++ int __o) |
713 | ++{ |
714 | ++ return __v[__o]; |
715 | ++} |
716 | ++ |
717 | ++#if defined(__BIG_ENDIAN__) |
718 | ++#define __builtin_unpack_vector_0(a) __builtin_unpack_vector ((a), 0) |
719 | ++#define __builtin_unpack_vector_1(a) __builtin_unpack_vector ((a), 1) |
720 | ++#else |
721 | ++#define __builtin_unpack_vector_0(a) __builtin_unpack_vector ((a), 1) |
722 | ++#define __builtin_unpack_vector_1(a) __builtin_unpack_vector ((a), 0) |
723 | ++#endif |
724 | ++ |
725 | ++#else |
726 | ++ |
727 | ++static inline |
728 | ++unsigned long __builtin_unpack_vector_0 (__vector unsigned long long __v) |
729 | ++{ |
730 | ++ #if defined(__BIG_ENDIAN__) |
731 | ++ return vec_xxpermdi(__v, __v, 0x0)[1]; |
732 | ++ #else |
733 | ++ return vec_xxpermdi(__v, __v, 0x0)[0]; |
734 | ++ #endif |
735 | ++} |
736 | ++ |
737 | ++static inline |
738 | ++unsigned long __builtin_unpack_vector_1 (__vector unsigned long long __v) |
739 | ++{ |
740 | ++ #if defined(__BIG_ENDIAN__) |
741 | ++ return vec_xxpermdi(__v, __v, 0x3)[1]; |
742 | ++ #else |
743 | ++ return vec_xxpermdi(__v, __v, 0x3)[0]; |
744 | ++ #endif |
745 | ++} |
746 | ++#endif /* vec_xxpermdi */ |
747 | ++ |
748 | ++#endif |
749 | +diff --git a/contrib/power/crc32_constants.h b/contrib/power/crc32_constants.h |
750 | +new file mode 100644 |
751 | +index 0000000..3d01150 |
752 | +--- /dev/null |
753 | ++++ b/contrib/power/crc32_constants.h |
754 | +@@ -0,0 +1,1206 @@ |
755 | ++/* |
756 | ++* |
757 | ++* THIS FILE IS GENERATED WITH |
758 | ++./crc32_constants -c -r -x 0x04C11DB7 |
759 | ++ |
760 | ++* This is from https://github.com/antonblanchard/crc32-vpmsum/ |
761 | ++* DO NOT MODIFY IT MANUALLY! |
762 | ++* |
763 | ++*/ |
764 | ++ |
765 | ++#define CRC 0x4c11db7 |
766 | ++#define CRC_XOR |
767 | ++#define REFLECT |
768 | ++#define MAX_SIZE 32768 |
769 | ++ |
770 | ++#ifndef __ASSEMBLER__ |
771 | ++#ifdef CRC_TABLE |
772 | ++static const unsigned int crc_table[] = { |
773 | ++ 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, |
774 | ++ 0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3, |
775 | ++ 0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988, |
776 | ++ 0x09b64c2b, 0x7eb17cbd, 0xe7b82d07, 0x90bf1d91, |
777 | ++ 0x1db71064, 0x6ab020f2, 0xf3b97148, 0x84be41de, |
778 | ++ 0x1adad47d, 0x6ddde4eb, 0xf4d4b551, 0x83d385c7, |
779 | ++ 0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec, |
780 | ++ 0x14015c4f, 0x63066cd9, 0xfa0f3d63, 0x8d080df5, |
781 | ++ 0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172, |
782 | ++ 0x3c03e4d1, 0x4b04d447, 0xd20d85fd, 0xa50ab56b, |
783 | ++ 0x35b5a8fa, 0x42b2986c, 0xdbbbc9d6, 0xacbcf940, |
784 | ++ 0x32d86ce3, 0x45df5c75, 0xdcd60dcf, 0xabd13d59, |
785 | ++ 0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116, |
786 | ++ 0x21b4f4b5, 0x56b3c423, 0xcfba9599, 0xb8bda50f, |
787 | ++ 0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924, |
788 | ++ 0x2f6f7c87, 0x58684c11, 0xc1611dab, 0xb6662d3d, |
789 | ++ 0x76dc4190, 0x01db7106, 0x98d220bc, 0xefd5102a, |
790 | ++ 0x71b18589, 0x06b6b51f, 0x9fbfe4a5, 0xe8b8d433, |
791 | ++ 0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818, |
792 | ++ 0x7f6a0dbb, 0x086d3d2d, 0x91646c97, 0xe6635c01, |
793 | ++ 0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e, |
794 | ++ 0x6c0695ed, 0x1b01a57b, 0x8208f4c1, 0xf50fc457, |
795 | ++ 0x65b0d9c6, 0x12b7e950, 0x8bbeb8ea, 0xfcb9887c, |
796 | ++ 0x62dd1ddf, 0x15da2d49, 0x8cd37cf3, 0xfbd44c65, |
797 | ++ 0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2, |
798 | ++ 0x4adfa541, 0x3dd895d7, 0xa4d1c46d, 0xd3d6f4fb, |
799 | ++ 0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0, |
800 | ++ 0x44042d73, 0x33031de5, 0xaa0a4c5f, 0xdd0d7cc9, |
801 | ++ 0x5005713c, 0x270241aa, 0xbe0b1010, 0xc90c2086, |
802 | ++ 0x5768b525, 0x206f85b3, 0xb966d409, 0xce61e49f, |
803 | ++ 0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, |
804 | ++ 0x59b33d17, 0x2eb40d81, 0xb7bd5c3b, 0xc0ba6cad, |
805 | ++ 0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a, |
806 | ++ 0xead54739, 0x9dd277af, 0x04db2615, 0x73dc1683, |
807 | ++ 0xe3630b12, 0x94643b84, 0x0d6d6a3e, 0x7a6a5aa8, |
808 | ++ 0xe40ecf0b, 0x9309ff9d, 0x0a00ae27, 0x7d079eb1, |
809 | ++ 0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, |
810 | ++ 0xf762575d, 0x806567cb, 0x196c3671, 0x6e6b06e7, |
811 | ++ 0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc, |
812 | ++ 0xf9b9df6f, 0x8ebeeff9, 0x17b7be43, 0x60b08ed5, |
813 | ++ 0xd6d6a3e8, 0xa1d1937e, 0x38d8c2c4, 0x4fdff252, |
814 | ++ 0xd1bb67f1, 0xa6bc5767, 0x3fb506dd, 0x48b2364b, |
815 | ++ 0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, |
816 | ++ 0xdf60efc3, 0xa867df55, 0x316e8eef, 0x4669be79, |
817 | ++ 0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236, |
818 | ++ 0xcc0c7795, 0xbb0b4703, 0x220216b9, 0x5505262f, |
819 | ++ 0xc5ba3bbe, 0xb2bd0b28, 0x2bb45a92, 0x5cb36a04, |
820 | ++ 0xc2d7ffa7, 0xb5d0cf31, 0x2cd99e8b, 0x5bdeae1d, |
821 | ++ 0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, |
822 | ++ 0x9c0906a9, 0xeb0e363f, 0x72076785, 0x05005713, |
823 | ++ 0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38, |
824 | ++ 0x92d28e9b, 0xe5d5be0d, 0x7cdcefb7, 0x0bdbdf21, |
825 | ++ 0x86d3d2d4, 0xf1d4e242, 0x68ddb3f8, 0x1fda836e, |
826 | ++ 0x81be16cd, 0xf6b9265b, 0x6fb077e1, 0x18b74777, |
827 | ++ 0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, |
828 | ++ 0x8f659eff, 0xf862ae69, 0x616bffd3, 0x166ccf45, |
829 | ++ 0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2, |
830 | ++ 0xa7672661, 0xd06016f7, 0x4969474d, 0x3e6e77db, |
831 | ++ 0xaed16a4a, 0xd9d65adc, 0x40df0b66, 0x37d83bf0, |
832 | ++ 0xa9bcae53, 0xdebb9ec5, 0x47b2cf7f, 0x30b5ffe9, |
833 | ++ 0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6, |
834 | ++ 0xbad03605, 0xcdd70693, 0x54de5729, 0x23d967bf, |
835 | ++ 0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94, |
836 | ++ 0xb40bbe37, 0xc30c8ea1, 0x5a05df1b, 0x2d02ef8d,}; |
837 | ++ |
838 | ++#endif /* CRC_TABLE */ |
839 | ++#ifdef POWER8_INTRINSICS |
840 | ++ |
841 | ++/* Constants */ |
842 | ++ |
843 | ++/* Reduce 262144 kbits to 1024 bits */ |
844 | ++static const __vector unsigned long long vcrc_const[255] |
845 | ++ __attribute__((aligned (16))) = { |
846 | ++#ifdef __LITTLE_ENDIAN__ |
847 | ++ /* x^261120 mod p(x)` << 1, x^261184 mod p(x)` << 1 */ |
848 | ++ { 0x0000000099ea94a8, 0x00000001651797d2 }, |
849 | ++ /* x^260096 mod p(x)` << 1, x^260160 mod p(x)` << 1 */ |
850 | ++ { 0x00000000945a8420, 0x0000000021e0d56c }, |
851 | ++ /* x^259072 mod p(x)` << 1, x^259136 mod p(x)` << 1 */ |
852 | ++ { 0x0000000030762706, 0x000000000f95ecaa }, |
853 | ++ /* x^258048 mod p(x)` << 1, x^258112 mod p(x)` << 1 */ |
854 | ++ { 0x00000001a52fc582, 0x00000001ebd224ac }, |
855 | ++ /* x^257024 mod p(x)` << 1, x^257088 mod p(x)` << 1 */ |
856 | ++ { 0x00000001a4a7167a, 0x000000000ccb97ca }, |
857 | ++ /* x^256000 mod p(x)` << 1, x^256064 mod p(x)` << 1 */ |
858 | ++ { 0x000000000c18249a, 0x00000001006ec8a8 }, |
859 | ++ /* x^254976 mod p(x)` << 1, x^255040 mod p(x)` << 1 */ |
860 | ++ { 0x00000000a924ae7c, 0x000000014f58f196 }, |
861 | ++ /* x^253952 mod p(x)` << 1, x^254016 mod p(x)` << 1 */ |
862 | ++ { 0x00000001e12ccc12, 0x00000001a7192ca6 }, |
863 | ++ /* x^252928 mod p(x)` << 1, x^252992 mod p(x)` << 1 */ |
864 | ++ { 0x00000000a0b9d4ac, 0x000000019a64bab2 }, |
865 | ++ /* x^251904 mod p(x)` << 1, x^251968 mod p(x)` << 1 */ |
866 | ++ { 0x0000000095e8ddfe, 0x0000000014f4ed2e }, |
867 | ++ /* x^250880 mod p(x)` << 1, x^250944 mod p(x)` << 1 */ |
868 | ++ { 0x00000000233fddc4, 0x000000011092b6a2 }, |
869 | ++ /* x^249856 mod p(x)` << 1, x^249920 mod p(x)` << 1 */ |
870 | ++ { 0x00000001b4529b62, 0x00000000c8a1629c }, |
871 | ++ /* x^248832 mod p(x)` << 1, x^248896 mod p(x)` << 1 */ |
872 | ++ { 0x00000001a7fa0e64, 0x000000017bf32e8e }, |
873 | ++ /* x^247808 mod p(x)` << 1, x^247872 mod p(x)` << 1 */ |
874 | ++ { 0x00000001b5334592, 0x00000001f8cc6582 }, |
875 | ++ /* x^246784 mod p(x)` << 1, x^246848 mod p(x)` << 1 */ |
876 | ++ { 0x000000011f8ee1b4, 0x000000008631ddf0 }, |
877 | ++ /* x^245760 mod p(x)` << 1, x^245824 mod p(x)` << 1 */ |
878 | ++ { 0x000000006252e632, 0x000000007e5a76d0 }, |
879 | ++ /* x^244736 mod p(x)` << 1, x^244800 mod p(x)` << 1 */ |
880 | ++ { 0x00000000ab973e84, 0x000000002b09b31c }, |
881 | ++ /* x^243712 mod p(x)` << 1, x^243776 mod p(x)` << 1 */ |
882 | ++ { 0x000000007734f5ec, 0x00000001b2df1f84 }, |
883 | ++ /* x^242688 mod p(x)` << 1, x^242752 mod p(x)` << 1 */ |
884 | ++ { 0x000000007c547798, 0x00000001d6f56afc }, |
885 | ++ /* x^241664 mod p(x)` << 1, x^241728 mod p(x)` << 1 */ |
886 | ++ { 0x000000007ec40210, 0x00000001b9b5e70c }, |
887 | ++ /* x^240640 mod p(x)` << 1, x^240704 mod p(x)` << 1 */ |
888 | ++ { 0x00000001ab1695a8, 0x0000000034b626d2 }, |
889 | ++ /* x^239616 mod p(x)` << 1, x^239680 mod p(x)` << 1 */ |
890 | ++ { 0x0000000090494bba, 0x000000014c53479a }, |
891 | ++ /* x^238592 mod p(x)` << 1, x^238656 mod p(x)` << 1 */ |
892 | ++ { 0x00000001123fb816, 0x00000001a6d179a4 }, |
893 | ++ /* x^237568 mod p(x)` << 1, x^237632 mod p(x)` << 1 */ |
894 | ++ { 0x00000001e188c74c, 0x000000015abd16b4 }, |
895 | ++ /* x^236544 mod p(x)` << 1, x^236608 mod p(x)` << 1 */ |
896 | ++ { 0x00000001c2d3451c, 0x00000000018f9852 }, |
897 | ++ /* x^235520 mod p(x)` << 1, x^235584 mod p(x)` << 1 */ |
898 | ++ { 0x00000000f55cf1ca, 0x000000001fb3084a }, |
899 | ++ /* x^234496 mod p(x)` << 1, x^234560 mod p(x)` << 1 */ |
900 | ++ { 0x00000001a0531540, 0x00000000c53dfb04 }, |
901 | ++ /* x^233472 mod p(x)` << 1, x^233536 mod p(x)` << 1 */ |
902 | ++ { 0x0000000132cd7ebc, 0x00000000e10c9ad6 }, |
903 | ++ /* x^232448 mod p(x)` << 1, x^232512 mod p(x)` << 1 */ |
904 | ++ { 0x0000000073ab7f36, 0x0000000025aa994a }, |
905 | ++ /* x^231424 mod p(x)` << 1, x^231488 mod p(x)` << 1 */ |
906 | ++ { 0x0000000041aed1c2, 0x00000000fa3a74c4 }, |
907 | ++ /* x^230400 mod p(x)` << 1, x^230464 mod p(x)` << 1 */ |
908 | ++ { 0x0000000136c53800, 0x0000000033eb3f40 }, |
909 | ++ /* x^229376 mod p(x)` << 1, x^229440 mod p(x)` << 1 */ |
910 | ++ { 0x0000000126835a30, 0x000000017193f296 }, |
911 | ++ /* x^228352 mod p(x)` << 1, x^228416 mod p(x)` << 1 */ |
912 | ++ { 0x000000006241b502, 0x0000000043f6c86a }, |
913 | ++ /* x^227328 mod p(x)` << 1, x^227392 mod p(x)` << 1 */ |
914 | ++ { 0x00000000d5196ad4, 0x000000016b513ec6 }, |
915 | ++ /* x^226304 mod p(x)` << 1, x^226368 mod p(x)` << 1 */ |
916 | ++ { 0x000000009cfa769a, 0x00000000c8f25b4e }, |
917 | ++ /* x^225280 mod p(x)` << 1, x^225344 mod p(x)` << 1 */ |
918 | ++ { 0x00000000920e5df4, 0x00000001a45048ec }, |
919 | ++ /* x^224256 mod p(x)` << 1, x^224320 mod p(x)` << 1 */ |
920 | ++ { 0x0000000169dc310e, 0x000000000c441004 }, |
921 | ++ /* x^223232 mod p(x)` << 1, x^223296 mod p(x)` << 1 */ |
922 | ++ { 0x0000000009fc331c, 0x000000000e17cad6 }, |
923 | ++ /* x^222208 mod p(x)` << 1, x^222272 mod p(x)` << 1 */ |
924 | ++ { 0x000000010d94a81e, 0x00000001253ae964 }, |
925 | ++ /* x^221184 mod p(x)` << 1, x^221248 mod p(x)` << 1 */ |
926 | ++ { 0x0000000027a20ab2, 0x00000001d7c88ebc }, |
927 | ++ /* x^220160 mod p(x)` << 1, x^220224 mod p(x)` << 1 */ |
928 | ++ { 0x0000000114f87504, 0x00000001e7ca913a }, |
929 | ++ /* x^219136 mod p(x)` << 1, x^219200 mod p(x)` << 1 */ |
930 | ++ { 0x000000004b076d96, 0x0000000033ed078a }, |
931 | ++ /* x^218112 mod p(x)` << 1, x^218176 mod p(x)` << 1 */ |
932 | ++ { 0x00000000da4d1e74, 0x00000000e1839c78 }, |
933 | ++ /* x^217088 mod p(x)` << 1, x^217152 mod p(x)` << 1 */ |
934 | ++ { 0x000000001b81f672, 0x00000001322b267e }, |
935 | ++ /* x^216064 mod p(x)` << 1, x^216128 mod p(x)` << 1 */ |
936 | ++ { 0x000000009367c988, 0x00000000638231b6 }, |
937 | ++ /* x^215040 mod p(x)` << 1, x^215104 mod p(x)` << 1 */ |
938 | ++ { 0x00000001717214ca, 0x00000001ee7f16f4 }, |
939 | ++ /* x^214016 mod p(x)` << 1, x^214080 mod p(x)` << 1 */ |
940 | ++ { 0x000000009f47d820, 0x0000000117d9924a }, |
941 | ++ /* x^212992 mod p(x)` << 1, x^213056 mod p(x)` << 1 */ |
942 | ++ { 0x000000010d9a47d2, 0x00000000e1a9e0c4 }, |
943 | ++ /* x^211968 mod p(x)` << 1, x^212032 mod p(x)` << 1 */ |
944 | ++ { 0x00000000a696c58c, 0x00000001403731dc }, |
945 | ++ /* x^210944 mod p(x)` << 1, x^211008 mod p(x)` << 1 */ |
946 | ++ { 0x000000002aa28ec6, 0x00000001a5ea9682 }, |
947 | ++ /* x^209920 mod p(x)` << 1, x^209984 mod p(x)` << 1 */ |
948 | ++ { 0x00000001fe18fd9a, 0x0000000101c5c578 }, |
949 | ++ /* x^208896 mod p(x)` << 1, x^208960 mod p(x)` << 1 */ |
950 | ++ { 0x000000019d4fc1ae, 0x00000000dddf6494 }, |
951 | ++ /* x^207872 mod p(x)` << 1, x^207936 mod p(x)` << 1 */ |
952 | ++ { 0x00000001ba0e3dea, 0x00000000f1c3db28 }, |
953 | ++ /* x^206848 mod p(x)` << 1, x^206912 mod p(x)` << 1 */ |
954 | ++ { 0x0000000074b59a5e, 0x000000013112fb9c }, |
955 | ++ /* x^205824 mod p(x)` << 1, x^205888 mod p(x)` << 1 */ |
956 | ++ { 0x00000000f2b5ea98, 0x00000000b680b906 }, |
957 | ++ /* x^204800 mod p(x)` << 1, x^204864 mod p(x)` << 1 */ |
958 | ++ { 0x0000000187132676, 0x000000001a282932 }, |
959 | ++ /* x^203776 mod p(x)` << 1, x^203840 mod p(x)` << 1 */ |
960 | ++ { 0x000000010a8c6ad4, 0x0000000089406e7e }, |
961 | ++ /* x^202752 mod p(x)` << 1, x^202816 mod p(x)` << 1 */ |
962 | ++ { 0x00000001e21dfe70, 0x00000001def6be8c }, |
963 | ++ /* x^201728 mod p(x)` << 1, x^201792 mod p(x)` << 1 */ |
964 | ++ { 0x00000001da0050e4, 0x0000000075258728 }, |
965 | ++ /* x^200704 mod p(x)` << 1, x^200768 mod p(x)` << 1 */ |
966 | ++ { 0x00000000772172ae, 0x000000019536090a }, |
967 | ++ /* x^199680 mod p(x)` << 1, x^199744 mod p(x)` << 1 */ |
968 | ++ { 0x00000000e47724aa, 0x00000000f2455bfc }, |
969 | ++ /* x^198656 mod p(x)` << 1, x^198720 mod p(x)` << 1 */ |
970 | ++ { 0x000000003cd63ac4, 0x000000018c40baf4 }, |
971 | ++ /* x^197632 mod p(x)` << 1, x^197696 mod p(x)` << 1 */ |
972 | ++ { 0x00000001bf47d352, 0x000000004cd390d4 }, |
973 | ++ /* x^196608 mod p(x)` << 1, x^196672 mod p(x)` << 1 */ |
974 | ++ { 0x000000018dc1d708, 0x00000001e4ece95a }, |
975 | ++ /* x^195584 mod p(x)` << 1, x^195648 mod p(x)` << 1 */ |
976 | ++ { 0x000000002d4620a4, 0x000000001a3ee918 }, |
977 | ++ /* x^194560 mod p(x)` << 1, x^194624 mod p(x)` << 1 */ |
978 | ++ { 0x0000000058fd1740, 0x000000007c652fb8 }, |
979 | ++ /* x^193536 mod p(x)` << 1, x^193600 mod p(x)` << 1 */ |
980 | ++ { 0x00000000dadd9bfc, 0x000000011c67842c }, |
981 | ++ /* x^192512 mod p(x)` << 1, x^192576 mod p(x)` << 1 */ |
982 | ++ { 0x00000001ea2140be, 0x00000000254f759c }, |
983 | ++ /* x^191488 mod p(x)` << 1, x^191552 mod p(x)` << 1 */ |
984 | ++ { 0x000000009de128ba, 0x000000007ece94ca }, |
985 | ++ /* x^190464 mod p(x)` << 1, x^190528 mod p(x)` << 1 */ |
986 | ++ { 0x000000013ac3aa8e, 0x0000000038f258c2 }, |
987 | ++ /* x^189440 mod p(x)` << 1, x^189504 mod p(x)` << 1 */ |
988 | ++ { 0x0000000099980562, 0x00000001cdf17b00 }, |
989 | ++ /* x^188416 mod p(x)` << 1, x^188480 mod p(x)` << 1 */ |
990 | ++ { 0x00000001c1579c86, 0x000000011f882c16 }, |
991 | ++ /* x^187392 mod p(x)` << 1, x^187456 mod p(x)` << 1 */ |
992 | ++ { 0x0000000068dbbf94, 0x0000000100093fc8 }, |
993 | ++ /* x^186368 mod p(x)` << 1, x^186432 mod p(x)` << 1 */ |
994 | ++ { 0x000000004509fb04, 0x00000001cd684f16 }, |
995 | ++ /* x^185344 mod p(x)` << 1, x^185408 mod p(x)` << 1 */ |
996 | ++ { 0x00000001202f6398, 0x000000004bc6a70a }, |
997 | ++ /* x^184320 mod p(x)` << 1, x^184384 mod p(x)` << 1 */ |
998 | ++ { 0x000000013aea243e, 0x000000004fc7e8e4 }, |
999 | ++ /* x^183296 mod p(x)` << 1, x^183360 mod p(x)` << 1 */ |
1000 | ++ { 0x00000001b4052ae6, 0x0000000130103f1c }, |
1001 | ++ /* x^182272 mod p(x)` << 1, x^182336 mod p(x)` << 1 */ |
1002 | ++ { 0x00000001cd2a0ae8, 0x0000000111b0024c }, |
1003 | ++ /* x^181248 mod p(x)` << 1, x^181312 mod p(x)` << 1 */ |
1004 | ++ { 0x00000001fe4aa8b4, 0x000000010b3079da }, |
1005 | ++ /* x^180224 mod p(x)` << 1, x^180288 mod p(x)` << 1 */ |
1006 | ++ { 0x00000001d1559a42, 0x000000010192bcc2 }, |
1007 | ++ /* x^179200 mod p(x)` << 1, x^179264 mod p(x)` << 1 */ |
1008 | ++ { 0x00000001f3e05ecc, 0x0000000074838d50 }, |
1009 | ++ /* x^178176 mod p(x)` << 1, x^178240 mod p(x)` << 1 */ |
1010 | ++ { 0x0000000104ddd2cc, 0x000000001b20f520 }, |
1011 | ++ /* x^177152 mod p(x)` << 1, x^177216 mod p(x)` << 1 */ |
1012 | ++ { 0x000000015393153c, 0x0000000050c3590a }, |
1013 | ++ /* x^176128 mod p(x)` << 1, x^176192 mod p(x)` << 1 */ |
1014 | ++ { 0x0000000057e942c6, 0x00000000b41cac8e }, |
1015 | ++ /* x^175104 mod p(x)` << 1, x^175168 mod p(x)` << 1 */ |
1016 | ++ { 0x000000012c633850, 0x000000000c72cc78 }, |
1017 | ++ /* x^174080 mod p(x)` << 1, x^174144 mod p(x)` << 1 */ |
1018 | ++ { 0x00000000ebcaae4c, 0x0000000030cdb032 }, |
1019 | ++ /* x^173056 mod p(x)` << 1, x^173120 mod p(x)` << 1 */ |
1020 | ++ { 0x000000013ee532a6, 0x000000013e09fc32 }, |
1021 | ++ /* x^172032 mod p(x)` << 1, x^172096 mod p(x)` << 1 */ |
1022 | ++ { 0x00000001bf0cbc7e, 0x000000001ed624d2 }, |
1023 | ++ /* x^171008 mod p(x)` << 1, x^171072 mod p(x)` << 1 */ |
1024 | ++ { 0x00000000d50b7a5a, 0x00000000781aee1a }, |
1025 | ++ /* x^169984 mod p(x)` << 1, x^170048 mod p(x)` << 1 */ |
1026 | ++ { 0x0000000002fca6e8, 0x00000001c4d8348c }, |
1027 | ++ /* x^168960 mod p(x)` << 1, x^169024 mod p(x)` << 1 */ |
1028 | ++ { 0x000000007af40044, 0x0000000057a40336 }, |
1029 | ++ /* x^167936 mod p(x)` << 1, x^168000 mod p(x)` << 1 */ |
1030 | ++ { 0x0000000016178744, 0x0000000085544940 }, |
1031 | ++ /* x^166912 mod p(x)` << 1, x^166976 mod p(x)` << 1 */ |
1032 | ++ { 0x000000014c177458, 0x000000019cd21e80 }, |
1033 | ++ /* x^165888 mod p(x)` << 1, x^165952 mod p(x)` << 1 */ |
1034 | ++ { 0x000000011b6ddf04, 0x000000013eb95bc0 }, |
1035 | ++ /* x^164864 mod p(x)` << 1, x^164928 mod p(x)` << 1 */ |
1036 | ++ { 0x00000001f3e29ccc, 0x00000001dfc9fdfc }, |
1037 | ++ /* x^163840 mod p(x)` << 1, x^163904 mod p(x)` << 1 */ |
1038 | ++ { 0x0000000135ae7562, 0x00000000cd028bc2 }, |
1039 | ++ /* x^162816 mod p(x)` << 1, x^162880 mod p(x)` << 1 */ |
1040 | ++ { 0x0000000190ef812c, 0x0000000090db8c44 }, |
1041 | ++ /* x^161792 mod p(x)` << 1, x^161856 mod p(x)` << 1 */ |
1042 | ++ { 0x0000000067a2c786, 0x000000010010a4ce }, |
1043 | ++ /* x^160768 mod p(x)` << 1, x^160832 mod p(x)` << 1 */ |
1044 | ++ { 0x0000000048b9496c, 0x00000001c8f4c72c }, |
1045 | ++ /* x^159744 mod p(x)` << 1, x^159808 mod p(x)` << 1 */ |
1046 | ++ { 0x000000015a422de6, 0x000000001c26170c }, |
1047 | ++ /* x^158720 mod p(x)` << 1, x^158784 mod p(x)` << 1 */ |
1048 | ++ { 0x00000001ef0e3640, 0x00000000e3fccf68 }, |
1049 | ++ /* x^157696 mod p(x)` << 1, x^157760 mod p(x)` << 1 */ |
1050 | ++ { 0x00000001006d2d26, 0x00000000d513ed24 }, |
1051 | ++ /* x^156672 mod p(x)` << 1, x^156736 mod p(x)` << 1 */ |
1052 | ++ { 0x00000001170d56d6, 0x00000000141beada }, |
1053 | ++ /* x^155648 mod p(x)` << 1, x^155712 mod p(x)` << 1 */ |
1054 | ++ { 0x00000000a5fb613c, 0x000000011071aea0 }, |
1055 | ++ /* x^154624 mod p(x)` << 1, x^154688 mod p(x)` << 1 */ |
1056 | ++ { 0x0000000040bbf7fc, 0x000000012e19080a }, |
1057 | ++ /* x^153600 mod p(x)` << 1, x^153664 mod p(x)` << 1 */ |
1058 | ++ { 0x000000016ac3a5b2, 0x0000000100ecf826 }, |
1059 | ++ /* x^152576 mod p(x)` << 1, x^152640 mod p(x)` << 1 */ |
1060 | ++ { 0x00000000abf16230, 0x0000000069b09412 }, |
1061 | ++ /* x^151552 mod p(x)` << 1, x^151616 mod p(x)` << 1 */ |
1062 | ++ { 0x00000001ebe23fac, 0x0000000122297bac }, |
1063 | ++ /* x^150528 mod p(x)` << 1, x^150592 mod p(x)` << 1 */ |
1064 | ++ { 0x000000008b6a0894, 0x00000000e9e4b068 }, |
1065 | ++ /* x^149504 mod p(x)` << 1, x^149568 mod p(x)` << 1 */ |
1066 | ++ { 0x00000001288ea478, 0x000000004b38651a }, |
1067 | ++ /* x^148480 mod p(x)` << 1, x^148544 mod p(x)` << 1 */ |
1068 | ++ { 0x000000016619c442, 0x00000001468360e2 }, |
1069 | ++ /* x^147456 mod p(x)` << 1, x^147520 mod p(x)` << 1 */ |
1070 | ++ { 0x0000000086230038, 0x00000000121c2408 }, |
1071 | ++ /* x^146432 mod p(x)` << 1, x^146496 mod p(x)` << 1 */ |
1072 | ++ { 0x000000017746a756, 0x00000000da7e7d08 }, |
1073 | ++ /* x^145408 mod p(x)` << 1, x^145472 mod p(x)` << 1 */ |
1074 | ++ { 0x0000000191b8f8f8, 0x00000001058d7652 }, |
1075 | ++ /* x^144384 mod p(x)` << 1, x^144448 mod p(x)` << 1 */ |
1076 | ++ { 0x000000008e167708, 0x000000014a098a90 }, |
1077 | ++ /* x^143360 mod p(x)` << 1, x^143424 mod p(x)` << 1 */ |
1078 | ++ { 0x0000000148b22d54, 0x0000000020dbe72e }, |
1079 | ++ /* x^142336 mod p(x)` << 1, x^142400 mod p(x)` << 1 */ |
1080 | ++ { 0x0000000044ba2c3c, 0x000000011e7323e8 }, |
1081 | ++ /* x^141312 mod p(x)` << 1, x^141376 mod p(x)` << 1 */ |
1082 | ++ { 0x00000000b54d2b52, 0x00000000d5d4bf94 }, |
1083 | ++ /* x^140288 mod p(x)` << 1, x^140352 mod p(x)` << 1 */ |
1084 | ++ { 0x0000000005a4fd8a, 0x0000000199d8746c }, |
1085 | ++ /* x^139264 mod p(x)` << 1, x^139328 mod p(x)` << 1 */ |
1086 | ++ { 0x0000000139f9fc46, 0x00000000ce9ca8a0 }, |
1087 | ++ /* x^138240 mod p(x)` << 1, x^138304 mod p(x)` << 1 */ |
1088 | ++ { 0x000000015a1fa824, 0x00000000136edece }, |
1089 | ++ /* x^137216 mod p(x)` << 1, x^137280 mod p(x)` << 1 */ |
1090 | ++ { 0x000000000a61ae4c, 0x000000019b92a068 }, |
1091 | ++ /* x^136192 mod p(x)` << 1, x^136256 mod p(x)` << 1 */ |
1092 | ++ { 0x0000000145e9113e, 0x0000000071d62206 }, |
1093 | ++ /* x^135168 mod p(x)` << 1, x^135232 mod p(x)` << 1 */ |
1094 | ++ { 0x000000006a348448, 0x00000000dfc50158 }, |
1095 | ++ /* x^134144 mod p(x)` << 1, x^134208 mod p(x)` << 1 */ |
1096 | ++ { 0x000000004d80a08c, 0x00000001517626bc }, |
1097 | ++ /* x^133120 mod p(x)` << 1, x^133184 mod p(x)` << 1 */ |
1098 | ++ { 0x000000014b6837a0, 0x0000000148d1e4fa }, |
1099 | ++ /* x^132096 mod p(x)` << 1, x^132160 mod p(x)` << 1 */ |
1100 | ++ { 0x000000016896a7fc, 0x0000000094d8266e }, |
1101 | ++ /* x^131072 mod p(x)` << 1, x^131136 mod p(x)` << 1 */ |
1102 | ++ { 0x000000014f187140, 0x00000000606c5e34 }, |
1103 | ++ /* x^130048 mod p(x)` << 1, x^130112 mod p(x)` << 1 */ |
1104 | ++ { 0x000000019581b9da, 0x000000019766beaa }, |
1105 | ++ /* x^129024 mod p(x)` << 1, x^129088 mod p(x)` << 1 */ |
1106 | ++ { 0x00000001091bc984, 0x00000001d80c506c }, |
1107 | ++ /* x^128000 mod p(x)` << 1, x^128064 mod p(x)` << 1 */ |
1108 | ++ { 0x000000001067223c, 0x000000001e73837c }, |
1109 | ++ /* x^126976 mod p(x)` << 1, x^127040 mod p(x)` << 1 */ |
1110 | ++ { 0x00000001ab16ea02, 0x0000000064d587de }, |
1111 | ++ /* x^125952 mod p(x)` << 1, x^126016 mod p(x)` << 1 */ |
1112 | ++ { 0x000000013c4598a8, 0x00000000f4a507b0 }, |
1113 | ++ /* x^124928 mod p(x)` << 1, x^124992 mod p(x)` << 1 */ |
1114 | ++ { 0x00000000b3735430, 0x0000000040e342fc }, |
1115 | ++ /* x^123904 mod p(x)` << 1, x^123968 mod p(x)` << 1 */ |
1116 | ++ { 0x00000001bb3fc0c0, 0x00000001d5ad9c3a }, |
1117 | ++ /* x^122880 mod p(x)` << 1, x^122944 mod p(x)` << 1 */ |
1118 | ++ { 0x00000001570ae19c, 0x0000000094a691a4 }, |
1119 | ++ /* x^121856 mod p(x)` << 1, x^121920 mod p(x)` << 1 */ |
1120 | ++ { 0x00000001ea910712, 0x00000001271ecdfa }, |
1121 | ++ /* x^120832 mod p(x)` << 1, x^120896 mod p(x)` << 1 */ |
1122 | ++ { 0x0000000167127128, 0x000000009e54475a }, |
1123 | ++ /* x^119808 mod p(x)` << 1, x^119872 mod p(x)` << 1 */ |
1124 | ++ { 0x0000000019e790a2, 0x00000000c9c099ee }, |
1125 | ++ /* x^118784 mod p(x)` << 1, x^118848 mod p(x)` << 1 */ |
1126 | ++ { 0x000000003788f710, 0x000000009a2f736c }, |
1127 | ++ /* x^117760 mod p(x)` << 1, x^117824 mod p(x)` << 1 */ |
1128 | ++ { 0x00000001682a160e, 0x00000000bb9f4996 }, |
1129 | ++ /* x^116736 mod p(x)` << 1, x^116800 mod p(x)` << 1 */ |
1130 | ++ { 0x000000007f0ebd2e, 0x00000001db688050 }, |
1131 | ++ /* x^115712 mod p(x)` << 1, x^115776 mod p(x)` << 1 */ |
1132 | ++ { 0x000000002b032080, 0x00000000e9b10af4 }, |
1133 | ++ /* x^114688 mod p(x)` << 1, x^114752 mod p(x)` << 1 */ |
1134 | ++ { 0x00000000cfd1664a, 0x000000012d4545e4 }, |
1135 | ++ /* x^113664 mod p(x)` << 1, x^113728 mod p(x)` << 1 */ |
1136 | ++ { 0x00000000aa1181c2, 0x000000000361139c }, |
1137 | ++ /* x^112640 mod p(x)` << 1, x^112704 mod p(x)` << 1 */ |
1138 | ++ { 0x00000000ddd08002, 0x00000001a5a1a3a8 }, |
1139 | ++ /* x^111616 mod p(x)` << 1, x^111680 mod p(x)` << 1 */ |
1140 | ++ { 0x00000000e8dd0446, 0x000000006844e0b0 }, |
1141 | ++ /* x^110592 mod p(x)` << 1, x^110656 mod p(x)` << 1 */ |
1142 | ++ { 0x00000001bbd94a00, 0x00000000c3762f28 }, |
1143 | ++ /* x^109568 mod p(x)` << 1, x^109632 mod p(x)` << 1 */ |
1144 | ++ { 0x00000000ab6cd180, 0x00000001d26287a2 }, |
1145 | ++ /* x^108544 mod p(x)` << 1, x^108608 mod p(x)` << 1 */ |
1146 | ++ { 0x0000000031803ce2, 0x00000001f6f0bba8 }, |
1147 | ++ /* x^107520 mod p(x)` << 1, x^107584 mod p(x)` << 1 */ |
1148 | ++ { 0x0000000024f40b0c, 0x000000002ffabd62 }, |
1149 | ++ /* x^106496 mod p(x)` << 1, x^106560 mod p(x)` << 1 */ |
1150 | ++ { 0x00000001ba1d9834, 0x00000000fb4516b8 }, |
1151 | ++ /* x^105472 mod p(x)` << 1, x^105536 mod p(x)` << 1 */ |
1152 | ++ { 0x0000000104de61aa, 0x000000018cfa961c }, |
1153 | ++ /* x^104448 mod p(x)` << 1, x^104512 mod p(x)` << 1 */ |
1154 | ++ { 0x0000000113e40d46, 0x000000019e588d52 }, |
1155 | ++ /* x^103424 mod p(x)` << 1, x^103488 mod p(x)` << 1 */ |
1156 | ++ { 0x00000001415598a0, 0x00000001180f0bbc }, |
1157 | ++ /* x^102400 mod p(x)` << 1, x^102464 mod p(x)` << 1 */ |
1158 | ++ { 0x00000000bf6c8c90, 0x00000000e1d9177a }, |
1159 | ++ /* x^101376 mod p(x)` << 1, x^101440 mod p(x)` << 1 */ |
1160 | ++ { 0x00000001788b0504, 0x0000000105abc27c }, |
1161 | ++ /* x^100352 mod p(x)` << 1, x^100416 mod p(x)` << 1 */ |
1162 | ++ { 0x0000000038385d02, 0x00000000972e4a58 }, |
1163 | ++ /* x^99328 mod p(x)` << 1, x^99392 mod p(x)` << 1 */ |
1164 | ++ { 0x00000001b6c83844, 0x0000000183499a5e }, |
1165 | ++ /* x^98304 mod p(x)` << 1, x^98368 mod p(x)` << 1 */ |
1166 | ++ { 0x0000000051061a8a, 0x00000001c96a8cca }, |
1167 | ++ /* x^97280 mod p(x)` << 1, x^97344 mod p(x)` << 1 */ |
1168 | ++ { 0x000000017351388a, 0x00000001a1a5b60c }, |
1169 | ++ /* x^96256 mod p(x)` << 1, x^96320 mod p(x)` << 1 */ |
1170 | ++ { 0x0000000132928f92, 0x00000000e4b6ac9c }, |
1171 | ++ /* x^95232 mod p(x)` << 1, x^95296 mod p(x)` << 1 */ |
1172 | ++ { 0x00000000e6b4f48a, 0x00000001807e7f5a }, |
1173 | ++ /* x^94208 mod p(x)` << 1, x^94272 mod p(x)` << 1 */ |
1174 | ++ { 0x0000000039d15e90, 0x000000017a7e3bc8 }, |
1175 | ++ /* x^93184 mod p(x)` << 1, x^93248 mod p(x)` << 1 */ |
1176 | ++ { 0x00000000312d6074, 0x00000000d73975da }, |
1177 | ++ /* x^92160 mod p(x)` << 1, x^92224 mod p(x)` << 1 */ |
1178 | ++ { 0x000000017bbb2cc4, 0x000000017375d038 }, |
1179 | ++ /* x^91136 mod p(x)` << 1, x^91200 mod p(x)` << 1 */ |
1180 | ++ { 0x000000016ded3e18, 0x00000000193680bc }, |
1181 | ++ /* x^90112 mod p(x)` << 1, x^90176 mod p(x)` << 1 */ |
1182 | ++ { 0x00000000f1638b16, 0x00000000999b06f6 }, |
1183 | ++ /* x^89088 mod p(x)` << 1, x^89152 mod p(x)` << 1 */ |
1184 | ++ { 0x00000001d38b9ecc, 0x00000001f685d2b8 }, |
1185 | ++ /* x^88064 mod p(x)` << 1, x^88128 mod p(x)` << 1 */ |
1186 | ++ { 0x000000018b8d09dc, 0x00000001f4ecbed2 }, |
1187 | ++ /* x^87040 mod p(x)` << 1, x^87104 mod p(x)` << 1 */ |
1188 | ++ { 0x00000000e7bc27d2, 0x00000000ba16f1a0 }, |
1189 | ++ /* x^86016 mod p(x)` << 1, x^86080 mod p(x)` << 1 */ |
1190 | ++ { 0x00000000275e1e96, 0x0000000115aceac4 }, |
1191 | ++ /* x^84992 mod p(x)` << 1, x^85056 mod p(x)` << 1 */ |
1192 | ++ { 0x00000000e2e3031e, 0x00000001aeff6292 }, |
1193 | ++ /* x^83968 mod p(x)` << 1, x^84032 mod p(x)` << 1 */ |
1194 | ++ { 0x00000001041c84d8, 0x000000009640124c }, |
1195 | ++ /* x^82944 mod p(x)` << 1, x^83008 mod p(x)` << 1 */ |
1196 | ++ { 0x00000000706ce672, 0x0000000114f41f02 }, |
1197 | ++ /* x^81920 mod p(x)` << 1, x^81984 mod p(x)` << 1 */ |
1198 | ++ { 0x000000015d5070da, 0x000000009c5f3586 }, |
1199 | ++ /* x^80896 mod p(x)` << 1, x^80960 mod p(x)` << 1 */ |
1200 | ++ { 0x0000000038f9493a, 0x00000001878275fa }, |
1201 | ++ /* x^79872 mod p(x)` << 1, x^79936 mod p(x)` << 1 */ |
1202 | ++ { 0x00000000a3348a76, 0x00000000ddc42ce8 }, |
1203 | ++ /* x^78848 mod p(x)` << 1, x^78912 mod p(x)` << 1 */ |
1204 | ++ { 0x00000001ad0aab92, 0x0000000181d2c73a }, |
1205 | ++ /* x^77824 mod p(x)` << 1, x^77888 mod p(x)` << 1 */ |
1206 | ++ { 0x000000019e85f712, 0x0000000141c9320a }, |
1207 | ++ /* x^76800 mod p(x)` << 1, x^76864 mod p(x)` << 1 */ |
1208 | ++ { 0x000000005a871e76, 0x000000015235719a }, |
1209 | ++ /* x^75776 mod p(x)` << 1, x^75840 mod p(x)` << 1 */ |
1210 | ++ { 0x000000017249c662, 0x00000000be27d804 }, |
1211 | ++ /* x^74752 mod p(x)` << 1, x^74816 mod p(x)` << 1 */ |
1212 | ++ { 0x000000003a084712, 0x000000006242d45a }, |
1213 | ++ /* x^73728 mod p(x)` << 1, x^73792 mod p(x)` << 1 */ |
1214 | ++ { 0x00000000ed438478, 0x000000009a53638e }, |
1215 | ++ /* x^72704 mod p(x)` << 1, x^72768 mod p(x)` << 1 */ |
1216 | ++ { 0x00000000abac34cc, 0x00000001001ecfb6 }, |
1217 | ++ /* x^71680 mod p(x)` << 1, x^71744 mod p(x)` << 1 */ |
1218 | ++ { 0x000000005f35ef3e, 0x000000016d7c2d64 }, |
1219 | ++ /* x^70656 mod p(x)` << 1, x^70720 mod p(x)` << 1 */ |
1220 | ++ { 0x0000000047d6608c, 0x00000001d0ce46c0 }, |
1221 | ++ /* x^69632 mod p(x)` << 1, x^69696 mod p(x)` << 1 */ |
1222 | ++ { 0x000000002d01470e, 0x0000000124c907b4 }, |
1223 | ++ /* x^68608 mod p(x)` << 1, x^68672 mod p(x)` << 1 */ |
1224 | ++ { 0x0000000158bbc7b0, 0x0000000018a555ca }, |
1225 | ++ /* x^67584 mod p(x)` << 1, x^67648 mod p(x)` << 1 */ |
1226 | ++ { 0x00000000c0a23e8e, 0x000000006b0980bc }, |
1227 | ++ /* x^66560 mod p(x)` << 1, x^66624 mod p(x)` << 1 */ |
1228 | ++ { 0x00000001ebd85c88, 0x000000008bbba964 }, |
1229 | ++ /* x^65536 mod p(x)` << 1, x^65600 mod p(x)` << 1 */ |
1230 | ++ { 0x000000019ee20bb2, 0x00000001070a5a1e }, |
1231 | ++ /* x^64512 mod p(x)` << 1, x^64576 mod p(x)` << 1 */ |
1232 | ++ { 0x00000001acabf2d6, 0x000000002204322a }, |
1233 | ++ /* x^63488 mod p(x)` << 1, x^63552 mod p(x)` << 1 */ |
1234 | ++ { 0x00000001b7963d56, 0x00000000a27524d0 }, |
1235 | ++ /* x^62464 mod p(x)` << 1, x^62528 mod p(x)` << 1 */ |
1236 | ++ { 0x000000017bffa1fe, 0x0000000020b1e4ba }, |
1237 | ++ /* x^61440 mod p(x)` << 1, x^61504 mod p(x)` << 1 */ |
1238 | ++ { 0x000000001f15333e, 0x0000000032cc27fc }, |
1239 | ++ /* x^60416 mod p(x)` << 1, x^60480 mod p(x)` << 1 */ |
1240 | ++ { 0x000000018593129e, 0x0000000044dd22b8 }, |
1241 | ++ /* x^59392 mod p(x)` << 1, x^59456 mod p(x)` << 1 */ |
1242 | ++ { 0x000000019cb32602, 0x00000000dffc9e0a }, |
1243 | ++ /* x^58368 mod p(x)` << 1, x^58432 mod p(x)` << 1 */ |
1244 | ++ { 0x0000000142b05cc8, 0x00000001b7a0ed14 }, |
1245 | ++ /* x^57344 mod p(x)` << 1, x^57408 mod p(x)` << 1 */ |
1246 | ++ { 0x00000001be49e7a4, 0x00000000c7842488 }, |
1247 | ++ /* x^56320 mod p(x)` << 1, x^56384 mod p(x)` << 1 */ |
1248 | ++ { 0x0000000108f69d6c, 0x00000001c02a4fee }, |
1249 | ++ /* x^55296 mod p(x)` << 1, x^55360 mod p(x)` << 1 */ |
1250 | ++ { 0x000000006c0971f0, 0x000000003c273778 }, |
1251 | ++ /* x^54272 mod p(x)` << 1, x^54336 mod p(x)` << 1 */ |
1252 | ++ { 0x000000005b16467a, 0x00000001d63f8894 }, |
1253 | ++ /* x^53248 mod p(x)` << 1, x^53312 mod p(x)` << 1 */ |
1254 | ++ { 0x00000001551a628e, 0x000000006be557d6 }, |
1255 | ++ /* x^52224 mod p(x)` << 1, x^52288 mod p(x)` << 1 */ |
1256 | ++ { 0x000000019e42ea92, 0x000000006a7806ea }, |
1257 | ++ /* x^51200 mod p(x)` << 1, x^51264 mod p(x)` << 1 */ |
1258 | ++ { 0x000000012fa83ff2, 0x000000016155aa0c }, |
1259 | ++ /* x^50176 mod p(x)` << 1, x^50240 mod p(x)` << 1 */ |
1260 | ++ { 0x000000011ca9cde0, 0x00000000908650ac }, |
1261 | ++ /* x^49152 mod p(x)` << 1, x^49216 mod p(x)` << 1 */ |
1262 | ++ { 0x00000000c8e5cd74, 0x00000000aa5a8084 }, |
1263 | ++ /* x^48128 mod p(x)` << 1, x^48192 mod p(x)` << 1 */ |
1264 | ++ { 0x0000000096c27f0c, 0x0000000191bb500a }, |
1265 | ++ /* x^47104 mod p(x)` << 1, x^47168 mod p(x)` << 1 */ |
1266 | ++ { 0x000000002baed926, 0x0000000064e9bed0 }, |
1267 | ++ /* x^46080 mod p(x)` << 1, x^46144 mod p(x)` << 1 */ |
1268 | ++ { 0x000000017c8de8d2, 0x000000009444f302 }, |
1269 | ++ /* x^45056 mod p(x)` << 1, x^45120 mod p(x)` << 1 */ |
1270 | ++ { 0x00000000d43d6068, 0x000000019db07d3c }, |
1271 | ++ /* x^44032 mod p(x)` << 1, x^44096 mod p(x)` << 1 */ |
1272 | ++ { 0x00000000cb2c4b26, 0x00000001359e3e6e }, |
1273 | ++ /* x^43008 mod p(x)` << 1, x^43072 mod p(x)` << 1 */ |
1274 | ++ { 0x0000000145b8da26, 0x00000001e4f10dd2 }, |
1275 | ++ /* x^41984 mod p(x)` << 1, x^42048 mod p(x)` << 1 */ |
1276 | ++ { 0x000000018fff4b08, 0x0000000124f5735e }, |
1277 | ++ /* x^40960 mod p(x)` << 1, x^41024 mod p(x)` << 1 */ |
1278 | ++ { 0x0000000150b58ed0, 0x0000000124760a4c }, |
1279 | ++ /* x^39936 mod p(x)` << 1, x^40000 mod p(x)` << 1 */ |
1280 | ++ { 0x00000001549f39bc, 0x000000000f1fc186 }, |
1281 | ++ /* x^38912 mod p(x)` << 1, x^38976 mod p(x)` << 1 */ |
1282 | ++ { 0x00000000ef4d2f42, 0x00000000150e4cc4 }, |
1283 | ++ /* x^37888 mod p(x)` << 1, x^37952 mod p(x)` << 1 */ |
1284 | ++ { 0x00000001b1468572, 0x000000002a6204e8 }, |
1285 | ++ /* x^36864 mod p(x)` << 1, x^36928 mod p(x)` << 1 */ |
1286 | ++ { 0x000000013d7403b2, 0x00000000beb1d432 }, |
1287 | ++ /* x^35840 mod p(x)` << 1, x^35904 mod p(x)` << 1 */ |
1288 | ++ { 0x00000001a4681842, 0x0000000135f3f1f0 }, |
1289 | ++ /* x^34816 mod p(x)` << 1, x^34880 mod p(x)` << 1 */ |
1290 | ++ { 0x0000000167714492, 0x0000000074fe2232 }, |
1291 | ++ /* x^33792 mod p(x)` << 1, x^33856 mod p(x)` << 1 */ |
1292 | ++ { 0x00000001e599099a, 0x000000001ac6e2ba }, |
1293 | ++ /* x^32768 mod p(x)` << 1, x^32832 mod p(x)` << 1 */ |
1294 | ++ { 0x00000000fe128194, 0x0000000013fca91e }, |
1295 | ++ /* x^31744 mod p(x)` << 1, x^31808 mod p(x)` << 1 */ |
1296 | ++ { 0x0000000077e8b990, 0x0000000183f4931e }, |
1297 | ++ /* x^30720 mod p(x)` << 1, x^30784 mod p(x)` << 1 */ |
1298 | ++ { 0x00000001a267f63a, 0x00000000b6d9b4e4 }, |
1299 | ++ /* x^29696 mod p(x)` << 1, x^29760 mod p(x)` << 1 */ |
1300 | ++ { 0x00000001945c245a, 0x00000000b5188656 }, |
1301 | ++ /* x^28672 mod p(x)` << 1, x^28736 mod p(x)` << 1 */ |
1302 | ++ { 0x0000000149002e76, 0x0000000027a81a84 }, |
1303 | ++ /* x^27648 mod p(x)` << 1, x^27712 mod p(x)` << 1 */ |
1304 | ++ { 0x00000001bb8310a4, 0x0000000125699258 }, |
1305 | ++ /* x^26624 mod p(x)` << 1, x^26688 mod p(x)` << 1 */ |
1306 | ++ { 0x000000019ec60bcc, 0x00000001b23de796 }, |
1307 | ++ /* x^25600 mod p(x)` << 1, x^25664 mod p(x)` << 1 */ |
1308 | ++ { 0x000000012d8590ae, 0x00000000fe4365dc }, |
1309 | ++ /* x^24576 mod p(x)` << 1, x^24640 mod p(x)` << 1 */ |
1310 | ++ { 0x0000000065b00684, 0x00000000c68f497a }, |
1311 | ++ /* x^23552 mod p(x)` << 1, x^23616 mod p(x)` << 1 */ |
1312 | ++ { 0x000000015e5aeadc, 0x00000000fbf521ee }, |
1313 | ++ /* x^22528 mod p(x)` << 1, x^22592 mod p(x)` << 1 */ |
1314 | ++ { 0x00000000b77ff2b0, 0x000000015eac3378 }, |
1315 | ++ /* x^21504 mod p(x)` << 1, x^21568 mod p(x)` << 1 */ |
1316 | ++ { 0x0000000188da2ff6, 0x0000000134914b90 }, |
1317 | ++ /* x^20480 mod p(x)` << 1, x^20544 mod p(x)` << 1 */ |
1318 | ++ { 0x0000000063da929a, 0x0000000016335cfe }, |
1319 | ++ /* x^19456 mod p(x)` << 1, x^19520 mod p(x)` << 1 */ |
1320 | ++ { 0x00000001389caa80, 0x000000010372d10c }, |
1321 | ++ /* x^18432 mod p(x)` << 1, x^18496 mod p(x)` << 1 */ |
1322 | ++ { 0x000000013db599d2, 0x000000015097b908 }, |
1323 | ++ /* x^17408 mod p(x)` << 1, x^17472 mod p(x)` << 1 */ |
1324 | ++ { 0x0000000122505a86, 0x00000001227a7572 }, |
1325 | ++ /* x^16384 mod p(x)` << 1, x^16448 mod p(x)` << 1 */ |
1326 | ++ { 0x000000016bd72746, 0x000000009a8f75c0 }, |
1327 | ++ /* x^15360 mod p(x)` << 1, x^15424 mod p(x)` << 1 */ |
1328 | ++ { 0x00000001c3faf1d4, 0x00000000682c77a2 }, |
1329 | ++ /* x^14336 mod p(x)` << 1, x^14400 mod p(x)` << 1 */ |
1330 | ++ { 0x00000001111c826c, 0x00000000231f091c }, |
1331 | ++ /* x^13312 mod p(x)` << 1, x^13376 mod p(x)` << 1 */ |
1332 | ++ { 0x00000000153e9fb2, 0x000000007d4439f2 }, |
1333 | ++ /* x^12288 mod p(x)` << 1, x^12352 mod p(x)` << 1 */ |
1334 | ++ { 0x000000002b1f7b60, 0x000000017e221efc }, |
1335 | ++ /* x^11264 mod p(x)` << 1, x^11328 mod p(x)` << 1 */ |
1336 | ++ { 0x00000000b1dba570, 0x0000000167457c38 }, |
1337 | ++ /* x^10240 mod p(x)` << 1, x^10304 mod p(x)` << 1 */ |
1338 | ++ { 0x00000001f6397b76, 0x00000000bdf081c4 }, |
1339 | ++ /* x^9216 mod p(x)` << 1, x^9280 mod p(x)` << 1 */ |
1340 | ++ { 0x0000000156335214, 0x000000016286d6b0 }, |
1341 | ++ /* x^8192 mod p(x)` << 1, x^8256 mod p(x)` << 1 */ |
1342 | ++ { 0x00000001d70e3986, 0x00000000c84f001c }, |
1343 | ++ /* x^7168 mod p(x)` << 1, x^7232 mod p(x)` << 1 */ |
1344 | ++ { 0x000000003701a774, 0x0000000064efe7c0 }, |
1345 | ++ /* x^6144 mod p(x)` << 1, x^6208 mod p(x)` << 1 */ |
1346 | ++ { 0x00000000ac81ef72, 0x000000000ac2d904 }, |
1347 | ++ /* x^5120 mod p(x)` << 1, x^5184 mod p(x)` << 1 */ |
1348 | ++ { 0x0000000133212464, 0x00000000fd226d14 }, |
1349 | ++ /* x^4096 mod p(x)` << 1, x^4160 mod p(x)` << 1 */ |
1350 | ++ { 0x00000000e4e45610, 0x000000011cfd42e0 }, |
1351 | ++ /* x^3072 mod p(x)` << 1, x^3136 mod p(x)` << 1 */ |
1352 | ++ { 0x000000000c1bd370, 0x000000016e5a5678 }, |
1353 | ++ /* x^2048 mod p(x)` << 1, x^2112 mod p(x)` << 1 */ |
1354 | ++ { 0x00000001a7b9e7a6, 0x00000001d888fe22 }, |
1355 | ++ /* x^1024 mod p(x)` << 1, x^1088 mod p(x)` << 1 */ |
1356 | ++ { 0x000000007d657a10, 0x00000001af77fcd4 } |
1357 | ++#else /* __LITTLE_ENDIAN__ */ |
1358 | ++ /* x^261120 mod p(x)` << 1, x^261184 mod p(x)` << 1 */ |
1359 | ++ { 0x00000001651797d2, 0x0000000099ea94a8 }, |
1360 | ++ /* x^260096 mod p(x)` << 1, x^260160 mod p(x)` << 1 */ |
1361 | ++ { 0x0000000021e0d56c, 0x00000000945a8420 }, |
1362 | ++ /* x^259072 mod p(x)` << 1, x^259136 mod p(x)` << 1 */ |
1363 | ++ { 0x000000000f95ecaa, 0x0000000030762706 }, |
1364 | ++ /* x^258048 mod p(x)` << 1, x^258112 mod p(x)` << 1 */ |
1365 | ++ { 0x00000001ebd224ac, 0x00000001a52fc582 }, |
1366 | ++ /* x^257024 mod p(x)` << 1, x^257088 mod p(x)` << 1 */ |
1367 | ++ { 0x000000000ccb97ca, 0x00000001a4a7167a }, |
1368 | ++ /* x^256000 mod p(x)` << 1, x^256064 mod p(x)` << 1 */ |
1369 | ++ { 0x00000001006ec8a8, 0x000000000c18249a }, |
1370 | ++ /* x^254976 mod p(x)` << 1, x^255040 mod p(x)` << 1 */ |
1371 | ++ { 0x000000014f58f196, 0x00000000a924ae7c }, |
1372 | ++ /* x^253952 mod p(x)` << 1, x^254016 mod p(x)` << 1 */ |
1373 | ++ { 0x00000001a7192ca6, 0x00000001e12ccc12 }, |
1374 | ++ /* x^252928 mod p(x)` << 1, x^252992 mod p(x)` << 1 */ |
1375 | ++ { 0x000000019a64bab2, 0x00000000a0b9d4ac }, |
1376 | ++ /* x^251904 mod p(x)` << 1, x^251968 mod p(x)` << 1 */ |
1377 | ++ { 0x0000000014f4ed2e, 0x0000000095e8ddfe }, |
1378 | ++ /* x^250880 mod p(x)` << 1, x^250944 mod p(x)` << 1 */ |
1379 | ++ { 0x000000011092b6a2, 0x00000000233fddc4 }, |
1380 | ++ /* x^249856 mod p(x)` << 1, x^249920 mod p(x)` << 1 */ |
1381 | ++ { 0x00000000c8a1629c, 0x00000001b4529b62 }, |
1382 | ++ /* x^248832 mod p(x)` << 1, x^248896 mod p(x)` << 1 */ |
1383 | ++ { 0x000000017bf32e8e, 0x00000001a7fa0e64 }, |
1384 | ++ /* x^247808 mod p(x)` << 1, x^247872 mod p(x)` << 1 */ |
1385 | ++ { 0x00000001f8cc6582, 0x00000001b5334592 }, |
1386 | ++ /* x^246784 mod p(x)` << 1, x^246848 mod p(x)` << 1 */ |
1387 | ++ { 0x000000008631ddf0, 0x000000011f8ee1b4 }, |
1388 | ++ /* x^245760 mod p(x)` << 1, x^245824 mod p(x)` << 1 */ |
1389 | ++ { 0x000000007e5a76d0, 0x000000006252e632 }, |
1390 | ++ /* x^244736 mod p(x)` << 1, x^244800 mod p(x)` << 1 */ |
1391 | ++ { 0x000000002b09b31c, 0x00000000ab973e84 }, |
1392 | ++ /* x^243712 mod p(x)` << 1, x^243776 mod p(x)` << 1 */ |
1393 | ++ { 0x00000001b2df1f84, 0x000000007734f5ec }, |
1394 | ++ /* x^242688 mod p(x)` << 1, x^242752 mod p(x)` << 1 */ |
1395 | ++ { 0x00000001d6f56afc, 0x000000007c547798 }, |
1396 | ++ /* x^241664 mod p(x)` << 1, x^241728 mod p(x)` << 1 */ |
1397 | ++ { 0x00000001b9b5e70c, 0x000000007ec40210 }, |
1398 | ++ /* x^240640 mod p(x)` << 1, x^240704 mod p(x)` << 1 */ |
1399 | ++ { 0x0000000034b626d2, 0x00000001ab1695a8 }, |
1400 | ++ /* x^239616 mod p(x)` << 1, x^239680 mod p(x)` << 1 */ |
1401 | ++ { 0x000000014c53479a, 0x0000000090494bba }, |
1402 | ++ /* x^238592 mod p(x)` << 1, x^238656 mod p(x)` << 1 */ |
1403 | ++ { 0x00000001a6d179a4, 0x00000001123fb816 }, |
1404 | ++ /* x^237568 mod p(x)` << 1, x^237632 mod p(x)` << 1 */ |
1405 | ++ { 0x000000015abd16b4, 0x00000001e188c74c }, |
1406 | ++ /* x^236544 mod p(x)` << 1, x^236608 mod p(x)` << 1 */ |
1407 | ++ { 0x00000000018f9852, 0x00000001c2d3451c }, |
1408 | ++ /* x^235520 mod p(x)` << 1, x^235584 mod p(x)` << 1 */ |
1409 | ++ { 0x000000001fb3084a, 0x00000000f55cf1ca }, |
1410 | ++ /* x^234496 mod p(x)` << 1, x^234560 mod p(x)` << 1 */ |
1411 | ++ { 0x00000000c53dfb04, 0x00000001a0531540 }, |
1412 | ++ /* x^233472 mod p(x)` << 1, x^233536 mod p(x)` << 1 */ |
1413 | ++ { 0x00000000e10c9ad6, 0x0000000132cd7ebc }, |
1414 | ++ /* x^232448 mod p(x)` << 1, x^232512 mod p(x)` << 1 */ |
1415 | ++ { 0x0000000025aa994a, 0x0000000073ab7f36 }, |
1416 | ++ /* x^231424 mod p(x)` << 1, x^231488 mod p(x)` << 1 */ |
1417 | ++ { 0x00000000fa3a74c4, 0x0000000041aed1c2 }, |
1418 | ++ /* x^230400 mod p(x)` << 1, x^230464 mod p(x)` << 1 */ |
1419 | ++ { 0x0000000033eb3f40, 0x0000000136c53800 }, |
1420 | ++ /* x^229376 mod p(x)` << 1, x^229440 mod p(x)` << 1 */ |
1421 | ++ { 0x000000017193f296, 0x0000000126835a30 }, |
1422 | ++ /* x^228352 mod p(x)` << 1, x^228416 mod p(x)` << 1 */ |
1423 | ++ { 0x0000000043f6c86a, 0x000000006241b502 }, |
1424 | ++ /* x^227328 mod p(x)` << 1, x^227392 mod p(x)` << 1 */ |
1425 | ++ { 0x000000016b513ec6, 0x00000000d5196ad4 }, |
1426 | ++ /* x^226304 mod p(x)` << 1, x^226368 mod p(x)` << 1 */ |
1427 | ++ { 0x00000000c8f25b4e, 0x000000009cfa769a }, |
1428 | ++ /* x^225280 mod p(x)` << 1, x^225344 mod p(x)` << 1 */ |
1429 | ++ { 0x00000001a45048ec, 0x00000000920e5df4 }, |
1430 | ++ /* x^224256 mod p(x)` << 1, x^224320 mod p(x)` << 1 */ |
1431 | ++ { 0x000000000c441004, 0x0000000169dc310e }, |
1432 | ++ /* x^223232 mod p(x)` << 1, x^223296 mod p(x)` << 1 */ |
1433 | ++ { 0x000000000e17cad6, 0x0000000009fc331c }, |
1434 | ++ /* x^222208 mod p(x)` << 1, x^222272 mod p(x)` << 1 */ |
1435 | ++ { 0x00000001253ae964, 0x000000010d94a81e }, |
1436 | ++ /* x^221184 mod p(x)` << 1, x^221248 mod p(x)` << 1 */ |
1437 | ++ { 0x00000001d7c88ebc, 0x0000000027a20ab2 }, |
1438 | ++ /* x^220160 mod p(x)` << 1, x^220224 mod p(x)` << 1 */ |
1439 | ++ { 0x00000001e7ca913a, 0x0000000114f87504 }, |
1440 | ++ /* x^219136 mod p(x)` << 1, x^219200 mod p(x)` << 1 */ |
1441 | ++ { 0x0000000033ed078a, 0x000000004b076d96 }, |
1442 | ++ /* x^218112 mod p(x)` << 1, x^218176 mod p(x)` << 1 */ |
1443 | ++ { 0x00000000e1839c78, 0x00000000da4d1e74 }, |
1444 | ++ /* x^217088 mod p(x)` << 1, x^217152 mod p(x)` << 1 */ |
1445 | ++ { 0x00000001322b267e, 0x000000001b81f672 }, |
1446 | ++ /* x^216064 mod p(x)` << 1, x^216128 mod p(x)` << 1 */ |
1447 | ++ { 0x00000000638231b6, 0x000000009367c988 }, |
1448 | ++ /* x^215040 mod p(x)` << 1, x^215104 mod p(x)` << 1 */ |
1449 | ++ { 0x00000001ee7f16f4, 0x00000001717214ca }, |
1450 | ++ /* x^214016 mod p(x)` << 1, x^214080 mod p(x)` << 1 */ |
1451 | ++ { 0x0000000117d9924a, 0x000000009f47d820 }, |
1452 | ++ /* x^212992 mod p(x)` << 1, x^213056 mod p(x)` << 1 */ |
1453 | ++ { 0x00000000e1a9e0c4, 0x000000010d9a47d2 }, |
1454 | ++ /* x^211968 mod p(x)` << 1, x^212032 mod p(x)` << 1 */ |
1455 | ++ { 0x00000001403731dc, 0x00000000a696c58c }, |
1456 | ++ /* x^210944 mod p(x)` << 1, x^211008 mod p(x)` << 1 */ |
1457 | ++ { 0x00000001a5ea9682, 0x000000002aa28ec6 }, |
1458 | ++ /* x^209920 mod p(x)` << 1, x^209984 mod p(x)` << 1 */ |
1459 | ++ { 0x0000000101c5c578, 0x00000001fe18fd9a }, |
1460 | ++ /* x^208896 mod p(x)` << 1, x^208960 mod p(x)` << 1 */ |
1461 | ++ { 0x00000000dddf6494, 0x000000019d4fc1ae }, |
1462 | ++ /* x^207872 mod p(x)` << 1, x^207936 mod p(x)` << 1 */ |
1463 | ++ { 0x00000000f1c3db28, 0x00000001ba0e3dea }, |
1464 | ++ /* x^206848 mod p(x)` << 1, x^206912 mod p(x)` << 1 */ |
1465 | ++ { 0x000000013112fb9c, 0x0000000074b59a5e }, |
1466 | ++ /* x^205824 mod p(x)` << 1, x^205888 mod p(x)` << 1 */ |
1467 | ++ { 0x00000000b680b906, 0x00000000f2b5ea98 }, |
1468 | ++ /* x^204800 mod p(x)` << 1, x^204864 mod p(x)` << 1 */ |
1469 | ++ { 0x000000001a282932, 0x0000000187132676 }, |
1470 | ++ /* x^203776 mod p(x)` << 1, x^203840 mod p(x)` << 1 */ |
1471 | ++ { 0x0000000089406e7e, 0x000000010a8c6ad4 }, |
1472 | ++ /* x^202752 mod p(x)` << 1, x^202816 mod p(x)` << 1 */ |
1473 | ++ { 0x00000001def6be8c, 0x00000001e21dfe70 }, |
1474 | ++ /* x^201728 mod p(x)` << 1, x^201792 mod p(x)` << 1 */ |
1475 | ++ { 0x0000000075258728, 0x00000001da0050e4 }, |
1476 | ++ /* x^200704 mod p(x)` << 1, x^200768 mod p(x)` << 1 */ |
1477 | ++ { 0x000000019536090a, 0x00000000772172ae }, |
1478 | ++ /* x^199680 mod p(x)` << 1, x^199744 mod p(x)` << 1 */ |
1479 | ++ { 0x00000000f2455bfc, 0x00000000e47724aa }, |
1480 | ++ /* x^198656 mod p(x)` << 1, x^198720 mod p(x)` << 1 */ |
1481 | ++ { 0x000000018c40baf4, 0x000000003cd63ac4 }, |
1482 | ++ /* x^197632 mod p(x)` << 1, x^197696 mod p(x)` << 1 */ |
1483 | ++ { 0x000000004cd390d4, 0x00000001bf47d352 }, |
1484 | ++ /* x^196608 mod p(x)` << 1, x^196672 mod p(x)` << 1 */ |
1485 | ++ { 0x00000001e4ece95a, 0x000000018dc1d708 }, |
1486 | ++ /* x^195584 mod p(x)` << 1, x^195648 mod p(x)` << 1 */ |
1487 | ++ { 0x000000001a3ee918, 0x000000002d4620a4 }, |
1488 | ++ /* x^194560 mod p(x)` << 1, x^194624 mod p(x)` << 1 */ |
1489 | ++ { 0x000000007c652fb8, 0x0000000058fd1740 }, |
1490 | ++ /* x^193536 mod p(x)` << 1, x^193600 mod p(x)` << 1 */ |
1491 | ++ { 0x000000011c67842c, 0x00000000dadd9bfc }, |
1492 | ++ /* x^192512 mod p(x)` << 1, x^192576 mod p(x)` << 1 */ |
1493 | ++ { 0x00000000254f759c, 0x00000001ea2140be }, |
1494 | ++ /* x^191488 mod p(x)` << 1, x^191552 mod p(x)` << 1 */ |
1495 | ++ { 0x000000007ece94ca, 0x000000009de128ba }, |
1496 | ++ /* x^190464 mod p(x)` << 1, x^190528 mod p(x)` << 1 */ |
1497 | ++ { 0x0000000038f258c2, 0x000000013ac3aa8e }, |
1498 | ++ /* x^189440 mod p(x)` << 1, x^189504 mod p(x)` << 1 */ |
1499 | ++ { 0x00000001cdf17b00, 0x0000000099980562 }, |
1500 | ++ /* x^188416 mod p(x)` << 1, x^188480 mod p(x)` << 1 */ |
1501 | ++ { 0x000000011f882c16, 0x00000001c1579c86 }, |
1502 | ++ /* x^187392 mod p(x)` << 1, x^187456 mod p(x)` << 1 */ |
1503 | ++ { 0x0000000100093fc8, 0x0000000068dbbf94 }, |
1504 | ++ /* x^186368 mod p(x)` << 1, x^186432 mod p(x)` << 1 */ |
1505 | ++ { 0x00000001cd684f16, 0x000000004509fb04 }, |
1506 | ++ /* x^185344 mod p(x)` << 1, x^185408 mod p(x)` << 1 */ |
1507 | ++ { 0x000000004bc6a70a, 0x00000001202f6398 }, |
1508 | ++ /* x^184320 mod p(x)` << 1, x^184384 mod p(x)` << 1 */ |
1509 | ++ { 0x000000004fc7e8e4, 0x000000013aea243e }, |
1510 | ++ /* x^183296 mod p(x)` << 1, x^183360 mod p(x)` << 1 */ |
1511 | ++ { 0x0000000130103f1c, 0x00000001b4052ae6 }, |
1512 | ++ /* x^182272 mod p(x)` << 1, x^182336 mod p(x)` << 1 */ |
1513 | ++ { 0x0000000111b0024c, 0x00000001cd2a0ae8 }, |
1514 | ++ /* x^181248 mod p(x)` << 1, x^181312 mod p(x)` << 1 */ |
1515 | ++ { 0x000000010b3079da, 0x00000001fe4aa8b4 }, |
1516 | ++ /* x^180224 mod p(x)` << 1, x^180288 mod p(x)` << 1 */ |
1517 | ++ { 0x000000010192bcc2, 0x00000001d1559a42 }, |
1518 | ++ /* x^179200 mod p(x)` << 1, x^179264 mod p(x)` << 1 */ |
1519 | ++ { 0x0000000074838d50, 0x00000001f3e05ecc }, |
1520 | ++ /* x^178176 mod p(x)` << 1, x^178240 mod p(x)` << 1 */ |
1521 | ++ { 0x000000001b20f520, 0x0000000104ddd2cc }, |
1522 | ++ /* x^177152 mod p(x)` << 1, x^177216 mod p(x)` << 1 */ |
1523 | ++ { 0x0000000050c3590a, 0x000000015393153c }, |
1524 | ++ /* x^176128 mod p(x)` << 1, x^176192 mod p(x)` << 1 */ |
1525 | ++ { 0x00000000b41cac8e, 0x0000000057e942c6 }, |
1526 | ++ /* x^175104 mod p(x)` << 1, x^175168 mod p(x)` << 1 */ |
1527 | ++ { 0x000000000c72cc78, 0x000000012c633850 }, |
1528 | ++ /* x^174080 mod p(x)` << 1, x^174144 mod p(x)` << 1 */ |
1529 | ++ { 0x0000000030cdb032, 0x00000000ebcaae4c }, |
1530 | ++ /* x^173056 mod p(x)` << 1, x^173120 mod p(x)` << 1 */ |
1531 | ++ { 0x000000013e09fc32, 0x000000013ee532a6 }, |
1532 | ++ /* x^172032 mod p(x)` << 1, x^172096 mod p(x)` << 1 */ |
1533 | ++ { 0x000000001ed624d2, 0x00000001bf0cbc7e }, |
1534 | ++ /* x^171008 mod p(x)` << 1, x^171072 mod p(x)` << 1 */ |
1535 | ++ { 0x00000000781aee1a, 0x00000000d50b7a5a }, |
1536 | ++ /* x^169984 mod p(x)` << 1, x^170048 mod p(x)` << 1 */ |
1537 | ++ { 0x00000001c4d8348c, 0x0000000002fca6e8 }, |
1538 | ++ /* x^168960 mod p(x)` << 1, x^169024 mod p(x)` << 1 */ |
1539 | ++ { 0x0000000057a40336, 0x000000007af40044 }, |
1540 | ++ /* x^167936 mod p(x)` << 1, x^168000 mod p(x)` << 1 */ |
1541 | ++ { 0x0000000085544940, 0x0000000016178744 }, |
1542 | ++ /* x^166912 mod p(x)` << 1, x^166976 mod p(x)` << 1 */ |
1543 | ++ { 0x000000019cd21e80, 0x000000014c177458 }, |
1544 | ++ /* x^165888 mod p(x)` << 1, x^165952 mod p(x)` << 1 */ |
1545 | ++ { 0x000000013eb95bc0, 0x000000011b6ddf04 }, |
1546 | ++ /* x^164864 mod p(x)` << 1, x^164928 mod p(x)` << 1 */ |
1547 | ++ { 0x00000001dfc9fdfc, 0x00000001f3e29ccc }, |
1548 | ++ /* x^163840 mod p(x)` << 1, x^163904 mod p(x)` << 1 */ |
1549 | ++ { 0x00000000cd028bc2, 0x0000000135ae7562 }, |
1550 | ++ /* x^162816 mod p(x)` << 1, x^162880 mod p(x)` << 1 */ |
1551 | ++ { 0x0000000090db8c44, 0x0000000190ef812c }, |
1552 | ++ /* x^161792 mod p(x)` << 1, x^161856 mod p(x)` << 1 */ |
1553 | ++ { 0x000000010010a4ce, 0x0000000067a2c786 }, |
1554 | ++ /* x^160768 mod p(x)` << 1, x^160832 mod p(x)` << 1 */ |
1555 | ++ { 0x00000001c8f4c72c, 0x0000000048b9496c }, |
1556 | ++ /* x^159744 mod p(x)` << 1, x^159808 mod p(x)` << 1 */ |
1557 | ++ { 0x000000001c26170c, 0x000000015a422de6 }, |
1558 | ++ /* x^158720 mod p(x)` << 1, x^158784 mod p(x)` << 1 */ |
1559 | ++ { 0x00000000e3fccf68, 0x00000001ef0e3640 }, |
1560 | ++ /* x^157696 mod p(x)` << 1, x^157760 mod p(x)` << 1 */ |
1561 | ++ { 0x00000000d513ed24, 0x00000001006d2d26 }, |
1562 | ++ /* x^156672 mod p(x)` << 1, x^156736 mod p(x)` << 1 */ |
1563 | ++ { 0x00000000141beada, 0x00000001170d56d6 }, |
1564 | ++ /* x^155648 mod p(x)` << 1, x^155712 mod p(x)` << 1 */ |
1565 | ++ { 0x000000011071aea0, 0x00000000a5fb613c }, |
1566 | ++ /* x^154624 mod p(x)` << 1, x^154688 mod p(x)` << 1 */ |
1567 | ++ { 0x000000012e19080a, 0x0000000040bbf7fc }, |
1568 | ++ /* x^153600 mod p(x)` << 1, x^153664 mod p(x)` << 1 */ |
1569 | ++ { 0x0000000100ecf826, 0x000000016ac3a5b2 }, |
1570 | ++ /* x^152576 mod p(x)` << 1, x^152640 mod p(x)` << 1 */ |
1571 | ++ { 0x0000000069b09412, 0x00000000abf16230 }, |
1572 | ++ /* x^151552 mod p(x)` << 1, x^151616 mod p(x)` << 1 */ |
1573 | ++ { 0x0000000122297bac, 0x00000001ebe23fac }, |
1574 | ++ /* x^150528 mod p(x)` << 1, x^150592 mod p(x)` << 1 */ |
1575 | ++ { 0x00000000e9e4b068, 0x000000008b6a0894 }, |
1576 | ++ /* x^149504 mod p(x)` << 1, x^149568 mod p(x)` << 1 */ |
1577 | ++ { 0x000000004b38651a, 0x00000001288ea478 }, |
1578 | ++ /* x^148480 mod p(x)` << 1, x^148544 mod p(x)` << 1 */ |
1579 | ++ { 0x00000001468360e2, 0x000000016619c442 }, |
1580 | ++ /* x^147456 mod p(x)` << 1, x^147520 mod p(x)` << 1 */ |
1581 | ++ { 0x00000000121c2408, 0x0000000086230038 }, |
1582 | ++ /* x^146432 mod p(x)` << 1, x^146496 mod p(x)` << 1 */ |
1583 | ++ { 0x00000000da7e7d08, 0x000000017746a756 }, |
1584 | ++ /* x^145408 mod p(x)` << 1, x^145472 mod p(x)` << 1 */ |
1585 | ++ { 0x00000001058d7652, 0x0000000191b8f8f8 }, |
1586 | ++ /* x^144384 mod p(x)` << 1, x^144448 mod p(x)` << 1 */ |
1587 | ++ { 0x000000014a098a90, 0x000000008e167708 }, |
1588 | ++ /* x^143360 mod p(x)` << 1, x^143424 mod p(x)` << 1 */ |
1589 | ++ { 0x0000000020dbe72e, 0x0000000148b22d54 }, |
1590 | ++ /* x^142336 mod p(x)` << 1, x^142400 mod p(x)` << 1 */ |
1591 | ++ { 0x000000011e7323e8, 0x0000000044ba2c3c }, |
1592 | ++ /* x^141312 mod p(x)` << 1, x^141376 mod p(x)` << 1 */ |
1593 | ++ { 0x00000000d5d4bf94, 0x00000000b54d2b52 }, |
1594 | ++ /* x^140288 mod p(x)` << 1, x^140352 mod p(x)` << 1 */ |
1595 | ++ { 0x0000000199d8746c, 0x0000000005a4fd8a }, |
1596 | ++ /* x^139264 mod p(x)` << 1, x^139328 mod p(x)` << 1 */ |
1597 | ++ { 0x00000000ce9ca8a0, 0x0000000139f9fc46 }, |
1598 | ++ /* x^138240 mod p(x)` << 1, x^138304 mod p(x)` << 1 */ |
1599 | ++ { 0x00000000136edece, 0x000000015a1fa824 }, |
1600 | ++ /* x^137216 mod p(x)` << 1, x^137280 mod p(x)` << 1 */ |
1601 | ++ { 0x000000019b92a068, 0x000000000a61ae4c }, |
1602 | ++ /* x^136192 mod p(x)` << 1, x^136256 mod p(x)` << 1 */ |
1603 | ++ { 0x0000000071d62206, 0x0000000145e9113e }, |
1604 | ++ /* x^135168 mod p(x)` << 1, x^135232 mod p(x)` << 1 */ |
1605 | ++ { 0x00000000dfc50158, 0x000000006a348448 }, |
1606 | ++ /* x^134144 mod p(x)` << 1, x^134208 mod p(x)` << 1 */ |
1607 | ++ { 0x00000001517626bc, 0x000000004d80a08c }, |
1608 | ++ /* x^133120 mod p(x)` << 1, x^133184 mod p(x)` << 1 */ |
1609 | ++ { 0x0000000148d1e4fa, 0x000000014b6837a0 }, |
1610 | ++ /* x^132096 mod p(x)` << 1, x^132160 mod p(x)` << 1 */ |
1611 | ++ { 0x0000000094d8266e, 0x000000016896a7fc }, |
1612 | ++ /* x^131072 mod p(x)` << 1, x^131136 mod p(x)` << 1 */ |
1613 | ++ { 0x00000000606c5e34, 0x000000014f187140 }, |
1614 | ++ /* x^130048 mod p(x)` << 1, x^130112 mod p(x)` << 1 */ |
1615 | ++ { 0x000000019766beaa, 0x000000019581b9da }, |
1616 | ++ /* x^129024 mod p(x)` << 1, x^129088 mod p(x)` << 1 */ |
1617 | ++ { 0x00000001d80c506c, 0x00000001091bc984 }, |
1618 | ++ /* x^128000 mod p(x)` << 1, x^128064 mod p(x)` << 1 */ |
1619 | ++ { 0x000000001e73837c, 0x000000001067223c }, |
1620 | ++ /* x^126976 mod p(x)` << 1, x^127040 mod p(x)` << 1 */ |
1621 | ++ { 0x0000000064d587de, 0x00000001ab16ea02 }, |
1622 | ++ /* x^125952 mod p(x)` << 1, x^126016 mod p(x)` << 1 */ |
1623 | ++ { 0x00000000f4a507b0, 0x000000013c4598a8 }, |
1624 | ++ /* x^124928 mod p(x)` << 1, x^124992 mod p(x)` << 1 */ |
1625 | ++ { 0x0000000040e342fc, 0x00000000b3735430 }, |
1626 | ++ /* x^123904 mod p(x)` << 1, x^123968 mod p(x)` << 1 */ |
1627 | ++ { 0x00000001d5ad9c3a, 0x00000001bb3fc0c0 }, |
1628 | ++ /* x^122880 mod p(x)` << 1, x^122944 mod p(x)` << 1 */ |
1629 | ++ { 0x0000000094a691a4, 0x00000001570ae19c }, |
1630 | ++ /* x^121856 mod p(x)` << 1, x^121920 mod p(x)` << 1 */ |
1631 | ++ { 0x00000001271ecdfa, 0x00000001ea910712 }, |
1632 | ++ /* x^120832 mod p(x)` << 1, x^120896 mod p(x)` << 1 */ |
1633 | ++ { 0x000000009e54475a, 0x0000000167127128 }, |
1634 | ++ /* x^119808 mod p(x)` << 1, x^119872 mod p(x)` << 1 */ |
1635 | ++ { 0x00000000c9c099ee, 0x0000000019e790a2 }, |
1636 | ++ /* x^118784 mod p(x)` << 1, x^118848 mod p(x)` << 1 */ |
1637 | ++ { 0x000000009a2f736c, 0x000000003788f710 }, |
1638 | ++ /* x^117760 mod p(x)` << 1, x^117824 mod p(x)` << 1 */ |
1639 | ++ { 0x00000000bb9f4996, 0x00000001682a160e }, |
1640 | ++ /* x^116736 mod p(x)` << 1, x^116800 mod p(x)` << 1 */ |
1641 | ++ { 0x00000001db688050, 0x000000007f0ebd2e }, |
1642 | ++ /* x^115712 mod p(x)` << 1, x^115776 mod p(x)` << 1 */ |
1643 | ++ { 0x00000000e9b10af4, 0x000000002b032080 }, |
1644 | ++ /* x^114688 mod p(x)` << 1, x^114752 mod p(x)` << 1 */ |
1645 | ++ { 0x000000012d4545e4, 0x00000000cfd1664a }, |
1646 | ++ /* x^113664 mod p(x)` << 1, x^113728 mod p(x)` << 1 */ |
1647 | ++ { 0x000000000361139c, 0x00000000aa1181c2 }, |
1648 | ++ /* x^112640 mod p(x)` << 1, x^112704 mod p(x)` << 1 */ |
1649 | ++ { 0x00000001a5a1a3a8, 0x00000000ddd08002 }, |
1650 | ++ /* x^111616 mod p(x)` << 1, x^111680 mod p(x)` << 1 */ |
1651 | ++ { 0x000000006844e0b0, 0x00000000e8dd0446 }, |
1652 | ++ /* x^110592 mod p(x)` << 1, x^110656 mod p(x)` << 1 */ |
1653 | ++ { 0x00000000c3762f28, 0x00000001bbd94a00 }, |
1654 | ++ /* x^109568 mod p(x)` << 1, x^109632 mod p(x)` << 1 */ |
1655 | ++ { 0x00000001d26287a2, 0x00000000ab6cd180 }, |
1656 | ++ /* x^108544 mod p(x)` << 1, x^108608 mod p(x)` << 1 */ |
1657 | ++ { 0x00000001f6f0bba8, 0x0000000031803ce2 }, |
1658 | ++ /* x^107520 mod p(x)` << 1, x^107584 mod p(x)` << 1 */ |
1659 | ++ { 0x000000002ffabd62, 0x0000000024f40b0c }, |
1660 | ++ /* x^106496 mod p(x)` << 1, x^106560 mod p(x)` << 1 */ |
1661 | ++ { 0x00000000fb4516b8, 0x00000001ba1d9834 }, |
1662 | ++ /* x^105472 mod p(x)` << 1, x^105536 mod p(x)` << 1 */ |
1663 | ++ { 0x000000018cfa961c, 0x0000000104de61aa }, |
1664 | ++ /* x^104448 mod p(x)` << 1, x^104512 mod p(x)` << 1 */ |
1665 | ++ { 0x000000019e588d52, 0x0000000113e40d46 }, |
1666 | ++ /* x^103424 mod p(x)` << 1, x^103488 mod p(x)` << 1 */ |
1667 | ++ { 0x00000001180f0bbc, 0x00000001415598a0 }, |
1668 | ++ /* x^102400 mod p(x)` << 1, x^102464 mod p(x)` << 1 */ |
1669 | ++ { 0x00000000e1d9177a, 0x00000000bf6c8c90 }, |
1670 | ++ /* x^101376 mod p(x)` << 1, x^101440 mod p(x)` << 1 */ |
1671 | ++ { 0x0000000105abc27c, 0x00000001788b0504 }, |
1672 | ++ /* x^100352 mod p(x)` << 1, x^100416 mod p(x)` << 1 */ |
1673 | ++ { 0x00000000972e4a58, 0x0000000038385d02 }, |
1674 | ++ /* x^99328 mod p(x)` << 1, x^99392 mod p(x)` << 1 */ |
1675 | ++ { 0x0000000183499a5e, 0x00000001b6c83844 }, |
1676 | ++ /* x^98304 mod p(x)` << 1, x^98368 mod p(x)` << 1 */ |
1677 | ++ { 0x00000001c96a8cca, 0x0000000051061a8a }, |
1678 | ++ /* x^97280 mod p(x)` << 1, x^97344 mod p(x)` << 1 */ |
1679 | ++ { 0x00000001a1a5b60c, 0x000000017351388a }, |
1680 | ++ /* x^96256 mod p(x)` << 1, x^96320 mod p(x)` << 1 */ |
1681 | ++ { 0x00000000e4b6ac9c, 0x0000000132928f92 }, |
1682 | ++ /* x^95232 mod p(x)` << 1, x^95296 mod p(x)` << 1 */ |
1683 | ++ { 0x00000001807e7f5a, 0x00000000e6b4f48a }, |
1684 | ++ /* x^94208 mod p(x)` << 1, x^94272 mod p(x)` << 1 */ |
1685 | ++ { 0x000000017a7e3bc8, 0x0000000039d15e90 }, |
1686 | ++ /* x^93184 mod p(x)` << 1, x^93248 mod p(x)` << 1 */ |
1687 | ++ { 0x00000000d73975da, 0x00000000312d6074 }, |
1688 | ++ /* x^92160 mod p(x)` << 1, x^92224 mod p(x)` << 1 */ |
1689 | ++ { 0x000000017375d038, 0x000000017bbb2cc4 }, |
1690 | ++ /* x^91136 mod p(x)` << 1, x^91200 mod p(x)` << 1 */ |
1691 | ++ { 0x00000000193680bc, 0x000000016ded3e18 }, |
1692 | ++ /* x^90112 mod p(x)` << 1, x^90176 mod p(x)` << 1 */ |
1693 | ++ { 0x00000000999b06f6, 0x00000000f1638b16 }, |
1694 | ++ /* x^89088 mod p(x)` << 1, x^89152 mod p(x)` << 1 */ |
1695 | ++ { 0x00000001f685d2b8, 0x00000001d38b9ecc }, |
1696 | ++ /* x^88064 mod p(x)` << 1, x^88128 mod p(x)` << 1 */ |
1697 | ++ { 0x00000001f4ecbed2, 0x000000018b8d09dc }, |
1698 | ++ /* x^87040 mod p(x)` << 1, x^87104 mod p(x)` << 1 */ |
1699 | ++ { 0x00000000ba16f1a0, 0x00000000e7bc27d2 }, |
1700 | ++ /* x^86016 mod p(x)` << 1, x^86080 mod p(x)` << 1 */ |
1701 | ++ { 0x0000000115aceac4, 0x00000000275e1e96 }, |
1702 | ++ /* x^84992 mod p(x)` << 1, x^85056 mod p(x)` << 1 */ |
1703 | ++ { 0x00000001aeff6292, 0x00000000e2e3031e }, |
1704 | ++ /* x^83968 mod p(x)` << 1, x^84032 mod p(x)` << 1 */ |
1705 | ++ { 0x000000009640124c, 0x00000001041c84d8 }, |
1706 | ++ /* x^82944 mod p(x)` << 1, x^83008 mod p(x)` << 1 */ |
1707 | ++ { 0x0000000114f41f02, 0x00000000706ce672 }, |
1708 | ++ /* x^81920 mod p(x)` << 1, x^81984 mod p(x)` << 1 */ |
1709 | ++ { 0x000000009c5f3586, 0x000000015d5070da }, |
1710 | ++ /* x^80896 mod p(x)` << 1, x^80960 mod p(x)` << 1 */ |
1711 | ++ { 0x00000001878275fa, 0x0000000038f9493a }, |
1712 | ++ /* x^79872 mod p(x)` << 1, x^79936 mod p(x)` << 1 */ |
1713 | ++ { 0x00000000ddc42ce8, 0x00000000a3348a76 }, |
1714 | ++ /* x^78848 mod p(x)` << 1, x^78912 mod p(x)` << 1 */ |
1715 | ++ { 0x0000000181d2c73a, 0x00000001ad0aab92 }, |
1716 | ++ /* x^77824 mod p(x)` << 1, x^77888 mod p(x)` << 1 */ |
1717 | ++ { 0x0000000141c9320a, 0x000000019e85f712 }, |
1718 | ++ /* x^76800 mod p(x)` << 1, x^76864 mod p(x)` << 1 */ |
1719 | ++ { 0x000000015235719a, 0x000000005a871e76 }, |
1720 | ++ /* x^75776 mod p(x)` << 1, x^75840 mod p(x)` << 1 */ |
1721 | ++ { 0x00000000be27d804, 0x000000017249c662 }, |
1722 | ++ /* x^74752 mod p(x)` << 1, x^74816 mod p(x)` << 1 */ |
1723 | ++ { 0x000000006242d45a, 0x000000003a084712 }, |
1724 | ++ /* x^73728 mod p(x)` << 1, x^73792 mod p(x)` << 1 */ |
1725 | ++ { 0x000000009a53638e, 0x00000000ed438478 }, |
1726 | ++ /* x^72704 mod p(x)` << 1, x^72768 mod p(x)` << 1 */ |
1727 | ++ { 0x00000001001ecfb6, 0x00000000abac34cc }, |
1728 | ++ /* x^71680 mod p(x)` << 1, x^71744 mod p(x)` << 1 */ |
1729 | ++ { 0x000000016d7c2d64, 0x000000005f35ef3e }, |
1730 | ++ /* x^70656 mod p(x)` << 1, x^70720 mod p(x)` << 1 */ |
1731 | ++ { 0x00000001d0ce46c0, 0x0000000047d6608c }, |
1732 | ++ /* x^69632 mod p(x)` << 1, x^69696 mod p(x)` << 1 */ |
1733 | ++ { 0x0000000124c907b4, 0x000000002d01470e }, |
1734 | ++ /* x^68608 mod p(x)` << 1, x^68672 mod p(x)` << 1 */ |
1735 | ++ { 0x0000000018a555ca, 0x0000000158bbc7b0 }, |
1736 | ++ /* x^67584 mod p(x)` << 1, x^67648 mod p(x)` << 1 */ |
1737 | ++ { 0x000000006b0980bc, 0x00000000c0a23e8e }, |
1738 | ++ /* x^66560 mod p(x)` << 1, x^66624 mod p(x)` << 1 */ |
1739 | ++ { 0x000000008bbba964, 0x00000001ebd85c88 }, |
1740 | ++ /* x^65536 mod p(x)` << 1, x^65600 mod p(x)` << 1 */ |
1741 | ++ { 0x00000001070a5a1e, 0x000000019ee20bb2 }, |
1742 | ++ /* x^64512 mod p(x)` << 1, x^64576 mod p(x)` << 1 */ |
1743 | ++ { 0x000000002204322a, 0x00000001acabf2d6 }, |
1744 | ++ /* x^63488 mod p(x)` << 1, x^63552 mod p(x)` << 1 */ |
1745 | ++ { 0x00000000a27524d0, 0x00000001b7963d56 }, |
1746 | ++ /* x^62464 mod p(x)` << 1, x^62528 mod p(x)` << 1 */ |
1747 | ++ { 0x0000000020b1e4ba, 0x000000017bffa1fe }, |
1748 | ++ /* x^61440 mod p(x)` << 1, x^61504 mod p(x)` << 1 */ |
1749 | ++ { 0x0000000032cc27fc, 0x000000001f15333e }, |
1750 | ++ /* x^60416 mod p(x)` << 1, x^60480 mod p(x)` << 1 */ |
1751 | ++ { 0x0000000044dd22b8, 0x000000018593129e }, |
1752 | ++ /* x^59392 mod p(x)` << 1, x^59456 mod p(x)` << 1 */ |
1753 | ++ { 0x00000000dffc9e0a, 0x000000019cb32602 }, |
1754 | ++ /* x^58368 mod p(x)` << 1, x^58432 mod p(x)` << 1 */ |
1755 | ++ { 0x00000001b7a0ed14, 0x0000000142b05cc8 }, |
1756 | ++ /* x^57344 mod p(x)` << 1, x^57408 mod p(x)` << 1 */ |
1757 | ++ { 0x00000000c7842488, 0x00000001be49e7a4 }, |
1758 | ++ /* x^56320 mod p(x)` << 1, x^56384 mod p(x)` << 1 */ |
1759 | ++ { 0x00000001c02a4fee, 0x0000000108f69d6c }, |
1760 | ++ /* x^55296 mod p(x)` << 1, x^55360 mod p(x)` << 1 */ |
1761 | ++ { 0x000000003c273778, 0x000000006c0971f0 }, |
1762 | ++ /* x^54272 mod p(x)` << 1, x^54336 mod p(x)` << 1 */ |
1763 | ++ { 0x00000001d63f8894, 0x000000005b16467a }, |
1764 | ++ /* x^53248 mod p(x)` << 1, x^53312 mod p(x)` << 1 */ |
1765 | ++ { 0x000000006be557d6, 0x00000001551a628e }, |
1766 | ++ /* x^52224 mod p(x)` << 1, x^52288 mod p(x)` << 1 */ |
1767 | ++ { 0x000000006a7806ea, 0x000000019e42ea92 }, |
1768 | ++ /* x^51200 mod p(x)` << 1, x^51264 mod p(x)` << 1 */ |
1769 | ++ { 0x000000016155aa0c, 0x000000012fa83ff2 }, |
1770 | ++ /* x^50176 mod p(x)` << 1, x^50240 mod p(x)` << 1 */ |
1771 | ++ { 0x00000000908650ac, 0x000000011ca9cde0 }, |
1772 | ++ /* x^49152 mod p(x)` << 1, x^49216 mod p(x)` << 1 */ |
1773 | ++ { 0x00000000aa5a8084, 0x00000000c8e5cd74 }, |
1774 | ++ /* x^48128 mod p(x)` << 1, x^48192 mod p(x)` << 1 */ |
1775 | ++ { 0x0000000191bb500a, 0x0000000096c27f0c }, |
1776 | ++ /* x^47104 mod p(x)` << 1, x^47168 mod p(x)` << 1 */ |
1777 | ++ { 0x0000000064e9bed0, 0x000000002baed926 }, |
1778 | ++ /* x^46080 mod p(x)` << 1, x^46144 mod p(x)` << 1 */ |
1779 | ++ { 0x000000009444f302, 0x000000017c8de8d2 }, |
1780 | ++ /* x^45056 mod p(x)` << 1, x^45120 mod p(x)` << 1 */ |
1781 | ++ { 0x000000019db07d3c, 0x00000000d43d6068 }, |
1782 | ++ /* x^44032 mod p(x)` << 1, x^44096 mod p(x)` << 1 */ |
1783 | ++ { 0x00000001359e3e6e, 0x00000000cb2c4b26 }, |
1784 | ++ /* x^43008 mod p(x)` << 1, x^43072 mod p(x)` << 1 */ |
1785 | ++ { 0x00000001e4f10dd2, 0x0000000145b8da26 }, |
1786 | ++ /* x^41984 mod p(x)` << 1, x^42048 mod p(x)` << 1 */ |
1787 | ++ { 0x0000000124f5735e, 0x000000018fff4b08 }, |
1788 | ++ /* x^40960 mod p(x)` << 1, x^41024 mod p(x)` << 1 */ |
1789 | ++ { 0x0000000124760a4c, 0x0000000150b58ed0 }, |
1790 | ++ /* x^39936 mod p(x)` << 1, x^40000 mod p(x)` << 1 */ |
1791 | ++ { 0x000000000f1fc186, 0x00000001549f39bc }, |
1792 | ++ /* x^38912 mod p(x)` << 1, x^38976 mod p(x)` << 1 */ |
1793 | ++ { 0x00000000150e4cc4, 0x00000000ef4d2f42 }, |
1794 | ++ /* x^37888 mod p(x)` << 1, x^37952 mod p(x)` << 1 */ |
1795 | ++ { 0x000000002a6204e8, 0x00000001b1468572 }, |
1796 | ++ /* x^36864 mod p(x)` << 1, x^36928 mod p(x)` << 1 */ |
1797 | ++ { 0x00000000beb1d432, 0x000000013d7403b2 }, |
1798 | ++ /* x^35840 mod p(x)` << 1, x^35904 mod p(x)` << 1 */ |
1799 | ++ { 0x0000000135f3f1f0, 0x00000001a4681842 }, |
1800 | ++ /* x^34816 mod p(x)` << 1, x^34880 mod p(x)` << 1 */ |
1801 | ++ { 0x0000000074fe2232, 0x0000000167714492 }, |
1802 | ++ /* x^33792 mod p(x)` << 1, x^33856 mod p(x)` << 1 */ |
1803 | ++ { 0x000000001ac6e2ba, 0x00000001e599099a }, |
1804 | ++ /* x^32768 mod p(x)` << 1, x^32832 mod p(x)` << 1 */ |
1805 | ++ { 0x0000000013fca91e, 0x00000000fe128194 }, |
1806 | ++ /* x^31744 mod p(x)` << 1, x^31808 mod p(x)` << 1 */ |
1807 | ++ { 0x0000000183f4931e, 0x0000000077e8b990 }, |
1808 | ++ /* x^30720 mod p(x)` << 1, x^30784 mod p(x)` << 1 */ |
1809 | ++ { 0x00000000b6d9b4e4, 0x00000001a267f63a }, |
1810 | ++ /* x^29696 mod p(x)` << 1, x^29760 mod p(x)` << 1 */ |
1811 | ++ { 0x00000000b5188656, 0x00000001945c245a }, |
1812 | ++ /* x^28672 mod p(x)` << 1, x^28736 mod p(x)` << 1 */ |
1813 | ++ { 0x0000000027a81a84, 0x0000000149002e76 }, |
1814 | ++ /* x^27648 mod p(x)` << 1, x^27712 mod p(x)` << 1 */ |
1815 | ++ { 0x0000000125699258, 0x00000001bb8310a4 }, |
1816 | ++ /* x^26624 mod p(x)` << 1, x^26688 mod p(x)` << 1 */ |
1817 | ++ { 0x00000001b23de796, 0x000000019ec60bcc }, |
1818 | ++ /* x^25600 mod p(x)` << 1, x^25664 mod p(x)` << 1 */ |
1819 | ++ { 0x00000000fe4365dc, 0x000000012d8590ae }, |
1820 | ++ /* x^24576 mod p(x)` << 1, x^24640 mod p(x)` << 1 */ |
1821 | ++ { 0x00000000c68f497a, 0x0000000065b00684 }, |
1822 | ++ /* x^23552 mod p(x)` << 1, x^23616 mod p(x)` << 1 */ |
1823 | ++ { 0x00000000fbf521ee, 0x000000015e5aeadc }, |
1824 | ++ /* x^22528 mod p(x)` << 1, x^22592 mod p(x)` << 1 */ |
1825 | ++ { 0x000000015eac3378, 0x00000000b77ff2b0 }, |
1826 | ++ /* x^21504 mod p(x)` << 1, x^21568 mod p(x)` << 1 */ |
1827 | ++ { 0x0000000134914b90, 0x0000000188da2ff6 }, |
1828 | ++ /* x^20480 mod p(x)` << 1, x^20544 mod p(x)` << 1 */ |
1829 | ++ { 0x0000000016335cfe, 0x0000000063da929a }, |
1830 | ++ /* x^19456 mod p(x)` << 1, x^19520 mod p(x)` << 1 */ |
1831 | ++ { 0x000000010372d10c, 0x00000001389caa80 }, |
1832 | ++ /* x^18432 mod p(x)` << 1, x^18496 mod p(x)` << 1 */ |
1833 | ++ { 0x000000015097b908, 0x000000013db599d2 }, |
1834 | ++ /* x^17408 mod p(x)` << 1, x^17472 mod p(x)` << 1 */ |
1835 | ++ { 0x00000001227a7572, 0x0000000122505a86 }, |
1836 | ++ /* x^16384 mod p(x)` << 1, x^16448 mod p(x)` << 1 */ |
1837 | ++ { 0x000000009a8f75c0, 0x000000016bd72746 }, |
1838 | ++ /* x^15360 mod p(x)` << 1, x^15424 mod p(x)` << 1 */ |
1839 | ++ { 0x00000000682c77a2, 0x00000001c3faf1d4 }, |
1840 | ++ /* x^14336 mod p(x)` << 1, x^14400 mod p(x)` << 1 */ |
1841 | ++ { 0x00000000231f091c, 0x00000001111c826c }, |
1842 | ++ /* x^13312 mod p(x)` << 1, x^13376 mod p(x)` << 1 */ |
1843 | ++ { 0x000000007d4439f2, 0x00000000153e9fb2 }, |
1844 | ++ /* x^12288 mod p(x)` << 1, x^12352 mod p(x)` << 1 */ |
1845 | ++ { 0x000000017e221efc, 0x000000002b1f7b60 }, |
1846 | ++ /* x^11264 mod p(x)` << 1, x^11328 mod p(x)` << 1 */ |
1847 | ++ { 0x0000000167457c38, 0x00000000b1dba570 }, |
1848 | ++ /* x^10240 mod p(x)` << 1, x^10304 mod p(x)` << 1 */ |
1849 | ++ { 0x00000000bdf081c4, 0x00000001f6397b76 }, |
1850 | ++ /* x^9216 mod p(x)` << 1, x^9280 mod p(x)` << 1 */ |
1851 | ++ { 0x000000016286d6b0, 0x0000000156335214 }, |
1852 | ++ /* x^8192 mod p(x)` << 1, x^8256 mod p(x)` << 1 */ |
1853 | ++ { 0x00000000c84f001c, 0x00000001d70e3986 }, |
1854 | ++ /* x^7168 mod p(x)` << 1, x^7232 mod p(x)` << 1 */ |
1855 | ++ { 0x0000000064efe7c0, 0x000000003701a774 }, |
1856 | ++ /* x^6144 mod p(x)` << 1, x^6208 mod p(x)` << 1 */ |
1857 | ++ { 0x000000000ac2d904, 0x00000000ac81ef72 }, |
1858 | ++ /* x^5120 mod p(x)` << 1, x^5184 mod p(x)` << 1 */ |
1859 | ++ { 0x00000000fd226d14, 0x0000000133212464 }, |
1860 | ++ /* x^4096 mod p(x)` << 1, x^4160 mod p(x)` << 1 */ |
1861 | ++ { 0x000000011cfd42e0, 0x00000000e4e45610 }, |
1862 | ++ /* x^3072 mod p(x)` << 1, x^3136 mod p(x)` << 1 */ |
1863 | ++ { 0x000000016e5a5678, 0x000000000c1bd370 }, |
1864 | ++ /* x^2048 mod p(x)` << 1, x^2112 mod p(x)` << 1 */ |
1865 | ++ { 0x00000001d888fe22, 0x00000001a7b9e7a6 }, |
1866 | ++ /* x^1024 mod p(x)` << 1, x^1088 mod p(x)` << 1 */ |
1867 | ++ { 0x00000001af77fcd4, 0x000000007d657a10 } |
1868 | ++#endif /* __LITTLE_ENDIAN__ */ |
1869 | ++ }; |
1870 | ++ |
1871 | ++/* Reduce final 1024-2048 bits to 64 bits, shifting 32 bits to include the trailing 32 bits of zeros */ |
1872 | ++ |
1873 | ++static const __vector unsigned long long vcrc_short_const[16] |
1874 | ++ __attribute__((aligned (16))) = { |
1875 | ++#ifdef __LITTLE_ENDIAN__ |
1876 | ++ /* x^1952 mod p(x) , x^1984 mod p(x) , x^2016 mod p(x) , x^2048 mod p(x) */ |
1877 | ++ { 0x99168a18ec447f11, 0xed837b2613e8221e }, |
1878 | ++ /* x^1824 mod p(x) , x^1856 mod p(x) , x^1888 mod p(x) , x^1920 mod p(x) */ |
1879 | ++ { 0xe23e954e8fd2cd3c, 0xc8acdd8147b9ce5a }, |
1880 | ++ /* x^1696 mod p(x) , x^1728 mod p(x) , x^1760 mod p(x) , x^1792 mod p(x) */ |
1881 | ++ { 0x92f8befe6b1d2b53, 0xd9ad6d87d4277e25 }, |
1882 | ++ /* x^1568 mod p(x) , x^1600 mod p(x) , x^1632 mod p(x) , x^1664 mod p(x) */ |
1883 | ++ { 0xf38a3556291ea462, 0xc10ec5e033fbca3b }, |
1884 | ++ /* x^1440 mod p(x) , x^1472 mod p(x) , x^1504 mod p(x) , x^1536 mod p(x) */ |
1885 | ++ { 0x974ac56262b6ca4b, 0xc0b55b0e82e02e2f }, |
1886 | ++ /* x^1312 mod p(x) , x^1344 mod p(x) , x^1376 mod p(x) , x^1408 mod p(x) */ |
1887 | ++ { 0x855712b3784d2a56, 0x71aa1df0e172334d }, |
1888 | ++ /* x^1184 mod p(x) , x^1216 mod p(x) , x^1248 mod p(x) , x^1280 mod p(x) */ |
1889 | ++ { 0xa5abe9f80eaee722, 0xfee3053e3969324d }, |
1890 | ++ /* x^1056 mod p(x) , x^1088 mod p(x) , x^1120 mod p(x) , x^1152 mod p(x) */ |
1891 | ++ { 0x1fa0943ddb54814c, 0xf44779b93eb2bd08 }, |
1892 | ++ /* x^928 mod p(x) , x^960 mod p(x) , x^992 mod p(x) , x^1024 mod p(x) */ |
1893 | ++ { 0xa53ff440d7bbfe6a, 0xf5449b3f00cc3374 }, |
1894 | ++ /* x^800 mod p(x) , x^832 mod p(x) , x^864 mod p(x) , x^896 mod p(x) */ |
1895 | ++ { 0xebe7e3566325605c, 0x6f8346e1d777606e }, |
1896 | ++ /* x^672 mod p(x) , x^704 mod p(x) , x^736 mod p(x) , x^768 mod p(x) */ |
1897 | ++ { 0xc65a272ce5b592b8, 0xe3ab4f2ac0b95347 }, |
1898 | ++ /* x^544 mod p(x) , x^576 mod p(x) , x^608 mod p(x) , x^640 mod p(x) */ |
1899 | ++ { 0x5705a9ca4721589f, 0xaa2215ea329ecc11 }, |
1900 | ++ /* x^416 mod p(x) , x^448 mod p(x) , x^480 mod p(x) , x^512 mod p(x) */ |
1901 | ++ { 0xe3720acb88d14467, 0x1ed8f66ed95efd26 }, |
1902 | ++ /* x^288 mod p(x) , x^320 mod p(x) , x^352 mod p(x) , x^384 mod p(x) */ |
1903 | ++ { 0xba1aca0315141c31, 0x78ed02d5a700e96a }, |
1904 | ++ /* x^160 mod p(x) , x^192 mod p(x) , x^224 mod p(x) , x^256 mod p(x) */ |
1905 | ++ { 0xad2a31b3ed627dae, 0xba8ccbe832b39da3 }, |
1906 | ++ /* x^32 mod p(x) , x^64 mod p(x) , x^96 mod p(x) , x^128 mod p(x) */ |
1907 | ++ { 0x6655004fa06a2517, 0xedb88320b1e6b092 } |
1908 | ++#else /* __LITTLE_ENDIAN__ */ |
1909 | ++ /* x^1952 mod p(x) , x^1984 mod p(x) , x^2016 mod p(x) , x^2048 mod p(x) */ |
1910 | ++ { 0xed837b2613e8221e, 0x99168a18ec447f11 }, |
1911 | ++ /* x^1824 mod p(x) , x^1856 mod p(x) , x^1888 mod p(x) , x^1920 mod p(x) */ |
1912 | ++ { 0xc8acdd8147b9ce5a, 0xe23e954e8fd2cd3c }, |
1913 | ++ /* x^1696 mod p(x) , x^1728 mod p(x) , x^1760 mod p(x) , x^1792 mod p(x) */ |
1914 | ++ { 0xd9ad6d87d4277e25, 0x92f8befe6b1d2b53 }, |
1915 | ++ /* x^1568 mod p(x) , x^1600 mod p(x) , x^1632 mod p(x) , x^1664 mod p(x) */ |
1916 | ++ { 0xc10ec5e033fbca3b, 0xf38a3556291ea462 }, |
1917 | ++ /* x^1440 mod p(x) , x^1472 mod p(x) , x^1504 mod p(x) , x^1536 mod p(x) */ |
1918 | ++ { 0xc0b55b0e82e02e2f, 0x974ac56262b6ca4b }, |
1919 | ++ /* x^1312 mod p(x) , x^1344 mod p(x) , x^1376 mod p(x) , x^1408 mod p(x) */ |
1920 | ++ { 0x71aa1df0e172334d, 0x855712b3784d2a56 }, |
1921 | ++ /* x^1184 mod p(x) , x^1216 mod p(x) , x^1248 mod p(x) , x^1280 mod p(x) */ |
1922 | ++ { 0xfee3053e3969324d, 0xa5abe9f80eaee722 }, |
1923 | ++ /* x^1056 mod p(x) , x^1088 mod p(x) , x^1120 mod p(x) , x^1152 mod p(x) */ |
1924 | ++ { 0xf44779b93eb2bd08, 0x1fa0943ddb54814c }, |
1925 | ++ /* x^928 mod p(x) , x^960 mod p(x) , x^992 mod p(x) , x^1024 mod p(x) */ |
1926 | ++ { 0xf5449b3f00cc3374, 0xa53ff440d7bbfe6a }, |
1927 | ++ /* x^800 mod p(x) , x^832 mod p(x) , x^864 mod p(x) , x^896 mod p(x) */ |
1928 | ++ { 0x6f8346e1d777606e, 0xebe7e3566325605c }, |
1929 | ++ /* x^672 mod p(x) , x^704 mod p(x) , x^736 mod p(x) , x^768 mod p(x) */ |
1930 | ++ { 0xe3ab4f2ac0b95347, 0xc65a272ce5b592b8 }, |
1931 | ++ /* x^544 mod p(x) , x^576 mod p(x) , x^608 mod p(x) , x^640 mod p(x) */ |
1932 | ++ { 0xaa2215ea329ecc11, 0x5705a9ca4721589f }, |
1933 | ++ /* x^416 mod p(x) , x^448 mod p(x) , x^480 mod p(x) , x^512 mod p(x) */ |
1934 | ++ { 0x1ed8f66ed95efd26, 0xe3720acb88d14467 }, |
1935 | ++ /* x^288 mod p(x) , x^320 mod p(x) , x^352 mod p(x) , x^384 mod p(x) */ |
1936 | ++ { 0x78ed02d5a700e96a, 0xba1aca0315141c31 }, |
1937 | ++ /* x^160 mod p(x) , x^192 mod p(x) , x^224 mod p(x) , x^256 mod p(x) */ |
1938 | ++ { 0xba8ccbe832b39da3, 0xad2a31b3ed627dae }, |
1939 | ++ /* x^32 mod p(x) , x^64 mod p(x) , x^96 mod p(x) , x^128 mod p(x) */ |
1940 | ++ { 0xedb88320b1e6b092, 0x6655004fa06a2517 } |
1941 | ++#endif /* __LITTLE_ENDIAN__ */ |
1942 | ++ }; |
1943 | ++ |
1944 | ++/* Barrett constants */ |
1945 | ++/* 33 bit reflected Barrett constant m - (4^32)/n */ |
1946 | ++ |
1947 | ++static const __vector unsigned long long v_Barrett_const[2] |
1948 | ++ __attribute__((aligned (16))) = { |
1949 | ++ /* x^64 div p(x) */ |
1950 | ++#ifdef __LITTLE_ENDIAN__ |
1951 | ++ { 0x00000001f7011641, 0x0000000000000000 }, |
1952 | ++ { 0x00000001db710641, 0x0000000000000000 } |
1953 | ++#else /* __LITTLE_ENDIAN__ */ |
1954 | ++ { 0x0000000000000000, 0x00000001f7011641 }, |
1955 | ++ { 0x0000000000000000, 0x00000001db710641 } |
1956 | ++#endif /* __LITTLE_ENDIAN__ */ |
1957 | ++ }; |
1958 | ++#endif /* POWER8_INTRINSICS */ |
1959 | ++ |
1960 | ++#endif /* __ASSEMBLER__ */ |
1961 | +diff --git a/contrib/power/crc32_z_power8.c b/contrib/power/crc32_z_power8.c |
1962 | +new file mode 100644 |
1963 | +index 0000000..7858cfe |
1964 | +--- /dev/null |
1965 | ++++ b/contrib/power/crc32_z_power8.c |
1966 | +@@ -0,0 +1,679 @@ |
1967 | ++/* |
1968 | ++ * Calculate the checksum of data that is 16 byte aligned and a multiple of |
1969 | ++ * 16 bytes. |
1970 | ++ * |
1971 | ++ * The first step is to reduce it to 1024 bits. We do this in 8 parallel |
1972 | ++ * chunks in order to mask the latency of the vpmsum instructions. If we |
1973 | ++ * have more than 32 kB of data to checksum we repeat this step multiple |
1974 | ++ * times, passing in the previous 1024 bits. |
1975 | ++ * |
1976 | ++ * The next step is to reduce the 1024 bits to 64 bits. This step adds |
1977 | ++ * 32 bits of 0s to the end - this matches what a CRC does. We just |
1978 | ++ * calculate constants that land the data in these 32 bits. |
1979 | ++ * |
1980 | ++ * We then use fixed point Barrett reduction to compute a mod n over GF(2) |
1981 | ++ * for n = CRC using POWER8 instructions. We use x = 32. |
1982 | ++ * |
1983 | ++ * http://en.wikipedia.org/wiki/Barrett_reduction |
1984 | ++ * |
1985 | ++ * This code uses gcc vector builtins instead of using assembly directly. |
1986 | ++ * |
1987 | ++ * Copyright (C) 2017 Rogerio Alves <rogealve@br.ibm.com>, IBM |
1988 | ++ * |
1989 | ++ * This program is free software; you can redistribute it and/or |
1990 | ++ * modify it under the terms of either: |
1991 | ++ * |
1992 | ++ * a) the GNU General Public License as published by the Free Software |
1993 | ++ * Foundation; either version 2 of the License, or (at your option) |
1994 | ++ * any later version, or |
1995 | ++ * b) the Apache License, Version 2.0 |
1996 | ++ */ |
1997 | ++ |
1998 | ++#include <altivec.h> |
1999 | ++#include "../../zutil.h" |
2000 | ++#include "power.h" |
2001 | ++ |
2002 | ++#define POWER8_INTRINSICS |
2003 | ++#define CRC_TABLE |
2004 | ++ |
2005 | ++#ifdef CRC32_CONSTANTS_HEADER |
2006 | ++#include CRC32_CONSTANTS_HEADER |
2007 | ++#else |
2008 | ++#include "crc32_constants.h" |
2009 | ++#endif |
2010 | ++ |
2011 | ++#define VMX_ALIGN 16 |
2012 | ++#define VMX_ALIGN_MASK (VMX_ALIGN-1) |
2013 | ++ |
2014 | ++#ifdef REFLECT |
2015 | ++static unsigned int crc32_align(unsigned int crc, const unsigned char *p, |
2016 | ++ unsigned long len) |
2017 | ++{ |
2018 | ++ while (len--) |
2019 | ++ crc = crc_table[(crc ^ *p++) & 0xff] ^ (crc >> 8); |
2020 | ++ return crc; |
2021 | ++} |
2022 | ++#else |
2023 | ++static unsigned int crc32_align(unsigned int crc, const unsigned char *p, |
2024 | ++ unsigned long len) |
2025 | ++{ |
2026 | ++ while (len--) |
2027 | ++ crc = crc_table[((crc >> 24) ^ *p++) & 0xff] ^ (crc << 8); |
2028 | ++ return crc; |
2029 | ++} |
2030 | ++#endif |
2031 | ++ |
2032 | ++static unsigned int __attribute__ ((aligned (32))) |
2033 | ++__crc32_vpmsum(unsigned int crc, const void* p, unsigned long len); |
2034 | ++ |
2035 | ++unsigned long ZLIB_INTERNAL _crc32_z_power8(uLong _crc, const Bytef *_p, |
2036 | ++ z_size_t _len) |
2037 | ++{ |
2038 | ++ unsigned int prealign; |
2039 | ++ unsigned int tail; |
2040 | ++ |
2041 | ++ /* Map zlib API to crc32_vpmsum API */ |
2042 | ++ unsigned int crc = (unsigned int) (0xffffffff & _crc); |
2043 | ++ const unsigned char *p = _p; |
2044 | ++ unsigned long len = (unsigned long) _len; |
2045 | ++ |
2046 | ++ if (p == (const unsigned char *) 0x0) return 0; |
2047 | ++#ifdef CRC_XOR |
2048 | ++ crc ^= 0xffffffff; |
2049 | ++#endif |
2050 | ++ |
2051 | ++ if (len < VMX_ALIGN + VMX_ALIGN_MASK) { |
2052 | ++ crc = crc32_align(crc, p, len); |
2053 | ++ goto out; |
2054 | ++ } |
2055 | ++ |
2056 | ++ if ((unsigned long)p & VMX_ALIGN_MASK) { |
2057 | ++ prealign = VMX_ALIGN - ((unsigned long)p & VMX_ALIGN_MASK); |
2058 | ++ crc = crc32_align(crc, p, prealign); |
2059 | ++ len -= prealign; |
2060 | ++ p += prealign; |
2061 | ++ } |
2062 | ++ |
2063 | ++ crc = __crc32_vpmsum(crc, p, len & ~VMX_ALIGN_MASK); |
2064 | ++ |
2065 | ++ tail = len & VMX_ALIGN_MASK; |
2066 | ++ if (tail) { |
2067 | ++ p += len & ~VMX_ALIGN_MASK; |
2068 | ++ crc = crc32_align(crc, p, tail); |
2069 | ++ } |
2070 | ++ |
2071 | ++out: |
2072 | ++#ifdef CRC_XOR |
2073 | ++ crc ^= 0xffffffff; |
2074 | ++#endif |
2075 | ++ |
2076 | ++ /* Convert to zlib API */ |
2077 | ++ return (unsigned long) crc; |
2078 | ++} |
2079 | ++ |
2080 | ++#if defined (__clang__) |
2081 | ++#include "clang_workaround.h" |
2082 | ++#else |
2083 | ++#define __builtin_pack_vector(a, b) __builtin_pack_vector_int128 ((a), (b)) |
2084 | ++#define __builtin_unpack_vector_0(a) __builtin_unpack_vector_int128 ((vector __int128_t)(a), 0) |
2085 | ++#define __builtin_unpack_vector_1(a) __builtin_unpack_vector_int128 ((vector __int128_t)(a), 1) |
2086 | ++#endif |
2087 | ++ |
2088 | ++/* When we have a load-store in a single-dispatch group and address overlap |
2089 | ++ * such that forwarding is not allowed (load-hit-store) the group must be flushed. |
2090 | ++ * A group ending NOP prevents the flush. |
2091 | ++ */ |
2092 | ++#define GROUP_ENDING_NOP asm("ori 2,2,0" ::: "memory") |
2093 | ++ |
2094 | ++#if defined(__BIG_ENDIAN__) && defined (REFLECT) |
2095 | ++#define BYTESWAP_DATA |
2096 | ++#elif defined(__LITTLE_ENDIAN__) && !defined(REFLECT) |
2097 | ++#define BYTESWAP_DATA |
2098 | ++#endif |
2099 | ++ |
2100 | ++#ifdef BYTESWAP_DATA |
2101 | ++#define VEC_PERM(vr, va, vb, vc) vr = vec_perm(va, vb,\ |
2102 | ++ (__vector unsigned char) vc) |
2103 | ++#if defined(__LITTLE_ENDIAN__) |
2104 | ++/* Byte reverse permute constant LE. */ |
2105 | ++static const __vector unsigned long long vperm_const |
2106 | ++ __attribute__ ((aligned(16))) = { 0x08090A0B0C0D0E0FUL, |
2107 | ++ 0x0001020304050607UL }; |
2108 | ++#else |
2109 | ++static const __vector unsigned long long vperm_const |
2110 | ++ __attribute__ ((aligned(16))) = { 0x0F0E0D0C0B0A0908UL, |
2111 | ++ 0X0706050403020100UL }; |
2112 | ++#endif |
2113 | ++#else |
2114 | ++#define VEC_PERM(vr, va, vb, vc) |
2115 | ++#endif |
2116 | ++ |
2117 | ++static unsigned int __attribute__ ((aligned (32))) |
2118 | ++__crc32_vpmsum(unsigned int crc, const void* p, unsigned long len) { |
2119 | ++ |
2120 | ++ const __vector unsigned long long vzero = {0,0}; |
2121 | ++ const __vector unsigned long long vones = {0xffffffffffffffffUL, |
2122 | ++ 0xffffffffffffffffUL}; |
2123 | ++ |
2124 | ++#ifdef REFLECT |
2125 | ++ const __vector unsigned long long vmask_32bit = |
2126 | ++ (__vector unsigned long long)vec_sld((__vector unsigned char)vzero, |
2127 | ++ (__vector unsigned char)vones, 4); |
2128 | ++#endif |
2129 | ++ |
2130 | ++ const __vector unsigned long long vmask_64bit = |
2131 | ++ (__vector unsigned long long)vec_sld((__vector unsigned char)vzero, |
2132 | ++ (__vector unsigned char)vones, 8); |
2133 | ++ |
2134 | ++ __vector unsigned long long vcrc; |
2135 | ++ |
2136 | ++ __vector unsigned long long vconst1, vconst2; |
2137 | ++ |
2138 | ++ /* vdata0-vdata7 will contain our data (p). */ |
2139 | ++ __vector unsigned long long vdata0, vdata1, vdata2, vdata3, vdata4, |
2140 | ++ vdata5, vdata6, vdata7; |
2141 | ++ |
2142 | ++ /* v0-v7 will contain our checksums */ |
2143 | ++ __vector unsigned long long v0 = {0,0}; |
2144 | ++ __vector unsigned long long v1 = {0,0}; |
2145 | ++ __vector unsigned long long v2 = {0,0}; |
2146 | ++ __vector unsigned long long v3 = {0,0}; |
2147 | ++ __vector unsigned long long v4 = {0,0}; |
2148 | ++ __vector unsigned long long v5 = {0,0}; |
2149 | ++ __vector unsigned long long v6 = {0,0}; |
2150 | ++ __vector unsigned long long v7 = {0,0}; |
2151 | ++ |
2152 | ++ |
2153 | ++ /* Vector auxiliary variables. */ |
2154 | ++ __vector unsigned long long va0, va1, va2, va3, va4, va5, va6, va7; |
2155 | ++ |
2156 | ++ unsigned int result = 0; |
2157 | ++ unsigned int offset; /* Constant table offset. */ |
2158 | ++ |
2159 | ++ unsigned long i; /* Counter. */ |
2160 | ++ unsigned long chunks; |
2161 | ++ |
2162 | ++ unsigned long block_size; |
2163 | ++ int next_block = 0; |
2164 | ++ |
2165 | ++ /* Align to 128 bytes. The last 128-byte block will be processed at the end. */ |
2166 | ++ unsigned long length = len & 0xFFFFFFFFFFFFFF80UL; |
2167 | ++ |
2168 | ++#ifdef REFLECT |
2169 | ++ vcrc = (__vector unsigned long long)__builtin_pack_vector(0UL, crc); |
2170 | ++#else |
2171 | ++ vcrc = (__vector unsigned long long)__builtin_pack_vector(crc, 0UL); |
2172 | ++ |
2173 | ++ /* Shift into top 32 bits */ |
2174 | ++ vcrc = (__vector unsigned long long)vec_sld((__vector unsigned char)vcrc, |
2175 | ++ (__vector unsigned char)vzero, 4); |
2176 | ++#endif |
2177 | ++ |
2178 | ++ /* Short version. */ |
2179 | ++ if (len < 256) { |
2180 | ++ /* Calculate where in the constant table we need to start. */ |
2181 | ++ offset = 256 - len; |
2182 | ++ |
2183 | ++ vconst1 = vec_ld(offset, vcrc_short_const); |
2184 | ++ vdata0 = vec_ld(0, (__vector unsigned long long*) p); |
2185 | ++ VEC_PERM(vdata0, vdata0, vconst1, vperm_const); |
2186 | ++ |
2187 | ++ /* xor in initial value */ |
2188 | ++ vdata0 = vec_xor(vdata0, vcrc); |
2189 | ++ |
2190 | ++ vdata0 = (__vector unsigned long long) __builtin_crypto_vpmsumw |
2191 | ++ ((__vector unsigned int)vdata0, (__vector unsigned int)vconst1); |
2192 | ++ v0 = vec_xor(v0, vdata0); |
2193 | ++ |
2194 | ++ for (i = 16; i < len; i += 16) { |
2195 | ++ vconst1 = vec_ld(offset + i, vcrc_short_const); |
2196 | ++ vdata0 = vec_ld(i, (__vector unsigned long long*) p); |
2197 | ++ VEC_PERM(vdata0, vdata0, vconst1, vperm_const); |
2198 | ++ vdata0 = (__vector unsigned long long) __builtin_crypto_vpmsumw |
2199 | ++ ((__vector unsigned int)vdata0, (__vector unsigned int)vconst1); |
2200 | ++ v0 = vec_xor(v0, vdata0); |
2201 | ++ } |
2202 | ++ } else { |
2203 | ++ |
2204 | ++ /* Load initial values. */ |
2205 | ++ vdata0 = vec_ld(0, (__vector unsigned long long*) p); |
2206 | ++ vdata1 = vec_ld(16, (__vector unsigned long long*) p); |
2207 | ++ |
2208 | ++ VEC_PERM(vdata0, vdata0, vdata0, vperm_const); |
2209 | ++ VEC_PERM(vdata1, vdata1, vdata1, vperm_const); |
2210 | ++ |
2211 | ++ vdata2 = vec_ld(32, (__vector unsigned long long*) p); |
2212 | ++ vdata3 = vec_ld(48, (__vector unsigned long long*) p); |
2213 | ++ |
2214 | ++ VEC_PERM(vdata2, vdata2, vdata2, vperm_const); |
2215 | ++ VEC_PERM(vdata3, vdata3, vdata3, vperm_const); |
2216 | ++ |
2217 | ++ vdata4 = vec_ld(64, (__vector unsigned long long*) p); |
2218 | ++ vdata5 = vec_ld(80, (__vector unsigned long long*) p); |
2219 | ++ |
2220 | ++ VEC_PERM(vdata4, vdata4, vdata4, vperm_const); |
2221 | ++ VEC_PERM(vdata5, vdata5, vdata5, vperm_const); |
2222 | ++ |
2223 | ++ vdata6 = vec_ld(96, (__vector unsigned long long*) p); |
2224 | ++ vdata7 = vec_ld(112, (__vector unsigned long long*) p); |
2225 | ++ |
2226 | ++ VEC_PERM(vdata6, vdata6, vdata6, vperm_const); |
2227 | ++ VEC_PERM(vdata7, vdata7, vdata7, vperm_const); |
2228 | ++ |
2229 | ++ /* xor in initial value */ |
2230 | ++ vdata0 = vec_xor(vdata0, vcrc); |
2231 | ++ |
2232 | ++ p = (char *)p + 128; |
2233 | ++ |
2234 | ++ do { |
2235 | ++ /* Checksum in blocks of MAX_SIZE. */ |
2236 | ++ block_size = length; |
2237 | ++ if (block_size > MAX_SIZE) { |
2238 | ++ block_size = MAX_SIZE; |
2239 | ++ } |
2240 | ++ |
2241 | ++ length = length - block_size; |
2242 | ++ |
2243 | ++ /* |
2244 | ++ * Work out the offset into the constants table to start at. Each |
2245 | ++ * constant is 16 bytes, and it is used against 128 bytes of input |
2246 | ++ * data - 128 / 16 = 8 |
2247 | ++ */ |
2248 | ++ offset = (MAX_SIZE/8) - (block_size/8); |
2249 | ++ /* We reduce our final 128 bytes in a separate step */ |
2250 | ++ chunks = (block_size/128)-1; |
2251 | ++ |
2252 | ++ vconst1 = vec_ld(offset, vcrc_const); |
2253 | ++ |
2254 | ++ va0 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata0, |
2255 | ++ (__vector unsigned long long)vconst1); |
2256 | ++ va1 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata1, |
2257 | ++ (__vector unsigned long long)vconst1); |
2258 | ++ va2 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata2, |
2259 | ++ (__vector unsigned long long)vconst1); |
2260 | ++ va3 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata3, |
2261 | ++ (__vector unsigned long long)vconst1); |
2262 | ++ va4 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata4, |
2263 | ++ (__vector unsigned long long)vconst1); |
2264 | ++ va5 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata5, |
2265 | ++ (__vector unsigned long long)vconst1); |
2266 | ++ va6 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata6, |
2267 | ++ (__vector unsigned long long)vconst1); |
2268 | ++ va7 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata7, |
2269 | ++ (__vector unsigned long long)vconst1); |
2270 | ++ |
2271 | ++ if (chunks > 1) { |
2272 | ++ offset += 16; |
2273 | ++ vconst2 = vec_ld(offset, vcrc_const); |
2274 | ++ GROUP_ENDING_NOP; |
2275 | ++ |
2276 | ++ vdata0 = vec_ld(0, (__vector unsigned long long*) p); |
2277 | ++ VEC_PERM(vdata0, vdata0, vdata0, vperm_const); |
2278 | ++ |
2279 | ++ vdata1 = vec_ld(16, (__vector unsigned long long*) p); |
2280 | ++ VEC_PERM(vdata1, vdata1, vdata1, vperm_const); |
2281 | ++ |
2282 | ++ vdata2 = vec_ld(32, (__vector unsigned long long*) p); |
2283 | ++ VEC_PERM(vdata2, vdata2, vdata2, vperm_const); |
2284 | ++ |
2285 | ++ vdata3 = vec_ld(48, (__vector unsigned long long*) p); |
2286 | ++ VEC_PERM(vdata3, vdata3, vdata3, vperm_const); |
2287 | ++ |
2288 | ++ vdata4 = vec_ld(64, (__vector unsigned long long*) p); |
2289 | ++ VEC_PERM(vdata4, vdata4, vdata4, vperm_const); |
2290 | ++ |
2291 | ++ vdata5 = vec_ld(80, (__vector unsigned long long*) p); |
2292 | ++ VEC_PERM(vdata5, vdata5, vdata5, vperm_const); |
2293 | ++ |
2294 | ++ vdata6 = vec_ld(96, (__vector unsigned long long*) p); |
2295 | ++ VEC_PERM(vdata6, vdata6, vdata6, vperm_const); |
2296 | ++ |
2297 | ++ vdata7 = vec_ld(112, (__vector unsigned long long*) p); |
2298 | ++ VEC_PERM(vdata7, vdata7, vdata7, vperm_const); |
2299 | ++ |
2300 | ++ p = (char *)p + 128; |
2301 | ++ |
2302 | ++ /* |
2303 | ++ * main loop. We modulo schedule it such that it takes three |
2304 | ++ * iterations to complete - first iteration load, second |
2305 | ++ * iteration vpmsum, third iteration xor. |
2306 | ++ */ |
2307 | ++ for (i = 0; i < chunks-2; i++) { |
2308 | ++ vconst1 = vec_ld(offset, vcrc_const); |
2309 | ++ offset += 16; |
2310 | ++ GROUP_ENDING_NOP; |
2311 | ++ |
2312 | ++ v0 = vec_xor(v0, va0); |
2313 | ++ va0 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2314 | ++ long)vdata0, (__vector unsigned long long)vconst2); |
2315 | ++ vdata0 = vec_ld(0, (__vector unsigned long long*) p); |
2316 | ++ VEC_PERM(vdata0, vdata0, vdata0, vperm_const); |
2317 | ++ GROUP_ENDING_NOP; |
2318 | ++ |
2319 | ++ v1 = vec_xor(v1, va1); |
2320 | ++ va1 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2321 | ++ long)vdata1, (__vector unsigned long long)vconst2); |
2322 | ++ vdata1 = vec_ld(16, (__vector unsigned long long*) p); |
2323 | ++ VEC_PERM(vdata1, vdata1, vdata1, vperm_const); |
2324 | ++ GROUP_ENDING_NOP; |
2325 | ++ |
2326 | ++ v2 = vec_xor(v2, va2); |
2327 | ++ va2 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2328 | ++ long)vdata2, (__vector unsigned long long)vconst2); |
2329 | ++ vdata2 = vec_ld(32, (__vector unsigned long long*) p); |
2330 | ++ VEC_PERM(vdata2, vdata2, vdata2, vperm_const); |
2331 | ++ GROUP_ENDING_NOP; |
2332 | ++ |
2333 | ++ v3 = vec_xor(v3, va3); |
2334 | ++ va3 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2335 | ++ long)vdata3, (__vector unsigned long long)vconst2); |
2336 | ++ vdata3 = vec_ld(48, (__vector unsigned long long*) p); |
2337 | ++ VEC_PERM(vdata3, vdata3, vdata3, vperm_const); |
2338 | ++ |
2339 | ++ vconst2 = vec_ld(offset, vcrc_const); |
2340 | ++ GROUP_ENDING_NOP; |
2341 | ++ |
2342 | ++ v4 = vec_xor(v4, va4); |
2343 | ++ va4 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2344 | ++ long)vdata4, (__vector unsigned long long)vconst1); |
2345 | ++ vdata4 = vec_ld(64, (__vector unsigned long long*) p); |
2346 | ++ VEC_PERM(vdata4, vdata4, vdata4, vperm_const); |
2347 | ++ GROUP_ENDING_NOP; |
2348 | ++ |
2349 | ++ v5 = vec_xor(v5, va5); |
2350 | ++ va5 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2351 | ++ long)vdata5, (__vector unsigned long long)vconst1); |
2352 | ++ vdata5 = vec_ld(80, (__vector unsigned long long*) p); |
2353 | ++ VEC_PERM(vdata5, vdata5, vdata5, vperm_const); |
2354 | ++ GROUP_ENDING_NOP; |
2355 | ++ |
2356 | ++ v6 = vec_xor(v6, va6); |
2357 | ++ va6 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2358 | ++ long)vdata6, (__vector unsigned long long)vconst1); |
2359 | ++ vdata6 = vec_ld(96, (__vector unsigned long long*) p); |
2360 | ++ VEC_PERM(vdata6, vdata6, vdata6, vperm_const); |
2361 | ++ GROUP_ENDING_NOP; |
2362 | ++ |
2363 | ++ v7 = vec_xor(v7, va7); |
2364 | ++ va7 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2365 | ++ long)vdata7, (__vector unsigned long long)vconst1); |
2366 | ++ vdata7 = vec_ld(112, (__vector unsigned long long*) p); |
2367 | ++ VEC_PERM(vdata7, vdata7, vdata7, vperm_const); |
2368 | ++ |
2369 | ++ p = (char *)p + 128; |
2370 | ++ } |
2371 | ++ |
2372 | ++ /* First cool down. */ |
2373 | ++ vconst1 = vec_ld(offset, vcrc_const); |
2374 | ++ offset += 16; |
2375 | ++ |
2376 | ++ v0 = vec_xor(v0, va0); |
2377 | ++ va0 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2378 | ++ long)vdata0, (__vector unsigned long long)vconst1); |
2379 | ++ GROUP_ENDING_NOP; |
2380 | ++ |
2381 | ++ v1 = vec_xor(v1, va1); |
2382 | ++ va1 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2383 | ++ long)vdata1, (__vector unsigned long long)vconst1); |
2384 | ++ GROUP_ENDING_NOP; |
2385 | ++ |
2386 | ++ v2 = vec_xor(v2, va2); |
2387 | ++ va2 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2388 | ++ long)vdata2, (__vector unsigned long long)vconst1); |
2389 | ++ GROUP_ENDING_NOP; |
2390 | ++ |
2391 | ++ v3 = vec_xor(v3, va3); |
2392 | ++ va3 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2393 | ++ long)vdata3, (__vector unsigned long long)vconst1); |
2394 | ++ GROUP_ENDING_NOP; |
2395 | ++ |
2396 | ++ v4 = vec_xor(v4, va4); |
2397 | ++ va4 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2398 | ++ long)vdata4, (__vector unsigned long long)vconst1); |
2399 | ++ GROUP_ENDING_NOP; |
2400 | ++ |
2401 | ++ v5 = vec_xor(v5, va5); |
2402 | ++ va5 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2403 | ++ long)vdata5, (__vector unsigned long long)vconst1); |
2404 | ++ GROUP_ENDING_NOP; |
2405 | ++ |
2406 | ++ v6 = vec_xor(v6, va6); |
2407 | ++ va6 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2408 | ++ long)vdata6, (__vector unsigned long long)vconst1); |
2409 | ++ GROUP_ENDING_NOP; |
2410 | ++ |
2411 | ++ v7 = vec_xor(v7, va7); |
2412 | ++ va7 = __builtin_crypto_vpmsumd ((__vector unsigned long |
2413 | ++ long)vdata7, (__vector unsigned long long)vconst1); |
2414 | ++ } /* if (chunks > 1) */ |
2415 | ++ |
2416 | ++ /* Second cool down. */ |
2417 | ++ v0 = vec_xor(v0, va0); |
2418 | ++ v1 = vec_xor(v1, va1); |
2419 | ++ v2 = vec_xor(v2, va2); |
2420 | ++ v3 = vec_xor(v3, va3); |
2421 | ++ v4 = vec_xor(v4, va4); |
2422 | ++ v5 = vec_xor(v5, va5); |
2423 | ++ v6 = vec_xor(v6, va6); |
2424 | ++ v7 = vec_xor(v7, va7); |
2425 | ++ |
2426 | ++#ifdef REFLECT |
2427 | ++ /* |
2428 | ++ * vpmsumd produces a 96 bit result in the least significant bits |
2429 | ++ * of the register. Since we are bit reflected we have to shift it |
2430 | ++ * left 32 bits so it occupies the least significant bits in the |
2431 | ++ * bit reflected domain. |
2432 | ++ */ |
2433 | ++ v0 = (__vector unsigned long long)vec_sld((__vector unsigned char)v0, |
2434 | ++ (__vector unsigned char)vzero, 4); |
2435 | ++ v1 = (__vector unsigned long long)vec_sld((__vector unsigned char)v1, |
2436 | ++ (__vector unsigned char)vzero, 4); |
2437 | ++ v2 = (__vector unsigned long long)vec_sld((__vector unsigned char)v2, |
2438 | ++ (__vector unsigned char)vzero, 4); |
2439 | ++ v3 = (__vector unsigned long long)vec_sld((__vector unsigned char)v3, |
2440 | ++ (__vector unsigned char)vzero, 4); |
2441 | ++ v4 = (__vector unsigned long long)vec_sld((__vector unsigned char)v4, |
2442 | ++ (__vector unsigned char)vzero, 4); |
2443 | ++ v5 = (__vector unsigned long long)vec_sld((__vector unsigned char)v5, |
2444 | ++ (__vector unsigned char)vzero, 4); |
2445 | ++ v6 = (__vector unsigned long long)vec_sld((__vector unsigned char)v6, |
2446 | ++ (__vector unsigned char)vzero, 4); |
2447 | ++ v7 = (__vector unsigned long long)vec_sld((__vector unsigned char)v7, |
2448 | ++ (__vector unsigned char)vzero, 4); |
2449 | ++#endif |
2450 | ++ |
2451 | ++ /* xor with the last 1024 bits. */ |
2452 | ++ va0 = vec_ld(0, (__vector unsigned long long*) p); |
2453 | ++ VEC_PERM(va0, va0, va0, vperm_const); |
2454 | ++ |
2455 | ++ va1 = vec_ld(16, (__vector unsigned long long*) p); |
2456 | ++ VEC_PERM(va1, va1, va1, vperm_const); |
2457 | ++ |
2458 | ++ va2 = vec_ld(32, (__vector unsigned long long*) p); |
2459 | ++ VEC_PERM(va2, va2, va2, vperm_const); |
2460 | ++ |
2461 | ++ va3 = vec_ld(48, (__vector unsigned long long*) p); |
2462 | ++ VEC_PERM(va3, va3, va3, vperm_const); |
2463 | ++ |
2464 | ++ va4 = vec_ld(64, (__vector unsigned long long*) p); |
2465 | ++ VEC_PERM(va4, va4, va4, vperm_const); |
2466 | ++ |
2467 | ++ va5 = vec_ld(80, (__vector unsigned long long*) p); |
2468 | ++ VEC_PERM(va5, va5, va5, vperm_const); |
2469 | ++ |
2470 | ++ va6 = vec_ld(96, (__vector unsigned long long*) p); |
2471 | ++ VEC_PERM(va6, va6, va6, vperm_const); |
2472 | ++ |
2473 | ++ va7 = vec_ld(112, (__vector unsigned long long*) p); |
2474 | ++ VEC_PERM(va7, va7, va7, vperm_const); |
2475 | ++ |
2476 | ++ p = (char *)p + 128; |
2477 | ++ |
2478 | ++ vdata0 = vec_xor(v0, va0); |
2479 | ++ vdata1 = vec_xor(v1, va1); |
2480 | ++ vdata2 = vec_xor(v2, va2); |
2481 | ++ vdata3 = vec_xor(v3, va3); |
2482 | ++ vdata4 = vec_xor(v4, va4); |
2483 | ++ vdata5 = vec_xor(v5, va5); |
2484 | ++ vdata6 = vec_xor(v6, va6); |
2485 | ++ vdata7 = vec_xor(v7, va7); |
2486 | ++ |
2487 | ++ /* Check if we have more blocks to process */ |
2488 | ++ next_block = 0; |
2489 | ++ if (length != 0) { |
2490 | ++ next_block = 1; |
2491 | ++ |
2492 | ++ /* zero v0-v7 */ |
2493 | ++ v0 = vec_xor(v0, v0); |
2494 | ++ v1 = vec_xor(v1, v1); |
2495 | ++ v2 = vec_xor(v2, v2); |
2496 | ++ v3 = vec_xor(v3, v3); |
2497 | ++ v4 = vec_xor(v4, v4); |
2498 | ++ v5 = vec_xor(v5, v5); |
2499 | ++ v6 = vec_xor(v6, v6); |
2500 | ++ v7 = vec_xor(v7, v7); |
2501 | ++ } |
2502 | ++ length = length + 128; |
2503 | ++ |
2504 | ++ } while (next_block); |
2505 | ++ |
2506 | ++ /* Calculate how many bytes we have left. */ |
2507 | ++ length = (len & 127); |
2508 | ++ |
2509 | ++ /* Calculate where in (short) constant table we need to start. */ |
2510 | ++ offset = 128 - length; |
2511 | ++ |
2512 | ++ v0 = vec_ld(offset, vcrc_short_const); |
2513 | ++ v1 = vec_ld(offset + 16, vcrc_short_const); |
2514 | ++ v2 = vec_ld(offset + 32, vcrc_short_const); |
2515 | ++ v3 = vec_ld(offset + 48, vcrc_short_const); |
2516 | ++ v4 = vec_ld(offset + 64, vcrc_short_const); |
2517 | ++ v5 = vec_ld(offset + 80, vcrc_short_const); |
2518 | ++ v6 = vec_ld(offset + 96, vcrc_short_const); |
2519 | ++ v7 = vec_ld(offset + 112, vcrc_short_const); |
2520 | ++ |
2521 | ++ offset += 128; |
2522 | ++ |
2523 | ++ v0 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( |
2524 | ++ (__vector unsigned int)vdata0,(__vector unsigned int)v0); |
2525 | ++ v1 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( |
2526 | ++ (__vector unsigned int)vdata1,(__vector unsigned int)v1); |
2527 | ++ v2 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( |
2528 | ++ (__vector unsigned int)vdata2,(__vector unsigned int)v2); |
2529 | ++ v3 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( |
2530 | ++ (__vector unsigned int)vdata3,(__vector unsigned int)v3); |
2531 | ++ v4 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( |
2532 | ++ (__vector unsigned int)vdata4,(__vector unsigned int)v4); |
2533 | ++ v5 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( |
2534 | ++ (__vector unsigned int)vdata5,(__vector unsigned int)v5); |
2535 | ++ v6 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( |
2536 | ++ (__vector unsigned int)vdata6,(__vector unsigned int)v6); |
2537 | ++ v7 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( |
2538 | ++ (__vector unsigned int)vdata7,(__vector unsigned int)v7); |
2539 | ++ |
2540 | ++ /* Now reduce the tail (0-112 bytes). */ |
2541 | ++ for (i = 0; i < length; i+=16) { |
2542 | ++ vdata0 = vec_ld(i,(__vector unsigned long long*)p); |
2543 | ++ VEC_PERM(vdata0, vdata0, vdata0, vperm_const); |
2544 | ++ va0 = vec_ld(offset + i,vcrc_short_const); |
2545 | ++ va0 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( |
2546 | ++ (__vector unsigned int)vdata0,(__vector unsigned int)va0); |
2547 | ++ v0 = vec_xor(v0, va0); |
2548 | ++ } |
2549 | ++ |
2550 | ++ /* xor all parallel chunks together. */ |
2551 | ++ v0 = vec_xor(v0, v1); |
2552 | ++ v2 = vec_xor(v2, v3); |
2553 | ++ v4 = vec_xor(v4, v5); |
2554 | ++ v6 = vec_xor(v6, v7); |
2555 | ++ |
2556 | ++ v0 = vec_xor(v0, v2); |
2557 | ++ v4 = vec_xor(v4, v6); |
2558 | ++ |
2559 | ++ v0 = vec_xor(v0, v4); |
2560 | ++ } |
2561 | ++ |
2562 | ++ /* Barrett Reduction */ |
2563 | ++ vconst1 = vec_ld(0, v_Barrett_const); |
2564 | ++ vconst2 = vec_ld(16, v_Barrett_const); |
2565 | ++ |
2566 | ++ v1 = (__vector unsigned long long)vec_sld((__vector unsigned char)v0, |
2567 | ++ (__vector unsigned char)v0, 8); |
2568 | ++ v0 = vec_xor(v1,v0); |
2569 | ++ |
2570 | ++#ifdef REFLECT |
2571 | ++ /* shift left one bit */ |
2572 | ++ __vector unsigned char vsht_splat = vec_splat_u8 (1); |
2573 | ++ v0 = (__vector unsigned long long)vec_sll ((__vector unsigned char)v0, |
2574 | ++ vsht_splat); |
2575 | ++#endif |
2576 | ++ |
2577 | ++ v0 = vec_and(v0, vmask_64bit); |
2578 | ++ |
2579 | ++#ifndef REFLECT |
2580 | ++ |
2581 | ++ /* |
2582 | ++ * Now for the actual algorithm. The idea is to calculate q, |
2583 | ++ * the multiple of our polynomial that we need to subtract. By |
2584 | ++ * doing the computation 2x bits higher (ie 64 bits) and shifting the |
2585 | ++ * result back down 2x bits, we round down to the nearest multiple. |
2586 | ++ */ |
2587 | ++ |
2588 | ++ /* ma */ |
2589 | ++ v1 = __builtin_crypto_vpmsumd ((__vector unsigned long long)v0, |
2590 | ++ (__vector unsigned long long)vconst1); |
2591 | ++ /* q = floor(ma/(2^64)) */ |
2592 | ++ v1 = (__vector unsigned long long)vec_sld ((__vector unsigned char)vzero, |
2593 | ++ (__vector unsigned char)v1, 8); |
2594 | ++ /* qn */ |
2595 | ++ v1 = __builtin_crypto_vpmsumd ((__vector unsigned long long)v1, |
2596 | ++ (__vector unsigned long long)vconst2); |
2597 | ++ /* a - qn, subtraction is xor in GF(2) */ |
2598 | ++ v0 = vec_xor (v0, v1); |
2599 | ++ /* |
2600 | ++ * Get the result into r3. We need to shift it left 8 bytes: |
2601 | ++ * V0 [ 0 1 2 X ] |
2602 | ++ * V0 [ 0 X 2 3 ] |
2603 | ++ */ |
2604 | ++ result = __builtin_unpack_vector_1 (v0); |
2605 | ++#else |
2606 | ++ |
2607 | ++ /* |
2608 | ++ * The reflected version of Barrett reduction. Instead of bit |
2609 | ++ * reflecting our data (which is expensive to do), we bit reflect our |
2610 | ++ * constants and our algorithm, which means the intermediate data in |
2611 | ++ * our vector registers goes from 0-63 instead of 63-0. We can reflect |
2612 | ++ * the algorithm because we don't carry in mod 2 arithmetic. |
2613 | ++ */ |
2614 | ++ |
2615 | ++ /* bottom 32 bits of a */ |
2616 | ++ v1 = vec_and(v0, vmask_32bit); |
2617 | ++ |
2618 | ++ /* ma */ |
2619 | ++ v1 = __builtin_crypto_vpmsumd ((__vector unsigned long long)v1, |
2620 | ++ (__vector unsigned long long)vconst1); |
2621 | ++ |
2622 | ++ /* bottom 32 bits of ma */ |
2623 | ++ v1 = vec_and(v1, vmask_32bit); |
2624 | ++ /* qn */ |
2625 | ++ v1 = __builtin_crypto_vpmsumd ((__vector unsigned long long)v1, |
2626 | ++ (__vector unsigned long long)vconst2); |
2627 | ++ /* a - qn, subtraction is xor in GF(2) */ |
2628 | ++ v0 = vec_xor (v0, v1); |
2629 | ++ |
2630 | ++ /* |
2631 | ++ * Since we are bit reflected, the result (ie the low 32 bits) is in |
2632 | ++ * the high 32 bits. We just need to shift it left 4 bytes |
2633 | ++ * V0 [ 0 1 X 3 ] |
2634 | ++ * V0 [ 0 X 2 3 ] |
2635 | ++ */ |
2636 | ++ |
2637 | ++ /* shift result into top 64 bits of v0 */ |
2638 | ++ v0 = (__vector unsigned long long)vec_sld((__vector unsigned char)v0, |
2639 | ++ (__vector unsigned char)vzero, 4); |
2640 | ++ |
2641 | ++ result = __builtin_unpack_vector_0 (v0); |
2642 | ++#endif |
2643 | ++ |
2644 | ++ return result; |
2645 | ++} |
2646 | +diff --git a/contrib/power/crc32_z_resolver.c b/contrib/power/crc32_z_resolver.c |
2647 | +new file mode 100644 |
2648 | +index 0000000..f4e9aa4 |
2649 | +--- /dev/null |
2650 | ++++ b/contrib/power/crc32_z_resolver.c |
2651 | +@@ -0,0 +1,15 @@ |
2652 | ++/* Copyright (C) 2019 Matheus Castanho <msc@linux.ibm.com>, IBM |
2653 | ++ * For conditions of distribution and use, see copyright notice in zlib.h |
2654 | ++ */ |
2655 | ++ |
2656 | ++#include "../gcc/zifunc.h" |
2657 | ++#include "power.h" |
2658 | ++ |
2659 | ++Z_IFUNC(crc32_z) { |
2660 | ++#ifdef Z_POWER8 |
2661 | ++ if (__builtin_cpu_supports("arch_2_07")) |
2662 | ++ return _crc32_z_power8; |
2663 | ++#endif |
2664 | ++ |
2665 | ++ return crc32_z_default; |
2666 | ++} |
2667 | +diff --git a/contrib/power/power.h b/contrib/power/power.h |
2668 | +index b42c7d6..79123aa 100644 |
2669 | +--- a/contrib/power/power.h |
2670 | ++++ b/contrib/power/power.h |
2671 | +@@ -2,3 +2,7 @@ |
2672 | + * 2019 Rogerio Alves <rogerio.alves@ibm.com>, IBM |
2673 | + * For conditions of distribution and use, see copyright notice in zlib.h |
2674 | + */ |
2675 | ++ |
2676 | ++#include "../../zconf.h" |
2677 | ++ |
2678 | ++unsigned long _crc32_z_power8(unsigned long, const Bytef *, z_size_t); |
2679 | +diff --git a/crc32.c b/crc32.c |
2680 | +index 6c38f5c..5589d54 100644 |
2681 | +--- a/crc32.c |
2682 | ++++ b/crc32.c |
2683 | +@@ -691,6 +691,13 @@ local z_word_t crc_word_big(z_word_t data) { |
2684 | + #endif |
2685 | + |
2686 | + /* ========================================================================= */ |
2687 | ++#ifdef Z_POWER_OPT |
2688 | ++/* Rename function so resolver can use its symbol. The default version will be |
2689 | ++ * returned by the resolver if the host has no support for an optimized version. |
2690 | ++ */ |
2691 | ++#define crc32_z crc32_z_default |
2692 | ++#endif /* Z_POWER_OPT */ |
2693 | ++ |
2694 | + unsigned long ZEXPORT crc32_z(unsigned long crc, const unsigned char FAR *buf, |
2695 | + z_size_t len) { |
2696 | + /* Return initial CRC, if requested. */ |
2697 | +@@ -1009,6 +1016,11 @@ unsigned long ZEXPORT crc32_z(unsigned long crc, const unsigned char FAR *buf, |
2698 | + return crc ^ 0xffffffff; |
2699 | + } |
2700 | + |
2701 | ++#ifdef Z_POWER_OPT |
2702 | ++#undef crc32_z |
2703 | ++#include "contrib/power/crc32_z_resolver.c" |
2704 | ++#endif /* Z_POWER_OPT */ |
2705 | ++ |
2706 | + #endif |
2707 | + |
2708 | + /* ========================================================================= */ |
2709 | +diff --git a/test/crc32_test.c b/test/crc32_test.c |
2710 | +new file mode 100644 |
2711 | +index 0000000..3155553 |
2712 | +--- /dev/null |
2713 | ++++ b/test/crc32_test.c |
2714 | +@@ -0,0 +1,205 @@ |
2715 | ++/* crc32_test.c -- unit test for crc32 in the zlib compression library |
2716 | ++ * Copyright (C) 1995-2006, 2010, 2011, 2016, 2019 Rogerio Alves |
2717 | ++ * For conditions of distribution and use, see copyright notice in zlib.h |
2718 | ++ */ |
2719 | ++ |
2720 | ++#include "zlib.h" |
2721 | ++#include <stdio.h> |
2722 | ++ |
2723 | ++#ifdef STDC |
2724 | ++# include <string.h> |
2725 | ++# include <stdlib.h> |
2726 | ++#endif |
2727 | ++ |
2728 | ++void test_crc32 OF((uLong crc, Byte* buf, z_size_t len, uLong chk, int line)); |
2729 | ++int main OF((void)); |
2730 | ++ |
2731 | ++typedef struct { |
2732 | ++ int line; |
2733 | ++ uLong crc; |
2734 | ++ char* buf; |
2735 | ++ int len; |
2736 | ++ uLong expect; |
2737 | ++} crc32_test; |
2738 | ++ |
2739 | ++void test_crc32(crc, buf, len, chk, line) |
2740 | ++ uLong crc; |
2741 | ++ Byte *buf; |
2742 | ++ z_size_t len; |
2743 | ++ uLong chk; |
2744 | ++ int line; |
2745 | ++{ |
2746 | ++ uLong res = crc32(crc, buf, len); |
2747 | ++ if (res != chk) { |
2748 | ++ fprintf(stderr, "FAIL [%d]: crc32 returned 0x%08X expected 0x%08X\n", |
2749 | ++ line, (unsigned int)res, (unsigned int)chk); |
2750 | ++ exit(1); |
2751 | ++ } |
2752 | ++} |
2753 | ++ |
2754 | ++static const crc32_test tests[] = { |
2755 | ++ {__LINE__, 0x0, 0x0, 0, 0x0}, |
2756 | ++ {__LINE__, 0xffffffff, 0x0, 0, 0x0}, |
2757 | ++ {__LINE__, 0x0, 0x0, 255, 0x0}, /* BZ 174799. */ |
2758 | ++ {__LINE__, 0x0, 0x0, 256, 0x0}, |
2759 | ++ {__LINE__, 0x0, 0x0, 257, 0x0}, |
2760 | ++ {__LINE__, 0x0, 0x0, 32767, 0x0}, |
2761 | ++ {__LINE__, 0x0, 0x0, 32768, 0x0}, |
2762 | ++ {__LINE__, 0x0, 0x0, 32769, 0x0}, |
2763 | ++ {__LINE__, 0x0, "", 0, 0x0}, |
2764 | ++ {__LINE__, 0xffffffff, "", 0, 0xffffffff}, |
2765 | ++ {__LINE__, 0x0, "abacus", 6, 0xc3d7115b}, |
2766 | ++ {__LINE__, 0x0, "backlog", 7, 0x269205}, |
2767 | ++ {__LINE__, 0x0, "campfire", 8, 0x22a515f8}, |
2768 | ++ {__LINE__, 0x0, "delta", 5, 0x9643fed9}, |
2769 | ++ {__LINE__, 0x0, "executable", 10, 0xd68eda01}, |
2770 | ++ {__LINE__, 0x0, "file", 4, 0x8c9f3610}, |
2771 | ++ {__LINE__, 0x0, "greatest", 8, 0xc1abd6cd}, |
2772 | ++ {__LINE__, 0x0, "hello", 5, 0x3610a686}, |
2773 | ++ {__LINE__, 0x0, "inverter", 8, 0xc9e962c9}, |
2774 | ++ {__LINE__, 0x0, "jigsaw", 6, 0xce4e3f69}, |
2775 | ++ {__LINE__, 0x0, "karate", 6, 0x890be0e2}, |
2776 | ++ {__LINE__, 0x0, "landscape", 9, 0xc4e0330b}, |
2777 | ++ {__LINE__, 0x0, "machine", 7, 0x1505df84}, |
2778 | ++ {__LINE__, 0x0, "nanometer", 9, 0xd4e19f39}, |
2779 | ++ {__LINE__, 0x0, "oblivion", 8, 0xdae9de77}, |
2780 | ++ {__LINE__, 0x0, "panama", 6, 0x66b8979c}, |
2781 | ++ {__LINE__, 0x0, "quest", 5, 0x4317f817}, |
2782 | ++ {__LINE__, 0x0, "resource", 8, 0xbc91f416}, |
2783 | ++ {__LINE__, 0x0, "secret", 6, 0x5ca2e8e5}, |
2784 | ++ {__LINE__, 0x0, "test", 4, 0xd87f7e0c}, |
2785 | ++ {__LINE__, 0x0, "ultimate", 8, 0x3fc79b0b}, |
2786 | ++ {__LINE__, 0x0, "vector", 6, 0x1b6e485b}, |
2787 | ++ {__LINE__, 0x0, "walrus", 6, 0xbe769b97}, |
2788 | ++ {__LINE__, 0x0, "xeno", 4, 0xe7a06444}, |
2789 | ++ {__LINE__, 0x0, "yelling", 7, 0xfe3944e5}, |
2790 | ++ {__LINE__, 0x0, "zlib", 4, 0x73887d3a}, |
2791 | ++ {__LINE__, 0x0, "4BJD7PocN1VqX0jXVpWB", 20, 0xd487a5a1}, |
2792 | ++ {__LINE__, 0x0, "F1rPWI7XvDs6nAIRx41l", 20, 0x61a0132e}, |
2793 | ++ {__LINE__, 0x0, "ldhKlsVkPFOveXgkGtC2", 20, 0xdf02f76}, |
2794 | ++ {__LINE__, 0x0, "5KKnGOOrs8BvJ35iKTOS", 20, 0x579b2b0a}, |
2795 | ++ {__LINE__, 0x0, "0l1tw7GOcem06Ddu7yn4", 20, 0xf7d16e2d}, |
2796 | ++ {__LINE__, 0x0, "MCr47CjPIn9R1IvE1Tm5", 20, 0x731788f5}, |
2797 | ++ {__LINE__, 0x0, "UcixbzPKTIv0SvILHVdO", 20, 0x7112bb11}, |
2798 | ++ {__LINE__, 0x0, "dGnAyAhRQDsWw0ESou24", 20, 0xf32a0dac}, |
2799 | ++ {__LINE__, 0x0, "di0nvmY9UYMYDh0r45XT", 20, 0x625437bb}, |
2800 | ++ {__LINE__, 0x0, "2XKDwHfAhFsV0RhbqtvH", 20, 0x896930f9}, |
2801 | ++ {__LINE__, 0x0, "ZhrANFIiIvRnqClIVyeD", 20, 0x8579a37}, |
2802 | ++ {__LINE__, 0x0, "v7Q9ehzioTOVeDIZioT1", 20, 0x632aa8e0}, |
2803 | ++ {__LINE__, 0x0, "Yod5hEeKcYqyhfXbhxj2", 20, 0xc829af29}, |
2804 | ++ {__LINE__, 0x0, "GehSWY2ay4uUKhehXYb0", 20, 0x1b08b7e8}, |
2805 | ++ {__LINE__, 0x0, "kwytJmq6UqpflV8Y8GoE", 20, 0x4e33b192}, |
2806 | ++ {__LINE__, 0x0, "70684206568419061514", 20, 0x59a179f0}, |
2807 | ++ {__LINE__, 0x0, "42015093765128581010", 20, 0xcd1013d7}, |
2808 | ++ {__LINE__, 0x0, "88214814356148806939", 20, 0xab927546}, |
2809 | ++ {__LINE__, 0x0, "43472694284527343838", 20, 0x11f3b20c}, |
2810 | ++ {__LINE__, 0x0, "49769333513942933689", 20, 0xd562d4ca}, |
2811 | ++ {__LINE__, 0x0, "54979784887993251199", 20, 0x233395f7}, |
2812 | ++ {__LINE__, 0x0, "58360544869206793220", 20, 0x2d167fd5}, |
2813 | ++ {__LINE__, 0x0, "27347953487840714234", 20, 0x8b5108ba}, |
2814 | ++ {__LINE__, 0x0, "07650690295365319082", 20, 0xc46b3cd8}, |
2815 | ++ {__LINE__, 0x0, "42655507906821911703", 20, 0xc10b2662}, |
2816 | ++ {__LINE__, 0x0, "29977409200786225655", 20, 0xc9a0f9d2}, |
2817 | ++ {__LINE__, 0x0, "85181542907229116674", 20, 0x9341357b}, |
2818 | ++ {__LINE__, 0x0, "87963594337989416799", 20, 0xf0424937}, |
2819 | ++ {__LINE__, 0x0, "21395988329504168551", 20, 0xd7c4c31f}, |
2820 | ++ {__LINE__, 0x0, "51991013580943379423", 20, 0xf11edcc4}, |
2821 | ++ {__LINE__, 0x0, "*]+@!);({_$;}[_},?{?;(_?,=-][@", 30, 0x40795df4}, |
2822 | ++ {__LINE__, 0x0, "_@:_).&(#.[:[{[:)$++-($_;@[)}+", 30, 0xdd61a631}, |
2823 | ++ {__LINE__, 0x0, "&[!,[$_==}+.]@!;*(+},[;:)$;)-@", 30, 0xca907a99}, |
2824 | ++ {__LINE__, 0x0, "]{.[.+?+[[=;[?}_#&;[=)__$$:+=_", 30, 0xf652deac}, |
2825 | ++ {__LINE__, 0x0, "-%.)=/[@].:.(:,()$;=%@-$?]{%+%", 30, 0xaf39a5a9}, |
2826 | ++ {__LINE__, 0x0, "+]#$(@&.=:,*];/.!]%/{:){:@(;)$", 30, 0x6bebb4cf}, |
2827 | ++ {__LINE__, 0x0, ")-._.:?[&:.=+}(*$/=!.${;(=$@!}", 30, 0x76430bac}, |
2828 | ++ {__LINE__, 0x0, ":(_*&%/[[}+,?#$&*+#[([*-/#;%(]", 30, 0x6c80c388}, |
2829 | ++ {__LINE__, 0x0, "{[#-;:$/{)(+[}#]/{&!%(@)%:@-$:", 30, 0xd54d977d}, |
2830 | ++ {__LINE__, 0x0, "_{$*,}(&,@.)):=!/%(&(,,-?$}}}!", 30, 0xe3966ad5}, |
2831 | ++ {__LINE__, 0x0, "e$98KNzqaV)Y:2X?]77].{gKRD4G5{mHZk,Z)SpU%L3FSgv!Wb8MLAFdi{+fp)c,@8m6v)yXg@]HBDFk?.4&}g5_udE*JHCiH=aL", 100, 0xe7c71db9}, |
2832 | ++ {__LINE__, 0x0, "r*Fd}ef+5RJQ;+W=4jTR9)R*p!B;]Ed7tkrLi;88U7g@3v!5pk2X6D)vt,.@N8c]@yyEcKi[vwUu@.Ppm@C6%Mv*3Nw}Y,58_aH)", 100, 0xeaa52777}, |
2833 | ++ {__LINE__, 0x0, "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&", 100, 0xcd472048}, |
2834 | ++ {__LINE__, 0x7a30360d, "abacus", 6, 0xf8655a84}, |
2835 | ++ {__LINE__, 0x6fd767ee, "backlog", 7, 0x1ed834b1}, |
2836 | ++ {__LINE__, 0xefeb7589, "campfire", 8, 0x686cfca}, |
2837 | ++ {__LINE__, 0x61cf7e6b, "delta", 5, 0x1554e4b1}, |
2838 | ++ {__LINE__, 0xdc712e2, "executable", 10, 0x761b4254}, |
2839 | ++ {__LINE__, 0xad23c7fd, "file", 4, 0x7abdd09b}, |
2840 | ++ {__LINE__, 0x85cb2317, "greatest", 8, 0x4ba91c6b}, |
2841 | ++ {__LINE__, 0x9eed31b0, "inverter", 8, 0xd5e78ba5}, |
2842 | ++ {__LINE__, 0xb94f34ca, "jigsaw", 6, 0x23649109}, |
2843 | ++ {__LINE__, 0xab058a2, "karate", 6, 0xc5591f41}, |
2844 | ++ {__LINE__, 0x5bff2b7a, "landscape", 9, 0xf10eb644}, |
2845 | ++ {__LINE__, 0x605c9a5f, "machine", 7, 0xbaa0a636}, |
2846 | ++ {__LINE__, 0x51bdeea5, "nanometer", 9, 0x6af89afb}, |
2847 | ++ {__LINE__, 0x85c21c79, "oblivion", 8, 0xecae222b}, |
2848 | ++ {__LINE__, 0x97216f56, "panama", 6, 0x47dffac4}, |
2849 | ++ {__LINE__, 0x18444af2, "quest", 5, 0x70c2fe36}, |
2850 | ++ {__LINE__, 0xbe6ce359, "resource", 8, 0x1471d925}, |
2851 | ++ {__LINE__, 0x843071f1, "secret", 6, 0x50c9a0db}, |
2852 | ++ {__LINE__, 0xf2480c60, "ultimate", 8, 0xf973daf8}, |
2853 | ++ {__LINE__, 0x2d2feb3d, "vector", 6, 0x344ac03d}, |
2854 | ++ {__LINE__, 0x7490310a, "walrus", 6, 0x6d1408ef}, |
2855 | ++ {__LINE__, 0x97d247d4, "xeno", 4, 0xe62670b5}, |
2856 | ++ {__LINE__, 0x93cf7599, "yelling", 7, 0x1b36da38}, |
2857 | ++ {__LINE__, 0x73c84278, "zlib", 4, 0x6432d127}, |
2858 | ++ {__LINE__, 0x228a87d1, "4BJD7PocN1VqX0jXVpWB", 20, 0x997107d0}, |
2859 | ++ {__LINE__, 0xa7a048d0, "F1rPWI7XvDs6nAIRx41l", 20, 0xdc567274}, |
2860 | ++ {__LINE__, 0x1f0ded40, "ldhKlsVkPFOveXgkGtC2", 20, 0xdcc63870}, |
2861 | ++ {__LINE__, 0xa804a62f, "5KKnGOOrs8BvJ35iKTOS", 20, 0x6926cffd}, |
2862 | ++ {__LINE__, 0x508fae6a, "0l1tw7GOcem06Ddu7yn4", 20, 0xb52b38bc}, |
2863 | ++ {__LINE__, 0xe5adaf4f, "MCr47CjPIn9R1IvE1Tm5", 20, 0xf83b8178}, |
2864 | ++ {__LINE__, 0x67136a40, "UcixbzPKTIv0SvILHVdO", 20, 0xc5213070}, |
2865 | ++ {__LINE__, 0xb00c4a10, "dGnAyAhRQDsWw0ESou24", 20, 0xbc7648b0}, |
2866 | ++ {__LINE__, 0x2e0c84b5, "di0nvmY9UYMYDh0r45XT", 20, 0xd8123a72}, |
2867 | ++ {__LINE__, 0x81238d44, "2XKDwHfAhFsV0RhbqtvH", 20, 0xd5ac5620}, |
2868 | ++ {__LINE__, 0xf853aa92, "ZhrANFIiIvRnqClIVyeD", 20, 0xceae099d}, |
2869 | ++ {__LINE__, 0x5a692325, "v7Q9ehzioTOVeDIZioT1", 20, 0xb07d2b24}, |
2870 | ++ {__LINE__, 0x3275b9f, "Yod5hEeKcYqyhfXbhxj2", 20, 0x24ce91df}, |
2871 | ++ {__LINE__, 0x38371feb, "GehSWY2ay4uUKhehXYb0", 20, 0x707b3b30}, |
2872 | ++ {__LINE__, 0xafc8bf62, "kwytJmq6UqpflV8Y8GoE", 20, 0x16abc6a9}, |
2873 | ++ {__LINE__, 0x9b07db73, "70684206568419061514", 20, 0xae1fb7b7}, |
2874 | ++ {__LINE__, 0xe75b214, "42015093765128581010", 20, 0xd4eecd2d}, |
2875 | ++ {__LINE__, 0x72d0fe6f, "88214814356148806939", 20, 0x4660ec7}, |
2876 | ++ {__LINE__, 0xf857a4b1, "43472694284527343838", 20, 0xfd8afdf7}, |
2877 | ++ {__LINE__, 0x54b8e14, "49769333513942933689", 20, 0xc6d1b5f2}, |
2878 | ++ {__LINE__, 0xd6aa5616, "54979784887993251199", 20, 0x32476461}, |
2879 | ++ {__LINE__, 0x11e63098, "58360544869206793220", 20, 0xd917cf1a}, |
2880 | ++ {__LINE__, 0xbe92385, "27347953487840714234", 20, 0x4ad14a12}, |
2881 | ++ {__LINE__, 0x49511de0, "07650690295365319082", 20, 0xe37b5c6c}, |
2882 | ++ {__LINE__, 0x3db13bc1, "42655507906821911703", 20, 0x7cc497f1}, |
2883 | ++ {__LINE__, 0xbb899bea, "29977409200786225655", 20, 0x99781bb2}, |
2884 | ++ {__LINE__, 0xf6cd9436, "85181542907229116674", 20, 0x132256a1}, |
2885 | ++ {__LINE__, 0x9109e6c3, "87963594337989416799", 20, 0xbfdb2c83}, |
2886 | ++ {__LINE__, 0x75770fc, "21395988329504168551", 20, 0x8d9d1e81}, |
2887 | ++ {__LINE__, 0x69b1d19b, "51991013580943379423", 20, 0x7b6d4404}, |
2888 | ++ {__LINE__, 0xc6132975, "*]+@!);({_$;}[_},?{?;(_?,=-][@", 30, 0x8619f010}, |
2889 | ++ {__LINE__, 0xd58cb00c, "_@:_).&(#.[:[{[:)$++-($_;@[)}+", 30, 0x15746ac3}, |
2890 | ++ {__LINE__, 0xb63b8caa, "&[!,[$_==}+.]@!;*(+},[;:)$;)-@", 30, 0xaccf812f}, |
2891 | ++ {__LINE__, 0x8a45a2b8, "]{.[.+?+[[=;[?}_#&;[=)__$$:+=_", 30, 0x78af45de}, |
2892 | ++ {__LINE__, 0xcbe95b78, "-%.)=/[@].:.(:,()$;=%@-$?]{%+%", 30, 0x25b06b59}, |
2893 | ++ {__LINE__, 0x4ef8a54b, "+]#$(@&.=:,*];/.!]%/{:){:@(;)$", 30, 0x4ba0d08f}, |
2894 | ++ {__LINE__, 0x76ad267a, ")-._.:?[&:.=+}(*$/=!.${;(=$@!}", 30, 0xe26b6aac}, |
2895 | ++ {__LINE__, 0x569e613c, ":(_*&%/[[}+,?#$&*+#[([*-/#;%(]", 30, 0x7e2b0a66}, |
2896 | ++ {__LINE__, 0x36aa61da, "{[#-;:$/{)(+[}#]/{&!%(@)%:@-$:", 30, 0xb3430dc7}, |
2897 | ++ {__LINE__, 0xf67222df, "_{$*,}(&,@.)):=!/%(&(,,-?$}}}!", 30, 0x626c17a}, |
2898 | ++ {__LINE__, 0x74b34fd3, "e$98KNzqaV)Y:2X?]77].{gKRD4G5{mHZk,Z)SpU%L3FSgv!Wb8MLAFdi{+fp)c,@8m6v)yXg@]HBDFk?.4&}g5_udE*JHCiH=aL", 100, 0xccf98060}, |
2899 | ++ {__LINE__, 0x351fd770, "r*Fd}ef+5RJQ;+W=4jTR9)R*p!B;]Ed7tkrLi;88U7g@3v!5pk2X6D)vt,.@N8c]@yyEcKi[vwUu@.Ppm@C6%Mv*3Nw}Y,58_aH)", 100, 0xd8b95312}, |
2900 | ++ {__LINE__, 0xc45aef77, "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&", 100, 0xbb1c9912}, |
2901 | ++ {__LINE__, 0xc45aef77, "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&" |
2902 | ++ "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&" |
2903 | ++ "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&" |
2904 | ++ "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&" |
2905 | ++ "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&" |
2906 | ++ "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&", 600, 0x888AFA5B} |
2907 | ++}; |
2908 | ++ |
2909 | ++static const int test_size = sizeof(tests) / sizeof(tests[0]); |
2910 | ++ |
2911 | ++int main(void) |
2912 | ++{ |
2913 | ++ int i; |
2914 | ++ for (i = 0; i < test_size; i++) { |
2915 | ++ test_crc32(tests[i].crc, (Byte*) tests[i].buf, tests[i].len, |
2916 | ++ tests[i].expect, tests[i].line); |
2917 | ++ } |
2918 | ++ return 0; |
2919 | ++} |
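The table above drives `test_crc32()` with rows of (seed, buffer, length, expected CRC). A minimal, self-contained sketch of the same table-driven pattern is given here. It assumes a plain bitwise CRC-32 (reflected polynomial 0xEDB88320) standing in for zlib's optimized `crc32()`, and the names `crc32_bitwise`, `crc_case`, `run_crc_cases` and `check_chaining` are hypothetical, not part of the patch. The only non-trivial expected value used is the well-known CRC-32 check value for "123456789".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 using the same seed convention as zlib's crc32(): the
 * running value is inverted on entry and on exit, so calls can be chained. */
static uint32_t crc32_bitwise(uint32_t crc, const unsigned char *buf, size_t len)
{
    size_t i;
    int k;

    crc = ~crc;
    for (i = 0; i < len; i++) {
        crc ^= buf[i];
        for (k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* One row per case, mirroring the {crc, buf, len, expect} layout above. */
struct crc_case {
    uint32_t seed;
    const char *buf;
    size_t len;
    uint32_t expect;
};

static const struct crc_case crc_cases[] = {
    {0x00000000u, "", 0, 0x00000000u},
    {0x00000000u, "123456789", 9, 0xCBF43926u},
};

/* Returns the number of rows whose computed CRC differs from `expect`. */
static int run_crc_cases(void)
{
    size_t i;
    int failures = 0;

    for (i = 0; i < sizeof(crc_cases) / sizeof(crc_cases[0]); i++) {
        const struct crc_case *c = &crc_cases[i];
        if (crc32_bitwise(c->seed, (const unsigned char *)c->buf, c->len) != c->expect)
            failures++;
    }
    return failures;
}

/* A non-zero seed is just the CRC of whatever data came before. */
static void check_chaining(void)
{
    uint32_t part = crc32_bitwise(0, (const unsigned char *)"1234", 4);
    assert(crc32_bitwise(part, (const unsigned char *)"56789", 5) == 0xCBF43926u);
}
```

Calling `run_crc_cases()` from a test driver and treating a non-zero return as failure reproduces the shape of the loop in `main()` above; the non-zero seeds in the real table exercise the chaining behaviour that `check_chaining()` illustrates.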
2920 | diff --git a/debian/patches/power/fix-clang7-builtins.patch b/debian/patches/power/fix-clang7-builtins.patch |
2921 | new file mode 100644 |
2922 | index 0000000..0ed510f |
2923 | --- /dev/null |
2924 | +++ b/debian/patches/power/fix-clang7-builtins.patch |
2925 | @@ -0,0 +1,62 @@ |
2926 | +From: Manjunath S Matti <mmatti@linux.ibm.com> |
2927 | +Date: Thu, 14 Sep 2023 06:45:31 -0500 |
2928 | +Subject: Fix clang's behavior on versions >= 7 |
2929 | + |
2930 | +Clang 7 changed the behavior of vec_xxpermdi in order to match GCC's |
2931 | +behavior. After this change, code that used to work on Clang 6 stopped |
2932 | +working on Clang >= 7. |
2933 | + |
2934 | +Tested on Clang 6, 7, 8 and 9. |
2935 | + |
2936 | +Reference: https://bugs.llvm.org/show_bug.cgi?id=38192 |
2937 | + |
2938 | +Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> |
2939 | +Signed-off-by: Manjunath Matti <mmatti@linux.ibm.com> |
2940 | + |
2941 | +Origin: iii-i/zlib, https://github.com/iii-i/zlib/commit/8aca10a8a5ddb397854eb9a443f29658d3e3e12e |
2942 | +--- |
2943 | + contrib/power/clang_workaround.h | 15 ++++++++++----- |
2944 | + 1 file changed, 10 insertions(+), 5 deletions(-) |
2945 | + |
2946 | +diff --git a/contrib/power/clang_workaround.h b/contrib/power/clang_workaround.h |
2947 | +index b5e7dae..915f7e5 100644 |
2948 | +--- a/contrib/power/clang_workaround.h |
2949 | ++++ b/contrib/power/clang_workaround.h |
2950 | +@@ -39,7 +39,12 @@ __vector unsigned long long __builtin_pack_vector (unsigned long __a, |
2951 | + return __v; |
2952 | + } |
2953 | + |
2954 | +-#ifndef vec_xxpermdi |
2955 | ++/* |
2956 | ++ * Clang 7 changed the behavior of vec_xxpermdi in order to provide the same |
2957 | ++ * behavior of GCC. That means code adapted to Clang >= 7 does not work on |
2958 | ++ * Clang <= 6. So, fallback to __builtin_unpack_vector() on Clang <= 6. |
2959 | ++ */ |
2960 | ++#if !defined vec_xxpermdi || __clang_major__ <= 6 |
2961 | + |
2962 | + static inline |
2963 | + unsigned long __builtin_unpack_vector (__vector unsigned long long __v, |
2964 | +@@ -62,9 +67,9 @@ static inline |
2965 | + unsigned long __builtin_unpack_vector_0 (__vector unsigned long long __v) |
2966 | + { |
2967 | + #if defined(__BIG_ENDIAN__) |
2968 | +- return vec_xxpermdi(__v, __v, 0x0)[1]; |
2969 | +- #else |
2970 | + return vec_xxpermdi(__v, __v, 0x0)[0]; |
2971 | ++ #else |
2972 | ++ return vec_xxpermdi(__v, __v, 0x3)[0]; |
2973 | + #endif |
2974 | + } |
2975 | + |
2976 | +@@ -72,9 +77,9 @@ static inline |
2977 | + unsigned long __builtin_unpack_vector_1 (__vector unsigned long long __v) |
2978 | + { |
2979 | + #if defined(__BIG_ENDIAN__) |
2980 | +- return vec_xxpermdi(__v, __v, 0x3)[1]; |
2981 | +- #else |
2982 | + return vec_xxpermdi(__v, __v, 0x3)[0]; |
2983 | ++ #else |
2984 | ++ return vec_xxpermdi(__v, __v, 0x0)[0]; |
2985 | + #endif |
2986 | + } |
2987 | + #endif /* vec_xxpermdi */ |
2988 | diff --git a/debian/patches/power/indirect-func-macros.patch b/debian/patches/power/indirect-func-macros.patch |
2989 | new file mode 100644 |
2990 | index 0000000..c2976d8 |
2991 | --- /dev/null |
2992 | +++ b/debian/patches/power/indirect-func-macros.patch |
2993 | @@ -0,0 +1,295 @@ |
2994 | +From: Manjunath S Matti <mmatti@linux.ibm.com> |
2995 | +Date: Thu, 14 Sep 2023 06:15:57 -0500 |
2996 | +Subject: Preparation for Power optimizations |
2997 | + |
2998 | +Optimized functions for Power will make use of GNU indirect functions, |
2999 | +an extension to support different implementations of the same function, |
3000 | +which can be selected during runtime. This will be used to provide |
3001 | +optimized functions for different processor versions. |
3002 | + |
3003 | +Since this is a GNU extension, we placed the definition of the Z_IFUNC |
3004 | +macro under `contrib/gcc`. This can be reused by other archs as well. |
3005 | + |
3006 | +Author: Matheus Castanho <msc@linux.ibm.com> |
3007 | +Author: Rogerio Alves <rcardoso@linux.ibm.com> |
3008 | +Signed-off-by: Manjunath Matti <mmatti@linux.ibm.com> |
3009 | + |
3010 | +Origin: iii-i/zlib, https://github.com/iii-i/zlib/commit/096441298ecd1c123f1d37c2b34d6b6bb3c42e93 |
3011 | +--- |
3012 | + CMakeLists.txt | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++ |
3013 | + configure | 66 ++++++++++++++++++++++++++++++++++++++++++++++ |
3014 | + contrib/README.contrib | 8 ++++++ |
3015 | + contrib/gcc/zifunc.h | 60 ++++++++++++++++++++++++++++++++++++++++++ |
3016 | + contrib/power/power.h | 4 +++ |
3017 | + 5 files changed, 209 insertions(+) |
3018 | + create mode 100644 contrib/gcc/zifunc.h |
3019 | + create mode 100644 contrib/power/power.h |
3020 | + |
3021 | +diff --git a/CMakeLists.txt b/CMakeLists.txt |
3022 | +index 7f1b69f..4456cd7 100644 |
3023 | +--- a/CMakeLists.txt |
3024 | ++++ b/CMakeLists.txt |
3025 | +@@ -5,6 +5,8 @@ project(zlib C) |
3026 | + |
3027 | + set(VERSION "1.3") |
3028 | + |
3029 | ++option(POWER "Enable building power implementation") |
3030 | ++ |
3031 | + set(INSTALL_BIN_DIR "${CMAKE_INSTALL_PREFIX}/bin" CACHE PATH "Installation directory for executables") |
3032 | + set(INSTALL_LIB_DIR "${CMAKE_INSTALL_PREFIX}/lib" CACHE PATH "Installation directory for libraries") |
3033 | + set(INSTALL_INC_DIR "${CMAKE_INSTALL_PREFIX}/include" CACHE PATH "Installation directory for headers") |
3034 | +@@ -126,6 +128,75 @@ if(NOT MINGW) |
3035 | + ) |
3036 | + endif() |
3037 | + |
3038 | ++if(CMAKE_COMPILER_IS_GNUCC) |
3039 | ++ |
3040 | ++ # test to see if we can use a GNU indirect function to detect and load optimized code at runtime |
3041 | ++ CHECK_C_SOURCE_COMPILES(" |
3042 | ++ static int test_ifunc_native(void) |
3043 | ++ { |
3044 | ++ return 1; |
3045 | ++ } |
3046 | ++ static int (*(check_ifunc_native(void)))(void) |
3047 | ++ { |
3048 | ++ return test_ifunc_native; |
3049 | ++ } |
3050 | ++ int test_ifunc(void) __attribute__ ((ifunc (\"check_ifunc_native\"))); |
3051 | ++ int main(void) |
3052 | ++ { |
3053 | ++ return 0; |
3054 | ++ } |
3055 | ++ " HAS_C_ATTR_IFUNC) |
3056 | ++ |
3057 | ++ if(HAS_C_ATTR_IFUNC) |
3058 | ++ add_definitions(-DHAVE_IFUNC) |
3059 | ++ set(ZLIB_PRIVATE_HDRS ${ZLIB_PRIVATE_HDRS} contrib/gcc/zifunc.h) |
3060 | ++ endif() |
3061 | ++ |
3062 | ++ if(POWER) |
3063 | ++ # Test to see if we can use the optimizations for Power |
3064 | ++ CHECK_C_SOURCE_COMPILES(" |
3065 | ++ #ifndef _ARCH_PPC |
3066 | ++ #error \"Target is not Power\" |
3067 | ++ #endif |
3068 | ++ #ifndef __BUILTIN_CPU_SUPPORTS__ |
3069 | ++ #error \"Target doesn't support __builtin_cpu_supports()\" |
3070 | ++ #endif |
3071 | ++ int main() { return 0; } |
3072 | ++ " HAS_POWER_SUPPORT) |
3073 | ++ |
3074 | ++ if(HAS_POWER_SUPPORT AND HAS_C_ATTR_IFUNC) |
3075 | ++ add_definitions(-DZ_POWER_OPT) |
3076 | ++ |
3077 | ++ set(CMAKE_REQUIRED_FLAGS -mcpu=power8) |
3078 | ++ CHECK_C_SOURCE_COMPILES("int main(void){return 0;}" POWER8) |
3079 | ++ |
3080 | ++ if(POWER8) |
3081 | ++ add_definitions(-DZ_POWER8) |
3082 | ++ set(ZLIB_POWER8 ) |
3083 | ++ |
3084 | ++ set_source_files_properties( |
3085 | ++ ${ZLIB_POWER8} |
3086 | ++ PROPERTIES COMPILE_FLAGS -mcpu=power8) |
3087 | ++ endif() |
3088 | ++ |
3089 | ++ set(CMAKE_REQUIRED_FLAGS -mcpu=power9) |
3090 | ++ CHECK_C_SOURCE_COMPILES("int main(void){return 0;}" POWER9) |
3091 | ++ |
3092 | ++ if(POWER9) |
3093 | ++ add_definitions(-DZ_POWER9) |
3094 | ++ set(ZLIB_POWER9 ) |
3095 | ++ |
3096 | ++ set_source_files_properties( |
3097 | ++ ${ZLIB_POWER9} |
3098 | ++ PROPERTIES COMPILE_FLAGS -mcpu=power9) |
3099 | ++ endif() |
3100 | ++ |
3101 | ++ set(ZLIB_PRIVATE_HDRS ${ZLIB_PRIVATE_HDRS} contrib/power/power.h) |
3102 | ++ set(ZLIB_SRCS ${ZLIB_SRCS} ${ZLIB_POWER8} ${ZLIB_POWER9}) |
3103 | ++ endif() |
3104 | ++ endif() |
3105 | ++endif() |
3106 | ++ |
3107 | + # parse the full version number from zlib.h and include in ZLIB_FULL_VERSION |
3108 | + file(READ ${CMAKE_CURRENT_SOURCE_DIR}/zlib.h _zlib_h_contents) |
3109 | + string(REGEX REPLACE ".*#define[ \t]+ZLIB_VERSION[ \t]+\"([-0-9A-Za-z.]+)\".*" |
3110 | +diff --git a/configure b/configure |
3111 | +index cc867c9..e307a8d 100755 |
3112 | +--- a/configure |
3113 | ++++ b/configure |
3114 | +@@ -834,6 +834,72 @@ EOF |
3115 | + fi |
3116 | + fi |
3117 | + |
3118 | ++# test to see if we can use a gnu indirection function to detect and load optimized code at runtime |
3119 | ++echo >> configure.log |
3120 | ++cat > $test.c <<EOF |
3121 | ++static int test_ifunc_native(void) |
3122 | ++{ |
3123 | ++ return 1; |
3124 | ++} |
3125 | ++ |
3126 | ++static int (*(check_ifunc_native(void)))(void) |
3127 | ++{ |
3128 | ++ return test_ifunc_native; |
3129 | ++} |
3130 | ++ |
3131 | ++int test_ifunc(void) __attribute__ ((ifunc ("check_ifunc_native"))); |
3132 | ++EOF |
3133 | ++ |
3134 | ++if tryboth $CC -c $CFLAGS $test.c; then |
3135 | ++ SFLAGS="${SFLAGS} -DHAVE_IFUNC" |
3136 | ++ CFLAGS="${CFLAGS} -DHAVE_IFUNC" |
3137 | ++ echo "Checking for attribute(ifunc) support... Yes." | tee -a configure.log |
3138 | ++else |
3139 | ++ echo "Checking for attribute(ifunc) support... No." | tee -a configure.log |
3140 | ++fi |
3141 | ++ |
3142 | ++# Test to see if we can use the optimizations for Power |
3143 | ++echo >> configure.log |
3144 | ++cat > $test.c <<EOF |
3145 | ++#ifndef _ARCH_PPC |
3146 | ++ #error "Target is not Power" |
3147 | ++#endif |
3148 | ++#ifndef HAVE_IFUNC |
3149 | ++ #error "Target doesn't support ifunc" |
3150 | ++#endif |
3151 | ++#ifndef __BUILTIN_CPU_SUPPORTS__ |
3152 | ++ #error "Target doesn't support __builtin_cpu_supports()" |
3153 | ++#endif |
3154 | ++EOF |
3155 | ++ |
3156 | ++if tryboth $CC -c $CFLAGS $test.c; then |
3157 | ++ echo "int main(void){return 0;}" > $test.c |
3158 | ++ |
3159 | ++ if tryboth $CC -c $CFLAGS -mcpu=power8 $test.c; then |
3160 | ++ POWER8="-DZ_POWER8" |
3161 | ++ PIC_OBJC="${PIC_OBJC}" |
3162 | ++ OBJC="${OBJC}" |
3163 | ++ echo "Checking for -mcpu=power8 support... Yes." | tee -a configure.log |
3164 | ++ else |
3165 | ++ echo "Checking for -mcpu=power8 support... No." | tee -a configure.log |
3166 | ++ fi |
3167 | ++ |
3168 | ++ if tryboth $CC -c $CFLAGS -mcpu=power9 $test.c; then |
3169 | ++ POWER9="-DZ_POWER9" |
3170 | ++ PIC_OBJC="${PIC_OBJC}" |
3171 | ++ OBJC="${OBJC}" |
3172 | ++ echo "Checking for -mcpu=power9 support... Yes." | tee -a configure.log |
3173 | ++ else |
3174 | ++ echo "Checking for -mcpu=power9 support... No." | tee -a configure.log |
3175 | ++ fi |
3176 | ++ |
3177 | ++ SFLAGS="${SFLAGS} ${POWER8} ${POWER9} -DZ_POWER_OPT" |
3178 | ++ CFLAGS="${CFLAGS} ${POWER8} ${POWER9} -DZ_POWER_OPT" |
3179 | ++ echo "Checking for Power optimizations support... Yes." | tee -a configure.log |
3180 | ++else |
3181 | ++ echo "Checking for Power optimizations support... No." | tee -a configure.log |
3182 | ++fi |
3183 | ++ |
3184 | + # show the results in the log |
3185 | + echo >> configure.log |
3186 | + echo ALL = $ALL >> configure.log |
3187 | +diff --git a/contrib/README.contrib b/contrib/README.contrib |
3188 | +index 5e5f950..c57b520 100644 |
3189 | +--- a/contrib/README.contrib |
3190 | ++++ b/contrib/README.contrib |
3191 | +@@ -11,6 +11,10 @@ ada/ by Dmitriy Anisimkov <anisimkov@yahoo.com> |
3192 | + blast/ by Mark Adler <madler@alumni.caltech.edu> |
3193 | + Decompressor for output of PKWare Data Compression Library (DCL) |
3194 | + |
3195 | ++gcc/ by Matheus Castanho <msc@linux.ibm.com> |
3196 | ++ and Rogerio Alves <rcardoso@linux.ibm.com> |
3197 | ++ Optimization helpers using GCC-specific extensions |
3198 | ++ |
3199 | + delphi/ by Cosmin Truta <cosmint@cs.ubbcluj.ro> |
3200 | + Support for Delphi and C++ Builder |
3201 | + |
3202 | +@@ -42,6 +46,10 @@ minizip/ by Gilles Vollant <info@winimage.com> |
3203 | + pascal/ by Bob Dellaca <bobdl@xtra.co.nz> et al. |
3204 | + Support for Pascal |
3205 | + |
3206 | ++power/ by Matheus Castanho <msc@linux.ibm.com> |
3207 | ++ and Rogerio Alves <rcardoso@linux.ibm.com> |
3208 | ++ Optimized functions for Power processors |
3209 | ++ |
3210 | + puff/ by Mark Adler <madler@alumni.caltech.edu> |
3211 | + Small, low memory usage inflate. Also serves to provide an |
3212 | + unambiguous description of the deflate format. |
3213 | +diff --git a/contrib/gcc/zifunc.h b/contrib/gcc/zifunc.h |
3214 | +new file mode 100644 |
3215 | +index 0000000..daf4fe4 |
3216 | +--- /dev/null |
3217 | ++++ b/contrib/gcc/zifunc.h |
3218 | +@@ -0,0 +1,60 @@ |
3219 | ++/* Copyright (C) 2019 Matheus Castanho <msc@linux.ibm.com>, IBM |
3220 | ++ * 2019 Rogerio Alves <rogerio.alves@ibm.com>, IBM |
3221 | ++ * For conditions of distribution and use, see copyright notice in zlib.h |
3222 | ++ */ |
3223 | ++ |
3224 | ++#ifndef Z_IFUNC_H_ |
3225 | ++#define Z_IFUNC_H_ |
3226 | ++ |
3227 | ++/* Helpers for arch optimizations */ |
3228 | ++ |
3229 | ++#define Z_IFUNC(fname) \ |
3230 | ++ typeof(fname) fname __attribute__ ((ifunc (#fname "_resolver"))); \ |
3231 | ++ local typeof(fname) *fname##_resolver(void) |
3232 | ++/* This is a helper macro to declare a resolver for an indirect function |
3233 | ++ * (ifunc). Let's say you have function |
3234 | ++ * |
3235 | ++ * int foo (int a); |
3236 | ++ * |
3237 | ++ * for which you want to provide different implementations, for example: |
3238 | ++ * |
3239 | ++ * int foo_clever (int a) { |
3240 | ++ * ... clever things ... |
3241 | ++ * } |
3242 | ++ * |
3243 | ++ * int foo_smart (int a) { |
3244 | ++ * ... smart things ... |
3245 | ++ * } |
3246 | ++ * |
3247 | ++ * You will have to declare foo() as an indirect function and also provide a |
3248 | ++ * resolver for it, to choose between foo_clever() and foo_smart() based on |
3249 | ++ * some criteria you define (e.g. processor features). |
3250 | ++ * |
3251 | ++ * Since most likely foo() has a default implementation somewhere in zlib, you |
3252 | ++ * may have to rename it so the 'foo' symbol can be used by the ifunc without |
3253 | ++ * conflicts. |
3254 | ++ * |
3255 | ++ * #define foo foo_default |
3256 | ++ * int foo (int a) { |
3257 | ++ * ... |
3258 | ++ * } |
3259 | ++ * #undef foo |
3260 | ++ * |
3261 | ++ * Now you just have to provide a resolver function to choose which function |
3262 | ++ * should be used (decided at runtime on the first call to foo()): |
3263 | ++ * |
3264 | ++ * Z_IFUNC(foo) { |
3265 | ++ * if (... some condition ...) |
3266 | ++ * return foo_clever; |
3267 | ++ * |
3268 | ++ * if (... other condition ...) |
3269 | ++ * return foo_smart; |
3270 | ++ * |
3271 | ++ * return foo_default; |
3272 | ++ * } |
3273 | ++ * |
3274 | ++ * All calls to foo() throughout the code can remain untouched, all the magic |
3275 | ++ * will be done by the linker using the resolver function. |
3276 | ++ */ |
3277 | ++ |
3278 | ++#endif /* Z_IFUNC_H_ */ |
3279 | +diff --git a/contrib/power/power.h b/contrib/power/power.h |
3280 | +new file mode 100644 |
3281 | +index 0000000..b42c7d6 |
3282 | +--- /dev/null |
3283 | ++++ b/contrib/power/power.h |
3284 | +@@ -0,0 +1,4 @@ |
3285 | ++/* Copyright (C) 2019 Matheus Castanho <msc@linux.ibm.com>, IBM |
3286 | ++ * 2019 Rogerio Alves <rogerio.alves@ibm.com>, IBM |
3287 | ++ * For conditions of distribution and use, see copyright notice in zlib.h |
3288 | ++ */ |
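The comment in `contrib/gcc/zifunc.h` describes the resolver pattern in prose. A compact sketch of it might look like the following, assuming GCC or Clang on an ELF/glibc target with ifunc support. The macro body mirrors `Z_IFUNC` from the patch but is spelled with `__typeof__`; `foo`, `foo_default`, `foo_clever` and the pointer-width criterion are hypothetical placeholders for a real function and a real CPU-feature check.

```c
#include <assert.h>

/* Stand-in for zlib's `local` keyword, which expands to `static`. */
#define local static

/* Same shape as Z_IFUNC in contrib/gcc/zifunc.h: declare `fname` as an
 * indirect function and open the definition of its resolver. */
#define Z_IFUNC(fname) \
    __typeof__(fname) fname __attribute__ ((ifunc (#fname "_resolver"))); \
    local __typeof__(fname) *fname##_resolver(void)

/* Public prototype; callers only ever see this symbol. */
int foo(int a);

/* Two interchangeable implementations (hypothetical). */
local int foo_default(int a) { return a * 2; }
local int foo_clever(int a)  { return a + a; }

/* The resolver runs once, while the dynamic loader processes relocations,
 * so it should avoid calling into code that may not be relocated yet. */
Z_IFUNC(foo)
{
    /* Placeholder criterion; a real resolver would test CPU features. */
    if (sizeof(void *) >= 8)
        return foo_clever;
    return foo_default;
}

/* Callers are unchanged: foo() dispatches to whatever the resolver chose. */
local void check_foo(void)
{
    assert(foo(21) == 42);
}
```

Both implementations must be observably equivalent, since callers cannot tell which one was selected; only performance should differ.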
3289 | diff --git a/debian/patches/s390x/add-accel-deflate.patch b/debian/patches/s390x/add-accel-deflate.patch |
3290 | new file mode 100644 |
3291 | index 0000000..1ae9be6 |
3292 | --- /dev/null |
3293 | +++ b/debian/patches/s390x/add-accel-deflate.patch |
3294 | @@ -0,0 +1,2043 @@ |
3295 | +From: Ilya Leoshkevich <iii@linux.ibm.com> |
3296 | +Date: Wed, 18 Jul 2018 13:14:07 +0200 |
3297 | +Subject: Add support for IBM Z hardware-accelerated deflate |
3298 | + |
3299 | +IBM Z mainframes starting from version z15 provide the DFLTCC instruction, |
3300 | +which implements the deflate algorithm in hardware, with estimated |
3301 | +compression and decompression performance orders of magnitude faster |
3302 | +than the current zlib and a ratio comparable to that of level 1. |
3303 | + |
3304 | +This patch adds DFLTCC support to zlib. It can be enabled using the |
3305 | +following build commands: |
3306 | + |
3307 | + $ ./configure --dfltcc |
3308 | + $ make |
3309 | + |
3310 | +When built like this, zlib would compress in hardware on level 1, and |
3311 | +in software on all other levels. Decompression will always happen in |
3312 | +hardware. In order to enable DFLTCC compression for levels 1-6 (i.e., |
3313 | +to make it used by default) one could either configure with |
3314 | +`--dfltcc-level-mask=0x7e` or `export DFLTCC_LEVEL_MASK=0x7e` at run |
3315 | +time. |
3316 | + |
3317 | +Two DFLTCC compression calls produce the same results only when they |
3318 | +both are made on machines of the same generation, and when the |
3319 | +respective buffers have the same offset relative to the start of the |
3320 | +page. Therefore care should be taken when using hardware compression |
3321 | +when reproducible results are desired. One such use case - reproducible |
3322 | +software builds - is handled explicitly: when the `SOURCE_DATE_EPOCH` |
3323 | +environment variable is set, the hardware compression is disabled. |
3324 | + |
3325 | +DFLTCC does not support every single zlib feature, in particular: |
3326 | + |
3327 | + * `inflate(Z_BLOCK)` and `inflate(Z_TREES)` |
3328 | + * `inflateMark()` |
3329 | + * `inflatePrime()` |
3330 | + * `inflateSyncPoint()` |
3331 | + |
3332 | +When used, these functions will either switch to software, or, in case |
3333 | +this is not possible, gracefully fail. |
3334 | + |
3335 | +This patch tries to add DFLTCC support in the least intrusive way. |
3336 | +All SystemZ-specific code is placed into a separate file, but |
3337 | +unfortunately there is still a noticeable amount of changes in the |
3338 | +main zlib code. Below is the summary of these changes. |
3339 | + |
3340 | +DFLTCC takes as arguments a parameter block, an input buffer, an output |
3341 | +buffer and a window. Since DFLTCC requires the parameter block to be |
3342 | +doubleword-aligned, and it's reasonable to allocate it alongside |
3343 | +deflate and inflate states, the `ZALLOC_STATE()`, `ZFREE_STATE()` and |
3344 | +`ZCOPY_STATE()` macros are introduced in order to encapsulate the |
3345 | +allocation details. The same is true for the window, for which |
3346 | +the `ZALLOC_WINDOW()` and `TRY_FREE_WINDOW()` macros are introduced. |
3347 | + |
3348 | +Software and hardware window formats do not match, therefore, |
3349 | +`deflateSetDictionary()`, `deflateGetDictionary()`, |
3350 | +`inflateSetDictionary()` and `inflateGetDictionary()` need special |
3351 | +handling, which is triggered using the new |
3352 | +`DEFLATE_SET_DICTIONARY_HOOK()`, `DEFLATE_GET_DICTIONARY_HOOK()`, |
3353 | +`INFLATE_SET_DICTIONARY_HOOK()` and `INFLATE_GET_DICTIONARY_HOOK()` |
3354 | +macros. |
3355 | + |
3356 | +`deflateResetKeep()` and `inflateResetKeep()` now update the DFLTCC |
3357 | +parameter block, which is allocated alongside zlib state, using |
3358 | +the new `DEFLATE_RESET_KEEP_HOOK()` and `INFLATE_RESET_KEEP_HOOK()` |
3359 | +macros. |
3360 | + |
3361 | +The new `DEFLATE_PARAMS_HOOK()` macro switches between the hardware |
3362 | +and the software deflate implementations when the `deflateParams()` |
3363 | +arguments demand this. |
3364 | + |
3365 | +The new `INFLATE_PRIME_HOOK()`, `INFLATE_MARK_HOOK()` and |
3366 | +`INFLATE_SYNC_POINT_HOOK()` macros make the respective unsupported |
3367 | +calls gracefully fail. |
3368 | + |
3369 | +The algorithm implemented in the hardware has a different compression |
3370 | +ratio than the one implemented in software. In order for |
3371 | +`deflateBound()` to return the correct results for the hardware |
3372 | +implementation, the new `DEFLATE_BOUND_ADJUST_COMPLEN()` and |
3373 | +`DEFLATE_NEED_CONSERVATIVE_BOUND()` macros are introduced. |
3374 | + |
3375 | +Actual compression and decompression are handled by the new |
3376 | +`DEFLATE_HOOK()` and `INFLATE_TYPEDO_HOOK()` macros. Since inflation |
3377 | +with DFLTCC manages the window on its own, calling `updatewindow()` is |
3378 | +suppressed using the new `INFLATE_NEED_UPDATEWINDOW()` macro. |
3379 | + |
3380 | +In addition to the compression, DFLTCC computes the CRC-32 and Adler-32 |
3381 | +checksums, therefore, whenever it's used, the software checksumming is |
3382 | +suppressed using the new `DEFLATE_NEED_CHECKSUM()` and |
3383 | +`INFLATE_NEED_CHECKSUM()` macros. |
3384 | + |
3385 | +DFLTCC will refuse to write an End-of-block Symbol if there is no input |
3386 | +data, thus in some cases it is necessary to do this manually. In order |
3387 | +to achieve this, `send_bits()`, `bi_reverse()`, `bi_windup()` and |
3388 | +`flush_pending()` are promoted from `local` to `ZLIB_INTERNAL`. |
3389 | +Furthermore, since the block and the stream termination must be handled |
3390 | +in software as well, `enum block_state` is moved to `deflate.h`. |
3391 | + |
3392 | +Since the first call to `dfltcc_inflate()` already needs the window, |
3393 | +and it might not be allocated yet, `inflate_ensure_window()` is |
3394 | +factored out of `updatewindow()` and made `ZLIB_INTERNAL`. |
3395 | + |
3396 | +Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> |
3397 | +Origin: iii-i/zlib, https://github.com/iii-i/zlib/commit/481ee63d5f8fa12b5c833d32d08a3c74bc62cb20 |
3398 | +--- |
3399 | + Makefile.in | 8 + |
3400 | + compress.c | 14 +- |
3401 | + configure | 24 + |
3402 | + contrib/README.contrib | 4 + |
3403 | + contrib/s390/README.txt | 17 + |
3404 | + contrib/s390/dfltcc.c | 1004 +++++++++++++++++++++++++++++++++++++++++ |
3405 | + contrib/s390/dfltcc.h | 97 ++++ |
3406 | + contrib/s390/dfltcc_deflate.h | 53 +++ |
3407 | + deflate.c | 76 +++- |
3408 | + deflate.h | 12 + |
3409 | + gzguts.h | 4 + |
3410 | + inflate.c | 98 ++-- |
3411 | + inflate.h | 2 + |
3412 | + test/infcover.c | 3 +- |
3413 | + test/minigzip.c | 4 + |
3414 | + trees.c | 8 +- |
3415 | + zutil.h | 2 + |
3416 | + 17 files changed, 1371 insertions(+), 59 deletions(-) |
3417 | + create mode 100644 contrib/s390/README.txt |
3418 | + create mode 100644 contrib/s390/dfltcc.c |
3419 | + create mode 100644 contrib/s390/dfltcc.h |
3420 | + create mode 100644 contrib/s390/dfltcc_deflate.h |
3421 | + |
3422 | +diff --git a/Makefile.in b/Makefile.in |
3423 | +index ede4db3..1710f63 100644 |
3424 | +--- a/Makefile.in |
3425 | ++++ b/Makefile.in |
3426 | +@@ -140,6 +140,14 @@ match.lo: match.S |
3427 | + mv _match.o match.lo |
3428 | + rm -f _match.s |
3429 | + |
3430 | ++dfltcc.o: $(SRCDIR)contrib/s390/dfltcc.c $(SRCDIR)zlib.h zconf.h |
3431 | ++ $(CC) $(CFLAGS) $(ZINC) -c -o $@ $(SRCDIR)contrib/s390/dfltcc.c |
3432 | ++ |
3433 | ++dfltcc.lo: $(SRCDIR)contrib/s390/dfltcc.c $(SRCDIR)zlib.h zconf.h |
3434 | ++ -@mkdir objs 2>/dev/null || test -d objs |
3435 | ++ $(CC) $(SFLAGS) $(ZINC) -DPIC -c -o objs/dfltcc.o $(SRCDIR)contrib/s390/dfltcc.c |
3436 | ++ -@mv objs/dfltcc.o $@ |
3437 | ++ |
3438 | + crc32_test.o: $(SRCDIR)test/crc32_test.c $(SRCDIR)zlib.h zconf.h |
3439 | + $(CC) $(CFLAGS) $(ZINCOUT) -c -o $@ $(SRCDIR)test/crc32_test.c |
3440 | + |
3441 | +diff --git a/compress.c b/compress.c |
3442 | +index f43bacf..08a0660 100644 |
3443 | +--- a/compress.c |
3444 | ++++ b/compress.c |
3445 | +@@ -5,9 +5,15 @@ |
3446 | + |
3447 | + /* @(#) $Id$ */ |
3448 | + |
3449 | +-#define ZLIB_INTERNAL |
3450 | ++#include "zutil.h" |
3451 | + #include "zlib.h" |
3452 | + |
3453 | ++#ifdef DFLTCC |
3454 | ++# include "contrib/s390/dfltcc.h" |
3455 | ++#else |
3456 | ++#define DEFLATE_BOUND_COMPLEN(source_len) 0 |
3457 | ++#endif |
3458 | ++ |
3459 | + /* =========================================================================== |
3460 | + Compresses the source buffer into the destination buffer. The level |
3461 | + parameter has the same meaning as in deflateInit. sourceLen is the byte |
3462 | +@@ -70,6 +76,12 @@ int ZEXPORT compress(Bytef *dest, uLongf *destLen, const Bytef *source, |
3463 | + this function needs to be updated. |
3464 | + */ |
3465 | + uLong ZEXPORT compressBound(uLong sourceLen) { |
3466 | ++ uLong complen = DEFLATE_BOUND_COMPLEN(sourceLen); |
3467 | ++ |
3468 | ++ if (complen > 0) |
3469 | ++ /* Architecture-specific code provided an upper bound. */ |
3470 | ++ return complen + ZLIB_WRAPLEN; |
3471 | ++ |
3472 | + return sourceLen + (sourceLen >> 12) + (sourceLen >> 14) + |
3473 | + (sourceLen >> 25) + 13; |
3474 | + } |
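The `compressBound()` hunk above prefers an architecture-specific bound when one is available and otherwise falls back to the stock formula. A small sketch of that decision with worked numbers follows. `SKETCH_WRAPLEN` is an assumed value for the zlib wrapper overhead (2-byte header plus 4-byte Adler-32 trailer), and the `arch_complen` parameter is a hypothetical stand-in for the result of `DEFLATE_BOUND_COMPLEN()`, where 0 means "no architecture-specific bound".

```c
#include <assert.h>

/* Assumed zlib wrapper overhead: 2-byte header + 4-byte Adler-32 trailer. */
#define SKETCH_WRAPLEN 6UL

/* The stock fallback formula from compressBound(). */
static unsigned long default_bound(unsigned long n)
{
    return n + (n >> 12) + (n >> 14) + (n >> 25) + 13;
}

/* Prefer the architecture-provided bound when it is non-zero. */
static unsigned long bound_sketch(unsigned long n, unsigned long arch_complen)
{
    if (arch_complen > 0)
        return arch_complen + SKETCH_WRAPLEN;
    return default_bound(n);
}

static void check_bounds(void)
{
    /* 1 MiB: 1048576 + 256 + 64 + 0 + 13 */
    assert(default_bound(1UL << 20) == 1048909UL);
    /* Small input without an architecture hook: 100 + 13 */
    assert(bound_sketch(100, 0) == 113UL);
    /* With a (made-up) architecture bound of 200 bytes */
    assert(bound_sketch(100, 200) == 206UL);
}
```

Returning 0 from the hook keeps builds without `DFLTCC` on the unchanged software path, which is what the `#else` branch of the new `#ifdef DFLTCC` block provides.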
3475 | +diff --git a/configure b/configure |
3476 | +index 3372cbf..b99a348 100755 |
3477 | +--- a/configure |
3478 | ++++ b/configure |
3479 | +@@ -117,6 +117,7 @@ case "$1" in |
3480 | + echo ' configure [--const] [--zprefix] [--prefix=PREFIX] [--eprefix=EXPREFIX]' | tee -a configure.log |
3481 | + echo ' [--static] [--64] [--libdir=LIBDIR] [--sharedlibdir=LIBDIR]' | tee -a configure.log |
3482 | + echo ' [--includedir=INCLUDEDIR] [--archs="-arch i386 -arch x86_64"]' | tee -a configure.log |
3483 | ++ echo ' [--dfltcc] [--dfltcc-level-mask=MASK]' | tee -a configure.log |
3484 | + exit 0 ;; |
3485 | + -p*=* | --prefix=*) prefix=`echo $1 | sed 's/.*=//'`; shift ;; |
3486 | + -e*=* | --eprefix=*) exec_prefix=`echo $1 | sed 's/.*=//'`; shift ;; |
3487 | +@@ -143,6 +144,16 @@ case "$1" in |
3488 | + --sanitize) address=1; shift ;; |
3489 | + --address) address=1; shift ;; |
3490 | + --memory) memory=1; shift ;; |
3491 | ++ --dfltcc) |
3492 | ++ CFLAGS="$CFLAGS -DDFLTCC" |
3493 | ++ OBJC="$OBJC dfltcc.o" |
3494 | ++ PIC_OBJC="$PIC_OBJC dfltcc.lo" |
3495 | ++ shift |
3496 | ++ ;; |
3497 | ++ --dfltcc-level-mask=*) |
3498 | ++ CFLAGS="$CFLAGS -DDFLTCC_LEVEL_MASK=`echo $1 | sed 's/.*=//'`" |
3499 | ++ shift |
3500 | ++ ;; |
3501 | + *) |
3502 | + echo "unknown option: $1" | tee -a configure.log |
3503 | + echo "$0 --help for help" | tee -a configure.log |
3504 | +@@ -834,6 +845,19 @@ EOF |
3505 | + fi |
3506 | + fi |
3507 | + |
3508 | ++# Check whether sys/sdt.h is available |
3509 | ++cat > $test.c << EOF |
3510 | ++#include <sys/sdt.h> |
3511 | ++int main() { return 0; } |
3512 | ++EOF |
3513 | ++if try $CC -c $CFLAGS $test.c; then |
3514 | ++ echo "Checking for sys/sdt.h ... Yes." | tee -a configure.log |
3515 | ++ CFLAGS="$CFLAGS -DHAVE_SYS_SDT_H" |
3516 | ++ SFLAGS="$SFLAGS -DHAVE_SYS_SDT_H" |
3517 | ++else |
3518 | ++ echo "Checking for sys/sdt.h ... No." | tee -a configure.log |
3519 | ++fi |
3520 | ++ |
3521 | + # test to see if we can use a gnu indirection function to detect and load optimized code at runtime |
3522 | + echo >> configure.log |
3523 | + cat > $test.c <<EOF |
3524 | +diff --git a/contrib/README.contrib b/contrib/README.contrib |
3525 | +index 90170df..a36d404 100644 |
3526 | +--- a/contrib/README.contrib |
3527 | ++++ b/contrib/README.contrib |
3528 | +@@ -55,6 +55,10 @@ puff/ by Mark Adler <madler@alumni.caltech.edu> |
3529 | + Small, low memory usage inflate. Also serves to provide an |
3530 | + unambiguous description of the deflate format. |
3531 | + |
3532 | ++s390/ by Ilya Leoshkevich <iii@linux.ibm.com> |
3533 | ++ Hardware-accelerated deflate on IBM Z with DEFLATE CONVERSION CALL |
3534 | ++ instruction. |
3535 | ++ |
3536 | + testzlib/ by Gilles Vollant <info@winimage.com> |
3537 | + Example of the use of zlib |
3538 | + |
3539 | +diff --git a/contrib/s390/README.txt b/contrib/s390/README.txt |
3540 | +new file mode 100644 |
3541 | +index 0000000..48be008 |
3542 | +--- /dev/null |
3543 | ++++ b/contrib/s390/README.txt |
3544 | +@@ -0,0 +1,17 @@ |
3545 | ++IBM Z mainframes starting from version z15 provide DFLTCC instruction, |
3546 | ++which implements deflate algorithm in hardware with estimated |
3547 | ++compression and decompression performance orders of magnitude faster |
3548 | ++than the current zlib and ratio comparable with that of level 1. |
3549 | ++ |
3550 | ++This directory adds DFLTCC support. In order to enable it, the following |
3551 | ++build commands should be used: |
3552 | ++ |
3553 | ++ $ ./configure --dfltcc |
3554 | ++ $ make |
3555 | ++ |
3556 | ++When built like this, zlib would compress in hardware on level 1, and in |
3557 | ++software on all other levels. Decompression will always happen in |
3558 | ++hardware. In order to enable DFLTCC compression for levels 1-6 (i.e. to |
3559 | ++make it used by default) one could either configure with |
3560 | ++--dfltcc-level-mask=0x7e or set the environment variable |
3561 | ++DFLTCC_LEVEL_MASK to 0x7e at run time. |
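The level mask described above reads most naturally as a bit set in which bit N enables hardware compression for level N: 0x7e has bits 1 through 6 set, matching "levels 1-6". The sketch below illustrates that reading; the helper name `level_uses_hardware`, the constant names and the default mask value 0x02 (level 1 only) are assumptions inferred from the text, not code taken from `dfltcc.c`.

```c
#include <assert.h>

/* Assumed default: only level 1 compresses in hardware. */
#define SKETCH_DEFAULT_MASK 0x02u
/* The mask suggested in the README: levels 1 through 6. */
#define SKETCH_WIDE_MASK    0x7eu

/* Returns 1 if the given compression level (0..9) is enabled in the mask. */
static int level_uses_hardware(unsigned int mask, int level)
{
    if (level < 0 || level > 9)
        return 0;
    return (int)((mask >> level) & 1u);
}

static void check_masks(void)
{
    int level;

    /* Default mask: level 1 only. */
    for (level = 0; level <= 9; level++)
        assert(level_uses_hardware(SKETCH_DEFAULT_MASK, level) == (level == 1));

    /* 0x7e: levels 1..6, which includes zlib's default level 6. */
    for (level = 0; level <= 9; level++)
        assert(level_uses_hardware(SKETCH_WIDE_MASK, level) ==
               (level >= 1 && level <= 6));
}
```

Because zlib's default compression level maps to level 6, a mask of 0x7e is what routes default-level callers to the accelerator.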
3562 | +diff --git a/contrib/s390/dfltcc.c b/contrib/s390/dfltcc.c |
3563 | +new file mode 100644 |
3564 | +index 0000000..f2b222d |
3565 | +--- /dev/null |
3566 | ++++ b/contrib/s390/dfltcc.c |
3567 | +@@ -0,0 +1,1004 @@ |
3568 | ++/* dfltcc.c - SystemZ DEFLATE CONVERSION CALL support. */ |
3569 | ++ |
3570 | ++/* |
3571 | ++ Use the following commands to build zlib with DFLTCC support: |
3572 | ++ |
3573 | ++ $ ./configure --dfltcc |
3574 | ++ $ make |
3575 | ++*/ |
3576 | ++ |
3577 | ++#define _GNU_SOURCE |
3578 | ++#include <ctype.h> |
3579 | ++#include <errno.h> |
3580 | ++#include <inttypes.h> |
3581 | ++#include <stddef.h> |
3582 | ++#include <stdio.h> |
3583 | ++#include <stdint.h> |
3584 | ++#include <stdlib.h> |
3585 | ++#include "../../zutil.h" |
3586 | ++#include "../../deflate.h" |
3587 | ++#include "../../inftrees.h" |
3588 | ++#include "../../inflate.h" |
3589 | ++#include "dfltcc.h" |
3590 | ++#include "dfltcc_deflate.h" |
3591 | ++#ifdef HAVE_SYS_SDT_H |
3592 | ++#include <sys/sdt.h> |
3593 | ++#endif |
3594 | ++ |
3595 | ++/* |
3596 | ++ C wrapper for the DEFLATE CONVERSION CALL instruction. |
3597 | ++ */ |
3598 | ++typedef enum { |
3599 | ++ DFLTCC_CC_OK = 0, |
3600 | ++ DFLTCC_CC_OP1_TOO_SHORT = 1, |
3601 | ++ DFLTCC_CC_OP2_TOO_SHORT = 2, |
3602 | ++ DFLTCC_CC_OP2_CORRUPT = 2, |
3603 | ++ DFLTCC_CC_AGAIN = 3, |
3604 | ++} dfltcc_cc; |
3605 | ++ |
3606 | ++#define DFLTCC_QAF 0 |
3607 | ++#define DFLTCC_GDHT 1 |
3608 | ++#define DFLTCC_CMPR 2 |
3609 | ++#define DFLTCC_XPND 4 |
3610 | ++#define HBT_CIRCULAR (1 << 7) |
3611 | ++#define HB_BITS 15 |
3612 | ++#define HB_SIZE (1 << HB_BITS) |
3613 | ++#define DFLTCC_FACILITY 151 |
3614 | ++ |
3615 | ++local inline dfltcc_cc dfltcc(int fn, void *param, |
3616 | ++ Bytef **op1, size_t *len1, |
3617 | ++ z_const Bytef **op2, size_t *len2, |
3618 | ++ void *hist) |
3619 | ++{ |
3620 | ++ Bytef *t2 = op1 ? *op1 : NULL; |
3621 | ++ size_t t3 = len1 ? *len1 : 0; |
3622 | ++ z_const Bytef *t4 = op2 ? *op2 : NULL; |
3623 | ++ size_t t5 = len2 ? *len2 : 0; |
3624 | ++ register int r0 __asm__("r0") = fn; |
3625 | ++ register void *r1 __asm__("r1") = param; |
3626 | ++ register Bytef *r2 __asm__("r2") = t2; |
3627 | ++ register size_t r3 __asm__("r3") = t3; |
3628 | ++ register z_const Bytef *r4 __asm__("r4") = t4; |
3629 | ++ register size_t r5 __asm__("r5") = t5; |
3630 | ++ int cc; |
3631 | ++ |
3632 | ++ __asm__ volatile( |
3633 | ++#ifdef HAVE_SYS_SDT_H |
3634 | ++ STAP_PROBE_ASM(zlib, dfltcc_entry, |
3635 | ++ STAP_PROBE_ASM_TEMPLATE(5)) |
3636 | ++#endif |
3637 | ++ ".insn rrf,0xb9390000,%[r2],%[r4],%[hist],0\n" |
3638 | ++#ifdef HAVE_SYS_SDT_H |
3639 | ++ STAP_PROBE_ASM(zlib, dfltcc_exit, |
3640 | ++ STAP_PROBE_ASM_TEMPLATE(5)) |
3641 | ++#endif |
3642 | ++ "ipm %[cc]\n" |
3643 | ++ : [r2] "+r" (r2) |
3644 | ++ , [r3] "+r" (r3) |
3645 | ++ , [r4] "+r" (r4) |
3646 | ++ , [r5] "+r" (r5) |
3647 | ++ , [cc] "=r" (cc) |
3648 | ++ : [r0] "r" (r0) |
3649 | ++ , [r1] "r" (r1) |
3650 | ++ , [hist] "r" (hist) |
3651 | ++#ifdef HAVE_SYS_SDT_H |
3652 | ++ , STAP_PROBE_ASM_OPERANDS(5, r2, r3, r4, r5, hist) |
3653 | ++#endif |
3654 | ++ : "cc", "memory"); |
3655 | ++ t2 = r2; t3 = r3; t4 = r4; t5 = r5; |
3656 | ++ |
3657 | ++ if (op1) |
3658 | ++ *op1 = t2; |
3659 | ++ if (len1) |
3660 | ++ *len1 = t3; |
3661 | ++ if (op2) |
3662 | ++ *op2 = t4; |
3663 | ++ if (len2) |
3664 | ++ *len2 = t5; |
3665 | ++ return (cc >> 28) & 3; |
3666 | ++} |
3667 | ++ |
3668 | ++/* |
3669 | ++ Parameter Block for Query Available Functions. |
3670 | ++ */ |
3671 | ++#define static_assert(c, msg) \ |
3672 | ++ __attribute__((unused)) \ |
3673 | ++ static char static_assert_failed_ ## msg[c ? 1 : -1] |
3674 | ++ |
3675 | ++struct dfltcc_qaf_param { |
3676 | ++ char fns[16]; |
3677 | ++ char reserved1[8]; |
3678 | ++ char fmts[2]; |
3679 | ++ char reserved2[6]; |
3680 | ++}; |
3681 | ++ |
3682 | ++static_assert(sizeof(struct dfltcc_qaf_param) == 32, |
3683 | ++ sizeof_struct_dfltcc_qaf_param_is_32); |
3684 | ++ |
3685 | ++local inline int is_bit_set(const char *bits, int n) |
3686 | ++{ |
3687 | ++ return bits[n / 8] & (1 << (7 - (n % 8))); |
3688 | ++} |
3689 | ++ |
3690 | ++local inline void clear_bit(char *bits, int n) |
3691 | ++{ |
3692 | ++ bits[n / 8] &= ~(1 << (7 - (n % 8))); |
3693 | ++} |
3694 | ++ |
3695 | ++#define DFLTCC_FMT0 0 |
3696 | ++ |
3697 | ++/* |
3698 | ++ Parameter Block for Generate Dynamic-Huffman Table, Compress and Expand. |
3699 | ++ */ |
3700 | ++#define CVT_CRC32 0 |
3701 | ++#define CVT_ADLER32 1 |
3702 | ++#define HTT_FIXED 0 |
3703 | ++#define HTT_DYNAMIC 1 |
3704 | ++ |
3705 | ++struct dfltcc_param_v0 { |
3706 | ++ uint16_t pbvn; /* Parameter-Block-Version Number */ |
3707 | ++ uint8_t mvn; /* Model-Version Number */ |
3708 | ++ uint8_t ribm; /* Reserved for IBM use */ |
3709 | ++ unsigned reserved32 : 31; |
3710 | ++ unsigned cf : 1; /* Continuation Flag */ |
3711 | ++ uint8_t reserved64[8]; |
3712 | ++ unsigned nt : 1; /* New Task */ |
3713 | ++ unsigned reserved129 : 1; |
3714 | ++ unsigned cvt : 1; /* Check Value Type */ |
3715 | ++ unsigned reserved131 : 1; |
3716 | ++ unsigned htt : 1; /* Huffman-Table Type */ |
3717 | ++ unsigned bcf : 1; /* Block-Continuation Flag */ |
3718 | ++ unsigned bcc : 1; /* Block Closing Control */ |
3719 | ++ unsigned bhf : 1; /* Block Header Final */ |
3720 | ++ unsigned reserved136 : 1; |
3721 | ++ unsigned reserved137 : 1; |
3722 | ++ unsigned dhtgc : 1; /* DHT Generation Control */ |
3723 | ++ unsigned reserved139 : 5; |
3724 | ++ unsigned reserved144 : 5; |
3725 | ++ unsigned sbb : 3; /* Sub-Byte Boundary */ |
3726 | ++ uint8_t oesc; /* Operation-Ending-Supplemental Code */ |
3727 | ++ unsigned reserved160 : 12; |
3728 | ++ unsigned ifs : 4; /* Incomplete-Function Status */ |
3729 | ++ uint16_t ifl; /* Incomplete-Function Length */ |
3730 | ++ uint8_t reserved192[8]; |
3731 | ++ uint8_t reserved256[8]; |
3732 | ++ uint8_t reserved320[4]; |
3733 | ++ uint16_t hl; /* History Length */ |
3734 | ++ unsigned reserved368 : 1; |
3735 | ++ uint16_t ho : 15; /* History Offset */ |
3736 | ++ uint32_t cv; /* Check Value */ |
3737 | ++ unsigned eobs : 15; /* End-of-block Symbol */ |
3738 | ++ unsigned reserved431: 1; |
3739 | ++ uint8_t eobl : 4; /* End-of-block Length */ |
3740 | ++ unsigned reserved436 : 12; |
3741 | ++ unsigned reserved448 : 4; |
3742 | ++ uint16_t cdhtl : 12; /* Compressed-Dynamic-Huffman Table |
3743 | ++ Length */ |
3744 | ++ uint8_t reserved464[6]; |
3745 | ++ uint8_t cdht[288]; |
3746 | ++ uint8_t reserved[32]; |
3747 | ++ uint8_t csb[1152]; |
3748 | ++}; |
3749 | ++ |
3750 | ++static_assert(sizeof(struct dfltcc_param_v0) == 1536, |
3751 | ++ sizeof_struct_dfltcc_param_v0_is_1536); |
3752 | ++ |
3753 | ++local z_const char *oesc_msg(char *buf, int oesc) |
3754 | ++{ |
3755 | ++ if (oesc == 0x00) |
3756 | ++ return NULL; /* Successful completion */ |
3757 | ++ else { |
3758 | ++ sprintf(buf, "Operation-Ending-Supplemental Code is 0x%.2X", oesc); |
3759 | ++ return buf; |
3760 | ++ } |
3761 | ++} |
3762 | ++ |
3763 | ++/* |
3764 | ++ Extension of inflate_state and deflate_state. Must be doubleword-aligned. |
3765 | ++*/ |
3766 | ++struct dfltcc_state { |
3767 | ++ struct dfltcc_param_v0 param; /* Parameter block. */ |
3768 | ++ struct dfltcc_qaf_param af; /* Available functions. */ |
3769 | ++ uLong level_mask; /* Levels on which to use DFLTCC */ |
3770 | ++ uLong block_size; /* New block each X bytes */ |
3771 | ++ uLong block_threshold; /* New block after total_in > X */ |
3772 | ++ uLong dht_threshold; /* New block only if avail_in >= X */ |
3773 | ++ char msg[64]; /* Buffer for strm->msg */ |
3774 | ++}; |
3775 | ++ |
3776 | ++#define ALIGN_UP(p, size) \ |
3777 | ++ (__typeof__(p))(((uintptr_t)(p) + ((size) - 1)) & ~((size) - 1)) |
3778 | ++ |
3779 | ++#define GET_DFLTCC_STATE(state) ((struct dfltcc_state *)( \ |
3780 | ++ (char *)(state) + ALIGN_UP(sizeof(*state), 8))) |
3781 | ++ |
3782 | ++/* |
3783 | ++ Compress. |
3784 | ++ */ |
3785 | ++local inline int dfltcc_can_deflate_with_params(z_streamp strm, |
3786 | ++ int level, |
3787 | ++ uInt window_bits, |
3788 | ++ int strategy) |
3789 | ++{ |
3790 | ++ deflate_state *state = (deflate_state *)strm->state; |
3791 | ++ struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); |
3792 | ++ |
3793 | ++ /* Unsupported compression settings */ |
3794 | ++ if ((dfltcc_state->level_mask & (1 << level)) == 0) |
3795 | ++ return 0; |
3796 | ++ if (window_bits != HB_BITS) |
3797 | ++ return 0; |
3798 | ++ if (strategy != Z_FIXED && strategy != Z_DEFAULT_STRATEGY) |
3799 | ++ return 0; |
3800 | ++ |
3801 | ++ /* Unsupported hardware */ |
3802 | ++ if (!is_bit_set(dfltcc_state->af.fns, DFLTCC_GDHT) || |
3803 | ++ !is_bit_set(dfltcc_state->af.fns, DFLTCC_CMPR) || |
3804 | ++ !is_bit_set(dfltcc_state->af.fmts, DFLTCC_FMT0)) |
3805 | ++ return 0; |
3806 | ++ |
3807 | ++ return 1; |
3808 | ++} |
3809 | ++ |
3810 | ++int ZLIB_INTERNAL dfltcc_can_deflate(z_streamp strm) |
3811 | ++{ |
3812 | ++ deflate_state *state = (deflate_state *)strm->state; |
3813 | ++ |
3814 | ++ return dfltcc_can_deflate_with_params(strm, |
3815 | ++ state->level, |
3816 | ++ state->w_bits, |
3817 | ++ state->strategy); |
3818 | ++} |
3819 | ++ |
3820 | ++local void dfltcc_gdht(z_streamp strm) |
3821 | ++{ |
3822 | ++ deflate_state *state = (deflate_state *)strm->state; |
3823 | ++ struct dfltcc_param_v0 *param = &GET_DFLTCC_STATE(state)->param; |
3824 | ++ size_t avail_in = strm->avail_in; |
3825 | ++ |
3826 | ++ dfltcc(DFLTCC_GDHT, |
3827 | ++ param, NULL, NULL, |
3828 | ++ &strm->next_in, &avail_in, NULL); |
3829 | ++} |
3830 | ++ |
3831 | ++local dfltcc_cc dfltcc_cmpr(z_streamp strm) |
3832 | ++{ |
3833 | ++ deflate_state *state = (deflate_state *)strm->state; |
3834 | ++ struct dfltcc_param_v0 *param = &GET_DFLTCC_STATE(state)->param; |
3835 | ++ size_t avail_in = strm->avail_in; |
3836 | ++ size_t avail_out = strm->avail_out; |
3837 | ++ dfltcc_cc cc; |
3838 | ++ |
3839 | ++ cc = dfltcc(DFLTCC_CMPR | HBT_CIRCULAR, |
3840 | ++ param, &strm->next_out, &avail_out, |
3841 | ++ &strm->next_in, &avail_in, state->window); |
3842 | ++ strm->total_in += (strm->avail_in - avail_in); |
3843 | ++ strm->total_out += (strm->avail_out - avail_out); |
3844 | ++ strm->avail_in = avail_in; |
3845 | ++ strm->avail_out = avail_out; |
3846 | ++ return cc; |
3847 | ++} |
3848 | ++ |
3849 | ++local void send_eobs(z_streamp strm, |
3850 | ++ z_const struct dfltcc_param_v0 *param) |
3851 | ++{ |
3852 | ++ deflate_state *state = (deflate_state *)strm->state; |
3853 | ++ |
3854 | ++ _tr_send_bits( |
3855 | ++ state, |
3856 | ++ bi_reverse(param->eobs >> (15 - param->eobl), param->eobl), |
3857 | ++ param->eobl); |
3858 | ++ flush_pending(strm); |
3859 | ++ if (state->pending != 0) { |
3860 | ++ /* The remaining data is located in pending_out[0:pending]. If someone |
3861 | ++ * calls put_byte() - this might happen in deflate() - the byte will be |
3862 | ++ * placed into pending_buf[pending], which is incorrect. Move the |
3863 | ++ * remaining data to the beginning of pending_buf so that put_byte() is |
3864 | ++ * usable again. |
3865 | ++ */ |
3866 | ++ memmove(state->pending_buf, state->pending_out, state->pending); |
3867 | ++ state->pending_out = state->pending_buf; |
3868 | ++ } |
3869 | ++#ifdef ZLIB_DEBUG |
3870 | ++ state->compressed_len += param->eobl; |
3871 | ++#endif |
3872 | ++} |
3873 | ++ |
3874 | ++int ZLIB_INTERNAL dfltcc_deflate(z_streamp strm, int flush, |
3875 | ++ block_state *result) |
3876 | ++{ |
3877 | ++ deflate_state *state = (deflate_state *)strm->state; |
3878 | ++ struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); |
3879 | ++ struct dfltcc_param_v0 *param = &dfltcc_state->param; |
3880 | ++ uInt masked_avail_in; |
3881 | ++ dfltcc_cc cc; |
3882 | ++ int need_empty_block; |
3883 | ++ int soft_bcc; |
3884 | ++ int no_flush; |
3885 | ++ |
3886 | ++ if (!dfltcc_can_deflate(strm)) { |
3887 | ++ /* Clear history. */ |
3888 | ++ if (flush == Z_FULL_FLUSH) |
3889 | ++ param->hl = 0; |
3890 | ++ return 0; |
3891 | ++ } |
3892 | ++ |
3893 | ++again: |
3894 | ++ masked_avail_in = 0; |
3895 | ++ soft_bcc = 0; |
3896 | ++ no_flush = flush == Z_NO_FLUSH; |
3897 | ++ |
3898 | ++ /* No input data. Return, except when Continuation Flag is set, which means |
3899 | ++ * that DFLTCC has buffered some output in the parameter block and needs to |
3900 | ++ * be called again in order to flush it. |
3901 | ++ */ |
3902 | ++ if (strm->avail_in == 0 && !param->cf) { |
3903 | ++ /* A block is still open, and the hardware does not support closing |
3904 | ++ * blocks without adding data. Thus, close it manually. |
3905 | ++ */ |
3906 | ++ if (!no_flush && param->bcf) { |
3907 | ++ send_eobs(strm, param); |
3908 | ++ param->bcf = 0; |
3909 | ++ } |
3910 | ++ /* Let one of deflate_* functions write a trailing empty block. */ |
3911 | ++ if (flush == Z_FINISH) |
3912 | ++ return 0; |
3913 | ++ /* Clear history. */ |
3914 | ++ if (flush == Z_FULL_FLUSH) |
3915 | ++ param->hl = 0; |
3916 | ++ /* Trigger block post-processing if necessary. */ |
3917 | ++ *result = no_flush ? need_more : block_done; |
3918 | ++ return 1; |
3919 | ++ } |
3920 | ++ |
3921 | ++ /* There is an open non-BFINAL block, we are not going to close it just |
3922 | ++ * yet, we have compressed more than DFLTCC_BLOCK_SIZE bytes and we see |
3923 | ++ * more than DFLTCC_DHT_MIN_SAMPLE_SIZE bytes. Open a new block with a new |
3924 | ++ * DHT in order to adapt to a possibly changed input data distribution. |
3925 | ++ */ |
3926 | ++ if (param->bcf && no_flush && |
3927 | ++ strm->total_in > dfltcc_state->block_threshold && |
3928 | ++ strm->avail_in >= dfltcc_state->dht_threshold) { |
3929 | ++ if (param->cf) { |
3930 | ++ /* We need to flush the DFLTCC buffer before writing the |
3931 | ++ * End-of-block Symbol. Mask the input data and proceed as usual. |
3932 | ++ */ |
3933 | ++ masked_avail_in += strm->avail_in; |
3934 | ++ strm->avail_in = 0; |
3935 | ++ no_flush = 0; |
3936 | ++ } else { |
3937 | ++ /* DFLTCC buffer is empty, so we can manually write the |
3938 | ++ * End-of-block Symbol right away. |
3939 | ++ */ |
3940 | ++ send_eobs(strm, param); |
3941 | ++ param->bcf = 0; |
3942 | ++ dfltcc_state->block_threshold = |
3943 | ++ strm->total_in + dfltcc_state->block_size; |
3944 | ++ } |
3945 | ++ } |
3946 | ++ |
3947 | ++ /* No space for compressed data. If we proceed, dfltcc_cmpr() will return |
3948 | ++ * DFLTCC_CC_OP1_TOO_SHORT without buffering header bits, but we will still |
3949 | ++ * set BCF=1, which is wrong. Avoid complications and return early. |
3950 | ++ */ |
3951 | ++ if (strm->avail_out == 0) { |
3952 | ++ *result = need_more; |
3953 | ++ return 1; |
3954 | ++ } |
3955 | ++ |
3956 | ++ /* The caller gave us too much data. Pass only one block worth of |
3957 | ++ * uncompressed data to DFLTCC and mask the rest, so that on the next |
3958 | ++ * iteration we start a new block. |
3959 | ++ */ |
3960 | ++ if (no_flush && strm->avail_in > dfltcc_state->block_size) { |
3961 | ++ masked_avail_in += (strm->avail_in - dfltcc_state->block_size); |
3962 | ++ strm->avail_in = dfltcc_state->block_size; |
3963 | ++ } |
3964 | ++ |
3965 | ++ /* When we have an open non-BFINAL deflate block and caller indicates that |
3966 | ++ * the stream is ending, we need to close an open deflate block and open a |
3967 | ++ * BFINAL one. |
3968 | ++ */ |
3969 | ++ need_empty_block = flush == Z_FINISH && param->bcf && !param->bhf; |
3970 | ++ |
3971 | ++ /* Translate stream to parameter block */ |
3972 | ++ param->cvt = state->wrap == 2 ? CVT_CRC32 : CVT_ADLER32; |
3973 | ++ if (!no_flush) |
3974 | ++ /* We need to close a block. Always do this in software - when there is |
3975 | ++ * no input data, the hardware will not honor BCC. */ |
3976 | ++ soft_bcc = 1; |
3977 | ++ if (flush == Z_FINISH && !param->bcf) |
3978 | ++ /* We are about to open a BFINAL block, set Block Header Final bit |
3979 | ++ * until the stream ends. |
3980 | ++ */ |
3981 | ++ param->bhf = 1; |
3982 | ++ /* DFLTCC-CMPR will write to next_out, so make sure that buffers with |
3983 | ++ * higher precedence are empty. |
3984 | ++ */ |
3985 | ++ Assert(state->pending == 0, "There must be no pending bytes"); |
3986 | ++ Assert(state->bi_valid < 8, "There must be less than 8 pending bits"); |
3987 | ++ param->sbb = (unsigned int)state->bi_valid; |
3988 | ++ if (param->sbb > 0) |
3989 | ++ *strm->next_out = (Bytef)state->bi_buf; |
3990 | ++ /* Honor history and check value */ |
3991 | ++ param->nt = 0; |
3992 | ++ if (state->wrap == 1) |
3993 | ++ param->cv = strm->adler; |
3994 | ++ else if (state->wrap == 2) |
3995 | ++ param->cv = ZSWAP32(strm->adler); |
3996 | ++ |
3997 | ++ /* When opening a block, choose a Huffman-Table Type */ |
3998 | ++ if (!param->bcf) { |
3999 | ++ if (state->strategy == Z_FIXED || |
4000 | ++ (strm->total_in == 0 && dfltcc_state->block_threshold > 0)) |
4001 | ++ param->htt = HTT_FIXED; |
4002 | ++ else { |
4003 | ++ param->htt = HTT_DYNAMIC; |
4004 | ++ dfltcc_gdht(strm); |
4005 | ++ } |
4006 | ++ } |
4007 | ++ |
4008 | ++ /* Deflate */ |
4009 | ++ do { |
4010 | ++ cc = dfltcc_cmpr(strm); |
4011 | ++ if (strm->avail_in < 4096 && masked_avail_in > 0) |
4012 | ++ /* We are about to call DFLTCC with a small input buffer, which is |
4013 | ++ * inefficient. Since there is masked data, there will be at least |
4014 | ++ * one more DFLTCC call, so skip the current one and make the next |
4015 | ++ * one handle more data. |
4016 | ++ */ |
4017 | ++ break; |
4018 | ++ } while (cc == DFLTCC_CC_AGAIN); |
4019 | ++ |
4020 | ++ /* Translate parameter block to stream */ |
4021 | ++ strm->msg = oesc_msg(dfltcc_state->msg, param->oesc); |
4022 | ++ state->bi_valid = param->sbb; |
4023 | ++ if (state->bi_valid == 0) |
4024 | ++ state->bi_buf = 0; /* Avoid accessing next_out */ |
4025 | ++ else |
4026 | ++ state->bi_buf = *strm->next_out & ((1 << state->bi_valid) - 1); |
4027 | ++ if (state->wrap == 1) |
4028 | ++ strm->adler = param->cv; |
4029 | ++ else if (state->wrap == 2) |
4030 | ++ strm->adler = ZSWAP32(param->cv); |
4031 | ++ |
4032 | ++ /* Unmask the input data */ |
4033 | ++ strm->avail_in += masked_avail_in; |
4034 | ++ masked_avail_in = 0; |
4035 | ++ |
4036 | ++ /* If we encounter an error, it means there is a bug in DFLTCC call */ |
4037 | ++ Assert(cc != DFLTCC_CC_OP2_CORRUPT || param->oesc == 0, "BUG"); |
4038 | ++ |
4039 | ++ /* Update Block-Continuation Flag. It will be used to check whether to call |
4040 | ++ * GDHT the next time. |
4041 | ++ */ |
4042 | ++ if (cc == DFLTCC_CC_OK) { |
4043 | ++ if (soft_bcc) { |
4044 | ++ send_eobs(strm, param); |
4045 | ++ param->bcf = 0; |
4046 | ++ dfltcc_state->block_threshold = |
4047 | ++ strm->total_in + dfltcc_state->block_size; |
4048 | ++ } else |
4049 | ++ param->bcf = 1; |
4050 | ++ if (flush == Z_FINISH) { |
4051 | ++ if (need_empty_block) |
4052 | ++ /* Make the current deflate() call also close the stream */ |
4053 | ++ return 0; |
4054 | ++ else { |
4055 | ++ bi_windup(state); |
4056 | ++ *result = finish_done; |
4057 | ++ } |
4058 | ++ } else { |
4059 | ++ if (flush == Z_FULL_FLUSH) |
4060 | ++ param->hl = 0; /* Clear history */ |
4061 | ++ *result = flush == Z_NO_FLUSH ? need_more : block_done; |
4062 | ++ } |
4063 | ++ } else { |
4064 | ++ param->bcf = 1; |
4065 | ++ *result = need_more; |
4066 | ++ } |
4067 | ++ if (strm->avail_in != 0 && strm->avail_out != 0) |
4068 | ++ goto again; /* deflate() must use all input or all output */ |
4069 | ++ return 1; |
4070 | ++} |
4071 | ++ |
4072 | ++/* |
4073 | ++ Expand. |
4074 | ++ */ |
4075 | ++int ZLIB_INTERNAL dfltcc_can_inflate(z_streamp strm) |
4076 | ++{ |
4077 | ++ struct inflate_state *state = (struct inflate_state *)strm->state; |
4078 | ++ struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); |
4079 | ++ |
4080 | ++ /* Usable only if the hardware supports XPND and format 0 */ |
4081 | ++ return is_bit_set(dfltcc_state->af.fns, DFLTCC_XPND) && |
4082 | ++ is_bit_set(dfltcc_state->af.fmts, DFLTCC_FMT0); |
4083 | ++} |
4084 | ++ |
4085 | ++local dfltcc_cc dfltcc_xpnd(z_streamp strm) |
4086 | ++{ |
4087 | ++ struct inflate_state *state = (struct inflate_state *)strm->state; |
4088 | ++ struct dfltcc_param_v0 *param = &GET_DFLTCC_STATE(state)->param; |
4089 | ++ size_t avail_in = strm->avail_in; |
4090 | ++ size_t avail_out = strm->avail_out; |
4091 | ++ dfltcc_cc cc; |
4092 | ++ |
4093 | ++ cc = dfltcc(DFLTCC_XPND | HBT_CIRCULAR, |
4094 | ++ param, &strm->next_out, &avail_out, |
4095 | ++ &strm->next_in, &avail_in, state->window); |
4096 | ++ strm->avail_in = avail_in; |
4097 | ++ strm->avail_out = avail_out; |
4098 | ++ return cc; |
4099 | ++} |
4100 | ++ |
4101 | ++dfltcc_inflate_action ZLIB_INTERNAL dfltcc_inflate(z_streamp strm, int flush, |
4102 | ++ int *ret) |
4103 | ++{ |
4104 | ++ struct inflate_state *state = (struct inflate_state *)strm->state; |
4105 | ++ struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); |
4106 | ++ struct dfltcc_param_v0 *param = &dfltcc_state->param; |
4107 | ++ dfltcc_cc cc; |
4108 | ++ |
4109 | ++ if (flush == Z_BLOCK || flush == Z_TREES) { |
4110 | ++ /* DFLTCC does not support stopping on block boundaries */ |
4111 | ++ if (dfltcc_inflate_disable(strm)) { |
4112 | ++ *ret = Z_STREAM_ERROR; |
4113 | ++ return DFLTCC_INFLATE_BREAK; |
4114 | ++ } else |
4115 | ++ return DFLTCC_INFLATE_SOFTWARE; |
4116 | ++ } |
4117 | ++ |
4118 | ++ if (state->last) { |
4119 | ++ if (state->bits != 0) { |
4120 | ++ strm->next_in++; |
4121 | ++ strm->avail_in--; |
4122 | ++ state->bits = 0; |
4123 | ++ } |
4124 | ++ state->mode = CHECK; |
4125 | ++ return DFLTCC_INFLATE_CONTINUE; |
4126 | ++ } |
4127 | ++ |
4128 | ++ if (strm->avail_in == 0 && !param->cf) |
4129 | ++ return DFLTCC_INFLATE_BREAK; |
4130 | ++ |
4131 | ++ if (inflate_ensure_window(state)) { |
4132 | ++ state->mode = MEM; |
4133 | ++ return DFLTCC_INFLATE_CONTINUE; |
4134 | ++ } |
4135 | ++ |
4136 | ++ /* Translate stream to parameter block */ |
4137 | ++ param->cvt = ((state->wrap & 4) && state->flags) ? CVT_CRC32 : CVT_ADLER32; |
4138 | ++ param->sbb = state->bits; |
4139 | ++ if (param->hl) |
4140 | ++ param->nt = 0; /* Honor history for the first block */ |
4141 | ++ if (state->wrap & 4) |
4142 | ++ param->cv = state->flags ? ZSWAP32(state->check) : state->check; |
4143 | ++ |
4144 | ++ /* Inflate */ |
4145 | ++ do { |
4146 | ++ cc = dfltcc_xpnd(strm); |
4147 | ++ } while (cc == DFLTCC_CC_AGAIN); |
4148 | ++ |
4149 | ++ /* Translate parameter block to stream */ |
4150 | ++ strm->msg = oesc_msg(dfltcc_state->msg, param->oesc); |
4151 | ++ state->last = cc == DFLTCC_CC_OK; |
4152 | ++ state->bits = param->sbb; |
4153 | ++ if (state->wrap & 4) |
4154 | ++ strm->adler = state->check = state->flags ? |
4155 | ++ ZSWAP32(param->cv) : param->cv; |
4156 | ++ if (cc == DFLTCC_CC_OP2_CORRUPT && param->oesc != 0) { |
4157 | ++ /* Report an error if stream is corrupted */ |
4158 | ++ state->mode = BAD; |
4159 | ++ return DFLTCC_INFLATE_CONTINUE; |
4160 | ++ } |
4161 | ++ state->mode = TYPEDO; |
4162 | ++ /* Break if operands are exhausted, otherwise continue looping */ |
4163 | ++ return (cc == DFLTCC_CC_OP1_TOO_SHORT || cc == DFLTCC_CC_OP2_TOO_SHORT) ? |
4164 | ++ DFLTCC_INFLATE_BREAK : DFLTCC_INFLATE_CONTINUE; |
4165 | ++} |
4166 | ++ |
4167 | ++int ZLIB_INTERNAL dfltcc_was_inflate_used(z_streamp strm) |
4168 | ++{ |
4169 | ++ struct inflate_state *state = (struct inflate_state *)strm->state; |
4170 | ++ struct dfltcc_param_v0 *param = &GET_DFLTCC_STATE(state)->param; |
4171 | ++ |
4172 | ++ return !param->nt; |
4173 | ++} |
4174 | ++ |
4175 | ++/* |
4176 | ++ Rotates a circular buffer. |
4177 | ++ The implementation is based on https://cplusplus.com/reference/algorithm/rotate/ |
4178 | ++ */ |
4179 | ++local void rotate(Bytef *start, Bytef *pivot, Bytef *end) |
4180 | ++{ |
4181 | ++ Bytef *p = pivot; |
4182 | ++ Bytef tmp; |
4183 | ++ |
4184 | ++ while (p != start) { |
4185 | ++ tmp = *start; |
4186 | ++ *start = *p; |
4187 | ++ *p = tmp; |
4188 | ++ |
4189 | ++ start++; |
4190 | ++ p++; |
4191 | ++ |
4192 | ++ if (p == end) |
4193 | ++ p = pivot; |
4194 | ++ else if (start == pivot) |
4195 | ++ pivot = p; |
4196 | ++ } |
4197 | ++} |
4198 | ++ |
4199 | ++#define MIN(x, y) ({ \ |
4200 | ++ typeof(x) _x = (x); \ |
4201 | ++ typeof(y) _y = (y); \ |
4202 | ++ _x < _y ? _x : _y; \ |
4203 | ++}) |
4204 | ++ |
4205 | ++#define MAX(x, y) ({ \ |
4206 | ++ typeof(x) _x = (x); \ |
4207 | ++ typeof(y) _y = (y); \ |
4208 | ++ _x > _y ? _x : _y; \ |
4209 | ++}) |
4210 | ++ |
4211 | ++int ZLIB_INTERNAL dfltcc_inflate_disable(z_streamp strm) |
4212 | ++{ |
4213 | ++ struct inflate_state *state = (struct inflate_state *)strm->state; |
4214 | ++ struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); |
4215 | ++ struct dfltcc_param_v0 *param = &dfltcc_state->param; |
4216 | ++ |
4217 | ++ if (!dfltcc_can_inflate(strm)) |
4218 | ++ return 0; |
4219 | ++ if (dfltcc_was_inflate_used(strm)) |
4220 | ++ /* DFLTCC has already decompressed some data. Since there is not |
4221 | ++ * enough information to resume decompression in software, the call |
4222 | ++ * must fail. |
4223 | ++ */ |
4224 | ++ return 1; |
4225 | ++ /* DFLTCC was not used yet - decompress in software */ |
4226 | ++ memset(&dfltcc_state->af, 0, sizeof(dfltcc_state->af)); |
4227 | ++ /* Convert the window from the hardware to the software format */ |
4228 | ++ rotate(state->window, state->window + param->ho, state->window + HB_SIZE); |
4229 | ++ state->whave = state->wnext = MIN(param->hl, state->wsize); |
4230 | ++ return 0; |
4231 | ++} |
4232 | ++ |
4233 | ++local int env_dfltcc_disabled; |
4234 | ++local int env_source_date_epoch; |
4235 | ++local unsigned long env_level_mask; |
4236 | ++local unsigned long env_block_size; |
4237 | ++local unsigned long env_block_threshold; |
4238 | ++local unsigned long env_dht_threshold; |
4239 | ++local unsigned long env_ribm; |
4240 | ++local uint64_t cpu_facilities[(DFLTCC_FACILITY / 64) + 1]; |
4241 | ++local struct dfltcc_qaf_param cpu_af __attribute__((aligned(8))); |
4242 | ++ |
4243 | ++local inline int is_dfltcc_enabled(void) |
4244 | ++{ |
4245 | ++ if (env_dfltcc_disabled) |
4246 | ++ /* User has explicitly disabled DFLTCC. */ |
4247 | ++ return 0; |
4248 | ++ |
4249 | ++ return is_bit_set((const char *)cpu_facilities, DFLTCC_FACILITY); |
4250 | ++} |
4251 | ++ |
4252 | ++local unsigned long xstrtoul(const char *s, unsigned long _default) |
4253 | ++{ |
4254 | ++ char *endptr; |
4255 | ++ unsigned long result; |
4256 | ++ |
4257 | ++ if (!(s && *s)) |
4258 | ++ return _default; |
4259 | ++ errno = 0; |
4260 | ++ result = strtoul(s, &endptr, 0); |
4261 | ++ return (errno || *endptr) ? _default : result; |
4262 | ++} |
4263 | ++ |
4264 | ++__attribute__((constructor)) local void init_globals(void) |
4265 | ++{ |
4266 | ++ const char *env; |
4267 | ++ register char r0 __asm__("r0"); |
4268 | ++ |
4269 | ++ env = secure_getenv("DFLTCC"); |
4270 | ++ env_dfltcc_disabled = env && !strcmp(env, "0"); |
4271 | ++ |
4272 | ++ env = secure_getenv("SOURCE_DATE_EPOCH"); |
4273 | ++ env_source_date_epoch = !!env; |
4274 | ++ |
4275 | ++#ifndef DFLTCC_LEVEL_MASK |
4276 | ++#define DFLTCC_LEVEL_MASK 0x2 |
4277 | ++#endif |
4278 | ++ env_level_mask = xstrtoul(secure_getenv("DFLTCC_LEVEL_MASK"), |
4279 | ++ DFLTCC_LEVEL_MASK); |
4280 | ++ |
4281 | ++#ifndef DFLTCC_BLOCK_SIZE |
4282 | ++#define DFLTCC_BLOCK_SIZE 1048576 |
4283 | ++#endif |
4284 | ++ env_block_size = xstrtoul(secure_getenv("DFLTCC_BLOCK_SIZE"), |
4285 | ++ DFLTCC_BLOCK_SIZE); |
4286 | ++ |
4287 | ++#ifndef DFLTCC_FIRST_FHT_BLOCK_SIZE |
4288 | ++#define DFLTCC_FIRST_FHT_BLOCK_SIZE 4096 |
4289 | ++#endif |
4290 | ++ env_block_threshold = xstrtoul(secure_getenv("DFLTCC_FIRST_FHT_BLOCK_SIZE"), |
4291 | ++ DFLTCC_FIRST_FHT_BLOCK_SIZE); |
4292 | ++ |
4293 | ++#ifndef DFLTCC_DHT_MIN_SAMPLE_SIZE |
4294 | ++#define DFLTCC_DHT_MIN_SAMPLE_SIZE 4096 |
4295 | ++#endif |
4296 | ++ env_dht_threshold = xstrtoul(secure_getenv("DFLTCC_DHT_MIN_SAMPLE_SIZE"), |
4297 | ++ DFLTCC_DHT_MIN_SAMPLE_SIZE); |
4298 | ++ |
4299 | ++#ifndef DFLTCC_RIBM |
4300 | ++#define DFLTCC_RIBM 0 |
4301 | ++#endif |
4302 | ++ env_ribm = xstrtoul(secure_getenv("DFLTCC_RIBM"), DFLTCC_RIBM); |
4303 | ++ |
4304 | ++ memset(cpu_facilities, 0, sizeof(cpu_facilities)); |
4305 | ++ r0 = sizeof(cpu_facilities) / sizeof(cpu_facilities[0]) - 1; |
4306 | ++ /* STFLE is supported since z9-109 and only in z/Architecture mode. When |
4307 | ++ * compiling with -m31, gcc defaults to ESA mode, however, since the kernel |
4308 | ++ * is 64-bit, it's always z/Architecture mode at runtime. |
4309 | ++ */ |
4310 | ++ __asm__ volatile( |
4311 | ++#ifndef __clang__ |
4312 | ++ ".machinemode push\n" |
4313 | ++ ".machinemode zarch\n" |
4314 | ++#endif |
4315 | ++ "stfle %[facilities]\n" |
4316 | ++#ifndef __clang__ |
4317 | ++ ".machinemode pop\n" |
4318 | ++#endif |
4319 | ++ : [facilities] "=Q" (cpu_facilities) |
4320 | ++ , [r0] "+r" (r0) |
4321 | ++ : |
4322 | ++ : "cc"); |
4323 | ++ |
4324 | ++ /* Initialize available functions */ |
4325 | ++ if (is_dfltcc_enabled()) |
4326 | ++ dfltcc(DFLTCC_QAF, &cpu_af, NULL, NULL, NULL, NULL, NULL); |
4327 | ++ else |
4328 | ++ memset(&cpu_af, 0, sizeof(cpu_af)); |
4329 | ++} |
4330 | ++ |
4331 | ++/* |
4332 | ++ Memory management. |
4333 | ++ |
4334 | ++ DFLTCC requires parameter blocks and window to be aligned. zlib allows |
4335 | ++ users to specify their own allocation functions, so using e.g. |
4336 | ++ `posix_memalign' is not an option. Thus, we overallocate and take the |
4337 | ++ aligned portion of the buffer. |
4338 | ++*/ |
4339 | ++void ZLIB_INTERNAL dfltcc_reset(z_streamp strm, uInt size) |
4340 | ++{ |
4341 | ++ struct dfltcc_state *dfltcc_state = |
4342 | ++ (struct dfltcc_state *)((char *)strm->state + ALIGN_UP(size, 8)); |
4343 | ++ |
4344 | ++ memcpy(&dfltcc_state->af, &cpu_af, sizeof(dfltcc_state->af)); |
4345 | ++ |
4346 | ++ if (env_source_date_epoch) |
4347 | ++ /* User needs reproducible results, but the output of DFLTCC_CMPR |
4348 | ++ * depends on buffers' page offsets. |
4349 | ++ */ |
4350 | ++ clear_bit(dfltcc_state->af.fns, DFLTCC_CMPR); |
4351 | ++ |
4352 | ++ /* Initialize parameter block */ |
4353 | ++ memset(&dfltcc_state->param, 0, sizeof(dfltcc_state->param)); |
4354 | ++ dfltcc_state->param.nt = 1; |
4355 | ++ |
4356 | ++ /* Initialize tuning parameters */ |
4357 | ++ dfltcc_state->level_mask = env_level_mask; |
4358 | ++ dfltcc_state->block_size = env_block_size; |
4359 | ++ dfltcc_state->block_threshold = env_block_threshold; |
4360 | ++ dfltcc_state->dht_threshold = env_dht_threshold; |
4361 | ++ dfltcc_state->param.ribm = env_ribm; |
4362 | ++} |
4363 | ++ |
4364 | ++voidpf ZLIB_INTERNAL dfltcc_alloc_state(z_streamp strm, uInt items, uInt size) |
4365 | ++{ |
4366 | ++ return ZALLOC(strm, |
4367 | ++ ALIGN_UP(items * size, 8) + sizeof(struct dfltcc_state), |
4368 | ++ sizeof(unsigned char)); |
4369 | ++} |
4370 | ++ |
4371 | ++void ZLIB_INTERNAL dfltcc_copy_state(voidpf dst, const voidpf src, uInt size) |
4372 | ++{ |
4373 | ++ zmemcpy(dst, src, ALIGN_UP(size, 8) + sizeof(struct dfltcc_state)); |
4374 | ++} |
4375 | ++ |
4376 | ++static const int PAGE_ALIGN = 0x1000; |
4377 | ++ |
4378 | ++voidpf ZLIB_INTERNAL dfltcc_alloc_window(z_streamp strm, uInt items, uInt size) |
4379 | ++{ |
4380 | ++ voidpf p, w; |
4381 | ++ |
4382 | ++ /* To simplify freeing, we store the pointer to the allocated buffer right |
4383 | ++ * before the window. Note that DFLTCC always uses HB_SIZE bytes. |
4384 | ++ */ |
4385 | ++ p = ZALLOC(strm, sizeof(voidpf) + MAX(items * size, HB_SIZE) + PAGE_ALIGN, |
4386 | ++ sizeof(unsigned char)); |
4387 | ++ if (p == NULL) |
4388 | ++ return NULL; |
4389 | ++ w = ALIGN_UP((char *)p + sizeof(voidpf), PAGE_ALIGN); |
4390 | ++ *(voidpf *)((char *)w - sizeof(voidpf)) = p; |
4391 | ++ return w; |
4392 | ++} |
4393 | ++ |
4394 | ++void ZLIB_INTERNAL dfltcc_copy_window(void *dest, const void *src, size_t n) |
4395 | ++{ |
4396 | ++ memcpy(dest, src, MAX(n, HB_SIZE)); |
4397 | ++} |
4398 | ++ |
4399 | ++void ZLIB_INTERNAL dfltcc_free_window(z_streamp strm, voidpf w) |
4400 | ++{ |
4401 | ++ if (w) |
4402 | ++ ZFREE(strm, *(voidpf *)((unsigned char *)w - sizeof(voidpf))); |
4403 | ++} |
4404 | ++ |
4405 | ++/* |
4406 | ++ Switching between hardware and software compression. |
4407 | ++ |
4408 | ++ DFLTCC does not support all zlib settings, e.g. generation of non-compressed |
4409 | ++ blocks or alternative window sizes. When such settings are applied on the |
4410 | ++ fly with deflateParams, we need to convert between hardware and software |
4411 | ++ window formats. |
4412 | ++*/ |
4413 | ++int ZLIB_INTERNAL dfltcc_deflate_params(z_streamp strm, int level, |
4414 | ++ int strategy, int *flush) |
4415 | ++{ |
4416 | ++ deflate_state *state = (deflate_state *)strm->state; |
4417 | ++ struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); |
4418 | ++ struct dfltcc_param_v0 *param = &dfltcc_state->param; |
4419 | ++ int could_deflate = dfltcc_can_deflate(strm); |
4420 | ++ int can_deflate = dfltcc_can_deflate_with_params(strm, |
4421 | ++ level, |
4422 | ++ state->w_bits, |
4423 | ++ strategy); |
4424 | ++ |
4425 | ++ if (can_deflate == could_deflate) |
4426 | ++ /* We continue to work in the same mode - no changes needed */ |
4427 | ++ return Z_OK; |
4428 | ++ |
4429 | ++ if (strm->total_in == 0 && param->nt == 1 && param->hl == 0) |
4430 | ++ /* DFLTCC was not used yet - no changes needed */ |
4431 | ++ return Z_OK; |
4432 | ++ |
4433 | ++ /* For now, do not convert between window formats - simply get rid of the |
4434 | ++ * old data instead. |
4435 | ++ */ |
4436 | ++ *flush = Z_FULL_FLUSH; |
4437 | ++ return Z_OK; |
4438 | ++} |
4439 | ++ |
4440 | ++int ZLIB_INTERNAL dfltcc_deflate_done(z_streamp strm, int flush) |
4441 | ++{ |
4442 | ++ deflate_state *state = (deflate_state *)strm->state; |
4443 | ++ struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); |
4444 | ++ struct dfltcc_param_v0 *param = &dfltcc_state->param; |
4445 | ++ |
4446 | ++ /* When deflate(Z_FULL_FLUSH) is called with small avail_out, it might |
4447 | ++ * close the block without resetting the compression state. Detect this |
4448 | ++ * situation and return that deflation is not done. |
4449 | ++ */ |
4450 | ++ if (flush == Z_FULL_FLUSH && strm->avail_out == 0) |
4451 | ++ return 0; |
4452 | ++ |
4453 | ++ /* Return that deflation is not done if DFLTCC is used and either it |
4454 | ++ * buffered some data (Continuation Flag is set), or has not written EOBS |
4455 | ++ * yet (Block-Continuation Flag is set). |
4456 | ++ */ |
4457 | ++ return !dfltcc_can_deflate(strm) || (!param->cf && !param->bcf); |
4458 | ++} |
4459 | ++ |
4460 | ++/* |
4461 | ++ Preloading history. |
4462 | ++*/ |
4463 | ++local void append_history(struct dfltcc_param_v0 *param, |
4464 | ++ Bytef *history, |
4465 | ++ const Bytef *buf, |
4466 | ++ uInt count) |
4467 | ++{ |
4468 | ++ size_t offset; |
4469 | ++ size_t n; |
4470 | ++ |
4471 | ++ /* Do not use more than 32K */ |
4472 | ++ if (count > HB_SIZE) { |
4473 | ++ buf += count - HB_SIZE; |
4474 | ++ count = HB_SIZE; |
4475 | ++ } |
4476 | ++ offset = (param->ho + param->hl) % HB_SIZE; |
4477 | ++ if (offset + count <= HB_SIZE) |
4478 | ++ /* Circular history buffer does not wrap - copy one chunk */ |
4479 | ++ zmemcpy(history + offset, buf, count); |
4480 | ++ else { |
4481 | ++ /* Circular history buffer wraps - copy two chunks */ |
4482 | ++ n = HB_SIZE - offset; |
4483 | ++ zmemcpy(history + offset, buf, n); |
4484 | ++ zmemcpy(history, buf + n, count - n); |
4485 | ++ } |
4486 | ++ n = param->hl + count; |
4487 | ++ if (n <= HB_SIZE) |
4488 | ++ /* All history fits into buffer - no need to discard anything */ |
4489 | ++ param->hl = n; |
4490 | ++ else { |
4491 | ++ /* History does not fit into buffer - discard extra bytes */ |
4492 | ++ param->ho = (param->ho + (n - HB_SIZE)) % HB_SIZE; |
4493 | ++ param->hl = HB_SIZE; |
4494 | ++ } |
4495 | ++} |
4496 | ++ |
4497 | ++local void get_history(struct dfltcc_param_v0 *param, |
4498 | ++ const Bytef *history, |
4499 | ++ Bytef *buf) |
4500 | ++{ |
4501 | ++ if (param->ho + param->hl <= HB_SIZE) |
4502 | ++ /* Circular history buffer does not wrap - copy one chunk */ |
4503 | ++ memcpy(buf, history + param->ho, param->hl); |
4504 | ++ else { |
4505 | ++ /* Circular history buffer wraps - copy two chunks */ |
4506 | ++ memcpy(buf, history + param->ho, HB_SIZE - param->ho); |
4507 | ++ memcpy(buf + HB_SIZE - param->ho, history, param->ho + param->hl - HB_SIZE); |
4508 | ++ } |
4509 | ++} |
4510 | ++ |
4511 | ++int ZLIB_INTERNAL dfltcc_deflate_set_dictionary(z_streamp strm, |
4512 | ++ const Bytef *dictionary, |
4513 | ++ uInt dict_length) |
4514 | ++{ |
4515 | ++ deflate_state *state = (deflate_state *)strm->state; |
4516 | ++ struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); |
4517 | ++ struct dfltcc_param_v0 *param = &dfltcc_state->param; |
4518 | ++ |
4519 | ++ append_history(param, state->window, dictionary, dict_length); |
4520 | ++ state->strstart = 1; /* Add FDICT to zlib header */ |
4521 | ++ state->block_start = state->strstart; /* Make deflate_stored happy */ |
4522 | ++ return Z_OK; |
4523 | ++} |
4524 | ++ |
4525 | ++int ZLIB_INTERNAL dfltcc_deflate_get_dictionary(z_streamp strm, |
4526 | ++ Bytef *dictionary, |
4527 | ++ uInt *dict_length) |
4528 | ++{ |
4529 | ++ deflate_state *state = (deflate_state *)strm->state; |
4530 | ++ struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); |
4531 | ++ struct dfltcc_param_v0 *param = &dfltcc_state->param; |
4532 | ++ |
4533 | ++ if (dictionary) |
4534 | ++ get_history(param, state->window, dictionary); |
4535 | ++ if (dict_length) |
4536 | ++ *dict_length = param->hl; |
4537 | ++ return Z_OK; |
4538 | ++} |
4539 | ++ |
4540 | ++int ZLIB_INTERNAL dfltcc_inflate_set_dictionary(z_streamp strm, |
4541 | ++ const Bytef *dictionary, |
4542 | ++ uInt dict_length) |
4543 | ++{ |
4544 | ++ struct inflate_state *state = (struct inflate_state *)strm->state; |
4545 | ++ struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); |
4546 | ++ struct dfltcc_param_v0 *param = &dfltcc_state->param; |
4547 | ++ |
4548 | ++ if (inflate_ensure_window(state)) { |
4549 | ++ state->mode = MEM; |
4550 | ++ return Z_MEM_ERROR; |
4551 | ++ } |
4552 | ++ |
4553 | ++ append_history(param, state->window, dictionary, dict_length); |
4554 | ++ state->havedict = 1; |
4555 | ++ return Z_OK; |
4556 | ++} |
4557 | ++ |
4558 | ++int ZLIB_INTERNAL dfltcc_inflate_get_dictionary(z_streamp strm, |
4559 | ++ Bytef *dictionary, |
4560 | ++ uInt *dict_length) |
4561 | ++{ |
4562 | ++ struct inflate_state *state = (struct inflate_state *)strm->state; |
4563 | ++ struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); |
4564 | ++ struct dfltcc_param_v0 *param = &dfltcc_state->param; |
4565 | ++ |
4566 | ++ if (dictionary && state->window) |
4567 | ++ get_history(param, state->window, dictionary); |
4568 | ++ if (dict_length) |
4569 | ++ *dict_length = param->hl; |
4570 | ++ return Z_OK; |
4571 | ++} |
4572 | +diff --git a/contrib/s390/dfltcc.h b/contrib/s390/dfltcc.h |
4573 | +new file mode 100644 |
4574 | +index 0000000..c8491c4 |
4575 | +--- /dev/null |
4576 | ++++ b/contrib/s390/dfltcc.h |
4577 | +@@ -0,0 +1,97 @@ |
4578 | ++#ifndef DFLTCC_H |
4579 | ++#define DFLTCC_H |
4580 | ++ |
4581 | ++#include "../../zlib.h" |
4582 | ++#include "../../zutil.h" |
4583 | ++ |
4584 | ++voidpf ZLIB_INTERNAL dfltcc_alloc_state(z_streamp strm, uInt items, uInt size); |
4585 | ++void ZLIB_INTERNAL dfltcc_copy_state(voidpf dst, const voidpf src, uInt size); |
4586 | ++void ZLIB_INTERNAL dfltcc_reset(z_streamp strm, uInt size); |
4587 | ++voidpf ZLIB_INTERNAL dfltcc_alloc_window(z_streamp strm, uInt items, |
4588 | ++ uInt size); |
4589 | ++void ZLIB_INTERNAL dfltcc_copy_window(void *dest, const void *src, size_t n); |
4590 | ++void ZLIB_INTERNAL dfltcc_free_window(z_streamp strm, voidpf w); |
4591 | ++#define DFLTCC_BLOCK_HEADER_BITS 3 |
4592 | ++#define DFLTCC_HLITS_COUNT_BITS 5 |
4593 | ++#define DFLTCC_HDISTS_COUNT_BITS 5 |
4594 | ++#define DFLTCC_HCLENS_COUNT_BITS 4 |
4595 | ++#define DFLTCC_MAX_HCLENS 19 |
4596 | ++#define DFLTCC_HCLEN_BITS 3 |
4597 | ++#define DFLTCC_MAX_HLITS 286 |
4598 | ++#define DFLTCC_MAX_HDISTS 30 |
4599 | ++#define DFLTCC_MAX_HLIT_HDIST_BITS 7 |
4600 | ++#define DFLTCC_MAX_SYMBOL_BITS 16 |
4601 | ++#define DFLTCC_MAX_EOBS_BITS 15 |
4602 | ++#define DFLTCC_MAX_PADDING_BITS 7 |
4603 | ++#define DEFLATE_BOUND_COMPLEN(source_len) \ |
4604 | ++ ((DFLTCC_BLOCK_HEADER_BITS + \ |
4605 | ++ DFLTCC_HLITS_COUNT_BITS + \ |
4606 | ++ DFLTCC_HDISTS_COUNT_BITS + \ |
4607 | ++ DFLTCC_HCLENS_COUNT_BITS + \ |
4608 | ++ DFLTCC_MAX_HCLENS * DFLTCC_HCLEN_BITS + \ |
4609 | ++ (DFLTCC_MAX_HLITS + DFLTCC_MAX_HDISTS) * DFLTCC_MAX_HLIT_HDIST_BITS + \ |
4610 | ++ (source_len) * DFLTCC_MAX_SYMBOL_BITS + \ |
4611 | ++ DFLTCC_MAX_EOBS_BITS + \ |
4612 | ++ DFLTCC_MAX_PADDING_BITS) >> 3) |
4613 | ++int ZLIB_INTERNAL dfltcc_can_inflate(z_streamp strm); |
4614 | ++typedef enum { |
4615 | ++ DFLTCC_INFLATE_CONTINUE, |
4616 | ++ DFLTCC_INFLATE_BREAK, |
4617 | ++ DFLTCC_INFLATE_SOFTWARE, |
4618 | ++} dfltcc_inflate_action; |
4619 | ++dfltcc_inflate_action ZLIB_INTERNAL dfltcc_inflate(z_streamp strm, |
4620 | ++ int flush, int *ret); |
4621 | ++int ZLIB_INTERNAL dfltcc_was_inflate_used(z_streamp strm); |
4622 | ++int ZLIB_INTERNAL dfltcc_inflate_disable(z_streamp strm); |
4623 | ++int ZLIB_INTERNAL dfltcc_inflate_set_dictionary(z_streamp strm, |
4624 | ++ const Bytef *dictionary, |
4625 | ++ uInt dict_length); |
4626 | ++int ZLIB_INTERNAL dfltcc_inflate_get_dictionary(z_streamp strm, |
4627 | ++ Bytef *dictionary, |
4628 | ++ uInt* dict_length); |
4629 | ++ |
4630 | ++#define ZALLOC_STATE dfltcc_alloc_state |
4631 | ++#define ZFREE_STATE ZFREE |
4632 | ++#define ZCOPY_STATE dfltcc_copy_state |
4633 | ++#define ZALLOC_WINDOW dfltcc_alloc_window |
4634 | ++#define ZCOPY_WINDOW dfltcc_copy_window |
4635 | ++#define ZFREE_WINDOW dfltcc_free_window |
4636 | ++#define TRY_FREE_WINDOW dfltcc_free_window |
4637 | ++#define INFLATE_RESET_KEEP_HOOK(strm) \ |
4638 | ++ dfltcc_reset((strm), sizeof(struct inflate_state)) |
4639 | ++#define INFLATE_PRIME_HOOK(strm, bits, value) \ |
4640 | ++ do { if (dfltcc_inflate_disable((strm))) return Z_STREAM_ERROR; } while (0) |
4641 | ++#define INFLATE_TYPEDO_HOOK(strm, flush) \ |
4642 | ++ if (dfltcc_can_inflate((strm))) { \ |
4643 | ++ dfltcc_inflate_action action; \ |
4644 | ++\ |
4645 | ++ RESTORE(); \ |
4646 | ++ action = dfltcc_inflate((strm), (flush), &ret); \ |
4647 | ++ LOAD(); \ |
4648 | ++ if (action == DFLTCC_INFLATE_CONTINUE) \ |
4649 | ++ break; \ |
4650 | ++ else if (action == DFLTCC_INFLATE_BREAK) \ |
4651 | ++ goto inf_leave; \ |
4652 | ++ } |
4653 | ++#define INFLATE_NEED_CHECKSUM(strm) (!dfltcc_can_inflate((strm))) |
4654 | ++#define INFLATE_NEED_UPDATEWINDOW(strm) (!dfltcc_can_inflate((strm))) |
4655 | ++#define INFLATE_MARK_HOOK(strm) \ |
4656 | ++ do { \ |
4657 | ++ if (dfltcc_was_inflate_used((strm))) return -(1L << 16); \ |
4658 | ++ } while (0) |
4659 | ++#define INFLATE_SYNC_POINT_HOOK(strm) \ |
4660 | ++ do { \ |
4661 | ++ if (dfltcc_was_inflate_used((strm))) return Z_STREAM_ERROR; \ |
4662 | ++ } while (0) |
4663 | ++#define INFLATE_SET_DICTIONARY_HOOK(strm, dict, dict_len) \ |
4664 | ++ do { \ |
4665 | ++ if (dfltcc_can_inflate(strm)) \ |
4666 | ++ return dfltcc_inflate_set_dictionary(strm, dict, dict_len); \ |
4667 | ++ } while (0) |
4668 | ++#define INFLATE_GET_DICTIONARY_HOOK(strm, dict, dict_len) \ |
4669 | ++ do { \ |
4670 | ++ if (dfltcc_can_inflate(strm)) \ |
4671 | ++ return dfltcc_inflate_get_dictionary(strm, dict, dict_len); \ |
4672 | ++ } while (0) |
4673 | ++ |
4674 | ++#endif |
4675 | +diff --git a/contrib/s390/dfltcc_deflate.h b/contrib/s390/dfltcc_deflate.h |
4676 | +new file mode 100644 |
4677 | +index 0000000..2699d15 |
4678 | +--- /dev/null |
4679 | ++++ b/contrib/s390/dfltcc_deflate.h |
4680 | +@@ -0,0 +1,53 @@ |
4681 | ++#ifndef DFLTCC_DEFLATE_H |
4682 | ++#define DFLTCC_DEFLATE_H |
4683 | ++ |
4684 | ++#include "dfltcc.h" |
4685 | ++ |
4686 | ++int ZLIB_INTERNAL dfltcc_can_deflate(z_streamp strm); |
4687 | ++int ZLIB_INTERNAL dfltcc_deflate(z_streamp strm, |
4688 | ++ int flush, |
4689 | ++ block_state *result); |
4690 | ++int ZLIB_INTERNAL dfltcc_deflate_params(z_streamp strm, int level, |
4691 | ++ int strategy, int *flush); |
4692 | ++int ZLIB_INTERNAL dfltcc_deflate_done(z_streamp strm, int flush); |
4693 | ++int ZLIB_INTERNAL dfltcc_deflate_set_dictionary(z_streamp strm, |
4694 | ++ const Bytef *dictionary, |
4695 | ++ uInt dict_length); |
4696 | ++int ZLIB_INTERNAL dfltcc_deflate_get_dictionary(z_streamp strm, |
4697 | ++ Bytef *dictionary, |
4698 | ++ uInt* dict_length); |
4699 | ++ |
4700 | ++#define DEFLATE_SET_DICTIONARY_HOOK(strm, dict, dict_len) \ |
4701 | ++ do { \ |
4702 | ++ if (dfltcc_can_deflate((strm))) \ |
4703 | ++ return dfltcc_deflate_set_dictionary((strm), (dict), (dict_len)); \ |
4704 | ++ } while (0) |
4705 | ++#define DEFLATE_GET_DICTIONARY_HOOK(strm, dict, dict_len) \ |
4706 | ++ do { \ |
4707 | ++ if (dfltcc_can_deflate((strm))) \ |
4708 | ++ return dfltcc_deflate_get_dictionary((strm), (dict), (dict_len)); \ |
4709 | ++ } while (0) |
4710 | ++#define DEFLATE_RESET_KEEP_HOOK(strm) \ |
4711 | ++ dfltcc_reset((strm), sizeof(deflate_state)) |
4712 | ++#define DEFLATE_PARAMS_HOOK(strm, level, strategy, hook_flush) \ |
4713 | ++ do { \ |
4714 | ++ int err; \ |
4715 | ++\ |
4716 | ++ err = dfltcc_deflate_params((strm), \ |
4717 | ++ (level), \ |
4718 | ++ (strategy), \ |
4719 | ++ (hook_flush)); \ |
4720 | ++ if (err == Z_STREAM_ERROR) \ |
4721 | ++ return err; \ |
4722 | ++ } while (0) |
4723 | ++#define DEFLATE_DONE dfltcc_deflate_done |
4724 | ++#define DEFLATE_BOUND_ADJUST_COMPLEN(strm, complen, source_len) \ |
4725 | ++ do { \ |
4726 | ++ if (deflateStateCheck((strm)) || dfltcc_can_deflate((strm))) \ |
4727 | ++ (complen) = DEFLATE_BOUND_COMPLEN(source_len); \ |
4728 | ++ } while (0) |
4729 | ++#define DEFLATE_NEED_CONSERVATIVE_BOUND(strm) (dfltcc_can_deflate((strm))) |
4730 | ++#define DEFLATE_HOOK dfltcc_deflate |
4731 | ++#define DEFLATE_NEED_CHECKSUM(strm) (!dfltcc_can_deflate((strm))) |
4732 | ++ |
4733 | ++#endif |
4734 | +diff --git a/deflate.c b/deflate.c |
4735 | +index bd01175..9f5bc8b 100644 |
4736 | +--- a/deflate.c |
4737 | ++++ b/deflate.c |
4738 | +@@ -60,12 +60,24 @@ const char deflate_copyright[] = |
4739 | + copyright string in the executable of your product. |
4740 | + */ |
4741 | + |
4742 | +-typedef enum { |
4743 | +- need_more, /* block not completed, need more input or more output */ |
4744 | +- block_done, /* block flush performed */ |
4745 | +- finish_started, /* finish started, need only more output at next deflate */ |
4746 | +- finish_done /* finish done, accept no more input or output */ |
4747 | +-} block_state; |
4748 | ++#ifdef DFLTCC |
4749 | ++#include "contrib/s390/dfltcc_deflate.h" |
4750 | ++#else |
4751 | ++#define ZALLOC_STATE ZALLOC |
4752 | ++#define ZFREE_STATE ZFREE |
4753 | ++#define ZCOPY_STATE zmemcpy |
4754 | ++#define ZALLOC_WINDOW ZALLOC |
4755 | ++#define TRY_FREE_WINDOW TRY_FREE |
4756 | ++#define DEFLATE_SET_DICTIONARY_HOOK(strm, dict, dict_len) do {} while (0) |
4757 | ++#define DEFLATE_GET_DICTIONARY_HOOK(strm, dict, dict_len) do {} while (0) |
4758 | ++#define DEFLATE_RESET_KEEP_HOOK(strm) do {} while (0) |
4759 | ++#define DEFLATE_PARAMS_HOOK(strm, level, strategy, hook_flush) do {} while (0) |
4760 | ++#define DEFLATE_DONE(strm, flush) 1 |
4761 | ++#define DEFLATE_BOUND_ADJUST_COMPLEN(strm, complen, sourceLen) do {} while (0) |
4762 | ++#define DEFLATE_NEED_CONSERVATIVE_BOUND(strm) 0 |
4763 | ++#define DEFLATE_HOOK(strm, flush, bstate) 0 |
4764 | ++#define DEFLATE_NEED_CHECKSUM(strm) 1 |
4765 | ++#endif |
4766 | + |
4767 | + typedef block_state (*compress_func)(deflate_state *s, int flush); |
4768 | + /* Compression function. Returns the block state after the call. */ |
4769 | +@@ -224,7 +236,8 @@ local unsigned read_buf(z_streamp strm, Bytef *buf, unsigned size) { |
4770 | + strm->avail_in -= len; |
4771 | + |
4772 | + zmemcpy(buf, strm->next_in, len); |
4773 | +- if (strm->state->wrap == 1) { |
4774 | ++ if (!DEFLATE_NEED_CHECKSUM(strm)) {} |
4775 | ++ else if (strm->state->wrap == 1) { |
4776 | + strm->adler = adler32(strm->adler, buf, len); |
4777 | + } |
4778 | + #ifdef GZIP |
4779 | +@@ -429,7 +442,7 @@ int ZEXPORT deflateInit2_(z_streamp strm, int level, int method, |
4780 | + return Z_STREAM_ERROR; |
4781 | + } |
4782 | + if (windowBits == 8) windowBits = 9; /* until 256-byte window bug fixed */ |
4783 | +- s = (deflate_state *) ZALLOC(strm, 1, sizeof(deflate_state)); |
4784 | ++ s = (deflate_state *) ZALLOC_STATE(strm, 1, sizeof(deflate_state)); |
4785 | + if (s == Z_NULL) return Z_MEM_ERROR; |
4786 | + strm->state = (struct internal_state FAR *)s; |
4787 | + s->strm = strm; |
4788 | +@@ -446,7 +459,7 @@ int ZEXPORT deflateInit2_(z_streamp strm, int level, int method, |
4789 | + s->hash_mask = s->hash_size - 1; |
4790 | + s->hash_shift = ((s->hash_bits + MIN_MATCH-1) / MIN_MATCH); |
4791 | + |
4792 | +- s->window = (Bytef *) ZALLOC(strm, s->w_size, 2*sizeof(Byte)); |
4793 | ++ s->window = (Bytef *) ZALLOC_WINDOW(strm, s->w_size, 2*sizeof(Byte)); |
4794 | + s->prev = (Posf *) ZALLOC(strm, s->w_size, sizeof(Pos)); |
4795 | + s->head = (Posf *) ZALLOC(strm, s->hash_size, sizeof(Pos)); |
4796 | + |
4797 | +@@ -559,6 +572,7 @@ int ZEXPORT deflateSetDictionary(z_streamp strm, const Bytef *dictionary, |
4798 | + /* when using zlib wrappers, compute Adler-32 for provided dictionary */ |
4799 | + if (wrap == 1) |
4800 | + strm->adler = adler32(strm->adler, dictionary, dictLength); |
4801 | ++ DEFLATE_SET_DICTIONARY_HOOK(strm, dictionary, dictLength); |
4802 | + s->wrap = 0; /* avoid computing Adler-32 in read_buf */ |
4803 | + |
4804 | + /* if dictionary would fill window, just replace the history */ |
4805 | +@@ -614,6 +628,7 @@ int ZEXPORT deflateGetDictionary(z_streamp strm, Bytef *dictionary, |
4806 | + |
4807 | + if (deflateStateCheck(strm)) |
4808 | + return Z_STREAM_ERROR; |
4809 | ++ DEFLATE_GET_DICTIONARY_HOOK(strm, dictionary, dictLength); |
4810 | + s = strm->state; |
4811 | + len = s->strstart + s->lookahead; |
4812 | + if (len > s->w_size) |
4813 | +@@ -658,6 +673,8 @@ int ZEXPORT deflateResetKeep(z_streamp strm) { |
4814 | + |
4815 | + _tr_init(s); |
4816 | + |
4817 | ++ DEFLATE_RESET_KEEP_HOOK(strm); |
4818 | ++ |
4819 | + return Z_OK; |
4820 | + } |
4821 | + |
4822 | +@@ -740,6 +757,7 @@ int ZEXPORT deflatePrime(z_streamp strm, int bits, int value) { |
4823 | + int ZEXPORT deflateParams(z_streamp strm, int level, int strategy) { |
4824 | + deflate_state *s; |
4825 | + compress_func func; |
4826 | ++ int hook_flush = Z_NO_FLUSH; |
4827 | + |
4828 | + if (deflateStateCheck(strm)) return Z_STREAM_ERROR; |
4829 | + s = strm->state; |
4830 | +@@ -752,15 +770,18 @@ int ZEXPORT deflateParams(z_streamp strm, int level, int strategy) { |
4831 | + if (level < 0 || level > 9 || strategy < 0 || strategy > Z_FIXED) { |
4832 | + return Z_STREAM_ERROR; |
4833 | + } |
4834 | ++ DEFLATE_PARAMS_HOOK(strm, level, strategy, &hook_flush); |
4835 | + func = configuration_table[s->level].func; |
4836 | + |
4837 | +- if ((strategy != s->strategy || func != configuration_table[level].func) && |
4838 | +- s->last_flush != -2) { |
4839 | ++ if (((strategy != s->strategy || func != configuration_table[level].func) && |
4840 | ++ s->last_flush != -2) || hook_flush != Z_NO_FLUSH) { |
4841 | + /* Flush the last buffer: */ |
4842 | +- int err = deflate(strm, Z_BLOCK); |
4843 | ++ int flush = RANK(hook_flush) > RANK(Z_BLOCK) ? hook_flush : Z_BLOCK; |
4844 | ++ int err = deflate(strm, flush); |
4845 | + if (err == Z_STREAM_ERROR) |
4846 | + return err; |
4847 | +- if (strm->avail_in || (s->strstart - s->block_start) + s->lookahead) |
4848 | ++ if (strm->avail_in || (s->strstart - s->block_start) + s->lookahead || |
4849 | ++ !DEFLATE_DONE(strm, flush)) |
4850 | + return Z_BUF_ERROR; |
4851 | + } |
4852 | + if (s->level != level) { |
4853 | +@@ -828,11 +849,13 @@ uLong ZEXPORT deflateBound(z_streamp strm, uLong sourceLen) { |
4854 | + ~13% overhead plus a small constant */ |
4855 | + fixedlen = sourceLen + (sourceLen >> 3) + (sourceLen >> 8) + |
4856 | + (sourceLen >> 9) + 4; |
4857 | ++ DEFLATE_BOUND_ADJUST_COMPLEN(strm, fixedlen, sourceLen); |
4858 | + |
4859 | + /* upper bound for stored blocks with length 127 (memLevel == 1) -- |
4860 | + ~4% overhead plus a small constant */ |
4861 | + storelen = sourceLen + (sourceLen >> 5) + (sourceLen >> 7) + |
4862 | + (sourceLen >> 11) + 7; |
4863 | ++ DEFLATE_BOUND_ADJUST_COMPLEN(strm, storelen, sourceLen); |
4864 | + |
4865 | + /* if can't get parameters, return larger bound plus a zlib wrapper */ |
4866 | + if (deflateStateCheck(strm)) |
4867 | +@@ -874,7 +897,8 @@ uLong ZEXPORT deflateBound(z_streamp strm, uLong sourceLen) { |
4868 | + } |
4869 | + |
4870 | + /* if not default parameters, return one of the conservative bounds */ |
4871 | +- if (s->w_bits != 15 || s->hash_bits != 8 + 7) |
4872 | ++ if (DEFLATE_NEED_CONSERVATIVE_BOUND(strm) || |
4873 | ++ s->w_bits != 15 || s->hash_bits != 8 + 7) |
4874 | + return (s->w_bits <= s->hash_bits && s->level ? fixedlen : storelen) + |
4875 | + wraplen; |
4876 | + |
4877 | +@@ -900,7 +924,7 @@ local void putShortMSB(deflate_state *s, uInt b) { |
4878 | + * applications may wish to modify it to avoid allocating a large |
4879 | + * strm->next_out buffer and copying into it. (See also read_buf()). |
4880 | + */ |
4881 | +-local void flush_pending(z_streamp strm) { |
4882 | ++void ZLIB_INTERNAL flush_pending(z_streamp strm) { |
4883 | + unsigned len; |
4884 | + deflate_state *s = strm->state; |
4885 | + |
4886 | +@@ -1167,7 +1191,8 @@ int ZEXPORT deflate(z_streamp strm, int flush) { |
4887 | + (flush != Z_NO_FLUSH && s->status != FINISH_STATE)) { |
4888 | + block_state bstate; |
4889 | + |
4890 | +- bstate = s->level == 0 ? deflate_stored(s, flush) : |
4891 | ++ bstate = DEFLATE_HOOK(strm, flush, &bstate) ? bstate : |
4892 | ++ s->level == 0 ? deflate_stored(s, flush) : |
4893 | + s->strategy == Z_HUFFMAN_ONLY ? deflate_huff(s, flush) : |
4894 | + s->strategy == Z_RLE ? deflate_rle(s, flush) : |
4895 | + (*(configuration_table[s->level].func))(s, flush); |
4896 | +@@ -1214,7 +1239,6 @@ int ZEXPORT deflate(z_streamp strm, int flush) { |
4897 | + } |
4898 | + |
4899 | + if (flush != Z_FINISH) return Z_OK; |
4900 | +- if (s->wrap <= 0) return Z_STREAM_END; |
4901 | + |
4902 | + /* Write the trailer */ |
4903 | + #ifdef GZIP |
4904 | +@@ -1230,7 +1254,7 @@ int ZEXPORT deflate(z_streamp strm, int flush) { |
4905 | + } |
4906 | + else |
4907 | + #endif |
4908 | +- { |
4909 | ++ if (s->wrap == 1) { |
4910 | + putShortMSB(s, (uInt)(strm->adler >> 16)); |
4911 | + putShortMSB(s, (uInt)(strm->adler & 0xffff)); |
4912 | + } |
4913 | +@@ -1239,7 +1263,11 @@ int ZEXPORT deflate(z_streamp strm, int flush) { |
4914 | + * to flush the rest. |
4915 | + */ |
4916 | + if (s->wrap > 0) s->wrap = -s->wrap; /* write the trailer only once! */ |
4917 | +- return s->pending != 0 ? Z_OK : Z_STREAM_END; |
4918 | ++ if (s->pending == 0) { |
4919 | ++ Assert(s->bi_valid == 0, "bi_buf not flushed"); |
4920 | ++ return Z_STREAM_END; |
4921 | ++ } |
4922 | ++ return Z_OK; |
4923 | + } |
4924 | + |
4925 | + /* ========================================================================= */ |
4926 | +@@ -1254,9 +1282,9 @@ int ZEXPORT deflateEnd(z_streamp strm) { |
4927 | + TRY_FREE(strm, strm->state->pending_buf); |
4928 | + TRY_FREE(strm, strm->state->head); |
4929 | + TRY_FREE(strm, strm->state->prev); |
4930 | +- TRY_FREE(strm, strm->state->window); |
4931 | ++ TRY_FREE_WINDOW(strm, strm->state->window); |
4932 | + |
4933 | +- ZFREE(strm, strm->state); |
4934 | ++ ZFREE_STATE(strm, strm->state); |
4935 | + strm->state = Z_NULL; |
4936 | + |
4937 | + return status == BUSY_STATE ? Z_DATA_ERROR : Z_OK; |
4938 | +@@ -1285,13 +1313,13 @@ int ZEXPORT deflateCopy(z_streamp dest, z_streamp source) { |
4939 | + |
4940 | + zmemcpy((voidpf)dest, (voidpf)source, sizeof(z_stream)); |
4941 | + |
4942 | +- ds = (deflate_state *) ZALLOC(dest, 1, sizeof(deflate_state)); |
4943 | ++ ds = (deflate_state *) ZALLOC_STATE(dest, 1, sizeof(deflate_state)); |
4944 | + if (ds == Z_NULL) return Z_MEM_ERROR; |
4945 | + dest->state = (struct internal_state FAR *) ds; |
4946 | +- zmemcpy((voidpf)ds, (voidpf)ss, sizeof(deflate_state)); |
4947 | ++ ZCOPY_STATE((voidpf)ds, (voidpf)ss, sizeof(deflate_state)); |
4948 | + ds->strm = dest; |
4949 | + |
4950 | +- ds->window = (Bytef *) ZALLOC(dest, ds->w_size, 2*sizeof(Byte)); |
4951 | ++ ds->window = (Bytef *) ZALLOC_WINDOW(dest, ds->w_size, 2*sizeof(Byte)); |
4952 | + ds->prev = (Posf *) ZALLOC(dest, ds->w_size, sizeof(Pos)); |
4953 | + ds->head = (Posf *) ZALLOC(dest, ds->hash_size, sizeof(Pos)); |
4954 | + ds->pending_buf = (uchf *) ZALLOC(dest, ds->lit_bufsize, 4); |
4955 | +diff --git a/deflate.h b/deflate.h |
4956 | +index 8696791..d49e698 100644 |
4957 | +--- a/deflate.h |
4958 | ++++ b/deflate.h |
4959 | +@@ -299,6 +299,7 @@ void ZLIB_INTERNAL _tr_flush_bits(deflate_state *s); |
4960 | + void ZLIB_INTERNAL _tr_align(deflate_state *s); |
4961 | + void ZLIB_INTERNAL _tr_stored_block(deflate_state *s, charf *buf, |
4962 | + ulg stored_len, int last); |
4963 | ++void ZLIB_INTERNAL _tr_send_bits(deflate_state *s, int value, int length); |
4964 | + |
4965 | + #define d_code(dist) \ |
4966 | + ((dist) < 256 ? _dist_code[dist] : _dist_code[256+((dist)>>7)]) |
4967 | +@@ -343,4 +344,15 @@ void ZLIB_INTERNAL _tr_stored_block(deflate_state *s, charf *buf, |
4968 | + flush = _tr_tally(s, distance, length) |
4969 | + #endif |
4970 | + |
4971 | ++typedef enum { |
4972 | ++ need_more, /* block not completed, need more input or more output */ |
4973 | ++ block_done, /* block flush performed */ |
4974 | ++ finish_started, /* finish started, need only more output at next deflate */ |
4975 | ++ finish_done /* finish done, accept no more input or output */ |
4976 | ++} block_state; |
4977 | ++ |
4978 | ++unsigned ZLIB_INTERNAL bi_reverse(unsigned code, int len); |
4979 | ++void ZLIB_INTERNAL bi_windup(deflate_state *s); |
4980 | ++void ZLIB_INTERNAL flush_pending(z_streamp strm); |
4981 | ++ |
4982 | + #endif /* DEFLATE_H */ |
4983 | +diff --git a/gzguts.h b/gzguts.h |
4984 | +index f937504..5adfd1d 100644 |
4985 | +--- a/gzguts.h |
4986 | ++++ b/gzguts.h |
4987 | +@@ -152,7 +152,11 @@ |
4988 | + |
4989 | + /* default i/o buffer size -- double this for output when reading (this and |
4990 | + twice this must be able to fit in an unsigned type) */ |
4991 | ++#ifdef DFLTCC |
4992 | ++#define GZBUFSIZE 131072 |
4993 | ++#else |
4994 | + #define GZBUFSIZE 8192 |
4995 | ++#endif |
4996 | + |
4997 | + /* gzip modes, also provide a little integrity check on the passed structure */ |
4998 | + #define GZ_NONE 0 |
4999 | +diff --git a/inflate.c b/inflate.c |
5000 | +index b0757a9..c0f808f 100644 |
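For anyone reviewing the dictionary hooks in the s390x patch above: the wrap-around handling in `append_history` and `get_history` can be tried out in isolation. The following is only a minimal sketch under stated assumptions, not the patch's code. `struct hist`, `hist_append`, `hist_get` and the tiny `HIST_SIZE` of 8 are hypothetical names chosen for illustration; the patch itself works on the `ho`/`hl` fields of `struct dfltcc_param_v0` with `HB_SIZE` (32K).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical small capacity so wrap-around is easy to observe;
 * the patch uses HB_SIZE (32K) instead. */
#define HIST_SIZE 8

/* Hypothetical stand-in for the ho/hl fields of struct dfltcc_param_v0. */
struct hist {
    size_t off;                    /* index of the oldest byte (like ho) */
    size_t len;                    /* number of valid bytes (like hl) */
    unsigned char buf[HIST_SIZE];  /* circular storage */
};

/* Append count bytes, keeping only the newest HIST_SIZE bytes overall. */
static void hist_append(struct hist *h, const unsigned char *src, size_t count)
{
    size_t pos, n;

    assert(h->off < HIST_SIZE && h->len <= HIST_SIZE);

    /* Input longer than the buffer: only its tail can survive. */
    if (count > HIST_SIZE) {
        src += count - HIST_SIZE;
        count = HIST_SIZE;
    }

    pos = (h->off + h->len) % HIST_SIZE;  /* first slot after the newest byte */
    if (pos + count <= HIST_SIZE) {
        /* No wrap: one chunk. */
        memcpy(h->buf + pos, src, count);
    } else {
        /* Wrap: fill to the end of the buffer, then continue at index 0. */
        n = HIST_SIZE - pos;
        memcpy(h->buf + pos, src, n);
        memcpy(h->buf, src + n, count - n);
    }

    n = h->len + count;
    if (n <= HIST_SIZE) {
        h->len = n;
    } else {
        /* Overflow: advance the start past the overwritten oldest bytes. */
        h->off = (h->off + (n - HIST_SIZE)) % HIST_SIZE;
        h->len = HIST_SIZE;
    }
}

/* Copy the history out in oldest-to-newest order; returns the byte count. */
static size_t hist_get(const struct hist *h, unsigned char *dst)
{
    if (h->off + h->len <= HIST_SIZE) {
        memcpy(dst, h->buf + h->off, h->len);
    } else {
        size_t first = HIST_SIZE - h->off;  /* bytes up to the buffer end */
        memcpy(dst, h->buf + h->off, first);
        memcpy(dst + first, h->buf, h->len - first);
    }
    return h->len;
}
```

Appending `"abcdef"` and then `"ghijk"` to a zero-initialised `struct hist` should leave the eight newest bytes, `"defghijk"`, readable through `hist_get`, which mirrors how a dictionary longer than the history buffer is trimmed to its tail.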
I'm off until end of year so I think you should grab a different reviewer for this