Merge ~mkukri/ubuntu/+source/zlib:merge into ubuntu/+source/zlib:debian/sid
Status: | Merged |
---|---|
Merge reported by: | Mate Kukri |
Merged at revision: | 515581d841bd3732d669f9806966080208c840b8 |
Proposed branch: | ~mkukri/ubuntu/+source/zlib:merge |
Merge into: | ubuntu/+source/zlib:debian/sid |
Diff against target: |
6023 lines (+5732/-19), 17 files modified:
debian/changelog (+246/-0)
debian/control (+24/-1)
debian/libx32z1-dev.dirs (+1/-0)
debian/libx32z1-dev.install (+2/-0)
debian/libx32z1.dirs (+1/-0)
debian/libx32z1.install (+1/-0)
debian/libx32z1.symbols (+3/-0)
debian/patches/power/add-optimized-crc32.patch (+2539/-0)
debian/patches/power/fix-clang7-builtins.patch (+62/-0)
debian/patches/power/indirect-func-macros.patch (+295/-0)
debian/patches/s390x/add-accel-deflate.patch (+2043/-0)
debian/patches/s390x/add-vectorized-crc32.patch (+426/-0)
debian/patches/series (+5/-0)
debian/rules (+39/-5)
debian/upstream/signing-key.asc (+30/-0)
debian/watch (+2/-0)
debian/zlib-core.symbols (+13/-13) |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Lukas Märdian (community) | Approve | ||
Frank Heimes (community) | Approve | ||
Steve Langasek (community) | Abstain | ||
Ubuntu Sponsors | Pending | ||
git-ubuntu import | Pending | ||
Commit message
Merge zlib with Debian unstable.
This needed some TLC:
- Split the previous diff with git ubuntu
- Replaced the POWER and s390x patches with the newest ones from IBM rebased on Debian
- Removed the superseded bugfix patches (now included in the above)
Description of the change
Mate Kukri (mkukri) wrote : | # |
> I'm off until end of year so I think you should grab a different reviewer for
> this
Understood. I saw your name and Frank Heimes's on the latest changelog entries, which is what I based this on.
Do you have anyone in mind who has touched this package before and might be willing to review this?
Steve Langasek (vorlon) wrote : | # |
On Thu, Nov 23, 2023 at 01:30:54PM -0000, Mate Kukri wrote:
> > I'm off until end of year so I think you should grab a different reviewer for
> > this
> Understood, I saw your and Frank Heimes's name on the last changelog
> entries, that's what I based this on.
>
> Do you have any names in mind who has touched this package before and
> might be willing to review this?
I don't think "touched this package" is a relevant criterion and you should
ask around in Foundations (or just ask ~canonical-
- b2a9df2... by Mate Kukri: merge-changelogs
- 87e1e2b... by Mate Kukri: reconstruct-changelog
Mate Kukri (mkukri) wrote : | # |
Now based on 1:1.3.dfsg-3
- 515581d... by Mate Kukri: update-maintainer
Frank Heimes (fheimes) wrote : | # |
I think this looks good, and is a nice clean-up.
Since this is being merged into the noble development release quite early, there should be time to ask the IBM s390x people to give it a try (I remember that Ilya Leoshkevich <email address hidden> had some test code).
Once I see that this has landed, I would like to ask Ilya to test it. There is no need for you to do anything, but that lets us make sure the changed s390x optimization patches work fine.
Mate Kukri (mkukri) wrote : | # |
> I think this looks good, and is a nice clean-up.
>
> Since this is merged to the noble development release quite early, there
> should be some time to ask the IBM s390x people to give it a try (I remember
> that Ilya Leoshkevich <email address hidden> had some test code).
>
> Once I see that this landed, I would like to ask Ilya (no need for you to do
> anything, but that allows to ensure that the changing s390x optimization
> patches work fine ...).
Are you also able to upload this, or should I ask someone else?
Frank Heimes (fheimes) wrote : | # |
Hi Mate,
I'm sorry, you would need a coredev for uploading, since it's a main
package - and I am only MOTU (working on coredev ;-).
IIRC schopin sponsored my zlib uploads in the past ...
Bye, Frank
Frank Heimes (fheimes) wrote : | # |
By the way, I haven't seen an LP bug reference in the changelog. Are you doing this
merge based on an LP bug (as I assume)? If so, please don't forget to
reference that LP bug in d/changelog.
Mate Kukri (mkukri) wrote : | # |
I don't think there is an LP bug for this, maybe I should have created one, but this is tracked internally on the Foundations Jira.
> Btw. I haven't seen a LP bug reference in the changelog, are you doing this
> merge based on a LP bug ? (what I assume), then please don't forget to
> reference this LP bug in d/changelog.
Frank Heimes (fheimes) wrote : | # |
I think the Wiki page for merging recommends doing so:
https:/
"FILE A MERGE BUG"
Lukas Märdian (slyon) wrote : | # |
Thank you Mate, that's indeed a really nice cleanup!
The new patches are nicely structured and provide clean patch headers. I confirmed they match the patches from Ilya (iii-i/zlib/dfltcc) on GitHub. Besides the new patches the delta looks very similar to our previous delta, but this time as clean git-ubuntu commits. Kudos!
@Frank: you mention there might be some test code available, I wonder if we could somehow integrate that into the package? Because unfortunately there doesn't seem to be any dh_auto_test nor autopkgtest. :(
Either way, we should definitely ask IBM/Ilya to verify that the new patches work as intended.
@Mate: We should also consider upstreaming the d/watch delta to Debian, I think that could be useful and doesn't need to be part of the delta.
Test build passed in a PPA:
https:/
LGTM. Sponsoring.
Frank Heimes (fheimes) wrote : | # |
From what I remember, 'iii' has just a few roughly coded C programs that
test the s390x optimizations and check for some bugs that popped up in the past.
Unfortunately, I assume the code is not in a shape to be integrated as a standard
test, and it is s390x specific anyway ... :-/
I was thinking more of using these as a kind of regression test for the
s390x specific bits and pieces.
But I'll ask; maybe there has been some more work on it that I am not aware of
...
Frank Heimes (fheimes) wrote : | # |
So Ilya was pretty quick. He tested the package in a mantic environment
(which is still close to noble) and all his tests passed!
As assumed, his tests are s390x specific, so not very useful for a more
generic autopkgtest.
Anyway, I'm glad that he could give it a try and came back with a :thumbs up:
Mate Kukri (mkukri) wrote : | # |
@fheimes That is good news.
If the test code is in a publishable state it might still be worth a shot integrating it as an s390x specific autopkgtest.
That and the POWER crc32 are our only significant delta over Debian, so I think it would still help give more confidence in these merges.
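For illustration only, here is a minimal sketch of what an architecture-independent smoke test could look like. It is not Ilya's test code and is not part of this merge. It assumes a Python interpreter linked against the packaged libz, and the names `roundtrip_ok` and `run_smoke_test` are hypothetical. It round-trips a few buffers at compression level 6 through both zlib-wrapped and raw deflate streams (raw streams were the case behind LP: #2002511) and compares `zlib.crc32` against the standard check value. Whether the s390x or POWER accelerated paths are actually taken depends on the build and the hardware, so a pass here only shows that basic behaviour is intact.

```python
import os
import zlib

# Standard CRC-32 check value for the ASCII string "123456789".
CRC32_CHECK_INPUT = b"123456789"
CRC32_CHECK_VALUE = 0xCBF43926


def roundtrip_ok(data: bytes, level: int = 6, raw: bool = False) -> bool:
    """Compress and decompress data; return True if it survives intact.

    raw=True uses a headerless deflate stream (negative wbits) instead of
    the default zlib-wrapped stream.
    """
    wbits = -zlib.MAX_WBITS if raw else zlib.MAX_WBITS
    comp = zlib.compressobj(level, zlib.DEFLATED, wbits)
    packed = comp.compress(data) + comp.flush()
    decomp = zlib.decompressobj(wbits)
    unpacked = decomp.decompress(packed) + decomp.flush()
    return unpacked == data


def run_smoke_test() -> bool:
    """Check crc32 against a known value and round-trip several buffers."""
    samples = [
        b"",                        # empty input
        b"a" * 100_000,             # highly compressible
        os.urandom(64 * 1024),      # effectively incompressible
        CRC32_CHECK_INPUT * 1000,   # repetitive pattern
    ]
    crc_ok = zlib.crc32(CRC32_CHECK_INPUT) == CRC32_CHECK_VALUE
    streams_ok = all(
        roundtrip_ok(sample, raw=raw)
        for sample in samples
        for raw in (False, True)
    )
    return crc_ok and streams_ok


if __name__ == "__main__":
    print("PASS" if run_smoke_test() else "FAIL")
```

A script along these lines could be run from an autopkgtest on every architecture, with the s390x specific programs kept as a separate, architecture-restricted test if they become publishable.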
Preview Diff
1 | diff --git a/debian/changelog b/debian/changelog | |||
2 | index 92d84a0..d52ce34 100644 | |||
3 | --- a/debian/changelog | |||
4 | +++ b/debian/changelog | |||
5 | @@ -1,3 +1,25 @@ | |||
6 | 1 | zlib (1:1.3.dfsg-3ubuntu1) noble; urgency=medium | ||
7 | 2 | |||
8 | 3 | * Merge with Debian unstable. Remaining changes: | ||
9 | 4 | - Build x32 packages | ||
10 | 5 | - Add watch file, with GPG tarball checking, and version mangling | ||
11 | 6 | - d/rules: Compile with DFLTCC enabled on s390x and hardware | ||
12 | 7 | compression at level 6 | ||
13 | 8 | - d/zlib-core.symbols: Drop dfsg suffix from version | ||
14 | 9 | * New patches rebased from iii-i/zlib/dfltcc on GitHub: | ||
15 | 10 | - d/p/power/*: Add optimized crc32 for POWER8+ | ||
16 | 11 | - d/p/s390x/*: Add optimized crc32 and hardware deflate | ||
17 | 12 | * Patches superseded by the above: | ||
18 | 13 | - d/p/410.patch: Add support for IBM Z hardware-accelerated deflate | ||
19 | 14 | - d/p/478.patch: Add optimized crc32 for Power 8+ processors | ||
20 | 15 | - d/p/s390x-vectorize-crc32.patch: Add s390x vectorized crc32 support | ||
21 | 16 | - d/p/1390.patch: Don't update strm.adler for raw streams on s390x | ||
22 | 17 | (DFLTCC), otherwise libxml2 gets broken on s390x. LP #2002511 | ||
23 | 18 | - d/p/lp-2018293-fix-crash-in-deflateBound-if-called-before-deflateInt | ||
24 | 19 | .patch: Avoid potential deflateBound() function crash on s390x | ||
25 | 20 | |||
26 | 21 | -- Mate Kukri <mate.kukri@canonical.com> Fri, 24 Nov 2023 08:22:52 +0000 | ||
27 | 22 | |||
28 | 1 | zlib (1:1.3.dfsg-3) unstable; urgency=low | 23 | zlib (1:1.3.dfsg-3) unstable; urgency=low |
29 | 2 | 24 | ||
30 | 3 | * Update the version of texlive-binaries we break since they still had | 25 | * Update the version of texlive-binaries we break since they still had |
31 | @@ -34,6 +56,74 @@ zlib (1:1.2.13.dfsg-2) unstable; urgency=low | |||
32 | 34 | 56 | ||
33 | 35 | -- Mark Brown <broonie@debian.org> Tue, 15 Aug 2023 00:28:42 +0100 | 57 | -- Mark Brown <broonie@debian.org> Tue, 15 Aug 2023 00:28:42 +0100 |
34 | 36 | 58 | ||
35 | 59 | zlib (1:1.2.13.dfsg-1ubuntu5) mantic; urgency=medium | ||
36 | 60 | |||
37 | 61 | * Add | ||
38 | 62 | d/p/lp-2018293-fix-crash-in-deflateBound-if-called-before-deflateInt.patch | ||
39 | 63 | to avoid potential deflateBound() function crash on s390x. | ||
40 | 64 | * Clean-up and remove | ||
41 | 65 | d/p/lp1932010-ibm-z-add-vectorized-crc32-implementation.patch since it was | ||
42 | 66 | replaced by d/p/s390x-vectorize-crc32.patch with 1.2.13.dfsg-1ubuntu3 | ||
43 | 67 | but was still in d/p/ (but not in d/p/series). | ||
44 | 68 | |||
45 | 69 | -- Frank Heimes <frank.heimes@canonical.com> Wed, 02 Aug 2023 13:22:26 +0200 | ||
46 | 70 | |||
47 | 71 | zlib (1:1.2.13.dfsg-1ubuntu4) lunar; urgency=medium | ||
48 | 72 | |||
49 | 73 | * Add d/p/1390.patch to not update strm.adler for raw streams on s390x | ||
50 | 74 | (DFLTCC), otherwise libxml2 gets broken on s390x. LP: #2002511 | ||
51 | 75 | |||
52 | 76 | -- Frank Heimes <frank.heimes@canonical.com> Wed, 11 Jan 2023 18:02:34 +0100 | ||
53 | 77 | |||
54 | 78 | zlib (1:1.2.13.dfsg-1ubuntu3) lunar; urgency=medium | ||
55 | 79 | |||
56 | 80 | * Re-add vectorized crc32 support for s390x by adding | ||
57 | 81 | d/p/s390x-vectorize-crc32.patch | ||
58 | 82 | (crc32vx-v4: s390x: vectorize crc32). (LP: #1998470) | ||
59 | 83 | This replaces the previously dropped patch: | ||
60 | 84 | lp1932010-ibm-z-add-vectorized-crc32-implementation.patch | ||
61 | 85 | * Remove option '--crc32-vx' for s390x in d/rules, that was previously just | ||
62 | 86 | commented out, since it's no longer needed with the new s390x crc32 code. | ||
63 | 87 | * Update d/p/410.patch to version 26f2c0a4e17e5558d779797d713aa37ebaeef390 | ||
64 | 88 | due to unused "const char *endptr;". | ||
65 | 89 | |||
66 | 90 | -- Frank Heimes <frank.heimes@canonical.com> Mon, 21 Nov 2022 20:28:58 +0100 | ||
67 | 91 | |||
68 | 92 | zlib (1:1.2.13.dfsg-1ubuntu2) lunar; urgency=medium | ||
69 | 93 | |||
70 | 94 | * Comment out use of --crc32-vx on s390x, since this is currently not | ||
71 | 95 | implemented due to the dropped patch that needs porting. | ||
72 | 96 | |||
73 | 97 | -- Steve Langasek <steve.langasek@ubuntu.com> Tue, 15 Nov 2022 17:06:45 +0000 | ||
74 | 98 | |||
75 | 99 | zlib (1:1.2.13.dfsg-1ubuntu1) lunar; urgency=low | ||
76 | 100 | |||
77 | 101 | * Merge from Debian unstable. Remaining changes: | ||
78 | 102 | - Build x32 packages | ||
79 | 103 | - debian/zlib-core.symbols: Drop dfsg suffix from version | ||
80 | 104 | - Add watch file, with GPG tarball checking, and version mangling | ||
81 | 105 | - Cherrypick PR#410 to enable hardware-accelerated deflate. | ||
82 | 106 | - Copmile with DFLTCC enabled on s390x. | ||
83 | 107 | - Enable hardware compression on s390x at level 6. | ||
84 | 108 | - d/rules: use configure options for dfltcc instead of hardcoding | ||
85 | 109 | the CFLAGS | ||
86 | 110 | * Dropped changes, included upstream: | ||
87 | 111 | - Cherry-pick Permit-a-deflateParams-parameter-change-asap.patch | ||
88 | 112 | - debian/patches/CVE-2018-25032-2.patch: assure that the number of bits | ||
89 | 113 | for deflatePrime() is valid in deflate.c. | ||
90 | 114 | * Pull rebased 410.patch from https://github.com/madler/zlib/pull/410. | ||
91 | 115 | * Drop d/p/410-lp1961427.patch, included in the above rebase. | ||
92 | 116 | * Replace 335.patch for ppc64el (P8) crc32 performance with 478.patch which | ||
93 | 117 | supersedes it (https://github.com/madler/zlib/pull/478). | ||
94 | 118 | * Forward-port lp1932010-ibm-z-add-vectorized-crc32-implementation.patch. | ||
95 | 119 | * Dropped changes: | ||
96 | 120 | - d/p/lp1932010-ibm-z-add-vectorized-crc32-implementation.patch: this | ||
97 | 121 | patch depends on zlib upstream PR 335 which has been superseded by | ||
98 | 122 | upstream PR 478 with significant refactoring. Drop this patch, | ||
99 | 123 | pending a port from IBM. | ||
100 | 124 | |||
101 | 125 | -- Steve Langasek <steve.langasek@ubuntu.com> Mon, 07 Nov 2022 15:57:28 -0800 | ||
102 | 126 | |||
103 | 37 | zlib (1:1.2.13.dfsg-1) unstable; urgency=low | 127 | zlib (1:1.2.13.dfsg-1) unstable; urgency=low |
104 | 38 | 128 | ||
105 | 39 | * New upstream release. | 129 | * New upstream release. |
106 | @@ -42,6 +132,38 @@ zlib (1:1.2.13.dfsg-1) unstable; urgency=low | |||
107 | 42 | 132 | ||
108 | 43 | -- Mark Brown <broonie@debian.org> Sat, 05 Nov 2022 12:24:46 +0000 | 133 | -- Mark Brown <broonie@debian.org> Sat, 05 Nov 2022 12:24:46 +0000 |
109 | 44 | 134 | ||
110 | 135 | zlib (1:1.2.11.dfsg-4.1ubuntu1) kinetic; urgency=low | ||
111 | 136 | |||
112 | 137 | * Merge from Debian unstable. Remaining changes: | ||
113 | 138 | - Build x32 packages | ||
114 | 139 | - debian/zlib-core.symbols: Drop dfsg suffix from version | ||
115 | 140 | - Add watch file, with GPG tarball checking, and version mangling | ||
116 | 141 | - Cherry-pick Permit-a-deflateParams-parameter-change-asap.patch: | ||
117 | 142 | - Cherrypick PR#410 to enable hardware-accelerated deflate. | ||
118 | 143 | - Copmile with DFLTCC enabled on s390x. | ||
119 | 144 | - Improve crc32 performance on P8, proposed upstream patch. | ||
120 | 145 | - Enable hardware compression on s390x at level 6. | ||
121 | 146 | - Cherrypick update of s390x hw acceleration #410 pull request patch, | ||
122 | 147 | which corrects inflateSyncPoint() return value to always gracefully | ||
123 | 148 | fail when hw acceleration is in use. | ||
124 | 149 | - d/rules: use configure options for dfltcc instead of hardcoding | ||
125 | 150 | the CFLAGS | ||
126 | 151 | - d/p/lp1932010-ibm-z-add-vectorized-crc32-implementation.patch | ||
127 | 152 | ported from zlib-ng #912, adding a vectorized implementation | ||
128 | 153 | of CRC32 on s390x architectures based on kernel code. | ||
129 | 154 | - d/p/lp1932010-ibm-z-add-vectorized-crc32-implementation.patch: adjust | ||
130 | 155 | to not make a PLT call in an ifunc on s390/s390x. | ||
131 | 156 | - debian/patches/CVE-2018-25032-2.patch: assure that the number of bits | ||
132 | 157 | for deflatePrime() is valid in deflate.c. | ||
133 | 158 | - d/p/410-lp1961427.patch ported from zlib #410, fixing | ||
134 | 159 | compressBound() with hw acceleration. | ||
135 | 160 | * Dropped changes, included in Debian: | ||
136 | 161 | - debian/patches/CVE-2018-25032-1.patch: fix a bug that can crash | ||
137 | 162 | deflate on some input when using Z_FIXED in deflate.c, deflate.h. | ||
138 | 163 | * Refresh 410.patch for upstream changes. | ||
139 | 164 | |||
140 | 165 | -- Steve Langasek <steve.langasek@ubuntu.com> Thu, 18 Aug 2022 09:09:22 -0700 | ||
141 | 166 | |||
142 | 45 | zlib (1:1.2.11.dfsg-4.1) unstable; urgency=medium | 167 | zlib (1:1.2.11.dfsg-4.1) unstable; urgency=medium |
143 | 46 | 168 | ||
144 | 47 | * Non-maintainer upload. | 169 | * Non-maintainer upload. |
145 | @@ -69,6 +191,89 @@ zlib (1:1.2.11.dfsg-3) unstable; urgency=low | |||
146 | 69 | 191 | ||
147 | 70 | -- Mark Brown <broonie@debian.org> Fri, 18 Mar 2022 00:21:37 +0000 | 192 | -- Mark Brown <broonie@debian.org> Fri, 18 Mar 2022 00:21:37 +0000 |
148 | 71 | 193 | ||
149 | 194 | zlib (1:1.2.11.dfsg-2ubuntu10) kinetic; urgency=medium | ||
150 | 195 | |||
151 | 196 | * d/p/410-lp1961427.patch ported from zlib #410, fixing | ||
152 | 197 | compressBound() with hw acceleration. LP: #1961427 | ||
153 | 198 | Thanks to Ilya Leoshkevich <iii@linux.ibm.com>. | ||
154 | 199 | In addition a patch is needed for bedtools. | ||
155 | 200 | |||
156 | 201 | -- Frank Heimes <frank.heimes@canonical.com> Thu, 21 Jul 2022 09:30:05 +0100 | ||
157 | 202 | |||
158 | 203 | zlib (1:1.2.11.dfsg-2ubuntu9) jammy; urgency=medium | ||
159 | 204 | |||
160 | 205 | * SECURITY UPDATE: memory corruption when deflating | ||
161 | 206 | - debian/patches/CVE-2018-25032-1.patch: fix a bug that can crash | ||
162 | 207 | deflate on some input when using Z_FIXED in deflate.c, deflate.h. | ||
163 | 208 | - debian/patches/CVE-2018-25032-2.patch: assure that the number of bits | ||
164 | 209 | for deflatePrime() is valid in deflate.c. | ||
165 | 210 | - CVE-2018-25032 | ||
166 | 211 | |||
167 | 212 | -- Marc Deslauriers <marc.deslauriers@ubuntu.com> Fri, 25 Mar 2022 08:06:31 -0400 | ||
168 | 213 | |||
169 | 214 | zlib (1:1.2.11.dfsg-2ubuntu7) impish; urgency=medium | ||
170 | 215 | |||
171 | 216 | [ Simon Chopin ] | ||
172 | 217 | * d/rules: use configure options for dfltcc instead of hardcoding | ||
173 | 218 | the CFLAGS | ||
174 | 219 | * d/p/lp1932010-ibm-z-add-vectorized-crc32-implementation.patch | ||
175 | 220 | ported from zlib-ng #912, adding a vectorized implementation | ||
176 | 221 | of CRC32 on s390x architectures based on kernel code. LP: #1932010 | ||
177 | 222 | |||
178 | 223 | [ Michael Hudson-Doyle ] | ||
179 | 224 | * d/p/lp1932010-ibm-z-add-vectorized-crc32-implementation.patch: adjust to | ||
180 | 225 | not make a PLT call in an ifunc on s390/s390x. | ||
181 | 226 | |||
182 | 227 | -- Simon Chopin <simon.chopin@canonical.com> Thu, 12 Aug 2021 15:45:49 +1200 | ||
183 | 228 | |||
184 | 229 | zlib (1:1.2.11.dfsg-2ubuntu6) hirsute; urgency=medium | ||
185 | 230 | |||
186 | 231 | * No-change rebuild to build with lto. | ||
187 | 232 | |||
188 | 233 | -- Matthias Klose <doko@ubuntu.com> Sun, 28 Mar 2021 09:10:07 +0200 | ||
189 | 234 | |||
190 | 235 | zlib (1:1.2.11.dfsg-2ubuntu5) hirsute; urgency=medium | ||
191 | 236 | |||
192 | 237 | * No-change rebuild to drop the udeb package. | ||
193 | 238 | |||
194 | 239 | -- Matthias Klose <doko@ubuntu.com> Mon, 22 Feb 2021 10:36:58 +0100 | ||
195 | 240 | |||
196 | 241 | zlib (1:1.2.11.dfsg-2ubuntu4) groovy; urgency=medium | ||
197 | 242 | |||
198 | 243 | * Cherrypick update of s390x hw acceleration #410 pull request patch, | ||
199 | 244 | which corrects inflateSyncPoint() return value to always gracefully | ||
200 | 245 | fail when hw acceleration is in use. This fixes rsync failure with | ||
201 | 246 | zlib compression on hw accelerated s390x. LP: #1899621 | ||
202 | 247 | |||
203 | 248 | -- Dimitri John Ledkov <xnox@ubuntu.com> Thu, 15 Oct 2020 11:01:38 +0100 | ||
204 | 249 | |||
205 | 250 | zlib (1:1.2.11.dfsg-2ubuntu3) groovy; urgency=medium | ||
206 | 251 | |||
207 | 252 | * Enable hardware compression on s390x at level 6. LP: #1884514 | ||
208 | 253 | |||
209 | 254 | -- Michael Hudson-Doyle <michael.hudson@ubuntu.com> Thu, 24 Sep 2020 08:44:35 +1200 | ||
210 | 255 | |||
211 | 256 | zlib (1:1.2.11.dfsg-2ubuntu2) groovy; urgency=medium | ||
212 | 257 | |||
213 | 258 | * Update d/patches/410.patch to current state. LP: #1882494, #1889059, #1893170 | ||
214 | 259 | |||
215 | 260 | -- Michael Hudson-Doyle <michael.hudson@ubuntu.com> Thu, 20 Aug 2020 11:52:59 +1200 | ||
216 | 261 | |||
217 | 262 | zlib (1:1.2.11.dfsg-2ubuntu1) focal; urgency=medium | ||
218 | 263 | |||
219 | 264 | * Merge with Debian; remaining changes: | ||
220 | 265 | - Build x32 packages | ||
221 | 266 | - debian/zlib-core.symbols: Drop dfsg suffix from version | ||
222 | 267 | - Add watch file, with GPG tarball checking, and version mangling | ||
223 | 268 | - Drop unused patches | ||
224 | 269 | - Cherry-pick Permit-a-deflateParams-parameter-change-asap.patch: | ||
225 | 270 | (LP: #1692870) | ||
226 | 271 | - Cherrypick PR#410 to enable hardware-accelerated deflate. | ||
227 | 272 | - Copmile with DFLTCC enabled on s390x. LP: #1823157 | ||
228 | 273 | - Improve crc32 performance on P8, proposed upstream patch. LP: #1742941. | ||
229 | 274 | |||
230 | 275 | -- Matthias Klose <doko@ubuntu.com> Tue, 25 Feb 2020 16:59:52 +0100 | ||
231 | 276 | |||
232 | 72 | zlib (1:1.2.11.dfsg-2) unstable; urgency=low | 277 | zlib (1:1.2.11.dfsg-2) unstable; urgency=low |
233 | 73 | 278 | ||
234 | 74 | * Acknowledge previous NMUs (closes: #949388). | 279 | * Acknowledge previous NMUs (closes: #949388). |
235 | @@ -80,6 +285,21 @@ zlib (1:1.2.11.dfsg-2) unstable; urgency=low | |||
236 | 80 | 285 | ||
237 | 81 | -- Mark Brown <broonie@debian.org> Mon, 24 Feb 2020 21:07:12 +0000 | 286 | -- Mark Brown <broonie@debian.org> Mon, 24 Feb 2020 21:07:12 +0000 |
238 | 82 | 287 | ||
239 | 288 | zlib (1:1.2.11.dfsg-1.2ubuntu1) focal; urgency=medium | ||
240 | 289 | |||
241 | 290 | * Merge with Debian; remaining changes: | ||
242 | 291 | - Build x32 packages | ||
243 | 292 | - debian/zlib-core.symbols: Drop dfsg suffix from version | ||
244 | 293 | - Add watch file, with GPG tarball checking, and version mangling | ||
245 | 294 | - Drop unused patches | ||
246 | 295 | - Cherry-pick Permit-a-deflateParams-parameter-change-asap.patch: | ||
247 | 296 | (LP: #1692870) | ||
248 | 297 | - Cherrypick PR#410 to enable hardware-accelerated deflate. | ||
249 | 298 | - Copmile with DFLTCC enabled on s390x. LP: #1823157 | ||
250 | 299 | * Improve crc32 performance on P8, proposed upstream patch. LP: #1742941. | ||
251 | 300 | |||
252 | 301 | -- Matthias Klose <doko@ubuntu.com> Mon, 24 Feb 2020 12:57:03 +0100 | ||
253 | 302 | |||
254 | 83 | zlib (1:1.2.11.dfsg-1.2) unstable; urgency=medium | 303 | zlib (1:1.2.11.dfsg-1.2) unstable; urgency=medium |
255 | 84 | 304 | ||
256 | 85 | * Non-maintainer upload. | 305 | * Non-maintainer upload. |
257 | @@ -97,6 +317,31 @@ zlib (1:1.2.11.dfsg-1.1) unstable; urgency=medium | |||
258 | 97 | 317 | ||
259 | 98 | -- YunQiang Su <syq@debian.org> Tue, 28 Jan 2020 19:55:38 +0800 | 318 | -- YunQiang Su <syq@debian.org> Tue, 28 Jan 2020 19:55:38 +0800 |
260 | 99 | 319 | ||
261 | 320 | zlib (1:1.2.11.dfsg-1ubuntu3) eoan; urgency=medium | ||
262 | 321 | |||
263 | 322 | * Cherrypick PR#410 to enable hardware-accelerated deflate. | ||
264 | 323 | * Copmile with DFLTCC enabled on s390x. LP: #1823157 | ||
265 | 324 | |||
266 | 325 | -- Dimitri John Ledkov <xnox@ubuntu.com> Mon, 19 Aug 2019 19:51:09 +0100 | ||
267 | 326 | |||
268 | 327 | zlib (1:1.2.11.dfsg-1ubuntu2) disco; urgency=medium | ||
269 | 328 | |||
270 | 329 | * debian/zlib-core.symbols: fix mistake introduced in the merge | ||
271 | 330 | |||
272 | 331 | -- Jeremy Bicha <jbicha@debian.org> Thu, 24 Jan 2019 12:56:53 -0500 | ||
273 | 332 | |||
274 | 333 | zlib (1:1.2.11.dfsg-1ubuntu1) disco; urgency=medium | ||
275 | 334 | |||
276 | 335 | * Sync with Debian. Remaining changes: | ||
277 | 336 | - Build x32 packages | ||
278 | 337 | - debian/zlib-core.symbols: Drop dfsg suffix from version | ||
279 | 338 | - Add watch file, with GPG tarball checking, and version mangling | ||
280 | 339 | - Drop unused patches | ||
281 | 340 | - Cherry-pick Permit-a-deflateParams-parameter-change-asap.patch: | ||
282 | 341 | (LP: #1692870) | ||
283 | 342 | |||
284 | 343 | -- Jeremy Bicha <jbicha@debian.org> Wed, 23 Jan 2019 17:22:17 -0500 | ||
285 | 344 | |||
286 | 100 | zlib (1:1.2.11.dfsg-1) unstable; urgency=low | 345 | zlib (1:1.2.11.dfsg-1) unstable; urgency=low |
287 | 101 | 346 | ||
288 | 102 | * New upstream release (closes: #883180). | 347 | * New upstream release (closes: #883180). |
289 | @@ -1072,3 +1317,4 @@ zlib (1.0.4-1) unstable; urgency=low | |||
290 | 1072 | * Moved to new source packaging format. | 1317 | * Moved to new source packaging format. |
291 | 1073 | 1318 | ||
292 | 1074 | -- Michael Alan Dorman <mdorman@calder.med.miami.edu> Thu, 12 Sep 1996 15:19:35 -0400 | 1319 | -- Michael Alan Dorman <mdorman@calder.med.miami.edu> Thu, 12 Sep 1996 15:19:35 -0400 |
293 | 1320 | |||
294 | diff --git a/debian/control b/debian/control | |||
295 | index 3b4ff22..f365460 100644 | |||
296 | --- a/debian/control | |||
297 | +++ b/debian/control | |||
298 | @@ -1,7 +1,8 @@ | |||
299 | 1 | Source: zlib | 1 | Source: zlib |
300 | 2 | Section: libs | 2 | Section: libs |
301 | 3 | Priority: optional | 3 | Priority: optional |
303 | 4 | Maintainer: Mark Brown <broonie@debian.org> | 4 | Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com> |
304 | 5 | XSBC-Original-Maintainer: Mark Brown <broonie@debian.org> | ||
305 | 5 | Standards-Version: 4.6.1 | 6 | Standards-Version: 4.6.1 |
306 | 6 | Homepage: http://zlib.net/ | 7 | Homepage: http://zlib.net/ |
307 | 7 | Build-Depends: debhelper (>= 13), gcc-multilib [amd64 i386 kfreebsd-amd64 mips mipsel powerpc ppc64 s390 sparc s390x mipsn32 mipsn32el mipsr6 mipsr6el mipsn32r6 mipsn32r6el mips64 mips64el mips64r6 mips64r6el x32] <!nobiarch>, dpkg-dev (>= 1.16.1), autoconf | 8 | Build-Depends: debhelper (>= 13), gcc-multilib [amd64 i386 kfreebsd-amd64 mips mipsel powerpc ppc64 s390 sparc s390x mipsn32 mipsn32el mipsr6 mipsr6el mipsn32r6 mipsn32r6el mips64 mips64el mips64r6 mips64r6el x32] <!nobiarch>, dpkg-dev (>= 1.16.1), autoconf |
308 | @@ -119,6 +120,28 @@ Description: compression library - n32 - DO NOT USE EXCEPT FOR PACKAGING | |||
309 | 119 | not need to build packages should use multiarch to install the relevant | 120 | not need to build packages should use multiarch to install the relevant |
310 | 120 | runtime. | 121 | runtime. |
311 | 121 | 122 | ||
312 | 123 | Package: libx32z1 | ||
313 | 124 | Architecture: amd64 i386 | ||
314 | 125 | Depends: ${shlibs:Depends}, ${misc:Depends} | ||
315 | 126 | Description: compression library - x32 runtime | ||
316 | 127 | zlib is a library implementing the deflate compression method found | ||
317 | 128 | in gzip and PKZIP. This package includes a n32 version of the shared | ||
318 | 129 | library. | ||
319 | 130 | |||
320 | 131 | Package: libx32z1-dev | ||
321 | 132 | Section: libdevel | ||
322 | 133 | Architecture: amd64 i386 | ||
323 | 134 | Depends: libx32z1 (= ${binary:Version}), zlib1g-dev (= ${binary:Version}), libc6-dev-x32, ${misc:Depends} | ||
324 | 135 | Provides: libx32z-dev | ||
325 | 136 | Description: compression library - x32 - DO NOT USE EXCEPT FOR PACKAGING | ||
326 | 137 | zlib is a library implementing the deflate compression method found | ||
327 | 138 | in gzip and PKZIP. This package includes the development support | ||
328 | 139 | files for building n32 applications. | ||
329 | 140 | . | ||
330 | 141 | This package should ONLY be used for building packages, users who do | ||
331 | 142 | not need to build packages should use multiarch to install the relevant | ||
332 | 143 | runtime. | ||
333 | 144 | |||
334 | 122 | Package: minizip | 145 | Package: minizip |
335 | 123 | Section: utils | 146 | Section: utils |
336 | 124 | Architecture: any | 147 | Architecture: any |
337 | diff --git a/debian/libx32z1-dev.dirs b/debian/libx32z1-dev.dirs | |||
338 | new file mode 100644 | |||
339 | index 0000000..5447591 | |||
340 | --- /dev/null | |||
341 | +++ b/debian/libx32z1-dev.dirs | |||
342 | @@ -0,0 +1 @@ | |||
343 | 1 | usr/libx32 | ||
344 | diff --git a/debian/libx32z1-dev.install b/debian/libx32z1-dev.install | |||
345 | new file mode 100644 | |||
346 | index 0000000..a865054 | |||
347 | --- /dev/null | |||
348 | +++ b/debian/libx32z1-dev.install | |||
349 | @@ -0,0 +1,2 @@ | |||
350 | 1 | usr/libx32/libz.a | ||
351 | 2 | usr/libx32/libz.so | ||
352 | diff --git a/debian/libx32z1.dirs b/debian/libx32z1.dirs | |||
353 | new file mode 100644 | |||
354 | index 0000000..5447591 | |||
355 | --- /dev/null | |||
356 | +++ b/debian/libx32z1.dirs | |||
357 | @@ -0,0 +1 @@ | |||
358 | 1 | usr/libx32 | ||
359 | diff --git a/debian/libx32z1.install b/debian/libx32z1.install | |||
360 | new file mode 100644 | |||
361 | index 0000000..3ff82f2 | |||
362 | --- /dev/null | |||
363 | +++ b/debian/libx32z1.install | |||
364 | @@ -0,0 +1 @@ | |||
365 | 1 | usr/libx32/libz.so.* | ||
366 | diff --git a/debian/libx32z1.symbols b/debian/libx32z1.symbols | |||
367 | new file mode 100644 | |||
368 | index 0000000..a87cfdc | |||
369 | --- /dev/null | |||
370 | +++ b/debian/libx32z1.symbols | |||
371 | @@ -0,0 +1,3 @@ | |||
372 | 1 | libz.so.1 libx32z1 #MINVER# | ||
373 | 2 | #include "zlib-core.symbols" | ||
374 | 3 | #include "zlib-64.symbols" | ||
375 | diff --git a/debian/patches/power/add-optimized-crc32.patch b/debian/patches/power/add-optimized-crc32.patch | |||
376 | new file mode 100644 | |||
377 | index 0000000..b057b57 | |||
378 | --- /dev/null | |||
379 | +++ b/debian/patches/power/add-optimized-crc32.patch | |||
380 | @@ -0,0 +1,2539 @@ | |||
381 | 1 | From: Manjunath S Matti <mmatti@linux.ibm.com> | ||
382 | 2 | Date: Thu, 14 Sep 2023 06:43:11 -0500 | ||
383 | 3 | Subject: Add Power8+ optimized crc32 | ||
384 | 4 | |||
385 | 5 | This commit adds an optimized version for the crc32 function based | ||
386 | 6 | on crc32-vpmsum from https://github.com/antonblanchard/crc32-vpmsum/ | ||
387 | 7 | |||
388 | 8 | This is the C implementation created by Rogerio Alves | ||
389 | 9 | <rogealve@br.ibm.com> | ||
390 | 10 | |||
391 | 11 | It makes use of vector instructions to speed up CRC32 algorithm. | ||
392 | 12 | |||
393 | 13 | Author: Rogerio Alves <rcardoso@linux.ibm.com> | ||
394 | 14 | Signed-off-by: Manjunath Matti <mmatti@linux.ibm.com> | ||
395 | 15 | |||
396 | 16 | Origin: i-iii/zlib,https://github.com/iii-i/zlib/commit/6879bc81b111247939b4924b08c5993fd0482b1a | ||
397 | 17 | --- | ||
398 | 18 | .gitignore | 29 + | ||
399 | 19 | CMakeLists.txt | 7 +- | ||
400 | 20 | Makefile.in | 43 +- | ||
401 | 21 | configure | 7 +- | ||
402 | 22 | contrib/README.contrib | 3 +- | ||
403 | 23 | contrib/power/clang_workaround.h | 82 +++ | ||
404 | 24 | contrib/power/crc32_constants.h | 1206 ++++++++++++++++++++++++++++++++++++++ | ||
405 | 25 | contrib/power/crc32_z_power8.c | 679 +++++++++++++++++++++ | ||
406 | 26 | contrib/power/crc32_z_resolver.c | 15 + | ||
407 | 27 | contrib/power/power.h | 4 + | ||
408 | 28 | crc32.c | 12 + | ||
409 | 29 | test/crc32_test.c | 205 +++++++ | ||
410 | 30 | 12 files changed, 2278 insertions(+), 14 deletions(-) | ||
411 | 31 | create mode 100644 .gitignore | ||
412 | 32 | create mode 100644 contrib/power/clang_workaround.h | ||
413 | 33 | create mode 100644 contrib/power/crc32_constants.h | ||
414 | 34 | create mode 100644 contrib/power/crc32_z_power8.c | ||
415 | 35 | create mode 100644 contrib/power/crc32_z_resolver.c | ||
416 | 36 | create mode 100644 test/crc32_test.c | ||
417 | 37 | |||
418 | 38 | diff --git a/.gitignore b/.gitignore | ||
419 | 39 | new file mode 100644 | ||
420 | 40 | index 0000000..e324531 | ||
421 | 41 | --- /dev/null | ||
422 | 42 | +++ b/.gitignore | ||
423 | 43 | @@ -0,0 +1,29 @@ | ||
424 | 44 | +*.diff | ||
425 | 45 | +*.patch | ||
426 | 46 | +*.orig | ||
427 | 47 | +*.rej | ||
428 | 48 | + | ||
429 | 49 | +*~ | ||
430 | 50 | +*.a | ||
431 | 51 | +*.lo | ||
432 | 52 | +*.o | ||
433 | 53 | +*.dylib | ||
434 | 54 | + | ||
435 | 55 | +*.gcda | ||
436 | 56 | +*.gcno | ||
437 | 57 | +*.gcov | ||
438 | 58 | + | ||
439 | 59 | +/crc32_test | ||
440 | 60 | +/crc32_test64 | ||
441 | 61 | +/crc32_testsh | ||
442 | 62 | +/example | ||
443 | 63 | +/example64 | ||
444 | 64 | +/examplesh | ||
445 | 65 | +/libz.so* | ||
446 | 66 | +/minigzip | ||
447 | 67 | +/minigzip64 | ||
448 | 68 | +/minigzipsh | ||
449 | 69 | +/zlib.pc | ||
450 | 70 | +/configure.log | ||
451 | 71 | + | ||
452 | 72 | +.DS_Store | ||
453 | 73 | diff --git a/CMakeLists.txt b/CMakeLists.txt | ||
454 | 74 | index 4456cd7..0464ba3 100644 | ||
455 | 75 | --- a/CMakeLists.txt | ||
456 | 76 | +++ b/CMakeLists.txt | ||
457 | 77 | @@ -172,7 +172,8 @@ if(CMAKE_COMPILER_IS_GNUCC) | ||
458 | 78 | |||
459 | 79 | if(POWER8) | ||
460 | 80 | add_definitions(-DZ_POWER8) | ||
461 | 81 | - set(ZLIB_POWER8 ) | ||
462 | 82 | + set(ZLIB_POWER8 | ||
463 | 83 | + contrib/power/crc32_z_power8.c) | ||
464 | 84 | |||
465 | 85 | set_source_files_properties( | ||
466 | 86 | ${ZLIB_POWER8} | ||
467 | 87 | @@ -269,6 +270,10 @@ add_executable(example test/example.c) | ||
468 | 88 | target_link_libraries(example zlib) | ||
469 | 89 | add_test(example example) | ||
470 | 90 | |||
471 | 91 | +add_executable(crc32_test test/crc32_test.c) | ||
472 | 92 | +target_link_libraries(crc32_test zlib) | ||
473 | 93 | +add_test(crc32_test crc32_test) | ||
474 | 94 | + | ||
475 | 95 | add_executable(minigzip test/minigzip.c) | ||
476 | 96 | target_link_libraries(minigzip zlib) | ||
477 | 97 | |||
478 | 98 | diff --git a/Makefile.in b/Makefile.in | ||
479 | 99 | index 34d3cd7..2dbb20a 100644 | ||
480 | 100 | --- a/Makefile.in | ||
481 | 101 | +++ b/Makefile.in | ||
482 | 102 | @@ -71,11 +71,11 @@ PIC_OBJS = $(PIC_OBJC) $(PIC_OBJA) | ||
483 | 103 | |||
484 | 104 | all: static shared | ||
485 | 105 | |||
486 | 106 | -static: example$(EXE) minigzip$(EXE) | ||
487 | 107 | +static: crc32_test$(EXE) example$(EXE) minigzip$(EXE) | ||
488 | 108 | |||
489 | 109 | -shared: examplesh$(EXE) minigzipsh$(EXE) | ||
490 | 110 | +shared: crc32_testsh$(EXE) examplesh$(EXE) minigzipsh$(EXE) | ||
491 | 111 | |||
492 | 112 | -all64: example64$(EXE) minigzip64$(EXE) | ||
493 | 113 | +all64: crc32_test64$(EXE) example64$(EXE) minigzip64$(EXE) | ||
494 | 114 | |||
495 | 115 | check: test | ||
496 | 116 | |||
497 | 117 | @@ -83,7 +83,7 @@ test: all teststatic testshared | ||
498 | 118 | |||
499 | 119 | teststatic: static | ||
500 | 120 | @TMPST=tmpst_$$; \ | ||
501 | 121 | - if echo hello world | ${QEMU_RUN} ./minigzip | ${QEMU_RUN} ./minigzip -d && ${QEMU_RUN} ./example $$TMPST ; then \ | ||
502 | 122 | + if echo hello world | ${QEMU_RUN} ./minigzip | ${QEMU_RUN} ./minigzip -d && ${QEMU_RUN} ./example $$TMPST && ${QEMU_RUN} ./crc32_test; then \ | ||
503 | 123 | echo ' *** zlib test OK ***'; \ | ||
504 | 124 | else \ | ||
505 | 125 | echo ' *** zlib test FAILED ***'; false; \ | ||
506 | 126 | @@ -96,7 +96,7 @@ testshared: shared | ||
507 | 127 | DYLD_LIBRARY_PATH=`pwd`:$(DYLD_LIBRARY_PATH) ; export DYLD_LIBRARY_PATH; \ | ||
508 | 128 | SHLIB_PATH=`pwd`:$(SHLIB_PATH) ; export SHLIB_PATH; \ | ||
509 | 129 | TMPSH=tmpsh_$$; \ | ||
510 | 130 | - if echo hello world | ${QEMU_RUN} ./minigzipsh | ${QEMU_RUN} ./minigzipsh -d && ${QEMU_RUN} ./examplesh $$TMPSH; then \ | ||
511 | 131 | + if echo hello world | ${QEMU_RUN} ./minigzipsh | ${QEMU_RUN} ./minigzipsh -d && ${QEMU_RUN} ./examplesh $$TMPSH && ${QEMU_RUN} ./crc32_testsh; then \ | ||
512 | 132 | echo ' *** zlib shared test OK ***'; \ | ||
513 | 133 | else \ | ||
514 | 134 | echo ' *** zlib shared test FAILED ***'; false; \ | ||
515 | 135 | @@ -105,7 +105,7 @@ testshared: shared | ||
516 | 136 | |||
517 | 137 | test64: all64 | ||
518 | 138 | @TMP64=tmp64_$$; \ | ||
519 | 139 | - if echo hello world | ${QEMU_RUN} ./minigzip64 | ${QEMU_RUN} ./minigzip64 -d && ${QEMU_RUN} ./example64 $$TMP64; then \ | ||
520 | 140 | + if echo hello world | ${QEMU_RUN} ./minigzip64 | ${QEMU_RUN} ./minigzip64 -d && ${QEMU_RUN} ./example64 $$TMP64 && ${QEMU_RUN} ./crc32_test64; then \ | ||
521 | 141 | echo ' *** zlib 64-bit test OK ***'; \ | ||
522 | 142 | else \ | ||
523 | 143 | echo ' *** zlib 64-bit test FAILED ***'; false; \ | ||
524 | 144 | @@ -139,12 +139,18 @@ match.lo: match.S | ||
525 | 145 | mv _match.o match.lo | ||
526 | 146 | rm -f _match.s | ||
527 | 147 | |||
528 | 148 | +crc32_test.o: $(SRCDIR)test/crc32_test.c $(SRCDIR)zlib.h zconf.h | ||
529 | 149 | + $(CC) $(CFLAGS) $(ZINCOUT) -c -o $@ $(SRCDIR)test/crc32_test.c | ||
530 | 150 | + | ||
531 | 151 | example.o: $(SRCDIR)test/example.c $(SRCDIR)zlib.h zconf.h | ||
532 | 152 | $(CC) $(CFLAGS) $(ZINCOUT) -c -o $@ $(SRCDIR)test/example.c | ||
533 | 153 | |||
534 | 154 | minigzip.o: $(SRCDIR)test/minigzip.c $(SRCDIR)zlib.h zconf.h | ||
535 | 155 | $(CC) $(CFLAGS) $(ZINCOUT) -c -o $@ $(SRCDIR)test/minigzip.c | ||
536 | 156 | |||
537 | 157 | +crc32_test64.o: $(SRCDIR)test/crc32_test.c $(SRCDIR)zlib.h zconf.h | ||
538 | 158 | + $(CC) $(CFLAGS) $(ZINCOUT) -D_FILE_OFFSET_BITS=64 -c -o $@ $(SRCDIR)test/crc32_test.c | ||
539 | 159 | + | ||
540 | 160 | example64.o: $(SRCDIR)test/example.c $(SRCDIR)zlib.h zconf.h | ||
541 | 161 | $(CC) $(CFLAGS) $(ZINCOUT) -D_FILE_OFFSET_BITS=64 -c -o $@ $(SRCDIR)test/example.c | ||
542 | 162 | |||
543 | 163 | @@ -158,6 +164,9 @@ adler32.o: $(SRCDIR)adler32.c | ||
544 | 164 | crc32.o: $(SRCDIR)crc32.c | ||
545 | 165 | $(CC) $(CFLAGS) $(ZINC) -c -o $@ $(SRCDIR)crc32.c | ||
546 | 166 | |||
547 | 167 | +crc32_z_power8.o: $(SRCDIR)contrib/power/crc32_z_power8.c | ||
548 | 168 | + $(CC) $(CFLAGS) -mcpu=power8 $(ZINC) -c -o $@ $(SRCDIR)contrib/power/crc32_z_power8.c | ||
549 | 169 | + | ||
550 | 170 | deflate.o: $(SRCDIR)deflate.c | ||
551 | 171 | $(CC) $(CFLAGS) $(ZINC) -c -o $@ $(SRCDIR)deflate.c | ||
552 | 172 | |||
553 | 173 | @@ -208,6 +217,11 @@ crc32.lo: $(SRCDIR)crc32.c | ||
554 | 174 | $(CC) $(SFLAGS) $(ZINC) -DPIC -c -o objs/crc32.o $(SRCDIR)crc32.c | ||
555 | 175 | -@mv objs/crc32.o $@ | ||
556 | 176 | |||
557 | 177 | +crc32_z_power8.lo: $(SRCDIR)contrib/power/crc32_z_power8.c | ||
558 | 178 | + -@mkdir objs 2>/dev/null || test -d objs | ||
559 | 179 | + $(CC) $(SFLAGS) -mcpu=power8 $(ZINC) -DPIC -c -o objs/crc32_z_power8.o $(SRCDIR)contrib/power/crc32_z_power8.c | ||
560 | 180 | + -@mv objs/crc32_z_power8.o $@ | ||
561 | 181 | + | ||
562 | 182 | deflate.lo: $(SRCDIR)deflate.c | ||
563 | 183 | -@mkdir objs 2>/dev/null || test -d objs | ||
564 | 184 | $(CC) $(SFLAGS) $(ZINC) -DPIC -c -o objs/deflate.o $(SRCDIR)deflate.c | ||
565 | 185 | @@ -281,18 +295,27 @@ placebo $(SHAREDLIBV): $(PIC_OBJS) libz.a | ||
566 | 186 | ln -s $@ $(SHAREDLIBM) | ||
567 | 187 | -@rmdir objs | ||
568 | 188 | |||
569 | 189 | +crc32_test$(EXE): crc32_test.o $(STATICLIB) | ||
570 | 190 | + $(CC) $(CFLAGS) -o $@ crc32_test.o $(TEST_LDFLAGS) | ||
571 | 191 | + | ||
572 | 192 | example$(EXE): example.o $(STATICLIB) | ||
573 | 193 | $(CC) $(CFLAGS) -o $@ example.o $(TEST_LDFLAGS) | ||
574 | 194 | |||
575 | 195 | minigzip$(EXE): minigzip.o $(STATICLIB) | ||
576 | 196 | $(CC) $(CFLAGS) -o $@ minigzip.o $(TEST_LDFLAGS) | ||
577 | 197 | |||
578 | 198 | +crc32_testsh$(EXE): crc32_test.o $(SHAREDLIBV) | ||
579 | 199 | + $(CC) $(CFLAGS) -o $@ crc32_test.o -L. $(SHAREDLIBV) | ||
580 | 200 | + | ||
581 | 201 | examplesh$(EXE): example.o $(SHAREDLIBV) | ||
582 | 202 | $(CC) $(CFLAGS) -o $@ example.o $(LDFLAGS) -L. $(SHAREDLIBV) | ||
583 | 203 | |||
584 | 204 | minigzipsh$(EXE): minigzip.o $(SHAREDLIBV) | ||
585 | 205 | $(CC) $(CFLAGS) -o $@ minigzip.o $(LDFLAGS) -L. $(SHAREDLIBV) | ||
586 | 206 | |||
587 | 207 | +crc32_test64$(EXE): crc32_test64.o $(STATICLIB) | ||
588 | 208 | + $(CC) $(CFLAGS) -o $@ crc32_test64.o $(TEST_LDFLAGS) | ||
589 | 209 | + | ||
590 | 210 | example64$(EXE): example64.o $(STATICLIB) | ||
591 | 211 | $(CC) $(CFLAGS) -o $@ example64.o $(TEST_LDFLAGS) | ||
592 | 212 | |||
593 | 213 | @@ -368,8 +391,8 @@ minizip-clean: | ||
594 | 214 | mostlyclean: clean | ||
595 | 215 | clean: minizip-clean | ||
596 | 216 | rm -f *.o *.lo *~ \ | ||
597 | 217 | - example$(EXE) minigzip$(EXE) examplesh$(EXE) minigzipsh$(EXE) \ | ||
598 | 218 | - example64$(EXE) minigzip64$(EXE) \ | ||
599 | 219 | + crc32_test$(EXE) example$(EXE) minigzip$(EXE) crc32_testsh$(EXE) examplesh$(EXE) minigzipsh$(EXE) \ | ||
600 | 220 | + crc32_test64$(EXE) example64$(EXE) minigzip64$(EXE) \ | ||
601 | 221 | infcover \ | ||
602 | 222 | libz.* foo.gz so_locations \ | ||
603 | 223 | _match.s maketree contrib/infback9/*.o | ||
604 | 224 | @@ -391,7 +414,7 @@ tags: | ||
605 | 225 | |||
606 | 226 | adler32.o zutil.o: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h | ||
607 | 227 | gzclose.o gzlib.o gzread.o gzwrite.o: $(SRCDIR)zlib.h zconf.h $(SRCDIR)gzguts.h | ||
608 | 228 | -compress.o example.o minigzip.o uncompr.o: $(SRCDIR)zlib.h zconf.h | ||
609 | 229 | +compress.o crc32_test.o example.o minigzip.o uncompr.o: $(SRCDIR)zlib.h zconf.h | ||
610 | 230 | crc32.o: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)crc32.h | ||
611 | 231 | deflate.o: $(SRCDIR)deflate.h $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h | ||
612 | 232 | infback.o inflate.o: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)inftrees.h $(SRCDIR)inflate.h $(SRCDIR)inffast.h $(SRCDIR)inffixed.h | ||
613 | 233 | @@ -401,7 +424,7 @@ trees.o: $(SRCDIR)deflate.h $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)tr | ||
614 | 234 | |||
615 | 235 | adler32.lo zutil.lo: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h | ||
616 | 236 | gzclose.lo gzlib.lo gzread.lo gzwrite.lo: $(SRCDIR)zlib.h zconf.h $(SRCDIR)gzguts.h | ||
617 | 237 | -compress.lo example.lo minigzip.lo uncompr.lo: $(SRCDIR)zlib.h zconf.h | ||
618 | 238 | +compress.lo crc32_test.lo example.lo minigzip.lo uncompr.lo: $(SRCDIR)zlib.h zconf.h | ||
619 | 239 | crc32.lo: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)crc32.h | ||
620 | 240 | deflate.lo: $(SRCDIR)deflate.h $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h | ||
621 | 241 | infback.lo inflate.lo: $(SRCDIR)zutil.h $(SRCDIR)zlib.h zconf.h $(SRCDIR)inftrees.h $(SRCDIR)inflate.h $(SRCDIR)inffast.h $(SRCDIR)inffixed.h | ||
622 | 242 | diff --git a/configure b/configure | ||
623 | 243 | index e307a8d..b96ed4a 100755 | ||
624 | 244 | --- a/configure | ||
625 | 245 | +++ b/configure | ||
626 | 246 | @@ -864,6 +864,9 @@ cat > $test.c <<EOF | ||
627 | 247 | #ifndef _ARCH_PPC | ||
628 | 248 | #error "Target is not Power" | ||
629 | 249 | #endif | ||
630 | 250 | +#if !(defined(__PPC64__) || defined(__powerpc64__)) | ||
631 | 251 | + #error "Target is not 64 bits" | ||
632 | 252 | +#endif | ||
633 | 253 | #ifndef HAVE_IFUNC | ||
634 | 254 | #error "Target doesn't support ifunc" | ||
635 | 255 | #endif | ||
636 | 256 | @@ -877,8 +880,8 @@ if tryboth $CC -c $CFLAGS $test.c; then | ||
637 | 257 | |||
638 | 258 | if tryboth $CC -c $CFLAGS -mcpu=power8 $test.c; then | ||
639 | 259 | POWER8="-DZ_POWER8" | ||
640 | 260 | - PIC_OBJC="${PIC_OBJC}" | ||
641 | 261 | - OBJC="${OBJC}" | ||
642 | 262 | + PIC_OBJC="${PIC_OBJC} crc32_z_power8.lo" | ||
643 | 263 | + OBJC="${OBJC} crc32_z_power8.o" | ||
644 | 264 | echo "Checking for -mcpu=power8 support... Yes." | tee -a configure.log | ||
645 | 265 | else | ||
646 | 266 | echo "Checking for -mcpu=power8 support... No." | tee -a configure.log | ||
647 | 267 | diff --git a/contrib/README.contrib b/contrib/README.contrib | ||
648 | 268 | index c57b520..90170df 100644 | ||
649 | 269 | --- a/contrib/README.contrib | ||
650 | 270 | +++ b/contrib/README.contrib | ||
651 | 271 | @@ -46,7 +46,8 @@ minizip/ by Gilles Vollant <info@winimage.com> | ||
652 | 272 | pascal/ by Bob Dellaca <bobdl@xtra.co.nz> et al. | ||
653 | 273 | Support for Pascal | ||
654 | 274 | |||
655 | 275 | -power/ by Matheus Castanho <msc@linux.ibm.com> | ||
656 | 276 | +power/ by Daniel Black <daniel@linux.ibm.com> | ||
657 | 277 | + Matheus Castanho <msc@linux.ibm.com> | ||
658 | 278 | and Rogerio Alves <rcardoso@linux.ibm.com> | ||
659 | 279 | Optimized functions for Power processors | ||
660 | 280 | |||
661 | 281 | diff --git a/contrib/power/clang_workaround.h b/contrib/power/clang_workaround.h | ||
662 | 282 | new file mode 100644 | ||
663 | 283 | index 0000000..b5e7dae | ||
664 | 284 | --- /dev/null | ||
665 | 285 | +++ b/contrib/power/clang_workaround.h | ||
666 | 286 | @@ -0,0 +1,82 @@ | ||
667 | 287 | +#ifndef CLANG_WORKAROUNDS_H | ||
668 | 288 | +#define CLANG_WORKAROUNDS_H | ||
669 | 289 | + | ||
670 | 290 | +/* | ||
671 | 291 | + * These stubs fix clang incompatibilities with GCC builtins. | ||
672 | 292 | + */ | ||
673 | 293 | + | ||
674 | 294 | +#ifndef __builtin_crypto_vpmsumw | ||
675 | 295 | +#define __builtin_crypto_vpmsumw __builtin_crypto_vpmsumb | ||
676 | 296 | +#endif | ||
677 | 297 | +#ifndef __builtin_crypto_vpmsumd | ||
678 | 298 | +#define __builtin_crypto_vpmsumd __builtin_crypto_vpmsumb | ||
679 | 299 | +#endif | ||
680 | 300 | + | ||
681 | 301 | +static inline | ||
682 | 302 | +__vector unsigned long long __attribute__((overloadable)) | ||
683 | 303 | +vec_ld(int __a, const __vector unsigned long long* __b) | ||
684 | 304 | +{ | ||
685 | 305 | + return (__vector unsigned long long)__builtin_altivec_lvx(__a, __b); | ||
686 | 306 | +} | ||
687 | 307 | + | ||
688 | 308 | +/* | ||
689 | 309 | + * GCC __builtin_pack_vector_int128 returns a vector __int128_t but Clang | ||
690 | 310 | + * does not recognize this type. On GCC this builtin is translated to a | ||
691 | 311 | + * xxpermdi instruction that only moves the registers __a, __b instead generates | ||
692 | 312 | + * a load. | ||
693 | 313 | + * | ||
694 | 314 | + * Clang has vec_xxpermdi intrinsics. It was implemented in 4.0.0. | ||
695 | 315 | + */ | ||
696 | 316 | +static inline | ||
697 | 317 | +__vector unsigned long long __builtin_pack_vector (unsigned long __a, | ||
698 | 318 | + unsigned long __b) | ||
699 | 319 | +{ | ||
700 | 320 | + #if defined(__BIG_ENDIAN__) | ||
701 | 321 | + __vector unsigned long long __v = {__a, __b}; | ||
702 | 322 | + #else | ||
703 | 323 | + __vector unsigned long long __v = {__b, __a}; | ||
704 | 324 | + #endif | ||
705 | 325 | + return __v; | ||
706 | 326 | +} | ||
707 | 327 | + | ||
708 | 328 | +#ifndef vec_xxpermdi | ||
709 | 329 | + | ||
710 | 330 | +static inline | ||
711 | 331 | +unsigned long __builtin_unpack_vector (__vector unsigned long long __v, | ||
712 | 332 | + int __o) | ||
713 | 333 | +{ | ||
714 | 334 | + return __v[__o]; | ||
715 | 335 | +} | ||
716 | 336 | + | ||
717 | 337 | +#if defined(__BIG_ENDIAN__) | ||
718 | 338 | +#define __builtin_unpack_vector_0(a) __builtin_unpack_vector ((a), 0) | ||
719 | 339 | +#define __builtin_unpack_vector_1(a) __builtin_unpack_vector ((a), 1) | ||
720 | 340 | +#else | ||
721 | 341 | +#define __builtin_unpack_vector_0(a) __builtin_unpack_vector ((a), 1) | ||
722 | 342 | +#define __builtin_unpack_vector_1(a) __builtin_unpack_vector ((a), 0) | ||
723 | 343 | +#endif | ||
724 | 344 | + | ||
725 | 345 | +#else | ||
726 | 346 | + | ||
727 | 347 | +static inline | ||
728 | 348 | +unsigned long __builtin_unpack_vector_0 (__vector unsigned long long __v) | ||
729 | 349 | +{ | ||
730 | 350 | + #if defined(__BIG_ENDIAN__) | ||
731 | 351 | + return vec_xxpermdi(__v, __v, 0x0)[1]; | ||
732 | 352 | + #else | ||
733 | 353 | + return vec_xxpermdi(__v, __v, 0x0)[0]; | ||
734 | 354 | + #endif | ||
735 | 355 | +} | ||
736 | 356 | + | ||
737 | 357 | +static inline | ||
738 | 358 | +unsigned long __builtin_unpack_vector_1 (__vector unsigned long long __v) | ||
739 | 359 | +{ | ||
740 | 360 | + #if defined(__BIG_ENDIAN__) | ||
741 | 361 | + return vec_xxpermdi(__v, __v, 0x3)[1]; | ||
742 | 362 | + #else | ||
743 | 363 | + return vec_xxpermdi(__v, __v, 0x3)[0]; | ||
744 | 364 | + #endif | ||
745 | 365 | +} | ||
746 | 366 | +#endif /* vec_xxpermdi */ | ||
747 | 367 | + | ||
748 | 368 | +#endif | ||
749 | 369 | diff --git a/contrib/power/crc32_constants.h b/contrib/power/crc32_constants.h | ||
750 | 370 | new file mode 100644 | ||
751 | 371 | index 0000000..3d01150 | ||
752 | 372 | --- /dev/null | ||
753 | 373 | +++ b/contrib/power/crc32_constants.h | ||
754 | 374 | @@ -0,0 +1,1206 @@ | ||
755 | 375 | +/* | ||
756 | 376 | +* | ||
757 | 377 | +* THIS FILE IS GENERATED WITH | ||
758 | 378 | +./crc32_constants -c -r -x 0x04C11DB7 | ||
759 | 379 | + | ||
760 | 380 | +* This is from https://github.com/antonblanchard/crc32-vpmsum/ | ||
761 | 381 | +* DO NOT MODIFY IT MANUALLY! | ||
762 | 382 | +* | ||
763 | 383 | +*/ | ||
764 | 384 | + | ||
765 | 385 | +#define CRC 0x4c11db7 | ||
766 | 386 | +#define CRC_XOR | ||
767 | 387 | +#define REFLECT | ||
768 | 388 | +#define MAX_SIZE 32768 | ||
769 | 389 | + | ||
770 | 390 | +#ifndef __ASSEMBLER__ | ||
771 | 391 | +#ifdef CRC_TABLE | ||
772 | 392 | +static const unsigned int crc_table[] = { | ||
773 | 393 | + 0x00000000, 0x77073096, 0xee0e612c, 0x990951ba, | ||
774 | 394 | + 0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3, | ||
775 | 395 | + 0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988, | ||
776 | 396 | + 0x09b64c2b, 0x7eb17cbd, 0xe7b82d07, 0x90bf1d91, | ||
777 | 397 | + 0x1db71064, 0x6ab020f2, 0xf3b97148, 0x84be41de, | ||
778 | 398 | + 0x1adad47d, 0x6ddde4eb, 0xf4d4b551, 0x83d385c7, | ||
779 | 399 | + 0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec, | ||
780 | 400 | + 0x14015c4f, 0x63066cd9, 0xfa0f3d63, 0x8d080df5, | ||
781 | 401 | + 0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172, | ||
782 | 402 | + 0x3c03e4d1, 0x4b04d447, 0xd20d85fd, 0xa50ab56b, | ||
783 | 403 | + 0x35b5a8fa, 0x42b2986c, 0xdbbbc9d6, 0xacbcf940, | ||
784 | 404 | + 0x32d86ce3, 0x45df5c75, 0xdcd60dcf, 0xabd13d59, | ||
785 | 405 | + 0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116, | ||
786 | 406 | + 0x21b4f4b5, 0x56b3c423, 0xcfba9599, 0xb8bda50f, | ||
787 | 407 | + 0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924, | ||
788 | 408 | + 0x2f6f7c87, 0x58684c11, 0xc1611dab, 0xb6662d3d, | ||
789 | 409 | + 0x76dc4190, 0x01db7106, 0x98d220bc, 0xefd5102a, | ||
790 | 410 | + 0x71b18589, 0x06b6b51f, 0x9fbfe4a5, 0xe8b8d433, | ||
791 | 411 | + 0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818, | ||
792 | 412 | + 0x7f6a0dbb, 0x086d3d2d, 0x91646c97, 0xe6635c01, | ||
793 | 413 | + 0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e, | ||
794 | 414 | + 0x6c0695ed, 0x1b01a57b, 0x8208f4c1, 0xf50fc457, | ||
795 | 415 | + 0x65b0d9c6, 0x12b7e950, 0x8bbeb8ea, 0xfcb9887c, | ||
796 | 416 | + 0x62dd1ddf, 0x15da2d49, 0x8cd37cf3, 0xfbd44c65, | ||
797 | 417 | + 0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2, | ||
798 | 418 | + 0x4adfa541, 0x3dd895d7, 0xa4d1c46d, 0xd3d6f4fb, | ||
799 | 419 | + 0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0, | ||
800 | 420 | + 0x44042d73, 0x33031de5, 0xaa0a4c5f, 0xdd0d7cc9, | ||
801 | 421 | + 0x5005713c, 0x270241aa, 0xbe0b1010, 0xc90c2086, | ||
802 | 422 | + 0x5768b525, 0x206f85b3, 0xb966d409, 0xce61e49f, | ||
803 | 423 | + 0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, | ||
804 | 424 | + 0x59b33d17, 0x2eb40d81, 0xb7bd5c3b, 0xc0ba6cad, | ||
805 | 425 | + 0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a, | ||
806 | 426 | + 0xead54739, 0x9dd277af, 0x04db2615, 0x73dc1683, | ||
807 | 427 | + 0xe3630b12, 0x94643b84, 0x0d6d6a3e, 0x7a6a5aa8, | ||
808 | 428 | + 0xe40ecf0b, 0x9309ff9d, 0x0a00ae27, 0x7d079eb1, | ||
809 | 429 | + 0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, | ||
810 | 430 | + 0xf762575d, 0x806567cb, 0x196c3671, 0x6e6b06e7, | ||
811 | 431 | + 0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc, | ||
812 | 432 | + 0xf9b9df6f, 0x8ebeeff9, 0x17b7be43, 0x60b08ed5, | ||
813 | 433 | + 0xd6d6a3e8, 0xa1d1937e, 0x38d8c2c4, 0x4fdff252, | ||
814 | 434 | + 0xd1bb67f1, 0xa6bc5767, 0x3fb506dd, 0x48b2364b, | ||
815 | 435 | + 0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, | ||
816 | 436 | + 0xdf60efc3, 0xa867df55, 0x316e8eef, 0x4669be79, | ||
817 | 437 | + 0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236, | ||
818 | 438 | + 0xcc0c7795, 0xbb0b4703, 0x220216b9, 0x5505262f, | ||
819 | 439 | + 0xc5ba3bbe, 0xb2bd0b28, 0x2bb45a92, 0x5cb36a04, | ||
820 | 440 | + 0xc2d7ffa7, 0xb5d0cf31, 0x2cd99e8b, 0x5bdeae1d, | ||
821 | 441 | + 0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, | ||
822 | 442 | + 0x9c0906a9, 0xeb0e363f, 0x72076785, 0x05005713, | ||
823 | 443 | + 0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38, | ||
824 | 444 | + 0x92d28e9b, 0xe5d5be0d, 0x7cdcefb7, 0x0bdbdf21, | ||
825 | 445 | + 0x86d3d2d4, 0xf1d4e242, 0x68ddb3f8, 0x1fda836e, | ||
826 | 446 | + 0x81be16cd, 0xf6b9265b, 0x6fb077e1, 0x18b74777, | ||
827 | 447 | + 0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, | ||
828 | 448 | + 0x8f659eff, 0xf862ae69, 0x616bffd3, 0x166ccf45, | ||
829 | 449 | + 0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2, | ||
830 | 450 | + 0xa7672661, 0xd06016f7, 0x4969474d, 0x3e6e77db, | ||
831 | 451 | + 0xaed16a4a, 0xd9d65adc, 0x40df0b66, 0x37d83bf0, | ||
832 | 452 | + 0xa9bcae53, 0xdebb9ec5, 0x47b2cf7f, 0x30b5ffe9, | ||
833 | 453 | + 0xbdbdf21c, 0xcabac28a, 0x53b39330, 0x24b4a3a6, | ||
834 | 454 | + 0xbad03605, 0xcdd70693, 0x54de5729, 0x23d967bf, | ||
835 | 455 | + 0xb3667a2e, 0xc4614ab8, 0x5d681b02, 0x2a6f2b94, | ||
836 | 456 | + 0xb40bbe37, 0xc30c8ea1, 0x5a05df1b, 0x2d02ef8d,}; | ||
837 | 457 | + | ||
838 | 458 | +#endif /* CRC_TABLE */ | ||
839 | 459 | +#ifdef POWER8_INTRINSICS | ||
840 | 460 | + | ||
841 | 461 | +/* Constants */ | ||
842 | 462 | + | ||
843 | 463 | +/* Reduce 262144 kbits to 1024 bits */ | ||
844 | 464 | +static const __vector unsigned long long vcrc_const[255] | ||
845 | 465 | + __attribute__((aligned (16))) = { | ||
846 | 466 | +#ifdef __LITTLE_ENDIAN__ | ||
847 | 467 | + /* x^261120 mod p(x)` << 1, x^261184 mod p(x)` << 1 */ | ||
848 | 468 | + { 0x0000000099ea94a8, 0x00000001651797d2 }, | ||
849 | 469 | + /* x^260096 mod p(x)` << 1, x^260160 mod p(x)` << 1 */ | ||
850 | 470 | + { 0x00000000945a8420, 0x0000000021e0d56c }, | ||
851 | 471 | + /* x^259072 mod p(x)` << 1, x^259136 mod p(x)` << 1 */ | ||
852 | 472 | + { 0x0000000030762706, 0x000000000f95ecaa }, | ||
853 | 473 | + /* x^258048 mod p(x)` << 1, x^258112 mod p(x)` << 1 */ | ||
854 | 474 | + { 0x00000001a52fc582, 0x00000001ebd224ac }, | ||
855 | 475 | + /* x^257024 mod p(x)` << 1, x^257088 mod p(x)` << 1 */ | ||
856 | 476 | + { 0x00000001a4a7167a, 0x000000000ccb97ca }, | ||
857 | 477 | + /* x^256000 mod p(x)` << 1, x^256064 mod p(x)` << 1 */ | ||
858 | 478 | + { 0x000000000c18249a, 0x00000001006ec8a8 }, | ||
859 | 479 | + /* x^254976 mod p(x)` << 1, x^255040 mod p(x)` << 1 */ | ||
860 | 480 | + { 0x00000000a924ae7c, 0x000000014f58f196 }, | ||
861 | 481 | + /* x^253952 mod p(x)` << 1, x^254016 mod p(x)` << 1 */ | ||
862 | 482 | + { 0x00000001e12ccc12, 0x00000001a7192ca6 }, | ||
863 | 483 | + /* x^252928 mod p(x)` << 1, x^252992 mod p(x)` << 1 */ | ||
864 | 484 | + { 0x00000000a0b9d4ac, 0x000000019a64bab2 }, | ||
865 | 485 | + /* x^251904 mod p(x)` << 1, x^251968 mod p(x)` << 1 */ | ||
866 | 486 | + { 0x0000000095e8ddfe, 0x0000000014f4ed2e }, | ||
867 | 487 | + /* x^250880 mod p(x)` << 1, x^250944 mod p(x)` << 1 */ | ||
868 | 488 | + { 0x00000000233fddc4, 0x000000011092b6a2 }, | ||
869 | 489 | + /* x^249856 mod p(x)` << 1, x^249920 mod p(x)` << 1 */ | ||
870 | 490 | + { 0x00000001b4529b62, 0x00000000c8a1629c }, | ||
871 | 491 | + /* x^248832 mod p(x)` << 1, x^248896 mod p(x)` << 1 */ | ||
872 | 492 | + { 0x00000001a7fa0e64, 0x000000017bf32e8e }, | ||
873 | 493 | + /* x^247808 mod p(x)` << 1, x^247872 mod p(x)` << 1 */ | ||
874 | 494 | + { 0x00000001b5334592, 0x00000001f8cc6582 }, | ||
875 | 495 | + /* x^246784 mod p(x)` << 1, x^246848 mod p(x)` << 1 */ | ||
876 | 496 | + { 0x000000011f8ee1b4, 0x000000008631ddf0 }, | ||
877 | 497 | + /* x^245760 mod p(x)` << 1, x^245824 mod p(x)` << 1 */ | ||
878 | 498 | + { 0x000000006252e632, 0x000000007e5a76d0 }, | ||
879 | 499 | + /* x^244736 mod p(x)` << 1, x^244800 mod p(x)` << 1 */ | ||
880 | 500 | + { 0x00000000ab973e84, 0x000000002b09b31c }, | ||
881 | 501 | + /* x^243712 mod p(x)` << 1, x^243776 mod p(x)` << 1 */ | ||
882 | 502 | + { 0x000000007734f5ec, 0x00000001b2df1f84 }, | ||
883 | 503 | + /* x^242688 mod p(x)` << 1, x^242752 mod p(x)` << 1 */ | ||
884 | 504 | + { 0x000000007c547798, 0x00000001d6f56afc }, | ||
885 | 505 | + /* x^241664 mod p(x)` << 1, x^241728 mod p(x)` << 1 */ | ||
886 | 506 | + { 0x000000007ec40210, 0x00000001b9b5e70c }, | ||
887 | 507 | + /* x^240640 mod p(x)` << 1, x^240704 mod p(x)` << 1 */ | ||
888 | 508 | + { 0x00000001ab1695a8, 0x0000000034b626d2 }, | ||
889 | 509 | + /* x^239616 mod p(x)` << 1, x^239680 mod p(x)` << 1 */ | ||
890 | 510 | + { 0x0000000090494bba, 0x000000014c53479a }, | ||
891 | 511 | + /* x^238592 mod p(x)` << 1, x^238656 mod p(x)` << 1 */ | ||
892 | 512 | + { 0x00000001123fb816, 0x00000001a6d179a4 }, | ||
893 | 513 | + /* x^237568 mod p(x)` << 1, x^237632 mod p(x)` << 1 */ | ||
894 | 514 | + { 0x00000001e188c74c, 0x000000015abd16b4 }, | ||
895 | 515 | + /* x^236544 mod p(x)` << 1, x^236608 mod p(x)` << 1 */ | ||
896 | 516 | + { 0x00000001c2d3451c, 0x00000000018f9852 }, | ||
897 | 517 | + /* x^235520 mod p(x)` << 1, x^235584 mod p(x)` << 1 */ | ||
898 | 518 | + { 0x00000000f55cf1ca, 0x000000001fb3084a }, | ||
899 | 519 | + /* x^234496 mod p(x)` << 1, x^234560 mod p(x)` << 1 */ | ||
900 | 520 | + { 0x00000001a0531540, 0x00000000c53dfb04 }, | ||
901 | 521 | + /* x^233472 mod p(x)` << 1, x^233536 mod p(x)` << 1 */ | ||
902 | 522 | + { 0x0000000132cd7ebc, 0x00000000e10c9ad6 }, | ||
903 | 523 | + /* x^232448 mod p(x)` << 1, x^232512 mod p(x)` << 1 */ | ||
904 | 524 | + { 0x0000000073ab7f36, 0x0000000025aa994a }, | ||
905 | 525 | + /* x^231424 mod p(x)` << 1, x^231488 mod p(x)` << 1 */ | ||
906 | 526 | + { 0x0000000041aed1c2, 0x00000000fa3a74c4 }, | ||
907 | 527 | + /* x^230400 mod p(x)` << 1, x^230464 mod p(x)` << 1 */ | ||
908 | 528 | + { 0x0000000136c53800, 0x0000000033eb3f40 }, | ||
909 | 529 | + /* x^229376 mod p(x)` << 1, x^229440 mod p(x)` << 1 */ | ||
910 | 530 | + { 0x0000000126835a30, 0x000000017193f296 }, | ||
911 | 531 | + /* x^228352 mod p(x)` << 1, x^228416 mod p(x)` << 1 */ | ||
912 | 532 | + { 0x000000006241b502, 0x0000000043f6c86a }, | ||
913 | 533 | + /* x^227328 mod p(x)` << 1, x^227392 mod p(x)` << 1 */ | ||
914 | 534 | + { 0x00000000d5196ad4, 0x000000016b513ec6 }, | ||
915 | 535 | + /* x^226304 mod p(x)` << 1, x^226368 mod p(x)` << 1 */ | ||
916 | 536 | + { 0x000000009cfa769a, 0x00000000c8f25b4e }, | ||
917 | 537 | + /* x^225280 mod p(x)` << 1, x^225344 mod p(x)` << 1 */ | ||
918 | 538 | + { 0x00000000920e5df4, 0x00000001a45048ec }, | ||
919 | 539 | + /* x^224256 mod p(x)` << 1, x^224320 mod p(x)` << 1 */ | ||
920 | 540 | + { 0x0000000169dc310e, 0x000000000c441004 }, | ||
921 | 541 | + /* x^223232 mod p(x)` << 1, x^223296 mod p(x)` << 1 */ | ||
922 | 542 | + { 0x0000000009fc331c, 0x000000000e17cad6 }, | ||
923 | 543 | + /* x^222208 mod p(x)` << 1, x^222272 mod p(x)` << 1 */ | ||
924 | 544 | + { 0x000000010d94a81e, 0x00000001253ae964 }, | ||
925 | 545 | + /* x^221184 mod p(x)` << 1, x^221248 mod p(x)` << 1 */ | ||
926 | 546 | + { 0x0000000027a20ab2, 0x00000001d7c88ebc }, | ||
927 | 547 | + /* x^220160 mod p(x)` << 1, x^220224 mod p(x)` << 1 */ | ||
928 | 548 | + { 0x0000000114f87504, 0x00000001e7ca913a }, | ||
929 | 549 | + /* x^219136 mod p(x)` << 1, x^219200 mod p(x)` << 1 */ | ||
930 | 550 | + { 0x000000004b076d96, 0x0000000033ed078a }, | ||
931 | 551 | + /* x^218112 mod p(x)` << 1, x^218176 mod p(x)` << 1 */ | ||
932 | 552 | + { 0x00000000da4d1e74, 0x00000000e1839c78 }, | ||
933 | 553 | + /* x^217088 mod p(x)` << 1, x^217152 mod p(x)` << 1 */ | ||
934 | 554 | + { 0x000000001b81f672, 0x00000001322b267e }, | ||
935 | 555 | + /* x^216064 mod p(x)` << 1, x^216128 mod p(x)` << 1 */ | ||
936 | 556 | + { 0x000000009367c988, 0x00000000638231b6 }, | ||
937 | 557 | + /* x^215040 mod p(x)` << 1, x^215104 mod p(x)` << 1 */ | ||
938 | 558 | + { 0x00000001717214ca, 0x00000001ee7f16f4 }, | ||
939 | 559 | + /* x^214016 mod p(x)` << 1, x^214080 mod p(x)` << 1 */ | ||
940 | 560 | + { 0x000000009f47d820, 0x0000000117d9924a }, | ||
941 | 561 | + /* x^212992 mod p(x)` << 1, x^213056 mod p(x)` << 1 */ | ||
942 | 562 | + { 0x000000010d9a47d2, 0x00000000e1a9e0c4 }, | ||
943 | 563 | + /* x^211968 mod p(x)` << 1, x^212032 mod p(x)` << 1 */ | ||
944 | 564 | + { 0x00000000a696c58c, 0x00000001403731dc }, | ||
945 | 565 | + /* x^210944 mod p(x)` << 1, x^211008 mod p(x)` << 1 */ | ||
946 | 566 | + { 0x000000002aa28ec6, 0x00000001a5ea9682 }, | ||
947 | 567 | + /* x^209920 mod p(x)` << 1, x^209984 mod p(x)` << 1 */ | ||
948 | 568 | + { 0x00000001fe18fd9a, 0x0000000101c5c578 }, | ||
949 | 569 | + /* x^208896 mod p(x)` << 1, x^208960 mod p(x)` << 1 */ | ||
950 | 570 | + { 0x000000019d4fc1ae, 0x00000000dddf6494 }, | ||
951 | 571 | + /* x^207872 mod p(x)` << 1, x^207936 mod p(x)` << 1 */ | ||
952 | 572 | + { 0x00000001ba0e3dea, 0x00000000f1c3db28 }, | ||
953 | 573 | + /* x^206848 mod p(x)` << 1, x^206912 mod p(x)` << 1 */ | ||
954 | 574 | + { 0x0000000074b59a5e, 0x000000013112fb9c }, | ||
955 | 575 | + /* x^205824 mod p(x)` << 1, x^205888 mod p(x)` << 1 */ | ||
956 | 576 | + { 0x00000000f2b5ea98, 0x00000000b680b906 }, | ||
957 | 577 | + /* x^204800 mod p(x)` << 1, x^204864 mod p(x)` << 1 */ | ||
958 | 578 | + { 0x0000000187132676, 0x000000001a282932 }, | ||
959 | 579 | + /* x^203776 mod p(x)` << 1, x^203840 mod p(x)` << 1 */ | ||
960 | 580 | + { 0x000000010a8c6ad4, 0x0000000089406e7e }, | ||
961 | 581 | + /* x^202752 mod p(x)` << 1, x^202816 mod p(x)` << 1 */ | ||
962 | 582 | + { 0x00000001e21dfe70, 0x00000001def6be8c }, | ||
963 | 583 | + /* x^201728 mod p(x)` << 1, x^201792 mod p(x)` << 1 */ | ||
964 | 584 | + { 0x00000001da0050e4, 0x0000000075258728 }, | ||
965 | 585 | + /* x^200704 mod p(x)` << 1, x^200768 mod p(x)` << 1 */ | ||
966 | 586 | + { 0x00000000772172ae, 0x000000019536090a }, | ||
967 | 587 | + /* x^199680 mod p(x)` << 1, x^199744 mod p(x)` << 1 */ | ||
968 | 588 | + { 0x00000000e47724aa, 0x00000000f2455bfc }, | ||
969 | 589 | + /* x^198656 mod p(x)` << 1, x^198720 mod p(x)` << 1 */ | ||
970 | 590 | + { 0x000000003cd63ac4, 0x000000018c40baf4 }, | ||
971 | 591 | + /* x^197632 mod p(x)` << 1, x^197696 mod p(x)` << 1 */ | ||
972 | 592 | + { 0x00000001bf47d352, 0x000000004cd390d4 }, | ||
973 | 593 | + /* x^196608 mod p(x)` << 1, x^196672 mod p(x)` << 1 */ | ||
974 | 594 | + { 0x000000018dc1d708, 0x00000001e4ece95a }, | ||
975 | 595 | + /* x^195584 mod p(x)` << 1, x^195648 mod p(x)` << 1 */ | ||
976 | 596 | + { 0x000000002d4620a4, 0x000000001a3ee918 }, | ||
977 | 597 | + /* x^194560 mod p(x)` << 1, x^194624 mod p(x)` << 1 */ | ||
978 | 598 | + { 0x0000000058fd1740, 0x000000007c652fb8 }, | ||
979 | 599 | + /* x^193536 mod p(x)` << 1, x^193600 mod p(x)` << 1 */ | ||
980 | 600 | + { 0x00000000dadd9bfc, 0x000000011c67842c }, | ||
981 | 601 | + /* x^192512 mod p(x)` << 1, x^192576 mod p(x)` << 1 */ | ||
982 | 602 | + { 0x00000001ea2140be, 0x00000000254f759c }, | ||
983 | 603 | + /* x^191488 mod p(x)` << 1, x^191552 mod p(x)` << 1 */ | ||
984 | 604 | + { 0x000000009de128ba, 0x000000007ece94ca }, | ||
985 | 605 | + /* x^190464 mod p(x)` << 1, x^190528 mod p(x)` << 1 */ | ||
986 | 606 | + { 0x000000013ac3aa8e, 0x0000000038f258c2 }, | ||
987 | 607 | + /* x^189440 mod p(x)` << 1, x^189504 mod p(x)` << 1 */ | ||
988 | 608 | + { 0x0000000099980562, 0x00000001cdf17b00 }, | ||
989 | 609 | + /* x^188416 mod p(x)` << 1, x^188480 mod p(x)` << 1 */ | ||
990 | 610 | + { 0x00000001c1579c86, 0x000000011f882c16 }, | ||
991 | 611 | + /* x^187392 mod p(x)` << 1, x^187456 mod p(x)` << 1 */ | ||
992 | 612 | + { 0x0000000068dbbf94, 0x0000000100093fc8 }, | ||
993 | 613 | + /* x^186368 mod p(x)` << 1, x^186432 mod p(x)` << 1 */ | ||
994 | 614 | + { 0x000000004509fb04, 0x00000001cd684f16 }, | ||
995 | 615 | + /* x^185344 mod p(x)` << 1, x^185408 mod p(x)` << 1 */ | ||
996 | 616 | + { 0x00000001202f6398, 0x000000004bc6a70a }, | ||
997 | 617 | + /* x^184320 mod p(x)` << 1, x^184384 mod p(x)` << 1 */ | ||
998 | 618 | + { 0x000000013aea243e, 0x000000004fc7e8e4 }, | ||
999 | 619 | + /* x^183296 mod p(x)` << 1, x^183360 mod p(x)` << 1 */ | ||
1000 | 620 | + { 0x00000001b4052ae6, 0x0000000130103f1c }, | ||
1001 | 621 | + /* x^182272 mod p(x)` << 1, x^182336 mod p(x)` << 1 */ | ||
1002 | 622 | + { 0x00000001cd2a0ae8, 0x0000000111b0024c }, | ||
1003 | 623 | + /* x^181248 mod p(x)` << 1, x^181312 mod p(x)` << 1 */ | ||
1004 | 624 | + { 0x00000001fe4aa8b4, 0x000000010b3079da }, | ||
1005 | 625 | + /* x^180224 mod p(x)` << 1, x^180288 mod p(x)` << 1 */ | ||
1006 | 626 | + { 0x00000001d1559a42, 0x000000010192bcc2 }, | ||
1007 | 627 | + /* x^179200 mod p(x)` << 1, x^179264 mod p(x)` << 1 */ | ||
1008 | 628 | + { 0x00000001f3e05ecc, 0x0000000074838d50 }, | ||
1009 | 629 | + /* x^178176 mod p(x)` << 1, x^178240 mod p(x)` << 1 */ | ||
1010 | 630 | + { 0x0000000104ddd2cc, 0x000000001b20f520 }, | ||
1011 | 631 | + /* x^177152 mod p(x)` << 1, x^177216 mod p(x)` << 1 */ | ||
1012 | 632 | + { 0x000000015393153c, 0x0000000050c3590a }, | ||
1013 | 633 | + /* x^176128 mod p(x)` << 1, x^176192 mod p(x)` << 1 */ | ||
1014 | 634 | + { 0x0000000057e942c6, 0x00000000b41cac8e }, | ||
1015 | 635 | + /* x^175104 mod p(x)` << 1, x^175168 mod p(x)` << 1 */ | ||
1016 | 636 | + { 0x000000012c633850, 0x000000000c72cc78 }, | ||
1017 | 637 | + /* x^174080 mod p(x)` << 1, x^174144 mod p(x)` << 1 */ | ||
1018 | 638 | + { 0x00000000ebcaae4c, 0x0000000030cdb032 }, | ||
1019 | 639 | + /* x^173056 mod p(x)` << 1, x^173120 mod p(x)` << 1 */ | ||
1020 | 640 | + { 0x000000013ee532a6, 0x000000013e09fc32 }, | ||
1021 | 641 | + /* x^172032 mod p(x)` << 1, x^172096 mod p(x)` << 1 */ | ||
1022 | 642 | + { 0x00000001bf0cbc7e, 0x000000001ed624d2 }, | ||
1023 | 643 | + /* x^171008 mod p(x)` << 1, x^171072 mod p(x)` << 1 */ | ||
1024 | 644 | + { 0x00000000d50b7a5a, 0x00000000781aee1a }, | ||
1025 | 645 | + /* x^169984 mod p(x)` << 1, x^170048 mod p(x)` << 1 */ | ||
1026 | 646 | + { 0x0000000002fca6e8, 0x00000001c4d8348c }, | ||
1027 | 647 | + /* x^168960 mod p(x)` << 1, x^169024 mod p(x)` << 1 */ | ||
1028 | 648 | + { 0x000000007af40044, 0x0000000057a40336 }, | ||
1029 | 649 | + /* x^167936 mod p(x)` << 1, x^168000 mod p(x)` << 1 */ | ||
1030 | 650 | + { 0x0000000016178744, 0x0000000085544940 }, | ||
1031 | 651 | + /* x^166912 mod p(x)` << 1, x^166976 mod p(x)` << 1 */ | ||
1032 | 652 | + { 0x000000014c177458, 0x000000019cd21e80 }, | ||
1033 | 653 | + /* x^165888 mod p(x)` << 1, x^165952 mod p(x)` << 1 */ | ||
1034 | 654 | + { 0x000000011b6ddf04, 0x000000013eb95bc0 }, | ||
1035 | 655 | + /* x^164864 mod p(x)` << 1, x^164928 mod p(x)` << 1 */ | ||
1036 | 656 | + { 0x00000001f3e29ccc, 0x00000001dfc9fdfc }, | ||
1037 | 657 | + /* x^163840 mod p(x)` << 1, x^163904 mod p(x)` << 1 */ | ||
1038 | 658 | + { 0x0000000135ae7562, 0x00000000cd028bc2 }, | ||
1039 | 659 | + /* x^162816 mod p(x)` << 1, x^162880 mod p(x)` << 1 */ | ||
1040 | 660 | + { 0x0000000190ef812c, 0x0000000090db8c44 }, | ||
1041 | 661 | + /* x^161792 mod p(x)` << 1, x^161856 mod p(x)` << 1 */ | ||
1042 | 662 | + { 0x0000000067a2c786, 0x000000010010a4ce }, | ||
1043 | 663 | + /* x^160768 mod p(x)` << 1, x^160832 mod p(x)` << 1 */ | ||
1044 | 664 | + { 0x0000000048b9496c, 0x00000001c8f4c72c }, | ||
1045 | 665 | + /* x^159744 mod p(x)` << 1, x^159808 mod p(x)` << 1 */ | ||
1046 | 666 | + { 0x000000015a422de6, 0x000000001c26170c }, | ||
1047 | 667 | + /* x^158720 mod p(x)` << 1, x^158784 mod p(x)` << 1 */ | ||
1048 | 668 | + { 0x00000001ef0e3640, 0x00000000e3fccf68 }, | ||
1049 | 669 | + /* x^157696 mod p(x)` << 1, x^157760 mod p(x)` << 1 */ | ||
1050 | 670 | + { 0x00000001006d2d26, 0x00000000d513ed24 }, | ||
1051 | 671 | + /* x^156672 mod p(x)` << 1, x^156736 mod p(x)` << 1 */ | ||
1052 | 672 | + { 0x00000001170d56d6, 0x00000000141beada }, | ||
1053 | 673 | + /* x^155648 mod p(x)` << 1, x^155712 mod p(x)` << 1 */ | ||
1054 | 674 | + { 0x00000000a5fb613c, 0x000000011071aea0 }, | ||
1055 | 675 | + /* x^154624 mod p(x)` << 1, x^154688 mod p(x)` << 1 */ | ||
1056 | 676 | + { 0x0000000040bbf7fc, 0x000000012e19080a }, | ||
1057 | 677 | + /* x^153600 mod p(x)` << 1, x^153664 mod p(x)` << 1 */ | ||
1058 | 678 | + { 0x000000016ac3a5b2, 0x0000000100ecf826 }, | ||
1059 | 679 | + /* x^152576 mod p(x)` << 1, x^152640 mod p(x)` << 1 */ | ||
1060 | 680 | + { 0x00000000abf16230, 0x0000000069b09412 }, | ||
1061 | 681 | + /* x^151552 mod p(x)` << 1, x^151616 mod p(x)` << 1 */ | ||
1062 | 682 | + { 0x00000001ebe23fac, 0x0000000122297bac }, | ||
1063 | 683 | + /* x^150528 mod p(x)` << 1, x^150592 mod p(x)` << 1 */ | ||
1064 | 684 | + { 0x000000008b6a0894, 0x00000000e9e4b068 }, | ||
1065 | 685 | + /* x^149504 mod p(x)` << 1, x^149568 mod p(x)` << 1 */ | ||
1066 | 686 | + { 0x00000001288ea478, 0x000000004b38651a }, | ||
1067 | 687 | + /* x^148480 mod p(x)` << 1, x^148544 mod p(x)` << 1 */ | ||
1068 | 688 | + { 0x000000016619c442, 0x00000001468360e2 }, | ||
1069 | 689 | + /* x^147456 mod p(x)` << 1, x^147520 mod p(x)` << 1 */ | ||
1070 | 690 | + { 0x0000000086230038, 0x00000000121c2408 }, | ||
1071 | 691 | + /* x^146432 mod p(x)` << 1, x^146496 mod p(x)` << 1 */ | ||
1072 | 692 | + { 0x000000017746a756, 0x00000000da7e7d08 }, | ||
1073 | 693 | + /* x^145408 mod p(x)` << 1, x^145472 mod p(x)` << 1 */ | ||
1074 | 694 | + { 0x0000000191b8f8f8, 0x00000001058d7652 }, | ||
1075 | 695 | + /* x^144384 mod p(x)` << 1, x^144448 mod p(x)` << 1 */ | ||
1076 | 696 | + { 0x000000008e167708, 0x000000014a098a90 }, | ||
1077 | 697 | + /* x^143360 mod p(x)` << 1, x^143424 mod p(x)` << 1 */ | ||
1078 | 698 | + { 0x0000000148b22d54, 0x0000000020dbe72e }, | ||
1079 | 699 | + /* x^142336 mod p(x)` << 1, x^142400 mod p(x)` << 1 */ | ||
1080 | 700 | + { 0x0000000044ba2c3c, 0x000000011e7323e8 }, | ||
1081 | 701 | + /* x^141312 mod p(x)` << 1, x^141376 mod p(x)` << 1 */ | ||
1082 | 702 | + { 0x00000000b54d2b52, 0x00000000d5d4bf94 }, | ||
1083 | 703 | + /* x^140288 mod p(x)` << 1, x^140352 mod p(x)` << 1 */ | ||
1084 | 704 | + { 0x0000000005a4fd8a, 0x0000000199d8746c }, | ||
1085 | 705 | + /* x^139264 mod p(x)` << 1, x^139328 mod p(x)` << 1 */ | ||
1086 | 706 | + { 0x0000000139f9fc46, 0x00000000ce9ca8a0 }, | ||
1087 | 707 | + /* x^138240 mod p(x)` << 1, x^138304 mod p(x)` << 1 */ | ||
1088 | 708 | + { 0x000000015a1fa824, 0x00000000136edece }, | ||
1089 | 709 | + /* x^137216 mod p(x)` << 1, x^137280 mod p(x)` << 1 */ | ||
1090 | 710 | + { 0x000000000a61ae4c, 0x000000019b92a068 }, | ||
1091 | 711 | + /* x^136192 mod p(x)` << 1, x^136256 mod p(x)` << 1 */ | ||
1092 | 712 | + { 0x0000000145e9113e, 0x0000000071d62206 }, | ||
1093 | 713 | + /* x^135168 mod p(x)` << 1, x^135232 mod p(x)` << 1 */ | ||
1094 | 714 | + { 0x000000006a348448, 0x00000000dfc50158 }, | ||
1095 | 715 | + /* x^134144 mod p(x)` << 1, x^134208 mod p(x)` << 1 */ | ||
1096 | 716 | + { 0x000000004d80a08c, 0x00000001517626bc }, | ||
1097 | 717 | + /* x^133120 mod p(x)` << 1, x^133184 mod p(x)` << 1 */ | ||
1098 | 718 | + { 0x000000014b6837a0, 0x0000000148d1e4fa }, | ||
1099 | 719 | + /* x^132096 mod p(x)` << 1, x^132160 mod p(x)` << 1 */ | ||
1100 | 720 | + { 0x000000016896a7fc, 0x0000000094d8266e }, | ||
1101 | 721 | + /* x^131072 mod p(x)` << 1, x^131136 mod p(x)` << 1 */ | ||
1102 | 722 | + { 0x000000014f187140, 0x00000000606c5e34 }, | ||
1103 | 723 | + /* x^130048 mod p(x)` << 1, x^130112 mod p(x)` << 1 */ | ||
1104 | 724 | + { 0x000000019581b9da, 0x000000019766beaa }, | ||
1105 | 725 | + /* x^129024 mod p(x)` << 1, x^129088 mod p(x)` << 1 */ | ||
1106 | 726 | + { 0x00000001091bc984, 0x00000001d80c506c }, | ||
1107 | 727 | + /* x^128000 mod p(x)` << 1, x^128064 mod p(x)` << 1 */ | ||
1108 | 728 | + { 0x000000001067223c, 0x000000001e73837c }, | ||
1109 | 729 | + /* x^126976 mod p(x)` << 1, x^127040 mod p(x)` << 1 */ | ||
1110 | 730 | + { 0x00000001ab16ea02, 0x0000000064d587de }, | ||
1111 | 731 | + /* x^125952 mod p(x)` << 1, x^126016 mod p(x)` << 1 */ | ||
1112 | 732 | + { 0x000000013c4598a8, 0x00000000f4a507b0 }, | ||
1113 | 733 | + /* x^124928 mod p(x)` << 1, x^124992 mod p(x)` << 1 */ | ||
1114 | 734 | + { 0x00000000b3735430, 0x0000000040e342fc }, | ||
1115 | 735 | + /* x^123904 mod p(x)` << 1, x^123968 mod p(x)` << 1 */ | ||
1116 | 736 | + { 0x00000001bb3fc0c0, 0x00000001d5ad9c3a }, | ||
1117 | 737 | + /* x^122880 mod p(x)` << 1, x^122944 mod p(x)` << 1 */ | ||
1118 | 738 | + { 0x00000001570ae19c, 0x0000000094a691a4 }, | ||
1119 | 739 | + /* x^121856 mod p(x)` << 1, x^121920 mod p(x)` << 1 */ | ||
1120 | 740 | + { 0x00000001ea910712, 0x00000001271ecdfa }, | ||
1121 | 741 | + /* x^120832 mod p(x)` << 1, x^120896 mod p(x)` << 1 */ | ||
1122 | 742 | + { 0x0000000167127128, 0x000000009e54475a }, | ||
1123 | 743 | + /* x^119808 mod p(x)` << 1, x^119872 mod p(x)` << 1 */ | ||
1124 | 744 | + { 0x0000000019e790a2, 0x00000000c9c099ee }, | ||
1125 | 745 | + /* x^118784 mod p(x)` << 1, x^118848 mod p(x)` << 1 */ | ||
1126 | 746 | + { 0x000000003788f710, 0x000000009a2f736c }, | ||
1127 | 747 | + /* x^117760 mod p(x)` << 1, x^117824 mod p(x)` << 1 */ | ||
1128 | 748 | + { 0x00000001682a160e, 0x00000000bb9f4996 }, | ||
1129 | 749 | + /* x^116736 mod p(x)` << 1, x^116800 mod p(x)` << 1 */ | ||
1130 | 750 | + { 0x000000007f0ebd2e, 0x00000001db688050 }, | ||
1131 | 751 | + /* x^115712 mod p(x)` << 1, x^115776 mod p(x)` << 1 */ | ||
1132 | 752 | + { 0x000000002b032080, 0x00000000e9b10af4 }, | ||
1133 | 753 | + /* x^114688 mod p(x)` << 1, x^114752 mod p(x)` << 1 */ | ||
1134 | 754 | + { 0x00000000cfd1664a, 0x000000012d4545e4 }, | ||
1135 | 755 | + /* x^113664 mod p(x)` << 1, x^113728 mod p(x)` << 1 */ | ||
1136 | 756 | + { 0x00000000aa1181c2, 0x000000000361139c }, | ||
1137 | 757 | + /* x^112640 mod p(x)` << 1, x^112704 mod p(x)` << 1 */ | ||
1138 | 758 | + { 0x00000000ddd08002, 0x00000001a5a1a3a8 }, | ||
1139 | 759 | + /* x^111616 mod p(x)` << 1, x^111680 mod p(x)` << 1 */ | ||
1140 | 760 | + { 0x00000000e8dd0446, 0x000000006844e0b0 }, | ||
1141 | 761 | + /* x^110592 mod p(x)` << 1, x^110656 mod p(x)` << 1 */ | ||
1142 | 762 | + { 0x00000001bbd94a00, 0x00000000c3762f28 }, | ||
1143 | 763 | + /* x^109568 mod p(x)` << 1, x^109632 mod p(x)` << 1 */ | ||
1144 | 764 | + { 0x00000000ab6cd180, 0x00000001d26287a2 }, | ||
1145 | 765 | + /* x^108544 mod p(x)` << 1, x^108608 mod p(x)` << 1 */ | ||
1146 | 766 | + { 0x0000000031803ce2, 0x00000001f6f0bba8 }, | ||
1147 | 767 | + /* x^107520 mod p(x)` << 1, x^107584 mod p(x)` << 1 */ | ||
1148 | 768 | + { 0x0000000024f40b0c, 0x000000002ffabd62 }, | ||
1149 | 769 | + /* x^106496 mod p(x)` << 1, x^106560 mod p(x)` << 1 */ | ||
1150 | 770 | + { 0x00000001ba1d9834, 0x00000000fb4516b8 }, | ||
1151 | 771 | + /* x^105472 mod p(x)` << 1, x^105536 mod p(x)` << 1 */ | ||
1152 | 772 | + { 0x0000000104de61aa, 0x000000018cfa961c }, | ||
1153 | 773 | + /* x^104448 mod p(x)` << 1, x^104512 mod p(x)` << 1 */ | ||
1154 | 774 | + { 0x0000000113e40d46, 0x000000019e588d52 }, | ||
1155 | 775 | + /* x^103424 mod p(x)` << 1, x^103488 mod p(x)` << 1 */ | ||
1156 | 776 | + { 0x00000001415598a0, 0x00000001180f0bbc }, | ||
1157 | 777 | + /* x^102400 mod p(x)` << 1, x^102464 mod p(x)` << 1 */ | ||
1158 | 778 | + { 0x00000000bf6c8c90, 0x00000000e1d9177a }, | ||
1159 | 779 | + /* x^101376 mod p(x)` << 1, x^101440 mod p(x)` << 1 */ | ||
1160 | 780 | + { 0x00000001788b0504, 0x0000000105abc27c }, | ||
1161 | 781 | + /* x^100352 mod p(x)` << 1, x^100416 mod p(x)` << 1 */ | ||
1162 | 782 | + { 0x0000000038385d02, 0x00000000972e4a58 }, | ||
1163 | 783 | + /* x^99328 mod p(x)` << 1, x^99392 mod p(x)` << 1 */ | ||
1164 | 784 | + { 0x00000001b6c83844, 0x0000000183499a5e }, | ||
1165 | 785 | + /* x^98304 mod p(x)` << 1, x^98368 mod p(x)` << 1 */ | ||
1166 | 786 | + { 0x0000000051061a8a, 0x00000001c96a8cca }, | ||
1167 | 787 | + /* x^97280 mod p(x)` << 1, x^97344 mod p(x)` << 1 */ | ||
1168 | 788 | + { 0x000000017351388a, 0x00000001a1a5b60c }, | ||
1169 | 789 | + /* x^96256 mod p(x)` << 1, x^96320 mod p(x)` << 1 */ | ||
1170 | 790 | + { 0x0000000132928f92, 0x00000000e4b6ac9c }, | ||
1171 | 791 | + /* x^95232 mod p(x)` << 1, x^95296 mod p(x)` << 1 */ | ||
1172 | 792 | + { 0x00000000e6b4f48a, 0x00000001807e7f5a }, | ||
1173 | 793 | + /* x^94208 mod p(x)` << 1, x^94272 mod p(x)` << 1 */ | ||
1174 | 794 | + { 0x0000000039d15e90, 0x000000017a7e3bc8 }, | ||
1175 | 795 | + /* x^93184 mod p(x)` << 1, x^93248 mod p(x)` << 1 */ | ||
1176 | 796 | + { 0x00000000312d6074, 0x00000000d73975da }, | ||
1177 | 797 | + /* x^92160 mod p(x)` << 1, x^92224 mod p(x)` << 1 */ | ||
1178 | 798 | + { 0x000000017bbb2cc4, 0x000000017375d038 }, | ||
1179 | 799 | + /* x^91136 mod p(x)` << 1, x^91200 mod p(x)` << 1 */ | ||
1180 | 800 | + { 0x000000016ded3e18, 0x00000000193680bc }, | ||
1181 | 801 | + /* x^90112 mod p(x)` << 1, x^90176 mod p(x)` << 1 */ | ||
1182 | 802 | + { 0x00000000f1638b16, 0x00000000999b06f6 }, | ||
1183 | 803 | + /* x^89088 mod p(x)` << 1, x^89152 mod p(x)` << 1 */ | ||
1184 | 804 | + { 0x00000001d38b9ecc, 0x00000001f685d2b8 }, | ||
1185 | 805 | + /* x^88064 mod p(x)` << 1, x^88128 mod p(x)` << 1 */ | ||
1186 | 806 | + { 0x000000018b8d09dc, 0x00000001f4ecbed2 }, | ||
1187 | 807 | + /* x^87040 mod p(x)` << 1, x^87104 mod p(x)` << 1 */ | ||
1188 | 808 | + { 0x00000000e7bc27d2, 0x00000000ba16f1a0 }, | ||
1189 | 809 | + /* x^86016 mod p(x)` << 1, x^86080 mod p(x)` << 1 */ | ||
1190 | 810 | + { 0x00000000275e1e96, 0x0000000115aceac4 }, | ||
1191 | 811 | + /* x^84992 mod p(x)` << 1, x^85056 mod p(x)` << 1 */ | ||
1192 | 812 | + { 0x00000000e2e3031e, 0x00000001aeff6292 }, | ||
1193 | 813 | + /* x^83968 mod p(x)` << 1, x^84032 mod p(x)` << 1 */ | ||
1194 | 814 | + { 0x00000001041c84d8, 0x000000009640124c }, | ||
1195 | 815 | + /* x^82944 mod p(x)` << 1, x^83008 mod p(x)` << 1 */ | ||
1196 | 816 | + { 0x00000000706ce672, 0x0000000114f41f02 }, | ||
1197 | 817 | + /* x^81920 mod p(x)` << 1, x^81984 mod p(x)` << 1 */ | ||
1198 | 818 | + { 0x000000015d5070da, 0x000000009c5f3586 }, | ||
1199 | 819 | + /* x^80896 mod p(x)` << 1, x^80960 mod p(x)` << 1 */ | ||
1200 | 820 | + { 0x0000000038f9493a, 0x00000001878275fa }, | ||
1201 | 821 | + /* x^79872 mod p(x)` << 1, x^79936 mod p(x)` << 1 */ | ||
1202 | 822 | + { 0x00000000a3348a76, 0x00000000ddc42ce8 }, | ||
1203 | 823 | + /* x^78848 mod p(x)` << 1, x^78912 mod p(x)` << 1 */ | ||
1204 | 824 | + { 0x00000001ad0aab92, 0x0000000181d2c73a }, | ||
1205 | 825 | + /* x^77824 mod p(x)` << 1, x^77888 mod p(x)` << 1 */ | ||
1206 | 826 | + { 0x000000019e85f712, 0x0000000141c9320a }, | ||
1207 | 827 | + /* x^76800 mod p(x)` << 1, x^76864 mod p(x)` << 1 */ | ||
1208 | 828 | + { 0x000000005a871e76, 0x000000015235719a }, | ||
1209 | 829 | + /* x^75776 mod p(x)` << 1, x^75840 mod p(x)` << 1 */ | ||
1210 | 830 | + { 0x000000017249c662, 0x00000000be27d804 }, | ||
1211 | 831 | + /* x^74752 mod p(x)` << 1, x^74816 mod p(x)` << 1 */ | ||
1212 | 832 | + { 0x000000003a084712, 0x000000006242d45a }, | ||
1213 | 833 | + /* x^73728 mod p(x)` << 1, x^73792 mod p(x)` << 1 */ | ||
1214 | 834 | + { 0x00000000ed438478, 0x000000009a53638e }, | ||
1215 | 835 | + /* x^72704 mod p(x)` << 1, x^72768 mod p(x)` << 1 */ | ||
1216 | 836 | + { 0x00000000abac34cc, 0x00000001001ecfb6 }, | ||
1217 | 837 | + /* x^71680 mod p(x)` << 1, x^71744 mod p(x)` << 1 */ | ||
1218 | 838 | + { 0x000000005f35ef3e, 0x000000016d7c2d64 }, | ||
1219 | 839 | + /* x^70656 mod p(x)` << 1, x^70720 mod p(x)` << 1 */ | ||
1220 | 840 | + { 0x0000000047d6608c, 0x00000001d0ce46c0 }, | ||
1221 | 841 | + /* x^69632 mod p(x)` << 1, x^69696 mod p(x)` << 1 */ | ||
1222 | 842 | + { 0x000000002d01470e, 0x0000000124c907b4 }, | ||
1223 | 843 | + /* x^68608 mod p(x)` << 1, x^68672 mod p(x)` << 1 */ | ||
1224 | 844 | + { 0x0000000158bbc7b0, 0x0000000018a555ca }, | ||
1225 | 845 | + /* x^67584 mod p(x)` << 1, x^67648 mod p(x)` << 1 */ | ||
1226 | 846 | + { 0x00000000c0a23e8e, 0x000000006b0980bc }, | ||
1227 | 847 | + /* x^66560 mod p(x)` << 1, x^66624 mod p(x)` << 1 */ | ||
1228 | 848 | + { 0x00000001ebd85c88, 0x000000008bbba964 }, | ||
1229 | 849 | + /* x^65536 mod p(x)` << 1, x^65600 mod p(x)` << 1 */ | ||
1230 | 850 | + { 0x000000019ee20bb2, 0x00000001070a5a1e }, | ||
1231 | 851 | + /* x^64512 mod p(x)` << 1, x^64576 mod p(x)` << 1 */ | ||
1232 | 852 | + { 0x00000001acabf2d6, 0x000000002204322a }, | ||
1233 | 853 | + /* x^63488 mod p(x)` << 1, x^63552 mod p(x)` << 1 */ | ||
1234 | 854 | + { 0x00000001b7963d56, 0x00000000a27524d0 }, | ||
1235 | 855 | + /* x^62464 mod p(x)` << 1, x^62528 mod p(x)` << 1 */ | ||
1236 | 856 | + { 0x000000017bffa1fe, 0x0000000020b1e4ba }, | ||
1237 | 857 | + /* x^61440 mod p(x)` << 1, x^61504 mod p(x)` << 1 */ | ||
1238 | 858 | + { 0x000000001f15333e, 0x0000000032cc27fc }, | ||
1239 | 859 | + /* x^60416 mod p(x)` << 1, x^60480 mod p(x)` << 1 */ | ||
1240 | 860 | + { 0x000000018593129e, 0x0000000044dd22b8 }, | ||
1241 | 861 | + /* x^59392 mod p(x)` << 1, x^59456 mod p(x)` << 1 */ | ||
1242 | 862 | + { 0x000000019cb32602, 0x00000000dffc9e0a }, | ||
1243 | 863 | + /* x^58368 mod p(x)` << 1, x^58432 mod p(x)` << 1 */ | ||
1244 | 864 | + { 0x0000000142b05cc8, 0x00000001b7a0ed14 }, | ||
1245 | 865 | + /* x^57344 mod p(x)` << 1, x^57408 mod p(x)` << 1 */ | ||
1246 | 866 | + { 0x00000001be49e7a4, 0x00000000c7842488 }, | ||
1247 | 867 | + /* x^56320 mod p(x)` << 1, x^56384 mod p(x)` << 1 */ | ||
1248 | 868 | + { 0x0000000108f69d6c, 0x00000001c02a4fee }, | ||
1249 | 869 | + /* x^55296 mod p(x)` << 1, x^55360 mod p(x)` << 1 */ | ||
1250 | 870 | + { 0x000000006c0971f0, 0x000000003c273778 }, | ||
1251 | 871 | + /* x^54272 mod p(x)` << 1, x^54336 mod p(x)` << 1 */ | ||
1252 | 872 | + { 0x000000005b16467a, 0x00000001d63f8894 }, | ||
1253 | 873 | + /* x^53248 mod p(x)` << 1, x^53312 mod p(x)` << 1 */ | ||
1254 | 874 | + { 0x00000001551a628e, 0x000000006be557d6 }, | ||
1255 | 875 | + /* x^52224 mod p(x)` << 1, x^52288 mod p(x)` << 1 */ | ||
1256 | 876 | + { 0x000000019e42ea92, 0x000000006a7806ea }, | ||
1257 | 877 | + /* x^51200 mod p(x)` << 1, x^51264 mod p(x)` << 1 */ | ||
1258 | 878 | + { 0x000000012fa83ff2, 0x000000016155aa0c }, | ||
1259 | 879 | + /* x^50176 mod p(x)` << 1, x^50240 mod p(x)` << 1 */ | ||
1260 | 880 | + { 0x000000011ca9cde0, 0x00000000908650ac }, | ||
1261 | 881 | + /* x^49152 mod p(x)` << 1, x^49216 mod p(x)` << 1 */ | ||
1262 | 882 | + { 0x00000000c8e5cd74, 0x00000000aa5a8084 }, | ||
1263 | 883 | + /* x^48128 mod p(x)` << 1, x^48192 mod p(x)` << 1 */ | ||
1264 | 884 | + { 0x0000000096c27f0c, 0x0000000191bb500a }, | ||
1265 | 885 | + /* x^47104 mod p(x)` << 1, x^47168 mod p(x)` << 1 */ | ||
1266 | 886 | + { 0x000000002baed926, 0x0000000064e9bed0 }, | ||
1267 | 887 | + /* x^46080 mod p(x)` << 1, x^46144 mod p(x)` << 1 */ | ||
1268 | 888 | + { 0x000000017c8de8d2, 0x000000009444f302 }, | ||
1269 | 889 | + /* x^45056 mod p(x)` << 1, x^45120 mod p(x)` << 1 */ | ||
1270 | 890 | + { 0x00000000d43d6068, 0x000000019db07d3c }, | ||
1271 | 891 | + /* x^44032 mod p(x)` << 1, x^44096 mod p(x)` << 1 */ | ||
1272 | 892 | + { 0x00000000cb2c4b26, 0x00000001359e3e6e }, | ||
1273 | 893 | + /* x^43008 mod p(x)` << 1, x^43072 mod p(x)` << 1 */ | ||
1274 | 894 | + { 0x0000000145b8da26, 0x00000001e4f10dd2 }, | ||
1275 | 895 | + /* x^41984 mod p(x)` << 1, x^42048 mod p(x)` << 1 */ | ||
1276 | 896 | + { 0x000000018fff4b08, 0x0000000124f5735e }, | ||
1277 | 897 | + /* x^40960 mod p(x)` << 1, x^41024 mod p(x)` << 1 */ | ||
1278 | 898 | + { 0x0000000150b58ed0, 0x0000000124760a4c }, | ||
1279 | 899 | + /* x^39936 mod p(x)` << 1, x^40000 mod p(x)` << 1 */ | ||
1280 | 900 | + { 0x00000001549f39bc, 0x000000000f1fc186 }, | ||
1281 | 901 | + /* x^38912 mod p(x)` << 1, x^38976 mod p(x)` << 1 */ | ||
1282 | 902 | + { 0x00000000ef4d2f42, 0x00000000150e4cc4 }, | ||
1283 | 903 | + /* x^37888 mod p(x)` << 1, x^37952 mod p(x)` << 1 */ | ||
1284 | 904 | + { 0x00000001b1468572, 0x000000002a6204e8 }, | ||
1285 | 905 | + /* x^36864 mod p(x)` << 1, x^36928 mod p(x)` << 1 */ | ||
1286 | 906 | + { 0x000000013d7403b2, 0x00000000beb1d432 }, | ||
1287 | 907 | + /* x^35840 mod p(x)` << 1, x^35904 mod p(x)` << 1 */ | ||
1288 | 908 | + { 0x00000001a4681842, 0x0000000135f3f1f0 }, | ||
1289 | 909 | + /* x^34816 mod p(x)` << 1, x^34880 mod p(x)` << 1 */ | ||
1290 | 910 | + { 0x0000000167714492, 0x0000000074fe2232 }, | ||
1291 | 911 | + /* x^33792 mod p(x)` << 1, x^33856 mod p(x)` << 1 */ | ||
1292 | 912 | + { 0x00000001e599099a, 0x000000001ac6e2ba }, | ||
1293 | 913 | + /* x^32768 mod p(x)` << 1, x^32832 mod p(x)` << 1 */ | ||
1294 | 914 | + { 0x00000000fe128194, 0x0000000013fca91e }, | ||
1295 | 915 | + /* x^31744 mod p(x)` << 1, x^31808 mod p(x)` << 1 */ | ||
1296 | 916 | + { 0x0000000077e8b990, 0x0000000183f4931e }, | ||
1297 | 917 | + /* x^30720 mod p(x)` << 1, x^30784 mod p(x)` << 1 */ | ||
1298 | 918 | + { 0x00000001a267f63a, 0x00000000b6d9b4e4 }, | ||
1299 | 919 | + /* x^29696 mod p(x)` << 1, x^29760 mod p(x)` << 1 */ | ||
1300 | 920 | + { 0x00000001945c245a, 0x00000000b5188656 }, | ||
1301 | 921 | + /* x^28672 mod p(x)` << 1, x^28736 mod p(x)` << 1 */ | ||
1302 | 922 | + { 0x0000000149002e76, 0x0000000027a81a84 }, | ||
1303 | 923 | + /* x^27648 mod p(x)` << 1, x^27712 mod p(x)` << 1 */ | ||
1304 | 924 | + { 0x00000001bb8310a4, 0x0000000125699258 }, | ||
1305 | 925 | + /* x^26624 mod p(x)` << 1, x^26688 mod p(x)` << 1 */ | ||
1306 | 926 | + { 0x000000019ec60bcc, 0x00000001b23de796 }, | ||
1307 | 927 | + /* x^25600 mod p(x)` << 1, x^25664 mod p(x)` << 1 */ | ||
1308 | 928 | + { 0x000000012d8590ae, 0x00000000fe4365dc }, | ||
1309 | 929 | + /* x^24576 mod p(x)` << 1, x^24640 mod p(x)` << 1 */ | ||
1310 | 930 | + { 0x0000000065b00684, 0x00000000c68f497a }, | ||
1311 | 931 | + /* x^23552 mod p(x)` << 1, x^23616 mod p(x)` << 1 */ | ||
1312 | 932 | + { 0x000000015e5aeadc, 0x00000000fbf521ee }, | ||
1313 | 933 | + /* x^22528 mod p(x)` << 1, x^22592 mod p(x)` << 1 */ | ||
1314 | 934 | + { 0x00000000b77ff2b0, 0x000000015eac3378 }, | ||
1315 | 935 | + /* x^21504 mod p(x)` << 1, x^21568 mod p(x)` << 1 */ | ||
1316 | 936 | + { 0x0000000188da2ff6, 0x0000000134914b90 }, | ||
1317 | 937 | + /* x^20480 mod p(x)` << 1, x^20544 mod p(x)` << 1 */ | ||
1318 | 938 | + { 0x0000000063da929a, 0x0000000016335cfe }, | ||
1319 | 939 | + /* x^19456 mod p(x)` << 1, x^19520 mod p(x)` << 1 */ | ||
1320 | 940 | + { 0x00000001389caa80, 0x000000010372d10c }, | ||
1321 | 941 | + /* x^18432 mod p(x)` << 1, x^18496 mod p(x)` << 1 */ | ||
1322 | 942 | + { 0x000000013db599d2, 0x000000015097b908 }, | ||
1323 | 943 | + /* x^17408 mod p(x)` << 1, x^17472 mod p(x)` << 1 */ | ||
1324 | 944 | + { 0x0000000122505a86, 0x00000001227a7572 }, | ||
1325 | 945 | + /* x^16384 mod p(x)` << 1, x^16448 mod p(x)` << 1 */ | ||
1326 | 946 | + { 0x000000016bd72746, 0x000000009a8f75c0 }, | ||
1327 | 947 | + /* x^15360 mod p(x)` << 1, x^15424 mod p(x)` << 1 */ | ||
1328 | 948 | + { 0x00000001c3faf1d4, 0x00000000682c77a2 }, | ||
1329 | 949 | + /* x^14336 mod p(x)` << 1, x^14400 mod p(x)` << 1 */ | ||
1330 | 950 | + { 0x00000001111c826c, 0x00000000231f091c }, | ||
1331 | 951 | + /* x^13312 mod p(x)` << 1, x^13376 mod p(x)` << 1 */ | ||
1332 | 952 | + { 0x00000000153e9fb2, 0x000000007d4439f2 }, | ||
1333 | 953 | + /* x^12288 mod p(x)` << 1, x^12352 mod p(x)` << 1 */ | ||
1334 | 954 | + { 0x000000002b1f7b60, 0x000000017e221efc }, | ||
1335 | 955 | + /* x^11264 mod p(x)` << 1, x^11328 mod p(x)` << 1 */ | ||
1336 | 956 | + { 0x00000000b1dba570, 0x0000000167457c38 }, | ||
1337 | 957 | + /* x^10240 mod p(x)` << 1, x^10304 mod p(x)` << 1 */ | ||
1338 | 958 | + { 0x00000001f6397b76, 0x00000000bdf081c4 }, | ||
1339 | 959 | + /* x^9216 mod p(x)` << 1, x^9280 mod p(x)` << 1 */ | ||
1340 | 960 | + { 0x0000000156335214, 0x000000016286d6b0 }, | ||
1341 | 961 | + /* x^8192 mod p(x)` << 1, x^8256 mod p(x)` << 1 */ | ||
1342 | 962 | + { 0x00000001d70e3986, 0x00000000c84f001c }, | ||
1343 | 963 | + /* x^7168 mod p(x)` << 1, x^7232 mod p(x)` << 1 */ | ||
1344 | 964 | + { 0x000000003701a774, 0x0000000064efe7c0 }, | ||
1345 | 965 | + /* x^6144 mod p(x)` << 1, x^6208 mod p(x)` << 1 */ | ||
1346 | 966 | + { 0x00000000ac81ef72, 0x000000000ac2d904 }, | ||
1347 | 967 | + /* x^5120 mod p(x)` << 1, x^5184 mod p(x)` << 1 */ | ||
1348 | 968 | + { 0x0000000133212464, 0x00000000fd226d14 }, | ||
1349 | 969 | + /* x^4096 mod p(x)` << 1, x^4160 mod p(x)` << 1 */ | ||
1350 | 970 | + { 0x00000000e4e45610, 0x000000011cfd42e0 }, | ||
1351 | 971 | + /* x^3072 mod p(x)` << 1, x^3136 mod p(x)` << 1 */ | ||
1352 | 972 | + { 0x000000000c1bd370, 0x000000016e5a5678 }, | ||
1353 | 973 | + /* x^2048 mod p(x)` << 1, x^2112 mod p(x)` << 1 */ | ||
1354 | 974 | + { 0x00000001a7b9e7a6, 0x00000001d888fe22 }, | ||
1355 | 975 | + /* x^1024 mod p(x)` << 1, x^1088 mod p(x)` << 1 */ | ||
1356 | 976 | + { 0x000000007d657a10, 0x00000001af77fcd4 } | ||
1357 | 977 | +#else /* __LITTLE_ENDIAN__ */ | ||
1358 | 978 | + /* x^261120 mod p(x)` << 1, x^261184 mod p(x)` << 1 */ | ||
1359 | 979 | + { 0x00000001651797d2, 0x0000000099ea94a8 }, | ||
1360 | 980 | + /* x^260096 mod p(x)` << 1, x^260160 mod p(x)` << 1 */ | ||
1361 | 981 | + { 0x0000000021e0d56c, 0x00000000945a8420 }, | ||
1362 | 982 | + /* x^259072 mod p(x)` << 1, x^259136 mod p(x)` << 1 */ | ||
1363 | 983 | + { 0x000000000f95ecaa, 0x0000000030762706 }, | ||
1364 | 984 | + /* x^258048 mod p(x)` << 1, x^258112 mod p(x)` << 1 */ | ||
1365 | 985 | + { 0x00000001ebd224ac, 0x00000001a52fc582 }, | ||
1366 | 986 | + /* x^257024 mod p(x)` << 1, x^257088 mod p(x)` << 1 */ | ||
1367 | 987 | + { 0x000000000ccb97ca, 0x00000001a4a7167a }, | ||
1368 | 988 | + /* x^256000 mod p(x)` << 1, x^256064 mod p(x)` << 1 */ | ||
1369 | 989 | + { 0x00000001006ec8a8, 0x000000000c18249a }, | ||
1370 | 990 | + /* x^254976 mod p(x)` << 1, x^255040 mod p(x)` << 1 */ | ||
1371 | 991 | + { 0x000000014f58f196, 0x00000000a924ae7c }, | ||
1372 | 992 | + /* x^253952 mod p(x)` << 1, x^254016 mod p(x)` << 1 */ | ||
1373 | 993 | + { 0x00000001a7192ca6, 0x00000001e12ccc12 }, | ||
1374 | 994 | + /* x^252928 mod p(x)` << 1, x^252992 mod p(x)` << 1 */ | ||
1375 | 995 | + { 0x000000019a64bab2, 0x00000000a0b9d4ac }, | ||
1376 | 996 | + /* x^251904 mod p(x)` << 1, x^251968 mod p(x)` << 1 */ | ||
1377 | 997 | + { 0x0000000014f4ed2e, 0x0000000095e8ddfe }, | ||
1378 | 998 | + /* x^250880 mod p(x)` << 1, x^250944 mod p(x)` << 1 */ | ||
1379 | 999 | + { 0x000000011092b6a2, 0x00000000233fddc4 }, | ||
1380 | 1000 | + /* x^249856 mod p(x)` << 1, x^249920 mod p(x)` << 1 */ | ||
1381 | 1001 | + { 0x00000000c8a1629c, 0x00000001b4529b62 }, | ||
1382 | 1002 | + /* x^248832 mod p(x)` << 1, x^248896 mod p(x)` << 1 */ | ||
1383 | 1003 | + { 0x000000017bf32e8e, 0x00000001a7fa0e64 }, | ||
1384 | 1004 | + /* x^247808 mod p(x)` << 1, x^247872 mod p(x)` << 1 */ | ||
1385 | 1005 | + { 0x00000001f8cc6582, 0x00000001b5334592 }, | ||
1386 | 1006 | + /* x^246784 mod p(x)` << 1, x^246848 mod p(x)` << 1 */ | ||
1387 | 1007 | + { 0x000000008631ddf0, 0x000000011f8ee1b4 }, | ||
1388 | 1008 | + /* x^245760 mod p(x)` << 1, x^245824 mod p(x)` << 1 */ | ||
1389 | 1009 | + { 0x000000007e5a76d0, 0x000000006252e632 }, | ||
1390 | 1010 | + /* x^244736 mod p(x)` << 1, x^244800 mod p(x)` << 1 */ | ||
1391 | 1011 | + { 0x000000002b09b31c, 0x00000000ab973e84 }, | ||
1392 | 1012 | + /* x^243712 mod p(x)` << 1, x^243776 mod p(x)` << 1 */ | ||
1393 | 1013 | + { 0x00000001b2df1f84, 0x000000007734f5ec }, | ||
1394 | 1014 | + /* x^242688 mod p(x)` << 1, x^242752 mod p(x)` << 1 */ | ||
1395 | 1015 | + { 0x00000001d6f56afc, 0x000000007c547798 }, | ||
1396 | 1016 | + /* x^241664 mod p(x)` << 1, x^241728 mod p(x)` << 1 */ | ||
1397 | 1017 | + { 0x00000001b9b5e70c, 0x000000007ec40210 }, | ||
1398 | 1018 | + /* x^240640 mod p(x)` << 1, x^240704 mod p(x)` << 1 */ | ||
1399 | 1019 | + { 0x0000000034b626d2, 0x00000001ab1695a8 }, | ||
1400 | 1020 | + /* x^239616 mod p(x)` << 1, x^239680 mod p(x)` << 1 */ | ||
1401 | 1021 | + { 0x000000014c53479a, 0x0000000090494bba }, | ||
1402 | 1022 | + /* x^238592 mod p(x)` << 1, x^238656 mod p(x)` << 1 */ | ||
1403 | 1023 | + { 0x00000001a6d179a4, 0x00000001123fb816 }, | ||
1404 | 1024 | + /* x^237568 mod p(x)` << 1, x^237632 mod p(x)` << 1 */ | ||
1405 | 1025 | + { 0x000000015abd16b4, 0x00000001e188c74c }, | ||
1406 | 1026 | + /* x^236544 mod p(x)` << 1, x^236608 mod p(x)` << 1 */ | ||
1407 | 1027 | + { 0x00000000018f9852, 0x00000001c2d3451c }, | ||
1408 | 1028 | + /* x^235520 mod p(x)` << 1, x^235584 mod p(x)` << 1 */ | ||
1409 | 1029 | + { 0x000000001fb3084a, 0x00000000f55cf1ca }, | ||
1410 | 1030 | + /* x^234496 mod p(x)` << 1, x^234560 mod p(x)` << 1 */ | ||
1411 | 1031 | + { 0x00000000c53dfb04, 0x00000001a0531540 }, | ||
1412 | 1032 | + /* x^233472 mod p(x)` << 1, x^233536 mod p(x)` << 1 */ | ||
1413 | 1033 | + { 0x00000000e10c9ad6, 0x0000000132cd7ebc }, | ||
1414 | 1034 | + /* x^232448 mod p(x)` << 1, x^232512 mod p(x)` << 1 */ | ||
1415 | 1035 | + { 0x0000000025aa994a, 0x0000000073ab7f36 }, | ||
1416 | 1036 | + /* x^231424 mod p(x)` << 1, x^231488 mod p(x)` << 1 */ | ||
1417 | 1037 | + { 0x00000000fa3a74c4, 0x0000000041aed1c2 }, | ||
1418 | 1038 | + /* x^230400 mod p(x)` << 1, x^230464 mod p(x)` << 1 */ | ||
1419 | 1039 | + { 0x0000000033eb3f40, 0x0000000136c53800 }, | ||
1420 | 1040 | + /* x^229376 mod p(x)` << 1, x^229440 mod p(x)` << 1 */ | ||
1421 | 1041 | + { 0x000000017193f296, 0x0000000126835a30 }, | ||
1422 | 1042 | + /* x^228352 mod p(x)` << 1, x^228416 mod p(x)` << 1 */ | ||
1423 | 1043 | + { 0x0000000043f6c86a, 0x000000006241b502 }, | ||
1424 | 1044 | + /* x^227328 mod p(x)` << 1, x^227392 mod p(x)` << 1 */ | ||
1425 | 1045 | + { 0x000000016b513ec6, 0x00000000d5196ad4 }, | ||
1426 | 1046 | + /* x^226304 mod p(x)` << 1, x^226368 mod p(x)` << 1 */ | ||
1427 | 1047 | + { 0x00000000c8f25b4e, 0x000000009cfa769a }, | ||
1428 | 1048 | + /* x^225280 mod p(x)` << 1, x^225344 mod p(x)` << 1 */ | ||
1429 | 1049 | + { 0x00000001a45048ec, 0x00000000920e5df4 }, | ||
1430 | 1050 | + /* x^224256 mod p(x)` << 1, x^224320 mod p(x)` << 1 */ | ||
1431 | 1051 | + { 0x000000000c441004, 0x0000000169dc310e }, | ||
1432 | 1052 | + /* x^223232 mod p(x)` << 1, x^223296 mod p(x)` << 1 */ | ||
1433 | 1053 | + { 0x000000000e17cad6, 0x0000000009fc331c }, | ||
1434 | 1054 | + /* x^222208 mod p(x)` << 1, x^222272 mod p(x)` << 1 */ | ||
1435 | 1055 | + { 0x00000001253ae964, 0x000000010d94a81e }, | ||
1436 | 1056 | + /* x^221184 mod p(x)` << 1, x^221248 mod p(x)` << 1 */ | ||
1437 | 1057 | + { 0x00000001d7c88ebc, 0x0000000027a20ab2 }, | ||
1438 | 1058 | + /* x^220160 mod p(x)` << 1, x^220224 mod p(x)` << 1 */ | ||
1439 | 1059 | + { 0x00000001e7ca913a, 0x0000000114f87504 }, | ||
1440 | 1060 | + /* x^219136 mod p(x)` << 1, x^219200 mod p(x)` << 1 */ | ||
1441 | 1061 | + { 0x0000000033ed078a, 0x000000004b076d96 }, | ||
1442 | 1062 | + /* x^218112 mod p(x)` << 1, x^218176 mod p(x)` << 1 */ | ||
1443 | 1063 | + { 0x00000000e1839c78, 0x00000000da4d1e74 }, | ||
1444 | 1064 | + /* x^217088 mod p(x)` << 1, x^217152 mod p(x)` << 1 */ | ||
1445 | 1065 | + { 0x00000001322b267e, 0x000000001b81f672 }, | ||
1446 | 1066 | + /* x^216064 mod p(x)` << 1, x^216128 mod p(x)` << 1 */ | ||
1447 | 1067 | + { 0x00000000638231b6, 0x000000009367c988 }, | ||
1448 | 1068 | + /* x^215040 mod p(x)` << 1, x^215104 mod p(x)` << 1 */ | ||
1449 | 1069 | + { 0x00000001ee7f16f4, 0x00000001717214ca }, | ||
1450 | 1070 | + /* x^214016 mod p(x)` << 1, x^214080 mod p(x)` << 1 */ | ||
1451 | 1071 | + { 0x0000000117d9924a, 0x000000009f47d820 }, | ||
1452 | 1072 | + /* x^212992 mod p(x)` << 1, x^213056 mod p(x)` << 1 */ | ||
1453 | 1073 | + { 0x00000000e1a9e0c4, 0x000000010d9a47d2 }, | ||
1454 | 1074 | + /* x^211968 mod p(x)` << 1, x^212032 mod p(x)` << 1 */ | ||
1455 | 1075 | + { 0x00000001403731dc, 0x00000000a696c58c }, | ||
1456 | 1076 | + /* x^210944 mod p(x)` << 1, x^211008 mod p(x)` << 1 */ | ||
1457 | 1077 | + { 0x00000001a5ea9682, 0x000000002aa28ec6 }, | ||
1458 | 1078 | + /* x^209920 mod p(x)` << 1, x^209984 mod p(x)` << 1 */ | ||
1459 | 1079 | + { 0x0000000101c5c578, 0x00000001fe18fd9a }, | ||
1460 | 1080 | + /* x^208896 mod p(x)` << 1, x^208960 mod p(x)` << 1 */ | ||
1461 | 1081 | + { 0x00000000dddf6494, 0x000000019d4fc1ae }, | ||
1462 | 1082 | + /* x^207872 mod p(x)` << 1, x^207936 mod p(x)` << 1 */ | ||
1463 | 1083 | + { 0x00000000f1c3db28, 0x00000001ba0e3dea }, | ||
1464 | 1084 | + /* x^206848 mod p(x)` << 1, x^206912 mod p(x)` << 1 */ | ||
1465 | 1085 | + { 0x000000013112fb9c, 0x0000000074b59a5e }, | ||
1466 | 1086 | + /* x^205824 mod p(x)` << 1, x^205888 mod p(x)` << 1 */ | ||
1467 | 1087 | + { 0x00000000b680b906, 0x00000000f2b5ea98 }, | ||
1468 | 1088 | + /* x^204800 mod p(x)` << 1, x^204864 mod p(x)` << 1 */ | ||
1469 | 1089 | + { 0x000000001a282932, 0x0000000187132676 }, | ||
1470 | 1090 | + /* x^203776 mod p(x)` << 1, x^203840 mod p(x)` << 1 */ | ||
1471 | 1091 | + { 0x0000000089406e7e, 0x000000010a8c6ad4 }, | ||
1472 | 1092 | + /* x^202752 mod p(x)` << 1, x^202816 mod p(x)` << 1 */ | ||
1473 | 1093 | + { 0x00000001def6be8c, 0x00000001e21dfe70 }, | ||
1474 | 1094 | + /* x^201728 mod p(x)` << 1, x^201792 mod p(x)` << 1 */ | ||
1475 | 1095 | + { 0x0000000075258728, 0x00000001da0050e4 }, | ||
1476 | 1096 | + /* x^200704 mod p(x)` << 1, x^200768 mod p(x)` << 1 */ | ||
1477 | 1097 | + { 0x000000019536090a, 0x00000000772172ae }, | ||
1478 | 1098 | + /* x^199680 mod p(x)` << 1, x^199744 mod p(x)` << 1 */ | ||
1479 | 1099 | + { 0x00000000f2455bfc, 0x00000000e47724aa }, | ||
1480 | 1100 | + /* x^198656 mod p(x)` << 1, x^198720 mod p(x)` << 1 */ | ||
1481 | 1101 | + { 0x000000018c40baf4, 0x000000003cd63ac4 }, | ||
1482 | 1102 | + /* x^197632 mod p(x)` << 1, x^197696 mod p(x)` << 1 */ | ||
1483 | 1103 | + { 0x000000004cd390d4, 0x00000001bf47d352 }, | ||
1484 | 1104 | + /* x^196608 mod p(x)` << 1, x^196672 mod p(x)` << 1 */ | ||
1485 | 1105 | + { 0x00000001e4ece95a, 0x000000018dc1d708 }, | ||
1486 | 1106 | + /* x^195584 mod p(x)` << 1, x^195648 mod p(x)` << 1 */ | ||
1487 | 1107 | + { 0x000000001a3ee918, 0x000000002d4620a4 }, | ||
1488 | 1108 | + /* x^194560 mod p(x)` << 1, x^194624 mod p(x)` << 1 */ | ||
1489 | 1109 | + { 0x000000007c652fb8, 0x0000000058fd1740 }, | ||
1490 | 1110 | + /* x^193536 mod p(x)` << 1, x^193600 mod p(x)` << 1 */ | ||
1491 | 1111 | + { 0x000000011c67842c, 0x00000000dadd9bfc }, | ||
1492 | 1112 | + /* x^192512 mod p(x)` << 1, x^192576 mod p(x)` << 1 */ | ||
1493 | 1113 | + { 0x00000000254f759c, 0x00000001ea2140be }, | ||
1494 | 1114 | + /* x^191488 mod p(x)` << 1, x^191552 mod p(x)` << 1 */ | ||
1495 | 1115 | + { 0x000000007ece94ca, 0x000000009de128ba }, | ||
1496 | 1116 | + /* x^190464 mod p(x)` << 1, x^190528 mod p(x)` << 1 */ | ||
1497 | 1117 | + { 0x0000000038f258c2, 0x000000013ac3aa8e }, | ||
1498 | 1118 | + /* x^189440 mod p(x)` << 1, x^189504 mod p(x)` << 1 */ | ||
1499 | 1119 | + { 0x00000001cdf17b00, 0x0000000099980562 }, | ||
1500 | 1120 | + /* x^188416 mod p(x)` << 1, x^188480 mod p(x)` << 1 */ | ||
1501 | 1121 | + { 0x000000011f882c16, 0x00000001c1579c86 }, | ||
1502 | 1122 | + /* x^187392 mod p(x)` << 1, x^187456 mod p(x)` << 1 */ | ||
1503 | 1123 | + { 0x0000000100093fc8, 0x0000000068dbbf94 }, | ||
1504 | 1124 | + /* x^186368 mod p(x)` << 1, x^186432 mod p(x)` << 1 */ | ||
1505 | 1125 | + { 0x00000001cd684f16, 0x000000004509fb04 }, | ||
1506 | 1126 | + /* x^185344 mod p(x)` << 1, x^185408 mod p(x)` << 1 */ | ||
1507 | 1127 | + { 0x000000004bc6a70a, 0x00000001202f6398 }, | ||
1508 | 1128 | + /* x^184320 mod p(x)` << 1, x^184384 mod p(x)` << 1 */ | ||
1509 | 1129 | + { 0x000000004fc7e8e4, 0x000000013aea243e }, | ||
1510 | 1130 | + /* x^183296 mod p(x)` << 1, x^183360 mod p(x)` << 1 */ | ||
1511 | 1131 | + { 0x0000000130103f1c, 0x00000001b4052ae6 }, | ||
1512 | 1132 | + /* x^182272 mod p(x)` << 1, x^182336 mod p(x)` << 1 */ | ||
1513 | 1133 | + { 0x0000000111b0024c, 0x00000001cd2a0ae8 }, | ||
1514 | 1134 | + /* x^181248 mod p(x)` << 1, x^181312 mod p(x)` << 1 */ | ||
1515 | 1135 | + { 0x000000010b3079da, 0x00000001fe4aa8b4 }, | ||
1516 | 1136 | + /* x^180224 mod p(x)` << 1, x^180288 mod p(x)` << 1 */ | ||
1517 | 1137 | + { 0x000000010192bcc2, 0x00000001d1559a42 }, | ||
1518 | 1138 | + /* x^179200 mod p(x)` << 1, x^179264 mod p(x)` << 1 */ | ||
1519 | 1139 | + { 0x0000000074838d50, 0x00000001f3e05ecc }, | ||
1520 | 1140 | + /* x^178176 mod p(x)` << 1, x^178240 mod p(x)` << 1 */ | ||
1521 | 1141 | + { 0x000000001b20f520, 0x0000000104ddd2cc }, | ||
1522 | 1142 | + /* x^177152 mod p(x)` << 1, x^177216 mod p(x)` << 1 */ | ||
1523 | 1143 | + { 0x0000000050c3590a, 0x000000015393153c }, | ||
1524 | 1144 | + /* x^176128 mod p(x)` << 1, x^176192 mod p(x)` << 1 */ | ||
1525 | 1145 | + { 0x00000000b41cac8e, 0x0000000057e942c6 }, | ||
1526 | 1146 | + /* x^175104 mod p(x)` << 1, x^175168 mod p(x)` << 1 */ | ||
1527 | 1147 | + { 0x000000000c72cc78, 0x000000012c633850 }, | ||
1528 | 1148 | + /* x^174080 mod p(x)` << 1, x^174144 mod p(x)` << 1 */ | ||
1529 | 1149 | + { 0x0000000030cdb032, 0x00000000ebcaae4c }, | ||
1530 | 1150 | + /* x^173056 mod p(x)` << 1, x^173120 mod p(x)` << 1 */ | ||
1531 | 1151 | + { 0x000000013e09fc32, 0x000000013ee532a6 }, | ||
1532 | 1152 | + /* x^172032 mod p(x)` << 1, x^172096 mod p(x)` << 1 */ | ||
1533 | 1153 | + { 0x000000001ed624d2, 0x00000001bf0cbc7e }, | ||
1534 | 1154 | + /* x^171008 mod p(x)` << 1, x^171072 mod p(x)` << 1 */ | ||
1535 | 1155 | + { 0x00000000781aee1a, 0x00000000d50b7a5a }, | ||
1536 | 1156 | + /* x^169984 mod p(x)` << 1, x^170048 mod p(x)` << 1 */ | ||
1537 | 1157 | + { 0x00000001c4d8348c, 0x0000000002fca6e8 }, | ||
1538 | 1158 | + /* x^168960 mod p(x)` << 1, x^169024 mod p(x)` << 1 */ | ||
1539 | 1159 | + { 0x0000000057a40336, 0x000000007af40044 }, | ||
1540 | 1160 | + /* x^167936 mod p(x)` << 1, x^168000 mod p(x)` << 1 */ | ||
1541 | 1161 | + { 0x0000000085544940, 0x0000000016178744 }, | ||
1542 | 1162 | + /* x^166912 mod p(x)` << 1, x^166976 mod p(x)` << 1 */ | ||
1543 | 1163 | + { 0x000000019cd21e80, 0x000000014c177458 }, | ||
1544 | 1164 | + /* x^165888 mod p(x)` << 1, x^165952 mod p(x)` << 1 */ | ||
1545 | 1165 | + { 0x000000013eb95bc0, 0x000000011b6ddf04 }, | ||
1546 | 1166 | + /* x^164864 mod p(x)` << 1, x^164928 mod p(x)` << 1 */ | ||
1547 | 1167 | + { 0x00000001dfc9fdfc, 0x00000001f3e29ccc }, | ||
1548 | 1168 | + /* x^163840 mod p(x)` << 1, x^163904 mod p(x)` << 1 */ | ||
1549 | 1169 | + { 0x00000000cd028bc2, 0x0000000135ae7562 }, | ||
1550 | 1170 | + /* x^162816 mod p(x)` << 1, x^162880 mod p(x)` << 1 */ | ||
1551 | 1171 | + { 0x0000000090db8c44, 0x0000000190ef812c }, | ||
1552 | 1172 | + /* x^161792 mod p(x)` << 1, x^161856 mod p(x)` << 1 */ | ||
1553 | 1173 | + { 0x000000010010a4ce, 0x0000000067a2c786 }, | ||
1554 | 1174 | + /* x^160768 mod p(x)` << 1, x^160832 mod p(x)` << 1 */ | ||
1555 | 1175 | + { 0x00000001c8f4c72c, 0x0000000048b9496c }, | ||
1556 | 1176 | + /* x^159744 mod p(x)` << 1, x^159808 mod p(x)` << 1 */ | ||
1557 | 1177 | + { 0x000000001c26170c, 0x000000015a422de6 }, | ||
1558 | 1178 | + /* x^158720 mod p(x)` << 1, x^158784 mod p(x)` << 1 */ | ||
1559 | 1179 | + { 0x00000000e3fccf68, 0x00000001ef0e3640 }, | ||
1560 | 1180 | + /* x^157696 mod p(x)` << 1, x^157760 mod p(x)` << 1 */ | ||
1561 | 1181 | + { 0x00000000d513ed24, 0x00000001006d2d26 }, | ||
1562 | 1182 | + /* x^156672 mod p(x)` << 1, x^156736 mod p(x)` << 1 */ | ||
1563 | 1183 | + { 0x00000000141beada, 0x00000001170d56d6 }, | ||
1564 | 1184 | + /* x^155648 mod p(x)` << 1, x^155712 mod p(x)` << 1 */ | ||
1565 | 1185 | + { 0x000000011071aea0, 0x00000000a5fb613c }, | ||
1566 | 1186 | + /* x^154624 mod p(x)` << 1, x^154688 mod p(x)` << 1 */ | ||
1567 | 1187 | + { 0x000000012e19080a, 0x0000000040bbf7fc }, | ||
1568 | 1188 | + /* x^153600 mod p(x)` << 1, x^153664 mod p(x)` << 1 */ | ||
1569 | 1189 | + { 0x0000000100ecf826, 0x000000016ac3a5b2 }, | ||
1570 | 1190 | + /* x^152576 mod p(x)` << 1, x^152640 mod p(x)` << 1 */ | ||
1571 | 1191 | + { 0x0000000069b09412, 0x00000000abf16230 }, | ||
1572 | 1192 | + /* x^151552 mod p(x)` << 1, x^151616 mod p(x)` << 1 */ | ||
1573 | 1193 | + { 0x0000000122297bac, 0x00000001ebe23fac }, | ||
1574 | 1194 | + /* x^150528 mod p(x)` << 1, x^150592 mod p(x)` << 1 */ | ||
1575 | 1195 | + { 0x00000000e9e4b068, 0x000000008b6a0894 }, | ||
1576 | 1196 | + /* x^149504 mod p(x)` << 1, x^149568 mod p(x)` << 1 */ | ||
1577 | 1197 | + { 0x000000004b38651a, 0x00000001288ea478 }, | ||
1578 | 1198 | + /* x^148480 mod p(x)` << 1, x^148544 mod p(x)` << 1 */ | ||
1579 | 1199 | + { 0x00000001468360e2, 0x000000016619c442 }, | ||
1580 | 1200 | + /* x^147456 mod p(x)` << 1, x^147520 mod p(x)` << 1 */ | ||
1581 | 1201 | + { 0x00000000121c2408, 0x0000000086230038 }, | ||
1582 | 1202 | + /* x^146432 mod p(x)` << 1, x^146496 mod p(x)` << 1 */ | ||
1583 | 1203 | + { 0x00000000da7e7d08, 0x000000017746a756 }, | ||
1584 | 1204 | + /* x^145408 mod p(x)` << 1, x^145472 mod p(x)` << 1 */ | ||
1585 | 1205 | + { 0x00000001058d7652, 0x0000000191b8f8f8 }, | ||
1586 | 1206 | + /* x^144384 mod p(x)` << 1, x^144448 mod p(x)` << 1 */ | ||
1587 | 1207 | + { 0x000000014a098a90, 0x000000008e167708 }, | ||
1588 | 1208 | + /* x^143360 mod p(x)` << 1, x^143424 mod p(x)` << 1 */ | ||
1589 | 1209 | + { 0x0000000020dbe72e, 0x0000000148b22d54 }, | ||
1590 | 1210 | + /* x^142336 mod p(x)` << 1, x^142400 mod p(x)` << 1 */ | ||
1591 | 1211 | + { 0x000000011e7323e8, 0x0000000044ba2c3c }, | ||
1592 | 1212 | + /* x^141312 mod p(x)` << 1, x^141376 mod p(x)` << 1 */ | ||
1593 | 1213 | + { 0x00000000d5d4bf94, 0x00000000b54d2b52 }, | ||
1594 | 1214 | + /* x^140288 mod p(x)` << 1, x^140352 mod p(x)` << 1 */ | ||
1595 | 1215 | + { 0x0000000199d8746c, 0x0000000005a4fd8a }, | ||
1596 | 1216 | + /* x^139264 mod p(x)` << 1, x^139328 mod p(x)` << 1 */ | ||
1597 | 1217 | + { 0x00000000ce9ca8a0, 0x0000000139f9fc46 }, | ||
1598 | 1218 | + /* x^138240 mod p(x)` << 1, x^138304 mod p(x)` << 1 */ | ||
1599 | 1219 | + { 0x00000000136edece, 0x000000015a1fa824 }, | ||
1600 | 1220 | + /* x^137216 mod p(x)` << 1, x^137280 mod p(x)` << 1 */ | ||
1601 | 1221 | + { 0x000000019b92a068, 0x000000000a61ae4c }, | ||
1602 | 1222 | + /* x^136192 mod p(x)` << 1, x^136256 mod p(x)` << 1 */ | ||
1603 | 1223 | + { 0x0000000071d62206, 0x0000000145e9113e }, | ||
1604 | 1224 | + /* x^135168 mod p(x)` << 1, x^135232 mod p(x)` << 1 */ | ||
1605 | 1225 | + { 0x00000000dfc50158, 0x000000006a348448 }, | ||
1606 | 1226 | + /* x^134144 mod p(x)` << 1, x^134208 mod p(x)` << 1 */ | ||
1607 | 1227 | + { 0x00000001517626bc, 0x000000004d80a08c }, | ||
1608 | 1228 | + /* x^133120 mod p(x)` << 1, x^133184 mod p(x)` << 1 */ | ||
1609 | 1229 | + { 0x0000000148d1e4fa, 0x000000014b6837a0 }, | ||
1610 | 1230 | + /* x^132096 mod p(x)` << 1, x^132160 mod p(x)` << 1 */ | ||
1611 | 1231 | + { 0x0000000094d8266e, 0x000000016896a7fc }, | ||
1612 | 1232 | + /* x^131072 mod p(x)` << 1, x^131136 mod p(x)` << 1 */ | ||
1613 | 1233 | + { 0x00000000606c5e34, 0x000000014f187140 }, | ||
1614 | 1234 | + /* x^130048 mod p(x)` << 1, x^130112 mod p(x)` << 1 */ | ||
1615 | 1235 | + { 0x000000019766beaa, 0x000000019581b9da }, | ||
1616 | 1236 | + /* x^129024 mod p(x)` << 1, x^129088 mod p(x)` << 1 */ | ||
1617 | 1237 | + { 0x00000001d80c506c, 0x00000001091bc984 }, | ||
1618 | 1238 | + /* x^128000 mod p(x)` << 1, x^128064 mod p(x)` << 1 */ | ||
1619 | 1239 | + { 0x000000001e73837c, 0x000000001067223c }, | ||
1620 | 1240 | + /* x^126976 mod p(x)` << 1, x^127040 mod p(x)` << 1 */ | ||
1621 | 1241 | + { 0x0000000064d587de, 0x00000001ab16ea02 }, | ||
1622 | 1242 | + /* x^125952 mod p(x)` << 1, x^126016 mod p(x)` << 1 */ | ||
1623 | 1243 | + { 0x00000000f4a507b0, 0x000000013c4598a8 }, | ||
1624 | 1244 | + /* x^124928 mod p(x)` << 1, x^124992 mod p(x)` << 1 */ | ||
1625 | 1245 | + { 0x0000000040e342fc, 0x00000000b3735430 }, | ||
1626 | 1246 | + /* x^123904 mod p(x)` << 1, x^123968 mod p(x)` << 1 */ | ||
1627 | 1247 | + { 0x00000001d5ad9c3a, 0x00000001bb3fc0c0 }, | ||
1628 | 1248 | + /* x^122880 mod p(x)` << 1, x^122944 mod p(x)` << 1 */ | ||
1629 | 1249 | + { 0x0000000094a691a4, 0x00000001570ae19c }, | ||
1630 | 1250 | + /* x^121856 mod p(x)` << 1, x^121920 mod p(x)` << 1 */ | ||
1631 | 1251 | + { 0x00000001271ecdfa, 0x00000001ea910712 }, | ||
1632 | 1252 | + /* x^120832 mod p(x)` << 1, x^120896 mod p(x)` << 1 */ | ||
1633 | 1253 | + { 0x000000009e54475a, 0x0000000167127128 }, | ||
1634 | 1254 | + /* x^119808 mod p(x)` << 1, x^119872 mod p(x)` << 1 */ | ||
1635 | 1255 | + { 0x00000000c9c099ee, 0x0000000019e790a2 }, | ||
1636 | 1256 | + /* x^118784 mod p(x)` << 1, x^118848 mod p(x)` << 1 */ | ||
1637 | 1257 | + { 0x000000009a2f736c, 0x000000003788f710 }, | ||
1638 | 1258 | + /* x^117760 mod p(x)` << 1, x^117824 mod p(x)` << 1 */ | ||
1639 | 1259 | + { 0x00000000bb9f4996, 0x00000001682a160e }, | ||
1640 | 1260 | + /* x^116736 mod p(x)` << 1, x^116800 mod p(x)` << 1 */ | ||
1641 | 1261 | + { 0x00000001db688050, 0x000000007f0ebd2e }, | ||
1642 | 1262 | + /* x^115712 mod p(x)` << 1, x^115776 mod p(x)` << 1 */ | ||
1643 | 1263 | + { 0x00000000e9b10af4, 0x000000002b032080 }, | ||
1644 | 1264 | + /* x^114688 mod p(x)` << 1, x^114752 mod p(x)` << 1 */ | ||
1645 | 1265 | + { 0x000000012d4545e4, 0x00000000cfd1664a }, | ||
1646 | 1266 | + /* x^113664 mod p(x)` << 1, x^113728 mod p(x)` << 1 */ | ||
1647 | 1267 | + { 0x000000000361139c, 0x00000000aa1181c2 }, | ||
1648 | 1268 | + /* x^112640 mod p(x)` << 1, x^112704 mod p(x)` << 1 */ | ||
1649 | 1269 | + { 0x00000001a5a1a3a8, 0x00000000ddd08002 }, | ||
1650 | 1270 | + /* x^111616 mod p(x)` << 1, x^111680 mod p(x)` << 1 */ | ||
1651 | 1271 | + { 0x000000006844e0b0, 0x00000000e8dd0446 }, | ||
1652 | 1272 | + /* x^110592 mod p(x)` << 1, x^110656 mod p(x)` << 1 */ | ||
1653 | 1273 | + { 0x00000000c3762f28, 0x00000001bbd94a00 }, | ||
1654 | 1274 | + /* x^109568 mod p(x)` << 1, x^109632 mod p(x)` << 1 */ | ||
1655 | 1275 | + { 0x00000001d26287a2, 0x00000000ab6cd180 }, | ||
1656 | 1276 | + /* x^108544 mod p(x)` << 1, x^108608 mod p(x)` << 1 */ | ||
1657 | 1277 | + { 0x00000001f6f0bba8, 0x0000000031803ce2 }, | ||
1658 | 1278 | + /* x^107520 mod p(x)` << 1, x^107584 mod p(x)` << 1 */ | ||
1659 | 1279 | + { 0x000000002ffabd62, 0x0000000024f40b0c }, | ||
1660 | 1280 | + /* x^106496 mod p(x)` << 1, x^106560 mod p(x)` << 1 */ | ||
1661 | 1281 | + { 0x00000000fb4516b8, 0x00000001ba1d9834 }, | ||
1662 | 1282 | + /* x^105472 mod p(x)` << 1, x^105536 mod p(x)` << 1 */ | ||
1663 | 1283 | + { 0x000000018cfa961c, 0x0000000104de61aa }, | ||
1664 | 1284 | + /* x^104448 mod p(x)` << 1, x^104512 mod p(x)` << 1 */ | ||
1665 | 1285 | + { 0x000000019e588d52, 0x0000000113e40d46 }, | ||
1666 | 1286 | + /* x^103424 mod p(x)` << 1, x^103488 mod p(x)` << 1 */ | ||
1667 | 1287 | + { 0x00000001180f0bbc, 0x00000001415598a0 }, | ||
1668 | 1288 | + /* x^102400 mod p(x)` << 1, x^102464 mod p(x)` << 1 */ | ||
1669 | 1289 | + { 0x00000000e1d9177a, 0x00000000bf6c8c90 }, | ||
1670 | 1290 | + /* x^101376 mod p(x)` << 1, x^101440 mod p(x)` << 1 */ | ||
1671 | 1291 | + { 0x0000000105abc27c, 0x00000001788b0504 }, | ||
1672 | 1292 | + /* x^100352 mod p(x)` << 1, x^100416 mod p(x)` << 1 */ | ||
1673 | 1293 | + { 0x00000000972e4a58, 0x0000000038385d02 }, | ||
1674 | 1294 | + /* x^99328 mod p(x)` << 1, x^99392 mod p(x)` << 1 */ | ||
1675 | 1295 | + { 0x0000000183499a5e, 0x00000001b6c83844 }, | ||
1676 | 1296 | + /* x^98304 mod p(x)` << 1, x^98368 mod p(x)` << 1 */ | ||
1677 | 1297 | + { 0x00000001c96a8cca, 0x0000000051061a8a }, | ||
1678 | 1298 | + /* x^97280 mod p(x)` << 1, x^97344 mod p(x)` << 1 */ | ||
1679 | 1299 | + { 0x00000001a1a5b60c, 0x000000017351388a }, | ||
1680 | 1300 | + /* x^96256 mod p(x)` << 1, x^96320 mod p(x)` << 1 */ | ||
1681 | 1301 | + { 0x00000000e4b6ac9c, 0x0000000132928f92 }, | ||
1682 | 1302 | + /* x^95232 mod p(x)` << 1, x^95296 mod p(x)` << 1 */ | ||
1683 | 1303 | + { 0x00000001807e7f5a, 0x00000000e6b4f48a }, | ||
1684 | 1304 | + /* x^94208 mod p(x)` << 1, x^94272 mod p(x)` << 1 */ | ||
1685 | 1305 | + { 0x000000017a7e3bc8, 0x0000000039d15e90 }, | ||
1686 | 1306 | + /* x^93184 mod p(x)` << 1, x^93248 mod p(x)` << 1 */ | ||
1687 | 1307 | + { 0x00000000d73975da, 0x00000000312d6074 }, | ||
1688 | 1308 | + /* x^92160 mod p(x)` << 1, x^92224 mod p(x)` << 1 */ | ||
1689 | 1309 | + { 0x000000017375d038, 0x000000017bbb2cc4 }, | ||
1690 | 1310 | + /* x^91136 mod p(x)` << 1, x^91200 mod p(x)` << 1 */ | ||
1691 | 1311 | + { 0x00000000193680bc, 0x000000016ded3e18 }, | ||
1692 | 1312 | + /* x^90112 mod p(x)` << 1, x^90176 mod p(x)` << 1 */ | ||
1693 | 1313 | + { 0x00000000999b06f6, 0x00000000f1638b16 }, | ||
1694 | 1314 | + /* x^89088 mod p(x)` << 1, x^89152 mod p(x)` << 1 */ | ||
1695 | 1315 | + { 0x00000001f685d2b8, 0x00000001d38b9ecc }, | ||
1696 | 1316 | + /* x^88064 mod p(x)` << 1, x^88128 mod p(x)` << 1 */ | ||
1697 | 1317 | + { 0x00000001f4ecbed2, 0x000000018b8d09dc }, | ||
1698 | 1318 | + /* x^87040 mod p(x)` << 1, x^87104 mod p(x)` << 1 */ | ||
1699 | 1319 | + { 0x00000000ba16f1a0, 0x00000000e7bc27d2 }, | ||
1700 | 1320 | + /* x^86016 mod p(x)` << 1, x^86080 mod p(x)` << 1 */ | ||
1701 | 1321 | + { 0x0000000115aceac4, 0x00000000275e1e96 }, | ||
1702 | 1322 | + /* x^84992 mod p(x)` << 1, x^85056 mod p(x)` << 1 */ | ||
1703 | 1323 | + { 0x00000001aeff6292, 0x00000000e2e3031e }, | ||
1704 | 1324 | + /* x^83968 mod p(x)` << 1, x^84032 mod p(x)` << 1 */ | ||
1705 | 1325 | + { 0x000000009640124c, 0x00000001041c84d8 }, | ||
1706 | 1326 | + /* x^82944 mod p(x)` << 1, x^83008 mod p(x)` << 1 */ | ||
1707 | 1327 | + { 0x0000000114f41f02, 0x00000000706ce672 }, | ||
1708 | 1328 | + /* x^81920 mod p(x)` << 1, x^81984 mod p(x)` << 1 */ | ||
1709 | 1329 | + { 0x000000009c5f3586, 0x000000015d5070da }, | ||
1710 | 1330 | + /* x^80896 mod p(x)` << 1, x^80960 mod p(x)` << 1 */ | ||
1711 | 1331 | + { 0x00000001878275fa, 0x0000000038f9493a }, | ||
1712 | 1332 | + /* x^79872 mod p(x)` << 1, x^79936 mod p(x)` << 1 */ | ||
1713 | 1333 | + { 0x00000000ddc42ce8, 0x00000000a3348a76 }, | ||
1714 | 1334 | + /* x^78848 mod p(x)` << 1, x^78912 mod p(x)` << 1 */ | ||
1715 | 1335 | + { 0x0000000181d2c73a, 0x00000001ad0aab92 }, | ||
1716 | 1336 | + /* x^77824 mod p(x)` << 1, x^77888 mod p(x)` << 1 */ | ||
1717 | 1337 | + { 0x0000000141c9320a, 0x000000019e85f712 }, | ||
1718 | 1338 | + /* x^76800 mod p(x)` << 1, x^76864 mod p(x)` << 1 */ | ||
1719 | 1339 | + { 0x000000015235719a, 0x000000005a871e76 }, | ||
1720 | 1340 | + /* x^75776 mod p(x)` << 1, x^75840 mod p(x)` << 1 */ | ||
1721 | 1341 | + { 0x00000000be27d804, 0x000000017249c662 }, | ||
1722 | 1342 | + /* x^74752 mod p(x)` << 1, x^74816 mod p(x)` << 1 */ | ||
1723 | 1343 | + { 0x000000006242d45a, 0x000000003a084712 }, | ||
1724 | 1344 | + /* x^73728 mod p(x)` << 1, x^73792 mod p(x)` << 1 */ | ||
1725 | 1345 | + { 0x000000009a53638e, 0x00000000ed438478 }, | ||
1726 | 1346 | + /* x^72704 mod p(x)` << 1, x^72768 mod p(x)` << 1 */ | ||
1727 | 1347 | + { 0x00000001001ecfb6, 0x00000000abac34cc }, | ||
1728 | 1348 | + /* x^71680 mod p(x)` << 1, x^71744 mod p(x)` << 1 */ | ||
1729 | 1349 | + { 0x000000016d7c2d64, 0x000000005f35ef3e }, | ||
1730 | 1350 | + /* x^70656 mod p(x)` << 1, x^70720 mod p(x)` << 1 */ | ||
1731 | 1351 | + { 0x00000001d0ce46c0, 0x0000000047d6608c }, | ||
1732 | 1352 | + /* x^69632 mod p(x)` << 1, x^69696 mod p(x)` << 1 */ | ||
1733 | 1353 | + { 0x0000000124c907b4, 0x000000002d01470e }, | ||
1734 | 1354 | + /* x^68608 mod p(x)` << 1, x^68672 mod p(x)` << 1 */ | ||
1735 | 1355 | + { 0x0000000018a555ca, 0x0000000158bbc7b0 }, | ||
1736 | 1356 | + /* x^67584 mod p(x)` << 1, x^67648 mod p(x)` << 1 */ | ||
1737 | 1357 | + { 0x000000006b0980bc, 0x00000000c0a23e8e }, | ||
1738 | 1358 | + /* x^66560 mod p(x)` << 1, x^66624 mod p(x)` << 1 */ | ||
1739 | 1359 | + { 0x000000008bbba964, 0x00000001ebd85c88 }, | ||
1740 | 1360 | + /* x^65536 mod p(x)` << 1, x^65600 mod p(x)` << 1 */ | ||
1741 | 1361 | + { 0x00000001070a5a1e, 0x000000019ee20bb2 }, | ||
1742 | 1362 | + /* x^64512 mod p(x)` << 1, x^64576 mod p(x)` << 1 */ | ||
1743 | 1363 | + { 0x000000002204322a, 0x00000001acabf2d6 }, | ||
1744 | 1364 | + /* x^63488 mod p(x)` << 1, x^63552 mod p(x)` << 1 */ | ||
1745 | 1365 | + { 0x00000000a27524d0, 0x00000001b7963d56 }, | ||
1746 | 1366 | + /* x^62464 mod p(x)` << 1, x^62528 mod p(x)` << 1 */ | ||
1747 | 1367 | + { 0x0000000020b1e4ba, 0x000000017bffa1fe }, | ||
1748 | 1368 | + /* x^61440 mod p(x)` << 1, x^61504 mod p(x)` << 1 */ | ||
1749 | 1369 | + { 0x0000000032cc27fc, 0x000000001f15333e }, | ||
1750 | 1370 | + /* x^60416 mod p(x)` << 1, x^60480 mod p(x)` << 1 */ | ||
1751 | 1371 | + { 0x0000000044dd22b8, 0x000000018593129e }, | ||
1752 | 1372 | + /* x^59392 mod p(x)` << 1, x^59456 mod p(x)` << 1 */ | ||
1753 | 1373 | + { 0x00000000dffc9e0a, 0x000000019cb32602 }, | ||
1754 | 1374 | + /* x^58368 mod p(x)` << 1, x^58432 mod p(x)` << 1 */ | ||
1755 | 1375 | + { 0x00000001b7a0ed14, 0x0000000142b05cc8 }, | ||
1756 | 1376 | + /* x^57344 mod p(x)` << 1, x^57408 mod p(x)` << 1 */ | ||
1757 | 1377 | + { 0x00000000c7842488, 0x00000001be49e7a4 }, | ||
1758 | 1378 | + /* x^56320 mod p(x)` << 1, x^56384 mod p(x)` << 1 */ | ||
1759 | 1379 | + { 0x00000001c02a4fee, 0x0000000108f69d6c }, | ||
1760 | 1380 | + /* x^55296 mod p(x)` << 1, x^55360 mod p(x)` << 1 */ | ||
1761 | 1381 | + { 0x000000003c273778, 0x000000006c0971f0 }, | ||
1762 | 1382 | + /* x^54272 mod p(x)` << 1, x^54336 mod p(x)` << 1 */ | ||
1763 | 1383 | + { 0x00000001d63f8894, 0x000000005b16467a }, | ||
1764 | 1384 | + /* x^53248 mod p(x)` << 1, x^53312 mod p(x)` << 1 */ | ||
1765 | 1385 | + { 0x000000006be557d6, 0x00000001551a628e }, | ||
1766 | 1386 | + /* x^52224 mod p(x)` << 1, x^52288 mod p(x)` << 1 */ | ||
1767 | 1387 | + { 0x000000006a7806ea, 0x000000019e42ea92 }, | ||
1768 | 1388 | + /* x^51200 mod p(x)` << 1, x^51264 mod p(x)` << 1 */ | ||
1769 | 1389 | + { 0x000000016155aa0c, 0x000000012fa83ff2 }, | ||
1770 | 1390 | + /* x^50176 mod p(x)` << 1, x^50240 mod p(x)` << 1 */ | ||
1771 | 1391 | + { 0x00000000908650ac, 0x000000011ca9cde0 }, | ||
1772 | 1392 | + /* x^49152 mod p(x)` << 1, x^49216 mod p(x)` << 1 */ | ||
1773 | 1393 | + { 0x00000000aa5a8084, 0x00000000c8e5cd74 }, | ||
1774 | 1394 | + /* x^48128 mod p(x)` << 1, x^48192 mod p(x)` << 1 */ | ||
1775 | 1395 | + { 0x0000000191bb500a, 0x0000000096c27f0c }, | ||
1776 | 1396 | + /* x^47104 mod p(x)` << 1, x^47168 mod p(x)` << 1 */ | ||
1777 | 1397 | + { 0x0000000064e9bed0, 0x000000002baed926 }, | ||
1778 | 1398 | + /* x^46080 mod p(x)` << 1, x^46144 mod p(x)` << 1 */ | ||
1779 | 1399 | + { 0x000000009444f302, 0x000000017c8de8d2 }, | ||
1780 | 1400 | + /* x^45056 mod p(x)` << 1, x^45120 mod p(x)` << 1 */ | ||
1781 | 1401 | + { 0x000000019db07d3c, 0x00000000d43d6068 }, | ||
1782 | 1402 | + /* x^44032 mod p(x)` << 1, x^44096 mod p(x)` << 1 */ | ||
1783 | 1403 | + { 0x00000001359e3e6e, 0x00000000cb2c4b26 }, | ||
1784 | 1404 | + /* x^43008 mod p(x)` << 1, x^43072 mod p(x)` << 1 */ | ||
1785 | 1405 | + { 0x00000001e4f10dd2, 0x0000000145b8da26 }, | ||
1786 | 1406 | + /* x^41984 mod p(x)` << 1, x^42048 mod p(x)` << 1 */ | ||
1787 | 1407 | + { 0x0000000124f5735e, 0x000000018fff4b08 }, | ||
1788 | 1408 | + /* x^40960 mod p(x)` << 1, x^41024 mod p(x)` << 1 */ | ||
1789 | 1409 | + { 0x0000000124760a4c, 0x0000000150b58ed0 }, | ||
1790 | 1410 | + /* x^39936 mod p(x)` << 1, x^40000 mod p(x)` << 1 */ | ||
1791 | 1411 | + { 0x000000000f1fc186, 0x00000001549f39bc }, | ||
1792 | 1412 | + /* x^38912 mod p(x)` << 1, x^38976 mod p(x)` << 1 */ | ||
1793 | 1413 | + { 0x00000000150e4cc4, 0x00000000ef4d2f42 }, | ||
1794 | 1414 | + /* x^37888 mod p(x)` << 1, x^37952 mod p(x)` << 1 */ | ||
1795 | 1415 | + { 0x000000002a6204e8, 0x00000001b1468572 }, | ||
1796 | 1416 | + /* x^36864 mod p(x)` << 1, x^36928 mod p(x)` << 1 */ | ||
1797 | 1417 | + { 0x00000000beb1d432, 0x000000013d7403b2 }, | ||
1798 | 1418 | + /* x^35840 mod p(x)` << 1, x^35904 mod p(x)` << 1 */ | ||
1799 | 1419 | + { 0x0000000135f3f1f0, 0x00000001a4681842 }, | ||
1800 | 1420 | + /* x^34816 mod p(x)` << 1, x^34880 mod p(x)` << 1 */ | ||
1801 | 1421 | + { 0x0000000074fe2232, 0x0000000167714492 }, | ||
1802 | 1422 | + /* x^33792 mod p(x)` << 1, x^33856 mod p(x)` << 1 */ | ||
1803 | 1423 | + { 0x000000001ac6e2ba, 0x00000001e599099a }, | ||
1804 | 1424 | + /* x^32768 mod p(x)` << 1, x^32832 mod p(x)` << 1 */ | ||
1805 | 1425 | + { 0x0000000013fca91e, 0x00000000fe128194 }, | ||
1806 | 1426 | + /* x^31744 mod p(x)` << 1, x^31808 mod p(x)` << 1 */ | ||
1807 | 1427 | + { 0x0000000183f4931e, 0x0000000077e8b990 }, | ||
1808 | 1428 | + /* x^30720 mod p(x)` << 1, x^30784 mod p(x)` << 1 */ | ||
1809 | 1429 | + { 0x00000000b6d9b4e4, 0x00000001a267f63a }, | ||
1810 | 1430 | + /* x^29696 mod p(x)` << 1, x^29760 mod p(x)` << 1 */ | ||
1811 | 1431 | + { 0x00000000b5188656, 0x00000001945c245a }, | ||
1812 | 1432 | + /* x^28672 mod p(x)` << 1, x^28736 mod p(x)` << 1 */ | ||
1813 | 1433 | + { 0x0000000027a81a84, 0x0000000149002e76 }, | ||
1814 | 1434 | + /* x^27648 mod p(x)` << 1, x^27712 mod p(x)` << 1 */ | ||
1815 | 1435 | + { 0x0000000125699258, 0x00000001bb8310a4 }, | ||
1816 | 1436 | + /* x^26624 mod p(x)` << 1, x^26688 mod p(x)` << 1 */ | ||
1817 | 1437 | + { 0x00000001b23de796, 0x000000019ec60bcc }, | ||
1818 | 1438 | + /* x^25600 mod p(x)` << 1, x^25664 mod p(x)` << 1 */ | ||
1819 | 1439 | + { 0x00000000fe4365dc, 0x000000012d8590ae }, | ||
1820 | 1440 | + /* x^24576 mod p(x)` << 1, x^24640 mod p(x)` << 1 */ | ||
1821 | 1441 | + { 0x00000000c68f497a, 0x0000000065b00684 }, | ||
1822 | 1442 | + /* x^23552 mod p(x)` << 1, x^23616 mod p(x)` << 1 */ | ||
1823 | 1443 | + { 0x00000000fbf521ee, 0x000000015e5aeadc }, | ||
1824 | 1444 | + /* x^22528 mod p(x)` << 1, x^22592 mod p(x)` << 1 */ | ||
1825 | 1445 | + { 0x000000015eac3378, 0x00000000b77ff2b0 }, | ||
1826 | 1446 | + /* x^21504 mod p(x)` << 1, x^21568 mod p(x)` << 1 */ | ||
1827 | 1447 | + { 0x0000000134914b90, 0x0000000188da2ff6 }, | ||
1828 | 1448 | + /* x^20480 mod p(x)` << 1, x^20544 mod p(x)` << 1 */ | ||
1829 | 1449 | + { 0x0000000016335cfe, 0x0000000063da929a }, | ||
1830 | 1450 | + /* x^19456 mod p(x)` << 1, x^19520 mod p(x)` << 1 */ | ||
1831 | 1451 | + { 0x000000010372d10c, 0x00000001389caa80 }, | ||
1832 | 1452 | + /* x^18432 mod p(x)` << 1, x^18496 mod p(x)` << 1 */ | ||
1833 | 1453 | + { 0x000000015097b908, 0x000000013db599d2 }, | ||
1834 | 1454 | + /* x^17408 mod p(x)` << 1, x^17472 mod p(x)` << 1 */ | ||
1835 | 1455 | + { 0x00000001227a7572, 0x0000000122505a86 }, | ||
1836 | 1456 | + /* x^16384 mod p(x)` << 1, x^16448 mod p(x)` << 1 */ | ||
1837 | 1457 | + { 0x000000009a8f75c0, 0x000000016bd72746 }, | ||
1838 | 1458 | + /* x^15360 mod p(x)` << 1, x^15424 mod p(x)` << 1 */ | ||
1839 | 1459 | + { 0x00000000682c77a2, 0x00000001c3faf1d4 }, | ||
1840 | 1460 | + /* x^14336 mod p(x)` << 1, x^14400 mod p(x)` << 1 */ | ||
1841 | 1461 | + { 0x00000000231f091c, 0x00000001111c826c }, | ||
1842 | 1462 | + /* x^13312 mod p(x)` << 1, x^13376 mod p(x)` << 1 */ | ||
1843 | 1463 | + { 0x000000007d4439f2, 0x00000000153e9fb2 }, | ||
1844 | 1464 | + /* x^12288 mod p(x)` << 1, x^12352 mod p(x)` << 1 */ | ||
1845 | 1465 | + { 0x000000017e221efc, 0x000000002b1f7b60 }, | ||
1846 | 1466 | + /* x^11264 mod p(x)` << 1, x^11328 mod p(x)` << 1 */ | ||
1847 | 1467 | + { 0x0000000167457c38, 0x00000000b1dba570 }, | ||
1848 | 1468 | + /* x^10240 mod p(x)` << 1, x^10304 mod p(x)` << 1 */ | ||
1849 | 1469 | + { 0x00000000bdf081c4, 0x00000001f6397b76 }, | ||
1850 | 1470 | + /* x^9216 mod p(x)` << 1, x^9280 mod p(x)` << 1 */ | ||
1851 | 1471 | + { 0x000000016286d6b0, 0x0000000156335214 }, | ||
1852 | 1472 | + /* x^8192 mod p(x)` << 1, x^8256 mod p(x)` << 1 */ | ||
1853 | 1473 | + { 0x00000000c84f001c, 0x00000001d70e3986 }, | ||
1854 | 1474 | + /* x^7168 mod p(x)` << 1, x^7232 mod p(x)` << 1 */ | ||
1855 | 1475 | + { 0x0000000064efe7c0, 0x000000003701a774 }, | ||
1856 | 1476 | + /* x^6144 mod p(x)` << 1, x^6208 mod p(x)` << 1 */ | ||
1857 | 1477 | + { 0x000000000ac2d904, 0x00000000ac81ef72 }, | ||
1858 | 1478 | + /* x^5120 mod p(x)` << 1, x^5184 mod p(x)` << 1 */ | ||
1859 | 1479 | + { 0x00000000fd226d14, 0x0000000133212464 }, | ||
1860 | 1480 | + /* x^4096 mod p(x)` << 1, x^4160 mod p(x)` << 1 */ | ||
1861 | 1481 | + { 0x000000011cfd42e0, 0x00000000e4e45610 }, | ||
1862 | 1482 | + /* x^3072 mod p(x)` << 1, x^3136 mod p(x)` << 1 */ | ||
1863 | 1483 | + { 0x000000016e5a5678, 0x000000000c1bd370 }, | ||
1864 | 1484 | + /* x^2048 mod p(x)` << 1, x^2112 mod p(x)` << 1 */ | ||
1865 | 1485 | + { 0x00000001d888fe22, 0x00000001a7b9e7a6 }, | ||
1866 | 1486 | + /* x^1024 mod p(x)` << 1, x^1088 mod p(x)` << 1 */ | ||
1867 | 1487 | + { 0x00000001af77fcd4, 0x000000007d657a10 } | ||
1868 | 1488 | +#endif /* __LITTLE_ENDIAN__ */ | ||
1869 | 1489 | + }; | ||
1870 | 1490 | + | ||
1871 | 1491 | +/* Reduce final 1024-2048 bits to 64 bits, shifting 32 bits to include the trailing 32 bits of zeros */ | ||
1872 | 1492 | + | ||
1873 | 1493 | +static const __vector unsigned long long vcrc_short_const[16] | ||
1874 | 1494 | + __attribute__((aligned (16))) = { | ||
1875 | 1495 | +#ifdef __LITTLE_ENDIAN__ | ||
1876 | 1496 | + /* x^1952 mod p(x) , x^1984 mod p(x) , x^2016 mod p(x) , x^2048 mod p(x) */ | ||
1877 | 1497 | + { 0x99168a18ec447f11, 0xed837b2613e8221e }, | ||
1878 | 1498 | + /* x^1824 mod p(x) , x^1856 mod p(x) , x^1888 mod p(x) , x^1920 mod p(x) */ | ||
1879 | 1499 | + { 0xe23e954e8fd2cd3c, 0xc8acdd8147b9ce5a }, | ||
1880 | 1500 | + /* x^1696 mod p(x) , x^1728 mod p(x) , x^1760 mod p(x) , x^1792 mod p(x) */ | ||
1881 | 1501 | + { 0x92f8befe6b1d2b53, 0xd9ad6d87d4277e25 }, | ||
1882 | 1502 | + /* x^1568 mod p(x) , x^1600 mod p(x) , x^1632 mod p(x) , x^1664 mod p(x) */ | ||
1883 | 1503 | + { 0xf38a3556291ea462, 0xc10ec5e033fbca3b }, | ||
1884 | 1504 | + /* x^1440 mod p(x) , x^1472 mod p(x) , x^1504 mod p(x) , x^1536 mod p(x) */ | ||
1885 | 1505 | + { 0x974ac56262b6ca4b, 0xc0b55b0e82e02e2f }, | ||
1886 | 1506 | + /* x^1312 mod p(x) , x^1344 mod p(x) , x^1376 mod p(x) , x^1408 mod p(x) */ | ||
1887 | 1507 | + { 0x855712b3784d2a56, 0x71aa1df0e172334d }, | ||
1888 | 1508 | + /* x^1184 mod p(x) , x^1216 mod p(x) , x^1248 mod p(x) , x^1280 mod p(x) */ | ||
1889 | 1509 | + { 0xa5abe9f80eaee722, 0xfee3053e3969324d }, | ||
1890 | 1510 | + /* x^1056 mod p(x) , x^1088 mod p(x) , x^1120 mod p(x) , x^1152 mod p(x) */ | ||
1891 | 1511 | + { 0x1fa0943ddb54814c, 0xf44779b93eb2bd08 }, | ||
1892 | 1512 | + /* x^928 mod p(x) , x^960 mod p(x) , x^992 mod p(x) , x^1024 mod p(x) */ | ||
1893 | 1513 | + { 0xa53ff440d7bbfe6a, 0xf5449b3f00cc3374 }, | ||
1894 | 1514 | + /* x^800 mod p(x) , x^832 mod p(x) , x^864 mod p(x) , x^896 mod p(x) */ | ||
1895 | 1515 | + { 0xebe7e3566325605c, 0x6f8346e1d777606e }, | ||
1896 | 1516 | + /* x^672 mod p(x) , x^704 mod p(x) , x^736 mod p(x) , x^768 mod p(x) */ | ||
1897 | 1517 | + { 0xc65a272ce5b592b8, 0xe3ab4f2ac0b95347 }, | ||
1898 | 1518 | + /* x^544 mod p(x) , x^576 mod p(x) , x^608 mod p(x) , x^640 mod p(x) */ | ||
1899 | 1519 | + { 0x5705a9ca4721589f, 0xaa2215ea329ecc11 }, | ||
1900 | 1520 | + /* x^416 mod p(x) , x^448 mod p(x) , x^480 mod p(x) , x^512 mod p(x) */ | ||
1901 | 1521 | + { 0xe3720acb88d14467, 0x1ed8f66ed95efd26 }, | ||
1902 | 1522 | + /* x^288 mod p(x) , x^320 mod p(x) , x^352 mod p(x) , x^384 mod p(x) */ | ||
1903 | 1523 | + { 0xba1aca0315141c31, 0x78ed02d5a700e96a }, | ||
1904 | 1524 | + /* x^160 mod p(x) , x^192 mod p(x) , x^224 mod p(x) , x^256 mod p(x) */ | ||
1905 | 1525 | + { 0xad2a31b3ed627dae, 0xba8ccbe832b39da3 }, | ||
1906 | 1526 | + /* x^32 mod p(x) , x^64 mod p(x) , x^96 mod p(x) , x^128 mod p(x) */ | ||
1907 | 1527 | + { 0x6655004fa06a2517, 0xedb88320b1e6b092 } | ||
1908 | 1528 | +#else /* __LITTLE_ENDIAN__ */ | ||
1909 | 1529 | + /* x^1952 mod p(x) , x^1984 mod p(x) , x^2016 mod p(x) , x^2048 mod p(x) */ | ||
1910 | 1530 | + { 0xed837b2613e8221e, 0x99168a18ec447f11 }, | ||
1911 | 1531 | + /* x^1824 mod p(x) , x^1856 mod p(x) , x^1888 mod p(x) , x^1920 mod p(x) */ | ||
1912 | 1532 | + { 0xc8acdd8147b9ce5a, 0xe23e954e8fd2cd3c }, | ||
1913 | 1533 | + /* x^1696 mod p(x) , x^1728 mod p(x) , x^1760 mod p(x) , x^1792 mod p(x) */ | ||
1914 | 1534 | + { 0xd9ad6d87d4277e25, 0x92f8befe6b1d2b53 }, | ||
1915 | 1535 | + /* x^1568 mod p(x) , x^1600 mod p(x) , x^1632 mod p(x) , x^1664 mod p(x) */ | ||
1916 | 1536 | + { 0xc10ec5e033fbca3b, 0xf38a3556291ea462 }, | ||
1917 | 1537 | + /* x^1440 mod p(x) , x^1472 mod p(x) , x^1504 mod p(x) , x^1536 mod p(x) */ | ||
1918 | 1538 | + { 0xc0b55b0e82e02e2f, 0x974ac56262b6ca4b }, | ||
1919 | 1539 | + /* x^1312 mod p(x) , x^1344 mod p(x) , x^1376 mod p(x) , x^1408 mod p(x) */ | ||
1920 | 1540 | + { 0x71aa1df0e172334d, 0x855712b3784d2a56 }, | ||
1921 | 1541 | + /* x^1184 mod p(x) , x^1216 mod p(x) , x^1248 mod p(x) , x^1280 mod p(x) */ | ||
1922 | 1542 | + { 0xfee3053e3969324d, 0xa5abe9f80eaee722 }, | ||
1923 | 1543 | + /* x^1056 mod p(x) , x^1088 mod p(x) , x^1120 mod p(x) , x^1152 mod p(x) */ | ||
1924 | 1544 | + { 0xf44779b93eb2bd08, 0x1fa0943ddb54814c }, | ||
1925 | 1545 | + /* x^928 mod p(x) , x^960 mod p(x) , x^992 mod p(x) , x^1024 mod p(x) */ | ||
1926 | 1546 | + { 0xf5449b3f00cc3374, 0xa53ff440d7bbfe6a }, | ||
1927 | 1547 | + /* x^800 mod p(x) , x^832 mod p(x) , x^864 mod p(x) , x^896 mod p(x) */ | ||
1928 | 1548 | + { 0x6f8346e1d777606e, 0xebe7e3566325605c }, | ||
1929 | 1549 | + /* x^672 mod p(x) , x^704 mod p(x) , x^736 mod p(x) , x^768 mod p(x) */ | ||
1930 | 1550 | + { 0xe3ab4f2ac0b95347, 0xc65a272ce5b592b8 }, | ||
1931 | 1551 | + /* x^544 mod p(x) , x^576 mod p(x) , x^608 mod p(x) , x^640 mod p(x) */ | ||
1932 | 1552 | + { 0xaa2215ea329ecc11, 0x5705a9ca4721589f }, | ||
1933 | 1553 | + /* x^416 mod p(x) , x^448 mod p(x) , x^480 mod p(x) , x^512 mod p(x) */ | ||
1934 | 1554 | + { 0x1ed8f66ed95efd26, 0xe3720acb88d14467 }, | ||
1935 | 1555 | + /* x^288 mod p(x) , x^320 mod p(x) , x^352 mod p(x) , x^384 mod p(x) */ | ||
1936 | 1556 | + { 0x78ed02d5a700e96a, 0xba1aca0315141c31 }, | ||
1937 | 1557 | + /* x^160 mod p(x) , x^192 mod p(x) , x^224 mod p(x) , x^256 mod p(x) */ | ||
1938 | 1558 | + { 0xba8ccbe832b39da3, 0xad2a31b3ed627dae }, | ||
1939 | 1559 | + /* x^32 mod p(x) , x^64 mod p(x) , x^96 mod p(x) , x^128 mod p(x) */ | ||
1940 | 1560 | + { 0xedb88320b1e6b092, 0x6655004fa06a2517 } | ||
1941 | 1561 | +#endif /* __LITTLE_ENDIAN__ */ | ||
1942 | 1562 | + }; | ||
1943 | 1563 | + | ||
1944 | 1564 | +/* Barrett constants */ | ||
1945 | 1565 | +/* 33 bit reflected Barrett constant m - (4^32)/n */ | ||
1946 | 1566 | + | ||
1947 | 1567 | +static const __vector unsigned long long v_Barrett_const[2] | ||
1948 | 1568 | + __attribute__((aligned (16))) = { | ||
1949 | 1569 | + /* x^64 div p(x) */ | ||
1950 | 1570 | +#ifdef __LITTLE_ENDIAN__ | ||
1951 | 1571 | + { 0x00000001f7011641, 0x0000000000000000 }, | ||
1952 | 1572 | + { 0x00000001db710641, 0x0000000000000000 } | ||
1953 | 1573 | +#else /* __LITTLE_ENDIAN__ */ | ||
1954 | 1574 | + { 0x0000000000000000, 0x00000001f7011641 }, | ||
1955 | 1575 | + { 0x0000000000000000, 0x00000001db710641 } | ||
1956 | 1576 | +#endif /* __LITTLE_ENDIAN__ */ | ||
1957 | 1577 | + }; | ||
1958 | 1578 | +#endif /* POWER8_INTRINSICS */ | ||
1959 | 1579 | + | ||
1960 | 1580 | +#endif /* __ASSEMBLER__ */ | ||
1961 | 1581 | diff --git a/contrib/power/crc32_z_power8.c b/contrib/power/crc32_z_power8.c | ||
1962 | 1582 | new file mode 100644 | ||
1963 | 1583 | index 0000000..7858cfe | ||
1964 | 1584 | --- /dev/null | ||
1965 | 1585 | +++ b/contrib/power/crc32_z_power8.c | ||
1966 | 1586 | @@ -0,0 +1,679 @@ | ||
1967 | 1587 | +/* | ||
1968 | 1588 | + * Calculate the checksum of data that is 16 byte aligned and a multiple of | ||
1969 | 1589 | + * 16 bytes. | ||
1970 | 1590 | + * | ||
1971 | 1591 | + * The first step is to reduce it to 1024 bits. We do this in 8 parallel | ||
1972 | 1592 | + * chunks in order to mask the latency of the vpmsum instructions. If we | ||
1973 | 1593 | + * have more than 32 kB of data to checksum we repeat this step multiple | ||
1974 | 1594 | + * times, passing in the previous 1024 bits. | ||
1975 | 1595 | + * | ||
1976 | 1596 | + * The next step is to reduce the 1024 bits to 64 bits. This step adds | ||
1977 | 1597 | + * 32 bits of 0s to the end - this matches what a CRC does. We just | ||
1978 | 1598 | + * calculate constants that land the data in this 32 bits. | ||
1979 | 1599 | + * | ||
1980 | 1600 | + * We then use fixed point Barrett reduction to compute a mod n over GF(2) | ||
1981 | 1601 | + * for n = CRC using POWER8 instructions. We use x = 32. | ||
1982 | 1602 | + * | ||
1983 | 1603 | + * http://en.wikipedia.org/wiki/Barrett_reduction | ||
1984 | 1604 | + * | ||
1985 | 1605 | + * This code uses gcc vector builtins instead of using assembly directly. | ||
1986 | 1606 | + * | ||
1987 | 1607 | + * Copyright (C) 2017 Rogerio Alves <rogealve@br.ibm.com>, IBM | ||
1988 | 1608 | + * | ||
1989 | 1609 | + * This program is free software; you can redistribute it and/or | ||
1990 | 1610 | + * modify it under the terms of either: | ||
1991 | 1611 | + * | ||
1992 | 1612 | + * a) the GNU General Public License as published by the Free Software | ||
1993 | 1613 | + * Foundation; either version 2 of the License, or (at your option) | ||
1994 | 1614 | + * any later version, or | ||
1995 | 1615 | + * b) the Apache License, Version 2.0 | ||
1996 | 1616 | + */ | ||
1997 | 1617 | + | ||
1998 | 1618 | +#include <altivec.h> | ||
1999 | 1619 | +#include "../../zutil.h" | ||
2000 | 1620 | +#include "power.h" | ||
2001 | 1621 | + | ||
2002 | 1622 | +#define POWER8_INTRINSICS | ||
2003 | 1623 | +#define CRC_TABLE | ||
2004 | 1624 | + | ||
2005 | 1625 | +#ifdef CRC32_CONSTANTS_HEADER | ||
2006 | 1626 | +#include CRC32_CONSTANTS_HEADER | ||
2007 | 1627 | +#else | ||
2008 | 1628 | +#include "crc32_constants.h" | ||
2009 | 1629 | +#endif | ||
2010 | 1630 | + | ||
2011 | 1631 | +#define VMX_ALIGN 16 | ||
2012 | 1632 | +#define VMX_ALIGN_MASK (VMX_ALIGN-1) | ||
2013 | 1633 | + | ||
2014 | 1634 | +#ifdef REFLECT | ||
2015 | 1635 | +static unsigned int crc32_align(unsigned int crc, const unsigned char *p, | ||
2016 | 1636 | + unsigned long len) | ||
2017 | 1637 | +{ | ||
2018 | 1638 | + while (len--) | ||
2019 | 1639 | + crc = crc_table[(crc ^ *p++) & 0xff] ^ (crc >> 8); | ||
2020 | 1640 | + return crc; | ||
2021 | 1641 | +} | ||
2022 | 1642 | +#else | ||
2023 | 1643 | +static unsigned int crc32_align(unsigned int crc, const unsigned char *p, | ||
2024 | 1644 | + unsigned long len) | ||
2025 | 1645 | +{ | ||
2026 | 1646 | + while (len--) | ||
2027 | 1647 | + crc = crc_table[((crc >> 24) ^ *p++) & 0xff] ^ (crc << 8); | ||
2028 | 1648 | + return crc; | ||
2029 | 1649 | +} | ||
2030 | 1650 | +#endif | ||
2031 | 1651 | + | ||
2032 | 1652 | +static unsigned int __attribute__ ((aligned (32))) | ||
2033 | 1653 | +__crc32_vpmsum(unsigned int crc, const void* p, unsigned long len); | ||
2034 | 1654 | + | ||
2035 | 1655 | +unsigned long ZLIB_INTERNAL _crc32_z_power8(uLong _crc, const Bytef *_p, | ||
2036 | 1656 | + z_size_t _len) | ||
2037 | 1657 | +{ | ||
2038 | 1658 | + unsigned int prealign; | ||
2039 | 1659 | + unsigned int tail; | ||
2040 | 1660 | + | ||
2041 | 1661 | + /* Map zlib API to crc32_vpmsum API */ | ||
2042 | 1662 | + unsigned int crc = (unsigned int) (0xffffffff & _crc); | ||
2043 | 1663 | + const unsigned char *p = _p; | ||
2044 | 1664 | + unsigned long len = (unsigned long) _len; | ||
2045 | 1665 | + | ||
2046 | 1666 | + if (p == (const unsigned char *) 0x0) return 0; | ||
2047 | 1667 | +#ifdef CRC_XOR | ||
2048 | 1668 | + crc ^= 0xffffffff; | ||
2049 | 1669 | +#endif | ||
2050 | 1670 | + | ||
2051 | 1671 | + if (len < VMX_ALIGN + VMX_ALIGN_MASK) { | ||
2052 | 1672 | + crc = crc32_align(crc, p, len); | ||
2053 | 1673 | + goto out; | ||
2054 | 1674 | + } | ||
2055 | 1675 | + | ||
2056 | 1676 | + if ((unsigned long)p & VMX_ALIGN_MASK) { | ||
2057 | 1677 | + prealign = VMX_ALIGN - ((unsigned long)p & VMX_ALIGN_MASK); | ||
2058 | 1678 | + crc = crc32_align(crc, p, prealign); | ||
2059 | 1679 | + len -= prealign; | ||
2060 | 1680 | + p += prealign; | ||
2061 | 1681 | + } | ||
2062 | 1682 | + | ||
2063 | 1683 | + crc = __crc32_vpmsum(crc, p, len & ~VMX_ALIGN_MASK); | ||
2064 | 1684 | + | ||
2065 | 1685 | + tail = len & VMX_ALIGN_MASK; | ||
2066 | 1686 | + if (tail) { | ||
2067 | 1687 | + p += len & ~VMX_ALIGN_MASK; | ||
2068 | 1688 | + crc = crc32_align(crc, p, tail); | ||
2069 | 1689 | + } | ||
2070 | 1690 | + | ||
2071 | 1691 | +out: | ||
2072 | 1692 | +#ifdef CRC_XOR | ||
2073 | 1693 | + crc ^= 0xffffffff; | ||
2074 | 1694 | +#endif | ||
2075 | 1695 | + | ||
2076 | 1696 | + /* Convert to zlib API */ | ||
2077 | 1697 | + return (unsigned long) crc; | ||
2078 | 1698 | +} | ||
2079 | 1699 | + | ||
2080 | 1700 | +#if defined (__clang__) | ||
2081 | 1701 | +#include "clang_workaround.h" | ||
2082 | 1702 | +#else | ||
2083 | 1703 | +#define __builtin_pack_vector(a, b) __builtin_pack_vector_int128 ((a), (b)) | ||
2084 | 1704 | +#define __builtin_unpack_vector_0(a) __builtin_unpack_vector_int128 ((vector __int128_t)(a), 0) | ||
2085 | 1705 | +#define __builtin_unpack_vector_1(a) __builtin_unpack_vector_int128 ((vector __int128_t)(a), 1) | ||
2086 | 1706 | +#endif | ||
2087 | 1707 | + | ||
2088 | 1708 | +/* When we have a load-store in a single-dispatch group and address overlap | ||
2089 | 1709 | + * such that forward is not allowed (load-hit-store) the group must be flushed. | ||
2090 | 1710 | + * A group ending NOP prevents the flush. | ||
2091 | 1711 | + */ | ||
2092 | 1712 | +#define GROUP_ENDING_NOP asm("ori 2,2,0" ::: "memory") | ||
2093 | 1713 | + | ||
2094 | 1714 | +#if defined(__BIG_ENDIAN__) && defined (REFLECT) | ||
2095 | 1715 | +#define BYTESWAP_DATA | ||
2096 | 1716 | +#elif defined(__LITTLE_ENDIAN__) && !defined(REFLECT) | ||
2097 | 1717 | +#define BYTESWAP_DATA | ||
2098 | 1718 | +#endif | ||
2099 | 1719 | + | ||
2100 | 1720 | +#ifdef BYTESWAP_DATA | ||
2101 | 1721 | +#define VEC_PERM(vr, va, vb, vc) vr = vec_perm(va, vb,\ | ||
2102 | 1722 | + (__vector unsigned char) vc) | ||
2103 | 1723 | +#if defined(__LITTLE_ENDIAN__) | ||
2104 | 1724 | +/* Byte reverse permute constant LE. */ | ||
2105 | 1725 | +static const __vector unsigned long long vperm_const | ||
2106 | 1726 | + __attribute__ ((aligned(16))) = { 0x08090A0B0C0D0E0FUL, | ||
2107 | 1727 | + 0x0001020304050607UL }; | ||
2108 | 1728 | +#else | ||
2109 | 1729 | +static const __vector unsigned long long vperm_const | ||
2110 | 1730 | + __attribute__ ((aligned(16))) = { 0x0F0E0D0C0B0A0908UL, | ||
2111 | 1731 | + 0X0706050403020100UL }; | ||
2112 | 1732 | +#endif | ||
2113 | 1733 | +#else | ||
2114 | 1734 | +#define VEC_PERM(vr, va, vb, vc) | ||
2115 | 1735 | +#endif | ||
2116 | 1736 | + | ||
2117 | 1737 | +static unsigned int __attribute__ ((aligned (32))) | ||
2118 | 1738 | +__crc32_vpmsum(unsigned int crc, const void* p, unsigned long len) { | ||
2119 | 1739 | + | ||
2120 | 1740 | + const __vector unsigned long long vzero = {0,0}; | ||
2121 | 1741 | + const __vector unsigned long long vones = {0xffffffffffffffffUL, | ||
2122 | 1742 | + 0xffffffffffffffffUL}; | ||
2123 | 1743 | + | ||
2124 | 1744 | +#ifdef REFLECT | ||
2125 | 1745 | + const __vector unsigned long long vmask_32bit = | ||
2126 | 1746 | + (__vector unsigned long long)vec_sld((__vector unsigned char)vzero, | ||
2127 | 1747 | + (__vector unsigned char)vones, 4); | ||
2128 | 1748 | +#endif | ||
2129 | 1749 | + | ||
2130 | 1750 | + const __vector unsigned long long vmask_64bit = | ||
2131 | 1751 | + (__vector unsigned long long)vec_sld((__vector unsigned char)vzero, | ||
2132 | 1752 | + (__vector unsigned char)vones, 8); | ||
2133 | 1753 | + | ||
2134 | 1754 | + __vector unsigned long long vcrc; | ||
2135 | 1755 | + | ||
2136 | 1756 | + __vector unsigned long long vconst1, vconst2; | ||
2137 | 1757 | + | ||
2138 | 1758 | + /* vdata0-vdata7 will contain our data (p). */ | ||
2139 | 1759 | + __vector unsigned long long vdata0, vdata1, vdata2, vdata3, vdata4, | ||
2140 | 1760 | + vdata5, vdata6, vdata7; | ||
2141 | 1761 | + | ||
2142 | 1762 | + /* v0-v7 will contain our checksums */ | ||
2143 | 1763 | + __vector unsigned long long v0 = {0,0}; | ||
2144 | 1764 | + __vector unsigned long long v1 = {0,0}; | ||
2145 | 1765 | + __vector unsigned long long v2 = {0,0}; | ||
2146 | 1766 | + __vector unsigned long long v3 = {0,0}; | ||
2147 | 1767 | + __vector unsigned long long v4 = {0,0}; | ||
2148 | 1768 | + __vector unsigned long long v5 = {0,0}; | ||
2149 | 1769 | + __vector unsigned long long v6 = {0,0}; | ||
2150 | 1770 | + __vector unsigned long long v7 = {0,0}; | ||
2151 | 1771 | + | ||
2152 | 1772 | + | ||
2153 | 1773 | + /* Vector auxiliary variables. */ | ||
2154 | 1774 | + __vector unsigned long long va0, va1, va2, va3, va4, va5, va6, va7; | ||
2155 | 1775 | + | ||
2156 | 1776 | + unsigned int result = 0; | ||
2157 | 1777 | + unsigned int offset; /* Constant table offset. */ | ||
2158 | 1778 | + | ||
2159 | 1779 | + unsigned long i; /* Counter. */ | ||
2160 | 1780 | + unsigned long chunks; | ||
2161 | 1781 | + | ||
2162 | 1782 | + unsigned long block_size; | ||
2163 | 1783 | + int next_block = 0; | ||
2164 | 1784 | + | ||
2165 | 1785 | + /* Align by 128 bytes. The last 128 byte block will be processed at end. */ | ||
2166 | 1786 | + unsigned long length = len & 0xFFFFFFFFFFFFFF80UL; | ||
2167 | 1787 | + | ||
2168 | 1788 | +#ifdef REFLECT | ||
2169 | 1789 | + vcrc = (__vector unsigned long long)__builtin_pack_vector(0UL, crc); | ||
2170 | 1790 | +#else | ||
2171 | 1791 | + vcrc = (__vector unsigned long long)__builtin_pack_vector(crc, 0UL); | ||
2172 | 1792 | + | ||
2173 | 1793 | + /* Shift into top 32 bits */ | ||
2174 | 1794 | + vcrc = (__vector unsigned long long)vec_sld((__vector unsigned char)vcrc, | ||
2175 | 1795 | + (__vector unsigned char)vzero, 4); | ||
2176 | 1796 | +#endif | ||
2177 | 1797 | + | ||
2178 | 1798 | + /* Short version. */ | ||
2179 | 1799 | + if (len < 256) { | ||
2180 | 1800 | + /* Calculate where in the constant table we need to start. */ | ||
2181 | 1801 | + offset = 256 - len; | ||
2182 | 1802 | + | ||
2183 | 1803 | + vconst1 = vec_ld(offset, vcrc_short_const); | ||
2184 | 1804 | + vdata0 = vec_ld(0, (__vector unsigned long long*) p); | ||
2185 | 1805 | + VEC_PERM(vdata0, vdata0, vconst1, vperm_const); | ||
2186 | 1806 | + | ||
2187 | 1807 | + /* xor initial value*/ | ||
2188 | 1808 | + vdata0 = vec_xor(vdata0, vcrc); | ||
2189 | 1809 | + | ||
2190 | 1810 | + vdata0 = (__vector unsigned long long) __builtin_crypto_vpmsumw | ||
2191 | 1811 | + ((__vector unsigned int)vdata0, (__vector unsigned int)vconst1); | ||
2192 | 1812 | + v0 = vec_xor(v0, vdata0); | ||
2193 | 1813 | + | ||
2194 | 1814 | + for (i = 16; i < len; i += 16) { | ||
2195 | 1815 | + vconst1 = vec_ld(offset + i, vcrc_short_const); | ||
2196 | 1816 | + vdata0 = vec_ld(i, (__vector unsigned long long*) p); | ||
2197 | 1817 | + VEC_PERM(vdata0, vdata0, vconst1, vperm_const); | ||
2198 | 1818 | + vdata0 = (__vector unsigned long long) __builtin_crypto_vpmsumw | ||
2199 | 1819 | + ((__vector unsigned int)vdata0, (__vector unsigned int)vconst1); | ||
2200 | 1820 | + v0 = vec_xor(v0, vdata0); | ||
2201 | 1821 | + } | ||
2202 | 1822 | + } else { | ||
2203 | 1823 | + | ||
2204 | 1824 | + /* Load initial values. */ | ||
2205 | 1825 | + vdata0 = vec_ld(0, (__vector unsigned long long*) p); | ||
2206 | 1826 | + vdata1 = vec_ld(16, (__vector unsigned long long*) p); | ||
2207 | 1827 | + | ||
2208 | 1828 | + VEC_PERM(vdata0, vdata0, vdata0, vperm_const); | ||
2209 | 1829 | + VEC_PERM(vdata1, vdata1, vdata1, vperm_const); | ||
2210 | 1830 | + | ||
2211 | 1831 | + vdata2 = vec_ld(32, (__vector unsigned long long*) p); | ||
2212 | 1832 | + vdata3 = vec_ld(48, (__vector unsigned long long*) p); | ||
2213 | 1833 | + | ||
2214 | 1834 | + VEC_PERM(vdata2, vdata2, vdata2, vperm_const); | ||
2215 | 1835 | + VEC_PERM(vdata3, vdata3, vdata3, vperm_const); | ||
2216 | 1836 | + | ||
2217 | 1837 | + vdata4 = vec_ld(64, (__vector unsigned long long*) p); | ||
2218 | 1838 | + vdata5 = vec_ld(80, (__vector unsigned long long*) p); | ||
2219 | 1839 | + | ||
2220 | 1840 | + VEC_PERM(vdata4, vdata4, vdata4, vperm_const); | ||
2221 | 1841 | + VEC_PERM(vdata5, vdata5, vdata5, vperm_const); | ||
2222 | 1842 | + | ||
2223 | 1843 | + vdata6 = vec_ld(96, (__vector unsigned long long*) p); | ||
2224 | 1844 | + vdata7 = vec_ld(112, (__vector unsigned long long*) p); | ||
2225 | 1845 | + | ||
2226 | 1846 | + VEC_PERM(vdata6, vdata6, vdata6, vperm_const); | ||
2227 | 1847 | + VEC_PERM(vdata7, vdata7, vdata7, vperm_const); | ||
2228 | 1848 | + | ||
2229 | 1849 | + /* xor in initial value */ | ||
2230 | 1850 | + vdata0 = vec_xor(vdata0, vcrc); | ||
2231 | 1851 | + | ||
2232 | 1852 | + p = (char *)p + 128; | ||
2233 | 1853 | + | ||
2234 | 1854 | + do { | ||
2235 | 1855 | + /* Checksum in blocks of MAX_SIZE. */ | ||
2236 | 1856 | + block_size = length; | ||
2237 | 1857 | + if (block_size > MAX_SIZE) { | ||
2238 | 1858 | + block_size = MAX_SIZE; | ||
2239 | 1859 | + } | ||
2240 | 1860 | + | ||
2241 | 1861 | + length = length - block_size; | ||
2242 | 1862 | + | ||
2243 | 1863 | + /* | ||
2244 | 1864 | + * Work out the offset into the constants table to start at. Each | ||
2245 | 1865 | + * constant is 16 bytes, and it is used against 128 bytes of input | ||
2246 | 1866 | + * data - 128 / 16 = 8 | ||
2247 | 1867 | + */ | ||
2248 | 1868 | + offset = (MAX_SIZE/8) - (block_size/8); | ||
2249 | 1869 | + /* We reduce our final 128 bytes in a separate step */ | ||
2250 | 1870 | + chunks = (block_size/128)-1; | ||
2251 | 1871 | + | ||
2252 | 1872 | + vconst1 = vec_ld(offset, vcrc_const); | ||
2253 | 1873 | + | ||
2254 | 1874 | + va0 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata0, | ||
2255 | 1875 | + (__vector unsigned long long)vconst1); | ||
2256 | 1876 | + va1 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata1, | ||
2257 | 1877 | + (__vector unsigned long long)vconst1); | ||
2258 | 1878 | + va2 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata2, | ||
2259 | 1879 | + (__vector unsigned long long)vconst1); | ||
2260 | 1880 | + va3 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata3, | ||
2261 | 1881 | + (__vector unsigned long long)vconst1); | ||
2262 | 1882 | + va4 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata4, | ||
2263 | 1883 | + (__vector unsigned long long)vconst1); | ||
2264 | 1884 | + va5 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata5, | ||
2265 | 1885 | + (__vector unsigned long long)vconst1); | ||
2266 | 1886 | + va6 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata6, | ||
2267 | 1887 | + (__vector unsigned long long)vconst1); | ||
2268 | 1888 | + va7 = __builtin_crypto_vpmsumd ((__vector unsigned long long)vdata7, | ||
2269 | 1889 | + (__vector unsigned long long)vconst1); | ||
2270 | 1890 | + | ||
2271 | 1891 | + if (chunks > 1) { | ||
2272 | 1892 | + offset += 16; | ||
2273 | 1893 | + vconst2 = vec_ld(offset, vcrc_const); | ||
2274 | 1894 | + GROUP_ENDING_NOP; | ||
2275 | 1895 | + | ||
2276 | 1896 | + vdata0 = vec_ld(0, (__vector unsigned long long*) p); | ||
2277 | 1897 | + VEC_PERM(vdata0, vdata0, vdata0, vperm_const); | ||
2278 | 1898 | + | ||
2279 | 1899 | + vdata1 = vec_ld(16, (__vector unsigned long long*) p); | ||
2280 | 1900 | + VEC_PERM(vdata1, vdata1, vdata1, vperm_const); | ||
2281 | 1901 | + | ||
2282 | 1902 | + vdata2 = vec_ld(32, (__vector unsigned long long*) p); | ||
2283 | 1903 | + VEC_PERM(vdata2, vdata2, vdata2, vperm_const); | ||
2284 | 1904 | + | ||
2285 | 1905 | + vdata3 = vec_ld(48, (__vector unsigned long long*) p); | ||
2286 | 1906 | + VEC_PERM(vdata3, vdata3, vdata3, vperm_const); | ||
2287 | 1907 | + | ||
2288 | 1908 | + vdata4 = vec_ld(64, (__vector unsigned long long*) p); | ||
2289 | 1909 | + VEC_PERM(vdata4, vdata4, vdata4, vperm_const); | ||
2290 | 1910 | + | ||
2291 | 1911 | + vdata5 = vec_ld(80, (__vector unsigned long long*) p); | ||
2292 | 1912 | + VEC_PERM(vdata5, vdata5, vdata5, vperm_const); | ||
2293 | 1913 | + | ||
2294 | 1914 | + vdata6 = vec_ld(96, (__vector unsigned long long*) p); | ||
2295 | 1915 | + VEC_PERM(vdata6, vdata6, vdata6, vperm_const); | ||
2296 | 1916 | + | ||
2297 | 1917 | + vdata7 = vec_ld(112, (__vector unsigned long long*) p); | ||
2298 | 1918 | + VEC_PERM(vdata7, vdata7, vdata7, vperm_const); | ||
2299 | 1919 | + | ||
2300 | 1920 | + p = (char *)p + 128; | ||
2301 | 1921 | + | ||
2302 | 1922 | + /* | ||
2303 | 1923 | + * main loop. We modulo schedule it such that it takes three | ||
2304 | 1924 | + * iterations to complete - first iteration load, second | ||
2305 | 1925 | + * iteration vpmsum, third iteration xor. | ||
2306 | 1926 | + */ | ||
2307 | 1927 | + for (i = 0; i < chunks-2; i++) { | ||
2308 | 1928 | + vconst1 = vec_ld(offset, vcrc_const); | ||
2309 | 1929 | + offset += 16; | ||
2310 | 1930 | + GROUP_ENDING_NOP; | ||
2311 | 1931 | + | ||
2312 | 1932 | + v0 = vec_xor(v0, va0); | ||
2313 | 1933 | + va0 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2314 | 1934 | + long)vdata0, (__vector unsigned long long)vconst2); | ||
2315 | 1935 | + vdata0 = vec_ld(0, (__vector unsigned long long*) p); | ||
2316 | 1936 | + VEC_PERM(vdata0, vdata0, vdata0, vperm_const); | ||
2317 | 1937 | + GROUP_ENDING_NOP; | ||
2318 | 1938 | + | ||
2319 | 1939 | + v1 = vec_xor(v1, va1); | ||
2320 | 1940 | + va1 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2321 | 1941 | + long)vdata1, (__vector unsigned long long)vconst2); | ||
2322 | 1942 | + vdata1 = vec_ld(16, (__vector unsigned long long*) p); | ||
2323 | 1943 | + VEC_PERM(vdata1, vdata1, vdata1, vperm_const); | ||
2324 | 1944 | + GROUP_ENDING_NOP; | ||
2325 | 1945 | + | ||
2326 | 1946 | + v2 = vec_xor(v2, va2); | ||
2327 | 1947 | + va2 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2328 | 1948 | + long)vdata2, (__vector unsigned long long)vconst2); | ||
2329 | 1949 | + vdata2 = vec_ld(32, (__vector unsigned long long*) p); | ||
2330 | 1950 | + VEC_PERM(vdata2, vdata2, vdata2, vperm_const); | ||
2331 | 1951 | + GROUP_ENDING_NOP; | ||
2332 | 1952 | + | ||
2333 | 1953 | + v3 = vec_xor(v3, va3); | ||
2334 | 1954 | + va3 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2335 | 1955 | + long)vdata3, (__vector unsigned long long)vconst2); | ||
2336 | 1956 | + vdata3 = vec_ld(48, (__vector unsigned long long*) p); | ||
2337 | 1957 | + VEC_PERM(vdata3, vdata3, vdata3, vperm_const); | ||
2338 | 1958 | + | ||
2339 | 1959 | + vconst2 = vec_ld(offset, vcrc_const); | ||
2340 | 1960 | + GROUP_ENDING_NOP; | ||
2341 | 1961 | + | ||
2342 | 1962 | + v4 = vec_xor(v4, va4); | ||
2343 | 1963 | + va4 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2344 | 1964 | + long)vdata4, (__vector unsigned long long)vconst1); | ||
2345 | 1965 | + vdata4 = vec_ld(64, (__vector unsigned long long*) p); | ||
2346 | 1966 | + VEC_PERM(vdata4, vdata4, vdata4, vperm_const); | ||
2347 | 1967 | + GROUP_ENDING_NOP; | ||
2348 | 1968 | + | ||
2349 | 1969 | + v5 = vec_xor(v5, va5); | ||
2350 | 1970 | + va5 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2351 | 1971 | + long)vdata5, (__vector unsigned long long)vconst1); | ||
2352 | 1972 | + vdata5 = vec_ld(80, (__vector unsigned long long*) p); | ||
2353 | 1973 | + VEC_PERM(vdata5, vdata5, vdata5, vperm_const); | ||
2354 | 1974 | + GROUP_ENDING_NOP; | ||
2355 | 1975 | + | ||
2356 | 1976 | + v6 = vec_xor(v6, va6); | ||
2357 | 1977 | + va6 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2358 | 1978 | + long)vdata6, (__vector unsigned long long)vconst1); | ||
2359 | 1979 | + vdata6 = vec_ld(96, (__vector unsigned long long*) p); | ||
2360 | 1980 | + VEC_PERM(vdata6, vdata6, vdata6, vperm_const); | ||
2361 | 1981 | + GROUP_ENDING_NOP; | ||
2362 | 1982 | + | ||
2363 | 1983 | + v7 = vec_xor(v7, va7); | ||
2364 | 1984 | + va7 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2365 | 1985 | + long)vdata7, (__vector unsigned long long)vconst1); | ||
2366 | 1986 | + vdata7 = vec_ld(112, (__vector unsigned long long*) p); | ||
2367 | 1987 | + VEC_PERM(vdata7, vdata7, vdata7, vperm_const); | ||
2368 | 1988 | + | ||
2369 | 1989 | + p = (char *)p + 128; | ||
2370 | 1990 | + } | ||
2371 | 1991 | + | ||
2372 | 1992 | + /* First cool down*/ | ||
2373 | 1993 | + vconst1 = vec_ld(offset, vcrc_const); | ||
2374 | 1994 | + offset += 16; | ||
2375 | 1995 | + | ||
2376 | 1996 | + v0 = vec_xor(v0, va0); | ||
2377 | 1997 | + va0 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2378 | 1998 | + long)vdata0, (__vector unsigned long long)vconst1); | ||
2379 | 1999 | + GROUP_ENDING_NOP; | ||
2380 | 2000 | + | ||
2381 | 2001 | + v1 = vec_xor(v1, va1); | ||
2382 | 2002 | + va1 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2383 | 2003 | + long)vdata1, (__vector unsigned long long)vconst1); | ||
2384 | 2004 | + GROUP_ENDING_NOP; | ||
2385 | 2005 | + | ||
2386 | 2006 | + v2 = vec_xor(v2, va2); | ||
2387 | 2007 | + va2 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2388 | 2008 | + long)vdata2, (__vector unsigned long long)vconst1); | ||
2389 | 2009 | + GROUP_ENDING_NOP; | ||
2390 | 2010 | + | ||
2391 | 2011 | + v3 = vec_xor(v3, va3); | ||
2392 | 2012 | + va3 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2393 | 2013 | + long)vdata3, (__vector unsigned long long)vconst1); | ||
2394 | 2014 | + GROUP_ENDING_NOP; | ||
2395 | 2015 | + | ||
2396 | 2016 | + v4 = vec_xor(v4, va4); | ||
2397 | 2017 | + va4 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2398 | 2018 | + long)vdata4, (__vector unsigned long long)vconst1); | ||
2399 | 2019 | + GROUP_ENDING_NOP; | ||
2400 | 2020 | + | ||
2401 | 2021 | + v5 = vec_xor(v5, va5); | ||
2402 | 2022 | + va5 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2403 | 2023 | + long)vdata5, (__vector unsigned long long)vconst1); | ||
2404 | 2024 | + GROUP_ENDING_NOP; | ||
2405 | 2025 | + | ||
2406 | 2026 | + v6 = vec_xor(v6, va6); | ||
2407 | 2027 | + va6 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2408 | 2028 | + long)vdata6, (__vector unsigned long long)vconst1); | ||
2409 | 2029 | + GROUP_ENDING_NOP; | ||
2410 | 2030 | + | ||
2411 | 2031 | + v7 = vec_xor(v7, va7); | ||
2412 | 2032 | + va7 = __builtin_crypto_vpmsumd ((__vector unsigned long | ||
2413 | 2033 | + long)vdata7, (__vector unsigned long long)vconst1); | ||
2414 | 2034 | + }/* else */ | ||
2415 | 2035 | + | ||
2416 | 2036 | + /* Second cool down. */ | ||
2417 | 2037 | + v0 = vec_xor(v0, va0); | ||
2418 | 2038 | + v1 = vec_xor(v1, va1); | ||
2419 | 2039 | + v2 = vec_xor(v2, va2); | ||
2420 | 2040 | + v3 = vec_xor(v3, va3); | ||
2421 | 2041 | + v4 = vec_xor(v4, va4); | ||
2422 | 2042 | + v5 = vec_xor(v5, va5); | ||
2423 | 2043 | + v6 = vec_xor(v6, va6); | ||
2424 | 2044 | + v7 = vec_xor(v7, va7); | ||
2425 | 2045 | + | ||
2426 | 2046 | +#ifdef REFLECT | ||
2427 | 2047 | + /* | ||
2428 | 2048 | + * vpmsumd produces a 96 bit result in the least significant bits | ||
2429 | 2049 | + * of the register. Since we are bit reflected we have to shift it | ||
2430 | 2050 | + * left 32 bits so it occupies the least significant bits in the | ||
2431 | 2051 | + * bit reflected domain. | ||
2432 | 2052 | + */ | ||
2433 | 2053 | + v0 = (__vector unsigned long long)vec_sld((__vector unsigned char)v0, | ||
2434 | 2054 | + (__vector unsigned char)vzero, 4); | ||
2435 | 2055 | + v1 = (__vector unsigned long long)vec_sld((__vector unsigned char)v1, | ||
2436 | 2056 | + (__vector unsigned char)vzero, 4); | ||
2437 | 2057 | + v2 = (__vector unsigned long long)vec_sld((__vector unsigned char)v2, | ||
2438 | 2058 | + (__vector unsigned char)vzero, 4); | ||
2439 | 2059 | + v3 = (__vector unsigned long long)vec_sld((__vector unsigned char)v3, | ||
2440 | 2060 | + (__vector unsigned char)vzero, 4); | ||
2441 | 2061 | + v4 = (__vector unsigned long long)vec_sld((__vector unsigned char)v4, | ||
2442 | 2062 | + (__vector unsigned char)vzero, 4); | ||
2443 | 2063 | + v5 = (__vector unsigned long long)vec_sld((__vector unsigned char)v5, | ||
2444 | 2064 | + (__vector unsigned char)vzero, 4); | ||
2445 | 2065 | + v6 = (__vector unsigned long long)vec_sld((__vector unsigned char)v6, | ||
2446 | 2066 | + (__vector unsigned char)vzero, 4); | ||
2447 | 2067 | + v7 = (__vector unsigned long long)vec_sld((__vector unsigned char)v7, | ||
2448 | 2068 | + (__vector unsigned char)vzero, 4); | ||
2449 | 2069 | +#endif | ||
2450 | 2070 | + | ||
2451 | 2071 | + /* xor with the last 1024 bits. */ | ||
2452 | 2072 | + va0 = vec_ld(0, (__vector unsigned long long*) p); | ||
2453 | 2073 | + VEC_PERM(va0, va0, va0, vperm_const); | ||
2454 | 2074 | + | ||
2455 | 2075 | + va1 = vec_ld(16, (__vector unsigned long long*) p); | ||
2456 | 2076 | + VEC_PERM(va1, va1, va1, vperm_const); | ||
2457 | 2077 | + | ||
2458 | 2078 | + va2 = vec_ld(32, (__vector unsigned long long*) p); | ||
2459 | 2079 | + VEC_PERM(va2, va2, va2, vperm_const); | ||
2460 | 2080 | + | ||
2461 | 2081 | + va3 = vec_ld(48, (__vector unsigned long long*) p); | ||
2462 | 2082 | + VEC_PERM(va3, va3, va3, vperm_const); | ||
2463 | 2083 | + | ||
2464 | 2084 | + va4 = vec_ld(64, (__vector unsigned long long*) p); | ||
2465 | 2085 | + VEC_PERM(va4, va4, va4, vperm_const); | ||
2466 | 2086 | + | ||
2467 | 2087 | + va5 = vec_ld(80, (__vector unsigned long long*) p); | ||
2468 | 2088 | + VEC_PERM(va5, va5, va5, vperm_const); | ||
2469 | 2089 | + | ||
2470 | 2090 | + va6 = vec_ld(96, (__vector unsigned long long*) p); | ||
2471 | 2091 | + VEC_PERM(va6, va6, va6, vperm_const); | ||
2472 | 2092 | + | ||
2473 | 2093 | + va7 = vec_ld(112, (__vector unsigned long long*) p); | ||
2474 | 2094 | + VEC_PERM(va7, va7, va7, vperm_const); | ||
2475 | 2095 | + | ||
2476 | 2096 | + p = (char *)p + 128; | ||
2477 | 2097 | + | ||
2478 | 2098 | + vdata0 = vec_xor(v0, va0); | ||
2479 | 2099 | + vdata1 = vec_xor(v1, va1); | ||
2480 | 2100 | + vdata2 = vec_xor(v2, va2); | ||
2481 | 2101 | + vdata3 = vec_xor(v3, va3); | ||
2482 | 2102 | + vdata4 = vec_xor(v4, va4); | ||
2483 | 2103 | + vdata5 = vec_xor(v5, va5); | ||
2484 | 2104 | + vdata6 = vec_xor(v6, va6); | ||
2485 | 2105 | + vdata7 = vec_xor(v7, va7); | ||
2486 | 2106 | + | ||
2487 | 2107 | + /* Check if we have more blocks to process */ | ||
2488 | 2108 | + next_block = 0; | ||
2489 | 2109 | + if (length != 0) { | ||
2490 | 2110 | + next_block = 1; | ||
2491 | 2111 | + | ||
2492 | 2112 | + /* zero v0-v7 */ | ||
2493 | 2113 | + v0 = vec_xor(v0, v0); | ||
2494 | 2114 | + v1 = vec_xor(v1, v1); | ||
2495 | 2115 | + v2 = vec_xor(v2, v2); | ||
2496 | 2116 | + v3 = vec_xor(v3, v3); | ||
2497 | 2117 | + v4 = vec_xor(v4, v4); | ||
2498 | 2118 | + v5 = vec_xor(v5, v5); | ||
2499 | 2119 | + v6 = vec_xor(v6, v6); | ||
2500 | 2120 | + v7 = vec_xor(v7, v7); | ||
2501 | 2121 | + } | ||
2502 | 2122 | + length = length + 128; | ||
2503 | 2123 | + | ||
2504 | 2124 | + } while (next_block); | ||
2505 | 2125 | + | ||
2506 | 2126 | + /* Calculate how many bytes we have left. */ | ||
2507 | 2127 | + length = (len & 127); | ||
2508 | 2128 | + | ||
2509 | 2129 | + /* Calculate where in (short) constant table we need to start. */ | ||
2510 | 2130 | + offset = 128 - length; | ||
2511 | 2131 | + | ||
2512 | 2132 | + v0 = vec_ld(offset, vcrc_short_const); | ||
2513 | 2133 | + v1 = vec_ld(offset + 16, vcrc_short_const); | ||
2514 | 2134 | + v2 = vec_ld(offset + 32, vcrc_short_const); | ||
2515 | 2135 | + v3 = vec_ld(offset + 48, vcrc_short_const); | ||
2516 | 2136 | + v4 = vec_ld(offset + 64, vcrc_short_const); | ||
2517 | 2137 | + v5 = vec_ld(offset + 80, vcrc_short_const); | ||
2518 | 2138 | + v6 = vec_ld(offset + 96, vcrc_short_const); | ||
2519 | 2139 | + v7 = vec_ld(offset + 112, vcrc_short_const); | ||
2520 | 2140 | + | ||
2521 | 2141 | + offset += 128; | ||
2522 | 2142 | + | ||
2523 | 2143 | + v0 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( | ||
2524 | 2144 | + (__vector unsigned int)vdata0,(__vector unsigned int)v0); | ||
2525 | 2145 | + v1 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( | ||
2526 | 2146 | + (__vector unsigned int)vdata1,(__vector unsigned int)v1); | ||
2527 | 2147 | + v2 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( | ||
2528 | 2148 | + (__vector unsigned int)vdata2,(__vector unsigned int)v2); | ||
2529 | 2149 | + v3 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( | ||
2530 | 2150 | + (__vector unsigned int)vdata3,(__vector unsigned int)v3); | ||
2531 | 2151 | + v4 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( | ||
2532 | 2152 | + (__vector unsigned int)vdata4,(__vector unsigned int)v4); | ||
2533 | 2153 | + v5 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( | ||
2534 | 2154 | + (__vector unsigned int)vdata5,(__vector unsigned int)v5); | ||
2535 | 2155 | + v6 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( | ||
2536 | 2156 | + (__vector unsigned int)vdata6,(__vector unsigned int)v6); | ||
2537 | 2157 | + v7 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( | ||
2538 | 2158 | + (__vector unsigned int)vdata7,(__vector unsigned int)v7); | ||
2539 | 2159 | + | ||
2540 | 2160 | + /* Now reduce the tail (0-112 bytes). */ | ||
2541 | 2161 | + for (i = 0; i < length; i+=16) { | ||
2542 | 2162 | + vdata0 = vec_ld(i,(__vector unsigned long long*)p); | ||
2543 | 2163 | + VEC_PERM(vdata0, vdata0, vdata0, vperm_const); | ||
2544 | 2164 | + va0 = vec_ld(offset + i,vcrc_short_const); | ||
2545 | 2165 | + va0 = (__vector unsigned long long)__builtin_crypto_vpmsumw ( | ||
2546 | 2166 | + (__vector unsigned int)vdata0,(__vector unsigned int)va0); | ||
2547 | 2167 | + v0 = vec_xor(v0, va0); | ||
2548 | 2168 | + } | ||
2549 | 2169 | + | ||
2550 | 2170 | + /* xor all parallel chunks together. */ | ||
2551 | 2171 | + v0 = vec_xor(v0, v1); | ||
2552 | 2172 | + v2 = vec_xor(v2, v3); | ||
2553 | 2173 | + v4 = vec_xor(v4, v5); | ||
2554 | 2174 | + v6 = vec_xor(v6, v7); | ||
2555 | 2175 | + | ||
2556 | 2176 | + v0 = vec_xor(v0, v2); | ||
2557 | 2177 | + v4 = vec_xor(v4, v6); | ||
2558 | 2178 | + | ||
2559 | 2179 | + v0 = vec_xor(v0, v4); | ||
2560 | 2180 | + } | ||
2561 | 2181 | + | ||
2562 | 2182 | + /* Barrett Reduction */ | ||
2563 | 2183 | + vconst1 = vec_ld(0, v_Barrett_const); | ||
2564 | 2184 | + vconst2 = vec_ld(16, v_Barrett_const); | ||
2565 | 2185 | + | ||
2566 | 2186 | + v1 = (__vector unsigned long long)vec_sld((__vector unsigned char)v0, | ||
2567 | 2187 | + (__vector unsigned char)v0, 8); | ||
2568 | 2188 | + v0 = vec_xor(v1,v0); | ||
2569 | 2189 | + | ||
2570 | 2190 | +#ifdef REFLECT | ||
2571 | 2191 | + /* shift left one bit */ | ||
2572 | 2192 | + __vector unsigned char vsht_splat = vec_splat_u8 (1); | ||
2573 | 2193 | + v0 = (__vector unsigned long long)vec_sll ((__vector unsigned char)v0, | ||
2574 | 2194 | + vsht_splat); | ||
2575 | 2195 | +#endif | ||
2576 | 2196 | + | ||
2577 | 2197 | + v0 = vec_and(v0, vmask_64bit); | ||
2578 | 2198 | + | ||
2579 | 2199 | +#ifndef REFLECT | ||
2580 | 2200 | + | ||
2581 | 2201 | + /* | ||
2582 | 2202 | + * Now for the actual algorithm. The idea is to calculate q, | ||
2583 | 2203 | + * the multiple of our polynomial that we need to subtract. By | ||
2584 | 2204 | + * doing the computation 2x bits higher (ie 64 bits) and shifting the | ||
2585 | 2205 | + * result back down 2x bits, we round down to the nearest multiple. | ||
2586 | 2206 | + */ | ||
2587 | 2207 | + | ||
2588 | 2208 | + /* ma */ | ||
2589 | 2209 | + v1 = __builtin_crypto_vpmsumd ((__vector unsigned long long)v0, | ||
2590 | 2210 | + (__vector unsigned long long)vconst1); | ||
2591 | 2211 | + /* q = floor(ma/(2^64)) */ | ||
2592 | 2212 | + v1 = (__vector unsigned long long)vec_sld ((__vector unsigned char)vzero, | ||
2593 | 2213 | + (__vector unsigned char)v1, 8); | ||
2594 | 2214 | + /* qn */ | ||
2595 | 2215 | + v1 = __builtin_crypto_vpmsumd ((__vector unsigned long long)v1, | ||
2596 | 2216 | + (__vector unsigned long long)vconst2); | ||
2597 | 2217 | + /* a - qn, subtraction is xor in GF(2) */ | ||
2598 | 2218 | + v0 = vec_xor (v0, v1); | ||
2599 | 2219 | + /* | ||
2600 | 2220 | + * Get the result into r3. We need to shift it left 8 bytes: | ||
2601 | 2221 | + * V0 [ 0 1 2 X ] | ||
2602 | 2222 | + * V0 [ 0 X 2 3 ] | ||
2603 | 2223 | + */ | ||
2604 | 2224 | + result = __builtin_unpack_vector_1 (v0); | ||
2605 | 2225 | +#else | ||
2606 | 2226 | + | ||
2607 | 2227 | + /* | ||
2608 | 2228 | + * The reflected version of Barrett reduction. Instead of bit | ||
2609 | 2229 | + * reflecting our data (which is expensive to do), we bit reflect our | ||
2610 | 2230 | + * constants and our algorithm, which means the intermediate data in | ||
2611 | 2231 | + * our vector registers goes from 0-63 instead of 63-0. We can reflect | ||
2612 | 2232 | + * the algorithm because we don't carry in mod 2 arithmetic. | ||
2613 | 2233 | + */ | ||
2614 | 2234 | + | ||
2615 | 2235 | + /* bottom 32 bits of a */ | ||
2616 | 2236 | + v1 = vec_and(v0, vmask_32bit); | ||
2617 | 2237 | + | ||
2618 | 2238 | + /* ma */ | ||
2619 | 2239 | + v1 = __builtin_crypto_vpmsumd ((__vector unsigned long long)v1, | ||
2620 | 2240 | + (__vector unsigned long long)vconst1); | ||
2621 | 2241 | + | ||
2622 | 2242 | + /* bottom 32 bits of ma */ | |
2623 | 2243 | + v1 = vec_and(v1, vmask_32bit); | ||
2624 | 2244 | + /* qn */ | ||
2625 | 2245 | + v1 = __builtin_crypto_vpmsumd ((__vector unsigned long long)v1, | ||
2626 | 2246 | + (__vector unsigned long long)vconst2); | ||
2627 | 2247 | + /* a - qn, subtraction is xor in GF(2) */ | ||
2628 | 2248 | + v0 = vec_xor (v0, v1); | ||
2629 | 2249 | + | ||
2630 | 2250 | + /* | ||
2631 | 2251 | + * Since we are bit reflected, the result (ie the low 32 bits) is in | ||
2632 | 2252 | + * the high 32 bits. We just need to shift it left 4 bytes | ||
2633 | 2253 | + * V0 [ 0 1 X 3 ] | ||
2634 | 2254 | + * V0 [ 0 X 2 3 ] | ||
2635 | 2255 | + */ | ||
2636 | 2256 | + | ||
2637 | 2257 | + /* shift result into top 64 bits of v0 */ | |
2638 | 2258 | + v0 = (__vector unsigned long long)vec_sld((__vector unsigned char)v0, | ||
2639 | 2259 | + (__vector unsigned char)vzero, 4); | ||
2640 | 2260 | + | ||
2641 | 2261 | + result = __builtin_unpack_vector_0 (v0); | ||
2642 | 2262 | +#endif | ||
2643 | 2263 | + | ||
2644 | 2264 | + return result; | ||
2645 | 2265 | +} | ||
2646 | 2266 | diff --git a/contrib/power/crc32_z_resolver.c b/contrib/power/crc32_z_resolver.c | ||
2647 | 2267 | new file mode 100644 | ||
2648 | 2268 | index 0000000..f4e9aa4 | ||
2649 | 2269 | --- /dev/null | ||
2650 | 2270 | +++ b/contrib/power/crc32_z_resolver.c | ||
2651 | 2271 | @@ -0,0 +1,15 @@ | ||
2652 | 2272 | +/* Copyright (C) 2019 Matheus Castanho <msc@linux.ibm.com>, IBM | ||
2653 | 2273 | + * For conditions of distribution and use, see copyright notice in zlib.h | ||
2654 | 2274 | + */ | ||
2655 | 2275 | + | ||
2656 | 2276 | +#include "../gcc/zifunc.h" | ||
2657 | 2277 | +#include "power.h" | ||
2658 | 2278 | + | ||
2659 | 2279 | +Z_IFUNC(crc32_z) { | ||
2660 | 2280 | +#ifdef Z_POWER8 | ||
2661 | 2281 | + if (__builtin_cpu_supports("arch_2_07")) | ||
2662 | 2282 | + return _crc32_z_power8; | ||
2663 | 2283 | +#endif | ||
2664 | 2284 | + | ||
2665 | 2285 | + return crc32_z_default; | ||
2666 | 2286 | +} | ||
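The resolver above picks an implementation once, based on a CPU feature check, and falls back to the default otherwise. A minimal portable sketch of the same idea follows, using a resolve-once function pointer instead of GNU indirect functions. Everything in it is an illustrative assumption: `checksum`, `sum_generic`, `sum_unrolled` and `cpu_has_fast_path` are hypothetical names, and the byte-sum stands in for the real CRC code. It is not how `Z_IFUNC` itself is implemented.

```c
#include <stddef.h>

typedef unsigned long (*sum_fn)(const unsigned char *, size_t);

/* Baseline implementation: always available. */
static unsigned long sum_generic(const unsigned char *buf, size_t len) {
    unsigned long s = 0;
    for (size_t i = 0; i < len; i++)
        s += buf[i];
    return s;
}

/* Stand-in for an optimized variant: same result, four bytes per step. */
static unsigned long sum_unrolled(const unsigned char *buf, size_t len) {
    unsigned long s = 0;
    size_t i = 0;
    for (; i + 4 <= len; i += 4)
        s += (unsigned long)buf[i] + buf[i + 1] + buf[i + 2] + buf[i + 3];
    for (; i < len; i++)
        s += buf[i];
    return s;
}

/* Hypothetical feature probe. The patch calls
 * __builtin_cpu_supports("arch_2_07") at this point instead. */
static int cpu_has_fast_path(void) { return 0; }

static unsigned long sum_resolve(const unsigned char *buf, size_t len);

/* Starts at the resolver; rebinds to the chosen implementation on first use. */
static sum_fn sum_impl = sum_resolve;

static unsigned long sum_resolve(const unsigned char *buf, size_t len) {
    sum_impl = cpu_has_fast_path() ? sum_unrolled : sum_generic;
    return sum_impl(buf, len);
}

/* Public entry point: callers never see which variant was selected. */
unsigned long checksum(const unsigned char *buf, size_t len) {
    return sum_impl(buf, len);
}
```

With a real ifunc the dynamic loader performs the rebinding at symbol resolution time, so the extra pointer indirection in `checksum` goes away; the sketch only shows the selection logic.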
2667 | 2287 | diff --git a/contrib/power/power.h b/contrib/power/power.h | ||
2668 | 2288 | index b42c7d6..79123aa 100644 | ||
2669 | 2289 | --- a/contrib/power/power.h | ||
2670 | 2290 | +++ b/contrib/power/power.h | ||
2671 | 2291 | @@ -2,3 +2,7 @@ | ||
2672 | 2292 | * 2019 Rogerio Alves <rogerio.alves@ibm.com>, IBM | ||
2673 | 2293 | * For conditions of distribution and use, see copyright notice in zlib.h | ||
2674 | 2294 | */ | ||
2675 | 2295 | + | ||
2676 | 2296 | +#include "../../zconf.h" | ||
2677 | 2297 | + | ||
2678 | 2298 | +unsigned long _crc32_z_power8(unsigned long, const Bytef *, z_size_t); | ||
2679 | 2299 | diff --git a/crc32.c b/crc32.c | ||
2680 | 2300 | index 6c38f5c..5589d54 100644 | ||
2681 | 2301 | --- a/crc32.c | ||
2682 | 2302 | +++ b/crc32.c | ||
2683 | 2303 | @@ -691,6 +691,13 @@ local z_word_t crc_word_big(z_word_t data) { | ||
2684 | 2304 | #endif | ||
2685 | 2305 | |||
2686 | 2306 | /* ========================================================================= */ | ||
2687 | 2307 | +#ifdef Z_POWER_OPT | ||
2688 | 2308 | +/* Rename function so resolver can use its symbol. The default version will be | ||
2689 | 2309 | + * returned by the resolver if the host has no support for an optimized version. | ||
2690 | 2310 | + */ | ||
2691 | 2311 | +#define crc32_z crc32_z_default | ||
2692 | 2312 | +#endif /* Z_POWER_OPT */ | ||
2693 | 2313 | + | ||
2694 | 2314 | unsigned long ZEXPORT crc32_z(unsigned long crc, const unsigned char FAR *buf, | ||
2695 | 2315 | z_size_t len) { | ||
2696 | 2316 | /* Return initial CRC, if requested. */ | ||
2697 | 2317 | @@ -1009,6 +1016,11 @@ unsigned long ZEXPORT crc32_z(unsigned long crc, const unsigned char FAR *buf, | ||
2698 | 2318 | return crc ^ 0xffffffff; | ||
2699 | 2319 | } | ||
2700 | 2320 | |||
2701 | 2321 | +#ifdef Z_POWER_OPT | ||
2702 | 2322 | +#undef crc32_z | ||
2703 | 2323 | +#include "contrib/power/crc32_z_resolver.c" | ||
2704 | 2324 | +#endif /* Z_POWER_OPT */ | ||
2705 | 2325 | + | ||
2706 | 2326 | #endif | ||
2707 | 2327 | |||
2708 | 2328 | /* ========================================================================= */ | ||
2709 | 2329 | diff --git a/test/crc32_test.c b/test/crc32_test.c | ||
2710 | 2330 | new file mode 100644 | ||
2711 | 2331 | index 0000000..3155553 | ||
2712 | 2332 | --- /dev/null | ||
2713 | 2333 | +++ b/test/crc32_test.c | ||
2714 | 2334 | @@ -0,0 +1,205 @@ | ||
2715 | 2335 | +/* crc32_test.c -- unit test for crc32 in the zlib compression library | |
2716 | 2336 | + * Copyright (C) 1995-2006, 2010, 2011, 2016, 2019 Rogerio Alves | ||
2717 | 2337 | + * For conditions of distribution and use, see copyright notice in zlib.h | ||
2718 | 2338 | + */ | ||
2719 | 2339 | + | ||
2720 | 2340 | +#include "zlib.h" | ||
2721 | 2341 | +#include <stdio.h> | ||
2722 | 2342 | + | ||
2723 | 2343 | +#ifdef STDC | ||
2724 | 2344 | +# include <string.h> | ||
2725 | 2345 | +# include <stdlib.h> | ||
2726 | 2346 | +#endif | ||
2727 | 2347 | + | ||
2728 | 2348 | +void test_crc32 OF((uLong crc, Byte* buf, z_size_t len, uLong chk, int line)); | ||
2729 | 2349 | +int main OF((void)); | ||
2730 | 2350 | + | ||
2731 | 2351 | +typedef struct { | ||
2732 | 2352 | + int line; | ||
2733 | 2353 | + uLong crc; | ||
2734 | 2354 | + char* buf; | ||
2735 | 2355 | + int len; | ||
2736 | 2356 | + uLong expect; | ||
2737 | 2357 | +} crc32_test; | ||
2738 | 2358 | + | ||
2739 | 2359 | +void test_crc32(crc, buf, len, chk, line) | ||
2740 | 2360 | + uLong crc; | ||
2741 | 2361 | + Byte *buf; | ||
2742 | 2362 | + z_size_t len; | ||
2743 | 2363 | + uLong chk; | ||
2744 | 2364 | + int line; | ||
2745 | 2365 | +{ | ||
2746 | 2366 | + uLong res = crc32(crc, buf, len); | ||
2747 | 2367 | + if (res != chk) { | ||
2748 | 2368 | + fprintf(stderr, "FAIL [%d]: crc32 returned 0x%08X expected 0x%08X\n", | ||
2749 | 2369 | + line, (unsigned int)res, (unsigned int)chk); | ||
2750 | 2370 | + exit(1); | ||
2751 | 2371 | + } | ||
2752 | 2372 | +} | ||
2753 | 2373 | + | ||
2754 | 2374 | +static const crc32_test tests[] = { | ||
2755 | 2375 | + {__LINE__, 0x0, 0x0, 0, 0x0}, | ||
2756 | 2376 | + {__LINE__, 0xffffffff, 0x0, 0, 0x0}, | ||
2757 | 2377 | + {__LINE__, 0x0, 0x0, 255, 0x0}, /* BZ 174799. */ | ||
2758 | 2378 | + {__LINE__, 0x0, 0x0, 256, 0x0}, | ||
2759 | 2379 | + {__LINE__, 0x0, 0x0, 257, 0x0}, | ||
2760 | 2380 | + {__LINE__, 0x0, 0x0, 32767, 0x0}, | ||
2761 | 2381 | + {__LINE__, 0x0, 0x0, 32768, 0x0}, | ||
2762 | 2382 | + {__LINE__, 0x0, 0x0, 32769, 0x0}, | ||
2763 | 2383 | + {__LINE__, 0x0, "", 0, 0x0}, | ||
2764 | 2384 | + {__LINE__, 0xffffffff, "", 0, 0xffffffff}, | ||
2765 | 2385 | + {__LINE__, 0x0, "abacus", 6, 0xc3d7115b}, | ||
2766 | 2386 | + {__LINE__, 0x0, "backlog", 7, 0x269205}, | ||
2767 | 2387 | + {__LINE__, 0x0, "campfire", 8, 0x22a515f8}, | ||
2768 | 2388 | + {__LINE__, 0x0, "delta", 5, 0x9643fed9}, | ||
2769 | 2389 | + {__LINE__, 0x0, "executable", 10, 0xd68eda01}, | ||
2770 | 2390 | + {__LINE__, 0x0, "file", 4, 0x8c9f3610}, | ||
2771 | 2391 | + {__LINE__, 0x0, "greatest", 8, 0xc1abd6cd}, | ||
2772 | 2392 | + {__LINE__, 0x0, "hello", 5, 0x3610a686}, | ||
2773 | 2393 | + {__LINE__, 0x0, "inverter", 8, 0xc9e962c9}, | ||
2774 | 2394 | + {__LINE__, 0x0, "jigsaw", 6, 0xce4e3f69}, | ||
2775 | 2395 | + {__LINE__, 0x0, "karate", 6, 0x890be0e2}, | ||
2776 | 2396 | + {__LINE__, 0x0, "landscape", 9, 0xc4e0330b}, | ||
2777 | 2397 | + {__LINE__, 0x0, "machine", 7, 0x1505df84}, | ||
2778 | 2398 | + {__LINE__, 0x0, "nanometer", 9, 0xd4e19f39}, | ||
2779 | 2399 | + {__LINE__, 0x0, "oblivion", 8, 0xdae9de77}, | ||
2780 | 2400 | + {__LINE__, 0x0, "panama", 6, 0x66b8979c}, | ||
2781 | 2401 | + {__LINE__, 0x0, "quest", 5, 0x4317f817}, | ||
2782 | 2402 | + {__LINE__, 0x0, "resource", 8, 0xbc91f416}, | ||
2783 | 2403 | + {__LINE__, 0x0, "secret", 6, 0x5ca2e8e5}, | ||
2784 | 2404 | + {__LINE__, 0x0, "test", 4, 0xd87f7e0c}, | ||
2785 | 2405 | + {__LINE__, 0x0, "ultimate", 8, 0x3fc79b0b}, | ||
2786 | 2406 | + {__LINE__, 0x0, "vector", 6, 0x1b6e485b}, | ||
2787 | 2407 | + {__LINE__, 0x0, "walrus", 6, 0xbe769b97}, | ||
2788 | 2408 | + {__LINE__, 0x0, "xeno", 4, 0xe7a06444}, | ||
2789 | 2409 | + {__LINE__, 0x0, "yelling", 7, 0xfe3944e5}, | ||
2790 | 2410 | + {__LINE__, 0x0, "zlib", 4, 0x73887d3a}, | ||
2791 | 2411 | + {__LINE__, 0x0, "4BJD7PocN1VqX0jXVpWB", 20, 0xd487a5a1}, | ||
2792 | 2412 | + {__LINE__, 0x0, "F1rPWI7XvDs6nAIRx41l", 20, 0x61a0132e}, | ||
2793 | 2413 | + {__LINE__, 0x0, "ldhKlsVkPFOveXgkGtC2", 20, 0xdf02f76}, | ||
2794 | 2414 | + {__LINE__, 0x0, "5KKnGOOrs8BvJ35iKTOS", 20, 0x579b2b0a}, | ||
2795 | 2415 | + {__LINE__, 0x0, "0l1tw7GOcem06Ddu7yn4", 20, 0xf7d16e2d}, | ||
2796 | 2416 | + {__LINE__, 0x0, "MCr47CjPIn9R1IvE1Tm5", 20, 0x731788f5}, | ||
2797 | 2417 | + {__LINE__, 0x0, "UcixbzPKTIv0SvILHVdO", 20, 0x7112bb11}, | ||
2798 | 2418 | + {__LINE__, 0x0, "dGnAyAhRQDsWw0ESou24", 20, 0xf32a0dac}, | ||
2799 | 2419 | + {__LINE__, 0x0, "di0nvmY9UYMYDh0r45XT", 20, 0x625437bb}, | ||
2800 | 2420 | + {__LINE__, 0x0, "2XKDwHfAhFsV0RhbqtvH", 20, 0x896930f9}, | ||
2801 | 2421 | + {__LINE__, 0x0, "ZhrANFIiIvRnqClIVyeD", 20, 0x8579a37}, | ||
2802 | 2422 | + {__LINE__, 0x0, "v7Q9ehzioTOVeDIZioT1", 20, 0x632aa8e0}, | ||
2803 | 2423 | + {__LINE__, 0x0, "Yod5hEeKcYqyhfXbhxj2", 20, 0xc829af29}, | ||
2804 | 2424 | + {__LINE__, 0x0, "GehSWY2ay4uUKhehXYb0", 20, 0x1b08b7e8}, | ||
2805 | 2425 | + {__LINE__, 0x0, "kwytJmq6UqpflV8Y8GoE", 20, 0x4e33b192}, | ||
2806 | 2426 | + {__LINE__, 0x0, "70684206568419061514", 20, 0x59a179f0}, | ||
2807 | 2427 | + {__LINE__, 0x0, "42015093765128581010", 20, 0xcd1013d7}, | ||
2808 | 2428 | + {__LINE__, 0x0, "88214814356148806939", 20, 0xab927546}, | ||
2809 | 2429 | + {__LINE__, 0x0, "43472694284527343838", 20, 0x11f3b20c}, | ||
2810 | 2430 | + {__LINE__, 0x0, "49769333513942933689", 20, 0xd562d4ca}, | ||
2811 | 2431 | + {__LINE__, 0x0, "54979784887993251199", 20, 0x233395f7}, | ||
2812 | 2432 | + {__LINE__, 0x0, "58360544869206793220", 20, 0x2d167fd5}, | ||
2813 | 2433 | + {__LINE__, 0x0, "27347953487840714234", 20, 0x8b5108ba}, | ||
2814 | 2434 | + {__LINE__, 0x0, "07650690295365319082", 20, 0xc46b3cd8}, | ||
2815 | 2435 | + {__LINE__, 0x0, "42655507906821911703", 20, 0xc10b2662}, | ||
2816 | 2436 | + {__LINE__, 0x0, "29977409200786225655", 20, 0xc9a0f9d2}, | ||
2817 | 2437 | + {__LINE__, 0x0, "85181542907229116674", 20, 0x9341357b}, | ||
2818 | 2438 | + {__LINE__, 0x0, "87963594337989416799", 20, 0xf0424937}, | ||
2819 | 2439 | + {__LINE__, 0x0, "21395988329504168551", 20, 0xd7c4c31f}, | ||
2820 | 2440 | + {__LINE__, 0x0, "51991013580943379423", 20, 0xf11edcc4}, | ||
2821 | 2441 | + {__LINE__, 0x0, "*]+@!);({_$;}[_},?{?;(_?,=-][@", 30, 0x40795df4}, | ||
2822 | 2442 | + {__LINE__, 0x0, "_@:_).&(#.[:[{[:)$++-($_;@[)}+", 30, 0xdd61a631}, | ||
2823 | 2443 | + {__LINE__, 0x0, "&[!,[$_==}+.]@!;*(+},[;:)$;)-@", 30, 0xca907a99}, | ||
2824 | 2444 | + {__LINE__, 0x0, "]{.[.+?+[[=;[?}_#&;[=)__$$:+=_", 30, 0xf652deac}, | ||
2825 | 2445 | + {__LINE__, 0x0, "-%.)=/[@].:.(:,()$;=%@-$?]{%+%", 30, 0xaf39a5a9}, | ||
2826 | 2446 | + {__LINE__, 0x0, "+]#$(@&.=:,*];/.!]%/{:){:@(;)$", 30, 0x6bebb4cf}, | ||
2827 | 2447 | + {__LINE__, 0x0, ")-._.:?[&:.=+}(*$/=!.${;(=$@!}", 30, 0x76430bac}, | ||
2828 | 2448 | + {__LINE__, 0x0, ":(_*&%/[[}+,?#$&*+#[([*-/#;%(]", 30, 0x6c80c388}, | ||
2829 | 2449 | + {__LINE__, 0x0, "{[#-;:$/{)(+[}#]/{&!%(@)%:@-$:", 30, 0xd54d977d}, | ||
2830 | 2450 | + {__LINE__, 0x0, "_{$*,}(&,@.)):=!/%(&(,,-?$}}}!", 30, 0xe3966ad5}, | ||
2831 | 2451 | + {__LINE__, 0x0, "e$98KNzqaV)Y:2X?]77].{gKRD4G5{mHZk,Z)SpU%L3FSgv!Wb8MLAFdi{+fp)c,@8m6v)yXg@]HBDFk?.4&}g5_udE*JHCiH=aL", 100, 0xe7c71db9}, | ||
2832 | 2452 | + {__LINE__, 0x0, "r*Fd}ef+5RJQ;+W=4jTR9)R*p!B;]Ed7tkrLi;88U7g@3v!5pk2X6D)vt,.@N8c]@yyEcKi[vwUu@.Ppm@C6%Mv*3Nw}Y,58_aH)", 100, 0xeaa52777}, | ||
2833 | 2453 | + {__LINE__, 0x0, "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&", 100, 0xcd472048}, | ||
2834 | 2454 | + {__LINE__, 0x7a30360d, "abacus", 6, 0xf8655a84}, | ||
2835 | 2455 | + {__LINE__, 0x6fd767ee, "backlog", 7, 0x1ed834b1}, | ||
2836 | 2456 | + {__LINE__, 0xefeb7589, "campfire", 8, 0x686cfca}, | ||
2837 | 2457 | + {__LINE__, 0x61cf7e6b, "delta", 5, 0x1554e4b1}, | ||
2838 | 2458 | + {__LINE__, 0xdc712e2, "executable", 10, 0x761b4254}, | ||
2839 | 2459 | + {__LINE__, 0xad23c7fd, "file", 4, 0x7abdd09b}, | ||
2840 | 2460 | + {__LINE__, 0x85cb2317, "greatest", 8, 0x4ba91c6b}, | ||
2841 | 2461 | + {__LINE__, 0x9eed31b0, "inverter", 8, 0xd5e78ba5}, | ||
2842 | 2462 | + {__LINE__, 0xb94f34ca, "jigsaw", 6, 0x23649109}, | ||
2843 | 2463 | + {__LINE__, 0xab058a2, "karate", 6, 0xc5591f41}, | ||
2844 | 2464 | + {__LINE__, 0x5bff2b7a, "landscape", 9, 0xf10eb644}, | ||
2845 | 2465 | + {__LINE__, 0x605c9a5f, "machine", 7, 0xbaa0a636}, | ||
2846 | 2466 | + {__LINE__, 0x51bdeea5, "nanometer", 9, 0x6af89afb}, | ||
2847 | 2467 | + {__LINE__, 0x85c21c79, "oblivion", 8, 0xecae222b}, | ||
2848 | 2468 | + {__LINE__, 0x97216f56, "panama", 6, 0x47dffac4}, | ||
2849 | 2469 | + {__LINE__, 0x18444af2, "quest", 5, 0x70c2fe36}, | ||
2850 | 2470 | + {__LINE__, 0xbe6ce359, "resource", 8, 0x1471d925}, | ||
2851 | 2471 | + {__LINE__, 0x843071f1, "secret", 6, 0x50c9a0db}, | ||
2852 | 2472 | + {__LINE__, 0xf2480c60, "ultimate", 8, 0xf973daf8}, | ||
2853 | 2473 | + {__LINE__, 0x2d2feb3d, "vector", 6, 0x344ac03d}, | ||
2854 | 2474 | + {__LINE__, 0x7490310a, "walrus", 6, 0x6d1408ef}, | ||
2855 | 2475 | + {__LINE__, 0x97d247d4, "xeno", 4, 0xe62670b5}, | ||
2856 | 2476 | + {__LINE__, 0x93cf7599, "yelling", 7, 0x1b36da38}, | ||
2857 | 2477 | + {__LINE__, 0x73c84278, "zlib", 4, 0x6432d127}, | ||
2858 | 2478 | + {__LINE__, 0x228a87d1, "4BJD7PocN1VqX0jXVpWB", 20, 0x997107d0}, | ||
2859 | 2479 | + {__LINE__, 0xa7a048d0, "F1rPWI7XvDs6nAIRx41l", 20, 0xdc567274}, | ||
2860 | 2480 | + {__LINE__, 0x1f0ded40, "ldhKlsVkPFOveXgkGtC2", 20, 0xdcc63870}, | ||
2861 | 2481 | + {__LINE__, 0xa804a62f, "5KKnGOOrs8BvJ35iKTOS", 20, 0x6926cffd}, | ||
2862 | 2482 | + {__LINE__, 0x508fae6a, "0l1tw7GOcem06Ddu7yn4", 20, 0xb52b38bc}, | ||
2863 | 2483 | + {__LINE__, 0xe5adaf4f, "MCr47CjPIn9R1IvE1Tm5", 20, 0xf83b8178}, | ||
2864 | 2484 | + {__LINE__, 0x67136a40, "UcixbzPKTIv0SvILHVdO", 20, 0xc5213070}, | ||
2865 | 2485 | + {__LINE__, 0xb00c4a10, "dGnAyAhRQDsWw0ESou24", 20, 0xbc7648b0}, | ||
2866 | 2486 | + {__LINE__, 0x2e0c84b5, "di0nvmY9UYMYDh0r45XT", 20, 0xd8123a72}, | ||
2867 | 2487 | + {__LINE__, 0x81238d44, "2XKDwHfAhFsV0RhbqtvH", 20, 0xd5ac5620}, | ||
2868 | 2488 | + {__LINE__, 0xf853aa92, "ZhrANFIiIvRnqClIVyeD", 20, 0xceae099d}, | ||
2869 | 2489 | + {__LINE__, 0x5a692325, "v7Q9ehzioTOVeDIZioT1", 20, 0xb07d2b24}, | ||
2870 | 2490 | + {__LINE__, 0x3275b9f, "Yod5hEeKcYqyhfXbhxj2", 20, 0x24ce91df}, | ||
2871 | 2491 | + {__LINE__, 0x38371feb, "GehSWY2ay4uUKhehXYb0", 20, 0x707b3b30}, | ||
2872 | 2492 | + {__LINE__, 0xafc8bf62, "kwytJmq6UqpflV8Y8GoE", 20, 0x16abc6a9}, | ||
2873 | 2493 | + {__LINE__, 0x9b07db73, "70684206568419061514", 20, 0xae1fb7b7}, | ||
2874 | 2494 | + {__LINE__, 0xe75b214, "42015093765128581010", 20, 0xd4eecd2d}, | ||
2875 | 2495 | + {__LINE__, 0x72d0fe6f, "88214814356148806939", 20, 0x4660ec7}, | ||
2876 | 2496 | + {__LINE__, 0xf857a4b1, "43472694284527343838", 20, 0xfd8afdf7}, | ||
2877 | 2497 | + {__LINE__, 0x54b8e14, "49769333513942933689", 20, 0xc6d1b5f2}, | ||
2878 | 2498 | + {__LINE__, 0xd6aa5616, "54979784887993251199", 20, 0x32476461}, | ||
2879 | 2499 | + {__LINE__, 0x11e63098, "58360544869206793220", 20, 0xd917cf1a}, | ||
2880 | 2500 | + {__LINE__, 0xbe92385, "27347953487840714234", 20, 0x4ad14a12}, | ||
2881 | 2501 | + {__LINE__, 0x49511de0, "07650690295365319082", 20, 0xe37b5c6c}, | ||
2882 | 2502 | + {__LINE__, 0x3db13bc1, "42655507906821911703", 20, 0x7cc497f1}, | ||
2883 | 2503 | + {__LINE__, 0xbb899bea, "29977409200786225655", 20, 0x99781bb2}, | ||
2884 | 2504 | + {__LINE__, 0xf6cd9436, "85181542907229116674", 20, 0x132256a1}, | ||
2885 | 2505 | + {__LINE__, 0x9109e6c3, "87963594337989416799", 20, 0xbfdb2c83}, | ||
2886 | 2506 | + {__LINE__, 0x75770fc, "21395988329504168551", 20, 0x8d9d1e81}, | ||
2887 | 2507 | + {__LINE__, 0x69b1d19b, "51991013580943379423", 20, 0x7b6d4404}, | ||
2888 | 2508 | + {__LINE__, 0xc6132975, "*]+@!);({_$;}[_},?{?;(_?,=-][@", 30, 0x8619f010}, | ||
2889 | 2509 | + {__LINE__, 0xd58cb00c, "_@:_).&(#.[:[{[:)$++-($_;@[)}+", 30, 0x15746ac3}, | ||
2890 | 2510 | + {__LINE__, 0xb63b8caa, "&[!,[$_==}+.]@!;*(+},[;:)$;)-@", 30, 0xaccf812f}, | ||
2891 | 2511 | + {__LINE__, 0x8a45a2b8, "]{.[.+?+[[=;[?}_#&;[=)__$$:+=_", 30, 0x78af45de}, | ||
2892 | 2512 | + {__LINE__, 0xcbe95b78, "-%.)=/[@].:.(:,()$;=%@-$?]{%+%", 30, 0x25b06b59}, | ||
2893 | 2513 | + {__LINE__, 0x4ef8a54b, "+]#$(@&.=:,*];/.!]%/{:){:@(;)$", 30, 0x4ba0d08f}, | ||
2894 | 2514 | + {__LINE__, 0x76ad267a, ")-._.:?[&:.=+}(*$/=!.${;(=$@!}", 30, 0xe26b6aac}, | ||
2895 | 2515 | + {__LINE__, 0x569e613c, ":(_*&%/[[}+,?#$&*+#[([*-/#;%(]", 30, 0x7e2b0a66}, | ||
2896 | 2516 | + {__LINE__, 0x36aa61da, "{[#-;:$/{)(+[}#]/{&!%(@)%:@-$:", 30, 0xb3430dc7}, | ||
2897 | 2517 | + {__LINE__, 0xf67222df, "_{$*,}(&,@.)):=!/%(&(,,-?$}}}!", 30, 0x626c17a}, | ||
2898 | 2518 | + {__LINE__, 0x74b34fd3, "e$98KNzqaV)Y:2X?]77].{gKRD4G5{mHZk,Z)SpU%L3FSgv!Wb8MLAFdi{+fp)c,@8m6v)yXg@]HBDFk?.4&}g5_udE*JHCiH=aL", 100, 0xccf98060}, | ||
2899 | 2519 | + {__LINE__, 0x351fd770, "r*Fd}ef+5RJQ;+W=4jTR9)R*p!B;]Ed7tkrLi;88U7g@3v!5pk2X6D)vt,.@N8c]@yyEcKi[vwUu@.Ppm@C6%Mv*3Nw}Y,58_aH)", 100, 0xd8b95312}, | ||
2900 | 2520 | + {__LINE__, 0xc45aef77, "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&", 100, 0xbb1c9912}, | ||
2901 | 2521 | + {__LINE__, 0xc45aef77, "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&" | ||
2902 | 2522 | + "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&" | ||
2903 | 2523 | + "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&" | ||
2904 | 2524 | + "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&" | ||
2905 | 2525 | + "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&" | ||
2906 | 2526 | + "h{bcmdC+a;t+Cf{6Y_dFq-{X4Yu&7uNfVDh?q&_u.UWJU],-GiH7ADzb7-V.Q%4=+v!$L9W+T=bP]$_:]Vyg}A.ygD.r;h-D]m%&", 600, 0x888AFA5B} | ||
2907 | 2527 | +}; | ||
2908 | 2528 | + | ||
2909 | 2529 | +static const int test_size = sizeof(tests) / sizeof(tests[0]); | ||
2910 | 2530 | + | ||
2911 | 2531 | +int main(void) | ||
2912 | 2532 | +{ | ||
2913 | 2533 | + int i; | ||
2914 | 2534 | + for (i = 0; i < test_size; i++) { | ||
2915 | 2535 | + test_crc32(tests[i].crc, (Byte*) tests[i].buf, tests[i].len, | ||
2916 | 2536 | + tests[i].expect, tests[i].line); | ||
2917 | 2537 | + } | ||
2918 | 2538 | + return 0; | ||
2919 | 2539 | +} | ||
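The expected values in the test table above can be cross-checked without zlib or POWER hardware. The sketch below is a plain bit-at-a-time CRC-32 (reflected polynomial 0xEDB88320) that should agree with `crc32()` for non-NULL buffers. `crc32_bitwise` is a hypothetical helper name and is not part of the patch. It does not reproduce zlib's special case of returning 0 for a NULL buffer, so the table rows that pass `0x0` as the buffer are out of scope.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-32, reflected polynomial 0xEDB88320.
 * Slow, but simple enough to serve as an independent reference. */
uint32_t crc32_bitwise(uint32_t crc, const unsigned char *buf, size_t len) {
    crc ^= 0xFFFFFFFFu;                       /* pre-condition the register */
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; xor in the polynomial when the low bit was set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;                 /* post-condition */
}
```

Passing the previous return value as `crc` continues a running checksum, which is the property the table rows with a non-zero starting CRC exercise.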
2920 | diff --git a/debian/patches/power/fix-clang7-builtins.patch b/debian/patches/power/fix-clang7-builtins.patch | |||
2921 | 0 | new file mode 100644 | 2540 | new file mode 100644 |
2922 | index 0000000..0ed510f | |||
2923 | --- /dev/null | |||
2924 | +++ b/debian/patches/power/fix-clang7-builtins.patch | |||
2925 | @@ -0,0 +1,62 @@ | |||
2926 | 1 | From: Manjunath S Matti <mmatti@linux.ibm.com> | ||
2927 | 2 | Date: Thu, 14 Sep 2023 06:45:31 -0500 | ||
2928 | 3 | Subject: Fix clang's behavior on versions >= 7 | ||
2929 | 4 | |||
2930 | 5 | Clang 7 changed the behavior of vec_xxpermdi in order to match GCC's | ||
2931 | 6 | behavior. After this change, code that used to work on Clang 6 stopped | ||
2932 | 7 | working on Clang >= 7. | |
2933 | 8 | |||
2934 | 9 | Tested on Clang 6, 7, 8 and 9. | ||
2935 | 10 | |||
2936 | 11 | Reference: https://bugs.llvm.org/show_bug.cgi?id=38192 | ||
2937 | 12 | |||
2938 | 13 | Signed-off-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> | ||
2939 | 14 | Signed-off-by: Manjunath Matti <mmatti@linux.ibm.com> | ||
2940 | 15 | |||
2941 | 16 | Origin: iii-i/zlib, https://github.com/iii-i/zlib/commit/8aca10a8a5ddb397854eb9a443f29658d3e3e12e | |
2942 | 17 | --- | ||
2943 | 18 | contrib/power/clang_workaround.h | 15 ++++++++++----- | ||
2944 | 19 | 1 file changed, 10 insertions(+), 5 deletions(-) | ||
2945 | 20 | |||
2946 | 21 | diff --git a/contrib/power/clang_workaround.h b/contrib/power/clang_workaround.h | ||
2947 | 22 | index b5e7dae..915f7e5 100644 | ||
2948 | 23 | --- a/contrib/power/clang_workaround.h | ||
2949 | 24 | +++ b/contrib/power/clang_workaround.h | ||
2950 | 25 | @@ -39,7 +39,12 @@ __vector unsigned long long __builtin_pack_vector (unsigned long __a, | ||
2951 | 26 | return __v; | ||
2952 | 27 | } | ||
2953 | 28 | |||
2954 | 29 | -#ifndef vec_xxpermdi | ||
2955 | 30 | +/* | ||
2956 | 31 | + * Clang 7 changed the behavior of vec_xxpermdi in order to provide the same | ||
2957 | 32 | + * behavior as GCC. That means code adapted to Clang >= 7 does not work on | |
2958 | 33 | + * Clang <= 6. So, fallback to __builtin_unpack_vector() on Clang <= 6. | ||
2959 | 34 | + */ | ||
2960 | 35 | +#if !defined vec_xxpermdi || __clang_major__ <= 6 | ||
2961 | 36 | |||
2962 | 37 | static inline | ||
2963 | 38 | unsigned long __builtin_unpack_vector (__vector unsigned long long __v, | ||
2964 | 39 | @@ -62,9 +67,9 @@ static inline | ||
2965 | 40 | unsigned long __builtin_unpack_vector_0 (__vector unsigned long long __v) | ||
2966 | 41 | { | ||
2967 | 42 | #if defined(__BIG_ENDIAN__) | ||
2968 | 43 | - return vec_xxpermdi(__v, __v, 0x0)[1]; | ||
2969 | 44 | - #else | ||
2970 | 45 | return vec_xxpermdi(__v, __v, 0x0)[0]; | ||
2971 | 46 | + #else | ||
2972 | 47 | + return vec_xxpermdi(__v, __v, 0x3)[0]; | ||
2973 | 48 | #endif | ||
2974 | 49 | } | ||
2975 | 50 | |||
2976 | 51 | @@ -72,9 +77,9 @@ static inline | ||
2977 | 52 | unsigned long __builtin_unpack_vector_1 (__vector unsigned long long __v) | ||
2978 | 53 | { | ||
2979 | 54 | #if defined(__BIG_ENDIAN__) | ||
2980 | 55 | - return vec_xxpermdi(__v, __v, 0x3)[1]; | ||
2981 | 56 | - #else | ||
2982 | 57 | return vec_xxpermdi(__v, __v, 0x3)[0]; | ||
2983 | 58 | + #else | ||
2984 | 59 | + return vec_xxpermdi(__v, __v, 0x0)[0]; | ||
2985 | 60 | #endif | ||
2986 | 61 | } | ||
2987 | 62 | #endif /* vec_xxpermdi */ | ||
2988 | diff --git a/debian/patches/power/indirect-func-macros.patch b/debian/patches/power/indirect-func-macros.patch | |||
2989 | 0 | new file mode 100644 | 63 | new file mode 100644 |
2990 | index 0000000..c2976d8 | |||
2991 | --- /dev/null | |||
2992 | +++ b/debian/patches/power/indirect-func-macros.patch | |||
2993 | @@ -0,0 +1,295 @@ | |||
2994 | 1 | From: Manjunath S Matti <mmatti@linux.ibm.com> | ||
2995 | 2 | Date: Thu, 14 Sep 2023 06:15:57 -0500 | ||
2996 | 3 | Subject: Preparation for Power optimizations | ||
2997 | 4 | |||
2998 | 5 | Optimized functions for Power will make use of GNU indirect functions, | ||
2999 | 6 | an extension to support different implementations of the same function, | ||
3000 | 7 | which can be selected during runtime. This will be used to provide | ||
3001 | 8 | optimized functions for different processor versions. | ||
3002 | 9 | |||
3003 | 10 | Since this is a GNU extension, we placed the definition of the Z_IFUNC | ||
3004 | 11 | macro under `contrib/gcc`. This can be reused by other archs as well. | ||
3005 | 12 | |||
3006 | 13 | Author: Matheus Castanho <msc@linux.ibm.com> | ||
3007 | 14 | Author: Rogerio Alves <rcardoso@linux.ibm.com> | ||
3008 | 15 | Signed-off-by: Manjunath Matti <mmatti@linux.ibm.com> | ||
3009 | 16 | |||
3010 | 17 | Origin: iii-i/zlib, https://github.com/iii-i/zlib/commit/096441298ecd1c123f1d37c2b34d6b6bb3c42e93 | ||
3011 | 18 | --- | ||
3012 | 19 | CMakeLists.txt | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++ | ||
3013 | 20 | configure | 66 ++++++++++++++++++++++++++++++++++++++++++++++ | ||
3014 | 21 | contrib/README.contrib | 8 ++++++ | ||
3015 | 22 | contrib/gcc/zifunc.h | 60 ++++++++++++++++++++++++++++++++++++++++++ | ||
3016 | 23 | contrib/power/power.h | 4 +++ | ||
3017 | 24 | 5 files changed, 209 insertions(+) | ||
3018 | 25 | create mode 100644 contrib/gcc/zifunc.h | ||
3019 | 26 | create mode 100644 contrib/power/power.h | ||
3020 | 27 | |||
3021 | 28 | diff --git a/CMakeLists.txt b/CMakeLists.txt | ||
3022 | 29 | index 7f1b69f..4456cd7 100644 | ||
3023 | 30 | --- a/CMakeLists.txt | ||
3024 | 31 | +++ b/CMakeLists.txt | ||
3025 | 32 | @@ -5,6 +5,8 @@ project(zlib C) | ||
3026 | 33 | |||
3027 | 34 | set(VERSION "1.3") | ||
3028 | 35 | |||
3029 | 36 | +option(POWER "Enable building power implementation") | ||
3030 | 37 | + | ||
3031 | 38 | set(INSTALL_BIN_DIR "${CMAKE_INSTALL_PREFIX}/bin" CACHE PATH "Installation directory for executables") | ||
3032 | 39 | set(INSTALL_LIB_DIR "${CMAKE_INSTALL_PREFIX}/lib" CACHE PATH "Installation directory for libraries") | ||
3033 | 40 | set(INSTALL_INC_DIR "${CMAKE_INSTALL_PREFIX}/include" CACHE PATH "Installation directory for headers") | ||
3034 | 41 | @@ -126,6 +128,75 @@ if(NOT MINGW) | ||
3035 | 42 | ) | ||
3036 | 43 | endif() | ||
3037 | 44 | |||
3038 | 45 | +if(CMAKE_COMPILER_IS_GNUCC) | ||
3039 | 46 | + | ||
3040 | 47 | + # test to see if we can use a GNU indirect function to detect and load optimized code at runtime | ||
3041 | 48 | + CHECK_C_SOURCE_COMPILES(" | ||
3042 | 49 | + static int test_ifunc_native(void) | ||
3043 | 50 | + { | ||
3044 | 51 | + return 1; | ||
3045 | 52 | + } | ||
3046 | 53 | + static int (*(check_ifunc_native(void)))(void) | ||
3047 | 54 | + { | ||
3048 | 55 | + return test_ifunc_native; | ||
3049 | 56 | + } | ||
3050 | 57 | + int test_ifunc(void) __attribute__ ((ifunc (\"check_ifunc_native\"))); | ||
3051 | 58 | + int main(void) | ||
3052 | 59 | + { | ||
3053 | 60 | + return 0; | ||
3054 | 61 | + } | ||
3055 | 62 | + " HAS_C_ATTR_IFUNC) | ||
3056 | 63 | + | ||
3057 | 64 | + if(HAS_C_ATTR_IFUNC) | ||
3058 | 65 | + add_definitions(-DHAVE_IFUNC) | ||
3059 | 66 | + set(ZLIB_PRIVATE_HDRS ${ZLIB_PRIVATE_HDRS} contrib/gcc/zifunc.h) | ||
3060 | 67 | + endif() | ||
3061 | 68 | + | ||
3062 | 69 | + if(POWER) | ||
3063 | 70 | + # Test to see if we can use the optimizations for Power | ||
3064 | 71 | + CHECK_C_SOURCE_COMPILES(" | ||
3065 | 72 | + #ifndef _ARCH_PPC | ||
3066 | 73 | + #error \"Target is not Power\" | ||
3067 | 74 | + #endif | ||
3068 | 75 | + #ifndef __BUILTIN_CPU_SUPPORTS__ | ||
3069 | 76 | + #error \"Target doesn't support __builtin_cpu_supports()\" | ||
3070 | 77 | + #endif | ||
3071 | 78 | + int main() { return 0; } | ||
3072 | 79 | + " HAS_POWER_SUPPORT) | ||
3073 | 80 | + | ||
3074 | 81 | + if(HAS_POWER_SUPPORT AND HAS_C_ATTR_IFUNC) | ||
3075 | 82 | + add_definitions(-DZ_POWER_OPT) | ||
3076 | 83 | + | ||
3077 | 84 | + set(CMAKE_REQUIRED_FLAGS -mcpu=power8) | ||
3078 | 85 | + CHECK_C_SOURCE_COMPILES("int main(void){return 0;}" POWER8) | ||
3079 | 86 | + | ||
3080 | 87 | + if(POWER8) | ||
3081 | 88 | + add_definitions(-DZ_POWER8) | ||
3082 | 89 | + set(ZLIB_POWER8 ) | ||
3083 | 90 | + | ||
3084 | 91 | + set_source_files_properties( | ||
3085 | 92 | + ${ZLIB_POWER8} | ||
3086 | 93 | + PROPERTIES COMPILE_FLAGS -mcpu=power8) | ||
3087 | 94 | + endif() | ||
3088 | 95 | + | ||
3089 | 96 | + set(CMAKE_REQUIRED_FLAGS -mcpu=power9) | ||
3090 | 97 | + CHECK_C_SOURCE_COMPILES("int main(void){return 0;}" POWER9) | ||
3091 | 98 | + | ||
3092 | 99 | + if(POWER9) | ||
3093 | 100 | + add_definitions(-DZ_POWER9) | ||
3094 | 101 | + set(ZLIB_POWER9 ) | ||
3095 | 102 | + | ||
3096 | 103 | + set_source_files_properties( | ||
3097 | 104 | + ${ZLIB_POWER9} | ||
3098 | 105 | + PROPERTIES COMPILE_FLAGS -mcpu=power9) | ||
3099 | 106 | + endif() | ||
3100 | 107 | + | ||
3101 | 108 | + set(ZLIB_PRIVATE_HDRS ${ZLIB_PRIVATE_HDRS} contrib/power/power.h) | ||
3102 | 109 | + set(ZLIB_SRCS ${ZLIB_SRCS} ${ZLIB_POWER8} ${ZLIB_POWER9}) | ||
3103 | 110 | + endif() | ||
3104 | 111 | + endif() | ||
3105 | 112 | +endif() | ||
3106 | 113 | + | ||
3107 | 114 | # parse the full version number from zlib.h and include in ZLIB_FULL_VERSION | ||
3108 | 115 | file(READ ${CMAKE_CURRENT_SOURCE_DIR}/zlib.h _zlib_h_contents) | ||
3109 | 116 | string(REGEX REPLACE ".*#define[ \t]+ZLIB_VERSION[ \t]+\"([-0-9A-Za-z.]+)\".*" | ||
3110 | 117 | diff --git a/configure b/configure | ||
3111 | 118 | index cc867c9..e307a8d 100755 | ||
3112 | 119 | --- a/configure | ||
3113 | 120 | +++ b/configure | ||
3114 | 121 | @@ -834,6 +834,72 @@ EOF | ||
3115 | 122 | fi | ||
3116 | 123 | fi | ||
3117 | 124 | |||
3118 | 125 | +# test to see if we can use a gnu indirection function to detect and load optimized code at runtime | ||
3119 | 126 | +echo >> configure.log | ||
3120 | 127 | +cat > $test.c <<EOF | ||
3121 | 128 | +static int test_ifunc_native(void) | ||
3122 | 129 | +{ | ||
3123 | 130 | + return 1; | ||
3124 | 131 | +} | ||
3125 | 132 | + | ||
3126 | 133 | +static int (*(check_ifunc_native(void)))(void) | ||
3127 | 134 | +{ | ||
3128 | 135 | + return test_ifunc_native; | ||
3129 | 136 | +} | ||
3130 | 137 | + | ||
3131 | 138 | +int test_ifunc(void) __attribute__ ((ifunc ("check_ifunc_native"))); | ||
3132 | 139 | +EOF | ||
3133 | 140 | + | ||
3134 | 141 | +if tryboth $CC -c $CFLAGS $test.c; then | ||
3135 | 142 | + SFLAGS="${SFLAGS} -DHAVE_IFUNC" | ||
3136 | 143 | + CFLAGS="${CFLAGS} -DHAVE_IFUNC" | ||
3137 | 144 | + echo "Checking for attribute(ifunc) support... Yes." | tee -a configure.log | ||
3138 | 145 | +else | ||
3139 | 146 | + echo "Checking for attribute(ifunc) support... No." | tee -a configure.log | ||
3140 | 147 | +fi | ||
3141 | 148 | + | ||
3142 | 149 | +# Test to see if we can use the optimizations for Power | ||
3143 | 150 | +echo >> configure.log | ||
3144 | 151 | +cat > $test.c <<EOF | ||
3145 | 152 | +#ifndef _ARCH_PPC | ||
3146 | 153 | + #error "Target is not Power" | ||
3147 | 154 | +#endif | ||
3148 | 155 | +#ifndef HAVE_IFUNC | ||
3149 | 156 | + #error "Target doesn't support ifunc" | ||
3150 | 157 | +#endif | ||
3151 | 158 | +#ifndef __BUILTIN_CPU_SUPPORTS__ | ||
3152 | 159 | + #error "Target doesn't support __builtin_cpu_supports()" | ||
3153 | 160 | +#endif | ||
3154 | 161 | +EOF | ||
3155 | 162 | + | ||
3156 | 163 | +if tryboth $CC -c $CFLAGS $test.c; then | ||
3157 | 164 | + echo "int main(void){return 0;}" > $test.c | ||
3158 | 165 | + | ||
3159 | 166 | + if tryboth $CC -c $CFLAGS -mcpu=power8 $test.c; then | ||
3160 | 167 | + POWER8="-DZ_POWER8" | ||
3161 | 168 | + PIC_OBJC="${PIC_OBJC}" | ||
3162 | 169 | + OBJC="${OBJC}" | ||
3163 | 170 | + echo "Checking for -mcpu=power8 support... Yes." | tee -a configure.log | ||
3164 | 171 | + else | ||
3165 | 172 | + echo "Checking for -mcpu=power8 support... No." | tee -a configure.log | ||
3166 | 173 | + fi | ||
3167 | 174 | + | ||
3168 | 175 | + if tryboth $CC -c $CFLAGS -mcpu=power9 $test.c; then | ||
3169 | 176 | + POWER9="-DZ_POWER9" | ||
3170 | 177 | + PIC_OBJC="${PIC_OBJC}" | ||
3171 | 178 | + OBJC="${OBJC}" | ||
3172 | 179 | + echo "Checking for -mcpu=power9 support... Yes." | tee -a configure.log | ||
3173 | 180 | + else | ||
3174 | 181 | + echo "Checking for -mcpu=power9 support... No." | tee -a configure.log | ||
3175 | 182 | + fi | ||
3176 | 183 | + | ||
3177 | 184 | + SFLAGS="${SFLAGS} ${POWER8} ${POWER9} -DZ_POWER_OPT" | ||
3178 | 185 | + CFLAGS="${CFLAGS} ${POWER8} ${POWER9} -DZ_POWER_OPT" | ||
3179 | 186 | + echo "Checking for Power optimizations support... Yes." | tee -a configure.log | ||
3180 | 187 | +else | ||
3181 | 188 | + echo "Checking for Power optimizations support... No." | tee -a configure.log | ||
3182 | 189 | +fi | ||
3183 | 190 | + | ||
3184 | 191 | # show the results in the log | ||
3185 | 192 | echo >> configure.log | ||
3186 | 193 | echo ALL = $ALL >> configure.log | ||
3187 | 194 | diff --git a/contrib/README.contrib b/contrib/README.contrib | ||
3188 | 195 | index 5e5f950..c57b520 100644 | ||
3189 | 196 | --- a/contrib/README.contrib | ||
3190 | 197 | +++ b/contrib/README.contrib | ||
3191 | 198 | @@ -11,6 +11,10 @@ ada/ by Dmitriy Anisimkov <anisimkov@yahoo.com> | ||
3192 | 199 | blast/ by Mark Adler <madler@alumni.caltech.edu> | ||
3193 | 200 | Decompressor for output of PKWare Data Compression Library (DCL) | ||
3194 | 201 | |||
3195 | 202 | +gcc/ by Matheus Castanho <msc@linux.ibm.com> | ||
3196 | 203 | + and Rogerio Alves <rcardoso@linux.ibm.com> | ||
3197 | 204 | + Optimization helpers using GCC-specific extensions | ||
3198 | 205 | + | ||
3199 | 206 | delphi/ by Cosmin Truta <cosmint@cs.ubbcluj.ro> | ||
3200 | 207 | Support for Delphi and C++ Builder | ||
3201 | 208 | |||
3202 | 209 | @@ -42,6 +46,10 @@ minizip/ by Gilles Vollant <info@winimage.com> | ||
3203 | 210 | pascal/ by Bob Dellaca <bobdl@xtra.co.nz> et al. | ||
3204 | 211 | Support for Pascal | ||
3205 | 212 | |||
3206 | 213 | +power/ by Matheus Castanho <msc@linux.ibm.com> | ||
3207 | 214 | + and Rogerio Alves <rcardoso@linux.ibm.com> | ||
3208 | 215 | + Optimized functions for Power processors | ||
3209 | 216 | + | ||
3210 | 217 | puff/ by Mark Adler <madler@alumni.caltech.edu> | ||
3211 | 218 | Small, low memory usage inflate. Also serves to provide an | ||
3212 | 219 | unambiguous description of the deflate format. | ||
3213 | 220 | diff --git a/contrib/gcc/zifunc.h b/contrib/gcc/zifunc.h | ||
3214 | 221 | new file mode 100644 | ||
3215 | 222 | index 0000000..daf4fe4 | ||
3216 | 223 | --- /dev/null | ||
3217 | 224 | +++ b/contrib/gcc/zifunc.h | ||
3218 | 225 | @@ -0,0 +1,60 @@ | ||
3219 | 226 | +/* Copyright (C) 2019 Matheus Castanho <msc@linux.ibm.com>, IBM | ||
3220 | 227 | + * 2019 Rogerio Alves <rogerio.alves@ibm.com>, IBM | ||
3221 | 228 | + * For conditions of distribution and use, see copyright notice in zlib.h | ||
3222 | 229 | + */ | ||
3223 | 230 | + | ||
3224 | 231 | +#ifndef Z_IFUNC_H_ | ||
3225 | 232 | +#define Z_IFUNC_H_ | ||
3226 | 233 | + | ||
3227 | 234 | +/* Helpers for arch optimizations */ | ||
3228 | 235 | + | ||
3229 | 236 | +#define Z_IFUNC(fname) \ | ||
3230 | 237 | + typeof(fname) fname __attribute__ ((ifunc (#fname "_resolver"))); \ | ||
3231 | 238 | + local typeof(fname) *fname##_resolver(void) | ||
3232 | 239 | +/* This is a helper macro to declare a resolver for an indirect function | ||
3233 | 240 | + * (ifunc). Let's say you have function | ||
3234 | 241 | + * | ||
3235 | 242 | + * int foo (int a); | ||
3236 | 243 | + * | ||
3237 | 244 | + * for which you want to provide different implementations, for example: | ||
3238 | 245 | + * | ||
3239 | 246 | + * int foo_clever (int a) { | ||
3240 | 247 | + * ... clever things ... | ||
3241 | 248 | + * } | ||
3242 | 249 | + * | ||
3243 | 250 | + * int foo_smart (int a) { | ||
3244 | 251 | + * ... smart things ... | ||
3245 | 252 | + * } | ||
3246 | 253 | + * | ||
3247 | 254 | + * You will have to declare foo() as an indirect function and also provide a | ||
3248 | 255 | + * resolver for it, to choose between foo_clever() and foo_smart() based on | ||
3249 | 256 | + * some criteria you define (e.g. processor features). | ||
3250 | 257 | + * | ||
3251 | 258 | + * Since most likely foo() has a default implementation somewhere in zlib, you | ||
3252 | 259 | + * may have to rename it so the 'foo' symbol can be used by the ifunc without | ||
3253 | 260 | + * conflicts. | ||
3254 | 261 | + * | ||
3255 | 262 | + * #define foo foo_default | ||
3256 | 263 | + * int foo (int a) { | ||
3257 | 264 | + * ... | ||
3258 | 265 | + * } | ||
3259 | 266 | + * #undef foo | ||
3260 | 267 | + * | ||
3261 | 268 | + * Now you just have to provide a resolver function to choose which function | ||
3262 | 269 | + * should be used (decided at runtime on the first call to foo()): | ||
3263 | 270 | + * | ||
3264 | 271 | + * Z_IFUNC(foo) { | ||
3265 | 272 | + * if (... some condition ...) | ||
3266 | 273 | + * return foo_clever; | ||
3267 | 274 | + * | ||
3268 | 275 | + * if (... other condition ...) | ||
3269 | 276 | + * return foo_smart; | ||
3270 | 277 | + * | ||
3271 | 278 | + * return foo_default; | ||
3272 | 279 | + * } | ||
3273 | 280 | + * | ||
3274 | 281 | + * All calls to foo() throughout the code can remain untouched, all the magic | ||
3275 | 282 | + * will be done by the linker using the resolver function. | ||
3276 | 283 | + */ | ||
3277 | 284 | + | ||
3278 | 285 | +#endif /* Z_IFUNC_H_ */ | ||
3279 | 286 | diff --git a/contrib/power/power.h b/contrib/power/power.h | ||
3280 | 287 | new file mode 100644 | ||
3281 | 288 | index 0000000..b42c7d6 | ||
3282 | 289 | --- /dev/null | ||
3283 | 290 | +++ b/contrib/power/power.h | ||
3284 | 291 | @@ -0,0 +1,4 @@ | ||
3285 | 292 | +/* Copyright (C) 2019 Matheus Castanho <msc@linux.ibm.com>, IBM | ||
3286 | 293 | + * 2019 Rogerio Alves <rogerio.alves@ibm.com>, IBM | ||
3287 | 294 | + * For conditions of distribution and use, see copyright notice in zlib.h | ||
3288 | 295 | + */ | ||
3289 | diff --git a/debian/patches/s390x/add-accel-deflate.patch b/debian/patches/s390x/add-accel-deflate.patch | |||
3290 | 0 | new file mode 100644 | 296 | new file mode 100644 |
3291 | index 0000000..1ae9be6 | |||
3292 | --- /dev/null | |||
3293 | +++ b/debian/patches/s390x/add-accel-deflate.patch | |||
3294 | @@ -0,0 +1,2043 @@ | |||
3295 | 1 | From: Ilya Leoshkevich <iii@linux.ibm.com> | ||
3296 | 2 | Date: Wed, 18 Jul 2018 13:14:07 +0200 | ||
3297 | 3 | Subject: Add support for IBM Z hardware-accelerated deflate | ||
3298 | 4 | |||
3299 | 5 | IBM Z mainframes starting from version z15 provide the DFLTCC instruction, | ||
3300 | 6 | which implements the deflate algorithm in hardware with estimated | ||
3301 | 7 | compression and decompression performance orders of magnitude faster | ||
3302 | 8 | than the current zlib and ratio comparable with that of level 1. | ||
3303 | 9 | |||
3304 | 10 | This patch adds DFLTCC support to zlib. It can be enabled using the | ||
3305 | 11 | following build commands: | ||
3306 | 12 | |||
3307 | 13 | $ ./configure --dfltcc | ||
3308 | 14 | $ make | ||
3309 | 15 | |||
3310 | 16 | When built like this, zlib would compress in hardware on level 1, and | ||
3311 | 17 | in software on all other levels. Decompression will always happen in | ||
3312 | 18 | hardware. In order to enable DFLTCC compression for levels 1-6 (i.e., | ||
3313 | 19 | to make it used by default) one could either configure with | ||
3314 | 20 | `--dfltcc-level-mask=0x7e` or `export DFLTCC_LEVEL_MASK=0x7e` at run | ||
3315 | 21 | time. | ||
3316 | 22 | |||
3317 | 23 | Two DFLTCC compression calls produce the same results only when they | ||
3318 | 24 | both are made on machines of the same generation, and when the | ||
3319 | 25 | respective buffers have the same offset relative to the start of the | ||
3320 | 26 | page. Therefore care should be taken when using hardware compression | ||
3321 | 27 | when reproducible results are desired. One such use case - reproducible | ||
3322 | 28 | software builds - is handled explicitly: when the `SOURCE_DATE_EPOCH` | ||
3323 | 29 | environment variable is set, the hardware compression is disabled. | ||
3324 | 30 | |||
3325 | 31 | DFLTCC does not support every single zlib feature, in particular: | ||
3326 | 32 | |||
3327 | 33 | * `inflate(Z_BLOCK)` and `inflate(Z_TREES)` | ||
3328 | 34 | * `inflateMark()` | ||
3329 | 35 | * `inflatePrime()` | ||
3330 | 36 | * `inflateSyncPoint()` | ||
3331 | 37 | |||
3332 | 38 | When used, these functions will either switch to software, or, in case | ||
3333 | 39 | this is not possible, gracefully fail. | ||
3334 | 40 | |||
3335 | 41 | This patch tries to add DFLTCC support in the least intrusive way. | ||
3336 | 42 | All SystemZ-specific code is placed into a separate file, but | ||
3337 | 43 | unfortunately there is still a noticeable amount of changes in the | ||
3338 | 44 | main zlib code. Below is the summary of these changes. | ||
3339 | 45 | |||
3340 | 46 | DFLTCC takes as arguments a parameter block, an input buffer, an output | ||
3341 | 47 | buffer and a window. Since DFLTCC requires parameter block to be | ||
3342 | 48 | doubleword-aligned, and it's reasonable to allocate it alongside | ||
3343 | 49 | deflate and inflate states, The `ZALLOC_STATE()`, `ZFREE_STATE()` and | ||
3344 | 50 | `ZCOPY_STATE()` macros are introduced in order to encapsulate the | ||
3345 | 51 | allocation details. The same is true for window, for which | ||
3346 | 52 | the `ZALLOC_WINDOW()` and `TRY_FREE_WINDOW()` macros are introduced. | ||
3347 | 53 | |||
3348 | 54 | Software and hardware window formats do not match, therefore, | ||
3349 | 55 | `deflateSetDictionary()`, `deflateGetDictionary()`, | ||
3350 | 56 | `inflateSetDictionary()` and `inflateGetDictionary()` need special | ||
3351 | 57 | handling, which is triggered using the new | ||
3352 | 58 | `DEFLATE_SET_DICTIONARY_HOOK()`, `DEFLATE_GET_DICTIONARY_HOOK()`, | ||
3353 | 59 | `INFLATE_SET_DICTIONARY_HOOK()` and `INFLATE_GET_DICTIONARY_HOOK()` | ||
3354 | 60 | macros. | ||
3355 | 61 | |||
3356 | 62 | `deflateResetKeep()` and `inflateResetKeep()` now update the DFLTCC | ||
3357 | 63 | parameter block, which is allocated alongside zlib state, using | ||
3358 | 64 | the new `DEFLATE_RESET_KEEP_HOOK()` and `INFLATE_RESET_KEEP_HOOK()` | ||
3359 | 65 | macros. | ||
3360 | 66 | |||
3361 | 67 | The new `DEFLATE_PARAMS_HOOK()` macro switches between the hardware | ||
3362 | 68 | and the software deflate implementations when the `deflateParams()` | ||
3363 | 69 | arguments demand this. | ||
3364 | 70 | |||
3365 | 71 | The new `INFLATE_PRIME_HOOK()`, `INFLATE_MARK_HOOK()` and | ||
3366 | 72 | `INFLATE_SYNC_POINT_HOOK()` macros make the respective unsupported | ||
3367 | 73 | calls gracefully fail. | ||
3368 | 74 | |||
3369 | 75 | The algorithm implemented in the hardware has a different compression | ||
3370 | 76 | ratio than the one implemented in software. In order for | ||
3371 | 77 | `deflateBound()` to return the correct results for the hardware | ||
3372 | 78 | implementation, the new `DEFLATE_BOUND_ADJUST_COMPLEN()` and | ||
3373 | 79 | `DEFLATE_NEED_CONSERVATIVE_BOUND()` macros are introduced. | ||
3374 | 80 | |||
3375 | 81 | Actual compression and decompression are handled by the new | ||
3376 | 82 | `DEFLATE_HOOK()` and `INFLATE_TYPEDO_HOOK()` macros. Since inflation | ||
3377 | 83 | with DFLTCC manages the window on its own, calling `updatewindow()` is | ||
3378 | 84 | suppressed using the new `INFLATE_NEED_UPDATEWINDOW()` macro. | ||
3379 | 85 | |||
3380 | 86 | In addition to the compression, DFLTCC computes the CRC-32 and Adler-32 | ||
3381 | 87 | checksums, therefore, whenever it's used, the software checksumming is | ||
3382 | 88 | suppressed using the new `DEFLATE_NEED_CHECKSUM()` and | ||
3383 | 89 | `INFLATE_NEED_CHECKSUM()` macros. | ||
3384 | 90 | |||
3385 | 91 | DFLTCC will refuse to write an End-of-block Symbol if there is no input | ||
3386 | 92 | data, thus in some cases it is necessary to do this manually. In order | ||
3387 | 93 | to achieve this, `send_bits()`, `bi_reverse()`, `bi_windup()` and | ||
3388 | 94 | `flush_pending()` are promoted from `local` to `ZLIB_INTERNAL`. | ||
3389 | 95 | Furthermore, since the block and the stream termination must be handled | ||
3390 | 96 | in software as well, `enum block_state` is moved to `deflate.h`. | ||
3391 | 97 | |||
3392 | 98 | Since the first call to `dfltcc_inflate()` already needs the window, | ||
3393 | 99 | and it might not be allocated yet, `inflate_ensure_window()` is | ||
3394 | 100 | factored out of `updatewindow()` and made `ZLIB_INTERNAL`. | ||
3395 | 101 | |||
3396 | 102 | Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> | ||
3397 | 103 | Origin: iii-i/zlib, https://github.com/iii-i/zlib/commit/481ee63d5f8fa12b5c833d32d08a3c74bc62cb20 | ||
3398 | 104 | --- | ||
3399 | 105 | Makefile.in | 8 + | ||
3400 | 106 | compress.c | 14 +- | ||
3401 | 107 | configure | 24 + | ||
3402 | 108 | contrib/README.contrib | 4 + | ||
3403 | 109 | contrib/s390/README.txt | 17 + | ||
3404 | 110 | contrib/s390/dfltcc.c | 1004 +++++++++++++++++++++++++++++++++++++++++ | ||
3405 | 111 | contrib/s390/dfltcc.h | 97 ++++ | ||
3406 | 112 | contrib/s390/dfltcc_deflate.h | 53 +++ | ||
3407 | 113 | deflate.c | 76 +++- | ||
3408 | 114 | deflate.h | 12 + | ||
3409 | 115 | gzguts.h | 4 + | ||
3410 | 116 | inflate.c | 98 ++-- | ||
3411 | 117 | inflate.h | 2 + | ||
3412 | 118 | test/infcover.c | 3 +- | ||
3413 | 119 | test/minigzip.c | 4 + | ||
3414 | 120 | trees.c | 8 +- | ||
3415 | 121 | zutil.h | 2 + | ||
3416 | 122 | 17 files changed, 1371 insertions(+), 59 deletions(-) | ||
3417 | 123 | create mode 100644 contrib/s390/README.txt | ||
3418 | 124 | create mode 100644 contrib/s390/dfltcc.c | ||
3419 | 125 | create mode 100644 contrib/s390/dfltcc.h | ||
3420 | 126 | create mode 100644 contrib/s390/dfltcc_deflate.h | ||
3421 | 127 | |||
3422 | 128 | diff --git a/Makefile.in b/Makefile.in | ||
3423 | 129 | index ede4db3..1710f63 100644 | ||
3424 | 130 | --- a/Makefile.in | ||
3425 | 131 | +++ b/Makefile.in | ||
3426 | 132 | @@ -140,6 +140,14 @@ match.lo: match.S | ||
3427 | 133 | mv _match.o match.lo | ||
3428 | 134 | rm -f _match.s | ||
3429 | 135 | |||
3430 | 136 | +dfltcc.o: $(SRCDIR)contrib/s390/dfltcc.c $(SRCDIR)zlib.h zconf.h | ||
3431 | 137 | + $(CC) $(CFLAGS) $(ZINC) -c -o $@ $(SRCDIR)contrib/s390/dfltcc.c | ||
3432 | 138 | + | ||
3433 | 139 | +dfltcc.lo: $(SRCDIR)contrib/s390/dfltcc.c $(SRCDIR)zlib.h zconf.h | ||
3434 | 140 | + -@mkdir objs 2>/dev/null || test -d objs | ||
3435 | 141 | + $(CC) $(SFLAGS) $(ZINC) -DPIC -c -o objs/dfltcc.o $(SRCDIR)contrib/s390/dfltcc.c | ||
3436 | 142 | + -@mv objs/dfltcc.o $@ | ||
3437 | 143 | + | ||
3438 | 144 | crc32_test.o: $(SRCDIR)test/crc32_test.c $(SRCDIR)zlib.h zconf.h | ||
3439 | 145 | $(CC) $(CFLAGS) $(ZINCOUT) -c -o $@ $(SRCDIR)test/crc32_test.c | ||
3440 | 146 | |||
3441 | 147 | diff --git a/compress.c b/compress.c | ||
3442 | 148 | index f43bacf..08a0660 100644 | ||
3443 | 149 | --- a/compress.c | ||
3444 | 150 | +++ b/compress.c | ||
3445 | 151 | @@ -5,9 +5,15 @@ | ||
3446 | 152 | |||
3447 | 153 | /* @(#) $Id$ */ | ||
3448 | 154 | |||
3449 | 155 | -#define ZLIB_INTERNAL | ||
3450 | 156 | +#include "zutil.h" | ||
3451 | 157 | #include "zlib.h" | ||
3452 | 158 | |||
3453 | 159 | +#ifdef DFLTCC | ||
3454 | 160 | +# include "contrib/s390/dfltcc.h" | ||
3455 | 161 | +#else | ||
3456 | 162 | +#define DEFLATE_BOUND_COMPLEN(source_len) 0 | ||
3457 | 163 | +#endif | ||
3458 | 164 | + | ||
3459 | 165 | /* =========================================================================== | ||
3460 | 166 | Compresses the source buffer into the destination buffer. The level | ||
3461 | 167 | parameter has the same meaning as in deflateInit. sourceLen is the byte | ||
3462 | 168 | @@ -70,6 +76,12 @@ int ZEXPORT compress(Bytef *dest, uLongf *destLen, const Bytef *source, | ||
3463 | 169 | this function needs to be updated. | ||
3464 | 170 | */ | ||
3465 | 171 | uLong ZEXPORT compressBound(uLong sourceLen) { | ||
3466 | 172 | + uLong complen = DEFLATE_BOUND_COMPLEN(sourceLen); | ||
3467 | 173 | + | ||
3468 | 174 | + if (complen > 0) | ||
3469 | 175 | + /* Architecture-specific code provided an upper bound. */ | ||
3470 | 176 | + return complen + ZLIB_WRAPLEN; | ||
3471 | 177 | + | ||
3472 | 178 | return sourceLen + (sourceLen >> 12) + (sourceLen >> 14) + | ||
3473 | 179 | (sourceLen >> 25) + 13; | ||
3474 | 180 | } | ||
3475 | 181 | diff --git a/configure b/configure | ||
3476 | 182 | index 3372cbf..b99a348 100755 | ||
3477 | 183 | --- a/configure | ||
3478 | 184 | +++ b/configure | ||
3479 | 185 | @@ -117,6 +117,7 @@ case "$1" in | ||
3480 | 186 | echo ' configure [--const] [--zprefix] [--prefix=PREFIX] [--eprefix=EXPREFIX]' | tee -a configure.log | ||
3481 | 187 | echo ' [--static] [--64] [--libdir=LIBDIR] [--sharedlibdir=LIBDIR]' | tee -a configure.log | ||
3482 | 188 | echo ' [--includedir=INCLUDEDIR] [--archs="-arch i386 -arch x86_64"]' | tee -a configure.log | ||
3483 | 189 | + echo ' [--dfltcc] [--dfltcc-level-mask=MASK]' | tee -a configure.log | ||
3484 | 190 | exit 0 ;; | ||
3485 | 191 | -p*=* | --prefix=*) prefix=`echo $1 | sed 's/.*=//'`; shift ;; | ||
3486 | 192 | -e*=* | --eprefix=*) exec_prefix=`echo $1 | sed 's/.*=//'`; shift ;; | ||
3487 | 193 | @@ -143,6 +144,16 @@ case "$1" in | ||
3488 | 194 | --sanitize) address=1; shift ;; | ||
3489 | 195 | --address) address=1; shift ;; | ||
3490 | 196 | --memory) memory=1; shift ;; | ||
3491 | 197 | + --dfltcc) | ||
3492 | 198 | + CFLAGS="$CFLAGS -DDFLTCC" | ||
3493 | 199 | + OBJC="$OBJC dfltcc.o" | ||
3494 | 200 | + PIC_OBJC="$PIC_OBJC dfltcc.lo" | ||
3495 | 201 | + shift | ||
3496 | 202 | + ;; | ||
3497 | 203 | + --dfltcc-level-mask=*) | ||
3498 | 204 | + CFLAGS="$CFLAGS -DDFLTCC_LEVEL_MASK=`echo $1 | sed 's/.*=//'`" | ||
3499 | 205 | + shift | ||
3500 | 206 | + ;; | ||
3501 | 207 | *) | ||
3502 | 208 | echo "unknown option: $1" | tee -a configure.log | ||
3503 | 209 | echo "$0 --help for help" | tee -a configure.log | ||
3504 | 210 | @@ -834,6 +845,19 @@ EOF | ||
3505 | 211 | fi | ||
3506 | 212 | fi | ||
3507 | 213 | |||
3508 | 214 | +# Check whether sys/sdt.h is available | ||
3509 | 215 | +cat > $test.c << EOF | ||
3510 | 216 | +#include <sys/sdt.h> | ||
3511 | 217 | +int main() { return 0; } | ||
3512 | 218 | +EOF | ||
3513 | 219 | +if try $CC -c $CFLAGS $test.c; then | ||
3514 | 220 | + echo "Checking for sys/sdt.h ... Yes." | tee -a configure.log | ||
3515 | 221 | + CFLAGS="$CFLAGS -DHAVE_SYS_SDT_H" | ||
3516 | 222 | + SFLAGS="$SFLAGS -DHAVE_SYS_SDT_H" | ||
3517 | 223 | +else | ||
3518 | 224 | + echo "Checking for sys/sdt.h ... No." | tee -a configure.log | ||
3519 | 225 | +fi | ||
3520 | 226 | + | ||
3521 | 227 | # test to see if we can use a gnu indirection function to detect and load optimized code at runtime | ||
3522 | 228 | echo >> configure.log | ||
3523 | 229 | cat > $test.c <<EOF | ||
3524 | 230 | diff --git a/contrib/README.contrib b/contrib/README.contrib | ||
3525 | 231 | index 90170df..a36d404 100644 | ||
3526 | 232 | --- a/contrib/README.contrib | ||
3527 | 233 | +++ b/contrib/README.contrib | ||
3528 | 234 | @@ -55,6 +55,10 @@ puff/ by Mark Adler <madler@alumni.caltech.edu> | ||
3529 | 235 | Small, low memory usage inflate. Also serves to provide an | ||
3530 | 236 | unambiguous description of the deflate format. | ||
3531 | 237 | |||
3532 | 238 | +s390/ by Ilya Leoshkevich <iii@linux.ibm.com> | ||
3533 | 239 | + Hardware-accelerated deflate on IBM Z with DEFLATE CONVERSION CALL | ||
3534 | 240 | + instruction. | ||
3535 | 241 | + | ||
3536 | 242 | testzlib/ by Gilles Vollant <info@winimage.com> | ||
3537 | 243 | Example of the use of zlib | ||
3538 | 244 | |||
3539 | 245 | diff --git a/contrib/s390/README.txt b/contrib/s390/README.txt | ||
3540 | 246 | new file mode 100644 | ||
3541 | 247 | index 0000000..48be008 | ||
3542 | 248 | --- /dev/null | ||
3543 | 249 | +++ b/contrib/s390/README.txt | ||
3544 | 250 | @@ -0,0 +1,17 @@ | ||
3545 | 251 | +IBM Z mainframes starting from version z15 provide DFLTCC instruction, | ||
3546 | 252 | +which implements deflate algorithm in hardware with estimated | ||
3547 | 253 | +compression and decompression performance orders of magnitude faster | ||
3548 | 254 | +than the current zlib and ratio comparable with that of level 1. | ||
3549 | 255 | + | ||
3550 | 256 | +This directory adds DFLTCC support. In order to enable it, the following | ||
3551 | 257 | +build commands should be used: | ||
3552 | 258 | + | ||
3553 | 259 | + $ ./configure --dfltcc | ||
3554 | 260 | + $ make | ||
3555 | 261 | + | ||
3556 | 262 | +When built like this, zlib would compress in hardware on level 1, and in | ||
3557 | 263 | +software on all other levels. Decompression will always happen in | ||
3558 | 264 | +hardware. In order to enable DFLTCC compression for levels 1-6 (i.e. to | ||
3559 | 265 | +make it used by default) one could either configure with | ||
3560 | 266 | +--dfltcc-level-mask=0x7e or set the environment variable | ||
3561 | 267 | +DFLTCC_LEVEL_MASK to 0x7e at run time. | ||
3562 | 268 | diff --git a/contrib/s390/dfltcc.c b/contrib/s390/dfltcc.c | ||
3563 | 269 | new file mode 100644 | ||
3564 | 270 | index 0000000..f2b222d | ||
3565 | 271 | --- /dev/null | ||
3566 | 272 | +++ b/contrib/s390/dfltcc.c | ||
3567 | 273 | @@ -0,0 +1,1004 @@ | ||
3568 | 274 | +/* dfltcc.c - SystemZ DEFLATE CONVERSION CALL support. */ | ||
3569 | 275 | + | ||
3570 | 276 | +/* | ||
3571 | 277 | + Use the following commands to build zlib with DFLTCC support: | ||
3572 | 278 | + | ||
3573 | 279 | + $ ./configure --dfltcc | ||
3574 | 280 | + $ make | ||
3575 | 281 | +*/ | ||
3576 | 282 | + | ||
3577 | 283 | +#define _GNU_SOURCE | ||
3578 | 284 | +#include <ctype.h> | ||
3579 | 285 | +#include <errno.h> | ||
3580 | 286 | +#include <inttypes.h> | ||
3581 | 287 | +#include <stddef.h> | ||
3582 | 288 | +#include <stdio.h> | ||
3583 | 289 | +#include <stdint.h> | ||
3584 | 290 | +#include <stdlib.h> | ||
3585 | 291 | +#include "../../zutil.h" | ||
3586 | 292 | +#include "../../deflate.h" | ||
3587 | 293 | +#include "../../inftrees.h" | ||
3588 | 294 | +#include "../../inflate.h" | ||
3589 | 295 | +#include "dfltcc.h" | ||
3590 | 296 | +#include "dfltcc_deflate.h" | ||
3591 | 297 | +#ifdef HAVE_SYS_SDT_H | ||
3592 | 298 | +#include <sys/sdt.h> | ||
3593 | 299 | +#endif | ||
3594 | 300 | + | ||
3595 | 301 | +/* | ||
3596 | 302 | + C wrapper for the DEFLATE CONVERSION CALL instruction. | ||
3597 | 303 | + */ | ||
3598 | 304 | +typedef enum { | ||
3599 | 305 | + DFLTCC_CC_OK = 0, | ||
3600 | 306 | + DFLTCC_CC_OP1_TOO_SHORT = 1, | ||
3601 | 307 | + DFLTCC_CC_OP2_TOO_SHORT = 2, | ||
3602 | 308 | + DFLTCC_CC_OP2_CORRUPT = 2, | ||
3603 | 309 | + DFLTCC_CC_AGAIN = 3, | ||
3604 | 310 | +} dfltcc_cc; | ||
3605 | 311 | + | ||
3606 | 312 | +#define DFLTCC_QAF 0 | ||
3607 | 313 | +#define DFLTCC_GDHT 1 | ||
3608 | 314 | +#define DFLTCC_CMPR 2 | ||
3609 | 315 | +#define DFLTCC_XPND 4 | ||
3610 | 316 | +#define HBT_CIRCULAR (1 << 7) | ||
3611 | 317 | +#define HB_BITS 15 | ||
3612 | 318 | +#define HB_SIZE (1 << HB_BITS) | ||
3613 | 319 | +#define DFLTCC_FACILITY 151 | ||
3614 | 320 | + | ||
3615 | 321 | +local inline dfltcc_cc dfltcc(int fn, void *param, | ||
3616 | 322 | + Bytef **op1, size_t *len1, | ||
3617 | 323 | + z_const Bytef **op2, size_t *len2, | ||
3618 | 324 | + void *hist) | ||
3619 | 325 | +{ | ||
3620 | 326 | + Bytef *t2 = op1 ? *op1 : NULL; | ||
3621 | 327 | + size_t t3 = len1 ? *len1 : 0; | ||
3622 | 328 | + z_const Bytef *t4 = op2 ? *op2 : NULL; | ||
3623 | 329 | + size_t t5 = len2 ? *len2 : 0; | ||
3624 | 330 | + register int r0 __asm__("r0") = fn; | ||
3625 | 331 | + register void *r1 __asm__("r1") = param; | ||
3626 | 332 | + register Bytef *r2 __asm__("r2") = t2; | ||
3627 | 333 | + register size_t r3 __asm__("r3") = t3; | ||
3628 | 334 | + register z_const Bytef *r4 __asm__("r4") = t4; | ||
3629 | 335 | + register size_t r5 __asm__("r5") = t5; | ||
3630 | 336 | + int cc; | ||
3631 | 337 | + | ||
3632 | 338 | + __asm__ volatile( | ||
3633 | 339 | +#ifdef HAVE_SYS_SDT_H | ||
3634 | 340 | + STAP_PROBE_ASM(zlib, dfltcc_entry, | ||
3635 | 341 | + STAP_PROBE_ASM_TEMPLATE(5)) | ||
3636 | 342 | +#endif | ||
3637 | 343 | + ".insn rrf,0xb9390000,%[r2],%[r4],%[hist],0\n" | ||
3638 | 344 | +#ifdef HAVE_SYS_SDT_H | ||
3639 | 345 | + STAP_PROBE_ASM(zlib, dfltcc_exit, | ||
3640 | 346 | + STAP_PROBE_ASM_TEMPLATE(5)) | ||
3641 | 347 | +#endif | ||
3642 | 348 | + "ipm %[cc]\n" | ||
3643 | 349 | + : [r2] "+r" (r2) | ||
3644 | 350 | + , [r3] "+r" (r3) | ||
3645 | 351 | + , [r4] "+r" (r4) | ||
3646 | 352 | + , [r5] "+r" (r5) | ||
3647 | 353 | + , [cc] "=r" (cc) | ||
3648 | 354 | + : [r0] "r" (r0) | ||
3649 | 355 | + , [r1] "r" (r1) | ||
3650 | 356 | + , [hist] "r" (hist) | ||
3651 | 357 | +#ifdef HAVE_SYS_SDT_H | ||
3652 | 358 | + , STAP_PROBE_ASM_OPERANDS(5, r2, r3, r4, r5, hist) | ||
3653 | 359 | +#endif | ||
3654 | 360 | + : "cc", "memory"); | ||
3655 | 361 | + t2 = r2; t3 = r3; t4 = r4; t5 = r5; | ||
3656 | 362 | + | ||
3657 | 363 | + if (op1) | ||
3658 | 364 | + *op1 = t2; | ||
3659 | 365 | + if (len1) | ||
3660 | 366 | + *len1 = t3; | ||
3661 | 367 | + if (op2) | ||
3662 | 368 | + *op2 = t4; | ||
3663 | 369 | + if (len2) | ||
3664 | 370 | + *len2 = t5; | ||
3665 | 371 | + return (cc >> 28) & 3; | ||
3666 | 372 | +} | ||
3667 | 373 | + | ||
3668 | 374 | +/* | ||
3669 | 375 | + Parameter Block for Query Available Functions. | ||
3670 | 376 | + */ | ||
3671 | 377 | +#define static_assert(c, msg) \ | ||
3672 | 378 | + __attribute__((unused)) \ | ||
3673 | 379 | + static char static_assert_failed_ ## msg[c ? 1 : -1] | ||
3674 | 380 | + | ||
3675 | 381 | +struct dfltcc_qaf_param { | ||
3676 | 382 | + char fns[16]; | ||
3677 | 383 | + char reserved1[8]; | ||
3678 | 384 | + char fmts[2]; | ||
3679 | 385 | + char reserved2[6]; | ||
3680 | 386 | +}; | ||
3681 | 387 | + | ||
3682 | 388 | +static_assert(sizeof(struct dfltcc_qaf_param) == 32, | ||
3683 | 389 | + sizeof_struct_dfltcc_qaf_param_is_32); | ||
3684 | 390 | + | ||
3685 | 391 | +local inline int is_bit_set(const char *bits, int n) | ||
3686 | 392 | +{ | ||
3687 | 393 | + return bits[n / 8] & (1 << (7 - (n % 8))); | ||
3688 | 394 | +} | ||
3689 | 395 | + | ||
3690 | 396 | +local inline void clear_bit(char *bits, int n) | ||
3691 | 397 | +{ | ||
3692 | 398 | + bits[n / 8] &= ~(1 << (7 - (n % 8))); | ||
3693 | 399 | +} | ||
3694 | 400 | + | ||
3695 | 401 | +#define DFLTCC_FMT0 0 | ||
3696 | 402 | + | ||
3697 | 403 | +/* | ||
3698 | 404 | + Parameter Block for Generate Dynamic-Huffman Table, Compress and Expand. | ||
3699 | 405 | + */ | ||
3700 | 406 | +#define CVT_CRC32 0 | ||
3701 | 407 | +#define CVT_ADLER32 1 | ||
3702 | 408 | +#define HTT_FIXED 0 | ||
3703 | 409 | +#define HTT_DYNAMIC 1 | ||
3704 | 410 | + | ||
3705 | 411 | +struct dfltcc_param_v0 { | ||
3706 | 412 | + uint16_t pbvn; /* Parameter-Block-Version Number */ | ||
3707 | 413 | + uint8_t mvn; /* Model-Version Number */ | ||
3708 | 414 | + uint8_t ribm; /* Reserved for IBM use */ | ||
3709 | 415 | + unsigned reserved32 : 31; | ||
3710 | 416 | + unsigned cf : 1; /* Continuation Flag */ | ||
3711 | 417 | + uint8_t reserved64[8]; | ||
3712 | 418 | + unsigned nt : 1; /* New Task */ | ||
3713 | 419 | + unsigned reserved129 : 1; | ||
3714 | 420 | + unsigned cvt : 1; /* Check Value Type */ | ||
3715 | 421 | + unsigned reserved131 : 1; | ||
3716 | 422 | + unsigned htt : 1; /* Huffman-Table Type */ | ||
3717 | 423 | + unsigned bcf : 1; /* Block-Continuation Flag */ | ||
3718 | 424 | + unsigned bcc : 1; /* Block Closing Control */ | ||
3719 | 425 | + unsigned bhf : 1; /* Block Header Final */ | ||
3720 | 426 | + unsigned reserved136 : 1; | ||
3721 | 427 | + unsigned reserved137 : 1; | ||
3722 | 428 | + unsigned dhtgc : 1; /* DHT Generation Control */ | ||
3723 | 429 | + unsigned reserved139 : 5; | ||
3724 | 430 | + unsigned reserved144 : 5; | ||
3725 | 431 | + unsigned sbb : 3; /* Sub-Byte Boundary */ | ||
3726 | 432 | + uint8_t oesc; /* Operation-Ending-Supplemental Code */ | ||
3727 | 433 | + unsigned reserved160 : 12; | ||
3728 | 434 | + unsigned ifs : 4; /* Incomplete-Function Status */ | ||
3729 | 435 | + uint16_t ifl; /* Incomplete-Function Length */ | ||
3730 | 436 | + uint8_t reserved192[8]; | ||
3731 | 437 | + uint8_t reserved256[8]; | ||
3732 | 438 | + uint8_t reserved320[4]; | ||
3733 | 439 | + uint16_t hl; /* History Length */ | ||
3734 | 440 | + unsigned reserved368 : 1; | ||
3735 | 441 | + uint16_t ho : 15; /* History Offset */ | ||
3736 | 442 | + uint32_t cv; /* Check Value */ | ||
3737 | 443 | + unsigned eobs : 15; /* End-of-block Symbol */ | ||
3738 | 444 | + unsigned reserved431: 1; | ||
3739 | 445 | + uint8_t eobl : 4; /* End-of-block Length */ | ||
3740 | 446 | + unsigned reserved436 : 12; | ||
3741 | 447 | + unsigned reserved448 : 4; | ||
3742 | 448 | + uint16_t cdhtl : 12; /* Compressed-Dynamic-Huffman Table | ||
3743 | 449 | + Length */ | ||
3744 | 450 | + uint8_t reserved464[6]; | ||
3745 | 451 | + uint8_t cdht[288]; | ||
3746 | 452 | + uint8_t reserved[32]; | ||
3747 | 453 | + uint8_t csb[1152]; | ||
3748 | 454 | +}; | ||
3749 | 455 | + | ||
3750 | 456 | +static_assert(sizeof(struct dfltcc_param_v0) == 1536, | ||
3751 | 457 | + sizeof_struct_dfltcc_param_v0_is_1536); | ||
3752 | 458 | + | ||
3753 | 459 | +local z_const char *oesc_msg(char *buf, int oesc) | ||
3754 | 460 | +{ | ||
3755 | 461 | + if (oesc == 0x00) | ||
3756 | 462 | + return NULL; /* Successful completion */ | ||
3757 | 463 | + else { | ||
3758 | 464 | + sprintf(buf, "Operation-Ending-Supplemental Code is 0x%.2X", oesc); | ||
3759 | 465 | + return buf; | ||
3760 | 466 | + } | ||
3761 | 467 | +} | ||
3762 | 468 | + | ||
3763 | 469 | +/* | ||
3764 | 470 | + Extension of inflate_state and deflate_state. Must be doubleword-aligned. | ||
3765 | 471 | +*/ | ||
3766 | 472 | +struct dfltcc_state { | ||
3767 | 473 | + struct dfltcc_param_v0 param; /* Parameter block. */ | ||
3768 | 474 | + struct dfltcc_qaf_param af; /* Available functions. */ | ||
3769 | 475 | + uLong level_mask; /* Levels on which to use DFLTCC */ | ||
3770 | 476 | + uLong block_size; /* New block each X bytes */ | ||
3771 | 477 | + uLong block_threshold; /* New block after total_in > X */ | ||
3772 | 478 | + uLong dht_threshold; /* New block only if avail_in >= X */ | ||
3773 | 479 | + char msg[64]; /* Buffer for strm->msg */ | ||
3774 | 480 | +}; | ||
3775 | 481 | + | ||
3776 | 482 | +#define ALIGN_UP(p, size) \ | ||
3777 | 483 | + (__typeof__(p))(((uintptr_t)(p) + ((size) - 1)) & ~((size) - 1)) | ||
3778 | 484 | + | ||
3779 | 485 | +#define GET_DFLTCC_STATE(state) ((struct dfltcc_state *)( \ | ||
3780 | 486 | + (char *)(state) + ALIGN_UP(sizeof(*state), 8))) | ||
3781 | 487 | + | ||
3782 | 488 | +/* | ||
3783 | 489 | + Compress. | ||
3784 | 490 | + */ | ||
3785 | 491 | +local inline int dfltcc_can_deflate_with_params(z_streamp strm, | ||
3786 | 492 | + int level, | ||
3787 | 493 | + uInt window_bits, | ||
3788 | 494 | + int strategy) | ||
3789 | 495 | +{ | ||
3790 | 496 | + deflate_state *state = (deflate_state *)strm->state; | ||
3791 | 497 | + struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); | ||
3792 | 498 | + | ||
3793 | 499 | + /* Unsupported compression settings */ | ||
3794 | 500 | + if ((dfltcc_state->level_mask & (1 << level)) == 0) | ||
3795 | 501 | + return 0; | ||
3796 | 502 | + if (window_bits != HB_BITS) | ||
3797 | 503 | + return 0; | ||
3798 | 504 | + if (strategy != Z_FIXED && strategy != Z_DEFAULT_STRATEGY) | ||
3799 | 505 | + return 0; | ||
3800 | 506 | + | ||
3801 | 507 | + /* Unsupported hardware */ | ||
3802 | 508 | + if (!is_bit_set(dfltcc_state->af.fns, DFLTCC_GDHT) || | ||
3803 | 509 | + !is_bit_set(dfltcc_state->af.fns, DFLTCC_CMPR) || | ||
3804 | 510 | + !is_bit_set(dfltcc_state->af.fmts, DFLTCC_FMT0)) | ||
3805 | 511 | + return 0; | ||
3806 | 512 | + | ||
3807 | 513 | + return 1; | ||
3808 | 514 | +} | ||
3809 | 515 | + | ||
3810 | 516 | +int ZLIB_INTERNAL dfltcc_can_deflate(z_streamp strm) | ||
3811 | 517 | +{ | ||
3812 | 518 | + deflate_state *state = (deflate_state *)strm->state; | ||
3813 | 519 | + | ||
3814 | 520 | + return dfltcc_can_deflate_with_params(strm, | ||
3815 | 521 | + state->level, | ||
3816 | 522 | + state->w_bits, | ||
3817 | 523 | + state->strategy); | ||
3818 | 524 | +} | ||
3819 | 525 | + | ||
3820 | 526 | +local void dfltcc_gdht(z_streamp strm) | ||
3821 | 527 | +{ | ||
3822 | 528 | + deflate_state *state = (deflate_state *)strm->state; | ||
3823 | 529 | + struct dfltcc_param_v0 *param = &GET_DFLTCC_STATE(state)->param; | ||
3824 | 530 | + size_t avail_in = strm->avail_in; | ||
3825 | 531 | + | ||
3826 | 532 | + dfltcc(DFLTCC_GDHT, | ||
3827 | 533 | + param, NULL, NULL, | ||
3828 | 534 | + &strm->next_in, &avail_in, NULL); | ||
3829 | 535 | +} | ||
3830 | 536 | + | ||
3831 | 537 | +local dfltcc_cc dfltcc_cmpr(z_streamp strm) | ||
3832 | 538 | +{ | ||
3833 | 539 | + deflate_state *state = (deflate_state *)strm->state; | ||
3834 | 540 | + struct dfltcc_param_v0 *param = &GET_DFLTCC_STATE(state)->param; | ||
3835 | 541 | + size_t avail_in = strm->avail_in; | ||
3836 | 542 | + size_t avail_out = strm->avail_out; | ||
3837 | 543 | + dfltcc_cc cc; | ||
3838 | 544 | + | ||
3839 | 545 | + cc = dfltcc(DFLTCC_CMPR | HBT_CIRCULAR, | ||
3840 | 546 | + param, &strm->next_out, &avail_out, | ||
3841 | 547 | + &strm->next_in, &avail_in, state->window); | ||
3842 | 548 | + strm->total_in += (strm->avail_in - avail_in); | ||
3843 | 549 | + strm->total_out += (strm->avail_out - avail_out); | ||
3844 | 550 | + strm->avail_in = avail_in; | ||
3845 | 551 | + strm->avail_out = avail_out; | ||
3846 | 552 | + return cc; | ||
3847 | 553 | +} | ||
3848 | 554 | + | ||
3849 | 555 | +local void send_eobs(z_streamp strm, | ||
3850 | 556 | + z_const struct dfltcc_param_v0 *param) | ||
3851 | 557 | +{ | ||
3852 | 558 | + deflate_state *state = (deflate_state *)strm->state; | ||
3853 | 559 | + | ||
3854 | 560 | + _tr_send_bits( | ||
3855 | 561 | + state, | ||
3856 | 562 | + bi_reverse(param->eobs >> (15 - param->eobl), param->eobl), | ||
3857 | 563 | + param->eobl); | ||
3858 | 564 | + flush_pending(strm); | ||
3859 | 565 | + if (state->pending != 0) { | ||
3860 | 566 | + /* The remaining data is located in pending_out[0:pending]. If someone | ||
3861 | 567 | + * calls put_byte() - this might happen in deflate() - the byte will be | ||
3862 | 568 | + * placed into pending_buf[pending], which is incorrect. Move the | ||
3863 | 569 | + * remaining data to the beginning of pending_buf so that put_byte() is | ||
3864 | 570 | + * usable again. | ||
3865 | 571 | + */ | ||
3866 | 572 | + memmove(state->pending_buf, state->pending_out, state->pending); | ||
3867 | 573 | + state->pending_out = state->pending_buf; | ||
3868 | 574 | + } | ||
3869 | 575 | +#ifdef ZLIB_DEBUG | ||
3870 | 576 | + state->compressed_len += param->eobl; | ||
3871 | 577 | +#endif | ||
3872 | 578 | +} | ||
3873 | 579 | + | ||
3874 | 580 | +int ZLIB_INTERNAL dfltcc_deflate(z_streamp strm, int flush, | ||
3875 | 581 | + block_state *result) | ||
3876 | 582 | +{ | ||
3877 | 583 | + deflate_state *state = (deflate_state *)strm->state; | ||
3878 | 584 | + struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); | ||
3879 | 585 | + struct dfltcc_param_v0 *param = &dfltcc_state->param; | ||
3880 | 586 | + uInt masked_avail_in; | ||
3881 | 587 | + dfltcc_cc cc; | ||
3882 | 588 | + int need_empty_block; | ||
3883 | 589 | + int soft_bcc; | ||
3884 | 590 | + int no_flush; | ||
3885 | 591 | + | ||
3886 | 592 | + if (!dfltcc_can_deflate(strm)) { | ||
3887 | 593 | + /* Clear history. */ | ||
3888 | 594 | + if (flush == Z_FULL_FLUSH) | ||
3889 | 595 | + param->hl = 0; | ||
3890 | 596 | + return 0; | ||
3891 | 597 | + } | ||
3892 | 598 | + | ||
3893 | 599 | +again: | ||
3894 | 600 | + masked_avail_in = 0; | ||
3895 | 601 | + soft_bcc = 0; | ||
3896 | 602 | + no_flush = flush == Z_NO_FLUSH; | ||
3897 | 603 | + | ||
3898 | 604 | + /* No input data. Return, except when Continuation Flag is set, which means | ||
3899 | 605 | + * that DFLTCC has buffered some output in the parameter block and needs to | ||
3900 | 606 | + * be called again in order to flush it. | ||
3901 | 607 | + */ | ||
3902 | 608 | + if (strm->avail_in == 0 && !param->cf) { | ||
3903 | 609 | + /* A block is still open, and the hardware does not support closing | ||
3904 | 610 | + * blocks without adding data. Thus, close it manually. | ||
3905 | 611 | + */ | ||
3906 | 612 | + if (!no_flush && param->bcf) { | ||
3907 | 613 | + send_eobs(strm, param); | ||
3908 | 614 | + param->bcf = 0; | ||
3909 | 615 | + } | ||
3910 | 616 | + /* Let one of deflate_* functions write a trailing empty block. */ | ||
3911 | 617 | + if (flush == Z_FINISH) | ||
3912 | 618 | + return 0; | ||
3913 | 619 | + /* Clear history. */ | ||
3914 | 620 | + if (flush == Z_FULL_FLUSH) | ||
3915 | 621 | + param->hl = 0; | ||
3916 | 622 | + /* Trigger block post-processing if necessary. */ | ||
3917 | 623 | + *result = no_flush ? need_more : block_done; | ||
3918 | 624 | + return 1; | ||
3919 | 625 | + } | ||
3920 | 626 | + | ||
3921 | 627 | + /* There is an open non-BFINAL block, we are not going to close it just | ||
3922 | 628 | + * yet, we have compressed more than DFLTCC_BLOCK_SIZE bytes and we see | ||
3923 | 629 | + * more than DFLTCC_DHT_MIN_SAMPLE_SIZE bytes. Open a new block with a new | ||
3924 | 630 | + * DHT in order to adapt to a possibly changed input data distribution. | ||
3925 | 631 | + */ | ||
3926 | 632 | + if (param->bcf && no_flush && | ||
3927 | 633 | + strm->total_in > dfltcc_state->block_threshold && | ||
3928 | 634 | + strm->avail_in >= dfltcc_state->dht_threshold) { | ||
3929 | 635 | + if (param->cf) { | ||
3930 | 636 | + /* We need to flush the DFLTCC buffer before writing the | ||
3931 | 637 | + * End-of-block Symbol. Mask the input data and proceed as usual. | ||
3932 | 638 | + */ | ||
3933 | 639 | + masked_avail_in += strm->avail_in; | ||
3934 | 640 | + strm->avail_in = 0; | ||
3935 | 641 | + no_flush = 0; | ||
3936 | 642 | + } else { | ||
3937 | 643 | + /* DFLTCC buffer is empty, so we can manually write the | ||
3938 | 644 | + * End-of-block Symbol right away. | ||
3939 | 645 | + */ | ||
3940 | 646 | + send_eobs(strm, param); | ||
3941 | 647 | + param->bcf = 0; | ||
3942 | 648 | + dfltcc_state->block_threshold = | ||
3943 | 649 | + strm->total_in + dfltcc_state->block_size; | ||
3944 | 650 | + } | ||
3945 | 651 | + } | ||
3946 | 652 | + | ||
3947 | 653 | + /* No space for compressed data. If we proceed, dfltcc_cmpr() will return | ||
3948 | 654 | + * DFLTCC_CC_OP1_TOO_SHORT without buffering header bits, but we will still | ||
3949 | 655 | + * set BCF=1, which is wrong. Avoid complications and return early. | ||
3950 | 656 | + */ | ||
3951 | 657 | + if (strm->avail_out == 0) { | ||
3952 | 658 | + *result = need_more; | ||
3953 | 659 | + return 1; | ||
3954 | 660 | + } | ||
3955 | 661 | + | ||
3956 | 662 | + /* The caller gave us too much data. Pass only one block worth of | ||
3957 | 663 | + * uncompressed data to DFLTCC and mask the rest, so that on the next | ||
3958 | 664 | + * iteration we start a new block. | ||
3959 | 665 | + */ | ||
3960 | 666 | + if (no_flush && strm->avail_in > dfltcc_state->block_size) { | ||
3961 | 667 | + masked_avail_in += (strm->avail_in - dfltcc_state->block_size); | ||
3962 | 668 | + strm->avail_in = dfltcc_state->block_size; | ||
3963 | 669 | + } | ||
3964 | 670 | + | ||
3965 | 671 | + /* When we have an open non-BFINAL deflate block and caller indicates that | ||
3966 | 672 | + * the stream is ending, we need to close an open deflate block and open a | ||
3967 | 673 | + * BFINAL one. | ||
3968 | 674 | + */ | ||
3969 | 675 | + need_empty_block = flush == Z_FINISH && param->bcf && !param->bhf; | ||
3970 | 676 | + | ||
3971 | 677 | + /* Translate stream to parameter block */ | ||
3972 | 678 | + param->cvt = state->wrap == 2 ? CVT_CRC32 : CVT_ADLER32; | ||
3973 | 679 | + if (!no_flush) | ||
3974 | 680 | + /* We need to close a block. Always do this in software - when there is | ||
3975 | 681 | + * no input data, the hardware will not honor BCC. */ | ||
3976 | 682 | + soft_bcc = 1; | ||
3977 | 683 | + if (flush == Z_FINISH && !param->bcf) | ||
3978 | 684 | + /* We are about to open a BFINAL block, set Block Header Final bit | ||
3979 | 685 | + * until the stream ends. | ||
3980 | 686 | + */ | ||
3981 | 687 | + param->bhf = 1; | ||
3982 | 688 | + /* DFLTCC-CMPR will write to next_out, so make sure that buffers with | ||
3983 | 689 | + * higher precedence are empty. | ||
3984 | 690 | + */ | ||
3985 | 691 | + Assert(state->pending == 0, "There must be no pending bytes"); | ||
3986 | 692 | + Assert(state->bi_valid < 8, "There must be less than 8 pending bits"); | ||
3987 | 693 | + param->sbb = (unsigned int)state->bi_valid; | ||
3988 | 694 | + if (param->sbb > 0) | ||
3989 | 695 | + *strm->next_out = (Bytef)state->bi_buf; | ||
3990 | 696 | + /* Honor history and check value */ | ||
3991 | 697 | + param->nt = 0; | ||
3992 | 698 | + if (state->wrap == 1) | ||
3993 | 699 | + param->cv = strm->adler; | ||
3994 | 700 | + else if (state->wrap == 2) | ||
3995 | 701 | + param->cv = ZSWAP32(strm->adler); | ||
3996 | 702 | + | ||
3997 | 703 | + /* When opening a block, choose a Huffman-Table Type */ | ||
3998 | 704 | + if (!param->bcf) { | ||
3999 | 705 | + if (state->strategy == Z_FIXED || | ||
4000 | 706 | + (strm->total_in == 0 && dfltcc_state->block_threshold > 0)) | ||
4001 | 707 | + param->htt = HTT_FIXED; | ||
4002 | 708 | + else { | ||
4003 | 709 | + param->htt = HTT_DYNAMIC; | ||
4004 | 710 | + dfltcc_gdht(strm); | ||
4005 | 711 | + } | ||
4006 | 712 | + } | ||
4007 | 713 | + | ||
4008 | 714 | + /* Deflate */ | ||
4009 | 715 | + do { | ||
4010 | 716 | + cc = dfltcc_cmpr(strm); | ||
4011 | 717 | + if (strm->avail_in < 4096 && masked_avail_in > 0) | ||
4012 | 718 | + /* We are about to call DFLTCC with a small input buffer, which is | ||
4013 | 719 | + * inefficient. Since there is masked data, there will be at least | ||
4014 | 720 | + * one more DFLTCC call, so skip the current one and make the next | ||
4015 | 721 | + * one handle more data. | ||
4016 | 722 | + */ | ||
4017 | 723 | + break; | ||
4018 | 724 | + } while (cc == DFLTCC_CC_AGAIN); | ||
4019 | 725 | + | ||
4020 | 726 | + /* Translate parameter block to stream */ | ||
4021 | 727 | + strm->msg = oesc_msg(dfltcc_state->msg, param->oesc); | ||
4022 | 728 | + state->bi_valid = param->sbb; | ||
4023 | 729 | + if (state->bi_valid == 0) | ||
4024 | 730 | + state->bi_buf = 0; /* Avoid accessing next_out */ | ||
4025 | 731 | + else | ||
4026 | 732 | + state->bi_buf = *strm->next_out & ((1 << state->bi_valid) - 1); | ||
4027 | 733 | + if (state->wrap == 1) | ||
4028 | 734 | + strm->adler = param->cv; | ||
4029 | 735 | + else if (state->wrap == 2) | ||
4030 | 736 | + strm->adler = ZSWAP32(param->cv); | ||
4031 | 737 | + | ||
4032 | 738 | + /* Unmask the input data */ | ||
4033 | 739 | + strm->avail_in += masked_avail_in; | ||
4034 | 740 | + masked_avail_in = 0; | ||
4035 | 741 | + | ||
4036 | 742 | + /* If we encounter an error, it means there is a bug in DFLTCC call */ | ||
4037 | 743 | + Assert(cc != DFLTCC_CC_OP2_CORRUPT || param->oesc == 0, "BUG"); | ||
4038 | 744 | + | ||
4039 | 745 | + /* Update Block-Continuation Flag. It will be used to check whether to call | ||
4040 | 746 | + * GDHT the next time. | ||
4041 | 747 | + */ | ||
4042 | 748 | + if (cc == DFLTCC_CC_OK) { | ||
4043 | 749 | + if (soft_bcc) { | ||
4044 | 750 | + send_eobs(strm, param); | ||
4045 | 751 | + param->bcf = 0; | ||
4046 | 752 | + dfltcc_state->block_threshold = | ||
4047 | 753 | + strm->total_in + dfltcc_state->block_size; | ||
4048 | 754 | + } else | ||
4049 | 755 | + param->bcf = 1; | ||
4050 | 756 | + if (flush == Z_FINISH) { | ||
4051 | 757 | + if (need_empty_block) | ||
4052 | 758 | + /* Make the current deflate() call also close the stream */ | ||
4053 | 759 | + return 0; | ||
4054 | 760 | + else { | ||
4055 | 761 | + bi_windup(state); | ||
4056 | 762 | + *result = finish_done; | ||
4057 | 763 | + } | ||
4058 | 764 | + } else { | ||
4059 | 765 | + if (flush == Z_FULL_FLUSH) | ||
4060 | 766 | + param->hl = 0; /* Clear history */ | ||
4061 | 767 | + *result = flush == Z_NO_FLUSH ? need_more : block_done; | ||
4062 | 768 | + } | ||
4063 | 769 | + } else { | ||
4064 | 770 | + param->bcf = 1; | ||
4065 | 771 | + *result = need_more; | ||
4066 | 772 | + } | ||
4067 | 773 | + if (strm->avail_in != 0 && strm->avail_out != 0) | ||
4068 | 774 | + goto again; /* deflate() must use all input or all output */ | ||
4069 | 775 | + return 1; | ||
4070 | 776 | +} | ||
4071 | 777 | + | ||
4072 | 778 | +/* | ||
4073 | 779 | + Expand. | ||
4074 | 780 | + */ | ||
4075 | 781 | +int ZLIB_INTERNAL dfltcc_can_inflate(z_streamp strm) | ||
4076 | 782 | +{ | ||
4077 | 783 | + struct inflate_state *state = (struct inflate_state *)strm->state; | ||
4078 | 784 | + struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); | ||
4079 | 785 | + | ||
4080 | 786 | + /* Unsupported hardware */ | ||
4081 | 787 | + return is_bit_set(dfltcc_state->af.fns, DFLTCC_XPND) && | ||
4082 | 788 | + is_bit_set(dfltcc_state->af.fmts, DFLTCC_FMT0); | ||
4083 | 789 | +} | ||
4084 | 790 | + | ||
4085 | 791 | +local dfltcc_cc dfltcc_xpnd(z_streamp strm) | ||
4086 | 792 | +{ | ||
4087 | 793 | + struct inflate_state *state = (struct inflate_state *)strm->state; | ||
4088 | 794 | + struct dfltcc_param_v0 *param = &GET_DFLTCC_STATE(state)->param; | ||
4089 | 795 | + size_t avail_in = strm->avail_in; | ||
4090 | 796 | + size_t avail_out = strm->avail_out; | ||
4091 | 797 | + dfltcc_cc cc; | ||
4092 | 798 | + | ||
4093 | 799 | + cc = dfltcc(DFLTCC_XPND | HBT_CIRCULAR, | ||
4094 | 800 | + param, &strm->next_out, &avail_out, | ||
4095 | 801 | + &strm->next_in, &avail_in, state->window); | ||
4096 | 802 | + strm->avail_in = avail_in; | ||
4097 | 803 | + strm->avail_out = avail_out; | ||
4098 | 804 | + return cc; | ||
4099 | 805 | +} | ||
4100 | 806 | + | ||
4101 | 807 | +dfltcc_inflate_action ZLIB_INTERNAL dfltcc_inflate(z_streamp strm, int flush, | ||
4102 | 808 | + int *ret) | ||
4103 | 809 | +{ | ||
4104 | 810 | + struct inflate_state *state = (struct inflate_state *)strm->state; | ||
4105 | 811 | + struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); | ||
4106 | 812 | + struct dfltcc_param_v0 *param = &dfltcc_state->param; | ||
4107 | 813 | + dfltcc_cc cc; | ||
4108 | 814 | + | ||
4109 | 815 | + if (flush == Z_BLOCK || flush == Z_TREES) { | ||
4110 | 816 | + /* DFLTCC does not support stopping on block boundaries */ | ||
4111 | 817 | + if (dfltcc_inflate_disable(strm)) { | ||
4112 | 818 | + *ret = Z_STREAM_ERROR; | ||
4113 | 819 | + return DFLTCC_INFLATE_BREAK; | ||
4114 | 820 | + } else | ||
4115 | 821 | + return DFLTCC_INFLATE_SOFTWARE; | ||
4116 | 822 | + } | ||
4117 | 823 | + | ||
4118 | 824 | + if (state->last) { | ||
4119 | 825 | + if (state->bits != 0) { | ||
4120 | 826 | + strm->next_in++; | ||
4121 | 827 | + strm->avail_in--; | ||
4122 | 828 | + state->bits = 0; | ||
4123 | 829 | + } | ||
4124 | 830 | + state->mode = CHECK; | ||
4125 | 831 | + return DFLTCC_INFLATE_CONTINUE; | ||
4126 | 832 | + } | ||
4127 | 833 | + | ||
4128 | 834 | + if (strm->avail_in == 0 && !param->cf) | ||
4129 | 835 | + return DFLTCC_INFLATE_BREAK; | ||
4130 | 836 | + | ||
4131 | 837 | + if (inflate_ensure_window(state)) { | ||
4132 | 838 | + state->mode = MEM; | ||
4133 | 839 | + return DFLTCC_INFLATE_CONTINUE; | ||
4134 | 840 | + } | ||
4135 | 841 | + | ||
4136 | 842 | + /* Translate stream to parameter block */ | ||
4137 | 843 | + param->cvt = ((state->wrap & 4) && state->flags) ? CVT_CRC32 : CVT_ADLER32; | ||
4138 | 844 | + param->sbb = state->bits; | ||
4139 | 845 | + if (param->hl) | ||
4140 | 846 | + param->nt = 0; /* Honor history for the first block */ | ||
4141 | 847 | + if (state->wrap & 4) | ||
4142 | 848 | + param->cv = state->flags ? ZSWAP32(state->check) : state->check; | ||
4143 | 849 | + | ||
4144 | 850 | + /* Inflate */ | ||
4145 | 851 | + do { | ||
4146 | 852 | + cc = dfltcc_xpnd(strm); | ||
4147 | 853 | + } while (cc == DFLTCC_CC_AGAIN); | ||
4148 | 854 | + | ||
4149 | 855 | + /* Translate parameter block to stream */ | ||
4150 | 856 | + strm->msg = oesc_msg(dfltcc_state->msg, param->oesc); | ||
4151 | 857 | + state->last = cc == DFLTCC_CC_OK; | ||
4152 | 858 | + state->bits = param->sbb; | ||
4153 | 859 | + if (state->wrap & 4) | ||
4154 | 860 | + strm->adler = state->check = state->flags ? | ||
4155 | 861 | + ZSWAP32(param->cv) : param->cv; | ||
4156 | 862 | + if (cc == DFLTCC_CC_OP2_CORRUPT && param->oesc != 0) { | ||
4157 | 863 | + /* Report an error if stream is corrupted */ | ||
4158 | 864 | + state->mode = BAD; | ||
4159 | 865 | + return DFLTCC_INFLATE_CONTINUE; | ||
4160 | 866 | + } | ||
4161 | 867 | + state->mode = TYPEDO; | ||
4162 | 868 | + /* Break if operands are exhausted, otherwise continue looping */ | ||
4163 | 869 | + return (cc == DFLTCC_CC_OP1_TOO_SHORT || cc == DFLTCC_CC_OP2_TOO_SHORT) ? | ||
4164 | 870 | + DFLTCC_INFLATE_BREAK : DFLTCC_INFLATE_CONTINUE; | ||
4165 | 871 | +} | ||
4166 | 872 | + | ||
4167 | 873 | +int ZLIB_INTERNAL dfltcc_was_inflate_used(z_streamp strm) | ||
4168 | 874 | +{ | ||
4169 | 875 | + struct inflate_state *state = (struct inflate_state *)strm->state; | ||
4170 | 876 | + struct dfltcc_param_v0 *param = &GET_DFLTCC_STATE(state)->param; | ||
4171 | 877 | + | ||
4172 | 878 | + return !param->nt; | ||
4173 | 879 | +} | ||
4174 | 880 | + | ||
4175 | 881 | +/* | ||
4176 | 882 | + Rotates a circular buffer. | ||
4177 | 883 | + The implementation is based on https://cplusplus.com/reference/algorithm/rotate/ | ||
4178 | 884 | + */ | ||
4179 | 885 | +local void rotate(Bytef *start, Bytef *pivot, Bytef *end) | ||
4180 | 886 | +{ | ||
4181 | 887 | + Bytef *p = pivot; | ||
4182 | 888 | + Bytef tmp; | ||
4183 | 889 | + | ||
4184 | 890 | + while (p != start) { | ||
4185 | 891 | + tmp = *start; | ||
4186 | 892 | + *start = *p; | ||
4187 | 893 | + *p = tmp; | ||
4188 | 894 | + | ||
4189 | 895 | + start++; | ||
4190 | 896 | + p++; | ||
4191 | 897 | + | ||
4192 | 898 | + if (p == end) | ||
4193 | 899 | + p = pivot; | ||
4194 | 900 | + else if (start == pivot) | ||
4195 | 901 | + pivot = p; | ||
4196 | 902 | + } | ||
4197 | 903 | +} | ||
4198 | 904 | + | ||
4199 | 905 | +#define MIN(x, y) ({ \ | ||
4200 | 906 | + typeof(x) _x = (x); \ | ||
4201 | 907 | + typeof(y) _y = (y); \ | ||
4202 | 908 | + _x < _y ? _x : _y; \ | ||
4203 | 909 | +}) | ||
4204 | 910 | + | ||
4205 | 911 | +#define MAX(x, y) ({ \ | ||
4206 | 912 | + typeof(x) _x = (x); \ | ||
4207 | 913 | + typeof(y) _y = (y); \ | ||
4208 | 914 | + _x > _y ? _x : _y; \ | ||
4209 | 915 | +}) | ||
4210 | 916 | + | ||
4211 | 917 | +int ZLIB_INTERNAL dfltcc_inflate_disable(z_streamp strm) | ||
4212 | 918 | +{ | ||
4213 | 919 | + struct inflate_state *state = (struct inflate_state *)strm->state; | ||
4214 | 920 | + struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); | ||
4215 | 921 | + struct dfltcc_param_v0 *param = &dfltcc_state->param; | ||
4216 | 922 | + | ||
4217 | 923 | + if (!dfltcc_can_inflate(strm)) | ||
4218 | 924 | + return 0; | ||
4219 | 925 | + if (dfltcc_was_inflate_used(strm)) | ||
4220 | 926 | + /* DFLTCC has already decompressed some data. Since there is not | ||
4221 | 927 | + * enough information to resume decompression in software, the call | ||
4222 | 928 | + * must fail. | ||
4223 | 929 | + */ | ||
4224 | 930 | + return 1; | ||
4225 | 931 | + /* DFLTCC was not used yet - decompress in software */ | ||
4226 | 932 | + memset(&dfltcc_state->af, 0, sizeof(dfltcc_state->af)); | ||
4227 | 933 | + /* Convert the window from the hardware to the software format */ | ||
4228 | 934 | + rotate(state->window, state->window + param->ho, state->window + HB_SIZE); | ||
4229 | 935 | + state->whave = state->wnext = MIN(param->hl, state->wsize); | ||
4230 | 936 | + return 0; | ||
4231 | 937 | +} | ||
4232 | 938 | + | ||
4233 | 939 | +local int env_dfltcc_disabled; | ||
4234 | 940 | +local int env_source_date_epoch; | ||
4235 | 941 | +local unsigned long env_level_mask; | ||
4236 | 942 | +local unsigned long env_block_size; | ||
4237 | 943 | +local unsigned long env_block_threshold; | ||
4238 | 944 | +local unsigned long env_dht_threshold; | ||
4239 | 945 | +local unsigned long env_ribm; | ||
4240 | 946 | +local uint64_t cpu_facilities[(DFLTCC_FACILITY / 64) + 1]; | ||
4241 | 947 | +local struct dfltcc_qaf_param cpu_af __attribute__((aligned(8))); | ||
4242 | 948 | + | ||
4243 | 949 | +local inline int is_dfltcc_enabled(void) | ||
4244 | 950 | +{ | ||
4245 | 951 | + if (env_dfltcc_disabled) | ||
4246 | 952 | + /* User has explicitly disabled DFLTCC. */ | ||
4247 | 953 | + return 0; | ||
4248 | 954 | + | ||
4249 | 955 | + return is_bit_set((const char *)cpu_facilities, DFLTCC_FACILITY); | ||
4250 | 956 | +} | ||
4251 | 957 | + | ||
4252 | 958 | +local unsigned long xstrtoul(const char *s, unsigned long _default) | ||
4253 | 959 | +{ | ||
4254 | 960 | + char *endptr; | ||
4255 | 961 | + unsigned long result; | ||
4256 | 962 | + | ||
4257 | 963 | + if (!(s && *s)) | ||
4258 | 964 | + return _default; | ||
4259 | 965 | + errno = 0; | ||
4260 | 966 | + result = strtoul(s, &endptr, 0); | ||
4261 | 967 | + return (errno || *endptr) ? _default : result; | ||
4262 | 968 | +} | ||
4263 | 969 | + | ||
4264 | 970 | +__attribute__((constructor)) local void init_globals(void) | ||
4265 | 971 | +{ | ||
4266 | 972 | + const char *env; | ||
4267 | 973 | + register char r0 __asm__("r0"); | ||
4268 | 974 | + | ||
4269 | 975 | + env = secure_getenv("DFLTCC"); | ||
4270 | 976 | + env_dfltcc_disabled = env && !strcmp(env, "0"); | ||
4271 | 977 | + | ||
4272 | 978 | + env = secure_getenv("SOURCE_DATE_EPOCH"); | ||
4273 | 979 | + env_source_date_epoch = !!env; | ||
4274 | 980 | + | ||
4275 | 981 | +#ifndef DFLTCC_LEVEL_MASK | ||
4276 | 982 | +#define DFLTCC_LEVEL_MASK 0x2 | ||
4277 | 983 | +#endif | ||
4278 | 984 | + env_level_mask = xstrtoul(secure_getenv("DFLTCC_LEVEL_MASK"), | ||
4279 | 985 | + DFLTCC_LEVEL_MASK); | ||
4280 | 986 | + | ||
4281 | 987 | +#ifndef DFLTCC_BLOCK_SIZE | ||
4282 | 988 | +#define DFLTCC_BLOCK_SIZE 1048576 | ||
4283 | 989 | +#endif | ||
4284 | 990 | + env_block_size = xstrtoul(secure_getenv("DFLTCC_BLOCK_SIZE"), | ||
4285 | 991 | + DFLTCC_BLOCK_SIZE); | ||
4286 | 992 | + | ||
4287 | 993 | +#ifndef DFLTCC_FIRST_FHT_BLOCK_SIZE | ||
4288 | 994 | +#define DFLTCC_FIRST_FHT_BLOCK_SIZE 4096 | ||
4289 | 995 | +#endif | ||
4290 | 996 | + env_block_threshold = xstrtoul(secure_getenv("DFLTCC_FIRST_FHT_BLOCK_SIZE"), | ||
4291 | 997 | + DFLTCC_FIRST_FHT_BLOCK_SIZE); | ||
4292 | 998 | + | ||
4293 | 999 | +#ifndef DFLTCC_DHT_MIN_SAMPLE_SIZE | ||
4294 | 1000 | +#define DFLTCC_DHT_MIN_SAMPLE_SIZE 4096 | ||
4295 | 1001 | +#endif | ||
4296 | 1002 | + env_dht_threshold = xstrtoul(secure_getenv("DFLTCC_DHT_MIN_SAMPLE_SIZE"), | ||
4297 | 1003 | + DFLTCC_DHT_MIN_SAMPLE_SIZE); | ||
4298 | 1004 | + | ||
4299 | 1005 | +#ifndef DFLTCC_RIBM | ||
4300 | 1006 | +#define DFLTCC_RIBM 0 | ||
4301 | 1007 | +#endif | ||
4302 | 1008 | + env_ribm = xstrtoul(secure_getenv("DFLTCC_RIBM"), DFLTCC_RIBM); | ||
4303 | 1009 | + | ||
4304 | 1010 | + memset(cpu_facilities, 0, sizeof(cpu_facilities)); | ||
4305 | 1011 | + r0 = sizeof(cpu_facilities) / sizeof(cpu_facilities[0]) - 1; | ||
4306 | 1012 | + /* STFLE is supported since z9-109 and only in z/Architecture mode. When | ||
4307 | 1013 | + * compiling with -m31, gcc defaults to ESA mode, however, since the kernel | ||
4308 | 1014 | + * is 64-bit, it's always z/Architecture mode at runtime. | ||
4309 | 1015 | + */ | ||
4310 | 1016 | + __asm__ volatile( | ||
4311 | 1017 | +#ifndef __clang__ | ||
4312 | 1018 | + ".machinemode push\n" | ||
4313 | 1019 | + ".machinemode zarch\n" | ||
4314 | 1020 | +#endif | ||
4315 | 1021 | + "stfle %[facilities]\n" | ||
4316 | 1022 | +#ifndef __clang__ | ||
4317 | 1023 | + ".machinemode pop\n" | ||
4318 | 1024 | +#endif | ||
4319 | 1025 | + : [facilities] "=Q" (cpu_facilities) | ||
4320 | 1026 | + , [r0] "+r" (r0) | ||
4321 | 1027 | + : | ||
4322 | 1028 | + : "cc"); | ||
4323 | 1029 | + | ||
4324 | 1030 | + /* Initialize available functions */ | ||
4325 | 1031 | + if (is_dfltcc_enabled()) | ||
4326 | 1032 | + dfltcc(DFLTCC_QAF, &cpu_af, NULL, NULL, NULL, NULL, NULL); | ||
4327 | 1033 | + else | ||
4328 | 1034 | + memset(&cpu_af, 0, sizeof(cpu_af)); | ||
4329 | 1035 | +} | ||
4330 | 1036 | + | ||
4331 | 1037 | +/* | ||
4332 | 1038 | + Memory management. | ||
4333 | 1039 | + | ||
4334 | 1040 | + DFLTCC requires parameter blocks and window to be aligned. zlib allows | ||
4335 | 1041 | + users to specify their own allocation functions, so using e.g. | ||
4336 | 1042 | + `posix_memalign' is not an option. Thus, we overallocate and take the | ||
4337 | 1043 | + aligned portion of the buffer. | ||
4338 | 1044 | +*/ | ||
4339 | 1045 | +void ZLIB_INTERNAL dfltcc_reset(z_streamp strm, uInt size) | ||
4340 | 1046 | +{ | ||
4341 | 1047 | + struct dfltcc_state *dfltcc_state = | ||
4342 | 1048 | + (struct dfltcc_state *)((char *)strm->state + ALIGN_UP(size, 8)); | ||
4343 | 1049 | + | ||
4344 | 1050 | + memcpy(&dfltcc_state->af, &cpu_af, sizeof(dfltcc_state->af)); | ||
4345 | 1051 | + | ||
4346 | 1052 | + if (env_source_date_epoch) | ||
4347 | 1053 | + /* User needs reproducible results, but the output of DFLTCC_CMPR | ||
4348 | 1054 | + * depends on buffers' page offsets. | ||
4349 | 1055 | + */ | ||
4350 | 1056 | + clear_bit(dfltcc_state->af.fns, DFLTCC_CMPR); | ||
4351 | 1057 | + | ||
4352 | 1058 | + /* Initialize parameter block */ | ||
4353 | 1059 | + memset(&dfltcc_state->param, 0, sizeof(dfltcc_state->param)); | ||
4354 | 1060 | + dfltcc_state->param.nt = 1; | ||
4355 | 1061 | + | ||
4356 | 1062 | + /* Initialize tuning parameters */ | ||
4357 | 1063 | + dfltcc_state->level_mask = env_level_mask; | ||
4358 | 1064 | + dfltcc_state->block_size = env_block_size; | ||
4359 | 1065 | + dfltcc_state->block_threshold = env_block_threshold; | ||
4360 | 1066 | + dfltcc_state->dht_threshold = env_dht_threshold; | ||
4361 | 1067 | + dfltcc_state->param.ribm = env_ribm; | ||
4362 | 1068 | +} | ||
4363 | 1069 | + | ||
4364 | 1070 | +voidpf ZLIB_INTERNAL dfltcc_alloc_state(z_streamp strm, uInt items, uInt size) | ||
4365 | 1071 | +{ | ||
4366 | 1072 | + return ZALLOC(strm, | ||
4367 | 1073 | + ALIGN_UP(items * size, 8) + sizeof(struct dfltcc_state), | ||
4368 | 1074 | + sizeof(unsigned char)); | ||
4369 | 1075 | +} | ||
4370 | 1076 | + | ||
4371 | 1077 | +void ZLIB_INTERNAL dfltcc_copy_state(voidpf dst, const voidpf src, uInt size) | ||
4372 | 1078 | +{ | ||
4373 | 1079 | + zmemcpy(dst, src, ALIGN_UP(size, 8) + sizeof(struct dfltcc_state)); | ||
4374 | 1080 | +} | ||
4375 | 1081 | + | ||
4376 | 1082 | +static const int PAGE_ALIGN = 0x1000; | ||
4377 | 1083 | + | ||
4378 | 1084 | +voidpf ZLIB_INTERNAL dfltcc_alloc_window(z_streamp strm, uInt items, uInt size) | ||
4379 | 1085 | +{ | ||
4380 | 1086 | + voidpf p, w; | ||
4381 | 1087 | + | ||
4382 | 1088 | + /* To simplify freeing, we store the pointer to the allocated buffer right | ||
4383 | 1089 | + * before the window. Note that DFLTCC always uses HB_SIZE bytes. | ||
4384 | 1090 | + */ | ||
4385 | 1091 | + p = ZALLOC(strm, sizeof(voidpf) + MAX(items * size, HB_SIZE) + PAGE_ALIGN, | ||
4386 | 1092 | + sizeof(unsigned char)); | ||
4387 | 1093 | + if (p == NULL) | ||
4388 | 1094 | + return NULL; | ||
4389 | 1095 | + w = ALIGN_UP((char *)p + sizeof(voidpf), PAGE_ALIGN); | ||
4390 | 1096 | + *(voidpf *)((char *)w - sizeof(voidpf)) = p; | ||
4391 | 1097 | + return w; | ||
4392 | 1098 | +} | ||
4393 | 1099 | + | ||
4394 | 1100 | +void ZLIB_INTERNAL dfltcc_copy_window(void *dest, const void *src, size_t n) | ||
4395 | 1101 | +{ | ||
4396 | 1102 | + memcpy(dest, src, MAX(n, HB_SIZE)); | ||
4397 | 1103 | +} | ||
4398 | 1104 | + | ||
4399 | 1105 | +void ZLIB_INTERNAL dfltcc_free_window(z_streamp strm, voidpf w) | ||
4400 | 1106 | +{ | ||
4401 | 1107 | + if (w) | ||
4402 | 1108 | + ZFREE(strm, *(voidpf *)((unsigned char *)w - sizeof(voidpf))); | ||
4403 | 1109 | +} | ||
4404 | 1110 | + | ||
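Review note: the window helpers above over-allocate, round the address up to a page boundary, and stash the original pointer just before the aligned region so it can be freed later. A minimal standalone sketch of that technique follows. It assumes plain `malloc`/`free` in place of zlib's `ZALLOC`/`ZFREE`, and the names `aligned_window_alloc`, `aligned_window_free` and `WINDOW_ALIGN` are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the patch's PAGE_ALIGN (0x1000). */
#define WINDOW_ALIGN 0x1000u

/* Over-allocate, align up, and remember the raw pointer right before the
 * aligned region. Uses malloc() as an assumption; the patch goes through
 * the user-supplied allocator instead. */
static void *aligned_window_alloc(size_t size)
{
    void *raw;
    unsigned char *w;
    uintptr_t addr;

    assert(size <= SIZE_MAX - sizeof(void *) - WINDOW_ALIGN);
    raw = malloc(sizeof(void *) + size + WINDOW_ALIGN);
    if (raw == NULL)
        return NULL;

    /* Leave room for the back-pointer, then round up to the alignment. */
    addr = (uintptr_t)raw + sizeof(void *);
    addr = (addr + WINDOW_ALIGN - 1) & ~(uintptr_t)(WINDOW_ALIGN - 1);
    w = (unsigned char *)addr;

    /* Store the raw pointer immediately before the aligned window. */
    memcpy(w - sizeof(void *), &raw, sizeof(raw));
    return w;
}

/* Recover the raw pointer stored before the window and release it. */
static void aligned_window_free(void *w)
{
    void *raw;

    if (w == NULL)
        return;
    memcpy(&raw, (unsigned char *)w - sizeof(void *), sizeof(raw));
    free(raw);
}
```

The extra `WINDOW_ALIGN` bytes guarantee that an aligned address with `size` usable bytes exists inside the allocation regardless of where the allocator placed it.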
4405 | 1111 | +/* | ||
4406 | 1112 | + Switching between hardware and software compression. | ||
4407 | 1113 | + | ||
4408 | 1114 | + DFLTCC does not support all zlib settings, e.g. generation of non-compressed | ||
4409 | 1115 | + blocks or alternative window sizes. When such settings are applied on the | ||
4410 | 1116 | + fly with deflateParams, we need to convert between hardware and software | ||
4411 | 1117 | + window formats. | ||
4412 | 1118 | +*/ | ||
4413 | 1119 | +int ZLIB_INTERNAL dfltcc_deflate_params(z_streamp strm, int level, | ||
4414 | 1120 | + int strategy, int *flush) | ||
4415 | 1121 | +{ | ||
4416 | 1122 | + deflate_state *state = (deflate_state *)strm->state; | ||
4417 | 1123 | + struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); | ||
4418 | 1124 | + struct dfltcc_param_v0 *param = &dfltcc_state->param; | ||
4419 | 1125 | + int could_deflate = dfltcc_can_deflate(strm); | ||
4420 | 1126 | + int can_deflate = dfltcc_can_deflate_with_params(strm, | ||
4421 | 1127 | + level, | ||
4422 | 1128 | + state->w_bits, | ||
4423 | 1129 | + strategy); | ||
4424 | 1130 | + | ||
4425 | 1131 | + if (can_deflate == could_deflate) | ||
4426 | 1132 | + /* We continue to work in the same mode - no changes needed */ | ||
4427 | 1133 | + return Z_OK; | ||
4428 | 1134 | + | ||
4429 | 1135 | + if (strm->total_in == 0 && param->nt == 1 && param->hl == 0) | ||
4430 | 1136 | + /* DFLTCC was not used yet - no changes needed */ | ||
4431 | 1137 | + return Z_OK; | ||
4432 | 1138 | + | ||
4433 | 1139 | + /* For now, do not convert between window formats - simply get rid of the | ||
4434 | 1140 | + * old data instead. | ||
4435 | 1141 | + */ | ||
4436 | 1142 | + *flush = Z_FULL_FLUSH; | ||
4437 | 1143 | + return Z_OK; | ||
4438 | 1144 | +} | ||
4439 | 1145 | + | ||
4440 | 1146 | +int ZLIB_INTERNAL dfltcc_deflate_done(z_streamp strm, int flush) | ||
4441 | 1147 | +{ | ||
4442 | 1148 | + deflate_state *state = (deflate_state *)strm->state; | ||
4443 | 1149 | + struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); | ||
4444 | 1150 | + struct dfltcc_param_v0 *param = &dfltcc_state->param; | ||
4445 | 1151 | + | ||
4446 | 1152 | + /* When deflate(Z_FULL_FLUSH) is called with small avail_out, it might | ||
4447 | 1153 | + * close the block without resetting the compression state. Detect this | ||
4448 | 1154 | + * situation and return that deflation is not done. | ||
4449 | 1155 | + */ | ||
4450 | 1156 | + if (flush == Z_FULL_FLUSH && strm->avail_out == 0) | ||
4451 | 1157 | + return 0; | ||
4452 | 1158 | + | ||
4453 | 1159 | + /* Return that deflation is not done if DFLTCC is used and either it | ||
4454 | 1160 | + * buffered some data (Continuation Flag is set), or has not written EOBS | ||
4455 | 1161 | + * yet (Block-Continuation Flag is set). | ||
4456 | 1162 | + */ | ||
4457 | 1163 | + return !dfltcc_can_deflate(strm) || (!param->cf && !param->bcf); | ||
4458 | 1164 | +} | ||
4459 | 1165 | + | ||
4460 | 1166 | +/* | ||
4461 | 1167 | + Preloading history. | ||
4462 | 1168 | +*/ | ||
4463 | 1169 | +local void append_history(struct dfltcc_param_v0 *param, | ||
4464 | 1170 | + Bytef *history, | ||
4465 | 1171 | + const Bytef *buf, | ||
4466 | 1172 | + uInt count) | ||
4467 | 1173 | +{ | ||
4468 | 1174 | + size_t offset; | ||
4469 | 1175 | + size_t n; | ||
4470 | 1176 | + | ||
4471 | 1177 | + /* Do not use more than 32K */ | ||
4472 | 1178 | + if (count > HB_SIZE) { | ||
4473 | 1179 | + buf += count - HB_SIZE; | ||
4474 | 1180 | + count = HB_SIZE; | ||
4475 | 1181 | + } | ||
4476 | 1182 | + offset = (param->ho + param->hl) % HB_SIZE; | ||
4477 | 1183 | + if (offset + count <= HB_SIZE) | ||
4478 | 1184 | + /* Circular history buffer does not wrap - copy one chunk */ | ||
4479 | 1185 | + zmemcpy(history + offset, buf, count); | ||
4480 | 1186 | + else { | ||
4481 | 1187 | + /* Circular history buffer wraps - copy two chunks */ | ||
4482 | 1188 | + n = HB_SIZE - offset; | ||
4483 | 1189 | + zmemcpy(history + offset, buf, n); | ||
4484 | 1190 | + zmemcpy(history, buf + n, count - n); | ||
4485 | 1191 | + } | ||
4486 | 1192 | + n = param->hl + count; | ||
4487 | 1193 | + if (n <= HB_SIZE) | ||
4488 | 1194 | + /* All history fits into buffer - no need to discard anything */ | ||
4489 | 1195 | + param->hl = n; | ||
4490 | 1196 | + else { | ||
4491 | 1197 | + /* History does not fit into buffer - discard extra bytes */ | ||
4492 | 1198 | + param->ho = (param->ho + (n - HB_SIZE)) % HB_SIZE; | ||
4493 | 1199 | + param->hl = HB_SIZE; | ||
4494 | 1200 | + } | ||
4495 | 1201 | +} | ||
4496 | 1202 | + | ||
4497 | 1203 | +local void get_history(struct dfltcc_param_v0 *param, | ||
4498 | 1204 | + const Bytef *history, | ||
4499 | 1205 | + Bytef *buf) | ||
4500 | 1206 | +{ | ||
4501 | 1207 | + if (param->ho + param->hl <= HB_SIZE) | ||
4502 | 1208 | + /* Circular history buffer does not wrap - copy one chunk */ | ||
4503 | 1209 | + memcpy(buf, history + param->ho, param->hl); | ||
4504 | 1210 | + else { | ||
4505 | 1211 | + /* Circular history buffer wraps - copy two chunks */ | ||
4506 | 1212 | + memcpy(buf, history + param->ho, HB_SIZE - param->ho); | ||
4507 | 1213 | + memcpy(buf + HB_SIZE - param->ho, history, param->ho + param->hl - HB_SIZE); | ||
4508 | 1214 | + } | ||
4509 | 1215 | +} | ||
4510 | 1216 | + | ||
4511 | 1217 | +int ZLIB_INTERNAL dfltcc_deflate_set_dictionary(z_streamp strm, | ||
4512 | 1218 | + const Bytef *dictionary, | ||
4513 | 1219 | + uInt dict_length) | ||
4514 | 1220 | +{ | ||
4515 | 1221 | + deflate_state *state = (deflate_state *)strm->state; | ||
4516 | 1222 | + struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); | ||
4517 | 1223 | + struct dfltcc_param_v0 *param = &dfltcc_state->param; | ||
4518 | 1224 | + | ||
4519 | 1225 | + append_history(param, state->window, dictionary, dict_length); | ||
4520 | 1226 | + state->strstart = 1; /* Add FDICT to zlib header */ | ||
4521 | 1227 | + state->block_start = state->strstart; /* Make deflate_stored happy */ | ||
4522 | 1228 | + return Z_OK; | ||
4523 | 1229 | +} | ||
4524 | 1230 | + | ||
4525 | 1231 | +int ZLIB_INTERNAL dfltcc_deflate_get_dictionary(z_streamp strm, | ||
4526 | 1232 | + Bytef *dictionary, | ||
4527 | 1233 | + uInt *dict_length) | ||
4528 | 1234 | +{ | ||
4529 | 1235 | + deflate_state *state = (deflate_state *)strm->state; | ||
4530 | 1236 | + struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); | ||
4531 | 1237 | + struct dfltcc_param_v0 *param = &dfltcc_state->param; | ||
4532 | 1238 | + | ||
4533 | 1239 | + if (dictionary) | ||
4534 | 1240 | + get_history(param, state->window, dictionary); | ||
4535 | 1241 | + if (dict_length) | ||
4536 | 1242 | + *dict_length = param->hl; | ||
4537 | 1243 | + return Z_OK; | ||
4538 | 1244 | +} | ||
4539 | 1245 | + | ||
4540 | 1246 | +int ZLIB_INTERNAL dfltcc_inflate_set_dictionary(z_streamp strm, | ||
4541 | 1247 | + const Bytef *dictionary, | ||
4542 | 1248 | + uInt dict_length) | ||
4543 | 1249 | +{ | ||
4544 | 1250 | + struct inflate_state *state = (struct inflate_state *)strm->state; | ||
4545 | 1251 | + struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); | ||
4546 | 1252 | + struct dfltcc_param_v0 *param = &dfltcc_state->param; | ||
4547 | 1253 | + | ||
4548 | 1254 | + if (inflate_ensure_window(state)) { | ||
4549 | 1255 | + state->mode = MEM; | ||
4550 | 1256 | + return Z_MEM_ERROR; | ||
4551 | 1257 | + } | ||
4552 | 1258 | + | ||
4553 | 1259 | + append_history(param, state->window, dictionary, dict_length); | ||
4554 | 1260 | + state->havedict = 1; | ||
4555 | 1261 | + return Z_OK; | ||
4556 | 1262 | +} | ||
4557 | 1263 | + | ||
4558 | 1264 | +int ZLIB_INTERNAL dfltcc_inflate_get_dictionary(z_streamp strm, | ||
4559 | 1265 | + Bytef *dictionary, | ||
4560 | 1266 | + uInt *dict_length) | ||
4561 | 1267 | +{ | ||
4562 | 1268 | + struct inflate_state *state = (struct inflate_state *)strm->state; | ||
4563 | 1269 | + struct dfltcc_state *dfltcc_state = GET_DFLTCC_STATE(state); | ||
4564 | 1270 | + struct dfltcc_param_v0 *param = &dfltcc_state->param; | ||
4565 | 1271 | + | ||
4566 | 1272 | + if (dictionary && state->window) | ||
4567 | 1273 | + get_history(param, state->window, dictionary); | ||
4568 | 1274 | + if (dict_length) | ||
4569 | 1275 | + *dict_length = param->hl; | ||
4570 | 1276 | + return Z_OK; | ||
4571 | 1277 | +} | ||
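Review note: `append_history` and `get_history` above treat the window as a circular buffer described by an offset (`ho`) and a length (`hl`). The sketch below reproduces that bookkeeping on a tiny buffer so the wrap-around cases are easy to follow. `HIST_SIZE`, `struct hist`, `hist_append` and `hist_get` are hypothetical names for illustration only; the patch uses `HB_SIZE` and the DFLTCC parameter block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tiny buffer size; the patch uses HB_SIZE instead. */
#define HIST_SIZE 8

struct hist {
    unsigned char data[HIST_SIZE];
    size_t ho;  /* offset of the oldest valid byte */
    size_t hl;  /* number of valid bytes */
};

/* Append count bytes, overwriting the oldest data when the buffer is full. */
static void hist_append(struct hist *h, const unsigned char *buf, size_t count)
{
    size_t offset, n;

    /* Only the last HIST_SIZE bytes can survive, so skip the rest. */
    if (count > HIST_SIZE) {
        buf += count - HIST_SIZE;
        count = HIST_SIZE;
    }
    offset = (h->ho + h->hl) % HIST_SIZE;
    if (offset + count <= HIST_SIZE) {
        /* No wrap: one chunk. */
        memcpy(h->data + offset, buf, count);
    } else {
        /* Wrap: tail of the buffer first, then the start. */
        n = HIST_SIZE - offset;
        memcpy(h->data + offset, buf, n);
        memcpy(h->data, buf + n, count - n);
    }
    n = h->hl + count;
    if (n <= HIST_SIZE) {
        h->hl = n;
    } else {
        /* Discard the oldest bytes by advancing the start offset. */
        h->ho = (h->ho + (n - HIST_SIZE)) % HIST_SIZE;
        h->hl = HIST_SIZE;
    }
}

/* Copy the valid bytes out in oldest-to-newest order. */
static void hist_get(const struct hist *h, unsigned char *out)
{
    assert(h->ho < HIST_SIZE && h->hl <= HIST_SIZE);
    if (h->ho + h->hl <= HIST_SIZE) {
        memcpy(out, h->data + h->ho, h->hl);
    } else {
        size_t first = HIST_SIZE - h->ho;
        memcpy(out, h->data + h->ho, first);
        memcpy(out + first, h->data, h->hl - first);
    }
}
```

For example, appending "abcdef" and then "ghij" to an empty 8-byte history leaves `ho == 2`, `hl == 8`, and `hist_get` returns "cdefghij".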
4572 | 1278 | diff --git a/contrib/s390/dfltcc.h b/contrib/s390/dfltcc.h | ||
4573 | 1279 | new file mode 100644 | ||
4574 | 1280 | index 0000000..c8491c4 | ||
4575 | 1281 | --- /dev/null | ||
4576 | 1282 | +++ b/contrib/s390/dfltcc.h | ||
4577 | 1283 | @@ -0,0 +1,97 @@ | ||
4578 | 1284 | +#ifndef DFLTCC_H | ||
4579 | 1285 | +#define DFLTCC_H | ||
4580 | 1286 | + | ||
4581 | 1287 | +#include "../../zlib.h" | ||
4582 | 1288 | +#include "../../zutil.h" | ||
4583 | 1289 | + | ||
4584 | 1290 | +voidpf ZLIB_INTERNAL dfltcc_alloc_state(z_streamp strm, uInt items, uInt size); | ||
4585 | 1291 | +void ZLIB_INTERNAL dfltcc_copy_state(voidpf dst, const voidpf src, uInt size); | ||
4586 | 1292 | +void ZLIB_INTERNAL dfltcc_reset(z_streamp strm, uInt size); | ||
4587 | 1293 | +voidpf ZLIB_INTERNAL dfltcc_alloc_window(z_streamp strm, uInt items, | ||
4588 | 1294 | + uInt size); | ||
4589 | 1295 | +void ZLIB_INTERNAL dfltcc_copy_window(void *dest, const void *src, size_t n); | ||
4590 | 1296 | +void ZLIB_INTERNAL dfltcc_free_window(z_streamp strm, voidpf w); | ||
4591 | 1297 | +#define DFLTCC_BLOCK_HEADER_BITS 3 | ||
4592 | 1298 | +#define DFLTCC_HLITS_COUNT_BITS 5 | ||
4593 | 1299 | +#define DFLTCC_HDISTS_COUNT_BITS 5 | ||
4594 | 1300 | +#define DFLTCC_HCLENS_COUNT_BITS 4 | ||
4595 | 1301 | +#define DFLTCC_MAX_HCLENS 19 | ||
4596 | 1302 | +#define DFLTCC_HCLEN_BITS 3 | ||
4597 | 1303 | +#define DFLTCC_MAX_HLITS 286 | ||
4598 | 1304 | +#define DFLTCC_MAX_HDISTS 30 | ||
4599 | 1305 | +#define DFLTCC_MAX_HLIT_HDIST_BITS 7 | ||
4600 | 1306 | +#define DFLTCC_MAX_SYMBOL_BITS 16 | ||
4601 | 1307 | +#define DFLTCC_MAX_EOBS_BITS 15 | ||
4602 | 1308 | +#define DFLTCC_MAX_PADDING_BITS 7 | ||
4603 | 1309 | +#define DEFLATE_BOUND_COMPLEN(source_len) \ | ||
4604 | 1310 | + ((DFLTCC_BLOCK_HEADER_BITS + \ | ||
4605 | 1311 | + DFLTCC_HLITS_COUNT_BITS + \ | ||
4606 | 1312 | + DFLTCC_HDISTS_COUNT_BITS + \ | ||
4607 | 1313 | + DFLTCC_HCLENS_COUNT_BITS + \ | ||
4608 | 1314 | + DFLTCC_MAX_HCLENS * DFLTCC_HCLEN_BITS + \ | ||
4609 | 1315 | + (DFLTCC_MAX_HLITS + DFLTCC_MAX_HDISTS) * DFLTCC_MAX_HLIT_HDIST_BITS + \ | ||
4610 | 1316 | + (source_len) * DFLTCC_MAX_SYMBOL_BITS + \ | ||
4611 | 1317 | + DFLTCC_MAX_EOBS_BITS + \ | ||
4612 | 1318 | + DFLTCC_MAX_PADDING_BITS) >> 3) | ||
4613 | 1319 | +int ZLIB_INTERNAL dfltcc_can_inflate(z_streamp strm); | ||
4614 | 1320 | +typedef enum { | ||
4615 | 1321 | + DFLTCC_INFLATE_CONTINUE, | ||
4616 | 1322 | + DFLTCC_INFLATE_BREAK, | ||
4617 | 1323 | + DFLTCC_INFLATE_SOFTWARE, | ||
4618 | 1324 | +} dfltcc_inflate_action; | ||
4619 | 1325 | +dfltcc_inflate_action ZLIB_INTERNAL dfltcc_inflate(z_streamp strm, | ||
4620 | 1326 | + int flush, int *ret); | ||
4621 | 1327 | +int ZLIB_INTERNAL dfltcc_was_inflate_used(z_streamp strm); | ||
4622 | 1328 | +int ZLIB_INTERNAL dfltcc_inflate_disable(z_streamp strm); | ||
4623 | 1329 | +int ZLIB_INTERNAL dfltcc_inflate_set_dictionary(z_streamp strm, | ||
4624 | 1330 | + const Bytef *dictionary, | ||
4625 | 1331 | + uInt dict_length); | ||
4626 | 1332 | +int ZLIB_INTERNAL dfltcc_inflate_get_dictionary(z_streamp strm, | ||
4627 | 1333 | + Bytef *dictionary, | ||
4628 | 1334 | + uInt* dict_length); | ||
4629 | 1335 | + | ||
4630 | 1336 | +#define ZALLOC_STATE dfltcc_alloc_state | ||
4631 | 1337 | +#define ZFREE_STATE ZFREE | ||
4632 | 1338 | +#define ZCOPY_STATE dfltcc_copy_state | ||
4633 | 1339 | +#define ZALLOC_WINDOW dfltcc_alloc_window | ||
4634 | 1340 | +#define ZCOPY_WINDOW dfltcc_copy_window | ||
4635 | 1341 | +#define ZFREE_WINDOW dfltcc_free_window | ||
4636 | 1342 | +#define TRY_FREE_WINDOW dfltcc_free_window | ||
4637 | 1343 | +#define INFLATE_RESET_KEEP_HOOK(strm) \ | ||
4638 | 1344 | + dfltcc_reset((strm), sizeof(struct inflate_state)) | ||
4639 | 1345 | +#define INFLATE_PRIME_HOOK(strm, bits, value) \ | ||
4640 | 1346 | + do { if (dfltcc_inflate_disable((strm))) return Z_STREAM_ERROR; } while (0) | ||
4641 | 1347 | +#define INFLATE_TYPEDO_HOOK(strm, flush) \ | ||
4642 | 1348 | + if (dfltcc_can_inflate((strm))) { \ | ||
4643 | 1349 | + dfltcc_inflate_action action; \ | ||
4644 | 1350 | +\ | ||
4645 | 1351 | + RESTORE(); \ | ||
4646 | 1352 | + action = dfltcc_inflate((strm), (flush), &ret); \ | ||
4647 | 1353 | + LOAD(); \ | ||
4648 | 1354 | + if (action == DFLTCC_INFLATE_CONTINUE) \ | ||
4649 | 1355 | + break; \ | ||
4650 | 1356 | + else if (action == DFLTCC_INFLATE_BREAK) \ | ||
4651 | 1357 | + goto inf_leave; \ | ||
4652 | 1358 | + } | ||
4653 | 1359 | +#define INFLATE_NEED_CHECKSUM(strm) (!dfltcc_can_inflate((strm))) | ||
4654 | 1360 | +#define INFLATE_NEED_UPDATEWINDOW(strm) (!dfltcc_can_inflate((strm))) | ||
4655 | 1361 | +#define INFLATE_MARK_HOOK(strm) \ | ||
4656 | 1362 | + do { \ | ||
4657 | 1363 | + if (dfltcc_was_inflate_used((strm))) return -(1L << 16); \ | ||
4658 | 1364 | + } while (0) | ||
4659 | 1365 | +#define INFLATE_SYNC_POINT_HOOK(strm) \ | ||
4660 | 1366 | + do { \ | ||
4661 | 1367 | + if (dfltcc_was_inflate_used((strm))) return Z_STREAM_ERROR; \ | ||
4662 | 1368 | + } while (0) | ||
4663 | 1369 | +#define INFLATE_SET_DICTIONARY_HOOK(strm, dict, dict_len) \ | ||
4664 | 1370 | + do { \ | ||
4665 | 1371 | + if (dfltcc_can_inflate(strm)) \ | ||
4666 | 1372 | + return dfltcc_inflate_set_dictionary(strm, dict, dict_len); \ | ||
4667 | 1373 | + } while (0) | ||
4668 | 1374 | +#define INFLATE_GET_DICTIONARY_HOOK(strm, dict, dict_len) \ | ||
4669 | 1375 | + do { \ | ||
4670 | 1376 | + if (dfltcc_can_inflate(strm)) \ | ||
4671 | 1377 | + return dfltcc_inflate_get_dictionary(strm, dict, dict_len); \ | ||
4672 | 1378 | + } while (0) | ||
4673 | 1379 | + | ||
4674 | 1380 | +#endif | ||
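Review note: `DEFLATE_BOUND_COMPLEN` in the header above sums a fixed number of header, table and trailer bits, adds 16 bits per input byte, and shifts right by 3 to convert bits to bytes. With the constants as defined there, the fixed part is 3 + 5 + 5 + 4 + 19*3 + (286 + 30)*7 + 15 + 7 = 2308 bits, so the bound is (2308 + 16*n) >> 3. The sketch below restates that arithmetic as a function; `dfltcc_bound_bytes` and the `B_*` names are hypothetical, and the overflow check is an addition that the macro itself does not perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the DFLTCC_* defines in the header above. */
enum {
    B_BLOCK_HEADER_BITS = 3,
    B_HLITS_COUNT_BITS = 5,
    B_HDISTS_COUNT_BITS = 5,
    B_HCLENS_COUNT_BITS = 4,
    B_MAX_HCLENS = 19,
    B_HCLEN_BITS = 3,
    B_MAX_HLITS = 286,
    B_MAX_HDISTS = 30,
    B_MAX_HLIT_HDIST_BITS = 7,
    B_MAX_SYMBOL_BITS = 16,
    B_MAX_EOBS_BITS = 15,
    B_MAX_PADDING_BITS = 7
};

/* Hypothetical helper mirroring DEFLATE_BOUND_COMPLEN(source_len). */
static size_t dfltcc_bound_bytes(size_t source_len)
{
    const size_t fixed_bits =
        B_BLOCK_HEADER_BITS + B_HLITS_COUNT_BITS + B_HDISTS_COUNT_BITS +
        B_HCLENS_COUNT_BITS + B_MAX_HCLENS * B_HCLEN_BITS +
        (B_MAX_HLITS + B_MAX_HDISTS) * B_MAX_HLIT_HDIST_BITS +
        B_MAX_EOBS_BITS + B_MAX_PADDING_BITS;  /* 2308 bits */

    /* Guard against overflow; this check is not part of the macro. */
    assert(source_len <= (SIZE_MAX - fixed_bits) / B_MAX_SYMBOL_BITS);
    return (fixed_bits + source_len * B_MAX_SYMBOL_BITS) >> 3;
}
```

An empty input therefore gets a bound of 288 bytes, and 1000 input bytes get 2288 bytes, roughly twice the input size plus the fixed overhead.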
4675 | 1381 | diff --git a/contrib/s390/dfltcc_deflate.h b/contrib/s390/dfltcc_deflate.h | ||
4676 | 1382 | new file mode 100644 | ||
4677 | 1383 | index 0000000..2699d15 | ||
4678 | 1384 | --- /dev/null | ||
4679 | 1385 | +++ b/contrib/s390/dfltcc_deflate.h | ||
4680 | 1386 | @@ -0,0 +1,53 @@ | ||
4681 | 1387 | +#ifndef DFLTCC_DEFLATE_H | ||
4682 | 1388 | +#define DFLTCC_DEFLATE_H | ||
4683 | 1389 | + | ||
4684 | 1390 | +#include "dfltcc.h" | ||
4685 | 1391 | + | ||
4686 | 1392 | +int ZLIB_INTERNAL dfltcc_can_deflate(z_streamp strm); | ||
4687 | 1393 | +int ZLIB_INTERNAL dfltcc_deflate(z_streamp strm, | ||
4688 | 1394 | + int flush, | ||
4689 | 1395 | + block_state *result); | ||
4690 | 1396 | +int ZLIB_INTERNAL dfltcc_deflate_params(z_streamp strm, int level, | ||
4691 | 1397 | + int strategy, int *flush); | ||
4692 | 1398 | +int ZLIB_INTERNAL dfltcc_deflate_done(z_streamp strm, int flush); | ||
4693 | 1399 | +int ZLIB_INTERNAL dfltcc_deflate_set_dictionary(z_streamp strm, | ||
4694 | 1400 | + const Bytef *dictionary, | ||
4695 | 1401 | + uInt dict_length); | ||
4696 | 1402 | +int ZLIB_INTERNAL dfltcc_deflate_get_dictionary(z_streamp strm, | ||
4697 | 1403 | + Bytef *dictionary, | ||
4698 | 1404 | + uInt* dict_length); | ||
4699 | 1405 | + | ||
4700 | 1406 | +#define DEFLATE_SET_DICTIONARY_HOOK(strm, dict, dict_len) \ | ||
4701 | 1407 | + do { \ | ||
4702 | 1408 | + if (dfltcc_can_deflate((strm))) \ | ||
4703 | 1409 | + return dfltcc_deflate_set_dictionary((strm), (dict), (dict_len)); \ | ||
4704 | 1410 | + } while (0) | ||
4705 | 1411 | +#define DEFLATE_GET_DICTIONARY_HOOK(strm, dict, dict_len) \ | ||
4706 | 1412 | + do { \ | ||
4707 | 1413 | + if (dfltcc_can_deflate((strm))) \ | ||
4708 | 1414 | + return dfltcc_deflate_get_dictionary((strm), (dict), (dict_len)); \ | ||
4709 | 1415 | + } while (0) | ||
4710 | 1416 | +#define DEFLATE_RESET_KEEP_HOOK(strm) \ | ||
4711 | 1417 | + dfltcc_reset((strm), sizeof(deflate_state)) | ||
4712 | 1418 | +#define DEFLATE_PARAMS_HOOK(strm, level, strategy, hook_flush) \ | ||
4713 | 1419 | + do { \ | ||
4714 | 1420 | + int err; \ | ||
4715 | 1421 | +\ | ||
4716 | 1422 | + err = dfltcc_deflate_params((strm), \ | ||
4717 | 1423 | + (level), \ | ||
4718 | 1424 | + (strategy), \ | ||
4719 | 1425 | + (hook_flush)); \ | ||
4720 | 1426 | + if (err == Z_STREAM_ERROR) \ | ||
4721 | 1427 | + return err; \ | ||
4722 | 1428 | + } while (0) | ||
4723 | 1429 | +#define DEFLATE_DONE dfltcc_deflate_done | ||
4724 | 1430 | +#define DEFLATE_BOUND_ADJUST_COMPLEN(strm, complen, source_len) \ | ||
4725 | 1431 | + do { \ | ||
4726 | 1432 | + if (deflateStateCheck((strm)) || dfltcc_can_deflate((strm))) \ | ||
4727 | 1433 | + (complen) = DEFLATE_BOUND_COMPLEN(source_len); \ | ||
4728 | 1434 | + } while (0) | ||
4729 | 1435 | +#define DEFLATE_NEED_CONSERVATIVE_BOUND(strm) (dfltcc_can_deflate((strm))) | ||
4730 | 1436 | +#define DEFLATE_HOOK dfltcc_deflate | ||
4731 | 1437 | +#define DEFLATE_NEED_CHECKSUM(strm) (!dfltcc_can_deflate((strm))) | ||
4732 | 1438 | + | ||
4733 | 1439 | +#endif | ||
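Review note: the deflate.c hunks that follow rely on hook macros that either forward to the DFLTCC functions or, without `DFLTCC`, expand to no-ops (`do {} while (0)`, `0` or `1`). A reduced sketch of that compile-time hook pattern is shown below. `USE_ACCEL`, `COMPRESS_HOOK`, `accel_compress`, `software_compress` and `compress_value` are all hypothetical names chosen for illustration; only the shape of the pattern is taken from the patch.

```c
#include <assert.h>

/* Hypothetical build switch; the patch keys on DFLTCC instead. */
#ifdef USE_ACCEL
/* An accelerated path would return nonzero and fill *result when it
 * handled the request. */
int accel_compress(int input, int *result);
#define COMPRESS_HOOK accel_compress
#else
/* Without acceleration the hook is a constant 0, so the compiler can
 * drop the branch entirely. */
#define COMPRESS_HOOK(input, result) 0
#endif

/* Stand-in for the software code path. */
static int software_compress(int input)
{
    return input * 2;
}

/* Same shape as the bstate selection in deflate(): try the hook first,
 * fall back to software when it declines. */
static int compress_value(int input)
{
    int result = 0;

    return COMPRESS_HOOK(input, &result) ? result
                                         : software_compress(input);
}
```

With `USE_ACCEL` undefined, `compress_value(21)` takes the software path and returns 42. The design keeps the generic code free of `#ifdef` blocks at each call site, since every hook has a default definition in one place.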
4734 | 1440 | diff --git a/deflate.c b/deflate.c | ||
4735 | 1441 | index bd01175..9f5bc8b 100644 | ||
4736 | 1442 | --- a/deflate.c | ||
4737 | 1443 | +++ b/deflate.c | ||
4738 | 1444 | @@ -60,12 +60,24 @@ const char deflate_copyright[] = | ||
4739 | 1445 | copyright string in the executable of your product. | ||
4740 | 1446 | */ | ||
4741 | 1447 | |||
4742 | 1448 | -typedef enum { | ||
4743 | 1449 | - need_more, /* block not completed, need more input or more output */ | ||
4744 | 1450 | - block_done, /* block flush performed */ | ||
4745 | 1451 | - finish_started, /* finish started, need only more output at next deflate */ | ||
4746 | 1452 | - finish_done /* finish done, accept no more input or output */ | ||
4747 | 1453 | -} block_state; | ||
4748 | 1454 | +#ifdef DFLTCC | ||
4749 | 1455 | +#include "contrib/s390/dfltcc_deflate.h" | ||
4750 | 1456 | +#else | ||
4751 | 1457 | +#define ZALLOC_STATE ZALLOC | ||
4752 | 1458 | +#define ZFREE_STATE ZFREE | ||
4753 | 1459 | +#define ZCOPY_STATE zmemcpy | ||
4754 | 1460 | +#define ZALLOC_WINDOW ZALLOC | ||
4755 | 1461 | +#define TRY_FREE_WINDOW TRY_FREE | ||
4756 | 1462 | +#define DEFLATE_SET_DICTIONARY_HOOK(strm, dict, dict_len) do {} while (0) | ||
4757 | 1463 | +#define DEFLATE_GET_DICTIONARY_HOOK(strm, dict, dict_len) do {} while (0) | ||
4758 | 1464 | +#define DEFLATE_RESET_KEEP_HOOK(strm) do {} while (0) | ||
4759 | 1465 | +#define DEFLATE_PARAMS_HOOK(strm, level, strategy, hook_flush) do {} while (0) | ||
4760 | 1466 | +#define DEFLATE_DONE(strm, flush) 1 | ||
4761 | 1467 | +#define DEFLATE_BOUND_ADJUST_COMPLEN(strm, complen, sourceLen) do {} while (0) | ||
4762 | 1468 | +#define DEFLATE_NEED_CONSERVATIVE_BOUND(strm) 0 | ||
4763 | 1469 | +#define DEFLATE_HOOK(strm, flush, bstate) 0 | ||
4764 | 1470 | +#define DEFLATE_NEED_CHECKSUM(strm) 1 | ||
4765 | 1471 | +#endif | ||
4766 | 1472 | |||
4767 | 1473 | typedef block_state (*compress_func)(deflate_state *s, int flush); | ||
4768 | 1474 | /* Compression function. Returns the block state after the call. */ | ||
4769 | 1475 | @@ -224,7 +236,8 @@ local unsigned read_buf(z_streamp strm, Bytef *buf, unsigned size) { | ||
4770 | 1476 | strm->avail_in -= len; | ||
4771 | 1477 | |||
4772 | 1478 | zmemcpy(buf, strm->next_in, len); | ||
4773 | 1479 | - if (strm->state->wrap == 1) { | ||
4774 | 1480 | + if (!DEFLATE_NEED_CHECKSUM(strm)) {} | ||
4775 | 1481 | + else if (strm->state->wrap == 1) { | ||
4776 | 1482 | strm->adler = adler32(strm->adler, buf, len); | ||
4777 | 1483 | } | ||
4778 | 1484 | #ifdef GZIP | ||
4779 | 1485 | @@ -429,7 +442,7 @@ int ZEXPORT deflateInit2_(z_streamp strm, int level, int method, | ||
4780 | 1486 | return Z_STREAM_ERROR; | ||
4781 | 1487 | } | ||
4782 | 1488 | if (windowBits == 8) windowBits = 9; /* until 256-byte window bug fixed */ | ||
4783 | 1489 | - s = (deflate_state *) ZALLOC(strm, 1, sizeof(deflate_state)); | ||
4784 | 1490 | + s = (deflate_state *) ZALLOC_STATE(strm, 1, sizeof(deflate_state)); | ||
4785 | 1491 | if (s == Z_NULL) return Z_MEM_ERROR; | ||
4786 | 1492 | strm->state = (struct internal_state FAR *)s; | ||
4787 | 1493 | s->strm = strm; | ||
4788 | 1494 | @@ -446,7 +459,7 @@ int ZEXPORT deflateInit2_(z_streamp strm, int level, int method, | ||
4789 | 1495 | s->hash_mask = s->hash_size - 1; | ||
4790 | 1496 | s->hash_shift = ((s->hash_bits + MIN_MATCH-1) / MIN_MATCH); | ||
4791 | 1497 | |||
4792 | 1498 | - s->window = (Bytef *) ZALLOC(strm, s->w_size, 2*sizeof(Byte)); | ||
4793 | 1499 | + s->window = (Bytef *) ZALLOC_WINDOW(strm, s->w_size, 2*sizeof(Byte)); | ||
4794 | 1500 | s->prev = (Posf *) ZALLOC(strm, s->w_size, sizeof(Pos)); | ||
4795 | 1501 | s->head = (Posf *) ZALLOC(strm, s->hash_size, sizeof(Pos)); | ||
4796 | 1502 | |||
4797 | 1503 | @@ -559,6 +572,7 @@ int ZEXPORT deflateSetDictionary(z_streamp strm, const Bytef *dictionary, | ||
4798 | 1504 | /* when using zlib wrappers, compute Adler-32 for provided dictionary */ | ||
4799 | 1505 | if (wrap == 1) | ||
4800 | 1506 | strm->adler = adler32(strm->adler, dictionary, dictLength); | ||
4801 | 1507 | + DEFLATE_SET_DICTIONARY_HOOK(strm, dictionary, dictLength); | ||
4802 | 1508 | s->wrap = 0; /* avoid computing Adler-32 in read_buf */ | ||
4803 | 1509 | |||
4804 | 1510 | /* if dictionary would fill window, just replace the history */ | ||
4805 | 1511 | @@ -614,6 +628,7 @@ int ZEXPORT deflateGetDictionary(z_streamp strm, Bytef *dictionary, | ||
4806 | 1512 | |||
4807 | 1513 | if (deflateStateCheck(strm)) | ||
4808 | 1514 | return Z_STREAM_ERROR; | ||
4809 | 1515 | + DEFLATE_GET_DICTIONARY_HOOK(strm, dictionary, dictLength); | ||
4810 | 1516 | s = strm->state; | ||
4811 | 1517 | len = s->strstart + s->lookahead; | ||
4812 | 1518 | if (len > s->w_size) | ||
4813 | 1519 | @@ -658,6 +673,8 @@ int ZEXPORT deflateResetKeep(z_streamp strm) { | ||
4814 | 1520 | |||
4815 | 1521 | _tr_init(s); | ||
4816 | 1522 | |||
4817 | 1523 | + DEFLATE_RESET_KEEP_HOOK(strm); | ||
4818 | 1524 | + | ||
4819 | 1525 | return Z_OK; | ||
4820 | 1526 | } | ||
4821 | 1527 | |||
4822 | 1528 | @@ -740,6 +757,7 @@ int ZEXPORT deflatePrime(z_streamp strm, int bits, int value) { | ||
4823 | 1529 | int ZEXPORT deflateParams(z_streamp strm, int level, int strategy) { | ||
4824 | 1530 | deflate_state *s; | ||
4825 | 1531 | compress_func func; | ||
4826 | 1532 | + int hook_flush = Z_NO_FLUSH; | ||
4827 | 1533 | |||
4828 | 1534 | if (deflateStateCheck(strm)) return Z_STREAM_ERROR; | ||
4829 | 1535 | s = strm->state; | ||
4830 | 1536 | @@ -752,15 +770,18 @@ int ZEXPORT deflateParams(z_streamp strm, int level, int strategy) { | ||
4831 | 1537 | if (level < 0 || level > 9 || strategy < 0 || strategy > Z_FIXED) { | ||
4832 | 1538 | return Z_STREAM_ERROR; | ||
4833 | 1539 | } | ||
4834 | 1540 | + DEFLATE_PARAMS_HOOK(strm, level, strategy, &hook_flush); | ||
4835 | 1541 | func = configuration_table[s->level].func; | ||
4836 | 1542 | |||
4837 | 1543 | - if ((strategy != s->strategy || func != configuration_table[level].func) && | ||
4838 | 1544 | - s->last_flush != -2) { | ||
4839 | 1545 | + if (((strategy != s->strategy || func != configuration_table[level].func) && | ||
4840 | 1546 | + s->last_flush != -2) || hook_flush != Z_NO_FLUSH) { | ||
4841 | 1547 | /* Flush the last buffer: */ | ||
4842 | 1548 | - int err = deflate(strm, Z_BLOCK); | ||
4843 | 1549 | + int flush = RANK(hook_flush) > RANK(Z_BLOCK) ? hook_flush : Z_BLOCK; | ||
4844 | 1550 | + int err = deflate(strm, flush); | ||
4845 | 1551 | if (err == Z_STREAM_ERROR) | ||
4846 | 1552 | return err; | ||
4847 | 1553 | - if (strm->avail_in || (s->strstart - s->block_start) + s->lookahead) | ||
4848 | 1554 | + if (strm->avail_in || (s->strstart - s->block_start) + s->lookahead || | ||
4849 | 1555 | + !DEFLATE_DONE(strm, flush)) | ||
4850 | 1556 | return Z_BUF_ERROR; | ||
4851 | 1557 | } | ||
4852 | 1558 | if (s->level != level) { | ||
4853 | 1559 | @@ -828,11 +849,13 @@ uLong ZEXPORT deflateBound(z_streamp strm, uLong sourceLen) { | ||
4854 | 1560 | ~13% overhead plus a small constant */ | ||
4855 | 1561 | fixedlen = sourceLen + (sourceLen >> 3) + (sourceLen >> 8) + | ||
4856 | 1562 | (sourceLen >> 9) + 4; | ||
4857 | 1563 | + DEFLATE_BOUND_ADJUST_COMPLEN(strm, fixedlen, sourceLen); | ||
4858 | 1564 | |||
4859 | 1565 | /* upper bound for stored blocks with length 127 (memLevel == 1) -- | ||
4860 | 1566 | ~4% overhead plus a small constant */ | ||
4861 | 1567 | storelen = sourceLen + (sourceLen >> 5) + (sourceLen >> 7) + | ||
4862 | 1568 | (sourceLen >> 11) + 7; | ||
4863 | 1569 | + DEFLATE_BOUND_ADJUST_COMPLEN(strm, storelen, sourceLen); | ||
4864 | 1570 | |||
4865 | 1571 | /* if can't get parameters, return larger bound plus a zlib wrapper */ | ||
4866 | 1572 | if (deflateStateCheck(strm)) | ||
4867 | 1573 | @@ -874,7 +897,8 @@ uLong ZEXPORT deflateBound(z_streamp strm, uLong sourceLen) { | ||
4868 | 1574 | } | ||
4869 | 1575 | |||
4870 | 1576 | /* if not default parameters, return one of the conservative bounds */ | ||
4871 | 1577 | - if (s->w_bits != 15 || s->hash_bits != 8 + 7) | ||
4872 | 1578 | + if (DEFLATE_NEED_CONSERVATIVE_BOUND(strm) || | ||
4873 | 1579 | + s->w_bits != 15 || s->hash_bits != 8 + 7) | ||
4874 | 1580 | return (s->w_bits <= s->hash_bits && s->level ? fixedlen : storelen) + | ||
4875 | 1581 | wraplen; | ||
4876 | 1582 | |||
4877 | 1583 | @@ -900,7 +924,7 @@ local void putShortMSB(deflate_state *s, uInt b) { | ||
4878 | 1584 | * applications may wish to modify it to avoid allocating a large | ||
4879 | 1585 | * strm->next_out buffer and copying into it. (See also read_buf()). | ||
4880 | 1586 | */ | ||
4881 | 1587 | -local void flush_pending(z_streamp strm) { | ||
4882 | 1588 | +void ZLIB_INTERNAL flush_pending(z_streamp strm) { | ||
4883 | 1589 | unsigned len; | ||
4884 | 1590 | deflate_state *s = strm->state; | ||
4885 | 1591 | |||
4886 | 1592 | @@ -1167,7 +1191,8 @@ int ZEXPORT deflate(z_streamp strm, int flush) { | ||
4887 | 1593 | (flush != Z_NO_FLUSH && s->status != FINISH_STATE)) { | ||
4888 | 1594 | block_state bstate; | ||
4889 | 1595 | |||
4890 | 1596 | - bstate = s->level == 0 ? deflate_stored(s, flush) : | ||
4891 | 1597 | + bstate = DEFLATE_HOOK(strm, flush, &bstate) ? bstate : | ||
4892 | 1598 | + s->level == 0 ? deflate_stored(s, flush) : | ||
4893 | 1599 | s->strategy == Z_HUFFMAN_ONLY ? deflate_huff(s, flush) : | ||
4894 | 1600 | s->strategy == Z_RLE ? deflate_rle(s, flush) : | ||
4895 | 1601 | (*(configuration_table[s->level].func))(s, flush); | ||
4896 | 1602 | @@ -1214,7 +1239,6 @@ int ZEXPORT deflate(z_streamp strm, int flush) { | ||
4897 | 1603 | } | ||
4898 | 1604 | |||
4899 | 1605 | if (flush != Z_FINISH) return Z_OK; | ||
4900 | 1606 | - if (s->wrap <= 0) return Z_STREAM_END; | ||
4901 | 1607 | |||
4902 | 1608 | /* Write the trailer */ | ||
4903 | 1609 | #ifdef GZIP | ||
4904 | 1610 | @@ -1230,7 +1254,7 @@ int ZEXPORT deflate(z_streamp strm, int flush) { | ||
4905 | 1611 | } | ||
4906 | 1612 | else | ||
4907 | 1613 | #endif | ||
4908 | 1614 | - { | ||
4909 | 1615 | + if (s->wrap == 1) { | ||
4910 | 1616 | putShortMSB(s, (uInt)(strm->adler >> 16)); | ||
4911 | 1617 | putShortMSB(s, (uInt)(strm->adler & 0xffff)); | ||
4912 | 1618 | } | ||
4913 | 1619 | @@ -1239,7 +1263,11 @@ int ZEXPORT deflate(z_streamp strm, int flush) { | ||
4914 | 1620 | * to flush the rest. | ||
4915 | 1621 | */ | ||
4916 | 1622 | if (s->wrap > 0) s->wrap = -s->wrap; /* write the trailer only once! */ | ||
4917 | 1623 | - return s->pending != 0 ? Z_OK : Z_STREAM_END; | ||
4918 | 1624 | + if (s->pending == 0) { | ||
4919 | 1625 | + Assert(s->bi_valid == 0, "bi_buf not flushed"); | ||
4920 | 1626 | + return Z_STREAM_END; | ||
4921 | 1627 | + } | ||
4922 | 1628 | + return Z_OK; | ||
4923 | 1629 | } | ||
4924 | 1630 | |||
4925 | 1631 | /* ========================================================================= */ | ||
4926 | 1632 | @@ -1254,9 +1282,9 @@ int ZEXPORT deflateEnd(z_streamp strm) { | ||
4927 | 1633 | TRY_FREE(strm, strm->state->pending_buf); | ||
4928 | 1634 | TRY_FREE(strm, strm->state->head); | ||
4929 | 1635 | TRY_FREE(strm, strm->state->prev); | ||
4930 | 1636 | - TRY_FREE(strm, strm->state->window); | ||
4931 | 1637 | + TRY_FREE_WINDOW(strm, strm->state->window); | ||
4932 | 1638 | |||
4933 | 1639 | - ZFREE(strm, strm->state); | ||
4934 | 1640 | + ZFREE_STATE(strm, strm->state); | ||
4935 | 1641 | strm->state = Z_NULL; | ||
4936 | 1642 | |||
4937 | 1643 | return status == BUSY_STATE ? Z_DATA_ERROR : Z_OK; | ||
4938 | 1644 | @@ -1285,13 +1313,13 @@ int ZEXPORT deflateCopy(z_streamp dest, z_streamp source) { | ||
4939 | 1645 | |||
4940 | 1646 | zmemcpy((voidpf)dest, (voidpf)source, sizeof(z_stream)); | ||
4941 | 1647 | |||
4942 | 1648 | - ds = (deflate_state *) ZALLOC(dest, 1, sizeof(deflate_state)); | ||
4943 | 1649 | + ds = (deflate_state *) ZALLOC_STATE(dest, 1, sizeof(deflate_state)); | ||
4944 | 1650 | if (ds == Z_NULL) return Z_MEM_ERROR; | ||
4945 | 1651 | dest->state = (struct internal_state FAR *) ds; | ||
4946 | 1652 | - zmemcpy((voidpf)ds, (voidpf)ss, sizeof(deflate_state)); | ||
4947 | 1653 | + ZCOPY_STATE((voidpf)ds, (voidpf)ss, sizeof(deflate_state)); | ||
4948 | 1654 | ds->strm = dest; | ||
4949 | 1655 | |||
4950 | 1656 | - ds->window = (Bytef *) ZALLOC(dest, ds->w_size, 2*sizeof(Byte)); | ||
4951 | 1657 | + ds->window = (Bytef *) ZALLOC_WINDOW(dest, ds->w_size, 2*sizeof(Byte)); | ||
4952 | 1658 | ds->prev = (Posf *) ZALLOC(dest, ds->w_size, sizeof(Pos)); | ||
4953 | 1659 | ds->head = (Posf *) ZALLOC(dest, ds->hash_size, sizeof(Pos)); | ||
4954 | 1660 | ds->pending_buf = (uchf *) ZALLOC(dest, ds->lit_bufsize, 4); | ||
4955 | 1661 | diff --git a/deflate.h b/deflate.h | ||
4956 | 1662 | index 8696791..d49e698 100644 | ||
4957 | 1663 | --- a/deflate.h | ||
4958 | 1664 | +++ b/deflate.h | ||
4959 | 1665 | @@ -299,6 +299,7 @@ void ZLIB_INTERNAL _tr_flush_bits(deflate_state *s); | ||
4960 | 1666 | void ZLIB_INTERNAL _tr_align(deflate_state *s); | ||
4961 | 1667 | void ZLIB_INTERNAL _tr_stored_block(deflate_state *s, charf *buf, | ||
4962 | 1668 | ulg stored_len, int last); | ||
4963 | 1669 | +void ZLIB_INTERNAL _tr_send_bits(deflate_state *s, int value, int length); | ||
4964 | 1670 | |||
4965 | 1671 | #define d_code(dist) \ | ||
4966 | 1672 | ((dist) < 256 ? _dist_code[dist] : _dist_code[256+((dist)>>7)]) | ||
4967 | 1673 | @@ -343,4 +344,15 @@ void ZLIB_INTERNAL _tr_stored_block(deflate_state *s, charf *buf, | ||
4968 | 1674 | flush = _tr_tally(s, distance, length) | ||
4969 | 1675 | #endif | ||
4970 | 1676 | |||
4971 | 1677 | +typedef enum { | ||
4972 | 1678 | + need_more, /* block not completed, need more input or more output */ | ||
4973 | 1679 | + block_done, /* block flush performed */ | ||
4974 | 1680 | + finish_started, /* finish started, need only more output at next deflate */ | ||
4975 | 1681 | + finish_done /* finish done, accept no more input or output */ | ||
4976 | 1682 | +} block_state; | ||
4977 | 1683 | + | ||
4978 | 1684 | +unsigned ZLIB_INTERNAL bi_reverse(unsigned code, int len); | ||
4979 | 1685 | +void ZLIB_INTERNAL bi_windup(deflate_state *s); | ||
4980 | 1686 | +void ZLIB_INTERNAL flush_pending(z_streamp strm); | ||
4981 | 1687 | + | ||
4982 | 1688 | #endif /* DEFLATE_H */ | ||
4983 | 1689 | diff --git a/gzguts.h b/gzguts.h | ||
4984 | 1690 | index f937504..5adfd1d 100644 | ||
4985 | 1691 | --- a/gzguts.h | ||
4986 | 1692 | +++ b/gzguts.h | ||
4987 | 1693 | @@ -152,7 +152,11 @@ | ||
4988 | 1694 | |||
4989 | 1695 | /* default i/o buffer size -- double this for output when reading (this and | ||
4990 | 1696 | twice this must be able to fit in an unsigned type) */ | ||
4991 | 1697 | +#ifdef DFLTCC | ||
4992 | 1698 | +#define GZBUFSIZE 131072 | ||
4993 | 1699 | +#else | ||
4994 | 1700 | #define GZBUFSIZE 8192 | ||
4995 | 1701 | +#endif | ||
4996 | 1702 | |||
4997 | 1703 | /* gzip modes, also provide a little integrity check on the passed structure */ | ||
4998 | 1704 | #define GZ_NONE 0 | ||
4999 | 1705 | diff --git a/inflate.c b/inflate.c | ||
5000 | 1706 | index b0757a9..c0f808f 100644 |
I'm off until end of year so I think you should grab a different reviewer for this