Merge lp:~drizzle-pbxt/drizzle/drizzle-pbxt-2 into lp:~drizzle-trunk/drizzle/development

Proposed by Paul McCullagh
Status: Work in progress
Proposed branch: lp:~drizzle-pbxt/drizzle/drizzle-pbxt-2
Merge into: lp:~drizzle-trunk/drizzle/development
Diff against target: 144096 lines (has conflicts)
Text conflict in drizzled/sql_string.h
Text conflict in drizzled/table_share.h
Text conflict in plugin/innobase/lock/lock0lock.c
Contents conflict in tests/r/information_schema.result
Text conflict in tests/t/create_not_windows.test
Text conflict in tests/t/information_schema.test
Text conflict in tests/t/subselect.test
To merge this branch: bzr merge lp:~drizzle-pbxt/drizzle/drizzle-pbxt-2
Reviewer      Review Type   Date Requested   Status
Brian Aker                                   Needs Resubmitting
Review via email: mp+6822@code.launchpad.net
Paul McCullagh (paul-mccullagh) wrote :

This is the current merge of the PBXT storage engine into Drizzle (revision #1039). I re-merged (using Stewart's method) after the storage engines were moved to the plugin directory, to ensure the entire PBXT change history is included.

So merging PBXT with Bazaar works on this tree, i.e. in the root directory:
bzr merge lp:pbxt
correctly merges the PBXT trunk into the plugin/pbxt directory.

I have also checked the following:

- Compiles and runs on Linux, Solaris and Mac OS.
- All tests run through using: ./dtr --engine=pbxt
- Compiles and runs without atomic ops if not supported.
- Performance (lp:~drizzle-developers/sysbench/trunk) also looks OK.

Let me know if anything is missing :)

Stewart Smith (stewart) wrote :

I got this for insert_update test (./dtr --engine=pbxt):

--- /home/stewart/drizzle/drizzle-pbxt-2/tests/r/pbxt/insert_update.result 2009-05-28 15:55:48.726957481 +1000
+++ /home/stewart/drizzle/drizzle-pbxt-2/tests/r/pbxt/insert_update.reject 2009-05-29 14:01:43.915086845 +1000
@@ -58,12 +58,12 @@
 8 9 60 NULL
 explain extended SELECT *, VALUES(a) FROM t1;
 id select_type table type possible_keys key key_len ref rows filtered Extra
-1 SIMPLE t1 ALL NULL NULL NULL NULL 5 100.00
+1 SIMPLE t1 ALL NULL NULL NULL NULL 7 100.00
 Warnings:
 Note 1003 select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t1`.`c` AS `c`,values(`test`.`t1`.`a`) AS `VALUES(a)` from `test`.`t1`
 explain extended select * from t1 where values(a);
 id select_type table type possible_keys key key_len ref rows filtered Extra
-1 SIMPLE t1 ALL NULL NULL NULL NULL 5 100.00 Using where
+1 SIMPLE t1 ALL NULL NULL NULL NULL 7 100.00 Using where
 Warnings:
 Note 1003 select `test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t1`.`c` AS `c` from `test`.`t1` where values(`test`.`t1`.`a`)
 DROP TABLE t1;

and

--- /home/stewart/drizzle/drizzle-pbxt-2/tests/r/pbxt/type_enum.result 2009-05-20 21:41:41.506953655 +1000
+++ /home/stewart/drizzle/drizzle-pbxt-2/tests/r/pbxt/type_enum.reject 2009-05-29 14:09:57.670984523 +1000
@@ -1776,7 +1776,7 @@
 c
 EXPLAIN SELECT a FROM t1 WHERE a=0;
 id select_type table type possible_keys key key_len ref rows Extra
-1 SIMPLE t1 ALL NULL NULL NULL NULL 4 Using where
+1 SIMPLE t1 ALL NULL NULL NULL NULL 6 Using where
 SELECT a FROM t1 WHERE a=0;
 a
 ALTER TABLE t1 ADD PRIMARY KEY (a);

apart from that, I vote for merging:
- doesn't touch core (apart from us having to resolve the sql_string
c_ptr() thing)
- test fixes are all okay.
- code is quite likely to be actively maintained

--
Stewart Smith

Monty Taylor (mordred) wrote :

Stewart Smith wrote:

>
> apart from that, I vote for merging:
> - doesn't touch core (apart from us having to resolve the sql_string
> c_ptr() thing)
> - test fixes are all okay.
> - code is quite likely to be actively maintained

I second this vote. I'm working on the c_ptr() thing anyway.

Monty

Brian Aker (brianaker) wrote :

Out of date (I did ask Paul to remerge)

review: Needs Resubmitting
Paul McCullagh (paul-mccullagh) wrote :

Hi Brian,

This work is currently in progress. I think we should have it done by the end of next week.

Brian Aker (brianaker) wrote :

Awesome!

Just tell me when... we should end up with spare time on our build
servers. Tell me when you need it pushed to them and we can see how
well it passes all of the build environments.

Cheers,
 -Brian

On Oct 27, 2009, at 4:12 PM, Paul McCullagh wrote:

> Hi Brian,
>
> This work is currently in progress. I think we should have it done
> by the end of next week.
>
> --
> https://code.launchpad.net/~drizzle-pbxt/drizzle/drizzle-pbxt-2/+merge/6822
> You are reviewing the proposed merge of lp:~drizzle-pbxt/drizzle/
> drizzle-pbxt-2 into lp:drizzle.

1039. By Paul McCullagh

Merged Drizzle trunk and PBXT 1.0.09

1040. By Paul McCullagh

Changes required to compile PBXT for MySQL again

1041. By Padraig O'Sullivan

Merge trunk.

1042. By Padraig O'Sullivan

Changes required to compile PBXT with drizzle.

1043. By Padraig O'Sullivan

Updated the information_schema result file as it has changed due to the
extra I_S table that PBXT introduces.

1044. By Padraig O'Sullivan

Decided to update the information_schema test case to filter out PBXT
information schema tables in the queries. This way, we won't have to worry
about the output being different if the PBXT plugin has not been loaded.

1045. By Padraig O'Sullivan

Whoops, forgot to change renameTableImplementation to doRenameTable.

1046. By Paul McCullagh

Prototype corrections

1047. By Vladimir Kolesnikov

merge from work branch

1048. By Vladimir Kolesnikov

fixed test-cases according to changes in drizzle

1049. By Vladimir Kolesnikov

a diff not related to PBXT

1050. By Vladimir Kolesnikov

added workaround for a test of DELETE IGNORE which is not currently supported by PBXT

1051. By Vladimir Kolesnikov

fixed resultset order

1052. By Vladimir Kolesnikov

merged Paul's fix from lp:pbxt rev.730

1053. By Vladimir Kolesnikov

more simple fixes

1054. By Vladimir Kolesnikov

added pbxt-specific action

1055. By Vladimir Kolesnikov

the original query plan can be forced by inserting more rows, but it doesn't test a storage engine's code

1056. By Vladimir Kolesnikov

explain differences mostly because of non-clustered indexes in pbxt vs innodb, better row count estimates

1057. By Vladimir Kolesnikov

changes similar to subselect.result

1058. By Vladimir Kolesnikov

added resultset ordering

1059. By Vladimir Kolesnikov

fixes similar to subselect.result

1060. By Vladimir Kolesnikov

fixes similar to subselect.result

1061. By Vladimir Kolesnikov

fixes similar to subselect.result

1062. By Vladimir Kolesnikov

added resultset ordering

1063. By Vladimir Kolesnikov

fixed a problem with key size for enums on drizzle side

Unmerged revisions

1063. By Vladimir Kolesnikov

fixed a problem with key size for enums on drizzle side

1062. By Vladimir Kolesnikov

added resultset ordering

1061. By Vladimir Kolesnikov

fixes similar to subselect.result

1060. By Vladimir Kolesnikov

fixes similar to subselect.result

1059. By Vladimir Kolesnikov

fixes similar to subselect.result

1058. By Vladimir Kolesnikov

added resultset ordering

1057. By Vladimir Kolesnikov

changes similar to subselect.result

1056. By Vladimir Kolesnikov

explain differences mostly because of non-clustered indexes in pbxt vs innodb, better row count estimates

1055. By Vladimir Kolesnikov

the original query plan can be forced by inserting more rows, but it doesn't test a storage engine's code

1054. By Vladimir Kolesnikov

added pbxt-specific action

Preview Diff

=== modified file 'drizzled/field.h'
--- drizzled/field.h 2010-03-27 10:10:49 +0000
+++ drizzled/field.h 2010-04-01 14:19:35 +0000
@@ -65,7 +65,7 @@
 
 inline uint32_t get_enum_pack_length(int elements)
 {
-  return elements < 256 ? 1 : 2;
+  return elements < 256 ? 1 : 4;
 }
 
 /**
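For context, a minimal sketch of the function this hunk changes, with the old and new behaviour side by side. The `_old` and `_new` suffixes are illustrative names only and do not exist in the Drizzle tree. The proposal itself says no more about the motivation than revision 1063's note that it fixes a key-size problem for enums.

```cpp
#include <cassert>
#include <cstdint>

// Behaviour before the change: enums with 256 or more elements
// report a pack length of 2 bytes.
inline uint32_t get_enum_pack_length_old(int elements)
{
  return elements < 256 ? 1 : 2;
}

// Behaviour after the change: the same enums report 4 bytes.
// Enums with fewer than 256 elements are unaffected.
inline uint32_t get_enum_pack_length_new(int elements)
{
  return elements < 256 ? 1 : 4;
}
```

Only the branch for 256 or more elements differs, so any caller that sizes keys from this value sees a change only for large enums.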
=== modified file 'drizzled/sql_bitmap.h'
--- drizzled/sql_bitmap.h 2010-02-04 08:14:46 +0000
+++ drizzled/sql_bitmap.h 2010-04-01 14:19:35 +0000
@@ -297,6 +297,14 @@
     return last_word_ptr;
   }
 
+  /**
+   * * @return the last word mask for this bitmap
+   * */
+  my_bitmap_map getLastWordMask() const
+  {
+    return last_word_mask;
+  }
+
   void addMaskToLastWord() const
   {
     *last_word_ptr|= last_word_mask;
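As background for the new accessor, here is a rough sketch of what a "last word mask" can look like. It assumes 32-bit words, bits used from the least significant end, and a mask that has a 1 for every unused bit of the final word (which is consistent with `addMaskToLastWord()` OR-ing the mask in). The names `bitmap_word`, `last_word_mask_for` and `all_set` are hypothetical and not part of Drizzle.

```cpp
#include <cassert>
#include <cstdint>

typedef uint32_t bitmap_word;  // hypothetical stand-in for my_bitmap_map

// Mask with a 1 for every bit of the last word that lies beyond n_bits.
// Assumes bits are used starting from the least significant bit.
inline bitmap_word last_word_mask_for(uint32_t n_bits)
{
  uint32_t used= n_bits % 32;
  return used == 0 ? 0u : ~((bitmap_word(1) << used) - 1u);
}

// True if the first n_bits (1..32) of a single-word bitmap are all set.
// OR-ing in the mask fills the unused bits, so a full-word compare works.
inline bool all_set(bitmap_word word, uint32_t n_bits)
{
  return (word | last_word_mask_for(n_bits)) == ~bitmap_word(0);
}
```

Exposing the mask through a getter lets code outside the class do this kind of whole-word comparison without touching the private member.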
=== modified file 'drizzled/sql_string.h'
--- drizzled/sql_string.h 2010-02-05 08:11:15 +0000
+++ drizzled/sql_string.h 2010-04-01 14:19:35 +0000
@@ -93,11 +93,21 @@
   inline const char *ptr() const { return Ptr; }
   inline char *c_ptr()
   {
+    if (!Ptr || Ptr[str_length]) /* Should be safe */
+      (void) realloc(str_length);
+/* This code crashes or overwrites the buffer if
+ * str_length > Alloced_length,
+ * which can happen if the buffer is not allocated at
+ * all (Alloced_length == 0)!
     if (str_length == Alloced_length)
       (void) realloc(str_length);
     else
       Ptr[str_length]= 0;
+<<<<<<< TREE
 
+=======
+*/
+>>>>>>> MERGE-SOURCE
     return Ptr;
   }
   inline char *c_ptr_quick()
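The comment in this hunk describes the hazard: writing `Ptr[str_length]` is only safe when that byte lies inside the allocation. A minimal sketch of that invariant follows, using a hypothetical `MiniString` class that always owns its buffer. It is not `drizzled::String`, it does not model strings that point at memory they do not own (the `Alloced_length == 0` case the comment mentions), and it is not the c_ptr() fix being worked on in trunk.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a string class; only the fields the hunk mentions.
class MiniString
{
  char *Ptr= nullptr;
  size_t str_length= 0;
  size_t Alloced_length= 0;

  // Grow the buffer so it can hold `len` bytes plus a terminating NUL.
  bool grow(size_t len)
  {
    char *p= static_cast<char *>(std::realloc(Ptr, len + 1));
    if (p == nullptr)
      return false;
    Ptr= p;
    Alloced_length= len + 1;
    return true;
  }

public:
  MiniString() = default;
  MiniString(const MiniString &) = delete;
  MiniString &operator=(const MiniString &) = delete;
  ~MiniString() { std::free(Ptr); }

  // Append raw bytes; deliberately leaves no room for a terminator.
  bool append(const char *s)
  {
    size_t n= std::strlen(s);
    if (str_length + n > Alloced_length && !grow(str_length + n))
      return false;
    std::memcpy(Ptr + str_length, s, n);
    str_length+= n;
    return true;
  }

  // Never index Ptr[str_length] unless that byte is inside the allocation.
  const char *c_ptr()
  {
    if (str_length >= Alloced_length && !grow(str_length))
      return "";  // allocation failed; hand back an empty C string
    Ptr[str_length]= 0;
    return Ptr;
  }

  size_t length() const { return str_length; }
};
```

The check uses `>=` rather than `==`, so an unallocated or exactly full buffer is grown before the terminator is written.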
=== modified file 'drizzled/table_share.h'
--- drizzled/table_share.h 2010-03-26 19:56:34 +0000
+++ drizzled/table_share.h 2010-04-01 14:19:35 +0000
@@ -327,6 +327,7 @@
     max_rows= arg;
   }
 
+<<<<<<< TREE
   /**
    * Returns true if the supplied Field object
    * is part of the table's primary key.
@@ -344,6 +345,19 @@
   }
 
   TableIdentifier::Type tmp_table;
+=======
+  inline uint32_t getAvgRowLength()
+  {
+    return (table_proto) ? (table_proto->options().has_avg_row_length() ? table_proto->options().avg_row_length() : 0) : 0;
+  }
+
+  drizzled::plugin::StorageEngine *storage_engine; /* storage engine plugin */
+  inline drizzled::plugin::StorageEngine *db_type() const /* table_type for handler */
+  {
+    return storage_engine;
+  }
+  enum tmp_table_type tmp_table;
+>>>>>>> MERGE-SOURCE
 
   uint32_t ref_count; /* How many Table objects uses this */
   uint32_t getTableCount()
=== modified file 'plugin/innobase/lock/lock0lock.c'
--- plugin/innobase/lock/lock0lock.c 2009-11-09 06:31:17 +0000
+++ plugin/innobase/lock/lock0lock.c 2010-04-01 14:19:35 +0000
@@ -4878,6 +4878,7 @@
                                LOCK_GAP type locks from the successor
                                record */
 {
+        rec_t* nc_rec;
         const rec_t* next_rec;
         trx_t* trx;
         lock_t* lock;
@@ -4892,7 +4893,12 @@
         }
 
         trx = thr_get_trx(thr);
+<<<<<<< TREE
         next_rec = page_rec_get_next_const(rec);
+=======
+        nc_rec = (rec_t *)rec;
+        next_rec = page_rec_get_next(nc_rec);
+>>>>>>> MERGE-SOURCE
         next_rec_heap_no = page_rec_get_heap_no(next_rec);
 
         lock_mutex_enter_kernel();
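The conflict above is between a const accessor on the TREE side and a cast that drops `const` on the MERGE-SOURCE side. The sketch below illustrates the general pattern of offering both a const and a non-const accessor so callers never need such a cast. `Rec` and `rec_get_next` are hypothetical names for illustration, not InnoDB's `rec_t` or its page functions.

```cpp
#include <cassert>

// Hypothetical record node; not InnoDB's rec_t.
struct Rec
{
  int heap_no;
  Rec *next;
};

// Const overload: usable when the caller only holds a const pointer.
inline const Rec *rec_get_next(const Rec *rec)
{
  return rec->next;
}

// Non-const overload, implemented in terms of the const one so the
// traversal logic lives in a single place.
inline Rec *rec_get_next(Rec *rec)
{
  return const_cast<Rec *>(rec_get_next(static_cast<const Rec *>(rec)));
}
```

With both overloads available, a function that receives a `const` record can walk to its successor without introducing a mutable alias like `nc_rec`.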
=== added directory 'plugin/pbxt'
=== added file 'plugin/pbxt/AUTHORS'
--- plugin/pbxt/AUTHORS 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/AUTHORS 2010-04-01 14:19:35 +0000
@@ -0,0 +1,4 @@
+Paul McCullagh
+paul.mccullagh@primebase.org
+http://www.primebase.org
+http://pbxt.blogspot.com
=== added file 'plugin/pbxt/COPYING'
--- plugin/pbxt/COPYING 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/COPYING 2010-04-01 14:19:35 +0000
@@ -0,0 +1,340 @@
+                    GNU GENERAL PUBLIC LICENSE
+                       Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+                            Preamble
+
+  The licenses for most software are designed to take away your
+freedom to share and change it. By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users. This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it. (Some other Free Software Foundation software is covered by
+the GNU Library General Public License instead.) You can apply it to
+your programs, too.
+
+  When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+  To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+  For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have. You must make sure that they, too, receive or can get the
+source code. And you must show them these terms so they know their
+rights.
+
+  We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+  Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software. If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+  Finally, any free program is threatened constantly by software
+patents. We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary. To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+  The precise terms and conditions for copying, distribution and
+modification follow.
+
+
+                    GNU GENERAL PUBLIC LICENSE
+   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+  0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License. The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language. (Hereinafter, translation is included without limitation in
+the term "modification".) Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope. The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+  1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+  2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+    a) You must cause the modified files to carry prominent notices
+    stating that you changed the files and the date of any change.
+
+    b) You must cause any work that you distribute or publish, that in
+    whole or in part contains or is derived from the Program or any
+    part thereof, to be licensed as a whole at no charge to all third
+    parties under the terms of this License.
+
+    c) If the modified program normally reads commands interactively
+    when run, you must cause it, when started running for such
+    interactive use in the most ordinary way, to print or display an
+    announcement including an appropriate copyright notice and a
+    notice that there is no warranty (or else, saying that you provide
+    a warranty) and that users may redistribute the program under
+    these conditions, and telling the user how to view a copy of this
+    License. (Exception: if the Program itself is interactive but
+    does not normally print such an announcement, your work based on
+    the Program is not required to print an announcement.)
+
+
+These requirements apply to the modified work as a whole. If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works. But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+  3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+    a) Accompany it with the complete corresponding machine-readable
+    source code, which must be distributed under the terms of Sections
+    1 and 2 above on a medium customarily used for software interchange; or,
+
+    b) Accompany it with a written offer, valid for at least three
+    years, to give any third party, for a charge no more than your
+    cost of physically performing source distribution, a complete
+    machine-readable copy of the corresponding source code, to be
+    distributed under the terms of Sections 1 and 2 above on a medium
+    customarily used for software interchange; or,
+
+    c) Accompany it with the information you received as to the offer
+    to distribute corresponding source code. (This alternative is
+    allowed only for noncommercial distribution and only if you
+    received the program in object code or executable form with such
+    an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it. For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable. However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+
+  4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License. Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+  5. You are not required to accept this License, since you have not
+signed it. However, nothing else grants you permission to modify or
+distribute the Program or its derivative works. These actions are
+prohibited by law if you do not accept this License. Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+  6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions. You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+  7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all. For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices. Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+
+  8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded. In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+  9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time. Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number. If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation. If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+  10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission. For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this. Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+                            NO WARRANTY
+
+  11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+  12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+                     END OF TERMS AND CONDITIONS
+
+
+            How to Apply These Terms to Your New Programs
+
+  If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+  To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+    <one line to give the program's name and a brief idea of what it does.>
+    Copyright (C) <year> <name of author>
+
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 2 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program; if not, write to the Free Software
+    Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+    Gnomovision version 69, Copyright (C) year name of author
+    Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+    This is free software, and you are welcome to redistribute it
+    under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License. Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary. Here is a sample; alter the names:
+
+  Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+  `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+  <signature of Ty Coon>, 1 April 1989
+  Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs. If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library. If this is what you want to do, use the GNU Library General
+Public License instead of this License.
=== added file 'plugin/pbxt/ChangeLog'
--- plugin/pbxt/ChangeLog 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/ChangeLog 2010-04-01 14:19:35 +0000
@@ -0,0 +1,801 @@
1PBXT Release Notes
2==================
3
4------- 1.0.09e RC3 - Not released yet
5
6RN283: Fixed bug that cause the error "[ERROR] Invalid (old?) table or database name 'mysqld.1'", when running temp_table.test under MariaDB (thanks to Monty for his initial bug fix).
7
8RN282: Added win_inttypes.h to the distribution. This file is only required for the Windows build.
9
10RN281: Fixed bug #451101: jump or move depends on uninitialised value in myxt_get_key_length
11
12RN280: Fixed bug #451080: Uninitialised memory write in XTDatabaseLog::xlog_append
13
14RN279: Fixed bug #451085: jump or move depends on uninitialised value in my_type_to_string
15
16RN278: Fixed bug #441000: xtstat crashes with segmentation fault on startup if max_pbxt_threads exceeded.
17
18------- 1.0.09d RC3 - 2009-09-30
19
20RN277: Added r/o flag to pbxt_max_threads server variable (this fix is related to bug #430637)
21
22RN276: Added test case for replication on tables w/o PKs (see bug #430716)
23
24RN275: Fixed bug #430600: 'Failed to read auto-increment value from storage engine' error.
25
26RN274: Fixed bug #431240: xtstat fails if no PBXT table has been created. xtstat now accepts --database=information_schema or --database=pbxt. Depending on this setting PBXT will either use the information_schema.pbxt_statistics or the pbxt.statistics table. If information_schema is used, then the statistics are displayed even when no PBXT table exists. Recovery activity is also displayed, unless pbxt_support_xa=1, in which case MySQL will wait for PBXT recovery to complete before allowing connections.
27
28RN273: Fixed bug #430633: XA_RBDEADLOCK is not returned on XA END after the transaction ended with a deadlock.
29
30RN272: Fixed bug #430596: Backup/restore does not work well even on a basic PBXT table with auto-increment.
31
32------- 1.0.09c RC3 - 2009-09-16
33
34RN271: Windows build update: now you can simply put the pbxt directory under <mysql-root>/storage and build the PBXT engine as a part of the source tree. The engine will be linked statically. Be sure to specify the WITH_PBXT_STORAGE_ENGINE option when running win\configure.js
35
36RN270: Correctly disabled PBMS so that this version now compiles under Windows. If PBMS_ENABLED is defined, PBXT will not compile under Windows because of a getpid() call in pbms.h.
37
38------- 1.0.09 RC3 - 2009-09-09
39
40RN269: Implemented online backup. A native online backup driver now performs BACKUP and RESTORE DATABASE operations for PBXT. NOTE: This feature is only supported by MySQL 6.0.9 or later.
41
42RN268: Implemented XA support. PBXT now supports all XA related MySQL statements. The variable pbxt_support_xa determines if XA support is enabled. Note: due to MySQL bug #47134, enabling XA support could lead to a crash.
43
44------- 1.0.08d RC2 - 2009-09-02
45
46RN267: Fixed a bug that caused MySQL to crash on shutdown, after an incorrect command line parameter was given. The crash occurred because the background recovery task was not cleaned up before the PBXT engine was de-initialized.
47
48------- 1.0.08c RC2 - 2009-08-18
49
50RN266: Updated BLOB streaming glue, used with the PBMS engine. The glue code is now identical to the "1.0.08-rc-pbms" version of PBXT available from http://blobstreaming.org/download.
51
52RN265: Changed the sequential reading of data log files to skip gaps, instead of returning EOF. This ensures that extended data records are preserved even when something goes wrong with the way the file is written.
53
54RN264: Fixed a bug that caused a "Data log not found" error after an out of disk space error on a log file. This bug is similar to RN262 in that it allows "gaps" to appear in the data logs.
55
56RN263: Updated xtstat to compile on Windows/MS Visual C++.
57
58RN262: Merged changes for PBMS version 0.5.09.
59
60RN261: Concerning bug #377788: Cannot find index for FK. Fixed buffer overflow which occurred when the error was reported.
61
62RN260: Fixed bug #377788: Cannot find index for FK. PBXT now correctly uses prefix of an index to support FK references (e.g. if key = (c1, c2) then an index on (c1, c2, c3) will work). Also fixed buffer overflow, which occurred when reporting the error.
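
The prefix rule in RN260 can be illustrated with a minimal sketch. This is not PBXT's code; the function name index_supports_fk is hypothetical, and the only rule modelled is the one stated above: an index is usable if the key columns are its leading columns, in order.

```python
def index_supports_fk(fk_columns, index_columns):
    """True if fk_columns are the leading columns of index_columns, in order.

    Hypothetical helper for illustration only; it models the prefix rule
    described in RN260, not PBXT's actual lookup code.
    """
    fk = list(fk_columns)
    return len(fk) > 0 and list(index_columns[:len(fk)]) == fk


# key = (c1, c2) is served by an index on (c1, c2, c3)
print(index_supports_fk(("c1", "c2"), ("c1", "c2", "c3")))
```

An index on (c2, c1, c3) would not qualify under this rule, because the leading columns are in a different order.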
63
64RN259: Fixed bug #309424: xtstat doesn't use my.cnf. You can now add an [xtstat] section to my.cnf, for use with xtstat.
65
66RN258: updated xt_p_join implementation for Windows to check if a thread has already exited or has not yet started
67
68RN257: Removed false assertion that could fail during restore if a transaction log page was zero-filled
69
70RN256: Update datalog eof pointer only if write operations were successful
71
72RN255: Added re-allocation of the filemap if allocation of the new map failed. This often happens if there's not enough space on disk.
73
74RN254: When a table with a corrupted index is detected, PBXT creates a file called 'repair-pending' in the pbxt directory, with the name of the table in it. Each table in the file is listed on a line by itself (the last line has no trailing \n). When the table is repaired (using the REPAIR TABLE command), this entry is removed from the file.
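
A tool that wants to inspect the 'repair-pending' file could read it along the following lines. This is a sketch under the format described in RN254 (one table per line, no trailing newline on the last line); the function names and the example table names are hypothetical, and the exact form in which PBXT writes a table name is not stated here.

```python
def parse_repair_pending(text):
    """Return the entries listed in the body of a 'repair-pending' file."""
    # Entries are separated by "\n"; the last line has no trailing newline.
    return [line for line in text.split("\n") if line]


def remove_entry(text, table):
    """Return the file body without `table`, again with no trailing newline."""
    return "\n".join(name for name in parse_repair_pending(text) if name != table)


body = "test/t1\ntest/t2"  # hypothetical entries
print(parse_repair_pending(body))
print(repr(remove_entry(body, "test/t1")))
```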
75
76RN253: Use fcntl(F_FULLFSYNC) instead of fsync on platforms that support it. Improper fsync operation was presumably the reason of index corruption on Mac OS X.
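
The fallback pattern in RN253 looks roughly like this when expressed in Python. PBXT itself does this in its C++ file layer, so this is only an illustration of the idea: use F_FULLFSYNC where the platform defines it, otherwise fall back to fsync. The function name full_sync and its return value are assumptions made for the example.

```python
import fcntl
import os


def full_sync(fd):
    """Flush fd to stable storage and return the name of the method used."""
    if hasattr(fcntl, "F_FULLFSYNC"):
        # Defined on Mac OS X: asks the drive to flush its own cache as well.
        fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
        return "F_FULLFSYNC"
    # Other platforms: plain fsync.
    os.fsync(fd)
    return "fsync"
```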
77
78RN252: Fixed bug #368692: PBXT not reporting data size correctly in information_schema.
79
80------- 1.0.08 RC2 - 2009-06-30
81
82RN251: A Windows-specific test update, also removed false assertion that failed on Windows.
83
84RN250: Fixed a bug that caused recovery to fail when the transaction log ID exceeded 255. The problem was a checksum failed in the log record.
85
86RN249: Fixed bug #313176: Test case timeout. This happened because record cache pages were not properly freed and as soon as the cache filled up the performance degraded.
87
88RN248: PBXT now compiles and runs with MySQL 5.1.35. All tests pass.
89
90RN247: Fixed bug #369086: Incosistent/Incorrect Truncate behavior
91
92RN246: Fixed bug #378222: Drop sakila causes error: Cannot delete or update a parent row: a foreign key constraint fails
93
94RN245: Fixed bug #379315: Inconsistent behavior of DELETE IGNORE and FK constraint.
95
96RN244: Fixed a recovery problem: during the recovery of "record modified" action the table was updated before the old index entries were removed; then the xres_remove_index_entries was supplied the new record which led to incorrect index update.
97
98RN243: Fixed a bug that caused a recovery failure if partitioned pbxt tables were present. This happened because the recovery used a MySQL function to open tables and the PBXT handler was not yet registered
99
100RN242: Fixed a bug that caused a deadlock if pbxt initialization failed. This happened because pbxt cleanup was done from pbxt_init() with PLUGIN_lock being held by MySQL which led to a deadlock in the freeer thread
101
102RN241: Fixed a heap corruption bug (writing to a freed memory location). It happened only when memory mapped files were used leading to heap inconsistency and program crash or termination by heap checker. Likely to happen right after or during DROP TABLE but possible in other cases too.
103
104RN240: Load the record cache on read when not using memory mapped files.
105
106RN239: Added PBXT variable pbxt_max_threads. This is the maximum number of threads that can be created by PBXT. By default this value is set to 0 which means the number of threads is derived from the MySQL variable max_connections. The value used is max_connections+7. Under Drizzle the default value is 500.
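
The default described in RN239 amounts to the following calculation under MySQL. The function name effective_max_threads is hypothetical; only the "0 means max_connections+7" rule comes from the note above.

```python
def effective_max_threads(pbxt_max_threads, max_connections):
    """Thread limit under MySQL as described in RN239 (illustrative only)."""
    if pbxt_max_threads == 0:
        # 0 means: derive the limit from max_connections
        return max_connections + 7
    return pbxt_max_threads


print(effective_max_threads(0, 100))
```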
107
108RN238: Added an option to wait for the sweeper to clean up old transactions on a particular connection. This prevents the sweeper from getting too far behind.
109
110RN237: Added an option to lazy delete fixed length index entries. This means the index entries are just marked for deletion, instead of removing the items from the index page. This has the advantage that an exclusive lock is not always required for deletion.
111
112RN236: Fixed bug #349177: a bug in configure.in script.
113
114RN235: Fixed bug 349176: a compiler warning.
115
116RN234: Completed Drizzle integration. All Drizzle tests now run with PBXT.
117
118RN233: Fixed bugs which occur when PBXT is used together with PBMS (BLOB Streaming engine).
119
120RN232: Merged Drizzle-specific changes into the main tree.
121
122RN231: Fixed a bug that caused bad performance as the number of threads increased. This occurred when the number of open table handles exceeded 'table_open_cache', and MySQL started closing open table handlers. PBXT was flushing a table when all table handlers were closed. PBXT will now only do this when the FLUSH TABLES statement is used.
123
124RN230: Improved efficiency of conflict resolution: Implemented a queue for threads waiting for a lock. Threads no longer poll to take a lock. If a temp lock is granted because of an update, then the thread granted the temp lock will also wait for the transaction that did the update to quit.
125
126RN229: Fixed bug #313391: LOAD DATA ... REPLACE broken.
127
128RN228: Fixed bug #341115: 'Out of memory' error (a bug in key comparison algorithm).
129
130RN227: Changed conflict handling to use spin locks and improve efficiency.
131
132RN226: Fixed bug #340316: Issue with bigint unsigned auto-increment field.
133
134RN225: Fixed bug #308557: UPDATE fails to match all rows in a transactional scenario.
135
136RN224: Fixed a deadlock which could occur during table scans.
137
138RN223: Index scans now use handles to cache buffers instead of making a copy of the index page. The handles are "copy-on-write".
139
140RN222: Fixed a bug that caused the server to hang on startup if PBXT ran out of record cache while waiting for the sweeper to complete.
141
142RN221: Fixed an index recovery bug. This occurred if the server crashed after operating in low index cache situations.
143
144RN220: Improved index selectivity estimation: added scanning from the end of index backwards.
145
146RN219: Fixed a problem: during intersected range scan not all fields were returned by engine to MySQL.
147
148RN218: Changed the way row locking (used by SELECT FOR UPDATE) works. Previously we locked a group of rows at once (although there were many groups). However, this caused conflicts even when the same rows were not locked. We now lock individual rows.
149
150RN217: Fixed bug #315564: Rollbacked inserts remain permanently in table.
151
152RN216: Added lock tracing. In DEBUG mode, each thread has a list of locks (semaphores, mutexes, r/w locks that it holds).
153
154RN215: Fixed a bug that caused a crash during restart if an index file was flushed during recovery.
155
156RN214: Fixed bug #310184: Deadlock when trying to wake up transactions
157
158RN213: Fixed an index corruption bug on SPARC Solaris. Note this error will occur on any machine that does not use the x86 (little endian) byte order.
159
160------- 1.0.07 RC - 2008-12-15
161
162RN212: Fixed build problems on NetBSD.
163
164RN211: Fixed build problems on FreeBSD.
165
166RN210: Fixed build problems on OpenSolaris.
167
168RN209: Added handling of the foreign_key_checks system flag.
169
170RN208: xtstat will now automatically reconnect if the connection to server is lost.
171
172RN207: Foreign key references are now checked on CREATE TABLE.
173
174RN206: Fixed a crash if inserting into a table that has an FK that references a column that has no index on it.
175
176RN205: Added processing of foreign key action SET DEFAULT.
177
178RN204: Fixed an index recovery problem: unswept index entries were not recovered correctly
179
180RN203: Fixed foreign key bug: REPLACE fails with 'on delete cascade'
181
182RN202: Fixes and updates to tests, now all tests pass on windows and linux.
183
184RN201: Fixed ref-counting for mmapped files.
185
186RN200: Fixed an index recovery problem: unswept index entries were not recovered correctly .
187
188RN199: Recovery now takes place on plug-in startup. Previously recovery occurred when the first PBXT table was accessed.
189
190RN198: Fixed a recovery bug that caused index entries to get out of sync with the data file.
191
192RN197: Improved the efficiency of group commit.
193
194RN196: Changed checkpointing so that it now works during idle time. Every record, row or index file flush now also contributes to the checkpoint (fuzzy checkpointing). Checkpointing is forced to complete after about 50% of the checkpoint threshold in order to ensure the correct maximum for log reading on recovery.
195
196RN195: Fixed scheduling bug that caused sweeper to get behind with the cleanup, which caused performance problems in high conflict situations. Foreground threads will now wait if the sweeper gets too far behind.
197
198RN194: Created the xtstat program which monitors the internal performance of PBXT. Run xtstat --help for more detailed information on the output.
199
200RN193: Implemented the pbxt.statistics virtual table. The statistics table returns information about the internal activity of the engine. This includes I/O byte counts, cache hit counts and usage, commit count, etc.
201
202RN192: Due to timing issues in the engine API it could happen that the client received an OK for a committed transaction before the transaction was actually committed. This problem has been fixed.
203
204RN191: Fixed a bug that caused a hang when conflicts occurred while reading a covering index.
205
206RN190: Previously the sweeper delayed deletion of transaction structures until all transactions that were running during sweeping have quit. This is now handled by the same code that fixed the bug in RN189.
207
208RN189: Fixed a bug that could cause a row to go missing due to a visibility issue.
209
210RN188: Fixed a bug which occurred when using CREATE TABLE ... AVG_ROW_LENGTH=x, and the table contained BLOBs. In this case, alter table corrupted the table.
211
212RN187: Windows now stores paths in the location file in UNIX format by converting all '\' characters to '/'. Note that the location file is only cross-platform if the paths are relative (which is the default).
213
214RN186: Set version number to 1.0.07.
215
216------- 1.0.06 Beta 2 - 2008-11-06
217
218RN185: Disabled support for INSERT DELAYED because of MySQL bug #40505
219
220RN184: Implemented info(flag == HA_STATUS_AUTO) engine API call. This call returns the next value that will be assigned as auto-increment value on the table.
221
222RN183: Turned off streaming on Windows (see XT_STREAMING macro in sources)
223
224RN182: Switch code base to the latest version of BLOB streaming engine (PBMS): www.blobstreaming.org.
225
226RN181: Updated pbxt-test-run default parameters (--force is on, --default-storage-engine is pbxt, --base-dir is set according to config)
227
228RN180: PBXT can now cope with a missing .xti file (the file that contains the table indexes). This file can be regenerated using REPAIR TABLE.
229
230RN179: On recovery PBXT now creates a file called 'recovery-progress' in the pbxt database. The recovery percentage complete is written to this file as recovery progresses. Note that this file will not be created if no recovery is necessary or if PBXT estimates that it will read less than 10MB to do recovery.
231
232RN178: Fixed a problem in CHECK TABLE that caused memory corruption for fixed-size records
233
234RN177: Added "crash debugging". When enabled, crash debugging does the following:
235 - Create a core dump on Windows if the server crashes.
236 - Make a backup copy of the datadir directory before recovery if the server crashes.
237 - Keep at least 5 of the previous transaction logs.
238Currently crash debugging is enabled by default. To disable, create a file called 'no-debug' in the pbxt database folder, and restart the server. When crash debugging is disabled by default, it can be enabled by creating a file called 'crash-debug' in the pbxt database folder.
239
240RN176: Fixed a bug: a lock was not released appropriately
241
242RN175: Fixed some debug assertions
243
244RN174: Fixed some of test/mysql-test tests
245
246RN173: Fixed a RENAME TABLE bug, that prevented index files from being properly recreated
247
248RN172: Added the file ./pbxt/lock-pid. This file is locked while the server is running, and contains the process ID of the server. PBXT will return an error on startup if the file is locked or the process is still running in order to prevent a second server from being started.
249
250RN171: Implemented the AVG_ROW_LENGTH table attribute. When set, this value determines the size of the fixed length data component of a record. Normally this size is estimated depending on the column definitions. The command CHECK TABLE dumps the current average row length to the log. This can be used to find a suitable value for AVG_ROW_LENGTH.
251
252RN170: Changed configure so that debug/optimize flags set for building the engine override the flags set for MySQL. If --with-debug is not specified, then the engine will use the flags set when building MySQL. If MySQL was built with --with-debug=full, then DEBUG will be defined for the engine. When building the engine, the following flags can be set:
253 yes - Debug symbols enabled, no optimization, DEBUG not defined.
254 full - Debug symbols enabled, no optimization, DEBUG defined.
255 only - Debug symbols enabled, MySQL flags used, DEBUG not defined.
256 prof - Profile code enabled, optimization on, DEBUG not defined.
257 no - No debug symbols, optimization on, DEBUG not defined.
258
259RN169: Used MySQL root Makefile instead of config.status in order to extract settings (such as CFLAGS and CXXFLAGS) for the PBXT build.
260
261RN168: Fixed Windows build after merging changes for Drizzle.
262
263RN167: Fixed "This table requires primary key" error in sql-bench.
264
265RN166: Fixed threading problems that caused crashes in sql-bench.
266
267RN165: Added sql-bench to pbxt source tree.
268
269RN164: Ported PBXT to Drizzle. To compile for Drizzle DRIZZLED must be defined on the command line. The -drz.am and -drz.in files must be used when PBXT is embedded in Drizzle.
270
271RN163: Added "make test" build step. Running "make test" from the root of pbxt source tree will launch test/mysql-test/pbxt-test-run.pl with appropriate options to execute the pbxt functional test suite. On Windows where
272pbxt is statically linked into mysql server binary pbxt testing works by going to test/mysql-test directory and running ./pbxt-test-run.pl with --base-dir argument pointing to a mysql source tree (mysql binaries are taken
273from there) and passing the rest of usual arguments (--force --mysqld=--default-storage-engine=pbxt)
274
275RN162: The 'pbxt' database must now be dropped explicitly. It is automatically created when the first PBXT table is created. After that, the pbxt database can be dropped once all PBXT tables have been dropped. Dropping the pbxt database will also cause all transaction (pbxt/system directory) and data logs (pbxt/data directory) to also be deleted.
276
277RN161: Added pbxt.location system table. This table can only be dropped when all PBXT tables have been deleted. Dropping the system table will cause all transaction (pbxt/system directory) and data logs (pbxt/data directory) to also be deleted.
278
279RN160: Made changes to run with MySQL 6.0.6.
280
281RN159: Changes to configure: added --with-plugindir=<path>, which should be used to specify the plugin directory. This means that --libdir should no longer be used. For backwards compatibility configure will still recognize this option if the path ends with 'plugin'.
282
283Also updated --help, to include all options, and better descriptions of the options.
284
285The configure options are now as follows:
286
287--with-mysql=<path> - (Required) It specifies the path to the MySQL source tree. The source should already be built. All other options will be taken from the MySQL build by default.
288--with-debug=yes/no - (Optional) Specify if the engine should be built with different debug options to the MySQL source tree.
289--with-plugindir=<path> - (Optional) Specify an alternative installation directory for the plugin. By default it will be installed in the plugin directory of the MySQL installation.
290
291
292RN158: Added support for core dumps on Windows. This can be enabled by defining XT_COREDUMP. On by default at the moment. If the server crashes a file called PBXTCore00000001.dmp will be created in the data directory. This file can be opened using MS VS.
293
294RN157: Fixed a compile problem with tv_nsec which is not supported on all platforms.
295
296RN156: Updated tests to run with MySQL 5.1.28.
297
298RN155: Errors during cascade update of VARCHAR values with trailing spaces
299
300RN154: Fixed a bug: impossible to create a foreign key that referenced an ENUM or SET column
301
302RN153: Fixed a bug that caused the following problems: #1. Foreign keys: crash if update cascade and autocommit=0 #2. Foreign keys: crash if update cascade and multi-level recursion
303
304RN152: Fixed missing information about foreign keys in I_S.table_constraints and I_S.referential_constraints
305
306------- 1.0.05 Beta - 2008-08-30
307
308RN151: "Quick config": It is now possible to configure the engine by just specifying the mysql source code tree (the --with-mysql option). The --libdir and --with-debug setting will be deduced automatically.
309
310RN150: Added system variable pbxt_sweeper_priority, 0 = low (default), 1 = normal (same as user threads), 2 = high. The sweeper cleans up deleted records (deleted records also result from an update). If allowed to accumulate, these records can slow searches. Higher priority for the sweeper is recommended on systems with 4 or more cores.
311
312RN149: Record cleanup is now initiated if a deleted record is found, and the transaction that deleted the record has ended. Since waking up the sweeper is an expensive operation, normally the sweeper will run every 1/10th of a second.
313
314RN148: Fixed a bug which caused transaction starvation (one transaction was constantly locked out) during high conflict updates. This led to cleanup of records not being done, which led to a general slow down.
315
316RN147: Fixed a problem with TRUNCATE TABLE: a failed TRUNCATE TABLE could put the engine into an invalid state that later caused a crash
317
318RN146: Fixed a bug that caused the error: "-49: Record format unknown, either corrupted or upgrade required".
319
320RN145: Added pbxt_db_offline_log_function system variable, 0 = recycle logs (default), 1 = delete logs (default on Mac OS X), 2 = keep logs.
321
322------- 1.0.04 Alpha - 2008-08-02
323
324RN144: Completed port and testing of Windows version.
325
326RN143: Fixed a bug which caused the free-er thread to hang. This was a result of an invalid operation ID, which was the result of the checkpointer flushing the table at the same time as a foreground thread.
327
328RN142: The fast RW/mutex lock can now handle nested calls. This is possible during a sequential scan.
329
330RN141: The normal behavior in MySQL is that an auto-increment value will be re-issued if you delete the row containing the current maximum auto-increment value and then restart the server. To prevent this you can use ALTER TABLE my_table AUTO_INCREMENT = <current-max-auto-increment> + 1, before deleting the current maximum auto-increment value.
331
332A new system variable, pbxt_auto_increment_mode, has been added so that this workaround is not necessary. When set to 0 (the default), auto-increment works as described above. When set to 1, the AUTO_INCREMENT value of the table is automatically set to prevent previously issued auto-increment values being returned.
333
334However, if the server crashes, a gap of up to 100 unique values can result, because the table AUTO_INCREMENT value is incremented in steps of 100.
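
The gap after a crash can be seen in a small model. This is only a simulation of the behaviour described for pbxt_auto_increment_mode=1, not PBXT's implementation: the class and attribute names are hypothetical, and the model assumes the stored AUTO_INCREMENT value is advanced by 100 whenever the in-memory counter reaches it.

```python
STEP = 100  # step size stated in the release note


class AutoIncrementModel:
    """Toy model: a persisted high-water mark plus an in-memory counter."""

    def __init__(self, start=1):
        self.persisted = start   # stored with the table, survives a crash
        self.next_value = start  # in-memory counter, lost on a crash

    def issue(self):
        # Advance the stored value before handing out a value that reaches it.
        if self.next_value >= self.persisted:
            self.persisted = self.next_value + STEP
        value = self.next_value
        self.next_value += 1
        return value

    def crash_and_restart(self):
        # After a crash only the stored value is known.
        self.next_value = self.persisted


model = AutoIncrementModel()
before = [model.issue() for _ in range(3)]
model.crash_and_restart()
after = model.issue()
print(before, after)
```

In this model the values issued before the crash are never issued again, at the cost of skipping the unused remainder of the current step.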
335
336RN140: Index statistics are now automatically recalculated when the table row count exceeds 200.
337
338RN139: Fixed a bug that caused index corruption, error: "int idx_push(index_xt.cc:172) -2: Core B-tree too deep".
339
340RN138: Handle startup and recovery when an index is corrupted.
341
342RN137: Fixed a bug in the zero wait R/W lock that caused the lock to fail (the state is extremely volatile, and must be written to memory after increment).
343
344RN136: Fixed a bug that caused the error "int xt_pwrite_file(filesys_xt.cc:789) errno (14): Bad address".
345
346RN135: Fixed TRUNCATE TABLE that did not work correctly when the table contained BLOBs stored in the BLOB streaming engine (www.blobstreaming.org).
347
348RN134: Fixed a bug that caused duplicate rows to be returned from an index scan (using a SELECT FOR UPDATE) if a concurrent update was done.
349
350RN133: Optimised PBXT for multi-processor scale-up. This mostly involved using different types of locks instead of the standard pthread mutex and reader/writer locks [TODO: 0038].
351
352------- 1.0.03 Alpha - 2008-05-30
353
354RN132: Fixed bug when using PBXT in conjunction with the BLOB streaming engine (www.blobstreaming.org). Uploaded BLOBs could not be inserted into a table.
355
356RN131: Fixed wait for background processes on shutdown. Shutdown will wait a maximum of 16 seconds for each process.
357
358RN130: Fixed calculation of bytes to be read for recovery.
359
360RN129: Fixed bug in cleanup of unterminated transactions.
361
362RN128: The writer will now start working when one of the following is true:
363- it is time for a checkpoint,
364- the log cache is almost full,
365- the free'er is waiting for the writer,
366- there is no other activity.
367
368RN127: Fixed checkpoint frequency. Checkpointing is now done correctly after 'pbxt_checkpoint_frequency' bytes.
369
370RN126: Implemented index consistent write [TODO: 0050].
371
372RN125: Implemented memory mapping for row pointer (.xtr) and handle data files (.xtd).
373
374RN124: Index files now use direct I/O.
375
376------- 1.0.02 Alpha - 2008-04-25
377
378RN123: Fixed compile errors with MySQL 5.1.24.
379
380------- 1.0.01 Alpha - 2008-03-28
381
382RN122: ++++ NOTE: This version is not compatible with older versions of PBXT ++++.
383
384RN121: Transaction logs are now global so that multi-database statements are now possible. This also makes it possible to work with PBXT temporary tables.
385
386RN120: Transaction logs pre-allocated and recycled.
387
388RN119: Transaction log writes on 512 byte boundaries only.
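
Writing only on 512 byte boundaries implies rounding each write up to the next multiple of 512. The arithmetic can be sketched as follows; the function names are hypothetical, and zero padding is an assumption, since the note does not say how PBXT fills the unused part of a block.

```python
BOUNDARY = 512  # boundary stated in RN119


def align_up(n):
    """Round n up to the next multiple of BOUNDARY."""
    return (n + BOUNDARY - 1) // BOUNDARY * BOUNDARY


def pad_to_boundary(data):
    """Pad data with zero bytes (an assumption) to a multiple of BOUNDARY."""
    return data + b"\0" * (align_up(len(data)) - len(data))


print(align_up(513), len(pad_to_boundary(b"abc")))
```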
389
390------- 1.0.00 Alpha - 2008-03-10
391
392This version has alpha status because of the large number of changes done for full durability.
393
394RN118: ++++ NOTE: This version is incompatible to older versions of PBXT ++++.
395
396RN117: Documentation now available at http://www.primebase.org/documentation.
397
398RN116: Corrected the plug.in file so that PBXT compiles when dropped into the storage directory in the MySQL source tree.
399
400RN115: Compiled and tested with MySQL 5.1.23.
401
402RN114: Increased index block size. Minimum is now 4K. Default is 16K.
403
404RN113: Calculate index selectivity to return a more accurate value from records_in_range(). NOTE: FLUSH TABLES will update the index statistics, after data has been inserted or updated.
405
406RN112: Optimized table storage, saving 8 bytes per row.
407
408RN111: Optimized search on keys containing 2 or 3 not null integer values.
409
410RN110: Optimization: store the row ID in the index so that an index entry can be verified as current without loading the record. This is necessary to optimize an access with index coverage.
411
412RN109: Optimization: only load the record extended data if required.
413
414RN108: Implemented SHOW ENGINE PBXT STATUS;
415
416RN107: Added the following system variables:
417
418pbxt_index_cache_size - The amount of memory allocated to the index cache, used only to cache index data
419pbxt_record_cache_size - The amount of memory allocated to the record cache used to cache table data
420pbxt_log_cache_size - The amount of memory allocated to the transaction log cache used to cache transaction log data
421pbxt_log_file_threshold - The size of a transaction log before rollover, and a new log is created
422pbxt_transaction_buffer_size - The size of the global transaction log buffer (the engine allocates 2 buffers of this size)
423pbxt_log_buffer_size - The size of the buffer used to cache data from transaction and data logs during sequential scans, or when writing a data log
424pbxt_checkpoint_frequency - The amount of data written to the transaction log before a checkpoint is performed
425pbxt_data_log_threshold - The maximum size of a data log file
426pbxt_garbage_threshold - The percentage of garbage in a data log file before it is compacted
427
428RN106: PBXT now compiles for MySQL 6.0.3.
429
430RN104: Updates now lock a record temporarily. This prevents most "record changed" errors, however, it makes UPDATE statements a type of "committed read". This means that you may update a different value to that which you selected in repeatable read mode. To avoid this, use SELECT FOR UPDATE if you plan to UPDATE records after reading.
431
432RN103: Implemented SELECT FOR UPDATE. This is implemented by turning SELECT FOR UPDATE into a type of "committed read". This means that, if you do a SELECT followed by a SELECT FOR UPDATE you can get different results, even in repeatable read mode.
433
434RN102: Implemented recovery of index entries. Note: indexes are not yet fully consistent. This means that an index can become corrupted due to a crash. Data, however, cannot be lost. The indexes can be rebuilt using REPAIR TABLE.
435
436RN101: Writing and flushing of a single transaction write-ahead log.
437
438RN100: Automatic rollover of transaction logs as they become full.
439
440RN99: Implementation of the transaction log cache.
441
442RN98: Group commit.
443
444RN97: Implementation of the writer thread that applies changes in the transaction log to the database.
445
446RN96: Implementation of the checkpointer thread that periodically flushes the database and writes a checkpoint which determines the recovery start point.
447
448RN95: Implementation of the free'er thread that is responsible for keeping the record cache at a preset level.
449
450RN94: Modifications to the record cache so that rows are stored in pages, in order to speed up sequence access.
451
452RN93: Implemented the recovery process which applies changes written to the log that are not in the database, on startup.
453
454RN92: Modification of the sweeper thread which cleans up rolled-back transactions and deleted data, to use the new transaction log format.
455
456RN91: Modifications to the data logs so that they use the same record structure as the transaction logs.
457
458RN90: The data logs are now managed "per database" in order to minimize the work done to flush and commit a transaction.
459
460RN89: Implementation of a file handle pool for the data logs.
461
462------- 0.9.91 Beta - 2007-10-30
463
464RN88: The format of the URL generated by MyBS has been changed. The format of the BLOB URLs is now as follows:
465
466'~*' <db-name> '/' <type-char> <table-id> '-' <blob-id> '-' <access-code> '-' <server-id>
467
468Where <type-char> is '_' or '~'.
469
470Examples: ~*test/_11-128-fbd590b-0, ~*test/~1-524-3dc45b09-0
471
472In other words, the character '>' has been replaced by '*', '^' has been replaced by '_' and ':' has been replaced by '~'. The reason for this is that the characters '>' and '^' are not allowed in URLs, and must be URL-encoded. The character ':' is reserved, but allowed.
473
474NOTE: This change makes this version incompatible with previous versions of MyBS. If you have a table with BLOB URLs, you can upgrade the URLs as follows:
475
476UPDATE blob_table SET blob_col = REPLACE(REPLACE(blob_col, '~>', '~*'), '/:', '/~');
477
478Replacing '^' is not necessary because BLOB URLs with '^' should not appear in tables.
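
The URL format and the upgrade statement above can be checked with a short sketch. The regular expression is inferred from the format line and the two examples only: treating the IDs as decimal and the access code as hexadecimal is an assumption, and the function names are hypothetical.

```python
import re

# '~*' <db-name> '/' <type-char> <table-id> '-' <blob-id> '-' <access-code> '-' <server-id>
BLOB_URL = re.compile(
    r"^~\*(?P<db>[^/]+)/(?P<type>[_~])(?P<table_id>\d+)"
    r"-(?P<blob_id>\d+)-(?P<access_code>[0-9a-f]+)-(?P<server_id>\d+)$"
)


def parse_blob_url(url):
    """Return the URL components as a dict, or None if the URL does not match."""
    m = BLOB_URL.match(url)
    return m.groupdict() if m else None


def upgrade_blob_url(url):
    """Same substitutions as the UPDATE ... REPLACE(REPLACE(...)) statement."""
    return url.replace("~>", "~*").replace("/:", "/~")


print(parse_blob_url("~*test/_11-128-fbd590b-0"))
print(upgrade_blob_url("~>test/:1-524-3dc45b09-0"))
```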
479
480------- 0.9.90 Beta - 2007-10-17
481
482RN87: Corrected stack trace of errors passed through the BLOB streaming API.
483
484RN86: Added new engine API accessor functions that appeared in 5.1.21 (thanks Stewart).
485
486RN85: Added plug.in file. PBXT now compiles when dropped into the storage directory of the MySQL build tree. However, you have to rebuild configure. For example:
487
488rm -rf autom4te.cache/
489aclocal
490autoconf
491autoheader
492automake -a
493./configure --help
494./configure --with-plugins=max --without-innodb --prefix=/usr/local/mysql --with-debug=full
495
496NOTE: ./configure --help should show that the PBXT has been included.
497
498RN84: Fixed several problems with shutdown of PBXT in combination with MyBS.
499
500------- 0.9.89 Beta - 2007-08-17
501
502RN83 (2007-08-21): Fixed a crash due to a compile bug that does not like the contruct *((xtWordPS *) &(v)) = (xtWordPS) (x) (macro allocr_() and alloczr_()).
503
504RN82: It is now possible to insert non-URL values into a LONGBLOB field. In the previous version this generated an "Invalid URL" error. Such values can be retrieved as a stream using a field reference.
505
506RN81: Fixed a bug that caused PBXT to crash during certain operations when MyBS was not installed.
507
508RN80: Set engine as capable of row-level replication, but not as statement replication. Statement replication does not work because MVCC is not serializable.
509
510------- 0.9.88 Beta - 2007-07-25
511
512RN79: Made some corrections in order to compile with MySQL 5.1.20.
513
514RN78: Support for the features of the MyBS BLOB Streaming engine, version 0.5 Alpha.
515
516RN77: Bugfix: The server crashes during BLOB data handling. The reason is the table field structure is shared, and may not be changed.
517
518------- 0.9.87 Beta - 2007-06-19
519
520RN76: The major feature of this release is support for the BLOB Streaming Engine. The current version enables the download of specific BLOB columns via the Streaming Engine. For example:
521
522use test;
523CREATE TABLE notes_tab (
524 n_id INTEGER PRIMARY KEY,
525 n_text BLOB
526) ENGINE=pbxt;
527INSERT notes_tab VALUES (1, "This is a BLOB streaming test!");
528
529The URL:
530
531http://localhost:8080/test/notes_tab/n_text/n_id=1
532
533will return the value "This is a BLOB streaming test!"
534
535RN75: Bugfix: MySQL prints error: "Plugin 'PBXT' will be forced to shutdown". This error was caused by the plug-in having a reference to itself.
536
537RN74: Added system variables pbxt_index_cache_size and pbxt_record_cache_size. These variables can now be set on the mysqld command line (for example: --pbxt_record_cache_size=50MB). The values are also displayed by SHOW VARIABLES.
538
539------- 0.9.86 Beta - 2007-04-07
540
541RN74: ++++ NOTE: This version is incompatible with older versions of PBXT ++++.
542
543In order to upgrade, install the older version of PBXT. Convert all tables to MyISAM using ALTER TABLE t1 ENGINE=MyISAM. Then install the new version of PBXT and convert back using ALTER TABLE t1 ENGINE=PBXT.
544
545RN73: Each table will now use a maximum of 4 data log files. This means a maximum of 7 files per table. The minimum is 3 for tables that do not have a variable field that exceeds about 40 bytes in size. This means that under Linux PBXT requires a maximum of 7 file handles per table used. Windows' lack of pread/pwrite (atomic seek and read/write) functions means it requires a file handle per file per open table handler. [TODO: 0044]
546
547RN72: All threads now write to the same data log file. Recovery and compaction take this fact into account. Each thread still writes its own transaction log.
548
549RN71: Removed all directory scans when creating and dropping tables. Increased the table limit to 10000.
550
551RN70: Changed locking to avoid a deadlock when TRUNCATE TABLE is used together with other DML.
552
553RN69: Procedures and functions are now considered atomic, and execute in a single transaction.
554
555RN68: Bug fixed: all files are now correctly flushed before commit.
556
557------- 0.9.85 Beta - 2007-03-15
558
559RN67: Changed the implementation of the pushsr_ and allocr_ macros because "*((void **) &(v) = " caused a crash due to a compiler error on some platforms (thanks Luciano for your help on this one and RN66).
560
561RN66: Fixed a bug that caused PBXT to corrupt the index file when the size exceeded 4GB. [TODO: 0031]
562
563RN65: PBXT now runs under Windows. This source tree must be placed in the MySQL source storage directory in order to compile. Further details of how to build are in the windows-readme.txt file. [TODO: 0027]
564
565RN64: Improved speed of table lookup by ID after a table has been deleted. The sweeper needs to ignore these records. Scanning the directory each time was too slow.
566
567RN63: Added checking for repeat update of a record in a statement.
568
569RN62: Committed read no longer blocks due to a change made by another transaction (the XT_REPEATABLE_READ_BLOCKS define, turns blocking on).
570
571RN61: Avoid checking for duplicates if an index is not modified by an update.
572
573RN60: Records updated repeatedly by a transaction are now updated in place. [TODO: 0040]
574
575------- 0.9.8 Beta - 2007-01-30
576
577RN59: Reduced the number of file handles used to a maximum of one per file. This assumes that pread() and pwrite() allow multiple threads to use the same file handle (according to my tests, this is the case).
578
579RN58: Added the configure flag --with-debug=only which compiles a version of the plug-in with debug symbols that will link to a non-debug MySQL server.
580
581RN57: Changed error number returned on lock from 1205 (lock timeout) to 1020 (optimistic lock failure).
582
583RN56: Added UNIX environment variable for PBXT system parameters. These must be set before starting mysqld, for example:
584
585setenv pbxt_index_cache_size 400MB
586setenv pbxt_record_cache_size "1 GB"
587
588Values are in bytes unless one of the following units is specified: GB, MB, Kb
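
The unit rule in RN56 can be sketched as a small Python parser. This is an assumption-laden illustration rather than PBXT's own code: the name parse_size is hypothetical, and the notes do not say whether the units are 1000-based or 1024-based, so binary (1024-based) multipliers are assumed here.

```python
import re

# Assumed multipliers (1024-based); the release note does not specify this.
_UNITS = {"": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}

def parse_size(text):
    """Convert strings such as '400MB', '1 GB' or '2048' to a byte count."""
    m = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]*)\s*", text)
    if m is None:
        raise ValueError("not a size value: %r" % text)
    number, unit = m.groups()
    unit = unit.upper()  # accept 'Kb' as written in the note, as well as 'KB'
    if unit not in _UNITS:
        raise ValueError("unknown unit: %r" % unit)
    return int(number) * _UNITS[unit]
```

Optional whitespace between the number and the unit is accepted because the example above sets pbxt_record_cache_size to "1 GB".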
589
590RN55: Fixed a bug which prevented VARCHAR values from being compressed correctly when stored in variable length rows.
591
592RN54: Fixed a bug which caused a crash when PBXT was used with MySQL 5.1.14. This bug also caused data to be corrupted on insert.
593
594RN53: Set query caching mode to transactional. [TODO: 0027]
595
596RN52: Added conditions so that the engine compiles with MySQL 5.1.14 and 5.1.13.
597
598------- 0.9.74 Beta - 2006-12-14
599
600RN51: DELETE FROM <table>; is no longer implemented by re-creating the table. This statement now works by deleting all rows. TRUNCATE is implemented as before, by re-creating the table.
601
602RN50: The test scripts innodb.test and innodb-mysql.test have been modified to run with PBXT.
603
604RN49: [TODO: 0020] Implemented foreign keys. Functionality is identical to InnoDB with 2 exceptions:
605
606* Data types of referenced columns must be an exact match (e.g. you cannot mix VARCHAR and CHAR values).
607* Currently an exact matching index is required on referenced columns (i.e. the index may not have more columns than the columns used in the foreign key definition).
608
609Also note the following:
610
611* It is possible to create foreign keys that reference non-existent tables or columns. An error will occur when updating a table with an incorrect foreign key declaration.
612* If you alter the data-type of a column referenced by a foreign key, you need to set foreign_key_checks=0; or an error will occur.
613
614RN48: Fixed a bug in the implementation of indexes on ENUM and SET types.
615
616RN47: Fixed a bug that caused a crash when an index was placed on a BLOB column, and data was retrieved from the index directly.
617
618------- 0.9.73 Beta - 2006-10-31
619
620RN46: Updated test scripts to run with MySQL 5.1.13.
621
622------- 0.9.72 Beta - 2006-10-19
623
624RN45: Corrected compilation errors that occurred due to a change to struct st_mysql_plugin.
625
626------- 0.9.71 Beta - 2006-10-04
627
628RN44: Corrected compilation errors that occurred due to changes in the storage engine API.
629
630------- 0.9.7 Beta - 2006-09-20
631
632RN43: This is the first Beta release of PrimeBase XT. It has been integrated into MySQL 4.1.21 and is available as a plug-in for MySQL 5.1.12, or later. This version has been extensively tested using mysql-test-run, on various Linux and Mac OS X platforms.
633
634RN42: ++++ NOTE: This version is incompatible with older versions of PBXT ++++. Files created by older versions cannot be opened by version 0.9.7.
635
636RN41: Renaming or deleting a table while using a name with different case to the original created name did not work.
637
638RN40: Fixed a bug when grouping and searching on indexed columns that contain a null.
639
640RN39: Fixed bugs related to trailing spaces on VARCHAR values. Values that only vary by the number of trailing spaces (for example "aa" and "aa "), are now correctly handled as identical.
641
642RN38: The default AUTO_INCREMENT value was not correctly preserved during ALTER TABLE.
643
644RN37: Created a MySQL 5.1 Plugin version of PBXT. [TODO: 0017]
645
646RN36: Fixed a race condition in the row cache which had the effect that inserted rows disappeared after cleanup because the cache was out of date. I was only able to reproduce this error on multi-processor machines.
647
648------- 0.9.6 - 2006-08-05
649
650RN35: ++++ NOTE: This version is incompatible with older versions of PBXT ++++.
651
652The disk format of tables and log files has changed slightly in this version. As a result, files created by older versions cannot be opened by version 0.9.6. An error will be generated. If you have data you wish to preserve, first start the older version of XT and convert all tables to MyISAM. Then stop the server and remove all transaction log files (files of the form xtlog-*.xt). Then start the new version and convert tables back to XT.
653
654RN34: Implemented READ COMMITTED transaction mode. XT now supports READ COMMITTED and SERIALIZABLE transaction modes. NOTE: if the mode is set to REPEATABLE READ, SERIALIZABLE is used. If the mode is set to READ UNCOMMITTED, READ COMMITTED is used.
655
656RN33: The implementation of AUTO_INCREMENT on a partial index is non-standard. A unique value is generated without regard to the value of the index prefix. For example, assume we have the following table: CREATE TABLE t1 (c1 CHAR(10) not null, c2 INT not null AUTO_INCREMENT, PRIMARY KEY(c1, c2));
657
658With the following contents: c1 c2
659 A 8
660 B 1
661
662After executing the following statement: insert into t1 (c1) values ('B');
663
664This is the result using PBXT: c1 c2
665 A 8
666 B 1
667 B 9
668
669The standard result would be: c1 c2
670 A 8
671 B 1
672 B 2
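
The difference described in RN33 can be illustrated with a short Python sketch. It only models the two numbering rules on an in-memory list of (c1, c2) rows; the function names are hypothetical and nothing here reflects how PBXT stores or reads its indexes.

```python
def next_id_pbxt(rows):
    """PBXT rule from RN33: ignore the index prefix, use the overall maximum of c2."""
    return max((c2 for _, c2 in rows), default=0) + 1

def next_id_standard(rows, prefix):
    """The 'standard' rule from RN33: use the maximum of c2 among rows sharing the prefix."""
    return max((c2 for c1, c2 in rows if c1 == prefix), default=0) + 1

rows = [("A", 8), ("B", 1)]
pbxt_value = next_id_pbxt(rows)               # value PBXT would assign for c1 = 'B'
standard_value = next_id_standard(rows, "B")  # value the standard rule would assign
```

With the table contents from the note, the first rule yields 9 and the second yields 2, matching the two results listed above.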
673
674RN32: PBXT does not permit access to multiple databases within a single transaction. For example:
675
676begin;
677update database_1.t1 set a=10;
678update database_2.t2 set d=10;
679commit;
680
681In this case the following error is returned: 1015: Can't lock file (errno: -1)
682
683RN31: The implementation of COUNT(*) has changed. For efficiency, rows are not counted. The information is taken from the header of the record (.xtr) files. This information is only 100% accurate after transaction cleanup has completed, which basically means only when PBXT is idle. ANALYZE TABLE waits for all background activity to stop, so the statement may be executed before a COUNT(*) to ensure an accurate result. NOTE: Other than waiting for background processes, ANALYZE TABLE is not implemented.
684
685RN30: Two concurrency bugs have been fixed: a shared lock was used instead of an exclusive lock when deleting from a transaction list, and the transaction segment semaphore was not initialized. XT now runs correctly in a multi-processor environment. The test used was sysbench on a dual-processor, dual-core, AMD 64-bit machine running SUSE Linux 10.0.
686
687RN29: PBXT compiles and runs under 64-bit Linux. [TODO: 0009]
688
689RN28: ./mysql-test-run --force --mysqld=--default-storage-engine=pbxt will now execute most tests successfully. Changes to the tests and the result have been documented in http://www.primebase.com/xt/download/pbxt-test-run-changes.txt. [TODO: 0004, 0019]
690
691RN27: Fixed a bug that caused the server to crash when using table locks and transactions. For example: LOCK TABLES, BEGIN, COMMIT, SELECT. This sequence now returns an error. The correct sequence is:
692
693LOCK TABLES, BEGIN, COMMIT, UNLOCK TABLES, SELECT
694or
695LOCK TABLES, BEGIN, COMMIT, BEGIN, SELECT, COMMIT, UNLOCK TABLES
696
697RN26: Fixed a concurrency problem which caused a number of threads to hang during the sysbench test - see RN30 above (bug reported by Vadim).
698
699RN25: Fixed a bug that caused the server to hang when ha_pbxt::create() and ha_pbxt::ha_open() were given different, but equivalent paths for a particular table.
700
701RN24: Fixed bug in the indexing of blob columns, for example: create table t1(name_id int, name blob, INDEX name_idx (name(5)));
702
703RN23: When a duplicate key error occurs in auto-commit mode, the transaction is now rolled back.
704
705RN22: Fixed incorrect duplicate key error. In the case of a unique key which allows NULLs, duplicates are allowed if the inserted key contains a NULL. For example:
706
707create table t1 (id int not null, str char(10), unique(str));
708insert into t1 values (1, null),(2, null),(3, "foo"),(4, "bar");
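
The rule in RN22 can be sketched in Python as follows. This is a model of the rule only, not of PBXT's index code: keys are represented as tuples, SQL NULL is represented by None, and the function name violates_unique is hypothetical.

```python
def violates_unique(existing_keys, new_key):
    """Return True if inserting new_key would break a unique constraint.

    A key that contains a NULL (None) in any column never counts as a
    duplicate, so any number of such keys may be stored.
    """
    if any(part is None for part in new_key):
        return False
    return new_key in existing_keys

# The 'str' column values from the INSERT statement above, as one-column keys.
keys = [(None,), (None,), ("foo",), ("bar",)]
```

Under this rule the two NULL rows in the example do not collide with each other, while a second "foo" would still be rejected.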
709
710RN21: PBXT now returns the correct error code on duplicate key: 1062 instead of 1022.
711
712RN19: Implemented AUTO_INCREMENT on partial keys. However, the XT implementation is non-standard. Increment of partial index works, but the ID generated is incremented like a non-partial index. For example:
713
714create table t1 (c1 char(10) not null, c2 int not null auto_increment, primary key(c1, c2));
715select * from t1;
716c1 c2
717A 8
718B 1
719
720insert into t1 (c1) values ('B');
721select * from t1;
722c1 c2
723A 8
724B 1
725B 9
726
727The standard result would be:
728c1 c2
729A 8
730B 1
731B 2
732
733RN18: Implemented TRUNCATE TABLE and DELETE FROM <table>; (i.e. a DELETE without WHERE clause). Previously DELETE FROM <table>; did not cause an error, but no rows were deleted (TRUNCATE TABLE returned an error). [TODO: 0012, 0022]
734
735RN17: Implemented CREATE TABLE (...) auto_increment=<value>;
736
737------- 0.9.51 - 2006-07-06
738
739RN16: Fixed crash which could occur when creating the first table in a database (bug reported by Hakan).
740
741------- 0.9.5 - 2006-07-03
742
743RN15: This version concludes the re-structuring of the PBXT implementation. I have made a number of major changes, including:
744
745- All files except the transaction logs are now associated with a particular table. All table related files begin with the name of the table. The extension indicates the function.
746
747- I have merged the handle and the fixed length row data for performance reasons.
748
749- Only the variable size component of a row is stored in the data log files. As a result the data logs can now be considered as a type of "overflow" area.
750
751- Memory mapped files are no longer used because it is not possible to flush changes to the disk.
752
753RN14: File names have the following forms:
754
755[table-name]-[table-id].xtr - These files contain the table row pointers. Each row pointer occupies 8 bytes and refers to a list of records. The file name also contains the table ID. This is a unique number which is used internally by XT to identify the table.
756
757[table-name].xtd - This file contains the fixed length data of a table. Each data item includes a handle and a record. The handle references a record in the data log file if the table contains variable length records.
758
759[table-name].xti - This file contains the index data of the table.
760
761[table-name]-[log-id].xtl - This is a data log file. It contains the variable length data of the table. A table may have any number of data log files, each with a unique ID.
762
763xtlog-[log-id].xt - These files are the transaction logs. Log entries that specify updates reference a data file record. Each active thread has its own transaction log in order to avoid contention.
764
765RN13: Fixed the bug "Hang on DROP DATABASE". [TODO: 0016]
766
767RN12: PBXT currently only supports the "Serializable" transaction isolation level. This is the highest isolation level possible and includes the "repeatable-read" functionality [TODO: 0015]. This is implemented by giving every transaction a snapshot of the database at the point when the transaction is started.
768
769If the transaction tries to update a record that was updated by some other transaction after the snapshot was taken, a locked error is returned. A deadlock can occur if 2 transactions update the same record in a different order. PBXT can detect all deadlocks.
770
771RN11: I have implemented write buffering on the table data files. [TODO: 0013]
772
773RN10: The unique constraint (UNIQUE INDEX/PRIMARY KEY) is now checked correctly. [TODO: 0008]
774
775RN9: I have implemented a conventional B-tree algorithm for the indices (instead of the Lehman and Yao B*-link tree). Although this reduces concurrency it improves the performance of queries significantly because of the simplicity of the algorithm. Deletion is also implemented in a very simple manner. [TODO: 0007]
776
777RN8: PBXT now has only 2 caches [TODO: 0006]:
778
779The Index Cache (pbxt_index_cache_size): This is the amount of memory the PBXT storage engine uses to cache index data and row pointers. This is all the data in the files with the extensions '.xti' and '.xtr'. This cache is managed in blocks of 2K.
780
781The Record Cache (pbxt_record_cache_size): This is the amount of memory the PBXT storage engine uses to cache table row data (handles and records). This is all the data in the files with the extension '.xtd'.
782
783The sizes of the caches are determined by the values of the system variables pbxt_index_cache_size and pbxt_record_cache_size. By default these values are set to 32MB.
784
785RN7: Auto-increment is now implemented in memory. This is done by doing a MAX() select when a table is first opened to get the high value. After that, the high value is incremented in memory on INSERT. On UPDATE (or INSERT) the value in memory is adjusted if necessary. This method also makes it possible for rows to be inserted simultaneously on the same table. [TODO: 0005, 0014]
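
The scheme in RN7 can be sketched in Python. This is a minimal illustration under stated assumptions, not PBXT's implementation: the class name AutoIncrement and its methods are hypothetical, the constructor argument stands in for the result of the MAX() select, and a plain mutex is assumed as the way concurrent inserts are serialized on the counter.

```python
import threading

class AutoIncrement:
    """In-memory high-value counter for one table (illustrative only)."""

    def __init__(self, current_max):
        # current_max stands in for the MAX() select done when the table is first opened.
        self._high = current_max
        self._lock = threading.Lock()  # assumed: a simple lock protects the counter

    def next_value(self):
        """Called on INSERT when no explicit value is supplied."""
        with self._lock:
            self._high += 1
            return self._high

    def note_explicit_value(self, value):
        """Called on INSERT or UPDATE with an explicit value: adjust if it is higher."""
        with self._lock:
            if value > self._high:
                self._high = value
```

Because each call only holds the lock for a single increment or comparison, several threads can obtain distinct values without reading the index, which is what allows simultaneous inserts on the same table.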
786
787RN6: ./run-all-tests --create-options=TYPE=PBXT succeeds. [TODO: 0004]
788
789RN5: Using sql-bench and my own Java based test I have confirmed that PBXT behaves correctly during multi-threaded access. [PARTIALLY TODO: 0002]
790
791RN4: Load/Stability test. Using sql-bench I have tested PBXT under load over a long period of time. [PARTIALLY TODO: 0001]
792
793------- 0.9.2 - 2006-04-01
794
795RN3: Fixed a bug that caused the error "-6: Handle is out of range: [0:0]".
796
797RN2: Implemented SET, ENUM and YEAR data types.
798
799RN1: Fixed a bug in the error reporting when a table is created with a datatype that is not supported. [TODO: 0011]
800
801
0802
=== added file 'plugin/pbxt/Makefile.am'
--- plugin/pbxt/Makefile.am 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/Makefile.am 2010-04-01 14:19:35 +0000
@@ -0,0 +1,3 @@
1SUBDIRS = src
2
3EXTRA_DIST = plug.ini
04
=== added file 'plugin/pbxt/NEWS'
=== added file 'plugin/pbxt/README'
--- plugin/pbxt/README 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/README 2010-04-01 14:19:35 +0000
@@ -0,0 +1,19 @@
1PrimeBase XT for MySQL 5.1
2==========================
3
4This is the PrimeBase XT (PBXT) transactional storage engine for MySQL. PBXT is "pluggable", which means that it can be loaded dynamically by MySQL at runtime. It uses a unique "write-once" update strategy and MVCC (multi-version concurrency control) to provide optimal performance over a wide range of tasks.
5
6This package includes the complete source code for the engine. Although this is a standalone project it must be built against a compiled version of the MySQL 5.1 source tree, because it references header files used internally by the server.
7
8Details about how to build PBXT under UNIX or Windows, as a standalone plug-in or as part of the MySQL source code, are described in the documentation, which is available online at:
9
10http://www.primebase.org/documentation
11
12Bug reports, questions and comments can be sent directly to me.
13
14Thanks for your support!
15
16Paul McCullagh
17SNAP Innovation GmbH
18paul.mccullagh@primebase.org
19
020
=== added file 'plugin/pbxt/TODO'
--- plugin/pbxt/TODO 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/TODO 2010-04-01 14:19:35 +0000
@@ -0,0 +1,195 @@
1PBXT To-Do List
2===============
3
4My thanks to all who have downloaded and tested PBXT. If an issue you reported before the date below is not on this list, please e-mail me again.
5
6------- 2008-12-09
7
80063: The option for not using memory mapped files must be fixed.
9
100062: Dynamic option for using memory mapping on a table (Dimitri).
11
12------- 2008-09-12
13
140061: Add records per key result to ha_pbxt:info() call (Mark).
15
16------- 2008-08-31
17
180060: Add table option to determine if a table should be memory mapped or not (also requested by Dimitri).
19
200059: Add table options:
21 AVG_ROW_LENGTH [=] value
22 DATA DIRECTORY [=] 'absolute path to directory'
23 INDEX DIRECTORY [=] 'absolute path to directory'
24 MAX_ROWS [=] value
25
26------- 2008-03-28
27
280058: Consolidate writes when changes in the log are applied to the database.
29
30------- 2008-03-07
31
320057: Cluster updates onto a single page.
33
340056: Add checksum to index and data pages.
35
360055: When no index cache is available, the complete index must be flushed (not just single pages).
37
380054: Optimize indexes by not creating indexes that are a complete sub-set of some other index. In this case we must be able to identify part of an index as unique. For example: primary key (a, b), index (a, b, c). Here we would just create index (a, b, c), and specify that the part (a, b) must be unique. Operations on (a, b) will be directed to index (a, b, c).
39
400053: Check and test lock tables.
41
420052: Cache data log data in the handle data cache. Must be purged when a handle data record is written.
43
440051: Write data log data alternatively to the transaction log. The compactor must then compact transaction logs.
45
460050: [RESOLVED: RN126] Implement consistent write for indexes.
47
480049: [RESOLVED: RN114] Set the index block size to 4K, or 16K as used by InnoDB.
49
500048: [RESOLVED: RN110] Add row ID to indexes. This should only be set once the row is cleaned by the sweeper. Then the row ID can be used to make a quick check if the row is the most recent version.
51
52------- 2007-06-19
53
540047: Test build with ./configure --with-innodb under Linux (Vadim).
55
560046: [RESOLVED: RN85] Add plug.in file to enable drop in compile under Linux.
57
580045: Provide libstdc++.so.6 binaries (Vadim).
59
600044: [RESOLVED: RN73] Limit number of file handles used per table (Brian).
61
620043: XA (two-phase commit) support (Peter).
63
64------- 2007-03-13
65
660042: [RESOLVED: RN108] Implement STATUS commands.
67
680041: Implement index prefix compression.
69
70------- 2007-03-07
71
720040: [RESOLVED: RN60] Update in-place when a transaction updates the same record more than once.
73
740039: Set the number and size of the segments dynamically according to the amount of memory in the cache (and the number of CPUs?) (as discussed with: Peter & Vadim).
75
760038: [RESOLVED: RN133] Improve the efficiency of the locks by using atomic compare and swap (Peter & Vadim).
77
780037: [RESOLVED: RN133] Instead of a global LRU list, use an LRU list for each segment of the cache (Peter & Vadim). [ Note: a global list using a TAS lock and change time (so that LRU is not always updated) is most efficient].
79
800036: Add support for deferred foreign key checking (requested by: Mark).
81
820035: [RESOLVED: RN71] Remove the 2000 table limit (reported by: Hakan).
83
84------- 2007-02-28
85
860035: [RESOLVED: RN74, RN107] Build in the PBXT system parameters (currently they must be set using environment variables).
87
880034: [RESOLVED: RN117] Initial documentation (yes, it must be done!)
89
900033: Make the error code returned on lock error configurable.
91
920032: [RESOLVED: RN65] Create a source code pluggable version for Windows.
93
940031: [RESOLVED: RN66] PBXT corrupts the index file when the size exceeds 4 GB (reported by: Luciano)
95
960030: [RESOLVED: RN102] Implement pbxt_index_flush_delay. Postpones index writing in order to speed up imports. [Resolution uses the fact that index entries that are missing are added during recovery. As a result, index flushing can be delayed.]
97
980029: [RESOLVED: RN103] Implement SELECT ... FOR UPDATE (recommended by: Robin).
99
100------- 2007-02-14
101
1020028: Implement CREATE TABLE ... DATA/INDEX DIRECTORY (suggested by: Robin).
103
104------- 2006-12-06
105
1060027: [RESOLVED: RN53] Bug in pbxt with query caching (reported by: Giuseppe) caused violation of transaction isolation.
107
108------- 2006-08-05
109
1100026: Implement BACKUP and RESTORE table (planned for the first post release version).
111
1120025: Implement DISABLE/ENABLE KEYS. Works for FOREIGN KEYs, currently no plans to implement for disabling indexes.
113
1140024: Implement ANALYZE TABLE (planned for the first post release version).
115
1160023: Implement CHECK TABLE (planned for the first release candidate).
117
1180022: [RESOLVED: RN18] Implement TRUNCATE TABLE and DELETE FROM <table>; (i.e. a DELETE without WHERE clause). Currently this function does not cause an error, but no rows are deleted.
119
120------- 2006-07-06
121
1220021: [RESOLVED: RN28] .../mysql-test/mysql-test-run --force --mysqld=--default-storage-engine=pbxt produces a number of errors (reported by: Hakan): As far as I can tell some failures are unnecessary but others are bugs. All need to be checked.
123
124------- 2006-07-03
125
1260020: [RESOLVED: RN49] Implement referential integrity (planned for the first release candidate).
127
128------- 2006-04-01
129
1300019: [RESOLVED: RN28] mysql-test-run hangs on alter table (reported by: Hakan): Running a test like ./mysql-test-run.pl --mysqld=--default-storage-engine=pbxt, hangs on ALTER TABLE.
131
1320018: Implement GEOMETRY data type. Note: There are currently no plans to implement this feature.
133
134------- 2006-03-31
135
1360017: [RESOLVED: RN37] MySQL 5.x Version (reported by: Ronald, Giuseppe).
137
1380016: [RESOLVED: RN13] Hang on "DROP DATABASE" (reported by: Giuseppe). Load the world database (http://downloads.mysql.com/docs/world.sql) and convert all tables into PBXT. Then, the drop database command hangs.
139
1400015: [RESOLVED: RN12] Implement isolation level "repeatable read" (reported by: Giuseppe). Currently PBXT only supports isolation level "committed read". This means committed data can be seen no matter when it was committed. Use SELECT ... FOR UPDATE to guarantee repeatable read, on data already read.
141
1420014: [RESOLVED: RN7] Two transactions cannot insert simultaneously if they use auto_increment (reported by: Giuseppe). See also 0005.
143
1440013: [RESOLVED: RN11] Implement buffered write (reported by: Giuseppe): Lack of buffered write leads to bad performance in operations such as ALTER TABLE ENGINE = PBXT and INSERT ... SELECT.
145
1460012: [RESOLVED: RN18] TRUNCATE does not work (reported by: Giuseppe)
147
1480011: [RESOLVED: RN2] Load Sakila Sample Database (reported by: Ronald): ALTER TABLE film ENGINE=PBXT; fails
149
1500010: [RESOLVED: RN6] sql-bench (reported by: Dmitry): ./run-all-tests --create-options=TYPE=PBXT fails.
151
1520009: [RESOLVED: RN29] 64-bit Linux (reported by: Hakan): PBXT currently does not compile under 64-bit Linux.
153
154------- 2006-03-16
155
1560008: [RESOLVED: RN10] Enforcing the unique index constraint:
157
158An index declared as "unique" must return a "duplicate unique key" error when inserting a duplicate value. The difficult part of implementing this in PBXT is that we may encounter a duplicate value that has not yet been committed. The index reading thread must then wait for the transaction to commit or abort.
159
1600007: [RESOLVED: RN9] Cleaning up empty index nodes:
161
162The Lehman and Yao algorithm used for indexing does not describe a way of cleaning up empty index nodes on-the-fly. A search of the relevant literature for an algorithm also turns up empty-handed (periodic "reorg" is mostly suggested). I have subsequently devised an algorithm that will do the job. This needs to be implemented.
163
1640006: [RESOLVED: RN8] Cache Balancing:
165
166PBXT uses a number of small caches in order to improve concurrency (rather than one large cache). A process is required to manage the amount of cache memory used as a whole. The process must distribute the overall amount of memory available for caching over the small caches, according to demand.
167
1680005: [RESOLVED: RN7] Implement a faster auto-increment method
169
170Currently the auto-increment is handled by the default method used in MySQL. This is done by performing a "fetch-last" on the index for each insert to find the highest key value. This works well unless there are a large number of empty index nodes due to the problem described in (2) above.
171
172PBXT Testing To-Do List
173
174This is my first take on what still must be tested. My thanks to Ronald Bradford who is working on a generic testing framework that can be used to test PBXT.
175
1760004: [RESOLVED: RN6, RN28] MySQL Tests:
177
178Several tests (for mysql-test-run) written for other engines can be adapted and used to test PBXT.
179
1800003: [RESOLVED: RN30] Multi-processor Test:
181
182There is a difference between preemptive multitasking and true multitasking, which you have on a multi-processor (or dual core) machine. I don't expect any fundamental problems here, but it must be tested.
183
1840002: [RESOLVED: RN5, RN30, RN43] Multi-user/locking Test:
185
186How does the engine perform with a number of concurrent users running various transactions on a number of different tables?
187This is a difficult test to write because it needs to simulate a production situation. At least 2 or 3 machines are required for the test. The idea is not to use too much data so that a lot of conflicts may occur.
188
1890001: [RESOLVED: RN4, RN43] Load/Stability Test:
190
191How does the engine perform under heavy load over a long period of time? How stable is the engine on power outage, etc?
192
193The test could use a variation of the test program written for test (3) above. At least 3 test machines would be required. The test must be modified to cause as much activity as possible. The test should monitor the performance under load.
194
195
0196
=== added file 'plugin/pbxt/plugin.am'
--- plugin/pbxt/plugin.am 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/plugin.am 2010-04-01 14:19:35 +0000
@@ -0,0 +1,76 @@
1# Used to build Makefile.in
2
3noinst_LTLIBRARIES+= plugin/pbxt/libpbxt.la
4
5noinst_HEADERS+= \
6 plugin/pbxt/src/bsearch_xt.h \
7 plugin/pbxt/src/cache_xt.h \
8 plugin/pbxt/src/ccutils_xt.h \
9 plugin/pbxt/src/database_xt.h \
10 plugin/pbxt/src/datadic_xt.h \
11 plugin/pbxt/src/datalog_xt.h \
12 plugin/pbxt/src/filesys_xt.h \
13 plugin/pbxt/src/hashtab_xt.h \
14 plugin/pbxt/src/ha_pbxt.h \
15 plugin/pbxt/src/heap_xt.h \
16 plugin/pbxt/src/index_xt.h \
17 plugin/pbxt/src/linklist_xt.h \
18 plugin/pbxt/src/memory_xt.h \
19 plugin/pbxt/src/myxt_xt.h \
20 plugin/pbxt/src/pthread_xt.h \
21 plugin/pbxt/src/restart_xt.h \
22 plugin/pbxt/src/sortedlist_xt.h \
23 plugin/pbxt/src/strutil_xt.h \
24 plugin/pbxt/src/tabcache_xt.h \
25 plugin/pbxt/src/table_xt.h \
26 plugin/pbxt/src/trace_xt.h \
27 plugin/pbxt/src/thread_xt.h \
28 plugin/pbxt/src/util_xt.h \
29 plugin/pbxt/src/xaction_xt.h \
30 plugin/pbxt/src/xactlog_xt.h \
31 plugin/pbxt/src/lock_xt.h \
32 plugin/pbxt/src/systab_xt.h \
33 plugin/pbxt/src/ha_xtsys.h \
34 plugin/pbxt/src/discover_xt.h \
35 plugin/pbxt/src/pbms.h \
36 plugin/pbxt/src/xt_config.h \
37 plugin/pbxt/src/xt_defs.h \
38 plugin/pbxt/src/xt_errno.h
39
40
41plugin_pbxt_libpbxt_la_CXXFLAGS= ${AM_CXXFLAGS} -DDRIZZLED -Wno-long-long -Wno-overloaded-virtual -Wno-sign-compare -Wno-unused-function
42plugin_pbxt_libpbxt_la_CFLAGS= ${AM_CFLAGS} -DDRIZZLED -std=c99
43
44plugin_pbxt_libpbxt_la_SOURCES= \
45 plugin/pbxt/src/bsearch_xt.cc \
46 plugin/pbxt/src/cache_xt.cc \
47 plugin/pbxt/src/ccutils_xt.cc \
48 plugin/pbxt/src/database_xt.cc \
49 plugin/pbxt/src/datadic_xt.cc \
50 plugin/pbxt/src/datalog_xt.cc \
51 plugin/pbxt/src/filesys_xt.cc \
52 plugin/pbxt/src/hashtab_xt.cc \
53 plugin/pbxt/src/ha_pbxt.cc \
54 plugin/pbxt/src/heap_xt.cc \
55 plugin/pbxt/src/index_xt.cc \
56 plugin/pbxt/src/linklist_xt.cc \
57 plugin/pbxt/src/memory_xt.cc \
58 plugin/pbxt/src/myxt_xt.cc \
59 plugin/pbxt/src/pthread_xt.cc \
60 plugin/pbxt/src/restart_xt.cc \
61 plugin/pbxt/src/sortedlist_xt.cc \
62 plugin/pbxt/src/strutil_xt.cc \
63 plugin/pbxt/src/tabcache_xt.cc \
64 plugin/pbxt/src/table_xt.cc \
65 plugin/pbxt/src/trace_xt.cc \
66 plugin/pbxt/src/thread_xt.cc \
67 plugin/pbxt/src/systab_xt.cc \
68 plugin/pbxt/src/ha_xtsys.cc \
69 plugin/pbxt/src/discover_xt.cc \
70 plugin/pbxt/src/util_xt.cc \
71 plugin/pbxt/src/xaction_xt.cc \
72 plugin/pbxt/src/xactlog_xt.cc \
73 plugin/pbxt/src/lock_xt.cc
74
75
76EXTRA_DIST+= CMakeLists.txt
077
=== added file 'plugin/pbxt/plugin.ini'
--- plugin/pbxt/plugin.ini 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/plugin.ini 2010-04-01 14:19:35 +0000
@@ -0,0 +1,25 @@
1#
2# Copyright (c) 2006, 2009, Innobase Oy. All Rights Reserved.
3#
4# This program is free software; you can redistribute it and/or modify it under
5# the terms of the GNU General Public License as published by the Free Software
6# Foundation; version 2 of the License.
7#
8# This program is distributed in the hope that it will be useful, but WITHOUT
9# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
10# FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
11#
12# You should have received a copy of the GNU General Public License along with
13# this program; if not, write to the Free Software Foundation, Inc., 59 Temple
14# Place, Suite 330, Boston, MA 02111-1307 USA
15#
16
17[plugin]
18name=pbxt
19title=PBXT Storage Engine
20description=MVCC-based transactional engine
21sources=src/ha_pbxt.cc
22load_by_default=yes
23libs=plugin/pbxt/libpbxt.la
24cflags=-DDRIZZLED -std=c99
25cxxflags=-DDRIZZLED -Wno-long-long -Wno-overloaded-virtual
026
=== added directory 'plugin/pbxt/src'
=== added file 'plugin/pbxt/src/Makefile.am'
--- plugin/pbxt/src/Makefile.am 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/src/Makefile.am 2010-04-01 14:19:35 +0000
@@ -0,0 +1,50 @@
1# Used to build Makefile.in
2
3MYSQLDATAdir = $(localstatedir)
4MYSQLSHAREdir = $(pkgdatadir)
5MYSQLBASEdir= $(prefix)
6MYSQLLIBdir= $(pkglibdir)
7pkgplugindir = $(pkglibdir)/plugin
8
9AM_CPPFLAGS = -I$(top_srcdir)
10
11LIBS =
12
13LDADD =
14
15noinst_HEADERS = bsearch_xt.h cache_xt.h ccutils_xt.h database_xt.h \
16 datadic_xt.h datalog_xt.h filesys_xt.h hashtab_xt.h \
17 ha_pbxt.h heap_xt.h index_xt.h linklist_xt.h \
18 memory_xt.h myxt_xt.h pthread_xt.h restart_xt.h \
19 sortedlist_xt.h strutil_xt.h \
20 tabcache_xt.h table_xt.h trace_xt.h thread_xt.h \
21 util_xt.h xaction_xt.h xactlog_xt.h lock_xt.h \
22 systab_xt.h ha_xtsys.h discover_xt.h backup_xt.h \
23 pbms.h pbms_enabled.h xt_config.h xt_defs.h xt_errno.h locklist_xt.h
24
25plugin_LTLIBRARIES = libpbxt.la
26
27libpbxt_la_SOURCES = bsearch_xt.cc cache_xt.cc ccutils_xt.cc database_xt.cc \
28 datadic_xt.cc datalog_xt.cc filesys_xt.cc hashtab_xt.cc \
29 ha_pbxt.cc heap_xt.cc index_xt.cc linklist_xt.cc \
30 memory_xt.cc myxt_xt.cc pthread_xt.cc restart_xt.cc \
31 pbms_enabled.cc sortedlist_xt.cc strutil_xt.cc \
32 tabcache_xt.cc table_xt.cc trace_xt.cc thread_xt.cc \
33 systab_xt.cc ha_xtsys.cc discover_xt.cc backup_xt.cc \
34 util_xt.cc xaction_xt.cc xactlog_xt.cc lock_xt.cc locklist_xt.cc
35
36libpbxt_la_LDFLAGS = -module
37
38# These are the warnings Drizzle uses:
39# DRIZZLE_WARNINGS = -W -Wall -Wextra -pedantic -Wundef -Wredundant-decls -Wno-strict-aliasing -Wno-long-long -Wno-unused-parameter
40
41libpbxt_la_CXXFLAGS = $(AM_CXXFLAGS) -DMYSQL_DYNAMIC_PLUGIN -Wno-overloaded-virtual
42libpbxt_la_CFLAGS = $(AM_CFLAGS) -DMYSQL_DYNAMIC_PLUGIN -std=c99
43
44EXTRA_LIBRARIES = libpbxt.a
45noinst_LIBRARIES = libpbxt.a
46libpbxt_a_SOURCES = $(libpbxt_la_SOURCES)
47libpbxt_a_CXXFLAGS = $(AM_CXXFLAGS) -DDRIZZLED -Wno-long-long -Wno-overloaded-virtual
48libpbxt_a_CFLAGS = $(AM_CFLAGS) -DDRIZZLED -std=c99
49
50EXTRA_DIST = CMakeLists.txt
051
=== added file 'plugin/pbxt/src/backup_xt.cc'
--- plugin/pbxt/src/backup_xt.cc 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/src/backup_xt.cc 2010-04-01 14:19:35 +0000
@@ -0,0 +1,802 @@
1/* Copyright (c) 2009 PrimeBase Technologies GmbH
2 *
3 * PrimeBase XT
4 *
5 * This program is free software; you can redistribute it and/or modify
6 * it under the terms of the GNU General Public License as published by
7 * the Free Software Foundation; either version 2 of the License, or
8 * (at your option) any later version.
9 *
10 * This program is distributed in the hope that it will be useful,
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 * GNU General Public License for more details.
14 *
15 * You should have received a copy of the GNU General Public License
16 * along with this program; if not, write to the Free Software
17 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
18 *
19 * 2009-09-07 Paul McCullagh
20 *
21 * H&G2JCtL
22 */
23
24#include "xt_config.h"
25
26#ifdef MYSQL_SUPPORTS_BACKUP
27
28#include <string.h>
29#include <stdio.h>
30#include <stdlib.h>
31#include <time.h>
32#include <ctype.h>
33
34#include "mysql_priv.h"
35#include <backup/api_types.h>
36#include <backup/backup_engine.h>
37#include <backup/backup_aux.h> // for build_table_list()
38#include <hash.h>
39
40#include "ha_pbxt.h"
41
42#include "backup_xt.h"
43#include "pthread_xt.h"
44#include "filesys_xt.h"
45#include "database_xt.h"
46#include "strutil_xt.h"
47#include "memory_xt.h"
48#include "trace_xt.h"
49#include "myxt_xt.h"
50
51#ifdef OK
52#undef OK
53#endif
54
55#ifdef byte
56#undef byte
57#endif
58
59#ifdef DEBUG
60//#define TRACE_BACKUP_CALLS
61//#define TEST_SMALL_BLOCK 100000
62#endif
63
64using backup::byte;
65using backup::result_t;
66using backup::version_t;
67using backup::Table_list;
68using backup::Table_ref;
69using backup::Buffer;
70
71#ifdef TRACE_BACKUP_CALLS
72#define XT_TRACE_CALL() ha_trace_function(__FUNC__, NULL)
73#else
74#define XT_TRACE_CALL()
75#endif
76
77#define XT_RESTORE_BATCH_SIZE 10000
78
79#define BUP_STATE_BEFORE_LOCK 0
80#define BUP_STATE_AFTER_LOCK 1
81
82#define BUP_STANDARD_VAR_RECORD 1
83#define BUP_RECORD_BLOCK_4_START 2 // Part of a record, with a 4 byte total length, and 4 byte data length
84#define BUP_RECORD_BLOCK_4 3 // Part of a record, with a 4 byte length
85#define BUP_RECORD_BLOCK_4_END 4 // Last part of a record with a 4 byte length
86
87/*
88 * -----------------------------------------------------------------------
89 * UTILITIES
90 */
91
92#ifdef TRACE_BACKUP_CALLS
93static void ha_trace_function(const char *function, char *table)
94{
95 char func_buf[50], *ptr;
96 XTThreadPtr thread = xt_get_self();
97
98 if ((ptr = strchr(function, '('))) {
99 ptr--;
100 while (ptr > function) {
101 if (!(isalnum(*ptr) || *ptr == '_'))
102 break;
103 ptr--;
104 }
105 ptr++;
106 xt_strcpy(50, func_buf, ptr);
107 if ((ptr = strchr(func_buf, '(')))
108 *ptr = 0;
109 }
110 else
111 xt_strcpy(50, func_buf, function);
112 if (table)
113 printf("%s %s (%s)\n", thread ? thread->t_name : "-unknown-", func_buf, table);
114 else
115 printf("%s %s\n", thread ? thread->t_name : "-unknown-", func_buf);
116}
117#endif
118
119/*
120 * -----------------------------------------------------------------------
121 * BACKUP DRIVER
122 */
123
124class PBXTBackupDriver: public Backup_driver
125{
126 public:
127 PBXTBackupDriver(const Table_list &);
128 virtual ~PBXTBackupDriver();
129
130 virtual size_t size();
131 virtual size_t init_size();
132 virtual result_t begin(const size_t);
133 virtual result_t end();
134 virtual result_t get_data(Buffer &);
135 virtual result_t prelock();
136 virtual result_t lock();
137 virtual result_t unlock();
138 virtual result_t cancel();
139 virtual void free();
140 void lock_tables_TL_READ_NO_INSERT();
141
142 private:
143 XTThreadPtr bd_thread;
144 int bd_state;
145 u_int bd_table_no;
146 XTOpenTablePtr bd_ot;
147 xtWord1 *bd_row_buf;
148
149 /* bd_row_size (below) is non-zero if we last returned only part
150 * of a row; bd_row_offset is how much of that row was written.
151 */
152 xtWord1 *db_write_block(xtWord1 *buffer, xtWord1 bup_type, size_t *size, xtWord4 row_len);
153 xtWord1 *db_write_block(xtWord1 *buffer, xtWord1 bup_type, size_t *size, xtWord4 total_len, xtWord4 row_len);
154
155 xtWord4 bd_row_offset;
156 xtWord4 bd_row_size;
157};
158
159
160PBXTBackupDriver::PBXTBackupDriver(const Table_list &tables):
161Backup_driver(tables), bd_thread(NULL),
162bd_state(BUP_STATE_BEFORE_LOCK),
163bd_table_no(0),
164bd_ot(NULL),
165bd_row_buf(NULL),
166bd_row_offset(0),
167bd_row_size(0)
168{
169}
170
171PBXTBackupDriver::~PBXTBackupDriver()
172{
173}
174
175/** Estimates total size of backup. @todo improve it */
176size_t PBXTBackupDriver::size()
177{
178 XT_TRACE_CALL();
179 return UNKNOWN_SIZE;
180}
181
182/** Estimates size of backup before lock. @todo improve it */
183size_t PBXTBackupDriver::init_size()
184{
185 XT_TRACE_CALL();
186 return 0;
187}
188
189result_t PBXTBackupDriver::begin(const size_t)
190{
191 THD *thd = current_thd;
192 XTExceptionRec e;
193
194 XT_TRACE_CALL();
195
196 if (!(bd_thread = xt_ha_set_current_thread(thd, &e))) {
197 xt_log_exception(NULL, &e, XT_LOG_DEFAULT);
198 return backup::ERROR;
199 }
200
201 return backup::OK;
202}
203
204result_t PBXTBackupDriver::end()
205{
206 XT_TRACE_CALL();
207 if (bd_ot) {
208 xt_tab_seq_exit(bd_ot);
209 xt_db_return_table_to_pool_ns(bd_ot);
210 bd_ot = NULL;
211 }
212 if (bd_thread->st_xact_data) {
213 if (!xt_xn_commit(bd_thread))
214 return backup::ERROR;
215 }
216 return backup::OK;
217}
218
219xtWord1 *PBXTBackupDriver::db_write_block(xtWord1 *buffer, xtWord1 bup_type, size_t *ret_size, xtWord4 row_len)
220{
221 register size_t size = *ret_size;
222
223 *buffer = bup_type; // Record type identifier.
224 buffer++;
225 size--;
226 memcpy(buffer, bd_ot->ot_row_wbuffer, row_len);
227 buffer += row_len;
228 size -= row_len;
229 *ret_size = size;
230 return buffer;
231}
232
233xtWord1 *PBXTBackupDriver::db_write_block(xtWord1 *buffer, xtWord1 bup_type, size_t *ret_size, xtWord4 total_len, xtWord4 row_len)
234{
235 register size_t size = *ret_size;
236
237 *buffer = bup_type; // Record type identifier.
238 buffer++;
239 size--;
240 if (bup_type == BUP_RECORD_BLOCK_4_START) {
241 XT_SET_DISK_4(buffer, total_len);
242 buffer += 4;
243 size -= 4;
244 }
245 XT_SET_DISK_4(buffer, row_len);
246 buffer += 4;
247 size -= 4;
248 memcpy(buffer, bd_ot->ot_row_wbuffer+bd_row_offset, row_len);
249 buffer += row_len;
250 size -= row_len;
251 bd_row_size -= row_len;
252 bd_row_offset += row_len;
253 *ret_size = size;
254 return buffer;
255}
256
257result_t PBXTBackupDriver::get_data(Buffer &buf)
258{
259 xtBool eof = FALSE;
260 size_t size;
261 xtWord4 row_len;
262 xtWord1 *buffer;
263
264 XT_TRACE_CALL();
265
266 if (bd_state == BUP_STATE_BEFORE_LOCK) {
267 buf.table_num = 0;
268 buf.size = 0;
269 buf.last = FALSE;
270 return backup::READY;
271 }
272
273 /* Open the backup table: */
274 if (!bd_ot) {
275 XTThreadPtr self = bd_thread;
276 XTTableHPtr tab;
277 char path[PATH_MAX];
278
279 if (bd_table_no == m_tables.count()) {
280 buf.size = 0;
281 buf.table_num = 0;
282 buf.last = TRUE;
283 return backup::DONE;
284 }
285
286 m_tables[bd_table_no].internal_name(path, sizeof(path));
287 bd_table_no++;
288 try_(a) {
289 xt_ha_open_database_of_table(self, (XTPathStrPtr) path);
290 tab = xt_use_table(self, (XTPathStrPtr) path, FALSE, FALSE, NULL);
291 pushr_(xt_heap_release, tab);
292 if (!(bd_ot = xt_db_open_table_using_tab(tab, bd_thread)))
293 xt_throw(self);
294 freer_(); // xt_heap_release(tab)
295
296 /* Prepare the sequential scan: */
297 xt_tab_seq_exit(bd_ot);
298 if (!xt_tab_seq_init(bd_ot))
299 xt_throw(self);
300
301 if (bd_row_buf) {
302 xt_free(self, bd_row_buf);
303 bd_row_buf = NULL;
304 }
305 bd_row_buf = (xtWord1 *) xt_malloc(self, bd_ot->ot_table->tab_dic.dic_mysql_buf_size);
306 bd_ot->ot_cols_req = bd_ot->ot_table->tab_dic.dic_no_of_cols;
307 }
308 catch_(a) {
309 ;
310 }
311 cont_(a);
312
313 if (!bd_ot)
314 goto failed;
315 }
316
317 buf.table_num = bd_table_no;
318#ifdef TEST_SMALL_BLOCK
319 buf.size = TEST_SMALL_BLOCK;
320#endif
321 size = buf.size;
322 buffer = (xtWord1 *) buf.data;
323 ASSERT_NS(size > 9);
324
325 /* First check if a record was partially written
326 * last time.
327 */
328 write_row:
329 if (bd_row_size > 0) {
330 row_len = bd_row_size;
331 if (bd_row_offset == 0) {
332 if (row_len+1 > size) {
333 ASSERT_NS(size > 9);
334 row_len = size - 9;
335 buffer = db_write_block(buffer, BUP_RECORD_BLOCK_4_START, &size, bd_row_size, row_len);
336 goto done;
337 }
338 buffer = db_write_block(buffer, BUP_STANDARD_VAR_RECORD, &size, row_len);
339 bd_row_size = 0;
340 }
341 else {
342 if (row_len+5 > size) {
343 row_len = size - 5;
344 buffer = db_write_block(buffer, BUP_RECORD_BLOCK_4, &size, 0, row_len);
345 goto done;
346 }
347 buffer = db_write_block(buffer, BUP_RECORD_BLOCK_4_END, &size, 0, row_len);
348 }
349 }
350
351 /* Now continue with the sequential scan. */
352 while (size > 1) {
353 if (!xt_tab_seq_next(bd_ot, bd_row_buf, &eof))
354 goto failed;
355 if (eof) {
356 /* We will go to the next table on the next call. */
357 xt_tab_seq_exit(bd_ot);
358 xt_db_return_table_to_pool_ns(bd_ot);
359 bd_ot = NULL;
360 break;
361 }
362 if (!(row_len = myxt_store_row_data(bd_ot, 0, (char *) bd_row_buf)))
363 goto failed;
364 if (row_len+1 > size) {
365 /* Does not fit: */
366 bd_row_offset = 0;
367 bd_row_size = row_len;
368 /* Only add part of the row, if there is still
369 * quite a bit of space left:
370 */
371 if (size >= (32 * 1024))
372 goto write_row;
373 break;
374 }
375 buffer = db_write_block(buffer, BUP_STANDARD_VAR_RECORD, &size, row_len);
376 }
377
378 done:
379 buf.size = buf.size - size;
380 /* This indicates the end of data for a table! */
381 buf.last = eof;
382
383 return backup::OK;
384
385 failed:
386 xt_log_and_clear_exception(bd_thread);
387 return backup::ERROR;
388}
389
390result_t PBXTBackupDriver::prelock()
391{
392 XT_TRACE_CALL();
393 return backup::READY;
394}
395
396result_t PBXTBackupDriver::lock()
397{
398 XT_TRACE_CALL();
399 bd_thread->st_xact_mode = XT_XACT_COMMITTED_READ;
400 bd_thread->st_ignore_fkeys = FALSE;
401 bd_thread->st_auto_commit = FALSE;
402 bd_thread->st_table_trans = FALSE;
403 bd_thread->st_abort_trans = FALSE;
404 bd_thread->st_stat_ended = FALSE;
405 bd_thread->st_stat_trans = FALSE;
406 bd_thread->st_is_update = FALSE;
407 if (!xt_xn_begin(bd_thread))
408 return backup::ERROR;
409 bd_state = BUP_STATE_AFTER_LOCK;
410 return backup::OK;
411}
412
413result_t PBXTBackupDriver::unlock()
414{
415 XT_TRACE_CALL();
416 return backup::OK;
417}
418
419result_t PBXTBackupDriver::cancel()
420{
421 XT_TRACE_CALL();
422 return backup::OK; // free() will be called and suffice
423}
424
425void PBXTBackupDriver::free()
426{
427 XT_TRACE_CALL();
428 if (bd_ot) {
429 xt_tab_seq_exit(bd_ot);
430 xt_db_return_table_to_pool_ns(bd_ot);
431 bd_ot = NULL;
432 }
433 if (bd_row_buf) {
434 xt_free_ns(bd_row_buf);
435 bd_row_buf = NULL;
436 }
437 if (bd_thread && bd_thread->st_xact_data)
438 xt_xn_rollback(bd_thread);
439 delete this;
440}
441
442void PBXTBackupDriver::lock_tables_TL_READ_NO_INSERT()
443{
444 XT_TRACE_CALL();
445}
446
447/*
448 * -----------------------------------------------------------------------
449 * RESTORE DRIVER
450 */
451
452class PBXTRestoreDriver: public Restore_driver
453{
454 public:
455 PBXTRestoreDriver(const Table_list &tables);
456 virtual ~PBXTRestoreDriver();
457
458 virtual result_t begin(const size_t);
459 virtual result_t end();
460 virtual result_t send_data(Buffer &buf);
461 virtual result_t cancel();
462 virtual void free();
463
464 private:
465 XTThreadPtr rd_thread;
466 u_int rd_table_no;
467 XTOpenTablePtr rd_ot;
468 STRUCT_TABLE *rd_my_table;
469 xtWord1 *rb_row_buf;
470 u_int rb_col_cnt;
471 u_int rb_insert_count;
472
473 /* Long rows are accumulated here: */
474 xtWord4 rb_row_len;
475 xtWord4 rb_data_size;
476 xtWord1 *rb_row_data;
477};
478
479PBXTRestoreDriver::PBXTRestoreDriver(const Table_list &tables):
480Restore_driver(tables),
481rd_thread(NULL),
482rd_table_no(0),
483rd_ot(NULL),
484rb_row_buf(NULL),
485rb_row_len(0),
486rb_data_size(0),
487rb_row_data(NULL)
488{
489}
490
491PBXTRestoreDriver::~PBXTRestoreDriver()
492{
493}
494
495result_t PBXTRestoreDriver::begin(const size_t)
496{
497 THD *thd = current_thd;
498 XTExceptionRec e;
499
500 XT_TRACE_CALL();
501
502 if (!(rd_thread = xt_ha_set_current_thread(thd, &e))) {
503 xt_log_exception(NULL, &e, XT_LOG_DEFAULT);
504 return backup::ERROR;
505 }
506
507 return backup::OK;
508}
509
510result_t PBXTRestoreDriver::end()
511{
512 XT_TRACE_CALL();
513 if (rd_ot) {
514 xt_db_return_table_to_pool_ns(rd_ot);
515 rd_ot = NULL;
516 }
517 //if (rb_row_buf) {
518 // xt_free_ns(rb_row_buf);
519 // rb_row_buf = NULL;
520 //}
521 if (rb_row_data) {
522 xt_free_ns(rb_row_data);
523 rb_row_data = NULL;
524 }
525 if (rd_thread->st_xact_data) {
526 if (!xt_xn_commit(rd_thread))
527 return backup::ERROR;
528 }
529 return backup::OK;
530}
531
532
533result_t PBXTRestoreDriver::send_data(Buffer &buf)
534{
535 size_t size;
536 xtWord1 type;
537 xtWord1 *buffer;
538 xtWord4 row_len;
539 xtWord1 *rec_data;
540
541 XT_TRACE_CALL();
542
543 if (buf.table_num != rd_table_no) {
544 XTThreadPtr self = rd_thread;
545 XTTableHPtr tab;
546 char path[PATH_MAX];
547
548 if (rd_ot) {
549 xt_db_return_table_to_pool_ns(rd_ot);
550 rd_ot = NULL;
551 }
552
553 if (rd_thread->st_xact_data) {
554 if (!xt_xn_commit(rd_thread))
555 goto failed;
556 }
557 if (!xt_xn_begin(rd_thread))
558 goto failed;
559 rb_insert_count = 0;
560
561 rd_table_no = buf.table_num;
562 m_tables[rd_table_no-1].internal_name(path, sizeof(path));
563 try_(a) {
564 xt_ha_open_database_of_table(self, (XTPathStrPtr) path);
565 tab = xt_use_table(self, (XTPathStrPtr) path, FALSE, FALSE, NULL);
566 pushr_(xt_heap_release, tab);
567 if (!(rd_ot = xt_db_open_table_using_tab(tab, rd_thread)))
568 xt_throw(self);
569 freer_(); // xt_heap_release(tab)
570
571 rd_my_table = rd_ot->ot_table->tab_dic.dic_my_table;
572 if (rd_my_table->found_next_number_field) {
573 rd_my_table->in_use = current_thd;
574 rd_my_table->next_number_field = rd_my_table->found_next_number_field;
575 rd_my_table->mark_columns_used_by_index_no_reset(rd_my_table->s->next_number_index, rd_my_table->read_set);
576 }
577
578 /* This is safe because only one thread can restore a table at
579 * a time!
580 */
581 rb_row_buf = (xtWord1 *) rd_my_table->record[0];
582 //if (rb_row_buf) {
583 // xt_free(self, rb_row_buf);
584 // rb_row_buf = NULL;
585 //}
586 //rb_row_buf = (xtWord1 *) xt_malloc(self, rd_ot->ot_table->tab_dic.dic_mysql_buf_size);
587
588 rb_col_cnt = rd_ot->ot_table->tab_dic.dic_no_of_cols;
589
590 }
591 catch_(a) {
592 ;
593 }
594 cont_(a);
595
596 if (!rd_ot)
597 goto failed;
598 }
599
600 buffer = (xtWord1 *) buf.data;
601 size = buf.size;
602
603 while (size > 0) {
604 type = *buffer;
605 switch (type) {
606 case BUP_STANDARD_VAR_RECORD:
607 rec_data = buffer + 1;
608 break;
609 case BUP_RECORD_BLOCK_4_START:
610 buffer++;
611 row_len = XT_GET_DISK_4(buffer);
612 buffer += 4;
613 if (rb_data_size < row_len) {
614 if (!xt_realloc_ns((void **) &rb_row_data, row_len))
615 goto failed;
616 rb_data_size = row_len;
617 }
618 row_len = XT_GET_DISK_4(buffer);
619 buffer += 4;
620 ASSERT_NS(row_len <= rb_data_size);
621 if (row_len > rb_data_size) {
622 xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
623 goto failed;
624 }
625 memcpy(rb_row_data, buffer, row_len);
626 rb_row_len = row_len;
627 buffer += row_len;
628 if (row_len + 9 > size) {
629 xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
630 goto failed;
631 }
632 size -= row_len + 9;
633 continue;
634 case BUP_RECORD_BLOCK_4:
635 buffer++;
636 row_len = XT_GET_DISK_4(buffer);
637 buffer += 4;
638 ASSERT_NS(rb_row_len + row_len <= rb_data_size);
639 if (rb_row_len + row_len > rb_data_size) {
640 xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
641 goto failed;
642 }
643 memcpy(rb_row_data + rb_row_len, buffer, row_len);
644 rb_row_len += row_len;
645 buffer += row_len;
646 if (row_len + 5 > size) {
647 xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
648 goto failed;
649 }
650 size -= row_len + 5;
651 continue;
652 case BUP_RECORD_BLOCK_4_END:
653 buffer++;
654 row_len = XT_GET_DISK_4(buffer);
655 buffer += 4;
656 ASSERT_NS(rb_row_len + row_len <= rb_data_size);
657 if (rb_row_len + row_len > rb_data_size) {
658 xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
659 goto failed;
660 }
661 memcpy(rb_row_data + rb_row_len, buffer, row_len);
662 buffer += row_len;
663 if (row_len + 5 > size) {
664 xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
665 goto failed;
666 }
667 size -= row_len + 5;
668 rec_data = rb_row_data;
669 break;
670 default:
671 xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
672 goto failed;
673 }
674
675 if (!(row_len = myxt_load_row_data(rd_ot, rec_data, rb_row_buf, rb_col_cnt)))
676 goto failed;
677
678 if (rd_ot->ot_table->tab_dic.dic_my_table->found_next_number_field)
679 ha_set_auto_increment(rd_ot, rd_ot->ot_table->tab_dic.dic_my_table->found_next_number_field);
680
681 if (!xt_tab_new_record(rd_ot, rb_row_buf))
682 goto failed;
683
684 if (type == BUP_STANDARD_VAR_RECORD) {
685 buffer += row_len+1;
686 if (row_len + 1 > size) {
687 xt_register_xterr(XT_REG_CONTEXT, XT_ERR_BAD_BACKUP_FORMAT);
688 goto failed;
689 }
690 size -= row_len + 1;
691 }
692
693 rb_insert_count++;
694 if (rb_insert_count == XT_RESTORE_BATCH_SIZE) {
695 if (!xt_xn_commit(rd_thread))
696 goto failed;
697 if (!xt_xn_begin(rd_thread))
698 goto failed;
699 rb_insert_count = 0;
700 }
701 }
702
703 return backup::OK;
704
705 failed:
706 xt_log_and_clear_exception(rd_thread);
707 return backup::ERROR;
708}
709
710
711result_t PBXTRestoreDriver::cancel()
712{
713 XT_TRACE_CALL();
714 /* Nothing to do in cancel(); free() will suffice */
715 return backup::OK;
716}
717
718void PBXTRestoreDriver::free()
719{
720 XT_TRACE_CALL();
721 if (rd_ot) {
722 xt_db_return_table_to_pool_ns(rd_ot);
723 rd_ot = NULL;
724 }
725 //if (rb_row_buf) {
726 // xt_free_ns(rb_row_buf);
727 // rb_row_buf = NULL;
728 //}
729 if (rb_row_data) {
730 xt_free_ns(rb_row_data);
731 rb_row_data = NULL;
732 }
733 if (rd_thread && rd_thread->st_xact_data)
734 xt_xn_rollback(rd_thread);
735 delete this;
736}
737
738/*
739 * -----------------------------------------------------------------------
740 * BACKUP ENGINE FACTORY
741 */
742
743#define PBXT_BACKUP_VERSION 1
744
745
746class PBXTBackupEngine: public Backup_engine
747{
748 public:
749 PBXTBackupEngine() { };
750
751 virtual version_t version() const {
752 return PBXT_BACKUP_VERSION;
753 };
754
755 virtual result_t get_backup(const uint32, const Table_list &, Backup_driver* &);
756
757 virtual result_t get_restore(const version_t, const uint32, const Table_list &,Restore_driver* &);
758
759 virtual void free()
760 {
761 delete this;
762 }
763};
764
765result_t PBXTBackupEngine::get_backup(const u_int count, const Table_list &tables, Backup_driver* &drv)
766{
767 PBXTBackupDriver *ptr = new PBXTBackupDriver(tables);
768
769 if (!ptr)
770 return backup::ERROR;
771 drv = ptr;
772 return backup::OK;
773}
774
775result_t PBXTBackupEngine::get_restore(const version_t ver, const uint32,
776 const Table_list &tables, Restore_driver* &drv)
777{
778 if (ver > PBXT_BACKUP_VERSION)
779 {
780 return backup::ERROR;
781 }
782
783 PBXTRestoreDriver *ptr = new PBXTRestoreDriver(tables);
784
785 if (!ptr)
786 return backup::ERROR;
787 drv = (Restore_driver *) ptr;
788 return backup::OK;
789}
790
791
792Backup_result_t pbxt_backup_engine(handlerton *self, Backup_engine* &be)
793{
794 be = new PBXTBackupEngine();
795
796 if (!be)
797 return backup::ERROR;
798
799 return backup::OK;
800}
801
802#endif
0803
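Note on the record framing used by get_data() and send_data() in backup_xt.cc above: a row that does not fit in the caller's buffer is emitted as one BUP_RECORD_BLOCK_4_START block (9 bytes of header), zero or more BUP_RECORD_BLOCK_4 blocks (5 bytes of header) and a final BUP_RECORD_BLOCK_4_END block. The following is only a minimal standalone sketch of that splitting and reassembly, not the driver code. The names split_row, join_row, put4 and get4 are hypothetical, and the little-endian byte order used in place of XT_SET_DISK_4 / XT_GET_DISK_4 is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Record type tags, mirroring the BUP_* values in backup_xt.cc.
enum : uint8_t {
    REC_BLOCK_START = 2,  // tag + 4-byte total length + 4-byte chunk length
    REC_BLOCK       = 3,  // tag + 4-byte chunk length
    REC_BLOCK_END   = 4   // tag + 4-byte chunk length (last chunk of the row)
};

// Hypothetical stand-ins for XT_SET_DISK_4 / XT_GET_DISK_4 (little-endian assumed).
static void put4(std::vector<uint8_t> &out, uint32_t v) {
    for (int i = 0; i < 4; i++)
        out.push_back((uint8_t) (v >> (8 * i)));
}

static uint32_t get4(const uint8_t *p) {
    return (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
           ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

// Split a row that does not fit in one buffer into blocks of at most buf_size bytes.
std::vector<std::vector<uint8_t>> split_row(const std::vector<uint8_t> &row, size_t buf_size) {
    assert(buf_size > 9 && row.size() + 1 > buf_size);  // only the "does not fit" case
    std::vector<std::vector<uint8_t>> bufs;
    size_t off = 0;
    while (off < row.size()) {
        size_t left = row.size() - off;
        size_t chunk;
        std::vector<uint8_t> b;
        if (off == 0) {                    // first block carries the total row length
            chunk = buf_size - 9;
            b.push_back(REC_BLOCK_START);
            put4(b, (uint32_t) row.size());
        } else if (left + 5 > buf_size) {  // middle block: fill the whole buffer
            chunk = buf_size - 5;
            b.push_back(REC_BLOCK);
        } else {                           // the remainder fits: last block
            chunk = left;
            b.push_back(REC_BLOCK_END);
        }
        put4(b, (uint32_t) chunk);
        b.insert(b.end(), row.begin() + off, row.begin() + off + chunk);
        off += chunk;
        bufs.push_back(b);
    }
    return bufs;
}

// Reassemble the row from the blocks, as the restore side accumulates long rows.
std::vector<uint8_t> join_row(const std::vector<std::vector<uint8_t>> &bufs) {
    std::vector<uint8_t> row;
    uint32_t total = 0;
    for (const std::vector<uint8_t> &b : bufs) {
        const uint8_t *p = b.data();
        uint8_t tag = *p++;
        if (tag == REC_BLOCK_START) {
            total = get4(p);
            p += 4;
            row.reserve(total);
        }
        uint32_t chunk = get4(p);
        p += 4;
        row.insert(row.end(), p, p + chunk);
    }
    assert(row.size() == total);
    return row;
}
```

With a 100-byte row and 32-byte buffers this sketch produces four blocks (START, two middle blocks, END). Unlike this sketch, the restore code above copies the chunk before it checks the chunk length against the remaining buffer size; checking first would be the more defensive order.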
=== added file 'plugin/pbxt/src/backup_xt.h'
--- plugin/pbxt/src/backup_xt.h 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/src/backup_xt.h 2010-04-01 14:19:35 +0000
@@ -0,0 +1,34 @@
1/* Copyright (c) 2009 PrimeBase Technologies GmbH
2 *
3 * PrimeBase XT
4 *
5 * This program is free software; you can redistribute it and/or modify
6 * it under the terms of the GNU General Public License as published by
7 * the Free Software Foundation; either version 2 of the License, or
8 * (at your option) any later version.
9 *
10 * This program is distributed in the hope that it will be useful,
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 * GNU General Public License for more details.
14 *
15 * You should have received a copy of the GNU General Public License
16 * along with this program; if not, write to the Free Software
17 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
18 *
19 * 2009-09-07 Paul McCullagh
20 *
21 * H&G2JCtL
22 */
23
24#ifndef __backup_xt_h__
25#define __backup_xt_h__
26
27#include "xt_defs.h"
28
29#ifdef MYSQL_SUPPORTS_BACKUP
30
31Backup_result_t pbxt_backup_engine(handlerton *self, Backup_engine* &be);
32
33#endif
34#endif
035
=== added file 'plugin/pbxt/src/bsearch_xt.cc'
--- plugin/pbxt/src/bsearch_xt.cc 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/src/bsearch_xt.cc 2010-04-01 14:19:35 +0000
@@ -0,0 +1,66 @@
1/* Copyright (c) 2005 PrimeBase Technologies GmbH
2 *
3 * PrimeBase XT
4 *
5 * This program is free software; you can redistribute it and/or modify
6 * it under the terms of the GNU General Public License as published by
7 * the Free Software Foundation; either version 2 of the License, or
8 * (at your option) any later version.
9 *
10 * This program is distributed in the hope that it will be useful,
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 * GNU General Public License for more details.
14 *
15 * You should have received a copy of the GNU General Public License
16 * along with this program; if not, write to the Free Software
17 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
18 *
19 * 2004-01-03 Paul McCullagh
20 *
21 * H&G2JCtL
22 */
23
24#include "xt_config.h"
25
26#include <stdio.h>
27
28#include "bsearch_xt.h"
29#include "pthread_xt.h"
30#include "thread_xt.h"
31
32/**
33 * Binary search an array of 'count' items, with byte size 'size'. This
34 * function returns a pointer to the element and the 'index'
35 * of the element if found.
36 *
37 * If not found the index of the insert point of the item
38 * is returned (0 <= index <= count).
39 *
40 * The comparison routine 'compar' may throw an exception.
41 * In this case the error details will be stored in 'thread'.
42 */
43void *xt_bsearch(XTThreadPtr thread, const void *key, register const void *base, size_t count, size_t size, size_t *idx, const void *thunk, XTCompareFunc compar)
44{
45 register size_t i;
46 register size_t guess;
47 register int r;
48
49 i = 0;
50 while (i < count) {
51 guess = (i + count - 1) >> 1;
52 r = (compar)(thread, thunk, key, ((char *) base) + guess * size);
53 if (r == 0) {
54 *idx = guess;
55 return ((char *) base) + guess * size;
56 }
57 if (r < 0)
58 count = guess;
59 else
60 i = guess + 1;
61 }
62
63 *idx = i;
64 return NULL;
65}
66
067
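Note on xt_bsearch() above: the useful property is that 'idx' is always set, either to the position of the match or to the insert point (0 <= index <= count) when there is no match. The following is a minimal standalone sketch of that contract, assuming a plain comparison callback without the 'thread' and 'thunk' arguments. The names bsearch_insert_point, CompareFunc and compare_int are hypothetical and not part of PBXT.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative comparison callback: returns negative, zero or positive,
// like the 'compar' argument of xt_bsearch (thread and thunk omitted).
typedef int (*CompareFunc)(const void *key, const void *item);

static int compare_int(const void *key, const void *item) {
    int k = *(const int *) key;
    int v = *(const int *) item;
    return (k > v) - (k < v);
}

// Returns a pointer to the matching element, or NULL if there is none.
// In both cases *idx is set: to the match position, or to the insert point.
const void *bsearch_insert_point(const void *key, const void *base, size_t count,
                                 size_t size, size_t *idx, CompareFunc compar) {
    assert(idx != NULL);
    size_t lo = 0, hi = count;  // search the half-open range [lo, hi)
    while (lo < hi) {
        // Same midpoint as (lo + hi - 1) >> 1, written so the sum cannot overflow.
        size_t guess = lo + (hi - lo - 1) / 2;
        const char *item = (const char *) base + guess * size;
        int r = compar(key, item);
        if (r == 0) {
            *idx = guess;
            return item;
        }
        if (r < 0)
            hi = guess;      // key sorts before item: keep the left part
        else
            lo = guess + 1;  // key sorts after item: keep the right part
    }
    *idx = lo;
    return NULL;
}
```

A caller can use the returned index to insert a missing key in sorted position without a second search, which appears to be why the function reports the insert point rather than only NULL.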
=== added file 'plugin/pbxt/src/bsearch_xt.h'
--- plugin/pbxt/src/bsearch_xt.h 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/src/bsearch_xt.h 2010-04-01 14:19:35 +0000
@@ -0,0 +1,32 @@
1/* Copyright (c) 2005 PrimeBase Technologies GmbH
2 *
3 * PrimeBase XT
4 *
5 * This program is free software; you can redistribute it and/or modify
6 * it under the terms of the GNU General Public License as published by
7 * the Free Software Foundation; either version 2 of the License, or
8 * (at your option) any later version.
9 *
10 * This program is distributed in the hope that it will be useful,
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 * GNU General Public License for more details.
14 *
15 * You should have received a copy of the GNU General Public License
16 * along with this program; if not, write to the Free Software
17 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
18 *
19 * 2004-01-03 Paul McCullagh
20 *
21 * H&G2JCtL
22 */
23#ifndef __xt_bsearch_h__
24#define __xt_bsearch_h__
25
26#include "xt_defs.h"
27
28struct XTThread;
29
30void *xt_bsearch(struct XTThread *self, const void *key, register const void *base, size_t count, size_t size, size_t *idx, const void *thunk, XTCompareFunc compar);
31
32#endif
033
=== added file 'plugin/pbxt/src/cache_xt.cc'
--- plugin/pbxt/src/cache_xt.cc 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/src/cache_xt.cc 2010-04-01 14:19:35 +0000
@@ -0,0 +1,1577 @@
1/* Copyright (c) 2005 PrimeBase Technologies GmbH, Germany
2 *
3 * PrimeBase XT
4 *
5 * This program is free software; you can redistribute it and/or modify
6 * it under the terms of the GNU General Public License as published by
7 * the Free Software Foundation; either version 2 of the License, or
8 * (at your option) any later version.
9 *
10 * This program is distributed in the hope that it will be useful,
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 * GNU General Public License for more details.
14 *
15 * You should have received a copy of the GNU General Public License
16 * along with this program; if not, write to the Free Software
17 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
18 *
19 * 2005-05-24 Paul McCullagh
20 *
21 * H&G2JCtL
22 */
23
24#include "xt_config.h"
25
26#ifdef DRIZZLED
27#include <bitset>
28#endif
29
30#ifndef XT_WIN
31#include <unistd.h>
32#endif
33
34#include <stdio.h>
35#include <time.h>
36
37#include "pthread_xt.h"
38#include "thread_xt.h"
39#include "filesys_xt.h"
40#include "cache_xt.h"
41#include "table_xt.h"
42#include "trace_xt.h"
43#include "util_xt.h"
44
45#define XT_TIME_DIFF(start, now) (\
46 ((xtWord4) (now) < (xtWord4) (start)) ? \
47 ((xtWord4) 0XFFFFFFFF - ((xtWord4) (start) - (xtWord4) (now))) : \
48 ((xtWord4) (now) - (xtWord4) (start)))
49
50/*
51 * -----------------------------------------------------------------------
52 * D I S K C A C H E
53 */
54
55#define IDX_CAC_SEGMENT_COUNT ((off_t) 1 << XT_INDEX_CACHE_SEGMENT_SHIFTS)
56#define IDX_CAC_SEGMENT_MASK (IDX_CAC_SEGMENT_COUNT - 1)
57
58#ifdef XT_NO_ATOMICS
59#define IDX_CAC_USE_PTHREAD_RW
60#else
61//#define IDX_CAC_USE_RWMUTEX
62//#define IDX_CAC_USE_PTHREAD_RW
63//#define IDX_CAC_USE_SPINXSLOCK
64#define IDX_CAC_USE_XSMUTEX
65#endif
66
67#ifdef IDX_CAC_USE_XSMUTEX
68#define IDX_CAC_LOCK_TYPE XTXSMutexRec
69#define IDX_CAC_INIT_LOCK(s, i) xt_xsmutex_init_with_autoname(s, &(i)->cs_lock)
70#define IDX_CAC_FREE_LOCK(s, i) xt_xsmutex_free(s, &(i)->cs_lock)
71#define IDX_CAC_READ_LOCK(i, o) xt_xsmutex_slock(&(i)->cs_lock, (o)->t_id)
72#define IDX_CAC_WRITE_LOCK(i, o) xt_xsmutex_xlock(&(i)->cs_lock, (o)->t_id)
73#define IDX_CAC_UNLOCK(i, o) xt_xsmutex_unlock(&(i)->cs_lock, (o)->t_id)
74#elif defined(IDX_CAC_USE_PTHREAD_RW)
75#define IDX_CAC_LOCK_TYPE xt_rwlock_type
76#define IDX_CAC_INIT_LOCK(s, i) xt_init_rwlock(s, &(i)->cs_lock)
77#define IDX_CAC_FREE_LOCK(s, i) xt_free_rwlock(&(i)->cs_lock)
78#define IDX_CAC_READ_LOCK(i, o) xt_slock_rwlock_ns(&(i)->cs_lock)
79#define IDX_CAC_WRITE_LOCK(i, o) xt_xlock_rwlock_ns(&(i)->cs_lock)
80#define IDX_CAC_UNLOCK(i, o) xt_unlock_rwlock_ns(&(i)->cs_lock)
81#elif defined(IDX_CAC_USE_RWMUTEX)
82#define IDX_CAC_LOCK_TYPE XTRWMutexRec
83#define IDX_CAC_INIT_LOCK(s, i) xt_rwmutex_init_with_autoname(s, &(i)->cs_lock)
84#define IDX_CAC_FREE_LOCK(s, i) xt_rwmutex_free(s, &(i)->cs_lock)
85#define IDX_CAC_READ_LOCK(i, o) xt_rwmutex_slock(&(i)->cs_lock, (o)->t_id)
86#define IDX_CAC_WRITE_LOCK(i, o) xt_rwmutex_xlock(&(i)->cs_lock, (o)->t_id)
87#define IDX_CAC_UNLOCK(i, o) xt_rwmutex_unlock(&(i)->cs_lock, (o)->t_id)
88#elif defined(IDX_CAC_USE_SPINXSLOCK)
89#define IDX_CAC_LOCK_TYPE XTSpinXSLockRec
90#define IDX_CAC_INIT_LOCK(s, i) xt_spinxslock_init_with_autoname(s, &(i)->cs_lock)
91#define IDX_CAC_FREE_LOCK(s, i) xt_spinxslock_free(s, &(i)->cs_lock)
92#define IDX_CAC_READ_LOCK(i, s) xt_spinxslock_slock(&(i)->cs_lock, (s)->t_id)
93#define IDX_CAC_WRITE_LOCK(i, s) xt_spinxslock_xlock(&(i)->cs_lock, (s)->t_id)
94#define IDX_CAC_UNLOCK(i, s) xt_spinxslock_unlock(&(i)->cs_lock, (s)->t_id)
95#endif
96
97#define ID_HANDLE_USE_SPINLOCK
98//#define ID_HANDLE_USE_PTHREAD_RW
99
100#if defined(ID_HANDLE_USE_PTHREAD_RW)
101#define ID_HANDLE_LOCK_TYPE xt_mutex_type
102#define ID_HANDLE_INIT_LOCK(s, i) xt_init_mutex_with_autoname(s, i)
103#define ID_HANDLE_FREE_LOCK(s, i) xt_free_mutex(i)
104#define ID_HANDLE_LOCK(i) xt_lock_mutex_ns(i)
105#define ID_HANDLE_UNLOCK(i) xt_unlock_mutex_ns(i)
106#elif defined(ID_HANDLE_USE_SPINLOCK)
107#define ID_HANDLE_LOCK_TYPE XTSpinLockRec
108#define ID_HANDLE_INIT_LOCK(s, i) xt_spinlock_init_with_autoname(s, i)
109#define ID_HANDLE_FREE_LOCK(s, i) xt_spinlock_free(s, i)
110#define ID_HANDLE_LOCK(i) xt_spinlock_lock(i)
111#define ID_HANDLE_UNLOCK(i) xt_spinlock_unlock(i)
112#endif
113
114#define XT_HANDLE_SLOTS 37
115
116/*
117#ifdef DEBUG
118#define XT_INIT_HANDLE_COUNT 0
119#define XT_INIT_HANDLE_BLOCKS 0
120#else
121#define XT_INIT_HANDLE_COUNT 40
122#define XT_INIT_HANDLE_BLOCKS 10
123#endif
124*/
125
126/* A disk cache segment. The cache is divided into a number of segments
127 * to improve concurrency.
128 */
129typedef struct DcSegment {
130 IDX_CAC_LOCK_TYPE cs_lock; /* The cache segment lock. */
131 XTIndBlockPtr *cs_hash_table;
132} DcSegmentRec, *DcSegmentPtr;
133
134typedef struct DcHandleSlot {
135 ID_HANDLE_LOCK_TYPE hs_handles_lock;
136 XTIndHandleBlockPtr hs_free_blocks;
137 XTIndHandlePtr hs_free_handles;
138 XTIndHandlePtr hs_used_handles;
139} DcHandleSlotRec, *DcHandleSlotPtr;
140
141typedef struct DcGlobals {
142 xt_mutex_type cg_lock; /* The public cache lock. */
143 DcSegmentRec cg_segment[IDX_CAC_SEGMENT_COUNT];
144 XTIndBlockPtr cg_blocks;
145#ifdef XT_USE_DIRECT_IO_ON_INDEX
146 xtWord1 *cg_buffer;
147#endif
148 XTIndBlockPtr cg_free_list;
149 xtWord4 cg_free_count;
150 xtWord4 cg_ru_now; /* A counter as described by Jim Starkey (my thanks) */
151 XTIndBlockPtr cg_lru_block;
152 XTIndBlockPtr cg_mru_block;
153 xtWord4 cg_hash_size;
154 xtWord4 cg_block_count;
155 xtWord4 cg_max_free;
156#ifdef DEBUG_CHECK_IND_CACHE
157 u_int cg_reserved_by_ots; /* Number of blocks reserved by open tables. */
158 u_int cg_read_count; /* Number of blocks being read. */
159#endif
160
161 /* Index cache handles: */
162 DcHandleSlotRec cg_handle_slot[XT_HANDLE_SLOTS];
163} DcGlobalsRec;
164
165static DcGlobalsRec ind_cac_globals;
166
167#ifdef XT_USE_MYSYS
168#ifdef xtPublic
169#undef xtPublic
170#endif
171#include "my_global.h"
172#include "my_sys.h"
173#include "keycache.h"
174KEY_CACHE my_cache;
175#undef pthread_rwlock_rdlock
176#undef pthread_rwlock_wrlock
177#undef pthread_rwlock_unlock
178#undef pthread_mutex_lock
179#undef pthread_mutex_unlock
180#undef pthread_cond_wait
181#undef pthread_cond_broadcast
182#undef xt_mutex_type
183#define xtPublic
184#endif
185
186/*
187 * -----------------------------------------------------------------------
188 * INDEX CACHE HANDLES
189 */
190
191static XTIndHandlePtr ind_alloc_handle()
192{
193 XTIndHandlePtr handle;
194
195 if (!(handle = (XTIndHandlePtr) xt_calloc_ns(sizeof(XTIndHandleRec))))
196 return NULL;
197 xt_spinlock_init_with_autoname(NULL, &handle->ih_lock);
198 return handle;
199}
200
201static void ind_free_handle(XTIndHandlePtr handle)
202{
203 xt_spinlock_free(NULL, &handle->ih_lock);
204 xt_free_ns(handle);
205}
206
207static void ind_handle_exit(XTThreadPtr self)
208{
209 DcHandleSlotPtr hs;
210 XTIndHandlePtr handle;
211 XTIndHandleBlockPtr hptr;
212
213 for (int i=0; i<XT_HANDLE_SLOTS; i++) {
214 hs = &ind_cac_globals.cg_handle_slot[i];
215
216 while (hs->hs_used_handles) {
217 handle = hs->hs_used_handles;
218 xt_ind_release_handle(handle, FALSE, self);
219 }
220
221 while (hs->hs_free_blocks) {
222 hptr = hs->hs_free_blocks;
223 hs->hs_free_blocks = hptr->hb_next;
224 xt_free(self, hptr);
225 }
226
227 while (hs->hs_free_handles) {
228 handle = hs->hs_free_handles;
229 hs->hs_free_handles = handle->ih_next;
230 ind_free_handle(handle);
231 }
232
233 ID_HANDLE_FREE_LOCK(self, &hs->hs_handles_lock);
234 }
235}
236
237static void ind_handle_init(XTThreadPtr self)
238{
239 DcHandleSlotPtr hs;
240
241 for (int i=0; i<XT_HANDLE_SLOTS; i++) {
242 hs = &ind_cac_globals.cg_handle_slot[i];
243 memset(hs, 0, sizeof(DcHandleSlotRec));
244 ID_HANDLE_INIT_LOCK(self, &hs->hs_handles_lock);
245 }
246}
247
248//#define CHECK_HANDLE_STRUCTS
249
250#ifdef CHECK_HANDLE_STRUCTS
251static int gdummy = 0;
252
253static void ic_stop_here()
254{
255 gdummy = gdummy + 1;
256 printf("Nooo %d!\n", gdummy);
257}
258
259static void ic_check_handle_structs()
260{
261 XTIndHandlePtr handle, phandle;
262 XTIndHandleBlockPtr hptr, phptr;
263 int count = 0;
264 int ctest;
265
266 phandle = NULL;
267 handle = ind_cac_globals.cg_used_handles;
268 while (handle) {
269 if (handle == phandle)
270 ic_stop_here();
271 if (handle->ih_prev != phandle)
272 ic_stop_here();
273 if (handle->ih_cache_reference) {
274 ctest = handle->x.ih_cache_block->cb_handle_count;
275 if (ctest == 0 || ctest > 100)
276 ic_stop_here();
277 }
278 else {
279 ctest = handle->x.ih_handle_block->hb_ref_count;
280 if (ctest == 0 || ctest > 100)
281 ic_stop_here();
282 }
283 phandle = handle;
284 handle = handle->ih_next;
285 count++;
286 if (count > 1000)
287 ic_stop_here();
288 }
289
290 count = 0;
291 hptr = ind_cac_globals.cg_free_blocks;
292 while (hptr) {
293 if (hptr == phptr)
294 ic_stop_here();
295 phptr = hptr;
296 hptr = hptr->hb_next;
297 count++;
298 if (count > 1000)
299 ic_stop_here();
300 }
301
302 count = 0;
303 handle = ind_cac_globals.cg_free_handles;
304 while (handle) {
305 if (handle == phandle)
306 ic_stop_here();
307 phandle = handle;
308 handle = handle->ih_next;
309 count++;
310 if (count > 1000)
311 ic_stop_here();
312 }
313}
314#endif
315
316/*
317 * Get a handle to the index block.
318 * This function is called by index scanners (readers).
319 */
320xtPublic XTIndHandlePtr xt_ind_get_handle(XTOpenTablePtr ot, XTIndexPtr ind, XTIndReferencePtr iref)
321{
322 DcHandleSlotPtr hs;
323 XTIndHandlePtr handle;
324
325 hs = &ind_cac_globals.cg_handle_slot[iref->ir_block->cb_address % XT_HANDLE_SLOTS];
326
327 ASSERT_NS(iref->ir_xlock == FALSE);
328 ASSERT_NS(iref->ir_updated == FALSE);
329 ID_HANDLE_LOCK(&hs->hs_handles_lock);
330#ifdef CHECK_HANDLE_STRUCTS
331 ic_check_handle_structs();
332#endif
333 if ((handle = hs->hs_free_handles))
334 hs->hs_free_handles = handle->ih_next;
335 else {
336 if (!(handle = ind_alloc_handle())) {
337 ID_HANDLE_UNLOCK(&hs->hs_handles_lock);
338 xt_ind_release(ot, ind, XT_UNLOCK_READ, iref);
339 return NULL;
340 }
341 }
342 if (hs->hs_used_handles)
343 hs->hs_used_handles->ih_prev = handle;
344 handle->ih_next = hs->hs_used_handles;
345 handle->ih_prev = NULL;
346 handle->ih_address = iref->ir_block->cb_address;
347 handle->ih_cache_reference = TRUE;
348 handle->x.ih_cache_block = iref->ir_block;
349 handle->ih_branch = iref->ir_branch;
350 /* {HANDLE-COUNT-USAGE}
351 * This is safe because:
352 *
353 * I have an Slock on the cache block, and I have
354 * at least an Slock on the index.
355 * So this excludes anyone who is reading
356 * cb_handle_count in the index.
357 * (all cache block writers, and the freer).
358 *
359 * The increment is safe because I have the list
360 * lock (hs_handles_lock), which is required by anyone else
361 * who increments or decrements this value.
362 */
363 iref->ir_block->cb_handle_count++;
364 hs->hs_used_handles = handle;
365#ifdef CHECK_HANDLE_STRUCTS
366 ic_check_handle_structs();
367#endif
368 ID_HANDLE_UNLOCK(&hs->hs_handles_lock);
369 xt_ind_release(ot, ind, XT_UNLOCK_READ, iref);
370 return handle;
371}
372
373xtPublic void xt_ind_release_handle(XTIndHandlePtr handle, xtBool have_lock, XTThreadPtr thread)
374{
375 DcHandleSlotPtr hs;
376 XTIndBlockPtr block = NULL;
377 u_int hash_idx = 0;
378 DcSegmentPtr seg = NULL;
379 XTIndBlockPtr xblock;
380
381 /* The lock order is:
382 * 1. Cache segment (cs_lock) - This is only by ind_free_block()!
383 * 1. S/Slock cache block (cb_lock)
384 * 2. List lock (hs_handles_lock).
385 * 3. Handle lock (ih_lock)
386 */
387 if (!have_lock)
388 xt_spinlock_lock(&handle->ih_lock);
389
390 /* Get the lock on the cache page if required: */
391 if (handle->ih_cache_reference) {
392 u_int file_id;
393 xtIndexNodeID address;
394
395 block = handle->x.ih_cache_block;
396
397 file_id = block->cb_file_id;
398 address = block->cb_address;
399 hash_idx = XT_NODE_ID(address) + (file_id * 223);
400 seg = &ind_cac_globals.cg_segment[hash_idx & IDX_CAC_SEGMENT_MASK];
401 hash_idx = (hash_idx >> XT_INDEX_CACHE_SEGMENT_SHIFTS) % ind_cac_globals.cg_hash_size;
402 }
403
404 xt_spinlock_unlock(&handle->ih_lock);
405
406 /* Because of the lock order, I have to release the
407 * handle before I get a lock on the cache block.
408 *
409 * But, by doing this, the cache block may be gone!
410 */
411 if (block) {
412 IDX_CAC_READ_LOCK(seg, thread);
413 xblock = seg->cs_hash_table[hash_idx];
414 while (xblock) {
415 if (block == xblock) {
416 /* Found the block...
417 * {HANDLE-COUNT-SLOCK}
418 * 04.05.2009, changed to slock.
419 */
420 XT_IPAGE_READ_LOCK(&block->cb_lock);
421 goto block_found;
422 }
423 xblock = xblock->cb_next;
424 }
425 block = NULL;
426 block_found:
427 IDX_CAC_UNLOCK(seg, thread);
428 }
429
430 hs = &ind_cac_globals.cg_handle_slot[handle->ih_address % XT_HANDLE_SLOTS];
431
432 ID_HANDLE_LOCK(&hs->hs_handles_lock);
433#ifdef CHECK_HANDLE_STRUCTS
434 ic_check_handle_structs();
435#endif
436
437 /* I don't need to lock the handle because I have locked
438 * the list, and no other thread can change the
439 * handle without first getting a lock on the list.
440 *
441 * In addition, the caller is the only owner of the
442 * handle, and the only thread with an independent
443 * reference to the handle.
444 * All other accesses occur via the list.
445 */
446
447 /* Remove the reference to the cache or a handle block: */
448 if (handle->ih_cache_reference) {
449 ASSERT_NS(block == handle->x.ih_cache_block);
450 ASSERT_NS(block && block->cb_handle_count > 0);
451 /* {HANDLE-COUNT-USAGE}
452 * This is safe here because I have excluded
453 * all readers by taking an Xlock on the
454 * cache block (CHANGED - see below).
455 *
456 * {HANDLE-COUNT-SLOCK}
457 * 04.05.2009, changed to slock.
458 * Should be OK, because:
459 * I have a lock on the list lock (hs_handles_lock),
460 * which prevents concurrent updates to cb_handle_count.
461 *
462 * I also have a read lock on the cache block
463 * but not a lock on the index. As a result, we cannot
464 * exclude all index writers (and readers of
465 * cb_handle_count).
466 */
467 block->cb_handle_count--;
468 }
469 else {
470 XTIndHandleBlockPtr hptr = handle->x.ih_handle_block;
471
472 ASSERT_NS(!handle->ih_cache_reference);
473 ASSERT_NS(hptr->hb_ref_count > 0);
474 hptr->hb_ref_count--;
475 if (!hptr->hb_ref_count) {
476 /* Put it back on the free list: */
477 hptr->hb_next = hs->hs_free_blocks;
478 hs->hs_free_blocks = hptr;
479 }
480 }
481
482 /* Unlink the handle: */
483 if (handle->ih_next)
484 handle->ih_next->ih_prev = handle->ih_prev;
485 if (handle->ih_prev)
486 handle->ih_prev->ih_next = handle->ih_next;
487 if (hs->hs_used_handles == handle)
488 hs->hs_used_handles = handle->ih_next;
489
490 /* Put it on the free list: */
491 handle->ih_next = hs->hs_free_handles;
492 hs->hs_free_handles = handle;
493
494#ifdef CHECK_HANDLE_STRUCTS
495 ic_check_handle_structs();
496#endif
497 ID_HANDLE_UNLOCK(&hs->hs_handles_lock);
498
499 if (block)
500 XT_IPAGE_UNLOCK(&block->cb_lock, FALSE);
501}
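
Reviewer's sketch of the list bookkeeping that xt_ind_get_handle() and xt_ind_release_handle() perform on a handle slot: a doubly-linked "used" list plus a singly-linked free list. All names here (Handle, Slot, slot_use, slot_release) are hypothetical, and the locking (hs_handles_lock, ih_lock, cb_lock) is deliberately left out, so this only illustrates the pointer updates, not the concurrency protocol:

```c
#include <stddef.h>

typedef struct Handle {
	struct Handle *next;
	struct Handle *prev;
} Handle;

typedef struct Slot {
	Handle *used;      /* head of the doubly-linked used list */
	Handle *free_list; /* head of the singly-linked free list */
} Slot;

/* Push a handle onto the front of the used list. */
static void slot_use(Slot *s, Handle *h)
{
	if (s->used)
		s->used->prev = h;
	h->next = s->used;
	h->prev = NULL;
	s->used = h;
}

/* Unlink a handle from the used list and put it on the free list. */
static void slot_release(Slot *s, Handle *h)
{
	if (h->next)
		h->next->prev = h->prev;
	if (h->prev)
		h->prev->next = h->next;
	if (s->used == h)
		s->used = h->next;

	/* Only 'next' is meaningful on the free list. */
	h->next = s->free_list;
	s->free_list = h;
}
```

As in the diff, the prev pointer of a freed handle is left stale; it is reset the next time the handle is pushed onto the used list.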
502
503/* Call this function before a referenced cache block is modified!
504 * This function is called by index updaters.
505 */
506xtPublic xtBool xt_ind_copy_on_write(XTIndReferencePtr iref)
507{
508 DcHandleSlotPtr hs;
509 XTIndHandleBlockPtr hptr;
510 u_int branch_size;
511 XTIndHandlePtr handle;
512 u_int i = 0;
513
514 hs = &ind_cac_globals.cg_handle_slot[iref->ir_block->cb_address % XT_HANDLE_SLOTS];
515
516 ID_HANDLE_LOCK(&hs->hs_handles_lock);
517
518 /* {HANDLE-COUNT-USAGE}
519 * This is only called by updaters of this index block, or
520 * the freer, which holds an Xlock on the index block.
521 * These are all mutually exclusive for the index block.
522 *
523 * {HANDLE-COUNT-SLOCK}
524 * Do this check again, after we have the list lock (hs_handles_lock).
525 * There is a small chance that the count has changed, since we last
526 * checked because xt_ind_release_handle() only holds
527 * an slock on the index page.
528 *
529 * An updater can sometimes have an Xlock on the index and an Slock
530 * on the cache block. In this case xt_ind_release_handle()
531 * could have run through.
532 */
533 if (!iref->ir_block->cb_handle_count) {
534 ID_HANDLE_UNLOCK(&hs->hs_handles_lock);
535 return OK;
536 }
537
538#ifdef CHECK_HANDLE_STRUCTS
539 ic_check_handle_structs();
540#endif
541 if ((hptr = hs->hs_free_blocks))
542 hs->hs_free_blocks = hptr->hb_next;
543 else {
544 if (!(hptr = (XTIndHandleBlockPtr) xt_malloc_ns(sizeof(XTIndHandleBlockRec)))) {
545 ID_HANDLE_UNLOCK(&hs->hs_handles_lock);
546 return FAILED;
547 }
548 }
549
550 branch_size = XT_GET_INDEX_BLOCK_LEN(XT_GET_DISK_2(iref->ir_branch->tb_size_2));
551 memcpy(&hptr->hb_branch, iref->ir_branch, branch_size);
552 hptr->hb_ref_count = iref->ir_block->cb_handle_count;
553
554 handle = hs->hs_used_handles;
555 while (handle) {
556 if (handle->ih_branch == iref->ir_branch) {
557 i++;
558 xt_spinlock_lock(&handle->ih_lock);
559 ASSERT_NS(handle->ih_cache_reference);
560 handle->ih_cache_reference = FALSE;
561 handle->x.ih_handle_block = hptr;
562 handle->ih_branch = &hptr->hb_branch;
563 xt_spinlock_unlock(&handle->ih_lock);
564#ifndef DEBUG
565 if (i == hptr->hb_ref_count)
566 break;
567#endif
568 }
569 handle = handle->ih_next;
570 }
571#ifdef DEBUG
572 ASSERT_NS(hptr->hb_ref_count == i);
573#endif
574 /* {HANDLE-COUNT-USAGE}
575 * It is safe to modify cb_handle_count when I have the
576 * list lock, and I have excluded all readers!
577 */
578 iref->ir_block->cb_handle_count = 0;
579#ifdef CHECK_HANDLE_STRUCTS
580 ic_check_handle_structs();
581#endif
582 ID_HANDLE_UNLOCK(&hs->hs_handles_lock);
583
584 return OK;
585}
586
587xtPublic void xt_ind_lock_handle(XTIndHandlePtr handle)
588{
589 xt_spinlock_lock(&handle->ih_lock);
590}
591
592xtPublic void xt_ind_unlock_handle(XTIndHandlePtr handle)
593{
594 xt_spinlock_unlock(&handle->ih_lock);
595}
596
597/*
598 * -----------------------------------------------------------------------
599 * INIT/EXIT
600 */
601
602/*
603 * Initialize the disk cache.
604 */
605xtPublic void xt_ind_init(XTThreadPtr self, size_t cache_size)
606{
607 XTIndBlockPtr block;
608
609#ifdef XT_USE_MYSYS
610 init_key_cache(&my_cache, 1024, cache_size, 100, 300);
611#endif
612 /* Memory is devoted to the page data alone, I no longer count the size of the directory,
613 * or the page overhead: */
614 ind_cac_globals.cg_block_count = cache_size / XT_INDEX_PAGE_SIZE;
615 ind_cac_globals.cg_hash_size = ind_cac_globals.cg_block_count / (IDX_CAC_SEGMENT_COUNT >> 1);
616 ind_cac_globals.cg_max_free = ind_cac_globals.cg_block_count / 10;
617 if (ind_cac_globals.cg_max_free < 8)
618 ind_cac_globals.cg_max_free = 8;
619 if (ind_cac_globals.cg_max_free > 128)
620 ind_cac_globals.cg_max_free = 128;
621
622 try_(a) {
623 for (u_int i=0; i<IDX_CAC_SEGMENT_COUNT; i++) {
624 ind_cac_globals.cg_segment[i].cs_hash_table = (XTIndBlockPtr *) xt_calloc(self, ind_cac_globals.cg_hash_size * sizeof(XTIndBlockPtr));
625 IDX_CAC_INIT_LOCK(self, &ind_cac_globals.cg_segment[i]);
626 }
627
628 block = (XTIndBlockPtr) xt_malloc(self, ind_cac_globals.cg_block_count * sizeof(XTIndBlockRec));
629 ind_cac_globals.cg_blocks = block;
630 xt_init_mutex_with_autoname(self, &ind_cac_globals.cg_lock);
631#ifdef XT_USE_DIRECT_IO_ON_INDEX
632 xtWord1 *buffer;
633#ifdef XT_WIN
634 size_t psize = 512;
635#else
636 size_t psize = getpagesize();
637#endif
638 size_t diff;
639
640 buffer = (xtWord1 *) xt_malloc(self, (ind_cac_globals.cg_block_count * XT_INDEX_PAGE_SIZE));
641 diff = (size_t) buffer % psize;
642 if (diff != 0) {
643 xt_free(self, buffer);
644 buffer = (xtWord1 *) xt_malloc(self, (ind_cac_globals.cg_block_count * XT_INDEX_PAGE_SIZE) + psize);
645 diff = (size_t) buffer % psize;
646 if (diff != 0)
647 diff = psize - diff;
648 }
649 ind_cac_globals.cg_buffer = buffer;
650 buffer += diff;
651#endif
652
653 for (u_int i=0; i<ind_cac_globals.cg_block_count; i++) {
654 XT_IPAGE_INIT_LOCK(self, &block->cb_lock);
655 block->cb_state = IDX_CAC_BLOCK_FREE;
656 block->cb_next = ind_cac_globals.cg_free_list;
657#ifdef XT_USE_DIRECT_IO_ON_INDEX
658 block->cb_data = buffer;
659 buffer += XT_INDEX_PAGE_SIZE;
660#endif
661 ind_cac_globals.cg_free_list = block;
662 block++;
663 }
664 ind_cac_globals.cg_free_count = ind_cac_globals.cg_block_count;
665#ifdef DEBUG_CHECK_IND_CACHE
666 ind_cac_globals.cg_reserved_by_ots = 0;
667#endif
668 ind_handle_init(self);
669 }
670 catch_(a) {
671 xt_ind_exit(self);
672 throw_();
673 }
674 cont_(a);
675}
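
Reviewer's sketch of the sizing arithmetic at the top of xt_ind_init(): block count from the cache size, hash table size per segment, and the free-block target clamped to the range 8..128. CacheSizing and size_cache are hypothetical names; the page size and segment count are passed in as parameters because the real values of XT_INDEX_PAGE_SIZE and XT_INDEX_CACHE_SEGMENT_SHIFTS are not shown in this part of the diff:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct CacheSizing {
	uint32_t block_count; /* pages that fit in the cache */
	uint32_t hash_size;   /* hash buckets per segment */
	uint32_t max_free;    /* target number of free blocks */
} CacheSizing;

/* segment_count must be at least 2, otherwise the divisor below is zero. */
static CacheSizing size_cache(size_t cache_size, size_t page_size, uint32_t segment_count)
{
	CacheSizing s;

	s.block_count = (uint32_t) (cache_size / page_size);
	s.hash_size = s.block_count / (segment_count >> 1);
	s.max_free = s.block_count / 10;
	if (s.max_free < 8)
		s.max_free = 8;
	if (s.max_free > 128)
		s.max_free = 128;
	return s;
}
```

With an assumed 16 KB page and 8 segments, a 32 MB cache gives 2048 blocks, 512 buckets per segment, and a free target capped at 128. Note that a cache smaller than (segment_count / 2) pages would give a hash size of zero, which the later modulo operations would not tolerate.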
676
677xtPublic void xt_ind_exit(XTThreadPtr self)
678{
679#ifdef XT_USE_MYSYS
680 end_key_cache(&my_cache, 1);
681#endif
682 for (u_int i=0; i<IDX_CAC_SEGMENT_COUNT; i++) {
683 if (ind_cac_globals.cg_segment[i].cs_hash_table) {
684 xt_free(self, ind_cac_globals.cg_segment[i].cs_hash_table);
685 ind_cac_globals.cg_segment[i].cs_hash_table = NULL;
686 IDX_CAC_FREE_LOCK(self, &ind_cac_globals.cg_segment[i]);
687 }
688 }
689
690 if (ind_cac_globals.cg_blocks) {
691 xt_free(self, ind_cac_globals.cg_blocks);
692 ind_cac_globals.cg_blocks = NULL;
693 xt_free_mutex(&ind_cac_globals.cg_lock);
694 }
695#ifdef XT_USE_DIRECT_IO_ON_INDEX
696 if (ind_cac_globals.cg_buffer) {
697 xt_free(self, ind_cac_globals.cg_buffer);
698 ind_cac_globals.cg_buffer = NULL;
699 }
700#endif
701 ind_handle_exit(self);
702
703 memset(&ind_cac_globals, 0, sizeof(ind_cac_globals));
704}
705
706xtPublic xtInt8 xt_ind_get_usage()
707{
708 xtInt8 size = 0;
709
710 size = (xtInt8) (ind_cac_globals.cg_block_count - ind_cac_globals.cg_free_count) * (xtInt8) XT_INDEX_PAGE_SIZE;
711 return size;
712}
713
714xtPublic xtInt8 xt_ind_get_size()
715{
716 xtInt8 size = 0;
717
718 size = (xtInt8) ind_cac_globals.cg_block_count * (xtInt8) XT_INDEX_PAGE_SIZE;
719 return size;
720}
721
722/*
723 * -----------------------------------------------------------------------
724 * INDEX CHECKING
725 */
726
727xtPublic void xt_ind_check_cache(XTIndexPtr ind)
728{
729 XTIndBlockPtr block;
730 u_int free_count, inuse_count, clean_count;
731 xtBool check_count = FALSE;
732
733 if (ind == (XTIndex *) 1) {
734 ind = NULL;
735 check_count = TRUE;
736 }
737
738 // Check the dirty list:
739 if (ind) {
740 u_int cnt = 0;
741
742 block = ind->mi_dirty_list;
743 while (block) {
744 cnt++;
745 ASSERT_NS(block->cb_state == IDX_CAC_BLOCK_DIRTY);
746 block = block->cb_dirty_next;
747 }
748 ASSERT_NS(ind->mi_dirty_blocks == cnt);
749 }
750
751 xt_lock_mutex_ns(&ind_cac_globals.cg_lock);
752
753 // Check the free list:
754 free_count = 0;
755 block = ind_cac_globals.cg_free_list;
756 while (block) {
757 free_count++;
758 ASSERT_NS(block->cb_state == IDX_CAC_BLOCK_FREE);
759 block = block->cb_next;
760 }
761 ASSERT_NS(ind_cac_globals.cg_free_count == free_count);
762
763 /* Check the LRU list: */
764 XTIndBlockPtr list_block, plist_block;
765
766 plist_block = NULL;
767 list_block = ind_cac_globals.cg_lru_block;
768 if (list_block) {
769 ASSERT_NS(ind_cac_globals.cg_mru_block != NULL);
770 ASSERT_NS(ind_cac_globals.cg_mru_block->cb_mr_used == NULL);
771 ASSERT_NS(list_block->cb_lr_used == NULL);
772 inuse_count = 0;
773 clean_count = 0;
774 while (list_block) {
775 inuse_count++;
776 ASSERT_NS(list_block->cb_state == IDX_CAC_BLOCK_DIRTY || list_block->cb_state == IDX_CAC_BLOCK_CLEAN);
777 if (list_block->cb_state == IDX_CAC_BLOCK_CLEAN)
778 clean_count++;
779 ASSERT_NS(block != list_block);
780 ASSERT_NS(list_block->cb_lr_used == plist_block);
781 plist_block = list_block;
782 list_block = list_block->cb_mr_used;
783 }
784 ASSERT_NS(ind_cac_globals.cg_mru_block == plist_block);
785 }
786 else {
787 inuse_count = 0;
788 clean_count = 0;
789 ASSERT_NS(ind_cac_globals.cg_mru_block == NULL);
790 }
791
792#ifdef DEBUG_CHECK_IND_CACHE
793 ASSERT_NS(free_count + inuse_count + ind_cac_globals.cg_reserved_by_ots + ind_cac_globals.cg_read_count == ind_cac_globals.cg_block_count);
794#endif
795 xt_unlock_mutex_ns(&ind_cac_globals.cg_lock);
796 if (check_count) {
797 /* We have just flushed, check how much is now free/clean. */
798 if (free_count + clean_count < 10) {
799 /* This could be a problem: */
800 printf("Cache very low!\n");
801 }
802 }
803}
804
805#ifdef XXXXDEBUG
806static void ind_cac_check_on_dirty_list(DcSegmentPtr seg, XTIndBlockPtr block)
807{
808 XTIndBlockPtr list_block, plist_block;
809 xtBool found = FALSE;
810
811 plist_block = NULL;
812 list_block = seg->cs_dirty_list[block->cb_file_id % XT_INDEX_CACHE_FILE_SLOTS];
813 while (list_block) {
814 ASSERT_NS(list_block->cb_state == IDX_CAC_BLOCK_DIRTY);
815 ASSERT_NS(list_block->cb_dirty_prev == plist_block);
816 if (list_block == block)
817 found = TRUE;
818 plist_block = list_block;
819 list_block = list_block->cb_dirty_next;
820 }
821 ASSERT_NS(found);
822}
823
824static void ind_cac_check_dirty_list(DcSegmentPtr seg, XTIndBlockPtr block)
825{
826 XTIndBlockPtr list_block, plist_block;
827
828 for (u_int j=0; j<XT_INDEX_CACHE_FILE_SLOTS; j++) {
829 plist_block = NULL;
830 list_block = seg->cs_dirty_list[j];
831 while (list_block) {
832 ASSERT_NS(list_block->cb_state == IDX_CAC_BLOCK_DIRTY);
833 ASSERT_NS(block != list_block);
834 ASSERT_NS(list_block->cb_dirty_prev == plist_block);
835 plist_block = list_block;
836 list_block = list_block->cb_dirty_next;
837 }
838 }
839}
840
841#endif
842
843/*
844 * -----------------------------------------------------------------------
845 * FREEING INDEX CACHE
846 */
847
848/*
849 * This function returns TRUE if the block is freed.
850 * This function returns FALSE if the block cannot be found, or the
851 * block is not clean.
852 *
853 * We also return FALSE if we cannot copy the block to the handle
854 * (if this is required). This will be due to out-of-memory!
855 */
856static xtBool ind_free_block(XTOpenTablePtr ot, XTIndBlockPtr block)
857{
858 XTIndBlockPtr xblock, pxblock;
859 u_int hash_idx;
860 u_int file_id;
861 xtIndexNodeID address;
862 DcSegmentPtr seg;
863
864#ifdef DEBUG_CHECK_IND_CACHE
865 xt_ind_check_cache(NULL);
866#endif
867 file_id = block->cb_file_id;
868 address = block->cb_address;
869
870 hash_idx = XT_NODE_ID(address) + (file_id * 223);
871 seg = &ind_cac_globals.cg_segment[hash_idx & IDX_CAC_SEGMENT_MASK];
872 hash_idx = (hash_idx >> XT_INDEX_CACHE_SEGMENT_SHIFTS) % ind_cac_globals.cg_hash_size;
873
874 IDX_CAC_WRITE_LOCK(seg, ot->ot_thread);
875
876 pxblock = NULL;
877 xblock = seg->cs_hash_table[hash_idx];
878 while (xblock) {
879 if (block == xblock) {
880 /* Found the block... */
881 XT_IPAGE_WRITE_LOCK(&block->cb_lock, ot->ot_thread->t_id);
882 if (block->cb_state != IDX_CAC_BLOCK_CLEAN) {
883 /* This block cannot be freed: */
884 XT_IPAGE_UNLOCK(&block->cb_lock, TRUE);
885 IDX_CAC_UNLOCK(seg, ot->ot_thread);
886#ifdef DEBUG_CHECK_IND_CACHE
887 xt_ind_check_cache(NULL);
888#endif
889 return FALSE;
890 }
891
892 goto free_the_block;
893 }
894 pxblock = xblock;
895 xblock = xblock->cb_next;
896 }
897
898 IDX_CAC_UNLOCK(seg, ot->ot_thread);
899
900 /* Not found (this can happen, if block was freed by another thread) */
901#ifdef DEBUG_CHECK_IND_CACHE
902 xt_ind_check_cache(NULL);
903#endif
904 return FALSE;
905
906 free_the_block:
907
908 /* If the block is referenced by a handle, then we
909 * have to copy the data to the handle before we
910 * free the page:
911 */
912 /* {HANDLE-COUNT-USAGE}
913 * This access is safe because:
914 *
915 * We have an Xlock on the cache block, which excludes
916 * all other writers that want to change the cache block
917 * and also all readers of the cache block, because
918 * they all have at least an Slock on the cache block.
919 */
920 if (block->cb_handle_count) {
921 XTIndReferenceRec iref;
922
923 iref.ir_xlock = TRUE;
924 iref.ir_updated = FALSE;
925 iref.ir_block = block;
926 iref.ir_branch = (XTIdxBranchDPtr) block->cb_data;
927 if (!xt_ind_copy_on_write(&iref)) {
928 XT_IPAGE_UNLOCK(&block->cb_lock, TRUE);
929 return FALSE;
930 }
931 }
932
933 /* Block is clean, remove from the hash table: */
934 if (pxblock)
935 pxblock->cb_next = block->cb_next;
936 else
937 seg->cs_hash_table[hash_idx] = block->cb_next;
938
939 xt_lock_mutex_ns(&ind_cac_globals.cg_lock);
940
941 /* Remove from the MRU list: */
942 if (ind_cac_globals.cg_lru_block == block)
943 ind_cac_globals.cg_lru_block = block->cb_mr_used;
944 if (ind_cac_globals.cg_mru_block == block)
945 ind_cac_globals.cg_mru_block = block->cb_lr_used;
946
947 /* Note, I am updating blocks for which I have no lock
948 * here. But I think this is OK because I have a lock
949 * for the MRU list.
950 */
951 if (block->cb_lr_used)
952 block->cb_lr_used->cb_mr_used = block->cb_mr_used;
953 if (block->cb_mr_used)
954 block->cb_mr_used->cb_lr_used = block->cb_lr_used;
955
956 /* The block is now free: */
957 block->cb_next = ind_cac_globals.cg_free_list;
958 ind_cac_globals.cg_free_list = block;
959 ind_cac_globals.cg_free_count++;
960 block->cb_state = IDX_CAC_BLOCK_FREE;
961 IDX_TRACE("%d- f%x\n", (int) XT_NODE_ID(address), (int) XT_GET_DISK_2(block->cb_data));
962
963 /* Unlock BEFORE the block is reused! */
964 XT_IPAGE_UNLOCK(&block->cb_lock, TRUE);
965
966 xt_unlock_mutex_ns(&ind_cac_globals.cg_lock);
967
968 IDX_CAC_UNLOCK(seg, ot->ot_thread);
969
970#ifdef DEBUG_CHECK_IND_CACHE
971 xt_ind_check_cache(NULL);
972#endif
973 return TRUE;
974}
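
Reviewer's sketch of how ind_free_block() above (and xt_ind_release_handle() and ind_cac_fetch()) map a node address and file ID to a cache segment and a hash bucket. CacheSlot and locate_block are hypothetical names, and SEGMENT_SHIFTS = 3 is an assumed value, since the definition of XT_INDEX_CACHE_SEGMENT_SHIFTS is not part of this excerpt:

```c
#include <stdint.h>

#define SEGMENT_SHIFTS 3u /* assumed; stands in for XT_INDEX_CACHE_SEGMENT_SHIFTS */
#define SEGMENT_COUNT  (1u << SEGMENT_SHIFTS)
#define SEGMENT_MASK   (SEGMENT_COUNT - 1u)

typedef struct CacheSlot {
	uint32_t segment; /* index into the segment array */
	uint32_t bucket;  /* index into that segment's hash table */
} CacheSlot;

/* hash_size must be non-zero. */
static CacheSlot locate_block(uint32_t node_id, uint32_t file_id, uint32_t hash_size)
{
	CacheSlot slot;
	uint32_t  h = node_id + file_id * 223u; /* 223 is the prime used in the diff */

	/* Low bits pick the segment, the remaining bits pick the bucket. */
	slot.segment = h & SEGMENT_MASK;
	slot.bucket = (h >> SEGMENT_SHIFTS) % hash_size;
	return slot;
}
```

Because the low bits select the segment and are then shifted away before the bucket is chosen, consecutive node IDs of one file spread across different segments, which is presumably what lets the per-segment locks reduce contention.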
975
976#define IND_CACHE_MAX_BLOCKS_TO_FREE 100
977
978/*
979 * Return the number of blocks freed.
980 *
981 * The idea is to grab a list of blocks to free.
982 * The list consists of the LRU blocks that are
983 * clean.
984 *
985 * Free as many as possible (up to max of blocks_required)
986 * from the list, even if LRU position has changed
987 * (or we have a race if there are too few blocks).
988 * However, if the block cannot be found, or is dirty
989 * we must skip it.
990 *
991 * Repeat until we find no blocks for the list, or
992 * we have freed 'blocks_required'.
993 *
994 * 'not_this' is a block that must not be freed because
995 * it is locked by the calling thread!
996 */
997static u_int ind_cac_free_lru_blocks(XTOpenTablePtr ot, u_int blocks_required, XTIdxBranchDPtr not_this)
998{
999 register DcGlobalsRec *dcg = &ind_cac_globals;
1000 XTIndBlockPtr to_free[IND_CACHE_MAX_BLOCKS_TO_FREE];
1001 int count;
1002 XTIndBlockPtr block;
1003 u_int blocks_freed = 0;
1004 XTIndBlockPtr locked_block;
1005
1006#ifdef XT_USE_DIRECT_IO_ON_INDEX
1007#error This will not work!
1008#endif
1009 locked_block = (XTIndBlockPtr) ((xtWord1 *) not_this - offsetof(XTIndBlockRec, cb_data));
1010
1011 retry:
1012 xt_lock_mutex_ns(&ind_cac_globals.cg_lock);
1013 block = dcg->cg_lru_block;
1014 count = 0;
1015 while (block && count < IND_CACHE_MAX_BLOCKS_TO_FREE) {
1016 if (block != locked_block && block->cb_state == IDX_CAC_BLOCK_CLEAN) {
1017 to_free[count] = block;
1018 count++;
1019 }
1020 block = block->cb_mr_used;
1021 }
1022 xt_unlock_mutex_ns(&ind_cac_globals.cg_lock);
1023
1024 if (!count)
1025 return blocks_freed;
1026
1027 for (int i=0; i<count; i++) {
1028 if (ind_free_block(ot, to_free[i]))
1029 blocks_freed++;
1030 if (blocks_freed >= blocks_required &&
1031 ind_cac_globals.cg_free_count >= ind_cac_globals.cg_max_free + blocks_required)
1032 return blocks_freed;
1033 }
1034
1035 goto retry;
1036}
1037
1038/*
1039 * -----------------------------------------------------------------------
1040 * MAIN CACHE FUNCTIONS
1041 */
1042
1043/*
1044 * Fetch the block. Note, if we are about to write the block
1045 * then there is no need to read it from disk!
1046 */
1047static XTIndBlockPtr ind_cac_fetch(XTOpenTablePtr ot, XTIndexPtr ind, xtIndexNodeID address, DcSegmentPtr *ret_seg, xtBool read_data)
1048{
1049 register XTOpenFilePtr file = ot->ot_ind_file;
1050 register XTIndBlockPtr block, new_block;
1051 register DcSegmentPtr seg;
1052 register u_int hash_idx;
1053 register DcGlobalsRec *dcg = &ind_cac_globals;
1054 size_t red_size;
1055
1056#ifdef DEBUG_CHECK_IND_CACHE
1057 xt_ind_check_cache(NULL);
1058#endif
1059 /* Address, plus file ID multiplied by my favorite prime number! */
1060 hash_idx = XT_NODE_ID(address) + (file->fr_id * 223);
1061 seg = &dcg->cg_segment[hash_idx & IDX_CAC_SEGMENT_MASK];
1062 hash_idx = (hash_idx >> XT_INDEX_CACHE_SEGMENT_SHIFTS) % dcg->cg_hash_size;
1063
1064 IDX_CAC_READ_LOCK(seg, ot->ot_thread);
1065 block = seg->cs_hash_table[hash_idx];
1066 while (block) {
1067 if (XT_NODE_ID(block->cb_address) == XT_NODE_ID(address) && block->cb_file_id == file->fr_id) {
1068 ASSERT_NS(block->cb_state != IDX_CAC_BLOCK_FREE);
1069
1070 /* Check how recently this page has been used: */
1071 if (XT_TIME_DIFF(block->cb_ru_time, dcg->cg_ru_now) > (dcg->cg_block_count >> 1)) {
1072 xt_lock_mutex_ns(&dcg->cg_lock);
1073
1074 /* Move to the front of the MRU list: */
1075 block->cb_ru_time = ++dcg->cg_ru_now;
1076 if (dcg->cg_mru_block != block) {
1077 /* Remove from the MRU list: */
1078 if (dcg->cg_lru_block == block)
1079 dcg->cg_lru_block = block->cb_mr_used;
1080 if (block->cb_lr_used)
1081 block->cb_lr_used->cb_mr_used = block->cb_mr_used;
1082 if (block->cb_mr_used)
1083 block->cb_mr_used->cb_lr_used = block->cb_lr_used;
1084
1085 /* Make the block the most recently used: */
1086 if ((block->cb_lr_used = dcg->cg_mru_block))
1087 dcg->cg_mru_block->cb_mr_used = block;
1088 block->cb_mr_used = NULL;
1089 dcg->cg_mru_block = block;
1090 if (!dcg->cg_lru_block)
1091 dcg->cg_lru_block = block;
1092 }
1093
1094 xt_unlock_mutex_ns(&dcg->cg_lock);
1095 }
1096
1097 *ret_seg = seg;
1098#ifdef DEBUG_CHECK_IND_CACHE
1099 xt_ind_check_cache(NULL);
1100#endif
1101 ot->ot_thread->st_statistics.st_ind_cache_hit++;
1102 return block;
1103 }
1104 block = block->cb_next;
1105 }
1106
1107 /* Block not found... */
1108 IDX_CAC_UNLOCK(seg, ot->ot_thread);
1109
1110 /* Check the open table reserve list first: */
1111 if ((new_block = ot->ot_ind_res_bufs)) {
1112 ot->ot_ind_res_bufs = new_block->cb_next;
1113 ot->ot_ind_res_count--;
1114#ifdef DEBUG_CHECK_IND_CACHE
1115 xt_lock_mutex_ns(&dcg->cg_lock);
1116 dcg->cg_reserved_by_ots--;
1117 dcg->cg_read_count++;
1118 xt_unlock_mutex_ns(&dcg->cg_lock);
1119#endif
1120 goto use_free_block;
1121 }
1122
1123 free_some_blocks:
1124 if (!dcg->cg_free_list) {
1125 if (!ind_cac_free_lru_blocks(ot, 1, NULL)) {
1126 if (!dcg->cg_free_list) {
1127 xt_register_xterr(XT_REG_CONTEXT, XT_ERR_NO_INDEX_CACHE);
1128#ifdef DEBUG_CHECK_IND_CACHE
1129 xt_ind_check_cache(NULL);
1130#endif
1131 return NULL;
1132 }
1133 }
1134 }
1135
1136 /* Get a free block: */
1137 xt_lock_mutex_ns(&dcg->cg_lock);
1138 if (!(new_block = dcg->cg_free_list)) {
1139 xt_unlock_mutex_ns(&dcg->cg_lock);
1140 goto free_some_blocks;
1141 }
1142 ASSERT_NS(new_block->cb_state == IDX_CAC_BLOCK_FREE);
1143 dcg->cg_free_list = new_block->cb_next;
1144 dcg->cg_free_count--;
1145#ifdef DEBUG_CHECK_IND_CACHE
1146 dcg->cg_read_count++;
1147#endif
1148 xt_unlock_mutex_ns(&dcg->cg_lock);
1149
1150 use_free_block:
1151 new_block->cb_address = address;
1152 new_block->cb_file_id = file->fr_id;
1153 new_block->cb_state = IDX_CAC_BLOCK_CLEAN;
1154 new_block->cb_handle_count = 0;
1155 new_block->cp_flush_seq = 0;
1156 new_block->cp_del_count = 0;
1157 new_block->cb_dirty_next = NULL;
1158 new_block->cb_dirty_prev = NULL;
1159
1160 if (read_data) {
1161 if (!xt_pread_file(file, xt_ind_node_to_offset(ot->ot_table, address), XT_INDEX_PAGE_SIZE, 0, new_block->cb_data, &red_size, &ot->ot_thread->st_statistics.st_ind, ot->ot_thread)) {
1162 xt_lock_mutex_ns(&dcg->cg_lock);
1163 new_block->cb_next = dcg->cg_free_list;
1164 dcg->cg_free_list = new_block;
1165 dcg->cg_free_count++;
1166#ifdef DEBUG_CHECK_IND_CACHE
1167 dcg->cg_read_count--;
1168#endif
1169 new_block->cb_state = IDX_CAC_BLOCK_FREE;
1170 IDX_TRACE("%d- F%x\n", (int) XT_NODE_ID(address), (int) XT_GET_DISK_2(new_block->cb_data));
1171 xt_unlock_mutex_ns(&dcg->cg_lock);
1172#ifdef DEBUG_CHECK_IND_CACHE
1173 xt_ind_check_cache(NULL);
1174#endif
1175 return NULL;
1176 }
1177 IDX_TRACE("%d- R%x\n", (int) XT_NODE_ID(address), (int) XT_GET_DISK_2(new_block->cb_data));
1178 ot->ot_thread->st_statistics.st_ind_cache_miss++;
1179 }
1180 else
1181 red_size = 0;
1182 // PMC - I don't think this is required! memset(new_block->cb_data + red_size, 0, XT_INDEX_PAGE_SIZE - red_size);
1183
1184 IDX_CAC_WRITE_LOCK(seg, ot->ot_thread);
1185 block = seg->cs_hash_table[hash_idx];
1186 while (block) {
1187 if (XT_NODE_ID(block->cb_address) == XT_NODE_ID(address) && block->cb_file_id == file->fr_id) {
1188 /* Oops, someone else was faster! */
1189 xt_lock_mutex_ns(&dcg->cg_lock);
1190 new_block->cb_next = dcg->cg_free_list;
1191 dcg->cg_free_list = new_block;
1192 dcg->cg_free_count++;
1193#ifdef DEBUG_CHECK_IND_CACHE
1194 dcg->cg_read_count--;
1195#endif
1196 new_block->cb_state = IDX_CAC_BLOCK_FREE;
1197 IDX_TRACE("%d- F%x\n", (int) XT_NODE_ID(address), (int) XT_GET_DISK_2(new_block->cb_data));
1198 xt_unlock_mutex_ns(&dcg->cg_lock);
1199 goto done_ok;
1200 }
1201 block = block->cb_next;
1202 }
1203 block = new_block;
1204
1205 /* Make the block the most recently used: */
1206 xt_lock_mutex_ns(&dcg->cg_lock);
1207 block->cb_ru_time = ++dcg->cg_ru_now;
1208 if ((block->cb_lr_used = dcg->cg_mru_block))
1209 dcg->cg_mru_block->cb_mr_used = block;
1210 block->cb_mr_used = NULL;
1211 dcg->cg_mru_block = block;
1212 if (!dcg->cg_lru_block)
1213 dcg->cg_lru_block = block;
1214#ifdef DEBUG_CHECK_IND_CACHE
1215 dcg->cg_read_count--;
1216#endif
1217 xt_unlock_mutex_ns(&dcg->cg_lock);
1218
1219 /* {LAZY-DEL-INDEX-ITEMS}
1220 * Conditionally count the number of deleted entries in the index:
1221 * We do this before other threads can read the block.
1222 */
1223 if (ind->mi_lazy_delete && read_data)
1224 xt_ind_count_deleted_items(ot->ot_table, ind, block);
1225
1226 /* Add to the hash table: */
1227 block->cb_next = seg->cs_hash_table[hash_idx];
1228 seg->cs_hash_table[hash_idx] = block;
1229
1230 done_ok:
1231 *ret_seg = seg;
1232#ifdef DEBUG_CHECK_IND_CACHE
1233 xt_ind_check_cache(NULL);
1234#endif
1235 return block;
1236}
1237
1238static xtBool ind_cac_get(XTOpenTablePtr ot, xtIndexNodeID address, DcSegmentPtr *ret_seg, XTIndBlockPtr *ret_block)
1239{
1240 register XTOpenFilePtr file = ot->ot_ind_file;
1241 register XTIndBlockPtr block;
1242 register DcSegmentPtr seg;
1243 register u_int hash_idx;
1244 register DcGlobalsRec *dcg = &ind_cac_globals;
1245
1246 hash_idx = XT_NODE_ID(address) + (file->fr_id * 223);
1247 seg = &dcg->cg_segment[hash_idx & IDX_CAC_SEGMENT_MASK];
1248 hash_idx = (hash_idx >> XT_INDEX_CACHE_SEGMENT_SHIFTS) % dcg->cg_hash_size;
1249
1250 IDX_CAC_READ_LOCK(seg, ot->ot_thread);
1251 block = seg->cs_hash_table[hash_idx];
1252 while (block) {
1253 if (XT_NODE_ID(block->cb_address) == XT_NODE_ID(address) && block->cb_file_id == file->fr_id) {
1254 ASSERT_NS(block->cb_state != IDX_CAC_BLOCK_FREE);
1255
1256 *ret_seg = seg;
1257 *ret_block = block;
1258 return OK;
1259 }
1260 block = block->cb_next;
1261 }
1262 IDX_CAC_UNLOCK(seg, ot->ot_thread);
1263
1264 /* Block not found: */
1265 *ret_seg = NULL;
1266 *ret_block = NULL;
1267 return OK;
1268}
1269
1270xtPublic xtBool xt_ind_write(XTOpenTablePtr ot, XTIndexPtr ind, xtIndexNodeID address, size_t size, xtWord1 *data)
1271{
1272 XTIndBlockPtr block;
1273 DcSegmentPtr seg;
1274
1275 if (!(block = ind_cac_fetch(ot, ind, address, &seg, FALSE)))
1276 return FAILED;
1277
1278 XT_IPAGE_WRITE_LOCK(&block->cb_lock, ot->ot_thread->t_id);
1279 ASSERT_NS(block->cb_state == IDX_CAC_BLOCK_CLEAN || block->cb_state == IDX_CAC_BLOCK_DIRTY);
1280 memcpy(block->cb_data, data, size);
1281 block->cp_flush_seq = ot->ot_table->tab_ind_flush_seq;
1282 if (block->cb_state != IDX_CAC_BLOCK_DIRTY) {
1283 TRACK_BLOCK_WRITE(offset);
1284 xt_spinlock_lock(&ind->mi_dirty_lock);
1285 if ((block->cb_dirty_next = ind->mi_dirty_list))
1286 ind->mi_dirty_list->cb_dirty_prev = block;
1287 block->cb_dirty_prev = NULL;
1288 ind->mi_dirty_list = block;
1289 ind->mi_dirty_blocks++;
1290 xt_spinlock_unlock(&ind->mi_dirty_lock);
1291 block->cb_state = IDX_CAC_BLOCK_DIRTY;
1292 }
1293 XT_IPAGE_UNLOCK(&block->cb_lock, TRUE);
1294 IDX_CAC_UNLOCK(seg, ot->ot_thread);
1295#ifdef XT_TRACK_INDEX_UPDATES
1296 ot->ot_ind_changed++;
1297#endif
1298 return OK;
1299}
1300
1301/*
1302 * Update the cache, if in RAM.
1303 */
1304xtPublic xtBool xt_ind_write_cache(XTOpenTablePtr ot, xtIndexNodeID address, size_t size, xtWord1 *data)
1305{
1306 XTIndBlockPtr block;
1307 DcSegmentPtr seg;
1308
1309 if (!ind_cac_get(ot, address, &seg, &block))
1310 return FAILED;
1311
1312 if (block) {
1313 XT_IPAGE_WRITE_LOCK(&block->cb_lock, ot->ot_thread->t_id);
1314 ASSERT_NS(block->cb_state == IDX_CAC_BLOCK_CLEAN || block->cb_state == IDX_CAC_BLOCK_DIRTY);
1315 memcpy(block->cb_data, data, size);
1316 XT_IPAGE_UNLOCK(&block->cb_lock, TRUE);
1317 IDX_CAC_UNLOCK(seg, ot->ot_thread);
1318 }
1319
1320 return OK;
1321}
1322
1323xtPublic xtBool xt_ind_clean(XTOpenTablePtr ot, XTIndexPtr ind, xtIndexNodeID address)
1324{
1325 XTIndBlockPtr block;
1326 DcSegmentPtr seg;
1327
1328 if (!ind_cac_get(ot, address, &seg, &block))
1329 return FAILED;
1330 if (block) {
1331 XT_IPAGE_WRITE_LOCK(&block->cb_lock, ot->ot_thread->t_id);
1332 ASSERT_NS(block->cb_state == IDX_CAC_BLOCK_CLEAN || block->cb_state == IDX_CAC_BLOCK_DIRTY);
1333
1334 if (block->cb_state == IDX_CAC_BLOCK_DIRTY) {
1335 /* Take the block off the dirty list: */
1336 xt_spinlock_lock(&ind->mi_dirty_lock);
1337 if (block->cb_dirty_next)
1338 block->cb_dirty_next->cb_dirty_prev = block->cb_dirty_prev;
1339 if (block->cb_dirty_prev)
1340 block->cb_dirty_prev->cb_dirty_next = block->cb_dirty_next;
1341 if (ind->mi_dirty_list == block)
1342 ind->mi_dirty_list = block->cb_dirty_next;
1343 ind->mi_dirty_blocks--;
1344 xt_spinlock_unlock(&ind->mi_dirty_lock);
1345 block->cb_state = IDX_CAC_BLOCK_CLEAN;
1346 }
1347 XT_IPAGE_UNLOCK(&block->cb_lock, TRUE);
1348
1349 IDX_CAC_UNLOCK(seg, ot->ot_thread);
1350 }
1351
1352 return OK;
1353}
1354
1355xtPublic xtBool xt_ind_read_bytes(XTOpenTablePtr ot, XTIndexPtr ind, xtIndexNodeID address, size_t size, xtWord1 *data)
1356{
1357 XTIndBlockPtr block;
1358 DcSegmentPtr seg;
1359
1360 if (!(block = ind_cac_fetch(ot, ind, address, &seg, TRUE)))
1361 return FAILED;
1362
1363 XT_IPAGE_READ_LOCK(&block->cb_lock);
1364 memcpy(data, block->cb_data, size);
1365 XT_IPAGE_UNLOCK(&block->cb_lock, FALSE);
1366 IDX_CAC_UNLOCK(seg, ot->ot_thread);
1367 return OK;
1368}
1369
1370xtPublic xtBool xt_ind_fetch(XTOpenTablePtr ot, XTIndexPtr ind, xtIndexNodeID address, XTPageLockType ltype, XTIndReferencePtr iref)
1371{
1372 register XTIndBlockPtr block;
1373 DcSegmentPtr seg;
1374 xtWord2 branch_size;
1375 xtBool xlock = FALSE;
1376
1377#ifdef DEBUG
1378 ASSERT_NS(iref->ir_xlock == 2);
1379 ASSERT_NS(iref->ir_updated == 2);
1380#endif
1381 if (!(block = ind_cac_fetch(ot, ind, address, &seg, TRUE)))
1382 return FAILED;
1383
1384 branch_size = XT_GET_DISK_2(((XTIdxBranchDPtr) block->cb_data)->tb_size_2);
1385 if (XT_GET_INDEX_BLOCK_LEN(branch_size) < 2 || XT_GET_INDEX_BLOCK_LEN(branch_size) > XT_INDEX_PAGE_SIZE) {
1386 IDX_CAC_UNLOCK(seg, ot->ot_thread);
1387 xt_register_taberr(XT_REG_CONTEXT, XT_ERR_INDEX_CORRUPTED, ot->ot_table->tab_name);
1388 return FAILED;
1389 }
1390
1391 switch (ltype) {
1392 case XT_LOCK_READ:
1393 break;
1394 case XT_LOCK_WRITE:
1395 xlock = TRUE;
1396 break;
1397 case XT_XLOCK_LEAF:
1398 if (!XT_IS_NODE(branch_size))
1399 xlock = TRUE;
1400 break;
1401 case XT_XLOCK_DEL_LEAF:
1402 if (!XT_IS_NODE(branch_size)) {
1403 if (ot->ot_table->tab_dic.dic_no_lazy_delete)
1404 xlock = TRUE;
1405 else {
1406 /*
1407 * {LAZY-DEL-INDEX-ITEMS}
1408 *
1409 * We are fetching a page in order to delete from it.
1410 * We decide here whether we plan to do a lazy delete,
1411 * or whether we plan to compact the node.
1412 *
1413 * A lazy delete just requires a shared lock.
1414 *
1415 */
1416 if (ind->mi_lazy_delete) {
1417 /* If the number of deleted items is greater than
1418 * half of the number of items that can fit in the
1419 * page, then we will compact the node.
1420 */
1421 if (!xt_idx_lazy_delete_on_leaf(ind, block, XT_GET_INDEX_BLOCK_LEN(branch_size)))
1422 xlock = TRUE;
1423 }
1424 else
1425 xlock = TRUE;
1426 }
1427 }
1428 break;
1429 }
1430
1431 if ((iref->ir_xlock = xlock))
1432 XT_IPAGE_WRITE_LOCK(&block->cb_lock, ot->ot_thread->t_id);
1433 else
1434 XT_IPAGE_READ_LOCK(&block->cb_lock);
1435
1436 IDX_CAC_UNLOCK(seg, ot->ot_thread);
1437
1438 /* {DIRECT-IO}
1439 * Direct I/O requires that the buffer is 512 byte aligned.
1440 * To do this, cb_data is turned into a pointer, instead
1441 * of an array.
1442 * As a result, we need to pass a pointer to both the
1443 * cache block and the cache block data:
1444 */
1445 iref->ir_updated = FALSE;
1446 iref->ir_block = block;
1447 iref->ir_branch = (XTIdxBranchDPtr) block->cb_data;
1448 return OK;
1449}
1450
1451xtPublic xtBool xt_ind_release(XTOpenTablePtr ot, XTIndexPtr ind, XTPageUnlockType XT_NDEBUG_UNUSED(utype), XTIndReferencePtr iref)
1452{
1453 register XTIndBlockPtr block;
1454
1455 block = iref->ir_block;
1456
1457#ifdef DEBUG
1458 ASSERT_NS(iref->ir_xlock != 2);
1459 ASSERT_NS(iref->ir_updated != 2);
1460 if (iref->ir_updated)
1461 ASSERT_NS(utype == XT_UNLOCK_R_UPDATE || utype == XT_UNLOCK_W_UPDATE);
1462 else
1463 ASSERT_NS(utype == XT_UNLOCK_READ || utype == XT_UNLOCK_WRITE);
1464 if (iref->ir_xlock)
1465 ASSERT_NS(utype == XT_UNLOCK_WRITE || utype == XT_UNLOCK_W_UPDATE);
1466 else
1467 ASSERT_NS(utype == XT_UNLOCK_READ || utype == XT_UNLOCK_R_UPDATE);
1468#endif
1469 if (iref->ir_updated) {
1470 /* The page was updated: */
1471 ASSERT_NS(block->cb_state == IDX_CAC_BLOCK_CLEAN || block->cb_state == IDX_CAC_BLOCK_DIRTY);
1472 block->cp_flush_seq = ot->ot_table->tab_ind_flush_seq;
1473 if (block->cb_state != IDX_CAC_BLOCK_DIRTY) {
1474 TRACK_BLOCK_WRITE(offset);
1475 xt_spinlock_lock(&ind->mi_dirty_lock);
1476 if ((block->cb_dirty_next = ind->mi_dirty_list))
1477 ind->mi_dirty_list->cb_dirty_prev = block;
1478 block->cb_dirty_prev = NULL;
1479 ind->mi_dirty_list = block;
1480 ind->mi_dirty_blocks++;
1481 xt_spinlock_unlock(&ind->mi_dirty_lock);
1482 block->cb_state = IDX_CAC_BLOCK_DIRTY;
1483 }
1484 }
1485
1486 XT_IPAGE_UNLOCK(&block->cb_lock, iref->ir_xlock);
1487#ifdef DEBUG
1488 iref->ir_xlock = 2;
1489 iref->ir_updated = 2;
1490#endif
1491 return OK;
1492}
1493
1494xtPublic xtBool xt_ind_reserve(XTOpenTablePtr ot, u_int count, XTIdxBranchDPtr not_this)
1495{
1496 register XTIndBlockPtr block;
1497 register DcGlobalsRec *dcg = &ind_cac_globals;
1498
1499#ifdef XT_TRACK_INDEX_UPDATES
1500 ot->ot_ind_reserved = count;
1501 ot->ot_ind_reads = 0;
1502#endif
1503#ifdef DEBUG_CHECK_IND_CACHE
1504 xt_ind_check_cache(NULL);
1505#endif
1506 while (ot->ot_ind_res_count < count) {
1507 if (!dcg->cg_free_list) {
1508 if (!ind_cac_free_lru_blocks(ot, count - ot->ot_ind_res_count, not_this)) {
1509 if (!dcg->cg_free_list) {
1510 xt_ind_free_reserved(ot);
1511 xt_register_xterr(XT_REG_CONTEXT, XT_ERR_NO_INDEX_CACHE);
1512#ifdef DEBUG_CHECK_IND_CACHE
1513 xt_ind_check_cache(NULL);
1514#endif
1515 return FAILED;
1516 }
1517 }
1518 }
1519
1520 /* Get a free block: */
1521 xt_lock_mutex_ns(&dcg->cg_lock);
1522 while (ot->ot_ind_res_count < count && (block = dcg->cg_free_list)) {
1523 ASSERT_NS(block->cb_state == IDX_CAC_BLOCK_FREE);
1524 dcg->cg_free_list = block->cb_next;
1525 dcg->cg_free_count--;
1526 block->cb_next = ot->ot_ind_res_bufs;
1527 ot->ot_ind_res_bufs = block;
1528 ot->ot_ind_res_count++;
1529#ifdef DEBUG_CHECK_IND_CACHE
1530 dcg->cg_reserved_by_ots++;
1531#endif
1532 }
1533 xt_unlock_mutex_ns(&dcg->cg_lock);
1534 }
1535#ifdef DEBUG_CHECK_IND_CACHE
1536 xt_ind_check_cache(NULL);
1537#endif
1538 return OK;
1539}
1540
1541xtPublic void xt_ind_free_reserved(XTOpenTablePtr ot)
1542{
1543#ifdef DEBUG_CHECK_IND_CACHE
1544 xt_ind_check_cache(NULL);
1545#endif
1546 if (ot->ot_ind_res_bufs) {
1547 register XTIndBlockPtr block, fblock;
1548 register DcGlobalsRec *dcg = &ind_cac_globals;
1549
1550 xt_lock_mutex_ns(&dcg->cg_lock);
1551 block = ot->ot_ind_res_bufs;
1552 while (block) {
1553 fblock = block;
1554 block = block->cb_next;
1555
1556 fblock->cb_next = dcg->cg_free_list;
1557 dcg->cg_free_list = fblock;
1558#ifdef DEBUG_CHECK_IND_CACHE
1559 dcg->cg_reserved_by_ots--;
1560#endif
1561 dcg->cg_free_count++;
1562 }
1563 xt_unlock_mutex_ns(&dcg->cg_lock);
1564 ot->ot_ind_res_bufs = NULL;
1565 ot->ot_ind_res_count = 0;
1566 }
1567#ifdef DEBUG_CHECK_IND_CACHE
1568 xt_ind_check_cache(NULL);
1569#endif
1570}
1571
1572xtPublic void xt_ind_unreserve(XTOpenTablePtr ot)
1573{
1574 if (!ind_cac_globals.cg_free_list)
1575 xt_ind_free_reserved(ot);
1576}
1577
01578
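Review note: the "make the block the most recently used" step in ind_cac_fetch() links a block in at the head of a doubly linked recency list (cb_mr_used / cb_lr_used, with cg_mru_block and cg_lru_block as the two ends). The following is a minimal standalone sketch of that linking, assuming a simplified block that carries only the two recency pointers. The names Block, RecencyList, recency_push_mru and recency_unlink are hypothetical and are not part of PBXT; the unlink helper is an assumption about how removal would look, since that code is not in this part of the diff. Locking (cg_lock in the original) is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified block: only the recency links are modelled. */
typedef struct Block {
    struct Block *mr_used; /* neighbour that was used more recently */
    struct Block *lr_used; /* neighbour that was used less recently */
} Block;

typedef struct RecencyList {
    Block *mru; /* most recently used end */
    Block *lru; /* least recently used end */
} RecencyList;

/* Link a block that is not yet on the list in at the MRU end. */
static void recency_push_mru(RecencyList *list, Block *block)
{
    assert(list != NULL && block != NULL);
    block->lr_used = list->mru;
    if (list->mru)
        list->mru->mr_used = block;
    block->mr_used = NULL;
    list->mru = block;
    if (!list->lru)
        list->lru = block; /* first block is both ends of the list */
}

/* Unlink a block from wherever it sits in the list (assumed helper). */
static void recency_unlink(RecencyList *list, Block *block)
{
    assert(list != NULL && block != NULL);
    if (block->lr_used)
        block->lr_used->mr_used = block->mr_used;
    else
        list->lru = block->mr_used; /* block was the LRU end */
    if (block->mr_used)
        block->mr_used->lr_used = block->lr_used;
    else
        list->mru = block->lr_used; /* block was the MRU end */
    block->mr_used = NULL;
    block->lr_used = NULL;
}
```

In the real cache both operations would have to run with the global cache mutex held, as the surrounding xt_lock_mutex_ns(&dcg->cg_lock) calls do.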
=== added file 'plugin/pbxt/src/cache_xt.h'
--- plugin/pbxt/src/cache_xt.h 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/src/cache_xt.h 2010-04-01 14:19:35 +0000
@@ -0,0 +1,188 @@
1/* Copyright (c) 2005 PrimeBase Technologies GmbH
2 *
3 * PrimeBase XT
4 *
5 * This program is free software; you can redistribute it and/or modify
6 * it under the terms of the GNU General Public License as published by
7 * the Free Software Foundation; either version 2 of the License, or
8 * (at your option) any later version.
9 *
10 * This program is distributed in the hope that it will be useful,
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 * GNU General Public License for more details.
14 *
15 * You should have received a copy of the GNU General Public License
16 * along with this program; if not, write to the Free Software
17 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
18 *
19 * 2005-05-24 Paul McCullagh
20 *
21 * H&G2JCtL
22 */
23#ifndef __xt_cache_h__
24#define __xt_cache_h__
25
26//#define XT_USE_MYSYS
27
28#include "filesys_xt.h"
29#include "index_xt.h"
30
31struct XTOpenTable;
32struct XTIdxReadBuffer;
33
34#ifdef DEBUG
35//#define XT_USE_CACHE_DEBUG_SIZES
36#endif
37
38#ifdef XT_USE_CACHE_DEBUG_SIZES
39#define XT_INDEX_CACHE_SEGMENT_SHIFTS 1
40#else
41#define XT_INDEX_CACHE_SEGMENT_SHIFTS 3
42#endif
43
44#define IDX_CAC_BLOCK_FREE 0
45#define IDX_CAC_BLOCK_CLEAN 1
46#define IDX_CAC_BLOCK_DIRTY 2
47
48#ifdef XT_NO_ATOMICS
49#define XT_IPAGE_USE_PTHREAD_RW
50#else
51//#define XT_IPAGE_USE_ATOMIC_RW
52#define XT_IPAGE_USE_SPINXSLOCK
53//#define XT_IPAGE_USE_SKEW_RW
54#endif
55
56#ifdef XT_IPAGE_USE_ATOMIC_RW
57#define XT_IPAGE_LOCK_TYPE XTAtomicRWLockRec
58#define XT_IPAGE_INIT_LOCK(s, i) xt_atomicrwlock_init_with_autoname(s, i)
59#define XT_IPAGE_FREE_LOCK(s, i) xt_atomicrwlock_free(s, i)
60#define XT_IPAGE_READ_LOCK(i) xt_atomicrwlock_slock(i)
61#define XT_IPAGE_WRITE_LOCK(i, o) xt_atomicrwlock_xlock(i, o)
62#define XT_IPAGE_UNLOCK(i, x) xt_atomicrwlock_unlock(i, x)
63#elif defined(XT_IPAGE_USE_PTHREAD_RW)
64#define XT_IPAGE_LOCK_TYPE xt_rwlock_type
65#define XT_IPAGE_INIT_LOCK(s, i) xt_init_rwlock(s, i)
66#define XT_IPAGE_FREE_LOCK(s, i) xt_free_rwlock(i)
67#define XT_IPAGE_READ_LOCK(i) xt_slock_rwlock_ns(i)
68#define XT_IPAGE_WRITE_LOCK(i, s) xt_xlock_rwlock_ns(i)
69#define XT_IPAGE_UNLOCK(i, x) xt_unlock_rwlock_ns(i)
70#elif defined(XT_IPAGE_USE_SPINXSLOCK)
71#define XT_IPAGE_LOCK_TYPE XTSpinXSLockRec
72#define XT_IPAGE_INIT_LOCK(s, i) xt_spinxslock_init_with_autoname(s, i)
73#define XT_IPAGE_FREE_LOCK(s, i) xt_spinxslock_free(s, i)
74#define XT_IPAGE_READ_LOCK(i) xt_spinxslock_slock(i)
75#define XT_IPAGE_WRITE_LOCK(i, o) xt_spinxslock_xlock(i, o)
76#define XT_IPAGE_UNLOCK(i, x) xt_spinxslock_unlock(i, x)
77#else // XT_IPAGE_USE_SKEW_RW
78#define XT_IPAGE_LOCK_TYPE XTSkewRWLockRec
79#define XT_IPAGE_INIT_LOCK(s, i) xt_skewrwlock_init_with_autoname(s, i)
80#define XT_IPAGE_FREE_LOCK(s, i) xt_skewrwlock_free(s, i)
81#define XT_IPAGE_READ_LOCK(i) xt_skewrwlock_slock(i)
82#define XT_IPAGE_WRITE_LOCK(i, o) xt_skewrwlock_xlock(i, o)
83#define XT_IPAGE_UNLOCK(i, x) xt_skewrwlock_unlock(i, x)
84#endif
85
86enum XTPageLockType { XT_LOCK_READ, XT_LOCK_WRITE, XT_XLOCK_LEAF, XT_XLOCK_DEL_LEAF };
87enum XTPageUnlockType { XT_UNLOCK_NONE, XT_UNLOCK_READ, XT_UNLOCK_WRITE, XT_UNLOCK_R_UPDATE, XT_UNLOCK_W_UPDATE };
88
89/* A block is X locked if it is being changed or freed.
90 * A block is S locked if it is being read.
91 */
92typedef struct XTIndBlock {
93 xtIndexNodeID cb_address; /* The block address. */
94 u_int cb_file_id; /* The file id of the block. */
95 /* This is protected by cs_lock */
96 struct XTIndBlock *cb_next; /* Pointer to next block on hash list, or next free block on free list. */
97 /* This is protected by mi_dirty_lock */
98 struct XTIndBlock *cb_dirty_next; /* Double link for dirty blocks, next pointer. */
99 struct XTIndBlock *cb_dirty_prev; /* Double link for dirty blocks, previous pointer. */
100 /* This is protected by cg_lock */
101 xtWord4 cb_ru_time; /* If this is in the top 1/4 don't change position in MRU list. */
102 struct XTIndBlock *cb_mr_used; /* More recently used blocks. */
103 struct XTIndBlock *cb_lr_used; /* Less recently used blocks. */
104 /* Protected by cb_lock: */
105 XT_IPAGE_LOCK_TYPE cb_lock;
106 xtWord1 cb_state; /* Block status. */
107 xtWord2 cb_handle_count; /* Non-zero if this page is referenced by one or more handles. */
108 xtWord2 cp_flush_seq;
109 xtWord2 cp_del_count; /* Number of deleted entries. */
110#ifdef XT_USE_DIRECT_IO_ON_INDEX
111 xtWord1 *cb_data;
112#else
113 xtWord1 cb_data[XT_INDEX_PAGE_SIZE];
114#endif
115} XTIndBlockRec, *XTIndBlockPtr;
116
117typedef struct XTIndReference {
118 xtBool ir_xlock; /* Set to TRUE if the cache block is X locked. */
119 xtBool ir_updated; /* Set to TRUE if the cache block is updated. */
120 XTIndBlockPtr ir_block;
121 XTIdxBranchDPtr ir_branch;
122} XTIndReferenceRec, *XTIndReferencePtr;
123
124typedef struct XTIndFreeBlock {
125 XTDiskValue1 if_zero1_1; /* Must be set to zero. */
126 XTDiskValue1 if_zero2_1; /* Must be set to zero. */
127 XTDiskValue1 if_status_1;
128 XTDiskValue1 if_unused1_1;
129 XTDiskValue4 if_unused2_4;
130 XTDiskValue8 if_next_block_8;
131} XTIndFreeBlockRec, *XTIndFreeBlockPtr;
132
133typedef struct XTIndHandleBlock {
134 xtWord4 hb_ref_count;
135 struct XTIndHandleBlock *hb_next;
136 XTIdxBranchDRec hb_branch;
137} XTIndHandleBlockRec, *XTIndHandleBlockPtr;
138
139typedef struct XTIndHandle {
140 struct XTIndHandle *ih_next;
141 struct XTIndHandle *ih_prev;
142 XTSpinLockRec ih_lock;
143 xtIndexNodeID ih_address;
144 xtBool ih_cache_reference; /* True if this handle references the cache. */
145 union {
146 XTIndBlockPtr ih_cache_block;
147 XTIndHandleBlockPtr ih_handle_block;
148 } x;
149 XTIdxBranchDPtr ih_branch;
150} XTIndHandleRec, *XTIndHandlePtr;
151
152void xt_ind_init(XTThreadPtr self, size_t cache_size);
153void xt_ind_exit(XTThreadPtr self);
154
155xtInt8 xt_ind_get_usage();
156xtInt8 xt_ind_get_size();
157xtBool xt_ind_write(struct XTOpenTable *ot, XTIndexPtr ind, xtIndexNodeID offset, size_t size, xtWord1 *data);
158xtBool xt_ind_write_cache(struct XTOpenTable *ot, xtIndexNodeID offset, size_t size, xtWord1 *data);
159xtBool xt_ind_clean(struct XTOpenTable *ot, XTIndexPtr ind, xtIndexNodeID offset);
160xtBool xt_ind_read_bytes(struct XTOpenTable *ot, XTIndexPtr ind, xtIndexNodeID offset, size_t size, xtWord1 *data);
161void xt_ind_check_cache(XTIndexPtr ind);
162xtBool xt_ind_reserve(struct XTOpenTable *ot, u_int count, XTIdxBranchDPtr not_this);
163void xt_ind_free_reserved(struct XTOpenTable *ot);
164void xt_ind_unreserve(struct XTOpenTable *ot);
165
166xtBool xt_ind_fetch(struct XTOpenTable *ot, XTIndexPtr ind, xtIndexNodeID node, XTPageLockType ltype, XTIndReferencePtr iref);
167xtBool xt_ind_release(struct XTOpenTable *ot, XTIndexPtr ind, XTPageUnlockType utype, XTIndReferencePtr iref);
168
169void xt_ind_lock_handle(XTIndHandlePtr handle);
170void xt_ind_unlock_handle(XTIndHandlePtr handle);
171xtBool xt_ind_copy_on_write(XTIndReferencePtr iref);
172
173XTIndHandlePtr xt_ind_get_handle(struct XTOpenTable *ot, XTIndexPtr ind, XTIndReferencePtr iref);
174void xt_ind_release_handle(XTIndHandlePtr handle, xtBool have_lock, XTThreadPtr thread);
175
176#ifdef DEBUG
177//#define DEBUG_CHECK_IND_CACHE
178#endif
179
180//#define XT_TRACE_INDEX
181
182#ifdef XT_TRACE_INDEX
183#define IDX_TRACE(x, y, z) xt_trace(x, y, z)
184#else
185#define IDX_TRACE(x, y, z)
186#endif
187
188#endif
0189
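Review note: XT_INDEX_CACHE_SEGMENT_SHIFTS controls how ind_cac_get() and ind_cac_fetch() split one hash value into a segment number (low bits) and a bucket index within that segment (remaining bits, modulo the hash table size). The sketch below restates that arithmetic in isolation. It assumes that IDX_CAC_SEGMENT_MASK is (1 << XT_INDEX_CACHE_SEGMENT_SHIFTS) - 1, which is not shown in this part of the diff; CacheSlot and cache_slot_for are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SEGMENT_SHIFTS 3                      /* mirrors XT_INDEX_CACHE_SEGMENT_SHIFTS */
#define SEGMENT_COUNT  (1u << SEGMENT_SHIFTS) /* assumption: 8 segments */
#define SEGMENT_MASK   (SEGMENT_COUNT - 1u)   /* assumed value of IDX_CAC_SEGMENT_MASK */

/* Hypothetical result type: which segment, and which bucket inside it. */
typedef struct CacheSlot {
    unsigned segment;
    unsigned bucket;
} CacheSlot;

/* Map a (node id, file id) pair to a segment and a hash bucket. */
static CacheSlot cache_slot_for(uint32_t node_id, uint32_t file_id, unsigned hash_size)
{
    CacheSlot slot;
    uint32_t h;

    assert(hash_size > 0);
    h = node_id + file_id * 223u;                    /* same mixing constant as the diff */
    slot.segment = h & SEGMENT_MASK;                 /* low bits pick the segment */
    slot.bucket = (h >> SEGMENT_SHIFTS) % hash_size; /* remaining bits pick the bucket */
    return slot;
}
```

For example, node 10 in file 1 gives h = 233, so the block lands in segment 1 and, with a 16-entry table, in bucket 13. Because each segment has its own lock, lookups for blocks in different segments do not contend with each other.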
=== added file 'plugin/pbxt/src/ccutils_xt.cc'
--- plugin/pbxt/src/ccutils_xt.cc 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/src/ccutils_xt.cc 2010-04-01 14:19:35 +0000
@@ -0,0 +1,69 @@
1/* Copyright (c) 2005 PrimeBase Technologies GmbH
2 *
3 * PrimeBase XT
4 *
5 * This program is free software; you can redistribute it and/or modify
6 * it under the terms of the GNU General Public License as published by
7 * the Free Software Foundation; either version 2 of the License, or
8 * (at your option) any later version.
9 *
10 * This program is distributed in the hope that it will be useful,
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 * GNU General Public License for more details.
14 *
15 * You should have received a copy of the GNU General Public License
16 * along with this program; if not, write to the Free Software
17 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
18 *
19 * 2006-05-16 Paul McCullagh
20 *
21 * H&G2JCtL
22 *
23 * C++ Utilities
24 */
25
26#include "xt_config.h"
27
28#include "pthread_xt.h"
29#include "ccutils_xt.h"
30#include "bsearch_xt.h"
31
32static int ccu_compare_object(XTThreadPtr XT_UNUSED(self), register const void *XT_UNUSED(thunk), register const void *a, register const void *b)
33{
34 XTObject *obj_ptr = (XTObject *) b;
35
36 return obj_ptr->compare(a);
37}
38
39void XTListImp::append(XTThreadPtr self, XTObject *info, void *key) {
40 size_t idx;
41
42 if (li_item_count == 0)
43 idx = 0;
44 else if (li_item_count == 1) {
45 int r;
46
47 if ((r = li_items[0]->compare(key)) == 0)
48 idx = 0;
49 else if (r < 0)
50 idx = 0;
51 else
52 idx = 1;
53 }
54 else {
55 xt_bsearch(self, key, li_items, li_item_count, sizeof(void *), &idx, NULL, ccu_compare_object);
56 }
57
58 if (!xt_realloc(NULL, (void **) &li_items, (li_item_count + 1) * sizeof(void *))) {
59 if (li_referenced)
60 info->release(self);
61 xt_throw_errno(XT_CONTEXT, XT_ENOMEM);
62 return;
63 }
64 memmove(&li_items[idx+1], &li_items[idx], (li_item_count-idx) * sizeof(void *));
65 li_items[idx] = info;
66 li_item_count++;
67}
68
69
070
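Review note: the keyed XTListImp::append() above finds an insertion index, grows the array by one slot, shifts the tail with memmove() and stores the new item. A minimal C sketch of the same technique on a plain int array follows. It is an illustration only: IntList, lower_bound and sorted_insert are hypothetical names, the standard realloc() stands in for xt_realloc(), and the search is an ordinary lower-bound binary search, which may differ in detail from what xt_bsearch() returns.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array of ints kept in ascending order. */
typedef struct IntList {
    int *items;
    size_t count;
} IntList;

/* Return the index of the first element that is >= key. */
static size_t lower_bound(const int *items, size_t count, int key)
{
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (items[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Insert value at its sorted position. Returns 1 on success, 0 if out of memory. */
static int sorted_insert(IntList *list, int value)
{
    size_t idx;
    int *grown;

    assert(list != NULL);
    idx = lower_bound(list->items, list->count, value);
    grown = realloc(list->items, (list->count + 1) * sizeof *grown);
    if (!grown)
        return 0; /* the old array is still valid and unchanged */
    list->items = grown;
    /* Shift the tail up by one slot to open a gap at idx. */
    memmove(&grown[idx + 1], &grown[idx], (list->count - idx) * sizeof *grown);
    grown[idx] = value;
    list->count++;
    return 1;
}
```

Growing by one element per insert keeps the code short but makes each insert O(n); that matches the original and is reasonable for the small lists it is used on.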
=== added file 'plugin/pbxt/src/ccutils_xt.h'
--- plugin/pbxt/src/ccutils_xt.h 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/src/ccutils_xt.h 2010-04-01 14:19:35 +0000
@@ -0,0 +1,220 @@
1/* Copyright (c) 2005 PrimeBase Technologies GmbH
2 *
3 * PrimeBase XT
4 *
5 * This program is free software; you can redistribute it and/or modify
6 * it under the terms of the GNU General Public License as published by
7 * the Free Software Foundation; either version 2 of the License, or
8 * (at your option) any later version.
9 *
10 * This program is distributed in the hope that it will be useful,
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 * GNU General Public License for more details.
14 *
15 * You should have received a copy of the GNU General Public License
16 * along with this program; if not, write to the Free Software
17 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
18 *
19 * 2006-05-16 Paul McCullagh
20 *
21 * H&G2JCtL
22 *
23 * C++ Utilities
24 */
25
26#ifndef __ccutils_xt_h__
27#define __ccutils_xt_h__
28
29#include <errno.h>
30
31#include "xt_defs.h"
32#include "thread_xt.h"
33
34class XTObject
35{
36 private:
37 u_int o_refcnt;
38
39 public:
40 inline XTObject() { o_refcnt = 1; }
41
42 virtual ~XTObject() { }
43
44 inline void reference() {
45 o_refcnt++;
46 }
47
48 inline void release(XTThreadPtr self) {
49 ASSERT(o_refcnt > 0);
50 o_refcnt--;
51 if (o_refcnt == 0) {
52 finalize(self);
53 delete this;
54 }
55 }
56
57 virtual XTObject *factory(XTThreadPtr self) {
58 XTObject *new_obj;
59
60 if (!(new_obj = new XTObject()))
61 xt_throw_errno(XT_CONTEXT, XT_ENOMEM);
62 return new_obj;
63 }
64
65 virtual XTObject *clone(XTThreadPtr self) {
66 XTObject *new_obj;
67
68 new_obj = factory(self);
69 new_obj->init(self, this);
70 return new_obj;
71 }
72
73 virtual void init(XTThreadPtr self) { (void) self; }
74 virtual void init(XTThreadPtr self, XTObject *obj) { (void) obj; init(self); }
75 virtual void finalize(XTThreadPtr self) { (void) self; }
76 virtual int compare(const void *key) { (void) key; return -1; }
77};
78
79class XTListImp
80{
81 protected:
82 bool li_referenced;
83 u_int li_item_count;
84 XTObject **li_items;
85
86 public:
87 inline XTListImp() : li_referenced(true), li_item_count(0), li_items(NULL) { }
88
89 inline void setNonReferenced() { li_referenced = false; }
90
91 void append(XTThreadPtr self, XTObject *info) {
92 if (!xt_realloc(NULL, (void **) &li_items, (li_item_count + 1) * sizeof(void *))) {
93 if (li_referenced)
94 info->release(self);
95 xt_throw_errno(XT_CONTEXT, XT_ENOMEM);
96 return;
97 }
98 li_items[li_item_count] = info;
99 li_item_count++;
100 }
101
102 void insert(XTThreadPtr self, XTObject *info, u_int i) {
103 if (!xt_realloc(NULL, (void **) &li_items, (li_item_count + 1) * sizeof(void *))) {
104 if (li_referenced)
105 info->release(self);
106 xt_throw_errno(XT_CONTEXT, XT_ENOMEM);
107 return;
108 }
109 memmove(&li_items[i+1], &li_items[i], (li_item_count-i) * sizeof(XTObject *));
110 li_items[i] = info;
111 li_item_count++;
112 }
113
114 void addToFront(XTThreadPtr self, XTObject *info) {
115 insert(self, info, 0);
116 }
117
118 /* Inserts in sorted order, using key to find the position. */
119 void append(XTThreadPtr self, XTObject *info, void *key);
120
121 inline bool remove(XTObject *info) {
122 for (u_int i=0; i<li_item_count; i++) {
123 if (li_items[i] == info) {
124 li_item_count--;
125 memmove(&li_items[i], &li_items[i+1], (li_item_count - i) * sizeof(XTObject *));
126 return true;
127 }
128 }
129 return false;
130 }
131
132 inline bool remove(XTThreadPtr self, u_int i) {
133 XTObject *item;
134
135 if (i >= li_item_count)
136 return false;
137 item = li_items[i];
138 li_item_count--;
139 memmove(&li_items[i], &li_items[i+1], (li_item_count - i) * sizeof(void *));
140 if (li_referenced)
141 item->release(self);
142 return true;
143 }
144
145 inline XTObject *take(u_int i) {
146 XTObject *item;
147
148 if (i >= li_item_count)
149 return NULL;
150 item = li_items[i];
151 li_item_count--;
152 memmove(&li_items[i], &li_items[i+1], (li_item_count - i) * sizeof(void *));
153 return item;
154 }
155
156 inline u_int size() const { return li_item_count; }
157
158 inline void setEmpty(XTThreadPtr self) {
159 if (li_items)
160 xt_free(self, li_items);
161 li_item_count = 0;
162 li_items = NULL;
163 }
164
165 inline bool isEmpty() { return li_item_count == 0; }
166
167 inline XTObject *itemAt(u_int i) const {
168 if (i >= li_item_count)
169 return NULL;
170 return li_items[i];
171 }
172};
173
174
175template <class T> class XTList : public XTListImp
176{
177 public:
178 inline XTList() : XTListImp() { }
179
180 inline void append(XTThreadPtr self, T *a) { XTListImp::append(self, a); }
181 inline void insert(XTThreadPtr self, T *a, u_int i) { XTListImp::insert(self, a, i); }
182 inline void addToFront(XTThreadPtr self, T *a) { XTListImp::addToFront(self, a); }
183
184 inline bool remove(T *a) { return XTListImp::remove(a); }
185
186 inline bool remove(XTThreadPtr self, u_int i) { return XTListImp::remove(self, i); }
187
188 inline T *take(u_int i) { return (T *) XTListImp::take(i); }
189
190 inline T *itemAt(u_int i) const { return (T *) XTListImp::itemAt(i); }
191
192 inline u_int indexOf(T *a) {
193 u_int i;
194
195 for (i=0; i<size(); i++) {
196 if (itemAt(i) == a)
197 break;
198 }
199 return i;
200 }
201
202 void deleteAll(XTThreadPtr self)
203 {
204 for (u_int i=0; i<size(); i++) {
205 if (li_referenced)
206 itemAt(i)->release(self);
207 }
208 setEmpty(self);
209 }
210
211 void clone(XTThreadPtr self, XTListImp *list)
212 {
213 deleteAll(self);
214 for (u_int i=0; i<list->size(); i++) {
215 XTListImp::append(self, list->itemAt(i)->clone(self));
216 }
217 }
218};
219
220#endif
0221
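Review note: XTObject starts with a reference count of 1, and release() calls finalize() and deletes the object when the count drops to zero. The same life cycle can be sketched in plain C as below. This is an illustration under stated assumptions, not PBXT code: Object, object_new, object_reference, object_release and the counting finalizer are hypothetical names, and like the original it is not thread safe.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object with an optional finalizer. */
typedef struct Object {
    unsigned refcnt;
    void (*finalize)(struct Object *self);
} Object;

/* Counter used only to make the finalizer call observable. */
static int finalized_count = 0;

static void count_finalize(Object *self)
{
    (void) self;
    finalized_count++;
}

/* Create an object that the caller owns one reference to. */
static Object *object_new(void (*finalize)(Object *))
{
    Object *obj = malloc(sizeof *obj);

    if (obj) {
        obj->refcnt = 1;
        obj->finalize = finalize;
    }
    return obj;
}

static void object_reference(Object *obj)
{
    obj->refcnt++;
}

/* Drop one reference. Returns 1 if the object was destroyed, else 0. */
static int object_release(Object *obj)
{
    assert(obj->refcnt > 0);
    if (--obj->refcnt == 0) {
        if (obj->finalize)
            obj->finalize(obj);
        free(obj);
        return 1;
    }
    return 0;
}
```

This mirrors why XTListImp releases the item on an allocation failure when li_referenced is set: the list was handed a reference, so it must drop that reference if it cannot store the item.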
=== added file 'plugin/pbxt/src/database_xt.cc'
--- plugin/pbxt/src/database_xt.cc 1970-01-01 00:00:00 +0000
+++ plugin/pbxt/src/database_xt.cc 2010-04-01 14:19:35 +0000
@@ -0,0 +1,1314 @@
1/* Copyright (c) 2005 PrimeBase Technologies GmbH
2 *
3 * PrimeBase XT
4 *
5 * This program is free software; you can redistribute it and/or modify
6 * it under the terms of the GNU General Public License as published by
7 * the Free Software Foundation; either version 2 of the License, or
8 * (at your option) any later version.
9 *
10 * This program is distributed in the hope that it will be useful,
11 * but WITHOUT ANY WARRANTY; without even the implied warranty of
12 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
13 * GNU General Public License for more details.
14 *
15 * You should have received a copy of the GNU General Public License
16 * along with this program; if not, write to the Free Software
17 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
18 *
19 * 2005-01-15 Paul McCullagh
20 *
21 * H&G2JCtL
22 */
23
24#include "xt_config.h"
25
26#ifdef DRIZZLED
27#include <bitset>
28#endif
29
30#include <string.h>
31#include <stdio.h>
32
33#include "pthread_xt.h"
34#include "hashtab_xt.h"
35#include "filesys_xt.h"
36#include "database_xt.h"
37#include "memory_xt.h"
38#include "heap_xt.h"
39#include "datalog_xt.h"
40#include "strutil_xt.h"
41#include "util_xt.h"
42#include "trace_xt.h"
43
44#ifdef DEBUG
45//#define XT_TEST_XACT_OVERFLOW
46#endif
47
48#ifndef NAME_MAX
49#define NAME_MAX 128
50#endif
51
52/*
53 * -----------------------------------------------------------------------
54 * GLOBALS
55 */
56
57xtPublic XTDatabaseHPtr pbxt_database = NULL; // The global open database
58
59xtPublic xtLogOffset xt_db_log_file_threshold;
60xtPublic size_t xt_db_log_buffer_size;
61xtPublic size_t xt_db_transaction_buffer_size;
62xtPublic size_t xt_db_checkpoint_frequency;
63xtPublic off_t xt_db_data_log_threshold;
64xtPublic size_t xt_db_data_file_grow_size;
65xtPublic size_t xt_db_row_file_grow_size;
66xtPublic int xt_db_garbage_threshold;
67xtPublic int xt_db_log_file_count;
68xtPublic int xt_db_auto_increment_mode; /* 0 = MySQL compatible, 1 = PrimeBase compatible. */
69xtPublic int xt_db_offline_log_function; /* 0 = recycle logs, 1 = delete logs, 2 = keep logs */
70xtPublic int xt_db_sweeper_priority; /* 0 = low (default), 1 = normal, 2 = high */
71
72xtPublic XTSortedListPtr xt_db_open_db_by_id = NULL;
73xtPublic XTHashTabPtr xt_db_open_databases = NULL;
74xtPublic time_t xt_db_approximate_time = 0; /* A "fast" alternative timer (not too accurate). */
75
76static xtDatabaseID db_next_id = 1;
77static volatile XTOpenFilePtr db_lock_file = NULL;
78
79/*
80 * -----------------------------------------------------------------------
81 * LOCK/UNLOCK INSTALLATION
82 */
83
84xtPublic void xt_lock_installation(XTThreadPtr self, char *installation_path)
85{
86 char file_path[PATH_MAX];
87 char buffer[101];
88 size_t red_size;
89 llong pid;
90 xtBool cd = pbxt_crash_debug;
91
92 xt_strcpy(PATH_MAX, file_path, installation_path);
93 xt_add_pbxt_file(PATH_MAX, file_path, "no-debug");
94 if (xt_fs_exists(file_path))
95 pbxt_crash_debug = FALSE;
96 xt_strcpy(PATH_MAX, file_path, installation_path);
97 xt_add_pbxt_file(PATH_MAX, file_path, "crash-debug");
98 if (xt_fs_exists(file_path))
99 pbxt_crash_debug = TRUE;
100
101 if (pbxt_crash_debug != cd) {
102 if (pbxt_crash_debug)
103 xt_logf(XT_NT_WARNING, "Crash debugging has been turned on ('crash-debug' file exists)\n");
104 else
105 xt_logf(XT_NT_WARNING, "Crash debugging has been turned off ('no-debug' file exists)\n");
106 }
107 else if (pbxt_crash_debug)
108 xt_logf(XT_NT_WARNING, "Crash debugging is enabled\n");
109
110 /* Moved the lock file out of the pbxt directory so that
111 * it is possible to drop the pbxt database!
112 */
113 xt_strcpy(PATH_MAX, file_path, installation_path);
114 xt_add_dir_char(PATH_MAX, file_path);
115 xt_strcat(PATH_MAX, file_path, "pbxt-lock");
116 db_lock_file = xt_open_file(self, file_path, XT_FS_CREATE | XT_FS_MAKE_PATH);
117
118 try_(a) {
119 if (!xt_lock_file(self, db_lock_file)) {
120 xt_logf(XT_NT_ERROR, "A server appears to already be running\n");
121 xt_logf(XT_NT_ERROR, "The file: %s, is locked\n", file_path);
122 xt_throw_xterr(XT_CONTEXT, XT_ERR_SERVER_RUNNING);
123 }
124 if (!xt_pread_file(db_lock_file, 0, 100, 0, buffer, &red_size, &self->st_statistics.st_rec, self))
125 xt_throw(self);
126 if (red_size > 0) {
127 buffer[red_size] = 0;
128#ifdef XT_WIN
129 pid = (llong) _atoi64(buffer);
130#else
131 pid = atoll(buffer);
132#endif
133 /* The problem with this check is that process IDs are
134 * reused after a restart.
135 * If some system process grabs the process ID that
136 * the server had on the last run, then
137 * the database will not start.
138 if (xt_process_exists((xtProcID) pid)) {
139 xt_logf(XT_NT_ERROR, "A server appears to already be running, process ID: %lld\n", pid);
140 xt_logf(XT_NT_ERROR, "Remove the file: %s, if this is not the case\n", file_path);
141 xt_throw_xterr(XT_CONTEXT, XT_ERR_SERVER_RUNNING);
142 }
143 */
144 xt_logf(XT_NT_INFO, "The server was not shut down correctly, recovery required\n");
145#ifdef XT_BACKUP_BEFORE_RECOVERY
146 if (pbxt_crash_debug) {
147 /* The server was not shut down correctly. Make a backup before
148 * we start recovery.
149 */
150 char extension[100];
151
152 for (int i=1;;i++) {
153 xt_strcpy(PATH_MAX, file_path, installation_path);
154 xt_remove_dir_char(file_path);
155 sprintf(extension, "-recovery-%d", i);
156 xt_strcat(PATH_MAX, file_path, extension);
157 if (!xt_fs_exists(file_path))
158 break;
159 }
160 xt_logf(XT_NT_INFO, "In order to reproduce recovery errors a backup of the installation\n");
161 xt_logf(XT_NT_INFO, "will be made to:\n");
162 xt_logf(XT_NT_INFO, "%s\n", file_path);
163 xt_logf(XT_NT_INFO, "Copy in progress...\n");
164 xt_fs_copy_dir(self, installation_path, file_path);
165 xt_logf(XT_NT_INFO, "Copy OK\n");
166 }
167#endif
168 }
169
170 sprintf(buffer, "%lld", (llong) xt_getpid());
171 xt_set_eof_file(self, db_lock_file, 0);
172 if (!xt_pwrite_file(db_lock_file, 0, strlen(buffer), buffer, &self->st_statistics.st_rec, self))
173 xt_throw(self);
174 }
175 catch_(a) {
176 xt_close_file(self, db_lock_file);
177 db_lock_file = NULL;
178 xt_throw(self);
179 }
180 cont_(a);
181}
182
183xtPublic void xt_unlock_installation(XTThreadPtr self, char *installation_path)
184{
185 if (db_lock_file) {
186 char lock_file[PATH_MAX];
187
188 xt_unlock_file(NULL, db_lock_file);
189 xt_close_file_ns(db_lock_file);
190 db_lock_file = NULL;
191
192 xt_strcpy(PATH_MAX, lock_file, installation_path);
193 xt_add_dir_char(PATH_MAX, lock_file);
194 xt_strcat(PATH_MAX, lock_file, "pbxt-lock");
195 xt_fs_delete(self, lock_file);
196 }
197}
198
199int *xt_bad_pointer = 0;
200
201void xt_crash_me(void)
202{
203 if (pbxt_crash_debug)
204 *xt_bad_pointer = 123;
205}
206
207/*
208 * -----------------------------------------------------------------------
209 * INIT/EXIT DATABASE
210 */
211
212static xtBool db_hash_comp(void *key, void *data)
213{
214 XTDatabaseHPtr db = (XTDatabaseHPtr) data;
215
216 return strcmp((char *) key, db->db_name) == 0;
217}
218
219static xtHashValue db_hash(xtBool is_key, void *key_data)
220{
221 XTDatabaseHPtr db = (XTDatabaseHPtr) key_data;
222
223 if (is_key)
224 return xt_ht_hash((char *) key_data);
225 return xt_ht_hash(db->db_name);
226}
227
228static xtBool db_hash_comp_ci(void *key, void *data)
229{
230 XTDatabaseHPtr db = (XTDatabaseHPtr) data;
231
232 return strcasecmp((char *) key, db->db_name) == 0;
233}
234
235static xtHashValue db_hash_ci(xtBool is_key, void *key_data)
236{
237 XTDatabaseHPtr db = (XTDatabaseHPtr) key_data;
238
239 if (is_key)
240 return xt_ht_casehash((char *) key_data);
241 return xt_ht_casehash(db->db_name);
242}
243
244static void db_hash_free(XTThreadPtr self, void *data)
245{
246 xt_heap_release(self, (XTDatabaseHPtr) data);
247}
248
249static int db_cmp_db_id(struct XTThread *XT_UNUSED(self), register const void *XT_UNUSED(thunk), register const void *a, register const void *b)
250{
251 xtDatabaseID db_id = *((xtDatabaseID *) a);
252 XTDatabaseHPtr *db_ptr = (XTDatabaseHPtr *) b;
253
254 if (db_id == (*db_ptr)->db_id)
255 return 0;
256 if (db_id < (*db_ptr)->db_id)
257 return -1;
258 return 1;
259}
260
261xtPublic void xt_init_databases(XTThreadPtr self)
262{
263 if (pbxt_ignore_case)
264 xt_db_open_databases = xt_new_hashtable(self, db_hash_comp_ci, db_hash_ci, db_hash_free, TRUE, TRUE);
265 else
266 xt_db_open_databases = xt_new_hashtable(self, db_hash_comp, db_hash, db_hash_free, TRUE, TRUE);
267 xt_db_open_db_by_id = xt_new_sortedlist(self, sizeof(XTDatabaseHPtr), 20, 10, db_cmp_db_id, NULL, NULL, FALSE, FALSE);
268}
269
270xtPublic void xt_stop_database_threads(XTThreadPtr self, xtBool sync)
271{
272 u_int len = 0;
273 XTDatabaseHPtr *dbptr;
274 XTDatabaseHPtr db = NULL;
275
276 if (xt_db_open_db_by_id)
277 len = xt_sl_get_size(xt_db_open_db_by_id);
278 for (u_int i=0; i<len; i++) {
279 if ((dbptr = (XTDatabaseHPtr *) xt_sl_item_at(xt_db_open_db_by_id, i))) {
280 db = *dbptr;
281 if (sync) {
282 /* Wait for the sweeper: */
283 xt_wait_for_sweeper(self, db, 16);
284
285 /* Wait for the writer: */
286 xt_wait_for_writer(self, db);
287
The diff has been truncated for viewing.