=== added file 'doc/BasicAPI.rst'
--- doc/BasicAPI.rst	1970-01-01 00:00:00 +0000
+++ doc/BasicAPI.rst	2012-05-30 18:23:19 +0000
@@ -0,0 +1,332 @@
.. _Basic-API:

Basic API
=========

Akiban Persistit stores data as key-value pairs in highly optimized B-Trees. (Technically, Persistit implements `B-Link <http://www.cs.cornell.edu/courses/cs4411/2009sp/blink.pdf>`_ trees for greater concurrency.) Like a Java Map implementation, Persistit associates at most one value with each unique instance of a Key value.

Persistit provides classes and methods to access and modify keys and their associated values. Application code calls Persistit API methods to store, fetch, traverse and remove keys and records to and from the database.

Persistit permits efficient multi-threaded concurrent access to database volumes. It is designed to minimize contention for critical resources and to maximize throughput on multi-processor machines. Concurrent ACID transactions are supported with multi-version concurrency control (MVCC).

The Persistit Instance
----------------------

To access Persistit, the application first constructs an instance of the ``com.persistit.Persistit`` class and initializes it. This instance is the keystone of all subsequent operations.  It holds references to the buffers, maps, transaction contexts and other structures needed to access B-trees. The life cycle of a Persistit instance should be managed as follows:

.. code-block:: java

  final Persistit db = new Persistit();
  //
  // register any Coder and Renderer instances before initialization
  //
  db.initialize(configProperties);
  try {
      // do application work
  } finally {
      db.close();
  }

The ``configProperties`` describe the memory allocation, initial set of volumes, the journal, and other settings needed to get Persistit started. See :ref:`Configuration` for details.

The ``com.persistit.Persistit#close`` method gracefully flushes all modified data to disk, stops background threads and unregisters JMX MBeans.

.. note::

  The Persistit background threads are not daemon threads, so if your application returns
  from its static main method without calling ``close``, the JVM will not automatically exit.

Although normal shutdown should always invoke ``close``, Persistit is designed to recover a consistent database state in the event of an abrupt shutdown or crash. See :ref:`Recovery`.

.. _Key:

Key
---

The content of a ``com.persistit.Key`` is the unique identifier for a key/value pair within a tree. Internally a ``Key`` contains an array of bytes that constitute the physical identity of the key/value pair within a tree. Logically, a key consists of a sequence of zero or more Java values, each of which is called a *key segment*.

The following value types are implicitly supported in keys::

  null
  boolean (and Boolean)
  byte (and Byte)
  short (and Short)
  char (and Character)
  int (and Integer)
  long (and Long)
  float (and Float)
  double (and Double)
  java.lang.String
  java.math.BigInteger
  java.math.BigDecimal
  java.util.Date
  byte[]

In addition, you may register custom implementations of the ``com.persistit.encoding.KeyCoder`` interface to support encoding of other object classes. By default, String values are encoded in UTF-8 format.

Appending and Decoding Key Segments
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``Key`` class provides methods to encode and decode each of these types to and from a key segment. For each type listed above, there is an ``append`` method, a ``to`` method and a ``decode`` method. For example, for the long type, there are the methods:

.. code-block:: java

  public void append(long v)
  public void to(long v)
  public long decodeLong()

The ``to`` method replaces the final key segment with a different value (unless the key is empty, in which case it works the same as ``append``).

For example:

.. code-block:: java

  key.clear();                      // clear any previous key segments
  key.append("Atlantic");           // append segment "Atlantic"
  key.to("Pacific");                // replace "Atlantic" with "Pacific"
  key.reset();                      // reset index to beginning
  String s = key.decodeString();    // s contains "Pacific"

The Key class also provides methods to encode and decode Object values to and from a key. Strings, Dates, objects of the corresponding wrapper classes for the primitive types listed above, and objects supported by registered instances of ``com.persistit.encoding.KeyCoder`` are permitted. Primitive values are automatically boxed and unboxed as needed. The following code fragment demonstrates key manipulation with automatic conversion of primitive types and their wrappers.

.. code-block:: java

  key.clear();                        // clear any previous key segments
  key.append(Integer.valueOf(1234));  // append a boxed Integer
  key.append("Atlantic");
  key.append(1.23d);
  key.reset();                        // reset index to beginning for decoding
  int v = key.decodeInt();            // v will be 1234
  String s = (String) key.decode();   // s will be "Atlantic"
  Double d = (Double) key.decode();   // d will be 1.23d as a Double

In this code segment, an object of type Integer is appended to the key’s value sequence, and then the same value is later decoded as a primitive int value. A String is appended and then decoded into a String. Finally, a primitive double value is appended and then decoded as an object of class Double.

The maximum size of a serialized ``Key`` is 2,047 bytes.

For further information, see ``com.persistit.Key``.
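
The cursor semantics described in this section (``append`` adds a segment, ``to`` replaces the final one, ``reset`` rewinds, ``decode`` reads a segment and advances) can be sketched with a toy class. This is only an illustration under stated assumptions: ``ToyKey`` and its ``depth`` method are hypothetical names, and the class keeps segments in a list, whereas the real ``com.persistit.Key`` encodes them into a byte array.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a segmented key. NOT com.persistit.Key: names and storage are assumptions.
final class ToyKey {
    private final List<Object> segments = new ArrayList<>();
    private int index; // position of the next segment to decode

    ToyKey clear() { segments.clear(); index = 0; return this; }

    // append adds a new final segment
    ToyKey append(Object v) { segments.add(v); return this; }

    // to replaces the final segment, or behaves like append when the key is empty
    ToyKey to(Object v) {
        if (segments.isEmpty()) {
            return append(v);
        }
        segments.set(segments.size() - 1, v);
        return this;
    }

    // reset moves the decode cursor back to the first segment
    ToyKey reset() { index = 0; return this; }

    // decode returns the segment at the cursor and advances the cursor
    Object decode() { return segments.get(index++); }

    int depth() { return segments.size(); }

    public static void main(String[] args) {
        ToyKey key = new ToyKey().clear().append("Atlantic").to("Pacific").reset();
        System.out.println(key.decode());
    }
}
```

The point of the sketch is the difference between ``append`` and ``to``: after ``append("Atlantic")`` followed by ``to("Pacific")`` the key still has a single segment.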

.. _Value:

Value
-----

A ``com.persistit.Value`` object holds a value. Unlike keys, Value objects have no restriction on the types of data they can represent, and they can hold much larger objects. In particular, a Value may contain null, any of the primitive types, or an object of any class.

The backing store of a ``Value`` is a byte array that is written to a B-Tree data page, or in the case of a long record, multiple pages. The ``com.persistit.Value#put`` method variants encode (serialize) a Java primitive or Object value into the backing store, and the ``com.persistit.Value#get`` method variants decode (deserialize) the value.

For example, in ``HelloWorld.java``, the line

.. code-block:: java

  dbex.getValue().put("World");

serializes the String “World”, and the expression

.. code-block:: java

  dbex.getValue().get()

decodes it. Persistit does not intrinsically cache decoded object values, nor does it track an object's state changes.  Each call to the ``get()`` method returns a new instance of the object. However, you can use a ``com.persistit.encoding.ObjectCache`` to cache object values. ``ObjectCache`` is designed specifically to cache objects fetched from Persistit.

Value Types
^^^^^^^^^^^

``Value`` provides optimized predefined representations for the following types::

  null
  all primitive types
  all arrays
  java.math.BigInteger
  java.math.BigDecimal
  java.lang.String
  java.util.Date

In general, Persistit uses one of four mechanisms to encode a Java value into a Value object:

- If the value is one of the predefined types listed above, Persistit uses its own internal serialization logic.
- If there is a registered ``com.persistit.encoding.ValueCoder`` for the object's class, Persistit delegates to it.
- If enabled, Persistit uses an accelerated serialization/deserialization mechanism to encode and decode objects.
- Otherwise, for classes that implement ``java.io.Serializable``, Persistit attempts to perform default Java serialization and deserialization.

A Value may also be in the undefined state, which results from performing a fetch operation on a key for which no value is present in the database. The undefined state is distinct from the value ``null`` and can be tested with the ``isDefined()`` method.

See :ref:`Serialization` for additional information.
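
The difference between an undefined Value and a stored ``null`` resembles the difference, in ``java.util.Map``, between an absent key and a key mapped to ``null``. The following sketch uses only the JDK to show that analogy; ``UndefinedVersusNull``, ``State`` and ``classify`` are hypothetical names and none of this is Persistit code.

```java
import java.util.HashMap;
import java.util.Map;

// Analogy only: a HashMap stands in for a tree, containsKey for an isDefined-style test.
final class UndefinedVersusNull {
    enum State { UNDEFINED, NULL, PRESENT }

    // Classifies a lookup: absent key, key stored with null, or key stored with a value.
    static State classify(Map<String, Object> store, String key) {
        if (!store.containsKey(key)) {
            return State.UNDEFINED; // nothing was ever stored under this key
        }
        return store.get(key) == null ? State.NULL : State.PRESENT;
    }

    public static void main(String[] args) {
        Map<String, Object> store = new HashMap<>();
        store.put("a", null);
        store.put("b", "World");
        System.out.println(classify(store, "a") + " " + classify(store, "b") + " " + classify(store, "c"));
    }
}
```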

Large Values
^^^^^^^^^^^^

Persistit can store large values, up to 64MB in size in the current version. For example, it is possible to store an image’s backing bytes as a single value in the database. The size of the value to be stored is constrained by available heap memory; the entire value must be serialized into an in-memory byte array in order for Persistit to store or retrieve it. Use ``com.persistit.Value#setMaximumSize`` to specify the size constraint. Large values are broken up across multiple data pages and are not necessarily stored in contiguous file areas.

The definition of “large” depends on the configuration properties. For example, for a volume with a page size of 16K bytes the threshold occurs at 6,108 bytes. A value having a serialized size smaller than this is stored in a single data page while a larger value is broken up and stored in multiple pages. For a smaller page size the threshold is lower.

On occasion it may be desirable to fetch only part of a large value. For example, it may be useful to extract summary information from the beginning of the backing byte array for an Image. Variants of the ``fetch`` and ``traverse`` methods accept a minimum byte count parameter. When these methods are used, only the specified minimum number of bytes of the backing store are retrieved from the database. This technique can prevent Persistit from reading large numbers of pages from the disk in order to examine only a small portion of the record.
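
To make the idea of reading only a prefix concrete, here is a JDK-only sketch. It assumes a made-up record layout (an 8-byte header holding width and height, followed by pixel bytes) and copies just the header bytes. ``PartialReadSketch``, ``fetchPrefix`` and the layout are hypothetical; the sketch does not call Persistit and only mimics what a minimum byte count buys you.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class PartialReadSketch {
    static final int HEADER_BYTES = 8; // hypothetical header: two 4-byte ints

    // Builds a fake "image" record: width, height, then width * height zeroed pixel bytes.
    static byte[] encode(int width, int height) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_BYTES + width * height);
        buf.putInt(width).putInt(height);
        return buf.array();
    }

    // Stands in for a fetch limited to a minimum byte count: only a prefix is copied.
    static byte[] fetchPrefix(byte[] stored, int minimumBytes) {
        return Arrays.copyOf(stored, Math.min(minimumBytes, stored.length));
    }

    // Decodes the summary information from the prefix alone.
    static int[] dimensions(byte[] prefix) {
        ByteBuffer buf = ByteBuffer.wrap(prefix);
        return new int[] { buf.getInt(), buf.getInt() };
    }

    public static void main(String[] args) {
        byte[] prefix = fetchPrefix(encode(640, 480), HEADER_BYTES);
        int[] dims = dimensions(prefix);
        System.out.println(dims[0] + "x" + dims[1] + " from " + prefix.length + " bytes");
    }
}
```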

.. _Exchange:

Exchange
--------

The primary low-level interface for interacting with Persistit is ``com.persistit.Exchange``. The Exchange class provides all methods for storing, deleting, fetching and traversing key/value pairs. These methods are summarized here and described in detail in the Javadoc API documentation.

An Exchange instance contains references to a ``Key`` and a ``Value``. The methods ``com.persistit.Exchange.getKey()`` and ``com.persistit.Exchange.getValue()`` access these instances.

To construct an Exchange you specify a Volume (or alias) and a tree name in its constructor. The constructor will optionally create a new tree in that Volume if a tree having the specified name has not already been created. An application may construct an arbitrary number of Exchange objects. Creating a new Exchange has no effect on the database if the specified tree already exists. Tree creation is thread-safe: multiple threads concurrently constructing Exchanges using the same Tree name will safely result in the creation of only one new tree.

An Exchange is a moderately complex object that can consume tens of kilobytes to megabytes (depending on the sizes of the Key and Value) of heap space. Memory-constrained applications should construct Exchanges in moderation.

Persistit offers Exchange pooling to avoid rapidly creating and destroying Exchange objects in multi-threaded applications.  An application may use the ``com.persistit.Persistit#getExchange`` and ``com.persistit.Persistit#releaseExchange`` methods to take and return an Exchange from and to a thread-local pool.

An Exchange internally maintains some optimization information such that references to nearby Keys within a tree are accelerated. Performance may benefit from using a different Exchange for each area of the Tree being accessed.
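
The thread-local pooling idea can be sketched generically with the JDK. The class below is a minimal sketch, not Persistit's implementation: ``ThreadLocalPool``, ``take``, ``release`` and ``idleCount`` are hypothetical names, and ``T`` stands in for any expensive object that must not be shared between threads.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Generic thread-local pool; each thread sees only the instances it released itself.
final class ThreadLocalPool<T> {
    private final ThreadLocal<Deque<T>> pool = ThreadLocal.withInitial(ArrayDeque::new);
    private final Supplier<T> factory;

    ThreadLocalPool(Supplier<T> factory) {
        this.factory = factory;
    }

    // Reuses an idle instance owned by the calling thread, or creates a new one.
    T take() {
        T pooled = pool.get().pollFirst();
        return pooled != null ? pooled : factory.get();
    }

    // Returns an instance to the calling thread's pool for later reuse.
    void release(T item) {
        pool.get().addFirst(item);
    }

    // Number of idle instances held for the calling thread.
    int idleCount() {
        return pool.get().size();
    }

    public static void main(String[] args) {
        ThreadLocalPool<StringBuilder> pool = new ThreadLocalPool<>(StringBuilder::new);
        StringBuilder a = pool.take();
        pool.release(a);
        System.out.println(pool.take() == a);
    }
}
```

Because each thread has its own deque, no locking is needed and a pooled object is never handed to a different thread, which matters for objects that are not thread-safe.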

Concurrent Operations on Exchanges
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Although the underlying Persistit database is designed for highly concurrent multi-threaded operation, the ``Exchange`` class and its associated ``Key`` and ``Value`` instances are *not* thread-safe. Each thread should acquire and use its own Exchange objects when accessing the database. Nonetheless, multiple threads can execute database operations on overlapping data concurrently using their thread-private ``Exchange`` instances.

Because Persistit permits concurrent operations by multiple threads, there is no guarantee that the underlying database will remain unchanged after an Exchange fetches or modifies its data. However, each operation on an Exchange is atomic, meaning that the inputs and outputs of each method are consistent with some valid state of the underlying Persistit backing store at some instant in time. The Exchange’s Value and Key objects represent that consistent state even if another thread subsequently modifies the database. Transactions, described below, allow multiple database operations to be performed atomically and consistently.

Exchange API
^^^^^^^^^^^^

An Exchange has permanent references to a ``com.persistit.Key`` and a ``com.persistit.Value``. Typically you work with an Exchange in one of the following patterns:

- Modify the Key, perform a ``fetch`` operation, and extract the Value.
- Modify the Key, modify the Value, and then perform a ``store`` operation.
- Modify the Key, and then perform a ``remove`` operation.
- Optionally modify the Key, perform a ``traverse`` operation, then read the resulting Key and/or Value.

These four methods, plus a few other methods listed here, are the primary low-level interface to the database. Semantics are as follows:

``fetch``
    Reads the stored value associated with this Exchange's Key and modifies the Exchange’s Value to reflect that value.
``store``
    Stores the Exchange's key/value pair in the Tree, replacing the former value if there was one and inserting a new pair otherwise.
``fetchAndStore``
    Reads and then replaces the stored value. Upon completion, Value reflects the formerly stored value for the current Key. This operation is atomic.
``remove``, ``removeAll``, ``removeKeyRange``
    Removes key/value pairs from the Tree. Variants of these methods specify either a single key or a range of keys to be removed.
``fetchAndRemove``
    Fetches and then removes the stored value. Upon completion, Value reflects the formerly stored value for the current Key. This operation is atomic.
``traverse``, ``next``, ``previous``
    Modifies the Exchange’s Key and Value to reflect a successor or predecessor key within the tree. See ``com.persistit.Key`` for detailed information on the order of traversal.
``hasNext``, ``hasPrevious``
    Indicates, without modifying the Exchange’s Value or Key objects, whether there is a successor or predecessor key in the Tree.
``hasChildren``
    Indicates whether there are records having keys that are logical children. A *logical child* of some key *P* is any key that can be constructed by appending one or more key segments to *P*.

For convenience, Exchange delegates ``append`` and ``to`` methods to ``com.persistit.Key``. For example, Exchange provides the following methods that delegate to the identically named methods of Key:

.. code-block:: java

  public Exchange append(long v)
  public Exchange append(String v)
  ...

To allow call chaining, these methods of Exchange return the same Exchange. For example, it is valid to write code such as

.. code-block:: java

  exchange.clear().append("Pacific").append("Ocean").append(123).fetch();

This example fetches the value associated with the concatenated key ``{"Pacific", "Ocean", 123}``.

Exchange also delegates other key manipulation methods. (See ``com.persistit.Exchange`` for detailed API documentation.)

Traversing and Querying Collections of Data
-------------------------------------------

An Exchange provides a number of methods for traversing a collection of records in the Persistit database. These include variations of the ``com.persistit.Exchange#traverse``, ``com.persistit.Exchange#next`` and ``com.persistit.Exchange#previous`` methods. For all of these methods, Persistit does two things: it modifies the Exchange's ``Key`` to reflect a new key that is before or after the current key, and it modifies the ``Value`` associated with the Exchange to reflect the database value associated with that key.

For example, this code from ``HelloWorld.java`` prints out the key and value of each record in a tree:

.. code-block:: java

  dbex.getKey().to(Key.BEFORE);
  while (dbex.next()) {
      System.out.println(
          dbex.getKey().indexTo(0).decode() + " " +
          dbex.getValue().get());
  }

In general, the traversal methods let you find a key in a tree related to the key you supply. In Persistit programs you frequently prime a key value by appending either ``com.persistit.Key#BEFORE`` or ``com.persistit.Key#AFTER``. A key containing either of these special values can never be stored in a tree; these are reserved to represent positions in key traversal order before the first valid key and after the last valid key, respectively. You then invoke ``next`` or ``previous``, or any of the other ``traverse`` family variants, to enumerate keys within the tree.

You can specify whether traversal is *deep* or *shallow*.  Deep traversal traverses the logical children (see ``com.persistit.Key``) of a key. Shallow traversal traverses only the logical siblings.
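
The difference between deep and shallow traversal can be illustrated with a JDK-only model in which a key is a list of string segments held in a sorted set. This is a sketch of one plausible reading of the paragraph above, not Persistit's traversal code: ``TraversalSketch``, ``ORDER``, ``deep`` and ``shallow`` are hypothetical names, and the assumed ordering (segment by segment, with a key sorting before its logical children) is a simplification.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

final class TraversalSketch {
    // Orders keys segment by segment; a key sorts before its logical children.
    static final Comparator<List<String>> ORDER = (a, b) -> {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            int c = a.get(i).compareTo(b.get(i));
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.size(), b.size());
    };

    // A logical child of parent is any longer key that starts with parent's segments.
    static boolean isChildOf(List<String> key, List<String> parent) {
        return key.size() > parent.size() && key.subList(0, parent.size()).equals(parent);
    }

    // Deep: every logical child of parent, at any depth.
    static List<List<String>> deep(TreeSet<List<String>> tree, List<String> parent) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> k : tree.tailSet(parent, false)) {
            if (!isChildOf(k, parent)) {
                break; // children are contiguous in this ordering
            }
            out.add(k);
        }
        return out;
    }

    // Shallow: only the distinct siblings one segment below parent.
    static List<List<String>> shallow(TreeSet<List<String>> tree, List<String> parent) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> k : deep(tree, parent)) {
            List<String> sibling = k.subList(0, parent.size() + 1);
            if (out.isEmpty() || !out.get(out.size() - 1).equals(sibling)) {
                out.add(new ArrayList<>(sibling));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TreeSet<List<String>> tree = new TreeSet<>(ORDER);
        tree.add(List.of("USA", "MA", "Boston"));
        tree.add(List.of("USA", "NY", "Albany"));
        System.out.println(deep(tree, List.of("USA")));
        System.out.println(shallow(tree, List.of("USA")));
    }
}
```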

.. _KeyFilter:

Selecting key values with a KeyFilter
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A ``com.persistit.KeyFilter`` defines a subset of all possible key values. For example, a KeyFilter can select keys with certain fixed segment values, sets of values or ranges of values.  Calling ``traverse``, ``next`` or ``previous`` with a KeyFilter efficiently traverses the subset of all keys in a Tree that match the filter.

You construct a KeyFilter either by adding selection terms to it, or by calling the ``com.persistit.KeyParser#parseKeyFilter`` method of the ``com.persistit.KeyParser`` class to construct one from a string representation.

Use of a KeyFilter is illustrated by the following code fragment:

.. code-block:: java

  Exchange ex = new Exchange("myVolume", "myTree", true);
  KeyFilter kf = new KeyFilter("{\"Bellini\":\"Britten\"}");
  ex.append(Key.BEFORE);
  while (ex.next(kf)) {
      System.out.println(ex.getKey().reset().decodeString());
  }

This simple example emits the string-valued keys within Tree “myTree” whose values fall alphabetically between “Bellini” and “Britten”, inclusive.

You will find an example with a KeyFilter in the examples/FindFileDemo directory.
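
For readers who think in terms of the Java collections, the effect of the range filter in the fragment above is comparable to taking an inclusive sub-range of a sorted set. The following is an analogy only, written against ``java.util.TreeSet``; ``RangeFilterSketch`` and ``inRange`` are hypothetical names and the composer names are sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

final class RangeFilterSketch {
    // Returns the keys in [from, to], both ends inclusive, in ascending order.
    static List<String> inRange(TreeSet<String> keys, String from, String to) {
        return new ArrayList<>(keys.subSet(from, true, to, true));
    }

    public static void main(String[] args) {
        TreeSet<String> keys = new TreeSet<>(
            Arrays.asList("Bach", "Bellini", "Berlioz", "Britten", "Chopin"));
        System.out.println(inRange(keys, "Bellini", "Britten"));
    }
}
```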

.. _PersistitMap:

PersistitMap
------------

In addition to low-level access methods on keys and values, Persistit provides ``com.persistit.PersistitMap``, which implements the ``java.util.SortedMap`` interface. PersistitMap uses the Persistit database as a backing store so that key/value pairs are persistent, potentially shared with all threads, and limited in number only by disk storage.

Keys and Values for PersistitMap must conform to the constraints described above under :ref:`Key` and :ref:`Value`.

The constructor for PersistitMap takes an Exchange as its sole parameter. All key/value pairs of the Map are stored within the tree identified by this Exchange. The Key supplied by the Exchange becomes the root of a logical tree. For example:

.. code-block:: java

  Exchange ex = new Exchange("myVolume", "myTree", true);
  ex.append("USA").append("MA");
  PersistitMap<String, String> map = new PersistitMap<String, String>(ex);
  map.put("Boston", "Hub");

places a key/value pair into Tree “myTree” with the concatenated key ``{"USA","MA","Boston"}`` and a value ``"Hub"``.

Generally the expected behavior for an Iterator on a Map collection view is to throw a ``ConcurrentModificationException`` if the underlying collection changes. This is known as “fail-fast” behavior. PersistitMap implements this behavior by throwing a ``ConcurrentModificationException`` in the event the Tree containing the map changes after the Iterator is constructed.

However, sometimes it may be desirable to use PersistitMap and its collections view interfaces to iterate across changing data, especially for large databases. PersistitMap provides the method ``com.persistit.PersistitMap#setAllowConcurrentModification`` to control whether changes made by other threads are permitted. By default, concurrent modifications are not allowed.

.. note:: When ``PersistitMap`` is used within a transaction, updates generated by other concurrent transactions are not visible and
   therefore cannot cause a ``ConcurrentModificationException``.  However, to avoid unpredictable results, an Iterator created within the scope
   of a transaction must be used only within that transaction.
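
The fail-fast behavior that PersistitMap mirrors by default is the standard behavior of the JDK's own sorted map, which the following sketch demonstrates with ``java.util.TreeMap``. It does not use PersistitMap; ``FailFastSketch`` and ``failsFast`` are hypothetical names chosen for the illustration.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.TreeMap;

final class FailFastSketch {
    // Returns true if the iterator fails fast after the map changes underneath it.
    static boolean failsFast() {
        TreeMap<String, String> map = new TreeMap<>();
        map.put("Boston", "Hub");
        map.put("Salem", "Witch City");
        Iterator<String> it = map.keySet().iterator();
        it.next();
        map.put("Lowell", "Mill City"); // structural change made outside the iterator
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsFast());
    }
}
```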

Exceptions in PersistitMap
^^^^^^^^^^^^^^^^^^^^^^^^^^

Persistit operations throw a variety of exceptions that are subclasses of ``com.persistit.exception.PersistitException``. However, the methods of the SortedMap interface do not permit arbitrary checked exceptions to be thrown. Therefore, PersistitMap wraps any PersistitException generated by the underlying database methods within a ``com.persistit.PersistitMap.PersistitMapException``. This exception is unchecked and can therefore be thrown by methods of the Map interface. Applications using PersistitMap should catch and handle ``PersistitMap.PersistitMapException``.
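
The wrapping pattern described here is a common Java idiom and can be sketched without Persistit. In the sketch below ``StoreException`` and ``StoreMapException`` are hypothetical stand-ins for ``PersistitException`` and ``PersistitMapException``; ``CheckedOp``, ``unchecked`` and ``describe`` are likewise made-up names.

```java
final class WrapSketch {
    // Hypothetical checked exception, standing in for PersistitException.
    static class StoreException extends Exception {
        StoreException(String message) {
            super(message);
        }
    }

    // Hypothetical unchecked wrapper, standing in for PersistitMapException.
    static class StoreMapException extends RuntimeException {
        StoreMapException(StoreException cause) {
            super(cause);
        }
    }

    // An operation that may throw the checked exception.
    interface CheckedOp<T> {
        T run() throws StoreException;
    }

    // Runs the operation, rethrowing any checked failure as the unchecked wrapper.
    static <T> T unchecked(CheckedOp<T> op) {
        try {
            return op.run();
        } catch (StoreException e) {
            throw new StoreMapException(e);
        }
    }

    // Caller side: catch the unchecked wrapper and inspect its cause.
    static String describe(CheckedOp<String> op) {
        try {
            return unchecked(op);
        } catch (StoreMapException e) {
            return "failed: " + e.getCause().getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(() -> "Hub"));
        System.out.println(describe(() -> { throw new StoreException("disk full"); }));
    }
}
```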

Applying a KeyFilter to a PersistitMap Iterator
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can specify a ``com.persistit.KeyFilter`` for the Iterator returned by the ``keySet()``, ``entrySet()`` and ``values()`` methods of ``com.persistit.PersistitMap``.  The KeyFilter restricts the range of keys traversed by the Iterator. To set the KeyFilter, you must cast the Iterator to the inner class ``PersistitMap.ExchangeIterator``, as shown here:

.. code-block:: java

  PersistitMap map = new PersistitMap(exchange);
  PersistitMap.ExchangeIterator iterator =
      (PersistitMap.ExchangeIterator) map.entrySet().iterator();
  iterator.setFilterTerm(KeyFilter.rangeTerm("A", "M"));

In this example, the iterator will only access String-valued keys between “A” and “M”.

=== removed file 'doc/BasicAPI.txt'
--- doc/BasicAPI.txt	2012-04-30 22:09:31 +0000
+++ doc/BasicAPI.txt	1970-01-01 00:00:00 +0000
@@ -1,320 +0,0 @@
[[BasicAPI]]
= Basic API

Akiban Persistit stores data as key-value pairs in highly optimized B-Tree footnote:[Technically, Persistit implements http://www.cs.cornell.edu/courses/cs4411/2009sp/blink.pdf[B-Link] trees for greater concurrency]. Like a Java Map implementation, Persistit associates at most one value with each unique instance of a Key value.

Persistit provides classes and methods to access and modify keys and their associated values. Application code calls Persistit API methods to store, fetch, traverse and remove keys and records to and from the database.

Persistit permits efficient multi-threaded concurrent access to database volumes. It is designed to minimize contention for critical resources and to maximize throughput on multi-processor machines. Concurrent ACID transactions are supported with multi-value concurrency control (MVCC).

== The Persistit Instance

To access Persistit, the application first constructs an instance of the +com.persistit.Persistit+ class and initializes it. This instance is the keystone of all subsequent operations.  It holds references to the buffers, maps, transaction contexts and other structures needed to access B-trees. The life cycle of a Persistit instance should be managed as follows:

[source,java]
----
final Persistit db = new Persistit();
//
// register any Coder and Renderer instances before initialization
//
db.initialize(configProperties);
try {
    // do application work
} finally {
    db.close();
}
----

The +configProperties+ describe the memory allocation, initial set of volumes, the journal, and other settings needed to get Persistit started. See <<Configuration>> for details.

The +com.persistit.Persistit#close+ method gracefully flushes all modified data to disk, stops background threads and unregisters JMX MBeans.
****
The Persistit background threads are not daemon threads, so if your application returns from its static main method without calling +close+, the JVM will not automically exit.
****

Although normal shutdown should always invoke +close+, Persistit is designed to recover a consistent database state in the event of an abrupt shutdown or crash. See <<Recovery>>.

[[Key]]
== Key

The content of a +com.persistit.Key+ is the unique identifier for a key/value pair within a tree. Internally a +Key+ contains an array of bytes that constitute the physical identity of the key/value pair within a tree. Logically, a key consists of a sequence of zero or more Java values, each of which is called a _key segment_.

The following value types are implicitly supported in keys:

.Types Supported in +com.persistit.Key+
----
null
boolean (and Boolean)
byte (and Byte)
short (and Short)
char (and Character)
int (and Integer)
long (and Long)
float (and Float)
double (and Double)
java.lang.String
java.math.BigInteger
java.math.BigDecimal
java.util.Date
byte[]
----

In addition, you may register custom implementations of the +com.persistit.encoding.KeyCoder+ interface to support encoding of other object classes. By default, String values are encoded in UTF-8 format.
****
TODO: Collation
****

=== Appending and Decoding Key Segments

The +Key+ class provides methods to encode and decode each of these types to and from a key segment. For each type listed above, there is an +append+ method, a +to+ method and a +decode+ method. For example, for the long type, there are methods

[source,java]
----
public void append(long v)
public void to(long v)
public long decodeLong()
----

The +to+ methods replaces the final key segment with a different value (unless the key is empty, in which case it works the same as +append+).

For example:

[source,java]
----
key.clear();             // clear any previous key segments
key.append("Atlantic");  // append segment "Atlantic"
key.to("Pacific");       // replace "Atlantic" with "Pacific"
key.reset();             // reset index to beginning
String s = key.decode(); // s contains "Pacific"
----

The Key class also provides methods to encode and decode Object values to and from a key. Strings, Dates, objects of the corresponding wrapper classes for the primitive types listed above, and objects supported by registered instances of +com.persistit.encoding.KeyCoder+ are permitted. Primitive values are automatically boxed and unboxed as needed. The following code fragment demonstrates key manipulation with automatic conversion of primitive types and their wrappers.

[source,java]
----
key.clear();                     // clear any previous key segments
key.append(new Integer(1234));
key.append("Atlantic");
key.append(1.23d);
key.reset();                     // reset index to beginning for decoding
int v = key.decodeInt();         // v will be 1234
String s = (String)key.decode(); // s will be "Atlantic"
Double d = (Double)decode();    // d will be 1.23d as a Double
----

In this code segment, an object of type Integer is appended to the key’s value sequence, and then the same value is later decoded as a primitive int value. A String is appended and then decoded into a String. Finally, a primitive double value is appended and then decoded as an object of class Double.

The maximum size of a serialized +Key+ is 2,047 bytes.

For further information, see +com.persistit.Key+.
451
110
452
111
453
112
[[Value]]
454
113
== Value
455
114
456
115
A +com.persistit.Value+ object holds a value. Unlike keys, Value objects have no restriction on the types of data they can represent, and they can hold much larger objects. In particular, a Value may contain null, any of the primitive types, or an object of any class.
457
116
458
117
The backing store of a +Value+ is a byte array that is written to a B-Tree data page, or in the case of a long record, multiple pages. The +com.persistit.Value#put+ method variants encode (serialize) a Java primitive or Object value into the backing store, and the +com.persistit.Value#get+ method variants decode (deserialize) the value.
459
118
460
119
For example, in +HelloWorld.java+, the line
461
120
462
121
[source,java]
463
122
----
464
123
dbex.getValue().put("World");
465
124
----
466
125
467
126
serializes the String “World”, and the expression
468
127
469
128
[source,java]
470
129
----
471
130
dbex.getValue().get()
472
131
----
473
132
474
133
decodes it. Persistit does not intrinsically cache decoded object values, nor does it track an object's state changes.  Each call to the +get()+ method returns a new instance of the object. However, you can use a +com.persistit.encoding.ObjectCache+ to cache object values. +ObjectCache+ is designed specifically to cache objects fetched from Persistit.
475
134
476
135
=== Value Types
477
136
478
137
+Value+ provides optimized predefined representations for the following types:
479
138
480
139
.Types Implicitly Supported by +com.persistit.Value+
481
140
----
482
141
null
483
142
all primitive types
484
143
all arrays
485
144
java.math.BigInteger
486
145
java.math.BigDecimal
487
146
java.lang.String
488
147
java.util.Date
489
148
----
490
149
In general, Persistit uses one of four mechanisms to encode a Java value into a Value object:
491
150
492
151
- If the value is one of the predefined types listed above, Persistit uses its own internal serialization logic.
- If there is a registered +com.persistit.encoding.ValueCoder+ for the object's class, Persistit delegates to it.
- If enabled, Persistit uses an accelerated serialization/deserialization mechanism to encode and decode objects.
- Otherwise, for classes that implement +java.io.Serializable+, Persistit attempts to perform default Java serialization and deserialization.
496
155
497
156
A Value may also be in the undefined state, which results from performing a fetch operation on a key for which no value is present in the database. The undefined state is distinct from the value +null+ and can be tested with the +isDefined()+ method.
498
157
499
158
See <<Serialization>> for additional information.
500
159
501
160
=== Large Values
502
161
503
162
Persistit can store large values; in the current version a value can be up to 64MB in size. For example, it is possible to store an image’s backing bytes as a single value in the database. The size of the value to be stored is constrained by available heap memory: the entire value must be serialized into an in-memory byte array in order for Persistit to store or retrieve it. Use +com.persistit.Value#setMaximumSize+ to specify the size constraint. Large values are broken up across multiple data pages and are not necessarily stored in contiguous file areas.
504
163
505
164
The definition of “large” depends on the configuration properties. For example, for a volume with a page size of 16K bytes the threshold occurs at 6,108 bytes. A value having a serialized size smaller than this is stored in a single data page, while a larger value is broken up and stored in multiple pages. For a smaller page size the threshold is lower.
506
165
507
166
On occasion it may be desirable to fetch only part of a large value. For example, it may be useful to extract summary information from the beginning of the backing byte array for an Image. Variants of the +fetch+ and +traverse+ methods accept a minimum byte count parameter. When these methods are used, only the specified minimum number of bytes of the backing store is retrieved from the database. This technique can prevent Persistit from reading large numbers of pages from the disk in order to examine only a small portion of the record.
508
167
509
168
[[Exchange]]
510
169
== Exchange
511
170
512
171
The primary low-level interface for interacting with Persistit is +com.persistit.Exchange+. The Exchange class provides all methods for storing, deleting, fetching and traversing key/value pairs. These methods are summarized here and described in detail in the Javadoc API documentation.
513
172
514
173
An Exchange instance contains references to a +Key+ and a +Value+. The methods +com.persistit.Exchange.getKey()+ and +com.persistit.Exchange.getValue()+ access these instances.
515
174
516
175
To construct an Exchange you specify a Volume (or alias) and a tree name in its constructor. The constructor will optionally create a new tree in that Volume if a tree having the specified name has not already been created. An application may construct an arbitrary number of Exchange objects. Creating a new Exchange has no effect on the database if the specified tree already exists. Tree creation is thread-safe: multiple threads concurrently constructing Exchanges using the same Tree name will safely result in the creation of only one new tree.
517
176
518
177
An Exchange is a moderately complex object that can consume tens of kilobytes to megabytes of heap space, depending on the sizes of the Key and Value. Memory-constrained applications should construct Exchanges in moderation.
519
178
520
179
Persistit offers Exchange pooling to avoid rapidly creating and destroying Exchange objects in multi-threaded applications.  An application may use the +com.persistit.Persistit#getExchange+ and +com.persistit.Persistit#releaseExchange+ methods to take and return an Exchange from and to a thread-local pool.
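The general idea of a thread-local pool can be sketched in plain Java. This is not Persistit's implementation; the class +ThreadLocalPoolSketch+ and its +take+ and +release+ methods are hypothetical names that only mirror the take-and-return pattern described above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical illustration of a per-thread object pool; not the Persistit implementation.
final class ThreadLocalPoolSketch<T> {
    private final Supplier<T> factory;
    // Each thread sees its own deque, so no locking is required.
    private final ThreadLocal<Deque<T>> pool = ThreadLocal.withInitial(ArrayDeque::new);

    ThreadLocalPoolSketch(Supplier<T> factory) {
        this.factory = factory;
    }

    // Reuses an instance previously released by the current thread, or creates a new one.
    T take() {
        T pooled = pool.get().pollFirst();
        return pooled != null ? pooled : factory.get();
    }

    // Returns an instance to the current thread's pool for later reuse.
    void release(T instance) {
        pool.get().addFirst(instance);
    }
}
```

A caller would typically pair +take+ with +release+ in a +try+/+finally+ block so that the instance is returned even when the work in between fails.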
521
180
522
181
An Exchange internally maintains some optimization information such that references to nearby Keys within a tree are accelerated. Performance may benefit from using a different Exchange for each area of the Tree being accessed.
523
182
524
183
=== Concurrent Operations on Exchanges
525
184
526
185
Although the underlying Persistit database is designed for highly concurrent multi-threaded operation, the +Exchange+ class and its associated +Key+ and +Value+ instances are _not_ thread-safe. Each thread should acquire and use its own Exchange objects when accessing the database. Nonetheless, multiple threads can execute database operations on overlapping data concurrently using their thread-private +Exchange+ instances.
527
186
528
187
Because Persistit permits concurrent operations by multiple threads, there is no guarantee that the underlying database will remain unchanged after an Exchange fetches or modifies its data. However, each operation on an Exchange is atomic, meaning that the inputs and outputs of each method are consistent with some valid state of the underlying Persistit backing store at some instant in time. The Exchange’s Value and Key objects represent that consistent state even if another thread subsequently modifies the database. Transactions, described below, allow multiple database operations to be performed atomically and consistently.
529
188
530
189
=== Exchange API
531
190
532
191
An Exchange has permanent references to a +com.persistit.Key+ and a +com.persistit.Value+. Typically you work with an Exchange in one of the following patterns:
533
192
534
193
- Modify the Key, perform a +fetch+ operation, and extract the Value.
- Modify the Key, modify the Value, and then perform a +store+ operation.
- Modify the Key, and then perform a +remove+ operation.
- Optionally modify the Key, perform a +traverse+ operation, then read the resulting Key and/or Value.
538
197
539
198
These four methods, plus a few other methods listed here, are the primary low-level interface to the database. Semantics are as follows:
540
199
541
200
[horizontal]
542
201
+fetch+:: Reads the stored value associated with this Exchange's Key and modifies the Exchange’s Value to reflect that value.
+store+:: Inserts or replaces the key/value pair for the specified key in the Tree: replaces the former value if there was one, otherwise inserts a new value.
+fetchAndStore+:: Reads and then replaces the stored value. Upon completion, Value reflects the formerly stored value for the current Key. This operation is atomic.
+remove+, +removeAll+, +removeKeyRange+:: Removes key/value pairs from the Tree. Versions of this method specify either a single key or a range of keys to be removed.
+fetchAndRemove+:: Fetches and then removes the stored value. Upon completion, Value reflects the formerly stored value for the current Key. This operation is atomic.
+traverse+, +next+, +previous+:: Modifies the Exchange’s Key and Value to reflect a successor or predecessor key within the tree. See +com.persistit.Key+ for detailed information on the order of traversal.
+hasNext+, +hasPrevious+:: Indicates, without modifying the Exchange’s Value or Key objects, whether there is a successor or predecessor key in the Tree.
+hasChildren+:: Indicates whether there are records having keys that are logical children. A _logical child_ of some key _P_ is any key that can be constructed by appending one or more key segments to _P_.
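The definition of a logical child can be made concrete with a small sketch that models a key as a list of segments. This is a plain-Java illustration of the definition only; +LogicalChildSketch+ is a hypothetical name, and Persistit itself works on encoded key bytes rather than lists.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model: a key is a list of segments.
final class LogicalChildSketch {

    // True when candidate equals parent plus one or more appended segments.
    static boolean isLogicalChild(List<?> parent, List<?> candidate) {
        return candidate.size() > parent.size()
                && candidate.subList(0, parent.size()).equals(parent);
    }

    public static void main(String[] args) {
        List<Object> p = Arrays.asList("Pacific", "Ocean");
        System.out.println(isLogicalChild(p, Arrays.asList("Pacific", "Ocean", 123))); // prints true
        System.out.println(isLogicalChild(p, Arrays.asList("Pacific", "Rim")));        // prints false
    }
}
```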
550
209
551
210
552
211
For convenience, Exchange delegates +append+ and +to+ methods to +com.persistit.Key+. For example, Exchange provides the following methods that delegate to the identically named methods of Key:
553
212
554
213
[source,java]
----
public Exchange append(long v)
public Exchange append(String v)
...
----
560
219
To allow call chaining, these methods of Exchange return the same Exchange. For example, it is valid to write code such as
561
220
562
221
[source,java]
----
exchange.clear().append("Pacific").append("Ocean").append(123).fetch();
----
566
225
567
226
This example fetches the value associated with the concatenated key +{"Pacific","Ocean",123}+.
569
228
570
229
Exchange also delegates other key manipulation methods. (See +com.persistit.Exchange+ for detailed API documentation.)
571
230
572
231
=== Traversing and Querying Collections of Data
573
232
574
233
An Exchange provides a number of methods for traversing a collection of records in the Persistit database. These include variations of the +com.persistit.Exchange#traverse+, +com.persistit.Exchange#next+ and +com.persistit.Exchange#previous+ methods. For all of these methods, Persistit does two things: it modifies the Exchange's +Key+ to reflect a new key that is before or after the current key, and it modifies the +Value+ associated with the Exchange to reflect the database value associated with that key.
575
234
576
235
For example, this code from +HelloWorld.java+ prints out the key and value of each record in a tree:
577
236
578
237
[source,java]
----
dbex.getKey().to(Key.BEFORE);
while (dbex.next()) {
    System.out.println(
        dbex.getKey().indexTo(0).decode() + " " +
        dbex.getValue().get());
}
----
588
247
589
248
In general, the traversal methods let you find a key in a tree related to the key you supply. In Persistit programs you frequently prime a key value by appending either +com.persistit.Key#BEFORE+ or +com.persistit.Key#AFTER+. A key containing either of these special values can never be stored in a tree; these are reserved to represent positions in key traversal order before the first valid key and after the last valid key, respectively. You then invoke next or previous, or any of the other traverse family variants, to enumerate keys within the tree.
590
249
591
250
You can specify whether traversal is _deep_ or _shallow_. Deep traversal traverses the logical children (see +com.persistit.Key+) of a key. Shallow traversal traverses only the logical siblings.
592
251
593
252
[[KeyFilter]]
594
253
=== Selecting key values with a KeyFilter
595
254
596
255
A +com.persistit.KeyFilter+ defines a subset of all possible key values. For example, a KeyFilter can select keys with certain fixed segment values, sets of values or ranges of values.  Calling +traverse+, +next+ or +previous+ with a KeyFilter efficiently traverses the subset of all keys in a Tree that match the filter.
597
256
598
257
You construct a KeyFilter either by adding selection terms to it, or by calling the +com.persistit.KeyParser#parseKeyFilter+ method of the +com.persistit.KeyParser+ class to construct one from a string representation.
599
258
600
259
Use of a KeyFilter is illustrated by the following code fragment:
601
260
602
261
[source,java]
----
Exchange ex = new Exchange("myVolume", "myTree", true);
KeyFilter kf = new KeyFilter("{\"Bellini\":\"Britten\"}");
ex.append(Key.BEFORE);
while (ex.next(kf)) {
    System.out.println(ex.getKey().reset().decodeString());
}
----
611
270
612
271
This simple example emits the string-valued keys within Tree “myTree” whose values fall alphabetically between “Bellini” and “Britten”, inclusive.
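For readers more familiar with the Java collections API, the effect of this inclusive range filter is comparable to taking an inclusive sub-range of a sorted set. The sketch below is an analogy only: it uses +java.util.TreeSet+ rather than Persistit, and the class name and sample keys are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Analogy only: an inclusive range over sorted String keys, using the JDK rather than Persistit.
final class RangeFilterSketch {

    // Returns the keys between low and high, both inclusive, in sorted order.
    static List<String> keysInRange(Iterable<String> keys, String low, String high) {
        TreeSet<String> sorted = new TreeSet<>();
        for (String k : keys) {
            sorted.add(k);
        }
        return new ArrayList<>(sorted.subSet(low, true, high, true));
    }

    public static void main(String[] args) {
        List<String> composers =
                Arrays.asList("Bruckner", "Bach", "Britten", "Berlioz", "Bellini", "Brahms");
        // prints [Bellini, Berlioz, Brahms, Britten]
        System.out.println(keysInRange(composers, "Bellini", "Britten"));
    }
}
```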
613
272
614
273
615
274
You will find an example with a KeyFilter in the examples/FindFileDemo directory.
616
275
617
276
[[PersistitMap]]
618
277
=== PersistitMap
619
278
620
279
In addition to low-level access methods on keys and values, Persistit provides +com.persistit.PersistitMap+, which implements the +java.util.SortedMap+ interface. PersistitMap uses the Persistit database as a backing store so that key/value pairs are persistent, potentially shared with all threads, and limited in number only by disk storage.
621
280
622
281
Keys and Values for PersistitMap must conform to the constraints described above under <<Key>> and <<Value>>.
623
282
624
283
The constructor for PersistitMap takes an Exchange as its sole parameter. All key/value pairs of the Map are stored within the tree identified by this Exchange. The Key supplied by the Exchange becomes the root of a logical tree. For example:
625
284
626
285
[source,java]
----
Exchange ex = new Exchange("myVolume", "myTree", true);
ex.append("USA").append("MA");
PersistitMap<String, String> map = new PersistitMap<String, String>(ex);
map.put("Boston", "Hub");
----
633
292
634
293
places a key/value pair into Tree “myTree” with the concatenated key +{"USA","MA","Boston"}+ and a value +"Hub"+.
635
294
636
295
Generally the expected behavior for an Iterator on a Map collection view is to throw a +ConcurrentModificationException+ if the underlying collection changes. This is known as “fail-fast” behavior. PersistitMap implements this behavior by throwing a +ConcurrentModificationException+ in the event the Tree containing the map changes after the Iterator is constructed.
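The fail-fast convention itself can be observed with a standard JDK map. The following sketch uses +java.util.TreeMap+, not PersistitMap, purely to show the behavior that PersistitMap is described as emulating; the class name is hypothetical.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.TreeMap;

// Demonstrates fail-fast iteration with a JDK map; PersistitMap is not used here.
final class FailFastSketch {

    // Returns true if an iterator rejects a map that changed after the iterator was created.
    static boolean iteratorFailsFast() {
        TreeMap<String, String> map = new TreeMap<>();
        map.put("Boston", "Hub");
        map.put("Chicago", "Windy City");
        Iterator<String> it = map.keySet().iterator();
        map.put("Denver", "Mile High"); // structural change after the iterator exists
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(iteratorFailsFast()); // prints true
    }
}
```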
637
296
638
297
However, sometimes it may be desirable to use PersistitMap and its collections view interfaces to iterate across changing data, especially for large databases. PersistitMap provides the method +com.persistit.PersistitMap#setAllowConcurrentModification+ to control whether changes made by other threads are permitted. By default, concurrent modifications are not allowed.
639
298
640
299
NOTE: When +PersistitMap+ is used within a transaction, updates generated by other concurrent transactions are not visible and therefore cannot cause a +ConcurrentModificationException+. However, to avoid unpredictable results, an Iterator created within the scope of a transaction must be used only within that transaction.
641
300
642
301
643
302
=== Exceptions in PersistitMap
644
303
645
304
Persistit operations throw a variety of exceptions that are subclasses of +com.persistit.exception.PersistitException+. However, the methods of the SortedMap interface do not permit arbitrary checked exceptions to be thrown. Therefore, PersistitMap wraps any PersistitException generated by the underlying database methods within a +com.persistit.PersistitMap.PersistitMapException+. This exception is unchecked and can therefore be thrown by methods of the Map interface. Applications using PersistitMap should catch and handle PersistitMap.PersistitMapException.
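The wrapping pattern described here is a common way to adapt checked exceptions to an interface that cannot declare them. A minimal sketch follows; +StoreException+ and +MapException+ are hypothetical stand-ins for +PersistitException+ and +PersistitMapException+, and the helper +unchecked+ is not part of the Persistit API.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of wrapping a checked exception in an unchecked one.
final class WrapCheckedSketch {

    // Stand-in for a checked database exception.
    static class StoreException extends Exception {
        StoreException(String message) {
            super(message);
        }
    }

    // Stand-in for the unchecked wrapper that a Map method is allowed to throw.
    static class MapException extends RuntimeException {
        MapException(Throwable cause) {
            super(cause);
        }
    }

    // Runs an operation, rethrowing any checked failure as an unchecked MapException.
    static <T> T unchecked(Callable<T> operation) {
        try {
            return operation.call();
        } catch (RuntimeException e) {
            throw e; // already unchecked; pass through unchanged
        } catch (Exception e) {
            throw new MapException(e);
        }
    }
}
```

Callers that catch the wrapper can recover the original failure through +getCause()+.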
646
305
647
306
=== Applying a KeyFilter to a PersistitMap Iterator
648
307
649
308
You can specify a +com.persistit.KeyFilter+ for the Iterator returned by the +keySet()+, +entrySet()+ and +values()+ methods of +com.persistit.PersistitMap+.  The KeyFilter restricts the range of keys traversed by the Iterator. To set the KeyFilter, you must cast the Iterator to the inner class PersistitMap.ExchangeIterator, as shown here:
650
309
651
310
[source,java]
----
PersistitMap map = new PersistitMap(exchange);
PersistitMap.ExchangeIterator iterator =
    (PersistitMap.ExchangeIterator) map.entrySet().iterator();
iterator.setFilterTerm(KeyFilter.rangeTerm("A", "M"));
----
658
317
659
318
In this example, the iterator will only access String-valued keys between “A” and “M”.
660
319
661
320
662
321
0
663
=== added file 'doc/Configuration.rst'
664
--- doc/Configuration.rst	1970-01-01 00:00:00 +0000
665
+++ doc/Configuration.rst	2012-05-30 18:23:19 +0000
666
@@ -0,0 +1,273 @@
667
1
.. _Configuration:
668
2
669
3
Configuration
670
4
=============
671
5
672
6
To initialize Akiban Persistit the embedding application defines a configuration and then invokes one of the  ``com.persistit.Persistit#initialize`` methods. The configuration defines parameters used to determine locations of files, sizes of buffer pool and journal files, policies and other elements required when Persistit starts up. These parameters are managed by the ``com.persistit.Configuration`` class.
673
7
674
8
An application can define the configuration in one of two equivalent ways:
675
9
676
10
- Create a ``com.persistit.Configuration`` instance and set its properties through methods such as ``com.persistit.Configuration#setJournalPath``.
- Specify properties by name in a ``java.util.Properties`` instance and then pass the ``Properties`` to a ``Configuration`` constructor.
678
12
679
13
The following code samples show different ways of using the ``com.persistit.Persistit#initialize`` method to configure and start Persistit:
680
14
681
15
.. code-block:: java

  final Persistit db = new Persistit();
  final Properties p = new Properties();
  p.setProperty("buffer.count.16384", "1000");
  p.setProperty("journalpath", "/home/akiban/data/journal");
  ...
  db.initialize(p);
689
23
690
24
.. code-block:: java

  final Persistit db = new Persistit();
  db.initialize("/home/akiban/my_config.properties");
694
28
695
29
.. code-block:: java

  final Persistit db = new Persistit();
  final Configuration c = new Configuration();
  c.getBufferPoolMap().get(16384).setCount(1000);
  c.setJournalPath("/home/akiban/data/journal");
  ...
  db.initialize(c);
703
37
704
38
There are three essential elements in a Persistit configuration:
705
39
706
40
- Memory for the buffer pool(s)
- Specifications for ``com.persistit.Volume`` instances
- Journal file path
709
43
710
44
Configuring the Buffer Pool
711
45
---------------------------
712
46
713
47
During initialization Persistit allocates a fixed amount of heap memory for use as buffers. Depending on the application, it is usually desirable to allocate most of a server’s physical memory to the JVM heap (using the ``-Xmx`` and ``-Xms`` JVM options) and then to allocate a large fraction of the heap to Persistit buffers. The buffer pool allocation is determined during initialization and remains constant until the embedding application calls ``com.persistit.Persistit#close`` to release all resources.
714
48
715
49
The number of buffers, and therefore the heap memory consumed, is determined during initialization by the available heap memory and configuration parameters. The heap memory size is obtained from the platform MemoryMXBean which in turn supplies the value given by the ``-Xmx`` JVM property. Configuration parameters are specified by a collection of five ``com.persistit.Configuration.BufferPoolConfiguration`` objects, one for each of the possible buffer sizes 1,024, 2,048, 4,096, 8,192 and 16,384 bytes. The method ``com.persistit.Configuration#getBufferPoolMap(int)`` gets the ``BufferPoolConfiguration`` for a specified buffer size.
716
50
717
51
A ``BufferPoolConfiguration`` contains the following attributes:
718
52
719
53
  ``minimumCount``
      lower bound on the number of buffers. (Default is zero.)
  ``maximumCount``
      upper bound on the number of buffers. (Default is zero.)
  ``minimumMemory``
      lower bound on memory to allocate. (Default is zero bytes.)
  ``maximumMemory``
      upper bound on memory to allocate. (Default is Long.MAX_VALUE bytes.)
  ``reservedMemory``
      minimum number of bytes to reserve for use other than buffers. (Default is zero.)
  ``fraction``
      floating point value between 0.0f and 1.0f indicating how much of available memory to allocate. (Default is 1.0f.)
731
65
732
66
Persistit uses the following algorithm to determine the number of buffers to allocate for each buffer size:
733
67
734
68
.. code-block:: java

  memoryToUse = fraction * (maxHeap - reservedMemory)
  memoryToUse = min(memoryToUse, maximumMemory)
  bufferCount = memoryToUse / bufferSizeWithOverhead
  bufferCount = max(minimumCount, min(maximumCount, bufferCount))
  if (bufferCount * bufferSize > maximumMemory) then FAIL
  if ((bufferCount + 1) * bufferSize < minimumMemory) then FAIL
  allocate bufferCount buffers
743
77
744
78
In other words, Persistit computes a buffer count based on the memory parameters, bounds it by ``minimumCount`` and ``maximumCount`` and then checks whether the resulting allocation fits within the memory constraints. Note that ``bufferSizeWithOverhead`` is about 14% larger than the buffer size; the additional memory is reserved for indexing data and other overhead associated with the buffer.
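The algorithm can be transcribed into runnable Java to experiment with parameter choices. This is a sketch of the pseudocode as written, not Persistit's source: the class name is hypothetical, and the ``OVERHEAD`` factor of 1.14 is an assumption derived from the "about 14%" figure in the text.

```java
// Hypothetical transcription of the buffer-count pseudocode; not the Persistit implementation.
final class BufferCountSketch {

    // Assumption: the text says "about 14%"; the exact per-buffer overhead is not given.
    static final double OVERHEAD = 1.14;

    static int bufferCount(long maxHeap, int bufferSize,
                           int minimumCount, int maximumCount,
                           long minimumMemory, long maximumMemory,
                           long reservedMemory, double fraction) {
        long memoryToUse = (long) (fraction * (maxHeap - reservedMemory));
        memoryToUse = Math.min(memoryToUse, maximumMemory);
        long bufferSizeWithOverhead = (long) (bufferSize * OVERHEAD);
        long count = memoryToUse / bufferSizeWithOverhead;
        // Bound the memory-derived count by the configured count limits.
        count = Math.max(minimumCount, Math.min(maximumCount, count));
        if (count * bufferSize > maximumMemory) {
            throw new IllegalArgumentException("allocation exceeds maximumMemory");
        }
        if ((count + 1) * bufferSize < minimumMemory) {
            throw new IllegalArgumentException("allocation falls short of minimumMemory");
        }
        return (int) count;
    }
}
```

Under these assumptions, a 64 MiB heap with 8,192-byte buffers, ``minimumMemory`` of 512K, ``maximumMemory`` of 20M, ``reservedMemory`` of 4M, ``fraction`` of 0.6 and no effective count limits yields 2,245 buffers: the 20M cap binds, and 20,971,520 / 9,338 rounds down to 2,245.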
745
79
746
80
Typically an application uses a single buffer size, specifying either an absolute count or memory-based constraints for that size. This can be done by setting the attributes of the appropriate ``BufferPoolConfiguration`` object directly, or using Property values.
747
81
748
82
The property named ``buffer.count.SSSS`` where ``SSSS`` is “1024”, “2048”, “4096”, “8192” or “16384” specifies an absolute count.  For example,
749
83
750
84
.. code-block:: java

  buffer.count.8192 = 10000
753
87
754
88
causes Persistit to allocate 10,000 buffers of size 8192.
755
89
756
90
The property ``buffer.memory.SSSS`` specifies memory constraints as shown in this example:

.. code-block:: java

  buffer.memory.8192 = 512K,20M,4M,0.6
761
95
762
96
where 512K, 20M, 4M and 0.6 are the ``minimumMemory``, ``maximumMemory``, ``reservedMemory`` and ``fraction``, respectively.
763
97
764
98
The MemoryMXBean supplies as its maximum heap size value the size given by the ``-Xmx`` JVM parameter.
765
99
766
100
Heap Tuning
767
101
-----------
768
102
769
103
This section pertains to the Oracle HotSpot(tm) Java virtual machine.
770
104
771
105
.. note::

   Buffer instances are long-lived objects. To avoid severe garbage collector overhead it is important for all of them
   to fit in the heap’s tenured generation. This issue becomes especially significant with multi-gigabyte heaps.
775
109
776
110
By default the HotSpot server JVM allocates 1/3 of the heap to the new generation and 2/3 to the tenured generation, meaning that allocating more than 2/3 of the heap to buffers will result in bad performance.
777
111
778
112
You can increase the fraction by specifying ``-XX:NewRatio=N`` where ``N`` indicates the ratio of tenured generation space to new generation space, or by using the ``-Xmn`` parameter to specify an absolute amount of memory for the new generation.  Also, setting ``-Xms`` equal to ``-Xmx`` will avoid numerous garbage collection cycles during the start-up process.
779
113
780
114
See http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html for further information on tuning the heap and garbage collector for the HotSpot JVM.
781
115
782
116
Multiple Buffer Pools
783
117
---------------------
784
118
785
119
In some cases it may be desirable to allocate two or more buffer pools having buffers of different sizes. For example, it may be beneficial to use a large number of small buffers to hold secondary index pages.
786
120
787
121
When specifying multiple memory constraints for multiple buffer pools, the ``fraction`` property applies to the available memory before any buffers are allocated. So, for example,
788
122
789
123
.. code-block:: java

  buffer.memory.2048=64M,512G,2G,.2
  buffer.memory.16384=64M,512G,2G,.5
793
127
794
128
results in two buffer pools having buffers of size 2,048 bytes and 16,384 bytes, respectively. Assuming that the ``-Xmx`` value is 12G, then 2,048 byte buffers will be allocated to fill 20% of 10GByte, 16,384 byte buffers will be allocated to fill 50% of 10GByte, and approximately 5GByte (30% of 10GByte plus 2GByte reserved) will be available to application code.
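The arithmetic in this example can be checked with a few lines of Java. The sketch below only reproduces the fraction calculation; it deliberately ignores the minimum and maximum memory bounds and the per-buffer overhead, and the class and method names are hypothetical.

```java
// Hypothetical worked example of the multi-pool fraction arithmetic.
final class MultiPoolSketch {

    static final long G = 1L << 30;

    // Each pool's fraction applies to (heap - reserved), independent of the other pools.
    static long poolMemory(long maxHeap, long reserved, int percent) {
        return (maxHeap - reserved) * percent / 100;
    }

    public static void main(String[] args) {
        long heap = 12 * G;
        long reserved = 2 * G;
        long small = poolMemory(heap, reserved, 20); // 2,048-byte buffers
        long large = poolMemory(heap, reserved, 50); // 16,384-byte buffers
        long left = heap - small - large;            // remains for application code
        // prints 2G 5G 5G
        System.out.println(small / G + "G " + large / G + "G " + left / G + "G");
    }
}
```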
795
129
796
130
Configuring Volumes
797
131
-------------------
798
132
799
133
Persistit creates and/or opens a set of database volume files during start-up. An application can create, open and close additional volumes, but it is often convenient for volumes to be defined in the configuration, outside of application code.
800
134
801
135
The ``com.persistit.Configuration#getVolumeList`` method returns a List of ``com.persistit.VolumeSpecification`` objects. An application can construct and add new ``VolumeSpecification`` instances to this list before calling ``com.persistit.Persistit#initialize(Configuration)``.  Alternatively, the application can define volume specifications as property values using the syntax:
802
136
803
137
``volume.N = path[,attrname[:attrvalue]]...``
804
138
805
139
where ``N`` is an arbitrary integer, ``path`` is the path specification of the volume file, and ``attrnames`` include:
806
140
807
141
- ``pageSize``: Fixed length unit representing one page. Value must be one of 1024, 2048, 4096, 8192 or 16384. To open and use the Volume, the buffer pool must have available buffers of the same size.

- ``create``: Persistit attempts to open an existing volume file with the specified *path*, or create a new one if the file does not exist.

- ``createOnly``: Persistit throws a VolumeAlreadyExistsException if the file specified by *path* already exists. Otherwise it creates a new file with the specified path.

- ``readOnly``: Opens a volume in read-only mode. An attempt to modify the volume results in a ReadOnlyVolumeException.

- ``initialPages`` or ``initialSize``: Specifies the initial size of the newly created volume file, either as the count of pages or as the size in bytes.

- ``extensionPages`` or ``extensionSize``: Specifies the extension size of the newly created volume, either as the count of pages or as the size in bytes. This is the size by which the volume file will expand when the volume needs to be enlarged.

- ``maximumPages`` or ``maximumSize``: An upper limit on the number of pages this Volume may hold, either as the count of pages or as the size in bytes. An attempt to further enlarge the Volume will generate a VolumeFullException.

- ``alias``: The name of this Volume used in constructing ``Exchange`` instances. If unspecified, the name is the simple file name given in the *path*, not including its dotted suffix.
822
156
823
157
For example::

  volume.1=/home/akiban/ffdemo,create,pageSize:16K,\
      initialSize:10M,extensionSize:10M,maximumSize:1G
827
161
828
162
specifies a volume having the name “ffdemo” in the /home/akiban directory. A new volume will be created if there is no existing volume file, and when created it will have the initial, extension and maximum sizes of 10MByte, 10MByte and 1GByte, respectively. Its page size will be 16KByte, meaning that the configuration must also have a buffer pool of 16KByte buffers.
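To make the ``path[,attrname[:attrvalue]]...`` syntax concrete, here is a minimal parsing sketch. It is not Persistit's parser: the class name is hypothetical, storing the path under the map key ``"path"`` is a convention of this sketch, and no validation of attribute names or values is attempted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for the volume specification syntax; not the Persistit implementation.
final class VolumeSpecSketch {

    // Splits a specification into its path and attributes, preserving their order.
    static Map<String, String> parse(String spec) {
        Map<String, String> result = new LinkedHashMap<>();
        String[] parts = spec.split(",");
        result.put("path", parts[0].trim());
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i].trim();
            int colon = part.indexOf(':');
            if (colon < 0) {
                result.put(part, ""); // flag attribute such as "create"
            } else {
                result.put(part.substring(0, colon), part.substring(colon + 1));
            }
        }
        return result;
    }
}
```

Applied to the specification above (with the line continuation joined), the sketch produces the path ``/home/akiban/ffdemo``, the flag ``create`` and four sized attributes such as ``pageSize`` with the value ``16K``.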
829
163
830
164
System Volume
831
165
-------------
832
166
833
167
One volume in a Persistit configuration must be designated as the system volume. It contains class meta data for objects stored serialized in Persistit Values. When a configuration specifies only one volume, that volume implicitly becomes the system volume by default. However, when a configuration specifies multiple volumes, you must indicate which volume will serve as the system volume. There are two ways to do this. By default, Persistit looks for a unique volume named “_system”. You can simply create a volume whose file name is “_system”.
834
168
835
169
Alternatively, you can specify a system volume name explicitly with the ``sysvolume`` property (or ``com.persistit.Configuration#setSysVolume``). The value is the name or alias of the selected volume.
836
170
837
171
Configuring the Journal Path
838
172
----------------------------
839
173
840
174
The :ref:`Journal` consists of a series of sequentially numbered files located in the directory specified by the configuration parameter ``journalpath``. The application can set this property by calling ``com.persistit.Configuration#setJournalPath`` prior to initializing Persistit or through the property::
841
175
842
176
  journalpath=/ssd/data/my_app_journal
843
177
844
178
The value specified can be either

- a directory, in which case files named ``persistit_journal.NNNNNNNNNNNN`` will be created, or
- a file name, in which case journal files will be created by appending the suffix ``.NNNNNNNNNNNN``.
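The two naming cases can be sketched as follows. This is an illustration under stated assumptions, not Persistit's code: the 12-digit zero-padded decimal suffix is inferred from the twelve ``N`` characters in the pattern above, the ``/`` separator is assumed, and the class and method names are hypothetical.

```java
// Hypothetical sketch of journal file naming; the suffix width is an assumption.
final class JournalNameSketch {

    // Builds a journal file name from the configured path and a file sequence number.
    static String journalFile(String journalPath, boolean isDirectory, long sequence) {
        String base = isDirectory ? journalPath + "/persistit_journal" : journalPath;
        return String.format("%s.%012d", base, sequence);
    }
}
```

For instance, with these assumptions a directory ``/ssd/data`` and sequence number 3 give ``/ssd/data/persistit_journal.000000000003``.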
848
182
849
183
Recommendations for Physical Media
850
184
----------------------------------
851
185
852
186
The journal is written by appending records to the end of the highest-numbered file. Read operations occur while copying page images from the journal to their home volume files. While copying, Persistit attempts to perform large sequential read operations from the journal. Read operations also occur at random when Persistit needs to reload the image of a previously evicted page.
853
187
854
188
Because of these characteristics a modern SSD (solid-state drive) is ideally suited for maintaining the journal. If no SSD is available in the server, placing the journal on a different physical disk drive than the volume file(s) can significantly improve performance.
855
189
856
190
Other Configuration Parameters
857
191
------------------------------
858
192
859
193
The following additional properties are defined for Persistit. Other properties may also reside in the Properties object or its backing file; Persistit simply ignores any property not listed here.
860
194
861
195
  ``journalsize``: (``com.persistit.Configuration#setJournalSize``)
      Journal file block size. Default is 1,000,000,000 bytes. Persistit rolls over to a new journal file when this
      size is reached. Generally there is no reason to adjust this setting.
864
198
865
199
  ``appendonly``: (``com.persistit.Configuration#setAppendOnly``), True or false (default).
      When true, Persistit’s journal starts up in *append-only* mode in which modified pages are only written to the
      journal and not copied to their home volumes. As a consequence, all existing journal files are preserved, and new
      modifications are written only to newly created journal files. The append-only flag can also be enabled or disabled
      by application code and through the JMX and RMI interfaces.
870
204
871
205
  ``rmiport``: (``com.persistit.Configuration#setRmiPort``)
      Specifies a port number on which Persistit will create a temporary Remote Method Invocation registry. If this
      property is specified, Persistit creates a registry and registers a ``com.persistit.Management`` server on it. This
      allows remote access to management facilities within Persistit and permits the Swing-based administrative utility to
      attach to and manage a Persistit instance running on a headless server. The ``rmihost`` and ``rmiport`` properties
      are mutually exclusive.
877
211
878
212
  ``rmihost``: (``com.persistit.Configuration#setRmiHost``)
      Specifies the URL of a Remote Method Invocation registry. If present, Persistit registers a server for its
      ``com.persistit.Management`` interface at the specified external registry. The ``rmihost`` and ``rmiport``
      properties are mutually exclusive.
882
216
883
217
  ``jmx``: (``com.persistit.Configuration#setJmxEnabled``), True or false (default).
      Specifies whether Persistit registers MXBeans with the platform MBean server. Set this value to ``true`` to enable
      access from ``jconsole`` and other management tools.
886
220
887
221
  ``serialOverride``, ``constructorOverride``: (``com.persistit.Configuration#setSerialOverride``, ``com.persistit.Configuration#setConstructorOverride``)
      Control aspects of object serialization. See :ref:`Serialization`.
889
223
890
224
  ``showgui``: (``com.persistit.Configuration#setShowGUI``), True or false.
      If true, Persistit attempts to create and display an instance of the AdminUI utility panel within the current JVM.
      Alternatively, AdminUI uses RMI and can be launched and run remotely if ``rmiport`` or ``rmihost`` has been
      specified.
894
228
895
229
  ``logfile``: (``com.persistit.Configuration#setLogFile``) 
896
230
      Name of a log file to which Persistit’s default logger will write diagnostic log entries. Applications generally 
897
231
      install a logging adapter to reroute messages through Log4J, SLF4J or another logger. The ``logfile`` property is used 
898
232
      only when no adapter has been installed.
899
233
900
234
For all integer-valued properties, the suffix “K” may be used to represent kilo, “M” for mega, “G” for giga and “T” for tera. For example, “2M” represents the value 2,097,152.
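
The suffix arithmetic can be sketched as follows. This is only an illustration: ``SizeSuffix`` and ``parse`` are hypothetical names, not part of the Persistit API, and the sketch assumes binary multipliers (1K = 1,024), which is consistent with “2M” representing 2,097,152.

.. code-block:: java

  public class SizeSuffix {

      // Hypothetical helper (not a Persistit method): converts a value such
      // as "2M" to a long, assuming each suffix is a power of 1,024.
      static long parse(String text) {
          String s = text.trim();
          if (s.isEmpty()) {
              throw new NumberFormatException("empty value");
          }
          long multiplier = 1L;
          switch (s.charAt(s.length() - 1)) {
          case 'K': multiplier = 1L << 10; break;
          case 'M': multiplier = 1L << 20; break;
          case 'G': multiplier = 1L << 30; break;
          case 'T': multiplier = 1L << 40; break;
          default: break; // no suffix: plain integer
          }
          if (multiplier != 1L) {
              // Drop the suffix character before parsing the digits.
              s = s.substring(0, s.length() - 1).trim();
          }
          return Long.parseLong(s) * multiplier;
      }

      public static void main(String[] args) {
          System.out.println(parse("2M"));  // prints 2097152
          System.out.println(parse("16K")); // prints 16384
      }
  }
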
901
235
902
236
A Configuration Example
903
237
-----------------------
904
238
905
239
Following is an example of a Persistit configuration properties file::
906
240
907
241
  datapath = /var/opt/persistit/data
908
242
  logpath = /var/log/persistit
909
243
  logfile = ${logpath}/${timestamp}.log
910
244
911
245
  buffer.count.16384 = 5000
912
246
913
247
  volume.1 = ${datapath}/demo_data, create, pageSize:16K, \
914
248
  	  initialSize:1M, extensionSize:1M, maximumSize:10G, alias:data
915
249
916
250
  volume.2 = ${datapath}/demo_system, create, pageSize:16K, \
917
251
	  initialSize:100K, extensionSize:100K, maximumSize:1G
918
252
919
253
  sysvolume = demo_system
920
254
921
255
  journalpath = /ssd/persistit_journal
922
256
923
257
With this configuration there will be 5,000 16K buffers in the buffer pool consuming heap space of approximately 93MB including overhead. Persistit will open or create volume files named ``/var/opt/persistit/data/demo_data`` and ``/var/opt/persistit/data/demo_system`` and a journal file named ``/ssd/persistit_journal.0000000000000000``. Persistit will write diagnostic logging output to a file such as ``/var/log/persistit/20110523172213.log``.
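
The 93MB figure can be checked with a little arithmetic. The sketch below assumes a per-buffer overhead of about 14%, the approximate figure given in the buffer pool discussion; the class and method names are illustrative only and the exact overhead may differ.

.. code-block:: java

  public class BufferPoolEstimate {

      // Rough heap estimate: count * size, inflated by an assumed overhead
      // percentage. Integer arithmetic keeps the result exact.
      static long estimateBytes(long bufferCount, long bufferSize, long overheadPercent) {
          return bufferCount * bufferSize * (100 + overheadPercent) / 100;
      }

      public static void main(String[] args) {
          // 5,000 buffers of 16,384 bytes with roughly 14% overhead
          long bytes = estimateBytes(5000, 16384, 14);
          System.out.println(bytes); // prints 93388800
      }
  }

The result, 93,388,800 bytes, is approximately 93MB when a megabyte is taken as 1,000,000 bytes.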
924
258
925
259
The ``demo_data`` volume has the alias ``data``. Application code uses the name "data" to refer to it. The ``sysvolume`` property specifies that the ``demo_system`` volume is designated to hold class meta data for serialized objects.
926
260
927
261
Property Value Substitution
928
262
---------------------------
929
263
930
264
This example also illustrates how property value substitution can be used within a Persistit configuration.  The value of the ``datapath`` property replaces ``${datapath}`` in the volume specification. The property name ``datapath`` is arbitrary; you may use any valid property name as a substitution variable. Similarly, the value of ``logpath`` replaces ``${logpath}``, and the pseudo-property ``${timestamp}`` expands to a timestamp in the form ``yyyyMMddHHmm`` to provide a unique time-based log file name.
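
To make the substitution rule concrete, here is a minimal sketch of ``${name}`` expansion over a ``java.util.Properties`` object. It is not Persistit's implementation: ``SubstitutionSketch`` and ``substitute`` are hypothetical names, the expansion is a single non-recursive pass, and the ``timestamp`` pseudo-property is not handled.

.. code-block:: java

  import java.util.Properties;
  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  public class SubstitutionSketch {

      // Matches ${name} and captures the name.
      private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]+)\\}");

      // Hypothetical helper: replaces each ${name} with the value of the
      // property called name. Unknown names are left unchanged.
      static String substitute(String value, Properties props) {
          Matcher m = VARIABLE.matcher(value);
          StringBuffer sb = new StringBuffer();
          while (m.find()) {
              String replacement = props.getProperty(m.group(1));
              if (replacement == null) {
                  replacement = m.group(0);
              }
              m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
          }
          m.appendTail(sb);
          return sb.toString();
      }

      public static void main(String[] args) {
          Properties props = new Properties();
          props.setProperty("datapath", "/var/opt/persistit/data");
          // prints /var/opt/persistit/data/demo_data
          System.out.println(substitute("${datapath}/demo_data", props));
      }
  }
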
931
265
932
266
Incorporating Java System Properties
933
267
------------------------------------
934
268
935
269
You may also specify any configuration property as a Java system property with the prefix ``com.persistit.``. System properties override values specified as properties. For example, you can override the value of ``buffer.count.8192`` by specifying::
936
270
937
271
  java -Dcom.persistit.buffer.count.8192=10K -jar MyJar
938
272
939
273
This is also true for substitution property values. For example, ``-Dcom.persistit.logpath=/tmp/`` will place the log files in the ``/tmp`` directory rather than ``/var/log/persistit`` as specified by the configuration file.
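
The precedence rule can be pictured with a short sketch. It only illustrates the lookup order described above and is not Persistit's code; ``OverrideSketch`` and ``lookup`` are hypothetical names.

.. code-block:: java

  import java.util.Properties;

  public class OverrideSketch {

      static final String PREFIX = "com.persistit.";

      // Hypothetical helper: a system property named com.persistit.<name>
      // takes precedence over the same name in the configuration properties.
      static String lookup(String name, Properties configProps) {
          String override = System.getProperty(PREFIX + name);
          return override != null ? override : configProps.getProperty(name);
      }

      public static void main(String[] args) {
          Properties configProps = new Properties();
          configProps.setProperty("logpath", "/var/log/persistit");

          // Simulates passing -Dcom.persistit.logpath=/tmp/ on the command line.
          System.setProperty(PREFIX + "logpath", "/tmp/");
          System.out.println(lookup("logpath", configProps)); // prints /tmp/
      }
  }
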
940
0
274
941
=== removed file 'doc/Configuration.txt'
942
--- doc/Configuration.txt	2012-04-30 22:09:31 +0000
943
+++ doc/Configuration.txt	1970-01-01 00:00:00 +0000
944
@@ -1,246 +0,0 @@
945
1
[[Configuration]]
946
2
= Configuration
947
3
948
4
To initialize Akiban Persistit the embedding application defines a configuration and then invokes one of the  +com.persistit.Persistit#initialize+ methods. The configuration defines parameters used to determine locations of files, sizes of buffer pool and journal files, policies and other elements required when Persistit starts up. These parameters are managed by the +com.persistit.Configuration+ class.
949
5
950
6
An application can define the configuration in one of two equivalent ways:
951
7
952
8
- Create a +com.persistit.Configuration+ instance and set its properties through methods such as +com.persistit.Configuration#setJournalPath+.
953
9
- Specify properties by name in a +java.util.Properties+ instance and then pass the +Properties+ to a +Configuration+ constructor.
954
10
955
11
The following code samples show different ways of using the +com.persistit.Persistit#initialize+ method to configure and start Persistit:
956
12
957
13
[source,java]
958
14
----
959
15
final Persistit db = new Persistit();
960
16
final Properties p = new Properties();
961
17
p.setProperty("buffer.count.16384", "1000");
962
18
p.setProperty("journalpath", "/home/akiban/data/journal");
963
19
...
964
20
db.initialize(p);
965
21
----
966
22
967
23
[source,java]
968
24
----
969
25
final Persistit db = new Persistit();
970
26
db.initialize("/home/akiban/my_config.properties");
971
27
----
972
28
973
29
[source,java]
974
30
----
975
31
final Persistit db = new Persistit();
976
32
final Configuration c = new Configuration();
977
33
c.getBufferPoolMap().get(16384).setCount(1000);
978
34
c.setJournalPath("/home/akiban/data/journal");
979
35
...
980
36
db.initialize(c);
981
37
----
982
38
983
39
There are three essential elements in a Persistit configuration:
984
40
985
41
- Memory for the buffer pool(s)
986
42
- Specifications for +com.persistit.Volume+ instances
987
43
- Journal file path
988
44
989
45
== Configuring the Buffer Pool
990
46
991
47
During initialization Persistit allocates a fixed amount of heap memory for use as buffers. Depending on the application, it is usually desirable to allocate most of a server’s physical memory to the JVM heap (using the +-Xmx+ and +-Xms+ JVM options) and then to allocate a large fraction of the heap to Persistit buffers. The buffer pool allocation is determined during initialization and remains constant until the embedding application calls +com.persistit.Persistit#close+ to release all resources.
992
48
993
49
The number of buffers, and therefore the heap memory consumed, is determined during initialization by the available heap memory and configuration parameters. The heap memory size is obtained from the platform MemoryMXBean which in turn supplies the value given by the +-Xmx+ JVM property. Configuration parameters are specified by a collection of five +com.persistit.Configuration.BufferPoolConfiguration+ objects, one for each of the possible buffer sizes 1,024, 2,048, 4,096, 8,192 and 16,384 bytes. The method +com.persistit.Configuration#getBufferPoolMap(int)+ gets the +BufferPoolConfiguration+ for a specified buffer size.
994
50
995
51
A +BufferPoolConfiguration+ contains the following attributes:
996
52
997
53
[horizontal]
998
54
minimumCount:: lower bound on the number of buffers. (Default is zero.)
999
55
maximumCount:: upper bound on the number of buffers. (Default is zero.)
1000
56
minimumMemory:: lower bound on memory to allocate. (Default is zero bytes.)
1001
57
maximumMemory:: upper bound on memory to allocate. (Default is Long.MAX_VALUE bytes.)
1002
58
reservedMemory:: minimum number of bytes to reserve for use other than buffers.  (Default is zero.)
1003
59
fraction:: floating point value between 0.0f and 1.0f indicating how much of available memory too allocate.  (Default is 1.0f.)
1004
60
1005
61
Persistit uses the following algorithm to determine the number of buffers to allocate for each buffer size:
1006
62
1007
63
[source,java]
1008
64
----
1009
65
memoryToUse = fraction * (maxHeap - reservedMemory)
1010
66
memoryToUse = min(memoryToUse, maximumMemory)
1011
67
bufferCount = memoryToUse / bufferSizeWithOverhead
1012
68
bufferCount = max(minimumCount, min(maximumCount, count))
1013
69
if (bufferCount * bufferSize > maximumMemory) then FAIL
1014
70
if ((bufferCount + 1) * bufferSize < minimumMemory) then FAIL
1015
71
allocate bufferCount buffers
1016
72
----
1017
73
1018
74
In other words, Persistit computes a buffer count based on the memory parameters, bounds it by +minimumCount+ and +maximumCount+ and then checks whether the resulting allocation fits within the memory constraints. Note that +bufferSizeWithOverhead+ is about 14% larger than the buffer size; the additional memory is reserved for indexing data and other overhead associated with the buffer.
1019
75
1020
76
Typically an application uses a single buffer size, specifying either an absolute count or memory-based constraints for that size. This can be done by setting the attributes of the appropriate +BufferPoolConfiguration+ object directly, or using Property values.
1021
77
1022
78
The property named +buffer.count._SSSS_+ where _SSSS_ is “1024”, “2048”, “4096”, “8192” or “16384” specifies an absolute count.  For example,
1023
79
1024
80
----
1025
81
buffer.count.8192 = 10000
1026
82
----
1027
83
1028
84
causes Persistit to allocate 10,000 buffers of size 8192.
1029
85
1030
86
The property +buffer.memory._SSSS_+ specifies memory constraints as shown in this example
1031
87
1032
88
----
1033
89
buffer.memory.8192 = 512K,20M,4M,0.6
1034
90
----
1035
91
1036
92
where 512K, 20M, 4M and 0.6 are the +minimumMemory+, +maximumMemory+, +reservedMemory+ and +fraction+, respectively.
1037
93
1038
94
The MemoryMXBean supplies as its maximum heap size value the size given by the +-Xmx+ JVM parameter.
1039
95
1040
96
=== Heap Tuning
1041
97
1042
98
This section pertains to the Oracle HotSpot(tm) Java virtual machine.
1043
99
1044
100
****
1045
101
Buffer instances are long-lived objects. To avoid severe garbage collector overhead it is important for all of them to fit in the heap’s tenured generation. This issue becomes especially significant with multi-gigabyte heaps.
1046
102
****
1047
103
1048
104
By default the HotSpot server JVM allocates 1/3 of the heap to the new generation and 2/3 to the tenured generation, meaning that allocating more than 2/3 of the heap to buffers will result in bad performance.
1049
105
1050
106
You can increase the fraction by specifying +-XX:NewRatio=_N_+ where _N_ indicates the ratio of tenured generation space to new generation space, or by using the +-Xmn+ parameter to specify an absolute amount of memory for the new generation.  Also, setting +-Xms+ equal to +-Xmx+ will avoid numerous garbage collection cycles during the start-up process.
1051
107
1052
108
See [http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html] for further information on tuning the heap and garbage collector for the HotSpot JVM.
1053
109
1054
110
=== Multiple Buffer Pools
1055
111
1056
112
In some cases it may be desirable to allocate two or more buffer pools having buffers of different sizes. For example, it may be beneficial to use a large number of small buffers to hold secondary index pages.
1057
113
1058
114
When specifying multiple memory constraints for multiple buffer pools, the +fraction+ property applies to the available memory before any buffers are allocated. So, for example,
1059
115
1060
116
1061
117
----
1062
118
buffer.memory.2048=64M,512G,2G,.2
1063
119
buffer.memory.16384=64M,512G,2G,.5
1064
120
----
1065
121
1066
122
results in two buffer pools having buffers of size 2,048 bytes and 16,384 bytes, respectively. Assuming that the +-Xmx+ value is 12G, then 2,048 byte buffers will be allocated to fill 20% of 10GByte, 16,384 byte buffers will be allocated to fill 50% of 10GByte, and approximately 5GByte (30% of 10GByte plus 2GByte reserved) will be available to application code.
1067
123
1068
124
== Configuring Volumes
1069
125
1070
126
Persistit creates and/or opens a set of database volume files during start-up. An application can create, open and close additional volumes, but it is often convenient for volumes to be defined in the confiuration, outside of application code.
1071
127
1072
128
The +com.persistit.Configuration#getVolumeList+ method returns a List of +com.persistit.VolumeSpecification+ objects. An application can construct and add new +VolumeSpecification+ instances to this list before calling +com.persistit.Persistit#initialize(Configuration)+.  Alternatively, the application can define volume specifications as property values using the syntax:
1073
129
1074
130
+volume._N_ = _path_[,_attrname_[:_attrvalue_]]...+
1075
131
1076
132
where _N_ is an arbitrary integer, _path_ is the path specification of the volume file, and _attrnames_ include:
1077
133
1078
134
- +pageSize+: Fixed length unit representing one page. Value must be one of 1024, 2048, 4096, 8192 or 16384. To open and use the Volume, the buffer pool must have available buffers of the same size.
1079
135
1080
136
- +create+: Persistit attempts to open an existing volume file with the specified _path_, or create a new one if the file does not exist.
1081
137
1082
138
- +createOnly+: Persistit throw a VolumeAlreadyExistsException if the file specified by path already exists. Otherwise it creates a new file with the specified path.
1083
139
1084
140
- +readOnly+: Opens a volume in read-only mode. An attempt to modify the volume results in a ReadOnlyVolumeException.
1085
141
1086
142
- +initialPages+ or +initialSize+: Specifies the initial size of the newly created volume file, either as the count of pages or as the size in bytes.
1087
143
1088
144
- +extensionPages+ or +extensionSize+: Specifies the extension size of the newly created volume, either as the count of pages or as the size in bytes. This is the size by which the volume file will expand when the volume needs to be enlarged.
1089
145
1090
146
- +maximumPages+ or +maximumSize+: An upper limit on the number of pages this Volume may hold, either as the count of pages or as the size in bytes. An attempt to further enlarge the Volume will generate a VolumeFullException.
1091
147
1092
148
- +alias+: The name of this Volume used in constructing +Exchange+ instances.  If unspecified, the name is the simple file name given in the _path_, not including its dotted suffix.
1093
149
1094
150
For example:
1095
151
1096
152
----
1097
153
volume.1=/home/akiban/ffdemo,create,pageSize:16K,\
1098
154
    initialSize:10M,extensionSize:10M,maximumSize:1G
1099
155
----
1100
156
1101
157
specifies a volume having the name “ffdemo” in the /home/akiban directory. A new volume will be created if there is no existing volume file, and when created it will have the initial, extension and maximum sizes of 10MByte, 10MByte and 1GByte, respectively. Its page size will be 16KByte, meaning that the configuration must also have a buffer pool of 16KByte buffers.
1102
158
1103
159
=== System Volume
1104
160
1105
161
One volume in a Persistit configuration must be designated as the system volume. It contains class meta data for objects stored serialized in Persistit Values. When a configuration specifies only one volume, that volume implicitly becomes the system volume by default. However, when a configuration specifies multiple volumes, you must indicate which volume will serve as the system volume. There are two ways to do this. By default, Persistit looks for a unique volume named “_system”. You can simply create a volume whose file name is “_system”.
1106
162
1107
163
Alternatively, you can specify a system volume name explicitly with the +sysvolume+ property (or +com.persistit.Configuration#setSysVolume+). The value is the name or alias of the selected volume.
1108
164
1109
165
== Configuring the Journal Path
1110
166
1111
167
The <<Journal>> consists of a series of sequentially numbered files located in directory specified by the configuration parameter +journalpath+. The application can set this property by calling +com.persistit.Configuration#setJournalPath+ prior to initializing Persistit or through the property
1112
168
1113
169
----
1114
170
journalpath=/ssd/data/my_app_journal
1115
171
---- 
1116
172
1117
173
The value specified can be either a 
1118
174
1119
175
- directory, in which case files named +persistit_journal._NNNNNNNNNNNN_+ will be created, 
1120
176
1121
177
- or a file name, in which case journal files will be created by appending the suffix +._NNNNNNNNNNNN_+.
1122
178
1123
179
=== Recommendations for Physical Media
1124
180
1125
181
The journal is written by appending records to the end of the highest-numbered file. Read operations occur while copying page images from the journal to their home volume files. While copying, Persistit attempts to perform large sequential read operations from the journal. Read operations also occur at random when Persistit needs to reload the image of a previously evicted page.
1126
182
1127
183
Because of these characteristics a modern SSD (solid disk drive) is ideally suited for maintaining the journal. If no SSD is available in the server, placing the journal on a different physical disk drive than the volume file(s) can significantly improve performance.
1128
184
1129
185
== Other Configuration Parameters
1130
186
1131
187
The following additional properties are defined for Persistit. Other properties may also reside in the Properties object or its backing file; Persistit simply ignores any property not listed here.
1132
188
1133
189
[horizontal]
1134
190
+journalsize+:: (+com.persistit.Configuration#setJournalSize+) Journal file block size. Default is 1,000,000,000 bytes. A new Persistit rolls over to a new journal file when this size is reached. Generally there is no reason to adjust this setting.
1135
191
1136
192
+appendonly+:: (+com.persistit.Configuration#setAppendOnly+) True or false (default).  When true, Persistit’s journal starts up in _append-only_ mode in which modified pages are only written to the journal and not copied to their home volumes. As a consequence, all existing journal files are preserved, and new modifications are written only to newly created journal files. The append-only flag can also be enabled or disabled by application code and through the JMX and RMI interfaces.
1137
193
1138
194
+rmiport+:: (+com.persistit.Configuration#setRmiPort+) Specifies a port number on which Persistit will create a temporary Remote Method Invocation registry.  If this property is specified, Persistit creates a registry and registers a +com.persistit.Management+ server on it. This allows remote access to management facilities within Persistit and permits the Swing-based administrative utility to attach to and manage a Persistit instance running on a headless server. The +rmihost+ and +rmiport+ properties are mutually exclusive.
1139
195
1140
196
+rmihost+:: (+com.persistit.Configuration#setRmiHost+) Specifies the URL of an Remote Method Invocation registry.  If present, Persistit registers its a server for its +com.persistit.Management+ interface at the specified external registry. The +rmihost+ and +rmiport+ properties are mutually exclusive.
1141
197
1142
198
+jmx+:: (+com.persistit.Configuration#setJmxEnabled+) True or false (default). Specifies whether Persistit registers MXBeans with the platform MBean server. Set this value to +true+ to enable access from +jconsole+ and other management tools.
1143
199
1144
200
+serialOverride+, +constructorOverride+:: (+com.persistit.Configuration#setSerialOverride+ +com.persistit.Configuration#setConstructorOverride+) Control aspects of object serialization. See <<Serialization>>.
1145
201
1146
202
+showgui+:: ((+com.persistit.Configuration#setShowGUI+) True of False.  If true, Persistit attempts to create and display an instance of the AdminUI utility panel within the current JVM. (Alternatively, AdminUI uses RMI and can be launched and run remotely if +rmiport+ or +rmihost+ has been specified.
1147
203
1148
204
+logfile+:: (+com.persistit.Configuration#setLogFile+) Name of a log file to which Persistit’s default logger will write diagnostic log entries. Applications generally install a logging adapter to reroute messages through Log4J, SLF4J or other logger. The +logfile+ property is used only when no adapter has been installed.
1149
205
1150
206
For all integer-valued properties, the suffix “K” may be used to represent kilo, “M” for mega, “G” for giga and “T” for tera. For example, “2M” represents the value 2,097,152.
1151
207
1152
208
== A Configuration Example
1153
209
1154
210
Following is an example of a Persistit configuration properties file:
1155
211
1156
212
----
1157
213
datapath = /var/opt/persistit/data
1158
214
logpath = /var/log/persistit
1159
215
logfile = ${logpath}/${timestamp}.log
1160
216
1161
217
buffer.count.16384 = 5000
1162
218
1163
219
volume.1 = ${datapath}/demo_data, create, pageSize:16K, \
1164
220
	initialSize:1M, extensionSize:1M, maximumSize:10G, alias:data
1165
221
1166
222
volume.2 = ${datapath}/demo_system, create, pageSize:16K, \
1167
223
	initialSize:100K, extensionSize:100K, maximumSize:1G
1168
224
1169
225
sysvolume = demo_system
1170
226
1171
227
journalpath = /ssd/persistit_journal
1172
228
----
1173
229
1174
230
With this configuration there will be 5,000 16K buffers in the buffer pool consuming heap space of approximately 93MB including overhead. Persistit will open or create volume files named +/var/opt/persistit/data/demo_data+ and +/var/opt/persistit/data/demo_system+ and a journal file named +/ssd/persistit_journal.0000000000000000+. Persistit will write diagnostic logging output to a file such as +/var/log/persistit/20110523172213.log+.
1175
231
1176
232
The +demo_data+ volume has the alias +data+. Application code uses the name "data" to refer to it. The +sysvolume+ property specifies that the +demo_system+ volume is designated to hold class meta data for serialized objects.
1177
233
1178
234
=== Property Value Substitution
1179
235
1180
236
This example also illustrates how property value substitution can be used within a Persistit configuration.  The value of the +datapath+ replaces +$\{datapath\}+ in the volume specification. The property name +datapath+ is arbitrary; you may use any valid property name as a substitution variable. Similarly, the value of +logpath+ replaces +$\{logpath\}+ and the pseudo-property +$\{timestamp\}+ expands to a timestamp in the form +_yyyyMMddHHmm_+ to provides a unique time-based log file name.
1181
237
1182
238
=== Incorporating Java System Properties
1183
239
1184
240
You may also specify any configuration property as a Java system property with the prefix +com.persisit.+ System properties override values specified as properties. For example, you can override the value of +buffer.count.8192+ specifying
1185
241
1186
242
----
1187
243
java -Dcom.persistit.buffer.count.8192=10K -jar MyJar
1188
244
----
1189
245
1190
246
This is also true for substitution property values. For example, +-Dcom.persistit.logpath=/tmp/+ will place the log files in the +/tmp+ directory rather than +/var/log/persistit+ as specified by the configuration file.
1191
247
0
1192
=== added file 'doc/GettingStarted.rst'
1193
--- doc/GettingStarted.rst	1970-01-01 00:00:00 +0000
1194
+++ doc/GettingStarted.rst	2012-05-30 18:23:19 +0000
1195
@@ -0,0 +1,270 @@
1196
1
1197
2
Getting Started with Akiban Persistit
1198
3
=====================================
1199
4
1200
5
Welcome!
1201
6
1202
7
We have worked hard to make Akiban Persistit(TM) exceptionally fast, reliable, simple and lightweight. We hope you will enjoy learning more about it and using it.
1203
8
1204
9
Akiban Persistit is a key/value data storage library written in Java(TM). Key features include:
1205
10
1206
11
- support for highly concurrent transaction processing with multi-version concurrency control
1207
12
- optimized serialization and deserialization mechanism for Java primitives and objects
1208
13
- multi-segment (compound) keys to enable a natural logical key hierarchy
1209
14
- support for long records (megabytes)
1210
15
- implementation of a persistent SortedMap
1211
16
- extensive management capability including command-line and GUI tools
1212
17
1213
18
This chapter briefly and informally introduces and demonstrates various Persistit features through examples. Subsequent chapters and the Javadoc API documentation provide a detailed reference guide to the product.
1214
19
1215
20
Download and Install
1216
21
--------------------
1217
22
1218
23
Download ``akiban-persistit-3.xx.yy.zip`` from http://www.akiban.com/persistit/download.html.
1219
24
1220
25
Unpack the distribution kit into a convenient directory using any unzip utility. For example, use ``jar`` to unpack the distribution kit to the current working directory as follows::

  jar xvf akiban-persistit-core-3.xx.yy.zip
1224
29
1225
30
Review the ``LICENSE.html`` and ``README.html`` files located in the root of the installation directory. Persistit is licensed under the GNU Affero General Public License. By installing, copying or otherwise using the Software contained in the distribution kit, you agree to be bound by the terms of the license agreement. If you do not agree to these terms, remove and destroy all copies of the software in your possession immediately.
1226
31
1227
32
Working with Persistit
1228
33
----------------------
1229
34
1230
35
Add the ``akiban-persistit-core-3.xx.yy.jar`` from the ``lib`` directory of the distribution kit to your project's classpath. For example, copy it to ``jre/lib/ext`` in your Java Runtime Environment, or add it to your classpath environment variable. 
1231
36
1232
37
That's it. You are ready to work with Persistit.
1233
38
1234
39
Examples
1235
40
^^^^^^^^
1236
41
1237
42
Review the ``examples`` directory. Here you will find functional examples of varying complexity.
1238
43
1239
44
  ``examples/HelloWorld``
1240
45
      source code for the example illustrating this chapter
1241
46
  ``examples/SimpleDemo``
1242
47
      short, simple program similar to HelloWorld
1243
48
  ``examples/SimpleBench``
1244
49
      a small micro-benchmark measuring the speed of insert, traversal, random updates, etc.
1245
50
  ``examples/SimpleTransaction``
1246
51
      example demonstrating use of Persistit’s multi-version concurrency control (MVCC) transactions
1247
52
  ``examples/FindFile``
1248
53
      a somewhat larger example that uses Persistit as the backing store for a file finder utility
1249
54
  ``examples/PersistitMapDemo``
1250
55
      demonstration of the PersistitMap interface
1251
56
1252
57
HelloWorld
1253
58
----------
1254
59
1255
60
Before going further let's honor tradition with a small program that stores, fetches and displays the phrase “Hello World.” In this program we will create a record with the key “Hello” and the value “World”.
1256
61
1257
62
.. HelloWorld.java
1258
63
1259
64
.. code-block:: java
1260
65
1261
66
  import com.persistit.Exchange;
1262
67
  import com.persistit.Key;
1263
68
  import com.persistit.Persistit;
1264
69
1265
70
  public class HelloWorld {
1266
71
      public static void main(String[] args) throws Exception {
1267
72
          Persistit db = new Persistit();
1268
73
          try {
1269
74
              // Read configuration from persistit.properties, allocate
1270
75
              // buffers, open Volume, and perform recovery processing
1271
76
              // if necessary.
1272
77
              //
1273
78
              db.initialize();
1274
79
              //
1275
80
              // Create an Exchange, which is a thread-private facade for
1276
81
              // accessing data in a Persistit Tree. This Exchange will
1277
82
              // access a Tree called "greetings" in a Volume called
1278
83
              // "hwdemo". It will create a new Tree by that name
1279
84
              // if one does not already exist.
1280
85
              //
1281
86
              Exchange dbex = db.getExchange("hwdemo", "greetings", true);
1282
87
              //
1283
88
              // Set up the Value field of the Exchange.
1284
89
              //
1285
90
              dbex.getValue().put("World");
1286
91
              //
1287
92
              // Set up the Key field of the Exchange.
1288
93
              //
1289
94
              dbex.getKey().append("Hello");
1290
95
              //
1291
96
              // Ask Persistit to put this key/value pair into the Tree.
1292
97
              // Until this point, the changes to the Exchange are local
1293
98
              // to this thread.
1294
99
              //
1295
100
              dbex.store();
1296
101
              //
1297
102
              // Prepare to traverse all the keys in the Tree (of which there
1298
103
              // is currently only one!) and for each key display its value.
1299
104
              //
1300
105
              dbex.getKey().to(Key.BEFORE);
1301
106
              while (dbex.next()) {
1302
107
                  System.out.println(dbex.getKey().indexTo(0).decode() + " "
1303
108
                          + dbex.getValue().get());
1304
109
              }
1305
110
              db.releaseExchange(dbex);
1306
111
          } finally {
1307
112
              // Always close Persistit. If the application does not do
1308
113
              // this, Persistit's background threads will keep the JVM from
1309
114
              // terminating.
1310
115
              //
1311
116
              db.close();
1312
117
          }
1313
118
      }
1314
119
  }
1315
120
1316
121
Concepts
1317
122
--------
1318
123
1319
124
Although ``HelloWorld.java`` is not very useful, it demonstrates several of the basic building blocks of the Persistit API.
1320
125
1321
126
Initialization and Configuration
1322
127
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1323
128
1324
129
Before accessing any data, ``HelloWorld.java`` calls one of the ``com.persistit.Persistit#initialize`` methods of ``com.persistit.Persistit``. This sets up the memory configuration for buffers and the path names of Persistit volume and journal files. Alternative versions of the initialize method accept configuration information from a ``java.util.Properties`` object, from a specified properties file, or by default from the file named ``persistit.properties``.
1325
130
1326
131
In this example, ``persistit.properties`` looks like this:: 
1327
132
1328
133
  datapath=.
1329
134
  buffer.count.8192=32
1330
135
  volume.1=${datapath}/hwdemo,create,pageSize:8192,\
1331
136
	  initialPages:5,extensionPages:5,maximumPages:100000
1332
137
  journalpath=${datapath}/hwdemo_journal
1333
138
1334
139
See :ref:`Configuration` for additional information about Persistit configuration properties.
1335
140
1336
141
Volumes and Trees
1337
142
^^^^^^^^^^^^^^^^^
1338
143
1339
144
A configuration defines one or more volume files that will contain stored Persistit data. Usually you will specify the ``create`` flag, which allows Persistit to create a new volume if the file does not already exist. Creating a new file also establishes the initial size and growth parameters for that volume.
1340
145
1341
146
Each volume may contain an unlimited number of named trees. Each tree within a volume embodies a logically distinct B-Tree index structure. Think of a tree as simply a named key space within a volume.
1342
147
1343
148
``HelloWorld.java`` stores its key/value pair in a tree called “greetings” in a volume named “hwdemo”. This is specified by constructing an Exchange.
1344
149
1345
150
Exchanges
1346
151
---------
1347
152
1348
153
The ``com.persistit.Exchange`` class is the primary facade for interacting with Persistit data. It is so-named because it allows an application to exchange information with the database. An Exchange provides methods for storing, deleting, fetching and traversing key/value pairs.
1349
154
1350
155
The method
1351
156
1352
157
.. code-block:: java
1353
158
1354
159
  Exchange dbex = db.getExchange("hwdemo", "greetings", true);
1355
160
1356
161
in ``HelloWorld.java`` finds a volume named "hwdemo" and attempts to find a tree in it named "greetings". If there is no such tree, ``getExchange`` creates it.
1357
162
1358
163
Methods ``com.persistit.Persistit#getExchange`` and ``com.persistit.Persistit#releaseExchange`` maintain a pool of reusable Exchange objects designed for use by multi-threaded applications such as web applications. If a suitable exchange already exists, ``getExchange`` returns it; otherwise it constructs a new one.
1359
164
1360
165
The Exchange looks up the volume name “hwdemo” by matching it against the volumes specified in the configuration. The match is based on the simple file name of the volume after removing its final dotted suffix.  For example, the volume name “hwdemo” matches the volume specification ``${datapath}/hwdemo.v00``.
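
The matching rule above can be pictured with a small, self-contained sketch. This is not Persistit's implementation; the class ``VolumeNameMatch`` and its methods are hypothetical names, and the second path in ``main`` is invented for illustration. The sketch only shows the idea of reducing a volume specification to its simple file name and dropping the final dotted suffix before comparing.

.. code-block:: java

  import java.util.Arrays;
  import java.util.List;

  // Hypothetical helper illustrating the name-matching rule; not part of Persistit.
  public class VolumeNameMatch {

      // Keep the simple file name of the path, then drop its final dotted suffix.
      public static String simpleVolumeName(String path) {
          int slash = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
          String fileName = path.substring(slash + 1);
          int dot = fileName.lastIndexOf('.');
          return dot > 0 ? fileName.substring(0, dot) : fileName;
      }

      // Return the first configured path whose reduced name equals the
      // requested volume name, or null if none matches.
      public static String findVolume(String name, List<String> configuredPaths) {
          for (String path : configuredPaths) {
              if (simpleVolumeName(path).equals(name)) {
                  return path;
              }
          }
          return null;
      }

      public static void main(String[] args) {
          List<String> paths = Arrays.asList("${datapath}/hwdemo.v00", "/var/data/vehicles.v01");
          System.out.println(findVolume("hwdemo", paths));
      }
  }

Under these assumptions, both ``${datapath}/hwdemo.v00`` and ``${datapath}/hwdemo`` reduce to the name "hwdemo".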
1361
166
1362
167
Each Exchange is implicitly associated with a ``com.persistit.Key`` and a ``com.persistit.Value``. Typically you work with an Exchange in one of the following patterns:
1363
168
1364
169
- Modify the Key, modify the Value and then perform a ``com.persistit.Exchange#store`` operation.
1365
170
- Modify the Key, perform a ``com.persistit.Exchange#fetch`` operation and then read the Value.
1366
171
- Modify the Key and then perform a ``com.persistit.Exchange#remove`` operation.
1367
172
- Optionally modify the Key, perform a ``com.persistit.Exchange#next``, ``com.persistit.Exchange#previous`` or ``com.persistit.Exchange#traverse`` operation, then read the resulting Key and/or Value.
1368
173
1369
174
These methods and their variants provide the foundation for using Persistit.
1370
175
1371
176
Records
1372
177
^^^^^^^
1373
178
1374
179
In Persistit, a database record consists of a Key and a Value. The terms “record” and “key/value pair” are used interchangeably.
1375
180
1376
181
When you store a record, Persistit searches for a previously stored record having the same key. If there is such a record, Persistit replaces its value.  If there is no such record, Persistit inserts a new one.  Like a Java Map, Persistit stores at most one value per key, and every record in a Tree has a unique key value.
1377
182
1378
183
Keys
1379
184
^^^^
1380
185
1381
186
A Key contains a unique identifier for a key/value pair - or record - in a tree. The identifier consists of a sequence of one or more Java values encoded into an array of bytes stored in the volume file.
1382
187
1383
188
Key instances are mutable. Your application typically changes an Exchange's Key in preparation for storing or fetching data. In particular, you can append, remove or replace one or more values in a Key. Each value you append is called a *key segment*. You append multiple key segments to implement concatenated keys. See ``com.persistit.Key`` for additional information on constructing keys and the ordering of key traversal within a tree.
1384
189
1385
190
The ``HelloWorld.java`` example appends “Hello” to the Exchange’s Key object in this line:
1386
191
1387
192
.. code-block:: java
1388
193
1389
194
            dbex.getKey().append("Hello");
1390
195
1391
196
The result is a key with a single key segment.
1392
197
1393
198
Values
1394
199
^^^^^^
1395
200
1396
201
A Value object represents the serialized state of a Java object or a primitive value. It is a staging area for data being transferred from or to the database by ``fetch``, ``traverse`` and ``store`` operations.
1397
202
1398
203
Value instances are mutable. The ``fetch`` and ``traverse`` operations modify the state of an Exchange's Value instance to represent the value associated with some Key. Your application executes methods to modify the state of the Value instance in preparation for storing new data values into the database.
1399
204
1400
205
Numerous methods allow you to serialize and deserialize primitive values and objects into and from a Value object. For example, in ``HelloWorld.java``, the statement
1401
206
1402
207
.. code-block:: java
1403
208
1404
209
            dbex.getValue().put("World");
1405
210
1406
211
serializes the string “World” into the backing byte array of the Exchange’s Value object and
1407
212
1408
213
.. code-block:: java
1409
214
1410
215
            	System.out.println(
1411
216
                	dbex.getKey().indexTo(0).decode() + " " +
1412
217
                	dbex.getValue().get());
1413
218
1414
219
deserializes and prints an object value from the Key and another object value from the Value. Value also has methods such as ``getInt``, ``getLong``, ``getByteArray`` to decode primitive and array values directly.
1415
220
1416
221
Storing and Fetching Data
1417
222
^^^^^^^^^^^^^^^^^^^^^^^^^
1418
223
1419
224
Finally, it is these two methods in ``HelloWorld.java`` that cause the Exchange object to share data with the B-Tree, making it persistent and potentially available to other threads:
1420
225
1421
226
.. code-block:: java
1422
227
1423
228
            dbex.store();
1424
229
            ...
1425
230
            while (dbex.next()) { ... }
1426
231
1427
232
Closing Persistit
1428
233
^^^^^^^^^^^^^^^^^
1429
234
1430
235
Persistit creates one or more background threads that lazily write data to the Volume files and perform other maintenance activities. Be sure to invoke the ``com.persistit.Persistit#close`` method to allow these threads to finish their work and exit properly. The pattern illustrated in ``HelloWorld.java``, using a *try/finally* block to invoke ``close``, is strongly recommended.
1431
236
1432
237
The ``com.persistit.Persistit#close(boolean)`` method optionally flushes all data to disk from the buffer pool before shutting down. Specifying the ``false`` option closes Persistit more quickly, but it will lose recent updates that were not performed inside of transactions and may require a longer recovery process during the next startup to reapply committed transactions.
1433
238
Additional Topics
1434
239
-----------------
1435
240
1436
241
PersistitMap
1437
242
^^^^^^^^^^^^
1438
243
A particularly easy way to get started with Persistit is to use its built-in ``com.persistit.PersistitMap`` implementation. PersistitMap implements the ``java.util.SortedMap`` interface, so it can directly replace ``java.util.TreeMap`` or other kinds of Map in existing Java code.
1439
244
1440
245
See :ref:`PersistitMap`.
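
As a rough illustration of why implementing ``java.util.SortedMap`` matters, the sketch below is written only against the ``SortedMap`` interface and uses ``java.util.TreeMap`` as a stand-in. The class name ``SortedMapSketch`` is hypothetical, and how a ``PersistitMap`` is constructed is deliberately not shown here; see the referenced chapter for that.

.. code-block:: java

  import java.util.Map;
  import java.util.SortedMap;
  import java.util.TreeMap;

  // Hypothetical example class; it uses only standard java.util types.
  public class SortedMapSketch {

      // Code written against SortedMap does not depend on the concrete map
      // class, so any SortedMap implementation could be passed in.
      public static String describe(SortedMap<String, String> map) {
          StringBuilder sb = new StringBuilder();
          for (Map.Entry<String, String> entry : map.entrySet()) {
              sb.append(entry.getKey()).append('=').append(entry.getValue()).append(';');
          }
          return sb.toString();
      }

      public static void main(String[] args) {
          // TreeMap is the in-memory stand-in for this sketch.
          SortedMap<String, String> greetings = new TreeMap<String, String>();
          greetings.put("Hello", "World");
          greetings.put("Bonjour", "Monde");
          // At most one value per key: this replaces the value stored for "Hello".
          greetings.put("Hello", "Persistit");
          // Entries are visited in key order.
          System.out.println(describe(greetings));
      }
  }

Because ``describe`` depends only on the interface, the same method would accept any other ``SortedMap`` with matching type parameters.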
1441
246
1442
247
KeyFilters
1443
248
^^^^^^^^^^
1444
249
1445
250
A ``com.persistit.KeyFilter`` can be supplied to restrict the results of a traversal operation.
1446
251
1447
252
Transactions
1448
253
^^^^^^^^^^^^
1449
254
1450
255
Persistit provides ACID transaction support with multi-version concurrency control (MVCC) and an adjustable durability policy.
1451
256
1452
257
See :ref:`Transactions`.
1453
258
1454
259
Managing Persistit
1455
260
^^^^^^^^^^^^^^^^^^
1456
261
1457
262
Persistit provides several mechanisms for managing Persistit operation within an application. These include
1458
263
1459
264
- JMX MXBeans
1460
265
- The ``com.persistit.Management`` object which provides programmatic access to many management operations
1461
266
- The ``com.persistit.CLI`` object which provides a command-line interface for various management operations
1462
267
- The AdminUI tool which provides a graphical client interface for examining records and other resources
1463
268
- A logging interface designed for easy embedding in host applications
1464
269
1465
270
See :ref:`Management`.
1466
0
271
1467
=== removed file 'doc/GettingStarted.txt'
1468
--- doc/GettingStarted.txt	2012-04-24 02:37:12 +0000
1469
+++ doc/GettingStarted.txt	1970-01-01 00:00:00 +0000
1470
@@ -1,255 +0,0 @@
1471
1
1472
2
= Getting Started with Akiban Persistit
1473
3
1474
4
1475
5
Welcome!
1476
6
1477
7
We have worked hard to make Akiban Persistit(TM) exceptionally fast, reliable, simple and lightweight. We hope you will enjoy learning more about it and using it.
1478
8
1479
9
Akiban Persistit is a key/value data storage library written in Java(TM). Key features include:
1480
10
1481
11
- support for highly concurrent transaction processing with multi-version concurrency control
1482
12
- optimized serialization and deserialization mechanism for Java primitives and objects
1483
13
- multi-segment (compound) keys to enable a natural logical key hierarchy
1484
14
- support for long records (megabytes)
1485
15
- implementation of a persistent SortedMap
1486
16
- extensive management capability including command-line and GUI tools
1487
17
1488
18
This chapter briefly and informally introduces and demonstrates various Persistit features through examples. Subsequent chapters and the Javadoc API documentation provides a detailed reference guide to the product.
1489
19
1490
20
== Download and Install
1491
21
1492
22
Download +akiban-persistit-3.xx.yy.zip+ from http://www.akiban.com/persistit/download.html.
1493
23
1494
24
Unpack the distribution kit into a convenient directory using any unzip utility. For example, use +jar+ to unpack the distribution kit to the current working directory as follows:
1495
25
----
1496
26
jar xvzf akiban-persistit-core-3.xx.yy.zip
1497
27
----
1498
28
1499
29
Review the +LICENSE.html+ and +README.html+ files located in the root of the installation directory. Persistit is licensed under the XXXXXX XXXXXXX License. By installing, copying or otherwise using the Software contained in the distribution kit, you agree to be bound by the terms of the license agreement. If you do not agree to these terms, remove and destroy all copies of the software in your possession immediately.
1500
30
1501
31
== Working with Persistit
1502
32
1503
33
Add the +akiban-persistit-core-3.xx.yy.jar+ from the +lib+ directory of the distribution kit to your project's classpath. For example, copy it to +jre/lib/ext+ in your Java Runtime Environment, or add it to your classpath environment variable. 
1504
34
1505
35
That's it. You are ready to work with Persistit.
1506
36
1507
37
=== Examples
1508
38
1509
39
Review the +examples+ directory. Here you will find functional examples of varying complexity.
1510
40
1511
41
[horizontal]
1512
42
+examples/HelloWorld+:: source code for the example illustrating this chapter
1513
43
+examples/SimpleDemo+:: short, simple program similar to HelloWorld
1514
44
+examples/SimpleBench+:: a small micro-benchmark measuring the speed of insert, traversal, random updates, etc.
1515
45
+examples/SimpleTransaction+:: example demonstrating use of Persisit’s multi-version currency control (MVCC) transactions
1516
46
+examples/FindFile+:: a somewhat larger example that uses Persistit as the backing store for file finder utility
1517
47
+example/PersistitMapDemo+:: demonstration of the PersistitMap interface
1518
48
1519
49
== HelloWorld
1520
50
1521
51
Before going further let's honor tradition with a small program that stores, fetches and displays the phrase “Hello World.” In this program we will create a record with the key “Hello” and the value “World”.
1522
52
1523
53
.HelloWorld.java
1524
54
[source,java]
1525
55
----
1526
56
import com.persistit.Exchange;
1527
57
import com.persistit.Key;
1528
58
import com.persistit.Persistit;
1529
59
1530
60
public class HelloWorld {
1531
61
    public static void main(String[] args) throws Exception {
1532
62
        Persistit db = new Persistit();
1533
63
        try {
1534
64
            // Read configuration from persistit.properties, allocate
1535
65
            // buffers, open Volume, and perform recovery processing
1536
66
            // if necessary.
1537
67
            //
1538
68
            db.initialize();
1539
69
            //
1540
70
            // Create an Exchange, which is a thread-private facade for
1541
71
            // accessing data in a Persistit Tree. This Exchange will
1542
72
            // access a Tree called "greetings" in a Volume called
1543
73
            // "hwdemo". It will create a new Tree by that name
1544
74
            // if one does not already exist.
1545
75
            //
1546
76
            Exchange dbex = db.getExchange("hwdemo", "greetings", true);
1547
77
            //
1548
78
            // Set up the Value field of the Exchange.
1549
79
            //
1550
80
            dbex.getValue().put("World");
1551
81
            //
1552
82
            // Set up the Key field of the Exchange.
1553
83
            //
1554
84
            dbex.getKey().append("Hello");
1555
85
            //
1556
86
            // Ask Persistit to put this key/value pair into the Tree.
1557
87
            // Until this point, the changes to the Exchange are local
1558
88
            // to this thread.
1559
89
            //
1560
90
            dbex.store();
1561
91
            //
1562
92
            // Prepare to traverse all the keys in the Tree (of which there
1563
93
            // is currently only one!) and for each key display its value.
1564
94
            //
1565
95
            dbex.getKey().to(Key.BEFORE);
1566
96
            while (dbex.next()) {
1567
97
                System.out.println(dbex.getKey().indexTo(0).decode() + " "
1568
98
                        + dbex.getValue().get());
1569
99
            }
1570
100
            db.releaseExchange(dbex);
1571
101
        } finally {
1572
102
            // Always close Persistit. If the application does not do
1573
103
            // this, Persistit's background threads will keep the JVM from
1574
104
            // terminating.
1575
105
            //
1576
106
            db.close();
1577
107
        }
1578
108
    }
1579
109
}
1580
110
----
1581
111
1582
112
== Concepts
1583
113
1584
114
Although +HelloWorld.java+ is not very useful, it demonstrates several of the basic building blocks of the Persistit API.
1585
115
1586
116
=== Initialization and Configuration
1587
117
1588
118
Before accessing any data, +HelloWorld.java+ calls one of the +com.persistit.Persistit#initialize+ methods of +com.persistit.Persistit+. This sets up the memory configuration for buffers and the path names of Persistit volume and journal files. Alternative versions of the initialize method accept configuration information from a +java.util.Properties+ object, from a specified properties file, or by default from the file named +persistit.properties+.
1589
119
1590
120
In this example, +persistit.properties+ looks like this:
1591
121
1592
122
----
1593
123
datapath=.
1594
124
buffer.count.8192=32
1595
125
volume.1=${datapath}/hwdemo,create,pageSize:8192,\
1596
126
	initialPages:5,extensionPages:5,maximumPages:100000
1597
127
journalpath=${datapath}/hwdemo_journal
1598
128
----
1599
129
1600
130
See <<Configuration>> for additional information about Persistit configuration properties.
1601
131
1602
132
=== Volumes and Trees
1603
133
1604
134
A configuration defines one or more volume files that will contain stored Persistit data. Usually you will specify the +create+ flag, which allows Persistit to create a new volume if the file does not already exist. Creating a new file also establishes the initial size and growth parameters for that volume.
1605
135
1606
136
Each volume may contain an unlimited number of named trees. Each tree within a volume embodies a logically distinct B-Tree index structure. Think of a tree as simply a named key space within a volume.
1607
137
1608
138
+HelloWorld.java+ stores its key/value pair in a tree called “greetings” in a volume named “hwdemo”. This is specified by constructing an Exchange.
1609
139
1610
140
=== Exchanges
1611
141
1612
142
The +com.persistit.Exchange+ class is the primary facade for interacting with Persistit data. It is so-named because it allows an application to exchange information with the database. An Exchange provides methods for storing, deleting, fetching and traversing key/value pairs.
1613
143
1614
144
The method
1615
145
1616
146
[source,java]
1617
147
----
1618
148
Exchange dbex = db.getExchange("hwdemo", "greetings", true);
1619
149
----
1620
150
1621
151
in +HelloWorld.java+ finds a volume named "hwdemo" and attempts to find a tree in it named "greetings". If there is no such tree, +getExchange+ creates it.
1622
152
1623
153
Methods +com.persistit.Persistit#getExchange+ and +com.persistit.Persistit#releaseExchange+ maintain a pool of reusable Exchange objects designed for use by multi-threaded applications such as web applications. If a suitable exchange already exists, +getExchange+ returns it; otherwise it constructs a new one.
1624
154
1625
155
The Exchange looks up the volume name “hwdemo” by matching it against the volumes specified in the configuration. The match is based on the simple file name of the volume after removing its final dotted suffix.  For example, the volume name “hwdemo” matches the volume specification +${datapath}/hwdemo.v00+.
1626
156
1627
157
Each Exchange is implicitly associated with a +com.persistit.Key+ and a +com.persistit.Value+. Typically you work with an Exchange in one of the following patterns:
1628
158
1629
159
- Modify the Key, modify the Value and then perform a +com.persistit.Exchange#store+ operation.
1630
160
- Modify the Key, perform a +com.persistit.Exchange#fetch+ operation and then read the Value.
1631
161
- Modify the Key and then perform a +com.persistit.Exchange#remove+ operation.
1632
162
- Optionally modify the Key, perform a +com.persistit.Exchange#next+, +com.persistit.Exchange#previous+ or +com.persistit.Exchange#traverse+ operation, then read the resulting Key and/or Value.
1633
163
1634
164
These methods and their variants provide the foundation for using Persistit.
1635
165
1636
166
=== Records
1637
167
1638
168
In Persistit, a database record consists of a Key and a Value. The terms “record” and “key/value pair” are used interchangeably.
1639
169
1640
170
When you store a record, Persistit searches for a previously stored record having the same key. If there is such a record, Persistit replaces its value.  If there is no such record, Persistit inserts a new one.  Like a Java Map, Persistit stores at most one value per key, and every record in a Tree has a unique key value.
1641
171
1642
172
=== Keys
1643
173
1644
174
A Key contains a unique identifier for key/value pair - or record - in a tree. The identifier consists of a sequence of one or more Java values encoded into an array of bytes stored in the volume file.
1645
175
1646
176
Key instances are mutable. Your application typically changes an Exchange's Key in preparation for fetching or retrieving data. In particular, you can append, remove or replace one or more values in a Key. Each value you append is called a _key segment_. You append multiple key segments to implement concatenated keys. See +com.persistit.Key+ for additional information on constructing keys and the ordering of key traversal within a tree.
1647
177
1648
178
The +HelloWorld.java+ example appends “Hello” to the Exchange’s Key object in this line:
1649
179
1650
180
[source,java]
1651
181
----
1652
182
            dbex.getKey().append("Hello");
1653
183
----
1654
184
1655
185
The result is a key with a single key segment.
1656
186
1657
187
=== Values
1658
188
1659
189
A Value object represents the serialized state of a Java object or a primitive value. It is a staging area for data being transferred from or to the database by +fetch+, +traverse+ and +store+ operations.
1660
190
1661
191
Value instances are mutable. The +fetch+ and +traverse+ operations modify the state of an Exchange's Value instance to represent the value associated with some Key. Your application executes methods to modify the state of the Value instance in preparation for storing new data values into the database.
1662
192
1663
193
Numerous methods allow you to serialize and deserialize primitives values and objects into and from a Value object. For example, in +HelloWorld.java+, the statement
1664
194
1665
195
[source,java]
1666
196
----
1667
197
            dbex.getValue().put("World");
1668
198
----
1669
199
serializes the string “World” into the backing byte array of the Exchange’s Value object and
1670
200
1671
201
[source,java]
1672
202
----
1673
203
            	System.out.println(
1674
204
                	dbex.getKey().indexTo(0).decode() + " " +
1675
205
                	dbex.getValue().get());
1676
206
----
1677
207
deserializes and prints an object value from the Key and another object value from the Value. Value also has methods such as +getInt+, +getLong+, +getByteArray+ to decode primitive and array values directly.
1678
208
1679
209
=== Storing and Fetching Data
1680
210
1681
211
Finally, it is these two methods in +HelloWorld.java+ that cause the Exchange object to share data with the B-Tree, making it persistent and potentially available to other threads:
1682
212
1683
213
[source,java]
1684
214
----
1685
215
            dbex.store();
1686
216
            ...
1687
217
            while (dbex.next()) { ... }
1688
218
----
1689
219
1690
220
=== Closing Persistit
1691
221
1692
222
Persistit creates one or more background threads that lazily write data to the Volume files and perform other maintenance activities. Be sure to invoke the +com.persistit.Persistit.close+ method to allow these threads to finish their work and exit properly. The pattern illustrated in +HelloWorld.java+, using a _try/finally_ block to invoke +close+, is strongly recommended.
1693
223
1694
224
The +close+ flushes all data from 
1695
225
1696
226
1697
227
== Additional Topics
1698
228
1699
229
=== PersistitMap
1700
230
1701
231
A particularly easy way to get started with Persistit is to use its built-in +com.persistit.PersistitMap+ implementation. PersistitMap implements the +java.util.SortedMap+ interface, so it can directly replace +java.util.TreeMap+ or other kinds of Map in existing Java code.
1702
232
1703
233
See <<PersistitMap>>.
1704
234
1705
235
=== KeyFilters
1706
236
1707
237
A +com.persistit.KeyFilter+ can be supplied to restrict the results traversal operation in a convenient and  
1708
238
1709
239
=== Transactions
1710
240
1711
241
Persistit provides ACID Transaction support with multi-version concurrency control (MCC) and adjustable durability policy.
1712
242
1713
243
See <<Transactions>>.
1714
244
1715
245
=== Managing Persistit
1716
246
1717
247
Persistit provides several mechanisms for managing Persistit operation within an application. These include
1718
248
1719
249
- JMX MXBeans
1720
250
- The +com.persistit.Management+ object which provides programmatic access to many management operations
1721
251
- The +com.persistit.CLI+ object which provides a command-line interface for various management operations
1722
252
- The AdminUI tool which provides a graphical client interface for examining records and other resources
1723
253
- Logging interface design for easy embedding in host applications
1724
254
1725
255
See <<Management>>.
1726
256
0
1727
=== added file 'doc/Management.rst'
1728
--- doc/Management.rst	1970-01-01 00:00:00 +0000
1729
+++ doc/Management.rst	2012-05-30 18:23:19 +0000
1730
@@ -0,0 +1,446 @@
1731
1
.. _Management:
1732
2
1733
3
Management
1734
4
==========
1735
5
1736
6
Akiban Persistit provides three main avenues for measuring and managing its internal resources: an RMI interface, a JMX interface and a command-line interface capable of launching various utility tasks. 
1737
7
1738
8
The RMI interface is primarily intended for the ``com.persistit.ui.AdminUI`` utility. AdminUI is a JFC/Swing program that runs on a device with graphical UI capabilities.  For example, on Linux and Unix it requires an X server. Since production servers are usually headless, it is often necessary to run AdminUI remotely via its RMI interface. To do this, the Persistit configuration must specify either the ``rmiport`` or ``rmihost`` property so that it can start an RMI server.
1739
9
1740
10
Suppose a Persistit-based application is running on a host named “somehost” and has specified the configuration property ``rmiport=1099`` in its configuration.  Then the AdminUI can be launched as follows to connect with it:
1741
11
1742
12
.. code-block:: sh
1743
13
1744
14
  java -cp classpath  com.persistit.ui.AdminUI somehost:1099
1745
15
1746
16
where classpath includes the Persistit ``com.persistit.ui`` package. 
1747
17
1748
18
The JMX interface can be used by third-party management utilities, from applications such as ``jconsole`` and ``visualvm``, and from command-line JMX clients such as ``jmxterm``. To enable JMX access, the configuration must specify the property ``jmx=true``.  This causes Persistit to register several MBeans with the platform MBean server during initialization.
1749
19
1750
20
MXBeans
1751
21
-------
1752
22
The following JMX MXBeans are available:
1753
23
1754
24
  ``com.persistit:type=Persistit``
1755
25
      See ``com.persistit.mxbeans.ManagementMXBean``
1756
26
  ``com.persistit:type=Persistit,class=AlertMonitorMXBean``
1757
27
      Accumulates, logs and emits notifications about abnormal events such as IOExceptions and measurements outside of 
1758
28
      expected thresholds.
1759
29
  ``com.persistit:type=Persistit,class=CleanupManagerMXBean``
1760
30
      View current state of the Cleanup Manager. The Cleanup Manager performs background pruning and tree maintenance 
1761
31
      activities.
1762
32
  ``com.persistit:type=Persistit,class=IOMeter``
1763
33
      Maintains statistics on file system I/O operations.
1764
34
  ``com.persistit:type=Persistit,class=JournalManager``
1765
35
      Views current journal status.
1766
36
  ``com.persistit:type=Persistit,class=RecoveryManager``
1767
37
      Views current status of the recovery process. Attributes of this MXBean change only during the recovery process.
1768
38
  ``com.persistit:type=Persistit,class=TransactionIndexMXBean``
1769
39
      View internal state of transaction index queues and tables.
1770
40
  ``com.persistit:type=Persistit,class=BufferPool.*SSSS*``
1771
41
      where *SSSS* is a buffer size (1024, 2048, 4096, 8192 or 16384). View utilization statistics for buffers of the 
1772
42
      selected size.
1773
43
1774
44
1775
45
For details see the JavaDoc API documentation for each MXBean interface.
1776
46
1777
47
Management Tasks
1778
48
----------------
1779
49
1780
50
Persistit provides several ways to launch and administer ``com.persistit.Task`` instances.  A ``Task`` is a management operation that may take a significant amount of time and usually runs in a separate thread. For example, ``com.persistit.IntegrityCheck`` is a ``Task`` that verifies the internal structural integrity of one or more trees and can run for minutes to hours, depending on the size of the database.  The :ref:`AdminUI` tool, ``com.persistit.mxbeans.ManagementMXBean`` and the command-line interface (:ref:`CLI`) provide mechanisms to launch, suspend or stop a task, and to monitor a task’s progress.
1781
51
1782
52
Currently the following built-in Tasks are available:
1783
53
1784
54
  ``icheck``
1785
55
      Check the integrity of one or more trees or volumes.
1786
56
  ``save``
1787
57
      Save selected key-value pairs from one or more trees to a flat file.
1788
58
  ``load``
1789
59
      Load selected key-value pairs from a flat file written by ``save``.
1790
60
  ``backup``
1791
61
      Control and/or perform a concurrent backup of one or more volumes.
1792
62
  ``stat``
1793
63
      Aggregate various performance statistics and either return them immediately, or write them periodically to a file.
1794
64
  ``task``
1795
65
      Check the status of an existing task.  This task can also suspend, resume or stop an existing task. This task, which 
1796
66
      immediately returns status information, can be used by external tools to poll the status of other tasks.
1797
67
  ``cliserver``
1798
68
      Start a simple command-line server on a specified port.  This enables a client program to execute commands by sending 
1799
69
      them directly to that port.
1800
70
  *other tasks*
1801
71
      Various commands allow you to select and view pages and journal records.
1802
72
1803
73
1804
74
Executing a Task from an Application
1805
75
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1806
76
1807
77
The ``com.persistit.mxbeans.ManagementMXBean#execute`` and ``com.persistit.mxbeans.ManagementMXBean#launch`` methods both take a single String-valued argument, parse it to set up a ``Task`` and return a String-valued result. For example:
1808
78
1809
79
.. code-block:: java
1810
80
1811
81
  String taskId = db.getManagement().launch("backup -z file=/tmp/mybackup.zip");
1812
82
  String status = db.getManagement().execute("task -v -m -c taskId=" + taskId);
1813
83
1814
84
launches the backup task and then queries its status.
1815
85
1816
86
Executing a Task from a JMX Client
1817
87
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1818
88
1819
89
The ``com.persistit.mxbeans.ManagementMXBean#execute`` and ``com.persistit.mxbeans.ManagementMXBean#launch`` methods are exposed as operations on the ``com.persistit.mxbeans.ManagementMXBean``.  You can invoke tasks
1820
90
1821
91
- via ``jconsole`` by typing the desired command line as the argument of the ``execute`` operation.
1822
92
- via a third-party JMX client such as ``jmxterm``.
1823
93
- via the ``cliserver`` feature
1824
94
1825
95
Executing a Task Using a Third-Party JMX client
1826
96
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1827
97
1828
98
For example, you can use the ``jmxterm`` program (see http://www.cyclopsgroup.org/projects/jmxterm) to execute commands with the following shell script::
1829
99
1830
100
  #!/bin/sh
1831
101
  java -jar jmxterm-1.0-alpha-4-uber.jar --verbose silent --noninteract --url $1 <<EOF
1832
102
  run -d com.persistit -b com.persistit:type=Persistit execute $2
1833
103
  EOF
1834
104
1835
105
To use this script, specify either the JMX URL or the process ID as the first command argument, and the command line as the second argument.  Example::
1836
106
1837
107
  peter:~/workspace/sandbox$ jmxterm-execute 1234 'stat\ -a'
1838
108
  hit=3942334 miss=14 new=7364 evict=0 jwrite=81810 jread=2 jcopy=63848 tcommit=0 troll=0 CC=0 RV=12 RJ=2 WJ=81810 EV=0 FJ=529 IOkbytes=1134487 TOTAL
1839
109
1840
110
This command invokes the ``stat`` task with the flag ``-a`` on a JVM running with process id 1234.  Note that with ``jmxterm``, white space must be escaped with a backslash (‘\’) even though the argument list is also enclosed in single quotes.  The backslash marshals the space character through ``jmxterm``’s parser. Commas and other delimiters also need to be escaped.
1841
111
1842
112
.. _cliserver:
1843
113
1844
114
Executing a Task Using the Built-In ``cliserver``
1845
115
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1846
116
1847
117
``cliserver`` is a simple text-based server that receives a command line as a text string and emits the generated output as its response. To start it, enter the command::
1848
118
1849
119
  cliserver port=9999
1850
120
1851
121
programmatically or through JMX. (You may specify any valid, available port.) Then use a command-line client to send command lines to that port and display their results. Persistit includes a primitive command-line client within the ``com.persistit.CLI`` class itself.  Create a script to invoke it as follows::
1852
122
1853
123
  #!/bin/sh
1854
124
  java -cp classpath com.persistit.CLI localhost:9999 $*
1855
125
1856
126
where ``classpath`` includes the Persistit library. Assuming the script is named ``pcli``, you can then invoke commands from a shell as shown in this example::
1857
127
1858
128
  /home/akiban:~$ pcli icheck -v -c "trees=*:Acc*"
1859
129
  Volume,Tree,Faults,IndexPages,IndexBytes,DataPages,DataBytes,LongRecordPages,LongRecordBytes,MvvPages,MvvRecords,MvvOverhead,MvvAntiValues,IndexHoles,PrunedPages
1860
130
  "persistit","AccumulatorRecoveryTest",0,3,24296,1519,15560788,0,0,1506,52192,721521,2397,0,0
1861
131
  "*","*",0,3,24296,1519,15560788,0,0,1506,52192,721521,2397,0,0
1862
132
  /home/akiban:~$
1863
133
1864
134
Alternatively, you can use ``curl`` as follows::
1865
135
1866
136
  #!/bin/sh
1867
137
  echo "$*" | curl --silent --show-error telnet://localhost:9999
1868
138
1869
139
to issue commands.
1870
140
1871
141
.. caution::
1872
142
   
1873
143
   ``cliserver`` has no access control and sends potentially sensitive data in cleartext form. Therefore it should be used with care and only in a secure 
1874
144
   network environment. Its primary mission is to allow easy inspection of internal data structures within Persistit.
1875
145
1876
146
.. _CLI:
1877
147
1878
148
The Command-Line Interface
1879
149
--------------------------
1880
150
1881
151
The String value passed to the ``execute`` and ``launch`` operations specifies the name of a task and its arguments. The general form is::
1882
152
1883
153
  commandname -flag -flag argname=value argname=value
1884
154
1885
155
where the order of arguments and flags is not significant.
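
When command lines are assembled in application code, a small helper can keep the general form consistent. This is only a sketch: ``CliCommandLine`` and its ``build`` method are hypothetical names, and the helper does no quoting, so it assumes that flag names and argument values contain no spaces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Persistit API. */
class CliCommandLine {

    /** Joins a command name, flags and name=value arguments with single spaces. */
    static String build(String commandName, List<String> flags, Map<String, String> arguments) {
        List<String> tokens = new ArrayList<>();
        tokens.add(commandName);
        for (String flag : flags) {
            tokens.add("-" + flag);
        }
        for (Map.Entry<String, String> argument : arguments.entrySet()) {
            tokens.add(argument.getKey() + "=" + argument.getValue());
        }
        return String.join(" ", tokens);
    }

    public static void main(String[] args) {
        Map<String, String> arguments = new LinkedHashMap<>();
        arguments.put("trees", "vehicles/*");
        // Prints: icheck -h trees=vehicles/*
        System.out.println(build("icheck", Arrays.asList("h"), arguments));
    }
}
```

Because the order of flags and arguments is not significant, it does not matter that this helper always emits flags first.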
1886
156
1887
157
1888
158
Command: ``icheck``
1889
159
^^^^^^^^^^^^^^^^^^^
1890
160
1891
161
Performs a com.persistit.IntegrityCheck task. Arguments:
1892
162
1893
163
  ``trees``
1894
164
      Specifies volumes and/or trees to check. See ``com.persistit.TreeSelector`` for syntax details. Default is all trees in all volumes.
1895
165
  ``-r``
1896
166
      Tree specification uses Java RegEx syntax. (Default is to treat ‘*’ and ‘?’ as standard multi-character and single-character wildcards, respectively.)
1897
167
  ``-u``
1898
168
      Don't freeze updates (Default is to freeze updates)
1899
169
  ``-h``
1900
170
      Fix index holes. An *index hole* is an anomaly that occurs rarely in normal operation such that a page does not have an index entry in the index page level 
1901
171
      immediately above it.
1902
172
  ``-p``
1903
173
      Prune obsolete MVV (multi-version value) instances while checking.
1904
174
  ``-P``
1905
175
      Prune obsolete MVV instances, and clear any remaining aborted TransactionStatus instances.  Use with care.
1906
176
  ``-v``
1907
177
      Emit verbose output. For example, emit statistics for each tree.
1908
178
  ``-c``
1909
179
      Display tree statistics in comma-separated values (CSV) format suitable for import into a spreadsheet program.
1910
180
1911
181
Example::
1912
182
1913
183
  icheck trees=vehicles/* -h
1914
184
1915
185
Checks all trees in the ``vehicles`` volume and repairs index holes.
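
Output produced with the ``-c`` flag can also be consumed by a program. The sketch below pairs the column names from a header line with the values of one data line, using the layout of the sample output shown in the ``cliserver`` section. The class ``IcheckCsvRow`` is hypothetical, and the code assumes that no field contains an embedded comma.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the Persistit API. */
class IcheckCsvRow {

    /** Maps each column name in the header line to the value at the same position in a data line. */
    static Map<String, String> parse(String headerLine, String dataLine) {
        String[] names = headerLine.split(",");
        String[] values = dataLine.split(",");
        if (names.length != values.length) {
            throw new IllegalArgumentException(
                    "expected " + names.length + " fields but found " + values.length);
        }
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            row.put(names[i], stripQuotes(values[i]));
        }
        return row;
    }

    /** Removes one pair of surrounding double quotes, if present. */
    private static String stripQuotes(String field) {
        if (field.length() >= 2 && field.startsWith("\"") && field.endsWith("\"")) {
            return field.substring(1, field.length() - 1);
        }
        return field;
    }

    public static void main(String[] args) {
        String header = "Volume,Tree,Faults,IndexPages,IndexBytes,DataPages,DataBytes,"
                + "LongRecordPages,LongRecordBytes,MvvPages,MvvRecords,MvvOverhead,"
                + "MvvAntiValues,IndexHoles,PrunedPages";
        String data = "\"persistit\",\"AccumulatorRecoveryTest\",0,3,24296,1519,15560788,"
                + "0,0,1506,52192,721521,2397,0,0";
        Map<String, String> row = parse(header, data);
        // Prints: AccumulatorRecoveryTest has 0 index holes
        System.out.println(row.get("Tree") + " has " + row.get("IndexHoles") + " index holes");
    }
}
```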
1916
186
1917
187
Command: ``save``
1918
188
^^^^^^^^^^^^^^^^^
1919
189
1920
190
Starts a com.persistit.StreamSaver task. Arguments:
1921
191
1922
192
  ``file``
1923
193
      Name of file to save records to (required)
1924
194
  ``trees``
1925
195
      Specifies volumes and/or trees to save. See ``com.persistit.TreeSelector`` for syntax details. Default is all trees in all volumes.
1926
196
  ``-r``
1927
197
      Tree specification uses Java RegEx syntax. (Default is to treat ‘*’ and ‘?’ as standard multi-character and single-character wildcards, respectively.)
1928
198
  ``-v``
1929
199
      emit verbose output
1930
200
  
1931
201
1932
202
1933
203
Example::
1934
204
1935
205
  save -v file=/home/akiban/save.dat trees=vehicles/*{["Edsel":"Yugo"]}
1936
206
1937
207
Saves the records for “Edsel” through “Yugo”, inclusive, from any tree in the volume named ``vehicles``. See com.persistit.TreeSelector for selection syntax details.
1938
208
1939
209
Command: ``load``
1940
210
^^^^^^^^^^^^^^^^^
1941
211
1942
212
Starts a com.persistit.StreamLoader task. Arguments:
1943
213
1944
214
  ``file``
1945
215
      Name of file to load records from
1946
216
  ``trees``
1947
217
      Specifies volumes and/or trees to load. See ``com.persistit.TreeSelector`` for syntax details. Default is all trees in all volumes.
1948
218
  ``-r``
1949
219
      Tree specification uses Java RegEx syntax. (Default is to treat ‘*’ and ‘?’ as standard multi-character and single-character wildcards, respectively.)
1950
220
  ``-n``
1951
221
      Don't create missing Volumes (Default is to create them)
1952
222
  ``-t``
1953
223
      Don't create missing Trees (Default is to create them)
1954
224
  ``-v``
1955
225
      Emit verbose output
1956
226
1957
227
1958
228
1959
229
Example::
1960
230
1961
231
  load file=/home/akiban/save.dat trees=*/*{["Falcon":"Firebird"]}
1962
232
1963
233
For any tree in any volume, this command loads all records having keys between “Falcon” and “Firebird”, inclusive.
1964
234
1965
235
Command: ``backup``
1966
236
^^^^^^^^^^^^^^^^^^^
1967
237
1968
238
Starts a ``com.persistit.BackupTask`` task to perform concurrent (hot) backup. Arguments:
1969
239
1970
240
  ``file``
1971
241
      Archive file path. If this argument is specified, BackupTask will back up the database in .zip format to the specified file.  This is intended only for small databases. It is expected that ``backup`` will be used in conjunction with high-speed third-party data copying utilities for production use. The ``-a`` and ``-e`` flags are incompatible with the ``file`` argument and are ignored when it is specified.
1975
245
  ``-a``
1976
246
      Start appendOnly mode - for use with third-party backup tools.  ``backup -a`` should be invoked before data copying begins.
1977
247
  ``-e``
1978
248
      End appendOnly mode - for use with third-party backup tools.  ``backup -e`` should be invoked after data copying ends.
1979
249
  ``-c``
1980
250
      Request checkpoint before backup.
1981
251
  ``-z``
1982
252
      Compress output to ZIP format - meaningful only in conjunction with the ``file`` argument.
1983
253
  ``-f``
1984
254
      Emit a list of files that need to be copied. In this form the task immediately returns with a list of files currently comprising the Persistit database,  
1985
255
      including Volume and journal files.
1986
256
  ``-y``
1987
257
      Copy pages from journal to Volumes before starting backup.  This reduces the number of journal files in the backup set.
1988
258
1989
259
Example::
1990
260
1991
261
    backup -a -c -y -f
1992
262
    … invoke third-party backup tool to copy the database files
1993
263
    backup -e
1994
264
1995
265
This example invokes the ``backup`` task twice. The first call sets *append-only* mode, checkpoints the journal, performs a full copy-back cycle (a process that attempts to shorten the journal) and then writes out a list of files that need to be copied. The second call restores normal operation.  Between the two calls, a third-party backup tool copies the data.
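
If this sequence is driven from Java, it is worth making sure that append-only mode is ended even when the copy step fails. The sketch below shows one way to do that with ``try``/``finally``. All names are hypothetical: ``execute`` stands in for whatever mechanism submits a command line and returns its result, and ``copyFiles`` stands in for the third-party copy step. The format of the file list returned by ``-f`` is not assumed; it is passed through unchanged.

```java
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical helper for illustration; not part of the Persistit API. */
class AppendOnlyBackup {

    /**
     * Runs the copy step between "backup -a ..." and "backup -e".
     *
     * @param execute   submits a command line and returns its textual result (assumed wiring)
     * @param copyFiles receives the text returned by the -f flag and copies the files it names
     */
    static void run(Function<String, String> execute, Consumer<String> copyFiles) {
        // Enter append-only mode, checkpoint, copy back pages and obtain the file list.
        String fileList = execute.apply("backup -a -c -y -f");
        try {
            copyFiles.accept(fileList);
        } finally {
            // Always restore normal operation, even if copying failed.
            execute.apply("backup -e");
        }
    }

    public static void main(String[] args) {
        // Stand-in executor that only echoes the command lines it receives.
        run(commandLine -> {
            System.out.println("would execute: " + commandLine);
            return "";
        }, fileList -> System.out.println("would copy the listed files"));
    }
}
```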
1996
266
1997
267
Example::
1998
268
1999
269
    backup -z file=/tmp/my_backup.zip
2000
270
2001
271
Uses the built-in file copy feature with ZIP compression.
2002
272
2003
273
Command: ``task``
2004
274
^^^^^^^^^^^^^^^^^
2005
275
2006
276
Queries, stops, suspends or resumes a background task.  Arguments:
2007
277
2008
278
  ``taskId``
2009
279
      Task ID to check, or -1 for all
2010
280
  ``-v``
2011
281
      Verbose - returns detailed status messages from the selected task(s)
2012
282
  ``-m``
2013
283
      Keep previously delivered messages. Default is to remove messages once reported.
2014
284
  ``-k``
2015
285
      Keep the selected task or tasks even if completed.  Default is to remove tasks once reported.
2016
286
  ``-x``
2017
287
      Stop the selected task or tasks
2018
288
  ``-u``
2019
289
      Suspend the selected task or tasks
2020
290
  ``-r``
2021
291
      Resume the selected task or tasks
2022
292
2023
293
Unlike other commands, the ``task`` command always runs immediately even if invoked through the ``launch`` method. 
2024
294
2025
295
You can use the ``task`` command to poll and display progress of long-running tasks. Invoke::
2026
296
2027
297
  task  -v -m -c taskId=nnn
2028
298
2029
299
until the result is empty.
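
The polling loop can be sketched in Java as follows. The names are hypothetical: ``execute`` stands in for the operation that runs a command line and returns its result as a String, and the pause between polls is an arbitrary choice made for the example. A result that is null or contains only whitespace is treated as empty here, which is an assumption.

```java
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical helper for illustration; not part of the Persistit API. */
class TaskPoller {

    /**
     * Polls a task until the "task" command returns an empty result.
     *
     * @return the number of non-empty status reports that were delivered to the sink
     */
    static int pollUntilEmpty(Function<String, String> execute, String taskId,
            long pauseMillis, Consumer<String> sink) throws InterruptedException {
        int reports = 0;
        while (true) {
            String result = execute.apply("task -v -m -c taskId=" + taskId);
            // Assumption: null or whitespace-only output counts as an empty result.
            if (result == null || result.trim().isEmpty()) {
                return reports;
            }
            sink.accept(result);
            reports++;
            Thread.sleep(pauseMillis);
        }
    }
}
```

A caller would typically pass ``System.out::println`` as the sink and a pause of a second or more, so that polling does not compete with the task being monitored.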
2030
300
2031
301
Command: ``cliserver``
2032
302
^^^^^^^^^^^^^^^^^^^^^^
2033
303
2034
304
Starts a simple text-based server that receives a command line as a text string and emits the generated output as its response. Argument:
2035
305
2036
306
  ``port``
2037
307
      Port number on which to listen for commands.
2038
308
2039
309
Command: ``exit``
2040
310
^^^^^^^^^^^^^^^^^
2041
311
2042
312
Ends a running ``cliserver`` instance.
2043
313
2044
314
Commands for Viewing Data
2045
315
^^^^^^^^^^^^^^^^^^^^^^^^^
2046
316
2047
317
The following commands execute immediately, even if invoked through the ``launch`` method.  They provide a mechanism to examine individual database pages or journal records.
2048
318
2049
319
Command: ``select``
2050
320
^^^^^^^^^^^^^^^^^^^
2051
321
2052
322
Selects a volume and optionally a tree for subsequent operations such as ``pview``. Arguments:
2053
323
2054
324
  ``tree``
2055
325
      Specifies volume and/or tree to select as context for subsequent operations. See ``com.persistit.TreeSelector`` for syntax details.
2056
326
  ``-r``
2057
327
      Tree specification uses Java RegEx syntax. (Default is to treat ‘*’ and ‘?’ as standard multi-character and single-character wildcards, respectively.)
2058
328
2059
329
Command: ``list``
2060
330
^^^^^^^^^^^^^^^^^
2061
331
2062
332
Lists volumes and trees.  Arguments:
2063
333
2064
334
  ``trees``
2065
335
      Specifies volumes and/or trees to list. See ``com.persistit.TreeSelector`` for syntax details. Default is all trees in all volumes.
2066
336
  ``-r``
2067
337
      Tree specification uses Java RegEx syntax. (Default is to treat ‘*’ and ‘?’ as standard multi-character and single-character wildcards, respectively.)
2068
338
2069
339
All volumes, and all trees within those volumes, that match the ``trees`` specification are listed. By default, this command lists all trees in all volumes.
2070
340
2071
341
Command: ``pview``
2072
342
^^^^^^^^^^^^^^^^^^
2073
343
2074
344
Displays contents of a database page. Arguments:
2075
345
2076
346
  ``page``
2077
347
      page address
2078
348
  ``jaddr``
2079
349
      journal address - displays a page version stored at the specified journal address
2080
350
  ``key``
2081
351
      a key specified as a String defined in the com.persistit.Key class
2082
352
  ``level``
2083
353
      tree level of the desired page
2084
354
  ``find``
2085
355
      selected records in an index page surrounding a key that points to the specified page address
2086
356
  ``-a``
2087
357
      all records. If specified, all records in the page will be displayed.  Otherwise the output is abbreviated to no more than 20 lines.
2088
358
  ``-s``
2089
359
      summary - only header information in the page is displayed
2090
360
2091
361
The ``pview`` command identifies a page in one of three distinct ways: by page address, by journal address, or by key.  Only one of the three parameters ``page``, ``jaddr`` or ``key`` (with ``level``) may be used.
2092
362
2093
363
``page`` specifies the current version of a page having the specified address.  If there is a copy of the page in the buffer pool, that copy is displayed even if it contains updates that are not yet written to disk.
2094
364
2095
365
``jaddr`` specifies an address in the journal. A typical use is to invoke the ``jview`` command to list journal records, and then invoke the ``pview`` command with the journal address of one page record to see a detailed view of it.
2096
366
2097
367
``key`` specifies a key. By default the data page associated with that key will be displayed.  The data page is defined as level 0. The ``level`` parameter allows pages at various index levels to be viewed; for example ``level=1`` refers to the index page that points to the data page containing the specified key.
2098
368
2099
369
When examining an index page with potentially hundreds of records it is sometimes convenient to find the record that points to a particular child page, and also the records immediately before and after. Specifying the ``find`` parameter when viewing an index page abbreviates the displayed records to include just the first and last records in the page, plus a small range of records surrounding the one that points to the specified page. This mechanism provides a convenient way to find sibling pages.
2100
370
2101
371
2102
372
Command: ``path``
2103
373
^^^^^^^^^^^^^^^^^
2104
374
2105
375
For a specified key, displays the sequence of pages from the root of the tree to the data page containing the key. Argument:
2106
376
2107
377
  ``key``
2108
378
      a key specified as a String defined in the com.persistit.Key class
2109
379
2110
380
2111
381
Command: ``jview``
2112
382
^^^^^^^^^^^^^^^^^^
2113
383
2114
384
Displays journal records.  Arguments:
2115
385
2116
386
  ``start``
2117
387
      starting journal address (default = 0)
2118
388
  ``end``
2119
389
      end journal address (default = infinite)
2120
390
  ``timestamps``
2121
391
      range selection of timestamp values, e.g., “132466-132499” for records having timestamps between these two numbers, inclusive. Default is all timestamps.
2122
392
  ``types``
2123
393
      comma-delimited list of two-character record types, e.g., “JH,IV,IT,CP” to select only Journal Header, Identify Volume, Identify Tree and Check Point records
2124
394
      (see ``com.persistit.JournalRecord`` for definitions of all types.) Default value is all types.
2125
395
  ``pages``
2126
396
      range selection of page addresses for PA (Page) records, e.g., “1,2,13-16” to include pages 1, 2, 13, 14, 15 and 16.
2127
397
  ``maxkey``
2128
398
      maximum display length of key values in the output. Default value is 42.
2129
399
  ``maxvalue``
2130
400
      maximum display length of values in the output. Default value is 42.
2131
401
  ``path``
2132
402
      journal file path. Default is the journal file path of the currently instantiated Persistit instance.
2133
403
  ``-v``
2134
404
      verbose format. If specified, causes PM (Page Map) and TM (TransactionMap) records to display all map elements.
2135
405
2136
406
2137
407
Note that the journal on a busy system contains a large number of records, so entering the ``jview`` command without constraining the address range or record types may result in extremely lengthy output.
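
The range syntax shown for the ``pages`` argument can be illustrated with a short expansion routine. This is a sketch of how such a specification reads, not the parser used by ``jview``. The class name ``RangeSpec`` is hypothetical, and the code assumes non-negative values with inclusive ranges, as in the “1,2,13-16” example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Persistit API. */
class RangeSpec {

    /** Expands a specification such as "1,2,13-16" into the individual values it selects. */
    static List<Long> expand(String spec) {
        List<Long> values = new ArrayList<>();
        for (String part : spec.split(",")) {
            String item = part.trim();
            int dash = item.indexOf('-');
            if (dash < 0) {
                // A single value.
                values.add(Long.parseLong(item));
            } else {
                // An inclusive range "first-last".
                long first = Long.parseLong(item.substring(0, dash));
                long last = Long.parseLong(item.substring(dash + 1));
                for (long value = first; value <= last; value++) {
                    values.add(value);
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Prints: [1, 2, 13, 14, 15, 16]
        System.out.println(expand("1,2,13-16"));
    }
}
```

Expanding a range is reasonable for a handful of page addresses. For wide ranges, such as timestamp ranges, a membership test against the bounds would be the better design.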
2138
408
2139
409
Command: ``open``
2140
410
^^^^^^^^^^^^^^^^^
2141
411
2142
412
Opens a Persistit database for analysis. This task can only be used to examine a copy of a Persistit database that is not currently in use by an application. It works by attempting to open the volume and journal files using a synthesized configuration. It finds a collection of journal files and volume files specified by the ``datapath``, ``journalpath`` and ``volumepath`` arguments; from these it derives a set of properties that will allow it to examine those journals and volumes. By default all volumes are opened in read-only mode and cannot be changed by operations executed from the command-line interface.
2143
413
2144
414
If there already is an open Persistit instance, this command detaches it. For example, if you start ``cliserver`` from a live Persistit instance and then issue the ``open`` command, the live instance will continue to operate but ``cliserver`` will no longer be attached to it.
2145
415
2146
416
Note that you cannot ``open`` volumes that are already open in a running Persistit instance due to their file locks. However, you can copy open volumes and journal files to another location and ``open`` the copy. This is the primary use case for the ``open`` command: to analyze a copy of a database (for example a copy recovered from backup) without having to launch the application software that embeds Persistit.
2147
417
2148
418
Arguments:
2149
419
2150
420
  ``datapath``
2151
421
      a directory path for volume and journal files to be analyzed
2152
422
  ``volumepath``
2153
423
      overrides ``datapath`` to specify an alternative location for volume files.
2154
424
  ``journalpath``
2155
425
      overrides ``datapath`` to specify an alternative location for journal files.
2156
426
  ``rmiport``
2157
427
      specifies an RMI port to which an instance of the AdminUI can attach.
2158
428
  ``-g``
2159
429
      launch a local copy of AdminUI
2160
430
  ``-y``
2161
431
      attempt to recover committed transactions.
2162
432
2163
433
Note that even if you specify ``-y`` to recover transactions, the volume files will not be modified. But the ``open`` command will add a new journal file containing modifications caused by the recovery process. You can simply delete that file when done.
2164
434
2165
435
Command: ``close``
2166
436
^^^^^^^^^^^^^^^^^^
2167
437
2168
438
Detach and close the current Persistit instance. If the CLI was started with a live Persistit instance then this command merely detaches from it; if the instance was created with the ``open`` command then ``close`` closes it and releases all related file locks, buffers, etc.
2169
439
2170
440
Command: ``source``
2171
441
^^^^^^^^^^^^^^^^^^^
2172
442
2173
443
Execute command lines from a specified text file. Argument:
2174
444
2175
445
  ``file``
2176
446
      file name of command input file
2177
0
447
2178
=== removed file 'doc/Management.txt'
2179
--- doc/Management.txt	2012-04-30 22:09:31 +0000
2180
+++ doc/Management.txt	1970-01-01 00:00:00 +0000
2181
@@ -1,360 +0,0 @@
2182
1
[[Management]]
2183
2
= Management
2184
3
2185
4
Akiban Persistit provides three main avenues for measuring and managing its internal resources: an RMI interface, a JMX interface and a command-line interface capable of launching various utility tasks. 
2186
5
2187
6
The RMI interface is primarily intended for the com.persistit.ui.AdminUI utility. AdminUI is a JFC/Swing program that runs on a device with graphical UI capabilities.  For example, in Linux and Unix it requires an XServer. Since production servers are usually headless it is often necessary to run AdminUI remotely, via its RMI interface. To do this, the Persistit configuration must specify either the +rmiport+ or +rmihost+ property so that it can start an RMI server.
2188
7
2189
8
Suppose a Persistit-based application is running on a host named “somehost” and has specified the configuration property +rmiport=1099+ in its configuration.  Then the AdminUI can be launched as follows to connect with it:
2190
9
2191
10
----
2192
11
java -cp <classpath>  com.persistit.ui.AdminUI somehost:1099
2193
12
----
2194
13
2195
14
where <classpath> includes the Persistit com.persistit.ui package. 
2196
15
2197
16
The JMX interface can be used by third-party management utilities, from applications such as +jconsole+ and +visualvm+, and from command-line JMX clients such as +jmxterm+. To enable JMX access, the configuration must specify the property +jmx=true+.  This causes Persistit to register several MBeans with the platform MBean server during initialization.
2198
17
2199
18
== MXBeans
2200
19
2201
20
The following JMX MXBeans are available:
2202
21
2203
22
[horizontal]
2204
23
+com.persistit:type=Persistit+:: See +com.persistit.mxbeans.ManagementMXBean+
2205
24
+com.persistit:type=Persistit,class=AlertMonitorMXBean+:: Accumulates, logs and emits notifications about abnormal events such as IOExceptions and measurements outside of expected thresholds.
2206
25
+com.persistit:type=Persistit,class=CleanupManagerMXBean+:: View current state of the Cleanup Manager. The Cleanup Manager performs background pruning and tree maintenance activities.
2207
26
+com.persistit:type=Persistit,class=IOMeter+:: Maintains statistics on file system I/O operations.
2208
27
+com.persistit:type=Persistit,class=JournalManager+:: Views current journal status.
2209
28
+com.persistit:type=Persistit,class=RecoveryManager+:: Views current status of the recovery process. Attributes of this MXBean change only during the recovery process.
2210
29
+com.persistit:type=Persistit,class=TransactionIndexMXBean+:: View internal state of transaction index queues and tables.
2211
30
+com.persistit:type=Persistit,class=BufferPool._SSSS_+:: where _SSSS_ is a buffer size (512, 1024, 2048, 4096 or 16384). View utilization statistics for buffers of the selected size.
2212
31
2213
32
2214
33
For details see the JavaDoc API documentation for each MXBean interface.
2215
34
2216
35
== Management Tasks
2217
36
2218
37
Persistit provides several ways to launch and administer +com.persistit.Task+ instances.  A +Task+ is a management operation that may take a significant amount of time and usually runs in a separate thread. For example, +com.persistit.IntegrityCheck+ is a +Task+ that verifies the internal structural integrity of one or more trees and can run for minutes to hours, depending on the size of the database.  The <<AdminUI>> tool, +com.persistit.ManagementMXBean+ and the command-line interface (<<CLI>>) provide mechanisms to launch, suspend or stop a task, and to monitor a task’s progress.
2219
38
2220
39
Currently the following built-in Tasks are available:
2221
40
2222
41
[horizontal]
2223
42
+icheck+:: Check the integrity of one or more trees or volumes.
2224
43
+save+:: Save selected key-value pairs from one or more trees to a flat file.
2225
44
+load+:: Load selected key-value pairs from a flat file written by +save+.
2226
45
+backup+:: Control and/or perform a concurrent backup of one or more volumes.
2227
46
+stat+:: Aggregate various performance statistics and either return them immediately, or write them periodically to a file.
2228
47
+task+:: Check the status of an existing task.  This task can also suspend, resume or stop an existing task. This task, which immediately returns status information, can be used by external tools to poll the status of other tasks.
2229
48
+cliserver+:: Start a simple command-line server on a specified port.  This enables a client program to execute commands sending them directly to that port.
2230
49
+_other tasks_+:: Various commands allow you to select and view pages and journal records.
2231
50
2232
51
2233
52
=== Executing a Task from an Application
2234
53
2235
54
The +com.persistit.mxbeans.ManagementMXBean#execute+ and +com.persistit.mxbeans.ManagementMXBean#launch+ methods both take a single String-valued argument, parse it to set up a +Task+ and return a String-valued result. For example:
2236
55
2237
56
[source,java]
2238
57
----
2239
58
String taskId = db.getManagement().launch("backup -z file=/tmp/mybackup.zip");
2240
59
String status = db.getManagement().execute("task -v -m -c taskId=" + taskId);
2241
60
----
2242
61
2243
62
launches the backup task and then queries its status.
2244
63
2245
64
=== Executing a Task from a JMX Client
2246
65
2247
66
The +com.persistit.mxbeans.ManagementMXBean#execute+ and +com.persistit.mxbeans.ManagementMXBean#launch+ methods are exposed as operations on the +com.persistit.mxbeans.ManagementMXBean+.  You can invoke tasks
2248
67
2249
68
- via +jconsole+ by typing the desired command line as the argument of the +execute+ operation.
2250
69
- via a third-party JMX client such as +jmxterm+.
2251
70
- via the +cliserver+ feature
2252
71
2253
72
==== Executing a Task Using a Third-Party JMX client.
2254
73
2255
74
You can use the +jmxterm+ program, for example, (see [http://www.cyclopsgroup.org/projects/jmxterm]) to execute commands with the following shell script:
2256
75
2257
76
[source,bash]
2258
77
----
2259
78
#!/bin/sh
2260
79
java -jar jmxterm-1.0-alpha-4-uber.jar --verbose silent --noninteract --url $1 <<EOF
2261
80
run -d com.persistit -b com.persistit:type=Persistit execute $2
2262
81
EOF
2263
82
----
2264
83
2265
84
To use this script, specify either the JMX URL or the process ID as the first command argument, and the command line as the second argument.  Example
2266
85
2267
86
----
2268
87
peter:~/workspace/sandbox$ jmxterm-execute 1234 'stat\ -a'
2269
88
hit=3942334 miss=14 new=7364 evict=0 jwrite=81810 jread=2 jcopy=63848 tcommit=0 troll=0 CC=0 RV=12 RJ=2 WJ=81810 EV=0 FJ=529 IOkbytes=1134487 TOTAL
2270
89
----
2271
90
2272
91
This command invokes the +stat+ task with the flag +-a+ on a JVM running with process id 1234.  Note that with +jmxterm+ white-space must be quoted by backslash (‘\’) even though the argument list is also enclosed in single-quotes.  The backslash marshals the space character through +jmxterm+’s parser. Commas and other delimiters also need to be quoted.
2273
92
2274
93
[[cliserver]]
2275
94
=== Executing a Task Using the Built-In +cliserver+
2276
95
2277
96
+cliserver+ is a simple text-based server that receives a command line as a text string and emits the generated output as its response. To start it, enter the command
2278
97
----
2279
98
cliserver port=9999
2280
99
----
2281
100
programmatically or through JMX. (You may specify any valid, available port.) Then use a command-line client to send command lines to that port and display their results. Persistit includes a primitive command-line client within the +com.persistit.CLI+ class itself.  Create a script to invoke it as follows:
2282
101
2283
102
[source,bash]
2284
103
----
2285
104
#!/bin/sh
2286
105
java -cp <classpath> com.persistit.CLI localhost:9999 $*
2287
106
----
2288
107
2289
108
Where +<classpath>+ includes the Persistit library. Assuming the name of the script is +pcli+ you can then invoke commands from a shell as shown in this example:
2290
109
2291
110
----
2292
111
/home/akiban:~$ pcli icheck -v -c "trees=*:Acc*"
2293
112
Volume,Tree,Faults,IndexPages,IndexBytes,DataPages,DataBytes,LongRecordPages,LongRecordBytes,MvvPages,MvvRecords,MvvOverhead,MvvAntiValues,IndexHoles,PrunedPages
2294
113
"persistit","AccumulatorRecoveryTest",0,3,24296,1519,15560788,0,0,1506,52192,721521,2397,0,0
2295
114
"*","*",0,3,24296,1519,15560788,0,0,1506,52192,721521,2397,0,0
2296
115
/home/akiban:~$
2297
116
----
2298
117
2299
118
Alternatively, you can use +curl+ as follows:
2300
119
2301
120
[source,bash]
2302
121
----
2303
122
#!/bin/sh
2304
123
echo "$*" | curl --silent --show-error telnet://localhost:9999
2305
124
----
2306
125
to issue commands.
2307
126
2308
127
CAUTION: Warning: +cliserver+ has no access control and sends potentially sensitive data in cleartext form. Therefore it should be used with care and only in a secure network environment. Its primary mission is to allow easy inspection of internal data structures within Persistit.
2309
128
2310
129
[[CLI]]
2311
130
== The Command-Line Interface
2312
131
2313
132
The String value passed to the +execute+ and +launch+ operations specifies the name of a task and its arguments. The general form is
2314
133
2315
134
----
2316
135
commandname -flag -flag argname=value argname=value
2317
136
----
2318
137
2319
138
where the order of arguments and flags is not significant.
2320
139
2321
140
2322
141
=== Command: +icheck+
2323
142
2324
143
Performs a com.persistit.IntegrityCheck task. Arguments:
2325
144
2326
145
[horizontal]
2327
146
+trees+:: Specifies volumes and/or trees to check. See com.persistit.TreeSelector for details syntax. Default is all trees in all volumes.
2328
147
+-r+:: Tree specification uses Java RegEx syntax (Default is to treat ‘*’ and ‘?’ as standard single-character and multi-character wildcards.
2329
148
+-u+:: Don't freeze updates (Default is to freeze updates)
2330
149
+-h+:: Fix index holes. An _index hole_ is an anomaly that occurs rarely in normal operation such that a page does not have an index entry in the index page level immediately above it
2331
150
+-p+:: Prune obsolete MVV (multi-version value) instances while checking.
2332
151
+-P+:: Prune obsolete MVV instances, and clear any remaining aborted TransactionStatus instances.  Use with care.
2333
152
+-v+:: Emit verbose output. For example, emit statistics for each tree.
2334
153
+-c+:: Display tree statistics in comma-separated-variable format suitable for import into a spreadsheet program.
2335
154
2336
155
Example:
2337
156
----
2338
157
icheck trees=vehicles/* -h
2339
158
----
2340
159
Checks all trees in the +vehicles+ volume and repairs index holes.
2341
160
2342
161
=== Command: +save+
2343
162
2344
163
Starts a com.persistit.StreamSaver task. Arguments:
2345
164
2346
165
[horizontal]
2347
166
+file+:: Name of file to save records to (required)
2348
167
+trees+:: Specifies volumes and/or trees to save. See com.persistit.TreeSelector for details syntax. Default is all trees in all volumes.
2349
168
+-r+:: Tree specification uses Java RegEx syntax (Default is to treat ‘*’ and ‘?’ as standard single-character and multi-character wildcards.)
2350
169
+-v+:: emit verbose output
2351
170
‘*’ and ‘?’ as standard wildcards.)
2352
171
2353
172
Example:
2354
173
----
2355
174
save -v file=/home/akiban/save.dat trees=vehicles/*{[“Edsel”:”Yugo”]}
2356
175
----
2357
176
2358
177
Saves the records for “Edsel” through “Yugo”, inclusive, from any tree in the volume named +vehicles+. See com.persistit.TreeSelector for selection syntax details.
2359
178
2360
179
=== Command: +load+
2361
180
2362
181
Starts a com.persistit.StreamLoader task. Arguments:
2363
182
2364
183
[horizontal]
2365
184
+file+:: Name of file to load records from
2366
185
+trees+:: Specifies volumes and/or trees to load. See com.persistit.TreeSelector for details syntax. Default is all trees in all volumes.
2367
186
+-r+:: Tree specification uses Java RegEx syntax (Default is to treat ‘*’ and ‘?’ as standard single-character and multi-character wildcards.)
2368
187
+-n+:: Don't create missing Volumes (Default is to create them)
2369
188
+-t+:: Don't create missing Trees (Default is to create them)
2370
189
‘*’ and ‘?’ as standard wildcards.)
2371
190
+-v+:: emit verbose output
2372
191
2373
192
Example:
2374
193
----
2375
194
load file=/home/akiban/save.dat trees=*/*{[“Falcon”:”Firebird”]}
2376
195
----
2377
196
2378
197
For any tree in any volume, this command loads all records having keys between “Falcon” and “Firebird”, inclusive.
2379
198
2380
199
=== Command: +backup+
2381
200
2382
201
Starts a +com.persistit.BackupTask+ task to perform concurrent (hot) backup. Arguments:
2383
202
2384
203
[horizontal]
2385
204
+file+:: Archive file path. If this argument is specified, BackupTask will back up the database in .zip format to the specified file.  This is intended only for small databases. It is expected that +backup+ will be used in conjunction with high-speed third-party data copying utilities for production use. The +-a+ and +-e+ flags are incompatible with operation when the +file+ argument is specified and are ignored.
2386
205
+-a+:: Start appendOnly mode - for use with third-party backup tools.  +backup -a+ should be invoked before data copying begins.
2387
206
+-e+:: End appendOnly mode - for use with third-party backup tools.  +backup -e+ should be invoked after data copying ends.
2388
207
+-c+:: Request checkpoint before backup.
2389
208
+-z+:: Compress output to ZIP format - meaningful only in conjunction with the +file+ argument.
2390
209
+-f+:: Emit a list of files that need to be copied. In this form the task immediately returns with a list of files currently comprising the Persistit database, including Volume and journal files.
2391
210
+-y+:: Copy pages from journal to Volumes before starting backup.  This reduces the number of journal files in the backup set.
2392
211
2393
212
Examples:
2394
213
----
2395
214
backup -y -a -c -y -f
2396
215
… invoke third-party backup tool to copy the database files
2397
216
backup -e
2398
217
----
2399
218
Uses the +backup+ task twice, once to set _append-only_ mode, checkpoint the journal and perform a full copy-back cycle (a process that attempts to shorten the journal), and then write out a list of files that need to be copied. The second call to +backup+ restores normal operation.  Between these two calls a third party backup tool is used to copy the data.
2400
219
2401
220
----
2402
221
backup -z file=/tmp/my_backup.zip
2403
222
----
2404
223
Uses the built-in file copy feature with ZIP compression.
2405
224
2406
225
=== Command: +task+
2407
226
2408
227
Queries, stops, suspends or resumes a background task.  Arguments:
2409
228
2410
229
[horizontal]
2411
230
+taskId+:: Task ID to to check, or -1 for all
2412
231
+-v+:: Verbose - returns detailed status messages from the selected task(s)
2413
232
+-m+:: Keep previously delivered messages. Default is to remove messages once reported.
2414
233
+-k+:: Keep the selected task or tasks even if completed.  Default is to remove tasks once reported.
2415
234
+-x+:: Stop the selected task or tasks
2416
235
+-u+:: Suspend the selected task or tasks
2417
236
+-r+:: Resume the selected task or tasks
2418
237
2419
238
Unlike other commands, the +task+ command always runs immediately even if invoked through the +launch+ method. 
2420
239
2421
240
You can use the +task+ command to poll and display progress of long-running tasks. Invoke
2422
241
2423
242
----
2424
243
task  -v -m -c taskId=nnn
2425
244
----
2426
245
2427
246
until the result is empty.
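The polling loop described above can be sketched in Java. This is only an illustration: the +cli+ function is a hypothetical stand-in for whatever mechanism submits a command line and returns its text output (for example a connection to +cliserver+), and +TaskPollerSketch+ is not part of Persistit.

```java
import java.util.function.Function;

final class TaskPollerSketch {
    /**
     * Repeatedly issues a status command until it returns an empty result.
     * The cli function is an assumed stand-in for the real command channel.
     * Returns the number of non-empty status reports that were printed.
     */
    static int pollUntilEmpty(Function<String, String> cli, String command, long sleepMillis) {
        int reports = 0;
        while (true) {
            String result = cli.apply(command);
            if (result == null || result.trim().isEmpty()) {
                return reports; // an empty result means nothing is left to report
            }
            System.out.println(result);
            reports++;
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop polling
                return reports;
            }
        }
    }
}
```

The command string passed in would be the +task+ command line shown above, with the real task ID substituted for +nnn+.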
2428
247
2429
248
=== Command: +cliserver+
2430
249
2431
250
Starts a simple text-based server that receives a command line as a text string and emits the generated output as its response. Argument:
2432
251
2433
252
[horizontal]
2434
253
+port+:: Port number on which to listen for commands.
2435
254
2436
255
=== Command: +exit+
2437
256
2438
257
Ends a running +cliserver+ instance.
2439
258
2440
259
== Commands for Viewing Data
2441
260
2442
261
The following commands execute immediately, even if invoked through the +launch+ method.  They provide a mechanism to examine individual database pages or journal records.
2443
262
2444
263
=== Command: +select+
2445
264
2446
265
Selects a volume and optionally a tree for subsequent operations such as +view+. Arguments:
2447
266
2448
267
[horizontal]
2449
268
+tree+:: Specifies volume and/or tree to select as context for subsequent operations. See com.persistit.TreeSelector for syntax details.
2450
269
+-r+:: Tree specification uses Java RegEx syntax. (Default is to treat ‘*’ and ‘?’ as standard multi-character and single-character wildcards, respectively.)
2451
270
2452
271
=== Command: +list+
2453
272
2454
273
Lists volumes and trees.  Arguments:
2455
274
2456
275
[horizontal]
2457
276
+trees+:: Specifies volumes and/or trees to list. See com.persistit.TreeSelector for syntax details. Default is all trees in all volumes.
2458
277
+-r+:: Tree specification uses Java RegEx syntax. (Default is to treat ‘*’ and ‘?’ as standard multi-character and single-character wildcards, respectively.)
2459
278
2460
279
All volumes, and all trees within those volumes, that match the +trees+ specification are listed. By default, this command lists all trees in all volumes.
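To make the default wildcard behavior concrete, here is a minimal sketch that translates a wildcard pattern into a Java regular expression. It assumes that ‘*’ matches any run of characters and ‘?’ matches exactly one character; the class +WildcardSketch+ is hypothetical and is not how com.persistit.TreeSelector is actually implemented.

```java
import java.util.regex.Pattern;

final class WildcardSketch {
    /** Converts a wildcard pattern into an equivalent regular expression. */
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                sb.append(".*"); // multi-character wildcard
            } else if (c == '?') {
                sb.append('.'); // single-character wildcard
            } else {
                // every other character is matched literally
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    /** Returns true if the whole name matches the wildcard pattern. */
    static boolean matches(String wildcard, String name) {
        return toRegex(wildcard).matcher(name).matches();
    }
}
```

With this sketch, a pattern such as `akiban_*` matches both `akiban_data` and `akiban_system`, while `tree?` matches `tree1` but not `tree12`.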
2461
280
2462
281
=== Command: +pview+
2463
282
2464
283
Displays contents of a database page. Arguments:
2465
284
2466
285
[horizontal]
2467
286
+page+:: page address
2468
287
+jaddr+:: journal address - displays a page version stored at the specified journal address
2469
288
+key+:: a key specified as a String defined in the com.persistit.Key class
2470
289
+level+:: tree level of the desired page
2471
290
+find+:: selected records in an index page surrounding a key that points to the specified page address
2472
291
+-a+:: all records. If specified, all records in the page will be displayed.  Otherwise the output is abbreviated to no more than 20 lines.
2473
292
+-s+:: summary - only header information in the page is displayed
2474
293
2475
294
The +pview+ command identifies a page in one of three distinct ways: by page address, by journal address, or by key.  Only one of the three parameters +page+, +jaddr+ or +key+ (with +level+) may be used.
2476
295
2477
296
+page+ specifies the current version of a page having the specified address.  If there is a copy of the page in the buffer pool, that copy is displayed even if it contains updates that are not yet written to disk.
2478
297
2479
298
+jaddr+ specifies an address in the journal. A typical use is to invoke the +jview+ command to list journal records and then, to see a detailed view of one page record, invoke the +pview+ command with that record's journal address.
2480
299
2481
300
+key+ specifies a key. By default the data page associated with that key will be displayed.  The data page is defined as level 0. The +level+ parameter allows pages at various index levels to be viewed; for example +level=1+ refers to the index page that points to the data page containing the specified key.
2482
301
2483
302
When examining an index page with potentially hundreds of records it is sometimes convenient to find the record that points to a particular child page, and also the records immediately before and after. Specifying the +find+ parameter when viewing an index page abbreviates the displayed records to include just the first and last records in the page, plus a small range of records surrounding the one that points to the specified page. This mechanism provides a convenient way to find sibling pages.
2484
303
2485
304
2486
305
=== Command: +path+
2487
306
2488
307
For a specified key, displays the sequence of pages from the root of the tree to the data page containing the key. Argument:
2489
308
2490
309
[horizontal]
2491
310
+key+:: a key specified as a String defined in the com.persistit.Key class
2492
311
2493
312
2494
313
=== Command: +jview+
2495
314
2496
315
Displays journal records.  Arguments:
2497
316
2498
317
[horizontal]
2499
318
+start+:: starting journal address (default = 0)
2500
319
+end+:: end journal address (default = infinite)
2501
320
+timestamps+:: range selection of timestamp values, e.g., “132466-132499” for records having timestamps between these two numbers, inclusive. Default is all timestamps.
2502
321
+types+:: comma-delimited list of two-character record types, e.g., “JH,IV,IT,CP” to select only Journal Header, Identify Volume, Identify Tree and Check Point records (see com.persistit.JournalRecord for definitions of all types). Default value is all types.
2503
322
+pages+:: range selection of page addresses for PA (Page) records, e.g., “1,2,13-16” to include pages 1, 2, 13, 14, 15 and 16.
2504
323
+maxkey+:: maximum display length of key values in the output. Default value is 42.
2505
324
+maxvalue+:: maximum display length of values in the output. Default value is 42.
2506
325
+path+:: journal file path. Default is the journal file path of the currently instantiated Persistit instance.
2507
326
+-v+:: verbose format. If specified, causes PM (Page Map) and TM (Transaction Map) records to display all map elements.
2508
327
2509
328
2510
329
Note that the journal on a busy system contains a large number of records, so entering the +jview+ command without constraining the address range or record types may result in extremely lengthy output.
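The range-selection syntax used by +timestamps+ and +pages+ can be illustrated with a small parser. This is a sketch that assumes non-negative numbers, comma-separated items and inclusive ranges written as +low-high+; +RangeSelectorSketch+ is a hypothetical class, not the parser the CLI actually uses.

```java
import java.util.ArrayList;
import java.util.List;

final class RangeSelectorSketch {
    // Each element is an inclusive {low, high} pair.
    private final List<long[]> ranges = new ArrayList<>();

    /** Parses a specification such as "1,2,13-16" or "132466-132499". */
    RangeSelectorSketch(String spec) {
        for (String part : spec.split(",")) {
            String p = part.trim();
            int dash = p.indexOf('-');
            long low = Long.parseLong(dash < 0 ? p : p.substring(0, dash));
            long high = dash < 0 ? low : Long.parseLong(p.substring(dash + 1));
            ranges.add(new long[] { low, high });
        }
    }

    /** Returns true if the value falls inside any of the parsed ranges. */
    boolean contains(long value) {
        for (long[] r : ranges) {
            if (value >= r[0] && value <= r[1]) {
                return true;
            }
        }
        return false;
    }
}
```

For the specification `1,2,13-16` this selects pages 1, 2 and 13 through 16 and rejects everything else, matching the example given for +pages+.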
2511
330
2512
331
=== Command: +open+
2513
332
2514
333
Opens a Persistit database for analysis. This task can only be used to examine a copy of a Persistit database that is not currently in use by an application. It works by attempting to open the volume and journal files using a synthesized configuration. It finds a collection of journal files and volume files specified by the +datapath+, +journalpath+ and +volumepath+ arguments; from these it derives a set of properties that will allow it to examine those journals and volumes. By default all volumes are opened in read-only mode and cannot be changed by operations executed from the command-line interface.
2515
334
2516
335
If there already is an open Persistit instance, this command detaches it. For example, if you start +cliserver+ from a live Persistit instance and then issue the +open+ command, the live instance will continue to operate but +cliserver+ will no longer be attached to it.
2517
336
2518
337
Note that you cannot +open+ volumes that are already open in a running Persistit instance due to their file locks. However, you can copy open volumes and journal files to another location and +open+ the copy. This is the primary use case for the +open+ command: to analyze a copy of a database (for example a copy recovered from backup) without having to launch the application software that embeds Persistit.
2519
338
2520
339
Arguments:
2521
340
2522
341
[horizontal]
2523
342
+datapath+:: a directory path for volume and journal files to be analyzed
2524
343
+volumepath+:: overrides +datapath+ to specify an alternative location for volume files.
2525
344
+journalpath+:: overrides +datapath+ to specify an alternative location for journal files.
2526
345
+rmiport+:: specifies an RMI port to which an instance of the AdminUI can attach.
2527
346
+-g+:: launch a local copy of AdminUI
2528
347
+-y+:: attempt to recover committed transactions.
2529
348
2530
349
Note that even if you specify +-y+ to recover transactions, the volume files will not be modified. But the +open+ command will add a new journal file containing modifications caused by the recovery process. You can simply delete that file when done.
2531
350
2532
351
=== Command: +close+
2533
352
2534
353
Detach and close the current Persistit instance. If the CLI was started with a live Persistit instance then this command merely detaches from it; if the instance was created with the +open+ command then +close+ closes it and releases all related file locks, buffers, etc.
2535
354
2536
355
=== Command: +source+
2537
356
2538
357
Execute command lines from a specified text file. Argument:
2539
358
2540
359
[horizontal]
2541
360
+file+:: file name of command input file
2542
361
0
2543
=== added file 'doc/Miscellaneous.rst'
2544
--- doc/Miscellaneous.rst	1970-01-01 00:00:00 +0000
2545
+++ doc/Miscellaneous.rst	2012-05-30 18:23:19 +0000
2546
@@ -0,0 +1,36 @@
2547
1
.. _Miscellaneous:
2548
2
2549
3
Miscellaneous Topics
2550
4
====================
2551
5
2552
6
Following are some short items you may find useful as you explore Akiban Persistit. Follow links to the API documentation for more details.
2553
7
2554
8
Histograms
2555
9
----------
2556
10
2557
11
The ``com.persistit.Exchange#computeHistogram`` method provides a way to sample and summarize a set of keys in a ``Tree``.  It works by traversing keys in index pages near the root of the tree.  Because only a small fraction of all the keys in the tree are represented in the index, this yields a relatively small sample set of keys relatively quickly. The result can be used to estimate the actual number of keys.
2558
12
2559
13
Temporary Volumes
2560
14
-----------------
2561
15
2562
16
A Persistit temporary volume is a special kind of Volume that is deleted when Persistit is closed. The update mechanism for temporary volumes avoids writing to disk whenever possible, and their contents are not recoverable after Persistit shuts down. Therefore in some cases database operations on temporary volumes are faster.
2563
17
2564
18
The primary use case for a temporary volume is an application that needs the unlimited size, but not the persistence, of normal Persistit volumes.
2565
19
2566
20
See the ``com.persistit.Persistit#createTemporaryVolume`` method for additional details.
2567
21
2568
22
Logging
2569
23
-------
2570
24
2571
25
By default Persistit emits log messages to a file called persistit.log  and also writes high level log messages to System.out.  You can change this behavior by plugging in a different logging implementation. In particular, Persistit provides pluggable adapters for various other logging implementations, including Log4J, SLF4J, and the Java logging API introduced in JDK 1.4. For details see the API documentation for com.persistit.logging.AbstractPersistitLogger.
2572
26
2573
27
Using one of these logging frameworks is simple.  For example, the following code connects Persistit to an application-supplied SLF4J logger:
2574
28
2575
29
.. code-block:: java
2576
30
2577
31
  db.setPersistitLogger(new Slf4jAdapter(LOG));
2578
32
2579
33
where ``db`` is the Persistit instance and ``LOG`` is a Logger supplied by SLF4J. This method should be called before the ``initialize`` method.
2580
34
2581
35
2582
36
2583
0
37
2584
=== removed file 'doc/Miscellaneous.txt'
2585
--- doc/Miscellaneous.txt	2012-04-30 22:09:31 +0000
2586
+++ doc/Miscellaneous.txt	1970-01-01 00:00:00 +0000
2587
@@ -1,32 +0,0 @@
2588
1
[[Miscellaneous]]
2589
2
= Miscellaneous Topics
2590
3
2591
4
Following are some short items you may find useful as you explore Akiban Persistit. Follow links to the API documentation for more details.
2592
5
2593
6
== Histograms
2594
7
2595
8
The +com.persistit.Exchange#computeHistogram+ method provides a way to sample and summarize a set of keys in a +Tree+.  It works by traversing keys in index pages near the root of the tree.  Because only a small fraction of all the keys in the tree are represented in the index, this yields a relatively small sample set of keys relatively quickly. The result can be used to estimate the actual number of keys.
2596
9
2597
10
== Temporary Volumes
2598
11
2599
12
A Persistit temporary volume is a special kind of Volume that is deleted when Persistit is closed. The update mechanism for temporary volumes avoids writing to disk whenever possible, and their contents are not recoverable after Persistit shuts down. Therefore in some cases database operations on temporary volumes are faster.
2600
13
2601
14
The primary use case for a temporary volume is an application that needs the unlimited size, but not the persistence of normal Persistit volumes.
2602
15
2603
16
See the +com.persistit.Persistit#createTemporaryVolume+ method for additional details.
2604
17
2605
18
== Logging
2606
19
2607
20
By default Persistit emits log messages to a file called persistit.log  and also writes high level log messages to System.out.  You can change this behavior by plugging in a different logging implementation. In particular, Persistit provides pluggable adapters for various other logging implementations, including Log4J, SLF4J, and the Java logging API introduced in JDK 1.4. For details see the API documentation for com.persistit.logging.AbstractPersistitLogger.
2608
21
2609
22
Using one of these logging frameworks is simple.  For example, the following code connects Persistit to an application-supplied SLF4J logger:
2610
23
2611
24
[source,java]
2612
25
----
2613
26
db.setPersistitLogger(new Slf4jAdapter(LOG));
2614
27
----
2615
28
2616
29
where +db+ is the Persistit instance and +LOG+ is a Logger supplied by SLF4J. This method should be called before the +initialize+ method.
2617
30
2618
31
2619
32
2620
33
0
2621
=== added file 'doc/PhysicalStorage.rst'
2622
--- doc/PhysicalStorage.rst	1970-01-01 00:00:00 +0000
2623
+++ doc/PhysicalStorage.rst	2012-05-30 18:23:19 +0000
2624
@@ -0,0 +1,142 @@
2625
1
.. _PhysicalStorage:
2626
2
2627
3
Physical B-Tree Representation
2628
4
==============================
2629
5
2630
6
This chapter describes the physical structures used to represent Akiban Persistit records on disk and in memory.
2631
7
2632
8
Files
2633
9
-----
2634
10
2635
11
Following is a directory listing illustrating a working Persistit database::
2636
12
2637
13
  -rw-r--r--. 1 demo demo  24G Feb  8 13:18 akiban_data
2638
14
  -rw-r--r--. 1 demo demo  48K Feb  8 13:19 akiban_system
2639
15
  -rw-r--r--. 1 demo demo 954M Feb  8 13:18 akiban_journal.000000000225
2640
16
  -rw-r--r--. 1 demo demo 954M Feb  8 13:19 akiban_journal.000000000226
2641
17
  -rw-r--r--. 1 demo demo 954M Feb  8 13:19 akiban_journal.000000000227
2642
18
  -rw-r--r--. 1 demo demo 662M Feb  8 13:19 akiban_journal.000000000228
2643
19
2644
20
This database contains two *volume* files, ``akiban_data`` and ``akiban_system`` and four files that constitute part of the *journal*. As explained below, Persistit records are usually stored in a combination of volume and journal files.
2645
21
2646
22
.. _Journal:
2647
23
2648
24
The Journal
2649
25
-----------
2650
26
2651
27
The *journal* is a set of files containing variable-length records. The journal is append-only. New records are written only at the end; existing records are never overwritten. The journal consists of a numbered series of files having a configurable maximum size. When a journal file becomes full, Persistit closes it and begins a new file with the next counter value. The maximum size of a journal file is determined by a configuration property called its block size.  The default block size value is 1,000,000,000 bytes, which works well with today’s standard server hardware.
2652
28
2653
29
Every record in the journal has a 64-bit integer *journal address*. The journal address denotes which file contains the record and the record’s offset within that file. Journal addresses start at zero in a new database instance and grow perpetually. (Even on a system executing 1 million transactions per second, the address space is large enough to last for hundreds of years.)
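As a rough illustration of how a journal address can identify both a file and an offset, the sketch below assumes that the file number is the address divided by the block size and the offset is the remainder, and that file names carry a 12-digit zero-padded suffix as in the directory listing above. These are assumptions for illustration, as is the figure of 1,000 journal bytes per transaction; ``JournalAddressSketch`` is not a Persistit class.

```java
final class JournalAddressSketch {
    /** Default block size stated in the text, in bytes. */
    static final long BLOCK_SIZE = 1_000_000_000L;

    /** Assumed mapping: which numbered journal file holds the address. */
    static long fileNumber(long address) {
        return address / BLOCK_SIZE;
    }

    /** Assumed mapping: byte offset of the address within that file. */
    static long offsetInFile(long address) {
        return address % BLOCK_SIZE;
    }

    /** Builds a file name in the style of the listing, e.g. akiban_journal.000000000225. */
    static String fileName(String base, long address) {
        return String.format("%s.%012d", base, fileNumber(address));
    }

    /** Years until a signed 64-bit address space is used up at a given write rate. */
    static double yearsUntilExhausted(double bytesPerSecond) {
        double seconds = Long.MAX_VALUE / bytesPerSecond;
        return seconds / (365.0 * 24 * 3600);
    }
}
```

Under these assumptions, 1 million transactions per second at 1,000 bytes each writes 10^9 bytes per second, and ``yearsUntilExhausted(1e9)`` comes to a little under 300 years, which is consistent with the "hundreds of years" estimate.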
2654
30
2655
31
Persistit writes two major types of records to journal files.
2656
32
2657
33
- For each committed update transaction, Persistit writes a record containing sufficient information to replay the transaction during recovery. For example, when Persistit stores a key/value pair during a transaction, it writes a record to the journal containing the key and value.
2658
34
- Persistit also writes all updated page images to the journal. Some of these are eventually copied to volume files, as described below. This write/copy mechanism is critical to Persistit’s crash-recovery mechanism (see :ref:`Recovery`).
2659
35
2660
36
As updates are applied, Persistit constantly appends new information (both transaction records and modified page images) to the end of the highest-numbered file. To prevent the accumulation of a large number of journal files, Persistit also works to copy or remove information from older journal files so that they can be deleted. The background thread responsible for this activity is called the ``JOURNAL_COPIER`` thread. The JOURNAL_COPIER copies pages from the journal back into their home volume files, allowing old files to be deleted. Normally a Persistit system at rest gradually copies all updated page images and performs checkpoints so that only one small journal file remains. Applications can accelerate that process by calling the ``com.persistit.Persistit#copyBackPages`` method.
2661
37
2662
38
The journal is critical to ensuring Persistit can recover structurally intact B-Trees and apply all committed transactions after a system failure. For this reason, unless the JOURNAL_COPIER is entirely caught up, any attempt to save the state of a Persistit database must include both the volume and journal files.
2663
39
2664
40
The journal also plays a critical role during concurrent backup. To back up a running Persistit database, the ``com.persistit.BackupTask`` does the following:
2665
41
2666
42
- Enables ``appendOnly`` mode to suspend the copying of updated page images.
2667
43
- Copies the appropriate volume and journal files
2668
44
- Disables ``appendOnly`` mode to allow JOURNAL_COPIER to continue.
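The three steps above follow an enable/copy/disable pattern that is naturally expressed with ``try``/``finally``, so that normal operation resumes even if copying fails. The sketch below uses a hypothetical ``JournalControl`` interface as a stand-in for whatever object toggles ``appendOnly`` mode; it does not show the actual ``com.persistit.BackupTask`` implementation.

```java
final class BackupSketch {
    /** Hypothetical stand-in for the component that toggles appendOnly mode. */
    interface JournalControl {
        void setAppendOnly(boolean appendOnly);
    }

    /**
     * Suspends copy-back, runs the file copy, then always restores normal
     * operation, even if the copy step throws.
     */
    static void backup(JournalControl journal, Runnable copyFiles) {
        journal.setAppendOnly(true); // step 1: suspend copying of page images
        try {
            copyFiles.run(); // step 2: copy volume and journal files
        } finally {
            journal.setAppendOnly(false); // step 3: let JOURNAL_COPIER continue
        }
    }
}
```

The ``finally`` clause is the important design choice: if ``appendOnly`` mode were left enabled after a failed copy, journal files would keep accumulating.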
2669
45
2670
46
For more details on the journal, checkpoints and transactions, see :ref:`Recovery`. For more information on concurrent backup and other management tasks, see :ref:`Management`.
2671
47
2672
48
Pages and Volumes
2673
49
-----------------
2674
50
2675
51
Persistit ultimately stores its data in one or more Volume files. Persistit manages volume files internally in sections called pages. Every page within one volume has the same size. The page size is configurable and may be 1,024, 2,048, 4,096, 8,192, or 16,384 (recommended) bytes long. Once the page size for a volume has been established, it cannot be changed. See :ref:`Configuration` for details of how to assign the page size for a new volume.
2676
52
2677
53
Directory Tree
2678
54
^^^^^^^^^^^^^^
2679
55
2680
56
Within a volume there can be an unlimited number of B-Trees. (B-Trees are also called simply “trees” in this document.) A tree consists of a set of pages including a *root page*, *index pages* and *data pages*. The root page can be a data page if the tree is trivial and contains only a small number of records. Usually the root page is an index page that contains references to other index pages, which in turn may refer to data pages.
2681
57
2682
58
Persistit manages a potentially large number of trees by maintaining a tree of trees called ``_directory``.  The ``_directory`` tree contains the name, root page address, ``com.persistit.Accumulator`` data and ``com.persistit.TreeStatistics`` data for all the other trees in the volume. The tree name ``_directory`` is reserved and may not be used when creating an Exchange.
2683
59
2684
60
Data Pages
2685
61
^^^^^^^^^^
2686
62
2687
63
A data page contains a representation of one or more variable-length key/value pairs. The number of key/value pairs depends on the page size and on the sizes of the serialized keys and values. The first key in each data page is stored in its entirety, while subsequent keys are stored with *prefix compression* to reduce storage footprint and accelerate searches. Therefore the storage size of the second and each subsequent key in a data page depends on how many of the leading bytes of its serialized form match its predecessor. (See :ref:`Key` and :ref:`Value` for information on how Persistit encodes logical Java values into the byte arrays stored in a data page.)
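The effect of prefix compression on key storage can be sketched as follows. The example only computes how many leading bytes a key shares with its predecessor and how many bytes would remain to be stored; it uses ASCII strings as stand-ins for serialized keys and says nothing about the real page layout, which also has to record the elided byte count. ``PrefixCompressionSketch`` is a hypothetical class.

```java
import java.nio.charset.StandardCharsets;

final class PrefixCompressionSketch {
    /** Number of leading bytes that key shares with its predecessor. */
    static int commonPrefixLength(byte[] previous, byte[] key) {
        int max = Math.min(previous.length, key.length);
        int i = 0;
        while (i < max && previous[i] == key[i]) {
            i++;
        }
        return i;
    }

    /** Number of key bytes left to store once the shared prefix is elided. */
    static int storedSuffixLength(byte[] previous, byte[] key) {
        return key.length - commonPrefixLength(previous, key);
    }

    /** Convenience helper: ASCII bytes stand in for a serialized key. */
    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }
}
```

For two adjacent keys such as `customer/1001` and `customer/1002`, twelve of the thirteen bytes are shared, so only one byte of the second key differs from its predecessor.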
2688
64
2689
65
Index Pages
2690
66
^^^^^^^^^^^
2691
67
2692
68
An index page has a structure similar to a data page except that, instead of holding serialized value data, it contains page addresses of subordinate pages within the tree.
2693
69
2694
70
.. TODO - diagram of B-Tree, page layouts, etc
2695
71
2696
72
.. _Recovery:
2697
73
2698
74
Recovery
2699
75
========
2700
76
2701
77
Akiban Persistit is designed, implemented and tested to ensure that whether the application shuts down gracefully or crashes without cleanly closing the database, the database remains structurally intact and internally consistent after restart.
2702
78
2703
79
To do this, Persistit performs a process called *recovery* every time it starts up.  The recovery process is generally very fast after a normal shutdown. However, it can take a considerable amount of time after a crash because many committed transactions may need to be executed.
2704
80
2705
81
Recovery performs three major activities:
2706
82
2707
83
- Restores all B-Trees to an internally consistent state with a known timestamp.
2708
84
- Replays all transactions that committed after that timestamp.
2709
85
- Prunes multi-version values belonging to certain aborted transactions (see :ref:`Pruning`).
2710
86
2711
87
To accomplish this, Persistit writes all updates first to the :ref:`Journal`. Persistit also periodically writes *checkpoint* records to the journal. During recovery, Persistit finds the last valid checkpoint written before shutdown or crash, restores B-Trees to a state consistent with that checkpoint, and then replays transactions that committed after the checkpoint.
2712
88
2713
89
Recovery depends on the availability of the volume and journal files as they existed prior to abrupt termination. If these are modified or destroyed outside of Persistit, successful recovery is unlikely.
2714
90
2715
91
Timestamps and Checkpoints
2716
92
--------------------------
2717
93
2718
94
Persistit maintains a universal counter called the *timestamp* counter. Every update operation assigns a new, larger timestamp, and every record in the journal includes the timestamp assigned to the operation writing the record. The timestamp counter is unrelated to clock time.  It is merely a counter.
2719
95
2720
96
A *checkpoint* is simply a timestamp for which a valid recovery is possible. Periodically Persistit chooses a timestamp to be a new checkpoint. Over time it then ensures that all pages updated before the checkpoint have been written to the journal, and then writes a checkpoint marker. By default checkpoints occur once every two minutes. Normal shutdown through ``com.persistit.Persistit#close`` writes a final checkpoint to the journal regardless of when the last checkpoint cycle occurred. That final checkpoint is what allows recovery after a normal shutdown to be very fast.
2721
97
2722
98
Upon start-up Persistit first finds the last valid checkpoint timestamp, and then recovers only those page images from the journal that were written prior to it. The result is that all B-Trees are internally consistent and contain all the updates that were issued and committed to disk before the checkpoint timestamp and none that occurred after the checkpoint timestamp.
2723
99
2724
100
Then Persistit locates and reapplies all transaction records in the journal for transactions that committed after the last valid checkpoint timestamp. These transactions are reapplied to the database, with the result that:
2725
101
2726
102
- The B-Tree index and data structures are intact. All store, fetch, remove and traverse operations will complete successfully. (Persistit provides the utility class ``com.persistit.IntegrityCheck`` to verify the integrity of a Volume.)
2727
103
- All committed transactions are present in the recovered database.  (See :ref:`Transactions` for durability determined by ``CommitPolicy``.)
2728
104
2729
105
For updates occurring outside of a transaction the resulting state is identical to some consistent, reasonably recent state prior to the termination. (“Reasonably recent” depends on the checkpoint interval, which by default is set to two minutes.)
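The recovery sequence described in this section can be modeled with a toy journal. The sketch below is a deliberately simplified assumption: each "page image" is reduced to a single key/value pair, checkpoint validity is not checked, and pruning is ignored. ``RecoverySketch`` and ``SketchRecord`` are hypothetical names and do not reflect Persistit's real journal record formats.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class RecoverySketch {
    enum Type { PAGE_IMAGE, CHECKPOINT, TRANSACTION }

    /** One simplified journal record; key and value are unused for checkpoints. */
    static final class SketchRecord {
        final Type type;
        final long timestamp;
        final String key;
        final String value;

        SketchRecord(Type type, long timestamp, String key, String value) {
            this.type = type;
            this.timestamp = timestamp;
            this.key = key;
            this.value = value;
        }
    }

    /** Rebuilds database state from a journal given in append order. */
    static Map<String, String> recover(List<SketchRecord> journal) {
        // Step 1: find the timestamp of the last checkpoint in the journal.
        long checkpoint = -1;
        for (SketchRecord r : journal) {
            if (r.type == Type.CHECKPOINT) {
                checkpoint = r.timestamp;
            }
        }
        Map<String, String> state = new TreeMap<>();
        // Step 2: restore only page images written before the checkpoint.
        for (SketchRecord r : journal) {
            if (r.type == Type.PAGE_IMAGE && r.timestamp < checkpoint) {
                state.put(r.key, r.value);
            }
        }
        // Step 3: replay transactions that committed after the checkpoint.
        for (SketchRecord r : journal) {
            if (r.type == Type.TRANSACTION && r.timestamp > checkpoint) {
                state.put(r.key, r.value);
            }
        }
        return state;
    }
}
```

In this model a page image written after the last checkpoint is ignored, while a transaction committed after the checkpoint is reapplied from its own journal record.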
2730
106
2731
107
Flush/Force/Checkpoint
2732
108
^^^^^^^^^^^^^^^^^^^^^^
2733
109
2734
110
An application may require certainty at various points that all pending updates have been fully written to disk. The ``com.persistit.Persistit`` class provides three methods to ensure that updates have been written:
2735
111
2736
112
  ``com.persistit.Persistit#flush``
2737
113
      causes Persistit to write all pending updates to the journal. Upon successful completion of flush any pages that needed writing prior to the call to flush are 
2738
114
      guaranteed to have been written to their respective volume files.
2739
115
  ``com.persistit.Persistit#force``
2740
116
      forces the underlying operating system to write pending updates from the operating system’s write-behind cache to the actual disk. (This operation relies on 
2741
117
      the underlying ``java.nio.channels.FileChannel#force(boolean)`` method.)
2742
118
  ``com.persistit.Persistit#checkpoint``
2743
119
      causes Persistit to allocate a new checkpoint timestamp and then wait for all updates that happened before that timestamp to be committed to disk.
2744
120
2745
121
However, typical applications, especially those using :ref:`Transactions`, do not need to invoke these methods. Once a Transaction is durable, so are all other transactions that occurred at timestamps earlier than the transaction’s commit timestamp and no other method calls are required.
2746
122
2747
123
2748
124
The Buffer Pool
2749
125
---------------
2750
126
2751
127
Persistit maintains a cache of page copies in memory called the *buffer pool*. The buffer pool is a critical resource in reducing disk I/O and providing good run-time performance. After performing a relatively expensive disk operation to read a copy of a page into the buffer pool, Persistit retains that copy to allow potentially many fetch and update operations to be performed against keys and values stored in that page.
2752
128
2753
129
Persistit optimizes update operations by writing updated database pages lazily, generally a few seconds to minutes after the update has been performed on the in-memory copy of the page cached in the buffer pool. By writing lazily, Persistit allows many update operations to be completed on each page before incurring a relatively expensive disk I/O operation to write the updated version of the page to the Volume.
2754
130
2755
131
In Persistit the buffer pool is a collection of buffers allocated from the heap for the duration of Persistit’s operation. The buffers are allocated by the ``com.persistit.Persistit#initialize`` method and are released when the application invokes close. Because buffers are allocated for the life of the Persistit instance, they impose no garbage collection overhead. (However, especially when using large buffer pool allocation in a JVM with a large heap, there are some special memory configuration issues to consider.  See :ref:`Configuration` for details.)
2756
132
2757
133
Persistit allocates buffers from the buffer pool in approximately least-recently-used (LRU) order. Most applications exhibit behavior in which data, having been accessed once, is read or updated several more times before the application moves to a different area of the database (locality of reference). LRU is an allocation strategy that yields reasonably good overall throughput by keeping pages that are likely to be used again in the buffer pool in preference to pages that have not been used for a relatively long time.
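The LRU policy itself can be demonstrated with the access-ordered mode of ``java.util.LinkedHashMap``. This is a sketch of exact LRU eviction only; it assumes nothing about Persistit's own buffer pool, which the text describes as approximately LRU and which uses buffers allocated once at initialization. ``LruBufferPoolSketch`` is a hypothetical class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LruBufferPoolSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    LruBufferPoolSketch(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently used
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    /** Evicts the least recently used entry once the pool exceeds its capacity. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

With a capacity of two, reading page 1 after loading pages 1 and 2 makes page 2 the eviction candidate, so loading page 3 evicts page 2 and keeps page 1.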
2758
134
2759
135
Generally, allocating more buffers in the buffer pool increases the likelihood that a page will be found in the pool rather than having to be reloaded from disk. Since disk I/O is relatively expensive, this means that enlarging the buffer pool is a good strategy for reducing disk I/O and thereby increasing throughput. Persistit is designed to manage extremely large buffer pools very efficiently, so if memory is available, it is generally a good strategy to maximize the buffer pool size.
2760
136
2761
137
Tools
2762
138
-----
2763
139
2764
140
The command-line interface (see :ref:`CLI`) includes tools you can use to examine pages in volumes and records in the journal. Two of these are the ``jview`` and ``pview`` commands. The ``jview`` command displays journal records, selected by address range, type, page address and other criteria, in a readable form.  The ``pview`` command displays the contents of pages selected by page address or key from a volume, or by journal address from the journal.
2765
141
2766
142
2767
0
143
2768
=== removed file 'doc/PhysicalStorage.txt'
2769
--- doc/PhysicalStorage.txt	2012-05-04 13:33:49 +0000
2770
+++ doc/PhysicalStorage.txt	1970-01-01 00:00:00 +0000
2771
@@ -1,126 +0,0 @@
2772
1
[[PhysicalStorage]]
2773
2
= Physical B-Tree Representation
2774
3
2775
4
This chapter describes the physical structures used to represent Akiban Persistit records on disk and in memory.
2776
5
2777
6
== Files
2778
7
2779
8
Following is a directory listing illustrating a working Persistit database:
2780
9
2781
10
----
2782
11
-rw-r--r--. 1 demo demo  24G Feb  8 13:18 akiban_data
2783
12
-rw-r--r--. 1 demo demo  48K Feb  8 13:19 akiban_system
2784
13
-rw-r--r--. 1 demo demo 954M Feb  8 13:18 akiban_journal.000000000225
2785
14
-rw-r--r--. 1 demo demo 954M Feb  8 13:19 akiban_journal.000000000226
2786
15
-rw-r--r--. 1 demo demo 954M Feb  8 13:19 akiban_journal.000000000227
2787
16
-rw-r--r--. 1 demo demo 662M Feb  8 13:19 akiban_journal.000000000228
2788
17
----
2789
18
2790
19
This database contains two _volume_ files, +akiban_data+ and +akiban_system+ and four files that constitute part of the _journal_. As explained below, Persistit records are usually stored in a combination of volume and journal files.
2791
20
2792
21
[[Journal]]
2793
22
== The Journal
2794
23
2795
24
The _journal_ is a set of files containing variable length records. The journal is append-only. New records are written only at the end; existing records are never overwritten. The journal consists of a numbered series of files having a configurable maximum size. When a journal file becomes full Persistit closes it and begins a new file with the next counter value. The maximum size of a journal file is determined by a configuration property called its block size.  The default block size value is 1,000,000,000 bytes which works well with today’s standard server hardware.
2796
25
2797
26
Every record in the journal has a 64-bit integer _journal address_. The journal address denotes which file contains the record and the record’s offset within that file. Journal addresses start at zero in a new database instance and grow perpetually. footnote:[Even on a system executing 1 million transactions per second the address space is large enough to last for hundreds of years.]
2798
27
2799
28
Persistit writes two major types of records to journal files.
2800
29
2801
30
- For each committed update transaction, Persistit writes a record containing sufficient information to replay the transaction during recovery. For example, when Persistit stores a key/value pair during a transaction, it writes a record to the journal containing the key and value.
2802
31
- Persistit also writes all updated page images to the journal. Some of these are eventually copied to volume files, as described below. This write/copy mechanism is critical to Persistit’s crash-recovery mechanism (see <<Recovery>>).
2803
32
2804
33
As updates are applied, Persistit constantly appends new information (both transaction records and modified page images) to the end of the highest-numbered file. To prevent the accumulation of a large number of journal files, Persistit also works to copy or remove information from older journal files so that they can be deleted. The background thread responsible for this activity is called the +JOURNAL_COPIER+ thread. The JOURNAL_COPIER copies pages from the journal back into their home volume files, allowing old files to be deleted. Normally a Persistit system at rest gradually copies all updated page images and performs checkpoints so that only one small journal file remains. Applications can accelerate that process by calling the +com.persistit.Persistit#copyBackPages+ method.
2805
34
2806
35
The journal is critical to ensuring Persistit can recover structurally intact B-Trees and apply all committed transactions after a system failure. For this reason, unless the JOURNAL_COPIER is entirely caught up, any attempt to save the state of a Persistit database must include both the volume and journal files.
2807
36
2808
37
The journal also plays a critical role during concurrent backup. To back up a running Persistit database, the +com.persistit.BackupTask+ does the following:
2809
38
2810
39
- Enables +appendOnly+ mode to suspend the copying of updated page images.
2811
40
- Copies the appropriate volume and journal files.
2812
41
- Disables +appendOnly+ mode to allow JOURNAL_COPIER to continue.
2813
42
2814
43
For more details on the journal, checkpoints and transactions, see <<Recovery>>. For more information on concurrent backup and other management tasks, see <<Management>>.
2815
44
2816
45
== Pages and Volumes
2817
46
2818
47
Persistit ultimately stores its data in one or more Volume files. Persistit manages volume files internally in sections called pages. Every page within one volume has the same size. The page size is configurable and may be 1,024, 2,048, 4,096, 8,192, or 16,384 (recommended) bytes long. Once the page size for a volume has been established, it cannot be changed. See <<Configuration>> for details of how to assign the page size for a new volume.
2819
48
2820
49
=== Directory Tree
2821
50
2822
51
Within a volume there can be an unlimited number of B-Trees. (B-Trees are also called simply “trees” in this document.) A tree consists of a set of pages including a _root page_, _index pages_ and _data pages_. The root page can be a data page if the tree is trivial and contains only a small number of records. Usually the root page is an index page which contains references to other index pages, which in turn may refer to data pages.
2823
52
2824
53
Persistit manages a potentially large number of trees by maintaining a tree of trees called +_directory+.  The +_directory+ tree contains the name, root page address, +com.persistit.Accumulator+ data and +com.persistit.TreeStatistics+ data for all the other trees in the volume. The tree name +_directory+ is reserved and may not be used when creating an Exchange.
2825
54
2826
55
=== Data Pages
2827
56
2828
57
A data page contains a representation of one or more variable-length key/value pairs. The number of key/value pairs depends on the page size and on the sizes of the serialized keys and values. The first key in each data page is stored in its entirety, while subsequent keys are stored with _prefix compression_ to reduce storage footprint and accelerate searches. Therefore the storage size of the second and each subsequent key in a data page depends on how many of the leading bytes of its serialized form match its predecessor. (See <<Key>> and <<Value>> for information on how Persistit encodes logical Java values into the byte arrays stored in a data page.)
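The effect of prefix compression on storage size can be sketched as follows. This is a minimal illustration of the idea only; it does not reproduce Persistit's actual page layout or key encoding. The class +PrefixCompressionSketch+ and its methods are hypothetical, and keys are treated as plain UTF-8 strings purely for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of prefix compression; not Persistit's page format.
final class PrefixCompressionSketch {

    // Number of leading bytes that two serialized keys have in common.
    static int commonPrefixLength(byte[] previous, byte[] current) {
        int limit = Math.min(previous.length, current.length);
        int i = 0;
        while (i < limit && previous[i] == current[i]) {
            i++;
        }
        return i;
    }

    // Bytes that still need to be stored for 'current' when it follows 'previous'.
    static int storedSuffixLength(byte[] previous, byte[] current) {
        return current.length - commonPrefixLength(previous, current);
    }

    // Convenience overload that treats keys as UTF-8 strings (an assumption of this sketch).
    static int storedSuffixLength(String previous, String current) {
        return storedSuffixLength(
                previous.getBytes(StandardCharsets.UTF_8),
                current.getBytes(StandardCharsets.UTF_8));
    }
}
```

In this sketch, a 13-byte key such as +customer/1002+ that follows +customer/1001+ shares 12 leading bytes with its predecessor, so only one byte of key data remains to be stored.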
2829
58
2830
59
=== Index Pages
2831
60
2832
61
An index page has a structure similar to a data page, except that instead of holding serialized value data it contains the page addresses of subordinate pages within the tree.
2833
62
2834
63
****
2835
64
TODO - diagram of B-Tree, page layouts, etc
2836
65
****
2837
66
[[Recovery]]
2838
67
== Recovery
2839
68
2840
69
Akiban Persistit is designed, implemented and tested to ensure that whether the application shuts down gracefully or crashes without cleanly closing the database, the database remains structurally intact and internally consistent after restart.
2841
70
2842
71
To do this, Persistit performs a process called _recovery_ every time it starts up.  The recovery process is generally very fast after a normal shutdown. However, it can take a considerable amount of time after a crash because many committed transactions may need to be executed.
2843
72
2844
73
Recovery performs three major activities:
2845
74
2846
75
- Restores all B-Trees to an internally consistent state with a known timestamp.
2847
76
- Replays all transactions that committed after that timestamp.
2848
77
- Prunes multi-version values belonging to certain aborted transactions (see <<Pruning>>).
2849
78
2850
79
To accomplish this, Persistit writes all updates first to the <<Journal>>. Persistit also periodically writes _checkpoint_ records to the journal. During recovery, Persistit finds the last valid checkpoint written before shutdown or crash, restores B-Trees to a state consistent with that checkpoint, and then replays transactions that committed after the checkpoint.
2851
80
2852
81
Recovery depends on the availability of the volume and journal files as they existed prior to abrupt termination. If these are modified or destroyed outside of Persistit, successful recovery is unlikely.
2853
82
2854
83
=== Timestamps and Checkpoints
2855
84
2856
85
Persistit maintains a universal counter called the _timestamp_ counter. Every update operation assigns a new, larger timestamp, and every record in the journal includes the timestamp assigned to the operation writing the record. The timestamp counter is unrelated to clock time.  It is merely a counter.
2857
86
2858
87
A _checkpoint_ is simply a timestamp for which a valid recovery is possible. Periodically Persistit chooses a timestamp to be a new checkpoint. Over time it then ensures that all pages updated before the checkpoint have been written to the journal, and then writes a checkpoint marker. By default checkpoints occur once every two minutes. Normal shutdown through +com.persistit.Persistit#close+ writes a final checkpoint to the journal regardless of when the last checkpoint cycle occurred. That final checkpoint is what allows recovery after a normal shutdown to be very fast.
2859
88
2860
89
Upon start-up Persistit first finds the last valid checkpoint timestamp, and then recovers only those page images from the journal that were written prior to it. The result is that all B-Trees are internally consistent and contain all the updates that were issued and committed to disk before the checkpoint timestamp and none that occurred after the checkpoint timestamp.
2861
90
2862
91
Then Persistit locates and reapplies all transaction records in the journal for transactions that committed after the last valid checkpoint timestamp. These transactions are reapplied to the database, with the result that:
2863
92
2864
93
- The B-Tree index and data structures are intact. All store, fetch, remove and traverse operations will complete successfully. footnote:[Persistit provides the utility class com.persistit.IntegrityCheck to verify the integrity of a Volume.]
2865
94
- All committed transactions are present in the recovered database.  (See <<Transactions>> for durability determined by +CommitPolicy+.)
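The selection of transactions to replay can be sketched as follows. This is a simplified model under stated assumptions, not Persistit's recovery code: the class +RecoveryReplaySketch+, the +TxRecord+ type and its fields are hypothetical stand-ins for journal transaction records, and replay order is assumed to follow the commit timestamp.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the replay step; names and structure are assumptions.
final class RecoveryReplaySketch {

    // Stand-in for a committed transaction record found in the journal.
    static final class TxRecord {
        final long commitTimestamp;
        final String description;

        TxRecord(long commitTimestamp, String description) {
            this.commitTimestamp = commitTimestamp;
            this.description = description;
        }
    }

    // Returns the records that committed after the checkpoint, ordered by commit timestamp.
    static List<TxRecord> selectForReplay(List<TxRecord> journal, long checkpointTimestamp) {
        List<TxRecord> replay = new ArrayList<>();
        for (TxRecord record : journal) {
            // Transactions at or before the checkpoint are already reflected in the recovered pages.
            if (record.commitTimestamp > checkpointTimestamp) {
                replay.add(record);
            }
        }
        replay.sort((a, b) -> Long.compare(a.commitTimestamp, b.commitTimestamp));
        return replay;
    }
}
```

The point of the sketch is that the checkpoint timestamp divides the work: page images restore everything up to the checkpoint, and only later transactions need to be reapplied.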
2866
95
2867
96
For updates occurring outside of a transaction the resulting state is identical to some consistent, reasonably recent state prior to the termination. (“Reasonably recent” depends on the checkpoint interval, which by default is set to two minutes.)
2868
97
2869
98
=== Flush/Force/Checkpoint
2870
99
2871
100
An application may require certainty at various points that all pending updates have been fully written to disk. The +com.persistit.Persistit+ class provides three methods to ensure that updates have been written:
2872
101
2873
102
[horizontal]
2874
103
+com.persistit.Persistit#flush+:: causes Persistit to write all pending updates to the journal. Upon successful completion of flush any pages that needed writing prior to the call to flush are guaranteed to have been written to their respective volume files.
2875
104
+com.persistit.Persistit#force+:: forces the underlying operating system to write pending updates from the operating system’s write-behind cache to the actual disk. (This operation relies on the underlying +java.nio.channels.FileChannel#force(boolean)+ method.)
2876
105
+com.persistit.Persistit#checkpoint+:: causes Persistit to allocate a new checkpoint timestamp and then wait for all updates that happened before that timestamp to be committed to disk.
2877
106
2878
107
However, typical applications, especially those using <<Transactions>>, do not need to invoke these methods. Once a Transaction is durable, so are all other transactions that occurred at timestamps earlier than the transaction’s commit timestamp and no other method calls are required.
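For readers unfamiliar with the underlying JDK call, the following sketch shows what forcing a file channel looks like in plain Java. It uses only standard +java.nio+ classes and does not show Persistit's own implementation; the class +ForceSketch+ and its methods are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of FileChannel#force; not Persistit code.
final class ForceSketch {

    // Appends text to the file, forces it to the storage device, and returns the file size.
    static long writeAndForce(Path file, String text) {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
            // true asks that file metadata be forced as well as content.
            channel.force(true);
            return channel.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs the sketch against a temporary file and cleans up afterwards.
    static long demo() {
        try {
            Path file = Files.createTempFile("force-sketch", ".dat");
            try {
                return writeAndForce(file, "hello");
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Until +force+ returns, written bytes may still reside only in the operating system's cache, which is why a separate call is needed when an application requires certainty that data has reached the disk.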
2879
108
2880
109
2881
110
== The Buffer Pool
2882
111
2883
112
Persistit maintains a cache of page copies in memory called the _buffer pool_. The buffer pool is a critical resource in reducing disk I/O and providing good run-time performance. After performing a relatively expensive disk operation to read a copy of a page into the buffer pool, Persistit retains that copy to allow potentially many fetch and update operations to be performed against keys and values stored in that page.
2884
113
2885
114
Persistit optimizes update operations by writing updated database pages lazily, generally a few seconds to minutes after the update has been performed on the in-memory copy of the page cached in the buffer pool. By writing lazily, Persistit allows many update operations to be completed on each page before incurring a relatively expensive disk I/O operation to write the updated version of the page to the Volume.
2886
115
2887
116
In Persistit the buffer pool is a collection of buffers allocated from the heap for the duration of Persistit’s operation. The buffers are allocated by the +com.persistit.Persistit#initialize+ method and are released when the application invokes close. Because buffers are allocated for the life of the Persistit instance, they impose no garbage collection overhead. (However, especially when using large buffer pool allocation in a JVM with a large heap, there are some special memory configuration issues to consider.  See <<Configuration>> for details.)
2888
117
2889
118
Persistit allocates buffers from the buffer pool in approximately least-recently-used (LRU) order. Most applications exhibit behavior in which data, having been accessed once, is read or updated several more times before the application moves to a different area of the database (locality of reference). LRU is an allocation strategy that yields reasonably good overall throughput by maintaining pages that are likely to be used again in the buffer pool in preference to pages that have not been used for a relatively long time.
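The LRU idea can be sketched with a small cache built on +java.util.LinkedHashMap+ in access order. This shows strict LRU for clarity, whereas Persistit's buffer pool is described above as only approximately LRU; the class +LruPageCacheSketch+ is a hypothetical illustration and not the buffer pool implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical strict-LRU cache for illustration; Persistit's buffer pool works differently.
final class LruPageCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int capacity;

    LruPageCacheSketch(int capacity) {
        // accessOrder = true moves an entry to the end of the iteration order on each access.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; evicts the least recently used entry when full.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

With a capacity of two, reading page 1 after loading pages 1 and 2 makes page 2 the eviction candidate, so loading page 3 evicts page 2 and keeps the recently used page 1.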
2890
119
2891
120
Generally, allocating more buffers in the buffer pool increases the likelihood that a page will be found in the pool rather than having to be reloaded from disk. Since disk I/O is relatively expensive, this means that enlarging the buffer pool is a good strategy for reducing disk I/O and thereby increasing throughput. Persistit is designed to manage extremely large buffer pools very efficiently, so if memory is available, it is generally a good strategy to maximize the buffer pool size.
2892
121
2893
122
== Tools
2894
123
2895
124
The command-line interface (see <<CLI>>) includes tools you can use to examine pages in volumes and records in the journal. Two of these are the +jview+ and +pview+ tasks. The +jview+ command displays, in a readable form, journal records selected by address range, by type, by page address, or by other selection criteria. The +pview+ command displays the contents of pages selected by page address or key from a volume, or by journal address from the journal.
2896
125
2897
126
2898
127
0
2899
=== added file 'doc/ReleaseNotes.rst'
2900
--- doc/ReleaseNotes.rst	1970-01-01 00:00:00 +0000
2901
+++ doc/ReleaseNotes.rst	2012-05-30 18:23:19 +0000
2902
@@ -0,0 +1,84 @@
2903
1
************************************
2904
2
Akiban-Persistit 3.1.1 Release Notes
2905
3
************************************
2906
4
2907
5
Release Date
2908
6
============
2909
7
May 25, 2012
2910
8
2911
9
Overview
2912
10
========
2913
11
This is the first open source release of the Persistit project (https://launchpad.net/akiban-persistit).  
2914
12
2915
13
See http://www.akiban.com/akiban-persistit for a summary of features and benefits, licensing information and how to get support.
2916
14
2917
15
Documentation
2918
16
=============
2919
17
Users Guide (http://www.akiban.com/ak-docs/admin/persistit)
2920
18
JavaDoc  (http://www.akiban.com/ak-docs/admin/persistit/apidocs)
2921
19
2922
20
Building Akiban-Persistit
2923
21
=========================
2924
22
Use Maven (maven.apache.org) to build Persistit.
2925
23
2926
24
To build::
2927
25
2928
26
  mvn install
2929
27
2930
28
The resulting jar files are in the ``target`` directory. To build the Javadoc::
2931
29
2932
30
  mvn javadoc:javadoc
2933
31
2934
32
The resulting Javadoc HTML files are in ``target/site/apidocs``.
2935
33
2936
34
Building and Running the Examples
2937
35
---------------------------------
2938
36
2939
37
Small examples are located in the ``examples`` directory. Each has a short README file describing the example, and an Ant build script (http://ant.apache.org). After building the main akiban-persistit jar file using Maven, you may run::
2940
38
2941
39
  ant run
2942
40
2943
41
in each of the examples subdirectories to build and run the examples.
2944
42
2945
43
Known Issues
2946
44
============
2947
45
2948
46
Transactional Tree Management
2949
47
-----------------------------
2950
48
2951
49
All operations within Trees such as store, fetch, remove and traverse are correctly supported within transactions. However, the operations to create and delete Tree instances currently do not respect transaction boundaries. For example, if a transaction creates a new Tree, it is immediately visible within other Transactions and will continue to exist even if the original transaction aborts.  (However, records inserted or modified by the original transaction will not be visible until the transaction commits.) Prior to creating/removing trees, transaction processing should be quiesced and allowed to complete.
2952
50
2953
51
Problems with Disk Full - Bug 916071
2954
52
------------------------------------
2955
53
2956
54
There are rare cases where Persistit will generate exceptions other than ``java.io.IOException: No space left on device`` when a disk volume containing the journal or volume file fills up. The database will be intact upon recovery, but the application may receive unexpected exceptions.
2957
55
2958
56
Out of Memory Error, Direct Memory Buffer - Bug 985117
2959
57
------------------------------------------------------
2960
58
2961
59
An out-of-memory error for direct memory buffers can cause failed transactions under extreme load conditions as a result of threads getting backed up writing to the journal file. However, this error is transient and recoverable by retrying the failed transaction.
2962
60
2963
61
* Workaround: Ensure your application has the ability to retry failed transactions.
2964
62
2965
63
2966
64
Tree#getChangeCount may return inaccurate result - Bug 986465
2967
65
-------------------------------------------------------------
2968
66
2969
67
The ``getChangeCount`` method may return inaccurate results because it is not currently transactional.  The primary consumer is the ``PersistitMap``. As a result of this bug Persistit may not generate ``java.util.ConcurrentModificationException`` when it is supposed to.
2970
68
2971
69
Multi-Version-Values sometimes not fully pruned - Bug 1000331
2972
70
-------------------------------------------------------------
2973
71
2974
72
Multi-version values are not always pruned properly, causing volume growth.  The number of MVV records and their overhead size can be obtained by running the IntegrityCheck task.
2975
73
2976
74
* Workaround 1: Run the IntegrityCheck task (CLI command icheck) with the -P option which will prune the MVVs. This will remove obsolete MVV instances and in many cases free up pages in which new data can be stored.  However, it will not reduce the actual size of the volume file.
2977
75
2978
76
* Workaround 2: To reduce the size of the volume you can use the CLI commands ``save`` and ``load`` to offload and then reload the data into a newly created volume file. See http://www.akiban.com/ak-docs/admin/persistit/Management.html#management for more information about these operations.
2979
77
2980
78
2981
79
Buffer Pool Configuration
2982
80
=========================
2983
81
For optimal performance, proper configuration of the Persistit buffer pool is required.  See section "Configuring the Buffer Pool" in the configuration document http://www.akiban.com/ak-docs/admin/persistit/Configuration.html
2984
82
2985
83
.. note:: Especially when used with multi-gigabyte heaps, the default Hotspot JVM server heuristics can be suboptimal for Persistit applications. Persistit is usually configured to allocate a large fraction of the heap to Buffer instances that are allocated at startup and held for the duration of the Persistit instance. For efficient operation, all of the Buffer instances must fit in the tenured (old) generation of the heap to avoid very significant garbage collector overhead.  Use either ``-XX:NewSize`` or ``-Xmn`` to adjust the relative sizes of the new and old generations.
2986
84
2987
0
85
2988
=== added file 'doc/Security.rst'
2989
--- doc/Security.rst	1970-01-01 00:00:00 +0000
2990
+++ doc/Security.rst	2012-05-30 18:23:19 +0000
2991
@@ -0,0 +1,93 @@
2992
1
.. _Security:
2993
2
2994
3
Security Notes
2995
4
==============
2996
5
2997
6
Akiban Persistit provides no built-in data access control because it is intended for embedded use in applications that provide their own logical access control. Security-conscious applications must also prevent unauthorized access to Persistit's physical files and to the Persistit API. The following resources must be protected from unauthorized access:
2998
7
2999
8
- Persistit volume files
3000
9
- configuration properties file, if one is used
3001
10
- access by unauthorized code to the API exposed by the Persistit library
3002
11
- the port opened and exposed by Persistit when either the ``com.persistit.rmiport`` or ``com.persistit.rmihost`` property is set. If you are using Persistit's remote administration feature, be sure to block unauthorized access to the RMI port.
3003
12
- the :ref:`cliserver` if instantiated
3004
13
3005
14
In addition to these general deployment considerations, Persistit requires certain permissions in an environment controlled by a security manager.
3006
15
3007
16
Java programs run from the command line typically do not install a security manager, and therefore implicitly grant all permissions. However, when Persistit is used within an Applet, or within any framework that installs a security manager, it is important to understand what permissions are required.
3008
17
3009
18
Security Domains
3010
19
----------------
3011
20
3012
21
This section assumes a basic understanding of the Java security model. See `New Protection Mechanisms - Overview of Basic Concepts <http://download.oracle.com/javase/1.5.0/docs/guide/security/spec/security-spec.doc2.html>`_ for further information.
3013
22
3014
23
Note that when Java is started from the command line, as is often the case in server applications, all security privileges are granted by default. The information in this section is intended for cases where security privileges need to be controlled.
3015
24
3016
25
Akiban Persistit performs various security-sensitive operations: it reads and writes files, it reads system properties, it optionally opens a TCP/IP socket, and it performs various security-sensitive operations related to reflection and serialization. These operations are divided into two categories: those required only by the Persistit library itself, and those required by the application that is using Persistit. For example, the application code must have permission to read and write files, but it does not require permission to access private fields through reflection; only the Persistit library itself needs this permission. The latter operation is called a “privileged” operation because Persistit invokes the access controller's doPrivileged method to establish its permission to perform the privileged operation.
3017
26
3018
27
At a practical level, this means you can create two separate security domains for applications embedding Persistit.  One domain is specific to the Persistit library itself, and grants all the permissions required by Persistit, including the privileged permissions. The other domain includes the application, and grants only the non-privileged permissions.
3019
28
3020
29
Using the default java.lang.SecurityManager implementation, you define domains and the permissions available to them in a policy file stored in the user's home directory.  For those already familiar with the policy file format, here is a security policy file that illustrates these concepts:
3021
30
3022
31
.. code-block:: java
3023
32
3024
33
  grant codeBase "file:/appdir/myapplication.jar" {
3025
34
    permission java.io.FilePermission "e:\\data\\*", "read, write, delete";
3026
35
    permission java.io.FilePermission "e:\\logs\\*", "read, write, delete";
3027
36
    permission java.net.SocketPermission "localhost",
3028
37
           	  "accept, connect, listen, resolve";
3029
38
  };
3030
39
3031
40
  grant codeBase "file:/lib/akiban-persistit.jar" {
3032
41
    permission java.io.FilePermission "<<ALL FILES>>", "read, write, delete";
3033
42
    permission java.net.SocketPermission "*:1099-",
3034
43
         	  "accept, connect, listen, resolve";
3035
44
    permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
3036
45
    permission java.io.SerializablePermission "enableSubclassImplementation";
3037
46
    permission java.util.PropertyPermission "com.persistit.*", "read";
3038
47
    permission java.lang.RuntimePermission "accessDeclaredMembers";
3039
48
  };
3040
49
3041
50
This policy file sets up two security domains. One covers the application code in ``myapplication.jar`` and grants just the restricted set of permissions needed by the application, while the other grants additional permissions to the Persistit library.
3042
51
3043
52
Although the file and socket permissions granted by the privileged domain are less restrictive than those granted to ``myapplication.jar``, the actual permission granted to running code is the intersection of these two and is therefore restricted to just the set of files and sockets granted to myapplication.jar.
3044
53
3045
54
Permissions Required for DefaultValueCoder
3046
55
------------------------------------------
3047
56
3048
57
DefaultValueCoder performs three security-sensitive operations:
3049
58
3050
59
- It enumerates the declared fields of the class being serialized,
3051
60
- It reads and writes data from and to those fields using reflection even if they are private, and
3052
61
- It overrides the default implementations of java.io.ObjectInputStream and java.io.ObjectOutputStream.
3053
62
3054
63
If a SecurityManager is installed then three permissions must be granted to enable this new mechanism::
3055
64
3056
65
  java.lang.RuntimePermission "accessDeclaredMembers";
3057
66
  java.lang.reflect.ReflectPermission "suppressAccessChecks";
3058
67
  java.io.SerializablePermission "enableSubclassImplementation";
3059
68
3060
69
Persistit acquires these permissions through privileged operations, meaning that only the Persistit library domain needs to have them – they do not need to be and should not be granted to the application domain.
3061
70
3062
71
Permission Required for Reading System Properties
3063
72
-------------------------------------------------
3064
73
3065
74
Persistit attempts to read system properties whose names begin with “com.persistit.” Specific permission to do this is granted by the line::
3066
75
3067
76
   permission java.util.PropertyPermission "com.persistit.*", "read";
3068
77
3069
78
Again, only the Persistit library domain needs to have this permission. If this permission is not granted, Persistit ignores all system properties.
3070
79
3071
80
Permissions Required for File and Socket I/O
3072
81
--------------------------------------------
3073
82
3074
83
Persistit needs permission to read and write its volume and journal files, to read a configuration properties file and (optionally) write to a log file.  File I/O permissions also apply to the source and destination files specified for Import and Export tasks available within the AdminUI utility. In addition, if you specify either the rmihost or rmiport property to enable remote administration, Persistit needs permission to create RMI connections.
3075
84
3076
85
These are not privileged operations, meaning that if the policy establishes separate domains for the application and the Persistit library, both domains must grant permission for all I/O operations. (If they were privileged operations, untrusted application code could use the Persistit library as a proxy to perform malicious file I/O.) As defined by the Java security mechanism, when Persistit attempts to open a file, the permissions of both the application domain and the Persistit library domain are checked; if the operation is denied by either domain then the attempt fails with a ``java.security.AccessControlException``.
3077
86
3078
87
As specified in the sample policy file above, the Persistit library domain has been granted permissions on ``<<ALL FILES>>``. This means that the application domain controls what subset of the file system is accessible.  In the example, files may only be read from and written to the ``e:\data`` and ``e:\logs`` directories on a Windows machine.  (See the Java documentation on Permissions for details on how to construct File and Socket permissions.)
3079
88
3080
89
Deploying Persistit as an Installed Optional Package
3081
90
----------------------------------------------------
3082
91
3083
92
A convenient way to grant Persistit the permissions required to perform its privileged operations is to install it as an optional package. The Sun Java Runtime Environment treats JAR files located in the ``<jre-home>/lib/ext`` directory as optional Java extension classes, and by default grants them the same privileges as Java system classes. If the Persistit library is loaded from this directory then only the application File and Socket privileges need to be granted explicitly through a security policy. To deploy Persistit in this manner simply copy the Persistit library jar file to the appropriate ``<jre-home>/lib/ext`` directory.
3084
93
3085
0
94
3086
=== removed file 'doc/Security.txt'
3087
--- doc/Security.txt	2012-04-30 22:09:31 +0000
3088
+++ doc/Security.txt	1970-01-01 00:00:00 +0000
3089
@@ -1,91 +0,0 @@
3090
1
[[Security]]
3091
2
= Security Notes
3092
3
3093
4
Akiban Persistit provides no built-in data access control because it is intended for embedded use in applications that provide their own logical access control. Security-conscious applications must also prevent unauthorized access to Persistit's physical files and to the Persistit API. The following resources must be protected from unauthorized access:
3094
5
3095
6
- Persistit volume files
3096
7
- configuration properties file, if one is used
3097
8
- access by unauthorized code to the API exposed by the Persistit library
3098
9
- the port opened and exposed by Persistit when either the +com.persistit.rmiport+ or +com.persistit.rmihost+ property is set. If you are using Persistit's remote administration feature, be sure to block unauthorized access to the RMI port.
3099
10
- the <<cliserver>> if instantiated
3100
11
3101
12
In addition to these general deployment considerations, Persistit requires certain permissions in an environment controlled by a security manager.
3102
13
3103
14
Java programs run from the command line typically do not install a security manager, and therefore implicitly grant all permissions. However, when Persistit is used within an Applet, or within any framework that installs a security manager, it is important to understand what permissions are required.
3104
15
3105
16
== Security Domains
3106
17
3107
18
This section assumes a basic understanding of the Java security model. See http://download.oracle.com/javase/1.5.0/docs/guide/security/spec/security-spec.doc2.html[New Protection Mechanisms - Overview of Basic Concepts] for further information.
3108
19
3109
20
Note that when Java is started from the command line, as is often the case in server applications, all security privileges are granted by default. The information in this section is intended for cases where security privileges need to be controlled.
3110
21
3111
22
Akiban Persistit performs various security-sensitive operations: it reads and writes files, it reads system properties, it optionally opens a TCP/IP socket, and it performs various security-sensitive operations related to reflection and serialization. These operations are divided into two categories: those required only by the Persistit library itself, and those required by the application that is using Persistit. For example, the application code must have permission to read and write files, but it does not require permission to access private fields through reflection; only the Persistit library itself needs this permission. The latter operation is called a “privileged” operation because Persistit invokes the access controller's doPrivileged method to establish its permission to perform the privileged operation.
3112
23
3113
24
At a practical level, this means you can create two separate security domains for applications embedding Persistit.  One domain is specific to the Persistit library itself, and grants all the permissions required by Persistit, including the privileged permissions. The other domain includes the application, and grants only the non-privileged permissions.
3114
25
3115
26
Using the default java.lang.SecurityManager implementation, you define domains and the permissions available to them in a policy file stored in the user's home directory.  For those already familiar with the policy file format, here is a security policy file that illustrates these concepts:
3116
27
3117
28
.Example .java.policy File
3118
29
----
3119
30
grant codeBase "file:/appdir/myapplication.jar" {
3120
31
  permission java.io.FilePermission "e:\\data\\*", "read, write, delete";
3121
32
  permission java.io.FilePermission "e:\\logs\\*", "read, write, delete";
3122
33
  permission java.net.SocketPermission "localhost",
3123
34
         	"accept, connect, listen, resolve";
3124
35
};
3125
36
3126
37
grant codeBase "file:/lib/akiban-persistit.jar" {
3127
38
  permission java.io.FilePermission "<<ALL FILES>>", "read, write, delete";
3128
39
  permission java.net.SocketPermission "*:1099-",
3129
40
         	"accept, connect, listen, resolve";
3130
41
  permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
3131
42
  permission java.io.SerializablePermission "enableSubclassImplementation";
3132
43
  permission java.util.PropertyPermission "com.persistit.*", "read";
3133
44
  permission java.lang.RuntimePermission "accessDeclaredMembers";
3134
45
};
3135
46
----
3136
47
3137
48
This policy file sets up two security domains. One covers the application code in +myapplication.jar+ and grants just the restricted set of permissions needed by the application, while the other grants additional permissions to the Persistit library.
3138
49
3139
50
Although the file and socket permissions granted by the privileged domain are less restrictive than those granted to +myapplication.jar+, the actual permission granted to running code is the intersection of these two and is therefore restricted to just the set of files and sockets granted to myapplication.jar.
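
The intersection rule can be illustrated with a small, self-contained sketch. It is not Persistit code: it models each domain as a single +java.io.FilePermission+ and treats an operation as allowed only when every domain implies it. The class name +PermissionIntersection+ and the Unix-style paths are assumptions made for this illustration.

```java
import java.io.FilePermission;

// Hypothetical helper, for illustration only: a request is allowed only
// if every domain on the call stack grants a permission that implies it.
final class PermissionIntersection {

    static boolean allowed(String path, String actions, FilePermission... domainGrants) {
        FilePermission requested = new FilePermission(path, actions);
        for (FilePermission grant : domainGrants) {
            if (!grant.implies(requested)) {
                return false; // denied by at least one domain
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Library domain: unrestricted. Application domain: one directory.
        FilePermission library = new FilePermission("<<ALL FILES>>", "read,write,delete");
        FilePermission application = new FilePermission("/data/*", "read,write,delete");

        System.out.println(allowed("/data/vol.v01", "read,write", library, application));
        System.out.println(allowed("/etc/passwd", "read", library, application));
    }
}
```

Under these assumptions the first request is allowed and the second is denied, even though the library grant alone would permit both.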
3140
51
3141
52
== Permissions Required for DefaultValueCoder
3142
53
3143
54
DefaultValueCoder performs three security-sensitive operations:
3144
55
3145
56
- It enumerates the declared fields of the class being serialized,
3146
57
- It reads and writes data from and to those fields using reflection even if they are private, and
3147
58
- It overrides the default implementations of java.io.ObjectInputStream and java.io.ObjectOutputStream.
3148
59
3149
60
If a SecurityManager is installed then three permissions must be granted to enable this mechanism:

----
permission java.lang.RuntimePermission "accessDeclaredMembers";
permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
permission java.io.SerializablePermission "enableSubclassImplementation";
----
3156
67
3157
68
Persistit acquires these permissions through privileged operations, meaning that only the Persistit library domain needs to have them – they do not need to be and should not be granted to the application domain.
3158
69
3159
70
== Permission Required for Reading System Properties
3160
71
3161
72
Persistit attempts to read system properties whose names begin with “com.persistit.” Specific permission to do this is granted by the line:

----
   permission java.util.PropertyPermission "com.persistit.*", "read";
----
3166
77
3167
78
Again, only the Persistit library domain needs to have this permission. If this permission is not granted, Persistit ignores all system properties.
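
The behavior described here can be sketched in plain Java. This is not Persistit's actual implementation; the class +PrivilegedPropertyReader+ and its +read+ method are hypothetical names. The sketch reads a property inside +AccessController.doPrivileged+ and falls back to a default when access is denied. Note that +AccessController+ is deprecated in recent JDK releases, so the example applies to runtimes where a SecurityManager is still supported.

```java
import java.security.AccessController;
import java.security.PrivilegedAction;

// Hypothetical sketch: read a system property as a privileged operation
// and ignore the property if the security policy denies access.
final class PrivilegedPropertyReader {

    @SuppressWarnings({"removal", "deprecation"})
    static String read(String name, String fallback) {
        try {
            // The cast selects the PrivilegedAction overload of doPrivileged.
            String value = AccessController.doPrivileged(
                    (PrivilegedAction<String>) () -> System.getProperty(name));
            return value != null ? value : fallback;
        } catch (SecurityException e) {
            // Permission not granted: behave as if the property were unset.
            return fallback;
        }
    }
}
```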
3168
79
3169
80
== Permissions Required for File and Socket I/O
3170
81
3171
82
Persistit needs permission to read and write its volume and journal files, to read a configuration properties file and (optionally) write to a log file.  File I/O permissions also apply to the source and destination files specified for Import and Export tasks available within the AdminUI utility. In addition, if you specify either the rmihost or rmiport property to enable remote administration, Persistit needs permission to create RMI connections.
3172
83
3173
84
These are not privileged operations, meaning that if the policy establishes separate domains for the application and the Persistit library, both domains must grant permission for all I/O operations. (If they were privileged operations, untrusted application code could use the Persistit library as a proxy to perform malicious file I/O.) As defined by the Java security mechanism, when Persistit attempts to open a file, the permissions of both the application domain and the Persistit library domain are checked; if the operation is denied by either domain then the attempt fails with a java.security.AccessControlException.
3174
85
3175
86
As specified in the sample policy file above, the Persistit library domain has been granted permissions on `<<ALL FILES>>`. This means that the application domain controls what subset of the file system is accessible.  In the example, files may only be read from and written to the e:\data and e:\logs directories on a Windows machine.  (See the Java documentation on Permissions for details on how to construct File and Socket permissions.)
3176
87
3177
88
== Deploying Persistit as an Installed Optional Package
3178
89
3179
90
A convenient way to grant Persistit the permissions required to perform its privileged operations is to install it as an optional package. The Sun Java Runtime Environment treats JAR files located in the +_jre-home_/lib/ext+ directory as optional Java extension classes, and by default grants them the same privileges as Java system classes. If the Persistit library is loaded from this directory then only the application's File and Socket permissions need to be granted explicitly through a security policy. To deploy Persistit in this manner, simply copy the Persistit library jar file to the appropriate +_jre-home_/lib/ext+ directory.
3180
91
3181
92
0
3182
=== added file 'doc/Serialization.rst'
3183
--- doc/Serialization.rst	1970-01-01 00:00:00 +0000
3184
+++ doc/Serialization.rst	2012-05-30 18:23:19 +0000
3185
@@ -0,0 +1,250 @@
3186
1
.. _Serialization:
3187
2
3188
3
Serializing Object Values
3189
4
=========================
3190
5
3191
6
Akiban Persistit uses one of several mechanisms to serialize a Java Object into a ``com.persistit.Value``.
3192
7
3193
8
* For the following classes, Persistit provides built-in optimized serialization logic that cannot be overridden:
3194
9
3195
10
  * ``java.lang.String``
3196
11
  * ``java.util.Date``
3197
12
  * ``java.math.BigInteger``
3198
13
  * ``java.math.BigDecimal``
3199
14
  * Wrapper classes for primitive values (``Boolean``, ``Byte``, ``Short``, etc.)
3200
15
  * All arrays (however, the mechanisms described here apply to array elements).
3201
16
3202
17
* A custom ``com.persistit.encoding.ValueCoder`` registered by the application to handle serialization of a particular class,
* Default serialization using Persistit's built-in serialization mechanism described below, or
* Standard Java serialization as described in the `Java Object Serialization Specification <http://download.oracle.com/javase/1.5.0/docs/guide/serialization/spec/serial-arch.html>`_.
3205
20
3206
21
Persistit's default serialization method serializes objects into approximately 33% fewer bytes, and depending on the structure of objects being serialized, is about 40% faster than Java serialization.
3207
22
3208
23
Storing Objects in Persistit
3209
24
----------------------------
3210
25
3211
26
To store an object value into a Persistit database, you put the object into the Value field of an Exchange, and then invoke the Exchange's store method as shown in this code fragment:
3212
27
3213
28
.. code-block:: java

    exchange.getValue().put(myObject);
    exchange.store();
3217
32
3218
33
Of course, Persistit cannot actually store a live object on disk.  Instead it creates and stores a byte array containing state information about the object. Subsequently you fetch an object from Persistit as follows:
3219
34
3220
35
.. code-block:: java

    exchange.fetch();
    MyClass myObject = (MyClass)exchange.getValue().get();
3224
39
3225
40
The resulting MyClass instance is a newly constructed object instance that is equivalent - subject to the accuracy of the serialization code - to the original object. This process is equivalent to the serialization and deserialization capabilities provided by java.io.ObjectOutputStream and java.io.ObjectInputStream.
3226
41
3227
42
Persistit makes use of helper classes called “coders” to marshal data between live objects and their stored byte-array representations. Value coders, which implement ``com.persistit.encoding.ValueCoder``, marshal data to and from Value objects; ``com.persistit.encoding.KeyCoder`` implementations do the same for ``com.persistit.Key`` objects. A value coder provides capability somewhat like the custom serialization logic implemented through ``readObject``, ``writeObject``, ``readExternal`` and ``writeExternal``. However, a value coder can provide this logic for any class without modifying the class itself, which may be important if the class is part of a closed library.
3228
43
3229
44
You may create and register a value coder for almost any class, including classes that are not marked Serializable. The exceptions are the classes listed above that have built-in, non-overridable serialization logic.
3230
45
3231
46
DefaultValueCoder and SerialValueCoder
3232
47
--------------------------------------
3233
48
3234
49
When required to serialize or deserialize a class with no explicitly defined ``ValueCoder``, Persistit automatically creates and registers one of the following two default ``ValueCoder`` implementations:
3235
50
3236
51
``com.persistit.DefaultValueCoder``
    uses introspection to determine which fields to serialize, and reflection to access and update the fields.
``com.persistit.encoding.SerialValueCoder``
    creates instances of ObjectInputStream and ObjectOutputStream to serialize and deserialize the object.
3238
53
3239
54
DefaultValueCoder uses a more compact storage format and is significantly faster than standard Java serialization; however, it imposes certain limitations and trade-offs described below. By default, Persistit will use a DefaultValueCoder. However, you can identify classes that should instead be serialized and deserialized by ``SerialValueCoder`` by specifying the ``serialOverride`` configuration property, which is described below.
3240
55
3241
56
DefaultValueCoder
3242
57
-----------------
3243
58
3244
59
A DefaultValueCoder uses Java reflection to access and update the fields of an arbitrary object. The set of fields is defined by the Java Object Serialization Specification. By default, these include all non-static, non-transient fields of the current class and its Serializable superclasses.  A class may override this default set by specifying an array of ``java.io.ObjectStreamField`` objects in a private final static field named ``serialPersistentFields``, as described in the specification.
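
The effect of ``serialPersistentFields`` can be seen with standard Java serialization alone. The sketch below does not use Persistit; the ``Account`` class and ``PersistentFieldsDemo`` helper are hypothetical names invented for this illustration. Only the field named in ``serialPersistentFields`` survives a round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

// Hypothetical class: only "owner" is declared as a persistent field.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;

    private static final ObjectStreamField[] serialPersistentFields = {
        new ObjectStreamField("owner", String.class)
    };

    String owner;
    String displayLabel; // non-transient, but excluded by serialPersistentFields

    Account(String owner, String displayLabel) {
        this.owner = owner;
        this.displayLabel = displayLabel;
    }
}

final class PersistentFieldsDemo {
    // Serializes and deserializes through an in-memory byte array.
    static Account roundTrip(Account original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After the round trip, ``owner`` is restored and ``displayLabel`` is ``null``.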
3245
60
3246
61
``DefaultValueCoder`` invokes the special methods ``readResolve``, ``writeReplace``, ``readObject`` and ``writeObject`` (or, for Externalizable classes, ``writeExternal`` and ``readExternal``) to provide compatible custom serialization support. To support the ``readObject``/``readExternal`` and ``writeObject``/``writeExternal`` methods, Persistit creates extended implementations of ``java.io.ObjectOutputStream`` and ``java.io.ObjectInputStream``. These use a custom serialization format optimized for writing to a Value's backing byte array. For example, they do not organize data into 1,024-byte blocks, and they factor metadata about classes into a separate class information database so that this information is not repeated in multiple records containing instances of the same class.
3247
62
3248
63
Currently, ``DefaultValueCoder`` does not support the following elements of the serialization API:
3249
64
3250
65
- the ``readObjectNoData`` custom serialization method
3251
66
- the ``PutField``/``GetField`` API of ``ObjectOutputStream`` and ``ObjectInputStream``
3252
67
- the ``readLine`` method of ``ObjectInputStream``.
3253
68
3254
69
Constructing Objects upon Deserialization
3255
70
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3256
71
3257
72
When deserializing a value, ``DefaultValueCoder`` combines information about the original object's class and the stored field data to reconstruct an object equivalent to the original. To do so it must first construct a new instance of the class and then decode and set its serialized fields.
3258
73
3259
74
For compatibility with standard Java serialization, ``DefaultValueCoder`` constructs new object instances of Serializable classes using the same logic as ``ObjectInputStream``, namely:
3260
75
3261
76
If the class is Externalizable, ``DefaultValueCoder`` invokes its public no-argument constructor. (The specification for Externalizable requires the class to have such a constructor.)
3262
77
3263
78
Otherwise, if the class is Serializable, ``DefaultValueCoder`` invokes the no-argument constructor of its nearest non-serializable superclass.
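
The second rule is easy to observe with standard Java serialization, whose construction logic ``DefaultValueCoder`` is described as following. The sketch below uses only ``java.io``; the classes ``PlainBase``, ``SerializableChild`` and ``ConstructionDemo`` are hypothetical names for this illustration, and the code does not call Persistit.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Nearest non-serializable superclass: its no-argument constructor runs
// during deserialization.
class PlainBase {
    int baseField;

    PlainBase() {
        baseField = 42;
    }
}

// Serializable subclass: its own constructor does NOT run during
// deserialization.
class SerializableChild extends PlainBase implements Serializable {
    private static final long serialVersionUID = 1L;

    int childField;
    transient boolean constructorRan;

    SerializableChild(int childField) {
        this.childField = childField;
        this.baseField = 7;
        this.constructorRan = true;
    }
}

final class ConstructionDemo {
    // Serializes and deserializes through an in-memory byte array.
    static SerializableChild roundTrip(SerializableChild original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SerializableChild) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For ``new SerializableChild(5)``, the deserialized copy has ``childField == 5`` (restored from the stream), ``baseField == 42`` (set by the ``PlainBase`` constructor, since fields of a non-serializable superclass are not stored) and ``constructorRan == false``.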
3264
79
3265
80
``DefaultValueCoder`` must use platform-specific logic when constructing instances of Serializable classes: specifically, it invokes the same internal, non-public method as ``ObjectInputStream``. We have verified correct behavior on a wide range of Java runtime environments, but because the implementation uses private methods within various JRE versions, it is possible (though unlikely) that a future JRE will not provide a comparable capability.
3266
81
3267
82
To avoid using platform-specific API calls, you can specify the configuration property::
3268
83
3269
84
    constructorOverride=true
3270
85
3271
86
When this property is ``true``, ``DefaultValueCoder`` requires each object being serialized or deserialized to have a no-argument constructor through which instances will be constructed during deserialization. Unless the class implements Externalizable, that constructor may be private, package-private, protected or public.
3272
87
3273
88
Extending DefaultValueCoder
3274
89
^^^^^^^^^^^^^^^^^^^^^^^^^^^
3275
90
3276
91
You can register an extended ``DefaultValueCoder`` to provide custom behavior, including custom logic for constructing instances of a class, as shown here:
3277
92
3278
93
.. code-block:: java

    Persistit.getInstance().getCoderManager().registerValueCoder(MyClass.class,
            new DefaultValueCoder(MyClass.class) {
        public Object get(Value value, Class clazz, CoderContext context) throws ConversionException {

            // Construct the object being deserialized.
            Object instance = new MyClass(...custom arguments...);

            // See "registering objects while deserializing" below
            value.registerEncodedObject(instance);

            // Load the non-transient, non-static fields
            render(value, instance, clazz, context);

            return instance;
        }
    });
3295
110
3296
111
3297
112
3298
113
Security Policy Requirements for DefaultValueCoder
3299
114
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3300
115
3301
116
``DefaultValueCoder`` performs three security-sensitive operations: (a) it enumerates the declared fields of the class being serialized, (b) it reads and writes data from and to those fields using reflection, even if they are private, and (c) it overrides the default implementations of java.io.ObjectInputStream and java.io.ObjectOutputStream. If a SecurityManager is installed then three permissions must be granted to enable this mechanism::

  permission java.lang.RuntimePermission "accessDeclaredMembers";
  permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
  permission java.io.SerializablePermission "enableSubclassImplementation";
3306
121
3307
122
See :ref:`Security` for an extended discussion on security policy issues for Persistit.
3308
123
3309
124
SerialValueCoder
3310
125
----------------
3311
126
3312
127
``SerialValueCoder`` uses standard Java serialization to store and retrieve object values. Typically this results in slower performance and a more verbose storage format than ``DefaultValueCoder``, but there are a number of reasons why a particular application might require standard Java serialization, including:
3313
128
3314
129
- the security context into which the application will be deployed does not grant the permissions noted above that are required for ``DefaultValueCoder``,
3315
130
- to avoid Persistit's use of private API calls to construct object instances during deserialization,
3316
131
- a preference for the use of a standard format defined within the Java platform rather than Persistit's custom format,
3317
132
- limitations documented above on the API elements available during custom deserialization within DefaultValueCoder, for example non-support of GetField and PutField.
3318
133
3319
134
Your application can specify that ``SerialValueCoder`` be used for specific classes either by explicitly creating and registering ``SerialValueCoder`` instances, or by naming the classes in the ``com.persistit.serialOverride`` property.
3320
135
3321
136
To explicitly register a ``SerialValueCoder`` for the class ``MyClass``, do this:
3322
137
3323
138
.. code-block:: java

    ...
    Persistit.getInstance().getCoderManager().registerValueCoder(
            MyClass.class,
            new SerialValueCoder(MyClass.class));
    ...
3330
145
3331
146
3332
147
The ``com.persistit.serialOverride`` Configuration Property
3333
148
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3334
149
3335
150
The ``serialOverride`` property specifies classes that are to be serialized by ``SerialValueCoder`` rather than ``DefaultValueCoder``. This property affects how Persistit assigns a value coder when none has previously been registered. It does not override or affect explicitly registered coders.
3336
151
3337
152
Names are separated by commas and may contain wild cards.
3338
153
3339
154
The following are valid patterns:
3340
155
3341
156
  ``java.io.File``
3342
157
      Just the File class.
3343
158
  ``java.io.*``
3344
159
      All classes in the java.io package.
3345
160
  ``java.awt.**``
3346
161
      All classes in the java.awt package and its sub-packages
3347
162
  ``java.util.*Map``
3348
163
      All of the Map classes in the java.util package.
3349
164
  ``**``
3350
165
      All classes in all packages
3351
166
3352
167
More precisely, ``serialOverride`` specifies a comma-delimited list of zero or more patterns, each of which is either a fully-qualified class name or a pattern that contains exactly one wild card. The wild card “\*” replaces any sequence of characters other than a period (“.”), while “\*\*” replaces any sequence of characters including periods.  For example::
3353
168
3354
169
  serialOverride=org.apache.**,com.mypkg.serialstuff.*,com.mypkg.MyClass
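
The wild card semantics described above can be sketched as a small matcher. This is an illustration of the stated rules, not Persistit's implementation: the class ``SerialOverrideMatcher`` and its methods are hypothetical, and the sketch does not enforce the "exactly one wild card" restriction.

```java
import java.util.regex.Pattern;

// Hypothetical matcher: "*" matches any characters except '.',
// "**" matches any characters including '.'.
final class SerialOverrideMatcher {

    static Pattern compile(String pattern) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < pattern.length()) {
            if (pattern.startsWith("**", i)) {
                regex.append(".*");
                i += 2;
            } else if (pattern.charAt(i) == '*') {
                regex.append("[^.]*");
                i++;
            } else {
                // Copy the literal run up to the next wild card, quoted so
                // that '.' and other characters are matched literally.
                int next = pattern.indexOf('*', i);
                if (next < 0) {
                    next = pattern.length();
                }
                regex.append(Pattern.quote(pattern.substring(i, next)));
                i = next;
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Returns true if className matches any pattern in the comma-delimited list.
    static boolean matches(String overrideList, String className) {
        for (String raw : overrideList.split(",")) {
            String pattern = raw.trim();
            if (!pattern.isEmpty() && compile(pattern).matcher(className).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

With this sketch, ``java.io.*`` matches ``java.io.File`` but not a class in a sub-package, while ``java.awt.**`` also matches classes in sub-packages such as ``java.awt.event.KeyEvent``.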
3355
170
3356
171
As with all configuration properties, you may specify this property in the persistit.properties file or as a system property through a Java command-line argument in the form::

  -Dcom.persistit.serialOverride=...
3359
174
3360
175
Registering Objects in a Custom ``ValueCoder``
3361
176
----------------------------------------------
3362
177
3363
178
In a custom ``ValueCoder`` implementation, the ``get`` method is responsible for constructing and populating an instance of an object. The following pattern should be used when implementing the get method:
3364
179
3365
180
.. code-block:: java

    public Object get(Value value, Class clazz, CoderContext context) throws ConversionException {
        // Construct the object being deserialized.
        Object instance = ...constructor for the object...

        // Associate a handle with the newly created instance.
        value.registerEncodedObject(instance);

        // Populate the object's internal state.
        ... load the fields – for example, by calling render...

        return instance;
    }
3383
198
3384
199
The purpose of the ``registerEncodedObject`` method is to record the association between the newly created object and an internal integer-valued handle that may be used subsequently in the serialization stream to refer to that object. This mechanism supports objects that may have fields that refer either directly or indirectly back to the same object – i.e., that participate in a cyclical reference graph.
3385
200
3386
201
As a concrete example, consider a Person class with a spouse field such that for married couple p and q,  p.spouse is q and q.spouse is p. When Persistit serializes p it also serializes q, but when it serializes q's spouse field, it records a reference handle associated with the already-serialized instance of p rather than writing a new copy of p in the serialization stream. Upon deserializing q, Persistit looks up the object for the recorded handle to correctly associate the already-deserialized p instance with q.
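
Standard Java serialization uses a comparable handle mechanism, so the spouse example can be reproduced without Persistit. The sketch below is only an analogy: ``MarriedPerson`` and ``CycleDemo`` are hypothetical names, and the code exercises ``java.io`` streams rather than a Persistit ``Value``.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class with a cyclic reference: p.spouse == q and q.spouse == p.
class MarriedPerson implements Serializable {
    private static final long serialVersionUID = 1L;

    final String name;
    MarriedPerson spouse;

    MarriedPerson(String name) {
        this.name = name;
    }
}

final class CycleDemo {
    // Serializes and deserializes through an in-memory byte array.
    static MarriedPerson roundTrip(MarriedPerson original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (MarriedPerson) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds the married couple p and q and returns p.
    static MarriedPerson couple() {
        MarriedPerson p = new MarriedPerson("p");
        MarriedPerson q = new MarriedPerson("q");
        p.spouse = q;
        q.spouse = p;
        return p;
    }
}
```

After ``roundTrip(couple())``, the copy's ``spouse.spouse`` is the copy itself rather than a third object, because the second occurrence of ``p`` was written as a back-reference handle.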
3387
202
3388
203
Whenever you implement a custom ``get()`` method in any ``ValueCoder``, you must notify the underlying Value object about the newly created object by calling registerEncodedObject before deserializing its fields so that any back-references made within serialized fields of that object can find the object correctly.
3389
204
3390
205
``Value.toString()`` and ``decodeDisplayable``
3391
206
----------------------------------------------
3392
207
3393
208
In many cases it is not very useful simply to display the result of evaluating ``toString()`` on an object. The default toString method inherited from Object conveys just a class name and a memory handle. In addition, for remote operations of AdminUI, it may not even be feasible to construct a deserialized object for each record. Therefore, ``com.persistit.Value`` provides a specialized ``toString()`` method to render an arbitrary object value into a legible string. The AdminUI utility uses this facility to summarize the data contained in a Tree.
3394
209
3395
210
Persistit creates the displayable String value using the following algorithm:
3396
211
3397
212
- If the state represented by this Value is undefined, then return "undefined".
3398
213
- If the state is null or a boolean, return "null", "false", or "true".
3399
214
- If the value represents a primitive type, return the string representation of the value, prefixed by "(byte)", "(short)", "(char)", "(long)", or "(float)" for the corresponding types. Values of type int and double are presented without prefix to reduce clutter.
3400
215
- If the value represents a String, return a modified form of the string enclosed in double quotes. Each double quote character in the string is replaced by ``\"``, and each character outside of the printable ASCII character set is replaced by ``\b``, ``\t``, ``\n``, ``\r`` or ``\uNNNN``, such that the modified string would be a valid Java string constant.
3401
216
- If the value represents a wrapper for a primitive value (i.e., a java.lang.Boolean, java.lang.Byte, etc.) return the string representation of the value prefixed by "(Boolean)", "(Byte)", "(Short)", "(Character)", "(Integer)", "(Long)", "(Float)" or "(Double)".  The package name java.lang is removed to reduce clutter.
3402
217
- If the value represents a java.util.Date, return a formatted representation of the date using the format specified by Key.SDF. This is a readable format that displays the date with full precision, including milliseconds.
3403
218
- If the value represents an array, return a list of comma-separated element values surrounded by square brackets.
3404
219
- If the value represents one of the standard Collection implementations in the java.util package, then return a comma-separated list of values surrounded by square brackets.
3405
220
- If the value represents one of the standard Map implementations in the java.util package, then return a comma-separated list of key/value pairs surrounded by square brackets. Each key/value pair is represented by a string in the form key->value.
3406
221
- If the value represents an object of a class for which there is a registered com.persistit.encoding.ValueDisplayer, invoke the displayer's display method to format a displayable representation of the object.
3407
222
- If the value represents an object that has been stored using the default serialization mechanism described above, return the class name of the object followed by a comma-separated tuple, enclosed within curly brace characters, representing the value of each field of the object.
3408
223
- If the value represents an object encoded through standard Java serialization, return the string "(Serialized-object)" followed by a sequence of hex digits representing the serialized bytes. Note that this process does not attempt to deserialize the object.
3409
224
- If the value represents an object that has already been represented within the formatted result - for example, if a Collection contains two references to the same object - then instead of creating an additional string representing the second or subsequent instance, emit a back reference pointer in the form @NNN where NNN is the character offset within the displayable string where the first instance was found. (This does not apply to strings and the primitive wrapper classes.)
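
The string-escaping rule in the list above can be sketched as follows. This is a hypothetical helper written from the description, not Persistit's code; in particular, the description does not say how a backslash already present in the string is handled, so the sketch copies it unchanged.

```java
// Hypothetical sketch of the String display rule described above.
final class DisplayEscaper {

    static String escape(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"') {
                sb.append("\\\"");
            } else if (c >= 0x20 && c <= 0x7E) {
                // Printable ASCII is copied as-is (assumption: including backslash).
                sb.append(c);
            } else {
                switch (c) {
                    case '\b': sb.append("\\b"); break;
                    case '\t': sb.append("\\t"); break;
                    case '\n': sb.append("\\n"); break;
                    case '\r': sb.append("\\r"); break;
                    default:
                        // Four-hex-digit Unicode escape for everything else.
                        sb.append('\\').append('u')
                          .append(String.format("%04x", (int) c));
                }
            }
        }
        return sb.append('"').toString();
    }
}
```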
3410
225
3411
226
For example, consider a Person class with fields for date of birth, first name, last name, salary and friends, an array of other Person objects. The result returned by toString() on a Value representing Mary Jones, who has a friend John Smith, might appear as follows::

 (Person){(Date)19490826000000.000-0400,"Mary","Jones",(long)75000,[
    (Person){(Date)19550522000000.000-0400,"John","Smith",(long)68000,[@0]}]}
3415
230
3416
231
In this example, John Smith's friends array contains a back reference to Mary Jones in the form "@0" because Mary's displayable reference starts at the beginning of the string.
3417
232
3418
233
3419
234
PersistitReference
3420
235
------------------
3421
236
3422
237
In general, serializing an object that contains references to other objects requires all the referenced objects also to be serialized. For an object connected to a large reference graph, it may be impractical or even semantically incorrect to serialize the entire graph.
3423
238
3424
239
One way to control the serialization graph for such an object is to write a custom ValueCoder; the custom ValueCoder can store key values for looking up the referenced object, rather than the object itself.  The ValueCoderDemo.java program demonstrates how this can be done.
3425
240
3426
241
The ``com.persistit.ref.PersistitReference`` interface, and its abstract implementations, provide an alternative mechanism for breaking up an object reference graph.  It requires no custom ValueCoder, but does impact the design of application classes.  In addition, you will need to write a concrete subclass of either ``com.persistit.ref.AbstractReference`` or ``com.persistit.ref.AbstractWeakReference`` based on the actual storage structure of your object graph.
3427
242
3428
243
ObjectCache
3429
244
-----------
3430
245
3431
246
A ``com.persistit.Value`` object holds the serialized, encoded state of a primitive value or an object.  Each time you invoke the get method on a Value, Persistit generates a new copy of the object deserialized from this Value.  Persistit does not implicitly cache deserialized objects. However, the ``com.persistit.encoding.ObjectCache`` class provides a simple mechanism for applications that need to maintain an in-memory cache of objects from Persistit. ``ObjectCache`` works somewhat like a specialized version of java.util.WeakHashMap.
3432
247
3433
248
``ObjectCache`` has ``put``, ``get`` and ``remove`` methods much like a normal Map implementation.  However, when storing an object value with the supplied ``com.persistit.Key``, ``ObjectCache`` constructs a new, immutable ``com.persistit.KeyState`` object to hold as an internal key. This is necessary because ``Key`` objects change value as they are used.
3434
249
3435
250
Each ``ObjectCache`` entry holds its object value as a ``SoftReference``, making it available for garbage collection when space is needed.
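
The two ideas in this section, an immutable snapshot of a mutable key and values held through soft references, can be sketched in plain Java. This is not the ``ObjectCache`` implementation: ``SoftObjectCache`` and ``KeySnapshot`` are hypothetical names, and a ``byte[]`` stands in for the mutable ``Key``.

```java
import java.lang.ref.SoftReference;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache: keys are copied on insertion, values are softly held.
final class SoftObjectCache<V> {

    // Immutable copy of the caller's key bytes, playing the role of KeyState.
    private static final class KeySnapshot {
        private final byte[] bytes;

        KeySnapshot(byte[] source) {
            this.bytes = source.clone();
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof KeySnapshot
                    && Arrays.equals(bytes, ((KeySnapshot) other).bytes);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(bytes);
        }
    }

    private final Map<KeySnapshot, SoftReference<V>> entries = new HashMap<>();

    synchronized void put(byte[] key, V value) {
        entries.put(new KeySnapshot(key), new SoftReference<>(value));
    }

    synchronized V get(byte[] key) {
        KeySnapshot snapshot = new KeySnapshot(key);
        SoftReference<V> reference = entries.get(snapshot);
        if (reference == null) {
            return null;
        }
        V value = reference.get();
        if (value == null) {
            // The garbage collector reclaimed the value; drop the stale entry.
            entries.remove(snapshot);
        }
        return value;
    }

    synchronized void remove(byte[] key) {
        entries.remove(new KeySnapshot(key));
    }
}
```

Because the key is copied on ``put``, later changes to the caller's key buffer do not affect the cached entry, which is the reason the text gives for constructing a ``KeyState``.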
3436
0
251
3437
=== removed file 'doc/Serialization.txt'
3438
--- doc/Serialization.txt	2012-04-30 22:09:31 +0000
3439
+++ doc/Serialization.txt	1970-01-01 00:00:00 +0000
3440
@@ -1,246 +0,0 @@
3441
1
[[Serialization]]
3442
2
= Serializing Object Values
3443
3
3444
4
Akiban Persistit uses one of several mechanisms to serialize a Java Object into a +com.persistit.Value+.
3445
5
3446
6
- For the following classes, Persistit provides built-in optimized serialization logic that cannot be overridden:
3447
7
3448
8
* +java.lang.String+
3449
9
* +java.util.Date+
3450
10
* +java.math.BigInteger+
3451
11
* +java.math.BigDecimal+
3452
12
* Wrapper classes for primitive values (+Boolean+, +Byte+, +Short+, etc.)
3453
13
* All arrays (however, the mechanisms described here apply to array elements).
3454
14
3455
15
- An application can register a custom +com.persistit.encoding.ValueCoder+ to handle serialization of a particular class
3456
16
- Default serialization using Persistit's built-in serialization mechanism described below, or
3457
17
- Standard Java serialization as described in http://download.oracle.com/javase/1.5.0/docs/guide/serialization/spec/serial-arch.html[Java Object Serialization Specification].
3458
18
3459
19
Persistit's default serialization method serializes objects into approximately 33% fewer bytes, and depending on the structure of objects being serialized, is about 40% faster than Java serialization.
3460
20
3461
21
== Storing Objects in Persistit
3462
22
3463
23
To store an object value into a Persistit database, you put the object into the Value field of an Exchange, and then invoke the Exchange's store method as shown in this code fragment:
3464
24
3465
25
[source,java]
3466
26
----
3467
27
    exchange.getValue().put(myObject);
3468
28
    exchange.store();
3469
29
----
3470
30
3471
31
Of course, Persistit cannot actually store a live object on disk.  Instead it creates and stores a byte array containing state information about the object. Subsequently you fetch an object from Persistit as follows:
3472
32
3473
33
[source,java]
3474
34
----
3475
35
    exchange.fetch();
3476
36
    MyClass myObject = (MyClass)exchange.getValue().get();
3477
37
----
3478
38
3479
39
The resulting MyClass instance is a newly constructed object instance that is equivalent - subject to the accuracy of the serialization code - to the original object. This process is equivalent to the serialization and deserialization capabilities provided by java.io.ObjectOutputStream and java.io.ObjectInputStream.
3480
40
3481
41
Persistit makes use of helper classes called “coders” to marshal data between live objects and their stored byte-array representations. Value coders, which implement +com.persistit.encoding.ValueCoder+, marshal data to and from Value objects; +com.persistit.encoding.KeyCoder+ implementations do the same for +com.persistit.Key+ s. A value coder provides capability somewhat like the custom serialization logic implemented through +readObject+, +writeObject+, +readExternal+ and +writeExternal+. However, a value coder can provide this logic for any class without modifying the class itself, which may be important if the class is part of a closed library.
3482
42
3483
43
You may create and register a value coder for almost any class, including classes that are not marked Serializable. The exceptions are those listed which have built-in, non-overridable serialization logic.
3484
44
3485
45
=== DefaultValueCoder and SerialValueCoder
3486
46
3487
47
When required to serialize or deserialize class with no explicitly defined +ValueCoder+, Persistit automatically creates and registers one of the following two default +ValueCoder+ implementations:
3488
48
3489
49
+com.persistit.DefaultValueCoder+::  uses introspection to determine which fields to serialize, and reflection to access and update the fields
3490
50
+com.persistit.encoding.SerialValueCoder+:: creates instances of ObjectInputStream and ObjectOutputStream to serialize and deserialize the object.
3491
51
3492
52
DefaultValueCoder uses a more compact storage format and is significantly faster than standard Java serialization; however, it imposes certain limitations and trade-offs described below. By default, Persistit will use a DefaultValueCoder. However, you can identify classes that should instead be serialized and deserialized by +SerialValueCoder+ by specifying the +serialOverride+ configuration property, which is described below.
3493
53
3494
54
== DefaultValueCoder
3495
55
3496
56
A DefaultValueCoder uses Java reflection to access and update the fields of an arbitrary object. The set of fields is defined by the Java Object Serialization Specification. By default, these include all non-static, non-transient fields of the current class and its Serializable superclasses.  A class may override this default set by specifying an array of +java.io.ObjectStreamField+ objects in a private final static field named +serialPersistentFields+, as described in the specification.
3497
57
3498
58
+DefaultValueCoder+ invokes the special methods +readResolve+, +writeReplace+, +readObject+ and +writeObject+,  (or for Externalizable classes,  +writeExternal+ and +readExternal+) to provide the compatible custom serialization support. To support the +readObject+/+readExternal+ and +writeObject+/+writeExternal+ methods, Persistit creates extended implementations of +java.io.ObjectOutputStream+ and +java.io.ObjectInputStream+. These use a custom serialization format optimized for writing to a Value's backing byte array. For example, they do not organize data into 1,024-byte blocks, and they factor meta data about classes into a separate class information database so that this information is not repeated in multiple records containing instances of the same class.
3499
59
3500
60
Currently, +DefaultValueCoder+ does not support the following elements of the serialization API:
3501
61
3502
62
- the +readObjectNoData+ custom serialization method
3503
63
- the +PutField+/+GetField+ API of +ObjectOutputStream+ and +ObjectInputStream+
3504
64
- the +readLine+ method of +ObjectInputStream+.
3505
65
3506
66
=== Constructing Objects upon Deserialization
3507
67
3508
68
When deserializing a value, +DefaultValueCoder+ combines information about the original object's class and the stored field data to reconstruct an object equivalent to the original. To do so it must first construct a new instance of the class and then decode and set its serialized fields.
3509
69
3510
70
For compatibility with standard Java serialization, +DefaultValueCoder+ constructs new object instances of Serializable classes using the same logic as +ObjectInputStream+, namely:
3511
71
3512
72
If the class is Externalizable, +DefaultValueCoder+ invokes its public no-argument constructor. (The specification for Externalizable requires the class to have such a constructor.)
3513
73
3514
74
Otherwise, if the class is Serializable, +DefaultValueCoder+ invokes the no-argument constructor of its nearest non-serializable superclass.
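
The second rule can be observed with standard Java serialization alone. The sketch below does not use Persistit; the class names +PlainBase+, +SerializableChild+ and +ConstructionDemo+ are hypothetical. It counts constructor calls to show that deserialization runs the no-argument constructor of the nearest non-serializable superclass and skips the constructor of the Serializable class itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Not Serializable: its no-argument constructor runs during deserialization.
class PlainBase {
    static int constructorCalls = 0;
    PlainBase() { constructorCalls++; }
}

// Serializable: its constructor is NOT invoked during deserialization.
class SerializableChild extends PlainBase implements Serializable {
    private static final long serialVersionUID = 1L;
    static int constructorCalls = 0;
    int n;
    SerializableChild(int n) { this.n = n; constructorCalls++; }
}

class ConstructionDemo {
    // Serializes and deserializes the object in memory, returning the copy.
    static SerializableChild roundTrip(SerializableChild original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SerializableChild) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SerializableChild copy = roundTrip(new SerializableChild(7));
        System.out.println("n=" + copy.n
            + " baseCalls=" + PlainBase.constructorCalls
            + " childCalls=" + SerializableChild.constructorCalls);
    }
}
```

After one explicit construction and one round trip, the base constructor has run twice while the child constructor has run only once.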
3515
75
3516
76
+DefaultValueCoder+ must use platform-specific logic when constructing instances of Serializable classes: specifically, it invokes the same internal, non-public method as +ObjectInputStream+. We have verified correct behavior on a wide range of Java runtime environments, but because the implementation uses private methods within various JRE versions, it is possible (though unlikely) that a future JRE will not provide a comparable capability.
3517
77
3518
78
To avoid using platform-specific API calls, you can specify the configuration property
3519
79
3520
80
----
3521
81
    constructorOverride=true
3522
82
----
3523
83
3524
84
When this property is +true+, +DefaultValueCoder+ requires each object being serialized or deserialized to have a no-argument constructor through which instances will be constructed during deserialization. Unless the class implements Externalizable, that constructor may be private, package-private, protected or public.
3525
85
3526
86
=== Extending DefaultValueCoder
3527
87
3528
88
You can register an extended +DefaultValueCoder+ to provide custom behavior, including custom logic for constructing instances of a class, as shown here:
3529
89
3530
90
[source,java]
3531
91
----
3532
92
Persistit.getInstance().getCoderManager().registerValueCoder(MyClass.class, new DefaultValueCoder(MyClass.class) {
3533
93
  public Object get(Value value, Class clazz, CoderContext context) throws ConversionException {
3534
94
3535
95
    // Construct the object being deserialized.
3536
96
    Object instance = new MyClass(...custom arguments...);
3537
97
3538
98
    // See "registering objects while deserializing" below
3539
99
    value.registerEncodedObject(instance);
3540
100
    
3541
101
    // Load the non-transient, non-static fields
3542
102
    render(value, instance, clazz, context);
3543
103
    
3544
104
    return instance;
3545
105
  }
3546
106
});
3547
107
----
3548
108
3549
109
3550
110
3551
111
=== Security Policy Requirements for DefaultValueCoder
3552
112
3553
113
+DefaultValueCoder+ performs security-sensitive operations: (a) it reads and writes data from and to private fields using reflection, and (b) it overrides the default implementations of +java.io.ObjectInputStream+ and +java.io.ObjectOutputStream+. If a SecurityManager is installed, the following three permissions must be granted for +DefaultValueCoder+ to work:
3554
114
3555
115
----
3556
116
java.lang.RuntimePermission("accessDeclaredMembers")
3557
117
java.lang.reflect.ReflectPermission("suppressAccessChecks")
3558
118
java.io.SerializablePermission("enableSubclassImplementation")
3559
119
----
3560
120
3561
121
See <<Security>> for an extended discussion on security policy issues for Persistit.
3562
122
3563
123
== SerialValueCoder
3564
124
3565
125
+SerialValueCoder+ uses standard Java serialization to store and retrieve object values. Typically this results in slower performance and a more verbose storage format than +DefaultValueCoder+, but there are a number of reasons why a particular application might require standard Java serialization, including:
3566
126
3567
127
- the security context into which the application will be deployed does not grant the permissions noted above that are required for +DefaultValueCoder+,
3568
128
- to avoid Persistit's use of private API calls to construct object instances during deserialization,
3569
129
- a preference for the use of a standard format defined within the Java platform rather than Persistit's custom format,
3570
130
- limitations documented above on the API elements available during custom deserialization within DefaultValueCoder, for example non-support of GetField and PutField.
3571
131
3572
132
Your application can specify +SerialValueCoders+ for specific classes either by explicitly creating and registering them, or by naming them in the com.persistit.serialOverride property.
3573
133
3574
134
To explicitly register a +SerialValueCoder+ for the class +MyClass+, do this:
3575
135
3576
136
[source,java]
3577
137
----
3578
138
	...
3579
139
	Persistit.getInstance().getCoderManager().registerValueCoder(
3580
140
    	MyClass.class,
3581
141
    	new SerialValueCoder(MyClass.class));
3582
142
	...
3583
143
----
3584
144
3585
145
3586
146
=== The +com.persistit.serialOverride+ Configuration Property
3587
147
3588
148
The +serialOverride+ property specifies classes that are to be serialized by +SerialValueCoder+ rather than +DefaultValueCoder+. This property affects how Persistit assigns a value coder when none has previously been registered. It does not override or affect explicitly registered coders.
3589
149
3590
150
Names are separated by commas and may contain wild cards.
3591
151
3592
152
The following are valid patterns:
3593
153
3594
154
+java.io.File+:: Just the File class.
3595
155
+java.io.*+:: All classes in the java.io package.
3596
156
+java.awt.**+:: All classes in the java.awt package and its sub-packages
3597
157
+java.util.*Map+:: All of the Map classes in the java.util package.
3598
158
+**+:: All classes in all packages
3599
159
3600
160
More precisely, +serialOverride+ specifies a comma-delimited list of zero or more patterns, each of which is either a fully-qualified class name or a pattern that contains exactly one wild card. The wild card “\*” replaces any sequence of characters other than a period (“.”), while “\*\*” replaces any sequence of characters including periods.  For example:
3601
161
3602
162
----
3603
163
serialOverride=org.apache.**,com.mypkg.serialstuff.*,com.mypkg.MyClass
3604
164
----
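
To make the wild card rules concrete, here is a minimal sketch of a matcher that follows the description above by translating each pattern into a regular expression. It is not Persistit's implementation; the class +SerialOverridePattern+ and its methods are hypothetical, and the sketch does not enforce the one-wild-card-per-pattern restriction.

```java
import java.util.regex.Pattern;

class SerialOverridePattern {
    // Translates one pattern into a regex: "**" matches any characters,
    // "*" matches any characters except a period, all else is literal.
    static Pattern compile(String pattern) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < pattern.length()) {
            if (pattern.startsWith("**", i)) {
                regex.append(".*");
                i += 2;
            } else if (pattern.charAt(i) == '*') {
                regex.append("[^.]*");
                i++;
            } else {
                regex.append(Pattern.quote(String.valueOf(pattern.charAt(i))));
                i++;
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Returns true if the class name matches any pattern in a comma-delimited list.
    static boolean matchesAny(String patternList, String className) {
        for (String pattern : patternList.split(",")) {
            if (compile(pattern.trim()).matcher(className).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String list = "org.apache.**,com.mypkg.serialstuff.*,com.mypkg.MyClass";
        System.out.println(matchesAny(list, "org.apache.commons.Foo"));
        System.out.println(matchesAny(list, "com.mypkg.serialstuff.sub.A"));
    }
}
```

With the example list, a class in any sub-package of +org.apache+ matches, whereas a class in a sub-package of +com.mypkg.serialstuff+ does not, because a single “*” cannot cross a period.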
3605
165
3606
166
Like all configuration properties, you may specify this in the persistit.properties file or as a system property through a Java command-line argument in the form:
3607
167
3608
168
----
3609
169
-Dcom.persistit.serialOverride=...
3610
170
----
3611
171
3612
172
== Registering Objects in a Custom +ValueCoder+
3613
173
3614
174
In a custom +ValueCoder+ implementation, the +get+ method is responsible for constructing and populating an instance of an object. The following pattern should be used when implementing the get method:
3615
175
3616
176
[source,java]
3617
177
----
3618
178
public Object get(Value value, Class clazz, CoderContext context) throws ConversionException {
3619
179
    	// Construct the object being deserialized.
3620
180
    	//
3621
181
    	Object instance = ...constructor for the object...
3622
182
3623
183
    	// Associate a handle with the newly
3624
184
    	// created instance.
3625
185
    	//
3626
186
    	value.registerEncodedObject(instance);
3627
187
3628
188
    	// Populate the object's internal state
3629
189
    	//
3630
190
    	... load the fields – for example, by calling render...
3631
191
3632
192
    	return instance;
3633
193
}
3634
194
----
3635
195
3636
196
The purpose of the +registerEncodedObject+ method is to record the association between the newly created object and an internal integer-valued handle that may be used subsequently in the serialization stream to refer to that object. This mechanism supports objects that may have fields that refer either directly or indirectly back to the same object (i.e., that participate in a cyclical reference graph).
3637
197
3638
198
As a concrete example, consider a Person class with a spouse field such that for married couple p and q,  p.spouse is q and q.spouse is p. When Persistit serializes p it also serializes q, but when it serializes q's spouse field, it records a reference handle associated with the already-serialized instance of p rather than writing a new copy of p in the serialization stream. Upon deserializing q, Persistit looks up the object for the recorded handle to correctly associate the already-deserialized p instance with q.
3639
199
3640
200
Whenever you implement a custom +get()+ method in any +ValueCoder+, you must notify the underlying Value object about the newly created object by calling registerEncodedObject before deserializing its fields so that any back-references made within serialized fields of that object can find the object correctly.
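
The reason registration must precede field decoding can be shown with a toy decoder that is independent of Persistit. In this sketch the "stream" is just two arrays, a handle is an array index, and +ToyDecoder+ and this +Person+ class are hypothetical. Registering the instance before decoding its +spouse+ field is what lets the back-reference resolve instead of recursing forever.

```java
import java.util.HashMap;
import java.util.Map;

class Person {
    final String name;
    Person spouse;
    Person(String name) { this.name = name; }
}

class ToyDecoder {
    private final String[] names;        // encoded name per handle
    private final int[] spouseHandles;   // handle of the spouse, or -1 for none
    private final Map<Integer, Person> registered = new HashMap<>();

    ToyDecoder(String[] names, int[] spouseHandles) {
        this.names = names;
        this.spouseHandles = spouseHandles;
    }

    Person decode(int handle) {
        // A handle seen before is a back-reference: reuse the instance.
        Person known = registered.get(handle);
        if (known != null) {
            return known;
        }
        // Construct the object, then register it BEFORE loading its fields.
        Person instance = new Person(names[handle]);
        registered.put(handle, instance);
        // Populate fields; this may refer back to the instance just registered.
        int spouse = spouseHandles[handle];
        if (spouse >= 0) {
            instance.spouse = decode(spouse);
        }
        return instance;
    }

    public static void main(String[] args) {
        ToyDecoder decoder = new ToyDecoder(new String[] {"p", "q"}, new int[] {1, 0});
        Person p = decoder.decode(0);
        System.out.println(p.spouse.spouse == p);
    }
}
```

If the +registered.put+ call were moved after the field-loading step, decoding +p+ would decode +q+, which would decode +p+ again, and so on without end.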
3641
201
3642
202
== +Value.toString()+ and +decodeDisplayable+
3643
203
3644
204
In many cases it is not very useful simply to display the result of evaluating +toString()+ on an object. The default toString method inherited from Object conveys just a class name and a memory handle. In addition, for remote operations of AdminUI, it may not even be feasible to construct a deserialized object for each record. Therefore, +com.persistit.Value+ provides a specialized +toString()+ method to render an arbitrary object value into a legible string. The AdminUI utility uses this facility to summarize the data contained in a Tree.
3645
205
3646
206
Persistit creates a String value loading the object's class, using the following algorithm:
3647
207
3648
208
- If the state represented by this Value is undefined, then return "undefined".
3649
209
- If the state is null or a boolean, return "null", "false", or "true".
3650
210
- If the value represents a primitive type, return the string representation of the value, prefixed by "(byte)", "(short)", "(char)", "(long)", or "(float)" for the corresponding types. Values of type int and double are presented without prefix to reduce clutter.
3651
211
- If the value represents a String, return a modified form of the string enclosed in double quotes. For each character of the string, if it is a double quote replace it by "\"", otherwise if it is outside of the printable ASCII character set replace the character in the modified string by "\b", "\t", "\n", "\r" or "\uNNNN" such that the modified string would be a valid Java string constant.
3652
212
- If the value represents a wrapper for a primitive value (i.e., a java.lang.Boolean, java.lang.Byte, etc.) return the string representation of the value prefixed by "(Boolean)", "(Byte)", "(Short)", "(Character)", "(Integer)", "(Long)", "(Float)" or "(Double)".  The package name java.lang is removed to reduce clutter.
3653
213
- If the value represents a java.util.Date, return a formatted representation of the date using the format specified by Key.SDF. This is a readable format that displays the date with full precision, including milliseconds.
3654
214
- If the value represents an array, return a list of comma-separated element values surrounded by square brackets.
3655
215
- If the value represents one of the standard Collection implementations in the java.util package, then return a comma-separated list of values surrounded by square brackets.
3656
216
- If the value represents one of the standard Map implementations in the java.util package, then return a comma-separated list of key/value pairs surrounded by square brackets. Each key/value pair is represented by a string in the form key->value.
3657
217
- If the value represents an object of a class for which there is a registered com.persistit.encoding.ValueDisplayer, invoke the displayer's display method to format a displayable representation of the object.
3658
218
- If the value represents an object that has been stored using the default serialization mechanism described above, return the class name of the object followed by a comma-separated tuple, enclosed within curly brace characters, representing the value of each field of the object.
3659
219
- If the value represents an object encoded through standard Java serialization, return the string "(Serialized-object)" followed by a sequence of hex digits representing the serialized bytes. Note that this process does not attempt to deserialize the object.
3660
220
- If the value represents an object that has already been represented within the formatted result - for example, if a Collection contains two references to the same object - then instead of creating an additional string representing the second or subsequent instance, emit a back reference pointer in the form @NNN where NNN is the character offset within the displayable string where the first instance was found. (This does not apply to strings and the primitive wrapper classes.)
3661
221
3662
222
For example, consider a Person class having fields for date of birth, first name, last name, salary and friends (an array of other Person objects). The result returned by toString() on a Value representing Mary Jones, who has a friend John Smith, might appear as follows:
3663
223
3664
224
----
3665
225
 (Person){(Date)19490826000000.000-0400,"Mary","Jones",(long)75000,[
3666
226
	(Person){(Date)19550522000000.000-0400,"John","Smith",(long)68000,[@0]}]}
3667
227
----
3668
228
3669
229
In this example, John Smith's friends array contains a back reference to Mary Jones in the form "@0" because Mary's displayable reference starts at the beginning of the string.
3670
230
3671
231
3672
232
== PersistitReference
3673
233
3674
234
In general, serializing an object that contains references to other objects requires all the referenced objects also to be serialized. For an object connected to a large reference graph, it may be impractical or even semantically incorrect to serialize the entire graph.
3675
235
3676
236
One way to control the serialization graph for such an object is to write a custom ValueCoder; the custom ValueCoder can store key values for looking up the referenced object, rather than the object itself.  The ValueCoderDemo.java program demonstrates how this can be done.
3677
237
3678
238
The +com.persistit.ref.PersistitReference+ interface, and its abstract subclasses, provide an alternative mechanism for breaking up an object reference graph.  It requires no custom ValueCoder, but does impact the design of application classes.  In addition, you will need to write a concrete implementation of either com.persistit.ref.AbstractReference or com.persistit.ref.AbstractWeakReference based on the actual storage structure of your object graph.
3679
239
3680
240
== ObjectCache
3681
241
3682
242
A +com.persistit.Value+ object holds the serialized, encoded state of a primitive value or an object.  Each time you invoke the get method on a Value, Persistit generates a new copy of the object deserialized from this Value.  Persistit does not implicitly cache deserialized objects. However, the +com.persistit.encoding.ObjectCache+ class provides a simple mechanism for applications that need to maintain an in-memory cache of objects from Persistit. +ObjectCache+ works somewhat like a specialized version of java.util.WeakHashMap.
3683
243
3684
244
+ObjectCache+ has +put+, +get+ and +remove+ methods much like a normal Map implementation.  However, when storing an object value with the supplied +com.persistit.Key+, +ObjectCache+ constructs a new, immutable +com.persistit.KeyState+ object to hold as an internal key. This is necessary because +Key+ objects change value as they are used.
3685
245
3686
246
Each +ObjectCache+ entry holds its object value as a +SoftReference+, making it available for garbage collection when space is needed.
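
The two ideas described here (an immutable snapshot of a mutable key, and values held through +SoftReference+) can be sketched with the standard library alone. This is not the +ObjectCache+ implementation: +SoftObjectCache+ is a hypothetical class, and a +StringBuilder+ converted to a +String+ stands in for a +Key+ converted to a +KeyState+.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

class SoftObjectCache<V> {
    // Keys are immutable String snapshots; values may be reclaimed by the GC.
    private final Map<String, SoftReference<V>> entries = new HashMap<>();

    void put(CharSequence mutableKey, V value) {
        // Snapshot the key now, because the caller may keep modifying it.
        entries.put(mutableKey.toString(), new SoftReference<>(value));
    }

    V get(CharSequence mutableKey) {
        String snapshot = mutableKey.toString();
        SoftReference<V> ref = entries.get(snapshot);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(snapshot); // value was garbage collected
        }
        return value;
    }

    V remove(CharSequence mutableKey) {
        SoftReference<V> ref = entries.remove(mutableKey.toString());
        return ref == null ? null : ref.get();
    }

    public static void main(String[] args) {
        SoftObjectCache<String> cache = new SoftObjectCache<>();
        StringBuilder key = new StringBuilder("a");
        cache.put(key, "one");
        key.append("b");              // reuse and mutate the same key object
        cache.put(key, "two");
        System.out.println(cache.get("a") + " " + cache.get("ab"));
    }
}
```

Because each entry is stored under a snapshot, mutating and reusing the same key object does not disturb entries that were stored earlier.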
3687
247
0
3688
=== removed file 'doc/TOC.txt'
3689
--- doc/TOC.txt	2012-04-25 18:13:07 +0000
3690
+++ doc/TOC.txt	1970-01-01 00:00:00 +0000
3691
@@ -1,19 +0,0 @@
3692
1
Akiban Persistit User Guide
3693
2
===========================
3694
3
Peter Beaman <pbeaman@akiban.com>
3695
4
:toc:
3696
5
:doctype: book
3697
6
:icons:
3698
7
:numbered:
3699
8
:website: http://www.akiban.com/persistit/
3700
9
3701
10
@GettingStarted.txt
3702
11
@BasicAPI.txt
3703
12
@Transactions.txt
3704
13
@PhysicalStorage.txt
3705
14
@Configuration.txt
3706
15
@Management.txt
3707
16
@Security.txt
3708
17
@Serialization.txt
3709
18
@Miscellaneous.txt
3710
19
3711
20
0
3712
=== added file 'doc/Transactions.rst'
3713
--- doc/Transactions.rst	1970-01-01 00:00:00 +0000
3714
+++ doc/Transactions.rst	2012-05-30 18:23:19 +0000
3715
@@ -0,0 +1,200 @@
3716
1
.. _Transactions:
3717
2
3718
3
Transactions
3719
4
============
3720
5
3721
6
Akiban Persistit supports transactions with multi-version concurrency control (MVCC) using a protocol called Snapshot Isolation (SI). An application calls the ``com.persistit.Transaction#begin``, ``com.persistit.Transaction#commit``, ``com.persistit.Transaction#rollback`` and ``com.persistit.Transaction#end`` methods to control the current transaction scope explicitly.  A Transaction allows an application to execute multiple database operations in an atomic, consistent, isolated and durable (ACID) manner.
3722
7
3723
8
Applications manage transactions through an instance of a ``com.persistit.Transaction`` object. ``Transaction`` does not represent a single transaction, but is instead a context in which a thread may perform many sequential transactions. The general pattern is that the application gets the current thread’s ``Transaction`` instance, calls its ``begin`` method, performs work, calls ``commit`` and finally ``end``.  The thread uses the same ``Transaction`` instance repeatedly. Generally each thread has one ``Transaction`` that lasts for the entire life of the thread (but see com.persistit.Transaction#_threadManagement for a mechanism that allows a transaction to be serviced by multiple threads). 
3724
9
3725
10
Using Transactions
3726
11
------------------
3727
12
3728
13
The following code fragment performs two store operations within the scope of a transaction:
3729
14
3730
15
.. code-block:: java
3731
16
3732
17
  //
3733
18
  // Get the transaction context for the current thread.
3734
19
  //
3735
20
  Transaction txn = myExchange.getTransaction();
3736
21
  int remainingRetries = RETRY_COUNT;
3737
22
  for (;;) {
3738
23
      txn.begin();
3739
24
      try {
3740
25
          myExchange.getValue().put("First value");
3741
26
          myExchange.clear().append(1).store();
3742
27
          myExchange.getValue().put("Second value");
3743
28
          myExchange.clear().append(2).store();
3744
29
          // Required to commit the transaction
3745
30
          txn.commit();
3746
31
          break;
3747
32
      } catch (RollbackException re) {
3748
33
          // perform any special rollback handling
3749
34
          // allow loop to repeat until commit succeeds or retries
3750
35
          // too many times.
3751
36
          if (--remainingRetries < 0) {
3752
37
            throw new TransactionFailedException();
3753
38
          }
3754
39
      } catch (PersistitException pe) {
3755
40
          // handle other Persistit exception
3756
41
      } finally {
3757
42
          // Required to end the scope of a transaction.
3758
43
          txn.end();
3759
44
      }
3760
45
  }
3761
46
3762
47
This example catches ``com.persistit.exception.RollbackException`` which can be thrown by any Persistit operation within the scope of a transaction, including ``commit``. Any code explicitly running within the scope of a transaction should be designed to handle rollbacks.
3763
48
3764
49
This example also uses a *try/finally* block to ensure every call to ``begin`` has a matching call to ``end``. This code pattern is mandatory: it is critical to correct transaction nesting behavior.
3765
50
3766
51
One convenient way to do this is to encapsulate the logic of a transaction in an implementation of the ``com.persistit.TransactionRunnable`` interface. The ``com.persistit.Transaction#run`` method automatically provides logic to begin the transaction, execute the ``TransactionRunnable`` and commit the transaction, repeating the process until no rollback is thrown or a maximum retry count is reached. For example, the code fragment shown above can be rewritten as:
3767
52
3768
53
.. code-block:: java
3769
54
3770
55
  //
3771
56
  // Get the transaction context for the current thread.
3772
57
  //
3773
58
  Transaction txn = myExchange.getTransaction();
3774
59
  //
3775
60
  // Perform the transaction with the following parameters:
3776
61
  // - try to commit it up to 10 times
3777
62
  // - delay 2 milliseconds before each retry
3778
63
  // - use the group commit durability policy
3779
64
  //
3780
65
  txn.run(new TransactionRunnable() {
3781
66
      public void run() throws PersistitException {
3782
67
          myExchange.getValue().put("First value");
3783
68
          myExchange.clear().append(1).store();
3784
69
          myExchange.getValue().put("Second value");
3785
70
          myExchange.clear().append(2).store();
3786
71
      }
3787
72
  }, 10, 2, CommitPolicy.GROUP);
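
For readers who want to see the shape of such a retry loop in isolation, the following is a minimal sketch that uses no Persistit classes. ``RollbackSignal`` and ``RetryLoop`` are hypothetical stand-ins (for ``RollbackException`` and the retry logic, respectively), and the retry accounting mirrors the hand-written loop shown earlier, so the work runs at most ``maxRetries + 1`` times.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a rollback notification.
class RollbackSignal extends RuntimeException {
    private static final long serialVersionUID = 1L;
}

class RetryLoop {
    // Runs the work, retrying after each RollbackSignal until it succeeds
    // or the retry budget is exhausted.
    static <T> T runWithRetries(Supplier<T> work, int maxRetries, long delayMillis) {
        int remainingRetries = maxRetries;
        for (;;) {
            try {
                return work.get(); // stands in for begin / do work / commit
            } catch (RollbackSignal rollback) {
                if (--remainingRetries < 0) {
                    throw new IllegalStateException("too many rollbacks", rollback);
                }
                try {
                    Thread.sleep(delayMillis); // pause before the next attempt
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(interrupted);
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        String result = runWithRetries(() -> {
            if (++attempts[0] < 3) {
                throw new RollbackSignal(); // first two attempts roll back
            }
            return "committed";
        }, 10, 2);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

The work passed to the loop must be safe to repeat, which is the same requirement the surrounding text places on code that runs inside a transaction.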
3788
73
3789
74
Mixing Transactional and Non-Transactional Operations
3790
75
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3791
76
3792
77
Database operations running outside the scope of an explicitly defined transaction are never subject to rollback and therefore do not require retry logic. However, such operations are also not guaranteed to be durable in the event of a system crash. Further, such operations are not isolated. Read operations performed outside of a transaction can read uncommitted updates, and updates performed outside of a transaction are visible within transactions. In other words, non-transactional reads and writes may break both the durability and isolation of concurrently executing transactions.  Therefore it is strongly recommended that in an application that relies on transactions, all interactions with the database should use transactions. 
3793
78
3794
79
Optimistic Transaction Scheduling
3795
80
---------------------------------
3796
81
3797
82
To achieve high performance and scalability, Persistit supports an optimistic transaction scheduling protocol called MVCC with `Snapshot Isolation <http://wikipedia.org/wiki/Snapshot_isolation>`_. Under this protocol multiple threads are permitted to execute transactions at full speed without blocking until a potentially inconsistent state is recognized. At that point a transaction suspected of causing the inconsistent state is automatically forced to roll back.
3798
83
3799
84
Optimistic scheduling works because transactions usually do not collide, especially when individual database operations are fast, and so in practice transactions are seldom rolled back. But because any transaction may be rolled back at any point, applications must be designed carefully to avoid unintended side-effects. For example, a transaction should never perform non-repeatable or externally visible operations such as file or network I/O within its scope.
3800
85
3801
86
Snapshot Isolation
3802
87
^^^^^^^^^^^^^^^^^^
3803
88
3804
89
Persistit schedules concurrently executing transactions optimistically, without locking any database records. Instead, Persistit uses the well-known Snapshot Isolation protocol to achieve atomicity and isolation. While transactions are modifying data, Persistit maintains multiple versions of values being modified. Each version is labeled with the commit timestamp of the transaction that modified it. Whenever a transaction reads a value that has been modified by other transactions, it gets the latest version that was committed before its own start timestamp. In other words, all read operations are performed as if from a "snapshot" of the state of the database made at the transaction's start timestamp - hence the name "Snapshot Isolation."
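
The read rule can be illustrated with a toy model that is unrelated to Persistit's actual storage format. In this sketch a hypothetical ``VersionedValue`` keeps committed versions in a sorted map keyed by commit timestamp, and a read returns the latest version committed strictly before the reader's start timestamp.

```java
import java.util.Map;
import java.util.TreeMap;

class VersionedValue {
    // Committed versions, ordered by the commit timestamp that produced them.
    private final TreeMap<Long, String> versions = new TreeMap<>();

    void addCommitted(long commitTimestamp, String value) {
        versions.put(commitTimestamp, value);
    }

    // Returns the version visible to a transaction that started at the given
    // timestamp, or null if nothing had been committed before it started.
    String readAsOf(long startTimestamp) {
        Map.Entry<Long, String> visible = versions.lowerEntry(startTimestamp);
        return visible == null ? null : visible.getValue();
    }

    public static void main(String[] args) {
        VersionedValue value = new VersionedValue();
        value.addCommitted(10L, "a");
        value.addCommitted(20L, "b");
        System.out.println(value.readAsOf(15L));
        System.out.println(value.readAsOf(25L));
    }
}
```

A transaction that started at timestamp 15 continues to see the version committed at 10 even after another transaction commits a newer version at 20, which is the snapshot behavior described above.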
3805
90
3806
91
.. _Pruning:
3807
92
3808
93
Pruning 
3809
94
^^^^^^^
3810
95
3811
96
Given that all updates written through transactions are created as versions within the MVCC scheme, a large number of versions can accumulate over time. Persistit reduces this proliferation through an activity called "pruning." Pruning resolves the final state of each version by removing any versions created by aborted transactions and removing obsolete versions no longer needed by other transactions. If a value contains only one version and the commit timestamp of the transaction that created it is before the start of any currently running transaction, that value is called *primordial*. The goal of pruning is to reduce most or all values in the database to their primordial states because updating and reading primordial values is more efficient than managing multiple version values. Pruning happens automatically and is generally not visible to the application.
3812
97
3813
98
Rollbacks
3814
99
^^^^^^^^^
3815
100
3816
101
Usually Snapshot Isolation allows concurrent transactions to commit without interference but this is not always the case. Two concurrent transactions that attempt to modify the same Persistit key/value pair before they commit are said to have a "write-write dependency". To avoid anomalous results one of them must abort, rolling back any other updates it may also have created, and retry. Persistit implements a "first updater wins" policy in which if two transactions attempt to update the same record, the first transaction "wins" by being allowed to continue, while the second transaction "loses" and is required to abort.
3817
102
3818
103
Once a transaction has aborted, any subsequent database operation it attempts throws a ``RollbackException``. Application code should catch and handle this Exception. Usually the correct and desired behavior is simply to retry the transaction as shown in the code samples above.
3819
104
3820
105
A transaction can also voluntarily roll back. For example, transaction logic could detect an error condition that it chooses to handle by throwing an exception back to the application. In this case the transaction should invoke the ``rollback`` method to explicitly declare its intent to abort the transaction.
3821
106
3822
107
Read-Only Transactions
3823
108
^^^^^^^^^^^^^^^^^^^^^^
3824
109
 
3825
110
Under Snapshot Isolation, transactions that read but do not modify data cannot generate any write-write dependencies and are therefore not subject to  being rolled back because of the actions of other transactions. However, even though it modifies no data, a long-running read-only transaction can force Persistit to retain old value versions from other transactions for its duration in order to provide a snapshot view. This behavior can cause congestion and performance degradation by preventing very old values from being pruned. The degree to which this is a problem depends on the volume of update transactions being processed and the duration of long-running transactions.
3826
111
3827
112
Snapshot Isolation is not Serializable
3828
113
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3829
114
3830
115
It is well-known that transactions executing under SI are not necessarily serializable. Under SI, so-called *write-skew* anomalies can happen with transactions that have certain kinds of interactions.  Write-skew can be avoided by (a) explicit application-level locking or (b) structuring transactions to add write-write dependencies where write-skew otherwise could occur.
3831
116
3832
117
Note that many common transaction patterns, including those defined by the TPC-C benchmark, do not experience write-skew and therefore *are* serializable under SI.
3833
118
3834
119
Durability Options: ``CommitPolicy``
3835
120
------------------------------------
3836
121
3837
122
Persistit provides three policies that determine the durability of a transaction after it has executed the ``com.persistit.Transaction#commit`` method. These are:
3838
123
3839
124
  ``HARD``
3840
125
      The ``commit`` method does not return until all updates created by the transaction have been written to non-volatile storage (e.g., disk storage).
3841
126
  ``GROUP``
3842
127
      The ``commit`` method does not return until all updates created by the transaction have been written to non-volatile storage. In addition, the committing 
3843
128
      transaction waits briefly in an attempt to recruit other transactions running in other threads to write their updates with the same physical I/O operation.
3844
129
  ``SOFT``
3845
130
      The ``commit`` method returns *before* the updates have been recorded on non-volatile storage. Persistit attempts to write them within 100 milliseconds, but 
3846
131
      this interval is not guaranteed.
3847
132
3848
133
You can specify a default policy in the Persistit initialization properties using the ``txnpolicy`` property or under program control using ``com.persistit.Persistit#setDefaultTransactionCommitPolicy``. The default policy applies whenever the application calls the ``commit()`` method. You can override the default policy using ``commit(CommitPolicy)``.
3849
134
3850
135
HARD and GROUP ensure each transaction is written durably to non-volatile storage before the ``commit`` method returns. The difference is that GROUP can improve throughput in multi-threaded applications because the average number of I/O operations needed to commit *N* transactions can be smaller than *N*. However, for one or a small number of concurrent threads, GROUP reduces throughput because it works by introducing a delay to allow other concurrent transactions to commit within a single I/O operation.
3851
136
3852
137
SOFT commits are generally much faster than HARD or GROUP commits, especially for single-threaded applications, because the results of numerous transactions committed from a single thread can be aggregated and written to disk in a single I/O operation. However, transactions written with the SOFT commit policy are not immediately durable and it is possible that the recovered state of a database will be missing transactions that reported they were committed shortly before a crash.
3853
138
3854
139
For SOFT commits, the state of the database after restart is such that for any committed transaction T, either all or none of its modifications will be present in the recovered database. Further, if a transaction T2 reads or updates data that was written by any other transaction T1, and if T2 is present in the recovered database, then so is T1. Any transaction that was in progress, but had not been committed at the time of the failure, is guaranteed not to be present in the recovered database. SOFT commits are designed to be durable within 100 milliseconds after ``commit`` returns. However, this interval is determined by computing the average duration of recent I/O operations to predict the completion time of the I/O that will write the transaction to disk, and therefore the interval cannot be guaranteed.
3855
140
3856
141
Nested Transactions
3857
142
-------------------
3858
143
3859
144
A nested transaction occurs when code that is already executing within the scope of a transaction executes the ``begin`` method to start a new transaction. This might happen, for example, if an application’s transaction logic calls a method that also uses transactions. In this case, the commit processing of the inner transaction scope is deferred until the outermost transaction commits. At that point, all the updates performed within the inner and outer transaction scopes are committed to the database. Similarly, a rollback initiated by the inner transaction causes both it and the outermost transaction to roll back.

Accumulators
------------

Consider an application in which concurrently running transactions share a counter. For example, suppose each transaction is responsible for allocating a unique integer as a primary key for a database record. One way to do this would be to store the counter in a Persistit key/value pair, reading the value at the start of each transaction and committing an update at the end.

The problem with this approach is that under SI, concurrent transactions running in a multi-threaded application would experience very frequent write-write dependencies on the counter value; in fact, the only way to complete any transactions would be serially, one at a time.

Persistit provides the ``com.persistit.Accumulator`` class to avoid this problem.  An accumulator is designed to manage contributions from multiple concurrent transactions without causing write-write dependencies. Accumulators are durable in the sense that each transaction’s contribution is made durable with the transaction itself, and Persistit automatically recovers a correct state for each Accumulator in the event of a system crash.

There are four types of accumulator in Persistit. Each is a concrete subclass of the abstract ``com.persistit.Accumulator`` class:

  ``SUM``
      Tallies a count or sum of contributions by each transaction
  ``MIN``
      Finds the minimum value contributed by all transactions
  ``MAX``
      Finds the maximum value contributed by all transactions
  ``SEQ``
      Special case of the SUM accumulator used to generate sequence numbers

Accumulator instances are associated with a ``com.persistit.Tree``.  Each ``Tree`` may have up to 64 accumulators. The following code fragment creates and/or acquires a ``SumAccumulator``, reads its snapshot value and then adds one to it:

.. code-block:: java

  final Exchange ex = _persistit.getExchange(volume, treeName, true);
  final Transaction txn = ex.getTransaction();
  txn.begin();
  try {
      // Create or look up the SUM accumulator at index 17 of this tree
      final Accumulator acc =
          ex.getTree().getAccumulator(Accumulator.Type.SUM, 17);
      // Value consistent with this transaction's snapshot view
      long snap = acc.getSnapshotValue(txn);
      // Contribute +1; made durable when the transaction commits
      acc.update(1, txn);
      txn.commit();
  } finally {
      txn.end();
  }

The value 17 is simply an arbitrary index number between 0 and 63, inclusive. The application is responsible for allocating and managing accumulator indexes.

The snapshot value of an accumulator obtained through ``com.persistit.Accumulator#getSnapshotValue()`` is the value computed from all updates contributed by transactions that had committed at the time the current transaction started, plus the transaction’s own as-yet uncommitted updates. In other words, the snapshot value of the accumulator is consistent with the snapshot view of all other data visible within the transaction.

An accumulator has two ways of accessing its accumulated value:

  ``getSnapshotValue()``
      Is a value computed from updates that were committed at the start of the current transaction, plus the transaction’s own updates. This method may be called only within the scope of a
      Transaction.
  ``getLiveValue()``
      Is an ephemeral value reflecting all updates performed by all transactions, including concurrent and aborted transactions.

The snapshot value is a precise, consistent tally, while the live value is approximate. For a ``SumAccumulator``, ``MaxAccumulator`` or ``SeqAccumulator``, if all updates have non-negative arguments, then the live value is always greater than or equal to the snapshot value.
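
The difference between the two values can be made concrete with a toy model of a SUM accumulator. The sketch below is an illustration under stated assumptions, not Persistit's implementation: ``SumAccumulatorSketch``, its inner ``Txn`` class and the logical clock are all hypothetical, and real accumulators are considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a SUM accumulator. Hypothetical names; this is not the
// com.persistit.Accumulator implementation.
final class SumAccumulatorSketch {
    private long clock = 0;   // logical timestamp source
    private long live = 0;    // every update ever applied, committed or not
    private final List<long[]> committed = new ArrayList<>(); // {commitTimestamp, delta}

    final class Txn {
        private final long startTimestamp;
        private long ownDelta = 0;

        private Txn(long startTimestamp) {
            this.startTimestamp = startTimestamp;
        }

        void update(long delta) {
            ownDelta += delta;
            live += delta;
        }

        // Contributions committed before this transaction started, plus its own.
        long snapshotValue() {
            long sum = ownDelta;
            for (long[] c : committed) {
                if (c[0] < startTimestamp) {
                    sum += c[1];
                }
            }
            return sum;
        }

        void commit() {
            committed.add(new long[] { ++clock, ownDelta });
        }

        // The contribution is dropped, but the live value is not adjusted.
        void abort() {
            ownDelta = 0;
        }
    }

    Txn begin() {
        return new Txn(++clock);
    }

    long liveValue() {
        return live;
    }

    public static void main(String[] args) {
        SumAccumulatorSketch acc = new SumAccumulatorSketch();
        Txn first = acc.begin();
        first.update(5);
        Txn second = acc.begin();
        second.update(3);
        first.commit();
        System.out.println("snapshot seen by second: " + second.snapshotValue());
        System.out.println("live value: " + acc.liveValue());
    }
}
```

In this model the second transaction never sees the first one's contribution, because that contribution was committed after the second transaction started. The live value counts every update, including those of transactions that later abort, which is why it can only be equal to or larger than a snapshot value when all updates are non-negative.
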

SeqAccumulator
^^^^^^^^^^^^^^

The ``SeqAccumulator`` class has a special role in allocating unique identifier numbers, e.g., synthetic primary keys.  The goal of the ``SeqAccumulator`` is to ensure that every committed transaction has received a unique integer value in all circumstances, including after recovery from a crash. See ``com.persistit.Accumulator`` for details.

=== removed file 'doc/Transactions.txt'
--- doc/Transactions.txt	2012-04-30 22:09:31 +0000
+++ doc/Transactions.txt	1970-01-01 00:00:00 +0000
@@ -1,179 +0,0 @@
[[Transactions]]
= Transactions

Akiban Persistit supports transactions with multi-version concurrency control (MVCC) using a protocol called Snapshot Isolation (SI). An application calls the +com.persistit.Transaction#begin+, +com.persistit.Transaction#commit+, +com.persistit.Transaction#rollback+ and +com.persistit.Transaction#end+ methods to control the current transaction scope explicitly.  A Transaction allows an application to execute multiple database operations in an atomic, consistent, isolated and durable (ACID) manner.

Applications manage transactions through an instance of a +com.persistit.Transaction+ object. +Transaction+ does not represent a single transaction, but is instead a context in which a thread may perform many sequential transactions. The general pattern is that the application gets the current thread’s +Transaction+ instance, calls its +begin+ method, performs work, calls +commit+ and finally +end+.  The thread uses the same +Transaction+ instance repeatedly. Generally each thread has one +Transaction+ that lasts for the entire life of the thread (but see com.persistit.Transaction#_threadManagement[Thread Management] for a mechanism that allows a transaction to be serviced by multiple threads).

== Using Transactions

The following code fragment performs two store operations within the scope of a transaction:

[source,java]
----
//
// Get the transaction context for the current thread.
//
Transaction txn = myExchange.getTransaction();
int remainingRetries = RETRY_COUNT;
for (;;) {
    txn.begin();
    try {
        myExchange.getValue().put("First value");
        myExchange.clear().append(1).store();
        myExchange.getValue().put("Second value");
        myExchange.clear().append(2).store();
        // Required to commit the transaction
        txn.commit();
        break;
    } catch (RollbackException re) {
        // perform any special rollback handling
        // allow loop to repeat until commit succeeds or retries
        // too many times.
        if (--remainingRetries < 0) {
            throw new TransactionFailedException();
        }
    } catch (PersistitException pe) {
        // handle other Persistit exception
    } finally {
        // Required to end the scope of a transaction.
        txn.end();
    }
}
----

This example catches +com.persistit.exception.RollbackException+ which can be thrown by any Persistit operation within the scope of a transaction, including +commit+. Any code explicitly running within the scope of a transaction should be designed to handle rollbacks.

This example also uses a _try/finally_ block to ensure every call to +begin+ has a matching call to +end+. This code pattern is mandatory: it is critical to correct transaction nesting behavior.

One convenient way to do this is to encapsulate the logic of a transaction in an implementation of the +com.persistit.TransactionRunnable+ interface. The +com.persistit.Transaction#run+ method automatically provides logic to begin the transaction, execute the TransactionRunnable and commit the transaction, repeating the process until no rollback is thrown or a maximum retry count is reached. For example, the code fragment shown above can be rewritten as:

[source,java]
----
//
// Get the transaction context for the current thread.
//
Transaction txn = myExchange.getTransaction();
//
// Perform the transaction with the following parameters:
// - try to commit it up to 10 times
// - delay 2 milliseconds before each retry
// - use the group commit durability policy
//
txn.run(new TransactionRunnable() {
    public void run() throws PersistitException {
        myExchange.getValue().put("First value");
        myExchange.clear().append(1).store();
        myExchange.getValue().put("Second value");
        myExchange.clear().append(2).store();
    }
}, 10, 2, CommitPolicy.GROUP);
----

=== Mixing Transactional and Non-Transactional Operations

Database operations running outside the scope of an explicitly defined transaction are never subject to rollback and therefore do not require retry logic. However, such operations are also not guaranteed to be durable in the event of a system crash. Further, such operations are not isolated. Read operations performed outside of a transaction can read uncommitted updates, and updates performed outside of a transaction are visible within transactions. In other words, non-transactional reads and writes may break both the durability and isolation of concurrently executing transactions.  Therefore it is strongly recommended that in an application that relies on transactions, all interactions with the database should use transactions.

== Optimistic Transaction Scheduling

To achieve high performance and scalability, Persistit supports an optimistic transaction scheduling protocol called MVCC with http://wikipedia.org/wiki/Snapshot_isolation[Snapshot Isolation]. Under this protocol multiple threads are permitted to execute transactions at full speed without blocking until a potentially inconsistent state is recognized. At that point a transaction suspected of causing the inconsistent state is automatically forced to roll back.

Optimistic scheduling works because transactions usually do not collide, especially when individual database operations are fast, and so in practice transactions are seldom rolled back. But because any transaction may be rolled back at any point, applications must be designed carefully to avoid unintended side-effects. For example, a transaction should never perform non-repeatable or externally visible operations such as file or network I/O within its scope.

=== Snapshot Isolation

Persistit schedules concurrently executing transactions optimistically, without locking any database records. Instead, Persistit uses the well-known Snapshot Isolation protocol to achieve atomicity and isolation. While transactions are modifying data, Persistit maintains multiple versions of values being modified. Each version is labeled with the commit timestamp of the transaction that modified it. Whenever a transaction reads a value that has been modified by other transactions, it gets the latest version that was committed before its own start timestamp. In other words, all read operations are performed as if from a "snapshot" of the state of the database made at the transaction's start timestamp - hence the name "Snapshot Isolation."
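
The read rule in the paragraph above (return the latest version committed before the reader's start timestamp) can be sketched in a few lines of Java. This is a minimal illustration only: +VersionedValueSketch+ and its methods are hypothetical names, timestamps are plain integers, and nothing here reflects how Persistit actually stores versions.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of one multi-version value. Hypothetical names; not Persistit code.
final class VersionedValueSketch {
    // commit timestamp -> value written by the transaction that committed then
    private final TreeMap<Long, String> versions = new TreeMap<>();

    void addCommittedVersion(long commitTimestamp, String value) {
        versions.put(commitTimestamp, value);
    }

    // Returns the latest version committed strictly before the reader started,
    // or null if no such version exists.
    String readAsOf(long startTimestamp) {
        Map.Entry<Long, String> entry = versions.lowerEntry(startTimestamp);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        VersionedValueSketch value = new VersionedValueSketch();
        value.addCommittedVersion(10, "first");
        value.addCommittedVersion(20, "second");
        // A reader that started at 15 still sees the version committed at 10.
        System.out.println(value.readAsOf(15));
        // A reader that started at 25 sees the version committed at 20.
        System.out.println(value.readAsOf(25));
    }
}
```

A reader keeps seeing the same version for its whole lifetime, no matter how many newer versions are committed after it started.
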

[[Pruning]]
=== Pruning

Given that all updates written through transactions are created as versions within the MVCC scheme, a large number of versions can accumulate over time. Persistit reduces this proliferation through an activity called "pruning." Pruning resolves the final state of each version by removing any versions created by aborted transactions and removing obsolete versions no longer needed by other transactions. If a value contains only one version and the commit timestamp of the transaction that created it is before the start of any currently running transaction, that value is called _primordial_. The goal of pruning is to reduce most or all values in the database to their primordial states because updating and reading primordial values is more efficient than managing multiple-version values. Pruning happens automatically and is generally not visible to the application.
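
The "obsolete version" part of pruning can be sketched as follows. This is a simplified model under stated assumptions: versions are held in a map keyed by commit timestamp, the caller supplies the start timestamp of the oldest running transaction, and versions left behind by aborted transactions are not modelled at all. +PruningSketch+ and +prune+ are hypothetical names.

```java
import java.util.TreeMap;

// Toy model of pruning obsolete versions. Hypothetical names; not Persistit code.
final class PruningSketch {
    // versions maps commit timestamp -> value. Removes every version that no
    // running transaction can still read and returns how many were removed.
    static int prune(TreeMap<Long, String> versions, long oldestActiveStart) {
        // The newest version committed before the oldest running transaction
        // started is still visible to that transaction, so it must be kept.
        Long newestVisible = versions.lowerKey(oldestActiveStart);
        if (newestVisible == null) {
            return 0;
        }
        int before = versions.size();
        // Everything strictly older than that version is obsolete.
        versions.headMap(newestVisible, false).clear();
        return before - versions.size();
    }

    public static void main(String[] args) {
        TreeMap<Long, String> versions = new TreeMap<>();
        versions.put(10L, "a");
        versions.put(20L, "b");
        versions.put(30L, "c");
        int removed = prune(versions, 25L);
        System.out.println("removed " + removed + ", remaining " + versions.keySet());
    }
}
```

Once no running transaction started before the newest commit, a further call leaves a single version behind, which corresponds to the primordial state described above.
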

=== Rollbacks

Usually Snapshot Isolation allows concurrent transactions to commit without interference but this is not always the case. Two concurrent transactions that attempt to modify the same Persistit key/value pair before they commit are said to have a "write-write dependency". To avoid anomalous results one of them must abort, rolling back any other updates it may also have created, and retry. Persistit implements a "first updater wins" policy in which if two transactions attempt to update the same record, the first transaction "wins" by being allowed to continue, while the second transaction "loses" and is required to abort.
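
A toy model may help to picture the "first updater wins" policy. The sketch below is an assumption-laden illustration, not Persistit's algorithm: transactions are identified by a logical start timestamp, and +FirstUpdaterWinsSketch+, +ConflictException+ and the bookkeeping maps are hypothetical. In Persistit itself the losing transaction observes a +RollbackException+.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Toy model of "first updater wins". Hypothetical names; not Persistit code.
final class FirstUpdaterWinsSketch {
    static final class ConflictException extends RuntimeException {
        ConflictException(String message) {
            super(message);
        }
    }

    private long clock = 0;
    private final Map<String, Long> uncommittedWriter = new HashMap<>(); // key -> transaction id
    private final Map<String, Long> lastCommit = new HashMap<>();        // key -> commit timestamp

    // A transaction is identified by its start timestamp.
    long begin() {
        return ++clock;
    }

    void write(long txn, String key) {
        Long owner = uncommittedWriter.get(key);
        if (owner != null && owner != txn) {
            // Another running transaction updated the key first: this one loses.
            throw new ConflictException("key " + key + " is being updated by transaction " + owner);
        }
        Long committedAt = lastCommit.get(key);
        if (committedAt != null && committedAt > txn) {
            // A concurrent transaction already committed an update to the key.
            throw new ConflictException("key " + key + " was updated by a concurrent transaction");
        }
        uncommittedWriter.put(key, txn);
    }

    void commit(long txn) {
        long commitTimestamp = ++clock;
        Iterator<Map.Entry<String, Long>> it = uncommittedWriter.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() == txn) {
                lastCommit.put(entry.getKey(), commitTimestamp);
                it.remove();
            }
        }
    }

    void rollback(long txn) {
        uncommittedWriter.values().removeIf(owner -> owner == txn);
    }

    public static void main(String[] args) {
        FirstUpdaterWinsSketch db = new FirstUpdaterWinsSketch();
        long first = db.begin();
        long second = db.begin();
        db.write(first, "k");
        try {
            db.write(second, "k");
        } catch (ConflictException e) {
            db.rollback(second);
            System.out.println("second transaction must retry: " + e.getMessage());
        }
        db.commit(first);
    }
}
```

After the winner commits, a transaction that begins afterwards can update the same key without conflict, which is why a simple retry usually succeeds.
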

Once a transaction has aborted, any subsequent database operation it attempts throws a +RollbackException+. Application code should catch and handle this Exception. Usually the correct and desired behavior is simply to retry the transaction as shown in the code samples above.

A transaction can also voluntarily roll back. For example, transaction logic could detect an error condition that it chooses to handle by throwing an exception back to the application. In this case the transaction should invoke the +rollback+ method to explicitly declare its intent to abort the transaction.

=== Read-Only Transactions

Under Snapshot Isolation, transactions that read but do not modify data cannot generate any write-write dependencies and are therefore not subject to being rolled back because of the actions of other transactions. However, even though it modifies no data, a long-running read-only transaction can force Persistit to retain old value versions from other transactions for its duration in order to provide a snapshot view. This behavior can cause congestion and performance degradation by preventing very old values from being pruned. The degree to which this is a problem depends on the volume of update transactions being processed and the duration of long-running transactions.

=== Snapshot Isolation is not Serializable

It is well-known that transactions executing under SI are not necessarily serializable. Under SI, so-called _write-skew_ anomalies can happen with transactions that have certain kinds of interactions.  Write-skew can be avoided by (a) explicit application-level locking or (b) structuring transactions to add write-write dependencies where write-skew otherwise could occur.
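
A commonly used illustration of write-skew, which is not taken from the Persistit documentation, involves two people who may each go off call only while the other remains on call. The sketch below simulates two snapshot-isolated transactions with plain maps; +WriteSkewSketch+, the names "alice" and "bob" and the invariant itself are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Toy demonstration of write-skew. Hypothetical names; not Persistit code.
final class WriteSkewSketch {
    // Runs two "transactions" against private snapshots and returns the final state.
    static Map<String, Boolean> run() {
        Map<String, Boolean> onCall = new HashMap<>();
        onCall.put("alice", true);
        onCall.put("bob", true);

        // Both transactions start before either commits, so each takes its
        // snapshot of the same initial state.
        Map<String, Boolean> snapshotA = new HashMap<>(onCall);
        Map<String, Boolean> snapshotB = new HashMap<>(onCall);

        // Transaction A: bob is on call in my snapshot, so alice may leave.
        if (snapshotA.get("bob")) {
            onCall.put("alice", false);
        }
        // Transaction B: alice is on call in my snapshot, so bob may leave.
        if (snapshotB.get("alice")) {
            onCall.put("bob", false);
        }
        // The two transactions wrote different keys, so there is no
        // write-write dependency and both are allowed to commit.
        return onCall;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Each transaction's decision was valid against its own snapshot, yet the combined result leaves nobody on call, an outcome no serial execution could produce. Remedy (b) above would have both transactions also update one shared key, so that the second updater is forced to roll back and re-check the invariant.
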

Note that many common transaction patterns, including those defined by the TPC-C benchmark, do not experience write-skew and therefore _are_ serializable under SI.

== Durability Options: +CommitPolicy+

Persistit provides three policies that determine the durability of a transaction after it has executed the +com.persistit.Transaction#commit+ method. These are:

[horizontal]
+HARD+:: The +commit+ method does not return until all updates created by the transaction have been written to non-volatile storage (e.g., disk storage).
+GROUP+:: The +commit+ method does not return until all updates created by the transaction have been written to non-volatile storage. In addition, the committing transaction waits briefly in an attempt to recruit other transactions running in other threads to write their updates with the same physical I/O operation.
+SOFT+:: The +commit+ method returns _before_ the updates have been recorded on non-volatile storage. Persistit attempts to write them within 100 milliseconds, but this interval is not guaranteed.

You can specify a default policy in the Persistit initialization properties using the +txnpolicy+ property or under program control using +com.persistit.Persistit#setDefaultTransactionCommitPolicy+. The default policy applies whenever the application calls the +commit()+ method. You can override the default policy using +commit(CommitPolicy)+.

HARD and GROUP ensure each transaction is written durably to non-volatile storage before the +commit+ method returns. The difference is that GROUP can improve throughput in multi-threaded applications because the average number of I/O operations needed to commit _N_ transactions can be smaller than _N_. However, for one or a small number of concurrent threads, GROUP reduces throughput because it works by introducing a delay to allow other concurrent transactions to commit within a single I/O operation.

SOFT commits are generally much faster than HARD or GROUP commits, especially for single-threaded applications, because the results of numerous transactions committed from a single thread can be aggregated and written to disk in a single I/O operation. However, transactions written with the SOFT commit policy are not immediately durable and it is possible that the recovered state of a database will be missing transactions that reported they were committed shortly before a crash.

For SOFT commits, the state of the database after restart is such that for any committed transaction T, either all or none of its modifications will be present in the recovered database. Further, if a transaction T2 reads or updates data that was written by any other transaction T1, and if T2 is present in the recovered database, then so is T1. Any transaction that was in progress, but had not been committed at the time of the failure, is guaranteed not to be present in the recovered database. SOFT commits are designed to be durable within 100 milliseconds after +commit+ returns. However, this interval is determined by computing the average duration of recent I/O operations to predict the completion time of the I/O that will write the transaction to disk, and therefore the interval cannot be guaranteed.

== Nested Transactions

A nested transaction occurs when code that is already executing within the scope of a transaction executes the +begin+ method to start a new transaction. This might happen, for example, if an application’s transaction logic calls a method that also uses transactions. In this case, the commit processing of the inner transaction scope is deferred until the outermost transaction commits. At that point, all the updates performed within the inner and outer transaction scopes are committed to the database. Similarly, a rollback initiated by the inner transaction causes both it and the outermost transaction to roll back.

== Accumulators

Consider an application in which concurrently running transactions share a counter. For example, suppose each transaction is responsible for allocating a unique integer as a primary key for a database record. One way to do this would be to store the counter in a Persistit key/value pair, reading the value at the start of each transaction and committing an update at the end.

The problem with this approach is that under SI, concurrent transactions running in a multi-threaded application would experience very frequent write-write dependencies on the counter value; in fact, the only way to complete any transactions would be serially, one at a time.

Persistit provides the +com.persistit.Accumulator+ class to avoid this problem.  An accumulator is designed to manage contributions from multiple concurrent transactions without causing write-write dependencies. Accumulators are durable in the sense that each transaction’s contribution is made durable with the transaction itself, and Persistit automatically recovers a correct state for each Accumulator in the event of a system crash.

There are four types of accumulator in Persistit. Each is a concrete subclass of the abstract +com.persistit.Accumulator+ class:

[horizontal]
+SUM+:: Tallies a count or sum of contributions by each transaction
+MIN+:: Finds the minimum value contributed by all transactions
+MAX+:: Finds the maximum value contributed by all transactions
+SEQ+:: Special case of the SUM accumulator used to generate sequence numbers

Accumulator instances are associated with a +com.persistit.Tree+.  Each +Tree+ may have up to 64 accumulators. The following code fragment creates and/or acquires a +SumAccumulator+, reads its snapshot value and then adds one to it:

[source,java]
----
final Exchange ex = _persistit.getExchange(volume, treeName, true);
final Transaction txn = ex.getTransaction();
txn.begin();
try {
    final Accumulator acc =
        ex.getTree().getAccumulator(Accumulator.Type.SUM, 17);
    long snap = acc.getSnapshotValue(txn);
    acc.update(1, txn);
    txn.commit();
} finally {
    txn.end();
}
----

The value 17 is simply an arbitrary index number between 0 and 63, inclusive. The application is responsible for allocating and managing accumulator indexes.

The snapshot value of an accumulator obtained through +com.persistit.Accumulator#getSnapshotValue()+ is the value computed from all updates contributed by transactions that had committed at the time the current transaction started, plus the transaction’s own as-yet uncommitted updates. In other words, the snapshot value of the accumulator is consistent with the snapshot view of all other data visible within the transaction.

An accumulator has two ways of accessing its accumulated value:

[horizontal]
+getSnapshotValue()+:: Is a value computed from updates that were committed at the start of the current transaction, plus the transaction’s own updates. This method may be called only within the scope of a Transaction.
+getLiveValue()+:: Is an ephemeral value reflecting all updates performed by all transactions, including concurrent and aborted transactions.

The snapshot value is a precise, consistent tally, while the live value is approximate. For a +SumAccumulator+, +MaxAccumulator+ or +SeqAccumulator+, if all updates have non-negative arguments, then the live value is always greater than or equal to the snapshot value.

=== SeqAccumulator

The +SeqAccumulator+ class has a special role in allocating unique identifier numbers, e.g., synthetic primary keys.  The goal of the +SeqAccumulator+ is to ensure that every committed transaction has received a unique integer value in all circumstances, including after recovery from a crash. See +com.persistit.Accumulator+ for details.

=== modified file 'doc/build/build-doc.sh'
--- doc/build/build-doc.sh	2012-05-25 18:50:59 +0000
+++ doc/build/build-doc.sh	2012-05-30 18:23:19 +0000
@@ -37,12 +37,29 @@
 # The end-product files, user_guide.html and user_guide.xml are written
 # there.
 #
-rm -rf /tmp/akiban-persistit-doc
-mkdir /tmp/akiban-persistit-doc
-javac -d /tmp/akiban-persistit-doc -cp ../../core/target/classes/ src/*.java
-java -cp /tmp/akiban-persistit-doc:../../core/target/classes AsciiDocPrep in=../TOC.txt out=/tmp/akiban-persistit-doc/doc.txt base=apidocs index=../../core/target/site/apidocs/index-all.html
-asciidoc -a toc -n -d book -b xhtml11 -o /tmp/akiban-persistit-doc/doc.html /tmp/akiban-persistit-doc/doc.txt
-asciidoc -a toc -n -d book -b docbook -o /tmp/akiban-persistit-doc/doc.xml /tmp/akiban-persistit-doc/doc.txt
-sed s/\`/\ / /tmp/akiban-persistit-doc/doc.html > /tmp/akiban-persistit-doc/user_guide.html
-sed s/\`/\ / /tmp/akiban-persistit-doc/doc.xml > /tmp/akiban-persistit-doc/user_guide.xml
+rm -rf ../../target/sphinx/source
+mkdir -p ../../target/sphinx/source
+mkdir -p ../../target/sphinx/classes
+mkdir -p ../../target/sphinx/html
+mkdir -p ../../target/sphinx/text
+
+cp ../index.rst ../../target/sphinx/source
+cp ../conf.py ../../target/sphinx/source
+
+javac -d ../../target/sphinx/classes -cp ../../target/classes/ src/*.java
+
+java -cp ../../target/sphinx/classes:../../target/classes SphinxDocPrep in=../ReleaseNotes.rst out=../../target/sphinx/source/ReleaseNotes.rst base=http://www.akiban.com/ak-docs/admin/persistit/apidocs index=../../target/site/apidocs/index-all.html
+java -cp ../../target/sphinx/classes:../../target/classes SphinxDocPrep in=../BasicAPI.rst out=../../target/sphinx/source/BasicAPI.rst base=http://www.akiban.com/ak-docs/admin/persistit/apidocs index=../../target/site/apidocs/index-all.html
+java -cp ../../target/sphinx/classes:../../target/classes SphinxDocPrep in=../Configuration.rst out=../../target/sphinx/source/Configuration.rst base=http://www.akiban.com/ak-docs/admin/persistit/apidocs index=../../target/site/apidocs/index-all.html
+java -cp ../../target/sphinx/classes:../../target/classes SphinxDocPrep in=../GettingStarted.rst out=../../target/sphinx/source/GettingStarted.rst base=http://www.akiban.com/ak-docs/admin/persistit/apidocs index=../../target/site/apidocs/index-all.html
+java -cp ../../target/sphinx/classes:../../target/classes SphinxDocPrep in=../Management.rst out=../../target/sphinx/source/Management.rst base=http://www.akiban.com/ak-docs/admin/persistit/apidocs index=../../target/site/apidocs/index-all.html
+java -cp ../../target/sphinx/classes:../../target/classes SphinxDocPrep in=../Miscellaneous.rst out=../../target/sphinx/source/Miscellaneous.rst base=http://www.akiban.com/ak-docs/admin/persistit/apidocs index=../../target/site/apidocs/index-all.html
+java -cp ../../target/sphinx/classes:../../target/classes SphinxDocPrep in=../PhysicalStorage.rst out=../../target/sphinx/source/PhysicalStorage.rst base=http://www.akiban.com/ak-docs/admin/persistit/apidocs index=../../target/site/apidocs/index-all.html
+java -cp ../../target/sphinx/classes:../../target/classes SphinxDocPrep in=../Security.rst out=../../target/sphinx/source/Security.rst base=http://www.akiban.com/ak-docs/admin/persistit/apidocs index=../../target/site/apidocs/index-all.html
+java -cp ../../target/sphinx/classes:../../target/classes SphinxDocPrep in=../Serialization.rst out=../../target/sphinx/source/Serialization.rst base=http://www.akiban.com/ak-docs/admin/persistit/apidocs index=../../target/site/apidocs/index-all.html
+java -cp ../../target/sphinx/classes:../../target/classes SphinxDocPrep in=../Transactions.rst out=../../target/sphinx/source/Transactions.rst base=http://www.akiban.com/ak-docs/admin/persistit/apidocs index=../../target/site/apidocs/index-all.html
+
+sphinx-build -a  ../../target/sphinx/source ../../target/sphinx/html
+
+fold -s ../../target/sphinx/source/ReleaseNotes.rst | sed 's/``//g' | sed 's/\.\. note:/NOTE/' | sed 's/::/:/' > ../../target/sphinx/text/ReleaseNotes
 

=== added file 'doc/build/build-doc.sh.orig'
--- doc/build/build-doc.sh.orig	1970-01-01 00:00:00 +0000
+++ doc/build/build-doc.sh.orig	2012-05-30 18:23:19 +0000
@@ -0,0 +1,48 @@
#/bin/sh
#
# Copyright © 2011-2012 Akiban Technologies, Inc.  All rights reserved.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as
# published by the Free Software Foundation, version 3 (only) of the
# License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
# This program may also be available under different license terms. For more
# information, see www.akiban.com or contact licensing@akiban.com.
#

# ---------------------
#
# Builds the Akiban Persistit doc set.  Currently this process is based on
# the asciidoc tool (http://www.methods.co.nz/asciidoc/).
#
# Here are the steps:
# 1. Run a Java program AsciiDocPrep to prepare a text asciidoc file.
#    Among other things, AsciiDocPrep fills in JavaDoc hyperlinks.
# 2. Run asciidoc to generate an html file.
# 3. Use sed to replace some characters.  Turns out asciidoc doesn't like
#    to link to URLs having spaces, so AsciDocPrep replaces those spaces
#    with the "`" character.  This step converts those back to spaces.
#
# Run this script from the root of the persistit source directory. This
# script writes changes only into a directory /tmp/akiban-persistit-doc.
# The end-product files, user_guide.html and user_guide.xml are written
# there.
#
rm -rf /tmp/akiban-persistit-doc
mkdir /tmp/akiban-persistit-doc
javac -d /tmp/akiban-persistit-doc -cp ../../core/target/classes/ src/*.java
java -cp /tmp/akiban-persistit-doc:../../core/target/classes AsciiDocPrep in=../TOC.txt out=/tmp/akiban-persistit-doc/doc.txt base=apidocs index=../../core/target/site/apidocs/index-all.html
asciidoc -a toc -n -d book -b xhtml11 -o /tmp/akiban-persistit-doc/doc.html /tmp/akiban-persistit-doc/doc.txt
asciidoc -a toc -n -d book -b docbook -o /tmp/akiban-persistit-doc/doc.xml /tmp/akiban-persistit-doc/doc.txt
sed s/\`/\ / /tmp/akiban-persistit-doc/doc.html > /tmp/akiban-persistit-doc/user_guide.html
sed s/\`/\ / /tmp/akiban-persistit-doc/doc.xml > /tmp/akiban-persistit-doc/user_guide.xml
4196
=== added file 'doc/build/src/SphinxDocPrep.java'
--- doc/build/src/SphinxDocPrep.java	1970-01-01 00:00:00 +0000
+++ doc/build/src/SphinxDocPrep.java	2012-05-30 18:23:19 +0000
@@ -0,0 +1,147 @@
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.PrintWriter;
import java.util.SortedMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import com.persistit.util.ArgParser;

/**
 * Rewrites fully qualified com.persistit class and method references found in
 * the Sphinx (reStructuredText) sources as hyperlinks into the generated
 * JavaDoc.
 */
public class SphinxDocPrep {

    // Matches an optionally ``-quoted class name, optionally followed by
    // #method(args).
    private final static Pattern PERSISTIT_PATTERN = Pattern
            .compile("(``)?(com\\.persistit(?:\\.[a-z]\\w*)*(?:\\.[A-Z]\\w*)+)(?:#(\\w+(?:[\\(\\)\\,a-zA-Z]*)))?(``)?");

    private final static String[] ARG_TEMPLATE = { "in|string:|Input file", "out|string:|Output file",
            "index|string:|Pathname of index-all.html file", "base|string:|Base of generated URLs", };

    // Tracks whether the current line lies inside a code-block directive.
    enum BlockState {
        OUT, WAIT_FIRST_BLANK_LINE, WAIT_SECOND_BLANK_LINE
    }

    private AsciiDocIndex index;
    private BlockState block = BlockState.OUT;
    private PrintWriter writer;
    private String base;
    private String indexPath;

    private void prepare(final String[] args) throws Exception {
        ArgParser ap = new ArgParser("SphinxDocPrep", args, ARG_TEMPLATE);
        final String inPath = ap.getStringValue("in");
        final String outPath = ap.getStringValue("out");

        // Write to standard output when no output file is given
        writer = outPath.isEmpty() ? new PrintWriter(System.out) : new PrintWriter(new FileWriter(outPath));

        base = ap.getStringValue("base");
        if (base.isEmpty()) {
            base = "http://akiban.com/persistit/doc/apidocs";
        }
        indexPath = ap.getStringValue("index");
        if (indexPath.isEmpty()) {
            indexPath = "/home/peter/website/apidocs/index-all.html";
        }

        index = new AsciiDocIndex();
        System.out.print("Building JavaDoc index..");
        index.buildIndex(indexPath, base);
        System.out.println("done");

        processFile(new File(inPath), 0);
        writer.close();
    }

    public void processFile(final File file, final int level) throws Exception {
        BufferedReader reader = new BufferedReader(new FileReader(file));
        System.out.print("Processing file " + file);
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("@")) {
                    // "@path" includes another file, relative to this one
                    processFile(new File(file.getParentFile(), line.substring(1)), level + 1);
                } else {
                    processLine(line);
                }
            }
        } finally {
            // Release the file handle even if processing fails
            reader.close();
        }
        writer.println();
        System.out.println(" - done");
    }

    private void processLine(final String line) throws Exception {
        // A code block starts at its directive and ends at the second blank
        // line that follows it.
        if (line.contains(".. code-block:")) {
            block = BlockState.WAIT_FIRST_BLANK_LINE;
        } else if (line.isEmpty()) {
            switch (block) {
            case WAIT_FIRST_BLANK_LINE:
                block = BlockState.WAIT_SECOND_BLANK_LINE;
                break;

            case WAIT_SECOND_BLANK_LINE:
                block = BlockState.OUT;
                break;

            default:
                // no change
            }
        }

        if (block == BlockState.OUT) {
            // Outside a code block: replace each reference with a hyperlink
            final StringBuffer sb = new StringBuffer();
            if (line.startsWith("=")) {
                sb.append("=");
            }
            final Matcher matcher = PERSISTIT_PATTERN.matcher(line);
            while (matcher.find()) {
                processMatch(matcher, sb);
            }
            matcher.appendTail(sb);
            writer.println(sb.toString());
        } else {
            // Inside a code block: copy the line verbatim
            writer.println(line);
            writer.flush();
        }
    }

    private void processMatch(final Matcher matcher, final StringBuffer sb) {
        String className = matcher.group(2);
        String methodName = matcher.group(3);

        String replacement;
        if (methodName == null) {
            String url = index.getClassMap().get(className);
            if (url == null || url.isEmpty()) {
                replacement = "<<<Missing class: " + className + ">>>";
            } else {
                replacement = "`" + className + " <" + url + ">`_";
            }
        } else {
            String from = className + "#" + methodName.split("\\(")[0];
            final SortedMap<String, String> map = index.getMethodMap().tailMap(from);
            String url;
            // tailMap returns every entry at or after "from", so the first key
            // is accepted only if it really belongs to the requested method.
            if (map.isEmpty() || !map.firstKey().startsWith(from)) {
                replacement = "<<<Missing method: " + methodName + ">>>";
            } else {
                final String first = map.firstKey();
                url = map.get(first);
                url = url.replace(" ", "%20");
                // Shorten the link text by dropping well-known package prefixes
                String text = first.split("#")[1];
                text = text.replace("com.persistit.encoding.", "");
                text = text.replace("com.persistit.exception.", "");
                text = text.replace("com.persistit.logging.", "");
                text = text.replace("com.persistit.mxbeans.", "");
                text = text.replace("com.persistit.ref.", "");
                text = text.replace("com.persistit.ui.", "");
                text = text.replace("com.persistit.", "");
                text = text.replace("java.lang.", "");
                text = text.replace("java.util.", "");
                replacement = "`" + text + " <" + url + ">`_";
            }
        }

        matcher.appendReplacement(sb, Matcher.quoteReplacement(replacement));
    }

    public static void main(final String[] args) throws Exception {
        new SphinxDocPrep().prepare(args);
    }
}
=== added file 'doc/conf.py'
--- doc/conf.py	1970-01-01 00:00:00 +0000
+++ doc/conf.py	2012-05-30 18:23:19 +0000
@@ -0,0 +1,285 @@
# -*- coding: utf-8 -*-
#
# PersistitDoc documentation build configuration file, created by
# sphinx-quickstart on Fri May 18 15:19:04 2012.
#
# This file is execfile()d with the current directory set to its containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.

import sys, os

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#sys.path.insert(0, os.path.abspath('.'))

# -- General configuration -----------------------------------------------------

# If your documentation needs a minimal Sphinx version, state it here.
#needs_sphinx = '1.0'

# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = ['sphinx.ext.todo']

# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']

# The suffix of source filenames.
source_suffix = '.rst'

# The encoding of source files.
#source_encoding = 'utf-8-sig'

# The master toctree document.
master_doc = 'index'

# General information about the project.
project = u'PersistitDoc'
copyright = u'2012, Akiban Technologies'

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = '1'
# The full version, including alpha/beta/rc tags.
release = '1'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#language = None

# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
#today = ''
# Else, today_fmt is used as the format for a strftime call.
#today_fmt = '%B %d, %Y'

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = []

# The reST default role (used for this markup: `text`) to use for all documents.
#default_role = None

# If true, '()' will be appended to :func: etc. cross-reference text.
#add_function_parentheses = True

# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
#add_module_names = True

# If true, sectionauthor and moduleauthor directives will be shown in the
# output. They are ignored by default.
#show_authors = False

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'

# A list of ignored prefixes for module index sorting.
#modindex_common_prefix = []


# -- Options for HTML output ---------------------------------------------------

# The theme to use for HTML and HTML Help pages.  See the documentation for
# a list of builtin themes.
html_theme = 'default'

# Theme options are theme-specific and customize the look and feel of a theme
# further.  For a list of options available for each theme, see the
# documentation.
#html_theme_options = {}

# Add any paths that contain custom themes here, relative to this directory.
#html_theme_path = []

# The name for this set of Sphinx documents.  If None, it defaults to
# "<project> v<release> documentation".
#html_title = None

# A shorter title for the navigation bar.  Default is the same as html_title.
#html_short_title = None

# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
#html_logo = None

# The name of an image file (within the static path) to use as favicon of the
# docs.  This file should be a Windows icon file (.ico) being 16x16 or 32x32
# pixels large.
#html_favicon = None

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']

# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
#html_last_updated_fmt = '%b %d, %Y'

# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
#html_use_smartypants = True

# Custom sidebar templates, maps document names to template names.
#html_sidebars = {}

# Additional templates that should be rendered to pages, maps page names to
# template names.
#html_additional_pages = {}

# If false, no module index is generated.
#html_domain_indices = True

# If false, no index is generated.
#html_use_index = True

# If true, the index is split into individual pages for each letter.
#html_split_index = False

# If true, links to the reST sources are added to the pages.
#html_show_sourcelink = True

# If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
#html_show_sphinx = True

# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True.
#html_show_copyright = True

# If true, an OpenSearch description file will be output, and all pages will
# contain a <link> tag referring to it.  The value of this option must be the
# base URL from which the finished HTML is served.
#html_use_opensearch = ''

# This is the file name suffix for HTML files (e.g. ".xhtml").
#html_file_suffix = None

# Output file base name for HTML help builder.
htmlhelp_basename = 'PersistitDocdoc'


# -- Options for LaTeX output --------------------------------------------------

latex_elements = {
# The paper size ('letterpaper' or 'a4paper').
#'papersize': 'letterpaper',

# The font size ('10pt', '11pt' or '12pt').
#'pointsize': '10pt',

# Additional stuff for the LaTeX preamble.
#'preamble': '',
}

# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, documentclass [howto/manual]).
latex_documents = [
  ('index', 'PersistitDoc.tex', u'PersistitDoc Documentation',
   u'Akiban Technologies', 'manual'),
]

# The name of an image file (relative to this directory) to place at the top of
# the title page.
#latex_logo = None

# For "manual" documents, if this is true, then toplevel headings are parts,
# not chapters.
#latex_use_parts = False

# If true, show page references after internal links.
#latex_show_pagerefs = False

# If true, show URL addresses after external links.
#latex_show_urls = False

# Documents to append as an appendix to all manuals.
#latex_appendices = []

# If false, no module index is generated.
#latex_domain_indices = True


# -- Options for manual page output --------------------------------------------

# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
    ('index', 'persistitdoc', u'PersistitDoc Documentation',
     [u'Akiban Technologies'], 1)
]

# If true, show URL addresses after external links.
#man_show_urls = False


# -- Options for Texinfo output ------------------------------------------------

# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
#  dir menu entry, description, category)
texinfo_documents = [
  ('index', 'PersistitDoc', u'PersistitDoc Documentation',
   u'Akiban Technologies', 'PersistitDoc', 'One line description of project.',
   'Miscellaneous'),
]

# Documents to append as an appendix to all manuals.
#texinfo_appendices = []

# If false, no module index is generated.
#texinfo_domain_indices = True

# How to display URL addresses: 'footnote', 'no', or 'inline'.
#texinfo_show_urls = 'footnote'


# -- Options for Epub output ---------------------------------------------------

# Bibliographic Dublin Core info.
epub_title = u'PersistitDoc'
epub_author = u'Akiban Technologies'
epub_publisher = u'Akiban Technologies'
epub_copyright = u'2012, Akiban Technologies'

# The language of the text. It defaults to the language option
# or en if the language is not set.
#epub_language = ''

# The scheme of the identifier. Typical schemes are ISBN or URL.
#epub_scheme = ''

# The unique identifier of the text. This can be a ISBN number
# or the project homepage.
#epub_identifier = ''

# A unique identification for the text.
#epub_uid = ''

# A tuple containing the cover image and cover page html template filenames.
#epub_cover = ()

# HTML files that should be inserted before the pages created by sphinx.
# The format is a list of tuples containing the path and title.
#epub_pre_files = []

# HTML files that should be inserted after the pages created by sphinx.
# The format is a list of tuples containing the path and title.
#epub_post_files = []

# A list of files that should not be packed into the epub file.
#epub_exclude_files = []

# The depth of the table of contents in toc.ncx.
#epub_tocdepth = 3

# Allow duplicate toc entries.
#epub_tocdup = True
=== added file 'doc/index.rst'
--- doc/index.rst	1970-01-01 00:00:00 +0000
+++ doc/index.rst	2012-05-30 18:23:19 +0000
@@ -0,0 +1,15 @@
Akiban Persistit User Guide
===========================
.. toctree::
   :maxdepth: 1

   GettingStarted
   BasicAPI
   Transactions
   PhysicalStorage
   Configuration
   Management
   Security
   Serialization
   Miscellaneous

=== removed file 'doc/overview-summary.txt'
4659
--- doc/overview-summary.txt	2011-04-26 02:10:48 +0000
4660
+++ doc/overview-summary.txt	1970-01-01 00:00:00 +0000
4661
@@ -1,107 +0,0 @@
4662
1
4663
2
Persistit 2.2
4664
3
=============
4665
4
4666
5
Persistit(TM) is a small, lightweight Java(TM) library that provides simple, fast and reliable data persistence for Java applications. It is designed to be embedded in Java application programs and to operate free of administration by the end-user.
4667
6
4668
7
This section provides a brief overview.  See http://com.akiban.com/persistit/documentation/ for complete documentation.
4669
8
4670
9
== API Overview
4671
10
4672
11
Persistit stores data as key-value pairs in highly optimized B-Tree structures. Much like a Java Map implementation, Persistit associates at most one value with each unique instance of a key.
4673
12
4674
13
Persistit provides interfaces to access and modify keys and their associated values. The developer writes code to construct key and value instances and to store, fetch, traverse and remove keys and records to and from the database. Persistit permits efficient multi-threaded concurrent access to database volumes. It is designed to minimize contention for critical resources and to maximize throughput on multi-processor machines.
4675
14
4676
15
In addition to low-level access methods on keys and values, Persistit provides com.persistit.PersistitMap, which implements the java.util.SortedMap interface. PersistitMap uses the Persistit database as a backing store so that key/value pairs are persistent, potentially shared with all threads, and limited in number only by disk storage. (See PersistitMap.)
4677
16
4678
17
Within Persistit, key values are _segmented_ and _ordered_. Segmented means that you can append multiple primitive values or Strings to construct a concatenated key. Ordered means that the methods that enumerate key values within a Persistit database do so in a specified natural order. (See Keys).
4679
18
4680
19
A Persistit value may be any primitive value, any Serializable Java object, or an object of any class supported by a custom serialization helper class. When stored in the B-Tree, keys and values are represented by sequences of bytes. The byte sequence that represents a value may be of arbitrary length, bounded only by available heap memory. (See Values.)
4681
20
4682
21
=== Access Methods
4683
22
4684
23
The primary low-level interface for interacting with Persistit is +com.persistit.Exchange+. The Exchange class provides all methods for storing, deleting, fetching and traversing key/value pairs. These methods are summarized here and described in detail in the API documentation.
4685
24
4686
25
Although the underlying Persistit database is designed for highly concurrent multi-threaded operation, the Exchange object itself is not thread-safe. Each thread should create and use its own Exchange object(s) when accessing the database.
4687
26
4688
27
To create an Exchange you provide a Volume name (or alias) and a tree name in its constructor. The constructor will optionally create a new tree in that Volume if a tree having the specified name is not found. An application may construct an arbitrary number of Exchange objects. Creating a new Exchange has no effect on the database if the specified tree already exists. Tree creation is thread-safe: multiple threads concurrently constructing Exchanges using the same Tree name will safely result in the creation of only one new tree.
4689
28
4690
29
An Exchange is a moderately complex object that requires several thousand bytes of heap space. Memory-constrained applications should construct Exchanges in moderate numbers. An Exchange internally maintains some optimization information such that references to nearby Keys within a tree are accelerated. Performance may benefit from using a different Exchange for each area of the Tree being accessed.
4691
30
4692
31
Persistit offers Exchange pooling to avoid rapidly creating and destroying Exchange objects in multi-threaded applications.  In particular, web applications may benefit from using the Exchange pool.
4693
32
4694
33
An Exchange is always associated with a com.persistit.Key and a com.persistit.Value. Typically you work with an Exchange in one of the following patterns:
4695
34
. Modify the Key, perform a +fetch+ operation, and extract the Value.
4696
35
. Modify the Key, modify the Value, and then perform a +store+ operation.
4697
36
. Modify the Key, and then perform a +remove+ operation.
4698
37
. Optionally modify the Key, perform a +traverse+ operation, then read the resulting Key and/or Value.
4699
38
4700
39
These four methods, plus a few other methods listed here, are the primary low-level interface to the database. Semantics are as follows:
4701
40
4702
41
[horizontal]
4703
42
+fetch+:: Reads the stored value associated with this Exchange's Key and modifies the Exchange’s Value to reflect that value.
4704
43
+store+:: Inserts or replaces the key/value pair for the specified key in the Tree either by replacing the former value, if there was one, or inserting a new value.
4705
44
+fetchAndStore+:: Fetches and then replaces the stored value. Upon completion, Value reflects the formerly stored value for the current Key. This operation is atomic, as opposed to sequential calls to fetch and store.
4706
45
+remove+, +removeAll+, +removeKeyRange+:: Removes key/value pairs from the Tree. Versions of this method specify either a single key or a range of keys to be removed.
4707
46
+fetchAndRemove+:: Fetches and then removes the stored value. Upon completion, Value reflects the formerly stored value for the current Key. This operation is atomic, as opposed to sequential calls to fetch and remove.
4708
47
+traverse+, +next+, +previous+:: Modifies the Exchange’s Key and Value to reflect a successor or predecessor key within the tree. (See API documentation for com.persistit.Key for information on the order of traversal.)
4709
48
+incrementValue+:: Atomically increments or decrements a long (64-bit integer) value associated with the current Key, and returns the modified value. If there is currently no value associated with the key then incrementValue creates one and assigns an initial value to it. This operation provides a convenient way for concurrent threads to safely allocate unique long integers without an explicit transaction scope.
4710
49
+hasNext+, +hasPrevious+:: Indicates, without modifying the Exchange’s Value or Key objects, whether there is a successor or predecessor key in the Tree.
4711
50
+getChangeCount+:: Number of times the Tree for this Exchange has changed. This count may be used as a reliable indicator of whether the Tree has changed since some earlier instant in time. For example, it is used to detect concurrent modifications by PersistitMap.
4712
51
4713
52
Because Persistit permits concurrent operations by multiple threads, there is no guarantee that the underlying database will remain unchanged after any of these operations is completed. However, each of these methods operates atomically. That is, the inputs and outputs of each method are consistent with some valid state of the underlying Persistit backing store at some instant in time. The Value and Key objects for the Exchange represent that consistent state even if some other thread subsequently modifies the underlying database.
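
The semantics summarized above can be pictured with a plain JDK sorted map standing in for a Persistit tree. The following is only an analogy, not Persistit code: it assumes `java.util.concurrent.ConcurrentSkipListMap` as the stand-in, and the class `ExchangeSemanticsSketch` and its methods are hypothetical names chosen to mirror the operations in the list. In particular, the initial value used by `incrementValue` here (the delta itself) is an assumption of the sketch.

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical stand-in: a JDK concurrent sorted map plays the role of a tree.
// None of these names belong to the Persistit API.
final class ExchangeSemanticsSketch {
    private final ConcurrentSkipListMap<String, Long> tree = new ConcurrentSkipListMap<>();

    // Like store: insert a new value or replace the former one.
    void store(String key, long value) {
        tree.put(key, value);
    }

    // Like fetch: null stands for "no value associated with this key".
    Long fetch(String key) {
        return tree.get(key);
    }

    // Like fetchAndStore: replace the value and return the former one in one step.
    Long fetchAndStore(String key, long value) {
        return tree.put(key, value);
    }

    // Like fetchAndRemove: remove the pair and return the former value.
    Long fetchAndRemove(String key) {
        return tree.remove(key);
    }

    // Like next: the smallest key strictly greater than the given key, or null.
    String next(String key) {
        return tree.higherKey(key);
    }

    // Like incrementValue: add delta without losing concurrent updates,
    // creating the entry (with value delta) when the key is absent.
    long incrementValue(String key, long delta) {
        return tree.merge(key, delta, Long::sum);
    }

    public static void main(String[] args) {
        ExchangeSemanticsSketch ex = new ExchangeSemanticsSketch();
        ex.store("a", 1L);
        ex.store("c", 3L);
        System.out.println("successor of a: " + ex.next("a"));
        System.out.println("allocated id: " + ex.incrementValue("counter", 1L));
    }
}
```

As in the text, each call is consistent with some state of the map at one instant, but another thread may change the map immediately afterwards.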
4714
53
4715
54
=== PersistitMap
4716
55
4717
56
Persistit provides an implementation of the java.util.SortedMap interface called com.persistit.PersistitMap. PersistitMap uses Persistit as its backing store, permitting large maps to be stored efficiently on disk using constant heap memory space.
4718
57
4719
58
Keys for PersistitMap must conform to the constraints described above under Keys. Values must conform to the constraints described for Values.
4720
59
4721
60
The constructor for PersistitMap takes an Exchange as its sole parameter. All key/value pairs of the Map are stored within the tree identified by this Exchange. The Key supplied by the Exchange becomes the root of a logical tree. For example:
4722
61
4723
62
[source,java]
4724
63
----
4725
64
Exchange ex = new Exchange("myVolume", "myTree", true);
4726
65
ex.append("USA").append("MA");
4727
66
PersistitMap<String, String> map = new PersistitMap<String, String>(ex);
4728
67
map.put("Boston", "Hub");
4729
68
----
4730
69
4731
70
places a key/value pair into myTree with the concatenated key +{"USA","MA","Boston"}+ and a value of +"Hub"+.
4732
71
4733
72
Because Persistit is designed for concurrent operation it is possible (and often intended) for the backing store of PersistitMap to be changed by other threads while a java.util.Iterator is in use. Generally the expected behavior for an Iterator on a Map collection view is to throw a ConcurrentModificationException if the underlying collection changes. This is known as fail-fast behavior. PersistitMap implements this behavior by throwing a ConcurrentModificationException in the event the Tree containing the map changes. By catching this exception, an application whose design contract requires the map to remain unchanged can detect that it was modified, for example because of a programming error.
4734
73
4735
74
However, sometimes it may be desirable to use PersistitMap and its collection view interfaces to iterate across changing data. Internally, Persistit implements the Iterator’s hasNext and next methods by calling the traverse method to retrieve the next key in key sort order, so the result depends on the content of the database at the instant these operations are performed. PersistitMap provides the method setAllowConcurrentModification to enable this behavior. By default, concurrent modifications are not allowed.
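
The two iteration policies described above can be observed with JDK collections alone. The sketch below is an analogy under stated assumptions, not PersistitMap itself: `java.util.TreeMap` stands in for the default fail-fast policy and `java.util.concurrent.ConcurrentSkipListMap` for an iterator that tolerates concurrent modification. The class `IterationPolicySketch` and its methods are hypothetical.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical illustration using JDK maps only; not part of the Persistit API.
final class IterationPolicySketch {

    // Fills the given map with three sample entries and returns it.
    static SortedMap<String, String> seed(SortedMap<String, String> map) {
        map.put("Boston", "Hub");
        map.put("Lowell", "Mill City");
        map.put("Salem", "Witch City");
        return map;
    }

    // Iterates over the keys and inserts extraKey after the first key is read.
    // Returns the number of keys visited, or -1 if the iterator failed fast.
    static int iterateWhileModifying(SortedMap<String, String> map, String extraKey) {
        int visited = 0;
        try {
            Iterator<String> it = map.keySet().iterator();
            while (it.hasNext()) {
                it.next();
                if (visited++ == 0) {
                    map.put(extraKey, "added during iteration");
                }
            }
            return visited;
        } catch (ConcurrentModificationException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        int failFast = iterateWhileModifying(seed(new TreeMap<String, String>()), "Worcester");
        int tolerant = iterateWhileModifying(seed(new ConcurrentSkipListMap<String, String>()), "Worcester");
        System.out.println("fail-fast result: " + failFast);
        System.out.println("tolerant result: " + tolerant);
    }
}
```

The fail-fast iterator gives up as soon as it notices the structural change, while the tolerant one keeps walking the keys in sort order and reports whatever it finds at the moment of each step.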
4736
75
=== KeyFilter
4737
76
4738
77
A +com.persistit.KeyFilter+ defines a subset of all possible key values. You can supply a KeyFilter to the traverse methods of an Exchange.  You can also specify a KeyFilter for any Iterator returned by the collection views of a PersistitMap.  In either case, the key/value pairs covered by traversing the database or iterating over the collection view are restricted to those selected by the KeyFilter.
4739
78
4740
79
Use of a KeyFilter is illustrated by the following code fragment:
4741
80
4742
81
[source,java]
4743
82
----
4744
83
Exchange ex = new Exchange("myVolume", "myTree", true);
4745
84
KeyFilter kf = new KeyFilter("{\"Bellini\":\"Busoni\"}");
4746
85
ex.append(Key.BEFORE);
4747
86
while (ex.next(kf))
4748
87
{
4749
88
  System.out.println(ex.getKey().reset().decodeString());
4750
89
}
4751
90
----
4752
91
4753
92
This simple example emits the string-valued keys within myTree whose values fall alphabetically between “Bellini” and “Busoni”, inclusive.
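
For readers without a Persistit database at hand, the effect of that inclusive range can be approximated with a JDK sorted map. This is a sketch under assumptions, not the KeyFilter implementation: `java.util.TreeMap.subMap` stands in for the filter, and the class `KeyRangeSketch`, its methods and the sample composer names other than "Bellini" and "Busoni" are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration using JDK maps only; not part of the Persistit API.
final class KeyRangeSketch {

    // Builds a small sorted map of sample keys (the values are irrelevant here).
    static NavigableMap<String, String> sampleTree() {
        NavigableMap<String, String> tree = new TreeMap<>();
        for (String name : new String[] { "Bach", "Bellini", "Berlioz", "Brahms", "Busoni", "Chopin" }) {
            tree.put(name, "composer");
        }
        return tree;
    }

    // Returns the keys from 'from' to 'to', both inclusive, in sort order.
    static List<String> keysBetween(NavigableMap<String, String> tree, String from, String to) {
        return new ArrayList<>(tree.subMap(from, true, to, true).keySet());
    }

    public static void main(String[] args) {
        for (String key : keysBetween(sampleTree(), "Bellini", "Busoni")) {
            System.out.println(key);
        }
    }
}
```

Keys that sort before "Bellini" or after "Busoni" are skipped, which is the restriction the filter applies during traversal.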
4754
93
4755
94
4756
95
== Transactions
4757
96
4758
97
Persistit supports transactions with full isolation and optimistic concurrency control. An application may begin, commit or roll back the current transaction scope explicitly, executing multiple database operations in an atomic, consistent, isolated and (optionally) durable (ACID) manner. If the application does not explicitly define the scope of a transaction, each database operation implicitly runs within the scope of a separate transaction. Each Persistit transaction may optionally be committed to either memory or disk. Transactions committed to memory are much faster, but are not immediately durable. (See Transactions.)
4759
98
4760
99
== Configuration
4761
100
4762
101
To initialize Persistit the embedding application invokes one of the initialize methods of +com.persistit.Persistit+, passing either a +java.util.Properties+ object or the name of a properties file from which the Properties object derives its content. The following properties are defined for Persistit. Other properties may also reside in the Properties object or its backing file; Persistit simply ignores any property not listed here.
4763
102
4764
103
== Logging API
4765
104
4766
105
Persistit writes various diagnostic and informational messages to a log. By default, the log is written as text to the file +persistit.log+ in the current working directory. However, a container application will usually have a logging architecture already in place, and Persistit provides a simple way to redirect its log output to the container application’s log. Adapters for Log4J and the Java Logging API are included; other logging systems are easy to adapt.
4767
106
4768
107
Status:	Merged
Approved by:	Nathan Williams on 2012-05-30
Approved revision:	314
Merged at revision:	312
Proposed branch:	lp:~pbeaman/akiban-persistit/sphinxdoc-release-notes
Merge into:	lp:akiban-persistit
Diff against target:	4768 lines (+2646/-1989), 26 files modified
	doc/BasicAPI.rst (+332/-0)
	doc/BasicAPI.txt (+0/-320)
	doc/Configuration.rst (+273/-0)
	doc/Configuration.txt (+0/-246)
	doc/GettingStarted.rst (+270/-0)
	doc/GettingStarted.txt (+0/-255)
	doc/Management.rst (+446/-0)
	doc/Management.txt (+0/-360)
	doc/Miscellaneous.rst (+36/-0)
	doc/Miscellaneous.txt (+0/-32)
	doc/PhysicalStorage.rst (+142/-0)
	doc/PhysicalStorage.txt (+0/-126)
	doc/ReleaseNotes.rst (+84/-0)
	doc/Security.rst (+93/-0)
	doc/Security.txt (+0/-91)
	doc/Serialization.rst (+250/-0)
	doc/Serialization.txt (+0/-246)
	doc/TOC.txt (+0/-19)
	doc/Transactions.rst (+200/-0)
	doc/Transactions.txt (+0/-179)
	doc/build/build-doc.sh (+25/-8)
	doc/build/build-doc.sh.orig (+48/-0)
	doc/build/src/SphinxDocPrep.java (+147/-0)
	doc/conf.py (+285/-0)
	doc/index.rst (+15/-0)
	doc/overview-summary.txt (+0/-107)
To merge this branch:	bzr merge lp:~pbeaman/akiban-persistit/sphinxdoc-release-notes
Reviewer	Review Type	Date Requested	Status
Nathan Williams		2012-05-30	Approve on 2012-05-30
Review via email: mp+108031@code.launchpad.net