Database Management for Smalltalk

Archive for August, 2008

Wed 27th Aug 2008   04:08 PM
posted by John Clapperton

When an object is added to a DictionarySet (or VirtualDictionarySet), it is added to each of the DictionarySet’s component AutoDictionaries, each of which sends its defined unary message selector to the object to obtain the key at which it is to be inserted.

If the object returns a VOKeySet or VOKeyCollection (which should also normally be a set of unique keys) then the object is added into that AutoDictionary at each of those keys; this may be useful, for example, if the DictionarySet elements are published books or papers, each of which may have been written by several contributing authors. The consequence of this, however, is that the DictionarySet would no longer be a set, as a book having three authors will be present three times, once at each author key, and this could cause unexpected behaviour in DictionarySet>>do: and other enumeration methods.
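The effect described above can be reproduced without VOSS. The following is a minimal sketch in plain Smalltalk, assuming an ordinary Dictionary of Sets as a stand-in for one multi-valued AutoDictionary; the book title and author names are hypothetical, and nothing here is the VOSS implementation.

```smalltalk
"Stand-in for one multi-valued index: author -> Set of books.
 Dictionary and Set are ordinary image classes, not VOSS classes."
| byAuthor book visits distinct |
byAuthor := Dictionary new.
book := 'A Paper'.   "hypothetical element with three author keys"
#('Smith' 'Jones' 'Brown') do: [:author |
    (byAuthor at: author ifAbsentPut: [Set new]) add: book].

"Enumerating key by key visits the same book once per author key."
visits := 0.
byAuthor do: [:books | books do: [:each | visits := visits + 1]].

"Gathering the elements into a Set restores set semantics."
distinct := Set new.
byAuthor do: [:books | distinct addAll: books].
```

Here visits counts one visit per author key, while distinct holds the book only once, which is the behaviour an enumeration over a true set would be expected to have.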


Thu 21st Aug 2008   03:08 PM
posted by John Clapperton

Queries in a VOSS ODBMS are written in Smalltalk and may therefore be of arbitrary complexity, addressing an arbitrary semantic network of persistent objects. However, to simplify the most common kinds of queries, it is recommended that VirtualDictionarySet be used as the general-purpose Collection for the major aggregations of application entities in the database.

A DictionarySet may index its elements on any number of single-valued and/or multi-valued unary key selector messages, which may be added or removed at any time by ordinary (though potentially large) transactions. DictionarySet uses these indexes to provide efficient query-building methods which return subsets of its contents, allowing more complex queries to be built by union and intersection of these sets.
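As an illustration of building a query by union and intersection of answer sets, here is a minimal sketch in plain Smalltalk. It assumes journeys modelled as three-element literal Arrays (id, month, miles), which is hypothetical data, and it uses select: as a stand-in for the indexed lookups; in VOSS the two subsets would instead come from the DictionarySet's own query methods.

```smalltalk
"Plain Smalltalk only; select: stands in for indexed DictionarySet queries."
| journeys inAugust longTrips both either |
journeys := #( #(1 'August' 120) #(2 'August' 40) #(3 'July' 300) ).

"Two answer sets, one per indexed attribute."
inAugust := (journeys select: [:j | (j at: 2) = 'August']) asSet.
longTrips := (journeys select: [:j | (j at: 3) between: 100 and: 500]) asSet.

"Intersection: journeys in August AND between 100 and 500 miles."
both := inAugust select: [:j | longTrips includes: j].

"Union: journeys in August OR between 100 and 500 miles."
either := Set new addAll: inAugust; addAll: longTrips; yourself.
```

The intersection is computed by filtering one answer set against the other, so it pays to start from the smaller of the two.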

A previous article noted the optimisation possible by using #for:equalsNoCopy: which returns the actual virtual set within the VirtualDictionarySet, instead of #for:equals: which returns a copy of it, when there is no intention to add or remove elements to or from the returned answer set. This article concerns optimisation of queries using #for:between:and:.


Tue 19th Aug 2008   04:08 PM
posted by John Clapperton

The VOSS collection, DictionarySet, and therefore also its subclass VirtualDictionarySet, follows the normal Smalltalk practice of returning a new collection when asked for a subset of its elements, just as when using Collection>>select:.

However, when sending a DictionarySet one of the most commonly used query messages, #for:equals: (as in Journeys for: #month equals: 'August'), the required elements will already be in an explicit virtual set in one of its component VirtualMultiValuedAutoDictionaries. For consistency, however, those elements are all added into a constructed answer set, and this takes time. For this reason, an optimisation is provided as the alternative method DictionarySet>>for:equalsNoCopy: which returns the actual virtual set from the database.
  
It’s safe to do this only if the application does not change that returned set, as that would be a subversive change to just that one of the DictionarySet’s component AutoDictionaries, which the DictionarySet keeps synchronised when adding or removing elements. That can be avoided with care, and in any case cannot happen in a read-only transaction.
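The difference between a copied answer set and the actual set can be seen in a minimal plain-Smalltalk sketch. It assumes an ordinary Dictionary of Sets standing in for one component index, with hypothetical symbols as elements; it does not call #for:equals: or #for:equalsNoCopy: themselves, it only mimics what each is described as returning.

```smalltalk
"Stand-in for one component index: month -> Set of journeys."
| index copyAnswer noCopyAnswer |
index := Dictionary new.
index at: 'August' put: (Set withAll: #(#j1 #j2)).

"Copy semantics, as described for #for:equals:."
copyAnswer := (index at: 'August') copy.

"No-copy semantics, as described for #for:equalsNoCopy:."
noCopyAnswer := index at: 'August'.

"Changing the copy leaves the index untouched."
copyAnswer add: #j3.

"Changing the actual set alters this one index behind the owner's back."
noCopyAnswer add: #j4.
```

After this runs, the index entry for 'August' contains #j4 but not #j3, which is the kind of unsynchronised change the application must avoid.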

From a concurrency point of view, remember that this returned virtual set will be locked each time a message is sent to it in a read-only transaction having #isolationDegree=2 (short read-locks). When using #for:equals:, by contrast, the returned set is not virtual; it is a real set of virtual objects, and only its elements will be locked when sent messages, which may make some difference to concurrency. If the transaction’s #isolationDegree=3, all locks are held until the transaction rolls back or commits, so it makes no difference.

jc


Thu 14th Aug 2008   03:08 PM
posted by John Clapperton

A user has contributed the following optimisation, which will be included in the next release.

Add the following method:

VORefPrivate>>isVOKeyCollection
"Private"

 ^false

The addition of this method improves concurrency by eliminating unnecessary object locking of virtual objects which are being used as keys in a VirtualDictionary or VirtualDictionarySet.

When virtual objects are used as keys they behave as identity keys, as only their global object IDs are compared, and both components of this (the virtual object ID prefixed by the ID of the virtual space in which it exists) are present in every VORef proxy for that object, hence the key comparison message is not forwarded to the object itself and so the object is not locked into the transaction.


Thu 7th Aug 2008   12:08 PM
posted by John Clapperton

Questions have been asked about what each transaction is doing in the 40tps benchmark mentioned in the previous post.

The test ‘application’ is a loop which each time around commits a transaction which creates two new virtual objects, each of which is an Array of two elements, the first a Float, the second a String of 65 characters, which is added into a VirtualDictionary using the integer part of the Float as the key. The application updates a scrolling window on each commit. Garbage collection is switched off, MVRC/MVCC Versioning is switched off. The log archive delay interval is 1000ms. The test is run for 30 seconds and the dictionary size is about 20000 elements.
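The shape of the work done in each transaction can be sketched in plain Smalltalk. This is only an illustration under stated assumptions: an ordinary Dictionary stands in for the VirtualDictionary, the Float values and the string contents are made up, only five iterations are run, and the transaction commit and window update are shown as a comment because their calls are not given here.

```smalltalk
"Plain Smalltalk stand-in for the benchmark loop body; not the actual test code."
| dict payload |
dict := Dictionary new.                  "stands in for the VirtualDictionary"
payload := String new: 65 withAll: $x.   "hypothetical 65-character String"

1 to: 5 do: [:txn |
    "One transaction: two new two-element Arrays (a Float and a String)."
    1 to: 2 do: [:n |
        | value |
        value := (txn * 2 + n) + 0.25.   "hypothetical Float"
        "The integer part of the Float is used as the key."
        dict at: value truncated put: (Array with: value with: payload)].
    "The commit and the scrolling-window update would happen here."].
```

Changing the inner loop bound from 2 to 4 or to 1 gives the shape of the other two cases compared in this post.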

For comparison, if the loop is set to add four such objects in each transaction, the commit rate falls to 30tps. Or if just one object is created and added to the VirtualDictionary the commit rate increases to 50tps (with the log archive delay interval reduced to 750ms, so as not to trigger re-initialization of the log archive process, which by default happens when there are 100 transactions in the log buffer file).

John

 

 

