Old version still run through view mapper

Sep 15, 2015 at 1:43 PM
It appears that all "versions" of a given object are run through the view mapper on rebuild, so if you saved an object 3 times it'll be mapped 3 times. Using DeleteBeforeInsert prevents duplicates, but it seems odd that this would occur. Is this behavior intended? If so, I'd like a way to prevent mapping the "old" versions on rebuild, since this creates long delays for view rebuilds on databases that are both large and active.

Thanks
Sep 15, 2015 at 2:58 PM
Hello Dr!

Yes, this is by design. The only way to ensure data integrity on rebuild is to go through the appended doc storage one-by-one and redo the mapping operation for the view. Hence, exactly as you said, if DeleteBeforeInsert=false you will see duplicates for the same doc GUID.
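
To make the replay concrete, here is a minimal sketch in Python. It is a conceptual model only, not RaptorDB's code or API: `rebuild_view`, `part_mapper`, and the list-of-tuples storage are hypothetical names invented for illustration. It assumes that every save appends a new version and that delete-before-insert removes rows previously emitted for the same docid.

```python
from typing import Callable, Dict, List, Tuple

# (docid, payload): every save appends a new tuple; nothing is overwritten.
# This layout is an assumption for illustration, not RaptorDB's file format.
StoredDoc = Tuple[str, Dict[str, str]]
Row = Dict[str, str]


def rebuild_view(
    storage: List[StoredDoc],
    mapper: Callable[[str, Dict[str, str]], List[Row]],
    delete_before_insert: bool,
) -> Tuple[List[Row], int]:
    """Replay all stored versions oldest-first; return (view rows, mapper calls)."""
    rows: List[Row] = []
    mapper_calls = 0
    for docid, payload in storage:
        if delete_before_insert:
            # Drop rows emitted by earlier versions of this document.
            rows = [r for r in rows if r["docid"] != docid]
        rows.extend(mapper(docid, payload))
        mapper_calls += 1
    return rows, mapper_calls


def part_mapper(docid: str, payload: Dict[str, str]) -> List[Row]:
    # Hypothetical mapper: one view row per document version.
    return [{"docid": docid, "name": payload["name"]}]


# Document "a" was saved three times, document "b" once.
storage: List[StoredDoc] = [
    ("a", {"name": "bolt v1"}),
    ("a", {"name": "bolt v2"}),
    ("a", {"name": "bolt v3"}),
    ("b", {"name": "nut"}),
]

rows_dupes, calls_dupes = rebuild_view(storage, part_mapper, delete_before_insert=False)
rows_clean, calls_clean = rebuild_view(storage, part_mapper, delete_before_insert=True)

print(len(rows_dupes), calls_dupes)  # 4 rows, 4 mapper calls
print(len(rows_clean), calls_clean)  # 2 rows, 4 mapper calls
```

In this model the flag changes what ends up in the view, but not how many times the mapper runs: all four stored versions are mapped either way, which matches the behavior described in the question.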

"long delays" will typically be seconds which is the price paid for data integrity (try doing the same on RDBMS :) ).
Sep 15, 2015 at 3:04 PM
Hmm, is there anything in the storage file that indicates that an object is no longer the "live" version? If so, how would not mapping that object impact data integrity?

Thanks
Sep 15, 2015 at 3:14 PM
Also, when a database is large and active, the rebuild can take far longer than a few seconds. Here's an example of our Part database taking about 20 minutes to rebuild.

2015-09-01 10:31:23 |DEBUG|5|RaptorDB.Views.ViewHandler|| View = DBWPartView
2015-09-01 10:51:52 |DEBUG|5|RaptorDB.Views.ViewHandler|| rebuild view 'DBWPartView' done (s) = 1228.8267275
Sep 15, 2015 at 3:31 PM
Since the doc storage is "immutable", implementing a "live" feature would be very difficult (in fact I haven't thought about it, so I can't really say).

The problem is that RaptorDB, unlike the developer, does not know what the application does, so it can't make judgment calls about whether to use only the "last" document without losing data integrity (that is, whether omitting a doc will affect the overall outcome in a series of docs).
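
One possible reading of this concern, sketched in Python under stated assumptions: skipping old versions is harmless for a view that keeps one row per document, but it changes the result for a view that is meant to keep a row for every saved version. Everything here (`replay`, `latest_versions`, `price_mapper`, the storage layout) is hypothetical and is not RaptorDB's API.

```python
from typing import Dict, List, Tuple

StoredDoc = Tuple[str, Dict[str, int]]  # (docid, payload), appended on every save
Row = Dict[str, object]


def price_mapper(docid: str, payload: Dict[str, int]) -> List[Row]:
    # Hypothetical mapper: one row per mapped version.
    return [{"docid": docid, "price": payload["price"]}]


def replay(storage: List[StoredDoc], delete_before_insert: bool) -> List[Row]:
    """Map every entry in storage order, optionally replacing earlier rows."""
    rows: List[Row] = []
    for docid, payload in storage:
        if delete_before_insert:
            rows = [r for r in rows if r["docid"] != docid]
        rows.extend(price_mapper(docid, payload))
    return rows


def latest_versions(storage: List[StoredDoc]) -> List[StoredDoc]:
    """Keep only the last appended version of each docid (the 'live' docs)."""
    last_index = {docid: i for i, (docid, _) in enumerate(storage)}
    return [storage[i] for i in sorted(last_index.values())]


storage: List[StoredDoc] = [
    ("p1", {"price": 10}),
    ("p2", {"price": 5}),
    ("p1", {"price": 12}),  # second save of p1
]

# Current-state view: skipping old versions gives the same rows.
full_current = replay(storage, delete_before_insert=True)
live_current = replay(latest_versions(storage), delete_before_insert=True)
print(full_current == live_current)  # True

# History-style view (no delete-before-insert): skipping old versions loses a row.
full_history = replay(storage, delete_before_insert=False)
live_history = replay(latest_versions(storage), delete_before_insert=False)
print(len(full_history), len(live_history))  # 3 2
```

Only the developer knows which of the two kinds of view is intended, which is why an engine that cannot see that intent would have to replay everything to be safe.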
Sep 15, 2015 at 3:32 PM
Wow!

How many items have you got in your DB that it takes so long?

What is your hardware?
Sep 15, 2015 at 3:50 PM
Edited Sep 15, 2015 at 3:56 PM
rap.DocumentCount() reports 457998 documents

The server is a VM on a much larger system; it's been allocated 8 Xeon 2670 cores and 8 GB of RAM.
Sep 15, 2015 at 3:56 PM
That is strange!

It should take a couple of minutes on such a system.

How long does the test 100,000 invoice sample take?

How big are your docs, and how many columns are in your view?
Sep 15, 2015 at 4:16 PM
I'll run some tests with the sample project.

The docs are large, and each contains a list of objects of its own type, though I've limited this list to 2 levels deep for performance reasons. If I had to do it over, I'd opt for a list of object references instead of a list of the actual objects. Live and learn.

We only use 5 columns (plus docid) in the views. At one time we had 24 columns in this view but removed all that were not being used to try to reduce rebuild time.

Does DocumentCount include previous object versions?
Sep 15, 2015 at 4:33 PM
2015-09-15 10:22:32|DEBUG|12|RaptorDB.Views.ViewHandler|| rebuild view 'SalesInvoice' done (s) = 37.052409

Looks like 37 seconds to rebuild 100,000 documents. However, if I modify the data in each invoice and save it back, the view rebuild (predictably) takes about twice as long:

2015-09-15 10:30:31|DEBUG|6|RaptorDB.Views.ViewHandler|| rebuild view 'SalesInvoice' done (s) = 71.6611654
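
For comparison, a small Python sketch that turns the timings quoted in this thread into throughput figures. The document counts and durations are copied from the posts above. Two things are assumptions: that the count reported by DocumentCount() is the right denominator for the Part database (whether it includes previous versions is still an open question here), and that the second invoice run holds exactly two stored versions per invoice.

```python
# (documents, rebuild seconds) as quoted in the thread.
part_docs, part_secs = 457_998, 1228.8267275
invoice_docs, invoice_secs = 100_000, 37.052409
resaved_secs = 71.6611654

# Assumption: after modifying and re-saving every invoice once,
# storage holds two versions per invoice.
resaved_versions = 2 * invoice_docs

part_rate = part_docs / part_secs                 # docs per second, Part view
invoice_rate = invoice_docs / invoice_secs        # docs per second, sample view
resaved_rate = resaved_versions / resaved_secs    # stored versions per second
slowdown = resaved_secs / invoice_secs            # effect of one extra version each
gap = invoice_rate / part_rate                    # sample vs Part per-document speed

print(f"Part view:        {part_rate:8.1f} docs/s")
print(f"Invoice sample:   {invoice_rate:8.1f} docs/s")
print(f"Invoice re-saved: {resaved_rate:8.1f} versions/s")
print(f"Re-save slowdown: {slowdown:.2f}x")
print(f"Sample is {gap:.1f}x faster per document than the Part view")
```

Under these assumptions the sample rebuilds at a few thousand documents per second and the Part view at a few hundred, and doubling the stored versions roughly doubles the rebuild time. That pattern would be consistent with rebuild cost scaling with the number of stored versions rather than the number of live documents, though the larger Part documents may account for part of the gap.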