lucene-based semantic-repository 0.5.2: major performance improvement, now 24,000 items imported in 19 minutes

when we started using semantic repository, we had only one lucene index to make our content searchable.
later we added another integration, with a php based service called aawaj.

the aawaj service had more than 150,000 items to index. we tried to index all of that content with our current release, 0.5.1, but ended up with severe performance problems. we then released another version, 0.5.2, where we added queued request handling and moved index optimization behind a restful service uri: /rest/service/optimize/

here is a simple benchmark report:

version 0.5.1: first 100 items indexed in 13.611 seconds.
version 0.5.2: first 100 items indexed in 5.6152 seconds.
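
to put the two timings side by side, here is a small sketch that works out the per-item cost and the speedup from the numbers above. the class and method names (BenchmarkMath, perItemMillis, speedup) are made up for this illustration and are not part of semantic repository. the result is roughly 136 ms per item for 0.5.1 against roughly 56 ms per item for 0.5.2, a speedup of about 2.4x.

```java
class BenchmarkMath {
    // average cost of one item in milliseconds
    static double perItemMillis(double pTotalSeconds, int pItems) {
        return pTotalSeconds * 1000.0 / pItems;
    }

    // how many times faster the new run is compared to the old one
    static double speedup(double pOldSeconds, double pNewSeconds) {
        return pOldSeconds / pNewSeconds;
    }

    public static void main(String[] args) {
        // timings taken from the benchmark report: first 100 items
        System.out.println("0.5.1 ms per item: " + perItemMillis(13.611, 100));
        System.out.println("0.5.2 ms per item: " + perItemMillis(5.6152, 100));
        System.out.println("speedup: " + speedup(13.611, 5.6152));
    }
}
```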

the difference is significant. later today we ran another import on our repository and, interestingly, it took 1 hour to index 150,000 items, which was a bit surprising since we were unable to do it with 0.5.1.

what changed: we added a single thread executor, which keeps every request in a queue and executes them one by one, so we could remove the synchronized method.

here is some example code:

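private final Executor mIndexTaskExecutor = Executors.newSingleThreadExecutor(); // assumed from the text above
public void addDocument(final Document pDocument) {
    mIndexTaskExecutor.execute(new Runnable() { // queued, one task runs at a time
        public void run() {
            // write pDocument to the index here (this part is not shown in the post)
        }
    });
}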
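
to try the same queueing pattern without lucene on the classpath, here is a minimal self-contained sketch. it is not the semantic repository code: QueuedIndexer, shutdownAndGet and the in-memory list standing in for the index are assumptions made for this example, and documents are plain strings instead of lucene Document objects. the only real api used is java.util.concurrent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class QueuedIndexer {
    // stand-in for the lucene index (hypothetical); only the worker thread writes to it
    private final List<String> mIndex = new ArrayList<String>();

    // a single worker thread: queued tasks run one at a time, in submission order,
    // so the write itself does not need a synchronized method
    private final ExecutorService mIndexTaskExecutor =
            Executors.newSingleThreadExecutor();

    // returns immediately; the actual write happens later on the worker thread
    public void addDocument(final String pDocument) {
        mIndexTaskExecutor.execute(new Runnable() {
            public void run() {
                mIndex.add(pDocument); // real code would write to the index here
            }
        });
    }

    // stops accepting new work, waits for the queue to drain, returns a copy of the index
    public List<String> shutdownAndGet(long pTimeoutSeconds) {
        mIndexTaskExecutor.shutdown();
        try {
            if (!mIndexTaskExecutor.awaitTermination(pTimeoutSeconds, TimeUnit.SECONDS)) {
                throw new IllegalStateException("index queue did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the queue", e);
        }
        return new ArrayList<String>(mIndex);
    }

    public static void main(String[] args) {
        QueuedIndexer indexer = new QueuedIndexer();
        for (int i = 0; i < 100; i++) {
            indexer.addDocument("doc-" + i);
        }
        System.out.println("indexed items: " + indexer.shutdownAndGet(10).size());
    }
}
```

the design trade-off is that addDocument no longer blocks the caller: requests pile up in the queue and are written in the order they arrived, so callers only find out that a write finished by waiting for the queue to drain.
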

the semantic repository service is intended for indexing content from different sources, maintaining multiple indexes for different types of content, and performing different types of search. it is yet another solr-like indexing service on top of lucene, but it will gradually support content versioning and more semantic search results.
