11.06.2009

SnTT: Performance considerations for Instr

>>Author:    Bernd Hort
>>Location:  Hamburg
        
URL: http://www.assono.de/blog/d6plinks/PerformanceInstr

Category: Show-n-Tell Thursday, LotusScript


I had to learn the hard way that there are big differences between using the LotusScript function Instr case-sensitively and case-insensitively. By big differences I mean something like 900 times slower when using the option "case-insensitive, pitch-insensitive".
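For orientation, here is a minimal sketch of how the comparison option is selected through the fourth argument of Instr. The variable name sample and the search strings are made up for illustration. The values 0, 1, 4 and 5 are the documented compMethod settings.

```lotusscript
' Illustrative only: "sample" and the search term are not from the measurements.
Dim sample As String
sample = "Lorem Ipsum"

Print Instr(1, sample, "ipsum", 0)  ' case-sensitive, pitch-sensitive
Print Instr(1, sample, "ipsum", 1)  ' case-insensitive, pitch-sensitive
Print Instr(1, sample, "ipsum", 4)  ' case-sensitive, pitch-insensitive
Print Instr(1, sample, "ipsum", 5)  ' case-insensitive, pitch-insensitive
```

When the fourth argument is omitted, Instr follows the module's Option Compare setting, which is case-sensitive and pitch-sensitive unless stated otherwise.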

If the text to search is small, you will hardly notice any difference. As the text gets bigger, though, the difference becomes noticeable. For the example where I measured the roughly 900-fold slowdown, I used a fairly long "Lorem ipsum" text (5539 characters) and concatenated it with itself ten times. Because each pass of the first loop below appends the text to itself, the string doubles every time and ends up holding 1024 copies of the original. For the measurement I used a profiling routine from Thomas Bahn.

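' Build a very long search text: each pass doubles the string.
For i = 1 To 10
       text = text & CRLF & text
Next

' Default comparison (case-sensitive unless Option Compare says otherwise).
For i = 1 To 1000
       Call StartProfiling("InstrTest", "case-sensitiv")
       pos = Instr(text, "Defacto")
       Call StopProfiling("InstrTest", "case-sensitiv")
Next

' compMethod 5 = case-insensitive, pitch-insensitive.
For i = 1 To 1000
       Call StartProfiling("InstrTest", "case-insensitiv")
       pos = Instr(1, text, "Defacto", 5)
       Call StopProfiling("InstrTest", "case-insensitiv")
Next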

Call LogProfiles()
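
The StartProfiling, StopProfiling and LogProfiles routines are not shown in this post. As a rough stand-in, the following sketch times one of the loops with the built-in Timer function only. It is an assumption about how such a measurement could look, not the profiling routine itself, and the variable sourceText is a placeholder for the long text.

```lotusscript
' Minimal timing sketch, assuming sourceText already holds the long text.
Dim sourceText As String
Dim startTime As Single
Dim elapsed As Single
Dim pos As Long
Dim i As Integer

startTime = Timer                 ' seconds since midnight
For i = 1 To 1000
    pos = Instr(1, sourceText, "Defacto", 5)
Next
elapsed = Timer - startTime       ' wrong if the run crosses midnight

Print "case-insensitive, 1000 calls: " & Format$(elapsed, "0.00") & " s"
```

Timer is rounded to the nearest hundredth of a second, so this is only useful for the total over many calls, not for a single call.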


The results are quite astonishing.
Option            Calls   Execution time (total)   Execution time (average)
case-insensitiv   1000    459.640                  0.45964
case-sensitiv     1000    0.063                    0.00006



In my application I needed the case-insensitive search to make sure I found all entries. Trading memory for speed is a common approach, so I stored a lower-case copy of the text in a second variable and searched that copy to find the position.

For i = 1 To 1000
       Call StartProfiling("InstrTest", "case-sensitiv mit Lcase")
       ' The Lcase of the long text sits inside the loop only so that it is measured.
       textSmallLetters = Lcase(text)
       pos = Instr(textSmallLetters, Lcase("Defacto"))
       Call StopProfiling("InstrTest", "case-sensitiv mit Lcase")
Next


The performance gain with this approach was quite good. It is not as fast as the case-sensitive option, but it was good enough for my application.
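
When several terms are searched in the same text, the lower-case copy only has to be built once. The following is a minimal sketch of that idea. The Sub name FindTermsIgnoringCase and its parameters are hypothetical and not taken from the production code.

```lotusscript
' Hypothetical helper: lower-case the long text once, then search many terms.
Sub FindTermsIgnoringCase(sourceText As String, terms As Variant)
    Dim lowered As String
    Dim pos As Long
    Dim i As Integer

    ' The expensive conversion happens once, outside the loop.
    lowered = Lcase(sourceText)

    For i = Lbound(terms) To Ubound(terms)
        ' compMethod 0 = case-sensitive, pitch-sensitive (the fast path).
        pos = Instr(1, lowered, Lcase(terms(i)), 0)
        Print terms(i) & ": " & Cstr(pos)
    Next
End Sub
```

It could be called with a string array of search terms, for example one holding "Defacto" and "Lorem". Note that this comparison stays pitch-sensitive, so it is not a full replacement for option 5 where pitch matters.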

Comments

#1 If you're searching for multiple items in the same text, do the Lcase outside of the loop.

And you didn't post the numbers for the last loop.
#2 You are right. The Lcase is inside the loop only for testing purposes. Much of the performance gain comes from doing the conversion only once, which is how I use it in the production code.

For the last loop the total was 240.453 and the average was 0.24045. To be honest, those numbers are not that impressive.
But as you pointed out, the effect is bigger if you do the conversion only once and search for multiple items.