The performance impact of case-insensitive regular expressions

Bit of a quirky title this one… A set of scripts I have been coding relies heavily on regular expressions to do pattern matching and substitution, and I have noticed of late that the performance has been suffering somewhat. I decided to do a bit of investigation to see what was going on.

When I originally wrote the code I used case-insensitive matching to ensure that the results wouldn’t be thrown off by another developer making a subtle change later on. It turns out that the performance impact of doing this type of checking is huge!

To put things into perspective, my code was originally taking 20 seconds to read through a logfile (only about 5MB) and parse it for certain patterns. Now that the case-insensitive matching has been disabled, it takes less than 1 second!

Just for fun I rewrote the code to try lowercasing the input string first and then trying a normal pattern match (as I could ensure the text was always lower-case) and that also turned out to be faster (though not as fast as no transformation obviously).
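The three approaches above can be compared with a small benchmark. This is only a sketch in Python: the post doesn't say which language or patterns the scripts use, so the log lines, the pattern and the function names here are all made up for illustration. How big the gap is will depend heavily on the regex engine and the pattern.

```python
import re
import timeit

# Hypothetical log lines: the original logfile and patterns are not shown in
# the post, so this data is made up purely for illustration.
LINES = [
    f"12:00:{i % 60:02d} worker[{i}] {'ERROR' if i % 2 else 'error'}: disk {i % 7} failed"
    for i in range(20000)
]

PATTERN = r"error: disk (\d+) failed"
SENSITIVE = re.compile(PATTERN)
INSENSITIVE = re.compile(PATTERN, re.IGNORECASE)


def count_insensitive(lines):
    # Approach 1: let the regex engine ignore case.
    return sum(1 for line in lines if INSENSITIVE.search(line))


def count_lowered(lines):
    # Approach 2: lowercase the input first, then match case-sensitively.
    return sum(1 for line in lines if SENSITIVE.search(line.lower()))


def count_sensitive(lines):
    # Approach 3: plain case-sensitive match. This only finds the lines that
    # already use the expected case, so it is only safe when the input case
    # is guaranteed.
    return sum(1 for line in lines if SENSITIVE.search(line))


if __name__ == "__main__":
    for fn in (count_insensitive, count_lowered, count_sensitive):
        seconds = timeit.timeit(lambda: fn(LINES), number=5)
        print(f"{fn.__name__}: {seconds:.3f}s")
```

Note that the third variant misses the upper-case lines in this sample data, which is exactly the risk case-insensitive matching was meant to guard against.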

Lesson of the day: Only use case-insensitive matching where you have to, as it will SERIOUSLY impact performance when working with large data sets!

Posted in Computing

Relationships and differences in perspectives…

Well, I can’t say that one lasted too long… There’s nothing really nice about flying back from a week away to find things are over, but sometimes I guess it is for the best. After a long time driving back home it gave me some time to think about our differences and how some people really do see things differently.

I guess we all have to be prepared for the unexpected, and sometimes that people really aren’t who they seem :-(

Posted in Personal

KVM VirtIO Network Bandwidth + Tuning

After my recent move back to KVM I decided to take a look at the network performance issues I was facing… I have a few apps running which require a fair amount of bandwidth (>200MB/sec continuously) and so I needed to do some tweaking to get things performing nicely…

Firstly, vhost_net… This module makes KVM networking perform MUCH better by doing multiple things, including removing the need for packet checksumming for internal traffic. As a result, network throughput goes up by an order of magnitude! Enabling it is pretty simple (ensure the module is loaded at boot, and configure libvirt to use it). This alone essentially solved my network problem, however in the interest of breaking things, I decided to play some more…
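As a rough sketch of the "is the module loaded" check, the snippet below looks for vhost_net (the module named later in this post) in the contents of /proc/modules. The function name is made up, and the check assumes vhost_net was built as a loadable module; if it is compiled into the kernel it won't be listed there.

```python
def module_loaded(name, proc_modules_text):
    """Return True if `name` appears as a module in /proc/modules content."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first field of each /proc/modules line is the module name.
        if fields and fields[0] == name:
            return True
    return False


if __name__ == "__main__":
    with open("/proc/modules") as handle:
        loaded = module_loaded("vhost_net", handle.read())
    print("vhost_net loaded" if loaded else "vhost_net not listed in /proc/modules")
```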

Next came jumbo frames… While my home network can’t support them, internal VM traffic should be able to without problems. In order to do this, I created a new bridge on the hypervisor for internal-only traffic and added a tap interface (Ubuntu seems to need this to bring the bridge up properly). Then by changing the MTU on the tap interface, the bridge automatically adapts. Note: A bridge will always use the lowest MTU of any NIC that is assigned to it, so they all need to match. The newer version of qemu checks what the MTU of a bridge is when assigning a vNIC to it, and sets the MTU of the vNIC to match automatically. Finally, with a bit of tweaking inside the VM, I was using a larger MTU. I tested with 9K, but found that 64K actually gave me the best results.
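The "lowest MTU wins" rule is easy to check from sysfs. The sketch below is an assumption-laden illustration: the bridge name br-internal and the function names are made up, and it relies on the standard Linux layout where /sys/class/net/BRIDGE/brif lists the ports and /sys/class/net/IFACE/mtu holds each interface's MTU.

```python
from pathlib import Path


def bridge_member_mtus(bridge, sysfs_root="/sys/class/net"):
    """Map each port attached to `bridge` to its MTU, read from sysfs."""
    root = Path(sysfs_root)
    ports = sorted(entry.name for entry in (root / bridge / "brif").iterdir())
    return {port: int((root / port / "mtu").read_text()) for port in ports}


def effective_mtu(member_mtus):
    """The bridge ends up with the lowest MTU of any attached interface."""
    return min(member_mtus.values())


def mismatched_ports(member_mtus, wanted_mtu):
    """Ports whose MTU differs from the one you meant to set everywhere."""
    return sorted(port for port, mtu in member_mtus.items() if mtu != wanted_mtu)


if __name__ == "__main__":
    mtus = bridge_member_mtus("br-internal")  # hypothetical bridge name
    print("effective MTU:", effective_mtu(mtus))
    print("ports not at 9000:", mismatched_ports(mtus, 9000))
```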

Lastly was some TCP tweaking with the sysctl command… By changing the buffer sizes, the maximum TCP window size and a few other options I was able to squeeze the most out of it. After some testing, I have ended up with the results below:
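The exact settings aren't listed here, so the following is only a sketch of one common way to pick a buffer size: the bandwidth-delay product (bandwidth multiplied by round-trip time). The 10 Gbit/s and 100 microsecond figures are illustrative assumptions, not measurements from this setup, and net.core.rmem_max / net.core.wmem_max are standard Linux sysctl keys chosen for the example rather than the ones known to have been changed.

```python
def bdp_bytes(bandwidth_bits_per_sec, rtt_microseconds):
    """Bandwidth-delay product in bytes, using integer arithmetic."""
    return bandwidth_bits_per_sec * rtt_microseconds // 8_000_000


def sysctl_lines(max_buffer_bytes):
    """Format illustrative sysctl.conf entries for the socket buffer limits."""
    return [
        f"net.core.rmem_max = {max_buffer_bytes}",
        f"net.core.wmem_max = {max_buffer_bytes}",
    ]


if __name__ == "__main__":
    # Assumed figures: 10 Gbit/s link with a 100 microsecond round trip.
    size = bdp_bytes(10_000_000_000, 100)
    print("\n".join(sysctl_lines(size)))
```

The bandwidth-delay product is a lower bound on the buffer needed to keep the link full, so treat it as a starting point for testing rather than a final value.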

root@plex:~# iperf -c 10.0.67.2
------------------------------------------------------------
Client connecting to 10.0.67.2, TCP port 5001
TCP window size: 1.23 MByte (default)
------------------------------------------------------------
[ 3] local 10.0.67.204 port 41170 connected with 10.0.67.2 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-10.0 sec 33.7 GBytes 28.9 Gbits/sec
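As a quick sanity check on those numbers, the two columns agree with each other if you assume iperf counts a GByte as 1024^3 bytes and a Gbit as 10^9 bits (the figures only line up under that assumption). The function name is made up for this sketch.

```python
GIBIBYTE = 1024 ** 3  # assumption: iperf's "GBytes" are 1024-based


def gbits_per_sec(transfer_gbytes, seconds):
    """Convert a transfer size and interval into decimal Gbits/sec."""
    bits = transfer_gbytes * GIBIBYTE * 8
    return bits / seconds / 1e9


if __name__ == "__main__":
    # 33.7 GBytes in 10.0 seconds, as reported in the iperf output.
    print(f"{gbits_per_sec(33.7, 10.0):.1f} Gbits/sec")
```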

Let’s be honest… That isn’t exactly a small figure! It’s way more than I need, but what I have found is that the CPU usage is considerably lower on my VMs than it was previously (even compared to after vhost_net was enabled). Overall, a win for KVM :-)

Posted in Computing

Switching from KVM to VMWS, and back again…

So in what I can only consider to be a foolish (i.e. drunken) idea, I decided to move from using KVM to host a number of VMs on my server to VMware Workstation… In my head, this was so that I could get better performance when running CPU intensive tasks, and hopefully improve the network performance as well.

While the move was relatively painless (using qemu-img and vmware-vdiskmanager), what followed was not… While my VMs that do very little (the mail server for example) were running fine, those that required a high level of throughput and performance were not doing well at all. My network performance had dropped to a third of the level I had when using KVM. On top of that, when using ffmpeg to transcode video, its performance was also lagging behind. I went through checking the usual settings (that VT/EPT was enabled) and all looked good.

To make matters worse (yes, it did actually get worse!), whenever a number of VMs were very active, I was unable to run commands on the actual physical system! Despite changing the scheduler and the priority of the running VMs, it was still freezing up the SSH sessions and the physical console :-(

So now, I have migrated back to using KVM again. A simple qemu-img command converted the images back to raw format, and now with everything running again the performance is back to a decent level. I must admit I didn’t expect it to be this way around, however it looks as if KVM is actually doing a really good job these days. I’m actually looking forward to the upcoming changes that are being worked on (moving the I/O controllers etc into the kernel itself, rather than userspace) as they should further increase performance and also make things a lot more stable.
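For reference, the conversion back to raw looks roughly like the sketch below. The file names are hypothetical, and the source format is assumed to be vmdk since the images came from VMware Workstation; -f and -O are the input and output format options of qemu-img convert. The helper only builds the command so it can be inspected before anything is run.

```python
import shlex


def qemu_img_convert_cmd(src, dst, src_format="vmdk", dst_format="raw"):
    """Build the argument list for converting a disk image with qemu-img."""
    return ["qemu-img", "convert", "-f", src_format, "-O", dst_format, src, dst]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = qemu_img_convert_cmd("mailserver.vmdk", "mailserver.img")
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```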

Lesson learned: Using VMWS to run your production VMs is a very bad idea, and a slow one at that!

Posted in Computing

And then I was in a relationship :-)

Well, it’s been a good few weeks, made even better by the fact I’m now in a relationship with a very lovely lady :-) Fingers crossed it works out :D

Us

Posted in Personal

Pas de la Casa

Well, having a week away was actually great, even if I did sprain my ankle and end up not able to snowboard for half of the holiday… Good days, good nights, and bad hangovers :D

Me Snowboarding

Posted in Personal

Hotel minibar prices

So as I travel a lot, I tend to have something out of a minibar more often than not…. While I understand paying extra for the privilege (if you can call it that) of having a number of food/drink items in your room, I find it beyond belief the price some hotels are charging…

So for example, the hotel I am currently in charges €1.75 for a chocolate bar, €3.75 for a small bottle of water, and €5 for a small bottle of Red Bull. There is nothing special about any of the items in the minibar; they are just small versions of the stuff you would pick up in a newsagent’s, at a rather inflated price!

I wonder if hotel chains have actually considered how much more business they would do if the prices were lower… The trade-off in business when it comes to profit is price of item against volume of sales. Simply put, selling items at a lower price requires you to sell more to get to the same level of profit as selling fewer expensive items. The problem is, it comes down to people’s mindsets, especially in a hotel. While some people like myself will occasionally swallow our pride and take something out of the minibar, the majority of people would be put off by the price (not exactly value for money!).
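To put a number on that trade-off, here is a small worked example. The €3.75 water price is the one quoted above; the 50 cent cost to the hotel, the half-price figure of €1.88 and the ten bottles sold are pure assumptions for illustration.

```python
def units_for_same_profit(old_price, new_price, unit_cost, old_units):
    """Units to sell at new_price to match the old profit (prices in cents)."""
    new_margin = new_price - unit_cost
    if new_margin <= 0:
        raise ValueError("new price must be above the unit cost")
    old_profit = (old_price - unit_cost) * old_units
    # Ceiling division, since you can't sell a fraction of a bottle.
    return -(-old_profit // new_margin)


if __name__ == "__main__":
    # Water at 375 cents vs. 188 cents, assumed cost of 50 cents per bottle.
    print(units_for_same_profit(375, 188, 50, 10))
```

With these made-up costs, halving the price means selling 24 bottles instead of 10 to earn the same profit, so sales have to rise by more than double before the hotel comes out ahead.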

I would love to see a report from a hotel that shows that having the minibar items priced sky high actually earns them more money, compared to having it priced sensibly and advertising the fact. From my perspective, if the minibar was half the cost that it currently is, I would have had more than double the number of items out of it. I’m quite fond of a number of items in there, but I wouldn’t pay that much for them!

My little rant about how some businesses just don’t understand people’s buying behavior…

Posted in Travel

The impact of weekly/monthly pay and budgeting for bills

I’ve been thinking about this a lot recently, as I know many people who have trouble paying their monthly bills, not because they don’t earn enough or spend every penny on drinking etc, but because of the schedule of their bills. After having a direct debit bounce last month (the only DD I have which goes out two days before payday) due to some overspending at Christmas, I got thinking about how the system could be a lot better.

Myself, I get paid monthly and try to schedule each bill to go out either on payday or a couple of days after (not always possible as I don’t get paid at the start of the month). I take this approach because it means that a few days after payday, whatever money I have left in my account is what I have for the month. This makes it so much easier to budget and work out what you can do without spending more than you have.

I also remember when I used to get paid weekly (and how much of a pain that was!). While getting paid weekly seems nice in one regard, with a billing cycle of 1 month it makes it a lot more difficult to actually manage your finances and ensure you have enough to cover everything you owe etc.

To me it seems strange that everyone is missing a trick when it comes to direct debits and scheduling them on a specific date… Having your bills go out automatically when you get paid is, in my honest opinion, brilliant. I don’t have to ring up any companies to do manual payments, I don’t have to start working out how much money I will have each week in the month, I simply know what I have left to last and I can plan accordingly.

I think it’s a shame that more people don’t take this approach, and that some companies refuse to move the date of a direct debit to make it more convenient for the end customer. I honestly believe that if more people took this approach and if all companies allowed you to change the date a payment went out, people would be in a lot less financial trouble and would have less stress due to not worrying about bills etc. It would also give people better visibility of how much they earn and how much they spend.

Posted in Personal

A new year…

Well, 2011 is no more, and in comes 2012… Hopefully this year goes better than last, and as always I have set myself some targets for the coming 12 months…

1. Lose weight!!
2. Be more pro-active with my open-source projects
3. Learn another language (at least to speak it)

I have no idea if I will actually manage any of the above, but one can hope ;)

Posted in Personal

RHEL6.2 – Beware of the RAM!

One of the things which has caught me out recently is the new release of RHEL (6.2). As a lot of my test systems are virtual machines I try to keep the CPU/RAM settings as low as possible without sacrificing performance etc. It seems that the RAM requirements of 6.2 have increased somewhat, even from an installation perspective….

I found that the install of a 6.2 system (via the GUI) was failing on a system with 1GB RAM due to it running out of RAM when building the initramfs. Even more annoying is the fact this error isn’t reported, and the installation proceeds as if nothing has gone wrong. When you reboot you simply end up with a broken system :-(

Even the console installation requires more RAM than usual! I’m not sure what Red Hat have added into the 6.2 release but something sure seems hungry…

Lesson of the day: When using RHEL6.2, double your RAM during installation!
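If you build a lot of test VMs, a quick pre-flight check of the memory can save a broken install. The sketch below parses the MemTotal line of /proc/meminfo (reported in kB); the function names and the 1800 MiB threshold are assumptions, chosen only because a VM given 2GB usually reports slightly less than 2048 MiB, and are not a figure from Red Hat.

```python
MIN_INSTALL_MIB = 1800  # assumed threshold, roughly "double the 1GB that failed"


def mem_total_mib(meminfo_text):
    """Extract MemTotal from /proc/meminfo content and return it in MiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) // 1024  # value is reported in kB
    raise ValueError("MemTotal not found")


def enough_ram_for_install(meminfo_text, minimum_mib=MIN_INSTALL_MIB):
    """True if the machine has at least minimum_mib of RAM."""
    return mem_total_mib(meminfo_text) >= minimum_mib


if __name__ == "__main__":
    with open("/proc/meminfo") as handle:
        text = handle.read()
    print(mem_total_mib(text), "MiB, enough:", enough_ram_for_install(text))
```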

Posted in Computing