What is Heuristic Scanning?

A typical virus scanner has at least two modes of operation. The first is what we usually term “signature-based scanning”, which looks for known viruses by comparing the contents of the file being scanned against a unique signature pattern that corresponds to a particular virus (it’s a bit more complex than this; I’m keeping it simple). This accounts for most of the detection work done by a modern AV scanner, because most of the viruses a computer receives are already known.
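To make the signature idea concrete, here is a minimal sketch in Python. It is not how any particular AV product works: the signature names and byte patterns are invented for illustration, and real scanners use far more elaborate matching than a plain substring search.

```python
from typing import Dict, List

# Hypothetical signature database: virus name -> byte pattern.
# Both the names and the patterns are made up for this example.
SIGNATURES: Dict[str, bytes] = {
    "Example.TestVirus.A": b"\xde\xad\xbe\xef\x13\x37",
    "Example.TestVirus.B": b"EVIL_PAYLOAD_MARKER",
}


def signature_scan(data: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the sorted names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


clean_file = b"hello world, nothing to see here"
infected_file = b"header" + b"\xde\xad\xbe\xef\x13\x37" + b"trailer"

print(signature_scan(clean_file))     # []
print(signature_scan(infected_file))  # ['Example.TestVirus.A']
```

The point to notice is that this can only ever find what is already in the database, which is exactly why it works well for known viruses and not at all for new ones.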

Heuristic scanning is slightly different. It examines the code within a file for virus-like behaviour, using a variety of techniques and traps to work out what a snippet of code is attempting to do. If that code does too many virus-like things, it is flagged as potentially being a virus.
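The “too many virus-like things” rule can be sketched as a simple weighted score, again in Python. Everything here is an assumption for illustration: the indicator strings, their weights, and the threshold are invented, and a real heuristic engine analyses behaviour in far more depth than looking for strings.

```python
from typing import Dict, List, Tuple

# Hypothetical "virus-like" indicators and weights, invented for this sketch.
INDICATORS: Dict[bytes, int] = {
    b"WriteProcessMemory": 3,      # writes into another process
    b"CreateRemoteThread": 3,      # starts code in another process
    b"SetWindowsHookEx": 2,        # can be used to hook input
    b"\\CurrentVersion\\Run": 2,   # registry autostart location
    b"URLDownloadToFile": 1,       # fetches a file from the network
}
THRESHOLD = 5  # assumed cut-off: at or above this, flag the file


def heuristic_scan(
    data: bytes,
    indicators: Dict[bytes, int] = INDICATORS,
    threshold: int = THRESHOLD,
) -> Tuple[int, List[str], bool]:
    """Return (score, matched indicators, flagged) for the given file contents."""
    hits = [pattern for pattern in indicators if pattern in data]
    score = sum(indicators[pattern] for pattern in hits)
    return score, [pattern.decode("ascii") for pattern in hits], score >= threshold


downloader = b"... URLDownloadToFile ..."
injector = b"... WriteProcessMemory ... CreateRemoteThread ..."

print(heuristic_scan(downloader))  # (1, ['URLDownloadToFile'], False)
print(heuristic_scan(injector))    # (6, ['WriteProcessMemory', 'CreateRemoteThread'], True)
```

No single indicator is damning on its own, since plenty of legitimate programs download files. It is the accumulation that tips a file over the threshold, and that is also where false positives come from.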

This is difficult to do on the desktop: it can be time-consuming, it is a very complex job to perform adequately, and users are very sensitive to delays in their system. If you were making a desktop virus scanner, would you write a heuristic system that tested every possible condition with each file, but slowed the system down so much that users just turned off the heuristic scanning? Or would you just do a few of the common tests that you can perform quickly, and call it good enough?

It is much easier to do heuristic scanning at the boundary, e.g. on an internet or email gateway scanner, where you can typically delay a file and take as long as you need to scan it. People might notice a second of delay per file on their desktop, but you can hold an emailed file back for testing for a good 10 or 15 minutes, which is near enough to eternity given the performance of modern computers.
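The desktop versus gateway trade-off comes down to a time budget. The Python sketch below picks the cheapest checks first until the budget runs out. The check names and their cost estimates are hypothetical; only the one-second and ten-minute budgets come from the figures above.

```python
from typing import List, NamedTuple


class Check(NamedTuple):
    name: str
    cost_seconds: float  # assumed, made-up cost estimate per file


# Hypothetical heuristic checks, from cheap to expensive.
CHECKS: List[Check] = [
    Check("string indicators", 0.05),
    Check("packer detection", 0.2),
    Check("static disassembly", 2.0),
    Check("emulated execution", 120.0),
]


def checks_within_budget(checks: List[Check], budget_seconds: float) -> List[str]:
    """Greedily select the cheapest checks that fit in the time budget."""
    selected: List[str] = []
    remaining = budget_seconds
    for check in sorted(checks, key=lambda c: c.cost_seconds):
        if check.cost_seconds > remaining:
            break  # every later check costs at least as much
        selected.append(check.name)
        remaining -= check.cost_seconds
    return selected


desktop = checks_within_budget(CHECKS, 1.0)      # about a second per file
gateway = checks_within_budget(CHECKS, 10 * 60)  # hold the email for ten minutes

print(desktop)  # ['string indicators', 'packer detection']
print(gateway)  # all four checks
```

With these invented numbers the desktop scanner only ever gets to run the quick checks, while the gateway can afford the lot.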

So does this mean my virus scanner works or not?

It works as well as it ever did. If you’ve purchased a decent product, then the signature-based scanner will do a great job of detecting known viruses. The heuristic scanner may not hold up its end as well on a desktop machine, but any effort here is better than none at all.

So is heuristic scanning a waste of time?

It’s certainly less effective on the desktop than a good server-based system can be. It’s funny to see someone attacking heuristics in general at a MessageLabs seminar when the company makes a very big deal of its Skeptic Heuristic Engine being part of its email product, which makes me think that the report is a little distorted.

It’s important to look at why Ingram believes heuristics are ineffective, and to consider what this really means.

“I am not suggesting that there is a difference in the quality of the antivirus products themselves. What is happening is that the bad guys, the criminals, are testing their malicious code against the antivirus products to make sure they are undetectable. This is not a representation of the software,” said Ingram.

Ah. So the products aren’t ineffective because they don’t work. They’re ineffective because virus writers have access to them and are worried enough about them to take the time to work around them. That’s very different.

If virus writers have to test their viruses and refactor them prior to release, then the AV apps are making new malware more expensive to develop. Maybe that’s not as good as stopping all new viruses dead, but it’s a fairly realistic goal that is still worth getting out of bed for.

This isn’t an entirely new thing either. It’s been known for a while that spammers are among the most prolific users of SpamAssassin and the like. Same thing here: should we be sad that spammers can get around the current spam filters, or happy that they have to spend the extra dev time and cost doing so?