As we reported last quarter, the way malware is being distributed is undergoing a fundamental shift, with more attackers focusing on "drive-by downloads" from legitimate sites that have been compromised, or from sites designed specifically for malicious purposes. In nearly all the variations on this kind of attack, no user action is required for the infection to occur, beyond loading the site in a browser -- and there are very few signs that malicious code has been downloaded.
There is perhaps no better illustration of this shift than the way malware was employed in the recent attack on Google and several other companies. One of the components of the attack involved spear-phishing one or more Google employees, with an aim of driving them to a site that then exploited a zero-day vulnerability in Internet Explorer 6 to download malware to those employees' computers. In previous years, such an attack would have solely made use of a malicious email attachment to compromise the target's computer; now, attackers are clearly opting to employ multiple vectors, including web-based methods.
Looking at the data for Q4'09
Based on the telemetry data we've gathered from the web, we estimate that more than 560,000 sites and approximately 5.5 million pages were infected in Q4'09, compared with more than 640,000 sites and 5.8 million pages in Q3'09. By the end of the year, we had identified more than 100,000 web-based malware infections.
Also in Q4'09, the infections on newly compromised sites of 10 pages or more spread to an average of 24% of those sites' pages, up from 19 percent the previous quarter. This increase helps account for the smaller drop in the number of infected pages for the quarter, relative to the drop in infected sites. In other words, we saw a more significant drop in the number of infected sites than we did in the number of infected pages because each infection tended to spread to a larger number of pages on each site.
Finally, we saw a reinfection rate of 42 percent for the quarter (compared with 39 percent in Q3'09), meaning that more than four of every 10 sites infected in the quarter were reinfected within a space of three months. And, of course, with each infection the site is likely to suffer a loss of traffic, a decline in revenue, and damage to brand equity.
While we clearly saw a slight dip in some of the key metrics in the quarter (and, more specifically, as we neared the end of the year), the macro trend still points to a steady and significant increase in this kind of activity. To cite just one indicator: The number of infected pages for the quarter, 5.5 million, is a substantial increase from data published by Microsoft in April 2009, which pegged the number of infected pages per quarter at a little more than 3 million.
Attackers getting smarter
Like most other kinds of attackers, purveyors of web-based malware have long since adopted basic social engineering techniques to maximize their chances of remaining undetected as they infect endpoints. We saw plenty of evidence confirming that trend in Q4'09: For example, the most common domains being sourced in the download of a malicious file included innocuous-looking names like "google-query.com," "netlinkenterprises.com," and "starktourism.com." Similarly, the file names most often used in drive-by downloads included things like "setup.exe," "update.exe" (which was used in the Google attack), and "install_flash_player.exe."
But we also saw some evidence that attackers are responding directly to industry efforts to curb the spread of web-based malware. One interesting example can be found in the average number of extra processes started when a drive-by download is initiated. In previous years, a drive-by download would often initiate 10 or more extra processes, ostensibly in an attempt to maximize the return from each infected endpoint. In response, the search providers and anti-virus vendors who scan the web for infected sites began using the number of extra processes initiated as a signal that the webpage might be malicious. But in Q4'09, the average number of extra processes initiated was just 2.8 -- enough for a downloader and perhaps one or two pieces of malware. Clearly, attackers are getting smarter about the way they structure their attacks, opting for a smaller fingerprint on an infected machine in exchange for a greater likelihood of evading detection.
Structural vulnerabilities still being exploited
It stands to reason that the increasing complexity in and interoperability between websites and web applications has played a significant role in the rise of web-based malware. After all, the more dynamic and sophisticated your pages or applications are, the more vulnerabilities there will be for attackers to exploit. The data for Q4'09 certainly bears that out: .php, .asp, and .aspx (all file types associated with dynamic web content) accounted for 55 percent of all compromised URLs in the quarter.

Of course, a closer look at the data reveals that file types associated with static pages, such as .html, .htm, and .shtml, accounted for 39.6 percent of the compromised URLs for the quarter. This suggests that attackers are still focused in no small part on exploiting structural vulnerabilities in the web to compromise legitimate sites -- vulnerabilities like sourced-in third-party content or applications; user-added content like comments, links, photos, and other files; and syndicated ad networks, among other things. There are no simple solutions for closing these kinds of vulnerabilities, something that all site owners who want to avoid being infected -- and potentially infecting their users and being blacklisted -- should bear in mind when considering the protections they employ.
Keeping your site safe
If you're a business owner and you'd like to learn more about how Dasient WAM can help protect your websites, head here. If you're a web hosting provider and you'd like to learn about partnership opportunities with Dasient, check out this page. And no matter who you are, please be sure to check out our Twitter feed at http://twitter.com/dasient for all the latest in web-based malware and general security news.
