How Does Malware Get Named?


I imagine there are some of you out there who wonder how companies come up with malware names. It can often be confusing, with different companies calling the same thing by completely different names. This guide will tell you, briefly, how we decide on names you’re most likely to encounter on this blog. (If you would like a much more in-depth look, check out CARO’s site.)

Let’s take a look at the malware we call OSX/Imuler.D. If you were to look at the detection names across the industry, here are some of the names you would see for this file from various vendors:

  • Trojan-Dropper:OSX/Revir.C
  • Backdoor:MacOS_X/Imuler.C
  • OSX/Imuler.D
  • OSX/Imuler-D
  • OSX.Revir
  • Trojan.Muxler.6

Kind of confusing, no doubt. So, let’s break this down a bit.

Prefixes – What Does This Thing Do?

You’ll see a fair number of the detection names start with a word like “Trojan,” “Backdoor” or “Dropper.” These vendors start their naming convention with a description of the activity of the file, but they all have different focuses on what’s the most important descriptor. By choosing “Trojan” or “Trojan Dropper,” it’s as if they’re saying “this threat does not spread by itself – it’s sent by a malicious person.” If they choose “Backdoor,” that is to say the ultimate goal of the Trojan is to create a backdoor on your machine that will let a bad actor take control of it and spy on your actions.

If the name starts with “OSX,” this is a way of stating what operating system the malware affects. If the malware targets multiple operating systems, you may see one component named “W32/NastyBizness” and another called “OSX/NastyBizness.” “W32” lets you know which component affects Windows systems.

Family Name – The Meat and Potatoes

The next part of the name, usually after a delimiter like a slash or a dot, is the family name. This is what the press usually uses, stripped of the prefix info. If a researcher is looking at brand-new malware that bears little resemblance to anything that’s come before, they get to choose a new family name.

Knowing whether a sample is similar to existing malware can be more than a little tricky, especially on Windows, where there are many millions of malware samples. This is the first place where things can get a little cloudy. The first researcher to see a new piece of malware may not be familiar with previous variants of a family, and they may choose a new family name. The next researcher to see it may be familiar with the existing family name and will choose to use that instead.

There are certain conventions that pertain to choosing a new malware family name:

  • The use of proper nouns is strongly discouraged, as this could offend the person/country/company/etc. of the thing the malware is named after. Nobody wants bad things named after them! (Except perhaps the malware’s author, and we really don’t want to be encouraging them by putting their name in the press.)
  • We try not to use obscene or offensive names. This can be tricky because the malware may come from a culture or language that the researcher is unfamiliar with.
  • Numeric names are a bad idea, as historically certain types of viruses included a number as a suffix that denoted how many bytes long the virus code was.
  • We do not use the malware author’s suggested malware name, for the same reason we don’t use the malware author’s name or handle. We don’t want to motivate them with recognition. Sometimes vendors will choose to scramble or reverse an author’s suggested name.
  • It’s best to avoid naming the malware after the filename the malware comes in, such as an email attachment. The next variant in the family may come with a different filename, and we don’t want to train people to only look for certain problematic files – any unexpected attachments should be treated as suspect.
  • For the same reason, we avoid date-based names (such as “Friday_13th”), especially if those dates are related to payload triggers.
  • It’s a good idea to name the malware based on something distinctive within the code or behavior of the threat. This way it will be easier for other researchers to identify the threat and possibly its future variants.
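A few of the conventions above are mechanical enough to check with a quick script. The sketch below is only an illustration: the function name `convention_issues` and its parameters are made up for this post, the date check is deliberately crude, and it makes no attempt at the judgment calls (proper nouns, offensive words) that really need a human.

```python
import re
from pathlib import PurePath

# Crude, assumed heuristic for "date-based" names: a weekday name or an
# ordinal such as "13th". Real-world checks would need more care.
DATE_PATTERN = re.compile(
    r"(mon|tues|wednes|thurs|fri|satur|sun)day|\d+(st|nd|rd|th)",
    re.IGNORECASE,
)


def convention_issues(candidate, attachment_names=(), author_strings=()):
    """Return a list of naming conventions the candidate family name breaks."""
    issues = []
    lowered = candidate.lower()

    # Numeric names used to denote the byte length of a virus.
    if candidate.isdigit():
        issues.append("numeric name")

    # Date-based names, especially ones tied to payload triggers.
    if DATE_PATTERN.search(candidate):
        issues.append("date-based name")

    # Names taken from the file the malware arrived in.
    stems = {PurePath(name).stem.lower() for name in attachment_names}
    if lowered in stems:
        issues.append("matches an attachment filename")

    # The author's own name, handle, or suggested malware name.
    if lowered in {s.lower() for s in author_strings}:
        issues.append("matches the author's name or suggested name")

    return issues
```

For instance, `convention_issues("Friday_13th")` flags the date-based name, while `convention_issues("Imuler")` comes back empty.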

Not all companies agree to these naming rules, which is part of why you will see differing names between vendors. Other times, multiple researchers will discover a threat at roughly the same time, which is another case where you might get multiple different names. In the case of our example above, you can see there are three main family names that are used by the various vendors: Imuler, Revir and Muxler.

As part of the research process, most researchers will first scan the file with other anti-malware products to see if it is already detected. It is fairly common for a file to be detected already, as generic and behavioral detections become more powerful. When this happens, it’s considered good form to use the family name already chosen by the other vendor, unless that name falls afoul of one of the conventions above. Sometimes the name is deemed unacceptable for some other reason (like a limit to the length of the detection name), which is up to the researcher and the vendor. If multiple acceptable names exist, it’s best to choose the one used by the majority of vendors.
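The “majority of vendors” rule can be sketched in a few lines. This is a hypothetical helper rather than anything a vendor actually runs: `pick_family_name` and its `is_acceptable` callback are names invented here, and the acceptability test is left to the caller because it depends on each vendor’s own rules.

```python
from collections import Counter


def pick_family_name(vendor_family_names, is_acceptable=lambda name: True):
    """Pick the acceptable family name used by the most vendors.

    Returns None if no vendor has an acceptable name, in which case the
    researcher gets to choose a new one.
    """
    acceptable = [name for name in vendor_family_names if is_acceptable(name)]
    if not acceptable:
        return None
    # most_common(1) gives the single most frequent name; on a tie, the
    # name that was seen first wins.
    return Counter(acceptable).most_common(1)[0][0]


# Family names taken from the six example detections at the top of the post.
example = ["Revir", "Imuler", "Imuler", "Imuler", "Revir", "Muxler"]
print(pick_family_name(example))  # Imuler
```

If a name has to be rejected, say because of a length limit, a check such as `lambda name: len(name) <= 5` can be passed as `is_acceptable`, and the vote is then taken over the remaining names.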

Suffixes – How Many of These Things Are There?

Suffixes are separated by another delimiter, usually a dot or a dash. They’re meant to tell you which variant of a family this is. For most vendors, suffixes start with A for the first variant, then it goes up to Z, then AA to ZZ, and so on. In Windows-world, it’s very common to see family names with three-letter suffixes, as there are hundreds of variants in those families. Letters are usually used rather than numbers, as we used to use numbers to denote the length of viruses.
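The A-to-Z, then AA-to-ZZ scheme works like spreadsheet column labels. Here is a minimal sketch, assuming variants are simply numbered from 1 and that two-letter suffixes run AA, AB, … AZ, BA and so on; the function name `variant_suffix` is made up for this example, and individual vendors may assign letters differently.

```python
def variant_suffix(n):
    """Convert a 1-based variant number to a letter suffix.

    1 -> "A", 26 -> "Z", 27 -> "AA", 702 -> "ZZ", 703 -> "AAA"
    """
    if n < 1:
        raise ValueError("variant numbers start at 1")
    letters = []
    while n > 0:
        # Subtract 1 before dividing because there is no "zero" letter:
        # A is 1 and Z is 26.
        n, remainder = divmod(n - 1, 26)
        letters.append(chr(ord("A") + remainder))
    return "".join(reversed(letters))
```

Under this scheme a family only needs a three-letter suffix once it passes 702 variants.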

This is another place where you can see things get a little problematic, as our example has suffixes like .C, .D and .6. Say what?? For starters, some vendors buck tradition and name suffixes by number rather than by letter. For those who stick with alphabetic suffixes, there are a couple of common reasons for variant letters to vary. Multiple variants may be discovered at once or within a short span, and Vendor X may get (and name) them in a different order than Vendor Y. Or Vendor X may have a generic detection that picks up multiple variants with one signature. In this case, they may choose to name the next variant .B rather than .C or .D. Or they may be aware that their detection catches several variants, and they’ll only have detection for NastyBizness.A and .D, because they didn’t need to amend their detection for .B and .C.
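To tie the prefix, family and suffix pieces together, here is a rough sketch that splits the six example names from the top of this post into their parts. It is not any vendor’s real parser: the `PLATFORMS` and `KINDS` sets, the `DetectionName` type and `parse_detection_name` are all assumptions made for illustration, and the lookup sets only cover the prefixes mentioned in this post.

```python
import re
from typing import NamedTuple, Optional

# Assumed lookup tables, limited to prefixes that appear in this post.
PLATFORMS = {"OSX", "MacOS_X", "W32"}
KINDS = {"Trojan", "Backdoor", "Dropper", "Trojan-Dropper"}


class DetectionName(NamedTuple):
    kind: Optional[str]      # what it does, e.g. "Backdoor"
    platform: Optional[str]  # what it affects, e.g. "OSX"
    family: str              # the family name, e.g. "Imuler"
    variant: Optional[str]   # the suffix, e.g. "D" or "6"


def parse_detection_name(name):
    # Split on the common delimiters: colon, slash and dot.
    tokens = [t for t in re.split(r"[:/.]", name) if t]
    kind = platform = None
    rest = []
    for token in tokens:
        if token in KINDS and kind is None:
            kind = token
        elif token in PLATFORMS and platform is None:
            platform = token
        else:
            rest.append(token)
    if not rest:
        raise ValueError(f"no family name found in {name!r}")

    family = rest[0]
    variant = rest[1] if len(rest) > 1 else None

    # Some vendors use a dash before the suffix, as in "Imuler-D".
    if variant is None and "-" in family:
        head, _, tail = family.rpartition("-")
        if head and re.fullmatch(r"[A-Z]{1,3}|\d+", tail):
            family, variant = head, tail

    return DetectionName(kind, platform, family, variant)


for example in ["Trojan-Dropper:OSX/Revir.C", "OSX/Imuler-D", "Trojan.Muxler.6"]:
    print(parse_detection_name(example))
```

Names that use prefixes or delimiters outside these small tables will not split cleanly, which is a fair reflection of how inconsistent the real-world names are.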

Ouch, My Head Hurts!

All this information may not make the situation much easier for you, since there are so many variables that go into choosing malware names. But hopefully it will help you understand why the names are the way they are and what they mean, even when they’re confusing. Some anti-virus vendors will put “aliases” in descriptions or blog posts about threats when they’re aware of other vendors using different names. This can certainly cut down on the inevitable calls to tech support and the research department about whether Vendor X detects what Vendor Y is talking about in the press. We try to keep things as simple as possible, but unlike in ye olden days, things move too quickly for us to periodically get together with other researchers throughout the industry to sanitize and consolidate everyone’s names.


photo credit: mrmayo (1) and (2), jazzijava, and 91s_girl via photopin cc