Polymorphic code

{{Short description|Self-modifying program code designed to defeat anti-virus programs or reverse engineering}}
{{distinguish|Polymorphism (computer science)}}
{{refimprove|date=November 2010}}
In computing, '''polymorphic code''' is code that uses a [[polymorphic engine]] to mutate while keeping the original [[algorithm]] intact; that is, the ''code'' changes itself every time it runs, but the ''function'' of the code (its [[semantics]]) stays the same. For example, the simple arithmetic expressions 3+1 and 6-2 both produce the same result, yet are executed by different [[machine code]] on a [[Central processing unit|CPU]]. This technique is sometimes used by [[computer virus]]es, [[shellcode]]s and [[computer worm]]s to hide their presence.<ref name="rugha">{{cite thesis |last=Raghunathan |first=Srinivasan |date=2007 |title=Protecting anti-virus software under viral attacks |type=M.Sc. |publisher=Arizona State University |citeseerx=10.1.1.93.796}}</ref>
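
The distinction between code and function can be illustrated loosely in Python. The sketch below is not taken from the cited sources: the function names are hypothetical, and CPython bytecode is used only as a stand-in for machine code. It shows two routines that compute the same result while compiling to different instruction bytes.

```python
def double_by_add(x: int) -> int:
    """Double x using addition."""
    return x + x


def double_by_mul(x: int) -> int:
    """Double x using multiplication."""
    return x * 2


# Same function (semantics) for every integer input tried here ...
same_results = all(double_by_add(n) == double_by_mul(n) for n in range(-100, 101))

# ... but different compiled code (bytecode stands in for machine code).
different_code = double_by_add.__code__.co_code != double_by_mul.__code__.co_code
```

A scanner that compared only the raw instruction bytes would treat the two routines as unrelated, even though they behave identically.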
 
[[Encryption]] is the most common method to hide code. With encryption, the main body of the code (also called its [[Payload (computing)|payload]]) is encrypted and will appear meaningless. For the code to function as before, a decryption function is added to the code. When the code is ''executed'', this function reads the payload and decrypts it before executing it in turn.
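
A minimal sketch of this encrypt-then-decrypt round trip, assuming a single-byte XOR key (the operation used in this article's own pseudocode). The names <code>xor_bytes</code>, <code>payload</code> and <code>key</code> are illustrative, and the "payload" here is inert text rather than executable code.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key (0-255)."""
    return bytes(b ^ key for b in data)


payload = b"harmless demo payload"   # inert stand-in for the code body
key = 0x5A                           # arbitrary example key

encrypted = xor_bytes(payload, key)      # what is stored: appears meaningless
decrypted = xor_bytes(encrypted, key)    # what the decryption function recovers
```

Because XOR is its own inverse, the same routine serves as both encryptor and decryptor in this sketch.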
 
Encryption alone is not polymorphism. To gain polymorphic behavior, the encryptor/decryptor pair is mutated with each copy of the code. This yields many different versions of the code, all of which function identically.<ref name="wongstamp">{{cite journal |last1=Wong |first1=Wing |last2=Stamp |first2=M. |title=Hunting for Metamorphic Engines |journal=Journal in Computer Virology |volume=2 |issue= 3|pages=211–229 |date=2006 |doi=10.1007/s11416-006-0028-7 |citeseerx=10.1.1.108.3878|s2cid=8116065 }}</ref>
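
The idea of mutating the encryptor/decryptor pair can be sketched on inert data as follows. This is an assumption-laden toy, not a description of any real engine: the three byte-level schemes, the <code>SCHEMES</code> table and the helper names are all invented for illustration.

```python
import random

# Each scheme is an (encode, decode) pair of byte-level transforms.
SCHEMES = {
    "xor": (lambda b, k: b ^ k, lambda b, k: b ^ k),
    "add": (lambda b, k: (b + k) % 256, lambda b, k: (b - k) % 256),
    "sub": (lambda b, k: (b - k) % 256, lambda b, k: (b + k) % 256),
}


def make_copy(payload: bytes, rng: random.Random) -> tuple[str, int, bytes]:
    """Pick a fresh scheme and key; return (scheme, key, encoded payload)."""
    scheme = rng.choice(sorted(SCHEMES))
    key = rng.randrange(1, 256)          # non-zero, so every byte changes
    encode, _ = SCHEMES[scheme]
    return scheme, key, bytes(encode(b, key) for b in payload)


def decode_copy(scheme: str, key: int, encoded: bytes) -> bytes:
    """Undo make_copy using the matching decode transform."""
    _, decode = SCHEMES[scheme]
    return bytes(decode(b, key) for b in encoded)


rng = random.Random(2024)                # seeded only to make the demo repeatable
payload = b"harmless demo payload"
copies = [make_copy(payload, rng) for _ in range(5)]
```

Each entry in <code>copies</code> stores the payload under its own scheme and key, yet <code>decode_copy</code> recovers the same bytes from every one of them.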
Not all of the code can be encrypted, because it would then be completely unusable: a small portion, the decryptor, is left unencrypted and is used to jump-start the encrypted code. Anti-virus software targets this small unencrypted portion.
 
== Malicious code ==
 
Most [[anti-virus software]] and [[intrusion detection system]]s (IDS) attempt to locate malicious code by searching through computer files and data packets sent over a [[computer network]]. If the security software finds patterns that correspond to known computer viruses or worms, it takes appropriate steps to neutralize the threat. [[Polymorphic]] algorithms make it difficult for such software to recognize the offending code because it constantly mutates.
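
From the defender's side, the weakness of plain pattern matching can be sketched as follows. The scanner, the marker string <code>KNOWN-BAD-PATTERN</code> and the sample "files" are all made up for this example; real products use far richer techniques than a substring search.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def scan(blob: bytes, signatures: list[bytes]) -> bool:
    """Return True if any known signature occurs verbatim in blob."""
    return any(sig in blob for sig in signatures)


SIGNATURE = b"KNOWN-BAD-PATTERN"         # made-up marker, not a real signature
signatures = [SIGNATURE]

plain_file = b"header " + SIGNATURE + b" trailer"
# Three re-encoded copies of the same content, each under a different key.
mutated_files = [xor_bytes(plain_file, key) for key in (0x91, 0xC3, 0xDA)]

found_in_plain = scan(plain_file, signatures)                    # True
found_in_mutated = [scan(f, signatures) for f in mutated_files]  # all False
```

The fixed signature matches the plain copy only. Every re-encoded copy carries the same content, but none contains the byte pattern the scanner is looking for.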
 
Malicious [[programmer]]s have sought to protect their encrypted code from this virus-scanning strategy by rewriting the unencrypted decryption engine (and the resulting encrypted payload) each time the virus or worm is propagated. Anti-virus software uses sophisticated pattern analysis to find underlying patterns within the different mutations of the decryption engine, in hopes of reliably detecting such [[malware]].
Emulation may be used to defeat polymorphic obfuscation by letting the malware demangle itself in a virtual environment before applying other methods, such as traditional signature scanning. Such a virtual environment is sometimes called a [[Sandbox (computer security)|sandbox]]. Polymorphism does not protect the virus against such emulation if the decrypted payload remains the same regardless of variation in the decryption algorithm. [[Metamorphic code]] techniques may be used to complicate detection further, as the virus may execute without ever having identifiable code blocks in memory that remain constant from infection to infection.
 
The first known polymorphic virus was written by Mark Washburn. The virus, called [[1260 (computer virus)|1260]], was written in 1990.<ref>{{Cite web |title=An Example Decryptor of 1260 |url=https://userpages.umbc.edu/~dgorin1/432/example_decryptor.htm |access-date=2025-03-21 |website=userpages.umbc.edu}}</ref> A better-known polymorphic virus was created in 1992 by the [[Bulgarians|Bulgarian]] hacker [[Dark Avenger]] (a [[pseudonym]]) as a means of avoiding pattern recognition from antivirus software. A common and very virulent polymorphic virus is the file infector [[Virut]].
 
== Example ==
 
An algorithm that uses, for example, the variables A and B but not the variable C could stay intact even if you added lots of code that changed the contents of the variable C.
The original algorithm:
 
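 Start:
     GOTO Decryption_Code                  ; skip over the encrypted body
 Encrypted:
     ...
     lots of encrypted code
     ...
 Decryption_Code:
     A = Encrypted                         ; A points at the first encrypted byte
 Loop:
     B = *A                                ; read one byte
     B = B XOR CryptoKey                   ; decrypt it
     *A = B                                ; write it back in place
     A = A + 1
     GOTO Loop IF NOT A = Decryption_Code  ; stop at the end of the body
     GOTO Encrypted                        ; run the now-decrypted body
 CryptoKey:
     some_random_number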
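
The decryption loop can be simulated in Python on a flat byte array. This is a sketch under stated assumptions: it treats A as a pointer initialised to the address of Encrypted, models memory as a <code>bytearray</code>, and uses an inert text string as the "body". The names <code>run_decryptor</code>, <code>memory</code> and the key value are illustrative.

```python
def run_decryptor(memory: bytearray, encrypted: int, decryption_code: int,
                  crypto_key: int) -> None:
    """Decrypt memory[encrypted:decryption_code] in place, one byte per pass."""
    a = encrypted                     # A = Encrypted (A holds an address)
    while True:
        b = memory[a]                 # B = *A
        b ^= crypto_key               # B = B XOR CryptoKey
        memory[a] = b                 # *A = B
        a += 1                        # A = A + 1
        if a == decryption_code:      # GOTO Loop IF NOT A = Decryption_Code
            break


body = b"lots of code"                        # inert stand-in for the real body
key = 0x2F                                    # stand-in for some_random_number
encrypted_body = bytes(x ^ key for x in body)

# Layout: [encrypted body][placeholder bytes standing in for the decryptor]
memory = bytearray(encrypted_body + b"\x00" * 4)
run_decryptor(memory, encrypted=0, decryption_code=len(body), crypto_key=key)
```

As in the pseudocode, the loop tests its exit condition only after processing a byte, so it assumes the encrypted region is not empty.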
 
The same algorithm, but with lots of unnecessary C-altering code:
 
 Start:
     GOTO Decryption_Code
 Encrypted:
     ...
     lots of encrypted code
     ...
 Decryption_Code:
     C = C + 1                             ; junk
     A = Encrypted
 Loop:
     B = *A
     C = 3214 * A                          ; junk
     B = B XOR CryptoKey
     *A = B
     C = 1                                 ; junk
     C = A + B                             ; junk
     A = A + 1
     GOTO Loop IF NOT A = Decryption_Code
     C = C^2                               ; junk
     GOTO Encrypted
 CryptoKey:
     some_random_number
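
Why the extra C-altering instructions are harmless can be checked with a toy register machine. Everything below is hypothetical: the two-opcode instruction set, the <code>run</code> interpreter and the <code>insert_junk</code> helper exist only for this sketch.

```python
import random

Instr = tuple  # ("set", dst, value) or ("add", dst, src)


def run(program: list[Instr]) -> dict[str, int]:
    """Execute a toy register program and return the final register values."""
    regs = {"A": 0, "B": 0, "C": 0}
    for op, dst, arg in program:
        if op == "set":
            regs[dst] = arg
        elif op == "add":
            regs[dst] += regs[arg]
    return regs


def insert_junk(program: list[Instr], rng: random.Random, count: int) -> list[Instr]:
    """Return a copy of program with `count` extra instructions that only write C."""
    mutated = list(program)
    for _ in range(count):
        junk = rng.choice([("set", "C", rng.randrange(10_000)),
                           ("add", "C", "A"),
                           ("add", "C", "B")])
        mutated.insert(rng.randrange(len(mutated) + 1), junk)
    return mutated


original = [("set", "A", 3), ("set", "B", 4), ("add", "A", "B")]
rng = random.Random(7)                   # seeded only to make the demo repeatable
variant = insert_junk(original, rng, count=5)

before, after = run(original), run(variant)
```

The variant is five instructions longer, but because the junk writes only to C and the original never reads C, registers A and B end with the same values in both runs.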
 
The code inside "Encrypted" ("lots of encrypted code") could then search the code between Decryption_Code and CryptoKey and remove all the code that alters the variable C. Before the encryption engine is next used, it could insert new unnecessary code that alters C, or even exchange the code in the algorithm for new code that does the same thing. Usually the coder uses a zero key for the first generation of the virus, which makes development easier because with this key the code is not encrypted. The coder then implements an incremental or random key algorithm.
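
The zero-key remark can be checked directly, assuming the same single-byte XOR scheme as the pseudocode: XOR with 0 leaves every byte unchanged. The sketch below also shows one possible incremental key schedule. The helper names and the wrap-around at 256 are assumptions made for illustration.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def next_key(key: int) -> int:
    """Incremental key schedule: step through the 256 single-byte keys."""
    return (key + 1) % 256


body = b"first generation body"      # inert stand-in for the code body

key = 0                              # first generation: zero key
generations = []
for _ in range(3):
    generations.append((key, xor_bytes(body, key)))
    key = next_key(key)
```

The first entry stores the body unchanged, while later entries differ from it and from each other. Note that this particular schedule returns to the zero key after 256 steps.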
 
Another polymorphism technique is to automatically inject NOP (no operation) instructions or other opcodes that do not alter the algorithm.
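
NOP injection can be sketched on a symbolic instruction list, together with one conceivable countermeasure: stripping the padding before comparing. The instruction tuples and helper names are invented for this example, and the model ignores details of real machine code such as jump offsets that would need adjusting.

```python
import random

NOP = ("nop",)


def insert_nops(program: list[tuple], rng: random.Random, count: int) -> list[tuple]:
    """Return a copy of program with `count` NOPs at random positions."""
    mutated = list(program)
    for _ in range(count):
        mutated.insert(rng.randrange(len(mutated) + 1), NOP)
    return mutated


def strip_nops(program: list[tuple]) -> list[tuple]:
    """Normalise a program by dropping every NOP."""
    return [instr for instr in program if instr != NOP]


original = [("load", "B", "A"), ("xor", "B", "key"),
            ("store", "A", "B"), ("inc", "A")]
rng = random.Random(1)                   # seeded only to make the demo repeatable
variant_1 = insert_nops(original, rng, count=3)
variant_2 = insert_nops(original, rng, count=6)
```

Both variants differ from the original as raw sequences, but <code>strip_nops</code> reduces each of them back to the same four instructions.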
 
== See also ==
* [[Timeline of notable computer viruses and worms]]
* [[Metamorphic code]]
* [[Self-modifying code]]
* [[Alphanumeric shellcode]]
* [[Shellcode]]
* [[Obfuscated code]]
* [[Oligomorphic code]]
 
== References ==
<references/>
{{refbegin}}
*{{cite journal |author-link= |last=Spinellis |first=Diomidis |url=http://www.spinellis.gr/pubs/jrnl/2002-ieeetit-npvirus/html/npvirus.html |title=Reliable identification of bounded-length viruses is NP-complete |journal=IEEE Transactions on Information Theory |volume=49 |issue=1 |pages=280–284 |date=January 2003 |doi=10.1109/TIT.2002.806137}}
{{refend}}
 
[[Category:Types of malware]]
[[pl:Kod polimorficzny]]
[[ru:Полиморфизм компьютерных вирусов]]