Malware original slides provided by Prof. Vern Paxson University of California, Berkeley

advertisement
Malware
original slides provided by
Prof. Vern Paxson
University of California, Berkeley
Host-Based Intrusion Detection Systems (HIDS)
(also known as anti-virus software)
• URL/Web access blocking:
– Prevent users from going to known bad locations
• Protocol scanning of network traffic
– Detect & block known attacks (e.g., port scans)
– Detect & block known malware communication (e.g.,
contacting a botnet)
• Payload scanning
– Detect & block known malware
• (Auto-update of signatures for these)
• Cloud queries regarding reputation
– Who else has run this executable and with what results?
– What’s known about the remote host / domain / URL?
Inside a Modern HIDS, con’t
• Sandbox execution
– Run selected executables in constrained/monitored
environment
– Analyze:
• System calls
• Changes to files / registry
• Self-modifying code (polymorphism/metamorphism)
• File scanning
– Look for known malware that installs itself on disk
• Memory scanning
– Look for known malware that never appears on disk
• Runtime analysis
– Apply heuristics/signatures to execution behavior
Network Intrusion Detection Systems (NIDS)
• Deployment inside network as well as at border
– Greater visibility, including tracking of user identity
• Full protocol analysis
– Including extraction of complex embedded objects
– In some systems, 100s of known protocols
• Signature analysis (also behavioral)
– Known attacks, malware communication, blacklisted
hosts/domains
– Known malicious payloads
– Sequences/patterns of activity
• Shadow execution (e.g., Flash, PDF programs, in sandbox)
• Extensive logging (in support of forensics)
• Auto-update of signatures, blacklists
NIDS vs. HIDS
• NIDS benefits:
– Can cover a lot of systems with single deployment
• Much simpler management
– Easy to “bolt on” / no need to touch end systems
– Doesn’t consume production resources on end systems
– Harder for an attacker to subvert / less to trust
• HIDS benefits:
– Can have direct access to semantics of activity
• Better positioned to block (prevent) attacks
• Harder to evade
– Can protect against non-network threats
– Visibility into encrypted activity
– Performance scales much more readily (no chokepoint)
• No issues with “dropped” packets
The Problem of Malware
• Malware = malicious code that runs on a victim’s
system
• How does it manage to run?
– Attacks a network-accessible vulnerable service
– Vulnerable client connects to remote system that sends
over an attack (a driveby)
– Social engineering: trick user into running/installing
– “Autorun” functionality (esp. from plugging in USB device)
– Slipped into a system component (at manufacture;
compromise of software provider; substituted via MITM)
– Attacker with local access downloads/runs it directly
• Might include using a “local root” exploit for privileged access
What Can Malware Do?
• Pretty much anything
– Payload generally decoupled from how manages to run
– Only subject to permissions under which it runs
• Examples:
–
–
–
–
–
–
Brag or exhort or extort (pop up a message/display)
Trash files (just to be nasty)
Damage hardware (Stuxnet?)
Launch external activity (spam, click fraud, DoS)
Steal information (exfiltrate)
Keylogging; screen / audio / camera capture
• Robbins v. Lower Merion School District
– Encrypt files (ransomware)
• Possibly delayed until condition occurs
– “time bomb” / “logic bomb”
Malware That Automatically Propagates
• Virus = code that propagates (replicates) across
systems by arranging to have itself eventually
executed, creating a new additional instance
– Generally infects by altering stored code
– Typically with the help of a user
• Worm = code that self-propagates/replicates
across systems by arranging to have itself
immediately executed (creating new addl. instance)
– Generally infects by altering running code
– No user intervention required
• (Note: line between these isn’t always so crisp;
plus some malware incorporates both styles)
The Problem of Viruses
• Opportunistic = code will eventually execute
– Generally due to user action
• Running an app, booting their system, opening an attachment
• Separate notions: how it propagates vs. what else
it does when executed (payload)
• General infection strategy:
find some code lying around,
alter it to include the virus
• Have been around for decades …
– … resulting arms race has heavily
influenced evolution of modern malware
http://vxheaven.org/lib/vml01.html
Propagation
• When virus runs, it looks for an opportunity to
infect additional systems
• One approach: look for USB-attached thumb drive,
alter any executables it holds to include the virus
– Strategy: when drive later attached to another system &
altered executable runs, it locates and infects
autorun is
executables on new system’s hard drive
handy here!
• Or: when user sends email w/ attachment, virus
alters attachment to add a copy of itself
– Works for attachment types that include programmability
– E.g., Word documents (macros), PDFs (Javascript)
– Virus can also send out such email proactively, using
user’s address book + enticing subject (“ILOVEYOU”)
LOVE-LETTER-FOR-YOU.txt.vbs
Original program
instructions can be:
Entry point
Original Program Instructions
Virus
Entry point
• Application the
user runs
• Run-time library /
routines resident
in memory
Original Program Instructions
• Disk blocks used
to boot OS
• Autorun file on
USB device
3. JMP
Original Program Instructions
2. JMP
Virus
1. Entry point
•…
Other variants are
possible; whatever
manages to get the
virus code executed
Detecting Viruses
• Signature-based detection
– Look for bytes corresponding to injected virus code
– High utility due to replicating nature
• If you capture a virus V on one system, by its nature the virus
will be trying to infect many other systems
• Can protect those other systems by installing recognizer for V
• Drove development of multi-billion $$ AV industry
(AV = “antivirus”)
– So many endemic viruses that detecting well-known
ones becomes a “checklist item” for security audits
• Using signature-based detection also has de facto
utility for (glib) marketing
– Companies compete on number of signatures …
• … rather than their quality (harder for customer to assess)
(will check your file
with 40+ antivirus
programs)
Virus Writer / AV Arms Race
• If you are a virus writer and your beautiful
new creations don’t get very far because
each time you write one, the AV companies
quickly push out a signature for it ….
– …. What are you going to do?
• Need to keep changing your viruses …
– … or at least changing their appearance!
• How can you mechanize the creation of new
instances of your viruses …
– … so that whenever your virus propagates, what
it injects as a copy of itself looks different?
Polymorphic Code
• We’ve already seen technology for creating a
representation of data apparently completely
unrelated to the original: encryption!
• Idea: every time your virus propagates, it inserts a
newly encrypted copy of itself
– Clearly, encryption needs to vary
• Either by using a different key each time
• Or by including some random initial padding (like an IV)
– Note: weak (but simple/fast) crypto algorithm works fine
• No need for truly strong encryption, just obfuscation
• When injected code runs, it decrypts itself to obtain
the original functionality
Virus
Original Program Instructions
Instead of this …
Original Program Instructions
Virus has this
initial structure
}
Key
Decryptor
Encrypted Glob of Bits
When executed,
decryptor applies key
to decrypt the glob …

Key
Decryptor
Main Virus Code
Jmp
… and jumps to the
decrypted code once
stored in memory
Polymorphic Propagation
Key
Decryptor
Encrypted Glob of Bits


}
Jmp
Encryptor
Key
Decryptor
Main Virus Code
Key2
Decryptor
Different Encrypted Glob of Bits
Once running, virus
uses an encryptor with
a new key to
propagate
New virus instance
bears little resemblance
to original
Arms Race: Polymorphic Code
• Given polymorphism, how might we then detect
viruses?
• Idea #1: use narrow sig. that targets decryptor
– Issues?
• Less code to match against  more false positives
• Virus writer spreads decryptor across existing code
• Idea #2: execute (or statically analyze) suspect
code to see if it decrypts!
– Issues?
• Legitimate “packers” perform similar operations (decompression)
• How long do you let the new code execute?
– If decryptor only acts after lengthy legit execution, difficult to spot
• Virus-writer countermeasures?
Metamorphic Code
• Idea: every time the virus propagates, generate
semantically different version of it!
– Different semantics only at immediate level of execution;
higher-level semantics remain same
• How could you do this?
• Include with the virus a code rewriter:
– Inspects its own code, generates random variant, e.g.:
•
•
•
•
•
Renumber registers
Change order of conditional code
Reorder operations not dependent on one another
Replace one low-level algorithm with another
Remove some do-nothing padding and replace with different donothing padding (“chaff”)
– Can be very complex, legit code … if it’s never called!
Polymorphic Code In Action
At some
point in
time can
take a
snapshot of
completely
decrypted
virus.
Hunting for Metamorphic, Szor & Ferrie, Symantec Corp., Virus Bulletin Conference, 2001
https://www.symantec.com/avcenter/reference/hunting.for.metamorphic.pdf
Metamorphic Code In Action
Zmist virus hides itself inside original program,
just as T-1000 can hide in the floor.
Hunting for Metamorphic, Szor & Ferrie, Symantec Corp., Virus Bulletin Conference, 2001
https://www.symantec.com/avcenter/reference/hunting.for.metamorphic.pdf
Detecting Metamorphic Viruses?
• Need to analyze execution behavior
– Shift from syntax (appearance of instructions) to
semantics (effect of instructions)
• Two stages: (1) AV company analyzes new virus to find
behavioral signature; (2) AV software on end systems
analyze suspect code to test for match to signature
• What countermeasures will the virus writer take?
– Delay analysis by taking a long time to manifest behavior
• Long time = await particular condition, or even simply clock time
– Detect that execution occurs in an analyzed environment and if so
behave differently (e.g., VW emissions controls activation)
• E.g., test whether running inside a debugger, or in a Virtual Machine
• Counter-countermeasure?
– AV analysis looks for these tactics and skips over them
• Note: attacker has edge as AV products supply an oracle
How Much Malware Is Out There?
• A final consideration re polymorphism and
metamorphism:
– Presence can lead to mis-counting a single virus
outbreak as instead reflecting 1,000s of seemingly
different viruses
• Thus take care in interpreting vendor statistics on
malcode varieties
– (Also note: public perception that many varieties exist is
in the vendors’ own interest)
https://www.av-test.org/en/statistics/malware/
Infection Cleanup
• Once malware detected on a system, how do we get
rid of it?
• May require restoring/repairing many files
– This is part of what AV companies sell: per-specimen
disinfection procedures
• What about if malware executed with adminstrator
privileges?
“nuke the entire site from orbit. It's the only way to be sure”
https://www.youtube.com/watch?v=aCbfMkh940Q
- Aliens
– i.e., rebuild system from original media + data backups
• Malware may include a rootkit: kernel patches to
hide its presence (its existence on disk, processes)
Infection Cleanup, con’t
• If we have complete source code for system,
we could rebuild from that instead, couldn’t we?
• No!
• Suppose forensic analysis shows that virus
introduced a backdoor in /bin/login
executable
– (Note: this threat isn’t specific to viruses; applies
to any malware)
• Cleanup procedure: rebuild /bin/login from
source …
Compiler
Regular compilation
process of building login
binary from source code
/bin/login
executable
Compiler
/bin/login
executable
Compiler infected
(perhaps a long time
ago) recognizes when
it’s compiling /bin/login
source and inserts extra
back door when seen
X
No problem: first step,
rebuild the compiler so
it’s uninfected
Infected Compiler
Correct compiler
executable
Infected Compiler
Oops - infected compiler
recognizes when it’s
compiling its own source
and inserts the infection!
Infected Compiler
No amount of careful source-code
scrutiny can prevent this problem.
Reflections on Trusting Trust
Turing-Award Lecture, Ken Thompson, 1983
Download