481 project contract


Intelligent Malware Detection

Group Members:

Alex Finkelstein, Kevin Hao, Josh Suess, Mike Hite, Dom Amos

Goals:

When we set out last semester to build a more advanced, intelligent malware detection system, we set aggressive goals. Our progress last semester laid the foundation for what we will accomplish this semester, but unfortunately not all of our previous goals may be reachable.

Our final deliverable will be capable of extracting and storing PE file signatures from a file based on its API calls. It will allow a user to upload their own file and determine if the file is malicious or benign based on signature comparison algorithms. Additionally, our system will be able to scan all of the files in a directory and make similar determinations about them.
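As a rough illustration of what an API-call signature and a signature comparison could look like, here is a minimal Python sketch. It is an assumption on our part, not the final design: the function names, the SHA-256 digest, and the Jaccard similarity measure are all hypothetical choices, and the API lists are made-up inputs. Reading the imported API names out of a real PE file would require a PE parser, which is not shown here.

```python
import hashlib


def make_signature(api_calls):
    """Normalize API names into a sorted, de-duplicated tuple (hypothetical format)."""
    return tuple(sorted({name.strip().lower() for name in api_calls if name.strip()}))


def signature_digest(signature):
    """Hash the signature so an exact match can be looked up with a single key."""
    joined = "\n".join(signature).encode("utf-8")
    return hashlib.sha256(joined).hexdigest()


def jaccard_similarity(sig_a, sig_b):
    """Share of API names the two signatures have in common (0.0 to 1.0)."""
    a, b = set(sig_a), set(sig_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Made-up API lists, for illustration only.
sample = make_signature(["CreateFileW", "WriteFile", "VirtualAlloc", "WriteFile"])
known = make_signature(["VirtualAlloc", "WriteFile", "CreateRemoteThread"])

print(signature_digest(sample)[:16])      # short prefix of the lookup key
print(jaccard_similarity(sample, known))  # overlap between the two signatures
```

The digest supports the exact-match case, while a similarity score is one possible input to a classification step for files that have not been seen before.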

Our system will interface with the user, the software, and a database behind the scenes. The user will interact with a window that presents options to upload a file or select a directory to scan. After the user has chosen an action, the software will read the selected file(s) and extract the PE file signature(s). The signatures will then be sent to the database, where they will be compared with existing signatures. If a file's signature already exists in the database, its stored classification will be output to the user. If it does not, the classification algorithm will be run, and the resulting classification will be output to the user and saved to the database.
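The lookup-then-classify flow described above can be sketched as follows. This is only a sketch under stated assumptions: we use Python's built-in sqlite3 module with an in-memory database purely for illustration, the table layout is hypothetical, and `classify_placeholder` is a stand-in rule, not our classification algorithm.

```python
import sqlite3


def open_database(path=":memory:"):
    """Open the signature store (hypothetical schema: digest -> label)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS signatures ("
        "digest TEXT PRIMARY KEY, label TEXT NOT NULL)"
    )
    return conn


def classify_placeholder(api_calls):
    """Stand-in rule for illustration only, NOT the project's algorithm."""
    suspicious = {"createremotethread", "writeprocessmemory"}
    called = {name.lower() for name in api_calls}
    return "malicious" if suspicious & called else "benign"


def scan(conn, digest, api_calls, classify=classify_placeholder):
    """Return (label, was_already_known) for one file signature."""
    row = conn.execute(
        "SELECT label FROM signatures WHERE digest = ?", (digest,)
    ).fetchone()
    if row is not None:
        return row[0], True           # known signature: reuse stored result
    label = classify(api_calls)       # unknown signature: classify it now
    conn.execute(
        "INSERT INTO signatures (digest, label) VALUES (?, ?)", (digest, label)
    )
    conn.commit()                     # store the result for future scans
    return label, False


conn = open_database()
first = scan(conn, "abc123", ["CreateRemoteThread", "WriteFile"])
second = scan(conn, "abc123", ["CreateRemoteThread", "WriteFile"])
print(first, second)
```

Scanning the same signature twice shows both paths: the first call runs the classifier and stores the result, and the second call is answered from the database.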

We had wanted to implement an active version of our software, capable of running in the background and alerting the user to potential threats as they occur. Given our time constraints and our inexperience with the subject matter, we are not confident that we could accomplish this in a satisfactory way. A background system would have to be efficient so as not to hog system resources, and it would require us to build a separate interface to the software. We are also new to databases and to the structure of Windows PE files, and these aspects have already presented challenges. For these reasons we are dropping this capability from our final deliverable and will instead focus our time on other aspects of the project, primarily the database and extraction software.

Infrastructure:

The essential infrastructure necessary to demonstrate our project is the database, which is where all of the malicious and benign file signatures will be stored. Our system will interface with the database and use its contents in the classification algorithm. Without the database there is nothing to compare a file to and no classification can be made.

Additionally, we will require a machine capable of connecting to our database. This could be a laptop or desktop computer as long as we are able to make a connection.

Functions to be demonstrated:

During our presentation we will show our user interface and demonstrate its functionality. First we will explain how the API extraction is done behind the scenes and how a PE file signature is created. From here we will explain our classification algorithm and demonstrate three scans. The first will be of a file that already exists in the database. With this scan we will show, and explain the significance of, how our system compares a signature with known signatures before running the algorithm. The second scan will be of a file that does not exist in the database. It will demonstrate the classification algorithm in action and show how the database is updated after a scan of an unknown file is completed. The final scan will be a directory scan covering multiple files: a mix of benign and malicious files, some already in the database and some not. This scan will showcase our system’s accuracy in classifying multiple types of files and will show how we elegantly present the results to the user.
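A directory scan of the kind planned for the third demonstration might be structured like the sketch below. The function names are hypothetical, and `scan_file` stands in for whatever per-file scan the system provides. The "MZ" check relies on the fact that PE files begin with those two bytes; it is only a quick filter, not a full validation of the PE header.

```python
from pathlib import Path


def looks_like_pe(path):
    """Quick filter: PE files start with the DOS header magic bytes 'MZ'."""
    try:
        with open(path, "rb") as handle:
            return handle.read(2) == b"MZ"
    except OSError:
        return False  # unreadable files are skipped rather than crashing the scan


def scan_directory(root, scan_file):
    """Apply scan_file to every PE-looking file under root, including subfolders."""
    results = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and looks_like_pe(path):
            results[str(path)] = scan_file(path)
    return results
```

Passing the per-file scan in as a parameter keeps the directory walk independent of the database and classifier, so the same loop works for known and unknown files alike.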

User interface:

The user interface will be presented in one window, which will give the user the ability to upload a file or select a directory to scan. It will be sleek, polished, and easy to use. The interface is fairly simple because our system has only two main functions. The bulk of our work will be spent on the database, signature extraction, and classification algorithm, but the interface will not be neglected, as it is how the user interacts with the system.

Acceptance tests:

Acceptance testing will be carried out throughout our work on the project to ensure that we are in fact meeting the goals we have set. The tests will be demonstrated during our presentation through the database and scans. To elaborate, our database is updated after each scan, so we will be able to see that it is updating correctly. Additionally, we will test with sample files that are known to be benign and others known to be malicious to confirm that our system makes the same classifications. We will also test with malicious files that are known to evade some of the leading detection systems to verify that our system is able to classify them. We know that no detection system is perfect, but we are striving for an extremely low percentage of false-positive and false-negative classifications, and we will calculate these rates to demonstrate the accuracy of our system.
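The accuracy calculations could be done along the lines of the sketch below, which treats "malicious" as the positive class. The function name and the sample results are hypothetical and serve only to show the arithmetic; they are not measurements from our system.

```python
def error_rates(results):
    """Compute accuracy and error rates from (actual, predicted) label pairs."""
    tp = fp = tn = fn = 0
    for actual, predicted in results:
        if actual == "malicious":
            if predicted == "malicious":
                tp += 1   # malicious file correctly flagged
            else:
                fn += 1   # malicious file missed (false negative)
        else:
            if predicted == "malicious":
                fp += 1   # benign file wrongly flagged (false positive)
            else:
                tn += 1   # benign file correctly passed
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Made-up outcomes for four test files, for illustration only.
example = [
    ("malicious", "malicious"),
    ("malicious", "benign"),
    ("benign", "benign"),
    ("benign", "benign"),
]
print(error_rates(example))
```

In this made-up example one of the two malicious files is missed, so the false-negative rate is 0.5, the false-positive rate is 0.0, and the overall accuracy is 0.75.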

Conclusion:

While we do have to make some alterations to the scope of our project, we are confident that with the new goals in place we will be able to meet our expectations and deliver a product capable of quickly and accurately classifying files. Working with databases and PE files is new to all members of the group and will prove to be a challenge, but we are ready and will strive to deliver the best malware detection system possible.
