IMAGE PROCESSING AND FORENSIC VERIFICATION OF FAKE VIDEOS/IMAGES

Thesis/Dissertation submitted in partial fulfilment of the requirements for the award of the degree of BACHELOR OF TECHNOLOGY in COMPUTER SCIENCE & ENGINEERING

By
T. Srikanksha 17K91A05M3
MD Nisha 18K95A0525
V. Kalyan 17K91A05N2
T. Karthik 17K91A05M2

Under the Guidance of
G. Anantha Laxmi, Assistant Professor

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
TKR COLLEGE OF ENGINEERING & TECHNOLOGY (AUTONOMOUS)
(Accredited by NBA and NAAC with 'A' Grade)
Medbowli, Meerpet, Saroornagar, Hyderabad-500097

DECLARATION BY THE CANDIDATE

We, Ms. T. Srikanksha, bearing Roll No. 17K91A05M3, Ms. MD Nisha, bearing Roll No. 18K95A0525, Mr. V. Kalyan, bearing Roll No. 17K91A05N2, and Mr. T. Karthik, bearing Roll No. 17K91A05M2, hereby declare that the project report entitled "IMAGE PROCESSING AND FORENSIC VERIFICATION OF FAKE VIDEOS/IMAGES", carried out under the guidance of Mrs. G. Anantha Laxmi, Assistant Professor in the Department of Computer Science & Engineering, is submitted in partial fulfilment of the requirements for the award of the degree of Bachelor of Technology in Computer Science & Engineering.

By
T. Srikanksha (17K91A05M3)
MD Nisha (18K95A0525)
V. Kalyan (17K91A05N2)
T. Karthik (17K91A05M2)

CERTIFICATE

This is to certify that the project report entitled "IMAGE PROCESSING AND FORENSIC VERIFICATION OF FAKE VIDEOS/IMAGES", being submitted by Ms. T. Srikanksha (17K91A05M3), Ms. MD Nisha (18K95A0525), Mr. V. Kalyan (17K91A05N2), and Mr. T. Karthik (17K91A05M2) in partial fulfilment of the requirements for the award of the degree of Bachelor of Technology in Computer Science & Engineering, to the Jawaharlal Nehru Technological University, is a record of bonafide work carried out by them under my guidance and supervision.

Signature of the Guide: G. Anantha Laxmi, Assistant Professor
Signature of the HOD: Dr. A.
Suresh Rao, Professor

Signature of the External Examiner

ACKNOWLEDGEMENT

The satisfaction and euphoria that accompany the successful completion of any task would be incomplete without the mention of the people who made it possible and whose encouragement and guidance have crowned our efforts with success.

We are indebted to our Internal Guide, Mrs. G. Anantha Laxmi, Assistant Professor, Department of Computer Science & Engineering, TKR College of Engineering and Technology, for her support and guidance throughout our major project.

We are also indebted to the Head of the Department, Dr. A. Suresh Rao, Professor, Computer Science & Engineering, TKR College of Engineering & Technology, for his support and guidance throughout our major project.

We extend our deep sense of gratitude to the Principal, Dr. D. V. Ravi Shankar, TKR College of Engineering & Technology, for permitting us to undertake this major project.

Finally, we express our thanks to one and all who have helped us in successfully completing this major project. Furthermore, we would like to thank our families and friends for their moral support and encouragement.

By,
T. Srikanksha (17K91A05M3)
MD Nisha (18K95A0525)
V. Kalyan (17K91A05N2)
T. Karthik (17K91A05M2)

CONTENTS

Abstract i
List of Figures ii
List of Screens iii
Symbols and Abbreviations iv

S.No TOPIC NAME PAGE NO.
1. Introduction 1-2
1.1 Motivation 1
1.2 Problem Statement 1
1.3 Limitations of Problem Statement 1
1.4 Proposed System 2
2. How Deep Fake Works 3-7
2.1 Deepfake Introduction 3-4
2.2 Deepfake Creation 4-6
2.3 Overview 7
3. Literature Survey 8-9
4. Requirement Analysis 10-12
4.1 Functional Requirements 10
4.2 Non-Functional Requirements 11
4.3 Software Requirements 11
4.4 Hardware Requirements 12
5. Design 13-15
5.1 System Architecture 13-14
5.2 Meso4 14-15
6. Coding 16-20
6.1 Datasets Download 16
6.2 Coding-Mesonet Neural Network 16-20
7. Implementation 21-27
7.1 Implementation 21-24
7.2 Results and Output Screens 25-27
8.
Testing 28-30
8.1 Software Validation 28
8.2 Software Verification 28
8.3 Targets of the Test 28-29
8.4 Black-Box Testing 29
8.5 White-Box Testing 29
8.6 Testing Values 29-30
9. Advantages, Disadvantages, Applications 31-32
9.1 Advantages 31
9.2 Disadvantages 31
9.3 Applications 32
10. Conclusion and Future Enhancement 33-34
10.1 Conclusion 33
10.2 Future Enhancement 34
References 35-36

ABSTRACT

Deep learning has been successfully applied to resolve various complex problems ranging from big data analytics to computer vision and human-level control. Advances in deep learning, however, have also been used to create software that may pose threats to privacy, democracy, and national security. One such deep learning application is the "deep fake". Deep fake algorithms can create fake images and videos that humans cannot distinguish from authentic ones. It is therefore essential to propose technologies for automatically detecting and assessing the integrity of digital visual media. This project deals with the methods proposed to detect deep fakes in the literature to date. We present in-depth discussions on the challenges, research trends, and orientations related to deep fake technologies. Deep fake detection methods were proposed as soon as this threat was introduced. The early attempts were based on handcrafted features obtained from artifacts and inconsistencies in the fake video synthesis process. In this project, we apply deep learning techniques to automatically extract salient and discriminative features to detect deep fakes. Detection of deep fakes is normally considered a binary classification problem where classifiers are used to distinguish between genuine and forged videos. This type of method requires a large database of real and fake videos to train classification models. The number of available fake videos is increasing, but it remains limited as a reference for validating the various detection methods.
By reviewing the knowledge of deep fakes and state-of-the-art deep fake detection methods, this study provides a complete overview of deep fake techniques and facilitates a new and more robust method to deal with the increasingly challenging deep fakes.

i

LIST OF FIGURES
Fig No. Title Page No.
2.1 Deep fake principle 5
2.2 Example of Deep fake image 6
5.1 System Architecture 13
5.2 The Network Architecture of Meso4 15
7.1 Convolutional and Hidden Layers of Meso4 Network Model 21

ii

LIST OF SCREENS
Screen No. Title Page No.
7.1 Correct_real Images 25
7.2 Misclassified_real Images 26
7.3 Correct_deepfake Images 27
7.4 Misclassified_deepfake Images 27

iii

SYMBOLS & ABBREVIATIONS
Acronym Expansion
CV Computer Vision
CNN Convolutional Neural Network
DARPA Defense Advanced Research Projects Agency
AI Artificial Intelligence
DF Deep Fakes
IDE Integrated Development Environment
RAM Random Access Memory

iv

CHAPTER 1 INTRODUCTION

1.1 MOTIVATION
Digital video is commonly used by many organizations as evidence of crimes, and many surveillance systems record information using cameras. Such footage can be faked by criminals. Deepfake videos are also created to spread fake news around the world, which leads to political and economic loss. To supply a satisfactory solution to this problem, the fake video detection concept was introduced. In this concept, we use a set of techniques to detect whether images/videos are deep fake or real.

1.2 PROBLEM STATEMENT
Cybercriminals are using image processing tools and techniques to commit a variety of crimes, including image modification and fabrication using cheap fake and deepfake videos/images. The solution should focus on helping the image/video verifier/examiner find and differentiate a fabricated image/video from an original one.

1.3 LIMITATIONS OF THE EXISTING SYSTEM
The existing system provides a group of statistical tools for detecting traces of digital tampering in the absence of any digital watermark or signature.
It characterizes the statistical correlations that result from specific forms of digital tampering and devises detection schemes to reveal these correlations. These tools, in the same spirit as those presented here, reveal statistical correlations that result from the variety of manipulations typically necessary to create a digital forgery. Each of the schemes outlined is analyzed for its sensitivity and robustness to counter-attack. Digital forensic techniques of this kind are designed to identify digital forgeries even when the forgery is perceptually undetectable by humans.

Dept of CSE 1 TKRCET

1.4 PROPOSED SYSTEM
Our project aims at detecting realistic human-synthesized videos, popularly known as deep fakes. This project deals with the methods proposed to detect deep fakes in the literature to date. In this project, we apply deep learning techniques to automatically extract salient and discriminative features to detect deep fakes. Deepfake detection is generally deemed a binary classification problem where classifiers are used to distinguish between authentic videos and tampered ones. This kind of method requires a large database of real and fake images/videos to train classification models. Traditional image forensics methods can be classified according to the image features that they target, such as local noise estimation, pattern analysis, illumination modeling, and feature classification. However, with the deep learning breakthrough, the computer vision (CV) community has radically steered towards neural network techniques. For example, recent works are based on Convolutional Neural Networks (CNN). These CNN-based approaches also aim to capture the aforementioned image features, but in an explicit way. We propose a technique that uses Meso4, a Convolutional Neural Network. Mesonet is a Convolutional Neural Network specifically designed to detect deepfakes. We use Mesonet to make predictions on image data.
It classifies our image data as real or deep fake.

CHAPTER 2 HOW DEEP FAKE WORKS

2.1 DEEP FAKE INTRODUCTION
Deep fake is a technique that can superimpose face images of a target person onto a video of a source person, to create a video of the target person doing or saying the things the source person does. Deep learning models such as autoencoders and generative adversarial networks are applied widely in the computer vision domain to solve various problems. These models are also used in deep fake algorithms to examine the facial expressions and movements of a person and to synthesize facial images of another person making analogous expressions and movements. Deep fake algorithms normally require a large amount of image and video data to train models that create photo-realistic images and videos. Because public figures such as celebrities and politicians have a large number of videos and images available online, they were the initial targets of deep fakes. Deep fakes have been used to swap the faces of celebrities or politicians onto bodies in porn images and videos. The first deep fake video came out in 2017, where the face of a celebrity was swapped onto that of a porn actor. Deep fake methods are alarming for world security when they can be employed to create videos of world leaders with fake speeches for falsification purposes. Deep fakes can be abused to cause political or religious tension between countries, to fool the public and affect results in election campaigns, or to create chaos in financial markets by fabricating fake news. Deepfake techniques can also be used to generate fake satellite images of the Earth containing objects that do not exist, to confuse military analysts, e.g., creating a fake bridge over a river although there is no such bridge in reality, which could mislead troops who have been guided to cross that bridge in a battle.
There are also positive possibilities for deep fakes, such as creating voices or images for those who have lost theirs, or updating episodes of movies without reshooting them. However, the number of malicious uses of deep fakes largely dominates the positive ones. The development of advanced deep networks and the availability of large amounts of data have made forged images and videos almost indistinguishable to humans, and even to elaborate computer algorithms. The process of creating those manipulated images and videos is very simple today, as it needs as little as an identity photo or a short video of a target person. Less and less effort is required to produce stunningly convincing tampered footage; the technique can even create a deep fake from just a still image. Deepfakes are a threat not only to public figures but also to ordinary people. For example, a voice deep fake was used to scam a CEO out of $243,000. A recent release of the software called DeepNude shows an even more disturbing threat: transforming a person into non-consensual porn. Likewise, the Chinese app Zao has gone viral lately, as less-skilled users can swap their faces onto the bodies of movie stars and insert themselves into popular movies, TV clips, and short videos. These forms of falsification create a huge threat to privacy and identity and affect many aspects of human lives. The critical part is finding the truth in the digital sector. This is even more challenging when dealing with deep fakes, as they are mostly used to serve harmful purposes and almost anyone can create deep fakes these days using existing deep fake tools. Thus far, many methods have been proposed to detect deep fakes. Most of them are based on deep learning, and thus a clash between malicious and positive uses of deep learning methods has been arising.
To address the threat of face-swapping technology or deep fakes, the United States Defense Advanced Research Projects Agency (DARPA) initiated a research program in media forensics (named Media Forensics or MediFor) to stimulate the development of fake digital visual media detection methods. Recently, Facebook Inc., teaming up with Microsoft Corp. and the Partnership on AI coalition, launched the Deepfake Detection Challenge to catalyse more research and development in detecting and stopping deep fakes from being used to deceive viewers. This chapter surveys methods for creating as well as detecting deep fakes, presenting the principles of deep fake algorithms and how deep learning has been used to create or enable such disruptive technologies.

2.2 DEEPFAKE CREATION
Deepfake is a technique that aims to exchange the face of a targeted person with the face of someone else in a video. It first appeared in autumn 2017 as a script used to generate face-swapped adult content. Afterward, this technique was improved by a small community to create a notably user-friendly application called FakeApp. The core idea lies in the parallel training of two autoencoders. Their architecture can vary according to the output size, the desired training time, the expected quality, and the available resources. Traditionally, an autoencoder designates the chaining of an encoder network and a decoder network. The encoder performs a dimension reduction by encoding the data from the input layer into a reduced number of variables. The goal of the decoder is then to use those variables to output an approximation of the original input. The optimization phase is done by comparing the input with its generated approximation and penalizing the difference between the two, typically using an L2 distance.
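As a rough sketch of this encode, decode, and penalize cycle (a toy linear autoencoder in NumPy on random data, not the actual FakeApp networks; the 64x64x3 input size and 1024-variable encoding are the dimensions used by the deep fake technique):

```python
import numpy as np

rng = np.random.default_rng(0)

# A batch of flattened 64x64x3 face images (random data stands in for real faces)
x = rng.random((4, 64 * 64 * 3))            # 12,288 input variables per image

# Linear encoder/decoder weights (a real deepfake autoencoder uses deep networks)
W_enc = rng.standard_normal((64 * 64 * 3, 1024)) * 0.01
W_dec = rng.standard_normal((1024, 64 * 64 * 3)) * 0.01

latent = x @ W_enc                           # dimension reduction to 1024 variables
x_hat = latent @ W_dec                       # approximation of the original input

# L2 (squared-error) penalty between the input and its reconstruction
l2_loss = np.mean((x - x_hat) ** 2)
```

Training would repeatedly adjust the encoder and decoder weights to shrink this penalty; the deepfake-specific trick, described next, is to train two such autoencoders that share one encoder.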
In the case of the deep fake technique, the original autoencoder is fed with images of resolution 64x64x3, i.e., 12,288 variables, encodes those images into 1024 variables, and then generates images of the same size as the input. The process to create deep fake images is to collect aligned faces of two different people, A and B, then to train an autoencoder EA to reconstruct the faces of A from the dataset of facial images of A, and an autoencoder EB to reconstruct the faces of B from the dataset of facial images of B. The trick consists of sharing the weights of the encoding part of the two autoencoders EA and EB, but keeping their respective decoders separated. Once the optimization is done, any image containing a face of A can be encoded through this shared encoder but decoded with the decoder of EB. This principle is illustrated in Figures 2.1 and 2.2.

Fig 2.1 Deepfake principle. Top: the training parts with the shared encoder in yellow. Bottom: the usage part where images of A are decoded with the decoder of B.

The intuition behind this approach is to have an encoder that preferentially encodes the general information of illumination, position, and expression of the face, and a dedicated decoder for each person that reconstitutes the constant characteristic shapes and details of that person's face. This would thus separate the contextual information on one side and the morphological information on the other. In practice, the results are impressive, which explains the popularity of the technique. The last step is to take the target video, extract and align the target face from each frame, use the modified autoencoder to generate another face with the same illumination and expression, and then merge it back into the video.

Fig 2.2 Example of deepfake image. Original (left) and deepfake (right).

Fortunately, this system is far from flawless.
Basically, the extraction of faces and their reintegration can fail, especially in the case of face occlusions: some frames can end up with no facial reenactment, with a large blurred area, or with a doubled facial contour. However, those technical errors can easily be avoided with more advanced networks. More fundamentally, and this is also true for other applications, autoencoders tend to poorly reconstruct fine details because of the compression of the input data into a limited encoding space, so the result often appears a bit blurry. A larger encoding space does not work properly either: while the fine details are certainly better approximated, the resulting face loses realism because it tends to resemble the input face, i.e., morphological data are passed to the decoder, which may be an undesired effect.

2.3 OVERVIEW
Deep fake is a type of artificial intelligence used to create convincing image, audio, and video hoaxes. The term, which refers to both the technology and the resulting fake content, is a portmanteau of "deep learning" and "fake". There are positive uses for deep fake technology, like making digital voices for people who have lost theirs or updating movie footage instead of reshooting it if actors trip over their lines. There has been tremendous progress in the quality of deep fakes since the first products of the technology spread only two or three years ago. Since that time, many of the scariest examples of artificial intelligence (AI)-enabled deep fakes have technology leaders, governments, and the media talking about the dangers they could create for communities.

CHAPTER 3 LITERATURE SURVEY

The explosive growth of deep fake video and its illegal use is a major threat to public trust, justice, and democracy. Due to this, there is an increasing demand for fake video analysis, detection, and intervention.
Reference 1:
Title: MesoNet: a Compact Facial Video Forgery Detection Network
Author Names: Darius Afchar, Vincent Nozick, Junichi Yamagishi, Isao Echizen
Description: This paper presents a method to automatically and efficiently detect face tampering in videos, particularly focusing on two recent techniques used to generate hyper-realistic forged videos: Deepfake and Face2Face. Traditional image forensics techniques are usually not well suited to videos due to the compression that strongly degrades the data. Thus, this paper follows a deep learning approach and presents two networks, Meso4 and MesoInception4, both with a low number of layers, to focus on the mesoscopic properties of images.

Reference 2:
Title: Exposing DeepFake Videos By Detecting Face Warping Artifacts
Author Names: Yuezun Li, Siwei Lyu
Description: In this work, the authors describe a new deep learning based method that can effectively distinguish AI-generated fake videos (DeepFakes) from real videos. It detects artifacts by comparing the generated face areas and their surrounding regions with a dedicated Convolutional Neural Network model. The method is based on the observation that current DF algorithms can only generate images of limited resolution, which then need to be further transformed to match the faces to be replaced in the source video.

Reference 3:
Title: Deepfake Video Detection using Neural Networks
Author Names: Abhijit Jadhav, Abhishek Patange, Jay Patel, Hitendra Patil, Manjushri Mahajan
Description: This paper discusses how free deep learning based software tools have facilitated the creation of convincing face exchanges in videos that leave few traces of manipulation, called "DeepFakes" (DF). Recent advances in deep learning have led to a drastic increase in the realism of fake content and the accessibility with which it can be created.
Creating a deepfake is easy, but detecting DF is a major challenge. The authors detect DF using a combination of a Convolutional Neural Network and a Recurrent Neural Network: the system uses convolutional neural networks to extract features at the frame level, and these features are used to train a recurrent neural network which learns to classify whether a video is fake or real.

Reference 4:
Title: Capsule-Forensics: Using Capsule Networks to Detect Forged Images and Videos
Author Names: Huy H. Nguyen, Junichi Yamagishi, Isao Echizen
Description: This paper discusses an approach that uses a capsule network to detect various kinds of spoofs, from replay attacks using printed images or recorded videos to computer-generated videos produced using deep convolutional neural networks (CNNs).

Advancing techniques in deepfake creation are creating a huge problem, leaving people in a dilemma when trying to identify real from fake news. In our project, a deep learning technique is used to detect whether data is real or fake. We took a large database of deepfake and real images and make predictions on the image data using the Meso4 model: a convolutional neural network with four convolutional blocks followed by one fully connected hidden layer.

CHAPTER 4 REQUIREMENT ANALYSIS

In this phase, requirements are gathered and analyzed. This phase focuses mainly on the users and registered accounts. Meetings with the users and registered people determine requirements such as: Who is going to use the system? How will they use the system? What data should be input into the system? What data should be output by the system? These are the general questions that get answered during the requirement gathering phase. This specifies the requirements that our project should achieve. After requirement gathering, these requirements are analyzed for their validity, and the possibility of incorporating them into the system to be developed is also studied.
As a basis, an article covering the different requirements for software development was taken into account during the process.

4.1 FUNCTIONAL REQUIREMENTS
These are the requirements that the end user specifically requests as basic facilities that the system should provide. All these functions must be included in the system as part of the contract. They are represented in the form of an input to the system, the operation carried out, and the expected output. These are the requirements stated by the user, which one can see directly in the final product, unlike the non-functional requirements.
● Dataset Collection
● Training dataset
● Testing dataset
● User video upload
● Pre-processing
● Data Loader
● ResNeXt CNN for Feature Generation
● LSTM for Sequence Processing
● Prediction

4.2 NON-FUNCTIONAL REQUIREMENTS
These are basically the quality constraints that the system must satisfy according to the project contract. The priority or extent to which these factors are implemented varies from one project to another. They are also called non-behavioral requirements. They basically deal with issues like:
● Portability
● Security
● Maintainability
● Reliability
● Scalability
● Performance
● Reusability
● Flexibility

4.3 SOFTWARE REQUIREMENTS
Software requirements should include both a definition and a specification of requirements. The software requirements provide a basis for creating the software specification. Software requirements are useful in estimating cost, planning team activities, performing functions, and tracing the team's progress throughout the development activity.
OS : Windows/Linux/Mac
Programming Language/Platform : Python
IDE : Google Colab (Web IDE)
Python Libraries : OpenCV, Matplotlib, Tensorflow

4.4 HARDWARE REQUIREMENTS
The hardware requirements may serve as the basis for a contract for the implementation of the system and should therefore be a complete and consistent specification of the whole system. They are used by software engineers as the starting point for the system design.
Processor : Intel i3 and above
RAM : 8GB and higher
Hard Disk : 500GB minimum

CHAPTER 5 DESIGN

5.1 SYSTEM ARCHITECTURE
Fig 5.1 System Architecture

5.1.1 Data Sets
We have collected a database named deepfake_detection, which consists of deepfake and real image folders. The database is huge: it consists of 7104 images. A major part of the data is used for training the model and a minor part for testing it.

5.1.2 Pre-processing
Dataset pre-processing includes splitting the video into frames (images), followed by face detection and cropping each frame to the detected face. To keep the number of images uniform, the average frame count of the video dataset is calculated and a new processed, reframed dataset containing the images is created. Frames that do not contain faces are ignored during pre-processing. Due to the unavailability of the required GPU, we proceeded with our project using image datasets. We collected a huge dataset of deepfake and real images.

5.1.3 Model
The Data Loader loads the pre-processed face-cropped videos/images and splits them into a train set and a test set. Further, the images of the processed videos are passed to the model for training and mini-batch testing.

5.1.4 Prediction
Our system takes a batch of images as input and makes predictions on the image data. It predicts whether images are deepfake or real.
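The frame-count balancing step described in 5.1.2 can be sketched as follows (a hypothetical helper, not the project's actual pre-processing script): given the frame counts of all videos, it computes the dataset-wide average and returns evenly spaced frame indices to keep from each video.

```python
def uniform_frame_indices(frame_counts):
    """Videos longer than the average frame count are evenly subsampled
    down to the average; shorter videos keep all of their frames."""
    target = sum(frame_counts) // len(frame_counts)    # average frame count
    plans = []
    for count in frame_counts:
        n = min(target, count)                         # cannot keep more frames than exist
        step = count / n                               # spacing between kept frames
        plans.append([int(i * step) for i in range(n)])
    return plans

# Three videos of different lengths; the average is 200 frames
plans = uniform_frame_indices([150, 200, 250])
```

Face detection and cropping (e.g. with OpenCV, which is listed in the software requirements) would then be applied only to the frames selected by these index plans.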
5.2 MESO4
This network begins with a sequence of four layers of successive convolutions and pooling, and is followed by a dense network with one hidden layer. To improve generalization, the convolutional layers use ReLU activation functions, which introduce non-linearities, and Batch Normalization to regularize their output and prevent the vanishing gradient effect; the fully connected layers use Dropout to regularize and improve their robustness.

Sigmoid function: it is a logistic function, a non-linear activation function. The main reason we use the sigmoid function is that its value lies between 0 and 1.

Fig 5.2 The network architecture of Meso4. Layers and parameters are displayed in the boxes, output sizes next to the arrows.

CHAPTER 6 CODING

6.1 DATASETS DOWNLOAD
To begin with, let's import our datasets.
#download the dataset deepfake_database.zip
#!gdown https://e.pcloud.link/publink/show?code=XZnsxkZkEAgI1OgQIJHLnNl9ErhV4vpHuV0
In our project we use the datasets in the Validations folder, which consists of deepfake and real folders. Download all the images in the two folders and upload them to Google Drive.

6.2 CODING – MESONET NEURAL NETWORK
Mount the Google Drive in which the dataset is present.
Import all the required libraries.
The image dimensions are height, width, and channels (red, green, blue).
Create a classifier class.
Create the Mesonet network, which consists of four successive convolutional and pooling layers followed by a dense network with one hidden layer.
Instantiate the Mesonet model with pretrained weights.
Rescale pixel values (between 0 and 255) to a range between 0 and 1.
Instantiate a generator to feed images through the network.
Generate class indices: two classes were found, Deepfake with index value 0 and Real with index value 1.
Do predictions on the image data.
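The rescaling and class-index conventions above can be illustrated with a small NumPy sketch (hypothetical values; in the project itself the rescaling is done by a Keras image generator):

```python
import numpy as np

CLASS_INDICES = {'deepfake': 0, 'real': 1}   # indices reported by the generator

# Rescale 8-bit pixel values (0-255) to the 0-1 range the network expects
pixels = np.array([0, 128, 255], dtype=np.float32)
scaled = pixels / 255.0

def sigmoid(z):
    # logistic function: squashes any score into the (0, 1) interval
    return 1.0 / (1.0 + np.exp(-z))

def label_from_score(p, threshold=0.5):
    # scores near 1 mean 'real' (index 1), scores near 0 mean 'deepfake' (index 0)
    return 'real' if p >= threshold else 'deepfake'

score = sigmoid(2.0)           # a hypothetical network output before thresholding
label = label_from_score(score)
```

Because the final Dense layer of Meso4 already applies a sigmoid, the model's output per image is a single value in (0, 1) that only needs thresholding to become a class label.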
Create separate lists for correctly classified and misclassified images: correct_real, misclassified_real, correct_deepfake, and misclassified_deepfake.
Generate predictions on the validation set and store them in the separate lists. Our database consists of 7104 images; predictions are made on all of them.
Use the plotter function, which takes a batch of images as input and displays the predictions made on them.

CHAPTER 7 IMPLEMENTATION AND RESULTS

7.1 IMPLEMENTATION
7.1.1 Convolution in Convolutional Neural Networks
Below are the layers of the Meso4 model. The Meso4 model has four convolutional layers and a hidden dense layer.

Fig 7.1 Convolutional and Hidden layers of Meso4 Network Model.

The convolutional neural network, or CNN for short, is a specialized type of neural network model designed for working with two-dimensional image data, although it can also be used with one-dimensional and three-dimensional data. Central to the convolutional neural network is the convolutional layer that gives the network its name. This layer performs an operation called a "convolution". In the context of a convolutional neural network, a convolution is a linear operation that involves the multiplication of a set of weights with the input, much like a traditional neural network. Given that the technique was designed for two-dimensional input, the multiplication is performed between an array of input data and a two-dimensional array of weights, called a filter or a kernel. The filter is smaller than the input data, and the type of multiplication applied between a filter-sized patch of the input and the filter is a dot product. A dot product is an element-wise multiplication between the filter-sized patch of the input and the filter, which is then summed, always resulting in a single value. Because it results in a single value, the operation is often referred to as the "scalar product".
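The patch-wise dot product described above can be sketched in a few lines of NumPy (an illustrative "valid" convolution on a tiny array, not the Keras implementation used by the model):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over every filter-sized patch of the image
    ('valid' positions only) and take the dot product at each stop."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)   # element-wise multiply, then sum
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((3, 3))                         # a simple summing filter
feature_map = conv2d_valid(image, kernel)        # shape (2, 2): one scalar per patch
```

Each output cell is a single scalar, which is why the text calls the operation a scalar product; a Conv2D layer learns the kernel values during training instead of fixing them as here.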
Using a filter smaller than the input is intentional, as it allows the same filter (set of weights) to be multiplied by the input array multiple times at different points on the input. Specifically, the filter is applied systematically to each overlapping filter-sized patch of the input data, left to right, top to bottom. This systematic application of the same filter across an image is a powerful idea. If the filter is designed to detect a specific type of feature in the input, then applying that filter systematically across the entire input image allows the filter an opportunity to discover that feature anywhere in the image. This capability is commonly referred to as translation invariance, e.g., the general interest is in whether the feature is present rather than where it is present.

x1 = Conv2D(8, (3, 3), padding='same', activation='relu')(x)
x1 = BatchNormalization()(x1)
x1 = MaxPooling2D(pool_size=(2, 2), padding='same')(x1)

x2 = Conv2D(8, (5, 5), padding='same', activation='relu')(x1)
x2 = BatchNormalization()(x2)
x2 = MaxPooling2D(pool_size=(2, 2), padding='same')(x2)

x3 = Conv2D(16, (5, 5), padding='same', activation='relu')(x2)
x3 = BatchNormalization()(x3)
x3 = MaxPooling2D(pool_size=(2, 2), padding='same')(x3)

x4 = Conv2D(16, (5, 5), padding='same', activation='relu')(x3)
x4 = BatchNormalization()(x4)
x4 = MaxPooling2D(pool_size=(4, 4), padding='same')(x4)

y = Flatten()(x4)
y = Dropout(0.5)(y)
y = Dense(16)(y)
y = LeakyReLU(alpha=0.1)(y)
y = Dropout(0.5)(y)
y = Dense(1, activation='sigmoid')(y)

Conv2D: Conv2D is a 2D convolutional layer, which creates a convolution kernel that is convolved with the layer's input to produce a tensor of outputs. The mandatory Conv2D parameter is the number of filters that the convolutional layer will learn; it is an integer value and determines the number of output filters in the convolution. In the first Conv2D layer we used 8 filters.
Kernel: In image processing, a kernel is a convolution matrix that can be used for blurring, sharpening, embossing, edge detection, and more, by performing a convolution between the kernel and an image. The (3, 3) argument determines the dimensions of the kernel: it is a tuple of 2 integers specifying the height and width of the 2D convolution window.

Padding: The padding parameter of the Keras Conv2D class can take one of two values: 'valid' or 'same'. By setting the value to 'same' we preserve the spatial dimensions of the volume, so that the output volume size matches the input volume size.

Activation: The activation parameter of the Conv2D class is simply a convenience parameter that allows you to supply a string specifying the name of the activation function to apply after performing the convolution.

Batch Normalization: It allows every layer of the network to learn more independently. It is used to normalize the output of the previous layers. With batch normalization, learning becomes more efficient, and it can also act as regularization to avoid overfitting of the model.

Max pooling: Max pooling reduces the spatial dimensions of the output volume. In the pooling layer we significantly reduce the dimensionality of our data, which greatly speeds up computation. Max pooling means we reduce a region of pixel values to that region's maximum value.

Dropout: To prevent overfitting we use a dropout layer. It deactivates some neurons at random, so backpropagation updates only the active features.

Sigmoid: It is the logistic function, a non-linear activation function. The main reason we use the sigmoid function is that its output lies between 0 and 1.

Let's train our mesoscopic models. We'll start with the Meso4 model. The architecture is remarkably simple, involving just four sets of convolutions.
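The two-stage training schedule used in this project (learning rate 0.002, then a lower rate of 2e-4) can be sketched as below. To keep the sketch self-contained and quick to run, a tiny stand-in model and random data replace Meso4 and the face-crop dataset, and the epoch counts are shortened; in the real runs the stages are 30 and 20 epochs:

```python
import numpy as np
from tensorflow.keras.layers import Input, Conv2D, Flatten, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Tiny stand-in model and synthetic data so the schedule runs end to end;
# in the actual project, `model` is Meso4 and the data are face crops.
inp = Input(shape=(32, 32, 3))
h = Conv2D(4, (3, 3), activation='relu')(inp)
h = Flatten()(h)
out = Dense(1, activation='sigmoid')(h)
model = Model(inp, out)

x = np.random.rand(16, 32, 32, 3).astype('float32')
y = np.random.randint(0, 2, size=(16, 1))

# Stage 1: learning rate 0.002 (30 epochs in the project; 2 here for brevity)
model.compile(optimizer=Adam(learning_rate=0.002),
              loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x, y, epochs=2, verbose=0)

# Stage 2: continue at the lower learning rate 2e-4 (20 epochs in the project)
model.compile(optimizer=Adam(learning_rate=2e-4),
              loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(x, y, epochs=2, verbose=0)
```

Dropping the learning rate for the second stage lets the optimizer take smaller steps once the loss has stopped falling quickly, trading a little extra training time for a more stable final model.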
Note that since this model has not been pre-trained, it is reasonable to train it over more iterations to compensate. This was achieved by training at a learning rate of 0.002 for 30 epochs, followed by another 20 epochs at a lower learning rate of 2e-4. This schedule helps to balance convergence against overall training time.

With a validation accuracy approaching 70%, this is a good starting point. The difference in performance can be attributed to a multitude of factors, but the latter network's use of so-called "Inception" blocks stands out. Essentially, these self-contained blocks contain convolutional layers with small-sized filters in parallel, with the results being pooled and concatenated at the end of the block.

7.2 RESULTS AND OUTPUT SCREENS

We created four lists: correct_real, misclassified_real, correct_deepfake, and misclassified_deepfake. The plotter function takes a batch of images as input and runs predictions. In the first output it selects a batch of images from the real dataset and runs predictions, returning the images whose prediction value lies between 0.6000 and 1.

Output Screen 1: Correct_real images

It selects a batch of images from the real dataset and runs predictions, returning the misclassified real images whose prediction value lies between 0.0001 and 0.5999.

Output Screen 2: Misclassified_real images

It selects a batch of images from the fake dataset and runs predictions, returning the correctly classified fake images whose prediction value lies between 0.0001 and 0.5999.

Output Screen 3: Correct_fake images

It selects a batch of images from the fake dataset and runs predictions, returning the misclassified fake images whose prediction value lies between 0.6000 and 1.

Output Screen 4: Misclassified_fake images

Our model gives 70% accuracy, so some images may be misclassified.

CHAPTER 8 TESTING

Software testing is the evaluation of the software against the requirements gathered from users and the system specifications.
Testing is conducted based on the software development life cycle. Software is validated and verified in the process of testing.

8.1 SOFTWARE VALIDATION

Software validation is the process of examining whether the software satisfies the user requirements. It is carried out at the end of the software development life cycle. Validation ensures the product under development meets the user requirements and answers the question: "Are we developing the product which attempts all that the user needs from this software?" Validation emphasizes user requirements.

8.2 SOFTWARE VERIFICATION

Verification is the process of confirming that the software meets the business requirements and is developed adhering to the proper specifications and methodologies. Verification cannot be done merely by running the software (for example, how could anyone know whether the architecture or design is correctly implemented just by running it?); only by reviewing the associated artifacts can someone conclude that the specifications are met. Verification ensures the product being developed is according to the design specifications and concentrates on the design and system specifications.

8.3 TARGETS OF THE TEST

The target of testing is to determine the errors, faults, and failures in the software.

Error - These are actual coding mistakes made by developers. In addition, a difference between the output of the software and the desired output is considered an error.

Fault - When an error exists, a fault occurs. A fault, also known as a bug, is the result of an error and can cause the system to fail.

Failure - Failure is the inability of the system to perform the desired task. Failure mostly occurs due to faults in the system.

Software testing is further classified into different types such as unit testing, integration testing, system testing, and regression testing.
8.4 BLACK-BOX TESTING

Black-box testing is a software testing method in which the internal structure, design, and implementation of the item being tested are not known to the tester. It is carried out to test the functionality of the program and is also called 'behavioural' testing. The tester has a set of input values and the corresponding desired results. On providing the input, if the output matches the desired results, the program is tested 'ok', and problematic otherwise. In this method, the design and structure of the code are not known to the tester; testing engineers and end users conduct this test on the software.

8.5 WHITE-BOX TESTING

White-box testing is a software testing method in which the internal structure, design, and implementation of the item being tested are known to the tester. It is conducted to test the program and its implementation, in order to improve code efficiency or structure. It is also known as 'structural' testing; the programmers of the code conduct this test. Some white-box testing techniques are:

Control-flow testing - The purpose of control-flow testing is to set up test cases which cover all statements and branch conditions. The branch conditions are tested for both true and false, so that all statements can be covered.

Data-flow testing - This technique aims to cover all the data variables included in the program. It tests where the variables were declared and defined, and where they were used or changed.

8.6 TESTING LEVELS

Tests are grouped together based on where they are added in the SDLC or by the level of detail they contain. In general, there are four levels of testing: unit testing, integration testing, system testing, and acceptance testing. The purpose of testing levels is to make software testing systematic and to easily identify all possible test cases at a particular level.
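As a black-box illustration for this project, a tester who only knows the 0.6 decision threshold from the results chapter could write input/output checks like the following. The classify_prediction helper is hypothetical, written here only to show the idea, and is not part of the project code:

```python
# Hypothetical helper mirroring the 0.6 threshold used in Chapter 7:
# sigmoid scores of 0.6 and above are labelled 'real', lower scores 'deepfake'.
def classify_prediction(score: float) -> str:
    if not 0.0 <= score <= 1.0:
        raise ValueError("sigmoid output must lie in [0, 1]")
    return "real" if score >= 0.6 else "deepfake"

# Black-box tests: only inputs and expected outputs,
# with no knowledge of the function's internals.
assert classify_prediction(0.95) == "real"       # valid input, real class
assert classify_prediction(0.6) == "real"        # boundary value
assert classify_prediction(0.5999) == "deepfake" # just below the boundary
assert classify_prediction(0.0001) == "deepfake" # valid input, fake class
```

Note how the cases probe the boundary at 0.6 from both sides; boundary-value analysis is a standard black-box technique.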
There are many different testing levels which help to check the behavior and performance of the software. These testing levels are designed to recognize missing areas and reconcile the states of the development life cycle. SDLC models have characterized phases such as requirement gathering, analysis, design, coding or execution, testing, and deployment; all of these phases go through the process of software testing levels.

Unit Testing

Functional Testing

Functional testing is centered on the following items:

Valid Input: identified classes of valid input must be accepted.

Invalid Input: identified classes of invalid input must be rejected.

Functions: identified functions must be exercised.

Output: identified classes of outputs must be exercised.

Performance Testing

Integration Testing

System Testing

CHAPTER 9 ADVANTAGES, DISADVANTAGES, AND APPLICATIONS

9.1 ADVANTAGES

Though it is harmful to society, deepfake technology also has a few advantages:

It can be used to recreate the voices of deceased people.

It draws remarkable attention online, making manipulated web pages popular on search engines such as Google, as many people search for such sensational topics.

Another factual advantage of deepfakes is that they make us aware of such fakery, reminding us that we should not believe everything we see around us.

One of the main advantages is that this technology is widely used in the film industry.

9.2 DISADVANTAGES

Rather than benefiting anyone, this artificial intelligence technology has disadvantages that affect different segments of our society. Apart from creating fake news and propaganda, deepfakes are mostly used for revenge porn to smear notable celebrities.
Until an official statement from the targeted personality comes out, many people start believing the fake, making that person's life difficult, especially when they are criticized or attacked by their fans via social media platforms such as Facebook, Twitter, or Instagram.

Voice or image manipulation can subvert authentication processes.

Evidence can be forged in criminal proceedings (manipulating the initial situation, activities, and so on).

Body movements can be manipulated, and identity theft can occur.

9.3 APPLICATIONS

Pornography: Deepfakes are used for revenge porn to defame notable celebrities. Until an official statement from the targeted personality comes out, many people start believing the fake.

Morphing:

⮚ Computer-generated special effects for sound, image, and video recordings.

⮚ Calculation of intermediate changes between single images or sounds. The complex process consists of: 1. Warping, 2. Tweening (intermediate pictures), 3. Cross-dissolving.

CHAPTER 10 CONCLUSION AND FUTURE ENHANCEMENT

10.1 CONCLUSION

Although there are a few techniques for detecting deepfakes, we cannot rely on these methods forever. With deepfake creation advancing day by day, it will become much harder to detect fakes. Even with blockchain technologies in the near future, blockchains remain vulnerable to sophisticated cyberattacks that can compromise their integrity and reliability. Deepfakes have begun to erode people's trust in media content, as seeing is no longer commensurate with believing. They can cause distress and negative effects to those targeted, increase misinformation and hate speech, and even spark political tension, inflame the public, or provoke violence or war. Deep learning techniques such as CNNs are increasingly used for the detection of deepfake videos.
The proposed solution, however, has higher efficiency over narrow areas of application, such as in a small business enterprise or for an individual's use, and concentrates on working in less time with access to limited resources (such as no internet access, or 8 GB of RAM or less). But the proposed solution can only work with the existing 7104 images (extracted from 175 videos); if new deepfakes are created, this solution might not give the best possible outcome.

10.2 FUTURE SCOPE FOR ENHANCEMENT

In the future, the advanced technology of blockchain can be used for better deepfake detection. The potential role of blockchain is that images or videos can be cryptographically signed by multiple parties at the source of origin. A cryptographic hash can be assigned to any video at the time of recording. By using blockchain's immutability feature, the hash data, once entered, cannot be modified. For every instance of uploading, editing, or downloading a video, a smart contract can be written after validation by the original parties. This ensures the integrity of the video and improves traceability. The hash data can be compared to the source at every stage of the video's life; if there is any mismatch between the two, we know that the video has been altered. Consider an example where police officers and investigators use video cameras to record crime-scene details. The video is assigned unique hash data (which in this case acts as a fingerprint) with every person present, and this data is written to the blockchain as a smart contract with validation from each member. Each download, upload, and share instance is then checked against the original data to verify its authenticity. Thus, video manipulation cases can be significantly reduced using blockchain technology.

REFERENCES
1. Darius Afchar, Vincent Nozick, Junichi Yamagishi, Isao Echizen. MesoNet: a Compact Facial Video Forgery Detection Network.
2. Yuezun Li, Siwei Lyu. Exposing DeepFake Videos By Detecting Face Warping Artifacts.
3. Abhijit Jadhav, Abhishek Patange, Jay Patel, Hitendra Patil, Manjushri Mahajan. Deepfake Video Detection using Neural Networks.
4. Huy H. Nguyen, Junichi Yamagishi, Isao Echizen. Using Capsule Networks to Detect Forged Images and Videos.
5. J. Thies, M. Zollhofer, M. Stamminger, C. Theobalt, and M. Nießner. Face2Face: Real-time face capture and reenactment of RGB videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2387-2395, 2016.
6. Yuezun Li, Ming-Ching Chang, and Siwei Lyu. Exposing AI Created Fake Videos by Detecting Eye Blinking.
7. An Overview of ResNet and its Variants: https://towardsdatascience.com/an-overview-of-resnet-and-its-variants-5281e2f56035
8. Datasets: https://e.pcloud.link/publink/show?code=XZnsxkZkEAgI1OgQIJHLnNl9ErhV4vpHuV0
9. K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
10. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929-1958, 2014.
11. A. Rossler, D. Cozzolino, L. Verdoliva, C. Riess, J. Thies, and M. Nießner. FaceForensics: A large-scale video dataset for forgery detection in human faces. arXiv preprint arXiv:1803.09179, 2018.
12. B. Bayar and M. C. Stamm. A deep learning approach to universal image manipulation detection using a new convolutional layer.
13. D. E. King. Dlib-ml: A machine learning toolkit. Journal of Machine Learning Research, 10:1755-1758, 2009.