Online Exam Portal Thesis Report

A Thesis/Project/Dissertation Report
Submitted in partial fulfillment of the
requirement for the award of the degree of
Bachelor of Engineering in
Computer Science Engineering
Under The Supervision of
Name of Supervisor : C Ramesh Kumar
Designation : Professor
Submitted By
Zeeshan Nafees (18021011725)
Anurag Singh (18021011477)
I/We hereby certify that the work which is being presented in the thesis/project/dissertation,
entitled “CAPS….”, in partial fulfillment of the requirements for the award of the Bachelor of
Technology in Computer Science Engineering, submitted in the School of Computing Science and
Engineering of Galgotias University, Greater Noida, is an original work carried out during the
period of month, Year to Month and Year, under the supervision of Name… Designation,
Department of Computer Science and Engineering/Computer Application and Information and
Science, of School of Computing Science and Engineering, Galgotias University, Greater Noida.
The matter presented in the thesis/project/dissertation has not been submitted by me/us for the
award of any other degree of this or any other place.
Zeeshan Nafees, 18SCSE1010497
Anurag Singh, 18SCSE1010238
This is to certify that the above statement made by the candidates is correct to the best of my
knowledge.
Supervisor Name: C. Ramesh Kumar
The Final Thesis/Project/Dissertation examination of Zeeshan Nafees: 18SCSE1010497,
Anurag Singh: 18SCSE1010238 has been held on _________________ and
their work is recommended for the award of Bachelor of Technology (Computer Science
Engineering).
Signature of Examiner(s)
Signature of Project Coordinator
Signature of Supervisor(s)
Signature of Dean
Place: Greater Noida
First, we would like to express our sincere gratitude to our thesis advisor, Mr. C Ramesh Kumar,
for his constant support throughout this research project. This thesis would not be as solid without
his valuable feedback and support. We would also like to thank our project manager, V. Arul sir, for
giving us this opportunity in the college project, and the experts who participated in the
validation of this research; without their participation this research could not have been completed
successfully. We would also like to acknowledge the second reader of this thesis, and we are
gratefully indebted to him for his valuable comments on this thesis. Finally, we would like to express
our profound gratitude to our parents for their unfailing support throughout
our years of study and through the process of researching and writing this thesis. This
accomplishment would not have been possible without them.
Deep fake technology has become a hot field of research over the last few years.
Researchers investigate sophisticated Generative Adversarial Networks (GAN), autoencoders, and other
approaches to establish precise and robust algorithms for face swapping. The results achieved so far
show that the unsupervised deep fake synthesis task has problems in terms of the visual quality of the
generated data. These problems usually lead to high fake detection accuracy when an
expert analyzes the results. The first problem is that existing image-to-image approaches do not
consider video domain specificity, and frame-by-frame processing leads to face jittering and other
clearly visible distortions. Another problem is the resolution of the generated data, which is low for many
existing methods because of high computational complexity. The third problem appears
when the source face has larger proportions (like bigger cheeks),
and after replacement this becomes visible on the face border. Our main goal was to develop
an approach that could solve these problems and outperform existing solutions on a number
of key metrics. We present a new face swap pipeline that is based on the FaceShifter
architecture and fixes the problems stated above. A new eye loss function, a super-resolution
block, and Gaussian-based face mask generation lead to improvements in quality, which are
confirmed during evaluation.
Keywords: Generative Adversarial Networks (GAN), FaceShifter, eye loss, super resolution.
Candidates Declaration
List of Table
List of Figures
Chapter 1 Introduction
1.1 Introduction
1.2 Formulation of Problem
1.2.1 Tool and Technology Used
Chapter 2 Literature Survey/Project Design
Chapter 3 Functionality/Working of Project
Chapter 4 Results and Discussion
Chapter 5 Conclusion and Future Scope
5.1 Conclusion
5.2 Future Scope
List of Table
Database design of Online Examination Portal
List of Figures
System design overview of Online Examination Portal
System implementation of Online Exam Portal
Sequence diagram of Online Exam Portal
Data Flow Diagram
Context Diagram
CHAPTER-1 Introduction
1.1 Introduction
Nowadays a lot of visual content is used over the internet for different purposes. A large
number of graphical tools for working with visual data such as images and videos are
available for free. This creates new tasks for people who process the
data: quality enhancement, compression, restoration and coloring of old photos, and so on.
We can see that the progress of artificial intelligence (AI) no longer stays
only within the scientific research field, but leaves this scope and makes
state of the art (SOTA) AI techniques available to everyone through easy-to-use
applications and social networks. Recently we moved forward from a standard
list of image processing tasks to such applications as background replacement [5] (e.g.,
in video conferences, photo editing tools, etc.), make-up style transfer [2], facial
attributes correction [10], hairstyle transfer [11], face/head swap (Snapchat) [14] and others. The last
application has become of great interest to researchers for different purposes, especially for movie/clip
making and entertainment needs. Face swap in general is a procedure of taking two visual
data inputs (source and target) and replacing the target face with the source face.
By visual data inputs we mean images and videos, so different input combinations
are used. The most commonly used combinations are source and target images, and
source images with a target video. One of the well-known examples of deep fake technology
appeared in 2019, when a viral scene from Home Alone with Macaulay Culkin's face swapped with
Sylvester Stallone's was distributed over social networks, and this started the rise of
deep fake popularity. In 2021 the Deep Tom Cruise TikTok account attracted the attention of
the audience with high-quality deep fake videos. These videos were
synthesized [15] not only with SOTA generative deep learning face swap
algorithms but also with several post-processing steps and an actor with a similar face,
and the result looks impressive. In this report, we present our new face swap approach
that can be used for image-to-image and image-to-video replacement tasks. Our
goal was to create a general pipeline for both input combinations and produce a high-quality
final result. As a baseline approach, we used the FaceShifter [6] model, which we updated
with a new loss function and minor architecture improvements. In addition, extra
post-processing steps were developed and embedded in the pipeline to obtain result images or videos
with high resolution.
1.2.1 Tool and Technology Used
A. System Requirements
PyTorch >= 1.7.0 (some checkpoints require a bug fix from PyTorch 1.7 - the current
master branch)
Torchvision >= 0.6.0
NVIDIA Apex (only if you want to train any models - not needed for inference)
Python >= 3.6
B. Hardware Requirements
Processor: Core i3 and above
Processor Speed: 3.2 GHz
RAM: 16 GB and above
Hard Disk: 2.99 GB
Monitor: Color
Chapter 2 -Literature Survey/Project Design
One of the first face swap approaches is based on a rather simple idea using
autoencoders [4]. To apply autoencoders to face swap, the authors train two autoencoders:
one for the person we want to transfer (source), and one for the
target. Both autoencoders share a common encoder but have different decoders. This model
is trained in a simple way, except that in some cases distortions are added to
the input images to prevent the model from overfitting.
DeepFaceLab [8] is a well-known SOTA solution based on an autoencoder implementation.
In the paper the authors develop the idea mentioned above and also add many
other features for better transfer: additional training loss functions, a slightly
changed architecture, additional augmentations, and many other improvements. The main problem of
this approach is that the model must be trained on source and target face images each time
one wants to do a face swap. So this approach can hardly be applied to real-time
tasks or when we do not have large datasets for training.
Another approach that can process an image without retraining is the First Order Motion
Model (FOMM) [12], as well as its development - Motion Co-Segmentation [13]. This method
does not generate a new image from scratch but, in a sense,
translates one frame into another using trainable segmentation maps and affine transformations. A
dataset of a large number of videos was used as training data, and the
training process was built on frames from one video, where the authors tried
to translate one frame into another. Although the idea looks promising, the visual quality
of the synthesized results is far from perfect, since the model adapts very poorly to
face rotations, so the face is hard to recognize after the transfer.
The next model [17] is based on the use of face keypoints for generation. Moreover,
this approach is the first in our review that is based on Generative Adversarial Networks
(GAN). The main idea of such networks is that we use two networks - a
generator and a discriminator. The generator learns to generate realistic images,
and the discriminator tries to distinguish the generated images from the real ones.
The training process consists of two scenarios: first, the generator tries to
beat the discriminator, and second, the discriminator tries to identify the synthesized images.
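The two training scenarios can be illustrated with a minimal sketch of the standard GAN objectives. This is a generic binary cross-entropy formulation, not necessarily the loss used by [17] or by our pipeline; the function names and the use of plain probability lists instead of network outputs are assumptions made for illustration.

```python
import math

EPS = 1e-12  # avoids log(0); an illustrative constant


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy for the discriminator.

    d_real: discriminator probabilities for real images (should approach 1).
    d_fake: discriminator probabilities for generated images (should approach 0).
    """
    real_term = -sum(math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: the generator is rewarded when the
    discriminator scores its images as real (probabilities close to 1)."""
    return -sum(math.log(p + EPS) for p in d_fake) / len(d_fake)
```

In an actual training loop the two losses are minimized in alternation: one step updates the discriminator with the generator fixed, the next updates the generator with the discriminator fixed.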
In the next article the authors [6] developed a model that shows some of the best results
on face swap evaluation metrics. Two models are used in the proposed approach.
The first model, the Adaptive Embedding Integration Network (AEI-Net), is used to perform
the face transfer itself, and the second, the Heuristic Error Acknowledging Network (HEAR-Net),
is used to improve the quality of the resulting transfer. We will describe AEI-Net in a
little more detail later because it was our baseline.
In 2021 the HifiFace model [16] was proposed as a high-quality face swap method. This
model can preserve the face shape of the source face and generate photorealistic results. The authors
use a 3D shape-aware identity to control the face shape instead of keypoint-based and
feature-based methods for face areas. The method shows good results and
preserves face identity with high quality.
The last but not least model that we would like to mention is the SimSwap model [3].
Conceptually, the model is quite similar to FaceShifter [6]; the difference is in using
a single common model architecture instead of two different models.
Even though we can see very different approaches used to achieve ideal visual
quality of generated images, each method has its own pros and cons. In our
research we tried to increase the quality of the generated images and at the same
time to overcome some problems we found in recent articles and models. Below we give a
technical report of our solution and evaluation results to compare the proposed
model with SOTA architectures.
Chapter 3- Functionality/Working of Project
Loss Function
Choosing the right loss for the model is an essential step, since it tells us exactly
what we want to achieve. In order to beat the baseline approach we improved the
loss function that was used by adding some extra features. This update gave our model
better performance in terms of quality. Here $X_s$ is the source image, $X_t$ is the target
image, and $\hat{Y}_{s,t}$ is the generated face swap result. The list of baseline loss function
components is as follows:
• $L_{id}$ is the identity loss. We expect the Identity
Encoder outputs for $\hat{Y}_{s,t}$ and $X_s$ to be close.
• $L_{adv}$ is the GAN loss based on the discriminator
values (adversarial loss).
• $L_{rec}$ is the reconstruction loss. We randomly use $X_s = X_t$ as the
model input and expect the output to
be $\hat{Y}_{s,t} = X_t$.
• $L_{att}$ is the attribute loss. We require the
$z^1_{att}, z^2_{att}, ..., z^n_{att}$ values for $\hat{Y}_{s,t}$ and $X_t$ to be close.
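For reference, these components can be written out in the form used by the FaceShifter paper [6], which is our baseline. The swap result is written as $\hat{Y}_{s,t}$, and $z_{id}(\cdot)$ and $z^k_{att}(\cdot)$ denote the identity-encoder and attribute-encoder outputs. The weights $\lambda_{att}$, $\lambda_{id}$ and $\lambda_{rec}$ are hyperparameters whose values are not stated in this report, so this is a sketch of the general shape rather than the exact configuration we trained with.

```latex
\begin{align}
L_{id}  &= 1 - \cos\!\big(z_{id}(\hat{Y}_{s,t}),\, z_{id}(X_s)\big) \\
L_{att} &= \frac{1}{2}\sum_{k=1}^{n} \big\lVert z^{k}_{att}(\hat{Y}_{s,t}) - z^{k}_{att}(X_t) \big\rVert_2^2 \\
L_{rec} &= \begin{cases}
             \frac{1}{2}\big\lVert \hat{Y}_{s,t} - X_t \big\rVert_2^2 & \text{if } X_s = X_t \\
             0 & \text{otherwise}
           \end{cases} \\
L       &= L_{adv} + \lambda_{att} L_{att} + \lambda_{id} L_{id} + \lambda_{rec} L_{rec}
\end{align}
```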
Let us proceed to our loss modifications. First, we modified the reconstruction
loss using the idea from the SimSwap [3] architecture. In the original, this loss
meant that if we give the model two identical images of a person,
we do not want the model to change the image in any way. However, we went
further here and did not require $X_s = X_t$; it was enough that $X_t$ and $X_s$ belong to the same
person. In this case we required $X_t$ not to be changed at all as a result of the transfer.
Since we used datasets in which every person was represented by several frames, it became
possible to implement such a modification of the loss.
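The modified condition can be sketched as follows. This is a minimal illustration, assuming images are flattened to lists of pixel values and that each training image carries a person identifier; `source_person` and `target_person` are hypothetical names, and the 0.5 factor follows the common squared-L2 convention rather than a value stated in this report.

```python
def reconstruction_loss(output, target, source_person, target_person):
    """Reconstruction loss with the relaxed same-person condition.

    The penalty is active whenever source and target show the same person
    (not only when they are the same image); otherwise it is zero.
    output, target: flattened images as equal-length lists of floats.
    """
    if source_person != target_person:
        return 0.0
    # Half of the squared L2 distance between the swap result and the target.
    return 0.5 * sum((o - t) ** 2 for o, t in zip(output, target))
```

With this condition, two different frames of the same person still activate the loss, which is possible only because the datasets provide several frames per person.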
Another important modification was based on the intuition that the eyes are a crucial
component in the visual perception of the face swap output, especially when we use image-to-video transfer.
In this case every single frame should show the same gaze direction for realistic perception.
Therefore, we decided to add a special eye loss function, which was obtained during experiments. It
is based on an L2 comparison of eye area features between $X_t$ and $\hat{Y}_{s,t}$, evaluated using
a face keypoint detection model.
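One way to realize such a comparison is sketched below. The report does not specify what the eye area features are, so this example assumes they are heatmaps rendered from eye keypoint coordinates; the heatmap size, the Gaussian width `sigma`, and both function names are illustrative assumptions, and in practice the keypoints would come from a detection model rather than being passed in by hand.

```python
import math


def eye_heatmap(eye_points, size=16, sigma=1.5):
    """Render a size x size heatmap with a Gaussian bump at each (x, y) eye keypoint."""
    heatmap = [[0.0] * size for _ in range(size)]
    for ex, ey in eye_points:
        for y in range(size):
            for x in range(size):
                d2 = (x - ex) ** 2 + (y - ey) ** 2
                value = math.exp(-d2 / (2.0 * sigma ** 2))
                heatmap[y][x] = max(heatmap[y][x], value)
    return heatmap


def eye_loss(target_eyes, swapped_eyes, size=16, sigma=1.5):
    """Mean squared (L2) difference between target and swap-result eye heatmaps."""
    h_t = eye_heatmap(target_eyes, size, sigma)
    h_y = eye_heatmap(swapped_eyes, size, sigma)
    total = sum((a - b) ** 2
                for row_t, row_y in zip(h_t, h_y)
                for a, b in zip(row_t, row_y))
    return total / (size * size)
```

The loss is zero when the eye keypoints of the swap result coincide with those of the target and grows as they drift apart, which penalizes a changed gaze direction.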
Image-to-Video Improvement
When we perform face swap from image to video, we save the transformation matrix for the faces
extracted from the frames. This information helps us insert the already changed face into its
original place in the frame. However, if we insert the whole image obtained by our
model, visual artifacts usually appear at the edge of the inserted area in the original frame
and are clearly visible. This effect occurs both because the
brightness of the source image and the target frame do not fully match, and because of the possible
blurring of the image synthesized by our model. Therefore, it is necessary to ensure
a smooth transition from the source image to the resulting frame. For this purpose we use
segmentation masks.
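The role of the saved matrix can be sketched as follows, assuming it is a 2x3 affine matrix that maps frame coordinates to the aligned face crop (the report does not state the matrix form, so this is an assumption). Inverting it maps pixels of the generated face back to their original place in the frame. The function names are illustrative.

```python
def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix [[a, b, tx], [c, d, ty]] to an (x, y) point."""
    (a, b, tx), (c, d, ty) = matrix
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


def invert_affine(matrix):
    """Return the 2x3 affine matrix that undoes the given one."""
    (a, b, tx), (c, d, ty) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("affine matrix is not invertible")
    ia, ib = d / det, -b / det
    ic, id_ = -c / det, a / det
    # The inverse translation is the negated, inverse-rotated original translation.
    itx = -(ia * tx + ib * ty)
    ity = -(ic * tx + id_ * ty)
    return [[ia, ib, itx], [ic, id_, ity]]
```

For example, a point taken from the frame, mapped into the crop with the saved matrix, and mapped back with the inverted matrix returns to its original coordinates.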
A face mask is just a binary image that determines which pixels belong to the
face and which do not. In this way, we can determine the exact location of the face and make a
precise contour crop. To soften the effect of such a sharp face area transfer, we add Gaussian blurring
at the edges. The result of this modification is presented in Fig. 2. It can
also be observed that not only has blurring been added to the mask, but the mask area
has changed as well. This is another modification we implemented to solve the transfer problem for faces
with distinct proportions. We explain it below.
Face mask blurring effect in terms of face swap result
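The mask softening and blending step can be sketched in plain Python as below. It assumes grayscale images stored as lists of rows with values in [0, 1]; the kernel radius and `sigma` are illustrative values, not the settings used in our pipeline, and a real implementation would use an image-processing library instead of explicit loops.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalized 1D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    blurred = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        blurred.append(new_row)
    return blurred


def transpose(image):
    return [list(col) for col in zip(*image)]


def blur_mask(mask, radius=2, sigma=1.0):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(radius, sigma)
    horizontal = blur_rows(mask, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))


def blend(face, frame, soft_mask):
    """Per-pixel mix: mask * generated face + (1 - mask) * original frame."""
    return [[m * f + (1.0 - m) * b for f, b, m in zip(face_row, frame_row, mask_row)]
            for face_row, frame_row, mask_row in zip(face, frame, soft_mask)]
```

Because the blurred mask takes intermediate values near the face contour, the generated face fades into the frame instead of ending at a hard edge.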
At the experimental stage we faced the following problem - sometimes $\hat{Y}_{s,t}$ and $X_t$
have distinct face proportions, as the model tries to keep the shape of the source face
$X_s$. If the synthesized face $\hat{Y}_{s,t}$ is significantly wider than the target one $X_t$,
the transfer will be only partial, and we will not keep the shape of the source face $X_s$.
To deal with this problem we decided to track the keypoints of the generated face and the target
face in the video. In case of a significant difference in the coordinates
of the keypoints we modify the binary mask (compare the middle and bottom row masks in Fig.
2). If the face obtained by the model completely covers the face in the video, we
enlarge the mask, thus creating the effect of transferring not only the face but also the head
shape. Otherwise we shrink the mask and increase the blurring degree to transfer only the
central part of the face.
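The decision rule described above can be sketched as follows. The report does not give thresholds or scale factors, so `tolerance`, `step`, `base_sigma`, and the factor by which the blur grows are hypothetical parameters, and using the bounding-box width and height of the keypoints as the measure of face proportions is also an assumption.

```python
def face_width(keypoints):
    xs = [x for x, _ in keypoints]
    return max(xs) - min(xs)


def face_height(keypoints):
    ys = [y for _, y in keypoints]
    return max(ys) - min(ys)


def adapt_mask_params(generated_kps, target_kps,
                      base_sigma=3.0, step=0.1, tolerance=0.05):
    """Return (mask_scale, blur_sigma) for the blending mask.

    generated_kps, target_kps: lists of (x, y) face keypoints.
    """
    ratio_w = face_width(generated_kps) / face_width(target_kps)
    ratio_h = face_height(generated_kps) / face_height(target_kps)

    # Proportions match closely: keep the mask as it is.
    if abs(ratio_w - 1.0) <= tolerance and abs(ratio_h - 1.0) <= tolerance:
        return 1.0, base_sigma

    # Generated face covers the target face: enlarge the mask so that
    # the head shape is transferred together with the face.
    if ratio_w >= 1.0 and ratio_h >= 1.0:
        return 1.0 + step, base_sigma

    # Otherwise shrink the mask and blur more, so that only the
    # central part of the face is transferred.
    return 1.0 - step, base_sigma * 1.5
```

The returned scale would then be applied to the binary mask (for example by dilation or erosion) before the Gaussian blurring step.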
Chapter 4-Results and Discussion
To train and validate our model we selected two common datasets, VGGFace2 [1] and
CelebA-HQ [7]. We used these datasets for training and for further comparison of our model with
SOTA architectures. The VGGFace2 dataset meets our requirements because of its variability in ethnicity, gender,
viewing angle and lighting conditions. We trained our model for 12 epochs with a
batch size of 19. Training experiments were carried out on a Tesla V100 32 GB GPU. Several frames
were selected from the validation set to observe the quality of the proposed approach (Fig. 3).
Overall, 25 face swap results are shown, varying in face proportions, skin
color, hair, and so on. One can see how our model fits the source face to the target
face.
The proposed face swap model results
Visual quality assessment is not the only way to evaluate our face swap model results. We
calculated several evaluation metrics to perform a comparison with SOTA models. The
list of metrics was gathered from the FaceShifter, SimSwap and HifiFace articles. Without
going into mathematical details, here we present the list of metrics used for comparison:
• ID retrieval and shape_ringnet - responsible for preserving identity (head shape, etc.).
• Exp_ringnet - responsible for preserving facial expression and emotions.
• Eye_ldmk - for maintaining the direction of view.
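As an illustration of the first metric, ID retrieval can be sketched as a nearest-neighbour search over identity embeddings, which is how the FaceShifter paper [6] describes it: for every swapped face we look for the most similar source face by cosine similarity and count how often it is the true source. The embeddings here are plain lists of numbers and the function names are assumptions; in practice the embeddings would come from a face recognition network.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def id_retrieval_accuracy(swapped_embeddings, source_embeddings, true_source_indices):
    """Fraction of swapped faces whose nearest source embedding is the true source."""
    hits = 0
    for embedding, true_index in zip(swapped_embeddings, true_source_indices):
        similarities = [cosine_similarity(embedding, s) for s in source_embeddings]
        best_index = max(range(len(similarities)), key=similarities.__getitem__)
        if best_index == true_index:
            hits += 1
    return hits / len(swapped_embeddings)
```

A higher value means that the swapped faces stay closer to the identity of their source faces.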
All the experiments we conducted allowed us to conclude that the proposed model
is well trained and can be used in different cases, such as image-to-image and image-to-video
transfer, with high quality. In Table I we can see a comparison of
our model with other approaches in terms of identity and attribute encoders (before
blending). Here we use our regular model with a U-Net encoder and two AAD
blocks.
We also calculated the evaluation metrics independently for each method after
blending. To compare all the models in a consistent way we used the videos provided in the
FaceForensics++ dataset [9]. The results are given in Table II.
Chapter 5- Conclusion
To conclude, we should mention that our model outperforms many SOTA architectures in terms of several
well-known metrics. At the same time, the visual quality of the generated results also
confirms this. Several new features
made the proposed pipeline suitable for image-to-image face swap as well as image-to-video
face swap: a general architecture based on AEI-Net, a new eye loss, super resolution, and face mask tuning based
on an analysis of the source/target face area proportions. By contrast, many SOTA
architectures are evaluated only in the image domain and are not suitable for video processing.
We also shared the trained model publicly on GitHub and Google Colab.
References:
[1] Qiong Cao et al. “VGGFace2: A dataset for recognising faces across pose and age”. In:
CoRR abs/1710.08092 (2017). arXiv: 1710.08092. URL: http://arxiv.org/abs/1710.08092.
[2] Huiwen Chang et al. “PairedCycleGAN: Asymmetric Style Transfer for Applying and
Removing Makeup”. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern
Recognition (2018), pp. 40–48.
[3] Renwang Chen et al. “SimSwap: An Efficient Framework For High Fidelity Face
Swapping”. In: CoRR abs/2106.06340 (2021). arXiv: 2106.06340. URL: https:
[5] Zhanghan Ke et al. “MODNet: Real-Time Trimap-Free Portrait Matting via Objective
Decomposition”. In: AAAI. 2022.
[6] Lingzhi Li et al. “FaceShifter: Towards High Fidelity And Occlusion Aware Face
Swapping”. In: (2020). arXiv: 1912.13457 [cs.CV].
[7] Ziwei Liu et al. “Deep Learning Face Attributes in the Wild”. In: Proceedings of
International Conference on Computer Vision (ICCV). Dec. 2015.
[8] Ivan Perov et al. “DeepFaceLab: Integrated, flexible and extensible face-swapping
framework”. In: (2021). arXiv: 2005.05535 [cs.CV].
[9] Andreas Rössler et al. “FaceForensics++: Learning to Detect Manipulated Facial Images”.
In: CoRR abs/1901.08971 (2019). arXiv: 1901.08971. URL: http://arxiv.org/abs/1901.08971.
[10] Andras Rozsa et al. “Facial attributes: Accuracy and adversarial robustness”. In: Pattern
Recognition Letters 124 (June 2019), pp. 100–108. ISSN: 0167-8655. DOI:
10.1016/j.patrec.2017.10.024. URL: http://dx.doi.org/10.1016/j.patrec.2017.10.024.
[11] Yujun Shen et al. “Interpreting the Latent Space of GANs for Semantic Face Editing”. In:
CVPR. 2020.
[12] Aliaksandr Siarohin et al. “First Order Motion Model for Image Animation”. In:
Conference on Neural Information Processing Systems (NeurIPS). Dec. 2019.
[13] Aliaksandr Siarohin et al. “Motion Supervised co-part Segmentation”. In: arXiv preprint
[14] Ruben Tolosana et al. “DeepFakes and Beyond: A Survey of Face Manipulation and
Fake Detection”. In: CoRR abs/2001.00179 (2020). arXiv: 2001.00179. URL:
[15] James Vincent. Tom Cruise Deepfake Creator says public shouldn’t be worried
about ’one-click fakes’. Mar. 2021. URL:
https://www.theverge.com/2021/3/5/22314980/tom-cruise-deepfake-tiktok-videos-ai-impersonator-chris-ume-miles-fisher.
[16] Yuhan Wang et al. “HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face
Swapping”. In: Proceedings of the Thirtieth International Joint Conference on Artificial
Intelligence, IJCAI-21. Ed. by Zhi-Hua Zhou. Main Track. International Joint Conferences on
10.24963/ijcai.2021/157. URL: https://doi.org/10.24963/ijcai.2021/157.
[17] Egor Zakharov et al. “Few-Shot Adversarial Learning of Realistic Neural Talking Head
Models”. In: (2019). arXiv: 1905.08233 [cs.CV].