Mini Project Report
on
Online Conversation Scanner
Submitted to
Ajay Kumar Garg Engineering College,
Ghaziabad
B.Tech. Information Technology
Sem 5TH, 2023-24
CODE: KCS-554
Project Mini Project or Internship Assessment Report
Submitted To:
Mrs. Shikha Aggarwal
Submitted By:
Sunakshi Singh (2100270110083)
Uzair Ali (2100270110092)
Udit Singh (2100270110089)
Dr. A.P.J. Abdul Kalam Technical University,
Uttar Pradesh, Lucknow
CERTIFICATE
This is to certify that the Seminar entitled “ONLINE CONVERSATION SCANNER” has
been submitted by SUNAKSHI SINGH, UZAIR ALI AND UDIT SINGH under my guidance
in partial fulfillment of the degree of Bachelor of Engineering in COMPUTER SCIENCE in
semester 5TH of AJAY KUMAR GARG ENGINEERING COLLEGE, GHAZIABAD,
UTTAR PRADESH during the year 2023-2024.
Date: 11-12-23
Place: Ghaziabad
Guide: Mrs. SHIKHA AGGRAWAL
HOD: DR. RAHUL SHARMA
ACKNOWLEDGEMENT
I would like to express my deepest thanks to Mrs. Shikha Aggrawal, my mini project advisor, for
her cooperative attitude and consistent guidance, which enabled me to complete my project
successfully.
I would like to express my sincere gratitude to Dr. Anupama Sharma (H.O.D., CSIT
Department), Ajay Kumar Garg Engineering College, Ghaziabad, and Dr. R. K. Aggarwal
(Director General), Ajay Kumar Garg Engineering College, for allowing me to pursue the
project of my choice. They gave me valuable guidance and support.
I also wish to thank the many people at Ajay Kumar Garg Engineering College who offered
valuable guidance. Through this project I gained practical as well as theoretical knowledge and
experience. Finally, a single page is not enough to express my gratitude for the support and
guidance I received from them.
Sunakshi Singh (2100270110083)
Uzair Ali (2100270110092)
Udit Singh (2100270110089)
3rd year CSIT
TABLE OF CONTENTS
S.NO  CONTENTS
1  Abstract
2  Objectives
3  Introduction
4  Review of literature
5  Design and implementation
6  Code and output
7  Limitations
8  Conclusion
9  Future Scope
10  References
INTRODUCTION
The Online Conversation Scanner is a software tool for uncovering insights from
conversational data. By analysing chat logs, it gives a deeper understanding of users' behaviour
and preferences. This report describes the key features and benefits of using the Online
Conversation Scanner for conversation analysis. Users can view chat frequency, message types,
word count, and more. Individuals and businesses can use it to gain insights into their messaging
patterns and identify areas for improvement in communication. The tool is easy to use, and the
data it provides can inform decisions about messaging strategies.
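As a rough illustration of the chat-frequency view mentioned above, the following Python sketch
counts messages per day and per hour. The sample timestamps and the function names are
hypothetical and are not taken from the project's code; only the standard library is used.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample: one timestamp per message, already parsed from a chat log.
timestamps = [
    datetime(2023, 12, 10, 9, 15),
    datetime(2023, 12, 10, 9, 17),
    datetime(2023, 12, 11, 21, 5),
]

def messages_per_day(stamps):
    """Count how many messages were sent on each calendar day."""
    return Counter(ts.date().isoformat() for ts in stamps)

def messages_per_hour(stamps):
    """Count how many messages were sent in each hour of the day (0-23)."""
    return Counter(ts.hour for ts in stamps)

daily = messages_per_day(timestamps)
hourly = messages_per_hour(timestamps)
print(daily)
print(hourly)
```

Counts of this kind are the sort of data that a timeline or bar chart of chat activity can be
plotted from.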
Online conversation has become an indispensable part of our daily communication, with
billions of users exchanging messages, photos, videos, and voice notes every day. However,
with the increasing use of WhatsApp for personal and professional communication, there has
been a growing need for tools that can analyze and interpret the data generated on this platform.
This is where WhatsApp chat analyzers come in - they help users to understand their
communication patterns, identify trends, and gain insights into their relationships with others.
An online conversation scanner is a sophisticated tool designed to monitor and analyze digital
dialogues, texts, or discussions across various online platforms. Utilizing advanced algorithms,
it scans text-based interactions, identifying keywords, sentiments, and context to extract
meaningful insights.
It is a software tool that uses natural language processing (NLP) and machine learning
algorithms to analyze the text-based content of WhatsApp chats. It can identify keywords,
phrases, and sentiment (positive, negative, or neutral) in the messages exchanged between
users. The tool can also categorize the messages based on topics or themes discussed in the
chat.
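To make the idea of labelling a message as positive, negative, or neutral concrete, here is a
minimal sketch. It assumes a simple word-list (lexicon) approach rather than the machine
learning models described above, and the word lists and function name are illustrative
assumptions, not part of the project.

```python
import re

# Tiny illustrative word lists (assumptions); a real system would rely on
# a trained model or a much larger lexicon.
POSITIVE = {"good", "great", "thanks", "happy", "love"}
NEGATIVE = {"bad", "late", "angry", "hate", "problem"}

def classify_sentiment(text):
    """Label a message by comparing counts of positive and negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify_sentiment("Thanks, that was great!"))
print(classify_sentiment("Meeting at 5 pm"))
```

A word-list approach like this cannot handle sarcasm or context, which is one reason more
advanced NLP techniques are usually preferred for sentiment.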
The benefits of using an online conversation scanner are numerous. Firstly, it helps users to
understand their communication style and identify areas for improvement. By analyzing the
frequency and tone of messages exchanged, users can learn to communicate more effectively
and efficiently. For instance, if the tool identifies that a user tends to use negative language
frequently, it may suggest ways to improve communication skills and promote a more positive
approach.
Secondly, it can help users to identify trends in their communication patterns. By analyzing the
topics discussed in chats over a period of time, users can gain insights into their interests,
preferences, and priorities. This information can be useful for personal growth and
development as well as for making informed decisions about career choices or educational
opportunities.
Thirdly, it can help users to manage their relationships better. By identifying patterns of
communication between individuals or groups, users can understand the dynamics of their
relationships and take proactive steps to improve them. For instance, if the tool identifies that
a user tends to communicate less frequently with certain individuals or groups, it may suggest
ways to reconnect and strengthen those relationships.
Fourthly, it can help users to monitor their online reputation. By analyzing the messages
exchanged between users in a group or community, the tool can identify potential issues or
conflicts that may affect the user's reputation. This information can be useful for taking
proactive steps to address these issues before they escalate into bigger problems.
By leveraging natural language processing (NLP) and machine learning techniques, these
scanners can detect patterns, trends, or potential risks within conversations. They serve diverse
purposes, from social media sentiment analysis to cybersecurity threat detection and content
moderation.
With the ability to sift through vast amounts of data swiftly, an online conversation scanner
aids in understanding user behavior, sentiment trends, and emerging issues. It can be pivotal in
enhancing customer experiences, identifying potential threats, ensuring compliance, and
providing actionable insights for businesses, organizations, or platforms. The scanner's
adaptability and ability to evolve alongside language nuances make it a valuable asset in
navigating the dynamic landscape of online communication.
In conclusion, it is a powerful tool that can provide users with valuable insights into their
communication patterns, trends, relationships, and online reputation. By using this tool
regularly, users can learn to communicate more effectively and efficiently, manage their
relationships better, and monitor their online reputation proactively. As WhatsApp continues
to be an integral part of our daily communication, it is essential to use tools like these to make
the most out of this platform's potential benefits.
REVIEW OF LITERATURE
1. Literature review on Chat Analysis: A survey of the usage and impact of messaging has
been conducted, and various studies and analyses were found. The survey found that, in the
southern part of India, people aged 18 to 23 spend around 8 hours a day using online apps and
are sometimes online for almost 12-16 hours a day. Most of them reported using WhatsApp or
another site to exchange images, audio and video. The survey, which was conducted to
understand the positive and negative impacts of using WhatsApp, also found that WhatsApp is
used on smartphones more widely than any other app. Since WhatsApp is the app most used by
young people and other generations, our project can give them insights into their chats and
reveal facts they were unaware of.
2. Literature review on Web Design: Internet users number in the millions, and the number is
expected to grow over the years. Websites are a crucial medium for the transmission and
dissemination of information. [5] The paper reviews previous studies that have been done in
the field of web development. The literature proposes either sets of guidelines or assistive
technologies, particularly web interfaces and adaptive systems. The purpose of the paper is to
analyse users' perceptions and behaviors in order to achieve a successful e-commerce website.
According to a survey (Lee & Kozar, 2012), there is currently no consensus on how to properly
operationalize and assess website usability, and there are no established guidelines that
individuals can follow when designing websites to increase user engagement.
• "Hypertext" refers to the links that connect web pages to one another, either within a single
website or between websites. Links are a fundamental aspect of the Web: content is uploaded
to the Internet and linked to other pages.
• HTML uses "markup" to annotate text, images, and other content for display in a Web
browser.
• CSS is one of the core languages of the open web, standardized across Web browsers
according to W3C specifications. It describes the presentation of a document written in HTML
or XML, that is, how elements should be rendered on screen, on paper, in speech, or on other
media. In other words, it is the styling part of the webpage.
The literature surrounding online conversation scanners presents a multifaceted view of their
technological capabilities and societal implications. Research in this domain focuses on various
aspects, including the underlying algorithms, natural language processing techniques, and
machine learning models employed to extract insights from digital conversations. Studies
delve into sentiment analysis, topic modeling, and entity recognition as
core functionalities, highlighting their applications in diverse fields such as social media
monitoring, cybersecurity, and content moderation. Researchers often explore the ethical
considerations surrounding these tools, emphasizing the balance between privacy concerns and
the need for effective monitoring in ensuring online safety. Moreover, the evolution of online
conversation scanners in response to the ever-changing landscape of online communication
and the challenges posed by multilingualism, slang, and contextual understanding remains a
central theme. Overall, the literature underscores the significance of these scanners in
deciphering the complexities of digital discourse while addressing the ethical, technical, and
societal implications of their implementation.
ABSTRACT
Text conversation is one of the most used modes of communication and an efficient one too.
It consists of many conversations among groups and individuals, which may contain hidden
facts. This project takes those chats and provides a deep analysis of the data. Whatever the
topic of the chats, it provides the analysis in an efficient and accurate way. The main
advantage of this project is that it has been built using libraries such as pandas, matplotlib and
emoji, which are used to create data frames and plot graphs efficiently. The proposed chat
analyser is a machine learning-based tool designed to extract insights and patterns from chats.
The system utilizes natural language processing techniques to understand the context and
sentiment of messages, identify key topics and entities, and track user behavior over time. The
analyser can also detect potential issues such as cyberbullying, misinformation, and privacy
breaches, making it a valuable resource for parents, educators, and organizations. The system's
accuracy and efficiency are enhanced through the use of deep learning algorithms, allowing for
real-time analysis of large volumes of data. Overall, this tool has the potential to revolutionize
the way we interact with social media platforms by providing a more comprehensive
understanding of our digital communication habits.
DESIGN & IMPLEMENTATIONS
TECHNOLOGY USED
1. Streamlit: Streamlit is a free and open-source Python framework. [2] We can quickly develop
web apps for Machine Learning and Data Science by using Streamlit. Streamlit integrates
easily with other popular Python packages such as NumPy, Pandas, Matplotlib and Seaborn.
Streamlit provides a fast way to develop and deploy web apps.
2. Matplotlib: Matplotlib is a popular Python package used for data visualization. It is a
cross-platform library for making plots from data in arrays. It helps in creating static, animated
and interactive visualizations in Python.
3. Word cloud: Word Cloud is a data visualization library used for representing the most
frequently used words within a given text. The most frequent and important words are shown
in a bigger and bolder size.
4. Pandas: Pandas is an open-source Python library. Pandas is used to convert string data into
a DataFrame, which is a representation of data as a 2-dimensional table of rows and columns.
We can work with large data sets using the Pandas library, which has many built-in functions
for data analysis, data cleaning, data exploration and data manipulation.
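The step of converting string data into table rows can be sketched as follows. This is a minimal
example under stated assumptions: the line layout "DD/MM/YYYY, HH:MM - Sender: message" is
assumed for illustration (real exports vary by device and locale), and the names LINE_RE and
parse_chat are hypothetical. It uses only the standard library and produces a list of
dictionaries, one per message.

```python
import re
from datetime import datetime

# Assumed line layout: "DD/MM/YYYY, HH:MM - Sender: message text".
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4}, \d{2}:\d{2}) - ([^:]+): (.*)$")

def parse_chat(raw_text):
    """Turn exported chat text into a list of row dictionaries."""
    rows = []
    for line in raw_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            stamp, sender, text = match.groups()
            rows.append({
                "timestamp": datetime.strptime(stamp, "%d/%m/%Y, %H:%M"),
                "sender": sender,
                "message": text,
            })
        elif rows and line.strip():
            # A line that does not match the pattern is treated as a
            # continuation of the previous message.
            rows[-1]["message"] += "\n" + line
    return rows

sample = (
    "10/12/2023, 09:15 - Asha: Good morning\n"
    "and welcome\n"
    "10/12/2023, 09:17 - Ravi: Morning!\n"
)
rows = parse_chat(sample)
print(len(rows), rows[0]["sender"], rows[1]["message"])
```

A list of dictionaries like this can be passed to pandas.DataFrame(rows) to obtain the
2-dimensional table described above. System notices (for example, a member joining a group) are
not handled separately in this sketch.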
DESIGN
SOFTWARE REQUIREMENT SPECIFICATION
Software requirement specification (SRS) is a technical specification of requirements for the
software product. The SRS presents an overview of the product and its features, and summarises
the processing environments for development, operation and maintenance of the product.
Requirement Specification –
Conceptually every SRS should have the components:
● Functionality
● Performance
● Design constraints imposed on implementation
● External interfaces
USE CASE MODEL
● In the use case diagram the actor is the User.
● The user can use the chat upload use case to give input to the system.
● The select time format use case lets the user specify the time format of the file in the
system.
● The select user use case selects whose analysis result is desired.
● The user can use the show analysis use case to see the result of the entire analysis.
Figure 1.1
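The "select user" and "show analysis" use cases can be sketched as a single function that
filters the parsed messages by sender and returns basic statistics. This is a hypothetical
illustration: the function name analyse, the row layout, and the "Overall" option for analysing
every sender at once are assumptions, not the project's actual code.

```python
def analyse(rows, selected_user="Overall"):
    """Return message and word counts for one sender, or for everyone."""
    if selected_user != "Overall":
        # Keep only the messages written by the selected user.
        rows = [r for r in rows if r["sender"] == selected_user]
    return {
        "messages": len(rows),
        "words": sum(len(r["message"].split()) for r in rows),
    }

# Hypothetical parsed messages.
chat_rows = [
    {"sender": "Asha", "message": "Is the report ready?"},
    {"sender": "Ravi", "message": "Almost"},
    {"sender": "Asha", "message": "Thanks"},
]

print(analyse(chat_rows))
print(analyse(chat_rows, "Asha"))
```

In a Streamlit app, the value of selected_user would typically come from a selection widget, and
the returned dictionary would be displayed as the analysis result.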
ACTIVITY DIAGRAM
● In the activity diagram, the initial activity starts when the user uploads the file as input,
which is the first action; in the next action the time format is selected.
● The decision box "check chat format" represents the check on the validity of the time format
of the file.
● If the time format is correct, the analysis is done and the process ends.
● If the time format is wrong, the user has to check again for the correct format.
Figure 1.2
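The "check chat format" decision in the activity diagram can be sketched as a function that
tries to parse the timestamp of the first line with the format the user selected. The two format
strings and the names TIME_FORMATS and matches_time_format are assumptions made for
illustration; the actual export layout may differ.

```python
from datetime import datetime

# Assumed strptime patterns for the two selectable time formats.
TIME_FORMATS = {
    "24-hour": "%d/%m/%Y, %H:%M",
    "12-hour": "%d/%m/%Y, %I:%M %p",
}

def matches_time_format(first_line, choice):
    """Return True if the line's leading timestamp parses with the chosen format."""
    # The timestamp is assumed to be everything before the first " - ".
    stamp = first_line.split(" - ", 1)[0]
    try:
        datetime.strptime(stamp, TIME_FORMATS[choice])
    except ValueError:
        return False
    return True

line = "10/12/2023, 21:15 - Asha: Hi"
print(matches_time_format(line, "24-hour"))
print(matches_time_format(line, "12-hour"))
```

If the function returns False, the application would ask the user to select the format again,
which corresponds to the "wrong format" branch of the diagram.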
Implementation
Python is a high-level, general-purpose and very popular programming language. The Python
programming language (latest: Python 3) is used in web development, Machine Learning
applications, and many cutting-edge technologies in the software industry. Python is very well
suited for beginners.
1. Python is currently one of the most widely used multi-purpose, high-level programming
languages.
2. Python allows programming in Object-Oriented and Procedural paradigms.
3. Python programs are generally shorter than equivalent programs in languages like Java.
Programmers have to type relatively less, and the indentation requirement of the language
keeps programs readable.
4. Python is used by many large technology companies such as Google, Amazon, Facebook,
Instagram, Dropbox and Uber.
5. The biggest strength of Python is its huge collection of standard libraries.
Software requirement for developing application
• Jupyter notebook
• VS code
Technologies
• Python and its libraries (streamlit)
• ML algorithm
• NLTK
HARDWARE REQUIREMENTS
1. Processor (CPU) with a frequency of 2 gigahertz (GHz) or above
2. Storage: 1 GB
3. Display: any device
4. Monitor resolution: 1024 x 768 or higher
5. Internet connection: broadband (high-speed) connection with a speed of 4 Mbps or higher
Operating System:
• Windows 7, Windows 8 or Windows 10
• Mac OS X 10.8, 10.9, 10.10 or 10.11
CODE
Figure 1.3
Figure 1.4
Figure 1.5
Figure 1.6
OUTPUT
Figure 1.6
Figure 1.7
Figure 1.8
Figure 1.9
Figure 2.0
OBJECTIVE
The primary objective of an online conversation scanner is to systematically analyze and
interpret digital dialogues, texts, or discussions occurring across various online platforms. Its
core goals include:
1. Insight Extraction: To extract valuable information, sentiments, trends, and patterns from
conversations.
2. Risk Detection: To identify potential risks, threats, or problematic content, such as cyber
threats, hate speech, or inappropriate behavior.
3. Monitoring and Surveillance: To track and monitor discussions for various purposes,
including brand reputation management, customer sentiment analysis, and public opinion
monitoring.
4. Enhanced Decision-Making: To provide actionable insights for businesses, organizations,
or platforms to make informed decisions regarding customer engagement, content
moderation, or security measures.
5. Adaptability and Evolution: To continuously evolve and adapt to changing linguistic
nuances, slang, and emerging online communication trends, ensuring relevance and
effectiveness in understanding digital conversations.
The capabilities of the chat application and the power of the Python programming language in
implementing our intended data analysis cannot be overemphasized. The system was built with
Python, and the Python libraries used include Streamlit, Emoji, NumPy, Pandas and
Matplotlib. The results we intended were obtained. In future, the project will be mainly
useful for group organisers: they will get to know who is the most and least active in the
group and can take decisions accordingly.
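Finding the most and least active members of a group can be sketched with a simple count of
messages per sender. The sample sender list and the function name activity_ranking are
hypothetical; the project itself may compute this differently, for example with Pandas.

```python
from collections import Counter

def activity_ranking(senders):
    """Return (sender, message_count) pairs, most active sender first."""
    return Counter(senders).most_common()

# Hypothetical list with one entry per message, naming its sender.
senders = ["Asha", "Ravi", "Asha", "Meena", "Asha", "Ravi"]

ranking = activity_ranking(senders)
most_active = ranking[0]
least_active = ranking[-1]
print(most_active, least_active)
```

An organiser could use such a ranking to see at a glance which members contribute most of the
messages and which rarely participate.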
LIMITATIONS
• It has some limitations, including the inability to analyze photos and other media, as it
only focuses on text.
• Additionally, it cannot determine sarcasm, humor, or other subtleties in the
conversation, which may affect the accuracy of the analysis.
• The tool also lacks a sentiment analysis feature, which could provide insights into the
emotions expressed in the chat.
• Lastly, it may not be suitable for analyzing large group conversations as it may be
challenging to identify specific individuals' sentiments and opinions within the chat.
Some limitations associated with online conversation scanners in general are:
• Difficulty in accurately grasping nuanced context, sarcasm, or cultural references
within conversations, leading to potential misinterpretations.
• Struggles with multilingual conversations or dialects, impacting the accuracy of
analysis and sentiment identification, especially in languages with varying nuances.
• Challenges in detecting coded or encrypted language, especially in cases involving
sophisticated methods of communication that aim to bypass scanning algorithms.
• Ethical considerations regarding the monitoring of private conversations, raising
concerns about user privacy and data protection regulations.
• Potential biases within the algorithms used, leading to skewed results or
misidentification of sentiments, particularly in sensitive topics or cultural contexts.
• Processing delays when dealing with vast amounts of data in real-time,
impacting the scanner's effectiveness in swiftly identifying and responding to
emerging issues.
• Inability to access or scan private or encrypted conversations, limiting the
comprehensiveness of the analysis and potentially missing crucial information.
• Privacy concerns that arise from monitoring customer conversations, as some
individuals may not want their personal information or opinions shared with third
parties.
Addressing these limitations often requires continual refinement, advancements in
machine learning, and an ongoing effort to balance accuracy with privacy and ethical
considerations.
CONCLUSION
The advent of an online conversation scanner marks a pivotal milestone in digital
communication. By employing advanced algorithms and linguistic analysis, this innovative
tool sifts through textual exchanges, identifying nuanced sentiments, intentions, and even
potential risks within conversations. Its multifaceted functionality extends beyond mere
content comprehension, delving into context, tone, and underlying emotions. Consequently,
the scanner serves as a guardian of online spaces, flagging instances of harassment,
cyberbullying, or suspicious behavior, fostering a safer digital environment.
Its applications span diverse arenas, from social media platforms and forums to customer
service interactions and educational settings, augmenting moderation efforts and proactive
intervention strategies. However, while the tool is empowering in its capabilities, ethical
considerations remain crucial. Balancing privacy concerns with the need to ensure safety and
well-being is imperative. As this technology evolves, collaborative efforts between developers,
users, and policymakers become essential to cultivate a digital sphere where expression
thrives harmoniously within a framework of safety and respect.
FUTURE SCOPE
The future scope of the Online Conversation Scanner is vast, as it has the potential to provide
valuable insights into group conversations, brand awareness, customer behavior, and
sentiment analysis. The tool's features can be expanded to include real-time analysis,
integration with other popular messaging platforms, and automated report generation. It can
also be used by businesses and organizations to improve their communication strategies and
enhance customer experience by understanding their needs and requirements. The data
derived from the tool can help in making informed decisions and improving overall
efficiency.
The future of online conversation scanners appears promising, with several
avenues for advancement and evolution. Enhancements in natural language processing,
coupled with machine learning techniques, are expected to refine these scanners, enabling a
more nuanced understanding of contextual cues, colloquialisms, and cultural references
within conversations. The integration of advanced AI models will likely improve sentiment
analysis accuracy, allowing for better identification and handling of nuanced emotions.
Moreover, the application of these scanners could expand into new domains, such as mental
health monitoring through analyzing online discourse patterns for early detection of potential
issues. With the rising concerns surrounding misinformation and fake news, online
conversation scanners are poised to play a pivotal role in combating disinformation by swiftly
identifying and flagging misleading content. Additionally, the fusion of these scanners with
emerging technologies like blockchain might address privacy concerns by offering secure and
transparent ways to analyze conversations without compromising user data. As these tools
continue to evolve, their potential applications span across industries, from personalized
customer experiences to proactive cybersecurity measures, showcasing a dynamic and
impactful future scope for online conversation scanners.
REFERENCES
[1] Ravishankara K, Dhanush, Vaisakh, Srajan I S, “International Journal of Engineering
Research & Technology (IJERT)”, ISSN: 2278-0181, Vol. 9 Issue 05, May-2020
[2] https://www.analyticsvidhya.com/blog/2021/06/build-web-app-instantly-for-machine-learning-using-streamlit/
[3] Meng Cai, “PubMed Central”, PMCID: PMC7944036, PMID: 33732917
[4] Dr. D. Lakshminarayanan, S. Prabhakaran, “Dogo Rangsang Research Journal”, UGC
Care Group I Journal, Vol-10 Issue-07 No. 12 July 2020
[5] https://www.i