
Graduation Project Analysis Report: Bookepida - AI Chatting Bot

Faculty of Engineering, Mansoura University
Department of Computer Engineering and Control Systems
ChatPDF: Software Project Analysis Report
1. Nada Hossam El-Deen Ali Ismaeel Mousa
2. Mostafa Ibrahim Abd El-Hameed Abd El-Hameed Ibrahim
3. Ahmed Mohamed El-Hindawi Ibrahim
4. Yousif Mohamed Rashad Abd El-Razek
5. Nayera Ashraf Hamdi Mohamed Mohamed Yousif
Supervisors
Dr. Sarah Ayyad
Eng. Fatma Gamal
May 12, 2024
Task 1: System Specifications
► Problem statement of the system
The existing process of extracting relevant information from PDF and DOCX files is often
cumbersome and time-consuming. Users struggle to quickly find specific content within
lengthy documents, hindering productivity and research efficiency. Additionally, manual
extraction lacks context awareness and personalized interactions.
► System definition
ChatPDF is an AI-driven service that quickly and accurately summarizes lengthy PDF
documents, ranging from complex legal manuals to historical textbooks. Beyond providing
summaries, it generates relevant questions, helping you explore your data more thoroughly.
What sets ChatPDF apart is its conversational approach: you upload a PDF and start asking
questions. It works like ChatGPT, but for research papers and other documents, making
reading and understanding PDF files more efficient and insightful.
► List of system services
1. Automatically Index Document Content:
○ Extract and semantically index the key information within uploaded PDF
and DOCX files.
○ Create a searchable database of document content for efficient retrieval.
2. Enable Natural Language Interactions:
○ Allow users to query document content using conversational language.
○ Provide relevant answers and summaries based on user inquiries.
3. Enhance User Experience and Productivity:
○ Improve the speed and accuracy of information extraction.
○ Facilitate seamless interactions between users and their documents.
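The indexing and retrieval services listed above can be illustrated with a minimal sketch. It is not the project's implementation: the bag-of-words `embed` function is a toy stand-in for an embedding model, and the in-memory `ChunkIndex` class is a hypothetical stand-in for a vector database. All names and the chunk size are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ChunkIndex:
    """Hypothetical in-memory index of document chunks."""

    def __init__(self) -> None:
        self.chunks: list[tuple[str, Counter]] = []

    def add_document(self, text: str, chunk_size: int = 40) -> None:
        """Split the text into fixed-size word chunks and index each one."""
        words = text.split()
        for start in range(0, len(words), chunk_size):
            chunk = " ".join(words[start:start + chunk_size])
            self.chunks.append((chunk, embed(chunk)))

    def search(self, query: str, top_k: int = 3) -> list[str]:
        """Return the top_k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.chunks, key=lambda c: cosine(q, c[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:top_k]]
```

In a real deployment the term counts would be replaced by dense vectors from an embedding model, and the sorted scan by a nearest-neighbor query against the vector database.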
► SW and HW needed
SW System Components:
● SW
1- AI models:
● gpt-3.5-turbo-0125
● mixedbread-ai/mxbai-embed-large-v1
● mixedbread-ai/mxbai-rerank-large-v1
● Piper TTS
● db_mobilenet_v3_large
● crnn_mobilenet_v3_large
● Weaviate Vector DB
2- Backend:
● Server => Node.js
● Framework => Express.js
● Database => MongoDB
● Object Data Modeling Package => Mongoose
● Authentication => Bearer Token
● GridFS => For storing and retrieving large files
● Security => bcrypt package and xss package
● Cloudinary => Blob Media
● Brevo => Email service
● Containerization => Docker
3 - Flutter
● Flutter and Dart SDK.
● dio: ^5.4.3+1 for handling RESTful APIs.
● syncfusion_flutter_pdfviewer: ^25.2.3 for displaying PDF files.
● flutter_bloc: ^8.1.5 and bloc: ^8.1.4 for state management.
● file_picker: ^8.0.3 for picking files from the user’s local directories.
● HW
1 - Backend
● onrender.com web service for deploying backend API
● Codescalers VPS (virtual private server) to deploy the production version
2 - AI
● Server with good GPU
3- Flutter
● Server for hosting the web Application.
Task 2: System Planning
► Choose your software processing model
Incremental development (Agile)
► Plan your system including a timeline for each process
ChatPDF Timeline
Task 3:
► Write the requirements document of your project that includes:
► Functional (User & System) requirements
Task 4:
Complete the requirements document of your project that must include:
2. Non-functional (product, organizational, external) requirements
Functional Requirements:
User requirements:
1. Users can upload documents in various formats.
2. Users can sign up and log in through a simple authentication flow.
3. Users need a simple, intuitive, and attractive interface on both mobile and web that
prioritizes readability, ease of navigation, and aesthetics.
4. Users can highlight specific text within documents; upon highlighting, the system
provides instant insights into the text along with external references.
5. Users can converse with an AI chatbot, which provides additional context and deeper
insights and answers further questions about the book so that it is easier to
understand.
System requirements:
1. Document Upload and Storage:
○ The system shall accept files in formats such as:
■ Microsoft Word (.docx)
■ PDF (.pdf)
■ Plain text (.txt)
■ OpenDocument Text (.odt)
○ Uploaded documents shall be securely stored and associated with the user’s
account.
○ The system shall validate uploaded files to ensure they adhere to the specified
formats.
○ In case of invalid or unsupported formats, the system shall provide
appropriate error messages to users.
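The upload validation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the function name `validate_upload`, the error wording, and the UTF-8 requirement for .txt files are all hypothetical. The sketch relies on two format facts: PDF files begin with the bytes `%PDF`, and .docx and .odt files are ZIP containers that begin with `PK\x03\x04`.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".docx", ".pdf", ".txt", ".odt"}

# Leading bytes expected for each binary format.
MAGIC_PREFIXES = {
    ".pdf": b"%PDF",
    ".docx": b"PK\x03\x04",
    ".odt": b"PK\x03\x04",
}


def validate_upload(filename: str, content: bytes) -> tuple[bool, str]:
    """Return (is_valid, message) for an uploaded file."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, (
            f"Unsupported format '{ext}'. "
            "Please upload a .docx, .pdf, .txt, or .odt file."
        )
    prefix = MAGIC_PREFIXES.get(ext)
    if prefix is not None and not content.startswith(prefix):
        return False, f"The file does not look like a valid {ext} document."
    if ext == ".txt":
        # Assumption: plain-text uploads are expected to be UTF-8.
        try:
            content.decode("utf-8")
        except UnicodeDecodeError:
            return False, "The text file is not valid UTF-8."
    return True, "OK"
```

Checking the leading bytes as well as the extension catches renamed files, although a full check would still need to parse the document.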
2. Text Highlighting and Insights:
Text Highlighting:
○ Users shall be able to select and highlight specific portions of text within
documents.
○ The system shall visually indicate the highlighted text (e.g., by changing its
background color).
Instant Insights:
○ When users highlight text, the system shall analyze the content and provide
relevant insights. These insights may include:
■ Definitions of key terms.
■ Contextual explanations.
■ Related concepts.
■ Statistical data.
■ Sentiment analysis (if applicable).
■ Suggestions for further reading.
External References:
○ Alongside instant insights, the system shall offer external references or links
to authoritative sources. These references may include:
■ Hyperlinks to relevant articles, research papers, or websites.
■ Citations from reputable sources.
■ Cross-references to related documents within the system.
User Interaction:
○ Users shall trigger the insights and external references by actively highlighting
text.
○ The system shall display insights in a non-intrusive manner (e.g., tooltips,
sidebars, or pop-ups).
Security and Privacy:
○ The system shall handle user-highlighted content securely and not store
sensitive information.
○ External references shall be vetted to ensure reliability and accuracy.
3. Interactive Chat with AI:
○ Chatbot Interaction:
■ Users shall be able to initiate conversations with the AI chatbot.
■ The chatbot shall respond promptly and engage in a natural dialogue.
○ Contextual Input:
■ Users can provide context related to the book they are reading. This
context may include:
1. Book title.
2. Author.
3. Genre.
4. Specific chapters or sections.
5. Personal interpretations or reflections.
○ Deeper Insights:
■ When users share context, the chatbot shall offer deeper insights into
the book. These insights may include:
1. Literary analysis.
2. Historical context.
3. Thematic exploration.
4. Character motivations.
5. Symbolism.
6. Critical perspectives.
○ Clarifications:
■ Users can ask the chatbot specific questions about the book. The
chatbot shall provide clear and concise answers to enhance
comprehension.
○ User-Friendly Experience:
■ The chatbot’s responses shall be user-friendly, avoiding jargon or
overly technical language.
○ Integration with Book Content:
■ The chatbot shall reference specific parts of the book when providing
insights or clarifications.
■ If available, the chatbot may link to relevant pages or chapters within
the book.
○ Privacy and Security:
■ User interactions with the chatbot shall be confidential.
■ The chatbot shall not store or share personal information.
4. Responsive User Interface:
Minimalistic Design:
○ The UI shall follow a clean and minimalistic design approach.
○ Extraneous elements, clutter, and unnecessary features shall be avoided.
Clear Navigation:
○ Users shall be able to find their way around the system without confusion.
○ Navigation menus, buttons, and links shall be logically organized and labeled.
Consistent Layout:
○ UI components (buttons, forms, icons, etc.) shall maintain consistent
placement and appearance.
○ Consistency across different screens or modules shall be maintained.
Intuitive Controls:
○ Buttons, checkboxes, input fields, and other controls shall behave as expected.
○ Users shall easily understand how to interact with each control.
Responsive Design:
○ The UI shall adapt gracefully to different screen sizes (e.g., desktop, tablet,
mobile).
○ Responsiveness shall ensure usability across various devices.
Error Handling:
○ Clear error messages shall be displayed when users encounter issues (e.g.,
invalid input, network errors).
○ Error messages shall guide users toward corrective actions.
Accessibility:
○ The UI shall meet accessibility standards to accommodate users with
disabilities.
○ Keyboard navigation, alt text for images, and proper color contrast shall be
implemented.
User Feedback:
○ The UI shall provide visual feedback (e.g., loading spinners, success
indicators) during interactions.
○ Users shall receive confirmation messages for completed actions.
Help and Documentation:
○ The UI shall offer contextual help or tooltips.
○ A concise user guide or documentation shall be available for reference.
User Preferences:
○ Users may customize certain UI elements (e.g., theme, font size) based on
personal preferences.
5. User Authentication and Profiles:
User Authentication:
○ Users shall be able to create accounts and log in securely.
○ Authentication methods may include:
■ Email and password.
■ Social login (e.g., Google, Facebook).
○ Failed login attempts shall be limited to prevent brute-force attacks.
○ Passwords shall be hashed and stored securely.
User Profiles:
○ Each user shall have a profile associated with their account.
○ Profile information may include:
■ Full name.
■ Email address.
■ Profile picture.
○ Users shall be able to update their profiles.
Access Control:
○ The system shall enforce role-based access control (RBAC).
○ Roles (e.g., admin, user) shall determine access to specific features or
resources.
○ Admins may have additional privileges (e.g., user management).
Session Management:
○ User sessions shall be maintained securely.
○ Sessions shall expire after a specified period of inactivity.
○ Users shall be automatically logged out upon session expiration.
Forgot Password Flow:
○ Users who forget their passwords shall be able to reset them via email.
○ The system shall verify the user’s identity before allowing password reset.
Profile Privacy Settings:
○ Users shall control the visibility of their profile information (public, private,
friends-only).
○ Privacy settings shall apply to profile pictures, bio, and other details.
Profile Customization:
○ Users may customize their profiles by adding additional information (e.g.,
interests, location).
○ Customization options shall be user-friendly.
Profile Picture Upload:
○ Users shall be able to upload profile pictures.
○ The system shall validate image formats and size.
User Deactivation:
○ Admins shall have the ability to deactivate or suspend user accounts.
○ Deactivated users shall not be able to log in.
Audit Trail:
○ The system shall maintain an audit trail of user authentication and
profile-related actions (e.g., logins and profile updates).
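Two of the authentication requirements above, hashed password storage and a limit on failed login attempts, can be sketched together. The software list names the bcrypt package for the Node.js backend; this sketch instead uses PBKDF2 from Python's standard library as a stand-in so that it is self-contained. The class name `LoginGuard`, the limit of five attempts, and the iteration count are assumptions, since the report does not specify them.

```python
import hashlib
import hmac
import os
from typing import Optional

MAX_FAILED_ATTEMPTS = 5      # assumed limit; the report gives no number
PBKDF2_ITERATIONS = 100_000  # assumed work factor for this sketch


def hash_password(password: str, salt: Optional[bytes] = None) -> tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated if none is given."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return salt, digest


class LoginGuard:
    """Hypothetical in-memory credential store with a failed-attempt limit."""

    def __init__(self) -> None:
        self.credentials: dict[str, tuple[bytes, bytes]] = {}
        self.failed: dict[str, int] = {}

    def register(self, email: str, password: str) -> None:
        self.credentials[email] = hash_password(password)
        self.failed[email] = 0

    def login(self, email: str, password: str) -> str:
        """Return 'ok', 'invalid', or 'locked'."""
        if email not in self.credentials:
            return "invalid"
        if self.failed[email] >= MAX_FAILED_ATTEMPTS:
            return "locked"
        salt, expected = self.credentials[email]
        _, candidate = hash_password(password, salt)
        # Constant-time comparison avoids leaking how many bytes matched.
        if hmac.compare_digest(candidate, expected):
            self.failed[email] = 0
            return "ok"
        self.failed[email] += 1
        return "invalid"
```

A production system would also need a way to unlock accounts, for example after a cooling-off period or through the password reset flow.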
6. Document Retrieval and Search:
Document Search:
○ Users shall be able to search for documents based on keywords, titles, or other
relevant criteria.
○ The search feature shall provide relevant results quickly and accurately.
○ Search results shall display document names, metadata, and a brief preview.
Advanced Search Options:
○ Users may refine their search using filters (e.g., date range, file type, author).
○ Advanced search options shall be user-friendly and intuitive.
Document Association:
○ Users shall associate documents with one or more categories.
○ A document may belong to multiple categories simultaneously.
User-Friendly Interface:
○ The UI for search and category management shall be intuitive.
○ Clear instructions or tooltips shall guide users through the process.
Search Performance:
○ The system shall optimize search queries for efficiency.
○ Indexing and caching mechanisms may be employed to enhance search speed.
Security and Privacy:
○ User-specific search results and categories shall be private.
○ Documents shall be accessible only to authorized users.
Cross-Platform Support:
○ The search and category features shall work seamlessly across different devices
(e.g., desktop, mobile, web).
Non-Functional Requirements:
1. Performance and Responsiveness:
○ Fast response times for document uploads, highlighting, and chat
interactions.
○ Optimize for low latency to enhance user experience.
2. Security and Privacy:
○ Implement robust data encryption during transmission and storage.
○ Safeguard user data and prevent unauthorized access.
3. Scalability and Load Handling:
○ Design the system to handle increasing user traffic.
○ Consider load balancing and auto-scaling mechanisms.
4. Compatibility and Accessibility:
○ Ensure compatibility across different devices and browsers.
○ Prioritize accessibility for users with disabilities.
5. Reliability and Availability:
○ Minimize system downtime.
○ Implement backup and recovery mechanisms.
6. Usability and User Experience:
○ Conduct usability testing to ensure an intuitive and smooth user journey.
○ Provide clear instructions and tooltips.
7. Maintainability and Extensibility:
○ Develop clean, well-documented code.
○ Plan for future enhancements and feature additions.
8. Integration with AI Models:
○ Seamlessly integrate context-aware question answering (LLM) and referenced
retrieval.
○ Ensure efficient communication between the reading platform and AI
components.
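The integration of question answering with referenced retrieval can be sketched as a prompt-assembly step. The sketch assumes that a retriever or reranker has already scored the candidate passages; the `Passage` type, the `build_prompt` function, and the prompt wording are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    page: int     # page of the source document the passage came from
    text: str
    score: float  # relevance score, assumed to come from a retriever or reranker


def build_prompt(question: str, passages: list[Passage], max_passages: int = 3) -> str:
    """Assemble an LLM prompt that carries numbered, page-referenced context."""
    top = sorted(passages, key=lambda p: p.score, reverse=True)[:max_passages]
    context = "\n".join(
        f"[{i}] (page {p.page}) {p.text}" for i, p in enumerate(top, start=1)
    )
    return (
        "Answer the question using only the numbered passages below. "
        "Cite passages by their number.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Numbering the passages and keeping their page numbers lets the answer point back to specific parts of the document, which is what the referenced-retrieval requirement asks for.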
[Figure: Use case diagram. Actors: user, database, AI model. Use cases: signup; login
(includes verifying the password and username, with a Display Error case); upload document
(includes checking for the correct format, with a Display Error case); chat bot; ask for
more resources; summarize a portion of the document; save chat; end chat.]
[Figure: Sequence diagram between Interface (Chat), Backend Server, AI Model, and Database.
1. Create new chat: the backend verifies the user. An authenticated user is asked to
upload a document; an unauthenticated user is asked to sign up or sign in.
2. Upload document: the backend checks the document format. If the format is correct, the
document is sent to the AI model and the chat starts; otherwise the user is asked to
upload a document in pdf, docx, txt, or odt.
3. Ask question: the backend sends the question to the AI model, which receives and
answers it; the backend sends the answer to the user and the interface displays it.
4. Ask for external resources: the backend sends a request for more resources to the AI
model, which provides them; the backend sends the resources to the user and the interface
displays them.
5. Ask for a summary of a portion of the document: the backend sends the summary request
to the AI model, which provides the summary; the backend sends the summary to the user
and the interface displays it.
6. End the chat: the backend requests that the chat be saved; the database saves the chat
content and confirms, and the chat ends.]
Sprint | Name | Team | Status | Start Date | End Date | Duration (days)
1 | Research & Design | All | 100% | 2023-10-02 | 2023-10-06 | 5
1 | Authentication | | 100% | 2023-10-07 | 2023-10-13 | 7
1 | pdf files service with crud operations | Backend | 100% | 2023-10-07 | 2023-10-13 | 7
1 | OCR Model (Searching&Integration) | AI | 100% | 2023-10-07 | 2023-10-11 | 5
1 | Vector Database for RAG Pipeline | AI | 100% | 2023-10-10 | 2023-10-13 | 4
1 | Testing | All | 100% | 2023-10-14 | 2023-10-15 | 2
2 | Research & Design | All | 100% | 2023-10-16 | 2023-10-20 | 5
2 | handling user documents | Flutter | 100% | 2023-10-21 | 2023-10-27 | 7
2 | book library storage | Backend | 100% | 2023-10-21 | 2023-10-27 | 7
2 | Implementing RAG Pipeline | AI | 100% | 2023-10-21 | 2023-10-26 | 6
2 | WebRetrieval for RAG Pipeline | AI | 100% | 2023-10-24 | 2023-10-27 | 4
2 | Testing | All | 100% | 2023-10-28 | 2023-10-29 | 2
3 | Research & Design | All | 100% | 2023-10-30 | 2023-11-03 | 5
3 | library and user books | Flutter | 70% | 2023-11-04 | 2023-11-10 | 7
3 | design and add admin functionality | Backend | 100% | 2023-11-04 | 2023-11-10 | 7
3 | Text to Speech Model (Searching&Integration) | AI | 100% | 2023-11-04 | 2023-11-10 | 7
3 | Testing | All | 100% | 2023-11-11 | 2023-11-12 | 2
4 | Research & Design | All | 100% | 2023-11-13 | 2023-11-17 | 5
4 | chat and pdfviewer | Flutter | 50% | 2023-11-18 | 2023-11-24 | 7
4 | background tasks for security | Backend | 80% | 2023-11-18 | 2023-11-24 | 7
4 | API for AI server | AI | 90% | 2023-11-18 | 2023-11-24 | 7
4 | Testing | All | 0% | 2023-11-25 | 2023-11-26 | 2
[Figure: Class diagram]
user
+ username:string
+ email:string
+ password:string
+ login()
+ signUp()
document
+ name:string
+ id:string
+ userId:string
+ displayDocument()
+ deleteDocument()
+ createDocument()
+ summarizeDocument()
chat
+ userId:string
+ documentId:string
+ chatHistory:List<question>
+ startChat()
+ updateChatHistory()
+ saveChat()
question
+ question:string
+ answer:string
+ chatId:string
+ sendQuestion()
+ provideExternalSources()
Relationships: user (1) own document (0..*); document (1) has chat (1); chat (1) contains
question (0..*).
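The classes and multiplicities in the class diagram can be expressed as plain data classes. This is a sketch under stated assumptions rather than the project's data model: attribute names follow the diagram in snake_case, most diagram methods are omitted, `add_document` is a hypothetical helper, and using the username as the user identifier is an assumption.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Question:
    chat_id: str
    question: str
    answer: str = ""


@dataclass
class Chat:
    user_id: str
    document_id: str
    chat_history: list[Question] = field(default_factory=list)

    def update_chat_history(self, question: Question) -> None:
        """Append one question/answer pair (a chat contains 0..* questions)."""
        self.chat_history.append(question)


@dataclass
class Document:
    id: str
    name: str
    user_id: str
    chat: Optional[Chat] = None  # a document has one chat


@dataclass
class User:
    username: str
    email: str
    password: str  # would hold a password hash, never plaintext
    documents: list[Document] = field(default_factory=list)

    def add_document(self, doc_id: str, name: str) -> Document:
        """Hypothetical helper: create a document with its chat and attach it to the user."""
        doc = Document(id=doc_id, name=name, user_id=self.username)
        doc.chat = Chat(user_id=self.username, document_id=doc_id)
        self.documents.append(doc)
        return doc
```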
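[Figure: Activity diagram with swimlanes User Interface, Document Storage, System
Database, and AI API.
Log in (email, password): the system database checks account validity. If valid, it
returns a JWT token; if invalid, it raises a validation error.
Upload document: if the document is uploaded successfully, the AI is notified of the new
file, the file metadata is returned, and an interface is created; if not, an upload error
is raised.
Chat question: resources are added to the request, the LLM answers the question, the
response is saved and the associated resources are updated, and the answer is displayed
to the user.]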