
RAG Project Code Explanation

Contents
Step 1: Initialize and declare necessary variables.
Step 2: Create instances of the Ollama model and Ollama Embedding model, then invoke the model for processing.
Step 3: Parse the output from the model (such as a language model or retriever) into a string format, performing any necessary cleaning and formatting.
Step 4: Create a template using `PromptTemplate`, which allows you to define placeholders that can later be filled with dynamic values.
Step 5: Use the `chain = prompt | model | parser` syntax to chain together multiple components in LangChain using the pipe (`|`) operator.
Step 6: Use `PyPDFLoader` to load and split the document into manageable pieces.
Step 7: Print out the contents of each page for inspection.
Step 8: Store, manage, and search vector embeddings directly in memory using the DocArray library.
Step 9: Perform a search to find documents that are semantically similar to the provided query.
Step 10: Convert the vector store into a retriever for efficient document retrieval.
Step 11: Send the relevant documents to the LLM (Language Model) for processing and response generation.
Step 1: Initialize and declare necessary variables.
import os  # needed for os.getenv

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")  # read from the environment; not used by the local Ollama model below
MODEL = "phi"
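If the key lives in a `.env` file instead of the shell environment, a common alternative (a minimal sketch, assuming the third-party `python-dotenv` package is installed) is:

import os
from dotenv import load_dotenv  # assumption: python-dotenv is installed

load_dotenv()  # copies variables from a .env file into the process environment
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")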
Step 2: Create instances of the Ollama model and Ollama Embedding model, then invoke the model for processing.
from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings
model = Ollama(model=MODEL)
embeddings = OllamaEmbeddings(model=MODEL)
model.invoke("Tell me a joke")
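As a quick sanity check on the embedding side, `embed_query` turns a string into a plain list of floats (a minimal sketch):

vector = embeddings.embed_query("Tell me a joke")
print(len(vector))  # dimensionality of the embedding vector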
Step 3: Parse the output from the model (such as a language model or retriever) into a string format, performing any necessary cleaning and formatting.
from langchain_core.output_parsers import StrOutputParser
parser = StrOutputParser()
chain = model | parser
chain.invoke("Tell me a joke")
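With a completion-style LLM such as `Ollama` the output is already a string, so `StrOutputParser` is effectively a pass-through here; with a chat model it would extract the text content from the message. You can call it directly to see this:

parser.invoke("raw model output")  # returns 'raw model output' unchanged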
Step 4: Create a template using `PromptTemplate`, which allows you to define placeholders that can later be filled with dynamic values.
from langchain.prompts import PromptTemplate
template = """
Answer the question based on the context below. If you can't
Public ‫عام‬
answer the question, reply " say i don't know".
context: {context}
question : {question}
"""
prompt = PromptTemplate.from_template(template)
#This line creates a PromptTemplate object using the template string.
print(prompt.format(context="here is the context", question="Here is a question"))
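`from_template` infers the placeholder names from the string; you can confirm which variables the prompt expects:

print(prompt.input_variables)  # ['context', 'question']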
Step 5: Use the `chain = prompt | model | parser` syntax to chain together multiple components in LangChain using the pipe (`|`) operator.
chain = prompt | model | parser
chain.invoke({"context": "My parents named me Raj", "question": "What is my age '?"})
Step 6: Use `PyPDFLoader` to load and split the document into manageable pieces.
from langchain_community.document_loaders import PyPDFLoader
#loader = PyPDFLoader(r"C:\OLLAMA\SDM.pdf")
loader = PyPDFLoader(r"C:\OLLAMA\ReleaseNotes.pdf")
pages = loader.load_and_split()
pages
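Each element of `pages` is a LangChain `Document` whose metadata records the source file and page number:

print(pages[0].metadata)  # e.g. {'source': 'C:\\OLLAMA\\ReleaseNotes.pdf', 'page': 0}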
Step 7: Print out the contents of each page for inspection.
for page in pages:
    print(page.page_content)  # Prints the content of each page
    print("------")
Step 8: Store, manage, and search vector embeddings directly in memory using the DocArray library.
from langchain_community.vectorstores import DocArrayInMemorySearch
# Imports the DocArrayInMemorySearch class from the langchain_community.vectorstores module.
# This class allows you to store, manage, and search vector embeddings directly in memory
# using the DocArray library.

vectorstore = DocArrayInMemorySearch.from_documents(
    pages,
    embedding=embeddings,
)
Step 9: Perform a search to find documents that are semantically similar to the provided query.
query = "What is the role of the SAP Service Delivery Manager?"
result = vectorstore.similarity_search(query)
print(result)
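To see how close each match is, the store also exposes `similarity_search_with_score`, and `k` limits the number of results (a minimal sketch):

for doc, score in vectorstore.similarity_search_with_score(query, k=2):
    print(score, doc.page_content[:80])  # score plus a snippet of the matching page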
Step 10: Convert the vector store into a retriever for efficient document retrieval.
retriever = vectorstore.as_retriever()
# Converts the vectorstore into a retriever.
# A retriever is a wrapper around the vector store that makes it easy to retrieve
# relevant documents.
# It allows you to perform vector similarity search in a way that integrates well
# with LangChain components.
retriever.invoke("project manager")

# Example usage
retriever = vectorstore.as_retriever()
query = "project manager"
results = retriever.get_relevant_documents(query)
for result in results:
    print(result)
query = "What are the skills of the person?"
search_results = vectorstore.similarity_search(query)
print(search_results)
query = "oracle"
search_results = vectorstore.similarity_search(query)
print(search_results)
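If you want the retriever to return a fixed number of documents, `as_retriever` accepts search parameters (a minimal sketch; `k` caps how many matches come back):

retriever = vectorstore.as_retriever(search_kwargs={"k": 2})
retriever.invoke("project manager")  # returns at most 2 documents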
Step 11: Send the relevant documents to the LLM (Language Model) for processing and response generation.
# This code creates a processing chain that takes a question, retrieves relevant documents,
# formats them into a prompt, sends them to an LLM, and parses the response.
from operator import itemgetter
# Imports itemgetter, which is used to extract specific values from a dictionary.
# It helps map input data to different parts of the chain.

chain = (
    {
        "context": itemgetter("question") | retriever,
        "question": itemgetter("question"),
    }
    | prompt
    | model
    | parser
)
chain.invoke({"question": "What is the first company name mentioned in work experience?"})
# Calls the entire chain with a question.
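Because the full chain is itself a Runnable, the answer can also be streamed token by token instead of waiting for the complete response (a minimal sketch):

for chunk in chain.stream({"question": "Summarize the document in one sentence."}):
    print(chunk, end="", flush=True)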