RAG PROJECT CODE EXPLANATION

Contents

Step 1: Initialize and declare necessary variables.
Step 2: Create instances of the Ollama model and Ollama Embedding model, then invoke the model for processing.
Step 3: Parse the output from the model (such as a language model or retriever) into a string format, performing any necessary cleaning and formatting.
Step 4: Create a template using `PromptTemplate`, which allows you to define placeholders that can later be filled with dynamic values.
Step 5: Use the `chain = prompt | model | parser` syntax to chain together multiple components in LangChain using the pipe (`|`) operator.
Step 6: Use `PyPDFLoader` to load and split the document into manageable pieces.
Step 7: Print out the contents of each page for inspection.
Step 8: Store, manage, and search vector embeddings directly in memory using the DocArray library.
Step 9: Perform a search to find documents that are semantically similar to the provided query.
Step 10: Convert the vector store into a retriever for efficient document retrieval.
Step 11: Send the relevant documents to the LLM (Language Model) for processing and response generation.

Step 1: Initialize and declare necessary variables.

import os

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")  # Read from the environment; unused by the local Ollama examples below.
MODEL = "phi"

Step 2: Create instances of the Ollama model and Ollama Embedding model, then invoke the model for processing.

from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings

model = Ollama(model=MODEL)
embeddings = OllamaEmbeddings(model=MODEL)

model.invoke("Tell me a joke")

Step 3: Parse the output from the model (such as a language model or retriever) into a string format, performing any necessary cleaning and formatting.

from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()
chain = model | parser
chain.invoke("Tell me a joke")

Step 4: Create a template using `PromptTemplate`, which allows you to define placeholders that can later be filled with dynamic values.

from langchain.prompts import PromptTemplate

template = """
Answer the question based on the context below. If you can't
answer the question, reply "I don't know".

Context: {context}

Question: {question}
"""

prompt = PromptTemplate.from_template(template)  # Creates a PromptTemplate object from the template string.
print(prompt.format(context="here is the context", question="Here is a question"))
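Since a `PromptTemplate` is itself a Runnable in recent LangChain versions, the same placeholders can also be filled through `invoke()` with a dictionary, which is what the pipe chain in the next step does implicitly. A minimal sketch (not part of the original code):

prompt_value = prompt.invoke({
    "context": "here is the context",
    "question": "Here is a question",
})
print(prompt_value.to_string())  # Renders the same text as prompt.format(...) above.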
Step 5: Use the `chain = prompt | model | parser` syntax to chain together multiple components in LangChain using the pipe (`|`) operator.

chain = prompt | model | parser
chain.invoke({"context": "My parents named me Raj", "question": "What is my age?"})

Step 6: Use `PyPDFLoader` to load and split the document into manageable pieces.

from langchain_community.document_loaders import PyPDFLoader

# loader = PyPDFLoader(r"C:\OLLAMA\SDM.pdf")
loader = PyPDFLoader(r"C:\OLLAMA\ReleaseNotes.pdf")
pages = loader.load_and_split()
pages

Step 7: Print out the contents of each page for inspection.

for page in pages:
    print(page.page_content)  # Prints the content of each page.
    print("------")

Step 8: Store, manage, and search vector embeddings directly in memory using the DocArray library.

from langchain_community.vectorstores import DocArrayInMemorySearch
# Imports the DocArrayInMemorySearch class from the langchain_community.vectorstores module.
# This class allows you to store, manage, and search vector embeddings directly in memory using the DocArray library.

vectorstore = DocArrayInMemorySearch.from_documents(
    pages,
    embedding=embeddings
)

Step 9: Perform a search to find documents that are semantically similar to the provided query.

query = "What is the role of the SAP Service Delivery Manager?"
result = vectorstore.similarity_search(query)
print(result)

Step 10: Convert the vector store into a retriever for efficient document retrieval.

retriever = vectorstore.as_retriever()
# Converts the vectorstore into a retriever.
# A retriever is a wrapper around the vector store that makes it easy to retrieve relevant documents.
# It lets you perform vector similarity search in a way that integrates well with other LangChain components.

retriever.invoke("project manager")

# Example usage
retriever = vectorstore.as_retriever()
query = "project manager"
results = retriever.get_relevant_documents(query)
for result in results:
    print(result)

query = "What are the skills of the person?"
search_results = vectorstore.similarity_search(query)
print(search_results)

query = "oracle"
search_results = vectorstore.similarity_search(query)
print(search_results)

Step 11: Send the relevant documents to the LLM (Language Model) for processing and response generation.

# This code creates a processing chain that takes a question, retrieves relevant documents,
# formats them into a prompt, sends them to an LLM, and parses the response.
from operator import itemgetter
# Imports itemgetter, which is used to extract specific values from a dictionary.
# It helps map input data to different parts of the chain.

chain = (
    {
        "context": itemgetter("question") | retriever,
        "question": itemgetter("question"),
    }
    | prompt
    | model
    | parser
)

chain.invoke({"question": "What is the first company name mentioned in work experience?"})  # Calls the entire chain with a question.
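Note that `itemgetter("question") | retriever` passes the raw list of Document objects into the {context} placeholder, so the retrieved text arrives wrapped in each Document's default string representation. A common variant, sketched below on the assumption that plain Python functions are coerced into the chain as Runnables (the helper `format_docs` is illustrative, not part of the original code), joins just the page text first:

def format_docs(docs):
    # Keep only the text of each retrieved Document, separated by blank lines.
    return "\n\n".join(doc.page_content for doc in docs)

chain = (
    {
        "context": itemgetter("question") | retriever | format_docs,
        "question": itemgetter("question"),
    }
    | prompt
    | model
    | parser
)

chain.invoke({"question": "What is the first company name mentioned in work experience?"})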