Department of Data Science
Natural Language Processing
Dr. M.Z. Jhandir

The Islamia University of Bahawalpur
Faculty of Computing
Department of Data Science

Class: BSDS 6th Semester
Course: DASC-2105-Natural Language Processing
Course Instructor: Dr Muhammad Zeeshan Jhandir

Assignment 1: Regular Expressions

Objective:
The objective of this assignment is to familiarize you with regular expressions and to enhance your skills in pattern matching and text processing.

PART I

Problem 1: Write a regular expression to match all the words that start with a capital letter in a given string.
Problem 2: Write a regular expression to match all the words that end with "ing" in a given string.
Problem 3: Write a regular expression to match all the hexadecimal numbers in a given string.
Problem 4: Write a regular expression to match all the IP addresses in a given string.
Problem 5: Write a regular expression to match all the dates in a given string.
Problem 6: Write a regular expression to match all the email addresses in a given string.
Problem 7: Write a regular expression to match all the URLs in a given string.
Problem 8: Write a regular expression to match all the phone numbers in a given string.
Problem 9: Write a regular expression to match all the HTML tags in a given string.
Problem 10: Write a regular expression to match all the words that contain exactly three vowels in a given string.
Problem 11: Write a regular expression to match all the words that contain exactly three of the same letters in a given string.
Problem 12: Write a regular expression to match all the words that contain a sequence of two or more vowels in a given string.
Problem 13: Write a regular expression to match all the words that contain both a digit and a letter in a given string.
Problem 14: Write a regular expression to match all the words that contain a repeated letter in a given string.
Problem 15: Write a regular expression to match all the palindromic words in a given string.

PART II

Write the algorithm for each problem given above, and explain the working of every step of your algorithm in your own words. Also draw a flowchart for each problem and give a pictorial diagram of what is going on. A sample for Problem 1 follows.

Algorithm:
1. Import the regular expression module (re).
2. Define a regular expression pattern that matches words starting with a capital letter, using the \b word boundary symbol to mark the start of each word, followed by the character class [A-Z] for the capital letter. (The ^ symbol matches only the start of the string, so on its own it would find at most the first word.)
3. Use the re.findall() function to find all the matches of the pattern in the given string.
4. Return the list of matched words.

Explanation:
Here you would explain each step of the above algorithm.

Flowchart:
[figure not reproduced]

Pictorial Presentation:
[figure not reproduced]

PART III

Convert all problems into Python code. Your code must solve each problem as stated.

Submission:

Part I: Please submit your regular expressions as a single file named NLPassignment 1 REGEX – YOUR NAME – YOUR ROLLnumber .txt. The file should include comments explaining how each task is implemented.

Part II: Submit your solution as a single file named NLPassignment 1 REGEX – YOUR NAME – YOUR ROLLnumber .docx. The file should include all the requirements explained in PART II. Normal text must be in Times New Roman, size 11, justified. Headings must use the same font, size 14 to 18, bold but not underlined.

Part III: Submit your programming solution as a single file named NLP assignment 1 REGEX – YOUR NAME – YOUR ROLLnumber .py. The code must be properly commented and must bear your details at the top of the file in comments, such as your name, roll number, class, and section.

Grading: (0.5% in your overall course grading)
The assignment will be graded based on the correct implementation of the tasks and the readability of the regular expressions.
DO REMEMBER THAT THE MARKS WILL BE INCLUDED IN THE FINAL RESULTS OF THE COURSE GRADES.
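
Illustration for Problem 1:
The sample algorithm given in PART II for Problem 1 could be turned into Python roughly as follows. This is a minimal sketch, not a model answer. The names CAPITALIZED_WORD and find_capitalized_words are hypothetical and chosen only for illustration. The pattern assumes ASCII capital letters (A to Z), and \w also accepts digits and underscores inside a word, which may or may not be what a given problem intends.

```python
import re

# Hypothetical names for illustration only; they do not come from the assignment.
# \b     a word boundary, so the match begins at the start of a word
# [A-Z]  one ASCII capital letter
# \w*    the rest of the word (letters, digits, underscores)
CAPITALIZED_WORD = re.compile(r"\b[A-Z]\w*")


def find_capitalized_words(text):
    """Return every word in text that starts with an ASCII capital letter."""
    return CAPITALIZED_WORD.findall(text)


if __name__ == "__main__":
    sample = "The quick Brown fox visited Bahawalpur in 2023."
    print(find_capitalized_words(sample))  # ['The', 'Brown', 'Bahawalpur']
```

The \b boundary is what allows the pattern to find capitalized words anywhere in the string. Anchoring with ^ instead would restrict the match to the very beginning of the string.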