AI Data Augmentation Techniques

AI Assistant Data Augmentation:
• Automated Augmentation:
  ▪ Use nlpaug.augmenter.word to augment some of the words in the given questions, generating new questions that are added to the training data.
     - nlpaug: a Python library for data augmentation in natural language processing (NLP). It provides various techniques for augmenting text data to improve the performance of NLP models.
     - The augmentation source is WordNet.
     - WordNet: a lexical database of the English language that relates words to one another in terms of synonyms, hypernyms, and hyponyms.
     - Synonyms: words or phrases that have similar or identical meanings.
     - Hypernym: a word that represents a category or general concept. It is more abstract and encompasses a range of more specific terms. Example: Animal.
     - Hyponym: a word that falls within a more general category represented by a hypernym. It is a more specific term that describes a particular instance or subtype of the broader concept. Example: Bird.
     - Cosine similarity is used to compare each augmented question to the original question, and the augmented questions are filtered based on the average score of all the augmented questions.
  ▪ Modify the way the questions are asked.
     - Example: “What are the types of…” is modified to “Could you tell me the types of…”.
• Manual Augmentation:
  ▪ Use ChatGPT to generate multiple examples of the questions given in the data.
Data Structure:
• Instead of editing the JSON file directly by writing tags, patterns, and responses, the user can put questions and answers in a text file, which is then transformed into a JSON file used to train the model.
• Each list of questions and answers must be separated by a space.
Test Case of augmented data:
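
The automated augmentation and filtering steps described above could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the cosine similarity is computed over simple bag-of-words counts (the text does not say which vector representation was used), and "filter based on the average score" is read here as "keep the augmented questions that score at or above the mean". The nlpaug call assumes nlpaug and the NLTK WordNet data are installed.

```python
import math
import re
from collections import Counter


def augment_with_wordnet(question, n=5):
    """Generate up to n synonym-substituted variants of a question.

    Requires nlpaug and the NLTK WordNet corpus; imported lazily so the
    similarity helpers below can be used without them.
    """
    import nlpaug.augmenter.word as naw

    aug = naw.SynonymAug(aug_src="wordnet")
    result = aug.augment(question, n=n)
    # Some nlpaug releases return a plain string for a single variant.
    return [result] if isinstance(result, str) else list(result)


def bow_vector(text):
    """Bag-of-words term counts (an assumed representation)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between the bag-of-words vectors of two strings."""
    va, vb = bow_vector(a), bow_vector(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_by_average(original, candidates):
    """Keep candidates whose similarity to the original is >= the mean score."""
    scores = [cosine_similarity(original, c) for c in candidates]
    if not scores:
        return []
    mean_score = sum(scores) / len(scores)
    return [c for c, s in zip(candidates, scores) if s >= mean_score]
```

A typical call would be `filter_by_average(q, augment_with_wordnet(q))` for each original question `q`. Using the mean as the cutoff discards the variants that drifted furthest from the original wording; a fixed threshold would be an equally plausible choice.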
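
The text-file-to-JSON step under Data Structure might look something like the sketch below. The source does not specify the text file layout, so everything about the format here is an assumption made for illustration: groups are separated by a blank line, questions are prefixed with `Q:`, answers with `A:`, tags are generated as `intent_1`, `intent_2`, and so on, and the output uses a top-level `"intents"` list with `tag`, `patterns`, and `responses` keys.

```python
import json
import re


def text_to_intents(text):
    """Convert question/answer groups in plain text to an intents dictionary.

    Assumed (hypothetical) input format: groups separated by a blank line,
    with 'Q:' lines for questions and 'A:' lines for answers.
    """
    intents = []
    blocks = [b for b in re.split(r"\n\s*\n", text.strip()) if b.strip()]
    for index, block in enumerate(blocks, start=1):
        patterns, responses = [], []
        for line in block.splitlines():
            line = line.strip()
            if line.startswith("Q:"):
                patterns.append(line[2:].strip())
            elif line.startswith("A:"):
                responses.append(line[2:].strip())
        # Skip groups that lack either a question or an answer.
        if patterns and responses:
            intents.append({
                "tag": f"intent_{index}",  # generated tag, an assumption
                "patterns": patterns,
                "responses": responses,
            })
    return {"intents": intents}


sample = (
    "Q: What are the types of augmentation?\n"
    "Q: Could you tell me the types of augmentation?\n"
    "A: Automated and manual.\n"
    "\n"
    "Q: What is WordNet?\n"
    "A: A lexical database of English.\n"
)
print(json.dumps(text_to_intents(sample), indent=2))
```

Writing the result to disk is then a single `json.dump` call, which keeps the user away from hand-editing tags, patterns, and responses.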
