Alchemist Hour – Entity Extraction Question & Answer

1. How many different entity types does AlchemyAPI support?
AlchemyAPI supports 42 primary types, along with hundreds of subtypes. Click here for a list.

2. Is there a limit to the number of entities the API can identify in one call?
No, as long as the size of the text falls within the 50 KB limit.

3. Can you raise the size restriction for the text or HTML?
No. With entities, however, you can break the text into multiple pieces if the file is too large. This will not affect the number of entities extracted.

4. How quickly does extraction occur?
The entity extraction call takes less than one second to return a response with the entities found in the text.

5. When looking at the types of entities supported, it does not appear that dates are included. Is there any reason for that?
To determine the publication date of a piece of text, use the Pub Date API rather than the Entity Extraction API.

6. So, your entity extraction uses machine learning to find entities?
Yes. We use a hybrid neural network and statistical approach to train our systems in both an unsupervised and a supervised manner.

7. Is the knowledge graph a DBpedia graph, or some other customized graph?
The knowledge graph is our own custom graph and was built by our engineers.

8. Will AlchemyAPI lemmatize or stem non-capitalized words? If so, will this impact the entities extracted?
Capitalization is taken into account when extracting entities, but it is not the sole factor in determining whether an entity is extracted.

9. Are entities based on an ontology of general things, or can they be trusted when extracting entities from websites containing scientific data?
Our Entity Extraction API is trained on relatively general data. However, we have trained it on a large amount of data, and therefore it performs well on industry-specific entities. In the near future, we hope to allow people to customize based on their own documents.
Try our demo to see how well the Entity Extraction API works for your use case.

10. When given an input, how does the API identify and extract names?
The system is trained to recognize well-known names. It can also recognize generic names based on the context of the sentence.

11. How does pricing work for sentiment analysis in entity extraction?
Entity extraction counts as one transaction. Adding sentiment analysis adds one more transaction, for a total of two. With AlchemyAPI’s free plan, you receive 1,000 transactions per day. To upgrade from the free plan, our sales team can get you up and running quickly. Visit our sales page to contact them.

12. What are some good examples of use cases for entity extraction?
Entity extraction is often used in combination with our other services. For example, it is often used in conjunction with our Sentiment Analysis API and News API to pull out relevant semantic information.

13. Can you provide an example of the entity extraction using the “Crime” entity?
If you input the sentence, “He was accused of murder”, the Entity Extraction API will extract “murder” as a crime-type entity. Try it for yourself in the demo.

14. Questions surrounding personalization
We received several questions about various types of personalization, including:
Can we feed the AlchemyAPI parser with our own classes?
Can we provide/add a third-party datasource for entity disambiguation?
Can I define a list of custom entities?
Can I build a customized knowledge graph, which includes entities specific to my company’s corpus?
We would like to address all of these questions together. We are looking to accommodate customization requests, but do not have any solutions at this time.

15. How is the sentiment score of an entity calculated?
Our sentiment score represents how confident the API is about the sentiment type of the associated term.
Score values close to zero represent low confidence, values close to -1 indicate high confidence that the sentiment is negative, and values close to 1 indicate high confidence that the sentiment is positive. Exactly how this number is calculated is proprietary, but it involves using both supervised and unsupervised learning to train neural nets.

16. How does sentiment analysis in entity extraction differ from sentiment analysis in the News API?
Using the News API, developers can pull entity-level sentiment analysis information, determined on a word-by-word basis. Document-level sentiment analysis is a bit different, as it looks at the entire text to determine if it is positive or negative.

17. Is the knowledge graph the same as the taxonomy feature?
No, but they are similar. Taxonomy looks at the entire text and tries to fit it within a hierarchy. The knowledge graph works behind the scenes to find a hierarchy for a single keyword or entity.

18. Does AlchemyAPI provide any functionality to compare entities in two separate documents and match text based on similar, if not the same, entities?
This is not currently an available function. However, it is fairly easy to do on the application side once you have the results from the two separate calls.

19. What is the difference between keywords, entities, and concepts?
Entities are specific nouns and noun phrases (e.g. specific companies or persons, such as “Douglas Adams”).
Keywords are general nouns and noun phrases (e.g. nonspecific terms, such as “Author”).
Concepts are identified from the text as a whole, rather than by extracting individual pieces, even if the topic is not explicitly stated.
For example, take the phrase “The CEO of Space X and Tesla Motors.”
Keywords: “CEO”
Entities: “Space X”, “Tesla Motors”, “CEO”
Concepts: “Elon Musk”

20. Is it possible to exclude extraction of certain entity types?
Using the structuredEntities parameter, you can eliminate certain types of entities, including Twitter handles and hashtags. It is also easy to filter entities by type after your data is returned. Visit the docs for more information.

21. Does AlchemyAPI identify entailment? For instance, in HTML, JavaScript that makes elements visible.
Our system tries to identify and analyze the main text in a document.

22. Is all data extracted only in JSON format, or does AlchemyAPI provide graph visualizations to communicate the data?
AlchemyAPI provides structured text output, such as JSON or XML. To obtain visual representations of the data, you will need to construct those visualizations from the output that is returned. Take a look at this recipe that showcases building visualizations using R.

23. Is there much in academic literature about AlchemyAPI and its features? Do you have scientists and researchers who might publish?
AlchemyAPI has not published any papers on the techniques used for our services; that information is kept proprietary.

24. When the API references entities related to pronouns, how is the distance of that reference determined? Is it immediately adjacent sentences only?
The way our systems operate is kept proprietary. Please see the answer to question 23.

25. We noticed that some location entities have GeoNames IDs, and others do not. Will you be adding GeoNames IDs to more geo entities in the future?
We do not currently have plans to update our geo entities feature. Keep checking in for any updates that may become available.
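
Questions 18 and 20 both mention work that can be done on the application side once results are returned: filtering entities by type, and matching entities across two documents. The following is a minimal Python sketch of that post-processing, not an official client. It assumes each call's result has already been parsed into a list of dictionaries with "type" and "text" keys; the helper names (filter_by_type, entity_keys, shared_entities), the type labels, and the sample data are all hypothetical and only for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Assumed shape of one parsed entity: {"type": "...", "text": "..."}.
Entity = Dict[str, str]


def filter_by_type(entities: Iterable[Entity], excluded: Set[str]) -> List[Entity]:
    """Drop entities whose type is in the excluded set (question 20)."""
    return [e for e in entities if e.get("type") not in excluded]


def entity_keys(entities: Iterable[Entity]) -> Set[Tuple[str, str]]:
    """Normalize entities to (type, lowercased text) pairs for comparison."""
    return {(e["type"], e["text"].strip().lower()) for e in entities}


def shared_entities(doc_a: Iterable[Entity], doc_b: Iterable[Entity]) -> Set[Tuple[str, str]]:
    """Return the entities that appear in both documents (question 18)."""
    return entity_keys(doc_a) & entity_keys(doc_b)


# Hypothetical results from two separate entity extraction calls.
doc_a = [
    {"type": "Company", "text": "Tesla Motors"},
    {"type": "Person", "text": "Elon Musk"},
    {"type": "TwitterHandle", "text": "@elonmusk"},
]
doc_b = [
    {"type": "Company", "text": "tesla motors"},
    {"type": "City", "text": "Denver"},
]

# Remove handle and hashtag entities, then find what the documents share.
kept = filter_by_type(doc_a, {"TwitterHandle", "Hashtag"})
common = shared_entities(kept, doc_b)
print(sorted(common))  # [('Company', 'tesla motors')]
```

Matching on exact lowercased text is the simplest possible choice here. An application that needs to treat "similar" entities as matches would have to add its own normalization or fuzzy comparison on top of this.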