Document AI Invoice Key-Value Extraction Labeling Instruction

Overview

This feature lets the user identify the values of pre-defined keys (or “fields”) on an invoice. Beyond OCR, it identifies the relationships between blocks of text and a pre-defined key. Key-value pairs for invoices can take a few different forms:

- One listed key to one listed value: one key to one value block (key: “name”, value: “ana”).
- One listed key to many listed values: one key to two value blocks (key: “name”, value: “ana” “smith”).
- One unlisted key to a listed value: fields that are common in invoices but not explicitly labelled on the image itself. An example is the “Merchant_Address” field, where the invoice usually shows only the actual address (“1245 highway 1”) and no label with the text “address:”.

Labeling Instruction

Figure 1 shows two invoice image examples. The key-value annotation is based on the meanings and layouts of the text in an invoice image. Table 1 lists the 47 keys (field names) defined by the Oracle team that need to be labelled. These 47 keys are: Merchant_Name, Merchant_Name_Logo, Merchant_Phone, Merchant_Address, Date, Time, Total, Subtotal, Tax, Tip, Currency, Customer_Name, Customer_ID, PO_Number, Invoice_Number, Date_Due, Merchant_Tax_ID, Merchant_Recipient, Customer_Address, Customer_Tax_ID, Customer_Recipient, Billing_Address, Billing_Recipient, Shipping_Address, Shipping_Recipient, Pay_Terms, Value_Added_Tax, Amount_Due, Service_Address, Service_Recipient, Remittance_Address, Remittance_Recipient, Date_Start, Date_End, Unpaid_Amount, Shipping_Cost, Item_Description, Item_Code, Item_Unit, Item_Date, Item_Tax, Item_VAT, Item_Name, Item_Price, Item_Quantity, Item_TotalPrice, and Other.

(a) Example 1. (b) Example 2.
Figure 1. Two invoice image examples.

Table 1. General definitions for field names in invoices. Example 2 supplements Example 1 because some keys are not listed in Example 1.
| Field name | Details | Example 1 | Example 2 |
|---|---|---|---|
| Merchant_Name | Name of the merchant issuing the invoice (usually at the top) | East Repair Inc. | |
| Merchant_Name_Logo | Merchant_Name but with artistic font | Not listed | |
| Merchant_Address | Address of the merchant (usually near the top) | 1912 Harvest Lane New York, NY 12210 | |
| Merchant_Phone | Phone number of the merchant (usually near the top) | Not listed | 9897776666 |
| Date | Invoice issue date | 11/02/2019 | |
| Time | Invoice issue time | Not listed | |
| Total | Total of the invoice, after all charges and taxes have been applied (usually at the bottom) | 154.06 | |
| Subtotal | Number listed as subtotal, usually before taxes (usually above total amount) | 145.00 | |
| Tax | Sometimes listed as sales tax (usually after subtotal and above total) | 9.06 | |
| Tip | Amount of tip given by buyer (usually after subtotal and above total) | Not listed | |
| Currency | Currency used in transaction. Only annotate when a currency symbol (such as $) is explicitly displayed. Currently only five currency signs need to be labelled (₹: Indian, ¥: Chinese/Japanese, $: US, €: European, £: British) | $ | |
| Customer_Name | Invoiced customer | Not listed | Oracle |
| Customer_ID | Customer reference ID | Not listed | |
| PO_Number | Purchase order reference number | 2312/2019 | |
| Invoice_Number | ID for this specific invoice (often "Invoice Number") | US-001 | |
| Date_Due | Date payment for this invoice is due | 26/02/2019 | |
| Merchant_Tax_ID | The taxpayer number associated with the vendor | Not listed | |
| Merchant_Recipient | Name associated with the Merchant_Address | Not listed | City Source |
| Customer_Address | Mailing address for the customer | Not listed | |
| Customer_Tax_ID | The taxpayer number associated with the customer | Not listed | |
| Customer_Recipient | Name associated with the Customer_Address | Not listed | |
| Billing_Address | Explicit billing address for the customer | 2 Court Square New York, NY 12210 | 100 Oracle Ave Seattle, WA 98101 |
| Billing_Recipient | Name associated with the Billing_Address | John Smith | |
| Shipping_Address | Explicit shipping address for the customer | 3787 Pineview Drive Cambridge, MA 12210 | 789 Allen St Seattle, WA 98101 |
| Shipping_Recipient | Name associated with the Shipping_Address | John Smith | |
| Pay_Terms | The terms of payment for the invoice | Not listed | NET 30 |
| Value_Added_Tax | Total VAT field identified on this invoice | Not listed | $100.00 |
| Amount_Due | Total amount due to the vendor | Not listed | $1,623.86 |
| Service_Address | Explicit service address or property address for the customer | Not listed | 789 Allen St Seattle, WA 98101 |
| Service_Recipient | Name associated with the Service_Address | Not listed | Oracle Cloud Service |
| Remittance_Address | Explicit remittance or payment address for the customer | Not listed | 747 Buford Rd. Suite 210 Cleveland, OH 44115 |
| Remittance_Recipient | Name associated with the Remittance_Address | Not listed | City Source |
| Date_Start | First date for the service period (for example, a utility bill service period) | Not listed | |
| Date_End | End date for the service period (for example, a utility bill service period) | Not listed | |
| Unpaid_Amount | Explicit previously unpaid balance | Not listed | |
| Shipping_Cost | Total cost of shipping a set of goods | Not listed | |
| Item information (fields detailed below) | Line items including item name, quantity, unit price, and total price (usually listed in the middle of the invoice). The item total is on the rightmost side, with quantity and unit price in between on a given row. Quantity is usually an integer, while unit price may include decimal places. | Item name: front and rear back cables, item quantity: 1, unit price: 100.00, item total price: 100.00 | Oracle Oracle Cloud |
| Item_Description | The text description for the invoice line item (only label it when listed under a "Description" or "Memo" header) | Not listed | |
| Item_Name | The name listed for a product or service (usually listed under "Item Name", "Item", or "Product Name") | front and rear back cables | |
| Item_Price | Not always listed. If listed, it is a number of equal or lesser value than Item_TotalPrice (“Item_Price_1” means it belongs to item 1; if it belongs to item n, the field name becomes Item_Price_n) | 100.00 | |
| Item_Quantity | Usually a whole number, often on the leftmost side (“Item_Quantity_1” means it belongs to item 1; if it belongs to item n, the field name becomes Item_Quantity_n) | 1 | |
| Item_TotalPrice | Usually on the rightmost side of the line item information (“Item_TotalPrice_1” means it belongs to item 1; if it belongs to item n, the field name becomes Item_TotalPrice_n) | 100.00 | |
| Item_Code | Product code, product number, or SKU associated with the specific line item (listed under "SKU", "Product Code", "Item Code", or "Item SKU", and usually an alphanumeric value that has no semantic meaning) | | 5918008 |
| Item_Unit | The unit of the line item, e.g., kg, lb, etc. | Not listed | |
| Item_Date | Date corresponding to each line item. Often it is the date the line item was shipped | Not listed | 11/22/17 |
| Item_Tax | Tax associated with each line item. Possible values include tax amount, tax %, and tax Y/N | Not listed | |
| Item_VAT | Stands for value-added tax. This is a flat tax levied on an item. Common in European countries | Not listed | $2.00 |
| Other | Any information that does not belong to the above 46 fields should be labelled as “Other” | | |

Figure 2 shows how the words in an invoice image are annotated with their position (2-dimensional coordinates), text, and key. Each word has a bounding box that represents its position in the invoice, as shown in Figure 2(a). The four corners of the bounding box (x0, y0, x1, y1, x2, y2, x3, y3) should be labelled in clockwise order starting from the top-left corner, as illustrated in Figure 2(b). The annotations for each invoice should be recorded in one .json file. As Figure 2(b) shows, each word should be labelled with the correct “text”, “key”, and “bounding_polygon”: “text” is the ground-truth text of the word, “key” is the key that describes the word, and “bounding_polygon” holds the four corner points of the word’s bounding box.
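As a rough illustration of the word-level annotation described above, the sketch below builds one word record and checks the corner order. The three field names come from the text; the flat [x0, y0, ..., x3, y3] layout, the pixel values, and the helper name is_clockwise_from_top_left are assumptions made for illustration, since the exact schema is defined by Figure 2(b). The top-left test (smallest x + y) is a heuristic that holds for upright or mildly rotated boxes only.

```python
import json

# Hypothetical word record. The field names come from the text above; the flat
# [x0, y0, ..., x3, y3] layout and the pixel values are assumptions.
word = {
    "text": "East",
    "key": "Merchant_Name_1",
    "bounding_polygon": [40, 30, 95, 30, 95, 52, 40, 52],
}


def is_clockwise_from_top_left(polygon):
    """Return True if the 4 corners run clockwise starting at the top-left.

    Image coordinates are assumed: x grows rightwards, y grows downwards.
    """
    if len(polygon) != 8:
        return False
    points = list(zip(polygon[0::2], polygon[1::2]))
    # Shoelace sum; positive means clockwise on screen when y grows downwards.
    area2 = sum(
        x_a * y_b - x_b * y_a
        for (x_a, y_a), (x_b, y_b) in zip(points, points[1:] + points[:1])
    )
    # Heuristic: the top-left corner has the smallest x + y of the four.
    first_is_top_left = min(points, key=lambda p: p[0] + p[1]) == points[0]
    return area2 > 0 and first_is_top_left


print(json.dumps(word))
print(is_clockwise_from_top_left(word["bounding_polygon"]))
```

A check like this could run over every record before a file is submitted, so that polygons entered counter-clockwise or starting from the wrong corner are caught early.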
KEY will always be labelled as “KEY_n” (exception: Other) to reflect the group-level labelling, where “_n” denotes the nth group for the KEY. For example, the three words representing the Merchant Name in Figure 2(a), “East Repair Inc.”, are the 1st group of “Merchant_Name”. The KEY labelling for “East”, “Repair”, and “Inc.” is therefore “Merchant_Name_1” in all three cases, as shown in the Figure 2(b) .json file. If another group of Merchant Name words appears later, those words should be labelled “Merchant_Name_2”, and so on. In the .json file, all words must appear in strict left-to-right, top-to-bottom human reading order from the invoice image.

(a) An invoice image example. The bounding boxes are visualizations of each word’s position on the image.
(b) Annotation .json file format. “x0, y0, x1, y1, x2, y2, x3, y3” represent the positions of the four corners in an image. “text” is the ground-truth text for a word. “key” is the field name suitable for the word.
Figure 2. Invoice image word annotation (.json file) for each word’s text, key, and bounding_polygon.

Quantitative Evaluation Protocol

Labeling accuracy must reach at least 95%, as defined below:

(# correctly identified entity-level fields) / (# identifiable entity-level fields present in document) >= 0.95

An entity is the logical grouping of individual tokens/words that map to the same key/entity defined by Oracle. It is the smallest unit for K/V annotation quality evaluation. For example, “Costco Wholesale” is annotated as two words, and both refer to the same entity (Merchant_Name_1). If any annotation within an entity is incorrect, the whole entity is deemed incorrect.
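The entity-level criterion can be sketched as follows. This is a minimal illustration, not the official scoring tool: it assumes the ground-truth and submitted word lists are aligned one-to-one in reading order, that an entity is the set of words sharing one group key, that a word counts as correct when its text and key both match, and that “Other” words do not form an entity. The function name entity_accuracy and the sample records are hypothetical.

```python
from collections import defaultdict


def entity_accuracy(ground_truth, submitted):
    """Fraction of ground-truth entities whose words are all labelled correctly.

    Both arguments are lists of word records in the same reading order.
    """
    if len(ground_truth) != len(submitted):
        raise ValueError("word lists must be aligned one-to-one")
    entities = defaultdict(list)  # group key -> per-word verdicts
    for truth, answer in zip(ground_truth, submitted):
        if truth["key"] == "Other":
            continue  # assumption: "Other" words do not form an entity
        word_ok = truth["key"] == answer["key"] and truth["text"] == answer["text"]
        entities[truth["key"]].append(word_ok)
    if not entities:
        return 1.0  # nothing identifiable in the document
    correct = sum(all(verdicts) for verdicts in entities.values())
    return correct / len(entities)


truth = [
    {"text": "Costco", "key": "Merchant_Name_1"},
    {"text": "Wholesale", "key": "Merchant_Name_1"},
    {"text": "Total", "key": "Other"},
    {"text": "154.06", "key": "Total_1"},
]
answer = [
    {"text": "Costco", "key": "Merchant_Name_1"},
    {"text": "Wholesale", "key": "Other"},  # one wrong word spoils the entity
    {"text": "Total", "key": "Other"},
    {"text": "154.06", "key": "Total_1"},
]
score = entity_accuracy(truth, answer)
print(score, score >= 0.95)  # 0.5 False
```

In this sample, Merchant_Name_1 fails because one of its two words carries the wrong key, while Total_1 passes, so one of two entities is correct and the document falls below the 0.95 threshold.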