Address Cleansing: Integration of REST Web Services
Chris Twirbutt & Forrest Horner
2010 November 1

Melissadata REST & SAP Integration
• Background of Sacramento County
• Address cleansing choices
• Architecture to facilitate REST web services
• Example SAP screens

Background: Sacramento County, CA
• Sacramento County, California
  – About 1.3 million citizens
  – Multiple jurisdictions, cities, districts, etc.
• SAP IS-U/CCS Utility Billing System (aka "FOCUS")
  – Implemented in July 1999 (on v1.1b); now on ECC6
  – Bills ~300K customers, ~$300M revenue
  – Waste Water, Solid Waste, Drainage, Potable Water
  – SAP tech support staff of ~8

Address Cleansing Choices
#1 – Load Postal CDs to SAP Regional Structure
  But alas, it doesn't fix "bad" addresses; it only flags that an address is not good.
#2 – Periodic Batch Cleansing
  This is reactive management, and bad addresses percolate through downstream data systems.
#3 – Web Service Address Cleansing
  This is proactive, but a technical architecture must be put in place.

Postal CD Data – Does Not "Fix"
The Postal CD comes from the Post Master, via the USPS web site:
http://www.usps.com/ncsc/addressinfo/addressinfomenu.htm
The web site has the postal AIS technical guide in PDF format and order forms for ordering the postal CD.
Note: RSADRLSM02 is the program that loads the data.
Note: SAP function module ADDR_CHECK is what SAP uses to verify whether a BP (Business Partner) or CO (Connection Object) address is valid against the postal structures.

Periodic Batch Cleanse (Monthly)
[Diagram labels: ADRC table; File dump; Data Extract; SAP Production; Cleansing Service; Address Corrections File; Data Upload and update]
This fixes addresses after the fact: after bad mail has already gone out, and after other downstream data systems have already consumed the bad address.

Melissadata – Web Service
Melissadata has many different address cleansing products.
Meldata returns a "corrected" address.
[Diagram labels: SAP Production / SYSID "PPW"; Web Service; FOCUS; RFC; Web Hit Counter; Dev / Test Systems; Address Cache]
An "architecture" had to be built in SAP. The SAP production system acts as a "service bus".

Overview of MelData Web Services
Melissadata (henceforth referred to as "Meldata") offers numerous services and suites of services.
Sac County procured the "Data Quality Web Services Suite", which is a "Multiplatform toolkit to verify, correct and standardize address, phone, email and names at point-of-entry."
This suite consists of 7 services, 5 of which we have enabled and two of which we are currently using:
• Address Check
• RBDI (Residential/Business Delivery Indicator)
• GeoCode
• IP Location
• Email
• Name Check
• Phone Check

Usage Allowance
We subscribed to 600K hits per year for each enabled service.
The following is a screen-print from MelData's side which tracks our usage:
[Screenshot: usage report]

Technical Implementation: SOAP vs. REST
Overall, more companies such as Google and Amazon are using REST-based Web Services. REST stands for "Representational State Transfer". We won't go into fine detail of SOAP vs. REST, but rather explain why we chose REST.
A nice robust link on the technical differences between SOAP and REST:
http://www.taranfx.com/rest-vs-soap-using-http-choosing-the-right-webservice-protocol
We utilize the REST-based Web Service interface because:
• SOAP required more Java skills than were available in-house. For example:
  • SOAP had many data-type binding issues which had to be worked out by contacting MelData tech support and having them make changes to the WSDL, etc.
• REST was 10 times easier to code for in ABAP, as it didn't have to deal with proxy generation (data binding issues) and other Java-related issues.
• Developers who simply understand HTTP and XML can start building Web services right away, without needing any toolkits beyond what they normally use for Internet application development.

BASIS/Config. Requirement (for SAP to communicate with the Web Service)
- Using SSL (https) to access the external web site is recommended.
  - Public web servers have SSL certificates signed by a known public Certificate Authority (CA).
  - Web browsers come with a built-in store of known public CAs to verify the certificate the server is presenting.
- The SAP Web Application Server (Web AS) ships with very few CAs, so it won't know the CA that signed the remote site's SSL certificate. For example, Melissadata uses "GoDaddy".
- Navigate to the secure URL using a web browser initially, not SAP.
- While there, click the "lock" icon and download the security certificates in Base64 format. This includes both the root and class 2 certificates.
- There may be multiple certificates, called a chain. Download the certificate for each part of the chain.
- Load the security certificates into SAP (STRUST).
- Import the certificates into SAP's anonymous and client standard PSEs (Personal Security Environments).
- Still in STRUST, export the certificates (to load them into the db).
- Restart ICM.
- Set up a "hosts file" entry for Melissadata, if needed (i.e. if DNS is turned off, which it is in PPW), so the SAP server "knows" who Melissadata is.

Architecture Overview
Custom ('Z') Naming Convention
Begin all Web-Service-related Z-table names with "ZWS_".

Architecture Overview (cont'd)
• All requests for address scrubs are done through PPW (Production), regardless of source system. This is done by RFC and ensures that all hits are tracked in one place (Production).
• PPW is architected with several control and tracking tables:
  • A master table (ZWS_MELDATA) which stores/caches all detailed request history
  • A table to configure how many hits are allowed per day (ZWS_HITS_MAX)
  • A table to store actual hits-per-day counts (ZWS_HITS_DAILY)

Architecture (cont'd)
PPW contains (cont'd):
• A table (ZWS_MELDATACODES) which stores all possible return codes (and descriptions)
Note on Return Codes:
• 1st Char:
  • Address Scrub return codes begin with "A"
  • GeoCoding return codes begin with "G"
• 2nd Char:
  • "S" (Success – corrected address returned)
  • "E" (Error – not able to return a fully corrected address)

Architecture (cont'd)
• Communication to PPW from all systems:
  - An RFC interface was added (by BASIS) to all the systems that "points to" PPW. This interface is named "PPW_SCRUB".
  - A user and role were created in PPW to perform the scrub function. The user is "adrscrub", and the role is "z_address_scrub".
• Communication to Web Service from SAP/ABAP:
  A user-friendly wrapper/RFC/FM was developed called ZMELISSADATA_ADDRESS_CLEANSE.
  This FM/RFC currently performs both Address Check and Geocoding.
  This FM uses SAP's OO methods of class CL_HTTP_CLIENT and interface IF_HTTP_CLIENT, for example:
    CALL METHOD cl_http_client=>create_by_url   (sets up the URL to be called)
    CALL METHOD http_client->send               (makes a call to the URL)
    CALL METHOD http_client->receive            (receives data back from the URL)
  The result comes back in XML format and is parsed by the FM. This could have been done using an XSLT Transformation, but in our case custom code parses the XML result into Z-table fields (this was prior to our knowledge of XSLT Transformations).

Example of how to call the RFC
[Screenshots: IN / OUT]
Note: Input has an incorrect street name ("AVE" vs. "RD") and an incorrect zip (95826 instead of 95827). Output has the corrected street name and zip code (+4).
"Address Key" is required for GeoCoding (see next slide).

Example of Geo-Coding
[Screenshots: IN / OUT]

Table Entry Created (in ZWS_MELDATA)
Field               Value
MANDT               020
COUNTER ALL TIME    9910
SOURCE SYSTEM       PPW
SOURCE USER         TWIRBUTTC
CALLING PROGRAM     SAPLSEUJ
CALLING TXN         SE37
CALLED ON DATE      06/14/2010
CALLED ON TIME      14:06:45
CALLED BY           ADRSCRUB
URL CALLED          https://addresscheck.melissadata.net/v2/REST/Service.svc/doAddressCheck?id=1234567890&a1=9700%20GOETHE%20AVE&ctry=US&opt=true&a2=&ste=C&city=SACRAMENTO&state=CA&zip=95826
RETURN CODE         AS01,AS12
BACKGROUND FLAG
IN HOUSE NUM        9700
IN STREET           GOETHE AVE
IN SUITE            C
IN CITY             SACRAMENTO
IN STATE            CA
IN ZIP              95826
OUT PARSED HOUSE    9700
OUT PARSED STR      Goethe
OUT PARSED STSUF    Rd
OUT PARSED SUIT1    Ste
OUT PARSED SUIT2    C
OUT ADDRESS1        9700 Goethe Rd
OUT ADDRESS2
OUT SUITE           Ste C
OUT CITY            Sacramento
OUT STATE           CA
OUT ZIP             95827
OUT PLUS4           3558
OUT RESULT CODES    AS01,AS12
HOUSESTREETMATCH
CITY MATCH          X
STATE MATCH         X
ZIP MATCH
ADRC NUM
OUT ADDRESSKEY      95827355875
GEO RESULT CODES    GS01
GEO LAT             38.554697
GEO LONG            -121.335972

ZWS_MELDATA: Field Definitions
Field               Value        Description
MANDT               020
COUNTER_ALL_TIME    9910         # of hits made to the Web Service (all-time)
SOURCE_SYSTEM       PPW          Calling system
SOURCE_USER         TWIRBUTTC    Self-explanatory
CALLING_PROGRAM     SAPLSEUJ     Self-explanatory
CALLING_TXN         SE37         Self-explanatory
CALLED_ON_DATE      06/14/2010   Self-explanatory
CALLED_ON_TIME      14:06:45     Self-explanatory
CALLED_BY           ADRSCRUB     RFC user id (set up by BASIS)
URL_CALLED          https://addresscheck.melissadata.net/v2/REST/Service.svc/doAddressCheck?id=1234567890&a1=9700%20GOETHE%20AVE&ctry=US&opt=true&a2=&ste=C&city=SACRAMENTO&state=CA&zip=95826
RETURN_CODE         AS01,AS12    MelData return codes
BACKGROUND_FLAG                  Background/Foreground indicator
IN_HOUSE_NUM        9700         Input house #
IN_STREET           GOETHE AVE   Input street
IN_SUITE            C            Input suite (opt'l)
IN_CITY             SACRAMENTO   Input city
IN_STATE            CA           Input state
IN_ZIP              95826        Input zip
Note: Key fields are MANDT & COUNTER_ALL_TIME. An index exists on URL_CALLED since we cache based on this one field (each unique URL represents a unique AddressCheck).
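The URL-keyed cache described in the note above can be illustrated with a small Python sketch. This is not the ABAP from ZMELISSADATA_ADDRESS_CLEANSE: the class name `ScrubCache`, the `fetch` callable, and the in-memory dictionaries standing in for ZWS_MELDATA, ZWS_HITS_DAILY and ZWS_HITS_MAX are all hypothetical. It also assumes that a cached answer does not count as a hit and that a request is refused once the daily limit is reached; the slides do not say how the real function module behaves in either case. The query parameter names are copied from the URL_CALLED example.

```python
from urllib.parse import quote, urlencode

# Endpoint copied from the URL_CALLED example on the slide.
BASE_URL = "https://addresscheck.melissadata.net/v2/REST/Service.svc/doAddressCheck"


def build_url(customer_id, street, suite, city, state, zip_code):
    """Build the REST URL in the same parameter order as the slide's example."""
    params = [
        ("id", customer_id), ("a1", street), ("ctry", "US"), ("opt", "true"),
        ("a2", ""), ("ste", suite), ("city", city), ("state", state),
        ("zip", zip_code),
    ]
    # quote_via=quote encodes spaces as %20, as in the logged URL.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


class ScrubCache:
    """Hypothetical in-memory stand-in for the ZWS_* control tables."""

    def __init__(self, fetch, max_hits_per_day):
        self.fetch = fetch                        # callable: url -> response text
        self.max_hits_per_day = max_hits_per_day  # role of ZWS_HITS_MAX
        self.by_url = {}                          # role of ZWS_MELDATA (keyed by URL)
        self.hits_by_day = {}                     # role of ZWS_HITS_DAILY

    def lookup(self, url, today):
        # Each unique URL is one unique address check, so the URL is the key.
        if url in self.by_url:
            return self.by_url[url]
        used = self.hits_by_day.get(today, 0)
        if used >= self.max_hits_per_day:
            # Assumed behavior: refuse rather than exceed the configured limit.
            raise RuntimeError("daily hit limit reached")
        self.hits_by_day[today] = used + 1
        result = self.fetch(url)
        self.by_url[url] = result
        return result


if __name__ == "__main__":
    def fake_fetch(url):
        # Placeholder for the real HTTP call; returns a dummy payload.
        return "<response/>"

    cache = ScrubCache(fake_fetch, max_hits_per_day=100)
    url = build_url("1234567890", "9700 GOETHE AVE", "C", "SACRAMENTO", "CA", "95826")
    cache.lookup(url, "2010-06-14")   # first call goes to the service
    cache.lookup(url, "2010-06-14")   # second call is answered from the cache
    print(cache.hits_by_day)
```

Keying on the full URL keeps the lookup to a single indexed field, which matches the reason given for the index on URL_CALLED.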
Continued on next slide…

Field Definitions (cont'd)
Field               Value            Description
OUT PARSED HOUSE    9700             Parsed address fields returned from MelData
OUT PARSED STR      Goethe           (same)
OUT PARSED STSUF    Rd               (same)
OUT PARSED SUIT1    Ste              (same)
OUT PARSED SUIT2    C                (same)
OUT ADDRESS1        9700 Goethe Rd   (same)
OUT ADDRESS2                         (same)
OUT SUITE           Ste C            (same)
OUT CITY            Sacramento       (same)
OUT STATE           CA               (same)
OUT ZIP             95827            (same)
OUT PLUS4           3558             (same)
OUT RESULT CODES    AS01,AS12        MelData return code(s)
HOUSESTREETMATCH                     Did INPUT house # and street match the OUTPUT house # and street?
CITY MATCH          X                Did INPUT city match the OUTPUT city?
STATE MATCH         X                Did INPUT state match the OUTPUT state?
ZIP MATCH                            Did INPUT zip match the OUTPUT zip?
ADRC NUM                             ADRC record # scrubbed (if applicable)
OUT ADDRESSKEY      95827355875      Meldata address key returned (needed to GeoCode)
GEO RESULT CODES    GS01             GeoCode return code(s)
GEO LAT             38.554697        GeoCode lat result
GEO LONG            -121.335972      GeoCode long result

Find Billing Account by Address Lookup via GeoCoding
Step 1 – Meldata address check; cache the result in table ZWS_MELDATA
Step 2 – Update the SAP address if necessary and possible (if the address scrub was successful)
Step 3 – Using the MelData address key (from a successful scrub only), get lat/long
Step 4 – Using the haversine formula, get the nearest parcel (ZPSD_APN) and VKONT (ZGIS_APN_DATA), using FM "Z_RFC_LAT_LONG_IN_PARCEL_OUT"
[Screenshots: IN / OUT]
Since Geo-Coding is not always down to the rooftop level, enter as much info of the address as possible (in this case House # and 1st letter of Street).

Secret Sauce: HAVERSINE Formula
http://en.wikipedia.org/wiki/Haversine_formula
SAP ABAP program code: [screenshot]
Assume that U = North Pole, and V and W are our pair of X,Y coordinates (note the variable names above do not match those of the program code in this presentation).
a, b, and c are the "Great Circle"* distances between U, V and W.
* A great circle of a sphere is a circle that runs along the surface of that sphere so as to cut it into two equal halves.
The great circle therefore has both the same circumference and the same center as the sphere. It is the largest circle that can be drawn on a given sphere. Great circles serve as the analogue of "straight lines" in spherical geometry.

Other FOCUS Applications:
This stand-alone screen demonstrates the architecture.
Link to YouTube demo of this screen:
http://www.youtube.com/watch?v=7txYpWmKWCg
Link to SAP Developer Network (SDN) SAP/Meldata wiki:
http://wiki.sdn.sap.com/wiki/display/stage/Address%20Cleansing%20via%20REST%20Web%20Service%20Consumption

Other FOCUS Applications:
E.g. Lifeline Entry (for non-FOCUS accounts)
[Screenshot labels: New button; New sub-screen for scrubbed address results]

Future Integration
• Managed update, with clerical review of suggested address fixes.
• Fully automated update of our address from Meldata.
  • Business Partner mailing address update
  • Physical location addresses
  • To be continued…
A myriad of other opportunities will present themselves:
  Finding duplicate customers, locations.
  Finding missing locations: address exists, but we aren't billing.
  Geocoding & Work Order proximity-based routing.
• Regional MADD (Master Address Database)
  Effort by cities and public safety (police, fire, etc.) to standardize addresses.

Lessons Learned:
• Must have an architecture to facilitate & manage the web service "hits"
• Must involve your BASIS team to "open up" the interface
• Mass cleansing (thousands at a time) needs "404" error handling
• The Melissadata tech team were great to collaborate with, which was an essential aspect.
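Appendix-style illustration of Step 4 of the billing-account lookup (nearest parcel via the haversine formula). This is a minimal Python sketch, not the ABAP code shown in the presentation or the logic of FM Z_RFC_LAT_LONG_IN_PARCEL_OUT. The function names, the Earth-radius constant, and the two sample parcels (identifiers and coordinates) are made up for illustration; only the query point is the geocode result from the ZWS_MELDATA example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def nearest_parcel(lat, lon, parcels):
    """Return the parcel dict whose (lat, lon) is closest to the given point."""
    return min(parcels, key=lambda p: haversine_km(lat, lon, p["lat"], p["lon"]))


if __name__ == "__main__":
    # Hypothetical parcel rows; real data would come from the parcel tables.
    parcels = [
        {"apn": "PARCEL-A", "lat": 38.5547, "lon": -121.3360},
        {"apn": "PARCEL-B", "lat": 38.5600, "lon": -121.3400},
    ]
    # Geocode result from the example table entry.
    print(nearest_parcel(38.554697, -121.335972, parcels)["apn"])
```

A linear scan like this is fine for a sketch; with a full parcel table, a bounding-box filter on latitude and longitude before computing distances would likely be needed.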