DATA CLEANSING WITH SSIS
BRIAN KNIGHT
PRAGMATIC WORKS
BKNIGHT@PRAGMATICWORKS.COM

ABOUT BRIAN
• SQL Server MVP
• Founder of Pragmatic Works
• Author of 13 books
• Blogs at BIDN.com
Twitter: @BrianKnight
Wasn’t very good with girls
Even Kermit the Frog founded a company
All 13 still awaiting a publisher.

BI Savvy or BI Curious?

TAKEAWAYS
• Profiling the data
• Cleansing and validating with scripts
• Fuzzy techniques

DATA PROFILING
• Retrieves metadata about your data
• Helps identify data issues
• Uses an SSIS Data Profiling Task

Data Profiling Task

SCRIPTING IN SSIS
• Tasks
• Data Flow
  • Source
  • Transform
  • Destination

CLEANSING USES IN SCRIPTING
• Advanced data cleansing
• Complex data validation
• Unusual data sources

Script Transform Demo

FUZZY TECHNIQUES
• De-duplicating data
• Fuzzy catches misspelt words or variants
• Fuzzy Grouping and Fuzzy Lookup

FUZZY GROUPING DEMO

THE END ALREADY?
Questions
http://www.bidn.com/people/brianknight
@BrianKnight
bknight@pragmaticworks.com
http://www.youtube.com/pragmaticworks
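
To make the idea behind the FUZZY TECHNIQUES slides concrete outside of SSIS, here is a minimal Python sketch of fuzzy de-duplication. It is not how the SSIS Fuzzy Grouping transform works internally; it swaps in the standard library's difflib.SequenceMatcher as a stand-in similarity measure. The function names (similarity, fuzzy_group), the 0.8 threshold, and the sample values are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def fuzzy_group(values, threshold=0.8):
    """Greedily group near-duplicate strings.

    Each value joins the first group whose canonical (first-seen) value is
    at least `threshold` similar; otherwise it starts a new group.
    The threshold is a hypothetical default, not an SSIS setting.
    """
    groups = []  # list of (canonical_value, members)
    for value in values:
        for canonical, members in groups:
            if similarity(value, canonical) >= threshold:
                members.append(value)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append((value, [value]))
    return groups


if __name__ == "__main__":
    # Hypothetical sample data with misspellings and variants.
    names = ["Pragmatic Works", "Pragmatic Work", "pragmatic works ",
             "Contoso", "Contosso"]
    for canonical, members in fuzzy_group(names):
        print(canonical, "<-", members)
```

Raising the threshold makes grouping stricter (fewer false matches, more missed variants); lowering it does the opposite. The greedy first-match strategy is chosen for brevity and depends on input order.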