The Return of the DB2 Top Ten Lists
Craig S. Mullins
craig.mullins@softwareonz.com
craig@CraigSMullins.com
©2010 SoftwareOnZ

The Top Ten Lists
And now, from the home office in Sugar Land, Texas… a series of DB2 Top Ten lists about various topics ranging across the following subjects:
• Performance
• Coding
• Design
• Administration
• Management
• Features
• Tools

The Ten Database Management Commandments
1. Thou shalt always design databases from a logical data model.
2. Thou shalt always document thy database design.
3. (There are many important aspects of database administration, but of them all,) data integrity shalt always remain the most important.
4. Thou shalt encrypt personal and sensitive information, both at rest and in transit.
5. Thou shalt implement appropriate security within thy database and between thy DBMS and thy operating systems.
6. Thou shalt always maintain the recoverability of thy databases with sufficient backups to meet the availability requirements of the business supported by the data.
7. Thou shalt implement a consistent and documented workflow process for implementing database change that assures data integrity while minimizing downtime.
8. Thou shalt always consider transaction performance and data availability in thy database design.
9. Thou shalt also work with application developers to ensure efficient code is written to access thy databases.
10. Thou shalt not download thy database to thy laptop.

Top Ten DB2 Database Design Guidelines
1. Always start with a logical data model
2. Follow relational database design rules - normalization
3. Create and follow consistent naming standards for all DB2 objects (including columns)
4. Favor declarative referential integrity (RI)
5. Don’t forget the free space
6. Always fully document the database design including all implementation decisions
7. Plan for data purging and archiving
8. Design for sharing data… instead of copying it all over the place!
9. Plan for performance (or availability) at design time
10. Plan for backup and recovery before implementation

Top Ten Benefits of Using DB2 CDC for DW/ETL
1. Why move all of your data every night when just a fraction of the data actually changed?
2. Near-real-time data delivery vs. daily extracts - downstream users have access to the most current information
3. Significant reduction in CPU cycles by eliminating costly ETL processes
4. Identify certain business events as soon as they occur (e.g. claim submitted over $1M)
5. High availability with active/active replication between data centers
6. Optimize existing ETL processes by supplying only the data that has changed
7. Real-time integration with newer applications (e.g. send a text message if account balance goes negative)
8. No need to modify existing applications to track data changes or publish specific business events
9. Extend the life of your legacy DB2 applications by integrating data changes with newer applications
10. Allows for auditing by tracking all data changes to important tables

Top Ten Development Best Practices
1. Minimize network calls
2. Minimize passes through the data
3. Put the work into the SQL, not the program
4. Unlearn the “flat file” mentality
5. Be sure data type and length match in predicates
6. Know your Stage 1, Stage 2, and Indexable predicates
7. Document your code
8. Always check the SQLCODE or SQLSTATE
9. Analyze your access paths (and tune your SQL in test)
10. Avoid Bachelor Programming Syndrome

Top Ten Common SQL Mistakes
1. Assuming an ORDER BY is not necessary for ordered results
2. Forgetting the NULL indicator
3. Incorrect expectations when using the NOT IN predicate with NULLs*
4. Coding predicates appropriately in Outer Joins
5. Not coding a cursor for a multi-row result
6. Recompiling but not rebinding
7. Forgetting to use single quotes around strings (instead of double quotes)
8. Trying to modify a Primary Key column
9. Forcing dynamic SQL into static SQL (sometimes hundreds of static SQL statements)
10. Asking for more data than you need (columns and/or rows). Sometimes (erroneously) referred to as the SELECT * problem

* SELECT C.color
    FROM Colors AS C
   WHERE C.color NOT IN (SELECT P.color FROM Products AS P);

Top Ten Compliance Concerns
1. Data Quality
   • “Poor data quality costs the typical company at least ten percent (10%) of revenue; twenty percent (20%) is probably a better estimate.”
2. Data Retention
3. Database Security
   • Authentication (Who is it?)
   • Authorization (Who can do it?)
   • Encryption (Who can see it?)
   • Audit (Who did it?)
4. Data Masking and Obfuscation
5. Database and Data Access Auditing
6. DBA Procedures (e.g. change management)
   • “Unauthorized change is one of the best (and worst) ways to get your auditor’s attention.”
7. Data Movement Tracking
8. Master Data Management
9. Data Definition and Categorization
10. Metadata Management

Top Ten Biggest Data Breaches
http://www.privacyrights.org/ar/ChronDataBreaches.htm
1. Jan. 20, 2009: Heartland Payment Systems, ~130 million
2. Oct. 2, 2009: U.S. Military Veterans, 76 million
3. Jan. 17, 2007: TJ stores (TJX), 45.7 million
4. June 16, 2005: CardSystems, over 40 million
5. Dec. 15, 2009: RockYou, 32 million (SQL injection)
6. May 22, 2006: U.S. Dept. of Veterans Affairs, 28.6 million
7. Mar. 8, 2006: iBill, 17,781,462
8. Mar. 26, 2008: Bank of New York Mellon, 12.5 million
9. July 3, 2007: Fidelity National Information Services (Certegy Check Services Inc.), 8.5 million
10. Sept. 14, 2007: TD Ameritrade Holding Corp., 6.3 million
As of Feb 17, 2010, total records breached: 345,724,373 (since January 2005)

Top Ten Non-Technical Security Steps
1. Buy and use a good shredder
2. Don’t wear your ID badge outside of the office
3. Be vigilant when using your laptop in public
4. Buy and use a screen “shade”
5. Don’t put identification on your laptop (e.g. ID tags)
6. Company should invest in an anonymous PO box
7. Be careful about company logo clothing
8. Don’t leave unencrypted disks/USB sticks/etc. lying around
9. Invest in a laptop lock and use it whenever possible
10. Never put any disk you do not know into your computer (salaries)

Top Ten Under-Utilized Features
1. Table Expressions
2. CASE statements
3. REOPT (other than NONE)
4. Triggers
5. Auto expand buffer pools
6. Real Time Stats
7. User-Defined Functions
8. DISTINCT Types
9. LOBs
10. Date/Time Arithmetic
XML?

Top Ten Specialty Processor Workloads
1. IFL: Linux “stuff”
2. zIIP: distributed SQL requests
3. zIIP: parallel SQL requests
4. zIIP: data warehousing/star schema
5. zIIP: native SQL stored procedures run via DDF (V9)
6. zIIP: index maintenance during LOAD, REORG, and REBUILD
7. zAAP: Java
8. zAAP: XML
9. (zAAP on) zIIP: Java and XML
10. zIIP: non-DB2 stuff including:
   • z/OS Communications Server encryption, z/OS XML System Services, and System Data Mover processing associated with zGM/XRC (z/OS Global Mirror)

Top Ten A-Ha! Moments
1. Changing the SQL terminator for triggers
2. Truly understanding locking
3. When you can predict access paths by looking at SQL (and statistics)
4. Being a human compiler (or JCL checker)
5. When your first reaction is to look it up in the manual instead of asking the DBA
6. When you take responsibility for your professional development
7. When you start thinking about the business problems your programs and database solve before thinking about a “neat” technical issue
8. When you stop thinking of fetching from a cursor like reading from a file
9. When you stop blaming DB2 before your own code
10. When you look at this list and say “that all makes sense”

Top Ten DB2 Myths
1. Use Views to Insulate Programs from Change
2. Locking Problems Indicate a Database Problem
3. Primary Key is Usually a Good Choice for Clustering
4. Just Using the Defaults Should Work Out Well
5. Programmers Don’t Need to Know How to Tune SQL
6. Black Boxes Work Well for Performance
7. Using NULLs Can Save Space
8. RUNSTATS Aren’t That Important
9. DB2 is a Hog
10. It Depends!

Top Ten Outdated Standards
1. Limiting indexes per table to 3… or 5… or…
2. Requiring base table views
3. Forbidding dynamic SQL
4. Limiting number of tables per join (typically for online txns)
5. Avoiding NULLs
6. Arcane table naming standards (e.g. TXR0031)
7. Just about any buffer pool standard (e.g. BP0 only)
8. Almost any standard using the words “always” or “never”
9. GRANT…WITH GRANT OPTION
10. Putting standards in a binder instead of online

Top Ten Things to Do Before You Visit the DBA
1. Try to Figure It Out Yourself, Please
2. RTFM
3. Figure out what you are going to say to him/her
4. Be sure it is the truth!
5. Don’t assume you (or your code) are innocent
6. Have a drink (coffee?)
7. Bring the DBA a drink (your choice!)
8. Never say “But IBM said it should work this way.”
9. Never say “But it worked that way yesterday.”
10. Always say “Thank you.”

Ten Eleven Rules of the Road for DBAs
1. Write Down Everything
2. Keep Everything
3. Automate
4. Share Your Knowledge
5. Focus Your Efforts
6. Don’t Panic!
7. Measure Twice, Cut Once
8. Understand the Business
9. Don’t Be a Hermit
10. Use All of the Resources at Your Disposal
11. Keep Up-to-Date
From zJournal article available at: http://www.craigsmullins.com/zjdp_042.htm

Top Ten Weasel-Speak Interpretations (Part 1)
1. We will look into it (We will forget all about it the moment you leave)
2. It is in process (The bureaucracy involved has rendered it hopeless)
3. I didn’t get your e-mail (I was too busy updating my Facebook status to read your e-mail)
4. The entire project is being abandoned [or reorganized] (The only guy who understood it just left or retired)
5. A number of different approaches are being tried (We are all just guessing at this point)

Top Ten Weasel-Speak Interpretations (Part 2)
6. Preliminary tests were inconclusive (We can’t get the dang thing to run)
7. Test results are very promising (Amazing, it actually works)
8. What do you think about this… (I only want your opinion so you can share the blame later)
9. Robust (We want you to buy this thing but have no earthly idea how to convince you, so we call it robust)*
10. Low maintenance (Almost impossible to fix)
* See also, “Best of Breed”

Top Ten, err, Twelve DB2-Related Redbooks
http://www.redbooks.ibm.com/
• SG24-7330 DB2 9 for z/OS Technical Overview
• SG24-6763 The Business Value of DB2
• SG24-6300 DB2 for z/OS Application Programming Topics
• SG24-6289 DB2 9 for z/OS: Using the Utilities Suite
• SG24-7322 DB2 for z/OS: Data Sharing in a Nutshell
• SG24-7720 Securing and Auditing Data on DB2 for z/OS
• SG24-7688 DB2 9 for z/OS: Packages Revisited
• SG24-7473 DB2 9 for z/OS Performance Topics
• SG24-7134 DB2 UDB for z/OS: Design Guidelines for High Performance and Availability
• SG24-7604 DB2 9 for z/OS Stored Procedures: Through the CALL and Beyond
• SG24-7663 DB2 9 for z/OS: Deploying SOA Solutions
• SG24-6319 DB2 for z/OS and WebSphere: The Perfect Couple

Top Ten DB2 Twitter-ers

Top Ten LUW vs. z/OS Differences
(In each comparison, z/OS is listed first and LUW second.)
1. Memory: EDM Pool, RID Pool, Sort Pool, Buffer Pools versus Catalog Cache, Package Cache, Sort Heap, Buffer Pool, Locklist
2. Table Spaces: Simple (obsolete), Segmented, Partitioned, Universal (V9), LOB versus Regular, Temporary, Large, Automatic (V9.5)
3. Optimizer: 7 levels of optimization in LUW (0,1,2,3,5,7,9)
4. Monitoring: Traces and Instrumentation Facility versus Event Monitor, Snapshot Monitor
5. XML: XPath versus XPath, XQuery
6. Index Compression: z/OS only
7. Oracle syntax support: LUW only
8. Multi-row INSERT, FETCH & multi-row cursor UPDATE: z/OS only
9. SET CURRENT ISOLATION: LUW only
10. Bottom Line: amazingly similar but different enough to make transitioning from one to the other non-trivial

Top Ten DB2 9 for z/OS Application Developer Features
1. Plan Management: Package Stability
2. MERGE
3. SELECT FROM UPDATE, DELETE, & MERGE
4. Index on Expressions
5. Native SQL Procedure Language
6. FETCH FIRST and ORDER BY in subselect and fullselect
7. INTERSECT and EXCEPT
8. INSTEAD OF TRIGGER
9. TRUNCATE
10. LOB Improvements

Top Ten DB2 9 for z/OS DBA Features
1. Universal Table Spaces - PBG and range-partitioned
2. Reordered Row Format
3. Index Compression
4. CLONE Tables
5. BUILD2 Phase Eliminated in Online REORG
6. APPEND YES - ignoring clustering during INSERT and LOAD
7. The promise of REOPT(AUTO)
8. IMPLICITLY HIDDEN columns
9. New Data Types: BIGINT, DECFLOAT, BINARY/VARBINARY
10. Database Definition on Demand - renaming a table’s column; renaming an index

Top Ten DB2 10 Features
1. “…best reduction in CPU for transactions, queries, and batch since V2.1… most customers to reduce CPU times between 5% and 10% as soon as DB2 10 is out of the box.”
2. Improved Security (e.g. data masking, smaller granularity for admin privileges)
3. Index Include Columns
4. Temporal (Versioned) Data
5. Hashing
6. Buffer Pool Improvements (e.g. use of the System z10 1 megabyte page size, buffer pools in memory)
7. 80% to 90% of Virtual Storage Moved “Above the Bar”
8. No More Links in the DB2 Catalog
9. Efficient Caching of Dynamic SQL Statements That Use Literals
10. LOB Improvements (e.g. inline LOBs, improved large object streaming)

Top Ten Trends
1. Industry Consolidation
2. Cloud Computing
3. SaaS
4. Virtualization
5. Server Consolidation
6. Open Source
7. Social Media
8. Complexity
9. Autonomic, Self-Managing Databases
10. Commoditization

Top Ten Best Practices for DB2 Professional Development
1. Join your local DB2 user group (and participate!)
2. Join IDUG and attend the annual conference in your geography
3. Lobby for relevant annual training
4. Purchase and read DB2 and database books
5. Subscribe to relevant magazines and read them regularly (e.g. IBM Database Magazine, Database Trends & Applications, zJournal)
6. Subscribe to RSS feeds for DB2-related blogs (mine [DB2 Portal and Data Technology Today], Willie Favero, Troy Coleman, Dave Beulke, Robert Catterall, etc.)
7. Subscribe to the DB2 mailing list (daily DIGEST perhaps)
8. Visit IBM DeveloperWorks frequently (http://www.ibm.com/developerworks/data/)
9. Download the DB2 manuals in PDF form
10. Be business savvy, not just tech savvy! Learn about your business.

Top Ten DB2 Related Blogs
1. Start with: Planet DB2 (DB2 blog aggregator) - http://www.planetdb2.com
2. DB2 Portal Blog - http://www.db2portal.com/blog.html
3. Getting the Most Out of DB2 for z/OS (Willie Favero) - http://it.toolbox.com/blogs/db2zos
4. DB2 News & Tips - http://db2news.blogspot.com/
5. DB2 USA - http://db2usa.blogspot.com/
6. Dave Beulke’s Blog - http://davebeulke.com/
7. DB2utor (Troy Coleman) - http://ibmsystemsmag.blogs.com/db2utor/
8. Thoughts on DB2 (Triton) - http://www.triton.co.uk/blog/
9. SAP on DB2 for z/OS (Omer Brandeis) - http://it.toolbox.com/blogs/sap-on-db2
10. Robert Catterall - http://catterallconsulting.blogspot.com/

Top Ten Books for DB2 Professionals
1. DB2 Developer’s Guide
2. DB2 Developer’s Guide
3. DB2 Developer’s Guide
4. DB2 Developer’s Guide
5. DB2 Developer’s Guide
6. DB2 Developer’s Guide
7. DB2 Developer’s Guide
8. DB2 Developer’s Guide
9. DB2 Developer’s Guide
10. DB2 Developer’s Guide

The Return of the DB2 Top Ten Lists
Craig S. Mullins
craig@craigsmullins.com
http://www.CraigSMullins.com
craig.mullins@softwareonz.com
http://www.softwareonz.com
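
The NOT IN footnote on the Top Ten Common SQL Mistakes slide (mistake 3) can be reproduced with a small sketch. The query is the one from the slide; everything else is an assumption made for illustration: the table contents, the `name` column on Products, the helper names `build_demo` and `colors_not_in_products`, and the use of an in-memory SQLite database as a stand-in for DB2. The NULL behavior shown is standard SQL three-valued logic, so DB2 should behave the same way, but this has only been reasoned through for SQLite here. A NOT EXISTS rewrite is included as one common alternative, not as the deck's recommendation.

```python
import sqlite3


def build_demo():
    """Create hypothetical Colors/Products tables with one NULL product color."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Colors (color TEXT)")
    conn.execute("CREATE TABLE Products (name TEXT, color TEXT)")
    conn.executemany(
        "INSERT INTO Colors VALUES (?)",
        [("red",), ("green",), ("blue",)],
    )
    conn.executemany(
        "INSERT INTO Products VALUES (?, ?)",
        [("apple", "red"), ("mystery", None)],  # None is stored as NULL
    )
    return conn


def colors_not_in_products(conn):
    """Return (NOT IN result, NOT EXISTS result) as two lists of colors."""
    # The slide's query: any NULL in the subquery makes every non-matching
    # comparison evaluate to unknown, so no row passes the WHERE clause.
    not_in = conn.execute(
        "SELECT C.color FROM Colors AS C "
        "WHERE C.color NOT IN (SELECT P.color FROM Products AS P) "
        "ORDER BY C.color"
    ).fetchall()
    # Assumed alternative: NOT EXISTS only asks whether a matching row
    # exists, so the NULL product color never matches and never blocks.
    not_exists = conn.execute(
        "SELECT C.color FROM Colors AS C "
        "WHERE NOT EXISTS "
        "(SELECT 1 FROM Products AS P WHERE P.color = C.color) "
        "ORDER BY C.color"
    ).fetchall()
    return [row[0] for row in not_in], [row[0] for row in not_exists]


if __name__ == "__main__":
    not_in, not_exists = colors_not_in_products(build_demo())
    print("NOT IN:", not_in)
    print("NOT EXISTS:", not_exists)
```

With this sample data the NOT IN form returns no rows at all, while the NOT EXISTS form returns the colors that no product uses. A reader expecting NOT IN to list the unused colors gets an empty result as soon as a single NULL appears in Products.color.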