Writing Readable, Maintainable Code
Steven Feuerstein
PL/SQL Evangelist
www.quest.com
steven.feuerstein@quest.com
Copyright © 2006 Quest Software

How to Benefit Most From This Session
• Watch, listen, ask questions.
• Download the training materials and supporting scripts:
  – http://oracleplsqlprogramming.com/resources.html
  – "Demo zip": all the scripts I run in my class, available at http://oracleplsqlprogramming.com/downloads/demo.zip
    filename_from_demo_zip.sql
• Use these materials as an accelerator as you venture into new territory and need to apply new techniques.
• Play games! Keep your brain fresh and active by mixing hard work with challenging games.
  – I strongly recommend Set (www.setgame.com)

Write readable, maintainable code
• PL/SQL allows you to write very readable, self-documenting and easily maintained code.
  – This should be a primary objective for any program.
• Here’s what we’ll cover....
  – Fundamental readability features of PL/SQL
  – Documentation guidelines
  – Techniques for avoiding hard-coding in your programs
  – Modular construction with packages and reusable components
  – Key tip: keep executable sections tiny with local modules
  – How to decide where to put the code
  – Instrumentation of your code
  – Build and use regression tests
  – The value of deterministic logic

Readability features you should use
• END labels
  – For program units, loops, nested blocks
• SUBTYPEs
  – Create application-specific datatypes!
• Named notation
  – Sometimes the extra typing is worth it!
Scripts: end_labels.sql, plsql_limits.pks, fullname.*, namednot.sql

Coding Conventions
• It doesn’t really matter (to me, anyway!) what standards you decide to follow.
• The important thing is to be consistent.
• Everyone’s code should, as much as possible, look and read the same.
  – Don’t bother defining formatting standards.
  – Use the “pretty printers” that come with any PL/SQL IDE worth using.
• Here is a fairly comprehensive (and free) set of coding conventions for you to use: http://examples.oreilly.com/orbestprac/

Documentation guidelines
• We all hate writing comments, but there are some key places in your code where comments are very important.
  – Program header with modification history
  – Explanation of complex algorithms
  – Identifying workarounds and the fixes for them
• Once again, standardize and automate as much as possible.
  – Code templates

    /*
    WORKAROUND START
    Bug description:
    Bug reference:
    Workaround description:
    Post-bug fix implementation:
    WORKAROUND END
    */

Scripts: fileio92.sql

Avoid hard-coding in your programs!
• Here are examples of hard-coding in PL/SQL:
  – Literal values: IF num_requests > 100 THEN
  – Date format in a call to TO_DATE: 'MM/DD/YY'
  – Language specificities: '09-SEP-2001'
  – Constrained declarations: NUMBER(10,4)
  – Variables declared using base datatypes, such as my_name VARCHAR2(20)
  – Every SQL statement you write
  – COMMIT and ROLLBACK statements
  – Comments that "explain" code
  – Fetching into a list of individual variables
  – Embedded (un-encapsulated) business rules
  – Reusing the same name for different purposes

Some tips to help you avoid hard-coding
• Create constants and variables to hide literal values and formulas.
  – Either local or defined in packages
• Use SUBTYPEs to avoid repetitive hard-coded declarations.
  – Best defined in package specifications
• Encapsulate SQL behind an API.
  – The Quest CodeGen Utility
• Hide COMMIT and ROLLBACK behind an API.
  – PL/Vision’s PLVcmt commit package offers one example.
• Always fetch into a record based on the cursor %ROWTYPE.
  – Even if the query in the cursor changes, the record structure will adapt automatically to the new fields.
• Declare new variables, exceptions, etc. for each new purpose.
Scripts: recycle_names.sql

Modular construction in PL/SQL
• Packages: some quick reminders...
  – Key building block for applications
  – Overloading
  – Package-level data and caching
  – Initialization section
• Local or nested modules
  – Avoid spaghetti code!
  – Keep your executable sections small/tiny.

Packages: key PL/SQL building block
• Employ object-oriented design principles
  – Build at higher levels of abstraction
  – Enforce information hiding: control what people see and do
  – Call packaged code from object types and triggers
• Encourages top-down design and bottom-up construction
  – TD: Design the interfaces required by the different components of your application without addressing implementation details
  – BU: Existing packages contain building blocks for new code
• Organize your stored code more effectively
• Implements session-persistent data

Overloading in packages: key usability technique
• Overloading (static polymorphism): two or more programs with the same name, but different signatures.
  – You can overload in the declaration section of any PL/SQL block, including the package body (most common).
• Overloading is a critical feature when building comprehensive programmatic interfaces (APIs) or components using packages.
  – If you want others to use your code, you need to make that code as smart and as easy to use as possible.
  – Overloading transfers the "need to know" from the user to the overloaded program.
[Diagram labels: myproc, myfunc, myproc]
Compare: DBMS_OUTPUT and p

How Overloading Works
• For two or more modules to be overloaded, the compiler must be able to distinguish between the two calls at compile-time.
  – Another name for overloading is "static polymorphism."
• There are two different "compile times":
  – 1. When you compile the package or block containing the overloaded code.
  – 2. When you compile programs that use the overloaded code.

How Overloading Works, continued
• Distinguishing characteristics:
  – The formal parameters of overloaded modules must differ in number, order or datatype family (CHAR vs. VARCHAR2 is not different enough).
  – The programs are of different types: procedure and function.
• Non-distinguishing characteristics:
  – Functions differ only in their RETURN datatype.
  – Arguments differ only in their mode (IN, OUT, IN OUT).
  – Their formal parameters differ only in datatype and the datatypes are in the same family.

Quiz! Nuances of Overloading
• Is this a valid overloading? Will it compile? How can I use it?

    CREATE OR REPLACE PACKAGE sales
    IS
       PROCEDURE calc_total (zone_in IN VARCHAR2);
       PROCEDURE calc_total (reg_in IN VARCHAR2);
    END sales;
    /

    BEGIN
       sales.calc_total ('NORTHWEST');   -- ?
       sales.calc_total ('ZONE2');
    END;
    /

Scripts: ambig_overloading.sql

Package Data: Useful and Sticky
• The scope of a package is your session, and any data defined at the "package level" also has session scope.
  – If defined in the package specification, any program can directly read/write the data.
  – Ideal for program-specific caching.
• General best practice: hide your package data in the body so that you can control access to it.
• Use the SERIALLY_REUSABLE pragma to move data to SGA and have memory released after each usage.
Scripts: thisuser.pkg, thisuser.tst, serial.sql

Package Initialization Structure
• The initialization section:
  – Is defined after and outside of any programs in the package.
  – Is not required.
  – Can have its own exception handling section.
• Useful for:
  – Performing complex setting of default or initial values.
  – Setting up package data which does not change for the duration of a session.
  – Confirming that the package is properly instantiated.

    PACKAGE BODY pkg
    IS
       PROCEDURE proc IS
       BEGIN
          ...
       END;

       FUNCTION func RETURN ... IS
       BEGIN
          ...
       END;
    BEGIN   -- this BEGIN comes after/outside of any program defined in the pkg
       ...initialize...
    END pkg;

Scripts: init.pkg, init.tst, datemgr.pkg

Packages vs. Stand-alone programs
• General recommendation: use packages instead of stand-alone programs.
  – Better way to organize code.
  – Can hide implementation and reduce need to recompile programs using the package.
• Other considerations....
  – Entire package loaded when any single program is called.
  – Central packages can become a "bottleneck" when changes are needed.
Scripts: recompile.sql

Top Tip: Write tiny chunks of code!
Your executable section should have no more than fifty lines of code in it. ?!?!
• It is virtually impossible to understand, and therefore to debug or maintain, code that has long, meandering executable sections.
• How do you follow this guideline?
  – Don't skimp on the packages.
  – Top-down design / step-wise refinement
  – Use lots of local or nested modules.

Let’s write some code!
• My team is building a call support application. Customers call with problems, and we put their call in a queue if it cannot be handled immediately.
  – I must now write a program that distributes unhandled calls out to members of the call support team.
• The basic algorithm from the doc summary is:
  – While there are still unhandled calls in the queue, assign them to employees who are under-utilized (have fewer calls assigned to them than the average for their department).
• Fifty pages of doc, complicated program!

First: Translate the summary into code.

    PROCEDURE distribute_calls (
       department_id_in IN departments.department_id%TYPE)
    IS
    BEGIN
       WHILE ( calls_are_unhandled ( ) )
       LOOP
          FOR emp_rec IN emps_in_dept_cur (department_id_in)
          LOOP
             IF current_caseload (emp_rec.employee_id) <
                   avg_caseload_for_dept (department_id_in)
             THEN
                assign_next_open_call (emp_rec.employee_id);
             END IF;
          END LOOP;
       END LOOP;
    END distribute_calls;

• A more or less direct translation, and no comments!

Explanation of Subprograms
• Function calls_are_unhandled: takes no arguments, returns TRUE if there is still at least one unhandled call, FALSE otherwise.
• Function current_caseload: returns the number of calls (case load) assigned to that employee.
• Function avg_caseload_for_dept: returns the average number of calls assigned to employees in that department.
• Procedure assign_next_open_call: assigns the employee to the call, making it handled, as opposed to unhandled.

Next: Implement stubs for subprograms

    PROCEDURE call_manager.distribute_calls (
       department_id_in IN departments.department_id%TYPE)
    IS
       FUNCTION calls_are_unhandled RETURN BOOLEAN
       IS BEGIN ... END calls_are_unhandled;

       FUNCTION current_caseload (
          employee_id_in IN employees.employee_id%TYPE)
          RETURN PLS_INTEGER
       IS BEGIN ... END current_caseload;

       FUNCTION avg_caseload_for_dept (
          department_id_in IN departments.department_id%TYPE)
          RETURN PLS_INTEGER
       IS BEGIN ... END avg_caseload_for_dept;

       PROCEDURE assign_next_open_call (
          employee_id_in IN employees.employee_id%TYPE)
       IS BEGIN ... END assign_next_open_call;
    BEGIN

• These are all defined locally in the procedure.

Next: Think about implementation of just this level.
• Think about what the programs need to do.
• Think about whether you or someone else has already done it. Don’t reinvent the wheel!
Hey! Just last week I wrote another function that is very similar to current_caseload. It is now "buried" inside a procedure named show_caseload. I can’t call it from distribute_calls, though. It is local, private, hidden. Should I cut and paste? No! I should extract the program and expand its scope.

Next: Isolate and refactor common code.

    CREATE OR REPLACE PACKAGE BODY call_manager
    IS
       FUNCTION current_caseload (
          employee_id_in IN employees.employee_id%TYPE
        , use_in_show_in IN BOOLEAN DEFAULT TRUE)
          RETURN PLS_INTEGER
       IS BEGIN ... END current_caseload;

       PROCEDURE show_caseload (
          department_id_in IN departments.department_id%TYPE)
       IS BEGIN ... END show_caseload;

       PROCEDURE distribute_calls (
          department_id_in IN departments.department_id%TYPE)
       IS BEGIN ... END distribute_calls;
    END;

Callout: Increased complexity, backward compatibility.
[Diagram labels: distribute_calls, show_caseload, current_caseload]
• Now current_caseload is at the package level and can be called by any program in the package.

Next: Reuse existing code whenever possible.
• Just last week, Sally emailed all of us with news of her call_util package.
  – Returns average workload of employee and much more.
  – Just what I need! I don’t have to build it myself, just call it.

    BEGIN
       WHILE ( calls_are_unhandled ( ) )
       LOOP
          FOR emp_rec IN emps_in_dept_cur (department_id_in)
          LOOP
             IF current_caseload (emp_rec.employee_id) <
                   call_util.dept_avg_caseload (department_id_in)
             THEN
                assign_next_open_call (emp_rec.employee_id);
             END IF;
          END LOOP;
       END LOOP;
    END distribute_calls;

Callout: This program has the widest scope possible: it can be executed by any schema with execute authority on the call_util package, and by any program within the owning schema.

Next: Implement what’s left.
• Now I am left only with program-specific, local subprograms.
• So I move down to the next level of detail and apply the same process.
  – Write the “executive summary” first.
  – Keep the executable section small.
  – Use local modules to hide the details.
• Eventually, you get down to the “real code” and write relatively trivial logic.

Challenges of local modules
• Requires discipline: always be on the lookout for opportunities to refactor.
• Need to read from the bottom, up.
  – Takes some getting used to.
• Sometimes can feel like a "wild goose chase".
  – Where is the darned thing actually implemented?
  – Your IDE should help you understand the internal structure of the program.
• How do you decide when a module should be local or defined at a “higher” level?

To sum up the rule: Define subprograms close to usage.
• When should the program be local? Private to the package? Publicly accessible?
• The best rule to follow is: Define your subprograms as close as possible to their usage(s).
• The shorter the distance from usage to definition, the easier it is to find, understand and maintain that code.

Instrument your code.
• We should all assume that we are going to need visibility into the execution of our backend code.
  – Debuggers are helpful once you’ve found a bug.
  – Automatic profiling by Oracle shows code coverage and performance.
  – But what about application-specific issues?
• Tracing is different from debugging.
  – You should build application/context tracing that can be enabled without changing the code base itself.
  – Package-based globals offer a simple way to do this.
  – Quest CodeGen Utility offers a free tracing mechanism (the qd_runtime package)

Build and use regression tests.
• If we don’t build solid regression tests, we cannot possibly maintain our code without...
  – Lots of fear, lots of bugs, intensive allocation of resources.
• The only way to do this from a practical standpoint is to automate the process.
  – Quest Code Tester for Oracle
  – utPLSQL
  – PL/Unit

About deterministic programs

    CREATE OR REPLACE FUNCTION betwnstr (
       string_in      IN VARCHAR2
     , start_in       IN INTEGER
     , end_in         IN INTEGER
     , inclusive_in   IN BOOLEAN DEFAULT TRUE
    )
       RETURN VARCHAR2 DETERMINISTIC
    IS
    BEGIN
       RETURN ( SUBSTR ( string_in, start_in, end_in - start_in + 1 ));
    END betwnstr;
    /

• Definition: the same IN argument values always result in the same output/outcomes.
  – That is, the results of the program are completely determined by the inputs declared in the program header.
  – There are no “side effects.”
• Most PL/SQL programs have side-effects.
  – Every SQL statement is a side-effect, since it is not part of the parameter list.

Working with deterministic programs
• Most deterministic programs are small and relatively simple.
  – Formulas, rules, etc.
• Some clear benefits to deterministic code:
  – Deterministic programs are easier to test, since they have no side-effects.
  – Oracle will optimize/avoid execution, sometimes!
• My suggestions:
  – Segregate as much of your code as possible into deterministic and nondeterministic program units.
  – Carefully declare all deterministic programs.
Scripts: show_deterministc.sql, deterministc.sql

Writing readable, maintainable code
• PL/SQL makes it easy to write really nice, comfortable code.
  – Use everything the language has to offer.
• Be disciplined!
  – You can transform your code quality relatively simply and easily by following the “Write tiny chunks of code!” rule.
• Take your time!
  – Hurrying to build code is a sure-fire way to dig a deep hole really quickly.
  – Standardize development, define your tests, instrument your code.
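
A worked example to close with (not part of the original deck): a minimal, hand-rolled regression test for the deterministic betwnstr logic shown earlier. It is only a sketch under stated assumptions. The helper check_equal and the counter l_failures are hypothetical names; betwnstr is copied here as a local function, without the DETERMINISTIC keyword and without the unused inclusive_in parameter, so that the block runs on its own. In a real project a tool such as utPLSQL would likely replace the helper.

```plsql
-- Run with SET SERVEROUTPUT ON to see the PASS/FAIL report.
DECLARE
   -- Hypothetical counter of failed checks (not from the slides).
   l_failures   PLS_INTEGER := 0;

   -- Local copy of the slide's betwnstr logic, so the block is self-contained.
   FUNCTION betwnstr (
      string_in   IN VARCHAR2
    , start_in    IN INTEGER
    , end_in      IN INTEGER
   )
      RETURN VARCHAR2
   IS
   BEGIN
      RETURN SUBSTR (string_in, start_in, end_in - start_in + 1);
   END betwnstr;

   -- Hypothetical helper: reports one comparison; two NULLs count as equal.
   PROCEDURE check_equal (
      test_name_in   IN VARCHAR2
    , expected_in    IN VARCHAR2
    , actual_in      IN VARCHAR2
   )
   IS
   BEGIN
      IF    expected_in = actual_in
         OR (expected_in IS NULL AND actual_in IS NULL)
      THEN
         DBMS_OUTPUT.put_line ('PASS: ' || test_name_in);
      ELSE
         l_failures := l_failures + 1;
         DBMS_OUTPUT.put_line ('FAIL: ' || test_name_in);
      END IF;
   END check_equal;
BEGIN
   -- The executable section stays tiny: one line per test case.
   check_equal ('middle of string', 'cde'
              , betwnstr (string_in => 'abcdefg', start_in => 3, end_in => 5));
   check_equal ('single character', 'a'
              , betwnstr (string_in => 'abcdefg', start_in => 1, end_in => 1));
   check_equal ('end past length', 'efg'
              , betwnstr (string_in => 'abcdefg', start_in => 5, end_in => 100));
   check_equal ('end before start', NULL
              , betwnstr (string_in => 'abcdefg', start_in => 5, end_in => 3));
   check_equal ('NULL string', NULL
              , betwnstr (string_in => NULL, start_in => 1, end_in => 2));

   -- Fail loudly so an automated run cannot miss a regression.
   IF l_failures > 0
   THEN
      RAISE_APPLICATION_ERROR (
         -20000, TO_CHAR (l_failures) || ' betwnstr check(s) failed');
   END IF;
END;
/
```

Because betwnstr has no side-effects, each case needs only inputs and an expected result, with no setup or cleanup. The block also leans on several earlier tips: END labels, named notation, and local modules that keep the executable section short.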