By Lloyd Albin 12/3/2013

Functions The Basics - Languages

Languages

Pre-Installed Languages:
    SQL, C, internal

Installable Languages that come with Postgres:
    plpgsql, plperl, plperlu, pltcl, plpython

Other Downloadable Languages:
    pljava, plphp, plpy, plr, plruby, plscheme, plsh

Finding Languages that can be Installed

    SELECT name, comment
    FROM pg_available_extensions
    WHERE comment LIKE '%language%';

    Name      Comment
    plperl    PL/Perl procedural language
    plperlu   PL/PerlU untrusted procedural language
    plpgsql   PL/pgSQL procedural language

Finding Downloadable Languages

    http://www.postgresql.org/docs/9.3/static/external-pl.html
    http://www.postgresql.org/docs/9.2/static/external-pl.html
    http://www.postgresql.org/docs/9.1/static/external-pl.html
    http://www.postgresql.org/docs/9.0/static/external-pl.html
    http://www.postgresql.org/docs/8.4/static/external-pl.html

    Name       Language    Website (Postgres 9.2 & 9.3)
    PL/Java    Java        http://pljava.projects.postgresql.org/
    PL/PHP     PHP         http://www.commandprompt.com/community/plphp/
    PL/Py      Python      http://python.projects.postgresql.org/backend/
    PL/R       R           http://www.joeconway.com/plr/
    PL/Ruby    Ruby        http://raa.ruby-lang.org/project/pl-ruby/
    PL/Scheme  Scheme      http://plscheme.projects.postgresql.org/
    PL/sh      Unix shell  http://plsh.projects.postgresql.org/

How to Install a Language

    CREATE EXTENSION plperl;
    CREATE EXTENSION plperlu;
    CREATE EXTENSION plpgsql;

Uninstalling a Language

    DROP EXTENSION plperl;
    DROP EXTENSION plperlu;
    DROP EXTENSION plpgsql;

Functions The Basics - Function Behavior

IMMUTABLE

IMMUTABLE indicates that the function cannot modify the database and always returns the same result when given the same argument values; that is, it does not do database lookups or otherwise use information not directly present in its argument list. If this option is given, any call of the function with all-constant arguments can be immediately replaced with the function value. In practice, this makes IMMUTABLE functions suitable for type conversions and for use in expression indexes.
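As a minimal sketch of what IMMUTABLE allows, consider the function below. The name public.inches_to_cm is hypothetical and not part of the original slides. Its result depends only on its argument, so a call with a constant argument may be evaluated once when the query is planned.

```sql
-- Hypothetical example function: no database lookups, no settings consulted,
-- so the same input always gives the same output.
CREATE FUNCTION public.inches_to_cm (inches numeric)
RETURNS numeric AS
$body$
    SELECT inches * 2.54;
$body$
LANGUAGE sql
IMMUTABLE
RETURNS NULL ON NULL INPUT;

-- With an all-constant argument list, the planner is free to replace
-- this call with its value before the query runs.
SELECT public.inches_to_cm(10);
```

If the function read from a table or depended on a configuration setting, IMMUTABLE would be the wrong label and STABLE or VOLATILE would be needed instead.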

STABLE

STABLE indicates that the function cannot modify the database, and that within a single table scan it will consistently return the same result for the same argument values, but that its result could change across SQL statements. This is the appropriate selection for functions whose results depend on database lookups, parameter variables (such as the current time zone), etc. It is inappropriate for AFTER triggers that wish to query rows modified by the current command. Also note that the current_timestamp family of functions qualify as stable, since their values do not change within a transaction.

VOLATILE

VOLATILE indicates that the function value can change even within a single table scan, so no optimizations can be made. Relatively few database functions are volatile in this sense; some examples are random(), currval(), timeofday(). But note that any function that has side-effects must be classified volatile, even if its result is quite predictable, to prevent calls from being optimized away; an example is setval().

Functions The Basics - NULL INPUT

What to do with NULL?

CALLED ON NULL INPUT: This is the default if you don't say anything. You will need to handle NULLs within your code.

RETURNS NULL ON NULL INPUT or STRICT: The function returns NULL whenever any of the input values is NULL. In that case the body of the function is not executed.

Functions The Basics - SECURITY

SECURITY INVOKER: This is the default. The function is executed with the privileges of the user that calls it.

SECURITY DEFINER: The function is executed with the privileges of the user that created it. This could be used, for example, to have a function update a table that the calling user does not have permission to access.

Functions The Basics - Syntax

    CREATE FUNCTION
    CREATE OR REPLACE FUNCTION
    ( argname text )
    RETURNS numeric AS
    $body$
    ….
    $body$
    LANGUAGE 'sql'
    IMMUTABLE | STABLE | VOLATILE
    RETURNS NULL ON NULL INPUT
    SECURITY DEFINER
    ;

Additional Return Types

void: For when the function should not return anything. It is just doing some background process for you.
trigger: This must be set for all trigger functions.
boolean, text, etc: For a single value being passed back.
SETOF schema.table: For returning multiple rows of data. This can point either to an existing table or to a composite type to get the table/field layout.

Simple Functions Basic SQL functions - Converting a field type

Converting a table's field

    CREATE TABLE tools.lloyd_test (
        mynumber VARCHAR
    );

    ALTER TABLE tools.lloyd_test
        ALTER COLUMN mynumber TYPE INTEGER COLLATE pg_catalog."default";

    ERROR: collations are not supported by type integer

Using a function to do the conversion

    ALTER TABLE tools.lloyd_test
        ALTER COLUMN mynumber TYPE INTEGER USING tools.chartoint(mynumber);

Convert varchar to int

    CREATE OR REPLACE FUNCTION tools.chartoint ( chartoconvert varchar )
    RETURNS integer AS
    $body$
        -- Accept only digits and thousands separators; anything else becomes NULL.
        SELECT CASE WHEN trim(chartoconvert) SIMILAR TO '[0-9,]+'
                    THEN CAST(trim(REPLACE(chartoconvert, ',', '')) AS integer)
                    ELSE NULL
               END;
    $body$
    LANGUAGE 'sql'
    IMMUTABLE
    RETURNS NULL ON NULL INPUT
    SECURITY INVOKER;

Convert varchar to numeric

    CREATE OR REPLACE FUNCTION tools.chartonumeric ( chartoconvert varchar )
    RETURNS numeric AS
    $body$
        -- Also allow a decimal point and a minus sign; cast to numeric, not integer.
        SELECT CASE WHEN trim(chartoconvert) SIMILAR TO '[0-9,.-]+'
                    THEN CAST(trim(REPLACE(chartoconvert, ',', '')) AS numeric)
                    ELSE NULL
               END;
    $body$
    LANGUAGE 'sql'
    IMMUTABLE
    RETURNS NULL ON NULL INPUT
    SECURITY INVOKER;

Simple Functions Basic PL/pgSQL functions - Using as an index

The Problem

We receive faxes that are multi-page TIFFs. The TIFF file name is called the raster ID. Our data holds this value in three forms: the raster ID alone, the raster ID with a page number, and the full path of the file.

Examples:
    0000/000000
    0000/0000001111
    /studydata/studyname/0000/000000

Finding the Raster ID

The first thing to do is to be able to find the Raster ID, no matter which format is supplied.

    CREATE FUNCTION find_raster (raster varchar)
    RETURNS VARCHAR(11) AS
    $$
    BEGIN
        CASE length(raster)
            WHEN 11 THEN
                -- Format: 1234/567890
                -- Returns: 1234/567890
                RETURN raster;
            WHEN 15 THEN
                -- Format: 1234/5678901234
                -- Returns: 1234/567890
                RETURN substr(raster, 1, 11);
            ELSE
                -- Format: /study_data/study_name/1234/567890
                -- Returns: 1234/567890
                RETURN substr(raster, length(raster)-10, 11);
        END CASE;
    END;
    $$
    LANGUAGE plpgsql
    IMMUTABLE
    RETURNS NULL ON NULL INPUT;

Examples of the find_raster Function

    -- Test returning of Raster ID when Submitting Raster ID
    SELECT find_raster('1234/567890');
    -- Returns: 1234/567890

    -- Test returning of Raster ID when Submitting Raster ID with 4 Digit Page Number
    SELECT find_raster('1234/5678901234');
    -- Returns: 1234/567890

    -- Test returning of Raster ID when Submitting Filename that includes Raster ID
    SELECT find_raster('/study_data/study_name/1234/567890');
    -- Returns: 1234/567890

Adding the Index

    CREATE INDEX [ name ] ON table ( expression )

expression
    An expression based on one or more columns of the table. The expression usually must be written with surrounding parentheses, as shown in the syntax. However, the parentheses can be omitted if the expression has the form of a function call.
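
The parentheses rule quoted above can be sketched as follows. The table public.people and its columns are hypothetical, chosen only to show the two forms. In both cases the outer parentheses that enclose the index column list are still required; only the expression's own extra pair can be dropped for a function call.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE public.people (
    first_name VARCHAR,
    last_name  VARCHAR
);

-- Function-call form: the expression needs no extra parentheses of its own.
CREATE INDEX people_lower_last_name_idx
    ON public.people (lower(last_name));

-- Any other expression must be wrapped in its own parentheses.
CREATE INDEX people_full_name_idx
    ON public.people ((first_name || ' ' || last_name));
```

A query benefits from such an index only when its WHERE or JOIN clause uses the same expression that was indexed.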

    CREATE INDEX raster_raster_idx ON raster (find_raster(raster));
    CREATE INDEX raster_file_raster_idx ON raster_file (find_raster(raster));
    CREATE INDEX raster_page_raster_idx ON raster_page (find_raster(raster));

The benefits of the Index

Without Index

    SELECT raster.*, raster_page.*
    FROM (SELECT * FROM raster OFFSET 50000 LIMIT 100) raster
    LEFT JOIN raster_page ON raster.raster = substr(raster_page.raster, 1, 11);

    Total runtime: 4982.527 ms
    Total runtime: 4.982527 s

With Index

    SELECT raster.*, raster_page.*
    FROM (SELECT * FROM raster OFFSET 50000 LIMIT 100) raster
    LEFT JOIN raster_page ON raster.raster = find_raster(raster_page.raster);

    Total runtime: 141.809 ms
    Total runtime: 0.141809 s

Simple Functions Basic PL/pgSQL functions - Triggers

Shadow Tables

Sometimes we want to keep a copy of a table and know when everything happened to the original table: insert, update, delete, and truncate. A trigger function can make this happen automatically.

Creating the base tables

Table 1

    CREATE TABLE public.table1 (
        key SERIAL,
        value INTEGER,
        value_type VARCHAR,
        PRIMARY KEY(key)
    );

Table 2

    CREATE TABLE public.table2 (
        key INTEGER,
        value INTEGER,
        value_type VARCHAR,
        user_name NAME,
        action VARCHAR,
        action_time TIMESTAMP
    );

The Shadow Function

    CREATE FUNCTION public.shadow_table1 ()
    RETURNS trigger AS
    $body$
    BEGIN
        -- Record the new row for inserts and updates.
        IF TG_OP = 'INSERT' THEN
            INSERT INTO public.table2
            VALUES(NEW.key, NEW.value, NEW.value_type, current_user, TG_OP, now());
            RETURN NEW;
        END IF;
        IF TG_OP = 'UPDATE' THEN
            INSERT INTO public.table2
            VALUES(NEW.key, NEW.value, NEW.value_type, current_user, TG_OP, now());
            RETURN NEW;
        END IF;
        -- Record the old row for deletes, since there is no NEW row.
        IF TG_OP = 'DELETE' THEN
            INSERT INTO public.table2
            VALUES(OLD.key, OLD.value, OLD.value_type, current_user, TG_OP, now());
            RETURN OLD;
        END IF;
    END;
    $body$
    LANGUAGE 'plpgsql'
    VOLATILE
    CALLED ON NULL INPUT
    SECURITY DEFINER;

Adding the trigger to the table

    CREATE TRIGGER table1_tr
        BEFORE INSERT OR UPDATE OR DELETE
        ON public.table1
        FOR EACH ROW EXECUTE PROCEDURE
        public.shadow_table1();

Working with Table1

    INSERT INTO public.table1 (value, value_type) VALUES ('30', 'meters');
    INSERT INTO public.table1 (value, value_type) VALUES ('10', 'inches');
    UPDATE public.table1 SET value = '20' WHERE value_type = 'inches';
    DELETE FROM public.table1 WHERE value_type = 'inches';
    INSERT INTO public.table1 (value, value_type) VALUES ('50', 'inches');

What they look like

Table1

    key  value  value_type
    1    30     meters
    3    50     inches

Table2

    key  value  value_type  user_name  action  action_time
    1    30     meters      postgres   INSERT  12/3/2013 4:58:04 PM
    2    10     inches      postgres   INSERT  12/3/2013 4:58:04 PM
    2    20     inches      postgres   UPDATE  12/3/2013 4:58:04 PM
    2    20     inches      postgres   DELETE  12/3/2013 4:58:04 PM
    3    50     inches      postgres   INSERT  12/3/2013 4:58:04 PM

Simple Functions Basic PL/pgSQL functions - Write to a hidden table

Hidden Tables

Sometimes you may want a normal user to be able to write to a table without being able to view/select any of its contents, aka INSERT only. This does not play well with some front end applications. Some people write functions that take the values as arguments, but that is not always possible, depending on the front end being written.
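
The alternative mentioned above, a function that receives the values as arguments, might look like the sketch below. The table public.hidden_values and the function public.add_hidden_value are hypothetical names used only for illustration; the sketch assumes both are created by the same role, so that SECURITY DEFINER lets callers insert without holding any privilege on the table.

```sql
-- Hypothetical table; ordinary users are granted nothing on it.
CREATE TABLE public.hidden_values (
    key        SERIAL PRIMARY KEY,
    value      INTEGER,
    value_type VARCHAR
);

-- Runs with the privileges of the role that created it, so the caller
-- needs only EXECUTE on the function, not INSERT on the table.
CREATE FUNCTION public.add_hidden_value (p_value integer, p_value_type varchar)
RETURNS void AS
$body$
    INSERT INTO public.hidden_values (value, value_type)
    VALUES (p_value, p_value_type);
$body$
LANGUAGE sql
VOLATILE
SECURITY DEFINER;

-- A front end would call the function instead of issuing an INSERT.
SELECT public.add_hidden_value(30, 'meters');
```

This only works when the front end can be made to call a function; when it can only issue plain INSERT statements, a trigger-based approach is the other option.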

Creating the base tables

Table 1

    SET ROLE a;
    CREATE TABLE public.table1 (
        key SERIAL,
        value INTEGER,
        value_type VARCHAR,
        PRIMARY KEY(key)
    );

Table 2

    SET ROLE b;
    CREATE TABLE public.table2 (
        key INTEGER,
        value INTEGER,
        value_type VARCHAR,
        PRIMARY KEY(key)
    );

The Hidden Function

    SET ROLE b;
    CREATE FUNCTION public.hidden_table1 ()
    RETURNS trigger AS
    $body$
    BEGIN
        INSERT INTO public.table2 VALUES(NEW.key, NEW.value, NEW.value_type);
        -- Returning NULL from a BEFORE trigger skips the insert into table1.
        RETURN NULL;
    END;
    $body$
    LANGUAGE 'plpgsql'
    VOLATILE
    CALLED ON NULL INPUT
    SECURITY DEFINER;

Adding the trigger to the table

    CREATE TRIGGER table1_tr
        BEFORE INSERT
        ON public.table1
        FOR EACH ROW EXECUTE PROCEDURE public.hidden_table1();

Working with Table1

    INSERT INTO public.table1 (value, value_type) VALUES ('30', 'meters');
    INSERT INTO public.table1 (value, value_type) VALUES ('10', 'inches');
    INSERT INTO public.table1 (value, value_type) VALUES ('50', 'inches');

What they look like

Table1

    key  value  value_type
    (no rows)

Table2

    key  value  value_type
    1    30     meters
    2    10     inches
    3    50     inches

Simple Functions Basic PL/pgSQL functions - Returning a Table

Counting the tables

    CREATE OR REPLACE FUNCTION tools.count_schema_rows (search_schema_name name)
    RETURNS SETOF tools.schema_row_count AS
    $body$
    DECLARE
        schema_results RECORD;
        table_results RECORD;
    BEGIN
        FOR schema_results IN
            SELECT table_schema, table_name
            FROM information_schema.tables
            WHERE table_schema = search_schema_name
              AND table_type = 'BASE TABLE'
            ORDER BY table_name
        LOOP
            -- looping through tables in schema here
            -- The relation name is passed through quote_ident so that
            -- mixed-case or unusual table names still resolve correctly.
            EXECUTE 'SELECT ' || quote_literal(schema_results.table_schema) || '::NAME AS schema_name, '
                || quote_literal(schema_results.table_name) || '::NAME AS table_name, '
                || 'count(*), '
                || 'pg_total_relation_size('
                || quote_literal(quote_ident(schema_results.table_schema) || '.' || quote_ident(schema_results.table_name))
                || ') AS total_size, '
                || 'pg_size_pretty(pg_total_relation_size('
                || quote_literal(quote_ident(schema_results.table_schema) || '.' || quote_ident(schema_results.table_name))
                || ')) AS total_size_pretty '
                || 'FROM ' || quote_ident(schema_results.table_schema) || '.'
                || quote_ident(schema_results.table_name)
                INTO table_results;
            RETURN NEXT table_results;
        END LOOP;
    END;
    $body$
    LANGUAGE 'plpgsql'
    VOLATILE
    CALLED ON NULL INPUT
    SECURITY DEFINER;

Counting the tables

    SELECT * FROM tools.count_schema_rows ('dffax');

    schema_name  table_name      row_count  total_size  total_size_pretty
    dffax        dfx_time1_p052  925323     566116352   540 MB
    dffax        dfx_time1_h021  1007840    622043136   593 MB
    dffax        dfx_time1_m003  1241989    767385600   732 MB

By Lloyd Albin 1/7/2014

Sample of Functions

    Materialized Views
    Single Table Shadow functions.
    Lookup User Dependencies
    Update sequences for an entire schema