Writing Basic Postgres Functions

By Lloyd Albin
12/3/2013
Functions

The Basics - Languages
Languages


Pre-Installed Languages:
SQL
C
internal
plpgsql

Installable Languages that come with Postgres:
plperl
plperlu
pltcl
plpython

Other Downloadable Languages:
pljava
plphp
plpy
plr
plruby
plscheme
plsh
Finding Languages that can be Installed

SELECT name, comment
FROM pg_available_extensions
WHERE comment LIKE '%language%';

 name    | comment
---------+----------------------------------------
 plperl  | PL/Perl procedural language
 plperlu | PL/PerlU untrusted procedural language
 plpgsql | PL/pgSQL procedural language
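
The query above lists what can be installed. To see which languages are already installed in the current database, the pg_language system catalog can be queried instead. This is a minimal sketch and is not part of the original slides.

```sql
-- Languages already installed in the current database.
-- lanpltrusted is true for trusted procedural languages.
SELECT lanname, lanpltrusted
FROM pg_language
ORDER BY lanname;
```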
Finding Downloadable Languages

http://www.postgresql.org/docs/9.3/static/external-pl.html
http://www.postgresql.org/docs/9.2/static/external-pl.html
http://www.postgresql.org/docs/9.1/static/external-pl.html
http://www.postgresql.org/docs/9.0/static/external-pl.html
http://www.postgresql.org/docs/8.4/static/external-pl.html

 Name      | Language   | Website (Postgres 9.2 & 9.3)
-----------+------------+------------------------------------------------
 PL/Java   | Java       | http://pljava.projects.postgresql.org/
 PL/PHP    | PHP        | http://www.commandprompt.com/community/plphp/
 PL/Py     | Python     | http://python.projects.postgresql.org/backend/
 PL/R      | R          | http://www.joeconway.com/plr/
 PL/Ruby   | Ruby       | http://raa.ruby-lang.org/project/pl-ruby/
 PL/Scheme | Scheme     | http://plscheme.projects.postgresql.org/
 PL/sh     | Unix shell | http://plsh.projects.postgresql.org/
How to Install a Language

 CREATE EXTENSION plperl;
 CREATE EXTENSION plperlu;
 CREATE EXTENSION plpgsql;
Uninstalling a Language

 DROP EXTENSION plperl;
 DROP EXTENSION plperlu;
 DROP EXTENSION plpgsql;
Functions

The Basics – Function Behavior
IMMUTABLE

 IMMUTABLE indicates that the function cannot
modify the database and always returns the same
result when given the same argument values; that is,
it does not do database lookups or otherwise use
information not directly present in its argument list.
 If this option is given, any call of the function with
all-constant arguments can be immediately replaced
with the function value.
 In practice, IMMUTABLE functions are well suited to
type conversion, and only IMMUTABLE functions can be
used in an index expression.
STABLE

 STABLE indicates that the function cannot modify the
database, and that within a single table scan it will
consistently return the same result for the same argument
values, but that its result could change across SQL
statements.
 This is the appropriate selection for functions whose
results depend on database lookups, parameter variables
(such as the current time zone), etc.
 It is inappropriate for AFTER triggers that wish to query
rows modified by the current command.
 Also note that the current_timestamp family of functions
qualify as stable, since their values do not change within a
transaction.
VOLATILE

 VOLATILE indicates that the function value can
change even within a single table scan, so no
optimizations can be made.
 Relatively few database functions are volatile in this
sense; some examples are random(), currval(),
timeofday().
 But note that any function that has side-effects must
be classified volatile, even if its result is quite
predictable, to prevent calls from being optimized
away; an example is setval().
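
The three categories can be illustrated with a minimal sketch. The function names (demo_add_tax, demo_session_zone, demo_dice) and the 8% rate are hypothetical and chosen only for illustration. Referring to parameters by name inside an SQL function assumes PostgreSQL 9.2 or later.

```sql
-- IMMUTABLE: the result depends only on the argument (hypothetical 8% rate).
CREATE FUNCTION demo_add_tax(amount numeric) RETURNS numeric AS
$body$
    SELECT amount * 1.08;
$body$
LANGUAGE sql IMMUTABLE;

-- STABLE: reads a session setting, so the result can change between statements.
CREATE FUNCTION demo_session_zone() RETURNS text AS
$body$
    SELECT current_setting('TimeZone');
$body$
LANGUAGE sql STABLE;

-- VOLATILE: may return a different value on every call.
CREATE FUNCTION demo_dice() RETURNS integer AS
$body$
    SELECT (floor(random() * 6) + 1)::integer;
$body$
LANGUAGE sql VOLATILE;

SELECT demo_add_tax(100), demo_session_zone(), demo_dice();
```

Declaring a function with a stricter category than it deserves (for example, marking demo_dice as IMMUTABLE) is accepted by Postgres but can produce stale or surprising results, so the label should match what the body really does.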
Functions

The Basics – NULL INPUT
What to do with NULL?

 CALLED ON NULL INPUT – This is the default if
you specify nothing. You will need to handle
NULLs within your code.
 RETURNS NULL ON NULL INPUT or STRICT –
This will cause the function to return NULL when
any of the input values are NULL. The body of the
function is never executed.
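
A minimal sketch of the difference, using two hypothetical functions with identical bodies. Only the NULL-handling clause differs, which shows that the STRICT version never reaches its COALESCE.

```sql
-- Default behavior: the body runs and must handle NULL itself.
CREATE FUNCTION demo_len_default(txt text) RETURNS integer AS
$body$
    SELECT COALESCE(length(txt), 0);
$body$
LANGUAGE sql IMMUTABLE CALLED ON NULL INPUT;

-- STRICT: the body is skipped and NULL is returned for a NULL argument.
CREATE FUNCTION demo_len_strict(txt text) RETURNS integer AS
$body$
    SELECT COALESCE(length(txt), 0);
$body$
LANGUAGE sql IMMUTABLE STRICT;

SELECT demo_len_default(NULL) AS default_result,  -- 0
       demo_len_strict(NULL)  AS strict_result;   -- NULL
```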
Functions

The Basics - SECURITY

 SECURITY INVOKER – This is the default. The
function is to be executed with the privileges of the
user that calls it.
 SECURITY DEFINER – This specifies that the
function is to be executed with the privileges of the
user that created it. This could be used to have a
function update a table that the calling user does not
have permissions to access, etc.
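
As a sketch of the SECURITY DEFINER case, the example below lets any user append to a log table they cannot read. The table and function names are hypothetical. The SET search_path clause is a common precaution for SECURITY DEFINER functions and is an addition here, not something the slides show.

```sql
-- Hypothetical table and function names, for illustration only.
CREATE TABLE public.demo_audit_log (
    logged_by NAME,
    message   TEXT
);

-- Make sure ordinary users have no direct access to the table.
REVOKE ALL ON public.demo_audit_log FROM PUBLIC;

-- Runs with the privileges of the role that created it.
-- session_user still reports who actually connected.
CREATE FUNCTION public.demo_log_event(msg text) RETURNS void AS
$body$
    INSERT INTO public.demo_audit_log VALUES (session_user, msg);
$body$
LANGUAGE sql
VOLATILE
SECURITY DEFINER
SET search_path = public, pg_temp;

SELECT public.demo_log_event('nightly load finished');
```

EXECUTE on new functions is granted to PUBLIC by default, so callers need no extra grant in this sketch. In a real deployment that default may be worth revoking and re-granting to specific roles.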
Functions

The Basics - Syntax
CREATE FUNCTION

CREATE OR REPLACE FUNCTION function_name (
argname text
)
RETURNS numeric AS
$body$
….
$body$
LANGUAGE 'sql'
IMMUTABLE | STABLE | VOLATILE
RETURNS NULL ON NULL INPUT
SECURITY DEFINER
;
Additional Return Types

 void – This is for when the function should not return
anything. It is just doing some background process for
you.
 trigger – This must be set for all trigger functions.
 boolean, text, etc – This is for a single value being
passed back.
 SETOF schema.table – This is for returning multiple
rows of data. This can point either to an existing
table or to a composite type to get the table/field
layout.
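
A minimal sketch of a SETOF return using a composite type. The type demo_pair and the function demo_squares are hypothetical names introduced only for this example.

```sql
-- Composite type that describes the row layout being returned.
CREATE TYPE demo_pair AS (n integer, squared integer);

-- Returns one row per integer from 1 to max_n.
CREATE FUNCTION demo_squares(max_n integer) RETURNS SETOF demo_pair AS
$body$
    SELECT i, i * i FROM generate_series(1, max_n) AS g(i);
$body$
LANGUAGE sql IMMUTABLE;

-- A set-returning function is used like a table in the FROM clause.
SELECT * FROM demo_squares(3);
```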
Simple Functions

Basic SQL functions – Converting a field type
Converting a table's field

CREATE TABLE tools.lloyd_test (
mynumber VARCHAR
);
ALTER TABLE tools.lloyd_test
ALTER COLUMN mynumber TYPE INTEGER
COLLATE pg_catalog."default";
ERROR: collations are not supported by type integer
Using a function to do the conversion

ALTER TABLE tools.lloyd_test
ALTER COLUMN mynumber TYPE INTEGER
USING tools.chartoint(mynumber);
Convert varchar to int

CREATE OR REPLACE FUNCTION tools.chartoint (
chartoconvert varchar
)
RETURNS integer AS
$body$
SELECT
CASE WHEN trim(chartoconvert) SIMILAR TO '[0-9,]+'
THEN CAST(trim(REPLACE(chartoconvert,',','')) AS integer)
ELSE NULL END;
$body$
LANGUAGE 'sql'
IMMUTABLE
RETURNS NULL ON NULL INPUT
SECURITY INVOKER;
Convert varchar to numeric

CREATE OR REPLACE FUNCTION tools.chartonumeric (
    chartoconvert varchar
)
RETURNS numeric AS
$body$
    -- Accept only digits, commas, periods, and minus signs; strip commas before casting
    SELECT
        CASE WHEN trim(chartoconvert) SIMILAR TO '[0-9,.-]+'
            THEN CAST(trim(REPLACE(chartoconvert,',','')) AS numeric)
        ELSE NULL END;
$body$
LANGUAGE 'sql'
IMMUTABLE
RETURNS NULL ON NULL INPUT
SECURITY INVOKER;
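
To see what the validation expression inside the integer conversion does, it can be run directly against a few sample strings. This is only a sketch: the sample values are made up, and the expression is copied out of the function so the block runs without the tools schema.

```sql
-- Same CASE expression as the integer conversion, applied to sample input.
SELECT input,
       CASE WHEN trim(input) SIMILAR TO '[0-9,]+'
            THEN CAST(trim(REPLACE(input, ',', '')) AS integer)
            ELSE NULL END AS converted
FROM (VALUES ('1,234'), (' 42 '), ('abc'), ('12.5')) AS t(input);
-- '1,234' -> 1234, ' 42 ' -> 42, 'abc' -> NULL, '12.5' -> NULL
```

One caveat worth knowing: a string made up only of commas passes the pattern but becomes an empty string after REPLACE, so the cast raises an error instead of returning NULL.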
Simple Functions

Basic PL/pgSQL functions – Using as an index
The Problem

 We receive faxes that are multi-page TIFFs. The TIFF
file name is called the raster id. Our data contains
the raster id in three forms: the raster id alone, the
raster id with a page number, and the full path of the
file.
 Examples:
 0000/000000 (raster id)
 0000/0000001111 (raster id with page number)
 /studydata/studyname/0000/000000 (full path)
Finding the Raster ID

The first thing to do is to find the Raster ID, no matter which format is supplied.
CREATE FUNCTION find_raster (raster varchar)
RETURNS VARCHAR(11) AS
$$
BEGIN
    CASE length(raster)
        WHEN 11 THEN
            -- Format: 1234/567890 -- Returns: 1234/567890
            RETURN raster;
        WHEN 15 THEN
            -- Format: 1234/5678901234 -- Returns: 1234/567890
            RETURN substr(raster, 1, 11);
        ELSE
            -- Format: /study_data/study_name/1234/567890 -- Returns: 1234/567890
            RETURN substr(raster, length(raster)-10, 11);
    END CASE;
END;
$$
LANGUAGE plpgsql
IMMUTABLE
RETURNS NULL ON NULL INPUT;
Examples of the find_raster Function

-- Test returning of Raster ID when Submitting Raster ID
SELECT find_raster('1234/567890');
-- Returns: 1234/567890
-- Test returning of Raster ID when Submitting Raster ID with 4 Digit Page Number
SELECT find_raster('1234/5678901234');
-- Returns: 1234/567890
-- Test returning of Raster ID when Submitting Filename that includes Raster ID
SELECT find_raster('/study_data/study_name/1234/567890');
-- Returns: 1234/567890
Adding the Index

CREATE INDEX [ name ] ON table ( ( expression ) )
expression
An expression based on one or more columns of the table. The
expression usually must be written with surrounding parentheses,
as shown in the syntax. However, the parentheses can be omitted if
the expression has the form of a function call.
CREATE INDEX raster_raster_idx ON raster (find_raster(raster));
CREATE INDEX raster_file_raster_idx ON raster_file (find_raster(raster));
CREATE INDEX raster_page_raster_idx ON raster_page (find_raster(raster));
The benefits of the Index

Without Index
SELECT raster.*, raster_page.*
FROM (SELECT * FROM raster OFFSET 50000 LIMIT 100) raster
LEFT JOIN raster_page
ON raster.raster = substr(raster_page.raster, 1, 11);
 Total runtime: 4982.527 ms (4.982527 s)

With Index
SELECT raster.*, raster_page.*
FROM (SELECT * FROM raster OFFSET 50000 LIMIT 100) raster
LEFT JOIN raster_page
ON raster.raster = find_raster(raster_page.raster);
 Total runtime: 141.809 ms (0.141809 s)
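
Whether the planner actually uses an expression index can be checked with EXPLAIN. The sketch below is self-contained and uses a hypothetical temp table and a simplified stand-in function (demo_raster_id), since the slides do not show the real raster tables. The plan the planner picks depends on table size and statistics, so no particular output is promised.

```sql
-- Hypothetical demo table filled with 15-character "raster id + page" values.
CREATE TEMP TABLE demo_raster_page (raster varchar);
INSERT INTO demo_raster_page
SELECT '1234/' || lpad(i::text, 6, '0') || '0001'
FROM generate_series(1, 10000) AS g(i);

-- Simplified stand-in for find_raster: keeps the first 11 characters.
CREATE FUNCTION demo_raster_id(raster varchar) RETURNS varchar AS
$$
    SELECT substr(raster, 1, 11)::varchar;
$$
LANGUAGE sql IMMUTABLE STRICT;

-- Expression index; the function must be IMMUTABLE for this to be allowed.
CREATE INDEX demo_raster_page_idx ON demo_raster_page (demo_raster_id(raster));
ANALYZE demo_raster_page;

-- The WHERE clause repeats the indexed expression exactly.
EXPLAIN
SELECT * FROM demo_raster_page
WHERE demo_raster_id(raster) = '1234/000042';
```

The index can only be considered when the query uses the same expression as the index definition, which is why the join in the "With Index" query calls find_raster rather than substr.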
Simple Functions

Basic PL/pgSQL functions – Triggers
Shadow Tables

 Sometimes we want to keep a copy of a table and
know when each change happened to the original
table: insert, update, delete, and truncate. A trigger
function can make this happen automatically.
Creating the base tables

 Table 1
CREATE TABLE public.table1 (
key SERIAL,
value INTEGER,
value_type VARCHAR,
PRIMARY KEY(key)
);
 Table 2
CREATE TABLE public.table2 (
key INTEGER,
value INTEGER,
value_type VARCHAR,
user_name NAME,
action VARCHAR,
action_time TIMESTAMP
);
The Shadow Function

CREATE FUNCTION public.shadow_table1 (
)
RETURNS trigger AS
$body$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO public.table2
        VALUES(NEW.key, NEW.value, NEW.value_type, current_user, TG_OP, now());
        RETURN NEW;
    END IF;
    IF TG_OP = 'UPDATE' THEN
        INSERT INTO public.table2
        VALUES(NEW.key, NEW.value, NEW.value_type, current_user, TG_OP, now());
        RETURN NEW;
    END IF;
    IF TG_OP = 'DELETE' THEN
        INSERT INTO public.table2
        VALUES(OLD.key, OLD.value, OLD.value_type, current_user, TG_OP, now());
        RETURN OLD;
    END IF;
END;
$body$
LANGUAGE 'plpgsql'
VOLATILE
CALLED ON NULL INPUT
SECURITY DEFINER;
Adding the trigger to the table

CREATE TRIGGER table1_tr
BEFORE INSERT OR UPDATE OR DELETE
ON public.table1 FOR EACH ROW
EXECUTE PROCEDURE public.shadow_table1();
Working with Table1

INSERT INTO public.table1 (value, value_type) VALUES ('30', 'meters');
INSERT INTO public.table1 (value, value_type) VALUES ('10', 'inches');
UPDATE public.table1 SET value = '20' WHERE value_type = 'inches';
DELETE FROM public.table1 WHERE value_type = 'inches';
INSERT INTO public.table1 (value, value_type) VALUES ('50', 'inches');
What they look like

Table1
 key | value | value_type
-----+-------+------------
   1 |    30 | meters
   3 |    50 | inches

Table2
 key | value | value_type | user_name | action | action_time
-----+-------+------------+-----------+--------+----------------------
   1 |    30 | meters     | postgres  | INSERT | 12/3/2013 4:58:04 PM
   2 |    10 | inches     | postgres  | INSERT | 12/3/2013 4:58:04 PM
   2 |    20 | inches     | postgres  | UPDATE | 12/3/2013 4:58:04 PM
   2 |    20 | inches     | postgres  | DELETE | 12/3/2013 4:58:04 PM
   3 |    50 | inches     | postgres  | INSERT | 12/3/2013 4:58:04 PM
Simple Functions

Basic PL/pgSQL functions – Write to a hidden table
Hidden Tables

 Sometimes you may want a normal user to be able to
write to a table without being able to view/select any
of its contents (INSERT only).
 This does not play well with some front end applications.
 Some people write functions that take the values as
arguments, but that is not always possible,
depending on the front end that is being used.
Creating the base tables

 Table 1
SET ROLE a;
CREATE TABLE public.table1 (
key SERIAL,
value INTEGER,
value_type VARCHAR,
PRIMARY KEY(key)
);
 Table 2
SET ROLE b;
CREATE TABLE public.table2 (
key INTEGER,
value INTEGER,
value_type VARCHAR,
PRIMARY KEY(key)
);
The Hidden Function

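SET ROLE b;
CREATE FUNCTION public.hidden_table1 (
)
RETURNS trigger AS
$body$
BEGIN
    -- Redirect the row into table2, which the calling user cannot read
    INSERT INTO public.table2 VALUES(NEW.key, NEW.value, NEW.value_type);
    -- Returning NULL from a BEFORE trigger skips the insert into table1
    RETURN NULL;
END;
$body$
LANGUAGE 'plpgsql'
VOLATILE
CALLED ON NULL INPUT
SECURITY DEFINER;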
Adding the trigger to the table

CREATE TRIGGER table1_tr
BEFORE INSERT
ON public.table1 FOR EACH ROW
EXECUTE PROCEDURE public.hidden_table1();
Working with Table1

INSERT INTO public.table1 (value, value_type) VALUES ('30', 'meters');
INSERT INTO public.table1 (value, value_type) VALUES ('10', 'inches');
INSERT INTO public.table1 (value, value_type) VALUES ('50', 'inches');
What they look like

Table1
 key | value | value_type
-----+-------+------------
(0 rows)

Table2
 key | value | value_type
-----+-------+------------
   1 |    30 | meters
   2 |    10 | inches
   3 |    50 | inches
Simple Functions

Basic PL/pgSQL functions – Returning a Table
Counting the tables

CREATE OR REPLACE FUNCTION tools.count_schema_rows (search_schema_name name)
RETURNS SETOF tools.schema_row_count AS
$body$
DECLARE
    schema_results RECORD;
    table_results RECORD;
BEGIN
    FOR schema_results IN
        SELECT table_schema, table_name
        FROM information_schema.tables
        WHERE table_schema = search_schema_name
          AND table_type = 'BASE TABLE'
        ORDER BY table_name
    LOOP
        -- looping through tables in schema here
        -- quote_ident() keeps mixed-case or unusual table names working
        EXECUTE 'SELECT '
            || quote_literal(schema_results.table_schema) || '::NAME AS schema_name, '
            || quote_literal(schema_results.table_name) || '::NAME AS table_name, '
            || 'count(*) AS row_count, '
            || 'pg_total_relation_size('
            || quote_literal(quote_ident(schema_results.table_schema) || '.' || quote_ident(schema_results.table_name))
            || ') AS total_size, '
            || 'pg_size_pretty(pg_total_relation_size('
            || quote_literal(quote_ident(schema_results.table_schema) || '.' || quote_ident(schema_results.table_name))
            || ')) AS total_size_pretty FROM '
            || quote_ident(schema_results.table_schema) || '.' || quote_ident(schema_results.table_name)
        INTO table_results;
        RETURN NEXT table_results;
    END LOOP;
END;
$body$
LANGUAGE 'plpgsql'
VOLATILE
CALLED ON NULL INPUT
SECURITY DEFINER;
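
The function above returns SETOF tools.schema_row_count, a composite type that must exist before the function is created. The slides do not show its definition, so the sketch below is an assumption inferred from the result columns and from the types returned by count(*), pg_total_relation_size(), and pg_size_pretty().

```sql
-- Assumed setup; CREATE SCHEMA IF NOT EXISTS needs PostgreSQL 9.3 or later.
CREATE SCHEMA IF NOT EXISTS tools;

-- Assumed definition; column names follow the result columns in the slides.
CREATE TYPE tools.schema_row_count AS (
    schema_name       NAME,
    table_name        NAME,
    row_count         BIGINT,  -- count(*) returns bigint
    total_size        BIGINT,  -- pg_total_relation_size() returns bigint
    total_size_pretty TEXT     -- pg_size_pretty() returns text
);
```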
Counting the tables

SELECT * FROM tools.count_schema_rows ('dffax');

 schema_name | table_name     | row_count | total_size | total_size_pretty
-------------+----------------+-----------+------------+-------------------
 dffax       | dfx_time1_p052 |    925323 |  566116352 | 540 MB
 dffax       | dfx_time1_h021 |   1007840 |  622043136 | 593 MB
 dffax       | dfx_time1_m003 |   1241989 |  767385600 | 732 MB
By Lloyd Albin
1/7/2014
Sample of Functions

 Materialized Views
 Single Table Shadow functions.
 Lookup User Dependencies
 Update sequences for an entire schema