Purchase  Copyright © 2002 Paul Sheer. Click here for copying permissions.  Home 

next up previous contents
Next: 39. smbd Up: rute Previous: 37. crond and atd   Contents

Subsections

38. postgres SQL Server

This chapter will show you how to set up an SQL server for free.

38.1 Structured Query Language

Structured Query Language (SQL) is a programming language developed specifically to access data arranged in tables of rows and columns--as in a database--as well as do searching, sorting and cross-referencing of that data.

Typically, the database tables will sit in files managed by an SQL server daemon process. The SQL server will listen on a TCP socket for incoming requests from client machines and will service those requests.

SQL has become a de facto industry standard. However, the protocols (over TCP/IP) by which those SQL requests are sent are different from implementation to implementation.

SQL requests can usually be typed in manually from a command-line interface. This is difficult for most users, so a GUI interface will usually hide this process from the user.

SQL servers and SQL support software is major institution. Management of database tables is actually a complicated affair. A good SQL server will properly streamline multiple simultaneous requests that may access and modify rows in the same table. Doing this efficiently, along with the many types of complex searches and cross-referencing, while also ensuring data integrity, is a complex task.

38.2 postgres

postgres (PostGreSQL) is a free SQL server written under the BSD license. postgres supports an extended subset of SQL92. [The definitive SQL standard.] It does a lot of very nifty things that no other database can (it seems). About the only commercial equivalent worth buying over postgres is a certain very expensive industry leader. postgres runs on every flavor of UNIX and also on Windows NT.

The postgres documentation proudly states:

The Object-Relational Database Management System now known as PostgreSQL (and briefly called Postgres95) is derived from the Postgres package written at Berkeley. With over a decade of development behind it, PostgreSQL is the most advanced open-source database available anywhere, offering multi-version concurrency control, supporting almost all SQL constructs (including subselects, transactions, and user-defined types and functions), and having a wide range of language bindings available (including C, C++, Java, Perl, Tcl, and Python).

postgres is also fairly dry. Most people ask why it doesn't have a graphical frontend. Considering that it runs on so many different platforms, it makes sense for it to be purely a back-end engine. A graphical interface is a different kind of software project that would probably support more than one type of database server at the back and possibly run under only one kind of graphical interface.

The postgres package consists of the files described in the next two sections:

38.3 postgres Package Content

The postgres packages consists of the user programs

 
 
 
createdb      dropdb      pg_dump      psql
createlang    droplang    pg_dumpall   vacuumdb
createuser    dropuser    pg_id       

and the server programs

 
 
 
initdb         pg_ctl        pg_upgrade   postgresql-dump
initlocation   pg_encoding   pg_version   postmaster
ipcclean       pg_passwd     postgres

Each of these programs has a man page which you should get an inkling of.

Further man pages provide references to actual SQL commands. Try man l select (explained further on):

 
 
 
 
5 
 
 
 
 
10 
 
 
 
 
15 
 
 
SELECT(l)                                               SELECT(l)
 
NAME
       SELECT - Retrieve rows from a table or view.
 
SYNOPSIS
       SELECT [ ALL | DISTINCT [ ON ( expression [, ...] ) ] ]
           expression [ AS name ] [, ...]
           [ INTO [ TEMPORARY | TEMP ] [ TABLE ] new_table ]
           [ FROM table [ alias ] [, ...] ]
           [ WHERE condition ]
           [ GROUP BY column [, ...] ]
           [ HAVING condition [, ...] ]
           [ { UNION [ ALL ] | INTERSECT | EXCEPT } select ]
           [ ORDER BY column [ ASC | DESC | USING operator ] [, ...] ]
           [ FOR UPDATE [ OF class_name [, ...] ] ]
           LIMIT { count | ALL } [ { OFFSET | , } start ]

Most important is the enormous amount of HTML documentation that comes with postgres. Point your web browser to /usr/doc/postgresql-? .? .? (or /usr/share/doc/...), then dive into the admin, user, programmer, tutorial, and postgres directories.

Finally, there are the start and stop scripts in /etc/rc.d/init.d/ (or /etc/init.d/) and the directory in which the database tables themselves are stored: /var/lib/pgsql/.

38.4 Installing and Initializing postgres

postgres can be gotten prepackaged for your favorite distribution. Simply install the package using rpm or dpkg and then follow the instructions given below.

Stop the postgres server if it is running; the init.d script may be called postgres or postgresql (Debian commands in parentheses):

 
 
/etc/rc.d/init.d/postgres stop
( /etc/init.d/postgresql stop )

Edit the init.d script to support TCP requests. There will be a line like the following to which you can add the -i option. Mine looks like:

 
 
su -l postgres -c "/usr/bin/pg_ctl -D $PGDATA \
       -p /usr/bin/postmaster -o '-i -o -e' start >/dev/null 2>&1"

which also (with the -o -e option) forces European date formats (28/4/1984 instead of 4/28/1984). Note that hosts will not be able to connect unless you edit your /var/lib/pgsql/data/pg_hba.conf ( /etc/postgresql/pg_hba.conf on Debian) file, and add lines like

 
host    mydatabase   192.168.4.7   255.255.255.255   trust

In either case, you should check this file to ensure that only trusted hosts can connect to your database, or remove the -i option altogether if you are only connecting from the local machine. To a limited extent, you can also limit what users can connect within this file.

It would be nice if the UNIX domain socket that postgres listens on (i.e., /tmp/.s.PGSQL.5432) had permissions 0770 instead of 0777. That way, you could limit connections to only those users belonging to the postgres group. You can add this feature by searching for the C chmod command within src/backend/libpq/pqcomm.c inside the postgres-7.0 sources. Later versions may have added a feature to set the permissions on this socket.

To run postgres, you need a user of that name. If you do not already have one then enter

 
/usr/sbin/useradd postgres

and restart the server with

 
/etc/rc.d/init.d/postgresql restart

The postgres init.d script initializes a template database on first run, so you may have to start it twice.

Now you can create your own database. The following example creates a database finance as well as a postgres user finance. It does these creations while being user postgres (this is what the -U option is for). You should run these commands as user root or as user postgres without the -U postgres.

 
 
 
 
5 
 
[root@cericon]# /usr/sbin/useradd finance
[root@cericon]# createuser -U postgres --adduser --createdb finance
CREATE USER
[root@cericon]# createdb -U finance finance
CREATE DATABASE
[root@cericon]# 

38.5 Database Queries with psql

Now that the database exists, you can begin running SQL queries.

 
 
 
 
5 
 
 
 
 
10 
 
 
 
 
15 
 
 
 
 
20 
 
 
 
 
25 
 
 
 
 
30 
 
 
 
 
35 
 
 
 
 
40 
 
[root@cericon]# psql -U finance
Welcome to psql, the PostgreSQL interactive terminal.
 
Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help on internal slash commands
       \g or terminate with semicolon to execute query
       \q to quit
 
finance=# select * from pg_tables;
   tablename    | tableowner | hasindexes | hasrules | hastriggers 
----------------+------------+------------+----------+-------------
 pg_type        | postgres   | t          | f        | f
 pg_attribute   | postgres   | t          | f        | f
 pg_proc        | postgres   | t          | f        | f
 pg_class       | postgres   | t          | f        | f
 pg_group       | postgres   | t          | f        | f
 pg_database    | postgres   | f          | f        | f
 pg_variable    | postgres   | f          | f        | f
 pg_log         | postgres   | f          | f        | f
 pg_xactlock    | postgres   | f          | f        | f
 pg_attrdef     | postgres   | t          | f        | f
 pg_relcheck    | postgres   | t          | f        | f
 pg_trigger     | postgres   | t          | f        | f
 pg_inherits    | postgres   | t          | f        | f
 pg_index       | postgres   | t          | f        | f
 pg_statistic   | postgres   | t          | f        | f
 pg_operator    | postgres   | t          | f        | f
 pg_opclass     | postgres   | t          | f        | f
 pg_am          | postgres   | t          | f        | f
 pg_amop        | postgres   | t          | f        | f
 pg_amproc      | postgres   | f          | f        | f
 pg_language    | postgres   | t          | f        | f
 pg_aggregate   | postgres   | t          | f        | f
 pg_ipl         | postgres   | f          | f        | f
 pg_inheritproc | postgres   | f          | f        | f
 pg_rewrite     | postgres   | t          | f        | f
 pg_listener    | postgres   | t          | f        | f
 pg_description | postgres   | t          | f        | f
 pg_shadow      | postgres   | f          | f        | t
(28 rows)

The preceeding rows are postgres's internal tables. Some are actual tables, and some are views of tables. [A selective representation of an actual table.]

To get a list of databases, try:

 
 
 
 
5 
 
finance=# select * from pg_database;
  datname  | datdba | encoding |  datpath  
-----------+--------+----------+-----------
 template1 |     24 |        0 | template1
 finance   |     26 |        0 | finance
(2 rows)

38.6 Introduction to SQL

The following are 99% of the commands you are ever going to use. (Note that all SQL commands require a semicolon at the end--you won't be the first person to ask why nothing happens when you press Enter without the semicolon.)

38.6.1 Creating tables

To create a table called people, with three columns:

 
CREATE TABLE people ( name text, gender bool, address text );

The created table will title the columns, name, gender, and address. Columns are typed. This means that only the kind of data that was specified at the time of creation can go in that column. In the case of gender, it can only be true or false for the boolean type, which we will associate to the male and female genders. There is probably no reason to use the boolean value here: using an integer or text field can often be far more descriptive and flexible. In the case of name and address, these can hold anything, since they are of the text type, which is the most encompassing type of all.

Note that in the postgres documentation, a ``column'' is called an ``attribute'' for historical reasons.

You should try to choose types according to the kind of searches you are going to do and not according to the data it holds. Table 38.1 lists the most of the useful types as well as their SQL92 equivalents. The types in bold are to be used in preference to other similar types for greater range or precision:

Table: Common postgres types
Postgres Type SQL92 or SQL3 Type Description
bool boolean logical boolean (true/false)
box   rectangular box in 2D plane
char(n) character(n) fixed-length character string
cidr   IP version 4 network or host address
circle   circle in 2D plane
date date calendar date without time of day
decimal decimal(p,s) exact numeric for p <= 9, s = 0
float4 float(p), p < 7 floating-point number with precision p
float8 float(p), 7 <= p < 16 floating-point number with precision p
inet   IP version 4 network or host address
int2 smallint signed 2-byte integer
int4 int, integer signed 4-byte integer
int8   signed 8-byte integer
interval interval general-use time span
line   infinite line in 2D plane
lseg   line segment in 2D plane
money decimal(9,2) U.S.-style currency
numeric numeric(p,s) exact numeric for p == 9, s = 0
path   open and closed geometric path in 2D plane
point   geometric point in 2D plane
polygon   closed geometric path in 2D plane
serial   unique ID for indexing and cross-reference
time time time of day
text   arbitrary length text (up to 8k for postgres 7)
timetz time with time zone time of day, including time zone
timestamp timestamp with time zone accurate high range, high precision date/time with zone
varchar(n) character varying(n) variable-length character string

38.6.2 Listing a table

The SELECT statement is the most widely used statement in SQL. It returns data from tables and can do searches:

 
 
 
 
finance=# SELECT * FROM PEOPLE;
 name | gender | address 
------+--------+---------
(0 rows)

38.6.3 Adding a column

The ALTER statement changes something:

 
 
 
 
5 
 
finance=# ALTER TABLE people ADD COLUMN phone text;
ALTER
finance=# SELECT * FROM people;
 name | gender | address | phone 
------+--------+---------+-------
(0 rows)

38.6.4 Deleting (dropping) a column

You cannot drop columns in postgres; you must create a new table from the old table without the column. How to do this will become obvious further on.

38.6.5 Deleting (dropping) a table

Use the DROP command to delete most things:

 
DROP TABLE people;

38.6.6 Inserting rows, ``object relational''

Insert a row with (you can continue typing over multiple lines):

 
 
 
finance=# INSERT INTO people (name, gender, address, phone)
finance-# VALUES ('Paul Sheer', true, 'Earth', '7617224');
INSERT 20280 1

The return value is the oid (Object ID) of the row. postgres is an Object Relational database. This term gets thrown around a lot, but it really means that every table has a hidden column called the oid column that stores a unique identity number for each row. The identity number is unique across the entire database. Because it uniquely identifies rows across all tables, you could call the rows ``objects.'' The oid feature is most useful to programmers.

38.6.7 Locating rows

The oid of the above row is 20280. To find it:

 
 
 
 
5 
finance=# SELECT * FROM people WHERE oid = 20280;
    name    | gender | address |  phone  
------------+--------+---------+---------
 Paul Sheer | true   | Earth   | 7617224
(1 row)

38.6.8 Listing selected columns, and the oid column

To list selected columns, try:

 
 
 
SELECT name, address FROM people;
SELECT oid, name FROM people;
SELECT oid, * FROM people;

It should be obvious what these do.

38.6.9 Creating tables from other tables

Here we create a new table and fill two of its columns from columns in our original table:

 
 
 
 
finance=# CREATE TABLE sitings (person text, place text, siting text);
CREATE
finance=# INSERT INTO sitings (person, place) SELECT name, address FROM people;
INSERT 20324 1

38.6.10 Deleting rows

Delete selected rows, like

 
 
finance=# DELETE FROM people WHERE name = 'Paul Sheer';
DELETE 1

38.6.11 Searches

About the simplest search you can do with postgres is

 
SELECT * FROM people WHERE name LIKE '%Paul%';

Or alternatively, case insensitively and across the address field:

 
SELECT * FROM people WHERE lower(name) LIKE '%paul%' OR lower(address) LIKE '%paul%';

The first % is a wildcard that matches any length of text before the Paul, and the final % matches any text after. It is the usual way of searching with a field, instead of trying to get an exact match.

The possibilities are endless:

 
SELECT * FROM people WHERE gender = true AND phone = '8765432';

38.6.12 Migrating from another database; dumping and restoring tables as plain text

Migrating from another database;
dumping and restoring tables as plain text

The command

 
COPY people TO '/tmp/people.txt';

dumps the people table to /tmp/people.txt, as tab delimeter, newline terminated rows.

The command,

 
COPY people WITH OIDS TO '/tmp/people.txt' DELIMITERS ',' WITH NULL AS '(null)'; 

dumps the people table to /tmp/people.txt, as comma-delimited, newline-terminated rows, with (null) whereever there is supposed to be a zero byte.

Similarly, the command

 
COPY people FROM '/tmp/people.txt';

inserts into the table people the rows from /tmp/people.txt. It assumes one line per row and the tab character between each cell.

Note that unprintable characters are escaped with a backslash  \  in both output and the interpretation of input.

Hence, it is simple to get data from another database. You just have to work out how to dump it as text.

38.6.13 Dumping an entire database

The command pg_dump <database-name> dumps your entire database as plain text. If you try this on your database, you will notice that the output contains straightforward SQL commands. Your database can be reconstructed from scratch by piping this output through stdin of the psql command. In other words, pg_dump merely produces the exact sequence of SQL commands necessary to reproduce your database.

Sometimes a new version of postgres will switch to a database file format that is incompatible with your previous files. In this case it is prudent to do a pg_dumpall (and carefully save the output) before upgrading. The output of pg_dumpall can once again be fed through stdin of the psql command and contains all the commands necessary to reconstruct all your databases as well as all the data they contain.

38.6.14 More advanced searches

When you have some very complicated set of tables in front of you, you are likely to want to merge, select, search, and cross-reference them in innumerable ways to get the information you want out of them.

Being able to efficiently query the database in this way is the true power of SQL, but this is about as far as I am going to go here. The postgres documentation cited above contains details on everything you can do.

38.7 Real Database Projects

University Computer Science majors learn about subjects like Entity Modeling, Relational Algebra, and Database Normalization. These are formal academic methods according to which good databases are designed. You should not venture into constructing any complex database without these methods.

Most university book shops will have academic books that teach formal database theory.


next up previous contents
Next: 39. smbd Up: rute Previous: 37. crond and atd   Contents