PostgreSQL 7.2 upgrade and crypto
I upgraded to PostgreSQL 7.2 earlier this week. I had a few problems, but they were really annoying. The lesson is: observe, think, conclude.
Your first port of call should be the HISTORY file which comes with the PostgreSQL source.
The next step should be a backup of your data:
pg_dump dbname > dbname.pgdump
Then, and only then should you even consider installing 7.2.
Configuration
Much of this information is contained in my original PostgreSQL article so I will only briefly describe what I did.
As root, I issued the following command to create a new PostgreSQL database cluster:
# su -l pgsql /usr/local/bin/initdb /usr/local/pgsql/data
Then I started the database daemon:
# /usr/local/etc/rc.d/010.pgsql.sh start
NOTE: in recent versions of PostgreSQL (e.g. 8.2), the command is:
# /usr/local/etc/rc.d/postgresql start
Then I created myself a new PostgreSQL user account:
# su pgsql
$ createuser dan
As dan, I issued the following command:
$ createdb dbname
Then I loaded the database:
psql dbname < dbname.pgdump
That’s when the problems started.
Timestamps
The first problem occurred during the loading of the data back into a new database:
ERROR: Bad timestamp external representation 'current_timestamp'
The solution is to edit my data and remove the quotes from
current_timestamp
. I did
this using my favorite editor, joe.
This error occurred next:
ERROR: copy: line 2537, Bad timestamp external representation 'current'
lost synchronization with server, resetting connection ^C
This was a problem with my data. I had a timestamp field which contained the value ‘current’. I modified the data within my 7.1 database, pg_dump’d the data, and then did another import. This time the data import worked.
Relocated files
The PostgreSQL libraries move from /usr/local/lib/ to /usr/local/lib/postgresql/ with this upgrade. This should not cause you any problems. Unless you are using features such as stored procedures. This one had me stumped for quite some time.
The error which occurred during data loading is:
ERROR: stat failed on file '/usr/local/lib/plpgsql.so': No such file or directory
The problem here is that the file actually resides at
/usr/local/lib/postgresql/plpgsql.so
. Again, as with
the timestamp issue, this is a data problem. The fix is to modify my data file to
change the path. The line in question was:
$ grep plpgsql.so fp2migration.sql
CREATE OR REPLACE FUNCTION "plpgsql_call_handler" () RETURNS opaque AS 'plpgsql', 'plpgsql_call_handler' LANGUAGE 'C';
Note that it is no longer necessary to use a full path to the lib.
I also encountered another error message:
This was caused by trying to load a 7.1 plpgsql.so into 7.2. I was greeted with this message when I had the incorrect path in the function plpgsql_call_handler (see above). As mentioned above, the path is no longer necessary.ERROR: SearchSysCache: Bad cache id 27
I originally did a delete function, then a create function. That rendered my functions unusable. I drop’d the database and then reloaded it. The error I was getting was:
ERROR: fmgr_info: function 16557: cache lookup failed
Groups
My original pg_dump did not include the groups and users I had created. That was my error. I should have used pg_dumpall to get that data. See man pg_dumpall:
pg_dumpall -g dbname
I think I’ll start using pg_dumpall much more often as part of my backup strategy. Remember: users
and groups are global and can affect more than one database.
The Crypto
If you did a make clean
after you installed
PostgreSQL, you’ll need to run make
again before
building this add-on.
There is a cryptography library included with PostgreSQL. To install it, I followed these instructions
as found in the README.pgcrypto in the directory indicated below. Note that the 7.2
part of the directory may change depending upon which version you are using.
# cd /usr/ports/databases/postgresql7/work/postgresql-7.2/contrib/pgcrypto
edit Makefile and specifycryptolib = openssl
# gmake
# gmake install
If you haven’t already done so, you should then install PostgreSQL and start it.
The crypto library will install three things:
- /usr/local/share/postgresql/contrib/pgcrypto.sql – sql used to create the crypto stored procedures
- /usr/local/share/doc/postgresql/contrib/README.pgcrypto – the documentation
- /usr/local/lib/postgresql/pgcrypto.so – the library
To access the library functions, you must load the stored procedures into your database. This step will not be necessary if you are setting up a new PostgreSQL server and you will be restoring a dump of a database which already contains these functions.
psql db_name < /usr/local/share/postgresql/contrib/pgcrypto.sql
Please read the documentation supplied. But here’s an example which you can use to confirm that the functions and library are functioning:
fp2migration=# select DIGEST('adfsd', 'md5'); digest --------------------------------------------- \374\373J|Nm\365\2314\224'\243m\026\314\035 (1 row)
PHP problems (added on 27 February 2002)
Today a PHP exploit was announced. After implementing the recommendations within that advisory (i.e. a change to php.ini), I restarted Apache, which refused to restart:
Syntax error on line 240 of /usr/local/etc/apache/httpd.conf:
Cannot load /usr/local/libexec/apache/libphp4.so into server: /usr/local/libexec/apache/libphp4.so: Undefined symbol "pg_encoding_to_char"
The line in question contained:
LoadModule php4_module libexec/apache/libphp4.so
The clue to what needs to be done is in the undefined symbol (i.e. pg_
). I recompiled
and installed php4:
/usr/ports/www/mod_php4
make
pkg_delete -f mod_php4-4.1.1
make install
make clean
I could have used portupgrade. I did the make first to ensure the port built
successfully. Then, and only then did I delete the existing php installation.
Soundex/Fuzzy matching (added on 18 March 2002)
If you did a make clean
after you installed
PostgreSQL, you’ll need to run make
again before
building this add-on.
There is a SOUNDEX/fuzzy matching library included with PostgreSQL. It allows you to search for values which sound similar to a given value. To install it, I followed these instructions as found in the README.fuzzystrmatch in the directory indicated below:
# cd /usr/ports/databases/postgresql7/work/postgresql-7.2/contrib/fuzzystrmatch
# gmake
# gmake install
If you haven’t already done so, you should then install PostgreSQL.
The crypto library will install three things:
- /usr/local/share/postgresql/contrib/fuzzystrmatch.sql- sql used to create the SOUNDEX/fuzzy match stored procedures
- /usr/local/share/doc/postgresql/contrib/README.fuzzystrmatch – the fuzzy match documentation
- /usr/local/share/doc/postgresql/contrib/README.soundex – the soundex documentation
- /usr/local/lib/postgresql/fuzzystrmatch.so – the library
To access the library functions, you must load the stored procedures into your database. This step will not be necessary if you are setting up a new PostgreSQL server and you will be restoring a dump of a database which already contains these functions.
psql db_name < /usr/local/share/postgresql/contrib/fuzzystrmatch.sql
Please read the documentation supplied. But here’s an example which you can use to confirm that the functions and library are functioning:
freshports=# select levenshtein('MacDonald', 'McDonald'); levenshtein ------------- 1 (1 row) freshports=# select levenshtein('MacDonald', 'Smith'); levenshtein ------------- 9 (1 row) freshports=#pgAdmin II - Accessing the database remotely (added on 13 August 2002)
I have finally given in. I'm going to try a Windows GUI for my database. After looking at the list of related PostgreSQL projects, I decided to try pgAdmin II. But in order to use this application, I need to tell PostgreSQL to accept connections from other boxes. I will do this in two steps:
- Allow remote connections
- Restrict remote connections to a specific IP address
According to the Run-time configuration information, I need to specify
TCPIP_SOCKET = true
in my~pgsql/data/postgresql.conf
file. After making this change, I stopped, then restarted the postmaster:# /usr/local/etc/rc.d/010.pgsql.sh stop
# /usr/local/etc/rc.d/010.pgsql.sh startNOTE: in recent versions of PostgreSQL (e.g. 8.2), the command is:
# /usr/local/etc/rc.d/postgresql stop # /usr/local/etc/rc.d/postgresql start
I verified that incoming connections were being accepted:
[dan@bast:~] $ telnet undef 5432 Trying 192.168.0.56... Connected to undef.unixathome.org. Escape character is '^]'. lkjadf EFATAL 1: invalid length of startup packet Connection closed by foreign host.
This takes care of the first part of step 1. The next step is to ensure that the user can log in. For this, I read
Client Authentication. I discovered I need to alter
~pgsql/data/pg_hba.conf
. I added the following line:
host all 192.168.0.99 255.255.255.255 trust
This means to trust all incoming connections from 192.168.0.99. This may not be what you want. Please read the documentation. In my situation, that IP address can be trusted. You decide what is right for you. After making this change, I had to restart the postmaster (see above for how I did it before).
If you see the error:
...then you either forget to add the entry to theNo pg_hba.conf entry for host 192.168.0.99, user dan, database template1
pg_hba.conf
or restart the postmaster.
Installing pgAdmin II was pretty simple. I then started it and did a File->Connect. I supplied the IP address of the box on which PostgreSQL was running, and the username. Then I clicked connect. So pgAdmin II seems like a good tool.
Here are some other things to consider with a Postgres 7.2 upgrade:
First, the "README" file mentions several incompatibilities with past versions. My first order of business was to check existing application code for these. In addition the ones listed in the "README" file, I also found that "interval" suffered the same problem as the "timestamp" function. This command goes most of the way to find most of those problems on my system, including checking for "= NULL", "’current’", "timestamp()", and "interval()". Grepping for the "limit" syntax change was harder. I’ll leave that as an exercise for the reader.
locate .{pm,pl,cgi,sql} | grep ‘/home’ | xargs egrep "= NULL|’current’|timestamp\s*\(|’interval\s*\("
For very large dump files, you may be interested in compressing and decompressing the the dump file on the fly as you handle. For example:
pg_dumpall | gzip > filename.dmp.gz
Here’s a problem I ran into while importing, and my solution:
– problem: ‘current’ appears in dump from legacy applications
– solution: perl -pi.bak -e "s/’current’/CURRENT_TIMESTAMP/" file.dmp
For those who plan to upgrade to Postgres 7.2 at any point in the future, you can go ahead and start writing compatible code with these tips:
* Avoid "= NULL". Use "IS NULL" instead.
* Avoid interval() and timestamp() functions, as well as casting by using the "::" notation. Use
the SQL standard method instead:
CAST(expression AS type)
* When using "LIMIT" with an OFFSET, always use the syntax "LIMIT n OFFSET n"
* Also be on the lookout for: If an input string is larger than the field size, Postgres will return an error in 7.2, instead of silently truncating as before.
I recently upgraded to PostgreSQL 7.2 and had similar problems. Because I didn’t use dumpall, I had to recreate my users and groups.
While pg_dumpall will dump all the relevant data, it doesn’t support dumping large objects, which is fine for most people who don’t use such complex objects in a database.
Personally, I would use `pg_dumpall -g` to dump the global information, and then use pg_dump to dump my inidividual databases to separate files. I do not have many databases, so this is a viable option. This allows me to keep separate files with only the relevant data for a given database. One database in particular that I have is for a school project that needs more care taken with it than my own personal databases.
Also, it is preferable to use one of the non-script outputs. The -F c parameter to pg_dump will output PostgreSQL native format that is compressed by default and also allows for reordering of data/schema loading. -F t produces a tar’d output that is an inbetween the benefits of a script and the -F c format. Both the tar and custom output can handle large objects.
If you need to inspect the data with grep, you can use pg_restore without a database argument, and it will produce a script that you could use to restore the data or make changes.
As you can see, it makes quite a difference in size between the different formats.
1187696 Feb 26 08:35 test1.dump (Normal dump, no options)
1213952 Feb 26 08:32 test1.tar (-F t dump)
314581 Feb 26 08:32 test1.psql (-F c dump)
297169 Feb 26 08:34 test1.gz (Gzipped normal dump)
298300 Feb 26 08:32 test1.psql.gz (gz’d -F c dump)
300667 Feb 26 08:32 test1.tar.gz (gz’d -F t)
gzipping the -F c dump was the smallest, but it wasn’t a huge gain from the normal -F c dump size.
While the alternate format outputs aren’t as portable, they are ideal for backups. You do not need to pass the same -F parameter to pg_restore as you did with pg_dump, it can determine the type automatically.
Jim
Reading this article reminded me about the book mentioned in this message’s title. It was printed in late January and I picked it up either the last day of January or very early February when I just happened to be browsing in a computer bookshop.
In fact, only the included CD (which had v7.1.3 – I’m running 7.1.2) alerted me to the newness of the book.
I’d been waiting for a book on postgres as I’d been using it for a while but had been teaching myself from scratch, as I generally do. I’d figured out triggers, stored procedures, user functions, sequences …, all those "proper" database functions that sets postgres apart from other OSS DBMSs. However there were a few things I was (and am) unsure of. My user functions are nothing special (indeed I’m having trouble with them) and experienced DBAs would probably cringe at some of my rules and procedures.
But here was a book at last – by O’Reilly no less, and written by some postgres consultants. Surely it had to be worth my cash (and 40% off too!)
Well, no.
I would NOT recommend this book to anyone. Yup, not to anyone, not even total newbies. OK, maybe they would appreciate having the basics in one place but there is NOTHING in this book that you can’t find in the official documentation, in a quick mailing list or newsgroup search or on any half-decent tech site.
Their "advanced" section, which I was eager to read, showed how to create a function (and load it into the database) that returned true if either of its two arguments were zero – or something equally trivial. They might as well skip the subject altogether. I wanted to know how you would go about working with various postgres internals in your C code or if there are any particular tricks and traps (say, related to memory) to be aware of.
That sort of information is what you would expect in an O’Reilly book but it isn’t what I found.
I would say that anyone reading this article on FreeBSD Diary because they have some knowledge of the subject matter, rather than just because they’re bored, would be likely to gain as little from this book as I did.
For those that might be interested in this book, it is availble, in it’s entirety, online. Check the CommandPrompt website: http://www.commandprompt.com/ppbook/
Check out portupgrade (sysutils/portupgrade) as an alternative to make deinstall… amongst other things, when it performs the upgrade, it makes a backup of the existing port. portversion is also very very cool: I run that nightly to see what I need to upgrade…. same with pkgdb -F… -sc
I don’t use and do not reccommend make deinstall.
I always use pkg_delete.
You should also switch ‘random = silly’ to ‘random = dev’ in the Makefile to take advantage of /dev/urandom. I just installed this module yesterday when porting phpshop to PostgreSQL to replace MySQL’s ENCODE and DECODE commands.
SELECT ENCODE(DIGEST(‘uga’,’md5′), ‘hex’);
That’s an easy way to change the results of the MD5 to a more traditional hex encoded form (32 characters).
Hello All
Please help me how to solve the below mentioned problem.
I’m using postgres 7.2 on linux.
When I tried to fire the below statement it gives me this type of curious error.
Please mail me urgenetly. I’m in bad demand of solving this problem.
prodnut=# select populate_rule_data() as output;
ERROR: cache lookup failed for type 0
Viral Dave
My guess: One of the functions upon which populate_rule_data has changed. start using "create or replace function" instead of "create function". Recreate the function populate_rule_data.