Postgres 8.4 - Table size inflates
Herouth Maoz
herouth at SPAMCOP.NET
Sat Dec 22 14:29:06 IST 2012
On 18/12/2012, at 13:59, david harel wrote:
> Greetings,
>
> Using Postgres v-8.4 for 3 years already, I now face a situation where one
> single table inflates enormously.
> Before vacuum the size was 1.5 GB for the table and 0.5 GB for the indexes.
>
> I tried to run CHECKPOINT in plsql (that's all I understood from reading:
> http://www.depesz.com/2011/07/14/write-ahead-log-understanding-postgresql-co
> nf-checkpoint_segments-checkpoint_timeout-checkpoint_warning/)
>
> After full vacuum and reindex the table reduces it's size to 23MB and
> indexes reduce to 44MB.
>
> Typical usage of the table is insert.
> We never delete anything off that table and rarely we update it.
> Also we get key violations due to bad input also on rare occasions.
> Also we recently upgraded the hardware (OS (ubuntu 10.4 server eddition)
> runs on a virtual machine, copy to new hardware was using virtual machines
> snapshot).
> I don't know where to begin the investigation on the topic.
OK, first, table size has nothing to do with checkpoint segments and all that. If you are getting warning about "too many checkpoints are occuring" in your PostgreSQL log, it's a separate issue which requires proper memory management.
PostgreSQL works in a way that allows full, serializable transactions, and to do that, "dead tuples" are created. Typically, this would happen when you update or delete records. Another situation which may take up table size with dead records is using COPY to insert data, where the copy fails. The records that were inserted up to that point are marked "dead" because the whole COPY is considered one transaction so it's all-or-nothing. Failed triggers and constraint violations etc., cause the same thing.
Basically, those dead tuples are marked "unusable" but still take up space. Then a VACUUM reclaims them for re-use. There are, however, two types of VACUUM. The normal version of VACUUM, which is also used by the auto-vacuum daemon, simply marks the tuples as re-usable. This prevents the need to lock the table and allows other operations to continue normally. However, this means the space is not reclaimed to the operating system and the size of the table does not decrease. New inserted tuples, however, will not take up new space from the operating system but use the reclaimed space.
The other type of VACUUM is VACUUM FULL, which locks the table and reclaims the space back into the operating system.
Versions of PostgreSQL before 9.x have an issue of table bloat. This happens when the auto vacuum - or even manual vacuum - cannot reclaim the space, because it can't map all the tuples properly into a hash map, which is part of the process. You'll typically get a warning saying "relation "public.foos" contains more than "max_fsm_pages" pages with useful free space. HINT: Consider using VACUUM FULL on this relation or increasing the configuration parameter "max_fsm_pages"." VACUUM then gives up, and the tuples are never reclaimed, and as more tuples are inserted and updated, more and more unused tuples are created - which in turn deteriorates system functionality. There are ways to prevent this, by increasing the max_fsm_pages parameter, or by doing an occasional cleanup on tables that frequently bloat. Starting in version 9, PostgreSQL uses a different method and this type of bloat is less likely to happen.
By the way, a VACUUM FULL is not recommended. It's a very slow process, which leaves the table locked for a long while and consumes lots of resources. If you need to give your table a good diet, there are other ways - such as recreating the table from scratch. If you have enough disk space to keep two of the table, it's better to do it this way, and I found that the easiest way is to do an ALTER TABLE CLUSTER ON index_name, where index_name is the table's most frequently used index. This command is actually used to optimize the table for ORDER BY queries with that index. But what it actually does is re-order all the in-use tuples of the table in a new space, and then throw out the old tuples, which amount to the same thing as VACUUM FULL in a lot less time (but more space used while the operation takes place).
A similar issue can happen with the indices, and you can run a REINDEX to recreate them.
HTH,
Herouth
More information about the Linux-il
mailing list