Sunday, September 14, 2008

Multi-file Database ?

Some last database repair cases we did were with multi-file databases. Interesting, that latest InterBase and Firebird versions was used, and also file system for the storage was NTFS, not FAT32 and FAT16. But, the databases was created and maintained using 1 gigabyte files.

If you still use InterBase 5 and less, stop reading this, please :-)

This is strange, because all InterBase and Firebird errors working with >4gb databases are gone, and also lot of people forgot about FAT32.
Some says that FAT32 is faster than NTFS, so they still use FAT32. Maybe this is true, but personally I don't see any reason to use old file system.
Moreover, once I had interesting case with FAT32: there was a 4gb logical drive with 3gb file. Maybe it was Windows 2000, but anyway, I couldn't copy this file to another logical drive with NTFS - operating system gave me an error when it tried to copy over 2 gigabytes of the file.

Well, maybe XP and Vista can copy 3gb files from FAT32 to NTFS, but multi-file databases have several disadvantages, that does not depend on file system:
  • some people think that multi-file database will have better performance than single-file database, on multi-processor/core computers and using InterBase SMP or Firebird Classic. This is wrong. Server does not do any "parallel" access to the secondary files. It treats multi-file database as one sequential file broken into parts. So, if table pages spreaded in several database files, table scan will scan these files one by one.
  • Multi-file databases have hard-links to the secondary files inside it, and it is impossible to move such a database from one logical disk to another (for example, from d: to e:). If you want to do this, you need to make backup and restore, or get hex-editor and hack this hardcoded drive letter and path in the header pages of the mult-file database. Single-file databases does not have this problem at all.
  • You need to watch over mult-file database's size, and from time to time, while database is growing, create additional secondary file with appropriate size. We've seen lot of databases with first files having some equal size, and the last file larger than other - DBA forgot to create additional secondary file at right time.
  • Secondary file size can be specified in 2 ways. Using STARTING AT, when you specify from what page number server will continue database in the next file, and using LENGTH, where you specify secondary file size in pages. The best way is to use LENGTH parameter for all files of the multi-tile database, including primary file and secondary files. But if you will start to mix STARTING AT and LENGTH you can loose understanding what size secondary files will have, and get multi-file database with different file sizes.
  • Backup and restore automated scripts must be updated from time to time to include additional secondary file, because there is no command-line parameters for gbak that could do automated restore of the database to create unknown number of secondary files. Each secondary file and it's size must be specified at the command line.
    This is really a headache for the DBA.
I think I gave you enough reasons to stop using multi-file databases, if you still do this. If you use InterBase 7, 2007, 2009, Firebird 1.0, 1.5, 2.x - at the nearest restore make your database single file, and forget about multi-file database completely.

upd: need to say that the well known (at least for us) database corruption for IB 5.x is when DBA forgets to add new secondary file, and the last secondary file comes to 4gb size.

1 comment:

Folami Nguyen said...

Thank you! Your post is useful!