ACC2000: Database Bloats When Importing Large Text File (239527)



The information in this article applies to:

  • Microsoft Access 2000

This article was previously published under Q239527
Moderate: Requires basic macro, coding, and interoperability skills.

This article applies only to a Microsoft Access database (.mdb).

SYMPTOMS

When you import a large text file into a Microsoft Jet 4.0 database, the file size increases beyond the expected size, based on the size of the text file.

CAUSE

There are two possible causes for this problem, and you may be experiencing both.
  • Your database is using record-level locking. When importing a text file into a database, record-level locking is used incorrectly, which creates a separate 4 kilobyte page in memory for each record that is imported.
  • You are importing the text file into a new table. When you import text data into a new table, the Import wizard does not allow you to specify that the data be compressed. All data that is imported into Access is two-byte, unicode data. If unicode data is not compressed, it requires twice the usual space to store the same information.

RESOLUTION

If the import ran without errors but your database has experienced bloating, you can reclaim the extra space in your database by compacting the database. To compact a database, on the Tools menu, point to Database Utilities, and then click Compact and Repair Database. To avoid the problem in the future, you can follow these suggestions.

Record-level locking causes database bloat

Download the most recent version of the Microsoft Data Access Components 2.1 (MDAC 2.1). This behavior no longer occurs in MDAC version MDAC 2.1.2.4202.3 (GA), which you may also see referred to as MDAC 2.1 SP2, and later. For more information about the most recent MDAC 2.1, visit the following Microsoft Web site:

Unicode text data being imported uncompressed

Import the text file into an existing table where the UnicodeCompression property for each Text, Memo, and Hyperlink field is set to Yes (Default).

STATUS

Microsoft has confirmed that this is a problem in the Microsoft products that are listed at the beginning of this article.

MORE INFORMATION

Microsoft Access 2000 uses the unicode character-encoding scheme. In unicode, all characters are represented by two bytes, instead of a single byte as in earlier versions of Access. Setting the UnicodeCompression property to Yes offsets the extra space required for two-byte character representation when working with characters that have a 0 (zero) as the first byte. All Latin characters, which includes English, have a 0 (zero) as the first byte and will be compressed.

In addition to reducing the size of text data stored in your database, using unicode compression can also improve the performance of your database. Because the text data is stored compressed, more records can be read into memory at a time, which will reduce the number of times Access has to access the disk drive to get data. Fewer disk-drive reads means faster data access.

REFERENCES

For more information about unicode compression, click Microsoft Access Help on the Help menu, type customize how data is stored in the Office Assistant or the Answer Wizard, and then click Search to view the topics returned.

Modification Type:MajorLast Reviewed:6/29/2004
Keywords:kbbug KB239527