Yacoby's Morrowind

This is a backup of Ghostwheel's notes on the BSA file format. It was uploaded because Ghostwheel's site was down.

The only alterations were to adjust the table width so that it fitted on my site and to adjust some of the colours.

The Elder Scrolls 3 (Morrowind) BSA File Format

Overview

A BSA file is used by Morrowind to store various data files such as icons, meshes and textures (normally those files can be found in the Data Files\Icons, Data Files\Meshes and Data Files\Textures directories of the Morrowind installation). While it is possible to store other files in a BSA archive, it seems that Morrowind's engine will look only for those 3 types of files inside BSA archives.

In order to be used, a BSA file must be referenced in the [Archives] section of the morrowind.ini file. Also, in the [General] section of the same file, the TryArchiveFirst key should be set either to 0 (use whichever data is newer, from the archives or from the Data Files directory) or to 1 (use archives only).

File format

A BSA file can be split into 6 sections:

| Name | Description |
| --- | --- |
| 1. Header | Contains version number, total size of sections 2, 3 and 4 and the number of files |
| 2. File size / offset table | Contains size of each file and relative offset of the file data in the section 6 |
| 3. Archive directory offset table | Contains relative offset of each file name in the section 4 |
| 4. Archive directory | Contains file names |
| 5. Hash table | Contains 2 32-bit hash values for each file name |
| 6. File data | Contains actual file data |


The sections themselves are described below. All 4-byte fields should be treated as normal 32-bit unsigned integers (lower byte first, i.e. little-endian).
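
Because every field is stored lower byte first, a reader that should work regardless of the host's byte order can assemble each value from its four bytes. The following is a minimal sketch; the helper name read_u32_le is made up for illustration and does not come from any existing BSA tool.

```c
#include <stdint.h>

/* Decode a 32-bit unsigned integer stored lowest byte first.
   read_u32_le is a hypothetical helper name used only in this sketch. */
static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Applied to the bytes 00 01 00 00, this yields 0x100.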

Header section

The header section is always 12 bytes long. It contains the following information:


| Field name | Size (bytes) | Description |
| --- | --- | --- |
| Version | 4 | Always contains following hex sequence: 00 01 00 00 |
| Hash table offset (archive directory size) | 4 | Contains total size of the sections 2-4. If you will take this value and add header size (12) you will get offset of the hash table section from the beginning of the file. |
| Number of files | 4 | Number of files in the archive |
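
As an illustration of the header layout, here is a small sketch that parses the 12 header bytes from a memory buffer and derives the absolute position of the hash table. The struct and function names (bsa_header, bsa_parse_header, bsa_hash_table_pos) are invented for this example, and rejecting any other version value is an assumption based on the "always contains" wording above.

```c
#include <stddef.h>
#include <stdint.h>

#define BSA_HEADER_SIZE 12u
#define BSA_VERSION     0x00000100u  /* bytes 00 01 00 00, lower byte first */

/* Hypothetical in-memory form of the header; names are illustrative only. */
struct bsa_header {
    uint32_t version;
    uint32_t hash_table_offset;  /* total size of sections 2-4 */
    uint32_t file_count;
};

static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or the version differs. */
static int bsa_parse_header(const unsigned char *buf, size_t len,
                            struct bsa_header *out)
{
    if (len < BSA_HEADER_SIZE)
        return -1;
    out->version           = read_u32_le(buf);
    out->hash_table_offset = read_u32_le(buf + 4);
    out->file_count        = read_u32_le(buf + 8);
    return out->version == BSA_VERSION ? 0 : -1;
}

/* Offset of the hash table section from the beginning of the file. */
static uint32_t bsa_hash_table_pos(const struct bsa_header *h)
{
    return BSA_HEADER_SIZE + h->hash_table_offset;
}
```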

File size/offset section

This section contains the size of each file and its relative offset in the file data section. Each record in this section corresponds to the matching records in sections 3-5 (in other words, the first record in this section corresponds to the first file listed in the archive directory and to the first record in the hash table section, and so on). The section size is 8 bytes * number of files (from the header). Each record contains the following information:


| Field name | Size (bytes) | Description |
| --- | --- | --- |
| File size | 4 | File size |
| File offset | 4 | Relative offset of the file data in the section 6 (file data section) |
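
Since the records are fixed-size and start right after the 12-byte header, the record for a given file can be located by index. A sketch, assuming the whole archive is already in memory and the index has been checked against the file count (the names bsa_file_record and bsa_read_file_record are hypothetical):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type for this sketch. */
struct bsa_file_record {
    uint32_t size;    /* file size in bytes */
    uint32_t offset;  /* relative to the start of the file data section */
};

static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* buf points at the start of the archive; index is zero-based and must be
   smaller than the number of files from the header (not checked here). */
static struct bsa_file_record bsa_read_file_record(const unsigned char *buf,
                                                   uint32_t index)
{
    const unsigned char *p = buf + 12u + 8u * (size_t)index;
    struct bsa_file_record r;
    r.size   = read_u32_le(p);
    r.offset = read_u32_le(p + 4);
    return r;
}
```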

Archive directory offset section

This section contains the relative offset of each file name in the archive directory section. The section size is 4 bytes * number of files (from the header). Each record contains just a single field:


| Field name | Size (bytes) | Description |
| --- | --- | --- |
| File name offset | 4 | Relative offset of the file name in the archive directory section |


Archive directory section

This section contains the list of file names, one after another. Each file name includes the relative path of the file (as if it were in the Data Files\ directory). Each name is stored as ASCII characters (not Unicode) in C format (meaning that each string ends with a '\0' byte). Since file names differ in length, records in this section have different sizes (the number of characters in the file name plus 1 for the terminating '\0'). To find a particular file name, use the previous section, since it contains the offset at which each file name is stored. Please also note that all file names are stored in lowercase.

The total size of this section can be calculated as (archive directory size field from the header section) - (size of the file size/offset section) - (size of the archive directory offset section). In other words, (archive directory size field from the header section) - 12 * (number of files).
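
The arithmetic above can be collected into one place. The sketch below derives the position of every section from the two header fields and then looks up a file name through the offset table. All names (bsa_layout, bsa_compute_layout, bsa_file_name) are invented for this example; the sanity checks are my own additions, and the lookup trusts the stored offsets and assumes the archive is fully loaded in memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Absolute positions of the sections; hypothetical type for this sketch. */
struct bsa_layout {
    uint32_t size_table_pos;    /* section 2 */
    uint32_t name_offsets_pos;  /* section 3 */
    uint32_t directory_pos;     /* section 4 */
    uint32_t directory_size;    /* header value - 12 * number of files */
    uint32_t hash_table_pos;    /* section 5 */
    uint32_t data_pos;          /* section 6 */
};

static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the header values cannot be consistent. */
static int bsa_compute_layout(uint32_t hash_table_offset, uint32_t file_count,
                              struct bsa_layout *out)
{
    uint64_t tables = 12ull * file_count;  /* sections 2 and 3 together */
    uint64_t end = 12ull + hash_table_offset + 8ull * file_count;

    if (tables > hash_table_offset || end > UINT32_MAX)
        return -1;
    out->size_table_pos   = 12u;
    out->name_offsets_pos = 12u + 8u * file_count;
    out->directory_pos    = 12u + (uint32_t)tables;
    out->directory_size   = hash_table_offset - (uint32_t)tables;
    out->hash_table_pos   = 12u + hash_table_offset;
    out->data_pos         = (uint32_t)end;
    return 0;
}

/* Returns the NUL-terminated name of file number index (zero-based). */
static const char *bsa_file_name(const unsigned char *buf,
                                 const struct bsa_layout *l, uint32_t index)
{
    uint32_t rel = read_u32_le(buf + l->name_offsets_pos + 4u * (size_t)index);
    return (const char *)(buf + l->directory_pos + rel);
}
```

For an archive with 2 files whose names take 12 bytes in total, the header field would be 36, the directory would start at byte 36 and the file data at byte 64.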

Hash table section

This section contains two 32-bit hash values for each file name (not the file data!) in the archive. The section size is 8 bytes * number of files. Each record contains two fields:


| Field name | Size (bytes) | Description |
| --- | --- | --- |
| Hash value 1 | 4 | 1st hash value (calculated based on first half of the file name string) |
| Hash value 2 | 4 | 2nd hash value (calculated based on the second half of the file name string) |


The algorithms for calculating the hash values can be found below.

File data section

This section contains the actual data from the files. Each file is stored in the archive as is; no compression is used. File data is stored one file after another. To retrieve the data for a particular file, use its file size/offset record, which gives the offset in this section where the file data starts and the size of the file.
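
Putting the section sizes together, the file data section begins at 12 + (hash table offset field) + 8 * (number of files), and a file's data begins at that position plus its relative offset. A minimal sketch of the lookup, with hypothetical function names and an added bounds check that is not part of the format itself:

```c
#include <stddef.h>
#include <stdint.h>

/* Absolute position of a file's data, from the two header fields and the
   relative offset stored in the file's size/offset record. */
static uint64_t bsa_file_data_pos(uint32_t hash_table_offset,
                                  uint32_t file_count, uint32_t file_offset)
{
    return 12u + (uint64_t)hash_table_offset
               + 8u * (uint64_t)file_count + file_offset;
}

/* Returns a pointer to the file's bytes inside an archive held in memory,
   or NULL if the record points past the end of the buffer. */
static const unsigned char *bsa_file_data(const unsigned char *buf, size_t len,
                                          uint32_t hash_table_offset,
                                          uint32_t file_count,
                                          uint32_t file_offset,
                                          uint32_t file_size)
{
    uint64_t pos = bsa_file_data_pos(hash_table_offset, file_count, file_offset);
    if (pos + file_size > len)
        return NULL;
    return buf + (size_t)pos;
}
```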

Hash value algorithms

As noted above, two hash values must be calculated for each file name. The file name is split in half (the first half is either the same length as the second or 1 byte shorter), and a hash value is calculated separately for each half. The code below uses C notation to calculate both values.


Variables used:


char *name; // file name string (in C format)
unsigned hash1; // first hash value
unsigned hash2; // second hash value


The following code calculates both hash values (I tried to make it as readable as possible for non-C folks; the += and ++ operators are avoided on purpose):


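unsigned full_len = strlen(name); // string length without the final '\0' (strlen is declared in <string.h>)
unsigned half_len = (full_len >> 1); // length of the first half (the shorter one for odd lengths)
unsigned sum, off, temp, i, n;
sum = off = 0;

// first hash: XOR the bytes of the first half together, each shifted by 0, 8, 16 or 24 bits in turn
for(i = 0; i < half_len; i = i + 1) {
    temp = ( ((unsigned)(name[i])) << (off & 0x1F) );
    sum = sum ^ temp;
    off = off + 8;
}
hash1 = sum;

// second hash: same as above, but the sum is also rotated after every byte
sum = off = 0;
for(; i < full_len; i = i + 1) {
    temp = ( ((unsigned)(name[i])) << (off & 0x1F) );
    sum = sum ^ temp;
    n = temp & 0x1F;
    // "rotate right" operation; the extra & 0x1F avoids an undefined shift by 32 bits when n is 0
    sum = (sum << ((32-n) & 0x1F)) | (sum >> n);
    off = off + 8;
}
hash2 = sum;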


For those who are not familiar with C: << means bitwise left shift, >> means bitwise right shift, & is bitwise AND, | is bitwise OR and ^ is bitwise XOR. All unsigned variables are assumed to be 32 bits wide.
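
To make the 32-bit assumption explicit, the same calculation can be wrapped in a function that uses fixed-width types. This is only a sketch: the function name bsa_hash is made up, each byte is read as unsigned char (an assumption that only matters for non-ASCII bytes), and the shift count in the rotate is masked so that rotating by 0 stays well defined.

```c
#include <stdint.h>
#include <string.h>

/* Computes both hash values for a lowercase archive file name.
   bsa_hash is a hypothetical name used only in this sketch. */
static void bsa_hash(const char *name, uint32_t *hash1, uint32_t *hash2)
{
    size_t full_len = strlen(name);
    size_t half_len = full_len >> 1;
    uint32_t sum = 0, off = 0, temp, n;
    size_t i;

    /* First half: XOR of the bytes, each shifted by 0, 8, 16 or 24 bits. */
    for (i = 0; i < half_len; i++) {
        temp = (uint32_t)(unsigned char)name[i] << (off & 0x1F);
        sum ^= temp;
        off += 8;
    }
    *hash1 = sum;

    /* Second half: same XOR, followed by a rotate right by the low 5 bits
       of the shifted byte. */
    sum = off = 0;
    for (; i < full_len; i++) {
        temp = (uint32_t)(unsigned char)name[i] << (off & 0x1F);
        sum ^= temp;
        n = temp & 0x1F;
        sum = (sum << ((32 - n) & 0x1F)) | (sum >> n);
        off += 8;
    }
    *hash2 = sum;
}
```

Worked by hand, the name "ab" gives hash1 = 0x61 and hash2 = 0x80000018: the first half is the single byte 0x61, and the second half is 0x62 rotated right by 0x62 & 0x1F = 2 bits.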

Record ordering

All records in the BSA file sections must be placed in a specific order. As mentioned before, records in sections 2-5 (file size/offset, directory offset, archive directory and hash table sections) are correlated: the records related to one file must be at exactly the same position in all of those sections. All records in these sections must be sorted so that the (hash value 1, hash value 2) pairs end up in ascending order. If this is not done, Morrowind will not recognize the files in the archive.
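
One way a packer might establish this order is to keep all per-file information in one array and sort it before writing sections 2-5. The sketch below reads "sorted in ascending order" as comparing hash value 1 first and using hash value 2 as the tie-breaker; that reading is an assumption, as are the names bsa_entry, bsa_entry_cmp and bsa_sort_entries.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-file entry; a real packer would also carry the file
   size and data offset here so that all sections stay in step. */
struct bsa_entry {
    uint32_t hash1;
    uint32_t hash2;
    const char *name;
};

/* qsort comparator: ascending by hash1, then by hash2 (assumed order). */
static int bsa_entry_cmp(const void *a, const void *b)
{
    const struct bsa_entry *x = a;
    const struct bsa_entry *y = b;

    if (x->hash1 != y->hash1)
        return x->hash1 < y->hash1 ? -1 : 1;
    if (x->hash2 != y->hash2)
        return x->hash2 < y->hash2 ? -1 : 1;
    return 0;
}

static void bsa_sort_entries(struct bsa_entry *entries, size_t count)
{
    qsort(entries, count, sizeof entries[0], bsa_entry_cmp);
}
```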


Also, in the standard Morrowind files (morrowind.bsa, tribunal.bsa) the file data section is sorted as well, but there the files are placed in alphabetical order of their names. I am not sure whether this is necessary, but since I wanted to reproduce the exact archive after an unpack/pack cycle, I think that this ordering is also important.

Other notes

One BSA archiver implementation with uncommented C++ source can be found at http://www34.brinkster.com/ghostwheel/bsapack.zip.

[It can also be found on Planet Elder Scrolls]