Reading DOOM WAD Files
WAD ("Where's All the Data") files, used by DOOM and various other games, are simple containers. They are similar to zip and other archive formats, but without additional complexity such as compression, and they are organised around named blocks of data rather than files. This article describes how to read the WAD files used by DOOM, DOOM II, Rise of the Triad and similar games of that era. Yes, I'm talking DOS and 1993, not the more modern reboots.
This article only covers reading a WAD and extracting its contents. It does not cover the format of the individual data within, given that the data is application dependent. With that said, I'll be covering the DOOM picture format in the next article.
In 2018 I looked into the MIX format used by the Command & Conquer games which is very similar to WAD but for reasons I don't recall I didn't end up writing a post about the format. Recently I finished reading Jimmy Maher's excellent series on DOOM and that reminded me I had wanted to look into WAD and other container formats for my own future use. As I have been completely unable to finish a single draft blog post I currently have, I decided something fresh and new (to me anyway!) was a good idea.
Although I don't normally plug other sites, Jimmy's blog The Digital Antiquarian is a fantastic blog of the games of yesteryear and I wish I could write half as well as him.
About WAD Formats
There are various formats of WAD file available, each building on the previous. This initial series of articles only covers the original version first introduced in DOOM. At the time of writing, I haven't looked at the other versions but I plan to look at some of them in future articles.
I have tested the code presented in this article with WAD files from Shareware DOOM, ULTIMATE DOOM, DOOM II and Rise of the Triad: Dark War.
The Format
The format is simple enough. There is a 12-byte header which details the WAD type, the number of lumps of data it contains, and an offset where the directory index is located.
Range | Description |
---|---|
0 - 3 | Either the string IWAD or PWAD |
4 - 7 | The number of entries in the directory |
8 - 11 | The location of the directory |
The directory index is comprised of (16 * number of lumps) bytes which describe the lumps. Each 16 byte header details the size, the position in the data and the lump name.
Range | Description |
---|---|
0 - 3 | The location of the lump |
4 - 7 | The size of the lump |
8 - 15 | The name of the lump, padded with NUL bytes |
As far as I know, the directory can be located anywhere in a WAD file, or at least anywhere after the header. All of the WADs I have examined have the directory at the end of the file which makes perfect sense from a serialisation standpoint, but there's no reason why it couldn't be elsewhere. The only rule is that all elements in the directory index must be contiguous.
All integer values are in little-endian format.
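To make the layout concrete, here is a minimal sketch that builds the smallest useful WAD in memory: a header, one lump, and a one-entry directory placed at the end. The names WadLayoutSketch, BuildSingleLumpWad and WriteInt32Le are hypothetical and are not part of the article's project. Placing the directory last is simply the convention described above, not a requirement.

```csharp
using System;
using System.Text;

public static class WadLayoutSketch
{
    // Hypothetical helper: builds a PWAD holding a single lump, laid out as
    // [12-byte header][lump data][16-byte directory entry]
    public static byte[] BuildSingleLumpWad(string lumpName, byte[] lumpData)
    {
        const int headerLength = 12;
        const int entryLength = 16;

        byte[] wad = new byte[headerLength + lumpData.Length + entryLength];
        int directoryStart = headerLength + lumpData.Length;

        // Header: signature, number of lumps, location of the directory
        Encoding.ASCII.GetBytes("PWAD", 0, 4, wad, 0);
        WriteInt32Le(wad, 4, 1);
        WriteInt32Le(wad, 8, directoryStart);

        // The lump data directly follows the header
        Buffer.BlockCopy(lumpData, 0, wad, headerLength, lumpData.Length);

        // Directory entry: lump location, lump size, then the name.
        // The array is zero-initialised, so a short name is already NUL padded.
        WriteInt32Le(wad, directoryStart, headerLength);
        WriteInt32Le(wad, directoryStart + 4, lumpData.Length);
        Encoding.ASCII.GetBytes(lumpName, 0, Math.Min(lumpName.Length, 8), wad, directoryStart + 8);

        return wad;
    }

    // Writes a 32-bit integer least significant byte first
    private static void WriteInt32Le(byte[] buffer, int offset, int value)
    {
        buffer[offset] = (byte)value;
        buffer[offset + 1] = (byte)(value >> 8);
        buffer[offset + 2] = (byte)(value >> 16);
        buffer[offset + 3] = (byte)(value >> 24);
    }
}
```

A three-byte lump therefore produces a 31-byte file, with the directory starting at offset 15.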
WAD Types
The first four bytes of the file header are either IWAD or PWAD, and this denotes the type of the WAD. The I prefix means this is an "internal" WAD, which is the main WAD for a game. The P prefix denotes a "patch" WAD, which allows a WAD to override the lumps from the main internal WAD, e.g. for providing custom levels, skins or other data.
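If you want to reject files that are not WADs at all, one option is to check all four signature bytes rather than just the first. The sketch below is an illustration only: WadKind and WadSignature are hypothetical names, chosen so they don't clash with the WadType enum used elsewhere in this article.

```csharp
using System;

// Hypothetical enum for this sketch
public enum WadKind
{
    Unknown,
    Internal,
    Patch
}

public static class WadSignature
{
    // Returns the kind of WAD described by the first four header bytes,
    // or Unknown if they are not IWAD or PWAD
    public static WadKind Identify(byte[] header)
    {
        if (header == null || header.Length < 4
            || header[1] != (byte)'W' || header[2] != (byte)'A' || header[3] != (byte)'D')
        {
            return WadKind.Unknown;
        }

        switch (header[0])
        {
            case (byte)'I':
                return WadKind.Internal;
            case (byte)'P':
                return WadKind.Patch;
            default:
                return WadKind.Unknown;
        }
    }
}
```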
Reading the Header
Reading the header is quite straightforward. First read the 12 bytes into a buffer and determine the WAD type from the first byte. Next, extract a 32-bit integer from each set of 4 bytes in the remainder of the header: the first holds the number of data lumps and the second the start of the directory listing.
Note: In the interests of clarity, parameter and data validation have been omitted from the snippets in this article.
private const byte _wadHeaderLength = 12;
private const byte _lumpCountOffset = 4;
private const byte _directoryStartOffset = 8;
private WadType _type;
private int _lumpCount;
private int _directoryStart;
private void ReadWadHeader(Stream stream)
{
    byte[] buffer;
    buffer = new byte[_wadHeaderLength];
    stream.Read(buffer, 0, _wadHeaderLength);
    // The signature is either IWAD or PWAD, so the first byte tells them apart
    _type = buffer[0] == 'I' ? WadType.Internal : WadType.Patch;
    _lumpCount = GetInt32Le(buffer, _lumpCountOffset);
    _directoryStart = GetInt32Le(buffer, _directoryStartOffset);
}
public static int GetInt32Le(byte[] buffer, int offset)
{
    // Assemble the value byte by byte, least significant byte first
    return buffer[offset + 3] << 24 | buffer[offset + 2] << 16 | buffer[offset + 1] << 8 | buffer[offset];
}
You could use the BitConverter.ToInt32 method, but BitConverter always uses the byte order of the machine it is running on. If this code were run on a big-endian system, it would interpret the bytes in big-endian order and return values that would be very wrong. This set of articles therefore uses its own code, which ignores the endianness of the system and always reads and writes little-endian.
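As an aside, if you are targeting a runtime that includes the System.Buffers.Binary namespace (I'm assuming .NET Core 2.1 or later, or the System.Memory package on older frameworks), the BinaryPrimitives class offers an endian-safe alternative to hand-rolled shifting. This is a sketch of that option, not what the article's project uses, and LittleEndianSketch and ReadInt32Le are hypothetical names.

```csharp
using System;
using System.Buffers.Binary;

public static class LittleEndianSketch
{
    // Reads a 32-bit little-endian integer regardless of the host byte order
    public static int ReadInt32Le(byte[] buffer, int offset)
    {
        return BinaryPrimitives.ReadInt32LittleEndian(new ReadOnlySpan<byte>(buffer, offset, 4));
    }
}
```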
Reading the Directory
Now that we know where the directory index is located in the file, we can read out the individual lump details. As with the WAD header, we declare a buffer big enough to hold a single directory entry, then read in the bytes. Using the same GetInt32Le method described earlier, we extract the position of the lump in the file and its size.
Next, we find the real length of the lump name, by starting at the end of the array and working back until we find a non-zero value. Once we have this length we call Encoding.ASCII.GetString to extract the name. Unfortunately, if we called this API without defining the true length, the returned string would include any NUL padding bytes.
private const byte _directoryHeaderLength = 16;
private const byte _lumpStartOffset = 0;
private const byte _lumpSizeOffset = 4;
private const byte _lumpNameOffset = 8;
private void LoadDirectory(Stream stream, int lumpCount, int directoryStart)
{
    byte[] buffer;
    stream.Seek(directoryStart, SeekOrigin.Begin);
    buffer = new byte[_directoryHeaderLength];
    // The entries are contiguous, so each read picks up where the last ended
    for (int i = 0; i < lumpCount; i++)
    {
        int offset;
        int size;
        string name;
        stream.Read(buffer, 0, _directoryHeaderLength);
        offset = GetInt32Le(buffer, _lumpStartOffset);
        size = GetInt32Le(buffer, _lumpSizeOffset);
        name = this.GetSafeLumpName(buffer);
        // Do something with the 3 values
    }
}
private string GetSafeLumpName(byte[] buffer)
{
    int length;
    length = 0;
    // Work back from the end of the entry to find the last non-NUL byte
    for (int i = _directoryHeaderLength; i > _lumpNameOffset; i--)
    {
        if (buffer[i - 1] != '\0')
        {
            length = i - _lumpNameOffset;
            break;
        }
    }
    return length > 0
        ? Encoding.ASCII.GetString(buffer, _lumpNameOffset, length)
        : null;
}
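One piece of validation worth calling out: Stream.Read is allowed to return fewer bytes than requested, even when more data is available. The snippets in this article ignore the return value for clarity. A small helper along the following lines could be used in their place. ReadFully and StreamSketch are hypothetical names, and I believe newer versions of .NET ship a built-in Stream.ReadExactly method that serves the same purpose.

```csharp
using System;
using System.IO;

public static class StreamSketch
{
    // Keeps reading until 'count' bytes have been stored in the buffer,
    // or throws if the stream ends first
    public static void ReadFully(Stream stream, byte[] buffer, int offset, int count)
    {
        while (count > 0)
        {
            int read = stream.Read(buffer, offset, count);

            if (read == 0)
            {
                throw new EndOfStreamException("Unexpected end of stream.");
            }

            offset += read;
            count -= read;
        }
    }
}
```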
About Names and Empty Data
Lump names may not be unique and can appear multiple times. For example, every DOOM map that I've looked at so far has a lump named THINGS, another named LINEDEFS and several more.
As a result, DOOM seems to make use of a uniquely named lump (e.g. E1M1) that serves no purpose other than to be a bookmark to a contiguous set of lumps that make up a feature (and sometimes another placeholder at the end if the lumps are dynamic). For placeholders, the lump size is set to zero, and the lump offset is either set to the offset of the next valid lump or again zero. This also means that, depending on the application using the WAD, lump order is important.
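To illustrate why order matters, here is a minimal sketch that finds a marker by name and collects the lumps that follow it. It assumes the directory has already been read into a list. LumpEntry, MarkerSketch and GetLumpsAfterMarker are hypothetical names, and treating the next zero-length entry as the end of the group is my assumption rather than a rule of the format. A genuinely empty data lump would cut the group short, so a real application would more likely rely on the lump names it expects.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container for the three values held in a directory entry
public sealed class LumpEntry
{
    public LumpEntry(string name, int offset, int size)
    {
        Name = name;
        Offset = offset;
        Size = size;
    }

    public string Name { get; }
    public int Offset { get; }
    public int Size { get; }
}

public static class MarkerSketch
{
    // Returns the entries following the named zero-length marker, stopping
    // at the next zero-length entry or at the end of the directory
    public static List<LumpEntry> GetLumpsAfterMarker(IReadOnlyList<LumpEntry> directory, string marker)
    {
        List<LumpEntry> result = new List<LumpEntry>();
        int index = -1;

        for (int i = 0; i < directory.Count; i++)
        {
            if (directory[i].Size == 0 && string.Equals(directory[i].Name, marker, StringComparison.Ordinal))
            {
                index = i;
                break;
            }
        }

        if (index >= 0)
        {
            for (int i = index + 1; i < directory.Count && directory[i].Size != 0; i++)
            {
                result.Add(directory[i]);
            }
        }

        return result;
    }
}
```

With this approach, two lumps both named THINGS are told apart purely by which marker they follow.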
Reading Lump Data
To read the actual data for a given lump, we would set the Position of our backing Stream to the lump offset and then only read data up to the length of the lump.
using (Stream stream = File.OpenRead(fileName))
{
    using (WadReader reader = new WadReader(stream))
    {
        WadLump lump;
        while ((lump = reader.GetNextLump()) != null)
        {
            byte[] buffer;
            buffer = new byte[lump.Size];
            stream.Position = lump.Offset;
            stream.Read(buffer, 0, buffer.Length);
        }
    }
}
This is error prone and means you have to know this information up front instead of being able to pass a Stream to another method. So for this case, I created an OffsetStream class which basically acts as a window into another stream, without being able to read data it shouldn't and without the caller needing to explicitly know about source boundaries.
internal sealed class OffsetStream : Stream
{
    private readonly int _length;
    private readonly int _start;
    private readonly Stream _stream;
    private long _position;
    public OffsetStream(Stream source, int start, int length)
    {
        _stream = source;
        _start = start;
        _length = length;
    }
    public override bool CanRead
    {
        get { return true; }
    }
    public override bool CanSeek
    {
        get { return true; }
    }
    public override bool CanWrite
    {
        get { return false; }
    }
    public override long Length
    {
        get { return _length; }
    }
    public override long Position
    {
        get { return _position; }
        set
        {
            if (value < 0 || value > _length)
            {
                throw new ArgumentOutOfRangeException(nameof(value), value, "Value outside of stream range.");
            }
            _position = value;
        }
    }
    public override void Flush()
    {
        // Nothing to flush, the stream is read-only
    }
    public override int Read(byte[] buffer, int offset, int count)
    {
        int read;
        read = 0;
        // Never read beyond the end of the window
        if (_position + count > _length)
        {
            count = _length - (int)_position;
        }
        if (count > 0)
        {
            // Reposition the source each time, as other code may have moved it
            _stream.Position = _start + _position;
            // The source may return fewer bytes than requested
            read = _stream.Read(buffer, offset, count);
            _position += read;
        }
        return read;
    }
    public override long Seek(long offset, SeekOrigin origin)
    {
        long value;
        switch (origin)
        {
            case SeekOrigin.Begin:
                value = offset;
                break;
            case SeekOrigin.Current:
                value = _position + offset;
                break;
            case SeekOrigin.End:
                // The offset is relative to the end, so it is normally zero or negative
                value = _length + offset;
                break;
            default:
                throw new ArgumentOutOfRangeException(nameof(origin), origin, "Invalid origin value.");
        }
        this.Position = value;
        return value;
    }
    public override void SetLength(long value)
    {
        throw new NotSupportedException();
    }
    public override void Write(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException();
    }
}
With this class in place, I can now get a Stream that only provides access to a specific lump's data with a call similar to the below.
public Stream GetInputStream()
{
    return new OffsetStream(_container, _offset, _size);
}
I can then dispose of this stream or pass it to another method (for example Image.FromStream) without needing to know or care that it is part of something bigger, and without affecting the container stream.
while ((lump = reader.GetNextLump()) != null)
{
    Image image = Image.FromStream(lump.GetInputStream());
}
Putting it all together
For this example, I created the WadReader class, which is a forward reading class for quickly enumerating the contents of a WAD. I also added a WadFile class which will load all the lump metadata into a collection for further use.
Using the WadReader
The WadReader is designed for quickly enumerating the contents of a WAD. It maintains enough state to know where it is in the WAD, but nothing else, expecting the consumer to take care of storing whatever information is required. This would be useful, for example, if you wanted to pull out one or more lumps for load on demand.
The WadReader class exposes a Type and Count property, and a GetNextLump method which can be used to enumerate. GetNextLump will return a valid object as long as there are items remaining, and null once it reaches the end of the file.
private static void WriteWadInfo(string fileName)
{
    using (Stream stream = File.OpenRead(fileName))
    {
        using (WadReader reader = new WadReader(stream))
        {
            WadLump lump;
            Console.WriteLine("File: {0}", fileName);
            Console.WriteLine("Type: {0}", reader.Type);
            Console.WriteLine("Lump Count: {0}", reader.Count);
            while ((lump = reader.GetNextLump()) != null)
            {
                Console.WriteLine("{0}: Offset {1}, Size {2}", lump.Name, lump.Offset, lump.Size);
                // stream.Position is also automatically set to the
                // start of the lump data, allowing me to do
                // stream.Read if required, or call lump.GetInputStream()
                // to get a stream to pass to other methods
            }
        }
    }
}
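The real WadReader lives in the GitHub project, but the idea can be sketched in a few dozen lines. The following is a simplified illustration and not the actual class: ForwardWadReader, LumpInfo and IsPatch are hypothetical names, the stream is assumed to be seekable, and validation is again omitted. Unlike GetSafeLumpName, this sketch returns an empty string rather than null for an unnamed lump.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical holder for the values in a directory entry
public sealed class LumpInfo
{
    public string Name;
    public int Offset;
    public int Size;
}

public sealed class ForwardWadReader
{
    private readonly Stream _stream;
    private readonly byte[] _entry = new byte[16];
    private readonly int _directoryStart;
    private int _index;

    public ForwardWadReader(Stream stream)
    {
        byte[] header = new byte[12];

        _stream = stream;
        _stream.Position = 0;
        Fill(header);

        IsPatch = header[0] == (byte)'P';
        Count = ReadInt32Le(header, 4);
        _directoryStart = ReadInt32Le(header, 8);
    }

    public bool IsPatch { get; }

    public int Count { get; }

    // Returns the next directory entry, or null when none remain
    public LumpInfo GetNextLump()
    {
        if (_index >= Count)
        {
            return null;
        }

        // Jump to the next directory entry and read it
        _stream.Position = _directoryStart + (_index * 16L);
        Fill(_entry);
        _index++;

        // Trim the NUL padding from the end of the name
        int length = 8;
        while (length > 0 && _entry[8 + length - 1] == 0)
        {
            length--;
        }

        LumpInfo lump = new LumpInfo
        {
            Offset = ReadInt32Le(_entry, 0),
            Size = ReadInt32Le(_entry, 4),
            Name = Encoding.ASCII.GetString(_entry, 8, length)
        };

        // Leave the stream positioned at the start of the lump data
        _stream.Position = lump.Offset;

        return lump;
    }

    // Reads until the buffer is full, as Stream.Read may return fewer bytes
    private void Fill(byte[] buffer)
    {
        int total = 0;

        while (total < buffer.Length)
        {
            int read = _stream.Read(buffer, total, buffer.Length - total);
            if (read == 0)
            {
                throw new EndOfStreamException();
            }
            total += read;
        }
    }

    private static int ReadInt32Le(byte[] buffer, int offset)
    {
        return buffer[offset + 3] << 24 | buffer[offset + 2] << 16 | buffer[offset + 1] << 8 | buffer[offset];
    }
}
```

Because the reader only remembers the index of the next entry, the caller is free to move the stream around between calls.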
Using the WadFile class
The WadFile class loads all the lumps (but not the actual data) into a collection so that they are always available. You can then pull out lump data at any point without having to re-read the directory, and the class provides convenience methods for more easily pulling out WAD data. It isn't as efficient as WadReader, but it is easier to use. It also supports write operations, whilst WadReader does not.
private void FillItems(string fileName)
{
    WadFile wadFile;
    wadFile = WadFile.LoadFrom(fileName);
    namesListBox.BeginUpdate();
    namesListBox.Items.Clear();
    namesListBox.Sorted = false;
    for (int i = 0; i < wadFile.Lumps.Count; i++)
    {
        namesListBox.Items.Add(wadFile.Lumps[i]);
    }
    if (_useNameSort)
    {
        namesListBox.Sorted = true;
    }
    namesListBox.EndUpdate();
}
Where's All The Source Code
There is no single download available for this sample. Rather than the simple demo I do for most blog posts, it is a slightly more complex solution covering reading, writing and various other features. The full project is available from our GitHub page.
Wrapping Up
The WAD format has no real features and so is simple to read and write. The linked GitHub page includes a demonstration program which allows WAD files to be opened and contents extracted.
Comments
Krapul
Hi ! First of all, thanks for the detailed def of a WAD. But... i was disappointed to find no tool using these data, for i'm no programmer. Is there a link i missed ? Or can you point to an existing tool that could dive informations like type (I/P wad), title, levels #s & titels, aso), like WinRAR displaying the content of an archive, but with specific infos. Many thanks again.