This is an excellent question. Zero-length arrays, also known as flexible arrays, are used to implement variable-length arrays, primarily in structures.
That's a little confusing, so let's look at an example. Say you wanted a structure to represent an email:
- struct email {
- time_t send_date;
- int flags;
- int length;
- char body[EMAIL_BODY_MAX];
- };
Yes, we're missing a lot of fields. This is just an example; bear with me.
This structure has a problem: Each structure consumes those EMAIL_BODY_MAX
bytes, even if you have a one-word email. Unfortunately email size is highly variable: Some emails are but a single word while others are many paragraphs. Not only do we often end up wasting most of those EMAIL_BODY_MAX
bytes, but we have to make the constant large, too. Worse, there really isn't a limit on email body length, so we'd prefer not to impose one.
Zero-length arrays solve this:
- struct email {
- time_t send_date;
- int flags;
- int length;
- char body[];
- };
If you allocated this structure the usual way, you'd see body
doesn't consume any memory:
struct email *email = malloc (sizeof (struct email));
Here, body
is zero-length. You can't legally access any memory at body
. In fact, on Linux/x86-32 this structure is just 12 bytes in length, the size of the first three fields.
But that isn't how you'd allocate these. Instead, you'd do this:
- /* we have an email of email_length bytes ... */
- struct email *email =
- malloc (sizeof (struct email) + email_length);
- email->length = email_length;
Voila. We now have a structure with an extra length
bytes on the end. You can access body
as if it were the length-byte array body[length]
. If length
were, say, 16, then body
would be 16 bytes in length and our total structure would be 28 bytes. Thus, you can look at zero-length arrays as a pointer whose contents are inlined at itself.
An astute reader is now asking, Why not just return a pointer to an email body of dynamic length? If doing so is possible, that is absolutely preferred. Indeed, zero-length arrays are useful only in cases where you have a large structure, which contains a field of dynamic length, and you need to share that structure across program or even computer boundaries. For example, I used a zero-length array in the Linux kernel's implementation of inotify. It was actually the first zero-length array I had ever seen!
- struct inotify_event {
- int wd;
- uint32_t mask;
- uint32_t cookie;
- uint32_t len;
- char name[];
- };
This structure represents an inotify event, which is an action such as was written to and a target filename such as /home/rlove/wolf.txt
. You can see the problem: How big should I have made name
? Filenames can be any size. Worse, filesystems vary in their maximum filename length (PATH_MAX
isn't a limit, just a lazy man's constant). Now, if I was only returning the filename, I could have simply returned a dynamically-allocated char *
. But I had to return this giant structure. Moreover, I was implementing a system call, so I couldn't allocate pointers inside of the structure, since they wouldn't point at memory in the user's address space. My options were limited. A zero-length array was a perfect solution.
An inotify user would receive this structure via a read()
call on an inotify file descriptor. The structure is effectively variable in length. The user would read the len
parameter to find the size of name
and thus the total size of the structure they just read. The length field is required; otherwise the users of structures with zero-length arrays won't know the size of the variable-length array and thus the structure itself.
Bonus discussion: In this case, len
needn't be equal to strlen (name)
. Inotify lets you read multiple structures at once. This means I needed to pad the end of name
out so that the sizeof (inotify_event)
would properly align multiple structures if stored in an array. Consequently len
is usually larger than the filename as there is a handful of extra nul bytes, padding out the structure such that the next struct inotify_event
is itself correctly aligned.
Note: The type name[]
syntax is C99. GCC implemented zero-length arrays years prior to C99; the syntax was type name[0]
. The two are equivalent, but you should prefer the empty array operator as it is now standardized.