The separator between keywords in PDF metadata

I cannot find any official documentation on whether the keywords and keyword phrases in the metadata of a PDF file should be separated by a comma alone or by a comma followed by a space.

The following example demonstrates the difference:

  • keyword,keyword phrase,another keyword phrase
  • keyword, keyword phrase, another keyword phrase

Any high-quality references?

The online sources I found are of low quality. For example, an Adobe Press web page says "keywords must be separated by commas or semicolons", but its example shows a semicolon followed by a space before the first keyword and between each pair of adjacent keywords. The example contains no keyword phrases.

1 answer

  • answered 2017-06-17 19:28 Mitch

    The Keywords metadata field is a single text string, not a list, so the PDF format itself does not prescribe a separator. You can choose whichever looks best to you. A search engine that processes the keyword data may have its own preferences, but I would imagine that either a comma or a semicolon works with most modern search engines.

    Reference: PDF 32000-1:2008 on page 550

    ExifTool, for example, splits the value on commas; if it finds no comma, it splits on whitespace instead:

    # separate tokens in comma or whitespace delimited lists
    my @values = ($val =~ /,/) ? split /,+\s*/, $val : split ' ', $val;
    foreach $val (@values) {
        $et->FoundTag($tagInfo, $val);
    }
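
To see what that splitting rule means for the two spellings in the question, here is a rough Python re-expression of the Perl snippet above. It is only a sketch: the function name `split_keywords` is made up for illustration, and it mirrors the quoted lines alone, not anything else ExifTool may do with the field.

```python
import re


def split_keywords(value: str) -> list:
    """Split a Keywords string the way the quoted Perl lines do (hypothetical helper)."""
    if "," in value:
        # Comma present: split on one or more commas plus any whitespace after them.
        parts = re.split(r",+\s*", value)
        # Perl's split drops trailing empty fields; Python's re.split keeps them.
        while parts and parts[-1] == "":
            parts.pop()
        return parts
    # No comma: split on runs of whitespace, ignoring leading and trailing blanks.
    return value.split()


print(split_keywords("keyword,keyword phrase,another keyword phrase"))
print(split_keywords("keyword, keyword phrase, another keyword phrase"))
print(split_keywords("keyword; keyword phrase"))
```

Under this rule, "comma" and "comma with space" give the same three keywords, and phrases survive intact. A semicolon-separated string with no comma, however, falls through to the whitespace split, so phrases are broken into single words and the semicolons stay attached to them.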