Metadata
Kitodo.Presentation allows a per-tenant configuration of which metadata is extracted from documents, and how it is indexed and displayed.
Formats
A METS document may embed or reference resources in many formats:
- Within the
<mdWrap>
tag, it may embed sections of metadata in formats such as MODS or TEIHDR (TEI Header). - Via the
<FLocat>
tag, it may reference fulltext in formats such as ALTO.
Data format records, which are stored on the root page (PID = 0), tell Kitodo how to extract and process these formats, and may be used to define the entries shown in the metadata plugin. A record contains the following fields:
Field |
Description |
---|---|
Format Name |
Name of the type that is used to reference it. For metadata embedded via For XML fulltext files, this corresponds to the capitalized root tag of the file. |
Root Element |
The XML root element used by this format. In METS, this is used to locate the sub-root within an |
Namespace URI |
The XML namespace URI used by this format. It is registered within the parser and may be used to declare namespace prefixes. |
Class Name |
(Optional) Fully qualified name of the PHP class that handles the format. Some formats are pre-defined in the For metadata, this is used to programmatically extract values by implementing For fulltext, this is used to parse the fulltext file by implementing the |
TypoScript Wrap
The TypoScript wrap controls how the metadata plugin displays the entries.
There are three objects that may be set, each of which is passed to stdWrap
with a different aspect of the metadata table.
key
: First, the (localized) label of the metadata entry, such as "Title" or "Year", is transformed using thekey
object.value
: Each value of the entry is transformed using thevalue
object.all
: The combined output (label and values) is processed using theall
object and appended to the output (unless there is no value or all processed values are blank).
Finally, all entries are wrapped in a definition list (<dl>
tag). If one of the objects is not specified, the unprocessed output is taken as-is.
Displaying Multiple Values
To allow TypoScript wrap to reference metadata values,
all configured entries of the current metadata section are set as data before calling stdWrap
.
The index names are used as key.
Whenever there are multiple values, they are joined by the separator
that is configured in the metadata plugin.
To see where this is useful, consider the following MODS access condition.
<mods:accessCondition
xmlns:xlink="http://www.w3.org/1999/xlink"
type="use and reproduction"
xlink:href="http://creativecommons.org/licenses/by-sa/4.0/"
>
CC BY-SA 4.0
</mods:accessCondition>
In the metadata table, we may want to render an <a>
tag that links to the license summary page and displays the license name as link text.
To achieve this, we may create two metadata entries:
- Create a hidden entry called
useandreproduction_link
that hasxpath
set to./mods:accessCondition[@type="use and reproduction"]/@xlink:href
. This entry is used purely to extract the license URL. -
The main entry sets
xpath
to./mods:accessCondition[@type="use and reproduction"]
(so it extracts the license name) and uses awrap
along the lines ofkey.wrap = <dt>|</dt> value.required = 1 value.typolink.parameter.field = useandreproduction_link value.wrap = <dd>|</dd>
Copied!