User Manual
Indexing Documents
New documents may be indexed via the TYPO3 command line interface (CLI).
Index single document
The command kitodo:index is used for indexing a single document:
./vendor/bin/typo3 kitodo:index -d http://example.com/path/mets.xml -p 123 -s dlfCore1
| Option | Required | Description | Example |
|---|---|---|---|
| -d | yes | This may be the UID of an existing document or a URL, as in the command above. Hint: Do not encode the URL! If the path contains spaces, use quotation marks. | |
| -p | yes | The page UID of the Kitodo.Presentation data folder. This keeps all records of documents, metadata, structures, solrcores etc. | 123 |
| -s | yes | This may be the UID of the solrcore record or, as in the example, the name of the Solr core. The Solr core must exist in table tx_dlf_solrcores on page "pid". Otherwise an error is shown and processing won't start. | 123 or 'dlfCore1' |
| | no | This may be the UID of the library record. | 123 |
| | no | Nothing will be written to database or index. The Solr settings will be checked and the document's location URL will be shown. | |
| | no | Do not output any message. Useful when using a wrapper script. The script may check the return value of the CLI job. This is always 0 on success and 1 on failure. | |
| | no | Show the UID and location of each processed document together with the indexing parameters. | |
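The return value described in the table can be checked from a wrapper script. The following is a minimal sketch, not part of Kitodo.Presentation: the function name index_document and the variable TYPO3_BIN are hypothetical, and only the options shown in the command above are used.

```shell
#!/bin/sh
# Path to the TYPO3 CLI binary (assumption: adjust to your installation).
TYPO3_BIN="${TYPO3_BIN:-./vendor/bin/typo3}"

# index_document DOC PID CORE  (hypothetical helper)
# Indexes one document and reports a failure on stderr.
index_document() {
    if [ "$#" -ne 3 ]; then
        echo "usage: index_document DOC PID CORE" >&2
        return 2
    fi
    # "$1" is quoted so that a path containing spaces stays one argument.
    if "$TYPO3_BIN" kitodo:index -d "$1" -p "$2" -s "$3"; then
        return 0
    fi
    echo "indexing failed for: $1" >&2
    return 1
}
```

A call such as index_document "http://example.com/path/mets.xml" 123 dlfCore1 then returns 0 on success and 1 when the CLI job reports a failure.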
Reindex collections
With the command kitodo:reindex it is possible to reindex one or more collections, or even all documents on the given page:
# reindex collection with uid 1 on page 123 with solr core 'dlfCore1'
# short notation
./vendor/bin/typo3 kitodo:reindex -c 1 -p 123 -s dlfCore1
# long notation
./vendor/bin/typo3 kitodo:reindex --coll 1 --pid 123 --solr dlfCore1
# reindex collection with uid 1 on page 123 with solr core 'dlfCore1' in given range
# short notation
./vendor/bin/typo3 kitodo:reindex -c 1 -l 1000 -b 0 -p 123 -s dlfCore1
./vendor/bin/typo3 kitodo:reindex -c 1 -l 1000 -b 1000 -p 123 -s dlfCore1
# long notation
./vendor/bin/typo3 kitodo:reindex --coll 1 --index-limit=1000 --index-begin=0 --pid 123 --solr dlfCore1
./vendor/bin/typo3 kitodo:reindex --coll 1 --index-limit=1000 --index-begin=1000 --pid 123 --solr dlfCore1
# reindex collections with uid 1 and 4 on page 123 with solr core 'dlfCore1'
# short notation
./vendor/bin/typo3 kitodo:reindex -c 1,4 -p 123 -s dlfCore1
# long notation
./vendor/bin/typo3 kitodo:reindex --coll 1,4 --pid 123 --solr dlfCore1
# reindex collections with uid 1 and 4 on page 123 with solr core 'dlfCore1' in given range
# short notation
./vendor/bin/typo3 kitodo:reindex -c 1,4 -l 1000 -b 0 -p 123 -s dlfCore1
./vendor/bin/typo3 kitodo:reindex -c 1,4 -l 1000 -b 1000 -p 123 -s dlfCore1
# long notation
./vendor/bin/typo3 kitodo:reindex --coll 1,4 --index-limit=1000 --index-begin=0 --pid 123 --solr dlfCore1
./vendor/bin/typo3 kitodo:reindex --coll 1,4 --index-limit=1000 --index-begin=1000 --pid 123 --solr dlfCore1
# reindex all documents on page 123 with solr core 'dlfCore1' (caution: this can cause memory problems with a large number of documents)
# short notation
./vendor/bin/typo3 kitodo:reindex -a -p 123 -s dlfCore1
# long notation
./vendor/bin/typo3 kitodo:reindex --all --pid 123 --solr dlfCore1
# reindex all documents on page 123 with solr core 'dlfCore1' in given range
# short notation
./vendor/bin/typo3 kitodo:reindex -a -l 1000 -b 0 -p 123 -s dlfCore1
./vendor/bin/typo3 kitodo:reindex -a -l 1000 -b 1000 -p 123 -s dlfCore1
# long notation
./vendor/bin/typo3 kitodo:reindex --all --index-limit=1000 --index-begin=0 --pid 123 --solr dlfCore1
./vendor/bin/typo3 kitodo:reindex --all --index-limit=1000 --index-begin=1000 --pid 123 --solr dlfCore1
| Option | Required | Description | Example |
|---|---|---|---|
| -a, --all | no | With this option, all documents on the given page will be reindexed. | |
| -c, --coll | no | This may be a single collection UID or a comma-separated list of UIDs to reindex. | 1 or 1,2,3 |
| -p, --pid | yes | The page UID of the Kitodo.Presentation data folder. This keeps all records of documents, metadata, structures, solrcores etc. | 123 |
| -s, --solr | yes | This may be the UID of the solrcore record or, as in the example, the name of the Solr core. The Solr core must exist in table tx_dlf_solrcores on page "pid". Otherwise an error is shown and processing won't start. | 123 or 'dlfCore1' |
| | no | This may be the UID of the library record. | 123 |
| -l, --index-limit | no | With this option, at most the given number of documents on the given page will be reindexed. Useful when memory problems are expected due to a large number of documents. | 1000 |
| -b, --index-begin | no | With this option, reindexing of the documents on the given page starts at the given offset. Useful when memory problems are expected due to a large number of documents. | |
| | no | Nothing will be written to database or index. All documents will be listed which would be processed on a real run. | |
| | no | Do not output any message. Useful when using a wrapper script. The script may check the return value of the CLI job. This is always 0 on success and 1 on failure. | |
| | no | Show the UID and location of each processed document with timestamp and the number of processed/all documents. | |
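The range examples above can be generalised into a loop that walks through all documents in fixed-size batches. This is only a sketch under assumptions: the helper name reindex_in_batches and the variable TYPO3_BIN are hypothetical, and the total number of documents has to be supplied by the caller because the commands shown here do not report it.

```shell
#!/bin/sh
# Path to the TYPO3 CLI binary (assumption: adjust to your installation).
TYPO3_BIN="${TYPO3_BIN:-./vendor/bin/typo3}"

# reindex_in_batches TOTAL LIMIT PID CORE  (hypothetical helper)
# Runs kitodo:reindex -a once per batch: -b 0, -b LIMIT, -b 2*LIMIT, ...
# Stops with return value 1 as soon as one batch fails.
reindex_in_batches() {
    rb_total=$1
    rb_limit=$2
    rb_pid=$3
    rb_core=$4
    if [ "$rb_limit" -le 0 ]; then
        echo "LIMIT must be a positive number" >&2
        return 2
    fi
    rb_begin=0
    while [ "$rb_begin" -lt "$rb_total" ]; do
        "$TYPO3_BIN" kitodo:reindex -a -l "$rb_limit" -b "$rb_begin" \
            -p "$rb_pid" -s "$rb_core" || return 1
        rb_begin=$(($rb_begin + $rb_limit))
    done
}
```

For 2500 documents and a limit of 1000, reindex_in_batches 2500 1000 123 dlfCore1 issues three runs starting at 0, 1000 and 2000. Each batch is a separate CLI job, which is what keeps memory use bounded.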
Harvest OAI-PMH interface
With the command kitodo:harvest it is possible to harvest an OAI-PMH interface and index all fetched records:
# example
./vendor/bin/typo3 kitodo:harvest --lib=<UID> --pid=<PID> --solr=<CORE> --from=<timestamp> --until=<timestamp> --set=<set>
In order to use the command, you first have to configure a library in the backend, setting at least a label and oai_base. The latter should be a valid OAI-PMH base URL (e.g. https://digital.slub-dresden.de/oai/).
| Option | Required | Description | Example |
|---|---|---|---|
| --lib | yes | This is the UID of the library record with the OAI interface that should be harvested. This library is also automatically set as the documents' owner. | 123 |
| --pid | yes | This is the page UID of the library record and therefore the page the documents are added to. | 123 |
| --solr | yes | This may be the UID of the solrcore record or, as in the example, the name of the Solr core. The Solr core must exist in table tx_dlf_solrcores on page "pid". Otherwise an error is shown and processing won't start. | 123 or 'dlfCore1' |
| --from | no | This is a timestamp in the format YYYY-MM-DD. The parameters from and until limit harvesting to the given period, e.g. for incremental updates. | 2021-01-01 |
| --until | no | This is a timestamp in the format YYYY-MM-DD. The parameters from and until limit harvesting to the given period, e.g. for incremental updates. | 2021-06-30 |
| --set | no | This is the name of an OAI set. The parameter limits harvesting to the given set. | 'vd18' |
| | no | Nothing will be written to database or index. All documents will be listed which would be processed on a real run. | |
| | no | Do not output any message. Useful when using a wrapper script. The script may check the return value of the CLI job. This is always 0 on success and 1 on failure. | |
| | no | Show the UID and location of each processed document with timestamp and the number of processed/all documents. | |
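For incremental updates, a wrapper can check the from and until values before starting the harvest. The sketch below is an illustration only: is_iso_date, harvest_period and TYPO3_BIN are hypothetical names, and the check covers just the YYYY-MM-DD shape, not whether the date exists in the calendar.

```shell
#!/bin/sh
# Path to the TYPO3 CLI binary (assumption: adjust to your installation).
TYPO3_BIN="${TYPO3_BIN:-./vendor/bin/typo3}"

# is_iso_date VALUE  (hypothetical helper)
# Succeeds if VALUE has the shape YYYY-MM-DD (digits and dashes only).
is_iso_date() {
    case "$1" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# harvest_period LIB PID CORE FROM UNTIL  (hypothetical helper)
# Harvests only the records within the given period.
harvest_period() {
    if ! is_iso_date "$4" || ! is_iso_date "$5"; then
        echo "dates must use the format YYYY-MM-DD" >&2
        return 2
    fi
    "$TYPO3_BIN" kitodo:harvest --lib="$1" --pid="$2" --solr="$3" \
        --from="$4" --until="$5"
}
```

A call such as harvest_period 123 123 dlfCore1 2021-01-01 2021-06-30 harvests the first half of 2021, while a value like 01.01.2021 is rejected before the CLI job is started.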
Delete single document
The command kitodo:delete is used for deleting a single document:
./vendor/bin/typo3 kitodo:delete -d http://example.com/path/mets.xml -p 123 -s dlfCore1
| Option | Required | Description | Example |
|---|---|---|---|
| -d | yes | This may be the UID of an existing document or a URL, as in the command above. Hint: Do not encode the URL! If the path contains spaces, use quotation marks. | |
| -p | yes | The page UID of the Kitodo.Presentation data folder. This keeps all records of documents, metadata, structures, solrcores etc. | 123 |
| -s | yes | This may be the UID of the solrcore record or, as in the example, the name of the Solr core. The Solr core must exist in table tx_dlf_solrcores on page "pid". Otherwise an error is shown and processing won't start. | 123 or 'dlfCore1' |
| | no | Show the UID and location of each processed document together with the deleting parameters. | |
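Because the command handles one document per call, removing several documents needs a loop. The following sketch assumes a list with one UID or URL per line on standard input; delete_documents and TYPO3_BIN are hypothetical names and not part of Kitodo.Presentation.

```shell
#!/bin/sh
# Path to the TYPO3 CLI binary (assumption: adjust to your installation).
TYPO3_BIN="${TYPO3_BIN:-./vendor/bin/typo3}"

# delete_documents PID CORE  (hypothetical helper)
# Reads one document UID or URL per line from standard input and deletes
# each of them. Blank lines are skipped. Returns 1 if any deletion failed.
delete_documents() {
    dd_failed=0
    while IFS= read -r dd_doc || [ -n "$dd_doc" ]; do
        [ -n "$dd_doc" ] || continue
        # </dev/null keeps the CLI job from consuming the remaining list.
        "$TYPO3_BIN" kitodo:delete -d "$dd_doc" -p "$1" -s "$2" </dev/null \
            || dd_failed=1
    done
    return "$dd_failed"
}
```

With a file documents.txt holding the list, delete_documents 123 dlfCore1 < documents.txt processes every entry and still continues after a single failure, so one bad entry does not block the rest.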
Commit and/or optimize index
With the command kitodo:optimize it is possible to hard commit documents to the index and/or optimize it:
# example
./vendor/bin/typo3 kitodo:optimize --solr=<CORE> --commit --optimize
| Option | Required | Description | Example |
|---|---|---|---|
| --solr | yes | This may be the UID of the solrcore record or, as in the example, the name of the Solr core. The Solr core must exist in table tx_dlf_solrcores on page "pid". Otherwise an error is shown and processing won't start. | 123 or 'dlfCore1' |
| --commit | no | Hard commit documents to the index. | |
| --optimize | no | Optimize the index. | |
| | no | Nothing will be written to database or index. All documents will be listed which would be processed on a real run. | |
| | no | Do not output any message. Useful when using a wrapper script. The script may check the return value of the CLI job. This is always 0 on success and 1 on failure. | |
| | no | Show the UID and location of each processed document with timestamp and the number of processed/all documents. | |