I’m using Seafile Pro 6.1.4 with office preview and indexing enabled.
The system runs quite well, but PDFs larger than 10 MB won’t be indexed.
Here is some output from Elasticsearch:
[09/04/2017 09:24:18] extracting 6529a6e2-1309-47a2-a395-237d90d236aa /Zeitschriften/CT/2014/ct1406.pdf…
Syntax Error: Invalid XRef entry
Internal Error: xref num 3317 not found but needed, try to reconstruct
Syntax Error: Invalid XRef entry
Syntax Error: Top-level pages object is wrong type (null)
Command Line Error: Wrong page range given: the first page (1) can not be after the last page (0).
[09/04/2017 09:24:18] successfully extracted /Zeitschriften/CT/2014/ct1406.pdf
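Note that the log reports "successfully extracted" even though several errors were printed during the extraction. To find every file affected this way, a small script can scan the log for error lines between an "extracting" line and the next "successfully extracted" line. This is only a rough sketch: it assumes the log format matches the excerpt above, and the function name `files_with_errors` and the regular expressions are illustrative, not part of Seafile.

```python
import re

# Patterns are assumptions derived from the log excerpt above,
# not from any documented Seafile log format.
START = re.compile(
    r"^\[(?P<ts>[^\]]+)\] extracting (?P<repo>\S+) (?P<path>.+?)(?:…|\.\.\.)?$"
)
END = re.compile(r"^\[[^\]]+\] successfully extracted (?P<path>.+)$")
ERROR = re.compile(r"^(Syntax Error|Internal Error|Command Line Error):")


def files_with_errors(lines):
    """Return {path: [error lines]} for extractions that logged errors."""
    result = {}
    current = None  # path of the file currently being extracted, if any
    for raw in lines:
        line = raw.strip()
        match = START.match(line)
        if match:
            current = match.group("path")
            continue
        if END.match(line):
            current = None
            continue
        # Only count error lines that appear while a file is being extracted.
        if current is not None and ERROR.match(line):
            result.setdefault(current, []).append(line)
    return result
```

Passing the lines of the index log to `files_with_errors` returns a dictionary keyed by file path, so you can compare the affected files against their sizes and check whether the 10 MB threshold really holds.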
I don’t think so. For some tests I created the PDF with PDF24 Creator (a free tool), and even then indexing failed whenever the file was larger than 10 MB.
If the file is compressed more heavily and ends up smaller than 10 MB, indexing works.
Another test with this PDF (15 MB) also failed; see the logs below.