public class LimitedTextContentExtractor extends BaseAttachmentContentExtractor
A BaseAttachmentContentExtractor that limits how many bytes of the input stream
are read into memory. This prevents huge attachment streams from being read in full and exhausting memory.
As a side effect, content located beyond the limit is not indexed, but that is preferable to an OutOfMemoryError (OOME).
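
The byte limit described above can be illustrated with a small, self-contained sketch. This is not the actual Confluence implementation: the class name `BoundedReadSketch`, the method `readLimited`, and the choice of UTF-8 decoding are all assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: none of these names come from the Confluence API.
public class BoundedReadSketch {

    /**
     * Reads at most maxBytes from the stream and decodes them as UTF-8.
     * Anything beyond the limit is left unread, so memory use stays bounded.
     * Note: cutting at a byte boundary may split a multi-byte character.
     */
    public static String readLimited(InputStream is, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int remaining = maxBytes;
        while (remaining > 0) {
            // Never request more than the remaining allowance.
            int read = is.read(buffer, 0, Math.min(buffer.length, remaining));
            if (read == -1) {
                break; // end of stream reached before the limit
            }
            out.write(buffer, 0, read);
            remaining -= read;
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello world, this text is longer than the limit"
                .getBytes(StandardCharsets.UTF_8);
        // Only the first 11 bytes are read; the rest is never buffered.
        System.out.println(readLimited(new ByteArrayInputStream(data), 11)); // prints "hello world"
    }
}
```

The trade-off is the one the class description names: text past the limit is silently dropped from the extracted content, in exchange for a hard upper bound on memory used per attachment.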
The default value was changed from fixed 10Mb to be in line with the value set for Attachments:
AttachmentExtractedTextExtractor

| Constructor and Description |
|---|
| LimitedTextContentExtractor() |

| Modifier and Type | Method and Description |
|---|---|
| protected String | extractText(InputStream is, com.atlassian.bonnie.search.SearchableAttachment attachment) |
| protected boolean | shouldExtractFrom(String fileName, String contentType) Extract text from mime types like 'text/*', 'application/xml*' and 'application/*+xml' |
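
The MIME type patterns listed for shouldExtractFrom ('text/*', 'application/xml*' and 'application/*+xml') could be matched roughly as in the sketch below. This is an illustration under assumptions, not the real method body: the class `MimeTypeMatchSketch` and method `looksLikeText` are hypothetical, the file name argument is ignored, and stripping parameters such as `; charset=...` is a choice made here rather than documented behaviour.

```java
import java.util.Locale;

// Hypothetical sketch: the real shouldExtractFrom may behave differently.
public class MimeTypeMatchSketch {

    /** Returns true for content types matching text/*, application/xml* or application/*+xml. */
    public static boolean looksLikeText(String contentType) {
        if (contentType == null) {
            return false;
        }
        String type = contentType.toLowerCase(Locale.ROOT).trim();
        // Assumption: drop parameters such as "; charset=utf-8" before matching.
        int semicolon = type.indexOf(';');
        if (semicolon >= 0) {
            type = type.substring(0, semicolon).trim();
        }
        return type.startsWith("text/")                 // text/*
                || type.startsWith("application/xml")   // application/xml*
                || (type.startsWith("application/") && type.endsWith("+xml")); // application/*+xml
    }

    public static void main(String[] args) {
        System.out.println(looksLikeText("text/plain"));            // prints "true"
        System.out.println(looksLikeText("application/atom+xml"));  // prints "true"
        System.out.println(looksLikeText("application/pdf"));       // prints "false"
    }
}
```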

Inherited methods: extractFields, extractText, extractText

protected boolean shouldExtractFrom(String fileName, String contentType)
shouldExtractFrom in class BaseAttachmentContentExtractor

protected String extractText(InputStream is, com.atlassian.bonnie.search.SearchableAttachment attachment)
extractText in class BaseAttachmentContentExtractor
is - a stream containing the attachment contents
attachment - contains useful attachment metadata, e.g. filename

Copyright © 2003–2022 Atlassian. All rights reserved.