Class LimitedTextContentExtractor

  • All Implemented Interfaces:
    com.atlassian.bonnie.search.Extractor

    @Deprecated
    public class LimitedTextContentExtractor
    extends com.atlassian.bonnie.search.extractor.DefaultTextContentExtractor
    Deprecated.
    Since 7.17. LimitedTextContentExtractor will no longer be available via OSGi as public API.
    A subclass of Bonnie's DefaultTextContentExtractor which places a limit on how many bytes of the input stream are read into memory. This prevents it from potentially reading in huge attachment streams that trigger memory starvation.

    This may have the side effect that content located beyond the limit is not indexed, but that is preferable to an OutOfMemoryError.
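
    The general idea of capping how much of a stream is read can be sketched as follows. This is a minimal illustration only, not the actual Confluence or Bonnie implementation: the class BoundedTextReader, the method readLimited, and the UTF-8 decoding are all assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of Confluence or Bonnie. */
public class BoundedTextReader {

    /**
     * Reads at most maxBytes from the stream and decodes them as UTF-8.
     * Anything beyond the limit is left unread, so memory use stays bounded.
     */
    static String readLimited(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int remaining = maxBytes;
        while (remaining > 0) {
            // Never request more than the number of bytes still allowed.
            int n = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (n == -1) {
                break; // end of stream reached before the limit
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "hello world".getBytes(StandardCharsets.UTF_8));
        System.out.println(readLimited(in, 5)); // prints "hello"
    }
}
```

    Cutting at a byte boundary can split a multi-byte character at the very end of the text. For indexing purposes that is usually an acceptable trade-off, in the same spirit as the side effect described above.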

    The default limit was changed from a fixed 10 MB to be in line with the value set for attachments:

    Since:
    5.4
    See Also:
    AttachmentExtractedTextExtractor
    • Constructor Detail

      • LimitedTextContentExtractor

        public LimitedTextContentExtractor()
        Deprecated.
    • Method Detail

      • extractText

        protected String extractText​(InputStream is,
                                     com.atlassian.bonnie.search.SearchableAttachment attachment)
        Deprecated.
        Overrides:
        extractText in class com.atlassian.bonnie.search.extractor.DefaultTextContentExtractor