LimitedTextContentExtractor (Atlassian Confluence 7.19.1 API)

java.lang.Object
- com.atlassian.confluence.search.v2.extractor.BaseAttachmentContentExtractor
- - com.atlassian.confluence.impl.search.extractor.LimitedTextContentExtractor

All Implemented Interfaces:

Extractor2
```
public class LimitedTextContentExtractor
extends BaseAttachmentContentExtractor
```
A subclass of BaseAttachmentContentExtractor that limits how many bytes of the input stream are read into memory. This prevents very large attachment streams from being read in full and exhausting the heap.
As a side effect, content located beyond the limit is not indexed, but that is preferable to an OutOfMemoryError (OOME).
The default value was changed from a fixed 10 MB to be in line with the value set for Attachments:

Since:

7.17

See Also:

AttachmentExtractedTextExtractor
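
The byte limit described above can be illustrated with a minimal, standalone sketch. It is not the Confluence implementation: the class name `LimitedReadSketch`, the method `readLimited`, and the `maxBytes` parameter are hypothetical, and decoding as UTF-8 is an assumption made only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of the Confluence API.
public class LimitedReadSketch {

    /**
     * Reads at most maxBytes from the stream and decodes them as UTF-8.
     * Anything beyond the limit is never pulled into memory.
     */
    public static String readLimited(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int remaining = maxBytes;
        while (remaining > 0) {
            // Never request more than the bytes still allowed by the limit.
            int read = in.read(buffer, 0, Math.min(buffer.length, remaining));
            if (read == -1) {
                break; // end of stream reached before the limit
            }
            out.write(buffer, 0, read);
            remaining -= read;
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello world".getBytes(StandardCharsets.UTF_8);
        // Only the first 5 bytes are read; the rest would not be indexed.
        System.out.println(readLimited(new ByteArrayInputStream(data), 5)); // prints "hello"
    }
}
```

Cutting a stream at a byte boundary can split a multi-byte character, in which case the decoder substitutes a replacement character for the final partial sequence. For indexing purposes that is usually an acceptable trade-off against unbounded memory use.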

Constructor Summary

Constructors
Constructor and Description

LimitedTextContentExtractor()

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`protected String`	`extractText(InputStream is, com.atlassian.bonnie.search.SearchableAttachment attachment)`
`protected boolean`	`shouldExtractFrom(String fileName, String contentType)` Extract text from mime types like 'text/*', 'application/xml*' and 'application/*+xml'

Methods inherited from class com.atlassian.confluence.search.v2.extractor.BaseAttachmentContentExtractor
extractFields, extractText, extractText

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - LimitedTextContentExtractor
```
public LimitedTextContentExtractor()
```
- Method Detail
  - shouldExtractFrom
```
protected boolean shouldExtractFrom(String fileName,
                                    String contentType)
```
    Extract text from mime types like 'text/*', 'application/xml*' and 'application/*+xml'
    
    Specified by:
    
    shouldExtractFrom in class BaseAttachmentContentExtractor
  - extractText
```
protected String extractText(InputStream is,
                             com.atlassian.bonnie.search.SearchableAttachment attachment)
```
    Specified by:
    
    extractText in class BaseAttachmentContentExtractor
    
    Parameters:
    
    is - a stream containing the attachment contents
    
    attachment - contains useful attachment metadata, e.g. filename
    
    Returns:
    
    a String with a textual representation of the attachment's contents
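
The MIME type patterns listed for `shouldExtractFrom` ('text/*', 'application/xml*' and 'application/*+xml') could be matched roughly as follows. This is a sketch under assumptions, not the actual Confluence logic: the class `MimeTypeCheckSketch` and method `looksLikeText` are hypothetical, the sketch ignores the `fileName` argument, and lower-casing and stripping of parameters such as `; charset=UTF-8` are choices made only for this example.

```java
import java.util.Locale;

// Hypothetical illustration; not part of the Confluence API.
public class MimeTypeCheckSketch {

    /** Returns true for 'text/*', 'application/xml*' and 'application/*+xml'. */
    public static boolean looksLikeText(String contentType) {
        if (contentType == null) {
            return false;
        }
        String type = contentType.toLowerCase(Locale.ROOT).trim();
        // Assumption: drop parameters such as "; charset=UTF-8" before matching.
        int semicolon = type.indexOf(';');
        if (semicolon >= 0) {
            type = type.substring(0, semicolon).trim();
        }
        return type.startsWith("text/")
                || type.startsWith("application/xml")
                || (type.startsWith("application/") && type.endsWith("+xml"));
    }

    public static void main(String[] args) {
        System.out.println(looksLikeText("text/plain"));                          // prints "true"
        System.out.println(looksLikeText("application/atom+xml; charset=utf-8")); // prints "true"
        System.out.println(looksLikeText("application/pdf"));                     // prints "false"
    }
}
```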

Copyright © 2003–2022 Atlassian. All rights reserved.