Class: SentenceSplitter
SentenceSplitter is our default text splitter. It supports splitting text into sentences, paragraphs, or fixed-length chunks with overlap.
One advantage of SentenceSplitter is that, even when producing fixed-length chunks, it tries to keep sentences together.
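To make the idea of fixed-length chunks that respect sentence boundaries concrete, here is a minimal sketch. It is not the library's implementation: the function names (`countWords`, `splitIntoSentences`, `chunkBySentence`) are hypothetical, sizes are counted in whitespace-separated words rather than model tokens, and a sentence longer than the chunk size is kept whole instead of being split.

```typescript
// Illustrative only: a simplified sentence-aware chunker, not the library's code.
// Sizes are measured in whitespace-separated words, not model tokens.

function countWords(text: string): number {
  return text.split(/\s+/).filter((w) => w.length > 0).length;
}

function splitIntoSentences(text: string): string[] {
  // Naive rule: a sentence is a run of text ending in ".", "!" or "?".
  const parts = text.match(/[^.!?]+[.!?]*/g) || [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

function chunkBySentence(
  text: string,
  chunkSize: number,
  chunkOverlap: number,
): string[] {
  if (chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be smaller than chunkSize");
  }
  const chunks: string[] = [];
  let current: string[] = [];
  let currentSize = 0;

  for (const sentence of splitIntoSentences(text)) {
    const size = countWords(sentence);
    if (current.length > 0 && currentSize + size > chunkSize) {
      // The next sentence does not fit: close the current chunk.
      chunks.push(current.join(" "));
      // Carry whole trailing sentences forward as overlap, up to chunkOverlap words.
      let carried: string[] = [];
      let carriedSize = 0;
      for (let i = current.length - 1; i >= 0; i--) {
        const s = countWords(current[i]);
        if (carriedSize + s > chunkOverlap) break;
        carried.unshift(current[i]);
        carriedSize += s;
      }
      // Drop the overlap if it would push the new chunk over the limit.
      if (carriedSize + size > chunkSize) {
        carried = [];
        carriedSize = 0;
      }
      current = carried;
      currentSize = carriedSize;
    }
    current.push(sentence);
    currentSize += size;
  }
  if (current.length > 0) chunks.push(current.join(" "));
  return chunks;
}
```

With four three-word sentences, a chunk size of 6 and an overlap of 3, this sketch yields three chunks in which each chunk begins with the last sentence of the previous one, and no sentence is ever cut in the middle.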
Constructors
constructor
• new SentenceSplitter(options?): SentenceSplitter
Parameters
Name | Type |
---|---|
options? | Object |
options.chunkOverlap? | number |
options.chunkSize? | number |
options.chunkingTokenizerFn? | (text: string) => string[] |
options.paragraphSeparator? | string |
options.splitLongSentences? | boolean |
options.tokenizer? | any |
options.tokenizerDecoder? | any |
Returns
SentenceSplitter
Defined in
packages/core/src/TextSplitter.ts:78
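The parameter table can be read as a single optional options object. The sketch below mirrors it as a TypeScript type and adds a small sanity check. The type name `SplitterOptionsSketch`, the helper `checkOptions`, and all values shown are hypothetical: this reference gives neither the library's own type name nor its defaults, and the validation rules are assumptions rather than documented behavior.

```typescript
// Hypothetical type mirroring the parameter table; not the library's own type.
interface SplitterOptionsSketch {
  chunkSize?: number;
  chunkOverlap?: number;
  chunkingTokenizerFn?: (text: string) => string[];
  paragraphSeparator?: string;
  splitLongSentences?: boolean;
  tokenizer?: any;
  tokenizerDecoder?: any;
}

// Assumed sanity checks: an overlap as large as the chunk itself would
// never let the splitter advance through the text.
function checkOptions(options: SplitterOptionsSketch): string[] {
  const problems: string[] = [];
  const { chunkSize, chunkOverlap } = options;
  if (chunkSize !== undefined && chunkSize <= 0) {
    problems.push("chunkSize must be positive");
  }
  if (chunkOverlap !== undefined && chunkOverlap < 0) {
    problems.push("chunkOverlap must not be negative");
  }
  if (
    chunkSize !== undefined &&
    chunkOverlap !== undefined &&
    chunkOverlap >= chunkSize
  ) {
    problems.push("chunkOverlap must be smaller than chunkSize");
  }
  return problems;
}

// Illustrative values only; they are not the library's defaults.
const options: SplitterOptionsSketch = {
  chunkSize: 512,
  chunkOverlap: 20,
  paragraphSeparator: "\n\n",
  splitLongSentences: false,
  // A custom sentence tokenizer matching the documented function type.
  chunkingTokenizerFn: (text) => text.match(/[^.!?]+[.!?]*/g) || [],
};

// An object of this shape would be passed as: new SentenceSplitter(options)
console.log(checkOptions(options));
```

Since every field is optional, callers can override only the settings they care about and leave the rest to the class.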
Properties
chunkOverlap
• chunkOverlap: number
Defined in
packages/core/src/TextSplitter.ts:70
chunkSize
• chunkSize: number
Defined in
packages/core/src/TextSplitter.ts:69
chunkingTokenizerFn
• Private chunkingTokenizerFn: (text: string) => string[]
Type declaration
▸ (text): string[]
Parameters
Name | Type |
---|---|
text | string |
Returns
string[]
Defined in
packages/core/src/TextSplitter.ts:75
paragraphSeparator
• Private paragraphSeparator: string
Defined in
packages/core/src/TextSplitter.ts:74
splitLongSentences
• Private splitLongSentences: boolean
Defined in
packages/core/src/TextSplitter.ts:76
tokenizer
• Private tokenizer: any
Defined in
packages/core/src/TextSplitter.ts:72
tokenizerDecoder
• Private tokenizerDecoder: any
Defined in
packages/core/src/TextSplitter.ts:73
Methods
combineTextSplits
▸ combineTextSplits(newSentenceSplits, effectiveChunkSize): TextSplit[]
Parameters
Name | Type |
---|---|
newSentenceSplits | SplitRep[] |
effectiveChunkSize | number |
Returns
TextSplit[]
Defined in
packages/core/src/TextSplitter.ts:215
getEffectiveChunkSize
▸ getEffectiveChunkSize(extraInfoStr?): number
Parameters
Name | Type |
---|---|
extraInfoStr? | string |
Returns
number
Defined in
packages/core/src/TextSplitter.ts:114
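This reference does not describe how the effective chunk size is computed. One plausible reading of the signature, offered purely as an assumption, is that the tokens taken up by the extra info string are subtracted from the configured chunk size so that text and extra info together still fit. The sketch below illustrates that reading; `effectiveChunkSizeSketch` and `whitespaceTokenize` are hypothetical names, and a whitespace tokenizer stands in for whatever tokenizer the class actually uses.

```typescript
// Assumption: sizes are counted with a tokenizer function. A whitespace
// tokenizer stands in for a real model tokenizer here.
type Tokenize = (text: string) => string[];

const whitespaceTokenize: Tokenize = (text) =>
  text.split(/\s+/).filter((t) => t.length > 0);

// Hypothetical helper: the chunk budget left once the extra info is accounted for.
function effectiveChunkSizeSketch(
  chunkSize: number,
  extraInfoStr?: string,
  tokenize: Tokenize = whitespaceTokenize,
): number {
  // Without extra info, the whole budget is available for text.
  if (extraInfoStr === undefined) return chunkSize;
  const effective = chunkSize - tokenize(extraInfoStr).length;
  if (effective <= 0) {
    throw new Error("Extra info leaves no room for text in the chunk");
  }
  return effective;
}

console.log(effectiveChunkSizeSketch(100));
console.log(effectiveChunkSizeSketch(100, "title: Report author: Ann"));
```

Under this reading, a longer extra info string leaves a smaller budget for the text in each chunk, and the sketch fails loudly when nothing is left.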
getParagraphSplits
▸ getParagraphSplits(text, effectiveChunkSize?): string[]