chatting module ¶

Classes for chatting.

See vectorbtpro.utils.knowledge for the toy dataset.

memory_store dict ¶

Object store by store id for MemoryStore.

complete function ¶

complete(
    message,
    completions=None,
    **kwargs
)

Get completion for a message.

Resolves completions with resolve_completions. Keyword arguments are passed to either initialize a class or replace an instance of Completions.

def_metadata_template function ¶

def_metadata_template(
    metadata_content
)

Default metadata template

detokenize function ¶

detokenize(
    tokens,
    tokenizer=None,
    **kwargs
)

Detokenize text.

Resolves tokenizer with resolve_tokenizer. Keyword arguments are passed to either initialize a class or replace an instance of Tokenizer.

embed function ¶

embed(
    query,
    embeddings=None,
    **kwargs
)

Get embedding(s) for one or more queries.

Resolves embeddings with resolve_embeddings. Keyword arguments are passed to either initialize a class or replace an instance of Embeddings.

embed_documents function ¶

embed_documents(
    documents,
    refresh=False,
    refresh_documents=None,
    refresh_embeddings=None,
    return_embeddings=False,
    return_documents=False,
    doc_ranker=None,
    **kwargs
)

Embed documents.

Keyword arguments are passed to either initialize a class or replace an instance of DocumentRanker.

rank_documents function ¶

rank_documents(
    query,
    documents=None,
    top_k=None,
    min_top_k=None,
    max_top_k=None,
    cutoff=None,
    refresh=False,
    refresh_documents=None,
    refresh_embeddings=None,
    return_chunks=False,
    return_scores=False,
    doc_ranker=None,
    **kwargs
)

Rank documents by their relevance to a query.

Keyword arguments are passed to either initialize a class or replace an instance of DocumentRanker.

resolve_completions function ¶

resolve_completions(
    completions=None
)

Resolve a subclass or an instance of Completions.

The following values are supported:

"openai" (OpenAICompletions)
"litellm" (LiteLLMCompletions)
"llama_index" (LlamaIndexCompletions)
"auto": Any installed from above, in the same order
A subclass or an instance of Completions

resolve_embeddings function ¶

resolve_embeddings(
    embeddings=None
)

Resolve a subclass or an instance of Embeddings.

The following values are supported:

"openai" (OpenAIEmbeddings)
"litellm" (LiteLLMEmbeddings)
"llama_index" (LlamaIndexEmbeddings)
"auto": Any installed from above, in the same order
A subclass or an instance of Embeddings

resolve_obj_store function ¶

resolve_obj_store(
    obj_store=None
)

Resolve a subclass or an instance of ObjectStore.

The following values are supported:

"dict" (DictStore)
"memory" (MemoryStore)
"file" (FileStore)
"lmdb" (LMDBStore)
"cached" (CachedStore)
A subclass or an instance of ObjectStore

resolve_text_splitter function ¶

resolve_text_splitter(
    text_splitter=None
)

Resolve a subclass or an instance of TextSplitter.

The following values are supported:

"token" (TokenSplitter)
"segment" (SegmentSplitter)
"llama_index" (LlamaIndexSplitter)
A subclass or an instance of TextSplitter

resolve_tokenizer function ¶

resolve_tokenizer(
    tokenizer=None
)

Resolve a subclass or an instance of Tokenizer.

The following values are supported:

"tiktoken" (TikTokenizer)
A subclass or an instance of Tokenizer

split_text function ¶

split_text(
    text,
    text_splitter=None,
    **kwargs
)

Split text.

Resolves text_splitter with resolve_text_splitter. Keyword arguments are passed to either initialize a class or replace an instance of TextSplitter.

tokenize function ¶

tokenize(
    text,
    tokenizer=None,
    **kwargs
)

Tokenize text.

Resolves tokenizer with resolve_tokenizer. Keyword arguments are passed to either initialize a class or replace an instance of Tokenizer.

CachedStore class ¶

CachedStore(
    obj_store,
    lazy_open=None,
    mirror=None,
    **kwargs
)

Store class that acts as a (temporary) cache to another store.

For defaults, see chat.obj_store_configs.cached in knowledge.

Superclasses

Base
Cacheable
Chainable
Comparable
Configured
DictStore
HasSettings
ObjectStore
Pickleable
Prettified
collections.abc.Collection
collections.abc.Container
collections.abc.Iterable
collections.abc.Mapping
collections.abc.MutableMapping
collections.abc.Sized

Inherited members

force_open class property ¶

Whether to open the store forcefully.

lazy_open class property ¶

Whether to open the store lazily.

mirror class property ¶

Whether to mirror the store in memory_store.

obj_store class property ¶

Object store.

Completions class ¶

Completions(
    context='',
    chat_history=None,
    stream=None,
    max_tokens=None,
    tokenizer=None,
    tokenizer_kwargs=None,
    system_prompt=None,
    system_as_user=None,
    context_prompt=None,
    formatter=None,
    formatter_kwargs=None,
    minimal_format=None,
    silence_warnings=None,
    template_context=None,
    **kwargs
)

Abstract class for completion providers.

For argument descriptions, see their properties, like Completions.chat_history.

For defaults, see knowledge.chat.completions_config in knowledge.

Superclasses

Inherited members

Subclasses

chat_history class property ¶

Chat history.

Must be list of dictionaries with proper roles.

After generating a response, the output will be appended to this sequence as an assistant message.

context class property ¶

Context.

Becomes a user message.

context_prompt class property ¶

Context prompt.

A prompt template requiring the variable "context". The prompt can be either a custom template, or string or function that will become one. Once the prompt is evaluated, it becomes a user message.

formatter class property ¶

A subclass or an instance of ContentFormatter.

Resolved with resolve_formatter.

formatter_kwargs class property ¶

Keyword arguments passed to Completions.formatter.

Used either to initialize a class or replace an instance of ContentFormatter.

get_chat_response method ¶

Completions.get_chat_response(
    messages,
    **kwargs
)

Get chat response to messages.

get_completion method ¶

Completions.get_completion(
    message,
    return_response=False
)

Get completion for a message.

get_delta_content method ¶

Completions.get_delta_content(
    response
)

Get content from a streaming response chunk.

get_message_content method ¶

Completions.get_message_content(
    response
)

Get content from a chat response.

get_stream_response method ¶

Completions.get_stream_response(
    messages,
    **kwargs
)

Get streaming response to messages.

max_tokens class property ¶

Maximum number of tokens in messages.

max_tokens_set class property ¶

Whether the user provided max_tokens.

minimal_format class property ¶

Whether input is minimally-formatted.

model class property ¶

Model.

prepare_messages method ¶

Completions.prepare_messages(
    message
)

Prepare messages for a completion.

silence_warnings class property ¶

Whether to silence warnings.

stream class property ¶

Whether to stream the response.

When streaming, appends chunks one by one and displays the intermediate result. Otherwise, displays the entire message.

system_as_user class property ¶

Whether to use the user role for the system message.

Mainly for experimental models where the system role is not available.

system_prompt class property ¶

System prompt.

Precedes the context prompt.

template_context class property ¶

Context used to substitute templates.

tokenizer class property ¶

A subclass or an instance of Tokenizer.

Resolved with resolve_tokenizer.

tokenizer_kwargs class property ¶

Keyword arguments passed to Completions.tokenizer.

Used either to initialize a class or replace an instance of Tokenizer.

Contextable class ¶

Contextable()

Abstract class that can be converted into a context.

Superclasses

Inherited members

Subclasses

RankContextable

chat class method ¶

Contextable.chat(
    message,
    chat_history=None,
    *,
    return_chat=False,
    **kwargs
)

Chat with an LLM while using the instance as a context.

Uses Contextable.create_chat and then Completions.get_completion.

Note

Context is recalculated each time this method is invoked. For multiple turns, it's more efficient to use Contextable.create_chat.

Usage

>>> asset.chat("What's the value under 'xyz'?")
The value under 'xyz' is 123.

>>> chat_history = []
>>> asset.chat("What's the value under 'xyz'?", chat_history=chat_history)
The value under 'xyz' is 123.

>>> asset.chat("Are you sure?", chat_history=chat_history)
Yes, I am sure. The value under 'xyz' is 123 for the entry where `s` is "EFG".

count_tokens method ¶

Contextable.count_tokens(
    to_context_kwargs=None,
    tokenizer=None,
    tokenizer_kwargs=None
)

Count the number of tokens in the context.

create_chat method ¶

Contextable.create_chat(
    to_context_kwargs=None,
    completions=None,
    **kwargs
)

Create a chat by returning an instance of Completions.

Uses Contextable.to_context to turn this instance to a context.

Usage

>>> chat = asset.create_chat()

>>> chat.get_completion("What's the value under 'xyz'?")
The value under 'xyz' is 123.

>>> chat.get_completion("Are you sure?")
Yes, I am sure. The value under 'xyz' is 123 for the entry where `s` is "EFG".

to_context method ¶

Contextable.to_context(
    *args,
    **kwargs
)

Convert to a context.

DictStore class ¶

DictStore(
    **kwargs
)

Store class based on a dictionary.

For defaults, see chat.obj_store_configs.memory in knowledge.

Superclasses

Base
Cacheable
Chainable
Comparable
Configured
HasSettings
ObjectStore
Pickleable
Prettified
collections.abc.Collection
collections.abc.Container
collections.abc.Iterable
collections.abc.Mapping
collections.abc.MutableMapping
collections.abc.Sized

Inherited members

Subclasses

store class property ¶

Store dictionary.

DocumentRanker class ¶

DocumentRanker(
    dataset_id=None,
    embeddings=None,
    embeddings_kwargs=None,
    doc_store=None,
    doc_store_kwargs=None,
    cache_doc_store=None,
    emb_store=None,
    emb_store_kwargs=None,
    cache_emb_store=None,
    score_func=None,
    score_agg_func=None,
    show_progress=None,
    pbar_kwargs=None,
    template_context=None,
    **kwargs
)

Class for embedding, scoring, and ranking documents.

For defaults, see knowledge.chat.doc_ranker_config in knowledge.

Superclasses

Inherited members

compute_score method ¶

DocumentRanker.compute_score(
    emb1,
    emb2
)

Compute scores between embeddings, which can be either single or multiple.

Supported distance functions are 'cosine', 'euclidean', and 'dot'. A metric can also be a callable that should take two and return one 2-dim NumPy array.

doc_store class property ¶

An instance of ObjectStore for documents.

emb_store class property ¶

An instance of ObjectStore for embeddings.

embed_documents method ¶

DocumentRanker.embed_documents(
    documents,
    refresh=False,
    refresh_documents=None,
    refresh_embeddings=None,
    return_embeddings=False,
    return_documents=False
)

Embed documents.

Enable refresh or its sub-arguments to refresh documents and/or embeddings in their particular stores. Without refreshing, will rely on the persisted objects.

If return_embeddings and return_documents are both False, returns nothing. If return_embeddings and return_documents are both True, for each document, returns the document and either an embedding or a list of document chunks and their embeddings. If return_documents is False, returns only embeddings.

embeddings class property ¶

An instance of Embeddings.

pbar_kwargs class property ¶

Keyword arguments passed to ProgressBar.

rank_documents method ¶

DocumentRanker.rank_documents(
    query,
    documents=None,
    top_k=None,
    min_top_k=None,
    max_top_k=None,
    cutoff=None,
    refresh=False,
    refresh_documents=None,
    refresh_embeddings=None,
    return_chunks=False,
    return_scores=False
)

Sort documents by relevance to a query.

Top-k, minimum top-k, and maximum top-k are resolved with DocumentRanker.resolve_top_k. Score cutoff is converted into top-k with DocumentRanker.top_k_from_cutoff. Minimum and maximum top-k are used to override non-integer top-k and cutoff; it has no effect on the integer top-k, which can be outside the top-k bounds and won't be overridden.

resolve_top_k class method ¶

DocumentRanker.resolve_top_k(
    scores,
    top_k=None
)

Resolve top_k based on sorted scores.

Supported values are integers (top number), floats (top %), strings (supported methods are 'elbow' and 'kmeans'), as well as callables that should take a 1-dim NumPy array and return an integer or a float. Filters out NaN before computation (requires them to be at the tail).

score_agg_func class property ¶

Score aggregation function.

score_documents method ¶

DocumentRanker.score_documents(
    query,
    documents=None,
    refresh=False,
    refresh_documents=None,
    refresh_embeddings=None,
    return_chunks=False,
    return_documents=False
)

Score documents by relevance to a query.

score_func class property ¶

Score function.

See DocumentRanker.compute_score.

show_progress class property ¶

Whether to show progress bar.

template_context class property ¶

Context used to substitute templates.

top_k_from_cutoff class method ¶

DocumentRanker.top_k_from_cutoff(
    scores,
    cutoff=None
)

Get top_k from cutoff based on sorted scores.

EmbeddedDocument class ¶

EmbeddedDocument(
    *args,
    **kwargs
)

Abstract class for embedded documents.

Superclasses

Inherited members

child_documents field ¶

Embedded child documents.

document field ¶

Document.

embedding field ¶

Embedding.

Embeddings class ¶

Embeddings(
    batch_size=None,
    show_progress=None,
    pbar_kwargs=None,
    template_context=None,
    **kwargs
)

Abstract class for embedding providers.

For defaults, see knowledge.chat.embeddings_config in knowledge.

Superclasses

Inherited members

Subclasses

batch_size class property ¶

Batch size.

Set to None to disable batching.

get_embedding method ¶

Embeddings.get_embedding(
    query
)

Get embedding for a query.

get_embedding_batch method ¶

Embeddings.get_embedding_batch(
    batch
)

Get embeddings for one batch of queries.

get_embeddings method ¶

Embeddings.get_embeddings(
    queries
)

Get embeddings for multiple queries.

iter_embedding_batches method ¶

Embeddings.iter_embedding_batches(
    queries
)

Get iterator of embedding batches.

model class property ¶

Model.

pbar_kwargs class property ¶

Keyword arguments passed to ProgressBar.

show_progress class property ¶

Whether to show progress bar.

template_context class property ¶

Context used to substitute templates.

FileStore class ¶

FileStore(
    dir_path=None,
    compression=None,
    save_kwargs=None,
    load_kwargs=None,
    use_patching=None,
    consolidate=None,
    **kwargs
)

Store class based on files.

Either commits changes to a single file (with index id being the file name), or commits the initial changes to the base file and any other change to patch file(s) (with index id being the directory name).

For defaults, see chat.obj_store_configs.file in knowledge.

Superclasses

Base
Cacheable
Chainable
Comparable
Configured
DictStore
HasSettings
ObjectStore
Pickleable
Prettified
collections.abc.Collection
collections.abc.Container
collections.abc.Iterable
collections.abc.Mapping
collections.abc.MutableMapping
collections.abc.Sized

Inherited members

compression class property ¶

Compression.

consolidate class property ¶

Whether to consolidate patch files.

dir_path class property ¶

Path to the directory.

get_next_patch_path method ¶

FileStore.get_next_patch_path()

Get path to the next patch file to be saved.

load_kwargs class property ¶

Keyword arguments passed to load.

new_keys class property ¶

Keys that haven't been added to the store.

reset_state method ¶

FileStore.reset_state()

Reset state.

save_kwargs class property ¶

Keyword arguments passed to save.

store_changes class property ¶

Store with new or modified objects only.

store_path class property ¶

Path to the directory with patch files or a single file.

use_patching class property ¶

Whether to use directory with patch files or create a single file.

LMDBStore class ¶

LMDBStore(
    dir_path=None,
    mkdir_kwargs=None,
    dumps_kwargs=None,
    loads_kwargs=None,
    **kwargs
)

Store class based on LMDB (Lightning Memory-Mapped Database).

Uses lmdbm package.

For defaults, see chat.obj_store_configs.lmdb in knowledge.

Superclasses

Base
Cacheable
Chainable
Comparable
Configured
HasSettings
ObjectStore
Pickleable
Prettified
collections.abc.Collection
collections.abc.Container
collections.abc.Iterable
collections.abc.Mapping
collections.abc.MutableMapping
collections.abc.Sized

Inherited members

db class property ¶

Database.

db_path class property ¶

Path to the database.

decode method ¶

LMDBStore.decode(
    bytes_
)

Decode an object.

dir_path class property ¶

Path to the directory.

dumps_kwargs class property ¶

Keyword arguments passed to dumps.

encode method ¶

LMDBStore.encode(
    obj
)

Encode an object.

loads_kwargs class property ¶

Keyword arguments passed to loads.

mkdir_kwargs class property ¶

Keyword arguments passed to check_mkdir.

open_kwargs class property ¶

Keyword arguments passed to lmdbm.lmdbm.Lmdb.open.

LiteLLMCompletions class ¶

LiteLLMCompletions(
    context='',
    chat_history=None,
    stream=None,
    max_tokens=None,
    tokenizer=None,
    tokenizer_kwargs=None,
    system_prompt=None,
    system_as_user=None,
    context_prompt=None,
    formatter=None,
    formatter_kwargs=None,
    silence_warnings=None,
    template_context=None,
    model=None,
    **kwargs
)

Completions class for LiteLLM.

Keyword arguments are passed to the completion call.

For defaults, see chat.completions_configs.litellm in knowledge.

Superclasses

Inherited members

completion_kwargs class property ¶

Keyword arguments passed to litellm.completion.

LiteLLMEmbeddings class ¶

LiteLLMEmbeddings(
    model=None,
    batch_size=None,
    show_progress=None,
    pbar_kwargs=None,
    template_context=None,
    **kwargs
)

Embeddings class for LiteLLM.

For defaults, see chat.embeddings_configs.litellm in knowledge.

Superclasses

Inherited members

embedding_kwargs class property ¶

Keyword arguments passed to litellm.embedding.

LlamaIndexCompletions class ¶

LlamaIndexCompletions(
    context='',
    chat_history=None,
    stream=None,
    max_tokens=None,
    tokenizer=None,
    tokenizer_kwargs=None,
    system_prompt=None,
    system_as_user=None,
    context_prompt=None,
    formatter=None,
    formatter_kwargs=None,
    silence_warnings=None,
    template_context=None,
    llm=None,
    **kwargs
)

Completions class for LlamaIndex.

LLM can be provided via llm, which can be either the name of the class (case doesn't matter), the path or its suffix to the class (case matters), or a subclass or an instance of llama_index.core.llms.LLM.

Keyword arguments are passed to the resolved LLM.

For defaults, see chat.completions_configs.llama_index in knowledge.

Superclasses

Inherited members

llm class property ¶

LLM.

LlamaIndexEmbeddings class ¶

LlamaIndexEmbeddings(
    embedding=None,
    batch_size=None,
    show_progress=None,
    pbar_kwargs=None,
    template_context=None,
    **kwargs
)

Embeddings class for LlamaIndex.

For defaults, see chat.embeddings_configs.llama_index in knowledge.

Superclasses

Inherited members

embedding class property ¶

Embedding.

LlamaIndexSplitter class ¶

LlamaIndexSplitter(
    node_parser=None,
    template_context=None,
    **kwargs
)

Splitter class based on a node parser from LlamaIndex.

For defaults, see chat.text_splitter_configs.llama_index in knowledge.

Superclasses

Inherited members

node_parser class property ¶

An instance of llama_index.core.node_parser.interface.NodeParser.

MemoryStore class ¶

MemoryStore(
    **kwargs
)

Store class based in memory.

Commits changes to memory_store.

For defaults, see chat.obj_store_configs.memory in knowledge.

Superclasses

Base
Cacheable
Chainable
Comparable
Configured
DictStore
HasSettings
ObjectStore
Pickleable
Prettified
collections.abc.Collection
collections.abc.Container
collections.abc.Iterable
collections.abc.Mapping
collections.abc.MutableMapping
collections.abc.Sized

Inherited members

store_exists method ¶

MemoryStore.store_exists()

Whether store exists.

MetaObjectStore class ¶

MetaObjectStore(
    name,
    bases,
    attrs
)

Metaclass for ObjectStore.

Superclasses

MetaConfigured
abc.ABCMeta
builtins.type

ObjectStore class ¶

ObjectStore(
    store_id=None,
    purge_on_open=None,
    template_context=None,
    **kwargs
)

Abstract class for managing an object store.

For defaults, see knowledge.chat.obj_store_config in knowledge.

Superclasses

Base
Cacheable
Chainable
Comparable
Configured
HasSettings
Pickleable
Prettified
collections.abc.Collection
collections.abc.Container
collections.abc.Iterable
collections.abc.Mapping
collections.abc.MutableMapping
collections.abc.Sized

Inherited members

Subclasses

check_opened method ¶

ObjectStore.check_opened()

Check the store is opened.

close method ¶

ObjectStore.close()

Close the store.

commit method ¶

ObjectStore.commit()

Commit changes.

enter_calls class property ¶

Number of enter calls.

mirror_store_id class property ¶

Mirror store id.

open method ¶

ObjectStore.open()

Open the store.

opened class property ¶

Whether the store has been opened.

purge method ¶

ObjectStore.purge()

Purge the store.

purge_on_open class property ¶

Whether to purge on open.

store_id class property ¶

Store id.

template_context class property ¶

Context used to substitute templates.

OpenAICompletions class ¶

OpenAICompletions(
    context='',
    chat_history=None,
    stream=None,
    max_tokens=None,
    tokenizer=None,
    tokenizer_kwargs=None,
    system_prompt=None,
    system_as_user=None,
    context_prompt=None,
    formatter=None,
    formatter_kwargs=None,
    silence_warnings=None,
    template_context=None,
    model=None,
    **kwargs
)

Completions class for OpenAI.

Keyword arguments are distributed between the client call and the completion call.

For defaults, see chat.completions_configs.openai in knowledge.

Superclasses

Inherited members

client class property ¶

Client.

completion_kwargs class property ¶

Keyword arguments passed to openai.resources.chat.completions_configs.Completions.create.

OpenAIEmbeddings class ¶

OpenAIEmbeddings(
    model=None,
    batch_size=None,
    show_progress=None,
    pbar_kwargs=None,
    template_context=None,
    **kwargs
)

Embeddings class for OpenAI.

For defaults, see chat.embeddings_configs.openai in knowledge.

Superclasses

Inherited members

client class property ¶

Client.

embeddings_kwargs class property ¶

Keyword arguments passed to openai.resources.embeddings.Embeddings.create.

RankContextable class ¶

RankContextable()

Abstract class that combines both Rankable and Contextable to rank a context.

Superclasses

Inherited members

Subclasses

KnowledgeAsset

chat class method ¶

RankContextable.chat(
    message,
    chat_history=None,
    *,
    incl_past_queries=None,
    rank=None,
    top_k=None,
    min_top_k=None,
    max_top_k=None,
    cutoff=None,
    return_chunks=None,
    rank_kwargs=None,
    **kwargs
)

See Contextable.chat.

If rank is True, or rank is None and any of top_k, min_top_k, max_top_k, cutoff, or return_chunks is set, will rank the documents with Rankable.rank first.

Rankable class ¶

Rankable()

Abstract class that can be ranked.

Superclasses

Inherited members

Subclasses

RankContextable

embed method ¶

Rankable.embed(
    refresh=False,
    refresh_documents=None,
    refresh_embeddings=None,
    return_embeddings=False,
    return_documents=False,
    **kwargs
)

Embed documents.

rank method ¶

Rankable.rank(
    query,
    top_k=None,
    min_top_k=None,
    max_top_k=None,
    cutoff=None,
    refresh=False,
    refresh_documents=None,
    refresh_embeddings=None,
    return_chunks=False,
    return_scores=False,
    **kwargs
)

Rank documents by their relevance to a query.

ScoredDocument class ¶

ScoredDocument(
    *args,
    **kwargs
)

Abstract class for scored documents.

Superclasses

Inherited members

child_documents field ¶

Scored child documents.

document field ¶

Document.

score field ¶

Score.

SegmentSplitter class ¶

SegmentSplitter(
    separators=None,
    min_chunk_size=None,
    fixed_overlap=None,
    **kwargs
)

Splitter class for segments based on separators.

If a segment is too big, the next separator within the same layer is taken to split the segment into smaller segments. If a segment is too big and there are no segments previously added to the chunk, or, if the number of tokens is less than the minimal count, the next layer is taken. To split into tokens, set any separator to None. To split into characters, use an empty string.

For defaults, see chat.text_splitter_configs.segment in knowledge.

Superclasses

Inherited members

fixed_overlap class property ¶

Whether overlap should be fixed.

min_chunk_size class property ¶

Minimum number of tokens per chunk.

Can also be provided as a floating number relative to SegmentSplitter.chunk_size.

separators class property ¶

Nested list of separators grouped into layers.

split_into_segments method ¶

SegmentSplitter.split_into_segments(
    text,
    separator=None
)

Split text into segments.

StoreDocument class ¶

StoreDocument(
    *args,
    **kwargs
)

Abstract class for documents to be stored.

Superclasses

Inherited members

Subclasses

TextDocument

data field ¶

Data.

from_data class method ¶

StoreDocument.from_data(
    data,
    id_=None,
    **kwargs
)

Create an instance of StoreDocument from data.

get_content method ¶

StoreDocument.get_content(
    for_embed=False
)

Get content.

Returns None if there's no content.

id_from_data class method ¶

StoreDocument.id_from_data(
    data
)

Generate a unique identifier from data.

split method ¶

StoreDocument.split()

Split document into multiple documents.

template_context field ¶

Context used to substitute templates.

StoreEmbedding class ¶

StoreEmbedding(
    *args,
    **kwargs
)

Class for embeddings to be stored.

Superclasses

Inherited members

child_ids field ¶

Child object identifiers.

embedding field ¶

Embedding.

parent_id field ¶

Parent object identifier.

StoreObject class ¶

StoreObject(
    *args,
    **kwargs
)

Class for objects to be managed by a store.

Superclasses

Inherited members

Subclasses

id_ field ¶

Object identifier.

TextDocument class ¶

TextDocument(
    *args,
    **kwargs
)

Class for text documents.

Superclasses

Inherited members

content_template field ¶

Content template.

Must be suitable for formatting via the format() method.

dump_kwargs field ¶

Keyword arguments passed to dump.

excl_embed_metadata field ¶

Whether to exclude metadata and which fields to exclude for embeddings.

If None, becomes TextDocument.excl_metadata.

excl_metadata field ¶

Whether to exclude metadata and which fields to exclude.

If False, metadata becomes everything except text.

get_metadata method ¶

TextDocument.get_metadata(
    for_embed=False
)

Get metadata.

Returns None if no metadata.

get_metadata_content method ¶

TextDocument.get_metadata_content(
    for_embed=False
)

Get metadata content.

Returns None if no metadata.

get_text method ¶

TextDocument.get_text()

Get text.

Returns None if no text.

metadata_template field ¶

Metadata template.

Must be suitable for formatting via the format() method.

skip_missing field ¶

Set missing text or metadata to None rather than raise an error.

split_text_kwargs field ¶

Keyword arguments passed to split_text.

text_path field ¶

Path to the text field.

TextSplitter class ¶

TextSplitter(
    chunk_template=None,
    template_context=None,
    **kwargs
)

Abstract class for text splitters.

For defaults, see knowledge.chat.text_splitter_config in knowledge.

Superclasses

Inherited members

Subclasses

chunk_template class property ¶

Chunk template.

Can use the following context: chunk_idx, chunk_start, chunk_end, chunk_text, and text.

split method ¶

TextSplitter.split(
    text
)

Split text and yield start character and end character position of each chunk.

split_text method ¶

TextSplitter.split_text(
    text
)

Split text and return text chunks.

template_context class property ¶

Context used to substitute templates.

TikTokenizer class ¶

TikTokenizer(
    encoding=None,
    model=None,
    tokens_per_message=None,
    tokens_per_name=None,
    **kwargs
)

Tokenizer class for tiktoken.

Encoding can be a model name, an encoding name, or an encoding object for tokenization.

For defaults, see chat.tokenizer_configs.tiktoken in knowledge.

Superclasses

Inherited members

encoding class property ¶

Encoding.

tokens_per_message class property ¶

Tokens per message.

tokens_per_name class property ¶

Tokens per name.

TokenSplitter class ¶

TokenSplitter(
    chunk_size=None,
    chunk_overlap=None,
    tokenizer=None,
    tokenizer_kwargs=None,
    **kwargs
)

Splitter class for tokens.

For defaults, see chat.text_splitter_configs.token in knowledge.

Superclasses

Inherited members

Subclasses

SegmentSplitter

chunk_overlap class property ¶

Number of overlapping tokens between chunks.

Can also be provided as a floating number relative to SegmentSplitter.chunk_size.

chunk_size class property ¶

Maximum number of tokens per chunk.

split_into_tokens method ¶

TokenSplitter.split_into_tokens(
    text
)

Split text into tokens.

tokenizer class property ¶

An instance of Tokenizer.

Tokenizer class ¶

Tokenizer(
    template_context=None,
    **kwargs
)

Abstract class for tokenizers.

For defaults, see knowledge.chat.tokenizer_config in knowledge.

Superclasses

Inherited members

Subclasses

TikTokenizer

count_tokens method ¶

Tokenizer.count_tokens(
    text
)

Count tokens in a text.

count_tokens_in_messages method ¶

Tokenizer.count_tokens_in_messages(
    messages
)

Count tokens in messages.

decode method ¶

Tokenizer.decode(
    tokens
)

Decode a list of tokens into text.

decode_single method ¶

Tokenizer.decode_single(
    token
)

Decode a single token into text.

encode method ¶

Tokenizer.encode(
    text
)

Encode text into a list of tokens.

encode_single method ¶

Tokenizer.encode_single(
    text
)

Encode text into a single token.

template_context class property ¶

Context used to substitute templates.

chatting module¶

memory_store dict¶

complete function¶

def_metadata_template function¶

detokenize function¶

embed function¶

embed_documents function¶

rank_documents function¶

resolve_completions function¶

resolve_embeddings function¶

resolve_obj_store function¶

resolve_text_splitter function¶

resolve_tokenizer function¶

split_text function¶

tokenize function¶

CachedStore class¶

force_open class property¶

lazy_open class property¶

mirror class property¶

obj_store class property¶

Completions class¶

chat_history class property¶

context class property¶

context_prompt class property¶

formatter class property¶

formatter_kwargs class property¶

get_chat_response method¶

get_completion method¶

get_delta_content method¶

get_message_content method¶

get_stream_response method¶

max_tokens class property¶

max_tokens_set class property¶

minimal_format class property¶

model class property¶

prepare_messages method¶

silence_warnings class property¶

stream class property¶

system_as_user class property¶

system_prompt class property¶

template_context class property¶

tokenizer class property¶

tokenizer_kwargs class property¶

Contextable class¶

chat class method¶

count_tokens method¶

create_chat method¶

to_context method¶

DictStore class¶

store class property¶

DocumentRanker class¶

compute_score method¶

doc_store class property¶

emb_store class property¶

embed_documents method¶

embeddings class property¶

pbar_kwargs class property¶

rank_documents method¶

resolve_top_k class method¶

score_agg_func class property¶

score_documents method¶

score_func class property¶

show_progress class property¶

template_context class property¶

top_k_from_cutoff class method¶

EmbeddedDocument class¶

child_documents field¶

document field¶

embedding field¶

Embeddings class¶

batch_size class property¶

get_embedding method¶

get_embedding_batch method¶

get_embeddings method¶

iter_embedding_batches method¶

model class property¶

pbar_kwargs class property¶

show_progress class property¶

template_context class property¶

FileStore class¶

chatting module ¶

memory_store dict ¶

complete function ¶

def_metadata_template function ¶

detokenize function ¶

embed function ¶

embed_documents function ¶

rank_documents function ¶

resolve_completions function ¶

resolve_embeddings function ¶

resolve_obj_store function ¶

resolve_text_splitter function ¶

resolve_tokenizer function ¶

split_text function ¶

tokenize function ¶

CachedStore class ¶

force_open class property ¶

lazy_open class property ¶

mirror class property ¶

obj_store class property ¶

Completions class ¶

chat_history class property ¶

context class property ¶

context_prompt class property ¶

formatter class property ¶

formatter_kwargs class property ¶

get_chat_response method ¶

get_completion method ¶

get_delta_content method ¶

get_message_content method ¶

get_stream_response method ¶

max_tokens class property ¶

max_tokens_set class property ¶

minimal_format class property ¶

model class property ¶

prepare_messages method ¶

silence_warnings class property ¶

stream class property ¶

system_as_user class property ¶

system_prompt class property ¶

template_context class property ¶

tokenizer class property ¶

tokenizer_kwargs class property ¶

Contextable class ¶

chat class method ¶

count_tokens method ¶

create_chat method ¶

to_context method ¶

DictStore class ¶

store class property ¶

DocumentRanker class ¶

compute_score method ¶

doc_store class property ¶

emb_store class property ¶

embed_documents method ¶

embeddings class property ¶

pbar_kwargs class property ¶

rank_documents method ¶

resolve_top_k class method ¶

score_agg_func class property ¶

score_documents method ¶

score_func class property ¶

show_progress class property ¶

template_context class property ¶

top_k_from_cutoff class method ¶

EmbeddedDocument class ¶

child_documents field ¶

document field ¶

embedding field ¶

Embeddings class ¶

batch_size class property ¶

get_embedding method ¶

get_embedding_batch method ¶

get_embeddings method ¶

iter_embedding_batches method ¶

model class property ¶

pbar_kwargs class property ¶

show_progress class property ¶

template_context class property ¶

FileStore class ¶