Set training data when creating AI characters (consumer app)
planned
Victor R
For example, these different methods:
- Text file(s) of data: Currently it is a long process if a user wants to provide multiple examples of training data, since each one has to be copied and pasted into the User/Response fields. The file can be plain text data or example user/response conversations.
- CSV file: The file can contain examples of data for the AI character to be trained on. Again, this can be plain text data or example user/response conversations.
- PDF file(s) of data: Another method is PDF file(s) containing the data you want to train the AI character on.
- URL link(s): The user can provide a link to a site that has data related to the AI character.
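As a rough illustration of the CSV option above, here is a minimal Python sketch of turning a CSV of example conversations into user/response pairs. The column names `user` and `response`, the function name `load_examples`, and the sample rows are assumptions made for this example; the thread does not specify a file layout, and this is not how TypingMind handles imports.

```python
import csv
import io


def load_examples(csv_text):
    """Parse CSV text with 'user' and 'response' columns into example pairs.

    The column names are hypothetical; rows missing either value are skipped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    examples = []
    for row in reader:
        user = (row.get("user") or "").strip()
        response = (row.get("response") or "").strip()
        if user and response:
            examples.append({"user": user, "response": response})
    return examples


# Made-up sample data, standing in for an uploaded file.
sample = (
    "user,response\n"
    "Who are you?,I am Captain Ada.\n"
    "Where do you sail?,Mostly the North Sea.\n"
)
examples = load_examples(sample)
```

Each resulting pair could then fill one User/Response field, which would avoid the copy-and-paste step described in the first bullet.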
Ali
RAG please. I know you can upload files, but that's not ideal for larger files. We need a RAG (retrieval-augmented generation) implementation. Ideally, it would keep the files local, since they can be large, and allow you to use OpenAI's embedding models. All done through TypingMind's UI. It could just use the API keys from OpenAI that are already in
Currently, TypingMind supports RAG, but only through a complicated API. It should all be done locally using the TypingMind interface.
It shouldn't just pass the whole document as input. That would consume a lot of tokens.
Jack Morgan
Not sure if this is possible, but it would be helpful if this could also "chunk" files in the way ChatGPT does when you upload data for custom GPTs. This would allow the upload of documents that exceed the given model's context window. I frequently work with large files (e.g. multiple PDFs with 1500 pages), and run into context limit problems.
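For readers unfamiliar with the term, chunking can be sketched in a few lines of Python. This is a generic fixed-size chunker with overlap, not a description of how ChatGPT or TypingMind actually splits files; the function name `chunk_text` and the default sizes are assumptions chosen for illustration.

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split text into pieces of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a boundary still appears whole in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


pieces = chunk_text("abcdefghij", chunk_size=4, overlap=1)
```

Because each chunk is small, only the few relevant chunks need to be sent to the model, so the full document never has to fit in the context window.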
Tony Dinh
Merged in a post:
Document library
Dan O'Leary
Allow users to upload multiple documents. Organize those in hierarchical folder structure so prompts can refer to various files / folders or branches.
Tony Dinh
planned
Tony Dinh
complete
This is now available.
Victor R
Tony Dinh: Hi, I don't see this in the individual version of TM. This feature request was for the individual version of TM.
Tony Dinh
Victor R: click the upload button
Victor R
Tony Dinh: Yes, I have seen this; however, it is for chatting with a document. I was looking for an "upload document" feature under the AI character training section.
Tony Dinh
Victor R: Ah, I see. I'll put this back to Planned. Thanks for clarifying.
Victor R
Tony Dinh: Great, thank you.
Dan O'Leary
To be clear, this request assumes the creation of a retrieval-augmentation method for prompts to deal with context limits. I believe the current upload-a-doc/text feature just tries to attach that content directly to the prompt, which easily blows past context limits. Typically this is done by building a library of embeddings and doing similarity search on those embeddings. I'm confident that you already realize this, but I wanted to clarify my intent.
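A minimal Python sketch of the embeddings-plus-similarity-search idea described above. To stay self-contained it uses a toy word-count vector as the "embedding"; a real setup would call an embedding model instead. The names `embed`, `cosine`, and `top_k` are hypothetical and do not come from TypingMind.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

Only the chunks returned by `top_k` would be attached to the prompt, rather than the whole document, which is what keeps the request within the context limit.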
Dan O'Leary
DERP. I just realized that TM Custom already has the bits in place to do exactly this. Would love to see it in the non-custom version as well, eventually.
Tony Dinh
Thanks for the feature request! Does this refer to the current document upload feature? So you want to have a place where you can manage all previously uploaded documents, correct?
Dan O'Leary
Tony Dinh: An extension of the current feature, yes. Rather than uploading docs for a single chat, manage a library of them that can be referenced in any future chat. Build embeddings once, then select parts of the library to use to augment the prompt.
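The "build embeddings once, then select parts of the library" idea could look roughly like the Python sketch below. Everything here is hypothetical: the `DocumentLibrary` class, the folder-prefix selection, and the word-count `embed` function (a placeholder for a real embedding model) are assumptions for illustration, not an existing TypingMind API.

```python
import math
import re
from collections import Counter


def embed(text):
    """Placeholder embedding: word counts instead of a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class DocumentLibrary:
    """Stores documents by path and computes each embedding once, on add."""

    def __init__(self):
        self._entries = {}  # path -> (text, embedding)

    def add(self, path, text):
        self._entries[path] = (text, embed(text))

    def search(self, query, folder="", k=3):
        """Rank documents under `folder` (a path prefix) against the query."""
        q = embed(query)
        scored = [
            (cosine(q, vec), path)
            for path, (_text, vec) in self._entries.items()
            if path.startswith(folder)
        ]
        scored.sort(reverse=True)
        return [path for _score, path in scored[:k]]


lib = DocumentLibrary()
lib.add("legal/contract.txt", "The contract ends in March.")
lib.add("legal/nda.txt", "The NDA covers all source code.")
lib.add("recipes/soup.txt", "The contract for soup is salt.")
hits = lib.search("When does the contract end?", folder="legal/", k=1)
```

Restricting the search to a folder prefix is one simple way to let a chat use only part of the library; the stored embeddings are reused across chats instead of being rebuilt per upload.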
Damon Mota
Tony Dinh: Zoho SalesIQ does something similar. You can add all the training data to the FAQs or to the Ikb.
Sam Kennedy
It would be great if the training data could be organized into collections and selected from them. The user could pick one kind of "training data", or none at all. Right now it's an all-or-nothing, very inflexible arrangement. Another approach: what if you could add training data to the AI Characters feature?