Conversation mode / Hands-free voice input
planned
Ásgeir Thor
I'm curious why this hasn't been implemented almost anywhere except the ChatGPT app and the Pi.ai app: a real two-way voice "call" feature that is pretty fast. Is it that hard to build, or is it down to privacy concerns or limited support for this kind of thing in the OSes we all use? For example, pi.ai only supports this on mobile afaik, and ChatGPT only supported it on mobile until very recently. It's on macOS now too, and it's the only app that has such a feature "out of the box", so to speak.
However, this should be very doable, as this extension does it magnificently: https://chromewebstore.google.com/detail/say-pi/glhhgglpalmjjkoiigojligncepccdei
It turns the pi.ai interface into what is basically a voice chat, but with a MUCH better whisperai implementation (and probably with model prompts etc., since it understands you SO well, I think even better than the ChatGPT transcription button and voice feature). For some reason it is free to use, and the kicker: it's available as a Safari extension on iOS, so it feels like a total cheat code in the current app environment. BUT sadly it only works with pi.ai! If you could use it on any site it would be amazing... especially typingmind.com.
Another workaround is the WhisperAI extension: https://chromewebstore.google.com/detail/whisperai-ai-driven-speec/klhcnkknganbneegjihbcfjoifiomhfn?hl=en It supports auto-enter when your prompt is complete; the only flaw is that you have to start and stop the voice input manually. It also lets you make a voice prompt and process the whisperai transcription etc., similar to the superwhisper app. You can use it on literally every site. Now if only these apps would join forces and become one app, we'd have all the bases covered and voice chat everywhere.
Dude, just implement this in typingmind, PLEASE, it must be possible (and please allow us to set a voice for whisperai STT like you allow with the STT model).
J E
This would be huge. One of the main reasons I can’t use typingmind as my main platform.
Ásgeir Thor
J E Can I ask, is there any other platform that allows this?
J E
Ásgeir Thor I have only seen it in ChatGPT. Perplexity might offer it also now but I’m not positive on that.
Ngoc Nguyen
Andy Laken
Some valuable use cases: nearly anything where the user has no hands free. Cooking, doing mechanical work, carpentry, caring for small children, etc. And I'm no accessibility expert, but the benefits for motor- and sight-impaired users are obvious, I'd imagine.
As an iPhone user, I find the voice conversation mode of ChatGPT's app impressive, but I'm not sure how much of a technical hurdle that would be for TM. I assume they're benefiting from native frameworks (e.g. SwiftUI or UIKit) that might be hard to duplicate with a PWA. Off the top of my head, I wonder if coding in some Siri intents could be helpful here? "hey siri using typingmind ask Claude3 what b-screw settings are needed for a 42T large cog on Shimano RD-GX810 derailleur"
Ngoc Nguyen
Merged in a post:
Suggestion for faster text-to-speech.
Guile
Currently, the feature waits until the entire message is shared before reading it aloud. I propose implementing a system that breaks down the text into manageable chunks, allowing the text-to-speech to begin reading as soon as each chunk is available, similar to ChatWithGPT.
This enhancement could:
- Improve user engagement through immediate auditory feedback
- Increase accessibility for users who rely on auditory processing
- Give TypingMind a competitive edge over other platforms
I believe this feature would greatly enhance the user experience on TypingMind. Thank you for considering my suggestion!
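For illustration, the chunking idea above could look roughly like the sketch below. It is not how TypingMind or ChatWithGPT actually do it; the function name `speakable_chunks`, the `min_chars` parameter, and the sentence-boundary rule (punctuation followed by whitespace) are all assumptions made for the example. It buffers streamed text and releases a chunk to the TTS engine as soon as a full sentence is available.

```python
import re
from typing import Iterable, Iterator

# Assumed boundary rule: ".", "!" or "?" followed by whitespace ends a sentence.
_SENTENCE_END = re.compile(r"[.!?]\s+")


def speakable_chunks(token_stream: Iterable[str], min_chars: int = 10) -> Iterator[str]:
    """Yield sentence-sized chunks from a stream of text fragments.

    Chunks shorter than min_chars are merged with the following sentence
    so the TTS engine is not called for tiny fragments like "Hi.".
    """
    buffer = ""
    for token in token_stream:
        buffer += token
        while True:
            # Only accept a boundary that leaves at least min_chars in the chunk.
            match = _SENTENCE_END.search(buffer, max(min_chars - 1, 0))
            if match is None:
                break
            yield buffer[: match.start() + 1]  # keep the punctuation mark
            buffer = buffer[match.end():]      # drop the separating whitespace
    # Flush whatever is left once the stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    stream = ["Hello there. ", "This is ", "a longer sentence. And", " more"]
    for chunk in speakable_chunks(stream):
        print(repr(chunk))  # each chunk would be handed to TTS here
```

Each yielded chunk can be sent to the TTS engine while the rest of the response is still arriving, which is what makes the reply start sooner.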
Tony Dinh
Merged in a post:
Additional options to manage voice control
Shane
Some suggested additional options. The goal is to reduce the need for mouse input when using voice input:
- allow an additional option to terminate a voice input by saying a user-definable keyword, e.g. "send"
- same, but with a keyword to clear the message
- allow pressing a key (e.g. a hotkey) to turn the voice control on/off
- press and hold a key for input (AKA push-to-talk mode)
- option to send the message when the key is released
- option for switching off the microphone while hearing TTS responses from TypingMind
- option for switching off voice input when switching to another tab/window
- additional options for auto-sending the message after speaking (this doesn't work for me currently... could have an option for threshold and/or length of silence)
===
These options would also allow a redesign of the current voice input control. The suggestion would be to have an options dialogue (which allows file upload) but have voice input done live, i.e. rather than a separate dialogue box, a live input is indicated by the microphone icon turning red, and it can be started (and stopped) by clicking on the icon, by using the hotkey, or by holding the push-to-talk key.
Put together, these would allow a recreation of the ChatGPT mobile voice mode, i.e. a hands-free dialogue with an LLM. One change to make this work better would be to start TTS of the LLM response before the end of the message.
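As a rough sketch of the keyword options suggested in this post (not an existing TypingMind feature), a transcript could be checked for a trailing command word before deciding what to do with it. The function name `interpret_transcript`, the action labels, and the default words "send" and "clear" are hypothetical, chosen only for this example.

```python
from typing import Tuple


def interpret_transcript(
    transcript: str, send_word: str = "send", clear_word: str = "clear"
) -> Tuple[str, str]:
    """Return (action, text) for a transcript that may end in a command word.

    action is "send", "clear", or "continue" (keep listening).
    """
    words = transcript.strip().split()
    if not words:
        return ("continue", "")
    # Speech-to-text often appends punctuation, so strip it before comparing.
    last = words[-1].strip(".,!?").lower()
    if last == send_word.lower():
        return ("send", " ".join(words[:-1]))  # message without the keyword
    if last == clear_word.lower():
        return ("clear", "")
    return ("continue", " ".join(words))


if __name__ == "__main__":
    print(interpret_transcript("what is the weather today send"))
    print(interpret_transcript("no wait, clear."))
    print(interpret_transcript("still talking"))
```

Matching only the last word keeps a sentence like "send me the report" from being submitted by accident, though a real implementation would probably need a more distinctive keyword or a short pause before it.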
Tony Dinh
planned
Ngoc Nguyen
Merged in a post:
automatically start and stop voice functionality
addointernationals
Any chance we can have a feature to automatically start and stop voice recording when the user is done speaking? See requirements below.
1) A new setting is added to enable automatic voice recognition.
2) If this setting is enabled, the system will recognize the user's voice, automatically start typing the user's input from speech, and automatically submit it to the AI when the user is done speaking.
- If this setting is enabled, the system will be able to determine when the user starts and stops speaking and automatically submit the input to the AI, for a more fluid conversation.
The idea of the feature is to be able to talk to AI like you do in natural conversation so that you don't have to start/stop conversation or click "finish" button when done speaking.
Thank you.
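One simple way the "done speaking" detection described above might work is a silence timer over audio loudness, sketched below. This is only an assumption about a possible approach, not TypingMind's implementation; the function name `find_utterance_end` and the default values for `threshold`, `frame_ms`, and `silence_ms` are made up for the example, and real systems often use a trained voice-activity detector instead.

```python
from typing import Iterable, Optional


def find_utterance_end(
    frame_levels: Iterable[float],
    threshold: float = 0.02,  # assumed loudness level that counts as speech
    frame_ms: int = 20,       # assumed duration of one audio frame
    silence_ms: int = 800,    # assumed pause length that ends the utterance
) -> Optional[int]:
    """Return the index of the frame where the utterance is considered over.

    Returns None if speech never started or the pause was never long enough.
    """
    frames_needed = max(1, silence_ms // frame_ms)
    heard_speech = False
    quiet_run = 0
    for index, level in enumerate(frame_levels):
        if level >= threshold:
            heard_speech = True  # recording effectively "starts" here
            quiet_run = 0        # any speech resets the silence timer
        elif heard_speech:
            quiet_run += 1
            if quiet_run >= frames_needed:
                return index     # long enough pause: submit the input
    return None


if __name__ == "__main__":
    # 5 quiet frames, 10 frames of speech, then a long pause.
    levels = [0.0] * 5 + [0.3] * 10 + [0.0] * 50
    print(find_utterance_end(levels))
```

Exposing `threshold` and `silence_ms` as user settings would cover the "threshold and/or length of silence" option requested earlier in this thread.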
Ngoc Nguyen
Merged in a post:
Conversational Mode Activation Toggle
Victor R
A toggle feature for activating and deactivating conversational mode would be beneficial. This would enable users to send voice inputs without continuously pressing the microphone button, similar to the feature OpenAI recently implemented.
Tony Dinh
under review