Lightweight drop-in that turns voice into AI answers in a single call. Works on bare React Native 0.72+ or Expo config-plugin builds.
| 🔈 Record | 📝 Transcribe | 🤖 Chat | 🔊 Speak |
|---|---|---|---|
| Saves a temp .m4a using react-native-audio-recorder-player. | Sends it to OpenAI Whisper (audio.transcriptions.create). | Streams to ChatGPT (any model; gpt-4o-mini by default). | Replies via device TTS (RN-TTS) or OpenAI Audio TTS. |
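
The four stages above run as a simple chain, each feeding the next. The sketch below only illustrates that flow; `PipelineSteps`, `runPipeline`, and `demoSteps` are hypothetical names, and the stubbed steps stand in for the recorder, Whisper, ChatGPT, and TTS calls. It is not the library's actual implementation.

```typescript
// Hypothetical step signatures (assumptions, not the whisper-client internals).
interface PipelineSteps {
  stopRecording: () => Promise<string>;               // resolves to the temp .m4a path
  transcribe: (audioPath: string) => Promise<string>; // audio file -> transcript text
  chat: (transcript: string) => Promise<string>;      // transcript -> model answer
  speak: (answer: string) => Promise<void>;           // answer -> audible speech
}

// Runs Record -> Transcribe -> Chat -> Speak in order and returns both strings.
async function runPipeline(
  steps: PipelineSteps,
): Promise<{ transcript: string; answer: string }> {
  const audioPath = await steps.stopRecording();
  const transcript = await steps.transcribe(audioPath);
  const answer = await steps.chat(transcript);
  await steps.speak(answer);
  return { transcript, answer };
}

// Stubbed steps so the flow can be exercised without a device or network.
const demoSteps: PipelineSteps = {
  stopRecording: async () => '/tmp/demo.m4a',
  transcribe: async (audioPath) => `transcript of ${audioPath}`,
  chat: async (transcript) => `answer to: ${transcript}`,
  speak: async () => {},
};
```

Injecting the steps keeps the ordering testable in isolation; in an app, each stub would be replaced by the real recorder, API, and TTS calls.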
```bash
npm install whisper-client
npx pod-install # iOS pods
```

Installs & autolinks:

- react-native-audio-recorder-player
- react-native-permissions
- openai
Android (AndroidManifest.xml):

```xml
<uses-permission android:name="android.permission.RECORD_AUDIO" />
```

iOS (Info.plist):

```xml
<key>NSMicrophoneUsageDescription</key>
<string>This app needs your microphone for voice interviews.</string>
```

After editing Info.plist, run npx pod-install (or expo prebuild).
```tsx
import React, { useRef, useState } from 'react';
import { View, Button, Text } from 'react-native';
import { WhisperClient } from 'whisper-client';

export default function InterviewScreen() {
  const [speech, setSpeech] = useState('');
  const [reply, setReply] = useState('');

  // Keep one instance to preserve conversation history
  const vc = useRef(
    new WhisperClient(process.env.OPENAI_API_KEY!, {
      chatModel: 'gpt-4o-mini', // optional override
      ttsEngine: 'device', // 'device' | 'openai'
      language: 'en',
    }),
  ).current;

  return (
    <View style={{ flex: 1, gap: 12, padding: 24 }}>
      {/* Call through an arrow function so `this` stays bound to the client */}
      <Button title="Start Recording" onPress={() => vc.startRecording()} />
      <Button
        title="Stop & Answer"
        onPress={async () => {
          const { transcript, answer } = await vc.stopAndAnswer();
          setSpeech(transcript);
          setReply(answer);
        }}
      />
      <Text style={{ marginTop: 16, fontWeight: '600' }}>You said:</Text>
      <Text>{speech}</Text>
      <Text style={{ marginTop: 16, fontWeight: '600' }}>AI replied:</Text>
      <Text>{reply}</Text>
    </View>
  );
}
```

| Constructor / Method | Purpose |
|---|---|
| new WhisperClient(apiKey, opts?) | Build a reusable instance.<br>• opts.whisperModel: default 'whisper-1'<br>• opts.chatModel: default 'gpt-4o-mini'<br>• opts.language: default 'en'<br>• opts.ttsEngine: 'device' \| 'openai' (default 'device')<br>• opts.systemPrompt: custom system role<br>• opts.onState(state): callback (idle → recording → transcribing → thinking → speaking) |
| startRecording() | Opens the mic and begins writing to a temp file. |
| stopAndAnswer() → { transcript, answer } | Stops recording, sends audio → Whisper → Chat → TTS, returns both strings. |
| nextQuestion() → { answer } | Ask ChatGPT without recording (e.g. next OSCE question). |
| cancel() | Abort any in-flight request or playback. |
| destroy() | Release native resources (call on unmount). |
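
To make the documented defaults and the onState values concrete, here is a minimal sketch. The option names and default values come from the table above; the `WhisperClientOptions` type, `resolveOptions`, and the status labels are assumptions made for illustration, not the package's exported API.

```typescript
// States reported through opts.onState, in the order listed above.
type VoiceState = 'idle' | 'recording' | 'transcribing' | 'thinking' | 'speaking';

// Assumed option shape, mirroring the documented opts.* names.
interface WhisperClientOptions {
  whisperModel?: string;
  chatModel?: string;
  language?: string;
  ttsEngine?: 'device' | 'openai';
  systemPrompt?: string;
  onState?: (state: VoiceState) => void;
}

// Hypothetical helper: fills in the documented defaults for omitted options.
function resolveOptions(opts: WhisperClientOptions = {}) {
  return {
    whisperModel: opts.whisperModel ?? 'whisper-1',
    chatModel: opts.chatModel ?? 'gpt-4o-mini',
    language: opts.language ?? 'en',
    ttsEngine: opts.ttsEngine ?? 'device',
    systemPrompt: opts.systemPrompt,
    onState: opts.onState,
  };
}

// Illustrative UI labels for each state (the wording is made up for this sketch).
const STATUS_LABELS: Record<VoiceState, string> = {
  idle: 'Ready',
  recording: 'Listening',
  transcribing: 'Transcribing',
  thinking: 'Thinking',
  speaking: 'Speaking',
};
```

In a component, passing `onState: (s) => setStatus(STATUS_LABELS[s])` would be one way to surface progress while `stopAndAnswer()` is in flight.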
| Problem | Fix |
|---|---|
| Mic permission denied | Ensure runtime prompt accepted / Info.plist key present. |
| TS errors for AudioSet enums | Upgrade react-native-audio-recorder-player to ≥ 3.6. |
| OpenAI 401 / network errors | Check OPENAI_API_KEY and connectivity. |
| Latency > 4 s | Lower opts.maxAudioMs, use Wi-Fi, or prefer device TTS. |
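
The table above can also be turned into in-app hints. The sketch below assumes the caller can tell an HTTP status from a network failure and has timed the round trip itself; `FailureInfo` and `troubleshootingHint` are hypothetical names, not part of whisper-client.

```typescript
// Hypothetical failure summary assembled by the caller (assumed shape).
interface FailureInfo {
  status?: number;    // HTTP status, when the request reached the API
  network?: boolean;  // true when no response was received at all
  elapsedMs?: number; // measured round-trip time in milliseconds
}

// Maps a failure to the matching fix from the troubleshooting table.
function troubleshootingHint(info: FailureInfo): string | null {
  if (info.status === 401) return 'Check OPENAI_API_KEY.';
  if (info.network) return 'Check connectivity.';
  if ((info.elapsedMs ?? 0) > 4000) {
    return 'Lower opts.maxAudioMs, use Wi-Fi, or prefer device TTS.';
  }
  return null; // nothing in the table applies
}
```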
- Streaming partial transcripts & GPT tokens
- Silence detection → auto-stop recording
- Local transcript caching
- LangChain agent plug-in
MIT © 2025 Apium Innovations