EchoScript AI - Powered by appsgm
EchoScript AI is a sophisticated audio transcription and analysis tool built with React and the appsgm (Gemini) 3 Flash Preview model. It provides real-time speaker identification, language detection, translation, and emotional tone analysis.
Features
How It Works
1. Data Capture
The application captures audio data as a Base64 string.
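As a rough illustration of the Base64 step: browser APIs such as `FileReader.readAsDataURL` produce a data URL of the form `data:<mime>;base64,<payload>`, and only the payload after the comma is the Base64 string. The sketch below assumes that form; `parseDataUrl` and `EncodedAudio` are hypothetical names, not taken from the EchoScript source.

```typescript
// Result of splitting a Base64 data URL into its two useful parts.
interface EncodedAudio {
  mimeType: string; // e.g. "audio/webm"
  base64: string;   // payload without the "data:...;base64," prefix
}

// Hypothetical helper (an assumption, not EchoScript's actual code):
// extracts the MIME type and the raw Base64 payload from a data URL.
function parseDataUrl(dataUrl: string): EncodedAudio {
  const prefix = "data:";
  const suffix = ";base64";
  const comma = dataUrl.indexOf(",");
  if (dataUrl.slice(0, prefix.length) !== prefix || comma === -1) {
    throw new Error("Not a data URL");
  }
  // Header looks like "audio/webm;codecs=opus;base64".
  const header = dataUrl.slice(prefix.length, comma);
  if (header.slice(-suffix.length) !== suffix) {
    throw new Error("Data URL is not Base64-encoded");
  }
  // The MIME type is everything before the first ";" in the header.
  const mimeType = header.split(";")[0];
  return { mimeType, base64: dataUrl.slice(comma + 1) };
}
```

In a browser, the input would typically be the `result` of a `FileReader` after `readAsDataURL(blob)` has finished loading.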
Recording uses the MediaRecorder API to generate a webm blob.
Uploaded files are read with FileReader.
2. Processing with appsgm
The data is sent to the appsgm 3 Flash Preview model using the @google/genai SDK. The model is well suited to audio-to-text tasks because it processes the raw audio directly rather than relying on a separate speech-to-text (STT) engine, which helps it capture nuances such as emotion and speaker shifts.
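A minimal sketch of how the audio might be packaged for the model, assuming the request takes an inline-data part (MIME type plus Base64 payload) alongside a text instruction. `buildAudioRequest` and the interface names are hypothetical, and the model id is left as a parameter because the exact identifier is not given here.

```typescript
// Request shapes assumed to mirror what generateContent in @google/genai accepts.
interface InlineAudioPart {
  inlineData: { mimeType: string; data: string };
}
interface TextPart {
  text: string;
}
interface AudioRequest {
  model: string;
  contents: { parts: Array<InlineAudioPart | TextPart> };
}

// Hypothetical helper: combines the Base64 audio and a text instruction
// into a single request object.
function buildAudioRequest(
  model: string,        // model id; supply the one your project uses
  base64Audio: string,  // raw Base64 payload, without any data-URL prefix
  mimeType: string,     // e.g. "audio/webm"
  instruction: string   // what the model should do with the audio
): AudioRequest {
  return {
    model,
    contents: {
      parts: [
        { inlineData: { mimeType, data: base64Audio } },
        { text: instruction },
      ],
    },
  };
}
```

In the app, an object like this would be passed to `ai.models.generateContent(...)` on a `GoogleGenAI` client created with the API key.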
3. Structured Output
We use responseSchema to constrain the model to a specific JSON structure, so the frontend can reliably parse and render the response.
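To make the idea concrete, here is a sketch of what such a schema and a defensive parser could look like. The field names (`summary`, `segments`, `speaker`, `text`, `language`, `emotion`) are illustrative assumptions, not EchoScript's actual schema, and the uppercase type strings are assumed to match the values of the SDK's `Type` enum.

```typescript
// Hypothetical schema; in the app it would be passed as config.responseSchema
// together with responseMimeType "application/json".
const transcriptSchema = {
  type: "OBJECT",
  properties: {
    summary: { type: "STRING" },
    segments: {
      type: "ARRAY",
      items: {
        type: "OBJECT",
        properties: {
          speaker: { type: "STRING" },
          text: { type: "STRING" },
          language: { type: "STRING" },
          emotion: { type: "STRING" },
        },
        required: ["speaker", "text"],
      },
    },
  },
  required: ["summary", "segments"],
};

interface Segment {
  speaker: string;
  text: string;
  language?: string;
  emotion?: string;
}
interface Transcript {
  summary: string;
  segments: Segment[];
}

// Checks that a value has the two required string fields of a segment.
function isSegment(value: unknown): value is Segment {
  if (typeof value !== "object" || value === null) return false;
  const seg = value as Record<string, unknown>;
  return typeof seg.speaker === "string" && typeof seg.text === "string";
}

// Parses the model's JSON text and verifies the required fields before rendering.
function parseTranscript(json: string): Transcript {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("Response is not a JSON object");
  }
  const summary = (data as Record<string, unknown>).summary;
  const segments = (data as Record<string, unknown>).segments;
  if (typeof summary !== "string" || !Array.isArray(segments) || !segments.every(isSegment)) {
    throw new Error("Response does not match the expected transcript shape");
  }
  return { summary, segments: segments as Segment[] };
}
```

Validating on the client as well is a cautious design choice: the schema guides the model's output, while the parser guards the UI against malformed or truncated responses.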
Technical Stack
Setup
1. Environment Variables: Ensure process.env.API_KEY is set with a valid Google GenAI API key.
2. Permissions: The app requests microphone access via metadata.json.
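A small guard like the following can surface a missing key early instead of failing on the first API call. This is only a sketch: `requireApiKey` is a hypothetical helper, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// Hypothetical helper: returns the API key or throws a descriptive error.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env.API_KEY;
  if (!key || key.trim() === "") {
    throw new Error("API_KEY is not set; provide a valid Google GenAI API key");
  }
  return key;
}
```

At startup this could be called as `requireApiKey(process.env)` before the client is created.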
Usage
1. Choose Record Audio or Upload File.
2. Capture or select your audio source.
3. Click Generate Transcript.
4. Review the summary and the detailed speaker-by-speaker breakdown.
5. Toggle Dark/Light Mode with the sun/moon icon in the header.
Created by a world-class senior frontend engineer.