EchoScript AI is a sophisticated audio transcription and analysis tool built with React and the appsgm (Gemini) 3 Flash Preview model. It provides real-time speaker identification, language detection, translation, and emotional tone analysis.

Features

Multi-modal Input: Record audio directly from your microphone or upload files (MP3, WAV, M4A, WEBM).

Speaker Diarization: Automatically identifies different speakers in a conversation.

Language Intelligence: Detects the source language and provides English translations for non-English segments.

Sentiment & Emotion Analysis: Categorizes each segment into Happy, Sad, Angry, or Neutral tones.

Smart Summarization: Generates a concise overview of the entire audio content.

Adaptive UI: Responsive design with support for System Dark Mode and accessible controls.

How It Works

1. Data Capture

The application captures audio data as a Base64 string.

For recordings, it uses the MediaRecorder API to generate a WebM blob.

For uploads, it reads the file using FileReader.
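Both capture paths end with a Base64 string. One common way to get there is `FileReader.readAsDataURL`, which returns a data URL that still carries a `data:<mime>;base64,` prefix. The sketch below shows how that prefix might be split off. The function and interface names are hypothetical and are not taken from the project.

```typescript
// Hypothetical helper: the names and shapes here are illustrative only.
interface InlineAudio {
  mimeType: string; // e.g. "audio/webm"
  data: string; // Base64 payload without the "data:...;base64," prefix
}

// Splits a data URL (as produced by FileReader.readAsDataURL) into
// its MIME type and its Base64 payload.
function dataUrlToInlineAudio(dataUrl: string): InlineAudio {
  const comma = dataUrl.indexOf(",");
  if (!dataUrl.startsWith("data:") || comma === -1) {
    throw new Error("Expected a data URL");
  }
  // Header looks like "audio/webm;codecs=opus;base64".
  const header = dataUrl.slice("data:".length, comma);
  if (!header.endsWith(";base64")) {
    throw new Error("Expected a Base64-encoded data URL");
  }
  // Keep only the bare MIME type and drop parameters such as codecs.
  const mimeType = header.slice(0, header.indexOf(";"));
  return { mimeType, data: dataUrl.slice(comma + 1) };
}
```

In the browser, the input would typically be `reader.result` from the `load` event of a `FileReader`, for either an uploaded file or a recorded blob.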

2. Processing with appsgm

The audio is sent to the appsgm 3 Flash Preview model using the @google/genai SDK. The model is well suited to audio-to-text tasks because it processes the raw audio directly rather than relying on a separate speech-to-text (STT) engine, which helps it capture nuances such as emotion and speaker changes.
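As a rough illustration of this step, the sketch below builds a request that pairs the Base64 audio with a text instruction. The function name and prompt wording are assumptions, and the model identifier is left as a parameter because the exact ID string is not given here. The commented lines show how such a request might be passed to the SDK's `generateContent` call.

```typescript
// Illustrative request builder; the prompt text and names are assumptions.
interface TranscriptionRequest {
  model: string;
  contents: {
    parts: Array<
      { inlineData: { mimeType: string; data: string } } | { text: string }
    >;
  };
}

function buildTranscriptionRequest(
  model: string,
  mimeType: string,
  base64Audio: string
): TranscriptionRequest {
  return {
    model,
    contents: {
      parts: [
        // The raw audio travels inline, next to the instruction.
        { inlineData: { mimeType, data: base64Audio } },
        {
          text:
            "Transcribe this audio. Label each speaker, detect the language, " +
            "translate non-English segments to English, and tag the emotion.",
        },
      ],
    },
  };
}

// In the app, the request would be handed to the SDK, roughly:
//   import { GoogleGenAI } from "@google/genai";
//   const ai = new GoogleGenAI({ apiKey: process.env.API_KEY });
//   const response = await ai.models.generateContent(request);
//   const jsonText = response.text;
```

Keeping the builder separate from the network call makes the payload easy to inspect or unit test without an API key.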

3. Structured Output

We use responseSchema to constrain the model to a specific JSON structure. This ensures the frontend can reliably render:

Timestamped segments

Speaker labels

Emotion badges

Summarized text
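A schema covering the four items above might look like the following sketch. The field names are hypothetical and the project's actual schema may differ. The SDK exports a `Type` enum for the `type` values; plain strings are used here only to keep the example self-contained. A small defensive parser is included as well, since validating the response before rendering is cheap.

```typescript
// Field names below are hypothetical; the project's actual schema may differ.
type Emotion = "Happy" | "Sad" | "Angry" | "Neutral";

interface Segment {
  timestamp: string;
  speaker: string;
  text: string;
  emotion: Emotion;
}

interface TranscriptResult {
  summary: string;
  segments: Segment[];
}

const EMOTIONS: readonly Emotion[] = ["Happy", "Sad", "Angry", "Neutral"];

// OpenAPI-style schema of the kind passed as config.responseSchema.
const transcriptSchema = {
  type: "OBJECT",
  properties: {
    summary: { type: "STRING" },
    segments: {
      type: "ARRAY",
      items: {
        type: "OBJECT",
        properties: {
          timestamp: { type: "STRING" },
          speaker: { type: "STRING" },
          text: { type: "STRING" },
          emotion: { type: "STRING", enum: [...EMOTIONS] },
        },
        required: ["timestamp", "speaker", "text", "emotion"],
      },
    },
  },
  required: ["summary", "segments"],
};

// Parses the model's JSON text and rejects shapes the UI cannot render.
function parseTranscript(jsonText: string): TranscriptResult {
  const raw = JSON.parse(jsonText) as Partial<TranscriptResult>;
  if (typeof raw.summary !== "string" || !Array.isArray(raw.segments)) {
    throw new Error("Response does not match the expected transcript shape");
  }
  for (const seg of raw.segments) {
    if (!EMOTIONS.includes(seg.emotion)) {
      throw new Error(`Unexpected emotion: ${String(seg.emotion)}`);
    }
  }
  return { summary: raw.summary, segments: raw.segments };
}
```

The schema would be supplied alongside `responseMimeType: "application/json"` in the request config, and `parseTranscript` would run on the returned text before any component renders it.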

Technical Stack

Frontend: React, TypeScript, Tailwind CSS

Icons: Lucide React

AI Engine: appsgm (Gemini 3 Flash Preview)

Deployment: The React and GenAI libraries are loaded from CDNs.

Setup

1. Environment Variables: Ensure process.env.API_KEY is set with a valid Google GenAI API key.

2. Permissions: The app requests microphone access via metadata.json.
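A missing key tends to surface as an opaque request failure, so a small startup check can help. The guard below is a hypothetical sketch and not part of the project. It takes the environment as an argument so that it can be exercised without touching the real `process.env`.

```typescript
// Hypothetical guard; the project may read the key differently.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["API_KEY"]?.trim();
  if (!key) {
    throw new Error(
      "API_KEY is not set; add a valid Google GenAI API key to the environment"
    );
  }
  return key;
}

// Typical call site in the app (assumes a bundler that defines process.env):
//   const apiKey = requireApiKey(process.env);
```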

Usage

1. Choose Record Audio or Upload File.

2. Capture or select your audio source.

3. Click Generate Transcript.

4. Review the summary and the detailed speaker-by-speaker breakdown.

5. Toggle dark or light mode with the sun/moon icon in the header.

Created by a world-class senior frontend engineer.

Updated Jan 28, 2026

Recent Commits

Initial commit - Upload project 'echoscript-ai'

GeetMark committed Jan 28, 2026
