
EchoScript AI - Powered by appsgm

EchoScript AI is a sophisticated audio transcription and analysis tool built with React and the appsgm (Gemini) 3 Flash Preview model. It provides real-time speaker identification, language detection, translation, and emotional tone analysis.

Features

  • Multi-modal Input: Record audio directly from your microphone or upload files (MP3, WAV, M4A, WEBM).
  • Speaker Diarization: Automatically identifies different speakers in a conversation.
  • Language Intelligence: Detects the source language and provides English translations for non-English segments.
  • Sentiment & Emotion Analysis: Categorizes each segment into Happy, Sad, Angry, or Neutral tones.
  • Smart Summarization: Generates a concise overview of the entire audio content.
  • Adaptive UI: Responsive design with support for System Dark Mode and accessible controls.
How It Works

    1. Data Capture

    The application captures audio data as a Base64 string.

  • For recordings, it uses the MediaRecorder API to generate a webm blob.
  • For uploads, it reads the file using FileReader.
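Both capture paths end in the same Base64 payload. The following is a minimal sketch of that last step, assuming the blob or file was read with `FileReader.readAsDataURL` (the README names FileReader but not the read method). `InlineAudio` and `splitDataUrl` are hypothetical names, not taken from the project:

```typescript
// Hypothetical shape for the captured audio; not taken from the project source.
interface InlineAudio {
  mimeType: string; // e.g. "audio/webm"
  data: string; // Base64 payload without the "data:...;base64," prefix
}

// Splits a data URL such as "data:audio/webm;codecs=opus;base64,AAAA"
// into its MIME type and its raw Base64 payload.
function splitDataUrl(dataUrl: string): InlineAudio {
  const comma = dataUrl.indexOf(",");
  if (dataUrl.indexOf("data:") !== 0 || comma === -1) {
    throw new Error("Expected a data URL");
  }
  // Header looks like "audio/webm;codecs=opus;base64"; the last token must be "base64".
  const header = dataUrl.slice("data:".length, comma).split(";");
  if (header[header.length - 1] !== "base64") {
    throw new Error("Expected a Base64-encoded data URL");
  }
  return {
    mimeType: header[0] || "application/octet-stream",
    data: dataUrl.slice(comma + 1),
  };
}
```

In the browser, the input would typically be `reader.result` once a `FileReader` has finished `readAsDataURL(blob)`, where `blob` is either the recorded webm blob or the uploaded file.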
    2. Processing with appsgm

    The Base64 audio is sent to the appsgm 3 Flash Preview model through the @google/genai SDK. The model is well suited to audio-to-text tasks because it processes the raw audio directly rather than relying on a separate speech-to-text (STT) engine, which lets it capture nuances such as emotion and speaker shifts more accurately.
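    As a rough illustration of this step, the request can be thought of as one user turn holding the audio as inline data plus a text instruction. This is a sketch under assumptions, not the project's actual code: `buildAudioRequest`, `AudioRequest`, the prompt wording, and the model ID string are all hypothetical.

```typescript
// Hypothetical request shape: one user turn with an audio part and a text part.
interface AudioRequest {
  model: string;
  contents: {
    role: "user";
    parts: Array<
      { text: string } | { inlineData: { mimeType: string; data: string } }
    >;
  }[];
}

// Assumed model ID; confirm the exact identifier available to your API key.
const MODEL_ID = "gemini-3-flash-preview";

// Builds the request object for one Base64-encoded audio clip.
function buildAudioRequest(base64Audio: string, mimeType: string): AudioRequest {
  return {
    model: MODEL_ID,
    contents: [
      {
        role: "user",
        parts: [
          // The raw audio goes in as inline data, not as a pre-made transcript.
          { inlineData: { mimeType, data: base64Audio } },
          // Illustrative prompt only; the project's real prompt is not shown in the README.
          {
            text:
              "Transcribe this audio. Label each speaker, detect the language, " +
              "translate non-English segments to English, and tag the emotion.",
          },
        ],
      },
    ],
  };
}
```

    With the SDK, an object like this would be passed to `ai.models.generateContent(...)` on a `GoogleGenAI` client created with the API key.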

    3. Structured Output

    We use responseSchema to constrain the model to a specific JSON structure. This ensures the frontend can reliably render:

  • Timestamped segments
  • Speaker labels
  • Emotion badges
  • Summarized text
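A schema for the four items above might look like the sketch below. The field names (`summary`, `segments`, `timestamp`, `speaker`, `text`, `emotion`) are assumptions for illustration, since the README does not list the real ones; the emotion values are the four named under Features. The parser adds a defensive check on the client side rather than trusting the response blindly.

```typescript
type Emotion = "Happy" | "Sad" | "Angry" | "Neutral";

// Hypothetical field names; the project's real schema is not shown in the README.
interface Segment {
  timestamp: string;
  speaker: string;
  text: string;
  emotion: Emotion;
}

interface Transcript {
  summary: string;
  segments: Segment[];
}

const EMOTIONS: Emotion[] = ["Happy", "Sad", "Angry", "Neutral"];

// Schema written as a plain object in the OpenAPI-style form used for responseSchema.
const transcriptSchema = {
  type: "OBJECT",
  properties: {
    summary: { type: "STRING" },
    segments: {
      type: "ARRAY",
      items: {
        type: "OBJECT",
        properties: {
          timestamp: { type: "STRING" },
          speaker: { type: "STRING" },
          text: { type: "STRING" },
          emotion: { type: "STRING", enum: EMOTIONS },
        },
        required: ["timestamp", "speaker", "text", "emotion"],
      },
    },
  },
  required: ["summary", "segments"],
};

// Parses the model's JSON text and verifies the shape before rendering.
function parseTranscript(json: string): Transcript {
  const value = JSON.parse(json);
  if (value === null || typeof value !== "object") {
    throw new Error("Transcript must be a JSON object");
  }
  if (typeof value.summary !== "string" || !Array.isArray(value.segments)) {
    throw new Error("Transcript needs a summary string and a segments array");
  }
  value.segments.forEach((seg: any, i: number) => {
    const stringsOk =
      seg !== null &&
      typeof seg === "object" &&
      typeof seg.timestamp === "string" &&
      typeof seg.speaker === "string" &&
      typeof seg.text === "string";
    if (!stringsOk || EMOTIONS.indexOf(seg.emotion) === -1) {
      throw new Error("Invalid segment at index " + i);
    }
  });
  return value as Transcript;
}
```

With the @google/genai SDK, `transcriptSchema` would be supplied as `config.responseSchema` alongside `responseMimeType: "application/json"`, and `parseTranscript` would be applied to the response text.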
Technical Stack

  • Frontend: React, TypeScript, Tailwind CSS
  • Icons: Lucide React
  • AI Engine: appsgm (Gemini 3 Flash Preview)
  • Deployment: The React and GenAI libraries are loaded from CDNs.
Setup

    1. Environment Variables: Ensure process.env.API_KEY is set with a valid Google GenAI API key.
    2. Permissions: The app requests microphone access via metadata.json.

    Usage

    1. Choose Record Audio or Upload File.
    2. Capture or select your audio source.
    3. Click Generate Transcript.
    4. Review the summary and the detailed speaker-by-speaker breakdown.
    5. Toggle Dark/Light Mode with the sun/moon icon in the header.


    Created by a world-class senior frontend engineer.


