Designed by Myned AI

Nyx Web Chat Widget

The Open Source, Empathic, Real-Time 3D Agent for the Web

The Nyx Web Chat Widget enables full-duplex voice interaction: the user speaks, interrupts, and receives visual feedback in real time, with the avatar powered by our custom audio-to-motion engine.

Quick Start

Get up and running in minutes. Deploy the server and embed the widget on any site.

1. Deploy Server

The server handles audio processing and agent communication.

# 1. Clone & Setup
git clone https://github.com/myned-ai/avatar-chat-server.git
cd avatar-chat-server
uv sync

# 2. Configure Environment
cp .env.example .env   # copy the template, then fill in your values
# Required: Set OPENAI_API_KEY or GEMINI_API_KEY
# Optional: Set AUTH_SECRET_KEY for secured access

# 3. Add Knowledge (Optional)
# Convert your website or docs into a knowledge base text file
# Set the KNOWLEDGE_BASE_SOURCE env variable to its location to ground the agent
# Can be a local file path (e.g. "data/knowledge.md") or a URL

# 4. Run Server
uv run python src/main.py

# Use the docker setup if you prefer containerization
docker-compose up -d
🔒 Production Security:

Update your environment variable configuration:

  • Set AUTH_ENABLED=true
  • Generate Key: openssl rand -hex 32
  • Set AUTH_SECRET_KEY=...
  • Set AUTH_ALLOWED_ORIGINS=...
2. Embed Widget

Add the widget to any HTML page using the CDN link.

<!-- Container -->
<div id="avatar-chat"></div>

<!-- CDN Script -->
<script src="https://cdn.jsdelivr.net/npm/@myned-ai/avatar-chat-widget"></script>

<script>
  AvatarChat.init({
    container: '#avatar-chat',
    serverUrl: 'wss://your-server.com/ws',
    authEnabled: true, 
    position: 'bottom-right'
  });
</script>
🔑 Secure Integration:

Authenticated connection flow:

  • Backend: Sign token with AUTH_SECRET_KEY
  • Frontend: Fetch token from API
  • Init: Pass to token param
  • Result: wss://.../ws?token=...
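The signing step above can be sketched in Python. This is an illustrative HMAC-SHA256 token scheme ("payload.signature" with an expiry timestamp), not the server's canonical format — check the server repo for the actual implementation; the function names and payload layout here are assumptions.

```python
import hashlib
import hmac
import time

def sign_token(secret_key: str, session_id: str, ttl_seconds: int = 300) -> str:
    """Sign a short-lived token with AUTH_SECRET_KEY (hypothetical format)."""
    expires = int(time.time()) + ttl_seconds
    payload = f"{session_id}.{expires}"
    signature = hmac.new(secret_key.encode(), payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{signature}"

def verify_token(secret_key: str, token: str) -> bool:
    """Check the signature and expiry; reject anything that doesn't match."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(secret_key.encode(), payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return False
    expires = int(payload.rsplit(".", 1)[1])
    return time.time() < expires
```

Your backend exposes the signed token via an API endpoint, the frontend fetches it, and the widget appends it as the `token` query parameter on the WebSocket URL.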

One-Click Deployment

Fill in the required fields (e.g. API keys) and keep the defaults for everything else — you're good to go.

All deployment scripts are available in the server repo.
  • Azure is set to Always On but still needs some warmup.
  • On GCP, minimum instances must be set manually for Always On.
  • On AWS, choose "Upload a template file" and upload from here.

Features & Capabilities

Designed for natural human connection, built on robust open-source research and cutting-edge infrastructure.

🗣️

Multi-Directional Voice

True full-duplex communication. Users can speak while the avatar is speaking. The system uses voice activity detection (VAD) for accurate interruption, stopping the avatar instantly when the user cuts in, just like a real conversation.
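The interruption behavior can be sketched as a tiny state machine: when a VAD flag on an incoming microphone frame signals user speech, any queued avatar audio is dropped so playback stops immediately. This is a minimal illustration of the described behavior, not the actual server code; the class and method names are hypothetical.

```python
class ConversationState:
    """Minimal sketch of barge-in handling: user speech cancels avatar audio."""

    def __init__(self) -> None:
        self.avatar_speaking = False
        self.pending_audio: list[bytes] = []

    def on_avatar_audio(self, chunk: bytes) -> None:
        # Avatar audio arrives from the server and is queued for playback.
        self.avatar_speaking = True
        self.pending_audio.append(chunk)

    def on_user_frame(self, frame: bytes, vad_is_speech: bool) -> None:
        # When the user cuts in mid-response, drop the queued avatar audio
        # so playback halts on the very next frame.
        if vad_is_speech and self.avatar_speaking:
            self.pending_audio.clear()
            self.avatar_speaking = False
```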

🎭

Empathic Expression

Beyond simple lip-sync. We use Blendshapes to drive natural facial movements—eyebrows, blinks, and smiles—that match the emotional tone of the voice response.

⚡

CPU-Optimized (Quantized)

The "Audio to Expression" model is heavily quantized. It runs efficiently on standard CPUs, meaning you don't need expensive GPU instances to host the avatar server.

📝

Live Transcription & Subtitles

Includes a built-in UI for real-time subtitle sync. Users can read along with the conversation, ensuring accessibility and clarity in noisy environments.

🧠

Answers From Your Data

Out-of-the-box support for custom knowledge. Give the avatar your product manuals, FAQs, or other documents, and it will provide answers based on your specific content.

🔒

Production Security

Secured via HMAC & Token-based authentication. The widget requires a signed token from your backend to initiate a WebSocket connection, preventing unauthorized usage of your LLM credits.

System Architecture

1. Client-Side (Widget)

A lightweight, framework-agnostic JavaScript bundle that embeds into any website.

  • Audio Capture: Captures microphone input using Web Audio API and streams it via WebSocket.
  • 3D Rendering: Renders the avatar using Three.js and 3D Gaussian Splatting for cinematic fidelity.
  • Authentication: Handles secure HMAC token exchange to prevent unauthorized usage.
2. Server-Side (Orchestrator)

A Python-based WebSocket server that acts as the central brain and security layer.

  • LLM Proxy: Manages state and forwards audio/text to AI providers (OpenAI Realtime/Gemini Live) to protect API keys.
  • Connection Management: Handles WebSocket handshakes, rate limiting, and session persistence.
  • Modular Agents: Pluggable system to easily switch between different AI backends or custom implementations.
  • Knowledge Base Service: Loads, parses, and formats custom knowledge base content (from local files or URLs) to ground agent responses in your own data.
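The knowledge-base loading described above can be sketched as follows: treat KNOWLEDGE_BASE_SOURCE as a URL when it has an http(s) scheme, and as a local file path otherwise. The function name and details are illustrative, assuming only the path-or-URL behavior the README documents.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen

def load_knowledge_base(source: str) -> str:
    """Load knowledge text from a local path or an http(s) URL (sketch)."""
    if urlparse(source).scheme in ("http", "https"):
        # Remote source: fetch the knowledge text over HTTP(S).
        with urlopen(source) as resp:
            return resp.read().decode("utf-8")
    # Otherwise treat the value as a local file path, e.g. "data/knowledge.md".
    return Path(source).read_text(encoding="utf-8")
```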
3. Audio-to-Expression Engine

A highly optimized inference engine enabling real-time facial animation from audio input.

  • Wav2Arkit Model: Analyzes incoming AI audio to generate 52 ARKit blendshapes in real-time.
  • CPU Optimized: Uses ONNX Runtime and quantization to run efficiently on standard CPUs without GPU requirements.
  • Synchronization: Streams audio packets and blendshape weights together at 30 FPS for jitter-free animation.
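The 30 FPS pairing above amounts to simple arithmetic: at a hypothetical 16 kHz mono sample rate (the actual rate may differ), each animation frame owns 16000 / 30 ≈ 533 audio samples, sent alongside its 52 ARKit blendshape weights so the client can play both in lockstep.

```python
# Assumed sample rate for illustration; the engine's real rate may differ.
SAMPLE_RATE = 16000
FPS = 30
SAMPLES_PER_FRAME = SAMPLE_RATE // FPS  # ~533 samples per animation frame

def pair_frames(samples: list[float], weights_per_frame: list[list[float]]):
    """Yield (audio_chunk, blendshape_weights) pairs, one per 30 FPS frame."""
    for i, weights in enumerate(weights_per_frame):
        start = i * SAMPLES_PER_FRAME
        chunk = samples[start:start + SAMPLES_PER_FRAME]
        if not chunk:
            break
        yield chunk, weights
```

Streaming each audio chunk together with its blendshape vector (rather than on separate channels) is what keeps lips and sound jitter-free: the client never has to re-align two independently timed streams.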