Noisely

Nuxt
Laravel
MySQL
Redis
Python
Docker
WebSocket
Whisper
YAMNet
Batam International University
In Progress

Project Overview

A research project that enables real-time audio event recognition and transcription by processing sound data on centralized servers. YAMNet handles sound classification and Whisper handles speech transcription, allowing the system to detect events such as gunshots, glass breaking, and human speech.
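Concretely, this implies a server-side loop along the lines of the sketch below. It is a minimal illustration only: the assumption that clients stream 16 kHz mono 16-bit PCM frames over WebSocket, the port, and the one-second window are not the project's actual protocol, and classify_chunk / transcribe_chunk are placeholders for the YAMNet and Whisper wrappers sketched under AI Models & Capabilities.

```python
# Minimal sketch of the server-side ingestion loop, assuming clients stream
# 16 kHz mono 16-bit PCM frames over WebSocket. Endpoint, framing, and the
# one-second window are illustrative assumptions, not the project's protocol.
import asyncio
import json

import numpy as np
import websockets


def classify_chunk(segment):
    """Placeholder for the YAMNet wrapper sketched under AI Models below."""
    return "Speech", 0.9


def transcribe_chunk(segment):
    """Placeholder for the Whisper wrapper sketched under AI Models below."""
    return "(transcript)"


async def handle_stream(websocket):
    buffer = np.empty(0, dtype=np.float32)
    async for frame in websocket:
        # Convert each binary frame from 16-bit PCM to float32 in [-1, 1].
        chunk = np.frombuffer(frame, dtype=np.int16).astype(np.float32) / 32768.0
        buffer = np.concatenate([buffer, chunk])

        # Emit one classification per second of audio.
        while len(buffer) >= 16000:
            segment, buffer = buffer[:16000], buffer[16000:]
            label, score = classify_chunk(segment)
            event = {"label": label, "score": round(float(score), 3)}
            if label == "Speech":
                event["text"] = transcribe_chunk(segment)
            await websocket.send(json.dumps(event))


async def main():
    # websockets >= 10 accepts a single-argument handler.
    async with websockets.serve(handle_stream, "0.0.0.0", 8765):
        await asyncio.Future()  # run until cancelled


if __name__ == "__main__":
    asyncio.run(main())
```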

Technical Implementation

Frontend

  • Nuxt.js (Vue) frontend framework
  • Real-time audio visualization
  • WebSocket client integration
  • Responsive audio interface

Backend & AI Processing

  • Laravel API backend
  • Python for AI processing (hand-off sketched after this list)
  • YAMNet for sound classification
  • Whisper for speech transcription
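How the Laravel API hands work to the Python side is not spelled out here. One plausible arrangement, sketched below, is a shared Redis list that the API pushes uploaded-clip jobs onto and a Python worker drains; the list name, job shape, and hostname are assumptions for illustration only.

```python
# Hypothetical hand-off between the Laravel API and the Python AI worker:
# the API LPUSHes a JSON job for each uploaded clip onto a Redis list and
# this worker drains it. List name, job shape, and hostname are assumptions.
import json

import redis

queue = redis.Redis(host="redis", port=6379, decode_responses=True)

while True:
    # BRPOP blocks until a job arrives; LPUSH + BRPOP gives FIFO ordering.
    _, raw = queue.brpop("noisely:audio_jobs")
    job = json.loads(raw)  # e.g. {"id": 42, "path": "/uploads/clip.wav"}
    print("processing clip", job["path"])
    # ...run YAMNet / Whisper on the clip and persist the result (see below)
```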

Infrastructure

  • MySQL database for event storage (see the sketch after this list)
  • Redis for real-time caching
  • Docker containerization
  • WebSocket for real-time communication
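One way the pieces above could fit together: each detection is written to MySQL for later querying and published on a Redis channel so the WebSocket layer can relay it to connected clients. The table, columns, and channel name below are illustrative assumptions, not the project's schema.

```python
# Illustrative persistence + real-time fan-out: write each detection to MySQL
# and publish it on a Redis channel for the WebSocket layer to relay. The
# audio_events table, its columns, and the channel name are assumptions.
import json
from typing import Optional

import pymysql
import redis

db = pymysql.connect(host="mysql", user="noisely", password="secret",
                     database="noisely", autocommit=True)
cache = redis.Redis(host="redis", port=6379)


def store_event(label: str, score: float, text: Optional[str] = None) -> None:
    # Durable record for later analysis and reporting.
    with db.cursor() as cur:
        cur.execute(
            "INSERT INTO audio_events (label, score, transcript) VALUES (%s, %s, %s)",
            (label, score, text),
        )
    # Real-time path: subscribers (e.g. the WebSocket server) push this to the UI.
    cache.publish("noisely:events",
                  json.dumps({"label": label, "score": score, "text": text}))


store_event("Glass", 0.87)
```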

AI Models & Capabilities

YAMNet Integration

Google's YAMNet model, trained on the AudioSet corpus, scores incoming audio against 521 event classes in real time, allowing the system to identify sounds such as gunshots, breaking glass, alarms, and other environmental audio events.
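A minimal classification sketch using the public TF-Hub release of YAMNet. The 16 kHz mono float32 input format and the 521-class AudioSet label map come from the model itself; the frame averaging and top-1 selection here are simplifications of whatever thresholding the project actually applies.

```python
# Load YAMNet from TF Hub and classify a 16 kHz mono float32 waveform.
import csv

import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

yamnet = hub.load("https://tfhub.dev/google/yamnet/1")

# The model ships its 521 AudioSet class names as a CSV.
class_map_path = yamnet.class_map_path().numpy().decode("utf-8")
with tf.io.gfile.GFile(class_map_path) as f:
    class_names = [row["display_name"] for row in csv.DictReader(f)]


def classify_chunk(waveform: np.ndarray) -> tuple[str, float]:
    """Return the top class and its mean score for a 16 kHz float32 waveform."""
    scores, embeddings, spectrogram = yamnet(waveform)
    mean_scores = tf.reduce_mean(scores, axis=0)  # average over model frames
    top = int(tf.argmax(mean_scores))
    return class_names[top], float(mean_scores[top])


# One second of silence; typically comes back as ("Silence", ...).
print(classify_chunk(np.zeros(16000, dtype=np.float32)))
```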

Whisper Transcription

OpenAI's Whisper model handles speech-to-text transcription, converting spoken language into written text across multiple languages with high accuracy.
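A matching transcription sketch using the open-source openai-whisper package. The model size and fp16 setting are illustrative choices, not the project's published configuration.

```python
# Transcribe 16 kHz mono float32 audio with the open-source Whisper package.
import numpy as np
import whisper

model = whisper.load_model("base")  # "tiny"/"base" trade accuracy for speed


def transcribe_chunk(waveform: np.ndarray) -> str:
    """Transcribe a 16 kHz mono float32 waveform and return the text."""
    # transcribe() accepts a file path or a float32 NumPy array at 16 kHz.
    result = model.transcribe(waveform, fp16=False)
    return result["text"].strip()


# File-based usage works the same way:
# print(model.transcribe("clip.wav")["text"])
```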

Research Applications

  • Security and surveillance monitoring
  • Environmental sound analysis
  • Speech recognition and transcription
  • Real-time audio event logging
  • Academic research at Batam International University

Development Status

In Progress
Active research project with ongoing development and testing.