Noisely

Nuxt
Laravel
MySQL
Redis
Python
Docker
WebSocket
Whisper
YAMNet
Batam International University
In Progress

Project Overview

A research project that enables real-time audio event recognition and transcription by processing sound data on centralized servers. YAMNet handles sound classification and Whisper handles speech transcription, allowing the system to detect events such as gunshots, glass breaking, and human speech.
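Concretely, this implies a server-side loop along the lines of the sketch below. It is a minimal illustration only: the assumption that clients stream 16 kHz mono 16-bit PCM frames over WebSocket, the port, and the one-second window are not the project's actual protocol, and classify_chunk / transcribe_chunk are placeholders for the YAMNet and Whisper wrappers sketched under AI Models & Capabilities.

```python
# Minimal sketch of the server-side ingestion loop, assuming clients stream
# 16 kHz mono 16-bit PCM frames over WebSocket. Endpoint, framing, and the
# one-second window are illustrative assumptions, not the project's protocol.
import asyncio
import json

import numpy as np
import websockets


def classify_chunk(segment):
    """Placeholder for the YAMNet wrapper sketched under AI Models below."""
    return "Speech", 0.9


def transcribe_chunk(segment):
    """Placeholder for the Whisper wrapper sketched under AI Models below."""
    return "(transcript)"


async def handle_stream(websocket):
    buffer = np.empty(0, dtype=np.float32)
    async for frame in websocket:
        # Convert each binary frame from 16-bit PCM to float32 in [-1, 1].
        chunk = np.frombuffer(frame, dtype=np.int16).astype(np.float32) / 32768.0
        buffer = np.concatenate([buffer, chunk])

        # Emit one classification per second of audio.
        while len(buffer) >= 16000:
            segment, buffer = buffer[:16000], buffer[16000:]
            label, score = classify_chunk(segment)
            event = {"label": label, "score": round(float(score), 3)}
            if label == "Speech":
                event["text"] = transcribe_chunk(segment)
            await websocket.send(json.dumps(event))


async def main():
    # websockets >= 10 accepts a single-argument handler.
    async with websockets.serve(handle_stream, "0.0.0.0", 8765):
        await asyncio.Future()  # run until cancelled


if __name__ == "__main__":
    asyncio.run(main())
```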

Technical Implementation

Frontend

  • Nuxt.js (Vue) frontend framework
  • Real-time audio visualization
  • WebSocket client integration
  • Responsive audio interface

Backend & AI Processing

  • Laravel API backend
  • Python for AI processing (hand-off sketched after this list)
  • YAMNet for sound classification
  • Whisper for speech transcription
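How the Laravel API hands work to the Python side is not spelled out here. One plausible arrangement, sketched below, is a shared Redis list that the API pushes uploaded-clip jobs onto and a Python worker drains; the list name, job shape, and hostname are assumptions for illustration only.

```python
# Hypothetical hand-off between the Laravel API and the Python AI worker:
# the API LPUSHes a JSON job for each uploaded clip onto a Redis list and
# this worker drains it. List name, job shape, and hostname are assumptions.
import json

import redis

queue = redis.Redis(host="redis", port=6379, decode_responses=True)

while True:
    # BRPOP blocks until a job arrives; LPUSH + BRPOP gives FIFO ordering.
    _, raw = queue.brpop("noisely:audio_jobs")
    job = json.loads(raw)  # e.g. {"id": 42, "path": "/uploads/clip.wav"}
    print("processing clip", job["path"])
    # ...run YAMNet / Whisper on the clip and persist the result (see below)
```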

Infrastructure

  • MySQL database for event storage (see the sketch after this list)
  • Redis for real-time caching
  • Docker containerization
  • WebSocket for real-time communication
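One way the pieces above could fit together: each detection is written to MySQL for later querying and published on a Redis channel so the WebSocket layer can relay it to connected clients. The table, columns, and channel name below are illustrative assumptions, not the project's schema.

```python
# Illustrative persistence + real-time fan-out: write each detection to MySQL
# and publish it on a Redis channel for the WebSocket layer to relay. The
# audio_events table, its columns, and the channel name are assumptions.
import json
from typing import Optional

import pymysql
import redis

db = pymysql.connect(host="mysql", user="noisely", password="secret",
                     database="noisely", autocommit=True)
cache = redis.Redis(host="redis", port=6379)


def store_event(label: str, score: float, text: Optional[str] = None) -> None:
    # Durable record for later analysis and reporting.
    with db.cursor() as cur:
        cur.execute(
            "INSERT INTO audio_events (label, score, transcript) VALUES (%s, %s, %s)",
            (label, score, text),
        )
    # Real-time path: subscribers (e.g. the WebSocket server) push this to the UI.
    cache.publish("noisely:events",
                  json.dumps({"label": label, "score": score, "text": text}))


store_event("Glass", 0.87)
```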

AI Models & Capabilities

YAMNet Integration

Google's YAMNet model, trained on the AudioSet corpus, scores incoming audio against 521 event classes in real time, allowing the system to identify sounds such as gunshots, breaking glass, alarms, and other environmental audio events.
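A minimal classification sketch using the public TF-Hub release of YAMNet. The 16 kHz mono float32 input format and the 521-class AudioSet label map come from the model itself; the frame averaging and top-1 selection here are simplifications of whatever thresholding the project actually applies.

```python
# Load YAMNet from TF Hub and classify a 16 kHz mono float32 waveform.
import csv

import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

yamnet = hub.load("https://tfhub.dev/google/yamnet/1")

# The model ships its 521 AudioSet class names as a CSV.
class_map_path = yamnet.class_map_path().numpy().decode("utf-8")
with tf.io.gfile.GFile(class_map_path) as f:
    class_names = [row["display_name"] for row in csv.DictReader(f)]


def classify_chunk(waveform: np.ndarray) -> tuple[str, float]:
    """Return the top class and its mean score for a 16 kHz float32 waveform."""
    scores, embeddings, spectrogram = yamnet(waveform)
    mean_scores = tf.reduce_mean(scores, axis=0)  # average over model frames
    top = int(tf.argmax(mean_scores))
    return class_names[top], float(mean_scores[top])


# One second of silence; typically comes back as ("Silence", ...).
print(classify_chunk(np.zeros(16000, dtype=np.float32)))
```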

Whisper Transcription

OpenAI's Whisper model handles speech-to-text transcription, converting spoken language into written text across multiple languages with high accuracy.
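A matching transcription sketch using the open-source openai-whisper package. The model size and fp16 setting are illustrative choices, not the project's published configuration.

```python
# Transcribe 16 kHz mono float32 audio with the open-source Whisper package.
import numpy as np
import whisper

model = whisper.load_model("base")  # "tiny"/"base" trade accuracy for speed


def transcribe_chunk(waveform: np.ndarray) -> str:
    """Transcribe a 16 kHz mono float32 waveform and return the text."""
    # transcribe() accepts a file path or a float32 NumPy array at 16 kHz.
    result = model.transcribe(waveform, fp16=False)
    return result["text"].strip()


# File-based usage works the same way:
# print(model.transcribe("clip.wav")["text"])
```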

Research Applications

  • Security and surveillance monitoring
  • Environmental sound analysis
  • Speech recognition and transcription
  • Real-time audio event logging
  • Academic research at Batam International University

Development Status

In Progress
Active research project with ongoing development and testing.