About the Project

What is Moodify?

An intelligent music recommendation system that meets you where your mood is - typed, spoken, or photographed.

🎯

Mission

Make music discovery feel like being understood. Detect mood from three modalities (text, voice, face), then surface tracks that match the moment - not the demographic.

🧠

Why three models?

People express mood differently across contexts. A short text rarely lies; a voice clip captures tone you cannot type; a photo catches the micro-expression you would not admit to. Moodify reads all three.

⚡

Built for speed

The memory-snapshot inference container restores in seconds, not minutes. Input to playlist averages under two seconds end to end, even on the first request after idle.

🌍

Open by default

MIT license, public repo, public model weights on Hugging Face Hub. Every architectural decision is documented in the READMEs and condensed into the wiki below.

📱

Web + iOS + Android

One React 18 codebase for web (Vercel), one Expo SDK 51 codebase for mobile (EAS Build). The mobile app talks directly to the Modal inference service - the Django API stays out of the inference data path.

🔒

Private by design

Photos and voice clips are sent to the inference container for a single classification call and never persisted. Only the resulting mood label and saved track metadata land in your profile.

🎱

Learns your taste - with real RL

Every 👍, 👎, and open-in-Deezer trains your playlist ranker. A Thompson Sampling contextual bandit re-orders the list in real time; a per-user calibration map even corrects mis-classified emotions. New & anonymous users see the rule-based pipeline untouched.

🎉

Built in public

Code, decisions, and tradeoffs are all on GitHub. Browse the issues, copy a pattern, file a PR. Moodify is a working reference for shipping AI products, not a black box.

Capabilities

Key Features

A complete emotion detection and music recommendation system, surfaced through a clean, accessible UI.

📝

Text Emotion Analysis

Fine-tuned BERT classifier reads tone, sentiment, and emotional vocabulary from typed input. Five labels: sadness, joy, love, anger, fear.

NLP BERT Transformers

🎤

Speech Emotion Detection

SVC classifier over librosa MFCC features. Eight labels: calm, happy, sad, angry, fearful, disgust, surprised, neutral. Sub-second feature extraction and sub-second inference.

SVC MFCC librosa

📸

Facial Expression Recognition

FER Keras model wrapped in MTCNN face detection. Seven labels (Ekman + neutral). Works on a single front-camera photo, no video stream required.

FER MTCNN facenet-pytorch

🎵

Deezer Recommendations

Keyless integration with Deezer's Search API. Returns track name, artist, album, 30s preview, cover art, popularity rank, and a deep link to the streamable web player.

Deezer API Keyless 30s Previews

🎯

Reinforcement Learning Loop

Thompson Sampling contextual bandit over a Beta-Bernoulli posterior. Every 👍 / 👎 / open-in-Deezer tap updates a 22-dim posterior (emotion × decade × duration × popularity-quintile) that re-ranks the next recommendation list. Cold-start safe: anonymous + new users see the base order until they cross 20 logged events.

Thompson Sampling Beta-Bernoulli Contextual Bandit Online RL
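
The re-ranking loop described in this card can be sketched roughly as follows. This is a minimal illustration, not Moodify's actual ranker: the class name, the feature strings, and the choice to average one Beta sample per feature are all assumptions, and the 22-dimension layout is not reproduced. Only the Beta-Bernoulli update, the Thompson sampling step, and the 20-event cold-start gate come from the description above.

```python
import random
from collections import defaultdict


class BetaBernoulliRanker:
    """Thompson Sampling with one Beta posterior per context feature (hypothetical layout)."""

    def __init__(self, seed=None):
        # Beta(1, 1) is a uniform prior: no opinion until feedback arrives.
        self.alpha = defaultdict(lambda: 1.0)
        self.beta = defaultdict(lambda: 1.0)
        self.rng = random.Random(seed)

    def update(self, features, reward):
        """reward is 1 for a thumbs-up or open-in-Deezer tap, 0 for a thumbs-down."""
        for feature in features:
            if reward:
                self.alpha[feature] += 1.0
            else:
                self.beta[feature] += 1.0

    def score(self, features):
        # Draw one plausible success rate per feature and average the draws.
        draws = [self.rng.betavariate(self.alpha[f], self.beta[f]) for f in features]
        return sum(draws) / len(draws)

    def rerank(self, tracks, events_logged, min_events=20):
        # Cold start: keep the base order until enough feedback is logged.
        if events_logged < min_events:
            return list(tracks)
        return sorted(tracks, key=lambda t: self.score(t["features"]), reverse=True)
```

Because each call draws fresh samples, features with little feedback still get explored occasionally, while features with a strong record of likes tend to rise to the top.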

🔮

Per-user Mood Calibration

A "Was this right?" widget captures every disagreement with the BERT detector. After three same-direction corrections (e.g. joy → love), the inference path rewrites the predicted label for that user only - immediate, no retrain, anonymous callers untouched.

Online Calibration Per-user No Retrain
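
One way to picture the calibration rule is a small per-user override table. The sketch below is illustrative only: the class and method names are hypothetical, and storage is an in-memory dict rather than whatever the real service uses. The threshold of three same-direction corrections and the rule that anonymous callers are untouched come from the description above.

```python
from collections import Counter


class MoodCalibrator:
    """Per-user label overrides learned from repeated corrections (hypothetical names)."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.corrections = Counter()  # (user_id, predicted, corrected) -> count
        self.overrides = {}           # (user_id, predicted) -> corrected

    def record_correction(self, user_id, predicted, corrected):
        if predicted == corrected:
            return  # the user agreed with the detector; nothing to learn
        key = (user_id, predicted, corrected)
        self.corrections[key] += 1
        if self.corrections[key] >= self.threshold:
            self.overrides[(user_id, predicted)] = corrected

    def apply(self, user_id, predicted):
        # Anonymous callers (no user id) always get the raw model label.
        if user_id is None:
            return predicted
        return self.overrides.get((user_id, predicted), predicted)
```

The override is a lookup at inference time, which is why it can take effect immediately without retraining the model.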

🧠

Base Personalization (EWMA + Markov)

Recency-weighted EWMA + first-order Markov chain + adaptive blend ratio produce the candidate list the bandit re-ranks. Recurring moods seep into the current playlist at a clamped, recency-decayed rate.

EWMA Markov O(history+tracks)
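
The three ingredients named in this card can be sketched in a few lines. Treat this as an assumption-heavy illustration: the function names, the decay factor of 0.8, and the clamp range of 0 to 0.4 are invented for the example and are not taken from the Moodify source.

```python
from collections import defaultdict


def ewma_mood_weights(history, decay=0.8):
    """history is oldest-first; newer moods get exponentially more weight."""
    weights = defaultdict(float)
    for age, mood in enumerate(reversed(history)):
        weights[mood] += decay ** age
    total = sum(weights.values())
    return {mood: w / total for mood, w in weights.items()} if total else {}


def markov_next_mood(history, current):
    """Most frequent successor of `current` in the history, or None if unseen."""
    counts = defaultdict(int)
    for prev, nxt in zip(history, history[1:]):
        if prev == current:
            counts[nxt] += 1
    return max(counts, key=counts.get) if counts else None


def blend_ratio(weights, recurring, lo=0.0, hi=0.4):
    """Share of playlist slots given to the recurring mood, clamped to [lo, hi]."""
    return max(lo, min(hi, weights.get(recurring, 0.0)))
```

Each function makes a single pass over the history, in line with the linear cost advertised in the card's tags.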

👤

User Management

JWT auth (HS256, 7d access + 14d refresh), Mongo-backed profile, append-only mood + listening history, saved recommendations with rich track objects.

JWT MongoDB mongoengine

🔐

Passwordless Passkeys

Sign in with Face ID, Touch ID, Windows Hello, or a hardware security key - no password. WebAuthn / FIDO2 verified by py_webauthn. Users enroll multiple passkeys and manage them (add / rename / delete) on a dedicated Account → Passkeys page; a verified assertion mints the same JWT pair as password login.

WebAuthn FIDO2 Passwordless py_webauthn

📱

Mobile Application

Expo SDK 51 + React Native 0.74 with Hermes engine. AsyncStorage for tokens, EAS Build for both stores, native camera + microphone for the three modalities.

React Native Expo Hermes

🛡️

Cost Protection

Two-tier sliding-window rate limit (general + media) keyed on JWT sub, two-layer caches (text + Deezer + hash-of-bytes), and a hard MAX_CONTAINERS=5 ceiling at the Modal layer.

Sliding-window LRU+TTL 429 Retry-After
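
A sliding-window limiter of the kind named here can be sketched with one deque of timestamps per JWT sub. The class and function names are hypothetical, and the choice to count media requests against both tiers is an assumption; the 45/min and 15/min limits are the ones quoted in the security section of this page.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per subject in any rolling window."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.hits = defaultdict(deque)  # subject -> timestamps of accepted requests

    def check(self, subject):
        """Return (allowed, retry_after_seconds)."""
        now = self.clock()
        queue = self.hits[subject]
        # Drop timestamps that have slid out of the window.
        while queue and now - queue[0] >= self.window:
            queue.popleft()
        if len(queue) < self.limit:
            queue.append(now)
            return True, 0.0
        # The oldest hit leaving the window frees the next slot.
        return False, self.window - (now - queue[0])


general = SlidingWindowLimiter(limit=45)  # every request
media = SlidingWindowLimiter(limit=15)    # speech and face uploads only


def allow_request(sub, is_media):
    """Assumed policy: media requests must pass both tiers."""
    ok, retry_after = general.check(sub)
    if ok and is_media:
        ok, retry_after = media.check(sub)
    return ok, retry_after
```

When `allowed` is false, the second value is what a server would send back in the Retry-After header of a 429 response.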

📊

SRE Metrics

Per-request observability into MongoDB time-series collections. Admin-only /api/metrics/ aggregator returns p50/p95/p99 latency, error rates, and status-code breakdown for any rolling window.

Time-series p95 Admin
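
For readers curious what such an aggregator computes, here is a small sketch over in-memory rows. The row schema (`latency_ms`, `status`), the nearest-rank percentile method, and the rule that only 5xx responses count as errors are assumptions made for illustration; the real endpoint reads from MongoDB and may differ on all three.

```python
import math


def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list; p in (0, 100]."""
    if not sorted_values:
        raise ValueError("no samples in window")
    rank = math.ceil(p * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize(rows):
    """rows: dicts with 'latency_ms' and 'status' keys (hypothetical schema)."""
    rows = list(rows)
    latencies = sorted(r["latency_ms"] for r in rows)
    by_status = {}
    for r in rows:
        by_status[r["status"]] = by_status.get(r["status"], 0) + 1
    # Assumption: only server errors (5xx) count toward the error rate.
    errors = sum(n for status, n in by_status.items() if status >= 500)
    return {
        "p50": percentile(latencies, 50),
        "p95": percentile(latencies, 95),
        "p99": percentile(latencies, 99),
        "error_rate": errors / len(rows),
        "status_codes": by_status,
    }
```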

🌡️

Resilient Fallbacks

Deezer down? Curated 14-track fallback list. Speech model fails to load? Endpoint returns degraded:true with neutral instead of 500. A failing model never takes down the others.

Fallback Degraded Isolated
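
The two fallbacks in this card follow the same pattern: catch the failure at the edge of one component and return something usable. A minimal sketch, with hypothetical function names and response keys modeled on the degraded:true flag mentioned above; treating an empty Deezer result like an outage is an extra assumption.

```python
def classify_speech(audio_bytes, load_model):
    """Return a mood payload; a model failure degrades instead of raising."""
    try:
        model = load_model()
        return {"emotion": model(audio_bytes), "degraded": False}
    except Exception:
        # Model failed to load or predict: answer with neutral rather than a 500.
        return {"emotion": "neutral", "degraded": True}


def tracks_or_fallback(search, phrase, fallback):
    """Use the live search when it works, else the curated fallback list."""
    try:
        tracks = search(phrase)
    except Exception:
        tracks = []
    # Assumption: an empty result is handled the same way as an outage.
    return tracks or list(fallback)
```

Because each modality wraps its own model call, a failure in one path leaves the other two untouched.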

🌙

Dark + Light Themes

Theme switch toggles a full set of CSS variables. Honors prefers-color-scheme on first paint; persisted in localStorage across sessions.

CSS Vars A11y SSR-safe

🌐

Market & Genre Filters

17 country markets (Global + 16) and 19 genre filters bias the Deezer search phrase before it leaves the inference container. Results stay culturally relevant.

17 markets 19 genres Server-side

How It Works

Three Steps. One Playlist.

From "how do I feel" to a playlist on your phone in under two seconds.

1

💬

Share a moment

Type a few words about your day, record a few seconds of your voice, or snap a quick selfie. Pick whichever feels easiest in the moment.

2

🧠

Moodify reads the mood

Three small AI models work behind the scenes - one for text, one for voice, one for faces. They turn your input into a mood label like "calm" or "happy."

3

🎵

Press play

Up to 60 tracks tuned to that mood appear instantly - with 30-second previews and one-tap deep links to Deezer. Save the ones you love to your history.

Why Moodify

Built for People, Not Algorithms

A side project that ended up being the music app I actually use every day.

🎯

It actually gets your mood

No more endless scrolling for the right vibe. Three AI models read how you feel and skip straight to music that fits.

🔒

Your photos stay yours

Selfies and voice clips are used once to detect mood, then thrown away. Only the mood label is saved. No tracking, no resale.

⚡

Instant playlists

Under two seconds from input to a ranked, deduped 60-track playlist. No spinners, no waiting, no manual searching.

🚀

Open source · MIT

Every line of code is on GitHub. Read it, fork it, self-host it, contribute. Pull requests welcome.

System Design

System Architecture

Cloud-native, modular, and split along a clean service boundary. Pick a lens below.

System Overview

Clients: Web (React 18), Mobile (Expo)

Edge (Vercel): Django + DRF, Swagger / Redoc

ML (Modal): FastAPI ASGI, BERT · SVC · FER, Personalization

External: Deezer Search, Hugging Face Hub

Data: MongoDB Atlas, Metrics TS (30d)

Two services, one experience

The web tier handles everything you see: sign-in, profile, mood history, app shell. Light and fast.
The AI tier handles the heavy lifting: emotion detection, personalization, and finding tracks that fit your mood.
Clean split means each side can grow on its own. The app stays snappy whether you have ten users or ten thousand.
Direct uploads from your phone or browser go straight to the AI tier, so big files don't get stuck in the middle.
Your data lives in a managed database. Mood history and saved tracks stay with you across visits and devices.
Open architecture. Every piece of the stack is documented and replaceable - swap any service, keep the experience.

Request Lifecycle

User input → Client capture → Modal /text|speech|facial_emotion → JWT verify → Model predict → Personalization → Deezer search → Response (60 tracks) → Django persist history

End-to-end request lifecycle

Client captures input locally - text, m4a audio, or jpg image.
Client posts directly to the Modal endpoint with the user's JWT. Modal's auth.authenticate verifies either the JWT (end-user) or MODAL_SERVICE_TOKEN (Django proxy).
The matching model classifies the input. The speech and face paths hash the upload bytes first and check the LRU+TTL cache, which short-circuits retry storms.
The personalization layer pulls the user's recent mood_history and decides whether to fetch a second Deezer search for the recurring mood.
Deezer returns up to 50 tracks per phrase. rank_by_quality blends Deezer popularity with curated position; interleave mixes the two moods at the adaptive blend ratio.
The client posts the result to Django to append mood_history and recommendations on the Mongo profile. Then it routes to the Results page.
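
The hash-then-cache step in the lifecycle above can be sketched as a small LRU cache with per-entry expiry, keyed on a digest of the uploaded bytes. Everything named here is hypothetical: the class, the SHA-256 choice, the 256-entry size, and the 300-second TTL are illustrative assumptions rather than Moodify's settings.

```python
import hashlib
import time
from collections import OrderedDict


class HashedResultCache:
    """LRU + TTL cache keyed on the SHA-256 of the uploaded bytes."""

    def __init__(self, max_entries=256, ttl_seconds=300.0, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = OrderedDict()  # hex digest -> (expires_at, label)

    @staticmethod
    def key_for(payload: bytes) -> str:
        return hashlib.sha256(payload).hexdigest()

    def get(self, payload):
        key = self.key_for(payload)
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, label = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        self._store.move_to_end(key)  # mark as most recently used
        return label

    def put(self, payload, label):
        key = self.key_for(payload)
        self._store[key] = (self.clock() + self.ttl, label)
        self._store.move_to_end(key)
        while len(self._store) > self.max_entries:
            self._store.popitem(last=False)  # evict the least recently used


def classify_cached(payload, cache, model):
    """Run the model only when the exact same bytes were not seen recently."""
    cached = cache.get(payload)
    if cached is not None:
        return cached
    label = model(payload)
    cache.put(payload, label)
    return label
```

Only the digest and the label are held in memory, so a client that retries the same upload skips inference without the service keeping the photo or clip itself.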

Two Deployment Paths

Canonical · production: Vercel (web + Django), Modal (FastAPI ML), MongoDB Atlas, Deezer Search

Self-host · Kubernetes: EKS / GKE / OKE / Kind, Helm charts, Argo CD GitOps, Terraform IaC, k8s-addons

Two deployment paths

Canonical (recommended): Vercel for web + Django API, Modal for ML inference, MongoDB Atlas for state. Production-tuned defaults. See DEPLOYMENT.md.
Self-host (optional): Drop the whole stack onto an EKS / GKE / OKE / Kind cluster. Terraform provisions the VPC + cluster + managed Postgres / Redis / S3 / monitoring; Helm packages the workloads; Argo CD reconciles from Git; the k8s-addons/ directory layers in External Secrets, OPA Gatekeeper, Velero, and Chaos Mesh. See INFRASTRUCTURE_SETUP.md.
CI/CD: Jenkins pipeline + GitHub Actions matrix for tests, security scans, image builds, and Argo CD-triggered rollouts. Blue-green and canary strategies are supported on the K8s path.

Layered Security Architecture

1 Perimeter

🛡️

Vercel Edge + DDoS

Anycast network, mTLS to upstream

🔥

Rate Limiting

DRF 60/240 + Modal 45/15 per user

⏱️

Hard Container Cap

MAX_CONTAINERS=5 on Modal

↓

2 Application

🔑

JWT (HS256)

7d access + 14d refresh

👥

Mongo-backed auth

No SQL auth_user table

✅

Input Validation

DRF + Pydantic schemas

🚫

No CSRF surface

Header auth only, no cookies

🔒

Service Token

Django ↔ Modal proxy

↓

3 Data

🔐

Encryption at Rest

MongoDB Atlas-managed

🌐

TLS in Transit

TLS 1.3 default everywhere

🕷️

No Photo/Voice Storage

Only labels persist

📋

SRE Metrics (30d TTL)

Admin-only access

Security & privacy summary

JWT - the same JWT_SIGNING_KEY signs on Django and verifies on Modal. End-user tokens are 7d access + 14d refresh; silent refresh is wired into the React client.
Passkeys (WebAuthn / FIDO2) - optional passwordless sign-in. Many passkeys per user; verification via py_webauthn; single-use, expiring challenges; signature-counter clone detection; login never discloses whether an account exists. A verified assertion yields the same JWT pair as a password login.
Service token - MODAL_SERVICE_TOKEN is a shared constant for the Django→Modal proxy path; bypasses the Modal rate limiter (DRF throttles that side).
No CSRF surface - auth lives in the header; CsrfViewMiddleware is intentionally removed.
Photo / voice privacy - uploads hit the inference container, run inference, and are dropped. Only the predicted label is returned and persisted.
Abuse protection - a sliding-window rate limit (45/min general + 15/min media per JWT sub) plus a hard MAX_CONTAINERS=5 ceiling on Modal together bound worst-case capacity.
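
The shared-key arrangement in the JWT item above works because HS256 is symmetric: whoever holds the key can both sign and verify. The sketch below shows the mechanics using only the Python standard library so it runs on its own. It is not Moodify's code (the stack lists PyJWT for this job), and the function names and the `now` parameter are invented for the example.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, key: bytes) -> str:
    """What the issuing side (Django in this architecture) would do."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":")).encode())
    payload = _b64url(json.dumps(claims, separators=(",", ":")).encode())
    signature = hmac.new(key, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(signature)}"


def verify_hs256(token: str, key: bytes, now=None) -> dict:
    """What the verifying side (Modal) would do with the same key."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError as exc:
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(key, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):  # constant-time comparison
        raise ValueError("bad signature")
    if claims.get("exp", float("inf")) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims
```

A token signed with one key fails verification under any other key, which is why both services must be configured with the same JWT_SIGNING_KEY.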

Tech Stack

Technologies & Tools

Modern, production-ready, and selected for long-term maintainability over novelty.

⚛️ Frontend

React 18

Redux Toolkit

Material UI

Axios

React Router 6

Emotion CSS-in-JS

Webpack via CRA

Babel

ESLint

Prettier

Jest

Testing Library

🎤 Music & External APIs

Deezer API

Hugging Face Hub

📱 Mobile

React Native 0.74

Expo SDK 51

EAS Build

Hermes JS engine

AsyncStorage

🐍 Backend

Python

Django

Django REST Framework

FastAPI

Pydantic

MongoEngine

PyJWT

Swagger

Redoc

Pytest

Gunicorn

WhiteNoise

🤖 AI / ML

PyTorch

TensorFlow

Keras

Transformers

OpenCV

Scikit-learn

NumPy

SciPy

Pandas

Librosa

FER

facenet-pytorch

MLflow

💾 Databases & Caches

MongoDB

MongoDB Atlas

Redis

SQLite (dev)

⚙️ DevOps & CI/CD

Docker

Kubernetes

Helm

Argo CD

Jenkins

Terraform

GitHub Actions

NGINX

Prometheus

Grafana

Loki

Tempo

☁️ Cloud & Hosting

Vercel

Modal

AWS

GCP

Oracle Cloud

Netlify (alt)

Infrastructure

Deployment & Operations

Multiple paths. Pick the one that matches your risk and infra budget.

🟥🟩

Blue-Green Deployment

Two identical production environments behind one Service. Promote by flipping the selector label - instant traffic switch with zero downtime, instant rollback by flipping back.

✅ Instant traffic switching
✅ Full environment pre-validation
✅ A/B testing surface
✅ Zero downtime

🐤

Canary Deployment

Argo Rollouts gradually shifts traffic 10% → 25% → 50% → 100% with automated health checks at each step. Rollback is automatic on a failed gate.

✅ Progressive traffic increase
✅ Early failure detection
✅ Automated rollback
✅ Minimal blast radius

🔄

CI/CD Pipeline

GitHub Actions on the canonical path, Jenkins on the self-host path. Lint, test, security scan, image build, push, and Argo trigger. Quality gates + approvals on protected branches.

✅ Automated testing
✅ Security scanning (Snyk + Trivy)
✅ Quality gates
✅ Multi-environment matrix

⚙️

Kubernetes Orchestration

HPA, PDB, NetworkPolicy, and an Ingress with cert-manager. Auto-scaling and self-healing. The same Helm charts deploy to Kind locally, then EKS / GKE / OKE in production.

✅ Auto-scaling
✅ Self-healing
✅ Load balancing
✅ NetworkPolicy isolation

⚡

Vercel + Modal (canonical)

The recommended path. Git push → preview URL on every PR. modal deploy atomically swaps the inference container. Rollback by re-deploying a previous SHA.

✅ Push-to-deploy
✅ Memory-snapshot ML
✅ PR preview URLs
✅ Atomic Modal swaps

🎯

GitOps with Argo CD

Cluster watches the repo. Changes under argocd/applications/ reconcile automatically. No kubectl apply from laptops, no drift between Git and cluster.

✅ Declarative + auditable
✅ Drift detection
✅ Revert = revert the commit
✅ Per-app sync waves

🔌

Infrastructure as Code

Terraform modules under terraform/ spin up VPC + cluster + databases + monitoring for AWS / GCP / Azure / Oracle. One terraform apply away from a fresh environment.

✅ Multi-cloud parity
✅ Per-env workspaces
✅ State stored remotely
✅ Tear down in one command

🏭

Cluster Add-ons

Drop-in operators for the self-host path. External Secrets pulls from Vault, OPA Gatekeeper enforces policy, Velero handles backups, Chaos Mesh proves the system can take a punch.

✅ Secret sync
✅ Policy guardrails
✅ Cluster backups
✅ Chaos engineering

Monitoring & Observability

📊

Metrics

Per-request rows written to MongoDB time-series collections (canonical) or kube-prometheus-stack with custom dashboards (self-host).

📝

Logs

Vercel + Modal logs natively, plus Loki + Promtail with label-based querying on the K8s path.

🔍

Tracing

Grafana Tempo on the K8s path; OpenTelemetry hooks ready in the Django middleware.

🚨

Alerting

SLO PrometheusRule ships in the monitoring chart: latency, error rate, container budget. Slack / PagerDuty receivers wired in values.yaml.

📊 By the Numbers

Real Scale, Real Speed

A snapshot of what Moodify ships on the wire today.

3

SELF-TRAINED MODELS

text · voice · face

60

TRACKS PER ANALYSIS

de-duplicated

13

MOOD PALETTES

per-mood gradients

19

GENRE FILTERS

w/ icon tints

17

MARKETS

Global + 16 countries

3

PLATFORMS

Web · iOS · Android

80ms

WARM LATENCY

text inference

1-2s

COLD START

snapshot restore

261+

TESTS PASSING

80 backend · 181 modal

MIT

LICENSE

100% open source

92%

TEXT ACCURACY

held-out test set

5

API ENDPOINTS

FastAPI on Modal

⚡ Performance

Built for Cold-Start Reality

Numbers measured against the production deploy.

⚡

Inference latency

Warm container

Text · BERT: ~80 ms

Face · FER + MTCNN: ~340 ms

Voice · SVC + MFCC: ~520 ms

Deezer search: ~180 ms

🎯

Model accuracy

Held-out test set

Text emotion: 92%

Facial emotion: 87%

Speech emotion: 84%

End-to-end satisfaction: 96%

🛡️

Resilience signals

Production telemetry

JWT refresh success: 99.4%

Degraded fallback rate: 3.1%

Cold-start frequency: ~8%

Deezer success rate: 98.2%

🧮

Cache hit rates

Production averages

Text emotion cache: ~72%

Deezer search cache: ~88%

Speech hash cache: ~18%

Facial hash cache: ~22%

Try It Now

Live Deployment

Experience Moodify in action - or read the live API contract.

🌐

Web Application

The full SPA with all three emotion modes, mood history, and the personalized Deezer playlist.

Launch App →

Live on Vercel

🔨

Backend API

Live Django REST API with OpenAPI 3 contract. Interactive Swagger UI for testing, Redoc for reading.

Swagger UI → Redoc →

Live on Vercel

💻

Source Code

Open-source on GitHub with full READMEs per subsystem, OpenAPI YAML, Terraform, Helm, and CI/CD.

View on GitHub →

MIT License

🤖

ML Inference

FastAPI on Modal. Scale-to-zero, CPU memory snapshots, sliding-window rate limiting. GET /health for liveness.

Modal README →

Live on Modal

⚠️ Important Notes

The Django API is deployed to Vercel and the ML inference runs on Modal with memory snapshots, so the first request after idle may take ~1-2 s while a container restores. Subsequent calls hit a warm container.
Facial and speech emotion models load on first cold start; once snapshotted they restore in seconds instead of re-loading from disk.
For optimal performance and total control, clone the repository and run locally.
All three modalities, the personalization engine, and the recommendation pipeline ship behind one FastAPI service on Modal.

❓ FAQ

Frequently Asked Questions

Quick answers to what people ask most.

What makes Moodify different?

Most music apps ask you to pick a genre or playlist. Moodify reads your mood three different ways (text, voice, face) and tunes the music to fit - no scrolling, no decision fatigue, no manual searching.

Do I need an account?

Yes - an account keeps your mood history and saved tracks with you across visits and devices. Sign-up takes about 10 seconds: username, email, password. That's it.

Can I sign in without a password?

Yes - Moodify supports passkeys. Right after sign-up (or anytime from Account → Passkeys) you can add a passkey and then sign in with Face ID, your fingerprint, Windows Hello, or a hardware security key - no password to type. You can add several (phone, laptop, security key), rename or remove them, and your password keeps working as a fallback.

How does it figure out my mood?

Three ways - pick whichever you feel like. Type a few words about how you feel. Record a few seconds of your voice. Snap a quick selfie. Moodify reads any of those and finds music that matches.

Do you keep my photos or voice recordings?

No. Your photo or voice clip is used once to figure out the mood, then thrown away. Only the mood label itself ("happy", "calm", etc.) is kept so your history works. Nothing about your face or voice is stored.

Where does the music come from?

Deezer. Every recommendation links straight to a streamable track on Deezer with a 30-second preview built in. Tap any track to open it on Deezer and listen.

Can I pick a genre too?

Yes. You can bias the recommendations toward 19 different genres (lo-fi, EDM, k-pop, classical, hip-hop, etc.) and 17 different country markets if you want the catalog to feel local.

Why is the first request a little slow?

Moodify's inference container hibernates when nobody is using it. The first request after a quiet stretch takes about 1-2 seconds to wake the server up. Every request after that is fast.

How accurate is the mood detection?

Around 84-92% across the three modes on our test data. It works best on clear input: a normal-length sentence, a few seconds of audible voice, or a well-lit selfie facing the camera.

Does it work on my phone?

Yes - Moodify runs in any modern mobile browser, and there's a native iOS + Android app built with Expo. The mobile app uses the phone's camera and microphone directly.

Does it work offline?

No. Mood detection and recommendation both happen on a server, so an internet connection is required. The app itself opens fast, but you need to be online to get a playlist.

Can I delete my account?

Yes - any time. The profile page has a Delete Account button. Your account, mood history, and saved recommendations are removed immediately.

What if Moodify gets my mood wrong?

It happens occasionally. Try a different mode (text vs voice vs face), or use the genre filter to push the playlist in the direction you actually want. The more you use Moodify, the better your personal blend gets.

Who built this?

Son Nguyen (@hoangsonww). Solo project, MIT-licensed, code is fully open on GitHub.

I found a bug / want a feature.

Open an issue on GitHub or email hoangson091104@gmail.com. Pull requests welcome.

Ready to listen to your mood?

Open the app, share how you feel, get a playlist tuned to the moment. Three modes, one ranked Deezer playlist, ready in seconds.

🚀 Try Moodify now ⭐ Star on GitHub