AeThex-Connect/PHASE4-CALLS.md
MrPiglr 6dd4751ba9
Phase 4: Voice & Video Calls - Complete WebRTC Implementation
- Database schema: Extended calls/call_participants tables, added turn_credentials
- Backend: callService (390+ lines), 7 REST API endpoints, WebSocket signaling
- Frontend: WebRTC manager utility, Call React component with full UI
- Features: 1-on-1 calls, group calls, screen sharing, media controls
- Security: TURN credentials with HMAC-SHA1, 24-hour TTL
- Documentation: PHASE4-CALLS.md with complete setup guide
- Testing: Server running successfully with all routes loaded
2026-01-10 05:20:08 +00:00


# PHASE 4: VOICE & VIDEO CALLS - DOCUMENTATION
## Overview
Phase 4 implements WebRTC-based voice and video calling with support for:
- 1-on-1 audio and video calls
- Group calls with up to 20 participants
- Screen sharing
- TURN/STUN servers for NAT traversal
- Real-time media controls (mute, video toggle)
- Connection quality monitoring
- Call recording infrastructure (a `recording_url` column is in place; recording itself is listed under Future Enhancements)
## Architecture
### WebRTC Topology
- **1-on-1 Calls**: Mesh topology with a direct peer-to-peer connection between the two participants
- **Group Calls**: Planned SFU (Selective Forwarding Unit) built on Mediasoup. This is currently a placeholder; the SFU is not yet implemented
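
The choice between mesh and SFU comes down to how many connections each topology needs. The sketch below is only an illustration of that arithmetic; the function names are hypothetical and not part of the codebase.

```javascript
// Total peer connections in a full mesh: every pair of participants
// needs its own RTCPeerConnection.
function meshConnections(participants) {
  return (participants * (participants - 1)) / 2;
}

// With an SFU, each participant keeps a single connection to the server.
function sfuConnections(participants) {
  return participants;
}

console.log(meshConnections(2), sfuConnections(2));   // 1 2
console.log(meshConnections(20), sfuConnections(20)); // 190 20
```

For two participants a mesh needs a single connection, which is why it suits 1-on-1 calls; at the 20-participant limit a mesh would need 190.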
### Components
```
┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│  Client A   │◄───────►│   Server    │◄───────►│  Client B   │
│  (Browser)  │ WebRTC  │  Socket.io  │ WebRTC  │  (Browser)  │
│             │ Signals │  Signaling  │ Signals │             │
└─────────────┘         └─────────────┘         └─────────────┘
       ▲                       │                       ▲
       │                       │                       │
       │            ┌──────────▼──────────┐            │
       └───────────►│  TURN/STUN Server   │◄───────────┘
                    │   (NAT Traversal)   │
                    └─────────────────────┘
```
## Database Schema
### Calls Table
```sql
CREATE TABLE calls (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
conversation_id UUID NOT NULL REFERENCES conversations(id) ON DELETE CASCADE,
type VARCHAR(20) NOT NULL DEFAULT 'audio', -- 'audio', 'video', 'screen'
status VARCHAR(20) NOT NULL DEFAULT 'initiated',
-- Status: 'initiated', 'ringing', 'active', 'ended', 'missed', 'rejected', 'failed'
initiated_by UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
started_at TIMESTAMPTZ,
ended_at TIMESTAMPTZ,
duration_seconds INTEGER,
end_reason VARCHAR(50),
sfu_room_id VARCHAR(255),
recording_url TEXT,
quality_stats JSONB,
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
```
### Call Participants Table
```sql
CREATE TABLE call_participants (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
call_id UUID NOT NULL REFERENCES calls(id) ON DELETE CASCADE,
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
status VARCHAR(20) NOT NULL DEFAULT 'invited',
-- Status: 'invited', 'ringing', 'joined', 'left', 'rejected', 'missed'
joined_at TIMESTAMPTZ,
left_at TIMESTAMPTZ,
ice_candidates JSONB,
media_state JSONB DEFAULT '{"audioEnabled": true, "videoEnabled": true, "screenSharing": false}',
media_stats JSONB,
connection_quality VARCHAR(20),
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
```
### TURN Credentials Table
```sql
CREATE TABLE turn_credentials (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
username VARCHAR(255) NOT NULL,
credential VARCHAR(255) NOT NULL,
expires_at TIMESTAMPTZ NOT NULL,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Cleanup function for expired credentials (call it periodically, e.g. from a scheduled job)
CREATE OR REPLACE FUNCTION cleanup_expired_turn_credentials()
RETURNS void AS $$
BEGIN
DELETE FROM turn_credentials WHERE expires_at < NOW();
END;
$$ LANGUAGE plpgsql;
```
## API Endpoints
### 1. POST `/api/calls/initiate`
Initiate a new call. `type` must be `"video"` or `"audio"`.
**Request:**
```json
{
  "conversationId": "uuid",
  "type": "video",
  "participantIds": ["uuid1", "uuid2"]
}
```
**Response:**
```json
{
"callId": "uuid",
"status": "initiated",
"participants": [
{
"userId": "uuid",
"userName": "John Doe",
"userIdentifier": "john@example.com",
"status": "invited"
}
]
}
```
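
Before a call is created, the request body can be checked against the shape shown above. This is a minimal sketch, assuming IDs are standard UUID strings; `validateInitiateRequest` is a hypothetical helper, and the real `callService` may validate differently.

```javascript
// Allowed call types, taken from the request example above.
const CALL_TYPES = ['audio', 'video'];
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Returns a list of validation errors; an empty list means the body is valid.
function validateInitiateRequest(body) {
  if (!body || typeof body !== 'object') {
    return ['body must be a JSON object'];
  }
  const errors = [];
  if (!UUID_RE.test(String(body.conversationId))) {
    errors.push('conversationId must be a UUID');
  }
  if (!CALL_TYPES.includes(body.type)) {
    errors.push('type must be "audio" or "video"');
  }
  if (!Array.isArray(body.participantIds) || body.participantIds.length === 0) {
    errors.push('participantIds must be a non-empty array');
  } else if (!body.participantIds.every((id) => UUID_RE.test(String(id)))) {
    errors.push('participantIds must contain only UUIDs');
  }
  return errors;
}
```

A route handler could respond with HTTP 400 and the returned list whenever it is non-empty.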
### 2. POST `/api/calls/:callId/answer`
Answer an incoming call.
**Response:**
```json
{
"callId": "uuid",
"status": "active",
"startedAt": "2025-01-10T14:30:00Z"
}
```
### 3. POST `/api/calls/:callId/reject`
Reject an incoming call.
**Response:**
```json
{
"callId": "uuid",
"status": "rejected"
}
```
### 4. POST `/api/calls/:callId/end`
End an active call.
**Response:**
```json
{
"callId": "uuid",
"status": "ended",
"duration": 120,
"endReason": "ended-by-user"
}
```
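
The `duration` field corresponds to `duration_seconds` in the `calls` table. One plausible way to derive it from `started_at` and `ended_at` is sketched below; the helper name is hypothetical, and returning `null` for calls that never started is an assumption.

```javascript
// Whole seconds between two ISO 8601 timestamps.
// Returns null when either timestamp is missing/invalid or the range is negative.
function callDurationSeconds(startedAt, endedAt) {
  const start = Date.parse(startedAt);
  const end = Date.parse(endedAt);
  if (Number.isNaN(start) || Number.isNaN(end) || end < start) {
    return null;
  }
  return Math.floor((end - start) / 1000);
}

console.log(callDurationSeconds('2025-01-10T14:30:00Z', '2025-01-10T14:32:00Z')); // 120
```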
### 5. PATCH `/api/calls/:callId/media`
Update media state (mute/unmute, video on/off).
**Request:**
```json
{
"audioEnabled": true,
"videoEnabled": false,
"screenSharing": false
}
```
**Response:**
```json
{
"success": true,
"mediaState": {
"audioEnabled": true,
"videoEnabled": false,
"screenSharing": false
}
}
```
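
If the endpoint accepts partial updates (an assumption; the source only shows a full body), the new state can be computed by merging the patch over the participant's stored `media_state`. `mergeMediaState` is a hypothetical helper, and the defaults mirror the column default in `call_participants`.

```javascript
// Defaults mirror the media_state column default in call_participants.
const DEFAULT_MEDIA_STATE = {
  audioEnabled: true,
  videoEnabled: true,
  screenSharing: false,
};

// Merge a (possibly partial) patch over the current state.
// Unknown keys are ignored; non-boolean values are rejected.
function mergeMediaState(current, patch) {
  const next = { ...DEFAULT_MEDIA_STATE, ...current };
  for (const key of Object.keys(DEFAULT_MEDIA_STATE)) {
    if (patch && key in patch) {
      if (typeof patch[key] !== 'boolean') {
        throw new TypeError(`${key} must be a boolean`);
      }
      next[key] = patch[key];
    }
  }
  return next;
}
```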
### 6. GET `/api/calls/turn-credentials`
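Get temporary TURN server credentials. The numeric prefix of `username` is the credential's expiry time as a Unix timestamp, so it matches `expiresAt`.
**Response:**
```json
{
"credentials": {
"urls": ["turn:turn.example.com:3478"],
"username": "1736604000:username",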
"credential": "hmac-sha1-hash"
},
"expiresAt": "2025-01-11T14:00:00Z"
}
```
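
On the client, this response maps almost directly onto the `iceServers` entry of an `RTCPeerConnection` configuration. A minimal sketch, assuming the response shape shown above; `toRtcConfiguration`, `credentialsExpired`, and the 60-second safety margin are illustrative and not part of the project.

```javascript
// Build an RTCPeerConnection configuration from the endpoint's response.
function toRtcConfiguration(response) {
  const { urls, username, credential } = response.credentials;
  return { iceServers: [{ urls, username, credential }] };
}

// True when the credentials are expired or about to expire.
function credentialsExpired(response, nowMs = Date.now(), marginMs = 60000) {
  return Date.parse(response.expiresAt) - nowMs <= marginMs;
}
```

In the browser the result would be passed as `new RTCPeerConnection(toRtcConfiguration(body))`, with a fresh fetch whenever `credentialsExpired(body)` returns `true`.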
### 7. GET `/api/calls/:callId`
Get call details.
**Response:**
```json
{
"call": {
"id": "uuid",
"conversationId": "uuid",
"type": "video",
"status": "active",
"initiatedBy": "uuid",
"startedAt": "2025-01-10T14:30:00Z",
"participants": [...]
}
}
```
## WebSocket Events
### Client → Server
#### `call:offer`
Send WebRTC offer to peer.
```javascript
socket.emit('call:offer', {
callId: 'uuid',
targetUserId: 'uuid',
offer: RTCSessionDescription
});
```
#### `call:answer`
Send WebRTC answer to peer.
```javascript
socket.emit('call:answer', {
callId: 'uuid',
targetUserId: 'uuid',
answer: RTCSessionDescription
});
```
#### `call:ice-candidate`
Send ICE candidate to peer.
```javascript
socket.emit('call:ice-candidate', {
callId: 'uuid',
targetUserId: 'uuid',
candidate: RTCIceCandidate
});
```
### Server → Client
#### `call:incoming`
Notify user of incoming call.
```javascript
socket.on('call:incoming', (data) => {
// data: { callId, conversationId, type, initiatedBy, participants }
});
```
#### `call:offer`
Receive WebRTC offer from peer.
```javascript
socket.on('call:offer', (data) => {
// data: { callId, fromUserId, offer }
});
```
#### `call:answer`
Receive WebRTC answer from peer.
```javascript
socket.on('call:answer', (data) => {
// data: { callId, fromUserId, answer }
});
```
#### `call:ice-candidate`
Receive ICE candidate from peer.
```javascript
socket.on('call:ice-candidate', (data) => {
// data: { callId, fromUserId, candidate }
});
```
#### `call:ended`
Notify that call has ended.
```javascript
socket.on('call:ended', (data) => {
// data: { callId, reason, endedBy }
});
```
#### `call:participant-joined`
Notify that participant joined group call.
```javascript
socket.on('call:participant-joined', (data) => {
// data: { callId, userId, userName, userIdentifier }
});
```
#### `call:participant-left`
Notify that participant left group call.
```javascript
socket.on('call:participant-left', (data) => {
// data: { callId, userId }
});
```
#### `call:media-state-changed`
Notify that participant's media state changed.
```javascript
socket.on('call:media-state-changed', (data) => {
// data: { callId, userId, mediaState }
});
```
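
For the three signaling events, the server's job is essentially to swap `targetUserId` for `fromUserId` and forward the payload. The sketch below shows that transformation as a pure function. It is an assumption about how the relay could work, not the project's actual handler, and the `user:<id>` room naming is hypothetical.

```javascript
// Signaling events the server forwards, mapped to the payload field they carry.
const RELAYED_EVENTS = new Map([
  ['call:offer', 'offer'],
  ['call:answer', 'answer'],
  ['call:ice-candidate', 'candidate'],
]);

// Turn a client -> server payload into the server -> client message.
function buildRelayMessage(event, fromUserId, payload) {
  const field = RELAYED_EVENTS.get(event);
  if (!field) {
    throw new Error(`not a relayed event: ${event}`);
  }
  return {
    room: `user:${payload.targetUserId}`, // hypothetical per-user room name
    event,
    data: { callId: payload.callId, fromUserId, [field]: payload[field] },
  };
}
```

With Socket.io, a handler could then forward the message with `io.to(msg.room).emit(msg.event, msg.data)`, taking `fromUserId` from the authenticated socket rather than from the client payload.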
## Frontend Integration
### WebRTC Manager Usage
```javascript
import WebRTCManager from './utils/webrtc';
// Initialize
const webrtcManager = new WebRTCManager(socket);
// Set TURN credentials (fetch resolves to a Response, so parse the JSON body first)
const response = await fetch('/api/calls/turn-credentials');
const { credentials } = await response.json();
await webrtcManager.setTurnCredentials(credentials);
// Get local media stream
const localStream = await webrtcManager.initializeLocalStream(true, true);
localVideoRef.current.srcObject = localStream;
// Setup event handlers
webrtcManager.onRemoteStream = (userId, stream) => {
  remoteVideoRef.current.srcObject = stream;
};
// Initiate call
webrtcManager.currentCallId = callId;
webrtcManager.isInitiator = true;
await webrtcManager.initiateCallToUser(targetUserId);
// Toggle audio/video
webrtcManager.toggleAudio(false); // mute
webrtcManager.toggleVideo(false); // video off
// Screen sharing
await webrtcManager.startScreenShare();
webrtcManager.stopScreenShare();
// Cleanup
webrtcManager.cleanup();
```
### Call Component Usage
```javascript
import { useState } from 'react';
import Call from './components/Call';
// `socket` is assumed to be an already-connected Socket.io client.
function App() {
  const [showCall, setShowCall] = useState(false);
  return (
    <div>
      {showCall && (
        <Call
          socket={socket}
          conversationId="uuid"
          participants={[
            { userId: 'uuid', userName: 'John Doe' }
          ]}
          onCallEnd={(data) => {
            console.log('Call ended:', data);
            setShowCall(false);
          }}
        />
      )}
    </div>
  );
}
```
## TURN Server Setup (Coturn)
### Installation
```bash
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install coturn
# Enable service
sudo systemctl enable coturn
```
### Configuration
Edit `/etc/turnserver.conf`:
```conf
# Listening port
listening-port=3478
tls-listening-port=5349
# External IP (replace with your server IP)
external-ip=YOUR_SERVER_IP
# Relay IPs
relay-ip=YOUR_SERVER_IP
# Realm
realm=turn.yourdomain.com
# Authentication
use-auth-secret
static-auth-secret=YOUR_TURN_SECRET
# Logging
verbose
log-file=/var/log/turnserver.log
# Security
no-multicast-peers
no-cli
no-loopback-peers
no-tlsv1
no-tlsv1_1
# Quotas
max-bps=1000000
user-quota=12
total-quota=1200
```
### Environment Variables
Add to `.env`:
```env
# TURN Server Configuration
TURN_SERVER_HOST=turn.yourdomain.com
TURN_SERVER_PORT=3478
TURN_SECRET=your-turn-secret-key
TURN_TTL=86400
```
### Firewall Rules
```bash
# Allow TURN ports
sudo ufw allow 3478/tcp
sudo ufw allow 3478/udp
sudo ufw allow 5349/tcp
sudo ufw allow 5349/udp
# Allow UDP relay ports
sudo ufw allow 49152:65535/udp
```
### Start Service
```bash
sudo systemctl start coturn
sudo systemctl status coturn
```
### Testing TURN Server
Use the [Trickle ICE](https://webrtc.github.io/samples/src/content/peerconnection/trickle-ice/) test page:
1. Add your TURN server URL: `turn:YOUR_SERVER_IP:3478`
2. Generate TURN credentials using the HMAC method
3. Click "Gather candidates"
4. Verify `relay` candidates appear
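
For step 2, coturn's `use-auth-secret` mode expects a username of the form `<expiry-unix-timestamp>:<user-id>` and a password equal to the Base64-encoded HMAC-SHA1 of that username, keyed with `static-auth-secret`. The Node.js sketch below follows that scheme. The function name and parameters are illustrative, and the project's `callService` may differ in detail.

```javascript
const crypto = require('crypto');

// Generate time-limited TURN credentials for coturn's use-auth-secret mode.
// `secret` must match static-auth-secret in turnserver.conf (TURN_SECRET in .env).
function generateTurnCredentials(userId, secret, ttlSeconds = 86400, nowMs = Date.now()) {
  // The timestamp in the username is the expiry time, not the issue time.
  const expiresAtSeconds = Math.floor(nowMs / 1000) + ttlSeconds;
  const username = `${expiresAtSeconds}:${userId}`;
  const credential = crypto
    .createHmac('sha1', secret)
    .update(username)
    .digest('base64');
  return {
    username,
    credential,
    expiresAt: new Date(expiresAtSeconds * 1000).toISOString(),
  };
}

const creds = generateTurnCredentials('alice', 'YOUR_TURN_SECRET');
console.log(creds.username, creds.credential);
```

Paste the printed username and credential into the Trickle ICE page along with the TURN URL.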
## Media Codecs
### Audio
- **Codec**: Opus
- **Sample Rate**: 48 kHz
- **Bitrate**: 32-128 kbps (adaptive)
- **Features**: Echo cancellation, noise suppression, auto gain control
### Video
- **Codecs**: VP8, VP9, H.264 (fallback)
- **Clock Rate**: 90 kHz
- **Resolutions**:
  - 1280x720 (HD) - default
  - 640x480 (SD) - low bandwidth
  - 320x240 (LD) - very low bandwidth
- **Frame Rate**: 30 fps (ideal), 15-60 fps range
- **Bitrate**: 500 kbps-2 Mbps (adaptive)
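
The audio features and video tiers listed above translate into `getUserMedia` constraints roughly as follows. This is a sketch only: the tier keys (`hd`, `sd`, `ld`) and the helper name are assumptions, and the actual WebRTC manager may build its constraints differently.

```javascript
// Resolution tiers from the list above.
const VIDEO_TIERS = {
  hd: { width: 1280, height: 720 },
  sd: { width: 640, height: 480 },
  ld: { width: 320, height: 240 },
};

// Build a MediaStreamConstraints object for navigator.mediaDevices.getUserMedia().
function buildMediaConstraints(tier = 'hd', withVideo = true) {
  const audio = {
    echoCancellation: true,
    noiseSuppression: true,
    autoGainControl: true,
  };
  if (!withVideo) {
    return { audio, video: false };
  }
  const size = VIDEO_TIERS[tier];
  if (!size) {
    throw new RangeError(`unknown video tier: ${tier}`);
  }
  return {
    audio,
    video: {
      width: { ideal: size.width },
      height: { ideal: size.height },
      frameRate: { ideal: 30, min: 15, max: 60 },
    },
  };
}
```

In the browser, `navigator.mediaDevices.getUserMedia(buildMediaConstraints('sd'))` would request the low-bandwidth tier. Codec selection is negotiated by the browsers and is not part of these constraints.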
## Connection Quality Monitoring
The system monitors connection quality based on:
1. **Round Trip Time (RTT)**
   - Good: < 100 ms
   - Fair: 100-300 ms
   - Poor: > 300 ms
2. **Packet Loss**
   - Good: < 2%
   - Fair: 2-5%
   - Poor: > 5%
3. **Available Bitrate**
   - Good: > 500 kbps
   - Fair: 200-500 kbps
   - Poor: < 200 kbps
Quality is checked every 3 seconds and displayed to users.
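
The thresholds above can be expressed as a small classifier. The source does not say how the three metrics are combined, so taking the worst of the three is an assumption here, as are the function and field names.

```javascript
const LEVELS = ['good', 'fair', 'poor'];

// Per-metric ratings using the thresholds listed above.
function rateRtt(ms) {
  return ms < 100 ? 'good' : ms <= 300 ? 'fair' : 'poor';
}
function ratePacketLoss(percent) {
  return percent < 2 ? 'good' : percent <= 5 ? 'fair' : 'poor';
}
function rateBitrate(kbps) {
  return kbps > 500 ? 'good' : kbps >= 200 ? 'fair' : 'poor';
}

// Overall quality is the worst of the three ratings (an assumption).
function classifyConnection({ rttMs, packetLossPct, bitrateKbps }) {
  const worst = Math.max(
    LEVELS.indexOf(rateRtt(rttMs)),
    LEVELS.indexOf(ratePacketLoss(packetLossPct)),
    LEVELS.indexOf(rateBitrate(bitrateKbps)),
  );
  return LEVELS[worst];
}

console.log(classifyConnection({ rttMs: 150, packetLossPct: 1, bitrateKbps: 800 })); // fair
```

The inputs would typically come from `RTCPeerConnection.getStats()`, sampled on the 3-second interval.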
## Error Handling
### Common Errors
1. **Media Access Denied**
   ```
   Failed to access camera/microphone: NotAllowedError
   ```
   - User denied browser permission
   - Solution: Request permission again, show help dialog
2. **ICE Connection Failed**
   ```
   Connection failed with user: ICE connection failed
   ```
   - NAT/firewall blocking connection
   - Solution: Ensure TURN server is configured and reachable
3. **Peer Connection Closed**
   ```
   Connection closed with user: Connection lost
   ```
   - Network interruption or user disconnected
   - Solution: Notify user, attempt reconnection
4. **TURN Credentials Expired**
   ```
   TURN credentials expired
   ```
   - Credentials have a 24-hour TTL
   - Solution: Fetch new credentials automatically
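
For the first error, `getUserMedia` rejects with a `DOMException` whose `name` identifies the cause. The sketch below maps the most common names to a user-facing hint. The helper name and message wording are illustrative; only the exception names come from the browser API.

```javascript
// Map a getUserMedia rejection to a short explanation and suggested action.
function describeMediaError(error) {
  switch (error && error.name) {
    case 'NotAllowedError':
      return { reason: 'Permission denied', action: 'Request permission again and show a help dialog' };
    case 'NotFoundError':
      return { reason: 'No camera or microphone found', action: 'Ask the user to connect a device' };
    case 'NotReadableError':
      return { reason: 'Device is busy or unavailable', action: 'Ask the user to close other apps using it' };
    case 'OverconstrainedError':
      return { reason: 'Requested constraints cannot be met', action: 'Retry with a lower resolution' };
    default:
      return { reason: 'Unknown media error', action: 'Check the browser console for details' };
  }
}
```

A caller would use it in the `catch` branch around `navigator.mediaDevices.getUserMedia(...)`.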
## Security Considerations
1. **TURN Authentication**: Time-limited credentials using HMAC-SHA1
2. **DTLS**: WebRTC encrypts all media streams with DTLS-SRTP
3. **JWT Auth**: All API calls require valid JWT token
4. **Rate Limiting**: Protect against DoS attacks
5. **User Verification**: Verify users are in conversation before allowing calls
## Testing Checklist
- [ ] 1-on-1 audio call works
- [ ] 1-on-1 video call works
- [ ] Mute/unmute audio works
- [ ] Toggle video on/off works
- [ ] Screen sharing works
- [ ] Call can be answered
- [ ] Call can be rejected
- [ ] Call can be ended
- [ ] Connection quality indicator updates
- [ ] Call duration displays correctly
- [ ] Multiple participants can join (group call)
- [ ] Participant joins/leaves notifications work
- [ ] Media state changes propagate
- [ ] TURN server fallback works (test behind NAT)
- [ ] Call persists after page refresh (reconnection)
- [ ] Missed call notifications work
- [ ] Call history is recorded
## Performance Optimization
### Bandwidth Usage
**Audio Only (per participant)**:
- Opus @ 32 kbps: ~15 MB/hour
- Opus @ 64 kbps: ~30 MB/hour
**Video + Audio (per participant)**:
- 480p @ 500 kbps: ~225 MB/hour
- 720p @ 1 Mbps: ~450 MB/hour
- 1080p @ 2 Mbps: ~900 MB/hour
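
These figures follow from converting a bitrate into bytes per hour (bits per second × 3600 ÷ 8). The helper below reproduces them; its name is illustrative, and it ignores packet overhead, which is why the audio figures above are rounded up slightly.

```javascript
// Convert a media bitrate in kbps to megabytes transferred per hour.
function megabytesPerHour(kbps) {
  const bitsPerHour = kbps * 1000 * 3600;
  return bitsPerHour / 8 / 1e6;
}

console.log(megabytesPerHour(500));  // 225
console.log(megabytesPerHour(1000)); // 450
console.log(megabytesPerHour(2000)); // 900
```

By the same formula, Opus at 32 kbps comes to 14.4 MB/hour and at 64 kbps to 28.8 MB/hour.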
### Recommendations
1. **Start with audio only** for low bandwidth users
2. **Use VP9** if supported (better compression than VP8)
3. **Enable simulcast** for group calls (SFU)
4. **Adaptive bitrate** based on network conditions
5. **Limit group calls** to 20 participants max
## Future Enhancements
- [ ] **Mediasoup SFU**: Implement actual SFU for efficient group calls
- [ ] **Call Recording**: Record and store calls in cloud storage
- [ ] **Background Blur**: Virtual backgrounds using ML
- [ ] **Noise Cancellation**: Advanced audio processing
- [ ] **Grid/Speaker View**: Different layouts for group calls
- [ ] **Reactions**: Emoji reactions during calls
- [ ] **Hand Raise**: Signal to speak in large calls
- [ ] **Breakout Rooms**: Split large calls into smaller groups
- [ ] **Call Scheduling**: Schedule calls in advance
- [ ] **Call Analytics**: Detailed quality metrics and reports
## Troubleshooting
### No Audio/Video
1. Check browser permissions
2. Verify camera/microphone is not used by another app
3. Test with `navigator.mediaDevices.enumerateDevices()`
4. Check browser console for errors
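
For step 3, the array returned by `enumerateDevices()` can be reduced to a quick summary. The helper name and the shape of its result are illustrative.

```javascript
// Summarize the MediaDeviceInfo list returned by enumerateDevices().
function summarizeDevices(devices) {
  const counts = { audioinput: 0, videoinput: 0, audiooutput: 0 };
  for (const device of devices) {
    if (device.kind in counts) {
      counts[device.kind] += 1;
    }
  }
  return {
    ...counts,
    hasMicrophone: counts.audioinput > 0,
    hasCamera: counts.videoinput > 0,
    // Browsers leave labels empty until the user has granted media permission.
    labelsVisible: devices.some((device) => device.label !== ''),
  };
}
```

In the browser console, run `summarizeDevices(await navigator.mediaDevices.enumerateDevices())`. A result with `hasCamera: false` points to a missing device, while `labelsVisible: false` suggests permission has not been granted yet.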
### Connection Fails
1. Test TURN server with Trickle ICE
2. Verify firewall allows UDP ports 49152-65535
3. Check TURN credentials are not expired
4. Ensure both users are online
### Poor Quality
1. Check network bandwidth
2. Monitor packet loss and RTT
3. Reduce video resolution
4. Switch to audio-only mode
### Echo/Feedback
1. Ensure `echoCancellation: true` in audio constraints
2. Use headphones instead of speakers
3. Reduce microphone gain
4. Check for multiple audio sources
## Support
For issues or questions:
- Check logs in browser console
- Review `/var/log/turnserver.log` for TURN issues
- Monitor backend logs for signaling errors
- Test with multiple browsers (Chrome, Firefox, Safari)
---
**Phase 4 Complete**