The most challenging aspect was implementing reliable face detection in the browser. This required three complete attempts before finding a working solution.
1st try - MediaPipe Face Mesh
Why it failed: Google's MediaPipe model promised 468 facial landmarks with high accuracy. However, it relies on WebAssembly (WASM) binaries, and those failed to initialize.
```
// Error encountered:
Uncaught TypeError: can't access property "buffer", HEAP8 is undefined
```

Root causes:
- The WASM module failed to initialize
- SIMD compatibility varies across browsers
- Firefox had known bugs with MediaPipe's WASM build
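In hindsight, a feature check before loading the WASM bundle would have surfaced this early. The sketch below validates the tiny SIMD test module popularized by the wasm-feature-detect library; `supportsWasmSimd` is my own helper name, not part of MediaPipe:

```javascript
// Validate a minimal WASM module containing a single SIMD instruction.
// If the engine rejects it, a SIMD-only WASM build will not initialize.
function supportsWasmSimd() {
  return WebAssembly.validate(new Uint8Array([
    0, 97, 115, 109, 1, 0, 0, 0,  // "\0asm" magic + version 1
    1, 5, 1, 96, 0, 1, 123,       // type section: () -> v128
    3, 2, 1, 0,                   // function section
    10, 10, 1, 8, 0,              // code section header
    65, 0, 253, 15, 253, 98, 11   // i32.const 0; i8x16.splat; i8x16.popcnt; end
  ]));
}

console.log(supportsWasmSimd() ? 'SIMD build OK' : 'fall back to a non-SIMD build');
```

Gating the loader on a check like this lets you pick a non-SIMD fallback bundle instead of crashing with an opaque `HEAP8` error.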
2nd try - TensorFlow.js Face Landmarks
I pivoted to TensorFlow.js, which uses pure JavaScript/WebGL instead of WASM. The model loaded successfully and the camera feed worked, but detection consistently returned 0 faces.
```js
const detectorConfig = {
  runtime: 'tfjs',
  maxFaces: 1,
  refineLandmarks: false,
  detectionConfidence: 0.1 // Lowered to the minimum.
};

const faces = await detector.estimateFaces(video);
console.log(faces.length); // Output: 0 (always)
```

3rd try - face-api.js
Finally, I found success with the vladmandic fork of face-api.js. It's a more stable, better-maintained version with a simpler API.
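In hindsight, one likely culprit for a zero-detection result with any of these libraries is calling the detector before the video element has decoded its first frame: detection on a 0×0 frame finds nothing. A minimal guard, where `waitForVideoReady` is my own naming rather than part of either library:

```javascript
// Resolve once the video has at least one decoded frame to run detection on.
// readyState >= 2 (HAVE_CURRENT_DATA) means a current frame is available.
function waitForVideoReady(video) {
  return new Promise((resolve) => {
    if (video.readyState >= 2 && video.videoWidth > 0) {
      resolve(video);
      return;
    }
    video.addEventListener('loadeddata', () => resolve(video), { once: true });
  });
}

// Usage: await waitForVideoReady(video) before the first detection call.
```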
```js
await faceapi.nets.tinyFaceDetector.loadFromUri(MODEL_URL);
await faceapi.nets.faceLandmark68Net.loadFromUri(MODEL_URL);

const detection = await faceapi
  .detectSingleFace(video, new faceapi.TinyFaceDetectorOptions())
  .withFaceLandmarks();

if (detection) {
  const landmarks = detection.landmarks.positions;
  const nose = landmarks[30];     // Nose tip (68-point model)
  const leftEye = landmarks[36];  // Left eye outer corner
  const rightEye = landmarks[45]; // Right eye outer corner

  // Calculate gaze orientation: how far the nose sits from the midpoint
  // of the eyes, normalized by the inter-eye distance for scale invariance.
  const eyeDistance = rightEye.x - leftEye.x;
  const eyeCenterX = (leftEye.x + rightEye.x) / 2;
  const offsetX = nose.x - eyeCenterX;
  const normalizedX = (offsetX / eyeDistance) * 100;
  const isFacing = Math.abs(normalizedX) < 20;
}
```
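The orientation math lends itself to a pure function that can be unit-tested with synthetic points. `isFacingCamera` and its threshold parameter are my own sketch around the logic above, where each landmark is an `{x, y}` object as face-api.js returns:

```javascript
// Returns true when the nose sits near the horizontal midpoint of the eyes,
// i.e. the face is turned toward the camera rather than to either side.
function isFacingCamera(nose, leftEye, rightEye, thresholdPct = 20) {
  const eyeDistance = rightEye.x - leftEye.x;      // Inter-eye span, for scale invariance
  const eyeCenterX = (leftEye.x + rightEye.x) / 2; // Horizontal midpoint of the eyes
  const normalizedX = ((nose.x - eyeCenterX) / eyeDistance) * 100;
  return Math.abs(normalizedX) < thresholdPct;
}

// A centered nose passes; a nose shifted far toward one eye fails.
console.log(isFacingCamera({ x: 100 }, { x: 70 }, { x: 130 })); // true
console.log(isFacingCamera({ x: 120 }, { x: 70 }, { x: 130 })); // false
```

Factoring it out this way also makes the 20% threshold a tunable parameter rather than a magic number buried in the detection loop.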