Facebook Face Recognition

This page describes some of my earlier work, documented in my FB2008 paper, in which we analyze standard face recognition techniques as applied to automatically tagging friends on Facebook:

B. C. Becker, E. G. Ortiz. “Evaluation of Face Recognition Techniques for Application to Facebook”. IEEE International Conference on Automatic Face and Gesture Recognition 2008.

For some of my more recent work, including a Facebook dataset and a new, fast sparse algorithm, see my web-scale face recognition page.


My friend Enrique and I have been applying face recognition to social networks. In particular, we look at how well various traditional and recently proposed recognition algorithms can automate the tagging process on Facebook. Much of the research in the field of face recognition tests on standard databases that try to mimic the real world with changes in illumination, pose, etc. Our motto is “why try to simulate the real world when you can go collect data from the real world?” So we went directly to the source and gathered real “in-the-wild” photos of an assortment of people from Facebook. Our results can be read in our Face & Gesture Recognition 2008 paper referenced above.


As part of our Facebook face recognition project and because we are big believers in sharing research software, we are making available the code used for the paper, including the C# Facebook Photo Downloader, the C++ OpenCV Face Extractor, and the MATLAB Face Recognition Evaluator. If you find our code useful, please cite our paper. If you have problems/questions/bug reports/suggestions, please do contact me.

MATLAB Face Recognition Evaluator

Enrique and I developed a MATLAB-based evaluation of face recognition algorithms as a result of trying to find the best algorithms for the data we got from Facebook. In short, it takes in a bunch of datasets and algorithms and spits out accuracies and other statistics comparing the algorithms. It’s all in MATLAB, which is the de facto researcher’s tool, but if you prefer C++ you can try the CSU Evaluation of Face Recognition Algorithms instead. Our package handles fairly large datasets with up to 15,000 face images. For larger datasets you probably need 64-bit MATLAB and a lot of memory.

Download ZIP package containing popular face recognition algorithms and two datasets.


This toolbox for MATLAB 2007a and higher lets you test and experiment with many different face recognition algorithms. It is designed to be a learning tool for introducing well-known algorithms as well as a springboard for testing your own face recognition algorithms against the most common benchmarks used today. You can easily add your own algorithms and/or datasets. Statistics for each run are automatically generated for concise quantitative comparisons of each algorithm.

There isn’t a lot of documentation (aside from comments in the MATLAB code) so feel free to ask us questions. If you have comments/bug reports, please do contact us at our respective websites. We are always interested in how you are using this package, so drop us an email.


  • Automatically run multiple algorithms over multiple datasets
  • Highly configurable, many options can be changed in a single file
  • Easy to plug in new datasets/algorithms
  • Graphs and tables generated automatically summarizing run statistics
  • Reported statistics include accuracy (%), computational time (sec), memory consumed (MB), and model size (MB)
  • Tracks RAM memory consumed during training/testing
  • HTML table outputs with nearest matches

Major Algorithms Included

  • PCA (Principal Component Analysis or Eigenfaces)
  • IPCA (Incremental PCA for online learning)
  • LDA (Linear Discriminant Analysis or Fisherfaces)
  • ILDA (Incremental LDA for online learning)
  • SVM (Support Vector Machines)
  • ISVM (Incremental SVMs for online learning)
  • ICA (Independent Component Analysis)

Comes with the AT&T Database of Faces and the Indian Face Database so you can start evaluating right away!


Many different resources were used; see credits.txt for more details.


Everything is run using fbRun.m. Options are stored in fbInit.m so you can add algorithms/datasets and change various parameters. Individual algorithm parameters are listed in their respective *.m files under the “methods” folder.


>> fbRun
Running Face Recognition Evaluator (c) Brian C. Becker & Enrique G. Ortiz
More info at www.BrianCBecker.com and www.EnriqueGOrtiz.com
Running Algorithms: pca, lda
Using Datasets: att, ifd

Separating train/test (60/40%) set images in ../../../datasets/att/_csu7, run 1…
Training algorithm “pca” on dataset “att”…
Testing algorithm “pca” on dataset “att”…
91.25% accuracy from “pca” on dataset “att” in 0.0 min

Training algorithm “lda” on dataset “att”…
Testing algorithm “lda” on dataset “att”…
94.38% accuracy from “lda” on dataset “att” in 0.0 min

Separating train/test (60/40%) set images in ../../../datasets/ifd/_csu7, run 1…
Training algorithm “pca” on dataset “ifd”…
Testing algorithm “pca” on dataset “ifd”…
74.17% accuracy from “pca” on dataset “ifd” in 0.0 min

Training algorithm “lda” on dataset “ifd”…
Testing algorithm “lda” on dataset “ifd”…
86.25% accuracy from “lda” on dataset “ifd” in 0.0 min

Algorithm runs done, generating report…
Report generated, check stats/results.csv for details
Face Recognition Evaluator done in 14.2 sec

Dataset Format

The dataset format is pretty easy. Just dump all the grayscale face pictures in a single folder in the format <Person ID>-<Face ID>.jpg (PPM and PNG files are also supported). Thus 001-01.jpg would be the first face for the first person. The IDs for people and faces don’t have to be in order (so 3783-97.jpg might correspond to a 4-digit PID and a year in the case of, say, a school yearbook). There is no need to divide your data into a train/test set as the system will take care of that for you automatically (you can control parameters such as the percentage to train/test on).
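To make the naming convention concrete, here is a minimal Python sketch (not part of the MATLAB package) that parses such file names and splits each person's faces into train and test lists. The regular expression, the function names, and the per-person 60/40 split are our assumptions for illustration; the toolbox itself may divide the data differently.

```python
import random
import re
from collections import defaultdict

# Assumed pattern for "<Person ID>-<Face ID>.<ext>" with numeric IDs.
FACE_FILE = re.compile(r"^(\d+)-(\d+)\.(jpg|ppm|png)$", re.IGNORECASE)


def parse_face_filename(name):
    """Return (person_id, face_id) as strings, or None if the name does not match."""
    match = FACE_FILE.match(name)
    if match is None:
        return None
    return match.group(1), match.group(2)


def split_train_test(filenames, train_fraction=0.6, seed=0):
    """Split every person's faces into train/test lists (hypothetical helper)."""
    by_person = defaultdict(list)
    for name in filenames:
        parsed = parse_face_filename(name)
        if parsed is not None:
            by_person[parsed[0]].append(name)

    rng = random.Random(seed)  # fixed seed so a run can be repeated
    train, test = [], []
    for person in sorted(by_person):
        faces = sorted(by_person[person])
        rng.shuffle(faces)
        cut = round(len(faces) * train_fraction)
        train.extend(faces[:cut])
        test.extend(faces[cut:])
    return train, test
```

Files that do not follow the convention are simply ignored by this sketch, which keeps stray files in the folder from breaking the split.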

Currently two datasets are included: the AT&T Database of Faces and the Indian Face Database. Both are unmodified except for conversion to JPEG, resizing, and renaming of the files. If you use your own database we recommend you use the preprocessing normalization techniques outlined at http://www.cs.colostate.edu/evalfacerec/.

After each run, a “stats” folder is generated with a lot of statistics about all the algorithms run. The main file generated is stats\report.csv, a table listing the accuracy, training/testing times, training/testing memory consumed, and the storage size of the algorithm models. Plots are also generated and placed in the stats\plots folder.

One of the most frustrating things is running one set of algorithms for a day and then forgetting to save the results before tweaking the parameters and starting a new run. We got tired of doing this and so created a “backup” folder. At the beginning of each run, the “stats” folder is moved into the “backup” folder with the current timestamp. This helps prevent data loss.
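The backup idea is simple enough to sketch in a few lines of Python. This is only an illustration of the behaviour described above, not the MATLAB code from the package; the function name, the "stats_<timestamp>" folder name, and the timestamp format are assumptions.

```python
import shutil
import time
from pathlib import Path


def backup_stats(root, stamp=None):
    """Move <root>/stats to <root>/backup/stats_<stamp>; return the new path or None."""
    root = Path(root)
    stats = root / "stats"
    if not stats.is_dir():
        return None  # nothing to back up, e.g. on the very first run

    if stamp is None:
        stamp = time.strftime("%Y%m%d-%H%M%S")  # assumed timestamp format

    backup_dir = root / "backup"
    backup_dir.mkdir(exist_ok=True)
    target = backup_dir / ("stats_" + stamp)
    shutil.move(str(stats), str(target))
    return target
```

Moving the folder rather than copying it means every new run starts with an empty “stats” folder, so old and new results cannot get mixed up.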

Algorithm Format

Algorithms are stored in the “methods” folder. Any libraries required by your algorithm can be stored in a folder inside the “methods” folder and they will be automatically added to the path. All algorithms must have a “train_algo.m” and “test_algo.m” where “algo” is the name of your algorithm. There are a number of variables available (fbgTrainImgs, fbgTrainIds, fbgTestImgs, fbgTestIds, fbgAvgFace, etc.) that you can use as the training and test inputs to your algorithm. This is convenient because you don’t have to worry about dividing up your data into test and training sets; it is handled automatically for you. At the end of “test_algo.m” you must set a variable fbgAccuracy (0-100), which the system uses to log accuracies. To run your algorithm, just add it to the fbgAlgorithms cell array in fbInit.m.

For a good example of how to design an algorithm, look at “train_pca.m” and “test_pca.m” in the “methods” folder, as they have a lot of good comments outlining what is being done and why.

C++ OpenCV Face Extractor

To do face recognition, you need to be able to extract the face from the image using face detection matched to the tags locating the people in the image. We have developed a little C++ program to do this in batch for the Facebook data. It is truly research quality code, so beware of giant hacks all over the place.

Download Microsoft Visual C++ 2008 Project and Source Code.

We do this using an OpenCV C++ program that uses the Viola-Jones detector to identify all the faces and eyes in the image. We match two eyes to each detected face not only to reduce the false positive rate, but also to give us a metric for very basic pose normalization. Detected faces and eyes are ranked based on some face geometry to identify reasonably well-defined frontal faces. Once a face is found and matched with two eyes, the locations of people tagged in the image are scanned for potential identities. In order for a match to occur between a tag and a face, that tag must be the only tag lying inside the face area. This means that sloppy tags and tags on the torso instead of the face will not be matched, yet this somewhat cumbersome constraint avoids a number of troubling scenarios such as group photos where it can be very easy to mis-identify the face and tag.
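The tag-matching rule can be sketched as follows. This is a Python illustration of the constraint as described above, not the C++ code from the download; the rectangle layout (left, top, width, height), pixel coordinates, and function names are assumptions.

```python
def tag_in_face(tag, face):
    """tag is (x, y); face is (left, top, width, height), all in pixels."""
    x, y = tag
    left, top, width, height = face
    return left <= x < left + width and top <= y < top + height


def match_tags_to_faces(faces, tags):
    """Return {face_index: tag_name} for faces that contain exactly one tag.

    faces: list of face rectangles; tags: dict mapping a name to an (x, y) point.
    """
    matches = {}
    for index, face in enumerate(faces):
        inside = [name for name, pos in tags.items() if tag_in_face(pos, face)]
        if len(inside) == 1:  # zero tags or several tags: leave the face unmatched
            matches[index] = inside[0]
    return matches
```

A face containing two tags is dropped rather than guessed at, which is exactly the trade-off mentioned above: some usable faces are lost, but ambiguous group-photo matches are avoided.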

Once the face is identified, it is cropped from the image and rotated such that the detected eyes are always horizontal with a constant separation. This of course is not entirely true since rotations into the plane will stretch the width of the area around the head, but it is a good approximation. Each face is then saved to a file whose name encodes the identity of the person and the face number: <<User ID>>-<<Instance>>.jpg. This allows for easy parsing when loading the face data into a face recognition algorithm setup.
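The eye-based normalization boils down to one rotation and one scale factor. Below is a small Python sketch of that geometry under stated assumptions: the 40-pixel target eye separation and the function names are made up for illustration, and the actual C++ program may compute the transform differently.

```python
import math


def eye_alignment(left_eye, right_eye, target_separation=40.0):
    """Return (angle_deg, scale) that makes the eye line horizontal and a fixed length."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    distance = math.hypot(dx, dy)
    if distance == 0:
        raise ValueError("eye positions coincide")
    angle_deg = math.degrees(math.atan2(dy, dx))  # tilt of the eye line
    return angle_deg, target_separation / distance


def align_point(point, left_eye, angle_deg, scale):
    """Map a point into the normalized frame whose origin is the left eye."""
    theta = math.radians(-angle_deg)  # rotate back by the tilt
    px = point[0] - left_eye[0]
    py = point[1] - left_eye[1]
    x = scale * (px * math.cos(theta) - py * math.sin(theta))
    y = scale * (px * math.sin(theta) + py * math.cos(theta))
    return x, y
```

After this transform the left eye sits at the origin and the right eye at (target_separation, 0), so every cropped face shares the same eye positions regardless of in-plane head tilt.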

Note: This program is now incompatible with the output of our new Facebook Downloader program and should be used as a guide only.

C# Facebook Photo Downloader

Face recognition uses a lot of standard datasets, but we wanted our faces to come directly from Facebook. To gather the data, we created a small C# tool that automatically downloads photos of your friends from Facebook.

Download Visual Studio C# 2008 Project and Source Code (new version as of 07.21.2010). Look at the top of fbdown.cs for a short guide to using the program.

How to Use

First you will need to sign up for your own Facebook app. You will be issued an Application Key and Secret Key (write these down). You will find a “facebook api key.txt” file. Put your Application Key on the second line and your Secret Key on the third line. Make sure the Facebook user you want to download data for has either added your application or is listed as a developer (you can remove them later). Run the program and log into Facebook. Select a directory to save the files and click Stage 1. This downloads the metadata. When the metadata is finished, you can start Stage 2, which downloads all the photos from the URLs contained in the metadata.

Important note: This program downloads about one photo per second, so for Facebook users with a large number of friends and/or pictures, this process can take a while. Occasionally something will go wrong, but there is error checking in place to attempt recovery from timeouts and other such errors. Sometimes you will need to restart the program because it froze. Don’t worry, it will automatically skip already processed files.
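The skip-and-retry behaviour can be sketched like this in Python. It is an illustration only, not the C# code in fbdown.cs: the function name, the injected fetch callable, the retry count, and the one-second pause are all assumptions.

```python
import time
from pathlib import Path


def download_missing(jobs, fetch, retries=3, pause=1.0):
    """Download each (url, destination) pair unless the destination already exists.

    fetch(url) must return bytes and may raise OSError (which includes timeouts).
    Returns three lists of URLs: downloaded, skipped, failed.
    """
    downloaded, skipped, failed = [], [], []
    for url, dest in jobs:
        dest = Path(dest)
        if dest.exists():  # already processed in an earlier run
            skipped.append(url)
            continue
        for _attempt in range(retries):
            try:
                data = fetch(url)
            except OSError:
                time.sleep(pause)  # wait a little, then try again
                continue
            dest.write_bytes(data)
            downloaded.append(url)
            break
        else:
            failed.append(url)  # every attempt raised an error
        time.sleep(pause)  # throttle to roughly one request per pause
    return downloaded, skipped, failed
```

Because existing files are skipped before any network call is made, restarting after a freeze only costs the time needed to walk the list again.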

Once a Facebook user has logged into the application, a list of friends for that user is collected and a folder is created for each one in the target directory in the form “<<Name>> (<<Facebook ID>>).” All the images associated with each user are then gathered and downloaded into a global “_photos” directory. By default the images are saved as JPEG with 85% compression. By modifying the source you can save out PNG files if you prefer. Metadata such as photo tags are associated with each image and are saved as text files in each person’s directory. The X and Y coordinates of tag locations are specified in percentages of the image, I believe, so you may need to do some conversions later.
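If the tag coordinates are indeed percentages of the image size, the conversion to pixels might look like the Python sketch below. The 0-100 range, the function name, and the clamping to the image bounds are assumptions; check a few tags against your own images before relying on it.

```python
def tag_percent_to_pixels(x_pct, y_pct, width, height):
    """Convert a tag given in percent of image size to integer pixel coordinates."""
    x = int(x_pct / 100.0 * width)
    y = int(y_pct / 100.0 * height)
    # Clamp so that a tag at exactly 100% still lands inside the image.
    x = min(max(x, 0), width - 1)
    y = min(max(y, 0), height - 1)
    return x, y
```

For a 640x480 photo, a tag at (50, 25) percent maps to pixel (320, 120).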


This application is not associated with Facebook in any way. By downloading this application, you agree to use this data responsibly only for research and academic purposes. You agree not to re-distribute or publish any photos without the express permission of the owners of the image.

I think this is a nice resource for the face recognition community and I would hate to have this application taken down by Facebook because it is abused. Also, we are not liable for any damages caused or black holes or yada yada yada.

Ok, enough legalese, have fun with the data!