Ethan Chiu My personal blog

Scraping Facebook with Javascript to Get a List of Your Friends

This is the first article in a new series called “Fighting against Fake News”. For the next few weeks, I’ll be writing about the technical challenges of this digital literacy research project, as well as my own thoughts on misinformation. Hope you enjoy!

To preface this, I’d like to briefly describe what this digital literacy project is about. I’m working with the Dav-lab group at Wellesley College to address the proliferation of fake news on social media platforms like Twitter and Facebook by helping people build digital literacy skills.

We thought one way to help people develop these skills is a Google Chrome extension that gamifies the user’s Facebook news feed by letting the user guess which Facebook friend shared which type of news content in their feed:

Screenshot of the Open Answer Game Format of the Extension

Initially, I programmed the extension to parse the user’s Facebook news feed and mark up every post that contained an article. Recently, I realized that this parser was of little use because it modified too many posts; it should only modify posts shared by the user’s friends. So I needed a way to get a list of the user’s Facebook friends in JavaScript, so that I could compare the posts against that list and modify only the posts shared by friends.
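The comparison itself can be small. Here is a minimal sketch, assuming each parsed post is an object with a hypothetical `author` field holding the poster's display name (the extension's real post structure is not shown in this article, so treat the shape as illustrative):

```javascript
// Hypothetical helper: keep only the posts whose author is in the friends list.
// The `author` field is an assumed shape, not the extension's actual data model.
function postsSharedByFriends(posts, friends) {
  // A Set gives constant-time lookups, which helps with long friend lists.
  const friendNames = new Set(friends.map((name) => name.trim()));
  return posts.filter((post) => friendNames.has(post.author.trim()));
}

// Made-up sample data for illustration only.
const samplePosts = [
  { author: "Ada Lovelace", link: "https://example.com/a" },
  { author: "Some News Page", link: "https://example.com/b" },
];
const matched = postsSharedByFriends(samplePosts, ["Ada Lovelace", "Alan Turing"]);
console.log(matched.length); // 1: only the post shared by "Ada Lovelace"
```

Matching on display names is fragile when two friends share a name, so a real implementation would probably prefer a profile link or ID if one is available.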

In a previous project, I ran into a similar issue: there was no documentation for getting a list of the current user’s Facebook friends using Facebook’s Graph API (as of version 2.0, the /me/friends node only returns friends who also use the same app). Back then, I created a simple workaround:

window.fbAsyncInit = function() {
    FB.init({
      appId      : 'YOUR_APP_ID',
      xfbml      : true,
      version    : 'v2.3'
    });
    FB.AppEvents.logPageView();
    $(document).ready(function() {
        FB.login(function(response) {
            // Only query the API once the user has logged in and authorized the app.
            if (response.status === 'connected') {
                // Workaround: ask for the taggable friends instead of /me/friends.
                FB.api('/me/taggable_friends?limit=5000', function(response) {
                    console.log(response);
                });
            }
        });
    });
};
// Load the Facebook SDK asynchronously.
(function(d, s, id){
     var js, fjs = d.getElementsByTagName(s)[0];
     if (d.getElementById(id)) {return;}
     js = d.createElement(s); js.id = id;
     js.src = "//connect.facebook.net/en_US/sdk.js";
     fjs.parentNode.insertBefore(js, fjs);
   }(document, 'script', 'facebook-jssdk'));

Unfortunately, I couldn’t use this implementation because of these requirements:

  1. I didn’t want to store any data, meaning no server-side requests.
  2. I had to be able to use this method in a Chrome extension.

So I couldn’t use the Facebook API: it stores some of the user’s data server side, and it can’t be tested while developing the Chrome extension.

Ultimately, I decided to write a scraping function in JavaScript. To create an effective scraper, I inspected Facebook pages that might contain a clear list of friends that could be easily parsed. Unfortunately, no single page held the full list of friends.

Luckily, I found a mobile basic version of Facebook’s friend list, https://mbasic.facebook.com/friends/center/friends/?ppk=1 , that had a clear URL pattern for getting to each page of the user’s friends list. For example, https://mbasic.facebook.com/friends/center/friends/?ppk=1 is page 1 of the user’s friends list, https://mbasic.facebook.com/friends/center/friends/?ppk=2 is page 2, and so on.
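To make the pattern concrete, here is a small sketch of a URL builder. The `ppk` query parameter name is taken from the scraper code later in this post; the helper name `friendsPageUrl` is just illustrative, and how Facebook numbers the pages is an assumption.

```javascript
// Base URL of the mobile basic friends list, as given in this article.
const FRIENDS_BASE = "https://mbasic.facebook.com/friends/center/friends/";

// Build the URL for one page of the list.
// Assumption: the page index goes in the `ppk` query parameter.
function friendsPageUrl(page) {
  const url = new URL(FRIENDS_BASE);
  url.searchParams.set("ppk", String(page));
  return url.toString();
}

console.log(friendsPageUrl(2));
// https://mbasic.facebook.com/friends/center/friends/?ppk=2
```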

Here is the code I eventually came up with:

var activeTab;

var lastRequestTime = 0;
var requestInterval = 50;

var timeoutHistory = [];
var xhrHistory = [];

function get(url, done) {
	var xhr = new XMLHttpRequest();
	xhrHistory.push(xhr);
	xhr.open('GET', url, true);
	xhr.onreadystatechange = function (e) {
		if (xhr.readyState == 4) {
			done(xhr.responseText);
		}
	}
	var delay = Math.max(lastRequestTime + requestInterval - (+new Date()), 0) + Math.random() * requestInterval;
	lastRequestTime = delay + (+new Date());
	timeoutHistory.push(setTimeout(function () {
		xhr.send();
	}, delay));
}

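// The requests are asynchronous, so getFriends() returns a promise that resolves
// with the list of names once every page request has succeeded.
// Usage: getFriends().then(function (friends) { console.log(friends); });
function getFriends() {
	var promises = [];
	var friends = [];
	for (var page = 0; page <= 500; page++) {
		var request = $.ajax({
			url: "https://mbasic.facebook.com/friends/center/friends/?ppk=" + page,
			dataType: 'text',
			success: function (data) {
				// .find() always returns a jQuery object, so check .length rather than truthiness.
				if ($(data).find(".v.bk").length === 0) {
					return;
				}
				var elements = $(data).find(".bj").children();
				for (var i = 0; i < elements.length; i++) {
					var name = elements[i].firstChild.innerText;
					// Keep only the text before the first digit, if there is one.
					var digitAt = name.search(/\d/);
					if (digitAt !== -1) {
						name = name.slice(0, digitAt);
					}
					// This text belongs to the page footer, not to a friend entry.
					if (name.includes("Your PagesHelpSettings")) {
						return;
					}
					friends.push(name);
				}
			}
		});
		promises.push(request);
	}
	// Resolve with the collected names after all requests complete.
	return $.when.apply($, promises).then(function () {
		return friends;
	});
}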
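The `get` helper at the top of the listing spaces requests out and adds random jitter, although `getFriends` calls `$.ajax` directly and does not go through it. As a rough sketch of that delay arithmetic, here is the same formula as a pure function, with the clock and the random value passed in so the result is reproducible. The function and parameter names are illustrative, not part of the extension.

```javascript
// Same arithmetic as the `get` helper, with `now` and `random` injected
// instead of read from Date and Math.random(). Names are illustrative.
function nextDelay(lastRequestTime, requestInterval, now, random) {
  // Wait until at least `requestInterval` ms after the previously scheduled send...
  const untilSlot = Math.max(lastRequestTime + requestInterval - now, 0);
  // ...plus a jitter of up to one more interval.
  return untilSlot + random * requestInterval;
}

// Schedule three requests issued at the same instant (now = 1000 ms)
// with a 50 ms interval and a fixed "random" value of 0.5.
let last = 0;
const sendTimes = [];
for (let i = 0; i < 3; i++) {
  const delay = nextDelay(last, 50, 1000, 0.5);
  last = 1000 + delay; // mirrors `lastRequestTime = delay + (+new Date())`
  sendTimes.push(last);
}
console.log(sendTimes); // [ 1025, 1100, 1175 ]
```

Each send lands at least one interval after the previous one, so a burst of calls turns into a steady trickle of requests.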

The scraper does the following:

  1. Visits the first friend page of the user, using an AJAX call to access the mobile basic version of Facebook.
  2. Parses the page based on class names, gathers each name from it, and pushes it to the ‘friends’ array.
  3. Repeats steps 1 and 2 for every page index from 0 to 500. Pages with no friends on them add nothing to the array.
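If you would rather stop as soon as a page comes back empty instead of requesting a fixed number of pages, the steps above could be sketched sequentially like this. This is a variant, not the code from the extension: `loadNames` is a hypothetical callback that fetches one page and returns a promise for the raw name strings on it, and the sample data is invented so the sketch runs without touching Facebook.

```javascript
// Keep only the text before the first digit (the same cut the scraper makes),
// then trim surrounding whitespace.
function cleanName(rawText) {
  const digitAt = rawText.search(/\d/);
  return (digitAt === -1 ? rawText : rawText.slice(0, digitAt)).trim();
}

// Walk the pages in order and stop at the first page that yields no names.
// `loadNames(page)` is a hypothetical callback supplied by the caller.
async function collectFriends(loadNames, maxPages = 500) {
  const friends = [];
  for (let page = 0; page <= maxPages; page++) {
    const rawNames = await loadNames(page);
    if (rawNames.length === 0) break; // no friends on this page: done
    friends.push(...rawNames.map(cleanName));
  }
  return friends;
}

// Invented stand-in data instead of real Facebook pages.
const fakePages = [
  ["Ada Lovelace3 mutual friends", "Alan Turing"],
  ["Grace Hopper12 mutual friends"],
  [],
];
const requested = [];
const result = collectFriends(async (page) => {
  requested.push(page);
  return fakePages[page] || [];
});
result.then((names) => console.log(names, requested));
// [ 'Ada Lovelace', 'Alan Turing', 'Grace Hopper' ] [ 0, 1, 2 ]
```

The trade-off is speed: pages are fetched one at a time, but no requests are made past the end of the list.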

That’s it! :)