🌱 Developer Meetup n.2; Report

Highlight: CURRENT VERSION ISN'T WORKING; after all, challenging Facebook has some complexities. Whatever: ✊

In this report we talk about: fixes, errors, fundraising. The meeting was about defining what should happen between February 15th and March 15th, the date of our next monthly online meeting.

What to do in the next month

  1. "keep promoting and perhaps fundraise", the same goal as the previous meeting. We need to keep going and make this community grow, or, well, exist.

  2. Facebook changed its HTML structure. This was expected and we know how to handle it, but read more below on how to reduce the impact of their obfuscation.

  3. Complete a gathio installation, and eventually support cross-posting there. It should be easy to add a new event to its MongoDB.

  4. After inspecting some of the errors triggered in the last month, we need to report errors better. For example, if someone provides a wrong login or password, it should be considered a different error from “the server didn’t work properly”. Reminder: it was handy to check the Mobilizon version via the API.

Fundraising narratives

Check out these (read-only) CryptPad notes; they describe some of the narratives usable in fundraising.

Note🍀 this is our biggest weak spot. There is no concrete effort yet, and foundations often support the creation of new tools rather than adversarial approaches. It makes sense, but it is an additional vulnerability this project faces.

On Facebook HTML obfuscation

HTML randomization is one of the ways Facebook makes every scraping action more expensive. An update may cost them nothing, while for scrapers, if the technology is to be maintained, it means having to ship an update every time.

Above you’ll see HTML from a facebook.com/events/$EVENT_ID page:

  • No meaningful classes
  • Multiple layers of nesting, meant to randomize the structure
  • Arbitrary use of span, div, h-* and p, relying on CSS to produce the intended result

The more mature and robust the scraping technology is, the longer it will remain viable without needing upgrades. It may seem counterintuitive, but Facebook’s final interface does not change very often. If you build a scraper around the elements a user actually sees, a change from span to div or from h1 to h3 will break only the most delicate scrapers.
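To make the idea concrete, here is a minimal sketch of a scraper that ignores tag names and class names and keeps only the rendered text, in document order. It is not the extension's code: the extension runs in the browser, and Python is used here only because it is quick to try out. The two HTML snippets are invented for the example and are far simpler than what Facebook serves.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes in document order, ignoring tag names and classes."""

    SKIPPED = {"script", "style"}  # content that is never rendered as text

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # normalize whitespace
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Two hypothetical renderings of the same event: different tags, nesting and classes.
before = ('<div class="x1a"><h1 class="q9z"><span>Repair Cafe</span></h1>'
          '<p>Bring a broken toaster.</p></div>')
after = ('<div class="k3b"><div><h3 class="zz1">Repair Cafe</h3></div>'
         '<span><div>Bring a broken toaster.</div></span></div>')

assert visible_text(before) == visible_text(after)
```

Both variants give the same list of text chunks, so a span becoming a div does not matter. Deciding which chunk is the title and which is the description still depends on the order the interface shows them in, which is the part that changes rarely.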

Unfortunately, despite knowing this theory, the extension underwent major changes in 2022 to move the prototype forward, so right now it is in the “weak” stage, and Facebook’s HTML update causes the title and description to be missing:

This is also a backend problem: the backend should not allow posting events with no title or description, and should report the attempt as an Error. Also, the scraping reliability problem cannot be solved once and for all in the 0.3.x versions. That improvement is part of a larger refactor and redesign that would come along with Manifest V3.
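The backend check could be as small as the sketch below. The field names `title` and `description`, the function `validate_event` and the exception `InconsistencyError` are all assumptions made for illustration; the real backend is not written in Python and may name things differently.

```python
class InconsistencyError(ValueError):
    """Raised when scraped data is incomplete (hypothetical name)."""


REQUIRED_FIELDS = ("title", "description")  # assumed field names


def validate_event(event):
    """Return the event unchanged, or raise if a required field is missing or blank."""
    missing = [
        field for field in REQUIRED_FIELDS
        if not str(event.get(field) or "").strip()
    ]
    if missing:
        raise InconsistencyError("missing fields: " + ", ".join(missing))
    return event
```

Calling `validate_event` before posting would turn an event scraped without its title into a recorded error instead of an empty event on Mobilizon.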

(at least) three kinds of Errors

At the moment the error collection in MongoDB looks like this. Below you’ll find an error triggered when I changed the username to something invalid:

  "message": "User not found: {\"response\":{\"data\":{\"login\":null},\"errors\":[{\"code\":\"user_not_found\",\"field\":null,\"locations\":[{\"column\":7,\"line\":2}],\"message\":\"User not found\",\"path\":[\"login\"],\"status_code\":404}],\"status\":200,\"headers\":{}},\"request\":{\"query\":\"mutation Login($email: String!, $password: String!)\\n    { login(email: $email, password: $password)   \\n    { accessToken refreshToken user  \\n    { id  \\n      email    \\n      role   \\n      __typename }  \\n    __typename}\\n}\\n\",\"variables\":{\"email\":\"mobilibr@mt2015-\",\"password\":\"experiment\"}}}",
  "publicKey": "E1CdKsPZLCrfhsrRYYR8t8snahxx1LAaDNkSX9DJvWEh",
  "when": {
    "$date": {
      "$numberLong": "1674805040484"
    }
  }

But we should not treat all errors the same. We might need to identify:

  • Login/password errors, coming from authentication failures in the Mobilizon backend. These require the adopter’s intervention.
  • Inconsistency errors, like the faulty scraping, that might provide inappropriate data. These require us to fix the scraper, or potentially the adopter to edit the event and fix it manually.
  • Any other kind of failure, treated as an unexpected bug. These require us to investigate further.
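The three kinds above could be told apart roughly as in this sketch. It assumes the stored `message` keeps the shape shown in the record above (a label followed by a JSON payload), and it knows a single authentication code, `user_not_found`, because that is the only one visible in that record. The names `ErrorKind`, `AUTH_CODES` and `classify` are invented for the example.

```python
import json
from enum import Enum


class ErrorKind(Enum):
    AUTHENTICATION = "authentication"  # the adopter must fix their credentials
    INCONSISTENCY = "inconsistency"    # scraped data is incomplete or wrong
    UNEXPECTED = "unexpected"          # needs investigation on our side


# Assumption: only "user_not_found" is known from the stored error above;
# extend this set with whatever codes Mobilizon actually returns.
AUTH_CODES = {"user_not_found"}


def graphql_error_codes(message):
    """Extract GraphQL error codes from a stored "label: {json}" message."""
    start = message.find("{")
    if start == -1:
        return set()
    try:
        payload = json.loads(message[start:])
    except json.JSONDecodeError:
        return set()
    if not isinstance(payload, dict):
        return set()
    response = payload.get("response")
    if not isinstance(response, dict):
        return set()
    errors = response.get("errors") or []
    return {e.get("code") for e in errors if isinstance(e, dict)}


def classify(message, event=None):
    """Map a stored error message (and optionally the scraped event) to a kind."""
    if graphql_error_codes(message) & AUTH_CODES:
        return ErrorKind.AUTHENTICATION
    if event is not None and not (event.get("title") and event.get("description")):
        return ErrorKind.INCONSISTENCY
    return ErrorKind.UNEXPECTED
```

With this split, the adopter could be told "check your login" only for the first kind, while the other two end up on our side.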