🌱 Developer Meetup n.2; Report
Highlight: CURRENT VERSION ISN'T WORKING. After all, challenging Facebook has some complexities. Whatever: ✊
In this report we talk about fixes, errors, and fundraising. The meeting was about defining what should happen between February 15th and March 15th, the date of our next monthly online meeting.
What to do in the next month
- "Keep promoting and perhaps fundraise", the same goal as in the previous meeting. We need to keep making this community grow, or, well, exist.
- Facebook changed its HTML structure. This is expected and we know how to handle it, but read more below on how to reduce the impact of their obfuscation.
- Complete a gathio installation, and eventually support cross-posting there. It should be easy to add a new event in mongodb.
- After inspecting some of the errors triggered in the last month, we need to report errors better. For example, if someone provides a wrong login and password, that should be considered a different error from “the server didn’t work properly”. Reminder: it was handy to check the Mobilizon version via API.
Fundraising narratives
Check out these (read-only) CryptPad notes; they describe some of the narratives usable in fundraising.
Note🍀 this is our biggest weak spot. There is not yet any concrete effort, and foundations often support the creation of new tools rather than adversarial approaches. It makes sense, but it is an additional vulnerability this project faces.
On Facebook HTML obfuscation
HTML randomization is one of the ways Facebook makes every scraping action more expensive. An update may cost them nothing, while a scraper that wants to keep working has to be updated every time.
Above you’ll see HTML from a facebook.com/events/$EVENT_ID page.
- No meaningful classes
- Nesting with multiple layers meant to randomize this
- Usage of span, div, h-* and p in an arbitrary fashion, relying on CSS to produce the intended result
The more mature and robust the scraping technology is, the longer it remains viable without needing upgrades. It may seem counterintuitive, but Facebook’s final interface does not change very often, and if you build a scraper attuned to those elements, then when span becomes div and h1 becomes h3, only the most delicate scrapers will break.
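To make the idea concrete, here is a minimal sketch of a tag-agnostic extraction step. It is not the extension's actual code: the class name, the helper `visible_text`, and the two HTML snippets are made up for illustration. It only shows that if you read the text in document order and ignore tag names and classes, swapping span for div or h1 for h3 changes nothing.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text chunks in document order, ignoring tag names and classes."""

    SKIPPED = {"script", "style"}  # content of these tags is never shown to the user

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # normalize whitespace
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html):
    """Hypothetical helper: return the visible text chunks of an HTML string."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.chunks


# Two invented versions of the same event page: only tags and class names differ.
before = '<div class="x1a"><h1><span>Bike repair workshop</span></h1><p>Bring your own tools.</p></div>'
after = '<span class="q9z"><div><h3>Bike repair workshop</h3></div><div><span>Bring your own tools.</span></div></span>'

print(visible_text(before) == visible_text(after))
```

Deciding which chunk is the title and which is the description still needs some heuristic on top of this, and that part is where a real scraper has to be careful.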
Unfortunately, despite knowing this theory, the extension underwent major changes in 2022 to move the prototype forward, so right now it is in the “weak” stage, and Facebook’s HTML update causes the title and description to be missing.
This is a backend problem, as it should not allow posting events with no title/description, and such a case should be reported as an Error. Also, the scraping reliability problem cannot be solved once and for all in the 0.3.x version. That improvement is part of a larger refactor and redesign that would come along with Manifest V3.
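As a rough sketch of the backend check described here (the real backend is not written in Python, and the names `InconsistentEventError`, `REQUIRED_FIELDS` and `validate_event` are invented for this example), the idea is to refuse an event whose title or description is empty and raise a dedicated error instead of posting it:

```python
class InconsistentEventError(ValueError):
    """Hypothetical error type for scraped events that lack required data."""


# Assumption: these are the two fields the report says went missing.
REQUIRED_FIELDS = ("title", "description")


def validate_event(event):
    """Return the event unchanged, or raise if a required field is empty."""
    missing = [
        field for field in REQUIRED_FIELDS
        if not str(event.get(field) or "").strip()
    ]
    if missing:
        raise InconsistentEventError(
            "missing required fields: " + ", ".join(missing)
        )
    return event


try:
    validate_event({"title": "", "description": "Bring your own tools."})
except InconsistentEventError as error:
    print(error)
```

Raising a specific error type, rather than a generic failure, is what lets this case be stored and reported separately from other problems.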
(at least) three kinds of Errors
At the moment the error collection in mongodb looks like this. Below you’ll find an error triggered when I changed the username to something invalid:
{
"message": "User not found: {\"response\":{\"data\":{\"login\":null},\"errors\":[{\"code\":\"user_not_found\",\"field\":null,\"locations\":[{\"column\":7,\"line\":2}],\"message\":\"User not found\",\"path\":[\"login\"],\"status_code\":404}],\"status\":200,\"headers\":{}},\"request\":{\"query\":\"mutation Login($email: String!, $password: String!)\\n { login(email: $email, password: $password) \\n { accessToken refreshToken user \\n { id \\n email \\n role \\n __typename } \\n __typename}\\n}\\n\",\"variables\":{\"email\":\"mobilibr@mt2015-\",\"password\":\"experiment\"}}}",
"publicKey": "E1CdKsPZLCrfhsrRYYR8t8snahxx1LAaDNkSX9DJvWEh",
"when": {
"$date": {
"$numberLong": "1674805040484"
}
}
}
But we should not treat all errors the same; we might need to identify:
- Login/password errors, coming from authentication failures in the Mobilizon backend. These require the adopter's intervention.
- Inconsistency errors, like the faulty scraping, that might provide inappropriate data. These require us to fix the issue, or potentially the adopter to edit the event and fix it manually.
- Any other kind of failure, treated as an unexpected bug. These require us to investigate further.
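The three kinds above could be told apart with something like the sketch below. It assumes the stored message keeps the shape of the sample record (a text prefix followed by a JSON payload with `response.errors`). The only error code it knows is `user_not_found`, taken from that sample; the `Inconsistent event:` prefix and all function names are hypothetical, so treat this as an illustration rather than the actual implementation.

```python
import json

AUTH_ERROR_CODES = {"user_not_found"}  # the only code seen in the sample; extend as needed
INCONSISTENCY_PREFIX = "Inconsistent event:"  # hypothetical marker the backend would emit


def graphql_errors(message):
    """Extract the GraphQL error list embedded in a stored message, if any."""
    start = message.find("{")
    if start == -1:
        return []
    try:
        payload = json.loads(message[start:])
    except json.JSONDecodeError:
        return []
    response = payload.get("response") if isinstance(payload, dict) else None
    errors = response.get("errors") if isinstance(response, dict) else None
    return errors if isinstance(errors, list) else []


def classify_error(message):
    """Return 'credentials', 'inconsistency' or 'unexpected'."""
    if message.startswith(INCONSISTENCY_PREFIX):
        return "inconsistency"
    for error in graphql_errors(message):
        if not isinstance(error, dict):
            continue
        # In the sample, the failing mutation shows up as path == ["login"].
        if error.get("code") in AUTH_ERROR_CODES or "login" in (error.get("path") or []):
            return "credentials"
    return "unexpected"


# A shortened message with the same structure as the stored record above.
sample = "User not found: " + json.dumps({
    "response": {
        "data": {"login": None},
        "errors": [{"code": "user_not_found", "path": ["login"], "status_code": 404}],
        "status": 200,
    }
})
print(classify_error(sample))
```

Anything that cannot be parsed falls into the "unexpected" bucket on purpose, so unknown failures still end up in front of us for investigation.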