The drama in trying to convert election PDFs to Spreadsheets
My phone pinged: “Tell Mark Essien they’re coming for him.”
I had no idea what that even meant. It was a forwarded WhatsApp message that someone had sent to a friend of mine.
All I was doing was trying to convert PDFs from the 2023 Nigeria presidential elections into spreadsheets.
But let’s rewind a bit and give some context.
In 2020 a small group of about 5 young people went to a crossroads in Lagos, Nigeria to protest against police brutality. Their protest was called #EndSARS. SARS (the Special Anti-Robbery Squad) was a notorious unit of the Nigeria Police Force, known for extortion and extra-judicial killings of anyone they termed armed robbers.
For 4 days that tiny group protested, with barely anyone noticing.
On the last day, as they were about to leave, someone tweeted about their protest. Then someone else retweeted, and someone else. And soon, there were thousands of retweets, with social media hailing them as heroes.
A few people who saw the protest on Twitter joined them. It was still a small crowd, but it was growing. The news started spreading on social media, and the crowd grew. Within a couple of days, there were thousands of people protesting with the #EndSARS hashtag. The government was silent, believing it would blow over quickly.
But the protests grew and grew. Young people in other cities started joining. Nigerian social media was aflame: this was the only topic. Then the protests turned violent. The protesters started attacking police stations. The policemen started shooting protesters.
The protesters marched to notorious SARS prisons and forced them open. Prisoners escaped. Malls were burnt down. Policemen threw off their uniforms and government officials fled.
As the violence flared, social media now started calling for calm. But it was too late: a lot of hoodlums had already infiltrated the protests, and they were bent on getting revenge for everything they blamed the government for.
The government tried to stop the protests, but there was nobody to talk to. There was no head of the protest, no figurehead, nobody with any real authority. They called it a “headless mob”.
Social media started calling for calm, and asking the president to speak. After many days of silence, he finally spoke, urging calm, disbanding SARS and promising to address the concerns of the protesters.
Things calmed down. But there was still a big crowd gathered at the Lekki Tollgate, on a major 6-lane highway that passes through the heart of Lagos. Concerts were playing at the location, people were serving food and drinks.
Then ominously, on the 20th of October 2020, some people drove there in unmarked cars and removed all the cameras installed at the tollgate.
That night, as the DJs played, all the floodlights at the tollgate switched off. Social media users warned about military vehicles driving along the highway towards the location.
Short clips from mobile phones capture what happened next. It’s dark, people are running, gunshots everywhere. People are gathered around bleeding people, screaming. More running, lights flashing.
There are no cameras, so there is no clear reconstruction.
But the next morning, the army is in control of the tollgate, and the protesters are gone. Social media is filled with messages that many people were killed and that the army took away their bodies and cleared the scene. The army and government said nothing like that happened.
An uneasy calm begins.
There’s a lot of anger, but the protests are over. Everything seems to be back to normal. But there is an undercurrent of youth anger.
And the presidential election is in 2023.
Nigeria has always been a two-party state, much like the U.S. It uses pretty much the same presidential system as the U.S., and just like the Democrats and Republicans, there have been variants of the same two parties facing each other. Since 2011, it has been the PDP (former ruling party) against the APC (current ruling party).
In 2022, one of the candidates abruptly left the PDP, claiming that the process was rigged against him winning the nomination. He joined a small party that had only ever won one state governorship election in its entire history.
Suddenly, the social media mob adopted this candidate, Peter Obi, as their preferred candidate. Many of the same handles who had called out #EndSARS now turned #Obidient. Tweets and WhatsApp messages grew, calling on others to join the #Obidient movement.
Peter Obi himself seemed as surprised as everyone else that the headless mob had adopted him, but he quickly accepted the Obidients, and started using the slogans as part of his campaign.
The political establishment mocked the movement, saying the “4 people tweeting in a room” had never won even a council election in Nigeria.
The movement grew louder and louder on social media, with the members of the movement letting out their anger on the present political establishment. The politicians mocked back, saying that politics was not won on social media but at the grassroots, referring to the villagers who usually vote for whoever gives them the most money.
Election Day came. The Labour Party candidate was Peter Obi, the candidate adopted by the young people on social media. The candidate from the ruling party was Bola Tinubu, former Governor of the most populous state (Lagos), who has been involved in selecting every Governor of the state since he was Governor. He was also rumored to be getting a cut of all Lagos State tax revenue via a consulting firm he set up when he was Governor. The candidate from the former ruling party was Atiku Abubakar, a former vice president and a very wealthy man. His wealth came from his holdings in the ports, which he helped privatize when he was vice president.
For the first time, Nigeria had adopted electronic transmission of voting from the polling units. That meant that the polling unit voting sheets could be viewed in real time on a website.
As the results started coming in, the country was electrified. Tens of thousands of votes were coming in for Peter Obi. Then hundreds of thousands of votes started coming in. Then a million votes came in and another million.
The senatorial election results started coming in, and the Labour Party was winning seats. An unknown party with no previous Governor or Senate seat was sweeping seats across vast swathes of the country.
But there was no way of knowing who won. Although the Nigerian electoral body had agreed to electronic transmission of polling unit results, the senators and political establishment had blocked the electronic transfer of actual votes. They wanted it still written on paper, for reasons we can guess at.
With 170,000 polling units, it was going to take a long time to count the results.
But across social media, videos and pictures were spreading of the Labour Party winning polling unit after polling unit, people yelling in happiness as the party won. Even in the rural areas, where the politicians had claimed they had absolute control, the party was winning unit after unit.
But other disturbing videos were also coming out: of polling units destroyed, and people threatening to hurt anyone who voted for the Labour Party.
Social media waited for the election commission to announce the results. People tried to tally the results online, but with 170,000 PDFs in a completely unstructured format, it was close to impossible.
Then INEC started announcing the state results, over a multi-day counting marathon. At the end, the result was announced:
- The Obidients’ Labour Party: 6.1 million votes
- The former ruling PDP: 6.9 million votes
- The ruling party: 8.7 million votes
The Obidients had won in the capital city and had defeated the ruling party and their candidate in Lagos, the state they had ruled for the last 16 years. But according to the official tally, they had lost in the country.
But even as the results were being announced, people on social media were discovering that if they manually added together all the results from the servers of the electoral umpire, the total didn’t add up to the votes announced. They showed examples from Rivers State, where just a few polling units added together exceeded the votes announced by the electoral umpire. People cried foul.
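The manual check described here is, at heart, a sum-and-compare. A minimal Python sketch of that arithmetic follows; the field names (`unit_id`, `votes`), the party keys and every number are made up for illustration and do not reflect the real result sheets or the announced figures.

```python
from collections import defaultdict

def sum_by_party(unit_results):
    """Add up per-polling-unit votes into one total per party."""
    totals = defaultdict(int)
    for unit in unit_results:
        for party, votes in unit["votes"].items():
            totals[party] += votes
    return dict(totals)

def discrepancies(unit_results, announced):
    """Return {party: summed - announced} for parties whose totals differ."""
    summed = sum_by_party(unit_results)
    parties = set(summed) | set(announced)
    return {
        p: summed.get(p, 0) - announced.get(p, 0)
        for p in parties
        if summed.get(p, 0) != announced.get(p, 0)
    }

# Illustrative numbers only (hypothetical units and totals).
units = [
    {"unit_id": "PU-001", "votes": {"LP": 120, "APC": 30, "PDP": 25}},
    {"unit_id": "PU-002", "votes": {"LP": 95, "APC": 40, "PDP": 10}},
]
announced = {"LP": 150, "APC": 70, "PDP": 35}
print(discrepancies(units, announced))
```

A positive difference means the polling unit sheets add up to more votes than were announced for that party, which is the kind of mismatch people were pointing at.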
The amended electoral rules required the parties to challenge election results within 21 days after they were announced. But to file, evidence was needed. The only evidence about how the election really went was spread out across 170,000 images of polling unit results. There was no digital version of the results, just pictures.
The clock was ticking, and suddenly various Obidient groups sprang up to figure out how to extract the data from the pictures. I was in one of the groups, and we decided to try various things.
The first thing was OCR. The pictures were all snapped in many different ways, with hardly any structure. Each party result had a number beside it. The picture angles were different. Many sheets were blurry or had camera flash on them.
All the open source OCR software gave bad results. The best result came from Amazon Rekognition, but it was still not good enough: it would occasionally change the scores, and that was simply not going to work.
After experimenting with OCR for about a day, we gave up. We had about 8 days left to go.
We had a brainstorming meeting, and decided to try a new approach. We’d simply ask the Obidients to help us do the conversion. If hundreds of Obidients did the transcription, it would go fast.
So we quickly designed a website.
And then started coding. The frontend was in React, and the backend in PHP Laravel.
A couple of days later, the app was done. A simple website that showed a picture of a polling unit sheet and asked people to enter the votes they saw in some text fields. Then the values would be saved in the database.
I tweeted out the link.
Within minutes, I was getting replies from the Obidients. People were jumping on the site, and the first results started going up. The progress bar started moving, slowly at first, and then faster and faster. We went from transcribing one result every minute to transcribing one every 10 seconds, and soon we were transcribing at 1 sheet per second.
Traffic grew on the site, and soon our completely unoptimized backend was struggling to keep up. But we were moving forward. We quickly grew to transcribing 20,000 sheets per day.
The site had a results page that was updating the results live as the sheets were being transcribed. The results were growing, and by the time we had counted 6 million votes, the Obidients were clearly in the lead. We kept transcribing, and by the time we had reached 50% of the count (10 million votes), Peter Obi was strongly in the lead.
I tweeted that out, and all hell broke loose. Suddenly, thousands of spam Twitter accounts came out of nowhere, attacking the effort. Threats and attacks of all kinds hit us.
Then the bots came. Thousands of entries started on the site, all entering huge numbers for the ruling party. Different IP addresses, different proxies, all entering fake numbers.
When we started the project, we had a plan. We would first transcribe all 170,000 polling unit results, then we would do a second pass over all the results as a validation step. If the same numbers were entered twice, then it was likely that the entered numbers were correct. But with the bots entering fake numbers, we now had a new battle to fight.
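The two-pass plan is a form of double-entry validation: a unit counts as confirmed only when two independent transcriptions match. A rough Python sketch of the idea, assuming each pass is a dictionary from a unit ID to its per-party votes (the function name and data layout are hypothetical, not the project’s actual code):

```python
def validate_double_entry(first_pass, second_pass):
    """Split units into confirmed (both passes agree) and disputed."""
    confirmed, disputed = {}, []
    for unit_id, votes in first_pass.items():
        if second_pass.get(unit_id) == votes:
            confirmed[unit_id] = votes
        else:
            # Missing from the second pass, or the numbers differ.
            disputed.append(unit_id)
    return confirmed, disputed

# Illustrative data only.
first = {
    "PU-001": {"LP": 120, "APC": 30},
    "PU-002": {"LP": 95, "APC": 40},
    "PU-003": {"LP": 10, "APC": 12},
}
second = {
    "PU-001": {"LP": 120, "APC": 30},
    "PU-002": {"LP": 95, "APC": 400},  # disagrees with the first pass
}
confirmed, disputed = validate_double_entry(first, second)
print(sorted(confirmed), disputed)
```

Disputed units would then need a third look by a human; the sketch deliberately does not try to guess which of the two entries is right.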
We immediately enabled captcha, and that slowed down the bots a bit. Then we quickly implemented a check: if anybody entered weirdly large numbers, we’d ignore their entries going forward. Then we started showing the bots some results we already knew; if they entered wrong numbers, we’d stop accepting their results.
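Those two defenses, a plausibility cap and known-answer “honeypot” sheets, can be sketched roughly as below. This is an illustration in Python rather than the Laravel code that actually ran; the cap of 2,000 votes, the class name and the way users are identified are all assumptions.

```python
MAX_PLAUSIBLE_VOTES = 2000  # assumed cap; the real threshold is not stated

def is_plausible(votes, cap=MAX_PLAUSIBLE_VOTES):
    """True if every party's count is a non-negative number within the cap."""
    return all(0 <= v <= cap for v in votes.values())

class SubmissionFilter:
    def __init__(self, known_results):
        self.known_results = known_results  # unit_id -> trusted votes
        self.blocked = set()                # users whose entries are ignored
        self.accepted = []                  # (user_id, unit_id, votes)

    def submit(self, user_id, unit_id, votes):
        """Return True if the entry is accepted, False if it is ignored."""
        if user_id in self.blocked:
            return False
        if not is_plausible(votes):
            self.blocked.add(user_id)  # ignore this user from now on
            return False
        known = self.known_results.get(unit_id)
        if known is not None:
            if votes != known:
                self.blocked.add(user_id)  # failed a known-answer sheet
                return False
            return True  # passed the check; nothing new to store
        self.accepted.append((user_id, unit_id, votes))
        return True

# Illustrative usage with made-up users and sheets.
f = SubmissionFilter({"PU-KNOWN": {"LP": 100, "APC": 50}})
print(f.submit("volunteer", "PU-1", {"LP": 10, "APC": 5}))
print(f.submit("bot1", "PU-2", {"LP": 0, "APC": 999999}))
```

Blocking silently (still returning a normal-looking page to the client) would make it harder for an attacker to learn which check they tripped, though the text does not say whether that was done.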
Whoever was behind the bots kept adapting and counteracting what we were doing. They went from using a script to using what felt like hundreds of humans. They went from entering absurdly large numbers to entering plausible numbers. They started entering some correct and some wrong numbers.
But the combination of techniques and the large number of Obidients also working meant that we were also getting a huge number of correct entries.
It seemed the counter-parties realized that they would not win on technology, so they started a new campaign. Hundreds of accounts that claimed to be Obidients suddenly popped up, saying that we were working for the ruling party, and that the work we were doing was designed to show that the Obidients had lost the election. Every single post I made would have tens of those accounts replying.
And they doubled down: they created a fake screenshot showing that the ruling party had won on our site, and started spreading that. A current Government minister even tweeted this out, and a badly written article hurriedly appeared in a major local newspaper.
Their technique worked. The Obidients started doubting. Prominent Obidient accounts started threads wondering if we were working for the Government.
Meanwhile, we had completed transcribing 150,000 polling unit sheets, and we needed to move to the validation phase where we would correct all the damage the bots had done.
But the damage had been done. The crowd we had pulled to do the work did not trust us anymore. We did not have anyone left to help us validate the entries. Work started, but moved very slowly: fewer than 2,000 entries validated in a day. It would take us 3 months to finish at this rate.
Was our project going to fail? If we didn’t announce results, the Obidients would be sure that we were working for the Government. And the Government already saw us as enemies. How could we solve this?
I started looking through the data. We had 800,000 submissions in our database for 170,000 polling units. I was randomly sampling, and I noticed that a huge number of entries were correct. And surprisingly enough, the units with wrong entries had lots and lots of wrong entries, whereas the vast majority, which only had one entry, were mostly correct.
I tested around a bit, and then I realized: we had a bug in the code. When you opened the website for the first time, it was not returning a truly random entry the way it was supposed to. It had a strong tendency to return one from a small set of entries. But subsequent ones were random.
That meant that every time the website was fully refreshed, the first polling unit shown was one that probably already had hundreds of entries. But if you kept working, you would now be working on new units.
And I realized what might have happened: whoever was controlling the bots must have told them to refresh after every entry. Probably they figured that it would make it harder for us to detect. But then they ended up entering all the wrong values mostly for a small set of units that we could easily clean up. Up to 90% of our data was completely clean.
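One way to exploit that pattern when cleaning up is to group submissions by polling unit, set aside the few units that attracted an unusually large number of entries, and accept the rest. The Python sketch below illustrates the idea only; the threshold of 20 entries, the majority rule and the data layout are assumptions, not the procedure that was actually run.

```python
from collections import Counter, defaultdict

def consolidate(submissions, hot_threshold=20):
    """Pick one result per unit and flag heavily hit units for review.

    submissions: iterable of (unit_id, {party: votes}) pairs.
    """
    by_unit = defaultdict(list)
    for unit_id, votes in submissions:
        # Dicts are unhashable, so freeze each entry before counting.
        by_unit[unit_id].append(tuple(sorted(votes.items())))

    results, needs_review = {}, []
    for unit_id, entries in by_unit.items():
        winner, count = Counter(entries).most_common(1)[0]
        too_many = len(entries) >= hot_threshold
        no_majority = len(entries) > 1 and count <= len(entries) / 2
        if too_many or no_majority:
            needs_review.append(unit_id)
        else:
            results[unit_id] = dict(winner)
    return results, needs_review

# Illustrative data: two ordinary units and one unit flooded with entries.
subs = [
    ("PU-1", {"LP": 10, "APC": 5}),
    ("PU-2", {"LP": 7, "APC": 3}),
    ("PU-2", {"LP": 7, "APC": 3}),
    ("PU-2", {"LP": 0, "APC": 900}),
] + [("PU-HOT", {"LP": 0, "APC": i}) for i in range(25)]

results, needs_review = consolidate(subs)
print(len(results), needs_review)
```

Units with a single entry are accepted as they are, which matches the observation that single-entry units were mostly correct; only the flagged units would need manual re-transcription.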
We quickly stopped all entries, and started crunching the results. A few hours of heavy server lifting, and we were done. We had transcribed 170,000 polling unit sheets to CSV format in 5 days with a large team of volunteers.
We shared the results with the team for them to submit as part of their evidence. And we were done.
We published the results in spreadsheets here: https://drive.google.com/drive/folders/173oHgms6wYy5WKz_i3Lhl5mXcmobCWHz?usp=sharing
Peter Obi filed his petition on the last day. And I was happy to see some familiar things in the filing.
Now we wait to see how the tribunal will decide.
Email me at markessien@gmail.com or follow on Twitter: twitter.com/markessien.