Fetching Tweets in Laravel With Python’s Social Networking Services Scraper

How to fetch data from Twitter without paying for a Twitter developer account

Hendrik Prinsloo
4 min read · Apr 5, 2023
Created by Mage with Stable Diffusion 2.1

If you landed here looking for a guide on scraping tweets with Python, please check out Martin Beck’s guide below.

Introduction

I recently experimented with Laravel Actions and built basic commands to scrape posts from Twitter, Reddit, and a static website. You can find the open-source repository at the bottom.

Twitter announced that it would stop providing free access to its API from February 9th, 2023. See "Twitter remove free API access in the latest money-making quest".

I applied for a developer account and was denied. The basic tier costs $100 per month, which I think is ridiculous. Thanks, Elon.

Response from Twitter on my application

I couldn’t find a PHP library that lets you scrape data without an API key. If you know of one, please leave a comment, or create one.

Why Python?

Python has an elegant library for scraping social networking services (SNS), called snscrape. It was initially created by JustAnotherArchivist, and allows you to scrape data from the most popular social media sites.

The following services are currently available:

  • Facebook: user profiles, groups, and communities (aka visitor posts)
  • Instagram: user profiles, hashtags, and locations
  • Mastodon: user profiles and toots (single or thread)
  • Reddit: users, subreddits, and searches (via Pushshift)
  • Telegram: channels
  • Twitter: users, user profiles, hashtags, searches (live tweets, top tweets, and users), tweets (single or surrounding thread), list posts, communities, and trends
  • VKontakte: user profiles
  • Weibo (Sina Weibo): user profiles

Integrating Python with Laravel

I decided on a simple approach, as it was just an experiment, but it should be sufficient for most use cases.

Overview of the flow

  1. Install the dependencies on the server
  2. Trigger a shell command via PHP to execute snscrape
  3. Save the results to a static file
  4. Parse the results to make sense of them

Install the dependencies

I recommend using Laravel Sail for this, but you should be able to configure it on any server environment. You only need Python 3 and the snscrape package.

The standard Dockerfile provided by Laravel Sail needs the following instructions added.

RUN apt-get update \
    && apt-get install -y python3-pip \
    && pip3 install snscrape

The shell command

This might make most developers cringe, as ideally, you don’t want to allow anything running on your server to execute raw shell commands.

But sometimes, you need to compromise. Consider this approach carefully, especially when implementing it in a production environment.

I created a simple static utility class for this, inspired by Bertug Korucu’s post about executing shell commands in Laravel.

namespace App\Utilities;

// uses ...

class ShellCommand
{
    public static function execute($cmd): string
    {
        $process = Process::fromShellCommandline($cmd);

        $processOutput = '';
        $captureOutput = function ($type, $line) use (&$processOutput) {
            $processOutput .= $line;
        };

        $process->setTimeout(null)->run($captureOutput);

        if ($process->getExitCode()) {
            $exception = new Exception($cmd.' - '.$processOutput);
            report($exception);

            throw $exception;
        }

        return $processOutput;
    }
}
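Given the caution above about raw shell commands, it may help to escape every dynamic value before it reaches the shell. The following is a minimal sketch in plain PHP, not code from my repository: buildSnscrapeCommand() and the /tmp/tweets.jsonl path are hypothetical names used only for illustration, while escapeshellarg() is PHP's built-in function.

```php
<?php

declare(strict_types=1);

/**
 * Build an snscrape command line with every dynamic value shell-escaped.
 * buildSnscrapeCommand() is a hypothetical helper for illustration only.
 */
function buildSnscrapeCommand(string $query, int $maxResults, string $outputPath): string
{
    $parts = [
        'snscrape',
        '--jsonl',
        '--max-results',
        (string) $maxResults,       // the int type hint keeps this value numeric
        'twitter-search',
        escapeshellarg($query),     // quotes the search string for the shell
    ];

    // Append (>>) so that repeated runs accumulate in the same file.
    return implode(' ', $parts) . ' >> ' . escapeshellarg($outputPath);
}

// Assumed example values; the output path is a placeholder.
echo buildSnscrapeCommand('#php #laravel since:2023-04-01', 100, '/tmp/tweets.jsonl'), PHP_EOL;
```

The resulting string can be passed to ShellCommand::execute() as before. Escaping does not make shell execution risk-free, but it stops a quote or semicolon in the query or path from being interpreted as part of the command.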

Fetching the tweets

From this point, it was simple to relay the command via the shell to fetch the relevant tweets and save them to a static file to be processed.

I started by experimenting with calling the command directly.

 ~/path/to/project > ./vendor/bin/sail shell

sail@guid:/var/www/html$ snscrape \
    --jsonl \
    --progress \
    --max-results 1 \
    twitter-search "#php #laravel since:2023-04-01" \
    | python3 -m json.tool

Integrating the command with Laravel

For some reason, Twitter’s developers decided not to allow ordering results with advanced search. To work around this, I query the current month repeatedly with a decreasing min_faves (minimum likes) threshold, so the results are grouped from most popular to least popular.

The results of each run get appended in sequence to a static log file in storage before being processed.

Note: I delayed each request by 60 seconds to prevent rate limiting.
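To get a feel for how many requests and pauses this adds up to, here is a small sketch in plain PHP that lists the thresholds visited by the method that follows. minFavesSchedule() is a hypothetical helper written for this illustration; it only mirrors the stepping logic (down by 100 above 100 likes, down by 10 below that) and does not call snscrape.

```php
<?php

declare(strict_types=1);

/**
 * List the min_faves thresholds visited when starting from $start,
 * stepping down by 100 above 100 likes and by 10 at or below 100.
 * minFavesSchedule() is a hypothetical helper for illustration only.
 *
 * @return int[]
 */
function minFavesSchedule(int $start = 200): array
{
    $schedule = [];
    $minFaves = $start;

    while (true) {
        $schedule[] = $minFaves;

        if ($minFaves <= 1) {
            break; // last run: no further step down
        }

        $minFaves -= $minFaves > 100 ? 100 : 10;
    }

    return $schedule;
}

$schedule = minFavesSchedule();
echo implode(', ', $schedule), PHP_EOL;
// prints: 200, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10, 0

// Every run except the last is followed by a 60-second pause.
printf("%d runs, about %d minutes of waiting\n", count($schedule), count($schedule) - 1);
// prints: 12 runs, about 11 minutes of waiting
```

With the default starting value of 200, a full pass therefore takes at least 11 minutes, not counting the time snscrape itself needs.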

namespace App\Actions\Scrapers;

// uses ...

class FetchTwitterPosts
{
    // ...

    private function fetchTweets(int $minFaves = 200): Collection
    {
        $this->command->info(sprintf(
            'Fetching tweets with minimum likes of %d',
            $minFaves
        ));

        $filepathLogAbs = Storage::path($this->filepathLog);
        $commandArgs = [
            'snscrape',
            '--jsonl',
            '--progress',
            '--max-results 100',
            sprintf(
                'twitter-search "#php #laravel since:%s until:%s min_faves:%d"',
                now()->startOfMonth()->format('Y-m-d'),
                now()->endOfMonth()->format('Y-m-d'),
                $minFaves
            ),
        ];

        $command = sprintf(
            '%s >> %s',
            implode(' ', $commandArgs),
            $filepathLogAbs
        );

        $this->command->info(sprintf('Executing: %s', $command));
        ShellCommand::execute($command);

        if ($minFaves > 1) {
            sleep(60);

            if ($minFaves > 100) {
                $minFaves -= 100;
            } else {
                $minFaves -= 10;
            }

            return $this->fetchTweets($minFaves);
        }

        $tweets = collect();
        $lines = Storage::get($this->filepathLog);
        foreach (explode("\n", $lines) as $line) {
            if (empty($line)) {
                continue;
            }

            $tweets->push(json_decode($line, true));
        }

        return $tweets;
    }

    // ...
}
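Because each lower min_faves threshold also matches the tweets already found at the higher ones, the log file can contain the same tweet more than once. Below is a minimal sketch, in plain PHP, of removing duplicates and sorting by likes. It assumes each JSON line carries "id" and "likeCount" keys, which you should check against your own snscrape output; uniqueTweetsByLikes() and the sample data are made up for this illustration.

```php
<?php

declare(strict_types=1);

/**
 * Decode JSONL text, keep one record per tweet id, and sort by likes (descending).
 * Assumes each record has "id" and "likeCount" keys; verify against real output.
 * uniqueTweetsByLikes() is a hypothetical helper for illustration only.
 *
 * @return array<int, array<string, mixed>>
 */
function uniqueTweetsByLikes(string $jsonl): array
{
    $tweets = [];

    foreach (preg_split('/\R/', $jsonl) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines, such as the trailing newline
        }

        $tweet = json_decode($line, true);
        if (!is_array($tweet) || !isset($tweet['id'])) {
            continue; // skip lines that are not usable tweet records
        }

        // A later copy of the same tweet overwrites the earlier one.
        $tweets[(string) $tweet['id']] = $tweet;
    }

    $tweets = array_values($tweets);
    usort(
        $tweets,
        fn (array $a, array $b): int => ($b['likeCount'] ?? 0) <=> ($a['likeCount'] ?? 0)
    );

    return $tweets;
}

// Made-up sample data standing in for the log file.
$sample = <<<'JSONL'
{"id": 1, "likeCount": 250}
{"id": 2, "likeCount": 120}
{"id": 1, "likeCount": 250}
{"id": 3, "likeCount": 15}
JSONL;

foreach (uniqueTweetsByLikes($sample) as $tweet) {
    printf("%d: %d likes\n", $tweet['id'], $tweet['likeCount']);
}
```

In a Laravel context, the same result could probably be reached on the returned Collection with its unique() and sortByDesc() methods; the plain version above just keeps the example self-contained.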
