Sunday, October 13, 2019

Twitter Premium Search API - Node.js


Summary

In this post I'll demonstrate how to use the Twitter Premium Search API. It is a pure REST API with two search modes: the past 30 days, or the full archive going back to Twitter's launch in 2006.

The API has a very limited 'free' mode for developers to try out. Usage limits apply to the number of API requests, the number of tweets pulled per month, and the rate of API calls. To do anything of significance with this API, you have to pay for one of Twitter's API subscriptions. That gets pricey quickly: the cheapest tier is currently $99/month. This post is based on the 'free'/sandbox tier.

Main Loop

Line 1 fetches a bearer token for accessing the Twitter APIs.  I covered this topic in a previous post.

Lines 4-22 implement a do-while loop that fetches batches of tweets for a given search query. In the Twitter free/sandbox environment, you can pull up to 100 tweets per API call. Each tweet in the batch is checked to determine whether it is a 140- or 280-character tweet: when the truncated flag is set, the full text is read from extended_tweet.full_text. The tweet text is cleaned up, and then the text and the created_at timestamp are added to a JSON array. That array is ultimately written to a file.

Line 20 is a self-imposed delay between calls to the Twitter API. If you exceed their rate limits, you'll get an HTTP 429 error.

        const token = await getTwitterToken(AUTH_URL);
        let next = null;
        
        do {
            const batch = await getTweetBatch(token, url, query, fromDate, maxResults, next);
            for (let i=0; i < batch.results.length; i++) {  //loop through the page/batch of results
                let tweet = {};
                if (batch.results[i].truncated) {  //text exceeds 140 characters; the full text is in extended_tweet
                    tweet.text = batch.results[i].extended_tweet.full_text.trim();
                }
                else {
                    tweet.text = batch.results[i].text.trim();
                }

                tweet.text = tweet.text.replace(/\r?\n|\r|@|#/g, ' ');  //replace newlines, @ and # with spaces
                tweet.created_at = batch.results[i].created_at;
                tweets.push(tweet);
            }
            next = batch.next;
            await rateLimiter(3);  //rate limit twitter api calls to 1 per 3 seconds/20 per minute
        }
        while (next);
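
The rateLimiter helper called on line 20 isn't shown above. A minimal sketch, assuming it does nothing more than pause for the given number of seconds, might look like the following. The function body and the demo function are assumptions for illustration, not necessarily what the repo contains.

```javascript
// Hypothetical sketch of rateLimiter: returns a Promise that resolves
// after the given number of seconds.
function rateLimiter(seconds) {
    return new Promise(resolve => setTimeout(resolve, seconds * 1000));
}

// Illustration only: three simulated API calls, each followed by a 0.1 second pause.
async function demo() {
    const start = Date.now();
    for (let i = 0; i < 3; i++) {
        // a real Twitter API call would go here
        await rateLimiter(0.1);
    }
    return Date.now() - start;  //elapsed milliseconds
}
```

Because the delay is awaited inside the loop, the next request isn't issued until the pause has finished, which keeps the call rate at or below one per interval.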

Tweet Batch Fetch

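Lines 1-26 set up a node-fetch call to the Twitter REST API endpoint. If the call includes a 'next' parameter (meaning the search returned multiple pages of tweets), I add that parameter to the request body. The snippet is an excerpt: the enclosing function and the catch block that follows the try are not shown.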

    const body = {
        'query' : query,
        'fromDate' : fromDate,
        'maxResults' : maxResults
    };
    if (next) {
        body.next = next;
    }

    try {
        const response = await fetch(url, {
            method: 'POST',
            headers: {
                'Authorization' : 'Bearer ' + token
            },
            body: JSON.stringify(body)
        });
        if (response.ok) {
            const json = await response.json();
            return json;
        }
        else {
            const msg = `search request response status: ${response.status}`;
            throw new Error(msg);
        }
    }
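
The code above treats any non-OK response, including an HTTP 429, as a fatal error. As an alternative, here is a rough sketch of retrying on 429 with an exponential backoff. This is not what the snippet above does; it is a swapped-in technique, and fetchWithRetry, fetchFn, maxRetries and baseDelayMs are all hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: retries a request when the server answers HTTP 429.
// fetchFn is any fetch-compatible function (for example node-fetch), passed in
// so the helper doesn't depend on a particular fetch implementation.
async function fetchWithRetry(fetchFn, url, options, maxRetries = 3, baseDelayMs = 1000) {
    for (let attempt = 0; ; attempt++) {
        const response = await fetchFn(url, options);
        if (response.status !== 429 || attempt >= maxRetries) {
            return response;  //success, a non-rate-limit status, or out of retries
        }
        const delay = baseDelayMs * 2 ** attempt;  //backoff doubles on each retry
        await new Promise(resolve => setTimeout(resolve, delay));
    }
}
```

Under these assumptions, the fetch(url, {...}) call in the batch function could be swapped for fetchWithRetry(fetch, url, {...}), with the response.ok check left as it is.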

Usage

let query = 'from:realDonaldTrump -RT';  //get tweets originated from Donald Trump, filter out his retweets
let url = SEARCH_URL + THIRTY_DAY_LABEL;  //30day search
let fromDate = '201910010000'; //search for tweets within the current month (currently, Oct 2019)
search(url, query, fromDate, 100)  //100 is the max results per request for the sandbox environment 
.then(total => {
    console.log('total tweets: ' + total);
})
.catch(err => {
    console.error(err);
});
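
The fromDate value above is a timestamp in yyyyMMddHHmm form, interpreted as UTC. Rather than hard-coding the string, it could be built from a Date object. The sketch below assumes a hypothetical helper named toTwitterDate, which is not part of the code in this post.

```javascript
// Hypothetical helper: formats a Date as a yyyyMMddHHmm string in UTC.
function toTwitterDate(date) {
    const pad = n => String(n).padStart(2, '0');  //zero-pad to two digits
    return String(date.getUTCFullYear()) +
        pad(date.getUTCMonth() + 1) +  //getUTCMonth() is zero-based
        pad(date.getUTCDate()) +
        pad(date.getUTCHours()) +
        pad(date.getUTCMinutes());
}

// Midnight UTC on October 1, 2019 gives the same value used above.
const exampleFromDate = toTwitterDate(new Date(Date.UTC(2019, 9, 1)));  //'201910010000'
```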

Output

A snippet of the JSON array produced by the function call above:
[
    {
        "text": "We have become a far greater Economic Power than ever before, and we are using that power for WORLD PEACE!",
        "created_at": "Sun Oct 13 14:32:37 +0000 2019"
    },
    {
        "text": "Where’s Hunter? He has totally disappeared! Now looks like he has raided and scammed even more countries! Media is AWOL.",
        "created_at": "Sun Oct 13 14:15:55 +0000 2019"
    },
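
The step that writes the array to a file isn't shown in this post. One way it could be done, assuming a hypothetical saveTweets helper and a caller-supplied file path, is sketched below using Node's built-in fs module.

```javascript
const fs = require('fs');

// Hypothetical helper: writes the tweet array to disk as pretty-printed JSON.
async function saveTweets(tweets, filePath) {
    const json = JSON.stringify(tweets, null, 4);  //4-space indentation, as in the output above
    await fs.promises.writeFile(filePath, json, 'utf8');
    return tweets.length;  //number of tweets written
}
```

Returning the array length gives the caller a tweet count to report, much like the total logged in the usage example.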

Source

https://github.com/joeywhelan/twitterSearch

Copyright ©1993-2024 Joey E Whelan, All rights reserved.