How to Build a Python Reddit Bot

SiteSummarizerBot: A bot that summarizes content of a URL-only submission

Posted 2019-05-25 03:31:19 by Ronie Martinez

As Redditors, we often see bots crawling around subreddits and doing many useful things, such as image manipulation and reminders. This article guides you through building your own Reddit bot using Python.

Step 1: Register a Reddit account

Go to https://www.reddit.com/ and sign up. If you have an existing account, you may use it, or you may register a separate one. You may reuse the same email address for multiple accounts. The advantage of creating a new account is that it keeps your bot separate from your personal account.

Step 2: Create a Reddit app

Go to https://www.reddit.com/prefs/apps/ and click the button "are you a developer, create an app".

 

Create a Reddit App

Complete application details, and click "create app".

Reddit application details

Record your client ID and client secret. The client ID is right below the application name and type. Note: For safety purposes, do not share your client ID and client secret.

Reddit app details

Step 3: Learn how to use PRAW

PRAW stands for Python Reddit API Wrapper. To install PRAW, execute pip install praw.

To start consuming the Reddit API, we need to initialize a Reddit instance. In this example, the credentials are stored in environment variables. Note that there is a required format for the bot's user_agent. Read the API Rules for more details.

import os
import praw


reddit = praw.Reddit(
    client_id=os.getenv('CLIENT_ID'),
    client_secret=os.getenv('CLIENT_SECRET'),
    username=os.getenv('BOT_USERNAME'),
    password=os.getenv('BOT_PASSWORD'),
    user_agent=os.getenv('BOT_USER_AGENT'),
)
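Reddit's API rules ask for a descriptive user agent of the form `<platform>:<app ID>:<version string> (by /u/<reddit username>)`. A minimal sketch of building such a string is shown below; the helper name `build_user_agent` and the sample values are hypothetical and not taken from the bot's source.

```python
def build_user_agent(platform, app_id, version, username):
    """Return a user agent in the format recommended by Reddit's API rules."""
    return f"{platform}:{app_id}:{version} (by /u/{username})"


# Sample values are placeholders; substitute your own app ID and account name.
user_agent = build_user_agent(
    "python", "com.example.sitesummarizer", "v0.1.0", "your_username"
)
print(user_agent)
```

The resulting string can be stored in the BOT_USER_AGENT environment variable used above.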

PRAW provides a stream for submissions. For example, if we want to watch the subreddit r/SiteSummarizerBot, we need to call reddit.subreddit('SiteSummarizerBot').stream.submissions(). To get submission streams from multiple subreddits, join their names with the + sign. For example, to get the submission stream from r/Python and r/Flask, use Python+Flask. To get submissions from all subreddits, use all.

for submission in reddit.subreddit('SiteSummarizerBot').stream.submissions(skip_existing=True):
    text = submission.selftext.strip()
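The `+` syntax for multiple subreddits can be wrapped in a small helper. This is only a sketch: the function names `multireddit_name` and `watch_submissions` are hypothetical, and `reddit` is assumed to be the praw.Reddit instance created earlier.

```python
def multireddit_name(subreddits):
    """Join subreddit names with '+' so one stream covers all of them."""
    cleaned = []
    for name in subreddits:
        name = name.strip()
        # Accept both "r/Python" and "Python".
        if name.startswith("r/"):
            name = name[2:]
        cleaned.append(name)
    return "+".join(cleaned)


def watch_submissions(reddit, subreddits):
    """Yield new submissions from every listed subreddit."""
    subreddit = reddit.subreddit(multireddit_name(subreddits))
    # skip_existing=True ignores submissions made before the stream started.
    yield from subreddit.stream.submissions(skip_existing=True)


print(multireddit_name(["r/Python", "Flask"]))
```

Calling `watch_submissions(reddit, ["Python", "Flask"])` would then behave like the loop above, but across both subreddits.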

Users are notified when another user types their username in a comment. Bots may use this as a trigger to execute a task. The Reddit inbox does not provide a stream for mentions, but we can use stream_generator() from praw.models.util for this.

from praw.models.util import stream_generator

for mention in stream_generator(reddit.inbox.mentions, skip_existing=True):
    mention.mark_read()  # mark as read so the mention is not handled twice

When users downvote a bot's comment, the typical bot response is to delete that comment. Unfortunately, PRAW does not provide an API for this (or the one it has is limited). To solve this, we need to track every comment the bot posts and check its score later.

from praw.models import Comment

comment = Comment(reddit, id=comment_id)  # comment_id was stored when the bot replied
if comment.score < 1:
    comment.delete()
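One way to organize this tracking, sketched under assumptions: keep the IDs of the bot's own replies in a set and periodically re-check each score. The bot in this article stores IDs in Redis; a plain Python set stands in for it here. The names `prune_downvoted` and `fetch_comment` are hypothetical.

```python
def prune_downvoted(comment_ids, fetch_comment, min_score=1):
    """Delete tracked comments whose score fell below min_score.

    comment_ids: mutable set of IDs of comments the bot posted.
    fetch_comment: callable mapping an ID to an object with .score and
        .delete(), for example lambda cid: Comment(reddit, id=cid).
    Returns the list of IDs that were deleted.
    """
    deleted = []
    # Iterate over a sorted copy because the set is modified inside the loop.
    for comment_id in sorted(comment_ids):
        comment = fetch_comment(comment_id)
        if comment.score < min_score:
            comment.delete()
            comment_ids.discard(comment_id)
            deleted.append(comment_id)
    return deleted
```

Passing the lookup as a callable keeps the function independent of PRAW, so it can be exercised with stand-in objects before it is pointed at real comments.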

Step 4: Build your idea

For this article we will be building a bot that summarizes URL-only submissions. Most users do not want to click links with no description as the link might be unsafe. The functionalities that we need are:

  1. Extract articles from a URL
  2. Summarize articles to a few sentences

For these functionalities, we will be using goose3 and pysummarize. In addition, we will use Redis to store comment IDs and submission IDs, and the rfc3986 library to validate URLs.

from goose3 import Goose
from summarize import summarize


def extract_summary(url):
    g = Goose({'strict': False})
    article = g.extract(url=url)
    summary = summarize(article.cleaned_text).strip()
    return article.title, summary
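Before a URL is handed to extract_summary(), it is worth checking that the text really is a single web link. The bot uses the rfc3986 library for this; the sketch below swaps in the standard library's urllib.parse instead so that it runs without extra dependencies. The function name `is_http_url` is hypothetical.

```python
from urllib.parse import urlparse


def is_http_url(text):
    """Return True if text looks like a single absolute http(s) URL."""
    candidate = text.strip()
    # A URL-only string is non-empty and contains no internal whitespace.
    if not candidate or any(ch.isspace() for ch in candidate):
        return False
    try:
        parts = urlparse(candidate)
    except ValueError:
        # urlparse rejects some malformed inputs, such as a broken IPv6 host.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(is_http_url("https://example.com/article"))
print(is_http_url("read this: https://example.com/article"))
```

This is a looser check than full RFC 3986 validation; it only filters out text that is clearly not a lone http or https link.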

The source code for this bot is available on GitHub.

Step 5: Use the Reddit bot

Current SiteSummarizerBot functionalities include:

  1. Replying to submissions in the r/SiteSummarizerBot subreddit
  2. Replying to a comment when mentioned - Note that it is the content of the URL in the submission that will be summarized, not the content of any comment
  3. Deleting comments made by the bot when downvoted
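For the mention case in item 2, the bot has to go from the mentioning comment to the submission's link. A minimal sketch, assuming the mention is a PRAW comment whose `submission` exposes `is_self` and `url`; the function name `url_to_summarize` is hypothetical, and stand-in objects replace real PRAW models so the example runs offline.

```python
from types import SimpleNamespace


def url_to_summarize(mention):
    """Return the link of the submission a mention belongs to, or None."""
    submission = mention.submission
    # Self (text) posts have no external link to summarize.
    if submission.is_self:
        return None
    return submission.url


# Stand-in objects that mimic the PRAW attributes used above.
link_post = SimpleNamespace(is_self=False, url="https://example.com/article")
mention = SimpleNamespace(submission=link_post)
print(url_to_summarize(mention))
```

The returned URL could then be passed to extract_summary() and the result posted as a reply to the mention.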

Step 6: Register for commercial use

For commercial use, bot owners are required to request API access and register at https://www.reddit.com/wiki/api.

To wrap up

Writing Reddit bots in Python is simple, as the Reddit API does not impose too many restrictions. Just follow the rules, refrain from abuse and spam as much as possible, and respect the community guidelines.

python reddit praw redis goose3 pysummarize rfc3986

