Laravel Spam Detection (Step by Step Guide)

27 Jul, 2018 | 8 minutes read

As a PHP developer, you probably already know that Laravel is one of the most powerful and widely used MVC PHP frameworks for building full-featured web applications. With its expressive syntax and modern toolkit, Laravel is a leading choice in the industry. Laravel focuses on the end user first, which means its focus is on simplicity, clarity, and getting work done.

For more information about the Laravel framework, visit the official documentation page.
Because of these advantages, we use Laravel a lot, and we aim to follow the latest trends in the IT industry.

That’s why, in this post, we will use Laravel 5.6. You will enjoy it!

Skeleton application

For the purpose of this tutorial, a Laravel application was created from scratch with the following functionalities:

  • Login/Register functionality generated with the core auth command;
  • Some test users in the database;
  • Posts and Comments controllers, models, the relations between them, and migrations with some sample data.

Spam Detection

Spammers are a big problem nowadays. Many scripts crawl different websites and try to submit every form they come across. Have you ever wondered why? If you have a little bit of experience with SEO, you will understand that the reason is very simple.

They try to add a link (a spam backlink) that points to the spammers’ websites. With that, they build authority for their sites, which makes it easier to rank higher on Google when someone searches for keywords related to their website’s content. Search ranking depends, among other things, on a website’s backlinks and authority, and authority grows with more backlinks.

This is the main reason why we need to be careful about spammers: our applications can easily fill up with spam comments that are unrelated to the site’s content.

Spam rules that will be used

In this tutorial, spam comments on blog posts will be detected. Of course, there are a lot of rules that can be presented, but for now, the attention will be on 3 rules. Those are:

1.      Invalid keywords

This will be an array of words or sentences. Each comment that contains any of these words or sentences will be marked as spam.

2.      Key held down

Imagine that someone visits your website and just for fun holds down a key and tries to post a comment. For example: ‘aaaaaaaa’, ‘bbbbbbbbb’, etc. We should prevent this kind of action and mark those comments as spam.

3.      Comments posted very often

We should prevent users from posting comments too often. An integer configuration parameter will tell us how many minutes the user should wait before posting a new comment.

Let’s code!

To avoid burdening you with unnecessary details about the application that was set up for this tutorial, the focus will be on sharing pieces of code with clear explanations.

In this section, we will display a post using the basic request flow, built on the HTTP Foundation component that Laravel uses.

Route in web.php file:

<?php
Route::get('/blog/{slug}', 'PostsController@show');
?>

Method in PostsController:

<?php
    /**
     * Show single post
     *
     * @param  string $slug
     * @return \Illuminate\Http\Response
     */
    public function show($slug)
    {
        // Look up the post by its unique slug
        $post = Post::where('slug', $slug)->first();

        // Pass the post to the Blade view
        return view('blog.single_post', compact('post'));
    }
?>

Code explanation:

  • Find the post by its slug (a unique field in the database);
  • Return the view, passing the post as a parameter.

After that, the view is rendered: a Blade template populated with the post data.
In the view, there is a form for adding a comment to the post.
It looks like this in the browser:

[Image: a form for adding a comment to the post]

And here is the form in the Blade template (using Bootstrap 4):

<form action="/blog/{{$post->id}}/store-comment" method="POST">
    {{ csrf_field() }}
    <div class="form-group">
        <textarea class="form-control" name="body" rows="3"></textarea>
    </div>
    <button class="btn btn-primary" type="submit">
        Submit
    </button>
</form>

On submit, the action of the form is set up to post the request to the given route. The post method is handled like this in the web.php file:

<?php
Route::post('/blog/{post}/store-comment', 'CommentsController@store');
?>

And the method in CommentsController:

<?php
    /**
     * Store new comment
     *
     * @param   \App\Post $post
     * @return  \Illuminate\Http\RedirectResponse
     */
    public function store(Post $post)
    {
        try {
            $post = Post::find($post->id);

            // Create the comment through the post's comments relation
            $post->comments()->create([
                'user_id' => auth()->id(),
                'body' => request('body'),
            ]);

            return redirect("/blog/{$post->url}#comments")->with([
                'status' => 'success',
                'message' => 'Your comment was published successfully',
            ]);
        } catch (\Exception $e) {
            // Fully qualified name, so it resolves without a "use Exception;" import
            return redirect("/blog/{$post->url}#comments")->with([
                'status' => 'danger',
                'message' => $e->getMessage(),
            ]);
        }
    }
?>

Explanation of the code above:

  • Find the post by ID on which the user wants to comment;
  • Create the comment (notice the relation);
  • Redirect back to the post with a status and a message.

Hopefully everything is clear so far. Now comes the interesting part.

We will create a dedicated Spam class that handles everything related to detecting spam comments, following best practices and keeping the separation of concerns.

Let’s create a generic Spam class:

<?php

namespace App\Inspections;

class Spam
{
    /**
     * All registered inspections
     *
     * @var array
     */
    protected $inspections = [
        InvalidKeywords::class,
        KeyHeldDown::class,
        CommentsPostedVeryOften::class,
    ];

    /**
     * Detect spam
     *
     * @param  string $body
     * @return bool
     */
    public function detect($body)
    {
        foreach ($this->inspections as $inspection) {
            app($inspection)->detect($body);
        }

        return false;
    }
}

Let me explain the code above:

  • The Spam class is created in the App\Inspections namespace (the app/Inspections folder);
  • The $inspections property holds an array of all the classes with specific rules, which were separated while refactoring the code;
  • The detect() method receives the body of the comment (a string) and loops through all the classes in the $inspections property. It passes each class name to the app() helper to resolve it from the container and calls that class’s detect() method with the comment body. If no rule throws an exception, detect() returns false.

You can see that each spam rule is separated in different classes. All classes are placed in the App\Inspections folder.
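
The control flow described above can be sketched in plain PHP, outside Laravel. This is only an illustration under stated assumptions: the two stub rules, the SpamSketch class, and the check() function are hypothetical names, and the rules are instantiated with new instead of the app() helper so that the file runs on its own.

```php
<?php

// Stand-ins for the real inspections; names and rules are hypothetical.
class AlwaysPasses
{
    public function detect($body)
    {
        // No exception means the comment passed this rule.
    }
}

class RejectsEmptyBody
{
    public function detect($body)
    {
        if (trim($body) === '') {
            throw new Exception('Your comment is empty.');
        }
    }
}

// Simplified version of the Spam class: uses "new" instead of app().
class SpamSketch
{
    protected $inspections = [
        AlwaysPasses::class,
        RejectsEmptyBody::class,
    ];

    public function detect($body)
    {
        foreach ($this->inspections as $inspection) {
            // The first failing rule throws and stops the loop.
            (new $inspection)->detect($body);
        }

        // Reached only when every rule passed.
        return false;
    }
}

// Mirrors the try/catch in the controller: an exception becomes a message.
function check(string $body): string
{
    try {
        (new SpamSketch)->detect($body);

        return 'published';
    } catch (Exception $e) {
        return $e->getMessage();
    }
}

echo check('Great post!'), PHP_EOL; // published
echo check('   '), PHP_EOL;         // Your comment is empty.
```

The point of the design is that a rule signals spam only by throwing, so the caller needs a single try/catch no matter how many rules are registered.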

So let’s talk about each class and explain what they are exactly doing.

Invalid keywords rule

Here’s the code for this class:

<?php

namespace App\Inspections;

use Exception;

class InvalidKeywords
{
    /**
     * All spam keywords
     *
     * @var array
     */
    protected $keywords = [
        'customer support',
    ];

    /**
     * Detect spam
     *
     * @param  string $body
     * @throws \Exception
     */
    public function detect($body)
    {
        foreach ($this->keywords as $keyword) {
            if (stripos($body, $keyword) !== false) {
                throw new Exception("Your comment contains spam.");
            }
        }
    }

}

Code explanation:

  • The $keywords property is an array of all keywords that we recognize as spam. For now, there is only one keyword, but you can add as many as you need;
  • The Spam class calls the detect() method and passes the comment body as $body;
  • The method loops through the $keywords property and checks whether each keyword exists in the body of the comment (stripos() makes the check case-insensitive). If so, it throws an Exception.
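
To see how the keyword check behaves on real input, here is a minimal standalone sketch. The containsSpamKeyword() function is a hypothetical helper written for this illustration; it only mirrors the stripos() comparison used in the class above and returns a boolean instead of throwing.

```php
<?php

// Hypothetical helper; it mirrors the check inside InvalidKeywords::detect().
function containsSpamKeyword(string $body, array $keywords): bool
{
    foreach ($keywords as $keyword) {
        // stripos() returns the match offset (possibly 0) or false,
        // so the comparison has to be strict.
        if (stripos($body, $keyword) !== false) {
            return true;
        }
    }

    return false;
}

$keywords = ['customer support'];

var_dump(containsSpamKeyword('Call Yahoo CUSTOMER SUPPORT now', $keywords)); // bool(true)
var_dump(containsSpamKeyword('Customer support is great', $keywords));       // bool(true), match at offset 0
var_dump(containsSpamKeyword('Nice article, thanks!', $keywords));           // bool(false)
```

The second call shows why the strict !== false comparison matters: a match at the very start of the comment has offset 0, which a loose comparison would treat as "not found".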

Simple as that.

Key held down rule

Here’s the code for this class:

<?php

namespace App\Inspections;

use Exception;

class KeyHeldDown
{
    /**
     * Detect spam
     *
     * @param  string $body
     * @throws \Exception
     */
    public function detect($body)
    {
        if (preg_match('/(.)\\1{4,}/', $body)) {
            throw new Exception("Your comment contains spam.");
        }
    }
}

Code explanation:

  • The Spam class calls the detect() method and passes the comment body as $body;
  • A regular expression checks whether the body of the comment contains the same character more than 4 times in a row. The limit is 4 for the purposes of this example; you can change it to fit your needs;
  • If the condition is met, the method throws an Exception.
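
A quick way to check where the threshold lies is to run the same pattern against a few strings. The isKeyHeldDown() function below is a hypothetical helper for this illustration; only the regular expression is taken from the class above.

```php
<?php

// Hypothetical helper; the pattern is the one used in KeyHeldDown::detect().
function isKeyHeldDown(string $body): bool
{
    // (.) captures one character and \1{4,} requires at least four more
    // copies of it, so five identical characters in a row trigger a match.
    return preg_match('/(.)\\1{4,}/', $body) === 1;
}

var_dump(isKeyHeldDown('aaaa'));          // bool(false): only four in a row
var_dump(isKeyHeldDown('aaaaa'));         // bool(true): five in a row
var_dump(isKeyHeldDown('Goood post'));    // bool(false)
var_dump(isKeyHeldDown('Helloooooo!!!')); // bool(true)
```

To make the rule stricter or looser, change the lower bound in the {4,} quantifier.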

Comments posted very often rule

Here’s the code for this class:

<?php

namespace App\Inspections;

use App\User;
use Carbon\Carbon;
use Exception;

class CommentsPostedVeryOften
{
    /**
     * Detect spam
     *
     * @param  string $body
     * @throws \Exception
     */
    public function detect($body)
    {
        $user = User::find(auth()->id());

        $latestComment = $user->getLatestCommentByUser();

        if ($latestComment) {
            $data = $this->prepareCommonData($latestComment);

            if ($user->canUserPostComment($data)) {
                throw new Exception("You can post only once in {$data["userCommentFrequency"]} minutes.");
            }
        }
    }

    /**
     * Prepare common data
     *
     * @param  \Illuminate\Database\Eloquent\Model $latestComment
     * @return array
     */
    public function prepareCommonData($latestComment)
    {
        return [
            'latestCommentCreated' => new Carbon($latestComment->created_at),
            'userCommentFrequency' => config('app.spam_detection.user_can_comment_once_in'),
        ];
    }
}

Code explanation:

  • The Spam class calls the detect() method and passes the comment body as $body;
  • Find the logged-in user;
  • Get the latest comment by the logged-in user;
    • This logic lives in the getLatestCommentByUser() method, declared in the User model:
<?php
    /**
     * Get the latest comment by the user
     *
     * @return \Illuminate\Database\Eloquent\Model|null
     */
    public function getLatestCommentByUser()
    {
        return $this->comments()->latest()->first();
    }
?>
  • If the user has a previous comment, the canUserPostComment() method, also declared in the User model, checks whether the user has to wait before posting again. It receives the prepared data: the creation date of the latest comment and a configuration parameter from the app.php file that defines how often a user can post a comment. For now, it is once in 10 minutes.
    • Below is the code from the canUserPostComment() method:
<?php
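    /**
     * Check whether the latest comment is too recent
     *
     * Note: despite its name, this method returns true when the user
     * still has to wait, which is why detect() throws on a true result.
     *
     * @param  array $data
     * @return bool
     */
    public function canUserPostComment($data)
    {
        return $data['latestCommentCreated']->diffInMinutes() < $data['userCommentFrequency'];
    }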
?>
  • If the user posted a comment less than 10 minutes ago, the condition is met, and the code throws an Exception with a descriptive message.
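
The configuration entry itself is not shown in this post. Judging by the key passed to config() in prepareCommonData(), it would need to look roughly like the fragment below in config/app.php. Treat this as a sketch: the nesting is inferred from the dotted key, and 10 is the value mentioned above.

```php
<?php

// config/app.php (fragment; all other keys are omitted here)
return [
    'spam_detection' => [
        // Minimum number of minutes between two comments by the same user.
        'user_can_comment_once_in' => 10,
    ],
];
```

With this entry in place, config('app.spam_detection.user_can_comment_once_in') returns 10.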

There is only one final step before finishing this feature. To call the Spam class and check the comment against the spam rules, add one more line of code to the store() method of the CommentsController. This is the line of code:

<?php
resolve(Spam::class)->detect(request('body'));
?>

And after that, the store() method will look like this:

<?php
    /**
     * Store new comment
     *
     * @param   \App\Post $post
     * @return  \Illuminate\Http\RedirectResponse
     */
    public function store(Post $post)
    {
        try {
            // Throws an Exception if any spam rule matches
            resolve(Spam::class)->detect(request('body'));

            $post = Post::find($post->id);

            $post->comments()->create([
                'user_id' => auth()->id(),
                'body' => request('body'),
            ]);

            return redirect("/blog/{$post->url}#comments")->with([
                'status' => 'success',
                'message' => 'Your comment was published successfully',
            ]);
        } catch (\Exception $e) {
            // Fully qualified name, so it resolves without a "use Exception;" import
            return redirect("/blog/{$post->url}#comments")->with([
                'status' => 'danger',
                'message' => $e->getMessage(),
            ]);
        }
    }
?>

Less code is usually preferable, which is why the resolve() helper that Laravel offers is used here. It resolves a given class name to its instance using the service container.
There are other ways to call the detect() method of the Spam class: you can type-hint the class, or create a new instance of the Spam class and then call detect() on it. However, the approach used in the code above works fine for now.

Conclusion

The example in this post is a simple and easy-to-implement solution. Of course, as mentioned above, there are many other rules that you can add, and with this code structure it is very easy to add a new rule based on your needs.
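
As an illustration of adding a rule, here is a sketch of a hypothetical fourth inspection that limits the number of links in a comment. The TooManyLinks class, its $maxLinks value, and the link pattern are assumptions made for this example and are not part of the application described above. In the application, the class would live in the App\Inspections namespace and be appended to the $inspections array of the Spam class; the namespace is left out here so the file runs on its own.

```php
<?php

// Hypothetical rule, not part of the original tutorial code.
class TooManyLinks
{
    /**
     * Maximum number of links allowed in one comment (assumed value)
     *
     * @var int
     */
    protected $maxLinks = 2;

    /**
     * Detect spam
     *
     * @param  string $body
     * @throws \Exception
     */
    public function detect($body)
    {
        // preg_match_all() returns the number of full pattern matches.
        $links = preg_match_all('~https?://~i', $body);

        if ($links > $this->maxLinks) {
            throw new Exception('Your comment contains spam.');
        }
    }
}

$rule = new TooManyLinks();

// One link is below the limit, so this call passes silently.
$rule->detect('See https://laravel.com for the docs.');

try {
    $rule->detect('http://a.example http://b.example https://c.example');
} catch (Exception $e) {
    echo $e->getMessage(), PHP_EOL; // Your comment contains spam.
}
```

Because every inspection exposes the same detect($body) method, registering the class in $inspections is the only change the Spam class needs.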

You can see how enjoyable coding with the Laravel framework is. That’s why Laravel is preferred and recommended for all kinds of web applications, from simple hobby projects all the way to Fortune 500 companies.

Please feel free to comment below and share your thoughts.