Crashing Servers: A Stable Profanity Filter Solution

Brian Pontarelli

We have been hearing from prospective customers that their profanity filters have been crashing their servers. Sure, there are some good jokes about what people tend to do when servers go down, but in reality, when servers crash, users get angry, and angry users can have a big impact on your business. We have compiled two lists of suggestions: one for picking a filter that won't crash, and one for protecting your community if it does.

Picking a Good Filter

Picking a good profanity filter is important. Not only should you select a filter that is accurate and customizable, but you should also pick a filter that can scale. Here are five things you should look for to ensure your profanity filter won't crash your servers.

1. On-Premises

The best way to reduce your risk of crashes is to use a filter that can scale. Filters that run on-premises, inside your own network, avoid a round trip to a third-party service, so they generally provide the lowest latency and the highest throughput. This means that even if you have massive spikes in traffic, the filter is far more likely to keep up.

2. Avoid Regular Expressions

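Most naive filters use regular expressions to filter your content. Regular expressions are often slow and can be memory intensive, especially when a large word list is compiled into one big pattern or when a pattern backtracks heavily on unusual input. If you are selecting a profanity filtering technology, be sure to verify that your preferred solution doesn't use regular expressions. Instead, look for a solution that uses a linear rule-based filter, meaning one whose processing time grows in proportion to the length of the text.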
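To make "linear" a little more concrete, here is a minimal sketch in Python. It is not how any particular product works; the word list and the function name `find_blocked_words` are made up for illustration. It looks at each token once and checks it against a set, so no regular expression engine is involved.

```python
import string

# Hypothetical word list for illustration only; a real filter would load
# a maintained, customizable list.
BLOCKED_WORDS = {"darn", "heck"}


def find_blocked_words(text, blocked=BLOCKED_WORDS):
    """Return the blocked words found in text, in order of appearance.

    Each token is examined once and checked with a set lookup, so the
    work grows roughly in proportion to the length of the text.
    """
    matches = []
    for token in text.lower().split():
        # Drop leading and trailing punctuation such as "," or "!".
        word = token.strip(string.punctuation)
        if word in blocked:
            matches.append(word)
    return matches
```

A sketch like this misses embedded words and deliberate misspellings, which is where a real rule-based filter earns its keep. The point is only that the scan itself stays predictable as messages get longer.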

3. Don’t Let Watson Do Your Filtering

Similar to regular expressions, you should avoid filters that use artificial intelligence (AI). In most cases, filters that use AI don't handle spikes or large loads well. They also require expensive hardware and constant tuning to work properly. Instead, you should find a profanity filter that scales well on value-priced hardware and doesn't require a supercomputer to find f-bombs.

4. Load Test

Load testing is a vital component of any project. Verify that the filter you use has been extensively load tested. A good load test should show the filter handling hundreds of concurrent client requests and sustaining a throughput of at least 20,000 requests per second. These numbers should hold over long periods of time without the filter crashing. This will ensure that any spikes you encounter are easily managed by the filter.
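If you want to run a quick check of your own, a load test harness can be sketched in a few lines of Python. Everything here is an assumption made for illustration: `run_load_test` and `fake_filter` are hypothetical names, and `fake_filter` stands in for whatever code actually calls your filter.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_filter(message):
    """Hypothetical stand-in for the real call to your filter."""
    return message


def run_load_test(call_filter, total_requests, concurrency):
    """Call call_filter(message) total_requests times from a pool of
    `concurrency` threads and report errors and throughput."""

    def one_request(i):
        try:
            call_filter(f"message {i}")
            return True
        except Exception:
            # Any exception from the filter call counts as a failed request.
            return False

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(one_request, range(total_requests)))
    elapsed = time.perf_counter() - started

    succeeded = sum(results)
    return {
        "requests": total_requests,
        "errors": total_requests - succeeded,
        "requests_per_second": (
            total_requests / elapsed if elapsed > 0 else float("inf")
        ),
    }
```

Calling `run_load_test(fake_filter, 10_000, 200)` returns a small report you can log or graph. A single Python process is unlikely to generate anything close to 20,000 requests per second by itself, so treat this as a smoke test and use a dedicated load testing tool, running for hours rather than seconds, for real numbers.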

5. Standards

Using a filter that requires a proprietary protocol and a custom-built client opens you up to the possibility of crashes. If the protocol implementation or the client library has bugs, they could easily crash your server. Instead, use a filter that conforms to widely adopted standards and conventions such as HTTP and REST. These filters are much less likely to fail on you because HTTP and REST are among the most widely used approaches in the world and have been thoroughly tested. Most programming languages come with HTTP libraries out of the box, and there are also hundreds of open source libraries available. As an added bonus, using a filter that is built on top of HTTP and REST can reduce your development costs and integration time as well.

Prevent Failures

Even if you use an awesome filter, the server it is running on might crash, or other problems might cause it to stop working. Here are three ways to ensure that your community isn't impacted on the off chance your filter crashes.

1. Timeouts

Ensure that the part of your code that calls the filter uses a timeout. Specifying a timeout prevents a slow or dead filter from tying up your server's request threads and causing cascading failures. Depending on the filter you are using, setting timeouts to around 50 ms is a good idea. Although this number might seem high to some people, remember that a timeout is the absolute upper limit for any request, not the expected response time.
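Here is a minimal sketch of a filter call with a timeout, using only Python's standard library. The URL, the JSON payload shape, and the function name `filter_message` are assumptions for illustration; substitute whatever your filter's documentation specifies.

```python
import http.client
import json
import urllib.request

# Hypothetical endpoint and payload shape; not any real product's API.
FILTER_URL = "http://localhost:8001/filter"


def filter_message(text, url=FILTER_URL, timeout_seconds=0.05):
    """Return the filter's parsed JSON reply, or None if the call fails
    or does not complete in time."""
    body = json.dumps({"content": text}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, http.client.HTTPException, ValueError):
        # URLError and socket timeouts are OSError subclasses;
        # ValueError covers a reply that is not valid JSON.
        return None
```

Note that `urlopen` applies the timeout to each blocking socket operation (the connection attempt, each read) rather than to the request as a whole, so a hard overall deadline needs an extra guard. What you do when `filter_message` returns `None` is a policy decision: some communities hold the message back, others let it through unfiltered.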

2. Automatic Throttling

In addition to adding timeouts, you should also include an automatic throttling system. This system should automatically stop sending messages to the filter if a certain number of errors occur in a specific time frame. For example, you might decide that 10 errors in 1 minute indicate a filter problem. In this case, you might want to stop sending requests to the filter for 5 minutes before trying again.
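This pattern is often called a circuit breaker. The sketch below uses the numbers from the example above (10 errors in 1 minute, then a 5 minute pause) as defaults. The class name `FilterThrottle` and its methods are hypothetical, and the sketch is not thread-safe as written.

```python
import time
from collections import deque


class FilterThrottle:
    """Stop calling the filter for a while after too many recent errors."""

    def __init__(self, max_errors=10, window_seconds=60, pause_seconds=300,
                 clock=time.monotonic):
        self._max_errors = max_errors
        self._window = window_seconds
        self._pause = pause_seconds
        self._clock = clock           # injectable so tests can fake time
        self._errors = deque()        # timestamps of recent errors
        self._paused_until = None

    def allow_request(self):
        """Return True if it is currently OK to call the filter."""
        now = self._clock()
        if self._paused_until is not None:
            if now < self._paused_until:
                return False
            # The pause has elapsed: start over with a clean error history.
            self._paused_until = None
            self._errors.clear()
        return True

    def record_error(self):
        """Record a failed filter call and pause if the limit is reached."""
        now = self._clock()
        self._errors.append(now)
        # Forget errors that have fallen out of the time window.
        while self._errors and now - self._errors[0] > self._window:
            self._errors.popleft()
        if len(self._errors) >= self._max_errors:
            self._paused_until = now + self._pause
```

In use, the calling code checks `allow_request()` before each filter call and calls `record_error()` whenever the call fails or times out. If several threads share one throttle, guard both methods with a lock.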

3. Kill Switches

As a last resort, you should build in a global switch that disables chat (or other social features) in your community. If for some reason you see a problem with the filter (or anything else for that matter), you might want to turn off all social features until the problem can be resolved. Also, make this switch work as quickly as possible. You don't want to wait 30 minutes for your kill switch to take effect.
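A kill switch can be as simple as a flag that every social feature checks before doing any work. The sketch below is a single-process illustration with hypothetical names (`KillSwitch`, `handle_chat_message`). With more than one server, the flag would need to live somewhere all of them can read quickly, such as a shared configuration store.

```python
import threading


class KillSwitch:
    """A process-wide on/off flag for social features."""

    def __init__(self):
        self._disabled = threading.Event()

    def disable_social_features(self):
        self._disabled.set()

    def enable_social_features(self):
        self._disabled.clear()

    def social_features_enabled(self):
        return not self._disabled.is_set()


def handle_chat_message(switch, message, deliver):
    """Deliver the message unless the kill switch is on.

    Returns True if the message was delivered, False if it was dropped.
    """
    if not switch.social_features_enabled():
        return False
    deliver(message)
    return True
```

Because the flag is checked on every message, flipping it takes effect on the very next request rather than after a redeploy or a cache expiry.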

Summary

Using these suggestions and tips for picking a good filter and integrating it properly can nearly eliminate the possibility of your profanity filter crashing your servers. This will help keep your users happy and help your community thrive and grow.
