At Inversoft, we like open source and we like Java.
When we built out our platform to support our new cloud product offerings, we started using Chef to help us manage our deployment strategy.
As we began working on some new backend features for those offerings, I set out to find a Chef client written in Java in order to simplify our integration.
As luck wouldn’t have it (yes, you read that correctly), I was unable to find a Java library that really made my life easier. There are other Chef libraries out there, but all of the ones I found were very lightweight wrappers around HTTP calls. Some went so far as to return the JSON response from the Chef server as a String rather than a proper POJO.
Rather than limping along with a library that was essentially a glorified URLConnection, I did what any software engineer would do: I wrote it myself.
Behold Barista! A native binding for Chef that provides rich domain objects and REST bindings to work with a Chef server.
Building a properly authenticated HTTP request to Chef is not great fun, so I don’t suggest you do it yourself unless you enjoy the pain. We’ve done the heavy lifting for you, and we did it without using any third-party encryption libraries. This means you can pick up this library without dragging along any unnecessary dependencies like Bouncy Castle.
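To give a feel for what that heavy lifting involves, here is a rough sketch of header signing using only the JDK. It follows the version 1.0 signing scheme as I understand it from the Chef server API documentation; it is not Barista's actual code, and the class name `ChefRequestSigner` and its method names are made up for illustration. Check the details against the Chef documentation before relying on any of it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.PrivateKey;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.crypto.Cipher;

// Hypothetical helper, not part of Barista: a sketch of Chef's v1.0 header signing.
public class ChefRequestSigner {

  // Base64-encoded SHA-1 digest of a UTF-8 string.
  static String sha1Base64(String value) throws Exception {
    MessageDigest digest = MessageDigest.getInstance("SHA-1");
    return Base64.getEncoder().encodeToString(digest.digest(value.getBytes(StandardCharsets.UTF_8)));
  }

  // Builds the X-Ops-* headers for one request. The timestamp is expected in
  // ISO-8601 UTC form, for example 2024-01-01T00:00:00Z.
  static Map<String, String> signedHeaders(String method, String path, String body,
                                           String userId, String timestamp,
                                           PrivateKey key) throws Exception {
    String contentHash = sha1Base64(body);

    // Canonical request: five lines joined by newlines, no trailing newline.
    String canonical = "Method:" + method + "\n"
        + "Hashed Path:" + sha1Base64(path) + "\n"
        + "X-Ops-Content-Hash:" + contentHash + "\n"
        + "X-Ops-Timestamp:" + timestamp + "\n"
        + "X-Ops-UserId:" + userId;

    // Encrypt the canonical request with the private key (PKCS#1 v1.5 padding).
    Cipher cipher = Cipher.getInstance("RSA/ECB/PKCS1Padding");
    cipher.init(Cipher.ENCRYPT_MODE, key);
    byte[] signed = cipher.doFinal(canonical.getBytes(StandardCharsets.UTF_8));
    String signature = Base64.getEncoder().encodeToString(signed);

    Map<String, String> headers = new LinkedHashMap<>();
    headers.put("X-Ops-Sign", "algorithm=sha1;version=1.0");
    headers.put("X-Ops-Userid", userId);
    headers.put("X-Ops-Timestamp", timestamp);
    headers.put("X-Ops-Content-Hash", contentHash);

    // The signature is split into numbered headers of at most 60 characters each.
    for (int start = 0, n = 1; start < signature.length(); start += 60, n++) {
      int end = Math.min(start + 60, signature.length());
      headers.put("X-Ops-Authorization-" + n, signature.substring(start, end));
    }
    return headers;
  }
}
```

A real request needs more than this (loading the PEM key, additional headers such as the Chef version, and the HTTP call itself), which is exactly the kind of plumbing a client library should hide from you.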
CleanSpeak can filter many types of user-generated content (e.g., chat messages, forum posts and reviews). Running this material through CleanSpeak on a “per message” basis ensures each piece of content is acceptable before allowing it to be seen in your community. Filtering by message makes sense for these specific use cases. But what if you have big data that you want to filter as a whole?
According to Wikipedia, batch processing is the execution of a series of jobs in a program on a computer without manual intervention (non-interactive). Strictly speaking, it is a processing mode: the execution of a series of programs each on a set or "batch" of inputs, rather than a single input (which would instead be a custom job).
So when might you consider batch processing?
Maybe you purchased a list of names and addresses and want to make sure they don’t contain any vulgar language before including them in your marketing campaign?
Perhaps you allow users to upload files and want to make sure they don’t contain inappropriate content?
Or you gather a list of reviews and want to check them all at once to ensure the language is acceptable before posting to your site?
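In code, a batch run can be as simple as looping over every item and sorting it into an accepted or rejected pile. The sketch below is only an illustration: `BatchFilter` is a made-up class, and the `isAcceptable` predicate is a stand-in for whatever filter call you actually use. It is not the CleanSpeak API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical batch runner: checks a whole list of items in one pass.
public class BatchFilter {

  // Holds the outcome of one batch run.
  public static final class Result {
    public final List<String> accepted = new ArrayList<>();
    public final List<String> rejected = new ArrayList<>();
  }

  // Applies the supplied check to every item, with no manual intervention.
  // The predicate is an assumption standing in for a real filter call.
  public static Result run(List<String> items, Predicate<String> isAcceptable) {
    Result result = new Result();
    for (String item : items) {
      if (isAcceptable.test(item)) {
        result.accepted.add(item);
      } else {
        result.rejected.add(item);
      }
    }
    return result;
  }
}
```

For example, `BatchFilter.run(reviews, text -> !text.toLowerCase().contains("badword"))` would split a list of reviews using a deliberately simplistic placeholder check; in practice the predicate would delegate to a real filter.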
The “build vs buy” decision is paramount when discussing a company’s software needs. Building custom software solutions can provide a host of benefits, but it often comes at a cost. An intelligent profanity filtering and moderation platform is a significant investment; building a comprehensive profanity filter could involve years of development time accruing significant costs. Consider the following factors when deciding whether to purchase an existing profanity filtering technology or build it internally:
When building proprietary software, you retain control of all aspects of product design, allowing you to create a customized solution that best fits your company's needs.
- Control enhancements and development schedule
- Avoid the costs associated with software license fees – and in some cases maintenance and support fees
- Fully customize to fit your project scope and needs
It is important to remember that by building your own profanity filter you assume the risk if it fails.
The talented team at AOL implemented an internal profanity filter and was embarrassed by the now infamous Scunthorpe problem. Years later, the filters at Google and Facebook were stumped by the same issue. These filter issues produced scores of false positives, which required significant man-hours in moderation support to address.
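To see how this class of false positive arises, consider a filter that simply looks for blocked strings anywhere in the text. The sketch below uses a mild stand-in word and a made-up class name, `ScunthorpeDemo`; it illustrates the general problem and is not how any of the filters mentioned above, or CleanSpeak, actually work.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo of substring matching versus whole-word matching.
public class ScunthorpeDemo {

  // A one-word block list, using a mild word as a stand-in.
  static final List<String> BLOCKED = Arrays.asList("hell");

  // Naive approach: flags the text if a blocked word appears anywhere,
  // even inside an innocent longer word.
  static boolean naiveMatch(String text) {
    String lower = text.toLowerCase();
    for (String word : BLOCKED) {
      if (lower.contains(word)) {
        return true;
      }
    }
    return false;
  }

  // Slightly better: only flags the blocked word when it stands alone.
  static boolean wholeWordMatch(String text) {
    for (String word : BLOCKED) {
      Pattern pattern = Pattern.compile("\\b" + Pattern.quote(word) + "\\b", Pattern.CASE_INSENSITIVE);
      if (pattern.matcher(text).find()) {
        return true;
      }
    }
    return false;
  }
}
```

With this block list, `naiveMatch("Michelle works at the Shell station")` flags a perfectly innocent sentence, while `wholeWordMatch` lets it through. Word boundaries alone are far from a complete answer, since users can still evade them with spacing, punctuation or character substitutions, but the example shows why plain substring matching generates so many false positives.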
When you purchase such a solution you get the benefit of a professionally developed and vetted technology with years of market use and added intelligence.
- Offloading filtering and moderation allows you to focus resources on core product features – essential to the long-term health of your business
- The software has been used and trusted by well-known brands with strict quality requirements
- Consistent product upgrades and new features
- Complete product support and software maintenance. When bugs or errors are discovered, you can rely on the vendor to troubleshoot and fix them rather than exhaust internal resources
- Quick deployment time
- You have a technology partner who is focused on helping you succeed and providing a better user experience
Building a filter requires extensive knowledge of natural language processing and language rules. A filter that fails to understand complex language produces misses and false positives, both of which can be damaging to a brand. Get peace of mind with a proven solution.
It is easy to overlook the cost and time involved in developing a new technology as complex as a profanity filter. A homegrown solution requires costly development time and ongoing maintenance, moderation and support. In contrast, a purchased solution can be integrated and running in just days; it provides continual upgrades based on market insights and advancements.
Profanity Filter Solutions
There is a range of profanity filtering solutions available on the market. Here are some of the reasons teams choose CleanSpeak.
Minimized False Positives. We have been working on our filtering technology for 9 years and continue to improve on it each year. We solve the Scunthorpe problem, handle all leet speak and automatically build inflections. In addition, we parse and filter BBCode markup language.
Superior Speed. Response times for our profanity filter average under 5 milliseconds, allowing significant throughput to support requests at peak volume without hindering the user experience.
Cloud or On-Premise. CleanSpeak has flexible hosting options to best suit your InfoSec needs.
Free. Test out CleanSpeak with a free 14-day trial. No credit card required.
We have a dedicated team of developers whose number one goal is to maintain the most accurate and efficient profanity filter on the market. Using CleanSpeak lets you devote more time to building your application, rather than your own filter.
More information about CleanSpeak can be found on the Inversoft website. Don't hesitate to contact us with any questions you might have.
Image Moderation Just Got Faster
As applications, websites and online communities continue to expand, user-generated content becomes difficult to manage. Nonetheless, a moderation solution is critical for sites that rely on users to succeed. Companies often focus on filtering chat, URLs and personally identifiable information. It is important to remember that images can be just as harmful to a brand and its user community.
Uncensored images are making their way to children via various platforms due to deficient moderation or lack of moderation altogether. Seven out of ten youths have accidentally come across pornography online.
According to The Independent, Facebook has banned users from boosting posts with the word Scunthorpe.
This left the band October Drift enraged when trying to promote their Scunthorpe show. John Jarman, an advertiser from Scunthorpe, also had a similar experience and shared his opinion on the topic:
"My ad not approved because of the word Scunthorpe. Seriously, Facebook, are your algorithms written by 5-year-olds? I don't need to see what is and isn't approved – there's nothing wrong with the advert, it's just the fact that word Scunthorpe is in it. As soon as I type the word 'Scunthorpe' I get an immediate warning that my ad contains inappropriate language."
This is not the first time this word has proven difficult to properly classify. In fact, Scunthorpe is rather infamous in the world of profanity filtering.