Overview
The Malignant Comment Classifier uses Natural Language Processing (NLP) and machine learning to identify and moderate toxic, hateful, and abusive language on social media platforms, helping keep online communities safer.
Objective
- Detect and classify online comments into toxicity levels
- Minimize false positives to maintain freedom of expression
- Enable real-time moderation using a lightweight model
Dataset
The project uses Kaggle’s Toxic Comment Classification Challenge dataset, which contains over 159,000 comments labeled under six categories (a quick label-count sketch follows the list):
- toxic
- severe_toxic
- obscene
- threat
- insult
- identity_hate
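The labels are not mutually exclusive: a single comment can carry several at once. A quick way to inspect label prevalence, assuming the standard Kaggle train.csv layout:

import pandas as pd

data = pd.read_csv("train.csv")
label_cols = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Positive counts per label, and how many comments are flagged at all
print(data[label_cols].sum())
print("Flagged comments:", (data[label_cols].sum(axis=1) > 0).sum())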
Data Preprocessing
Text was cleaned and prepared using regular expressions, NLTK stopword removal, and TF-IDF vectorization for feature extraction.
import re
import pandas as pd
from nltk.corpus import stopwords
from sklearn.feature_extraction.text import TfidfVectorizer

data = pd.read_csv("train.csv")

# Lowercase each comment and replace everything except letters with spaces
data["comment_text"] = data["comment_text"].apply(
    lambda x: re.sub('[^a-zA-Z]', ' ', x.lower())
)

# TF-IDF features with NLTK's English stop-word list
# (requires a one-time nltk.download("stopwords"))
vectorizer = TfidfVectorizer(max_features=10000, stop_words=stopwords.words("english"))
X = vectorizer.fit_transform(data["comment_text"])
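The result is a sparse document-term matrix with one row per comment; a quick sanity check (get_feature_names_out assumes scikit-learn 1.0 or newer):

# One row per comment, up to 10,000 TF-IDF columns (stored sparse)
print(X.shape)
print(vectorizer.get_feature_names_out()[:5])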
Model Development
A Logistic Regression classifier was used for multi-label classification, with one binary classifier fitted per label (one-vs-rest) so the model predicts a toxicity probability for each label independently.
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score

# Multi-label targets: one binary column per toxicity category
label_cols = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]
y = data[label_cols]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# One-vs-rest: an independent logistic regression is fitted per label
model = OneVsRestClassifier(LogisticRegression(max_iter=200))
model.fit(X_train, y_train)

pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, pred))  # subset accuracy over all six labels
print(classification_report(y_test, pred, target_names=label_cols))
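The ROC-AUC figure reported below can be reproduced from the model's probability outputs; a minimal sketch, assuming the one-vs-rest setup above:

from sklearn.metrics import roc_auc_score

# Macro-averaged ROC-AUC across the six labels, from per-label probabilities
probs = model.predict_proba(X_test)
print("ROC-AUC:", roc_auc_score(y_test, probs, average="macro"))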
Performance Metrics
- Accuracy: 95.8%
- Cross-validation accuracy: 95.82%
- ROC-AUC: ≈ 0.97
The model achieved high performance, effectively distinguishing between toxic and non-toxic comments, with balanced precision and recall.
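The cross-validation figure can be reproduced with scikit-learn's cross_val_score; a sketch assuming 5-fold splits (the fold count is not stated in the original):

from sklearn.model_selection import cross_val_score

# For multi-label targets, 'accuracy' here means exact-match (subset) accuracy
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print("CV accuracy: %.2f%% (+/- %.2f)" % (scores.mean() * 100, scores.std() * 100))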
Deployment
The trained model was serialized using pickle for web deployment.
It can be integrated into REST APIs or moderation systems to detect toxicity in real time.
import pickle

# Serialize the trained classifier for reuse in a web service
with open('malignant_model.pkl', 'wb') as f:
    pickle.dump(model, f)
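At inference time the fitted TF-IDF vectorizer is needed as well, so it is worth pickling alongside the model. A minimal loading-and-scoring sketch; the vectorizer file name and the score_comment helper are illustrative, not part of the original project:

import pickle
import re

# Persist the fitted vectorizer too (file name is an assumption)
with open('tfidf_vectorizer.pkl', 'wb') as f:
    pickle.dump(vectorizer, f)

# --- inside the moderation service ---
with open('malignant_model.pkl', 'rb') as f:
    model = pickle.load(f)
with open('tfidf_vectorizer.pkl', 'rb') as f:
    vectorizer = pickle.load(f)

def score_comment(text):
    """Return per-label toxicity probabilities for one comment."""
    cleaned = re.sub('[^a-zA-Z]', ' ', text.lower())  # mirror training-time cleaning
    return model.predict_proba(vectorizer.transform([cleaned]))[0]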
Conclusion
This project demonstrates the power of classic ML models in NLP-based moderation. Future work includes upgrading to transformer-based models like BERT and incorporating multilingual datasets for broader coverage.