Fake News Detection Project in Python

Here's a simple example of fake news detection using machine learning in Python.

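We'll use TfidfVectorizer to turn the article text into numerical features and PassiveAggressiveClassifier to classify each article, both from the scikit-learn library, along with pandas to load the data.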

First, let's import the required libraries:

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.linear_model import PassiveAggressiveClassifier

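Next, let's load the dataset into a pandas DataFrame and split it into training data (80%) and testing data (20%). The code expects news.csv to have a text column with the article body and a label column with the class of each article: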

# Load dataset into a pandas DataFrame
df = pd.read_csv('news.csv')

# Split the data into training and testing data (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(df['text'], df['label'], test_size=0.2, random_state=42)
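The split above is purely random, so the class balance of the two parts can drift on a small or skewed dataset. A minimal sketch of two optional checks follows: validating the expected columns and passing stratify to train_test_split so both parts keep the same label proportions. The toy DataFrame and the FAKE/REAL label values are assumptions for illustration; the real news.csv may use different labels.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for news.csv: 10 rows with balanced, assumed labels.
toy = pd.DataFrame({
    "text": [f"article number {i}" for i in range(10)],
    "label": ["FAKE", "REAL"] * 5,
})

# Fail early if the columns the tutorial relies on are missing.
missing = {"text", "label"} - set(toy.columns)
if missing:
    raise ValueError(f"missing columns: {missing}")

# Drop rows without text or label; the vectorizer cannot handle NaN.
toy = toy.dropna(subset=["text", "label"])

# stratify keeps the FAKE/REAL ratio the same in both parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    toy["text"], toy["label"],
    test_size=0.2, random_state=42, stratify=toy["label"],
)

print(len(X_tr), len(X_te))  # 8 training rows, 2 test rows
print(y_te.value_counts())
```

On the real dataset the same pattern would apply with df in place of toy.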

After splitting the data, we'll convert the text data into numerical form using TfidfVectorizer:

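# Initialize TfidfVectorizer: drop English stop words and ignore terms
# that appear in more than 70% of the documents
tfidf_vectorizer = TfidfVectorizer(stop_words='english', max_df=0.7)

# Fit the vocabulary on the training data and transform it
tfidf_train = tfidf_vectorizer.fit_transform(X_train)

# Transform the testing data with the vocabulary learned from training
tfidf_test = tfidf_vectorizer.transform(X_test)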
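To see what the vectorizer settings actually do, here is a small sketch on a made-up corpus (the sentences are invented for illustration). It shows that stop words are dropped, that a term present in more than 70% of the training documents is pruned by max_df, and that words seen only at test time are ignored because the vocabulary is fixed during fit_transform.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented toy corpus; "news" appears in every training document.
train_docs = [
    "news senate passed the budget",
    "news aliens endorsed the budget",
    "news senate vote delayed",
]
test_docs = ["news aliens delayed the election"]

vec = TfidfVectorizer(stop_words="english", max_df=0.7)
train_matrix = vec.fit_transform(train_docs)  # learns the vocabulary
test_matrix = vec.transform(test_docs)        # reuses it, learns nothing new

vocab = sorted(vec.vocabulary_)
print(vocab)
# "the" is a stop word, "news" exceeds max_df, "election" was never seen
# in training, so none of them is part of the vocabulary.
print(train_matrix.shape)  # (3, 7): 3 documents, 7 kept terms
print(test_matrix.nnz)     # 2: only "aliens" and "delayed" are known terms
```

Fitting only on the training split, as the tutorial does, keeps information from the test articles out of the features.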

Then, we'll train our model using the PassiveAggressiveClassifier algorithm:

# Initialize PassiveAggressiveClassifier
# (max_iter=50 caps the number of passes over the training data)
pac = PassiveAggressiveClassifier(max_iter=50)

# Fit the model
pac.fit(tfidf_train, y_train)

# Predict on the testing data
y_pred = pac.predict(tfidf_test)
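The name of the classifier describes its update rule: it stays passive when an example is classified correctly with enough margin, and turns aggressive (moves the weights just enough to fix the mistake) otherwise. The following is a sketch of the textbook PA-I update for binary labels, written with NumPy for illustration. It is not scikit-learn's internal implementation, and the function name pa_update and the toy vectors are hypothetical.

```python
import numpy as np

def pa_update(w, x, y, C=1.0):
    """One textbook PA-I step for a label y in {-1, +1} (illustrative only)."""
    # Hinge loss: zero when the example is on the right side with margin >= 1.
    loss = max(0.0, 1.0 - y * float(np.dot(w, x)))
    if loss == 0.0:
        return w  # passive: the example is already handled, keep the weights
    # Aggressive: step size grows with the loss but is capped by C.
    tau = min(C, loss / float(np.dot(x, x)))
    return w + tau * y * x

w0 = np.zeros(2)
x = np.array([1.0, 1.0])

w1 = pa_update(w0, x, +1)  # loss 1, tau = min(1, 1/2) -> weights [0.5, 0.5]
w2 = pa_update(w1, x, +1)  # margin is now exactly 1 -> no change
w3 = pa_update(w1, x, -1)  # loss 2, tau = min(1, 2/2) -> weights [-0.5, -0.5]
print(w1, w2, w3)
```

The cap C plays the role of a regularization strength: a smaller C makes each correction more cautious.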

Finally, we'll evaluate the model's performance by calculating the accuracy score and the confusion matrix:

# Calculate the accuracy score
score = accuracy_score(y_test, y_pred)
print(f'Accuracy: {round(score*100,2)}%')

# Calculate and print the confusion matrix
# (rows are true labels, columns are predicted labels)
print(confusion_matrix(y_test, y_pred))
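A confusion matrix is easier to read when the row and column order is fixed explicitly. Here is a small worked example on hand-written predictions; the FAKE/REAL label names are assumed for illustration and may differ from the values in the actual dataset.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hand-written toy labels (assumed names), not results from the model above.
y_true = ["FAKE", "FAKE", "REAL", "REAL", "REAL", "FAKE"]
y_hat = ["FAKE", "REAL", "REAL", "REAL", "FAKE", "FAKE"]

# labels= fixes the order of rows (true) and columns (predicted).
labels = ["FAKE", "REAL"]
cm = confusion_matrix(y_true, y_hat, labels=labels)
acc = accuracy_score(y_true, y_hat)

print(cm)
# [[2 1]   true FAKE: 2 predicted FAKE, 1 predicted REAL
#  [1 2]]  true REAL: 1 predicted FAKE, 2 predicted REAL
print(f"Accuracy: {acc:.2%}")  # 4 of 6 correct
```

The diagonal holds the correct predictions, so accuracy is the diagonal sum divided by the total count.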

That's it! Here's the complete source code:

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.linear_model import PassiveAggressiveClassifier

# Load dataset into a pandas DataFrame
df = pd.read_csv('news.csv')

# Split the data into training and testing data (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(df['text'], df['label'], test_size=0.2, random_state=42)

# Initialize TfidfVectorizer: drop English stop words and ignore terms
# that appear in more than 70% of the documents
tfidf_vectorizer = TfidfVectorizer(stop_words='english', max_df=0.7)

# Fit the vocabulary on the training data and transform it
tfidf_train = tfidf_vectorizer.fit_transform(X_train)

# Transform the testing data with the vocabulary learned from training
tfidf_test = tfidf_vectorizer.transform(X_test)

# Initialize PassiveAggressiveClassifier
# (max_iter=50 caps the number of passes over the training data)
pac = PassiveAggressiveClassifier(max_iter=50)

# Fit the model
pac.fit(tfidf_train, y_train)

# Predict on the testing data
y_pred = pac.predict(tfidf_test)

# Calculate the accuracy score
score = accuracy_score(y_test, y_pred)
print(f'Accuracy: {round(score*100,2)}%')

# Calculate and print the confusion matrix
# (rows are true labels, columns are predicted labels)
print(confusion_matrix(y_test, y_pred))
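As an optional variation, the vectorizer and the classifier can be bundled with scikit-learn's make_pipeline so that raw text goes in and a label comes out in one call. The sketch below uses an invented four-sentence corpus with assumed FAKE/REAL labels, and adds random_state (not part of the code above) only to make the run repeatable. It is one possible arrangement, not a replacement for the step-by-step version.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import PassiveAggressiveClassifier
from sklearn.pipeline import make_pipeline

# Invented toy data with assumed label names, for illustration only.
train_texts = [
    "aliens secretly control the senate",
    "miracle pill cures every disease overnight",
    "senate passes annual budget after debate",
    "city council approves transit funding",
]
train_labels = ["FAKE", "FAKE", "REAL", "REAL"]

# Same settings as the tutorial, chained into a single estimator.
pipe = make_pipeline(
    TfidfVectorizer(stop_words="english", max_df=0.7),
    PassiveAggressiveClassifier(max_iter=50, random_state=42),
)
pipe.fit(train_texts, train_labels)

# The pipeline vectorizes new text with the training vocabulary, then predicts.
pred = pipe.predict([
    "senate debate on transit budget",
    "aliens cure disease with miracle pill",
])
print(pred)
```

With real data, the same pipe object could be fitted on X_train and y_train and evaluated on X_test exactly as before.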
