TopicClassifier

Developed by WebOrganizer
A topic classification model fine-tuned from gte-base-en-v1.5 that classifies web content into 24 categories
Downloads 2,288
Release Time: 2/10/2025

Model Overview

This model automatically assigns web content to one of 24 predefined topic categories, using both the page URL and its text content as input. It is suited to content filtering, information organization, and similar scenarios.
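A minimal sketch of how a URL and page text might be combined and classified with the Hugging Face transformers library. The model identifier `WebOrganizer/TopicClassifier`, the "URL, blank line, text" input layout, and the need for `trust_remote_code=True` are assumptions not stated on this page; check the official model card before relying on them. The helper names `format_input` and `classify` are hypothetical.

```python
def format_input(url: str, text: str) -> str:
    """Join URL and page text into one string.

    Assumed layout: URL, a blank line, then the text.
    """
    return f"{url.strip()}\n\n{text.strip()}"


def classify(url: str, text: str, model_name: str = "WebOrganizer/TopicClassifier"):
    """Return (predicted category index, list of per-category probabilities)."""
    # Imported lazily so format_input works without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        trust_remote_code=True,  # assumption: the gte-based architecture ships custom code
    )
    model.eval()

    inputs = tokenizer(
        [format_input(url, text)], return_tensors="pt", truncation=True
    )
    with torch.no_grad():
        logits = model(**inputs).logits

    probs = logits.softmax(dim=-1)[0]
    return int(probs.argmax()), probs.tolist()
```

Calling `classify("http://www.example.com", "How to build a computer from scratch? ...")` downloads the model on first use and returns the index of the most likely category together with the full probability vector.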

Model Features

Two-stage Training
First trained on 1 million documents annotated by Llama-3.1-8B, then fine-tuned on 100,000 documents annotated by Llama-3.1-405B-FP8
Dual Input (URL+Text)
Simultaneously considers both webpage URL and text content for comprehensive classification
Efficient Inference Support
Supports unpadding and memory-efficient attention mechanisms, with optional xformers acceleration
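The efficient-inference options could be switched on at load time roughly as follows. This is a sketch under assumptions: the keyword names `unpad_inputs` and `use_memory_efficient_attention` are taken from the configuration of the upstream gte-base-en-v1.5 implementation and are not confirmed by this page, and the helper `efficient_attention_kwargs` is hypothetical. It enables both options only when the xformers package can be found.

```python
import importlib.util
from typing import Optional


def efficient_attention_kwargs(has_xformers: Optional[bool] = None) -> dict:
    """Build keyword arguments for from_pretrained().

    If has_xformers is None, detect the package automatically.
    The two flag names below are assumptions based on the upstream gte config.
    """
    if has_xformers is None:
        has_xformers = importlib.util.find_spec("xformers") is not None
    return {
        "trust_remote_code": True,
        "unpad_inputs": has_xformers,                    # skip computation on padding tokens
        "use_memory_efficient_attention": has_xformers,  # xformers attention kernels
    }
```

The resulting dictionary would be passed as `AutoModelForSequenceClassification.from_pretrained(model_name, **efficient_attention_kwargs())`. Falling back to the standard attention path when xformers is absent keeps the same code usable on machines without it.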

Model Capabilities

Web Content Classification
Multi-category Probability Prediction
Text Understanding
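Multi-category probability prediction typically means turning the classifier's raw scores (logits) into a probability distribution with a softmax and then ranking the categories. The sketch below shows that step in plain Python; the function names and the example logits are illustrative and not taken from the model.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=3):
    """Return the k most likely (category index, probability) pairs."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Illustrative logits for a 4-category toy case (the real model has 24 categories).
probs = softmax([2.0, 0.5, -1.0, 0.0])
best = top_k(probs, k=2)
```

Keeping the full distribution, rather than only the top category, lets downstream code apply its own confidence thresholds.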

Use Cases

Content Management
Automatic Webpage Classification
Automatically categorizes scraped webpage content by topic
Accurately identifies 24 topic categories
Content Filtering
Adult Content Filtering
Identifies and filters inappropriate content
Can accurately identify adult content categories
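A filtering step built on the predicted probabilities might look like the sketch below. The index of the adult category and the 0.5 threshold are assumptions chosen for illustration; the actual label order must be read from the model's configuration. The function names `is_adult` and `filter_pages` are hypothetical.

```python
def is_adult(probs, adult_index, threshold=0.5):
    """True if the adult-category probability reaches the threshold."""
    return probs[adult_index] >= threshold


def filter_pages(pages, adult_index, threshold=0.5):
    """Keep the URLs of pages that are not flagged as adult content.

    pages: iterable of (url, probs) pairs, where probs is the
    per-category probability list predicted for that page.
    """
    return [
        url for url, probs in pages
        if not is_adult(probs, adult_index, threshold)
    ]


# Toy example with 3 categories; index 0 plays the role of the adult category.
sample = [
    ("http://a.example", [0.9, 0.05, 0.05]),
    ("http://b.example", [0.1, 0.7, 0.2]),
]
kept = filter_pages(sample, adult_index=0)
```

Lowering the threshold filters more aggressively at the cost of removing more borderline pages.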