FormatClassifier-NoURL

Developed by WebOrganizer
A classification model that categorizes web content into 24 categories based solely on text content (without using URL information)
Downloads 730
Release Time: 2/10/2025

Model Overview

This model is fine-tuned from gte-base-en-v1.5 and is designed for format classification of web text content. It recognizes 24 different format types.

Model Features

URL-Independent Classification
Classifies based solely on text content without relying on URL information
24 Format Classification
Supports recognition of 24 different web formats ranging from academic writing to user reviews
Two-Phase Training
Uses data annotated by Llama-3.1-8B and Llama-3.1-405B-FP8 for two-phase fine-tuning

Model Capabilities

Web Content Classification
Text Format Recognition
Multi-Class Probability Prediction
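
To illustrate what multi-class probability prediction means here, the following is a minimal sketch of how 24 raw classifier scores (logits) could be turned into probabilities and ranked. It does not load the model. The label names `format_0` to `format_23`, the helper names `softmax` and `top_k`, and the example logits are all assumptions made for illustration; the real label names ship with the model.

```python
import math

NUM_FORMATS = 24  # the model distinguishes 24 format categories

# Placeholder label names (assumption); the real names come with the model.
LABELS = [f"format_{i}" for i in range(NUM_FORMATS)]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=3):
    """Return the k most probable (label, probability) pairs."""
    if len(logits) != NUM_FORMATS:
        raise ValueError(f"expected {NUM_FORMATS} logits, got {len(logits)}")
    probs = softmax(logits)
    ranked = sorted(zip(LABELS, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up logits for illustration: class 5 scores highest, class 11 second.
example_logits = [0.0] * NUM_FORMATS
example_logits[5] = 4.0
example_logits[11] = 2.0
top3 = top_k(example_logits)
```

Keeping the full probability vector, rather than only the top label, lets downstream code apply its own confidence threshold.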

Use Cases

Content Management
Web Content Archiving
Automatically categorizes and organizes large volumes of web content
Improves content management efficiency
Information Retrieval
Search Result Filtering
Filters search results based on content format
Enhances search relevance
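
As a rough sketch of the search-filtering use case, the snippet below keeps only results whose predicted format is in a wanted set and whose confidence clears a threshold. It assumes each result has already been classified. The `SearchResult` type, the `filter_by_format` helper, the label strings, the 0.5 threshold, and the sample data are hypothetical and not part of the model or any library.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    title: str
    predicted_format: str  # top label assigned by the classifier
    confidence: float      # probability of that label


def filter_by_format(results, wanted_formats, min_confidence=0.5):
    """Keep results whose predicted format is wanted and confident enough."""
    wanted = set(wanted_formats)
    return [
        r for r in results
        if r.predicted_format in wanted and r.confidence >= min_confidence
    ]


# Illustrative data only; label strings are assumptions.
results = [
    SearchResult("Survey of ranking methods", "Academic Writing", 0.91),
    SearchResult("Best headphones? My take", "User Review", 0.84),
    SearchResult("Untitled page", "Academic Writing", 0.32),
]
kept = filter_by_format(results, ["Academic Writing"])
```

The same grouping by `predicted_format` could serve the archiving use case, with low-confidence items routed to manual review instead of being dropped.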