COLD2 Open-Source Fill-Mask Model - Free Solution to the Missing Word Problem in E-commerce Search Queries

Developed by fkrasnov2

COLD2 is a PyTorch-based fill-mask model specifically designed to address missing words in e-commerce search queries.

Tags: Large Language Model, Transformers, E-commerce search completion, Russian context optimization, Real-time query filling

Release date: 10/15/2024

Model Overview

The model uses the context of a query to generate candidate missing words, which makes it suitable for search query completion on e-commerce platforms.

Model Features

E-commerce optimization

Optimized for e-commerce search queries and aimed at completing product-related vocabulary.

Context understanding

Can understand the context of queries to generate semantically relevant completion words.

Multi-platform support

Runs in Python with PyTorch and transformers, and in the browser with transformers.js (using the ONNX version of the model).

Model Capabilities

Search query completion

Contextual word prediction

E-commerce domain text processing

Use Cases

E-commerce

Search query completion: automatically completes missing words in user-entered search queries, with the aim of improving search accuracy and user experience.

Product recommendation: predicts possible complete product names from partial queries, with the aim of improving product discovery.

🚀 COLD2 Model

A model designed to solve the problem of missing words in search queries. It leverages the query context to generate potential missing words.
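
The model fills a gap that is marked with the `[MASK]` token, so a raw query needs a mask position before it can be completed. As a minimal sketch, assuming the position of the missing word is not known in advance, a helper could generate one masked candidate per gap. The name `mask_variants` is hypothetical and is not part of the model or of transformers.

```python
MASK = "[MASK]"

def mask_variants(query: str) -> list[str]:
    """Insert MASK at every gap: before, between, and after the words."""
    words = query.split()
    return [
        " ".join(words[:i] + [MASK] + words[i:])
        for i in range(len(words) + 1)
    ]

# Example query: "зарядка USB" ("charger USB")
variants = mask_variants("зарядка USB")
for variant in variants:
    print(variant)
```

Each variant could then be passed to the fill-mask pipeline from the Quick Start section, keeping the candidate with the highest score.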

🚀 Quick Start

Prerequisites

Before using the model, you need to install the necessary dependencies. Run the following command:

pip install transformers torch protobuf sentencepiece

Basic Usage

Here is a basic example of using the model to fill in the masked word in a search query:

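import torch
from transformers import pipeline
# Use the GPU when available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
unmasker = pipeline("fill-mask", model="fkrasnov2/COLD2", device=device)
unmasker("электроника зарядка [MASK] USB")  # "electronics charger [MASK] USB"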

[{'score': 0.3712620437145233,
  'token': 1131,
  'token_str': 'автомобильная',
  'sequence': 'электроника зарядка автомобильная usb'},
 {'score': 0.12239563465118408,
  'token': 7436,
  'token_str': 'быстрая',
  'sequence': 'электроника зарядка быстрая usb'},
 {'score': 0.046715956181287766,
  'token': 5819,
  'token_str': 'проводная',
  'sequence': 'электроника зарядка проводная usb'},
 {'score': 0.031308457255363464,
  'token': 635,
  'token_str': 'универсальная',
  'sequence': 'электроника зарядка универсальная usb'},
 {'score': 0.02941182069480419,
  'token': 2371,
  'token_str': 'адаптер',
  'sequence': 'электроника зарядка адаптер usb'}]
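
The pipeline returns a list of dictionaries with `score`, `token`, `token_str`, and `sequence` keys, as the output above shows. In a search setting it may be useful to keep only the completions above some confidence level. The sketch below assumes a cutoff of 0.05, which is an arbitrary illustrative value, and `confident_completions` is a hypothetical helper name. The sample data is the first three predictions from the output above, with scores rounded.

```python
def confident_completions(predictions, min_score=0.05):
    """Return (token_str, sequence, score) for predictions at or above min_score."""
    return [
        (p["token_str"], p["sequence"], p["score"])
        for p in predictions
        if p["score"] >= min_score
    ]

# First three predictions copied from the output above (scores rounded).
sample = [
    {"score": 0.3713, "token": 1131, "token_str": "автомобильная",
     "sequence": "электроника зарядка автомобильная usb"},
    {"score": 0.1224, "token": 7436, "token_str": "быстрая",
     "sequence": "электроника зарядка быстрая usb"},
    {"score": 0.0467, "token": 5819, "token_str": "проводная",
     "sequence": "электроника зарядка проводная usb"},
]

kept = confident_completions(sample, min_score=0.05)
for token_str, sequence, score in kept:
    print(f"{score:.2f}  {token_str}  ->  {sequence}")
```

The right threshold depends on the application; a lower value yields more suggestions at the cost of less relevant ones.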

Advanced Usage

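To improve tokenization, a preposition can be coupled to the word that follows it with an underscore, as in `для_праздника` ("for a holiday"). The query below reads "clothing women's [MASK] for_a_holiday":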

unmasker("одежда женское [MASK] для_праздника")

[{'score': 0.9355553984642029,
  'token': 503,
  'token_str': 'платье',
  'sequence': 'одежда женское платье для_праздника'},
 {'score': 0.011321154423058033,
  'token': 615,
  'token_str': 'кольцо',
  'sequence': 'одежда женское кольцо для_праздника'},
 {'score': 0.008672593161463737,
  'token': 993,
  'token_str': 'украшение',
  'sequence': 'одежда женское украшение для_праздника'},
 {'score': 0.0038903721142560244,
  'token': 27100,
  'token_str': 'пончо',
  'sequence': 'одежда женское пончо для_праздника'},
 {'score': 0.003703165566548705,
  'token': 453,
  'token_str': 'белье',
  'sequence': 'одежда женское белье для_праздника'}]
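
Queries typed by users will not normally contain the underscore, so the coupling has to be applied before the query is passed to the model. The following is a minimal sketch of such a preprocessing step. The preposition set is a small illustrative subset chosen for this example and does not come from the model card, and `couple_prepositions` is a hypothetical helper; the exact preprocessing used when the model was trained is not documented here.

```python
# Illustrative subset of Russian prepositions; NOT taken from the model card.
PREPOSITIONS = {"для", "на", "в", "с", "из", "от", "по", "под", "без"}
MASK = "[MASK]"

def couple_prepositions(query: str) -> str:
    """Join each known preposition to the following word with an underscore."""
    words = query.split()
    out = []
    i = 0
    while i < len(words):
        word = words[i]
        nxt = words[i + 1] if i + 1 < len(words) else None
        # Couple only when a real word follows; never attach to the mask token.
        if word.lower() in PREPOSITIONS and nxt is not None and nxt != MASK:
            out.append(f"{word}_{nxt}")
            i += 2
        else:
            out.append(word)
            i += 1
    return " ".join(out)

coupled = couple_prepositions("одежда женское [MASK] для праздника")
print(coupled)
```

A preposition that ends the query, or that directly precedes `[MASK]`, is left unchanged in this sketch.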

📦 Installation for transformers.js

transformers.js requires the ONNX version of the model. In Python, the ONNX weights can be loaded with Optimum as follows:

from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("fkrasnov2/COLD2") 
model = ORTModelForMaskedLM.from_pretrained("fkrasnov2/COLD2", file_name='model.onnx')
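
A masked-language model such as the one loaded above produces a vector of logits over the vocabulary at the `[MASK]` position. A fill-mask pipeline turns these into scored candidates by applying a softmax and keeping the top k entries. The following is a minimal pure-Python sketch of that post-processing step. The three-word vocabulary and the logits are made up for illustration and do not come from COLD2, and `top_k_from_logits` is a hypothetical helper name.

```python
import math

def top_k_from_logits(logits, id_to_token, k=5):
    """Apply a softmax to the logits and return the k most probable tokens."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [
        {"token": i, "token_str": id_to_token[i], "score": probs[i]}
        for i in ranked
    ]

# Toy, made-up vocabulary and logits (assumptions for illustration only).
toy_vocab = {0: "платье", 1: "кольцо", 2: "пончо"}
toy_logits = [4.0, 1.0, 0.5]

predictions = top_k_from_logits(toy_logits, toy_vocab, k=2)
for p in predictions:
    print(p["token_str"], round(p["score"], 3))
```

With a real model, the logits would be taken from the model output at the index of the mask token, and the vocabulary mapping from the tokenizer.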

💻 Using the Model in the Browser

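The model can also run directly in the browser with transformers.js. The example below consists of an HTML page and a JavaScript module; the `styles.css` file referenced by the page is not shown.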

HTML File (`index.html`)

<!DOCTYPE html>
<html lang="ru">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Mask fill</title>
    <link rel="stylesheet" href="styles.css">
    <script src="main.js" type="module" defer></script>
</head>
<body>
    <div class="container">
        <textarea id="long-text-input" placeholder="Enter search query with [MASK]"></textarea>
        <button id="generate-button">
            Fill mask
        </button>
        <div id="output-div"></div>
    </div>
</body>
</html>

JavaScript File (`main.js`)

import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.0.2';

const longTextInput = document.getElementById('long-text-input');
const output = document.getElementById('output-div');
const generateButton = document.getElementById('generate-button');

const pipe = await pipeline(
    'fill-mask', // task
    'fkrasnov2/COLD2' // model 
);

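// Run the model when the button is clicked and show the top prediction.
generateButton.addEventListener('click', async () => {
    const input = longTextInput.value;
    // The pipeline returns the candidate completions for the [MASK] token.
    const result = await pipe(input);

    // textContent avoids interpreting the model output as HTML.
    output.textContent = result[0].sequence;
    output.style.display = 'block';
});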

Browser Page Preview

[Image: browser page]

📄 License

This project is licensed under the Unlicense.
