🚀 WisdomOcean-WisdomInterrogatory
WisdomOcean-WisdomInterrogatory is a legal large language model jointly developed by Zhejiang University, Alibaba DAMO Academy, and Huayuan Computing. It aims to promote the application of legal intelligence in judicial practice, digital case construction, and virtual legal consultation services.
🚀 Quick Start
Guided by the goal of "popularizing legal knowledge and improving judicial efficiency", WisdomOcean-WisdomInterrogatory supports the integration of legal intelligence into judicial practice, digital case construction, and virtual legal consultation services, forming a digital and intelligent judicial foundation.
✨ Features
- Developed jointly by Zhejiang University, Alibaba DAMO Academy, and Huayuan Computing.
- Aims to promote legal intelligence across judicial practice, digital case construction, and virtual legal consultation.
- Based on the open-source Baichuan-7B model, with continued pre-training and fine-tuning for legal scenarios.
📦 Installation
Inference Environment Installation
```
transformers>=4.27.1
accelerate>=0.20.1
torch>=2.0.1
modelscope>=1.8.3
sentencepiece==0.1.99
```
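Assuming a standard pip-based setup (the original does not specify an installation method), the dependencies above can be installed in one step:

```shell
pip install "transformers>=4.27.1" "accelerate>=0.20.1" "torch>=2.0.1" "modelscope>=1.8.3" "sentencepiece==0.1.99"
```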
💻 Usage Examples
Basic Usage
```python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch
from modelscope import AutoModelForCausalLM, AutoTokenizer, snapshot_download

# Download the model weights from ModelScope.
model_id = "wisdomOcean/wisdomInterrogatory"
revision = "v1.0.0"
model_dir = snapshot_download(model_id, revision)

tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

def generate_response(prompt: str) -> str:
    # Wrap the user prompt in the model's Human/Assistant template.
    inputs = tokenizer(f"</s>Human:{prompt} </s>Assistant: ", return_tensors="pt")
    inputs = inputs.to("cuda")
    pred = model.generate(**inputs, max_new_tokens=800, repetition_penalty=1.2)
    response = tokenizer.decode(pred.cpu()[0], skip_special_tokens=True)
    # Keep only the text after the "Assistant: " marker.
    return response.split("Assistant: ")[1]

# "What are the consequences of driving after drinking two jin (1 kg) of baijiu?"
prompt = "如果喝了两斤白酒后开车,会有什么后果?"
resp = generate_response(prompt)
print(resp)
```
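The Human/Assistant template used above can be illustrated without loading the model. `build_prompt` and `extract_answer` below are hypothetical helper names, not part of the released code; they only demonstrate how the template is assembled and how the answer is split out of the decoded text:

```python
def build_prompt(user_text: str) -> str:
    # Same template string as used in generate_response above.
    return f"</s>Human:{user_text} </s>Assistant: "

def extract_answer(decoded: str) -> str:
    # Keep only the text after the final "Assistant: " marker.
    return decoded.split("Assistant: ")[-1]

# Simulate a decoded model output: the prompt followed by a reply.
raw = build_prompt("What is a contract?") + "A contract is a legally binding agreement."
print(extract_answer(raw))  # -> "A contract is a legally binding agreement."
```

Splitting on the marker is simple but fragile: if the model repeats "Assistant: " inside its reply, only the text after the last occurrence is kept.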
📚 Documentation
Model Training
Our model is based on [Baichuan-7B](https://github.com/baichuan-inc/Baichuan-7B). On this basis, we conducted secondary pre-training and instruction fine-tuning.
Secondary Pre-training
The purpose of secondary pre-training is to inject legal knowledge into the general-purpose base model. The pre-training data includes legal documents, judicial cases, and legal Q&A data, totaling 40 GB.
Instruction Fine-Tuning
After secondary pre-training, we ran an instruction fine-tuning stage on 100k instruction examples. Its purpose is to give the model the ability to answer questions and converse directly with users.
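The README does not publish the instruction data schema. Purely as an illustration, a single instruction-tuning record is often stored as one JSON object per line; all field names and text below are assumptions, not the project's actual data:

```python
import json

# Hypothetical instruction-tuning record (field names are assumed,
# not taken from the WisdomInterrogatory release).
record = {
    "instruction": "What are the legal consequences of drunk driving?",
    "input": "",
    "output": "Drunk driving may lead to administrative penalties or criminal liability, depending on severity.",
}

# One JSON object per line is a common on-disk format for such datasets.
line = json.dumps(record, ensure_ascii=False)
print(line)
```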
📄 License
This project is released under a custom ("other") license.
⚠️ Important Note
This model is provided for academic research purposes only. The accuracy, completeness, and applicability of its outputs are not guaranteed. When using content generated by the model, you should judge its applicability yourself and bear the associated risks.