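🚀 Hume-Libero_Object Model Card
Hume-Libero_Object is a dual-system Vision-Language-Action (VLA) model with System-2 thinking, trained on the Libero-Object dataset. It is intended for robot manipulation research and applications.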
🚀 Quick Start
To reproduce the results reported in the paper, follow these instructions.
To use the model directly, see the following code example:
from hume import HumePolicy
import numpy as np

# Load the pretrained policy from a local checkpoint directory
hume = HumePolicy.from_pretrained("/path/to/checkpoints")

# Configure inference
hume.init_infer(
    infer_cfg=dict(
        replan_steps=8,
        s2_replan_steps=16,
        s2_candidates_num=5,
        noise_temp_lower_bound=1.0,
        noise_temp_upper_bound=1.0,
        time_temp_lower_bound=0.9,
        time_temp_upper_bound=1.0,
        post_process_action=True,
        device="cuda",
    )
)

# Dummy batch-of-one observation: two RGB camera views, a 7-D state, and a task instruction
observation = {
    "observation.images.image": np.zeros((1, 224, 224, 3), dtype=np.uint8),
    "observation.images.wrist_image": np.zeros((1, 224, 224, 3), dtype=np.uint8),
    "observation.state": np.zeros((1, 7)),
    "task": ["Lift the papper"],
}

# Run inference to obtain the action
action = hume.infer(observation)
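The example above feeds all-zero placeholders. When the inputs come from real cameras and a robot state, it can help to check shapes and dtypes before calling `hume.infer`. The sketch below is a hypothetical helper, not part of the `hume` package: `make_observation`, `IMAGE_SHAPE`, and `STATE_DIM` are names introduced here for illustration, and the expected shapes are simply copied from the dummy observation above. Whether the model accepts other resolutions or state sizes is not stated in this card.

```python
import numpy as np

# Assumption: these values mirror the dummy observation in the example above.
IMAGE_SHAPE = (224, 224, 3)
STATE_DIM = 7


def make_observation(image, wrist_image, state, task):
    """Pack one timestep into a batch-of-one observation dict (hypothetical helper)."""
    image = np.asarray(image)
    wrist_image = np.asarray(wrist_image)
    state = np.asarray(state, dtype=np.float64)

    # Both camera frames must be uint8 arrays of shape (224, 224, 3).
    for name, frame in (("image", image), ("wrist_image", wrist_image)):
        if frame.shape != IMAGE_SHAPE:
            raise ValueError(f"{name} has shape {frame.shape}, expected {IMAGE_SHAPE}")
        if frame.dtype != np.uint8:
            raise ValueError(f"{name} has dtype {frame.dtype}, expected uint8")

    if state.shape != (STATE_DIM,):
        raise ValueError(f"state has shape {state.shape}, expected {(STATE_DIM,)}")
    if not isinstance(task, str) or not task.strip():
        raise ValueError("task must be a non-empty string")

    # Add the leading batch dimension used by the example's observation dict.
    return {
        "observation.images.image": image[np.newaxis],
        "observation.images.wrist_image": wrist_image[np.newaxis],
        "observation.state": state[np.newaxis],
        "task": [task],
    }


# Usage with placeholder inputs; replace these with real frames and robot state.
obs = make_observation(
    image=np.zeros(IMAGE_SHAPE, dtype=np.uint8),
    wrist_image=np.zeros(IMAGE_SHAPE, dtype=np.uint8),
    state=np.zeros(STATE_DIM),
    task="pick up the object",  # placeholder instruction
)
print({key: getattr(value, "shape", value) for key, value in obs.items()})
```

The returned dict has the same keys and batch-of-one layout as the `observation` in the example, so it could be passed to `hume.infer(obs)` in its place.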
📄 License
This project is released under the MIT License.
📚 Citation
If you use this model, please cite the following paper:
@article{song2025hume,
  title={Hume: Introducing System-2 Thinking in Visual-Language-Action Model},
  author={Anonymous Authors},
  journal={arXiv preprint arXiv:2505.21432},
  year={2025}
}
📋 Model Information