

news 2025/11/15 10:11:03

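Contents

Preface
1. Requirements
2. Installation
3. Model checkpoints
   3.1 Base models
   3.2 Fine-tuned models
4. Running inference for a pre-trained model
5. Fine-tuning base models on your own data
   5.1 Converting your data to a LeRobot dataset
   5.2 Defining training configs and running training
   5.3 Spinning up a policy server and running inference
   5.4 More examples
6. Troubleshooting
7. Running openpi models remotely
   7.1 Starting a remote policy server
   7.2 Querying the remote policy server from your robot code
8. Inference tutorial
   8.1 Policy inference
   8.2 Working with a live model
9. Policy record code

Preface

openpi holds open-source models and packages for robotics, published by the Physical Intelligence team. The repo currently contains two types of models:

- the π₀ model, a flow-based diffusion vision-language-action model (VLA)
- the π₀-FAST model, an autoregressive VLA based on the FAST action tokenizer

For both models, we provide base model checkpoints pre-trained on 10k hours of robot data, along with examples for using them out of the box or fine-tuning them on your own datasets.

This is an experiment: π₀ was developed for our own robots, which differ from widely used platforms such as ALOHA and DROID. We are optimistic that researchers and practitioners will be able to run creative new experiments adapting π₀ to their own platforms, but we do not expect every such attempt to succeed. In short, π₀ may or may not work for you, but you are welcome to try it and see.

1. Requirements

To run the models in this repository, you need an NVIDIA GPU with at least the following specifications. These estimates assume a single GPU, but you can also use multiple GPUs with model parallelism to reduce the per-GPU memory requirement by configuring fsdp_devices in the training config. Note that the current training script does not yet support multi-node training.

| Mode               | Memory Required | Example GPU        |
| ------------------ | --------------- | ------------------ |
| Inference          | 8 GB            | RTX 4090           |
| Fine-Tuning (LoRA) | 22.5 GB         | RTX 4090           |
| Fine-Tuning (Full) | 70 GB           | A100 (80GB) / H100 |

The package has been tested on Ubuntu 22.04; other operating systems are not currently supported.

2. Installation

When cloning this repo, make sure to update the submodules:

    git clone --recurse-submodules git@github.com:Physical-Intelligence/openpi.git

    # Or if you already cloned the repo:
    git submodule update --init --recursive

We use uv to manage Python dependencies. See the uv installation instructions to set it up. Once uv is installed, run the following to set up the environment:

    GIT_LFS_SKIP_SMUDGE=1 uv sync
    GIT_LFS_SKIP_SMUDGE=1 uv pip install -e .
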
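Note: GIT_LFS_SKIP_SMUDGE=1 is needed to pull LeRobot as a dependency.

Docker: As an alternative to the uv installation, we provide instructions for installing openpi using Docker. If you run into issues with your system setup, consider using Docker to simplify installation. See the Docker setup instructions for more details.

3. Model checkpoints

3.1 Base models

We provide multiple base VLA model checkpoints. These checkpoints have been pre-trained on 10k hours of robot data and can be used for fine-tuning.

| Model   | Use Case    | Description                                        | Checkpoint Path                              |
| ------- | ----------- | -------------------------------------------------- | -------------------------------------------- |
| π₀      | Fine-Tuning | Base diffusion π₀ model for fine-tuning            | s3://openpi-assets/checkpoints/pi0_base      |
| π₀-FAST | Fine-Tuning | Base autoregressive π₀-FAST model for fine-tuning  | s3://openpi-assets/checkpoints/pi0_fast_base |

3.2 Fine-tuned models

We also provide "expert" checkpoints for various robot platforms and tasks. These models are fine-tuned from the base models above and are intended to run directly on the target robot. They may or may not work on your particular robot. Because these checkpoints were fine-tuned on relatively small datasets collected with more widely used robots such as ALOHA and the DROID Franka setup, they might not generalize to your particular setup, though we found some of them, especially the DROID checkpoint, to generalize quite broadly in practice.

| Model               | Use Case    | Description | Checkpoint Path |
| ------------------- | ----------- | ----------- | --------------- |
| π₀-FAST-DROID       | Inference   | π₀-FAST model fine-tuned on the DROID dataset; can perform a wide range of simple table-top manipulation tasks 0-shot in new scenes on the DROID robot platform | s3://openpi-assets/checkpoints/pi0_fast_droid |
| π₀-DROID            | Fine-Tuning | π₀ model fine-tuned on the DROID dataset; faster inference than π₀-FAST-DROID, but may not follow language commands as well | s3://openpi-assets/checkpoints/pi0_droid |
| π₀-ALOHA-towel      | Inference   | π₀ model fine-tuned on internal ALOHA data; can fold diverse towels 0-shot on ALOHA robot platforms | s3://openpi-assets/checkpoints/pi0_aloha_towel |
| π₀-ALOHA-tupperware | Inference   | π₀ model fine-tuned on internal ALOHA data; can unpack food from a tupperware container | s3://openpi-assets/checkpoints/pi0_aloha_tupperware |
| π₀-ALOHA-pen-uncap  | Inference   | π₀ model fine-tuned on public ALOHA data; can uncap a pen | s3://openpi-assets/checkpoints/pi0_aloha_pen_uncap |

By default, checkpoints are automatically downloaded from s3://openpi-assets and cached in ~/.cache/openpi when needed. You can override the download path by setting the OPENPI_DATA_HOME environment variable.

4. Running inference for a pre-trained model

Our pre-trained model checkpoints can be run with a few lines of code (shown here for the π₀-FAST-DROID model):

    from openpi.training import config
    from openpi.policies import policy_config
    from openpi.shared import download

    config = config.get_config("pi0_fast_droid")
    checkpoint_dir = download.maybe_download("s3://openpi-assets/checkpoints/pi0_fast_droid")

    # Create a trained policy.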
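    policy = policy_config.create_trained_policy(config, checkpoint_dir)

    # Run inference on a dummy example.
    example = {
        "observation/exterior_image_1_left": ...,
        "observation/wrist_image_left": ...,
        ...
        "prompt": "pick up the fork"
    }
    action_chunk = policy.infer(example)["actions"]

You can also try this out in the example notebook. We provide detailed step-by-step examples for running inference with pre-trained checkpoints on DROID and ALOHA robots.

Remote inference: We provide examples and code for running model inference remotely. The model can run on a different server and stream actions to the robot over a websocket connection. This makes it easy to use more powerful GPUs off-robot and to keep the robot and policy environments separate.

Testing inference without a robot: We provide a script for testing inference without a robot. The script generates a random observation and runs inference with the model. See the openpi repository for more details.

5. Fine-tuning base models on your own data

We will fine-tune the π₀-FAST model on the Libero dataset as a running example of how to fine-tune a base model on your own data. We will explain three steps:

1. Convert your data to a LeRobot dataset (which we use for training)
2. Define training configs and run training
3. Spin up a policy server and run inference

5.1 Converting your data to a LeRobot dataset

We provide a minimal example script for converting Libero data to a LeRobot dataset in examples/libero/convert_libero_data_to_lerobot.py. You can easily modify it to convert your own data. Download the raw Libero dataset and run the script with the following command:

    uv run examples/libero/convert_libero_data_to_lerobot.py --data_dir /path/to/your/libero/data

5.2 Defining training configs and running training

To fine-tune a base model on your own data, you need to define configs for data processing and training. We provide example Libero configs with detailed comments, which you can modify for your own dataset:

- LiberoInputs and LiberoOutputs: define the data mapping from the Libero environment to the model and vice versa. They are used for both training and inference.
- LeRobotLiberoDataConfig: defines how to process raw Libero data from the LeRobot dataset for training.
- TrainConfig: defines the fine-tuning hyperparameters, data config, and weight loader.

We provide example fine-tuning configs for both π₀ and π₀-FAST on Libero data.

Before running training, we need to compute the normalization statistics for the training data. Run the script below with the name of your training config:

    uv run scripts/compute_norm_stats.py --config-name pi0_fast_libero

Now we can kick off training with the following command (the --overwrite flag overwrites existing checkpoints if you rerun fine-tuning with the same config):

    XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 uv run scripts/train.py pi0_fast_libero --exp-name=my_experiment --overwrite

The command logs training progress to the console and saves checkpoints to the checkpoints directory. You can also monitor training progress on the Weights & Biases dashboard. To make maximal use of GPU memory, set XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 before running training; this lets JAX use up to 90% of GPU memory (the default is 75%).

Note: We provide functionality for reloading the state/action normalization statistics from pre-training. This can be beneficial if you are fine-tuning on a new task with a robot that was part of the pre-training mixture. For details on how to reload normalization statistics, see the norm_stats.md file.

5.3 Spinning up a policy server and running inference

Once training is complete, we can run inference by spinning up a policy server and then querying it from a Libero evaluation script. Launching a model server is easy (this example uses the checkpoint for iteration 20,000; modify as needed):

    uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi0_fast_libero --policy.dir=checkpoints/pi0_fast_libero/my_experiment/20000

This spins up a server that listens on port 8000 and waits for observations to be sent to it. We can then run the Libero evaluation script to query the server. For instructions on installing Libero and running the evaluation script, see the Libero README.

If you want to embed a policy server call in your own robot runtime, the remote inference docs contain a minimal example.

5.4 More examples

We provide more examples of how to fine-tune and run inference with our models on the ALOHA platform in the following READMEs:

- ALOHA Simulator
- ALOHA Real
- UR5

6. Troubleshooting

We will collect common issues and their solutions here. If you encounter an issue, please check here first. If you cannot find a solution, please file an issue on the repo, following its guidelines.

| Issue | Resolution |
| ----- | ---------- |
| uv sync fails with dependency conflicts | Try removing the virtual environment directory (rm -rf .venv) and running uv sync again. If the issue persists, check that you have the latest version of uv installed (uv self update). |
| Training runs out of GPU memory | Make sure you set XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 before running training to allow JAX to use more GPU memory. You can also try reducing the batch size in your training config. |
| Policy server connection errors | Check that the server is running and listening on the expected port. Verify network connectivity and firewall settings between client and server. |
| Missing norm stats error when training | Run scripts/compute_norm_stats.py with your config name before starting training. |
| Dataset download fails | Check your internet connection. If using local_files_only=True, verify that the dataset exists locally. For HuggingFace datasets, make sure you are logged in (huggingface-cli login). |
| CUDA/GPU errors | Verify that the NVIDIA drivers and CUDA toolkit are installed correctly. For Docker, make sure nvidia-container-toolkit is installed. Check GPU compatibility. |
| Import errors when running examples | Make sure you have installed all dependencies with uv sync and activated the virtual environment. Some examples may list additional requirements in their READMEs. |
| Action dimensions mismatch | Verify that your data processing transforms match the expected input/output dimensions of your robot. Check the action space definitions in your policy classes. |

7. Running openpi models remotely

We provide utilities for running openpi models remotely. This is useful for running inference on more powerful GPUs off-robot, and also helps keep the robot and policy environments separate (for example, to avoid dependency hell with robot software).

7.1 Starting a remote policy server

To start a remote policy server, run the following command:

    uv run scripts/serve_policy.py --env=[DROID | ALOHA | LIBERO]

The env argument specifies which π₀ checkpoint should be loaded. Under the hood, the script executes a command like the following, which you can also use to start a policy server directly, for example for checkpoints you trained yourself (shown here for the DROID environment):

    uv run scripts/serve_policy.py policy:checkpoint --policy.config=pi0_fast_droid --policy.dir=s3://openpi-assets/checkpoints/pi0_fast_droid

This starts a policy server that serves the policy specified by the config and dir arguments. The policy is served on the specified port (default: 8000).

7.2 Querying the remote policy server from your robot code

We provide a client utility with minimal dependencies that you can easily embed into any robot codebase. First, install the openpi-client package in your robot environment:

    cd $OPENPI_ROOT/packages/openpi-client
    pip install -e .

Then you can use the client to query the remote policy server from your robot code. Here is an example:

    from openpi_client import image_tools
    from openpi_client import websocket_client_policy

    # Outside of episode loop, initialize the policy client.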
    # Point to the host and port of the policy server (localhost and 8000 are the defaults).
    client = websocket_client_policy.WebsocketClientPolicy(host="localhost", port=8000)

    for step in range(num_steps):
        # Inside the episode loop, construct the observation.
        # Resize images on the client side to minimize bandwidth / latency. Always return images in uint8 format.
        # We provide utilities for resizing images + uint8 conversion so you match the training routines.
        # The typical resize_size for pre-trained pi0 models is 224.
        # Note that the proprioceptive state can be passed unnormalized, normalization will be handled on the server side.
        observation = {
            "observation/image": image_tools.convert_to_uint8(
                image_tools.resize_with_pad(img, 224, 224)
            ),
            "observation/wrist_image": image_tools.convert_to_uint8(
                image_tools.resize_with_pad(wrist_img, 224, 224)
            ),
            "observation/state": state,
            "prompt": task_instruction,
        }

        # Call the policy server with the current observation.
        # This returns an action chunk of shape (action_horizon, action_dim).
        # Note that you typically only need to call the policy every N steps and execute steps
        # from the predicted action chunk open-loop in the remaining steps.
        action_chunk = client.infer(observation)["actions"]

        # Execute the actions in the environment.
        ...

Here, the host and port arguments specify the IP address and port of the remote policy server. You can also pass them as command-line arguments to your robot code, or hard-code them in your robot codebase. The observation is a dictionary of observations and the prompt, following the input specification of the policy you are serving. The simple client example shows concretely how to construct this dictionary for different environments.

8. Inference tutorial

    import dataclasses

    import jax

    from openpi.models import model as _model
    from openpi.policies import droid_policy
    from openpi.policies import policy_config as _policy_config
    from openpi.shared import download
    from openpi.training import config as _config
    from openpi.training import data_loader as _data_loader

8.1 Policy inference

The following example shows how to create a policy from a checkpoint and run inference on a dummy example.

    config = _config.get_config("pi0_fast_droid")
    checkpoint_dir = download.maybe_download("s3://openpi-assets/checkpoints/pi0_fast_droid")

    # Create a trained policy.
    policy = _policy_config.create_trained_policy(config, checkpoint_dir)

    # Run inference on a dummy example.
    # This example corresponds to observations produced by the DROID runtime.
    example = droid_policy.make_droid_example()
    result = policy.infer(example)

    # Delete the policy to free up memory.
    del policy

    print("Actions shape:", result["actions"].shape)

8.2 Working with a live model

The following example shows how to create a live model from a checkpoint and compute the training loss. First, we demonstrate how to do this with fake data.

    config = _config.get_config("pi0_aloha_sim")

    checkpoint_dir = download.maybe_download("s3://openpi-assets/checkpoints/pi0_aloha_sim")
    key = jax.random.key(0)

    # Create a model from the checkpoint.
    model = config.model.load(_model.restore_params(checkpoint_dir / "params"))

    # We can create fake observations and actions to test the model.
    obs, act = config.model.fake_obs(), config.model.fake_act()

    # Compute the training loss on the fake batch.
    loss = model.compute_loss(key, obs, act)
    print("Loss shape:", loss.shape)

Next, we create a data loader and use a real batch of training data to compute the loss.

    # Reduce the batch size to reduce memory usage.
    config = dataclasses.replace(config, batch_size=2)

    # Load a single batch of data. This is the same data that will be used during training.
    # NOTE: In order to make this example self-contained, we are skipping the normalization step
    # since it requires the normalization statistics to be generated using compute_norm_stats.
    loader = _data_loader.create_data_loader(config, num_batches=1, skip_norm_stats=True)
    obs, act = next(iter(loader))

    # Compute the training loss on the real batch.
    loss = model.compute_loss(key, obs, act)

    # Delete the model to free up memory.
    del model

    print("Loss shape:", loss.shape)

9. Policy record code

    import pathlib

    import numpy as np

    # Load every recorded step (one pickled dict per step_<i>.npy file).
    record_path = pathlib.Path("../policy_records")
    num_steps = len(list(record_path.glob("step_*.npy")))

    records = []
    for i in range(num_steps):
        record = np.load(record_path / f"step_{i}.npy", allow_pickle=True).item()
        records.append(record)

    print("length of records", len(records))
    print("keys in records", records[0].keys())

    for k in records[0]:
        print(f"{k} shape: {records[0][k].shape}")

    from PIL import Image

    def get_image(step: int, idx: int = 0):
        # Images are stored as floats in [0, 1] with channels first.
        img = (255 * records[step]["inputs/image"]).astype(np.uint8)
        return img[idx].transpose(1, 2, 0)

    def show_image(step: int, idx_lst: list[int]):
        imgs = [get_image(step, idx) for idx in idx_lst]
        return Image.fromarray(np.hstack(imgs))

    # display() is available when this runs inside a notebook.
    for i in range(2):
        display(show_image(i, [0]))

    import pandas as pd

    def get_axis(name, axis):
        return np.array([record[name][axis] for record in records])

    # qpos is [..., 14] of type float:
    # 0-5: left arm joint angles
    # 6: left arm gripper
    # 7-12: right arm joint angles
    # 13: right arm gripper
    names = [("left_joint", 6), ("left_gripper", 1), ("right_joint", 6), ("right_gripper", 1)]

    def make_data():
        cur_dim = 0
        in_data = {}
        out_data = {}
        for name, dim_size in names:
            for i in range(dim_size):
                in_data[f"{name}_{i}"] = get_axis("inputs/qpos", cur_dim)
                out_data[f"{name}_{i}"] = get_axis("outputs/qpos", cur_dim)
                cur_dim += 1
        return pd.DataFrame(in_data), pd.DataFrame(out_data)

    in_data, out_data = make_data()

    # Plot each input dimension against the corresponding output dimension.
    for name in in_data.columns:
        data = pd.DataFrame({f"in_{name}": in_data[name], f"out_{name}": out_data[name]})
        data.plot()
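
The record-plotting code relies on a (name, size) layout to turn the 14-dimensional qpos vector into named columns. As a minimal, dependency-free sketch of just that indexing step, the snippet below expands the same layout into one label per dimension and pairs it with a vector. The helper names qpos_column_names and label_qpos and the sample values are illustrative assumptions, not part of openpi.

```python
# Hypothetical helpers, not part of openpi: they expand the (name, size) layout
# used in the record-plotting code into one column label per qpos dimension.
QPOS_LAYOUT = [("left_joint", 6), ("left_gripper", 1), ("right_joint", 6), ("right_gripper", 1)]


def qpos_column_names(layout):
    """Return one label per dimension, in qpos order."""
    labels = []
    for name, dim_size in layout:
        for i in range(dim_size):
            labels.append(f"{name}_{i}")
    return labels


def label_qpos(qpos, layout=QPOS_LAYOUT):
    """Pair each qpos entry with its label; fail loudly on a size mismatch."""
    labels = qpos_column_names(layout)
    if len(labels) != len(qpos):
        raise ValueError(f"expected {len(labels)} dims, got {len(qpos)}")
    return dict(zip(labels, qpos))


# Made-up sample vector, purely for illustration: entry i holds the value float(i).
sample_qpos = [float(i) for i in range(14)]
labeled = label_qpos(sample_qpos)
print(len(labeled), labeled["left_gripper_0"], labeled["right_gripper_0"])  # prints: 14 6.0 13.0
```

Checking the length up front is a deliberate choice here: a robot with a different joint count would otherwise produce silently mislabeled columns in the plots.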


http://www.zqtcl.cn/news/42310/