# SenseVoiceSmall

**Repository Path**: wgm2022/SenseVoiceSmall

## Basic Information

- **Project Name**: SenseVoiceSmall
- **Description**: HiSilicon dlite model adaptation guide, including custom operator development
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-01-24
- **Last Updated**: 2026-03-12

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# Model Adaptation Guide

## 1. Basic Environment Setup

### Step 1: Install Miniconda3

```sh
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod +x Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh
source ~/.bashrc
```

![image-20260124095833577](pic/image-20260124095833577.png)

### Step 2: Install the CANN package for dlite

```sh
./Ascend-cann-toolkit_5.30.t13.7.b130_linux-x86_64.run --install --install-path=/home/hispark/Ascend/dlite
```

![image-20260124095954446](pic/image-20260124095954446.png)

### Step 3: Install the CANN package for dpico

```sh
./Ascend-cann-toolkit_6.10.t01spc030b660_linux.x86_64.run --install --install-path=/home/hispark/Ascend/dpico
```

![image-20260124100249424](pic/image-20260124100249424.png)

### Step 4: Install dependency packages

```sh
sudo apt-get update && sudo apt-get upgrade -y
sudo apt-get install -y gcc g++ cmake make unzip build-essential zlib1g-dev libbz2-dev libsqlite3-dev libssl-dev libxslt1-dev libffi-dev wget
```

![image-20260124100425944](pic/image-20260124100425944.png)

### Step 5: Install the cross-compilation toolchain

```sh
# Extract the archive and enter the directory
tar -xzvf aarch64-mix210-linux.tgz
cd aarch64-mix210-linux
# Run the install script
sudo ./aarch64-mix210-linux.install
# In an Ubuntu terminal, open the .bashrc file
vi ~/.bashrc
# Append the cross-compiler environment variable at the end of the file, then save and exit
export PATH="/opt/linux/x86-arm/aarch64-mix210-linux/bin:$PATH"
# Reload the environment and check the cross-compiler version
source ~/.bashrc && aarch64-mix210-linux-gcc -v
```

![image-20260124110858547](pic/image-20260124110858547.png)

![image-20260124111106244](pic/image-20260124111106244.png)

## 2. Modelzoo Environment Setup

### Step 1: Download the Modelzoo code

```sh
git clone https://gitee.com/HiSpark/modelzoo.git
```

### Step 2: Create a virtual environment

```sh
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r
conda create -n modelzoo python=3.9 -y
conda activate modelzoo
```

![image-20260124104948401](pic/image-20260124104948401.png)

## 3. SenseVoiceSmall Environment Setup

### Step 1: Install pip packages

```sh
cd /home/hispark/work/modelzoo/samples/built-in/audio
mkdir SenseVoiceSmall
cd SenseVoiceSmall
sudo apt update && sudo apt install -y ffmpeg
pip install --upgrade pip
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install modelscope funasr
pip install addict pyyaml requests tqdm numpy scipy librosa soundfile simplejson sortedcontainers onnx onnxruntime
pip install "pyarrow==20.0.0" "datasets<4.0.0"
```

![image-20260124105211263](pic/image-20260124105211263.png)

![image-20260124105431238](pic/image-20260124105431238.png)

![image-20260124114729541](pic/image-20260124114729541.png)

![image-20260124114923372](pic/image-20260124114923372.png)

### Step 2: Download the SenseVoiceSmall model

```sh
cd /home/hispark/work/modelzoo/samples/built-in/audio/SenseVoiceSmall
mkdir -p iic/SenseVoiceSmall
pip uninstall -y funasr
git clone https://github.com/alibaba-damo-academy/FunASR.git
cd FunASR
pip install -e .
```
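* Before downloading the model weights, it is worth confirming that Python now resolves `funasr` from the cloned source tree rather than from a leftover site-packages install. A minimal check (a hypothetical helper, not part of the repo; it assumes only that `funasr` is importable):

```python
# check_funasr.py - confirm the editable FunASR install is the one in use (hypothetical helper)
import funasr

# __file__ should point into the cloned FunASR tree, not into site-packages.
print("funasr imported from:", funasr.__file__)
print("funasr version:", getattr(funasr, "__version__", "unknown"))
```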
* Copy the following into `get_asr_model.py`, then run `python get_asr_model.py`:

```python
from modelscope.hub.snapshot_download import snapshot_download
import os

# Download destination: iic/SenseVoiceSmall under the current directory
local_dir = os.path.join(os.getcwd(), "iic", "SenseVoiceSmall")
print(f"Downloading SenseVoiceSmall to: {local_dir}")

model_dir = snapshot_download(
    model_id='iic/SenseVoiceSmall',
    revision='master',
    local_files_only=False,
    cache_dir=None,      # do not use the default cache
    local_dir=local_dir  # save to the local path above
)

print("✅ Download completed.")
print("Model path:", model_dir)
```

### Step 3: Test the model

* Copy the following into `asr.py`, then run `python asr.py output.wav` to test the model (if you do not have an `output.wav` yet, see the sketch after this step):

```python
# asr.py - speech-to-text with SenseVoiceSmall (supports GPU/CPU)
import os
import sys

import torch
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# === Environment tuning ===
os.environ["OMP_NUM_THREADS"] = "4"
os.environ["OPENBLAS_NUM_THREADS"] = "4"

import logging
logging.getLogger("modelscope").setLevel(logging.WARNING)

# === Argument check ===
if len(sys.argv) != 2:
    print("Usage: python asr.py <audio_file>")
    sys.exit(1)

audio_file = sys.argv[1]
if not os.path.exists(audio_file):
    print(f"Error: Audio file '{audio_file}' not found.")
    sys.exit(1)

# === Load the model ===
print("Loading SenseVoiceSmall model...")
inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='iic/SenseVoiceSmall',
    model_revision='master',
    device='cuda' if torch.cuda.is_available() else 'cpu'
)

# === Run recognition ===
print(f"Transcribing audio: {audio_file}")
result = inference_pipeline(audio_file)

# === Parse the result ===
if isinstance(result, list) and len(result) > 0:
    text = result[0].get("text", "").strip()
else:
    text = ""

if text:
    print("Transcription result:")
    print(text)
else:
    print("Warning: No text recognized.")
```

![image-20260124145317332](pic/image-20260124145317332.png)
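* The test above assumes an `output.wav` file already exists. If no recording is at hand, the sketch below writes a synthetic 16 kHz mono WAV with the `numpy` and `soundfile` packages installed in Step 1. This is a hypothetical helper that only exercises the pipeline end to end; use real speech (16 kHz mono) for a meaningful transcription:

```python
# make_test_wav.py - write a placeholder 16 kHz mono WAV (hypothetical helper)
import numpy as np
import soundfile as sf

sr = 16000                                  # sample rate typically expected by SenseVoice models
t = np.linspace(0, 3.0, 3 * sr, endpoint=False)
tone = 0.1 * np.sin(2 * np.pi * 440.0 * t)  # 3 seconds of a quiet 440 Hz sine
sf.write("output.wav", tone.astype(np.float32), sr)
print("Wrote output.wav")
```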
### Step 4: Convert the model to ONNX

* Copy the following into `script/sensevoice_pth2onnx.py`, then change into the `script` directory and run `python sensevoice_pth2onnx.py` to export the ONNX model.

```python
#!/usr/bin/env python3
# -*- encoding: utf-8 -*-
"""
Export the SenseVoiceSmall PyTorch model to ONNX.
Follows the FastSpeech2 export approach to avoid emitting unsupported Tile ops.
"""

import os
import sys
import argparse

import torch

# Add the FunASR source tree to the import path
current_dir = os.path.dirname(os.path.abspath(__file__))
funasr_dir = os.path.abspath(os.path.join(current_dir, "../FunASR"))
if funasr_dir not in sys.path:
    sys.path.insert(0, funasr_dir)

from funasr import AutoModel


def export_sensevoice_onnx(
    model_dir: str,
    output_path: str,
    batch_size: int = 1,
    feats_length: int = 200,
    feats_dim: int = 560,
    opset_version: int = 11,
    device: str = "cpu",
):
    """
    Export the SenseVoiceSmall model to ONNX.

    Args:
        model_dir: model directory path
        output_path: output path for the ONNX model
        batch_size: batch size (static; used for the ATC hint printed in main)
        feats_length: feature length (static; used for the ATC hint)
        feats_dim: feature dimension (used for the ATC hint)
        opset_version: ONNX opset version
        device: device (cpu/cuda)
    """
    print(f"Loading model from {model_dir}...")
    model = AutoModel(model=model_dir, device=device)

    # Export the model (the deploy_modules are used automatically)
    print("Exporting model to ONNX...")
    export_dir = model.export(
        type="onnx",
        quantize=False,
        opset_version=opset_version,
        device=device,
    )

    # Locate the generated ONNX file
    onnx_file = os.path.join(export_dir, "model.onnx")
    if not os.path.exists(onnx_file):
        raise FileNotFoundError(f"ONNX file not found: {onnx_file}")

    # Copy it to the requested path if necessary
    if output_path and output_path != onnx_file:
        import shutil
        shutil.copy(onnx_file, output_path)
        print(f"ONNX model saved to: {output_path}")
    else:
        print(f"ONNX model saved to: {onnx_file}")
        output_path = onnx_file

    # Validate the ONNX model
    try:
        import onnx
        onnx_model = onnx.load(output_path)
        onnx.checker.check_model(onnx_model)
        print("✅ ONNX model validation passed!")

        # Check whether the graph contains Tile operations
        has_tile = any(node.op_type == "Tile" for node in onnx_model.graph.node)
        if has_tile:
            print("⚠️ Warning: ONNX model contains Tile operations, which may not be supported by ATC")
            print("   Please check if the deploy_modules are being used correctly")
        else:
            print("✅ No Tile operations found in ONNX model")
    except ImportError:
        print("⚠️ Warning: onnx not installed, skipping validation")
    except Exception as e:
        print(f"⚠️ Warning: ONNX validation failed: {e}")

    return output_path


def main():
    parser = argparse.ArgumentParser(description="Export SenseVoiceSmall to ONNX")
    parser.add_argument("--model_dir", type=str, default="../iic/SenseVoiceSmall",
                        help="Model directory or model name from ModelScope")
    parser.add_argument("--output_path", type=str, default="../model/sensevoice_small.onnx",
                        help="Output ONNX model path")
    parser.add_argument("--batch_size", type=int, default=1,
                        help="Static batch size for ONNX export")
    parser.add_argument("--feats_length", type=int, default=200,
                        help="Static feature length for ONNX export")
    parser.add_argument("--feats_dim", type=int, default=560,
                        help="Feature dimension")
    parser.add_argument("--opset_version", type=int, default=11,
                        help="ONNX opset version")
    parser.add_argument("--device", type=str, default="cpu", choices=["cpu", "cuda"],
                        help="Device for export")
    args = parser.parse_args()

    # Create the output directory
    output_dir = os.path.dirname(args.output_path)
    if output_dir and not os.path.exists(output_dir):
        os.makedirs(output_dir, exist_ok=True)

    # Export the model
    try:
        onnx_path = export_sensevoice_onnx(
            model_dir=args.model_dir,
            output_path=args.output_path,
            batch_size=args.batch_size,
            feats_length=args.feats_length,
            feats_dim=args.feats_dim,
            opset_version=args.opset_version,
            device=args.device,
        )
        print(f"\n✅ Export successful! ONNX model: {onnx_path}")
        print(f"\nNext steps:")
        print(f"1. Check the ONNX model: {onnx_path}")
        print(f"2. Convert to OM using ATC:")
        print(f"   atc --framework=5 --model=\"{onnx_path}\" \\")
        print(f"       --input_shape=\"speech:{args.batch_size},{args.feats_length},{args.feats_dim};speech_lengths:{args.batch_size};language:{args.batch_size};textnorm:{args.batch_size}\" \\")
        print(f"       --output=\"model/sensevoice_small\" --soc_version=OPTG")
    except Exception as e:
        print(f"\n❌ Export failed: {e}")
        import traceback
        traceback.print_exc()
        sys.exit(1)


if __name__ == "__main__":
    main()
```

![image-20260124154605701](pic/image-20260124154605701.png)
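* Before feeding the exported model to ATC, it can help to confirm that the graph's input names and shapes match the `--input_shape` string printed above. A minimal inspection sketch using the already-installed `onnx` package:

```python
# inspect_onnx_io.py - print the model's input/output names and shapes (hypothetical helper)
import onnx

model = onnx.load("../model/sensevoice_small.onnx")
for kind, tensors in (("input", model.graph.input), ("output", model.graph.output)):
    for t in tensors:
        dims = [d.dim_value if d.HasField("dim_value") else d.dim_param
                for d in t.type.tensor_type.shape.dim]
        print(f"{kind}: {t.name} shape={dims}")
```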
### Dump the ONNX layers (needed for testing)

* If you need to compare the OM model and the ONNX model layer by layer, copy the following into a `dump_sensevoice_onnx.py` script and run `python dump_sensevoice_onnx.py`. It creates an `onnx_dump` folder in the parent directory containing everything dumped from the ONNX model (see the sketch after this script for producing `file_list.json` and the input bins).

```python
import os
import json
import time
import logging

import numpy as np
import onnx
import onnx.shape_inference
import onnxruntime as ort

# Configure logging
logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s - %(levelname)s - %(message)s',
                    filename='dump_sensevoice_onnx.log',
                    filemode='w')
logger = logging.getLogger(__name__)

# Also log to the console
console = logging.StreamHandler()
console.setLevel(logging.INFO)
formatter = logging.Formatter('%(levelname)s - %(message)s')
console.setFormatter(formatter)
logger.addHandler(console)

# ----------------------------
# Configuration
# ----------------------------
MODEL_PATH = "../model/sensevoice_small.onnx"
DATA_DIR = "../data/bin"
FILE_LIST = os.path.join(DATA_DIR, "file_list.json")
DUMP_DIR = "../onnx_dump"
os.makedirs(DUMP_DIR, exist_ok=True)

# ----------------------------
# 1. Read file_list.json
# ----------------------------
logger.info("Reading file_list.json...")
with open(FILE_LIST, 'r') as f:
    file_list = json.load(f)
sample = file_list["fileList"][0]
logger.info(f"Selected sample: {sample}")

def load_bin(path, dtype, shape=None):
    data = np.fromfile(path, dtype=dtype)
    if shape is not None:
        data = data.reshape(shape)
    return data

# ----------------------------
# 2. Resolve the input file paths
# ----------------------------
file_list_dir = os.path.dirname(FILE_LIST)  # i.e. ../data/bin/
inputs = {}

# Load the model to get its input metadata
logger.info(f"Loading ONNX model from {MODEL_PATH}...")
model = onnx.load(MODEL_PATH)
input_info = {inp.name: inp for inp in model.graph.input}
logger.info(f"Model loaded successfully. Found {len(input_info)} inputs.")

for name in ["speech", "speech_lengths", "language", "textnorm"]:
    bin_path_rel = sample[name]
    bin_path_abs = os.path.abspath(os.path.join(file_list_dir, bin_path_rel))
    if not os.path.exists(bin_path_abs):
        logger.error(f"Input file not found: {bin_path_abs} (JSON path: {bin_path_rel})")
        raise FileNotFoundError(f"Input file not found: {bin_path_abs} (JSON path: {bin_path_rel})")

    onnx_input = input_info.get(name)
    if onnx_input is None:
        logger.error(f"Input '{name}' not found in ONNX model")
        raise ValueError(f"Input '{name}' not found in ONNX model")

    onnx_dtype = onnx_input.type.tensor_type.elem_type
    if onnx_dtype == onnx.TensorProto.FLOAT:
        dtype = np.float32
    elif onnx_dtype == onnx.TensorProto.INT32:
        dtype = np.int32
    else:
        logger.error(f"Unsupported dtype for input '{name}': {onnx_dtype}")
        raise NotImplementedError(f"Unsupported dtype for input '{name}': {onnx_dtype}")

    if name == "speech":
        file_size = os.path.getsize(bin_path_abs)
        num_elements = file_size // np.dtype(dtype).itemsize
        # Per sensevoice_pth2onnx.py, speech has shape (batch_size, feats_length, feats_dim);
        # the defaults are batch_size=1, feats_length=200, feats_dim=560.
        batch_size = 1
        feats_length = 200
        feats_dim = 560
        expected_elements = batch_size * feats_length * feats_dim
        if num_elements != expected_elements:
            logger.warning(f"Expected {expected_elements} elements for speech, got {num_elements}")
            logger.warning(f"Adjusting shape to match expected dimensions")
        shape = (batch_size, feats_length, feats_dim)
    elif name in ["speech_lengths", "language", "textnorm"]:
        shape = (1,)

    logger.info(f"Loading {name}: shape={shape}, dtype={dtype}")
    inputs[name] = load_bin(bin_path_abs, dtype, shape)
    logger.info(f"Successfully loaded {name} with shape {shape} and dtype {dtype}")

# ----------------------------
# 3. Run ONNX shape inference to recover richer intermediate-tensor metadata
# ----------------------------
logger.info("\nRunning shape inference on the model...")
inferred_model = onnx.shape_inference.infer_shapes(model)
logger.info("Shape inference completed successfully.")
# ----------------------------
# 4. Modify the model so every intermediate tensor becomes a graph output
# ----------------------------
logger.info("\nModifying model to expose all layer outputs...")

# Copy inferred_model, which carries the richer intermediate-tensor metadata
modified_model = onnx.ModelProto()
modified_model.CopyFrom(inferred_model)

# Remember the original model outputs
original_outputs = {output.name: output for output in model.graph.output}

# Clear the outputs, then re-add the originals first
modified_model.graph.ClearField("output")
for output in original_outputs.values():
    modified_model.graph.output.extend([output])

# Collect all intermediate tensor metadata known to the graph
all_value_info = {vi.name: vi for vi in modified_model.graph.value_info}

# Collect every tensor produced by a node
node_outputs = set()
for node in modified_model.graph.node:
    for output_name in node.output:
        node_outputs.add(output_name)

# Count of outputs added so far (starting from the originals)
added_outputs = len(original_outputs)

# Node lookup table for later processing
all_nodes = {node.name: node for node in modified_model.graph.node}

# 1. First expose the model inputs as outputs so they can be compared too
logger.info(f"\nAdding model inputs as outputs...")
for input_tensor in modified_model.graph.input:
    if input_tensor.name not in original_outputs:
        try:
            output = onnx.helper.make_tensor_value_info(
                input_tensor.name,
                input_tensor.type.tensor_type.elem_type,
                None
            )
            modified_model.graph.output.extend([output])
            added_outputs += 1
        except Exception as e:
            logger.warning(f"Failed to add input {input_tensor.name} as output: {e}")
            continue

# 2. Tensors that have value_info entries
logger.info(f"\nAdding value_info tensors as outputs...")
for value_name, value_info in all_value_info.items():
    if value_name in original_outputs:
        continue
    # Loosest possible filter: only skip obviously invalid names
    if value_name.startswith("external_data"):
        continue
    output = onnx.helper.make_tensor_value_info(
        value_name,
        value_info.type.tensor_type.elem_type,
        None  # shapes are inferred automatically by ONNX Runtime
    )
    modified_model.graph.output.extend([output])
    added_outputs += 1

# 3. Tensors produced by nodes but missing from value_info
logger.info(f"\nAdding node outputs as outputs...")
for output_name in node_outputs:
    if output_name in original_outputs or output_name in all_value_info:
        continue
    # Loosest possible filter
    if output_name.startswith("external_data"):
        continue
    # Try to build output metadata for these tensors: look up the producing
    # node and guess the element type from its inputs
    output_type = onnx.TensorProto.FLOAT  # default type
    for node in modified_model.graph.node:
        if output_name in node.output:
            if node.input:
                for input_name in node.input:
                    if input_name in all_value_info:
                        output_type = all_value_info[input_name].type.tensor_type.elem_type
                        break
                    elif input_name in original_outputs:
                        output_type = original_outputs[input_name].type.tensor_type.elem_type
                        break
            break
    try:
        output = onnx.helper.make_tensor_value_info(output_name, output_type, None)
        modified_model.graph.output.extend([output])
        added_outputs += 1
    except Exception as e:
        logger.warning(f"Failed to add output {output_name}: {e}")
        continue

logger.info(f"\nAdded {added_outputs} outputs to the model")

# Save the modified model to a temporary file
temp_model_path = "./temp_modified_model.onnx"
onnx.save(modified_model, temp_model_path)
logger.info(f"Modified model saved to {temp_model_path}")
# ----------------------------
# 5. Run inference and dump every tensor
# ----------------------------
logger.info("\nCreating ONNX Runtime session with modified model...")
try:
    sess = ort.InferenceSession(temp_model_path, providers=['CPUExecutionProvider'])
    logger.info("ONNX Runtime session created successfully.")
except Exception as e:
    logger.error(f"Failed to create ONNX Runtime session: {e}")
    raise

logger.info(f"Running inference with inputs: {list(inputs.keys())}")

# Map each node output to its (node name, output index), so the dump files
# can carry the correct op_name and output_index
node_output_map = {}
for node in modified_model.graph.node:
    for output_idx, output_name in enumerate(node.output):
        node_output_map[output_name] = (node.name, output_idx)
logger.info(f"Created node output map with {len(node_output_map)} entries.")

# All output names to fetch
all_tensor_names = [output.name for output in modified_model.graph.output]
logger.info(f"Total outputs to process: {len(all_tensor_names)}")

# Fetch outputs in batches of 10 to keep ONNX Runtime's optimizer load manageable
batch_size = 10
timestamp = str(round(time.time() * 1000000))
dumped_tensors = 0

total_batches = (len(all_tensor_names) + batch_size - 1) // batch_size
logger.info(f"Processing {total_batches} batches with batch size {batch_size}")

for i in range(0, len(all_tensor_names), batch_size):
    batch_names = all_tensor_names[i:i+batch_size]
    try:
        logger.info(f"\nRunning inference with batch {i//batch_size + 1}/{total_batches} ({len(batch_names)} outputs)...")
        outputs = sess.run(batch_names, inputs)
        logger.info(f"Inference completed for batch {i//batch_size + 1}. Got {len(outputs)} outputs.")

        for name, data in zip(batch_names, outputs):
            # Build file names in the format msaccucmp.py expects, matching the
            # OM dump naming: take op_name and output_index from the node map
            if name in node_output_map:
                op_name, output_index = node_output_map[name]
                # Use the node type as the file name prefix
                node_type = ""
                for node in modified_model.graph.node:
                    if name in node.output:
                        node_type = node.op_type
                        break
                # Sanitize op_name so it contains no special characters
                safe_op_name = op_name.replace('/', '_').replace(':', '_').replace('.', '_')
                # Prefix with the node type, consistent with the OM dump naming
                if node_type:
                    final_name = f"{node_type}.{safe_op_name}"
                else:
                    final_name = safe_op_name
            else:
                # No node mapping found: fall back to the tensor name
                safe_name = name.replace('/', '_').replace(':', '_').replace('.', '_')
                final_name = safe_name
                output_index = 0

            # Same layout as the OM dump: op_type.op_name.output_index.timestamp,
            # but keep the .npy extension so msaccucmp.py recognizes the files
            file_path = os.path.join(DUMP_DIR, f"{final_name}.{output_index}.{timestamp}.npy")

            # Keep dtypes consistent with the float16 OM dump
            if data.dtype in [np.float32, np.float64]:
                save_data = data.astype(np.float16)
            else:
                save_data = data

            np.save(file_path, save_data)
            logger.info(f"Saved: {file_path} | shape={data.shape}, dtype={data.dtype}")
            dumped_tensors += 1
    except Exception as e:
        logger.error(f"Error processing batch {i//batch_size + 1}: {e}")
        logger.error(f"Skipping batch: {batch_names}")
        continue

# Remove the temporary model file
os.remove(temp_model_path)
logger.info(f"\nRemoved temporary model file: {temp_model_path}")

logger.info(f"\n✅ ONNX layer dump completed. Dumped {dumped_tensors} tensors out of {len(all_tensor_names)} available.")
```
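* Both the dump script above and the board-side inference in Step 7 read their inputs from `data/bin/file_list.json` and the `.bin` files it references, which this guide does not otherwise show how to produce. The sketch below writes a compatible set of files under stated assumptions: `speech` is float32 with shape (1, 200, 560) (for SenseVoiceSmall this would be LFR-stacked fbank features; random values here are placeholders only), and the three int32 scalars follow the export defaults. The `language`/`textnorm` id values are model-specific, so treat the ones below as illustrative:

```python
# make_input_bins.py - write placeholder input bins plus file_list.json (hypothetical helper)
import json
import os

import numpy as np

out_dir = "../data/bin"
os.makedirs(out_dir, exist_ok=True)

# Placeholder features; a real run would use the same LFR fbank frontend
# that FunASR applies before the encoder.
arrays = {
    "speech": np.random.randn(1, 200, 560).astype(np.float32),
    "speech_lengths": np.array([200], dtype=np.int32),
    "language": np.array([0], dtype=np.int32),  # model-specific id (assumed here)
    "textnorm": np.array([0], dtype=np.int32),  # model-specific id (assumed here)
}

entry = {}
for name, arr in arrays.items():
    arr.tofile(os.path.join(out_dir, f"{name}.bin"))  # raw bytes, as load_bin() expects
    entry[name] = f"{name}.bin"                       # paths relative to file_list.json

with open(os.path.join(out_dir, "file_list.json"), "w") as f:
    json.dump({"fileList": [entry]}, f, indent=2)
print(f"Wrote input bins and file_list.json to {out_dir}")
```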
### Dump the OM layers (needed for testing)

* Copy the following into an `acl.json` file in the `src` directory. During subsequent board-side inference, every layer of the OM model will then be dumped to the directory given by `dump_path`.

```json
{
    "dump":{
        "dump_list":[
            {
                "model_name":"sensevoice_small"
            }
        ],
        "dump_path":"/root/mytest/output",
        "dump_mode":"all",
        "dump_op_switch":"off"
    }
}
```

### Step 5: Export the OM model with dlite

```sh
cd ~/work/modelzoo/samples/built-in/audio/SenseVoiceSmall
source /home/hispark/Ascend/dlite/ascend-toolkit/5.30.t13.7.b130/x86_64-linux/bin/setenv.bash
export DDK_PATH=/home/hispark/Ascend/dlite/ascend-toolkit/latest
export NPU_HOST_LIB=$DDK_PATH/acllib/lib64/stub
export NPU_INCLUDE_PATH=/home/hispark/Ascend/dlite/ascend-toolkit/latest/acllib/include/acl
export NPU_LIB_PATH=/home/hispark/Ascend/dlite/ascend-toolkit/latest/acllib/lib64/stub
export ASCEND_AICPU_PATH=/home/hispark/Ascend/dlite/ascend-toolkit/5.30.t13.7.b130/x86_64-linux

atc --framework=5 --model="./model/sensevoice_small.onnx" \
    --input_shape="speech:1,200,560;speech_lengths:1;language:1;textnorm:1" \
    --output="model/sensevoice_small" --soc_version=OPTG
```

![image-20260124155239309](pic/image-20260124155239309.png)

### Export the OM model and obtain its matching JSON file (needed for testing)

* Copy the following into a `config.cfg` file in the `model` directory. This turns off the named fusion pass; adjust it according to your actual situation and the names of the fusion passes involved.

```
LayerNormQuantFFNFusionPass:off
```

* Run the following command to export the OM model:

```sh
atc --framework=5 --model="./model/sensevoice_small.onnx" \
    --input_shape="speech:1,200,560;speech_lengths:1;language:1;textnorm:1" \
    --output="model/sensevoice_small_ffn_off" --soc_version=OPTG --fusion_switch_file="model/config.cfg"
```

* In the `model` directory, run the following command to obtain the JSON file matching the OM model:

```sh
atc --om=sensevoice_small_ffn_off.om --json=sensevoice_small_ffn_off.json --mode=1
```

### Step 6: Build the code

```sh
cd ~/work/modelzoo/samples/built-in/audio/SenseVoiceSmall
sudo ln -s /usr/lib/x86_64-linux-gnu/libisl.so.23.1.0 /usr/lib/x86_64-linux-gnu/libisl.so.19
conda activate modelzoo
source /home/hispark/Ascend/dlite/ascend-toolkit/5.30.t13.7.b130/x86_64-linux/bin/setenv.bash
export DDK_PATH=/home/hispark/Ascend/dlite/ascend-toolkit/latest
export NPU_HOST_LIB=$DDK_PATH/acllib/lib64/stub
export NPU_INCLUDE_PATH=/home/hispark/Ascend/dlite/ascend-toolkit/latest/acllib/include/acl
export NPU_LIB_PATH=/home/hispark/Ascend/dlite/ascend-toolkit/latest/acllib/lib64/stub
export ASCEND_AICPU_PATH=/home/hispark/Ascend/dlite/ascend-toolkit/5.30.t13.7.b130/x86_64-linux

mkdir build
cd build
rm -rf *
cmake ../src -DCMAKE_BUILD_TYPE=Release -DCMAKE_TOOLCHAIN_FILE=../../../../common/cmake/toolchain_aarch64_linux.cmake -DSOC_VERSION=OPTG
make -j
```

![image-20260124161055447](pic/image-20260124161055447.png)

### Step 7: Run inference on the board

* Copy the `main` executable from the `out` directory, the `acl.json` file from the `src` directory, the OM model from the `model` directory, and the `.bin` files from the `data` directory onto the board.

![image-20260127104338458](pic/image-20260127104338458.png)

* Also make sure the OpenCV libraries from the modelzoo directory, the SDK's MPP libraries, and the dlite libraries are copied to the board.

![image-20260127104501564](pic/image-20260127104501564.png)

![image-20260127104641580](pic/image-20260127104641580.png)

* On the board's command line, run the following to set up the library environment variables:

```sh
export LD_LIBRARY_PATH=/root/lib:/root/lib/mynpu:/root/lib64:/root/lib/aarch64_linux:$LD_LIBRARY_PATH
export ASCEND_AICPU_KERNEL_PATH=/root/lib/npu
```

* On the board's command line, run the following to perform model inference:

```sh
./main \
    --acl ./src/acl.json \
    --model ./model/sensevoice_small.om \
    --input ./data/bin/file_list.json
```

* In a testing run, the OM dump contents are generated in the `output` directory; otherwise, a `token_ids.txt` file containing the inference result is generated in the `output` directory.

![image-20260127105025224](pic/image-20260127105025224.png)
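* Before uploading `token_ids.txt` to the server for decoding in Step 8, a quick local check can catch truncated or malformed copies. The decoder expects a single line of space-separated integer ids; this hypothetical helper only validates that format:

```python
# check_token_ids.py - validate the token_ids.txt format (hypothetical helper)
with open("output/token_ids.txt", encoding="utf-8") as f:
    line = f.readline().strip()

token_ids = [int(x) for x in line.split()]
assert token_ids, "token_ids.txt is empty"
assert all(t >= 0 for t in token_ids), "negative token id found"
print(f"{len(token_ids)} token ids, range [{min(token_ids)}, {max(token_ids)}]")
```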
### Compare the dumped data (needed for testing)

* Upload the contents of `output` to the server, then run the command below on the server. The result is a CSV table; analyze it to determine at which layer the problem first appears.

```sh
python3 /home/hispark/Ascend/dlite/ascend-toolkit/5.30.t13.7.b130/x86_64-linux/tools/operator_cmp/compare/msaccucmp.py compare \
    -m /home/hispark/work/modelzoo/samples/built-in/audio/SenseVoiceSmall/om_dump/sensevoice_small/1/0 \
    -g /home/hispark/work/modelzoo/samples/built-in/audio/SenseVoiceSmall/onnx_dump/ \
    -f /home/hispark/work/modelzoo/samples/built-in/audio/SenseVoiceSmall/model/sensevoice_small_ffn_off.json
```

![image-20260127105658339](pic/image-20260127105658339.png)

![image-20260127110244059](pic/image-20260127110244059.png)
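* msaccucmp writes one row per operator with several similarity metrics. A hedged sketch for spotting the first suspicious layer, assuming the CSV exposes a `CosineSimilarity` column (header names vary between tool versions, so check your actual output and adjust):

```python
# find_first_divergence.py - scan the msaccucmp CSV for the first bad layer (hypothetical helper)
import csv
import sys

THRESHOLD = 0.99  # flag the first layer whose cosine similarity drops below this

with open(sys.argv[1], newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        value = row.get("CosineSimilarity", "").strip()  # column name is an assumption
        try:
            cos = float(value)
        except ValueError:
            continue  # skip rows without a numeric metric (e.g. "NaN")
        if cos < THRESHOLD:
            print("First layer below threshold:")
            print(row)
            break
```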
### Step 8: Decode on the server

* Copy the following into a `decode_tokens.py` script on the server.

```python
#!/usr/bin/env python3
# -*- encoding: utf-8 -*-
"""
Token ID decoding script.
Converts the token IDs produced by the C++ inference into text.
"""

import os
import sys
import argparse
from pathlib import Path
from typing import Optional, List

# Add the FunASR source tree to the import path
current_dir = os.path.dirname(os.path.abspath(__file__))
funasr_dir = os.path.abspath(os.path.join(current_dir, "../FunASR"))
if funasr_dir not in sys.path:
    sys.path.insert(0, funasr_dir)

try:
    from runtime.python.onnxruntime.funasr_onnx.utils.sentencepiece_tokenizer import SentencepiecesTokenizer
except ImportError:
    try:
        from funasr.utils.sentencepiece_tokenizer import SentencepiecesTokenizer
    except ImportError:
        raise ImportError("Cannot import SentencepiecesTokenizer; make sure funasr or funasr-onnx is installed")


def _find_default_tokenizer_file(model_dir: Optional[str]) -> Optional[str]:
    """
    Try the common locations for the SenseVoiceSmall tokenizer file.
    Priority:
      1) model_dir/chn_jpn_yue_eng_ko_spectok.bpe.model
      2) the scoring directory of the FunASR runtime triton repo inside this project
      3) a project-wide search (if several are found, print a hint and return the first)
    """
    filename = "chn_jpn_yue_eng_ko_spectok.bpe.model"

    if model_dir:
        p = Path(model_dir) / filename
        if p.exists():
            return str(p)

    # repo root = script/.. (SenseVoiceSmall/)
    repo_root = Path(__file__).resolve().parent.parent

    # Common paths (these actually exist in this project)
    candidates: List[Path] = [
        repo_root / "FunASR" / "runtime" / "triton_gpu" / "model_repo_sense_voice_small" / "scoring" / filename,
        repo_root / filename,
    ]
    for c in candidates:
        if c.exists():
            return str(c)

    # Fallback: search the whole project
    hits = list(repo_root.rglob(filename))
    if len(hits) == 1:
        return str(hits[0])
    if len(hits) > 1:
        print("⚠️ Multiple tokenizer files found in the project; please pass --tokenizer_file explicitly:")
        for h in hits[:20]:
            print(f"   - {h}")
        if len(hits) > 20:
            print(f"   ... and {len(hits) - 20} more")
        return str(hits[0])
    return None


def decode_token_ids(token_file: str, model_dir: str = None, tokenizer_file: str = None):
    """
    Read token IDs from a file and decode them into text.

    Args:
        token_file: path to the token ID file (one line of space-separated integers)
        model_dir: model directory (containing the tokenizer file)
        tokenizer_file: tokenizer file path (takes priority if given)
    """
    # Read the token IDs
    with open(token_file, 'r', encoding='utf-8') as f:
        line = f.readline().strip()
        if not line:
            print("❌ Error: token file is empty")
            return None
        token_ids = [int(x) for x in line.split()]

    print(f"Read {len(token_ids)} token IDs")
    print(f"Token IDs: {token_ids[:20]}..." if len(token_ids) > 20 else f"Token IDs: {token_ids}")

    # Load the tokenizer
    if tokenizer_file is None:
        tokenizer_file = _find_default_tokenizer_file(model_dir)
    if tokenizer_file is None or not os.path.exists(tokenizer_file):
        print("❌ Error: cannot find the tokenizer file (chn_jpn_yue_eng_ko_spectok.bpe.model)")
        print("   Pass --tokenizer_file with the full path, or point --model_dir at a directory containing it.")
        print("   In this project it usually lives under: FunASR/runtime/triton_gpu/model_repo_sense_voice_small/scoring/")
        return None

    # Check that the tokenizer file is a real SentencePiece model.
    # A real model is usually much larger than 1 KB; a tiny file is often a
    # symlink placeholder (this repo contains one whose content is an external absolute path).
    try:
        p = Path(tokenizer_file)
        size = p.stat().st_size
        if size < 1024:
            head = p.read_text(encoding="utf-8", errors="ignore").strip()
            print(f"❌ Error: the tokenizer file does not look like a valid SentencePiece model (only {size} bytes).")
            if head.startswith("/") or head.startswith("\\") or ":" in head:
                print("   Its content looks like an external path / symlink placeholder:")
                print(f"   {head}")
            print("   Fix: copy the real `chn_jpn_yue_eng_ko_spectok.bpe.model` into the directory given by --model_dir,")
            print("   or point --tokenizer_file directly at the real file.")
            print("   See FunASR/runtime/triton_gpu/README.md, which states this file must be downloaded/prepared by you.")
            return None
    except Exception as e:
        print(f"⚠️ Warning: tokenizer self-check failed (will still try to load it): {e}")

    print(f"Loading tokenizer: {tokenizer_file}")
    tokenizer = SentencepiecesTokenizer(bpemodel=tokenizer_file)

    # Key consistency check: vocab_size vs. the token id range
    try:
        vocab_size = tokenizer.get_vocab_size()
        max_id = max(token_ids) if token_ids else -1
        min_id = min(token_ids) if token_ids else -1
        print(f"tokenizer vocab_size: {vocab_size}, token_id range: [{min_id}, {max_id}]")
        if max_id >= vocab_size:
            print("❌ Error: a token_id exceeds the tokenizer's vocabulary size; the tokenizer does not match the model output.")
            print("   Fix: use the same .bpe.model file that was used when generating the OM/ONNX model.")
            return None
    except Exception as e:
        print(f"⚠️ Warning: cannot read the tokenizer vocab_size for the consistency check: {e}")

    # Decode
    text = tokenizer.decode(token_ids)
    print("=" * 60)
    print(f"Recognition result: {text}")
    print("=" * 60)
    return text


def main():
    parser = argparse.ArgumentParser(description="Decode token IDs into text")
    parser.add_argument("--token_file", type=str, required=True,
                        help="Path to the token ID file")
    parser.add_argument("--model_dir", type=str, default=None,
                        help="Model directory (containing the tokenizer file)")
    parser.add_argument("--tokenizer_file", type=str, default=None,
                        help="Tokenizer file path (takes priority if given)")
    parser.add_argument("--output", type=str, default=None,
                        help="Output text file path (optional)")
    args = parser.parse_args()

    if not os.path.exists(args.token_file):
        print(f"❌ Error: token file does not exist: {args.token_file}")
        sys.exit(1)

    try:
        text = decode_token_ids(
            token_file=args.token_file,
            model_dir=args.model_dir,
            tokenizer_file=args.tokenizer_file
        )
        if text and args.output:
            with open(args.output, 'w', encoding='utf-8') as f:
                f.write(text + '\n')
            print(f"\n✅ Text saved to: {args.output}")
    except Exception as e:
        print(f"❌ Decoding failed: {e}")
        import traceback
        traceback.print_exc()
        sys.exit(1)


if __name__ == "__main__":
    main()
```

* Upload the `token_ids.txt` from the previous step to the server's `output` directory, then run the following in a server terminal to decode it and see how the speech recognition performs:

```sh
python decode_tokens.py \
    --token_file ../output/token_ids.txt \
    --model_dir ../
```

* The problem currently encountered is that the decoded output is garbled.

![image-20260312191405127](pic/image-20260312191405127.png)

## 4. Custom Operator Environment Setup

### Step 1: Download the custom operator sample repository

```sh
git clone -b r.ss928.1 https://gitee.com/ascend/samples.git
```

![image-20260124104439628](pic/image-20260124104439628.png)

### Step 2: Create a virtual environment

```sh
conda create -n py375 python=3.7.5 -y
conda activate py375
ln -s /home/hispark/miniconda3/envs/py375/bin/python3.7 /home/hispark/miniconda3/envs/py375/bin/python3.7.5
pip install numpy decorator sympy
```

### Step 3: Develop the ONNX custom operator
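* The operator implementation itself follows the `op_dev/1_custom_op` sample in this repository and is not reproduced here. As a minimal illustration of the ONNX side only, the hedged sketch below builds a tiny graph containing a node in a custom operator domain (the op type and domain names are hypothetical); a model like this is what the custom-op flow ultimately has to map onto the operator package built in Step 4:

```python
# make_custom_op_graph.py - toy ONNX graph with a custom-domain node (hypothetical example)
import onnx
from onnx import TensorProto, helper

# A node whose op_type/domain are not standard ONNX; the toolchain resolves
# such nodes against an installed custom operator package.
node = helper.make_node(
    "MyCustomOp",               # hypothetical op type
    inputs=["x"], outputs=["y"],
    domain="custom.domain",     # hypothetical custom domain
)
graph = helper.make_graph(
    [node], "custom_op_demo",
    [helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 16])],
    [helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 16])],
)
model = helper.make_model(
    graph,
    opset_imports=[helper.make_opsetid("", 11),
                   helper.make_opsetid("custom.domain", 1)],
)
onnx.save(model, "custom_op_demo.onnx")
print("Wrote custom_op_demo.onnx")
```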
### Step 4: Build the custom operator

```sh
source /home/hispark/Ascend/dlite/ascend-toolkit/5.30.t13.7.b130/x86_64-linux/bin/setenv.bash
export DDK_PATH=/home/hispark/Ascend/dlite/ascend-toolkit/latest
export NPU_HOST_LIB=$DDK_PATH/acllib/lib64/stub
export NPU_INCLUDE_PATH=/home/hispark/Ascend/dlite/ascend-toolkit/latest/acllib/include/acl
export NPU_LIB_PATH=/home/hispark/Ascend/dlite/ascend-toolkit/latest/acllib/lib64/stub
export ASCEND_AICPU_PATH=/home/hispark/Ascend/dlite/ascend-toolkit/5.30.t13.7.b130/x86_64-linux
export ASCEND_TENSOR_COMPILER_INCLUDE=/home/hispark/Ascend/dlite/ascend-toolkit/latest/x86_64-linux/compiler/include

cd ~/work/modelzoo/samples/built-in/audio/SenseVoiceSmall/op_dev/1_custom_op/
./build.sh -t
cd build_out
./custom_opp_ubuntu_x86_64.run
```

![image-20260124163354050](pic/image-20260124163354050.png)

![image-20260124164531872](pic/image-20260124164531872.png)

### Step 5: Test the custom operator

```sh
cd ~/work/modelzoo/samples/built-in/audio/SenseVoiceSmall/op_dev/2_verify_op
# Using acl_execute_conv2d as an example
cd acl_execute_conv2d/src
mkdir build
cd build
# Build the test executable
cmake .. -DCMAKE_TOOLCHAIN_FILE=../../../toolchain_aarch64_linux.cmake -DSOC_VERSION=OPTG
make
# Generate the test input binaries
cd ../../run/out/test_data/data/
python generate_conv2d.py
# Generate a single-operator OM model for the custom op
cd ../../
atc --singleop=test_data/config/conv2d_tik_op.json --soc_version=OPTG --output=op_models
```

![image-20260124163752636](pic/image-20260124163752636.png)

![image-20260124164028641](pic/image-20260124164028641.png)

![image-20260124165013777](pic/image-20260124165013777.png)
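* After the single-operator OM model has been run on the device, the usual final step is comparing its output binary against golden data. A minimal sketch, assuming both files are raw float16 buffers of equal size (the file names are hypothetical):

```python
# compare_bins.py - compare a device output .bin against golden data (hypothetical helper)
import numpy as np

out = np.fromfile("output_0.bin", dtype=np.float16).astype(np.float32)
ref = np.fromfile("golden_0.bin", dtype=np.float16).astype(np.float32)
assert out.size == ref.size, "size mismatch between output and golden data"

abs_err = np.abs(out - ref)
cos = float(np.dot(out, ref) / (np.linalg.norm(out) * np.linalg.norm(ref) + 1e-12))
print(f"max abs err: {abs_err.max():.6f}, mean abs err: {abs_err.mean():.6f}, cosine: {cos:.6f}")
```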