当前位置：首页 > news >正文

免费网站建设的seo推广名词解释

news 2025/11/15 8:32:15

免费网站建设的,seo推广名词解释,网页设计公司创业计划书,wordpress如何编辑首页布局1 简介 DeepSeek-Coder在多种编程语言和各种基准测试中取得了开源代码模型中最先进的性能。为尝试在开发板进行部署#xff0c;首先利用llama.cpp对其进行量化。 2 llama.cpp安装 git clone之后进入文件夹make即可#xff0c;再将依赖补全pip install -r requirements.tx…1 简介 DeepSeek-Coder在多种编程语言和各种基准测试中取得了开源代码模型中最先进的性能。为尝试在开发板进行部署首先利用llama.cpp对其进行量化。 2 llama.cpp安装 git clone之后进入文件夹make即可再将依赖补全pip install -r requirements.txt 3 量化按照GitHub上DeepSeek和llama.cpp官方的信息后者对deepseek模型的量化目前的支持进度还不是很完善。下面记录一下目前量化出现的问题。 3.1 DeepSeek官方tutorial 依照官方md git clone https://github.com/DOGEwbx/llama.cpp.git cd llama.cpp git checkout regex_gpt2_preprocess出现error: pathspec regex_gpt2_preprocess did not match any file(s) known to git # set up the environment according to README make python3 -m pip install -r requirements.txt # generate GGUF model python convert-hf-to-gguf.py MODEL_PATH --outfile GGUF_PATH --model-name deepseekcoder出现convert-hf-to-gguf.py: error: unrecognized arguments: --model-name deepseekcoder 去掉--model-name参数出现NotImplementedError: Architecture LlamaForCausalLM not supported!解释。 3.2 convert.py转换参考这个comment和这个comment使用convert.py进行转换。看起来这个修改已经被合并了浅浅试一下。 python convert.py MODEL_PATH --outfile GGUF_PATH出现错误: Exception: Vocab size mismatch (model has 32256, but ../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct has 32022). Add the --pad-vocab option and try again. 详细的log如下 Loading model file ../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct/model.safetensors params Params(n_vocab32256, n_embd2048, n_layer24, n_ctx16384, n_ff5504, n_head16, n_head_kv16, n_expertsNone, n_experts_usedNone, f_norm_eps1e-06, rope_scaling_typeRopeScalingType.LINEAR: linear, f_rope_freq_base100000, f_rope_scale4.0, n_orig_ctxNone, rope_finetunedNone, ftypeNone, path_modelPosixPath(../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct)) Found vocab files: {spm: None, bpe: None, hfft: PosixPath(../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct/tokenizer.json)} Loading vocab file PosixPath(../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct/tokenizer.json), type hfft fname_tokenizer: ../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. Vocab info: HfVocab with 32000 base tokens and 22 added tokens Special vocab info: SpecialVocab with 0 merges, special tokens {bos: 32013, eos: 32021, pad: 32014}, add special tokens {bos: True, eos: False} Permuting layer 0 Permuting layer 1 Permuting layer 2 ...省略部分 Permuting layer 22 Permuting layer 23 lm_head.weight - output.weight | BF16 | [32256, 2048] model.embed_tokens.weight - token_embd.weight | BF16 | [32256, 2048] model.layers.0.input_layernorm.weight - blk.0.attn_norm.weight | BF16 | [2048] model.layers.0.mlp.down_proj.weight - blk.0.ffn_down.weight | BF16 | [2048, 5504] model.layers.0.mlp.gate_proj.weight - blk.0.ffn_gate.weight | BF16 | [5504, 2048] ... model.layers.18.self_attn.v_proj.weight - blk.18.attn_v.weight | BF16 | [2048, 2048] model.layers.19.input_layernorm.weight - blk.19.attn_norm.weight | BF16 | [2048] ... model.layers.9.input_layernorm.weight - blk.9.attn_norm.weight | BF16 | [2048] model.layers.9.mlp.down_proj.weight - blk.9.ffn_down.weight | BF16 | [2048, 5504] model.layers.9.mlp.gate_proj.weight - blk.9.ffn_gate.weight | BF16 | [5504, 2048] model.layers.9.mlp.up_proj.weight - blk.9.ffn_up.weight | BF16 | [5504, 2048] model.layers.9.post_attention_layernorm.weight - blk.9.ffn_norm.weight | BF16 | [2048] model.layers.9.self_attn.k_proj.weight - blk.9.attn_k.weight | BF16 | [2048, 2048] model.layers.9.self_attn.o_proj.weight - blk.9.attn_output.weight | BF16 | [2048, 2048] model.layers.9.self_attn.q_proj.weight - blk.9.attn_q.weight | BF16 | [2048, 2048] model.layers.9.self_attn.v_proj.weight - blk.9.attn_v.weight | BF16 | [2048, 2048] model.norm.weight - output_norm.weight | BF16 | [2048] Writing ../DeepSeek-Coder/models/1.3b.gguf, format 1 Traceback (most recent call last):File /home/stlinpeiyang/lpy22/LLM/llama.cpp/convert.py, line 1479, in modulemain()File /home/stlinpeiyang/lpy22/LLM/llama.cpp/convert.py, line 1473, in mainOutputFile.write_all(outfile, ftype, params, model, vocab, special_vocab,File /home/stlinpeiyang/lpy22/LLM/llama.cpp/convert.py, line 1117, in write_allcheck_vocab_size(params, vocab, pad_vocabpad_vocab)File /home/stlinpeiyang/lpy22/LLM/llama.cpp/convert.py, line 963, in check_vocab_sizeraise Exception(msg) Exception: Vocab size mismatch (model has 32256, but ../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct has 32022). Add the --pad-vocab option and try again.3.2.1 添加--pad-vocab 首先显然提示添加参数根据提示加上--pad-vocab参数后成功运行并可以成功量化但是在测试时会出现以下错误 terminate called after throwing an instance of std::out_of_rangewhat(): _Map_base::at Aborted (core dumped)这种情况有相关的issue comment这个。从llama.cpp的pull request和issue来看应该是还没处理好。菜鸡只能嗷嗷待哺了。不知道TheBloke大佬是怎么处理的。表情网站 3.2.2 修改vocab_size 其次根据错误的前半段的model has 32256, but ... has 32022有类似的issue. 根据comment对vocal_size进行修改。相应地打开deepseek-coder-1.3b-instruct中的config.json文件试将vocab_size: 32256修改为vocal_size: 32022。再次运行 python convert.py MODEL_PATH --outfile GGUF_PATH输出的log如下 Loading model file ../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct/model.safetensors params Params(n_vocab32022, n_embd2048, n_layer24, n_ctx16384, n_ff5504, n_head16, n_head_kv16, n_expertsNone, n_experts_usedNone, f_norm_eps1e-06, rope_scaling_typeRopeScalingType.LINEAR: linear, f_rope_freq_base100000, f_rope_scale4.0, n_orig_ctxNone, rope_finetunedNone, ftypeNone, path_modelPosixPath(../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct)) Found vocab files: {spm: None, bpe: None, hfft: PosixPath(../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct/tokenizer.json)} Loading vocab file PosixPath(../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct/tokenizer.json), type hfft fname_tokenizer: ../DeepSeek-Coder/models/deepseek-coder-1.3b-instruct Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained. Vocab info: HfVocab with 32000 base tokens and 22 added tokens Special vocab info: SpecialVocab with 0 merges, special tokens {bos: 32013, eos: 32021, pad: 32014}, add special tokens {bos: True, eos: False} Permuting layer 0 Permuting layer 1 Permuting layer 2 ...省略部分 lm_head.weight - output.weight | BF16 | [32256, 2048] model.embed_tokens.weight - token_embd.weight | BF16 | [32256, 2048] model.layers.0.input_layernorm.weight - blk.0.attn_norm.weight | BF16 | [2048] model.layers.0.mlp.down_proj.weight - blk.0.ffn_down.weight | BF16 | [2048, 5504] model.layers.0.mlp.gate_proj.weight - blk.0.ffn_gate.weight | BF16 | [5504, 2048] model.layers.0.mlp.up_proj.weight - blk.0.ffn_up.weight | BF16 | [5504, 2048] model.layers.0.post_attention_layernorm.weight - blk.0.ffn_norm.weight | BF16 | [2048] model.layers.0.self_attn.k_proj.weight - blk.0.attn_k.weight | BF16 | [2048, 2048] model.layers.0.self_attn.o_proj.weight - blk.0.attn_output.weight | BF16 | [2048, 2048] model.layers.0.self_attn.q_proj.weight - blk.0.attn_q.weight | BF16 | [2048, 2048] model.layers.0.self_attn.v_proj.weight - blk.0.attn_v.weight ...省略部分 model.layers.9.self_attn.q_proj.weight - blk.9.attn_q.weight | BF16 | [2048, 2048] model.layers.9.self_attn.v_proj.weight - blk.9.attn_v.weight | BF16 | [2048, 2048] model.norm.weight - output_norm.weight | BF16 | [2048] Writing ../DeepSeek-Coder/models/1.3b.gguf, format 1 Ignoring added_tokens.json since model matches vocab size without it. gguf: This GGUF file is for Little Endian only gguf: Setting special token type bos to 32013 gguf: Setting special token type eos to 32021 gguf: Setting special token type pad to 32014 gguf: Setting add_bos_token to True gguf: Setting add_eos_token to False gguf: Setting chat_template to {% if not add_generation_prompt is defined %} {% set add_generation_prompt false %} {% endif %} {%- set ns namespace(foundfalse) -%} {%- for message in messages -%}{%- if message[role] system -%}{%- set ns.found true -%}{%- endif -%} {%- endfor -%} {{bos_token}}{%- if not ns.found -%} {{You are an AI programming assistant, utilizing the Deepseek Coder model, developed by Deepseek Company, and you only answer questions related to computer science. For politically sensitive questions, security and privacy issues, and other non-computer science questions, you will refuse to answer\n}} {%- endif %} {%- for message in messages %}{%- if message[role] system %} {{ message[content] }}{%- else %}{%- if message[role] user %} {{### Instruction:\n message[content] \n}}{%- else %} {{### Response:\n message[content] \n|EOT|\n}}{%- endif %}{%- endif %} {%- endfor %} {% if add_generation_prompt %} {{### Response:}} {% endif %} [ 1/219] Writing tensor output.weight | size 32256 x 2048 | type F16 | T 0 [ 2/219] Writing tensor token_embd.weight | size 32256 x 2048 | type F16 | T 0 ...省略部分 [216/219] Writing tensor blk.9.attn_output.weight | size 2048 x 2048 | type F16 | T 2 [217/219] Writing tensor blk.9.attn_q.weight | size 2048 x 2048 | type F16 | T 2 [218/219] Writing tensor blk.9.attn_v.weight | size 2048 x 2048 | type F16 | T 2 [219/219] Writing tensor output_norm.weight | size 2048 | type F32 | T 2 Wrote ../DeepSeek-Coder/models/1.3b.gguf成功生成gguf文件。下一步进行量化 ./quantize ${out_model.gguf} ${out_model-q5_0.gguf} q5_0输出log如下 main: build 1 (231ae28) main: built with cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 for x86_64-linux-gnu main: quantizing ../DeepSeek-Coder/models/1.3b.gguf to ../DeepSeek-Coder/models/1.3b-q5_0.gguf as Q5_0 llama_model_loader: loaded meta data with 24 key-value pairs and 219 tensors from ../DeepSeek-Coder/models/1.3b.gguf (version GGUF V3 (latest)) llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output. llama_model_loader: - kv 0: general.architecture str llama llama_model_loader: - kv 1: general.name str models llama_model_loader: - kv 2: llama.context_length u32 16384 llama_model_loader: - kv 3: llama.embedding_length u32 2048 llama_model_loader: - kv 4: llama.block_count u32 24 llama_model_loader: - kv 5: llama.feed_forward_length u32 5504 llama_model_loader: - kv 6: llama.rope.dimension_count u32 128 llama_model_loader: - kv 7: llama.attention.head_count u32 16 llama_model_loader: - kv 8: llama.attention.head_count_kv u32 16 llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 0.000001 llama_model_loader: - kv 10: llama.rope.freq_base f32 100000.000000 llama_model_loader: - kv 11: llama.rope.scaling.type str linear llama_model_loader: - kv 12: llama.rope.scaling.factor f32 4.000000 llama_model_loader: - kv 13: general.file_type u32 1 llama_model_loader: - kv 14: tokenizer.ggml.model str llama llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,32022] [!, \, #, $, %, , , ... llama_model_loader: - kv 16: tokenizer.ggml.scores arr[f32,32022] [-1000.000000, -1000.000000, -1000.00... llama_model_loader: - kv 17: tokenizer.ggml.token_type arr[i32,32022] [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... llama_model_loader: - kv 18: tokenizer.ggml.bos_token_id u32 32013 llama_model_loader: - kv 19: tokenizer.ggml.eos_token_id u32 32021 llama_model_loader: - kv 20: tokenizer.ggml.padding_token_id u32 32014 llama_model_loader: - kv 21: tokenizer.ggml.add_bos_token bool true llama_model_loader: - kv 22: tokenizer.ggml.add_eos_token bool false llama_model_loader: - kv 23: tokenizer.chat_template str {% if not add_generation_prompt is de... llama_model_loader: - type f32: 49 tensors llama_model_loader: - type f16: 170 tensors llama_model_quantize_internal: meta size 767616 bytes [ 1/ 219] output.weight - [ 2048, 32256, 1, 1], type f16, quantizing to q6_K .. size 126.00 MiB - 51.68 MiB [ 2/ 219] token_embd.weight - [ 2048, 32256, 1, 1], type f16, quantizing to q5_0 .. size 126.00 MiB - 43.31 MiB | hist: 0.040 0.018 0.028 0.043 0.061 0.082 0.101 0.114 0.117 0.109 0.092 0.072 0.052 0.035 0.022 0.016 ... [ 218/ 219] blk.9.attn_v.weight - [ 2048, 2048, 1, 1], type f16, quantizing to q5_0 .. size 8.00 MiB - 2.75 MiB | hist: 0.040 0.017 0.028 0.042 0.060 0.081 0.101 0.116 0.121 0.109 0.091 0.071 0.051 0.034 0.022 0.016 [ 219/ 219] output_norm.weight - [ 2048, 1, 1, 1], type f32, size 0.008 MB llama_model_quantize_internal: model size 2568.38 MB llama_model_quantize_internal: quant size 891.50 MB llama_model_quantize_internal: hist: 0.040 0.017 0.028 0.043 0.061 0.082 0.101 0.114 0.118 0.109 0.092 0.071 0.051 0.035 0.022 0.016main: quantize time 9300.54 ms main: total time 9300.54 ms进行测试 ./main -m ../DeepSeek-Coder/models/1.3b-q5_0.gguf -n 256 -t 18 --repeat_penalty 1.0 --color -i -r User: -f ./prompts/chat-with-bob.txt -ngl 20加载模型失败. warning: not compiled with GPU offload support, --n-gpu-layers option will be ignored warning: see main README.md for information on enabling GPU BLAS support Log start main: build 1 (231ae28) main: built with cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 for x86_64-linux-gnu main: seed 1710571501 llama_model_loader: loaded meta data with 25 key-value pairs and 219 tensors from ../DeepSeek-Coder/models/1.3b-q5_0.gguf (version GGUF V3 (latest)) llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output. llama_model_loader: - kv 0: general.architecture str llama llama_model_loader: - kv 1: general.name str models llama_model_loader: - kv 2: llama.context_length u32 16384 llama_model_loader: - kv 3: llama.embedding_length u32 2048 llama_model_loader: - kv 4: llama.block_count u32 24 llama_model_loader: - kv 5: llama.feed_forward_length u32 5504 llama_model_loader: - kv 6: llama.rope.dimension_count u32 128 llama_model_loader: - kv 7: llama.attention.head_count u32 16 llama_model_loader: - kv 8: llama.attention.head_count_kv u32 16 llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 0.000001 llama_model_loader: - kv 10: llama.rope.freq_base f32 100000.000000 llama_model_loader: - kv 11: llama.rope.scaling.type str linear llama_model_loader: - kv 12: llama.rope.scaling.factor f32 4.000000 llama_model_loader: - kv 13: general.file_type u32 8 llama_model_loader: - kv 14: tokenizer.ggml.model str llama llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,32022] [!, \, #, $, %, , , ... llama_model_loader: - kv 16: tokenizer.ggml.scores arr[f32,32022] [-1000.000000, -1000.000000, -1000.00... llama_model_loader: - kv 17: tokenizer.ggml.token_type arr[i32,32022] [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... llama_model_loader: - kv 18: tokenizer.ggml.bos_token_id u32 32013 llama_model_loader: - kv 19: tokenizer.ggml.eos_token_id u32 32021 llama_model_loader: - kv 20: tokenizer.ggml.padding_token_id u32 32014 llama_model_loader: - kv 21: tokenizer.ggml.add_bos_token bool true llama_model_loader: - kv 22: tokenizer.ggml.add_eos_token bool false llama_model_loader: - kv 23: tokenizer.chat_template str {% if not add_generation_prompt is de... llama_model_loader: - kv 24: general.quantization_version u32 2 llama_model_loader: - type f32: 49 tensors llama_model_loader: - type q5_0: 169 tensors llama_model_loader: - type q6_K: 1 tensors llm_load_vocab: SPM vocabulary, but newline token not found: _Map_base::at! Using special_pad_id instead.llm_load_vocab: mismatch in special tokens definition ( 9/32022 vs 22/32022 ). llm_load_print_meta: format GGUF V3 (latest) llm_load_print_meta: arch llama llm_load_print_meta: vocab type SPM llm_load_print_meta: n_vocab 32022 llm_load_print_meta: n_merges 0 llm_load_print_meta: n_ctx_train 16384 llm_load_print_meta: n_embd 2048 llm_load_print_meta: n_head 16 llm_load_print_meta: n_head_kv 16 llm_load_print_meta: n_layer 24 llm_load_print_meta: n_rot 128 llm_load_print_meta: n_embd_head_k 128 llm_load_print_meta: n_embd_head_v 128 llm_load_print_meta: n_gqa 1 llm_load_print_meta: n_embd_k_gqa 2048 llm_load_print_meta: n_embd_v_gqa 2048 llm_load_print_meta: f_norm_eps 0.0e00 llm_load_print_meta: f_norm_rms_eps 1.0e-06 llm_load_print_meta: f_clamp_kqv 0.0e00 llm_load_print_meta: f_max_alibi_bias 0.0e00 llm_load_print_meta: n_ff 5504 llm_load_print_meta: n_expert 0 llm_load_print_meta: n_expert_used 0 llm_load_print_meta: pooling type 0 llm_load_print_meta: rope type 0 llm_load_print_meta: rope scaling linear llm_load_print_meta: freq_base_train 100000.0 llm_load_print_meta: freq_scale_train 0.25 llm_load_print_meta: n_yarn_orig_ctx 16384 llm_load_print_meta: rope_finetuned unknown llm_load_print_meta: model type ?B llm_load_print_meta: model ftype Q5_0 llm_load_print_meta: model params 1.35 B llm_load_print_meta: model size 891.50 MiB (5.55 BPW) llm_load_print_meta: general.name models llm_load_print_meta: BOS token 32013 begin▁of▁sentence llm_load_print_meta: EOS token 32021 |EOT| llm_load_print_meta: UNK token 0 ! llm_load_print_meta: PAD token 32014 end▁of▁sentence llm_load_tensors: ggml ctx size 0.08 MiB llama_model_load: error loading model: create_tensor: tensor token_embd.weight has wrong shape; expected 2048, 32022, got 2048, 32256, 1, 1 llama_load_model_from_file: failed to load model llama_init_from_gpt_params: error: failed to load model ../DeepSeek-Coder/models/1.3b-q5_0.gguf main: error: unable to load model看错误llama_model_load: error loading model: create_tensor: tensor token_embd.weight has wrong shape; expected 2048, 32022, got 2048, 32256, 1, 1应该是跟前面修改的vocab-size有关。

查看全文

http://www.zqtcl.cn/news/681022/