Ollama is an open-source framework for running large language models (LLMs) locally. It supports many operating systems, but FreeBSD is not one of them, so I set out to build and install it on FreeBSD.
Conclusion first: the official ollama did not build, but a patched fork installs successfully. Since the fork modifies the code, to stay safe all of this was done inside a FreeBSD jail.
First, install the latest Go:
pkg install go122-1.22.5 cmake
That turned out not to be enough, so the default go package was installed as well (it emerged that the versioned package is invoked with the go122 command):
pkg install go
But that version is too old.
Try a newer release instead. Download: https://go.dev/dl/go1.22.5.freebsd-amd64.tar.gz
wget https://go.dev/dl/go1.22.5.freebsd-amd64.tar.gz
Unpack it:
tar -xzvf go1.22.5.freebsd-amd64.tar.gz
Add it to the PATH:
export PATH=/home/skywalk/work/go/bin:$PATH
Now go is 1.22.5:
- $ go version
- go version go1.22.5 freebsd/amd64
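To make the new PATH stick across sessions, one option is to append the export to the shell startup file (a sketch, assuming a bourne-style login shell and the same unpack location as above):
- echo 'export PATH=/home/skywalk/work/go/bin:$PATH' >> ~/.profile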
Speed up Go module downloads:
- # Set the GOPROXY environment variable
- export GOPROXY=https://goproxy.io,direct
- # Set environment variable allow bypassing the proxy for specified repos (optional)
- export GOPRIVATE=git.mycompany.com,github.com/my/private
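The settings can be verified with go env, which prints the current value of each named variable:
- go env GOPROXY GOPRIVATE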
Clone ollama from the official repo:
git clone https://github.com/ollama/ollama
Generate:
go generate ./...
Build:
go build .
But the build failed here, ending with:
- skywalk@fbhost:~/github/ollama $ go build .
- package github.com/ollama/ollama
- imports github.com/ollama/ollama/cmd
- imports github.com/ollama/ollama/server
- imports github.com/ollama/ollama/gpu: C source files not allowed when not using cgo or SWIG: gpu_info_cudart.c gpu_info_nvcuda.c gpu_info_nvml.c gpu_info_oneapi.c
Create a FreeBSD jail and log in:
# cbsd jlogin fb12
The login shell is csh; if you find it awkward, you can switch to bash.
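A sketch of the switch, assuming bash is not installed yet and the jail user is skywalk (adjust the username):
- # pkg install -y bash
- # chsh -s /usr/local/bin/bash skywalk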
Install the required packages:
# pkg install -y git go122 cmake vulkan-headers vulkan-loader
Clone the patched fork:
# git clone --depth 1 https://github.com/prep/ollama.git
(The shallow --depth 1 clone later broke the branch checkout, so clone again without it:)
# git clone https://github.com/prep/ollama.git
Switch to the BSD branch (this failed on the shallow clone):
# cd ollama && git checkout feature/add-bsd-support
Set up the proxy acceleration first.
Under csh:
# setenv GO111MODULE on
# setenv GOPROXY https://goproxy.io,direct
# setenv GOPRIVATE git.mycompany.com,github.com/my/private
(csh exports environment variables with setenv; a plain set only creates shell-local variables that go will never see.)
Under bash:
# Enable Go modules
export GO111MODULE=on
# Set the GOPROXY environment variable
export GOPROXY=https://goproxy.io,direct
# Set environment variable allow bypassing the proxy for specified repos (optional)
export GOPRIVATE=git.mycompany.com,github.com/my/private
Run go generate and go build:
# go122 generate ./...
# go122 build .
It ended with this error:
go122 build .
go: downloading github.com/pdevine/tensor v0.0.0-20240228013915-64ccaa8d9ca9
convert/gemma.go:12:2: github.com/pdevine/tensor@v0.0.0-20240228013915-64ccaa8d9ca9: invalid version: unknown revision 64ccaa8d9ca9
convert/gemma.go:13:2: github.com/pdevine/tensor@v0.0.0-20240228013915-64ccaa8d9ca9: invalid version: unknown revision 64ccaa8d9ca9
Building the patched ollama in the FreeBSD jail as a regular user (third attempt, successful)
If errors appear, the go.sum and go.mod files need editing.
The commands used:
- bash
- mkdir github.com
- cd github.com
-
- git clone https://github.com/prep/ollama.git
-
- cd ollama && git checkout feature/add-bsd-support
-
- # Enable Go modules
-
- export GO111MODULE=on
-
- # Set the GOPROXY environment variable
- export GOPROXY=https://goproxy.io,direct
- # Set environment variable allow bypassing the proxy for specified repos (optional)
- export GOPRIVATE=git.mycompany.com,github.com/my/private
-
- go122 generate ./...
-
- go122 build .
Still an error from go122 build .:
go: downloading github.com/pdevine/tensor v0.0.0-20240228013915-64ccaa8d9ca9
convert/gemma.go:12:2: github.com/pdevine/tensor@v0.0.0-20240228013915-64ccaa8d9ca9: invalid version: unknown revision 64ccaa8d9ca9
convert/gemma.go:13:2: github.com/pdevine/tensor@v0.0.0-20240228013915-64ccaa8d9ca9: invalid version: unknown revision 64ccaa8d9ca9
Edit go.sum and replace its pdevine/tensor entries with:
github.com/pdevine/tensor v0.0.0-20240510204454-f88f4562727c h1:GwiUUjKefgvSNmv3NCvI/BL0kDebW6Xa+kcdpdc1mTY=
github.com/pdevine/tensor v0.0.0-20240510204454-f88f4562727c/go.mod h1:PSojXDXF7TbgQiD6kkd98IHOS0QqTyUEaWRiS8+BLu8=
go.mod also needs editing; change the pdevine/tensor version to the latest (May 10) revision:
github.com/pdevine/tensor v0.0.0-20240510204454-f88f4562727c
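Instead of hand-editing, the same pin can be applied with Go's standard module tooling; go mod tidy then rewrites go.sum to match (a sketch of the equivalent steps):
- go122 mod edit -require=github.com/pdevine/tensor@v0.0.0-20240510204454-f88f4562727c
- go122 mod tidy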
Then re-run generate and build.
Depending on the situation, if you skip the regenerate, the error output suggests a go get is enough:
go122 get github.com/ollama/ollama/convert
Then continue the build:
go122 build .
Done!
Test it:
./ollama help | head -n 5
Large language model runner
Usage:
ollama [flags]
ollama [command]
Proof that the build really succeeded!
First, start the ollama server:
./ollama serve
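ollama serve stays in the foreground, so either leave it running in its own terminal or, under bash, push it to the background (a sketch):
- ./ollama serve > ollama.log 2>&1 &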
Run the llama3 model:
./ollama run llama3
ollama downloads the model automatically; once the download finishes it drops into an interactive prompt.
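Besides the interactive prompt, the running server can be queried over ollama's HTTP API, which listens on port 11434 by default; for example:
- curl http://localhost:11434/api/generate -d '{"model": "llama3", "prompt": "hello", "stream": false}'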
A single reply took 50 minutes... but at least it worked, and it ran on FreeBSD!
- [skywalk@fb12 ~/gihub.com/ollama]$ ./ollama run llama3
- [GIN] 2024/07/15 - 12:01:47 | 200 | 466.704µs | 10.0.0.12 | HEAD "/"
- [GIN] 2024/07/15 - 12:01:47 | 404 | 450.54µs | 10.0.0.12 | POST "/api/show"
- pulling manifest ⠦ time=2024-07-15T12:01:50.016+08:00 level=INFO source=download.go:136 msg="downloading 6a0746a1ec1a in 47 100 MB part(s)"
- pulling manifest
- pulling 6a0746a1ec1a... 100% ▕████████████████▏ 4.7 GB
- pulling 4fa551d4f938... 100% ▕████████████████▏ 12 KB
- pulling 8ab4849b038c... 100% ▕████████████████▏ 254 B
- pulling 577073ffcc6c... 100% ▕████████████████▏ 110 B
- pulling 3f8eb4da87fa... 100% ▕████████████████▏ 485 B
- verifying sha256 digest
- writing manifest
- removing any unused layers
- success
- [GIN] 2024/07/15 - 12:22:06 | 200 | 1.786897ms | 10.0.0.12 | POST "/api/show"
- [GIN] 2024/07/15 - 12:22:06 | 200 | 1.384117ms | 10.0.0.12 | POST "/api/show"
- time=2024-07-15T12:22:06.288+08:00 level=INFO source=cpu_common.go:18 msg="CPU does not have vector extensions"
- ⠴ time=2024-07-15T12:22:20.820+08:00 level=INFO source=cpu_common.go:18 msg="CPU does not have vector extensions"
- time=2024-07-15T12:22:20.821+08:00 level=INFO source=server.go:289 msg="starting llama server" cmd="/tmp/ollama1084183988/runners/cpu/ollama_llama_server --model /home/skywalk/.ollama/models/blobs/sha256-6a0746a1ec1aef3e7ec53868f220ff6e389f6f8ef87a01d77c96807de94ca2aa --ctx-size 2048 --batch-size 512 --embedding --log-disable --parallel 1 --port 62268"
- time=2024-07-15T12:22:20.847+08:00 level=INFO source=sched.go:340 msg="loaded runners" count=1
- time=2024-07-15T12:22:20.847+08:00 level=INFO source=server.go:432 msg="waiting for llama runner to start responding"
- {"function":"server_params_parse","level":"INFO","line":2604,"msg":"logging to file is disabled.","tid":"0x10139f812000","timestamp":1721017340}
- ⠦ {"build":2770,"commit":"952d03db","function":"main","level":"INFO","line":2821,"msg":"build info","tid":"0x10139f812000","timestamp":1721017340}
- {"function":"main","level":"INFO","line":2828,"msg":"system info","n_threads":4,"n_threads_batch":-1,"system_info":"AVX = 0 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 | ","tid":"0x10139f812000","timestamp":1721017340,"total_threads":4}
- ⠧ llama_model_loader: loaded meta data with 22 key-value pairs and 291 tensors from /home/skywalk/.ollama/models/blobs/sha256-6a0746a1ec1aef3e7ec53868f220ff6e389f6f8ef87a01d77c96807de94ca2aa (version GGUF V3 (latest))
- llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
- llama_model_loader: - kv 0: general.architecture str = llama
- llama_model_loader: - kv 1: general.name str = Meta-Llama-3-8B-Instruct
- llama_model_loader: - kv 2: llama.block_count u32 = 32
- llama_model_loader: - kv 3: llama.context_length u32 = 8192
- llama_model_loader: - kv 4: llama.embedding_length u32 = 4096
- llama_model_loader: - kv 5: llama.feed_forward_length u32 = 14336
- llama_model_loader: - kv 6: llama.attention.head_count u32 = 32
- llama_model_loader: - kv 7: llama.attention.head_count_kv u32 = 8
- llama_model_loader: - kv 8: llama.rope.freq_base f32 = 500000.000000
- llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000010
- llama_model_loader: - kv 10: general.file_type u32 = 2
- llama_model_loader: - kv 11: llama.vocab_size u32 = 128256
- llama_model_loader: - kv 12: llama.rope.dimension_count u32 = 128
- llama_model_loader: - kv 13: tokenizer.ggml.model str = gpt2
- llama_model_loader: - kv 14: tokenizer.ggml.pre str = llama-bpe
- ⠇ llama_model_loader: - kv 15: tokenizer.ggml.tokens arr[str,128256] = ["!", "\"", "#", "$", "%", "&", "'", ...
- ⠏ llama_model_loader: - kv 16: tokenizer.ggml.token_type arr[i32,128256] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
- ⠙ llama_model_loader: - kv 17: tokenizer.ggml.merges arr[str,280147] = ["Ġ Ġ", "Ġ ĠĠĠ", "ĠĠ ĠĠ", "...
- llama_model_loader: - kv 18: tokenizer.ggml.bos_token_id u32 = 128000
- llama_model_loader: - kv 19: tokenizer.ggml.eos_token_id u32 = 128009
- llama_model_loader: - kv 20: tokenizer.chat_template str = {% set loop_messages = messages %}{% ...
- llama_model_loader: - kv 21: general.quantization_version u32 = 2
- llama_model_loader: - type f32: 65 tensors
- llama_model_loader: - type q4_0: 225 tensors
- llama_model_loader: - type q6_K: 1 tensors
- ⠹ llm_load_vocab: special tokens definition check successful ( 256/128256 ).
- llm_load_print_meta: format = GGUF V3 (latest)
- llm_load_print_meta: arch = llama
- llm_load_print_meta: vocab type = BPE
- llm_load_print_meta: n_vocab = 128256
- llm_load_print_meta: n_merges = 280147
- llm_load_print_meta: n_ctx_train = 8192
- llm_load_print_meta: n_embd = 4096
- llm_load_print_meta: n_head = 32
- llm_load_print_meta: n_head_kv = 8
- llm_load_print_meta: n_layer = 32
- llm_load_print_meta: n_rot = 128
- llm_load_print_meta: n_embd_head_k = 128
- llm_load_print_meta: n_embd_head_v = 128
- llm_load_print_meta: n_gqa = 4
- llm_load_print_meta: n_embd_k_gqa = 1024
- llm_load_print_meta: n_embd_v_gqa = 1024
- llm_load_print_meta: f_norm_eps = 0.0e+00
- llm_load_print_meta: f_norm_rms_eps = 1.0e-05
- llm_load_print_meta: f_clamp_kqv = 0.0e+00
- llm_load_print_meta: f_max_alibi_bias = 0.0e+00
- llm_load_print_meta: f_logit_scale = 0.0e+00
- llm_load_print_meta: n_ff = 14336
- llm_load_print_meta: n_expert = 0
- llm_load_print_meta: n_expert_used = 0
- llm_load_print_meta: causal attn = 1
- llm_load_print_meta: pooling type = 0
- llm_load_print_meta: rope type = 0
- llm_load_print_meta: rope scaling = linear
- llm_load_print_meta: freq_base_train = 500000.0
- llm_load_print_meta: freq_scale_train = 1
- llm_load_print_meta: n_yarn_orig_ctx = 8192
- llm_load_print_meta: rope_finetuned = unknown
- llm_load_print_meta: ssm_d_conv = 0
- llm_load_print_meta: ssm_d_inner = 0
- llm_load_print_meta: ssm_d_state = 0
- llm_load_print_meta: ssm_dt_rank = 0
- llm_load_print_meta: model type = 8B
- llm_load_print_meta: model ftype = Q4_0
- llm_load_print_meta: model params = 8.03 B
- llm_load_print_meta: model size = 4.33 GiB (4.64 BPW)
- llm_load_print_meta: general.name = Meta-Llama-3-8B-Instruct
- llm_load_print_meta: BOS token = 128000 '<|begin_of_text|>'
- llm_load_print_meta: EOS token = 128009 '<|eot_id|>'
- llm_load_print_meta: LF token = 128 'Ä'
- llm_load_print_meta: EOT token = 128009 '<|eot_id|>'
- llm_load_tensors: ggml ctx size = 0.15 MiB
- llm_load_tensors: CPU buffer size = 4437.80 MiB
- .......................................................................................
- ⠸ llama_new_context_with_model: n_ctx = 2048
- llama_new_context_with_model: n_batch = 512
- llama_new_context_with_model: n_ubatch = 512
- llama_new_context_with_model: freq_base = 500000.0
- llama_new_context_with_model: freq_scale = 1
- ⠦ llama_kv_cache_init: CPU KV buffer size = 256.00 MiB
- llama_new_context_with_model: KV self size = 256.00 MiB, K (f16): 128.00 MiB, V (f16): 128.00 MiB
- llama_new_context_with_model: CPU output buffer size = 0.50 MiB
- llama_new_context_with_model: CPU compute buffer size = 258.50 MiB
- llama_new_context_with_model: graph nodes = 1030
- llama_new_context_with_model: graph splits = 1
- ⠧ {"function":"initialize","level":"INFO","line":448,"msg":"initializing slots","n_slots":1,"tid":"0x10139f812000","timestamp":1721017395}
- {"function":"initialize","level":"INFO","line":460,"msg":"new slot","n_ctx_slot":2048,"slot_id":0,"tid":"0x10139f812000","timestamp":1721017395}
- {"function":"main","level":"INFO","line":3065,"msg":"model loaded","tid":"0x10139f812000","timestamp":1721017395}
- {"function":"main","hostname":"127.0.0.1","level":"INFO","line":3268,"msg":"HTTP server listening","n_threads_http":"3","port":"62268","tid":"0x10139f812000","timestamp":1721017395}
- {"function":"update_slots","level":"INFO","line":1579,"msg":"all slots are idle and system prompt is empty, clear the KV cache","tid":"0x10139f812000","timestamp":1721017395}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":0,"tid":"0x10139f812000","timestamp":1721017395}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":1,"tid":"0x10139f812000","timestamp":1721017395}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":37211,"status":200,"tid":"0x1013dbe0ae00","timestamp":1721017395}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":2,"tid":"0x10139f812000","timestamp":1721017395}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":60236,"status":200,"tid":"0x1013dbe0a700","timestamp":1721017395}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":3,"tid":"0x10139f812000","timestamp":1721017395}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":43135,"status":200,"tid":"0x1013dbe0a000","timestamp":1721017395}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":4,"tid":"0x10139f812000","timestamp":1721017395}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":31620,"status":200,"tid":"0x1013dbe0ae00","timestamp":1721017395}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":5,"tid":"0x10139f812000","timestamp":1721017395}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":56527,"status":200,"tid":"0x1013dbe0a700","timestamp":1721017395}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":6,"tid":"0x10139f812000","timestamp":1721017395}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":53213,"status":200,"tid":"0x1013dbe0a000","timestamp":1721017395}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":7,"tid":"0x10139f812000","timestamp":1721017395}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":21875,"status":200,"tid":"0x1013dbe0ae00","timestamp":1721017395}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":47567,"status":200,"tid":"0x1013dbe0a700","timestamp":1721017395}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":8,"tid":"0x10139f812000","timestamp":1721017395}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":56264,"status":200,"tid":"0x1013dbe0a700","timestamp":1721017395}
- ⠇ [GIN] 2024/07/15 - 12:23:15 | 200 | 1m8s | 10.0.0.12 | POST "/api/chat"
- >>> hello
- time=2024-07-15T14:22:47.710+08:00 level=INFO source=cpu_common.go:18 msg="CPU does not have vector extensions"
- ⠋ time=2024-07-15T14:23:02.785+08:00 level=INFO source=cpu_common.go:18 msg="CPU does not have vector extensions"
- ⠙ time=2024-07-15T14:23:02.789+08:00 level=INFO source=server.go:289 msg="starting llama server" cmd="/tmp/ollama1084183988/runners/cpu/ollama_llama_server --model /home/skywalk/.ollama/models/blobs/sha256-6a0746a1ec1aef3e7ec53868f220ff6e389f6f8ef87a01d77c96807de94ca2aa --ctx-size 2048 --batch-size 512 --embedding --log-disable --parallel 1 --port 61604"
- time=2024-07-15T14:23:02.811+08:00 level=INFO source=sched.go:340 msg="loaded runners" count=1
- time=2024-07-15T14:23:02.812+08:00 level=INFO source=server.go:432 msg="waiting for llama runner to start responding"
- {"function":"server_params_parse","level":"INFO","line":2604,"msg":"logging to file is disabled.","tid":"0x20da49412000","timestamp":1721024582}
- {"build":2770,"commit":"952d03db","function":"main","level":"INFO","line":2821,"msg":"build info","tid":"0x20da49412000","timestamp":1721024582}
- {"function":"main","level":"INFO","line":2828,"msg":"system info","n_threads":4,"n_threads_batch":-1,"system_info":"AVX = 0 | AVX_VNNI = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 1 | ","tid":"0x20da49412000","timestamp":1721024582,"total_threads":4}
- (llama_model_loader / llm_load_print_meta output identical to the first run above)
- ⠦ {"function":"initialize","level":"INFO","line":448,"msg":"initializing slots","n_slots":1,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"initialize","level":"INFO","line":460,"msg":"new slot","n_ctx_slot":2048,"slot_id":0,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"main","level":"INFO","line":3065,"msg":"model loaded","tid":"0x20da49412000","timestamp":1721024651}
- {"function":"main","hostname":"127.0.0.1","level":"INFO","line":3268,"msg":"HTTP server listening","n_threads_http":"3","port":"61604","tid":"0x20da49412000","timestamp":1721024651}
- {"function":"update_slots","level":"INFO","line":1579,"msg":"all slots are idle and system prompt is empty, clear the KV cache","tid":"0x20da49412000","timestamp":1721024651}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":0,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":1,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":48229,"status":200,"tid":"0x20da85a0a000","timestamp":1721024651}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":2,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":33319,"status":200,"tid":"0x20da85a0a700","timestamp":1721024651}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":3,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":54187,"status":200,"tid":"0x20da85a0ae00","timestamp":1721024651}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":4,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":28162,"status":200,"tid":"0x20da85a0a000","timestamp":1721024651}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":33773,"status":200,"tid":"0x20da85a0a700","timestamp":1721024651}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":5,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":6,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":19633,"status":200,"tid":"0x20da85a0ae00","timestamp":1721024651}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":7,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":35779,"status":200,"tid":"0x20da85a0a000","timestamp":1721024651}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":18413,"status":200,"tid":"0x20da85a0a700","timestamp":1721024651}
- ⠧ {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":8,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":36742,"status":200,"tid":"0x20da85a0a700","timestamp":1721024651}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":9,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":36742,"status":200,"tid":"0x20da85a0a700","timestamp":1721024651}
- ⠇ {"function":"log_server_request","level":"INFO","line":2742,"method":"POST","msg":"request","params":{},"path":"/tokenize","remote_addr":"10.0.0.12","remote_port":36742,"status":200,"tid":"0x20da85a0a700","timestamp":1721024651}
- {"function":"process_single_task","level":"INFO","line":1511,"msg":"slot data","n_idle_slots":1,"n_processing_slots":0,"task_id":10,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"GET","msg":"request","params":{},"path":"/health","remote_addr":"10.0.0.12","remote_port":36742,"status":200,"tid":"0x20da85a0a700","timestamp":1721024651}
- ⠏ {"function":"launch_slot_with_data","level":"INFO","line":833,"msg":"slot is processing task","slot_id":0,"task_id":11,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"update_slots","ga_i":0,"level":"INFO","line":1817,"msg":"slot progression","n_past":0,"n_past_se":0,"n_prompt_tokens_processed":10,"slot_id":0,"task_id":11,"tid":"0x20da49412000","timestamp":1721024651}
- {"function":"update_slots","level":"INFO","line":1841,"msg":"kv cache rm [p0, end)","p0":0,"slot_id":0,"task_id":11,"tid":"0x20da49412000","timestamp":1721024651}
- Hello! It's nice to meet you. Is there something I can help you with, or
- would you like to chat?{"function":"print_timings","level":"INFO","line":276,"msg":"prompt eval time = 106459.91 ms / 10 tokens (10645.99 ms per token, 0.09 tokens per second)","n_prompt_tokens_processed":10,"n_tokens_second":0.09393207617523164,"slot_id":0,"t_prompt_processing":106459.906,"t_token":10645.990600000001,"task_id":11,"tid":"0x20da49412000","timestamp":1721027627}
- {"function":"print_timings","level":"INFO","line":290,"msg":"generation eval time = 2868918.63 ms / 26 runs (110343.02 ms per token, 0.01 tokens per second)","n_decoded":26,"n_tokens_second":0.00906264811318913,"slot_id":0,"t_token":110343.0241923077,"t_token_generation":2868918.629,"task_id":11,"tid":"0x20da49412000","timestamp":1721027627}
- {"function":"print_timings","level":"INFO","line":299,"msg":" total time = 2975378.54 ms","slot_id":0,"t_prompt_processing":106459.906,"t_token_generation":2868918.629,"t_total":2975378.535,"task_id":11,"tid":"0x20da49412000","timestamp":1721027627}
- {"function":"update_slots","level":"INFO","line":1649,"msg":"slot released","n_cache_tokens":36,"n_ctx":2048,"n_past":35,"n_system_tokens":0,"slot_id":0,"task_id":11,"tid":"0x20da49412000","timestamp":1721027627,"truncated":false}
- {"function":"log_server_request","level":"INFO","line":2742,"method":"POST","msg":"request","params":{},"path":"/completion","remote_addr":"10.0.0.12","remote_port":36742,"status":200,"tid":"0x20da85a0a700","timestamp":1721027627}
- [GIN] 2024/07/15 - 15:13:47 | 200 | 50m59s | 10.0.0.12 | POST "/api/chat"
-
ollama can be built on FreeBSD, but it takes the patched fork. Official repo: GitHub - ollama/ollama: Get up and running with Llama 3, Mistral, Gemma 2, and other large language models. Patched fork: https://github.com/prep/ollama
If the fork errors out during the build, read the message and edit the github.com/pdevine/tensor line in go.sum and go.mod accordingly, pinning it to the May 10 revision v0.0.0-20240510204454-f88f4562727c.
Everything was debugged on a J1900 CPU with 8 GB RAM running FreeBSD 14.1-RELEASE (host fbhost). ollama is painfully slow there, roughly 50 minutes per answer, but at least it really does work! The debugging notes follow.
- skywalk@fbhost:~/github/ollama $ go build .
- package github.com/ollama/ollama
- imports github.com/ollama/ollama/cmd
- imports github.com/ollama/ollama/server
- imports github.com/ollama/ollama/gpu: C source files not allowed when not using cgo or SWIG: gpu_info_cudart.c gpu_info_nvcuda.c gpu_info_nvml.c gpu_info_oneapi.c
Why is it trying to compile GPU sources at all? What is misconfigured?
Ollama on FreeBSD · Issue #1102 · ollama/ollama · GitHub
This issue gives the workaround: use another repo:
# pkg install -y git go122 cmake vulkan-headers vulkan-loader
# git clone https://github.com/prep/ollama.git
# cd ollama && git checkout feature/add-bsd-support
# go122 generate ./...
# go122 build .
- # ./ollama help | head -n 5
- Large language model runner
-
- Usage:
- ollama [flags]
- ollama [command]
Works fine for me, no problems encountered.
Apparently the main repo used to build on FreeBSD too, but that broke after May 6: Make maximum pending request configurable by dhiltgen · Pull Request #4144 · ollama/ollama · GitHub
git checkout feature/add-bsd-support
error: pathspec 'feature/add-bsd-support' did not match any file(s) known to git
It turned out the earlier clone was incomplete:
# git clone --depth 1 https://github.com/prep/ollama.git
Switch the branch (this failed here):
# cd ollama && git checkout feature/add-bsd-support
--depth 1 cannot be used here; drop it:
git clone https://github.com/prep/ollama.git
Now git checkout feature/add-bsd-support succeeds.
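Alternatively the branch can be cloned directly in one step with git clone -b; in that form even --depth 1 is fine, since the shallow clone then contains exactly the branch needed:
- git clone -b feature/add-bsd-support --depth 1 https://github.com/prep/ollama.git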
vulkan-headers and vulkan-loader: what these two packages are for. Both are key components of the Vulkan API and matter when building applications that use Vulkan for graphics and compute. Vulkan is a cross-platform graphics and compute API developed by the Khronos Group, aimed at high-performance 3D rendering; vulkan-headers ships the API header files, while vulkan-loader ships the runtime loader that dispatches calls to the installed driver.
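To see what each package actually installs, pkg can list their files:
- # pkg info -l vulkan-headers | head
- # pkg info -l vulkan-loader | head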
Conclusion first: GitHub was flaking out.
Building in the jail failed with: imports github.com/ollama/ollama/gpu: C source files not allowed when not using cgo or SWIG: gpu_info_cpu.c gpu_info_cudart.c
There were also errors about GitHub being unreachable:
fatal: unable to access 'https://github.com/pdevine/tensor/': Failed to connect to github.com port 443 after 75025 ms: Couldn't connect to server
go: downloading github.com/pdevine/tensor v0.0.0-20240228013915-64ccaa8d9ca9
package github.com/ollama/ollama
imports github.com/ollama/ollama/cmd
imports github.com/ollama/ollama/server
imports github.com/ollama/ollama/gpu: C source files not allowed when not using cgo or SWIG: gpu_info_cpu.c gpu_info_cudart.c
convert/gemma.go:12:2: github.com/pdevine/tensor@v0.0.0-20240228013915-64ccaa8d9ca9: invalid version: git fetch -f origin refs/heads/*:refs/heads/* refs/tags/*:refs/tags/* in /root/go/pkg/mod/cache/vcs/6bf5b14e60582bdf39d55e6388653dd8c2addad6937480b86ddb5a729a838afe: exit status 128:
fatal: unable to access 'https://github.com/pdevine/tensor/': Failed to connect to github.com port 443 after 75025 ms: Couldn't connect to server
convert/gemma.go:13:2: github.com/pdevine/tensor@v0.0.0-20240228013915-64ccaa8d9ca9: invalid version: git fetch -f origin refs/heads/*:refs/heads/* refs/tags/*:refs/tags/* in /root/go/pkg/mod/cache/vcs/6bf5b14e60582bdf39d55e6388653dd8c2addad6937480b86ddb5a729a838afe: exit status 128:
fatal: unable to access 'https://github.com/pdevine/tensor/': Failed to connect to github.com port 443 after 75025 ms: Couldn't connect to server
+ echo 'go generate completed. LLM runners: cpu cpu_avx cpu_avx2 vulkan'
go generate completed. LLM runners: cpu cpu_avx cpu_avx2 vulkan
[root@fb12 ollama]# go122 build .
go: downloading github.com/pdevine/tensor v0.0.0-20240228013915-64ccaa8d9ca9
convert/gemma.go:12:2: github.com/pdevine/tensor@v0.0.0-20240228013915-64ccaa8d9ca9: invalid version: unknown revision 64ccaa8d9ca9
convert/gemma.go:13:2: github.com/pdevine/tensor@v0.0.0-20240228013915-64ccaa8d9ca9: invalid version: unknown revision 64ccaa8d9ca9
No idea why exactly, though quite possibly GitHub flaking out again...
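A quick sanity check of whether the jail can reach GitHub at all, using fetch(1) from the FreeBSD base system:
- fetch -o /dev/null https://github.com/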
Re-ran generate; still flaky.
Everything so far was done as root; next, try building as a regular user.
The regular user hits the same error.
Edit go.sum and replace its pdevine/tensor entries with:
github.com/pdevine/tensor v0.0.0-20240510204454-f88f4562727c h1:GwiUUjKefgvSNmv3NCvI/BL0kDebW6Xa+kcdpdc1mTY=
github.com/pdevine/tensor v0.0.0-20240510204454-f88f4562727c/go.mod h1:PSojXDXF7TbgQiD6kkd98IHOS0QqTyUEaWRiS8+BLu8=
After that change, go build errors:
go122 build .
convert/gemma.go:12:2: missing go.sum entry for module providing package github.com/pdevine/tensor (imported by github.com/ollama/ollama/convert); to add:
go get github.com/ollama/ollama/convert
convert/gemma.go:13:2: missing go.sum entry for module providing package github.com/pdevine/tensor/native (imported by github.com/ollama/ollama/convert); to add:
go get github.com/ollama/ollama/convert
It turns out go.mod also pins a version; update it to the current one:
github.com/pdevine/tensor v0.0.0-20240510204454-f88f4562727c
But then another error, a checksum mismatch:
- go122 generate ./...
- go: downloading github.com/google/flatbuffers v1.12.0
- go: downloading gonum.org/v1/gonum v0.8.2
- verifying github.com/google/flatbuffers@v1.12.0: checksum mismatch
- downloaded: h1:N8EguYFm2wwdpoNcpchQY0tPs85vOJkboFb2dPxmixo=
- go.sum: h1:/PtAHvnBY4Kqnx/xCQ3OIV9uYcSFGScBsWI3Oogeh6w=
-
- SECURITY ERROR
- This download does NOT match an earlier download recorded in go.sum.
- The bits may have been replaced on the origin server, or an attacker may
- have intercepted the download attempt.
-
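A common way out of a stale go.sum entry, sketched here: delete the mismatching flatbuffers lines from go.sum and let the standard go mod tidy re-download and re-record the checksums (only do this if you trust the configured proxy):
- # delete the github.com/google/flatbuffers lines from go.sum, then:
- go122 mod tidy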
Ugh, so the patched fork has problems of its own.
Try changing go.mod to: github.com/pdevine/tensor v0.0.0-20240510204454-f88f4562727c
Then run:
go122 get github.com/ollama/ollama/convert
Then run:
go122 build .
Finally, the build is complete.