In an era of growing data-privacy concerns, the development of local Large Language Model (LLM) applications provides an alternative to cloud-based solutions. Ollama offers one such solution, allowing LLMs to be downloaded and used locally. In this article, we'll explore how to use Ollama with LangChain and SingleStore from a Jupyter Notebook.
We'll use a virtual machine running Ubuntu 22.04.2 as our test environment. An alternative would be to use a venv virtual environment.
We'll use Ollama Demo Group as our Workspace Group name and ollama-demo as our Workspace name, and we'll make a note of our password and host name. In this article, we'll temporarily allow access from anywhere by configuring the firewall under Ollama Demo Group > Firewall. For production environments, firewall rules should be added to provide greater security.
From our SingleStore Cloud account, let's use the SQL Editor to create a new database. We'll call it ollama_demo, as follows:

SQL
CREATE DATABASE IF NOT EXISTS ollama_demo;
From the command line, we'll install the classic Jupyter Notebook, as follows:
pip install notebook
We'll install Ollama, as follows:
curl -fsSL https://ollama.com/install.sh | sh
Using the password and host information we saved earlier, we'll create an environment variable to point to our SingleStore instance, as follows:

Shell
export SINGLESTOREDB_URL="admin:<password>@<host>:3306/ollama_demo"
Replace <password> and <host> with the values for your environment.
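As a quick sanity check, the connection string follows the standard user:password@host:port/database form, so prepending a scheme lets us inspect it with the Python standard library. This sketch is purely illustrative (the credentials and hostname below are placeholders); LangChain reads SINGLESTOREDB_URL directly and does not need this step.

```python
from urllib.parse import urlparse

# Placeholder DSN in the same shape as SINGLESTOREDB_URL.
url = "admin:my-password@svc-example.singlestore.com:3306/ollama_demo"

# Prepend a scheme so urlparse can split the components.
parsed = urlparse("mysql://" + url)

print(parsed.username)          # admin
print(parsed.hostname)          # svc-example.singlestore.com
print(parsed.port)              # 3306
print(parsed.path.lstrip("/"))  # ollama_demo
```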
We are now ready to work with Ollama, and we'll launch Jupyter:

Shell
jupyter notebook
First, some packages:
!pip install langchain ollama --quiet --no-warn-script-location
Next, we'll import some libraries:
import ollama
from langchain_community.vectorstores import SingleStoreDB
from langchain_community.vectorstores.utils import DistanceStrategy
from langchain_core.documents import Document
from langchain_community.embeddings import OllamaEmbeddings
We'll use all-minilm to create our embeddings:
ollama.pull("all-minilm")
Example output:
{'status': 'success'}
For our LLM, we'll use llama2 (3.8 GB at the time of writing):
ollama.pull("llama2")
Example output:
{'status': 'success'}
Next, we'll use the example text from the Ollama website:
documents = [
    "Llamas are members of the camelid family meaning they're pretty closely related to vicuñas and camels",
    "Llamas were first domesticated and used as pack animals 4,000 to 5,000 years ago in the Peruvian highlands",
    "Llamas can grow as much as 6 feet tall though the average llama between 5 feet 6 inches and 5 feet 9 inches tall",
    "Llamas weigh between 280 and 450 pounds and can carry 25 to 30 percent of their body weight",
    "Llamas are vegetarians and have very efficient digestive systems",
    "Llamas live to be about 20 years old, though some only live for 15 years and others live to be 30 years old"
]

embeddings = OllamaEmbeddings(
    model = "all-minilm",
)

dimensions = len(embeddings.embed_query(documents[0]))

docs = [Document(text) for text in documents]
We specify all-minilm for the embeddings, determine the number of dimensions returned for the first document, and convert the documents to the format that SingleStore needs.
Next, we'll use LangChain:
docsearch = SingleStoreDB.from_documents(
    docs,
    embeddings,
    table_name = "langchain_docs",
    distance_strategy = DistanceStrategy.EUCLIDEAN_DISTANCE,
    use_vector_index = True,
    vector_size = dimensions
)
In addition to the documents and embeddings, we'll provide the name of the table to use for storage, the distance strategy, that we want to use a vector index, and the vector size using the dimensions we previously determined. These and other options are explained in more detail in the LangChain documentation.
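DistanceStrategy.EUCLIDEAN_DISTANCE means rows are ranked by straight-line (L2) distance between vectors, with smaller values meaning more similar. A minimal sketch of the metric itself, using made-up three-dimensional vectors rather than real embeddings:

```python
import math

# Euclidean (L2) distance between two vectors: smaller = more similar.
def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

query = [1.0, 0.0, 0.0]  # pretend query embedding
doc_a = [0.9, 0.1, 0.0]  # close to the query
doc_b = [0.0, 1.0, 0.0]  # far from the query

print(euclidean_distance(query, doc_a) < euclidean_distance(query, doc_b))  # True
```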
Using the SQL Editor in SingleStore Cloud, let's check the structure of the table created by LangChain:

SQL
USE ollama_demo;

DESCRIBE langchain_docs;
Example output:
+----------+------------------+------+------+---------+----------------+
| Field    | Type             | Null | Key  | Default | Extra          |
+----------+------------------+------+------+---------+----------------+
| id       | bigint(20)       | NO   | PRI  | NULL    | auto_increment |
| content  | longtext         | YES  |      | NULL    |                |
| vector   | vector(384, F32) | NO   | MUL  | NULL    |                |
| metadata | JSON             | YES  |      | NULL    |                |
+----------+------------------+------+------+---------+----------------+
We can see that a vector column with 384 dimensions was created to store the embeddings.
Let's also quickly check the stored data:
USE ollama_demo;

SELECT SUBSTRING(content, 1, 30) AS content, SUBSTRING(vector, 1, 30) AS vector FROM langchain_docs;
Example output:
+--------------------------------+--------------------------------+
| content                        | vector                         |
+--------------------------------+--------------------------------+
| Llamas weigh between 280 and 4 | [0.235754818,0.242168128,-0.26 |
| Llamas were first domesticated | [0.153105229,0.219774529,-0.20 |
| Llamas are vegetarians and hav | [0.285528302,0.10461951,-0.313 |
| Llamas are members of the came | [-0.0174482632,0.173883006,-0. |
| Llamas can grow as much as 6 f | [-0.0232818555,0.122274697,-0. |
| Llamas live to be about 20 yea | [0.0260244086,0.212311044,0.03 |
+--------------------------------+--------------------------------+
Finally, let's check the vector index:
USE ollama_demo;

SHOW INDEX FROM langchain_docs;
Example output:
+----------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------------+---------+---------------+---------------------------------------+
| Table          | Non_unique | Key_name   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type       | Comment | Index_comment | Index_options                         |
+----------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------------+---------+---------------+---------------------------------------+
| langchain_docs | 0          | PRIMARY    | 1            | id          | NULL      | NULL        | NULL     | NULL   |      | COLUMNSTORE HASH |         |               |                                       |
| langchain_docs | 1          | vector     | 1            | vector      | NULL      | NULL        | NULL     | NULL   |      | VECTOR           |         |               | {"metric_type": "EUCLIDEAN_DISTANCE"} |
| langchain_docs | 1          | __SHARDKEY | 1            | id          | NULL      | NULL        | NULL     | NULL   |      | METADATA_ONLY    |         |               |                                       |
+----------------+------------+------------+--------------+-------------+-----------+-------------+----------+--------+------+------------------+---------+---------------+---------------------------------------+
We'll now ask a question, as follows:
prompt = "What animals are llamas related to?"
docs = docsearch.similarity_search(prompt)
data = docs[0].page_content
print(data)
Example output:
Llamas are members of the camelid family meaning they're pretty closely related to vicuñas and camels
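Under the hood, similarity_search embeds the prompt and returns documents ordered by distance to that embedding. A toy stand-in for that ranking step (the three-dimensional vectors here are made up; the real all-minilm embeddings have 384 dimensions):

```python
import math

# Made-up (document, embedding) pairs standing in for the SingleStore table.
docs_with_vectors = [
    ("Llamas are members of the camelid family ...", [0.9, 0.1, 0.0]),
    ("Llamas weigh between 280 and 450 pounds ...",  [0.1, 0.8, 0.2]),
]
query_vector = [1.0, 0.0, 0.0]  # pretend embedding of the prompt

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Rank documents by distance to the query, nearest first.
ranked = sorted(docs_with_vectors, key=lambda dv: distance(query_vector, dv[1]))
print(ranked[0][0])  # the camelid-family sentence ranks first
```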
Next, we'll use the LLM, as follows:
output = ollama.generate(
    model = "llama2",
    prompt = f"Using this data: {data}. Respond to this prompt: {prompt}"
)

print(output["response"])
Example output:
Llamas are members of the camelid family, which means they are closely related to other animals such as:

1. Vicuñas: Vicuñas are small, wild relatives of llamas and alpacas. They are native to South America and are known for their soft, woolly coats.
2. Camels: Camels are also members of the camelid family and are known for their distinctive humps on their backs. There are two species of camel: the dromedary and the Bactrian.
3. Alpacas: Alpacas are domesticated animals that are closely related to llamas and vicuñas. They are native to South America and are known for their soft, luxurious fur.

So, in summary, llamas are related to vicuñas, camels, and alpacas.
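The retrieval-augmented pattern above boils down to string interpolation: the top search hit is spliced into the prompt before it is sent to the model. A sketch of that assembly step (the helper name is our own, not part of any library):

```python
# Hypothetical helper mirroring the f-string passed to ollama.generate above.
def build_rag_prompt(data, prompt):
    return f"Using this data: {data}. Respond to this prompt: {prompt}"

final_prompt = build_rag_prompt(
    "Llamas are members of the camelid family ...",
    "What animals are llamas related to?",
)
print(final_prompt)
```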
We've seen that we can connect to SingleStore, store documents and embeddings, ask questions about the data in the database, and use the power of LLMs locally through Ollama.