1. Introduction

LangChain Expression Language (LCEL) is a core concept in LangChain: a declarative language for composing chains. It provides a unified interface through which different components (retrievers, prompts, LLMs, and so on) can be connected via the common Runnable interface. Every Runnable component implements the same methods, such as .invoke(), .stream(), and .batch(), which is what makes components easy to chain together with the | operator.
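
As a minimal sketch of this idea (assuming an OpenAI API key is configured in the environment; the model name and prompt are illustrative):

    from langchain_openai import ChatOpenAI
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.output_parsers import StrOutputParser

    # Three Runnables composed with |; each one supports the same calling interface.
    prompt = ChatPromptTemplate.from_template("Say one fact about {topic}.")
    model = ChatOpenAI(model="gpt-4o-mini")
    chain = prompt | model | StrOutputParser()

    print(chain.invoke({"topic": "FAISS"}))  # single synchronous call -> str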

1.1 Advantages of LCEL

LCEL makes it easy to build complex chains from basic components, and it ships with streaming, parallel execution, logging, and other capabilities out of the box:

  • Unified interface: LCEL unifies disparate components behind the Runnable interface, which simplifies building complex operations.
  • Modularity: each component can be developed and tested independently, then integrated easily through LCEL.
  • Extensibility: LCEL supports async invocation, batching, and streaming, adapting to different application scenarios (see the sketch below).
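
A quick sketch of that last point, reusing the chain from the snippet above (topics are illustrative):

    # The same chain exposes batch, streaming, and async variants with no extra code.
    print(chain.batch([{"topic": "FAISS"}, {"topic": "Chroma"}]))  # list of str

    for token in chain.stream({"topic": "LCEL"}):  # yields tokens as they arrive
        print(token, end="", flush=True)

    import asyncio
    print(asyncio.run(chain.ainvoke({"topic": "LangChain"})))  # async variant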
2. Examples

    Check the installed versions:

    $ pip show langchain langchain_community
    Name: langchain
    Version: 0.3.7
    Summary: Building applications with LLMs through composability
    Home-page: https://github.com/langchain-ai/langchain
    Author: 
    Author-email: 
    License: MIT
    Location: /home/xjg/.conda/envs/langchain/lib/python3.10/site-packages
    Requires: aiohttp, async-timeout, langchain-core, langchain-text-splitters, langsmith, numpy, pydantic, PyYAML, requests, SQLAlchemy, tenacity
    Required-by: langchain-community
    ---
    Name: langchain-community
    Version: 0.3.5
    Summary: Community contributed LangChain integrations.
    Home-page: https://github.com/langchain-ai/langchain
    Author: 
    Author-email: 
    License: MIT
    Location: /home/xjg/.conda/envs/langchain/lib/python3.10/site-packages
    Requires: aiohttp, dataclasses-json, httpx-sse, langchain, langchain-core, langsmith, numpy, pydantic-settings, PyYAML, requests, SQLAlchemy, tenacity
    Required-by: langchain-experimental
    
    

    2.1 RAG Example with a Persisted Vector Store

    Code:

    from langchain_openai import ChatOpenAI, OpenAIEmbeddings
    from langchain_community.vectorstores import FAISS
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.runnables import RunnableParallel, RunnablePassthrough
    from langchain.docstore.document import Document
    from dotenv import load_dotenv, find_dotenv
    import os
    
    # Remove the proxy environment variables
    if 'all_proxy' in os.environ:
        del os.environ['all_proxy']
    
    if 'ALL_PROXY' in os.environ:
        del os.environ['ALL_PROXY']
    
    _ = load_dotenv(find_dotenv())
    
    # Initialize the model
    model = ChatOpenAI(model="gpt-4o-mini")
    
    # Create or load the persisted vector store
    embedding_model = OpenAIEmbeddings()
    vectorstore_path = "faiss_index"
    
    if os.path.exists(vectorstore_path):
    # Try to load the existing FAISS index
    vectorstore = FAISS.load_local(vectorstore_path, embedding_model, allow_dangerous_deserialization=True)
        print("Loaded existing FAISS index.")
    else:
    # If no index is found, create a new one
        texts = ["harrison worked at kensho", "bears like to eat honey"]
        docs = [Document(page_content=text) for text in texts]
        vectorstore = FAISS.from_documents(docs, embedding_model)
        vectorstore.save_local(vectorstore_path)
        print("Created and saved new FAISS index.")
    
    retriever = vectorstore.as_retriever()
    
    # Create a chat prompt template (written in Chinese) that builds the full model input from a given context and question
    template = """根据以下上下文回答问题:
    {context}
    
    问题: {question}
    """
    prompt = ChatPromptTemplate.from_template(template)
    
    # Initialize the output parser, which converts the model output to a string
    output_parser = StrOutputParser()
    
    # Set up the handling of context and question
    setup_and_retrieval = RunnableParallel(
        {"context": retriever, "question": RunnablePassthrough()}
    )
    
    # Build the chain: context/question setup, prompt construction, model call, and output parsing
    chain = setup_and_retrieval | prompt | model | output_parser
    
    # Invoke the chain with the question "harrison在哪里工作?" ("where did harrison work?") and generate an answer from the stored context
    print(chain.invoke("harrison在哪里工作?"))
    
    # Print the retriever to sanity-check that it is set up
    print(retriever)
    
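    To see exactly what setup_and_retrieval feeds into the prompt, it can be invoked on its own; the result is roughly the following (the Document list and metadata will vary with your index):

    # RunnablePassthrough() forwards the raw input unchanged, while the
    # retriever runs against the same input in parallel.
    print(setup_and_retrieval.invoke("harrison在哪里工作?"))
    # -> {'context': [Document(page_content='harrison worked at kensho'), ...],
    #     'question': 'harrison在哪里工作?'}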

    The faiss_index directory holds the raw FAISS index (index.faiss) and a pickled docstore (index.pkl), which is why loading it involves deserializing a pickle:

    $ tree faiss_index/
    faiss_index/
    ├── index.faiss
    └── index.pkl
    
    0 directories, 2 files
    

    vectorstore = FAISS.load_local(vectorstore_path, embedding_model, allow_dangerous_deserialization=True)
    If the allow_dangerous_deserialization parameter here is left at its default (False), the load fails with a security error:

    ValueError: The de-serialization relies loading a pickle file. Pickle files can be modified to deliver a malicious payload that results in execution of arbitrary code on your machine.You will need to set `allow_dangerous_deserialization` to `True` to enable deserialization. If you do this, make sure that you trust the source of the data. For example, if you are loading a file that you created, and know that no one else has modified the file, then this is safe to do. Do not set this to `True` if you are loading a file from an untrusted source (e.g., some random site on the internet.).
    
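    If you would rather not enable pickle deserialization at all, an alternative is to treat the saved index purely as a cache for sources you control and rebuild it otherwise. A minimal sketch reusing the names from the script above (the trusted flag is an illustrative assumption):

    # Opt in to pickle deserialization only for indexes this machine created;
    # otherwise rebuild the index from the original documents.
    def load_or_rebuild(path, embedding_model, docs, trusted=False):
        if trusted and os.path.exists(path):
            return FAISS.load_local(path, embedding_model,
                                    allow_dangerous_deserialization=True)
        vs = FAISS.from_documents(docs, embedding_model)
        vs.save_local(path)
        return vs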

    2.2 Multiple Chains

    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_openai import ChatOpenAI
    from langchain_core.runnables import RunnablePassthrough
    from operator import itemgetter
    import os
    from dotenv import load_dotenv, find_dotenv
    
    # Remove the all_proxy environment variable
    if 'all_proxy' in os.environ:
        del os.environ['all_proxy']

    # Remove the ALL_PROXY environment variable
    if 'ALL_PROXY' in os.environ:
        del os.environ['ALL_PROXY']
    
    _ = load_dotenv(find_dotenv())
    
    
    planner = (
        ChatPromptTemplate.from_template("总结功能需求: {input}")
        | ChatOpenAI(model="gpt-4o-mini")
        | StrOutputParser()
        | {"base_response": RunnablePassthrough()}
    )
    
    # Generate the C++ code
    code_cpp = (
        ChatPromptTemplate.from_template(
            "写出关于{base_response}的cpp代码"
        )
        | ChatOpenAI(model="gpt-4o-mini")
        | StrOutputParser()
    )
    
    # Generate the Python code
    code_python = (
        ChatPromptTemplate.from_template(
            "写出关于{base_response}的python代码"
        )
        | ChatOpenAI(model="gpt-4o-mini")
        | StrOutputParser()
    )
    
    # Final responder: combine the original summary with the generated C++ and Python code into the final reply
    final_responder = (
        ChatPromptTemplate.from_messages(
            [
                ("ai", "{original_response}"),
                ("human", "cpp代码:\n{code_cpp}\n\npython代码:\n{code_python}"),
                ("system", "打印出生成的完整的cpp代码,逐步检查cpp代码,发现是否有问题,并提出解决方法;同理,python代码同样"),
            ]
        )
        | ChatOpenAI(model="gpt-4o-mini")
        | StrOutputParser()
    )
    
    # Build the full chain: summarize the requirement, generate both code variants in parallel, then produce the final review
    chain = (
        planner
        | {
            "code_cpp": code_cpp,
            "code_python": code_python,
            "original_response": itemgetter("base_response"),
        }
        | final_responder
    )
    
    print(chain.invoke({"input": "异步多线程并行运算的简单demo"}))
    

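    One detail in the chain above is easy to miss: the plain dict between planner and final_responder is coerced by LCEL into a RunnableParallel, so code_cpp, code_python, and the itemgetter all run against the same planner output. Written out explicitly, that middle step is equivalent to:

    from langchain_core.runnables import RunnableParallel

    middle = RunnableParallel(
        code_cpp=code_cpp,
        code_python=code_python,
        original_response=itemgetter("base_response"),
    )
    # chain = planner | middle | final_responder  # same behavior as the dict form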
    Output:

    Alright, let's step through the provided C++ and Python code, confirm they are free of problems, and point out anything that could be improved.

    Step-by-step review of the C++ code

    Full code recap
    #include <iostream>
    #include <vector>
    #include <thread>
    #include <future>
    #include <numeric>
    #include <chrono>
    
    // Compute task: sum a range of the data
    int computeSum(const std::vector<int>& data, size_t start, size_t end) {
        int sum = 0;
        for (size_t i = start; i < end; ++i) {
            sum += data[i];
        }
        return sum;
    }
    
    // Main entry point
    int main() {
        // 1. Read input data from the user
        size_t dataSize;
        std::cout << "Enter the data size: ";
        std::cin >> dataSize;

        std::vector<int> data(dataSize);
        std::cout << "Enter " << dataSize << " integers: ";
        for (size_t i = 0; i < dataSize; ++i) {
            std::cin >> data[i];
        }
    
        // 2. Choose the number of threads
        const size_t numThreads = std::thread::hardware_concurrency();
        std::vector<std::future<int>> futures;
        size_t chunkSize = dataSize / numThreads;
    
        // 3. Launch the compute tasks asynchronously
        auto startTime = std::chrono::high_resolution_clock::now();
        for (size_t i = 0; i < numThreads; ++i) {
            size_t start = i * chunkSize;
            size_t end = (i == numThreads - 1) ? dataSize : start + chunkSize;
    
            // Start an asynchronous task with std::async
            futures.push_back(std::async(std::launch::async, computeSum, std::ref(data), start, end));
        }
    
        // 4. Aggregate the results
        int totalSum = 0;
        for (auto& fut : futures) {
            totalSum += fut.get();
        }
    
        auto endTime = std::chrono::high_resolution_clock::now();
        std::chrono::duration<double> elapsed = endTime - startTime;
    
        // 5. Report the results
        std::cout << "Result: " << totalSum << std::endl;
        std::cout << "Elapsed: " << elapsed.count() << " s" << std::endl;
    
        return 0;
    }
    
    Step-by-step review
    1. Input validation: the code does not validate user input; dataSize and the data array should be checked so that negative numbers and non-integer input are rejected.
       Fix: use std::cin.fail() to detect invalid input, clear the stream, and prompt the user to re-enter.
    2. Thread count: std::thread::hardware_concurrency() may return 0, which means the number of hardware threads could not be determined.
       Fix: check the value before using it and fall back to a default, e.g. 2, when it is 0.
    3. Memory safety: computeSum takes a reference to data, so make sure no data races arise in the multithreaded context.
       Fix: no race occurs in the current setup, but a std::vector<std::mutex> could be introduced to protect shared resources if the code grows.
    4. Performance: if dataSize is smaller than numThreads, some threads end up with no work.
       Fix: adjust numThreads according to dataSize when partitioning, so every thread gets work.

    Step-by-step review of the Python code

    Full code recap
    import concurrent.futures
    import random
    import time
    
    # Function that simulates a compute task
    def compute_task(n):
        # Simulate an expensive computation
        time.sleep(random.uniform(0.1, 1.0))  # random delay
        return n * n  # example: return the square
    
    def main():
        # Read the task data from the user
        try:
            tasks = input("Enter the numbers to process, separated by commas: ")
            numbers = list(map(int, tasks.split(',')))
        except ValueError:
            print("Invalid input; please enter numbers.")
            return

        results = []  # store the computed results
    
        # Run the computations asynchronously with a thread pool
        with concurrent.futures.ThreadPoolExecutor() as executor:
            future_to_num = {executor.submit(compute_task, num): num for num in numbers}
    
            # Collect the results as they complete
            for future in concurrent.futures.as_completed(future_to_num):
                num = future_to_num[future]
                try:
                    result = future.result()
                    results.append((num, result))
                    print(f"数字 {num} 的平方是 {result}")
                except Exception as e:
                    print(f"任务 {num} 处理时出现错误:{e}")
    
        # Summarize the results
        print("\nAll computations finished.")
        print("Summary of results:")
        for num, result in results:
            print(f"The square of {num} is {result}")
    
    if __name__ == "__main__":
        main()
    
    Step-by-step review
    1. Input validation: the input has basic exception handling, but the values are unbounded; extremely large numbers can cause performance problems.
       Fix: after parsing, validate the numbers list and limit its size and range.
    2. Random delay: the random sleep in compute_task makes task duration unpredictable, which is unhelpful for performance testing.
       Fix: use a fixed delay, or a more substantial computation, to make runs reproducible.
    3. Result storage: keeping every number/result pair in the results list is fine for small inputs but may exhaust memory at scale.
       Fix: process results in batches, or write them to a file instead of holding them in memory.
    4. Exception handling: the current handler catches all exceptions but reports little detail.
       Fix: record the specific exception type and message in the error output.

    Summary

    Stepping through both programs surfaced several areas for improvement in the C++ and Python code, with a corresponding fix suggested for each issue. Applying these changes would make the code more robust and improve the user experience.

    2.3 RAG Application

    RAG is a technique that combines retrieved document context with a large language model (LLM) to generate an answer.

    The overall process breaks down into the following steps:

  • Load documents: bring the raw data (from websites, local files, various platforms, and so on) into LangChain.
  • Split documents: cut the loaded documents into smaller chunks that fit the model's context window and are easier to embed and retrieve.
  • Store embeddings: embed the chunks into a vector space and store them in a vector database for later retrieval.
  • Retrieve documents: query the vector database for the chunks most relevant to the question.
  • Generate the answer: combine the retrieved chunks with the user's question to generate and return an answer.

    Together, these steps yield a capable question-answering system that breaks a complex task into smaller steps and produces detailed answers.

    import bs4
    from langchain_community.document_loaders import WebBaseLoader
    from langchain_text_splitters import RecursiveCharacterTextSplitter
    from langchain_openai import ChatOpenAI
    from langchain_openai import OpenAIEmbeddings
    from langchain_chroma import Chroma
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.runnables import RunnablePassthrough
    from langchain import hub
    from langchain_core.prompts import PromptTemplate
    
    import os
    from dotenv import load_dotenv, find_dotenv
    
    # Remove the all_proxy environment variable
    if 'all_proxy' in os.environ:
        del os.environ['all_proxy']

    # Remove the ALL_PROXY environment variable
    if 'ALL_PROXY' in os.environ:
        del os.environ['ALL_PROXY']
    
    _ = load_dotenv(find_dotenv())
    
    os.environ["http_proxy"] = "socks5://127.0.0.1:1080"
    os.environ["https_proxy"] = "socks5://127.0.0.1:1080"
    
    # Class names of the HTML elements to keep: only elements whose class is
    # post-title, post-header, or post-content survive parsing; every other
    # HTML element is discarded.
    bs4_strainer = bs4.SoupStrainer(class_=("post-title", "post-header", "post-content"))
    loader = WebBaseLoader(web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
                           bs_kwargs={"parse_only": bs4_strainer},)
    docs = loader.load()
    
    
    # Split the documents into 1000-character chunks with 200 characters of overlap using RecursiveCharacterTextSplitter
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=200, add_start_index=True
    )
    all_splits = text_splitter.split_documents(docs)
    
    vectorstore = Chroma.from_documents(documents=all_splits, embedding=OpenAIEmbeddings())
    print(type(vectorstore))
    
    retriever = vectorstore.as_retriever(search_type="similarity",
                    search_kwargs={'k': 6})
    
    # Model used by the RAG chain to combine the user question with retrieved documents
    llm = ChatOpenAI(model="gpt-4o-mini")
    
    # Pull the RAG prompt template from the LangChain hub
    prompt = hub.pull("rlm/rag-prompt")
    
    # Helper that formats the retrieved documents
    def format_docs(docs):
        return "\n\n".join(doc.page_content for doc in docs)
    
    # Build the RAG chain with LCEL
    rag_chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )
    
    print(rag_chain.invoke("What is ToT?"))
    # Custom prompt template
    template = """Use the following pieces of context to answer the question at the end.
    If you don't know the answer, just say that you don't know, don't try to make up an answer.
    Use three sentences maximum and keep the answer as concise as possible.
    Always say "thanks for asking!" at the end of the answer.
    
    {context}
    
    Question: {question}
    
    Helpful Answer:"""
    
    custom_rag_prompt = PromptTemplate.from_template(template)
    custom_rag_chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | custom_rag_prompt
        | llm
        | StrOutputParser()
    )
    print("*"*40)
    print(custom_rag_chain.invoke("What is ToT?"))
    

    Output:

    ToT stands for Tree of Thoughts, a reasoning framework that extends the Chain of Thought (CoT) approach. It involves breaking down problems into multiple thought steps, generating various thoughts at each step, and organizing them into a tree structure for evaluation. The process can utilize search strategies like breadth-first or depth-first search to explore different reasoning possibilities.
    ****************************************
    ToT refers to "Tree of Thoughts," a framework that extends Chain of Thought (CoT) reasoning by exploring multiple reasoning possibilities at each step of problem-solving. It decomposes tasks into various thought steps and organizes them into a tree structure, utilizing search strategies like BFS or DFS for evaluation. Thanks for asking!
    

    Problems encountered:

    2.3.1 Version conflict between langchain_core and langchain_openai

    Error:
    ImportError: cannot import name 'InputTokenDetails' from 'langchain_core.messages.ai'

    Traceback (most recent call last):
      File "/home/xjg/workspace/openai-quickstart/langchain/pycharm/LCEL/rag.py", line 4, in <module>
        from langchain_openai import ChatOpenAI
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/langchain_openai/__init__.py", line 1, in <module>
        from langchain_openai.chat_models import AzureChatOpenAI, ChatOpenAI
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/langchain_openai/chat_models/__init__.py", line 1, in <module>
        from langchain_openai.chat_models.azure import AzureChatOpenAI
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/langchain_openai/chat_models/azure.py", line 29, in <module>
        from langchain_openai.chat_models.base import BaseChatOpenAI
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/langchain_openai/chat_models/base.py", line 66, in <module>
        from langchain_core.messages.ai import (
    ImportError: cannot import name 'InputTokenDetails' from 'langchain_core.messages.ai' (/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/langchain_core/messages/ai.py)
    

    Check the version:

    $ pip show langchain-core
    Name: langchain-core
    Version: 0.2.43
    Summary: Building applications with LLMs through composability
    Home-page: https://github.com/langchain-ai/langchain
    Author:
    Author-email:
    License: MIT
    Location: /home/xjg/.conda/envs/langchain/lib/python3.10/site-packages
    Requires: jsonpatch, langsmith, packaging, pydantic, PyYAML, tenacity, typing-extensions
    Required-by: langchain, langchain-chroma, langchain-community, langchain-experimental, langchain-openai, langchain-text-splitters
    

    Fix:

    $ pip install --upgrade langchain-openai langchain-core --proxy=""
    
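    If you prefer explicit pins over an open-ended upgrade, constraining both packages to compatible ranges also works; the bounds below are illustrative assumptions, so check what your other langchain-* packages require:

    $ pip install "langchain-core>=0.3,<0.4" "langchain-openai>=0.2,<0.3" --proxy=""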

    Check again:

    $ pip show langchain-core
    Name: langchain-core
    Version: 0.3.15
    Summary: Building applications with LLMs through composability
    Home-page: https://github.com/langchain-ai/langchain
    Author:
    Author-email:
    License: MIT
    Location: /home/xjg/.conda/envs/langchain/lib/python3.10/site-packages
    Requires: jsonpatch, langsmith, packaging, pydantic, PyYAML, tenacity, typing-extensions
    Required-by: langchain, langchain-chroma, langchain-community, langchain-experimental, langchain-openai, langchain-text-splitters
    

    2.3.2 Connection reset

    Error:
    requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

    USER_AGENT environment variable not set, consider setting it to identify your requests.
    Traceback (most recent call last):
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/connectionpool.py", line 789, in urlopen
        response = self._make_request(
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/connectionpool.py", line 490, in _make_request
        raise new_e
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/connectionpool.py", line 466, in _make_request
        self._validate_conn(conn)
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
        conn.connect()
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/connection.py", line 730, in connect
        sock_and_verified = _ssl_wrap_socket_and_match_hostname(
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/connection.py", line 909, in _ssl_wrap_socket_and_match_hostname
        ssl_sock = ssl_wrap_socket(
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 469, in ssl_wrap_socket
        ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 513, in _ssl_wrap_socket_impl
        return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/ssl.py", line 513, in wrap_socket
        return self.sslsocket_class._create(
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/ssl.py", line 1104, in _create
        self.do_handshake()
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/ssl.py", line 1375, in do_handshake
        self._sslobj.do_handshake()
    ConnectionResetError: [Errno 104] Connection reset by peer
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/requests/adapters.py", line 667, in send
        resp = conn.urlopen(
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/connectionpool.py", line 843, in urlopen
        retries = retries.increment(
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/util/retry.py", line 474, in increment
        raise reraise(type(error), error, _stacktrace)
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/util/util.py", line 38, in reraise
        raise value.with_traceback(tb)
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/connectionpool.py", line 789, in urlopen
        response = self._make_request(
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/connectionpool.py", line 490, in _make_request
        raise new_e
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/connectionpool.py", line 466, in _make_request
        self._validate_conn(conn)
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1095, in _validate_conn
        conn.connect()
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/connection.py", line 730, in connect
        sock_and_verified = _ssl_wrap_socket_and_match_hostname(
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/connection.py", line 909, in _ssl_wrap_socket_and_match_hostname
        ssl_sock = ssl_wrap_socket(
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 469, in ssl_wrap_socket
        ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/urllib3/util/ssl_.py", line 513, in _ssl_wrap_socket_impl
        return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/ssl.py", line 513, in wrap_socket
        return self.sslsocket_class._create(
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/ssl.py", line 1104, in _create
        self.do_handshake()
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/ssl.py", line 1375, in do_handshake
        self._sslobj.do_handshake()
    urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/home/xjg/workspace/openai-quickstart/langchain/pycharm/LCEL/rag.py", line 38, in <module>
        docs = loader.load()
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/langchain_core/document_loaders/base.py", line 31, in load
        return list(self.lazy_load())
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/langchain_community/document_loaders/web_base.py", line 329, in lazy_load
        soup = self._scrape(path, bs_kwargs=self.bs_kwargs)
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/langchain_community/document_loaders/web_base.py", line 308, in _scrape
        html_doc = self.session.get(url, **self.requests_kwargs)
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/requests/sessions.py", line 602, in get
        return self.request("GET", url, **kwargs)
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
        resp = self.send(prep, **send_kwargs)
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
        r = adapter.send(request, **kwargs)
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/requests/adapters.py", line 682, in send
        raise ConnectionError(err, request=request)
    requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
    

    Fix:
    Enable the proxy, then add the following to the code:

    import os
    
    os.environ["http_proxy"] = "socks5://127.0.0.1:1080"
    os.environ["https_proxy"] = "socks5://127.0.0.1:1080"
    
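    Note that requests can only use a SOCKS proxy when the PySocks extra is installed, and the socks5h:// scheme (note the h) additionally routes DNS resolution through the proxy:

    $ pip install "requests[socks]"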

    2.3.3 Missing library

    Error:
    ImportError: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/grpc/_cython/cygrpc.cpython-310-x86_64-linux-gnu.so)

    File "/home/xjg/workspace/openai-quickstart/langchain/pycharm/LCEL/rag.py", line 6, in <module>
        from langchain_chroma import Chroma
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/langchain_chroma/__init__.py", line 6, in <module>
        from langchain_chroma.vectorstores import Chroma
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/langchain_chroma/vectorstores.py", line 24, in <module>
        import chromadb
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/chromadb/__init__.py", line 6, in <module>
        from chromadb.auth.token_authn import TokenTransportHeader
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/chromadb/auth/token_authn/__init__.py", line 24, in <module>
        from chromadb.telemetry.opentelemetry import (
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/chromadb/telemetry/opentelemetry/__init__.py", line 13, in <module>
        from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/opentelemetry/exporter/otlp/proto/grpc/trace_exporter/__init__.py", line 20, in <module>
        from grpc import ChannelCredentials, Compression
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/grpc/__init__.py", line 22, in <module>
        from grpc import _compression
      File "/home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/grpc/_compression.py", line 20, in <module>
        from grpc._cython import cygrpc
    ImportError: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /home/xjg/.conda/envs/langchain/lib/python3.10/site-packages/grpc/_cython/cygrpc.cpython-310-x86_64-linux-gnu.so)
    

    Reference:
    ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found

    # Locate copies of libstdc++ on the system
    sudo find / -name "libstdc++.so.6*"
    # Confirm the conda-provided copy exports the needed GLIBCXX symbols
    strings /home/xjg/.conda/envs/ai_endpoint/lib/libstdc++.so.6.0.29 | grep GLIBCXX
    # Copy the newer library into place and re-point the symlink
    sudo cp /home/xjg/.conda/envs/ai_endpoint/lib/libstdc++.so.6.0.29 /usr/lib/x86_64-linux-gnu/
    sudo rm /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    sudo ln -s /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.29 /usr/lib/x86_64-linux-gnu/libstdc++.so.6
    # Verify the symlinked library now provides GLIBCXX_3.4.29
    strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX
    

    3. References

    1. rlm/rag-prompt
    2. LangChain Expression Language (LCEL)

Author: 爱学习的小道长
