HR Chatbot with LangChain and Chainlit

#BigData #AI #DataAnalytics #DataManagement #AIAgents

In this story we will see how you can create a human resources chatbot using LangChain and Chainlit. This chatbot answers questions about employee-related policies on topics such as maternity leave, hazard reporting, training, code of conduct and more.

Since this project was created as a POC, we decided to give it a spin by hijacking the prompts used in this tool, so that the chatbot would tell the occasional joke whilst answering questions. So this chatbot should have a humorous spin and some sort of “personality”.

About Chainlit

Chainlit is an open-source Python / TypeScript library that allows developers to create ChatGPT-like user interfaces quickly. It allows you to create a chain of thoughts and then add a pre-built, configurable chat user interface. It is well suited to web-based chatbots.

Chainlit is better suited to this task than Streamlit, which requires much more work to configure the UI components.

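The code of the Chainlit library can be found here:

Chainlit - Build Python LLM apps in minutes ⚡️

Chainlit has two main components:

Backend: it allows interaction with libraries such as LangChain, LlamaIndex and LangFlow, and is Python-based.
Frontend: it is a TypeScript-based React application using Material UI components.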

A Quick Tour of Our HR Chatbot

Our chatbot has a UI which initially looks like this in light mode:

Initial chatbot UI

You can then type your question and the result is shown below (using dark mode):

Question about normal working hours

The UI shows you not only the question and the answer, but also the source files. If the text was found, the PDF files are also clickable and you can view their content.

If you expand the steps of the LangChain chain, this is what you see:

Chain of thought representation

The user interface (UI) also has a searchable history:

Search history

You can also easily switch between light and dark mode:

Open settings

Dark mode switch

An HR Chatbot that Tells Jokes

We manipulated the chatbot to tell jokes, especially when it cannot answer a question based on the available knowledge base. So you might see responses like this:

Joke when no sources are found

Chain Workflows

There are two workflows in this application:

Setup workflow: used to set up the vector database (in our case FAISS) representing a collection of text documents
User interface workflow: the thought chain interactions

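Setup Workflow

You can visualize the setup using the following BPMN diagram:

Setup workflow

These are the steps of the setup workflow:

The code starts by listing all PDF documents in a folder.
The text of each page of the documents is extracted.
The text is sent to the OpenAI embeddings API.
A collection of embeddings is retrieved.
The result is accumulated in memory.
The collection of accumulated embeddings is persisted to disk.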
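
The steps above can be sketched in a few lines of plain Python. This is only an illustration of the flow, not the project's code: `fake_embed` is a made-up stand-in for the OpenAI embeddings API call, a JSON file stands in for the FAISS store, and the sample pages are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path

EMBEDDING_SIZE = 8  # tiny on purpose; real embedding vectors are much larger


def fake_embed(text: str) -> list[float]:
    """Stand-in for the embedding API call: hashes words into a fixed-size vector."""
    vector = [0.0] * EMBEDDING_SIZE
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vector[digest[0] % EMBEDDING_SIZE] += 1.0
    return vector


def build_index(pages: list[dict], target: Path) -> list[dict]:
    """Embed every page, accumulate the records in memory, then persist them."""
    records = []
    for page in pages:
        records.append(
            {
                "source": page["source"],
                "page": page["page"],
                "text": page["text"],
                "embedding": fake_embed(page["text"]),
            }
        )
    # Persist the accumulated records to disk in one go.
    target.write_text(json.dumps(records), encoding="utf-8")
    return records


# Hypothetical pages, as they might look after per-page PDF text extraction.
sample_pages = [
    {"source": "leave_policy.pdf", "page": 1, "text": "Maternity leave policy text for employees"},
    {"source": "safety.pdf", "page": 3, "text": "Report hazards to your line manager"},
]

index_file = Path(tempfile.mkdtemp()) / "index.json"
stored = build_index(sample_pages, index_file)
reloaded = json.loads(index_file.read_text(encoding="utf-8"))
```

Keeping the source file and page number next to each vector is what later allows the UI to show where an answer came from.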

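User Interface Workflow

This is the user interface workflow:

User interface workflow

These are the steps of the user interface workflow:

The user asks a question.
A similarity search query is executed against the vector database (in our case FAISS, Facebook AI Similarity Search).
The vector database typically returns up to 4 documents.
The returned documents are sent as context to ChatGPT (model: gpt-3.5-turbo-16k) together with the question.
ChatGPT returns an answer.
The answer is displayed in the UI.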
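
To make the retrieval step more concrete, here is a minimal sketch of a top-4 similarity search followed by prompt assembly. It uses brute-force cosine similarity over toy two-dimensional vectors instead of FAISS, and the record layout, function names and prompt layout are assumptions made for illustration only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector: list[float], records: list[dict], k: int = 4) -> list[dict]:
    """Return the k records whose embeddings are most similar to the query."""
    ranked = sorted(
        records,
        key=lambda record: cosine_similarity(query_vector, record["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, documents: list[dict]) -> str:
    """Concatenate the retrieved documents as context, followed by the question."""
    context = "\n\n".join(
        f"Content: {doc['text']}\nSource: {doc['source']}" for doc in documents
    )
    return f"{context}\n\nQuestion: {question}"


# Toy records with two-dimensional embeddings (hypothetical data).
records = [
    {"source": "a.pdf", "text": "text a", "embedding": [1.0, 0.0]},
    {"source": "b.pdf", "text": "text b", "embedding": [0.9, 0.1]},
    {"source": "c.pdf", "text": "text c", "embedding": [0.5, 0.5]},
    {"source": "d.pdf", "text": "text d", "embedding": [0.1, 0.9]},
    {"source": "e.pdf", "text": "text e", "embedding": [0.0, 1.0]},
]

hits = top_k([1.0, 0.0], records)
prompt = build_prompt("What are the normal working hours?", hits)
```

A vector database does the same ranking far more efficiently; the limit of four documents keeps the context sent to the model small.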

Chatbot Code Installation and Walkthrough

The whole chatbot code can be found in this GitHub repository:

http://github.com/onepointconsulting/hr-chatbot

Chatbot Code Walkthrough

Most of the application's configuration parameters are in this file:

http://github.com/onepointconsulting/hr-chatbot/blob/main/config.py

In this file we set the FAISS persistence directory, the type of embedding (“text-embedding-ada-002”, the default option) and the model (“gpt-3.5-turbo-16k”).
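
As a rough idea of what such a configuration module can look like, here is a small sketch. The class, field and environment variable names are hypothetical and not taken from the repository; only the two model names come from the text above.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class ChatbotConfig:
    faiss_persist_directory: Path  # where the vector index is stored on disk
    embedding_model: str
    chat_model: str
    humour: bool  # whether the joke-telling prompt is used


def load_config(env: Mapping[str, str]) -> ChatbotConfig:
    """Build the configuration from an environment-like mapping, with defaults."""
    return ChatbotConfig(
        faiss_persist_directory=Path(env.get("FAISS_STORE", "./faiss_store")),
        embedding_model=env.get("EMBEDDING_MODEL", "text-embedding-ada-002"),
        chat_model=env.get("CHAT_MODEL", "gpt-3.5-turbo-16k"),
        humour=env.get("HUMOUR", "true").lower() == "true",
    )


default_config = load_config({})
custom_config = load_config({"HUMOUR": "false", "FAISS_STORE": "/tmp/hr_index"})
```

In a real application you would typically pass `os.environ` to `load_config`; taking a mapping as a parameter simply makes the function easy to test.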

The text extraction and the embedding generation happen in this file:

http://github.com/onepointconsulting/hr-chatbot/blob/main/generate_embeddings.py

The function which extracts the PDF text per page (load_pdfs), together with the source file and page metadata, can be found via this link:

http://github.com/onepointconsulting/hr-chatbot/blob/main/generate_embeddings.py#L22

The function which generates the embeddings (generate_embeddings) can be found on line 57:

http://github.com/onepointconsulting/hr-chatbot/blob/main/generate_embeddings.py#L57

The file which initializes the vector store and creates a LangChain question answering chain is this one:

http://github.com/onepointconsulting/hr-chatbot/blob/main/chain_factory.py

The function load_embeddings is one of the most important functions in chain_factory.py.

This function loads the PDF documents to support text extraction in the Chainlit UI. In case there are no persisted embeddings, the embeddings are generated. In case the embeddings are persisted, then they are loaded from the file system.

This strategy avoids calling the embedding API too often, thus saving money.
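
The load-or-generate strategy can be illustrated with a minimal sketch. The function and variable names below are hypothetical, a JSON file stands in for the persisted vector store, and `expensive_generate` is a placeholder for the paid embedding API call.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_or_generate(store: Path, generate: Callable[[], list]) -> list:
    """Load persisted embeddings if present; otherwise generate and persist them."""
    if store.exists():
        return json.loads(store.read_text(encoding="utf-8"))
    data = generate()
    store.write_text(json.dumps(data), encoding="utf-8")
    return data


calls = []


def expensive_generate() -> list:
    """Placeholder for the embedding API call; records how often it runs."""
    calls.append(1)
    return [[0.1, 0.2], [0.3, 0.4]]


store_path = Path(tempfile.mkdtemp()) / "embeddings.json"
first = load_or_generate(store_path, expensive_generate)   # generates and persists
second = load_or_generate(store_path, expensive_generate)  # loads from disk
```

The generator runs only on the first call; every later start-up reads the file instead, which is where the cost saving comes from.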

Another important function in chain_factory.py is create_retrieval_chain, which loads the QA chain (question and answer chain):

This function creates the QA chain with memory. If the humour parameter is true, a manipulated prompt that tends to create jokes on certain occasions is used.

We had to subclass the ConversationSummaryBufferMemory class so that the memory would not throw an error about a missing key.

Here is an extract of the prompt text we used in the QA chain:

Given the following extracted parts of a long document and a question, create a final answer with references (“SOURCES”). If you know a joke about the subject, make sure that you include it in the response.

If you don’t know the answer, say that you don’t know and make up some joke about the subject. Don’t try to make up an answer.

ALWAYS return a “SOURCES” part in your answer.
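
One way to switch between a plain and a humorous prompt is to assemble the template from parts, as in the sketch below. The function name, the wording of the plain variant and the `{question}` / `{summaries}` placeholders are assumptions for illustration; the humorous sentences are taken from the extract above.

```python
BASE = (
    "Given the following extracted parts of a long document and a question, "
    'create a final answer with references ("SOURCES").'
)
JOKE = "If you know a joke about the subject, make sure that you include it in the response."
UNKNOWN_PLAIN = (
    "If you don't know the answer, just say that you don't know. "
    "Don't try to make up an answer."
)
UNKNOWN_HUMOUR = (
    "If you don't know the answer, say that you don't know and make up some joke "
    "about the subject. Don't try to make up an answer."
)
ALWAYS_SOURCES = 'ALWAYS return a "SOURCES" part in your answer.'
# Placeholder names are an assumption; they are filled in with str.format below.
TAIL = "\n\nQUESTION: {question}\n=========\n{summaries}\n=========\nFINAL ANSWER:"


def build_prompt_template(humour: bool) -> str:
    """Return the prompt template, with the joke instructions only if humour is on."""
    parts = [BASE]
    if humour:
        parts.append(JOKE)
    parts.append(UNKNOWN_HUMOUR if humour else UNKNOWN_PLAIN)
    parts.append(ALWAYS_SOURCES)
    return " ".join(parts) + TAIL


funny_template = build_prompt_template(humour=True)
plain_template = build_prompt_template(humour=False)
filled = funny_template.format(
    question="How do I report a hazard?",
    summaries="Content: ...\nSource: safety.pdf",
)
```

Because the only difference between the two variants is a couple of sentences, the chatbot's “personality” can be toggled with a single flag.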

The actual Chainlit-related code is in this file:

http://github.com/onepointconsulting/hr-chatbot/blob/main/hr_chatbot_chainlit.py

The Chainlit library works with Python decorators. The initialization of the LangChain QA chain is done inside a decorated function which you can find here:

http://github.com/onepointconsulting/hr-chatbot/blob/main/hr_chatbot_chainlit.py#L56

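This function loads the vector store using load_embeddings(). The QA chain is then initialized by the function create_retrieval_chain and returned.

The last function, process_response, converts the LangChain result dictionary to a Chainlit Message object. Most of the code in this method tries to extract the sources, which sometimes come in unexpected formats in the text. Here is the code: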

http://github.com/onepointconsulting/hr-chatbot/blob/main/hr_chatbot_chainlit.py#L127
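
To give a flavour of this kind of parsing, here is a simplified sketch that splits a reply into answer and sources. It is not the repository's implementation: the function name and the formats it tolerates (a "SOURCES:" or "Source:" marker followed by a comma-separated or bulleted list) are assumptions.

```python
import re

# Matches a "SOURCES:" / "Source:" marker and captures everything after it.
SOURCES_PATTERN = re.compile(r"\bsources?\s*:\s*(.*)$", re.IGNORECASE | re.DOTALL)


def split_answer_and_sources(text: str) -> tuple[str, list[str]]:
    """Split an LLM reply into the answer text and a list of source names."""
    match = SOURCES_PATTERN.search(text)
    if match is None:
        return text.strip(), []
    answer = text[: match.start()].strip()
    candidates = re.split(r"[,;\n]+", match.group(1))
    # Drop bullet dashes and surrounding whitespace, skip empty entries.
    sources = [c.strip(" -\t") for c in candidates if c.strip(" -\t")]
    return answer, sources


inline = split_answer_and_sources(
    "Working hours are described in the handbook.\nSOURCES: handbook.pdf, policies/hours.pdf"
)
bulleted = split_answer_and_sources("I don't know.\nSource:\n- a.pdf\n- b.pdf")
missing = split_answer_and_sources("No marker in this reply.")
```

Even a tolerant parser like this one has to return an empty list when the model leaves out the sources part altogether.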

Key Takeaways

Chainlit is part of the growing LangChain ecosystem and allows you to build nice-looking web-based chat applications really quickly. It has some customization options, e.g. allowing quick integration with authentication platforms or persisting data.

However, we found it somewhat difficult to remove the “Built with Chainlit” footer note and ended up doing so in a rather “hacky” way, which is probably not very clean. At this point in time it is not really clear how deep UI customization can be done without creating a fork or using dirty hacks.

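Another problem we faced was how to reliably interpret the LLM output, especially how to extract the sources from the reply. Although the prompt tells the LLM to:

create a final answer with references (“SOURCES”)

it does not always do so. This leads to somewhat unreliable source extraction. However, this problem can be addressed with OpenAI’s function calling feature, with which you can specify an output format, but which was not available at the time of writing the code.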
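
As an illustration of the idea, a function definition for structured answers might look like the sketch below. The function name `answer_with_sources` and its fields are hypothetical, and no API call is made: the arguments string is simulated to show how the JSON payload would be decoded.

```python
import json

# Function definition in JSON Schema form, as used by function calling.
ANSWER_FUNCTION = {
    "name": "answer_with_sources",
    "description": "Return the answer to the HR question together with its sources.",
    "parameters": {
        "type": "object",
        "properties": {
            "answer": {"type": "string", "description": "The answer text."},
            "sources": {
                "type": "array",
                "items": {"type": "string"},
                "description": "File names the answer is based on.",
            },
        },
        "required": ["answer", "sources"],
    },
}


def parse_function_arguments(arguments: str) -> tuple[str, list[str]]:
    """Decode the JSON string that carries the function call arguments."""
    payload = json.loads(arguments)
    return payload["answer"], list(payload.get("sources", []))


# Simulated arguments string, shaped like what a function call would carry.
simulated_arguments = json.dumps(
    {"answer": "See the staff handbook.", "sources": ["handbook.pdf"]}
)
answer, sources = parse_function_arguments(simulated_arguments)
```

With the sources arriving as a JSON array, no free-text parsing of the reply is needed.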

On the bright side, LLMs allow you to create chatbots with a flavour if you are willing to change the prompt in creative ways. The jokes delivered by the HR assistant are not that great; however, they prove the point that you can create “flavoured” AI assistants that will eventually be more engaging and more fun for end users.
