
How Can LLMs Be Used for OLAP Self-Service Data Analysis?


Contents

LLM Is Not Enough (For Self-Service Analytics)

Introduction

A Conversation with AI

The Key Requirements

Capability

Reliability

Accuracy

Security & Governance

Cost & Speed

Adapting Large Language Models To New Domains

01. Fine-tune The Base Model For Domain-Specific Tasks

02. Search and Ask Method

The Cost Factor

Which One To Choose?

Data Semantic Layer and LLM

Conclusion and What’s Next


LLM Is Not Enough (For Self-Service Analytics)

Introduction

A few weeks ago, Microsoft announced a new data analytics product called Fabric. One of Fabric’s most exciting features is a chat interface that lets users ask data questions in natural language. So instead of waiting in a data request queue, everyone gets instant answers to their data questions.

Since the release of OpenAI’s models and open-source LLMs, analytics vendors big and small have been scrambling to integrate modern LLMs into their products. Between Microsoft and a flurry of data vendors with shiny demos and glitzy promises, the future seems bright. “Unlock hidden insights with AI,” they say. “The power of AI at your fingertips,” they proclaim. And so it goes, on and on and on.

Beneath the surface, the truth is often far less glamorous.

An AI assistant can give you instant answers to your data questions, but can you trust them?

There is a big difference between being fast and being reliable. I can give you an answer to 4566 * 145966 in 1 second. It’s fast, but it’s certainly not correct, just as the SQL query generated by the AI assistant in the Fabric demo was not correct (despite being generated quickly).

This is not a rare case. In a separate article, an analyst pits ChatGPT against a human in a SQL-writing showdown. ChatGPT was right about 50% of the time. That is fine when the tool is used by analysts who can check the validity of the output, but it is a total disaster in a self-service analytics tool for non-technical business users.
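To make "checking the validity of the output" concrete, here is a minimal sketch of what an analyst might do by hand: run the generated query and a trusted reference query against a small fixture and compare the results. The `orders` table, both queries, and the `rows_match` helper are hypothetical and exist only for illustration; they are not taken from the Fabric demo or from the ChatGPT comparison.

```python
import sqlite3
from collections import Counter


def rows_match(conn, candidate_sql, reference_sql):
    """Return True if both queries yield the same rows, ignoring row order."""
    try:
        candidate = conn.execute(candidate_sql).fetchall()
    except sqlite3.Error:
        # The generated query does not even run against this schema.
        return False
    reference = conn.execute(reference_sql).fetchall()
    return Counter(candidate) == Counter(reference)


# Hypothetical fixture: three orders, one of them refunded.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, status TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'paid', 100.0),
        (2, 'refunded', 40.0),
        (3, 'paid', 60.0);
""")

# Question: "What is our revenue?" The analyst knows refunds must be excluded.
reference_sql = "SELECT SUM(amount) FROM orders WHERE status = 'paid'"
# Hypothetical LLM output: valid SQL that runs instantly but counts the refund.
generated_sql = "SELECT SUM(amount) FROM orders"

print(rows_match(conn, generated_sql, reference_sql))
```

The generated query is fast and syntactically valid, yet the check fails because it silently includes refunded orders. A business user without a reference query has no way to notice that.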

This is not to say that LLMs have no place in the BI world. We’ve all seen the power of LLMs in the hands of data analysts: quickly generating SQL/Python code for ad-hoc analysis, brainstorming ideas during research, or eloquently summarizing complex analyses.

But before you’re convinced that LLMs can be easily integrated into your current BI tool to make your self-service dream come true, let’s take the whole thing with a healthy dose of skepticism. Let’s seek proof, not from glossy brochures and snazzy demos, but from first-principles reasoning and an understanding of the very components that LLMs are made of.

This series is divided into two parts.

In part one (this post), we’ll try to convince you that yes, LLMs will change the landscape of self-service analytics, but the change will not happen quickly, nor is the technology by itself enough to power next-generation self-service analytics tools. We’ll also show why a data semantic layer is a crucial component in building an LLM-powered self-service analytics system.

In part two (a subsequent post), we’ll show why we think the current design of the data semantic layer is not enough for such a task, and which ingredients are missing for that to happen.

Now let’s start.

A Conversation with AI

Imagine a perfect world where business users and analysts live in peace, free from report sprawl, dashboard flooding, and request queue frustration - where the LLM becomes an integral cog in any Business Intelligence machine. Forget about all of the how-to technicalities. Forget about all of the LLM’s current limitations. Imagine a conversation between a business user and an AI. What would it be like?

In a separate article, the author paints a vivid picture of how the conversation would go:
