赞
踩
Streamlit 是一个用于创建数据科学和机器学习应用程序的开源 Python 库。它的主要目标是使开发人员能够以简单的方式快速构建交互式的数据应用,而无需过多的前端开发经验。Streamlit 提供了一种简单的方法来转换数据脚本或分析代码为具有可视化界面的应用程序,这些应用程序可以通过网络浏览器访问。
以下是 Streamlit 的一些主要特点和优势:
无论是为了创建数据展示、可视化、模型演示还是进行原型开发,Streamlit 都是一个非常有用的工具。所以本文我们将通过创建一个简单的Dashboard作为Streamlit 入门。
我们的目标是做一个下面样式的dashboard。
我们的目标是建立一个上面草图所示的dashboard并且能够展示数据。
首先我们对数据进行处理
import pandas as pd import numpy as np import plotly.express as px import streamlit as st def dateAge(x): #function for generating future bucket grouping a = (x - np.datetime64("today","D")) / np.timedelta64(1,"D") if a <= 0: y = "0 or Less" elif a >0 and a <= 5: y = "5 days" elif a > 5 and a <= 14: y = "6 to 14 days" elif a > 14 and a <= 30: y = "15 to 30 days" else: y = "over 30 days" return y #built as a function for cache use with StreamLit later def getData(): x = pd.read_csv(r'C:\Users\aryan.sinanan\Desktop\Python\raw_data\demand.csv') return x #assign data to df variable df = getData() #Set Org level max and role title or partial name you are looking for org_level = ["6","7","8"] role_title = "Data" #Datatype convert to date from dd-MMM-yy df["Resource Start Date"] = pd.to_datetime(df["Resource Start Date"],format="%d-%b-%y") df["Resource End Date"] = pd.to_datetime(df["Resource End Date"],format="%d-%b-%y") #Define Future Bucket df["Date Bucket"] = df["Resource Start Date"].apply(dateAge) #clean up Location names df.loc[df["Role Work Location"] == "melbourne", "Role Work Location"] = "Melbourne" df.loc[df["Role Work Location"] == "canberra", "Role Work Location"] = "Canberra" #rename columns df.rename(columns={ "Project Has Security/ Nationality Restriction":"Clearance Required", "Resource Start Date":"Start Date", "Resource End Date":"End Date", "Role ID":"ID", "Role Title":"Title", "Role Description":"Description", "Role Talent Segment":"Talent Segment", "Role Career Level From":"Career Level From", "Role Career Level To":"Career Level To", "Role Work Location":"Work Location", "Role Location Type":"Location Type", "Role Fulfillment Entity L3":"Fulfillment Entity L3" }, inplace = True) #drop the unncessary columns df_sub = df.loc[:,("ID","Clearance Required","Start Date","End Date","Date Bucket","Title","Description","Talent Segment","Assigned Role","Career Level To","Work Location","Location Type","Role Primary Contact","Role Primary Contact\n(Email ID)")] #filter the dataframe using ord_level and role_title df_filter = df_sub[(df_sub["Assigned Role"].str.contains( role_title ,case=False,na=False)) & (df_sub["Career Level To"].isin(org_level))]
Streamlit虽然允许我们控制元素布局,但是它的控制项还是比较简单,比如菜单栏必须要放在左侧,也就是我们常说的侧边栏,所以我们将侧边栏作为过滤的条件使用。
以下代码是,常见页面标题和侧边栏的代码:
#title st.markdown("# Roles Dashbaord") #defining side bar st.sidebar.header("Filters:") #placing filters in the sidebar using unique values. location = st.sidebar.multiselect( "Select Location:", options=df_filter["Work Location"].unique(), default=df_filter["Work Location"].unique() ) #placing filters in the sidebar using unique values. work_type = st.sidebar.multiselect( "Select Work Type:", options=df_filter["Location Type"].unique(), default=df_filter["Location Type"].unique() )
以上代码结果如下:
侧边栏中进行过滤器后就需要将过滤后的信息展示,这里我们先展示过滤后的汇总指标
#taking the filtered dataframe created in a previous step and applying a query df_selection = df_filter.query( "`Work Location`== @location & `Location Type` == @work_type" ) #defining our metrics total_roles = df_selection["ID"].value_counts().sum() bucket_0 = df_selection[df_selection["Date Bucket"]=="0 or Less"].value_counts().sum() bucket_5 = df_selection[df_selection["Date Bucket"]=="5 days"].value_counts().sum() bucket_14 = df_selection[df_selection["Date Bucket"]=="6 to 14 days"].value_counts().sum() bucket_30 = df_selection[df_selection["Date Bucket"]=="15 to 30 days"].value_counts().sum() bucket_31 = df_selection[df_selection["Date Bucket"]=="over 30 days"].value_counts().sum() #placing our metrics within columns in the dashboard col1,col2,col3,col4,col5,col6=st.columns(6) col1.metric(label="No. Roles",value=total_roles) col2.metric(label="Already Started",value=bucket_0) col3.metric(label="In 5 Days",value=bucket_5) col4.metric(label="In 14 Days",value=bucket_14) col5.metric(label="In 30 Days",value=bucket_30) col6.metric(label="Over 30 Days",value=bucket_31)
结果如下,可以看到,我们已经可以展示出一些数据了
下面我们还可以创建图表
#a dividing line
st.divider()
#dataframe for chart
df_1 = df_selection.groupby(["Work Location","Location Type"])["Work Location"].value_counts().reset_index()
#defining the chart
fig_count_location_type = px.bar(df_1,x="Work Location",y="count",color="Location Type",title="Role by Location and Type - Plotly")
#displaying the chart on the dashboard
st.plotly_chart(fig_count_location_type, use_container_width=True)
使用plotly_chart可以轻松创建表格
我们还可以通过显示Dataframe来展示过滤后数据集的所有数据,也就是显示详情数据。
#a dividing line st.divider() #showing the dataframe st.dataframe(df_selection,hide_index=True,column_config={ # we can also config the formatting of a given column "ID": st.column_config.NumberColumn( #show the ID as a number no formatting format="%d" ), #set the formatting of the dateColumn to dd-MMM-yyyy "Start Date": st.column_config.DateColumn( format="DD-MMM-YYYY" ), #set the formatting of the dateColumn to dd-MMM-yyyy "End Date": st.column_config.DateColumn( format="DD-MMM-YYYY" ), #reduce the column size to medium "Title": st.column_config.TextColumn( width="medium" ), #reduce the column size to medium "Description": st.column_config.TextColumn( width="medium" ) })
可以看到,我们以表格的形式将dataframe所有的数据展示出来了。
Streamlit可以让开发人员能够以快速、简便的方式构建出功能强大且具有交互性的应用程序,从而更好地与数据进行互动和沟通。通过上面的例子可以看到,使用 Streamlit 构建应用程序不需要具备深入的前端开发知识,因为Streamlit 的设计目标之一就是让数据科学家、分析师和其他非前端开发人员能够轻松地创建交互式应用程序,而无需过多关注复杂的前端技术。
Streamlit 的开发方式与一般的 Python 编程相似,只需使用一些简单的 Streamlit 函数和命令来配置和控制应用程序的外观和行为。Streamlit 的强大之处在于它能够将复杂的前端开发流程简化为几行 Python 代码,让任何人都能够轻松创建出具有交互性和可视化效果的数据应用程序。
https://avoid.overfit.cn/post/24320cbf1d9c4ca7b1d3a9086b2ba5b9
作者:Aryan Sinanan
Copyright © 2003-2013 www.wpsshop.cn 版权所有,并保留所有权利。