小李哥 will continue to introduce one cutting-edge global AI solution built on the Amazon Web Services (AWS) cloud platform every day, helping you quickly get to know AWS AI best practices on the world's most popular cloud platform and apply them to your own daily work.
This post shows how to use Amazon Bedrock, AWS's managed foundation-model service, together with Amazon Personalize, its personalized-recommendation service, to build a user-facing advertising and marketing platform, applying generative AI to ad-marketing scenarios to improve product conversion rates. The design is fully serverless and cloud native, providing a scalable and secure AI solution; an Application Load Balancer and Amazon ECS integrate the application with the AI models. The solution architecture diagram is shown below:
Amazon Bedrock is an AWS service designed to help developers easily build and scale generative AI applications. Bedrock provides access to a range of powerful foundation models from multiple providers, including AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon. With these models, users can create, customize, and deploy generative AI applications without training models from scratch. Bedrock supports text generation, image generation, code generation, and more, simplifying development and accelerating innovation.
Amazon Personalize is an AWS machine learning service designed to help developers easily build and deploy personalized recommendation systems. It draws on Amazon's years of experience with recommender systems and uses automated machine learning to generate highly accurate personalized recommendations, without requiring deep machine learning expertise. Typical use cases include:
Tailored ad and marketing content:
Amazon Personalize can dynamically generate personalized ads and marketing content based on users' behavior and preferences, improving ad performance and conversion rates.
Product recommendations:
On e-commerce platforms, Personalize can recommend related products based on a user's browsing and purchase history, increasing sales and user stickiness.
Content recommendations:
On streaming platforms, Personalize can recommend movies, TV shows, or music that match a user's viewing or listening habits, improving the user experience.
Email personalization:
Personalize can tailor email content so that every user receives messages based on their own preferences, increasing open and click-through rates.
1. Go to the AWS console and confirm that the Titan Text G1 - Lite model is enabled; we will use this model to generate the ad content.
2. Next, open Amazon SageMaker, then create and open a Jupyter notebook instance named "Lab-Notebook".
3. Create a new notebook named "personalized-marketing.ipynb", then paste in and run the code below. First, import the required dependencies.
# Import packages
import boto3
import time
import pandas as pd
import json
import random
4. Next, load the movie-rating source data stored in our CSV files into DataFrames.
item_data = pd.read_csv('imdb/items.csv', sep=',', dtype={'PROMOTION': "string"})
item_data.head(5)

movies = pd.read_csv('imdb/items.csv', sep=',', usecols=[0,1], encoding='latin-1', dtype={'movieId': "str", 'imdbId': "str", 'tmdbId': "str"})
pd.set_option('display.max_rows', 25)
movies
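A `user_favorite_genre` value is fed into the prompt later in the walkthrough. One way to derive it is to join the interaction history with the item metadata and take each user's most frequent genre. A minimal sketch with made-up column names and data (the actual CSV columns may differ):

```python
import pandas as pd

# Hypothetical sample data standing in for interactions.csv / items.csv
interactions = pd.DataFrame({
    'USER_ID': ['1', '1', '1', '2'],
    'ITEM_ID': ['a', 'b', 'c', 'a'],
})
items = pd.DataFrame({
    'ITEM_ID': ['a', 'b', 'c'],
    'GENRES': ['Comedy', 'Comedy', 'Drama'],
})

# Join interactions to genres, then take the most frequent genre per user
merged = interactions.merge(items, on='ITEM_ID')
favorite_genre = merged.groupby('USER_ID')['GENRES'].agg(lambda s: s.mode().iloc[0])
print(favorite_genre.loc['1'])  # Comedy
```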
5. Next, set up the prerequisites for training the Amazon Personalize model: create the Personalize clients, obtain the IAM role, and locate the S3 bucket.
# Configure the SDK to Amazon Personalize
personalize = boto3.client('personalize')
personalize_runtime = boto3.client('personalize-runtime')

account_id = boto3.client('sts').get_caller_identity().get('Account')
print("account id:", account_id)

# Read the notebook metadata to determine the current region
with open('/opt/ml/metadata/resource-metadata.json') as notebook_info:
    data = json.load(notebook_info)
resource_arn = data['ResourceArn']
region = resource_arn.split(':')[3]
print("region:", region)

# Set up a Boto3 client to access IAM functions
iam = boto3.client('iam')

# A role has been set up for this solution. The following obtains the ARN
# for that role and also prints the role name for your information
role_response = iam.get_role(RoleName='personalize_exec_role')
role_arn = role_response['Role']['Arn']

role_name = role_arn.split('/')[1]
role_name

# Set up a Boto3 client to access S3 functions
s3 = boto3.client('s3')

# Get a list of all S3 buckets so that we can find the one that starts with "personalized-marketing"
response = s3.list_buckets()

# Filter buckets that start with 'personalized-marketing'
buckets_list = [bucket['Name'] for bucket in response['Buckets'] if bucket['Name'].startswith('personalized-marketing')]

# Get the one bucket name from the list
for data_bucket in buckets_list:
    data_bucket_name = data_bucket

# Display the name of the bucket found
data_bucket_name
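The loop above simply keeps the last matching bucket name; when exactly one bucket is expected, the selection can also be written as a single expression. A small sketch over a hypothetical name list:

```python
def find_bucket(names, prefix='personalized-marketing'):
    """Return the first name starting with prefix, or None if absent."""
    return next((name for name in names if name.startswith(prefix)), None)

# Hypothetical bucket names for illustration
sample = ['logs-bucket', 'personalized-marketing-abc123', 'data-archive']
print(find_bucket(sample))  # personalized-marketing-abc123
```

With boto3 this would be called as `find_bucket(b['Name'] for b in s3.list_buckets()['Buckets'])`.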
6. Because Personalize training datasets must be staged in S3, upload the source data to the bucket located above.
interactions_filename = 'interactions.csv'
items_filename = "items.csv"

interactions_file = interactions_filename

try:
    s3.get_object(
        Bucket=data_bucket_name,
        Key=interactions_filename,
    )
    print("{} already exists in the bucket {}".format(interactions_filename, data_bucket_name))
except s3.exceptions.NoSuchKey:
    # Upload the file if it does not already exist
    boto3.Session().resource('s3').Bucket(data_bucket_name).Object(interactions_filename).upload_file(interactions_filename)
    print("File {} uploaded to bucket {}".format(interactions_filename, data_bucket_name))

items_file = "imdb/" + items_filename

try:
    s3.get_object(
        Bucket=data_bucket_name,
        Key=items_filename,
    )
    print("{} already exists in the bucket {}".format(items_filename, data_bucket_name))
except s3.exceptions.NoSuchKey:
    # Upload the file if it does not already exist
    boto3.Session().resource('s3').Bucket(data_bucket_name).Object(items_filename).upload_file(items_file)
    print("File {} uploaded to bucket {}".format(items_filename, data_bucket_name))
7. Next, create the dataset group used for Personalize training; a dataset group isolates and organizes related datasets. Training requires three datasets inside it: a users dataset, an items dataset, and a "User-item-interactions" dataset recording users' purchase/consumption history.
marketing_dataset_group_name = "marketing-email-dataset"
try:
    # Try to create the dataset group. This block will run fully if the dataset group does not exist yet
    create_dataset_group_response = personalize.create_dataset_group(
        name = marketing_dataset_group_name,
        domain='VIDEO_ON_DEMAND'
    )

    marketing_dataset_group_arn = create_dataset_group_response['datasetGroupArn']
    print(json.dumps(create_dataset_group_response, indent=2))
    print('\nCreating the Dataset Group with dataset_group_arn = {}'.format(marketing_dataset_group_arn))

except personalize.exceptions.ResourceAlreadyExistsException:
    # If the dataset group already exists, build the unique identifier,
    # marketing_dataset_group_arn, from the known name
    marketing_dataset_group_arn = 'arn:aws:personalize:'+region+':'+account_id+':dataset-group/'+marketing_dataset_group_name
    print('\nThe Dataset Group with dataset_group_arn = {} already exists'.format(marketing_dataset_group_arn))
    print('\nWe will be using the existing Dataset Group dataset_group_arn = {}'.format(marketing_dataset_group_arn))
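`create_dataset_group` is asynchronous, and the group must reach ACTIVE status before datasets can be added to it. Below is a generic wait helper, sketched with a status-fetching callable so it can also be reused later for import jobs and recommenders; the timing values are arbitrary assumptions:

```python
import time

def wait_until_active(get_status, max_wait_seconds=3600, poll_seconds=30):
    """Poll get_status() until it returns 'ACTIVE'; fail fast on 'CREATE FAILED'."""
    deadline = time.time() + max_wait_seconds
    while time.time() < deadline:
        status = get_status()
        if status == 'ACTIVE':
            return status
        if status == 'CREATE FAILED':
            raise RuntimeError('resource creation failed')
        time.sleep(poll_seconds)
    raise TimeoutError('resource did not become ACTIVE in time')

# With Personalize this could be used as (requires the live resources):
# wait_until_active(lambda: personalize.describe_dataset_group(
#     datasetGroupArn=marketing_dataset_group_arn)['datasetGroup']['status'])
```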
8. Taking the user-item interactions dataset as an example, we first define a JSON schema describing the dataset's structure, so that Amazon Personalize can understand what the data means and use it correctly when training the recommendation model.
interactions_schema_name = "marketing_interactions_schema"

interactions_schema = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {
            "name": "USER_ID",
            "type": "string"
        },
        {
            "name": "ITEM_ID",
            "type": "string"
        },
        {
            "name": "EVENT_TYPE",  # "Watch", "Click", etc.
            "type": "string"
        },
        {
            "name": "TIMESTAMP",
            "type": "long"
        }
    ],
    "version": "1.0"
}

try:
    # Try to create the interactions dataset schema. This block will run fully
    # if the interactions dataset schema does not exist yet
    create_schema_response = personalize.create_schema(
        name = interactions_schema_name,
        schema = json.dumps(interactions_schema),
        domain='VIDEO_ON_DEMAND'
    )
    print(json.dumps(create_schema_response, indent=2))
    marketing_interactions_schema_arn = create_schema_response['schemaArn']
    print('\nCreating the Interactions Schema with marketing_interactions_schema_arn = {}'.format(marketing_interactions_schema_arn))

except personalize.exceptions.ResourceAlreadyExistsException:
    # If the interactions dataset schema already exists, get the unique identifier
    # marketing_interactions_schema_arn from the existing resource
    marketing_interactions_schema_arn = 'arn:aws:personalize:'+region+':'+account_id+':schema/'+interactions_schema_name
    print('The schema {} already exists.'.format(marketing_interactions_schema_arn))
    print('\nWe will be using the existing Interactions Schema with marketing_interactions_schema_arn = {}'.format(marketing_interactions_schema_arn))
9. Now create the interactions dataset itself; the other two datasets can be created in the same way.
interactions_dataset_name = "marketing_interactions"
try:
    # Try to create the interactions dataset. This block will run fully
    # if the interactions dataset does not exist yet
    dataset_type = 'INTERACTIONS'
    create_dataset_response = personalize.create_dataset(
        name = interactions_dataset_name,
        datasetType = dataset_type,
        datasetGroupArn = marketing_dataset_group_arn,
        schemaArn = marketing_interactions_schema_arn
    )

    marketing_interactions_dataset_arn = create_dataset_response['datasetArn']
    print(json.dumps(create_dataset_response, indent=2))
    print('\nCreating the Interactions Dataset with marketing_interactions_dataset_arn = {}'.format(marketing_interactions_dataset_arn))

except personalize.exceptions.ResourceAlreadyExistsException:
    # If the interactions dataset already exists, get the unique identifier,
    # marketing_interactions_dataset_arn, from the existing resource
    marketing_interactions_dataset_arn = 'arn:aws:personalize:'+region+':'+account_id+':dataset/'+marketing_dataset_group_name+'/INTERACTIONS'
    print('The Interactions Dataset {} already exists.'.format(marketing_interactions_dataset_arn))
    print('\nWe will be using the existing Interactions Dataset with marketing_interactions_dataset_arn = {}'.format(marketing_interactions_dataset_arn))
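The items and users datasets follow the same schema-then-dataset pattern. As an illustration, an ITEMS schema for this domain might look like the following; the exact field names are assumptions and must match the actual columns in items.csv:

```python
import json

# Hypothetical ITEMS schema sketch. In the VIDEO_ON_DEMAND domain the items
# dataset expects GENRES and CREATION_TIMESTAMP; TITLE is optional metadata
items_schema = {
    "type": "record",
    "name": "Items",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TITLE", "type": "string"},
        {"name": "GENRES", "type": "string", "categorical": True},
        {"name": "CREATION_TIMESTAMP", "type": "long"}
    ],
    "version": "1.0"
}

# As with the interactions schema, it would be passed to create_schema
# as a JSON string: schema=json.dumps(items_schema)
print([field["name"] for field in items_schema["fields"]])
```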
10. Create an import job to load the dataset from S3 into the Personalize dataset for model training.
interactions_import_job_name = "dataset_import_interaction"

# Check if the import job already exists: list the import jobs
interactions_dataset_import_jobs = personalize.list_dataset_import_jobs(
    datasetArn=marketing_interactions_dataset_arn,
    maxResults=100
)['datasetImportJobs']

# Check if there is an existing job with the prefix
job_exists = False
job_arn = None

for job in interactions_dataset_import_jobs:
    if (interactions_import_job_name in job['jobName']):
        job_exists = True
        job_arn = job['datasetImportJobArn']

if (job_exists):
    marketing_interactions_dataset_import_job_arn = job_arn
    print('The Interactions Import Job {} already exists.'.format(marketing_interactions_dataset_import_job_arn))
    print('\nWe will be using the existing Interactions Import Job with marketing_interactions_dataset_import_job_arn = {}'.format(marketing_interactions_dataset_import_job_arn))

else:
    # If there is no import job with the prefix, create it
    create_dataset_import_job_response = personalize.create_dataset_import_job(
        jobName = interactions_import_job_name,
        datasetArn = marketing_interactions_dataset_arn,
        dataSource = {
            "dataLocation": f"s3://{data_bucket_name}/interactions.csv"
        },
        roleArn = role_arn
    )
    marketing_interactions_dataset_import_job_arn = create_dataset_import_job_response['datasetImportJobArn']
    print(json.dumps(create_dataset_import_job_response, indent=2))

    print('\nImporting the Interactions Data with marketing_interactions_dataset_import_job_arn = {}'.format(marketing_interactions_dataset_import_job_arn))
11. Fetch Personalize's predefined recipes for the video domain (VIDEO_ON_DEMAND), then create and train a recommender that suggests videos based on each user's profile and watch history.
available_recipes = personalize.list_recipes(domain='VIDEO_ON_DEMAND')
display_available_recipes = available_recipes['recipes']
# Page through to get the rest of the recipes, if any
if 'nextToken' in available_recipes:
    available_recipes = personalize.list_recipes(domain='VIDEO_ON_DEMAND', nextToken=available_recipes['nextToken'])
    display_available_recipes = display_available_recipes + available_recipes['recipes']
display(display_available_recipes)

recommender_top_picks_for_you_name = "marketing_top_picks_for_you"

try:
    create_recommender_response = personalize.create_recommender(
        name = recommender_top_picks_for_you_name,
        recipeArn = 'arn:aws:personalize:::recipe/aws-vod-top-picks',
        datasetGroupArn = marketing_dataset_group_arn,
        recommenderConfig = {"enableMetadataWithRecommendations": True}
    )
    marketing_recommender_top_picks_arn = create_recommender_response["recommenderArn"]

    print(json.dumps(create_recommender_response))
    print('\nCreating the Top Picks For You recommender with marketing_recommender_top_picks_arn = {}'.format(marketing_recommender_top_picks_arn))

except personalize.exceptions.ResourceAlreadyExistsException:
    marketing_recommender_top_picks_arn = 'arn:aws:personalize:'+region+':'+account_id+':recommender/'+recommender_top_picks_for_you_name
    print('The Top Picks For You recommender {} already exists.'.format(marketing_recommender_top_picks_arn))
    print('\nWe will be using the existing Top Picks For You recommender with marketing_recommender_top_picks_arn = {}'.format(marketing_recommender_top_picks_arn))
12. Define a function that, for a given user id, returns the videos that user is most likely to enjoy.
def getRecommendedMoviesForUserId(
        user_id,
        marketing_recommender_top_picks_arn,
        item_data,
        number_of_movies_to_recommend = 5):
    # For a user_id, get the top n (number_of_movies_to_recommend) movies by using Amazon Personalize
    # and get the additional metadata for each movie (item_id) from the item_data
    # Return a list of movie dictionaries (movie_list) with the relevant data

    # Get recommended movies
    get_recommendations_response = personalize_runtime.get_recommendations(
        recommenderArn = marketing_recommender_top_picks_arn,
        userId = str(user_id),
        numResults = number_of_movies_to_recommend,
        metadataColumns = {
            "ITEMS": ['TITLE', 'GENRES']
        }
    )

    # Create a list of movies with title, genres
    movie_list = []

    for recommended_movie in get_recommendations_response['itemList']:
        movie_list.append(
            {
                'title': recommended_movie['metadata']['title'],
                'genres': recommended_movie['metadata']['genres'].replace('|', ' and ')
            }
        )
    return movie_list
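The response handling in this function can be exercised locally without calling AWS. The helper below applies the same metadata transformation to a faked `get_recommendations` response (the response shape mirrors the fields used above):

```python
def format_recommendations(get_recommendations_response):
    """Turn a get_recommendations response into a list of movie dicts,
    mirroring the transformation in getRecommendedMoviesForUserId."""
    return [
        {
            'title': item['metadata']['title'],
            'genres': item['metadata']['genres'].replace('|', ' and '),
        }
        for item in get_recommendations_response['itemList']
    ]

# Faked response for illustration; a real one comes from personalize_runtime
fake_response = {
    'itemList': [
        {'itemId': '42', 'metadata': {'title': 'Example Movie',
                                      'genres': 'Comedy|Drama'}},
    ]
}
print(format_recommendations(fake_response))
# [{'title': 'Example Movie', 'genres': 'Comedy and Drama'}]
```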
13. Finally, combine Bedrock with the recommendation results to send users a marketing email. The code below defines the user's demographics (for example, a 50-year-old adult named Otto), retrieves the movies recommended for that user id from Amazon Personalize, builds a prompt template from this information, and then uses the Titan Text model to generate the marketing email.
# Set up a Boto3 client to access the functions within Amazon Bedrock
bedrock = boto3.client('bedrock-runtime')

# Model parameters
# The LLM you will be using
model_id = 'amazon.titan-text-lite-v1'

# The desired MIME type of the inference body in the response
accept = 'application/json'

# The MIME type of the input data in the request
content_type = 'application/json'

# The maximum number of tokens to use in the generated response
max_tokens_to_sample = 1000

# Sample user demographics
user_demographic_1 = 'The user is a 50 year old adult called Otto.'
user_demographic_3 = 'The user is a young adult called Jane.'

def generate_personalized_prompt(user_demographic, favorite_genre, movie_list, model_id, max_tokens_to_sample = 50):

    prompt_template = f'''You are a skilled publicist. Write a high-converting marketing email advertising several movies available in a video-on-demand streaming platform next week,
    given the movie and user information below. Your email will leverage the power of storytelling and persuasive language.
    You want the email to impress the user, so make it appealing to them based on the information contained in the <user> tags,
    and take into account the user's favorite genre in the <genre> tags.
    The movies to recommend and their information is contained in the <movie> tag.
    All movies in the <movie> tag must be recommended. Give a summary of the movies and why the human should watch them.
    Put the email between <email> tags.
    Sign it from "Cloud island movies".

    <user>
    {user_demographic}
    </user>
    <genre>
    {favorite_genre}
    </genre>
    <movie>
    {movie_list}
    </movie>
    '''

    prompt_input = json.dumps({
        "inputText": prompt_template,
        "textGenerationConfig": {
            "maxTokenCount": 4096,
            "stopSequences": [],
            "temperature": 0.7,
            "topP": 0.9
        }
    })

    return prompt_input


# Pick a user and assemble their inputs. The user id and favorite genre here
# are example values; movie_list comes from the recommender defined above
user_id = 1
user_demographic = user_demographic_1
user_favorite_genre = 'Comedy'
movie_list = getRecommendedMoviesForUserId(user_id, marketing_recommender_top_picks_arn, item_data)

# Create prompt input
prompt_input_json = generate_personalized_prompt(user_demographic, user_favorite_genre, movie_list, model_id, max_tokens_to_sample)
prompt_input_json


response = bedrock.invoke_model(
    body=prompt_input_json,
    modelId=model_id,
    accept=accept,
    contentType=content_type
)

response_body = json.loads(response.get('body').read())
model_output_string = response_body['results'][0]['outputText']
# model_output_str_clean = re.sub(r'<[^>]*>', '', model_output_string)

print(model_output_string)
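Because the prompt asks the model to wrap the email in <email> tags, the generated text can be post-processed to keep only the email body before sending. A small sketch (real model output can vary, so the helper falls back to the raw text):

```python
import re

def extract_email(model_output):
    """Return the text between <email> tags, or the raw output if absent."""
    match = re.search(r'<email>(.*?)</email>', model_output, re.DOTALL)
    return match.group(1).strip() if match else model_output.strip()

# Hypothetical model output for illustration
sample_output = 'Sure! <email>Dear Otto,\nNext week on Cloud island movies...</email>'
print(extract_email(sample_output))
```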
That covers all the steps for building a user-facing ad-marketing platform with generative AI on AWS. Follow along with me for more cutting-edge generative AI solutions in the future.