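In this article, we will see how to find the position of an element in a DataFrame using a user-defined function. Let's first create a simple DataFrame from a list of tuples, with the column names 'Name', 'Age', 'City', and 'Section'.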
import pandas as pd

students = [('Ankit', 23, 'Delhi', 'A'),
            ('Swapnil', 22, 'Delhi', 'B'),
            ('Aman', 22, 'Dehradun', 'A'),
            ('Jiten', 22, 'Delhi', 'A'),
            ('Jeet', 21, 'Mumbai', 'B')]

df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Section'])
df
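Output: the DataFrame is displayed with five rows (index 0 to 4) and the four columns Name, Age, City, and Section.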
# Import pandas library
import pandas as pd

# List of tuples
students = [('Ankit', 23, 'Delhi', 'A'),
            ('Swapnil', 22, 'Delhi', 'B'),
            ('Aman', 22, 'Dehradun', 'A'),
            ('Jiten', 22, 'Delhi', 'A'),
            ('Jeet', 21, 'Mumbai', 'B')]

# Creating Dataframe object
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Section'])

# This function will return a list of positions
# where the element exists in the dataframe.
def getIndexes(dfObj, value):
    # Empty list
    listOfPos = []

    # isin() returns a dataframe of boolean values,
    # True at the positions where the element exists
    result = dfObj.isin([value])

    # any() returns a boolean series, one entry per column
    seriesObj = result.any()

    # Get the list of columns where the element exists
    columnNames = list(seriesObj[seriesObj].index)

    # Iterate over those columns and extract the
    # row index labels where the element exists
    for col in columnNames:
        rows = list(result[col][result[col]].index)
        for row in rows:
            listOfPos.append((row, col))

    return listOfPos

# Calling getIndexes() to get the index positions
# of all occurrences of 22 in the dataframe
listOfPositions = getIndexes(df, 22)

print('Index positions of 22 in Dataframe : ')

# Printing each position
for position in listOfPositions:
    print(position)
Output:
Index positions of 22 in Dataframe : 
(1, 'Age')
(2, 'Age')
(3, 'Age')
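Now let's look at how getIndexes() works. isin() takes a list of values and returns a boolean DataFrame with the same shape as the original: it holds True wherever the given element occurs and False everywhere else. Calling any() on that boolean DataFrame returns a boolean Series with one entry per column, which is True for every column that contains at least one True. The names of the columns containing 22 are therefore the index labels of that Series whose value is True. Next, we iterate over each selected column of the boolean DataFrame and, for each one, collect the row index labels where the value is True. Each combination of row index and column name is a position of 22 in the DataFrame. This is how getIndexes() finds the exact positions of the given element and stores each one as a (row, column) tuple. Finally, it returns the list of tuples.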
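The steps described above can be traced one at a time. The following is a minimal sketch that reuses the students data from this article; the variable names mask, has_value, matching_columns, and positions are chosen here for illustration and are not part of the original function.

```python
import pandas as pd

students = [('Ankit', 23, 'Delhi', 'A'),
            ('Swapnil', 22, 'Delhi', 'B'),
            ('Aman', 22, 'Dehradun', 'A'),
            ('Jiten', 22, 'Delhi', 'A'),
            ('Jeet', 21, 'Mumbai', 'B')]
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Section'])

# Step 1: boolean DataFrame with the same shape as df,
# True wherever the value 22 occurs
mask = df.isin([22])
print(mask)

# Step 2: boolean Series with one entry per column,
# True if that column contains at least one match
has_value = mask.any()
print(has_value)

# Step 3: labels of the columns that contain the value
matching_columns = list(has_value[has_value].index)
print(matching_columns)

# Step 4: for each matching column, collect the row labels
# where the mask is True and pair them with the column name
positions = []
for col in matching_columns:
    rows = mask.index[mask[col].to_numpy()]
    for row in rows:
        positions.append((row, col))
print(positions)
```

Printing each intermediate object makes it easier to see why only the Age column survives step 3 for the value 22.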
# Import pandas library
import pandas as pd

# List of tuples
students = [('Ankit', 23, 'Delhi', 'A'),
            ('Swapnil', 22, 'Delhi', 'B'),
            ('Aman', 22, 'Dehradun', 'A'),
            ('Jiten', 22, 'Delhi', 'A'),
            ('Jeet', 21, 'Mumbai', 'B')]

# Creating Dataframe object
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Section'])

# This function will return a list of positions
# where the element exists in the dataframe
def getIndexes(dfObj, value):
    # Empty list
    listOfPos = []

    # isin() returns a dataframe of boolean values,
    # True at the positions where the element exists
    result = dfObj.isin([value])

    # any() returns a boolean series, one entry per column
    seriesObj = result.any()

    # Get the list of columns where the element exists
    columnNames = list(seriesObj[seriesObj].index)

    # Iterate over the list of columns and extract
    # the row index labels where the element exists
    for col in columnNames:
        rows = list(result[col][result[col]].index)
        for row in rows:
            listOfPos.append((row, col))

    # This list contains (row, column) tuples giving
    # the positions of the element in the dataframe
    return listOfPos

# Create a list which contains all the elements
# whose index positions you need to find
listOfElems = [22, 'Delhi']

# Using a dictionary comprehension to find the index
# positions of multiple elements in the dataframe
dictOfPos = {elem: getIndexes(df, elem) for elem in listOfElems}

print('Position of given elements in Dataframe are : ')

# Looping through the key, value pairs in the dictionary
for key, value in dictOfPos.items():
    print(key, ' : ', value)
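Output:
Position of given elements in Dataframe are : 
22  :  [(1, 'Age'), (2, 'Age'), (3, 'Age')]
Delhi  :  [(0, 'City'), (1, 'City'), (3, 'City')]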
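As a possible alternative to the nested loops in getIndexes(), the boolean mask can be converted to a NumPy array and scanned with numpy.nonzero(). This is a sketch of a different technique, not the method used in this article, and the helper name find_positions is hypothetical. It yields positions in row-major order, whereas getIndexes() groups them by column, so the two orderings can differ when a value appears in more than one column.

```python
import numpy as np
import pandas as pd


def find_positions(df_obj, value):
    """Hypothetical helper: return (row label, column label) tuples for value."""
    # Boolean NumPy array with the same shape as the DataFrame
    mask = df_obj.isin([value]).to_numpy()
    # Integer positions of every True cell, scanned row by row
    row_pos, col_pos = np.nonzero(mask)
    # Translate integer positions back into index and column labels
    return [(df_obj.index[r], df_obj.columns[c])
            for r, c in zip(row_pos, col_pos)]


students = [('Ankit', 23, 'Delhi', 'A'),
            ('Swapnil', 22, 'Delhi', 'B'),
            ('Aman', 22, 'Dehradun', 'A'),
            ('Jiten', 22, 'Delhi', 'A'),
            ('Jeet', 21, 'Mumbai', 'B')]
df = pd.DataFrame(students, columns=['Name', 'Age', 'City', 'Section'])

# Same dictionary-comprehension pattern as in the example above
dict_of_pos = {elem: find_positions(df, elem) for elem in [22, 'Delhi']}
for key, value in dict_of_pos.items():
    print(key, ':', value)
```

A value that does not occur anywhere simply produces an empty list, which matches the behavior of getIndexes().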