- # 336. pandas.Series.str.rpartition method
- pandas.Series.str.rpartition(sep=' ', expand=True)
- Split the string at the last occurrence of sep.
- This method splits the string at the last occurrence of sep, and returns 3 elements containing the part before the separator, the separator itself, and the part after the separator. If the separator is not found, return 3 elements containing two empty strings, followed by the string itself.
- Parameters:
- sep
- str, default whitespace
- String to split on.
- expand
- bool, default True
- If True, return DataFrame/MultiIndex expanding dimensionality. If False, return Series/Index.
- Returns:
- DataFrame/MultiIndex or Series/Index of objects.

336-2-1. sep (optional, default ' '): a string used as the separator. It can be any string and determines where the split occurs.
- # 336. pandas.Series.str.rpartition method
- # 336-1. Data cleaning (extracting file names)
- import pandas as pd
- # Sample data
- file_paths = pd.Series([
- '/home/user/documents/report.pdf',
- '/var/www/html/index.html',
- '/tmp/example.txt'
- ])
- # Use rpartition to extract the file name (column 2 holds the part after the last '/')
- file_names = file_paths.str.rpartition('/')[2]
- print("Extracted file names:")
- print(file_names, end='\n\n')
- # 336-2. Text analysis (extracting the last word)
- import pandas as pd
- # Sample data
- sentences = pd.Series([
- 'The quick brown fox',
- ' jumps over the lazy dog',
- 'Hello world'
- ])
- # Extract the last word (the part after the last space)
- last_words = sentences.str.rpartition(' ')[2]
- print("Extracted last words:")
- print(last_words, end='\n\n')
- # 336-3. Splitting compound data (extracting keys and values)
- import pandas as pd
- # Sample data
- key_value_pairs = pd.Series([
- 'name:Alice',
- 'age:30',
- 'city:New York'
- ])
- # Extract keys (column 0) and values (column 2)
- keys = key_value_pairs.str.rpartition(':')[0]
- values = key_value_pairs.str.rpartition(':')[2]
- print("Extracted keys:")
- print(keys)
- print("Extracted values:")
- print(values)

- # 336. pandas.Series.str.rpartition method
- # 336-1. Data cleaning (extracting file names)
- # Extracted file names:
- # 0 report.pdf
- # 1 index.html
- # 2 example.txt
- # Name: 2, dtype: object
- # 336-2. Text analysis (extracting the last word)
- # Extracted last words:
- # 0 fox
- # 1 dog
- # 2 world
- # Name: 2, dtype: object
- # 336-3. Splitting compound data (extracting keys and values)
- # Extracted keys:
- # 0 name
- # 1 age
- # 2 city
- # Name: 0, dtype: object
- # Extracted values:
- # 0 Alice
- # 1 30
- # 2 New York
- # Name: 2, dtype: object

- # 337. pandas.Series.str.slice method
- pandas.Series.str.slice(start=None, stop=None, step=None)
- Slice substrings from each element in the Series or Index.
- Parameters:
- start
- int, optional
- Start position for slice operation.
- stop
- int, optional
- Stop position for slice operation.
- step
- int, optional
- Step size for slice operation.
- Returns:
- Series or Index of object
- Series or Index from sliced substring from original string object.

- # 337. pandas.Series.str.slice method
- import pandas as pd
- # Sample data
- s = pd.Series(['apple', 'banana', 'cherry', 'date'])
- # Slice from index 1 up to (but not including) index 4
- result_slice = s.str.slice(start=1, stop=4)
- # Specify only a step of 2; start and stop default to the whole string
- result_step = s.str.slice(step=2)
- # Reverse slice with step -1 (from index 4 down to, but not including, index 0)
- result_reverse = s.str.slice(start=4, stop=0, step=-1)
- print("Slice from index 1 to 4:")
- print(result_slice)
- print("\nEvery other character:")
- print(result_step)
- print("\nReverse slice:")
- print(result_reverse)

- # 337. pandas.Series.str.slice method
- # Slice from index 1 to 4:
- # 0 ppl
- # 1 ana
- # 2 her
- # 3 ate
- # dtype: object
- #
- # Every other character:
- # 0 ape
- # 1 bnn
- # 2 cer
- # 3 dt
- # dtype: object
- #
- # Reverse slice:
- # 0 elpp
- # 1 nana
- # 2 rreh
- # 3 eta
- # dtype: object
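
Since str.slice takes the same start/stop/step arguments as Python's built-in slicing, it can help to see it next to the equivalent spellings. This sketch reuses the sample Series from above; the variable names are illustrative only.

```python
import pandas as pd

s = pd.Series(['apple', 'banana', 'cherry', 'date'])

# Three ways to take characters 1..3 of every element
via_method = s.str.slice(start=1, stop=4)   # explicit method call
via_index = s.str[1:4]                      # indexing on the .str accessor
via_python = s.map(lambda x: x[1:4])        # plain Python slicing per element

print(via_method.tolist())  # ['ppl', 'ana', 'her', 'ate']
print(via_index.tolist())
print(via_python.tolist())
```

The method form is convenient when start, stop, or step are held in variables; the bracket form reads more like ordinary Python.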

- # 338. pandas.Series.str.slice_replace method
- pandas.Series.str.slice_replace(start=None, stop=None, repl=None)
- Replace a positional slice of a string with another value.
- Parameters:
- start
- int, optional
- Left index position to use for the slice. If not specified (None), the slice is unbounded on the left, i.e. slice from the start of the string.
- stop
- int, optional
- Right index position to use for the slice. If not specified (None), the slice is unbounded on the right, i.e. slice until the end of the string.
- repl
- str, optional
- String for replacement. If not specified (None), the sliced region is replaced with an empty string.
- Returns:
- Series or Index
- Same type as the original object.

- # 338. pandas.Series.str.slice_replace method
- import pandas as pd
- # Sample data
- data = pd.Series(['abcdefg', 'hijklmn', 'opqrstu'])
- # Replace the characters at positions 2..4 with 'XYZ' using str.slice_replace()
- result = data.str.slice_replace(start=2, stop=5, repl='XYZ')
- print(result)
- # 338. pandas.Series.str.slice_replace method
- # 0 abXYZfg
- # 1 hiXYZmn
- # 2 opXYZtu
- # dtype: object
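
The parameter notes for slice_replace say that an omitted start or stop leaves the slice unbounded on that side, and that an omitted repl replaces the slice with an empty string. A minimal sketch of those three cases, reusing the sample data above (variable names are illustrative):

```python
import pandas as pd

data = pd.Series(['abcdefg', 'hijklmn', 'opqrstu'])

# Only start given: everything from index 2 to the end is replaced
tail_replaced = data.str.slice_replace(start=2, repl='X')
print(tail_replaced.tolist())  # ['abX', 'hiX', 'opX']

# Only stop given: everything before index 2 is replaced
head_replaced = data.str.slice_replace(stop=2, repl='X')
print(head_replaced.tolist())  # ['Xcdefg', 'Xjklmn', 'Xqrstu']

# repl omitted: the slice 2..4 is simply removed
removed = data.str.slice_replace(start=2, stop=5)
print(removed.tolist())  # ['abfg', 'himn', 'optu']
```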
- # 339. pandas.Series.str.split method
- pandas.Series.str.split(pat=None, *, n=-1, expand=False, regex=None)
- Split strings around given separator/delimiter.
- Splits the string in the Series/Index from the beginning, at the specified delimiter string.
- Parameters:
- pat
- str or compiled regex, optional
- String or regular expression to split on. If not specified, split on whitespace.
- n
- int, default -1 (all)
- Limit number of splits in output. None, 0 and -1 will be interpreted as return all splits.
- expand
- bool, default False
- Expand the split strings into separate columns.
- If True, return DataFrame/MultiIndex expanding dimensionality.
- If False, return Series/Index, containing lists of strings.
- regex
- bool, default None
- Determines if the passed-in pattern is a regular expression:
- If True, assumes the passed-in pattern is a regular expression.
- If False, treats the pattern as a literal string.
- If None and pat length is 1, treats pat as a literal string.
- If None and pat length is not 1, treats pat as a regular expression.
- Cannot be set to False if pat is a compiled regex.
- New in version 1.4.0.
- Returns:
- Series, Index, DataFrame or MultiIndex
- Type matches caller unless expand=True (see Notes).
- Raises:
- ValueError
- if regex is False and pat is a compiled regex.

- # 339. pandas.Series.str.split method
- import pandas as pd
- # Sample data
- data = pd.Series(['a,b,c', 'd,e,f', 'g,h,i'])
- # Do not expand the result; split only once (n=1), starting from the left
- result1 = data.str.split(",", n=1, expand=False)
- # Expand the result into multiple columns
- result2 = data.str.split(",", expand=True)
- print("Result with expand=False:")
- print(result1)
- print("\nResult with expand=True:")
- print(result2)
- # 339. pandas.Series.str.split method
- # Result with expand=False:
- # 0 [a, b,c]
- # 1 [d, e,f]
- # 2 [g, h,i]
- # dtype: object
- #
- # Result with expand=True:
- # 0 1 2
- # 0 a b c
- # 1 d e f
- # 2 g h i
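
The regex parameter described above is not exercised by the comma example. The sketch below, which assumes pandas 1.4.0 or later (where regex was introduced), shows one split that relies on a regular expression and one that forces a literal match; the sample strings are invented for illustration.

```python
import pandas as pd

# Hypothetical messy input: mixed delimiters with stray whitespace
s = pd.Series(['a, b;c', 'd;e , f'])

# regex=True: split on ',' or ';' together with any surrounding whitespace
by_regex = s.str.split(r'\s*[,;]\s*', regex=True)
print(by_regex.tolist())  # [['a', 'b', 'c'], ['d', 'e', 'f']]

# regex=False: '.*' is matched literally instead of as a regex wildcard.
# Without regex=False, a multi-character pattern is treated as a regex.
t = pd.Series(['x.*y'])
literal = t.str.split('.*', regex=False)
print(literal.tolist())  # [['x', 'y']]
```

Passing regex explicitly avoids depending on the length-based default, which treats single-character patterns as literals and longer ones as regular expressions.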
- # 340. pandas.Series.str.rsplit method
- pandas.Series.str.rsplit(pat=None, *, n=-1, expand=False)
- Split strings around given separator/delimiter.
- Splits the string in the Series/Index from the end, at the specified delimiter string.
- Parameters:
- pat
- str, optional
- String to split on. If not specified, split on whitespace.
- n
- int, default -1 (all)
- Limit number of splits in output. None, 0 and -1 will be interpreted as return all splits.
- expand
- bool, default False
- Expand the split strings into separate columns.
- If True, return DataFrame/MultiIndex expanding dimensionality.
- If False, return Series/Index, containing lists of strings.
- Returns:
- Series, Index, DataFrame or MultiIndex
- Type matches caller unless expand=True (see Notes).

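- # 340. pandas.Series.str.rsplit method
- import pandas as pd
- # Sample data
- data = pd.Series(['a,b,c', 'd,e,f', 'g,h,i'])
- # Do not expand the result; split only once (n=1), starting from the right
- result1 = data.str.rsplit(",", n=1, expand=False)
- # Expand the result into multiple columns (same as split when n is unlimited)
- result2 = data.str.rsplit(",", expand=True)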
- print("Result with expand=False:")
- print(result1)
- print("\nResult with expand=True:")
- print(result2)
- # 340. pandas.Series.str.rsplit method
- # Result with expand=False:
- # 0 [a,b, c]
- # 1 [d,e, f]
- # 2 [g,h, i]
- # dtype: object
- #
- # Result with expand=True:
- # 0 1 2
- # 0 a b c
- # 1 d e f
- # 2 g h i
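
With unlimited splits, split and rsplit return the same pieces; the difference only shows once n caps the number of splits. A small side-by-side sketch, using made-up path strings, may make that clearer:

```python
import pandas as pd

# Hypothetical paths with a varying number of '/' separators
paths = pd.Series(['a/b/c.txt', 'x/y.csv'])

# split counts separators from the left
left = paths.str.split('/', n=1, expand=True)
print(left)
# row 0: 'a'   | 'b/c.txt'
# row 1: 'x'   | 'y.csv'

# rsplit counts separators from the right
right = paths.str.rsplit('/', n=1, expand=True)
print(right)
# row 0: 'a/b' | 'c.txt'
# row 1: 'x'   | 'y.csv'
```

Rows with only one separator come out identical either way, so the choice matters only for strings that contain the separator more than n times.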