Code Collector Technical Tutorials 2022-07-20

Python: DataFrame Fundamentals

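Contents

1. Creating a DataFrame

1.1 Creating a DataFrame by reading a file

1.2 Creating a DataFrame from a dictionary

1.3 Creating a DataFrame from a nested dictionary

2. Dict of dicts vs. dict of lists

2.1 Dict of dicts

2.2 Dict of lists

3. Basic DataFrame usage

3.1 DataFrame.loc: label-based indexing

3.2 DataFrame.iloc: position-based indexing

3.3 Selecting several columns of a DataFrame

3.4 Adding a new column to a DataFrame

3.5 DataFrame.T and DataFrame.transpose(): transposing rows and columns

3.6 DataFrame.interpolate: interpolating data

3.7 DataFrame.groupby: grouped statistics

3.8 DataFrame.nunique: counting distinct observations along an axis

3.9 pandas.unique: returning a column's unique values as an array

3.10 DataFrame.rolling: rolling-window calculations

3.11 DataFrame.value_counts(): counting distinct values

3.12 DataFrame.insert(): inserting a column at a given position

3.13 DataFrame.append(): concatenating DataFrames

3.14 pd.merge(): joining DataFrames

3.15 Converting a DataFrame to a Series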

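1. Creating a DataFrame

A DataFrame is a tabular data structure containing an ordered set of columns, each of which can hold a different value type. A DataFrame has both a row index and a column index; it can be viewed as a dictionary of Series that all share one index.
There are many ways to create a DataFrame, but the most important are building one from a dict and reading a csv or txt file. This section covers these two approaches.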

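1.1 Creating a DataFrame by reading a file

The most common way to build a DataFrame from a file is read_csv. The signature below comes from an older pandas release; several of the listed parameters have since been deprecated or removed.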

pd.read_csv(filepath_or_buffer, sep=',', delimiter=None, header='infer', names=None, index_col=None, usecols=None, squeeze=False, prefix=None, mangle_dupe_cols=True, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=False, infer_datetime_format=False, keep_date_col=False, date_parser=None, dayfirst=False, iterator=False, chunksize=None, compression='infer', thousands=None, decimal=b'.', lineterminator=None, quotechar='"', quoting=0, escapechar=None, comment=None, encoding=None, dialect=None, tupleize_cols=False, error_bad_lines=True, warn_bad_lines=True, skipfooter=0, skip_footer=0, doublequote=True, delim_whitespace=False, as_recarray=False, compact_ints=False, use_unsigned=False, low_memory=True, buffer_lines=None, memory_map=False, float_precision=None)

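The commonly used parameters are:

filepath_or_buffer: the path of the file to read (this is the only required parameter; all others are optional)

sep: the separator, ',' by default

delimiter : str, default None
Alias for sep (an alternative name for the same setting)

header: int or list of ints, default 'infer'
Which row to use as the header. By default this behaves like header=0 (the first row is the header). If the file has no header row, set header=None

names:
A list of column names. This is most useful when the file has no header row, i.e. with header=None

index_col:
Which column(s) to use as the row index. Passing several columns produces a hierarchical index (MultiIndex)

prefix:
Prefix added to the auto-generated column numbers when there is no header. For example, prefix="x" yields "x0", "x1", "x2". This parameter was deprecated in pandas 1.4 and removed in pandas 2.0

nrows : int, default None
Number of rows to read, counted from the start of the file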
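
The parameters above can be tried without a file on disk. The following is a minimal sketch, assuming a small in-memory CSV passed through io.StringIO; the text and the column names id, name, and score are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content without a header row (stands in for a real file path)
csv_text = "1,Jack,80\n2,Mary,85\n3,Tom,90\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    sep=",",                          # field separator
    header=None,                      # the data has no header row
    names=["id", "name", "score"],    # so column names are supplied here
    index_col="id",                   # use the id column as the row index
    nrows=2,                          # read only the first two data rows
)
print(df)
```

Because nrows=2, the third record is never read, and id becomes the row index instead of a regular column.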

1.2 Creating a DataFrame from a dictionary

(1) First create a student dictionary stu_dict holding basic information such as each student's name, gender, and score:

stu_dict = {
    'name': ['Jack', 'Mary'],
    'gender': ['M', 'F'],
    'score': [80, 85]
}
print(stu_dict)

Output:

{'name': ['Jack', 'Mary'], 'gender': ['M', 'F'], 'score': [80, 85]}

(2) Create a DataFrame from the dictionary stu_dict:

import pandas as pd

stu_df = pd.DataFrame(stu_dict)
print(stu_df)

Output:

   name gender  score
0  Jack      M     80
1  Mary      F     85

(3) The row index of a DataFrame is index and the column index is columns. The row index values can be specified when the DataFrame is created:

stu_df_2 = pd.DataFrame(stu_dict, index=['student_1', 'student_2'])
print(stu_df_2)

Output:

           name gender  score
student_1  Jack      M     80
student_2  Mary      F     85

(4) Specify the column index values, which also sets the column order:

stu_df_3 = pd.DataFrame(stu_dict, columns=['name', 'score', 'gender'])
print(stu_df_3)

Output:

   name  score gender
0  Jack     80      M
1  Mary     85      F

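1.3 Creating a DataFrame from a nested dictionary

A nested dictionary can also be used to create a DataFrame. The keys of the outer dictionary become the columns and the inner keys become the row index:

(1) Create a dict of dicts

# dict of dicts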
stu_dict = {
    'student_1' : {'name': 'Jack', 'gender': 'M', 'score': 80},
    'student_2' : {'name': 'Mary', 'gender': 'F', 'score': 85}
}
print(stu_dict)

Output:

{'student_1': {'name': 'Jack', 'gender': 'M', 'score': 80}, 'student_2': {'name': 'Mary', 'gender': 'F', 'score': 85}}

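(2) Convert the nested dictionary to a DataFrame

# convert the dict to a DataFrame
stu_df = pd.DataFrame(stu_dict)
print(stu_df)

Output (older pandas versions sort the row labels alphabetically, as shown here; newer versions keep the insertion order of the inner keys):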

       student_1 student_2
gender         M         F
name        Jack      Mary
score         80        85
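
If the outer keys should become rows rather than columns, one option is DataFrame.from_dict with orient='index'. This is a minimal sketch reusing the student data from above; it is an alternative to transposing the result, not something the original example does.

```python
import pandas as pd

stu_dict = {
    'student_1': {'name': 'Jack', 'gender': 'M', 'score': 80},
    'student_2': {'name': 'Mary', 'gender': 'F', 'score': 85},
}

# orient='index' treats the outer keys as row labels instead of column labels
stu_df = pd.DataFrame.from_dict(stu_dict, orient='index')
print(stu_df)
```

Each student is now one row, and name, gender, and score are the columns.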

2. Dict of dicts vs. dict of lists

2.1 Dict of dicts

Use the form dict = {key: value}, where value = {inner_key: inner_value}, to create a "dict of dicts".

# dict of dicts
key_1 = 'student_1'
value_1 = {'name': 'Jack', 'gender': 'M', 'score': 80}

key_2 = 'student_2'
value_2 = {'name': 'Mary', 'gender': 'F', 'score': 85}

stu_dict = {key_1:value_1, key_2: value_2}
print(stu_dict)

Output:

{'student_1': {'name': 'Jack', 'gender': 'M', 'score': 80}, 'student_2': {'name': 'Mary', 'gender': 'F', 'score': 85}}

2.2 Dict of lists

A dictionary built with dict.setdefault().append() is a "dict of lists":

stu_dict = {}
stu_dict_1 = {'name': 'Jack', 'gender': 'M', 'score': 80}
stu_dict_2 = {'name': 'Mary', 'gender': 'F', 'score': 85}

stu_dict.setdefault('student_1', []).append(stu_dict_1)
stu_dict.setdefault('student_2', []).append(stu_dict_2)

print(stu_dict)

Output:

{'student_1': [{'name': 'Jack', 'gender': 'M', 'score': 80}], 'student_2': [{'name': 'Mary', 'gender': 'F', 'score': 85}]}

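When a dict of lists is converted to a DataFrame, each element of a list is treated as a single value, so every cell holds a whole inner dictionary.

# convert the dict to a DataFrame
stu_df = pd.DataFrame(stu_dict)
print(stu_df)

Output: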

                                      student_1  \
0  {'name': 'Jack', 'gender': 'M', 'score': 80}   

                                      student_2  
0  {'name': 'Mary', 'gender': 'F', 'score': 85}
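
To get one column per field from such a dict of lists, the lists can be unwrapped first. This is a minimal sketch, assuming every list holds exactly one record as in the example above; the helper variable flat is introduced only for illustration.

```python
import pandas as pd

stu_dict = {
    'student_1': [{'name': 'Jack', 'gender': 'M', 'score': 80}],
    'student_2': [{'name': 'Mary', 'gender': 'F', 'score': 85}],
}

# Assumption: each list contains exactly one record, so take its first element
flat = {key: records[0] for key, records in stu_dict.items()}

# The unwrapped structure is a dict of dicts; use the outer keys as row labels
stu_df = pd.DataFrame.from_dict(flat, orient='index')
print(stu_df)
```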

3. Basic DataFrame usage

3.1 DataFrame.loc: label-based indexing

import numpy as np
import pandas as pd
test_array = np.arange(16).reshape(4,4)
test1 = pd.DataFrame(test_array,index=['One','Two','Three',"Four"],columns=['a','b','c','d'])
print(test1)

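loc[row, column] selects by label, as shown below. The arguments must be labels that actually exist in the DataFrame, and a label slice includes both endpoints.

print(test1.loc['One'])  # row 'One'
print(test1.loc['One','a':'c'])  # row 'One', columns 'a' through 'c'
print(test1.loc['One':'Three','a':'c'])  # rows 'One' through 'Three', columns 'a' through 'c'
print(test1.loc[['One','Three'],'a':'c'])  # rows 'One' and 'Three', columns 'a' through 'c'

Output: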

a    0
b    1
c    2
d    3
Name: One, dtype: int64

a    0
b    1
c    2
Name: One, dtype: int64

       a  b   c
One    0  1   2
Two    4  5   6
Three  8  9  10

       a  b   c
One    0  1   2
Three  8  9  10

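3.2 DataFrame.iloc: position-based indexing

DataFrame.iloc is integer-position based indexing, used to select by position.

iloc[row, column] works much the same way as loc, except that it treats the DataFrame as a plain two-dimensional table and takes integer positions directly. Unlike a label slice, an integer slice excludes its end position.

Code example:

print(test1.iloc[0])  # row 'One'
print(test1.iloc[0,0:3])  # row 'One', columns 'a' through 'c'
print(test1.iloc[0:3,0:3])  # rows 'One' through 'Three', columns 'a' through 'c'
print(test1.iloc[[0,2],0:3])  # rows 'One' and 'Three', columns 'a' through 'c'

Output: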

a    0
b    1
c    2
d    3
Name: One, dtype: int64

a    0
b    1
c    2
Name: One, dtype: int64

       a  b   c
One    0  1   2
Two    4  5   6
Three  8  9  10

       a  b   c
One    0  1   2
Three  8  9  10
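
The difference between the two slice conventions can be checked directly. The sketch below rebuilds test1 so that it runs on its own, compares a label slice with the matching position slice, and adds a boolean-mask selection with loc, which the examples above do not cover.

```python
import numpy as np
import pandas as pd

test1 = pd.DataFrame(np.arange(16).reshape(4, 4),
                     index=['One', 'Two', 'Three', 'Four'],
                     columns=['a', 'b', 'c', 'd'])

# Label slice: both endpoints ('Three' and 'c') are included
by_label = test1.loc['One':'Three', 'a':'c']

# Position slice: the stop position 3 is excluded, so rows 0-2 and columns 0-2
by_position = test1.iloc[0:3, 0:3]

# Both selections cover the same cells
print(by_label.equals(by_position))

# loc also accepts a boolean mask: rows where column 'a' is greater than 4
big_rows = test1.loc[test1['a'] > 4, ['a', 'd']]
print(big_rows)
```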

For more on DataFrame.loc and DataFrame.iloc, see:
Python basics notes: loc and iloc: https://blog.csdn.net/Onehh2/article/details/89884914

3.3 Selecting several columns of a DataFrame

import numpy as np
import pandas as pd
test_array = np.arange(16).reshape(4,4)
test1 = pd.DataFrame(test_array,index=['One','Two','Three',"Four"],columns=['a','b','c','d'])
print(test1)

#         a   b   c   d
# One     0   1   2   3
# Two     4   5   6   7
# Three   8   9  10  11
# Four   12  13  14  15

Use [['column_name', ...]] (a list of column names inside the indexing brackets) to select several columns of a DataFrame:

test2 = test1[['b', 'd']]
print(test2)

#         b   d
# One     1   3
# Two     5   7
# Three   9  11
# Four   13  15

3.4 Adding a new column to a DataFrame

import numpy as np
import pandas as pd
test_array = np.arange(16).reshape(4,4)
test1 = pd.DataFrame(test_array,index=['One','Two','Three',"Four"],columns=['a','b','c','d'])
print(test1)

#         a   b   c   d
# One     0   1   2   3
# Two     4   5   6   7
# Three   8   9  10  11
# Four   12  13  14  15

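Adding a column directly can trigger a warning when the DataFrame was itself obtained by selecting from another DataFrame (test3 here is an assumed example of such a selection):

new_list = [1, 2, 3, 4]
test3 = test1[['a', 'b']]  # assumed: test3 is a selection taken from test1
test3['e'] = new_list      # may raise SettingWithCopyWarning, depending on the pandas version

Warning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

Solution: store an explicit copy of test1 in test3, then modify test3:

new_list = [1, 2, 3, 4]
test3 = test1.copy()  # independent copy, so test1 is not affected
test3['e'] = new_list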
print(test3)

#         a   b   c   d  e
# One     0   1   2   3  1
# Two     4   5   6   7  2
# Three   8   9  10  11  3
# Four   12  13  14  15  4
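
Another way to add a column without touching the original DataFrame is DataFrame.assign, which returns a new object. This is a minimal sketch of that alternative, not the approach used in the text above.

```python
import numpy as np
import pandas as pd

test1 = pd.DataFrame(np.arange(16).reshape(4, 4),
                     index=['One', 'Two', 'Three', 'Four'],
                     columns=['a', 'b', 'c', 'd'])

# assign() returns a new DataFrame with column 'e' and leaves test1 unchanged
test3 = test1.assign(e=[1, 2, 3, 4])
print(test3)
print('e' in test1.columns)
```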

3.5 DataFrame.T and DataFrame.transpose(): transposing rows and columns

(1) Transposing with DataFrame.T

import pandas as pd

df = pd.DataFrame({'X': [0, 1, 2], 'Y': [3, 4, 5]}, index=['A', 'B', 'C'])
print(df)
#    X  Y
# A  0  3
# B  1  4
# C  2  5

print(df.T)
#    A  B  C
# X  0  1  2
# Y  3  4  5

(2) Transposing with DataFrame.transpose()

print(df.transpose())
#    A  B  C
# X  0  1  2
# Y  3  4  5

(3) Transposing with pd.DataFrame(df.values.T, …)

df_T = pd.DataFrame(df.values.T, index=df.columns, columns=df.index)
print(df_T)
#    A  B  C
# X  0  1  2
# Y  3  4  5

3.6 DataFrame.interpolate: interpolating data

DataFrame.interpolate() fills NaN values in a DataFrame or Series.

Signature:

DataFrame.interpolate(self, method='linear', axis=0, limit=None, inplace=False, limit_direction='forward', limit_area=None, downcast=None, **kwargs)

Example: use interpolate() with the linear method to fill missing values.

import numpy as np
import pandas as pd

df = pd.DataFrame(data=[np.nan, 2, np.nan, 6, np.nan])
print(df)

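Output:

     0
0  NaN
1  2.0
2  NaN
3  6.0
4  NaN

Now use interpolate to fill the gaps. It returns a new DataFrame by default, so the result is assigned back to df:

df = df.interpolate(method='linear', limit_direction='forward')
print(df)

Output:

     0
0  NaN
1  2.0
2  4.0
3  6.0
4  6.0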

For details, see:
Using the pandas.DataFrame.interpolate method: https://blog.csdn.net/u012856866/article/details/122230597?spm=1001.2014.3001.5501

3.7 DataFrame.groupby: grouped statistics

Function signature:

DataFrame.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, **kwargs)

Usage:

import pandas as pd
import seaborn as sns
import numpy as np

df = sns.load_dataset("iris")
print(df.shape)
# (150, 5)

print(df.head(5))
#    sepal_length  sepal_width  petal_length  petal_width species
# 0           5.1          3.5           1.4          0.2  setosa
# 1           4.9          3.0           1.4          0.2  setosa
# 2           4.7          3.2           1.3          0.2  setosa
# 3           4.6          3.1           1.5          0.2  setosa
# 4           5.0          3.6           1.4          0.2  setosa

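Group the data with the groupby() method of pandas.DataFrame.

If a column name is passed, the rows are grouped by each distinct value in that column. The return value is a GroupBy object; print() shows only the object's representation, not the grouped data.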

grouped = df.groupby('species')
print(grouped)
# <pandas.core.groupby.groupby.DataFrameGroupBy object at 0x10c69f6a0>

print(type(grouped))
# <class 'pandas.core.groupby.groupby.DataFrameGroupBy'>

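Applying mean(), min(), max(), or sum() to the GroupBy object computes per-group statistics such as the average, minimum, maximum, and total. For details, see this blog post:

16_Pandas.DataFrame: computing statistics grouped with GroupBy: https://blog.csdn.net/qq_18351157/article/details/106118984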
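
Per-group statistics can be sketched without downloading the iris data. The example below assumes a tiny made-up table with a species column and one measurement column; the values are illustrative only.

```python
import pandas as pd

# Hypothetical mini dataset (not the real iris data)
df = pd.DataFrame({
    'species': ['setosa', 'setosa', 'virginica', 'virginica'],
    'petal_length': [1.4, 1.3, 6.0, 5.1],
})

grouped = df.groupby('species')

# Compute several statistics per group for one column
stats = grouped['petal_length'].agg(['mean', 'min', 'max'])
print(stats)
```

The result has one row per species and one column per statistic.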

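3.8 DataFrame.nunique: counting distinct observations along an axis

DataFrame.nunique(self, axis=0, dropna=True) → pandas.core.series.Series

[Parameters]:

axis: {0 or 'index', 1 or 'columns'}, default 0. The axis to use: 0 or 'index' counts the distinct values in each column, 1 or 'columns' counts them in each row.

dropna: bool, default True. Do not include NaN in the counts.

[Returns]:

Series

Example: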

import pandas as pd
df = pd.DataFrame({'A': [1, 2, 2, 3], 'B': [1, 1, 1, 1]})
df.nunique()
# A    3
# B    1
# dtype: int64
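
The axis and dropna parameters can be seen in action with a small variation of the example. This sketch replaces the last value of column A with NaN, which is an assumption made only to show the effect of dropna.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, np.nan], 'B': [1, 1, 1, 1]})

per_column = df.nunique()             # NaN is ignored by default
with_nan = df.nunique(dropna=False)   # NaN is counted as its own value
per_row = df.nunique(axis=1)          # distinct values within each row

print(per_column)
print(with_nan)
print(per_row)
```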

3.9 pandas.unique: returning a column's unique values as an array

pandas.unique returns all unique values of a column (all unique values of a feature) as an array (numpy.ndarray).

import pandas as pd
df = pd.DataFrame({'A': [1, 2, 2, 3], 'B': [1, 1, 1, 1]})

pd.unique(df['A'])
# array([1, 2, 3])

pd.unique(df['B'])
# array([1])

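3.10 DataFrame.rolling: rolling-window calculations

The rolling function in pandas can be called on both Series and DataFrame objects and is mainly used for moving-window calculations.

For example, suppose we have 10 days of sales figures and want the sum over every three days, so that the total for day five is the sales of day three + day four + day five. This is where rolling comes in.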

import pandas as pd
 
amount = pd.Series([100, 90, 110, 150, 110, 130, 80, 90, 100, 150])
print(amount.rolling(3).sum())
"""
0      NaN      # only 1 value in the window (fewer than 3)
1      NaN      # only 2 values in the window (fewer than 3)
2    300.0      # 100 + 90 + 110
3    350.0      # 90 + 110 + 150
4    370.0      # 110 + 150 + 110
5    390.0      # 150 + 110 + 130
6    320.0      # 110 + 130 + 80
7    300.0      # 130 + 80 + 90
8    270.0      # 80 + 90 + 100
9    340.0      # 90 + 100 + 150
dtype: float64
"""

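3.11 DataFrame.value_counts(): counting distinct values

value_counts() shows which distinct values a column contains and counts how many times each one occurs. The example below calls it on column d, which is a Series: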

import numpy as np
import pandas as pd
test_array = np.arange(16).reshape(4,4)
test1 = pd.DataFrame(test_array,index=['One','Two','Three','Four'],columns=['a','b','c','d'])
print(test1)

#         a   b   c   d
# One     0   1   2   3
# Two     4   5   6   7
# Three   8   9  10  11
# Four   12  13  14  15

test1['d'].value_counts()  # list the distinct values in column d and count how often each occurs
# 7     1
# 3     1
# 11    1
# 15    1
# Name: d, dtype: int64

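3.12 DataFrame.insert(): inserting a column at a given position

DataFrame.insert(loc, column, value, allow_duplicates=False): inserts a column into the DataFrame at the given position.

Parameters:

loc: int, the position of the new column; use loc=0 to insert it as the first column

column: the name of the inserted column, e.g. column='new_column'

value: the values to insert; a scalar, array, Series, etc. are all accepted

allow_duplicates: whether duplicate column names are allowed; True allows the new column name to duplicate an existing one.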

import numpy as np
import pandas as pd
test_array = np.arange(16).reshape(4,4)
data = pd.DataFrame(test_array,index=['One','Two','Three','Four'],columns=['a','b','c','d'])
print(data)

#         a   b   c   d
# One     0   1   2   3
# Two     4   5   6   7
# Three   8   9  10  11
# Four   12  13  14  15

# Insert a column after the last column and name it 'new'
data.insert(loc=data.shape[1],column='new',value=[1, 2, 3, 4])
print(data)
#         a   b   c   d  new
# One     0   1   2   3    1
# Two     4   5   6   7    2
# Three   8   9  10  11    3
# Four   12  13  14  15    4
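
The loc and allow_duplicates parameters can be illustrated a little further. This sketch inserts a column at the front and then shows that reusing an existing column name is rejected by default; the column name first is made up for the example.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape(4, 4),
                    index=['One', 'Two', 'Three', 'Four'],
                    columns=['a', 'b', 'c', 'd'])

# Insert as the first column; insert() modifies data in place and returns None
data.insert(loc=0, column='first', value=[10, 20, 30, 40])
print(list(data.columns))

# Reusing an existing name raises ValueError unless allow_duplicates=True
try:
    data.insert(loc=1, column='a', value=0)
except ValueError as exc:
    print('rejected:', exc)
```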

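3.13 DataFrame.append(): concatenating DataFrames

The pandas append() function adds the rows of another DataFrame to the end of the given DataFrame and returns a new DataFrame object. Columns that do not exist in the original DataFrame are added as new columns, and the missing cells are filled with NaN. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the examples in this section require an older pandas version.

[Syntax]:

DataFrame.append(other, ignore_index=False, verify_integrity=False, sort=None)

[Parameters]:

other: a DataFrame, a Series/dict-like object, or a list of these. The data to append.

ignore_index: if True, the index labels are not used.

verify_integrity: if True, raise ValueError when the resulting index contains duplicates.

sort: sort the columns if the columns of self and other are not aligned. The default sorting is deprecated and will change to not sorting in a future pandas version. Pass sort=True explicitly to silence the warning and sort; pass sort=False explicitly to silence the warning and not sort.

[Sample code]: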

import numpy as np
import pandas as pd

array_1 = np.arange(16).reshape(4,4)
df_1 = pd.DataFrame(array_1, columns=['a','b','c','d'])
print(df_1)
#     a   b   c   d
# 0   0   1   2   3
# 1   4   5   6   7
# 2   8   9  10  11
# 3  12  13  14  15

array_2 = np.arange(0, 32, 2).reshape(4,4)
df_2 = pd.DataFrame(array_2, columns=['a','b','c','d'])
print(df_2)
#     a   b   c   d
# 0   0   2   4   6
# 1   8  10  12  14
# 2  16  18  20  22
# 3  24  26  28  30

Use DataFrame.append to concatenate df_1 and df_2:

df_3 = pd.DataFrame(columns=['a','b', 'c', 'd']) # create an empty DataFrame
df_3 = df_3.append(df_1)
df_3 = df_3.append(df_2)
print(df_3)
#     a   b   c   d
# 0   0   1   2   3
# 1   4   5   6   7
# 2   8   9  10  11
# 3  12  13  14  15
# 0   0   2   4   6
# 1   8  10  12  14
# 2  16  18  20  22
# 3  24  26  28  30

The row index of the result is 0, 1, 2, 3, 0, 1, 2, 3: each DataFrame kept its original index values, which is usually not what we want. Setting ignore_index=True renumbers the index.

df_4 = pd.DataFrame(columns=['a','b', 'c', 'd']) # create an empty DataFrame
df_4 = df_4.append(df_1, ignore_index=True)
df_4 = df_4.append(df_2, ignore_index=True)
print(df_4)
#     a   b   c   d
# 0   0   1   2   3
# 1   4   5   6   7
# 2   8   9  10  11
# 3  12  13  14  15
# 4   0   2   4   6
# 5   8  10  12  14
# 6  16  18  20  22
# 7  24  26  28  30

The row index is now sequential.
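
DataFrame.append is no longer available in pandas 2.0 and later, where the same row-wise concatenation is usually written with pd.concat. The following is a minimal sketch of that equivalent, reusing df_1 and df_2 from above.

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame(np.arange(16).reshape(4, 4), columns=['a', 'b', 'c', 'd'])
df_2 = pd.DataFrame(np.arange(0, 32, 2).reshape(4, 4), columns=['a', 'b', 'c', 'd'])

# Stack the rows of both frames and renumber the index from 0 to 7
df_4 = pd.concat([df_1, df_2], ignore_index=True)
print(df_4)
```

Passing all frames in one list also avoids starting from an empty DataFrame and appending repeatedly.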

3.14 pd.merge(): joining DataFrames

[Syntax]:

pd.merge(left, right, how='inner', on=None, left_on=None, right_on=None,
         left_index=False, right_index=False, sort=True,
         suffixes=('_x', '_y'), copy=True, indicator=False,
         validate=None)

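[Parameters]:

left: the left DataFrame of the join

right: the right DataFrame of the join

on: column or index level names to join on. They must be found in both the left and the right DataFrame. If not passed and left_index and right_index are False, the intersection of the columns in the two DataFrames is inferred as the join keys.

left_on: columns or index levels of the left DataFrame to use as keys. Can be column names, index level names, or arrays with length equal to the length of the DataFrame.

right_on: columns or index levels of the right DataFrame to use as keys. Can be column names, index level names, or arrays with length equal to the length of the DataFrame.

left_index: if True, use the index (row labels) of the left DataFrame as its join key(s). For a DataFrame with a MultiIndex (hierarchical), the number of levels must match the number of join keys from the right DataFrame.

right_index: same as left_index, but for the right DataFrame.

how: one of 'left', 'right', 'outer', 'inner'; default 'inner'. inner keeps the intersection of the keys and outer keeps the union. For example, with left keys ['A', 'B', 'C'] and right keys ['A', 'C', 'D'], an inner join matches the A in left with every A in right; B has no match in right, so it is dropped. An outer join matches A in the same way, and keys that appear on only one side are kept, with the missing part filled with missing values.

sort: sort the result DataFrame by the join keys in lexicographical order. Setting it to False can improve performance substantially in many cases. The signature above shows sort=True, but the pandas API reference lists False as the default.

suffixes: a tuple of string suffixes applied to overlapping column names. Defaults to ('_x', '_y').

copy: always copy data from the passed DataFrame objects (default True), even when reindexing is not necessary.

indicator: adds a column named _merge to the output DataFrame with information on the source of each row. _merge is Categorical-typed and takes the value left_only for observations whose merge key appears only in the left DataFrame, right_only for observations whose merge key appears only in the right DataFrame, and both if the merge key is found in both.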

[Sample code]:

import pandas as pd

left = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                       'A': ['A0', 'A1', 'A2', 'A3'],
                       'B': ['B0', 'B1', 'B2', 'B3']})
print(left)
#   key   A   B
# 0  K0  A0  B0
# 1  K1  A1  B1
# 2  K2  A2  B2
# 3  K3  A3  B3

right = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                        'C': ['C0', 'C1', 'C2', 'C3'],
                        'D': ['D0', 'D1', 'D2', 'D3']})
print(right)
#   key   C   D
# 0  K0  C0  D0
# 1  K1  C1  D1
# 2  K2  C2  D2
# 3  K3  C3  D3

result = pd.merge(left, right, on='key')  # the column passed to on is used as the join key
print(result)
#   key   A   B   C   D
# 0  K0  A0  B0  C0  D0
# 1  K1  A1  B1  C1  D1
# 2  K2  A2  B2  C2  D2
# 3  K3  A3  B3  C3  D3
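
In the sample above both frames share every key, so inner and outer joins give the same rows. The sketch below uses the keys from the how description (A, B, C on the left and A, C, D on the right) to show the difference, together with indicator=True; the column names lval and rval are made up.

```python
import pandas as pd

left = pd.DataFrame({'key': ['A', 'B', 'C'], 'lval': [1, 2, 3]})
right = pd.DataFrame({'key': ['A', 'C', 'D'], 'rval': [4, 5, 6]})

# Inner join: only keys present on both sides survive
inner = pd.merge(left, right, on='key', how='inner')

# Outer join: all keys survive; _merge records where each row came from
outer = pd.merge(left, right, on='key', how='outer', indicator=True)

print(inner)
print(outer)
```

In the outer result, B has a missing rval and D has a missing lval, and the _merge column marks them as left_only and right_only respectively.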

3.15 Converting a DataFrame to a Series

Selecting a single column of a DataFrame returns a Series. The Series keeps the DataFrame's index, which here is the default 0, 1, 2, ...:

import pandas as pd

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                       'A': ['A0', 'A1', 'A2', 'A3'],
                       'B': ['B0', 'B1', 'B2', 'B3']})
print(df)
#   key   A   B
# 0  K0  A0  B0
# 1  K1  A1  B1
# 2  K2  A2  B2
# 3  K3  A3  B3

df_1 = df['A']
print(type(df_1))  # <class 'pandas.core.series.Series'>
print(df_1) 
# 0    A0
# 1    A1
# 2    A2
# 3    A3
# Name: A, dtype: object

To use one of the columns as the index of the Series, build it with pd.Series():

df_2 = pd.Series(df['A'].tolist(), index=df['key'].tolist())
print(type(df_2))  # <class 'pandas.core.series.Series'>
print(df_2)
# K0    A0
# K1    A1
# K2    A2
# K3    A3
# dtype: object
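
A shorter route to the same kind of Series is to set the key column as the index first and then select the value column. This is a minimal sketch of that alternative; unlike the pd.Series() version above, the result keeps the name A and an index named key.

```python
import pandas as pd

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                   'A': ['A0', 'A1', 'A2', 'A3']})

# Use 'key' as the row index, then pick column 'A' as a Series
s = df.set_index('key')['A']
print(type(s))
print(s)
```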

Source: 酒酿小圆子～
