Question
How can assign be used to return a copy of the original DataFrame with multiple new columns added?
Desired Result
>>> import pandas as pd
>>> df = pd.DataFrame({'A': range(1, 5), 'B': range(11, 15)})
>>> df.assign({'C': df.A.apply(lambda x: x ** 2), 'D': df.B * 2})
   A   B   C   D
0  1  11   1  22
1  2  12   4  24
2  3  13   9  26
3  4  14  16  28
ATTEMPTS
The example above results in:
ValueError: Wrong number of items passed 2, placement implies 1
BACKGROUND
The assign function in Pandas returns a copy of the relevant dataframe with the newly assigned column added, e.g.
>>> df = df.assign(C=df.B * 2)
>>> df
   A   B   C
0  1  11  22
1  2  12  24
2  3  13  26
3  4  14  28
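As a quick check of the "returns a copy" behaviour described above, here is a minimal sketch. The name result is only an illustrative variable, not something from the question.

```python
import pandas as pd

df = pd.DataFrame({'A': range(1, 5), 'B': range(11, 15)})

# assign returns a new DataFrame; df itself is left untouched
# unless the result is bound back to the same name.
result = df.assign(C=df.B * 2)

print(list(df.columns))      # original columns only
print(list(result.columns))  # original columns plus C
```

Rebinding with df = df.assign(...), as in the example above, simply replaces the old object with the returned copy.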
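The 0.19.2 documentation for this function implies that more than one column can be added to the dataframe:
Assigning multiple columns within the same assign is possible, but you cannot reference other columns created within the same assign call.
In addition:
Parameters:
kwargs: keyword, value pairs; keywords are the column names.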
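Since the documented parameter is **kwargs, one way to read it is that each new column is passed as its own keyword argument, and a dict would need to be unpacked with ** rather than passed positionally. The sketch below assumes that reading; with_kwargs, with_dict and new_cols are illustrative names only.

```python
import pandas as pd

df = pd.DataFrame({'A': range(1, 5), 'B': range(11, 15)})

# Pass each new column as its own keyword argument...
with_kwargs = df.assign(C=df.A ** 2, D=df.B * 2)

# ...or build a dict first and unpack it with ** so that each key
# becomes a keyword argument (the keys must be strings).
new_cols = {'C': df.A ** 2, 'D': df.B * 2}
with_dict = df.assign(**new_cols)

print(with_dict)
```

Both calls should yield the same frame; the dict form is mainly convenient when the column names are generated programmatically.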
The source code for this function states that it accepts a dictionary:
def assign(self, **kwargs):
    """
    .. versionadded:: 0.16.0
    Parameters
    ----------
    kwargs : keyword, value pairs
        keywords are the column names. If the values are callable, they are computed
        on the DataFrame and assigned to the new columns. If the values are not callable,
        (e.g. a Series, scalar, or array), they are simply assigned.
    Notes
    -----
    Since ``kwargs`` is a dictionary, the order of your
    arguments may not be preserved. To make things predicatable,
    the columns are inserted in alphabetical order, at the end of
    your DataFrame. Assigning multiple columns within the same
    ``assign`` is possible, but you cannot reference other columns
    created within the same ``assign`` call.
    """
    data = self.copy()
    # do all calculations first...
    results = {}
    for k, v in kwargs.items():
        if callable(v):
            results[k] = v(data)
        else:
            results[k] = v
    # ... and then assign
    for k, v in sorted(results.items()):
        data[k] = v
    return data
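The loop above computes every value against the copy before any new column is assigned, which appears to be why, in the version quoted, one new column cannot refer to another created in the same call. A minimal sketch of one way around that is to chain two assign calls, using callables so that each step sees the frame produced by the previous one. The names step1 and result are illustrative, and other pandas versions may behave differently within a single call.

```python
import pandas as pd

df = pd.DataFrame({'A': range(1, 5), 'B': range(11, 15)})

# A callable value receives the DataFrame and returns the new column.
step1 = df.assign(C=lambda d: d.A ** 2)

# Chaining a second assign lets D be derived from the C column
# created by the previous call.
result = step1.assign(D=lambda d: d.C + d.B)

print(result)
```

Chaining keeps each step independent of how a single call orders or evaluates its keyword arguments.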