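Many readers have been asking how to filter a numpy array by the frequency of some of its elements, so this article walks through that question in detail. It also covers several related topics: "import numpy as np" ImportError: No module named numpy; 3.7 Python Data Processing: Numpy Series (7), Numpy's Statistical Functions; whether there is another fix for the Anaconda Numpy error "Importing the Numpy C Extension Failed"; and Difference between import numpy and import numpy as np. Let's get started.
Contents:
- Filtering a numpy array by the frequency of some of its elements
- "import numpy as np" ImportError: No module named numpy
- 3.7 Python Data Processing: Numpy Series (7), Numpy's Statistical Functions
- Is there another fix for the Anaconda Numpy error "Importing the Numpy C Extension Failed"?
- Difference between import numpy and import numpy as np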
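Filtering a numpy array by the frequency of some of its elements
How can I filter a numpy array by the frequency of some of its elements?
I have a numpy array and a dictionary like the following:
arr1 = np.array([['a1','x'],['a2','x'],['a3','y'],['a4','y'],['a5','z']])
d = {'x':2,'z':1,'y':1,'w':2}
For each key-value pair (k, v) in d, k should appear exactly v times in the second column of arr1. Clearly that is not the case here.
So, starting from arr1, I want to create another array in which every element of the second column appears exactly as many times as d says it should. In other words, the result I want is:
np.array([['a1','x'],['a2','x'],['a5','z']])
I can get the result I want with a list comprehension:
ans = [[x1,x2] for x1,x2 in arr1 if np.count_nonzero(arr1==x2)==d[x2]]
But I would like to know whether this can be done using numpy alone.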
Solution
This does what you want:
import numpy as np
arr1 = np.array([['a1','x'],['a2','x'],['a3','y'],['a4','y'],['a5','z']])
d = {'x': 2,'z': 1,'y': 1,'w': 2}
# get the actual counts of values in arr1
counts = dict(zip(*np.unique(arr1[:,1],return_counts=True)))
# determine what values to keep, as their count matches the desired count
keep = [x for x in d if x in counts and d[x] == counts[x]]
# filter down the array
result = arr1[list(map(lambda x: x[1] in keep,arr1))]
There is very likely a more optimized way to do this in numpy, but I don't know how large your data is or how often you need to run this, so I can't say whether looking for it is worth the effort.
Edit: note that you need to scale up before deciding which solution is good. Your original solution works well on the toy example, where it outperforms both answers. On a larger and probably more realistic workload, however, the numpy solution provided by @NewbieAF easily beats the others:
from random import randint
from timeit import timeit
import numpy as np

def original(arr1, d):
    return [[x1, x2] for x1, x2 in arr1 if np.count_nonzero(arr1 == x2) == d[x2]]

def f1(arr1, d):
    # get the actual counts of values in arr1
    counts = dict(zip(*np.unique(arr1[:, 1], return_counts=True)))
    # determine what values to keep, as their count matches the desired count
    keep = [x for x in d if x in counts and d[x] == counts[x]]
    # filter down the array
    return arr1[list(map(lambda x: x[1] in keep, arr1))]

def f2(arr1, d):
    # create arrays from d
    keys, vals = np.array(list(d.keys())), np.array(list(d.values()))
    # count the unique elements in arr1[:,1]
    unqs, cts = np.unique(arr1[:, 1], return_counts=True)
    # only keep track of elements that appear in arr1
    mask = np.isin(keys, unqs)
    keys, vals = keys[mask], vals[mask]
    # sort the unique values and corresponding counts according to keys
    idx1 = np.argsort(np.argsort(keys))
    idx2 = np.argsort(unqs)
    unqs, cts = unqs[idx2][idx1], cts[idx2][idx1]
    # filter values by whether the counts match
    correct = unqs[vals == cts]
    return arr1[np.isin(arr1[:, 1], correct)]

def main():
    # toy example from the question
    arr1 = np.array([['a1', 'x'], ['a2', 'x'], ['a3', 'y'], ['a4', 'y'], ['a5', 'z']])
    d = {'x': 2, 'z': 1, 'y': 1, 'w': 2}
    print(timeit(lambda: original(arr1, d), number=10000))
    print(timeit(lambda: f1(arr1, d), number=10000))
    print(timeit(lambda: f2(arr1, d), number=10000))
    # larger workload: 10000 distinct values, each appearing 1 to 3 times
    counts = [randint(1, 3) for _ in range(10000)]
    arr1 = np.array([['x', f'{n}'] for n in range(10000) for _ in range(counts[n])])
    d = {f'{n}': randint(1, 3) for n in range(10000)}
    print(timeit(lambda: original(arr1, d), number=10))
    print(timeit(lambda: f1(arr1, d), number=10))
    print(timeit(lambda: f2(arr1, d), number=10))

main()
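Results (in seconds; the first three lines are original, f1 and f2 on the toy example, the last three are the same functions on the large input):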
0.14045359999999998
0.2402685
0.5027185999999999
46.7569239
5.893172499999999
0.08729539999999503
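The numpy solution is slow on the toy example but orders of magnitude faster on the large input. Your solution looks good on the toy example, but once scaled up it loses to the non-numpy solution, which avoids the extra calls.
Consider the size of the problem. If the problem is small, pick your own solution for readability. If it is medium-sized, you might pick mine for better performance. If it is large (in size or in how often it runs), pick the all-numpy solution and trade readability for speed.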
After playing around with np.argsort(), I found a pure numpy solution. The idea is to reorder the unique values from the second column of arr1, together with their counts, so that they line up with the keys of d; the counts can then be compared directly with d.values().
import numpy as np
arr1 = np.array([['a1','x'],['a2','x'],['a3','y'],['a4','y'],['a5','z']])
d = {'x':2,'z':1,'y':1,'w':2}
# create arrays from d
keys,vals = np.array(list(d.keys())),np.array(list(d.values()))
# count the unique elements in arr1[:,1]
unqs,cts = np.unique(arr1[:,1],return_counts=True)
# only keep track of elements that appear in arr1
mask = np.isin(keys,unqs)
keys,vals = keys[mask],vals[mask]
# sort the unique values and corresponding counts according to keys
idx1 = np.argsort(np.argsort(keys))
idx2 = np.argsort(unqs)
unqs,cts = unqs[idx2][idx1],cts[idx2][idx1]
# filter values by whether the counts match
correct = unqs[vals==cts]
# keep subarray where the counts match
ans = arr1[np.isin(arr1[:,1],correct)]
print(ans)
# [['a1' 'x']
#  ['a2' 'x']
#  ['a5' 'z']]
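The same filter can probably be written more compactly. The following is only a sketch and not part of either answer: the helper name `filter_by_frequency` is made up for illustration, and it assumes the second column holds strings. It uses `np.unique` with `return_inverse=True` to map every row back to its unique value, so the dictionary keys never need to be sorted. The dictionary lookup still loops in Python, but only once per unique value rather than once per row.

```python
import numpy as np


def filter_by_frequency(arr, wanted):
    """Keep rows whose second-column value occurs exactly wanted[value] times.

    Hypothetical helper for illustration; `wanted` maps value -> required count.
    """
    col = arr[:, 1]
    # uniq: sorted unique values; inverse: index into uniq for every row;
    # counts: how often each unique value occurs in the column.
    uniq, inverse, counts = np.unique(col, return_inverse=True, return_counts=True)
    # Required count per unique value; -1 can never match a real count,
    # so values that are missing from `wanted` are dropped.
    target = np.array([wanted.get(u, -1) for u in uniq.tolist()])
    keep_unique = counts == target
    # Broadcast the per-value decision back to the rows.
    return arr[keep_unique[inverse.ravel()]]


arr1 = np.array([['a1', 'x'], ['a2', 'x'], ['a3', 'y'], ['a4', 'y'], ['a5', 'z']])
d = {'x': 2, 'z': 1, 'y': 1, 'w': 2}
result = filter_by_frequency(arr1, d)
print(result)
```

On the example from the question this keeps the rows for a1, a2 and a5, the same rows as the answers above. How it compares in speed with f2 has not been measured here.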
"import numpy as np" ImportError: No module named numpy
Problem: numpy is not installed.
Solution:
Download and install the file
numpy-1.8.2-win32-superpack-python2.7
After installation, running import numpy produces:
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    import numpy
  File "C:\Python27\lib\site-packages\numpy\__init__.py", line 153, in <module>
    from . import add_newdocs
  File "C:\Python27\lib\site-packages\numpy\add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "C:\Python27\lib\site-packages\numpy\lib\__init__.py", line 8, in <module>
    from .type_check import *
  File "C:\Python27\lib\site-packages\numpy\lib\type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "C:\Python27\lib\site-packages\numpy\core\__init__.py", line 6, in <module>
    from . import multiarray
ImportError: DLL load failed: %1 is not a valid Win32 application.
The cause: the installed Python is 64-bit, but the installed numpy is the 32-bit build.
Reinstall numpy using the 64-bit build: numpy-1.8.0-win64-py2.7
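Before picking an installer it helps to confirm whether the interpreter itself is 32-bit or 64-bit. The following is a minimal sketch that uses only the standard library; the function name `interpreter_bits` is made up for illustration.

```python
import platform
import struct
import sys


def interpreter_bits():
    """Return the pointer size of the running interpreter in bits (32 or 64)."""
    # "P" is the struct format code for a C pointer (void *).
    return struct.calcsize("P") * 8


bits = interpreter_bits()
print("Interpreter:", sys.executable)
print("Pointer size:", bits, "bit")
# platform.architecture() reports the same information as a string.
print("platform.architecture():", platform.architecture()[0])
```

Run it with the same interpreter that fails to import numpy. A 64-bit result means the numpy build must be 64-bit as well.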
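3.7 Python Data Processing: Numpy Series (7), Numpy's Statistical Functions
Preface
This part looks at Numpy's statistical functions.
(1) Function overview
Called as: np.*
Function | Description |
---|---|
.sum(a) | Sum of the elements of array a |
.mean(a) | Arithmetic mean |
.average(a) | Weighted average (equals the mean when no weights are given) |
.std(a) | Standard deviation |
.var(a) | Variance |
.ptp(a) | Range (maximum minus minimum) |
.median(a) | Median |
.min(a) | Minimum value |
.max(a) | Maximum value |
.argmin(a) | Index of the minimum, given as an index into the flattened array |
.argmax(a) | Index of the maximum, given as an index into the flattened array |
.unravel_index(index, shape) | Converts a flat index into a multi-dimensional index for the given shape |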
(2) Statistical functions, part 1
(1) Description
(2) Output
.sum(a)
.mean(a)
.average(a)
.std(a)
.var(a)
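A short sketch of the five functions listed above, on a small array chosen here for illustration (the data and the weights are not from the original tutorial):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

total = np.sum(a)                # sum over all elements -> 15
col_sums = np.sum(a, axis=0)     # one sum per column -> [3, 5, 7]
mean = np.mean(a)                # arithmetic mean -> 2.5

# Weighted average along axis 0: row 0 has weight 1, row 1 has weight 3.
weighted = np.average(a, axis=0, weights=[1, 3])   # -> [2.25, 3.25, 4.25]

var = np.var(a)                  # population variance (divides by N)
std = np.std(a)                  # square root of the variance

print(total, col_sums, mean, weighted, var, std)
```

Without `weights`, `np.average` returns the same value as `np.mean`. Both `np.var` and `np.std` divide by N by default; pass `ddof=1` for the sample versions.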
(3) Statistical functions, part 2
(1) Description
(2) Output
.max(a) .min(a)
.ptp(a)
.median(a)
.argmin(a)
.argmax(a)
.unravel_index(index,shape)
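The functions in this group can be tried on a small two-dimensional array. The array below is made up for illustration; the point is that `argmin` and `argmax` return positions in the flattened array, which `unravel_index` turns back into (row, column) form.

```python
import numpy as np

a = np.array([[7, 2, 9],
              [4, 11, 1]])

largest = np.max(a)      # 11
smallest = np.min(a)     # 1
value_range = np.ptp(a)  # max - min -> 10
middle = np.median(a)    # sorted values 1, 2, 4, 7, 9, 11 -> (4 + 7) / 2 = 5.5

flat_min = np.argmin(a)  # position of 1 in the flattened array -> 5
flat_max = np.argmax(a)  # position of 11 in the flattened array -> 4

# Convert the flat positions back to (row, column) indices.
pos_min = tuple(int(i) for i in np.unravel_index(flat_min, a.shape))  # (1, 2)
pos_max = tuple(int(i) for i in np.unravel_index(flat_max, a.shape))  # (1, 1)

print(largest, smallest, value_range, middle, pos_min, pos_max)
```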
Author: Mark
Date: 2019/02/11 (Monday)
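Is there another fix for the Anaconda Numpy error "Importing the Numpy C Extension Failed"?
How can the Anaconda Numpy error "Importing the Numpy C Extension Failed" be resolved, and is there another solution?
I hope someone here can help. I have been going around in circles for a while. I am just trying to set up a Python script that loads some JSON data from a REST API into a cloud database. I set up a virtual environment in Anaconda (because the GCP libraries recommend it) and installed the dependencies, and now I am simply trying to import the libraries and send a request to the endpoint.
I used Conda (and conda-forge) to set up the environment and install the dependencies, so hopefully everything is clean. I am using VS Code with the Python extension as my editor.
Whenever I try to run the script I get the message below. I have tried every solution I could find on Google and StackOverflow, but none of them worked. I normally use IDLE or Jupyter for scripting without any problems, but I don't have much experience with Anaconda, VS Code or environment variables (which seem to be related).
Thanks in advance for your help!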
Traceback (most recent call last):
  File "C:\Conda\envs\gcp\lib\site-packages\numpy\core\__init__.py", line 22, in <module>
    from . import multiarray
  File "C:\Conda\envs\gcp\lib\site-packages\numpy\core\multiarray.py", line 12, in <module>
    from . import overrides
  File "C:\Conda\envs\gcp\lib\site-packages\numpy\core\overrides.py", line 7, in <module>
    from numpy.core._multiarray_umath import (
ImportError: DLL load failed while importing _multiarray_umath: The specified module could not be found.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "c:\API\citi-bike.py", line 4, in <module>
    import numpy as np
  File "C:\Conda\envs\gcp\lib\site-packages\numpy\__init__.py", line 150, in <module>
    from . import core
  File "C:\Conda\envs\gcp\lib\site-packages\numpy\core\__init__.py", line 48, in <module>
    raise ImportError(msg)
ImportError:
IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!
Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.
We have compiled some common reasons and troubleshooting tips at:
    https://numpy.org/devdocs/user/troubleshooting-importerror.html
Please note and check the following:
  * The Python version is: python3.9 from "C:\Conda\envs\gcp\python.exe"
  * The NumPy version is: "1.21.1"
and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.
Original error was: DLL load failed while importing _multiarray_umath: The specified module could not be found.
Solution
No working solution to this problem has been found yet.
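Since no fix is recorded here, the most that can be offered is a way to gather the facts the error message asks you to check: which interpreter is running, its version, and whether the process looks like it was started from an activated conda environment. The sketch below is a diagnostic aid only. The function name `describe_environment` is made up, and treating a missing `CONDA_DEFAULT_ENV` variable or missing PATH entries as suspicious is an assumption, not a confirmed cause of this particular failure.

```python
import os
import sys


def describe_environment():
    """Collect interpreter and environment facts for debugging import errors."""
    prefix = os.path.normcase(os.path.abspath(sys.prefix))
    path_entries = [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]
    return {
        "executable": sys.executable,
        "python_version": ".".join(str(n) for n in sys.version_info[:3]),
        "prefix": sys.prefix,
        # Set by `conda activate`; None suggests the environment was not
        # activated in the process that launched this script.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # PATH entries that live under this interpreter's prefix.
        "path_entries_under_prefix": [
            p for p in path_entries
            if os.path.normcase(os.path.abspath(p)).startswith(prefix)
        ],
    }


report = describe_environment()
for key, value in report.items():
    print(f"{key}: {value}")
```

Running this once from the editor and once from a terminal where the environment has been activated, then comparing the two reports, shows whether the two launches really use the same interpreter and PATH.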
Difference between import numpy and import numpy as np
I understand that when possible one should use import numpy as np. This helps keep away any conflict due to namespaces. But I have noticed that while the command below works [...] the following does not [...] Can someone please explain this?
Tags: python, numpy
4 Answers
Answer 1 (13 votes)
numpy is the top package name, and doing [...] When you do [...] In your above code: [...] Here is the difference between [...]
Answer 2 (7 votes)
The [...] When you import a module via [...] the numpy package is bound to the local variable [...] Thus, [...] is equivalent to, [...] When trying to understand this mechanism, it's worth remembering that [...]
When importing a submodule, you must refer to the full parent module name, since the importing mechanics happen at a higher level than the local variable scope. i.e. [...]
I also take issue with your assertion that "where possible one should [import numpy as np]". This is done for historical reasons, mostly because people get tired very quickly of prefixing every operation with [...]
Finally, to round out my exposé, here are 2 interesting uses of the [...]
1. long subimports [...]
2. compatible APIs [...]
Answer 3 (1 vote)
when you call the statement [...]
Answer 4 (1 vote)
This is a language feature. This feature allows: [...]
Notice however that [...] Said that, when you run [...] You receive an [...]
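The rule stated in the second answer, that a submodule import must use the full parent package name because the import machinery does not look at local variables, can be illustrated with a short sketch. The choice of `numpy.linalg` as the submodule is an assumption made here for illustration; it is not taken from the thread.

```python
import sys

import numpy as np   # binds the numpy module object to the local name "np"

# The alias refers to the very same module object that the import system
# registered under its real name.
same_object = np is sys.modules["numpy"]

# The import statement resolves package names, not local aliases, so using
# the alias as if it were a package name fails.
try:
    import np.linalg  # "np" is a variable here, not an importable package
    alias_import_worked = True
except ImportError:
    alias_import_worked = False

# Using the real package name works, and the result can be aliased too.
import numpy.linalg as la

print(same_object, alias_import_worked, la is np.linalg)
```

Attribute access through the alias (`np.linalg`) works here only because numpy itself imports that submodule when it is loaded; the `import` statement always needs the real dotted name.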
That concludes the walkthrough of filtering a numpy array by the frequency of some of its elements, along with the related topics: "import numpy as np" ImportError: No module named numpy; Numpy's statistical functions; the Anaconda Numpy error "Importing the Numpy C Extension Failed"; and the difference between import numpy and import numpy as np. Thank you for reading.