
Learning pandas and numpy (pandas with numpy)


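This article walks through learning pandas and numpy, with a focus on how pandas is used together with numpy, mostly by way of worked examples. Related topics covered elsewhere include: "import numpy as np" ImportError: No module named numpy; Python 3.7 data processing, Numpy series (7): Numpy statistical functions; whether there is another fix for the Anaconda Numpy error "Importing the Numpy C Extension Failed"; and the version conflict between cvxpy and numpy ("module compiled against API version 0xe but this version of numpy is 0xd").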


1. Imports:

import numpy as np
import pandas as pd

Original file:

xlsx = pd.ExcelFile('1.xlsx')
df = pd.read_excel(xlsx, 0, index_col=None, na_values=['NA'])  # first sheet, default integer index
print(df.head())

Output:

  Name Index  num  cell
0   C1   GB0    1     1
1   C4  GBC2    3     5
2   C7  GBC4    5     9
3  C10  GBC6    7    13
4  C13  GBC8    9    17
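
For readers who do not have 1.xlsx, the sketch below rebuilds what appears to be the same 12-row frame in memory. The values are copied from the full table printed later in this post; the way the columns are generated here is only a convenience and is an assumption, not how the original file was made.

```python
import pandas as pd

# Hypothetical reconstruction of the sheet in 1.xlsx, based on the printed output.
names = ["C1", "C4", "C7", "C10", "C13", "C16",
         "C19", "C4", "C7", "C10", "C13", "C16"]
df = pd.DataFrame({
    "Name": names,
    "Index": ["GB0"] + [f"GBC{2 * i}" for i in range(1, 12)],  # GB0, GBC2, ..., GBC22
    "num": [2 * i + 1 for i in range(12)],                     # 1, 3, ..., 23
    "cell": [4 * i + 1 for i in range(12)],                    # 1, 5, ..., 45
})
print(df.head())
```

With this frame in place, the selection examples that follow can be run without any Excel file.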

2. Filtering and selecting data:

print(df.shape)                                   # (rows, columns)
print("*" * 111)
print(df.index)                                   # row index object
print("*" * 111)
print(df.index.values)                            # row labels as a numpy array
print("*" * 111)
print(df.columns)                                 # column index object
print("*" * 111)
print(df.columns.values)                          # column names as a numpy array
print("*" * 111)
print(df.columns.get_loc("num"))                  # position of one column
print("*" * 111)
print(df.columns.get_indexer(["num", "cell"]))    # positions of several columns
print("*" * 111)
print(df["num"])                                  # one column as a Series
print("*" * 111)
print(df[["num", "cell"]])                        # several columns as a DataFrame
print("*" * 111)
print(df[["num", "cell"]].values)                 # the same columns as a 2-D array
print("*" * 111)
print(df.iloc[2:6])                               # rows at positions 2..5
print("*" * 111)
print(df.iloc[2:6, 2:4])                          # rows 2..5, columns 2..3, by position
print("*" * 111)
print(df.iloc[2:6, df.columns.get_indexer(["cell", "num"])])  # positions looked up by name
print("*" * 111)
print(df.loc[2:6, ["cell", "num"]])               # loc slices by label and includes the end
print("*" * 111)
print(df.loc[2:6, "num":"cell"])                  # label slice over the columns as well
print("*" * 111)
print(df.loc[2:8:2, ["cell", "num"]])             # every second label from 2 to 8
print("*" * 111)
print(df[(df["cell"] == 9) | (df["num"] == 9)])   # boolean OR of two conditions
print("*" * 111)
print(df[(df["num"] == 9) | (df["num"] == 1)])
print("*" * 111)
print(df.loc[df["num"] < 7])                      # boolean mask through loc
print("*" * 111)
print(df.loc[df.num < 7, ['cell', 'num']])        # mask plus column selection

Output:

(12, 4)
***************************************************************************************************************
RangeIndex(start=0, stop=12, step=1)
***************************************************************************************************************
[ 0  1  2  3  4  5  6  7  8  9 10 11]
***************************************************************************************************************
Index(['Name', 'Index', 'num', 'cell'], dtype='object')
***************************************************************************************************************
['Name' 'Index' 'num' 'cell']
***************************************************************************************************************
2
***************************************************************************************************************
[2 3]
***************************************************************************************************************
0      1
1      3
2      5
3      7
4      9
5     11
6     13
7     15
8     17
9     19
10    21
11    23
Name: num, dtype: int64
***************************************************************************************************************
    num  cell
0     1     1
1     3     5
2     5     9
3     7    13
4     9    17
5    11    21
6    13    25
7    15    29
8    17    33
9    19    37
10   21    41
11   23    45
***************************************************************************************************************
[[ 1  1]
 [ 3  5]
 [ 5  9]
 [ 7 13]
 [ 9 17]
 [11 21]
 [13 25]
 [15 29]
 [17 33]
 [19 37]
 [21 41]
 [23 45]]
***************************************************************************************************************
  Name  Index  num  cell
2   C7   GBC4    5     9
3  C10   GBC6    7    13
4  C13   GBC8    9    17
5  C16  GBC10   11    21
***************************************************************************************************************
   num  cell
2    5     9
3    7    13
4    9    17
5   11    21
***************************************************************************************************************
   cell  num
2     9    5
3    13    7
4    17    9
5    21   11
***************************************************************************************************************
   cell  num
2     9    5
3    13    7
4    17    9
5    21   11
6    25   13
***************************************************************************************************************
   num  cell
2    5     9
3    7    13
4    9    17
5   11    21
6   13    25
***************************************************************************************************************
   cell  num
2     9    5
4    17    9
6    25   13
8    33   17
***************************************************************************************************************
  Name Index  num  cell
2   C7  GBC4    5     9
4  C13  GBC8    9    17
***************************************************************************************************************
  Name Index  num  cell
0   C1   GB0    1     1
4  C13  GBC8    9    17
***************************************************************************************************************
  Name Index  num  cell
0   C1   GB0    1     1
1   C4  GBC2    3     5
2   C7  GBC4    5     9
***************************************************************************************************************
   cell  num
0     1    1
1     5    3
2     9    5

Iterating by row and by column:

import numpy as np
import pandas as pd

xlsx = pd.ExcelFile('1.xlsx')
df = pd.read_excel(xlsx, 0, index_col=None, na_values=['NA'])
print(df)
print("*" * 111)
# Three ways to iterate. iterrows() yields (index, Series) pairs, one per row
for index, row in df.iterrows():
    print(index, '\n', row)
print("*" * 111)
# Each row is itself a Series, so individual elements can be sliced out of it
for index, row in df.iterrows():
    print(row.iloc[2:3])
print("*" * 111)
# items() yields (column name, Series) pairs, one per column
# (the original used iteritems(), which was removed in pandas 2.0)
for name, col in df.items():
    print(col.iloc[2:3])
print("*" * 111)
# The loops above work on copies and cannot modify df. To modify a small
# frame, loop over the row labels and go through df.loc as below; for a
# large frame, prefer apply or convert the data to a list first.
for i in df.index:
    print(df.loc[i, :])
print("*" * 111)
# shape[0] is the row count; this form relies on the default 0..n-1 index
for i in range(df.shape[0]):
    print(df.loc[i, :])

Output:

   Name  Index  num  cell
0    C1    GB0    1     1
1    C4   GBC2    3     5
2    C7   GBC4    5     9
3   C10   GBC6    7    13
4   C13   GBC8    9    17
5   C16  GBC10   11    21
6   C19  GBC12   13    25
7    C4  GBC14   15    29
8    C7  GBC16   17    33
9   C10  GBC18   19    37
10  C13  GBC20   21    41
11  C16  GBC22   23    45
***************************************************************************************************************
0 
 Name      C1
Index    GB0
num        1
cell       1
Name: 0, dtype: object
1 
 Name       C4
Index    GBC2
num         3
cell        5
Name: 1, dtype: object
2 
 Name       C7
Index    GBC4
num         5
cell        9
Name: 2, dtype: object
3 
 Name      C10
Index    GBC6
num         7
cell       13
Name: 3, dtype: object
4 
 Name      C13
Index    GBC8
num         9
cell       17
Name: 4, dtype: object
5 
 Name       C16
Index    GBC10
num         11
cell        21
Name: 5, dtype: object
6 
 Name       C19
Index    GBC12
num         13
cell        25
Name: 6, dtype: object
7 
 Name        C4
Index    GBC14
num         15
cell        29
Name: 7, dtype: object
8 
 Name        C7
Index    GBC16
num         17
cell        33
Name: 8, dtype: object
9 
 Name       C10
Index    GBC18
num         19
cell        37
Name: 9, dtype: object
10 
 Name       C13
Index    GBC20
num         21
cell        41
Name: 10, dtype: object
11 
 Name       C16
Index    GBC22
num         23
cell        45
Name: 11, dtype: object
***************************************************************************************************************
num    1
Name: 0, dtype: object
num    3
Name: 1, dtype: object
num    5
Name: 2, dtype: object
num    7
Name: 3, dtype: object
num    9
Name: 4, dtype: object
num    11
Name: 5, dtype: object
num    13
Name: 6, dtype: object
num    15
Name: 7, dtype: object
num    17
Name: 8, dtype: object
num    19
Name: 9, dtype: object
num    21
Name: 10, dtype: object
num    23
Name: 11, dtype: object
***************************************************************************************************************
2    C7
Name: Name, dtype: object
2    GBC4
Name: Index, dtype: object
2    5
Name: num, dtype: int64
2    9
Name: cell, dtype: int64
***************************************************************************************************************
Name      C1
Index    GB0
num        1
cell       1
Name: 0, dtype: object
Name       C4
Index    GBC2
num         3
cell        5
Name: 1, dtype: object
Name       C7
Index    GBC4
num         5
cell        9
Name: 2, dtype: object
Name      C10
Index    GBC6
num         7
cell       13
Name: 3, dtype: object
Name      C13
Index    GBC8
num         9
cell       17
Name: 4, dtype: object
Name       C16
Index    GBC10
num         11
cell        21
Name: 5, dtype: object
Name       C19
Index    GBC12
num         13
cell        25
Name: 6, dtype: object
Name        C4
Index    GBC14
num         15
cell        29
Name: 7, dtype: object
Name        C7
Index    GBC16
num         17
cell        33
Name: 8, dtype: object
Name       C10
Index    GBC18
num         19
cell        37
Name: 9, dtype: object
Name       C13
Index    GBC20
num         21
cell        41
Name: 10, dtype: object
Name       C16
Index    GBC22
num         23
cell        45
Name: 11, dtype: object
***************************************************************************************************************
Name      C1
Index    GB0
num        1
cell       1
Name: 0, dtype: object
Name       C4
Index    GBC2
num         3
cell        5
Name: 1, dtype: object
Name       C7
Index    GBC4
num         5
cell        9
Name: 2, dtype: object
Name      C10
Index    GBC6
num         7
cell       13
Name: 3, dtype: object
Name      C13
Index    GBC8
num         9
cell       17
Name: 4, dtype: object
Name       C16
Index    GBC10
num         11
cell        21
Name: 5, dtype: object
Name       C19
Index    GBC12
num         13
cell        25
Name: 6, dtype: object
Name        C4
Index    GBC14
num         15
cell        29
Name: 7, dtype: object
Name        C7
Index    GBC16
num         17
cell        33
Name: 8, dtype: object
Name       C10
Index    GBC18
num         19
cell        37
Name: 9, dtype: object
Name       C13
Index    GBC20
num         21
cell        41
Name: 10, dtype: object
Name       C16
Index    GBC22
num         23
cell        45
Name: 11, dtype: object
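
Note that every row printed by iterrows() above has dtype object, because a row that mixes strings and integers is packed into one Series. A commonly used alternative is itertuples(), sketched here on a small made-up frame. The sample deliberately leaves out the column called "Index", since itertuples already uses that field name for the row label.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["C1", "C4", "C7"], "num": [1, 3, 5], "cell": [1, 5, 9]})

# itertuples yields one namedtuple per row; fields are the column names,
# and row.Index holds the row label.
labels = []
for row in df.itertuples(index=True):
    labels.append(f"{row.Index}:{row.Name}:{row.num + row.cell}")
print(labels)
```

Like iterrows(), this is read-only iteration: assigning to a field of the tuple does not change the frame.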

Important:

# For larger data, make the row-by-row pass over a plain list instead
modify_result = []  # collecting rows in a list speeds up the loop
column_names = list(df.columns.values)
modify_result.append(column_names)  # header row
for i, row in df.iterrows():
    newrows = list(row)
    if i >= 4:
        newrows[column_names.index("num")] = 99
    modify_result.append(newrows)  # one data row
df = pd.DataFrame(modify_result[1:], columns=modify_result[0])  # back to a DataFrame
print(df)

Output:

   Name  Index  num  cell
0    C1    GB0    1     1
1    C4   GBC2    3     5
2    C7   GBC4    5     9
3   C10   GBC6    7    13
4   C13   GBC8   99    17
5   C16  GBC10   99    21
6   C19  GBC12   99    25
7    C4  GBC14   99    29
8    C7  GBC16   99    33
9   C10  GBC18   99    37
10  C13  GBC20   99    41
11  C16  GBC22   99    45
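
The same result (num set to 99 from row 4 onward) can also be reached without any loop, by assigning through a boolean mask. This is a sketch on a made-up frame that only carries the two numeric columns:

```python
import pandas as pd

df = pd.DataFrame({"num": range(1, 24, 2), "cell": range(1, 46, 4)})

# Rows whose index label is >= 4 get num = 99 in one vectorized step.
df.loc[df.index >= 4, "num"] = 99
print(df["num"].tolist())
```

The mask can be any boolean Series or array of the right length, so conditions on column values work the same way.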

Modifying by assignment:

for i in df.index:
    if i > 2:
        df.loc[i, "cell"] = 28  # df.loc writes into the frame itself

print(df)

Output:

   Name  Index  num  cell
0    C1    GB0    1     1
1    C4   GBC2    3     5
2    C7   GBC4    5     9
3   C10   GBC6    7    28
4   C13   GBC8    9    28
5   C16  GBC10   11    28
6   C19  GBC12   13    28
7    C4  GBC14   15    28
8    C7  GBC16   17    28
9   C10  GBC18   19    28
10  C13  GBC20   21    28
11  C16  GBC22   23    28

Vectorized operations:

df['num'] = 50  # the scalar is broadcast to every row of the column
print(df)

Output:

   Name  Index  num  cell
0    C1    GB0   50     1
1    C4   GBC2   50     5
2    C7   GBC4   50     9
3   C10   GBC6   50    13
4   C13   GBC8   50    17
5   C16  GBC10   50    21
6   C19  GBC12   50    25
7    C4  GBC14   50    29
8    C7  GBC16   50    33
9   C10  GBC18   50    37
10  C13  GBC20   50    41
11  C16  GBC22   50    45
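
Broadcasting a scalar is the simplest vectorized operation. Column arithmetic and a vectorized if/else with numpy follow the same idea. The sketch below uses a made-up four-row frame, and the column names total and big are illustrative only:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"num": [1, 3, 5, 7], "cell": [1, 5, 9, 13]})

df["total"] = df["num"] + df["cell"]                # element-wise column arithmetic
df["big"] = np.where(df["cell"] > 5, "yes", "no")   # vectorized if/else
print(df)
```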

Building new values from existing columns

for m, row in df.iterrows():
    excel_id = "{}-{}-{}".format(row['cell'], row['num'], row['Name'])
    print(excel_id)
print("*" * 111)
for i, row in df.iterrows():
    print(row[['num', 'Name']])  # pick several fields of the row by name
print("*" * 111)
for t in df.index:
    excel_id = "{}-{}-{}".format(df.loc[t, 'cell'], df.loc[t, 'num'], df.loc[t, 'Name'])
    print(excel_id)

Row and column totals (the output for this listing is printed after the first one and comes from a frame whose columns are all numeric):

df['Col_sum'] = df.apply(lambda x: x.sum(), axis=1)  # sum across each row
df.loc['Row_sum'] = df.apply(lambda x: x.sum())      # sum down each column
print(df)

Output:

1-1-C1
5-3-C4
9-5-C7
13-7-C10
17-9-C13
21-11-C16
25-13-C19
29-15-C4
33-17-C7
37-19-C10
41-21-C13
45-23-C16
***************************************************************************************************************
num      1
Name    C1
Name: 0, dtype: object
num      3
Name    C4
Name: 1, dtype: object
num      5
Name    C7
Name: 2, dtype: object
num       7
Name    C10
Name: 3, dtype: object
num       9
Name    C13
Name: 4, dtype: object
num      11
Name    C16
Name: 5, dtype: object
num      13
Name    C19
Name: 6, dtype: object
num     15
Name    C4
Name: 7, dtype: object
num     17
Name    C7
Name: 8, dtype: object
num      19
Name    C10
Name: 9, dtype: object
num      21
Name    C13
Name: 10, dtype: object
num      23
Name    C16
Name: 11, dtype: object
***************************************************************************************************************
1-1-C1
5-3-C4
9-5-C7
13-7-C10
17-9-C13
21-11-C16
25-13-C19
29-15-C4
33-17-C7
37-19-C10
41-21-C13
45-23-C16
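
The same "cell-num-Name" identifiers can be built for all rows at once with string concatenation on whole columns. This is a sketch on a two-row made-up frame; storing the result in a column named excel_id is an assumption for illustration:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["C1", "C4"], "num": [1, 3], "cell": [1, 5]})

# Numeric columns are converted to strings first, then joined element-wise.
df["excel_id"] = df["cell"].astype(str) + "-" + df["num"].astype(str) + "-" + df["Name"]
print(df["excel_id"].tolist())
```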

Output of the Col_sum / Row_sum listing:

         name  index  num  cell  Col_sum
0           1      0    1     1        3
1           4      2    3     5       14
2           7      4    5     9       25
3          10      6    7    13       36
4          13      8    9    17       47
5          16     10    5    21       52
6          10      8    9     5       32
7          13     10   13     7       43
8          16     12   17     9       54
9          19     14    3     5       41
10          4     20    5    41       70
11         16     22    7    45       90
Row_sum   129    116   84   178      507
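
For plain sums, apply with a lambda is not needed: DataFrame.sum takes an axis argument directly. A sketch using the first three rows of the table above:

```python
import pandas as pd

df = pd.DataFrame({"name": [1, 4, 7], "index": [0, 2, 4],
                   "num": [1, 3, 5], "cell": [1, 5, 9]})

df["Col_sum"] = df.sum(axis=1)   # sum across each row
df.loc["Row_sum"] = df.sum()     # sum down each column, including Col_sum
print(df)
```

As in the original listing, the order matters: adding Col_sum first means the Row_sum row also contains the grand total.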

Sorting:

print(df)
print("*" * 111)
print(df.sort_values(by="cell"))  # returns a sorted copy
print("*" * 111)
print(df)                         # the original frame is unchanged
print("*" * 111)
# Sort by several keys, then renumber the rows and discard the old index
print(df.sort_values(by=['cell', 'num', 'name']).reset_index(drop=True))
print("*" * 111)
# drop=False keeps the old index as a column; it is named level_0 here
# because a column called "index" already exists
print(df.sort_values(by=['cell', 'num', 'name']).reset_index(drop=False))

Output:

    name  index  num  cell
0      1      0    1     1
1      4      2    3     5
2      7      4    5     9
3     10      6    7    13
4     13      8    9    17
5     16     10    5    21
6     10      8    9     5
7     13     10   13     7
8     16     12   17     9
9     19     14    3     5
10     4     20    5    41
11    16     22    7    45
***************************************************************************************************************
    name  index  num  cell
0      1      0    1     1
1      4      2    3     5
6     10      8    9     5
9     19     14    3     5
7     13     10   13     7
2      7      4    5     9
8     16     12   17     9
3     10      6    7    13
4     13      8    9    17
5     16     10    5    21
10     4     20    5    41
11    16     22    7    45
***************************************************************************************************************
    name  index  num  cell
0      1      0    1     1
1      4      2    3     5
2      7      4    5     9
3     10      6    7    13
4     13      8    9    17
5     16     10    5    21
6     10      8    9     5
7     13     10   13     7
8     16     12   17     9
9     19     14    3     5
10     4     20    5    41
11    16     22    7    45
***************************************************************************************************************
    name  index  num  cell
0      1      0    1     1
1      4      2    3     5
2     19     14    3     5
3     10      8    9     5
4     13     10   13     7
5      7      4    5     9
6     16     12   17     9
7     10      6    7    13
8     13      8    9    17
9     16     10    5    21
10     4     20    5    41
11    16     22    7    45
***************************************************************************************************************
    level_0  name  index  num  cell
0         0     1      0    1     1
1         1     4      2    3     5
2         9    19     14    3     5
3         6    10      8    9     5
4         7    13     10   13     7
5         2     7      4    5     9
6         8    16     12   17     9
7         3    10      6    7    13
8         4    13      8    9    17
9         5    16     10    5    21
10       10     4     20    5    41
11       11    16     22    7    45
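
sort_values also accepts one sort direction per key. A minimal sketch, assuming a small made-up frame without ties so that the resulting order is unambiguous:

```python
import pandas as pd

df = pd.DataFrame({"name": [1, 4, 10, 19], "num": [1, 3, 9, 2], "cell": [1, 5, 5, 5]})

# cell ascending, and within equal cell values num descending
ordered = df.sort_values(by=["cell", "num"], ascending=[True, False]).reset_index(drop=True)
print(ordered)
```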

Concatenating: make sure the headers match

import numpy as np
import pandas as pd

excel1 = pd.ExcelFile('1.xlsx')
df1 = pd.read_excel(excel1, 0, index_col=None, na_values=['NA'])
excel2 = pd.ExcelFile('2.xlsx')
df2 = pd.read_excel(excel2, 0, index_col=None, na_values=['NA'])
print(df1)
print("*" * 111)
print(df2)
print("*" * 111)
print(df1[:3])
print("*" * 111)
pieces = [df1[1:4], df1[2:4], df2[2:3]]  # row slices taken from both frames
print(pd.concat(pieces, sort=True))      # sort=True orders the columns alphabetically
print("*" * 111)
print(pd.concat(pieces, sort=False))     # sort=False keeps the columns in order of appearance

Output:

     N  IN  num  cell
0    1   0    1     1
1    4   2    3     5
2    7   4    5     9
3   10   6    7    13
4   13   8    9    17
5   16  10    5    21
6   10   8    9     5
7   13  10   13     7
8   16  12   17     9
9   19  14    3     5
10   4  20    5    41
11  16  22    7    45
***************************************************************************************************************
    name  index  num  cell
0      1      6    1     1
1      2      7    3     5
2      3      8    5     9
3      4      9    7    13
4      5     10    9    17
5      6     11   11    21
6      7     12   13    25
7      8     13   15    29
8      9     14   17    33
9     10     15   19    37
10    11     16   21    41
11    12     17   23    45
***************************************************************************************************************
   N  IN  num  cell
0  1   0    1     1
1  4   2    3     5
2  7   4    5     9
***************************************************************************************************************
    IN     N  cell  index  name  num
1  2.0   4.0     5    NaN   NaN    3
2  4.0   7.0     9    NaN   NaN    5
3  6.0  10.0    13    NaN   NaN    7
2  4.0   7.0     9    NaN   NaN    5
3  6.0  10.0    13    NaN   NaN    7
2  NaN   NaN     9    8.0   3.0    5
***************************************************************************************************************
      N   IN  num  cell  name  index
1   4.0  2.0    3     5   NaN    NaN
2   7.0  4.0    5     9   NaN    NaN
3  10.0  6.0    7    13   NaN    NaN
2   7.0  4.0    5     9   NaN    NaN
3  10.0  6.0    7    13   NaN    NaN
2   NaN  NaN    5     9   3.0    8.0

Output of a second run, in which both files have the same headers:

    name  index  num  cell
0      1      0    1     1
1      4      2    3     5
2      7      4    5     9
3     10      6    7    13
4     13      8    9    17
5     16     10    5    21
6     10      8    9     5
7     13     10   13     7
8     16     12   17     9
9     19     14    3     5
10     4     20    5    41
11    16     22    7    45
***************************************************************************************************************
    name  index  num  cell
0      1      6    1     1
1      2      7    3     5
2      3      8    5     9
3      4      9    7    13
4      5     10    9    17
5      6     11   11    21
6      7     12   13    25
7      8     13   15    29
8      9     14   17    33
9     10     15   19    37
10    11     16   21    41
11    12     17   23    45
***************************************************************************************************************
   name  index  num  cell
0     1      0    1     1
1     4      2    3     5
2     7      4    5     9
***************************************************************************************************************
   cell  index  name  num
1     5      2     4    3
2     9      4     7    5
3    13      6    10    7
2     9      4     7    5
3    13      6    10    7
2     9      8     3    5
***************************************************************************************************************
   name  index  num  cell
1     4      2    3     5
2     7      4    5     9
3    10      6    7    13
2     7      4    5     9
3    10      6    7    13
2     3      8    5     9
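
When the headers differ, concat takes the union of the columns and fills the gaps with NaN, which is what the first run shows. One way to avoid that is to rename the columns before concatenating. The sketch below uses tiny made-up frames, and the mapping N to name and IN to index is an assumption about which columns correspond:

```python
import pandas as pd

df1 = pd.DataFrame({"N": [1, 4], "IN": [0, 2], "num": [1, 3], "cell": [1, 5]})
df2 = pd.DataFrame({"name": [1, 2], "index": [6, 7], "num": [1, 3], "cell": [1, 5]})

mismatched = pd.concat([df1, df2])  # union of columns, NaN where a frame lacks one

# Hypothetical mapping of df1's headers onto df2's
aligned = pd.concat([df1.rename(columns={"N": "name", "IN": "index"}), df2],
                    ignore_index=True)
print(mismatched)
print(aligned)
```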

In the first run above the headers of the two files differ; in the second run they match.

Three ways to combine frames:

import numpy as np
import pandas as pd

excel1 = pd.ExcelFile('1.xlsx')
df1 = pd.read_excel(excel1, 0, index_col=None, na_values=['NA'])
excel2 = pd.ExcelFile('2.xlsx')
df2 = pd.read_excel(excel2, 0, index_col=None, na_values=['NA'])
excel3 = pd.ExcelFile('3.xlsx')
df3 = pd.read_excel(excel3, 0, index_col=None, na_values=['NA'])
excel4 = pd.ExcelFile('4.xlsx')
df4 = pd.read_excel(excel4, 0, index_col=None, na_values=['NA'])
print(df1)
print(df2)
print(df3)
print(df4)

The four tables are:

   name  index  num  cell
0     1      1    5     5
1     2      1    2     4
2     3      1    3     4
3     4      2    2     2
4     5      2    1     1
   name  index  num  cell
0     6      6    7    10
1     7      7    8     7
2     8      8    9     8
3     9      9    7     9
4    10      6    9    10
   name  index  num  cell
0    11     12   13    10
1    12     13   12    13
2    13     10   13    12
3    14     12   10    10
4    15     14   12    12
   name  index  num  cell
0    16     20   16    17
1    17     17   17    17
2    18     18   18    18
3    19     19   19    20
4    20     16   17    16

concat parameters

"""
concat parameters
pd.concat(objs, axis=0, join='outer', join_axes=None,
          ignore_index=False, keys=None, levels=None,
          names=None, verify_integrity=False,
          copy=True)
1. objs: the input, a list (or tuple) of DataFrames to combine.

2. axis: {0, 1, ...}, default 0. The direction of concatenation: 0 stacks the frames vertically (rows are appended), 1 places them side by side (columns are appended).

3. join: {'inner', 'outer'}, default 'outer'. How the other axis is aligned: 'inner' keeps the intersection of the labels, 'outer' keeps their union.

4. ignore_index: boolean, default False. Whether to discard the original index; if True, the result is renumbered from 0.

5. join_axes: list of Index objects. The index to align on, e.g. join_axes=[df1.index] to follow df1's index. (This parameter was removed in pandas 1.0; reindexing the result is the usual replacement.)

6. keys: sequence, default None. Adds another index level on the outside, giving a MultiIndex.
"""
pieces = [df1[1:4], df2[2:4], df3[2:], df4[4:]]
print(pd.concat(pieces, axis=0))
print("*" * 111)
print(pd.concat(pieces, axis=1))
print("*" * 111)
frames = (df1, df2, df3, df4)  # renamed from `list`, which shadowed the built-in
print(pd.concat(frames))
print("*" * 111)
print(pd.concat(frames, axis=1))
print("*" * 111)
print(pd.concat(frames, axis=0, ignore_index=False))
print("*" * 111)
print(pd.concat(frames, axis=0, ignore_index=True))

Output:

   name  index  num  cell
1     2      1    2     4
2     3      1    3     4
3     4      2    2     2
2     8      8    9     8
3     9      9    7     9
2    13     10   13    12
3    14     12   10    10
4    15     14   12    12
4    20     16   17    16
***************************************************************************************************************
   name  index  num  cell  name  ...   cell  name  index   num  cell
1   2.0    1.0  2.0   4.0   NaN  ...    NaN   NaN    NaN   NaN   NaN
2   3.0    1.0  3.0   4.0   8.0  ...   12.0   NaN    NaN   NaN   NaN
3   4.0    2.0  2.0   2.0   9.0  ...   10.0   NaN    NaN   NaN   NaN
4   NaN    NaN  NaN   NaN   NaN  ...   12.0  20.0   16.0  17.0  16.0

[4 rows x 16 columns]
***************************************************************************************************************
   name  index  num  cell
0     1      1    5     5
1     2      1    2     4
2     3      1    3     4
3     4      2    2     2
4     5      2    1     1
0     6      6    7    10
1     7      7    8     7
2     8      8    9     8
3     9      9    7     9
4    10      6    9    10
0    11     12   13    10
1    12     13   12    13
2    13     10   13    12
3    14     12   10    10
4    15     14   12    12
0    16     20   16    17
1    17     17   17    17
2    18     18   18    18
3    19     19   19    20
4    20     16   17    16
***************************************************************************************************************
   name  index  num  cell  name  index  ...   num  cell  name  index  num  cell
0     1      1    5     5     6      6  ...    13    10    16     20   16    17
1     2      1    2     4     7      7  ...    12    13    17     17   17    17
2     3      1    3     4     8      8  ...    13    12    18     18   18    18
3     4      2    2     2     9      9  ...    10    10    19     19   19    20
4     5      2    1     1    10      6  ...    12    12    20     16   17    16

[5 rows x 16 columns]
***************************************************************************************************************
   name  index  num  cell
0     1      1    5     5
1     2      1    2     4
2     3      1    3     4
3     4      2    2     2
4     5      2    1     1
0     6      6    7    10
1     7      7    8     7
2     8      8    9     8
3     9      9    7     9
4    10      6    9    10
0    11     12   13    10
1    12     13   12    13
2    13     10   13    12
3    14     12   10    10
4    15     14   12    12
0    16     20   16    17
1    17     17   17    17
2    18     18   18    18
3    19     19   19    20
4    20     16   17    16
***************************************************************************************************************
    name  index  num  cell
0      1      1    5     5
1      2      1    2     4
2      3      1    3     4
3      4      2    2     2
4      5      2    1     1
5      6      6    7    10
6      7      7    8     7
7      8      8    9     8
8      9      9    7     9
9     10      6    9    10
10    11     12   13    10
11    12     13   12    13
12    13     10   13    12
13    14     12   10    10
14    15     14   12    12
15    16     20   16    17
16    17     17   17    17
17    18     18   18    18
18    19     19   19    20
19    20     16   17    16
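
The join and keys parameters are described above but not exercised in the listing. A minimal sketch of both, on two made-up frames that share only the name column:

```python
import pandas as pd

df1 = pd.DataFrame({"name": [1, 2], "num": [5, 2]})
df2 = pd.DataFrame({"name": [6, 7], "cell": [10, 7]})

# keys adds an outer index level that records which frame each row came from
tagged = pd.concat([df1, df2], keys=["first", "second"])

# join="inner" keeps only the columns present in every frame
common = pd.concat([df1, df2], join="inner", ignore_index=True)
print(tagged)
print(common)
```

With keys in place, tagged.loc["second"] recovers the rows that came from df2.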

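merge parameters (not used yet; to be expanded when needed)
merge(left, right, how='inner', on=None, left_on=None, right_on=None,
      left_index=False, right_index=False, sort=False,
      suffixes=('_x', '_y'), copy=True, indicator=False)

1. on: the name of the column to join on. When this parameter is used, the join column must have the same name in both the left and the right table.

2. left_on: the join column of the left table; either a column name or an array of the same length as the DataFrame.

3. right_on: the join column of the right table; either a column name or an array of the same length as the DataFrame.

4. left_index / right_index: if True, use the index as the join key.

5. how: the kind of join to perform.

6. sort: sort the result by the join keys in lexicographic order. The default in current pandas is False; leaving it off improves performance.

append usage:

df1.append([df2, df3, df4]) is equivalent to pd.concat([df1, df2, df3, df4]). DataFrame.append was removed in pandas 2.0, so pd.concat is the form to use.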
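
Since merge is only listed above, here is a small sketch of on, how and indicator in use. The two frames are made up for illustration and share a name column as the join key:

```python
import pandas as pd

left = pd.DataFrame({"name": [1, 2, 3], "num": [5, 2, 3]})
right = pd.DataFrame({"name": [2, 3, 4], "cell": [4, 4, 2]})

inner = pd.merge(left, right, on="name", how="inner")  # keys present in both tables
outer = pd.merge(left, right, on="name", how="outer", indicator=True)  # all keys

# The _merge column added by indicator=True records where each row came from
source = outer.set_index("name")["_merge"].astype(str).to_dict()
print(inner)
print(source)
```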

Checking for duplicates:

df1 = pd.read_excel(excel1, 0, index_col=None, na_values=['NA'])
print(df1)
print("*" * 111)
print(df1.duplicated())  # True for every row that repeats an earlier row

Output:

   name  index  num  cell
0     1      1    5     5
1     2      1    2     4
2     3      3    3     3
3     4      2    2     2
4     4      2    2     2
***************************************************************************************************************
0    False
1    False
2    False
3    False
4     True
dtype: bool
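
By default duplicated() compares whole rows and flags only the later copies. The subset and keep arguments change both behaviors. A sketch using the five rows printed above:

```python
import pandas as pd

df1 = pd.DataFrame({"name": [1, 2, 3, 4, 4], "index": [1, 1, 3, 2, 2],
                    "num": [5, 2, 3, 2, 2], "cell": [5, 4, 3, 2, 2]})

whole_row = df1.duplicated().tolist()                  # compare entire rows
by_column = df1.duplicated(subset=["index"]).tolist()  # compare the "index" column only
all_copies = df1.duplicated(keep=False).tolist()       # flag every member of a duplicate group
print(whole_row, by_column, all_copies)
```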

Dropping duplicates:

print(df1)
print("*" * 111)
print(df1.drop_duplicates("index"))  # keep the first row for each value in the "index" column

Output:

   name  index  num  cell
0     1      1    5     5
1     2      1    2     4
2     3      3    3     3
3     4      2    2     2
4     4      2    2     2
***************************************************************************************************************
   name  index  num  cell
0     1      1    5     5
2     3      3    3     3
3     4      2    2     2
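
drop_duplicates keeps the first occurrence by default, which is why rows 0, 2 and 3 survive above. A sketch of the keep="last" variant and of whole-row de-duplication, on the same five rows:

```python
import pandas as pd

df1 = pd.DataFrame({"name": [1, 2, 3, 4, 4], "index": [1, 1, 3, 2, 2],
                    "num": [5, 2, 3, 2, 2], "cell": [5, 4, 3, 2, 2]})

last_kept = df1.drop_duplicates(subset="index", keep="last")  # last row per "index" value
unique_rows = df1.drop_duplicates().reset_index(drop=True)    # identical rows collapsed
print(last_kept)
print(unique_rows)
```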

Grouping and aggregating:

import numpy as np
import pandas as pd

excel1 = pd.ExcelFile('1.xlsx')
df1 = pd.read_excel(excel1, 0, index_col=None, na_values=['NA'])
print("Original data:", df1)
print("*" * 111)
# A single key is passed as a plain string rather than a one-element list,
# so that group names stay scalars in recent pandas versions as well
print(df1.groupby('num'))
print("*" * 111)
print(df1.groupby('num').groups)  # mapping of group key -> row labels
print("*" * 111)
for name, group in df1.groupby('num'):
    print("group:", name)
    print(group)
print("*" * 111)
print(df1.groupby('num').sum())   # per-group column sums
print("*" * 111)
grouped = df1.groupby('num')
print(grouped.size())             # rows per group
print("*" * 111)
print(len(grouped))               # number of groups
print("*" * 111)
print(grouped.first())            # first row of each group
print("*" * 111)
print(grouped.last())             # last row of each group
print("*" * 111)
print(grouped.get_group(3))       # the rows of one group
print("*" * 111)
grouped2 = df1.groupby(['num', 'cell'])  # group by two keys
print(grouped2.groups)
print("*" * 111)
print(grouped2.get_group((2, 4)))
print("*" * 111)
for (k1, k2), group_t in grouped2:
    print("group:", k1, k2)
    print(group_t)

Output:

Original data:    name  index  num  cell
0     1      1    5     5
1     2      1    2     4
2     3      3    3     3
3     4      2    2     2
4     4      2    2     2
***************************************************************************************************************
<pandas.core.groupby.groupby.DataFrameGroupBy object at 0x07746D90>
***************************************************************************************************************
{2: Int64Index([1, 3, 4], dtype='int64'), 3: Int64Index([2], dtype='int64'), 5: Int64Index([0], dtype='int64')}
***************************************************************************************************************
group: 2
   name  index  num  cell
1     2      1    2     4
3     4      2    2     2
4     4      2    2     2
group: 3
   name  index  num  cell
2     3      3    3     3
group: 5
   name  index  num  cell
0     1      1    5     5
***************************************************************************************************************
     name  index  cell
num                   
2      10      5     8
3       3      3     3
5       1      1     5
***************************************************************************************************************
num
2    3
3    1
5    1
dtype: int64
***************************************************************************************************************
3
***************************************************************************************************************
     name  index  cell
num                   
2       2      1     4
3       3      3     3
5       1      1     5
***************************************************************************************************************
     name  index  cell
num                   
2       4      2     2
3       3      3     3
5       1      1     5
***************************************************************************************************************
   cell  index  name
2     3      3     3
***************************************************************************************************************
{(2, 2): Int64Index([3, 4], dtype=''int64''), (2, 4): Int64Index([1], dtype=''int64''), (3, 3): Int64Index([2], dtype=''int64''), (5, 5): Int64Index([0], dtype=''int64'')}
***************************************************************************************************************
   name  index  num  cell
1     2      1    2     4
***************************************************************************************************************
group: 2 2
   name  index  num  cell
3     4      2    2     2
4     4      2    2     2
group: 2 4
   name  index  num  cell
1     2      1    2     4
group: 3 3
   name  index  num  cell
2     3      3    3     3
group: 5 5
   name  index  num  cell
0     1      1    5     5

Process finished with exit code 0
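
The grouping steps above can be tried without the Excel file. The following is a minimal sketch, assuming the five rows printed in the output are the whole table; the inline DataFrame and the names `summary`, `cell_total` and `rows` are illustrative and not part of the original script.

```python
import pandas as pd

# Assumption: these five rows mirror the contents of 1.xlsx shown in the output.
df1 = pd.DataFrame({
    "name": [1, 2, 3, 4, 4],
    "index": [1, 1, 3, 2, 2],
    "num": [5, 2, 3, 2, 2],
    "cell": [5, 4, 3, 2, 2],
})

grouped = df1.groupby("num")
sums = grouped.sum()      # column totals per value of num
sizes = grouped.size()    # number of rows per value of num

# Named aggregation: choose the output column names explicitly.
summary = grouped.agg(cell_total=("cell", "sum"), rows=("cell", "count"))

# Grouping by two columns: each group key is a (num, cell) tuple.
grouped2 = df1.groupby(["num", "cell"])
pair_rows = grouped2.get_group((2, 4))

print(sums)
print(sizes)
print(summary)
print(pair_rows)
```

Passing the column name as a plain string keeps the group keys scalar, which matches the "group: 2" style of output shown above.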

Pivot tables

 

import pandas as pd

excel1 = pd.ExcelFile('1.xlsx')
df1 = pd.read_excel(excel1, 0, index_col=None, na_values=['NA'])
print(df1)
print("*"*111)
# Sum num and cell for each name.
pvt = pd.pivot_table(df1, index=["name"], values=['num', 'cell'], aggfunc="sum")
print(pvt)
print("*"*111)
print(pvt.loc['A', 'num'])   # a single cell: total num for name A
print("*"*111)
print(pvt.reset_index())     # turn the name index back into a column
print("*"*111)
print(df1)
print("*"*111)
# Spread dex across the columns; name/dex pairs with no data become NaN.
pvt2 = pd.pivot_table(df1, index=["name"], values=['num'], columns=['dex'], aggfunc="sum")
print(pvt2)
print("*"*111)
df2 = pvt2.reset_index()
print(df2)
print("*"*111)
print(df2.columns.values)    # the columns are (level 0, level 1) tuples
print("*"*111)
print(df2.shape[1])          # number of columns

Result:

C:\Users\zte\AppData\Local\Programs\Python\Python37-32\python.exe C:/Users/zte/PycharmProjects/radioparametercheck/csvfiles/txt.py
  name  dex  num  cell
0    A    1    5     5
1    A    2    2     4
2    B    3    3     3
3    B    4    2     2
4    E    5    2     2
***************************************************************************************************************
      cell  num
name           
A        9    7
B        5    5
E        2    2
***************************************************************************************************************
7
***************************************************************************************************************
  name  cell  num
0    A     9    7
1    B     5    5
2    E     2    2
***************************************************************************************************************
  name  dex  num  cell
0    A    1    5     5
1    A    2    2     4
2    B    3    3     3
3    B    4    2     2
4    E    5    2     2
***************************************************************************************************************
      num                    
dex     1    2    3    4    5
name                         
A     5.0  2.0  NaN  NaN  NaN
B     NaN  NaN  3.0  2.0  NaN
E     NaN  NaN  NaN  NaN  2.0
***************************************************************************************************************
    name  num                    
dex         1    2    3    4    5
0      A  5.0  2.0  NaN  NaN  NaN
1      B  NaN  NaN  3.0  2.0  NaN
2      E  NaN  NaN  NaN  NaN  2.0
***************************************************************************************************************
[(''name'', '''') (''num'', 1) (''num'', 2) (''num'', 3) (''num'', 4) (''num'', 5)]
***************************************************************************************************************
6

Process finished with exit code 0
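
After reset_index() the pivoted frame still has two-level column labels such as ('num', 1), which are awkward to export or select by name. The sketch below shows one possible way to flatten them; it assumes the five rows printed above are the whole table, and the `fill_value=0` argument and the `num_1` style names are choices made for this example, not something the original script does.

```python
import pandas as pd

# Assumption: these rows mirror the contents of 1.xlsx shown in the output.
df1 = pd.DataFrame({
    "name": ["A", "A", "B", "B", "E"],
    "dex": [1, 2, 3, 4, 5],
    "num": [5, 2, 3, 2, 2],
    "cell": [5, 4, 3, 2, 2],
})

# fill_value=0 replaces the NaN cells of missing name/dex combinations.
pvt2 = pd.pivot_table(df1, index=["name"], values=["num"],
                      columns=["dex"], aggfunc="sum", fill_value=0)
df2 = pvt2.reset_index()

# Join the two column levels into one string, skipping the empty level
# that reset_index() adds next to "name".
df2.columns = ["_".join(str(part) for part in col if part != "")
               for col in df2.columns]

print(df2.columns.tolist())
print(df2)
```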

 

"import numpy as np" ImportError: No module named numpy

Problem: numpy is not installed.

Solution:

Download and install the following package:

numpy-1.8.2-win32-superpack-python2.7

After installation, running import numpy produces:

Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    import numpy
  File "C:\Python27\lib\site-packages\numpy\__init__.py", line 153, in <module>
    from . import add_newdocs
  File "C:\Python27\lib\site-packages\numpy\add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "C:\Python27\lib\site-packages\numpy\lib\__init__.py", line 8, in <module>
    from .type_check import *
  File "C:\Python27\lib\site-packages\numpy\lib\type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "C:\Python27\lib\site-packages\numpy\core\__init__.py", line 6, in <module>
    from . import multiarray
ImportError: DLL load failed: %1 is not a valid Win32 application.

Cause: the installed Python is 64-bit, but the installed numpy is 32-bit.

Reinstall numpy from a 64-bit build: numpy-1.8.0-win64-py2.7
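
Whether the interpreter is a 32-bit or 64-bit build can be checked before picking an installer. A minimal sketch using only the standard library; the helper name `interpreter_bits` is made up for this example.

```python
import platform
import struct
import sys


def interpreter_bits():
    """Return 32 or 64 depending on the pointer size of the running Python."""
    # struct.calcsize("P") is the size of a C pointer in bytes.
    return struct.calcsize("P") * 8


print(sys.version)                   # full version string of the interpreter
print(platform.architecture()[0])    # "32bit" or "64bit"
print(interpreter_bits())
```

The numpy build to install should have the same bitness as the value reported here.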

3.7 Python Data Processing: Numpy Series (7) --- Numpy's Statistical Functions


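Preface

This part takes a closer look at Numpy's statistical functions.

(I) Function overview

Usage: np.*

.sum(a) sum of the elements of array a
.mean(a) arithmetic mean
.average(a) weighted average (equal to the mean when no weights are given)
.std(a) standard deviation
.var(a) variance
.ptp(a) range, i.e. maximum minus minimum
.median(a) median
.min(a) minimum
.max(a) maximum
.argmin(a) index of the minimum; by default an index into the flattened (one-dimensional) array
.argmax(a) index of the maximum; by default an index into the flattened (one-dimensional) array
.unravel_index(index, shape) converts a flat (one-dimensional) index into a multi-dimensional index for the given shape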
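
The functions listed above can be tried on a small array. A minimal sketch; the array values are arbitrary and chosen only so that the results are easy to check by hand.

```python
import numpy as np

a = np.array([[4, 9, 2],
              [7, 1, 8]])

total = np.sum(a)          # 4 + 9 + 2 + 7 + 1 + 8 = 31
mean = np.mean(a)          # 31 / 6
spread = np.ptp(a)         # max - min = 9 - 1 = 8
middle = np.median(a)      # sorted values 1 2 4 7 8 9, so (4 + 7) / 2 = 5.5
smallest = np.min(a)       # 1
largest = np.max(a)        # 9

# Without an axis argument, argmin/argmax index the flattened array
# [4, 9, 2, 7, 1, 8].
flat_min = np.argmin(a)    # 4
flat_max = np.argmax(a)    # 1

# unravel_index turns the flat index back into (row, column).
row, col = np.unravel_index(flat_min, a.shape)

print(total, mean, spread, middle, smallest, largest)
print(flat_min, flat_max, int(row), int(col))
```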

(II) Statistical functions, part 1

(1) Description

[Image: 01.jpg]

(2) Output

.sum(a)

[Image: 01.png]

.mean(a)

[Image: 02.png]

.average(a)

[Image: 03.png]

.std(a)

.var(a)

[Image: 04.png]
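
Since the screenshots for this part are not available, here is a short sketch of how sum, mean, average, std and var behave with the axis, weights and ddof arguments. The `scores` array and the weights are invented for illustration.

```python
import numpy as np

scores = np.array([[80.0, 90.0, 70.0],
                   [60.0, 100.0, 50.0]])

col_sums = np.sum(scores, axis=0)     # one sum per column
row_means = np.mean(scores, axis=1)   # one mean per row

# average() accepts weights; here each column gets its own weight.
weights = np.array([0.5, 0.3, 0.2])
weighted = np.average(scores, axis=1, weights=weights)

# std/var divide by N by default (ddof=0); ddof=1 divides by N - 1.
population_var = np.var(scores)
sample_var = np.var(scores, ddof=1)
population_std = np.std(scores)

print(col_sums)
print(row_means)
print(weighted)
print(population_var, sample_var, population_std)
```

The variance is the square of the standard deviation, so np.var(scores) and np.std(scores) ** 2 agree up to rounding.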

(III) Statistical functions, part 2

(1) Description

[Image: 02.jpg]

(2) Output

.max(a) .min(a)

.ptp(a)

.median(a)

[Image: 05.png]

.argmin(a)

.argmax(a)

.unravel_index(index,shape)

[Image: 06.png]
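
In place of the missing screenshots, the sketch below shows a typical use of these functions: finding where the largest value of a 2-D array sits, and computing per-row and per-column statistics with the axis argument. The `temps` array is made up for this example.

```python
import numpy as np

temps = np.array([[12, 18, 15, 11],
                  [14, 21, 19, 13],
                  [10, 16, 22, 9]])

# argmax without an axis returns a position in the flattened array...
flat_idx = np.argmax(temps)
# ...which unravel_index converts back to (row, column).
row, col = np.unravel_index(flat_idx, temps.shape)

# With axis=1 every row is processed separately.
peak_per_row = np.argmax(temps, axis=1)    # column of the maximum in each row
range_per_row = np.ptp(temps, axis=1)      # max - min of each row

# With axis=0 every column is processed separately.
median_per_col = np.median(temps, axis=0)

print(flat_idx, int(row), int(col))
print(peak_per_row, range_per_row, median_per_col)
```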

Author: Mark

Date: 2019/02/11 (Monday)

Anaconda Numpy error “Importing the Numpy C Extension Failed”: is there another solution?

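I hope someone here can help; I have been going in circles for a while. I am just trying to set up a Python script that loads some JSON data from a REST API into a cloud database. I set up a virtual environment in Anaconda (because the GCP libraries recommend it), installed the dependencies, and now I am only trying to import the libraries and send a request to the endpoint.

I used Conda (and conda-forge) to set up the environment and install the dependencies, so hopefully everything is clean. I am using the VS editor with the Python extension as my editor.

Whenever I try to run the script, I get the message below. I have tried every solution I could find on Google/StackOverflow, and none of them worked. I normally use IDLE or Jupyter for scripting without any problems, but I do not have much experience with Anaconda, VS, or environment variables (which seem to be involved).

Thanks in advance for your help!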

Traceback (most recent call last):
  File "C:\Conda\envs\gcp\lib\site-packages\numpy\core\__init__.py", line 22, in <module>
    from . import multiarray
  File "C:\Conda\envs\gcp\lib\site-packages\numpy\core\multiarray.py", line 12, in <module>
    from . import overrides
  File "C:\Conda\envs\gcp\lib\site-packages\numpy\core\overrides.py", line 7, in <module>
    from numpy.core._multiarray_umath import (
ImportError: DLL load failed while importing _multiarray_umath: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:\API\citi-bike.py", line 4, in <module>
    import numpy as np
  File "C:\Conda\envs\gcp\lib\site-packages\numpy\__init__.py", line 150, in <module>
    from . import core
  File "C:\Conda\envs\gcp\lib\site-packages\numpy\core\__init__.py", line 48, in <module>
    raise ImportError(msg)
ImportError:

IMPORTANT: PLEASE READ THIS FOR ADVICE ON HOW TO SOLVE THIS ISSUE!

Importing the numpy C-extensions failed. This error can happen for
many reasons, often due to issues with your setup or how NumPy was
installed.

We have compiled some common reasons and troubleshooting tips at:

https://numpy.org/devdocs/user/troubleshooting-importerror.html

Please note and check the following:

* The Python version is: python3.9 from "C:\Conda\envs\gcp\python.exe"
* The NumPy version is: "1.21.1"

and make sure that they are the versions you expect.
Please carefully study the documentation linked above for further help.

Original error was: DLL load failed while importing _multiarray_umath: The specified module could not be found.
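
The message asks to verify which interpreter and which NumPy version are actually in use. The sketch below gathers that information without importing numpy, so it also runs when the import itself fails. It is a diagnostic aid only, not a confirmed fix. The function name `describe_environment` is made up, and the check for a `Library\bin` folder on PATH rests on the assumption that a conda environment on Windows keeps its shared DLLs there and that a missing entry means the environment was not activated.

```python
import os
import platform
import sys
from importlib import metadata


def describe_environment():
    """Collect interpreter and NumPy details without importing numpy."""
    try:
        numpy_version = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        numpy_version = None  # numpy is not installed for this interpreter

    # Assumption: conda environments on Windows store DLLs in <env>\Library\bin.
    dll_dir = os.path.join(sys.prefix, "Library", "bin")

    def normalize(path):
        return os.path.normcase(os.path.normpath(path))

    path_entries = [normalize(p)
                    for p in os.environ.get("PATH", "").split(os.pathsep) if p]
    return {
        "python": sys.executable,
        "python_version": platform.python_version(),
        "numpy_version": numpy_version,
        "dll_dir": dll_dir,
        "dll_dir_on_path": normalize(dll_dir) in path_entries,
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

Running this from the same editor configuration that fails shows whether the script is started by the interpreter from the expected environment.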

Solution

No working solution to this problem has been found yet.


Version conflict between cvxpy and numpy: “module compiled against API version 0xe but this version of numpy is 0xd”

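I am trying to upgrade some packages and consolidate the requirements.txt of an existing Python program so that it can be moved into a docker container.

The container will be based on the tensorflow docker container, which dictates some of the package versions I have to use. We work under Windows and want to be able to run the program locally on our machines (at least for a while). So I need to find a configuration that works for both docker and Windows 10.

Tensorflow 2.4.1 requires numpy~=1.19.2. With numpy 1.20, pip complains that numpy 1.20 is an incompatible version.

But with numpy~=1.19.2, I get the following error when importing cvxpy, even though pip installs all packages without complaint: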

RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd
Traceback (most recent call last):
  File "test.py", line 1, in <module>
    import cvxpy
  File "c:\Projekte\algo5\venv\lib\site-packages\cvxpy\__init__.py", line 18, in <module>
    from cvxpy.atoms import *
  File "c:\Projekte\algo5\venv\lib\site-packages\cvxpy\atoms\__init__.py", line 20, in <module>
    from cvxpy.atoms.geo_mean import geo_mean
  File "c:\Projekte\algo5\venv\lib\site-packages\cvxpy\atoms\geo_mean.py", in <module>
    from cvxpy.utilities.power_tools import (fracify, decompose, approx_error, lower_bound,
  File "c:\Projekte\algo5\venv\lib\site-packages\cvxpy\utilities\power_tools.py", in <module>
    from cvxpy.atoms.affine.reshape import reshape
  File "c:\Projekte\algo5\venv\lib\site-packages\cvxpy\atoms\affine\reshape.py", in <module>
    from cvxpy.atoms.affine.hstack import hstack
  File "c:\Projekte\algo5\venv\lib\site-packages\cvxpy\atoms\affine\hstack.py", in <module>
    from cvxpy.atoms.affine.affine_atom import AffAtom
  File "c:\Projekte\algo5\venv\lib\site-packages\cvxpy\atoms\affine\affine_atom.py", line 22, in <module>
    from cvxpy.cvxcore.python import canonInterface
  File "c:\Projekte\algo5\venv\lib\site-packages\cvxpy\cvxcore\python\__init__.py", line 3, in <module>
    import _cvxcore
ImportError: numpy.core.multiarray failed to import

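Steps to reproduce:

1.) Create a new Python 3.8 venv under Windows 10 and activate it

2.) Install the following requirements.txt with pip install -r requirements.txt

cvxpy
numpy~=1.19.2 # tensorflow 2.4.1 requires this version

3.) Run the following test.py with python test.py

import cvxpy

if __name__ == '__main__':
    pass

The same thing happens if I want to use tensorflow 2.3. In that case numpy~=1.18 is required, and the error is exactly the same.

Searching for the error turned up very few hits, and sadly none of them helped.

What can I do to resolve this?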
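
One way to read the RuntimeError: the two hexadecimal numbers are NumPy C API versions, and the compiled cvxpy extension was built against a newer one (0xe) than the installed numpy provides (0xd). The sketch below only decodes the message and is not a fix. The function name `explain_api_mismatch` is made up, and the release ranges in the lookup table are an assumption based on NumPy's published C API version table; only the two values from the error are listed.

```python
import re

# Assumption: release ranges as listed in NumPy's C API version table.
C_API_TO_NUMPY = {
    0xD: "1.16 to 1.19",
    0xE: "1.20 to 1.21",
}

PATTERN = re.compile(
    r"API version (0x[0-9a-fA-F]+) but this version of numpy is (0x[0-9a-fA-F]+)"
)


def explain_api_mismatch(message):
    """Decode the two C API versions in a 'compiled against API version' error."""
    match = PATTERN.search(message)
    if match is None:
        return None
    built, installed = (int(value, 16) for value in match.groups())
    return {
        "built_against": C_API_TO_NUMPY.get(built, "unknown"),
        "installed": C_API_TO_NUMPY.get(installed, "unknown"),
        "extension_newer_than_numpy": built > installed,
    }


msg = ("RuntimeError: module compiled against API version 0xe "
       "but this version of numpy is 0xd")
info = explain_api_mismatch(msg)
print(info)
```

Read this way, the installed cvxpy build expects a numpy from the 1.20 line or later, while the tensorflow pin keeps numpy at 1.19, which is consistent with the import failing only once cvxpy's compiled module is loaded.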

Solution

No working solution to this problem has been found yet.


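That concludes today's overview of learning pandas and numpy. Thank you for reading. To learn more about "import numpy as np" ImportError: No module named numpy; 3.7 Python Data Processing: Numpy Series (7) --- Numpy's Statistical Functions; the Anaconda Numpy error “Importing the Numpy C Extension Failed”; and the version conflict between cvxpy and numpy (“module compiled against API version 0xe but this version of numpy is 0xd”), please search this site.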
