The Road to Deep Learning 2: Python Environment and Jupyter Notebook, a Summary of Using NumPy

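Introduction to NumPy

As the name suggests, NumPy can be read as "Numerical Python": it is a Python library for working with numbers. The official NumPy documentation introduces it as follows: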

NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.

At the core of the NumPy package, is the ndarray object. This encapsulates n-dimensional arrays of homogeneous data types, with many operations being performed in compiled code for performance. There are several important differences between NumPy arrays and the standard Python sequences:

  • NumPy arrays have a fixed size at creation, unlike Python lists (which can grow dynamically). Changing the size of an ndarray will create a new array and delete the original.

  • The elements in a NumPy array are all required to be of the same data type, and thus will be the same size in memory. The exception: one can have arrays of (Python, including NumPy) objects, thereby allowing for arrays of different sized elements.

  • NumPy arrays facilitate advanced mathematical and other types of operations on large numbers of data. Typically, such operations are executed more efficiently and with less code than is possible using Python’s built-in sequences.

  • A growing plethora of scientific and mathematical Python-based packages are using NumPy arrays; though these typically support Python-sequence input, they convert such input to NumPy arrays prior to processing, and they often output NumPy arrays. In other words, in order to efficiently use much (perhaps even most) of today’s scientific/mathematical Python-based software, just knowing how to use Python’s built-in sequence types is insufficient - one also needs to know how to use NumPy arrays.

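As the passage shows, NumPy is mainly about creating and computing with multidimensional arrays. It stresses that the array here is not Python's built-in list type and is far more capable. NumPy provides many of the initializers that scientific computing needs, such as multidimensional arrays of normally distributed random numbers and matrices. It also covers processing and analysis of arrays, and such operations typically run considerably faster than the equivalent plain Python statements.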
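As a rough illustration of the point about efficiency and shorter code, the sketch below doubles the same numbers once with a plain Python list comprehension and once with a NumPy array. The variable names and the data size are illustrative assumptions. Actual speed-ups depend on the machine and the operation, so no timings are claimed here.

```python
import numpy as np

# Illustrative data; the size is an arbitrary assumption.
values = list(range(1000))

# Plain Python: visit the list element by element.
doubled_list = [v * 2 for v in values]

# NumPy: one vectorized expression that runs in compiled code.
arr = np.array(values)
doubled_arr = arr * 2

# Both approaches produce the same numbers.
print(np.array_equal(doubled_arr, np.array(doubled_list)))
```

The NumPy version needs no explicit loop, which is where both the brevity and most of the speed come from.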

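Getting Started with NumPy

Installing and Importing NumPy

NumPy usually comes preinstalled only with the Anaconda distribution. If you have installed plain Python, run the following in a terminal: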

pip install --upgrade  numpy

           

Once the installation finishes, importing it in a .py file is all that is needed:

import numpy

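NumPy Attributes and Methods

A handy trick: when writing code in Jupyter Notebook, type ? after a function or method name and run the cell to bring up its help documentation. For a package object such as numpy, the ? can also be placed in front of the name.

?numpy  # show help for numpy
numpy.random?  # show help for the numpy.random module
           

The table below summarizes some of NumPy's attributes and methods: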

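| Name | Description | Example | Notes |
|---|---|---|---|
| array(list or tuple) | Create an array | t=[2,3]; numpy.array(t) | There are several ways to create arrays |
| arange(start, stop, step) | Create an array from a range | a=numpy.arange(4,7,2) | step is the spacing; start is included, stop is excluded |
| ndim | Number of dimensions of the array, returned as an int | a.ndim or numpy.ndim(a) | Mostly used with matrices and higher-dimensional arrays |
| shape | Shape of the array; for a matrix, (rows, columns) | numpy.shape(a) or a.shape | Shape of arrays with many dimensions |
| add() | Element-wise addition; numpy.add(a,b) is equivalent to a+b | numpy.add(a,b) | |
| zeros([rows, columns, …]) | Create an array filled with zeros | numpy.zeros([3,5]) | |
| ones([rows, columns, …]) | Create an array filled with ones | numpy.ones([2,2]) | |
| dot() | Matrix multiplication, unlike the element-wise a*b | numpy.dot(a,b) | Matrix product |
| dtype | Data type of the array elements | numpy.array(t,dtype="int32") | Sets the data type |
| itemsize | Size in bytes of one array element | a.itemsize | |
| fromfunction(function, shape, dtype) | Create an array by calling a function on each coordinate | numpy.fromfunction(getx,(2,3),dtype="int64") | |
| round(array, decimals) | Rounding, like Python's round | numpy.round(t,3) | decimals is the number of decimal places to keep |
| allclose(array1, array2) | Check whether two arrays are element-wise equal within a tolerance | See "Checking equality" below | Comparing matrices |
| swapaxes() | Swap two axes | a.swapaxes(0,1) | a must have at least two dimensions |
| bincount() | Count how often each non-negative integer value occurs | np.bincount(np.array([1,2,3,4,5,6,4,5,6,4])) | See the appendix code; the output length is the array maximum plus 1 |
| meshgrid() | Build coordinate grids from coordinate vectors | x,y=np.meshgrid(np.arange(3,6,1),np.arange(3,6,1)) | See the appendix code |
| where(condition, a, b) | Take elements from a where condition is true, otherwise from b | np.where([True,False],a,b) | |
| sum() | Sum | np.sum(a) | |
| mean() | Mean | c.mean() | |
| std() | Standard deviation | c.std() | |
| in1d(a, b) | Test whether each element of a is contained in b | np.in1d(a,b) | |
| unique(array) | Extract the unique elements | np.unique([1,1,234,2,2]) | |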
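Several of the entries above can be tried together in one short session. This is a minimal sketch: the small matrix m is a made-up example, not data from the original post.

```python
import numpy as np

a = np.arange(4, 7, 2)          # start=4, stop=7 (excluded), step=2
print(a)                        # [4 6]
print(a.ndim, a.shape)          # 1 (2,)

# Hypothetical 2 x 2 matrix used only for this illustration.
m = np.array([[1, 2], [3, 4]], dtype="int32")
print(m.dtype, m.itemsize)      # int32 4

zeros = np.zeros([3, 5])        # 3 rows, 5 columns, every entry 0.0
ones = np.ones([2, 2])          # 2 x 2, every entry 1.0

elementwise = m * m             # multiplies matching entries
matrix_product = np.dot(m, m)   # row-by-column matrix multiplication
print(elementwise)
print(matrix_product)

print(np.sum(m), m.mean(), m.std())    # sum, mean, standard deviation
print(np.unique([1, 1, 234, 2, 2]))    # [  1   2 234]
```

Note that zeros and ones fill the whole array with a constant; an identity matrix comes from np.eye instead.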

Matrix Operations

Matrix transpose

Use the attribute a.T, or call transpose() to transpose:

>>> x = np.arange(4).reshape((2,2))
>>> x
array([[0, 1],
     [2, 3]])
>>>
>>> np.transpose(x)
array([[0, 2],
     [1, 3]])
>>>
>>> x = np.ones((1, 2, 3))
>>> np.transpose(x, (1, 0, 2)).shape
(2, 1, 3)
           

Matrix inverse

linalg is the subpackage that gathers NumPy's linear algebra routines.

np.linalg.inv(np.dot(a.T,a))
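The call above can be run end to end as follows. This is a minimal sketch with a made-up 3 x 2 matrix: np.linalg.inv needs a square, non-singular input, and the product of a.T and a is always square even when a itself is not.

```python
import numpy as np

# Hypothetical 3 x 2 matrix, chosen only for illustration.
a = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

gram = np.dot(a.T, a)            # 2 x 2, square even though a is not
gram_inv = np.linalg.inv(gram)   # raises LinAlgError if gram is singular

# The product should be the 2 x 2 identity, up to rounding error.
print(np.allclose(np.dot(gram, gram_inv), np.eye(2)))
```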

           

Checking equality

To check that an inverse is correct, verify that the product of the matrix and its inverse equals the identity matrix:

a_t=np.linalg.inv(a)
print(np.dot(a,a_t))
np.allclose(np.dot(a,a_t),np.eye(2))

           

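Saving and Loading NumPy Data

The following are the commonly used ways of storing data in files.

Storing arrays

# store a single array
np.save("a.npy",a)
print(np.load("a.npy"))
# store several arrays
np.savez("b.npz",x=a,y=d)
np.load("b.npz")['y']
# store as a txt file
np.savetxt("x.txt",a)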
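A self-contained round trip might look like the sketch below. It writes into a temporary directory so nothing is left behind. The arrays match the a and d used in the appendix; the temporary paths are an assumption made for the example. np.loadtxt is added to read the text file back.

```python
import os
import tempfile

import numpy as np

a = np.array([[1, 2], [34, 4]])
d = np.array([[1, 4], [4, 5]])

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "a.npy")
    npz_path = os.path.join(tmp, "b.npz")
    txt_path = os.path.join(tmp, "x.txt")

    # Single array in NumPy's binary format.
    np.save(npy_path, a)
    a_loaded = np.load(npy_path)

    # Several arrays in one archive, retrieved by keyword name.
    np.savez(npz_path, x=a, y=d)
    with np.load(npz_path) as archive:
        y_loaded = archive["y"]

    # Plain text; values are written as floats by default.
    np.savetxt(txt_path, a)
    a_from_txt = np.loadtxt(txt_path)

print(np.array_equal(a, a_loaded))   # True
print(np.array_equal(d, y_loaded))   # True
print(a_from_txt.dtype)              # float64
```

The text format loses the integer dtype, so the binary formats are the safer choice when the exact type matters.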
           

#Appendix

Usage examples for the points covered above.

We can create NumPy arrays from native Python tuples or lists.

import numpy as np
tp=([1,2],[34,4])
a=np.array(tp,dtype="int32")
print(a.shape)



           
(2, 2)
           
[ 5  6  7  8  9 10 11 12 13 14 15 16 17]
           
[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
           
c=np.arange(30).reshape(3,5,2)
c
           
array([[[ 0,  1],
        [ 2,  3],
        [ 4,  5],
        [ 6,  7],
        [ 8,  9]],

       [[10, 11],
        [12, 13],
        [14, 15],
        [16, 17],
        [18, 19]],

       [[20, 21],
        [22, 23],
        [24, 25],
        [26, 27],
        [28, 29]]])
           
array([[ 1,  5,  9, 13, 17],
       [21, 25, 29, 33, 37],
       [41, 45, 49, 53, 57]])
           
d=[1,4],[4,5]
d=np.array(d)
a.dot(d)

           
array([[  9,  14],
       [ 50, 156]])
           

One-dimensional arrays

np.arange(3,25).shape
           
(22,)
           
(2, 2)
           
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])
           
array([[  9,  14],
       [ 50, 156]])
           
a*d
           
array([[  1,   8],
       [136,  20]])
           

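A point to watch: slicing a NumPy array returns a view of the original data, so assigning to the slice also changes the original array. Slicing a Python list creates a new list, so lists have no such side effect.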

num=[1,3,45,6,7,8]
n=np.array(num)
a_slice=n[2:4]
a_slice[0]=1000000
print(a_slice)
print(n)
           
[1000000       6]
[      1       3 1000000       6       7       8]
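The difference between a view and a copy can be checked directly. The sketch below reuses the same numbers; the names list_slice and safe_slice are illustrative. Calling .copy() on a slice is the usual way to avoid changing the original array.

```python
import numpy as np

num = [1, 3, 45, 6, 7, 8]

# Python list: a slice is a new list, so the original is untouched.
list_slice = num[2:4]
list_slice[0] = 1000000
print(num)                # [1, 3, 45, 6, 7, 8]

# NumPy array: .copy() makes the slice independent of the original.
n = np.array(num)
safe_slice = n[2:4].copy()
safe_slice[0] = 1000000
print(n)                  # [ 1  3 45  6  7  8]

# A plain slice shares memory with the array; a copy does not.
print(np.shares_memory(n, n[2:4]))      # True
print(np.shares_memory(n, safe_slice))  # False
```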
           
n.itemsize
           
4
           
def f(x,y):                        # create an array from a function
    return 4*x+y
nd=np.fromfunction(f,(3,4),dtype="int")
nd
           
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
           

An example of selecting by condition: given the employees' working hours over five days, find the working hours for Thursday

name=['wangjun','xiaoming','john','tom','qingqing']
names=np.array(name)
worktime=np.round(np.random.randn(5,5)+8.0,2)
worktime
           
array([[8.95, 7.16, 7.48, 8.49, 8.27],
       [8.26, 9.41, 6.98, 8.78, 8.62],
       [9.05, 8.41, 9.86, 8.74, 9.02],
       [7.86, 6.18, 7.34, 7.55, 7.12],
       [9.7 , 7.54, 6.47, 7.71, 9.21]])
           
[[9.05 8.41 9.86 8.74 9.02]]
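The selection can be sketched as follows. Two assumptions are made here that the original does not state: each row belongs to one employee in the order of names, and the five columns run from Monday to Friday, which puts Thursday at column index 3. The hours are copied from the sample output above rather than drawn at random, so the result is reproducible.

```python
import numpy as np

names = np.array(['wangjun', 'xiaoming', 'john', 'tom', 'qingqing'])

# Values copied from the sample output above (assumed: one row per employee).
worktime = np.array([[8.95, 7.16, 7.48, 8.49, 8.27],
                     [8.26, 9.41, 6.98, 8.78, 8.62],
                     [9.05, 8.41, 9.86, 8.74, 9.02],
                     [7.86, 6.18, 7.34, 7.55, 7.12],
                     [9.7, 7.54, 6.47, 7.71, 9.21]])

# Boolean mask: keeps only the rows where the name matches.
john_row = worktime[names == 'john']

# Column index 3 is Thursday under the Monday-to-Friday assumption.
thursday = worktime[:, 3]

print(john_row)
print(thursday)
```

The boolean mask names == 'john' has one entry per row, which is why it can be used to index the first axis of worktime.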
           
---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-100-1d384739e9f7> in <module>
----> 1 worktime.transpose(2)


ValueError: axes don't match array
           
a_t=np.linalg.inv(a)
print(np.dot(a,a_t))
np.allclose(np.dot(a,a_t),np.eye(2))
           
[[1. 0.]
 [0. 1.]]





True
           
#bincount()
np.bincount(np.array([1,2,3,4,200,1000,4,5,6,4]))
           
array([0, 1, 1, ..., 0, 0, 1], dtype=int64)
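With a smaller input the result of bincount is easier to read. In this sketch, which uses the array from the summary table, position i of the output holds the number of times the value i occurs, so the output has one more entry than the largest value.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 4, 5, 6, 4])
counts = np.bincount(values)

print(counts)        # [0 1 1 1 3 2 2]
print(len(counts))   # 7, which is values.max() + 1
```

That is also why the call above returns more than a thousand entries: the largest value in its input is 1000.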
           
pointx,pointy=np.meshgrid(np.arange(-10,10,0.02),np.arange(-10,10,0.020))
pointx
           
array([[-10.  ,  -9.98,  -9.96, ...,   9.94,   9.96,   9.98],
       [-10.  ,  -9.98,  -9.96, ...,   9.94,   9.96,   9.98],
       [-10.  ,  -9.98,  -9.96, ...,   9.94,   9.96,   9.98],
       ...,
       [-10.  ,  -9.98,  -9.96, ...,   9.94,   9.96,   9.98],
       [-10.  ,  -9.98,  -9.96, ...,   9.94,   9.96,   9.98],
       [-10.  ,  -9.98,  -9.96, ...,   9.94,   9.96,   9.98]])
           
pointy
           
array([[-10.  , -10.  , -10.  , ..., -10.  , -10.  , -10.  ],
       [ -9.98,  -9.98,  -9.98, ...,  -9.98,  -9.98,  -9.98],
       [ -9.96,  -9.96,  -9.96, ...,  -9.96,  -9.96,  -9.96],
       ...,
       [  9.94,   9.94,   9.94, ...,   9.94,   9.94,   9.94],
       [  9.96,   9.96,   9.96, ...,   9.96,   9.96,   9.96],
       [  9.98,   9.98,   9.98, ...,   9.98,   9.98,   9.98]])
           
#meshgrid
z=pointx**2+pointy**2
import matplotlib.pyplot as  mp
mp.imshow(z)
mp.colorbar()
mp.show()
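A tiny grid makes the structure of the meshgrid output visible. This is a sketch with made-up ranges: the first returned array repeats the x values along every row, and the second repeats each y value across its row.

```python
import numpy as np

x, y = np.meshgrid(np.arange(3, 6, 1), np.arange(0, 2, 1))

print(x)   # two rows, each equal to [3 4 5]
print(y)   # first row all 0, second row all 1

# Any element-wise formula is then evaluated at every grid point.
z = x ** 2 + y ** 2
print(z)
```

Both outputs have shape (2, 3): one row per y value and one column per x value.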
           
# where
print(a)
print(d)
np.where([False,True],a,d)
           
[[ 1  2]
 [34  4]]
[[1 4]
 [4 5]]





array([[1, 2],
       [4, 4]])
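In the call above the condition [False, True] has one entry per column and is broadcast over the rows, so column 0 comes from d and column 1 from a. The sketch below repeats that call and adds an element-wise condition of the same shape as the arrays; the name larger is illustrative.

```python
import numpy as np

a = np.array([[1, 2], [34, 4]])
d = np.array([[1, 4], [4, 5]])

# One flag per column, broadcast over both rows.
by_column = np.where([False, True], a, d)
print(by_column.tolist())   # [[1, 2], [4, 4]]

# One flag per element: keep whichever entry is larger.
larger = np.where(a > d, a, d)
print(larger.tolist())      # [[1, 4], [34, 5]]
```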
           
#in1d
np.in1d([2,3,4],a)

           
array([ True, False,  True])
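Recent NumPy documentation points to np.isin as the preferred way to write this membership test, so the sketch below uses it. Whether np.in1d is still available depends on the installed NumPy version, which is worth checking. The variable name mask is illustrative.

```python
import numpy as np

a = np.array([[1, 2], [34, 4]])

# For each candidate value, test whether it occurs anywhere in a.
mask = np.isin([2, 3, 4], a)
print(mask.tolist())   # [True, False, True]
```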
           
array([  1,   2, 234])
           
# store a single array
np.save("a.npy",a)
print(np.load("a.npy"))
# store several arrays
np.savez("b.npz",x=a,y=d)
np.load("b.npz")['y']
# store as a txt file
np.savetxt("x.txt",a)
           
[[ 1  2]
 [34  4]]
           

#References

  1. https://www.numpy.org/devdocs/user/quickstart.html