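Contents
I. Installation, compilation, and running
II. Variables, operators, and expressions
III. Data types
1. Numbers
2. Strings
3. Tuples
4. Lists
5. Dictionaries
IV. Flow control
1. if-else
2. for
3. while
4. switch
V. Functions
1. User-defined functions
2. Lambda functions
3. Python built-in functions
VI. Packages and modules
1. Modules
2. Packages
VII. Regular expressions
1. Metacharacters
2. Common functions
3. Groups
4. A small example: a web crawler
VIII. Deep copy and shallow copy
IX. Files and directories
1. Reading and writing files
2. The os module
3. Directory traversal
X. Exception handling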
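Installing Python is easy: download the installer from the official site, http://www.python.org/, and run it. Ubuntu usually ships with Python preinstalled; if it does not, run # apt-get install python. On Windows, download the msi package and install it. Python programs are executed by an interpreter, so after installation you will find two front ends to it: IDLE (Python GUI) and Python (command line). The former has a graphical interface; the latter is the same as running python from the command prompt. Once the interpreter starts it shows the >>> prompt, and every statement you type after the prompt is executed immediately, just as in Matlab.
Matlab has .m script files, and Python likewise has script files with the .py suffix. Besides being interpreted directly, a .py file can be compiled to bytecode ahead of time. A compiled file loads faster because the compilation step is skipped, although the bytecode itself does not execute any faster than the interpreted source.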
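For example, suppose I want to print helloWorld.
Method 1: type it directly into the interpreter: >>> print 'helloWorld'
Method 2: write the statement into a file, for example hello.py. There are three ways to run this file:
1) In a terminal: python hello.py
2) Compile it to a .pyc file first:
import py_compile
py_compile.compile("hello.py")
then, in a terminal: python hello.pyc
3) In a terminal:
python -O -m py_compile hello.py
python hello.pyo
Files compiled to .pyc or .pyo load faster, because the interpreter can skip the compilation step. For this reason, code that is imported repeatedly is usually kept in one of these two compiled forms, ready to be loaded.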
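The compile step can also be scripted. The following is a minimal sketch, assuming Python 3 (the listings in this article are Python 2); the scratch directory and the variable names are made up for the demo. Passing cfile pins the output location, because Python 3 would otherwise write the bytecode into a __pycache__ directory.

```python
import os
import py_compile
import tempfile

# Hypothetical scratch directory and file name, used only for this demo.
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "hello.py")
with open(source_path, "w") as handle:
    handle.write("print('helloWorld')\n")

# cfile names the output file explicitly; doraise=True turns a compile
# error into an exception instead of a message on stderr.
compiled_path = py_compile.compile(
    source_path, cfile=source_path + "c", doraise=True
)
print(compiled_path)
```

The resulting hello.pyc can then be run with python hello.pyc, as in method 2 above.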
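There is not much to say here; anyone with a background in another programming language will have no trouble. The syntax is quite close to Matlab's, and the differences are small. The details are as follows:
One thing to watch out for: 5/2 evaluates to 2 (integer division in Python 2); only 5.0/2 gives 2.5.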
- ###################################
- ### compute #######
- # raw_input() reads a line from the keyboard and returns it as a string,
- # so we convert it to int before doing arithmetic
- # Operators supported include:
- # and or not in is < <= != == | ^ & << + - / % ~ **
- print 'Please input a number:'
- number = int(raw_input())
- number += 1
- print number**2 # ** is exponentiation (written ^ in Matlab)
- print number and 1 # prints 1 if number is nonzero, otherwise 0
- print number or 1 # prints number if it is nonzero, otherwise 1
- print not number # prints True only when number is 0
- 5/2 # is 2 (integer division)
- 5.0/2 # is 2.5, should be noted
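To make the two kinds of division explicit, here is a small sketch, written for Python 3. The __future__ import is a no-op there and switches Python 2 to the same behaviour; the variable names are illustrative only.

```python
from __future__ import division  # no-op on Python 3, true division on Python 2

true_quotient = 5 / 2                # true division keeps the fraction
floor_quotient = 5 // 2              # floor division, the explicit integer form
quotient, remainder = divmod(5, 2)   # both parts of integer division at once

print(true_quotient, floor_quotient, quotient, remainder)
```

Writing // whenever an integer result is intended keeps the code correct regardless of which division rule is in force.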
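1. Numbers
The usual numeric types (int, long, float, complex, and so on) are supported, and the type of a variable is determined by the value you assign to it, as follows: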
- ### type of value #######
- # int, long, float
- # there is no need to declare the type of a variable; Python
- # infers it from the value assigned
- num = 1 # stored as int type
- num = 1111111111111 # stored as long when it does not fit in a machine int (on 64-bit builds it is still an int)
- num = 1.0 # stored as float type
- num = 12L # the L suffix stands for long type
- num = 1 + 12j # the j suffix marks the imaginary part of a complex number
- num = '1' # string type
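2. Strings
Single quotes, double quotes, and triple quotes can all be used to define strings. Triple-quoted strings preserve line breaks and other layout. As a sequence type, a string supports index access and slice access, much as in Matlab.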
- ### type of string #######
- num = "1" # string type
- num = "Let's go" # string type
- num = "He's \"old\"" # string type, with escaped double quotes
- mail = "Xiaoyi: \n hello \n I am you!"
- mail = """Xiaoyi:
- hello
- I am you!
- """ # triple quotes keep the line breaks
- string = 'xiaoyi' # get value by index
- copy = string[0] + string[1] + string[2:6] # note: [2:6] means indices 2 to 5, i.e. [2, 6)
- copy = string[:4] # from index 0 up to (not including) 4: 'xiao'
- copy = string[2:] # from index 2 to the end: 'aoyi'
- copy = string[::1] # step is 1, from start to end
- copy = string[::2] # step is 2: 'xay'
- copy = string[-1] # means 'i', the last one
- copy = string[-2:-5:-1] # means 'yoa'; a step of -1 walks backwards, so start must lie to the right of stop
- memAddr = id(num) # id(num) returns the identity of num (its memory address in CPython)
- type(num) # get the type of num
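The slicing rules are easy to get wrong with negative steps, so a short check may help. This is an illustrative sketch that runs on Python 3; the variable names are not from the original.

```python
string = "xiaoyi"

first_four = string[:4]            # indices 0..3
every_other = string[::2]          # indices 0, 2, 4
reversed_text = string[::-1]       # whole string, right to left
backwards_part = string[-2:-5:-1]  # indices 4, 3, 2
# With a negative step the start must be to the right of the stop;
# otherwise the slice is simply empty.
empty = string[-4:-2:-1]

print(first_four, every_other, reversed_text, backwards_part, repr(empty))
```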
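3. Tuples
A tuple is defined with (). It behaves like an array that can hold elements of different types. Elements can be accessed by index, but note that they cannot be modified.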
- ### sequence type #######
- ## elements can be accessed by index or slice
- ## includes: string, tuple (roughly an array, struct, or cell in other languages), list
- # basic operations on sequence types
- firstName = 'Zou'
- lastName = 'Xiaoyi'
- len(string) # the length
- name = firstName + lastName # concatenate two strings
- firstName * 3 # repeat firstName 3 times
- 'Z' in firstName # membership test, returns True
- string = '123'
- max(string) # '3'
- min(string) # '1'
- cmp(firstName, lastName) # returns 1, -1 or 0
- ## tuple
- ## define this type using ()
- user = ("xiaoyi", 25, "male")
- name = user[0]
- age = user[1]
- gender = user[2]
- t1 = () # empty tuple
- t2 = (2, ) # a tuple with a single element needs a trailing comma
- user[1] = 26 # error!! tuple elements cannot be changed (TypeError)
- name, age, gender = user # unpack the three elements into separate names
- a, b, c = (1, 2, 3)
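A short sketch of what immutability means in practice, written for Python 3. The names outcome and older_user are illustrative: assigning to an element raises TypeError, and the usual workaround is to build a new tuple.

```python
user = ("xiaoyi", 25, "male")

try:
    user[1] = 26              # tuples do not support item assignment
    outcome = "modified"
except TypeError:
    outcome = "TypeError"

# "Changing" a tuple means building a new one from slices of the old one.
older_user = user[:1] + (26,) + user[2:]
name, age, gender = older_user    # unpacking works as before

print(outcome, older_user)
```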
4. Lists
A list is defined with []. It works like a tuple, except that its elements can be modified. list is a class, and it provides many methods for operating on a list.
- ## list type (the elements can be modified)
- ## define this type using []
- userList = ["xiaoyi", 25, "male"]
- name = userList[0]
- age = userList[1]
- gender = userList[2]
- userList[3] = 88888 # error! index out of range (IndexError); unlike Matlab, assignment does not grow the list
- userList.append(8888) # add a new element at the end
- "male" in userList # search
- userList[2] = 'female' # modify an element in place (the list's memory address does not change)
- userList.remove(8888) # remove the first element equal to 8888
- userList.remove(userList[2]) # remove element
- del userList[1] # del is a statement, not a function
- ## help(list.append)
- ################################
- ######## object and class ######
- ## object = property + method
- ## Python treats everything as an object; the list type is a class,
- ## so defining the list "userList" creates an object of that class, and we use
- ## its methods to operate on the elements
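5. Dictionaries
A dictionary is defined with {}. Its strength is that it stores key-value pairs, much like a struct. The dict class also provides methods for creating and manipulating dictionaries.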

- ######## dictionary type ######
- ## define this type using {}
- item = ['name', 'age', 'gender']
- value = ['xiaoyi', '25', 'male']
- zip(item, value) # zip() will produce a new list:
- # [('name', 'xiaoyi'), ('age', '25'), ('gender', 'male')]
- # but a list of pairs does not let us look a value up by its name;
- # a dictionary expresses that relationship directly,
- # as a set of key-value pairs
- # dic = {key1: value1, key2: value2, ...}; values can be any type, keys any hashable type
- dic = {'name': 'xiaoyi', 'age': 25, 'gender': 'male'}
- dic = {1: 'zou', 'age':25, 'gender': 'male'}
- # and we access it like this: dic[key1], using the key as an index
- print dic['age']
- print dic[1]
- # other ways to create a dictionary
- fdict = dict([['x', 1], ['y', 2]]) # factory function: takes one sequence of key-value pairs
- ddict = {}.fromkeys(('x', 'y'), -1) # built-in method: every key gets the same value (None if omitted)
- # access with a for loop
- for key in dic:
-     print key
-     print dic[key]
- # add a key-value pair; a dictionary is unordered,
- # so we can simply assign to a new key like this:
- dic['tel'] = 88888
- # look up elements
- dic.get(1) # get the value of key 1 (None if the key is missing)
- dic.get(1, 'error') # return a user-defined default if the dictionary does not contain the key
- dic.keys()
- dic.values()
- dic.has_key(1) # True if the key is present
- # update or delete the elements
- del dic[1] # delete this key
- dic.pop('tel') # return the value and delete this key
- dic.clear() # clear the dictionary
- del dic # delete the dictionary itself
- # dictionaries have many more operations; use help(dict) to see them
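The pieces above combine naturally: zip() pairs up the two lists and dict() turns the pairs into a lookup table. The sketch below is written for Python 3, so it uses the in operator where the Python 2 listing uses has_key(); the key "email" is a made-up example of a missing key.

```python
item = ["name", "age", "gender"]
value = ["xiaoyi", 25, "male"]

dic = dict(zip(item, value))                # pair each key with its value
defaults = dict.fromkeys(("x", "y"), -1)    # every key starts with the same value

dic["tel"] = 88888                          # add a new key-value pair
removed = dic.pop("tel")                    # return the value and delete the key
fallback = dic.get("email", "error")        # default for a missing key
has_name = "name" in dic                    # membership test on keys

for key in sorted(dic):                     # sorted() gives a stable order
    print(key, dic[key])
```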
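Here Python differs sharply from most other languages: it uses indented blocks to express program logic, where most other languages use braces or similar. For example:
if age < 21:
    print("You cannot buy alcohol.")
    print("But you can buy chewing gum.")
print("This line is outside the if block.")
This code is equivalent to the following C:
if (age < 21)
{
    printf("You cannot buy alcohol.");
    printf("But you can buy chewing gum.");
}
printf("This line is outside the if block.");
As you can see, Python uses indentation to mark where a block begins and ends (the off-side rule) instead of braces or keywords. Increasing the indentation starts a block (note the : at the end of the preceding line), and decreasing it ends the block. PEP 8 recommends four spaces per indentation level. The interpreter itself does not enforce this: any number of spaces works, as long as every line in the same block uses the same indentation. Tab characters or other widths will run, but they do not follow the coding convention.
To keep our own programs compatible with other people's, it is best to follow the convention and indent with four spaces (note: use either spaces everywhere or tabs everywhere, and never mix the two).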
1. if-else
if-else tests conditions so that code runs only when a given condition holds.
- ######## procedure control #####
- ## if else
- if expression: # evaluated as a bool; do not forget the colon
-     statement(s) # indent with four spaces
- if expression:
- statement(s) # error!!!! the body must be indented
- if 1<2:
-     print 'ok, ' # indent with four spaces
-     print 'yeah' # use the same number of spaces
- if True: # True is capitalized
-     print 'true'
- def fun():
-     return 1
- if fun():
-     print 'ok'
- else:
-     print 'no'
- con = int(raw_input('please input a number:'))
- if con < 2:
-     print 'small'
- elif con > 3:
-     print 'big'
- else:
-     print 'middle'
- if 1 < 2:
-     if 2 < 3:
-         print 'yeah'
-     else:
-         print 'no'
-     print 'out'
- else:
-     print 'bad'
- if 1 < 2 and 2 < 3 or 2 < 4 and not 0: # conditions combined with and, or, not
-     print 'yeah'
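Wrapping the branch in a function makes it easy to try each case without typing input. This is a sketch for Python 3; classify is a hypothetical helper name, and the last two lines illustrate that and binds more tightly than or.

```python
def classify(con):
    """Return a label for con using an if / elif / else chain."""
    if con < 2:
        return "small"
    elif con > 3:
        return "big"
    else:
        return "middle"


labels = [classify(n) for n in (1, 2, 3, 4)]

# `and` is evaluated before `or`, so the first expression reads as
# 1 < 2 or (2 < 3 and 0); parentheses change the grouping.
grouped_by_default = 1 < 2 or 2 < 3 and 0
forced_grouping = (1 < 2 or 2 < 3) and 0

print(labels, grouped_by_default, forced_grouping)
```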
2. for
for runs a block of code repeatedly. It can also be used to traverse variables of the sequence types mentioned above.
- ## for
- for iterating_val in sequence:
-     statements(s)
- # sequence type can be string, tuple or list
- for i in "abcd":
-     print i
- for i in [1, 2, 3, 4]:
-     print i
- # range(start, end, step): step defaults to 1 and start to 0;
- # note that the interval is [start, end), not [start, end]
- range(5) # [0, 1, 2, 3, 4]
- range(1, 5) # [1, 2, 3, 4]
- range(1, 10, 2) # [1, 3, 5, 7, 9]
- for i in range(1, 100, 1):
-     print i
- # traverse a basic sequence
- fruits = ['apple', 'banana', 'mango']
- for fruit in range(len(fruits)):
-     print 'current fruit: ', fruits[fruit]
- # traverse a dictionary
- dic = {1: 111, 2: 222, 5: 555}
- for x in dic:
-     print x, ': ', dic[x]
- dic.items() # returns a list of pairs such as [(1, 111), (2, 222), (5, 555)]
- for key,value in dic.items(): # unpacking works here because we can write: a,b=[1,2]
-     print key, ': ', value
- print 'ending'
- import time
- # break and continue can also be used to control the loop
- for x in range(1, 11):
-     print x
-     time.sleep(1) # sleep 1s
-     if x == 3:
-         pass # do nothing
-     if x == 2:
-         continue # skip the rest of this iteration
-     if x == 6:
-         break # leave the loop
-     if x == 7:
-         exit() # exit the whole program
-     print '#'*50
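The effect of continue and break is easier to see when the loop records what it visits. A minimal sketch for Python 3 follows; visited and labelled are illustrative names, and enumerate() is shown as a common alternative to range(len(...)).

```python
visited = []
for x in range(1, 11):
    if x == 2:
        continue          # skip the rest of this iteration
    if x == 6:
        break             # leave the loop entirely
    visited.append(x)

fruits = ["apple", "banana", "mango"]
# enumerate() yields (index, element) pairs, so no manual indexing is needed.
labelled = ["%d: %s" % (index, fruit) for index, fruit in enumerate(fruits)]

print(visited)
print(labelled)
```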
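3. while
while is also used for loops. It first checks the loop condition that follows it; if the expression is true, it executes the block after the colon and then tests the condition again, repeating until the condition is false. The indented block after the colon is the loop body.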
- ## while
- while expression:
-     statement(s)
- while True:
-     print 'hello'
-     x = raw_input('please input something, q for quit:')
-     if x == 'q':
-         break
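The same quit-on-q pattern can be exercised without a keyboard by feeding the loop a prepared list. This is a sketch under that assumption, written for Python 3; read_until_quit is a hypothetical helper, not part of the original listing.

```python
def read_until_quit(inputs):
    """Consume items until 'q' is seen; return the items read before it."""
    pending = list(inputs)
    seen = []
    while pending:                # loop while there is input left
        x = pending.pop(0)
        if x == "q":
            break                 # stop at the quit marker
        seen.append(x)
    return seen


print(read_until_quit(["a", "b", "q", "c"]))
```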
4. switch
Python does not actually provide a switch statement, but one is easy to build from a dictionary and functions. For example:
- #############################
- ## switch ####
- ## Python does not support this structure,
- ## but we can implement it with a dictionary and functions
- ## cal.py ##
- #!/usr/local/python
- from __future__ import division
- # with this import, 5/2=2.5 and 6/2=3.0
- def add(x, y):
-     return x + y
- def sub(x, y):
-     return x - y
- def mul(x, y):
-     return x * y
- def div(x, y):
-     return x / y
- operator = {"+": add, "-": sub, "*": mul, "/": div}
- operator["+"](1, 2) # the same as add(1, 2)
- operator["%"](1, 2) # error (KeyError): there is no key "%"
- operator.get("+")(1, 2) # the same as add(1, 2); get() returns None for a missing key instead of raising KeyError
- def cal(x, o, y):
-     print operator.get(o)(x, y)
- cal(2, "+", 3)
- # this approach is more compact than a long if-else chain
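One gap in the dictionary approach is the unknown key: get() returns None, and calling None fails with a confusing TypeError. The sketch below, for Python 3, adds an explicit check. It uses the standard operator module in place of the hand-written functions; the name OPERATIONS and the choice of ValueError are assumptions made for the example.

```python
import operator as op

# Map each symbol to the function that implements it.
OPERATIONS = {"+": op.add, "-": op.sub, "*": op.mul, "/": op.truediv}


def cal(x, symbol, y):
    """Dispatch on symbol, playing the role of a switch statement."""
    handler = OPERATIONS.get(symbol)
    if handler is None:
        # This branch plays the role of the "default" case.
        raise ValueError("unsupported operator: %r" % symbol)
    return handler(x, y)


print(cal(2, "+", 3))
print(cal(5, "/", 2))
```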
1. User-defined functions
In Python, functions are created with the def statement:
- ######## function #####
- def functionName(parameters): # a function may take no parameters
-     bodyOfFunction
- def add(a, b):
-     return a+b # a function without a return statement returns None by default
- a = 100
- b = 200
- sum = add(a, b)
- ##### function.py #####
- #!/usr/bin/python
- #coding:utf8 # allows Chinese characters in the source
- def add(a = 1, b = 2): # default parameters
-     return a+b # can return any type of data
- # the following calls are all ok
- add()
- add(2)
- add(b = 1) # keyword arguments must use the parameter names a or b
- add(3, 4)
- ###### global and local variables #####
- ## global variable: defined outside any function; it can be read
- ## anywhere, even inside functions, which should be noted
- ## local variable: defined inside a function; it can only be used
- ## within that function
- ## a local variable hides a global one with the same name
- val = 100 # global variable
- def fun():
-     print val # this accesses the global val = 100
- print val # this accesses val = 100, too
- def fun():
-     a = 100 # local variable
-     print a
- print a # error: the local a = 100 is not accessible here
- def fun():
-     global a # declare a as a global variable
-     a = 100
- print a # a = 100 is not accessible yet, because fun() has not been called
- fun()
- print a # now a = 100 is accessible
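The difference between shadowing a global and rebinding it can be checked directly. This is an illustrative sketch for Python 3; counter, shadow, and bump are made-up names.

```python
counter = 0                  # global variable


def shadow():
    counter = 99             # creates a local that hides the global
    return counter


def bump():
    global counter           # rebind the module-level name instead
    counter += 1
    return counter


local_result = shadow()
after_shadow = counter       # the global is untouched by shadow()
bump()
after_bump = counter         # the global was changed by bump()

print(local_result, after_shadow, after_bump)
```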
- ############################
- ## other types of parameters
- def fun(x):
-     print x
- # the following calls are all ok
- fun(10) # int
- fun('hello') # string
- fun(('x', 2, 3)) # tuple
- fun([1, 2, 3]) # list
- fun({1: 1, 2: 2}) # dictionary
- ## tuple
- def fun(x, y):
-     print "%s : %s" % (x,y) # %s stands for string
- fun('Zou', 'xiaoyi')
- tu = ('Zou', 'xiaoyi')
- fun(*tu) # a tuple can be unpacked into positional arguments like this
- ## dictionary
- def fun(name = "name", age = 0):
-     print "name: %s" % name
-     print "age: %s" % age
- dic = {'name': "xiaoyi", 'age': 25} # the dictionary keys must match the parameter names of fun()
- fun(**dic) # a dictionary can be unpacked into keyword arguments like this
- fun(age = 25, name = 'xiaoyi') # the result is the same
- ## the advantage of a dictionary is that each value is tied to a parameter name
- ## extra parameters ####
- ## the tuple
- def fun(x, *args): # extra positional arguments are stored in args as a tuple
-     print x
-     print args
- # the following calls are ok
- fun(10)
- fun(10, 12, 24) # x = 10, args = (12, 24)
- ## the dictionary
- def fun(x, **args): # extra keyword arguments are stored in args as a dictionary
-     print x
-     print args
- fun(x = 10, y = 12, z = 15) # x = 10, args = {'y': 12, 'z': 15}
- # mix of tuple and dictionary
- def fun(x, *args, **kwargs):
-     print x
-     print args
-     print kwargs
- fun(1, 2, 3, 4, y = 10, z = 12) # x = 1, args = (2, 3, 4), kwargs = {'y': 10, 'z': 12}
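Returning the collected arguments instead of printing them makes the packing and unpacking rules checkable. The following is a sketch for Python 3; describe and greet are hypothetical names introduced for the example.

```python
def describe(x, *args, **kwargs):
    """Return the arguments exactly as Python grouped them."""
    return x, args, kwargs


def greet(name="name", age=0):
    return "%s is %s" % (name, age)


packed = describe(1, 2, 3, 4, y=10, z=12)

tu = ("Zou", 30)
dic = {"name": "xiaoyi", "age": 25}
from_tuple = greet(*tu)      # tuple unpacked into positional arguments
from_dict = greet(**dic)     # dictionary unpacked into keyword arguments

print(packed)
print(from_tuple, "|", from_dict)
```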
2. Lambda functions
A lambda defines a single-line function. Its convenience is shown below:
- ## lambda function ####
- ## define a quick single-line function
- fun = lambda x,y : x*y # fun is a function object
- fun(2, 3)
- # equivalent to
- def fun(x, y):
-     return x*y
- ## recursion
- # 5! = 5*4*3*2*1, i.e. n!
- def recursion(n):
-     if n > 0:
-         return n * recursion(n-1)
-     else:
-         return 1 # base case; without it the function returns None and the multiplication fails
- from operator import mul # mul(x, y) returns x * y
- numList = range(1, 6) # [1, 2, 3, 4, 5]
- reduce(mul, numList) # 5! = 120
- reduce(lambda x,y : x*y, numList) # 5! = 120; the lambda saves us from defining a separate function
- ### list comprehension
- numList = [1, 2, 6, 7]
- filter(lambda x : x % 2 == 0, numList) # [2, 6]
- print [x for x in numList if x % 2 == 0] # the same as above
- map(lambda x : x * 2 + 10, numList) # [12, 14, 22, 24]
- print [x * 2 + 10 for x in numList] # the same as above
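For readers on Python 3, the same ideas need two small changes: reduce lives in functools, and filter and map return lazy iterators that must be wrapped in list() to see the values. A minimal sketch, with factorial as an illustrative name:

```python
from functools import reduce
from operator import mul


def factorial(n):
    """n! as a reduction over 1..n; the initial value 1 covers n == 0."""
    return reduce(mul, range(1, n + 1), 1)


num_list = [1, 2, 6, 7]
evens = list(filter(lambda x: x % 2 == 0, num_list))
evens_comprehension = [x for x in num_list if x % 2 == 0]
scaled = list(map(lambda x: x * 2 + 10, num_list))
scaled_comprehension = [x * 2 + 10 for x in num_list]

print(factorial(5), evens, scaled)
```

The comprehension forms are usually preferred when the lambda would only wrap a simple expression.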
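3. Python built-in functions
Python comes with many built-in functions. The built-ins are part of the interpreter itself and need no import, while the standard library modules are .py files that can be found in Python's installation directory. Knowing which functions exist is very useful for programming efficiently and saves us from duplicating work. Only some commonly used ones are listed below: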
- ## built-in function of python ####
- ## if you do not know how to use one, call help() on it
- abs, max, min, len, divmod, pow, round, callable,
- isinstance, cmp, range, xrange, type, id, int()
- list(), tuple(), hex(), oct(), chr(), ord(), long()
- callable # tests whether an object can be called; returns True if it can
- # handy for checking that a name refers to a function before calling it
- isinstance # test type
- numList = [1, 2]
- if type(numList) == type([]):
-     print "It is a list"
- if isinstance(numList, list): # the same as above, returns True
-     print "It is a list"
- for i in range(1, 10001): # builds a list of 10000 elements, which costs memory
-     pass
- for i in xrange(1, 10001): # yields the values one at a time without building the list
-     pass
- ## some basic string methods
- str = 'hello world' # note: this name shadows the built-in str type
- str.capitalize() # 'Hello world', only the first letter is capitalized
- str.replace("hello", "good") # 'good world'
- ip = "192.168.1.123"
- ip.split('.') # return ['192', '168', '1', '123']
- help(str.split)
- import string
- string.replace(str, "hello", "good") # 'good world'
- ## some basic functions about sequence
- len, max, min
- # filter(function or None, sequence)
- numList = [1, 2, 6, 7]
- def fun(x):
-     if x > 5:
-         return True
- filter(fun, numList) # get [6, 7]; elements for which fun returns True are kept, the others are dropped
- # zip()
- name = ["me", "you"]
- age = [25, 26]
- tel = ["123", "234"]
- zip(name, age, tel) # return a list: [('me', 25, '123'), ('you', 26, '234')]
- # map()
- map(None, name, age, tel) # also return a list: [('me', 25, '123'), ('you', 26, '234')]
- test = ["hello1", "hello2", "hello3"]
- zip(name, age, tel, test) # stops at the shortest input: [('me', 25, '123', 'hello1'), ('you', 26, '234', 'hello2')]
- map(None, name, age, tel, test) # pads the shorter inputs with None: [('me', 25, '123', 'hello1'), ('you', 26, '234', 'hello2'), (None, None, None, 'hello3')]
- a = [1, 3, 5]
- b = [2, 4, 6]
- from operator import mul # mul(x, y) returns x * y
- map(mul, a, b) # return [2, 12, 30]
- # reduce()
- reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) # return ((((1+2)+3)+4)+5), i.e. 15
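In Python 3, map(None, ...) is no longer accepted, and the padding behaviour shown above is provided by itertools.zip_longest. The sketch below contrasts it with zip(), which truncates; truncated and padded are illustrative names.

```python
from itertools import zip_longest

name = ["me", "you"]
age = [25, 26]
tel = ["123", "234"]
test = ["hello1", "hello2", "hello3"]

# zip() stops as soon as the shortest input is exhausted.
truncated = list(zip(name, age, tel, test))

# zip_longest() keeps going and fills the gaps with None by default.
padded = list(zip_longest(name, age, tel, test))

print(truncated)
print(padded)
```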
Regular expressions: