
The Art of Readable Code

These are my reading notes on The Art of Readable Code, with a few thoughts of my own added. I strongly recommend the book:

  • English edition: The Art of Readable Code
  • Chinese edition: 編寫可讀代碼的藝術

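Why code should be easy to understand

“Code should be written to minimize the time it would take for someone else to understand it.”

The facts of day-to-day work are:

  • Thinking before writing and reading existing code take far more time than the actual writing.
  • Reading code is routine, whether it is someone else's or your own; code you wrote six months ago might as well be someone else's.
  • Readable code lets you grasp the program's logic quickly and get into a productive state.
  • Code with fewer lines is not necessarily easier to understand.
  • Readability does not conflict with efficiency, good architecture, or testability.

The whole book is organized around one goal, making code more readable, which is also one of the key criteria for good code.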

Naming

Pack more information into names

Use words with a precise meaning, for example `download` rather than `get`. Some possible substitutions:

  1. send -> deliver, dispatch, announce, distribute, route
  2. find -> search, extract, locate, recover
  3. start -> launch, create, begin, open
  4. make -> create, set up, build, generate, compose, add, new

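Avoid generic words

Words such as `tmp` and `retval` say nothing beyond "this is a temporary variable" or "this is the return value". Adding a meaningful word makes the purpose clear:

    tmp_file = tempfile.NamedTemporaryFile()
    ...
    SaveData(tmp_file, ...)

Instead of `retval`, use a name that says what the variable actually holds. With the name `sum_squares`, the bug in the line below is obvious:

    sum_squares += v[i];  // Where's the "square" that we're summing? Bug!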

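In nested for loops, `i` and `j` can be just as confusing:

    for (int i = 0; i < clubs.size(); i++)
        for (int j = 0; j < clubs[i].members.size(); j++)
            for (int k = 0; k < users.size(); k++)
                if (clubs[i].members[k] == users[j])
                    cout << "user[" << j << "] is in club[" << i << "]" << endl;

Here `members` and `users` are indexed with the wrong variables, and the mistake is hard to spot. With index names that match the containers, the correct version is much easier to verify:

    if (clubs[ci].members[mi] == users[ui])  // OK. First letters match.

So use a generic name only when you have a good reason to.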

Use concrete names

`CanListenOnPort` is better than `ServerCanStart`: "can start" is vague, while "listen on port" says exactly what the method does.

Likewise, `--run_locally` is less clear than `--extra_logging`.

Attach important details, such as a unit suffix like `_ms`, or `_raw` for an unprocessed string

If something about a variable is important, adding a few extra words to its name makes the code easier to read, for example changing

    string id;  // Example: "af84ef845cd8"

to

    string hex_id;

The same applies to units:

  1. Start(int delay) --> delay → delay_secs
  2. CreateCache(int size) --> size → size_mb
  3. ThrottleDownload(float limit) --> limit → max_kbps
  4. Rotate(float angle) --> angle → degrees_cw

More examples:

  1. password -> plaintext_password
  2. comment -> unescaped_comment
  3. html -> html_utf8
  4. data -> data_urlenc

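Use longer names for variables with a larger scope

Short names are fine in a small scope. Variables used across a larger scope deserve longer names, and editor auto-completion keeps the typing cost low. For abbreviations and prefixes, prefer widely known ones (such as str). A useful test: would a new team member understand the abbreviation without asking anyone?

Use characters such as `_` and `-` deliberately, for example a `_` prefix for private variables.

    var x = new DatePicker();    // DatePicker() is a "constructor" function, so it starts with a capital
    var y = pageHeight();        // pageHeight() is an ordinary function
    var $all_images = $("img");  // $all_images is a jQuery object
    var height = 250;            // height is not

    <!-- ids and classes follow different conventions: underscores vs. dashes -->
    <div id="middle_column" class="main-content"> ...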

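Names must not be ambiguous

When choosing a name, ask yourself whether the word could be read another way. For example:

    results = Database.all_objects.filter("year <= 2011")

Does `results` now contain the objects with year <= 2011, or everything except them?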

Use `min` and `max` instead of `limit`

    CART_TOO_BIG_LIMIT = 10
    if shopping_cart.num_items() >= CART_TOO_BIG_LIMIT:
        Error("Too many items in cart.")

    MAX_ITEMS_IN_CART = 10
    if shopping_cart.num_items() > MAX_ITEMS_IN_CART:
        Error("Too many items in cart.")

Compare `CART_TOO_BIG_LIMIT` with `MAX_ITEMS_IN_CART` above: which one makes it clear whether a cart with exactly 10 items is allowed?

Use `first` and `last` for inclusive ranges

    print integer_range(start=2, stop=4)
    # Does this print [2,3] or [2,3,4] (or something else)?

    set.PrintKeys(first="Bart", last="Maggie")

`first` and `last` are unambiguous and well suited to inclusive (closed) ranges.

Use `begin` and `end` for half-open ranges such as [2, 9)

    PrintEventsInRange("OCT 16 12:00am", "OCT 17 12:00am")
    PrintEventsInRange("OCT 16 12:00am", "OCT 16 11:59:59.9999pm")

The first call is much more pleasant to write and read than the second.
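A minimal sketch of the half-open convention in Python. The function name `events_in_range`, the sample events, and the year 2023 are assumptions made for illustration; they are not from the book.

```python
from datetime import datetime


def events_in_range(events, begin, end):
    """Return the names of events whose time t satisfies begin <= t < end."""
    return [name for name, t in events if begin <= t < end]


# Hypothetical sample data.
events = [
    ("standup", datetime(2023, 10, 16, 9, 0)),
    ("late deploy", datetime(2023, 10, 16, 23, 59, 59)),
    ("midnight job", datetime(2023, 10, 17, 0, 0)),
]

# "All of October 16" is simply [Oct 16 00:00, Oct 17 00:00).
oct16 = events_in_range(events, datetime(2023, 10, 16), datetime(2023, 10, 17))
```

Because the end bound is exclusive, the caller never has to spell out "11:59:59.9999pm", and consecutive days tile without gaps or overlaps.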

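Naming booleans

    bool read_password = true;

This is a dangerous name: does it mean we need to read the password, or that the password has already been read? There is no way to tell, so a name such as `user_is_authenticated` is a better choice. In general, adding `is`, `has`, `can`, or `should` makes a boolean's meaning clearer, and positive names read better than negated ones:

    SpaceLeft() --> hasSpaceLeft()
    bool disable_ssl = false --> bool use_ssl = true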

Match the reader's expectations

    public class StatisticsCollector {
        public void addSample(double x) { ... }
        public double getMean() {
            // Iterate through all samples and return total / num_samples
        }
        ...
    }

Here `getMean` iterates over every sample to compute its result, so it is not the lightweight accessor that a `get` prefix suggests. `computeMean` would be a better name.

Clean formatting

Well-formatted code is pleasing to look at and far more comfortable to read. Compare the two examples below:

    class StatsKeeper {
    public:
    // A class for keeping track of a series of doubles
       void Add(double d);  // and methods for quick statistics about them
      private:   int count;        /* how many so far
          */ public:
            double Average();
    private:   double minimum;
    list<double>
      past_items
          ;double maximum;
    };

And the clean version:

    // A class for keeping track of a series of doubles
    // and methods for quick statistics about them.
    class StatsKeeper {
      public:
        void Add(double d);
        double Average();
      private:
        list<double> past_items;
        int count;  // how many so far
        double minimum;
        double maximum;
    };

Keep line breaks consistent and compact

The code below had to be broken across lines to fit an 80-character limit, and each argument needs an explanatory comment:

    public class PerformanceTester {
        public static final TcpConnectionSimulator wifi = new TcpConnectionSimulator(
            500, /* Kbps */
            80, /* millisecs latency */
            200, /* jitter */
            1 /* packet loss % */);

        public static final TcpConnectionSimulator t3_fiber = new TcpConnectionSimulator(
            45000, /* Kbps */
            10, /* millisecs latency */
            0, /* jitter */
            0 /* packet loss % */);

        public static final TcpConnectionSimulator cell = new TcpConnectionSimulator(
            100, /* Kbps */
            400, /* millisecs latency */
            250, /* jitter */
            5 /* packet loss % */);
    }

For consistency, first rearrange it like this:

    public class PerformanceTester {
        public static final TcpConnectionSimulator wifi =
            new TcpConnectionSimulator(
                500, /* Kbps */
                80, /* millisecs latency */
                200, /* jitter */
                1 /* packet loss % */);

        public static final TcpConnectionSimulator t3_fiber =
            new TcpConnectionSimulator(
                45000, /* Kbps */
                10, /* millisecs latency */
                0, /* jitter */
                0 /* packet loss % */);

        public static final TcpConnectionSimulator cell =
            new TcpConnectionSimulator(
                100, /* Kbps */
                400, /* millisecs latency */
                250, /* jitter */
                5 /* packet loss % */);
    }

This is more consistent, but still verbose, and it takes up a lot of vertical space. Moving the parameter comments to the top gives a more compact version:

    public class PerformanceTester {
        // TcpConnectionSimulator(throughput, latency, jitter, packet_loss)
        //                            [Kbps]   [ms]    [ms]    [percent]

        public static final TcpConnectionSimulator wifi =
            new TcpConnectionSimulator(500, 80, 200, 1);

        public static final TcpConnectionSimulator t3_fiber =
            new TcpConnectionSimulator(45000, 10, 0, 0);

        public static final TcpConnectionSimulator cell =
            new TcpConnectionSimulator(100, 400, 250, 5);
    }

Wrap repetition in a helper function

    // Turn a partial_name like "Doug Adams" into "Mr. Douglas Adams".
    // If not possible, 'error' is filled with an explanation.
    string ExpandFullName(DatabaseConnection dc, string partial_name, string* error);

    DatabaseConnection database_connection;
    string error;
    assert(ExpandFullName(database_connection, "Doug Adams", &error)
           == "Mr. Douglas Adams");
    assert(error == "");
    assert(ExpandFullName(database_connection, " Jake Brown ", &error)
           == "Mr. Jacob Brown III");
    assert(error == "");
    assert(ExpandFullName(database_connection, "No Such Guy", &error) == "");
    assert(error == "no match found");
    assert(ExpandFullName(database_connection, "John", &error) == "");
    assert(error == "more than one result");

This code looks messy and is highly repetitive. A helper function cleans it up:

    CheckFullName("Doug Adams", "Mr. Douglas Adams", "");
    CheckFullName(" Jake Brown ", "Mr. Jake Brown III", "");
    CheckFullName("No Such Guy", "", "no match found");
    CheckFullName("John", "", "more than one result");

    void CheckFullName(string partial_name,
                       string expected_full_name,
                       string expected_error) {
        // database_connection is now a class member
        string error;
        string full_name = ExpandFullName(database_connection, partial_name, &error);
        assert(error == expected_error);
        assert(full_name == expected_full_name);
    }

Column alignment

Aligning columns makes a block of code easier to scan:

    CheckFullName("Doug Adams"  , "Mr. Douglas Adams" , "");
    CheckFullName(" Jake Brown ", "Mr. Jake Brown III", "");
    CheckFullName("No Such Guy" , ""                  , "no match found");
    CheckFullName("John"        , ""                  , "more than one result");

    commands[] = {
        ...
        { "timeout",      NULL,              cmd_spec_timeout},
        { "timestamping", &opt.timestamping, cmd_boolean},
        { "tries",        &opt.ntry,         cmd_number_inf},
        { "useproxy",     &opt.use_proxy,    cmd_boolean},
        { "useragent",    NULL,              cmd_spec_useragent},
        ...
    };

Group code into blocks

    class FrontendServer {
      public:
        FrontendServer();
        void ViewProfile(HttpRequest* request);
        void OpenDatabase(string location, string user);
        void SaveProfile(HttpRequest* request);
        string ExtractQueryParam(HttpRequest* request, string param);
        void ReplyOK(HttpRequest* request, string html);
        void FindFriends(HttpRequest* request);
        void ReplyNotFound(HttpRequest* request, string error);
        void CloseDatabase(string location);
        ~FrontendServer();
    };

This is readable, but it improves when related declarations are grouped:

    class FrontendServer {
      public:
        FrontendServer();
        ~FrontendServer();

        // Handlers
        void ViewProfile(HttpRequest* request);
        void SaveProfile(HttpRequest* request);
        void FindFriends(HttpRequest* request);

        // Request/Reply Utilities
        string ExtractQueryParam(HttpRequest* request, string param);
        void ReplyOK(HttpRequest* request, string html);
        void ReplyNotFound(HttpRequest* request, string error);

        // Database Helpers
        void OpenDatabase(string location, string user);
        void CloseDatabase(string location);
    };

Here is another piece of code:

    # Import the user's email contacts, and match them to users in our system.
    # Then display a list of those users that he/she isn't already friends with.
    def suggest_new_friends(user, email_password):
        friends = user.friends()
        friend_emails = set(f.email for f in friends)
        contacts = import_contacts(user.email, email_password)
        contact_emails = set(c.email for c in contacts)
        non_friend_emails = contact_emails - friend_emails
        suggested_friends = User.objects.select(email__in=non_friend_emails)
        display['user'] = user
        display['friends'] = friends
        display['suggested_friends'] = suggested_friends
        return render("suggested_friends.html", display)

Everything runs together, which is tiring to read. Split it into blocks by function:

    def suggest_new_friends(user, email_password):
        # Get the user's friends' email addresses.
        friends = user.friends()
        friend_emails = set(f.email for f in friends)

        # Import all email addresses from this user's email account.
        contacts = import_contacts(user.email, email_password)
        contact_emails = set(c.email for c in contacts)

        # Find matching users that they aren't already friends with.
        non_friend_emails = contact_emails - friend_emails
        suggested_friends = User.objects.select(email__in=non_friend_emails)

        # Display these lists on the page.
        display['user'] = user
        display['friends'] = friends
        display['suggested_friends'] = suggested_friends
        return render("suggested_friends.html", display)

Making code pleasant to read takes attention while writing and a few good habits. This matters most on a team: a style choice such as brace placement has no right or wrong answer, but ignoring the team's convention is wrong.

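How to write comments

While writing code you have a lot in your head, but all that reaches the reader is the code itself; the extra information is lost. The purpose of comments is to give the reader more of that information.

Knowing what to comment

What not to comment

Comments like these add no value:

    // The class definition for Account
    class Account {
      public:
        // Constructor
        Account();

        // Set the profit member to a new value
        void SetProfit(double profit);

        // Return the profit from this Account
        double GetProfit();
    };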

Do not comment just for the sake of commenting. If a function gets a comment, it should say something the signature does not already say, as this one does:

    // Find a Node with the given 'name' or return NULL.
    // If depth <= 0, only 'subtree' is inspected.
    // If depth == N, only 'subtree' and N levels below are inspected.
    Node* FindNodeInSubtree(Node* subtree, string name, int depth);

Don't comment bad names; fix the names

    // Enforce limits on the Reply as stated in the Request,
    // such as the number of items returned, or total byte size, etc.
    void CleanReply(Request request, Reply reply);

Most of this comment explains what "clean" means. It is better to choose a more accurate name:

    // Make sure 'reply' meets the count/byte/etc. limits from the 'request'
    void EnforceLimitsFromRequest(Request request, Reply reply);

Record your thoughts

So much for what not to comment. What should be commented? Comments should record the conclusions you reached while thinking about how to write the code, for example:

    // Surprisingly, a binary tree was 40% faster than a hash table for this data.
    // The cost of computing a hash was more than the left/right comparisons.

    // This heuristic might miss a few words. That's OK; solving this 100% is hard.

    // This class is getting messy. Maybe we should create a 'ResourceNode' subclass to
    // help organize things.

Comments can also record the state of the work and the reasoning behind constants:

    // TODO: use a faster algorithm
    // TODO(dustin): handle other image formats besides JPEG

    NUM_THREADS = 8  # as long as it's >= 2 * num_processors, that's good enough.

    // Impose a reasonable limit - no human can read that much anyway.
    const int MAX_RSS_SUBSCRIPTIONS = 1000;

Commonly used markers:

  1. TODO : Stuff I haven't gotten around to yet
  2. FIXME : Known-broken code here
  3. HACK : Admittedly inelegant solution to a problem
  4. XXX : Danger! Major problem here

Put yourself in the reader's shoes

The parts of your code that would make a reader stop and ask a question are the parts you should comment.

    struct Recorder {
        vector<float> data;
        ...
        void Clear() {
            vector<float>().swap(data);  // Huh? Why not just data.clear()?
        }
    };

Many C++ programmers who see this will wonder why the code does not simply call `data.clear()` instead of swapping with an empty vector, so that line deserves a comment:

    // Force vector to relinquish its memory (look up "STL swap trick")
    vector<float>().swap(data);

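Point out likely pitfalls

While writing code you may use a hack, or know of some other trap the reader needs to be aware of. That is the time to write a comment:

    void SendEmail(string to, string subject, string body);

In reality this function calls an external service and has a timeout, so it needs a comment:

    // Calls an external service to deliver email. (Times out after 1 minute.)
    void SendEmail(string to, string subject, string body);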

Big-picture comments

Sometimes a whole file needs a comment that gives the reader an overall idea of what it is for:

    // This file contains helper functions that provide a more convenient interface to our
    // file system. It handles file permissions and other nitty-gritty details.

Summary comments

Even inside a function, a comment can summarize a block in the same way a file comment summarizes a file:

    # Find all the items that customers purchased for themselves.
    for customer_id in all_customers:
        for sale in all_sales[customer_id].sales:
            if sale.recipient == customer_id:
                ...

Or write one comment per step of the function:

    def GenerateUserReport():
        # Acquire a lock for this user
        ...
        # Read user's info from the database
        ...
        # Write info to a file
        ...
        # Release the lock for this user

Many people are reluctant to write comments, and writing good ones is indeed not easy. One option is to reserve a dedicated place in the file for comments and write down whatever you want to say there.

Making comments precise and compact

The previous section covered what to comment; this one covers how. Comments matter, so they should be precise. They also take up screen space, so they should be compact.

Keep comments compact

    // The int is the CategoryType.
    // The first float in the inner pair is the 'score',
    // the second is the 'weight'.
    typedef hash_map<int, pair<float, float> > ScoreMap;

That is too wordy. It can be compressed to:

    // CategoryType -> (score, weight)
    typedef hash_map<int, pair<float, float> > ScoreMap;

Avoid ambiguous pronouns

    // Insert the data into the cache, but check if it's too big first.

Here "it" is ambiguous: it could refer to the data or to the cache. Spell it out:

    // Insert the data into the cache, but check if the data is too big first.

An even better fix is to restructure the sentence so that "it" has only one possible referent:

    // If the data is small enough, insert it into the cache.

Keep sentences short and exact

    # Depending on whether we've already crawled this URL before, give it a different priority.

That sentence takes effort to parse. This version is much easier:

    # Give higher priority to URLs we've never crawled before.

Describe a function's behavior precisely

    // Return the number of lines in this file.
    int CountLines(string filename) { ... }

A function documented like this leaves the caller guessing, because "lines" can be defined in several ways:

  • "" (an empty file): 0 lines or 1?
  • "hello": 0 or 1?
  • "hello\n": 1 or 2?
  • "hello\n world": 1 or 2?
  • "hello\n\r cruel\n world\r": 2, 3, or 4?

So the comment should state exactly what is counted:

    // Count how many newline bytes ('\n') are in the file.
    int CountLines(string filename) { ... }
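To see how the precise comment settles every case in the list, here is a small Python sketch. It works on an in-memory string rather than a file, and the name `count_newlines` is an assumption chosen for illustration.

```python
def count_newlines(text: str) -> int:
    # Count how many newline characters are in the text.
    # A carriage return is deliberately not counted as a newline.
    return text.count("\n")


cases = ["", "hello", "hello\n", "hello\n world", "hello\n\r cruel\n world\r"]
results = [count_newlines(c) for c in cases]
# results is [0, 0, 1, 1, 2]
```

With the definition "number of '\n' characters", none of the five inputs is ambiguous any more.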

Use examples to illustrate corner cases

    // Rearrange 'v' so that elements < pivot come before those >= pivot;
    // Then return the largest 'i' for which v[i] < pivot (or -1 if none are < pivot)
    int Partition(vector<int>* v, int pivot);

This description is precise, but adding an example makes it even better:

    // ...
    // Example: Partition([8 5 9 8 2], 8) might result in [5 2 | 8 9 8] and return 1
    int Partition(vector<int>* v, int pivot);
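The example in the comment can be checked against a concrete implementation. The following is one possible Python sketch of the documented contract, not the book's implementation; the ordering inside each half is unspecified, which is why the comment says "might result in".

```python
def partition(v, pivot):
    """Rearrange v in place so elements < pivot come before those >= pivot.

    Return the largest index i with v[i] < pivot, or -1 if no element is < pivot.
    """
    boundary = 0  # v[:boundary] holds the elements < pivot found so far
    for i in range(len(v)):
        if v[i] < pivot:
            v[boundary], v[i] = v[i], v[boundary]
            boundary += 1
    return boundary - 1


v = [8, 5, 9, 8, 2]
index = partition(v, 8)
# v is now [5, 2, 9, 8, 8] and index is 1
```

The return value matches the comment's example, while the tail comes out as 9 8 8 rather than 8 9 8, which the contract allows.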

State the intent of your code

    void DisplayProducts(list<Product> products) {
        products.sort(CompareProductByPrice);

        // Iterate through the list in reverse order
        for (list<Product>::reverse_iterator it = products.rbegin(); it != products.rend();
             ++it)
            DisplayPrice(it->price);
        ...
    }

The comment says the list is traversed in reverse, but that only restates the code. It should describe what the code is meant to achieve:

    // Display each price, from highest to lowest
    for (list<Product>::reverse_iterator it = products.rbegin(); ... )

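Annotating function calls

A call like this leaves the reader puzzled:

    Connect(10, false);

In a language with named parameters, such as Python, naming the arguments at the call site makes it much clearer:

    def Connect(timeout, use_encryption): ...

    # Call the function using named parameters
    Connect(timeout = 10, use_encryption = False)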

Use information-dense words

    // This class contains a number of members that store the same information as in the
    // database, but are stored here for speed. When this class is read from later, those
    // members are checked first to see if they exist, and if so are returned; otherwise the
    // database is read from and that data stored in those fields for next time.

The long comment above is thorough, but a single well-chosen term conveys the same thing just as clearly:

    // This class acts as a caching layer to the database.

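Simplifying loops and logic

Keep control flow simple

The goal of this section is to make conditionals, loops, and other control flow as natural as possible, so that the reader never has to stop and think or go back and re-read.

Order of arguments in conditionals

Compare the following two ways of writing each condition:

    if (length >= 10)
    while (bytes_received < bytes_expected)

    if (10 <= length)
    while (bytes_expected > bytes_received)

Should the order follow greater-than/less-than, or is there a better rule? There is: order the operands by their role.

  • Left-hand side: usually the expression being examined, whose value changes more often.
  • Right-hand side: usually the value being compared against, which is closer to a constant.

This explains why `bytes_received < bytes_expected` is easier to understand than the reverse.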

The order of if/else blocks

You are usually free to choose the order of an `if/else`. Both of these work:

    if (a == b) {
        // Case One ...
    } else {
        // Case Two ...
    }

    if (a != b) {
        // Case Two ...
    } else {
        // Case One ...
    }

You may never have given this much thought, but sometimes one order really is better than the other:

  • Positive logic first: `if (debug)` is preferable to `if (!debug)`.
  • The simpler case first, so that the `if` and the `else` are both visible on one screen.
  • The more interesting or conspicuous case first.

For example:

    if (!url.HasQueryParameter("expand_all")) {
        response.Render(items);
        ...
    } else {
        for (int i = 0; i < items.size(); i++) {
            items[i].Expand();
        }
        ...
    }

On reading the `if`, the first thing that comes to mind is `expand_all`. It is like being told "don't think of an elephant": you cannot help thinking about it, which causes a moment of confusion. It is better to handle that case first:

    if (url.HasQueryParameter("expand_all")) {
        for (int i = 0; i < items.size(); i++) {
            items[i].Expand();
        }
        ...
    } else {
        response.Render(items);
        ...
    }

The ternary operator (?:)

    time_str += (hour >= 12) ? "pm" : "am";

Avoiding the ternary operator, you might write:

    if (hour >= 12) {
        time_str += "pm";
    } else {
        time_str += "am";
    }

The ternary operator reduces the line count, and the example above is a good use of it. But the real goal is to reduce the time needed to read the code, so it is a poor fit for a case like this:

    return exponent >= 0 ? mantissa * (1 << exponent) : mantissa / (1 << -exponent);

The if/else form is easier to follow:

    if (exponent >= 0) {
        return mantissa * (1 << exponent);
    } else {
        return mantissa / (1 << -exponent);
    }

So use the ternary operator only for simple expressions.

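Avoid do/while loops

    do {
        continue;
    } while (false);

How many times does this loop body run? It takes a moment to work out (once, because `continue` jumps to the condition check). A `do/while` can always be rewritten another way, so it is best avoided.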

Return early

    public boolean Contains(String str, String substr) {
        if (str == null || substr == null) return false;
        if (substr.equals("")) return true;
        ...
    }

Returning from a function as early as possible makes its logic clearer.

Reduce nesting

    if (user_result == SUCCESS) {
        if (permission_result != SUCCESS) {
            reply.WriteErrors("error reading permissions");
            reply.Done();
            return;
        }
        reply.WriteErrors("");
    } else {
        reply.WriteErrors(user_result);
    }
    reply.Done();

This code has only one level of nesting, yet it is still slightly confusing. Does your own code contain something similar? Approaching it from a different angle and applying the return-early principle makes it much easier to read:

    if (user_result != SUCCESS) {
        reply.WriteErrors(user_result);
        reply.Done();
        return;
    }

    if (permission_result != SUCCESS) {
        reply.WriteErrors("error reading permissions");
        reply.Done();
        return;
    }

    reply.WriteErrors("");
    reply.Done();

The same approach works for nesting inside loops:

    for (int i = 0; i < results.size(); i++) {
        if (results[i] != NULL) {
            non_null_count++;
            if (results[i]->name != "") {
                cout << "Considering candidate..." << endl;
                ...
            }
        }
    }

Inside a loop, the counterpart of returning early is `continue`:

    for (int i = 0; i < results.size(); i++) {
        if (results[i] == NULL) continue;
        non_null_count++;

        if (results[i]->name == "") continue;
        cout << "Considering candidate..." << endl;
        ...
    }

Breaking down complex expressions

The more complex an expression is, the harder it is to read, so large expressions should be broken into small pieces that are each easy to understand.

Explaining variables

The simplest way to break down a complex expression is to introduce a variable:

    if line.split(':')[0].strip() == "root":
        ...

    # With an explaining variable
    username = line.split(':')[0].strip()
    if username == "root":
        ...

Or take this example:

    if (request.user.id == document.owner_id) {
        // user can edit this document...
    }
    ...
    if (request.user.id != document.owner_id) {
        // document is read-only...
    }

    // With a summary variable
    final boolean user_owns_document = (request.user.id == document.owner_id);
    if (user_owns_document) {
        // user can edit this document...
    }
    ...
    if (!user_owns_document) {
        // document is read-only...
    }

Rewriting logic with De Morgan's laws

  • 1) not (a or b or c) <–> (not a) and (not b) and (not c)
  • 2) not (a and b and c) <–> (not a) or (not b) or (not c)

So a condition can be rewritten like this:

    if (!(file_exists && !is_protected)) Error("Sorry, could not read file.");

    // After applying De Morgan's laws
    if (!file_exists || is_protected) Error("Sorry, could not read file.");
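Both laws, and the rewritten file check, can be verified by brute force over every truth assignment. The Python sketch below is only a sanity check; the function names are hypothetical.

```python
from itertools import product


def de_morgan_holds():
    """Check both laws for every combination of three boolean values."""
    for a, b, c in product([False, True], repeat=3):
        if (not (a or b or c)) != ((not a) and (not b) and (not c)):
            return False
        if (not (a and b and c)) != ((not a) or (not b) or (not c)):
            return False
    return True


def cannot_read_original(file_exists, is_protected):
    return not (file_exists and not is_protected)


def cannot_read_rewritten(file_exists, is_protected):
    return (not file_exists) or is_protected


laws_ok = de_morgan_holds()
rewrite_ok = all(
    cannot_read_original(f, p) == cannot_read_rewritten(f, p)
    for f, p in product([False, True], repeat=2)
)
```

Both `laws_ok` and `rewrite_ok` come out true, so the two forms of the condition are interchangeable and the choice is purely about readability.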

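Don't abuse short-circuit logic

    assert((!(bucket = FindBucket(key))) || !bucket->IsOccupied());

This can be replaced by the following, which takes two lines but is much easier to understand:

    bucket = FindBucket(key);
    if (bucket != NULL) assert(!bucket->IsOccupied());

Expressions like the one below are also best used sparingly, because in some languages x is assigned the value of the first operand that evaluates as true rather than a boolean:

    x = a || b || c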
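Python is one such language: `or` returns one of its operands, not necessarily `True` or `False`. A short sketch (the name `first_truthy` is made up for illustration):

```python
def first_truthy(a, b, c):
    # 'or' evaluates left to right and returns the first truthy operand;
    # if every operand is falsy, it returns the last one.
    return a or b or c


picked_string = first_truthy(0, "", "fallback")
picked_number = first_truthy(0, 5, 7)
all_falsy = first_truthy(0, "", None)
```

Here `picked_string` is `"fallback"`, `picked_number` is `5`, and `all_falsy` is `None`. A reader who expects a boolean may be surprised, which is why the idiom deserves some care.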

Breaking down giant statements

    var update_highlight = function (message_num) {
        if ($("#vote_value" + message_num).html() === "Up") {
            $("#thumbs_up" + message_num).addClass("highlighted");
            $("#thumbs_down" + message_num).removeClass("highlighted");
        } else if ($("#vote_value" + message_num).html() === "Down") {
            $("#thumbs_up" + message_num).removeClass("highlighted");
            $("#thumbs_down" + message_num).addClass("highlighted");
        } else {
            $("#thumbs_up" + message_num).removeClass("highighted");
            $("#thumbs_down" + message_num).removeClass("highlighted");
        }
    };

There is a lot of repetition here, and it hides a typo: the fifth call says "highighted". Extracting the repeated expressions into variables simplifies the code and removes the opportunity for that mistake:

    var update_highlight = function (message_num) {
        var thumbs_up = $("#thumbs_up" + message_num);
        var thumbs_down = $("#thumbs_down" + message_num);
        var vote_value = $("#vote_value" + message_num).html();
        var hi = "highlighted";

        if (vote_value === "Up") {
            thumbs_up.addClass(hi);
            thumbs_down.removeClass(hi);
        } else if (vote_value === "Down") {
            thumbs_up.removeClass(hi);
            thumbs_down.addClass(hi);
        } else {
            thumbs_up.removeClass(hi);
            thumbs_down.removeClass(hi);
        }
    };

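Variables and readability

Eliminating variables

The previous section used variables to break down large expressions. This one looks at how to get rid of variables that are not needed.

Useless temporary variables

    now = datetime.datetime.now()
    root_message.last_view_time = now

The variable `now` can be removed, because:

  • It does not break down a complex expression.
  • It does not add clarity, since `datetime.datetime.now()` is already clear.
  • It is used only once.

So the code can simply be written as:

    root_message.last_view_time = datetime.datetime.now()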

Eliminating control-flow variables

    boolean done = false;
    while (/* condition */ && !done) {
        ...
        if (...) {
            done = true;
            continue;
        }
    }

The job of `done` can be handled more directly:

    while (/* condition */) {
        ...
        if (...) {
            break;
        }
    }

This example is easy to fix. With more complicated nesting, a single `break` may not be enough, and in that case the code can be moved into a function of its own.

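Shrink the scope of your variables

We have all heard the advice to avoid global variables. The reason is that the wider a variable's scope, the harder it is to keep track of, so keep each variable's scope as small as possible.

    class LargeClass {
        string str_;

        void Method1() {
            str_ = ...;
            Method2();
        }

        void Method2() {
            // Uses str_
        }

        // Lots of other methods that don't use str_
        ... ;
    };

The scope of `str_` is larger than it needs to be. It can be narrowed like this:

    class LargeClass {
        void Method1() {
            string str = ...;
            Method2(str);
        }

        void Method2(string str) {
            // Uses str
        }

        // Now other methods can't see str.
    };

Passing `str` as a function parameter narrows its scope and makes the code easier to read. The same reasoning applies to class design: split a large class into several smaller ones.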

Don't rely on nested scopes

    # No use of example_value up to this point.
    if request:
        for value in request.values:
            if value > 0:
                example_value = value
                break

    for logger in debug.loggers:
        logger.log("Example:", example_value)

In Python, a variable assigned inside an `if` or `for` block stays visible after the block, but only if the assignment actually ran. This example fails at run time with an "example_value is not defined" error whenever `request` is empty or contains no positive value. The fix is not difficult:

    example_value = None

    if request:
        for value in request.values:
            if value > 0:
                example_value = value
                break

    if example_value:
        for logger in debug.loggers:
            logger.log("Example:", example_value)

Following the earlier advice about eliminating intermediate variables, there is an even better way:

    def LogExample(value):
        for logger in debug.loggers:
            logger.log("Example:", value)

    if request:
        for value in request.values:
            if value > 0:
                LogExample(value)  # deal with 'value' immediately
                break

Move definitions down to where they are used

Older versions of C (before C99) required all variables to be declared at the top of a block. When there are many variables, the reader has to keep all of them in mind from the start, so it is better to defer each definition until the variable is needed. In the version below, everything is defined up front:

    def ViewFilteredReplies(original_id):
        filtered_replies = []
        root_message = Messages.objects.get(original_id)
        all_replies = Messages.objects.select(root_id=original_id)

        root_message.view_count += 1
        root_message.last_view_time = datetime.datetime.now()
        root_message.save()

        for reply in all_replies:
            if reply.spam_votes <= MAX_SPAM_VOTES:
                filtered_replies.append(reply)

        return filtered_replies

That is too many variables for the reader to track at once. Moving each definition down to just before its first use helps:

    def ViewFilteredReplies(original_id):
        root_message = Messages.objects.get(original_id)
        root_message.view_count += 1
        root_message.last_view_time = datetime.datetime.now()
        root_message.save()

        all_replies = Messages.objects.select(root_id=original_id)
        filtered_replies = []
        for reply in all_replies:
            if reply.spam_votes <= MAX_SPAM_VOTES:
                filtered_replies.append(reply)

        return filtered_replies

Prefer write-once variables

The previous sections showed how too many variables confuse the reader. A single variable that keeps being reassigned is just as disorienting. The fewer times a variable changes, the more readable the code.

An example

Suppose a page contains the inputs below, and we need to assign a value to the first empty `input`:

    <input type="text" id="input1" value="Dustin">
    <input type="text" id="input2" value="Trevor">
    <input type="text" id="input3" value="">
    <input type="text" id="input4" value="Melissa">
    ...

    var setFirstEmptyInput = function (new_value) {
        var found = false;
        var i = 1;
        var elem = document.getElementById('input' + i);
        while (elem !== null) {
            if (elem.value === '') {
                found = true;
                break;
            }
            i++;
            elem = document.getElementById('input' + i);
        }
        if (found) elem.value = new_value;
        return elem;
    };

This code works, and it uses three variables. Let's improve them one at a time. `found` is an intermediate variable that can be eliminated entirely:

    var setFirstEmptyInput = function (new_value) {
        var i = 1;
        var elem = document.getElementById('input' + i);
        while (elem !== null) {
            if (elem.value === '') {
                elem.value = new_value;
                return elem;
            }
            i++;
            elem = document.getElementById('input' + i);
        }
        return null;
    };

Next, `elem` exists only to drive the loop and is reassigned on every iteration, which makes its value hard to track. Together with `i`, it can be restructured with a `for` loop:

    var setFirstEmptyInput = function (new_value) {
        for (var i = 1; true; i++) {
            var elem = document.getElementById('input' + i);
            if (elem === null)
                return null;  // Search Failed. No empty input found.

            if (elem.value === '') {
                elem.value = new_value;
                return elem;
            }
        }
    };

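Reorganizing your code

Extracting unrelated subproblems

Engineering is about breaking a big problem into small ones and solving them one by one, which also helps keep a program robust and readable. Some guidelines for finding subproblems:

  • Look at a function or block of code and ask yourself, "What is the high-level goal of this code?"
  • For each line, ask, "Does it work directly toward that goal, or does it solve an unrelated subproblem?"
  • If enough lines are solving an unrelated subproblem, extract them into a separate function.

Here is an example:

    ajax_post({
        url: 'http://example.com/submit',
        data: data,
        on_success: function (response_data) {
            var str = "{\n";
            for (var key in response_data) {
                str += "  " + key + " = " + response_data[key] + "\n";
            }
            alert(str + "}");
            // Continue handling 'response_data' ...
        }
    });

The goal of this code is to send an `ajax` request and handle the response, so the string formatting is an unrelated subproblem that can be extracted:

    var format_pretty = function (obj) {
        var str = "{\n";
        for (var key in obj) {
            str += "  " + key + " = " + obj[key] + "\n";
        }
        return str + "}";
    };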

Unexpected benefits

There are many reasons to extract `format_pretty`. A standalone function is easy to extend with new features, to make more robust, to harden against edge cases, and so on. Here `format_pretty` can be improved into a more capable function:

    var format_pretty = function (obj, indent) {
        // Handle null, undefined, strings, and non-objects.
        if (obj === null) return "null";
        if (obj === undefined) return "undefined";
        if (typeof obj === "string") return '"' + obj + '"';
        if (typeof obj !== "object") return String(obj);
        if (indent === undefined) indent = "";

        // Handle (non-null) objects.
        var str = "{\n";
        for (var key in obj) {
            str += indent + "  " + key + " = ";
            str += format_pretty(obj[key], indent + "  ") + "\n";
        }
        return str + indent + "}";
    };

For a nested object it produces output such as:

    {
      key1 = 1
      key2 = true
      key3 = undefined
      key4 = null
      key5 = {
        key5a = {
          key5a1 = "hello world"
        }
      }
    }

Doing this regularly builds up a body of reusable code that can grow into your own library or be shared with others.

Project-specific functions

Functions unrelated to the main goal can be extracted for reuse. Logic that is specific to the business can be extracted too, to keep the code readable. For example:

    business = Business()
    business.name = request.POST["name"]

    url_path_name = business.name.lower()
    url_path_name = re.sub(r"['\.]", "", url_path_name)
    url_path_name = re.sub(r"[^a-z0-9]+", "-", url_path_name)
    url_path_name = url_path_name.strip("-")
    business.url = "/biz/" + url_path_name

    business.date_created = datetime.datetime.utcnow()
    business.save_to_database()

With the URL logic extracted, it reads much better:

    import re

    CHARS_TO_REMOVE = re.compile(r"['\.]+")
    CHARS_TO_DASH = re.compile(r"[^a-z0-9]+")

    def make_url_friendly(text):
        text = text.lower()
        text = CHARS_TO_REMOVE.sub('', text)
        text = CHARS_TO_DASH.sub('-', text)
        return text.strip("-")

    business = Business()
    business.name = request.POST["name"]
    business.url = "/biz/" + make_url_friendly(business.name)
    business.date_created = datetime.datetime.utcnow()
    business.save_to_database()

Simplifying an existing interface

Here is some code that reads a cookie:

    var max_results;
    var cookies = document.cookie.split(';');
    for (var i = 0; i < cookies.length; i++) {
        var c = cookies[i];
        c = c.replace(/^[ ]+/, '');  // remove leading spaces
        if (c.indexOf("max_results=") === 0)
            max_results = Number(c.substring(12, c.length));
    }

This code is ugly. An ideal interface would look more like this:

    set_cookie(name, value, days_to_expire);
    delete_cookie(name);

When an interface is less than ideal, you can always wrap it in functions of your own to make it easier to use.
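As an illustration of such a wrapper, the sketch below ports the cookie-reading loop to Python and hides it behind one function. The name `get_cookie` and the sample cookie string are assumptions; in a browser the string would come from `document.cookie`.

```python
def get_cookie(cookie_string, name):
    """Return the value of the cookie called `name`, or None if it is absent."""
    prefix = name + "="
    for part in cookie_string.split(";"):
        part = part.lstrip(" ")  # remove leading spaces
        if part.startswith(prefix):
            return part[len(prefix):]
    return None


# Hypothetical cookie string in "name=value; name=value" form.
cookie_string = "theme=dark; max_results=25; lang=en"
max_results = int(get_cookie(cookie_string, "max_results"))
```

The caller now reads one line that states its intent, and the splitting, trimming, and prefix matching live in a single place.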

Shape the interface to your needs

    user_info = { "username": "...", "password": "..." }
    user_str = json.dumps(user_info)
    cipher = Cipher("aes_128_cbc", key=PRIVATE_KEY, init_vector=INIT_VECTOR, op=ENCODE)
    encrypted_bytes = cipher.update(user_str)
    encrypted_bytes += cipher.final()  # flush out the current 128 bit block
    url = "http://example.com/?user_info=" + base64.urlsafe_b64encode(encrypted_bytes)
    ...

The end goal is to build a URL that carries the user info, but most of the code deals with turning a Python object into a URL-safe encrypted string. So extract that part:

    def url_safe_encrypt(obj):
        obj_str = json.dumps(obj)
        cipher = Cipher("aes_128_cbc", key=PRIVATE_KEY, init_vector=INIT_VECTOR, op=ENCODE)
        encrypted_bytes = cipher.update(obj_str)
        encrypted_bytes += cipher.final()  # flush out the current 128 bit block
        return base64.urlsafe_b64encode(encrypted_bytes)

It can then be called from anywhere:

    user_info = { "username": "...", "password": "..." }
    url = "http://example.com/?user_info=" + url_safe_encrypt(user_info)

Extracting helper functions is a good habit, but it should be done in moderation. Splitting code into too many tiny functions makes things harder to find.

One task at a time

Code should do only one task at a time.

    var place = location_info["LocalityName"];  // e.g. "Santa Monica"
    if (!place) {
        place = location_info["SubAdministrativeAreaName"];  // e.g. "Los Angeles"
    }
    if (!place) {
        place = location_info["AdministrativeAreaName"];  // e.g. "California"
    }
    if (!place) {
        place = "Middle-of-Nowhere";
    }
    if (location_info["CountryName"]) {
        place += ", " + location_info["CountryName"];  // e.g. "USA"
    } else {
        place += ", Planet Earth";
    }

    return place;

This function assembles a place name. It is full of conditionals and hard to read. Can the work be split into separate tasks?

    var town = location_info["LocalityName"];                // e.g. "Santa Monica"
    var city = location_info["SubAdministrativeAreaName"];   // e.g. "Los Angeles"
    var state = location_info["AdministrativeAreaName"];     // e.g. "CA"
    var country = location_info["CountryName"];              // e.g. "USA"

The first task is to extract each value into its own variable, so that later code does not have to remember the long keys. The second task is to work out the second half of the result. Note that this version also adds a rule the original did not have: for the USA, the state is shown instead of the country.

    // Start with the default, and keep overwriting with the most specific value.
    var second_half = "Planet Earth";
    if (country) {
        second_half = country;
    }
    if (state && country === "USA") {
        second_half = state;
    }

Next, the first half:

    var first_half = "Middle-of-Nowhere";
    if (state && country !== "USA") {
        first_half = state;
    }
    if (city) {
        first_half = city;
    }
    if (town) {
        first_half = town;
    }

And finally, put them together:

    return first_half + ", " + second_half;

Noticing that the logic keeps branching on whether the country is "USA", the code can also be written like this:

    var first_half, second_half;

    if (country === "USA") {
        first_half = town || city || "Middle-of-Nowhere";
        second_half = state || "USA";
    } else {
        first_half = town || city || state || "Middle-of-Nowhere";
        second_half = country || "Planet Earth";
    }

    return first_half + ", " + second_half;
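As a rough cross-check, the final version can be ported to Python and exercised with a few inputs. The function name `format_place` and the sample dictionaries are assumptions made for this sketch.

```python
def format_place(location_info):
    # Task 1: pull each value out of the dictionary once.
    town = location_info.get("LocalityName")
    city = location_info.get("SubAdministrativeAreaName")
    state = location_info.get("AdministrativeAreaName")
    country = location_info.get("CountryName")

    # Task 2: compute each half, branching once on the country.
    if country == "USA":
        first_half = town or city or "Middle-of-Nowhere"
        second_half = state or "USA"
    else:
        first_half = town or city or state or "Middle-of-Nowhere"
        second_half = country or "Planet Earth"

    return first_half + ", " + second_half


us_place = format_place({
    "LocalityName": "Santa Monica",
    "AdministrativeAreaName": "CA",
    "CountryName": "USA",
})
unknown_place = format_place({})
```

Here `us_place` is `"Santa Monica, CA"` and `unknown_place` is `"Middle-of-Nowhere, Planet Earth"`. Each branch can be read and tested on its own, which is the point of separating the tasks.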

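Turning thoughts into code

When you explain something complicated to another person, small details can easily confuse them. So imagine explaining your code to someone in plain language and ask whether they would understand it. A few guidelines help make code clearer:

  • Describe what the code needs to do in plain language, as you would to a colleague.
  • Pay attention to the key words and phrases in that description.
  • Write the code to match the description.

The following code checks whether a user is authorized:

    $is_admin = is_admin_request();
    if ($document) {
        if (!$is_admin && ($document['username'] != $_SESSION['username'])) {
            return not_authorized();
        }
    } else {
        if (!$is_admin) {
            return not_authorized();
        }
    }
    // continue rendering the page ...

The code is short, but its nested logic is complicated, and as discussed earlier, heavy nesting gets in the way of understanding. Described in plain language, the logic is:

    There are two ways to be authorized:
    1) you are an admin
    2) you own this document
    Otherwise, you are not authorized.

Writing the code to follow that description gives:

    if (is_admin_request()) {
        // authorized
    } elseif ($document && ($document['username'] == $_SESSION['username'])) {
        // authorized
    } else {
        return not_authorized();
    }
    // continue rendering the page ...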

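Writing less code

The most readable code is no code at all!

  • Remove features that are not needed, and do not over-engineer.
  • Rethink the requirements so that you solve the simplest version of the problem that still meets the overall goal.
  • Know the libraries you use, and read through their APIs from time to time.

Finally

The book also has chapters on testing, which I leave for you to read yourself. Once again, I recommend the book:

  • English edition: The Art of Readable Code
  • Chinese edition: 編寫可讀代碼的藝術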
