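1. Overview
- Scenario
  In data development, code becomes hard to maintain because programmers follow different styles, some code is poorly written, and comments are too sparse.
  Example: colleague A goes on parental leave, A's code breaks, and colleague B has to fix it, but the code is so poor that B cannot safely change it.
- Code review: checking code quality by reading the code
  Purpose: reduce the cost of maintaining code
- An automated code review script (implemented in Python 3) can make code review more efficient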
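2. Code Standards
Each rule below is labelled with one of three levels, [High-risk], [Mandatory], or [Recommended], in descending order of priority.
2.1. General Code Standards
- Comments
  [Recommended] At least 1 line of comments in Chinese
  [Recommended] Do not over-comment; the code itself should serve as the comment
- Naming conventions
  [Recommended] Use pure English
  [Recommended] Look words up in a dictionary and use the full English word; an abbreviation may be used when the name is too long or conflicts with a keyword
  [Recommended] Variables, parameters, and classes use [adjective+]noun
  [Recommended] Methods and functions use verb[+noun]
  [Mandatory] Names must not start with a digit
  [Mandatory] Do not mix pinyin and English
  [Recommended] Do not use reserved words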
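Some of the naming rules above can be checked mechanically. The following is a minimal sketch under stated assumptions: the helper name `check_name` and its messages are hypothetical and are not part of the review script in section 3, and "reserved word" is taken to mean a Python keyword.

```python
import keyword
import re


def check_name(name):
    """Return the list of naming rules that an identifier violates (hypothetical helper)."""
    problems = []
    if not name.isascii():
        problems.append('not pure English')
    if re.match(r'\d', name):
        problems.append('starts with a digit')
    if keyword.iskeyword(name):
        problems.append('reserved word')
    return problems


print(check_name('user_cnt'))
print(check_name('2nd_user'))
print(check_name('class'))
```

Rules such as "do not mix pinyin and English" need a word list and are left out of this sketch.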
Time naming convention | [Mandatory] Format | Example | [Mandatory] Variable name |
Year | yyyy | 2022 | |
Year and month | yyyy-mm | 2021-01 | |
Year, month, and day | yyyy-mm-dd | 2020-02-02 | ymd |
Quarter | yyyyQq | 2021Q4, 2022Q1 | |
Hour, minute, and second | hh:mm:ss | 08:00:00 | |
Year, month, day, hour, minute, and second | yyyy-mm-dd hh:mm:ss | 2022-02-01 08:00:00 | |
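As an illustration of the formats in the table above, the sketch below renders one `datetime` in each of them. Only `ymd` is a variable name taken from the table; the function name `format_times` and the other dictionary keys are assumptions made for this example.

```python
from datetime import datetime


def format_times(dt):
    """Render one datetime in each format from the time naming table."""
    quarter = (dt.month - 1) // 3 + 1  # months 1-3 -> Q1, 4-6 -> Q2, and so on
    return {
        'year': dt.strftime('%Y'),
        'year_month': dt.strftime('%Y-%m'),
        'ymd': dt.strftime('%Y-%m-%d'),
        'quarter': f'{dt.year}Q{quarter}',
        'time': dt.strftime('%H:%M:%S'),
        'datetime': dt.strftime('%Y-%m-%d %H:%M:%S'),
    }


for key, value in format_times(datetime(2022, 2, 1, 8, 0, 0)).items():
    print(key, value)
```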
Common abbreviation | Full name | Meaning | Common abbreviation | Full name | Meaning |
fn | function | function | ymd | Year Month Day | date |
txt | textfile | text file | cnt | count | v. to count; n. total |
obj | object | object | num | number | number; numeric identifier |
ls | list | n. list; v. to enumerate | lvl | level | level |
lib | library | software library | dw | data warehouse | data warehouse |
str | string | string | bak | backup | backup |
prob | probability | probability | sku | Stock Keeping Unit | stock keeping unit |
conf | configuration file | configuration file | idx | index file | index file |
calc | calculation | n. calculation | uv | unique visitor | unique visitor |
regexp | regular expression | regular expression | pv | page view | page views |
app | application | application program | ai | Artificial Intelligence | artificial intelligence |
dept | department | department; section; office | addr | address | address |
db | database | database | pwd | password | password |
mkt | market | market | biz | business | business |
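- Brackets
  [Mandatory] No line break before an opening bracket
  [Mandatory] When a line is too long, break after the opening bracket and indent the continuation
  [Mandatory] When the statement continues after a closing bracket, do not break the line there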
{
  "timestamp": 1585744376001,
  "page": {
    "page_id": "page ID",
    "last_page_id": "previous page ID",
    "page_type": "login page"
  },
  "actions": [{
    "action": "drag",
    "item": "puzzle CAPTCHA",
    "timestamp": 1585744376605
  }, {
    "action": "click",
    "item": "login button",
    "timestamp": 1585744377778
  }]
}
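The same bracket rules carry over to Python. The sketch below is only an illustration: `build_event` is a hypothetical function, and the values are shortened versions of those in the JSON example above.

```python
def build_event(timestamp, page_id, page_type, action, item):
    """Assemble one event record (hypothetical function for illustration)."""
    return {
        'timestamp': timestamp,
        'page': {'page_id': page_id, 'page_type': page_type},
        'actions': [{'action': action, 'item': item}],
    }


# The call is too long for one line, so it breaks after the opening bracket
# and the arguments are indented; there is no break before any bracket.
event = build_event(
    timestamp=1585744376001,
    page_id='login',
    page_type='login page',
    action='click',
    item='login button',
)
print(event['actions'][0]['item'])
```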
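- Other
  [Recommended] Plaintext passwords for the production environment must not appear in code
  [Recommended] A single line of code should not be too long (exceptions: long URLs and long imports)
  [Recommended] Every project needs a documentation file named README.md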
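One common way to keep plaintext passwords out of code is to read them from the environment at run time. This is a minimal sketch, not a prescribed method: the function name `get_db_password` and the variable name `DB_PASSWORD` are assumptions chosen for the example.

```python
import os


def get_db_password(env_name='DB_PASSWORD'):
    """Read the password from an environment variable instead of hard-coding it."""
    password = os.environ.get(env_name)
    if password is None:
        raise RuntimeError(f'environment variable {env_name} is not set')
    return password


# Set here only so that the sketch runs; in practice the deployment environment sets it.
os.environ['DB_PASSWORD'] = 'demo-only'
print(len(get_db_password()))
```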
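2.2. Configuration File and Parameter-Passing Standards
- Configuration files are typically used to store database connection parameters
- Parameters passed to a job are typically time values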
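The two points above can be sketched with the standard library: `configparser` for connection parameters and `argparse` for a date parameter. The section name `mysql`, the option names, and the `--ymd` flag are assumptions for illustration; a real job would read the configuration from a file with `parser.read(path)` rather than from an inline string.

```python
import argparse
import configparser
from datetime import date, datetime, timedelta

# Inline stand-in for a configuration file (hypothetical values)
CONFIG_TEXT = """
[mysql]
host = localhost
port = 3306
user = reader
database = db1
"""


def load_db_config(text, section='mysql'):
    """Parse connection parameters from INI-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    conf = dict(parser[section])
    conf['port'] = int(conf['port'])
    return conf


def parse_args(argv=None):
    """Accept the processing date as a yyyy-mm-dd parameter, defaulting to yesterday."""
    yesterday = (date.today() - timedelta(days=1)).isoformat()
    parser = argparse.ArgumentParser()
    parser.add_argument('--ymd', default=yesterday)
    args = parser.parse_args(argv)
    datetime.strptime(args.ymd, '%Y-%m-%d')  # raises ValueError if the date is malformed
    return args


print(load_db_config(CONFIG_TEXT))
print(parse_args(['--ymd', '2022-02-02']).ymd)
```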
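2.3. Python Code Standards
The Python code standards inherit the general code standards.
Naming convention | Description | Example |
Variables, methods, functions, packages | [Mandatory] All lowercase, words separated by underscores | function_name |
Methods and functions in a class (not called directly from outside) | [Mandatory] Leading double underscore, all lowercase, words separated by underscores | __method_name |
Constants | [Mandatory] All uppercase with underscores | PUBLIC_CONSTANT |
Modules | [Mandatory] All lowercase, words separated by underscores | module_name.py |
Classes | [Mandatory] Capitalize the first letter of each word | ClassName |
Project name | [Recommended] Capitalize the first letter of each word | ProjectName |
Comment convention | Comment guidance |
Module, function, class, and method comments... | [Mandatory] Triple double quotes; no blank line after the comment |
Single-line comment above one or more lines of code | [Mandatory] Hash sign |
Single-line comment after a line of code | [Mandatory] Two spaces + hash sign |
Multi-line comment (when a single-line comment is too long, split it across lines) | [Recommended] Hash sign (on consecutive lines) |
Other conventions | Description |
Indentation, blank lines, spaces | [Mandatory] Tab indent width: 4 spaces. [Recommended] Keep PyCharm up to date and use Ctrl+Alt+L to reformat code |
Strings | [Recommended] Prefer single quotes for single-line strings. [Recommended] Use double quotes when a single-line string contains single quotes, e.g. sql = "SELECT '2021-12-02'". [Mandatory] Use triple double quotes for multi-line strings. [Recommended] Do not use triple single quotes |
"""
Module comment
"""
from time import time

# Comment for several lines of code,
# written as a multi-line comment (split across lines when one line is not enough)
NUM = 5
CNT = 1000


def function1():
    """Function comment"""
    return time()  # single-line comment after a line of code


class ClassName:
    """Class comment"""
    @classmethod
    def method4(cls):
        """Method comment"""


if __name__ == '__main__':
    print(__doc__)
    print(function1.__doc__)
    print(ClassName.__doc__)
    print(ClassName.method4.__doc__)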
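2.4. SQL Code Standards
The SQL code standards inherit the general code standards.
- Naming
  [Mandatory] Databases, tables, columns, and views: all lowercase, words separated by underscores
  [Mandatory] Views: use view_ as the prefix
  [Mandatory] Temporary tables: use temp_ as the prefix
  [Mandatory] Backup tables: use bak_ as the prefix
  [Recommended] Do not use plural nouns in database or table names; good: user_info; bad: users_info and user_informations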
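The mandatory naming rules above lend themselves to a simple automated check. The sketch below is one possible reading of them; the function `is_valid_name` and the `kind` labels are hypothetical, and the plural-noun recommendation is not covered.

```python
import re

# Lowercase words separated by single underscores, not starting with a digit
NAME_PATTERN = re.compile(r'[a-z][a-z0-9]*(_[a-z0-9]+)*')
# Required prefixes for special object kinds (kind labels are assumptions)
PREFIXES = {'view': 'view_', 'temp': 'temp_', 'backup': 'bak_'}


def is_valid_name(name, kind='table'):
    """Check lowercase snake_case plus the required prefix for special objects."""
    if NAME_PATTERN.fullmatch(name) is None:
        return False
    prefix = PREFIXES.get(kind)
    return prefix is None or name.startswith(prefix)


print(is_valid_name('user_info'))
print(is_valid_name('UserInfo'))
print(is_valid_name('user_info', kind='view'))
print(is_valid_name('view_user_info', kind='view'))
```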
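- Creating tables
  [Mandatory] Add comments in Chinese when creating tables and columns
  [Mandatory] Columns involved in a foreign key relationship must carry a foreign key comment
  [Mandatory] When creating a table, the primary key column comes first among all columns
- Queries
  [Recommended] Keywords: all uppercase
  [Recommended] Do not omit AS when using an alias
  [Mandatory] Write comments with "-- " (two hyphens plus a space; supported by both MySQL and HIVE)
  [Recommended] Write subqueries with WITH ... AS, and give every subquery a comment in Chinese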
-- Comment for the whole query
WITH
-- Subquery comment
t1 AS (
    SELECT a FROM t0
),
-- Subquery comment
t2 AS (
    SELECT a FROM t1
)
-- Query comment
SELECT a FROM t2;
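- Line breaks and indentation
  [Recommended] Indent width: 4 spaces or 2 spaces
  [Recommended] When a line is too long, break it at a keyword where a line break is allowed
[Mandatory] Whether a line break is allowed before or after a keyword | Before | After |
AS | Not allowed | |
SELECT | Allowed | |
FROM | Allowed | Not allowed |
[LEFT/RIGHT/...] JOIN | Allowed | Not allowed |
ON | Not allowed | |
WHERE | Allowed | Not allowed |
GROUP BY | Allowed | |
HAVING | Allowed | |
ORDER BY | Allowed | |
LIMIT | Allowed | Not allowed |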
SELECT t1.f1
      ,t2.f2
      ,t2.f3
FROM t1
LEFT JOIN t2 ON t1.f1=t2.f4
WHERE t2.f1>4
  AND t2.f2<5
  AND t2.f3<>1
ORDER BY t1.f1;
- [Recommended] Add the date, requirement, business, author, and similar information at the top of the code
-- Name: A9527
-- Business: A
-- Requirement document: link
-- Creator: 小基基
-- Created: 2021-10-24
-- Change log (date, modified by, change):
-- 2021-12-12, 小黄, added the xxx metric
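2.4.1. MySQL Code Standards (to be completed)
The MySQL code standards inherit the SQL code standards.
- Databases, tables, columns
  [Recommended] Specify the character set explicitly when creating a database: utf8 or utf8mb4
  Example: CREATE DATABASE db1 DEFAULT CHARACTER SET utf8mb4;
- Tables, columns
  [Mandatory] Non-negative numbers must be UNSIGNED
  [Recommended] Primary keys start with pk_, unique indexes with uq_, and ordinary indexes with idx_
  [Recommended] When building a composite index, put the columns with higher selectivity first
  [Mandatory] Mobile phone numbers must not be stored as numbers; use VARCHAR, which supports leading zeros and fuzzy queries
  [Recommended] Store monetary amounts as INT and have the application multiply and divide by 100 when writing and reading, because INT takes 4 bytes while DOUBLE takes 8 bytes
  [Mandatory] Foreign key constraints are not allowed in high-concurrency or distributed scenarios
- [Recommended] Columns may be denormalized to a reasonable degree to reduce joins and improve query performance, but data consistency must be considered
  A redundant column should not be one that is modified frequently, and should not be a long column such as text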
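For the rule above on storing monetary amounts as INT, the application-side conversion can be sketched as follows. This is an assumption-laden illustration: the helper names `to_cents` and `from_cents` are hypothetical, two decimal places are assumed, and `Decimal` is used because multiplying a float by 100 can silently lose a cent.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_cents(amount):
    """Convert a decimal amount string such as '12.34' to integer cents."""
    cents = (Decimal(amount) * 100).quantize(Decimal('1'), rounding=ROUND_HALF_UP)
    return int(cents)


def from_cents(cents):
    """Convert integer cents back to a Decimal amount with two decimal places."""
    return (Decimal(cents) / 100).quantize(Decimal('0.01'))


print(to_cents('12.34'))
print(from_cents(1234))
print(int(0.29 * 100))  # float arithmetic truncates to 28, not 29
```

Note that a signed 4-byte INT tops out at 2147483647, so amounts stored in cents are limited to roughly 21 million currency units; larger values would need BIGINT.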
-- Required table columns: these columns act like metadata; during data analysis, update_time can serve as the incremental marker for data extraction
CREATE TABLE `xxx_info` (
  `id` BIGINT(20) UNSIGNED NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
  `create_time` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Creation time',
  `update_time` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Update time',
  `delete_flag` TINYINT UNSIGNED NOT NULL DEFAULT 0 COMMENT 'Logical delete flag: 1=deleted, 0=not deleted',
  PRIMARY KEY (`id`) USING BTREE
) DEFAULT CHARSET=utf8mb4 COMMENT 'xxx information table';
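The idea of using update_time as an incremental marker can be sketched as below. To keep the example self-contained it uses an in-memory SQLite database as a stand-in for MySQL, with a simplified version of the table; the function `extract_increment` and the sample rows are assumptions for illustration.

```python
import sqlite3


def extract_increment(conn, last_update_time):
    """Fetch rows changed since the previous extraction, ordered by update_time."""
    sql = (
        'SELECT id, update_time, delete_flag FROM xxx_info '
        'WHERE update_time > ? ORDER BY update_time'
    )
    return conn.execute(sql, (last_update_time,)).fetchall()


conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE xxx_info ('
    'id INTEGER PRIMARY KEY, update_time TEXT NOT NULL, delete_flag INTEGER NOT NULL DEFAULT 0)'
)
conn.executemany(
    'INSERT INTO xxx_info (id, update_time, delete_flag) VALUES (?, ?, ?)',
    [
        (1, '2022-02-01 08:00:00', 0),
        (2, '2022-02-02 09:30:00', 0),
        (3, '2022-02-02 10:00:00', 1),
    ],
)
# Only rows updated after the last extraction point are returned
rows = extract_increment(conn, '2022-02-01 23:59:59')
print(rows)
```

Because rows are flagged with delete_flag rather than physically removed, logical deletions also show up in the increment.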
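2.4.2. HIVE Code Standards (to be completed)
The HIVE code standards inherit the SQL code standards.
- Naming
  [Mandatory] User-defined functions start with udf_
  [Mandatory] Table names use the layer name as a prefix; layers are only a logical distinction, and all layers live in the same database
Layer | Naming convention | Description | Examples |
ODS (Operational Data Store), raw data | ods+source type+source table name+full/i | full: full synchronization; i: incremental synchronization | ods_postgresql_sku_full, ods_mysql_order_detail_i, ods_frontend_log |
DIM (Dimension), merged dimensions | dim+dimension+full/zip | full: full table; zip: zipper (history) table; the date dimension table has no suffix | dim_sku_full, dim_user_zip, dim_date |
DWD (Data Warehouse Detail), dimensional modeling | dwd+fact+full/i | full: full fact; i: incremental fact | dwd_inventory_full, dwd_order_detail_i |
DWS (Data Warehouse Service), aggregation | dws+atomic metric | Time granularity such as 1d, 1h...; 1d: per day; 1h: per hour | dws_page_visitor_1d |
DWT (Data Warehouse Topic), accumulation | dwt_consumer | | |
ADS (Application Data Store), final metrics | ads+composite metric/derived metric | | |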
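A table name can be checked against the layer prefixes in the table above with a few lines of code. This is a minimal sketch: the function `get_layer` is hypothetical, and it validates only the prefix and the lowercase snake_case shape, not the per-layer suffixes such as full, i, or zip.

```python
import re

LAYERS = ('ods', 'dim', 'dwd', 'dws', 'dwt', 'ads')
# Lowercase words separated by underscores, with at least one underscore
NAME_PATTERN = re.compile(r'[a-z][a-z0-9]*(?:_[a-z0-9]+)+')


def get_layer(table_name):
    """Return the layer prefix of a table name, or None if it breaks the convention."""
    if NAME_PATTERN.fullmatch(table_name) is None:
        return None
    layer = table_name.split('_', 1)[0]
    return layer if layer in LAYERS else None


for name in ('ods_mysql_order_detail_i', 'dim_date', 'dws_page_visitor_1d', 'tmp_order'):
    print(name, get_layer(name))
```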
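- Creating databases
  [Mandatory] A database must be created with a comment
  [Mandatory] Do not add LOCATION when creating a database; use the default configured in hive-site.xml
  [Recommended] Split databases by business; tables in different databases are not expected to be joined (JOIN)
- Creating tables
  [Mandatory] Regular tables use external tables (EXTERNAL_TABLE); temporary tables use managed tables (MANAGED_TABLE)
  [Mandatory] Do not specify LOCATION when creating a managed table
  [Recommended] In the ADS layer, use \t or , as the column delimiter and \n as the row delimiter, without compression, to make Sqoop export easier
- Partitions
  [Mandatory] Date partitions use ymd, data type STRING, format yyyy-MM-dd, for example 2022-02-02
  [Mandatory] Month partitions use ym, data type STRING, format yyyy-MM, for example 2022-02
  [Recommended] Hourly partitions use the multi-level partition (ymd,h), data type STRING, format (yyyy-MM-dd,HH), with the date before the hour
  [Recommended] Do not partition the ADS layer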
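A scheduling script usually has to turn a timestamp into these partition values. The sketch below shows one way to do so; the function names are hypothetical, while `ymd`, `ym`, and `h` are the partition column names from the rules above.

```python
from datetime import datetime


def partition_values(dt):
    """Build the STRING partition values for one datetime."""
    return {
        'ymd': dt.strftime('%Y-%m-%d'),
        'ym': dt.strftime('%Y-%m'),
        'h': dt.strftime('%H'),
    }


def hourly_partition_spec(dt):
    """Render a multi-level partition spec with the date before the hour."""
    values = partition_values(dt)
    return f"PARTITION (ymd='{values['ymd']}', h='{values['h']}')"


print(partition_values(datetime(2022, 2, 2, 8, 30)))
print(hourly_partition_spec(datetime(2022, 2, 2, 8, 30)))
```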
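- Queries
  [Recommended] Use DISTINCT with caution because it performs poorly; newer HIVE versions may optimize it, so check the execution plan
  [Recommended] Use multiple ORs with caution to avoid a Cartesian product; UNION ALL can replace them provided the logic is unaffected (UNION ALL does not deduplicate)
  [Recommended] Use ORDER BY with caution; ORDER BY is a global sort and runs on a single Reducer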
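2.5. Other
2.5.1. Java and Scala (to be completed)
- [Mandatory] Documentation comments use /** */, multi-line comments use /* */, single-line comments use //
- [Mandatory] Indent width: 2 or 4 spaces
Naming convention | Description | Example |
Variables, methods | [Mandatory] First word all lowercase, subsequent words capitalized | methodName |
Classes | [Mandatory] Capitalize the first letter of each word | ClassName |
Packages, project name | [Mandatory] All lowercase | |
Constants | [Mandatory] All uppercase with underscores | PUBLIC_CONSTANT |
2.5.2. JSON (to be completed)
Naming convention | Description | Example |
Keys | [Mandatory] First word all lowercase, subsequent words capitalized | cityId |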
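When JSON is produced from Python data whose keys follow the snake_case rule of section 2.3, the keys can be converted to the lowerCamelCase form required above. The helpers `to_camel_case` and `camelize` below are a hypothetical sketch, not an existing library API.

```python
import json
import re


def to_camel_case(key):
    """Convert snake_case to lowerCamelCase, e.g. city_id -> cityId."""
    return re.sub(r'_([a-z0-9])', lambda m: m.group(1).upper(), key)


def camelize(obj):
    """Recursively rename dict keys in JSON-like data."""
    if isinstance(obj, dict):
        return {to_camel_case(k): camelize(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [camelize(v) for v in obj]
    return obj


print(json.dumps(camelize({'city_id': 1, 'page': {'last_page_id': 'p0'}})))
```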
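2.5.3. Shell (to be completed)
- [Mandatory] Script header: #!/usr/bin/sh
- [Mandatory] Comment style: hash sign plus a space, for example # this is a comment
- [Recommended] Indent width: 4 spaces
Naming convention | Description | Example |
Variables, functions | [Mandatory] All lowercase, words separated by underscores | function_name |
Constants | [Mandatory] All uppercase with underscores | PUBLIC_CONSTANT |
Script name | [Recommended] All lowercase, words separated by underscores | sqoop_mysql2hive.sh |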
3. Automated Code Review Script
- Features:
  Scanning a whole code base, and scanning a single code file
- Usage:
  Put the code (without external packages) in the same directory as the scan script and run the script with Python 3
Code scan report example:
import os
import re
from collections import defaultdict

from pandas import DataFrame

# Absolute path of this script, so scanning works regardless of the working directory
THIS_FILE = os.path.abspath(__file__)


class File:
    class Compile:
        """Placeholder pattern: finds nothing and always matches."""

        @staticmethod
        def findall(string):
            return []

        @staticmethod
        def match(string):
            return True

    # File name suffix
    SUFFIX = ''
    # Regular expression that extracts comments
    COMMENT_PATTERN = Compile
    # Minimum comment proportion for acceptable code
    COMMENT_PROPORTION_THRESHOLD = 0
    # Template for the code header
    HEAD_PATTERN = Compile
    # Maximum length of a single line of code
    LINE_LENGTH_LIMIT = 120
    # Non-compliant code statements
    UNQUALIFIED_CODE_PATTERN = Compile

    def __init__(self, file_name):
        self.file_name = file_name
        # Read the file
        txt = self.read_file()
        # Number of characters
        self.number_of_words = len(txt)
        # Number of lines
        self.number_of_lines = len(txt.split('\n'))
        # Extract comments
        comments = self.COMMENT_PATTERN.findall(txt)
        # Number of comment lines
        self.number_of_comments = sum(len(c.strip().split('\n')) for c in comments)
        # Proportion of comment lines
        self.comment_proportion = self.number_of_comments / self.number_of_lines
        # Reasons for failing
        self.reason_for_failings = []
        if self.comment_proportion < self.COMMENT_PROPORTION_THRESHOLD:
            self.reason_for_failings.append('too few comments')
        if self.HEAD_PATTERN.match(txt) is None:
            self.reason_for_failings.append('code header does not follow the required template')
        if max(len(line) for line in txt.split('\n')) > self.LINE_LENGTH_LIMIT:
            self.reason_for_failings.append('a line of code is too long')
        if self.UNQUALIFIED_CODE_PATTERN.findall(txt):
            self.reason_for_failings.append('contains non-compliant code statements')

    def read_file(self):
        with open(self.file_name, encoding='utf-8') as f:
            return f.read().strip()

    def report(self):
        print('file name', self.file_name)
        print('characters', self.number_of_words)
        print('lines', self.number_of_lines)
        print('comment lines', self.number_of_comments)
        print('comment line ratio', self.comment_proportion)
        print('Reasons for failing:' if self.reason_for_failings else 'Code passed!')
        for e, r in enumerate(self.reason_for_failings, 1):
            print(e, r)


class PyFile(File):
    SUFFIX = '.py'
    COMMENT_PATTERN = re.compile(r'"""[\s\S]+?"""|# .+')
    COMMENT_PROPORTION_THRESHOLD = 0.05


class SqlFile(File):
    SUFFIX = '.sql'
    COMMENT_PATTERN = re.compile('-- .+')
    COMMENT_PROPORTION_THRESHOLD = 0.05
    UNQUALIFIED_CODE_PATTERN = re.compile(r'select\s+\*', re.I)


class JavaFile(File):
    SUFFIX = '.java'
    COMMENT_PATTERN = re.compile(r'/\*[\s\S]+?\*/|//.+')
    COMMENT_PROPORTION_THRESHOLD = 0.1


class ScalaFile(JavaFile):
    SUFFIX = '.scala'


class ShFile(File):
    SUFFIX = '.sh'
    COMMENT_PATTERN = re.compile('# .+')
    COMMENT_PROPORTION_THRESHOLD = 0.1
    HEAD_PATTERN = re.compile('#!/usr/bin/sh')


class Files:
    FILES = (PyFile, SqlFile, ScalaFile, JavaFile, ShFile)

    def __init__(self):
        self.statistic = {t.SUFFIX: [] for t in self.FILES}

    def traversal(self, path=os.path.dirname(THIS_FILE)):
        """Recursively traverse files"""
        for file_name in os.listdir(path):
            abs_path = os.path.join(path, file_name)
            if os.path.isdir(abs_path):
                yield from self.traversal(abs_path)
            elif os.path.isfile(abs_path):
                yield abs_path

    def calculate(self):
        """Compute file counts, code volume, comment volume..."""
        for abs_path in self.traversal():
            if abs_path == THIS_FILE:
                continue
            for f in self.FILES:
                if abs_path.endswith(f.SUFFIX):
                    self.statistic[f.SUFFIX].append(f(abs_path))
                    break

    @property
    def failed_codes(self):
        """Files that failed the checks"""
        return (f.file_name for files in self.statistic.values() for f in files if f.reason_for_failings)

    def files_report(self):
        """Analyze all code"""
        self.calculate()
        df = defaultdict(list)
        index = []
        for suffix, files in self.statistic.items():
            index.append(suffix)
            df['files'].append(len(files))
            df['failed files'].append(sum(1 if f.reason_for_failings else 0 for f in files))
            df['characters'].append(sum(f.number_of_words for f in files))
            df['lines'].append(sum(f.number_of_lines for f in files))
            df['comment lines'].append(sum(f.number_of_comments for f in files))
        df = DataFrame(df, index)
        df.loc['total'] = df.sum()
        df['comment ratio'] = df['comment lines'] / df['lines']
        df['failure rate'] = df['failed files'] / df['files']
        print(df)
        # df.to_excel('code_scan_report.xlsx')
        print('Failed files:')
        for f in self.failed_codes:
            print(f)

    def single_file_report(self, the_file):
        """Analyze a single file"""
        for f in self.FILES:
            if the_file.endswith(f.SUFFIX):
                f(the_file).report()
                break


if __name__ == '__main__':
    Files().files_report()
    Files().single_file_report(__file__)  # check this script itself
    # Files().single_file_report(r'')
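4. Data Logic Validation Mechanism
- Data development differs from backend development in one respect: backend developers have testers who help with functional testing,
  whereas data engineers do not
- Data logic errors are less obvious than functional bugs: a wrong calculation result does not make the program raise an error
  A data logic validation mechanism is therefore put in place:
- Check for duplicate primary key values
- Check for NULL primary key values
- Check row counts before and after a LEFT JOIN (rows after the join = rows in the left table)
- Check time data types, paying attention to time zones
- Check whether measures are additive
- Check numeric types: whether values overflow and whether precision is lost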
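The first three checks in the list above can be sketched as plain SQL issued from Python. The example uses an in-memory SQLite database so that it runs anywhere; the table names, sample rows, and helper functions are assumptions for illustration. Table and column names are interpolated into the SQL text, so they must come from trusted code, never from user input.

```python
import sqlite3


def count_duplicate_keys(conn, table, key):
    """Number of key values that occur more than once."""
    sql = (
        f'SELECT COUNT(*) FROM '
        f'(SELECT {key} FROM {table} GROUP BY {key} HAVING COUNT(*) > 1) AS t'
    )
    return conn.execute(sql).fetchone()[0]


def count_null_keys(conn, table, key):
    """Number of rows whose key is NULL."""
    sql = f'SELECT COUNT(*) FROM {table} WHERE {key} IS NULL'
    return conn.execute(sql).fetchone()[0]


def left_join_keeps_row_count(conn, left, right, left_key, right_key):
    """True if a LEFT JOIN returns exactly as many rows as the left table."""
    before = conn.execute(f'SELECT COUNT(*) FROM {left}').fetchone()[0]
    after = conn.execute(
        f'SELECT COUNT(*) FROM {left} AS l '
        f'LEFT JOIN {right} AS r ON l.{left_key} = r.{right_key}'
    ).fetchone()[0]
    return before == after


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE order_detail (order_id INTEGER, sku_id INTEGER)')
conn.execute('CREATE TABLE sku (sku_id INTEGER, sku_name TEXT)')
conn.executemany(
    'INSERT INTO order_detail VALUES (?, ?)',
    [(1, 10), (2, 20), (2, 20), (None, 10)],
)
# sku_id 20 appears twice in sku, so the join fans out
conn.executemany('INSERT INTO sku VALUES (?, ?)', [(10, 'a'), (20, 'b'), (20, 'b2')])

print(count_duplicate_keys(conn, 'order_detail', 'order_id'))
print(count_null_keys(conn, 'order_detail', 'order_id'))
print(left_join_keeps_row_count(conn, 'order_detail', 'sku', 'sku_id', 'sku_id'))
```

A failed row-count check usually means the join key is not unique in the right-hand table, which is exactly the kind of silent error the list above aims to catch.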