pandas 速查表
I’ve worked with SQL for several years before I started using Pandas in Python. I eventually get used to it but I struggled syntax difference between them when I started.
在開始在Python中使用Pandas之前,我已經使用SQL多年。 我最終習慣了,但是當我開始時,我在它們之間的文法差異上苦苦掙紮。
I want to gather syntax comparison in this article, also as cheat sheet. I’m going to mention to the basic syntax but if there is another example that doesn’t exist in this article, please let me know!
我想在本文中(也作為備忘單)收集文法比較。 我将提到基本文法,但是如果本文中不存在另一個示例,請告訴我!
資料集 (Dataset)
I’m going to use this dataset.
我将使用此資料集。
import pandas as pddf_list = [
["Lamar Jackson", 23, "Ravens", "QB"]
, ["Russell Wilson", 31, "Seahawks", "QB"]
, ["Aaron Donald", 29, "Rams", "DT"]
, ["Patrick Mahomes", 24, "Chiefs", "QB"]
, ["Michael Thomas", 27, "Saints", "WR"]
, ["Christian McCaffrey", 24, "Panthers", "RB"]
, ["George Kittle", 26, "49ers", "TE"]
, ["DeAndre Hopkins", 28, "Cardinals", "WR"]
, ["Stephon Gilmore", 29, "Patriots", "CB"]
, ["Derrick Henry", 26, "Titans", "RB"]
]
df_columns = ["Name", "Age", "Team", "Position"]
players = pd.DataFrame(data=df_list, columns=df_columns)
![](https://img.laitimes.com/img/__Qf2AjLwojIjJCLyojI0JCLicmbw5SNhZzY4kDM5YDZyYTMzYTNihTMzEWZjZWYxMTN2kTNj9CX0JXZ252bj91Ztl2Lc52YucWbp5GZzNmLn9Gbi1yZtl2Lc9CX6MHc0RHaiojIsJye.png)
players 玩家
選擇(SELECT)
Extract all:
提取所有:
SQL: SELECT * FROM playersPandas: players
Projection of some columns:
一些列的投影:
SQL: SELECT Name, Age FROM playersPandas: players[["Name", "Age"]]
Selection of top some rows:
選擇頂部幾行:
SQL: SELECT TOP 5 * FROM playersPandas: players.head(5)
DISTINCT:
不同:
SQL: SELECT DISTINCT Position FROM playersPandas: players["Position"][~players["Position"].duplicated()]
哪裡 (WHERE)
Selection of specific rows with condition:
選擇具有條件的特定行:
SQL: SELECT * FROM players WHERE Position = 'QB'Pandas: players[players.Position == "QB"]
also: players[players["Position"] == "QB"]
Selection of specific rows & projection of some columns
特定行的選擇和某些列的投影
SQL: SELECT Name, Age FROM players WHERE Position = 'QB'Pandas: players[["Name", "Age"]][players.Position == "QB"]
AND / OR:
AND / OR:
SQL: SELECT Name, Age FROM players WHERE Position = 'QB' AND Age < 30Pandas: players[["Name", "Age"]][(players.Position == "QB") & (players.Age < 30)]
SQL: SELECT Name, Age FROM players WHERE Position = 'QB' OR Position = 'RB'Pandas: players[["Name", "Age"]][(players.Position == "QB") |(players.Position == "RB")]
IN:
在:
SQL: SELECT * FROM players WHERE Position IN ('QB', 'RB')Pandas: players[players.Position.isin(["QB", "RB"])]
LIKE:
喜歡:
SQL: SELECT * FROM players WHERE Team LIKE '%ns%'Pandas: players[players.Team.str.contains("ns")]
訂購 (ORDER BY)
Order by 1 column:
按1列排序:
SQL: SELECT * FROM players ORDER BY NamePandas: players.sort_values("Name")
Order by 2+ columns:
按2+列排序:
SQL: SELECT * FROM players ORDER BY Age, NamePandas: players.sort_values(["Age", "Name"])
DESC:
DESC:
SQL: SELECT * FROM players ORDER BY Age DESC, NamePandas: players.sort_values(["Age", "Name"], ascending=[False, True])
通過...分組 (GROUP BY)
COUNT:
計數:
SQL: SELECT COUNT(*) AS COUNT FROM players GROUP BY PositionPandas: players.groupby("Position").agg({"Name": "count"}).rename(columns={"Name": "COUNT"})
SUM:
和:
SQL: SELECT SUM(Age) AS Age FROM players GROUP BY PositionPandas: players.groupby("Position").agg({"Age": "sum"})
AVG:
AVG:
SQL: SELECT AVG(Age) AS Age FROM players GROUP BY PositionPandas: players.groupby("Position").agg({"Age": "mean"})
RANK:
秩:
SQL: SELECT *, RANK() OVER (ORDER BY Age DESC) FROM playersPandas:
players["AgeRank"] = players.Age.rank(ascending=False)
players
RANK + PARTITION BY
排名+分區依據
SQL: SELECT *, RANK() OVER (PARTITION BY Position ORDER BY Age DESC) FROM playersPandas:
players["AgeRank"] = players.groupby("Position")["Age"].rank(ascending=False)
players
DENSE:
稠密:
SQL: SELECT *, DENSE_RANK() OVER (ORDER BY Age DESC) FROM playersPandas:
players["AgeRank"] = players.Age.rank(method="dense", ascending=False)
players
更新 (UPDATE)
SQL: UPDATE players SET Age = 0 WHERE Age = 24Pandas: players.loc[players.Age == 24, "Age"] = 0
加入 (JOIN)
Please see my another article!
請看我的另一篇文章!
翻譯自: https://medium.com/analytics-vidhya/sql-v-pandas-basic-syntax-comparison-cheat-sheet-498289372d45
pandas 速查表