天天看點

pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

pandas 速查表

I’ve worked with SQL for several years before I started using Pandas in Python. I eventually get used to it but I struggled syntax difference between them when I started.

在開始在Python中使用Pandas之前,我已經使用SQL多年。 我最終習慣了,但是當我開始時,我在它們之間的文法差異上苦苦掙紮。

I want to gather syntax comparison in this article, also as cheat sheet. I’m going to mention to the basic syntax but if there is another example that doesn’t exist in this article, please let me know!

我想在本文中(也作為備忘單)收集文法比較。 我将提到基本文法,但是如果本文中不存在另一個示例,請告訴我!

資料集 (Dataset)

I’m going to use this dataset.

我将使用此資料集。

import pandas as pddf_list = [
  ["Lamar Jackson", 23, "Ravens", "QB"]
  , ["Russell Wilson", 31, "Seahawks", "QB"]
  , ["Aaron Donald", 29, "Rams", "DT"]
  , ["Patrick Mahomes", 24, "Chiefs", "QB"]
  , ["Michael Thomas", 27, "Saints", "WR"]
  , ["Christian McCaffrey", 24, "Panthers", "RB"]
  , ["George Kittle", 26, "49ers", "TE"]
  , ["DeAndre Hopkins", 28, "Cardinals", "WR"]
  , ["Stephon Gilmore", 29, "Patriots", "CB"]
  , ["Derrick Henry", 26, "Titans", "RB"]
]
df_columns = ["Name", "Age", "Team", "Position"]
players = pd.DataFrame(data=df_list, columns=df_columns)
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

players 玩家

選擇(SELECT)

Extract all:

提取所有:

SQL: SELECT * FROM playersPandas: players
           

Projection of some columns:

一些列的投影:

SQL: SELECT Name, Age FROM playersPandas: players[["Name", "Age"]]
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

Selection of top some rows:

選擇頂部幾行:

SQL: SELECT TOP 5 * FROM playersPandas: players.head(5)
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

DISTINCT:

不同:

SQL: SELECT DISTINCT Position FROM playersPandas: players["Position"][~players["Position"].duplicated()]
           

哪裡 (WHERE)

Selection of specific rows with condition:

選擇具有條件的特定行:

SQL: SELECT * FROM players WHERE Position = 'QB'Pandas: players[players.Position == "QB"]
also: players[players["Position"] == "QB"]
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

Selection of specific rows & projection of some columns

特定行的選擇和某些列的投影

SQL: SELECT Name, Age FROM players WHERE Position = 'QB'Pandas: players[["Name", "Age"]][players.Position == "QB"]
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

AND / OR:

AND / OR:

SQL: SELECT Name, Age FROM players WHERE Position = 'QB' AND Age < 30Pandas: players[["Name", "Age"]][(players.Position == "QB") & (players.Age < 30)]
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)
SQL: SELECT Name, Age FROM players WHERE Position = 'QB' OR Position = 'RB'Pandas: players[["Name", "Age"]][(players.Position == "QB") |(players.Position == "RB")]
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

IN:

在:

SQL: SELECT * FROM players WHERE Position IN ('QB', 'RB')Pandas: players[players.Position.isin(["QB", "RB"])]
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

LIKE:

喜歡:

SQL: SELECT * FROM players WHERE Team LIKE '%ns%'Pandas: players[players.Team.str.contains("ns")]
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

訂購 (ORDER BY)

Order by 1 column:

按1列排序:

SQL: SELECT * FROM players ORDER BY NamePandas: players.sort_values("Name")
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

Order by 2+ columns:

按2+列排序:

SQL: SELECT * FROM players ORDER BY Age, NamePandas: players.sort_values(["Age", "Name"])
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

DESC:

DESC:

SQL: SELECT * FROM players ORDER BY Age DESC, NamePandas: players.sort_values(["Age", "Name"], ascending=[False, True])
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

通過...分組 (GROUP BY)

COUNT:

計數:

SQL: SELECT COUNT(*) AS COUNT FROM players GROUP BY PositionPandas: players.groupby("Position").agg({"Name": "count"}).rename(columns={"Name": "COUNT"})
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

SUM:

和:

SQL: SELECT SUM(Age) AS Age FROM players GROUP BY PositionPandas: players.groupby("Position").agg({"Age": "sum"})
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

AVG:

AVG:

SQL: SELECT AVG(Age) AS Age FROM players GROUP BY PositionPandas: players.groupby("Position").agg({"Age": "mean"})
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

RANK:

秩:

SQL: SELECT *, RANK() OVER (ORDER BY Age DESC) FROM playersPandas: 
players["AgeRank"] = players.Age.rank(ascending=False)
players
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

RANK + PARTITION BY

排名+分區依據

SQL: SELECT *, RANK() OVER (PARTITION BY Position ORDER BY Age DESC) FROM playersPandas: 
players["AgeRank"] = players.groupby("Position")["Age"].rank(ascending=False)
players
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

DENSE:

稠密:

SQL: SELECT *, DENSE_RANK() OVER (ORDER BY Age DESC) FROM playersPandas: 
players["AgeRank"] = players.Age.rank(method="dense", ascending=False)
players
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

更新 (UPDATE)

SQL: UPDATE players SET Age = 0 WHERE Age = 24Pandas: players.loc[players.Age == 24, "Age"] = 0
           
pandas 速查表_sql v pandas基本文法比較速查表 資料集 (Dataset)選擇(SELECT) 哪裡 (WHERE) 訂購 (ORDER BY) 通過...分組 (GROUP BY) 更新 (UPDATE) 加入 (JOIN)

加入 (JOIN)

Please see my another article!

請看我的另一篇文章!

翻譯自: https://medium.com/analytics-vidhya/sql-v-pandas-basic-syntax-comparison-cheat-sheet-498289372d45

pandas 速查表