57. Time series data screening

Author: Lee Kwang Joo

Case introduction

From the following data table, select the rows whose measured value is 9 or greater. Add code comments.

Time Measurements
2022-03-18 00:00:00 5
2022-03-18 01:00:00 9
2022-03-18 02:00:00 6
2022-03-18 03:00:00 9
2022-03-18 04:00:00 8
2022-03-18 05:00:00 10
2022-03-18 06:00:00 4
2022-03-18 07:00:00 3
2022-03-18 08:00:00 6
2022-03-18 09:00:00 3
2022-03-18 10:00:00 1
2022-03-18 11:00:00 6
2022-03-18 12:00:00 7
2022-03-18 13:00:00 4
2022-03-18 14:00:00 1
2022-03-18 15:00:00 7
2022-03-18 16:00:00 4
2022-03-18 17:00:00 10
2022-03-18 18:00:00 8
2022-03-18 19:00:00 6
2022-03-18 20:00:00 6
2022-03-18 21:00:00 1
2022-03-18 22:00:00 5
2022-03-18 23:00:00 4
2022-03-19 00:00:00 5
2022-03-19 01:00:00 5
2022-03-19 02:00:00 4
2022-03-19 03:00:00 1
2022-03-19 04:00:00 8
2022-03-19 05:00:00 6
2022-03-19 06:00:00 1
2022-03-19 07:00:00 2
2022-03-19 08:00:00 8
2022-03-19 09:00:00 4
2022-03-19 10:00:00 9
2022-03-19 11:00:00 1
2022-03-19 12:00:00 8
2022-03-19 13:00:00 2
2022-03-19 14:00:00 5
2022-03-19 15:00:00 2
2022-03-19 16:00:00 6
2022-03-19 17:00:00 9
2022-03-19 18:00:00 6
2022-03-19 19:00:00 2
2022-03-19 20:00:00 1
2022-03-19 21:00:00 8
2022-03-19 22:00:00 1
2022-03-19 23:00:00 2
2022-03-20 00:00:00 6
2022-03-20 01:00:00 6
2022-03-20 02:00:00 10
2022-03-20 03:00:00 9
2022-03-20 04:00:00 6
2022-03-20 05:00:00 7
2022-03-20 06:00:00 8
2022-03-20 07:00:00 10
2022-03-20 08:00:00 1
2022-03-20 09:00:00 3
2022-03-20 10:00:00 1
2022-03-20 11:00:00 3
2022-03-20 12:00:00 2
2022-03-20 13:00:00 10
2022-03-20 14:00:00 9
2022-03-20 15:00:00 5
2022-03-20 16:00:00 2
2022-03-20 17:00:00 4
2022-03-20 18:00:00 10
2022-03-20 19:00:00 7
2022-03-20 20:00:00 6
2022-03-20 21:00:00 10
2022-03-20 22:00:00 4
2022-03-20 23:00:00 7
2022-03-21 00:00:00 7
2022-03-21 01:00:00 9
2022-03-21 02:00:00 8
2022-03-21 03:00:00 6
2022-03-21 04:00:00 2
2022-03-21 05:00:00 6
2022-03-21 06:00:00 10
2022-03-21 07:00:00 3
2022-03-21 08:00:00 2
2022-03-21 09:00:00 1
2022-03-21 10:00:00 7
2022-03-21 11:00:00 2
2022-03-21 12:00:00 9
2022-03-21 13:00:00 1
2022-03-21 14:00:00 7
2022-03-21 15:00:00 2
2022-03-21 16:00:00 2
2022-03-21 17:00:00 10
2022-03-21 18:00:00 5
2022-03-21 19:00:00 5
2022-03-21 20:00:00 4
2022-03-21 21:00:00 6
2022-03-21 22:00:00 4
2022-03-21 23:00:00 1
2022-03-22 00:00:00 5
2022-03-22 01:00:00 10
2022-03-22 02:00:00 6
2022-03-22 03:00:00 9

Wenxin Yiyan prompt

Open Baidu's "Wenxin Yiyan" and enter the prompt: "You are a pandas expert. The file path is: D:/Samples/时间序列.xlsx. Use pandas to import the data from the 1st worksheet of the Excel file, use the 1st column as the index column, and set the engine to 'openpyxl'. Select the rows with a measurement value greater than or equal to 9. Add code comments."

The response is as follows:

First, make sure the pandas and openpyxl libraries are installed. If either is missing, you can install it using pip:

pip install pandas openpyxl

You can then use the following code to import the data from the Excel file, use column 1 as the index column, and select the rows with measurements greater than or equal to 9.

import pandas as pd

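# File path
file_path = 'D:/Samples/时间序列.xlsx'

# Read the Excel file with pandas: sheet index 0 (the 1st worksheet), the 1st column as the index column, engine 'openpyxl'
# Note: Python indexes from 0, so the 1st worksheet is index 0
df = pd.read_excel(file_path, sheet_name=0, index_col=0, engine='openpyxl')

# Select the rows whose measurement (column name assumed to be '测量值', i.e. "Measurements") is greater than or equal to 9
# If your column is not named '测量值', replace it with the actual column name
filtered_df = df[df['测量值'] >= 9]

# Print the filtered data
print(filtered_df)

# To save the filtered data to a new Excel file, use the following code
output_file_path = 'D:/Samples/筛选后的时间序列.xlsx'
filtered_df.to_excel(output_file_path, index=True, engine='openpyxl')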
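
The filtering step can also be tried without the Excel file. The following is a minimal sketch, not part of the Wenxin Yiyan answer: it types in the first eight rows of the sample table by hand and applies the same boolean-index filter. The column names "Time" and "Measurements" are assumptions taken from the table header shown above; the actual workbook may use different headers.

```python
import pandas as pd

# First eight rows of the sample table, entered by hand so the example
# runs without D:/Samples. Column names are assumed, not read from Excel.
rows = [
    ("2022-03-18 00:00:00", 5),
    ("2022-03-18 01:00:00", 9),
    ("2022-03-18 02:00:00", 6),
    ("2022-03-18 03:00:00", 9),
    ("2022-03-18 04:00:00", 8),
    ("2022-03-18 05:00:00", 10),
    ("2022-03-18 06:00:00", 4),
    ("2022-03-18 07:00:00", 3),
]
df = pd.DataFrame(rows, columns=["Time", "Measurements"])

# Convert the text timestamps to datetimes and use them as the index,
# mirroring index_col=0 in the read_excel call.
df["Time"] = pd.to_datetime(df["Time"])
df = df.set_index("Time")

# A boolean mask is True for every row whose value is 9 or greater;
# indexing the DataFrame with it keeps only those rows.
mask = df["Measurements"] >= 9
filtered_df = df[mask]

print(filtered_df)
```

For these eight rows the filter keeps the readings at 01:00, 03:00, and 05:00 on 2022-03-18 (values 9, 9, and 10).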

Fully commented version:

import pandas as pd

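# Define the path of the Excel file
file_path = 'D:/Samples/时间序列.xlsx'

# Read the Excel file with the pandas read_excel function
# sheet_name=0 reads the 1st worksheet (indexes start at 0)
# index_col=0 sets the 1st column as the index column
# engine='openpyxl' tells pandas to read the file with the openpyxl engine
# The column name '测量值' ("Measurements") used in the comments is an assumption; replace it with your actual column name
# Load the data into the DataFrame object df
df = pd.read_excel(file_path, sheet_name=0, index_col=0, engine='openpyxl')

# Filter the data with boolean indexing, assuming the column is named '测量值'
# Keep the rows whose '测量值' value is greater than or equal to 9
filtered_df = df[df['测量值'] >= 9]

# Print the filtered data
print("Filtered data:")
print(filtered_df)

# To save the filtered data to a new Excel file
# Define the path of the output file
output_file_path = 'D:/Samples/筛选后的时间序列.xlsx'

# Save the filtered data to the new Excel file with the to_excel method
# index=True also writes the index to the file
filtered_df.to_excel(output_file_path, index=True, engine='openpyxl')
print(f"Filtered data saved to {output_file_path}")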

Make sure to replace '测量值' ("Measurements") with your actual column name. If the column name contains spaces, special characters, or non-English text, write it as a quoted string (single or double quotes) exactly as it appears in the worksheet header.
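
Because the filter depends on the header text (shown as 测量值 in the code and as Measurements in the table above), it can help to check the header first, or to select the column by position instead. The following is a rough sketch under that assumption; `filter_at_least` is a hypothetical helper written for illustration, not a pandas function, and the small DataFrame stands in for the worksheet.

```python
import pandas as pd

def filter_at_least(df, threshold, column=None):
    """Return the rows whose value in `column` is >= threshold.

    If `column` is None, the first data column is used by position,
    so the result does not depend on the header text.
    """
    if column is None:
        values = df.iloc[:, 0]
    elif column in df.columns:
        values = df[column]
    else:
        raise KeyError(
            f"Column {column!r} not found; available columns: {list(df.columns)}"
        )
    return df[values >= threshold]

# Small stand-in for the worksheet, using the first four sample rows.
sample = pd.DataFrame(
    {"测量值": [5, 9, 6, 9]},
    index=pd.to_datetime([
        "2022-03-18 00:00:00",
        "2022-03-18 01:00:00",
        "2022-03-18 02:00:00",
        "2022-03-18 03:00:00",
    ]),
)

# Inspect the real header names before filtering by name.
print(list(sample.columns))

by_name = filter_at_least(sample, 9, column="测量值")
by_position = filter_at_least(sample, 9)
print(by_position)
```

Selecting by position is convenient for a one-column sheet like this one; with several data columns, filtering by an explicit, verified name is usually the safer choice.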

Python run results

Figure 1. Running the code in Python

Figure 2. Python run results

The tools used in this series are as follows:

Wenxin Yiyan: Version 3.5

Python Editor: 3.9.0

Disclaimer: The results above come from actual runs. If the code fails to run for you, leave a message in the comments. Part of the data comes from the Internet; if there is any infringement, please let us know by private message. Thank you!