
ChatGPT as a Python Programming Assistant

news 2025/11/15 0:49:50

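Simple Data Processing Script

I think a good starting point is some kind of data processing script. Since I plan to have ChatGPT write some machine learning scripts later using a variety of Python libraries, this seems like a reasonable place to begin.

Goal

First, I want to try loading a dataset; ensuring all dataset values are numeric, or converting them to numeric; checking for missing values; and splitting the dataset into training and test sets.

Prompt

Write a Python script that does the following:
- loads a dataset from csv on disk into a Pandas DataFrame
- ensures that the dataset is entirely numeric; converts string values to numeric and tracks this conversion in a dictionary for later reference
- checks for missing values in the DataFrame and imputes missing values for any record that has one missing value; entirely drops any record that has more than one missing value
- splits the dataset into test and training sets

Response

```python
import pandas as pd

# Load dataset from csv to DataFrame
df = pd.read_csv("dataset.csv")

# Initialize dictionary to track string to numeric conversions
conversions = {}

# Convert string values to numeric and track conversions in dictionary
for col in df.columns:
    if df[col].dtype == object:
        conversions[col] = {val: i for i, val in enumerate(df[col].unique())}
        df[col] = df[col].map(conversions[col])

# Check for missing values
missing_values = df.isna().sum()

# Impute missing values for records with one missing value
for col in missing_values[missing_values == 1].index:
    df[col].fillna(df[col].median(), inplace=True)

# Drop records with more than one missing value
df.dropna(thresh=len(df.columns) - 1, inplace=True)

# Split dataset into test and training sets
from sklearn.model_selection import train_test_split
train_df, test_df = train_test_split(df, test_size=0.2)
```

Note: The script loads the dataset from a csv file named "dataset.csv" and splits the DataFrame into test and training sets with an 80/20 ratio. If you want to use a different file name or ratio, change the path/ratio accordingly.

Analysis

ChatGPT produced a useful script. It decided on a dataset name as well as a train/test split percentage, and documented both of these points and how to change them if needed. Overall, mission accomplished.

Refactoring Our Code

Goal

Now that we have a script that does something, let's refactor it a bit. A cornerstone of good coding is using functions to organize code logically, promote reuse, and ensure you don't repeat yourself.

Prompt

Rewrite the above script to use a series of functions that make sense

Response

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def load_and_convert_data(file_path):
    # Load dataset from csv to DataFrame
    df = pd.read_csv(file_path)

    # Initialize dictionary to track string to numeric conversions
    conversions = {}

    # Convert string values to numeric and track conversions in dictionary
    for col in df.columns:
        if df[col].dtype == object:
            conversions[col] = {val: i for i, val in enumerate(df[col].unique())}
            df[col] = df[col].map(conversions[col])

    return df, conversions

def handle_missing_values(df):
    # Check for missing values
    missing_values = df.isna().sum()

    # Impute missing values for records with one missing value
    for col in missing_values[missing_values == 1].index:
        df[col].fillna(df[col].median(), inplace=True)

    # Drop records with more than one missing value
    df.dropna(thresh=len(df.columns) - 1, inplace=True)

    return df

def split_data(df, test_size):
    return train_test_split(df, test_size=test_size)

# Use the functions
file_path = "dataset.csv"
df, conversions = load_and_convert_data(file_path)
df = handle_missing_values(df)
train_df, test_df = split_data(df, 0.2)
```

Here, I have used three functions:
1. load_and_convert_data(file_path) loads the csv dataset and converts string values to numeric.
2. handle_missing_values(df) handles the missing values in the dataset.
3. split_data(df, test_size) splits the dataset into test and training sets.

The main script uses these functions to load, convert, and clean the dataset, then splits it into test and training sets. The path of the dataset and the test ratio are inputs to these functions.

Analysis

If you look at the first script ChatGPT output, you can probably see some fairly obvious lines along which its functionality would be separated into functions. I certainly did. ChatGPT did exactly what I expected in this regard. Again, a competent script has been generated.

Extending Our Code

Goal

OK, our script looks a little better now that it uses some functions for organization. Next, let's extend the script slightly so that the user can specify a few parameters when running it.

Prompt

Rewrite the above program to solicit the following from the command line in the form of arguments, and use them in the program:
- the path/name of the dataset csv file
- the train/test split size
- the threshold for the number of missing values per record, after which to drop the entire record instead of imputing the missing values

Response

```python
import pandas as pd
import argparse
from sklearn.model_selection import train_test_split

def load_and_convert_data(file_path):
    # Load dataset from csv to DataFrame
    df = pd.read_csv(file_path)

    # Initialize dictionary to track string to numeric conversions
    conversions = {}

    # Convert string values to numeric and track conversions in dictionary
    for col in df.columns:
        if df[col].dtype == object:
            conversions[col] = {val: i for i, val in enumerate(df[col].unique())}
            df[col] = df[col].map(conversions[col])

    return df, conversions

def handle_missing_values(df, threshold):
    # Check for missing values
    missing_values = df.isna().sum()

    # Impute missing values for records with one missing value
    for col in missing_values[missing_values == 1].index:
        df[col].fillna(df[col].median(), inplace=True)

    # Drop records with more than threshold missing value
    df.dropna(thresh=len(df.columns) - threshold, inplace=True)

    return df

def split_data(df, test_size):
    return train_test_split(df, test_size=test_size)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("file_path", help="path/name of the dataset csv file")
    parser.add_argument("test_size", help="train/test split size", type=float)
    parser.add_argument("threshold", help="threshold for the number of missing values per record", type=int)
    args = parser.parse_args()

    # Use the functions
    df, conversions = load_and_convert_data(args.file_path)
    df = handle_missing_values(df, args.threshold)
    train_df, test_df = split_data(df, args.test_size)
```

In this version, I used the argparse module to take the file path, the test/train split size, and the threshold for the number of missing values per record as command-line arguments. The user has to run the script with the correct arguments, for example:

```
python script.py dataset.csv 0.2 1
```

This runs the script using the "dataset.csv" file, a test_size of 0.2, and a missing-value threshold of 1. The main script uses these arguments in the functions to load, convert, and clean the dataset, then splits it into test and training sets. It is also worth mentioning that the script should be run from the command line rather than from an IDE.

Analysis

Honestly, not much analysis is needed here, given that ChatGPT explained what it did quite well. I will note something I had not mentioned before and give ChatGPT a pat on the back for its comments, which are expressive and concise.

Converting Our Code to a Streamlit App

Goal

Now let's say we want to make this code "production" ready (humor me). Let's have ChatGPT wrap these functions in a Streamlit app.

Prompt

Rewrite the above program as a Streamlit app, allowing the user to provide the same arguments as the command-line arguments above

Response

```python
import pandas as pd
import streamlit as st
from sklearn.model_selection import train_test_split

def load_and_convert_data(file_path):
    # Load dataset from csv to DataFrame
    df = pd.read_csv(file_path)

    # Initialize dictionary to track string to numeric conversions
    conversions = {}

    # Convert string values to numeric and track conversions in dictionary
    for col in df.columns:
        if df[col].dtype == object:
            conversions[col] = {val: i for i, val in enumerate(df[col].unique())}
            df[col] = df[col].map(conversions[col])

    return df, conversions

def handle_missing_values(df, threshold):
    # Check for missing values
    missing_values = df.isna().sum()

    # Impute missing values for records with one missing value
    for col in missing_values[missing_values == 1].index:
        df[col].fillna(df[col].median(), inplace=True)

    # Drop records with more than threshold missing value
    df.dropna(thresh=len(df.columns) - threshold, inplace=True)

    return df

def split_data(df, test_size):
    return train_test_split(df, test_size=test_size)

def main():
    st.set_page_config(page_title="Data Preprocessing", page_icon=":guardsman:", layout="wide")
    st.title("Data Preprocessing")

    file_path = st.text_input("Enter the path/name of the dataset csv file: ")
    test_size = st.number_input("Enter the train/test split size (decimal between 0 and 1): ", step=0.01, value=0.2)
    threshold = st.number_input("Enter the threshold for the number of missing values per record: ", step=1, value=1)

    if st.button("Process Data"):
        df, conversions = load_and_convert_data(file_path)
        df = handle_missing_values(df, threshold)
        train_df, test_df = split_data(df, test_size)
        st.success("Data preprocessing completed!")

if __name__ == "__main__":
    main()
```

This version is a Streamlit app that lets the user provide the same arguments as the command-line version. It uses the Streamlit library to create an interactive web app in which the user can enter the file path, the test/train split size, and the threshold for the number of missing values per record. The user can then click the "Process Data" button to run the script and preprocess the data. It uses the same functions as the previous version of the script to load, convert, and clean the dataset, then splits it into test and training sets.

To run the script, you need to have Streamlit installed; you can install it by running pip install streamlit in your terminal. Once it is installed, run the script with:

```
streamlit run script.py
```

Analysis

It did what it was supposed to do. ChatGPT took our functioning... functions and built a simple Streamlit web app with them.

[Figure: The Streamlit web app created with the script generated by ChatGPT]

Summary

ChatGPT did a perfectly competent job of creating, refactoring, extending, and adding a wrapper around a simple data preprocessing Python script that we wanted it to write for us from the specifications we provided. While it was not the most complex of programs, it was a step toward having ChatGPT be genuinely useful as a programming assistant. The results hit the mark at every step.

Next time, we will build a less trivial program with some added complexity and additional requirements. I'm excited to see how well ChatGPT performs under pressure. Hopefully you are too.

Original article: ChatGPT as a Python Programming Assistant (mvrlink.com)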
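A note on the missing-value step in the scripts above: the prompt asks for missing values to be handled per record, while `df.isna().sum()` counts them per column. The following is only a minimal sketch of a row-wise variant, assuming median imputation is still what is wanted. The name `handle_missing_rowwise` is hypothetical and is not part of the ChatGPT output.

```python
import numpy as np
import pandas as pd


def handle_missing_rowwise(df: pd.DataFrame, threshold: int = 1) -> pd.DataFrame:
    """Drop rows with more than `threshold` missing values, impute the rest.

    Hypothetical helper for illustration; not taken from the ChatGPT response.
    """
    # Count missing values per record (row), not per column.
    missing_per_row = df.isna().sum(axis=1)

    # Keep only the rows whose missing count is within the threshold.
    kept = df.loc[missing_per_row <= threshold].copy()

    # Fill the remaining gaps with each column's median, computed on the
    # kept rows. Returning a new frame avoids in-place chained modification.
    medians = kept.median(numeric_only=True)
    return kept.fillna(medians)


if __name__ == "__main__":
    demo = pd.DataFrame(
        {
            "a": [1.0, np.nan, 3.0, np.nan],
            "b": [10.0, 20.0, np.nan, np.nan],
            "c": [5.0, 6.0, 7.0, 8.0],
        }
    )
    print(handle_missing_rowwise(demo, threshold=1))
```

With `threshold=1`, the last demo row (two missing values) is dropped, and the two rows with a single gap receive the median of their column.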
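The command-line version above accepts any float for test_size and any integer for threshold. As a sketch of how argparse could reject out-of-range values before any data is loaded, assuming a split size strictly between 0 and 1 and a non-negative threshold are the intended constraints: the helper names `split_fraction`, `non_negative_int`, and `build_parser` are hypothetical.

```python
import argparse


def split_fraction(text: str) -> float:
    """Parse a train/test split size and require 0 < value < 1 (assumed range)."""
    value = float(text)
    if not 0.0 < value < 1.0:
        raise argparse.ArgumentTypeError(
            f"split size must be between 0 and 1, got {value}"
        )
    return value


def non_negative_int(text: str) -> int:
    """Parse a missing-value threshold and require it to be zero or greater."""
    value = int(text)
    if value < 0:
        raise argparse.ArgumentTypeError(
            f"threshold must not be negative, got {value}"
        )
    return value


def build_parser() -> argparse.ArgumentParser:
    # Same three positional arguments as the script in the article,
    # with validating type functions swapped in for float and int.
    parser = argparse.ArgumentParser(description="Preprocess a csv dataset")
    parser.add_argument("file_path", help="path/name of the dataset csv file")
    parser.add_argument("test_size", type=split_fraction,
                        help="train/test split size")
    parser.add_argument("threshold", type=non_negative_int,
                        help="threshold for the number of missing values per record")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.file_path, args.test_size, args.threshold)
```

Invoked as in the article (`python script.py dataset.csv 0.2 1`), parsing succeeds; a split size such as 1.5 makes argparse print a usage error and exit instead of failing later inside train_test_split.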
