Finance Data Copilot

🤖 Automated Quantitative Trading & Factor Extraction from Financial Reports

📖 Background

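**Research reports** are treasure troves of insights, often unveiling potential **factors** that can drive successful quantitative trading strategies. Yet with so many reports available, efficiently extracting the most valuable insights becomes a daunting task.

Furthermore, rather than hastily replicating factors from a report, it is essential to examine the underlying logic of their construction. Does the factor capture the essential market dynamics? How unique is it compared to the factors already in your library?

Therefore, there is an urgent need for a systematic framework that can manage this process effectively. This is where the **Finance Data Copilot** steps in.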

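🎥 Demo

🌟 Introduction

In this scenario, RDAgent demonstrates the process of extracting factors from financial research reports, implementing these factors, and analyzing their performance through Qlib backtesting. This process continually expands and refines the factor library.

Here is an enhanced outline of the steps: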

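Step 1: Hypothesis Generation 🔍

Generate and propose initial hypotheses based on insights from financial reports, with thorough reasoning and financial justification.

Step 2: Factor Creation ✨

Divide the work into tasks based on the hypothesis and the financial reports.
Each task involves developing, defining, and implementing a new financial factor, including its name, description, formulation, and variables.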
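
As a rough illustration of what one such task might carry, the sketch below models a factor task as a plain Python dataclass. The class name `FactorSpec`, its field names, and the example momentum factor are hypothetical, chosen only to mirror the four attributes listed above; they are not RDAgent's actual data structures.

```python
from dataclasses import dataclass, field


@dataclass
class FactorSpec:
    """Hypothetical container for one factor task (not an RDAgent class)."""

    name: str
    description: str
    formula: str  # formula written as plain or LaTeX-style text
    variables: dict[str, str] = field(default_factory=dict)

    def summary(self) -> str:
        """Render the task as a short human-readable block."""
        lines = [
            f"Factor: {self.name}",
            f"  {self.description}",
            f"  Formula: {self.formula}",
        ]
        lines += [f"  {var}: {meaning}" for var, meaning in sorted(self.variables.items())]
        return "\n".join(lines)


# Illustrative example only; this factor is not taken from any report.
momentum_10d = FactorSpec(
    name="Momentum_10D",
    description="Price change over the past 10 trading days.",
    formula=r"close_t / close_{t-10} - 1",
    variables={"close_t": "closing price on day t"},
)
print(momentum_10d.summary())
```

Keeping the formula and variable meanings next to the name makes it easier to compare a new factor with those already in the library before any code is written.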

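Step 3: Factor Implementation 👨‍💻

Implement the factor code based on the description, evolving it iteratively as a developer would.
Quantitatively validate the newly created factors.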
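
To make the implement-then-validate idea concrete, here is a minimal pure-Python sketch. The momentum factor, the made-up prices, and the specific checks (output length, no infinities, a cap on the NaN ratio) are assumptions for illustration; the source does not specify which validations RDAgent performs.

```python
import math


def momentum(closes: list[float], window: int) -> list[float]:
    """Return close[t] / close[t - window] - 1, NaN where history is too short."""
    values = []
    for t, price in enumerate(closes):
        if t < window or closes[t - window] == 0:
            values.append(math.nan)
        else:
            values.append(price / closes[t - window] - 1.0)
    return values


def validate(values: list[float], expected_len: int, max_nan_ratio: float = 0.5) -> bool:
    """Basic checks: correct length, no infinities, not too many NaNs."""
    if len(values) != expected_len:
        return False
    if any(math.isinf(v) for v in values):
        return False
    nan_count = sum(1 for v in values if math.isnan(v))
    return nan_count / max(expected_len, 1) <= max_nan_ratio


closes = [10.0, 10.5, 10.2, 11.0, 11.5, 11.2]  # made-up prices
factor = momentum(closes, window=2)
print(validate(factor, expected_len=len(closes)))
```

Checks of this kind catch shape and data-quality problems early, before the factor is passed on to a full backtest.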

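Step 4: Backtesting with Qlib 📉

Integrate the full dataset into the factor implementation code and prepare the factor library.
Run a backtest in Qlib using Alpha158 plus the newly developed factors with LGBModel to evaluate the new factors' effectiveness and performance.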

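| Dataset | Model | Factors | Data Split |
|---------|-------|---------|------------|
| CSI300 | LGBModel | Alpha158 Plus | Train: 2008-01-01 to 2014-12-31; Valid: 2015-01-01 to 2016-12-31; Test: 2017-01-01 to 2020-08-01 |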
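
The data split above can be written down as a simple mapping and sanity-checked before a backtest is launched. The sketch below is an assumption-laden illustration: the `segments` layout resembles the train/valid/test mapping commonly seen in Qlib dataset configurations, but the dictionary and the `check_segments` helper are hypothetical and not taken from RDAgent.

```python
from datetime import date

# Segment boundaries copied from the data split above; the dict layout is an assumption.
segments = {
    "train": (date(2008, 1, 1), date(2014, 12, 31)),
    "valid": (date(2015, 1, 1), date(2016, 12, 31)),
    "test": (date(2017, 1, 1), date(2020, 8, 1)),
}


def check_segments(segs: dict[str, tuple[date, date]]) -> bool:
    """Return True if every segment is well-formed and no two segments overlap."""
    ordered = sorted(segs.values())
    for start, end in ordered:
        if start > end:
            return False
    # Each segment must start strictly after the previous one ends.
    return all(prev[1] < nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


print(check_segments(segments))
```

Keeping the three periods disjoint and in chronological order helps avoid look-ahead leakage from the validation or test period into training.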

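Step 5: Feedback Analysis 🔍

Analyze the backtest results to assess performance.
Incorporate the feedback to refine hypotheses and improve the model.

Step 6: Hypothesis Refinement ♻️

Refine hypotheses based on the backtest feedback.
Repeat the process to continuously improve the model.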
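
The six steps above form a loop, which can be sketched as follows. This is a schematic under stated assumptions: `Trace`, `run_loop`, and the stub callables are hypothetical names invented for illustration, and the stubs only stand in for the steps that would otherwise call an LLM and Qlib.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Trace:
    """History of (hypothesis, feedback) pairs across iterations."""

    history: list[tuple[str, str]] = field(default_factory=list)


def run_loop(
    propose: Callable[[Trace], str],                   # Steps 1 and 6: generate or refine a hypothesis
    to_tasks: Callable[[str], list[str]],              # Step 2: split the hypothesis into factor tasks
    implement: Callable[[list[str]], dict[str, str]],  # Step 3: map each task to factor code
    backtest: Callable[[dict[str, str]], float],       # Step 4: turn factor code into a score
    summarize: Callable[[str, float], str],            # Step 5: turn the score into feedback
    iterations: int,
) -> Trace:
    trace = Trace()
    for _ in range(iterations):
        hypothesis = propose(trace)  # sees all earlier feedback
        tasks = to_tasks(hypothesis)
        code = implement(tasks)
        score = backtest(code)
        trace.history.append((hypothesis, summarize(hypothesis, score)))
    return trace


# Stub callables so the sketch runs end to end.
trace = run_loop(
    propose=lambda t: f"hypothesis #{len(t.history) + 1}",
    to_tasks=lambda h: [f"{h}: factor A", f"{h}: factor B"],
    implement=lambda tasks: {task: "# factor code placeholder" for task in tasks},
    backtest=lambda code: float(len(code)),
    summarize=lambda h, score: f"{h} scored {score}",
    iterations=2,
)
for hypothesis, feedback in trace.history:
    print(feedback)
```

Passing the accumulated trace back into `propose` is what lets each new hypothesis build on earlier backtest feedback.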

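⚡ Quick Start

Please refer to the installation part in Installation and Configuration to prepare your system dependencies.

You can try our demo by running the following commands:

🐍 Create a Conda Environment
- Create a new conda environment with Python (3.10 and 3.11 are well tested in our CI):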
```
conda create -n rdagent python=3.10
```
- Activate the environment:
```
conda activate rdagent
```
📦 Install RDAgent
- You can install the RDAgent package from PyPI:
```
pip install rdagent
```

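🚀 Run the Application

Download the financial reports you wish to extract factors from and store them in a folder of your choice.

Specifically, you can follow this example or use your own method: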

wget https://github.com/SunsetWolf/rdagent_resource/releases/download/reports/all_reports.zip
unzip all_reports.zip -d git_ignore_folder/reports

Run the application with the following command:

rdagent fin_factor_report --report_folder=git_ignore_folder/reports

Alternatively, you can store the paths of the reports in the JSON file that `report_result_json_file_path` points to. The format should be:

[
    "git_ignore_folder/report/fin_report1.pdf",
    "git_ignore_folder/report/fin_report2.pdf",
    "git_ignore_folder/report/fin_report3.pdf"
]

Then, run the application with the following command:
```
rdagent fin_factor_report
```
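
The JSON list described above can also be generated rather than written by hand. The helper below is a minimal sketch using only the Python standard library; the function name `write_report_list` is hypothetical, and it assumes the reports are PDF files located somewhere under a single folder.

```python
import json
from pathlib import Path


def write_report_list(report_folder: str, output_json: str) -> list[str]:
    """Collect PDF paths under report_folder and save them as a JSON list."""
    pdfs = sorted(p.as_posix() for p in Path(report_folder).rglob("*.pdf"))
    out = Path(output_json)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(pdfs, indent=4), encoding="utf-8")
    return pdfs
```

For example, calling `write_report_list("git_ignore_folder/reports", "git_ignore_folder/report_list.json")` would write the list to the default location of `report_result_json_file_path`.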

🛠️ Usage of Modules

Env Config

The following environment variables can be set in the `.env` file to customize the application's behavior:

pydantic settings rdagent.app.qlib_rd_loop.conf.FactorFromReportPropSetting

Bases: FactorBasePropSetting

JSON schema:

{
   "title": "FactorFromReportPropSetting",
   "type": "object",
   "properties": {
      "scen": {
         "default": "rdagent.scenarios.qlib.experiment.factor_from_report_experiment.QlibFactorFromReportScenario",
         "title": "Scen",
         "type": "string"
      },
      "knowledge_base": {
         "default": "",
         "title": "Knowledge Base",
         "type": "string"
      },
      "knowledge_base_path": {
         "default": "",
         "title": "Knowledge Base Path",
         "type": "string"
      },
      "hypothesis_gen": {
         "default": "rdagent.scenarios.qlib.proposal.factor_proposal.QlibFactorHypothesisGen",
         "title": "Hypothesis Gen",
         "type": "string"
      },
      "hypothesis2experiment": {
         "default": "rdagent.scenarios.qlib.proposal.factor_proposal.QlibFactorHypothesis2Experiment",
         "title": "Hypothesis2Experiment",
         "type": "string"
      },
      "coder": {
         "default": "rdagent.scenarios.qlib.developer.factor_coder.QlibFactorCoSTEER",
         "title": "Coder",
         "type": "string"
      },
      "runner": {
         "default": "rdagent.scenarios.qlib.developer.factor_runner.QlibFactorRunner",
         "title": "Runner",
         "type": "string"
      },
      "summarizer": {
         "default": "rdagent.scenarios.qlib.developer.feedback.QlibFactorExperiment2Feedback",
         "title": "Summarizer",
         "type": "string"
      },
      "evolving_n": {
         "default": 10,
         "title": "Evolving N",
         "type": "integer"
      },
      "report_result_json_file_path": {
         "default": "git_ignore_folder/report_list.json",
         "title": "Report Result Json File Path",
         "type": "string"
      },
      "max_factors_per_exp": {
         "default": 10000,
         "title": "Max Factors Per Exp",
         "type": "integer"
      },
      "is_report_limit_enabled": {
         "default": false,
         "title": "Is Report Limit Enabled",
         "type": "boolean"
      }
   },
   "additionalProperties": false
}

Config:

env_prefix: str = QLIB_FACTOR_
protected_namespaces: tuple = ()

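field is_report_limit_enabled: bool = False: If True, limits the number of reports processed; if False, processes all reports

field max_factors_per_exp: int = 10000: Maximum number of factors implemented per experiment

field report_result_json_file_path: str = 'git_ignore_folder/report_list.json': Path to the JSON file listing the research reports used for factor extraction

field scen: str = 'rdagent.scenarios.qlib.experiment.factor_from_report_experiment.QlibFactorFromReportScenario': Scenario class for Qlib factors extracted from reports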
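
As a hedged illustration of how these fields might map to `.env` entries, the sketch below builds variable names by joining the `QLIB_FACTOR_` prefix with the upper-cased field name. That naming rule is an assumption based on how pydantic settings prefixes usually work; the helper `env_line` and the override values are made up for the example.

```python
def env_line(prefix: str, field_name: str, value: object) -> str:
    """Build one `.env` line as PREFIX + upper-cased field name."""
    if isinstance(value, bool):
        rendered = "true" if value else "false"
    else:
        rendered = str(value)
    return f"{prefix}{field_name.upper()}={rendered}"


# Field names and the prefix come from the schema above; the values are illustrative.
overrides = {
    "report_result_json_file_path": "git_ignore_folder/report_list.json",
    "max_factors_per_exp": 6,
    "is_report_limit_enabled": True,
}
for name, value in overrides.items():
    print(env_line("QLIB_FACTOR_", name, value))
```

The printed lines can be pasted into the `.env` file; fields that are not overridden keep the defaults shown in the schema.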

pydantic settings rdagent.components.coder.factor_coder.config.FactorCoSTEERSettings

JSON schema:

{
   "title": "FactorCoSTEERSettings",
   "type": "object",
   "properties": {
      "coder_use_cache": {
         "default": false,
         "title": "Coder Use Cache",
         "type": "boolean"
      },
      "max_loop": {
         "default": 10,
         "title": "Max Loop",
         "type": "integer"
      },
      "fail_task_trial_limit": {
         "default": 20,
         "title": "Fail Task Trial Limit",
         "type": "integer"
      },
      "v1_query_former_trace_limit": {
         "default": 5,
         "title": "V1 Query Former Trace Limit",
         "type": "integer"
      },
      "v1_query_similar_success_limit": {
         "default": 5,
         "title": "V1 Query Similar Success Limit",
         "type": "integer"
      },
      "v2_query_component_limit": {
         "default": 1,
         "title": "V2 Query Component Limit",
         "type": "integer"
      },
      "v2_query_error_limit": {
         "default": 1,
         "title": "V2 Query Error Limit",
         "type": "integer"
      },
      "v2_query_former_trace_limit": {
         "default": 1,
         "title": "V2 Query Former Trace Limit",
         "type": "integer"
      },
      "v2_add_fail_attempt_to_latest_successful_execution": {
         "default": false,
         "title": "V2 Add Fail Attempt To Latest Successful Execution",
         "type": "boolean"
      },
      "v2_error_summary": {
         "default": false,
         "title": "V2 Error Summary",
         "type": "boolean"
      },
      "v2_knowledge_sampler": {
         "default": 1.0,
         "title": "V2 Knowledge Sampler",
         "type": "number"
      },
      "knowledge_base_path": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "Knowledge Base Path"
      },
      "new_knowledge_base_path": {
         "anyOf": [
            {
               "type": "string"
            },
            {
               "type": "null"
            }
         ],
         "default": null,
         "title": "New Knowledge Base Path"
      },
      "max_seconds": {
         "default": 1000000,
         "title": "Max Seconds",
         "type": "integer"
      },
      "data_folder": {
         "default": "git_ignore_folder/factor_implementation_source_data",
         "title": "Data Folder",
         "type": "string"
      },
      "data_folder_debug": {
         "default": "git_ignore_folder/factor_implementation_source_data_debug",
         "title": "Data Folder Debug",
         "type": "string"
      },
      "simple_background": {
         "default": false,
         "title": "Simple Background",
         "type": "boolean"
      },
      "file_based_execution_timeout": {
         "default": 120,
         "title": "File Based Execution Timeout",
         "type": "integer"
      },
      "select_method": {
         "default": "random",
         "title": "Select Method",
         "type": "string"
      },
      "python_bin": {
         "default": "python",
         "title": "Python Bin",
         "type": "string"
      }
   },
   "additionalProperties": false
}

Config:

env_prefix: str = FACTOR_CoSTEER_

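field coder_use_cache: bool = False: Whether to use the cache for the coder

field data_folder: str = 'git_ignore_folder/factor_implementation_source_data': Path to the folder containing the financial data (by default, the fundamental data in Qlib)

field data_folder_debug: str = 'git_ignore_folder/factor_implementation_source_data_debug': Path to the folder containing a subset of the financial data (for debugging)

field file_based_execution_timeout: int = 120: Timeout in seconds for each factor implementation execution

field knowledge_base_path: str | None = None: Path to the knowledge base

field max_loop: int = 10: Maximum number of task implementation loops

field max_seconds: int = 1000000

field new_knowledge_base_path: str | None = None: Path to the new knowledge base

field select_method: str = 'random': Method used to select factor implementations

field simple_background: bool = False: Whether to use simple background information for code feedback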

field v2_add_fail_attempt_to_latest_successful_execution: bool = False
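
To illustrate what a per-execution timeout such as `file_based_execution_timeout` and an interpreter setting such as `python_bin` could mean in practice, the sketch below runs a script in a child process with a time limit. This is not RDAgent's actual executor: the function `run_factor_script` is hypothetical, and `sys.executable` stands in for the configured interpreter.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_factor_script(script: Path, python_bin: str, timeout_s: int) -> tuple[bool, str]:
    """Run a factor script in a child process; report success and captured output."""
    try:
        result = subprocess.run(
            [python_bin, str(script)],
            capture_output=True,
            text=True,
            timeout=timeout_s,
            cwd=script.parent,
        )
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout_s} seconds"
    return result.returncode == 0, result.stdout + result.stderr


with tempfile.TemporaryDirectory() as workdir:
    script = Path(workdir) / "factor.py"
    script.write_text("print('factor computed')\n", encoding="utf-8")
    # 120 matches the documented default of file_based_execution_timeout.
    ok, output = run_factor_script(script, python_bin=sys.executable, timeout_s=120)
    print(ok, output.strip())
```

Running each implementation in its own process means that a slow or crashing factor script cannot stall or bring down the surrounding loop.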
