A Deep Reinforcement Learning-based Approach for Revenue Optimization in PV-Battery Storage Systems

November 28, 2025

Slide overview

ICECET 2025

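Daisuke Kodaira - Assistant Professor, Energy and Environment, University of Tsukuba. Current research topics include electric vehicle charging scheduling, blockchain for energy trading, and forecasting of solar power generation and energy demand. Feel free to get in touch about the content of the slides: kodaira.daisuke.gf[at]u.tsukuba.ac.jp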

Text of each page
1.

June 1, 2025, ICECET 2025, Paris
A Deep Reinforcement Learning-based Approach for Revenue Optimization in PV-Battery Storage Systems
Yuki Osone
Institute of Systems and Information Engineering, University of Tsukuba, Tsukuba, Japan
[email protected]

2.

Introduction
Background and Problem
• Imbalance penalties are incurred in electricity trading [1]
• The penalty comes from the gap between the day-ahead schedule and the real-time supply
• Battery owner: revenue (+ JPY), imbalance cost (− JPY)
[Diagram: PV-battery storage system (photovoltaic (PV) array and battery) and the electricity market]
Research Objective
Develop a PV-battery control algorithm that reduces imbalance costs while maximizing revenue
[1] Tokyo Electric Power Company Holdings, Inc., "Open-Platform Aggregation Business Demonstration Project" (in Japanese), https://sii.or.jp/vpp31/uploads/B_1_2_tepco.pdf.

3.

Previous Research
Category | Methods | Limitations / Strengths
Model-based | MPC, MILP [Abdullah et al., 2015] | Slow computation; hard to redesign
Reinforcement learning-based | DQN, DDPG [Karimi Madahi et al., 2024] | Rarely consider imbalance; weak forecasting-control integration
Our proposed deep reinforcement learning (DRL)-based | PPO (hybrid with MPC) | Fast and flexible; explicitly reduces imbalance loss

4.

Simulation Workflow
Battery charge/discharge planning flow on the prediction day:
Weather-forecast data → (Input) → [Ⅰ. Forecast] → (Input) → [Ⅱ. Control] → (Plan Submission) → [Ⅲ. Battery System], with a Feedback path
[Ⅰ. Forecast]
• PV forecast
• Electricity-price forecast
• Imbalance-price forecast
[Ⅱ. Control]
• Formulate the battery charge/discharge schedule
[Ⅱ. Control] is extended to a hybrid DRL + MPC control:
• Long-term strategy: [DRL] Proximal Policy Optimization
• Short-term optimization: Model Predictive Control

5.

Simulation Condition
Overview of Simulation Conditions
[Assumed scenario (prosumer side)]
◼ Owns a commercially available PV array (4 kW) and battery (4 kWh)
◼ Battery control is independent of household electricity consumption
◼ No charging from the grid is permitted
[Input data]
◼ Training data for the forecasters: observation data collected in Tsukuba City, Japan, 1 Apr 2022 to 31 Mar 2023
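The scenario above implies simple physical limits on each battery action. A minimal sketch of how such limits might be enforced is shown below. Only the 4 kWh capacity and the no-grid-charging rule come from the slide; the sign convention (positive = charge), the lossless battery, the empty initial state, and the names `Battery` and `apply_action` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Battery:
    capacity_kwh: float = 4.0  # battery size stated on the slide
    soc_kwh: float = 0.0       # stored energy; starting empty is an assumption


def apply_action(battery: Battery, action_kwh: float, pv_kwh: float) -> float:
    """Apply a charge (+) or discharge (-) request for one time step.

    Returns the energy actually moved (kWh) after clipping to what is
    physically possible. Losses are ignored (assumption).
    """
    if action_kwh >= 0:
        # Charging is limited by PV output (no grid charging) and free capacity.
        free_kwh = battery.capacity_kwh - battery.soc_kwh
        moved = min(action_kwh, pv_kwh, free_kwh)
    else:
        # Discharging is limited by the energy currently stored.
        moved = -min(-action_kwh, battery.soc_kwh)
    battery.soc_kwh += moved
    return moved
```

For example, a request to charge 3 kWh when only 1.5 kWh of PV energy is available would be clipped to 1.5 kWh under these assumptions.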

6.

Simulation Model
Formulate the battery-storage control problem as a Markov Decision Process
◼ State S_t: PV power output, electricity price, imbalance price, battery state of charge, previous action a_{t-1}, and time-series features (sin, cos)
◼ Action a_t: charge/discharge command given as a continuous value
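The state described above could be assembled as a flat feature vector. The sketch below is one possible encoding, not the authors' implementation: the feature order, the 24-hour period for the sin/cos features, and the function names `time_features` and `build_state` are assumptions.

```python
import math


def time_features(hour_of_day: float) -> tuple[float, float]:
    """Encode time of day on the unit circle so 23:00 and 00:00 are close.

    A 24-hour period is assumed; the slide only says "(sin, cos)".
    """
    angle = 2.0 * math.pi * hour_of_day / 24.0
    return math.sin(angle), math.cos(angle)


def build_state(pv_kw: float, price: float, imbalance_price: float,
                soc: float, prev_action: float, hour_of_day: float) -> list[float]:
    """Assemble the state S_t from the quantities listed on the slide."""
    sin_t, cos_t = time_features(hour_of_day)
    return [pv_kw, price, imbalance_price, soc, prev_action, sin_t, cos_t]
```

The cyclic encoding avoids the artificial jump a raw hour index would have at midnight, which is a common reason for using sin/cos time features.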

7.

Simulation Model
Reward design of the PPO model: R1, R2, R3
① Positive reward for discharge, R1: discharged energy (kWh) × electricity price (JPY/kWh)
② Negative reward for infeasible actions, R2:
• Charging beyond PV output
• Discharging beyond remaining battery capacity
③ Negative reward considering imbalance loss, R3: PV output (kWh) × forecast error ε × imbalance price (JPY/kWh)
• Forecast error ε ~ N(0, σ²) is used to synthetically reproduce the imbalance losses
• σ is calculated from actual forecast errors in past data (σ = 0.22)
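The three reward terms above can be sketched as a single function. This is an illustration under stated assumptions, not the reward used in the talk: the slide gives the form of R1 and R3 and σ = 0.22, but the fixed penalty size for R2, the use of |ε| so that R3 is always non-positive, the sign convention (positive = charge), and the names `sample_forecast_error` and `reward` are all assumptions.

```python
import random

SIGMA = 0.22  # standard deviation of the forecast error, from the slide


def sample_forecast_error(rng: random.Random) -> float:
    """Draw a synthetic forecast error from N(0, SIGMA^2)."""
    return rng.gauss(0.0, SIGMA)


def reward(requested_kwh: float, pv_kwh: float, soc_kwh: float,
           price: float, imbalance_price: float, eps: float,
           penalty: float = 1.0) -> float:
    """Sum of R1 + R2 + R3 for one step (requested_kwh > 0 means charge)."""
    charge = max(requested_kwh, 0.0)
    discharge = max(-requested_kwh, 0.0)

    # R1: revenue from the energy that can actually be discharged.
    r1 = min(discharge, soc_kwh) * price

    # R2: fixed penalty (assumed size) for the two infeasible cases on the slide.
    infeasible = charge > pv_kwh or discharge > soc_kwh
    r2 = -penalty if infeasible else 0.0

    # R3: imbalance loss; abs() is an assumption so the term is never positive.
    r3 = -pv_kwh * abs(eps) * imbalance_price

    return r1 + r2 + r3
```

A typical call would draw `eps = sample_forecast_error(random.Random(0))` once per step and pass it in, which keeps the reward function itself deterministic and easy to test.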

8.

Simulation result
Two-Day Charge/Discharge Schedule Results (8–9 Sep 2022)
[Figure: charge and discharge schedules of the Proposed Model (DRL, imbalance-aware) and the Conventional Model (DRL, imbalance-unaware), with a surge in imbalance prices marked]
The proposed model follows electricity prices and adapts to surges in imbalance prices, whereas the conventional model responds only to electricity prices.

9.

Simulation result
Revenue and Cost Comparison over One Month of Simulation
Model | Net Revenue | Imbalance Loss
Proposed Model (DRL, imbalance-aware) | +35% | -47%
Conventional Model (DRL, imbalance-unaware) | +33% | -29%
*Change relative to the rule-based model
[Figure: comparison of the Rule-Based Model, the Proposed Model, and the Conventional Model]
The proposed model outperforms the conventional model in both profit and penalty reduction.
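The percentages above are changes relative to the rule-based model. A minimal sketch of that calculation follows; the function name `relative_change_pct` is hypothetical, and the slide does not give the underlying JPY amounts, so any inputs used with it here would be illustrative only.

```python
def relative_change_pct(value: float, baseline: float) -> float:
    """Percentage change of `value` relative to a non-zero `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / abs(baseline) * 100.0
```

Under this definition, a positive result for net revenue and a negative result for imbalance loss both indicate an improvement over the baseline.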

10.

Simulation result
Seasonal Revenue Comparison
Net revenue of the proposed model compared to the conventional model:
Period | April | July | October | January | Overall
Change | +4.2% | +6.8% | +2.5% | +1.4% | +4.2%
Throughout the year, the proposed model yields higher revenue than the conventional model.

11.

Summary
Objective: Develop a PV-battery system control algorithm for reducing imbalance costs and optimizing revenue.
Methodology: Integrated forecasting and control using a deep reinforcement learning-based battery control approach.
Results: The proposed model increased revenue by 35% and reduced imbalance cost by 47% compared to the rule-based model.
Next: Demonstrate on our laboratory's physical battery system.

12.

Thank you for your attention. Questions? [email protected]