diff --git a/Reports/November/verslag.synctex.gz b/Reports/November/verslag.synctex.gz deleted file mode 100644 index e8b9bb7..0000000 Binary files a/Reports/November/verslag.synctex.gz and /dev/null differ diff --git a/Reports/Thesis/sections/results.tex b/Reports/Thesis/sections/results.tex index 630256c..4ad407f 100644 --- a/Reports/Thesis/sections/results.tex +++ b/Reports/Thesis/sections/results.tex @@ -47,14 +47,4 @@ A lot of data is available but only the most relevant data needs to be used. Exp \input{sections/results/gru} - -\newpage -\subsection{Diffusion} -% TODO: needed to explain again? -Another type of model that can be used to generatively model the NRV is the diffusion model. This type of model is very popular for image generation. In the context of images, the diffusion model is trained by iteratively adding noise to a training image until there is only noise left. From this noise, the model tries to reverse the diffusion process to get the original image back. To sample new images using this model, a noise vector is sampled and iteratively denoised by the model. This process results in a new image. - -This training process can also be used for other data types. An image is just a 2D grid of data points. A time series can be seen as a 1D sequence of data points. The diffusion model can thus be trained on the NRV data to generate new samples for a certain day based on a given input. - -Once the diffusion model is trained, it can be used efficiently to generate new samples. The model can generate samples in parallel, which is not possible with autoregressive models. It combines the parallel sample generation of the non-autoregressive models while the quarter NRV values still depend on each other. A batch of noise vectors can be sampled and passed through the model in one batch to generate the new samples. The generated samples contain the 96 NRV values for the next day without needing to sample every quarter sequentially. 
- -TODO: Visualization of the diffusion model in the context of the NRV data. +\input{sections/results/diffusion} diff --git a/Reports/Thesis/sections/results/diffusion.tex b/Reports/Thesis/sections/results/diffusion.tex new file mode 100644 index 0000000..3784ae0 --- /dev/null +++ b/Reports/Thesis/sections/results/diffusion.tex @@ -0,0 +1,13 @@ +\subsection{Diffusion} +Another type of model that can generatively model the NRV is the diffusion model. This type of model is very popular for image generation. In the context of images, the diffusion model is trained by iteratively adding noise to a training image until only noise remains. The model then learns to reverse this diffusion process and recover the original image. To sample a new image, a noise vector is drawn and iteratively denoised by the model, resulting in a new image. + +This training process can also be applied to other data types. An image is just a 2D grid of data points, and a time series can be seen as a 1D sequence of data points. The diffusion model can thus be trained on the NRV data to generate new samples for a certain day, conditioned on a given input. + +Once the diffusion model is trained, it can generate new samples efficiently. The model generates samples in parallel, which is not possible with autoregressive models: it combines the parallel sample generation of the non-autoregressive models with the mutual dependence between the quarterly NRV values. A batch of noise vectors can be sampled and passed through the model in a single batch to generate the new samples. The generated samples contain the 96 NRV values for the next day without sampling every quarter sequentially. + +TODO: Visualization of the diffusion model in the context of the NRV data. + +The model is trained in a completely different way than the quantile regression models. 
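The forward (noising) and reverse (denoising) processes described in the added section can be sketched as follows. This is a minimal illustration, not the thesis implementation: the number of noise steps, the linear beta schedule, and the model signature are all assumed values for demonstration.

```python
import torch

# Minimal DDPM-style sketch for a 96-value daily NRV series.
# noise_steps and the beta range are illustrative assumptions.
noise_steps = 50
betas = torch.linspace(1e-4, 0.02, noise_steps)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def add_noise(x0, t):
    """Forward process: corrupt a clean series x0 to step t in closed form."""
    noise = torch.randn_like(x0)
    xt = alpha_bars[t].sqrt() * x0 + (1 - alpha_bars[t]).sqrt() * noise
    return xt, noise

def sample(model, cond, n_samples):
    """Reverse process: denoise a whole batch of noise vectors in parallel,
    so all 96 quarters of each sample are produced together."""
    x = torch.randn(n_samples, 96)
    for t in reversed(range(noise_steps)):
        pred_noise = model(x, torch.full((n_samples,), t), cond)
        coef = betas[t] / (1 - alpha_bars[t]).sqrt()
        x = (x - coef * pred_noise) / alphas[t].sqrt()
        if t > 0:  # no noise is added at the final step
            x = x + betas[t].sqrt() * torch.randn_like(x)
    return x
```

The batched `sample` call is what gives the parallel generation mentioned in the text: every noise vector in the batch is denoised simultaneously, rather than quarter by quarter.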
A simple implementation of the Denoising Diffusion Probabilistic Model (DDPM) is used to perform the experiments. More complex implementations with more advanced techniques could be used to improve the results, but this is beyond the scope of this thesis. The goal is to show that more recent generative models can also be used to model the NRV data. These results can then be compared to the quantile regression models to see whether the diffusion model generates better samples. + +% TODO: In background information? +First, the model architecture needs to be chosen. The model takes multiple inputs: the noisy NRV time series, the positional encoding of the current denoising step, and the conditional input features. The model needs to predict the noise in the current time series, which can then be denoised by subtracting the predicted noise in every denoising step. Multiple model architectures can be used, as long as the model can predict the noise in the time series. Here, a simple feedforward neural network is used, consisting of multiple linear layers with ReLU activation functions. To predict the noise in a noisy time series, the index of the current denoising step must also be provided. This integer is transformed into a vector using sine and cosine functions. The positional encoding is then concatenated with the noisy time series and the conditional input features, and the resulting tensor is passed through the first linear layer and activation function of the neural network, producing a tensor of the chosen hidden size. Before this tensor is passed to the next layer, the positional encoding and conditional input features are concatenated to it again. This process is repeated until the last layer is reached, so that every layer in the network has the information needed to predict the noise in the time series. The output of the last layer is the predicted noise in the time series. 
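The conditional noise-predicting network described above can be sketched as a small module. This is a hedged illustration only: the layer sizes, `time_dim`, and conditional feature dimension are assumptions chosen to mirror the hyperparameters mentioned in the training script, not the thesis code itself.

```python
import math
import torch
import torch.nn as nn

def timestep_embedding(t, dim):
    """Encode the integer denoising step with sine/cosine functions."""
    half = dim // 2
    freqs = torch.exp(-math.log(10000.0) * torch.arange(half) / half)
    args = t.float().unsqueeze(-1) * freqs
    return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)

class NoiseMLP(nn.Module):
    """Predicts the noise in a noisy 96-value NRV series. The timestep
    embedding and conditional features are re-concatenated before every
    linear layer, as described in the text."""

    def __init__(self, series_len=96, cond_dim=8, time_dim=8, hidden=256, n_layers=3):
        super().__init__()
        extra = cond_dim + time_dim
        # first layer sees the raw series; later layers see the hidden state,
        # each time with the embedding and conditional features appended
        sizes = [series_len + extra] + [hidden + extra] * (n_layers - 1)
        self.layers = nn.ModuleList(nn.Linear(s, hidden) for s in sizes)
        self.out = nn.Linear(hidden + extra, series_len)
        self.time_dim = time_dim

    def forward(self, x, t, cond):
        emb = timestep_embedding(t, self.time_dim)
        for layer in self.layers:
            x = torch.relu(layer(torch.cat([x, emb, cond], dim=-1)))
        return self.out(torch.cat([x, emb, cond], dim=-1))
```

Training then reduces to the MSE objective stated in the text: `loss = nn.functional.mse_loss(model(xt, t, cond), noise)`, where `noise` is the noise added in the forward process.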
The model is trained by minimizing the mean squared error between the predicted noise and the real noise in the time series. diff --git a/Reports/Thesis/verslag.aux b/Reports/Thesis/verslag.aux index 63cf146..b61ee93 100644 --- a/Reports/Thesis/verslag.aux +++ b/Reports/Thesis/verslag.aux @@ -79,18 +79,18 @@ \newlabel{fig:gru_model_sample_comparison}{{12}{35}{Comparison of the autoregressive and non-autoregressive GRU model examples.\relax }{figure.caption.21}{}} \@writefile{lof}{\contentsline {figure}{\numberline {13}{\ignorespaces Over/underestimation of the quantiles for the autoregressive and non-autoregressive GRU models. Both the quantile performance for the training and test set are shown. The plots are generated using the input features NRV, Load, Wind, PV, Net Position, and the quarter embedding (only for the autoregressive model).\relax }}{36}{figure.caption.22}\protected@file@percent } \newlabel{fig:gru_model_quantile_over_underestimation}{{13}{36}{Over/underestimation of the quantiles for the autoregressive and non-autoregressive GRU models. Both the quantile performance for the training and test set are shown. 
The plots are generated using the input features NRV, Load, Wind, PV, Net Position, and the quarter embedding (only for the autoregressive model).\relax }{figure.caption.22}{}} -\@writefile{toc}{\contentsline {subsection}{\numberline {6.3}Diffusion}{37}{subsection.6.3}\protected@file@percent } -\@writefile{toc}{\contentsline {section}{\numberline {7}Policies for battery optimization}{37}{section.7}\protected@file@percent } -\@writefile{toc}{\contentsline {subsection}{\numberline {7.1}Baselines}{37}{subsection.7.1}\protected@file@percent } -\@writefile{toc}{\contentsline {subsection}{\numberline {7.2}Policies using NRV predictions}{37}{subsection.7.2}\protected@file@percent } -\abx@aux@page{6}{38} -\abx@aux@page{7}{38} -\abx@aux@page{8}{38} -\abx@aux@page{9}{38} +\@writefile{toc}{\contentsline {subsection}{\numberline {6.3}Diffusion}{36}{subsection.6.3}\protected@file@percent } +\@writefile{toc}{\contentsline {section}{\numberline {7}Policies for battery optimization}{38}{section.7}\protected@file@percent } +\@writefile{toc}{\contentsline {subsection}{\numberline {7.1}Baselines}{38}{subsection.7.1}\protected@file@percent } +\@writefile{toc}{\contentsline {subsection}{\numberline {7.2}Policies using NRV predictions}{38}{subsection.7.2}\protected@file@percent } +\abx@aux@page{6}{39} +\abx@aux@page{7}{39} +\abx@aux@page{8}{39} +\abx@aux@page{9}{39} \abx@aux@read@bbl@mdfivesum{5DC935CC8C8FAB8A3CAF97A486ED2386} \abx@aux@read@bblrerun \abx@aux@defaultrefcontext{0}{dumas_deep_2022}{nyt/global//global/global} \abx@aux@defaultrefcontext{0}{lu_scenarios_2022}{nyt/global//global/global} \abx@aux@defaultrefcontext{0}{poggi_electricity_2023}{nyt/global//global/global} \abx@aux@defaultrefcontext{0}{weron_electricity_2014}{nyt/global//global/global} -\gdef \@abspage@last{39} +\gdef \@abspage@last{40} diff --git a/Reports/Thesis/verslag.log b/Reports/Thesis/verslag.log index cdf8736..8305d70 100644 --- a/Reports/Thesis/verslag.log +++ b/Reports/Thesis/verslag.log @@ -1,4 +1,4 @@ 
-This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023) (preloaded format=pdflatex 2023.9.17) 7 MAY 2024 00:43 +This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023) (preloaded format=pdflatex 2023.9.17) 7 MAY 2024 23:38 entering extended mode restricted \write18 enabled. file:line:error style messages enabled. @@ -1474,7 +1474,7 @@ Underfull \hbox (badness 4582) in paragraph at lines 132--132 []\T1/LinuxLibertineT-TLF/m/n/12 Figure 13: |Over/underestimation of the quan-tiles for the au-tore-gres-sive and non- [] -) [35 <./images/quantile_regression/aqr_gru_model_examples/AQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_864.png> <./images/quantile_regression/naqr_gru_model_examples/NAQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_864.png> <./images/quantile_regression/aqr_gru_model_examples/AQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_4320.png> <./images/quantile_regression/naqr_gru_model_examples/NAQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_4320.png> <./images/quantile_regression/aqr_gru_model_examples/AQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_6336.png> <./images/quantile_regression/naqr_gru_model_examples/NAQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_6336.png> <./images/quantile_regression/aqr_gru_model_examples/AQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_7008.png> <./images/quantile_regression/naqr_gru_model_examples/NAQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_7008.png>] [36 <./images/quantile_regression/quantile_performance/AQR_GRU_QP_Train.jpeg> <./images/quantile_regression/quantile_performance/AQR_GRU_QP_Test.jpeg> <./images/quantile_regression/quantile_performance/NAQR_GRU_QP_Train.jpeg> <./images/quantile_regression/quantile_performance/NAQR_GRU_QP_Test.jpeg>]) [37] [38] (./verslag.aux (./sections/introduction.aux) (./sections/background.aux) (./sections/policies.aux) (./sections/literature_study.aux)) +) [35 <./images/quantile_regression/aqr_gru_model_examples/AQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_864.png> 
<./images/quantile_regression/naqr_gru_model_examples/NAQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_864.png> <./images/quantile_regression/aqr_gru_model_examples/AQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_4320.png> <./images/quantile_regression/naqr_gru_model_examples/NAQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_4320.png> <./images/quantile_regression/aqr_gru_model_examples/AQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_6336.png> <./images/quantile_regression/naqr_gru_model_examples/NAQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_6336.png> <./images/quantile_regression/aqr_gru_model_examples/AQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_7008.png> <./images/quantile_regression/naqr_gru_model_examples/NAQR_GRU_NRV_Load_Wind_PV_NP_QE-Sample_7008.png>] (./sections/results/diffusion.tex [36 <./images/quantile_regression/quantile_performance/AQR_GRU_QP_Train.jpeg> <./images/quantile_regression/quantile_performance/AQR_GRU_QP_Test.jpeg> <./images/quantile_regression/quantile_performance/NAQR_GRU_QP_Train.jpeg> <./images/quantile_regression/quantile_performance/NAQR_GRU_QP_Test.jpeg>])) [37] [38] [39] (./verslag.aux (./sections/introduction.aux) (./sections/background.aux) (./sections/policies.aux) (./sections/literature_study.aux)) LaTeX Warning: There were undefined references. @@ -1490,18 +1490,18 @@ Package logreq Info: Writing requests to 'verslag.run.xml'. ) Here is how much of TeX's memory you used: - 27081 strings out of 476025 - 505488 string characters out of 5790017 + 27088 strings out of 476025 + 505676 string characters out of 5790017 1883388 words of memory out of 5000000 - 46940 multiletter control sequences out of 15000+600000 + 46943 multiletter control sequences out of 15000+600000 603223 words of font info for 88 fonts, out of 8000000 for 9000 1141 hyphenation exceptions out of 8191 83i,16n,131p,2100b,5180s stack positions out of 10000i,1000n,20000p,200000b,200000s -Output written on verslag.pdf (39 pages, 7863303 bytes). +Output written on verslag.pdf (40 pages, 7865015 bytes). 
PDF statistics: - 575 PDF objects out of 1000 (max. 8388607) - 423 compressed objects within 5 object streams - 103 named destinations out of 1000 (max. 500000) + 579 PDF objects out of 1000 (max. 8388607) + 426 compressed objects within 5 object streams + 104 named destinations out of 1000 (max. 500000) 486 words of extra memory for PDF output out of 10000 (max. 10000000) diff --git a/Reports/Thesis/verslag.pdf b/Reports/Thesis/verslag.pdf index b8c47c7..09a98e7 100644 Binary files a/Reports/Thesis/verslag.pdf and b/Reports/Thesis/verslag.pdf differ diff --git a/Reports/Thesis/verslag.synctex.gz b/Reports/Thesis/verslag.synctex.gz deleted file mode 100644 index 4ad7d4d..0000000 Binary files a/Reports/Thesis/verslag.synctex.gz and /dev/null differ diff --git a/Reports/Thesis/verslag.tex b/Reports/Thesis/verslag.tex index 231c046..0d28b1c 100644 --- a/Reports/Thesis/verslag.tex +++ b/Reports/Thesis/verslag.tex @@ -181,7 +181,7 @@ \input{sections/results} - +\newpage \section{Policies for battery optimization} \subsection{Baselines} diff --git a/Reports/Thesis/verslag.toc b/Reports/Thesis/verslag.toc index f550c04..2a0f908 100644 --- a/Reports/Thesis/verslag.toc +++ b/Reports/Thesis/verslag.toc @@ -25,7 +25,7 @@ \contentsline {subsubsection}{\numberline {6.2.1}Linear Model}{22}{subsubsection.6.2.1}% \contentsline {subsubsection}{\numberline {6.2.2}Non-Linear Model}{29}{subsubsection.6.2.2}% \contentsline {subsubsection}{\numberline {6.2.3}GRU Model}{32}{subsubsection.6.2.3}% -\contentsline {subsection}{\numberline {6.3}Diffusion}{37}{subsection.6.3}% -\contentsline {section}{\numberline {7}Policies for battery optimization}{37}{section.7}% -\contentsline {subsection}{\numberline {7.1}Baselines}{37}{subsection.7.1}% -\contentsline {subsection}{\numberline {7.2}Policies using NRV predictions}{37}{subsection.7.2}% +\contentsline {subsection}{\numberline {6.3}Diffusion}{36}{subsection.6.3}% +\contentsline {section}{\numberline {7}Policies for battery 
optimization}{38}{section.7}% +\contentsline {subsection}{\numberline {7.1}Baselines}{38}{subsection.7.1}% +\contentsline {subsection}{\numberline {7.2}Policies using NRV predictions}{38}{subsection.7.2}% diff --git a/src/trainers/diffusion_trainer.py b/src/trainers/diffusion_trainer.py index f16eed6..0220fb5 100644 --- a/src/trainers/diffusion_trainer.py +++ b/src/trainers/diffusion_trainer.py @@ -59,13 +59,16 @@ def sample_diffusion( # evenly spaces 4 intermediate samples to append between 1 and noise_steps if intermediate_samples: - spacing = (noise_steps - 1) // 4 - if i % spacing == 0: + first_quarter_end = (noise_steps - 1) // 4 + spacing = (first_quarter_end - 1) // 4 + + # save 1, 1 + spacing, 1 + 2*spacing, 1 + 3*spacing + if i % spacing == 1 and i <= first_quarter_end: intermediate_samples_list.append(x) x = torch.clamp(x, -1.0, 1.0) if len(intermediate_samples_list) > 0: - return x, intermediate_samples_list + return x, intermediate_samples_list[-4:] return x @@ -260,6 +263,9 @@ class DiffusionTrainer: self.model = torch.load("checkpoint.pt") self.model.to(self.device) + self.debug_plots(task, True, train_loader, train_sample_indices, -1) + self.debug_plots(task, False, test_loader, test_sample_indices, -1) + _, generated_sampels = self.test(test_loader, -1, task) # self.policy_evaluator.plot_profits_table() if self.policy_evaluator: @@ -371,7 +377,6 @@ class DiffusionTrainer: return fig - def debug_plots(self, task, training: bool, data_loader, sample_indices, epoch): for actual_idx, idx in sample_indices.items(): features, target, _ = data_loader.dataset[idx] @@ -381,69 +386,93 @@ class DiffusionTrainer: self.model.eval() with torch.no_grad(): - samples, intermediates = ( - self.sample(self.model, 100, features, True) - ) + samples, intermediates = self.sample(self.model, 100, features, True) samples = samples.cpu().numpy() samples = self.data_processor.inverse_transform(samples) target = self.data_processor.inverse_transform(target) - # list to tensor 
intermediate samples - intermediates = torch.stack(intermediates) + if epoch == -1: + # list to tensor intermediate samples + intermediates = torch.stack(intermediates) + + intermediate_fig1 = self.plot_from_samples( + self.data_processor.inverse_transform( + intermediates[0].cpu().numpy() + ), + target, + ) + + intermediate_fig2 = self.plot_from_samples( + self.data_processor.inverse_transform( + intermediates[1].cpu().numpy() + ), + target, + ) + + intermediate_fig3 = self.plot_from_samples( + self.data_processor.inverse_transform( + intermediates[2].cpu().numpy() + ), + target, + ) + + intermediate_fig4 = self.plot_from_samples( + self.data_processor.inverse_transform( + intermediates[3].cpu().numpy() + ), + target, + ) + + # report the intermediate figs to clearml + task.get_logger().report_matplotlib_figure( + title=( + f"Training Intermediates {actual_idx}" + if training + else f"Testing Intermediates {actual_idx}" + ), + series=f"Sample intermediate 1", + iteration=epoch, + figure=intermediate_fig1, + report_image=True, + ) + + task.get_logger().report_matplotlib_figure( + title=( + f"Training Intermediates {actual_idx}" + if training + else f"Testing Intermediates {actual_idx}" + ), + series=f"Sample intermediate 2", + iteration=epoch, + figure=intermediate_fig2, + report_image=True, + ) + + task.get_logger().report_matplotlib_figure( + title=( + f"Training Intermediates {actual_idx}" + if training + else f"Testing Intermediates {actual_idx}" + ), + series=f"Sample intermediate 3", + iteration=epoch, + figure=intermediate_fig3, + report_image=True, + ) + + task.get_logger().report_matplotlib_figure( + title=( + f"Training Intermediates {actual_idx}" + if training + else f"Testing Intermediates {actual_idx}" + ), + series=f"Sample intermediate 4", + iteration=epoch, + figure=intermediate_fig4, + report_image=True, + ) fig = self.plot_from_samples(samples, target) - intermediate_fig1 = self.plot_from_samples( - 
self.data_processor.inverse_transform(intermediates[0].cpu().numpy()), target - ) - - intermediate_fig2 = self.plot_from_samples( - self.data_processor.inverse_transform(intermediates[1].cpu().numpy()), target - ) - - intermediate_fig3 = self.plot_from_samples( - self.data_processor.inverse_transform(intermediates[2].cpu().numpy()), target - ) - - intermediate_fig4 = self.plot_from_samples( - self.data_processor.inverse_transform(intermediates[3].cpu().numpy()), target - ) - - - # report the intermediate figs to clearml - task.get_logger().report_matplotlib_figure( - title=f"Training Intermediates {actual_idx}" if training else f"Testing Intermediates {actual_idx}", - series=f"Sample intermediate 1", - iteration=epoch, - figure=intermediate_fig1, - report_image=True - ) - - task.get_logger().report_matplotlib_figure( - title=f"Training Intermediates {actual_idx}" if training else f"Testing Intermediates {actual_idx}", - series=f"Sample intermediate 2", - iteration=epoch, - figure=intermediate_fig2, - report_image=True - ) - - task.get_logger().report_matplotlib_figure( - title=f"Training Intermediates {actual_idx}" if training else f"Testing Intermediates {actual_idx}", - series=f"Sample intermediate 3", - iteration=epoch, - figure=intermediate_fig3, - report_image=True - ) - - task.get_logger().report_matplotlib_figure( - title=f"Training Intermediates {actual_idx}" if training else f"Testing Intermediates {actual_idx}", - series=f"Sample intermediate 4", - iteration=epoch, - figure=intermediate_fig4, - report_image=True - ) - - - task.get_logger().report_matplotlib_figure( title="Training" if training else "Testing", diff --git a/src/training_scripts/diffusion_training.py b/src/training_scripts/diffusion_training.py index b7dd770..c5e0e65 100644 --- a/src/training_scripts/diffusion_training.py +++ b/src/training_scripts/diffusion_training.py @@ -2,7 +2,7 @@ from src.utils.clearml import ClearMLHelper clearml_helper = 
ClearMLHelper(project_name="Thesis/NrvForecast") task = clearml_helper.get_task( - task_name="Diffusion Training: hidden_sizes=[256, 256, 256], lr=0.0001, time_dim=8" + task_name="Diffusion Training: hidden_sizes=[256, 256, 256], lr=0.0001, time_dim=8 + Load + PV + Wind + NP" ) task.execute_remotely(queue_name="default", exit_process=True) @@ -19,19 +19,16 @@ from src.policies.PolicyEvaluator import PolicyEvaluator data_config = DataConfig() data_config.NRV_HISTORY = True -data_config.LOAD_HISTORY = False -data_config.LOAD_FORECAST = False +data_config.LOAD_HISTORY = True +data_config.LOAD_FORECAST = True -data_config.PV_FORECAST = False -data_config.PV_HISTORY = False +data_config.PV_FORECAST = True +data_config.PV_HISTORY = True -data_config.WIND_FORECAST = False -data_config.WIND_HISTORY = False +data_config.WIND_FORECAST = True +data_config.WIND_HISTORY = True -data_config.QUARTER = False -data_config.DAY_OF_WEEK = False - -data_config.NOMINAL_NET_POSITION = False +data_config.NOMINAL_NET_POSITION = True data_config = task.connect(data_config, name="data_features")