Worked further on thesis

This commit is contained in:
2024-05-08 17:53:19 +02:00
parent d9b6f34e97
commit 8a2e1ce7d5
18 changed files with 636 additions and 620 deletions

View File

@@ -47,14 +47,4 @@ A lot of data is available but only the most relevant data needs to be used. Exp
\input{sections/results/gru}
\newpage
\subsection{Diffusion}
% TODO: needed to explain again?
Another type of model that can be used to generatively model the NRV is the diffusion model. This class of models is very popular for image generation. In the image setting, the model is trained by iteratively adding noise to a training image until only noise remains; the model then learns to reverse this diffusion process and recover the original image. To sample a new image, a noise vector is drawn and iteratively denoised by the model.
This training process can also be applied to other data types. An image is just a 2D grid of data points, while a time series can be seen as a 1D sequence of data points. The diffusion model can thus be trained on the NRV data to generate new samples for a given day, conditioned on a given input.
Once trained, the diffusion model can generate new samples efficiently. Unlike autoregressive models, it generates samples in parallel: it combines the parallel sample generation of non-autoregressive models while the quarter-hourly NRV values still depend on each other. A batch of noise vectors can be sampled and passed through the model in a single batch, yielding samples that contain all 96 NRV values for the next day without sampling each quarter sequentially.
TODO: Visualization of the diffusion model in the context of the NRV data.
\input{sections/results/diffusion}

View File

@@ -0,0 +1,42 @@
\subsection{Diffusion}
Another type of model that can be used to generatively model the NRV is the diffusion model. This class of models is very popular for image generation. In the image setting, the model is trained by iteratively adding noise to a training image until only noise remains; the model then learns to reverse this diffusion process and recover the original image. To sample a new image, a noise vector is drawn and iteratively denoised by the model.
This training process can also be applied to other data types. An image is just a 2D grid of data points, while a time series can be seen as a 1D sequence of data points. The diffusion model can thus be trained on the NRV data to generate new samples for a given day, conditioned on a given input.
Once trained, the diffusion model can generate new samples efficiently. Unlike autoregressive models, it generates samples in parallel: it combines the parallel sample generation of non-autoregressive models while the quarter-hourly NRV values still depend on each other. A batch of noise vectors can be sampled and passed through the model in a single batch, yielding samples that contain all 96 NRV values for the next day without sampling each quarter sequentially.
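The parallel sampling described above can be sketched as a standard DDPM reverse loop, in which all 96 quarters of every sample in the batch are denoised together. This is a minimal illustration, not the thesis implementation; the function name, the noise schedule values, and the `model(x, t, cond)` signature are assumptions for the sketch.

```python
import torch


@torch.no_grad()
def sample_batch(model, features, n_samples=100, noise_steps=300,
                 ts_length=96, beta_start=1e-4, beta_end=0.02):
    """Sketch: generate a batch of full-day NRV samples in parallel (DDPM)."""
    beta = torch.linspace(beta_start, beta_end, noise_steps)
    alpha = 1.0 - beta
    alpha_bar = torch.cumprod(alpha, dim=0)

    # start from pure noise; all 96 quarters are denoised simultaneously
    x = torch.randn(n_samples, ts_length)
    cond = features.expand(n_samples, -1)  # same conditional input for every sample
    for t in reversed(range(noise_steps)):
        t_idx = torch.full((n_samples,), t, dtype=torch.long)
        eps = model(x, t_idx, cond)  # predicted noise at this step
        coef = beta[t] / torch.sqrt(1.0 - alpha_bar[t])
        x = (x - coef * eps) / torch.sqrt(alpha[t])  # DDPM posterior mean
        if t > 0:
            x = x + torch.sqrt(beta[t]) * torch.randn_like(x)
    return x  # shape: (n_samples, ts_length)
```

Because the loop runs over denoising steps rather than over quarters, one pass yields a whole batch of complete days at once.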
The model is trained in a completely different way than the quantile regression models. A simple implementation of the Denoising Diffusion Probabilistic Model (DDPM) is used for the experiments. More advanced variants could improve the results, but this is out of scope for this thesis: the goal is to show that more recent generative models can also model the NRV data. These results can then be compared with the quantile regression models to see whether the diffusion model generates better samples.
% TODO: In background information?
First, the model architecture needs to be chosen. The model takes several inputs: the noisy NRV time series, a positional encoding of the current denoising step, and the conditional input features. It must predict the noise present in the current time series, which is then subtracted in every denoising step. Any architecture can be used as long as it can predict this noise; here, a simple feedforward neural network consisting of multiple linear layers with ReLU activation functions is used. The denoising step index is an integer that is transformed into a vector using sine and cosine functions. This positional encoding is concatenated with the noisy time series and the conditional input features, and the resulting tensor is passed through the first linear layer and activation function, producing a tensor of the chosen hidden size. Before each subsequent layer, the positional encoding and conditional input features are concatenated again, so that every layer has the information needed to predict the noise. The output of the last layer is the predicted noise, and the model is trained by minimizing the mean squared error between the predicted and the true noise.
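The architecture just described can be sketched as follows. This is a hedged illustration, not the exact thesis code: the class and function names, the default dimensions, and the re-concatenation layout are assumptions based on the text above.

```python
import math

import torch
import torch.nn as nn


def sinusoidal_embedding(t, dim):
    """Encode the integer denoising-step index with sine/cosine functions."""
    half = dim // 2
    freqs = torch.exp(-math.log(10000.0) * torch.arange(half) / half)
    angles = t.float().unsqueeze(1) * freqs.unsqueeze(0)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=1)


class NoisePredictor(nn.Module):
    """Sketch of the feedforward noise predictor described in the text."""

    def __init__(self, ts_length=96, cond_dim=16, time_dim=8,
                 hidden_size=256, n_layers=3):
        super().__init__()
        self.time_dim = time_dim
        extra = time_dim + cond_dim  # re-concatenated before every layer
        dims = [ts_length + extra] + [hidden_size + extra] * (n_layers - 1)
        self.layers = nn.ModuleList(nn.Linear(d, hidden_size) for d in dims)
        self.out = nn.Linear(hidden_size + extra, ts_length)
        self.act = nn.ReLU()

    def forward(self, x, t, cond):
        pos = sinusoidal_embedding(t, self.time_dim)
        h = x
        for layer in self.layers:
            # every layer sees the step encoding and conditional features again
            h = self.act(layer(torch.cat([h, pos, cond], dim=1)))
        return self.out(torch.cat([h, pos, cond], dim=1))  # predicted noise
```

The repeated concatenation is the design choice emphasized above: each hidden layer, not just the first, receives the step encoding and conditional features.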
Other hyperparameters that need to be chosen are the number of denoising steps, the number of layers, and the hidden size of the neural network. Experiments are performed to gain insight into the influence of these parameters on model performance. Results are shown in Table \ref{tab:diffusion_results}.
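A single training step of the DDPM objective described above can be sketched as follows: a clean series is noised to a random step of the forward process and the model is trained to predict that noise with an MSE loss. The function name, schedule defaults, and `model(x_t, t, cond)` signature are illustrative assumptions, not the thesis implementation.

```python
import torch


def ddpm_train_step(model, optimizer, x0, cond, noise_steps=300,
                    beta_start=1e-4, beta_end=0.02):
    """Sketch of one DDPM training step on a batch of clean NRV series x0."""
    beta = torch.linspace(beta_start, beta_end, noise_steps)
    alpha_bar = torch.cumprod(1.0 - beta, dim=0)

    t = torch.randint(0, noise_steps, (x0.shape[0],))      # random step per sample
    eps = torch.randn_like(x0)                             # true noise
    a = alpha_bar[t].unsqueeze(1)
    x_t = torch.sqrt(a) * x0 + torch.sqrt(1.0 - a) * eps   # forward diffusion in one shot

    # minimize MSE between predicted and true noise
    loss = torch.nn.functional.mse_loss(model(x_t, t, cond), eps)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Note that the forward process is applied in closed form, so training never iterates over the noise steps; only sampling does.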
\begin{figure}[h]
\centering
\begin{tikzpicture}
% First row
% Node for Image 1
\node (img1) {\includegraphics[width=0.45\textwidth]{images/diffusion/results/intermediates/Testing Intermediates 864_Sample intermediate 1_00000000.jpeg}};
% Node for Image 2 with an arrow from Image 1
\node[right=of img1] (img2) {\includegraphics[width=0.45\textwidth]{images/diffusion/results/intermediates/Testing Intermediates 864_Sample intermediate 2_00000000.jpeg}};
\draw[-latex] (img1) -- (img2);
% Second row
% Node for Image 3 below Image 1 with an arrow from Image 2
\node[below=of img1] (img3) {\includegraphics[width=0.45\textwidth]{images/diffusion/results/intermediates/Testing Intermediates 864_Sample intermediate 3_00000000.jpeg}};
% Node for Image 4 with an arrow from Image 3
\node[right=of img3] (img4) {\includegraphics[width=0.45\textwidth]{images/diffusion/results/intermediates/Testing Intermediates 864_Sample intermediate 4_00000000.jpeg}};
\draw[-latex] (img3) -- (img4);
% Complex arrow from Image 2 to Image 3
% Calculate midpoint for the horizontal segment
\coordinate (Middle) at ($(img2.south)!0.5!(img3.north)$);
\draw[-latex] (img2.south) |- (Middle) -| (img3.north);
\end{tikzpicture}
\caption{Intermediate steps of the diffusion model for example 864 from the test set. The confidence intervals shown in the plots are made using 100 samples.}
\label{fig:diffusion_intermediates}
\end{figure}
In Figure \ref{fig:diffusion_intermediates}, several intermediate steps of the denoising process are shown for an example from the test set. The model starts from noisy full-day NRV samples, visible in the first steps, which are denoised over multiple steps until realistic samples are produced, as shown in the last image. The confidence intervals become narrower over time as the noise is removed from the samples.

Binary file not shown.

View File

@@ -59,38 +59,4 @@
\newlabel{fig:linear_model_sample_comparison}{{7}{26}{Comparison of the autoregressive and non-autoregressive linear model samples.\relax }{figure.caption.12}{}}
\@writefile{lof}{\contentsline {figure}{\numberline {8}{\ignorespaces Samples for two examples from the test set for the autoregressive and non-autoregressive linear model. The real NRV is shown in orange.\relax }}{27}{figure.caption.13}\protected@file@percent }
\newlabel{fig:linear_model_samples_comparison}{{8}{27}{Samples for two examples from the test set for the autoregressive and non-autoregressive linear model. The real NRV is shown in orange.\relax }{figure.caption.13}{}}
\@writefile{lof}{\contentsline {figure}{\numberline {9}{\ignorespaces Over/underestimation of the quantiles for the autoregressive and non-autoregressive linear models. Both the quantile performance for the training and test set are shown. The plots are generated using the input features NRV, Load, Wind, PV, Net Position, and the quarter embedding (only for the autoregressive model).\relax }}{28}{figure.caption.14}\protected@file@percent }
\newlabel{fig:linear_model_quantile_over_underestimation}{{9}{28}{Over/underestimation of the quantiles for the autoregressive and non-autoregressive linear models. Both the quantile performance for the training and test set are shown. The plots are generated using the input features NRV, Load, Wind, PV, Net Position, and the quarter embedding (only for the autoregressive model).\relax }{figure.caption.14}{}}
\@writefile{toc}{\contentsline {subsubsection}{\numberline {6.2.2}Non-Linear Model}{29}{subsubsection.6.2.2}\protected@file@percent }
\@writefile{lot}{\contentsline {table}{\numberline {5}{\ignorespaces Non-linear Quantile Regression Model Architecture\relax }}{29}{table.caption.15}\protected@file@percent }
\newlabel{tab:non_linear_model_architecture}{{5}{29}{Non-linear Quantile Regression Model Architecture\relax }{table.caption.15}{}}
\@writefile{lot}{\contentsline {table}{\numberline {6}{\ignorespaces Non-linear quantile regression model results. All the models used a dropout of 0.2 .\relax }}{30}{table.caption.16}\protected@file@percent }
\newlabel{tab:non_linear_model_results}{{6}{30}{Non-linear quantile regression model results. All the models used a dropout of 0.2 .\relax }{table.caption.16}{}}
\@writefile{lof}{\contentsline {figure}{\numberline {10}{\ignorespaces Comparison of the autoregressive and non-autoregressive non-linear model examples.\relax }}{31}{figure.caption.17}\protected@file@percent }
\newlabel{fig:non_linear_model_examples}{{10}{31}{Comparison of the autoregressive and non-autoregressive non-linear model examples.\relax }{figure.caption.17}{}}
\@writefile{lof}{\contentsline {figure}{\numberline {11}{\ignorespaces Over/underestimation of the quantiles for the autoregressive and non-autoregressive non-linear models. Both the quantile performance for the training and test set are shown. The plots are generated using the input features NRV, Load, Wind, PV, Net Position, and the quarter embedding (only for the autoregressive model).\relax }}{32}{figure.caption.18}\protected@file@percent }
\newlabel{fig:non-linear_model_quantile_over_underestimation}{{11}{32}{Over/underestimation of the quantiles for the autoregressive and non-autoregressive non-linear models. Both the quantile performance for the training and test set are shown. The plots are generated using the input features NRV, Load, Wind, PV, Net Position, and the quarter embedding (only for the autoregressive model).\relax }{figure.caption.18}{}}
\@writefile{toc}{\contentsline {subsubsection}{\numberline {6.2.3}GRU Model}{32}{subsubsection.6.2.3}\protected@file@percent }
\@writefile{lot}{\contentsline {table}{\numberline {7}{\ignorespaces GRU Model Architecture\relax }}{33}{table.caption.19}\protected@file@percent }
\newlabel{tab:gru_model_architecture}{{7}{33}{GRU Model Architecture\relax }{table.caption.19}{}}
\@writefile{lot}{\contentsline {table}{\numberline {8}{\ignorespaces Autoregressive GRU quantile regression model results. All the models used a dropout of 0.2 .\relax }}{34}{table.caption.20}\protected@file@percent }
\newlabel{tab:autoregressive_gru_model_results}{{8}{34}{Autoregressive GRU quantile regression model results. All the models used a dropout of 0.2 .\relax }{table.caption.20}{}}
\@writefile{lof}{\contentsline {figure}{\numberline {12}{\ignorespaces Comparison of the autoregressive and non-autoregressive GRU model examples.\relax }}{35}{figure.caption.21}\protected@file@percent }
\newlabel{fig:gru_model_sample_comparison}{{12}{35}{Comparison of the autoregressive and non-autoregressive GRU model examples.\relax }{figure.caption.21}{}}
\@writefile{lof}{\contentsline {figure}{\numberline {13}{\ignorespaces Over/underestimation of the quantiles for the autoregressive and non-autoregressive GRU models. Both the quantile performance for the training and test set are shown. The plots are generated using the input features NRV, Load, Wind, PV, Net Position, and the quarter embedding (only for the autoregressive model).\relax }}{36}{figure.caption.22}\protected@file@percent }
\newlabel{fig:gru_model_quantile_over_underestimation}{{13}{36}{Over/underestimation of the quantiles for the autoregressive and non-autoregressive GRU models. Both the quantile performance for the training and test set are shown. The plots are generated using the input features NRV, Load, Wind, PV, Net Position, and the quarter embedding (only for the autoregressive model).\relax }{figure.caption.22}{}}
\@writefile{toc}{\contentsline {subsection}{\numberline {6.3}Diffusion}{37}{subsection.6.3}\protected@file@percent }
\@writefile{toc}{\contentsline {section}{\numberline {7}Policies for battery optimization}{37}{section.7}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {7.1}Baselines}{37}{subsection.7.1}\protected@file@percent }
\@writefile{toc}{\contentsline {subsection}{\numberline {7.2}Policies using NRV predictions}{37}{subsection.7.2}\protected@file@percent }
\abx@aux@page{6}{38}
\abx@aux@page{7}{38}
\abx@aux@page{8}{38}
\abx@aux@page{9}{38}
\abx@aux@read@bbl@mdfivesum{5DC935CC8C8FAB8A3CAF97A486ED2386}
\abx@aux@read@bblrerun
\abx@aux@defaultrefcontext{0}{dumas_deep_2022}{nyt/global//global/global}
\abx@aux@defaultrefcontext{0}{lu_scenarios_2022}{nyt/global//global/global}
\abx@aux@defaultrefcontext{0}{poggi_electricity_2023}{nyt/global//global/global}
\abx@aux@defaultrefcontext{0}{weron_electricity_2014}{nyt/global//global/global}
\gdef \@abspage@last{39}
\@writefile{lof}{\contentsline {figure}{\numberline {9}{\ignorespaces Over/unde

View File

@@ -2818,70 +2818,4 @@
<bcf:entrytype>article</bcf:entrytype>
<bcf:entrytype>report</bcf:entrytype>
<bcf:constraint type="mandatory">
<bcf:field>author</bcf:field>
<bcf:field>title</bcf:field>
</bcf:constraint>
</bcf:constraints>
</bcf:datamodel>
<!-- CITATION DATA -->
<!-- SECTION 0 -->
<bcf:bibdata section="0">
<bcf:datasource type="file" datatype="bibtex" glob="false">./references.bib</bcf:datasource>
</bcf:bibdata>
<bcf:section number="0">
<bcf:citekey order="1" intorder="1">weron_electricity_2014</bcf:citekey>
<bcf:citekey order="2" intorder="1">poggi_electricity_2023</bcf:citekey>
<bcf:citekey order="3" intorder="1">lu_scenarios_2022</bcf:citekey>
<bcf:citekey order="4" intorder="1">dumas_deep_2022</bcf:citekey>
<bcf:citekey order="5" intorder="1">rasul_autoregressive_2021</bcf:citekey>
<bcf:citekey order="6" intorder="1">dumas_deep_2022</bcf:citekey>
</bcf:section>
<!-- SORTING TEMPLATES -->
<bcf:sortingtemplate name="nyt">
<bcf:sort order="1">
<bcf:sortitem order="1">presort</bcf:sortitem>
</bcf:sort>
<bcf:sort order="2" final="1">
<bcf:sortitem order="1">sortkey</bcf:sortitem>
</bcf:sort>
<bcf:sort order="3">
<bcf:sortitem order="1">sortname</bcf:sortitem>
<bcf:sortitem order="2">author</bcf:sortitem>
<bcf:sortitem order="3">editor</bcf:sortitem>
<bcf:sortitem order="4">translator</bcf:sortitem>
<bcf:sortitem order="5">sorttitle</bcf:sortitem>
<bcf:sortitem order="6">title</bcf:sortitem>
</bcf:sort>
<bcf:sort order="4">
<bcf:sortitem order="1">sortyear</bcf:sortitem>
<bcf:sortitem order="2">year</bcf:sortitem>
</bcf:sort>
<bcf:sort order="5">
<bcf:sortitem order="1">sorttitle</bcf:sortitem>
<bcf:sortitem order="2">title</bcf:sortitem>
</bcf:sort>
<bcf:sort order="6">
<bcf:sortitem order="1">volume</bcf:sortitem>
<bcf:sortitem literal="1" order="2">0</bcf:sortitem>
</bcf:sort>
</bcf:sortingtemplate>
<!-- DATALISTS -->
<bcf:datalist section="0"
name="nyt/apasortcite//global/global"
type="entry"
sortingtemplatename="nyt"
sortingnamekeytemplatename="apasortcite"
labelprefix=""
uniquenametemplatename="global"
labelalphanametemplatename="global">
</bcf:datalist>
<bcf:datalist section="0"
name="nyt/global//global/global"
type="entry"
sortingtemplatename="nyt"
sortingnamekeytemplatename="global"
labelprefix=""
uniquenametemplatename="global"
labelalphanametemplatename="global">
</bcf:datalist>
</bcf:controlfile>
<bcf:field>

File diff suppressed because it is too large Load Diff

View File

@@ -1,30 +0,0 @@
\BOOKMARK [1][-]{section.1}{\376\377\000I\000n\000t\000r\000o\000d\000u\000c\000t\000i\000o\000n}{}% 1
\BOOKMARK [1][-]{section.2}{\376\377\000E\000l\000e\000c\000t\000r\000i\000c\000i\000t\000y\000\040\000m\000a\000r\000k\000e\000t}{}% 2
\BOOKMARK [1][-]{section.3}{\376\377\000G\000e\000n\000e\000r\000a\000t\000i\000v\000e\000\040\000m\000o\000d\000e\000l\000i\000n\000g}{}% 3
\BOOKMARK [2][-]{subsection.3.1}{\376\377\000Q\000u\000a\000n\000t\000i\000l\000e\000\040\000R\000e\000g\000r\000e\000s\000s\000i\000o\000n}{section.3}% 4
\BOOKMARK [2][-]{subsection.3.2}{\376\377\000A\000u\000t\000o\000r\000e\000g\000r\000e\000s\000s\000i\000v\000e\000\040\000v\000s\000\040\000N\000o\000n\000-\000A\000u\000t\000o\000r\000e\000g\000r\000e\000s\000s\000i\000v\000e\000\040\000m\000o\000d\000e\000l\000s}{section.3}% 5
\BOOKMARK [2][-]{subsection.3.3}{\376\377\000M\000o\000d\000e\000l\000\040\000T\000y\000p\000e\000s}{section.3}% 6
\BOOKMARK [3][-]{subsubsection.3.3.1}{\376\377\000L\000i\000n\000e\000a\000r\000\040\000M\000o\000d\000e\000l}{subsection.3.3}% 7
\BOOKMARK [3][-]{subsubsection.3.3.2}{\376\377\000N\000o\000n\000-\000L\000i\000n\000e\000a\000r\000\040\000M\000o\000d\000e\000l}{subsection.3.3}% 8
\BOOKMARK [3][-]{subsubsection.3.3.3}{\376\377\000R\000e\000c\000u\000r\000r\000e\000n\000t\000\040\000N\000e\000u\000r\000a\000l\000\040\000N\000e\000t\000w\000o\000r\000k\000\040\000\050\000R\000N\000N\000\051}{subsection.3.3}% 9
\BOOKMARK [2][-]{subsection.3.4}{\376\377\000D\000i\000f\000f\000u\000s\000i\000o\000n\000\040\000m\000o\000d\000e\000l\000s}{section.3}% 10
\BOOKMARK [3][-]{subsubsection.3.4.1}{\376\377\000O\000v\000e\000r\000v\000i\000e\000w}{subsection.3.4}% 11
\BOOKMARK [3][-]{subsubsection.3.4.2}{\376\377\000A\000p\000p\000l\000i\000c\000a\000t\000i\000o\000n\000s}{subsection.3.4}% 12
\BOOKMARK [3][-]{subsubsection.3.4.3}{\376\377\000G\000e\000n\000e\000r\000a\000t\000i\000o\000n\000\040\000p\000r\000o\000c\000e\000s\000s}{subsection.3.4}% 13
\BOOKMARK [2][-]{subsection.3.5}{\376\377\000E\000v\000a\000l\000u\000a\000t\000i\000o\000n}{section.3}% 14
\BOOKMARK [1][-]{section.4}{\376\377\000P\000o\000l\000i\000c\000i\000e\000s}{}% 15
\BOOKMARK [2][-]{subsection.4.1}{\376\377\000B\000a\000s\000e\000l\000i\000n\000e\000s}{section.4}% 16
\BOOKMARK [2][-]{subsection.4.2}{\376\377\000P\000o\000l\000i\000c\000i\000e\000s\000\040\000b\000a\000s\000e\000d\000\040\000o\000n\000\040\000N\000R\000V\000\040\000g\000e\000n\000e\000r\000a\000t\000i\000o\000n\000s}{section.4}% 17
\BOOKMARK [1][-]{section.5}{\376\377\000L\000i\000t\000e\000r\000a\000t\000u\000r\000e\000\040\000S\000t\000u\000d\000y}{}% 18
\BOOKMARK [2][-]{subsection.5.1}{\376\377\000E\000l\000e\000c\000t\000r\000i\000c\000i\000t\000y\000\040\000P\000r\000i\000c\000e\000\040\000F\000o\000r\000e\000c\000a\000s\000t\000i\000n\000g}{section.5}% 19
\BOOKMARK [2][-]{subsection.5.2}{\376\377\000P\000o\000l\000i\000c\000i\000e\000s\000\040\000f\000o\000r\000\040\000B\000a\000t\000t\000e\000r\000y\000\040\000O\000p\000t\000i\000m\000i\000z\000a\000t\000i\000o\000n}{section.5}% 20
\BOOKMARK [1][-]{section.6}{\376\377\000R\000e\000s\000u\000l\000t\000s\000\040\000\046\000\040\000D\000i\000s\000c\000u\000s\000s\000i\000o\000n}{}% 21
\BOOKMARK [2][-]{subsection.6.1}{\376\377\000D\000a\000t\000a}{section.6}% 22
\BOOKMARK [2][-]{subsection.6.2}{\376\377\000Q\000u\000a\000n\000t\000i\000l\000e\000\040\000R\000e\000g\000r\000e\000s\000s\000i\000o\000n}{section.6}% 23
\BOOKMARK [3][-]{subsubsection.6.2.1}{\376\377\000L\000i\000n\000e\000a\000r\000\040\000M\000o\000d\000e\000l}{subsection.6.2}% 24
\BOOKMARK [3][-]{subsubsection.6.2.2}{\376\377\000N\000o\000n\000-\000L\000i\000n\000e\000a\000r\000\040\000M\000o\000d\000e\000l}{subsection.6.2}% 25
\BOOKMARK [3][-]{subsubsection.6.2.3}{\376\377\000G\000R\000U\000\040\000M\000o\000d\000e\000l}{subsection.6.2}% 26
\BOOKMARK [2][-]{subsection.6.3}{\376\377\000D\000i\000f\000f\000u\000s\000i\000o\000n}{section.6}% 27
\BOOKMARK [1][-]{section.7}{\376\377\000P\000o\000l\000i\000c\000i\000e\000s\000\040\000f\000o\000r\000\040\000b\000a\000t\000t\000e\000r\000y\000\040\000o\000p\000t\000i\000m\000i\000z\000a\000t\000i\000o\000n}{}% 28
\BOOKMARK [2][-]{subsection.7.1}{\376\377\000B\000a\000s\000e\000l\000i\000n\000e\000s}{section.7}% 29
\BOOKMARK [2][-]{subsection.7.2}{\376\377\000P\000o\000l\000i\000c\000i\000e\000s\000\040\000u\000s\000i\000n\000g\000\040\000N\000R\000V\000\040\000p\000r\000e\000d\000i\000c\000t\000i\000o\000n\000s}{section.7}% 30

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@@ -28,6 +28,8 @@
\usepackage{caption}
\usepackage{subcaption}
\usepackage{booktabs}
\usepackage{tikz}
\usetikzlibrary{positioning, calc}
% Electricity market
% Generative Modeling
@@ -181,7 +183,7 @@
\input{sections/results}
\newpage
\section{Policies for battery optimization}
\subsection{Baselines}

View File

@@ -1,31 +0,0 @@
\babel@toc {english}{}\relax
\contentsline {section}{\numberline {1}Introduction}{2}{section.1}%
\contentsline {section}{\numberline {2}Electricity market}{3}{section.2}%
\contentsline {section}{\numberline {3}Generative modeling}{5}{section.3}%
\contentsline {subsection}{\numberline {3.1}Quantile Regression}{6}{subsection.3.1}%
\contentsline {subsection}{\numberline {3.2}Autoregressive vs Non-Autoregressive models}{8}{subsection.3.2}%
\contentsline {subsection}{\numberline {3.3}Model Types}{9}{subsection.3.3}%
\contentsline {subsubsection}{\numberline {3.3.1}Linear Model}{9}{subsubsection.3.3.1}%
\contentsline {subsubsection}{\numberline {3.3.2}Non-Linear Model}{10}{subsubsection.3.3.2}%
\contentsline {subsubsection}{\numberline {3.3.3}Recurrent Neural Network (RNN)}{10}{subsubsection.3.3.3}%
\contentsline {subsection}{\numberline {3.4}Diffusion models}{11}{subsection.3.4}%
\contentsline {subsubsection}{\numberline {3.4.1}Overview}{12}{subsubsection.3.4.1}%
\contentsline {subsubsection}{\numberline {3.4.2}Applications}{12}{subsubsection.3.4.2}%
\contentsline {subsubsection}{\numberline {3.4.3}Generation process}{12}{subsubsection.3.4.3}%
\contentsline {subsection}{\numberline {3.5}Evaluation}{14}{subsection.3.5}%
\contentsline {section}{\numberline {4}Policies}{17}{section.4}%
\contentsline {subsection}{\numberline {4.1}Baselines}{17}{subsection.4.1}%
\contentsline {subsection}{\numberline {4.2}Policies based on NRV generations}{18}{subsection.4.2}%
\contentsline {section}{\numberline {5}Literature Study}{19}{section.5}%
\contentsline {subsection}{\numberline {5.1}Electricity Price Forecasting}{19}{subsection.5.1}%
\contentsline {subsection}{\numberline {5.2}Policies for Battery Optimization}{20}{subsection.5.2}%
\contentsline {section}{\numberline {6}Results \& Discussion}{21}{section.6}%
\contentsline {subsection}{\numberline {6.1}Data}{21}{subsection.6.1}%
\contentsline {subsection}{\numberline {6.2}Quantile Regression}{22}{subsection.6.2}%
\contentsline {subsubsection}{\numberline {6.2.1}Linear Model}{22}{subsubsection.6.2.1}%
\contentsline {subsubsection}{\numberline {6.2.2}Non-Linear Model}{29}{subsubsection.6.2.2}%
\contentsline {subsubsection}{\numberline {6.2.3}GRU Model}{32}{subsubsection.6.2.3}%
\contentsline {subsection}{\numberline {6.3}Diffusion}{37}{subsection.6.3}%
\contentsline {section}{\numberline {7}Policies for battery optimization}{37}{section.7}%
\contentsline {subsection}{\numberline {7.1}Baselines}{37}{subsection.7.1}%
\contentsline {subsection}{\numberline {7.2}Policies using NRV predictions}{37}{subsection.7.2}%

View File

@@ -59,13 +59,16 @@ def sample_diffusion(
# evenly space 4 intermediate samples between step 1 and noise_steps
if intermediate_samples:
spacing = (noise_steps - 1) // 4
if i % spacing == 0:
first_quarter_end = (noise_steps - 1) // 4
spacing = (first_quarter_end - 1) // 4
# save 1, 1 + spacing, 1 + 2*spacing, 1 + 3*spacing
if i % spacing == 1 and i <= first_quarter_end:
intermediate_samples_list.append(x)
x = torch.clamp(x, -1.0, 1.0)
if len(intermediate_samples_list) > 0:
return x, intermediate_samples_list
return x, intermediate_samples_list[-4:]
return x
@@ -81,7 +84,7 @@ class DiffusionTrainer:
self.model = model
self.device = device
self.noise_steps = 1000
self.noise_steps = 300
self.beta_start = 0.0001
self.beta_end = 0.02
self.ts_length = 96
@@ -260,6 +263,9 @@ class DiffusionTrainer:
self.model = torch.load("checkpoint.pt")
self.model.to(self.device)
self.debug_plots(task, True, train_loader, train_sample_indices, -1)
self.debug_plots(task, False, test_loader, test_sample_indices, -1)
_, generated_samples = self.test(test_loader, -1, task)
# self.policy_evaluator.plot_profits_table()
if self.policy_evaluator:
@@ -371,7 +377,6 @@ class DiffusionTrainer:
return fig
def debug_plots(self, task, training: bool, data_loader, sample_indices, epoch):
for actual_idx, idx in sample_indices.items():
features, target, _ = data_loader.dataset[idx]
@@ -381,69 +386,93 @@ class DiffusionTrainer:
self.model.eval()
with torch.no_grad():
samples, intermediates = (
self.sample(self.model, 100, features, True)
)
samples, intermediates = self.sample(self.model, 100, features, True)
samples = samples.cpu().numpy()
samples = self.data_processor.inverse_transform(samples)
target = self.data_processor.inverse_transform(target)
# list to tensor intermediate samples
intermediates = torch.stack(intermediates)
if epoch == -1:
# list to tensor intermediate samples
intermediates = torch.stack(intermediates)
intermediate_fig1 = self.plot_from_samples(
self.data_processor.inverse_transform(
intermediates[0].cpu().numpy()
),
target,
)
intermediate_fig2 = self.plot_from_samples(
self.data_processor.inverse_transform(
intermediates[1].cpu().numpy()
),
target,
)
intermediate_fig3 = self.plot_from_samples(
self.data_processor.inverse_transform(
intermediates[2].cpu().numpy()
),
target,
)
intermediate_fig4 = self.plot_from_samples(
self.data_processor.inverse_transform(
intermediates[3].cpu().numpy()
),
target,
)
# report the intermediate figs to clearml
task.get_logger().report_matplotlib_figure(
title=(
f"Training Intermediates {actual_idx}"
if training
else f"Testing Intermediates {actual_idx}"
),
series=f"Sample intermediate 1",
iteration=epoch,
figure=intermediate_fig1,
report_image=True,
)
task.get_logger().report_matplotlib_figure(
title=(
f"Training Intermediates {actual_idx}"
if training
else f"Testing Intermediates {actual_idx}"
),
series=f"Sample intermediate 2",
iteration=epoch,
figure=intermediate_fig2,
report_image=True,
)
task.get_logger().report_matplotlib_figure(
title=(
f"Training Intermediates {actual_idx}"
if training
else f"Testing Intermediates {actual_idx}"
),
series=f"Sample intermediate 3",
iteration=epoch,
figure=intermediate_fig3,
report_image=True,
)
task.get_logger().report_matplotlib_figure(
title=(
f"Training Intermediates {actual_idx}"
if training
else f"Testing Intermediates {actual_idx}"
),
series=f"Sample intermediate 4",
iteration=epoch,
figure=intermediate_fig4,
report_image=True,
)
fig = self.plot_from_samples(samples, target)
intermediate_fig1 = self.plot_from_samples(
self.data_processor.inverse_transform(intermediates[0].cpu().numpy()), target
)
intermediate_fig2 = self.plot_from_samples(
self.data_processor.inverse_transform(intermediates[1].cpu().numpy()), target
)
intermediate_fig3 = self.plot_from_samples(
self.data_processor.inverse_transform(intermediates[2].cpu().numpy()), target
)
intermediate_fig4 = self.plot_from_samples(
self.data_processor.inverse_transform(intermediates[3].cpu().numpy()), target
)
# report the intermediate figs to clearml
task.get_logger().report_matplotlib_figure(
title=f"Training Intermediates {actual_idx}" if training else f"Testing Intermediates {actual_idx}",
series=f"Sample intermediate 1",
iteration=epoch,
figure=intermediate_fig1,
report_image=True
)
task.get_logger().report_matplotlib_figure(
title=f"Training Intermediates {actual_idx}" if training else f"Testing Intermediates {actual_idx}",
series=f"Sample intermediate 2",
iteration=epoch,
figure=intermediate_fig2,
report_image=True
)
task.get_logger().report_matplotlib_figure(
title=f"Training Intermediates {actual_idx}" if training else f"Testing Intermediates {actual_idx}",
series=f"Sample intermediate 3",
iteration=epoch,
figure=intermediate_fig3,
report_image=True
)
task.get_logger().report_matplotlib_figure(
title=f"Training Intermediates {actual_idx}" if training else f"Testing Intermediates {actual_idx}",
series=f"Sample intermediate 4",
iteration=epoch,
figure=intermediate_fig4,
report_image=True
)
task.get_logger().report_matplotlib_figure(
title="Training" if training else "Testing",

View File

@@ -2,7 +2,7 @@ from src.utils.clearml import ClearMLHelper
clearml_helper = ClearMLHelper(project_name="Thesis/NrvForecast")
task = clearml_helper.get_task(
task_name="Diffusion Training: hidden_sizes=[256, 256, 256], lr=0.0001, time_dim=8"
task_name="Diffusion Training: hidden_sizes=[1024, 1024, 1024, 1024] (300 steps), lr=0.0001, time_dim=8 + Load + Wind + PV + NP"
)
task.execute_remotely(queue_name="default", exit_process=True)
@@ -19,19 +19,16 @@ from src.policies.PolicyEvaluator import PolicyEvaluator
data_config = DataConfig()
data_config.NRV_HISTORY = True
data_config.LOAD_HISTORY = False
data_config.LOAD_FORECAST = False
data_config.LOAD_HISTORY = True
data_config.LOAD_FORECAST = True
data_config.PV_FORECAST = False
data_config.PV_HISTORY = False
data_config.PV_FORECAST = True
data_config.PV_HISTORY = True
data_config.WIND_FORECAST = False
data_config.WIND_HISTORY = False
data_config.WIND_FORECAST = True
data_config.WIND_HISTORY = True
data_config.QUARTER = False
data_config.DAY_OF_WEEK = False
data_config.NOMINAL_NET_POSITION = False
data_config.NOMINAL_NET_POSITION = True
data_config = task.connect(data_config, name="data_features")
@@ -45,7 +42,7 @@ print("Input dim: ", inputDim)
model_parameters = {
"epochs": 15000,
"learning_rate": 0.0001,
"hidden_sizes": [256, 256, 256],
"hidden_sizes": [1024, 1024, 1024, 1024],
"time_dim": 8,
}