Phase-resolved real-time ocean wave prediction with quantified uncertainty based on variational Bayesian machine learning

,


Introduction
As one of the main renewable energy resources, wave energy is expected to play an increasingly important role in the energy mix in the future [1,2].In order to reduce the cost of wave power, a lot of research efforts have been spent on the modeling, simulations, and predictions of ocean waves.The commonly-used models developed in the past decades include WAM [3], SWAN [4], and WAVEWATCH III [5].As the existing works have mainly focused on the statistical wave quantities (such as significant wave height and peak wave period), the phaseresolved wave information is often not obtained.On the other hand, the phase-resolved real-time wave information is of great practical interest for the safe and optimal operation of wave energy devices.For example, the real-time prediction of the phase-resolved wave elevations can be used for the control of the wave energy converters (WECs) [6,7], so that the controllers can act in advance before the waves hit the WEC structures.Such preview-based control strategy is greatly useful for optimizing the power generation of WECs [8,9]  Wave energy converter series of the wave measurements at the sensor location was transformed into frequency domain through Fourier transform.Then the wave components at specific frequencies were propagated into the location and time of interest according to the linear dispersion relation.Finally the wave profile was obtained by adding up the arrived wave components.With this theoretical approach, the predictable zone can also be determined based on the wave components' group velocities [17,18].As the LWT-based approach can be evaluated in real-time and can achieve satisfactory accuracy when nonlinear effects are negligible, it is currently the main tool in phase-resolved wave prediction [19,20].However, nonlinear wave effects are omnipresent in real-world scenarios.More importantly, WECs are usually deployed at the sites where such effects are too strong to be ignored.In order to achieve nonlinear wave predictions, most works in the literature were based on the higher order spectral (HOS) method [21,22].Specifically, the prediction was carried out by first reconstructing the initial wave conditions with the wave measurement data as the input, and then using the HOS method for nonlinear wave propagation [23].This kind of approach, hereby referred as wave prediction based on HOS method (HOS-WP), has been employed in several studies [24,25].However, as it is computationally expensive, it is extremely challenging to achieve real-time predictions with such an approach [26].To alleviate the computational costs of the HOS-WP approach while taking account of nonlinear wave effects, wave prediction methods based on efficient, weakly-nonlinear wave models (hereby referred as NLWM-WP) were developed.The NLWM-WP method mainly consists of two steps, i.e. a data assimilation procedure to obtain the model parameters and a wave propagation procedure to make predictions [27].In [28], a nonlinear wave prediction algorithm was developed based on the Lagrangian choppy wave model (CWM) [29].In [27], a recently-proposed improved choppy wave model (ICWM) [30], which showed better performance than the CWM and the CWM2 (i.e. an extension of CWM developed in [31]), was employed for real-time wave predictions.Comprehensive numerical and experimental validations were also carried out in [27] where the prediction performance based on the LWT, the LWT with corrected dispersion relation, and the ICWM, were compared.
Recently, machine learning (ML) based approaches are starting to gain attention in ocean wave predictions.Many works focused on the phase-averaged wave information, such as the prediction of the wave height and period, for the purposes of storm forecasting [32], vessel operations [33,34], and wave power forecasting [35].While on the other hand, the number of studies on phase-resolved wave prediction, which is the main focus of this paper, is still limited.In [36], a neural network (NN) based method was proposed for the phase-resolved prediction of unidirectional waves, where both reconstruction (i.e. the prediction at the same time instants as the measurements) and forecasting (i.e. the prediction at the time instants in the future) problems were addressed.The predictable zone was also investigated based on LWT.The proposed method was evaluated using numerical simulations and the results showed that it performed much better than the LWT-based approach.In [37], a similar NN-based wave prediction method was proposed for phase-resolved wave forecasting.The proposed method was evaluated using wave tank experiments and a set of input-output strategies were tested to improve the prediction accuracy.The comparison results showed that the proposed method clearly outperformed the LWT-based approach.To summarize, the works in [36,37] both demonstrated the great potential of the ML-based approaches in nonlinear wave predictions.
However, a significant research gap still exists on the real-time phase-resolved nonlinear wave prediction, as described below.
(1) The aforementioned NN-based approaches achieved deterministic wave prediction without quantifying the corresponding uncertainty.This lack of uncertainty quantification (UQ) is very dangerous in the control of WECs in real-world conditions, because the predicted wave elevations might be totally wrong, and the use of it with full confidence may lead to catastrophic damage to the WECs.(2) Despite the efforts in achieving nonlinear wave predictions, these works still employed LWT for determining the predictable zone, which may lead to inconsistent and overly conservative estimation of what can be predicted [26,38].Therefore, a consistent nonlinear approach to handle both the prediction and the predictable zone should be pursued.(3) The work in Ref. [36] investigated both wave reconstruction and forecasting.However, synthetic numerical data was employed for method validations.Comprehensive experimental investigations are thus needed to evaluate the performance of ML in realistic wave reconstruction and forecasting, taking account of various environmental uncertainties and measurement errors.
This work focuses on the development of a novel wave prediction method that can address these limitations.Different from the previous works that formulated the phase-resolved wave prediction as a deterministic problem, this work takes a probabilistic approach.In particular, this work explores, for the first time, the Bayesian machine learning (BML) approach [39,40] for phase-resolved wave prediction, which can take advantage of ML's ability in tackling nonlinear problems while taking different kinds of uncertainties into account via the Bayesian framework.As the inference of the true posterior distribution in the Bayesian neural network (BNN) model is generally infeasible due to its extreme computational complexity, various approximation methods, such as Monte Carlo (MC) dropout [41] and variational inference [42] have been proposed.Various algorithms are also being developed to enable efficient training of BNN [43,44].Along with the development of modern ML hardware (e.g.GPU), the use of BNN in various fields is just emerging and recent successes in the area of energy research include the probabilistic wind forecasting [45], solar irradiation forecasting [46], and natural gas hydrate dispersion modeling [47].In this work, BNN is employed for probabilistic, phaseresolved, real-time reconstruction and forecasting of ocean waves.The proposed method is designed to take account of both the aleatory uncertainty (i.e. the uncertainty of the phase-resolved wave information) and the epistemic uncertainty (i.e. the uncertainty due to the model's ability).To the best of the authors' knowledge, this is the first time that real-time nonlinear wave prediction is achieved with quantified uncertainty including both aleatory and epistemic uncertainties.Another main advantage of the proposed method is that, by rigorously quantifying the prediction uncertainty, the predictable zone can also be detected automatically, avoiding the use of LWT in the predictable zone determination.The advantages of the method proposed in this work compared with existing works in the literature (including LWT, HOS-WP, NLWM-WP, and NN-based approaches) are summarized in Table 1.
To evaluate the performance of the proposed method, a set of wave tank experiments are carried out in this work.The experiments are carried out in the ocean basin [48] in the Coastal, Ocean and Sediment Transport (COAST) Laboratory at the University of Plymouth, UK, which has been used previously for various studies in wave [49,50] and tidal [51] energy research.A total 24 experimental runs are carried out to generate different wave environments, and the measurements at different sensor locations are recorded.The wave predictions are then carried out.The results show that the proposed method is able to predict the phase-resolved wave elevations accurately.Specifically, the prediction mean values match very well with the reference wave elevations measured by the wave gauge, and the corresponding prediction uncertainty and its variation are well captured throughout the prediction time horizon.Moreover, the predictable zone determination is also successfully achieved by the proposed method.To further demonstrate the performance of the proposed method, the LWT-based and NN-based wave prediction methods are also implemented in this work for comparison.The results show that the proposed method not only achieves better accuracy than these methods, but also expands the predictable zone, which is of vital importance for the control of WECs and their load forecasting.
The main contributions and novelties of this paper are summarized as follows: (1) Phase-resolved real-time nonlinear wave prediction with quantified uncertainty (including both aleatory and epistemic uncertainties) is achieved for the first time.Because the previous ML-based works [36,37] achieved deterministic phase-resolved wave predictions without quantifying the prediction uncertainty, it is difficult to use them for the control of WECs in realworld conditions, as it may lead to catastrophic damage.This work, therefore, brings vital opportunities for the safe and optimal operation of WECs.The comparison of the proposed wave prediction method with existing methods in the literature is summarized in Table 1.(2) As previous works all rely on LWT for predictable zone determination (which leads to overly conservative estimation), another main novelty of this work is that it achieves, for the first time, the prediction of the predictable zone without assuming linear sea states.(3) The proposed method is developed based on the variational BML approach, which can take advantage of both the ML model's ability in tackling complex nonlinear problems and the Bayesian approach's merits in tackling all kinds of uncertainties.
(4) The performance of the proposed approach is evaluated through a set of wave tank experiments.The results show that phaseresolved wave elevation is predicted accurately, the prediction uncertainty is captured well throughout the prediction horizon, and the predictable zone is obtained successfully.By comparing with the existing methods in the literature, it is demonstrated that the proposed method not only achieves better accuracy, but also expands the predictable zone substantially, which is vital for the optimal control of WECs and their load forecasting.
The remaining part of this paper is organized as follows: the phaseresolved wave prediction problem is formulated in Section 2. The proposed BNN-based prediction method is described in Section 3. The prediction performance is evaluated through a set of wave tank experiments in Section 4. Finally the conclusions are drawn in Section 5.

Problem formulation
Phase-resolved wave prediction, often referred as deterministic sea wave prediction (DSWP) in previous works, is of vital importance for the power maximization and load mitigation of WECs and their arrays.The preview wave information obtained by DSWP offers great opportunities for the control of WECs before the wave arrives at the location where the WECs are deployed.An illustration of the wave prediction problem is given in Fig. 1, where a wave probe, marked as A in Fig. 1(a), is installed to measure the wave elevations over time at an upstream location.Based on the measurements, the objective is to predict the wave elevations at the downstream location of interest, which is marked as B in Fig. 1(a).An example of the wave measurements during a certain time period is shown in Fig. 1(b), and the corresponding wave elevations to be predicted are illustrated in Fig. 1(c).Denote the wave measurements at the location A from time instant 0 to   as where ℎ   represents the wave elevation at the location A at the th time instant.Denote the wave elevations at location B from time instant 0 to time instant   (  >   ) as where ℎ   represents the wave elevation at the location B at the th time instant.The phase-resolved wave prediction problem then states as how to approximate the mapping  that takes ℎ  as the input and returns ℎ  as the output, i.e.
As illustrated in Fig. 1(c), the wave prediction in this work considers both reconstruction, where the wave elevations from 0 to   are predicted, and forecasting, where the wave elevations from   into the future are predicted.The mapping  is often modeled as deterministic in previous works, neglecting the underlying uncertainty including both the uncertainty of the phase-resolved wave information and the model uncertainty.Thus only point prediction is obtained.However, as illustrated in Fig. 1(c), the point prediction of wave elevations (which is represented by the dashed line) and the corresponding true values (which is represented by the solid line) can differ from each other dramatically.Thus the lack of knowledge about the prediction uncertainty is of great danger to the WEC deployed at the location of interest.In this work, a novel phaseresolved wave prediction method is proposed where the prediction uncertainty is rigorously quantified.To achieve this, the mapping  is modeled as probabilistic instead of deterministic.The considered wave prediction problem is thus termed as probabilistic phase-resolved wave prediction instead of DSWP or probabilistic sea wave prediction, to avoid confusion with previous deterministic prediction works  and stochastic wave modeling works focusing on the phase-averaged quantities.
Another aspect of the wave prediction problem is the determination of the predictable zone, i.e. given the wave measurements during a certain time period, what is the time horizon that credible wave prediction can be made?All the previous works have investigated this issue based on LWT.In this work, a consistent way to determine the predictable zone is developed with the proposed wave prediction method, avoiding the assumption of linear sea states.

Methodology
A probabilistic, phase-resolved, real-time wave prediction method is proposed in this work, based on the variational BML approach.The proposed method is designed to achieve accurate wave prediction with the quantification of both aleatory uncertainty and model uncertainty.In this section, the details of the developed BNN model, its training, and the online prediction procedure are given first.Then the LWT-based and the deterministic NN-based wave prediction methods, which are implemented in this work for comparison purpose, are presented.

BayesIan neural network model
In this work, to take account of the aleatory uncertainty, the output of the mapping  is modeled as a random vector where each component is modeled as an independent Gaussian, i.e.
where   and   represent the mean and the standard deviation of the th component of the output of  .To avoid confusion, we mention that the output of  here is the phase-resolved, real-time wave elevations at the location B, which is totally different from the phase-averaged wave elevation distributions in the stochastic modeling of ocean waves.A BNN model, denoted as f , is then constructed to take ℎ  as the input and return the random vector ĥ as the output, i.e.
where  represents the training variable of the BNN.Various NN structures in deterministic ML can also be used in the BML framework to take advantage of the underlying data structure, such as convolutional NN for image classification [52] and recurrent NN for language processing [53].In this work, as was in previous NN-based wave prediction works [36,37], the fully-connected NN is employed, which makes full connection from the model input to the model output.The input-output mapping f , which is illustrated in Fig. 2, can thus be expressed as

Model training
The BNN model is trained to infer the posterior distribution of the training variable  given the training dataset.Denote a set of training input as where )  ] is the wave elevation at the location A from 0 to   at the th wave scenario.Denote the corresponding training target as where )  ] is the wave elevation at the location B from 0 to   at the th wave scenario.Here a wave scenario denotes a particular wave environment at a particular starting time.
Given the dataset   and   , the posterior distributions of the training variable  can be expressed according to the Bayes' rule as Here  (|) denotes the conditional probability.As the inference of the true posterior distribution, as expressed by Eq. ( 9), is often not feasible due to the high dimensionality of the training variable (i.e. the high dimensionality of  in this work) [54], variational inference has been developed in the literature to approximate the true posterior with variational distributions.The training of the BNN via variational inference [55] thus aims at minimizing the Kullback-Leibler (KL) divergence (which characterizes the difference between probability distributions) between the variational distribution   ( ) and the true posterior  ( |[  ,   ]), which is defined as It can be further derived as where    (log( ([  ,   ]| ))) denotes the expectation of log( ([  ,   ]| )).As the first term in the right side of Eq. ( 11) does not depend on the training variable  , the training loss function is finally obtained as for the training with mini-batch stochastic gradient descent.Here [   ,    ] represents the th training data batch,   is the batch size, and   is the size of the whole training dataset.The interested readers may refer to [40,55] for further details regarding the derivation.The BNN is then trained to minimize the loss function () to obtain the optimal variational parameter  * .Furthermore, Flipout [44] is used in this work to enable efficient training.The implementation details of the developed BNN model can be found in the Appendix of this paper.

Online prediction
After training, given the wave elevations ℎ  measured at the location A as the input, the posterior distribution of the wave elevations to be predicted can be obtained by In practice, the prediction is carried out by sampling the training variable  according to the probability distribution   * ( ) and then propagating the samples through f (ℎ  ) to obtain the posterior samples of ĥ .From these obtained samples, the posterior distribution of ĥ , i.e. ( ( ĥ |[  ,   ])), can be estimated, and various statistical quantities of ĥ , such as its mean value, its standard deviation and its 95% confidence interval, can be predicted.In addition, the predictable zone can also be determined based on the samples of the predicted wave elevations.Here, the uncertainty level, which is defined as is calculated and used to characterize how certain the BNN model is in making predictions of the wave elevations.Here  0.975 and  0.025 represent the 97.5% and 2.5% quantiles of ĥ .Then the predictable zone is determined as the zone that  is smaller than a prescribed threshold  0 .The hyperparameters involved in the BNN model (such as the neuron number, the learning rate, the batch size, etc.) and the other parameters (such as the threshold  0 ) are tuned empirically.Their values are given in the next section.

Linear wave theory
The wave prediction based on LWT has been investigated and described in several previous works [56,57].The prediction procedure can be divided into the following three steps.First, the time series of the wave measurements at the sensor location is transformed into frequency domain through discrete Fourier transform (DFT), i.e.
Then the propagation of the wave components at various frequencies is determined according to the linear dispersion relation, i.e.
where ℎ is the water depth.Finally the wave prediction at the location   based on the measurements at the location   is obtained by summing up all the arrived wave components as where   = 2∕,   = arg(   ),   = abs(   ), and   is obtained via the linear dispersion relation.Here abs and arg represent the modulus and the argument of complex numbers.
The predictable zone can also be determined based on LWT.This is achieved by first calculating the cutting-off frequencies  low and  high such that most of the energy is contained between  low and  high .Then the predictable zone is determined as where ( low ) and ( high ) represent the group velocities which are calculated by Here in this work, following the work in [36], the cutting-off frequencies are determined by including the wave components that account for 85% of the total energy.

Deterministic NN-based wave prediction
For the deterministic NN-based wave prediction, the same fullyconnected NN structure is used.However, the training variable  becomes the weight matrix instead of random variables in BNN, and the NN output becomes deterministic instead of probabilistic.The inputoutput mapping of the implemented deterministic NN model can thus be expressed as The training of the NN model is carried out to minimize the meansquared error between the NN output and the training target by updating the training weight matrix  .After training, point prediction of the wave elevations can be obtained, by simply propagating a given input wave signal through the NN.With this deterministic NN-based approach, the prediction uncertainty is not known.In addition, the predictable zone cannot be obtained.The previous NN-based works [36,37] all employed LWT for the predictable zone determination, as described in Section 3.2.

Results
The proposed wave prediction method is evaluated in this section, by carrying out a set of wave tank experiments representing typical sea conditions.In the following subsections, the experimental setup is first presented.Then the results are given, including the predictions by the LWT-based method, the deterministic NN-based method, and the proposed method.

Wave tank experiments
The experiments are carried out in the ocean basin at the COAST Laboratory at the University of Plymouth [48], which has been used extensively in previous works on simulating ocean waves and testing WECs [49,50].A photo of the wave tank for carrying out the experiments is given in Fig. 3(a) and the experimental setup is illustrated in Fig. 3(b).The dimension of the basin is 35 m × 15.5 m, which has an adjustable floor that allows different operating water depths up to 3 m.24 hinged wave paddles across one side of the basin are applied to generate waves, with wave frequency in the range of 0.1 Hz -2 Hz.We mention that the wave frequencies in the range of 0.1 Hz -2 Hz with the corresponding significant wave heights under the limited lines aim at avoiding breaking waves.This is limited by the physical capabilities of the Ocean Basin (e.g., the basin's length and depth, the beach located downstream of the basin, and the active wave paddles system).Furthermore, to reduce wave reflection, a parabolic beach is located downstream of the basin and the wave paddles are equipped with an active absorption system.
In this work, two sea conditions, which represent the typical and extreme conditions during a 50-year return period for the EMEC site, are investigated.The corresponding significant wave height (i.e.Hs) and the peak period (i.e.Tp) are reported in Table 2, which are derived based on the data provided by ECMWF [58].The environmental characterization is performed on the EMEC site off Scotland using 30 years of hindcast data.A 50-year contour line of the EMEC site is then obtained, using the inverse first order method (IFORM) together with a Weibull distribution for Hs and conditional log-normal distribution for Tp|Hs.As described in Table 2, during the physical tank testing two wave conditions of the EMEC site are scaled down to 1/50th and tested: one represents the wave with high probability of occurrence of the EMEC site (#1); the other is along the 50-year contour line with the highest Hs referring to the extreme condition with wave steepness of 0.0291 (#2).The limiting wave steepness is taken as   ≤ 1∕15 [59].In order to avoid breaking waves, the two wave conditions studied here are smaller than this steepness limit.Details of the environmental characterization can be found in [60].It is also very interesting to note that the sea conditions considered in this work are in a similar range of steepness as in previous wave prediction works (see, for example, the wave steepness reported in Table 1 in [26] and the value of Hs/Tp 2 reported in Figure 1 in [36]).According to the physical water depth of the EMEC site and the scale ratio of the testing (1/50th), the water depth of this testing is set at 1.5 m.The tank testing setup is described in Fig. 3(b).As shown, four twin-wire resistance wave  probes are installed to capture the wave evolution.Particularly, the measurements at the first probe (i.e.8.865 m downstream of the wave paddle) and the third probe (13.8m downstream of the wave paddle) are used for the wave predictions in this work, which correspond to the location A and the location B in Fig. 1   For the wave predictions in this work, all the quantities measured in the scaled simulations, such as the wave elevations and the corresponding time stamps, are first scaled back to the quantities at full scale.Then the data is collected at a frequency of 2 Hz.The wave elevations measured at the location A during the whole simulation period for all the wave scenarios is collected as the data matrix H , where H , represents the wave elevation at the th time step for the th simulated scenario.Similarly, the measurements at the location B are collected as the data matrix H where H , represents the wave elevation at the th time step for the th scenario.

Prediction results and discussions
The wave dataset H and H , as described in Section 4.1, are divided into the training dataset (the first 85% time instants) and the test dataset (the last 15% time instants).Based on the training dataset, the wave prediction model developed in this work is trained to achieve the prediction of the wave elevation at the location B from  0 to  0 + 300 s with the wave elevation at the location A from  0 to  0 + 200 s as the input, i.e. the model is trained to achieve the wave reconstruction and the 100 s-ahead wave forecasting simultaneously.Specifically, the training is carried out using the first 75% of the training dataset, and the hyperparameters (such as the NN structure, the learning rate, etc.) involved in the developed model are tuned based on the validation dataset (i.e. the last 25% of the training dataset).The final NN structure is set as 400-500-600 for both the mean and the standard deviation of the BNN output, the batch size for the mini-batch gradient descent is set as 2048, and the Adam optimizer [61] is used where the learning rate is set as 10 −3 .The training is carried out using NVIDIA RTX 6000 GPU.The whole training process takes about 0.9 h to complete and each training iteration requires about 0.33 s.This shows that the offlinetrained model can also be used for efficient online updating in practical scenarios to tackle various uncertainties in real-ocean WEC operations.
After training, the test dataset is used for evaluating the prediction performance including both the prediction accuracy and the uncertainty characterization.Here, the predictions are carried out for all the 24 wave scenarios simulated in this work and by all the wave prediction methods including the LWT-based method, the deterministic NN-based method, and the proposed method.The wave elevations measured by the wave gauge, which are assumed unavailable during the training process, are used as the reference values to evaluate the prediction accuracy.

Wave predictions with quantified uncertainty
First, the prediction performance for the whole time period, including both wave reconstruction and forecasting, is examined.The results are shown in Figs. 4 and 5 for two typical wave scenarios which correspond to the sea conditions #1 and #2 respectively.The two wave scenarios shown in Figs. 4 and 5 are from the 24 wave scenarios simulated in this work, and the prediction results for the other 22 wave scenarios show similar characteristics, thus omitted here.Figs. 4(a) and 5(a) show the wave elevations predicted by the proposed method, including both the prediction mean values and the 95% confidence interval.As shown, the prediction mean values (i.e. the dashed-dotted line) match very well with the reference values (i.e. the solid line), and the predicted 95% confidence interval (i.e. the gray area) characterizes the prediction uncertainty accurately throughout the prediction time horizon.In particular, as shown in Figs. 4 and 5, the prediction uncertainty is very high at the left side of the time domain.This is because at these time instants, the waves to be predicted at the location B have already passed (thus are not captured by) the probe at the location A. A high prediction uncertainty is also observed at the right side of the time domain, which is because at these time instants the waves to be predicted have not arrived at the location A yet.As for the center part of the time domain where the waves to be predicted are captured by the probe at the location A, a consistent low uncertainty level is predicted throughout, and the reference values, for the most of the time, fall within the predicted 95% confidence interval.It is thus concluded that the wave prediction (including both wave reconstruction and forecasting) with quantified uncertainty is successfully achieved by the proposed method.
For comparison purposes, the predictions by the LWT-based and the NN-based methods are also carried out and the results are included in Figs.4(a) and 5(a).As shown, the mean values predicted by the proposed method match with the reference values much better than the LWT-based method, demonstrating the great performance of the proposed nonlinear approach compared to the linear approach, while it is only slightly better than the deterministic NN-based method, which is reasonable as both methods are designed based on data and ML (i.e. one is based on probabilistic ML and the other is based on deterministic ML).As can be seen from Figs. 4(a) and 5(a), the predictions by all the methods show various levels of prediction errors throughout the prediction time horizon, and the prediction can be totally wrong beyond the predictable zone.Therefore, the wave prediction without knowing the prediction uncertainty, which is the case for both the LWT-based and the NN-based methods, is of great danger to the assets deployed at real-world sea environment.Thus the unique advantage of the method proposed in this work is that it can predict the prediction uncertainty, as shown by the gray area in Figs.4(a) and 5(a), along with the prediction mean values.For future wave energy applications, the 95% confidence interval predicted by the proposed method can be used for the control of WECs and their load forecasting to guarantee robust performance, while the other wave prediction methods can only provide point estimation.
Moreover, it is worth mentioning that the lack of understanding and interpretation of the ML model parameters, which is also present in most of other ML applications in various fields, further compounds the challenges in deploying deterministic NN-based methods in real-world ocean conditions.For example, if the deterministic NN-based method is employed to provide the input for the real-time WEC control, the occurrence of prediction errors, which cannot be physically interpreted, may lead to unexpected hazardous behaviors of the WEC.Such issues, on the other hand, can be mitigated by the proposed BNN-based method, by using the prediction mean value and the confidence level at the same time.Although the BNN-based method, like the deterministic NN, remains lack of interpretation, its advantage is that it will know when it is not able to make reliable predictions.Thus when such scenario occurs, in practice, the WEC can be simply switched to a default control strategy that does not rely on or has weak dependency on the predicted wave elevations.

Predictions of the predictable zone
In addition to the wave prediction with quantified uncertainty, the determination of the predictable zone is also achieved with the proposed method.To achieve this, the prediction uncertainty level , which is defined according to Eq. ( 14), is calculated.The results are shown in Figs.4(b) and 5(b).As can be seen, there exists a sharp rise for the value of  at both left and right sides, which can be used to determine the boundaries of the predictable zone.A smoothing process (here via Gaussian kernel smoothing) is first carried out to obtain the smoothed curve of the uncertainty level, which is shown by the dashed line in Figs.4(b) and 5(b).Then the predicable zone is calculated as the zone where the uncertainty level is less than a prescribed threshold As shown, the two approaches, even though drastically different from each other (i.e. one is based on the physics of linear ocean waves and the other is based on data and ML), predict two predictable zones that cover similar time horizons.More importantly, the zone predicted by the proposed method contains (and is larger than) the zone predicted by LWT.This observation is true for all the 24 wave scenarios simulated in this work.As pointed out in previous studies [26,38], the assumption of linear sea states leads to a more restrictive prediction of the predictable zone.Therefore, with the proposed nonlinear method, for the first time, the predictable zone determination is achieved without assuming linear sea states, thus guaranteeing the wave prediction over a predictable zone larger than all the previous works.This brings vital opportunities to the field of the control of WECs and their load forecasting, where the length of the forecasting time horizon is of great importance.It is worth mentioning that this successful determination of the predictable zone is not trivial, as no training data is known regarding the zone boundary, making this task unachievable by all the previous ML-based wave prediction methods.
It is worth pointing out that the identification of the predictable zone boundaries with the proposed method relies on the parameter  0 , which is empirically determined.Similarly, with the LWT-based method in previous works, the determination of the predictable zone relies on the empirically-determined cutting-off frequencies, which means that a different percentage (e.g.85% used in this and other works [36]) of the energy content required for the cutting-off frequencies would lead to a different predictable zone.This kind of empiricalness was also briefly discussed in [36].Because the change from 'predictable' to 'unpredictable' is actually gradual, in practice, it is up to the users/practitioners to determine the value of  0 for the proposed method or the cutting-off frequencies for the LWT-based method.In this work, the value of  0 is set as a 25% deviation from the minimum uncertainty level.The reason of using this value is two-fold: one is to predict the predictable zones as conservative as possible which is achieved by choosing the value of  0 as small as possible; the other is to choose the value of  0 so that the local oscillations of the uncertainty level (as shown in Figs.4(b) and 5(b)) do not hinder the identification of the zone boundaries.This trade-off leads to the current selection of  0 .In addition, because of the above-mentioned empiricalness, a meaningful comparison between the predictable zones by the proposed method and by the LWT should include both predictable zone length and the prediction accuracy within the zone.This is because the empirical value of  0 (or the cutting-off frequencies in the LWT-based method) also has an impact on the evaluation of the overall accuracy within the predictable zone.For example, it is expected that a larger value of  0 will lead to a larger predictable zone, which will then lead to a larger overall prediction error because the error increases towards the zone boundaries.Therefore, in the following subsections, the quantitative evaluation of the proposed method, in terms of both the prediction accuracy and the predictable zone length, is carried out.Its performance is compared with other methods, i.e. the LWT-based and NN-based methods.

Quantitative evaluations of the prediction performance
To quantify the prediction accuracy of the proposed wave prediction method, the error distributions over the whole prediction time domain are first investigated.The root mean square error (RMSE) of the predictions compared with the reference values at different time horizons, averaged over all the simulated wave scenarios representing sea condition #1 and #2 respectively, are calculated and the results are given in Fig. 6.The results by LWT-based and NN-based methods are also included for comparison.As shown, the proposed method outperforms other methods consistently throughout the prediction time domain, with a moderate improvement over NN-based method and a significant improvement over the LWT-based method.Particularly, in the center part of the time domain, the prediction error by the proposed method is around 0.1 m and 0.3 m lower than the LWTbased method at the sea condition #1 and #2 respectively.Towards the zone boundaries, the prediction error by the LWT-based method rises earlier and has a larger magnitude (i.e. the error exceeds 1.0 m and 4.0 m at the sea condition #1 and #2 respectively) than the proposed method, demonstrating the proposed method' ability in both extending the predictable zone and improving the accuracy.In addition, a similar trend at the zone boundaries is observed between the NN-based and the proposed methods.However, because the NN-based method is not able to predict the predictable zone, a restrictive zone by LWT is usually imposed [36,37], limiting the NN-based method for predictable zone extension.
To evaluate the overall prediction error, the RMSE of the predictions compared with the reference values is then calculated, where the prediction beyond the predictable zone is excluded.Here, two prediction RMSEs are calculated, which are defined as Table 3 The results for the whole prediction time period including both reconstruction and forecasting.Both the MRMSEs of the wave elevation predictions and the length of the predictable zone are given.The predictable zone cannot be obtained by the NN-based method, thus denoted as ''-'' in this table.and

Sea condition
where [  min ,   max ] and [  min ,   max ] represent the predictable zone predicted by LWT and the proposed method respectively, and ℎ *  and ℎ  represent the predicted and the reference values of the wave elevation at the th time instant.The RMSEs are first calculated for all the wave scenarios simulated in this work, then the mean root mean square error (MRMSE), which is defined as the mean value of the RMSEs averaged over the simulated wave scenarios, is calculated.The MRMSEs are calculated separately for the wave scenarios representing the sea condition #1 and the sea condition #2.The results are given in Table 3, including the MRMSEs for the proposed method, the LWT-based method, and the deterministic NN-based method.The corresponding MRMSEs normalized by the half of the significant wave heights are also given in Table 3.The significant wave height is used here to normalize the prediction errors as it is directly related to the surface variance [28].As shown, for the sea condition #1 and #2, the prediction errors (in terms of  2 ) are just 0.294 m and 0.816 m by the proposed method, while they are 0.384 m and 1.25 m by LWT and 0.317 m and 0.941 m by the NN-based method.The proposed method clearly outperforms other methods, whether the time horizon is restricted to the predictable zone determined by LWT or to the one predicted by the proposed method.More specifically, for the sea condition #1, the prediction error by the proposed method is 21.2% and 6.8% lower than the LWTbased and NN-based methods in terms of  1 , while they are 23.4% and 7.3% in terms of  2 .As for the sea condition #2, the prediction error by the proposed method are 29.9% and 13.3% lower than the LWTbased and NN-based methods in terms of  1 , while they are 34.7% and 13.3% in terms of  2 .These results thus clearly show that the performance improvement of the proposed nonlinear method compared with LWT-based method is greater at the sea condition #2 (where the improvement is around 30%) than at the sea condition #1 (where the improvement is around 20%).This is reasonable as the waves at the sea condition #2 (where the wave steepness is equal to 0.0291) are actually steeper (thus with stronger nonlinear effects) than at the sea condition #1 (where the wave steepness is equal to 0.0265).As for the comparison of the proposed method with the NN-based method, the BNN is only moderately better than the NN, where the prediction error, on average, is only improved by around 10%.This is actually expected, since the NN and the BNN are similar in tackling the wave prediction problem, which is the updating of the training variables (i.e. the weight matrix for the NN and the probability distributions for the BNN) to fit the training data.More specifically, their ability in handling nonlinearity is the same, i.e. the processing of the input to the output via multiple layers with nonlinear activation.The BNN can be seen as the probabilistic extension of the NN, although the implementation details, including the training variables, the training loss functions, etc., are different.Therefore, by design, the difference between NN and BNN in handling nonlinear wave effects is small and their main difference lies in the quantification of the uncertainty.There is also a moderate performance gain of the BNN compared to the NN, which is probably because the traditional deterministic NN is more prone to overfitting than the probabilistic BNN.
The above results clearly demonstrate that the proposed method not only achieves better accuracy in the traditionally-determined predicable zone, but also is able to predict wave elevations accurately over a larger time horizon.To further illustrate this point, the lengths of the predictable zones predicted by LWT and the proposed method are calculated.The predicted zone lengths are calculated for all the wave scenarios and then averaged over the wave scenarios representing the sea condition #1 and #2 respectively.The results are given in Table 3.As shown, for the sea condition #1, the average length of the predictable zone is 189.9 s by the proposed method, which is 16.2% larger than the predictable zone by LWT.As for the sea condition #2, the average length of the predictable zone is 196.0 s by the proposed method, which is 13.4% larger than the predictable zone by LWT.In conclusion, the results in Table 3 fully demonstrate the proposed method's performance over the LWT-based and NN-based methods, in terms of both prediction accuracy and the extent of the predictable zone.
In addition, the computational costs required for the different methods are given here to complete the performance comparisons.All the wave predictions in this work are carried out using a MacBook Pro laptop with 2 GHz Intel Quad-Core i5 processor.For all the wave scenarios considered in this work, on average, the computational time needed for the LWT-based method, the NN-based method, and the proposed method are 0.017 s, 0.031 s, and 0.041 s respectively.This shows that all these methods can achieve real-time wave predictions.In practical applications, as long as the new measurement data at the sensor location becomes available (i.e. it is measured every 0.5 s in this work), it can be fed into the proposed method to predict, in real-time, the wave elevations at the location of the WEC, which will then be processed by a real-time controller for the optimal control of the WEC.

Evaluations of the method for short-term forecasting
To demonstrate the use of the proposed method for the control of WECs and their load forecasting, the following parts focus on the analysis of the results for wave forecasting.First, from the prediction results including both wave reconstruction and forecasting, the forecasting part is extracted and the results are shown in Figs.7 and  8 for the wave scenarios representing the sea conditions #1 and #2 respectively.The predictable time horizons predicted by LWT and the proposed method are also shown in Figs.7 and 8 to illustrate how far into the future that credible prediction can be made.As shown, the wave elevations predicted by the proposed method and the reference values match with each other very well and the prediction uncertainty covers the reference values quite well throughout the prediction time horizon, demonstrating the successful quantification of the prediction uncertainty.The comparison between different methods shows that the proposed method performs better than other methods throughout the prediction time horizon.In particular, the linear approach shows clear discrepancy with the reference values, including both the magnitude and the phase of the predicted wave, especially when the time instant is beyond the boundary of the LWT-based predictable zone.This is not unexpected, as the LWT-based wave prediction and predictable zone determination are both based on linear wave assumption.Thus the performance is expected to deteriorate beyond the LWT-based

Table 4
The results for the wave forecasting part.Both the MRMSEs of the wave elevation predictions and the length of the predictable zone are given.The predictable zone cannot be obtained by the NN-based method, thus denoted as ''-'' in this To quantify the accuracy of the proposed method for wave forecasting, the prediction RMSEs are calculated here, which are defined as and for the forecasting horizons determined by LWT and the proposed method respectively.The prediction MRMSEs are given in Table 4.As can be seen, the proposed method outperforms other methods in terms of both forecasting accuracy and the extent of the forecasting horizon. .These results thus illustrate that, similar to the wave prediction for the whole time period, the performance improvement for the shortterm wave forecasting by the proposed method over LWT is greater at the sea condition #2 (where the nonlinear wave effects are stronger) than at the sea condition #1.In addition, the average forecasting time horizons by the proposed method are 41.6 s and 24.1 s for the sea condition #1 and #2, while they are only 34.8 s and 13.8 s by the LWTbased method.The forecasting time horizon is thus expanded with the proposed method by 19.5% and 74.6% for the sea condition #1 and #2 respectively.It is therefore concluded that the proposed method is able to achieve wave forecasting with greater accuracy and over a predictable horizon substantially larger than previous works.

Conclusions
In this paper, the probabilistic, phase-resolved, real-time prediction of ocean waves was investigated, where the prediction uncertainty, including both the aleatory uncertainty (i.e. the uncertainty of the phaseresolved wave information) and the epistemic uncertainty (i.e. the uncertainty due to the model's ability), was rigorously quantified.The proposed method was developed based on BML approach, which can take advantage of the ML model's ability in tackling complex nonlinear systems while taking various kinds of uncertainties into account via the Bayesian framework.Different from the previous wave prediction works focusing on the point estimation of the wave elevations, this work formulated the phase-resolved wave prediction as probabilistic and the proposed method was designed to predict the probability distributions of the wave elevations, so that various quantities (such as the prediction mean value and the 95% confidence interval) can be predicted.In addition, based on the proposed wave prediction method, a new way to predict the predictable zone was also proposed.As all the previous works determined the predictable zone based on LWT (which leads to overly conservative estimations), this work achieved, for the first time, the prediction of the predictable zone without assuming linear sea states.
To evaluate the proposed method, a set of wave tank experiments were carried out, where in total 24 wave scenarios were physically simulated.The LWT-based and the deterministic NN-based wave prediction methods were also implemented for comparison.The prediction results for the whole time period, including both wave reconstruction and forecasting, were first examined.The results showed that the phaseresolved wave elevations predicted by the proposed method matched with the experimental data accurately, and the prediction uncertainty and its variations across the time horizon were also well captured throughout the prediction time period.In particular, the maximum value of the MRMSEs of the wave elevation predictions normalized by the half of the significant wave height was 10.5% for the proposed method, while it was 16.1% for the LWT-based method and was 12.1% for the deterministic NN-based method.More importantly, the time horizon where credible prediction can be achieved was enlarged by the proposed method, compared with previous works which were all based on LWT for predictable zone determination.More specifically, the results showed that the predicable zones were enlarged by 16.2% and 13.4% respectively for the sea conditions #1 and #2 considered in this work.
To demonstrate the use of the proposed method for the control of WECs and their load forecasting, the prediction performance for the forecasting part was then analyzed.The results showed that the proposed method was able to achieve wave forecasting with greater accuracy and over a substantially larger forecasting horizon.Particularly, the maximum value of the normalized prediction MRMSEs was 10.2% for the proposed method, while it was 22.9% for the LWT-based method and was 12.1% for the deterministic NN-based method.Moreover, compared with previous works, the proposed method expanded the credible forecasting time horizons by 19.5% and 74.6% for the sea conditions #1 and #2 respectively.
Based on the proposed method and the findings of this paper, several important directions for future works are described below.(1) In order to tackle various uncertainties in real-world ocean conditions, such as the potential change of the WEC locations and the significant change of sea states, comprehensive investigations are needed including the implementation of an online model updating process and the development of a transfer learning strategy from wave tanks to real oceans.(2) This work investigated three real-time wave prediction methods, i.e. the LWT-based, NN-based, and the proposed BNN-based methods.It is also interesting to carry out comparison studies with the methods based on weakly nonlinear wave models (such as the choppy wave model).Such studies will be very helpful in understanding the nonlinear wave effects (such as the nonlinear wave phase velocity and the nonlinear wave shape) in the phase-resolved wave prediction and the corresponding predictable zone determination.(3) The wave tank experiments in the current work consider the uni-directional waves.Future works may involve the investigation of the three-dimensional wave field predictions where complex spatiotemporal wave features are present.(4) To tackle the occurrence of freak waves, the training and validation of the proposed method with real-world wave data containing both typical wave conditions and freak waves, are also of great practical interest.( 5) Finally, based on the accurately predicted wave information and its uncertainty, the uncertainty-aware previewbased WEC control strategies are an interesting direction to explore in the future.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1 .
Fig. 1.An illustration of the probabilistic phase-resolved wave prediction problem investigated in this work.

Fig. 2 .
Fig. 2.An illustration of the BNN model developed in this work.
) where the training variable  includes   and   in all the layers (i.e. = [  ,   ], 1 ≤  ≤  ℎ + 2),  is the activation function (which is specified as the rectified linear unit (ReLU) activation in this work),  is the transformation function (which is specified as the softplus function in this work) that guarantees the positivity of the returned standard deviation, and  is the linear activation that returns the mean values of the output.Different from the previous NN-based wave prediction works where   and   are the trainable weight matrix and the NN training aims at obtaining the optimal values of   and   to fit the training data, this work employs the BML framework where   and   are random variables and the NN training aims at obtaining the posterior distributions of these random variables, i.e. the probability distributions of these random variables given the training dataset (including the training input and the training target).

Fig. 3 .
Fig. 3.The illustration of the wave tank experiments.
respectively.In addition, for each sea condition, two different peak enhancement factors (i.e. = and  = 3.3) and six different random seeds are considered, resulting in twelve experimental runs for each sea condition.Thus in total wave scenarios are physically simulated in this work.The two peak enhancement factors are studied to understand the effects caused by spectral shape, where  = 3.3 corresponds to the JONSWAP spectral with narrow band and  = 1 is related to the PM spectral with broader band.For each wave scenario, in order to obtain one-hour simulation at full scale, approximately 8.5-min physical tank testing is needed at 1/50 scale, i.e. the time scale ratio is √ 1∕50 based on the Froude model law.Therefore, in this work, the physical simulations are carried out for about 9.5 min.The simulations during the first and the last 30 s are then discarded, resulting 8.5 min of well-established wave environment, which correspond to one hour at full scale.

Fig. 4 .
Fig. 4. The results of wave prediction (including both reconstruction and forecasting) with quantified uncertainty by the proposed method for a typical wave scenario representing the sea condition #1, including the predictions of both (a) the wave elevations, where the ground truth and the results by the LWT-based and the deterministic NN-based methods are also shown; and (b) the predicable zones, where the solid and dashed lines represent the uncertainty level and the corresponding smoothed curve while the vertical gray line indicates the predictable zone by the proposed method.

Fig. 5 .
Fig. 5.The results of wave prediction (including both reconstruction and forecasting) with quantified uncertainty by the proposed method for a typical wave scenario representing the sea condition #2, including the predictions of both (a) the wave elevations, where the ground truth and the results by the LWT-based and the deterministic NN-based methods are also shown; and (b) the predicable zones, where the solid and dashed lines represent the uncertainty level and the corresponding smoothed curve while the vertical gray line indicates the predictable zone by the proposed method.

Fig. 6 .
Fig. 6.The error distributions over the whole prediction time domain, averaged over all the simulated wave scenarios representing sea condition #1 and #2 respectively.The results by the LWT-based and the deterministic NN-based methods are also shown for comparison.

Fig. 7 .
Fig. 7.The results of wave forecasting with quantified uncertainty by the proposed method for a typical wave scenario representing the sea condition #1, including the predictions of both the wave elevations and the predicable zones.The corresponding ground truth and the wave elevations predicted by the LWT-based and the deterministic NN-based methods are also shown for comparison.

Fig. 8 .
Fig. 8.The results of wave forecasting with quantified uncertainty by the proposed method for a typical wave scenario representing the sea condition #2, including the predictions of both the wave elevations and the predicable zones.The corresponding ground truth and the wave elevations predicted by the LWT-based and the deterministic NN-based methods are also shown for comparison.
and the mitigation of

Table 1
The advantages of the phase-resolved wave prediction method proposed in this work compared with existing works in the literature.

Table 2
The sea conditions investigated in this work.

table .
methods is at the time instants between the zone boundaries predicted by LWT and the proposed method.This further demonstrates that accurate forecasting of ocean waves over a consistent and larger credible forecasting time horizon is achieved with the proposed method.
The prediction errors (in terms of  4 ) by the proposed method are only 0.287 m and 0.793 m for the sea condition #1 and #2, while they are 0.394 m and 1.78 m by the LWT and 0.324 m and 0.898 m by the NN-based method.More specifically, for the sea condition #1, the prediction error by the proposed method is 24.8% and 10.3% lower than the LWT-based and the deterministic NN-based methods in terms of  3 , while they are 27.2% and 11.4% in terms of  4 .And for the sea condition #2, the prediction error by the proposed method are 44.5% and 11.1% lower than the LWT-based and the deterministic NN-based methods in terms of  3 , while they are 55.4% and 11.7% in terms of  4