Data assimilation has recently been the focus of much attention for integrated surface–subsurface hydrological models, whereby joint assimilation of water table, soil moisture, and river discharge measurements with the ensemble Kalman filter (EnKF) has been extensively applied. Although the EnKF has been specifically developed to deal with nonlinear models, integrated hydrological models based on the Richards equation still represent a challenge, due to strong nonlinearities that may significantly affect the filter performance. Thus, more studies are needed to investigate the capabilities of the EnKF to correct the system state and identify parameters in cases where the unsaturated zone dynamics are dominant, as well as to quantify possible tradeoffs associated with assimilation of multi-source data. Here, the CATHY (CATchment HYdrology) model is applied to reproduce the hydrological dynamics observed in an experimental two-layered hillslope, equipped with tensiometers, water content reflectometer probes, and tipping bucket flow gages to monitor the hillslope response to a series of artificial rainfall events. Pressure head, soil moisture, and subsurface outflow are assimilated with the EnKF in a number of scenarios and the challenges and issues arising from the assimilation of multi-source data in this real-world test case are discussed. Our results demonstrate that the EnKF is able to effectively correct states and parameters even in a real application characterized by strong nonlinearities. However, multi-source data assimilation may lead to significant tradeoffs: the assimilation of additional variables can lead to degradation of model predictions for other variables that are otherwise well reproduced. Furthermore, we show that integrated observations such as outflow discharge cannot compensate for the lack of well-distributed data in heterogeneous hillslopes.

Data assimilation, i.e., the process in which observations of a system are
merged in a consistent manner with numerical model predictions

As a consequence of such popularity, the EnKF is also increasingly applied
with integrated surface–subsurface hydrological models (IHSSMs), whereby
multiple terrestrial compartments (e.g., snow cover, surface water,
groundwater) are solved simultaneously, in an attempt to tackle environmental
problems in a holistic approach

In spite of such a strong interest, several issues related to the use of EnKF
for state and parameter estimation in integrated hydrological modeling remain
unresolved. The subsurface component of many IHSSMs is based on the solution
of the Richards equation in one or three dimensions and, although recent
studies with numerical experiments in synthetic test cases

Within this context, the main goals of the present study are (i) to assess whether the EnKF in combination with a Richards equation-based hydrological model is able to effectively improve states and parameters in a real-world test case characterized by dominant unsaturated dynamics and (ii) to quantify the tradeoffs associated with multi-source data assimilation.

To pursue these goals, the EnKF is used in combination with the CATHY
(CATchment HYdrology) model

The artificial hillslope is placed inside a concrete structure of length
6 m, width 2 m and height varying linearly from 3.5 to 0.5 m,
corresponding to a slope of 32

Plan view and longitudinal cross section of the artificial hillslope, along with the position of the monitoring instruments. Tensiometers are indicated by the letter “P”, while WCR probes are denoted by “W”. All dimensions are in centimeters.

Six tensiometers and six water content reflectometer (WCR) probes are used to
measure pressure head and water content in the top soil layer. All the
sensors are located in an intermediate position of the hillslope, as shown in
Fig.

A Campbell Scientific (CR 1000) data logger is used to collect and record all
the data with a frequency of

Data collected during the experiment:

Figure

The CATHY model

The unsaturated hydraulic properties are taken into account by means of the
van Genuchten functions (e.g.,

It is worth noting that the Richards equation is strongly nonlinear, due to
the retention curves (Eqs.
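
To illustrate where this nonlinearity comes from, the following minimal sketch evaluates van Genuchten-type retention and relative conductivity curves. It assumes the common parameterization with a scaling parameter `alpha`, a shape parameter `n`, `m = 1 - 1/n`, and the Mualem form for the relative conductivity; the function names and the parameter values are hypothetical and may differ from the formulation and values adopted in the model.

```python
import math


def effective_saturation(psi, alpha, n):
    """Effective saturation for pressure head psi (m); alpha in 1/m (assumed form)."""
    if psi >= 0.0:
        return 1.0  # saturated conditions
    m = 1.0 - 1.0 / n
    return (1.0 + (alpha * abs(psi)) ** n) ** (-m)


def water_content(psi, theta_r, theta_s, alpha, n):
    """Volumetric water content between residual (theta_r) and saturated (theta_s)."""
    return theta_r + (theta_s - theta_r) * effective_saturation(psi, alpha, n)


def relative_conductivity(psi, alpha, n):
    """Relative hydraulic conductivity (Mualem form, assumed)."""
    m = 1.0 - 1.0 / n
    se = effective_saturation(psi, alpha, n)
    return math.sqrt(se) * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2


# Hypothetical sand-like parameters, for illustration only.
sand = dict(alpha=3.0, n=2.5)
for psi in (0.0, -0.1, -0.5, -1.0):
    theta = water_content(psi, theta_r=0.05, theta_s=0.35, **sand)
    kr = relative_conductivity(psi, **sand)
    print(f"psi={psi:5.2f} m  theta={theta:.3f}  Kr={kr:.3e}")
```

In this parameterization the relative conductivity drops by orders of magnitude over a pressure head range of about one meter, which is the kind of behavior that can make linear update schemes struggle.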

The ensemble Kalman filter

In this paper the EnKF is implemented according to the numerical formulation
proposed by

Whenever observed data are available, the EnKF can compute the updated matrix

The updated mean can be calculated as

The matrix

The updated anomalies,

When updating the states only, the elements of

Perturbation parameters for the generation of the ensemble initial conditions, hydraulic conductivities and atmospheric forcing.

The artificial hillslope is discretized with a surface triangular grid
resulting from the subdivision of square cells of 10 cm side. The triangular
grid is then replicated vertically for a total of 25 layers to generate the
three-dimensional tetrahedral mesh (Fig.
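
As a rough check on the mesh size implied by this discretization, the sketch below counts nodes and elements. It assumes that the plan dimensions are the 6 m by 2 m of the concrete structure, that each square cell is split into two triangles, and that each triangular prism between two consecutive node planes is split into three tetrahedra. The function name and the splitting rule are assumptions made for illustration.

```python
def mesh_counts(length_m, width_m, cell_m, n_layers, tets_per_prism=3):
    """Count nodes and elements of a layered mesh built from a square-cell grid."""
    nx = round(length_m / cell_m)   # cells along the hillslope
    ny = round(width_m / cell_m)    # cells across the hillslope
    surface_nodes = (nx + 1) * (ny + 1)
    surface_triangles = 2 * nx * ny  # two triangles per square cell (assumed)
    return {
        "surface_nodes": surface_nodes,
        "surface_triangles": surface_triangles,
        "nodes_3d": surface_nodes * (n_layers + 1),
        "tetrahedra": surface_triangles * n_layers * tets_per_prism,
    }


counts = mesh_counts(length_m=6.0, width_m=2.0, cell_m=0.10, n_layers=25)
for name, value in counts.items():
    print(name, value)
```

Under these assumptions the surface grid has 1281 nodes and 2400 triangles, and the three-dimensional mesh has 33306 nodes and 180000 tetrahedra.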

Three-dimensional finite element grid of the hillslope.

In order to generate the ensemble of realizations needed for the application
of the EnKF, we perturb the atmospheric forcing (i.e., rainfall and
evaporation rates), soil properties and initial conditions.
Table

The ensemble of time-variable atmospheric forcing rates was generated with a
sequence of multiplicative perturbations,
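
One possible way to generate such multiplicative perturbations is sketched below. It assumes lognormal multipliers with unit mean and a prescribed coefficient of variation, correlated in time through a first-order autoregressive process; the distribution, the correlation model, the function names, and all numerical values are assumptions made for illustration and are not necessarily those used in this study.

```python
import math
import random


def multiplicative_perturbations(n_steps, cv, rho, rng):
    """Unit-mean lognormal multipliers with AR(1) correlation rho in log space."""
    sigma2 = math.log(1.0 + cv ** 2)
    sigma = math.sqrt(sigma2)
    mu = -0.5 * sigma2              # ensures that the multipliers have mean 1
    w = rng.gauss(0.0, 1.0)
    factors = []
    for _ in range(n_steps):
        factors.append(math.exp(mu + sigma * w))
        # AR(1) recursion that keeps w standard normal at every step
        w = rho * w + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    return factors


def perturb_forcing(nominal_rates, n_members, cv, rho, seed=0):
    """Return one perturbed forcing time series per ensemble member."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_members):
        factors = multiplicative_perturbations(len(nominal_rates), cv, rho, rng)
        ensemble.append([q * f for q, f in zip(nominal_rates, factors)])
    return ensemble


# Hypothetical rainfall rates (mm/h): dry, two wet steps, dry.
ensemble = perturb_forcing([0.0, 5.0, 5.0, 0.0], n_members=8, cv=0.3, rho=0.9, seed=1)
print(ensemble[0])
```

Because the perturbation is multiplicative, time steps with zero nominal rate remain exactly zero in every realization, and the perturbed rates never change sign.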

The initial conditions consist of a uniform value of pressure head,

Perturbed soil parameters, for both sand and clay, include the saturated
hydraulic conductivity as well as the parameters of the van Genuchten
retention curves. Table

The parameters of the van Genuchten retention curves

The EnKF algorithm implemented here is actually an ensemble transform Kalman
filter
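
For readers unfamiliar with this class of filters, a generic ensemble transform analysis step can be sketched as follows. This is a textbook-style formulation written with NumPy under several assumptions (diagonal measurement error covariance, symmetric square root of the transform, data anomalies and innovations scaled by the measurement standard deviation, and a dampening factor applied uniformly to all updated variables); it is not the code used in this study, and the function name, arguments, and numerical values are hypothetical.

```python
import numpy as np


def etkf_update(X, Y, d, r_var, dampening=1.0):
    """Ensemble transform Kalman filter analysis step (generic sketch).

    X : (n, N) forecast ensemble; for joint updating, parameters
        (e.g., log-conductivities) are appended below the states.
    Y : (m, N) simulated counterparts of the measurements.
    d : (m,) measured values.
    r_var : (m,) measurement error variances (diagonal R assumed).
    dampening : 1.0 applies the full update, smaller values shrink it.
    """
    N = X.shape[1]
    x_mean = X.mean(axis=1, keepdims=True)
    y_mean = Y.mean(axis=1, keepdims=True)
    A = X - x_mean                           # state anomalies
    sd = np.sqrt(r_var)[:, None]
    S = (Y - y_mean) / sd                    # normalized data anomalies
    innov = (d[:, None] - y_mean) / sd       # normalized innovation

    # C = I + S^T S / (N - 1) is symmetric positive definite.
    evals, evecs = np.linalg.eigh(np.eye(N) + S.T @ S / (N - 1))
    C_inv = (evecs / evals) @ evecs.T
    T = (evecs / np.sqrt(evals)) @ evecs.T   # symmetric square root of C^-1

    x_mean_a = x_mean + A @ (C_inv @ S.T @ innov) / (N - 1)
    X_a = x_mean_a + A @ T                   # updated mean plus updated anomalies
    return X + dampening * (X_a - X)


# Hypothetical example: one observed pressure head and one unobserved parameter.
rng = np.random.default_rng(0)
N = 50
log_k = rng.normal(-4.0, 0.5, size=N)
head = -1.0 + 0.2 * (log_k + 4.0) + rng.normal(0.0, 0.05, size=N)
X = np.vstack([head, log_k])                 # augmented state vector
Y = head[None, :]
X_a = etkf_update(X, Y, d=np.array([-0.9]), r_var=np.array([0.02 ** 2]))
print("forecast mean:", X.mean(axis=1), "analysis mean:", X_a.mean(axis=1))
```

Dividing the data anomalies and the innovations by the measurement standard deviation makes observations of different type and magnitude dimensionless, which is one simple way to handle multi-source data in a single update.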

When multiple variables were assimilated, the measurement error covariance
matrices, the anomalies of the simulated data, and the innovation vectors
were properly normalized, using values of

Factored covariance matrices used for the perturbation of van
Genuchten parameters

A total of 17 data assimilation scenarios have been simulated, whereby the
assimilation interval, the assimilated variables, the updated variables, and
the uncertainty on the van Genuchten parameters were varied.
Table

Overview of the open loop and data assimilation scenarios.

The performance of the simulations has been evaluated by means of the
root mean square error (RMSE), computed for the different variables,
i.e., pressure head, water content, and subsurface outflow.
The root mean square error is calculated as
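
A minimal sketch of such an error metric is given below, assuming the standard definition of the RMSE over paired simulated and observed values (for instance, the ensemble mean and the measurements at each observation time). The helper names, the ratio to a reference run, and the sample values are assumptions made for illustration and do not reproduce the exact formulation or data of this study.

```python
import math


def rmse(simulated, observed):
    """Root mean square error between paired simulated and observed values."""
    if len(simulated) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_ratio(simulated, reference, observed):
    """Ratio between the RMSE of a scenario and that of a reference run."""
    return rmse(simulated, observed) / rmse(reference, observed)


# Hypothetical pressure head values (m).
observed = [-0.80, -0.60, -0.45, -0.50]
open_loop = [-0.95, -0.80, -0.70, -0.72]
assimilated = [-0.82, -0.63, -0.50, -0.52]
print(rmse(open_loop, observed), rmse(assimilated, observed))
print(rmse_ratio(assimilated, open_loop, observed))
```

A ratio below one indicates that the scenario reproduces the observations better than the reference run for that variable.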

A preliminary sensitivity analysis over a number of EnKF parameters has been
performed, in order to select a final and satisfactory setup for the
subsequent data assimilation scenarios. First, simulations with

Then, several dampening factor values (

According to these preliminary analyses, all scenarios reported in
Table

Table

Normalized root mean square errors for water content, pressure head, and subsurface outflow of the data assimilation scenarios versus corresponding values of the open loop. Symbols in magenta represent values calculated over the assimilation period, while symbols in blue represent values calculated over the validation period.

Figure

Normalized root mean square errors (NRMSEs) for the 17 data
assimilation scenarios under analysis and two open loop (OL)
simulations. The table reports the NRMSE for three variables,
water content, pressure head and outflow discharge, for both the
assimilation and the validation periods. The last three columns
report the mean values calculated over the three variables (WC,
PH and

Ratios between RMSEs in scenarios S6, S8, and
S13

To assess the capabilities and benefits of parameter estimation with the
EnKF, it is useful to compare scenarios with the same assimilated variables
but different updated variables. Figure

Plots of the pressure head in P2

The effect of parameter updating on model predictions for scenarios S6, S8,
and S13 can be visualized in Fig.

Relative frequency distributions of the saturated hydraulic
conductivity,

Relative frequency distributions of the saturated hydraulic
conductivity,

We now focus our attention on the scenarios where multi-source data are
assimilated. The right panels in Fig.

Time evolution of the saturated hydraulic conductivity and van Genuchten parameters (mean values, solid line, together with minimum and maximum values, in dashed lines, to indicate the ensemble spread) for the two types of soil, sand and clay, in scenario S17.

Further insights into the differences between scenarios S10 and S15 can be
gained from Figs.

Figure

Plots of the pressure head in P2,

An additional perspective on parameter estimation is given by Fig.

In summary, the results of parameter updating for the clay show that data would be needed in all the soil layers. However, when dealing with large heterogeneous structures, it is very expensive to have every soil zone properly probed, which is why it is important to assess whether multivariate data assimilation approaches are capable of compensating for the lack of distributed observations with alternative sources of information. Here, an integrated measurement such as the subsurface outflow does not seem to be sufficient to compensate for this lack of representativeness.

Finally, we analyze the tradeoffs in system state predictions associated with
multi-source data assimilation for scenarios S15, S16, and S17.
Figure

Similar issues were reported by

In this study, a Richards equation-based hydrological model, CATHY, has been used with the ensemble Kalman filter to assimilate pressure head, water content, and subsurface outflow data in a real-world test case, represented by an experimental artificial hillslope. A total of 17 data assimilation simulations have been presented and described to provide a comprehensive overview of possible scenarios. Univariate scenarios with the assimilation of water content or pressure head alone were compared to multivariate cases where water content and pressure head were combined with outflow discharge or where water content, pressure head and outflow discharge were jointly assimilated. Regarding the updating strategies, single (state variable) and joint (state variables plus saturated hydraulic conductivity with and without van Genuchten parameters) updating scenarios were considered.

Overall, the capabilities of the ensemble Kalman filter to jointly correct the system states and soil parameters in physically based hydrological models were confirmed, even in a real-world test case such as the one presented here, characterized by dominant unsaturated dynamics and hence strong nonlinearities. Updating of the saturated hydraulic conductivity brought significant improvements in the prediction of pressure head and subsurface outflow, while updating the van Genuchten parameters proved to be highly beneficial to the prediction of the water content dynamics. On the other hand, multivariate data assimilation may lead to significant tradeoffs. For instance, the assimilation of soil moisture in addition to pressure head and subsurface outflow improved water content, but slightly degraded the prediction of the outflow discharge. Moreover, our results suggest that high-quality and representative data are essential for a proper and effective use of data assimilation in physically based hydrological models, as shown by the relatively poor performance of the EnKF in scenarios where pressure head was assimilated, due to temperature disturbances of the data, and by biased estimates of clay parameters, due to the lack of data in this soil layer.

In future studies, more representative data, including observations in the clay, will be assimilated, and the possibility of applying bias-aware filters will be considered to compensate for the effect of temperature in the tensiometric data.

All the data are available from the corresponding author upon request.

AB and EB carried out the experiment. AB conducted the numerical simulations. AB and MC wrote the manuscript. MC supervised the research. All the authors reviewed the manuscript.

The authors declare that they have no conflict of interest.

We gratefully acknowledge the financial support of the University of Padua, through grant CPDA148790. We thank the editor and three anonymous reviewers for their detailed and very helpful comments.

Edited by: Harrie-Jan Hendricks Franssen
Reviewed by: three anonymous referees