As Pete just described, specifying the initial conditions for a model is difficult. In forecasting, whether weather or copepods, a major problem is estimating the initial conditions from the data. Ideally, you would send a few thousand undergrads to the locations of your model grid and have them simultaneously sample whatever you're trying to model. You could then use this data to initialize your model. In the real world, where undergrads are expensive and unreliable, we have to make do with samples collected from only a few locations. We need a way to estimate the state of the system from this sparse array of data.
Data assimilation is a grab-bag of techniques for merging observations and models, and state estimation is the most common application. The idea is to find an initial condition such that, when the model is started from it, the model comes close to hitting the observation points. There are lots of procedures that can do this, with cool names like "4D-Var" or "representers", and each has its advantages, disadvantages, and acolytes. For our system, I'm using the ensemble Kalman filter (EnKF), mainly because it is an ensemble method, and I like ensemble methods. At the moment, I'm using a 10-day update cycle. This means that every sample collected between day j and day j+10 is used to adjust the model state at day j.
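For readers who want to see what an EnKF update looks like in practice, here is a minimal sketch of the textbook "perturbed observations" analysis step in Python with NumPy. It is not the code running in our system: the function name `enkf_update`, the toy 20-cell grid, the observation operator `H`, and all of the numbers are assumptions made up for illustration. It also treats the samples as if they were taken at the same time as the state being adjusted, which glosses over the 10-day window.

```python
import numpy as np


def enkf_update(ensemble, obs, H, obs_var, rng):
    """One stochastic EnKF analysis step (hypothetical helper).

    ensemble : (n_state, n_members) array, one model state per column
    obs      : (n_obs,) array of observed values
    H        : (n_obs, n_state) matrix mapping a state to observation space
    obs_var  : scalar observation-error variance (assumed equal for all obs)
    rng      : numpy.random.Generator used to perturb the observations
    """
    n_members = ensemble.shape[1]
    n_obs = obs.shape[0]

    # Ensemble anomalies in state space and in observation space.
    A = ensemble - ensemble.mean(axis=1, keepdims=True)
    HX = H @ ensemble
    HA = HX - HX.mean(axis=1, keepdims=True)

    # Sample covariances estimated from the ensemble.
    P_HT = A @ HA.T / (n_members - 1)                           # (n_state, n_obs)
    S = HA @ HA.T / (n_members - 1) + obs_var * np.eye(n_obs)   # (n_obs, n_obs)

    # Kalman gain K = P_HT S^{-1}, computed with a solve (S is symmetric).
    K = np.linalg.solve(S, P_HT.T).T

    # Each member is nudged toward its own noisy copy of the observations.
    obs_pert = obs[:, None] + rng.normal(0.0, np.sqrt(obs_var), size=(n_obs, n_members))
    return ensemble + K @ (obs_pert - HX)


# Toy example: 20 grid cells, 200 members, samples at only 3 cells.
rng = np.random.default_rng(0)
n_state, n_members = 20, 200
amplitude = rng.normal(0.0, 1.0, size=(1, n_members))    # error shared by all cells
noise = rng.normal(0.0, 0.2, size=(n_state, n_members))  # cell-by-cell error
prior = 1.0 + amplitude + noise                          # arbitrary units

obs_cells = [2, 10, 17]
H = np.zeros((len(obs_cells), n_state))
H[np.arange(len(obs_cells)), obs_cells] = 1.0
obs = np.full(len(obs_cells), 3.0)

posterior = enkf_update(prior, obs, H, obs_var=0.01, rng=rng)
print("prior mean at sampled cells:    ", prior.mean(axis=1)[obs_cells])
print("posterior mean at sampled cells:", posterior.mean(axis=1)[obs_cells])
```

Because the toy ensemble members share a common error across the grid, the three samples also pull the unsampled cells toward the observed values. That is the sense in which an ensemble method spreads sparse data over the whole model domain.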
I made three runs of the system for Massachusetts Bay. The first is a control run. I started from the climatological (long-term average) initial conditions and let 'er rip. The second used the EnKF to assimilate the PCCS cruise on January 13, 2009, producing a new initial condition for January 11 (remember the 10-day update cycle: January 13 falls in the window that starts on January 11). The third assimilated the cruise from January 30, producing a new initial condition for January 21. Here's how the three runs compare on January 30 for Pseudocalanus:
| No assimilation | First cruise | Second cruise |
|---|---|---|
| ![]() | ![]() | ![]() |
| Model estimates of Pseudocalanus on January 30 | | |
Assimilating the first cruise increased the concentrations on 1/13 a little bit. This led to increased concentrations on 1/30. Assimilating the second cruise increased the values even more. I continued to run all three versions until today (March 3):
| No assimilation | First cruise | Second cruise |
|---|---|---|
| ![]() | ![]() | ![]() |
| Model estimates of Pseudocalanus on March 3 | | |
The model predicts that all three should decline (this comes from the model dynamics: Pseudocalanus usually declines at this time of year), but the large values in the third run allow some persistence.
In case you get the idea that assimilation always increases the values, here are the plots for Centropages on 1/30:
| No assimilation | First cruise | Second cruise |
|---|---|---|
| ![]() | ![]() | ![]() |
| Model estimates of Centropages on January 30 | | |
The first cruise encountered some very high Centropages concentrations, but the levels had declined by the second cruise.








