Clarify interpretation of noise distributions #656

Open

sebapersson wants to merge 1 commit into PEtab-dev:main from sebapersson:noise_distributions

+11 −5

Contributor

sebapersson commented Dec 17, 2025

When implementing support for LogLaplace in PEtab.jl, I realized that the spec's interpretation of the noise distributions is not entirely clear. In particular, for the supported distributions, the model output is assumed to be the median of the data distribution, which is not necessarily its mean or location.

For example, let $m$ be the measured value, $y := \text{observableFormula}$ the simulated value, and $\sigma$ the noise parameter. For the LogNormal distribution in PEtab we have $\log(m) \sim \mathcal{N}(\log(y), \sigma)$, which implies $m \sim \mathcal{LN}(\log(y), \sigma)$. The median of this LogNormal is $y$ (the exponential of its first argument), whereas its mean is not. A similar interpretation holds for LogLaplace. Overall, this PR aims to clarify this.
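The median claim can be checked numerically. The following is a minimal sketch using only the Python standard library, not code from PEtab or PEtab.jl; the helper name `sample_lognormal_measurements` and the values of `y` and `sigma` are hypothetical, and `sigma` is assumed to be the standard deviation of $\log(m)$.

```python
import math
import random
import statistics


def sample_lognormal_measurements(y, sigma, n, seed=0):
    """Draw n measurements m such that log(m) ~ Normal(log(y), sigma).

    sigma is interpreted as the standard deviation of log(m).
    """
    rng = random.Random(seed)
    return [rng.lognormvariate(math.log(y), sigma) for _ in range(n)]


y, sigma = 2.0, 0.5  # hypothetical simulated value and noise parameter
samples = sample_lognormal_measurements(y, sigma, n=200_000)

sample_median = statistics.median(samples)
sample_mean = statistics.fmean(samples)
analytic_mean = y * math.exp(sigma**2 / 2)  # mean of a log-normal distribution

print(f"sample median: {sample_median:.3f} (y = {y})")
print(f"sample mean:   {sample_mean:.3f} (y * exp(sigma^2 / 2) = {analytic_mean:.3f})")
```

Under these assumptions the sample median should land close to `y`, while the sample mean should land close to `y * exp(sigma**2 / 2)`, which is strictly larger than `y`.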


          Clarify interpretation of noise distributions

54132c0

sebapersson requested a review from a team as a code owner

December 17, 2025 09:28

dilpath approved these changes

Member

dilpath left a comment

Thanks!

doc/v2/documentation_data_format.rst

    
                   - Probability density function (PDF)
-                * - Gaussian distribution
+                * - | Gaussian distribution
+                    | (i.e., :math:`m` is normally distributed as :math:`m \sim \mathcal{N}(y, \sigma)`)

Member

dilpath Jan 14, 2026

Suggested change

      
-                  | (i.e., :math:`m` is normally distributed as :math:`m \sim \mathcal{N}(y, \sigma)`)
+                  | (i.e., :math:`m` is normally distributed as :math:`m \sim \mathcal{N}(y, \sigma^2)`)

doc/v2/documentation_data_format.rst

    
                        \pi(m|y,\sigma) = \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(m-y)^2}{2\sigma^2}\right)
                 * - | Log-normal distribution
-                    | (i.e., :math:`\log(m)` is normally distributed)
+                    | (i.e., :math:`\log(m)` is normally distributed as :math:`\log(m) \sim \mathcal{N}(\log(y), \sigma)`)

Member

dilpath Jan 14, 2026

Suggested change

      
-                  | (i.e., :math:`\log(m)` is normally distributed as :math:`\log(m) \sim \mathcal{N}(\log(y), \sigma)`)
+                  | (i.e., :math:`\log(m)` is normally distributed as :math:`\log(m) \sim \mathcal{N}(\log(y), \sigma^2)`)
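Both suggestions turn on whether the second argument of $\mathcal{N}$ denotes the standard deviation or the variance. The Gaussian PDF quoted in the diff uses $\sigma$ as the standard deviation, so the variance is $\sigma^2$. A minimal sketch of that check, using only the Python standard library (the function name `gaussian_pdf_spec` and the numeric values are hypothetical):

```python
import math
from statistics import NormalDist


def gaussian_pdf_spec(m, y, sigma):
    """Gaussian PDF as written in the spec excerpt:
    1 / (sqrt(2*pi) * sigma) * exp(-(m - y)^2 / (2 * sigma^2)).
    """
    return math.exp(-((m - y) ** 2) / (2 * sigma**2)) / (math.sqrt(2 * math.pi) * sigma)


y, sigma = 1.5, 0.3  # hypothetical simulated value and noise parameter

# statistics.NormalDist is parameterized by the standard deviation.
reference = NormalDist(mu=y, sigma=sigma)

max_abs_diff = max(
    abs(gaussian_pdf_spec(m, y, sigma) - reference.pdf(m))
    for m in (0.9, 1.2, 1.5, 1.8, 2.4)
)
print(f"largest PDF difference: {max_abs_diff:.2e}")
print(f"variance of reference distribution: {reference.variance:.4f}")
```

The two densities agree when `sigma` is passed as the standard deviation, which matches writing the distribution as $\mathcal{N}(y, \sigma^2)$ in the variance convention.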

doc/v2/documentation_data_format.rst

Comment on lines +785 to +787

    
+              Note that, for all continuous distributions, the simulated value is modeled
+              as the median of the noise distribution; i.e., measurements are assumed to
+              be equally likely to lie above or below the model output.

Member

dilpath Jan 14, 2026

Possibly not true for the prior distributions; I didn't check. It's implied here that this note doesn't apply to prior distributions, but here's a suggestion to make that explicit:

Suggested change

      
-            Note that, for all continuous distributions, the simulated value is modeled
-            as the median of the noise distribution; i.e., measurements are assumed to
-            be equally likely to lie above or below the model output.
+            Note that, for all PEtab noise distributions, the simulated value is modeled
+            as the median of the noise distribution; i.e., measurements are assumed to
+            be equally likely to lie above or below the model output.
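The "equally likely to lie above or below" statement can be verified from the cumulative distribution functions. The sketch below is an illustration, not PEtab library code: it assumes the four noise models discussed in this PR (normal, Laplace, and their log-transformed variants), takes $\sigma$ as the standard deviation (normal) or scale (Laplace) on the respective scale, and uses distribution labels that, apart from `laplace`, are placeholders chosen for this example.

```python
import math


def normal_cdf(x, loc, sigma):
    """CDF of a normal distribution with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - loc) / (sigma * math.sqrt(2.0))))


def laplace_cdf(x, loc, sigma):
    """CDF of a Laplace distribution with scale sigma."""
    z = (x - loc) / sigma
    return 0.5 * math.exp(z) if z < 0 else 1.0 - 0.5 * math.exp(-z)


BASE_CDF = {"normal": normal_cdf, "laplace": laplace_cdf}


def prob_measurement_below(m, y, sigma, distribution):
    """P(measurement <= m) given the simulated value y.

    For the "log-" variants, log(measurement) follows the base
    distribution centered at log(y); m and y must then be positive.
    """
    log_scale = distribution.startswith("log-")
    cdf = BASE_CDF[distribution[4:] if log_scale else distribution]
    if log_scale:
        return cdf(math.log(m), math.log(y), sigma)
    return cdf(m, y, sigma)


y, sigma = 2.0, 0.5  # hypothetical simulated value and noise parameter
for name in ("normal", "laplace", "log-normal", "log-laplace"):
    # Evaluating the CDF at m = y gives the probability mass below the model output.
    print(name, prob_measurement_below(y, y, sigma, name))
```

For each of these four distributions, the probability mass below `y` is one half, which is the defining property of the median.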

doc/v2/documentation_data_format.rst

    
                 * - Laplace distribution
-                  - ``laplace``
+                  - | ``laplace``
+                    | (i.e., :math:`m` is Laplace distributed as :math:`m \sim \mathcal{L}(y, \sigma)`)

Member

dilpath Jan 14, 2026

Move this to the first column. GitHub doesn't let me make the suggestion, but you can see the issue here: https://petab--656.org.readthedocs.build/en/656/v2/documentation_data_format.html#noise-distributions

doc/v2/documentation_data_format.rst

    
                        \pi(m|y,\sigma) = \frac{1}{2\sigma}\exp\left(-\frac{|m-y|}{\sigma}\right)
                 * - | Log-Laplace distribution
-                    | (i.e., :math:`\log(m)` is Laplace distributed)
+                    | (i.e., :math:`\log(m)` is Laplace distributed as :math:`\log(m) \sim \mathcal{L}(\log(y), \sigma)`)

Member

dilpath Jan 14, 2026

I would be fine with shortening all of these to make the column narrower, and with writing "Laplace" explicitly, since I guess it doesn't have a commonly used symbol, unlike the normal distribution's $\mathcal{N}$.

e.g.:

Suggested change

      
-                  | (i.e., :math:`\log(m)` is Laplace distributed as :math:`\log(m) \sim \mathcal{L}(\log(y), \sigma)`)
+                  | (i.e., :math:`\log(m) \sim \mathrm{Laplace}(\log(y), \sigma)`)
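To make the shorthand $\log(m) \sim \mathrm{Laplace}(\log(y), \sigma)$ concrete, the density of $m$ itself follows from the Laplace PDF quoted in the diff by a change of variables, which contributes a Jacobian factor $1/m$. The sketch below is a derivation under that assumption rather than a formula quoted from the spec; the function names and numeric values are hypothetical.

```python
import math


def laplace_pdf(x, loc, sigma):
    """Laplace PDF as in the spec excerpt: exp(-|x - loc| / sigma) / (2 * sigma)."""
    return math.exp(-abs(x - loc) / sigma) / (2.0 * sigma)


def log_laplace_pdf(m, y, sigma):
    """Density of m when log(m) ~ Laplace(log(y), sigma).

    The factor 1/m is the Jacobian of the transformation m -> log(m).
    """
    if m <= 0:
        return 0.0
    return laplace_pdf(math.log(m), math.log(y), sigma) / m


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


y, sigma = 2.0, 0.5  # hypothetical simulated value and noise parameter

# Probability mass of the log-Laplace density on (0, y].
mass_below_y = trapezoid(lambda m: log_laplace_pdf(m, y, sigma), 0.0, y, 10_000)
print(f"mass below y: {mass_below_y:.6f}")
```

Integrating this density from 0 to `y` recovers a probability of one half, consistent with `y` being the median of the log-Laplace distribution.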

dweindl reviewed

doc/v2/documentation_data_format.rst

    
               Denote by :math:`m` the measured value,
               :math:`y:=\text{observableFormula}` the simulated value
-              (the location parameter of the noise distribution),
+              (the median of the noise distribution),

Member

dweindl Jan 15, 2026

Thanks, you are right, that wasn't worded correctly.

Fine to merge with Dilan's suggestions, but with respect to #654, I am wondering if we really want to refer here to the median, or just generally to something like "first parameter in canonical notation", or just nothing at all.

dweindl reviewed

doc/v2/documentation_data_format.rst

    
               :math:`y:=\text{observableFormula}` the simulated value
-              (the location parameter of the noise distribution),
+              (the median of the noise distribution),
               and :math:`\sigma` the scale parameter of the noise distribution

Member

dweindl Jan 15, 2026

Same problem here, right? For the log-normal distribution, sigma wouldn't be the scale but its logarithm?

matthiaskoenig approved these changes
