"""Correlations utilities --- :mod:`MDAnalysis.lib.correlations`
=================================================================================


:Authors: Paul Smith & Mateusz Bieniek
:Year: 2020
:Copyright: Lesser GNU Public License v2.1+

.. versionadded:: 1.0.0

This module is primarily for internal use by other analysis modules. It
provides functionality for calculating the time autocorrelation function
of a binary variable (i.e., one that is either true or false at each
frame for a given atom/molecule/set of molecules). This module includes
functions for calculating both the time-continuous autocorrelation and
the intermittent autocorrelation. The function :func:`autocorrelation`
calculates the continuous autocorrelation only. The data may be
pre-processed using the function :func:`correct_intermittency` in order to
account for intermittency before passing the results to
:func:`autocorrelation`.
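
The two-step workflow described above can be illustrated with a minimal,
self-contained sketch. The helpers ``fill_gaps`` and ``survival_probability``
are hypothetical stand-ins written for this example; they are not the functions
of this module, and the gap-filling logic is a simplified version that is only
intended to give the same result on simple inputs such as the one shown.

```python
def fill_gaps(frames, max_gap):
    """Hypothetical helper: re-add IDs absent for at most `max_gap` consecutive frames."""
    filled = [set(frame) for frame in frames]  # work on a copy of the input
    last_seen = {}  # ID -> index of the most recent frame containing it
    for i, frame in enumerate(frames):
        for item in frame:
            if item in last_seen:
                gap = i - last_seen[item] - 1  # number of frames the ID was absent
                if 0 < gap <= max_gap:
                    for k in range(last_seen[item] + 1, i):
                        filled[k].add(item)
            last_seen[item] = i
    return filled


def survival_probability(frames, tau_max, window_step=1):
    """Hypothetical helper: continuous survival probability S(tau) for tau = 0..tau_max."""
    taus = list(range(1, tau_max + 1))
    per_window = [[] for _ in taus]
    for t0 in range(0, len(frames), window_step):
        n_start = len(frames[t0])
        if n_start == 0:
            continue  # nothing to follow from this time origin
        for tau in taus:
            if t0 + tau >= len(frames):
                break
            # IDs present at every frame from t0 to t0 + tau (inclusive)
            survivors = set.intersection(*frames[t0:t0 + tau + 1])
            per_window[tau - 1].append(len(survivors) / n_start)
    means = [sum(v) / len(v) if v else float("nan") for v in per_window]
    # S(0) is one by definition
    return [0] + taus, [1.0] + means


# Atom 1 leaves for two frames and then returns.
frames = [{0, 1}, {0}, {0}, {0, 1}]

taus, continuous = survival_probability(frames, tau_max=2)
taus_i, intermittent = survival_probability(fill_gaps(frames, max_gap=2), tau_max=2)
```

With this input the continuous calculation penalises the two-frame absence of
atom 1 (``continuous`` is approximately ``[1.0, 0.83, 0.75]``), whereas after
gap filling both atoms count as present throughout and ``intermittent`` is
``[1.0, 1.0, 1.0]``.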

This module is inspired by seemingly disparate analyses that rely on the same
underlying calculation, including the survival probability of water around
proteins :footcite:p:`ArayaSecchi2014`, hydrogen bond lifetimes
:footcite:p:`Gowers2015,ArayaSecchi2014`, and the rate of cholesterol
flip-flop in lipid bilayers :footcite:p:`Gu2019`.

.. seealso::

    Analysis tools that make use of this module:

        * :class:`MDAnalysis.analysis.waterdynamics.SurvivalProbability`
            Calculates the continuous or intermittent survival probability
            of an atom group in a region of interest.

        * :class:`MDAnalysis.analysis.hbonds.hbond_analysis`
            Calculates the continuous or intermittent hydrogen bond
            lifetime.

.. rubric:: References

.. footbibliography::

    N)deepcopy   c           
      D   t          |           t          k    rt          |           dk    st          | d                   t          k    rt	          d          t          |           |k     rt          d          t          t          d|dz                       }d t          |          D             }t          dt          |           |          D ]}t          | |                   }|dk    r|D ]q}||z   t          |           k    r nXt          t          j        | |||z   dz                       }||dz
                               |t          |          z             rd |D             }	|
                    dd           |	
                    dd           ||	|fS )u  Implementation of a discrete autocorrelation function.

    The autocorrelation of a property :math:`x` from a time :math:`t=t_0` to :math:`t=t_0 + \tau`
    is given by:

    .. math::
        C(\tau) = \langle \frac{ x(t_0)x(t_0 +\tau) }{ x(t_0)x(t_0) } \rangle

    where :math:`x` may represent any property of a particle, such as velocity or
    potential energy.

    This function is an implementation of a special case of the time
    autocorrelation function in which the property under consideration can
    be encoded with indicator variables, :math:`0` and :math:`1`, to represent the binary
    state of said property. This special case is often referred to as the
    survival probability (:math:`S(\tau)`). As an example, in calculating the survival
    probability of water molecules within 5 Å of a protein, each water
    molecule will either be within this cutoff range (:math:`1`) or not (:math:`0`). The
    total number of water molecules within the cutoff at time :math:`t_0` will be
    given by :math:`N(t_0)`. Other cases include the hydrogen bond lifetime as
    well as the translocation rate of cholesterol across a bilayer.

    The survival probability of a property of a set of particles is
    given by:

    .. math::
        S(\tau) =  \langle \frac{ N(t_0, t_0 + \tau )} { N(t_0) }\rangle

    where :math:`N(t_0)` is the number of particles at time :math:`t_0` for which the feature
    is observed, and :math:`N(t_0, t_0 + \tau)` is the number of particles for which
    this feature is present at every frame from :math:`t_0` to :math:`t_0 + \tau`.
    The angular brackets represent an average over all time origins, :math:`t_0`.

    See :footcite:p:`ArayaSecchi2014` for a description of survival probability.
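
    As a small worked example (an illustration only, using made-up data), take three
    frames containing the particle sets :math:`\{a, b\}`, :math:`\{a\}` and :math:`\{a, b\}`.
    For :math:`\tau = 1` there are two usable time origins, :math:`t_0 = 0` and :math:`t_0 = 1`:

    .. math::
        S(1) = \frac{1}{2} \left( \frac{|\{a, b\} \cap \{a\}|}{|\{a, b\}|}
               + \frac{|\{a\} \cap \{a, b\}|}{|\{a\}|} \right)
             = \frac{1}{2} \left( \frac{1}{2} + 1 \right) = \frac{3}{4}

    For :math:`\tau = 2` only :math:`t_0 = 0` is available. Particle :math:`b` is missing
    from the middle frame, so only :math:`a` is present at every frame and
    :math:`S(2) = 1/2`.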

    Parameters
    ----------
    list_of_sets : list
      List of sets. Each set corresponds to data from a single frame. Each element in a set
      may be, for example, an atom id or a tuple of atom ids. In the case of calculating the
      survival probability of water around a protein, these atom ids in a given set will be
      those of the atoms which are within a cutoff distance of the protein at a given frame.
    tau_max : int
      The last tau (lag time, inclusive) for which to calculate the autocorrelation, e.g., if
      tau_max = 20, the survival probability will be calculated for lag times of up to 20 frames.
    window_step : int, optional
      The step size for t0 to perform autocorrelation. Ideally, window_step will be larger than
      tau_max to ensure independence of each window for which the calculation is performed.
      Default is 1.

    Returns
    -------
    tau_timeseries : list of int
        the values of tau for which the autocorrelation was calculated
    timeseries : list of float
        the autocorrelation values for each of the tau values
    timeseries_data : list of list of float
        the raw data from which the autocorrelation is computed, i.e., :math:`S(\tau)` at each window.
        This allows the time-dependent evolution of :math:`S(\tau)` to be investigated.

    .. versionadded:: 0.19.2
    r   z3list_of_sets must be a one-dimensional list of setsz9tau_max cannot be greater than the length of list_of_setsr   c                     g | ]}g S  r   ).0_s     e/srv/www/vhosts/g4struct/public_html/venv/lib/python3.11/site-packages/MDAnalysis/lib/correlations.py
<listcomp>z#autocorrelation.<locals>.<listcomp>   s    222ar222    c                 6    g | ]}t          j        |          S r   )npmean)r   xs     r
   r   z#autocorrelation.<locals>.<listcomp>   s     666"'!**666r   )typelistlenset	TypeError
ValueErrorrangeintersectionappendfloatinsert)
list_of_setstau_maxwindow_steptau_timeseriestimeseries_datatNttauNtau
timeseriess
             r
   autocorrelationr&   I   s   B 	\d""s<'8'8A'='=$QC C	C C A
 
 	

 <7""G
 
 	
 %7Q;//00N225>>222O 1c,''55 > >a!!77! 	> 	>C3w#l++++ s'a!c'A+o)FGHHDC!G$++D599,<====66o666J !Qa:66r   c                    |dk    r| S t          |           } t          |           D ]\  }}d |D             }t          d|dz             D ]}|                                D ]}||z   t	          |           k    r|| ||z            vr||xx         dz  cc<   7||         dk    rD||         |k    rQt          ||         dd          D ]#}| ||z   |z
                               |           $d||<   | S )a
  Preprocess data to allow intermittent behaviour prior to calling :func:`autocorrelation`.

    Survival probability may be calculated either with a strict continuous requirement or
    with a less strict intermittency. If calculating the survival probability of water around a
    protein, for example, in the former case the water must be within a cutoff distance
    of the protein at every frame from :math:`t_0` to :math:`t_0 + \tau` in order for it to be considered
    present at :math:`t_0 + \tau`. In the intermittent case, the water molecule is allowed to
    leave the region of interest for up to a specified consecutive number of frames whilst still
    being considered present at :math:`t_0 + \tau`.

    This function pre-processes data, such as the atom ids of water molecules within a cutoff
    distance of a protein at each frame, in order to allow for intermittent behaviour, with a
    single pass over the data.

    For example, if an atom is absent for a number of frames less than or equal to the parameter
    :attr:`intermittency`, then this absence will be removed and thus the atom is considered not
    to have left. For instance, 7,A,A,7 with `intermittency=2` will be replaced by 7,7,7,7,
    where A denotes an absence.

    The returned data can be used as input to the function :func:`autocorrelation` in order
    to calculate the survival probability with a given intermittency.

    See :footcite:p:`Gowers2015` for a description of intermittency in the
    calculation of hydrogen bond lifetimes.

    # TODO - is intermittency consistent with list of sets of sets? (hydrogen bonds)

    Parameters
    ----------
    list_of_sets: list
        In the simple case of, e.g., survival probability, a list of sets of atom ids present at each frame,
        where a single set contains atom ids at a given frame, e.g., [{0, 1}, {0}, {0}, {0, 1}]
    intermittency : int
        The maximum gap allowed. A value of `intermittency=0` means that no changes are made to the data,
        so a datapoint missing at any frame stays missing. With a value of `intermittency=2`, all datapoints
        missing for up to two consecutive frames will instead be considered present.

    Returns
    -------
    list_of_sets: list
        A list of sets in which the IDs that disappeared for <= :attr:`intermittency` consecutive frames
        have been added back. If `intermittency=0`, the input list is returned unchanged.
        e.g., if [{0, 1}, {0}, {0}, {0, 1}] is a list of sets of atom ids present at each frame and
        `intermittency=2`, both atoms will be considered present throughout and thus the returned list
        of sets will be [{0, 1}, {0, 1}, {0, 1}, {0, 1}].

    r   c                     i | ]}|d S )r   r   )r   is     r
   
<dictcomp>z)correct_intermittency.<locals>.<dictcomp>   s    222A1a222r   r      )r   	enumerater   keysr   add)r   intermittencyr)   elementsseen_frames_agojelementks           r
   correct_intermittencyr6      sj   ^ L))L
 !.. - -822222q-!+,, 	- 	-A*//11 - -q5C---- ,q1u"555#G,,,1,,, #7+q00 #7+m;; w7B?? 9 9A Q+//8888+,((7-	-: r   )r   )__doc__numpyr   copyr   r&   r6   r   r   r
   <module>r:      sg   0+ +Z          f7 f7 f7 f7RW W W W Wr   