{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Calculating the Clustering Ensemble Similarity between ensembles\n", "\n", "Here we compare the conformational ensembles of proteins in three trajectories, using the clustering ensemble similarity method.\n", "\n", "**Last executed:** Sep 25, 2020 with MDAnalysis 1.0.0\n", "\n", "**Last updated:** September 2020\n", "\n", "**Minimum version of MDAnalysis:** 1.0.0\n", "\n", "**Packages required:**\n", " \n", "* MDAnalysis (Michaud-Agrawal *et al.*, 2011, Gowers *et al.*, 2016)\n", "* MDAnalysisTests\n", "* [scikit-learn](https://scikit-learn.org/stable/)\n", " \n", "**Optional packages for visualisation:**\n", "\n", "* [matplotlib](https://matplotlib.org)\n", "\n", "\n", "
\n", " \n", "**Note**\n", "\n", "The metrics and methods in the `encore` module are from (Tiberti *et al.*, 2015). Please cite them when using the ``MDAnalysis.analysis.encore`` module in published work.\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import MDAnalysis as mda\n", "from MDAnalysis.tests.datafiles import (PSF, DCD, DCD2, GRO, XTC, \n", " PSF_NAMD_GBIS, DCD_NAMD_GBIS)\n", "from MDAnalysis.analysis import encore\n", "from MDAnalysis.analysis.encore.clustering import ClusteringMethod as clm\n", "\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The test files we will be working with here feature adenylate kinase (AdK), a phosophotransferase enzyme. (Beckstein *et al.*, 2009)\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "u1 = mda.Universe(PSF, DCD)\n", "u2 = mda.Universe(PSF, DCD2)\n", "u3 = mda.Universe(PSF_NAMD_GBIS, DCD_NAMD_GBIS)\n", "\n", "labels = ['DCD', 'DCD2', 'NAMD']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The trajectories can have different lengths, as seen below." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "98 102 100\n" ] } ], "source": [ "print(len(u1.trajectory), len(u2.trajectory), len(u3.trajectory))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Calculating clustering similarity with default settings" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The clustering ensemble similarity method (`ces`, [API docs here](https://docs.mdanalysis.org/stable/documentation_pages/analysis/encore/similarity.html#MDAnalysis.analysis.encore.similarity.ces)) combines every trajectory into a whole space of conformations, and then uses a user-specified `clustering_method` to partition this into clusters. The population of each trajectory ensemble within each cluster is taken as a probability density function.\n", "\n", "The similarity of each probability density function is compared using the Jensen-Shannon divergence. This divergence has an upper bound of $\\ln{(2)}$, representing no similarity between the ensembles, and a lower bound of 0.0, representing identical conformational ensembles.\n", "\n", "You do not need to align your trajectories, as the function will align it for you (along your `select`ion atoms, which are `select='name CA'` by default). " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "ces0, details0 = encore.ces([u1, u2, u3])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`encore.ces` returns two outputs. `ces0` is the similarity matrix for the ensemble of trajectories." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0. , 0.68070702, 0.69314718],\n", " [0.68070702, 0. , 0.69314718],\n", " [0.69314718, 0.69314718, 0. ]])" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ces0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`details0` contains the calculated clusters as a `encore.clustering.ClusterCollection.ClusterCollection`." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "We have found 49 clusters\n" ] } ], "source": [ "cluster_collection = details0['clustering'][0]\n", "print(type(cluster_collection))\n", "print('We have found {} clusters'.format(len(cluster_collection)))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can access each Cluster at `cluster_collection.clusters`. For example, the first one has these elements:" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "first_cluster = cluster_collection.clusters[0]\n", "first_cluster" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 0, 1, 2, 3, 98])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "first_cluster.elements" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each cluster has an ID number and a centroid conformation." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The ID of this cluster is: 0\n", "The centroid is 1\n" ] } ], "source": [ "print('The ID of this cluster is:', first_cluster.id)\n", "print('The centroid is', first_cluster.centroid)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAU4AAAEICAYAAAAwUh0YAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3deZwdVZ338c83YUcWNSAOO7KDyBKDM4MLKhoYFNdh0wE3jJJxHB951HEe95kRRR91AENkFERlkQHFmWBAHhUdYEhwAhoQiRElgEBACZuQdL7PH1UNxeV2d93Ore7qzvf9etUrt27VOXWquvPrc+qcOiXbREREfVPGuwARERNNAmdERI8SOCMiepTAGRHRowTOiIgeJXBGRPQogXMYkj4m6RvjXQ4ASQ9K2mm8yzEeJFnSzkNsO17ST/t8vGMlXTbKtC+UdHNl/VZJL1+Dsqy1P/c2W+sDp6RjJC0sf0HvlHSppIP6mP8O5X/8ddYkH9tPs720X+WKodn+pu1XjDLtT2zv1seyPP5zl3SWpE/1K+8YvbU6cEp6H/AF4J+BZwHbAacDR4xnuarWNODGxJSfe7uttYFT0mbAJ4ATbV9k+yHbK21/z/ZJXfZ/iaRlHd893gyTNKOsua6QdJekz5e7XVn++8eyVvvn5f5vlXSTpD9Imi9p+0q+lnSipFuAWyrf7Vx+PkvSaZL+U9IDkv5b0nMq6V8h6WZJ90s6XdKPJb19iOswRdIHJf1a0r2SLpD0jHLbYG35OEm/k7Rc0ocraYc6ZyS9QNJVkv4o6XpJL6ls+5GkT5XbH5T0PUnPlPTNMq8FknboKOphkpaWZfispK6/u5J2l3S5pPvKa/DX3fYr9z2+zPMBSb+RdGzl+59W9rOkd0u6pdz3k5KeI+nqsrwXSFqv3Pcpvycd1+vq8prcKenUwXSV43T9uUs6ATgW+N+Va3aSpH/vOMa/SvrCUOccfWJ7rVyAmcAqYJ1h9vkY8I3y80uAZR3bbwVeXn6+Gnhz+flpwAvKzzsArh4HeA2wBNgDWAf4R+CqynYDlwPPADasfLdz+fks4D5gRpn+m8B55bZpwArgdeW2vwNWAm8f4hzfC1wDbAOsD5wBnNtR9q8AGwLPAx4F9hjhnLcG7gUOo/jjfEi5vkW5/Ufl+T8H2Ay4EfgV8PKyzF8HvtZxPX5YXo/tyn3fXm47Hvhp+Xlj4DbgLWU++wPLgb26nPfG5XXarVx/9uB+1Twrx78E2BTYq7wGVwA7Vcp/XLffE578O3IA8IKybDsANwHv7fHn/qnK/s8GHgI2L9fXAe4GDhjv/1+TfVlra5zAM4Hltlf1Kb+VwM6Sptl+0PY1w+z7TuBfbN9UHv+fgX2rtc5y+322Hxkij4tsX1um/yawb/n9YcBiF7XoVcCXgN+PUJYP215m+1GKPxZv6Ggqftz2I7avB66nCKDDnfObgHm259lebftyYGFZtkFfs/1r2/cDlwK/tv2DsszfBvbrKOfJ5fX4HcXtlaO7nMvhwK22v2Z7le2fAf8OvGGIc18N7C1pQ9t32l48zHU62faKcp9fAJfZXlopf2d5n8L2dbavKct2K8UfqRd37DbSz72a350ULZo3ll/NpPidvm6ktLFm1ubAeS8wrY/3kt4G7Ar8smxqHj7MvtsDXyybbH+kqD2KoqY26LYRjlcNhg9T1PgA/qya1raBrk3HSlkurpTlJmCA4p7vSMca6py3B944mGeZ70EUNaRBd1U+P9Jl/Wk8WfV6/LY8z27ncmDHcY8Fturc0fZDwJHALODO8rbH7l3yHG15n0LSrpL+Q9LvJa2g+IM5rWO3kX7unc6m+ENF+e85PaaPUVibA+fVwJ8oms11PARsNLgiaSqwxeC67VtsHw1sCZwMXChpY4qmVqfbgHfa3ryybGj7qso+o5226k6KZvdgOVVdH6Ish3aUZQPbt490oGHO+TbgnI48N7b96VGeE8C2lc/bAXcMcS4/7jju02y/a4jyz7d9CEVA/yXFLYkmfbk8zi62NwX+geIP5pOKNUz6btu+A+wjaW+KGvc3+1HQGN5aGzjLJtZHgNMkvUbSRpLWlXSopM90SfIrYANJfyVpXYr7kusPbpT0Jklb2F4N/LH8egC4h6JJWB2LNwf4kKS9yrSbSXoj/fGfwHPLc1oHOJEuNa6OsvzT4G0CSVtIqjWqYJhz/gbwKkmvlDRV0gZlp8lwAXwkJ0l6uqRtKe7bnt9ln/8AdpX05vJnua6k50vao0vZnyXp1WWgfxR4sCx7kzahuK/6YFm77RrQh3EXT/49wvafgAuBbwHXlrcyomFrbeAEsP154H0UQfAeihrLbIq/4p373g+8GzgTuJ2iBlptAs8EFkt6EPgicJTtP9l+GPgn4L/K5uMLbF9MUUM7r2yy/QI4tE/ntJzintdnKG5H7Elxf/HRIZJ8kaLj4zJJD1B0FB1Y83BDnfNtFEO6/oEnrutJrNnv23eB64BFFH8c/q1zB9sPAK8AjqKokf6e4jqv37lvWZb/Ve53H8W9xnevQfnqeD9wDPAARe22W/Afzr8Be5a/R9Xf0bOB55Jm+phRcQssJqty2M4y4FjbPxzv8kT/SdqO4hbAVrZXjHd51gZrdY1zsiqbyJtLWp8n7qMN18sfE1T5h/F9FMPREjTHSJ5OmJz+nOKe13oUYwxfU2d4S0ws5f3ZuyhGGcwc5+KsVdJUj4joUZrqERE9mpBN9fW0vjdg4/EuRmvtvM9D412E1pvylOGTUXXrbStZft/AGl2kVx68se+9r94Ir+tueHS+7Qlzu2FCBs4N2JgD9bLxLkZrXXzpteNdhNbbaMp6I++0Fpvxyl4fYHqqe+8b4Nr529Xad+qzb+l8gqrVJmTgjIj2M7Ca1eNdjEYkcEZEI4xZ6aYfxhofCZwR0ZjUOCMiemDMwCQd7pjAGRGNWT3qSb7aLYEzIhphYCCBMyKiN6lxRkT0wMDK3OOMiKjPOE31iIieGAYmZ9xM4IyIZhRPDk1OCZwR0RAxMEknU0ngjIhGFJ1DCZwREbUV4zgTOCMierI6Nc6IiPpS44yI6JERA5P07TwJnBHRmDTVIyJ6YMRjnjrexWhEAmdENKIYAJ+mekRET9I5FBHRA1sMODXOiIierE6NMyKivqJzaHKGmMlZj46IcTfYOVRnqUPSTEk3S1oi6YND7PMSSYskLZb0436eT9Xk/HMQEa0w0KdxnJKmAqcBhwDLgAWSLrF9Y2WfzYHTgZm2fydpy74cvIsEzohoRJ+fHJoBLLG9FEDSecARwI2VfY4BLrL9OwDbd/fr4J3SVI+Ixqz2lFpLDVsDt1XWl5XfVe0KPF3SjyRdJ+lv+nQaT5EaZ0Q0opjko3bdbJqkhZX1ubbnVta7tfk7X8yxDnAA8DJgQ+BqSdfY/lXdQtSVwBkRjTBiZf1HLpfbnj7M9mXAtpX1bYA7uuyz3PZDwEOSrgSeB/Q9cKapHhGNsGHAU2otNSwAdpG0o6T1gKOASzr2+S7wQknrSNoIOBC4qa8nVepL4JQ0UBkCcL2k90maUtk+Q9KV5VCCX0o6U9JGko6XdI+k/5F0i6T5kv6iH2WKiPEmVtdcRmJ7FTAbmE8RDC+wvVjSLEmzyn1uAr4P3ABcC5xp+xdNnFm/muqP2N4XoBwC8C1gM+Cjkp4FfBs4yvbVkgS8HtikTHu+7dll2oOBiyQdXF6EiJigDH195NL2PGBex3dzOtY/C3y2bwcdQt+b6uUQgBOA2WWQPBE42/bV5XbbvtD2XV3S/hCYW6aPiAlugCm1lommkRKXY62mAFsCewPX9ZD8Z8DuTZQrIsaOEatdb5lomuxVH+3V6JpO0gmUNdEN2Gi0ZYqIMVK8HnhyDtxppMYpaSdgALgbWEwxtqqu/ejSE2Z7ru3ptqevy/r9KWhENEgM1Fwmmr4HTklbAHOAU20bOBU4TtKBlX3eJGmrLmlfTFGr/Eq/yxURY8v09cmhVulXPXpDSYuAdYFVwDnA5wFs3yXpKOCUssd9NXAlcFGZ9khJBwEbAb8BXp8e9YjJYSLWJuvoS+C0h388oOxRf2GXTWeVS0RMMrYmZG2yjsl5VhEx7orOoam1lrEmaVdJV0j6Rbm+j6R/rJs+gTMiGqJ+PnLZb18BPgSsBLB9A8VjnLVMzrECETHuis6h1t7j3Mj2tcUzOo9bVTdxAmdENKbFTwUtl/QcyqnpJL0BuLNu4gTOiGjE4JNDLXUixePdu0u6nWJEz5vqJk7gjIjG1H0R21grHwt/uaSNgSm2H+glfTvPKiImPBtWrp5Saxlrkv5Z0ua2H7L9gKSnS/pU3fQJnBHRiKKp3tonhw61/cfHy2r/ATisbuI01SOiMS1+cmiqpPVtPwogaUOoPwlGAmdENKLlw5G+AVwh6WsURX0rcHbdxAmcEdGQ9j5yafszkn5O8UZMAZ+0Pb9u+gTOiGhMnfcJjRfblwKXjiZtAmdENKLoVR/759DrkPQ64GSKt1SoXGx70zrpEzgjohEtHwD/GeBVo53CMoEzIhrT4qb6XWsy728CZ0Q0ouW96gslnQ98B3h08EvbFw2d5Ant7PKKiEmhnwPgJc2UdLOkJZI+2GX7SyTdL2lRuXxkmOw2BR4GXgG8qlwOr3teqXFGRCNssapPw5EkTQVOAw4BlgELJF1i+8aOXX9ie8QAaPsta1Ke1DgjojF9fK/6DGCJ7aW2HwPOA44YbbkyA3xEtNLgPc6agXOapIWV5YSO7LYGbqusLyu/6/Tnkq6XdKmkvYYpXmaAj4h26qFzaLnt6cNs75aRO9Z/Bmxv+0FJh1F0/OwyRH5rNAN8apwR0YjBcZx9aqovA7atrG8D3PGk49krbD9Yfp4HrCtp2hD5ZQb4iGinPo7jXADsImlH4HaKZvUx1R0kbUUxPtOSZlBUDO8dIr/MAB8R7WPDqj5NUmx7laTZwHxgKvBV24slzSq3zwHeALxL0irgEeAo253N+cH81mgG+ATOiGhMPwfAl83veR3fzal8PhU4tU5ekt7XsQ5wP3Cd7UUjpU/gjIhGtPxZ9enl8r1y/a8obgfMkvRt258ZLnECZ0Q0xu0NnM8E9h/sTJL0UeBC4EXAdRSTgAwpgTMiGtPiST62Ax6rrK+kGMr0iKRHh0jzuATOiGiE3epJPr4FXCPpu+X6q4Bzy86izsc4nyKBMyIaIgbG4dW/I1HRE3QWRUfTQRSD62fZXljucuxIeSRwRkRj2niPsxzn+R3bB1Dcz+zZhAycO+/zEBdfeu14F6O1XrvNjPEuQuvNv2PEESexhlo+H+c1kp5ve8FoEk/IwBkRE4CL+5wtdTDF0KNbgYd44p1D+9RJnMAZEY1pca/6oWuSuH13biNiUnDZOVRnGfOy2b+lmDTkpeXnh+khHqbGGRGNaWtTvRzwPh3YDfgasC7wDeAv66RP4IyIxrSxV730WmA/ijk8sX2HpE3qJk7gjIhG2K0OnI+Vw5IG5+PcuJfECZwR0ZgWD0e6QNIZwOaS3gG8leJ1GrUkcEZEY9p6j9P2KZIOAVZQ3Of8iO3L66ZP4IyIRhixuoWPXAJI+nvg270Ey6p2nlVETAquuYyDTYH5kn4i6URJz+olcQJnRDSj7Byqs4x50eyP296L4t1Dfwb8WNIP6qZPUz0imtPSe5wVdwO/p3ip25Z1E6XGGRGNaWuNU9K7JP0IuAKYBryj7nPqkBpnRDTEwOrVrR2OtD3w3jovZusmgTMimmGgZeM4JW1qewXlO4UkPaO63fZ9dfJJUz0iGmPXW+qQNFPSzZKWSPrgMPs9X9KApDd02fyt8t/rgIXlv9dV1mtJjTMimtOnziFJU4HTgEOAZcACSZfYvrHLficD87sWxz68/HfHNSlPAmdENKSvHT8zgCW2lwJIOg84gqe+WO1vgX8Hnt+1RNL+wx3E9s/qFCaBMyKaU7/GOU1Stak81/bcyvrWwG2V9WXAgdUMJG1NMevRSxkicAKfK//dgGJauespZn/fB/hvipe3jSiBMyKaYXD9XvXltqcPs71bRp1h+QvAB2wPFC+y7JLAPhger7GeYPvn5frewPvrFjaBMyIa1Lem+jKKGdsHbQPc0bHPdOC8MmhOAw6TtMr2d7rkt/tg0ASw/QtJ+9YtTAJnRDSnf08OLQB2kbQjcDtwFHDMkw5V6fCRdBbwH0METYCbJJ1JMeu7gTcBN9UtTAJnRDSnT4HT9ipJsyl6y6cCX7W9WNKscvucHrN8C/Au4O/K9SuBL9dNnMAZEc3o8wB42/OAeR3fdQ2Yto8fIa8/Af+3XHqWwBkRjWnrRMZrKoEzIprT3mfV10gCZ0Q0RqlxRkT0YByndx+JpF2BkyhmSXo8Dtp+aZ30CZwR0RC1bnakim8DcyjebDnQa+IEzohoTktrnMAq27WHH3XKtHIR0ZzVNZex9z1J75b0bEnPGFzqJk6NMyKa0cKJjCuOK/89qfKdgZ3qJK4dOCUNAD8H1gVWAWcDX7C9utw+AzgFeFZZgJ8C7wH+GvgsxbOmTwOWAh+3fVWZ7rPAq4DHgF8Db7H9x7rlioj2amuv+prOx9lLU/0R2/uWr9Q8BDgM+ChA+U7ib1PMTLIbsAfwfWCTMu35tvezvQvwaeAiSXuU2y4H9i5flPQr4ENrckIR0SItfbG6pHUlvUfSheUyW9K6ddOP6h6n7buBE4DZKqYiORE42/bV5XbbvtD2XV3S/hCYW6bH9mW2V5Wbr6GY9SQioklfBg4ATi+XAxiLZ9VtL5U0heJdxHtTNN3r+hnwzi7fvxU4v1sCSSdQBtttt57aW2EjYly0takOPN/28yrr/0/S9XUTr2mv+mjv/D4lnaQPU9w7/Wa3BLbn2p5ue/q0ZyZwRrSeKR65rLOMvQFJzxlckbQTPYznHHWNs3Kgu4HFFFXd79ZMvh+Vue8kHQccDrzMnqzTAkSshdr7v/kk4IeSllJU5LanmGqullEFTklbUIy6P9W2JZ0KXCvpP23/d7nPm4AfdEn7Yoom9+AU9jOBDwAvtv3waMoTEe3U1qa67Ssk7QLsRhE4f2n70brpewmcG0paxBPDkc4BPl8W4i5JRwGnSNqSYkjrlcBFZdojJR0EbAT8Bni97cEa56nA+sDl5ZT319ie1UO5IqKtWho4SwcAO1DEwedJwvbX6ySsHThtD3tjsexRf2GXTWeVy1Dpdq5bhoiYYFoaOCWdAzwHWMQT9zYN9DdwRkT0Qm5vU53ixW57jrZPJYEzIprT3omMfwFsBdw5msQJnBHRmBbXOKcBN0q6Fni8U8j2q+skTuCMiOa0N3B+bE0SJ3BGRDNafI/T9o/XJH3m44yI5rR3ko/XSbpF0v2SVkh6QNKKuukTOCOiMVpdb6mVlzRT0s2Slkj6YJftR0i6QdIiSQvLseND+Qzwatub2d7U9ia2N617XmmqR0TrSZoKnEYxpeUyYIGkS2zfWNntCuCS8mnGfYALgN2HyPKuykM4PUvgjIjm9K8ZPgNYYnspgKTzgCOAxwOn7Qcr+288wtEXSjof+A5P7lW/aOgkT0jgjIhm9NY5NE3Swsr6XNtzK+tbA7dV1pcBB3ZmIum1wL9QTHf5V8Mcb1PgYeAVTy4xCZwRMc7qB87ltqcPs73bSPqn5G77YuBiSS8CPgm8vGux7NozIXWTwBkRzelfU30ZsG1lfRvgjiEPa18p6TmSptle3rld0gbA24C9gA0q6d5apzDpVY+IRoi+9qovAHaRtKOk9YCjgEuedDxp5/JVPkjaH1gPuHeI/M6heOTylcCPKQLxA3XPLTXOiGhGHwfA214laTYwH5gKfNX2Ykmzyu1zgNcDfyNpJfAIcOQwk3jsbPuNko6wfbakb5V515LAGRHN6ePgdtvzgHkd382pfD4ZOLlmdivLf/8oaW/g9xRzc9aSwBkRzWnpI5fAXElPB/4PRZP/acBH6iZO4IyIxrT4WfUzy48/BnbqNX0CZ0Q0p6WBU9L6FPdEd6ASB21/ok76BM6IaIbrP4c+Dr4L3A9cR+XJoboSOCOiOS2tcQLb2J452sQZxxkRjRl879BIyzi4StJzR5s4Nc6IaE7LapySfk5RqnWAt0haStFUF2Db+9TJJ4EzIpoxTpMUj+DwfmSSwBkRjRCtHI50D7DS9koASbsBhwG/rTulHOQeZ0Q0qIX3OL9P+YSQpJ2BqynGcZ4o6V/qZpLAGRHNad87h55u+5by83HAubb/FjiUHprxCZwR0Zz2Bc7q0V4KXA5g+zGg9qjT3OOMiGa08/XAN0g6Bbgd2Bm4DEDS5r1kkhpnRDSnfTXOdwDLKe5zvsL2w+X3ewKn1M0kNc6IaEzbHrm0/Qjw6ep3kva3fRVwVd18JmTgnILYaMp6412M1pp/x6LxLkLrvfLP9h3vIrTarzzUxOm9aWFTvZszgf17STAhA2dETADtHADfTbcXwQ0rgTMimjMxAufHe02QwBkRjWjpk0OPk7Q1sD1wX/k6YWxfWSdtAmdENEar2xk5JZ0MHAncCAyUXxtI4IyIcdTue5yvAXaz3fMkxpDAGRENanFTfSmwLqOY/R0SOCOiSX0MnJJmAl+keK/6mbY7x2MeC3ygXH0QeJft64fI7mFgkaQrqARP2++pU5YEzohoTL9qnJKmAqcBhwDLgAWSLrF9Y2W33wAvtv0HSYcCc4EDh8jyknIZlQTOiGhO/2qcM4AltpcCSDoPOIKic6c4VPH0z6BrgG2GLJZ9tqQNge1s39xrYfKsekQ0o3zLZZ0FmCZpYWU5oSO3rYHbKuvLyu+G8jbg0qE2SnoVsIhifk4k7Supdg00Nc6IaESP4ziX254+QnaduuYu6WCKwHnQMPl9jKIW+yMA24sk7VirpCRwRkST3Le2+jJg28r6NsAdnTtJ2ofi2fND7WEfuF9l+37pSfG4dmHTVI+IxvTx1RkLgF0k7ShpPeAoOjp3JG0HXAS82favRsjvF5KOAaZK2kXSv9LD7EgJnBHRjLpzcdYInLZXAbOB+cBNwAW2F0uaJWlWudtHgGcCp0taJGnhMFn+LbAXxVCkc4EVwHvrnlqa6hHRmH7Ox2l7HjCv47s5lc9vB95eM6+HgQ8DHy6HOm1s+091y5IaZ0Q0pode9bEtl/QtSZtK2hhYDNws6aS66RM4I6IZpugcqrOMvT1tr6B4Zn0esB3w5rqJEzgjojEtfK/6oHUlrUsROL9reyXpVY+IVmjfy9oGnQHcCmwMXClpe4oOolrSORQRjWjzRMa2vwR8qfLVb8uB87UkcEZEM+w2T2S8PvB6itcEV+PgJ+qkT+CMiOa0M24CfBe4H7iOUczJmcAZEY1pa1Md2Mb2zNEmTudQRDTDwGrXW8beVZKeO9rEqXFGRHPaW+M8CDhe0m8omuoCbHufOokTOCOiMS1uqh+6JonTVI+Ixmi1ay1jzfZvKaape2n5+WF6iIepcUZEM1r8emBJHwWmA7sBX6N44+U3gL+skz6BMyIaUQyAb2nkhNcC+wE/A7B9h6RN6iZO4IyI5ozDzEc1PWbbUnEXtpwlqbbc44yIxsiutYyDCySdAWwu6R3AFRSv3KglNc6IaEaL73HaPkXSIRQTe+wK/KPtH9RNP2KNU5Ilfa6y/n5JH+vY53pJ53Z8d5akh6v3DSR9scxvWrk+UE5xv7jM432SUguOmBTq9aiPZa+6pAckrZC0guL9RLPK5WJJ90i6RtLLRsqnTpB6FHjdYLDrUpA9ynxe1OU+wRKKl8ZTBsSDgdsr2x+xva/tvYBDgMOAj9YoU0RMBC2byNj2JrY3LZdNqguwFfBO4Isj5VMncK4C5gJ/P8T2Y4BzgMuAV3dsOxc4svz8EuC/yvyewvbdwAnAbHW8szMiJiC399UZ3dgesH098K8j7Vu3WXwacKykzbpsOxI4nyJIHt2x7RZgC0lPL7edN9xBbC8ty7RlzXJFRJu1rMZZh+0zRtqnVuAs383xdeA91e8lPR+4pxx5fwWwfxkkqy6ieAfygcBPahyua21T0gmSFkpaeM+9A3WKHRHjrb0zwK+RXjpivgC8jWKq+UFHA7tLuhX4NbApxeSgVecBnwQutz1spVzSTsAAcHfnNttzbU+3PX2LZ07todgRMV60enWtpVZe0kxJN0taIumDXbbvLulqSY9Ken/fT6aiduC0fR9wAUXwHOzseSOwj+0dbO9A0RF0dEe631G8v/j04fKXtAUwBzjVblndPSJ6Z4oB8HWWEZTvPj+NYnKOPYGjJe3Zsdt9FK3iU/pS/mH0OvTnc8Bg7/qLgNttV3vJrwT2lPTsaiLbZ9j+dZf8NhwcjgT8gKKD6eM9likiWkjUG/xecwD8DGCJ7aW2H6NoyR5R3cH23bYXACv7fzZPNuIAeNtPq3y+C9iosvkFHfsOAINB8/gh8tuh8jlt7ojJrH7jcZqkhZX1ubbnVta3Bm6rrC+j6DcZF3lyKCKaUz9wLrc9fZjt3TqNx+2WXgJnRDRj8B5nfyyjmD9z0DbAHX3LvUcJnBHRmLo95jUsAHaRtCPF04dHUTx8My4SOCOiIf0b3G57laTZwHxgKvBV24slzSq3z5G0FbCQYljkaknvBfYsx6H3VQJnRDTD9PWpINvzgHkd382pfP49RRO+cQmcEdGcljyH3m8JnBHRmBa/OmONJHBGRHMSOCMiemDDwORsqydwRkRzUuOMiOhRAmdERA8MjOH7hMZSAmdENMQw/BS8E1YCZ0Q0w6RzKCKiZ7nHGRHRowTOiIhetO8Nlv2SwBkRzTDQv2nlWiWBMyKakxpnREQv8shlRERvDM44zoiIHuXJoYiIHuUeZ0RED+z0qkdE9Cw1zoiIXhgPDIx3IRqRwBkRzci0chERozBJhyNNGe8CRMTkZMCrXWupQ9JMSTdLWiLpg122S9KXyu03SNq/3+c0KIEzIprhciLjOssIJE0FTgMOBfYEjpa0Z8duhwK7lMsJwJf7e0JPSOCMiMZ4YKDWUsMMYIntpbYfA84DjujY5wjg6y5cA2wu6dn9PaPChLzHed0Njy6f+uwlvx3vclRMA5aPdyFarmXXaMl4F6BTy64P269pBg/wh/k/8IXTau6+gaSFlfW5tnX0gLMAAAOQSURBVOdW1rcGbqusLwMO7Mij2z5bA3fWLENtEzJw2t5ivMtQJWmh7enjXY42yzUa3mS8PrZn9jE7dTvEKPbpizTVI2IiWAZsW1nfBrhjFPv0RQJnREwEC4BdJO0oaT3gKOCSjn0uAf6m7F1/AXC/7b4302GCNtVbaO7Iu6z1co2Gl+szDNurJM0G5gNTga/aXixpVrl9DjAPOIziBvbDwFuaKo88SZ8ljYhoSprqERE9SuCMiOhRAucwJA1IWiRpsaTrJb1P0pTK9hmSriwfA/ulpDMlbSTpeEn3SPofSbdImi/pL8bzXPqlqWsi6bPl/jdIuljS5uNzhmtGkiV9rrL+fkkf69jneknndnx3lqSHJW1S+e6LZX7TyvVhr32MnVz04T1ie1/bewGHUNx4/iiApGcB3wY+YHs3YA/g+8DgL/75tvezvQvwaeAiSXuM+Rn0X1PX5HJgb9v7AL8CPjRmZ9RfjwKvGwx2ncrznQK8SNLGHZuXUD4NUwbEg4HbK9uHvPYxthI4a7J9N8Xzr7MlCTgRONv21eV2277Q9l1d0v6Qotf0hLEsc9P6eU1sX2Z7Vbn5GooxeBPRKorz+vshth8DnANcBry6Y9u5wJHl55cA/1Xm9xRdrn2MoQTOHtheSnHNtgT2Bq7rIfnPgN2bKNd4auiavBW4dM1LN25OA46VtFmXbUcC51MEyaM7tt0CbCHp6eW284Y7SMe1jzGUwNm70f51n8y1gr5dE0kfpqhlfXONSjSObK8Avg68p/q9pOcD99j+LXAFsH8ZJKsuohjcfSDwkxqHm8y/V62VwNkDSTsBA8DdwGLggB6S7wfc1ES5xlM/r4mk44DDgWM98QcYfwF4G1C9j3k0sLukW4FfA5sCr+9Idx7wSeByj/BS8o5rH2MogbMmSVsAc4BTy//UpwLHSTqwss+bJG3VJe2LKe5HfWWsyjsW+nlNJM0EPgC82vbDY1H+Jtm+D7iAIngOdva8EdjH9g62d6DoCDq6I93vgA8Dpw+Xf5drH2Moj1wOb0NJi4B1KZqP5wCfB7B9l6SjgFMkbQmsBq6kaGoBHCnpIGAj4DfA621PhhpnU9fkVGB94PKyr+Ma27PG6Jya8jlgdvn5RcDttqu95FcCe3bOGWn7jCHyG/Lax9jKI5cRET1KUz0iokcJnBERPUrgjIjoUQJnRESPEjgjInqUwBkR0aMEzoiIHv1//b0SXd+z8L8AAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fig0, ax0 = plt.subplots()\n", "im0 = plt.imshow(ces0, vmax=np.log(2), vmin=0)\n", "plt.xticks(np.arange(3), labels)\n", "plt.yticks(np.arange(3), labels)\n", "plt.title('Clustering ensemble similarity')\n", "cbar0 = fig0.colorbar(im0)\n", "cbar0.set_label('Jensen-Shannon divergence')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Calculating clustering similarity with one method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Clustering methods should be subclasses of `analysis.encore.clustering.ClusteringMethod`, initialised with your chosen parameters. Below, we set up an affinity progragation scheme, which uses message-passing to choose a number of 'exemplar' points to represent the data and updates these points until they converge. The `preference` parameter controls how many exemplars are used -- a higher value results in more clusters, while a lower value results in fewer clusters. The `damping` factor damps the message passing to avoid numerical oscillations. [(See the scikit-learn user guide for more information.)](https://scikit-learn.org/stable/modules/clustering.html#affinity-propagation)\n", "\n", "The other keyword arguments control when to stop clustering. Adding noise to the data can also avoid numerical oscillations." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "clustering_method = clm.AffinityPropagationNative(preference=-1.0,\n", " damping=0.9,\n", " max_iter=200,\n", " convergence_iter=30,\n", " add_noise=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, MDAnalysis will run the job on one core. If it is taking too long and you have the resources, you can increase the number of cores used." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "ces1, details1 = encore.ces([u1, u2, u3],\n", " select='name CA',\n", " clustering_method=clustering_method,\n", " ncores=4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "scrolled": true }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAU4AAAEICAYAAAAwUh0YAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3deZwdVZ338c83YUcWNSAOO7KDyBKDM4MLKhoYFNdh0wE3jJJxHB951HEe95kRRR91AENkFERlkQHFmWBAHhUdYEhwAhoQiRElgEBACZuQdL7PH1UNxeV2d93Ore7qzvf9etUrt27VOXWquvPrc+qcOiXbREREfVPGuwARERNNAmdERI8SOCMiepTAGRHRowTOiIgeJXBGRPQogXMYkj4m6RvjXQ4ASQ9K2mm8yzEeJFnSzkNsO17ST/t8vGMlXTbKtC+UdHNl/VZJL1+Dsqy1P/c2W+sDp6RjJC0sf0HvlHSppIP6mP8O5X/8ddYkH9tPs720X+WKodn+pu1XjDLtT2zv1seyPP5zl3SWpE/1K+8YvbU6cEp6H/AF4J+BZwHbAacDR4xnuarWNODGxJSfe7uttYFT0mbAJ4ATbV9k+yHbK21/z/ZJXfZ/iaRlHd893gyTNKOsua6QdJekz5e7XVn++8eyVvvn5f5vlXSTpD9Imi9p+0q+lnSipFuAWyrf7Vx+PkvSaZL+U9IDkv5b0nMq6V8h6WZJ90s6XdKPJb19iOswRdIHJf1a0r2SLpD0jHLbYG35OEm/k7Rc0ocraYc6ZyS9QNJVkv4o6XpJL6ls+5GkT5XbH5T0PUnPlPTNMq8FknboKOphkpaWZfispK6/u5J2l3S5pPvKa/DX3fYr9z2+zPMBSb+RdGzl+59W9rOkd0u6pdz3k5KeI+nqsrwXSFqv3Pcpvycd1+vq8prcKenUwXSV43T9uUs6ATgW+N+Va3aSpH/vOMa/SvrCUOccfWJ7rVyAmcAqYJ1h9vkY8I3y80uAZR3bbwVeXn6+Gnhz+flpwAvKzzsArh4HeA2wBNgDWAf4R+CqynYDlwPPADasfLdz+fks4D5gRpn+m8B55bZpwArgdeW2vwNWAm8f4hzfC1wDbAOsD5wBnNtR9q8AGwLPAx4F9hjhnLcG7gUOo/jjfEi5vkW5/Ufl+T8H2Ay4EfgV8PKyzF8HvtZxPX5YXo/tyn3fXm47Hvhp+Xlj4DbgLWU++wPLgb26nPfG5XXarVx/9uB+1Twrx78E2BTYq7wGVwA7Vcp/XLffE578O3IA8IKybDsANwHv7fHn/qnK/s8GHgI2L9fXAe4GDhjv/1+TfVlra5zAM4Hltlf1Kb+VwM6Sptl+0PY1w+z7TuBfbN9UHv+fgX2rtc5y+322Hxkij4tsX1um/yawb/n9YcBiF7XoVcCXgN+PUJYP215m+1GKPxZv6Ggqftz2I7avB66nCKDDnfObgHm259lebftyYGFZtkFfs/1r2/cDlwK/tv2DsszfBvbrKOfJ5fX4HcXtlaO7nMvhwK22v2Z7le2fAf8OvGGIc18N7C1pQ9t32l48zHU62faKcp9fAJfZXlopf2d5n8L2dbavKct2K8UfqRd37DbSz72a350ULZo3ll/NpPidvm6ktLFm1ubAeS8wrY/3kt4G7Ar8smxqHj7MvtsDXyybbH+kqD2KoqY26LYRjlcNhg9T1PgA/qya1raBrk3HSlkurpTlJmCA4p7vSMca6py3B944mGeZ70EUNaRBd1U+P9Jl/Wk8WfV6/LY8z27ncmDHcY8Fturc0fZDwJHALODO8rbH7l3yHG15n0LSrpL+Q9LvJa2g+IM5rWO3kX7unc6m+ENF+e85PaaPUVibA+fVwJ8oms11PARsNLgiaSqwxeC67VtsHw1sCZwMXChpY4qmVqfbgHfa3ryybGj7qso+o5226k6KZvdgOVVdH6Ish3aUZQPbt490oGHO+TbgnI48N7b96VGeE8C2lc/bAXcMcS4/7jju02y/a4jyz7d9CEVA/yXFLYkmfbk8zi62NwX+geIP5pOKNUz6btu+A+wjaW+KGvc3+1HQGN5aGzjLJtZHgNMkvUbSRpLWlXSopM90SfIrYANJfyVpXYr7kusPbpT0Jklb2F4N/LH8egC4h6JJWB2LNwf4kKS9yrSbSXoj/fGfwHPLc1oHOJEuNa6OsvzT4G0CSVtIqjWqYJhz/gbwKkmvlDRV0gZlp8lwAXwkJ0l6uqRtKe7bnt9ln/8AdpX05vJnua6k50vao0vZnyXp1WWgfxR4sCx7kzahuK/6YFm77RrQh3EXT/49wvafgAuBbwHXlrcyomFrbeAEsP154H0UQfAeihrLbIq/4p373g+8GzgTuJ2iBlptAs8EFkt6EPgicJTtP9l+GPgn4L/K5uMLbF9MUUM7r2yy/QI4tE/ntJzintdnKG5H7Elxf/HRIZJ8kaLj4zJJD1B0FB1Y83BDnfNtFEO6/oEnrutJrNnv23eB64BFFH8c/q1zB9sPAK8AjqKokf6e4jqv37lvWZb/Ve53H8W9xnevQfnqeD9wDPAARe22W/Afzr8Be5a/R9Xf0bOB55Jm+phRcQssJqty2M4y4FjbPxzv8kT/SdqO4hbAVrZXjHd51gZrdY1zsiqbyJtLWp8n7qMN18sfE1T5h/F9FMPREjTHSJ5OmJz+nOKe13oUYwxfU2d4S0ws5f3ZuyhGGcwc5+KsVdJUj4joUZrqERE9mpBN9fW0vjdg4/EuRmvtvM9D412E1pvylOGTUXXrbStZft/AGl2kVx68se+9r94Ir+tueHS+7Qlzu2FCBs4N2JgD9bLxLkZrXXzpteNdhNbbaMp6I++0Fpvxyl4fYHqqe+8b4Nr529Xad+qzb+l8gqrVJmTgjIj2M7Ca1eNdjEYkcEZEI4xZ6aYfxhofCZwR0ZjUOCMiemDMwCQd7pjAGRGNWT3qSb7aLYEzIhphYCCBMyKiN6lxRkT0wMDK3OOMiKjPOE31iIieGAYmZ9xM4IyIZhRPDk1OCZwR0RAxMEknU0ngjIhGFJ1DCZwREbUV4zgTOCMierI6Nc6IiPpS44yI6JERA5P07TwJnBHRmDTVIyJ6YMRjnjrexWhEAmdENKIYAJ+mekRET9I5FBHRA1sMODXOiIierE6NMyKivqJzaHKGmMlZj46IcTfYOVRnqUPSTEk3S1oi6YND7PMSSYskLZb0436eT9Xk/HMQEa0w0KdxnJKmAqcBhwDLgAWSLrF9Y2WfzYHTgZm2fydpy74cvIsEzohoRJ+fHJoBLLG9FEDSecARwI2VfY4BLrL9OwDbd/fr4J3SVI+Ixqz2lFpLDVsDt1XWl5XfVe0KPF3SjyRdJ+lv+nQaT5EaZ0Q0opjko3bdbJqkhZX1ubbnVta7tfk7X8yxDnAA8DJgQ+BqSdfY/lXdQtSVwBkRjTBiZf1HLpfbnj7M9mXAtpX1bYA7uuyz3PZDwEOSrgSeB/Q9cKapHhGNsGHAU2otNSwAdpG0o6T1gKOASzr2+S7wQknrSNoIOBC4qa8nVepL4JQ0UBkCcL2k90maUtk+Q9KV5VCCX0o6U9JGko6XdI+k/5F0i6T5kv6iH2WKiPEmVtdcRmJ7FTAbmE8RDC+wvVjSLEmzyn1uAr4P3ABcC5xp+xdNnFm/muqP2N4XoBwC8C1gM+Cjkp4FfBs4yvbVkgS8HtikTHu+7dll2oOBiyQdXF6EiJigDH195NL2PGBex3dzOtY/C3y2bwcdQt+b6uUQgBOA2WWQPBE42/bV5XbbvtD2XV3S/hCYW6aPiAlugCm1lommkRKXY62mAFsCewPX9ZD8Z8DuTZQrIsaOEatdb5lomuxVH+3V6JpO0gmUNdEN2Gi0ZYqIMVK8HnhyDtxppMYpaSdgALgbWEwxtqqu/ejSE2Z7ru3ptqevy/r9KWhENEgM1Fwmmr4HTklbAHOAU20bOBU4TtKBlX3eJGmrLmlfTFGr/Eq/yxURY8v09cmhVulXPXpDSYuAdYFVwDnA5wFs3yXpKOCUssd9NXAlcFGZ9khJBwEbAb8BXp8e9YjJYSLWJuvoS+C0h388oOxRf2GXTWeVS0RMMrYmZG2yjsl5VhEx7orOoam1lrEmaVdJV0j6Rbm+j6R/rJs+gTMiGqJ+PnLZb18BPgSsBLB9A8VjnLVMzrECETHuis6h1t7j3Mj2tcUzOo9bVTdxAmdENKbFTwUtl/QcyqnpJL0BuLNu4gTOiGjE4JNDLXUixePdu0u6nWJEz5vqJk7gjIjG1H0R21grHwt/uaSNgSm2H+glfTvPKiImPBtWrp5Saxlrkv5Z0ua2H7L9gKSnS/pU3fQJnBHRiKKp3tonhw61/cfHy2r/ATisbuI01SOiMS1+cmiqpPVtPwogaUOoPwlGAmdENKLlw5G+AVwh6WsURX0rcHbdxAmcEdGQ9j5yafszkn5O8UZMAZ+0Pb9u+gTOiGhMnfcJjRfblwKXjiZtAmdENKLoVR/759DrkPQ64GSKt1SoXGx70zrpEzgjohEtHwD/GeBVo53CMoEzIhrT4qb6XWsy728CZ0Q0ouW96gslnQ98B3h08EvbFw2d5Ant7PKKiEmhnwPgJc2UdLOkJZI+2GX7SyTdL2lRuXxkmOw2BR4GXgG8qlwOr3teqXFGRCNssapPw5EkTQVOAw4BlgELJF1i+8aOXX9ie8QAaPsta1Ke1DgjojF9fK/6DGCJ7aW2HwPOA44YbbkyA3xEtNLgPc6agXOapIWV5YSO7LYGbqusLyu/6/Tnkq6XdKmkvYYpXmaAj4h26qFzaLnt6cNs75aRO9Z/Bmxv+0FJh1F0/OwyRH5rNAN8apwR0YjBcZx9aqovA7atrG8D3PGk49krbD9Yfp4HrCtp2hD5ZQb4iGinPo7jXADsImlH4HaKZvUx1R0kbUUxPtOSZlBUDO8dIr/MAB8R7WPDqj5NUmx7laTZwHxgKvBV24slzSq3zwHeALxL0irgEeAo253N+cH81mgG+ATOiGhMPwfAl83veR3fzal8PhU4tU5ekt7XsQ5wP3Cd7UUjpU/gjIhGtPxZ9enl8r1y/a8obgfMkvRt258ZLnECZ0Q0xu0NnM8E9h/sTJL0UeBC4EXAdRSTgAwpgTMiGtPiST62Ax6rrK+kGMr0iKRHh0jzuATOiGiE3epJPr4FXCPpu+X6q4Bzy86izsc4nyKBMyIaIgbG4dW/I1HRE3QWRUfTQRSD62fZXljucuxIeSRwRkRj2niPsxzn+R3bB1Dcz+zZhAycO+/zEBdfeu14F6O1XrvNjPEuQuvNv2PEESexhlo+H+c1kp5ve8FoEk/IwBkRE4CL+5wtdTDF0KNbgYd44p1D+9RJnMAZEY1pca/6oWuSuH13biNiUnDZOVRnGfOy2b+lmDTkpeXnh+khHqbGGRGNaWtTvRzwPh3YDfgasC7wDeAv66RP4IyIxrSxV730WmA/ijk8sX2HpE3qJk7gjIhG2K0OnI+Vw5IG5+PcuJfECZwR0ZgWD0e6QNIZwOaS3gG8leJ1GrUkcEZEY9p6j9P2KZIOAVZQ3Of8iO3L66ZP4IyIRhixuoWPXAJI+nvg270Ey6p2nlVETAquuYyDTYH5kn4i6URJz+olcQJnRDSj7Byqs4x50eyP296L4t1Dfwb8WNIP6qZPUz0imtPSe5wVdwO/p3ip25Z1E6XGGRGNaWuNU9K7JP0IuAKYBryj7nPqkBpnRDTEwOrVrR2OtD3w3jovZusmgTMimmGgZeM4JW1qewXlO4UkPaO63fZ9dfJJUz0iGmPXW+qQNFPSzZKWSPrgMPs9X9KApDd02fyt8t/rgIXlv9dV1mtJjTMimtOnziFJU4HTgEOAZcACSZfYvrHLficD87sWxz68/HfHNSlPAmdENKSvHT8zgCW2lwJIOg84gqe+WO1vgX8Hnt+1RNL+wx3E9s/qFCaBMyKaU7/GOU1Stak81/bcyvrWwG2V9WXAgdUMJG1NMevRSxkicAKfK//dgGJauespZn/fB/hvipe3jSiBMyKaYXD9XvXltqcPs71bRp1h+QvAB2wPFC+y7JLAPhger7GeYPvn5frewPvrFjaBMyIa1Lem+jKKGdsHbQPc0bHPdOC8MmhOAw6TtMr2d7rkt/tg0ASw/QtJ+9YtTAJnRDSnf08OLQB2kbQjcDtwFHDMkw5V6fCRdBbwH0METYCbJJ1JMeu7gTcBN9UtTAJnRDSnT4HT9ipJsyl6y6cCX7W9WNKscvucHrN8C/Au4O/K9SuBL9dNnMAZEc3o8wB42/OAeR3fdQ2Yto8fIa8/Af+3XHqWwBkRjWnrRMZrKoEzIprT3mfV10gCZ0Q0RqlxRkT0YByndx+JpF2BkyhmSXo8Dtp+aZ30CZwR0RC1bnakim8DcyjebDnQa+IEzohoTktrnMAq27WHH3XKtHIR0ZzVNZex9z1J75b0bEnPGFzqJk6NMyKa0cKJjCuOK/89qfKdgZ3qJK4dOCUNAD8H1gVWAWcDX7C9utw+AzgFeFZZgJ8C7wH+GvgsxbOmTwOWAh+3fVWZ7rPAq4DHgF8Db7H9x7rlioj2amuv+prOx9lLU/0R2/uWr9Q8BDgM+ChA+U7ib1PMTLIbsAfwfWCTMu35tvezvQvwaeAiSXuU2y4H9i5flPQr4ENrckIR0SItfbG6pHUlvUfSheUyW9K6ddOP6h6n7buBE4DZKqYiORE42/bV5XbbvtD2XV3S/hCYW6bH9mW2V5Wbr6GY9SQioklfBg4ATi+XAxiLZ9VtL5U0heJdxHtTNN3r+hnwzi7fvxU4v1sCSSdQBtttt57aW2EjYly0takOPN/28yrr/0/S9XUTr2mv+mjv/D4lnaQPU9w7/Wa3BLbn2p5ue/q0ZyZwRrSeKR65rLOMvQFJzxlckbQTPYznHHWNs3Kgu4HFFFXd79ZMvh+Vue8kHQccDrzMnqzTAkSshdr7v/kk4IeSllJU5LanmGqullEFTklbUIy6P9W2JZ0KXCvpP23/d7nPm4AfdEn7Yoom9+AU9jOBDwAvtv3waMoTEe3U1qa67Ssk7QLsRhE4f2n70brpewmcG0paxBPDkc4BPl8W4i5JRwGnSNqSYkjrlcBFZdojJR0EbAT8Bni97cEa56nA+sDl5ZT319ie1UO5IqKtWho4SwcAO1DEwedJwvbX6ySsHThtD3tjsexRf2GXTWeVy1Dpdq5bhoiYYFoaOCWdAzwHWMQT9zYN9DdwRkT0Qm5vU53ixW57jrZPJYEzIprT3omMfwFsBdw5msQJnBHRmBbXOKcBN0q6Fni8U8j2q+skTuCMiOa0N3B+bE0SJ3BGRDNafI/T9o/XJH3m44yI5rR3ko/XSbpF0v2SVkh6QNKKuukTOCOiMVpdb6mVlzRT0s2Slkj6YJftR0i6QdIiSQvLseND+Qzwatub2d7U9ia2N617XmmqR0TrSZoKnEYxpeUyYIGkS2zfWNntCuCS8mnGfYALgN2HyPKuykM4PUvgjIjm9K8ZPgNYYnspgKTzgCOAxwOn7Qcr+288wtEXSjof+A5P7lW/aOgkT0jgjIhm9NY5NE3Swsr6XNtzK+tbA7dV1pcBB3ZmIum1wL9QTHf5V8Mcb1PgYeAVTy4xCZwRMc7qB87ltqcPs73bSPqn5G77YuBiSS8CPgm8vGux7NozIXWTwBkRzelfU30ZsG1lfRvgjiEPa18p6TmSptle3rld0gbA24C9gA0q6d5apzDpVY+IRoi+9qovAHaRtKOk9YCjgEuedDxp5/JVPkjaH1gPuHeI/M6heOTylcCPKQLxA3XPLTXOiGhGHwfA214laTYwH5gKfNX2Ykmzyu1zgNcDfyNpJfAIcOQwk3jsbPuNko6wfbakb5V515LAGRHN6ePgdtvzgHkd382pfD4ZOLlmdivLf/8oaW/g9xRzc9aSwBkRzWnpI5fAXElPB/4PRZP/acBH6iZO4IyIxrT4WfUzy48/BnbqNX0CZ0Q0p6WBU9L6FPdEd6ASB21/ok76BM6IaIbrP4c+Dr4L3A9cR+XJoboSOCOiOS2tcQLb2J452sQZxxkRjRl879BIyzi4StJzR5s4Nc6IaE7LapySfk5RqnWAt0haStFUF2Db+9TJJ4EzIpoxTpMUj+DwfmSSwBkRjRCtHI50D7DS9koASbsBhwG/rTulHOQeZ0Q0qIX3OL9P+YSQpJ2BqynGcZ4o6V/qZpLAGRHNad87h55u+5by83HAubb/FjiUHprxCZwR0Zz2Bc7q0V4KXA5g+zGg9qjT3OOMiGa08/XAN0g6Bbgd2Bm4DEDS5r1kkhpnRDSnfTXOdwDLKe5zvsL2w+X3ewKn1M0kNc6IaEzbHrm0/Qjw6ep3kva3fRVwVd18JmTgnILYaMp6412M1pp/x6LxLkLrvfLP9h3vIrTarzzUxOm9aWFTvZszgf17STAhA2dETADtHADfTbcXwQ0rgTMimjMxAufHe02QwBkRjWjpk0OPk7Q1sD1wX/k6YWxfWSdtAmdENEar2xk5JZ0MHAncCAyUXxtI4IyIcdTue5yvAXaz3fMkxpDAGRENanFTfSmwLqOY/R0SOCOiSX0MnJJmAl+keK/6mbY7x2MeC3ygXH0QeJft64fI7mFgkaQrqARP2++pU5YEzohoTL9qnJKmAqcBhwDLgAWSLrF9Y2W33wAvtv0HSYcCc4EDh8jyknIZlQTOiGhO/2qcM4AltpcCSDoPOIKic6c4VPH0z6BrgG2GLJZ9tqQNge1s39xrYfKsekQ0o3zLZZ0FmCZpYWU5oSO3rYHbKuvLyu+G8jbg0qE2SnoVsIhifk4k7Supdg00Nc6IaESP4ziX254+QnaduuYu6WCKwHnQMPl9jKIW+yMA24sk7VirpCRwRkST3Le2+jJg28r6NsAdnTtJ2ofi2fND7WEfuF9l+37pSfG4dmHTVI+IxvTx1RkLgF0k7ShpPeAoOjp3JG0HXAS82favRsjvF5KOAaZK2kXSv9LD7EgJnBHRjLpzcdYInLZXAbOB+cBNwAW2F0uaJWlWudtHgGcCp0taJGnhMFn+LbAXxVCkc4EVwHvrnlqa6hHRmH7Ox2l7HjCv47s5lc9vB95eM6+HgQ8DHy6HOm1s+091y5IaZ0Q0pode9bEtl/QtSZtK2hhYDNws6aS66RM4I6IZpugcqrOMvT1tr6B4Zn0esB3w5rqJEzgjojEtfK/6oHUlrUsROL9reyXpVY+IVmjfy9oGnQHcCmwMXClpe4oOolrSORQRjWjzRMa2vwR8qfLVb8uB87UkcEZEM+w2T2S8PvB6itcEV+PgJ+qkT+CMiOa0M24CfBe4H7iOUczJmcAZEY1pa1Md2Mb2zNEmTudQRDTDwGrXW8beVZKeO9rEqXFGRHPaW+M8CDhe0m8omuoCbHufOokTOCOiMS1uqh+6JonTVI+Ixmi1ay1jzfZvKaape2n5+WF6iIepcUZEM1r8emBJHwWmA7sBX6N44+U3gL+skz6BMyIaUQyAb2nkhNcC+wE/A7B9h6RN6iZO4IyI5ozDzEc1PWbbUnEXtpwlqbbc44yIxsiutYyDCySdAWwu6R3AFRSv3KglNc6IaEaL73HaPkXSIRQTe+wK/KPtH9RNP2KNU5Ilfa6y/n5JH+vY53pJ53Z8d5akh6v3DSR9scxvWrk+UE5xv7jM432SUguOmBTq9aiPZa+6pAckrZC0guL9RLPK5WJJ90i6RtLLRsqnTpB6FHjdYLDrUpA9ynxe1OU+wRKKl8ZTBsSDgdsr2x+xva/tvYBDgMOAj9YoU0RMBC2byNj2JrY3LZdNqguwFfBO4Isj5VMncK4C5gJ/P8T2Y4BzgMuAV3dsOxc4svz8EuC/yvyewvbdwAnAbHW8szMiJiC399UZ3dgesH098K8j7Vu3WXwacKykzbpsOxI4nyJIHt2x7RZgC0lPL7edN9xBbC8ty7RlzXJFRJu1rMZZh+0zRtqnVuAs383xdeA91e8lPR+4pxx5fwWwfxkkqy6ieAfygcBPahyua21T0gmSFkpaeM+9A3WKHRHjrb0zwK+RXjpivgC8jWKq+UFHA7tLuhX4NbApxeSgVecBnwQutz1spVzSTsAAcHfnNttzbU+3PX2LZ07todgRMV60enWtpVZe0kxJN0taIumDXbbvLulqSY9Ken/fT6aiduC0fR9wAUXwHOzseSOwj+0dbO9A0RF0dEe631G8v/j04fKXtAUwBzjVblndPSJ6Z4oB8HWWEZTvPj+NYnKOPYGjJe3Zsdt9FK3iU/pS/mH0OvTnc8Bg7/qLgNttV3vJrwT2lPTsaiLbZ9j+dZf8NhwcjgT8gKKD6eM9likiWkjUG/xecwD8DGCJ7aW2H6NoyR5R3cH23bYXACv7fzZPNuIAeNtPq3y+C9iosvkFHfsOAINB8/gh8tuh8jlt7ojJrH7jcZqkhZX1ubbnVta3Bm6rrC+j6DcZF3lyKCKaUz9wLrc9fZjt3TqNx+2WXgJnRDRj8B5nfyyjmD9z0DbAHX3LvUcJnBHRmLo95jUsAHaRtCPF04dHUTx8My4SOCOiIf0b3G57laTZwHxgKvBV24slzSq3z5G0FbCQYljkaknvBfYsx6H3VQJnRDTD9PWpINvzgHkd382pfP49RRO+cQmcEdGcljyH3m8JnBHRmBa/OmONJHBGRHMSOCMiemDDwORsqydwRkRzUuOMiOhRAmdERA8MjOH7hMZSAmdENMQw/BS8E1YCZ0Q0w6RzKCKiZ7nHGRHRowTOiIhetO8Nlv2SwBkRzTDQv2nlWiWBMyKakxpnREQv8shlRERvDM44zoiIHuXJoYiIHuUeZ0RED+z0qkdE9Cw1zoiIXhgPDIx3IRqRwBkRzci0chERozBJhyNNGe8CRMTkZMCrXWupQ9JMSTdLWiLpg122S9KXyu03SNq/3+c0KIEzIprhciLjOssIJE0FTgMOBfYEjpa0Z8duhwK7lMsJwJf7e0JPSOCMiMZ4YKDWUsMMYIntpbYfA84DjujY5wjg6y5cA2wu6dn9PaPChLzHed0Njy6f+uwlvx3vclRMA5aPdyFarmXXaMl4F6BTy64P269pBg/wh/k/8IXTau6+gaSFlfW5tnX0gLMAAAOQSURBVOdW1rcGbqusLwMO7Mij2z5bA3fWLENtEzJw2t5ivMtQJWmh7enjXY42yzUa3mS8PrZn9jE7dTvEKPbpizTVI2IiWAZsW1nfBrhjFPv0RQJnREwEC4BdJO0oaT3gKOCSjn0uAf6m7F1/AXC/7b4302GCNtVbaO7Iu6z1co2Gl+szDNurJM0G5gNTga/aXixpVrl9DjAPOIziBvbDwFuaKo88SZ8ljYhoSprqERE9SuCMiOhRAucwJA1IWiRpsaTrJb1P0pTK9hmSriwfA/ulpDMlbSTpeEn3SPofSbdImi/pL8bzXPqlqWsi6bPl/jdIuljS5uNzhmtGkiV9rrL+fkkf69jneknndnx3lqSHJW1S+e6LZX7TyvVhr32MnVz04T1ie1/bewGHUNx4/iiApGcB3wY+YHs3YA/g+8DgL/75tvezvQvwaeAiSXuM+Rn0X1PX5HJgb9v7AL8CPjRmZ9RfjwKvGwx2ncrznQK8SNLGHZuXUD4NUwbEg4HbK9uHvPYxthI4a7J9N8Xzr7MlCTgRONv21eV2277Q9l1d0v6Qotf0hLEsc9P6eU1sX2Z7Vbn5GooxeBPRKorz+vshth8DnANcBry6Y9u5wJHl55cA/1Xm9xRdrn2MoQTOHtheSnHNtgT2Bq7rIfnPgN2bKNd4auiavBW4dM1LN25OA46VtFmXbUcC51MEyaM7tt0CbCHp6eW284Y7SMe1jzGUwNm70f51n8y1gr5dE0kfpqhlfXONSjSObK8Avg68p/q9pOcD99j+LXAFsH8ZJKsuohjcfSDwkxqHm8y/V62VwNkDSTsBA8DdwGLggB6S7wfc1ES5xlM/r4mk44DDgWM98QcYfwF4G1C9j3k0sLukW4FfA5sCr+9Idx7wSeByj/BS8o5rH2MogbMmSVsAc4BTy//UpwLHSTqwss+bJG3VJe2LKe5HfWWsyjsW+nlNJM0EPgC82vbDY1H+Jtm+D7iAIngOdva8EdjH9g62d6DoCDq6I93vgA8Dpw+Xf5drH2Moj1wOb0NJi4B1KZqP5wCfB7B9l6SjgFMkbQmsBq6kaGoBHCnpIGAj4DfA621PhhpnU9fkVGB94PKyr+Ma27PG6Jya8jlgdvn5RcDttqu95FcCe3bOGWn7jCHyG/Lax9jKI5cRET1KUz0iokcJnBERPUrgjIjoUQJnRESPEjgjInqUwBkR0aMEzoiIHv1//b0SXd+z8L8AAAAASUVORK5CYII=\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "fig1, ax1 = plt.subplots()\n", "im1 = plt.imshow(ces1, vmax=np.log(2), vmin=0)\n", "plt.xticks(np.arange(3), labels)\n", "plt.yticks(np.arange(3), labels)\n", "plt.title('Clustering ensemble similarity')\n", "cbar1 = fig1.colorbar(im1)\n", "cbar1.set_label('Jensen-Shannon divergence')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Calculating clustering similarity with multiple methods" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You may want to try different clustering methods, or use different parameters within the methods. `encore.ces` allows you to pass a list of `clustering_method`s to be applied.\n", "\n", "
\n", " \n", "**Note**\n", "\n", "To use the other ENCORE methods available, you need to install [scikit-learn](https://scikit-learn.org/stable/).\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Trying out different clustering parameters" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The KMeans clustering algorithm separates samples into $n$ groups of equal variance, with centroids that minimise the inertia. You must choose how many clusters to partition. [(See the scikit-learn user guide for more information.)](https://scikit-learn.org/stable/modules/clustering.html#k-means)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [], "source": [ "km1 = clm.KMeans(12, # no. clusters\n", " init = 'k-means++', # default\n", " algorithm=\"auto\") # default\n", "\n", "km2 = clm.KMeans(6, # no. clusters\n", " init = 'k-means++', # default\n", " algorithm=\"auto\") # default" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The DBSCAN algorithm is a density-based clustering method that defines clusters as 'high density' areas, separated by low density areas. The parameters `min_samples` and `eps` define how dense an area should be to form a cluster. Clusters are defined around core points which have at least `min_samples` neighbours within a distance of `eps`. Points that are at least `eps` in distance from any core point are considered outliers.\n", "[(See the scikit-learn user guide for more information.)](https://scikit-learn.org/stable/modules/clustering.html#dbscan)\n", "\n", "A higher `min_samples` or lower `eps` mean that data points must be more dense to form a cluster. You should consider your `eps` carefully. In MDAnalysis, `eps` can be interpreted as the distance between two points in Angstrom.\n", "\n", "
\n", " \n", "**Note**\n", "\n", "DBSCAN is an algorithm that can identify outliers, or data points that don't fit into any cluster. ``dres()`` and ``dres_convergence()`` treat the outliers as their own cluster. This means that the Jensen-Shannon divergence will be lower than it should be for trajectories that have outliers. Do not use this clustering method unless you are certain that your trajectories will not have outliers.\n", "\n", "
" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "db1 = clm.DBSCAN(eps=0.5,\n", " min_samples=5,\n", " algorithm='auto',\n", " leaf_size=30)\n", "\n", "db2 = clm.DBSCAN(eps=1,\n", " min_samples=5,\n", " algorithm='auto',\n", " leaf_size=30)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When we pass a list of clustering methods to `encore.ces`, the results get saved in `ces2` and `details2` in order." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4 4\n" ] } ], "source": [ "ces2, details2 = encore.ces([u1, u2, u3],\n", " select='name CA',\n", " clustering_method=[km1, km2, db1, db2],\n", " ncores=4)\n", "print(len(ces2), len(details2['clustering']))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plotting" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAygAAADACAYAAADvC0eDAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjIsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+WH4yJAAAgAElEQVR4nO3deZwdVZn/8c83IQQIBJAkIPsOAsMaQB0EUVBAEBVGAiigIMaRwWV0xEEF1FlARHRAQ2AQRCQoAxjHSFjGAWUZE/iFJSASA0hAEsIWlgBJ5/n9cU6Tys3te6v79nKr+/t+veqVrlt1qp66fXL7PnWWUkRgZmZmZmbWDoYNdABmZmZmZmadnKCYmZmZmVnbcIJiZmZmZmZtwwmKmZmZmZm1DScoZmZmZmbWNpygmJmZmZlZ23CCYk1JCklbD3QcZs1IukzStwc6DjMzM+u5IZWgSHpM0gGF9QmSnpe030DG1VOS9pf0W0kvSnqsZts4SVdJeipvv13S3gMUqr849rLBVpcBJO0u6TZJL0uaL+lzAxTHuyXNG4hzDzW5Hi+W9JKkFyTdIWmipGGFfS6T9EauFy9JurtYzyVtLOm/JC3Mn3X3SzqhsH1VSWdKekTSK/mcl0ravCaWyyQtlbRhzetn5ps0f1d4bZX82grHaBeSjpH0eL7e6yW9pcG+nb+Dl/NyY3/GOpi4Pvc+SW+VNDV/l2nLGK1vDKkEpUjS8cCFwAci4taBjqeHXgEuBb5cZ9uawAxgD+AtwOXAryWt2X/h9R5Jqwx0DO1qMNRlSWOAG4CLgPWArYFKflFyXe22wyJiLWAz4N+BrwD/WbPPORGxJrA28CPgWknD87YrgCdy+fWA44D5hbLXAB8EjsnldwHuBt7buYOkUcARwIvAsXVifA74ZuGcbUvSjqT/Rx8H1gdeBX7YpNhhEbFmXt7X1zEOcq7PvWsZ6W/DEQMdiPWziBgyC/AYcABwMrAQGF/YtjkQwCdIHw7PAxOBPYH7gBeAC2qO90ngobzvdGCzwrbv5+MsIn14vKuw7Uzg58BPgJeA2TWxfAV4Mm97GHhvk+s6AHisxPUvAvboYttw4J+BP+fz3g1skrcFsHX++X+BkwrlTgB+n38W8D1gAemD8T5gp/x+LwHeAF4GfpX33xD4L+AZ4FHg1Jr36Brgpznuk4C9gJl5fT5w3kDXKdfl3qnLwL8CV3Tj+vcB7sjX8gRwQn79MuDbtXWzUK5Ylw8BHsyxPQl8CRgFLCb9UXw5LxuSbuaclv9/PJuv+S017/eJwF+A24DVct19Nsc4A1h/oOtNuy2d9bjmtb3y+79T7e80r6+R3+8N8/rLwK5dHP+A/PvcpEkcx+V69DnggZptZwJXAvcCx+fXVskxbN7F8dYmfSn9a65b3waGF+rl7cB/kD4n/1j8f5G3z8318lHg2G6+p/8K/KywvhXps3etsr8DL67PNWUGrD4XjtMwRi+DbxmKLSifAb5F+g80s872vYFtgKOA84HTSR8KOwIf7WyKlfQh0hf6jwBjgd8BVxWOMwPYldR68TPgF5JWK2z/IDAFWAeYClyQj7sdcAqwZ6S7MO8nfei1RNKuwKrAnC52+SJwNOlL22jSF9ZXu3ma9wH7AtuSruso4NmImEz6QDwn0h26w3KT969IH5Ibke7+fF7S+wvHO5yUpKyTy38f+H5EjCb90f15N+MbbAZTXX478FzuErFA0q8kbVpvx/z6b0h/EMfm2GZ1cdxG/hP4dI5tJ+B/IuIV4GDgqVh+R/kp4FTgQ8B+pITleVKrVdF+wNvydR5P+qO+Ceku6ETSFwtrIiL+AMwD3lW7Ld/xPY70RafzrvJdwIW5m2NtnTkA+ENEPNHktMeT6vwUYHtJu9eGBXwdOEPSiBKXcTmwlNQSuBvps/Gkwva9SV/axgBnkO6gvyXf+f4BcHCul+8k121J++RuQ10t++Rj70j6XE2BR/yZlKBs2yDeKyU9I+lGSbuUuD4ryfW55fpsQ9RQTFAOJH0A3N/F9m9FxGsRcSOpC9VVEbEgIp4kfXHbLe/3aeDfIuKhiFhKumu1q6TNACLipxHxbEQsjYjvAiOB7Qrn+X1ETIuIDlKTbucfhY687w6SRkTEY/kPTI9JGp3PcVZEvNjFbicBX4uIhyO5NyKe7eaplgBrAdsDyu/NX7vYd09gbER8MyLeiIi5wMXAhMI+d0bE9RGxLCIW5+NvLWlMRLwcEXd1M77BZjDV5Y1Jf1Q/B2xK+oN9VRf7HgvcHBFXRcSSHFtPEpQlObbREfF8RNzTYN9PA6dHxLyIeJ10F/LImu5cZ0bEK4W6uh6ptaYjIu6OiEU9iHGoeoqUEHf6kqQXSPX4fODrub4B/B2pPn8deFTSLEl75m3rke76dil/Cdyf1OowH7iFVBdXEBFTSa29J9Vuqzne+qQk9/O5PiwgtSwXP9sWAOfn+ns1qXXxA3nbMmAnSatHxF8jYnY+/+8jYp0Gy+9z+TVJd7KLXiR9NtdzLKkVcDPgt8B0Ses0ukbrNtfnntdnG6KGYoIykXQn6RJJqrO92NdzcZ31zjEcmwHf78z2SX06RWoNQNI/SnooD3J7gXQ3dUzhWE8Xfn4VWE3SKhExB/g86QvQAklTage5dYek1UktFXdFxL812HUTUveVHouI/yHdPb8QmC9pck6O6tkM2LB4x4R0F3/9wj61d4lOJP3u/ihphqRDW4l3EBhMdXkxcF1EzIiI14CzgHdKWrvOvi3X1ewIUovh45JulfSOBvtuBlxXeI8eIiVgXdXXK0hd5abkwZ3nlLxTaclGpHrY6dyIWAdYHRgPfEfSwQA5uTwtInYk/T5mAdfn/xPPAm9tcq6PAw8VktwrgWO6+H19jdQSuVqdbZ02A0YAfy3Ul4uAcYV9noxI/Vayx0ldfF4htXhOzOV/LWn7JvHXepnUCl40mtTFZiURcXtELI6IV/PfiBeoc7ffWuL63PP6bCW9f/9RMX6X1Roukm4Y6DjLGooJygJSd6J30XzgYCNPkLqHFDP+1SPiDknvIvW9/yiwbv4gepH0pa+piPhZROxD+mAI4OyeBChpJHA9qc/op0tcz1YlDvsKqc9spw2KGyPiBxGxB6mbwbYsH8Bf/PDqPN+jNe/fWhFxSPFwNcd+JCKOJn0wng1ck5uQh6rBVJfvY8Xfd+fP9c7To7oqqbauzoiIw0n16XqWdxmsraud5zy45j1aLbdG1cZMvpN4VkTsQOrWcCipK4c1ke8WbwSsdAc1kgdIfd4/UGf7QuBcUje8twA3A3tJ2rjBKY8DtpT0tKSngfNICfjBdY5/E6mb7N83ON4TwOvAmEJdGZ2/cHbaqOamwqaku+xExPSIOJD0RfSPpJZlJL1Ly2faqrd0JhWzWd6KiaQtSS2Zf2oQ8wqXScn/39ac63PL9dlKeva5Dv4wfdOGCyveXGxrQzFBIVKf8vcAB0n6Xg8PMwn4qtKMKUhaW8un7VuL1F/zGWAVSd9g5TtadUnaTtJ7cnLxGunOckcX+w5TGgswIq1qNUmr5m0jSOM3FgPHRcSyJqe+BPiWpG2U7CxpvTr7zQI+ImkNpWejnFiIZ09Je+dzv5Lj74x9PrBl4Th/ABZJ+oqk1SUNl7RToSm73vV+TNLYfC0v5JfrvjdDxWCpy8CPgQ9L2jXXn6+Tuo69UGffK4EDJH1UaXrM9ZTGWNW6F9gxH3M1UktOZ2yrSjpW0toRsYQ0AUCxrq5X03ozCfgX5W5vksZKOrzBte8v6W+U+pgvInX5GtJ1tRlJo5VaRacAP42Iul0X8x3YfUhfxJF0dv7sWEXSWqSxWXMidf27GbiJ1Pq1R+c+SlO/flKp1Wwr0kDmXfOyE2ms1UrdYrLTgX/q6joidWu9EfhuvqZhkrbSilOAjwNOlTQi/197GzBN0vqSPphvvLxOag3pyMf9XSwfF1Vv+V0+9pXAYfkL4Cjgm8C1EbFSC4qkTSX9bf7/sJqkL5O+wNze1fVZOa7PvVafyZ/fI/PqSK04BtKyIFgSSxsuVTIkExSASIPM3kPqR96o61NX5a8j3Q2eImkR8ADL71BMJw3i/ROpqfM1Vu6u1JWRpKkJF5K6zowjdX2qZ1/Sl75ppDsWi1k+NWvnXdv3AS+UuCtxHukO8o2kL1T/SWp+rvU90oDL+aSBc1cWto0m3R15nnTdz5Lu/pCPt4NSE/H1kfrbHkb6AH00X+8lpO5DXTkImC3pZdKA+Qm5O9CQNhjqcu4e+M/Ar0ktQ1uTptGst+9fSF2z/pHUbWIWhTvGhf3+RPpydjPwCCvfwfw48Fi+5onAx3K5P5LGv8zN9XVDUn2bCtwo6SXS2J9GzxXagHSDYBGpO9itpFm9bGW/yu/pE6QvS+eRZqAr+qf8+fUK6TPqx6RuJpBaya4j3bSYS2qt+2Ch7JGkz8irSa1/D5C61dxM+tL2y4i4PyKe7lxIv+9DVef5IRFxO+kGSyPHkSYleZD0eXgNK3bN+T/SBBYLgX8Bjow05m8YqV4/Rarb+9H47vZKIvXxn0j6bF5Ausnw5jEkTZI0Ka+uRZrm9nlSS/tBpJbC7o4/tOVcn3uxPmeLSckNpFYYTzhSRwDLiIZLlWjFboNmZmbWV5QeundS7vpoVmmuz+1jt11WjVt/s0HDfdbe6Im7I2J8P4XUEj9QzMzMzMyswgJYQrPe/NXhBMXMzMzMrOKq1o2rkSE7BsXMzKy/RcRl7g5jg4Xrc/sIYElEw6VK3IJiZmZmZlZhQdAxiFpQKpmgrKqRsRrVfvzFtju/OtAhDHl33/f6wogY25/ndN21Vj32xBIWPtfRr8+pGAz1dum4asc/GL53LH5mnj9zrZJe4vl+r7vdFQFLBsHnRKdKJiirMYq99d6BDqMl06fPar6T9anhb53zeH+f03XXWrXX+8vO8tx7BkO9nX/0Owc6hJZoEDxF574LvujPXKukm+Oafq+73Sc6BtEzViuZoJiZmZmZWZLGoDhBMTMzMzOzNpASlMEz95UTFDMzMzOzCgtwFy8zMzMzM2sPgVgSwwc6jF4zeNqCzMzMzMyGoM4WlEZLf5K0raRbJD2Q13eW9LWy5Z2gmJmZmZlVWGpBWaXh0s8uBr4KLAGIiPuACWULu4uXmZmZmVnFtdkYlDUi4g/SCjEtLVvYCYqZmZmZWYVFtN0YlIWStiI/ZlbSkcBfyxZ2Fy8zMzMzswoLxBuxSsOlGUkHSXpY0hxJp3Wxz7slzZI0W9KtDQ73WeAiYHtJTwKfBz5T9nrcgmJmZmZmVmEBLGuh3UHScOBC4EBgHjBD0tSIeLCwzzrAD4GDIuIvksZ1GU/EXOAASaOAYRHxUnficQuKmZmZmVmFpRaU4Q2XJvYC5kTE3Ih4A5gCHF6zzzHAtRHxF4CIWNDVwST9q6R1IuKViHhJ0rqSvl32epygmJmZmZlV3LIY1nABxkiaWVhOLhTfCHiisD4vv1a0LbCupP+VdLek4xqEc3BEvNC5EhHPA4eUvRZ38TIzMzMzq7BluQWliYURMb6LbfWmAIua9VWAPYD3AqsDd0q6KyL+VKfscEkjI+J1AEmrAyObBVg8kZmZmZmZVVgrY1BILSabFNY3Bp6qs8/CiHgFeEXSbcAuQL0E5afALZJ+TEp0PglcXjYYJyhmZmZmZhXWC9MMzwC2kbQF8CTpoYrH1OzzS+ACSasAqwJ7A9+rH0+cI+l+UmuLgG9FxPSywThBMTMzMzOrsIBSUwl3WT5iqaRTgOnAcODSiJgtaWLePikiHpJ0A3AfsAy4JCIeaHDM3wC/6Uk8TlDMzMzMzCosEMuitSfJR8Q0YFrNa5Nq1r8DfKfZsSR9BDgbGEdqQVEqHqPLxOIExczMzMyswgJY0kILSh84BzgsIh7qSeG2uhIzMzMzM+su0VF3Iq4BM7+nyQn0UoIiqQO4HxgBLCWN0j8/Ipbl7XsB5wLrk5K83wOnAh8lNRPNA9YE5gJnRcQdvRGXmZmZmdlgl1pQWhok39tmSroauB54vfPFiLi2TOHeakFZHBG7AuTH3v8MWBs4Q9L6wC+ACRFxpyQBRwBr5bJXR8Qpuez+wLWS9m8l6zIzMzMzGyoi1PkwxnYxGngVeF/htQD6NUFZfuaIBfnJlDMknQl8Frg8Iu7M2wO4BiDlKiuU/a2kycDJwBd6OzYzMzMzs8Gm3VpQIuITrZTvk1QrIubmY48DdgLu7kbxe4Dta1+UdLKkmZJmLlneUmTW9lx3rYpcb62qXHdtKArEkmXDGy79SdK2km6R9EBe31nS18qW78u2oJ6O1KlbLiImR8T4iBg/gpEthGXWv1x3rYpcb62qXHdtqOpgWMOln10MfBVYAhAR95Ee/lhKn0QraUugA1gAzAb26Ebx3QCPPzEzMzMzKyEQS2N4w6WfrRERf6h5bWnZwr2eoEgaC0wCLsjjTS4Ajpe0d2Gfj0naoE7Z/UjjTy7u7bjMzMzMzAajCOgINVz62UJJW5GGxyDpSOCvZQv31iD51SXNYvk0w1cA5wFExHxJE4Bz8wxfy4DbWD6K/yhJ+wBrAI8CR3gGLzMzMzOzcgKxtJ/HmTTxWWAysL2kJ0nf8T9WtnCvJCgRjduN8gxe76qz6bK8mJmZmZlZD7XTgxrzhFkHSBoFDIuIl7pT3k+SNzMzMzOrsHZrQZH0xZp1gBeBuyNiVrPyTlDMzMzMzCosApa014Max+flV3n9A8AMYKKkX0TEOY0KO0ExMzMzM6u4NnuS/HrA7hHxMoCkM0gPat+X9HxEJyhmZmZmZoNVmma4rRKUTYE3CutLgM0iYrGkpk9QbasrMTMzMzOz7glgWajh0oykgyQ9LGmOpNPqbH+3pBclzcrLNxoc7mfAXZLOyK0ntwNX5UHzDzaLxS0oZmZmZmZVFq0Nkpc0HLgQOBCYB8yQNDUiapOJ30XEoU2OJdIsvdOAfQABEyNiZt7l2GbxOEExMzMzM6uwAJa1Ns3wXsCcPD0wkqYAh1OitWOlWCJC0vURsQdpvEm3uYuXmZmZmVmFBbB02bCGCzBG0szCcnLhEBsBTxTW5+XXar1D0r2SfiNpxwYh3SVpz55ej1tQzMzMzMwqLD0HpWm7w8KIGN/FtnrNL1Gzfg9poPvLkg4Brge26eJ4+5OmFH4MeCUfPyJi52ZBghMUMzMzM7PKa7GL1zxgk8L6xsBTxR0iYlHh52mSfihpTEQsrHO8g1sJxl28zMzMzMwqLKJUF69GZgDbSNpC0qrABGBqcQdJG+QB8Ejai5RHPFs/nniclPC8J//8Kt3IO9yCYmZmZmZWcWWmEu5KRCyVdAowHRgOXBoRsyVNzNsnAUcCn5G0FFgMTIiI2m5gwJsPZhwPbAf8GBgB/BT42zLxOEExMzMzM6uwQHQ0byVpfIyIaaSpgYuvTSr8fAFwQcnDfRjYjTRuhYh4StJaZWNxgmJmZmZmVnEtjkHpbW/k6YYDID+gsTQnKGZmZmZmFRZByy0oveznki4C1pH0KeCTwMVlCztBMTMzMzOrtNa7ePWmiDhX0oHAItI4lG9ExE1lyztBMTMzMzOrsKC1QfK9TdIXgF90JykpcoJiZmZmZlZlAR1tlKAAo4Hpkp4DpgDXRMT8soXbpy3IzMzMzMy6LYAINVz6NZ6IsyJiR+CzwIbArZJuLlu+ki0o2+78KtOnzxroMFry/g13HegQWnLeY3cOdAiV5Lo78JbevOlAh9CSh1+7st/PuXTcKOYf/c5+P29vWv8Hdwx0CC159qR3DHQIZtbWRMeytmpB6bQAeJr0QMdxZQu5BcXMzMzMrOLaqQVF0mck/S9wCzAG+FRE7Fy2fCVbUMzMzMzMLGnDaYY3Az4fET3qNuIExczMzMys4pa1QRcvSaMjYhFwTl5/S3F7RDxX5jhOUMzMzMzMKizo/25cXfgZcChwN2nsfjGoALYscxAnKGZmZmZmVRbt8RyUiDg0/7tFK8dxgmJmZmZmVnUx0AGApN0bbY+Ie8ocxwmKmZmZmVnFtcMYFOC7+d/VgPHAvaRuXjsD/wfsU+YgbTXc38zMzMzMuqc3HtQo6SBJD0uaI+m0BvvtKalD0pErxRGxf0TsDzwO7B4R4yNiD2A3YE7Z63GCYmZmZmZWZQGxTA2XRiQNBy4EDgZ2AI6WtEMX+50NTG8S0fYRcf+b4UU8AJR+0rMTFDMzMzOzqosmS2N7AXMiYm5EvAFMAQ6vs98/AP9FekJ8Iw9JukTSuyXtJ+li4KGyl+IxKGZmZmZmlda8lQQYI2lmYX1yREzOP28EPFHYNg/Ye4UzSBsBHwbeA+zZ5FyfAD4DfC6v3wb8qFmAnZygmJmZmZlVWe7i1cTCiBjfxbZ6hWvbXc4HvhIRHVLjc0XEa8D38tJtTlDMzMzMzCqvpVm85gGbFNY3Bp6q2Wc8MCUnJ2OAQyQtjYjrWzlxPU5QzMzMzMyqbllLpWcA20jaAngSmAAcU9yh+PBFSZcB/90XyQk4QTEzMzMzq7YAWniSfEQslXQKaXau4cClETFb0sS8fVKvxFmSExQzMzMzs4qL1lpQiIhpwLSa1+omJhFxQqNjSdoW+DKwGYV8IyLeUyYWJyhmZmZmZlXXQgtKH/gFMAm4GOjobmEnKGZmZmZmVRagFltQetnSiCg9rXAtJyhmZmZmZpUmaD7NcH/6laS/B64DXu98MSKeK1PYCYqZmZmZWdU1f1p8fzo+//vlwmsBbFmmcOkERVIHcD8wAlgKXA6cH5GG5EjaCzgXWD8H8HvgVOCjwHdI8yuvCcwFzoqIO3K57wCHAW8AfwY+EREvlI3LzMzMzGxIC9qqBaU4JXFPDOvGvosjYteI2BE4EDgEOANA0vqkwTBfiYjtgLcBNwBr5bJXR8RuEbEN8O/AtZLelrfdBOwUETsDfwK+2soFmZmZmZkNNYrGS7/GIo2QdKqka/JyiqQRZct3J0F5U0QsAE4GTlF6nORngcsj4s68PSLimoiYX6fsb4HJuTwRcWNELM2b7yI9udLMzMzMzMqKJkv/+hGwB/DDvOyRXyulx2NQImKupGHAOGAnUpevsu4BPl3n9U8CV9crIOlkclKz6UYeOmPV4bprVVSstyPWWneAozErr1h3V2ONAY7GrP/0dytJE3tGxC6F9f+RdG/Zwj1qQSnoaWe3lcpJOp00tuXKegUiYnJEjI+I8WPXG97D05r1P9ddq6JivV1l9VEDHY5ZacW6O4KRAx2OWf/oHIPSaOlfHZK26lyRtCXdeB5Kj2/nFk60AJhNarr5ZcniuwEPFY51PHAo8N6IaK/8z8zMzMyszbXZc1C+DPxW0lxSw8RmwCfKFu5RgiJpLOnpkBdEREi6APiDpF9HxP/lfT4G3Fyn7H6kptf98/pBwFeA/SLi1Z7EY2ZmZmY2pLXRLf6IuEXSNsB2pATljxHxepNib+pOgrK6pFksn2b4CuC8HMR8SROAcyWNA5YBtwHX5rJHSdoHWAN4FDgiIjpbUC4ARgI3pfH23BURE7sRl5mZmZnZkKX2e5I8pN5Vm5PyjV0kERE/KVOwdIISEQ07z+cZvN5VZ9Nleemq3NZlYzAzMzMzszqifZ6DIukKYCtgFsvHngTQuwmKmZmZmZm1pzZrQRkP7NDTseWtzuJlZmZmZmYDrb2eg/IAsEFPCztBMTMzMzOrsjwGpdHSjKSDJD0saY6k0+psP1zSfZJmSZqZx5d3ZQzwoKTpkqZ2LmUvx128zMzMzMwqrpUuXpKGAxcCBwLzgBmSpkbEg4XdbgGm5hl8dwZ+DmzfxSHP7Hk0TlDMzMzMzIa6vYA5ETEXQNIU4HDgzQQlIl4u7D+KBh3HIuLWVoJxFy8zMzMzsyor18VrTO6a1bmcXDjCRsAThfV5+bUVSPqwpD8CvwY+2VU4kj4i6RFJL0paJOklSYvKXo5bUMzMzMzMqq75QPiFETG+i2315ihe6YgRcR1wnaR9gW8BB3RxvHOAwwrPPewWt6CYmZmZmVWYaHmQ/Dxgk8L6xsBTXe0cEbcBW0ka08Uu83uanIBbUMzMzMzMqq+1qYRnANtI2gJ4EpgAHFPcQdLWwJ/zIPndgVWBZ7s43kxJVwPXA6+/GWLEtWWCcYJiZmZmZlZl0dosXhGxVNIpwHRgOHBpRMyWNDFvnwQcARwnaQmwGDiqwYMYRwOvAu9bMUqcoJiZmZmZDQWtPkk+IqYB02pem1T4+Wzg7JLH+kQrsThBMTMzMzOruv5/WnyXJK0GnAjsCKzW+XpEdDnzV5EHyZuZmZmZVVkvPEm+l10BbAC8H7iVNOj+pbKFnaCYmZmZmVVdNFn619YR8XXglYi4HPgA8DdlC7uLl5mZmZlZxQ1AK0kjS/K/L0jaCXga2LxsYScoZmZmZmZVNjCtJI1MlrQu8HVgKrAm8I2yhZ2gmJmZmZlVmAC1UYISEZfkH28FtuxueScoZmZmZmYV105dvCSNJD03ZXMK+UZEfLNMeScoA+S8x+4c6BBa8sXN3zHQIfSCawY6gEpaevOmAx1CS1Y54C8DHUJLFG/0/0kD1NH/p+1Nz55U7c+s9S6p9t8MM+sHbdSCAvwSeBG4m8KT5MtygmJmZmZmVmUtPkm+D2wcEQf1tLCnGTYzMzMzqzhF46Wf3SGp9LTCtdyCYmZmZmZWce3QgiLpflJns1WAT0iaS+riJSAiYucyx3GCYmZmZmZWZe0zzfChvXEQJyhmZmZmZhUm2qMFBXgGWBIRSwAkbQccAjweEdeWPYjHoJiZmZmZVZyWRcOln9xAfmK8pK2BO0nPQfmspH8rexAnKGZmZmZmVRYlliYkHSTpYUlzJJ1WZ/uxku7Lyx2SdqlzmHUj4pH88/HAVRHxD8DBdKP7lxMUMzMzM7OK07LGS8Oy0nDgQlIisQNwtKQdanZ7FNgvD3T/FjC5zqGKqdB7gJsAIuINoHQnNI9BMTMzMzOruBanEt4LmBMRcwEkTQEOBx7s3CEi7ijsfxewcZ3j3CfpXOBJYGvgxny8dboTjFtQzMzMzMyqLFprQQE2Ap4orM/Lr3XlROA3dV7/FLCQNA7lfRHxan59B+DcElcCuAXFzMzMzKz6mregjJE0s7A+OSI6u/NoVuUAAAxBSURBVGmp7BEl7U9KUPZZqUDEYuDfa/bfPbe+3FG7f1ecoJiZmZmZVViaZrhphrIwIsZ3sW0esElhfWPgqZXOI+0MXAIcHBHPlgzvEmD3kvsC7uJlZmZmZlZtrXfxmgFsI2kLSasCE4CpxR0kbQpcC3w8Iv7Ujejqtc405BYUMzMzM7OKa+VBjRGxVNIpwHRgOHBpRMyWNDFvnwR8A1gP+KEkgKUNWmSKzupuPE5QzMzMzMwqrtUnyUfENGBazWuTCj+fBJxUOh5pI2Az4DlJ++Zj3FamrBMUMzMzM7MqCyD67WnxTUk6GziKNE1xR345ACcoZmZmZmZDQastKL3sQ8B2EfF6Two7QTEzMzMzqzDR8oMae9tcYATgBMXMzMzMbMiJKDPNcH96FZgl6RYKSUpEnFqmsBMUMzMzM7OKa7MuXlOpmaa4O5ygmJmZmZlVWQBt1IISEZdLWh3YNCIe7m75pg9qlBSSvltY/5KkM2v2uVfSVTWvXSbpVUlrFV77fj7emLzeIWmWpNn5GF+U5IdHmpmZmZl1Q4sPauzdWKTDgFnADXl9V0mlW1TKJAOvAx/pTCrqBPC2fJx9JY2q2TwHODzvNwzYH3iysH1xROwaETsCBwKHAGeUDd7MzMzMzEjTDDda+teZwF7ACym0mAVsUbZwmQRlKTAZ+EIX248BrgBuBD5Ys+0q0hzIAO8Gbs/HW0lELABOBk5RfjylmZmZmZk1Ee3VgkJ6yvyLK0dZTtnuVBcCx0pau862o4CrScnI0TXbHgHGSlo3b5vS6CQRMTfHNK52m6STJc2UNPOZZztWLmzWplx3rYqK9Xbp4lcGOhyz0op1d0nPZjg1q5w0zXA0XPrZA5KOAYZL2kbSfwB3lC1cKkGJiEXAT4AVpgaTtCfwTEQ8DtwC7J6TkaJrgQnA3sDvSpyubutJREyOiPERMX7sesPLhG3WFlx3rYqK9XaV1Wt775q1r2LdHcHIgQ7HrN+oIxou/ewfgB1JQ0WuAhYBny9buDuzeJ0P3AP8uPDa0cD2kh7L66OBI4BLCvtMyeUuj4hljXpvSdoS6AAWdCMuMzMzM7OhK6LdZvF6FTgdOF3ScGBURLxWtnzpGbMi4jng58CJ8Oag978Ddo6IzSNic9KA+KNryv0lB/jDRseXNBaYBFwQ0f/tUGZmZmZmVaVovPRrLNLPJI3OE2jNBh6W9OWy5bs7pe93gc7ZvPYFnoyI4qxctwE7SHprsVBEXBQRf65zvNU7pxkGbiYNtD+rmzGZmZmZmQ1d0XZdvHbIQ0Q+BEwDNgU+XrZw0y5eEbFm4ef5wBqFzW+v2bcD6ExOTujieJsXfnaHfDMzMzOzVrVXB6QRkkaQEpQLImKJVL4dxw9FNDMzMzOrOC2LhkvT8tJBkh6WNEfSaXW2by/pTkmvS/pSk8NdBDwGjAJuk7QZaaB8Kd0ZJG9mZmZmZu2ohRaUPJD9QtKD0+cBMyRNjYgHC7s9R5rR90PNQ4kfAD8ovPS4pP3LxuMExczMzMyswhQtjzPZC5iTn0mIpCmkya/eTFDyQ9UXSPpA03ikkaSZfTdnxXzjm2WCcYJiZmZmZlZ1y5o+Ln6MpJmF9ckRMTn/vBHwRGHbPNIzDHvql8CLwN3Q/SemOkExMzMzM6uyAJrmJyyMiPFdbKv3oMJWmmQ2joiDelrYg+TNzMzMzCpOy5Y1XJqYB2xSWN8YeKqFcO6Q9Dc9LewWFDMzMzOzSotWpxmeAWwjaQvgSWACcEwLx9sHOEHSo6QuXgIiInYuU9gJipmZmZlZlQXQwiD5iFgq6RRgOjAcuDQiZkuamLdPkrQBMBMYDSyT9HmWP5Cx1sE9DgYnKGZmZmZmlacWH9QYEdNIT30vvjap8PPTpK5fZY71uKR9gG0i4seSxgJrNivXyQmKmZmZmVmVBdDRfJR8f5F0BjAe2A74MTAC+Cnwt2XKO0ExMzMzM6u0KDPNcH/6MLAbcA9ARDwlaa2yhZ2gmJmZmZlVXYtdvHrZGxERkgJA0qjuFHaCYmZmZmZWZRHQ0THQURT9XNJFwDqSPgWcCFxStrATFDMzMzOzqmujFpSIOFfSgcAiYFvgaxFxc9nyTlDMzMzMzKqsTQbJS3qJ5U+gLz6dfqKk14A/A6dHxC2NjuMExczMzMys6tqgBSUiuhwIL2k4sBNwZf63S05QzMzMzMyqrP3GoKwkIjqAeyX9R7N9naCYmZmZmVVde00z3KWIuKjZPoo2aA7qLknPAI/34SnGAAv78Pj9oerX0B/xbxYRY/v4HCtw3W2q6vFD31/DYKy3UP3ffdXjB9fdnvDvfeANyu8L3bX2iLHxznWOaLjPDQsvujsixvdTSC2pZAtKX1cSSTOr8gvsStWvoerxd8V1t7Gqxw+D4xpq9ccf5qq/b1WPHwbHNdTyZ25zVb+GqsffawKizbt4dUclExQzMzMzMyuoYK+orjhBMTMzMzOrsgoMku8OJyj1TR7oAHpB1a+h6vEPlKq/b1WPHwbHNQyEqr9vVY8fBsc19LfB8J5V/RqqHn+viYoMki+jkoPkzczMzMwsWXvYevH2kYc03OfG135amUHywwY6ADMzMzMza1Esa7w0IekgSQ9LmiPptDrbJekHeft9knbvk+vAXbzMzMzMzCotIlqaxSs/5f1C4EBgHjBD0tSIeLCw28HANnnZG/hR/rfXDakWFEkdkmZJmi3pXklflDSssH0vSbfl7PGPki6RtIakEyQ9I+n/SXpE0nRJ76xyzJK+k/e/T9J1ktbpo/hD0ncL61+SdGbNPvdKuqrmtcskvSpprcJr38/HG5PXG743g4nrrutuFVWx3vZl3P1Rd11ve0cV626V620+j+tui6Kjo+HSxF7AnIiYGxFvAFOAw2v2ORz4SSR3AetIemvvXwk54xoiC/By4edxwM3AWXl9fdLDnN6R1wUcmV8/AbigUHZ/4GngbVWNGXgfsEr++Wzg7D6K/zXgUWBMXv8ScGZh+9uA+4EngVGF1y8D7gM+lteH5fV5hWN1+d4MtsV113W3iksV623V667rbXvXgSrG3B/11nW3V96/G4CZTZYHatZPLpQ/EriksP7xYr3Ir/03sE9h/RZgfF9cz6DLHsuKiAXAycApkgR8Frg8Iu7M2yMiromI+XXK/pY0a8TJVY05Im6MiKV5813Axn0U9tJ83i90sf0Y4ArgRuCDNduuAo7KP78buD0fbyV13ptBy3XXdbeKqlhv87mrVnddb3tZFetuBestuO62JCIOiojxTZadataLM6DVey9qZ9Iqs0+vGLIJCkBEzCW9B+OAnYC7u1H8HmD7voirkT6K+ZPAb1qPrksXAsdKWrvOtqOAq0kfLkfXbHsEGCtp3bxtSqOT1Lw3g5rr7ptcdyukivUWKll3XW97WRXrbgXrLbjuDqR5wCaF9Y2Bp3qwT68Y0glK1tPseSCz7l6LWdLppLsMV7YUUQMRsQj4CXBqzbn3BJ6JiMdJzYS75w+XomuBCaRBWL8rcbpBczekBNdd190qqmK9beX8/V53XW/7TBXrbmXqLbjuDrAZwDaStpC0Kum9nFqzz1TgOCVvB16MiL/2RTBDOkGRtCXQASwAZgN7dKP4bsBDfRFXI70Zs6TjgUOBYyOirx+Icz5wIjCq8NrRwPaSHgP+DIwGjqgpNwX4FnBTROM58mrem0HNddd1t4qqWG+hsnXX9bYXVbHuVrTeguvugMjd+E4BppN+9z+PiNmSJkqamHebBswF5gAXA3/flwENmYUVB0mNJfVjrB1Atndhn48BG7DyALL9GJhBb70WM3AQ8CAwth/jPwf4C3AmKTl+AtiosH1/4Jb882XAkfnnTwNb5Z8fo/6gtxXem8G2uO667lZxqWK9rXrddb1t7zpQxZj7o9667nqpXYbac1BWlzQLGEFqprwCOA8gIuZLmgCcK2kcsAy4jdRkCHCUpH2ANUizTBwREf1xR6SvYr4AGAnclMeI3RURnRlyX/kuKTsH2Bd4MiKeLGy/Ddihdsq6iLioi+N1+d4MQq67rrtVVMV625dx93fddb3tuSrW3cFSb8F1d8hTRF+31JmZmZmZmZUzpMegmJmZmZlZe3GCYmZmZmZmbcMJipmZmZmZtQ0nKGZmZmZm1jacoJiZmZmZWdtwgmJmZmZmZm3DCYqZmZmZmbWN/w8DRRX+Q27ExAAAAABJRU5ErkJggg==\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "titles = ['Kmeans 12 clusters', 'Kmeans 6 clusters', 'DBSCAN eps=0.5', 'DBSCAN eps=1']\n", "fig2, axes = plt.subplots(1, 4, sharey=True, figsize=(15, 3))\n", "for i, (data, title) in enumerate(zip(ces2, titles)):\n", " imi = axes[i].imshow(data, vmax=np.log(2), vmin=0)\n", " axes[i].set_xticks(np.arange(3))\n", " axes[i].set_xticklabels(labels)\n", " axes[i].set_title(title)\n", "plt.yticks(np.arange(3), labels)\n", "cbar2 = fig2.colorbar(imi, ax=axes.ravel().tolist())\n", "cbar2.set_label('Jensen-Shannon divergence')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As can be seen, reducing the number of clusters in the K-means method emphasises that DCD2 is more similar to the NAMD trajectory than DCD. Meanwhile, increasing `eps` in DBSCAN clearly lowered the density required to form a cluster so much that every trajectory is in the same cluster, and therefore they have identical probability distributions." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Number of clusters in DBSCAN eps=1: 1\n" ] } ], "source": [ "n_db = len(details2['clustering'][-1])\n", "\n", "print('Number of clusters in DBSCAN eps=1: {}'.format(n_db))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Estimating the error in a clustering ensemble similarity analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`encore.ces` also allows for error estimation using a bootstrapping method. This returns the average Jensen-Shannon divergence, and standard deviation over the samples. " ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "avgs, stds = encore.ces([u1, u2, u3],\n", " select='name CA',\n", " clustering_method=clustering_method,\n", " estimate_error=True,\n", " ncores=4)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0. , 0.68948563, 0.69314718],\n", " [0.68948563, 0. , 0.6892104 ],\n", " [0.69314718, 0.6892104 , 0. ]])" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "avgs" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0.00000000e+00, 5.68506242e-03, 1.26584901e-16],\n", " [5.68506242e-03, 0.00000000e+00, 6.07091046e-03],\n", " [1.26584901e-16, 6.07091046e-03, 0.00000000e+00]])" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "stds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## References\n", "\n", "[1] Oliver Beckstein, Elizabeth J. Denning, Juan R. Perilla, and Thomas B. Woolf.\n", "Zipping and Unzipping of Adenylate Kinase: Atomistic Insights into the Ensemble of OpenClosed Transitions.\n", "Journal of Molecular Biology, 394(1):160–176, November 2009.\n", "00107.\n", "URL: https://linkinghub.elsevier.com/retrieve/pii/S0022283609011164, doi:10.1016/j.jmb.2009.09.009.\n", "\n", "[2] Richard J. Gowers, Max Linke, Jonathan Barnoud, Tyler J. E. Reddy, Manuel N. Melo, Sean L. Seyler, Jan Domański, David L. Dotson, Sébastien Buchoux, Ian M. Kenney, and Oliver Beckstein.\n", "MDAnalysis: A Python Package for the Rapid Analysis of Molecular Dynamics Simulations.\n", "Proceedings of the 15th Python in Science Conference, pages 98–105, 2016.\n", "00152.\n", "URL: https://conference.scipy.org/proceedings/scipy2016/oliver_beckstein.html, doi:10.25080/Majora-629e541a-00e.\n", "\n", "[3] Naveen Michaud-Agrawal, Elizabeth J. Denning, Thomas B. Woolf, and Oliver Beckstein.\n", "MDAnalysis: A toolkit for the analysis of molecular dynamics simulations.\n", "Journal of Computational Chemistry, 32(10):2319–2327, July 2011.\n", "00778.\n", "URL: http://doi.wiley.com/10.1002/jcc.21787, doi:10.1002/jcc.21787.\n", "\n", "[4] Matteo Tiberti, Elena Papaleo, Tone Bengtsen, Wouter Boomsma, and Kresten Lindorff-Larsen.\n", "ENCORE: Software for Quantitative Ensemble Comparison.\n", "PLOS Computational Biology, 11(10):e1004415, October 2015.\n", "00031.\n", "URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004415, doi:10.1371/journal.pcbi.1004415." ] } ], "metadata": { "kernelspec": { "display_name": "Python (mda-user-guide)", "language": "python", "name": "mda-user-guide" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }