
.title[
# Characterizing spatial patterns and distributions
]
.subtitle[
## with Topological Data Analysis (TDA)
]
.author[
### <strong>Erik Amézquita</strong>, Sutton Tennant, Sandra Thibivillers, Sai Subhash<br>Benjamin Smith, Samik Bhattacharya, Jasper Kläver, Marc Libault<br>—<br>Division of Plant Science & Technology<br>Department of Mathematics<br>University of Missouri<br>—
]
.date[
### 2024-07-19
]

---

# Patterns, patterns everywhere!

<div class="row" style="font-size: 22px; font-family: 'Yanone Kaffeesatz'; margin: 0 auto;">
  <div class="column" style="max-width:33%;">
    <img src="https://www.landsat.com/samples/county2018/andrew-mo-2018.jpg">
    <img src="https://www.atlas.moherp.org/maps/county/histveg/Andrew.png">
    <p style="text-align: center;">Quantify/Describe</p>
  </div>
  <div class="column" style="max-width:33%;">
    <img src="https://www.landsat.com/samples/county2018/boone-mo-2018.jpg">
    <img src="https://www.atlas.moherp.org/maps/county/histveg/Boone.png">
    <p style="text-align: center;">Compare/Contrast</p>
  </div>
  <div class="column" style="max-width:33%;">
    <img src="https://www.landsat.com/samples/county2018/new-madrid-mo-2018.jpg">
    <img src="https://www.atlas.moherp.org/maps/county/histveg/New_Madrid.png">
    <p style="text-align: center;">Model/Predict</p>
  </div>
</div>
<p style="font-size: 10px; text-align: right; color: Grey;">Credits: <a href="https://www.landsat.com/aerial-photography/missouri/">Landsat.com</a> and <a href="https://www.atlas.moherp.org/missouri/">MO Herpetological Atlas Project</a></p>
---

# Issue at hand: transcript distribution

![](../figs/D2_distribution_example.jpg)

- Different cells of different shapes and sizes
- Beyond density: How to quantify and compare patterns?
- Patterns across the whole cross section? Patterns within cells?

---

# Plan of attack

1. Make heatmaps of transcript distributions via Kernel Density Estimates (KDEs)

1. Quantify these heatmaps via Topological Data Analysis (TDA)

1. Featurize these topological signatures as Persistence Images

1. ????? [traditional data analysis]

1. Profit

![](../figs/D2_GLYMA_05G092200_1749_1748_3D_kde_correction.gif)

---

# [1] Kernel Density Estimators (KDEs)

## The continuous version of a histogram

### Think of heatmaps

[2] Quantify these heatmaps via Topological Data Analysis (TDA)

[3] Featurize these topological signatures as Persistence Images

[4] Traditional data analysis and results

---

# Say we want to characterize the distribution of these points in 1D

![](../../tda/figs/kernel_density_estimator_00.svg)

- We only know the samples (blue points)

---

# A histogram gives us a sense of distribution

![](../../tda/figs/kernel_density_estimator_03.svg)

- The total gray area equals 1
- 100% of the points are represented in the histogram
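
A minimal sketch of the normalization behind these two bullets, assuming equal-width bins. The helper `density_histogram` and the sample values are hypothetical and are not taken from the figure.

```python
def density_histogram(samples, n_bins):
    """Equal-width histogram whose bar heights are scaled so the total area is 1."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / n_bins  # assumes hi > lo
    counts = [0] * n_bins
    for x in samples:
        # The maximum sample would land one past the end, so clamp it into the last bin.
        k = min(int((x - lo) / width), n_bins - 1)
        counts[k] += 1
    # Dividing by (n * width) turns raw counts into density heights.
    heights = [c / (len(samples) * width) for c in counts]
    return counts, heights, width


samples = [1.0, 1.5, 2.0, 5.0, 5.5, 6.0]  # made-up sample points
counts, heights, width = density_histogram(samples, n_bins=5)

total_points = sum(counts)                 # every sample falls in exactly one bin
area = sum(h * width for h in heights)     # total bar area
```

Every sample is counted once, and the bar areas add up to 1 regardless of how many bins are chosen.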

---

# Approximate the underlying distribution with a KDE

![](../../tda/figs/kernel_density_estimator_04.svg)

- A continuous approximation is more convenient mathematically for performing meaningful statistics
- Kernel Density Estimate: KDE
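
A dependency-free sketch of a 1D KDE, assuming a Gaussian kernel (the slides do not say which kernel is used). The name `gaussian_kde_1d`, the bandwidth value, and the sample points are illustrative assumptions.

```python
import math


def gaussian_kde_1d(samples, bandwidth):
    """Return f(x) = 1/(n*h) * sum_i K((x - x_i) / h), with K the standard normal pdf."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in samples
        )

    return density


samples = [1.0, 1.5, 2.0, 5.0, 5.5, 6.0]  # made-up sample points
kde = gaussian_kde_1d(samples, bandwidth=0.5)

# Riemann sum on a fine grid: like the histogram, the KDE encloses an area of about 1.
step = 0.01
grid = [-4.0 + i * step for i in range(1501)]
area = sum(kde(x) for x in grid) * step
```

Each sample contributes one smooth bump, so the estimate is high near clusters of samples (`kde(1.5)`) and low in the gap between them (`kde(3.5)`).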

---

# The width/number of bins does influence the shape of the histogram

![](../../tda/figs/kernel_density_estimator_05.svg)

- Similarly, we can control the bandwidth parameter of the KDE to influence its shape
- Plenty of heuristics to define the "right" bandwidth
- But ultimately, "right" depends on the application in mind
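
As one example of such a heuristic, here is a sketch of Silverman's rule of thumb for a 1D Gaussian-kernel KDE. It is shown only as an illustration; the slides do not state which bandwidth rule, if any, is used for the transcript data, and the sample values are made up.

```python
import statistics


def silverman_bandwidth(samples):
    """Silverman's rule of thumb: h = 0.9 * min(std, IQR / 1.34) * n^(-1/5)."""
    n = len(samples)
    std = statistics.stdev(samples)
    q1, _, q3 = statistics.quantiles(samples, n=4)  # quartile cut points
    spread = min(std, (q3 - q1) / 1.34)
    return 0.9 * spread * n ** (-0.2)


samples = [1.0, 1.5, 2.0, 5.0, 5.5, 6.0]  # made-up sample points
h = silverman_bandwidth(samples)
```

The rule is derived for roughly unimodal, Gaussian-like data. For clustered samples like these it tends to oversmooth, which is one reason the "right" bandwidth depends on the application.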

---

## KDEs: 3D; one per cell; reflect borders and nuclei

---

[1] Estimate heatmaps via KDEs

# [2] Quantify the shape of these heatmaps

## Topological Data Analysis (TDA) and persistent homology

[3] Featurize these topological signatures as Persistence Images

[4] Traditional data analysis and results

---

# Sublevel filter &rarr; Persistence Diagrams

![](../../tda/figs/sublevel_filtration_00.svg)
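
A simplified sketch of how a sublevel filtration yields a persistence diagram, restricted to a 1D function and to connected components (dimension 0). The 3D heatmaps described earlier would need a cubical-complex library; the function name and input values below are hypothetical.

```python
import math


def sublevel_persistence_0d(values):
    """0-dimensional persistence pairs (birth, death) of a function sampled on a line."""
    n = len(values)
    parent = [None] * n   # None means "not yet in the sublevel set"
    birth = [0.0] * n     # birth value of the component rooted at each index

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    # Raise the threshold: points enter the sublevel set in order of increasing value.
    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the younger component dies when two components merge.
                old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
                if birth[young] < values[i]:  # skip zero-persistence pairs
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    if n:
        # The component born at the global minimum never dies.
        pairs.append((min(values), math.inf))
    return sorted(pairs)


values = [2.0, 0.0, 3.0, 1.0, 4.0]  # made-up function values
diagram = sublevel_persistence_0d(values)
```

Here the two local minima (0.0 and 1.0) each start a component. The one born at 1.0 merges into the older one at the saddle value 3.0, giving the pair (1.0, 3.0), while the component born at 0.0 persists forever.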