Citrus work like Lego blocks. Roughly speaking, any two citrus can hybridize and potentially produce new citrus varieties. In fact, all the citrus that you see in the produce section of the market are hybrids. A grapefruit is actually a cross of a pummelo with a sweet orange. And a sweet orange is a cross of a pummelo with a sweet mandarin. And a sweet mandarin is a cross of a pummelo with a pure mandarin. A pure mandarin crossed with a pummelo can also produce a sour orange. And a sour orange crossed with a citron yields a lemon. You get the picture. Citrus are as promiscuous as it gets.

Credits: Wu *et al.* (2018)

This large variety of hybridization possibilities corresponds to a variety of citrus fruit shapes. Can we quantify such shape diversity? If we can mathematically describe the shape of both a pummelo and a sweet orange, would we be able to predict that their shape combination yields a grapefruit?

We are especially interested in being able to quantify and
characterize the distribution of the *oil glands* on
the citrus fruits. Citrus essential oils are important for the
food and perfume industries. Oil glands also play a fundamental
role in citrus fruit development. There are plenty of unknowns
going forward.

In collaboration with the Givaudan Citrus Variety Collection at the University of California, Riverside, we got access to 158 individual fruit samples comprising 64 citrus varieties. These included all the fundamental citrus (citrons, pure mandarins, pummelos), close relatives (trifoliates, kumquats, microcitrus), and important hybrids (sweet oranges, lemons, etc.). These were X-ray CT scanned at Michigan State University. After a lot of image processing fiddling, we managed to segment out the central column, flesh, rind, skin, and oil glands for each citrus fruit.

We focus on the oil glands. We can represent each oil gland
as a point in space where the *x,y,z* coordinates are the
center of mass of that gland. That is, each citrus fruit can now
be thought of as a point cloud in space (!) As a sanity check,
we verified that our count of individual oil glands agrees
with established literature.
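To make the "glands as points" idea concrete, here is a minimal sketch, assuming the segmented oil glands are available as a 3D boolean mask. The function name `gland_centers` and the toy volume are hypothetical; this is not the actual segmentation pipeline, just one way to go from a mask to a point cloud with SciPy.

```python
import numpy as np
from scipy import ndimage

def gland_centers(mask):
    """Return an (n, 3) array with the center of mass of each connected blob."""
    mask = np.asarray(mask, dtype=bool)
    # Label connected components (face connectivity is the SciPy default).
    labels, n_glands = ndimage.label(mask)
    centers = ndimage.center_of_mass(
        mask.astype(float), labels, np.arange(1, n_glands + 1)
    )
    return np.asarray(centers, dtype=float)

# Toy volume with two separate cubes standing in for glands (assumed data).
volume = np.zeros((20, 20, 20), dtype=bool)
volume[2:5, 2:5, 2:5] = True
volume[10:14, 10:14, 10:14] = True

centers = gland_centers(volume)  # one row of (x, y, z) voxel coordinates per gland
```

The number of rows in `centers` is also the gland count, which is the quantity one would compare against the literature.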

It seemed natural to model citrus as ellipsoids, that is, as affine transformations of a sphere. This was done by simply performing ordinary least squares regression to find the best algebraic parameters of the general ellipsoid formula. Next, the point cloud made of all oil gland centers was projected onto the best-fit ellipsoid. Finally, we reparameterized these centers in terms of geodetic coordinates: latitude and longitude. But latitude and longitude coordinates can be thought of as lying on a unit sphere as well. We thus have a size-independent common framework to compare the oil glands of all the citrus fruit varieties. We visualized the oil glands in 2D via two Lambert cylindrical equal-area projections from the north and south poles.

Now that all our oil gland data can be represented as points on a common unit sphere, we turn to directional statistics. Directional statistics allows us to characterize distributions specifically on circles, spheres, and related surfaces. We can also test whether a collection of points on a sphere follows a known distribution. To this end, we observed that there is no statistical evidence that supports the hypothesis of glands being uniformly distributed. Nor was there evidence in favor of rotational symmetry.
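As an illustration of what a uniformity test on the sphere looks like, here is a minimal sketch of the classical Rayleigh test described in Mardia and Jupp. It is an assumption on my part that this is a reasonable stand-in: it is not necessarily the test used in the article. Under uniformity, the statistic `p * n * ||mean||^2` is asymptotically chi-squared with `p` degrees of freedom (`p = 3` for the ordinary sphere). The synthetic `clustered` sample is made up for the example.

```python
import numpy as np
from scipy import stats

def rayleigh_test(points):
    """Rayleigh test of uniformity for unit vectors given as an (n, p) array."""
    x = np.asarray(points, dtype=float)
    n, p = x.shape
    mean_resultant_length = np.linalg.norm(x.mean(axis=0))
    statistic = p * n * mean_resultant_length**2
    # Asymptotic null distribution: chi-squared with p degrees of freedom.
    pvalue = stats.chi2.sf(statistic, df=p)
    return statistic, pvalue

# Synthetic points bunched around the north pole (assumed data).
rng = np.random.default_rng(1)
clustered = np.array([0.0, 0.0, 1.0]) + 0.1 * rng.normal(size=(500, 3))
clustered /= np.linalg.norm(clustered, axis=1, keepdims=True)

statistic, pvalue = rayleigh_test(clustered)  # tiny p-value: reject uniformity
```

One caveat: the Rayleigh test only looks at the mean direction, so it cannot detect departures from uniformity that are symmetric about the center, such as two equal antipodal clusters.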

We can compute an empirical distribution via kernel density estimation (KDE). As expected, there is a sphere-specific KDE that we can use. As in the linear case, our KDE depends on a bandwidth parameter that determines how smooth the empirical distribution is. We can play around with varying bandwidth parameters and observe which regions show the most dramatic distribution changes.
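A minimal sketch of a spherical KDE, assuming a von Mises-Fisher kernel on the ordinary sphere, where the concentration `kappa` plays the role of the bandwidth (larger `kappa` means less smoothing). The function name `vmf_kde` and the toy data are hypothetical, and the R packages mentioned later may use different kernels or bandwidth conventions.

```python
import numpy as np

def vmf_kde(data, query, kappa):
    """Evaluate a von Mises-Fisher KDE on the unit sphere S^2.

    data:  (n, 3) unit vectors (e.g. gland positions on the unit sphere)
    query: (m, 3) unit vectors where the density is evaluated
    kappa: concentration > 0; acts as an inverse bandwidth
    """
    data = np.asarray(data, dtype=float)
    query = np.asarray(query, dtype=float)
    cosines = query @ data.T  # (m, n) matrix of dot products
    # kappa / (4 pi sinh kappa) * exp(kappa * t), rewritten to avoid overflow.
    norm_const = kappa / (2.0 * np.pi * (1.0 - np.exp(-2.0 * kappa)))
    return norm_const * np.exp(kappa * (cosines - 1.0)).mean(axis=1)

# Three toy "glands" and two query locations (assumed data).
data = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
query = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]])
density = vmf_kde(data, query, kappa=20.0)  # high at the north pole, near zero at the south
```

Evaluating `vmf_kde` on a fixed grid for several values of `kappa` is one way to see which regions change the most as the bandwidth varies.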

Now that we are convinced that our pipeline enables us to quantify and compare citrus fruit shape, the potential future directions are exciting. To name a few:

- Locate, segment, and phenotype seed tissue.
- Explore more on normal diffusion mechanics and their possible relationship to oil gland distribution.
- Define a measure of similarity of oil gland distributions and compute a pairwise distance matrix for all citrus fruits.
- Compare such distances between distributions to phylogenetic distances.
- Explore alternative ellipsoid-to-sphere algorithms to minimize distortion.

Stay tuned for updates!

¡**Published article**: Amézquita *et al.* (2022)!

DOI: 10.1002/ppp3.10333

—

**As slides**:
Presented at CMSE Brown Bag Seminar. October 2022.

**As a static poster**:
Presented at OSUPSS. April 2022.

**As a dynamic poster**:
Presented at OSUPSS. April 2022.

—

*Given a point cloud, a set of 3D x,y,z coordinates, what are the parameters
of the best-fit ellipsoid*? I was surprised there was not a
straightforward, widely-accepted answer. I also realized how
rusty my pre-calculus math is by now. There are a number of papers
out there.

I chose Li and Griffiths (2004) as it was both mathematically and
computationally the most straightforward answer I could find. They
simply perform an ordinary least squares (OLS) fit to find the
algebraic parameters that minimize the algebraic residuals
for the *general quadric surface*.
In principle, this approach could produce parameters that approximate
the points with a paraboloid or hyperboloid instead, so you must know
beforehand that your point cloud indeed looks like an ellipsoid.
Supposedly they use Lagrange multipliers to guarantee that you'll
always get ellipsoid parameters, but I personally did not verify it.

Credits: Li and Griffiths (2004)
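The unconstrained version of this fit can be sketched in a few lines. This is a plain least-squares fit of the general quadric with the constant term normalized to 1; it is an assumption that this captures the spirit of the approach, and it does not include the ellipsoid-specific constraint from Li and Griffiths (2004). The function name `fit_quadric` and the synthetic cloud are hypothetical.

```python
import numpy as np

def fit_quadric(points):
    """Least-squares fit of the general quadric

        A x^2 + B y^2 + C z^2 + 2D xy + 2E xz + 2F yz + 2G x + 2H y + 2I z = 1

    Returns the nine parameters (A, B, C, D, E, F, G, H, I).
    Nothing here forces the result to be an ellipsoid, and the normalization
    breaks down if the surface passes through the coordinate origin.
    """
    x, y, z = np.asarray(points, dtype=float).T
    design = np.column_stack(
        [x * x, y * y, z * z, 2 * x * y, 2 * x * z, 2 * y * z, 2 * x, 2 * y, 2 * z]
    )
    params, *_ = np.linalg.lstsq(design, np.ones(len(x)), rcond=None)
    return params

# Synthetic cloud on an axis-aligned ellipsoid with semi-axes (3, 2, 1)
# centered at (1, -2, 0.5) (assumed data, no noise).
rng = np.random.default_rng(0)
directions = rng.normal(size=(500, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
cloud = directions * np.array([3.0, 2.0, 1.0]) + np.array([1.0, -2.0, 0.5])

params = fit_quadric(cloud)
```

Because only the algebraic residual is minimized, noisy or partial data can yield a hyperboloid or paraboloid, which is exactly the caveat mentioned above.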

This approach will give you algebraic parameters of the general quadric surface
equation. To translate these into more intuitive geometric parameters
(semiaxes lengths, origin, rotation, etc.), you can follow Section 2.4 of
Panou *et al.* (2020).
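The translation from algebraic to geometric parameters boils down to completing the square and an eigendecomposition. Below is a minimal sketch using standard linear algebra; it is not a transcription of Section 2.4 of Panou *et al.* (2020), and it assumes the quadric is written as A x² + B y² + C z² + 2D xy + 2E xz + 2F yz + 2G x + 2H y + 2I z = 1. The function name `quadric_to_geometry` is hypothetical.

```python
import numpy as np

def quadric_to_geometry(params):
    """Convert quadric parameters (A, B, C, D, E, F, G, H, I) into the
    center, semi-axis lengths, and axis directions of an ellipsoid."""
    A, B, C, D, E, F, G, H, I = params
    Q = np.array([[A, D, E], [D, B, F], [E, F, C]], dtype=float)
    b = np.array([G, H, I], dtype=float)
    # x^T Q x + 2 b^T x = 1  <=>  (x - c)^T Q (x - c) = 1 + c^T Q c
    center = -np.linalg.solve(Q, b)
    scale = 1.0 + center @ Q @ center
    eigenvalues, rotation = np.linalg.eigh(Q / scale)
    if np.any(eigenvalues <= 0):
        raise ValueError("the quadric is not an ellipsoid")
    semiaxes = 1.0 / np.sqrt(eigenvalues)  # longest axis first
    return center, semiaxes, rotation      # columns of rotation are the axis directions

# Parameters of the ellipsoid with semi-axes (3, 2, 1) centered at (1, -2, 0.5).
params = np.array([-4, -9, -36, 0, 0, 0, 4, -18, 18]) / 13.0
center, semiaxes, rotation = quadric_to_geometry(params)
# center is approximately [1, -2, 0.5]; semiaxes is approximately [3, 2, 1]
```

The eigenvalue check doubles as a guard against the paraboloid or hyperboloid case: if any eigenvalue is not positive, the fitted surface is not an ellipsoid.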

- I personally found Chojnacki *et al.* (2000) too convoluted. I admit stats and numerical analysis are not exactly my strength.
- I remember I had trouble fully following and implementing Yu *et al.* (2009).
- I tried my best to implement Reza and Sengupta (2017). I was unable to get sensible results.
- I was also a bit confused by Sivapalan *et al.* (2011).
- Panou *et al.* (2020) is a relatively easy read for someone with a basic math background. They propose a two-step approximation to obtain the ellipsoid. The first step is an algebraic parameter approximation like in Li and Griffiths (2004). The second is a geometric parameter approximation. However, I was unable to get significant results from this second step. Maybe it only works with geoids, with very large ellipsoids the size of a planet, not for ellipsoids the size of a lemon.

Credits: Diaz-Toca *et al.* (2020)

Once we have the ellipsoid parameters, we have to project our original point cloud onto that ellipsoid. This projection can be either:

- **Geocentric**: By drawing a ray from the ellipsoid center to the point and noting where the ray intersects the ellipsoid surface.
- **Geodetic**: By projecting the point perpendicularly to the ellipsoid surface, i.e., by minimizing the distance from the point to the surface.

The former projection can be computed immediately. The latter
requires a much more elaborate computation.
Diaz-Toca *et al.* (2020)
is a very well-written breakdown of the computations needed, and
they even provide a link to C code that works out of the box.
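The geocentric case really is immediate. Here is a minimal sketch, assuming the ellipsoid is given by a center `c` and a symmetric positive-definite matrix `M` such that points on the surface satisfy (x - c)ᵀ M (x - c) = 1. The function name and the example values are hypothetical; the geodetic projection is not attempted here.

```python
import numpy as np

def geocentric_projection(points, center, shape_matrix):
    """Project points onto the ellipsoid (x - c)^T M (x - c) = 1 along rays
    from the center. A point sitting exactly at the center has no defined
    projection and would trigger a division by zero."""
    center = np.asarray(center, dtype=float)
    offsets = np.asarray(points, dtype=float) - center
    # Quadratic form d^T M d for every offset d, one value per point.
    quad = np.einsum("ij,jk,ik->i", offsets, shape_matrix, offsets)
    return center + offsets / np.sqrt(quad)[:, None]

# Axis-aligned ellipsoid with semi-axes (3, 2, 1) centered at (1, -2, 0.5).
center = np.array([1.0, -2.0, 0.5])
shape_matrix = np.diag([1 / 9, 1 / 4, 1.0])
points = np.array([[7.0, -2.0, 0.5],   # outside, along the long axis
                   [1.0, -2.0, 0.8]])  # inside, along the short axis
projected = geocentric_projection(points, center, shape_matrix)
# projected is approximately [[4, -2, 0.5], [1, -2, 1.5]]
```

Rescaling each offset by 1 / sqrt(dᵀ M d) works for points both inside and outside the ellipsoid, since it only changes the length of the ray, not its direction.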

Regardless of the projection used, we can then reparameterize the original point cloud in (latitude, longitude, height) coordinates. We can then translate the (latitude, longitude) coordinates to a unit sphere. This unit sphere will be the common ground that allows us to compare all citrus at once.
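Moving between (latitude, longitude) pairs and points on the unit sphere is a short computation. A minimal sketch, assuming angles in radians, latitude measured from the equator, and longitude measured around the z axis (these conventions and the function names are my own choices for illustration):

```python
import numpy as np

def latlon_to_unit_sphere(lat, lon):
    """Map latitude/longitude (radians) to x, y, z on the unit sphere."""
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    return np.stack(
        [np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat)],
        axis=-1,
    )

def unit_sphere_to_latlon(xyz):
    """Inverse map: unit vectors to latitude in [-pi/2, pi/2] and longitude in (-pi, pi]."""
    xyz = np.asarray(xyz, dtype=float)
    lat = np.arcsin(np.clip(xyz[..., 2], -1.0, 1.0))
    lon = np.arctan2(xyz[..., 1], xyz[..., 0])
    return lat, lon

# A point on the equator at longitude 0 and the north pole.
xyz = latlon_to_unit_sphere([0.0, np.pi / 2], [0.0, 0.0])
```

Dropping the height coordinate is what makes the comparison size-independent: a kumquat and a pummelo end up on the same sphere.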

Directional statistics is a relatively new branch of statistics that deals with data whose domain is not a Euclidean space, as in regular statistics, but a circle, sphere, torus, or cylinder.

The 1999 seminal textbook *Directional Statistics*
by Mardia and Jupp is pretty good. It is seminal for a reason.
It basically compiles everything
that was known on the subject until that point. The textbook is quite
comprehensive, the index is very fleshed out, and most of the
chapters are self-contained. You can jump straight into
the relevant content for the application you have in mind.
It also comes with plenty of citations so you can dive deeper
into the relevant literature.

More than 20 years later, Mardia and Jupp are still relevant.
Their textbook is still one of the best ways to get familiar with
the foundational ideas.
Naturally the discipline has grown, and there
have been plenty of advances since 1999.
Ley and Verdebout's
*Modern Directional Statistics* (2017) aims to be an update.
They still provide some basic definitions and concepts, albeit
very succinctly, and refer the reader back to Mardia and Jupp
plenty of times.
Ley and Verdebout are still a pretty good reference to know
where to start the Google Scholar search on how to do a particular
task in directional statistics.
Applications are fleshed out in their companion volume
*Applied
Directional Statistics*.

A brief review, update, and historical overview of the discipline is provided by Pewsey and García Portugués (2021). Their last section covers some of the available software to do actual computations. Fortunately, most of the software comes as R packages, works out of the box, and is easy to use.

I love the Gastropod podcast. The food that we consume is more than just food. It is also a reflection of human culture and history. Food shaped our society. Citrus are no exception. As Cynthia Graber and Nicola Twilley say, «not only were these [citrus] fruits so precious that they inspired both museums and the Mafia, they are also under attack by an incurable immune disease that is decimating citrus harvests around the world.» Listen to their whole citrus episode!

Also, I highly recommend following the @WeirdExplorer YouTube channel for a foodie's insight into the wide, global variety of fruit. He has a number of citrus-specific episodes. In particular, one of his first episodes roughly explains how most citrus are hybrids.

And this one is pretty good as well. It also contains an important
message that fruit naming matters. There is a reason why the
*Makrut lime* should always be called *Makrut lime*
and not something else.

Citrus are quite fascinating. Their history is intrinsically linked to ours. Here are a few fun facts:

- Citrus paved the way to the first modern medical trials in Western medicine. Scurvy was the biggest scourge of the seas. It is estimated that more sailors died due to scurvy than from all other diseases and sea battles combined. One day, the British tested the anecdotes of sailors being scurvy-resistant due to regular consumption of citrus. Several ships were sent on long voyages. One was provided with citrus juice. Others were provided with no fruits but fresh water, or various elixirs. This was the setup of the first modern medical trial.
- Very specific citrus are key to various South and Southeast Asian cuisines. There are anecdotes of Thai immigrants trespassing into the Citrus Variety Collection of UC Riverside to obtain key ingredients back in the 60s. Otherwise, people would smuggle citrus from Asia to California to complete important dishes. Smuggling naturally comes with a high risk of importing diseases, which can be particularly catastrophic for commercial citrus fields, as most commercially grown citrus fruits are essentially clones. It is important to develop citrus varieties that can grow in other climates, both to supply people with fruits and to eliminate the need for dangerous smuggling.
- Qu Yuan, probably the most important classic Chinese poet, depicted an orange tree as a symbol of steadfastness and resilience in the beautiful poem *Ju song*. Although we do not know for certain if Qu Yuan is the actual author of the poem, it is beautiful nonetheless. For context, Qu Yuan was a renowned advisor of a king circa 300 BC. Inner conflicts with other advisors and court members led to rumors and backstabbing, which forced Qu Yuan into exile from the kingdom. According to the legend, Qu Yuan was absolutely distraught; he was profoundly hurt that his king believed others' words and not his. Qu Yuan was roaming when he observed an orange tree. He was captivated, given that orange trees were not supposed to grow, let alone flourish, in that region, where the climate was colder. If an orange tree was able to grow against harsh climates, he as a poet should be able to withstand terrible setbacks.
- Citrons, especially etrogs, are key in some Jewish festivities. The relationship between etrogs and the Jewish community goes back millennia. In fact, modern evidence suggests that as the Jewish community moved westward towards modern Europe, they brought citrus cultivation and citrus breeding with them. It was thanks to their migration that Romans tasted citrus for the first time, and they made quite an impact, especially across Italy and Spain.