EntireAxon: Deep learning deciphers four distinct patterns of axonal degeneration

Different axonal degeneration (AxD) patterns have been described depending on the biological condition. It has remained unclear, however, whether they are restricted to one specific condition or can occur concomitantly. Here, we present a novel microfluidic device in combination with a deep learning tool, the EntireAxon, for the high-throughput analysis of AxD. We evaluated the progression of AxD in an in vitro model of hemorrhagic stroke and observed that axonal swellings preceded axon fragmentation. We further identified four distinct morphological patterns of AxD (granular, retraction, swelling, and transport degeneration) that occur concomitantly. These findings indicate a morphological heterogeneity of AxD under pathophysiologic conditions. The newly developed microfluidic device together with the EntireAxon deep learning tool not only enables the systematic analysis of AxD but also unravels a previously unknown complexity with which AxD can occur in a disease context.


Introduction
anterogradely or in a Wallerian degeneration pattern, ultimately resulting in the generation of axonal fragments 7-9. Another morphological feature that occurs during AxD in development and disease is axon swelling. The swollen axon contains disorganized cytoskeleton and organelles resulting from an interruption of axonal transport, reactive oxygen species exposure, or nutrient deprivation 8,10,11. Both axonal swellings and axonal fragments represent hallmarks of AxD independent of the biological context 2,12,13. It is therefore relevant to detect the different degeneration patterns and to ascertain their underlying morphological changes in order to better understand their role in AxD during development and pathophysiology.

To assess the spatiotemporal progression of the different AxD patterns, morphological changes of axons must be recorded continuously. However, available software fails to automatically detect and quantify high axon numbers and the morphological features of AxD (swellings and fragments). The reason may be two-fold: 1) Available software relies on image binarization 14,15, which can lead to information loss and low sensitivity, as thin axons may not be recognized. 2) The analysis requires subjective and time-consuming manual annotations, e.g., thresholding and defining the region of interest 16-18. To date, immunostained images have been used to investigate morphological changes in AxD, as the analysis of phase-contrast images has been limited by the lower target-to-background signal. Immunofluorescence images, however, entail certain disadvantages such as photobleaching and the requirement for cell fixation, which allows observation of only a single time point.
Hence, a software tool for the automated detection and quantification of the morphological patterns of AxD in long-term live-cell imaging is required to improve sensitivity and throughput, which will overcome current limitations in understanding AxD.

In this study, we hypothesized that axons can undergo different AxD patterns, which rely on different underlying morphological features and depend on the severity of AxD. To systematically study AxD in a high-throughput manner, we fabricated a microfluidic device in the format of a conventional cell culture plate containing 16 independent microfluidic units that separate axons from their somata. We also developed a novel deep learning tool called the EntireAxon.

A novel microfluidic device for the high-throughput analysis of axonal degeneration
Experimental models such as microfluidic devices have been developed to spatially separate axons from their somata 19,20. The axons can thus be treated in an isolated manner to investigate AxD. The major limiting factor of commercially available microfluidic devices is that they are single, individual systems and hence can only be used to assess one condition, which is time-consuming and precludes high-throughput analyses. To overcome this limitation, we manufactured a microfluidic device containing 16 individual microfluidic units by using the soft lithography technique replica molding (Fig. 1, suppl. Fig. 1).

To enable the systematic high-throughput analysis of AxD over time in phase-contrast microscopy, we trained the EntireAxon CNN to segment all relevant features of AxD, i.e., axons, axonal swellings, and axonal fragments (Fig. 2). Specifically, we adapted a standard u-net with a ResNet-50 encoder 23,24 and used a CNN ensemble, which combines the predictions of multiple CNNs into a final output and is superior to individual CNNs 25-27. To validate the training success, a separate labeled validation set unknown to the EntireAxon was created (the ground truth was labeled by human expert 1). While the EntireAxon CNN recognized the class 'background' better than the three axon classes

ii) Retraction degeneration: AxD in which the distal part of the axon retracts, ultimately resulting in granular degeneration.
iii) Swelling degeneration: AxD with enlargement of axonal swellings and subsequent granular degeneration.
iv) Transport degeneration: AxD in which axonal swellings of constant size, which do not enlarge, are transported along the axon, resulting in granular degeneration.

We hypothesized that the different morphological AxD patterns can occur concomitantly.
We therefore trained the EntireAxon RNN to identify changes in class segregation over time using a training dataset of AxD segmentation recordings (Fig. 6a). The clusters shifted toward the class 'background', although with different degrees and temporal patterns, and some of them with concomitant increases in the classes 'axonal swelling' and/or 'axonal fragment'. In cluster 0, there was an early decrease in the class 'axon', which then continued more linearly, as well as a later rise in the class 'axonal fragment'. In contrast to cluster 0, cluster 1 showed no increase in the class 'axonal fragment' and a linear decrease in the class 'axon' from the start. In cluster 2, there was a strong increase in the class 'axonal swelling'. Cluster 3 demonstrated an early and lasting high level of the class 'axonal swelling' with a later increase in the class 'axonal fragment'. Cluster 4 showed a rapid decrease in the class 'axon' concomitant with an increase in the classes 'background' and 'axonal swelling'. Cluster 5 was similar to cluster 1, but with an early drop in the class 'axon'. Cluster 6 showed an increase in the class 'axonal swelling' similar to, but to a greater extent than, cluster 2.

Each of the four morphological patterns was characterized by a combination of clusters (Fig. 6b): i) Granular degeneration was defined by clusters that describe the degeneration of axons into axonal fragments, i.e., clusters 0, 1, 3, and 5. ii) Retraction degeneration only included clusters 1 and 5, indicating the retraction of the axon followed by its fragmentation. iii) Swelling degeneration was characterized by the three clusters that included the class 'axonal swelling', i.e., clusters 2, 3, and 6, as well as cluster 5, which shows the exchange of the class 'axon' for 'background'. iv) Transport degeneration was the only pattern that relied on cluster 4 and was also characterized in part by clusters 0, 1, 2, and 6.
Although some clusters overlap among morphological patterns, the unique combination of the different clusters allows all four morphological patterns to be distinguished. To validate the EntireAxon RNN, a 10-fold cross-validation was performed, which demonstrated its ability to distinguish between the four morphological patterns of AxD (Fig. 6).

We then applied the EntireAxon RNN to quantify the occurrence of the different morphological patterns of AxD (Fig. 7). Hemin-treated axons underwent granular, retraction, swelling, and transport degeneration concomitantly, as these patterns occurred either close to or at the exact same locations (Fig. 7a). Hemin increased all morphological AxD patterns in a concentration-dependent manner, with granular degeneration being significantly increased for the 100 µM (p=0.004) and 200 µM hemin samples (p=0.001), while swelling degeneration was significantly increased for 50 µM (p=0.006) and 100 µM (p=0.004) hemin-treated compared to vehicle-treated axons. Neither retraction nor transport degeneration differed among the investigated concentrations (Fig. 7b). Collectively, our data suggest that hemin gradually shifts the morphological pattern of AxD from swelling degeneration to granular degeneration, which is in accordance with our finding that axonal swellings preceded axonal fragmentation.


Discussion

The extent of AxD has so far been investigated mainly with a focus on axon fragmentation. To quantify axon fragmentation, Sasaki and colleagues introduced the AxD index as the ratio of fragmented axon area to total axonal area 14. However, the AxD index does not include axonal swellings, another characteristic feature of degenerating axons 11,29. Hence, the precise spatiotemporal role of axonal swellings in AxD remains elusive.
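As a minimal illustration, the AxD index of Sasaki and colleagues (ratio of fragmented axon area to total axonal area) can be computed from a per-pixel segmentation mask; the class codes and function name below are our own assumptions, not part of the original implementation:

```python
import numpy as np

# Hypothetical class codes, mirroring the four EntireAxon segmentation classes
BACKGROUND, AXON, SWELLING, FRAGMENT = 0, 1, 2, 3

def axd_index(seg: np.ndarray) -> float:
    """AxD index: fragmented axon area divided by total axonal area.

    `seg` is a 2D array of per-pixel class labels. Total axonal area is
    taken here as all non-background pixels (axon, swelling, fragment),
    which is one plausible reading of 'total axonal area'.
    """
    fragment_area = np.count_nonzero(seg == FRAGMENT)
    axonal_area = np.count_nonzero(seg != BACKGROUND)
    return fragment_area / axonal_area if axonal_area else 0.0
```

An index near 0 indicates intact axons; an index near 1 indicates near-complete fragmentation.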
The herein presented EntireAxon CNN performs automatic segmentation and quantification of axons and the morphological features relevant to AxD, including axonal swellings and fragments, on phase-contrast time-lapse microscopy images (Fig. 2). The EntireAxon CNN recognized the four classes 'background', 'axon', 'axonal swelling', and 'axonal fragment', with the highest mean F1 score for the class 'background' (Fig. 3a), as background pixels covered most of the image. Although the performance of the EntireAxon CNN on the axon-related classes was comparably lower due to the imbalanced prevalence of each class in the images, the comparison with human experts revealed that the EntireAxon CNN reached a similar performance level. As expected, its performance was slightly better than that of the human experts on the ground truth, as both the ground truth and the training data were labeled by the same human expert (Fig. 3b). Interestingly, when comparing the EntireAxon CNN with a human expert on the consensus label of the other two human experts, not only was the EntireAxon CNN as good as or even better than the human expert, but the mean F1 scores were also higher than on the ground truth labels (Fig. 3d). This may be because pixels that were differentially assigned by the human experts, i.e., more difficult to classify, were excluded from the comparison. Taken together, these findings demonstrate that the EntireAxon CNN is suitable for the automatic, high-throughput quantification of AxD and its accompanying morphological changes.

Using the newly developed EntireAxon CNN, we were able to investigate AxD in an in vitro model of hemorrhagic stroke. To our knowledge, this is the first in vitro study to characterize AxD after hemorrhagic stroke.
AxD is an active and commonly observed process in intracerebral hemorrhage 30,31, with larger hematomas being associated with a more severe progression of white matter injury resulting in poor motor and functional recovery 30,32. As the underlying mechanisms of AxD in hemorrhagic stroke remain to be elucidated and represent a research priority in the field 33, we investigated the morphological changes that can occur in axons in response to hemorrhage. We therefore modeled hemorrhagic stroke by exposing outgrown axons from primary cortical neurons to the hemolysis product hemin. In our model, AxD started within 12 to 18 hours after the administration of hemin (Fig. 4 and suppl. Fig.).

Interestingly, axonal swellings and axonal fragments were related to different morphological patterns of AxD in our model. Specifically, we observed axons that showed signs of axonal retraction, enlargement of axonal swellings, and axonal transport before degeneration (Fig. 5). These findings led to the hypothesis that different morphological patterns of AxD can occur concomitantly and should therefore be detectable based on our segmentation data. We therefore trained the EntireAxon RNN to quantify the occurrence of granular, retraction, swelling, and transport degeneration based on the clusters of unique changes of classes over time (Fig. 6 and suppl. Fig. 4): i) Granular degeneration has previously been observed in retrograde, anterograde, Wallerian, and local AxD 5,7-9. ii) Retraction degeneration has been described in axonal retraction and shedding 4,6. iii) Swelling degeneration was previously reported in experimental autoimmune encephalitis and growth factor deprivation 10,11. iv) In contrast, transport degeneration has not been reported before. However, microtubule breaks have been demonstrated in an in vitro model of axonal stretch injury.
Those developed into axonal swellings, resulting in the interruption of axonal transport with AxD as a consequence 37. The underlying molecular mechanisms of the different patterns of AxD need to be investigated further, which could be greatly facilitated by the EntireAxon RNN, which automatically detects the different morphological patterns in time-lapse recordings owing to its capacity to relate each output to previous images in the stack through its recurrent units.

All four patterns occurred concomitantly in hemin-induced AxD (Fig. 7), which may point to a potential synergy of the different patterns of AxD. Interestingly, we also observed a concentration-dependent effect, whereby swelling degeneration was significantly increased at lower hemin concentrations while granular degeneration occurred more frequently at higher hemin concentrations. This is in line with our data suggesting that axonal swellings precede axon fragmentation. To what extent our model of hemin-induced AxD in hemorrhagic stroke is molecularly similar to developmental or pathophysiological AxD needs to be investigated further.
To compare different experimental conditions of AxD, we herein propose a novel monolithic microfluidic device consisting of 16 individual microfluidic units that enable the parallel and separate treatment and/or manipulation of axons and somata (Fig. 1). Currently available microfluidic devices do not allow high-throughput experiments, as they comprise only single microfluidic units 20,38. Although some devices can harbor multiple experimental conditions, they employ a radial design with a single shared soma compartment, in which the influence of one experimental condition on another cannot be excluded due to the potential of retrograde signaling 39,40. Another option is the parallel use of multiple individual devices, which allows handling of up to 12 devices in a conventional 12-well plate 18. Compared to our microfluidic device, this procedure is time-consuming in both the manufacturing process and the adjustment for recordings.

iii) The observed effects of AxD in hemorrhagic stroke within this study were based on hemin toxicity, and we cannot exclude that other hemolysis products such as thrombin or bilirubin have different effects. Additional studies should investigate differences between hemolysis products to increase our understanding of the mechanisms of AxD in hemorrhagic stroke. iv) The overall CNN performance may be further improved with more general inputs. For example, the segmentation of fragment pixels cannot be done accurately from a single image at a specific time point, as the whole process of AxD, ultimately resulting in the disintegration of the axons (i.e., the generation of axonal fragments), needs to be considered. CNNs using 3D convolutions could, in principle, perform a segmentation over an entire time-lapse recording and model temporal dependencies.
However, we decided against the 3D approach, as it severely restricts general applicability due to the greatly increased effort of labeling suitable time series for training. In this context, the identification of the images that will yield the best results is crucial to effectively reduce labeling costs, which we have previously described using an active learning approach.

Thirty-two wells were milled into a polymethyl methacrylate (PMMA) plate of the size of a conventional cell culture plate (Fig. 1, suppl. Fig. 1). A poly-D-lysine (PDL) solution (in 0.38 % (w/v) sodium tetraborate in distilled water, pH 8.5) was used for coating at 4 °C overnight. We aspirated the PDL the next morning without removing it from the compartments and added 50 µg/mL laminin as a second coating for incubation at 4 °C overnight. On the day of neuron isolation, the microfluidic devices were washed twice with pre-warmed medium after aspirating the laminin. Immediately prior to cell seeding, we aspirated the medium from the wells without removing it from the compartments.

This reflects a challenging degree of class imbalance, where the probability of having any positives for a class in a validation image is low. Thus, we did not use the recall and precision computed on individual images, or the mean recall and precision, to compute the mean F1 score, i.e., the harmonic mean of recall and precision. This has been shown to lead to bias, especially when a high degree of class imbalance is present in the dataset 43, as it may result in undefined values for an image for recall (due to the absence of TP), precision (in case the CNN does not recognize the few positives), and F1 score (in case either recall or precision is undefined).
To avoid bias, we computed the total TP, FP, and FN over all validation images, from which we calculated the mean F1 score 43. In addition, we computed consensus labels between human experts 1 and 2, 1 and 3, and 2 and 3, and compared the EntireAxon CNN and the remaining expert (human expert 3, 2, and 1, respectively) against these consensus labels. Mean F1 scores for all classes were computed as described above.
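The pooled computation can be sketched as follows: TP, FP, and FN are summed over all validation images for a class before recall, precision, and F1 are derived, which is the standard micro-averaging scheme the text describes (function name is ours):

```python
def mean_f1_micro(tp: int, fp: int, fn: int) -> float:
    """F1 score from TP/FP/FN totals pooled over all validation images.

    Pooling first avoids the undefined per-image recall or precision
    that arises when a class has no positives in an image.
    """
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    if recall + precision == 0.0:
        return 0.0
    return 2 * recall * precision / (recall + precision)
```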

Image preprocessing
Prior to the analysis of AxD after hemin exposure, we preprocessed the time-lapse recordings in ImageJ (v1.52a, RRID:SCR_003070) using a custom-written macro. Specifically, each individual recording was converted from a 16-bit into an 8-bit recording to make it compatible with the ImageNet (8-bit) pre-trained ResNet-50. The recording was aligned automatically with the ImageJ plug-in "Linear Stack Alignment with SIFT" as described previously.
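Outside ImageJ, the same 16-to-8-bit conversion can be sketched in Python. This is a simple linear min-max rescaling; note that ImageJ's converter maps the current display range rather than the full data range, so this is only an approximation of the macro's behavior:

```python
import numpy as np

def to_uint8(img16: np.ndarray) -> np.ndarray:
    """Linearly rescale a 16-bit image to the 8-bit range [0, 255]."""
    img = img16.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # Flat image: return all zeros to avoid division by zero
        return np.zeros(img.shape, dtype=np.uint8)
    return ((img - lo) / (hi - lo) * 255).round().astype(np.uint8)
```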

Classification of the morphological patterns of AxD using an attention-based RNN
We used the segmentation videos to identify four morphological patterns of AxD: i) granular, ii) retraction, iii) swelling, and iv) transport degeneration. To reduce the dimensions of the input, each frame was converted into a histogram. To compute a histogram for a frame t_i, we compared the pixels of frames t_i and t_(i+1). Each pixel was assigned to one of 16 classes consisting of the pairwise tuples (c1, c2) ∈ {0, 1, 2, 3}² of the four segmentation classes. For example, the class (background, axon) means that the pixel was classified as background in frame t_i and as axon in frame t_(i+1). For T time steps, we therefore computed T−1 histograms. Additionally, we normalized each histogram to sum to 1:

h(t, (c1, c2)) := h0(t, (c1, c2)) / Σ_(i,j) h0(t, (i, j)),

where h0 denotes the raw pixel counts.
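The transition-histogram construction for one frame pair can be sketched as follows (a minimal version; variable names are ours, and the tuple (c1, c2) is encoded as the single bin index c1*4 + c2):

```python
import numpy as np

N_CLASSES = 4  # background, axon, axonal swelling, axonal fragment

def transition_histogram(seg_t: np.ndarray, seg_t1: np.ndarray) -> np.ndarray:
    """Normalized histogram over the 16 (c1, c2) class-transition tuples
    between two consecutive segmentation frames of equal shape."""
    # Encode each pixel's (class at t, class at t+1) pair as one integer
    pairs = seg_t.ravel() * N_CLASSES + seg_t1.ravel()
    hist = np.bincount(pairs, minlength=N_CLASSES ** 2).astype(np.float64)
    return hist / hist.sum()  # normalize to sum to 1
```

For a recording of T frames this yields T−1 such 16-bin histograms, computed over 32x32-pixel windows at inference time as stated below.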

Note that the histograms were computed over small patches (height and width < 90 pixels) during training, and during inference over windows of size 32x32 pixels.

We used an encoder-decoder RNN with attention 45. The encoder consisted of a gated recurrent unit (GRU) that received the histogram time sequence as input and computed a hidden representation of the histograms:

e_t = GRU_enc(h_t), with e_t ∈ ℝ^d and h_t ∈ ℝ^16.

For our purpose, we used an architecture that was able to base the decision for a degeneration class on the previous class predictions. To this end, the output o was computed iteratively in C+1 steps as the sum of the previous output and the output of the decoder, where C is the number of degeneration classes (4), d is the hidden dimension (we used 256), i = 1, ..., C+1, and σ is the sigmoid function. The decoder employed a GRU that depended on the context vector c_i and the hidden state vector s_(i−1). The context vector is a weighted sum of the encoder representations; at each iteration, these weights can change, enabling the network to focus on different time steps. We assumed that a specific degeneration pattern happened only in a limited number of time frames that was smaller than the whole input video. The weights depended on the current state of the decoder and the current output, where [a, b] denotes the concatenation of two vectors, and the final output is normalized by the sigmoid function. Apart from the weights used by the GRUs, the attention parameters are learnable weights.

The EntireAxon RNN was trained on 162 images for 60 epochs using the LAMB optimizer 46 with a batch size of 128. We used a learning rate of 0.01 that was reduced by a factor of ten every 15 epochs and an additional weight decay of 0.0001. The two GRUs (encoder and decoder) contained three layers, and we used dropout with a probability of 0.9.
To increase the robustness of the RNN against varying axon thickness, we also added eroded versions of the segmentation data, using a cross-shaped kernel of sizes three, five, and seven. Accordingly, each image existed six times in the dataset: three eroded versions and three unchanged copies, to keep a 50 % chance of having the original image during training.
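The erosion augmentation can be sketched with SciPy. We assume grayscale erosion of the label image with cross-shaped footprints, which thins labeled structures toward the background class; the authors' exact implementation may differ:

```python
import numpy as np
from scipy.ndimage import grey_erosion

def cross_kernel(size: int) -> np.ndarray:
    """Cross-shaped (plus-sign) footprint of a given odd size."""
    k = np.zeros((size, size), dtype=bool)
    mid = size // 2
    k[mid, :] = True  # horizontal bar
    k[:, mid] = True  # vertical bar
    return k

def eroded_variants(label_img: np.ndarray) -> list:
    """Eroded copies of a label image with cross kernels of size 3, 5, 7,
    simulating thinner axons for augmentation."""
    return [grey_erosion(label_img, footprint=cross_kernel(s)) for s in (3, 5, 7)]
```

Adding the three eroded copies alongside three unchanged copies reproduces the 50 % original-to-eroded ratio described above.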

Ten-fold cross-validation of the RNN
To validate the RNN, we used ten-fold cross-validation 47. For the given 162 training images, we determined ten separate test sets (nine sets containing 17 images and one set including nine images) and, each time, trained with the remaining 145 (153) images as described above. Mean recall, precision, and F1 score over the ten sets were determined as described above.
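The partition can be sketched as follows. Because the paper's split (nine sets of 17 and one of 9, for 162 images) differs from the near-equal folds of off-the-shelf helpers such as scikit-learn's KFold, this sketch reproduces the described scheme directly; the seed and function name are our assumptions:

```python
import random

def tenfold_splits(n_images: int = 162, seed: int = 0):
    """Ten (train, test) index splits: nine test sets of 17 images and one
    of n_images - 153 (i.e., 9 for the 162 images used here)."""
    idx = list(range(n_images))
    random.Random(seed).shuffle(idx)  # fixed seed for reproducibility
    sizes = [17] * 9 + [n_images - 17 * 9]
    splits, start = [], 0
    for size in sizes:
        test = idx[start:start + size]
        train = idx[:start] + idx[start + size:]
        splits.append((train, test))
        start += size
    return splits
```

Each image appears in exactly one test set, so training uses the remaining 145 or 153 images per fold, as described.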

Analysis of morphological patterns of AxD using the EntireAxon RNN
All segmentations of AxD after hemin exposure were automatically analyzed with the trained EntireAxon RNN, which predicted the occurrence of the four morphological patterns of AxD in a pixel-wise manner. Note that a pixel can be predicted to belong to zero, one, or multiple morphological patterns. Only pixels previously identified as degenerated over time were considered, by applying a 'fragmentation mask' that included all non-background pixels that changed to either background or fragment during the recording time. For each experimental condition (i.e., hemin concentration), the percentage of occurrence of each morphological pattern was calculated as the sum of all pixels per morphological pattern on all images of that experimental day divided by the size of the 'fragmentation mask'.
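The fragmentation mask and per-pattern percentages can be sketched as follows. The array layout and names are our assumptions: `seg` is a T x H x W stack of per-pixel class labels over time, and `patterns` is a 4 x H x W boolean stack of the RNN's pattern predictions:

```python
import numpy as np

BACKGROUND, AXON, SWELLING, FRAGMENT = 0, 1, 2, 3

def fragmentation_mask(seg: np.ndarray) -> np.ndarray:
    """Pixels that were non-background at some frame and later changed to
    background or fragment, i.e., pixels that degenerated during recording."""
    prior = np.zeros(seg.shape[1:], dtype=bool)        # was non-background earlier
    degenerated = np.zeros(seg.shape[1:], dtype=bool)
    for frame in seg:
        degenerated |= prior & ((frame == BACKGROUND) | (frame == FRAGMENT))
        prior |= frame != BACKGROUND
    return degenerated

def pattern_percentages(patterns: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Percentage occurrence of each pattern relative to the mask size.
    Patterns may overlap, so the percentages need not sum to 100."""
    return 100.0 * (patterns & mask).sum(axis=(1, 2)) / mask.sum()
```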

Statistical analysis
Normality was evaluated with the Kolmogorov-Smirnov test, variance homogeneity with the Levene test, and sphericity with the Mauchly test. When the data were normally distributed and variance homogeneity was met, one-way ANOVA followed by the Bonferroni post hoc test was performed. When the data were not normally distributed, the Kruskal-Wallis test was performed for multiple comparisons of independent groups, followed by post hoc Mann-Whitney U tests with Bonferroni α-correction to adjust for the inflation of type I error due to multiple testing. For repeated testing with covariates, a repeated-measures ANOVA was performed with Greenhouse-Geisser adjustment if sphericity was not given. Data are represented as mean ± standard deviation (SD), except for the nonparametric data of the AUC for axonal fragments as well as retraction and swelling degeneration, for which medians are given. A value of p<0.05 was considered statistically significant. For the Kruskal-Wallis test followed by Mann-Whitney U tests, p=0.05/k was used, with k being the number of single hypotheses.
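The decision logic for independent groups can be sketched with SciPy. This is a simplified illustration, not the authors' code: the repeated-measures branch is omitted (SciPy has no built-in repeated-measures ANOVA), and the one-sample KS test on z-scored data is only an approximation of a proper normality test:

```python
from itertools import combinations
from scipy import stats

def compare_groups(groups, alpha=0.05):
    """One-way ANOVA if all groups pass the KS normality check and Levene's
    test; otherwise Kruskal-Wallis with pairwise Mann-Whitney U post hoc
    tests (to be compared against alpha/k, Bonferroni-style).
    Returns (test_name, list_of_p_values)."""
    normal = all(
        stats.kstest(stats.zscore(g), 'norm').pvalue > alpha for g in groups
    )
    homoscedastic = stats.levene(*groups).pvalue > alpha
    if normal and homoscedastic:
        return 'anova', [stats.f_oneway(*groups).pvalue]
    # Nonparametric branch: omnibus test, then pairwise post hoc tests
    kw_p = stats.kruskal(*groups).pvalue
    pairs = combinations(range(len(groups)), 2)
    posthoc = [stats.mannwhitneyu(groups[a], groups[b]).pvalue for a, b in pairs]
    return 'kruskal', [kw_p] + posthoc
```

In the nonparametric branch, each post hoc p-value would be judged against 0.05/k, with k the number of pairwise comparisons, matching the correction described above.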

We trained an ensemble comprising 8 CNNs to segment the four classes. c, The EntireAxon CNN was validated with a separate validation dataset to assess its performance (recall, precision, and mean F1 score), which was compared to that of human experts. d, The EntireAxon CNN was applied to data on AxD induced by exposure to hemin, which is used to model hemorrhagic stroke.