LocalZProjector and DeProj: a toolbox for local 2D projection and accurate morphometrics of large 3D microscopy images

Background Quantitative imaging of epithelial tissues requires bioimage analysis tools that are widely applicable and accurate. In the case of imaging 3D tissues, a common preprocessing step consists of projecting the acquired 3D volume on a 2D plane mapping the tissue surface. While segmenting the tissue cells is amenable on 2D projections, it is still very difficult and cumbersome in 3D. However, for many specimen and models used in developmental and cell biology, the complex content of the image volume surrounding the epithelium in a tissue often reduces the visibility of the biological object in the projection, compromising its subsequent analysis. In addition, the projection may distort the geometry of the tissue and can lead to strong artifacts in the morphology measurement. Results Here we introduce a user-friendly toolbox built to robustly project epithelia on their 2D surface from 3D volumes and to produce accurate morphology measurement corrected for the projection distortion, even for very curved tissues. Our toolbox is built upon two components. LocalZProjector is a configurable Fiji plugin that generates 2D projections and height-maps from potentially large 3D stacks (larger than 40 GB per time-point) by only incorporating signal of the planes with local highest variance/mean intensity, despite a possibly complex image content. DeProj is a MATLAB tool that generates correct morphology measurements by combining the height-map output (such as the one offered by LocalZProjector) and the results of a cell segmentation on the 2D projection, hence effectively deprojecting the 2D segmentation in 3D. In this paper, we demonstrate their effectiveness over a wide range of different biological samples. We then compare its performance and accuracy against similar existing tools. Conclusions We find that LocalZProjector performs well even in situations where the volume to project also contains unwanted signal in other layers. We show that it can process large images without a pre-processing step. We study the impact of geometrical distortions on morphological measurements induced by the projection. We measured very large distortions which are then corrected by DeProj, providing accurate outputs. Supplementary Information The online version contains supplementary material available at (10.1186/s12915-021-01037-w).

accumulate pixels in ±∆z around z set by height-map.
Local projection Figure S1. Overview of the Local Z Projector method. The source 3D stack is first processed slice by slice. Each 2D slice is binned to accelerate processing, denoised with a Gaussian filter, then the main filter is applied. The main filter is chosen in type (mean filter, standard deviation filter, ...) and in size to have a strong response in the layer of interest. The z plane for which this response is maximal is stored for every (X, Y) position in the height-map image. The height-map is then median-filtered, to remove local spurious irregularities, and up-scaled back so that its width and height match those of the source 3D stack. Finally, the height-map is used as a reference to read pixels in the source stack around the z position it contains, accumulating ±∆z planes around this reference plane.
For a given main filter type and accumulation method, four parameters must be specified: the binning, the sigma of the Gaussian filter, the size of the main filter and the value of ∆z.

Supplementary Note 1: Generation of a ground-truth dataset.
A ground-truth dataset is required to compare the LocalZProjector tool results with other methods. To generate this ground truth, we selected a single time-point in the Drosophila pupal notum dataset of the main text. This single-channel 3D stack displays two salient layers. The top one (small Z values) is generated by the autofluorescence of the cuticle. It is noisy, continuous and shows up in all fluorescence channel. Below the cuticle, we find the cell layer, where the cells apical junctions are labeled for their membrane by Ecad::GFP. The cuticle layer and the cell layer are separated by about 4 to 9 slices. Under the cell layer (largest Z values) there are about 10 locations where discrete and bright structures can be seen, corresponding to apoptotic cells. In the right part of the image, the cell layer is almost parallel to the XY plane and is located just under the cuticle layer. The left part of the cell layer displays a relatively high curvature (Figure 1b).
We generated the ground-truth dataset using Icy [22]. Each slice of the 3D stack was manually annotated for regions where the cells of interest are in focus ( Figure S2). A Jython script then parsed these regions to fetch the pixel in the specified Z-plane for each (X,Y) position, and writes its value on a single plane, generating the ground-truth for the projection. When a single (X,Y) position was represented by several regions in different Z planes, the pixel with the maximal value was retained. The height-map image corresponding to this projection was generated along, by simply writing the Z position retained for projection as pixel value. When a (X,Y) position was represented by several regions in different Z planes, the height-map value was taken as the mean of all the Z "in-focus" positions. This led to the height-map ground-truth image having non-integer values for some pixels. The resulting projection is a 2D gray-scale image of size 1148 ×1148. Its bottom-left corner is devoid of signal and was excluded from all the metrics described in this work. Supplementary Note 2: Comparing projections against the ground-truth dataset for several methods.
The performance of 8 different projection tools were then compared using the Drosophila pupal notum image and the groundtruth image generated as detailed above. We ran an extended parameter search, in order to determine the parameter value that give the best projection for each method. This provides valuable know-how as how a user can select values based on the input image. We put this information in the documentation of LocalZProjector, on its homepage. We give in this section the details of the performance comparison. The results are summarized in the Figure S3 and in the Table 1.
A. Optimizing projection parameters for the compared methods. We want to assess how a projection method can be useful in practice for end-users, in retrieving a 2D projection that can be used in subsequent analysis steps. Therefore, we first turn to a pixel-based comparison of the 2D projection given by each method with the projection ground-truth. We use the RMSE to report these differences, defined here as: where N pixels is the number of pixels in the projected image, p M (i) the pixel value in the projected image by the method M , p GT (i) the pixel value in the projection ground-truth and i iterating over all the pixels in each image. We excluded from the RMSE calculation the area of the sample where there is no cell layer.
Several of the methods tested allow to define a certain thickness parameters specifying how many Z-planes around the Z position defined by the height-map to include in a local projection. This is the case for the LocalZProjector and for Min Cost Z Surface. This offers some tolerance against small inaccuracies of the height-map. When generating the ground-truth projection, we take the brightest pixels of several Z-planes when several of them are annotated as being in focus. Because of this, methods that can accumulate several planes are slightly advantaged.
Moreover, the use of manually annotated ground-truth dataset depicted above as a basis for performance comparisons has some limitations. Indeed, the metric values we get might be affected by the imprecision of manual annotation. We nonetheless use the RMSE defined above to rank each method and detect projection defects, keeping in mind the aforementioned limitation. We also can use the RMSE metric to optimize the parameters of each method separately and derive practical advice on how to tune them with respect to the image content.
Maximum intensity projection. We used the Z Project... command of ImageJ [16] in the Fiji distribution [20], choosing maximum intensity as a projection method. It generates a projection by taking the pixel with the largest value along a (X, Y) column. This method is the simplest projection technique and does not have parameters to tune. c.
d. Local Z Projector. The LocalZProjector aims at being a general tool and has therefore several parameters that can be tuned to accommodate a large variety of input. We derived the optimal set of values for these parameters by parameter sweeps, testing a large number of parameter combination (close to 200,000) and retaining the combination yielding the lowest RMSE. For all the tests described below we used the Max of Std method to generate the height-map, and the MIP method to accumulate several planes around it. We have found that the lowest RMSE value is 81.5 and obtained for binning = 2, gaussian sigma = 0.5 pixels, filter size = 13 pixels, median filter half-size = 16 pixels and ∆z = 1. The projection takes about 5 s. to complete with these parameters. A large subset of parameters give also results close to the optimum, but in a much shorter time. For instance the parameters binning = 5, gaussian sigma = 0 pixels, filter size = 7 pixels, median filter half-size = 4 pixels and ∆z = 1 yields a RMSE of 83.0 in about 0.6 s. Nonetheless, we base the following discussion on parameter values associated to the absolute optimum.
The binning value has a dramatic positive impact on processing time ( Figure S4a). A large binning can be chosen without compromising the RMSE value significantly. For very large binning, the structures that define the layer of interest (in our case the manifold shape of the cell membranes) are confused in the downsized image and cannot be retrieved later by the standard deviation filter.
The Gaussian filter is meant to limit the effect of noise. But binning already averages pixels together when downsizing. Even with a binning of 2, the Gaussian filter has almost no impact on RMSE or timing. ( Figure S4b).
The main filter is the key parameter to tune to reliably retrieve the cell layer in the projection and not the cuticle layer. Its type must be chosen to generate a strong response for the cell labelling. In our case the layer we are interested in displays cells as manifolds, and the cuticle generates a spurious layer where the signal is very slowly varying (minus the noise). An adequate filter is therefore the standard-deviation filter, that will act as a measure of a local contrast between the bright cell membranes and the dark cell centers. Its size must be chosen so that the contrasted features will be fully included in one window. The cells sections in the test dataset are about 20-40 pixels in diameter, with a membrane thickness around 3-5 pixels. We find that the optimal value of the size is a value of 13 pixels, enough to include all or a large part of a single cell ( Figure S4c). Smaller values would create too many windows without contrast. For larger values, the cuticle layer gives a strong response in the areas of the sample where the curvature is large. The projection then displays jumps between the two layers. In the LocalZProjector plugin, the size of the main filter is specified independently of the binning value. It is then scaled to match the size in the downsized image. This is why the RMSE plot for this parameter in the Figure S4c displays a staircase pattern with steps corresponding to the binning value.
A median filter can be applied as a post-processing option on the height-map image. It is meant to remove spurious Z value that can be generated by punctate structures above or below the layer of interest. Its size has to be therefore large compared to these structures, and small compared to the layer curvature. In our main example, such bright fluorescent structures appear from some dead cell bodies away from the epithelial surface, and are picked by the standard-deviation filter, generating locally spurious large Z values. A median filter of half-size of 16 pixels in the downsized height-map completely remove these spurious jumps ( Figure S4d). This corresponds to a size in the original scale image of 64 pixels, about twice the cells diameter.
Finally, the ∆z parameter specifies how many planes to accumulate around the reference Z plane for each (X, Y) position. This allows for accommodating small mistakes in the height-map. In our case a value of ∆z ± 1 has a strong positive impact, dividing the RMSE by a factor of about 2 compared to no accumulation. However large values of ∆z will result in including unwanted signal from planes that are away from the reference surface. Unsurprisingly we find that a value of ∆z = 1 returns the best projection ( Figure S4e).
Stack focuser. Stack Focuser [9] is an ImageJ plugin that generates projections by selecting the best plane for each (X, Y) column. This best plane is determined by applying the Sobel filter [17] at each Z-plane and selecting the one for which the filtered value is the largest. The size of the Sobel filter can be adjusted, and we found the optimal value to be 25, in the range of the cells size ( Figure S5). For smaller values, the projection includes signal from the cuticle inside the cells and resembles that of MIP method. For large values, strong defects appear in the regions of the cell layer where the curvature is large.  SurfCut. ImageJ SurfCut [12] is an ImageJ macro that extracts a projection by thresholding single Z-planes from a Z-stack, then stacking the obtained binary masks to generate a 3D binary mask of the sample. The top surface of this mask is then used to define a reference surface. With this approach SurfCut detects the cuticle layer and not the cell tissue layer. However the SurfCut tool allows defining offsets from the reference surface to generate the projection. Since the cuticle layer is almost parallel to the tissue layer, we could find parameters that would still generate projections of the tissue layer itself. Because SurfCut only operates on 8-bit images, we converted the source image to 8-bit, then re-scaled the projection so that the RMSE values reported in the Figure S3a is commensurate with values from other methods. We used the following parameters to do so: filtering radius = 6, threshold = 10, offset top = 5, offset bottom = 12.
PreMosa. PreMosa [10] is a standalone command-line tool written in C++ that was initially developed to facilitate the preprocessing of large mosaics of the Drosophila melanogaster developing wing. It offers a tool to generate a projection from 3D stacks, a tool to stitch a collection of 2D projections into a large mosaic, a tool to correct uneven illumination and a tool to adjust the contrast. Since our sample is a single Field of View, we only tested and compared the former.
The surface projection tool of PreMosa works by first sub-dividing the image in square columns of size r × r, r the grid size being an adjustable parameter. For each column, a candidate Z-position is obtained by taking the plane in which there is the largest amount of bright-pixels. Bright-pixels are pixels that have a value larger than the lowest intensity of the top λ% pixels in the column, 20% by default. The bright-pixel threshold λ is a second adjustable parameter. The resulting surface is then iterably smoothed and refined by determining the Z-position of greatest contrast (local variance), in a neighborhood of ±d planes around the surface. The maximal distance d is a third adjustable parameter.
The grid-size parameter appears to be the crucial parameter to tune ( Figure S6). For too small values of the grid-size, the projection is principally made of the cuticle layer, with small regions containing the cell layer. For values too large, the projection fails to include the curved regions of the cell layer. The distance parameter d and the threshold λ haves only little influence on RMSE for reasonable values of r. Extended Depth Of Field. The Extended Depth Of Field method aims at reconstructing a projection from a 3D stack by merging areas that are detected to be in-focus as the Z-position varies [11]. This algorithm is made to excel for transmitted light images (e.g. bright-field images or colored images from histology samples) and reflected, wide-field images. It is not made to deal with optically-sectioned, confocal stacks of structures labelled in fluorescence, where the signal coming from a structure is only present in a few Z-slices around its position. However the tissue layer in our test dataset contains cells labelled for their membrane. For the Z-planes that contain them, they will appear as sharp and well contrasted structures, and the in-focus detection of this algorithm may select them correctly.
The authors of [11] has made the technique available via an ImageJ plugin that we used in our comparisons. It offers a very large range of parameters that can be tuned, from the various techniques to detect in-focus plane and the parameters Topology settings (see Figure S7). We note that for our tests, the topology parameter set does not have an impact on the output. Indeed, the range of smoothing or regularization associated with these parameter sets is relatively small compared to the size of our test dataset features. Accordingly, the height-map returned by any of the parameter set tested is very irregular. The Complex wavelets with MCC quality parameter set however gives a good response to the membranes of the cell layer. But indifferently of the topology parameter we use, the centers of the cells always contain spurious signal in the final projection.
Min Cost Z Surface. The 5 methods above are direct methods that determine a best Z-plane for every (X, Y) pixel based on sharpness, brightness or local contrast. Some of them offer post-processing steps to smooth or regularize the resulting reference surface. Another approach consists in formulating the problem as a global energy-minimization. In [13] and [14], the authors use graph-cuts to extract a single smooth surface [13] or a pair of smooth surfaces that are roughly parallel [14] and made of bright pixels. The latter approach should be well suited to deal with our test dataset, as it displays two almost parallel layers.
The algorithms described in [13] and [14] have been implemented in Fiji in a plugin called Min Cost Z Surface [15]. Our attempts with the single-surface extraction method only resulted in a projection made of patches taken from both the cuticle layer and the cell layer with large gaps and jumps in the height-map. Changing the downsampling from x2 to x4 changes the patches that are incorporated but does not fix these defects. However, using the two-surfaces detection and with some tuning of the input parameters (maximal separation of the two surfaces, number of planes to accumulate in the final projection) we extracted separately the cuticle layer and the cell layer rather reliably. The optimum is obtained by incorporating 3 slices around the reference surface. The optimal value for this last parameter is found to be the same than for the ∆z parameter of the Local Z Projector method ( Figure S8).
Fast SME. The Smooth-Manifold-Extraction algorithms offers a projection technique that is also based on a global energyminimization approach and has the advantage of being parameter-free [8], [28]. It can harness both 3D confocal stacks and 3D wide-field stacks. It is especially developed to deal with single cell monolayer labeled for their membrane. It is therefore one of the best suited tools to our test dataset. Briefly, the algorithm relies on classifying (X, Y) columns as belonging to the foreground (they contain a pixel belonging to the cell membrane) or belonging to the background (inside cells). The reference surface is then obtained energy minimization. The energy to minimize is the sum of a local smoothness term ensuring the regularity of the surface, and a distance term driving the surface close to the Z-planes of the foreground columns, and that ignores the background columns. A first paper [8] contains the original implementation, which was further optimized for better performance [28]. We base our comparisons on the latter, which is implemented in MATLAB. The user must only specify whether the image comes from a confocal imaging system (our case) or from a wide-field system, and how many slices to incorporate in a local projection around the reference surface. To emulate what we have been testing for the Local Z Projector and Min Cost Z Surface method, we varied this last parameter to include just the plane from the reference surface, ±1 and ±2 planes ( Figure S9).

B. Comparing optimal projections.
We then compared the optimal projection of each method against one another. The relative RMSE values are reported in the Figure S3b and the Table 1 in the main text. In the Figure S10 we report the 2D optimal projection for each method, along with the error map. The error-map is displayed as the pixel by pixel absolute difference with the value in the ground-truth projection, filtered by a 41 × 41 median. Except for MIP and Fast SME, all of the presented methods have some degree of parameter configuration that we could optimize to get a projection close to the ground-truth. Therefore all the optimal projections are satisfactory at least in some part of the image.
The MIP method incorporate the brightest pixel in the projection regardless of its plane of origin. Because of this, the signal inside the cells in the projection is made of bright pixels coming from the pupal cuticle signal, which is noisy and bright.
Consequently the error map image shows large errors everywhere in the projection.
The LocalZProjector could retrieve a projection that is very close to the ground-truth, except in the flat region of the tissue (right arrow). In the latter region, the cuticle layer just above the tissue layer emits a strong signal. It is likely that here, projection includes spurious signal from the cuticle. Indeed, the parameters we picked as such that the local projection takes the maximum of 3 pixels around the reference surface.
The Stack Focuser plugin bears some similarities with the LocalZProjector. It determines the position of the reference surface by a filter that has a strong response to local contrast. Since in our case the cells membrane amount to salient ridges, this method correctly picks the cell layer. However, it does not offer a step that would regularize the reference surface, and the projection is deformed by some bright and punctate structures under the cell surface (arrows). Finally, in regions where the curvature of the tissue layer is large, the Sobel filter does not pick the right planes.
As mentioned above, SurfCut actually picks the cuticle layer. We configured it so that projection includes the pixels below the cuticle layer, hereby including those of the cell layer. Because the reference surface follows the cuticle layer and not the cell layer, areas of the projection where the error is large actually reflects irregularities in the former.
PreMosa detects candidate planes by looking in a r × r grid for the Z-plane that has an amount of bright pixels larger than a certain threshold. Despite the presence of a bright uniform cuticle layer above the cell layer, PreMosa still selects the latter for projection with an adequate set of parameters. This is probably due to the fact that the pixels in the membranes of the cell are themselves slightly brighter than the cuticle pixels, and because PreMosa offers a step where the surface is regularized. Indeed, PreMosa performs very well in areas of the sample where the layer is quasi-flat. However defects appear in regions where the curvature is higher (arrow), a fact also noted in [28] on other samples. As explained in [10], the grid size parameter must be significantly smaller than the cross-section of the tissue layer with any XY plane. In its most curved section, this cross section can be as small as 2 cell diameters, which is about 40-60 pixels in this region. In our tests the optimal grid-size value we found is 34, a value too large for this small cross section.
The Extended Depth of Field method was made to deal with transmitted light images. It is therefore not very well suited to our confocal test image. However we could find some parameter combination that extracts most of the tissue layer. Nonetheless, the height-map generated is very irregular. We can see the cells membrane in the projection, but their interior is filled with high and noisy signal. The error-map for this method has therefore high values everywhere.
Since we could configure the Min Cost Z Surface plugin to detect two surfaces, it reliably detected both the cuticle layer and the cell layer. The projection obtained with this second surface is satisfying. Like for the LocalZProjector projection, it however displays some small inaccuracies in its flat regions.
Along with the latter, Fast SME is the second energy-minimization based method of this comparison. It is built to specifically detect the layer of cells stained for their membrane, without the specification of any parameter. It succeeds to do so, except in one region (arrow) where the height-map has a jump to the cuticle layer in the region where it is bright. Here the cells are the largest in the tissue. It is likely that at these locations, the pixel columns that go through the inside of the cells get incorrectly classified as foreground, because of the bright intensity of the cuticle layer. The reference-surface then deforms to reach out the cuticle layer at these positions.
C. Comparing optimal height-maps. Several methods tested can also return the height-map that encodes the reference surface used for the projection. The height-map image stores the index of the Z-plane for each (X, Y) position in the reference surface. The comparison of the resulting height-maps confirms the observations and differences between methods noted above ( Figure S11).   Contrary to the first three methods, the height-map returned by the Min Cost Z Surface plugin does not appear as large patches of constant integer values. The Z-values in this height-map are decimal, a feature we attribute to the original formulation of the problem in this method. These values vary locally at a small scale, which generates the texture we observe in the errormap for this height-map. Nonetheless, non-integer values are rounded when projecting the source stack, and despite the error in the height-map, the projection returned by the Min Cost Z Surface is satisfying ( Figure S10).
Finally, we retrieve in the height-map returned by the FastSME method the large inexact patch where the reference surface jumps towards the cuticle layer. by a factor W ji .
where W ji is given by: Finally, the object-consistency error oce is calculated as oce = min (E g,s , E s,g ). Even within a set made of a rather large number of masks, the object-consistency error reliably penalizes over-segmentation or false-splits and under-segmentation or false-merges, respectively where a cell is split over several disjoint masks and where a segmentation mask covers several cells. We implemented this metric in MATLAB and used its results in Figure S3d and Table 1 in the main text.
B. Simple segmentation pipeline. From the optimal projections obtained previously, we run a straightforward, fully automated segmentation pipeline in Fiji [20], that returns a binary mask where each cell is represented by a connected set of white pixels separated by a black ridge. The pipeline is as follow: • Filter the projection with a 3x3 median filter.
• Filter the resulting image with a Gaussian filter with σ = 0.7 pixels.
• From the MorphoLibJ [31] package in Fiji, run the Morphological Segmentation tool with a tolerance of 120 and a connectivity of 4.
We then obtain directly the label image I s we can use to measure the oce value. This simplistic pipeline is not expected to fare perfectly against the amount of noise and staining imperfections in the test image but it is however good enough to demonstrate the impact of a projection method on the downstream analysis results.
C. Ground-truth for segmentation accuracy. We generated the segmentation ground-truth image I g by running the pipeline above on the ground-truth projection described in the Supplemental note 1. The numerous remaining segmentation errors were then manually corrected in Fiji.  Table S1. Performance comparison on a high quality image of a Drosophila pupa notum, captured with a laser-scanning confocal microscope (LSM 880, Zeiss). This specific image is of high quality compared to that used for the comparison reported in Table 1. There are little spurious structures, the cuticle auto-fluorescence signal is faint and there are almost no dead bodies. All tools give visually satisfactory projection results, but the quality of the projection strongly impacts the segmentation accuracy. For LocalZProjector, we include metrics for a second parameter set with a larger binning value, resulting in a ten times quicker projection with only a minor cost on the projection RMSE. The raw data and the methodology and tools to perform this comparison are available online [33].  Table S2. Performance comparison on the adult zebrafish brain image used in Figure 5 of the main text. This image is a rather large compared to the image used in Table S1 (551 Gpixels vs 29 Gpixels) and we see the impact on the processing time. The raw image is available online [32].  Figure S12. Qualitative performance of LocalZProjector on other datasets. The 3D images used for this figure are taken from [28].
Using the LocalZProjector tool, we derived parameters that would give a projection similar to what is presented in Figure 6 of [28].