l2gen very slow when using output from l1bextract_safe_nc

Posted: Thu Feb 15, 2024 2:19 pm America/New_York
by andrew.meredith
l2gen takes significantly longer to process an OLCI granule subset created with l1bextract_safe_nc than to process the full granule (l2gen version msl12 9.6.0-V2023.3 (Oct 4 2023 22:27:24)).

I tested on both RHEL8.9 and Ubuntu 22.04 with similar results.

The following example demonstrates the problem. The OLCI L1B granule is first subset with l1bextract_safe_nc, and the resulting L1B subset is then used as input to l2gen:

l1bextract_safe_nc -v --north=30.6654098 --south=24.1619781 --east=-79.8336442 --west=-84.6220924 S3A_OL_1_EFR____20240213T152551_20240213T152851_20240213T171740_0180_109_068_2520_MAR_O_NR_002.SEN3 S3A_OL_1_EFR____20240213T152551_20240213T152851_20240213T171740_0180_109_068_2520_MAR_O_NR_002.SEN3.FL3

time l2gen ifile=S3A_OL_1_EFR____20240213T152551_20240213T152851_20240213T171740_0180_109_068_2520_MAR_O_NR_002.SEN3.FL3/xfdumanifest.xml ofile=S3A_OL_1_EFR____20240213T152251.FL3.L2

Processing Rate = 0.413858 scans/sec

real 106m48.199s
user 94m18.375s
sys 12m9.557s


Processing the full granule as follows took only ~20 min, versus ~106 min for the subset granule:
time l2gen ifile=/scratch/SAPS/test/speed/S3A_OL_1_EFR____20240213T152551_20240213T152851_20240213T171740_0180_109_068_2520_MAR_O_NR_002.SEN3/xfdumanifest.xml ofile=S3A_OL_1_EFR____20240213T152251.L2

Processing Rate = 3.375413 scans/sec

real 20m12.506s
user 20m3.674s
sys 0m2.717s
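
To put the two runs side by side, here is a small sketch that converts the `time` values quoted above to seconds and compares them. The helper name time_to_seconds is made up for illustration; the numbers are copied from the two runs in this post.

```python
import re


def time_to_seconds(text):
    """Convert a shell `time` value such as '106m48.199s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# "real" times reported by `time` for the subset and the full granule
subset_real = time_to_seconds("106m48.199s")
full_real = time_to_seconds("20m12.506s")

# "Processing Rate" values reported by l2gen (scans/sec)
subset_rate = 0.413858
full_rate = 3.375413

print(f"wall-clock slowdown: {subset_real / full_real:.1f}x")
print(f"per-scan slowdown:   {full_rate / subset_rate:.1f}x")
```

The subset run is roughly 5x slower in wall-clock time and roughly 8x slower per scan. The sys time also grows from under 3 seconds to about 12 minutes, which may point at file I/O rather than the science code.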

An earlier version of l2gen (msl12 9.6.0-T2022.20 (Jul 28 2022 18:14:29)) behaved as expected: the subset input took ~4 min to process and the full granule took ~17 min.

Regards,
Andrew

Re: l2gen very slow when using output from l1bextract_safe_nc

Posted: Tue Feb 20, 2024 4:54 pm America/New_York
by andrew.meredith
The problem seems to be related to different ChunkSizes values being assigned in the netCDF files that l1bextract_safe_nc writes. Using nccopy to recreate the radiance files with updated ChunkSizes gave much better l2gen performance.
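
For anyone who wants to check this on their own files, here is a rough sketch that lists the chunk sizes of every variable in a netCDF file using netCDF4-python. The function names are made up for illustration, and the file paths in the usage comment are assumptions about the SAFE directory layout. `ncdump -hs` should show the same information as the `_ChunkSizes` attribute.

```python
def summarize_chunking(variables):
    """Return {name: (shape, chunking)} for a mapping of netCDF4-like variables.

    Each value needs a .shape attribute and a .chunking() method, as
    netCDF4.Variable provides. chunking() returns the string 'contiguous'
    or a list of chunk sizes, one per dimension.
    """
    summary = {}
    for name, var in variables.items():
        summary[name] = (tuple(var.shape), var.chunking())
    return summary


def print_file_chunking(path):
    """Print the shape and chunk sizes of each root-group variable in a file."""
    from netCDF4 import Dataset  # imported lazily; requires netCDF4-python

    with Dataset(path) as ds:
        for name, (shape, chunks) in summarize_chunking(ds.variables).items():
            print(f"{name}: shape={shape} chunks={chunks}")


# Example usage (paths are hypothetical):
# print_file_chunking("full.SEN3/Oa01_radiance.nc")
# print_file_chunking("subset.SEN3.FL3/Oa01_radiance.nc")
```

Running it on the same radiance file from the full granule and from the subset shows whether the chunk sizes differ between the two.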

I see there's a "TODO: get and set chunksizes" comment in the netcdf_utils.py script. Any chance that's been addressed?

Thanks
Andrew

Re: l2gen very slow when using output from l1bextract_safe_nc

Posted: Fri Feb 23, 2024 1:37 am America/New_York
by OB.DAAC-EDL - amscott
A solution isn't ready yet. This is still being investigated.

Re: l2gen very slow when using output from l1bextract_safe_nc

Posted: Fri Feb 23, 2024 11:04 am America/New_York
by andrew.meredith
Thanks for the update.

I made the following change to the nccopy_var function in netcdf_utils.py so that chunking is set when the output variable is created. This fixed the problem for me:

# create variable with same name, dimnames, storage format
zlib = srcvar.filters().get('zlib', False)
shuffle = srcvar.filters().get('shuffle', False)
complevel = srcvar.filters().get('complevel', 0)
chunking = srcvar.chunking()

for idx, dimname in enumerate(srcvar.dimensions):
    if indices and dimname in indices:
        if chunking[idx] > len(indices[dimname]):
            chunking[idx] = len(indices[dimname])

dstvar = dstgrp.createVariable(srcvar.name,
                               srcvar.dtype,
                               srcvar.dimensions,
                               zlib=zlib,
                               shuffle=shuffle,
                               chunksizes=chunking,
                               complevel=complevel)
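
One case the loop above does not cover: netCDF4-python's Variable.chunking() returns the string 'contiguous' for unchunked variables, and indexing into that would fail. A possible standalone version of the clamping step, as a sketch only, is below. The helper name clamp_chunksizes is made up and is not part of netcdf_utils.py, and I have only reasoned through it with plain lists, not run it inside l1bextract_safe_nc.

```python
def clamp_chunksizes(chunking, dimensions, indices):
    """Return chunk sizes suitable for createVariable(chunksizes=...).

    chunking   -- result of Variable.chunking(): 'contiguous', None, or a list
    dimensions -- tuple of dimension names, in variable order
    indices    -- {dimname: selected indices} for subset dimensions, or None
    """
    if chunking is None or chunking == 'contiguous':
        # No chunk sizes to carry over; fall back to the library default.
        return None

    clamped = list(chunking)  # copy so the caller's list is not modified
    for idx, dimname in enumerate(dimensions):
        if indices and dimname in indices:
            # A chunk should not be longer than the subset dimension;
            # keep at least 1 so an empty selection does not give size 0.
            clamped[idx] = max(1, min(clamped[idx], len(indices[dimname])))
    return clamped


# Example with made-up sizes: 100 rows selected out of a larger granule
print(clamp_chunksizes([512, 512], ("rows", "columns"), {"rows": range(100)}))
```

The result would then be passed as chunksizes= in the createVariable call, in place of the chunking list.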

Andrew