Clarification on negative values in OMSO2e v003 SO2 column data

josephmacula

by josephmacula » Sun Jul 13, 2025 5:34 pm America/New_York

I recently pulled SO2 data from the following source: OMSO2e v003 SO2 Column Amount. Specifically, I used a Python script to navigate to the Giovanni webpage, fill out the necessary filters, plot the data, and then download the CSV file that becomes available once the data are plotted.

After inspecting the data in the CSV file, I noticed that in addition to the fill values, there are certain days with small negative values (on the order of -0.2). From previous forum posts (e.g., this one: viewtopic.php?t=562), I understand that these negative non-fill values are a result of the PCA algorithm used to process the data. However, being a non-expert, I want to know if there is a rough heuristic for how to handle these data. For example, is there a certain threshold below which negative data points should be discarded (e.g., less than -10)? Are there any other caveats one should keep in mind when incorporating these negative values into analyses?

Thank you for any advice!

DPDG - pleonard
Subject Matter Expert

Re: Clarification on negative values in OMSO2e v003 SO2 column data

by DPDG - pleonard » Sun Jul 20, 2025 5:58 am America/New_York

The OMSO2e Level 3 product is a "best pixel" product, which means that each value in the ColumnAmountSO2 field of an OMSO2e Level 3 product file is an unaltered value from the ColumnAmountSO2 field in an OMSO2 Level 2 product file.

When the actual SO2 column amount for a "ground pixel" is near zero, the retrieval algorithm used to produce the OMSO2 product yields a distribution of SO2 column amount values that includes both positive and negative values.

When forming an average based on OMSO2e Level 3 data (or OMSO2 Level 2 data), I would keep all of the negative values of the relevant field (e.g., ColumnAmountSO2), so that the average that is formed will not have a positive bias.
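To illustrate the positive bias mentioned above, here is a minimal sketch using synthetic numbers only. It assumes a true column amount of exactly zero and Gaussian retrieval noise with an arbitrary standard deviation of 0.5; neither figure comes from the OMSO2 documentation, and real retrieval noise need not be Gaussian.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible

TRUE_COLUMN = 0.0   # assumed true SO2 column amount (illustrative)
NOISE_SIGMA = 0.5   # assumed retrieval noise (illustrative, not from the product docs)

# Simulated retrievals scattered around the true value, some of them negative.
retrievals = [random.gauss(TRUE_COLUMN, NOISE_SIGMA) for _ in range(10_000)]

# Keeping every value: the average stays close to the true value.
mean_all = statistics.fmean(retrievals)

# Dropping negatives: only the positive half of the noise survives.
mean_nonneg = statistics.fmean(v for v in retrievals if v >= 0)

# Clipping negatives to zero: still biased high, though less so.
mean_clipped = statistics.fmean(max(v, 0.0) for v in retrievals)

print(f"all values kept:   {mean_all:+.3f}")
print(f"negatives dropped: {mean_nonneg:+.3f}")
print(f"negatives clipped: {mean_clipped:+.3f}")
```

Under these assumptions, the first average lands near zero while the other two are clearly positive, even though the simulated "true" amount is zero.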

You are asking about how to handle extreme outliers in the relevant field (e.g., ColumnAmountSO2), but there is no simple answer.

It is undeniable that the data in the OMSO2 product include extreme outliers that are spurious.

I would try to keep all of the extreme outliers in my analysis, unless there is good evidence (e.g., spatially and temporally coincident data from another satellite) that could be used to eliminate one or more of the extreme outliers.

Also, I might present two results: one that includes the extreme outlier(s), and another that does not.
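One way to report both results side by side is sketched below. The helper name `mean_with_and_without`, the sample values, and the `flagged` list are all hypothetical; the sketch assumes that fill values have already been converted to NaN and that the flags come from independent evidence rather than from a fixed threshold.

```python
import math
import statistics

def mean_with_and_without(values, flagged):
    """Return the mean of all valid values and the mean excluding flagged ones.

    values  : column amounts, with fill values already replaced by NaN
    flagged : booleans marking values judged spurious on independent evidence
    """
    valid = [(v, f) for v, f in zip(values, flagged) if not math.isnan(v)]
    kept = [v for v, f in valid if not f]
    return {
        "all_valid": statistics.fmean(v for v, _ in valid),
        "excluding_flagged": statistics.fmean(kept),
        "n_flagged": len(valid) - len(kept),
    }

# Hypothetical daily values; the 12.0 stands in for a suspected outlier.
daily = [0.1, -0.2, 0.3, math.nan, 12.0, -0.1]
flagged = [False, False, False, False, True, False]

result = mean_with_and_without(daily, flagged)
print(result)
```

Note that the small negative values are kept in both averages; only the value flagged on external evidence is excluded from the second one.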

Real scientific data often include outliers, and the best solution is always to obtain more data (if possible).

It would be risky to claim a major scientific discovery that hinges upon whether or not one extreme outlier is included in the analysis.

GES DISC - emilyz
User Services

Re: Clarification on negative values in OMSO2e v003 SO2 column data

by GES DISC - emilyz » Mon Jul 21, 2025 6:46 am America/New_York

Small negative values are sometimes returned by the retrieval algorithm. Although negative concentrations are not physically real, one should not discard these values when averaging over several days or over a larger region, because the positive and negative retrieval errors tend to average out. Excluding the negative values would bias your average high.
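As a rough sketch of what this could look like for a downloaded time series, the snippet below averages over several days while removing only the fill values. The CSV layout, the column name `so2_du`, and the fill value of -9999.0 are assumptions for illustration; check the header of the actual Giovanni file for the real column name and fill value.

```python
import csv
import io
import statistics

FILL_VALUE = -9999.0  # hypothetical; use the fill value stated in your file

# Hypothetical stand-in for the downloaded CSV (values in Dobson Units).
sample_csv = """time,so2_du
2025-07-01,0.31
2025-07-02,-0.20
2025-07-03,-9999.0
2025-07-04,0.05
2025-07-05,-0.12
"""

def mean_keeping_negatives(text, column="so2_du", fill_value=FILL_VALUE):
    """Average a column, dropping fill values but keeping negative retrievals."""
    values = [float(row[column]) for row in csv.DictReader(io.StringIO(text))]
    valid = [v for v in values if v != fill_value]
    return statistics.fmean(valid), len(valid), len(values) - len(valid)

mean_so2, n_valid, n_fill = mean_keeping_negatives(sample_csv)
print(f"mean = {mean_so2:.3f} DU from {n_valid} days ({n_fill} fill value removed)")
```

The filter tests for the fill value explicitly rather than using a rule such as "drop everything below zero", which would also remove the legitimate small negative retrievals.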
