PDS files in subscriptions now in both bz2 and 'data' format
- 
				oo_processing
- Posts: 340
- Joined: Wed Apr 06, 2005 12:11 pm America/New_York
- Has endorsed: 10 times
- Endorsed: 3 times
Sean,
There seems to be an issue which began on May 6th. It is happening in at least 21 of my subscriptions. Here is the result of a query just performed:
[bmurch@dell8 ~]$ curl --retry 5 --retry-delay 2 -d "subID=1843&results_as_file=1" https://oceandata.sci.gsfc.nasa.gov/api/file_search
MOD00.P2020131.1145_1.PDS.bz2
MOD00.P2020131.1320_1.PDS.bz2
MOD00.P2020131.1325_1.PDS.bz2
MOD00.P2020132.1220_1.PDS.bz2
MOD00.P2020132.1225_1.PDS.bz2
MOD00.P2020132.1400_1.PDS.bz2
MOD00.P2020132.1405_1.PDS.bz2
MOD00.P2020133.1305_1.PDS.bz2
MOD00.P2020133.1310_1.PDS.bz2
MOD00.P2020134.1210_1.PDS
MOD00.P2020134.1215_1.PDS.bz2
MOD00.P2020134.1350_1.PDS
MOD00.P2020134.1355_1.PDS
Notice that three of them are not bz2 files and were downloaded as data.
[bmurch@optics0 S4P_MODIS_H5]$ ll /cms_zfs/sat_data/modis/l0/bad/MOD00.P2020134.1210_1.PDS
-rw-rw-r-- 1 bmurch cms_optics 395148432 May 13 10:28 /cms_zfs/sat_data/modis/l0/bad/MOD00.P2020134.1210_1.PDS
[bmurch@optics0 S4P_MODIS_H5]$ file /cms_zfs/sat_data/modis/l0/bad/MOD00.P2020134.1210_1.PDS
/cms_zfs/sat_data/modis/l0/bad/MOD00.P2020134.1210_1.PDS: data
I have a program that has been running for at least 8 years, and this is the first time this has ever happened.
My program determines the appropriate name of a workorder from the granule, and where to place it, based on a name like this:
# determine the actual wo name:
$wo_name_h5 =~ s/(.*)\.PDS\.bz2/PRI1.DO.SEADAS_L1A_GEO_EXTRACT_H5\.$site_key\.$sat_key\.$1\.wo/;
In some cases, these are grabbed as "HTML" file errors.
This causes great problems for several scripts, from the initial download and where to put the file, to the scripts that bunzip2 the files.
In some cases, the files are later "fixed" by NASA and downloaded again as bz2 files, in which case my scripts work.
It was a fluke that I found this at all as the errors are totally random, independent of time or day.
Please advise. If this is going to be a continuing problem, I will have a major rewrite to do.
Brock
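One way to guard against the "data" and "HTML" cases described above is to look at the first bytes of a download rather than at its name. The following is only a minimal sketch: the function name classify_download is hypothetical, and the HTML check is a rough heuristic that assumes an error page mentions <html or <!doctype within its first 512 bytes.

```shell
# Classify a downloaded file by its leading bytes instead of its name.
# Prints one of: bzip2, html, data
classify_download() {
    f=$1
    # A bzip2 stream starts with the ASCII bytes "BZh" (hex 42 5a 68).
    magic=$(dd if="$f" bs=1 count=3 2>/dev/null | od -An -tx1 | tr -d ' \n')
    if [ "$magic" = "425a68" ]; then
        echo bzip2
    elif dd if="$f" bs=512 count=1 2>/dev/null |
            LC_ALL=C grep -qi -e '<html' -e '<!doctype'; then
        # Heuristic only: looks like an HTML error page.
        echo html
    else
        echo data
    fi
}
```

A download script could call this right after the transfer and route anything that is not reported as bzip2 to a holding directory for a later retry.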
			
			
									
- 
				OB ODPS - towens
- Subject Matter Expert 
- Posts: 465
- Joined: Fri Feb 05, 2021 9:17 am America/New_York
- Has endorsed: 1 time
- Endorsed: 10 times
There was a recent change to the timing of the compression. Previously the L0 files were compressed on ingest; now the compression is deferred and occurs on the archive. This creates a race in which the uncompressed file can be listed for subscriptions before the compression happens. We are looking into a solution.
Tommy
			
			
									
- 
				oo_processing
- Posts: 340
- Joined: Wed Apr 06, 2005 12:11 pm America/New_York
- Has endorsed: 10 times
- Endorsed: 3 times
Thanks Tommy,
For now, I am just skipping the ones that are not bz2.
There are so many things that happen with the hash element that the granules are stored in, that it will be a nightmare to change everything.
Brock
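Skipping the entries that are not bz2 can be done on the listing itself, before anything is downloaded. This is a rough sketch only: split_listing and the two output file names are hypothetical, and it assumes the listing has one granule name per line, as in the curl output earlier in the thread.

```shell
# Read granule names on stdin. Names ending in .PDS.bz2 go to the file "$1";
# every other non-empty line goes to the file "$2" for a later retry.
split_listing() {
    ready=$1
    deferred=$2
    : > "$ready"
    : > "$deferred"
    while IFS= read -r name || [ -n "$name" ]; do
        case $name in
            '')        ;;                                      # skip blank lines
            *.PDS.bz2) printf '%s\n' "$name" >> "$ready" ;;
            *)         printf '%s\n' "$name" >> "$deferred" ;;
        esac
    done
}

# Possible usage with the query shown earlier in the thread:
# curl --retry 5 --retry-delay 2 -d "subID=1843&results_as_file=1" \
#     https://oceandata.sci.gsfc.nasa.gov/api/file_search |
#     split_listing ready.txt deferred.txt
```

Keeping the deferred names in their own file means the rest of the pipeline only ever sees .PDS.bz2 names, so the existing hash and work-order logic would not need to change.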
			
			
									
- 
				OB ODPS - towens
- Subject Matter Expert 
- Posts: 465
- Joined: Fri Feb 05, 2021 9:17 am America/New_York
- Has endorsed: 1 time
- Endorsed: 10 times
If you append the bz2 and retry, they should be compressed now.
Tommy
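The "append the bz2 and retry" workaround might be scripted roughly as follows. This is a sketch under assumptions: bz2_name and retry_deferred are hypothetical names, and the download command is supplied by the caller because the thread does not show how the files are actually fetched.

```shell
# Name to request on a retry: append .bz2 unless it is already present.
bz2_name() {
    case $1 in
        *.bz2) printf '%s\n' "$1" ;;
        *)     printf '%s\n' "$1.bz2" ;;
    esac
}

# Retry every deferred name (one per line on stdin) under its .bz2 name.
# "$1" is a caller-supplied download command that takes one file name.
retry_deferred() {
    fetch=$1
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue
        # </dev/null keeps the download command from eating the name list.
        "$fetch" "$(bz2_name "$name")" < /dev/null ||
            echo "still unavailable: $name" >&2
    done
}
```

A name that still fails is only reported on stderr, so it can stay in the deferred list and be tried again on the next subscription check.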
			
			
									
- 
				OB ODPS - jgwilding
- Subject Matter Expert 
- Posts: 139
- Joined: Fri Feb 19, 2021 1:09 pm America/New_York
- Endorsed: 1 time
If you're downloading the files using wget, the --content-disposition option should cause the local files to be given the names they have on the server.
If you're using curl, the --remote-name and --remote-header-name options (together) do the same thing.
Only uncompress when the suffix is present on the local file.
john
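The "only uncompress when the suffix is present" rule could look something like this in a shell script. It is a minimal sketch: needs_bunzip2 and prepare_granule are hypothetical names, and it assumes the download step has already saved the file under its server-side name.

```shell
# Succeeds only when the local file name carries the .bz2 suffix.
needs_bunzip2() {
    case $1 in
        *.bz2) return 0 ;;
        *)     return 1 ;;
    esac
}

# Decompress in place when needed; print the resulting path either way.
prepare_granule() {
    if needs_bunzip2 "$1"; then
        bunzip2 -f "$1" || return 1      # replaces x.PDS.bz2 with x.PDS
        printf '%s\n' "${1%.bz2}"
    else
        printf '%s\n' "$1"               # already uncompressed, leave as is
    fi
}
```

Either way the caller gets back the path of an uncompressed .PDS file, so later steps do not have to care which form was downloaded.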
			
			
									
- 
				oo_processing
- Posts: 340
- Joined: Wed Apr 06, 2005 12:11 pm America/New_York
- Has endorsed: 10 times
- Endorsed: 3 times
John,
The local files have the same name as the ones on the server.
BUT, and a huge BUT..
I have an automated system that creates work-orders and checks old work-orders and all kinds of stuff.
So, I can parse the name to see if it is bz2; however, that name is placed into a hash.
That hash is passed around and used in many ways.
So I create regions, and some may use the same granule (so I do not want to d/l it more than once).
So I have to check many places, including old work-orders, to see whether they need to be remade.
If the old work-order contains a FILE=????.PDS
and now I have a new file when my subscription is checked again
and that file is ????.PDS.bz2
bad things will happen.
It is literally a nightmare scenario for me. And it follows a rewrite of certain scripts to find the anc files, some of which are compressed and some not...
Brock
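One possible way to limit the damage from a FILE=????.PDS entry meeting a later ????.PDS.bz2 listing is to compare granules by a key that ignores the compression suffix. This is a sketch only: granule_key, workorder_file_key and same_granule are hypothetical names, and it assumes a work-order line has exactly the form FILE=name described above.

```shell
# Canonical key for a granule: the name without a trailing .bz2.
granule_key() {
    printf '%s\n' "${1%.bz2}"
}

# Key for a work-order line of the assumed form FILE=<name>.
workorder_file_key() {
    granule_key "${1#FILE=}"
}

# Succeeds when two names refer to the same granule, compressed or not.
same_granule() {
    [ "$(granule_key "$1")" = "$(granule_key "$2")" ]
}
```

Using such a key for the "already downloaded?" and "work-order needs remaking?" checks would leave the stored names untouched while treating the .PDS and .PDS.bz2 forms as one granule.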
			
			
									
- 
				oo_processing
- Posts: 340
- Joined: Wed Apr 06, 2005 12:11 pm America/New_York
- Has endorsed: 10 times
- Endorsed: 3 times
Tommy,
Is this a permanent fix? I have 41 subscriptions and I only download a file once as it may be required for different regions I produce.
Brock
			
			
									
- 
				OB ODPS - towens
- Subject Matter Expert 
- Posts: 465
- Joined: Fri Feb 05, 2021 9:17 am America/New_York
- Has endorsed: 1 time
- Endorsed: 10 times
No, just a suggested temporary workaround.
Tommy
			
			
									
- 
				oo_processing
- Posts: 340
- Joined: Wed Apr 06, 2005 12:11 pm America/New_York
- Has endorsed: 10 times
- Endorsed: 3 times
I meant you said:
"If you append the bz2 and retry, they should be compressed now."
And these are correct now:
[bmurch@dell8 ~]$ curl --retry 5 --retry-delay 2 -d "subID=1843&results_as_file=1" https://oceandata.sci.gsfc.nasa.gov/api/file_search
MOD00.P2020131.1145_1.PDS.bz2
MOD00.P2020131.1320_1.PDS.bz2
MOD00.P2020131.1325_1.PDS.bz2
MOD00.P2020132.1220_1.PDS.bz2
MOD00.P2020132.1225_1.PDS.bz2
MOD00.P2020132.1400_1.PDS.bz2
MOD00.P2020132.1405_1.PDS.bz2
MOD00.P2020133.1305_1.PDS.bz2
MOD00.P2020133.1310_1.PDS.bz2
MOD00.P2020134.1210_1.PDS.bz2
MOD00.P2020134.1215_1.PDS.bz2
MOD00.P2020134.1350_1.PDS.bz2
MOD00.P2020134.1355_1.PDS.bz2
Are all PDS files now compressed, or only the one subscription I used as an example?
Thanks,
Brock
- 
				oo_processing
- Posts: 340
- Joined: Wed Apr 06, 2005 12:11 pm America/New_York
- Has endorsed: 10 times
- Endorsed: 3 times
Tommy,
I'm guessing not:
[bmurch@dell8 ~]$ curl --retry 5 --retry-delay 2 -d "subID=1267&results_as_file=1" https://oceandata.sci.gsfc.nasa.gov/api/file_search
MOD00.P2020131.1510_1.PDS.bz2
MOD00.P2020131.1640_1.PDS.bz2
MOD00.P2020131.1645_1.PDS.bz2
MOD00.P2020131.1820_1.PDS.bz2
MOD00.P2020132.1545_1.PDS.bz2
MOD00.P2020132.1550_1.PDS.bz2
MOD00.P2020132.1725_1.PDS.bz2
MOD00.P2020132.1730_1.PDS.bz2
MOD00.P2020133.1630_1.PDS.bz2
MOD00.P2020133.1635_1.PDS.bz2
MOD00.P2020133.1805_1.PDS.bz2
MOD00.P2020133.1810_1.PDS.bz2
MOD00.P2020133.1815_1.PDS.bz2
MOD00.P2020134.1530_1.PDS
MOD00.P2020134.1535_1.PDS
MOD00.P2020134.1540_1.PDS
			
			
									