Curl Maximum redirects and fails
Sean,
I'm seeing this issue again: as soon as 47 files have been downloaded, curl reports maximum redirects and fails.
Note that I am feeding curl a list precisely to keep the network connections alive.
I've tried interfaces 2607:fe50:0:6330::100 through 2607:fe50:0:6330::109, all with the same results.
All of them are doing this now. Is this a new issue (or the old one again)? I thought it was just a corrupt cookie jar, but I guess not.
Each curl command has its own cookie jar file, and each uses a different network interface.
time curl -b .urs_cookies_109 -c .urs_cookies_109 -L -n --interface 2607:fe50:0:6330::109 --retry 5 --retry-delay 2 --max-time 0 --remote-name-all https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/{$(sed ':a;N;$!ba;s/\n/,/g' /shares/cms_optics/virtual_ant/S4P/bin/fa_density/non_fai_rois/CAPE_COD/x02.trimmed)}
[46/353]: https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/MOD00.A2011341.1455_1.PDS.bz2 --> MOD00.A2011341.1455_1.PDS.bz2
--_curl_--https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/MOD00.A2011341.1455_1.PDS.bz2
100 299M 100 299M 0 0 3916k 0 0:01:18 0:01:18 --:--:-- 3980k
[47/353]: https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/MOD00.A2011343.1440_1.PDS.bz2 --> MOD00.A2011343.1440_1.PDS.bz2
--_curl_--https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/MOD00.A2011343.1440_1.PDS.bz2
100 295M 100 295M 0 0 3907k 0 0:01:17 0:01:17 --:--:-- 3982k
[48/353]: https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/MOD00.A2011350.1445_1.PDS.bz2 --> MOD00.A2011350.1445_1.PDS.bz2
--_curl_--https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/MOD00.A2011350.1445_1.PDS.bz2
0 295M 0 191 0 0 22 0 162d 20h 0:00:08 162d 20h 22
curl: (47) Maximum (50) redirects followed
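(For clarity on the command above: the sed one-liner just joins the lines of the list file into a single comma-separated string, which curl's {a,b,c} URL globbing then expands into one request per file over a shared connection. With coreutils, paste builds the same string more simply:)
# same comma-joined list as the sed one-liner above
paste -sd, /shares/cms_optics/virtual_ant/S4P/bin/fa_density/non_fai_rois/CAPE_COD/x02.trimmed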
AND
time curl -b .urs_cookies_106 -c .urs_cookies_106 -L -n --interface 2607:fe50:0:6330::106 --retry 5 --retry-delay 2 --max-time 0 --remote-name-all https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/{$(sed ':a;N;$!ba;s/\n/,/g' /shares/cms_optics/virtual_ant/S4P/bin/fa_density/non_fai_rois/CAPE_COD/x00)}
[47/400]: https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/MOD00.A2001059.1450_1.PDS.bz2 --> MOD00.A2001059.1450_1.PDS.bz2
--_curl_--https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/MOD00.A2001059.1450_1.PDS.bz2
100 285M 100 285M 0 0 3935k 0 0:01:14 0:01:14 --:--:-- 4004k
[48/400]: https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/MOD00.A2001066.1455_1.PDS.bz2 --> MOD00.A2001066.1455_1.PDS.bz2
--_curl_--https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/MOD00.A2001066.1455_1.PDS.bz2
0 285M 0 191 0 0 21 0 164d 19h 0:00:09 164d 18h 21
curl: (47) Maximum (50) redirects followed
Any advice would be appreciated. Do we have to keep our download lists to 47 lines max? The files are also only coming down at ~4000k (about 4 MB/s), which is much slower than the last time I did bulk downloads. Is it throttled now?
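In the meantime, if 47 really is some per-session ceiling, I suppose I can chop the lists into smaller chunks and loop over them - an untested sketch, using the same cookie jar and interface as above:
split -l 40 x00 x00.part.   # 40-line chunks, safely under 47
for f in x00.part.*; do
  curl -b .urs_cookies_106 -c .urs_cookies_106 -L -n --interface 2607:fe50:0:6330::106 \
       --retry 5 --retry-delay 2 --remote-name-all \
       "https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/{$(paste -sd, "$f")}"
done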
Curl Maximum redirects and fails
Brock,
We've not changed anything. Using the obdaac_download.py script that I wrote (and posted, and made available via SeaDAS), I just pulled down 360 L2 files (8.2 GB) in 18 minutes - roughly 7.8 MB/s, about twice the ~4 MB/s you report, and pretty much the cap of my home wireless network - with no issues. I've asked our network guy to take a peek... maybe he'll respond.
Sean
Curl Maximum redirects and fails
Sean,
I sent a separate message to Chris as well.
I have at least one user seeing this issue on another campus over IPv4 (I use IPv6 where I am).
She reports that she does not have this problem with MERIS and OLCI downloads, which work in exactly the same manner; she can download and process those images fine.
She also found, after many tests, that if she downloads one MERIS or OLCI image first and then goes back to MODIS, it works again for a while.
Brock
Curl Maximum redirects and fails
Brock,
The getfile script is agnostic to the mission - except that OLCI and MERIS require the user to have accepted the appropriate EULA. MODIS doesn't require that, so having accepted them makes no difference to the Earthdata Login step when pulling down MODIS. BTW, I have also accepted those EULAs for my user, and the files I was pulling were MODIS. Try running your cURL commands with increased verbosity to see if anything jumps out at you as to why it's stuck in a redirect loop (but please don't post the output here; if you can't interpret it and simply must share it, attach it as a file).
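For example, something along these lines (a sketch against one of the files from your log; -v writes the request and response headers to stderr):
curl -v -b .urs_cookies_109 -c .urs_cookies_109 -L -n \
     --remote-name https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/MOD00.A2011350.1445_1.PDS.bz2 \
     2> curl_verbose.log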
Sean
Curl Maximum redirects and fails
I'm going to try the verbose output as well. Is there a hard limit on the number of sessions a user can connect with?
If NASA sees me (oo_username) coming in from several interfaces, is that an issue? I have several interfaces for downloading data, and they all use the same username and password (from the .netrc file).
Also, one up-front observation - I see this in every request:
* Couldn't find host oceandata.sci.gsfc.nasa.gov in the .netrc file; using defaults
My .netrc has the Earthdata login. Is that an issue? e.g.:
machine urs.earthdata.nasa.gov login oo_username password ????????????
Brock
Curl Maximum redirects and fails
You could try adding an entry for oceandata.sci.gsfc.nasa.gov with the same username and password you use for urs.earthdata.nasa.gov.
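That is, a second line in ~/.netrc of this form (placeholder credentials, matching the urs line):
machine oceandata.sci.gsfc.nasa.gov login oo_username password ????????????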
I don't have one myself, but maybe its absence is confusing cURL (although that's not likely, since you have had successful downloads).
The only limits we have are on the number of keep-alive requests, but hitting that should just cause your client to reconnect silently - or so the theory goes :wink:
Sean
Curl Maximum redirects and fails
Sean,
What is the hard limit on the keep-alive requests? And is that per connection, or per user?
:)
Brock
Curl Maximum redirects and fails
Sean,
I think there may be some interesting things in the verbose output. (It's still slower than it used to be - an hour to download 47 files :confused: )
This is interesting:
In the 47th file:
< Connection: keep-alive
< Keep-Alive: timeout=60
< Location: /ob/getfile/MOD00.A2003021.1420_1.PDS.bz2
In the 48th file:
< HTTP/1.1 302 Found
< Connection: keep-alive
< Keep-Alive: timeout=60
< Location: https://urs.earthdata.nasa.gov/oauth/authorize?response_type=code&redirect_uri=https%3A%2F%2Foceandata.sci.gsfc.nasa.gov%2Fob%2Fgetfile%2Frestrict&client_id=Z0u-MdLNypXBjiDREZ3roA
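That second Location header is the server bouncing the request back to Earthdata Login instead of serving the file. If I'm reading the flow right (my reconstruction of the OAuth hops, not a captured trace), a healthy request goes roughly:
# GET https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/FILE
#   -> 302 https://urs.earthdata.nasa.gov/oauth/authorize?...            (login/session check)
#   -> 302 https://oceandata.sci.gsfc.nasa.gov/ob/getfile/restrict?code=...
#   -> 302 /ob/getfile/FILE                                              (the actual data)
so if the authorize step never validates the session, the two hosts keep redirecting to each other until curl hits its 50-redirect cap.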
I modified the .netrc as well, without effect:
cat ~/.netrc
machine urs.earthdata.nasa.gov login oo_login password ??????????????
machine oceandata.sci.gsfc.nasa.gov login oo_login password ??????????????
I'm attaching the verbose output as a file, as requested. (attachment 1)
Curl Maximum redirects and fails
I found it helpful to add the entry "machine oceandata.sci.gsfc.nasa.gov ..." to my ~/.netrc. I think both wget and curl have been tweaking their SSO handling, so the specific version and configure options of your curl build could matter. "Troubleshooting Authentication Issues with registry.redhat.io" has examples of using curl to play with tokens.
Curl Maximum redirects and fails
Well, I do know that a little over a year ago I was able to use this same curl version, with the same (multiple) interfaces, to run 10 concurrent downloads in separate terminals on separate interfaces. I reprocessed 40 regions of interest from mission start for Terra, Aqua, and VIIRS - over 250,000 work orders. So I know this curl version has the capacity to handle the keep-alives and downloads. It has not changed:
Name        : curl                              Relocations: (not relocatable)
Version     : 7.19.7                            Vendor: Red Hat, Inc.
Release     : 52.el6                            Build Date: Fri 29 Jan 2016 08:25:34 AM EST
Install Date: Mon 22 Jan 2018 03:01:14 AM EST   Build Host: x86-033.build.eng.bos.redhat.com
Group       : Applications/Internet             Source RPM: curl-7.19.7-52.el6.src.rpm
Of course, that was before the introduction of the .netrc requirement and the -b/-c cookie options in the curl command below, so I can't help but think it is still something on the server side with authentication (see my previous post for the log snippet).
This command fails after the 47th file download, even though the x00 list it reads has only 400 lines (as do all my recent lists). I searched the forum and noticed that others have hit the same issue in the past, from locations outside our university, but with no real solution:
curl -b .urs_cookies_106 -c .urs_cookies_106 -L -n --interface 2607:fe50:0:6330::106 --retry 5 --retry-delay 2 --max-time 0 --remote-name-all https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/{$(sed ':a;N;$!ba;s/\n/,/g' /shares/cms_optics/virtual_ant/S4P/bin/fa_density/non_fai_rois/CAPE_COD/x00)}
The runs I did before without issue looked like the command below. It never failed a year ago, even though its x00 list contained 2500 lines, and it achieved incredible download speeds - I pulled a year of PDS.bz2 files in under 3 hours:
curl --interface 2607:fe50:0:6330::100 --retry 5 --retry-delay 2 --max-time 0 --remote-name-all https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/{$(sed ':a;N;$!ba;s/\n/,/g' /cms_zfs/work_orders/modis/PDS/2006/x00)}
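One more data point on the version question raised above: 7.19.7 is the 2009-era curl that RHEL 6 ships, so its cookie and redirect handling predates years of later fixes. It might be worth a one-file test from a box with a newer curl to rule the client out - a sketch, reusing a file name from the failing list:
curl --version    # 7.19.7 here; any recent build would be a useful comparison
curl -v -L -n -b /tmp/test_cookies -c /tmp/test_cookies \
     --remote-name https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/MOD00.A2001059.1450_1.PDS.bz2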