Wget constantly redirected

OB.DAACx - SeanBailey
Posts: 1519
Joined: Wed Sep 18, 2019 6:15 pm America/New_York
Answers: 1
Been thanked: 9 times

by OB.DAACx - SeanBailey » Fri Feb 28, 2020 3:40 pm America/New_York

Kim,

The -c (continue) option won't work, as our servers don't support download continuation.

The -N option isn't working (we believe) because wget is doing a HEAD request, which is being denied by the remote authentication server during the authentication redirection we do.
cURL works....

I will (hopefully soon) be posting an updated Python script under https://oceancolor.gsfc.nasa.gov/data/download_methods/ that will also download files only if the remote version is newer than the local copy.

Sean
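
The "only download if the remote version is newer" idea above can be sketched roughly as follows. This is a minimal illustration, not the OB.DAAC script: the helper names are hypothetical, and it assumes the server honors the standard If-Modified-Since header on a GET request and returns a Last-Modified header.

```python
import os
from datetime import datetime, timezone
from email.utils import formatdate, parsedate_to_datetime


def if_modified_since_header(local_path):
    """Return request headers that ask for the file only if it is newer
    than the local copy (hypothetical helper)."""
    if not os.path.exists(local_path):
        return {}  # no local copy, so request unconditionally
    mtime = os.path.getmtime(local_path)
    return {"If-Modified-Since": formatdate(mtime, usegmt=True)}


def remote_is_newer(last_modified, local_path):
    """Compare a Last-Modified header value with the local file's mtime."""
    if not os.path.exists(local_path):
        return True
    remote = parsedate_to_datetime(last_modified)
    local = datetime.fromtimestamp(os.path.getmtime(local_path), tz=timezone.utc)
    return remote > local


# Possible use with the requests library (assumption: the server answers
# 304 Not Modified when the local copy is current):
#   r = requests.get(url, headers=if_modified_since_header(path))
#   if r.status_code == 304: skip the file
```

Because this uses a conditional GET rather than a HEAD request, it sidesteps the HEAD denial described above, provided the server supports conditional requests at all.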

khyde
Posts: 39
Joined: Mon Dec 04, 2006 11:01 am America/New_York
Answers: 0

by khyde » Mon Mar 02, 2020 5:25 pm America/New_York

Thanks for the update Sean,

I'm not very familiar with curl, but will give it a try...

I did play around with curl a bit and tried the -C - option to continue a download, but that also returns an error: curl: (52) Empty reply from server.
It also appears that -z requires an input and doesn't work as well as wget's -N.  I typically use wget to download from a list of files and it looks like that is an option with curl as well, but I don't know if I can use -z with the download file list.

What are the chances that the wget -N bug can be fixed?  I'm just wondering if I should put the time in to overhauling my download methods or wait for an update on your end.

I also took a look at the current Python download script and it isn't clear to me how you would use that to download a specific set of files.  I'm not overly familiar with Python so I'm sure I'm missing something obvious.

Thanks again!
Kim
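
On the question of combining curl's -z with a list of files: one possible approach is to build one curl command per entry, passing the local copy to -z only when it exists. The sketch below only constructs the argument lists; the function names, the cookie-jar path, and the URLs are assumptions for illustration, and it is not confirmed here that the server honors the time condition.

```python
import os


def curl_command(url, odir=".", cookie_jar="~/.urs_cookies"):
    """Build a curl argument list for one URL (hypothetical helper)."""
    local = os.path.join(odir, os.path.basename(url))
    jar = os.path.expanduser(cookie_jar)  # assumed cookie-jar location
    # -L follows redirects, -n reads credentials from ~/.netrc,
    # -b/-c read and write cookies, -o names the output file
    cmd = ["curl", "-L", "-n", "-b", jar, "-c", jar, "-o", local]
    if os.path.exists(local):
        # -z FILE makes the request conditional on FILE's modification time
        cmd += ["-z", local]
    cmd.append(url)
    return cmd


def commands_from_filelist(listfile, odir="."):
    """Read one URL per line and return a curl command for each."""
    with open(listfile) as fh:
        urls = [line.strip() for line in fh if line.strip()]
    return [curl_command(u, odir) for u in urls]
```

Each returned list could then be handed to subprocess.run(cmd, check=True), which keeps the per-file -z argument tied to the matching local file.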

by OB.DAACx - SeanBailey » Wed Mar 04, 2020 3:37 pm America/New_York

Kim,
I may not have been clear in my last post, so let me clarify: our servers do NOT (at least not any longer) support download continuation, regardless of which client is used.
So, no need to test that capability at your end :grin:

A modification to the Python download script will soon be posted on the download methods page.
It will include a fully functional script.  All that is required is a Python installation with the requests library installed.

I have tested it under Mac, Linux and Windows with Python 2.7 and 3.7, so it should work for you (and be more consistent than wget has been...)
I'll try to remember to post a reply once the script is available.

Here's a preview of the usage:

usage: obdaac_download.py [-h] [-v] [--filelist FILELIST]
                          [--http_manifest HTTP_MANIFEST] [--odir ODIR]
                          [--uncompress] [--force]
                          [filename]

Download files archived at the OB.DAAC

positional arguments:
  filename              name of the file (or the URL of the file) to retreive

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         print status messages
  --filelist FILELIST   file containing list of filenames to retreive, one per
                        line
  --http_manifest HTTP_MANIFEST
                        URL to http_manifest file for OB.DAAC data order
  --odir ODIR           full path to desired output directory; defaults to
                        current working directory:
                        /accounts/swbaile1/Downloads/junk
  --uncompress          uncompress the retrieved files (if compressed)
  --force               force download even if file already exists locally

Provide one of either filename, --filelist or --http_manifest. NOTE: For
authentication, a valid .netrc file in the user home ($HOME) is required,
e.g.: machine urs.earthdata.nasa.gov login USERNAME password PASSWD


Regards,
Sean
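
Since the usage text above requires a valid .netrc entry for urs.earthdata.nasa.gov, a quick pre-flight check can save a failed run. The following is a minimal sketch using Python's standard netrc module; the function name is hypothetical and this check is not part of the script itself.

```python
import netrc


def has_urs_credentials(netrc_path, host="urs.earthdata.nasa.gov"):
    """Return True if netrc_path has a login and password for host."""
    try:
        auth = netrc.netrc(netrc_path).authenticators(host)
    except (OSError, netrc.NetrcParseError):
        return False  # file missing, unreadable, or malformed
    # authenticators() returns (login, account, password) or None
    return auth is not None and bool(auth[0]) and bool(auth[2])
```

For example, has_urs_credentials(os.path.expanduser("~/.netrc")) (with os imported) checks the file in the home directory.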

by OB.DAACx - SeanBailey » Wed Mar 04, 2020 5:21 pm America/New_York

As promised :smile:
The script is now available from https://oceancolor.gsfc.nasa.gov/data/download_methods

Sean

by khyde » Thu Mar 05, 2020 11:59 am America/New_York

Thanks Sean,

I will try your new Python script.  This looks very handy.

In hindsight, I'm not sure the -N option (get the file only if it is newer) is necessary for me. I compare the checksums before creating my list of files to download, so I can just remove any "old" files from my system and add the replacements to the download list.

Hopefully I will have some time to test and implement it before the weekend (I've got some downloading to catch up on :wink:).

Kim
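
The checksum-based workflow described above might look something like this sketch. It assumes SHA-1 checksums and a dictionary of expected values keyed by filename; both are illustrative assumptions, as are the function names, since the post does not say which algorithm or format is used.

```python
import hashlib
import os


def sha1sum(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_download(expected, local_dir):
    """Return the sorted names that are missing locally or whose checksum
    differs; expected maps filename -> expected hex checksum."""
    todo = []
    for name, checksum in expected.items():
        path = os.path.join(local_dir, name)
        if not os.path.exists(path) or sha1sum(path) != checksum.lower():
            todo.append(name)
    return sorted(todo)
```

Writing the returned names to a text file, one per line, gives a list in the form the --filelist option expects.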

gnwiii
Posts: 713
Joined: Fri Jan 29, 2021 5:51 pm America/New_York
Answers: 2
Has thanked: 1 time

by gnwiii » Thu Mar 05, 2020 1:38 pm America/New_York

obdaac_download.py ends up here with MS-DOS <CR-LF> line endings.  The script works nicely in Cygwin. On macOS the script displays the help text (using "python3 obdaac_download.py"), but the output from the file command is garbled:
$ file obdaac_download.py
script text executableython
$ dos2unix obdaac_download.py
dos2unix: converting file obdaac_download.py to Unix format...
$ file obdaac_download.py
obdaac_download.py: a python script text executable

... switching to Linux (more to come) ... back, but now using a Linux VM:
$ python --version
Python 3.7.6

$ file obdaac_download.py
obdaac_download.py: Python script, ASCII text executable, with CRLF line terminators
$ obdaac_download.py
/usr/bin/env: ‘python\r’: No such file or directory
$ dos2unix obdaac_download.py
dos2unix: converting file obdaac_download.py to Unix format...


The obdaac_download.py script "works (at least once) for me" on Debian, Ubuntu, Fedora, macOS (El Capitan), and Windows 10 (Cygwin).  For those living with unreliable internet, there are many documents that discuss debugging and troubleshooting Python requests.
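
Where dos2unix is not installed, the same line-ending fix can be done in a few lines of Python. This is a minimal sketch with hypothetical function names; it rewrites the file in place, so keep a copy if that matters.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace Windows CRLF line endings with Unix LF."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path):
    """Convert a file to LF endings in place; return True if it changed."""
    with open(path, "rb") as fh:
        original = fh.read()
    converted = crlf_to_lf(original)
    if converted != original:
        with open(path, "wb") as fh:
            fh.write(converted)
    return converted != original
```

Calling convert_file("obdaac_download.py") removes the trailing carriage return from the "#!/usr/bin/env python" line, which is what produces the "python\r" error shown above.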

by OB.DAACx - SeanBailey » Thu Mar 05, 2020 6:06 pm America/New_York

Thanks for testing :smile:

Yes, it appears that the version downloaded from the web server has the CRLF endings, which cause issues on Mac and Linux...bummer.

A version of the script will eventually be part of the SeaDAS distribution, which will definitely be free of the pesky ^Ms.

We'll add a buyer-beware note to the web page.

Sean
