Crafting URLs to download data? The page automatically blocks me
Posted: Tue Nov 13, 2018 1:02 pm America/New_York
Hello,
I'm trying to download several images of MODIS-Aqua Chlorophyll and SST L2 1 km. I'm using a script like the one below (which I saw in an older topic with the same name, which is why I kept the topic title):
query="?sub=level1or2list&sen=am&per=DAY&day=$dat&n=$n&s=$s&w=$w&e=$e"
wget -qO - \
"$url$query" \
| perl -n -0777 \
-e 'if(/filenamelist&id=(\d+\.\d+)/){' \
-e 'print `wget "'$url'?sub=filenamelist&id=$1&prm=CHL" -qO -`;' \
-e '}' \
-e 'elsif(/(A\d+\.L2_LAC_OC)/){' \
-e 'print "$1\n";' \
-e '}' > temporal.txt
while read -r filename; do
.....
echo "$filename" | wget -B https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/ --content-disposition -i -
done <temporal.txt
rm temporal.txt
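For what it's worth, the download step can also be written without the echo/-i indirection (in the script above, -B just supplies the base URL that each relative name read via -i - is resolved against). A minimal equivalent sketch, assuming GNU wget:

base="https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/"
while read -r filename; do
    # same effect as: echo "$filename" | wget -B "$base" --content-disposition -i -
    wget -q --content-disposition "$base$filename"
done <temporal.txt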
For each day, "temporal.txt" can sometimes contain several swaths, but when I try to download them, some of the files don't exist (I don't know why the page gives me the name of a swath that is supposed to cover my region but doesn't actually exist), and the terminal gets stuck trying to connect to the page until the connection finally breaks, an error is shown, and it moves on to the next file in "temporal.txt". After several errors, the page automatically blocks me. Do you know if there is an option or flag in wget to fix that?
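What I have in mind is something along these lines, probing each name first and pacing the requests, though I don't know whether this is the right approach (the flag values are placeholders, and I'm assuming the getfile endpoint answers the --spider probe):

base="https://oceandata.sci.gsfc.nasa.gov/cgi/getfile/"
while read -r filename; do
    # --spider checks whether the file exists without downloading it,
    # and --tries/--timeout keep a dead connection from hanging the loop
    if wget --spider -q --tries=2 --timeout=30 "$base$filename"; then
        wget -q --tries=2 --timeout=30 --content-disposition "$base$filename"
    else
        echo "skipping $filename (not found on server)" >&2
    fi
    sleep 5   # pause between requests so the server is not hammered
done <temporal.txt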
Thanks