S3 to Azure Blob Storage using azcopy

Use this Forum to find information on, or ask a question about, NASA Earth Science data.
Post Reply
karthick_rn
Posts: 1
Joined: Wed Jul 17, 2024 1:47 pm America/New_York
Answers: 0

S3 to Azure Blob Storage using azcopy

by karthick_rn » Thu Jul 18, 2024 7:54 am America/New_York

Hello,

We're using the bulk download script (https://git.earthdata.nasa.gov/projects/LPDUR/repos/hls-bulk-download/browse/getHLS.sh) to transfer HLS data into an Azure Storage account. With more than 7,000 tile IDs to download, the process is quite time-consuming. To accelerate it, we're running the script in parallel across multiple VMs and considering other methods such as `azcopy`, a tool designed for transferring data to and from Azure Storage accounts. It offers relatively fast performance and supports transfers from S3, as documented here: https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-s3

Using the following link: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials, we successfully obtained temporary S3 credentials and configured them as detailed below:

export aws_access_key_id=XXXX
export aws_secret_access_key=XXXX
export aws_session_token=XXXX
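One thing worth double-checking: `azcopy` (like the AWS SDKs) reads the uppercase environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`; lowercase exports like the ones above are typically ignored. Whether `azcopy` honors `AWS_SESSION_TOKEN` for temporary credentials is also worth verifying, since the Earthdata credentials require one. A minimal sketch of the mapping, assuming the s3credentials endpoint returns JSON with `accessKeyId`/`secretAccessKey`/`sessionToken` fields (field names are an assumption based on the usual Earthdata response shape):

```python
import os

# Hypothetical helper: map the JSON body returned by the s3credentials
# endpoint onto the uppercase variable names the AWS SDKs (and azcopy)
# actually read. The lowercase `export aws_access_key_id=...` form is
# not picked up by azcopy.
def creds_to_env(creds: dict) -> dict:
    return {
        "AWS_ACCESS_KEY_ID": creds["accessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["secretAccessKey"],
        "AWS_SESSION_TOKEN": creds["sessionToken"],
    }

def export_creds(creds: dict) -> None:
    """Apply the mapping to the current process environment."""
    os.environ.update(creds_to_env(creds))
```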

Upon attempting to transfer a sample file from S3 directly to Blob storage using the `azcopy` command, we encountered the following error:

Error:
failed to perform copy command due to error: cannot start job due to error: cannot list objects, Access Denied

azcopy command executed:
azcopy cp 'https://lp-prod-protected.s3.us-west-2.amazonaws.com/HLSS30.020/HLS.S30.T37QGC.2024001T075239.v2.0/HLS.S30.T37QGC.2024001T075239.v2.0.SAA.tif' 'https://store1.blob.core.windows.net/user-test' --recursive=true
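Another possibility we considered: the temporary credentials from the s3credentials endpoint are short-lived (on the order of an hour), and an expired session token can also surface as Access Denied. A quick staleness check, assuming the credentials response carries an ISO-8601 `expiration` field (field name is an assumption):

```python
from datetime import datetime, timezone

# Return True when the temporary credentials have lapsed. The
# `expiration` string is assumed to be ISO-8601, possibly with a
# trailing "Z" (which fromisoformat does not accept directly on
# older Pythons, hence the replace()).
def is_expired(expiration: str) -> bool:
    exp = datetime.fromisoformat(expiration.replace("Z", "+00:00"))
    return exp <= datetime.now(timezone.utc)
```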

We are currently investigating whether a "requester pays" configuration is required to copy data from the S3 bucket, or whether other settings need adjustment to enable the transfer to Blob storage. Please share any updates on this.

Additionally, if you have any recommendations on enhancing the efficiency of bulk downloads using the script, we would greatly appreciate your input.


LP DAAC-EDL - dgolon
Posts: 422
Joined: Mon Sep 30, 2019 10:00 am America/New_York
Answers: 0
Has thanked: 31 times
Been thanked: 8 times

Re: S3 to Azure Blob Storage using azcopy

by LP DAAC-EDL - dgolon » Tue Jul 30, 2024 10:07 am America/New_York

Hi @karthick_rn, we are looking into this, but just to confirm: are you working in us-west-2? If not, that could be causing the Access Denied error you are seeing.
Subscribe to the LP DAAC listserv by sending a blank email to lpdaac-join@lists.nasa.gov.

Sign up for the Landsat listserv to receive the most up to date information about Landsat data: https://public.govdelivery.com/accounts/USDOIGS/subscriber/new#tab1.

LP DAAC - afriesz
Subject Matter Expert
Posts: 71
Joined: Tue Nov 12, 2019 4:02 pm America/New_York
Answers: 2
Been thanked: 3 times

Re: S3 to Azure Blob Storage using azcopy

by LP DAAC - afriesz » Mon Aug 26, 2024 6:39 pm America/New_York

@karthick_rn,

Hi, my understanding of how azcopy works is that it attempts to move/copy data from S3 to Blob storage. I don't think this will work with data in Earthdata Cloud because: 1) direct access to data in S3 is restricted to access methods executing within AWS us-west-2, and 2) Earthdata Cloud assets cannot be 'pulled' out of the cloud using an S3 URI. Data can be accessed/downloaded from outside the cloud using the available HTTPS links for each asset, but I suspect that is not what azcopy is set up to use.
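To illustrate the HTTPS route, here is a rough sketch of a parallel downloader over the per-asset HTTPS links, under two assumptions: (1) you already have the list of HTTPS URLs (e.g. from the same granule listing getHLS.sh builds), and (2) an Earthdata Login bearer token authorizes the requests (token handling varies; a `~/.netrc`-based session, as getHLS.sh uses, is another option).

```python
import concurrent.futures
import pathlib
import urllib.request

# Download one asset; the Authorization header is only attached when a
# token is supplied (assumed Earthdata Login bearer token).
def fetch_one(url, out_dir, token=None):
    req = urllib.request.Request(url)
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    dest = out_dir / url.rsplit("/", 1)[-1]
    with urllib.request.urlopen(req) as resp, open(dest, "wb") as f:
        f.write(resp.read())
    return dest

# Fan the downloads out across a thread pool; transfers are I/O-bound,
# so threads (not processes) are enough to keep the pipe full.
def fetch_all(urls, out_dir, token=None, workers=8):
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda u: fetch_one(u, out, token), urls))
```

This is only a sketch; the real script's retry and checksum handling would still be needed for a 7,000-tile run.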

margarite
Posts: 1
Joined: Wed Oct 23, 2024 8:14 am America/New_York
Answers: 0
Been thanked: 1 time

Re: S3 to Azure Blob Storage using azcopy

by margarite » Wed Oct 23, 2024 8:18 am America/New_York

Just use tools like Goodsync and Gs Richcopy 360 to copy directly between Blob and S3; both are easy, fast, and straightforward.

anderas
Posts: 1
Joined: Sat Jan 25, 2025 10:50 am America/New_York
Answers: 0
Been thanked: 1 time

Re: S3 to Azure Blob Storage using azcopy

by anderas » Sat Jan 25, 2025 10:55 am America/New_York

margarite wrote:
> Just use tools like Goodsync and Gs Richcopy 360 to copy directly from the blob
> to S3, both are easy, fast and straightforward
I already use Gs Richcopy 360 for cloud data migration; it is a good choice.
As an alternative, I prefer Syncback Pro, because GoodSync is extremely expensive.

sethdd
Posts: 7
Joined: Thu Mar 28, 2024 2:55 pm America/New_York
Answers: 0

Re: S3 to Azure Blob Storage using azcopy

by sethdd » Thu Jan 30, 2025 6:46 pm America/New_York

@karthick_rn, the issue you're running into with the "list objects" error is similar to what I just posted here: viewtopic.php?t=6406

Could you try copying this file (159 MB) from the ORNL DAAC bucket and see if you get a similar error:

s3://ornl-cumulus-prod-protected/gedi/GEDI_L3_LandSurface_Metrics_V2/data/GEDI03_counts_2019108_2020287_002_02.tif

You can get the temporary credentials here: https://data.ornldaac.earthdata.nasa.gov/s3credentialsREADME
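For a quick check, something like the following builds the `aws s3 cp` invocation for that test object so it can be pasted into a shell running in us-west-2 with the temporary credentials exported (the bucket and key are from the post above; the helper itself is just a hypothetical convenience):

```python
import shlex

BUCKET = "ornl-cumulus-prod-protected"
KEY = ("gedi/GEDI_L3_LandSurface_Metrics_V2/data/"
       "GEDI03_counts_2019108_2020287_002_02.tif")

# Assemble a shell-safe `aws s3 cp` command for the test download.
def s3_cp_command(bucket=BUCKET, key=KEY, dest="."):
    return shlex.join(["aws", "s3", "cp", f"s3://{bucket}/{key}", dest])
```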

sethdd
Posts: 7
Joined: Thu Mar 28, 2024 2:55 pm America/New_York
Answers: 0

Re: S3 to Azure Blob Storage using azcopy

by sethdd » Tue Feb 18, 2025 3:19 pm America/New_York

karthick_rn wrote:
> We're using the bulk download script
> (https://git.earthdata.nasa.gov/projects/LPDUR/repos/hls-bulk-download/browse/getHLS.sh)
> to transfer the HLS data into an Azure Storage account. [...]

Try running your script again to see if it now works. The LP DAAC team resolved the problem I brought up in my post, which had to do with the bucket guidance.

777arc
Posts: 1
Joined: Thu Feb 27, 2025 12:35 pm America/New_York
Answers: 0

Re: S3 to Azure Blob Storage using azcopy

by 777arc » Thu Feb 27, 2025 12:37 pm America/New_York

Downloading that example file from "ornl-cumulus-prod-protected" worked for me, but trying to download or list anything from "lp-prod-protected" gives me an Access Denied error. Is it possible the lp-prod-protected bucket isn't included in the list of buckets that https://data.ornldaac.earthdata.nasa.gov/s3credentials provides access to? Thanks!

-Marc
Last edited by 777arc on Thu Feb 27, 2025 12:40 pm America/New_York, edited 1 time in total.

LP DAAC - dgolon
User Services
Posts: 88
Joined: Tue Dec 03, 2024 2:37 pm America/New_York
Answers: 0
Has thanked: 23 times
Been thanked: 2 times

Re: S3 to Azure Blob Storage using azcopy

by LP DAAC - dgolon » Tue Mar 04, 2025 5:08 pm America/New_York

Hi @777arc, apologies for the delay in response. Please try using the LP DAAC S3 credentials link instead of ORNL's: https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials
