
NYC STEW-MAP Staten Island organizations' website hyperlink webscrape

Metadata Updated: November 21, 2022

The data represent web-scraping of hyperlinks from a selection of environmental stewardship organizations that were identified in the 2017 NYC Stewardship Mapping and Assessment Project (STEW-MAP) (USDA 2017). There are two data sets: 1) the original scrape containing all hyperlinks within the websites and associated attribute values (see "README" file); 2) a cleaned and reduced dataset formatted for network analysis.

For dataset 1: Organizations were selected from the 2017 NYC Stewardship Mapping and Assessment Project (STEW-MAP) (USDA 2017), a publicly available spatial data set about environmental stewardship organizations working in New York City, USA (N = 719). To create a smaller, more manageable sample, all organizations that intersected (i.e., worked entirely within or overlapped) the NYC borough of Staten Island were selected, giving a geographically bounded sample. Only organizations that had working websites the web scraper could access were retained for the study (n = 78). The websites were scraped between 09 and 17 June 2020 to a maximum search depth of ten using the snaWeb package (version 1.0.1, Stockton 2020) in the R computational language environment (R Core Team 2020).
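
To illustrate what a maximum search depth means for a scrape like the one described above, the following is a minimal Python sketch of a depth-limited, breadth-first traversal. It is not the snaWeb implementation, which is written in R and whose internals are not described here. The function name `crawl_to_depth`, the in-memory `links_by_page` mapping (standing in for real page fetches), and the convention that the starting page sits at depth 0 are all assumptions made for illustration.

```python
from collections import deque


def crawl_to_depth(start_page, links_by_page, max_depth=10):
    """Return (source, target, depth) tuples for hyperlinks found within max_depth.

    links_by_page is a hypothetical dict mapping a page to the pages it links to;
    a real scraper would fetch and parse each page instead.
    """
    seen = {start_page}
    queue = deque([(start_page, 0)])  # assumption: the start page is depth 0
    found = []
    while queue:
        page, depth = queue.popleft()
        if depth >= max_depth:
            # Pages at the maximum depth are recorded as targets but not expanded.
            continue
        for target in links_by_page.get(page, []):
            found.append((page, target, depth + 1))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return found


# Hypothetical toy site: a -> b -> c -> d
toy_links = {"a": ["b"], "b": ["c"], "c": ["d"]}
shallow = crawl_to_depth("a", toy_links, max_depth=2)
```

With `max_depth=2` the link from `c` to `d` is never recorded, which is the kind of truncation a shallower search depth would introduce into the resulting hyperlink network.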

For dataset 2: The complete scrape results were cleaned, reduced, and formatted as a standard edge-array (node1, node2, edge attribute) for network analysis. See "README" file for further details.
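
As a rough illustration of the edge-array layout (node1, node2, edge attribute), the Python sketch below reduces raw hyperlink pairs to one row per pair of websites. The actual cleaning rules are documented in the dataset's README and are not reproduced here. Collapsing URLs to their host names, dropping links within the same site, and using the link count as the edge attribute are assumptions chosen for the example, as are the function name `to_edge_array` and the sample URLs.

```python
from collections import Counter
from urllib.parse import urlparse


def to_edge_array(link_rows):
    """Collapse (source_url, target_url) pairs into sorted (node1, node2, count) rows."""
    counts = Counter()
    for source_url, target_url in link_rows:
        node1 = urlparse(source_url).netloc.lower()
        node2 = urlparse(target_url).netloc.lower()
        # Assumption: skip non-web links (no host) and links within one site.
        if not node1 or not node2 or node1 == node2:
            continue
        counts[(node1, node2)] += 1
    return sorted((n1, n2, weight) for (n1, n2), weight in counts.items())


# Hypothetical scraped rows, not taken from the dataset.
sample_rows = [
    ("https://a.org/x", "https://b.org/"),
    ("https://a.org/y", "https://b.org/z"),
    ("https://a.org/", "https://a.org/about"),
    ("https://b.org/", "mailto:someone@example.org"),
]
edges = to_edge_array(sample_rows)
```

Each resulting tuple can be read directly as one edge of a directed, weighted network by most network analysis tools.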

References: R Core Team. (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/. Version 4.0.3.

Stockton, T. (2020). snaWeb Package: An R package for finding and building social networks for a website, version 1.0.1.

USDA Forest Service. (2017). Stewardship Mapping and Assessment Project (STEW-MAP). New York City Data Set. Available online at https://www.nrs.fs.fed.us/STEW-MAP/data/.

This dataset is associated with the following publication: Sayles, J., R. Furey, and M. Ten Brink. How deep to dig: effects of web-scraping search depth on hyperlink network analysis of environmental stewardship organizations. Applied Network Science. Springer Nature, New York, NY, 7: 36, (2022).

Access & Use Information

Public: This dataset is intended for public access and use. License: See this page for license information.

References

https://doi.org/10.1007/s41109-022-00472-0

Dates

Metadata Created Date November 21, 2022
Metadata Updated Date November 21, 2022

Metadata Source

Harvested from EPA ScienceHub

Additional Metadata

Resource Type Dataset
Publisher U.S. EPA Office of Research and Development (ORD)
Maintainer
Identifier https://doi.org/10.23719/1522542
Data Last Modified 2021-05-26
Public Access Level public
Bureau Code 020:00
Schema Version https://project-open-data.cio.gov/v1.1/schema
Harvest Object Id deb440c4-d049-4b4f-8411-c01b6edc7032
Harvest Source Id 04b59eaf-ae53-4066-93db-80f2ed0df446
Harvest Source Title EPA ScienceHub
License https://pasteur.epa.gov/license/sciencehub-license-non-epa-generated.html
Program Code 020:000
Publisher Hierarchy U.S. Government > U.S. Environmental Protection Agency > U.S. EPA Office of Research and Development (ORD)
Related Documents https://doi.org/10.1007/s41109-022-00472-0
Source Datajson Identifier True
Source Hash 5d9f78ee619278a2cebcb5beaddb3169fb6fe8a7
Source Schema Version 1.1
