Grammar transformations of topographic feature type annotations of the U.S. to structured graph data.

Metadata Updated: July 20, 2024

These data were used to examine grammatical structures and patterns within a set of geospatial glossary definitions. The objectives of the study were to analyze the semantic structure of the input definitions, use that information to build RDF graph triples, load the lexicon into knowledge graph software, and run SPARQL queries on the data. The study showed that SPARQL queries could effectively return graph triples that conveyed semantic significance. These data represent and characterize the lexicon of the input text, which is used to form the graph triples.

The data were collected in 2024 by passing text through multiple Python programs that use spaCy (a natural language processing library) and its pre-trained English transformer pipeline. Before the Python programs processed the data, the input definitions were rewritten as natural language and formatted as tabular data. Passages were then tokenized, and each token was characterized by its part-of-speech, tag, dependency relation, dependency head, and lemma. Every word within the lexicon was tokenized. A stop-word list was used only to remove punctuation and symbols from the text; hyphenated words (e.g., bowl-shaped) were kept intact. The tokens' lemmas were then aggregated and counted to find how often each recurs within the lexicon. The same procedure was repeated for noun chunks, using the same glossary definitions.
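The filtering and lemma-counting step described above can be sketched roughly as follows. This is a minimal illustration, not the study's actual code: the token records are hand-written stand-ins for pipeline output, the field names and the `lemma_counts` helper are hypothetical, and treating the stop-word step as a filter on the `PUNCT` and `SYM` part-of-speech labels is an assumption. In spaCy, the corresponding token attributes would be `pos_`, `tag_`, `dep_`, `head`, and `lemma_`.

```python
from collections import Counter

# Hypothetical token records for the fragment "A bowl-shaped depression, a basin."
# Each record carries the five characteristics named in the description.
# The hyphenated word is kept as a single token, as the description states.
TOKENS = [
    {"text": "A", "pos": "DET", "tag": "DT", "dep": "det", "head": "depression", "lemma": "a"},
    {"text": "bowl-shaped", "pos": "ADJ", "tag": "JJ", "dep": "amod", "head": "depression", "lemma": "bowl-shaped"},
    {"text": "depression", "pos": "NOUN", "tag": "NN", "dep": "ROOT", "head": "depression", "lemma": "depression"},
    {"text": ",", "pos": "PUNCT", "tag": ",", "dep": "punct", "head": "depression", "lemma": ","},
    {"text": "a", "pos": "DET", "tag": "DT", "dep": "det", "head": "basin", "lemma": "a"},
    {"text": "basin", "pos": "NOUN", "tag": "NN", "dep": "appos", "head": "depression", "lemma": "basin"},
    {"text": ".", "pos": "PUNCT", "tag": ".", "dep": "punct", "head": "depression", "lemma": "."},
]

# Assumption: punctuation and symbols are identified by their part-of-speech label.
EXCLUDED_POS = {"PUNCT", "SYM"}


def lemma_counts(tokens):
    """Count how often each lemma recurs, skipping punctuation and symbols."""
    return Counter(
        tok["lemma"].lower() for tok in tokens if tok["pos"] not in EXCLUDED_POS
    )


counts = lemma_counts(TOKENS)
for lemma, n in counts.most_common():
    print(lemma, n)
```

The same counting approach would apply to noun chunks by substituting one record per chunk for the per-token records.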

Access & Use Information

Public: This dataset is intended for public access and use.
License: No license information was provided. If this work was prepared by an officer or employee of the United States government as part of that person's official duties, it is considered a U.S. Government Work.

Dates

Metadata Created Date July 20, 2024
Metadata Updated Date July 20, 2024

Metadata Source

Harvested from DOI EDI

Additional Metadata

Resource Type Dataset
Publisher U.S. Geological Survey
Maintainer
@Id http://datainventory.doi.gov/id/dataset/26806c79bb34a86da3e9875aea85288d
Identifier USGS:664cb465d34e1955f5a4f57e
Data Last Modified 20240711
Category geospatial
Public Access Level public
Bureau Code 010:12
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Metadata Catalog ID https://datainventory.doi.gov/data.json
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id f6180333-1b46-4bc7-945a-f1c61e256fd5
Harvest Source Id 52bfcc16-6e15-478f-809a-b1bc76f1aeda
Harvest Source Title DOI EDI
Metadata Type geospatial
Old Spatial -130.5738,21.1114,-62.664,52.3587
Publisher Hierarchy White House > U.S. Department of the Interior > U.S. Geological Survey
Source Datajson Identifier True
Source Hash 0cb21e59caa245cbea8e2728506aba9b08415a60057d4c201bb4d5e743014074
Source Schema Version 1.1
Spatial {"type": "Polygon", "coordinates": [[[-130.5738, 21.1114], [-130.5738, 52.3587], [-62.664, 52.3587], [-62.664, 21.1114], [-130.5738, 21.1114]]]}