Query a public dataset with the BigQuery Java client library

Query a public dataset with the BigQuery Java client library

Learn how to:

  1. Activate Cloud Shell in a Google Cloud project.
  2. Open the Cloud Shell Editor.
  3. Prepare dependencies for queries.
  4. Query a public dataset in BigQuery.
  5. Clean up.

Estimated time to complete:

Click Start to begin.

Activate Cloud Shell in a Google Cloud project

  1. If you do not enable billing for a project, you automatically work in the BigQuery sandbox. The BigQuery sandbox lets you learn BigQuery with a limited set of BigQuery features at no charge. If you do not plan to use your project beyond this document, we recommend that you use the BigQuery sandbox.

  2. Click Activate Cloud Shell. Show me

To learn how to open the Cloud Shell Editor, click Next.

Open the Cloud Shell Editor

  1. In Cloud Shell, create a new Java project using Apache Maven:

    mvn archetype:generate \
        -DgroupId=com.google.app \
        -DartifactId=bigquery-java-quickstart \
        -DinteractiveMode=false
    

    This command creates a Maven project that's named bigquery-java-quickstart.

    The output is similar to the following. Several lines are omitted to simplify the output.

    [INFO] Scanning for projects...
    ...
    [INFO] Building Maven Stub Project (No POM) 1
    ...
    [INFO] BUILD SUCCESS
    ...
    

    There are many dependency management systems that you can use other than Maven. For more information, learn how to set up a Java development environment to use with client libraries.

  2. Rename the App.java file that Maven creates by default:

    mv \
        bigquery-java-quickstart/src/main/java/com/google/app/App.java \
        bigquery-java-quickstart/src/main/java/com/google/app/SimpleApp.java
    
  3. Open the Cloud Shell Editor:

    cloudshell workspace bigquery-java-quickstart
    
  4. If you are prompted whether to synchronize the Java classpath or configuration, click Always.

    If you are not prompted and encounter an error that is related to the classpath during this walkthrough, do the following:

    1. Click File > Preferences > Open Settings (UI).
    2. Click Extensions > Java.
    3. Scroll to Configuration: Update Build Configuration and select automatic.

To learn how to prepare dependencies for queries, click Next.

Prepare dependencies for queries

  1. In the Explorer pane, locate your BIGQUERY-JAVA-QUICKSTART project.

  2. Click the pom.xml file to open it.

  3. Inside the <dependencies> tag, add the following dependency after any existing ones. Do not replace any existing dependencies.

    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>google-cloud-bigquery</artifactId>
    </dependency>
    
  4. On the line following the closing tag (</dependencies>), add the following:

    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>com.google.cloud</groupId>
          <artifactId>libraries-bom</artifactId>
          <version>26.1.5</version>
          <type>pom</type>
          <scope>import</scope>
        </dependency>
      </dependencies>
    </dependencyManagement>
    

To learn how to query a public dataset in BigQuery, click Next.

Query a public dataset in BigQuery

  1. In the Explorer pane, in your BIGQUERY-JAVA-QUICKSTART project, click src > main/java/com/google/app > SimpleApp.java. The file opens.

  2. To create a query against the bigquery-public-data.stackoverflow dataset, leave the first line of the file (package com.google.app;), and replace the remaining contents of the file with the following code:

    
    import com.google.cloud.bigquery.BigQuery;
    import com.google.cloud.bigquery.BigQueryException;
    import com.google.cloud.bigquery.BigQueryOptions;
    import com.google.cloud.bigquery.FieldValueList;
    import com.google.cloud.bigquery.Job;
    import com.google.cloud.bigquery.JobId;
    import com.google.cloud.bigquery.JobInfo;
    import com.google.cloud.bigquery.QueryJobConfiguration;
    import com.google.cloud.bigquery.TableResult;
    
    
    public class SimpleApp {
    
      public static void main(String... args) throws Exception {
        // TODO(developer): Replace these variables before running the app.
        String projectId = "MY_PROJECT_ID";
        simpleApp(projectId);
      }
    
      public static void simpleApp(String projectId) {
        try {
          BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
          QueryJobConfiguration queryConfig =
              QueryJobConfiguration.newBuilder(
                      "SELECT CONCAT('https://stackoverflow.com/questions/', "
                          + "CAST(id as STRING)) as url, view_count "
                          + "FROM `bigquery-public-data.stackoverflow.posts_questions` "
                          + "WHERE tags like '%google-bigquery%' "
                          + "ORDER BY view_count DESC "
                          + "LIMIT 10")
                  // Use standard SQL syntax for queries.
                  // See: https://cloud.google.com/bigquery/sql-reference/
                  .setUseLegacySql(false)
                  .build();
    
          JobId jobId = JobId.newBuilder().setProject(projectId).build();
          Job queryJob = bigquery.create(JobInfo.newBuilder(queryConfig).setJobId(jobId).build());
    
          // Wait for the query to complete.
          queryJob = queryJob.waitFor();
    
          // Check for errors
          if (queryJob == null) {
            throw new RuntimeException("Job no longer exists");
          } else if (queryJob.getStatus().getExecutionErrors() != null
              && queryJob.getStatus().getExecutionErrors().size() > 0) {
            // TODO(developer): Handle errors here. An error here do not necessarily mean that the job
            // has completed or was unsuccessful.
            // For more details: https://cloud.google.com/bigquery/troubleshooting-errors
            throw new RuntimeException("An unhandled error has occurred");
          }
    
          // Get the results.
          TableResult result = queryJob.getQueryResults();
    
          // Print all pages of the results.
          for (FieldValueList row : result.iterateAll()) {
            // String type
            String url = row.get("url").getStringValue();
            String viewCount = row.get("view_count").getStringValue();
            System.out.printf("%s : %s views\n", url, viewCount);
          }
        } catch (BigQueryException | InterruptedException e) {
          System.out.println("Simple App failed due to error: \n" + e.toString());
        }
      }
    }

    The query returns the top 10 most viewed Stack Overflow pages and their view counts.

  3. Right-click SimpleApp.java and click Run Java. If you are prompted to authorize Cloud Shell and agree to the terms, click Authorize.

    The result is similar to the following:

    https://stackoverflow.com/questions/35159967 : 170023 views
    https://stackoverflow.com/questions/22879669 : 142581 views
    https://stackoverflow.com/questions/10604135 : 132406 views
    https://stackoverflow.com/questions/44564887 : 128781 views
    https://stackoverflow.com/questions/27060396 : 127008 views
    https://stackoverflow.com/questions/12482637 : 120766 views
    https://stackoverflow.com/questions/20673986 : 115720 views
    https://stackoverflow.com/questions/39109817 : 108368 views
    https://stackoverflow.com/questions/11057219 : 105175 views
    https://stackoverflow.com/questions/43195143 : 101878 views
    

You have successfully queried a public dataset with the BigQuery Java client library.

To avoid incurring charges to your account and learn about next steps, click Next.

Next steps

Keep the resources that you created and do more with BigQuery, or clean up to avoid billing charges.

Do more with BigQuery

Clean up

To avoid incurring charges to your Google Cloud account, either delete your Google Cloud project, or delete the resources that you created in this walkthrough.

Delete the project

If you created a new project to learn about BigQuery and you no longer need the project, delete it. Be aware that deleting a project deletes everything in the project and custom project IDs are lost.

Delete the resources

If you used an existing project, delete the bigquery-java-quickstart folder that you created:

  1. In Cloud Shell, move up a directory:

    cd ..
    
  2. Delete the resources that you created:

    rm -R bigquery-java-quickstart
    

    The -R flag deletes all assets in a folder.