API Documentation

About the API

The USPVDB API allows for programmatic access to the United States Large-Scale Solar Photovoltaic Database. The API was created to extend the visibility of the USPVDB, expand its user base, and support more productive internal workflows. The availability of the public API makes it possible for third-party developers to build value-added applications leveraging the USPVDB. The API is HTTP-based (served over HTTPS) and is compatible with any programming language that has an HTTP library (it can also be queried directly in your browser). In this documentation, we'll show some examples of USPVDB API requests using cURL, a command-line tool and library for transferring data with different parameters and methods. cURL provides a generic way to demonstrate HTTP requests and responses and allows users to translate similar requests into their language of choice. The USPVDB API endpoint conforms to the design principles of Representational State Transfer (REST) and uses the JSON data format for responses. Retrieval of facility data from the USPVDB API requires a standard GET request. Additional methods that submit or change USPVDB data require token-based authentication.

Resource Endpoint

Resource endpoints referenced in this documentation are only accessible via HTTPS and have a base path of https://eersc.usgs.gov/api/uspvdb/v1/. The base path, when combined with a resource and any query parameters, constitutes the full endpoint request. As an example, let's say you wanted to get facility-level information from the USPVDB for facility ID 400004. To do this, you'd simply combine the base path https://eersc.usgs.gov/api/uspvdb/v1 with the desired resource /projects and append your query string ?&case_id=eq.400004. The full request is shown below (if you don't have cURL, simply paste the quoted URL below into your browser to see the response).

curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&case_id=eq.400004"
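
For readers who would rather script the request than type it, the same URL can be assembled programmatically. The following is a minimal Python sketch and not part of any official USPVDB tooling: the helper name build_url is hypothetical, no request is actually sent, and the leading & that follows ? in this documentation's examples is omitted on the assumption that it is optional.

```python
from urllib.parse import urlencode

BASE_PATH = "https://eersc.usgs.gov/api/uspvdb/v1"  # base path from this documentation


def build_url(resource, params):
    """Join the base path, a resource name, and query parameters into one URL.

    build_url is a hypothetical helper name used only for illustration.
    """
    # Keep characters used by the documented filter syntax readable
    # instead of percent-encoding them.
    query = urlencode(params, safe="(),*:.")
    return f"{BASE_PATH}/{resource.strip('/')}?{query}"


url = build_url("projects", {"case_id": "eq.400004"})
print(url)
```

The resulting string can be pasted into a browser or handed to any HTTP client.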

API Response

The response to the above request (below) is returned as JavaScript Object Notation (JSON) data. JSON is a lightweight, human-readable, language-independent format for structuring data. It is used primarily to transmit data between a server and a client web application. In the query above, we're requesting data for a single object (case_id = 400004). An object is indicated by curly brackets, and everything inside the curly brackets is part of the object.

{
    "case_id": 400004,
    "multi_poly": "multi",
    "eia_id": 645,
    "p_state": "FL",
    "p_county": "Hillsborough",
    "ylat": 27.7802,
    "xlong": -82.3911,
    "p_area": 830853,
    "p_img_date": 20220420,
    "p_dig_conf": 4,
    "p_name": "Big Bend",
    "p_year": 2017,
    "p_pwr_reg": "TEC",
    "p_tech_pri": "PV",
    "p_tech_sec": "thin-film",
    "p_axis": "single-axis",
    "p_azimuth": 180,
    "p_tilt": 20,
    "p_battery": "batteries",
    "p_cap_ac": 19,
    "p_cap_dc": 23,
    "p_type": "greenfield",
    "p_agrivolt": "non-agrivoltaic",
    "p_zscore": -0.97
}

The two significant parts that make up JSON are keys and values. Together they make a key/value pair. The facility object returned above has 24 key/value pairs. Key/value pairs are comma-separated and follow a specific syntax: the key, followed by a colon, followed by the value. The first line of the response above is the key/value pair "case_id": 400004, where the key is "case_id" and the value is 400004. A key is always a string enclosed in quotation marks, whereas a value can be a string, number, Boolean, null, array, or object.
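
To see how keys and values map onto a programming language's types, the short Python sketch below parses an abbreviated copy of the facility object shown above with the standard json module. It is only an illustration; the body string is hand-copied rather than fetched from the API.

```python
import json

# Abbreviated, hand-copied version of the facility object shown above.
body = '{"case_id": 400004, "p_state": "FL", "ylat": 27.7802, "p_name": "Big Bend"}'

facility = json.loads(body)  # a JSON object becomes a Python dict

for key, value in facility.items():
    # Keys are always strings; values keep their JSON type.
    print(f"{key!r}: {value!r} ({type(value).__name__})")
```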

Key/Value Codes

Upon a successful response (see Status Codes section), the USPVDB API will return facility objects consisting of key/value pairs. Below is a list of USPVDB keys, related value types, and an explanation of what each key means. Solar facility attributes were drawn from Energy Information Administration (EIA) Form 860 data, which are self-reported by facility operators. These data include attributes such as: commercial operation date, location coordinates (latitude and longitude), MW of capacity (in both DC and AC terms), module type (e.g., crystalline silicon, thin film), axis type (fixed-tilt versus single-axis tracking), facility name, and regional power authority. In cases where data from these sources were unknown or missing, a "null" value was assigned to the key.

Key	Value Type	Key Description
case_id	number (integer)	Unique stable identification number.
multi_poly	string	Indicates the facility's polygon type. single— facility is represented by a single part polygon. multi— facility is represented by multipart polygon composed of at least two discontinuous polygons, sharing a single record.
eia_id	number (integer)	Unique facility identifier from Energy Information Administration (EIA), may be used to link with other EIA data fields.
p_state	string	State where facility is located.
p_county	string	County where facility is located.
ylat	number (float)	Latitude value of a point representation of the LSPV facility's location. For single-array facilities, values are calculated in the center of the array. For multi-part polygons, values are generated within the array that is closest to the centroid of the multipart polygon.
xlong	number (float)	Longitude value of a point representation of the LSPV facility's location. For single-array facilities, values are calculated in the center of the array. For multi-part polygons, values are generated within the array that is closest to the centroid of the multipart polygon.
p_area	number (float)	Area of the facility array(s) in square meters (m²).
p_img_date	number (integer)	Date of the aerial image used to confirm the facility location and geometry. Derived from aerial image vendor (Maxar) metadata.
p_dig_conf	number (integer)	Level of confidence in project location. 1— Multiphase facility or multiple EIA records with identical location. Single polygon used to represent multiple facilities indistinguishable from one another; attributes may not reflect full scope of facilities. 2—Multiple polygons created, but EIA records are unclear; attributes may not reflect full scope of facilities. 3— Polygon reflects only a part of the facility due to poor image quality; area of polygon may not reflect the full size of array(s). 4— Facility polygon created with high confidence.
p_name	string	Facility name.
p_year	number (integer)	Year in which facility installation was completed.
p_pwr_reg	string	Common abbreviation of regional power authority name.
p_tech_pri	string	Electric generation technology type.
p_tech_sec	string	Additional detail on panel type.
p_axis	string	Array axis type.
p_azimuth	number (integer)	Array azimuth (i.e., east-west orientation) in degrees (°).
p_tilt	number (integer)	Tilt angle of panels (i.e., angle of panels from horizontal) in degrees (°).
p_battery	string	Indicator of the presence of battery storage at the facility.
p_cap_ac	number (float)	Facility AC capacity in megawatts (MW).
p_cap_dc	number (float)	Facility DC capacity in megawatts (MW).
p_type	string	General categorization of facility. greenfield—greenfield sites represent the majority of LSPV facilities and occupy land that may have previously been wildland, urbanized, cultivated, or reclaimed. RCRA—Resource Conservation and Recovery Act (RCRA) sites are a specific category of commercial, industrial, and federal facilities that treat, store or dispose of hazardous wastes and that require cleanup under the RCRA Hazardous Waste Corrective Action Program. superfund—superfund sites are inactive or abandoned contaminated facilities or locations where there is an active release or threatened release into the environment of hazardous substances that have been dumped, discharged, emitted or otherwise improperly managed. These sites may include manufacturing and industrial facilities, processing plants, landfills, and mining sites, among others. AML—sites include abandoned hardrock mines and mineral processing sites listed in the Superfund Enterprise Management System. landfill—sites that have been designated as landfills in EPA's RE-Powering Matrix. landfill named—assigned in cases where EPA did not identify the site as a landfill, but the facility name includes the word "landfill." It is possible that these sites have been sufficiently cleaned or were never contaminated to the point of meeting the PCSC designation; thus, they are distinguished from EPA designated landfill sites. PCSC—when no specific designation is provided in EPA's RE-Powering Matrix, "brownfield" sites were assigned to a generalized PCSC facility type.
p_agrivolt	string	Agrivoltaic facilities make use of the land between panel rows and surrounding arrays for agricultural (i.e., crop production or grazing) and/or ecosystem services (e.g., pollinator habitat). Agrivoltaic projects are categorized into the following designations: crop, crop,es, es, grazing, grazing,es, non-agrivoltaic.
p_zscore	number (float)	The Z-score of (p_cap_dc/p_area). A Z-score measures how far a record is from the mean of all records in the field in units of standard deviations. Records with high or low Z-scores may have an error in either p_cap_dc or p_area.
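
The p_zscore description can be made concrete with a small calculation. The Python sketch below assumes the conventional definition z = (x - mean) / standard deviation applied to the ratio p_cap_dc / p_area, and it assumes the population standard deviation; this documentation does not state which variant the USPVDB uses. The helper name density_zscores is hypothetical, and the three records are made-up values, not database rows.

```python
from statistics import mean, pstdev


def density_zscores(records):
    """Return the z-score of p_cap_dc / p_area for each record.

    Hypothetical helper; assumes the population standard deviation.
    """
    ratios = [r["p_cap_dc"] / r["p_area"] for r in records]
    mu = mean(ratios)
    sigma = pstdev(ratios)
    return [(x - mu) / sigma for x in ratios]


# Made-up values for illustration only.
sample = [
    {"p_cap_dc": 10.0, "p_area": 100000.0},
    {"p_cap_dc": 20.0, "p_area": 100000.0},
    {"p_cap_dc": 30.0, "p_area": 100000.0},
]
zscores = density_zscores(sample)
print([round(z, 2) for z in zscores])
```

Records whose ratio sits far from the mean get a z-score of large magnitude, which is why such records are worth checking for errors in p_cap_dc or p_area.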

Status Codes

The USPVDB API returns appropriate HTTP status codes for every request. Response status codes indicate whether a specific request has been successfully completed. USPVDB API responses are grouped into four classes: successful responses (2XX), redirects (3XX), client errors (4XX), and server errors (5XX). An exhaustive list of status codes is defined in RFC 2616. The most common USPVDB API status codes are listed in the table below.

Code	Description
200	OK - The request has succeeded. Clients can read the result of the request in the body and the headers of the response. The response is sent after a successful GET request.
201	Created - The request has succeeded, and a new resource has been created as a result of it. The response is sent after a successful POST request.
206	Partial Content - The request has succeeded and contains the requested range of data, as described in the Range header of the request.
400	Bad Request - The request could not be understood by the server due to malformed syntax. The message body will contain more information.
401	Unauthorized - The server understood the request but is refusing to fulfill it. This error is likely due to authentication not being provided or an invalid token being sent. The message body will contain more information.
404	Not Found - The requested resource could not be found. This error can be due to a temporary or permanent condition.
409	Conflict - A request conflicts with the current state of the server. Most likely due to duplicate key value when attempting a POST. The message body will contain more information.
500	Internal Server Error - Indicates that the server encountered an unexpected condition that prevented it from fulfilling the request.
502	Bad Gateway - The USPVDB API is down or being upgraded.
503	Service Unavailable - The server is currently unable to handle the request, commonly due to the server being overloaded.
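
Client code often only needs to know which of the four classes a response falls into. The Python sketch below shows one way to do that; status_class is a hypothetical helper name, and the labels simply mirror the class names used in this section.

```python
def status_class(code):
    """Map an HTTP status code to the response class named in this section."""
    classes = {
        2: "successful response",
        3: "redirect",
        4: "client error",
        5: "server error",
    }
    # Integer division by 100 yields the leading digit of a three-digit code.
    return classes.get(code // 100, "other")


for code in (200, 206, 401, 503):
    print(code, status_class(code))
```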

Pagination

When making requests to the USPVDB API, it's possible to receive a large amount of data in the response. The API has controls in place that let you paginate the results, to make returns more manageable and avoid extraneous network traffic. These controls let you set limit and offset via either request headers, or query parameters. Every API response contains Content-Range headers which describe the size of results. By default, the response includes the current range, and, if sent with Prefer: count=exact in the request, the total number of results. When using request headers to limit results, you simply specify the range of rows desired. For example, the request below returns the first 250 facilities (note that the offset numbering is zero-based).

GET /projects HTTP/1.1
Range: 0-249
Prefer: count=exact

Since we sent our preference Prefer: count=exact in the request header, we get a Content-Range response (below) that includes the total size of the table (3,698 records). This is particularly useful when rendering a 'last' link in a pagination control.

HTTP/1.1 206 Partial Content
Content-Range: 0-249/3698
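
A client can read the Content-Range value to work out how many pages remain. The following Python sketch parses the header value shown above; parse_content_range is a hypothetical helper, and treating a total of * as "unknown" is an assumption about what the server sends when Prefer: count=exact is not supplied.

```python
import math


def parse_content_range(value):
    """Split a Content-Range value such as '0-249/3698' into (first, last, total).

    total is None when the server reports '*' instead of a count, which is
    assumed to happen when Prefer: count=exact was not sent.
    """
    span, _, total = value.partition("/")
    first, _, last = span.partition("-")
    return int(first), int(last), (None if total in ("", "*") else int(total))


first, last, total = parse_content_range("0-249/3698")
page_size = last - first + 1          # rows in this page (ranges are inclusive)
pages = math.ceil(total / page_size)  # pages needed to cover the whole table
print(first, last, total, pages)
```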

Paging the dataset can also be achieved by passing offset and limit as query parameters. In the example below, limit and offset were added to the query so that 50 facility records (limit) are returned after skipping the first 300 records (offset); that is, records 301 through 350 are returned. Note that the API always returns range headers in the response, even if you use query parameters to paginate the query.

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&offset=300&limit=50"
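
When the whole table is needed, the offset and limit values for each page can be generated in a loop. The Python sketch below is illustrative only: page_params is a hypothetical helper, 3698 is the table size quoted in the Content-Range example above, and the page size of 1000 is an arbitrary choice.

```python
def page_params(total, limit):
    """Yield {'offset': ..., 'limit': ...} dicts that together cover `total` rows."""
    for offset in range(0, total, limit):
        yield {"offset": offset, "limit": limit}


# 3698 rows split into pages of at most 1000 rows each.
pages = list(page_params(3698, 1000))
for params in pages:
    print(params)
```

Each dictionary can be turned into the offset and limit query parameters of one request.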

Sorting

You can reorder the records in the response by using order in the query string. The parameter uses a comma separated list of project columns and can contain directions asc (ascending) or desc (descending) if desired. As an example (below), let's say we wanted to return facility records sorted by facility type (p_type) in ascending order, and then sorted by the year the facility went online (p_year) in descending order.

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&order=p_type.asc,p_year.desc"

The USPVDB contains null values where data attributes are unknown or missing. Null values for any desired key can be placed at the start or end of the sorted results by appending nullsfirst or nullslast to the order query. In the example below, 100 facility records are queried, ordered by regional power authority name (p_pwr_reg), with null values sent to the end of the return:

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&order=p_pwr_reg.nullslast&limit=100"
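
The value of the order parameter is easy to get wrong when several columns are involved, so it can help to build it from structured input. The Python sketch below does this; order_param is a hypothetical helper, and it only formats a string using the modifiers described in this section.

```python
def order_param(*columns):
    """Build the value of the `order` parameter.

    Each argument is a column name, or a tuple of a column name followed by
    modifiers such as 'asc', 'desc', 'nullsfirst' or 'nullslast'.
    """
    parts = []
    for column in columns:
        if isinstance(column, str):
            parts.append(column)           # bare column name, default direction
        else:
            parts.append(".".join(column))  # e.g. ('p_year', 'desc') -> 'p_year.desc'
    return ",".join(parts)


print(order_param(("p_type", "asc"), ("p_year", "desc")))
print(order_param(("p_pwr_reg", "nullslast")))
```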

Row Filtering

The USPVDB API supports filtering table rows by appending the property (or multiple properties), the filter operator, and the filter value to the request. Filters can keep or exclude table rows using simple operators that compare against specified key values. Applying filters to the request allows for more efficient, faster API responses because unneeded data is withheld by the server prior to API return. This is particularly useful when users are only interested in a subset of data from the USPVDB.

To apply row filtering, simply append the ? query string parameter and a valid filter expression (see Filter Operators section) to your request. In the example below, we'll filter the return to show only facilities that were installed in 2020:

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&p_year=eq.2020"

Multiple filters can be logically conjoined (AND) by separating them with &, allowing for filtering on as many attributes as desired. In the request below, we'll further refine our query to only return facilities that were installed in 2020 and have a rated capacity greater than 100 MW (AC):

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&p_year=eq.2020&p_cap_ac=gt.100"
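
The same filter expressions can be generated from code instead of being typed by hand. The Python sketch below turns (column, operator, value) triples into a query string; filter_query is a hypothetical helper, and the leading & used after ? in this documentation's examples is left out on the assumption that it is optional.

```python
from urllib.parse import urlencode


def filter_query(*conditions):
    """Turn (column, operator, value) triples into a query string joined by '&'."""
    pairs = [(column, f"{operator}.{value}") for column, operator, value in conditions]
    # Leave parentheses, commas, and '*' readable for list and wildcard filters.
    return urlencode(pairs, safe="(),*")


query = filter_query(("p_year", "eq", 2020), ("p_cap_ac", "gt", 100))
print("projects?" + query)
```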

Filter Operators

The table below contains the available operators used to filter returns by USPVDB key values. For each available operator, a real-world example is included showing how the operator is used to filter API responses. Note that some of the examples combine the query operators with additional functions described previously in the documentation. As previously noted, query strings should be appended to the USPVDB API base path https://eersc.usgs.gov/api/uspvdb/v1/.

Operator	Meaning	Example of Operator in Request
eq	equals	projects?&p_state=eq.AZ Return facilities that are located in Arizona.
gt	greater than	projects?&p_cap_ac=gt.50&p_state=eq.OH Return facilities that have an AC capacity greater than 50 MW and are located in Ohio.
gte	greater than or equal	projects?&p_tilt=gte.5&p_year=eq.2018 Return facilities that have a panel tilt angle greater or equal to 5 degrees that were constructed in 2018.
lt	less than	projects?&p_dig_conf=lt.3&p_type=eq.greenfield Return facilities with levels of confidence less than 3 (see attribute definitions for p_dig_conf in table above) that are categorized as Greenfield projects.
lte	less than or equal	projects?&p_dig_conf=lte.2&ylat=gt.42 Return facilities with levels of confidence less than or equal to 2 (see attribute definitions for p_dig_conf in table above) and are north of 42° latitude.
neq	not equal	projects?&p_type=neq.greenfield&select=count Count the number of facilities that are not categorized as Greenfield projects. Note that you can return a count on any query by simply appending &select=count to the query string.
like	LIKE operator (use * as wildcard)	projects?&p_county=like.*Car*&order=case_id Return facilities that have the string "Car" (case-sensitive) in the name of the county where the facility is located, and return them in order of case ID. Note the like operator is case-sensitive (use the ilike operator for case-insensitive queries).
ilike	ILIKE operator (use * as wildcard)	projects?&p_name=ilike.*solar*&order=p_year.desc Return facilities that have the string "solar" in the name (case-insensitive), and order by most recent install year. Note the ilike operator is case-insensitive.
in	one of a list of values	projects?&p_state=in.(VA,WV)&select=p_name,p_year Return facilities that are located in either Virginia or West Virginia, and only show the keys "facility name" and "installation year" in response.
is	checking for exact equality (null,true,false)	projects?&or=(p_cap_ac.is.null,p_cap_ac.gte.220) Return facilities where capacity (AC) is either null or is greater than or equal to 220 MW. Note that multiple parameters are logically disjoined by using or.
not	negates another operator	projects?&and=(p_name.not.ilike.*Solar*,p_cap_dc.gt.300) Return facilities where the facility name does not contain "Solar" (case-insensitive) and facility capacity (DC) is greater than 300 MW. Note that multiple parameters are logically conjoined by using and.
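
The list and grouping syntax used by in and or involves parentheses and commas that are easy to mistype. The Python sketch below formats those two forms as strings; in_list and any_of are hypothetical helper names, and nothing here validates the column names or operators against the API.

```python
def in_list(values):
    """Format values for the `in` operator, e.g. in.(VA,WV)."""
    return "in.(" + ",".join(str(v) for v in values) + ")"


def any_of(*conditions):
    """Format conditions for the `or` parameter, e.g. (a.is.null,a.gte.220)."""
    return "(" + ",".join(conditions) + ")"


print("projects?p_state=" + in_list(["VA", "WV"]))
print("projects?or=" + any_of("p_cap_ac.is.null", "p_cap_ac.gte.220"))
```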

Column Filtering

By default, the USPVDB API returns data with all columns (attribute fields) in the response. If you're only interested in a subset of these, you can specify which columns are returned using the select parameter in the query string. This tells the API to withhold all unneeded data fields, resulting in a smaller, more efficient response. In the example below, let's say you were only interested in a response that included facility name, facility capacity (AC), and the state where the project was located (we'll also sort by facility name in descending order):

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&select=p_name,p_cap_ac,p_state&order=p_name.desc"

Our response (below) only returns the columns passed in our select parameter:

{"p_name":"ZV Solar 3, LLC","p_cap_ac":4.9,"p_state":"NC"}, 
{"p_name":"ZV Solar 2, LLC","p_cap_ac":4.9,"p_state":"NC"}, 
{"p_name":"ZV Solar 1","p_cap_ac":5.0,"p_state":"NC"},
 ...

In some cases, you may want to rename (alias) a column name on API return. Aliases are often used to make column names more readable or easier to understand. This can be achieved by prefixing the column name with a column alias, followed by the : operator. Let's say you'd like to display API returns limited to facility name, facility capacity (AC), and install year, but feel that the default column name for install year p_year needs to be spelled out as "install_year". To do this, we'd simply append the normal select parameter to our query and add the column rename prefix install_year to p_year:

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&select=p_name,p_cap_ac,install_year:p_year"

Our response (below) returns the columns passed in our select parameter, but with p_year column renamed to install_year:

{"p_name":"Rancho Seco Solar","p_cap_ac":2,"install_year":1986}, 
{"p_name":"Big Bend","p_cap_ac":19,"install_year":2017}, 
{"p_name":"Geneseo","p_cap_ac":0.9,"install_year":2015}, 
 ...

The USPVDB API allows columns to be converted from one data type to another through casting. Examples of common casts are strings to date type, numeric type to text, or character strings to numeric values. Note that not every data type can be cast into every other data type, and invalid casting will result in a code 400 error from the API. To cast a column, suffix it with :: plus the desired type. Let's say you'd like to cast the numeric p_cap_dc column to a string type. To do this, we'll apply the cast to that column:

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&select=p_cap_dc::text"

Our response (below) returns the selected columns formatted with the desired casts:

{"p_cap_dc":"2.00"}, 
{"p_cap_dc":"23.00"}, 
{"p_cap_dc":"1.20"},
 ...
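
Column selection, aliases, and casts all end up in the same select parameter, so a small formatter can keep them straight. The Python sketch below is illustrative only: select_column is a hypothetical helper, and combining an alias and a cast on the same column (which the helper permits) is not demonstrated in this documentation and should be treated as an assumption.

```python
def select_column(name, alias=None, cast=None):
    """Format one entry of the `select` parameter as [alias:]name[::cast]."""
    entry = name
    if cast:
        entry = f"{entry}::{cast}"   # cast suffix, e.g. p_cap_dc::text
    if alias:
        entry = f"{alias}:{entry}"   # alias prefix, e.g. install_year:p_year
    return entry


columns = [
    select_column("p_name"),
    select_column("p_year", alias="install_year"),
    select_column("p_cap_dc", cast="text"),
]
print("projects?select=" + ",".join(columns))
```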

Authorization

The creation of the USPVDB was jointly funded by the U.S. Department of Energy via the Lawrence Berkeley National Laboratory and the U.S. Geological Survey. The database is continuously updated through collaboration among these partners and, as such, write access to the API is restricted to selected, authorized collaborators. For public users, the USPVDB API acts as a read-only resource and allows standard HTTP GET (read) requests to be received successfully without authentication. For any additional incoming HTTP methods, including POST (create records), PUT (replace/update records), PATCH (partially update records), and DELETE (delete a record), authorization is required.

JSON Web Tokens

Authorized users can authenticate with the USPVDB API by sending requests with a JSON Web Token (JWT). A JWT is a compact, self-contained token that carries JSON data and is used to securely transmit information between an authorized user and the API server. JWTs consist of three components: a header, a payload, and a signature. JWTs are cryptographically signed using a password known only to USPVDB administrators and the server. Since token holders don't have access to the password, they cannot modify the contents of the JWT without receiving a 401 (Unauthorized) response from the API. Once the user has received a token, it is included with each subsequent request, allowing authorized access without any further setup until the token expires. Tokens distributed to authorized users will have an expiration explicitly set in the payload. Authorized users can request a JWT by contacting a USPVDB API administrator.
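
Because the header and payload of a JWT are base64url-encoded JSON, a token holder can inspect the claims (for example, to check the expiration) without knowing the signing password. The Python sketch below shows the idea under stated assumptions: jwt_payload and _segment are hypothetical helpers, the demo token is built locally with a placeholder signature, and the claim names role and exp are illustrative rather than taken from a real USPVDB token. It does not verify the signature.

```python
import base64
import json


def jwt_payload(token):
    """Return the decoded payload (second segment) of a JWT as a dict.

    The signature is NOT verified here; this only inspects the claims.
    """
    payload_b64 = token.split(".")[1]
    # base64url segments omit '=' padding, so restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _segment(obj):
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Demo token with a placeholder signature; claim names and values are made up.
demo = ".".join([
    _segment({"alg": "HS256", "typ": "JWT"}),
    _segment({"role": "editor", "exp": 1700000000}),
    "signature",
])
print(jwt_payload(demo))
```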

Authorized Requests

Once the JWT is received by the authorized user, API authenticated requests can be sent with the token to the API server. This is done by adding the token via the Authorization header with the Bearer authentication scheme. In the example below, let's update the tilt angle of panels p_tilt for project ID 403678 via a PATCH request method. We'll send the data as JSON but could alternatively send as CSV by setting Content-Type: text/csv. The request will include an HTTP header containing the authentication token:

 curl -X PATCH "https://eersc.usgs.gov/api/uspvdb/v1/projects?&case_id=eq.403678" \
     -H 'Authorization: Bearer {your token here}' \
     -H 'Content-Type: application/json' \
     -d '{"p_tilt": 30}'
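
For comparison with the cURL command, the following Python sketch prepares an equivalent PATCH request using only the standard library. It is a sketch under assumptions: patch_request is a hypothetical helper, the token is a placeholder, p_tilt is sent as an integer to match its documented value type, and the request object is built but deliberately not sent.

```python
import json
from urllib.request import Request


def patch_request(url, token, changes):
    """Prepare (but do not send) an authenticated PATCH request."""
    return Request(
        url,
        data=json.dumps(changes).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = patch_request(
    "https://eersc.usgs.gov/api/uspvdb/v1/projects?case_id=eq.403678",
    "your-token-here",  # placeholder, not a real token
    {"p_tilt": 30},
)
print(req.get_method(), req.full_url)
# Passing req to urllib.request.urlopen would actually send it.
```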

Bulk inserts work just like single row inserts except that you provide either a JSON array of objects having uniform keys, or multiple lines in CSV format. In many cases, users may wish to insert new rows to the database and update existing rows via a single operation (UPSERT). To do this, simply POST with the Prefer: resolution=merge-duplicates header.

 curl -X POST "https://eersc.usgs.gov/api/uspvdb/v1/projects?&case_id=eq.403678" \
     -H 'Authorization: Bearer {your token here}' \
     -H 'Content-Type: text/csv' \
     -H 'Prefer: resolution=merge-duplicates' \
     -H 'cache-control: no-cache' \
     --data-binary 'case_id,p_name,p_year,p_state
403678,Gohman Community Solar (CSG),2020,MN
403640,Chub Garden Solar,2020,MN'
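
When the rows to upsert come from a program rather than a hand-written file, the CSV body can be generated instead of typed. The Python sketch below builds the same three-line payload with the standard csv module; csv_body is a hypothetical helper, and the choice of a plain newline as the line terminator is an assumption about what the server accepts.

```python
import csv
import io


def csv_body(columns, rows):
    """Serialize rows into a CSV string with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)   # header: one name per column
    writer.writerows(rows)     # one line per record, same column order
    return buffer.getvalue()


body = csv_body(
    ["case_id", "p_name", "p_year", "p_state"],
    [
        [403678, "Gohman Community Solar (CSG)", 2020, "MN"],
        [403640, "Chub Garden Solar", 2020, "MN"],
    ],
)
print(body, end="")
```

The resulting string would be sent as the request body together with the Content-Type: text/csv header.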

Note that authorization claims via database roles will be included in the payload of the token. If authorization claims are changed during the lifetime of the token, the changes will not become effective until a new token is issued.
