USPVDB API

The USPVDB API (Application Programming Interface) allows for programmatic access to the United States Large Scale Solar Photovoltaic Database (USPVDB). The API is based on REST principles, where data are accessed via standard HTTPS requests to a dedicated API endpoint. Use the USPVDB API to programmatically query data through apps, stay in sync with USPVDB updates, and post new facility data (authenticated users). Here we cover the basics of USPVDB API terminology and structure, and detail the various operations you can perform with the API.

Lawrence Berkeley National Laboratory

The USPVDB API was created to extend USPVDB visibility, expand its user base, and support more productive internal workflows. The public API makes it possible for third-party developers to build value-added applications on the USPVDB. The API is HTTP-based (over SSL) and is compatible with any programming language that has an HTTP library (requests can also be made directly in your browser). In this documentation, we show examples of USPVDB API requests using cURL, a command-line tool and library for transferring data with different parameters and methods. cURL provides a generic way to demonstrate HTTP requests and responses, so users can translate similar requests into their language of choice. The USPVDB API endpoint conforms to the design principles of Representational State Transfer (REST) and uses the JSON data format for responses. Retrieving facility data from the USPVDB API requires only a standard GET request. Additional methods that submit or change USPVDB data require token-based authentication.

Resource endpoints referenced in this documentation are accessible only via HTTPS and share the base path https://eersc.usgs.gov/api/uspvdb/v1/. The base path, combined with a resource name and any query parameters, constitutes the full request URL. As an example, suppose you want facility-level information from the USPVDB for facility ID 400004. To do this, combine the base path https://eersc.usgs.gov/api/uspvdb/v1 with the desired resource /projects and append the query string ?&case_id=eq.400004. The full request is shown below (if you don't have cURL, paste the quoted URL below into your browser to see the response).

curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&case_id=eq.400004"

The response to the above request (below) is returned as JavaScript Object Notation (JSON) data. JSON is a light-weight, human readable, language-independent format for structuring data. It is used primarily to transmit data between the server and client web application. From our above query, we're requesting data for a single object (case_id = 400004). An object is indicated by curly brackets, where everything inside of the curly brackets is part of the object.

{
    "case_id": 400004,
    "multi_poly": "multi",
    "eia_id": 645,
    "p_state": "FL",
    "p_county": "Hillsborough",
    "ylat": 27.7802,
    "xlong": -82.3911,
    "p_area": 830853,
    "p_img_date": 20220420,
    "p_dig_conf": 4,
    "p_name": "Big Bend",
    "p_year": 2017,
    "p_pwr_reg": "TEC",
    "p_tech_pri": "PV",
    "p_tech_sec": "thin-film",
    "p_axis": "single-axis",
    "p_azimuth": 180,
    "p_tilt": 20,
    "p_battery": "batteries",
    "p_cap_ac": 19,
    "p_cap_dc": 23,
    "p_type": "greenfield",
    "p_agrivolt": "non-agrivoltaic",
    "p_zscore": -0.97
}

The two significant parts that make up JSON are keys and values. Together they make a key/value pair. The facility object returned above has 24 key/value pairs. Key/value pairs are comma-separated and follow a specific syntax: the key, followed by a colon, followed by the value. The first line of the return above is the key/value pair "case_id": 400004. The key is "case_id" and the value is 400004. A key is always a string enclosed in quotation marks, whereas a value can be a string, number, Boolean, array, object, or null.
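As a rough illustration of the key/value structure described above, the sketch below parses an abbreviated copy of the sample facility object with Python's standard json module and prints each key with the type of its value. The variable names are illustrative only and are not part of the API.

```python
import json

# Abbreviated copy of the sample facility object shown above.
sample = '{"case_id": 400004, "p_state": "FL", "ylat": 27.7802, "p_name": "Big Bend"}'

# A JSON object becomes a Python dict; keys are always strings.
facility = json.loads(sample)

# Values keep their JSON types (number -> int/float, string -> str).
for key, value in facility.items():
    print(f"{key!r}: {value!r} ({type(value).__name__})")
```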

Upon a successful response (see Status Codes section), the USPVDB API will return facility objects consisting of key/value pairs. Below is a list of USPVDB keys, related value types, and an explanation of what each key means. Solar facility attributes were drawn from Energy Information Administration (EIA) Form 860 data, which are self-reported by facility operators. These data include attributes such as: commercial operation date, location coordinates (latitude and longitude), MW of capacity (in both DC and AC terms), module type (e.g., crystalline silicon, thin film), axis type (fixed-tilt versus single-axis tracking), facility name, and regional power authority. In cases where data from these sources were unknown or missing, a "null" value was assigned to the key.

Key | Value Type | Key Description
case_id | number (integer) | Unique stable identification number.
multi_poly | string | Indicates the facility's polygon type. single: facility is represented by a single-part polygon. multi: facility is represented by a multipart polygon composed of at least two discontinuous polygons, sharing a single record.
eia_id | number (integer) | Unique facility identifier from the Energy Information Administration (EIA); may be used to link with other EIA data fields.
p_state | string | State where facility is located.
p_county | string | County where facility is located.
ylat | number (float) | Latitude value of a point representation of the LSPV facility's location. For single-array facilities, values are calculated in the center of the array. For multi-part polygons, values are generated within the array that is closest to the centroid of the multipart polygon.
xlong | number (float) | Longitude value of a point representation of the LSPV facility's location. For single-array facilities, values are calculated in the center of the array. For multi-part polygons, values are generated within the array that is closest to the centroid of the multipart polygon.
p_area | number (float) | Area of the facility array(s) in square meters (m²).
p_img_date | number (integer) | Date of the aerial image used to confirm the facility location and geometry. Derived from aerial image vendor (Maxar) metadata.
p_dig_conf | number (integer) | Level of confidence in project location. 1: Multiphase facility or multiple EIA records with identical location. Single polygon used to represent multiple facilities indistinguishable from one another; attributes may not reflect full scope of facilities. 2: Multiple polygons created, but EIA records are unclear; attributes may not reflect full scope of facilities. 3: Polygon reflects only a part of the facility due to poor image quality; area of polygon may not reflect the full size of array(s). 4: Facility polygon created with high confidence.
p_name | string | Facility name.
p_year | number (integer) | Year in which facility installation was completed.
p_pwr_reg | string | Common abbreviation of regional power authority name.
p_tech_pri | string | Electric generation technology type.
p_tech_sec | string | Additional detail on panel type.
p_axis | string | Array axis type.
p_azimuth | number (integer) | Array azimuth (i.e., east-west orientation) in degrees (°).
p_tilt | number (integer) | Tilt angle of panels (i.e., angle of panels from horizontal) in degrees (°).
p_battery | string | Indicator of the presence of battery storage at the facility.
p_cap_ac | number (float) | Facility AC capacity in megawatts (MW).
p_cap_dc | number (float) | Facility DC capacity in megawatts (MW).
p_type | string | General categorization of facility. greenfield: greenfield sites represent the majority of LSPV facilities and occupy land that may have previously been wildland, urbanized, cultivated, or reclaimed. RCRA: Resource Conservation and Recovery Act (RCRA) sites are a specific category of commercial, industrial, and federal facilities that treat, store or dispose of hazardous wastes and that require cleanup under the RCRA Hazardous Waste Corrective Action Program. superfund: superfund sites are inactive or abandoned contaminated facilities or locations where there is an active release or threatened release into the environment of hazardous substances that have been dumped, discharged, emitted or otherwise improperly managed. These sites may include manufacturing and industrial facilities, processing plants, landfills, and mining sites, among others. AML: sites include abandoned hardrock mines and mineral processing sites listed in the Superfund Enterprise Management System. landfill: sites that have been designated as landfills in EPA's RE-Powering Matrix. landfill named: assigned in cases where EPA did not identify the site as a landfill, but the facility name includes the word "landfill." It is possible that these sites have been sufficiently cleaned or were never contaminated to the point of meeting the PCSC designation; thus, they are distinguished from EPA designated landfill sites. PCSC: when no specific designation is provided in EPA's RE-Powering Matrix, "brownfield" sites were assigned to a generalized PCSC facility type.
p_agrivolt | string | Agrivoltaic facilities make use of the land between panel rows and surrounding arrays for agricultural (i.e., crop production or grazing) and/or ecosystem services (e.g., pollinator habitat). Agrivoltaic projects are categorized into the following designations: crop, crop,es, es, grazing, grazing,es, non-agrivoltaic.
p_zscore | number (float) | The Z-score of (p_cap_dc/p_area). A Z-score measures how far a record is from the mean of all records in the field in units of standard deviations. Records with high or low Z-scores may have an error in either p_cap_dc or p_area.
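To make the p_zscore definition concrete, here is a minimal sketch that computes Z-scores of capacity density (p_cap_dc divided by p_area) for a handful of made-up records. The input values are hypothetical, and the use of the population standard deviation is an assumption; the documentation does not state which estimator the USPVDB uses.

```python
import statistics

# Hypothetical (p_cap_dc in MW, p_area in m^2) pairs, NOT real USPVDB records.
records = [(23.0, 830853.0), (5.0, 100000.0), (100.0, 2500000.0), (1.2, 40000.0)]


def density_zscores(rows):
    """Return the Z-score of p_cap_dc / p_area for each row."""
    densities = [cap_dc / area for cap_dc, area in rows]
    mean = statistics.mean(densities)
    # Assumption: population standard deviation over all supplied rows.
    spread = statistics.pstdev(densities)
    return [(density - mean) / spread for density in densities]


zscores = density_zscores(records)
for (cap_dc, area), z in zip(records, zscores):
    print(f"cap_dc={cap_dc} MW, area={area} m^2, z={z:.2f}")
```

Records whose Z-score lies far from zero have an unusually high or low capacity per unit area, which is why the table suggests checking p_cap_dc and p_area for such records.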

The USPVDB API returns appropriate HTTP status codes for every request. Response status codes indicate whether a specific request has been successfully completed. USPVDB API responses are grouped into four classes: successful responses (2XX), redirects (3XX), client errors (4XX), and server errors (5XX). An exhaustive list of status codes is defined in RFC 2616. The most common USPVDB API status codes are listed in the table below.

Code | Description
200 | OK - The request has succeeded. Clients can read the result of the request in the body and the headers of the response. The response is sent after a successful GET request.
201 | Created - The request has succeeded, and a new resource has been created as a result of it. The response is sent after a successful POST request.
206 | Partial Content - The request has succeeded and contains the requested range of data, as described in the Range header of the request.
400 | Bad Request - The request could not be understood by the server due to malformed syntax. The message body will contain more information.
401 | Unauthorized - The server understood the request but is refusing to fulfill it. This error is likely because authentication was not provided or the token is invalid. The message body will contain more information.
404 | Not Found - The requested resource could not be found. This error can be due to a temporary or permanent condition.
409 | Conflict - The request conflicts with the current state of the server, most likely because of a duplicate key value when attempting a POST. The message body will contain more information.
500 | Internal Server Error - The server encountered an unexpected condition that prevented it from fulfilling the request.
502 | Bad Gateway - The USPVDB API is down or being upgraded.
503 | Service Unavailable - The server is currently unable to handle the request, commonly due to the server being overloaded.
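A client will usually branch on the class of the status code rather than on each individual code. The sketch below shows one way to do that; the function names and the choice of which codes to retry are assumptions made for illustration, not behavior prescribed by the API.

```python
def status_class(code: int) -> str:
    """Map an HTTP status code to one of the four classes named above."""
    if 200 <= code <= 299:
        return "successful response"
    if 300 <= code <= 399:
        return "redirect"
    if 400 <= code <= 499:
        return "client error"
    if 500 <= code <= 599:
        return "server error"
    raise ValueError(f"unexpected status code: {code}")


def should_retry(code: int) -> bool:
    """Assumption: only gateway and availability errors are worth retrying."""
    return code in (502, 503)


for code in (200, 206, 401, 409, 503):
    print(code, status_class(code), "retry" if should_retry(code) else "no retry")
```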

When making requests to the USPVDB API, it's possible to receive a large amount of data in the response. The API has controls in place that let you paginate the results, to make returns more manageable and avoid extraneous network traffic. These controls let you set limit and offset via either request headers, or query parameters. Every API response contains Content-Range headers which describe the size of results. By default, the response includes the current range, and, if sent with Prefer: count=exact in the request, the total number of results. When using request headers to limit results, you simply specify the range of rows desired. For example, the request below returns the first 250 facilities (note that the offset numbering is zero-based).

GET /projects HTTP/1.1
Range: 0-249
Prefer: count=exact

Since we sent our preference Prefer: count=exact in the request header, we get a Content-Range response (below) that includes the total size of the table (3,698 records). This is particularly useful when rendering a 'last' link in a pagination control.

HTTP/1.1 206 Partial Content
Content-Range: 0-249/3698
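The Content-Range value can be turned into pagination numbers with a few lines of code. The following sketch parses the header value shown above and derives the number of pages and the offset of the last page. Treating `*` as an unknown total is an assumption based on the general HTTP Content-Range syntax, and the sketch does not handle empty result sets.

```python
import math


def parse_content_range(value):
    """Split 'first-last/total' into integers; total is None if given as '*'."""
    span, _, total = value.partition("/")
    first, _, last = span.partition("-")
    return int(first), int(last), (None if total == "*" else int(total))


first, last, total = parse_content_range("0-249/3698")
page_size = last - first + 1                 # 250 rows per page
pages = math.ceil(total / page_size)         # number of pages needed
last_page_offset = (pages - 1) * page_size   # offset for a 'last' link

print(page_size, pages, last_page_offset)
```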

Paging the dataset can also be achieved by passing an offset and limit as query parameters. In the example below, limit and offset are added to the query so that 50 facility records (limit) are returned after skipping the first 300 records (offset); that is, the response starts at the 301st record. Note that the API always returns range headers in the response, even if you use query parameters to paginate the query.

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&offset=300&limit=50"
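For clients that page through the whole table, the offset and limit values can be generated rather than typed by hand. This is a minimal sketch using only the Python standard library; the helper names are hypothetical, and it omits the leading & that the cURL examples place after the ?, on the assumption that the two forms are equivalent.

```python
from urllib.parse import urlencode

BASE_URL = "https://eersc.usgs.gov/api/uspvdb/v1"


def page_url(offset, limit, resource="projects"):
    """Build the URL for one page of results."""
    query = urlencode({"offset": offset, "limit": limit})
    return f"{BASE_URL}/{resource}?{query}"


def page_urls(total, limit):
    """Build one URL per page needed to cover `total` records."""
    return [page_url(offset, limit) for offset in range(0, total, limit)]


print(page_url(300, 50))
print(len(page_urls(3698, 250)), "pages")
```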

You can reorder the records in the response by using order in the query string. The parameter takes a comma-separated list of project columns, each of which can optionally carry a direction: asc (ascending) or desc (descending). As an example (below), let's say we wanted to return facility records sorted by facility type (p_type) in ascending order, and then sorted by the year the facility went online (p_year) in descending order.

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&order=p_type.asc,p_year.desc"

The USPVDB contains null values where data attributes are unknown or missing. Null values for any key can be placed at the beginning or the end of the results by appending nullsfirst or nullslast to the order expression. In the example below, 100 facility records are queried, ordered by regional power authority name (p_pwr_reg), with null values placed at the end of the return:

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&order=p_pwr_reg.nullslast&limit=100"
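The ordering rules described above can be captured in a small helper that validates the direction and null-placement keywords before assembling the order expression. The function names are hypothetical, and combining a direction with nullsfirst or nullslast in a single clause is an assumption; the examples above only show them separately.

```python
def order_clause(column, direction=None, nulls=None):
    """Build one clause such as 'p_year.desc' or 'p_pwr_reg.nullslast'."""
    if direction not in (None, "asc", "desc"):
        raise ValueError(f"invalid direction: {direction}")
    if nulls not in (None, "nullsfirst", "nullslast"):
        raise ValueError(f"invalid nulls placement: {nulls}")
    return ".".join(part for part in (column, direction, nulls) if part)


def order_query(*clauses):
    """Join clauses into the comma-separated order parameter."""
    return "order=" + ",".join(clauses)


print(order_query(order_clause("p_type", "asc"), order_clause("p_year", "desc")))
print(order_query(order_clause("p_pwr_reg", nulls="nullslast")))
```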

The USPVDB API supports filtering table rows by appending the property (or multiple properties), the filter operator, and the filter value to the request. Filters can keep or exclude table rows using simple operators that compare against specified key values. Applying filters to the request allows for more efficient, faster API responses because unneeded data is withheld by the server prior to API return. This is particularly useful when users are only interested in a subset of data from the USPVDB.

To apply row filtering, simply append the ? query string parameter and a valid filter expression (see Filter Operators section) to your request. In the example below, we'll filter the return to show only facilities that were installed in 2020:

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&p_year=eq.2020"

Multiple filters can be logically conjoined (AND) by separating them with &, allowing for filtering on as many attributes as desired. In the request below, we'll further refine our query to only return facilities that were installed in 2020 and have a rated capacity greater than 100 MW (AC):

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&p_year=eq.2020&p_cap_ac=gt.100"
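When filters come from user input, it can help to assemble them from (column, operator, value) triples. The sketch below does this with the standard library; `filter_url` is a hypothetical helper, and it omits the leading & after the ? on the assumption that it is optional.

```python
from urllib.parse import urlencode

BASE_URL = "https://eersc.usgs.gov/api/uspvdb/v1"


def filter_url(filters, resource="projects"):
    """Build a request URL from (column, operator, value) triples joined by &."""
    params = [(column, f"{operator}.{value}") for column, operator, value in filters]
    return f"{BASE_URL}/{resource}?{urlencode(params)}"


# Facilities installed in 2020 with AC capacity greater than 100 MW.
url = filter_url([("p_year", "eq", 2020), ("p_cap_ac", "gt", 100)])
print(url)
```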

The table below contains the available operators used to filter returns by USPVDB key values. For each available operator, a real-world example is included showing how the operator is used to filter API responses. Note that some of the examples combine the query operators with additional functions described previously in the documentation. As previously noted, query strings should be appended to the USPVDB API base path https://eersc.usgs.gov/api/uspvdb/v1/.

Operator | Meaning | Example of Operator in Request
eq | equals | projects?&p_state=eq.AZ - Return facilities that are located in Arizona.
gt | greater than | projects?&p_cap_ac=gt.50&p_state=eq.OH - Return facilities that have an AC capacity greater than 50 MW and are located in Ohio.
gte | greater than or equal | projects?&p_tilt=gte.5&p_year=eq.2018 - Return facilities that have a panel tilt angle greater than or equal to 5 degrees and were constructed in 2018.
lt | less than | projects?&p_dig_conf=lt.3&p_type=eq.greenfield - Return facilities with levels of confidence less than 3 (see attribute definitions for p_dig_conf in table above) that are categorized as greenfield projects.
lte | less than or equal | projects?&p_dig_conf=lte.2&ylat=gt.42 - Return facilities with levels of confidence less than or equal to 2 (see attribute definitions for p_dig_conf in table above) that are north of 42° latitude.
neq | not equal | projects?&p_type=neq.greenfield&select=count - Count the number of facilities that are not categorized as greenfield projects. Note that you can return a count on any query by appending &select=count to the query string.
like | LIKE operator (use * as wildcard) | projects?&p_county=like.*Car*&order=case_id - Return facilities that have the string "Car" in the name of the county where the facility resides, and return them in order of case ID. Note the like operator is case-sensitive (use the ilike operator for case-insensitive queries).
ilike | ILIKE operator (use * as wildcard) | projects?&p_name=ilike.*solar*&order=p_year.desc - Return facilities that have the string "solar" in the name (case-insensitive), and order by most recent install year.
in | one of a list of values | projects?&p_state=in.(VA,WV)&select=p_name,p_year - Return facilities that are located in either Virginia or West Virginia, and only show the keys for facility name (p_name) and installation year (p_year) in the response.
is | checking for exact equality (null,true,false) | projects?&or=(p_cap_ac.is.null,p_cap_ac.gte.220) - Return facilities where capacity (AC) is either null or is greater than or equal to 220 MW. Note that multiple parameters are logically disjoined by using or.
not | negates another operator | projects?&and=(p_name.not.ilike.*Solar*,p_cap_dc.gt.300) - Return facilities where the facility name does not contain "solar" (case-insensitive) and facility capacity (DC) is greater than 300 MW. Note that multiple parameters are logically conjoined by using and.
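The list and grouping operators in the table use parentheses, commas, and asterisks, which a generic URL encoder would normally percent-encode. The sketch below keeps them literal so the generated query strings match the table; the helper names are hypothetical, and whether the server also accepts the percent-encoded forms is not covered by this documentation.

```python
from urllib.parse import urlencode

# Characters used by the filter syntax that should stay unescaped.
SAFE = "(),*"


def in_list(values):
    """Build an 'in' expression such as in.(VA,WV)."""
    return "in.(" + ",".join(values) + ")"


def group(*conditions):
    """Build the parenthesized condition list used with or= and and=."""
    return "(" + ",".join(conditions) + ")"


q_in = urlencode([("p_state", in_list(["VA", "WV"])), ("select", "p_name,p_year")], safe=SAFE)
q_or = urlencode([("or", group("p_cap_ac.is.null", "p_cap_ac.gte.220"))], safe=SAFE)
q_like = urlencode([("p_name", "ilike.*solar*"), ("order", "p_year.desc")], safe=SAFE)

print(q_in)
print(q_or)
print(q_like)
```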

By default, the USPVDB API returns data with all columns (attribute fields) in the response. If you're only interested in a subset of these, you can specify which columns are returned using the select parameter in the query string. This tells the API to withhold all unneeded data fields, resulting in a smaller, more efficient response. In the example below, let's say you were only interested in a response that included facility name, facility capacity (AC), and the state where the project is located (we'll also sort by facility name in descending order):

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&select=p_name,p_cap_ac,p_state&order=p_name.desc"

Our response (below) only returns the columns passed in our select parameter:

{"p_name":"ZV Solar 3, LLC","p_cap_ac":4.9,"p_state":"NC"}, 
{"p_name":"ZV Solar 2, LLC","p_cap_ac":4.9,"p_state":"NC"}, 
{"p_name":"ZV Solar 1","p_cap_ac":5.0,"p_state":"NC"},
 ...
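The same request can be issued from Python with only the standard library. This is a sketch under two assumptions: that the response body is a JSON array of facility objects, as the excerpt above suggests, and that the leading & after the ? is optional. The helper names `build_url` and `fetch_json` are hypothetical, and the network call is left commented out.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://eersc.usgs.gov/api/uspvdb/v1"


def build_url(resource, params):
    """Combine base path, resource, and query parameters (commas kept literal)."""
    return f"{BASE_URL}/{resource}?{urlencode(params, safe=',')}"


def fetch_json(url, timeout=30):
    """Perform the GET request and decode the JSON body (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


url = build_url("projects", {"select": "p_name,p_cap_ac,p_state", "order": "p_name.desc"})
print(url)
# rows = fetch_json(url)   # uncomment to call the live API
# print(rows[0]["p_name"])
```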

In some cases, you may want to rename (alias) a column name on API return. Aliases are often used to make column names more readable or easier to understand. This can be achieved by prefixing the column name with a column alias, followed by the : operator. Let's say you'd like to display API returns limited to facility name, facility capacity (AC), and install year, but feel that the default column name for install year p_year needs to be spelled out as "install_year". To do this, we'd simply append the normal select parameter to our query and add the column rename prefix install_year to p_year:

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&select=p_name,p_cap_ac,install_year:p_year"

Our response (below) returns the columns passed in our select parameter, but with p_year column renamed to install_year:

{"p_name":"Rancho Seco Solar","p_cap_ac":2,"install_year":1986}, 
{"p_name":"Big Bend","p_cap_ac":19,"install_year":2017}, 
{"p_name":"Geneseo","p_cap_ac":0.9,"install_year":2015}, 
 ...

The USPVDB API allows columns to be converted from one data type to another through casting. Examples of common casts are strings to date type, numeric type to text, or character strings to numeric values. Note that not every data type can be cast into every other data type, and an invalid cast results in a 400 (Bad Request) error from the API. To cast a column, suffix it with :: followed by the desired type. Let's say you'd like to cast the numeric column p_cap_dc to a string (text) type. To do this, we apply the cast to that column in the select parameter:

$ curl -i -X GET "https://eersc.usgs.gov/api/uspvdb/v1/projects?&select=p_cap_dc::text"

Our response (below) returns the selected columns formatted with the desired casts:

{"p_cap_dc":"2.00"}, 
{"p_cap_dc":"23.00"}, 
{"p_cap_dc":"1.20"},
 ...
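The select syntax for plain columns, aliases, and casts can be generated by one small helper. The sketch below is illustrative; the function names are hypothetical, and using an alias and a cast on the same column (`alias:column::type`) is an assumption that the examples above do not demonstrate.

```python
def select_item(column, alias=None, cast=None):
    """Build one select entry: 'column', 'alias:column', or 'column::type'."""
    item = f"{alias}:{column}" if alias else column
    return f"{item}::{cast}" if cast else item


def select_param(*items):
    """Join entries into the comma-separated select parameter."""
    return "select=" + ",".join(items)


renamed = select_param("p_name", "p_cap_ac", select_item("p_year", alias="install_year"))
as_text = select_param(select_item("p_cap_dc", cast="text"))

print(renamed)
print(as_text)
```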

The creation of the USPVDB was jointly funded by the U.S. Department of Energy via the Lawrence Berkeley National Laboratory and the U.S. Geological Survey. The database is continuously updated through collaboration among these partners and, as such, requires authorized API access for selected collaborators. For public users, the USPVDB API acts as a read-only resource and accepts standard HTTP GET (read) requests without authentication. For any other HTTP method, including POST (create records), PUT (replace/update records), PATCH (partially update records), and DELETE (delete records), authorization is required.

Authorized users can authenticate with the USPVDB API by sending requests with a JSON Web Token (JWT). A JWT is a JSON object used as a compact and self-contained way of securely transmitting information between an authorized user and the API server. JWTs consist of three components: a header, a payload, and a signature. JWT signatures are cryptographically signed using a password known only to USPVDB administrators and the server. Since token holders don't have access to the password, they cannot modify the contents of the JWT without receiving a 401 (Unauthorized) response from the API. Once the user has made a successful connection to the API using the JWT, each subsequent request will include the token, allowing the user authorized access without having to set up the token again until it expires. Tokens distributed to authorized users have an expiration explicitly set in the payload. Authorized users can request a JWT by contacting a USPVDB API administrator.
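Because a JWT's header and payload are only base64url-encoded, a token holder can read the expiration without knowing the signing password. The sketch below builds a made-up, unsigned token and reads its claims. The `role` claim name and all values are hypothetical (`exp` is the standard JWT expiration claim), and the code does not verify signatures, which only the server can do.

```python
import base64
import json


def b64url_encode(raw):
    """Base64url-encode bytes without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment):
    """Decode a base64url segment, restoring any stripped padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def read_claims(token):
    """Return the payload claims WITHOUT verifying the signature."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


# Made-up token for demonstration only; the signature is not valid.
demo_token = ".".join([
    b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode()),
    b64url_encode(json.dumps({"role": "example_editor", "exp": 1893456000}).encode()),
    "not-a-real-signature",
])

claims = read_claims(demo_token)
print(claims["role"], claims["exp"])
```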

Once the authorized user has received the JWT, authenticated API requests can be sent with the token to the API server. This is done by adding the token to the Authorization header with the Bearer authentication scheme. In the example below, let's update the tilt angle of panels (p_tilt) for project ID 403678 via a PATCH request. We'll send the data as JSON, but could alternatively send it as CSV by setting Content-Type: text/csv. The request includes an HTTP header containing the authentication token:

 curl -X PATCH "https://eersc.usgs.gov/api/uspvdb/v1/projects?&case_id=eq.403678" \
     -H 'Authorization: Bearer {your token here}' \
     -H 'Content-Type: application/json' \
     -d '{"p_tilt": "30"}'
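An equivalent request can be prepared in Python with urllib. This sketch only constructs the request object and does not send it; `build_patch` is a hypothetical helper and the token value is a placeholder. It sends p_tilt as an integer because the attribute table lists that key as an integer, whereas the cURL example sends it as a string.

```python
import json
from urllib.request import Request

BASE_URL = "https://eersc.usgs.gov/api/uspvdb/v1"


def build_patch(case_id, changes, token):
    """Prepare an authenticated PATCH request for a single facility."""
    return Request(
        f"{BASE_URL}/projects?case_id=eq.{case_id}",
        data=json.dumps(changes).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


request = build_patch(403678, {"p_tilt": 30}, token="YOUR_TOKEN")
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it; omitted in this sketch.
```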

Bulk inserts work just like single row inserts except that you provide either a JSON array of objects having uniform keys, or multiple lines in CSV format. In many cases, users may wish to insert new rows to the database and update existing rows via a single operation (UPSERT). To do this, simply POST with the Prefer: resolution=merge-duplicates header.

 curl -X POST "https://eersc.usgs.gov/api/uspvdb/v1/projects?&case_id=eq.403678" \
     -H 'Authorization: Bearer {your token here}' \
     -H 'Prefer: resolution=merge-duplicates' \
     -H 'Content-Type: text/csv' \
     -H 'cache-control: no-cache' \
     --data-binary 'case_id,p_name,p_year,p_state
403678,Gohman Community Solar (CSG),2020,MN
403640,Chub Garden Solar,2020,MN'
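For bulk requests, the CSV body is easier to get right when it is generated rather than typed, since the csv module handles quoting of names that contain commas. A minimal sketch, assuming newline-terminated rows are acceptable to the server; `csv_body` is a hypothetical helper.

```python
import csv
import io


def csv_body(columns, rows):
    """Serialize a header row plus data rows into a CSV request body."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


body = csv_body(
    ["case_id", "p_name", "p_year", "p_state"],
    [
        [403678, "Gohman Community Solar (CSG)", 2020, "MN"],
        [403640, "Chub Garden Solar", 2020, "MN"],
    ],
)
print(body)
```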

Note that authorization claims via database roles will be included in the payload of the token. If authorization claims are changed during the lifetime of the token, the changes will not become effective until a new token is issued.