In this quick tip, excerpted from Useful Python, Stuart shows you how easy it is to use an HTTP API from Python with a couple of third-party modules.

Most of the time when working with third-party data we’ll be accessing an HTTP API. That is, we’ll be making an HTTP call to a web page designed to be read by machines rather than by people. API data is usually in a machine-readable format, most often either JSON or XML. (If we come across data in another format, we can use the techniques described elsewhere in this book to convert it to JSON, of course!) Let’s look at how to use an HTTP API from Python.

The general principles of using an HTTP API are simple:

  1. Make an HTTP call to the URLs for the API, possibly including some authentication information (such as an API key) to show that we’re authorized.
  2. Get back the data.
  3. Do something useful with it.

Python provides enough functionality in its standard library to do all this without any extra modules, but it will make our life a lot easier if we pick up a couple of third-party modules to smooth over the process. The first is the requests module. This is an HTTP library for Python that makes fetching HTTP data more pleasant than Python’s built-in urllib.request, and it can be installed with python -m pip install requests.
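
For comparison, here’s a minimal sketch of what such a call can look like using only the standard library, with api.example.com standing in as a hypothetical endpoint:

import json
import urllib.parse
import urllib.request

# Build the query string by hand, then read and decode the JSON response ourselves
params = urllib.parse.urlencode({"q": "fruit", "key": "OUR-API-KEY"})
with urllib.request.urlopen(f"https://api.example.com/data?{params}") as response:
    results = json.load(response)

It works, but requests takes care of the URL encoding and JSON decoding for us, as we’ll see next.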

To show how easy it is to use, we’ll use Pixabay’s API (documented here). Pixabay is a stock photo site where the images are all available for reuse, which makes it a very useful destination. What we’ll focus on here is fruit. We’ll use the fruit pictures we gather later on, when manipulating files, but for now we just want to find pictures of fruit, because it’s tasty and good for us.

To start, we’ll take a quick look at what images are available from Pixabay. We’ll grab 100 images, quickly look through them, and choose the ones we want. For this, we’ll need a Pixabay API key, so we need to create an account and then grab the key shown in the API documentation under “Search Images”.

The requests Module

The basic version of making an HTTP request to an API with the requests module involves constructing an HTTP URL, requesting it, and then reading the response. Here, that response is in JSON format. The requests module makes each of these steps easy. The API parameters are a Python dictionary, a get() function makes the call, and if the API returns JSON, requests makes that available via .json() on the response. So a simple call will look like this:

import requests

PIXABAY_API_KEY = "11111111-7777777777777777777777777"

base_url = "https://pixabay.com/api/"
base_params = {
    "key": PIXABAY_API_KEY,
    "q": "fruit",
    "image_type": "photograph",
    "class": "meals",
    "safesearch": "true"
}

response = requests.get(base_url, params=base_params)
results = response.json()

This will return a Python object, as the API documentation suggests, and we can take a look at its parts:

>>> print(len(results["hits"]))
20
>>> print(results["hits"][0])
{'id': 2277, 'pageURL': 'https://pixabay.com/photos/berries-fruits-food-blackberries-2277/', 'type': 'photo', 'tags': 'berries, fruits, food', 'previewURL': 'https://cdn.pixabay.com/photo/2010/12/13/10/05/berries-2277_150.jpg', 'previewWidth': 150, 'previewHeight': 99, 'webformatURL': 'https://pixabay.com/get/gc9525ea83e582978168fc0a7d4f83cebb500c652bd3bbe1607f98ffa6b2a15c70b6b116b234182ba7d81d95a39897605_640.jpg', 'webformatWidth': 640, 'webformatHeight': 426, 'largeImageURL': 'https://pixabay.com/get/g26eb27097e94a701c0569f1f77ef3975cf49af8f47e862d3e048ff2ba0e5e1c2e30fadd7a01cf2de605ab8e82f5e68ad_1280.jpg', 'imageWidth': 4752, 'imageHeight': 3168, 'imageSize': 2113812, 'views': 866775, 'downloads': 445664, 'collections': 1688, 'likes': 1795, 'comments': 366, 'user_id': 14, 'user': 'PublicDomainPictures', 'userImageURL': 'https://cdn.pixabay.com/user/2012/03/08/00-13-48-597_250x250.jpg'}

The API returns 20 hits per page, and we want 100 results. To do this, we add a page parameter to our list of params. However, we don’t want to alter our base_params each time, so the way to approach this is to create a loop and then make a copy of the base_params for each request. The built-in copy module does exactly this, so we can call the API five times in a loop:

import copy

for page in range(1, 6):
    this_params = copy.copy(base_params)
    this_params["page"] = page
    response = requests.get(base_url, params=this_params)

This will make five separate requests to the API, one with page=1, the next with page=2, and so on, getting different sets of image results with each call. This is a convenient way to walk through a large set of API results. Most APIs implement pagination, where a single call to the API only returns a limited set of results. We then ask for further pages of results, much like looking through query results from a search engine.

Since we want 100 results, we could simply decide that this means five calls of 20 results each, but it would be more robust to keep requesting pages until we have the hundred results we need and then stop. This protects the calls in case Pixabay changes the default number of results to 15 or similar. It also lets us handle the situation where there aren’t 100 images for our search terms. So we have a while loop and increment the page number each time, and then, if we’ve reached 100 images, or if there are no images to retrieve, we break out of the loop:

images = []
page = 1
while len(images) < 100:
    this_params = copy.copy(base_params)
    this_params["page"] = page
    response = requests.get(base_url, params=this_params)
    if not response.json()["hits"]:
        break
    for result in response.json()["hits"]:
        images.append({
            "pageURL": result["pageURL"],
            "thumbnail": result["previewURL"],
            "tags": result["tags"],
        })
    page += 1

This way, when we finish, we’ll have 100 images, or we’ll have all the images if there are fewer than 100, stored in the images list. We can then go on to do something useful with them. But before we do that, let’s talk about caching.

Caching HTTP Requests

It’s a good idea to avoid making the same request to an HTTP API more than once. Many APIs have usage limits in order to avoid being overtaxed by requesters, and a request takes time and effort on their part and on ours. We should try not to repeat wasteful requests that we’ve already made. Fortunately, there’s a useful way to do this when using Python’s requests module: install requests-cache with python -m pip install requests-cache. This will seamlessly record any HTTP calls we make and save the results. Then, later, if we make the same call again, we’ll get back the locally saved result without going to the API for it at all. This saves both time and bandwidth. To use requests_cache, import it and create a CachedSession, and then instead of requests.get use session.get to fetch URLs, and we’ll get the benefit of caching with no extra effort:

import requests_cache
session = requests_cache.CachedSession('fruit_cache')
...
response = session.get(base_url, params=this_params)
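
requests-cache also marks each response it returns with a from_cache attribute, so a quick sanity check that the cache is doing its job might look like this:

# The first call goes over the network and is recorded in the cache...
response = session.get(base_url, params=base_params)
print(response.from_cache)  # False

# ...and repeating the identical call is answered locally
response = session.get(base_url, params=base_params)
print(response.from_cache)  # True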

Making Some Output

To see the results of our query, we need to display the images somewhere. A convenient way to do this is to create a simple HTML page that shows each of the images. Pixabay provides a small thumbnail of each image, which it calls previewURL in the API response, so we could put together an HTML page that shows all of these thumbnails and links them to the main Pixabay page, from which we could choose to download the images we want and credit the photographer. So each image in the page might look like this:

<li>
    <a href="https://pixabay.com/photographs/berries-fruits-food-blackberries-2277/">
        <img src="https://cdn.pixabay.com/photograph/2010/12/13/10/05/berries-2277_150.jpg" alt="berries, fruits, meals">
    </a>
</li>

We can construct that from our images list using a list comprehension, and then join all the results together into one big string with "\n".join():

html_image_list = [
    f"""<li>
            <a href="{image["pageURL"]}">
                <img src="{image["thumbnail"]}" alt="{image["tags"]}">
            </a>
        </li>
    """
    for image in images
]
html_image_list = "\n".join(html_image_list)

At that point, if we write out a very plain HTML page containing that list, it’s easy to open it in a web browser for a quick overview of all the search results we got from the API, and click any one of them to jump to the full Pixabay page for downloads:

html = f"""<!doctype html>
<html><head><meta charset="utf-8">
<title>Pixabay search for {base_params['q']}</title>
<style>
ul {{
    list-style: none;
    line-height: 0;
    column-count: 5;
    column-gap: 5px;
}}
li {{
    margin-bottom: 5px;
}}
</style>
</head>
<body>
<ul>
{html_image_list}
</ul>
</body></html>
"""
output_file = f"searchresults-{base_params['q']}.html"
with open(output_file, mode="w", encoding="utf-8") as fp:
    fp.write(html)
print(f"Search outcomes abstract written as {output_file}")

The search results page, showing many fruits
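
As an optional extra (not part of the original listing), we could have the script open that page for us using Python’s built-in webbrowser module:

import pathlib
import webbrowser

# Open the freshly written summary page in the default browser
webbrowser.open(pathlib.Path(output_file).resolve().as_uri())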

This article is excerpted from Useful Python, available on Pylogix Premium and from ebook retailers.