Critical Data Overloads: Acquisition

Plan for the day

Scrapism: A Manifesto

Identifying a corpus

Groups

Python recap

Working with an API

Strategies and tactics

Homework



Plan for the day


This morning:


This afternoon:


Scrapism: A Manifesto


What did you think of the reading?

What kind of scrapism resonates with you?


What about the grey legal areas?

Does data, or the web, obey different economic laws?


Art is a breach in the system. [1]


The web is a database

Parts of a URL
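
A minimal sketch of how a URL splits into its parts, using Python's standard `urllib.parse` module. The URL itself is made up for illustration; any real site will have its own paths and query parameters.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL, used only to illustrate the parts.
url = "https://example.com:8080/artists/search?name=cirio&year=2014#results"

parts = urlsplit(url)

url_parts = {
    "scheme": parts.scheme,          # the protocol
    "host": parts.hostname,          # the domain name
    "port": parts.port,              # the port number, as an integer
    "path": parts.path,              # where the resource lives on the host
    "query": parse_qs(parts.query),  # the parameters, as a dictionary of lists
    "fragment": parts.fragment,      # the anchor inside the page
}

for name, value in url_parts.items():
    print(f"{name}: {value}")
```

Changing the query parameters by hand is often the first way to see how a site's database responds to different requests.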


Understanding systems by looking in their databases.

How you classify things classifies you [2][3], and what is made visible is made political [4].


The web as a bureaucracy:

from machine into the machine.


The web as an artwork:

from machine out of the machine.


Identifying a corpus


Turning a dataset into a corpus.

The purpose of a corpus is to learn something from it, and to communicate these findings to others.


Intra-textuality is the comparison of different parts of the same text; it can reveal repeating patterns.


It can highlight some broad themes, blurry but evocative.

"I am 20 years old and..."

r/googlepoems


It can highlight some structures.

Paolo Cirio's Art Commodities, 2014


Inter-textuality is the comparison of parts of different texts, showing how they are connected.

Alfred Barr, Cubism and Abstract Art, 1936

Hank Willis Thomas, Colonialism and Abstract Art, 2020


Inter-textuality also recreates networks of dependencies.

(Google Analytics, WordPress)


A comparative approach puts the focus on what is different and what is the same.


Finding a topic:


Creative constraints:

Presenting our ideas (Miro board)


Finding the data:


MAHLZEIT (think about it over lunch)


Groups


Python recap


Moving around in the terminal/command prompt/command line


Four basics of programming


  1. variables
  2. functions
  3. loops
  4. conditional statements
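
As a quick refresher, the four basics can be sketched together in a few lines of Python. The numbers and the name `is_obscure` are invented for this example.

```python
# 1. Variable: a name that stores a value.
play_counts = [0, 12, 3, 250, 4]


# 2. Function: a reusable, named piece of logic.
def is_obscure(plays, threshold=5):
    """Return True when a track has fewer plays than the threshold."""
    return plays < threshold


# 3. Loop: repeat the same steps for every item in a collection.
obscure = []
for plays in play_counts:
    # 4. Conditional statement: only act when a condition holds.
    if is_obscure(plays):
        obscure.append(plays)

print(obscure)
```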

Working with an API


The overall model:

You < API > Data

A simple API model


Moving parts:


Data structures

JSON is the default way of exchanging information on the web.

{
    "key": "value",
    "key2": [
        "value1",
        "value2"
    ],
    "key3": {
        "key4": "value3",
        "key5": [
            "value4",
            "value5"
        ]
    }
}
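
A short sketch of reading the structure above in Python, using the standard `json` module. Once parsed, JSON objects become dictionaries and JSON arrays become lists, so values are reached by chaining keys and indexes.

```python
import json

# The same structure as above, as it would arrive in text form.
raw = """
{
    "key": "value",
    "key2": ["value1", "value2"],
    "key3": {
        "key4": "value3",
        "key5": ["value4", "value5"]
    }
}
"""

data = json.loads(raw)  # text -> Python dictionaries and lists

simple = data["key"]               # a plain string
from_list = data["key2"][1]        # second item of a list
nested = data["key3"]["key5"][0]   # a list inside a nested object

print(simple, from_list, nested)
```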

The data model is how the data is represented by a specific organization.

For instance, YouTube, SoundCloud and Spotify represent a song in different ways.
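
To make this concrete, here is a sketch with two invented records describing the same song. The field names are assumptions made up for illustration and are not the actual schemas of any of these services; the point is only that each data model needs its own translation before records can be compared.

```python
# Two hypothetical data models for the same song (field names invented).
record_a = {"title": "Song", "artist": {"name": "Artist"}, "duration_ms": 180000}
record_b = {"name": "Song", "uploader": "Artist", "length_seconds": 180}


def normalize_a(record):
    """Translate model A into a shared shape."""
    return {
        "title": record["title"],
        "artist": record["artist"]["name"],   # nested object in model A
        "seconds": record["duration_ms"] // 1000,
    }


def normalize_b(record):
    """Translate model B into the same shared shape."""
    return {
        "title": record["name"],
        "artist": record["uploader"],         # flat string in model B
        "seconds": record["length_seconds"],
    }


same_song = normalize_a(record_a) == normalize_b(record_b)
print(same_song)
```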


The access key is a unique identifier that protects the organization from abuse.

You can either apply for one or borrow one.
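
A minimal sketch of how an access key typically travels with a request, as a query parameter in the URL. The endpoint, the parameter name `api_key` and the placeholder key are all assumptions; each API documents its own names, and some expect the key in a request header instead.

```python
from urllib.parse import urlencode

API_BASE = "https://api.example.com/tracks"  # hypothetical endpoint
ACCESS_KEY = "YOUR_KEY_HERE"                 # placeholder, not a real key


def build_request_url(base, key, **params):
    """Attach the access key and any search parameters to the base URL."""
    query = {"api_key": key, **params}  # "api_key" is an assumed name
    return f"{base}?{urlencode(query)}"


url = build_request_url(API_BASE, ACCESS_KEY, q="field recording", limit=10)
print(url)
```

Building the URL separately from sending it makes it easy to paste the result into a browser and inspect the response by hand first.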


Example: getting all stock images used to illustrate a tech website.


Example: getting all tracks on SoundCloud that have fewer than 5 plays.

  1. examining the SoundCloud model
  2. borrowing a key
  3. requesting a track
  4. starting somewhere
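
Once some tracks have been requested, the filtering step could be sketched as below. The sample tracks are made up, and the field name `playback_count` is an assumption to check against the responses you actually receive while examining the model.

```python
# Made-up sample; in practice this list would come from API responses.
tracks = [
    {"id": 1, "title": "demo take 3", "playback_count": 2},
    {"id": 2, "title": "summer hit", "playback_count": 48213},
    {"id": 3, "title": "untitled", "playback_count": 0},
    {"id": 4, "title": "no count yet"},  # some records may lack the field
]


def rarely_played(tracks, max_plays=5, field="playback_count"):
    """Keep only the tracks whose play count is known and below max_plays."""
    result = []
    for track in tracks:
        plays = track.get(field)  # None when the field is missing
        if plays is not None and plays < max_plays:
            result.append(track)
    return result


quiet = rarely_played(tracks)
print([track["title"] for track in quiet])
```

Skipping records with a missing count, rather than treating them as zero, avoids mistaking incomplete data for unheard tracks.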

Strategies and tactics


Group work to figure out how exactly you're going to get this data:

Test early, fail early, solve early!


Tools:


Homework