Two quick links that follow up on the last post:

  • Julia Evans - Looking inside machine learning black boxes

    I talked to someone at a conference a while ago who worked on automated trading systems, and we were talking about how machine learning approaches can be really scary because you fundamentally don’t know whether the ML is doing a thing because it’s smart and correct and better than you, or because there’s a bug in the data.

    He said that they don’t use machine learning in their production systems (they don’t trust it). But they DO use machine learning! Their approach was to

    • have experts hand-build a model
    • have the machine learning team train a model, and show it to the experts
    • the expert says “oh, yes, I see the model is doing something smart there! I will build that in to my hand-built system”

    I don’t know if this is the best thing to do, but I thought it was very interesting.

    This is an interesting way to address the problem that AIs are hard to improve because they are black boxes.

  • Public Books - Justice for “Data Janitors”

    The emergence of the digital microwork industry to tend artificial intelligence shows how labor displacement generates new kinds of work. As technology enterprises attempt to expand the scope of culture they mediate, they have had to grapple with new kinds of language, images, sounds, and sensor data. These are the kinds of data that flood Facebook, YouTube, and mobile phones—data that digital microworkers are then called on to process and classify. Such microworkers might support algorithms by generating “training data” to teach algorithms to pattern-match like a human in a certain domain. They might also simply process large volumes of cultural data to prepare it to be processed in other ways. These cultural data workers sit at computer terminals, transcribing small audio clips, putting unstructured text into structured database fields, and “content moderating” dick pics and beheadings out of your Facebook feed and Google advertisements.

    Computers do not wield the cultural fluencies necessary to interpret this kind of material; but people do. This is the hidden labor that enables companies like Google to develop products around AI, machine learning, and big data. The New York Times calls this “janitor work,” labeling it the hurdle, rather than the enabling condition, of our big data futures. The second machine age doesn’t like to admit it needs help.

    So maybe there will still be jobs in our post-AI future; they’ll just be the desk-work equivalent of this:

    YouTube embed