// @flow

import * as React from "react";
import { Link } from "react-router-dom";

import GridLayout from "../../components/GridLayout/GridLayout";
import Page from "../../components/Page/Page";

const SlowJudgments = () => (
  <Page
    className="SlowJudgments"
    title="Predicting Slow Judgments"
    next={{
      url: "/team",
      title: "Team",
      description: "Our fledgling group",
    }}
  >
    <GridLayout>
      <p>
        In this project, we took a first step towards exploring whether we can
        use machine learning algorithms to make well-calibrated predictions of
        human judgments in AI-complete domains.
      </p>
      <h2>What are slow judgments?</h2>
      <p>Imagine you read this statement in the news:</p>
      <blockquote>
        “When I was governor of Massachusetts, we didn’t just slow the rate of
        growth of our government, we actually cut it.” - Mitt Romney (Interview,
        2012)
      </blockquote>
      <p>
        Is Mitt Romney’s statement true or false? You can make a quick guess,
        but to be confident you’d probably need to do some research.
      </p>
      <p>
        We are using machine learning to predict slow judgments—judgments that
        require time and resources. Some tasks can only be solved through a
        lengthy deliberation process involving thinking, research, and
        discussion with experts. Examples include judging whether a newspaper
        headline is truthful, whether a defendant is guilty, or how good a
        research paper is.
      </p>
      <p>
        Machine learning does best with lots of labeled examples. However, by
        definition, collecting a dataset of slow judgments is extremely costly:
        each slow judgment could take 5 hours of deliberation and research. That
        means 5 hours to generate a single label!
      </p>
      <p>
        To address this, we collect many quick judgments (which are cheap) and
        fewer slow judgments. ML algorithms can use the quick judgments as noisy
        labels (or alternatively as a regularizer), while the algorithm’s
        objective is to predict slow judgments.
      </p>
      <h2>Why predict slow judgments?</h2>
      <p>
        Our <Link to="/mission">mission</Link> is to find{" "}
        <Link to="/research/factored-cognition/scalability">scalable</Link> ways
        to leverage machine learning for deliberation. We view predicting slow
        judgments as a simplified test domain where we can explore some issues
        relevant to this mission and to AI alignment more generally.
      </p>
      <h3>Robust generalization for AI-complete tasks</h3>
      <p>
        We would like to see machine learning systems that produce
        well-calibrated predictions of human judgments for AI-complete tasks.
        These ML systems should remain well-calibrated (i.e. the system “knows
        what it knows”) under distribution shift (see e.g.{" "}
        <a href="https://arxiv.org/abs/1606.06565">Amodei et al. 2016</a>). We
        aim to create datasets that help develop such ML systems.
      </p>

      <p>
        The <a href="https://thinkagain.ought.org">tasks</a> we have chosen are
        plausibly AI-complete. Solving novel Fermi problems requires general
        scientific reasoning. Deciding if a political statement is true requires
        extensive research and broad world knowledge. Predicting someone’s
        preferences over a brand-new ML paper requires understanding both the
        technical details of the paper and the person’s preferences (e.g. do
        they prefer detailed mathematical proofs or verbal exposition?).
      </p>

      <p>
        Why do we want robustly well-calibrated ML systems on AI-complete tasks?
        For an ML system to be reliably trustworthy, it must do well in
        situations that are distinct from anything experienced previously. The
        ML system cannot always take the <em>best</em> possible action in novel
        situations, but it should recognize their distinctiveness and act
        conservatively. For example, the system might ask a human for guidance
        or take an action known to be safe in all situations.
      </p>
      <h3>Algorithms for distillation of human judgment</h3>
      <p>
        More specifically, we think that{" "}
        <a href="https://ai-alignment.com/iterated-distillation-and-amplification-157debfd1616">
          iterated distillation and amplification
        </a>{" "}
        of human judgment could be an important step towards scalable automation
        of deliberation, and towards AI alignment in general. Distillation means
        training fast ML systems to predict (increasingly amplified) human
        judgments. Initially (when the amplification process is weak) the
        distillation step is similar to predicting slow judgments in AI-complete
        problems. Developing algorithms for our datasets may provide insights
        for robust distillation. (We are also exploring amplification in our
        project on{" "}
        <Link to="/research/factored-cognition">factored cognition</Link>.)
      </p>

      <h2>Resources</h2>
      <ul>
        <li>
          <a href="https://thinkagain.ought.org">ThinkAgain</a> (discontinued)
          <br />
          Our web app for data collection. Play games on Fermi estimation,
          political fact-checking, and evaluating machine learning papers. No
          signup required.
        </li>

        <li>
          <a href="https://owainevans.github.io/pdfs/psj_slides_owain.pdf">
            Predicting Slow Judgments (pdf)
          </a>
          <br />
          Slides for a presentation given at a NIPS 2017 workshop.
        </li>
        <li>
          <a href="/papers/predicting-judgments-tr2018.pdf">
            Predicting Human Deliberative Judgments with Machine Learning (pdf)
          </a>
          <br />
          FHI tech report published in July 2018.
        </li>
      </ul>

      <h2>Team</h2>
      <p>
        This is a joint project with{" "}
        <a href="https://owainevans.github.io/">Owain Evans</a> and
        collaborators at FHI. Our team members include{" "}
        <a href="https://tommcgrath.github.io/">Tom McGrath</a>,{" "}
        <a href="https://zackenton.github.io/">Zac Kenton</a>,{" "}
        <a href="http://cundy.me/">Chris Cundy</a>,{" "}
        <a href="http://careyryan.com/">Ryan Carey</a>,{" "}
        <a href="https://andrewschreiber.github.io/">Andrew Schreiber</a>,{" "}
        <a href="http://nealjean.com/">Neal Jean</a>, and{" "}
        <a href="http://girishsastry.com/">Girish Sastry</a>.
      </p>
    </GridLayout>
  </Page>
);

export default SlowJudgments;
