[SIGCIS-Members] What algorithms are and are not
thomas.haigh at gmail.com
thomas.haigh at gmail.com
Sat Oct 30 02:42:27 PDT 2021
A fascinating discussion. I'd like to pull a few points together and a few
of my own.
An algorithm is indeed a description of a step-by-step way of accomplishing
something. It differs from program code in not being tied to a specific
computer language, and leaving out implementational detail relevant to
specific computing platforms. It differs from a specification in being about
HOW to do something, not just about WHAT needs to be done.
To answer Kimon's most recent point first. It is entirely reasonable to
suggest that alphabetical ordering may, in Winner's terms, be a technology
with politics. See, for example, Bowker and Star's book Sorting Things Out.
And building that social system into computers magnifies its power and makes
it harder to change. But we should not blame sort algorithms for this. There
are many sort algorithms - early on sorting efficiently from tape was a
hugely important practical problem and the Quicksort algorithm was seen as a
major achievement. But they all produced the same results, differing in
their efficiency with particular kinds of input, memory needs, and so on. In
other words, all the sort algorithms had the same specification and did the
same job, but they did it in different ways.
In this case the potentially oppressive alphabetical order is not in the
algorithm at all. It's built into the computer hardware as part of EBCDIC or
ASCI, or more recently into the OS or a library as part of Unicode. So
pointing the finger at Tony Hoare (creator of Quicksort) for any oppressive
impact of alphabetical order would be misleading. In this case, separating
out the algorithm from the rest of the "blob" helps.
As Alexandre pointed out, when we talk about "the Facebook algorithm" this
is grouping together the collective functioning of what must be hundreds or
thousands of algorithms, implemented as programs, running on hardware and
responding to vast amounts of data and many configurable parameters. So the
issue is ignoring all the infrastructure, executable code, and materiality
behind the system, and also missing the hugely complex interactions of all
those pieces. In a way, it is privileging the history of abstract ideas over
the history of technology.
Another point: in the case of AI systems the problem may not be the
algorithm at all, but the data used to train it. When Google makes a racist
autocomplete or an AI chatbot starts talking like a Nazi the more direct
cause is the racism of Internet users who type queries or post tweets. The
backpropagation algorithm certainly has specific affordances which may make
it irresponsible to use it in such circumstances, but again the situation is
complicated and calling everything "the algorithm" obscures more than it
Likewise, the devastating consequences of the system Facebook uses to rank
items in its newsfeed surely has more to do with parameters set by Facebook
managers and data scientists as it does with the algorithms crafted by
programmers. There must be switches set to weight different characteristics,
and the decision to set parameters that weight strong emotional reactions is
likely not in the algorithm itself but in a matrix of weights processed by
the algorithm. Just pragmatically, you wouldn't want to have to throw the
code away and change the algorithm every time you want to try out a
different weighting scheme.
Of course Facebook does not want us to know anything about what is going on
in its black box, so that's deliberate. I loved the point that very early
publication of Google's PageRank algorithm helped to give the misconception
that it continues to rely on a single algorithm.
Another important point is that you can't understand everything an algorithm
will do by reading it. The attempt to predict dynamic behavior from static
code is one of the central concerns of computer science, but in many
situations, particularly when multiple programs are interacting with each
other and with data the results are unpredictable. Facebook surely has huge
data science teams running constant experiments. I do not want to let
programmers off the hook for their complicity in a truly reprehensible
company, but we must also recognize that not all the emergent properties of
code are legible to its creators.
A final point. A couple of years ago I found myself in conversation with a
media scholar who had just written a book about the history of algorithms
and their nasty effects. One of his case studies was what he called "the
Black-Scholes algorithm" (used to price derivatives). I pointed out that
this is a formula, not an algorithm. A mathematical formula is, in the terms
I gave in the first paragraph, a specification. It defines a relationship
between several terms. To calculate the value of one of the terms when you
have the others you need an algorithm: a step-by-step method.
The business of coming up with a good algorithm to do that computation is
not trivial. Apparently simple things like differential equations have given
rise to huge fields devising what applied mathematicians call "methods" but
computer scientists would call "algorithms." (That is particularly relevant
here because Black-Scholes relies on solving a partial differential
equation). Many work well under some circumstances but not others. They will
even produce different answers, because numerical solutions are often
approximate. Implementing those methods is also a complex process,
particularly in code that is portable. The entire field of numerical
analysis exists only because equations are not algorithms. Also the entire
field of computational complexity. Problems of optimization are trivial to
specify but enormously hard to solve. Public key encryption only works
because equations are not algorithms and screening all possible solutions is
infeasible (while verifying a solution is easy).
I was, frankly, a little shocked that somebody who had written a book about
algorithms would seek to erase that vital distinction between algorithms and
equations. His response was that he has his own definition of algorithm and
did not care what computer scientists thought an algorithm was. According to
his definition, Black-Scholes was an algorithm. (I was reminded of Lewis
Carrol: "'When I use a word,' Humpty Dumpty said in rather a scornful tone,
'it means just what I choose it to mean-neither more nor less'). My view is
that algorithms matter hugely and can indeed embed social assumptions, but
that we will never be able to understand how they matter if we call
everything an algorithm. Code, infrastructures, equations, platforms, data,
and infrastructures matter too. If we each invent our own definitions of
what those things are then we will never be able to communicate with each
other, or with the people who have the power to change them.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Members